English Pages 192 [193] Year 2021
Advances in Intelligent Systems and Computing 1402
Zhengbing Hu Sergey Petoukhov Matthew He Editors
Advances in Intelligent Systems, Computer Science and Digital Economics II
Advances in Intelligent Systems and Computing Volume 1402
Series Editor Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Advisory Editors Nikhil R. Pal, Indian Statistical Institute, Kolkata, India Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba Emilio S. Corchado, University of Salamanca, Salamanca, Spain Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil Ngoc Thanh Nguyen , Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong
The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by DBLP, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST). All books published in the series are submitted for consideration in Web of Science.
More information about this series at http://www.springer.com/series/11156
Editors Zhengbing Hu School of Educational Information Technology Central China Normal University Wuhan, China
Sergey Petoukhov Mechanical Engineering Russian Academy of Sciences Moscow, Russia
Matthew He Halmos College of Natural Nova Southeastern University Plantation, FL, USA
ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-80477-0 ISBN 978-3-030-80478-7 (eBook) https://doi.org/10.1007/978-3-030-80478-7 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents
On Mixed Forced and Self-oscillations with Delays in Elasticity and Friction
Alishir A. Alifov

An Extensible Network Traffic Classifier Based on Machine Learning Methods
Vladimir A. Mankov, Vladimir Yu. Deart, and Irina A. Krasnova

Intelligent Information Systems Based on Notional Models Without Relationships
Valery S. Vykhovanets

Study of Properties of Growing Random Graphs with Neuron-like Structure
Ivan V. Stepanyan and Vladimir V. Aristov

Planning of Computational Experiments in Verification of Mathematical Models of Dynamic Machine Systems
Isak N. Statnikov and Georgy I. Firsov

Optimization of Network Transmission of Multimedia Data Stream in a Cloud System
Vera V. Izvozchikova, Marina A. Tokareva, and Vladimir M. Shardakov

Using Virtual Scenes for Comparison of Photogrammetry Software
Aleksandr Mezhenin, Vladimir Polyakov, Angelina Prishhepa, Vera Izvozchikova, and Anatoly Zykov

Structural-Modal Analysis of Biomedical Signals
A. Yu. Spasenov, K. V. Kucherov, S. I. Dosko, V. M. Utenkov, and Bin Liu

Multichannel Plasma Spectrum Analyzer Based on Prony-Fourier Method
S. I. Dosko, V. M. Utenkov, K. V. Kucherov, A. Yu. Spasenov, and E. V. Yuganov

On Feature Expansion with Finite Normal Mixture Models in Machine Learning
Andrey Gorshenin and Victor Kuzmin

Methodology for the Classification of Human Locomotion’s and Postures for the Control System of a Bionic Prosthesis
I. A. Meshchikhin and S. S. Gavriushin

An Approach to Social Media User Search Automation
Anastasia A. Korepanova, Valerii D. Oliseenko, and Maxim V. Abramov

Method for Processing Document with Tables in Russian-Language Automated Information Systems
Dudnikov Sergey, Mikheev Petr, and Dobrovolskiy Alexander

Method of Multi-objective Design of Strain Gauge Force Sensors Based on Surrogate Modeling Techniques
Sergey I. Gavrilenkov and Sergey S. Gavriushin

Predicting University Development Based on Hybrid Cognitive Maps in Combination with Dendritic Networks of Neurons
M. E. Mazurov and A. A. Mikryukov

Systems and Algebraic-Biological Approaches to Artificial Intelligence
G. Tolokonnikov, V. Chernoivanov, Yu. Tsoi, and S. Petoukhov

Concept of Two-Stage Acoustic Non-invasive Monitoring and Diagnostic System Based on Deep Learning
Vladimir V. Klychnikov, Dmitriy V. Lapin, and Mark E. Khubbatulin

Harmonic Fractal-Like Features Related to Epi-Chains of Genomes of Higher and Lower Organisms
Sergey V. Petoukhov and Vladimir V. Verevkin

Author Index
On Mixed Forced and Self-oscillations with Delays in Elasticity and Friction
Alishir A. Alifov
Mechanical Engineering Research Institute of Russian Academy of Sciences, Moscow 101990, Russia
Abstract. We consider mixed forced and self-oscillations in the presence of delays in the forces of elasticity and friction. The system receives energy from a limited power source. The method of direct linearization is used to solve the nonlinear system of differential equations describing the system’s motion. It differs from the known methods of analysis of nonlinear systems in its simplicity of application: it avoids the time-consuming and complex approximations of various orders inherent in those methods, and it yields final calculation relations regardless of the specific type and degree of nonlinearity. Compared with known methods, it reduces labor and time costs by several orders of magnitude. Equations of non-stationary and stationary motions and conditions of stability of stationary oscillations are derived. Calculations were performed to obtain information on the effect of the delays on the oscillation modes. They show the influence of the delays on the amplitude and on the location of the amplitude-frequency curve in the frequency range: depending on the delay value, the amplitude curve is displaced and deformed in the frequency domain, which also affects the stability of the oscillations. In addition, at certain values of the delay the form of the amplitude curves is similar to that of the curves obtained with no delay but with a nonlinear elastic force.
Keywords: Forced oscillations · Self-oscillations · Method · Direct linearization · Energy source · Limited power · Elasticity · Friction · Delay
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 1–9, 2021. https://doi.org/10.1007/978-3-030-80478-7_1
1 Introduction
Systems with delay are widespread in various branches of technology: transportation, automatic control systems with logic devices, electronics, radio engineering, non-ferrous metallurgy, paper and glass production, etc. [1–6]. In mechanical systems, delay is caused by internal friction in materials, imperfection of their elastic properties, etc. [1], and its presence can be both useful and harmful. Delay can lead to oscillations of the kind that occur in servo systems, conveyor belts, regulators, rolling mills, etc. In some cases, for example in low-frequency systems, the effect of delay can be neglected. This cannot be done in high-frequency systems, however, since the delay has a great influence on the stability of the system and on the regulation process. Nonlinear differential equations with deviating arguments are often used to describe systems with delay. In this case, a class
of quasilinear systems is singled out, because solving such equations is fraught with great difficulties. Studies are carried out using approximate methods, among which the mathematically well-founded asymptotic averaging method is widely used [7]. Currently, the issues of energy consumption and energy saving have come to the fore all over the world [8, etc.]. The theory of the interaction of an energy source with an oscillatory system, which is based on the effect discovered by A. Sommerfeld in 1902, can make some contribution to the solution of these problems. The theoretical substantiation of this effect was carried out by V.O. Kononenko, whose research results are described in the world-famous monograph [9]. These works led to the emergence of a fairly new direction in the theory of oscillations. Further development of this theory is reflected in many works by a wide range of researchers around the world, including the book [10]. The connection of environmental problems with the level of energy consumed, metrology, the accuracy of models for calculating systems, and the accuracy of processing parts is shown in [11]. A huge number of works are devoted to methods of calculation and models of linear and nonlinear oscillatory systems, including, for example, [6, 7, 12–18]. The approximate methods for analyzing nonlinear oscillatory systems include the asymptotic averaging method, the methods of successive approximation, energy balance, harmonic linearization, and others. The use of these methods is associated with significant labor and time costs. One of the main problems of the nonlinear dynamics of systems is the large labor cost of analyzing coupled oscillator networks, which play an important role in biology, chemistry, physics, electronics, neural networks, etc. This is indicated, for example, in [19] with reference to [20–23].
The method of direct linearization (MDL) described in [24–29] and other works differs fundamentally from these methods. Its properties are: simplicity of application; absence of labor-intensive and complex approximations of various orders; the possibility of obtaining final calculation relations regardless of the specific type and degree of nonlinearity; and quite small labor and time costs (several orders of magnitude less than with the known methods of nonlinear mechanics). These properties determine the advantage of MDL over known methods for calculating nonlinear systems. In [24, 27] and some other works of the author, a number of results obtained by well-known methods of nonlinear mechanics and by MDL are compared. The comparison shows that they coincide qualitatively (completely) and quantitatively (from a complete match to a mismatch of a few percent). The purpose of this work is to analyze, with the help of MDL, the simultaneous influence of delayed elastic and friction forces on mixed forced and self-oscillations with a source of limited power. The paper presents the model and the equations of motion of the system, the solutions of the equations, the conditions for the stability of stationary motions, calculations and conclusions.
2 Model and Equations of Motion
The model shown in Fig. 1 describes well enough the frictional self-oscillations that occur in a variety of technical systems for various purposes: in textile equipment, guides of metal-cutting machines, brakes, and a number of other objects [30, 31]. Body 1 with mass $m$ lies on a belt driven by a motor with the torque characteristic $M(\dot\varphi)$, where $\dot\varphi$ is the rotation speed of the motor rotor. The friction force $T(U)$,
which depends on the relative speed $U = V - \dot{x}$, $V = r_0\dot\varphi$, can cause self-oscillation of body 1. The body is acted on by an external force $f(t) = \lambda \sin \nu t$, and the elastic and friction forces depend on the delay, i.e. they enter as $c_1 x_\tau$ and $T(U_\Delta)$, respectively. Here $x_\tau = x(t - \tau)$, $U_\Delta = V - \dot{x}_\Delta$, $\dot{x}_\Delta = \dot{x}(t - \Delta)$, where $\tau$ and $\Delta$ are constant time lag factors.
Fig. 1. System model
The equations of motion of the system have the form
$$m\ddot{x} + k_0\dot{x} + c_0 x = \lambda \sin \nu t + T(U_\Delta) - c_1 x_\tau, \qquad I\ddot\varphi = M(\dot\varphi) - r_0 T(U_\Delta) \tag{1}$$
where $c_0$ is the stiffness coefficient of spring 2 and $k_0$ is the resistance coefficient of damper 3. The values $m$, $k_0$, $c_0$, $\lambda$, $\nu$, $I$, $r_0$ are constant. Let us represent the friction force $T(U)$ in a form that is widely used in practice,
$$T(U) = T_0\,(\operatorname{sgn} U - \delta_1 U + \delta_3 U^3), \tag{2,a}$$
or, taking the delay into account, in the form
$$T(U_\Delta) = T_0\,(\operatorname{sgn} U_\Delta - \delta_1 U_\Delta + \delta_3 U_\Delta^3). \tag{2,b}$$
Here $T_0$ is the normal reaction force, $\delta_1$ and $\delta_3$ are positive constants, $\operatorname{sgn} U = 1$ at $U > 0$ and $\operatorname{sgn} U = -1$ at $U < 0$. Note that the form (2,a) was also observed when considering the problem of measuring friction forces in space conditions [32].
3 Solution of Equations
Using the direct linearization method [24, 27], we replace the nonlinear polynomial part $-\delta_1 U + \delta_3 U^3$ of the function (2,a) with the linear function
$$f_*(\dot{x}) = B_f + k_f \dot{x}. \tag{3}$$
Here $B_f = -V(\delta_1 - \delta_3 V^2 - 3\delta_3 N_2 \upsilon^2)$, $k_f = \delta_1 - 3\delta_3 V^2 - \delta_3 \bar{N}_3 \upsilon^2$, $\upsilon = \max|\dot{x}|$, $N_2 = (2r + 1)/(2r + 3)$, $\bar{N}_3 = (2r + 3)/(2r + 5)$, and $r$ is the linearization accuracy parameter. Although the interval for selecting the value $r$ is not limited, it can be selected in the interval (0, 2), as shown in [24, 27].
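The coefficients in (3) are straightforward to evaluate numerically. The following is a minimal Python sketch that assumes only the definitions of $B_f$, $k_f$, $N_2$ and $\bar{N}_3$ given above; the function names are illustrative and do not come from the paper.

```python
def linearization_numbers(r):
    """Return (N2, N3_bar) for the linearization accuracy parameter r."""
    n2 = (2 * r + 1) / (2 * r + 3)
    n3_bar = (2 * r + 3) / (2 * r + 5)
    return n2, n3_bar


def friction_coefficients(V, upsilon, delta1, delta3, r=1.5):
    """Return (B_f, k_f) of the linear replacement f*(x') = B_f + k_f * x'.

    V is the belt speed, upsilon = max|x'|, and delta1, delta3 are the
    constants of the friction characteristic (2,a).
    """
    n2, n3_bar = linearization_numbers(r)
    b_f = -V * (delta1 - delta3 * V**2 - 3 * delta3 * n2 * upsilon**2)
    k_f = delta1 - 3 * delta3 * V**2 - delta3 * n3_bar * upsilon**2
    return b_f, k_f


if __name__ == "__main__":
    print(linearization_numbers(1.5))
    print(friction_coefficients(V=1.2, upsilon=0.0, delta1=0.84, delta3=0.18))
```

For $r = 1.5$ the second number evaluates to $6/8 = 3/4$.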
Taking into account (3), Eqs. (1) take the form
$$m\ddot{x} + k\dot{x} + c_0 x = B + T_0 \operatorname{sgn} U + \lambda \sin \nu t - c_1 x_\tau, \qquad I\ddot\varphi = M(\dot\varphi) - r_0 T_0\,(\operatorname{sgn} U + B_f + k_f \dot{x}) \tag{4}$$
where $B = T_0 B_f$, $k = k_0 - T_0 k_f$. In practice, the main interest is the main resonance $\omega \approx \nu$. Therefore, we consider the solutions of (4) for this case, using the method of change of variables with averaging [24]. In addition, we apply the procedure described in [28] for calculating the interaction of oscillatory systems with energy sources. The nature of the solutions for $x$ and $\dot{x}$ under the conditions $U > 0$ and $U < 0$ is fundamentally different, as shown in [10]. In this regard, we consider separately the solutions for $u \ge a\nu$ and $u < a\nu$. These solutions for determining the amplitude $a$, the phase $\xi$ of oscillations and the velocity $u$ of the energy source,
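based on $x = a \cos \psi$, $\dot{x}_\Delta = \dot{x}(t - \Delta)$, $\dot\varphi = \Omega$, $\psi = \nu t + \xi$, are as follows.
a) $u \ge a\nu$:
$$\frac{da}{dt} = -\frac{1}{2\nu m}\,(aA + \lambda \cos \xi), \qquad \frac{d\xi}{dt} = \frac{1}{2\nu m a}\,(aE + \lambda \sin \xi), \qquad \frac{du}{dt} = \frac{r_0}{J}\left[M\!\left(\frac{u}{r_0}\right) - r_0 T_0 (1 + B_f)\right]; \tag{5,a}$$
b) $u < a\nu$:
$$\frac{da}{dt} = -\frac{1}{2\nu m}\left[aA + \lambda \cos \xi + \frac{4T_0}{\pi a \nu}\sqrt{a^2\nu^2 - u^2}\right], \qquad \frac{d\xi}{dt} = \frac{1}{2\nu m a}\,(aE + \lambda \sin \xi),$$
$$\frac{du}{dt} = \frac{r_0}{J}\left[M\!\left(\frac{u}{r_0}\right) - r_0 T_0 (1 + B_f) - \frac{r_0 T_0}{\pi}\,(3\pi - 2\psi_*)\right] \tag{5,b}$$
where $A = \nu(k_0 - T_0 k_f \cos \nu\Delta) - c_1 \sin \nu\tau$, $E = m(\omega_0^2 - \nu^2) + c_1 \cos \nu\tau - \nu T_0 k_f \sin \nu\Delta$, $\omega_0^2 = c_0/m$, $u = r_0 \Omega$, $\psi_* = 2\pi - \arcsin(u/a\nu)$. Expressions (5,a) for $\dot{a} = 0$, $\dot\xi = 0$ give the equations for the characteristics of stationary motions, whence we have the following relations for determining the amplitude and phase of oscillations:
$$a^2 (A^2 + E^2) = \lambda^2, \qquad \tan \xi = E/A.$$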
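The step from (5,a) to the amplitude and phase relations can be written out explicitly. A short derivation, assuming only the first two equations of (5,a) with nonzero $a$:

```latex
\begin{align*}
\dot{a} = 0 \;&\Rightarrow\; aA = -\lambda\cos\xi, \\
\dot{\xi} = 0 \;&\Rightarrow\; aE = -\lambda\sin\xi, \\
(aA)^2 + (aE)^2 = \lambda^2\left(\cos^2\xi + \sin^2\xi\right)
  \;&\Rightarrow\; a^2\left(A^2 + E^2\right) = \lambda^2, \\
\frac{aE}{aA} = \frac{-\lambda\sin\xi}{-\lambda\cos\xi}
  \;&\Rightarrow\; \tan\xi = \frac{E}{A}.
\end{align*}
```

Because $k_f$ contains $\upsilon = \max|\dot{x}|$, the quantities $A$ and $E$ themselves depend on the amplitude, so the first relation is an implicit equation for $a$ rather than a closed-form expression.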
In the case of $u < a\nu$, the amplitude of stationary oscillations is determined by the approximate expression $a\nu \approx u$. The load $S(u)$ on the energy source is determined by the third equation of (5,a) for $\dot{u} = 0$. In the case of $u \ge a\nu$, the expression for calculating it is $S(u) = r_0 T_0 (1 + B_f)$. Using the equation
$$M(u/r_0) - S(u) = 0 \tag{6}$$
the stationary values of the velocity $u$ are calculated; they can also be determined from the intersection points of the curves $M(u/r_0)$ and $S(u)$. In the case of $u < a\nu$, there is also an equation of the form (6), but with $a\nu \approx u$ taken into account in the expression for $S(u)$.
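4 Stability of Stationary Movements
Stationary movements must be investigated for stability. For this purpose, we write the equations in variations for (5,a), (5,b) and use the Routh-Hurwitz criteria. The stability criteria are as follows:
$$D_1 > 0, \qquad D_3 > 0, \qquad D_1 D_2 - D_3 > 0 \tag{7}$$
where
$$D_1 = -(b_{11} + b_{22} + b_{33}), \qquad D_2 = b_{11} b_{33} + b_{11} b_{22} + b_{22} b_{33} - b_{23} b_{32} - b_{12} b_{21}, \qquad D_3 = b_{11} b_{23} b_{32} + b_{12} b_{21} b_{33} - b_{11} b_{22} b_{33}.$$
Stationary movements are stable if conditions (7) are met. For velocities $u \ge a\nu$ we have
$$\begin{aligned}
b_{11} &= \frac{r_0}{J}\left(Q - r_0 T_0 \frac{\partial B_f}{\partial u}\right), \qquad
b_{12} = -\frac{r_0^2 T_0}{J}\,\frac{\partial B_f}{\partial a}, \qquad
b_{21} = \frac{a T_0}{2m}\,\frac{\partial k_f}{\partial u} \cos \nu\Delta, \\
b_{22} &= -\frac{1}{2m}\left[k_0 - T_0\left(k_f + a\,\frac{\partial k_f}{\partial a}\right)\cos \nu\Delta\right] + \frac{c_1}{2m\nu} \sin \nu\tau, \\
b_{23} &= -a\left[\frac{\omega_0^2 - \nu^2}{2\nu} + \frac{c_1}{2\nu m} \cos \nu\tau - \frac{T_0 k_f}{2m} \sin \nu\Delta\right], \qquad
b_{31} = -\frac{T_0}{2m}\,\frac{\partial k_f}{\partial u} \sin \nu\Delta, \\
b_{32} &= \frac{1}{a}\left[\frac{\omega_0^2 - \nu^2}{2\nu} + \frac{c_1}{2\nu m} \cos \nu\tau - \frac{T_0 k_f}{2m} \sin \nu\Delta\right] - \frac{T_0}{2m}\,\frac{\partial k_f}{\partial a} \sin \nu\Delta, \\
b_{33} &= -\frac{a\,(k_0 - T_0 k_f \cos \nu\Delta)}{2m} + \frac{c_1 a}{2\nu m} \sin \nu\tau.
\end{aligned}$$
For $u < a\nu$, only the following coefficients change:
$$\begin{aligned}
b_{11} &= \frac{r_0}{J}\left[Q - r_0 T_0 \frac{\partial B_f}{\partial u} - \frac{2 r_0 T_0}{\pi\sqrt{a^2\nu^2 - u^2}}\right], \qquad
b_{12} = -\frac{r_0^2 T_0}{J}\left[\frac{\partial B_f}{\partial a} + \frac{2u}{\pi a \sqrt{a^2\nu^2 - u^2}}\right], \\
b_{21} &= \frac{a T_0}{2m}\left[\frac{\partial k_f}{\partial u} \cos \nu\Delta + \frac{4u}{\pi a^2 \nu^2 \sqrt{a^2\nu^2 - u^2}}\right], \\
b_{22} &= -\frac{1}{2m}\left[k_0 - T_0 k_f \cos \nu\Delta - a T_0 \frac{\partial k_f}{\partial a} \cos \nu\Delta + \frac{4 T_0 u^2}{\pi a^2 \nu^2 \sqrt{a^2\nu^2 - u^2}}\right] + \frac{c_1}{2m\nu} \sin \nu\tau
\end{aligned}$$
where
$$Q = \frac{d}{du} M\!\left(\frac{u}{r_0}\right), \qquad \frac{\partial B_f}{\partial u} = -\delta_1 + 3\delta_3 u^2 + 3\delta_3 N_2 a^2 p^2, \qquad \frac{\partial B_f}{\partial a} = 6 N_2 \delta_3 u a p^2, \qquad \frac{\partial k_f}{\partial u} = -6\delta_3 u, \qquad \frac{\partial k_f}{\partial a} = -2\bar{N}_3 \delta_3 a p^2.$$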
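Once the coefficients $b_{ij}$ have been evaluated at a stationary point, checking conditions (7) is mechanical. A minimal Python sketch, transcribing $D_1$, $D_2$, $D_3$ exactly as they are listed in the text (they are not re-derived here); the function name and the nested-list layout of the coefficients are assumptions of this example.

```python
def routh_hurwitz_stable(b):
    """Check conditions (7) for a 3x3 nested list b, where b[i][j] = b_(i+1)(j+1).

    D1, D2, D3 follow the expressions printed in the text, which do not
    involve b13 or b31. Returns (is_stable, (D1, D2, D3)).
    """
    b11, b12 = b[0][0], b[0][1]
    b21, b22, b23 = b[1][0], b[1][1], b[1][2]
    b32, b33 = b[2][1], b[2][2]

    d1 = -(b11 + b22 + b33)
    d2 = b11 * b33 + b11 * b22 + b22 * b33 - b23 * b32 - b12 * b21
    d3 = b11 * b23 * b32 + b12 * b21 * b33 - b11 * b22 * b33

    is_stable = d1 > 0 and d3 > 0 and d1 * d2 - d3 > 0
    return is_stable, (d1, d2, d3)


if __name__ == "__main__":
    # Decoupled example: all three variations decay, so the point is stable.
    print(routh_hurwitz_stable([[-1, 0, 0], [0, -2, 0], [0, 0, -3]]))
    # One growing variation makes the point unstable.
    print(routh_hurwitz_stable([[1, 0, 0], [0, -2, 0], [0, 0, -3]]))
```

For a general 3×3 matrix the free term of the characteristic polynomial also contains products such as $b_{12} b_{23} b_{31}$; the sketch follows the printed expressions and does not add them.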
5 Calculations
To obtain information on the effect of delay on the system, calculations were performed with the following parameters: $\omega_0 = 1\ \mathrm{s^{-1}}$, $m = 1\ \mathrm{kgf \cdot s^2 \cdot cm^{-1}}$, $\lambda = 0.02\ \mathrm{kgf}$, $c_1 = 0.05\ \mathrm{kgf \cdot cm^{-1}}$, $k = 0.02\ \mathrm{kgf \cdot s \cdot cm^{-1}}$, $T_0 = 0.5\ \mathrm{kgf}$, $\delta_1 = 0.84\ \mathrm{s \cdot cm^{-1}}$, $\delta_3 = 0.18\ \mathrm{s^3 \cdot cm^{-3}}$, $r_0 = 1\ \mathrm{cm}$, $I = 1\ \mathrm{kgf \cdot s \cdot cm^2}$. For the delays, the values $p\Delta$ and $p\tau$ from the interval $(0, 3\pi/2)$ are used. Figures 2, 3 and 4 show some calculation results for the velocity value $u = 1.2\ \mathrm{cm \cdot s^{-1}}$. Curve 0 corresponds to the absence of delay ($c_1 = 0$, $\Delta = 0$, $\tau = 0$). The curves are obtained for the accuracy parameter $r = 1.5$ for $\bar{N}_3$ in the linearization coefficient $k_f$ in (3). Note that this gives the number $\bar{N}_3 = 3/4$, which is also obtained if the well-known averaging method of nonlinear mechanics [8] is used to solve (1). That is, both the direct linearization method and the averaging method give the same result, but the first method is much simpler. Oscillations with the amplitudes shown are stable within the shaded sectors for the characteristics $Q = \frac{d}{du} M(u/r_0)$ of the energy source. Curves 1 are unstable at $\Delta = \pi/2$, $\Delta = 3\pi/2$, $\tau = \pi/2$ (Fig. 2, Fig. 4), and oscillations are not realized in the resonance region.
Fig. 2. Amplitude-frequency curves: Δ = π/2; curve 1: τ = π/2, curve 2: τ = π, curve 3: τ = 3π/2
Fig. 3. Amplitude-frequency curves: Δ = π; curve 1: τ = π/2, curve 2: τ = π, curve 3: τ = 3π/2
Fig. 4. Amplitude-frequency curves: Δ = 3π/2; curve 1: τ = π/2, curve 2: τ = π, curve 3: τ = 3π/2
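One point of an amplitude-frequency curve of this kind can be computed by solving the stationary relation $a^2(A^2 + E^2) = \lambda^2$ for $a$ at a given frequency. The Python sketch below is a rough illustration for the branch $u \ge a\nu$ only and is not the author's code. It assumes $V = u$, $\upsilon = a\nu$, treats the listed damping value 0.02 as $k_0$, and takes the delays as times that enter through $\cos\nu\Delta$, $\sin\nu\tau$ and so on; all function and parameter names are hypothetical.

```python
import math

# Parameter values listed in the text. Treating the damping value 0.02 as k0
# is an assumption of this sketch.
PARAMS = dict(omega0=1.0, m=1.0, lam=0.02, c1=0.05, k0=0.02,
              T0=0.5, delta1=0.84, delta3=0.18, r=1.5)


def residual(a, nu, u, delay_f, delay_e, p=PARAMS):
    """Return a^2 (A^2 + E^2) - lambda^2 on the branch u >= a*nu.

    delay_f is the friction delay (Delta), delay_e the elastic delay (tau).
    """
    n3_bar = (2 * p["r"] + 3) / (2 * p["r"] + 5)
    upsilon = a * nu  # assumed: max|x'| for x = a cos(nu t + xi)
    k_f = p["delta1"] - 3 * p["delta3"] * u**2 - p["delta3"] * n3_bar * upsilon**2
    A = (nu * (p["k0"] - p["T0"] * k_f * math.cos(nu * delay_f))
         - p["c1"] * math.sin(nu * delay_e))
    E = (p["m"] * (p["omega0"]**2 - nu**2) + p["c1"] * math.cos(nu * delay_e)
         - nu * p["T0"] * k_f * math.sin(nu * delay_f))
    return a * a * (A * A + E * E) - p["lam"]**2


def stationary_amplitudes(nu, u, delay_f, delay_e, p=PARAMS, steps=2000):
    """Scan 0 < a <= u/nu for sign changes and refine each one by bisection."""
    a_max = u / nu
    grid = [a_max * i / steps for i in range(1, steps + 1)]
    roots = []
    for lo, hi in zip(grid, grid[1:]):
        f_lo = residual(lo, nu, u, delay_f, delay_e, p)
        f_hi = residual(hi, nu, u, delay_f, delay_e, p)
        if f_lo * f_hi > 0:
            continue  # no sign change in this cell
        for _ in range(80):
            mid = 0.5 * (lo + hi)
            f_mid = residual(mid, nu, u, delay_f, delay_e, p)
            if f_lo * f_mid <= 0:
                hi = mid
            else:
                lo, f_lo = mid, f_mid
        roots.append(0.5 * (lo + hi))
    return roots


if __name__ == "__main__":
    no_delay = dict(PARAMS, c1=0.0)  # settings of "curve 0"
    print(stationary_amplitudes(nu=1.0, u=1.2, delay_f=0.0, delay_e=0.0, p=no_delay))
```

Sweeping `nu` over a frequency range and collecting the returned amplitudes gives a rough curve; stability of each point would still have to be checked with conditions (7).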
Under the influence of delay, an interesting feature is manifested. To explain it, we consider the first equation of (1) without delay but with a nonlinear part $\gamma x^3$ of the elastic force, i.e.
$$m\ddot{x} + k_0\dot{x} + c_0 x + \gamma x^3 = \lambda \sin \nu t + T(U). \tag{8}$$
The system described by Eq. (8) was studied in sufficient detail in [10]. The shape of the amplitude-frequency curve at $\Delta = \pi/2$ and $\tau = \pi$ is similar to the shape of the curve for (8) with a “rigid” ($\gamma > 0$) characteristic of the nonlinear elastic force, and in the case of $\Delta = 3\pi/2$, $\tau = \pi$ to that with a “soft” ($\gamma < 0$) one. In this regard, the question arises: when amplitude-frequency characteristics are measured in real conditions, how can one determine what causes their shape (nonlinearity of elasticity or delay)? To answer it, a comprehensive study of the device is necessary.
6 Conclusion
The interaction of self-oscillations and forced oscillations with delayed forces of elasticity and friction in the case of an energy source of limited power is considered. An interesting feature in the dynamics of the system appears under the influence of the delay:
1) Depending on the value of the delay, which affects the stability of the oscillations, there is a shift and deformation of the amplitude curve in the frequency range.
2) The shape of the amplitude curves at certain values of the delay is similar to the shape of the curves in the absence of delay and the presence of a nonlinear elastic force.
The method of direct linearization is easy to use and significantly reduces the cost of labor and time, which increases the efficiency of design and technological calculations.
References
1. Encyclopedia of Mechanical Engineering. https://mash-xxl.info/info/174754/
2. Rubanik, V.P.: Oscillations of Quasilinear Systems with Time Lag. Nauka, Moscow (1969). (in Russian)
3. Zhirnov, B.M.: Self-excited vibrations of a mechanical system with two degrees of freedom and delay. Soviet Appl. Mech. 9(10), 1109–1112 (1973). https://doi.org/10.1007/BF00894292. (in Russian)
4. Astashev, V.K., Hertz, M.E.: Self-oscillation of a visco-elastic rod with limiters under the action of a lagging force. Mashinovedeniye 5, 3–11 (1973). (in Russian)
5. Abdiev, F.K.: Delayed self-oscillations of a system with an imperfect energy source. Izv. Academy of Sciences of the Azerbaijan SSR. Series of physical, technical and mathematical Sciences, no. 4, pp. 134–139 (1983). (in Russian)
6. Butenin, N.V., Neymark, Y., Fufaev, N.A.: Introduction to the Theory of Nonlinear Oscillations. Nauka, Moscow (1976). (in Russian)
7. Bogolyubov, N.N., Mitropolsky, Y.: Asymptotic Methods in the Theory of Nonlinear Oscillations. Nauka, Moscow (1974). (in Russian)
8. Ul Haq, Q.A.: Design and implementation of solar tracker to defeat energy crisis in Pakistan. Int. J. Eng. Manuf. 2, 31–42 (2019). MECS http://www.mecs-press.net. https://doi.org/10.5815/ijem.2019.02.03
9. Kononenko, V.O.: Vibrating Systems with Limited Power-Supply. Iliffe, London (1969)
10. Alifov, A.A., Frolov, K.V.: Interaction of Nonlinear Oscillatory Systems with Energy Sources. Hemisphere Publishing Corporation. Taylor & Francis Group, New York (1990). ISBN 0-89116-695-5
11. Alifov, A.A.: About application of methods of direct linearization for calculation of interaction of nonlinear oscillatory systems with energy sources. In: Proceedings of the Second International Symposium of Mechanism and Machine Science (ISMMS – 2017), Baku, Azerbaijan, 11–14 September, pp. 218–221 (2017)
12. He, J.H.: Some asymptotic methods for strongly nonlinear equations. Int. J. Mod. Phys. B 20(10), 1141–1199 (2006)
13. Hayashi, C.: Nonlinear Oscillations in Physical Systems. Princeton University Press, Princeton (2014)
14. Moiseev, N.N.: Asymptotic Methods of Nonlinear Mechanics. Nauka, Moscow (1981). (in Russian)
15. Esmailzadeh, E., Younesian, D., Askari, H.: Analytical Methods in Nonlinear Oscillations. Springer, Dordrecht (2019)
16. Karabutov, N.: Structural identification of nonlinear dynamic systems. Int. J. Intell. Syst. Appl. (2015). MECS http://www.mecs-press.org/. https://doi.org/10.5815/ijisa.2015.09.01
17. Chen, D.-X., Liu, G.-H.: Oscillatory behavior of a class of second-order nonlinear dynamic equations on time scales. J. Eng. Manuf. 6, 72–79 (2011). MECS http://www.mecs-press.net. https://doi.org/10.5815/ijem.2011.06.11
18. Wang, Q., Fu, F.L.: Numerical oscillations of Runge-Kutta methods for differential equations with piecewise constant arguments of alternately advanced and retarded type. Int. J. Intell. Syst. Appl. 4, 49–55 (2011). MECS http://www.mecs-press.org/
19. Gourary, M.M., Rusakov, S.G.: Analysis of oscillator ensemble with dynamic couplings. In: Hu, Z., Petoukhov, S.V., He, M. (eds.) AIMEE 2018. AISC, vol. 902, pp. 161–172. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-12082-5_15
20. Acebrón, J.A., et al.: The Kuramoto model: a simple paradigm for synchronization phenomena. Rev. Mod. Phys. 77(1), 137–185 (2005)
21. Bhansali, P., Roychowdhury, J.: Injection locking analysis and simulation of weakly coupled oscillator networks. In: Li, P., Silveira, L.M., Feldmann, P. (eds.) Simulation and Verification of Electronic and Biological Systems, pp. 71–93. Springer, Dordrecht (2011). https://doi.org/10.1007/978-94-007-0149-6_4
22. Ashwin, P., Coombes, S., Nicks, R.: Mathematical frameworks for oscillatory network dynamics in neuroscience. J. Math. Neurosci. 6(1), 1–92 (2016). https://doi.org/10.1186/s13408-015-0033-6
23. Ziabari, M.T., Sahab, A.R., Fakhari, S.N.S.: Synchronization new 3D chaotic system using brain emotional learning based intelligent controller. Int. J. Inf. Technol. Comput. Sci. 7(2), 80–87 (2015). https://doi.org/10.5815/ijitcs.2015.02.10
24. Alifov, A.A.: Methods of Direct Linearization for Calculation of Nonlinear Systems. RCD, Moscow, Russia (2015). (in Russian). ISBN 978-5-93972-993-2
25. Alifov, A.A.: Method of the direct linearization of mixed nonlinearities. J. Mach. Manuf. Reliab. 46(2), 128–131 (2017). https://doi.org/10.3103/S1052618817020029
26. Alifov, A.A.: Self-oscillations in delay and limited power of the energy source. Mech. Solids 54(4), 607–613 (2019). https://doi.org/10.3103/S0025654419040150
27. Alifov, A.A.: About direct linearization methods for nonlinearity. In: Hu, Z., Petoukhov, S., He, M. (eds.) AIMEE 2019. AISC, vol. 1126, pp. 105–114. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39162-1_10
28. Alifov, A.A., Farzaliev, M.G., Jafarov, E.N.: Dynamics of a self-oscillatory system with an energy source. Russ. Eng. Res. 38(4), 260–262 (2018). https://doi.org/10.3103/S1068798X18040032
29. Alifov, A.A.: On the calculation by the method of direct linearization of mixed oscillations in a system with limited power-supply. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 23–31. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_3
30. Kudinov, V.A.: Dynamics of Machine Tools. Mashinostroenie, Moscow, Russia (1967). (in Russian)
31. Korityssky, Ya.I.: Torsional self-oscillation of exhaust devices of spinning machines at boundary friction in sliding supports/in SB. Nonlinear vibrations and transients in machines. Nauka, Moscow, Russia (1972). (in Russian)
32. Bronovec, M.A., Zhuravljov, V.F.: On self-excited vibrations in friction force measurement systems. Izv. RAN, Mekh. Tverd. Tela, no. 3, pp. 3–11 (2012). (in Russian)
An Extensible Network Traffic Classifier Based on Machine Learning Methods
Vladimir A. Mankov¹, Vladimir Yu. Deart², and Irina A. Krasnova²
¹ Training Center Nokia, 8A, Aviamotornaya, 111024 Moscow, Russian Federation
² Moscow Technical University of Communications and Informatics, 8A, Aviamotornaya, 111024 Moscow, Russian Federation
Abstract. The rapid development of telecommunications creates many new types of traffic. This causes a serious problem for real-time traffic classification using Machine Learning methods, because they are unable to add new classes. Supervised Learning methods cannot classify samples that are not represented in the training sequences. Unsupervised Learning methods require a significant amount of time to build a model and give relatively low results, and the resulting clusters are difficult to interpret. The paper proposes a new approach to real-time traffic classification. The uniqueness and value of our model lies in its ability to expand, i.e. to add new classes in real time. To implement the proposed algorithm, a set of techniques was used: a unique feature matrix for classification by the first 15 packets; fast high-precision classification by Random Forest and XGBoost methods; a matrix of distances for clustering obtained using Extremely Randomized Trees based on known flows; detection of new classes with Agglomerative clustering through the minimum distance between clusters; determination of the nearest neighbors of the cluster for traffic engineering; automatic updating of the classifier based on clustering results. The experimental part of the study confirms the success of the proposed methodology.
Keywords: Machine learning · ML · QoS · Network traffic classification · Agglomerative clustering · Random forest · RF · XGBoost · Supervised learning · Unsupervised learning
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 10–19, 2021. https://doi.org/10.1007/978-3-030-80478-7_2
1 Introduction
Recently, telecommunications have increasingly used data mining methods, especially Machine Learning (ML) methods, to solve a wide range of problems. One such task is real-time traffic classification, which would make it possible to identify flows that cannot be allocated at the network design stage and to provide QoS support for the detected flows. Early approaches to traffic classification, such as identifying applications by port or Deep Packet Inspection (DPI), are becoming increasingly difficult to apply in modern conditions, as ports change dynamically and encryption prevents DPI. At the same time, when using ML methods, there is no need to correctly
interpret the contents of packets, since they can be applied based on the statistical properties of traffic. We described this problem and the general algorithm for solving it in more detail in [1]. The task of classifying traffic with ML methods is to determine the class of a traffic flow from its known characteristics (packet length, Inter-arrival Time, flow rate, etc.). Applications, traffic types (voice, video, data), services, etc. can act as flow classes. A number of papers [2–4] demonstrate the effectiveness of traffic classification by ML methods based on the statistical properties of traffic. Such models show high results, but they are all static, i.e. they recognize only the classes known to the model. The rapid development of telecommunications networks and the emergence of new technologies (for example, SDN) lead to the regular creation of new applications and types of traffic. A solution is needed that allows new classes to be added dynamically to an existing model. The purpose of the research is to create a dynamically expanding traffic classification model that works in real time. To achieve this goal, we need to develop a static traffic classification model and to create model blocks that allow new classes to be added. We developed the static traffic classification model in previous studies [5, 6], so the task of this work is to transform the static model into a dynamic one by introducing additional clustering blocks. To confirm the theoretical assumptions, a number of experiments are conducted on real data. The paper is organized as follows. Section 2 reviews related work. Section 3 presents the basic theoretical concepts used in the work. Section 4 describes how our model works. Section 5 contains the results of three important tests: the test of the initial configuration of the model, the test of the model in the mode of adding new applications and the test of the running time of the algorithm. Section 6 contains the main conclusions and proposals for further work.
2 Related Work
The papers [7–9] survey promising approaches and works that use ML for traffic classification. These fall into two main groups according to how the model is built: Supervised Learning and Unsupervised Learning. In both cases, each flow is represented by its own feature vector. With Supervised Learning algorithms, flows carry a class label in addition to features, and the initial sample is divided into two sequences: train and test. Only the train part is used to build the model. During testing, the features of the test sequence are fed to the input of the algorithm, class labels are assigned based on the model’s predictions, and these are compared with the true values to evaluate the algorithm. With Unsupervised Learning algorithms, the original sample has no class labels. Based on the features, it is divided into clusters, which essentially play the role of classes. The authors of [10] presented a virtual agent that classifies flows in real time using Supervised Learning methods with very high accuracy (99–100%), but it divides traffic into only four main services (audio, control, video, data). In [11], it is proposed to use Random Forest to obtain distances between samples and then to cluster them with the K-Medoids algorithm. Previously, unsupervised
clustering algorithms were commonly used to cluster traffic, showing very low results. The method proposed in [11] achieves high accuracy: more than 80% for all applications except HTTP, and 68% for HTTP. The main disadvantages of this approach are its static nature and its inability to work in real time. In addition, the proposed clustering method requires the number of classes to be known in advance, which rules out dynamically adding new classes to the existing model. In most studies, Supervised Learning algorithms produce better results than Unsupervised Learning, because with prior knowledge, classes can be processed more accurately. Further advantages of Supervised Learning are the speed of the algorithm and the simplicity of model tuning. However, Supervised Learning algorithms have a serious limitation: they cannot detect classes that are not represented in the training set. In our model, we combine the accuracy and speed of Supervised Learning algorithms with the ability of Unsupervised Learning to define new classes.
3 Background
3.1 Supervised Learning Methods Based on Decision Tree
A Decision Tree (DT) is a logical structure that consists of nodes and leaves. Each node tests whether an attribute of a sample satisfies some condition, and each leaf assigns the sample to one of the classes. A single DT is unstable and rarely used on its own, but it is the basis for ensemble models such as Random Forest (RF), XGBoost and Extremely Randomized Trees (ET). RF and ET are sets of DTs that are built in parallel and independently of each other, and the prediction is decided by a majority vote. In ET construction, unlike RF, the random factor has a greater influence. XGBoost is a set of DTs built one after another, each correcting the errors of the previous ones.
3.2 Agglomerative Clustering
Agglomerative clustering is a hierarchical clustering model based on the “bottom-up” approach: each sample is initially treated as its own separate cluster, and as the allowed distance between objects increases, samples are merged into clusters. Thus, unlike most other clustering methods, Agglomerative clustering makes it possible to control the number of clusters through the distance and to determine the clusters closest to a given sample. Standard metrics (Euclidean distance, Manhattan distance, Mahalanobis distance, etc.) or any other distance matrix can be used as distances.
3.3 Clustering Performance Evaluation
Accuracy is the ratio of the number of correctly classified flows to the total number of flows. Adjusted Rand index (ARI) and Adjusted Mutual Information (AMI) measure the similarity of two assignments (true and predicted), ignore permutations and normalize all values within [−1; 1], where −1 is bad labeling, 0 is random labels and 1 is the best match indicator. ARI is based on pair counting, and AMI is based on Shannon information theory.
Homogeneity indicates the extent to which each cluster contains objects that belong to only one class. Completeness indicates the proportion of samples of one class that fall into the same cluster. V-measure is the harmonic mean of Completeness and Homogeneity. All three characteristics are normalized within [0; 1], where 1 corresponds to the best result.
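The three measures above can be computed directly from label entropies. The following is a minimal sketch, assuming the standard entropy-based definitions (homogeneity = 1 − H(C|K)/H(C), completeness = 1 − H(K|C)/H(K)); the function names are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def _entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def _conditional_entropy(a, b):
    """H(a | b): uncertainty left about labels a once labels b are known."""
    n = len(a)
    joint = Counter(zip(a, b))
    b_counts = Counter(b)
    return -sum((c / n) * math.log(c / b_counts[bv]) for (_, bv), c in joint.items())


def homogeneity_completeness_v(true_classes, clusters):
    """Return (homogeneity, completeness, V-measure), each within [0, 1]."""
    h_c = _entropy(true_classes)
    h_k = _entropy(clusters)
    # A single class (or a single cluster) is trivially homogeneous (complete).
    homogeneity = 1.0 if h_c == 0 else 1.0 - _conditional_entropy(true_classes, clusters) / h_c
    completeness = 1.0 if h_k == 0 else 1.0 - _conditional_entropy(clusters, true_classes) / h_k
    if homogeneity + completeness == 0:
        return homogeneity, completeness, 0.0
    # V-measure is the harmonic mean of the two.
    v = 2 * homogeneity * completeness / (homogeneity + completeness)
    return homogeneity, completeness, v
```

Splitting one class across two clusters keeps homogeneity at 1 but lowers completeness, which is the effect discussed for large numbers of clusters.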
4 Methodology
Our traffic classification model works in real time and can detect and add new classes to its database (Fig. 1). The primary training of the model is carried out on pre-labeled flows, which are split into training and test sequences (train and test). This makes it possible to correct the behavior of the algorithm at every stage of the scheme, build all the necessary decision trees and determine the initial clusters. This paper focuses on adding new flows to the model.
Fig. 1. Self-updating traffic classification model
In the network, the flow to which each packet belongs is identified by its 5-tuple (IP src and dst, port src and dst, protocol). For the first 15 packets of a flow, the switch records the arrival time of each packet and its size (packet length), then sends this data to the controller. The controller calculates the feature matrix (1), which consists of Inter-arrival Time, packet length and their average characteristics. We described the collection of statistical data in real time in more detail in [5] and the classification of traffic using the selected features in [6]. The generated feature matrix is used to automatically classify flows (2) with the Supervised Learning methods Random Forest and XGBoost. Appropriate network management policies are applied to the classified flow. At the same time, information about the incoming packet is passed from the feature matrix to the database (3), which contains information about all flows that have passed through the network.
After N flows pass through the network, the classification model is automatically updated. To do this, Extremely Randomized Trees are built from the existing database, and a matrix of distances between samples (4), normalized to the range [0; 1], is calculated from them. Using a pre-calculated distance matrix, first, keeps a single approach for determining distances and classifying traffic, since both use Decision Tree algorithms, and, second, helps to consolidate the positions of known samples, since clustering algorithms do not make use of labeled sequences. The clustering unit (5) receives the distances between samples and performs clustering using the Agglomerative Clustering method. The division into clusters is based on the minimum distance between the clusters. It is important to note that the number of classes should not decrease with each new clustering procedure. The number of clusters may decrease when a cluster is added to the database that occupies an intermediate position between some of the existing clusters. The distance between them then becomes less than the initially selected one, and several clusters are combined into one. To prevent this problem, the labels of the existing samples are fixed, and in disputable situations the distance chosen for dividing into clusters is reduced. The sample labeled in the clustering block then becomes the basis for training the RF and XGBoost algorithms (5-2). Thus, the location of the clusters and the distances between them are refined and new classes are introduced, allowing the model to remain up to date in real time.
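The feature-collection step (1) described in this section can be illustrated with a short sketch. It assumes that packets arrive as dictionaries with hypothetical keys (`src_ip`, `dst_ip`, `src_port`, `dst_port`, `proto`, `time`, `length`) and that the feature vector is simply the packet lengths, the inter-arrival times and their means; the exact layout of the authors' feature matrix is given in [5, 6] and is not reproduced here.

```python
from collections import defaultdict
from statistics import mean

N_PACKETS = 15  # the paper classifies a flow by its first 15 packets


def flow_key(pkt):
    """5-tuple that identifies the flow a packet belongs to."""
    return (pkt["src_ip"], pkt["dst_ip"], pkt["src_port"], pkt["dst_port"], pkt["proto"])


def collect_flows(packets):
    """Group (arrival_time, length) records by flow, keeping the first N_PACKETS."""
    flows = defaultdict(list)
    for pkt in packets:
        records = flows[flow_key(pkt)]
        if len(records) < N_PACKETS:
            records.append((pkt["time"], pkt["length"]))
    return flows


def feature_vector(records):
    """Packet lengths, inter-arrival times and their means for one flow.

    Assumed layout (illustrative only): lengths, then inter-arrival times,
    then mean length and mean inter-arrival time. Needs at least two records.
    """
    times = [t for t, _ in records]
    lengths = [size for _, size in records]
    iats = [later - earlier for earlier, later in zip(times, times[1:])]
    return lengths + iats + [mean(lengths), mean(iats)]
```

With 15 packets per flow this layout yields 15 lengths, 14 inter-arrival times and 2 averages, i.e. 31 values per flow.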
5 Experiments
The experimental studies used 15-min traffic traces collected in January–February 2020 by the MAWI research group within the framework of the WIDE project on a real network section [12]. The initial data labeling was carried out using the nDPI tool. For the experiment, 1500 flows of the most common TCP applications were selected: DNS, Apple, IMAPS, HTTP, SSH, Telnet (1st group) and Skype, SMTP, RTMP, IMAP (2nd group). Applications of the 1st group were treated as known from the beginning, and the first clusters were created on their basis. Applications of the 2nd group were used in the experiments on adding new classes to the existing base and played the role of unknown flows. All experiments were performed 10 times, and the results were averaged. The deviation from the average was no more than 5%.
5.1 Test 1. Clustering an Existing Database
The widely used ML visualization method t-SNE (t-distributed Stochastic Neighbor Embedding) reduces the dimensionality of data and displays it in two-dimensional space in such a way that points from the same class are most likely to be close to each other and points from different classes further apart. The method depends strongly on random components and does not accurately preserve the distances between points, but it gives an idea of the dataset and of how well it can be clustered. In Fig. 2, the selected TCP applications are visualized by the t-SNE method. The circles mark the train sequence samples and the crosses mark the test samples. The figure shows that the sample can be divided into clusters, although they overlap in some places, for example IMAPS, HTTP and Apple. Such disputed zones can be caused not only by the difficulties of clustering, but also by the peculiarities of the protocols and by possible errors in automatic labeling.
Figure 3 plots the dependence of the number of clusters on the minimum normalized distance between clusters and presents the clustering results as a function of the number of resulting clusters. The graph shows that the smaller the minimum allowable distance between the clusters, the more clusters are formed. Note that the number of clusters decreases unevenly, despite a uniform increase in distance.
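The relation between the minimum distance and the number of clusters can be reproduced on toy data with a small bottom-up clustering routine. This is only a sketch: it works on a precomputed distance matrix, as the model does, but the single-linkage rule and the function name are assumptions, since the paper does not state which linkage was used.

```python
def agglomerative(dist, min_distance):
    """Merge clusters bottom-up until every pair of clusters is at least
    min_distance apart.

    dist is a symmetric matrix of pairwise sample distances in [0, 1].
    Single linkage (distance between the closest members) is an assumption
    of this sketch.
    """
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return min(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        if d >= min_distance:
            break  # all remaining clusters are far enough apart
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Toy data: five samples on a line, distances are absolute differences.
points = [0.0, 0.1, 0.5, 0.6, 1.0]
dist = [[abs(p - q) for q in points] for p in points]
counts = [len(agglomerative(dist, t)) for t in (0.05, 0.2, 0.45)]
```

On this toy data, lowering the threshold leaves more clusters unmerged, and the count changes in uneven steps even though the thresholds grow steadily.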
Fig. 2. Visualizing clusters with tSNE
Fig. 3. Clustering performance evaluation and dependence of the number of clusters on the selected minimum distance between them
The ARI and AMI values are high, close to 0.8–0.9, when the number of clusters is 12 or more. Homogeneity grows with the number of clusters, since with a large number of clusters some of them may contain a rather small number of samples. In contrast, Completeness decreases as the number of clusters increases, since samples of the same class then belong to different clusters. From the point of view of traffic classification, this is not always an error: different traffic behavior can correspond to the same application, which is confirmed by Fig. 2. Therefore, for the initial operation of the algorithm, the number of clusters should be chosen greater than the number of applications in the database. Based on the results of the experiment, 12 clusters (1 DNS, 1 Telnet, 2 Apple, 2 IMAPS, 2 SSH and 4 HTTP) were chosen for the initial training of the model, with a minimum relative distance between clusters of 0.9997.
5.2 Test 2. Adding New Classes to an Existing Model
To test the ability of the model to detect new classes, 300 flows from each of the RTMP, IMAP, SMTP and Skype applications were added to the model input in turn. After each addition of 300 flows, the number of clusters was measured and the nearest neighbors of the new clusters were identified. Figure 4 shows the dependence of the number of clusters formed on the composition of applications. Each time a new class is added, a new cluster appears, except when IMAP is added. This is due to the presence of a small HTTP cluster that merged with IMAP fetch.
Fig. 4. Formation of new clusters when adding new classes
For each new application, the two nearest neighbors were determined (Table 1). This information can be used when applying traffic control policies. For example, when the IMAP class was added, the HTTP and SSH classes were the closest to it. The model does not know the purpose of the new class or its content and can only distinguish it from others, so when solving traffic management problems it takes HTTP as the reference, leaving SSH as an alternative option.
Table 1. The first two closest classes for new applications

| Class | 1st nearest neighbor | 2nd nearest neighbor |
|-------|----------------------|----------------------|
| IMAP  | HTTP                 | SSH                  |
| RTMP  | IMAP                 | IMAPS                |
| SMTP  | IMAP                 | HTTP                 |
| Skype | IMAP                 | HTTP                 |
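A ranking like the one in Table 1 can be derived from the same distance matrix that drives the clustering. The sketch below is illustrative only: the paper does not say how the distance between two clusters is measured, so the mean pairwise sample distance is an assumption, and the function name and the labels in the toy data are hypothetical.

```python
def nearest_classes(dist, labels, new_label, k=2):
    """Return the k known classes closest to the cluster `new_label`.

    dist is a symmetric matrix of pairwise sample distances and labels[i]
    is the class of sample i. Cluster-to-cluster distance is taken as the
    mean pairwise sample distance (an assumption of this sketch).
    """
    members = {}
    for idx, lab in enumerate(labels):
        members.setdefault(lab, []).append(idx)
    new = members[new_label]
    scores = []
    for lab, idxs in members.items():
        if lab == new_label:
            continue
        avg = sum(dist[i][j] for i in new for j in idxs) / (len(new) * len(idxs))
        scores.append((avg, lab))
    return [lab for _, lab in sorted(scores)[:k]]


# Toy data: six samples on a line; "new" is the freshly detected cluster.
points = [0.2, 0.25, 0.6, 0.95, 0.3, 0.35]
labels = ["A", "A", "B", "C", "new", "new"]
dist = [[abs(p - q) for q in points] for p in points]
neighbors = nearest_classes(dist, labels, "new")
```

The first returned class would supply the default traffic policy for the new cluster and the second an alternative, in the spirit of the IMAP example above.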
5.3 Test 3. Measuring the Running Time of the Model and Its Individual Components
To assess the capability for real-time operation, we measured the model building time and the time needed to classify various numbers of flows simultaneously. The results also help determine the number of new flows N after which the model needs to be updated. The experiments were carried out on a virtual machine with the following parameters: Hypervisor: KVM; Logical host CPUs: 24; Memory: 48000 MiB; Storage size: 60 GiB; OS: Ubuntu 16.04.4 LTS; Main tools: Scikit-learn [13], XGBoost [14]. Figure 5 shows the average build and run time of the individual components of the model. The building time of the Agglomerative clustering model significantly exceeds both the building time and the running time of the classifiers, which explains the advantage of the Random Forest and XGBoost methods for real-time classification.
Fig. 5. Left: model’s build time. Right: model’s work time
The classification accuracy of RF and XGBoost on the given database reaches 0.99–1. On more complex samples, XGBoost achieves better results than RF. In addition, according to Fig. 5, XGBoost classifies faster than RF (0.002 s vs 0.101 s), regardless of the number of simultaneously classified flows. Despite these advantages, XGBoost takes longer than RF to build the model.
6 Conclusions
The paper proposes an experimentally tested approach to traffic classification by ML methods, which has the following advantages:
– Ability to work in real time due to high-performance monitoring of the network [5], the way the feature matrix is formed [6] and the operating time of the model (0.002 s) with Supervised Learning methods;
– Ability to detect new flows unknown to the model using Unsupervised Learning algorithms;
– Identification of the nearest classes as auxiliary information for traffic engineering;
– Automatic update of the model, taking into account the current situation in the network;
– No need for a large initial database, because the algorithm is capable of additional training during operation;
– High classification accuracy for all indicators (0.8–0.9);
– Flexible granularity of classification by changing the distance between clusters and, as a consequence, the number of clusters;
– Adjustment of the refresh rate of the model depending on the intensity of the network.
The main disadvantage of the method is that Accuracy decreases at the time new flows are classified, since the model is updated only after their classification. This trade-off is made deliberately: direct clustering would take too long and would introduce significant delays. Usually the network has no traffic engineering policies for unknown classes other than best effort. Therefore, even with fast clustering, its results could not be applied immediately, and for a small number of rare flows clustering is meaningless. In our proposal, when new classes appear, the approach applies the rules that work for flows with similar statistical characteristics. When the model is updated, the introduction of new classes is consistent with traffic management policies. In future work, we plan to apply our traffic classification method to adaptive routing.
References
1. Mankov, V.A., Krasnova, I.A.: Algorithm for dynamic classification of flows in a multiservice software defined network. T-Comm 11(12), 37–42 (2017). (in Russian)
2. Singh, K., Agrawal, S., Sohi, B.S.: A near real-time IP traffic classification using machine learning. Int. J. Intell. Syst. Appl. (IJISA) 5, 83–93 (2013). https://doi.org/10.5815/ijisa.2013.03.09
3. Kaur, J., Agrawal, S., Sohi, B.S.: Internet traffic classification for educational institutions using machine learning. Int. J. Intell. Syst. Appl. (IJISA) 4, 36–45 (2012). https://doi.org/10.5815/ijisa.2012.08.05
4. Almubayed, A., Hadi, A., Atoum, J.: A model for detecting tor encrypted traffic using supervised machine learning. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 7, 10–23 (2015). https://doi.org/10.5815/ijcnis.2015.07.02
5. Mankov, V.A., Krasnova, I.A.: Collection of individual packet statistical information in a flow based on P4-switch. In: Hu, Z., Petoukhov, S., He, M. (eds.) CSDEIS 2019. AISC, vol. 1127, pp. 106–116. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39216-1_11
6. Mankov, V.A., Krasnova, I.A.: Klassifikatsiya potokov trafika SDN-setei metodami mashinnogo obucheniya v rezhime real'nogo vremeni. In: Informatsionnye Tekhnologii I Matematicheskoe Modelirovanie Sistem 2019, Odintsovo (2019). https://doi.org/10.36581/citp.2019.31.51.016. (in Russian)
7. Ge’tman, A.I., Markin, Yu.V., Evstropov, E.F., Obydenkov, D.O.: A survey of problems and solution methods in network traffic classification. In: Trudy ISP RAN/Proc. ISP RAS, vol. 29, no. 3, pp. 117–150 (2017). https://doi.org/10.15514/ispras-2017-29(3)-8. (in Russian)
8. Boutaba, R., Salahuddin, M.A., Limam, N., Ayoubi, S., Shahriar, N., Solano, F.E., Rendon, O.M.: A comprehensive survey on machine learning for networking: evolution, applications and research opportunities. J. Internet Serv. Appl. 9, 1–99 (2018). https://doi.org/10.1186/s13174-018-0087-2
9. Zhao, Y., Li, Y., Zhang, X., Geng, G., Zhang, W., Sun, Y.: A survey of networking applications applying the software defined networking concept based on machine learning. IEEE Access, 1–1 (2019). https://doi.org/10.1109/ACCESS.2019.2928564
10. Gomes, R.L., Madeira, M.E.R.: A traffic classification agent for virtual networks based on QoS classes. IEEE Latin Am. Trans. 10(3), 1734–1741 (2012). https://doi.org/10.1109/TLA.2012.6222579
11. Wang, Y., Xiang, Y., Zhang, J.: Network traffic clustering using Random Forest proximities. In: 2013 IEEE International Conference on Communications (ICC) (2013), pp. 2058–2062. https://doi.org/10.1109/icc.2013.6654829
12. Cho, K., Mitsuya, K., Kato, A.: Traffic data repository at the WIDE project. In: USENIX Annual Technical Conference, FREENIX Track (2000)
13. Buitinck, L., Louppe, G., Blondel, M., Pedregosa, F., Mueller, A., Grisel, O., Niculae, V., Prettenhofer, P., Gramfort, A., Grobler, J., Layton, R., VanderPlas, J., Joly, A., Holt, B., Varoquaux, G.: API design for machine learning software: experiences from the scikit-learn project. ArXiv, abs/1309.0238 (2013)
14. Introduction to Boosted Trees. https://xgboost.readthedocs.io/en/latest/tutorials/model.html. Accessed 03 Mar 2020
Intelligent Information Systems Based on Notional Models Without Relationships
Valery S. Vykhovanets (1, 2)
(1) Institute of Control Sciences of the Russian Academy of Sciences, 65, Profsouznaya Street, 117997 Moscow, Russian Federation
(2) Bauman Moscow State Technical University, 5-1, 2nd Baumanskaya Street, 105005 Moscow, Russian Federation
Abstract. The article describes models, called notional models, which are based on primary mental abstractions: identification, generalization, and association. The process of abstraction does not depend on any subject domain, but is determined only by the abilities of the cognizing human. A notional model consists of a notional structure and the contents of notions. The notional structure describes each notion as a set of other notions united by one of the mental abstractions. The content of notions is described using various enumerating and resolving procedures. Declining to describe associations as links between concepts makes the notional model semantically invariant, improves the transparency of notional models, and makes it possible to create intelligent information systems with a linear or logarithmic estimate of query execution time. These effects arise because associations between notions are themselves notions. Another difference between conceptual and notional models is that notions are described in several aspects simultaneously.
Keywords: Mental abstractions · Notional structure · Notional model · Knowledge representation · Knowledge base · Knowledge inference · Intelligent information system
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 20–28, 2021. https://doi.org/10.1007/978-3-030-80478-7_3
1 Introduction
Many unresolved problems in knowledge representation, engineering and management make it difficult to implement intelligent information systems effectively in practice. In particular, the extraction of knowledge is not a completely solved problem [1], forms of knowledge representation are poorly adapted to the mental and psychological characteristics of a person [2], and knowledge processing takes a long time and is subject to hidden contradictions [3]. Some of the existing problems were avoided by using notional analysis and modelling. Models are called notional to distinguish them from conceptual models. Conceptual models define concepts and various types of relationships between them. For example, there are such conceptual models as logical models, frame models, semantic networks, object-oriented models, etc., where relationships are defined as special
concepts, interpreted as binary (n-ary) predicates, and denoted as links between ordinary concepts [4]. The closest to the notional model is the ER-model, where the abstractions of conceptual design are classification, aggregation and generalization [5, p. 41], which are used for classifying primitive objects and building complex classes from them. Relationships between classes are described by the abstraction of association, called n-ary aggregation. In the notional model, relationships between notions are themselves notions [6]. The notional model is constructed by specifying the mental abstractions that formed the notions and by enumerating the entities belonging to the notions. Any query to the notional model is reduced to searching for the entities of notions that meet the specified conditions.
2 The Notion of a Notion
A notion is a kind of thought that relates to a certain set of unique representations (entities) of the inner or outer world of a person (subject domain). Notions are formed (defined) during mental abstraction by performing mental operations on entities¹. There are three types of notions and their generating abstractions: notion-signs (identification), notion-generalizations (generalization) and notion-associations (association).
2.1 Identification
Notion-signs are the result of mentally selecting unique representations in the subject domain and naming them. Notion-signs are formed to fix a certain state of feelings or elementary abstract ideas. When a notion-sign is formed, the entity is mentally replaced by a sign, i.e. by another unique representation, for example by a name (one-to-one correspondence). It follows that any entity is a notion. The opposite is also true.
Example 1. Examples of notion-signs are such notions as Green, Sour, One, Many, Love, etc.
2.2 Generalization
The notion-generalization is formed when the entities of generalized notions are united (the union of sets of entities). The abstraction of generalization is used to form a notion-type, which is a union of notion-signs². Note that all entities of generalized notions are entities of the notion-generalization. Generalization has an inverse abstraction called specialization.
Example 2. An example of the notion-generalization is the notion of Fruit, which is the result of uniting the entities of such notions as Apple, Pear, Peach, Apricot, etc. In turn, the notion of Apple is one of the specializations of the notion of Fruit.
¹ A notion is an idea, a conception, an opinion, a vague view or an understanding of something. Notional is hypothetical, imaginary [7, p. 607]. A concept is a general notion, an abstract idea [7, p. 171]. Thus, unlike a concept, a notion is not objective; a notion is subjective.
² A special case of generalization is a classification that unites the same entities.
Example 3. An example of the notion-type is the notion of Number, which is the union of notion-signs 1, 2, 3, etc.
2.3 Association
The notion-association is formed when the entities of associated notions are joined, i.e. when each entity of the notion-association includes one entity from each of the associated notions (a subset of the Cartesian product of sets of entities). Note that not all combinations of entities of associated notions can be entities of a notion-association³. Association has an inverse abstraction called individualization.
Example 4. An example of the notion-association is the notion of User, which joins such notions as Name, Gender, Phone, E-Mail, Login and Password. In turn, the notion of Phone is one of the individualizations of the notion of User.
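The set-theoretic reading of the three abstractions can be made concrete with ordinary sets. The following sketch is only an illustration under that reading: the entity values are invented, and representing entities as strings and tuples is an assumption, not part of the notional model itself.

```python
from itertools import product

# Notion-signs: each entity is replaced by a name (illustrative values).
apple = {"Apple"}
pear = {"Pear"}
peach = {"Peach"}

# Notion-generalization: the union of the entity sets of the generalized notions.
fruit = apple | pear | peach

# Two notions to be associated (illustrative values).
names = {"Ann", "Bob"}
logins = {"ann01", "bob02"}

# Aggregation is the special case in which every combination is an entity,
# i.e. the full Cartesian product.
aggregation = set(product(names, logins))

# A notion-association keeps only some combinations, so it is a subset
# of the Cartesian product.
user = {("Ann", "ann01"), ("Bob", "bob02")}
```

Here every entity of `apple` is also an entity of `fruit`, while `user` contains only two of the four pairs in `aggregation`.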
3 Notional Analysis
Notional analysis consists in identifying the notions and their abstractions that are used to describe a given subject domain. The abstractions of identification, association and generalization used in notional analysis are considered as mental operations that are necessary and sufficient for mentally isolating existing representations of the described subject domain and transforming them into notions.
3.1 Problem Areas
In the process of analysis, one or more problem areas are identified. A problem area is a subject domain considered in some narrow aspect, from the point of view of some active problem⁴. The same notion in different aspects (in different problem areas) may have different descriptions⁵.
Example 5. The notion of Project in the first aspect characterizes a project in the managerial sense: the start date, end date, budget, attracted resources, etc. The notion of Project in the second aspect characterizes a project in the technical sense as an artifact (an object of artificial origin). The notion of Project in the third aspect characterizes a project in the activity sense as a work program. For more aspects of a project, see [7, p. 715].
³ A special case of association is aggregation, in which all combinations of entities belong to the notion-aggregation.
⁴ Note that footnotes describe the content of the article in other aspects that are necessary for a more complete understanding of the subject domain.
⁵ Since the notion assumes a subjective reflection of the subject domain, it provides a plurality of descriptions of the same notion in different aspects. This allows us to objectify the notion in its collective (general) form.
3.2 Notional Structures
The purpose of notional analysis is to obtain the notional structure of the subject domain. The notional structure defines the names and aspects of notions, their abstractions, and the sets of notions through which these notions are defined (notion attributes). The name of a notion without a specified aspect defines a collective notion (a notion-generalization) that unites all the specializations of this notion in the subject domain with various aspects⁶.
3.3 Notional Attributes
Any notion has a common attribute of Title (the notion identifier), a common attribute of Aspect (the identifier of the problem area), and a common attribute of Abstraction, which takes one of the entities Sign, Type, Generalization and Association. Private attributes of notions are notions that are determined as the result of the subject domain analysis.
Example 6. Consider the notional structure that describes the staffing of a company. In this structure (see Fig. 1), there may be such notions as Trainee and Employee (notion-signs, shown as ovals), Division and Position (notion-types, shown as rounded rectangles), Vacancy (notion-generalization, shown as a rectangle) and Staff (notion-association, shown as a rhombus). Here, Trainee and Employee are private attributes of Vacancy; Division, Position and Vacancy are private attributes of Staff.
Fig. 1. The notional structure of the staffing
4 Notional Models
A notional model consists of a notional structure and a description of the entities of notions using enumerating and resolving procedures.
⁶ Generalization of similar notions can be used as a concept.
4.1 Schemas of Notions
The notional structure of a subject domain consists of schemas of notions. A schema of a notion is an ordered set of notional attributes. The first three elements of this set are Title, Aspect, and Abstraction, which are attributes common to all notions. The private attributes of the notion follow them.
Example 7. The notion of Staff from Example 6 has the schema (Staff, Common, Associations, Division, Position, Vacancy), where Staff is the entity of Title; Common is the entity of Aspect; Associations is the entity of Abstraction; Division, Position and Vacancy are entities of Notion. Note that Title, Aspect and Abstraction are also entities of Notion.
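A schema of this kind maps naturally onto a small record type. The sketch below is one possible encoding, not the author's implementation: the class name `NotionSchema` and its field names are hypothetical, and the allowed abstraction values follow the list in Sect. 3.3 (Sign, Type, Generalization, Association), whereas Example 7 writes the plural form "Associations".

```python
from dataclasses import dataclass
from typing import Tuple

# Allowed entities of the common attribute Abstraction (per Sect. 3.3).
ABSTRACTIONS = ("Sign", "Type", "Generalization", "Association")


@dataclass(frozen=True)
class NotionSchema:
    """Ordered set of attributes: three common ones, then the private ones."""
    title: str
    aspect: str
    abstraction: str
    attributes: Tuple[str, ...] = ()

    def __post_init__(self):
        if self.abstraction not in ABSTRACTIONS:
            raise ValueError(f"unknown abstraction: {self.abstraction}")

    def as_tuple(self):
        """Flatten the schema into the ordered form used in Example 7."""
        return (self.title, self.aspect, self.abstraction, *self.attributes)


# The Staff notion from Example 6 (hypothetical encoding).
staff = NotionSchema("Staff", "Common", "Association",
                     ("Division", "Position", "Vacancy"))
```

Keeping the private attributes as a tuple preserves their order, which matters because the schema is defined as an ordered set.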
4.2 Entities of Notions
In intelligent information systems, values of simple data types are used as notion-signs: numbers, characters, strings, etc. Simple data types themselves are considered as built-in notion-types: Bit, Integer, Float, Character, String, Binary, etc. Notion-types of the subject domain have no private attributes. These notions are created as tables with a single field, Title. For example, such notions as Abstraction and Aspect are subject notion-types. Notion-associations are created as structures, aggregate or composite classes, associative relations (links), and database tables. Notion-generalizations are created as unions, generalizing classes, and queries that unite data from database tables.
5 Knowledge Processing
Knowledge processing requires a form of knowledge representation and methods for manipulating it in order to imitate human thinking. Facts are used to represent knowledge, and inference rules are used for reasoning; the rules allow us to make inferences based on existing facts and to obtain new facts about existing or newly introduced facts.
5.1 Facts
Facts are true propositions built with the logical connectives AND (∧), OR (∨), NOT (¬), parentheses, and two types of atomic propositions: a predicate N(E), stating that the entity E belongs to the notion N, and N[E] • V, where N[E] is a functor that returns the entity of the attribute N of the entity E, and • is a relation that is allowed between the entities N[E] and V (=,>, 0.
4 Results and Discussion. Estimation of the Scatter of Characteristics of Graphs with General Characteristics of Random Growth Parameters Consider the results of an experiment based on Algorithm 1 (Fig. 1, 2). For model experiments and estimation of the scatter of statistical values of the monitored parameters,
Fig. 1. Left - an example of a random graph according to algorithm 1. Right - graphs of changes in the values of the monitored parameters (one hundred experiments). The abscissa axis is the algorithm step, the ordinate axis is the parameter value.
Fig. 2. 50 (left) and 100 (right) graph growth iterations according to Algorithm 1 (one hundred experiments). The abscissa axis is the algorithm step, the ordinate axis is the parameter value.
I. V. Stepanyan and V. V. Aristov
the algorithm was repeated a hundred times. After each run of the algorithm, the parameters were calculated and curves were plotted. These curves were then superimposed on each other to visualize the so-called “corridors” of values. Despite the simplicity of the model and algorithm, there are statistical outliers in the complexity of the graph structure, both in cycles and in simple paths. A sharp increase in the number of cyclic structures (by factors of 5 and 7) is observed in 2% of cases (starting from growth step 83), see Table 1. Consider the results of a statistical experiment based on Algorithm 2, which generates graphs with a structure close to the columnar organization of the neocortex with feedbacks (Fig. 3, 4). An example of a random tree graph is shown in Fig. 3.
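The idea of superimposing repeated runs into a "corridor" can be sketched as follows. This is a minimal sketch under stated assumptions: the growth rule in `grow_once` is hypothetical and is not the authors' Algorithm 1, and the tracked quantity is the cyclomatic number (the number of independent cycles), used here as a cheap stand-in for the counts of simple cycles and simple paths monitored in the paper.

```python
import random

def grow_once(steps, p_extra=0.3, rng=None):
    """Grow one random graph and return the cyclomatic number after each step.

    Hypothetical growth rule (NOT the authors' Algorithm 1): at every step a
    new vertex is attached to a random existing vertex, and with probability
    p_extra one more edge is added between two distinct random vertices.
    """
    rng = rng or random.Random()
    n_vertices, edges = 1, set()
    history = []
    for _ in range(steps):
        new = n_vertices
        edges.add((rng.randrange(n_vertices), new))  # keeps the graph connected
        n_vertices += 1
        if rng.random() < p_extra:
            a, b = rng.sample(range(n_vertices), 2)
            edges.add((min(a, b), max(a, b)))        # the set ignores repeated edges
        # For a connected graph the number of independent cycles is E - V + 1.
        history.append(len(edges) - n_vertices + 1)
    return history

def corridor(runs=100, steps=50, seed=0):
    """Repeat the growth and return the per-step (min, max) over all runs."""
    rng = random.Random(seed)
    curves = [grow_once(steps, rng=rng) for _ in range(runs)]
    return [(min(c[s] for c in curves), max(c[s] for c in curves))
            for s in range(steps)]

band = corridor()
print(band[-1])  # (lowest, highest) cyclomatic number at the last step
```

A run whose curve leaves the bulk of the band early would be the analogue of the anomalies reported in Table 1; estimating their probability would require counting such runs over many repetitions.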
Fig. 3. An example of a graph model of a neocortex column without feedback in the form of a random tree with a depth of 5 vertices. These columns are then combined with multiple feedbacks using Algorithm 2.
Table 1. Probability of belonging to a range of values for evaluating statistical anomalies in the structure of graphs

Parameter     Behavior         30 steps                35 steps                50 steps
                               Range       Probability Range       Probability Range       Probability
Paths         Usual behavior   [1; 120]    97%         [1; 90]     98%         [1; 300]    98%
Paths         Anomaly          [120; 160]  3%          [90; 200]   2%          [300; 600]  2%
Cycles        Usual behavior   [1; 20]     85%         [0; 30]     99%         [0; 30]     96%
Cycles        Anomaly          [20; 30]    15%         [30; 50]    1%          [30; 60]    4%
Euler number  Usual behavior   [6; 16]     100%        [8; 18]     100%        [13; 23]    100%
The parameter of Algorithm 2 is the intensity of feedbacks between the columnar structures combined into a single network. Feedbacks can only be established between the input and output vertices of different trees. Numerical experiments showed that Algorithm 2, like Algorithm 1, generates a non-uniformly distributed number of simple cycles. In ~2% of cases a sharp surge is observed, in which the number of simple cycles increases markedly relative to the average (Table 2).
Study of Properties of Growing Random Graphs
Fig. 4. On the top - a random graph with five random columns (Algorithm 2). On the right bottom are the results from 50 independent experiments. On the left bottom are the results from 100 independent experiments. Random trees parameters: 3 columns (2-4 branches with 3 vertices deep), the number of feedbacks is half the number of outputs. The abscissa axis is the serial number of the independent experiment. The ordinate axis is the tracked parameter.
Tables 1 and 2 show the ranges of values that stand out statistically. They illustrate the idea that random processes can, in rare cases, produce qualitative changes in the number of simple paths and the number of simple cycles in the graph. As our experiments have shown, this does not apply to the Euler number. As can be seen from Table 2, the values can differ by factors of hundreds and thousands even with a small number of experiments (100 experiments). Such large anomalies are characteristic of Algorithm 2, which operates with trees that simulate the columnar structures of the neocortex. This indicates that in the evolutionary process, if we consider evolution as
Table 2. Probability of belonging to a range of values for evaluating statistical anomalies in the structure of graphs

Parameter     Behavior         50 experiments             100 experiments
                               Range/value  Probability   Range/value   Probability
Paths         Usual behavior   [0; 1]       99%           0             95%
Paths         Anomaly          140          1%            [500; 2500]   5%
Cycles        Usual behavior   [0; 20]      100%          [0; 450]      97%
Cycles        Anomaly          –            –             [450; 1400]   3%
Euler number  Usual behavior   [15; 37]     100%          [14; 40]      100%
a series of random mutations, qualitative breakthroughs in the field of cyclic neural network structures are possible. The results of the numerical experiments correlate with neurophysiological data showing that fMRI data of the normal and the pathological brain differ [24, 25]. According to experimental data and some theoretical considerations, impairment of consciousness corresponds to a disruption of the intensity of connections between parts of the brain.
5 Conclusion
In the present paper, we considered a system of growing random graphs (ER) and studied the transition from tree graphs to a graph cluster. The possibility of the emergence of complex cycles is shown. We expect that these theoretical models and simulations can be compared with the real structures of the neuron systems of the brain.
• Algorithms and methods for searching parameters for generating neural-like graphs have been developed. In the course of model experiments, the properties associated with cyclic structures were estimated.
• The possibility of the emergence of complex cycles as a result of the random growth of various neural-like structures was shown, and the probability of their occurrence under certain initial conditions was estimated.
• The developed technique is useful for modeling the evolutionary processes of the emergence of preconsciousness associated with the morphogenesis of neural network structures with the participation of random processes.
Thus, this article presents a new approach to the study of the phenomenon of the emergence of consciousness, based on a variety of methods for generating random graphs that simulate the columnar structures of the cerebral cortex. This approach can be developed by carrying out extended experiments with different numerical parameters of the graph generation algorithm. Nevertheless, the preliminary results obtained in the course of our numerical experiments support the hypothesis of the connection
between the complex processes occurring in the brain, together with the origin of consciousness, and percolation transitions in complex network structures. The results of the study are promising for the construction of artificial neural networks that demonstrate complex behavior. The results can be used to generate complex neural networks, with criteria for evaluating the final structures (the number of cycles, the number of simple paths, the Euler number). Artificial neural networks built according to the proposed algorithms and criteria are based on neurophysiological data on the structure of connections in the foci of brain activity. We hope that some artificial neural networks, in particular spiking neural networks [26], could be useful in future simulations. There are new investigations of fast signal contours of neurons. New global properties of the brain are studied, e.g., in [27]. It is now possible to register the activity not only of separate neurons but also of large groups of neurons [28]. New tomographs can record blood flow in the brain; with these approaches, one might also indirectly determine the circulation of neuron signals.
References
1. Chalmers, D.: The Conscious Mind: in Search of a Fundamental Theory. Oxford University Press, New York (1996)
2. Krapivsky, P.L., Redner, S.: Emergent network modularity. J. Stat. Mech. 073405 (2017)
3. Yang, W., et al.: Simultaneous multi-plane imaging of neural circuits. Neuron 89, 269–284 (2016)
4. Severino, F.P.U., et al.: The role of dimensionality in neuronal network dynamics. Sci. Rep. 6, 29640 (2016)
5. Ben-Naim, E., Krapivsky, P.L.: Kinetic theory of random graphs: from paths to cycles. Phys. Rev. E 71, (2005)
6. Krapivsky, P.L., Redner, S., Ben-Naim, E.: A Kinetic View of Statistical Physics. Cambridge University Press, Cambridge (2010)
7. Stepanyan, I.V., Petoukhov, S.V.: The matrix method of representation, analysis and classification of long genetic sequences. Information 8, 12 (2017)
8. Bodyakin, V.I., Stepanyan, I.V.: Adaptivnyj vysokokompressionnyj kanal peredachi dannyh na baze nejrosemanticheskogo podhoda. Nejrokomp’yutery: razrabotka i primenenie No. 9, pp. 61–64 (2011)
9. Aristov, V.V.: Biological systems as nonequilibrium structures described by kinetic methods. Res. Phys. 13, 102232 (2019)
10. Simpson, S.L., Moussa, M.N., Laurienti, P.J.: An exponential random graph modeling approach to creating group-based representative whole-brain connectivity networks. Neuroimage 60(2), 1117–1126 (2012)
11. Kozma, R., Puljic, M.: Random graph theory and neuropercolation for modeling brain oscillations at criticality. Curr. Opin. Neurobiol. 31, 181–188 (2015)
12. Sinke, M.R., Dijkhuizen, R.M., Caimo, A., Stam, C.J., Otte, W.M.: Bayesian exponential random graph modeling of whole-brain structural networks across lifespan. NeuroImage 135, 79–91 (2016)
13. Yang, Y., Guo, H., Tian, T., Li, H.: Link prediction in brain networks based on a hierarchical random graph model. Tsinghua Sci. Technol. 20(3), 306–315 (2015)
14. Eljinini, M.A.H., Tayyar, A.: Collision-free random paths between two points. Int. J. Intell. Syst. Appl. (IJISA) 12(3), 27–34 (2020). https://doi.org/10.5815/ijisa.2020.03.04
15. Thakkar, D.K., Dave, V.R.: Edge stable sets and secured edge stable sets in hypergraphs. Int. J. Math. Sci. Comput. (IJMSC) 5(2), 71–81 (2019). https://doi.org/10.5815/ijmsc.2019.02.06
16. Zhilyakova, L.Yu.: Graph dynamic threshold model resource network: key features. Int. J. Math. Sci. Comput. (IJMSC) 3(3), 28–38 (2017). https://doi.org/10.5815/ijmsc.2017.03.03
17. Listrovoy, S.V., Sidorenko, A.V., Listrovaya, E.S.: An approach to determination of maximal cliques in undirected graphs. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 10(1), 1–12 (2018). https://doi.org/10.5815/ijmecs.2018.01.01
18. Rajangam, E., Annamalai, C.: Graph models for knowledge representation and reasoning for contemporary and emerging needs – a survey. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 8(2), 14–22 (2016). https://doi.org/10.5815/ijitcs.2016.02.02
19. Rao, K.V.S., Sreenivasan, V.: The split domination in product graphs. IJIEEB 5(4), 51–57 (2013). https://doi.org/10.5815/ijieeb.2013.04.07
20. Bongini, M., Rigutini, L., Trentin, E.: Recursive neural networks for density estimation over generalized random graphs. IEEE Trans. Neural Netw. Learn. Syst. 29(11), 5441–5458 (2018)
21. Tang, M., Athreya, A., Sussman, D.L., Lyzinski, V., Park, Y., Priebe, C.E.: A semiparametric two-sample hypothesis testing problem for random graphs. J. Comput. Gr. Stat. 26(2), 344–354 (2017)
22. Bullmore, E.T., Bassett, D.S.: Brain graphs: graphical models of the human brain connectome. Annu. Rev. Clin. Psychol. 7, 113–140 (2011)
23. Azondekon, R., Harper, Z.J., Welzig, C.M.: Combined MEG and fMRI exponential random graph modeling for inferring functional brain connectivity. arXiv preprint arXiv:1805.12005 (2018)
24. Richiardi, J., Eryilmaz, H., Schwartz, S., Vuilleumier, P., Van De Ville, D.: Decoding brain states from fMRI connectivity graphs. Neuroimage 56(2), 616–626 (2011)
25. Kremneva, E.I., Legostaeva, L.A., Morozova, S.N., Sergeev, D.V., Sinitsyn, D.O., Iazeva, E.G., Suslin, A.S., Suponeva, N.A., Krotenkova, M.V., Piradov, M.A., Maximov, I.I.: Feasibility of non-gaussian diffusion metrics in chronic disorders of consciousness. Brain Sci. 9(5), 123 (2019)
26. Tavanaei, A., et al.: Deep learning in spiking neural network. arXiv:1804.08150v4 [cs.NE] 20 January 2019
27. Alivisatos, A.P., et al.: The brain activity map project and the challenge of functional connectomics. Neuron 74(6), 970–974 (2012)
28. Bouchard, K.E., et al.: Functional organization of human sensorimotor cortex for speech articulation. Nature 495(7441), 327–332 (2013)
Planning of Computational Experiments in Verification of Mathematical Models of Dynamic Machine Systems
Isak N. Statnikov and Georgy I. Firsov
Blagonravov Mechanical Engineering Research Institute of the RAS, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russian Federation
Abstract. A method for estimating the adequacy of a mathematical model that describes the behavior of a machine in a dynamic process is considered. The method is based on the idea of verifying statements by testing them experimentally. For this purpose, it is suggested to use a computational experiment implemented as a series of simulation model experiments. This approach makes it possible to survey the parameter space within the given ranges of parameter variation and, owing to the special randomized character of the experiment plan, to apply quantitative statistical estimates of the effect of changes in the variable parameters and their pairwise combinations on the analyzed properties of the dynamic system under consideration. It thus becomes possible to confirm (or reject), with the desired probability, statements about the influence of the system parameters on its dynamic properties. Obtaining such information from physical experiments is in most cases impossible. The method is illustrated by a study of a model of a nonlinear dynamic transmission system for the main drive of a large wide-strip hot rolling mill.
Keywords: Planned computational experiment · Identification · Verification · Dynamic model · Rolling mill drive
1 Introduction
The construction and study of models of mechanical oscillatory systems is closely connected with verification procedures, i.e., with assessing how adequately the constructed models represent the dynamic processes occurring in these systems. Verification of models of dynamic systems, unlike verification of software [1], is usually reduced to the procedure of parametric identification [2, 3], i.e., to determining a set of numerical values of the system parameters that provides the minimum divergence, in a given metric, between the experimental and theoretical values of certain system characteristics (time, frequency, etc.). At the same time, it should be taken into account that determining the values of system parameters from experimental information is, in the general case, a problem that is ill-posed in the sense of Hadamard, i.e., any given experimental characteristic can correspond to an infinite number of possible approximations.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 39–48, 2021. https://doi.org/10.1007/978-3-030-80478-7_5
I. N. Statnikov and G. I. Firsov
Thus, when the transfer function of a linear system is approximated by a rational fraction of the form $H(p) = \sum_{i=0}^{n} a_i p^i \big/ \sum_{i=0}^{m} b_i p^i$, this fraction can be represented as the image of the vector of its coefficients $e = (a_0, a_1, \ldots, a_n; b_0, b_1, \ldots, b_m)$ under the mapping $H$ from the vector space $E^r$ of dimension $r = n + m + 2$ into the space $C(a, b)$ of functions continuous on the segment $[a, b]$. In the space $E^r$ the usual norm is given: $\|e\|_{E^r} = \sqrt{\sum_{i=1}^{r} e_i^2}$. In this case, despite the continuity of the mapping $H$, the inverse mapping $H^{-1}$ is not continuous on the set $H(E^r) \subset C(a, b)$. In particular, continuity is broken at points where the numerator and the denominator of the rational function have at least one pair of close roots. Discontinuities of the inverse mapping make the task of restoring the analytical form of the rational function unstable. Therefore, various forms of A.N. Tikhonov’s regularization method have to be used to determine the coefficients of the transfer function [4]. At the same time, the question of whether the qualitative features of the dynamic processes in the system under study remain stable over a wider range of changes in its parameters remains open [5].
Apparently, the task of model verification can be solved with a specially planned computational experiment, which makes it possible to identify the regions of parameter values in which the functioning of the model is sufficiently adequate to experimental observations. Leaving aside the estimates of model adequacy that are used, let us concentrate on building a plan of the computational experiment that allows us to solve the task in question: to estimate the regions of parameter values in which particular features of model behavior, reflecting the corresponding states of the object under study, are observed. Different randomization procedures can be used as tools [6, 7], but we will focus on the PLP-search method (Planning LPτ sequences) created at IMASH RAS [8]. It combines the idea of discrete, quasi-uniform (in probability) probing of the J-dimensional space of variable parameters $\alpha_j$ ($j = 1, \ldots, J$) with the methodology of the planned mathematical experiment. On the one hand, this allows a global quasi-uniform survey of the given region of variable parameters; on the other hand, it allows many formal estimates from mathematical statistics to be applied.
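The loss of continuity of the inverse mapping near close roots of the numerator and denominator can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the two rational functions, the value of `eps`, the segment [0, 1] and the helper `H` are chosen for the example and do not come from the source.

```python
import math

def H(num, den, p):
    """Evaluate a rational function; coefficients are in ascending powers of p."""
    n = sum(a * p**i for i, a in enumerate(num))
    d = sum(b * p**i for i, b in enumerate(den))
    return n / d

eps = 1e-3
# H1(p) = 1 / (1 + p), padded with zeros to n = 1, m = 2.
e1 = ([1.0, 0.0], [1.0, 1.0, 0.0])
# H2(p) = (2 + p) / ((1 + p)(2 + eps + p)): the numerator root p = -2 and the
# denominator root p = -(2 + eps) are close, so they almost cancel.
e2 = ([2.0, 1.0], [2.0 + eps, 3.0 + eps, 1.0])

grid = [k / 1000 for k in range(1001)]            # segment [a, b] = [0, 1]
# Distance between the two functions in C(a, b), approximated on the grid.
sup_diff = max(abs(H(*e1, p) - H(*e2, p)) for p in grid)
# Euclidean distance between the two coefficient vectors in E^r, r = 5.
coef_dist = math.sqrt(sum((x - y) ** 2
                          for x, y in zip(e1[0] + e1[1], e2[0] + e2[1])))
print(sup_diff, coef_dist)
```

The two functions are almost indistinguishable on the segment, while their coefficient vectors are far apart, so a small perturbation of the measured characteristic can move the recovered coefficients a long way. The comparison uses one particular normalization of the coefficients; a rational function is in any case defined only up to a common factor of numerator and denominator.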
Let us briefly consider the essence of the PLP search [9]. The method is based on randomizing the location of the vectors $\bar{\alpha}$ in the region $G(\bar{\alpha})$, which is defined by inequalities of the type $\alpha_j^{*} \le \alpha_j \le \alpha_j^{**}$, $j = 1, \ldots, J$ (where $J$ is the number of variable parameters), and is computed with the help of LPτ grids [10]. At present, the values are limited to $J \le 51$ and $N < 2^{20}$, where $N$ is the maximum number of non-repeating values of pseudo-random variables in one LPτ grid. The process of randomizing the location of the vectors $\bar{\alpha}$ in the region $G(\bar{\alpha})$ consists in randomly mixing, in one way or another, the levels $\alpha_{ijhk}$ of the parameters, where $i = 1, \ldots, M(j)$ is the number of the level and $M(j)$ is the number of levels of the $j$-th variable parameter for the $k$-th criterion; $h = 1, \ldots, H_{ijk}$, where $H_{ijk}$ is the number of values of the $k$-th criterion $\Phi_k(\bar{\alpha})$ on the $i$-th level of the $j$-th parameter; and $k = 1, \ldots, K$ is the number of the criterion, where $K$ is the number of quality criteria. As a result of processing all $N$ computational experiments carried out on the mathematical model, sets of values $\{\bar{\Phi}_{ijk}(\alpha_{ij})\}$ are selected, where $\bar{\Phi}_{ijk}(\alpha_{ij})$ is the average value of the $k$-th quality criterion in the $i$-th section of the $j$-th variable parameter.
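The random mixing of levels and the averaging of a criterion over each level can be sketched as follows. This is a simplified stand-in under stated assumptions: it uses ordinary pseudo-random permutations from the standard library instead of LPτ grids, and the function names `plan_series`, `level_means` and `to_alpha`, as well as the test criterion, are hypothetical.

```python
import random
from statistics import mean

def plan_series(J, M, T, seed=0):
    """Build T series of M experiments; in every series each parameter
    visits each of its M levels exactly once, in an independent random order."""
    rng = random.Random(seed)
    plan = []
    for _ in range(T):
        columns = [rng.sample(range(M), M) for _ in range(J)]  # one permutation per parameter
        plan.extend(tuple(col[r] for col in columns) for r in range(M))
    return plan  # N = T * M tuples of level indices

def level_means(plan, values, J, M):
    """Average criterion value on level i of parameter j."""
    buckets = [[[] for _ in range(M)] for _ in range(J)]
    for levels, v in zip(plan, values):
        for j, i in enumerate(levels):
            buckets[j][i].append(v)
    return [[mean(b) for b in row] for row in buckets]

def to_alpha(levels, lo, hi, M):
    """Map level indices to parameter values at the centres of M equal sections."""
    return [l + (i + 0.5) * (h - l) / M for i, l, h in zip(levels, lo, hi)]

# Hypothetical criterion that depends only on the first parameter.
J, M, T = 5, 16, 10
plan = plan_series(J, M, T)
values = [to_alpha(lv, [0.0] * J, [1.0] * J, M)[0] ** 2 for lv in plan]
means = level_means(plan, values, J, M)
print(len(plan), means[0][:3])
```

With J = 5, M = 16 and T = 10 the plan contains N = 160 experiments, and every level of every parameter collects exactly T criterion values, which is what makes level-wise averages and variance estimates possible.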
Planning of Computational Experiments
2 Dynamic Model of the Rolling Mill Main Drive Transmission
As an example of applying the PLP-search method to the problem of model verification, let us consider the dynamic system of a rolling mill drive; a rational choice of its parameters improves the quality of mill operation in terms of accuracy, reliability and durability [11]. Fig. 1 presents the structural diagram of the gear drive of a working stand of a wide strip hot rolling mill, where 1 - rolling rolls, 2 - supporting rolls, 3 - rolled metal, 4 - upper and lower spindles, 5 - pinion stand, 6 - root clutch, 7 - reducer, 8 - motor clutch, 9 - motor. The growth of rolling speeds and reductions in the working stands of modern rolling mills leads to an intensive increase in the dynamic loads that arise in the main lines when the rolls grip the metal; these loads are often three or more times higher than the technological ones. It is these loads that cause the unsatisfactory performance of some of the most important transmission elements in the main drives of large hot strip mills.
Fig. 1. Structural diagram of the gear drive of the working stand of a wide strip hot rolling mill
The calculation diagram of the main drive line of the working stand of a wide strip hot rolling mill is shown in Fig. 2. Here $J_1$ is the moment of inertia of the rolling rolls, $J_2$ is the moment of inertia of the pinion rolls, $J_3$ is the moment of inertia of the gearbox wheel relative to its rotation axis, $J_4$ is the moment of inertia of the gearbox pinion, and $J_5$ is the moment of inertia of the motor rotor. Let us consider the loading of the transmission elements of the main line of the mill (Fig. 2) and the formation of dynamic loads in it when the metal is gripped by the rolls. During the no-load period, the gaps $\Delta_{12}$ in the main line periodically close and open. In addition, there are radial clearances in the bearings and between their outer rings and the bores in the gearbox housing ($\Delta_{pk}$). When the metal is gripped, the rolling rolls rotate within the gap $\Delta_{12}$. In this case, the increase in torque on the rolls during the period $T$ before this gap closes is not accompanied by a corresponding change in torque in the main line. Rolling in this period proceeds at the expense of the kinetic energy of the rolls. The speed of the rolls therefore decreases and becomes less than the idle speed of the main line elements. The time-varying rolling moment acting on the rolls and the difference between the speeds of the rolls and the drive elements at the moment the gap closes cause the dynamic loads of the transient process, including the force $P$ in the gearing. If the direction of rotation of the wheel is as shown in Fig. 2, the gearing force tends to lift the wheel off the bottom of the bore and move it upward within the limits allowed by the gap $\Delta_{pk}$. After this gap is taken up, the wheel bearings hit the bearing cover.
Fig. 2. Calculation scheme of the main drive line of the working stand of a wide strip hot rolling mill
The resulting force deforms the gearbox housing and tends to tear it away from the foundation. The combination of high dynamic loads on the input and output shafts of the gearbox and the impact closure of the clearance $\Delta_{pk}$ leads to high dynamic loads on the anchor bolts and to significant dynamic deformations of the gearbox housing. Similar phenomena occur in pinion stands, but the resulting loads are smaller than in gearboxes. To determine the dynamic loads in the anchor bolts, the calculation scheme developed by B.E. Zhitomirskyi (VNIIMETMASH) [12] was used. Its specific feature is that the gearbox housing is assumed to be mounted on special shock absorbers or on a base of increased compliance, and the gear wheel can move within the clearance between the outer ring of the bearing and its cover. When developing the calculation scheme (Fig. 2), the following assumptions were introduced, which do not significantly affect the accuracy of determining the dynamic loads in the studied transmission elements: the tangential gap $\Delta_{12}$ of the main line is concentrated in the section $c_{12}$; the axis of rotation of the gearbox housing coincides with the axis of rotation of the pinion; the interaction between the wheel supports and the gearbox housing at the impact closure of the gap $\Delta_{pk}$ is considered as an interaction of two masses through the elastic coupling $c_{pk}$ and a coupling $b_{pk}$ proportional to speed; the angular speed of the motor $\dot{\varphi}_g$ does not change in transient processes, $\dot{\varphi}_g = \mathrm{const}$; the technological moment,
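acting on the working rolls at metal capture, is a piecewise linear function of time,
$$M_T(t) = \begin{cases} M t / T, & t \le T, \\ M, & t > T, \end{cases}$$
where $T$ is the time of metal capture by the rolls, $t$ is the current time, and $M$ is the rolling moment in the steady-state process. Taking into account the assumptions made, the equations of motion of the elements of the computational scheme during capture are as follows:
$$
\begin{aligned}
&\ddot z_1 + \bar f_1 + \beta_{12}(\dot z_1 - \dot z_2) f_2 = -k(\tau),\\
&\theta_2 \ddot z_2 - \bar f_1 - \beta_{12}(\dot z_1 - \dot z_2) f_2 + \bar c_{23}(z_2 - z_3) + \beta_{23}(\dot z_2 - \dot z_3) = 0,\\
&\theta_3 \ddot z_3 - [\bar c_{23}(z_2 - z_3) + \beta_{23}(\dot z_2 - \dot z_3)](1 + \theta_{4np}) + \bar c_{45}(z_3 + z_k) + \beta_{45}(\dot z_3 + \dot z_k)\\
&\qquad - [\bar c_{pk} \bar f_3 + \beta_{pk}(\dot z_k - m \dot z_p) f_4]\,\theta_{pk4} - \bar G_k \theta_{4np} = 0,\\
&\theta_k \ddot z_k + \bar c_{pk} \bar f_3 + \beta_{pk}(\dot z_k - m \dot z_p) f_4 + [\bar c_{45}(z_3 + z_k) + \beta_{45}(\dot z_3 + \dot z_k)]\,\theta_{3np}\\
&\qquad + [\bar c_{23}(z_2 - z_3) + \beta_{23}(\dot z_2 - \dot z_3)](1 - \theta_{3np}) + \bar G_k = 0,\\
&\theta_p \ddot z_p + \bar c_p z_p f_5 + \beta_p \dot z_p - [\bar c_{pk} \bar f_3 + \beta_{pk}(\dot z_k - m \dot z_p) f_4]\, m + \bar G_p = 0,
\end{aligned}
\tag{1}
$$
where $z_1 = \varphi_1 c_{12}/M$, $z_2 = \varphi_2 c_{12}/M$, $z_3 = \varphi_3 c_{12}/M$, $z_k = \varphi_k c_{12}/M$, $z_p = \varphi_p c_{12}/M$, $\bar c_{23} = c_{23}/c_{12}$, $\bar c_{45} = c_{45} i^2/c_{12}$, $\bar c_{pk} = c_{pk} r_k^2/c_{12}$, $\bar c_p = c_p l_p^2/c_{12}$, $\beta_{12} = b_{12}\lambda_n/c_{12}$, $\beta_{23} = b_{23}\lambda_n/c_{12}$, $\beta_{45} = b_{45} i^2 \lambda_n/c_{12}$, $\beta_{pk} = b_{pk} r_k^2 \lambda_n/c_{12}$, $\beta_p = b_p \lambda_n/c_{12}$, $\lambda_n = \sqrt{c_{12}/J_1}$, $\bar G_k = G_k r_k/M$, $\bar G_p = G_p l_p/M$, $k(\tau) = M_T(\tau)/M$, $m = l/r_k = 1 + 1/i$, $i = r_k/r_s$, $\theta_{3np} = J_3/(J_3 + J_4 i^2)$, $\theta_{4np} = J_4 i^2/J_k$, $\theta_3 = (J_3 + J_4 i^2 + J_3 J_4 i^2/J_k)/J_3$, $\theta_k = [J_k + J_3 J_4 i^2/(J_3 + J_4 i^2)]/J_3$, $\theta_p = J_p/J_1$, $\theta_2 = J_2/J_1$, $\bar f_1 = f_1 c_{12}/M$, $\bar f_3 = f_3 c_{12}/M$, $\tau = t\lambda_n$,
$$f_1(\varphi_1 - \varphi_2) = \begin{cases} 0, & |\varphi_1 - \varphi_2| < \Delta_{12}/2, \\ \varphi_1 - \varphi_2 + \Delta_{12}/2, & \varphi_1 - \varphi_2 \ge \Delta_{12}/2, \\ \varphi_1 - \varphi_2 - \Delta_{12}/2, & \varphi_1 - \varphi_2 \le -\Delta_{12}/2, \end{cases}
\qquad
f_2(\varphi_1 - \varphi_2) = \begin{cases} 0, & |\varphi_1 - \varphi_2| < \Delta_{12}/2, \\ 1, & |\varphi_1 - \varphi_2| \ge \Delta_{12}/2, \end{cases}$$
$$f_3(\varphi_k - \varphi_p l/r_k) = \begin{cases} 0, & |\varphi_k - \varphi_p l/r_k| < \Delta_{pk}/2r_k, \\ \varphi_k - \varphi_p l/r_k + \Delta_{pk}/2r_k, & \varphi_k - \varphi_p l/r_k \ge \Delta_{pk}/2r_k, \\ \varphi_k - \varphi_p l/r_k - \Delta_{pk}/2r_k, & \varphi_k - \varphi_p l/r_k \le -\Delta_{pk}/2r_k, \end{cases}$$
$$f_4(\varphi_k - \varphi_p l/r_k) = \begin{cases} 0, & |\varphi_k - \varphi_p l/r_k| < \Delta_{pk}/2r_k, \\ 1, & |\varphi_k - \varphi_p l/r_k| > \Delta_{pk}/2r_k, \end{cases}
\qquad
f_5(\varphi_p) = \begin{cases} 1, & \varphi_p \ge 0, \\ c_F/c_p, & \varphi_p < 0. \end{cases}$$
In Eq. (1), the following designations are introduced: $J_k$ - moment of inertia of the gear wheel relative to the gearing pole; $J_p$ - moment of inertia of the gearbox housing relative to the accepted axis of rotation (pinion axis); $c_{12}$ - spindle stiffness; $c_{23}$ - stiffness of the connection between the gearbox and the pinion stand; $c_{45}$ - stiffness of the connection between the gearbox and the motor; $c_p$ - reduced stiffness of the anchor bolts of the gearbox; $c_{pk}$ - reduced contact stiffness of the bearing seat (between the outer ring of the wheel bearing and the bearing cover); $b_{12}$, $b_{23}$, $b_{45}$, $b_{pk}$, $b_p$ - damping coefficients in the corresponding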
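connections; $c_F$ - reduced stiffness of the foundation under the gearbox; $r_k$ - radius of the gear wheel; $r_s$ - radius of the pinion; $l$ - centre distance of the gearbox; $G_k$, $G_p$ - weights of the wheel and of the gearbox housing, respectively; $l_p$ - distance from the rotation axis of the gearbox housing to the point to which the stiffness of the anchor bolts is reduced. The following values are accepted as initial conditions for the study of (1):
$$z_{10} = z_{20} = z_{30} = 0,\quad z_{p0} = (-\bar G_k l_p/l - \bar G_p)/\bar c_F,\quad z_{k0} = -\delta_{pk}/2 + z_{p0} m - \bar G_k/\bar c_{pk},\quad \dot z_{10} = \dot z_{20} = \dot z_{30} = \dot z_{k0} = \dot z_{p0} = 0,$$
where $\delta_{12} = \Delta_{12} c_{12}/M$, $\delta_{pk} = \Delta_{pk} c_{12}/(r_k M)$, $\bar c_F = c_F/c_{12}$, and
$$k(\tau) = \begin{cases} \tau/\tau_z, & \tau \le \tau_z, \\ 1, & \tau > \tau_z. \end{cases}$$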
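The load ramp $k(\tau)$ and the conversion to dimensionless time $\tau = t\lambda_n$ can be sketched as below. This is only a sketch under stated assumptions: the value of `J1` is not given in the text and is back-computed here from the ranges of the capture time (0.02 to 0.1 s), of $c_{12}$ ($10^8$ to $10^{10}$ kg cm/rad) and of $\tau_z$ (0.338 to 16.9) quoted for the first stage of the study in Section 3; the function names are hypothetical.

```python
import math

def k(tau, tau_z):
    """Dimensionless rolling torque: linear ramp up to tau_z, then constant."""
    return tau / tau_z if tau <= tau_z else 1.0

def dimensionless_time(t, c12, J1):
    """tau = t * lambda_n with lambda_n = sqrt(c12 / J1)."""
    return t * math.sqrt(c12 / J1)

# J1 is NOT given in the text; this value is back-computed by assuming that
# T_z = 0.02 s and c12 = 1e8 kg cm/rad correspond to tau_z = 0.338.
J1 = 1e8 * (0.02 / 0.338) ** 2

tau_min = dimensionless_time(0.02, 1e8, J1)   # shortest capture, softest spindle
tau_max = dimensionless_time(0.1, 1e10, J1)   # longest capture, stiffest spindle
print(round(tau_min, 3), round(tau_max, 1))
```

The check works because $\lambda_n$ scales with the square root of $c_{12}$: a hundredfold change in stiffness gives a tenfold change in $\lambda_n$, and together with the fivefold range of the capture time this reproduces the fiftyfold span of the quoted interval.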
As criteria for evaluating the dynamic properties of the system, we use the following dynamic coefficients: K1 - in the gearbox anchor bolts; K2 - on the clutch connecting the motor with the gearbox; K3 - on the spindles; K4 - on the clutch connecting the gearbox with the pinion stand; K5 - on the gearbox wheel support. Studies carried out by the PLP-search method in the specified ranges of the parameters of the dynamic system of the rolling mill have shown that the basic qualitative characteristics of the vibrations of the mill drive elements are preserved over the whole investigated range. No excitation of self-oscillation modes was noticed, and no development or maintenance of sustained vibrations in the system was observed. Thus, a preliminary conclusion can be drawn that the developed mathematical model adequately describes the dynamic processes taking place.
3 Study of a Dynamic Model by the Method of a Planned Multilevel Computational Experiment
Once the acceptable degree of adequacy of the developed mathematical model had been established (the verification process), the task was set to find, in the J-dimensional hyperparallelepiped, the region where the best solutions for each quality criterion are concentrated. To do this, it was necessary to solve the following tasks: to determine those variable parameters (and their combinations) that, on average, have the greatest influence on the values of the dynamic coefficients; to determine the nature of the interdependence between individual quality criteria in order to develop a comprehensive assessment of the model; to reduce, if possible, the dimension of the initial space of the parameters under study; and to define and select a set of compromise combinations of parameters. The formulated problems were solved by the PLP search method on the basis of the formulated mathematical model in the following sequence. In the first stage, the radial clearance $\Delta_{pk}$ at the gear wheel support was assumed to be zero. The following parameters were varied in the ranges shown below: $10^5 \le c_p$ (kg cm/rad) $\le 10^8$, $0 \le \Delta_{12}$ (rad) $\le 0.02$, $0.054 \le c_{45} \cdot 10^{-10}$ (kg cm/rad) $\le 0.27$, $0.02 \le T_z$ (s) $\le 0.1$, $10^8 \le c_{12}$ (kg cm/rad) $\le 10^{10}$. The range of $T_z$ at the given limits of $c_{12}$ corresponded to the interval of dimensionless time $\tau_z = 0.338 \div 16.9$. When conducting the PLP search in the specified region, a planning matrix was compiled for the
experiments with parameters: $J = 5$; $T_i^{*} = 10$; $M_j = 16$; $N = 160$, where $T_i^{*}$ is the number of permutations (series) of experiments, $M_j$ is the number of experiments in one series, and $N$ is the total number of experiments. The results of processing all experiments are given in Table 1, which lists the calculated values $F$ and the theoretical values $F_T$ of the Fisher criterion; these make it possible to judge the influence of the variable parameters and their pairwise interactions on the values of the corresponding dynamic coefficients. Table 1 shows, for example, that the values of all the dynamic coefficients are significantly influenced by the time of setting the rated load torque on the rolls ($\tau_z$) and by the gap $\Delta_{12}$ ($F > F_T$), and the values of K1 are also influenced by the stiffness of the anchor bolts $c_p$. At this stage, it was found that at $\tau_z \in [6; 16.9]$ and $\Delta_{12} \in (0; 0.01)$ the minimum values of all dynamic coefficients are provided on average. In the second stage, 32 machine experiments were carried out at $\Delta_{pk} = \mathrm{var}$. The gap was varied in the range of $0 \div 3$ mm. The experiments were carried out in series of 8, each with one of four fixed values of $\tau_z$ (2; 4; 5; 6). Processing of the results showed that changes in $\Delta_{pk}$ affected the values of K1 most. The dependence $K(\Delta_{pk})$, averaged over all sub-series, was constructed for eight current values of $\Delta_{pk}$; it is shown in Table 2, from which we can see that a significant increase in the dynamic loads in the system is associated with an increase in the gaps $\Delta_{pk}$. Based on the results of the experiments carried out in the first two stages, the following region of search for rational solutions was selected at a fixed value of $\tau_z = 6$: $0.025 \le c_{45} \cdot 10^{-10}$ (kg cm/rad) $\le 0.225$, $0 \le \Delta_{pk}$ (mm) $\le 1.5$, $10^8 \le c_{12}$ (kg cm/rad) $\le 10^{10}$, $10^6 \le c_p$ (kg cm/rad) $\le 10^8$, $0 \le \Delta_{12}$ (rad) $\le 0.01$. For the PLP search in the specified region, a planning matrix of experiments was compiled with parameters: $r = 5$; $T_i^{*} = 10$; $M_j = 16$; $N = 160$.
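The comparison of a calculated Fisher value $F$ with a theoretical value $F_T$ can be sketched for a single variable parameter. This is a minimal sketch, assuming that the calculated value is the usual one-way analysis-of-variance ratio over the levels of the parameter; the source does not spell out its exact formula, and the data and the name `f_statistic` are hypothetical.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F ratio for a list of groups (one group per parameter level)."""
    k = len(groups)                                   # number of levels
    n = sum(len(g) for g in groups)                   # total number of runs
    grand = mean(v for g in groups for v in g)
    # Scatter of the level means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Scatter of individual runs around their own level mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Hypothetical data: three levels of one parameter, four runs on each level.
groups = [[1.0, 1.2, 0.9, 1.1], [2.0, 2.1, 1.9, 2.2], [3.1, 2.9, 3.0, 3.2]]
F = f_statistic(groups)
print(F)  # compare with the tabulated F_T for (k - 1, n - k) degrees of freedom
```

A parameter would be judged influential when the computed ratio exceeds the tabulated value for the chosen significance level; the tabulated value itself has to be taken from F-distribution tables or a statistics library.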
At this stage, the analysis of variance established a significant influence of the stiffness cp on the values of all dynamic coefficients. Taking into account the results of all the experiments, the following control (recommended) area for finding the best solutions for all quality criteria was identified: 0,36 ≤ cp (kg cm/rad) · 10⁻⁸ ≤ 0,9, 0,4 ≤ c12 (kg cm/rad) · 10⁻⁹ ≤ 1,6, 0,025 ≤ c45 (kg cm/rad) · 10⁻¹⁰ ≤ 0,225, 0 ≤ Δpk (mm) ≤ 0,6, 0 ≤ Δ12 (rad) ≤ 0,006. The results of the control experiments are shown in Table 3. For variant No. 5, additional studies were conducted in the selected area to identify how the values of the dynamic coefficients change as τz decreases. It turned out that when τz decreased from 6 to 3,5, the most important dynamic coefficient K1 increased from 1,62 to 3,79, which is still quite acceptable. At the same time, the values of the other dynamic coefficients remained below 2,2. Thus, the dynamic coefficients in the anchor bolts are influenced most by the parameters τz, cp, Δ12, Δpk and by the ratio cp/c12, which is related to the energy distribution in the system. The stiffness c12 has a significant influence on all five dynamic coefficients, and Δ12 only on K1 and K5. The most effective and structurally feasible way to reduce the dynamic loads on the studied transmission elements is to reduce the rigidity of the gearbox attachment units, in particular by installing them on shock absorbers, as well as to increase the resilience of the connecting elements located on the high-speed drive shafts and to reduce the gaps in the transmissions and gear supports. The use of the developed models and recommendations makes it possible to more than halve the dynamic loads formed during gripping. The nature of
I. N. Statnikov and G. I. Firsov
Table 1. Calculated F values and theoretical F T values of Fisher criterion for variable parameters Parameters τz K1 F T 1,69
12 cp
c45
c12
C 12 τz 12 τz c12 12
1,69 1,69 1,69 1,69 1,94
1,94
1,94
16,62 4,73 1,91 2,80 1,42 2,67
2,11
7,56
1,69 2,07 2,07 1,69 1,94
1,94
2,98
33,62 1,50 1,35 1,05 1,29 42,0
6,22
2,46
K5 F T 1,69 1,69 2,07 1,69 1,69 1,94 F 20,29 4,50 2,29 1,04 1,19 4,28
1,94
1,94
2,92
5,98
F
K3 F T 1,69 F
K2 F T 1,69 F
1,69 1,69 1,69 1,69 1,94
1,94
1,94
22,60 4,65 1,23 1,31 1,04 3,99
2,70
5,09
1,94
1,94
14,37
14,50
K4 F T 1,69 F
1,69 2,07 2,07 1,69 1,94
13,56 6,16 1,62 1,01 2,52 1,64 Parameters C 45 τz c45 12 c12 c45 cp τz
K1 FT 1,89 F
10,67
K3 FT 1,89 F 26,7 K5 FT 1,89 F 11,65 K2 FT 1,89
cp 12 C p c12 C p c45
1,94
1,94
1,09
2,03
3,23
1,94
4,54
3,51
9,45
2,40
1,09
1,24
1,94
1,94
2,09
2,09
2,03
2,93
6,68
11,96
8,67
9,24
9,08
1,27
1,94
1,94
2,09
2,03
2,03
1,94
4,96
3,66
8,50
2,48
1,20
1,02
1,94
1,94
2,09
2,03
2,03
2,09
3,48
5,25
4,58
6,97
3,04
1,12
1,05
K4 FT 1,94
1,94
1,94
2,09
3,23
3,23
1,94
2,52
2,16
10,43 1,38
1,38
2,20
F F
14,50
Table 2. Average dependencies of dynamic coefficients on the gap value Δpk

Δpk, mm   0,375  0,750  1,025  1,500  1,875  2,250  2,625  2,813
K̂(Δpk)    6,9    12,99  15,04  17,26  18,4   20,63  23,27  27,48
interrelationships between the dynamism coefficients in different drive train units is non-antagonistic with respect to the parameters in the given search area, which made it possible to select a single area of best solutions for all quality criteria.
Table 3. The results of control computing experiments in the recommended area of searching for the best solutions for all quality criteria

No.  cp · 10⁻⁸    c12 · 10⁻⁹   c45 · 10⁻¹⁰  Δ12 · 10²  Δpk    Dynamic coefficients
     (kg cm/rad)  (kg cm/rad)  (kg cm/rad)  (rad)      (mm)   K5    K2    K1    K3    K4
1    0,630        1,000        0,125        0,300      0,300  1,11  1,26  3,46  1,11  1,18
2    0,495        1,300        0,975        0,450      0,150  1,43  1,57  3,05  1,14  1,25
3    0,765        0,700        0,175        0,150      0,450  1,00  1,00  3,07  1,08  1,14
4    0,428        1,150        0,200        0,975      0,375  1,06  1,15  3,53  1,11  1,16
5    0,698        0,550        0,100        0,375      0,975  1,15  1,13  1,62  1,09  1,09
6    0,563        0,850        0,150        0,525      0,525  1,14  1,16  5,75  1,09  1,18
7    0,833        1,450        0,050        0,225      0,188  1,46  1,62  3,21  1,16  1,23
8    0,394        1,525        0,163        0,188      0,113  1,30  1,27  2,50  1,13  1,19
9    0,664        0,925        0,063        0,488      0,413  1,26  1,51  4,87  1,15  1,21
10   0,529        0,625        0,213        0,338      0,263  1,07  1,03  2,55  1,07  1,10
4 Conclusions
The use of multilevel planning of computational experiments [13, 14] for solving problems of verification of mathematical models of machine dynamics allows us to draw the following theoretical and practical conclusions:
1) the problem of verifying a mathematical model is solved successfully with a given noisy probability; in the described example, this probability; an increase in the probability value leads, of course, to an increase in the number of computational experiments;
2) the use of PLP search reduces the psychological load when analyzing the influence of the variable parameters on the values of the criteria, since it reduces this task to the analysis of "on average" one-dimensional dependences of the form { ¯ ijk (αij )};
3) an additional possibility of the described method of planning computational experiments is to simplify the development of a mathematical model and the decomposition of a complex system into several simple systems;
4) the identification of strongly acting parameters for the quality criteria of the machine simplifies its practical implementation;
5) at and, it becomes possible to more realistically determine the search area for a compromise in the values of the variable parameters.
References
1. Garoche, P.-L.: Formal Verification of Control System Software, p. 231. Princeton University Press, Princeton (2019)
2. Keesman, K.J.: System Identification. An Introduction. Springer, London (2011). 351 p. https://doi.org/10.1007/978-0-85729-522-4
3. Ikonen, E., Najim, K.: Advanced Process Identification and Control. Marcel Dekker Inc., New York, Basel (2002). 316 p.
4. Tikhonov, A.N., Goncharsky, A.V., Stepanov, V.V., Yagola, A.G.: Numerical Methods for the Solution of Ill-Posed Problems. Springer, Dordrecht (2013). 253 p.
5. Aliyev, A.G., Shahverdiyeva, R.O.: Application of mathematical methods and models in product-service manufacturing processes in scientific innovative technoparks. Int. J. Math. Sci. Comput. 3, 1–12 (2018)
6. Nilima, S., Alind, A.N.: Randomization technique for designing of substitution box in data encryption standard algorithm. Int. J. Math. Sci. Comput. 3, 27–36 (2019)
7. Sinha, P.K., Sinha, S.: The better pseudo-random number generator derived from the library function rand() in C/C++. Int. J. Math. Sci. Comput. 4, 13–23 (2019)
8. Statnikov, I.N., Firsov, G.I.: Using Sobol sequences for planning experiments. J. Phys. Conf. Ser. 937(012050), 1–3 (2017)
9. Statnikov, I., Firsov, G.: Numerical approach to the solution of the problem of rational choice of dynamical systems. In: Hu, Z., Petoukhov, S., He, M. (eds.) Advances in Artificial Systems for Medicine and Education II. AISC, pp. 69–79. Springer, Heidelberg (2020). https://doi.org/10.1007/978-3-030-67133-4_8
10. Sobol, I.M.: Multidimensional Quadrature Formulas and Haar Functions. Nauka, Moscow (1969). 288 p. (in Russian)
11. Ivanchenko, F.K., Krasnoshapka, V.A.: Dynamics of Metallurgical Machines. Metallurgiya, Moscow (1983). 295 p. (in Russian)
12. Zhytomyrskyi, B.E.: Creation of rational transmission systems for main drives in a number of modern rolling mills. VNIIMETMASH, Moscow (1976). 42 p.
13. Statnikov, I.N., Firsov, G.I.: Processing of the results of the planned mathematical experiment in solving problems of research of dynamics machines and mechanisms. In: 2018 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon), pp. 1–6. IEEE Xplore (2018)
14. Statnikov, I.N., Firsov, G.I.: Regression analysis of the results of planned computer experiments in machine mechanics. J. Phys. Conf. Ser. 1205(012054), 1–6 (2019)
Optimization of Network Transmission of Multimedia Data Stream in a Cloud System
Vera V. Izvozchikova, Marina A. Tokareva, and Vladimir M. Shardakov(B)
Orenburg State University, Ave. Pobedy 13, Orenburg, Russia
[email protected], [email protected]
Abstract. The high network bandwidth required to transmit multimedia data and models affects the speed of data transfer from the server to the user. Optimizing the transfer of a multimedia data stream is a difficult task, especially when there are multiple simultaneous requests to the server. In this article, we propose an algorithm for optimizing the transmission of a multimedia data flow over a network for multiple clients. Instead of using a fixed bandwidth, the server determines the optimal data transfer rate for each client connection according to the buffer packets and the playback speed of the three-dimensional scene on each client. The multimedia data transfer rate changes when the total requested bandwidth exceeds the network bandwidth. In addition, the proposed algorithm makes it possible to minimize the allocation of bandwidth and maximize the use of the client buffer. Simulation results show that the proposed algorithm dynamically changes the data transfer rate for each client in order to avoid client buffer overflow and to achieve optimal use of a limited network resource when multiple client network environments are connected to the server.
Keywords: Optimization · Multimedia · Three-dimensional data · Data flows · Bandwidth
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 49–56, 2021. https://doi.org/10.1007/978-3-030-80478-7_6
1 Introduction
Optimizing the network transmission of multimedia data is an important task that is of interest to many researchers. A network multimedia data transmission system is based on specialized software and hardware. Transmitted multimedia streams require a large amount of resources, channels, and nodes, and these resources are limited. When a large volume of traffic is transmitted and multimedia data is processed, network nodes may fail more often and communication lines may often become bottlenecks. This is particularly harmful for multimedia data transmission, which requires that data be delivered without delays or interruptions. These factors have a big impact on the speed of data transfer from the server to the end user. The waiting time consists of the following components:
– serialization delay – the time that the port takes to transmit the packet;
– propagation delay – the time it takes a bit of information to reach the receiver;
– overload delay – the time that the object spends in the output queue of the network element;
– transmission delay – the time during which the network element performs analysis, processing and transmission of the packet.
In view of all of the above, it is necessary to develop new, effective methods and algorithms to optimize the cost of transmitting a stream of multimedia data.
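The four components of the waiting time listed above simply add up, which the following minimal sketch makes explicit. It is an illustration under stated assumptions rather than part of the proposed algorithm: the function names, the signal speed of 2·10⁸ m/s, and the sample figures are all hypothetical, and the overload and transmission delays are passed in as measured values because the text gives no formula for them.

```python
def serialization_delay(packet_bits, link_bps):
    """Time the port needs to put the whole packet on the line, in seconds."""
    return packet_bits / link_bps


def propagation_delay(distance_m, signal_speed_mps=2e8):
    """Time for a bit to travel the link; the default speed is an assumption."""
    return distance_m / signal_speed_mps


def waiting_time(packet_bits, link_bps, distance_m, overload_s, transmission_s):
    """Sum of the four delay components named in the text."""
    return (serialization_delay(packet_bits, link_bps)
            + propagation_delay(distance_m)
            + overload_s          # time spent in the output queue
            + transmission_s)     # analysis and processing in the network element


# Hypothetical example: a 1500-byte packet on a 100 Mbit/s link over 1000 km.
total = waiting_time(packet_bits=12_000, link_bps=100e6, distance_m=1_000_000,
                     overload_s=0.002, transmission_s=0.0005)
print(total)
```

With these illustrative numbers the propagation and queueing terms dominate, which is the situation in which adapting the transfer rate per client, rather than only raising link speed, becomes relevant.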
2 Related Work
There are several approaches to meeting the bandwidth requirements of multimedia transmission. One approach is static resource reservation, which allocates a constant-bit-rate channel sized for the peak data rate of the stream. Because multimedia bandwidth requirements vary widely, static allocation usually wastes a significant share of network resources. Another approach is rate adaptation [1], which adjusts the bandwidth used by a connection according to current network conditions. Compared with static allocation, the adaptive approach makes better use of a network resource that changes over time. Meng Zhang and colleagues analytically studied the scheduling problem in a multimedia data streaming system and described the minimum-cost network flow problems that must be solved to optimize the system's throughput [2]. Shanchieh Yang and G. de Veciana identified key properties that a dynamic bandwidth allocation policy should have in order to improve the performance of a network that supports multimedia file transfer, and proposed a set of bandwidth allocation criteria that depend on the residual work of current transmissions [3]. In [4], the authors solve the synchronization problem for each application independently of other applications, taking into account the available CPU, GPU, memory, and communication bandwidth. They use synchronous data flow graphs (SDFGs) to build a time-constrained model of multimedia applications and thus model cyclic multi-rate dependencies between tasks. A lot of work has been done on resource allocation, traffic management, and multimedia data transfer via cloud services [5–8]. Scientists from Ryerson University presented a model for optimizing cloud resources that takes into account QoS requirements and the cost of resources. A queuing model is proposed to characterize the process of transmitting multimedia data to a cloud system, and an algorithm for minimizing response time and resource consumption is presented. The authors adapted their models and algorithm only to the global platform of the Windows Azure cloud service [9]. Researchers at Beijing University of Posts and Telecommunications presented a model that improves system efficiency when resources are delayed or limited by parallelizing each task on the basis of a search for the critical path of the workflow. They address the scheduling problem in two different scenarios. To keep costs down within a given time limit, they propose a delay-limited task scheduling approach; to reduce the completion time within given resources, they propose a resource-limited task scheduling approach [10].
Scientists led by J. Zhao apply an algorithm designed in ROC space. In their work, the authors propose a solution to two problems: first, the key points and errors in the transmission and presentation of multimedia data to the user are considered, and second, a subject-specific performance indicator for three-dimensional approximations of the front is introduced. The performance of the algorithm is evaluated on the developed theoretical problems using the proposed filter, which achieves a low number of false rejections and false acceptances at low computational cost by using ensembles of rules and neural networks for binary classification of test data [11]. The work [12] is based on the development of a 3D encoder-decoder convolutional neural network architecture, which is needed to speed up the determination of the optimal computational strategy for its deployment. The authors conduct a comparative study of several neural network training strategies and evaluate the effect of using different combinations of input data. Optimization of data packet delivery by means of a connection matrix, together with the use of channel bandwidth to determine the k shortest paths, is considered in [13]. An example of a balanced distribution of tasks in a cloud network is presented in [14]. Network load balancing, an image classification architecture, and performance evaluation are presented in [15–17]. Based on the related work, we can conclude that this study is relevant.
3 Methodology
A multimedia Augmented Transition Network (ATN) and multimedia input strings are used to optimize and build the spatial relationships of various media streams and three-dimensional objects. Key points are the input data for the ATN and represent the order in which the media streams of three-dimensional objects are presented. In addition, a low-level approach to selecting key frames based on the temporal and spatial relationships of objects is adopted. In this work, multimedia data transmission is optimized on the basis of the ATN model, taking into account the task of splitting multimedia data into parts. The principle of splitting the scene and the method of processing multimedia data were proposed in our previous studies [18–20]. When a user requests information from the server, the server analyzes the Internet connection to determine the optimal transfer rate. Instead of transmitting information at a fixed connection speed, the server can change the multimedia data transfer rate automatically based on the data buffer and the user's scene playback speed. To optimize data buffer usage and minimize bandwidth allocation, a linear quadratic tracker is used to build a multimedia data transfer scheduler that yields the optimal transfer rate for each user:

$$V = X x_k + Y u_k, \qquad (1)$$

where $V$ is the data transfer rate, $X$ and $Y$ are constant matrices describing the transmitted object, $u_k$ is the control input, and $x_k$ is the state of the tracked object. We define the cost function $F$ of the system by formula (2):

$$\begin{cases} F = \dfrac{1}{2} \sum\limits_{k=1}^{N-1} (V_k \cdot G_k \cdot H_k); \\ V_k > 0; \\ G_k > 0; \\ H_k > 0, \end{cases} \qquad (2)$$

where $G$ and $H$ are the weight matrices for the boundary conditions $[1, N]$ at which we want to track the object. The control input that governs the sending of multimedia data is defined by formula (3):

$$u_k = -T_k + F_k \cdot V_{k+1}, \qquad (3)$$

where $T_k$ is the feedback gain, defined by formula (4):

$$T_k = (B^T + H)^{-1} \cdot A \cdot d_k, \qquad (4)$$

where $d_k$ are auxiliary sequences for calculating the optimal gain $T_k$. Applying the control input sequence to the object, we obtain a sequence of object states that minimizes the quadratic cost function.

Let us assume that there are $m$ clients that request data from the server simultaneously. For the $j$-th client, let $F_j$ be the cost function, $Z_{r,j}$ the size of the buffer allocated to the $j$-th client when connecting, $f_j(k)$ the difference between the size of the allocated buffer and the $j$-th client's buffer packets at time step $k$, $Z_j(k)$ the buffer packets at time step $k$, and $V_j(k)$ the transfer rate of the $j$-th client. The difference between the size of the allocated buffer and the buffer packets is determined as follows:

$$f_j(k) = Z_j(k) - Z_{r,j}. \qquad (5)$$

To maximize the use of the client buffer $Z_j(k)$, $f_j(k)$ must be minimized. The server-side optimization function is as follows:

$$F = \sum_{j=1}^{m} F_j = \sum_{j=1}^{m} \frac{1}{2} \bigl( f_j(k) + V_j(k) \bigr). \qquad (6)$$

For each client, the tracker is used to achieve the optimal data transfer rate. After the server has calculated the optimal bandwidth for each client, the effective bandwidth $W_j$ of the connection for the $j$-th client must meet the following requirement:

$$\sum_{j} W_j \le w, \qquad (7)$$

where $w$ is the network bandwidth. To avoid the possibility that the total requested bandwidth exceeds the network bandwidth, we reallocate the bandwidth for each client connection. The reallocated bandwidth must be proportional to the actual requirements of each client. The multimedia server operation algorithm is shown in Fig. 1. In the proposed algorithm, the transfer rate for the entire volume of multimedia data can be calculated before the transfer begins. Parameter calculations are performed offline, which greatly simplifies the implementation of the data rate controller.
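The reallocation step that enforces requirement (7) can be sketched as follows. This is a minimal illustration, assuming that "proportional to the actual requirements" means scaling every requested rate by the same factor w / ΣWj; the function name `reallocate` and the sample rates are hypothetical and are not taken from the authors' implementation.

```python
def reallocate(requested, capacity):
    """Scale requested per-client rates so that their sum does not exceed capacity.

    Assumption: a single common scaling factor, so each client keeps a share
    proportional to what it asked for. Rates are left untouched when the
    total request already fits into the network bandwidth.
    """
    total = sum(requested)
    if total <= capacity:
        return list(requested)
    scale = capacity / total
    return [rate * scale for rate in requested]


# Hypothetical example: three clients ask for 20 units in total, the network offers 10.
rates = reallocate([4.0, 6.0, 10.0], capacity=10.0)
print(rates)  # each request is halved
```

A common scaling factor keeps the ratio between any two clients unchanged, which is one straightforward reading of the proportionality requirement; weighted schemes that favour clients with emptier buffers would be an alternative design.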
Fig. 1. Algorithm for optimizing multimedia data transmission
4 Experimental Part
Currently, the main task in multimedia data processing is to optimize the transfer of information from one device to another. Optimization here means minimizing, by technological means, the time needed to transmit information about the type of scene, the objects located in it, and the textures, and to perform the final rendering of the scene on the user's side, without significant loss of image quality. The architecture of the proposed process for processing dynamic multimedia data streams based on resource virtualization for cloud applications, built according to the proposed algorithm, is shown in Fig. 2.
Fig. 2. Architecture for processing dynamic multimedia data streams
The proposed architecture shows the interaction between the user and the cloud server. Multimedia data consisting of a three-dimensional scene with a random arrangement of 3D objects is transmitted to each user. To build a three-dimensional scene, the user is provided, according to the speed of the user's Internet connection, with information about the scene itself and the key locations of the textured three-dimensional objects, together with the necessary three-dimensional models, which are taken from the "object library". The three-dimensional scene, the key points, and the arrangement of 3D objects are created on the user's side. Three-dimensional objects are stored in the cloud. After the scene is completed, all objects are rendered using the server's computing power. The key objects that are processed first are those at which the user's viewing angle is directed. We simulate the transfer of multimedia data in the interval from 1 to 60 s with a time step of 1 s. Figure 3 shows a graph of the rate at which multimedia data is transmitted to the user.
Fig. 3. Graph of the multimedia data transfer rate for a single user. a) Without using an optimization algorithm, b) using an optimization algorithm for multimedia data transmission
As you can see from Fig. 3, the transmission of the multimedia data stream is optimal for reproducing a three-dimensional scene. Figure 4 shows the multimedia data transfer schedule for three users.
Fig. 4. Multimedia data transfer rate graph for three users. a) Without using an optimization algorithm, b) using an optimization algorithm for transmitting multimedia data
As you can see from the figure, at some point in time the optimal multimedia data transfer rate is adjusted if the total data transfer rate of all users exceeds the network bandwidth. After the adjustment, the total data transfer rate is equal to the network bandwidth. This is necessary in order to meet the playback requirements of all users as far as possible. If the transfer rate is not sufficient to keep up with the data playback speed, packets in the client buffer are consumed, so that the difference between the size of the allocated buffer and the amount of data in the buffer increases. The transfer rate of multimedia data is optimized automatically. Each user has an individual schedule for receiving information and playing it back.
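The behaviour described above, with rates capped at the network bandwidth and client buffers draining when the transfer rate falls below the playback speed, can be reproduced with a toy simulation. The sketch below is not the linear quadratic tracker of the Methodology section: it assumes a much simpler, hypothetical policy in which each client requests its playback rate plus whatever is needed to refill its buffer, and the server scales all requests by a common factor. All names and numbers are illustrative.

```python
def simulate(playback, z_alloc, capacity, steps=60, dt=1.0):
    """Toy per-second simulation of client buffers under a shared bandwidth cap.

    Assumptions (not from the paper): buffers start full, each client requests
    playback rate plus its current buffer deficit, and requests are scaled
    proportionally when their sum exceeds the capacity.
    Returns a list of (rates, deficits) per step, where the deficit follows
    the sign convention of Eq. (5): buffered amount minus allocated size.
    """
    m = len(playback)
    buffers = list(z_alloc)
    history = []
    for _ in range(steps):
        requested = [playback[j] + (z_alloc[j] - buffers[j]) / dt for j in range(m)]
        total = sum(requested)
        scale = min(1.0, capacity / total) if total > 0 else 1.0
        rates = [r * scale for r in requested]
        for j in range(m):
            level = buffers[j] + (rates[j] - playback[j]) * dt
            buffers[j] = min(z_alloc[j], max(0.0, level))  # keep within the buffer
        deficits = [buffers[j] - z_alloc[j] for j in range(m)]
        history.append((rates, deficits))
    return history


# Hypothetical scenario: three users whose playback needs exceed the bandwidth.
history = simulate(playback=[4.0, 6.0, 10.0], z_alloc=[10.0, 10.0, 10.0], capacity=10.0)
print(history[0])
```

In this scenario the summed rates sit at the network bandwidth from the first step and the buffers drain, which matches the qualitative description of Fig. 4; with a capacity above the total playback rate the buffers stay full and no adjustment occurs.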
5 Conclusion
In this paper, the authors propose an algorithm for optimizing the transmission of a multimedia data flow over the network by determining the optimal data transfer rate for each client connection in accordance with the buffer packets and the multimedia data transfer rate for each user. The optimized data transfer rate can be adjusted according to the network bandwidth limit. In addition, the parameters for the transmission of multimedia data are calculated offline. Using the proposed algorithm, it is possible to minimize the allocation of bandwidth and maximize the use of the buffer. The simulation results show that the proposed algorithm dynamically changes the data transfer rate for each client in order to avoid client buffer overflow and to achieve optimal use of a limited network resource when multiple client network environments are connected to the server.
References
1. Yan, D., Chi, Z., Siliang, S., Zailu, H.: Adaptive bandwidth allocation based on particle swarm optimization for multimedia LEO satellite systems. In: IEEE Xplore, Digests 2006 First International Conference on Communications and Networking in China, Beijing, China, pp. 1–6 (2007)
2. Zhang, M., Xiong, Y., Zhang, Q., Sun, L., Yang, S.: Optimizing the throughput of data-driven peer-to-peer streaming. IEEE Trans. Parallel Distrib. Syst. 20(1), 97–110 (2009)
3. Yang, S., Veciana, G.: Size-based adaptive bandwidth allocation: optimizing the average QoS for elastic flows. In: Digests Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies, IEEE, New York, USA, pp. 657–666 (2002)
4. Stuijk, S., Basten, T., Geilen, M.C.W., Corporaal, H.: Multiprocessor resource allocation for throughput-constrained synchronous dataflow graphs. In: Digests 2007 44th ACM/IEEE Design Automation Conference, IEEE, San Diego, CA, USA, pp. 1–6 (2007)
5. Byna, G., Sun, X.-H., Thakur: Improving the performance of MPI derived datatypes by optimizing memory-access cost. In: Digests 2003 Proceedings IEEE International Conference on Cluster Computing, IEEE, Hong Kong, China, pp. 1–8 (2004)
6. Jin, Y., Qian, Z., Sun, G.: A real-time multimedia streaming transmission control mechanism based on edge cloud computing and opportunistic approximation optimization. Multimedia Tools Appl. 78, 8911–8926 (2019)
7. Wu, J., Yuen, C., Cheung, N.-M., Chen, J., Chen, C.W.: Modeling and optimization of high frame rate video transmission over wireless networks. IEEE Trans. Wireless Commun. 15(4), 2713–2726 (2016)
8. Ranjan, R.: Streaming Big Data processing in datacenter clouds. IEEE Cloud Comput. 1(1), 78–83 (2014)
9. Nan, X., He, Y., Guan, L.: Queueing model based resource optimization for multimedia cloud. J. Visual Commun. Image Represent. 25(5), 928–942 (2014)
10. Gao, Y., Ma, H., Zhang, H., Kong, X., Wei, W.: Concurrency optimized task scheduling for workflows in cloud. In: Digests 2013 IEEE Sixth International Conference on Cloud Computing, IEEE, Santa Clara, CA, USA, pp. 709–716 (2013)
11. Zhao, J., et al.: Multiobjective optimization of classifiers by means of 3D convex-hull-based evolutionary algorithms. Inf. Sci. 367–368, 80–104 (2016)
12. Banga, S., Gehani, H., Bhilare, S., Patel, S., Kara, L.: 3D topology optimization using convolutional neural networks. ArXiv, vol. abs/1808.07440, p. 21 (2018)
13. Moza, M., Kumar, S.: Finding K shortest paths in a network using genetic algorithm. Int. J. Comput. Netw. Inform. Secur. 12(5), 56–73 (2020)
14. Alakbarov, R.G.: Method for effective use of cloudlet network resources. Int. J. Comput. Netw. Inform. Secur. 12(5), 46–55 (2020)
15. Sati, S., Abulifa, T., Shanab, S.: PRoPHET using optimal path hops. Int. J. Wirel. Microwave Technol. 10(4), 16–21 (2020)
16. Kaur, A., Kaur, B., Singh, P., Devgan, M.S., Toor, H.K.: Load balancing optimization based on deep learning approach in cloud environment. Int. J. Inform. Technol. Comput. Sci. 12(3), 8–18 (2020)
17. Aamir, M., Rahman, Z., Abro, W., Tahir, M., Ahmed, S.: An optimized architecture of image classification using convolutional neural network. Int. J. Image Graph. Signal Process. 11(10), 30–39 (2019)
18. Bolodurina, I.P., Parfenov, D.I., Shardakov, V.M.: Development and research of an adaptive traffic routing algorithm based on a neural network approach for a cloud system oriented on processing big data. In: Digests 4th Ural Workshop on Parallel, Distributed, and Cloud Computing for Young Scientists, Ural-PDC, vol. 2281, pp. 98–111. CEUR-WS, Yekaterinburg, Russian Federation (2018)
19. Shardakov, V., Parfenov, D., Bolodurina, I., Izvozchikova, V., Zaporozhko, V., Mezhenin, A.: Development of an effective model of parallel processing of multimedia data on the CPU and GPU in the cloud system. In: Journal of Physics: Conference Series, Digests International Scientific Conference on Applied Physics, Information Technologies and Engineering 2019, APITECH 2019, vol. 1399, no. 3, p. 8. Krasnoyarsk Science and Technology, City Hall, Krasnoyarsk, Russian Federation (2019)
20. Shardakov, V.M., Parfenov, D.I., Zaporozhko, V.V., Izvozchikova, V.V.: Development of an adaptive module for visualization of the surrounding space for cloud educational environment. In: Digests Management of Large-Scale System Development, MLSD 2018: Proceedings of 11th International Conference, IEEE, Moscow, Russian Federation, pp. 1–5 (2018)
Using Virtual Scenes for Comparison of Photogrammetry Software
Aleksandr Mezhenin1(B), Vladimir Polyakov1, Angelina Prishhepa1, Vera Izvozchikova2, and Anatoly Zykov1
1 ITMO University, Kronverksky Ave. 49, St. Petersburg, Russia
2 Orenburg State University, Ave. Pobedy 13, Orenburg, Russia
Abstract. Photogrammetry requires capturing a series of overlapping photographs with specific properties, from which a three-dimensional reconstruction is subsequently obtained. The quality of models obtained by photogrammetry depends not only on the software used, but also on the shooting conditions, the number of images (number of cameras), the settings, the spatial orientation, etc. Manufacturers of photogrammetry software provide some advice on how to take photographs, but this information is often insufficient. The article discusses how the quality of 3D photogrammetric reconstructions can be improved by planning the survey. To conduct experiments, a simulation approach is proposed: a virtual studio that simulates the process of photography in a 3D computer modeling environment. The resulting sequence of photorealistic images of the test object can be processed by the software under evaluation to calculate a 3D point cloud. The mathematical apparatus for comparing test models and 3D reconstructions is based on the Hausdorff metric.
Keywords: Photogrammetry · Simulation Software · 3D Data Acquisition · Virtual space · Virtual modeling · Point clouds · 3D computer modeling · Hausdorff metric
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 57–65, 2021. https://doi.org/10.1007/978-3-030-80478-7_7
1 Introduction
Currently, technologies for creating three-dimensional models from digital photographs (photogrammetry) are becoming more widespread. To obtain high-quality images that contain enough data for a reliable reconstruction of objects, it is necessary to choose optimally the location of the photo or video shooting, the direction and distance between the objects under study and the mobile video system, and the number of photos and video frames [11, 18]. The data obtained from the survey is the source material for subsequent processing, and the result largely depends on its quality. The determining factors are not only the parameters of the cameras used, but also the conditions and methods of shooting and its geometric parameters [23, 24]. One of the important problems here is a preliminary assessment of the volume of source material needed to obtain a high-quality result [1, 2]: how many photographs are needed, how best to set the lighting if that is possible, and so on. To address these issues, the authors propose to conduct a number of experiments, analyze the results, and offer recommendations for organizing the shooting process for some types of objects. To carry out these experiments, it is proposed to use so-called virtual studios in a 3D computer modeling environment, which ensure the same shooting conditions (image quality, illumination) [15, 16]. In addition, with this approach there is always an initial "ideal" object with known parameters, with which the results of photogrammetry are then compared.
2 Related Work
In [20], a special simulator created for the project is proposed for planning a photogrammetric survey. It provides a library of georeferenced 3D solids that can be combined to create complex structures. Synthetic images are obtained using a virtual photogrammetric camera with known interior orientation parameters, and computer animations can be created. The virtual simulator can be used to optimize preliminary data collection in simple or complex scenarios. In [21], close-range photogrammetric measurement systems are considered for high-precision inspection of the surfaces of car body parts. These measurement systems are based on an active light source, a projector, and one or more cameras. The system under consideration uses a fringe projection sequence based on a combination of Gray code and phase shift techniques. The quality of the measurement results depends mainly on the best positioning of the sensors, which requires human expertise and experience. It is proposed to use computer algorithms to find the optimal measurement positions. Simulation processes are introduced as part of a research project aimed at assessing the quality of measurement positions with respect to visibility, achievable accuracy, and feasible feature extraction. One of the proposed approaches is to model a photogrammetric sensor using ray tracing methods to create a photorealistic image from the viewpoint of the sensor cameras. The article [22] presents a simulation approach to obtaining three-dimensional (3D) data based on photogrammetry. For this, a series of overlapping photographs with certain properties must be created, from which a three-dimensional reconstruction is subsequently obtained. Scanning a building, a person, or jewelry requires different numbers of cameras, settings, spatial orientations, and so on. Without precise information on how to take photographs effectively, acquiring them takes a long time, with no guarantee that the result will be of sufficient quality for 3D reconstruction. The proposed modeling approach aims to alleviate these problems and improve the process of collecting 3D data based on photogrammetry. The presented simulator has been tested in the context of developing a 3D scanning system for scanning the human body and creating avatars. Experiments confirm that the proposed method improves the quality of the reconstruction of 3D objects in comparison with previous practice in 3D scanning of a person. In addition, it reduces the cost and time required to build 3D scanning systems, thereby confirming the value and validity of the presented approach.
3 The Proposed Solution
As noted, any photogrammetry system takes as its initial data a series of overlapping photographs with certain properties, from which a three-dimensional reconstruction is subsequently obtained. A high-quality result requires a sufficient number of photographs; the shooting angle, camera parameters and placement of light sources are also important. For a preliminary assessment of the organization of the shooting process, the authors propose to use existing three-dimensional modeling systems, such as 3ds Max or Blender (Fig. 1). Quality is to be assessed by comparing the test objects at the level of polygonal models. This approach makes it possible not only to simulate shooting conditions in order to find those that give the best result, but also, to a certain extent, to evaluate various photogrammetry systems. This can be used to improve existing software and develop new solutions.
Fig. 1. Pipeline simulation concept.
The use of three-dimensional modeling systems makes it possible to create virtual stands of various configurations, simulating different numbers of cameras and their positions, and moving the light sources (Fig. 2).
Fig. 2. Location of cameras and light sources. Reconstruction results.
The use of virtual stands allows an objective assessment of the results of photogrammetry [2]. The investigated model and the reconstructed one are both converted to polygonal form and then compared. Such comparisons use different approaches based on calculating the Euclidean distance and the RMSE. According to the authors, such an estimate is only approximate, since it does not take into account the topological features of the compared polygonal models.
Analysis of the Similarity of Polygonal Models of Arbitrary Topological Form. In most cases, the Root Mean Square Error (RMSE) is used to estimate the accuracy of 3D
model reconstruction or to solve simplification problems:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N_{Rows} N_{Cols}} \sum_{i=1}^{N_{Rows}} \sum_{j=1}^{N_{Cols}} \left(f_{i,j} - d_{i,j}\right)^{2}}$$
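As a minimal sketch of the RMSE estimate, assume that both models have already been resampled onto a common grid of N_Rows × N_Cols values f and d. The function name `rmse_grid` and the sample values are illustrative and do not come from the paper.

```python
import math

def rmse_grid(f, d):
    """RMSE between two equally sized 2D grids given as lists of rows."""
    n_rows, n_cols = len(f), len(f[0])
    if len(d) != n_rows or any(len(row) != n_cols for row in d):
        raise ValueError("grids must have the same shape")
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            total += (f[i][j] - d[i][j]) ** 2
    return math.sqrt(total / (n_rows * n_cols))

# Made-up 2 x 2 grids that differ in a single cell by 2.
f = [[0.0, 1.0], [2.0, 3.0]]
d = [[0.0, 1.0], [2.0, 5.0]]
print(rmse_grid(f, d))  # sqrt(4 / 4) = 1.0
```

A single number of this kind averages all deviations together, so one large local defect and many small ones can yield the same value.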
This approach, according to the authors, does not give reliable results. One possible solution to this problem is to use the Hausdorff distance, which provides a quantitative estimate of the similarity of polygonal objects [3, 4]. To determine the Hausdorff distance, consider two polygonal models: the original model M and the model M′ obtained as a result of reconstruction (simplification) (Fig. 1). To indicate the degree of similarity, we introduce an indicator E, whose value E(M, M′) shows how much one shape deviates from the other. A topological space X is called Hausdorff if any two distinct points x, y from X have disjoint neighborhoods U(x), V(y). Let two sets of points A = {a_1, a_2, ..., a_m} and B = {b_1, b_2, ..., b_n} be given. Then the Hausdorff distance is defined as
$$H(A, B) = \max\bigl(h(A, B),\, h(B, A)\bigr), \qquad (1)$$
where $h(A, B) = \max_{a \in A} \min_{b \in B} \lVert a - b \rVert$ and $\lVert \cdot \rVert$ is the Euclidean norm.
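The definition in Eq. (1) can be illustrated with a brute-force sketch over two small point sets. It assumes the sets are plain lists of 3D tuples (for example, mesh vertices or sample points); the function names are illustrative, and the cost is O(mn), so this is not meant for large meshes.

```python
import math

def directed_hausdorff(A, B):
    """h(A, B): the largest distance from a point of A to its nearest point of B."""
    return max(min(math.dist(a, b) for b in B) for a in A)

def hausdorff(A, B):
    """H(A, B) = max(h(A, B), h(B, A)), as in Eq. (1)."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))

# Made-up point sets: B contains A plus one extra point.
A = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
B = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
print(directed_hausdorff(A, B))  # 0.0: every point of A coincides with a point of B
print(directed_hausdorff(B, A))  # 2.0: (1, 2, 0) is 2 away from its nearest point of A
print(hausdorff(A, B))           # 2.0
```

The two directed values differ here, which is why the symmetric distance takes the maximum of both directions.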
The function h(A, B) is not symmetric and is called the directed Hausdorff distance from A to B. This value can serve as a basis for comparing two polygonal surfaces S_1 and S_2. The accuracy of its calculation is determined by the integration step, which can be specified as a percentage of the diagonal of the model's bounding box. The accuracy can be improved by generating k Monte Carlo samples distributed over the faces in proportion to their areas. The defining step in calculating this metric is the construction of the normal vectors to the surface [2]. Calculating the normal vectors can be posed as an optimization problem, described by the following expression:
$$\min_{n_i} J(p_i, Q_i, n_i). \qquad (2)$$
The minimum of this function is sought according to the following criteria: the distance from a point to the local plane, the angle between the vectors, and the tangent of the angle of inclination of the vector to the normal (Fig. 3).
Fig. 3. Different approaches to constructing normal vectors.
To improve the accuracy of calculating the Hausdorff distance, the authors suggest using an averaging method: calculating the weighted average of the normal vectors formed by pairs of adjacent triangles (Fig. 1c). In this case, to find the coordinates of the intersection points of the normal vectors constructed from one surface to the other, we use the following facts. Let ABC be an arbitrary triangle, a, b, c the lengths of the sides opposite the vertices A, B and C, respectively, and M the intersection point of the bisectors. Then for any point O the following equality holds:
$$\overrightarrow{OM} = \frac{a \cdot \overrightarrow{OA} + b \cdot \overrightarrow{OB} + c \cdot \overrightarrow{OC}}{a + b + c}. \qquad (3)$$
As a corollary, if O is the origin, then
$$x_M = \frac{a x_A + b x_B + c x_C}{a + b + c}, \quad y_M = \frac{a y_A + b y_B + c y_C}{a + b + c}, \quad z_M = \frac{a z_A + b z_B + c z_C}{a + b + c}. \qquad (4)$$
If vectors a and b are given by their rectangular Cartesian coordinates, that is, in an orthonormal basis,
$$a = (a_x, a_y, a_z), \quad b = (b_x, b_y, b_z), \qquad (5)$$
and the coordinate system is right-handed, then their cross product is
$$[a, b] = (a_y b_z - a_z b_y,\; a_z b_x - a_x b_z,\; a_x b_y - a_y b_x). \qquad (6)$$
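The bisector intersection point of Eq. (4) and the cross product of Eq. (6) can be sketched as follows. The paper does not state which weights are used when averaging the normals of adjacent triangles, so weighting by triangle area is an assumption made here, and all function names are illustrative.

```python
import math

def sub(u, v):
    """Component-wise difference u - v."""
    return tuple(ui - vi for ui, vi in zip(u, v))

def cross(a, b):
    """Cross product in a right-handed orthonormal basis, as in Eq. (6)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def incenter(A, B, C):
    """Intersection point of the bisectors, Eq. (4)."""
    a, b, c = math.dist(B, C), math.dist(A, C), math.dist(A, B)
    s = a + b + c
    return tuple((a * pa + b * pb + c * pc) / s for pa, pb, pc in zip(A, B, C))

def averaged_normal(tri1, tri2):
    """Unit normal averaged over two adjacent triangles (area weights are assumed)."""
    total = [0.0, 0.0, 0.0]
    for A, B, C in (tri1, tri2):
        n = cross(sub(B, A), sub(C, A))  # its length is twice the triangle area
        total = [t + ni for t, ni in zip(total, n)]
    length = math.sqrt(sum(t * t for t in total))
    return tuple(t / length for t in total)

# A 3-4-5 right triangle and a neighbor sharing the edge BC, both in the plane z = 0.
A, B, C, D = (0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (3.0, 4.0, 0.0)
print(incenter(A, B, C))                       # (1.0, 1.0, 0.0)
print(averaged_normal((A, B, C), (B, D, C)))   # (0.0, 0.0, 1.0)
```

Summing the unnormalized cross products gives the area weighting for free, because each product already has a length equal to twice the area of its triangle.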
Thus, the proposed method for analyzing the similarity of polygonal models of an arbitrary topological type can serve as a basis for implementing the corresponding algorithms. According to the authors, using the weighted average when calculating the normal vectors increases the accuracy of the subsequent calculation of the Hausdorff metric. The proposed approaches can be applied to assessing the quality of algorithms for reconstruction and recognition of 3D models, as well as to the multiscale representation of polygonal models.
4 Approbation
In the 3ds Max environment, a virtual studio was created and the camera movement was configured to simulate shooting in a circle from different heights (Fig. 4). A simple geometric model (200 × 200 × 25 mm) with a non-uniform texture applied to it was created as the object for reconstruction (Fig. 2). The model schematically represents an object standing on a platform. A circular fly-around was then rendered and 40 images were obtained: 20 at a height of 10 cm and 20 at a height of 40 cm. The data were loaded and processed in Agisoft Photoscan, and a dense point cloud was obtained (Fig. 5). It was decided not to convert the result further into a polygonal model, in order to limit possible losses in the quality of the result at the early
Fig. 4. Virtual studio in 3ds Max. Three-dimensional model for reconstruction (with and without texture).
Fig. 5. Result of photogrammetry, dense point cloud.
stages of experiments. To combine the materials, CloudCompare 2.8 was used, which can align polygonal meshes with each other, point clouds with each other, and meshes with point clouds (Fig. 6). To make alignment of the model and the point cloud more convenient, the areas corresponding to the outermost row of numbers on the platform were removed; the lined texture was chosen precisely so that this removal could be done more accurately.
Fig. 6. The result of combining the polygonal model and the point cloud. Colored point cloud in MeshLab.
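The circular shooting set-up described in this section (20 views at a height of 10 cm and 20 at 40 cm) can be sketched as a list of camera positions. The orbit radius is not stated in the paper, so `RADIUS_CM` is an assumed value, and `ring_positions` is an illustrative helper rather than part of 3ds Max.

```python
import math

def ring_positions(n, radius, height):
    """n camera positions evenly spaced on a horizontal circle around the z axis."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n),
             height) for k in range(n)]

RADIUS_CM = 60.0  # assumed; the paper does not give the orbit radius
cameras = ring_positions(20, RADIUS_CM, 10.0) + ring_positions(20, RADIUS_CM, 40.0)
print(len(cameras))  # 40
```

Each position would still need a look-at orientation towards the object before rendering.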
The result was transferred to MeshLab, in which the objects were compared and the minimum and maximum discrepancies were obtained: min: 0.000000, max: 0.010193, mean: 0.000441 (as fractions of the object's bounding box). For visual clarity, the point cloud was colored in accordance with the quality map (Colorize by Quality), where red means an exact match and green means maximum deviation.
Assessing the Visual Quality of the Resulting Models. The programs Agisoft Metashape, ReCap Photo and Photomodeler [4, 7, 25] were chosen to assess the realism of the obtained
models. To assess the visual quality of the resulting models, a survey was conducted. The final samples were compared using the Mann–Whitney U-test.
Reconstruction of the model in Agisoft Metashape. For the experiment, a white cube on a surface with a non-uniform texture was used. It took about 3 h to build the point cloud and the model from 29 original images. The model has noticeable defects along some edges.
ReCap Photo. There were no difficulties in building the model. In the educational version of the program, the waiting time in the queue during construction was 7 h. The program is easy to learn. According to the survey in the subsequent experiment, this model turned out to be the most photorealistic.
Photomodeler. The program requires control points to be set; with automatic placement by Smart Match, the model cannot be built correctly. Therefore, 4 photographs from different angles were used (instead of 29 in the other programs), control points were set at the vertices of the cube, and then the model was built. The original model has some defects at the edges; in this model these defects are not visible as a result of setting the control points, which makes it less photorealistic. The results of model reconstruction (the original model and the models obtained in Agisoft Metashape, ReCap Photo and Photomodeler) are shown in Fig. 7.
Fig. 7. Model reconstruction results.
The respondents compare these images with the original photographs and rate the photorealism from 1 to 5. Based on the results of the pilot experiment, the final samples were formed and then they were compared using the Mann-Whitney test (Table 1). When comparing Agisoft and ReCap Photo, as well as Agisoft and Photomodeler, we recognize the result as statistically significant. Table 1. Mann–Whitney U-test. Pilot experiment.
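For readers who want to reproduce this kind of comparison, the following is a minimal sketch of a two-sided Mann–Whitney U-test for two samples of 1 to 5 ratings. It uses midranks for ties and a tie-corrected normal approximation without continuity correction; the ratings below are made up for illustration and are not the survey data from the paper.

```python
import math
from collections import Counter

def mann_whitney_u(x, y):
    """Return (U, z, p) for a two-sided Mann-Whitney U-test on lists x and y."""
    n1, n2 = len(x), len(y)
    counts = Counter(x + y)
    # Midrank of each distinct value: the average of the 1-based positions it occupies.
    ranks, pos = {}, 0
    for value in sorted(counts):
        t = counts[value]
        ranks[value] = pos + (t + 1) / 2
        pos += t
    u1 = sum(ranks[v] for v in x) - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in counts.values()) / (n * (n - 1))
    sigma = math.sqrt(n1 * n2 / 12 * ((n + 1) - tie_term))
    z = (u1 - n1 * n2 / 2) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value from the normal approximation
    return min(u1, u2), z, p

# Hypothetical photorealism ratings for two programs (not taken from Table 1).
ratings_a = [5, 4, 5, 4, 5, 3, 4, 5]
ratings_b = [3, 2, 3, 4, 2, 3, 3, 2]
u, z, p = mann_whitney_u(ratings_a, ratings_b)
print(u, round(z, 3), round(p, 4))
```

With samples this small, an exact test would be preferable to the normal approximation; the sketch only shows the structure of the calculation.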
Diagrams of the dependence of frequency on estimates in the programs under consideration are presented in Fig. 8.
Fig. 8. Frequency dependence on estimates.
5 Conclusion
The approach proposed by the authors will help to find suitable software for photogrammetry tasks. The studies carried out show the feasibility and prospects of using virtual environments of various configurations, created by means of 3D computer modeling, for photogrammetry tasks. Their use will make it possible to improve shooting scenarios and will save time in experiments. In addition, simulated visual 3D environments can be successfully used to develop new photogrammetry systems and improve existing ones. The proposed mathematical apparatus for analyzing the similarity of polygonal models of an arbitrary topological form will allow more accurate estimates to be obtained.
References
1. Dyshkant, N.: Measures for Surface Comparison on Unstructured Grids with Different Density. In: Debled-Rennesson, I., Domenjoud, E., Kerautret, B., Even, P. (eds.) DGCI 2011. LNCS, vol. 6607, pp. 501–512. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19867-0_42
2. Dyshkant, N.: An algorithm for calculating the similarity measures of surfaces represented as point clouds. Pattern Recognit. Image Anal. 20(4), 495–504 (2010)
3. Tomaka, A.: The application of 3d surfaces scanning in the facial features analysis. J. Med. Inform. Technol. 9, 233–240 (2005)
4. ReCap Pro: Reality Capture & 3D Scanning Software. Autodesk. https://www.autodesk.com/products/recap/overview
5. Stepanyants, D.G., Knyaz, V.A.: PC-based digital close-range photogrammetric system for rapid 3D data input in CAD systems. Int. Arch. Photogrammetry Remote Sens. 33(part B5), 756–763 (2000)
6. Mitra, N.J., Guibas, L.J., Pauly, M.: Partial and approximate symmetry detection for 3D geometry. In: Proceedings ACM SIGGRAPH, pp. 560–568 (2006)
7. PhotoModeler Home: Accurate and Affordable 3D Modeling – Measuring – Scanning. XLsoft Corporation. https://www.xlsoft.com/en/products/photomodeler
8. Knyaz, V.A., Zheltov, S.Yu.: Photogrammetric techniques for dentistry analysis, planning and visualisation. In: Proceedings ISPRS Congress Beijing 2008, Proceedings of Commission V, pp. 783–788 (2008)
9. MathWorks: Makers of MATLAB and Simulink. https://www.mathworks.com/
10. CloudCompare: Open Source project. https://www.danielgm.net/cc/
11. Spirintseva, O.V.: The multifractal analysis approach for photogrammetric image edge detection. Int. J. Image Graphics Signal Process. 8(12), 1–7 (2016). https://doi.org/10.5815/ijigsp.2016.12.01
12. Mezhenin, A., Izvozchikova, V., Ivanova, V.: Use of point clouds for video surveillance system cover zone imitation. In: CEUR Workshop Proceedings, vol. 2344 (2019)
13. Mezhenin, A., Zhigalova, A.: Similarity analysis using Hausdorff metrics. In: CEUR Workshop Proceedings, vol. 2344 (2019)
14. Sizikov, V.S., Stepanov, A.V., Mezhenin, A.V., Burlov, D.I., Éksemplyarov, R.A.: Determining image-distortion parameters by spectral means when processing pictures of the earth’s surface obtained from satellites and aircraft. J. Opt. Technol. 85(4), 203 (2018)
15. Mezhenin, A.V.: Virtual’nye 3D sredy kak sredstvo verifikacii i testirovaniya pri proektirovanii. Prioritetnye nauchnye napravleniya: ot teorii k praktike 21, 105–110 (2016) [in Russian]
16. Mezhenin, A.V., Izvozchikova, V.V.: 3D modelirovanie metodov s"emki mobil’nymi videosistemami. Programmnye produkty i sistemy 3, 163–167 (2016) [in Russian]
17. Mezhenin, A.V., Izvozchikova, V.V.: Razmernost’ Hausdorfa v zadachah analiza podobiya poligonal’nyh ob"ektov. INTELLEKT. INNOVACII. INVESTICII (2016) [in Russian]
18. Dagar, N.S., Dahiya, P.K.: A comparative investigation into edge detection techniques based on computational intelligence. Int. J. Image Graphics Signal Process 11(7), 58–68 (2019). https://doi.org/10.5815/ijigsp.2019.07.05
19. Zykov, A.G., Mezhenin, A.V., Polyakov, V.I.: Virtual’nye 3D-sredy kak sredstvo verifikacii i testirovaniya robototekhnicheskih sistem. Gibridnye i sinergeticheskie intellektual’nye sistemy: teoriya i praktika: materialy 1-go mezhdunarodnogo simpoziuma. pod red. prof. Kolesnikova, A.V., BFU im. I. Kanta, Kaliningrad. vol. 2, pp. 128–134. 444 s (2012)
20. Piatti, E.J., Lerma, J.L.: A virtual simulator for photogrammetry (2013). https://doi.org/10.1111/phor.12001
21. Becker, T., Özkul, M., Stilla, U.: Simulation of close-range photogrammetric systems for industrial surface inspection. In: Photogrammetric Image Analysis (PIA 2011), Munich, Germany, October 5–7 (2011)
22. Gajic, D.B., Mihic, S., Dragan, D., Petrovic, V., Anisic, Z.: Simulation of photogrammetry-based 3D data acquisition. Int. J. Simul. Model. 18(1), 59–71 (2019)
23. Tin, H.H.K.: Robust algorithm for face detection in color images. Int. J. Modern Educ. Comput. Sci. 2, 31–37 (2012). https://doi.org/10.5815/ijmecs.2012.02.05
24. Narendira Kumar, V.K., Srinivasan, B.: Ear biometrics in human identification system. Int. J. Inform. Technol. Comput. Sci. 4(2), 41–47 (2012). https://doi.org/10.5815/ijitcs.2012.02.06
25. Agisoft Metashape: https://www.agisoft.com/
Structural-Modal Analysis of Biomedical Signals
A. Yu. Spasenov1(B), K. V. Kucherov1, S. I. Dosko1, V. M. Utenkov1, and Bin Liu2
1 Bauman Moscow State Technical University, Moscow 105005, Russia
2 Panther Healthcare Medical Equipment Co., Beijing 102209, China
Abstract. This paper describes a method for analyzing time series that makes it possible to determine both sudden changes in the structure of a complex system and its long-term adaptation trends. The method is suitable for processing locally stationary and non-stationary signals. It yields a characteristic of the adaptation of the system to external influences and to internal processes occurring in it on different time scales. The method is based on a signal decomposition using harmonic transformations and subsequent clustering of the resulting decomposition components.
Keywords: Multidimensional time series · Electrocardiosignals · Cardiac cycle · Spectral analysis · Pattern recognition · Machine learning · Mathematical processing
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 66–73, 2021. https://doi.org/10.1007/978-3-030-80478-7_8
1 Introduction
Various aspects of the functioning of the human body appear in physiological signals of different nature. Among them, the electrocardiogram (ECG), electroencephalogram (EEG) and photoplethysmogram (PPG) stand out as the most widely recorded types of signal. The main purpose of processing such signals by computer is to isolate the informative component. This goal can be achieved by various methods, ranging from heart rate variability analysis, familiar to a wide range of doctors and researchers [1–5], to original methods of analyzing the depth of anesthesia from the EEG [6]. Thus, biological signals contain a whole range of informative components, which makes it possible to assess, to varying degrees, the functioning of the organism and to analyze the adaptive processes taking place in it. Research in biology and medicine is traditionally associated with the processing of large arrays of non-stationary signals describing the complex dynamics of changes in the state of living organisms. In the theory of functional systems, the works of P.K. Anokhin, which treat the human body as a set of functional systems, have become widely known. From this standpoint, the organism as a whole is a complex multi-level hierarchical network of many functional systems, some of which, by their control activity, maintain the stability of various indicators of the internal environment (homeostasis), while others provide the adaptation of living organisms to the environment. Information about adaptation is contained in minor changes in the properties of dynamic processes. Early detection of deviations in
the dynamics of processes makes it possible to diagnose a person’s condition long before the onset of obvious clinical manifestations of various kinds of deviations. The signals studied are complex and non-stationary, and the lack of processing methods for them is still clearly felt. Consider the cardiovascular system. Contractions of the heart are caused by an action potential, which arises from the activity of a special group of cells (the sinus node) that possess automatism and generate an action potential not initiated by nervous activity. The sinus node produces about 60 potentials per minute on average; however, various external and internal influences can change this number significantly. The higher the degree of adaptation of the organism to the influences under consideration, the smaller the deviation of the heart rate from the personal norm. The adaptation of the human organism manifests itself not only in an increase in the heart rate but also in changes of the cardiac cycle parameters, and in such changes, as a rule, trends can be distinguished [1]. In the language of complex dynamic systems analysis, during adaptation a control system is created and implements a control program. Such a program aims at increasing the level of adaptation of the organism to environmental conditions or to its own internal changes. In a nutshell, adaptation can be represented as a sequence of the following stages [7]:
1. Control goals statement.
2. Control object finding.
3. Model structural synthesis.
4. Model parametric synthesis.
5. Control synthesis.
6. Control implementation.
Structural synthesis of a model involves finding a rule according to which the inputs of the model (controlled and uncontrolled) can be matched to the output value. The parametric synthesis of the model consists in finding the parameters of the model, which can be done both by organizing special experimental actions on the object and by analyzing its normal functioning. The analysis of the normal functioning of the object is also called the identification of the control system, and this is the approach to the parametric synthesis of the model considered in this work. The complexity of representing the dynamics of non-stationary systems can be compensated for by a symbolic approximation of the time series obtained during the analysis. One of the promising technologies for representing time series is the symbolic signal representation SAX (Symbolic Aggregate approXimation) [8, 9]. Moreover, symbolic approximation is a convenient means of representing the dynamics of evolutionary processes in a system. In this paper, we propose to use the symbolic representation of parameterized time series in order to take into account trends of different duration in the signal.
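As an illustration of the symbolic representation mentioned above, the following is a minimal sketch of the standard SAX procedure [8]: z-normalization, piecewise aggregate approximation (PAA), and quantization with equiprobable Gaussian breakpoints. It assumes that the series length is a multiple of the number of segments and that the series is not constant; the function name and the sample series are illustrative, and this is not the parameterized variant proposed in the paper.

```python
from statistics import NormalDist, mean, pstdev

def sax(series, n_segments, alphabet="abc"):
    """Symbolic Aggregate approXimation of a numeric series."""
    mu, sigma = mean(series), pstdev(series)
    z = [(v - mu) / sigma for v in series]
    # PAA: replace each of the n_segments equal chunks by its mean value.
    size = len(z) // n_segments
    paa = [mean(z[i * size:(i + 1) * size]) for i in range(n_segments)]
    # Breakpoints split the standard normal distribution into equiprobable bins.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(i / k) for i in range(1, k)]
    # The symbol index is the number of breakpoints lying below the PAA value.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)

# A made-up series with a low, a middle and a high level.
print(sax([0, 0, 0, 0, 5, 5, 5, 5, 10, 10, 10, 10], 3))  # abc
```

A longer alphabet gives a finer amplitude resolution, while the number of segments controls the time resolution of the resulting word.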
2 Analysis of Non-stationary Time Series
The dynamics of most systems manifests itself as non-stationary time series, which is a significant problem in their analysis. The model of locally transient change characterizes
the time intervals in which the local stationarity of the time series is violated. This may be the result of the presence of rapidly decaying modes in the signal, sudden changes in the structure of the dynamical system, or rapid local changes in modal parameters [10]. The requirements for the spectral analysis method follow from the adopted model of locally transient change in time series. Segments of a series are not represented as quasi-stationary parts of a non-stationary time series, which for the observed series could turn out to be short. The specificity of the locally transitional time series considered in this work is reflected by considering segments of the series in which the damping of the main spectral components keeps a constant sign. Using a spectral analysis method that matches the time series model makes its results physically interpretable. In spectral analysis, the sizes of the series segments can vary significantly, so the time scale of the local analysis of the time series changes. Since it is necessary to analyze locally transitional time series, whose oscillation amplitude decreases or increases in a time window of limited duration, the Prony method is of interest. The Prony decomposition of a time series segment x[1 : N] has the form [11]
$$y[n] = \sum_{k=1}^{M} A_k\, e^{(n-1)(\alpha_k + j 2\pi f_k)\Delta + j\theta_k},$$
where n = 1, 2, ..., N, j^2 = −1, and Δ is the sampling interval. The quantities to be estimated are the amplitudes A_k of the complex exponentials, the damping factors α_k, the harmonic frequencies f_k and the phases θ_k. By choosing the parameters N and p, one can make the results of the spectral estimation of a particular locally transient time series informative [12, 13]. Figure 1 shows a fragment of an ECG record, Fig. 2 shows the dependence of the approximation error on the decomposition depth, and Fig. 3 shows a diagram of the stability of the frequency estimate depending on the signal decomposition depth.
Fig. 1. Fragment of ECG recording
The Prony method is applied taking into account the peculiarities of the selected model of locally transitional time series based on a priori knowledge. That is why the results obtained are informative. The main issue that needs to be resolved in spectral analysis of a time series using the Prony method is the choice of the size N of the time window and the choice of the shift value of this time window. The correct choice of the size of the time window makes it possible to significantly reduce the amount of information in comparison with the original segments of the time series and, as a result,
Fig. 2. Approximation error depending on the depth of decomposition
Fig. 3. Stabilization diagram
to optimally solve the problem of searching for similarity in a locally transitional time series.
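The Prony decomposition discussed in this section can be sketched with the classical three-step least-squares procedure (linear prediction, polynomial rooting, amplitude fit). This is a textbook variant written with NumPy, not necessarily the exact algorithm used by the authors; the function name `prony`, the argument `p` (the number of exponentials) and the test signal are assumptions made for illustration.

```python
import numpy as np

def prony(x, p, dt=1.0):
    """Least-squares Prony fit of p damped complex exponentials to the samples x."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    # Step 1: linear prediction, x[n] = -sum_{m=1..p} a[m] * x[n - m] for n = p..N-1.
    X = np.column_stack([x[p - m:N - m] for m in range(1, p + 1)])
    a = np.linalg.lstsq(X, -x[p:], rcond=None)[0]
    # Step 2: roots of the characteristic polynomial, z_k = exp((alpha_k + j*2*pi*f_k) * dt).
    z = np.roots(np.concatenate(([1.0], a)))
    alpha = np.log(np.abs(z)) / dt
    f = np.angle(z) / (2 * np.pi * dt)
    # Step 3: complex amplitudes h_k = A_k * exp(j*theta_k) from a Vandermonde fit.
    V = np.vander(z, N, increasing=True).T  # V[n, k] = z_k ** n
    h = np.linalg.lstsq(V, x, rcond=None)[0]
    return np.abs(h), alpha, f, np.angle(h)

# Synthetic noise-free test: one damped 5 Hz cosine, i.e. two conjugate exponentials.
dt = 0.01
t = np.arange(100) * dt
x = 2.0 * np.exp(-3.0 * t) * np.cos(2 * np.pi * 5.0 * t + 0.3)
A, alpha, f, theta = prony(x, p=2, dt=dt)
print(np.round(A, 3), np.round(alpha, 3), np.round(f, 3))  # A near 1, alpha near -3, f near +5 and -5
```

On noisy data the estimates depend strongly on the window size N and the order p, which is the selection problem the section describes; the stabilization diagram in Fig. 3 addresses exactly that dependence.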
3 Functional Systems Identification
Disturbances in the operation of the functional systems of the human body, depending on the course of the underlying physical and chemical processes, can manifest themselves both as a change in the structure of short-term signal fragments and as a gradual, long-term change in the nature of the processes [14]. For the simultaneous analysis of slow and fast processes, the paper proposes to use the structural-modal method, based on the parametric spectral analysis of time series and a linguistic approach to describing the dynamics of the state of the system under study. This approach makes it possible to represent signal fragments as chains of symbols from a finite alphabet, taking into account the estimates of local transient processes, and to classify biomedical signals of various durations [15, 16].
At the first stage, the time series is segmented. Static or adaptive segmentation methods can be used for this. Biomedical signals are quasiperiodic; thus, regular pattern search methods can be effective [17, 18]. Next, the characteristic features of the segments are determined in the time and frequency domains. Because a significant part of the informative component is contained in the dynamic characteristics of the time series, the parameters of local transient processes must be estimated as accurately as possible. To achieve this, effective methods of parametric spectral analysis are used. The transition to a linguistic representation is carried out by assigning to each segment of the time series a symbol from an alphabet whose elements are code designations for the nature of the process under study [19]. To construct such an alphabet, algorithms for automatic pattern recognition are used. These algorithms sort the vectors of characteristic features into classes of similar ones. Depending on which feature vectors characterize the areas presented for analysis, different classifications of these areas can be obtained. The resulting sequence is the text T = a_1, ..., a_N, where N is the number of signal segments. Each ordered pair of indices (i, j), i ≤ j, 1 ≤ i, j ≤ N, cuts out from T a segment obtained by erasing from T the symbols with indices less than i or greater than j: c_ij = a_i, a_{i+1}, ..., a_j. Each segment corresponds to an image: an ordered sequence of its symbols, in which indices have been dropped, taking into account the location of this segment in the text T. Each partition D of a text T into segments defines a dictionary M(D). The formation of a dictionary can be understood as highlighting a set of macro events in the development of the process under study [19]. The length of the sequence determines the width of the window within which the system state changes.
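The transition from segment features to a text T and then to a dictionary of words can be sketched as follows. The paper does not specify the pattern recognition algorithm, so the sketch simply assigns each feature vector to the nearest of several given centroids (assumed to come from some earlier clustering step); the features, centroids and function names are all hypothetical.

```python
import math
from collections import Counter

def symbolize(features, centroids, alphabet="abc"):
    """Map each segment's feature vector to the symbol of its nearest centroid."""
    text = []
    for v in features:
        nearest = min(range(len(centroids)), key=lambda k: math.dist(v, centroids[k]))
        text.append(alphabet[nearest])
    return "".join(text)

def word_counts(text, length):
    """Frequencies of all contiguous words c_ij of a fixed length in the text."""
    return Counter(text[i:i + length] for i in range(len(text) - length + 1))

# Hypothetical 2-D features (for example, frequency and damping) of six segments.
features = [(1.0, 0.1), (1.1, 0.1), (5.0, 0.9), (1.0, 0.2), (5.1, 1.0), (9.0, 0.5)]
centroids = [(1.0, 0.1), (5.0, 1.0), (9.0, 0.5)]  # assumed output of a clustering step
T = symbolize(features, centroids)
print(T)  # aababc
print(word_counts(T, 2))
```

The word counts can then serve as a bag-of-words description of a signal, with the word length playing the role of the window width mentioned above.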
The second stage is devoted to assigning the observed state of an object to one of the types of its states. The classification task is to find a mapping
$$\eta : S \to Y, \qquad (1)$$
where S is the set of object states and η is an operator describing the relationship between the state of a system and its image in the space of diagnostic signs. To solve the classification problem (1), each signal is represented as a document consisting of a set of elements obtained from a dictionary. To analyze the performance of the proposed algorithm, we used open data from PhysioNet, the “Research Resource for Complex Physiologic Signals”: electrocardiograms [20] for predicting atrial fibrillation, and heart sounds [21]. Figures 4 and 6 show fragments of the investigated time series of ECG signals and vibrations, respectively. Figures 5 and 7 show the segmentation of the spectrograms of the analyzed signals. Averaging the spectrogram values over frequencies for all time windows is the most common method for extracting features from time series. The main disadvantage of this method is that the properties of the individual local transient processes cannot be assessed separately. In this paper, we compare the efficiency of structural-modal analysis of time series with this approach. The area under the ROC curve, which reflects the sensitivity and specificity of the classification method, is used as the comparison criterion. Figure 8.a compares the averaged-frequency assessment of the heart vibration signal (0.78) with the proposed method (0.85). For signals, symbolic sequences of length
Fig. 4. ECG signal
Fig. 5. ECG signal spectrogram segmentation
Fig. 6. Vibration signal
Fig. 7. Segmentation of the vibration signal spectrogram
2 were used. Increasing the word length did not significantly improve the prediction quality. Figure 8.b shows a comparison of the averaged frequency assessment of the heart vibration signal (0.72) and the proposed method with the symbol sequences of length 2 (0.76) and 3 (0.83). For the ECG signal, an increase in the length of the symbol sequences made a significant contribution to the quality of the classification. This fact can be
explained by the need to analyze long-term processes that manifest themselves in several sequential cardiocycles at once to solve the problem of predicting atrial fibrillation.
Fig. 8. ROC curves
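The comparison criterion used above, the area under the ROC curve, can be computed directly from its rank interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below uses made-up labels and scores, not the results shown in Fig. 8.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # made-up classifier outputs
print(roc_auc(labels, scores))  # 8 of 9 positive-negative pairs are ordered correctly
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.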
4 Conclusion
In this article, the possibility of identifying functional systems using structural-modal analysis of time series is shown. An original method of time series analysis is described, the results of its work are presented, and a comparison with methods solving similar problems is made. Based on the comparison results, it can be concluded that the proposed method has high sensitivity and specificity and can be used for an isolated assessment of the signal components generated by a complex dynamic system. Structural-modal time series analysis has significant scope for modification. This approach yields an interpretable result when analyzing time series of different duration and complexity that describe the evolution of complex dynamical systems. Further development of the approach is possible within the framework of the analysis of multidimensional time series of various physical nature, such as a pulse wave, an electrocardiogram and an electroencephalogram. Obtaining an indicator function based on heterogeneous signals will make it possible to achieve more accurate diagnostic results due to the information contained in various human regulatory mechanisms.
References
1. Baevsky, R.M., Ivanov, G.G., Chireikin, L.V.: Analysis of heart rate variability using various electrocardiographic systems. Bull. Arrhythmol. 24, 65–87 (2001). (in Russian)
2. Queyam, A.B., Pahuja, S.K., Singh, D.: Doppler ultrasound based non-invasive heart rate telemonitoring system for wellbeing assessment. Int. J. Intel. Syst. Appl. 10(12), 69–79 (2018). https://doi.org/10.5815/ijisa.2018.12.07
3. Goshvarpour, A., Goshvarpour, A.: Chaotic behavior of heart rate signals during chi and kundalini meditation. Int. J. Image Graphics Sig. Process. 4(2), 23–29 (2012). https://doi.org/10.5815/ijigsp.2012.02.04
4. Goshvarpour, A., Goshvarpour, A.: Classification of heart rate signals during meditation using lyapunov exponents and entropy. Int. J. Intel. Syst. Appl. 2, 35–41 (2012). https://doi.org/10.5815/ijisa.2012.02.04
5. Ahmad, A.A., Kuta, A.I., Loko, A.Z.: Analysis of abdominal ECG signal for fetal heart rate estimation using adaptive filtering technique. Int. J. Image Graphics Sig. Process. 2, 19–26 (2017). https://doi.org/10.5815/ijigsp.2017.02.03
6. Ye, S.-Y., Choi, S.-Y.: Analysis on the depth of anesthesia by using EEG and ECG signals. Trans. Electr. Electr. Mater. 14(6), 299–303 (2013)
7. Rastrigin, L.A.: Adaptation of complex systems, p. 375 (1981). (in Russian)
8. Lin, J., et al.: A symbolic representation of time series, with implications for streaming algorithms. Workshop on Research Issues in Data Mining and Knowledge Discovery. 3, 2–11 (2003). https://doi.org/10.1145/882082.882086
9. Keogh, E., Lin, J., Fu, A.: HOT SAX: Efficiently finding the most unusual time series subsequence. In: Proc. of the 5th IEEE International Conference on Data Mining (ICDM 2005), pp. 226–233 (2005). https://doi.org/10.1109/ICDM.2005.79
10. Kukharenko, B.G.: Research by the Prony method of dynamics of systems based on time series, 1(2), 176–191 (2009). (in Russian)
11. Mitrofanov, G., Priimenko, V.: Fundamentals and applications of the proni-filtration method, 3, 93–108 (2011). (in Russian)
12. Baldin, A.V., et al.: ECG signal spectral analysis approaches for high-resolution electrocardiography. Adv. Intell. Syst. Comput. 902, 197–209 (2020)
13. Marple, S.L.: Digital spectral analysis and its applications, 584 (1990). (in Russian)
14. Rangaraj, M.: Biomedical Signal Analysis, p. 440 (2010)
15. Sudakov, K.V.: Functional Systems Theory, p. 95 (1996). (in Russian)
16. Anokhin, P.K.: Essays on the Physiology of Functional Systems, p. 446 (1974). (in Russian)
17. Ye, L., Keogh, E.J.: Time Series Shapelets: A New Primitive for Data Mining, pp. 1–17 (2009)
18. Karpenko, A.P., Sotnikov, P.I.: Modified method for classifying multivariate time series using shapelets. Vestnik MGTU im. N.E. Bauman 2, 46–65 (2017)
19. Braverman, E.M., Muchnik, I.B.: Structural Methods for Processing Empirical Data, p. 464 (1983). (in Russian)
20. Clifford, G., Liu, C.: Recent advances in heart sound analysis. Inst. Phys. Eng. Med. 38, 10–25 (2017)
21. Clifford, G., Liu, C.: AF classification from a short single lead ECG recording: the physionet computing in cardiology challenge. PhysioNet 2017(1), 1–4 (2017)
Multichannel Plasma Spectrum Analyzer Based on Prony-Fourier Method S. I. Dosko1 , V. M. Utenkov1 , K. V. Kucherov1 , A. Yu. Spasenov1(B) , and E. V. Yuganov2 1 Bauman Moscow State Technical University, Moscow 105005, Russia 2 VNIIAES, Moscow 109507, Russia
Abstract. A scheme for a digital broadband spectrum analyzer of the electromagnetic radiation signal of a glow discharge plasma is proposed and tested. It is based on a multichannel induction sensor, an ADC and PF (Prony-Fourier) spectral analysis. An original method for analyzing multidimensional time series of complex structure is proposed. The spectrum analyzer can be used for both monitoring and control of the treatment process in automated technological environments, ensuring high reproducibility of results. It is found that changes in the shape, amplitude and frequency of the registered electromagnetic pulses during processing are closely connected with pulsations of the discharge current. Simulation confirms the efficiency of the magnetic induction method for controlling glow discharge plasma parameters. The method can therefore be recommended for use together with the standard instruments of vacuum systems to determine when the system has reached its working mode, to set the moments at which processing steps are completed, and to intervene operationally in the process. Keywords: Plasma · Glow discharge · Automated technological environment · Vacuum systems · Spectrum analyzer
1 Introduction The development of technologies that ensure the formation of the necessary physical and mechanical properties in the surface layer of materials, adapted for use in an automated technological environment, is becoming an increasingly pressing problem of modern machine building. The use of low-energy glow discharge plasma radiation in controlled technological environments, applied in promising methods for processing tool materials to obtain predictable changes in the physical and mechanical properties of their surface layers, can be considered a unique type of hardening treatment [1–4]. It has been shown that low-energy bombardment of various metals and alloys by ions with an energy of 0.5…5 keV in the glow discharge plasma leads to an increase in dislocation density and a rearrangement of dislocation structures. This leads to a significant hardening of materials to a considerable depth with the expenditure of
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 74–81, 2021. https://doi.org/10.1007/978-3-030-80478-7_9
small energies to strengthen the parts. This modification can be explained by considering the nonlinear effects of breaking the translational symmetry of crystal lattices during the bombardment of the surface of solids with low-energy ions. The widespread use of resonant methods in the study of substances in the gaseous, liquid and solid states in recent decades is justified by their universality. The word “resonance” (from Lat. resono, to sound in response) means an increase in the response of an oscillatory system to a periodic external influence when the frequency of that influence approaches one of the natural frequencies of the system. All oscillatory systems are capable of resonating, and they can be of quite different nature. In a substance such systems can be electrons, electron shells of atoms, magnetic and electric moments of atoms, molecules, impurity centers in crystals and individual crystals, as well as their groups. However, in all cases the general picture of the resonance is preserved: near the resonance, the amplitude of vibrations and the energy transmitted to the oscillating system from outside increase. This increase stops when the loss of energy compensates for its gain. Resonant methods are probably among the most sensitive and accurate methods of studying a substance (and, consequently, of acting on it). They provide various types of information about the chemical composition, structure, symmetry, and internal interactions between the structural units of a substance. A substance, depending on its internal structure, has its own characteristic set of natural oscillation frequencies (frequency or energy spectrum). The natural frequencies $f_k$ can lie in a wide range from $10^{2}$ to $10^{22}$ Hz. This set of frequencies is a kind of visiting card of the substance. A widespread and effective type of periodic external influence is electromagnetic radiation.
The frequencies of electromagnetic waves lie in the ranges $10^{2}$…$10^{8}$ Hz (radio waves), $10^{9}$…$10^{11}$ Hz (microwave radio waves), $10^{13}$…$10^{14}$ Hz (infrared light), $10^{15}$ Hz (visible light), $10^{15}$…$10^{16}$ Hz (ultraviolet light), $10^{17}$…$10^{20}$ Hz (x-ray radiation) and $10^{20}$…$10^{22}$ Hz (γ-radiation). Many researchers [5] have shown that energy processes in plasma are broadband, with a frequency spectrum of $10^{4}$…$10^{20}$ Hz, so it is almost impossible to cover it with a single sensitive element. To solve this problem, it is proposed to use a set of sensitive elements whose frequency ranges overlap somewhat across the probable frequency range of the process. Each of the elements can be considered as a linear filter tuned to a specific frequency sub-band. Symbolically this can be represented as follows:

$$\Omega \cong \bigcup_{i=1}^{M} \Omega_i,$$

where $\Omega$ is the frequency spectrum of the process and $\Omega_i$ is a potential fragment of the process spectrum contained in the signal at the output of the i-th sensor element (coil). One of the ways to increase the resolution is the use of parametric methods of digital spectral analysis, which belong to the super-resolution class. According to many authors, MUSIC (Multiple Signal Classification) and ESPRIT (Estimation of Signal Parameters via Rotational Invariance Techniques), as well as their variations [6–11], are the most effective in terms of feasibility and potential resolution. These methods are reported to increase the resolution several-fold, but their practical application is constrained by the increased requirements for the element base of the analyzers and by the high computational load during signal processing. In our opinion, the possibility of using
Prony’s spectral analysis [12], as one of the high-resolution methods for determining the features of the temporal evolution of dynamic processes, is of considerable interest. The novelty of the proposed approach for studying energy processes in plasma is to determine the global characteristics of an object by identifying an indicator function based on multidimensional time series. In this paper, the efficiency of using the proposed technique is presented both for a model signal and for experimental data.
2 Research Methods In general, the model of the oscillatory process according to B. V. Bulgakov [13] can be represented by the following dependence:

$$y(t) = \sum_{r=1}^{\infty} a_r e^{\delta_r t} + \sum_{k=1}^{\infty} \left[ A_k e^{(\delta_k + i\omega_{0k}) t} + A_k^{*} e^{(\delta_k - i\omega_{0k}) t} \right],$$

where $a_r$, $A_k$ are the amplitudes; $f_k = \omega_{0k} / 2\pi$ are the frequencies; $\varphi_k$ are the initial phases; $\delta_k$ are the attenuation coefficients. A computational experiment was performed to test the effectiveness of the spectral analysis process. A model signal representing the sum of cosines and noise was used as a test signal:

$$y(t) = \sum_{k=1}^{8} A_k \cos(2\pi f_k t) + \varepsilon(t). \tag{1}$$
The values of the natural frequencies and their corresponding amplitudes, which are the model parameters, are shown in Table 1.

Table 1. Model parameters

k          1      2      3      4      5      6      7      8
A_k        0.7    1.0    1.5    0.5    0.6    1.0    1.0    1.0
f_k, Hz    10000  12000  15000  20000  25000  50000  75000  100000
Figure 1 shows the oscillatory process synthesized in accordance with dependence (1) and Table 1. In the modeled process, linear band filters are tuned to the following sub-bands: (5000…18000), (18000…40000) and (40000…120000) Hz. The analytical decomposition of the signal over the corresponding sub-bands is shown in Fig. 2. On the left is the time signal, and on the right is its $A^2$ Prony spectrum. On the $A^2$ Prony spectrum, the modes of the model signal are highlighted; the other modes correspond to a random component or are “mathematical” modes, i.e. they are not really present in the signal but appear as a result of signal approximation. Figure 3 shows a diagram of stabilization of the model signal frequency estimates. Figure 4 shows the analytical Fourier spectrum of the total model time signal, the spectral lines of which correspond to the model frequencies with high accuracy.
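The model experiment can be sketched numerically as follows. This is only an illustration under stated assumptions: the sampling rate, record length and noise level are not given in the text and are chosen here for convenience, and an ideal FFT mask stands in for the linear band filters, whose actual design is not described.

```python
import numpy as np

# Model parameters from Table 1.
A = np.array([0.7, 1.0, 1.5, 0.5, 0.6, 1.0, 1.0, 1.0])
f = np.array([10e3, 12e3, 15e3, 20e3, 25e3, 50e3, 75e3, 100e3])  # Hz

# Assumed values (not in the source): sampling rate, length, noise level.
fs = 1_000_000      # Hz, well above twice the highest model frequency
n = 10_000          # samples, so the frequency bin width is fs / n = 100 Hz
sigma = 0.1         # standard deviation of the Gaussian noise eps(t)

rng = np.random.default_rng(0)
t = np.arange(n) / fs
# Eq. (1): sum of cosines plus noise.
y = (A[:, None] * np.cos(2 * np.pi * f[:, None] * t)).sum(axis=0)
y = y + sigma * rng.standard_normal(n)

# One-sided amplitude spectrum of the total signal.
freqs = np.fft.rfftfreq(n, d=1 / fs)
spec = 2 * np.abs(np.fft.rfft(y)) / n
peaks = np.sort(freqs[np.argsort(spec)[-8:]])   # eight strongest lines

def band_component(signal, lo, hi):
    """Keep only the spectral content in [lo, hi) Hz (ideal mask filter)."""
    spectrum = np.fft.rfft(signal)
    mask = (freqs >= lo) & (freqs < hi)
    return np.fft.irfft(np.where(mask, spectrum, 0), n=len(signal))

# Sub-bands quoted in the text.
bands = [(5_000, 18_000), (18_000, 40_000), (40_000, 120_000)]
components = [band_component(y, lo, hi) for lo, hi in bands]
```

With these assumed settings every model frequency falls exactly on an FFT bin, so the eight strongest spectral lines coincide with the frequencies of Table 1.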
Fig. 1. Graphical dependence of the synthesized test signal of the oscillatory process
Fig. 2. Signal decomposition by frequency sub-bands of sensors.
Fig. 3. Diagram of stabilization of frequency estimates of the model process
3 Research Results For processes characterized by a wide and unpredictable frequency spectrum, it is impossible to have a single sensitive element that would track changes in the spectrum. A logical solution is to use a set of sensitive elements (sensors) tuned to certain frequency ranges, i.e. which are actually bandpass filters. In this case, the frequency range of the process under study can be represented as a combination of the possible frequency spectra of the sensors. Each of the sensors works independently, and the result is summed up in the time domain (indicator function). If we assume that the task is to identify changes in the spectrum of the process and the frequency distribution of energy, then the indicator
Fig. 4. Spectrum of the total model signal
function obtained in this way can be used for further spectral analysis. When oscillatory processes are summed into one process, the frequencies of each of them are preserved, and the amplitudes at coinciding frequencies add. The criterion of frequency identity is selected for each process individually. To estimate the frequencies of the process and the distribution of energy in time, it is proposed to use the Prony method, which yields highly sensitive frequency estimates. As an energy indicator of the process, it is proposed to use the squares of the Prony amplitude estimates. To track the process, it is proposed to use a windowed mode, in which the process is divided into segments in accordance with the technological process. In transient modes, estimates of damping factors can also be used. Phase estimates also carry important information about the process and can be used effectively. For experimental research, a pilot version of a wide-band spectrum analyzer for electromagnetic radiation from a glow discharge plasma with time-frequency separation of the signal was developed. The spectrum analyzer (Fig. 5) consists of a multi-channel measurement unit (MU), an ADC and a software module that implements PF spectral analysis [14–16].
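As an illustration of how frequency, damping, amplitude and phase estimates can be obtained, the following is a minimal sketch of the classical least-squares Prony method. It is not the SAProny module used by the authors; the function name `prony`, the model order and the test signal are assumptions made for this example.

```python
import numpy as np

def prony(y, p, dt):
    """Fit p complex exponentials to uniformly sampled real data y.

    Returns frequencies (Hz), damping factors (1/s), amplitudes and phases
    of the individual exponentials (hypothetical helper, classical scheme).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Step 1: linear prediction y[m] = -sum_j c[j] * y[m - j], m = p..n-1.
    H = np.column_stack([y[p - j:n - j] for j in range(1, p + 1)])
    c = np.linalg.lstsq(H, -y[p:], rcond=None)[0]
    # Step 2: roots of z^p + c1 z^(p-1) + ... + cp give the poles z_k.
    z = np.roots(np.concatenate(([1.0], c)))
    # Step 3: complex amplitudes h_k from y[m] = sum_k h_k * z_k**m.
    V = np.vander(z, N=n, increasing=True).T
    h = np.linalg.lstsq(V, y.astype(complex), rcond=None)[0]
    freq = np.angle(z) / (2 * np.pi * dt)
    damping = np.log(np.abs(z)) / dt
    return freq, damping, np.abs(h), np.angle(h)

# Assumed noiseless test signal: one damped and one undamped cosine.
dt = 5e-6
t = np.arange(200) * dt
y = (np.exp(-200 * t) * np.cos(2 * np.pi * 10_000 * t)
     + 0.5 * np.cos(2 * np.pi * 25_000 * t + 0.3))

freq, damping, amp, phase = prony(y, p=4, dt=dt)
pos = freq > 0                       # keep one pole of each conjugate pair
order = np.argsort(freq[pos])
f_est = freq[pos][order]             # estimated frequencies, Hz
a_est = 2 * amp[pos][order]          # amplitudes of the real cosines
energy = a_est ** 2                  # squared amplitudes as energy indicator
```

Each real cosine is represented by a pair of complex-conjugate exponentials, which is why the amplitude of the real mode is twice the modulus of the complex amplitude. With noisy data the model order `p` is usually taken larger than the number of physical modes, and the extra poles correspond to the noise or “mathematical” modes.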
Fig. 5. Block diagram of a broadband pre-defined Prony spectrum analyzer of a glow discharge plasma electromagnetic radiation
The measuring unit is built as a set of 18 induction sensors, which are coils of copper wire evenly spaced along the periphery of a disk made of a dielectric material. During the experiments, the MU was installed at the viewing window of the vacuum chamber. To reduce the influence of external noise on the sensor, a shield in the form of a Faraday cage is installed and grounded to the housing of the vacuum equipment [17].
The MU is considered as a distributed system with multiple inputs (one per coil) and one output. To identify the transfer function from each input to the output, a special procedure was developed and built into the software module. Each MU coil has its own set of natural frequencies and responds (“resonates”) to a certain part of the spectrum of the electromagnetic radiation of the glow discharge plasma, thus acting as a physical band filter. The signals from all the coils of the sensor are then summed and fed to the ADC and on to the Prony spectrum analyzer, which performs the time-frequency separation of the signal. At the output of the Prony spectrum analyzer it is possible to obtain estimates of the signal frequencies, the analytical Fourier spectrum, the analytical energy spectrum of the signal and its analytical time decomposition over the specified frequency ranges. Figure 6 shows the electromagnetic radiation of the glow discharge plasma recorded at the output of the ADC during the treatment process in a vacuum installation, and Fig. 7 shows the analytical Fourier spectrum (spectral density) and the analytical energy spectrum of the signal after processing in the Prony spectrum analyzer.
Fig. 6. Graphic dependence of changes in the intensity of electromagnetic radiation of the glow discharge plasma obtained at the output of the ADC
Fig. 7. Analytical Fourier spectrum of the signal at the output of the Prony spectrum analyzer (a) and its energy spectrum (b)
4 Summary and Conclusion The computational experiment demonstrates that the SAProny software module can in principle be used to analyze broadband signals of plasma electromagnetic radiation, yielding not only the spectra at the sensor output but also a deconvolution of the spectrum of the electromagnetic radiation signal of a glow discharge plasma. Real experiments confirm the efficiency of the magnetic induction method for controlling glow discharge plasma parameters. It is recommended to use the method together with the instruments of the vacuum system to determine when the working mode is reached and when processing stages are completed, and to intervene operationally in the process based on changes in the electromagnetic pulses. The proposed scheme of the spectrum analyzer can be used for both monitoring and control of the treatment process in automated technological environments, providing high reproducibility of results. Further development of the proposed methodology is aimed at analyzing multidimensional time series of various physical natures. For each information channel, its own characteristic function can be determined and included in the general diagnostic model with a preselected weight.
References
1. Wehner, G.K.: Cone formation as a result of whisker growth on ion bombarded metal surface. J. Vac. Sci. Technol. 4, 1821–1835 (1985)
2. Berish, R.: Sputtering of Solids by Ion Bombardment, p. 336 (1984). (in Russian)
3. Begrambekov, L.B.: Erosion and surface transformation by ion bombardment. In: Results of Science and Technology, ser. Charged Particle Beams, (7) 4–57 (1993). (in Russian)
4. Tereshko, I.V., Logvin, V.A., Tereshko, V.M., Redko, V.P., Sheptunov, S.A.: Strengthening of metals and alloys under low-energy ionic action inducing nonlinear processes. In: Fundamental and Applied Problems of Mechanical Engineering, Collection of Proceedings of the VI International Conference “Design and Technological Informatics”, pp. 21–29 (2017). (in Russian)
5. Fortov, V.E.: Encyclopedia of Low Temperature Plasma. Low Temperature Plasma. Basic Concepts, Properties and Patterns, p. 586 (2000). (in Russian)
6. Gupta, P., Verma, V.: Optimization of MUSIC and improved MUSIC algorithm to estimate direction of arrival. Int. J. Image Graphics Sig. Process. 8, 30–37 (2016). https://doi.org/10.5815/ijigsp.2016.12.04
7. Chakkor, S., Baghouri, M., Hajraoui, A.: High resolution identification of wind turbine faults based on optimized ESPRIT algorithm. Int. J. Image Graphics Sig. Process. 7, 32–41 (2015). https://doi.org/10.5815/ijigsp.2015.05.04
8. Gong, B., Xu, Y.-T., Li, J., Liu, Z.-J.: A robust autofocusing approach for estimating directions-of-arrival of wideband signals. Int. J. Wireless Microw. Technol. 2, 28–37 (2012). https://doi.org/10.5815/ijwmt.2012.04.05
9. Lau, C.K.E., Adve, R.S., Sarkar, T.K.: Mutual coupling compensation based on the minimum norm with applications in direction of arrival estimation. IEEE Trans. Antennas Propag. 52, 2034–2041 (2004)
10. Schmidt, R.O., Frank, R.E.: Multiple source DF signal processing: an experimental system. IEEE Trans. Antennas Propag. 34, 276–280 (1986)
11. Pesavento, M., Gershman, A.B., Haardt, M.: Unitary root-MUSIC with a real-valued eigendecomposition: a theoretical and experimental performance study. IEEE Trans. Sig. Process. 48(5), 1306–1314 (2000)
12. Roy, R., Kailath, T.: ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans. Acoust. Speech Sig. Process. 7, 984–995 (1989)
13. Bulgakov, B.V.: Fluctuations, p. 891 (1954). (in Russian)
14. Marple, S.L.: Digital Spectral Analysis and Its Applications, p. 584 (1990). (in Russian)
15. Kirenkov, V.V., Gusarov, S.V., Dosko, S.I., Volkov, N.V.: Method for diagnosing the state of mechanical systems based on modal analysis in the time domain. Bull. MSTU "Stankin" 19, 90 (in Russian)
16. Kukharenko, B.G.: Spectral analysis technology based on fast Prony transformation. Inform. Technol. 4, 38–42 (2011). (in Russian)
17. Dosko, S.I., Logvin, V.A., Sheptunov, S.A., Yuganov, E.V.: Identification of the induction sensor model and deconvolution of the input signal spectrum. Science-Intensive Technol. Mech. Eng. 12, 32–38 (2018). (in Russian)
On Feature Expansion with Finite Normal Mixture Models in Machine Learning Andrey Gorshenin1(B) and Victor Kuzmin2 1 Federal Research Center “Computer Science and Control” of the Russian Academy of
Sciences, Moscow, Russia [email protected] 2 Wi2Geo LLC, Moscow, Russia
Abstract. This paper studies how to improve the quality of medium-term data forecasting with neural networks by introducing statistical models for the observations. The main goal is to study the efficiency of a nontrivial expansion of the feature space based on the characteristics of finite mixture models. Such probabilistic models are successfully used as convenient approximations of plasma turbulence processes. In the paper, the expectation, variance, skewness and kurtosis of the mixture models are introduced as additional features for machine learning algorithms. Medium-term forecasts with and without the additional features are compared, and various neural network architectures are investigated. The proposed methods are tested on unique turbulent plasma ensembles obtained from the L-2M stellarator. It is demonstrated that using these statistical characteristics can increase the accuracy of neural network forecasts in terms of such standard metrics as root-mean-square error and mean absolute error. A hybrid high-performance computing cluster is used to speed up training. Keywords: Statistical models · Finite normal mixtures · Additional features · Neural networks · Medium-term forecasting · Deep learning · High-performance computing
1 Introduction The creation of data analysis algorithms, including those designed for efficient implementation on modern high-performance computing resources in finance [1], meteorology [2] or medicine [3], is often impossible without the development of mathematical models that describe the functioning of complex systems and their evolution in time. One of the powerful research tools is based on results in the modern theory of probability and mathematical statistics, primarily in the field of limit theorems and random-size samples [4]. Machine learning methods and the application of neural networks to studies of turbulent plasma make it possible to achieve significant results both in modeling the observed phenomena [5–8] and in the analysis and prediction of instabilities and potentially destructive effects in stellarators and tokamaks [9, 10].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 82–90, 2021. https://doi.org/10.1007/978-3-030-80478-7_10
A combined approach based on probability mixture models and statistical methods for estimating their parameters, together with machine learning procedures, turns out to be effective. In particular, this approach to the analysis of turbulent plasma data made it possible to reveal physical phenomena using the developed statistical tools [11]. This paper considers an approach to forecasting plasma instabilities in the unique experimental data of the L-2M stellarator [12] using various neural network architectures. To improve the quality of the prediction, it is proposed to expand the feature space with sample moments and with moments of statistical models obtained by the method of moving separation of probability mixtures (MSM) [4]. This paper significantly expands the results obtained by the authors in [13, 14] in the field of medium-term vector forecasts using neural networks. The new results are focused on the development of a fundamentally new probabilistic-statistical approach to describing the evolution of turbulent processes in a magnetoactive high-temperature plasma in combination with modern neural network solutions.
2 Statistical Methodology for Data Analysis Based on Finite Normal Mixtures and Moving Windows The initial vector $V = \{V_1, V_2, \dots, V_L\}$ of observed plasma densities serves as input data for the task. Let us choose a window length $N$, $L \ge N \ge 1$, and divide $V$ into shorter window vectors $X_1, X_2, X_3, \dots$, where $X_i = \{V_i, V_{i+1}, \dots, V_{i+N-1}\}$ is a sequence of $N$ consecutive observations taken from $V$, with elements $X^i_j$, $j = 1, \dots, N$. Note that $X_{i+1}$ differs from $X_i$ in two observations: the first observation of $X_i$ is dropped and one new observation is appended at the end. Window vectors serve as the main part of the input for the neural networks. Once the vector $X_i$ is obtained, a new vector of differences $Y_i$ may be constructed, with

$$Y^i_j = X^i_{j+1} - X^i_j.$$

Applying the same transformation to all window vectors yields a collection of difference window vectors, each of length $N - 1$. The exact choice of the window length is open to debate. A window that is too short leads to a lack of input data for machine learning. A window that is too long leads to a loss of stationarity across the window vector. After the window vectors and difference window vectors are constructed, they can be used to estimate statistical parameters for data enrichment. Additional features can be obtained in several ways. Firstly, the sample moments, namely the expectation, variance, skewness and kurtosis, can be estimated for each window vector:

$$E_i = \frac{1}{N} \sum_{j=1}^{N} X^i_j, \qquad D_i = \frac{1}{N} \sum_{j=1}^{N} \left( X^i_j - E_i \right)^2,$$

$$S_i = E\left( \left( \frac{X^i_j - E_i}{\sqrt{D_i}} \right)^{3} \right), \qquad K_i = E\left( \left( \frac{X^i_j - E_i}{\sqrt{D_i}} \right)^{4} \right).$$
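The window construction and the sample moments above can be sketched as follows. The function names and the short demonstration series are assumptions made for this illustration; the expectations in the skewness and kurtosis formulas are taken as averages over the window.

```python
import numpy as np

def window_vectors(v, n):
    """Rows are the windows X_i = (V_i, ..., V_{i+N-1}) of length n."""
    v = np.asarray(v, dtype=float)
    return np.lib.stride_tricks.sliding_window_view(v, n)

def difference_vectors(x):
    """Rows are Y_i with Y^i_j = X^i_{j+1} - X^i_j (length N - 1)."""
    return np.diff(x, axis=1)

def sample_moments(x):
    """Columns: expectation E_i, variance D_i, skewness S_i, kurtosis K_i."""
    e = x.mean(axis=1)
    d = x.var(axis=1)                      # divisor N, as in the formula
    z = (x - e[:, None]) / np.sqrt(d)[:, None]
    s = (z ** 3).mean(axis=1)
    k = (z ** 4).mean(axis=1)              # non-excess kurtosis
    return np.column_stack([e, d, s, k])

# Hypothetical short series, used only to show the shapes involved.
v = [1, 2, 4, 7, 11, 16]
X = window_vectors(v, 4)        # 3 windows of length 4
Y = difference_vectors(X)       # 3 difference windows of length 3
M = sample_moments(X)           # 3 rows of (E, D, S, K)
```

A window with constant values has zero variance, so its skewness and kurtosis are undefined; real data would need a guard for that case.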
Additionally, an MSM model can be introduced. This model describes the statistical properties of a vector T whose elements have a probability distribution in the form of a finite normal mixture. The density of a finite mixture of probability distributions has the following form:

$$f_\theta(x) = \sum_{i=1}^{k} p_i \psi_i(x; t_i).$$

Here the parameter $k \ge 1$ is a natural number that defines the number of components; $\psi_1, \dots, \psi_k$ are known PDFs; and $p_i \ge 0$, $i = 1, \dots, k$, $\sum_{i=1}^{k} p_i = 1$, are the probabilities (weights) of the corresponding components. As shown in [4], finite normal mixtures have been successfully used for the stochastic analysis of density fluctuations. So, the following kernels are considered in the formula above:

$$\psi_i(x; a_i, \sigma_i) = \frac{1}{\sigma_i \sqrt{2\pi}} \exp\left( -\frac{(x - a_i)^2}{2\sigma_i^2} \right).$$
MSM models are implemented for window vectors X and for difference window vectors Y. Four corresponding moments [15] as well as sample moments can be used as additional features for data enrichment. Before data is sent to the neural network it is normalized: Xi and each of the input moments are scaled separately to range of [0, 1] across all vectors Xi .
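Given the weights, means and standard deviations of a finite normal mixture, its four moments follow from standard closed-form expressions. The sketch below assumes the parameters have already been estimated; the MSM estimation procedure itself is not reproduced here, and the helper names are hypothetical.

```python
import numpy as np

def mixture_moments(p, a, sigma):
    """Expectation, variance, skewness and kurtosis of sum_i p_i N(a_i, sigma_i^2)."""
    p, a, sigma = (np.asarray(u, dtype=float) for u in (p, a, sigma))
    mean = np.sum(p * a)
    d = a - mean                                   # component mean offsets
    var = np.sum(p * (sigma ** 2 + d ** 2))
    m3 = np.sum(p * (d ** 3 + 3 * d * sigma ** 2))
    m4 = np.sum(p * (d ** 4 + 6 * d ** 2 * sigma ** 2 + 3 * sigma ** 4))
    return mean, var, m3 / var ** 1.5, m4 / var ** 2

def min_max_scale(values):
    """Scale one feature to [0, 1] across all window vectors."""
    values = np.asarray(values, dtype=float)
    return (values - values.min()) / (values.max() - values.min())

# Checks on cases with known answers.
single = mixture_moments([1.0], [0.0], [1.0])                    # standard normal
symmetric = mixture_moments([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
scaled = min_max_scale([2.0, 4.0, 6.0])
```

For a single standard normal component the function returns skewness 0 and kurtosis 3, which is a convenient sanity check. Each feature is scaled separately, as described in the text.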
3 Configurations of Neural Networks with Varying Feature Selection To assess the efficiency of statistical data enrichment, four different sets of input parameters with varying feature selection (data enrichment) were constructed for each window vector. As discussed in [16], proper feature selection increases prediction performance and may reduce computation time. The following sections compare the results obtained on the differently enriched sets to verify that statistical data enrichment works.
– The non-enriched set consists of a single window vector $X_i$.
– The simple-enriched set consists of a window vector $X_i$ and the sample moments for the vectors $X_{i-N+1}, \dots, X_i$.
– The model-enriched set consists of a window vector $X_i$ and the statistical moments of the MSM models built on the vectors $X_{i-N+1}, \dots, X_i$.
– The difference-enriched set consists of a window vector $X_i$ and the statistical moments of the MSM models built on the difference window vectors $Y_{i-N+1}, \dots, Y_i$.
It is worth noting that in all cases the calculation of the statistical moments does not require any information that cannot be derived from the window vector $X_i$ and prior window vectors, so these moments can be correctly used as additional features for data enrichment. The neural network output consists of M consecutive predicted observations that serve as the forecast for the observations $V_{i+N}, \dots, V_{i+N+M-1}$. For all sets the forecast length was chosen to be 15% of N (medium-term forecast). Training the neural networks requires tuning certain hyperparameters. In this research all hyperparameters can be divided into two categories: fixed and variable.
Fixed hyperparameters are as follows: each constructed neural network is a deep feedforward network; the ReLU activation function is used for all neurons except for the output layer; RMSE is used as the loss function; the Keras callback for reducing the learning rate on a plateau of the loss function is used; a linear activation function is used for the output layer.
Variable hyperparameters are as follows: optimizers; number of layers and neurons in each layer; dropout rates [17]; enriched/non-enriched data sets. In this paper, four optimizers from the Adam family (Adam, AdaMax, AdaDelta and NAdam [18]) were selected. The dropout rate was chosen among 0, 0.1, 0.25 and 0.5. Several configurations of network topology were chosen:
– one hidden layer with 50/100/150/200 neurons;
– two hidden layers with 100/200 neurons in each layer;
– three hidden layers with 50/100/200 neurons in each layer.
Using grid search for hyperparameter optimization as discussed in [19], a neural network was created for every possible architecture under these conditions, totaling about 570 neural networks. Learning stopped after 500 epochs or if the loss function had not decreased for 35 epochs.
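The size of the search grid can be checked by direct enumeration. The sketch below only lists the combinations of the variable hyperparameters named in the text; the dictionary keys and string labels are assumptions for illustration, and no networks are built or trained here.

```python
from itertools import product

optimizers = ["Adam", "AdaMax", "AdaDelta", "NAdam"]
dropout_rates = [0.0, 0.1, 0.25, 0.5]
# Hidden-layer layouts: tuple length = number of layers, entries = neurons.
topologies = (
    [(n,) for n in (50, 100, 150, 200)]      # one hidden layer
    + [(n, n) for n in (100, 200)]           # two hidden layers
    + [(n, n, n) for n in (50, 100, 200)]    # three hidden layers
)
feature_sets = [
    "non-enriched",
    "simple-enriched",
    "model-enriched",
    "difference-enriched",
]

grid = [
    {"optimizer": o, "dropout": d, "hidden_layers": h, "features": s}
    for o, d, h, s in product(optimizers, dropout_rates, topologies, feature_sets)
]
n_configs = len(grid)
```

Under these assumptions the full grid contains 4 · 4 · 9 · 4 = 576 configurations, which agrees with the figure of about 570 networks quoted above.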
4 Data Enrichment Results for Turbulent Plasma Data Six different time series of observed plasma densities, each consisting of approximately 60000 observations, were processed. A window length N = 200 and a prediction length M = 30 were used. The TensorFlow and Keras Python libraries were used to train and evaluate the neural networks. The initial data series were divided into training and test datasets in a 70%/30% proportion. The calculations were performed using a hybrid high-performance computing cluster (IBM Power 9, 1 TB RAM, 2xNVIDIA Tesla V100 (16 GB) with NVLink). The best choices of hyperparameters for enriched and non-enriched data were very similar for all six sequences. As in [13, 14], neural networks with one wide hidden layer made more accurate predictions than deep neural networks with more hidden layers. The best optimizer varies among the observed datasets; precision and training speed are usually slightly better with the Adam optimizer. No strong overfitting was observed in any of the constructed neural networks. A non-zero dropout rate affected the loss function and the training speed negatively; the best results were achieved without dropout. At the same time, the choice between enriched and non-enriched data greatly affected the accuracy and the loss function value. In most cases model training finished before the 500-epoch cutoff. Enriched data sets took 20–30% longer to train, and there were no major differences in training time between the enriched data sets. In Figs. 1–4 the initial data (red line) and the forecasts for 1 step (blue line) and 30 steps (dashed green line) are shown. For each data series a random subinterval was chosen for a better demonstration. Graphs for the non-enriched (top-left), simple-enriched (top-right), model-enriched (bottom-left) and difference-enriched (bottom-right) sets are shown.
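The layout of the training pairs for N = 200 and M = 30 can be sketched as follows. The function name, the synthetic series and the chronological form of the 70%/30% split are assumptions for this illustration; the text does not state how the split was performed.

```python
import numpy as np

def make_pairs(v, n, m):
    """Inputs (V_i..V_{i+N-1}) and targets (V_{i+N}..V_{i+N+M-1}) as rows."""
    v = np.asarray(v, dtype=float)
    blocks = np.lib.stride_tricks.sliding_window_view(v, n + m)
    return blocks[:, :n], blocks[:, n:]

N, M = 200, 30                      # M is 15% of N (medium-term forecast)

# Synthetic stand-in for one observed series (hypothetical data).
rng = np.random.default_rng(1)
series = np.cumsum(rng.standard_normal(2_000))

inputs, targets = make_pairs(series, N, M)

# Assumed chronological split: first 70% of pairs for training.
split = int(0.7 * len(inputs))
x_train, y_train = inputs[:split], targets[:split]
x_test, y_test = inputs[split:], targets[split:]
```

A series of length L yields L − N − M + 1 input/target pairs. Additional features such as the moments of the window would be appended as extra columns of `inputs` before training.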
For all sets the 1-step and 30-step forecasts follow the major data trends. Enriched sets produce forecasts that adapt better to quick shifts in the data, for example in series A19692 on the time intervals 52.1–52.15 and 54.4–54.41. In many cases (A19692, time interval 52.2–52.3) model moments and incremental model moments offer a better prediction of peak values than non-enriched data and sample moments. Figure 5 presents the difference in the loss function for three sequences achieved by using differently enriched sets with the same hyperparameters. For all series a significant decrease in the loss function value of the trained neural networks is achieved by using model-enriched data. Ensembles not shown in the graphs behave similarly to A20229 and A20264.
Fig. 1. Example of forecast, series A19692 (52.00–52.40 ms).
Fig. 2. Example of forecast, series A19692 (54.30–54.55 ms).
The chart in Fig. 6 illustrates the relative improvement of the loss function value between differently enriched datasets. The first bar shows the relative difference between the non-enriched and difference-enriched datasets:

$$\mathrm{Value}_{\text{non-enriched}} = \left( \frac{\mathrm{RMSE}_{\text{non-enriched}}}{\mathrm{RMSE}_{\text{model}}} - 1 \right) \cdot 100.$$
Fig. 3. Example of forecast, series A20229 (54.20–54.40 ms).
Fig. 4. Example of forecast, series A20264 (55.20–55.60 ms).
Fig. 5. Accuracy of neural network forecasts of initial data (1) and with expanded training sets: simple- (2), model- (3) and difference-enriched sets (4)
Fig. 6. An increase in the accuracy of forecasts achieved with difference-enriched set relative to configurations for non-enriched (1), simple-enriched (2) and model-enriched (3) sets.
The second bar shows the relative difference for the simple-enriched set, and the third bar for the model-enriched set. The value is calculated as shown above, with RMSEnon-enriched replaced by RMSEsimple-enriched and RMSEmodel-enriched, respectively. Such enrichment is based on the calculation of statistical moments of already processed window vectors and can be performed relatively fast. At the same time, enrichment can decrease loss values by up to 81% for these series. The second and third bars display the difference between the use of sample moments and the use of MSM model moments. It can be seen that MSM models can give an improvement ranging from 4.51% (time series A20264) to 30.31% (time series A20229) compared to sample moments, and from 3.86% (A20264) to 18.91% (A20229) compared to model moments for initial data. Thus, the model-enriched and difference-enriched datasets can significantly improve the training accuracy for all considered ensembles of physical data.
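The bar values of Fig. 6 follow a single formula, which can be written as a small helper. The function name and the RMSE numbers in the example are hypothetical and are not taken from the experiments.

```python
def relative_improvement(rmse_baseline, rmse_reference):
    """Percentage by which the baseline RMSE exceeds the reference RMSE."""
    return (rmse_baseline / rmse_reference - 1.0) * 100.0

# Hypothetical RMSE values, for illustration only.
example = relative_improvement(1.81, 1.00)    # baseline 81% worse
no_gain = relative_improvement(0.50, 0.50)    # identical errors
```

A positive value means that the reference configuration (here, the difference-enriched set) has the lower error; zero means no difference.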
5 Results and Discussions The paper presents an approach to the problem of prediction of plasma observations using machine learning. The effectiveness of finite normal mixture approximations for the increments of data to expand the feature space is demonstrated. Prediction of the experimental physical time series is of significant interest in the following problems of turbulent plasma research:
– verification of the observation data obtained in the plasma experiments with uniform initial conditions;
– signal recovery in case of malfunction of the recording equipment;
– analysis of the profiles of the current density of entrainment and absorption of electron-cyclotron heating, in particular, for the International Thermonuclear Experimental Reactor (ITER) project [20, 21].
In addition, the proposed approach can be used to solve the problem of mixture model verification for the initial time series (data). A significant increase in the accuracy of
data prediction when using the characteristics of some approximating mixture model can be an indicator of the correctness of model and parameter choice as a mathematical description of the phenomena under consideration. The suggested approaches and software solutions will be integrated into the research support system [22, 23] being developed by the authors. In addition, the methodology to improve the forecasting accuracy with the parameters of probability mixture models can be used for data of a different nature, for example, for air-sea turbulent heat fluxes [24]. These tasks can be considered as a direction for further research.

Acknowledgments. The research is partially supported by the Russian Foundation for Basic Research (project 18-29-03100). The research was carried out using infrastructure of shared research facilities CKP «Informatics» of FRC CSC RAS (http://www.frccsc.ru/ckp).
References
1. Nayak, S.C.: Development and performance evaluation of adaptive hybrid higher order neural networks for exchange rate prediction. Int. J. Intel. Syst. Appl. 9(8), 71–85 (2017). https://doi.org/10.5815/ijisa.2017.08.08
2. Mishra, N., Soni, H.K., Sharma, S., Upadhyay, A.K.: Development and analysis of Artificial Neural Network models for rainfall prediction by using time-series data. Int. J. Intel. Syst. Appl. 10(1), 16–23 (2018). https://doi.org/10.5815/ijisa.2018.01.03
3. Akkar, H.A.R., Jasim, F.B.A.: Intelligent training algorithm for artificial neural network EEG classifications. Int. J. Intel. Syst. Appl. 10(5), 33–41 (2018). https://doi.org/10.5815/ijisa.2018.05.04
4. Korolev, V.Yu.: Probabilistic and Statistical Methods of Decomposition of Volatility of Chaotic Processes. Moscow: Moscow University Publishing House. 512 p. (2011)
5. Meneghini, O., Luna, C.J., Smith, S.P., Lao, L.L.: Modeling of transport phenomena in tokamak plasmas with neural networks. Phys. Plasmas 21(6), 060702 (2014). https://doi.org/10.1063/1.4885343
6. Raja, M.A.Z., Shah, F.H., Tariq, M., Ahmad, I., Ahmad, S.: Design of artificial neural network models optimized with sequential quadratic programming to study the dynamics of nonlinear Troesch’s problem arising in plasma physics. Neural Comput. Appl. 29(6), 83–109 (2016). https://doi.org/10.1007/s00521-016-2530-2
7. Mesbah, A., Graves, D.B.: Machine learning for modeling, diagnostics, and control of nonequilibrium plasmas. J. Phys. D: Appl. Phys. 52(30), 30LT02 (2019). https://doi.org/10.1088/1361-6463/ab1f3f
8. Narita, E., Honda, M., Nakata, M., Yoshida, M., Hayashi, N., Takenaga, H.: Neural-network-based semi-empirical turbulent particle transport modelling founded on gyrokinetic analyses of JT-60U plasmas. Nucl. Fusion 59(10), 106018 (2019). https://doi.org/10.1088/1741-4326/ab2f43
9. Parsons, M.S.: Interpretation of machine-learning-based disruption models for plasma control. Plasma Phys. Control. Fusion 59(8), 085001 (2017). https://doi.org/10.1088/1361-6587/aa72a3
10. Kates-Harbeck, J., Svyatkovskiy, A., Tang, W.: Predicting disruptive instabilities in controlled fusion plasmas through deep learning. Nature 568(7753), 526–531 (2019). https://doi.org/10.1038/s41586-019-1116-4
11. Batanov, G.M., Borzosekov, V.D., Gorshenin, A.K., Kharchev, N.K., Korolev, V.Y., Sarksyan, K.A.: Evolution of statistical properties of microturbulence during transient process under electron cyclotron resonance heating of the L-2M stellarator plasma. Plasma Phys. Control. Fusion 61(7), 075006 (2019). https://doi.org/10.1088/1361-6587/ab1117
12. Batanov, G.M., et al.: Reaction of turbulence at the edge and in the center of the plasma column to pulsed impurity injection caused by the sputtering of the wall coating in L-2M stellarator. Plasma Phys. Rep. 43(8), 818–823 (2017). https://doi.org/10.1134/S1063780X17080049
13. Gorshenin, A., Kuzmin, V.: A Machine Learning Approach to the Vector Prediction of Moments of Finite Normal Mixtures. In: Hu, Z., Petoukhov, S., He, M. (eds.) CSDEIS 2019. AISC, vol. 1127, pp. 307–314. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39216-1_27
14. Gorshenin, A.K., Kuzmin, V.: Analysis of configurations of LSTM networks for medium-term vector forecasting. Informatika i ee Primeneniya. 14(1), 10–16 (2020). https://doi.org/10.14357/19922264200102
15. Gorshenin, A.K.: Concept of online service for stochastic modeling of real processes. Informatika i ee Primeneniya. 10(1), 72–81 (2016). https://doi.org/10.14357/19922264160107
16. Chandrashekar, G., Sahin, F.: A survey on feature selection methods. Comput. Electr. Eng. 40(1), 16–28 (2014). https://doi.org/10.1016/j.compeleceng.2013.11.024
17. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15, 1929–1958 (2014)
18. Buduma, N.: Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms. O’Reilly Media, Sebastopol, CA (2017)
19. Gorshenin, A.K., Kuzmin, V.: Improved architecture of feedforward neural networks to increase accuracy of predictions for moments of finite normal mixtures. Pattern Recognit. Image Anal. 29(1), 68–77 (2019). https://doi.org/10.1134/S1054661819010115
20. Aymar, R., Barabaschi, P., Shimomura, Y.: The ITER design. Plasma Phys. Control. Fusion 44(5), 519–565 (2002). https://doi.org/10.1088/0741-3335/44/5/304
21. Kates-Harbeck, J., Svyatkovskiy, A., Tang, W.: Predicting disruptive instabilities in controlled fusion plasmas through deep learning. Nature 568(7753), 526–531 (2019). https://doi.org/10.1038/s41586-019-1116-4
22. Gorshenin, A., Kuzmin, V.: Online system for the construction of structural models of information flows. In: Proceedings of the 7th International Congress on Ultra Modern Telecommunications and Control Systems and Workshops (ICUMT). Piscataway, NJ, USA: IEEE, 2015. pp. 216–219 (2015). https://doi.org/10.1109/ICUMT.2015.7382430
23. Gorshenin, A., Kuzmin, V.: On an interface of the online system for a stochastic analysis of the varied information flows. In: AIP Conference Proceedings. 1738. Art. No. 220009 (2016). https://doi.org/10.1063/1.4952008
24. Small, R.J., Bryan, F.O., Bishop, S.P., Tomas, R.A.: Air-sea turbulent heat fluxes in climate models and observational analyses: what drives their variability? J. Clim. 32(8), 2397–2421 (2019). https://doi.org/10.1175/JCLI-D-18-0576.1
25. Song, X.: The importance of relative wind speed in estimating air–sea turbulent heat fluxes in bulk formulas: examples in the Bohai Sea. J. Atmos. Oceanic Technol. 37(4), 589–603 (2020). https://doi.org/10.1175/JTECH-D-19-0091.1
Methodology for the Classification of Human Locomotion’s and Postures for the Control System of a Bionic Prosthesis

I. A. Meshchikhin and S. S. Gavriushin

Bauman University, ul. Baumanskaya 2-ya, 5, Moscow 105005, Russia
Abstract. The current level of development of microelectronics, drives, and control systems opens up new opportunities in applications such as prosthetics. The ability to control the moment of resistance in a knee prosthesis based on the prosthesis telemetry makes it possible to implement control that adapts to the patient and to external conditions. Control systems for bionic prostheses are complex technical systems that include sensing systems, drives, and control boards. Effective control is hierarchical: it includes the basic level of finite automata, which ensures stability and basic control in locomotion and postures, as well as an adaptive control loop and a parameter-tuning system for high-level correction of the prosthesis model. The control system of the prosthesis should include a system for classifying locomotion and postures. The rules by which the classification is made must be stated explicitly so that experts can assess them. The article presents a method for classifying prosthesis telemetry data based on a combination of elements of fuzzy set theory and kernel estimation.

Keywords: Analysis of human walk · Clustering · Kernel estimate
1 Introduction

Human gait is a complex dynamic process. Analysis of locomotion and postures is necessary both in the construction of control systems for prostheses and in the monitoring of human activity. Monitoring of human activity is an actively developing field that allows management, marketing, and other tasks to be solved at a new technological level. Among specific applications, prosthetics and the diagnosis of a person's medical and psychological state are worth highlighting. The gait portrait is closely related to human behavior; the results of its analysis, together with other biomedical signals, are also in demand for marketing research. This article focuses on building a system for identifying postures and locomotion for the knee prosthesis control system [1].
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 91–100, 2021. https://doi.org/10.1007/978-3-030-80478-7_11
A natural description of gait, and of locomotion in general, is the alternation of the dynamics of a direct and an inverted two-link pendulum. The state of the two-link pendulum is described by the phase indicator (support or swing) and two angles: at the hip joint and at the knee joint. The controlled quantity of the knee prosthesis is the moment at the knee joint, whose value is determined by the values of the angles, the history and forecast of their evolution, and the force-moment factors [2]. It is convenient to analyze the system in kinematic coordinates (the knee and hip angles) and force coordinates (the axial force and the moment in the sagittal plane). A detailed list of postures, locomotion, and scenarios of transitions between them is given in chapter d4 (Mobility) of the ICF. Figure 1 shows the main postures and locomotion [3] in the kinematic portrait.

[Fig. 1 labels: Sitting pose (d4153); Running, changing from running to walking (d4552); Kneeling pose (d4152); Gait (d450); Squatting pose (d4151); Slow walking (d450); Lying pose (d4150); Standing pose (d4154, d4104, d4105)]
Fig. 1. Kinematic portrait. The X axis is the angle at the hip joint. Y axis - the angle at the hip joint. The codes of the poses correspond to the ICF
Poses are points in the angle coordinates, and locomotion types are trajectories that can be approximated by a sequence of point states. In addition to describing postures and states (the ability to achieve and perform each of which is a function), the ICF [4] provides a system for assessing the quality with which they are achieved. The quality criterion of a prosthesis is the maximum compensation for the lost functions described in the ICF. In terms of the kinematic portrait, this means the proximity of the patient’s trajectories to the reference ones. When developing a control system, it is necessary to determine the quality with which analogs implement these functions and to determine the priority postures and locomotion, so that the vector of quality indicators for different postures and locomotion can be convolved into a scalarized prosthesis quality criterion. The expert community can participate in assessing the priority of postures (the priority of d450 over d4554 is obvious)
and the qualitative identification of control scenarios, for example, for a change of position (d410). As the information-measuring system of the prosthesis is implemented, experimental kinematic and force portraits of postures, locomotion, and the transitions between them will become available in addition to the calculated data, which will make it possible to convert qualitative expert assessments of control scenarios into control algorithms.
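The scalarization mentioned above can be illustrated with a minimal sketch. It assumes the simplest possible convolution, a priority-weighted mean; the text does not say which convolution the authors use. The function name `scalar_quality`, the quality values, and the weights are all hypothetical.

```python
from typing import Dict


def scalar_quality(indicators: Dict[str, float], priorities: Dict[str, float]) -> float:
    """Priority-weighted mean of per-function quality indicators.

    Assumption: a weighted mean is only one possible scalarization;
    the source does not specify the convolution that is actually used.
    """
    total_weight = sum(priorities[code] for code in indicators)
    if total_weight <= 0:
        raise ValueError("priorities must sum to a positive value")
    return sum(priorities[code] * q for code, q in indicators.items()) / total_weight


# Hypothetical numbers: quality in [0, 1] per ICF code, expert priority weights.
quality = {"d450": 0.9, "d4554": 0.5}
weights = {"d450": 3.0, "d4554": 1.0}
overall = scalar_quality(quality, weights)
```

With these illustrative weights, d450 contributes three times as much to the criterion as d4554, which mirrors the priority ordering given in the text.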
2 Article Structure

This paper describes a cascade of gait analysis functions, ranked by complexity and depth of analysis. It begins with a basic analysis of telemetry data. After the data are divided into steps, a method for constructing a point cloud in the parameter space is presented. The clustering problem is then solved using methods of nonparametric statistics. The results of applying the proposed method are presented in the section “Discussion of the results”.
3 Literature Review

There are analogues of the proposed methodology, in particular, for solving marketing problems [5]. Since the classification result is used by the control system, the cost of errors of the first and second kind is high: the patient runs the risk of injury if the prosthesis does not work correctly. This is a limiting factor in the development of bionic prostheses: rehabilitation specialists choose simpler solutions with predictable behavior. For this reason, there is an explicit ban on neural networks and other techniques that do not lead to explicit decision-making rules that experts can adjust. Another feature of applying classification results in the control loop is the speed requirement: a delay of no more than 0.1 s. With a characteristic step duration of about 1 s, the classification should be carried out on the basis of instantaneous telemetry values. It is therefore not possible to use classification based on Fourier [6] or log-space [7] representations. On the other hand, the dynamics of locomotion is a rather complex and variable process, and explicit convolution of telemetry data with reference solutions [8] does not give adequate results. A significant part of the literature is devoted to pattern recognition from a video stream using machine vision methods [9]. Machine vision technologies are limited by the dimensions of the image capture area and do not allow human behavior to be studied in long-term monitoring. These results also cannot be applied to classification in the prosthesis control system.
4 The Basic Data Analysis

The basic data analysis consists in presenting the data in the coordinates of the angle at the knee joint (KJ) and the angle at the hip joint (HJ). These coordinates were chosen as the most representative kinematic parameters that can be measured on the knee unit of the prosthesis. Among other parameters, the angles in the frontal and horizontal planes are worth highlighting; their analysis is important for identifying compensatory movements [10]. The analysis of force factors (forces and moments) is fully possible only for patients with amputation, which complicates the formation of a reference base for the norm.

Selection of poses (fixed points) and locomotion

Since walking is a periodic process, it is convenient to choose a step as the unit of analysis. Extracting the parameters of each step makes it possible to go from the analysis of functions of time (continuous dynamics) to the analysis of a set of points (dynamics with discrete time). Figure 2 shows the characteristic trajectories of postural changes. The data were measured using the developed mobile measuring devices based on a MEMS [10] board.
Fig. 2. Telemetry data
5 Analytics Over Step

For each step, there are three characteristic points: the minimum and the maximum of the hip joint angle and the maximum of the knee joint angle. The step cycle is thus encoded by three points in the coordinates of the hip and knee joint angles (Fig. 3).
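The encoding of a step by three characteristic points can be sketched as follows. This is a minimal illustration and makes two assumptions: the telemetry has already been split into steps, and each characteristic point is the (HJ angle, KJ angle) pair sampled at the moment of the corresponding extremum. The function name and the sample values are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (hip joint angle, knee joint angle)


def characteristic_points(hip: List[float], knee: List[float]) -> Tuple[Point, Point, Point]:
    """Return the (HJ, KJ) pairs at the HJ minimum, HJ maximum and KJ maximum of one step."""
    if len(hip) != len(knee) or not hip:
        raise ValueError("hip and knee must be non-empty and of equal length")
    indices = range(len(hip))
    i_hip_min = min(indices, key=hip.__getitem__)
    i_hip_max = max(indices, key=hip.__getitem__)
    i_knee_max = max(indices, key=knee.__getitem__)
    return (
        (hip[i_hip_min], knee[i_hip_min]),
        (hip[i_hip_max], knee[i_hip_max]),
        (hip[i_knee_max], knee[i_knee_max]),
    )


# Hypothetical samples of one step, in degrees.
hip_step = [10.0, 25.0, 5.0, -10.0, 0.0]
knee_step = [5.0, 15.0, 60.0, 20.0, 10.0]
hip_min_pt, hip_max_pt, knee_max_pt = characteristic_points(hip_step, knee_step)
```

Collecting these triples over all steps gives three point clouds, one per characteristic point.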
Fig. 3. Characteristic portrait of a person’s gait without amputation and a system of characteristic points
6 Clustering

There are many different partitions of human locomotion and postures into types. They differ in their criteria and goals. In general, the number of types is determined by the number of extrema of the distribution density. Since the observed values are the angles at the knee and hip joints, the number of movement types is determined by the number of extrema of the estimated distribution density in the corresponding coordinates. For each of the three point clouds, a nonparametric estimate of the probability density is constructed in the form [11]:

$$p(x, y) \sim \sum_i e^{-\frac{(x - x_i)^2 + (y - y_i)^2}{d}}, \tag{1}$$

where $(x_i, y_i)$ are the coordinates of the $i$-th point of the cloud and $d$ is the smoothness parameter. The number of groups for each characteristic point can be defined as the number of local extrema, and group membership as belonging to the corresponding basin. The estimate of the distribution of the HJ maximum is shown in Fig. 4. The geometry of the basins and their attractors is determined by gradient descent initiated from each data point. The number of groups depends on the smoothness parameter $d$: as the smoothness tends to infinity, the number of groups tends to 1 (a unimodal distribution), and for infinitely small smoothness, the number of groups equals the number of points in the cloud. Each value of smoothness corresponds to its own partition, with its own number of groups and number of elements in each. For each partition, the entropy of its statistics can then be estimated as

$$H(d) = \sum_{i=1}^{\#n} P_i \ln P_i, \tag{2}$$

where $P_i = n_i / N$, $\#n$ is the number of groups, $n_i$ is the number of steps in the $i$-th group, and $N$ is the total number of steps.
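The procedure above can be sketched in a few lines. This is a hedged illustration and not the authors' implementation. Each point climbs the kernel estimate of Eq. (1) with a mean-shift update, a standard fixed-point form of gradient ascent for Gaussian kernels that is swapped in here for a plain gradient step. Points that reach the same mode share a group. The `merge_radius` heuristic and all function names are assumptions. The entropy is computed with the conventional minus sign so that it is non-negative; Eq. (2) as printed has no minus sign.

```python
import math
from collections import Counter
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def kernel_weights(p: Point, cloud: Sequence[Point], d: float) -> List[float]:
    """Kernel terms of Eq. (1) for every cloud point, evaluated at p."""
    x, y = p
    return [math.exp(-((x - xi) ** 2 + (y - yi) ** 2) / d) for xi, yi in cloud]


def density(p: Point, cloud: Sequence[Point], d: float) -> float:
    """Unnormalised density estimate of Eq. (1)."""
    return sum(kernel_weights(p, cloud, d))


def climb(p: Point, cloud: Sequence[Point], d: float,
          tol: float = 1e-7, max_iter: int = 1000) -> Point:
    """Move uphill on the density from p until the mean-shift step is below tol."""
    x, y = p
    for _ in range(max_iter):
        w = kernel_weights((x, y), cloud, d)
        total = sum(w)
        if total == 0.0:  # all kernel terms underflowed; stay where we are
            return x, y
        nx = sum(wi * xi for wi, (xi, _) in zip(w, cloud)) / total
        ny = sum(wi * yi for wi, (_, yi) in zip(w, cloud)) / total
        if math.hypot(nx - x, ny - y) < tol:
            return nx, ny
        x, y = nx, ny
    return x, y


def basins(cloud: Sequence[Point], d: float,
           merge_radius: Optional[float] = None) -> List[int]:
    """Label each point with the index of the density mode it converges to."""
    if merge_radius is None:
        merge_radius = 0.1 * math.sqrt(d)  # assumed heuristic for merging end points
    modes: List[Point] = []
    labels: List[int] = []
    for p in cloud:
        m = climb(p, cloud, d)
        for k, (mx, my) in enumerate(modes):
            if math.hypot(m[0] - mx, m[1] - my) < merge_radius:
                labels.append(k)
                break
        else:
            modes.append(m)
            labels.append(len(modes) - 1)
    return labels


def partition_entropy(labels: Sequence[int]) -> float:
    """Entropy of the group sizes, with P_i = n_i / N (conventional sign)."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


# Hypothetical cloud with two well-separated groups of steps.
cloud = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (10.0, 10.0), (10.5, 10.0), (10.0, 10.5)]
labels = basins(cloud, d=2.0)
```

Recomputing `partition_entropy(basins(cloud, d))` over a range of `d` values gives an entropy curve of the kind discussed for Fig. 5.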
Fig. 4. Assessment of distribution, group centers and their basins
Fig. 5. Dependence of the partition entropy on the smoothness parameter. X axis: smoothness; Y axis: entropy
Figure 5 shows the entropy as a function of the smoothness parameter. For d ∈ (50, 80), the entropy is constant, which indicates that the number of groups and the distribution of their sizes do not change. The independence of the entropy from the partition parameter can be taken as a criterion for a correct partition. Figure 6 shows the distribution estimates for the three characteristic points and the coordinates of the local extrema of the distributions. The coordinates are indices proportional to the angles at the HJ and KJ. Figure 7 shows the point clouds and their centers in the coordinates of the HJ and KJ angles. The described approach allows the point clouds to be broken up into an atlas of unimodal distributions. Each unimodal region has its own basin, center, and boundary. A basin is defined as a region in the telemetry data space (angles) such that gradient descent from every point of the region converges to the same point (the basin center). The boundary of a basin can be described in several ways:
Fig. 6. Distribution of characteristic points
Fig. 7. Clouds of points and their centers
– medial representation [12];
– boundary representation [13];
– as a set of points equidistant from nodes evenly distributed over the cloud [14].

The type of an observed movement can be defined by the membership of its characteristic points in a unique combination of three basins: those of the KJ maximum, the HJ maximum, and the HJ minimum. The proposed data analysis technique makes it possible both to solve the problem of data clustering and to identify the type of locomotion from telemetry data. The analysis of a person's gait in free (unconstrained) mode, with recordings lasting from 30 min to 7 h, allowed us to distinguish the following types of movements and postures:
– Walking;
– Stair descent;
– Climbing the stairs;
– Transition processes;
– Specific movements (running, etc.);
– Poses (sitting, standing posture).

Movements such as slow walking and walking on uneven or inclined surfaces belong to the periphery of the walking basins and do not form separate types with the given set of measuring instruments. These results make it possible to create a rational set of control profiles for the prosthetic knee [15]. The results of clustering are shown for the time responses in Fig. 8 and in the coordinates of the HJ and KJ angles in Fig. 9.
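Identifying a movement type from the combination of three basins can be sketched as an explicit lookup table, which keeps the decision rules open to expert review. The sketch makes one simplification: a point is assigned to the basin with the nearest center, whereas the text defines membership through gradient descent on the density estimate. The basin centers, the rule table, and the function names are hypothetical.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, float]  # (hip joint angle, knee joint angle)


def nearest_basin(p: Point, centers: Sequence[Point]) -> int:
    """Index of the closest basin center (a simplification of basin membership)."""
    return min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))


def movement_type(step_points: Sequence[Point],
                  centers_per_cloud: Sequence[Sequence[Point]],
                  rules: Dict[Tuple[int, ...], str]) -> str:
    """Map the triple of basin labels (HJ min, HJ max, KJ max) to a movement type."""
    key = tuple(nearest_basin(p, c) for p, c in zip(step_points, centers_per_cloud))
    return rules.get(key, "unclassified")


# Hypothetical basin centers for the three clouds of characteristic points.
centers = [
    [(-10.0, 5.0), (20.0, 80.0)],  # HJ minimum cloud
    [(25.0, 10.0), (60.0, 85.0)],  # HJ maximum cloud
    [(5.0, 60.0), (40.0, 95.0)],   # KJ maximum cloud
]
# Hypothetical explicit rule table that an expert could inspect and edit.
rules = {(0, 0, 0): "walking", (1, 1, 1): "sitting"}

step = [(-9.0, 6.0), (24.0, 12.0), (6.0, 58.0)]
label = movement_type(step, centers, rules)
```

Combinations that are absent from the table fall back to "unclassified", which is where transition processes would appear in this sketch.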
Fig. 8. Result of clustering
Fig. 9. The result of clustering in the angle coordinates
7 Conclusions

The presentation of data in implicit coordinates of angles proposed in the article makes it possible to compare different types of locomotion effectively. Encoding a trajectory with a set of characteristic points allows one to proceed to the description of the system as a dynamical system with discrete time, and the parameters of this representation (characteristic angles) are easy to interpret. Selecting the characteristic points of the trajectory as global extrema over the step makes it possible to identify and compare them unambiguously. Analysis of the clouds of characteristic points allows us to describe confidence intervals for the norm. The construction of a nonparametric estimate of the data density allows us to estimate the number of extrema of the probability density and, as a consequence, the number of groups (types) of movements. Each type of movement has its own control. The presented approach can be easily generalized to a wide class of periodic non-sinusoidal biomedical signals, which will make it possible to solve problems of state classification and pathology identification [16]. The main novelty of the proposed technique can be expressed in several theses:
1. To classify locomotion, sensory information must be represented by a set of parameters that change only slightly over time. This work proposes selecting the maximum and minimum over the step period as such parameters. For non-periodic patterns, the envelope values are used. This approach is developed further in the Hilbert-Huang transform (HHT) [17].
2. When sensory information is represented as a cloud of points, the work proposes using nonparametric statistics to determine the number of classes as the number of extrema of the distribution density.
3. Using the gradient descent method for each point of the parameter space, one can estimate the geometry of the classes as basins of attraction to the corresponding peak of density.
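Theses 2 and 3 presuppose a choice of the smoothness parameter. The entropy-plateau criterion described in the clustering section could be checked along the following lines. This is a sketch under assumptions: the entropy has already been computed on a grid of smoothness values, the grid is roughly uniform (the plateau length is measured in samples), and the function name, the tolerance, and the numbers are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def longest_plateau(ds: Sequence[float], hs: Sequence[float],
                    tol: float = 1e-9) -> Optional[Tuple[float, float]]:
    """Return (d_start, d_end) of the longest run of samples with constant entropy.

    A run needs at least two samples; None is returned if no such run exists.
    """
    best: Optional[Tuple[float, float]] = None
    best_len = 1
    start = 0
    for i in range(1, len(ds) + 1):
        # Close the current run at the end of the data or when the entropy changes.
        if i == len(ds) or abs(hs[i] - hs[start]) > tol:
            run = i - start
            if run > best_len:
                best_len = run
                best = (ds[start], ds[i - 1])
            start = i
    return best


# Hypothetical entropy curve sampled on a uniform grid of smoothness values.
d_grid = [10, 20, 30, 40, 50, 60, 70, 80, 90]
entropy = [1.5, 1.5, 1.2, 1.1, 0.69, 0.69, 0.69, 0.69, 0.3]
plateau = longest_plateau(d_grid, entropy)
```

Any smoothness value inside the returned interval yields the same partition statistics, which is the sense in which the partition is taken to be correct.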
References
1. Polishchuk, M., Opashnianskyi, M., Suyazov, N.: Walking mobile robot of arbitrary orientation. Int. J. Eng. Manufact. 8(3), 1 (2018)
2. Meshchikhin, I.A., Gavriushin, S.S.: The application of elements of information theory to the problem of rational choice of measuring instruments. In: Hu, Z., Petoukhov, Sergey V., He, M. (eds.) AIMEE2018 2018. AISC, vol. 902, pp. 705–712. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-12082-5_64
3. Liu, G., et al.: A system for analyzing and indexing human-motion databases. In: Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data, pp. 924–926. ACM (2005)
4. World Health Organization et al.: International classification of functioning, disability and health. World Health Organization, ICF, Geneva (2001)
5. Asif, S., Javed, A., Irfan, M.: Human identification on the basis of gaits using time efficient feature extraction and temporal median background subtraction. Int. J. Image Graph. Signal Process. 6(3), 35 (2014)
6. Ghosh, M., Bhattacharjee, D.: Gait recognition for human identification using Fourier descriptor and anatomical landmarks. Int. J. Image Graph. Signal Process. 2, 30–38 (2015)
7. Goshvarpour, A., Goshvarpour, A.: Nonlinear analysis of human gait signals. Int. J. Inf. Eng. Electron. Bus. 4(2), 15 (2012)
8. Stepanyan, I.V., et al.: Neural network modeling and correlation analysis of brain plasticity mechanisms in stroke patients. Int. J. Intell. Syst. Appl. 11(6), 28 (2019)
9. Manjunatha Guru, V.G., et al.: An efficient gait recognition approach for human identification using energy blocks. Int. J. Image Graph. Signal Process. 9(7), 45–54 (2017)
10. Worton, B.J.: Kernel methods for estimating the utilization distribution in the home range studies. Ecology 70(1), 164–168 (1989)
11. Kirtley, C.: Clinical Gait Analysis: Theory and Practice. Elsevier Health Sciences, New York (2006)
12. Sun, Y., Nelson, B.J.: MEMS capacitive force sensors for cellular and flight biomechanics. Biomed. Mater. 2(1), S16 (2007)
13. Yushkevich, P.A., Zhang, H., Gee, J.C.: Continuous medial representation for anatomical structures. IEEE Trans. Med. Imaging 25(12), 1547–1564 (2006)
14. Stroud, I.: Boundary Representation Modelling Techniques. Springer Science & Business Media, London (2006). http://doi.org/10.1007/978-1-84628-616-2
15. Jain, A.K.: Data clustering: 50 years beyond K-means. Pattern Recogn. Lett. 31(8), 651–666 (2010)
16. Wen, Y., et al.: Automatic customization of prostheses for individuals using adaptive dynamic programming (2019)
17. Fengcaia, C., Hongxiab, P.: The fault diagnosis research of gearbox based on Hilbert-Huang Transform. Int. J. Educ. Manage. Eng. 2(4), 71 (2012)
18. Aworinde, H.O., Onifade, O.F.W.: A soft computing model of soft biometric traits for gender and ethnicity classification. Int. J. Eng. Manufact. 9(2), 54 (2019)
An Approach to Social Media User Search Automation

Anastasia A. Korepanova1,2, Valerii D. Oliseenko1,2, and Maxim V. Abramov1,2

1 Laboratory of Theoretical and Interdisciplinary Problems of Informatics, St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th Linia, VI, № 39, St. Petersburg 199178, Russia
{aak,vdo,mva}@dscs.pro
2 Mathematics and Mechanics Faculty, St. Petersburg State University, Universitetskaya Emb., 7-9, St. Petersburg 199034, Russia
Abstract. The article presents an approach to searching for a user’s account on VK.com using data from that user’s account on Odnoklassniki, which follows research on user identification across different social media. This issue is important because, to the authors’ knowledge, there are no studies on user search automation for popular Russian social media. The proposed approach is based on the assumption that lists of friends on social media reflect a user’s social environment and intersect across different accounts. Based on this intersection, it is possible to find the accounts of a user on different social media and to recover some hidden social ties of closed accounts. This approach can be applied to constructing and analyzing the social graph using information from different social media, and in studies on data security in social media. The accuracy of the proposed method is 0.753 on the test set of data.

Keywords: Information security · User protection · Social media · Social engineering attacks · Social graph · User identification
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 101–110, 2021. https://doi.org/10.1007/978-3-030-80478-7_12

1 Introduction

1.1 Prerequisites for Research

Protecting information systems from data leakage is a topical issue nowadays; every year, experts in the field of information security note an increase in cybercrime aimed at information systems [23]. Many cybercrimes are committed using social engineering [19], because the weakest element of an information system is often its user [24]. The increasing threat of social engineering attacks is noted by experts: according to the report of Retruster, the number of phishing attacks alone increased by 65% in 2019 [21]. Social engineering attacks in this paper are considered as a set of applied psychological and analytical techniques that cybercriminals use to covertly motivate users of a public or corporate network to violate established rules and policies in the
field of information security. Social engineering attacks target the vulnerabilities of a system’s users, which depend not on the technical side of the system but on a user’s personal characteristics: psychological, cultural, etc. Together, these characteristics largely determine the user’s response to an attack and therefore make it possible to estimate the attack’s success. The set of pairs “vulnerability – severity of the vulnerability” is considered as a user’s vulnerability profile [1]. These vulnerability profiles are used to study and simulate social engineering attacks on an information system (in particular, multistep attacks [12]). Data on a user’s personal characteristics for constructing a vulnerability profile can come from experts, from personal information, or from an analysis of the user’s personal page on social media [11]. At the current stage of research, we focus on constructing the user vulnerability profile from information extracted from content published on social media. It is typical for Internet users to have several accounts on different social media: for example, according to Mediascope, the Russian audiences of VK.com and Instagram overlap by 21.6 million users, and those of VK.com and Facebook by 19.7 million [15]. Users can publish different content on different social media, so information collected from several sources supports more confident conclusions about the user’s personal characteristics and, indirectly, his or her vulnerabilities. For this, however, we need to know which accounts belong to the same user, and such accounts are not always linked directly. The task of identifying the accounts that belong to one user on other social media is therefore relevant. Automating this search makes it possible to quickly find the accounts of many people, which can benefit, for example, the companies in which they work.
The task of automatically searching for a user’s account on various social media is relevant in the framework of research on information security [8]. Odnoklassniki and VK.com were chosen for this study because they are the most popular social media in Russia, and according to some estimates their audience overlap may exceed 20 million [15]. The theoretical significance of this work lies in the development of methods for finding users on VK.com from known Odnoklassniki accounts, taking into account the specifics of these social media. To the authors’ knowledge, there are no studies on user search automation for popular Russian social media. Existing account search methods were developed for specific social media, so whether they can be adapted to the required social networks is an open question. The practical significance lies in the possible application of the developed methods and algorithms to automating the assessment of user protection from social engineering attacks. Solving the task of finding the accounts of one user on different social media can also benefit areas such as targeted advertising, credit scoring, and other applications of social media analysis. This work consists of the following sections: introduction, related work, problem statement, methods, data collection with algorithm evaluation, discussion, and conclusion.

1.2 Related Work

The task of searching for accounts of one user on different social media is closely related to the task of comparing different accounts and identifying those belonging to one user. Therefore, many articles dealing with identifying accounts belonging to one user also address the search problem. For convenience, we decompose this task into two
and treat them as separate ones: searching for accounts belonging to one user and comparing accounts with each other to assess the likelihood that they belong to one user [9, 17, 18, 20, 22]. The article [7] provides a brief overview of methods for identifying accounts belonging to one user and for searching for accounts. It considers several basic approaches to determining the accounts owned by a user: approaches based on the social circle [16], on comparison of profile attributes [2, 5, 10], and on deriving meta-information from user-generated content (geodata, links to accounts on other social media, etc.) [6]. Systems that combine user profiles from different social media into one are also being created [4]. These works focus on social media such as Facebook, Instagram, Twitter, Flickr, and others that are popular mostly in English-speaking communities. Because of the specifics of the social media considered in the present paper (data structures, information availability, the Russian language), some methods and approaches cannot be applied directly. The relevance of this study is thus emphasized by the fact that it analyzes social media that are popular in Russia; to the authors’ knowledge, there are no such studies. This study is carried out as part of a general project dedicated to improving the security of information systems by automating the assessment of user vulnerabilities. Data for these assessments can be extracted from various sources, including social media accounts [1, 3]. Such studies are relevant in the field of cybersecurity [8]. The present study builds on the works [1, 12], which presented a model for constructing a social graph of an organization’s employees and proposed methods for analyzing the spread of a social engineering attack.
Information for constructing this graph is also extracted from social media; however, issues of comprehensive analysis and data extraction from different social media were not considered. The methods for matching accounts in various social media used in this paper are described in [14].
2 Problem Statement

This article aims to automate the search for a user on VK.com based on information from this user's account on Odnoklassniki. In other words, a link to the user's account on Odnoklassniki is given as input, and a link to the user's account on VK.com must be produced as output. This task can arise in various contexts, for example, in the development of recommendation systems. In this article, solving the problem is dictated by the need to build a user vulnerability profile. A user's vulnerability profile is a set of pairs, each consisting of a vulnerability to a certain type of social engineering attack and the severity of this vulnerability. The vulnerability profile is closely related to the user's personal characteristics. Information obtained from experts, personal information, or information obtained from the analysis of the user's personal page in social media can serve as a source of data on these characteristics [11]. In this article, we consider social media as a source of information about the personal characteristics of a user. It is also known that one person often has accounts on several social media, and users can publish different content in different social media. Therefore, information collected from several sources allows for more confident conclusions about the personal characteristics of the user and, indirectly, about the user's vulnerabilities.
A. A. Korepanova et al.
It should be noted that in some cases, users publish a direct link to their second account on another social media. This is a trivial case that is not the subject of this article. The material is aimed at developing an algorithm for searching and identifying accounts in various social media through machine learning methods, data analysis, fuzzy comparison metrics, etc.
3 Methods

3.1 An Approach to Accounts Matching

This work follows our previous work on user account identification. In this research we use the user identification method that we developed earlier [14]. A brief description follows. Determining whether two accounts belong to one user is treated as the task of classifying pairs of accounts with classes from the set {–1, +1}, where +1 denotes that a pair of accounts belongs to one user and –1 denotes that a pair of accounts belongs to different users. To match accounts, we use profile data and data about the social circle, that is, sets of attribute values derived from the accounts. The attributes are the following: name, surname, city, age, and friend list of the account. User accounts are matched by matching their attribute values, with pre-processing and partial inference of missing information. If the values of the attributes city and age in the profile are hidden or not filled in, their estimated values are inferred by analyzing the social circle of the account, namely the list of friends and relatives. The following attribute matching methods are used:

• Last name and city are matched using the Jaro–Winkler string similarity measure [25]:

$$d_w = \begin{cases} d_j, & \text{if } d_j < b_t, \\ d_j + l\,p\,(1 - d_j), & \text{otherwise,} \end{cases}$$

where $d_j$ denotes the Jaro measure for two strings $s_1$ and $s_2$, $l$ is the length of their common prefix, up to a maximum of 4 characters, $p$ is a constant scaling factor used to adjust the estimate upward when a common prefix is present, and $b_t$ is a cut-off threshold. $p$ must not exceed 0.25, because otherwise the measure may become greater than 1. The result of applying this metric is a number in the interval [0, 1].
• The name is matched through a dictionary of names that maps different forms of the same name to each other. If one or both of the names are not in the dictionary, the Jaro–Winkler metric is used for the matching.
• When matching a pair of accounts in one or both of which the age has been inferred, the two ages are considered to coincide if the absolute value of their difference is less than or equal to three. If both ages are known, they are compared by exact matching. The result of matching ages is a number from the set {0; 1}.
• Friend lists are matched by surname and name using the Jaro–Winkler measure. Two friend accounts are presumed to belong to one friend if the Jaro–Winkler measure of their first names and surnames is not less than 0.8 and 0.9, respectively. To evaluate the intersection of friend lists, the Szymkiewicz–Simpson metric is used, which gives a result in the interval [0; 1]:

$$K_{SZ} = \frac{c}{\min(a, b)},$$

where $a$ denotes the number of the first account's friends, $b$ denotes the number of the second account's friends, and $c$ denotes the number of matching friends. This measure was chosen among the most popular binary similarity coefficients, since it showed the greatest effectiveness in tests for this task [16].

Thus, as a result of the attribute matching of two accounts, we get a feature vector, based on which a pre-trained logistic regression model decides whether the accounts belong to one user. The identification method, its accuracy and the motivation for choosing these attribute matching methods are described in more detail in [14].

3.2 Account Searching Method

To describe the algorithms for finding a user's account on VK.com from information in his or her account on Odnoklassniki, let us introduce the following notation. Let $U^{OK}$ be a known user account on Odnoklassniki, $U^{VK}$ be the unknown user account on VK.com, $F^{OK} = \{Friend^{OK}_i\}_{i=1}^{n}$ be the set of $U^{OK}$'s friends on Odnoklassniki, $I = \emptyset$ be an initially empty set for the intersection of $U^{OK}$'s and $U^{VK}$'s friend lists on VK.com, that is, the VK.com accounts of friends from Odnoklassniki, and $F^{VK} = \emptyset$ be an initially empty set of $U^{VK}$'s friends on VK.com.

The solution of this problem with the algorithm below is based on the assumption that friend lists on social media reflect a user's real social environment and intersect across different accounts. Thus, the friend list of a user's account in one social media intersects with the friend lists of his or her accounts on other social media. Based on this hypothesis, we use a user's friends as one of the means of searching for this user in another social media. This method can be used when searching for an account through personal data (name and surname, city of residence, etc.) is ineffective because users can hide this data or leave the profile unfilled.
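As an aside to Sect. 3.1, the attribute-matching measures described there (Jaro–Winkler similarity with a cut-off threshold, age matching, and the overlap coefficient for friend lists) can be sketched in Python. This is a minimal illustration and not the authors' implementation: the default values `p=0.1` and `b_t=0.7` are commonly used settings assumed here, not values reported in the paper, and all function names are hypothetical.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity d_j of two strings, a number in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and not too far apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3.0


def jaro_winkler(s1: str, s2: str, p: float = 0.1, b_t: float = 0.7) -> float:
    """d_w = d_j if d_j < b_t, else d_j + l * p * (1 - d_j).

    p and b_t are assumed defaults; p must not exceed 0.25.
    """
    d_j = jaro(s1, s2)
    if d_j < b_t:
        return d_j
    prefix = 0  # l: common prefix length, capped at 4 characters
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return d_j + prefix * p * (1.0 - d_j)


def ages_match(age1: int, age2: int, inferred: bool) -> int:
    """Return 1 or 0; inferred ages may differ by up to three years."""
    if inferred:
        return int(abs(age1 - age2) <= 3)
    return int(age1 == age2)


def overlap_coefficient(a: int, b: int, c: int) -> float:
    """Szymkiewicz-Simpson coefficient K_SZ = c / min(a, b)."""
    smaller = min(a, b)
    return c / smaller if smaller > 0 else 0.0
```

For the first row of Table 1 (673 and 243 friends, 40 of them matching), `overlap_coefficient(673, 243, 40)` evaluates 40 / 243, which is about 0.165.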
The friend-based search can also be useful when matching accounts to determine whether they belong to one person. If one account has closed its friend list, it is impossible to match the friend lists directly, but searching for the friends of one account in another social media can help determine whether the social circles of the two accounts intersect.

Searching for friends of the account on VK.com. The algorithm is shown as a block diagram in Fig. 1. For each $Friend^{OK}_i$:

1. We search for his/her account on VK.com by his/her name, $find(Friend^{OK}_i)$, and get the set of found accounts, FoundUsers.
Fig. 1. Friends search algorithm
2. For each of the found accounts, we use the classification algorithm to check whether it belongs to the same user as $Friend^{OK}_i$.
3. If the set of found accounts is not empty, then, among the accounts that the classifier classified as belonging to the same user as $Friend^{OK}_i$, we use the logistic regression estimates to choose the one with the highest score for belonging to class +1, $best(FoundUsers)$. We add this account to the intersection of $U^{OK}$'s and $U^{VK}$'s friend lists: $I = add(I, best(FoundUsers))$.

Assume that, as a result, the set $I$ is not empty. The accounts included in it were selected as potential accounts of $U^{VK}$'s friends on VK.com. Based on this information, we search for $U^{VK}$ on VK.com.

Searching for the user account on VK.com. For each $Friend^{VK}_j$ from $I$, we get his/her friend list $FofFriend^{VK}_j$. If the intersection of these friend lists is not empty, we use the account matching algorithm to select from this intersection the account with the highest probability of belonging to the same user as the account $U^{OK}$. If the classifier classifies the pair consisting of this account and $U^{OK}$ as accounts belonging to different users, $U^{VK}$ is considered not found. Otherwise, the chosen account is presumed to be $U^{VK}$.
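The two stages above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' code: `search_by_name`, `same_user_score` and `get_vk_friends` are hypothetical callables standing in for the VK.com name search, the logistic regression score for class +1, and the friend-list download; the 0.5 decision threshold is also an assumption. The second stage takes the strict intersection of all friend lists, which is one reading of the description.

```python
from typing import Callable, Hashable, Iterable, List, Optional, Set

Account = Hashable  # any account identifier, e.g. a profile link


def find_friends_on_vk(
    friends_ok: Iterable[Account],
    search_by_name: Callable[[Account], Iterable[Account]],
    same_user_score: Callable[[Account, Account], float],
    threshold: float = 0.5,
) -> List[Account]:
    """Stage 1: build the set I of VK.com accounts of Odnoklassniki friends."""
    intersection: List[Account] = []
    for friend_ok in friends_ok:
        found_users = search_by_name(friend_ok)  # FoundUsers
        scored = [(same_user_score(friend_ok, c), c) for c in found_users]
        # Keep only candidates the classifier assigns to class +1.
        accepted = [(s, c) for s, c in scored if s >= threshold]
        if accepted:
            best = max(accepted, key=lambda sc: sc[0])[1]
            if best not in intersection:
                intersection.append(best)
    return intersection


def find_user_on_vk(
    user_ok: Account,
    intersection: Iterable[Account],
    get_vk_friends: Callable[[Account], Iterable[Account]],
    same_user_score: Callable[[Account, Account], float],
    threshold: float = 0.5,
) -> Optional[Account]:
    """Stage 2: look for U^VK among the common friends of the accounts in I."""
    candidates: Optional[Set[Account]] = None
    for friend_vk in intersection:
        friends = set(get_vk_friends(friend_vk))  # FofFriend_j
        candidates = friends if candidates is None else candidates & friends
    if not candidates:
        return None  # I was empty or the friend lists do not intersect
    best = max(candidates, key=lambda c: same_user_score(user_ok, c))
    # If the classifier rejects even the best candidate, U^VK is not found.
    return best if same_user_score(user_ok, best) >= threshold else None
```

Passing the data sources as callables keeps the sketch independent of any particular social media API, so the same logic can be exercised on toy data.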
4 Empirical Evaluation

In this section, we discuss the steps of the experiment, including dataset crawling, the data description and the performance evaluation metrics.

4.1 Data Collection

The collected dataset contains pairs of open Odnoklassniki and VK.com accounts. According to a priori knowledge for some pairs and expert estimation for the others, both accounts in each pair belong to one user, and different pairs belong to different users. There are two reasons for choosing these networks: first, they are the most popular in Russia according to some sources [15]; second, they give users wide opportunities for sharing personal data by filling in profile data, publishing content or joining communities. Analysis of this personal information can greatly benefit the construction of a user vulnerability profile.

• VK.com is a popular online social media and social networking service. According to its own statistics, it has 97 million monthly active users.
• Odnoklassniki has 71 million monthly users, according to its own statistics.

The dataset was crawled automatically by iterating over the friends of accounts that were known a priori to belong to one user. Profiles with similar profile information were collected and then manually reviewed by experts. The dataset consists of 300 pairs of open Odnoklassniki and VK.com accounts; based on the experts' estimates, both accounts in each pair belong to one user, and different pairs belong to different users. Friends were collected for each account in the dataset, and the number of friends and the size of their intersection were calculated. The intersection was calculated using the account identification method. An example of the data is presented in Table 1. The Id Ok and Id Vk columns show the ids of the social media users, that is, the corresponding links to their profiles. The Vk Friends and Ok Friends columns indicate the number of friends of the Id Vk and Id Ok users, respectively. The Intersection column shows the number of friends that the user has on both social networks.

Table 1. Data example

Id Ok | Id Vk | Vk Friends | Ok Friends | Intersection
OkId1 | VkId1 | 673 | 243 | 40
OkId2 | VkId2 | 866 | 394 | 103
OkId3 | VkId3 | 180 | 11 | 3
… | … | … | … | …
4.2 Algorithm Evaluation

For each account from Odnoklassniki, friends were found on VK.com in accordance with the presented algorithm (Fig. 1).
To evaluate the accuracy of the friends search algorithm, the following metric was computed for each pair of accounts:

$$Accuracy = \frac{T}{N},$$

where $T$ is the number of accounts found and $N$ is the total number of friends' accounts in the intersection. This metric was averaged over all pairs in the dataset. The resulting accuracy of the search for friends from the intersection in both social media on this sample is 0.726. Using the found friends, user accounts on VK.com were identified. The accuracy of the search for VK.com accounts of Odnoklassniki users was calculated using the same Accuracy metric, where $T$ was the number of accounts found and $N$ was equal to 300. The accuracy of the method was 0.753.
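The metric above is a simple ratio, and a short sketch may make the two ways it is used more concrete: averaged per pair for the friends search, and computed once over all pairs for the user search. The function names are hypothetical, and skipping pairs with an empty intersection (N = 0) is an assumption of this sketch that the text does not specify.

```python
from typing import Iterable, Tuple


def accuracy(found: int, total: int) -> float:
    """Accuracy = T / N for a single evaluation."""
    if total <= 0:
        raise ValueError("N must be positive")
    return found / total


def mean_accuracy(pairs: Iterable[Tuple[int, int]]) -> float:
    """Average T / N over (T, N) tuples, one tuple per pair of accounts.

    Pairs with N == 0 are skipped; this is an assumption of the sketch.
    """
    scores = [accuracy(t, n) for t, n in pairs if n > 0]
    if not scores:
        raise ValueError("no pair has a non-empty intersection")
    return sum(scores) / len(scores)
```

For the user search, an accuracy of 0.753 with N = 300 would correspond to roughly 226 found accounts (0.753 · 300 ≈ 225.9); this count is an inference from the reported figures, not a number stated in the text.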
5 Discussion

The accuracy of the search on the test dataset is 0.753. The experimental results therefore suggest that users of VK.com and Odnoklassniki tend to befriend the same people from their different accounts. Thus, the method of searching for a user's account in other social media using his or her friend list can be useful. In future research, we plan to increase the accuracy of the user search and to detect more sophisticated and indirect links among profiles by using not only friend information but also the extended social graph and other information, such as subscriptions, personal photos, online statistics, etc.
6 Conclusion

This article proposed an algorithm for searching for a user on VK.com using data from this user's account on Odnoklassniki. These two social media were chosen as the most popular in Russia, according to [20]. To the authors' knowledge, there are no studies on user search automation for popular Russian social media, and thus there is no established method for it. Existing account search methods are developed for specific social media, so whether they can be adapted to the required social networks is an open question. The proposed algorithm is based on the assumption that friend lists on social media reflect a user's real social environment and intersect across different accounts. Thus, the friend list of a user's account in one social media intersects with the friend lists of his/her accounts on other social media. The accuracy of the proposed algorithm was evaluated on a dataset consisting of 300 pairs of accounts; in each pair both accounts belonged to one user, and different pairs belonged to different users. This study can be useful for various social media analysis tasks, for example, for matching two accounts, one of which is closed, to determine whether they belong to one person. If one account has closed its friend list, it is impossible to match the friend lists directly, but searching for the friends of one account in another social media can help determine whether the social circles of the two accounts intersect. The research also finds application in constructing vulnerability profiles of information system users and indirectly benefits the estimation of the success of the spread of multistep social engineering attacks. Searching for one user in different social media can also benefit areas such as targeted advertising, credit scoring and other applications of social media analysis. In future research, we plan to increase
the accuracy of the user’s search to detect more sophisticated and indirect links among profiles. Acknowledgements. The research was carried out in the framework of the project on state assignment SPIIRAS № 0073-2019-0003, with the financial support of the RFBR (projects №18-01-00626, №20-07-00839).
References

1. Abramov, M.V., Tulupyeva, T.V., Tulupyev, A.L.: Social Engineering Attacks: Social Networks and User Security Estimates. SUAI, St. Petersburg (2018)
2. Agarwal, A., Toshniwal, D.: SmPFT: social media based profile fusion technique for data enrichment. Comput. Netw. 158, 123–131 (2019). https://doi.org/10.1016/j.comnet.2019.04.015
3. Azarov, A.A., Tulupyeva, T.V., Tulupyev, A.L.: A prototype of a set of programs for analyzing the security of information systems personnel, based on a fragment of the user's vulnerability profile. SPIIRAS Proc. 2, 21–40 (2012)
4. Aminu, E.F., Oyelade, O.N., Shehu, I.S.: Rule based communication protocol between social networks using Semantic Web Rule Language (SWRL). Int. J. Mod. Educ. Comput. Sci. (IJMECS) 8(2), 22–29 (2016). https://doi.org/10.5815/ijmecs.2016.02.03
5. Esfandyari, A., Zignani, M., Gaito, S., Rossi, G.P.: User identification across online social networks in practice: pitfalls and solutions. J. Inf. Sci. 44(3), 377–391 (2018). https://doi.org/10.1177/0165551516673480
6. Goga, O., Lei, H., Parthasarathi, S.H.K., Friedland, G., Sommer, R., Teixeira, R.: Exploiting innocuous activity for correlating users across sites. In: WWW 2013 - Proceedings of the 22nd International Conference on World Wide Web, pp. 447–457 (2013). ISBN: 978-145032035-1
7. Hazimeh, H., Mugellini, E., Khaled, O.A., Cudré-Mauroux, P.: Socialmatching++: a novel approach for interlinking user profiles on social networks. In: CEUR Workshop Proceedings, P. 1927 (2017)
8. Henriksen-Bulmer, J., Jeary, S.: Re-identification attacks - a systematic literature review. Int. J. Inf. Manag. Part B 36(6), 1184–1192 (2016). https://doi.org/10.1016/j.ijinfomgt.2016.08.002
9. Humadde, H.S., Abdul-Hassan, A.K., Mahdi, B.S.: Proposed user identification algorithm across social network using hybrid techniques. In: SCCS 2019 - 2019 2nd Scientific Conference of Computer Sciences, article no. 8852606, pp. 158–161 (2019)
10. Jain, P., Kumaraguru, P., Joshi, A.: Other times, other values: leveraging attribute history to link user profiles across online social networks. Soc. Netw. Anal. Min. 6(1), Article no. 85 (2016). https://doi.org/10.1007/s13278-016-0391-4
11. Kharitonov, N.A., Maximov, A.G., Tulupyev, A.L.: Algebraic Bayesian networks: Naïve frequentist approach to local machine learning based on imperfect information from social media and expert estimates. In: Kuznetsov, S.O., Panov, A.I. (eds.) RCAI 2019. CCIS, vol. 1093, pp. 234–244. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30763-9_20
12. Khlobystova, A., Abramov, M., Tulupyev, A.: An approach to estimating of criticality of social engineering attacks traces. In: Dolinina, O., Brovko, A., Pechenkin, V., Lvov, A., Zhmud, V., Kreinovich, V. (eds.) ICIT 2019. SSDC, vol. 199, pp. 446–456. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-12072-6_36
13. Korepanova, A.A., Oliseenko, V.D., Abramov, M.V.: Applicability of similarity coefficients in social circle matching. In: 2020 XXIII International Conference on Soft Computing and Measurements (SCM), St. Petersburg, Russia, pp. 41–43 (2020). https://doi.org/10.1109/scm50615.2020.9198782
14. Korepanova, A.A., Oliseenko, V.D., Abramov, M.V., Tulupyev, A.L.: Application of machine learning methods in the task of identifying user accounts in two social networks. Comput. Tools Educ. 3, 29–43 (2019)
15. Lede statistics of the Russian-speaking Internet audience in (2019). https://lede.pro/social_network_statistics. Accessed 08 Apr 2020 (in Russian)
16. Li, Y., Su, Z., Yang, J., Gao, C.: Exploiting similarities of user friendship networks across social networks for user identification. Inf. Sci. 506, 78–98 (2020). https://doi.org/10.1016/j.ins.2019.08.022
17. Nurgaliev, I., Qu, Q., Bamakan, S.M.H., Muzammal, M.: Matching user identities across social networks with limited profile data. Front. Comput. Sci. 14(6) (2020)
18. Olivero, M.A., Bertolino, A., Domínguez-Mayo, F.J., Escalona, M.J., Matteucci, I.: Digital persona portrayal: identifying pluridentity vulnerabilities in digital life. J. Inf. Secur. Appl. 52 (2020)
19. Ptsecurity: Actual cyberthreats (2019) results. https://www.ptsecurity.com/ru-ru/research/analytics/cybersecurity-threatscape-2019/. Accessed 13 Apr 2020
20. Ranaldi, L., Zanzotto, F.M.: Hiding your face is not enough: user identity linkage with image recognition. Soc. Netw. Anal. Min. 10(1) (2020)
21. Retruster: Phishing Statistics and Email Fraud Statistics (2019). https://retruster.com/blog/2019-phishing-and-email-fraud-statistics.html. Accessed 13 Apr 2020
22. Sharif, S.H., Mahmazi, S., Navimipour, N.J., Aghdam, B.F.: A review on search and discovery mechanisms in social networks. IJIEEB 5(6), 64–73 (2013). http://doi.org/10.5815/ijieeb.2013.06.08
23. Verizon Data Breach Investigations Report (2018). https://www.researchgate.net/profile/Suzanne_Widup/publication/324455350_2018_Verizon_Data_Breach_Investigations_Report/links/5ace9f0b0f7e9b18965a5fe5/2018-Verizon-Data-Breach-Investigations-Report.pdf?origin=publication_detail. Accessed 15 Apr 2020
24. Fan, W., Lwakatare, K., Rong, R.: Social engineering: I-E based model of human weakness for attack and defense investigations. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 9(1), 1–11 (2017). https://doi.org/10.5815/ijcnis.2017.01.01
25. Winkler, W.E.: String comparator metrics and enhanced decision rules in the Fellegi-Sunter model of record linkage. In: Proceedings of the Section on Survey Research Methods (American Statistical Association), pp. 354–359 (1990)
Method for Processing Document with Tables in Russian-Language Automated Information Systems

Dudnikov Sergey, Mikheev Petr, and Dobrovolskiy Alexander

Bauman Moscow State Technical University, Moscow, Russia
Abstract. This paper presents a method for processing images of documents with tables and extracting information from them for automated information systems. The key feature of this method is the decomposition of the information extraction process into two parts: processing tables and processing non-tabular text. This combined approach allows different types of documents to be processed. We specify the scope of the method, describe its stages, and compare and select solutions for each stage. The proposed method consists of 5 stages of processing scanned images of documents. The main stages combine analytical and machine learning approaches. Experiments were conducted separately for the tasks of processing tables and non-tabular text, and the results obtained demonstrate the effectiveness of the selected solutions. For training and validation, we used both the ICDAR-2013 table competition dataset and our own dataset of 317 scans of layout-heavy business documents, particularly financial documents. In conclusion, we provide recommendations for organizing the process of extracting information from images of financial documents that contain information inside and outside of tables. Based on this method, an intelligent software module has been developed that is used in an ERP system.

Keywords: Document image processing · Information extraction · Table detection · Table structure recognition · Object classification · Neural networks · Machine learning
1 Introduction

Document image analysis and extraction of textual information are fundamental research tasks of computer vision [1]. Invoices, orders, loan agreements and similar business documents carry the information needed for trade between companies, and much of it is on paper or in semi-structured formats. To manage this process effectively, the information from a document must be digitized and entered into the company's IT systems. However, manual input is still mostly used for this purpose, as indicated by research conducted by consulting companies [2].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 111–125, 2021. https://doi.org/10.1007/978-3-030-80478-7_13
D. Sergey et al.
Manual data input into an information system is the reason for the lack of efficiency, availability, and reliability of the information. This leads to additional financial costs and time delays. There are many different methods for automating this operation, but most of them are either applicable for certain document types, or they require additional configuration and external support. The overview of these methods is presented in Sect. 3. Scans of financial documents with tables are considered as processed data in this work. Their structure and the requirements for the method are presented in Sect. 2. The basis of the proposed method is the decomposition of the process of extracting information from documents into 2 parts: processing tables and non-tabular text. The proposed method consists of a combination of two processing methods: analytical and machine learning approaches. The described approach to the combination of document processing is of scientific interest in the study of computer vision, which is related to the field of information extraction from scanned documents. This method is implemented in a software intelligent module embedded in the enterprise ERP system and has been tested experimentally.
2 Information Structure Features of Processed Documents

The method for processing images of financial documents is based on the structure of the information they contain, which is located both outside and inside tables. Because it extracts both textual and tabular information, the method should handle a wide range of document types. The proposed method should take into account the significant visual difference between textual and tabular information. Tables usually represent semantically richer information than plain text. The method should also address the tasks of table detection, table structure recognition, and extraction of information values. Unlike non-tabular text, the appearance and structure of tables have high intra-class variability. Accordingly, the method should use different approaches for processing tables and non-tabular text. In addition, the algorithm should be universal enough to be applied to various documents. The method is applicable to the processing of Russian-language printed financial documentation. It should also be applicable to English-language documents, since most text recognition technologies are initially configured to work with the Latin script.
3 Literature Review

This section covers both ways of extracting information from documents and the separate task of table understanding. We focus on recent work for the sake of brevity. The most well-known IT solutions for document image processing and information extraction are various SaaS products: ABBYY FlexiCapture [3], Yandex Vision [4], SAP S/4HANA Invoice Processing [5], etc. They require additional configuration for the processed document types and external third-party support, and they do not give much control over the system configuration.
Most of the existing solutions perform particular tasks of document processing. They do not consider the task of organizing the end-to-end process of extracting and presenting information from the entire scanned document. Unlike in natural images [6], text detection and recognition in document images is traditionally performed by OCR systems. Microsoft Cognitive Services OCR [7] provides paid API access to a service with a closed architecture. An alternative solution from Google is Tesseract OCR [8]. This software is open-source and license-free; however, in practice it has difficulties in processing a whole document image and recognizing tables. Such a system should only be used for recognizing individual text fragments. The usage of Tesseract OCR is described in Sect. 4.5. There are works in the field of document understanding that combine NLP and computer vision approaches to extract information from any document type. Katti et al. [9] and Denk et al. [10] represent the document as a two-dimensional grid in which pixels for each character and contextualized BERT embeddings are encoded, respectively. They perform semantic segmentation with a deep CNN and achieve results from 34% to 84% across various fields. It is a promising research direction, but currently this task is not solved due to the large variability in document structure. Holecek et al. [11] use the metadata obtained from the PDF format to perform information extraction. They use it to construct a graph that represents the document and feed it to a GraphCNN. Palm et al. [12] provide a system for extracting textual information from invoices. They accept a document in PDF format, form N-grams from the text, and classify them by tags using an LSTM model. These works use metadata obtained from the PDF format, such as textbox coordinates, textual values, etc. This approach is not possible in our case of processing document images.
Many studies separately deal with the subtasks of table detection and its structure recognition. The method proposed by Tran et al. [13] is not based on deep learning models, but it processes document images (not PDF) and shows a competitive result of table detection on the ICDAR-2013 dataset [14]. They analyze the spatial arrangement of text blocks and detected table borders. The disadvantage of this approach is that it cannot be applied to tables with no visible borders, but authors use an effective text detection technique. Schreiber et al. [15] suggest using a Faster-RCNN for table detection and FCNbased segmentation model for table structure recognition. Paliwal et al. [16] proposed the detection of tables and its elements by encoder-decoder and VGG-19 as a base model, where the encoder is the same and the decoder is different for both tasks. Prasad et al. [17] suggest a single deep Cascade mask R-CNN HRNet model for table detection and its structure recognition. The selection of a suitable model is carried out on the basis of an experiment given in Sect. 5.2. Separately, the reviewed methods do not allow complex information extraction from the financial document images.
4 Algorithm of the Proposed Document Image Processing Method

A method for extracting structured information from document images is proposed. It takes into account the requirements for processing images of financial documents described in Sect. 2.
It is assumed that the necessary information is located both in tables and non-tabular text. The difference in their processing complexity is the reason for using separate methods to solve these problems. The method describes a way for processing both text and tables. It can be configured for a specific task if necessary. If table or non-tabular text processing is excessive, the corresponding stage can be skipped. The algorithm of the document image processing is shown in Fig. 1.
Fig. 1. Algorithm of the proposed method
4.1 Preprocessing

The preprocessing stage is necessary to improve the quality of detection of the desired elements and of text recognition. The first step is image binarization, that is, conversion of the image from color to monochrome black-and-white in order to reduce the amount of processed information. In practice, we propose using the adaptive binarization method [18], which is implemented in the open-source computer vision library OpenCV [19]. In this case, the binarization threshold is determined individually for each pixel. The second preprocessing step is noise reduction. It is performed by applying two morphological transformations, erosion and dilation, which are also implemented in the OpenCV library. Erosion shrinks all objects and removes noise, depending on the size of the morphological kernel. It is followed by dilation, which restores the size of all objects except for the removed noise.
4.2 Classification

The stage of document image classification is not considered in detail in this paper, but it is necessary to outline ways of solving this problem. This stage is required if the processed documents are divided into a large number of classes and each class contains different types of text fields to be recognized. Alternatively, we can refer to the ideas of such general document analysis models as Chargrid [9] and BERTgrid [10]. Recent approaches to document classification are divided into two types. The first type works directly with the document image, which is fed into a deep CNN. Kolsch et al. [20] provide such an approach and use a pre-trained AlexNet as the backbone of their model. The second type includes approaches that work with the textual content of the document, which can be extracted using OCR systems. Gupta et al. [21] and Revanasiddappa et al. [22] provide classification based on text summarization and a convolutional representation model, respectively. Yiheng Xu et al. [23] propose a model based on the BERT architecture that integrates the bounding boxes of the text, the textual content, and the image itself.

4.3 Non-tabular Text Processing

This section covers both text detection and the classification of text by fields (or tags), since not all of the information needs to be recognized. The main idea of processing non-tabular text is the analysis of connected components in a binary image, which was inspired by Tran et al. [13]. They extend this analysis to the task of table detection, but their method can only detect a bordered table in which all text elements are located strictly under each other. Such restrictions do not meet our concept of flexible configuration for various processed documents. However, the idea of using connected components is effective when working with plain text outside of tables. Connected components in binary images are connected regions of pixels with non-zero values.
The OpenCV library provides algorithms for working with them, so it is possible to operate on higher-level objects that are more convenient than pixels.

Non-tabular text detection is conducted in the following way. We extract the connected components CCS from the binary image and get their bounding boxes. Let CC_i be the i-th connected component and B_i its bounding box, which is set by the coordinates x_left, x_right, y_top and y_bot. Then we calculate the average height of all connected components:

\[ avgHeight = \frac{1}{n} \sum_{i=1}^{n} height_i , \]

where height_i is the height of the i-th connected component and n is the number of connected components.

Afterwards, we define text elements. Each of them is a connected component whose height is approximately equal to avgHeight and whose width is not too large. Practical experience shows that such elements are often merged with each other because of insufficient image quality. In order not to lose them, it is recommended to treat any connected component that satisfies the following empirical system as a text element:

\[ \begin{cases} width < 10 \cdot height \\ height < avgHeight \cdot 4 \end{cases} \tag{1} \]
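The filtering rule of system (1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Box` tuple and the function name `select_text_elements` are hypothetical, and the bounding boxes are taken as already extracted (for example, by a connected-component routine), which is not shown here.

```python
from collections import namedtuple
from statistics import mean

# Hypothetical container for a bounding box; field names follow the text.
Box = namedtuple("Box", "x_left x_right y_top y_bot")


def select_text_elements(boxes):
    """Return (avgHeight, boxes that satisfy empirical system (1))."""
    # Average height over all connected components.
    avg_height = mean(b.y_bot - b.y_top for b in boxes)
    kept = []
    for b in boxes:
        width = b.x_right - b.x_left
        height = b.y_bot - b.y_top
        # System (1): not too wide relative to its height, not too tall overall.
        if width < 10 * height and height < avg_height * 4:
            kept.append(b)
    return avg_height, kept


# Three letter-sized boxes, one long thin rule, one tall blob.
example = [
    Box(0, 10, 0, 12),
    Box(12, 22, 0, 12),
    Box(24, 34, 0, 12),
    Box(0, 300, 20, 22),
    Box(0, 50, 40, 240),
]
avg, elements = select_text_elements(example)
```

In this toy input the thin rule fails the width condition and the tall blob fails the height condition, so only the three letter-sized boxes remain.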
D. Sergey et al.
Text elements are then grouped into rows. Two connected components are located in the same row if the value of the following function equals 1:

\[ F(CC_i, CC_j) = \begin{cases} 1, & \text{if } y_{top}(CC_i) \le y_{bot}(CC_j) \text{ and } y_{bot}(CC_i) \ge y_{top}(CC_j) \\ 0, & \text{otherwise} \end{cases} \tag{2} \]

This is followed by the construction of text blocks. A text block is a combination of text elements within a row that are close to each other. First, statistics are collected for a group of documents that are similar to the document being processed. Each of them has two numerically close parameters: the distance between letters in words and the distance between words in sentences. For each document, in each row, the distance between consecutive text elements is calculated: \( D(CC_i, CC_j) = \min(\theta_{ij}, \theta_{ji}) \), where \( \theta_{ij} = x(CC_i) - x(CC_j) \). Then values close to the mode of the current array are removed from the array of collected distances twice: the first pass removes the distances between letters in words, and the second removes the distances between words in sentences. The variance of the remaining distance array is the threshold below which text elements are combined into text blocks. The resulting text blocks are simply the bounding boxes of all non-tabular text, that is, its location coordinates on the document image.

The final step is the extraction of the text fields that are needed to create the information object of the document. This task is usually treated as semantic image segmentation, which is performed by deep neural networks. However, we can process text blocks, which already contain information about the location of the text. This allows us to reduce image segmentation to the classification of text blocks. Because the data structure is simple, deep learning models are not used for this task, which avoids rapid overfitting. Standard machine learning algorithms are sufficient in this case: unsupervised clustering, random forest or gradient boosting.
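As an illustration of the row-grouping rule in Eq. (2), the following minimal Python sketch implements the predicate F and one possible way to collect rows. The greedy grouping strategy, the `Box` tuple, and the `gaps` helper (which takes the distance between neighbours as the horizontal gap between their facing edges) are assumptions made for this example; the text does not specify these details.

```python
from collections import namedtuple

# Hypothetical container for a bounding box; field names follow the text.
Box = namedtuple("Box", "x_left x_right y_top y_bot")


def same_row(a, b):
    """Eq. (2): 1 if the vertical extents of the two boxes overlap, else 0."""
    return 1 if (a.y_top <= b.y_bot and a.y_bot >= b.y_top) else 0


def group_rows(boxes):
    """Greedy grouping (an assumption): attach a box to the first row
    that already holds a box it shares a row with, else open a new row."""
    rows = []
    for box in sorted(boxes, key=lambda b: b.y_top):
        for row in rows:
            if any(same_row(box, other) for other in row):
                row.append(box)
                break
        else:
            rows.append([box])
    # Order every row from left to right.
    return [sorted(row, key=lambda b: b.x_left) for row in rows]


def gaps(row):
    """Horizontal gaps between consecutive boxes of a left-to-right row
    (assumed distance definition, clipped at zero for touching boxes)."""
    return [max(0, right.x_left - left.x_right)
            for left, right in zip(row, row[1:])]


a = Box(0, 10, 0, 12)
b = Box(14, 24, 1, 13)
c = Box(0, 10, 30, 42)
rows = group_rows([c, b, a])
```

The collected gap values would then feed the mode-removal and variance-threshold step described above, which is not reproduced here.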
The source data for classification are text blocks, that is, objects with the features x_left, x_right, y_top and y_bot. A distance-to-table feature was also considered, but it did not improve the result. All features must be normalized relative to the image size. Based on the experiment described in Sect. 5.1, it is recommended to apply the ensemble algorithm «random forest» [24]. It is worth noting that there is no universal model that is suitable for classifying arbitrary data. Cross-validation must be performed on custom data, and the model should be selected based on its results.

Finally, we have text blocks with labels assigned as a result of classification. Text blocks with the necessary labels are sent to the OCR system as coordinates of the text to be recognized. The text itself is the essential component of the information object of the document.

4.4 Table Processing

Tables on document scans are very different from regular text. Because of the wide variety of table types, many of them cannot be extracted from the document image as easily as text blocks. Our method is intended to be flexible and customizable for various documents, so it should universally detect and process different types of
tables, both with and without borders. In this regard, it is necessary to use deep learning models to perform table detection and table structure recognition. The use of CascadeTabNet, proposed by Prasad et al. [17], is considered in this paper. The choice was made based on the results of an experiment on the ICDAR-2013 dataset [14], which is presented in the next section. In addition, this model is publicly available and has been implemented using the MMDetection framework [25]. This makes it easy to change its architecture if necessary and to train it on custom data, which fits well into our concept of a flexible and customizable method. The authors have also made models trained on the ICDAR-2013 and ICDAR-2019 (cTDaR) [26] datasets publicly available. Figure 2 shows the model architecture.
Fig. 2. CascadeTabNet model architecture
The backbone of the model is the CNN HRNetV2p_W32 [27], which converts the image to feature maps. «RPN Head» makes preliminary object predictions on the feature maps. «B» indicates the bounding boxes predicted by the model. «Bbox Head» accepts features along with «B» and performs regression and classification of the object bounding boxes. «Mask Head» predicts object masks, and «S» denotes the image segmentation. Thus, object detection is performed by «Bbox Head», and image segmentation for these objects is performed by «Mask Head». Initially, the model was trained on the MS COCO dataset [28] for the object detection task, so in order to predict tables, two iterations of transfer learning were performed. The first one was made on a merge of 3 table detection datasets [17] to train the model to detect tables in general. The second one was more targeted and used a specific dataset (ICDAR-2013 or ICDAR-2019) to predict the type of table (bordered or borderless). The pipeline of CascadeTabNet processing is shown in Fig. 3:
Fig. 3. The pipeline of CascadeTabNet processing
Within the proposed concept of transfer learning, the model was additionally configured for a custom dataset. The models trained on ICDAR-2013 and ICDAR-2019 were chosen and fine-tuned on the custom dataset. The evaluation and analysis of the table detection results on images of financial documents are presented in Sect. 5.3.

The model predicts segmentation masks for two types of tables: bordered and borderless. Cell masks are predicted only for borderless tables. For a bordered table, it is much more effective to use analytical computer vision methods. Morphological algorithms are used to extract lines, and cells are determined from the line intersection points. The structure is then recognized by analyzing cell positions within the table. A structure is considered recognized when the coordinates and the positions in rows and columns are known for each cell.

Finally, we have the coordinates of the tables and of the cells located in them. The entire table can be recognized by sending the cell coordinates to the OCR system. If this is excessive, the already described principle of text blocks classification can be applied. In this case, the coordinates of the cells act as text blocks.

4.5 Information Object of the Document Construction

The structure of the information object of the document is determined by the engineer, based on what type of information needs to be extracted. For example, financial documents of the «Payment report» type have the following fields: document ID, transaction information, total transaction price, total price verification. Field names are labels that are assigned to text blocks or cells during the classification. To recognize textual information, classified text blocks are used as input to the OCR system. For confidentiality reasons, we use Tesseract OCR [8], as it is open-source and license-free software.
It efficiently processes separate text fragments that do not contain unnecessary third-party information, and it is able to recognize both Russian-language and English-language text. Also, it is possible to perform fine-tuning for custom data in case of insufficient recognition results.
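The construction described in Sect. 4.5 can be pictured with a short Python sketch. It is a hedged illustration only: the function `build_information_object`, the `(label, coords)` pair format, and the `recognize` callback are hypothetical names introduced here, and the callback merely stands in for a call to an OCR system such as Tesseract rather than reproducing any real API.

```python
# Field names of the «Payment report» example from the text.
FIELDS = (
    "document ID",
    "transaction information",
    "total transaction price",
    "total price verification",
)


def build_information_object(labeled_blocks, recognize, fields=FIELDS):
    """Collect recognized text per field.

    labeled_blocks: iterable of (label, coords) pairs produced by the
        text blocks classifier (hypothetical format).
    recognize: callable mapping block coordinates to a string; a
        placeholder for the OCR step.
    """
    info = {name: [] for name in fields}
    for label, coords in labeled_blocks:
        # Blocks with other labels (for example «Misc») are skipped.
        if label in info:
            info[label].append(recognize(coords))
    return info


# Toy usage with a fake recognizer instead of a real OCR call.
blocks = [
    ("document ID", (10, 120, 5, 25)),
    ("Misc", (10, 400, 30, 50)),
    ("total transaction price", (300, 380, 500, 520)),
]
info_object = build_information_object(blocks, lambda c: f"text@{c}")
```

Only blocks whose label matches a configured field reach the recognizer, which mirrors the idea that unnecessary text is never sent to OCR.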
5 Experiment

A software intelligent module was developed according to the algorithm presented in Sect. 4. The module consists of several software blocks that implement the assigned task. They passed a preliminary experimental test, which showed a low error and high efficiency on various types of documents. We created a custom dataset of 316 enterprise financial documents for the experiment. Figure 4 shows the distribution of document types in the dataset. Two main document types were selected: payment reports and invoices. Documents of the type «Other» were added to the dataset to increase its variability.

5.1 Text Blocks Classification

The type of processed data (scans of financial documents) allows us to detect non-tabular text using an analytical method, as indicated in Sect. 4.3. As a result of the detection, we have text blocks, that is, objects with the following features: x_left, x_right, y_top and y_bot.
Fig. 4. Distribution of document types in the dataset
To create an information object of a document, it is necessary to determine which text blocks are important for recognition. Therefore, we assign labels or tags to the text, which represent the information structure of the document. Thus, the task of semantic field segmentation is reduced to text blocks classification. The tag structure is configured in advance and depends on the type of document. The public ICDAR-2013 dataset [14] does not provide a set of documents with a similar tag structure, so we decided to use our custom annotated dataset of financial documents for evaluation. For the experiment, 183 documents of the type «Payment reports» were selected from the dataset. This type is the most numerous, and each document has the same tag structure. The annotation consisted of segmentation of the four fields: document ID, transaction information, total transaction price, and total price verification. After the automatic generation of text blocks, each of them was assigned a label according to the annotation. If a text block did not fit the annotation fields, it was assigned the «Misc» label. The result is a dataset of 3244 labeled text blocks, which were divided into training and test samples in a ratio of 4:1. The distribution of text blocks types is shown in Fig. 5.
Fig. 5. The distribution of text blocks types
Table 1 shows a cross-validation evaluation of various machine learning classification models. We didn’t use deep learning models to avoid rapid overfitting due to the structure of classified data. The features of text blocks were normalized relative to the image size.
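Normalizing the block features relative to the image size amounts to dividing horizontal coordinates by the image width and vertical coordinates by the image height. The helper below is a minimal sketch of that step; the function name and argument order are assumptions made for illustration.

```python
def normalize_block(x_left, x_right, y_top, y_bot, image_width, image_height):
    """Scale pixel coordinates of a text block to the [0, 1] range."""
    if image_width <= 0 or image_height <= 0:
        raise ValueError("image size must be positive")
    return (
        x_left / image_width,
        x_right / image_width,
        y_top / image_height,
        y_bot / image_height,
    )


# A block on a 1000 x 500 pixel scan.
features = normalize_block(100, 300, 50, 150, image_width=1000, image_height=500)
```

With this scaling, blocks from scans of different resolutions become comparable, which is why the same classifier can be applied across them.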
The hyperparameters of the models were selected using the grid search method, which generates models based on combinations of parameters. The number of clusters for the kNN clustering method is selected depending on the tag structure.

Table 1. Cross-validation evaluation of text blocks classification models

ML model             F1-score
K-nearest neighbors  0.704
Random forest        0.919
Gradient boosting    0.858
The metric used in the evaluation is the F1-score. The random forest model gives a better result than the others on our data. The clustering method is not as effective as the others, since supervised learning is needed to find ambiguous patterns in the label locations. We do not specify hyperparameters for all models, but we provide recommendations for setting up a random forest model for text blocks classification in Table 2.

Table 2. Optimal hyperparameters of random forest model for text blocks classification

Hyper-parameter    Value
max_depth          20
max_features       3
min_samples_leaf   1
n_estimators       100
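The grid search mentioned above enumerates every combination of candidate hyperparameter values. The sketch below shows that enumeration in plain Python. Only the values in `BEST_RF_PARAMS` come from Table 2; the candidate grid `PARAM_GRID` is hypothetical, and the parameter names merely happen to match the arguments of scikit-learn's `RandomForestClassifier`, which is an assumption about tooling rather than something stated in the text.

```python
from itertools import product

# Values recommended in Table 2.
BEST_RF_PARAMS = {
    "max_depth": 20,
    "max_features": 3,
    "min_samples_leaf": 1,
    "n_estimators": 100,
}

# Hypothetical candidate grid, for illustration only.
PARAM_GRID = {
    "max_depth": [10, 20],
    "max_features": [2, 3],
    "min_samples_leaf": [1, 2],
    "n_estimators": [50, 100],
}


def iter_grid(grid):
    """Yield one dict per combination of the candidate values."""
    names = sorted(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


candidates = list(iter_grid(PARAM_GRID))
```

Each candidate would be scored by cross-validation and the best one kept; in this toy grid the Table 2 configuration is one of the 16 candidates.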
It is worth noting that the F1-score is overestimated because of class imbalance. Most text blocks have the «Misc» label, which is not needed for recognition. A more representative result is the confusion matrix shown in Fig. 6, which gives the ratio of correct and incorrect classifications for each label. The average model accuracy for the 4 annotated labels is

\[ \frac{0.8 + 0.9633 + 0.9054 + 0.9643}{4} \approx 0.9083 = 90.83\% . \]

5.2 Table Detection and Table Structure Recognition

For the table processing experiment, we selected models that showed the best results in the tasks of table detection and table structure recognition. The performance of these models is evaluated below on the well-known table processing dataset ICDAR-2013 [14]. This dataset consists of PDF files, which we converted into images for further processing. In total, it has 238 images containing 156 tables. It also has the meta-information needed for the table structure recognition task.
Fig. 6. Random forest confusion matrix
First, we evaluate the accuracy of the models for table detection. We use an IoU threshold of 0.5 to calculate the F1-score. Since it is unknown which images in the dataset were used by Prasad et al. [17] for the training and test samples, we evaluated their model on a random sample of 200 images from the ICDAR-2013 dataset. The metrics of the other models are taken from the corresponding papers. The evaluation of table detection models is shown in Table 3.

Table 3. Table detection evaluation for ICDAR-2013

Model               Recall  Precision  F1-score
CascadeTabNet [17]  0.9754  0.9845     0.9799
DeepDeSRT [15]      0.9615  0.9740     0.9677
TableNet [16]       0.9628  0.9697     0.9662
Tran et al. [13]    0.9636  0.9521     0.9578
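To make the metric concrete, here is a minimal Python sketch of IoU-based scoring for table detection. The greedy one-to-one matching of predictions to ground-truth boxes is an assumption made for this example (the text only fixes the IoU threshold of 0.5), and the function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes (x_left, y_top, x_right, y_bot)."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total > 0 else 0.0


def detection_scores(predicted, ground_truth, threshold=0.5):
    """Return (precision, recall, F1) with greedy one-to-one matching."""
    matched = set()
    true_positives = 0
    for p in predicted:
        best_j, best_iou = None, threshold
        for j, g in enumerate(ground_truth):
            if j in matched:
                continue
            value = iou(p, g)
            if value >= best_iou:
                best_j, best_iou = j, value
        if best_j is not None:
            matched.add(best_j)
            true_positives += 1
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall, f1_score(precision, recall)


truth = [(0, 0, 100, 100), (200, 200, 300, 300)]
found = [(0, 0, 100, 90), (400, 400, 500, 500)]
scores = detection_scores(found, truth)
```

As a consistency check, `f1_score(0.9845, 0.9754)` reproduces the 0.9799 reported for CascadeTabNet in Table 3 to four decimal places.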
Currently, CascadeTabNet is a state-of-the-art solution for table detection on the ICDAR-2013 dataset. Next, we evaluate the accuracy of the models for table structure recognition. The metric used in this task is the same as before, the F1-score. Prasad et al. [17] did not evaluate the table structure recognition task on the ICDAR-2013 dataset, so this evaluation was performed in our work. The evaluation of table structure recognition models is shown in Table 4. Tables 3 and 4 show that the CascadeTabNet model [17] provides a better result than the other models under consideration. Therefore, it is used in the proposed method as the table processing stage.
Table 4. Table structure recognition evaluation for ICDAR-2013

Model               Recall  Precision  F1-score
CascadeTabNet [17]  0.9041  0.9358     0.9197
DeepDeSRT [15]      0.8987  0.9215     0.9098
TableNet [16]       0.8736  0.9593     0.9144
5.3 Table Detection on Custom Dataset

We chose the model that showed the best result on the ICDAR-2013 dataset to perform a table detection experiment on the custom dataset. We selected two states of this model: pre-trained on ICDAR-2013 [14] and on ICDAR-2019 [26]. Within the framework of transfer learning, we fine-tuned these models on more specific enterprise financial documents. From the custom dataset (see Fig. 4), 80% of the data was fed to the model. The other 20% was used for table detection evaluation. The data fed to the model was divided into training and validation samples in a ratio of 4:1. The models were trained for 6 epochs. The number of iterations depends on the amount of training data, and it was 101 iterations in our case. We used standard SGD with a fixed learning step of 0.0012 and classical momentum of 0.9. The average precision metric is used to evaluate the effectiveness of table detection. The practical implementation of this metric is based on the work of Padilla et al. [29]. The model results are shown by epoch in Table 5. The zero epoch is the basic pre-trained model.

Table 5. Table detection results on custom dataset

Epochs  ICDAR-13 average precision  ICDAR-19 average precision
0       0.924                       0.842
1       0.942                       0.915
2       0.950                       0.925
3       0.961                       0.918
4       0.968                       0.920
5       0.970                       0.921
6       0.971                       0.920
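The optimizer settings quoted in this section (fixed step 0.0012, classical momentum 0.9) correspond to the update rule sketched below. This is a generic textbook form of SGD with momentum applied to a toy one-dimensional objective; it is not the training code of the model, and the function name is hypothetical.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.0012, momentum=0.9):
    """One step of SGD with classical (heavy-ball) momentum."""
    # The velocity accumulates a decaying sum of past gradient steps.
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


# Toy objective f(w) = w^2 with gradient 2w, started from w = 1.
w, v = 1.0, 0.0
for _ in range(1000):
    w, v = sgd_momentum_step(w, v, grad=2 * w)
```

With a fixed step, this form and the variant that accumulates raw gradients and scales by the step afterwards trace the same trajectory, so the choice between them is a matter of convention.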
The model pre-trained on the ICDAR-2013 dataset shows the best result (see Fig. 7) on the documents whose structure is given in Sect. 2. Also, it does not overfit as quickly as the ICDAR-2019 model. ICDAR-2013 contains documents from the EU and the US Government, and its data appears to be more similar to enterprise financial documents than that of ICDAR-2019. In addition, the model originally showed better results
on ICDAR-2013 [17]. Based on the results of the experiment, it is recommended to pre-train table detection models on the ICDAR-2013 dataset when working with enterprise financial documents.
Fig. 7. Average precision for pre-trained models
It can be noted that this model copes well with table detection, even if it is overlapped by third-party elements, such as stamps and signatures.
6 Conclusion

This paper has presented a method for processing images of financial documents and extracting information from them for automated information systems. Within the method framework, a combination of the main document processing approaches was proposed and investigated: non-tabular and tabular information processing. Recommendations are presented for each stage of the method, as well as for the selection of a dataset for pre-training the table detection model. Based on this method, a software intelligent module has been developed. It is successfully used in the ERP system of the enterprise. In future work, we will study whether this simple but practical method architecture is robust enough to be generalized to photographs of documents.
References

1. Kasturi, R., O’Gorman, L., Govindaraju, V.: Document image analysis: a primer. Sadhana 27(1), 3–22 (2002)
2. Niemann, F.: PAC – CXP Group, Fujitsu. What AI can bring to business applications (2018). https://www.fujitsu.com/global/solutions/business-technology/ai/pac-study. Accessed 1 Sept 2020
3. ABBYY. ABBYY FlexiCapture (2020). https://www.abbyy.com/flexicapture/. Accessed 1 Sept 2020
4. Yandex. Yandex Vision (2020). https://cloud.yandex.com/services/vision. Accessed 1 Sept 2020
5. SAP. SAP S/4HANA cloud for invoice processing (2020). https://www.sap.com/cis/products/s4hana-erp.html. Accessed 1 Sept 2020
6. Goel, V., Kumar, V., Jaggi, A.S., Nagrath, P.: Text extraction from natural scene images using OpenCV and CNN (2019)
7. Microsoft. Microsoft Azure computer vision (2020). https://azure.microsoft.com/en-us/services/cognitive-services/computer-vision/. Accessed 1 Sept 2020
8. Smith, R., Abdulkader, A.: Google. Tesseract-OCR (2020). https://github.com/tesseract-ocr/tesseract. Accessed 1 Sept 2020
9. Katti, A.R., et al.: Chargrid: towards understanding 2d documents. arXiv preprint arXiv:1809.08799 (2018)
10. Denk, T.I., Reisswig, C.: BERTgrid: contextualized embedding for 2D document representation and understanding. arXiv preprint arXiv:1909.04948 (2019)
11. Holeček, M., Hoskovec, A., Baudiš, P., Klinger, P.: Table understanding in structured documents. In: 2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), vol. 5, pp. 158–164. IEEE, September 2019
12. Palm, R.B., Winther, O., Laws, F.: Cloudscan-a configuration-free invoice analysis system using recurrent neural networks. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 406–413. IEEE, November 2017
13. Tran, D.N., Tran, T.A., Oh, A., Kim, S.H., Na, I.S.: Table detection from document image using vertical arrangement of text blocks. Int. J. Contents 11(4), 77–85 (2015)
14. Göbel, M., Hassan, T., Oro, E., Orsi, G.: ICDAR 2013 table competition. In: 2013 12th International Conference on Document Analysis and Recognition, pp. 1449–1453. IEEE, August 2013
15. Schreiber, S., Agne, S., Wolf, I., Dengel, A., Ahmed, S.: Deepdesrt: deep learning for detection and structure recognition of tables in document images. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1162–1167. IEEE, November 2017
16. Paliwal, S.S., Vishwanath, D., Rahul, R., Sharma, M., Vig, L.: TableNet: deep learning model for end-to-end table detection and tabular data extraction from scanned document images. In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 128–133. IEEE, September 2019
17. Prasad, D., Gadpal, A., Kapadni, K., Visave, M., Sultanpure, K.: CascadeTabNet: an approach for end to end table detection and structure recognition from image-based documents. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pp. 572–573 (2020)
18. Moghaddam, R.F., Cheriet, M.: A multi-scale framework for adaptive binarization of degraded document images. Pattern Recogn. 43(6), 2186–2198 (2010)
19. OpenCV team. OpenCV (2020). https://opencv.org/. Accessed 1 Sept 2020
20. Kölsch, A., Afzal, M.Z., Ebbecke, M., Liwicki, M.: Real-time document image classification using deep CNN and extreme learning machines. In: 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), vol. 1, pp. 1318–1323. IEEE, November 2017
21. Gupta, A., Kaur, M., Bajaj, A., Khanna, A.: Entailment and spectral clustering based single and multiple document summarization. Int. J. Intell. Syst. Appl. 11(4), 39 (2019)
22. Revanasiddappa, M.B., Harish, B.S.: A novel text representation model to categorize text documents using convolution neural network. Int. J. Intell. Syst. Appl. 11(5), 36 (2019)
23. Xu, Y., Li, M., Cui, L., Huang, S., Wei, F., Zhou, M.: Layoutlm: pre-training of text and layout for document image understanding. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pp. 1192–1200, August 2020
24. Liaw, A., Wiener, M.: Classification and regression by randomForest. R News 2(3), 18–22 (2002)
25. Chen, K., et al.: Mmdetection: open mmlab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155 (2019)
26. Gao, L., et al.: ICDAR 2019 competition on table detection and recognition (CTDAR). In: 2019 International Conference on Document Analysis and Recognition (ICDAR), pp. 1510–1515. IEEE, September 2019
27. Wang, J., et al.: Deep high-resolution representation learning for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. (2020)
28. Lin, T.-Y., et al.: Microsoft COCO: common objects in context. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds.) ECCV 2014. LNCS, vol. 8693, pp. 740–755. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-10602-1_48
29. Padilla, R., Netto, S.L., da Silva, E.A.: A survey on performance metrics for object-detection algorithms. In: 2020 International Conference on Systems, Signals and Image Processing (IWSSIP), pp. 237–242. IEEE, July 2020
Method of Multi-objective Design of Strain Gauge Force Sensors Based on Surrogate Modeling Techniques

Sergey I. Gavrilenkov and Sergey S. Gavriushin

Bauman Moscow State Technical University, Moscow, Russia
[email protected]
Abstract. The design of strain gauge force sensors is usually simulation-driven and requires time-consuming finite element analyses to be conducted multiple times, which limits the efficacy of the simulation-driven approach. In this paper, we propose overcoming this limitation with a method of multicriteria optimization of strain gauge force sensors based on surrogate modeling techniques. Surrogate models approximate the relationship between the design variables and the properties of the elastic element's stress-strain state that determine the sensor characteristics. The key feature of the proposed method is that a special surrogate model reconstructs the entire distribution of strain on the elastic element, which is required for determining the optimal placement of strain gauges. The surrogate models are neural network (NN)-based. We benchmarked the NN-based surrogate models against RSM-based surrogate models in a case study. According to the case study results, the RSM-based models are on par with the NN-based models in predicting almost linear relationships between goal functions and design variables, while the RSM-based models failed to accurately reconstruct the strain distributions because of their nonlinearity. The proposed method accelerates the design of strain gauge force sensors while retaining the possibility of determining the optimal placement of strain gauges on the elastic element.

Keywords: Surrogate-modeling · Simulation-driven design · Strain gauge force sensors · Artificial neural network · Genetic algorithm
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 126–137, 2021. https://doi.org/10.1007/978-3-030-80478-7_14

1 Introduction

Strain gauge-based force sensors dominate force measurement in science and industry, from instruments in external balances [1] and sensors for load monitoring in special robots [2] and smart vehicles [3] to weighbridges and platform scales. Current Industry 4.0 trends call for instrumenting all production processes, including force measurement. So, there will likely be increased demand for application-specific force sensors.

A strain gauge force sensor is made up of an elastic metal body with strain gauges mounted on it. When the measured force acts on the sensor body, strain is produced in the elastic element. The elastic body strain deforms the attached strain gauges, making them change their electrical resistance. The strain gauges form a Wheatstone bridge, so the strain of the strain gauges produces a voltage at the output terminals of the Wheatstone bridge. The voltage is proportional to the applied force.

The characteristics of strain gauge force sensors (nonlinearity, hysteresis, output signal magnitude, overload capacity, etc.) are related to the overall stress-strain state of the elastic body. The stress-strain state is related to the stiffness of the elastic element, which in turn is determined by its geometry. So, the desired sensor properties can be achieved by optimizing the elastic element geometry. The stress-strain state is usually determined using Finite Element Analysis (FEA). However, FEA simulation studies are time-consuming, and many design variants have to be evaluated in the design process. So, the design process may take too much time, which may force the designers to limit the number of design evaluations. We propose overcoming this limit using surrogate models that approximate the relationship between the stress-strain state and the design variables.

The goals of this paper are as follows: (1) Propose a method for surrogate-based optimization of strain gauge force sensors. To determine the optimal placement of strain gauges on the elastic elements, the surrogate models will have to reconstruct the entire distribution of strain on the elastic element surface where strain gauges will be mounted. This is the crucial feature of the proposed method; (2) Conduct a case study in which the accuracy and efficacy of the proposed method will be tested.

The rest of the paper is organized as follows. First, we review the existing approaches to surrogate modeling. Second, we present the surrogate modeling method and explain how it is implemented. Third, we integrate the proposed method in the software [4] for the multi-objective design of strain gauge force sensors.
After that, we test the proposed method in a case study, where we check the performance and accuracy of the chosen NN-based architecture against another surrogate model architecture (Response Surface Methodology). Lastly, we consider the results of the case study and map the plans for future research.
2 Literature Review

Although there are many papers devoted to the design of strain gauge force and pressure sensors [5–7], there is little, if any, research devoted to applying the surrogate modeling method. However, surrogate modeling methods are widely used in other areas of design and optimization. For example, Temirchev et al. used deep neural networks to predict the oil movement in a development unit [8]. Novel approximation algorithms and several advanced methods of automated model selection were proposed in a special framework for surrogate modeling in industrial design by Belyaev et al. [9]. Kriging-based surrogates were used to accelerate the design of the main landing gear doors of an Airbus passenger aircraft in [10].
Hoole et al. compared surrogate modeling methods for finite element analysis of landing gear loads [11]. Surrogate models can be used for model identification based on test data. For example, a kernel ridge regression confidence machine was used for that purpose in [12]. Surrogate modeling is also extensively used in chemical process engineering, as demonstrated by the review in [13]. Sometimes, the training set for the construction of surrogate models is based on heterogeneous data like results of high-fidelity and low-fidelity models. In [14], Gaussian processes regression was used to unify variable-fidelity simulation data in a surrogate model for design exploration. Based on the reviewed works, we decided to base our surrogate model on the neural network architecture because strain distribution on the elastic element can be very nonlinear, and neural networks can fit highly nonlinear functions. Aside from surrogate modeling problems, neural networks are very widely used in nearly all areas of science and engineering [15–17], making the tools and reference materials very accessible.
3 Methodology

3.1 The Procedure of Force Sensor Design Based on Surrogate Models

The procedure for constructing the surrogate models is as follows:

1. The designer proposes the parametric topology of the elastic elements, chooses the design parameters that will be varied, and specifies the bounds for the design parameters;
2. The bounded parameter space is investigated using Design of Experiment (DoE) techniques;
3. Each design variant undergoes FEA, and the analysis results (stress-strain data) are used to form the set for training and validating the surrogate models;
4. The models are trained and then used for design exploration.

The simulation results are used to calculate the following goal functions (the list of goal functions can vary depending on the sensor application area):

1. Output signal of the Wheatstone bridge measuring circuit;
2. Nonlinearity of the output characteristic of the sensor (the relationship between the applied load and the output signal);
3. Hysteresis of the force sensor;
4. Overloading capacity of the force sensor.

Goal functions 3–4 are functions of the force sensor elastic element geometry. For example, the magnitude of hysteresis is related to the maximum radial displacement of the force sensor support point. These goal functions can be calculated from a few scalar simulation results, like the maximum Von Mises stress in the elastic element or the maximum radial displacement of the base of a membrane force sensor. The task of the surrogate model is to approximate the relationship between the design variables and these scalar simulation results:

\[ GF^{i} = F^{i}(DV) \tag{1} \]
where GF^i is the value of the i-th scalar goal function, F^i is the sought approximation function, and DV is the vector of design variables. However, Goal Functions 1 and 2 also depend on the placement of strain gauges on the elastic element, so an optimal placement of gauges has to be determined. To that end, the distribution of strain components across the surfaces of the elastic element is required. So, in this case, the surrogate modeling task comes down to approximating the relationship between the elements of the strain tensor and the vector comprised of a sub-vector of design variables DV and the coordinates of points on the elastic element surface:

\[ E_{j}^{i} = F^{i}(DV, X, Y, Z) \tag{2} \]
where $E_{ji}$ is the j-th component of the strain tensor, for example, the normal strain in the X-X direction, and X, Y, Z are the coordinates of points on the surface of the elastic element. Thus, the total number of surrogate models equals the number of scalar goal functions plus the number of surrogate models that predict the strain components (one surrogate model per component) needed to calculate Goal Functions 1 and 2.
The results of the finite element simulations mentioned above make up the data set for training the neural networks used as surrogate models. This set is split into a training set and a validation set. The validation set is used to tune the parameters of the neural network (number of layers, types of activation functions, learning algorithm parameters) to prevent overfitting. Once the surrogates are built, they are used to predict the scalar Goal Functions and to reconstruct the strain distributions on the elastic elements. The strain distributions are passed to a module that determines the optimal placement of strain gauges on the elastic element. The placement problem is a multi-objective problem with Goal Functions 1 and 2; it is solved using a multi-objective genetic algorithm to obtain a set of Pareto-optimal solutions. Hence, for one design variant of an elastic element, there can be a number of strain gauge placement configurations.
The surrogate models enable fast investigation of the parameter space because time-consuming finite element simulations are no longer required. The rest of the optimization process follows the methodology of the Parameter Space Investigation (PSI) method [14]. The details of the implementation of the proposed surrogate modeling method, as well as the case study, are elaborated in the following sections.
3.2 Implementation of the Proposed Method
Finite Element Analysis and Design of Experiment. The stress-strain state of the elastic elements is determined using the open-source software Code_Aster.
The mesh for the FEA is built using Salome-Meca. The design of experiment uses the low-discrepancy Sobol sampling strategy. Like Latin hypercube sampling, Sobol sampling provides very uniform coverage of the parameter space. The meshes are built automatically from a parametric geometric model of the elastic element made in Salome-Meca’s geometry module. The FE analyses of the meshes are also run automatically. The whole pipeline of investigating the parameter space for constructing the surrogate model is powered by the DOE module of the software system for multicriteria optimization of strain gauge force sensors [14].
Neural Networks. Each goal function corresponds to a dedicated surrogate model based on a neural network. The exceptions are the first two Goal Functions, which are determined from the surrogate model for predicting the strain distribution. In our case, a neural network comprises an input layer, a number of hidden layers, and an output layer. The configuration of the neural network (the number of hidden layers, the number of neurons in the hidden layers, the type of activation function in the hidden layers), as well as the hyperparameters (batch size for training, training algorithm, number of training epochs), is determined based on the designer’s expertise. The simulation data is split into a training set and a validation set. The validation set can be used to evaluate the generalization capacity of an NN, i.e., its ability to interpolate between training points and avoid overfitting. In this paper, we use the Keras library [18] with TensorFlow to construct the surrogate models. The procedure of training the neural networks is implemented as a separate Python module, and the trained neural networks are then integrated into the design exploration module of the aforementioned software [14].
Strain Gauge Placement Problem. As mentioned in the Introduction, the key feature of the approach is that a special surrogate model reconstructs the distribution of strain on the elastic element surface where the strain gauges are mounted. In this case, the problem of determining the optimal placement of strain gauges has to be solved. This is a multicriteria problem, and its goal functions are as follows:
1. Maximize the output signal (OS) of the sensor Wheatstone bridge;
2. Minimize the difference between the strain magnitudes sensed by the compression and tension gauges.
This quantity directly influences the nonlinearity of the Wheatstone bridge and the overall sensor nonlinearity. The goal function NL is calculated as follows:
$$NL = \left| 1 - \frac{|\varepsilon_2|}{|\varepsilon_1|} \right| \qquad (3)$$
where ε2 and ε1 are the strains measured by the compression and tension strain gauges, respectively. The goal functions OS and NL correspond to Goal Functions 1 and 2 listed at the beginning of Subsect. 3.1. This multicriteria problem is solved by obtaining a set of Pareto-feasible solutions. The details of the algorithm are elaborated in [19]. The dedicated surrogate model reconstructs the strain distribution used to calculate the goal functions.
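The two placement goal functions can be illustrated with a minimal sketch. It is not the genetic algorithm of [19]: it simply scores a handful of hypothetical placements and keeps the non-dominated ones by brute force. The output signal uses the standard linearized full-bridge relation for two tension and two compression gauges, which is an assumption here because the paper only refers to "known formulas"; the gauge factor default and the strain values are illustrative.

```python
def output_signal(eps_tension, eps_compression, gauge_factor=2.0):
    """Linearized full-bridge output in V/V (assumed formula, two gauges of each sign)."""
    return gauge_factor / 4.0 * (2.0 * eps_tension - 2.0 * eps_compression)


def nonlinearity(eps_tension, eps_compression):
    """Goal function NL from Eq. (3): |1 - |eps2| / |eps1||."""
    return abs(1.0 - abs(eps_compression) / abs(eps_tension))


def pareto_front(candidates):
    """Return indices of non-dominated candidates (maximize OS, minimize NL)."""
    scores = [(output_signal(t, c), nonlinearity(t, c)) for t, c in candidates]
    front = []
    for i, (os_i, nl_i) in enumerate(scores):
        dominated = any(
            os_j >= os_i and nl_j <= nl_i and (os_j > os_i or nl_j < nl_i)
            for j, (os_j, nl_j) in enumerate(scores)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical placements: (tension strain, compression strain) in m/m.
placements = [(1000e-6, -1000e-6), (1200e-6, -900e-6), (800e-6, -600e-6)]
front = pareto_front(placements)
```

In this toy set the third placement is dominated by the first one (lower signal and higher NL), so only the first two remain on the front. A brute-force filter is quadratic in the number of candidates, which is acceptable for a small illustrative set.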
4 Case Study
For the case study, we considered a membrane force sensor measuring compressive force up to 100 kN, Fig. 1. The elastic element material is 630 stainless steel (17–4 PH). The material properties are as follows: elastic modulus E = 197 GPa, Poisson’s ratio μ = 0.3, hardness = 42.5 HRC, σy = 1200 MPa.
Fig. 1. Drawing of the force sensor elastic element, dimensions are given in mm.
Four strain gauges are placed on the circular membrane highlighted in red in Fig. 1. The gauges are placed symmetrically with respect to the elastic element axis of symmetry, two on the left-hand side and two on the right-hand side. As the strain distribution has axial symmetry, it is sufficient to calculate the strain values only for the gauges on one side, for example, the right-hand side. Two gauges must experience tensile strain, and the other two must experience compressive strain. In this way, the Wheatstone bridge circuit achieves the maximum output signal. The strain gauges are generic constantan strain gauges for transducer manufacturing with a gauge factor of two. The dimensions of a strain gauge are given in the inset of Fig. 1. The output signal OS of the Wheatstone bridge is calculated using known formulas. The nonlinearity NL of the sensor is evaluated indirectly through the difference between the magnitudes of the compressive and tensile strain. The golden rule of strain gauge force sensor design states that the magnitude of the strain measured by the strain gauges must be around 1000–1200 ppm, and the magnitudes of the compression and tension strain must be as equal as possible. So, the goal function NL is calculated as follows:
$$NL = \left| 1 - \frac{|\varepsilon_2|}{|\varepsilon_1|} \right| \qquad (4)$$
where ε2 and ε1 are the strains measured by the compression and tension strain gauges, respectively. Table 1 gives information about the design variables.
Table 1. Design variables
Design variable name    Lower limit, mm    Upper limit, mm
Rin                     45                 55
Dc                      35                 45
Dh                      35                 45
The first two goal functions are the same as in the strain gauge placement problem considered in Subsect. 3.2. The other goal functions are as follows:
• Minimize the maximum absolute radial displacement of the force sensor support surface XRmax, Fig. 3. As mentioned above, a large radial displacement contributes to high dry friction hysteresis;
• Minimize the maximum Von Mises stress in the elastic element σmax.
It should be reiterated that there can be many strain gauge placements for one design variant of the elastic element. So, one combination of design variables corresponds to several sensor configurations that differ only in the placement of strain gauges, i.e., the values of XRmax and σmax are functions of the elastic element geometry only. As the elastic element is a body of revolution, an axisymmetric finite element model can be used to save computational time. An example of the finite element mesh is shown in Fig. 2. The boundary conditions of the static structural simulation are as follows: vertical displacements are constrained on the support, and a distributed load is applied to the top of the sensor, see Fig. 2. In the optimal placement problem, we constrain the gauges to point only in the radial direction. So, the surrogate model for predicting the strain distribution must predict the radial strain on the surface colored in red in Fig. 1. The coordinate values are the radii of points on this surface (a line in the axisymmetric model).
Fig. 2. Example of the finite element mesh and boundary conditions.
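The way the design of experiment and the radial coordinate combine into inputs for the strain-distribution surrogate can be sketched as follows. This is only an illustration under stated assumptions: it uses SciPy's `scipy.stats.qmc.Sobol` sampler rather than the DOE module of [14], draws 128 points because Sobol generators favor power-of-two sample counts, and uses a hypothetical grid of 50 normalized radial positions per design. Only the bounds come from Table 1.

```python
import numpy as np
from scipy.stats import qmc

# Bounds from Table 1 in mm, ordered as (Rin, Dc, Dh).
lower = [45.0, 35.0, 35.0]
upper = [55.0, 45.0, 45.0]

# Unscrambled Sobol sequence: 2**7 = 128 points in the unit cube (count is an assumption).
sampler = qmc.Sobol(d=3, scramble=False)
unit_points = sampler.random_base2(m=7)
designs = qmc.scale(unit_points, lower, upper)

# Hypothetical radial grid on the membrane surface, normalized to [0, 1].
n_r = 50
r = np.linspace(0.0, 1.0, n_r)

# One input row per (design, radius) pair: columns are Rin, Dc, Dh, r.
X = np.column_stack([np.repeat(designs, n_r, axis=0), np.tile(r, len(designs))])
```

Each design contributes one row per surface point, so the training matrix for the strain surrogate is much larger than the number of FE runs, while the scalar surrogates see only one row per design.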
For brevity, we shall refer to the surrogate model for predicting the strain distribution as Surrogate Model A, and to the surrogate models for predicting the maximum Von Mises stress and the maximum radial displacement of the force sensor support as Surrogate Models B and C, respectively. For the placement problem in this study, we consider only the radial orientation of strain gauges on the elastic element, i.e., the strain gauges are positioned to measure radial strain. So, to solve the optimal placement problem, we only need the radial strain distribution. The boundaries of the Parameter Space are specified in Table 1. The training and validation sets are constructed from 100 evaluations of different designs in the Parameter Space: results from 80 samples go to the training set, and the rest go to the validation set. The parameters of the surrogate models were tuned manually to obtain good accuracy on both the training and the validation sets (Table 2).
Table 2. Parameters of the surrogate models. RELU – Rectified Linear Unit, ELU – Exponential Linear Unit.
Parameter name                             Model A    Model B    Model C
Number of hidden layers                    10         3          2
Activation function                        RELU       RELU       ELU
Number of neurons in the input layer       200        50         10
Number of neurons in the hidden layers     200        200        100
Optimizer type                             ADAM       ADAM       ADAM
Number of training epochs                  1000       500        500
Batch size                                 10000      10         10
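A network with the Model B settings of Table 2 might be assembled in Keras roughly as shown below. This is a sketch, not the authors' training module: the "input layer" with 50 neurons is interpreted as a first dense layer, the loss function is an assumption (the paper states only that MAE was used as an accuracy indicator), and the data is a synthetic stand-in for FEA results. The demonstration trains for 5 epochs instead of the 500 listed in Table 2 to keep it quick.

```python
import numpy as np
from tensorflow import keras


def build_model_b(n_inputs=3):
    """Model B layout from Table 2: 50-neuron first layer, three hidden layers of 200, ReLU."""
    layers = [
        keras.Input(shape=(n_inputs,)),
        keras.layers.Dense(50, activation="relu"),
    ]
    layers += [keras.layers.Dense(200, activation="relu") for _ in range(3)]
    layers.append(keras.layers.Dense(1))  # one scalar goal function
    model = keras.Sequential(layers)
    # The loss is an assumption; Table 2 specifies only the optimizer (Adam).
    model.compile(optimizer="adam", loss="mse", metrics=["mae"])
    return model


# Synthetic stand-in for FEA results: 100 designs scaled to [0, 1], split 80/20.
rng = np.random.default_rng(0)
x = rng.random((100, 3)).astype("float32")
y = (x[:, 0] - 0.5 * x[:, 1] * x[:, 2]).astype("float32")
x_train, y_train, x_val, y_val = x[:80], y[:80], x[80:], y[80:]

model = build_model_b()
history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=5,        # Table 2 lists 500 for Model B
    batch_size=10,   # as in Table 2
    verbose=0,
)
pred = model.predict(x_val, verbose=0)
```

Passing `validation_data` to `fit` records the validation loss after every epoch in `history.history`, which is one way to watch for the overfitting that the validation set is meant to expose.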
We also considered an alternative surrogate model architecture, the quadratic Response Surface Methodology (RSM). Unlike neural networks, which take considerable time to train, RSM models take very little time to train. The drawback of this approach is that it is a parametric regression and may not be able to reconstruct nonlinear distributions of radial strain on the elastic elements’ surfaces. On the other hand, it can be a good choice for predicting the scalar goal functions XRmax and σmax. The size of the training and validation sets for the RSM models is the same as for the NN-based models. The details of constructing the RSM models are given in [11]. The trained surrogate models were integrated into the software for the multicriteria design of strain gauge force sensors to explore the design space according to the PSI method [14] and obtain a set of Pareto-feasible solutions based on 100 evaluations of the surrogate models. As with the training and validation sets, the exploration is based on the Sobol sampling strategy. The accuracy of the constructed surrogate models of both types was evaluated as follows:
First, the prediction accuracy is evaluated for individual samples in the validation sets. The indicator is the Mean Absolute Error (MAE) [18].
Second, the surrogate models’ accuracy is evaluated based on the closeness between a reference set of Pareto-optimal solutions, determined from the solutions that formed the training and validation sets, and the set of Pareto-optimal solutions obtained using the NN- and RSM-based surrogate models. The indicator is the Generational Distance (GD), which should be minimized [20].
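The RSM alternative and the MAE indicator can be sketched with NumPy alone. The example assumes a full quadratic polynomial (constant, linear, and all second-order terms) fitted by ordinary least squares; the construction actually used in the paper follows [11] and may differ. The response is a synthetic quadratic function standing in for FEA output, and the helper names are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def quadratic_features(x):
    """Columns: 1, x_i, and all products x_i * x_j with i <= j."""
    n = x.shape[1]
    cols = [np.ones(len(x))] + [x[:, i] for i in range(n)]
    cols += [x[:, i] * x[:, j] for i, j in combinations_with_replacement(range(n), 2)]
    return np.column_stack(cols)


def fit_rsm(x, y):
    """Least-squares fit of the quadratic response surface coefficients."""
    coef, *_ = np.linalg.lstsq(quadratic_features(x), y, rcond=None)
    return coef


def predict_rsm(coef, x):
    return quadratic_features(x) @ coef


def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))


# Synthetic data: 100 designs in the unit cube, 80 for training, 20 for validation.
rng = np.random.default_rng(0)
x = rng.random((100, 3))
y = 2.0 + x[:, 0] - 0.5 * x[:, 1] + 0.3 * x[:, 0] * x[:, 2] + 0.1 * x[:, 1] ** 2
coef = fit_rsm(x[:80], y[:80])
mae = mean_absolute_error(y[80:], predict_rsm(coef, x[80:]))
```

Because the synthetic response is itself quadratic, the fit is essentially exact here. A strongly nonlinear strain profile would leave a residual that no choice of coefficients removes, which is the limitation of parametric regression noted above.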
5 Results and Discussion
Table 3 shows the values of MAE for the validation sets.
Table 3. Value of MAE for the RSM- and NN-based surrogate models. MAE values have the same unit as the corresponding values (μm/m for strain, μm for XRmax, and MPa for σmax)
Surrogate model architecture      Model A    Model B    Model C
Neural network                    36.4       1.491      0.051
Response surface methodology      64.4       1.023      0.029
Table 4 shows the time required to train models of both types. The training time was measured on an HP laptop with a 3 GHz CPU and 32 GB of RAM.
Table 4. Training time in seconds
Surrogate model architecture      Model A    Model B    Model C
Neural network                    262        13.2       7.61
Response surface methodology      2.3        2.2        2.3
Tables 3 and 4 show that RSM outperforms NN in terms of training time, especially for Surrogate Model A. Both RSM and NN showed low prediction errors for Models B and C. However, the RSM-based surrogate model exhibited a substantial error in predicting strain distributions. In the study, radial strain varies from −1000 μm/m to 1000 μm/m, so an error of 64.4 μm/m (about 3.2% of the 2000 μm/m range) is quite significant. In terms of the quality of the obtained front of Pareto-optimal solutions, the values of GD for the RSM- and NN-based models are 0.091 and 0.057, respectively. This suggests that the inaccurate prediction of the strain distribution affected the value of this performance indicator. The values of the goal functions of the obtained Pareto-optimal solutions are plotted in Fig. 3. The effect of the inaccurate strain prediction by the RSM-based model is especially visible in the subplots where the OS and NL values are shown. The information about the Pareto-optimal solutions is then submitted to the Decision Maker (DM), who may filter the solutions according to the constraints imposed on the goal functions and choose the best solutions from the DM’s perspective.
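For reference, a Generational Distance value of this kind can be computed as sketched below. Several variants of GD exist, and the paper does not state which one was used; this sketch takes the mean Euclidean distance from each point of the approximate front to its nearest point on the reference front, and it assumes the objectives have already been normalized to comparable scales. The two fronts are made-up two-objective examples.

```python
import math


def generational_distance(approx_front, reference_front):
    """Mean Euclidean distance from each approximate point to its nearest reference point."""
    nearest = [
        min(math.dist(p, q) for q in reference_front)
        for p in approx_front
    ]
    return sum(nearest) / len(nearest)


# Hypothetical normalized fronts with two objectives each.
reference = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
approx = [(0.0, 1.0), (0.5, 0.6), (1.0, 0.2)]
gd = generational_distance(approx, reference)
```

A value of zero means every surrogate-based solution coincides with a reference solution, so the smaller GD reported for the NN-based models indicates a front closer to the simulation-driven one.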
Fig. 3. Distribution of Pareto-optimal solutions obtained using purely simulation-driven design (red circles), RSM-based surrogate models (green circles), NN-based surrogate models (red circles).
So, based on the case study results, neither RSM nor NN offers the best results in terms of both accuracy and performance. NN is accurate but somewhat slow to train, while RSM is fast but markedly less accurate in predicting the strain distribution, which determines the key characteristics of the sensor: the magnitude of the output signal and the nonlinearity. Thus, it makes sense to use a combination of surrogate modeling techniques, for example, neural networks for predicting the strain distributions and RSM for predicting the other goal functions. Future research should focus on developing a tool for selecting the best surrogate model for a given case, similar to the one presented in [9].
6 Conclusion
This paper proposed a method of multicriteria optimization of strain gauge force sensors based on surrogate models. The surrogate models’ architecture enabled the reconstruction of the strain distribution across the surfaces of the sensor elastic element where the strain gauges are mounted, which makes it possible to determine the optimal placement of strain gauges on the elastic element. The developed method was implemented as a Python module for the system for the multicriteria design of strain gauge force sensors. The proposed method’s efficacy was tested in a case study of using surrogate models to design a membrane force sensor. The performance of the neural network-based surrogate models was compared to that of the surrogate models based on the Response Surface Methodology. Both surrogate model architectures showed advantages and disadvantages. Future research should focus on developing a method of selecting the best surrogate modeling method for a specific problem in terms of performance and accuracy. Overall, the proposed method can substantially reduce the design time for force sensors or make the design exploration more in-depth due to the surrogate models’ low computational complexity.
References
1. Tavakolpour-Saleh, A.R., Setoodeh, A.R., Gholamzadeh, M.: A novel multi-component strain-gauge external balance for wind tunnel tests: simulation and experiment. Sensors Actuat. A Phys. 247, 172–186 (2016). https://doi.org/10.1016/j.sna.2016.05.035
2. Sun, Y., Liu, Y., Zou, T., Jin, M., Liu, H.: Design and optimization of a novel six-axis force/torque sensor for space robot. Measure. J. Int. Measure. Confeder. 65, 135–148 (2015)
3. Saxena, P., Pahuja, R., Khurana, M.S., Satija, S.: Real-time fuel quality monitoring system for smart vehicles. Int. J. Intell. Syst. Appl. 8, 19–26 (2016)
4. Gavrilenkov, S.I., Gavryushin, S.S.: Development and performance evaluation of a software system for multi-objective design of strain gauge force sensors. In: Hu, Z., Petoukhov, S., He, M. (eds.) CSDEIS 2019. AISC, vol. 1127, pp. 228–237. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39216-1_21
5. Gavryushin, S.S., Skvortsov, P.A.: Evaluation of output signal nonlinearity for semiconductor strain gage with ANSYS software. In: Solid State Phenomena, pp. 60–70 (2017)
6. Gavryushin, S.S., Skvortsov, P.A., Skvortsov, A.A.: Optimization of semiconductor pressure transducer with sensitive element based on “silicon on sapphire” structure. Periodico Tche Quimica. 15, 679–687 (2018)
7. Andreev, K.A., Vlasov, A.I., Shakhnov, V.A.: Silicon pressure transmitters with overload protection. Autom. Remote. Control. 77(7), 1281–1285 (2016). https://doi.org/10.1134/S0005117916070146
8. Temirchev, P., Simonov, M., Kostoev, R., Burnaev, E., Oseledets, I., Akhmetov, A., Margarit, A., Sitnikov, A., Koroteev, D.: Deep neural networks predicting oil movement in a development unit. J. Pet. Sci. Eng. 184, 106513 (2020)
9. Belyaev, M., et al.: Gtapprox: surrogate modeling for industrial design. Adv. Eng. Softw. 102, 29–39 (2016)
10. Viúdez-Moreiras, D., Martin, M., Abarca, R., Andrés, E., Ponsin, J., Monge, F.: Surrogate modeling for the main landing gear doors of an airbus passenger aircraft. Aerosp. Sci. Technol. 68, 135–148 (2017)
11. Hoole, J., Sartor, P., Booker, J.D., Cooper, J.E., Gogouvitis, X., Schmidt, R.K.: Comparison of surrogate modeling methods for finite element analysis of landing gear loads. In: AIAA Scitech 2020 Forum, p. 681 (2020)
12. Moreno-Salinas, D., Moreno, R., Pereira, A., Aranda, J., Jesus, M.: Modelling of a surface marine vehicle with kernel ridge regression confidence machine. Appl. Soft Comput. 76, 237–250 (2019)
13. McBride, K., Sundmacher, K.: Overview of surrogate modeling in chemical process engineering. Chemie Ing. Tech. 91, 228–239 (2019)
14. Gavrilenkov, S.I., Gavriushin, S.S., Godzikovsky, V.A.: Multicriteria approach to design of strain gauge force transducers. J. Phys. Conf. Ser. 1379(1), 1–6 (2019)
15. Emuoyibofarhe, J.O., Ajisafe, D., Babatunde, R.S., Christoph, M.: Early skin cancer detection using deep convolutional neural networks on mobile smartphone. Int. J. Inf. Eng. Electron. Bus. 12, 1–8 (2020)
16. Pawadea, D., Dalvia, A., Siddavatamb, I., Carvalhoc, M., Kotianc, P., Georgec, H.: Cuisine detection using the convolutional neural network. Int. J. Educ. Manage. Eng. 10, 1 (2020)
17. Elzayady, H., Badran, K.M., Salama, G.I.: Arabic opinion mining using combined CNN LSTM models. Int. J. Intell. Syst. Appl. 12(4), 25–36 (2020). https://doi.org/10.5815/ijisa.2020.04.03
18. Keras API reference. https://keras.io/api/. Accessed 13 Sep 2020
19. Gavrilenkov, S.I., Gavriushin, S.S.: Method of determining the optimal placement of strain gauges on the elastic element of a force sensor. In: Proceedings of the XXIV International Symposium “Dynamic and Technological Problems of Structures and Solid Media”, pp. 74–75 (2018). (in Russian)
20. Chand, S., Wagner, M.: Evolutionary many-objective optimization: a quick-start guide. Surv. Oper. Res. Manage. Sci. 20, 35–42 (2015)
Predicting University Development Based on Hybrid Cognitive Maps in Combination with Dendritic Networks of Neurons
M. E. Mazurov and A. A. Mikryukov
Plekhanov Russian University of Economics, Moscow, Russia
Abstract. The aim of the work is to substantiate and predict measures to ensure the increase in the values of the target indicators of the university’s activity in the international institutional rating QS to the values necessary to increase the rating in subsequent years. Mathematical modeling of cognitive maps is used in conjunction with selective dendritic networks of neurons. The possibility of increasing the efficiency of using cognitive maps for predicting the development of a university using selective dendritic networks of neurons is shown. The proposed structure of hybrid cognitive maps allows you to naturally simplify the structure by removing non-working network connections of the global structure. The universal matrix description of the mathematical model of the network allows you to effectively form a computational algorithm and software for predicting the indicators of the development of the university. The results obtained allowed us to form a scenario plan for the necessary step-by-step increase in the values of target indicators, taking into account the hidden factors affecting them. The possibility of forming a cognitive map based on a multilayer dendritic network of direct propagation in the presence of unidirectional content links is shown.
Keywords: Cognitive model · Scenario forecasting · Targets · Institutional ranking · Dendrites · Selective dendritic networks of neurons
1 Introduction
The aim of the work is to substantiate and predict activities that ensure the increments of the values of the target indicators of the university’s activities in the international institutional rating QS to the values required to increase the rating in subsequent years. The relevance of the problem is due to the need to develop scientifically grounded proposals to achieve the required values of the target indicators of the Plekhanov Russian University of Economics by 2025 by calculating the necessary increments of latent factors (particular indicators), taking into account their correlations with the target indicators. In turn, the main indicator, called the functional F (i.e., the university rating R), is calculated as the sum of the products of the values of the target indicators and their weight coefficients [1, 2]. The analysis of the problem showed that it belongs to the class of semi-structured tasks, which are solved under the conditions of a limited amount of initial data and several uncertainties.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 138–151, 2021. https://doi.org/10.1007/978-3-030-80478-7_15
2 Literature Review
To solve the problem, the method of cognitive maps was used. Cognitive maps are a kind of mathematical model describing problem situations or complex semi-structured systems [3–21]. The term “cognitive maps” (Cognitive Maps) was first proposed by E. Tolman in [22]. R. Axelrod proposed to consider a cognitive map as a directed graph whose arcs are assigned a plus or minus sign. In [23], he applied this model to construct a theory of decision-making in politics and economics. Thus, classical signed cognitive maps are specified in the form of a directed graph and represent the modeled system as a set of vertices (concepts) and arcs weighted by two-level values. The basic elements of such a map are links that describe the influence of one concept (the initial vertex in graph theory) on another concept (the final vertex in graph theory). The directionality of this connection w means that the source concept influences the receiver concept, i.e., a change in the values (states) of the source concept leads to a change in the values (states) of the receiver concept. The transfer of influence is considered qualitatively: with a positive connection, an increase in the source concept increases the receiver concept, and a decrease in the source concept decreases it. If the connection is negative, an increase in the source concept causes a decrease in the receiver concept (and vice versa). Such a cognitive map can be used for a qualitative assessment of the impact of individual concepts on the stability of the system. By identifying the contours (cycles) formed in the map, analyzing the resulting sign of each contour, and using the theory of feedback, it is possible to assess the stability of the modeled system. This analysis is based, in fact, on the methodology for analyzing conventional linear systems by comparing the various contours formed from concepts.
The possibilities of such analysis are limited: it does not reveal the features of the mutual influence of concepts and does not allow ranking them according to the degree of their influence on each other. This becomes especially noticeable when solving multicriteria optimization problems with given quantitative criteria. In 1986, in [24], B. Kosko proposed a new type of cognitive maps called Fuzzy Cognitive Maps. Concepts in a fuzzy cognitive map (FCM) can take values from the range of real numbers [0, 1]. The term “fuzzy” means only that causal links can take not only a value equal to 0 or 1 but any value in a range of real numbers, reflecting the “strength” of the influence of one concept on another. The approach based on the theory of fuzzy sets by L. Zadeh is not used in B. Kosko’s model, at least in the computational aspect. The structure of the influence of several input concepts on an output concept in maps of this type corresponds to the structure of a single-layer perceptron described in the theory of neural networks. The paper proposes a method for accumulating individual influences that resembles the weighted summation of the input vector components by an artificial neuron, followed by a nonlinear transformation of the result of this summation. The individual influences of the input concepts are summed, and a special nonlinear function is applied to prevent the output concept from going out of range:
$$K_j = f\left( \sum_{i=1}^{n} w_{ij} K_i \right) \qquad (1)$$
where $w_{ij}$ is the weight of the influence of concept i on concept j; n is the number of concepts that directly affect concept j [8]; $K_i$ and $K_j$ are the values of the input and output concepts, respectively. The sigmoid is used as the activation function:
$$f(x) = \frac{1}{1 + e^{-Ax}} \qquad (2)$$
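Equations (1) and (2) amount to one synchronous update of all concept values, which can be sketched in a few lines. The weight matrix, the initial concept values, and the steepness A = 1 below are hypothetical and serve only to show the mechanics; the function names are not taken from any library.

```python
import math


def sigmoid(x, steepness=1.0):
    """Activation function of Eq. (2) with steepness parameter A."""
    return 1.0 / (1.0 + math.exp(-steepness * x))


def fcm_step(concepts, weights, steepness=1.0):
    """One update of Eq. (1): K_j = f(sum_i w_ij * K_i) for every concept j."""
    n = len(concepts)
    return [
        sigmoid(sum(weights[i][j] * concepts[i] for i in range(n)), steepness)
        for j in range(n)
    ]


# Hypothetical map with three concepts; weights[i][j] is the influence of i on j.
weights = [
    [0.0, 0.7, 0.0],
    [0.0, 0.0, -0.4],
    [0.5, 0.0, 0.0],
]
state = [0.8, 0.3, 0.6]
for _ in range(10):  # iterate the map for a fixed number of steps
    state = fcm_step(state, weights)
```

The sigmoid keeps every concept value strictly between 0 and 1, so repeated application of the update never leaves the admissible range.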
Even though B. Kosko’s fuzzy cognitive maps resemble an artificial neural network in the computational aspect, there are differences between the two models. Fuzzy cognitive maps can be purely expert in nature (although they can be trained) and correspond to a “white box” model, while an artificial neural network is fundamentally focused on learning (a “black box” model). Let us consider the formation of hybrid cognitive maps for predicting the development of the university based on selective dendritic networks of neurons. This approach makes it possible to increase the accuracy of prediction by using the nodes of a fuzzy cognitive map as input data of the neural network. The data sample on which the forecast is based has a great influence on the forecast accuracy. The approach has two main stages. At the first stage, a fuzzy cognitive map model is developed based on historical time series data using a genetic learning algorithm. The first stage consists of the following steps.
1. Initialization of a fuzzy cognitive map from historical time series data.
2. Construction of an optimized fuzzy cognitive map (choosing the most significant concepts and their connections) using a genetic algorithm.
3. Testing the fuzzy cognitive map on normalized test data.
The concepts of the fuzzy cognitive map are then used to define the inputs of a neuro-fuzzy network [18] to improve prediction accuracy. The second stage consists of the following steps.
1. Improving forecasting accuracy using the selected input data, i.e., the concepts of the developed cognitive map.
2. Training the neural network.
3. Testing the resulting neural network on test data.
A similar example, shown in Fig. 2, represents the process of predicting such an indicator as the quality of life of the population. The use of a cognitive map in this situation is appropriate for the following reason: to build a good forecast of this indicator, as of many others, it is necessary to single out the factors that influence the indicator most. A cognitive map helps to solve this problem and thereby to feed the prepared data of the cognitive map concepts to the inputs of the neural network. The results of this work show the effectiveness of creating hybrid systems, both for the problem of decision making and for the problem of forecasting time series [25, 26]. An integrated hybrid model of decision support and forecasting based on fuzzy cognitive maps is presented. In the work of A. N. Averkin, a hybrid cognitive map is considered, consisting of a combination of a fuzzy cognitive map and a neuro-fuzzy network.
3 Main Part. Using Dendritic Networks of Neurons to Form Cognitive Maps
Cognitive maps are a type of mathematical model that describes problem situations or complex semi-structured systems. It is possible to use deterministic and fuzzy cognitive maps. Fuzzy cognitive maps can be purely expert in nature (although they can be trained) and correspond to a “white box” model, while an artificial neural network is fundamentally focused on learning (a “black box” model). Further studies have shown the feasibility of using neural networks of the following types: multilayer feedforward perceptron networks, Kosko’s neural networks, and Hopfield’s neural networks [27–40]. These neural networks are illustrated in Fig. 1.
Fig. 1. Neural networks: on the left - a multilayer feedforward perceptron network, in the center - Kosko’s neural network, on the right - Hopfield’s neural network [19, 27, 30–40]
It should be noted that the use of neural networks in combination with cognitive maps is redundant in the sense that the neural network is used as a system of adders, and the nonlinear elements of the neurons are not used as part of the neural network. In this regard, it is advisable to use networks composed of neuron dendrites. Dendritic networks, as recent studies have shown, can be very complex and perform specific cognitive functions. In this case, dendritic networks can have a complex hierarchical structure, each level of which can perform certain cognitive functions. For a better representation of dendritic networks in neurons, we present illustrations of some dendritic networks of neurons. These illustrations are shown in Fig. 2 [41–47]. It follows from the above description of the hierarchical structure of some dendritic networks that these networks have extensive capabilities for processing input information and for controlling the response of the neuron to input information.
Fig. 2. The structure of neuron dendrites: (a) a single-layer network of dendrites; (b), (c) two-layer networks; (d), (e) three-layer networks; (f), (g) complex multilayer networks of dendrites of brain neurons; (h) dendrites coming from the outside to the dendritic network of a cerebellar neuron for the formation of complex control actions

4 Predicting the Development of the University by Cognitive Maps Based on Selective Dendritic Networks of Neurons

University rankings are assessed using the QS World University Rankings system, which has been published since 2004. The following indicators, called the main factors, are used to calculate the ranking of a university:
1. Academic reputation;
2. Reputation with the employer;
3. The ratio between the number of teachers and the number of students;
4. Indicator of citation of teachers;
5. Number of international teachers;
6. Number of international students.

To assess the ranking of a university, a functional of the form

\[ y = w_1 x_1 + w_2 x_2 + w_3 x_3 + w_4 x_4 + w_5 x_5 + w_6 x_6 \tag{3} \]

is used, where the weighting factors $w_1, w_2, w_3, w_4, w_5, w_6$ are set, according to the QS recommendations, equal to 0.4, 0.1, 0.2, 0.2, 0.05 and 0.05, respectively. Taking into account the correlation dependences between the functional and the target indicators obtained by factor analysis in [4], as well as expert estimates of the mutual influence of latent factors and of their influence on target indicators, a cognitive model reflecting the relationship of latent factors, target indicators and the functional was proposed in [2]. It is shown in Fig. 3.
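As a minimal sketch of how the functional (3) is evaluated, the following Python fragment applies the QS weights quoted above to one set of indicator values. The function name `rating_functional` is introduced here only for illustration; the indicator values are the 2019 row (AR, PP, OSB, CP, MP, MS) of Table 1 of this paper.

```python
# QS weights as stated in the text, in the order of the six main factors
QS_WEIGHTS = (0.4, 0.1, 0.2, 0.2, 0.05, 0.05)


def rating_functional(indicators, weights=QS_WEIGHTS):
    """Return y = sum_k w_k * x_k, the weighted functional of Eq. (3)."""
    if len(indicators) != len(weights):
        raise ValueError("expected one indicator value per weight")
    return sum(w * x for w, x in zip(weights, indicators))


# 2019 indicator values (AR, PP, OSB, CP, MP, MS) taken from Table 1
x_2019 = (19.4, 28.8, 96.6, 1.7, 10.3, 14.5)
y_2019 = rating_functional(x_2019)
print(round(y_2019, 2))  # 31.54
```

With these inputs the weighted sum is about 31.54, which is close to the value 31.6 tabulated for the functional F in 2019.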
Fig. 3. Cognitive model of the relationship between latent factors, target indicators and the functional. Designations: F - functional; R - university rating; target indicators: AR - academic reputation; PP - reputation with the employer; OSB - the ratio of the number of students to the number of teachers; CP - indicator of citation of teachers; MP - number of international teachers; MS - number of international students; latent factors: F1 - “Scientific schools and dissertation councils”; F2 - “Joint research projects”; F3 - “Availability of basic departments”; F4 - “Number of publications in the Scopus database”; F5 - “Popular areas of training”; F6 - “The level of qualifications of scientific and pedagogical workers (SPD)”; F7 - “Number of teaching staff”; F8 - “The level of competence of students”; F9 - “CPD with language training”; F10 - “Places in the hostel”; F11 - “Demand for graduates from employers”; F12 - “Areas for educational activities”; F13 - “Level of payment for the teaching staff”; F14 - “Stimulating factors”

5 Cognitive Maps Based on Selective Dendritic Networks of Neurons

Hybrid cognitive maps, including fuzzy neural networks, have been proposed by A. N. Averkin and others [25, 26]. In this paper, a hybrid intelligent system for predicting the development of the university is proposed, in which cognitive maps are combined with selective dendritic networks of a neuron; it is shown in Fig. 4. The hybrid system does not take into account the effects of feedback from the main factors. Usually this influence is not significant and, as a first approximation, it can be ignored. If the influence of feedback is noticeable, it can be taken into account by introducing additional connections, as is done for Hopfield neural networks. The proposed hybrid intelligent system is topologically equivalent to the hybrid map combining a cognitive map with a neural network proposed by A. N. Averkin in [25, 26].

The structure of a hybrid cognitive map of a general view, taking into account the latent factors of the first order that affect the formation of the main factors, is shown in Fig. 5. The proposed cognitive map likewise does not take into account the influence of feedback from the main factors. Usually this influence is not significant and, as a first approximation, it can be ignored. If the influence of feedback is noticeable, it can be taken into account by introducing additional connections, as is done for Kosko’s neural networks.
Fig. 4. Hybrid intelligent system for predicting the development of the university with cognitive maps in combination with selective dendritic networks of a neuron
Fig. 5. The structure of a hybrid cognitive map of a general view, taking into account first-order factors in combination with dendritic networks of neurons. The designations of hidden factors are the same as in Fig. 4.
Some of the latent factors taken into account have a noticeable effect on only some of the main factors and do not affect the others. This selective influence of hidden factors can be taken into account by selective clustering of connections in the cognitive map, that is, by removing insignificant connections. As a result, the whole cognitive map becomes simpler and more descriptive. Such a cognitive map with non-essential connections removed is shown in Fig. 6.

Fig. 6. The structure of the hybrid cognitive map with irrelevant connections removed. The figure uses the same designations for the factors as in Fig. 4

Consider a mathematical method for describing the processes occurring in the system of a modeled hybrid cognitive map.
6 Mathematical Description of a Cognitive Map Based on a Neuron Dendritic Network

The values of the vectors of the main target factors $x_i = (x_{i1}, \dots, x_{in})$, $i = 1, \dots, m$, are fed to the input of the dendritic network. The number $n$ is equal to the number of years for which these factors were considered. The sums are formed based on the matrix equation

\[ S = WX, \tag{4} \]

where $W$ is the matrix of weight coefficients and $X = (x_1, \dots, x_m)$:

\[ S_i = \sum_{j=1}^{n} w_{ij} x_{ij}, \quad i = 1, \dots, m. \tag{5} \]

The formation of the sums $S_i$ can be compared with the addition of signals at the nodes of the dendritic tree of a neuron. In the terminal block, the error functional is formed:

\[ F = \sum_{j=1}^{n} \Bigl( y_j - \Bigl( a_0 + \sum_{i=1}^{m} a_i x_{ij} \Bigr) \Bigr)^2 \tag{6} \]
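A minimal numerical sketch of the sums (5) and the error functional (6) is given below, assuming that the factors are stored as an array of shape (m, n) with one row per factor and one column per year. The function names and the toy numbers are hypothetical and serve only to make the index conventions explicit.

```python
import numpy as np


def dendritic_sums(W, X):
    """S_i = sum_j w_ij * x_ij for every factor i, as in Eq. (5)."""
    W = np.asarray(W, dtype=float)
    X = np.asarray(X, dtype=float)
    return (W * X).sum(axis=1)


def error_functional(a0, a, X, y):
    """F = sum_j (y_j - (a0 + sum_i a_i * x_ij))**2, as in Eq. (6)."""
    X = np.asarray(X, dtype=float)
    prediction = a0 + np.asarray(a, dtype=float) @ X  # one value per year j
    return float(np.sum((np.asarray(y, dtype=float) - prediction) ** 2))


# Toy data (hypothetical): m = 2 factors observed over n = 3 years
X = np.array([[1.0, 2.0, 3.0],
              [0.5, 0.5, 1.0]])
W = np.array([[0.2, 0.2, 0.2],
              [1.0, 1.0, 1.0]])
y = np.array([2.0, 3.0, 5.0])

S = dendritic_sums(W, X)                             # array([1.2, 2.0])
F = error_functional(0.0, np.array([1.0, 2.0]), X, y)  # exact fit, so F = 0
```

Here the parameters a = (1, 2) with a0 = 0 reproduce y exactly, so the functional vanishes; any other parameter choice gives a positive value.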
The error functional is minimized and the values of the parameters $a_i$ ($i = 1, \dots, m$), which characterize the degree of influence of the factors $x_i$ on the value of the error functional, are determined. The error functional can be minimized by any known method, for example the gradient method or the backpropagation method. Analytically, the functional can be minimized by the methods for finding the extremum of a function of several variables. The equations for determining the parameters $a_0, a_1, \dots, a_m$ have the form

\[
\begin{pmatrix}
(1, x_1) & (x_1, x_1) & \dots & (x_1, x_m) \\
(1, x_2) & (x_2, x_1) & \dots & (x_2, x_m) \\
\vdots & \vdots & & \vdots \\
(1, x_m) & (x_m, x_1) & \dots & (x_m, x_m)
\end{pmatrix}
\begin{pmatrix} a_0 \\ a_1 \\ \vdots \\ a_m \end{pmatrix}
=
\begin{pmatrix} (y, x_1) \\ (y, x_2) \\ \vdots \\ (y, x_m) \end{pmatrix}
\tag{7}
\]

where $(u, v)$ denotes the scalar product of the vectors $u$ and $v$ over the $n$ years and $1$ is the vector of ones.

The vectors $X^{(1)} = (x_1^{(1)}, \dots, x_m^{(1)})$ are determined, considering the influence of hidden factors of the first order, by the relation $X^{(1)} = X + WX$, where $W$ is the matrix of weighting coefficients determined by the expert method. This formula represents the increment of the main factors as a linear combination of hidden factors of the first order with weight coefficients determined by expert methods. This is the standard way of using fuzzy set methods. Further, the minimization of the functional (1) and the use of multiple regression methods allow us to determine the change in the rating of the object under study and the range of change of the hidden factors of the first order needed to achieve the set goal. Latent first-order factors may depend on other, more subtle second-order factors. The proposed technique, based on information processing methods in the dendritic networks of a neuron, allows us to study the task of forming a university rating with any degree of detail.
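The analytical minimization of the error functional (6) can be sketched as an ordinary least-squares problem. The fragment below is an illustration under stated assumptions, not the authors' implementation: it builds the full set of normal equations, including the row that follows from the condition on a0 (derivative of F with respect to a0 equal to zero), and solves them with NumPy. The data are synthetic and the name `fit_parameters` is hypothetical.

```python
import numpy as np


def fit_parameters(X, y):
    """Minimise Eq. (6) and return (a0, a) by solving the normal equations.

    X has shape (m, n): m factors observed over n years; y has shape (n,).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[1]
    A = np.vstack([np.ones(n), X]).T   # design matrix with a column of ones
    gram = A.T @ A                     # scalar products (1, x_i), (x_i, x_k)
    rhs = A.T @ y                      # scalar products (y, 1), (y, x_i)
    params = np.linalg.solve(gram, rhs)
    return params[0], params[1:]


# Synthetic, noise-free data generated from known (hypothetical) parameters
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 8))
y = 1.5 + 0.4 * X[0] - 0.7 * X[1]

a0, a = fit_parameters(X, y)
```

Because the synthetic data contain no noise, the recovered values agree with the generating parameters (1.5, 0.4, -0.7) up to rounding. For ill-conditioned data, `numpy.linalg.lstsq` applied to the design matrix is usually a safer choice than forming the Gram matrix explicitly.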
7 Results Interpretation of a Numerical Experiment Based on a Scenario Forecasting Model

The rating indicators for the functional F were calculated for individual universities from the main factors using the multiple regression method in Matlab 7. Let us use the dynamics of the main factors of the MGSU university for the 5 years 2015–2019, shown in Fig. 7. The values of the main factors are given in Table 1.
Fig. 7. Dynamics of the main factors of MGSU University for 5 years 2015–2019
Table 1. Main factors values

№   Year   F      AR     PP     OSB    CP    MP     MS
1   2015   28.0   17.3   27.0   86.9   1.2    7.5    7.0
2   2016   29.0   17.6   26.8   90.1   1.4   10.0    8.3
3   2017   29.1   17.7   21.3   92.7   1.5    9.3    9.8
4   2018   31.9   21.8   31.6   92.8   1.5    8.4   12.6
5   2019   31.6   19.4   28.8   96.6   1.7   10.3   14.5
The table uses the designations of the main factors as in Fig. 4. As a result of calculations that minimize the functional F, the dependence of the rating indicator of the Moscow State Construction University on the main factors was obtained. It is of interest to calculate the rating from the equations of multiple linear regression on the first 3, 4, and 5 main factors for the data in Table 1. Calculations in the Matlab 7 programming system gave the following relations for the functional describing the rating:

\[ R = 0.4767 x_1 + 0.0775 x_2 + 0.2056 x_3, \]
\[ R = 0.4195 x_1 + 0.1030 x_2 + 0.1868 x_3 + 1.4415 x_4, \]
\[ R = 0.4229 x_1 + 0.1023 x_2 + 0.1863 x_3 + 1.4049 x_4 + 0.0066 x_5. \]

The weight coefficients in these relations are close to the weight coefficients of the functional (1). Let us consider the calculation of the rating indicator using the first equation. Substituting the data for 2019 (19.4, 28.8, 96.6, 1.7, 10.3, 14.5), we get the value of the rating indicator 31.341, which is in good agreement with the tabular value of 31.6. Using the second equation, we get the value 31.6, which coincides with the table value. The obtained dependencies allow predicting the value of rating indicators for the next years 2020–2021 based on the values of the main factors.

The calculated data show that the most essential main factors are: AR - academic reputation; PP - reputation with the employer; OSB - the ratio of the number of students to the number of teachers. The highest growth rate will be required for the following latent factors: (1) Joint research projects; (2) The number of publications in the Scopus database; (3) Demand for graduates from employers. Among the latent factors, the highest growth rate is required for the factors: (1) The number of teaching staff; (2) Areas for educational activities. From the calculations performed, the following conclusion can be drawn: for a guaranteed place in the QS rating, a stepwise (with an interval of one year) increase in the values of target indicators that influence latent factors is necessary.
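The substitution of the 2019 data into the regression relations of this section can be checked with a few lines of Python. This is only an arithmetic check: the coefficients are copied from the relations printed above, the data from the 2019 row of Table 1, and the names `MODELS` and `predict` are introduced here for illustration.

```python
# Regression coefficients as printed in the text (no intercept term),
# keyed by the number of main factors used
MODELS = {
    3: (0.4767, 0.0775, 0.2056),
    4: (0.4195, 0.1030, 0.1868, 1.4415),
    5: (0.4229, 0.1023, 0.1863, 1.4049, 0.0066),
}

# 2019 values of the main factors (AR, PP, OSB, CP, MP, MS) from Table 1
x_2019 = (19.4, 28.8, 96.6, 1.7, 10.3, 14.5)


def predict(coeffs, x):
    """Evaluate R = sum_k c_k * x_k; zip uses only the first len(coeffs) factors."""
    return sum(c * v for c, v in zip(coeffs, x))


ratings = {k: predict(c, x_2019) for k, c in MODELS.items()}
for k, r in ratings.items():
    print(k, round(r, 3))
# 3 31.341
# 4 31.6
# 5 31.603
```

The first two values reproduce the figures 31.341 and 31.6 quoted in the text; the five-factor relation gives about 31.603 for the same year.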
8 Results Discussion

The analysis of changes in the values of the main factors and of the rating values for the Plekhanov Russian University of Economics shows their chaotic, non-deterministic nature. In this regard, the QS rating of the Plekhanov Russian University of Economics is inferior to that of other universities, and several special measures are needed to improve it. A hybrid cognitive map that combines a cognitive map with a dendritic network of neurons has been developed. Such a hybrid cognitive map can be useful for predicting poorly defined social systems, for optimizing processes in a system, and for comparing the effectiveness of various social systems. The possibility of forming a cognitive map based on a multilayer feedforward dendritic network in the presence of unidirectional content connections is shown.
9 Conclusion

The proposed structure of hybrid cognitive maps allows a natural simplification of the structure by removing non-working connections of the global network, admits a universal matrix description of the mathematical model of the network, and allows the effective construction of a computational algorithm and software for predicting university development indicators. To achieve the stated goal of the study, the application of methods for solving poorly structured problems was substantiated by developing a forecasting model based on hybrid cognitive maps, which made it possible to choose the most preferable alternative. Under the given constraints, the proposed approach makes it possible to find the most acceptable scenario for planning the increment of the values of the functional and of the target indicators to the required values through the impact on the latent factors, which ensures the guaranteed achievement of the goal. The possibility of forming a cognitive map based on a multilayer feedforward dendritic network in the presence of unidirectional content connections is shown. In the course of the study, a hybrid cognitive model was developed for scenario forecasting of measures to achieve the required values of the target performance indicators of the university in the international institutional ranking QS. The results obtained with the developed model made it possible to formulate a scenario plan for the necessary stepwise increase in the values of target indicators, considering the latent factors affecting them, in the interval 2020–2025.
References

1. Information and analytical system QS - analytics [Electronic resource]. Access mode: https://analytics.qs.com/#/signing. 14 Nov 2019
2. Mikryukov, A.A., Gasparian, M.S., Karpov, D.S.: Development of proposals for promoting the university in the international institutional ranking QS based on statistical analysis methods. Statist. Econ. 17(1), 35–43 (2020)
3. Bolotova, L.S.: Artificial intelligence systems: models and technologies based on knowledge: textbook. FGBOU RGUITP; FGAU GNII ITT “Informatics”. Finance and Statistics, Moscow (2012)
4. Goridko, N.P.: Modern economic growth: theory and regression analysis. In: Goridko, N.P., Nizhegorodtsev, R.M. (eds.) Monograph. Infra M (2017)
5. Draper, N.: Applied regression analysis. John Wiley & Sons, Hoboken (2019)
6. Gorelova, G.V., Zakharova, E.N., Rodchenko, S.A.: Study of Semi-Structured Problems of Socio-Economic Systems: A Cognitive Approach. Publishing house of the Russian State University, Rostov (2006)
7. Oskin, A.F., Oskin, D.A.: The use of fuzzy cognitive maps for modeling poorly structured systems. Bull. Polotsk State Univ. Ser. C, 15–20 (2017)
8. Trakhtengerts, E.A.: Computer support for the formation of goals and strategies. Sinteg (2005)
9. Kornoushenko, E.K., Maksimov, V.I.: Situation management using the structural properties of the cognitive map. Tr. IPU RAS. T. XI. S. 85–90 (2000)
10. Kuznetsov, O.P.: Intellectualization of control decision support and creation of intelligent systems. Probl. Up. 3(1), 64–72 (2009)
11. Saati, T.: Decision Making. Method of analysis of hierarchies: per. from English M.: Radio Commun. (1993)
12. Kulba, V.V.: Scenario analysis of the dynamics of the behavior of socio-economic systems. In: Kulba, V.V., Kononov, D.A., Kovalesky, S.S. (eds.) IPU RAN (2002)
13. Gorelova, G.V., Melnik, E.V., Korovin, Y.S.: Cognitive analysis, synthesis, forecasting the development of large systems in intelligent RIUS. Artif. Intell. 61–72 (2010)
14. Kulba, V.V., Kononov, D.A., Kosyachenko, S.A., Shubin, A.N.: Methods for the formation of scenarios for the development of socio-economic systems. SINTEG (2004)
15. Cognitive analysis and management of the development of situations: materials of the 1st international conference. IPU (2001)
16. Huff, A.S.: Mapping strategic thought, pp. 11–49. Wiley, Chichester (1990)
17. Lu, Z., Zhou, L.: Advanced fuzzy cognitive maps based on OWA aggregation. Int. J. Comput. Cogn. 5(2), 31–34 (2007)
18. Carvalho, J.P., Tome, J.A.B.: Rule based fuzzy cognitive maps and fuzzy cognitive maps a comparative study. In: Proceedings of the 18th International Conference NAFIPS 1999, pp. 115–119 (1999). https://doi.org/10.1109/NAFIPS.1999.781665
19. Dombi, J., Dombi, J.D.: Cognitive maps based on pliant logic. I. J. Simul. 6(6), 1–5 (2003)
20. Stach, W., Kurgan, L., Pedrycz, W., Reformat, M.: Genetic learning of fuzzy cognitive maps. Fuzzy Sets Syst. 153, 371–401 (2005)
21. Carlsson, C., Fuller, R.: Adaptive fuzzy cognitive maps for hyperknowledge representation in strategy formation process. In: Proceedings of International Panel Conference on Soft and Intelligent Computing, Technical University of Budapest, pp. 43–50 (1996)
22. Tolman, E.: Cognitive maps in rats and men. Psychol. Rev. 55(4), 189–208 (1948). https://doi.org/10.1037/h0061626
23. Axelrod, R.: Structure of Decision: the cognitive maps of political elites, p. 422. Princeton University Press, New York (2016)
24. Kosko, K.B.: Fuzzy cognitive maps. Int. J. Man-Mach. Stud. 24(1), 65–75 (1986)
25. Yarushev, S.A., Averkin, A.N.: Review of studies on time series forecasting based on hybrid methods, neural networks and multiple regression. Softw. Prod. Syst. (English) 1, 75–83 (2016)
26. Efremova, N.A., Averkin, A.N., Yarushev, S.A.: Hybrid fuzzy cognitive maps in decision making and forecasting. Softw. Prod. Syst. Algorithms 4, 1–9 (2017)
27. Khaikin, S.: Neural Networks: A Complete Course. 2nd edn. 1104 p. (2006)
28. Galushkin, A.I.: Neural Networks. Foundations of the Theory. 496 p. (2010)
29. Osovsky, S.: Neural networks for information processing per from Polish. In: Rudinsky, I.D., Finance and Statistics, 344 p. (2002)
30. Kartvelishvili, V.M., Mazurov, M.E., Petrov, L.F.: Applied System-Dynamic Models: Monograph. Plekhanov Russian University of Economics, FGBOU VO, Moscow (2018)
31. Mazurov, M.E.: Identification of Mathematical Models of Nonlinear Dynamical Systems: Monograph. LENAND, 284 p. (2019)
32. Mazurov, M.E.: Intelligent recognition of electrocardiograms using selective neuron networks and deep learning. In: International Conference of Artificial Intelligence, Medical Engineering, Education. Moscow, Russia, pp. 182–198 (2017)
33. Mazurov, M.E.: Nonlinear dynamics, almost periodic summation, self-oscillatory processes, information coding in selective impulse neural networks. Izv. RAS. Ser. Physical. T. 82(11), 1564–1570 (2018)
34. Iyanda, A.R., Ninan, O.D., Ajayi, A.O., Anyabolu, O.G.: Predicting student academic performance in computer science courses: a comparison of neural network models. IJMECS 10(6), 1–9 (2018). https://doi.org/10.5815/ijmecs.2018.06.01
35. Alkhathlan, A.A., Al-Daraiseh, A.A.: An analytical study of the use of social networks for collaborative learning in higher education. IJMECS 9(2), 1–13 (2017). https://doi.org/10.5815/ijmecs.2017.02.01
36. Moshref, M., Al-Sayyad, R.: Developing ontology approach using software tool to improve data visualization (case study: computer network). IJMECS 11(4), 32–39 (2019). https://doi.org/10.5815/ijmecs.2019.04.04
37. Cheah, C.S., Leong, L.-M.: Investigating the redundancy effect in the learning of C++ computer programming using screen casting. Int. J. Mod. Educ. Comput. Sci. 11(6), 19–25 (2019). https://doi.org/10.5815/ijmecs.2019.06.03
38. Suhaimi, N.M., Abdul-Rahman, S., Mutalib, S., Hamid, N.H.A., Hamid, A.: Review on predicting students’ graduation time using machine learning algorithms. Int. J. Mod. Educ. Comput. Sci. 11(7), 1–13 (2019). https://doi.org/10.5815/ijmecs.2019.07.01
39. Alvarez-Dionisi, L.E., Mittra, M., Balza, R.: Teaching artificial intelligence and robotics to undergraduate systems engineering students. Int. J. Mod. Educ. Comput. Sci. 11(7), 54–63 (2019). https://doi.org/10.5815/ijmecs.2019.07.06
40. Adekunle, S.E., Adewale, O.S., Boyinbode Olutayo, K.: Appraisal on perceived multimedia technologies as modern pedagogical tools for strategic improvement on teaching and learning. Int. J. Mod. Educ. Comput. Sci. 11(8), 15–26 (2019). https://doi.org/10.5815/ijmecs.2019.08.02
41. Stuart, G., Spruston, N., Hausser, M.: Dendrites, pp. 139–160. Oxford University Press, Oxford (1999)
42. Segev, I., Rinzel, J., Shepherd, G.: The Theoretical Foundation of Dendritic Function: Selected Papers of Wilfrid Rall with Commentaries, p. 456. MIT Press, Cambridge (1995)
43. Tuckwell, H.C.: Introduction to Theoretical Neurobiology, vol. 1: Linear Cable Theory and Dendritic Structure, p. 304. Cambridge University Press, Cambridge (1988)
44. Jesper Sjöström, P., Rancz, E.A., Roth, A., Häusser, M.: Dendritic excitability and synaptic plasticity. Physiol. Rev. 88(2), 769–840 (2008). https://doi.org/10.1152/physrev.00016.2007
45. Hausser, M.: Diversity and Dynamics of Dendritic Signaling. Science 290, 739–744 (2000). https://doi.org/10.1126/science.290.5492.739
46. London, M., Häusser, M.: Dendritic computation. Ann. Rev. Neurosci. 28(1), 503–532 (2005). https://doi.org/10.1146/annurev.neuro.28.061604.135703
47. Mel, B.: Information processing in dendritic trees. Neural Comput. 6(6), 1031–1085 (1994). https://doi.org/10.1162/neco.1994.6.6.1031
Systems and Algebraic-Biological Approaches to Artificial Intelligence

G. Tolokonnikov1(✉), V. Chernoivanov1, Yu. Tsoi1, and S. Petoukhov2
1 FNAC VIM RAS, Moscow, Russia
2 IMASh RAS, Moscow, Russia
Abstract. The paper discusses and develops new approaches to modeling strong artificial intelligence proposed by the authors: the transition from traditional artificial neural networks to categorical neural networks with more complex topology and hierarchy, functional systems integrated with a universal agent of artificial intelligence, and genome analysis by the methods of algebraic biology and categorical physicochemical systems. Methods of categorical systems theory, which has formalized an essential part of the theory of functional systems, are proposed for describing the indicated categorical neural networks, a G-graph network hierarchy (G is a given graph), a hybrid functional system equipped with a universal AI agent that implements a version of strong AI, and physicochemical molecular systems, including DNA and RNA, as the basis of algebraic biological research.

Keywords: Functional systems · Categorical systems · Universal agent · Strong artificial intelligence · Algebraic biology · Hierarchies of systems
1 Introduction

The work is devoted to approaches to strong artificial intelligence. The study of the brain, neurobiology, and the physiology of higher nervous activity provide material for the simulation of strong artificial intelligence. The limits of the successful development of deep learning artificial neural networks in recent years are becoming visible, and the need for strong artificial intelligence for practical tasks that cannot be solved by traditional neural network methods is increasingly recognized. The next section is devoted to the proposed use of the theory of convolutional polycategories and of categorical systems theory for modeling the elements of strong AI. The third section discusses the use of universal agents that implement the attributes of strong AI in functional systems and biomachsystems. The fourth section is devoted to algebraic biology, which makes it possible to predict the properties of organisms, including in the long term the intellectual capabilities of a person, on the basis of a strictly mathematical algebraic analysis of the genome; algebraic-categorical methods for algebraic biology are proposed, arising from the modeling of biomolecules using categorical systems. In the conclusion, the results of the discussions are summed up and the prospects for the proposed directions are outlined.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 152–161, 2021. https://doi.org/10.1007/978-3-030-80478-7_16
2 Functional Systems and the McCarthy Model

The outstanding achievements of deep learning artificial neural networks in the last decade (see, for example, [1–5]) have given rise to the hope that problems related to strong artificial intelligence (AI) can be solved with the existing neural network technologies. For example, the Pavlov principle put forward in [6] postulates such a possibility, in general, for all tasks of strong AI.
Fig. 1. An example of three neurons n1, n2, n3 (polyarrows) connected to the network shown on the right by the convolution S, the convolutions themselves are indicated on the left.
Physiology, however, has gone far beyond the reflex theory of I.P. Pavlov and its expectations, to which the indicated Pavlov principle appeals. Reflexes turn out to be very particular models of the theory of functional systems of P.K. Anokhin and K.V. Sudakov [7], which covers the ability of foresight and other aspects of natural intelligence that are inaccessible to the reflex theory. Analysis of the methods of artificial neural networks [8] shows that they are just one of the methods for approximating functions of many variables and are based on the works of A.N. Kolmogorov on the representation of continuous functions of many variables using compositions and addition of functions of one variable. A function models a reflex; such modeling is not enough to formalize a functional system.

From a physiological point of view, Pavlov’s principle is not sufficient for the goals it declares, and therefore in [9] the Anokhin-Sudakov categorical principle was put forward for the same purposes, based on more complex neural networks modeled by convolutional polycategories and their higher analogues. Polycategories were introduced in [10] with a very limited connection of polyarrows in the form of a composition (one of the inputs of the first polyarrow with one of the outputs of the other). In convolutional polycategories, introduced by one of the authors (see [11] and references there), composition has been replaced by convolutions that define, in a sense (convolutional duality), the maximum possible types of polyarrow connections. With the help of convolutional polycategories, it was possible to formalize an essential part of the theory of functional systems within the framework of the emerging categorical theory of systems. Some idea of polyarrows and convolutions is given by the diagram in Fig. 1 (for details see [11] and references there).

Artificial neural networks with the various topologies in use turned out to be a very simple special case of convolutional polycategories. Let an artificial neural network (neurons, activation functions, arbitrary network topology) be given; then the following theorem is true.
Theorem. Artificial neural networks are associative compositional convolutional polycategories with corona-type convolutions.

Within the framework of the polycategorical model of artificial neural networks, in particular, a rigorous substantiation was found for the well-known Osovsky formula of the error backpropagation method, which had been derived from intuitive reasoning [12]. The possibility of modeling functional systems describing natural intelligence allows us to hope that categorical systems theory can serve as the basis of a formalism for strong artificial intelligence. First, all known types of connections between neurons are covered, and not only those that are possible on the basis of corona-type convolutions (that is, those used in artificial neural networks). Second, it covers a wide class of higher systemic hierarchies, including those that are clearly implemented in the functional systems of animal and human organisms. Third, the categorical systems approach to molecular biology and biochemistry makes it possible to use categorical algebraic methods that were not previously used in these sciences, expanding the tools of algebraic biology and its approach to predicting the properties of intelligence based on genome analysis.

Let us dwell on the issue of G-hierarchies. A neurograph is defined as a set of higher polyarrows. Consider G-neurographs defined for a fixed directed graph G as follows. Let us choose one of the arrows of the graph G. With this arrow we associate some arbitrary polygraph P, as a set of polyarrows. The beginning of the arrow is the set of polyobjects, the end of the arrow is the set of polyarrows of the polygraph P (in case the end of this arrow coincides with the beginning of the second one considered). The resulting set of polygraphs is called a G-neurograph. There is a theorem on the representation of higher categories by convolutional polycategories, which generalizes to G-neurocategories (neurographs with additional properties): according to the generalization, G-neurocategories are reduced to convolutional polycategories with suitable convolutions. The objects of higher polyarrows are themselves polyarrows. Already for simple functional systems of the body, for example the cardiovascular system (CVS), the use of higher polyarrows is required. In the example of the CVS, the objects are other subsystems, treated as polyarrows, which, in particular, the CVS supplies with blood and nutrients.
3 Application of Universal Agents in Biomachsystems and Functional Systems

Fundamental steps towards the creation and use of strong AI in industry, production, and management were made within the framework of the theory of biomachsystems (see [8, 13] and references therein), with solvers containing a universal artificial intelligence agent operating within the constraints of a given subject area (Post block). Biomachsystems generalize ergatic (man-machine) systems by introducing a “living” subsystem and by emphasizing, in the study and construction of biomachsystems, the connection of subsystems using categorical methods (Fig. 2). For making decisions and processing information, the “machine” subsystem of the biomachsystem includes the specified solver, based on the application of Post calculus and of the theorems on universal calculus [14].
Fig. 2. Diagram of n-component categorical systems, H – “man” block, M – “machine” block, F – “living” block (animal, biomass, plants, etc.), arrow lines are omitted 16.
On the basis of the biomachsystems approach, the concept of a “smart enterprise” has been developed and applied in agricultural production [15]; in contrast to the generally accepted concepts, it is based on real elements of strong AI. The specified universal agent can be integrated with a functional system (Fig. 3). The well-known scheme of the functional system according to P.K. Anokhin [7] consists of an afferent synthesis block (AS), a decision-making block (PR), an action result acceptor block and an action program block. We present a diagram of a functional system integrated with an agent that completely replaces all blocks except afferent synthesis and decision making.

The agent works as follows. A person (or the AS and PR blocks) writes the parameters of the required result into the agent’s memory. The agent then starts the calculus generator, obtains the first calculus, finds on its basis a derivation of the result, and starts the program that outputs to the effectors; the receptors supply information about the parameters of the result obtained. If the parameters coincide with the parameters of the acceptor, the result is achieved and the system can be stopped. If they do not match, the agent starts the calculus generator again. If an algorithm for solving the problem exists, then, according to the theorem on universal calculus, sooner or later the generator will produce the required algorithm and the problem will be solved. The logic, the calculi, and the conclusions are all generated inside the universal agent; nothing is required in advance except a description of the parameters of the desired result. Thus, the existing need generates motivation, which leads to the development of an image of the result, the parameters of which are transferred to the universal agent. If the result is achievable, then it will inevitably be achieved.

Theoretically, the problem has been solved, but the engineering problem of actually programming the agent, taking into account the constraints on the generated calculi, is very difficult.
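The generate-and-test cycle of the agent described above can be illustrated with a toy sketch. It is not the Post-calculus solver of the text: plain enumeration of action sequences stands in for the calculus generator, and the set of elementary actions, the integer state and all names (`ACTIONS`, `run`, `universal_agent`) are hypothetical.

```python
from itertools import count, product

# Hypothetical elementary actions available to the effectors
ACTIONS = {
    "inc": lambda s: s + 1,
    "double": lambda s: s * 2,
    "dec": lambda s: s - 1,
}


def run(program, state):
    """Apply a sequence of action names to the initial state."""
    for name in program:
        state = ACTIONS[name](state)
    return state


def universal_agent(start, target, max_length=None):
    """Enumerate programs by increasing length until the result matches.

    The comparison with `target` plays the role of the action result
    acceptor; with max_length=None the search stops only on success.
    """
    lengths = count(0) if max_length is None else range(max_length + 1)
    for length in lengths:
        for program in product(ACTIONS, repeat=length):
            if run(program, start) == target:
                return program
    return None


program = universal_agent(start=1, target=10)
print(program, run(program, 1))
```

If a suitable program exists, the enumeration reaches it after finitely many steps, which mirrors the "sooner or later" guarantee in the text. The exponential growth of the search space with program length is one way to see why the engineering side of the problem is difficult.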
Fig. 3. Compressed functional system diagram with a universal agent.
4 Genome, Algebraic Biology and Categorical Physicochemical Systems The mechanisms of natural intelligence, the modeling of which can serve to create a strong AI, like other properties of an organism, are associated with the genome. Prediction of the properties of an organism turned out to be possible by strictly mathematical algebraic methods within the framework of algebraic biology [16]. Thus, raising to the third tensor power of the matrix [C A; T G], composed of the names of the bases T, A, C, G gives [16] codes of amino acids (20 out of more than 500 already discovered) and a stop codon with a known degeneracy. Numerous properties of organisms realizing numerical and geometrical patterns (phyllotaxis, meiosis…) were explained within the framework of a strictly mathematical analysis of the genome using the developed theory of u-numbers and other algebraic structures [17]. A significant expansion of the methods of algebraic biology gives a systematic approach to describing the properties of biomolecules, including DNA and RNA, which is as follows. By the use of a categorical systems approach, a functional system leads to a molecular biochemical level. Consider the categorical model of an atom and a molecule [18], which, based on the Gelman-Feynman theorem [19], turns out to be applicable to both the classical and quantum cases. Let us construct a categorical system that simulates the classical mechanical system of charged particles, as well as their quantum system while considering only the static picture of stationary classical particles and stationary quantum states that satisfy the
Systems and Algebraic-Biological Approaches to Artificial Intelligence
stationary Schrödinger equation. The stationary case covers the main sections of the chemistry of atoms and molecules (Fig. 4).
Fig. 4. Tensor family of genetic matrices.
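The third tensor (Kronecker) power of the matrix [C A; T G] can be sketched symbolically as follows. The sketch assumes that the "product" of two base names is string concatenation, which is one natural reading of the construction in [16]; the helper name `kron` is hypothetical.

```python
def kron(a, b):
    """Symbolic Kronecker product: entries are multiplied by concatenation."""
    return [
        [x + y for x in row_a for y in row_b]
        for row_a in a
        for row_b in b
    ]


BASES = [["C", "A"],
         ["T", "G"]]

# Third tensor power: an 8 x 8 matrix whose entries are the 64 triplets.
cube = kron(kron(BASES, BASES), BASES)
triplets = {t for row in cube for t in row}

for row in cube:
    print(" ".join(row))
```

Each of the 64 cells holds a distinct triplet, so the matrix enumerates all codons; the assignment of triplets to amino acids and stop codons is taken from the genetic code and is not computed here.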
To construct a polycategory of chemical systems Chi, we take a positive charge + and a negative charge − of equal magnitude as the dom and codom of the polyarrows. We divide the Euclidean space of classical and quantum mechanics into identical cubes with side a; the Cartesian coordinates of the center of a cube are denoted by (x, y, z). We place a charge + or − in the cube (x, y, z) and introduce the diagrams of the polycategory. A particle with charge Z+ = n+ · (+) (or Z− = n− · (−)), located as a point in the cube (x, y, z), is represented by the diagram in Fig. 5. The magnitudes of the charges and the sizes of the cubes are chosen to meet the required accuracy of the approximation. During convolution, lines of force are built between negative and positive charges; the beginning and the end of each line correspond to charges of the same absolute value. The diagram in Fig. 6 shows two arrow-systems with no interaction between them; after the convolution is applied, the interaction is turned on for pairs of charges and two lines of force appear. The interior of the arrow on the right is located in the convolutional-dual part of the polycategory, where there is a conjugate convolution under whose action the charges can be returned to their original state. Turning on the interaction for all charges at once corresponds to bringing together systems of charges that initially do not interact, for example because they are at an infinite distance from each other. When the systems of charges approach the final configuration, the lines of force take on a definite form, and the connection of their beginnings and ends is quite definite,
Fig. 5. Particle polycategory diagrams.
Fig. 6. Convolution example.
as is the pairing of the charges under convolution and the cubes of space to which they belong. For the convolution of a set of polyarrows with the indicated cubes of space, the lines of force of the electric field created by all charges of the polyarrows are calculated according to the laws of electrostatics. If there are more positive charges, then all negative charges are connected by lines of force to, and convolved with, the corresponding positive charges; if there are more negative charges, then all positive charges are connected with the corresponding negative charges. Building a complete convolution makes it possible to build a system from two or more other systems $A_1, \ldots, A_k$. We apply to each of the systems the corresponding conjugate convolution $S_i^{*}$, the result of which is a non-interacting set of charges in the corresponding cubes of space. To the simple systems obtained in this way we apply the complete convolution $S$, which changes the arrangement of the charges over the cubes and connects the positive and negative charges with the lines of force that it specifies. Thus, the new system $A$ is expressed in the form $A = S(S_1^{*} A_1, \ldots, S_k^{*} A_k)$. Let there be several systems that can interact. We consider them at first to be far apart and non-interacting, so that the distribution of the lines of force in each system is defined. When the interaction is switched on, the distribution of the lines of force becomes equal to the distribution that takes place in the convolution of these systems, that is, in the situation where these systems have become subsystems of a new system obtained from them by the corresponding convolution, which expresses the indicated switching on of the interaction.
The constructed polycategory Chi is apparently of little interest for electrostatics, but it is of considerable interest for quantum chemistry. Quantum-mechanical assemblies of electrically charged particles obeying the stationary Schrödinger equation are modeled by the same convolutional polycategory Chi as in the classical case above. This follows from the well-known Hellmann-Feynman theorem. Let a set of nuclei and electrons be given. We write down the stationary Schrödinger equation for it; its solutions give a wave function from which one can determine the probability densities of the nuclei and electrons in space, that is, clouds of negatively and positively charged particles. In the adiabatic approximation, the nuclei are stationary and the electrons form an electron cloud. Choosing a unit charge much smaller than the electron charge and cubes of atomic size into which the three-dimensional Euclidean space is divided, we obtain the codomains and domains of the polyarrows of the polycategory Chi. Convolutions are now constructed not according to Coulomb’s law but from the solution of the Schrödinger equation, which gives the electron cloud. Among the solutions of the stationary Schrödinger equation for a given set of nuclei and electrons there are various molecules with different chemical bonds, interpreted through the form of the electron cloud. As is known, because of the difficulties in solving the Schrödinger equation in the theory of molecules [19], various approximations are used, mainly the method of molecular orbitals. The approximate picture is also modeled by categorical systems, which are much simpler than the model given above. Thus, if only covalent bonds are taken into account in the arrows, one can restrict oneself to electrons and holes; then the formation of, for example, a hydrogen molecule is modeled by the scheme in Fig. 7.
Fig. 7. Formation diagram of a hydrogen molecule.
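For reference, the Hellmann-Feynman theorem invoked above can be written out as follows; this is the standard textbook statement rather than a formula taken from the paper. For a Hamiltonian $\hat H_\lambda$ depending on a parameter $\lambda$, with normalized eigenstate $\psi_\lambda$ and energy $E_\lambda$, the first line holds. Taking $\lambda$ to be a nuclear coordinate gives the electrostatic form in the second line (atomic units, $\rho$ the electron number density, $Z_A$ and $\mathbf{R}_A$ the nuclear charges and positions), which is why classical lines of force computed from the electron cloud remain meaningful in the quantum case.

```latex
\frac{dE_\lambda}{d\lambda}
  = \left\langle \psi_\lambda \,\middle|\,
      \frac{\partial \hat H_\lambda}{\partial \lambda}
    \,\middle|\, \psi_\lambda \right\rangle ,
\qquad
\mathbf{F}_A = -\nabla_{\mathbf{R}_A} E
  = Z_A \left(
      \int \rho(\mathbf{r})\,
        \frac{\mathbf{r}-\mathbf{R}_A}{|\mathbf{r}-\mathbf{R}_A|^{3}}\, d\mathbf{r}
      \;-\; \sum_{B \neq A} Z_B\,
        \frac{\mathbf{R}_B-\mathbf{R}_A}{|\mathbf{R}_B-\mathbf{R}_A|^{3}}
    \right)
```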
Chemical reactions are modeled by a second-order polycategory: the domains (reagents) and codomains (reaction products) of the higher polyarrows are sets of atoms and molecules, which are themselves systems (polyarrows). In the case of two reagents and one product, a subpolycategory arises that is equivalent to the usual partial associative algebra on the set of names of atoms and molecules. The resulting algebraic categorical structures are directly used to form protein molecules, as well as DNA and RNA molecules, as the main biological objects of algebraic biology.
5 Conclusion
The paper proposes and develops new approaches to modeling strong artificial intelligence, which include the following: the transition from popular traditional
artificial neural networks of deep learning to categorical neural networks with a more complex topology and hierarchy, which simulate various existing and possible connections of neurons in addition to those described by the crown type convolutions of traditional artificial neural networks; functional systems integrated with the universal agent of artificial intelligence; and genome analysis by the methods of algebraic biology and categorical physicochemical systems. Methods of categorical systems theory, which has formalized an essential part of the theory of functional systems, are proposed for describing these categorical neural networks, a G-graph network hierarchy (G is a given graph), a hybrid functional system equipped with a universal AI agent that implements a version of strong AI, and physicochemical molecular systems, including DNA and RNA, as the basis of algebraic biological research. The next step in developing the results obtained is expected to be, first of all, the study of the algebraic structures of convolutional polycategories that model biochemical molecules, including DNA and RNA.
References
1. Dharmajee Rao, D.T.V., Ramana, K.V.: Winograd’s inequality: effectiveness for efficient training of deep neural networks. IJISA 6, 49–58 (2018)
2. Karande, A.M., Kalbande, D.R.: Weight assignment algorithms for designing fully connected neural network. IJISA 6, 68–76 (2018)
3. Hu, Z., Tereykovskiy, I.A., Tereykovska, L.O., Pogorelov, V.V.: Determination of structural parameters of multilayer perceptron designed to estimate parameters of technical systems. IJISA 10, 57–62 (2017)
4. Awadalla, M.H.A.: Spiking neural network and bull genetic algorithm for active vibration control. IJISA 10(2), 17–26 (2018)
5. Abuljadayel, A., Wedyan, F.: An approach for the generation of higher order mutants using genetic algorithms. IJISA 10(1), 34–35 (2018)
6. Dunin-Barkovsky, V.L., Solovieva, K.P.: Pavlov’s principle in the problem of reverse construction of the brain. In: XVIII International Conference Neuroinformatics-2016. Part 1, M., MEPhI, pp. 11–23 (2016)
7. Anokhin, P.K.: Fundamental questions of the general theory of functional systems. In: Principles of the systemic organization of functions, M., Nauka, pp. 5–61 (1973)
8. Tolokonnikov, G.K., Chernoivanov, V.I., Sudakov, S.K., Tsoi, Y.A.: Systems theory for the digital economy. In: Hu, Z., Petoukhov, S., He, M. (eds.) CSDEIS 2019. AISC, vol. 1127, pp. 447–456. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39216-1_41
9. Tolokonnikov, G., Anokhin-Sudakov, K.: Category theory principle, pp. 454–455. XIV International interdisciplinary congress Neuroscience for Medicine and Psychology, Sudak, Crimea, Russia, Moscow, MAKS Press (2018)
10. Szabo, M.E.: Polycategories. Comm. Algebra 3(8), 663–689 (1975)
11. Tolokonnikov, G.K.: Informal categorical theory of systems. Biomachsystems 2(4), 41–144 (2018)
12. Tolokonnikov, G.K.: Convolution polycategories and categorical splices for modeling neural networks. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 259–267. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_24
13. Chernoivanov, V.I.: Biomach systems: emergence, development and prospects. Biomach Syst. 1(1), 7–58 (2017)
14. Post, E.L.: Formal reduction of the general combinatorial decision problem. Amer. J. Math. 65(2), 39–67 (1943)
15. Chernoivanov, V.I., Tsoi, Y., Elizarov, V.P., Perednya, V.I.: On the concept of creating a “smart” dairy farm. Rural Mach. Equip. 11, 2–9 (2018)
16. Petoukhov, S.V.: The matrix genetics, algebras of the genetic code, noise immunity. Regular and Chaotic Dynamics, Moscow (2008)
17. Petoukhov, S.V.: Genetic coding and united-hypercomplex systems in the models of algebraic biology. BioSystems 158, 31–46 (2017)
18. Tolokonnikov, G.K.: Category functional systems on the example of hemostasis. Biomachsystems 3(1), 123–196 (2019)
19. Gribov, L.A., Mushtakova, S.P.: Quantum chemistry, M., 390 p (1999)
Concept of Two-Stage Acoustic Non-invasive Monitoring and Diagnostic System Based on Deep Learning
Vladimir V. Klychnikov, Dmitriy V. Lapin(B), and Mark E. Khubbatulin
Bauman Moscow State Technical University, 5, 2-ya Baumanskaya St, 105005 Moscow, Russian Federation
{klyychnikovvv,khmea15u755}@student.bmstu.ru, [email protected]
Abstract. The transition to completely automated digital manufacturing controlled by intelligent systems in real time is impossible without monitoring systems and predictive analysis of equipment condition. Much of this equipment consists of rotating machines, whose monitoring is mainly based on classical diagnostic systems. Such machines are often inaccessible for invasive monitoring because of structural constraints and installation difficulties, and installing and setting up an invasive system requires significant labor costs. An alternative is an ergonomic and flexible non-invasive system based on the actively developing field of machine sound monitoring. This article discusses the concept of a two-stage acoustic non-invasive monitoring and diagnostic system. The system allows both express checks for normal and anomalous conditions and an advanced analysis of the anomalous condition and its causes, which leads to predictive maintenance. The operation of the system and of each module is described, and a comparison with the anomaly detection baseline system from the DCASE 2020 competition is given.
Keywords: Acoustic analysis · Anomalous sound detection · Machine condition monitoring · Machine condition diagnostics · Predictive maintenance · Deep learning · Autoencoder · Pseudo Wigner-Ville transform
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 162–174, 2021. https://doi.org/10.1007/978-3-030-80478-7_17
1 Introduction
In industrial predictive and prescriptive analysis, one of the most important directions in Industry 4.0, machine monitoring and diagnostics is a critical component. Non-invasive acoustic data gathering is of particular interest because of its high ergonomics and low cost. The general method for processing this type of data is anomalous sound detection, which allows express diagnostics and monitoring of machines and units with minimal integration. Currently, there are systems that monitor rotating machines with vibration sensors or by acoustic emission analysis. Automating the collection and analysis of information is becoming more and more critical, since it increases the speed of response to changes in the process and in the condition of the equipment; the low cost of the technology used is also important. These are the requirements that the non-invasive acoustic monitoring and diagnostics solution under development must meet. Sound was chosen as the target physical data source because of the relative ease of collection and the versatility of installing data collection equipment. It follows from the above that industry needs systems that are efficient, cost-effective and ergonomic. This can be ensured by a system consisting of several monitoring and diagnostic stages. In addition, modern control systems must allow for further integration with predictive models, which will make more effective not only the monitoring and diagnostic systems but also the production for which they are installed. The goal of the work is to develop a system for two-stage monitoring and diagnostics of rotating machines by a non-invasive acoustic method, with subsequent validation and verification of the results on an experimental stand. Tasks set to achieve the goal:
• describe the problem and review the sources;
• design the system architecture, detailing the functional blocks and links;
• plan the stand experiment and its methodology;
• analyze the results obtained and assess the applicability of the solution at production facilities;
• draw conclusions and define the next steps of the study.
2 Sources Review
According to the analysis [1], the most promising is a two-stage approach that includes anomaly detection and advanced analysis blocks. The first stage can be implemented with various deep learning models and classical machine learning methods. To solve the anomaly detection problem, an ensemble of blind dereverberation and anomalous sound extraction is proposed [2]. This approach minimizes the impact of noise on the data and improves accuracy by using a set of algorithms: NMF, NMU, and NNE. The decisive component is an autoencoder (AE). However, this approach is difficult to implement and requires a large data set. A number of works present the classic AE architecture [3, 4]. Selecting the parameters of this architecture is a non-trivial task, but there is no need to build a complex physical model of the machine processes. In addition to the deep learning approach, the correlation method [5] can be used. The advantage of this method is its simplicity of implementation, but a significant disadvantage is its low accuracy. Methods of classical machine learning can also be used for this task [6]; they are characterized by complexity of implementation and instability to noise. Another example of the application of machine learning is the k-NN algorithm [7], but this method is also sensitive to environmental noise. Equally important for high-quality monitoring is the preparation of data for analysis; for example, an adaptive Kalman filter is used [8]. A convolutional AE is also used [9], but it is not suitable for a wide range of tasks, and training the network from scratch requires a large data set.
The narrow-purpose configuration of a convolutional AE can be overcome with an ensemble of two pre-trained convolutional neural networks [10]: MobileNetV2 [11] and ResNet-50 [12]. A less frequently encountered solution is the use of an I-vector, which is formed after selecting parameters by one of the machine learning methods [13]; in practice, however, the method is not widely used because of the complexity of implementation, the complex process of parameter selection and the difficulty of classifying the obtained hodographs of I-vectors. The best approach is a combination of classical machine learning methods and pre-trained convolutional autoencoders [14]. Such an architecture is noise-resistant and requires less data, and the presence of an ensemble makes it possible to validate the results obtained. For the second stage of monitoring and diagnostics, where the cause of the non-standard operation is determined, it is not reasonable to use deep learning, because the number of possible defects in one part of a rotating machine is large and the accumulation of training data may take a long time. In addition, the approach is difficult to interpret, which significantly complicates the detailing of anomalies. For drill-down diagnostics of the cause of a malfunction, a rational method is to use heuristic dependencies [15]. This approach requires an increased frequency resolution. To detail the spectral picture, it is necessary to use an advanced time-frequency transform such as the Wigner-Ville transform [16] or the continuous wavelet transform [17, 18]. To use heuristic dependencies, the main rotation frequency must be determined precisely [19]. Various methods can be used to solve this problem, but the initial signal must first be filtered, for example with a Kalman filter, and the non-target part of the spectrum must be removed by means of a filter bank.
The best results in finding the reference frequency are given by algorithms containing ensembles. Thus, it is possible to use simultaneously signal convolution, determination of the maximum energy components and the search for the maximum frequency factor in the windowed Fourier transform [2].
3 System Architecture
3.1 Hardware
Acoustic analysis with microphones places strict requirements on the minimum frequency and on the stability of the amplitude-frequency characteristic (AFC). Experiments showed that the minimum characteristics for monitoring and analysis are:
• Frequency band 10–10000 Hz
• Bandpass flatness −3 dB at 10–100 Hz
Based on these technical requirements, the SPL-Laboratory USB RTA METER Pro Edition microphone was selected; it is presented in Fig. 1 and Table 1. Single-board computers running Linux are used for remote data collection. This solution is economical and provides the necessary control over the data collection process. The characteristics of the single-board computers used are presented in Fig. 2 and Table 2.
Fig. 1. Microphone SPL-Laboratory USB RTA METER Pro Edition
Table 1. Characteristics of the SPL-Laboratory USB RTA METER Pro Edition microphone
Characteristic | Value
Frequency band | 10–20000 Hz
Amplitude range | 50–120 dB
Calibration | Audiomatica CLIO
ADC resolution | 16 bit
Fig. 2. Single board computer Raspberry Pi model 3B+
Table 2. Raspberry Pi 3B+ specifications
Micro-architecture | CPU freq | Cores | RAM | USB | Ethernet | Wi-Fi | BLE
Cortex-A53 (ARM v8) | 1.2 GHz | 4 | 1 GB | 4 ports | GE | 802.11n | 4.1
Cortex-A53 (ARM v8) | 1.4 GHz | 4 | 1 GB | 4 ports | GE over USB | 802.11ac | 4.2
Cortex-A72 (ARM v8) | 1.5 GHz | 4 | 2, 4, 8 GB | 4 ports | GE | 802.11ac | 5.0
Cortex-A53 (ARM v8) | 1.4 GHz | 4 | 512 MB | 1 port | - | 802.11ac | 4.2
The size of a single-board computer allows it to be installed in places that are difficult to access. Data transfer from the single-board computer is carried out via the MQTT protocol. This protocol has the following advantages:
• packet agnostic – any type of data can be transported in the payload carried by the packet;
• reliability – there are several Quality of Service (QoS) options that can be used to guarantee delivery;
• scalability – the publish/subscribe model scales well in a power-efficient manner;
• decoupled design – several elements of the design decouple the device from the subscribing server, which results in a more robust communication strategy;
• time – a device can publish its data regardless of the state of the subscribing server. The subscribing server can then connect and receive the data when it is able.
Data validation is performed at two points. Validation by recording time and file size is carried out on the single-board computer; on the server, the correctness of the file is checked by its content.
3.2 Software
The server architecture includes several microservices (Fig. 3):
• data validation service;
• database management system;
• anomaly detection service;
• advanced analytics service.
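The device-side check by recording time and file size described in Sect. 3.1 could look roughly like the sketch below. It assumes that recordings are stored as WAV files and uses only the Python standard library; the function names, the tolerance and the minimum size are hypothetical and are not taken from the paper.

```python
import os
import wave


def write_silence(path, seconds, sample_rate=16000):
    """Helper for the example: write a mono 16-bit WAV file of silence."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


def validate_recording(path, expected_seconds,
                       tolerance_seconds=0.5, min_bytes=1024):
    """Return True if the file is large enough and has the expected duration."""
    if os.path.getsize(path) < min_bytes:
        return False
    try:
        with wave.open(path, "rb") as wav:
            duration = wav.getnframes() / wav.getframerate()
    except (wave.Error, EOFError):
        return False  # not a readable WAV file
    return abs(duration - expected_seconds) <= tolerance_seconds
```

A file that passes this check would then be published over MQTT, and the content-level validation would run on the server side.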
Fig. 3. Structural diagram of the vibroacoustic monitoring and diagnostics system
The database management service monitors data flows and transfers data between the microservices. The anomaly detection service calculates an anomaly score for each file, which quantifies the likelihood that a defect is present. The analysis has shown that, for such a microservice, ensembles that include an AE and machine learning algorithms are the best solution for achieving high accuracy and resistance to noise. The advanced analytics service is the second stage of the analysis and diagnostics system. In its current implementation, the microservice includes a library of heuristic frequencies that depend on the fundamental frequency and an algorithm for finding the reference frequency based on a voting system of three competing algorithms. The output of the service is a conclusion about the cause of the non-standard operation of the equipment.
4 Model Descriptions
In real machine operation, anomalous sound signals are rare, and it can take considerable time to collect anomalous examples for a given machine. Therefore, the system is developed on the assumption that the data represent only the normal operation of the machines, and information about the anomalous behavior of the machines has to be obtained from these data. Information about the anomalous behavior alone is not enough; it is also necessary to have information about the possible defects that caused the anomalous operation of the machine. At present, vibration diagnostics is used for this task. Based on the already known dependencies for vibration, similar dependencies for the acoustic signal in the detailed frequency and time domains have been compiled. To solve the above tasks, the architecture of the monitoring and diagnostics system is based on the anomaly detection and advanced analytics blocks, which are described in more detail below.
4.1 Anomaly Detection Block
The anomaly detection block is a service for assessing the anomaly of an audio track based on an AE. The score is calculated as the difference between the original sound and the sound reconstructed by the AE. To obtain minimum anomaly values for sounds of normal operation of the rotor machine, the AE is trained to minimize the reconstruction error of normal operation sounds. This method assumes that the AE cannot reconstruct sounds that were not used in its training. An image $X \in \mathbb{R}^{F \times T}$ is fed to the AE input, where F = 128 is the number of Mel filters and T = 309 is the number of time frames. The acoustic feature $\psi_t$ at frame t is then obtained by concatenating the Log-Mel-filterbank outputs of several frames before and after it, so that its dimension is $D = F \times (2P + 1)$, where P is the context window size, P = 2 (Fig. 4).
Fig. 4. Feature extraction scheme
The anomaly score is calculated as
$$A_\theta(X) = \frac{1}{DT} \sum_{t=1}^{T} \left\| \psi_t - AE_\theta(\psi_t) \right\|_2^2,$$
where $AE_\theta$ is the autoencoder and $\|\cdot\|_2$ is the $\ell_2$ norm (Fig. 5).
Fig. 5. The Convolutional Autoencoder used in this work
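The feature construction and the anomaly score of this section can be sketched with NumPy as follows. The sketch assumes a precomputed log-Mel spectrogram of shape (F, T) and treats the autoencoder as an arbitrary callable `reconstruct`; both function names are hypothetical. Frames without a full context window are dropped, so the number of feature vectors is T − 2P, an edge-handling choice the paper does not specify.

```python
import numpy as np


def stack_context(log_mel, P=2):
    """Concatenate each frame with its P neighbours on both sides.

    log_mel: array of shape (F, T). Returns an array of shape
    (T - 2P, F * (2P + 1)), one feature vector psi_t per row.
    """
    F, T = log_mel.shape
    n = T - 2 * P
    if n <= 0:
        raise ValueError("spectrogram is shorter than the context window")
    shifted = [log_mel[:, i:i + n] for i in range(2 * P + 1)]  # each (F, n)
    return np.concatenate(shifted, axis=0).T


def anomaly_score(psi, reconstruct):
    """Mean squared reconstruction error over all D * T feature entries."""
    error = psi - reconstruct(psi)
    return float(np.sum(error ** 2) / psi.size)


X = np.arange(128 * 309, dtype=float).reshape(128, 309)
psi = stack_context(X, P=2)           # shape (305, 640)
perfect = anomaly_score(psi, lambda v: v)
```

A trained autoencoder would replace the identity function used here; a high score then indicates a sound that the model, trained only on normal operation, fails to reconstruct.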
For some time, the system records the sounds of the equipment on the assumption that it operates in normal mode. These data are used to train the anomaly detection model. This stage is called the model initialization phase. Once the models have been obtained, the online part begins: in streaming mode, the sounds of the operating equipment are recorded and passed to the anomaly detection block, and each recording is assigned its own status. Anomalous recordings go on to the advanced analytics block (Fig. 6).
Fig. 6. Scheme of initialization and work Anomaly Detection block
5 Advanced Analysis Block
The advanced analysis block is responsible for finding the defect that caused the abnormalities in the acoustic data. This analysis is based on heuristic frequency dependencies [15], which exist for most defects of rotary machines. An analytical approach to finding the main rotational frequencies is possible only if there is enough information about the object under study. In most cases, the analysis begins with the determination of the main frequency [16]. This task is complicated by the strong noise in the spectra. When diagnosing with acoustic signals, an important factor is the amplitude-frequency characteristic of the microphone: in certain cases of low-frequency analysis, in which it is impossible to determine the original frequency of movement, the analysis can only be carried out using the secondary harmonics [17]. The most developed methods at present are those for determining the main frequency of the voice. Examples of such algorithms are:
• Praat;
• YAAPT;
• YIN.
Praat and YIN use autocorrelation to determine the main frequency in the frequency and time-frequency domains, respectively. The hybrid approach is more modern: it uses both domains simultaneously, and YAAPT is an example of this group of algorithms. The autocorrelation approach can also be used for the diagnostics of machines and units; it makes it possible to find a frequency that is invisible in the time-frequency picture [18]. However, an auxiliary method is needed to confirm the hypothesis that the main frequency has been found correctly. After the main frequency has been determined, the amplitude coefficients corresponding to the defects are found using analytical expressions. Most of the analytical expressions for identifying vibroacoustic defects are based on heuristic rules. A heuristic rule is an algorithm for solving a given task that has not been proven correct for all possible cases but is known to provide a fairly good solution in most cases.
In fact, it may even be known (i.e. proven) that the heuristic algorithm is formally incorrect. It can still be used if it gives an incorrect result only in certain rare and well-defined cases, or if it gives an inaccurate but still acceptable result. The degree of defect development is determined by the ratio of the amplitude coefficients. In addition, some defects can be described by one of the probability distributions over a certain frequency range; such ratios usually act as additional conditions for diagnosing the defect. A defect is registered when the following condition is met [20]:
Because of the amplitude-frequency response of the microphone used, it is impossible to determine the main rotation frequency in the time-frequency picture. The autocorrelation approach solves this problem. The unbalance was diagnosed using formula (1); for such a defect, multiple harmonics are important. The unbalance factor takes the microphone’s amplitude-frequency response into account.
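The autocorrelation idea referred to above can be illustrated with a minimal NumPy sketch. It is not the voting ensemble used in the system: it only shows how a rotation frequency that is absent from the spectrum (here suppressed on purpose, as a stand-in for the microphone roll-off) can still be recovered from its harmonics. The function name and the search interval are assumptions.

```python
import numpy as np


def estimate_f0_autocorr(signal, sample_rate, f_min, f_max):
    """Estimate the fundamental frequency as the strongest autocorrelation
    peak whose lag lies inside the search interval [f_min, f_max]."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()
    n = len(x)
    # FFT-based autocorrelation, zero-padded to avoid circular wrap-around.
    spectrum = np.fft.rfft(x, 2 * n)
    acf = np.fft.irfft(spectrum * np.conj(spectrum))[:n]
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), n - 1)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Synthetic example: only the 2nd and 3rd harmonics of 25 Hz are present.
sr = 8000
t = np.arange(0, 2.0, 1.0 / sr)
sig = np.sin(2 * np.pi * 50 * t) + 0.8 * np.sin(2 * np.pi * 75 * t)
f0 = estimate_f0_autocorr(sig, sr, f_min=5.0, f_max=30.0)
```

Restricting the lag range to the expected speed interval plays the same role as the fundamental frequency search interval mentioned in the conclusions; without it, shorter lags belonging to the harmonics could be selected instead.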
6 Experiments
An experiment was conducted to test the system’s operability. The object of monitoring and diagnostics was an AIR71A4 electric motor (Fig. 7).
Fig. 7. Test stand and electric engine with bob weight
The motor was recorded at different speeds, from 5 Hz to 30 Hz. Then bob weights of 10, 20 and 30 g were hung on the disk to simulate shaft run-out. Each file is a 10 s recording (Table 3).
Table 3. Dataset description
Status | Files
Normal | 2000
Small bearing | 300
Medium bearing | 300
Big bearing | 300
For training, 1400 recordings of normal motor operation were used. The test sample consisted of the remaining 600 normal operation files and 900 anomalous files. The results of the binary classification are given in Table 4. The classification results are as follows: Recall = 99.833%, Precision = 99.667%. The anomalous files were then passed to the analytics block (Fig. 8). The spectrogram shows the secondary harmonics (marked in red) of the main rotation speed (marked in brown). The presence of these harmonics indicates the onset of a shaft beating defect (Fig. 9). Finding such dependencies for all possible defects will make it possible not only to identify the causes of defects but also to predict their occurrence and give recommendations.
Table 4. Confusion Matrix
 | Normal | Anomaly
True | 598 | 2
False | 1 | 899
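The summary metrics can be recomputed from the confusion matrix with a few lines of Python. The sketch assumes that the rows are the actual classes (600 normal and 900 anomalous test files), that the columns are the predicted classes, and that "normal" is the positive class; these orientation choices are assumptions, since the table does not state them. Under this reading the two values 99.833% and 99.667% are obtained as precision and recall respectively, and reading the matrix transposed exchanges them.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, precision and recall from the four confusion-matrix counts."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Counts from Table 4 under the stated assumptions.
metrics = binary_metrics(tp=598, fn=2, fp=1, tn=899)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.3f}%")
```

The accuracy computed this way, 1497 of 1500 files, agrees with the overall figure of 99.8% quoted in the conclusions.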
Fig. 8. Spectrogram from analytics block
Fig. 9. Scheme of data analytics progression
7 Discussions
One of the problems with this approach is rare events. These events can be either normal or anomalous, but they are so rare that it is impossible to collect enough data for the model to generalize to them. Recently, various solutions to this problem have been proposed [24, 25]. In our solution we use a similar approach: rare events are transformed into a vector representation (embedding), and this representation is stored in a database. In the future, in parallel with the anomaly calculation, the audio will also be compared with the database of already detected rare anomalous and normal events.
Combining the results of the anomaly calculation with the comparison against the rare event database should help reduce the number of type I and type II errors. An indicator of the efficiency of the developed first module is a comparison with the baseline architecture proposed at DCASE 2020. The models were trained on the data offered to the participants of the competition. The data set contained the sounds of different machines in different modes of operation, contaminated with noise from other mechanical sources. Only sound files of normal operation of the equipment were provided for training. The results of the module developed by us are presented in Table 5.
Table 5. Comparison of results
 | Fan (AUC) | Pump (AUC) | Slider (AUC) | Valve (AUC)
DCASE Baseline | 65.83% | 72.89% | 84.76% | 50.98%
Our solution | 68.75% | 76.00% | 90.78% | 63.78%
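For readers unfamiliar with the metric in Table 5, the AUC can be computed directly from anomaly scores as the probability that a randomly chosen anomalous clip scores higher than a randomly chosen normal clip, with ties counted as one half. The sketch below is a plain pairwise implementation for illustration; the function name and the example scores are hypothetical and are not data from the paper.

```python
def roc_auc(normal_scores, anomaly_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(normal_scores) * len(anomaly_scores))


# Hypothetical scores: one anomalous clip is ranked below one normal clip.
auc = roc_auc([0.10, 0.20, 0.30], [0.25, 0.40, 0.50])
```

An AUC of 50% corresponds to random ranking, which puts the baseline Valve result of 50.98% in context.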
However, the results can be improved; for this purpose we are optimizing the neural network architecture and plan to use an ensemble of methods to determine whether a sound is anomalous. Using a set of methods in the first module will help reduce the probability of false positives, which will ultimately reduce the load on the second module. One of the main difficulties in implementing the second module was the determination of the fundamental frequency. So far, no optimal approach for finding it has been developed, because without localization of the sound source it is impossible to filter similar sound data, and the available trained neural networks are not adapted to the task of detecting machine sounds. To improve the accuracy of finding the fundamental frequency, we will also consider other spectral transforms, including those for non-stationary signals, for example the Hilbert-Huang transform [26].
8 Conclusions
As a result of this work, the concept of non-invasive vibroacoustic monitoring was developed. The hardware base for data collection was designed to be sufficiently flexible and easy to install. In the experiment, the diagnostic system showed a good result in identifying anomalies, with an overall accuracy of 99.8%. The correct determination of the fundamental frequency remains a problem for our system. It directly affects the accuracy with which the secondary harmonic coefficients, and consequently the equipment defects, are determined. This problem is currently solved by setting the fundamental frequency search interval. Further development of the technology relates to data accumulation. Thanks to the use of historical data, it will be possible to implement models for predicting the state of equipment.
Acknowledgment. This project is supported by Ctrl2Go Company.
References
1. Klychnikov, V., Lapin, D., Khubbatulin, M.: Analysis of methods of non-invasive vibroacoustic diagnostics. In: AIP Conference Proceedings. AIP Publishing LLC (2020)
2. Kawaguchi, Y., Tanabe, R., Endo, T., Ichige, K., Hamada, K.: Anomaly detection based on an ensemble of dereverberation and anomalous sound extraction. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 865–869. IEEE, May 2019
3. Kim, H.: Machine anomaly detection using sound spectrogram images and neural networks (Doctoral dissertation, Purdue University Graduate School) (2019)
4. Oyedotun, O.K., Dimililer, K.: Pattern recognition: invariance learning in convolutional auto encoder network. Int. J. Image Graph. Signal Process. 8(3), 19–27 (2016)
5. Prego, T.D.M., de Lima, A.A., Netto, S.L., da Silva, E.A.: Audio anomaly detection on rotating machinery using image signal processing. In: 2016 IEEE 7th Latin American Symposium on Circuits & Systems (LASCAS), pp. 207–210. IEEE, February 2016
6. Rabaoui, A., Kadri, H., Lachiri, Z., Ellouze, N.: One-class SVMs challenges in audio detection and classification applications. EURASIP J. Adv. Signal Process. 2008, 1–14 (2008)
7. Sakamoto, Y., Miyamoto, N.: Anomaly calculation for each components of sound data and its integration for DCASE2020 challenge task2. Date of circulation: 07 September 2020 (2020). http://dcase.community/documents/challenge2020/technical_reports/DCASE2020_Sakamoto_33_t2.pdf
8. Gong, C.S.A., Lee, H.C., Chuang, Y.C., Li, T.H., Su, C.H.S., Huang, L.H., Chang, C.H.: Design and implementation of acoustic sensing system for online early fault detection in industrial fans. Journal of Sensors (2018)
9. Oh, D.Y., Yun, I.D.: Residual error-based anomaly detection using auto-encoder in SMD machine sound. Sensors 18(5), 1308 (2018)
10. Giri, R., Tenneti, S.V., Cheng, F., Helwani, K., Isik, U., Krishnaswamy, A.: Unsupervised anomalous sound detection using self-supervised classification and group masked Autoencoder for density estimation. Date of circulation: 07 September 2020. http://dcase.community/documents/challenge2020/technical_reports/DCASE2020_Giri_103_t2.pdf
11. Sandler, M., et al.: Mobilenetv2: inverted residuals and linear bottlenecks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4510–4520 (2018)
12. Akiba, T., Suzuki, S., Fukuda, K.: Extremely large minibatch sgd: Training resnet-50 on imagenet in 15 minutes. arXiv preprint arXiv:1711.04325 (2017)
13. Tiwari, P., Jain, Y., Anderson Avila, A., Monteiro, L., Kshirsagar, S., Gaballah, A., Falk, T.H.: Modulation spectral signal representation and I-vectors for anomalous sound detection. Date of circulation: 07 September 2020. http://dcase.community/documents/challenge2020/technical_reports/DCASE2020_Tiwari_84_t2.pdf
14. Hayashi, T., Yoshimura, T., Adachi, Y.: Conformer-based ID-aware autoencoder for unsupervised anomalous sound detection. Date of circulation: 07 September 2020. http://dcase.community/documents/challenge2020/technical_reports/DCASE2020_Hayashi_111_t2.pdf
15. Barkov, A.V., Barkova, N.A., Azovtsev: Monitoring and diagnostics of rotary vibration machines (2000)
16. Angeli, S., Quesney, A., Gross, L.: Image simplification using Kohonen maps: application to satellite data for cloud detection and land cover mapping. Applications of Self-Organizing Maps, vol. 269 (2012)
17. Chalapathy, R., Chawla, S.: Deep learning for anomaly detection: a survey. arXiv preprint arXiv:1901.03407 (2019)
18. Yadav, O.P., Pahuja, G.L.: Bearing fault detection using logarithmic wavelet packet transform and support vector machine. Int. J. Image Graph. Signal Process. 11(5) (2019)
19. Togami, M., Kawaguchi, Y., Takeda, R., Obuchi, Y., Nukaga, N.: Optimized speech dereverberation from probabilistic perspective for time varying acoustic transfer function. IEEE Trans. Audio Speech Lang. Process. 21(7), 1369–1380 (2013)
20. Debnath, L.: The Wigner-Ville distribution and time-frequency signal analysis. In: Wavelet Transforms and Their Applications. Birkhäuser, Boston, MA (2002). https://doi.org/10.1007/978-1-4612-0097-0_5
21. Antoni, J.: Cyclostationarity by examples. Mech. Syst. Signal Process. 23(4), 987–1036 (2009)
22. Kim, S., Park, Y.: On-line fundamental frequency tracking method for harmonic signal and application to ANC. J. Sound Vib. 241(4), 681–691 (2001)
23. Gerber, T., Martin, N., Mailhes, C.: Time-frequency tracking of spectral structures estimated by a data-driven method. IEEE Trans. Industr. Electron. 62(10), 6616–6626 (2015)
24. Koizumi, Y., Yasuda, M., Murata, S., Saito, S., Uematsu, H., Harada, N.: SPIDERnet: attention network for one-shot anomaly detection in sounds. In: ICASSP 2020–2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 281–285. IEEE, May 2020
25. Koizumi, Y., Murata, S., Harada, N., Saito, S., Uematsu, H.: SNIPER: few-shot learning for anomaly detection to minimize false-negative rate with ensured true-positive rate. In: ICASSP 2019–2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 915–919. IEEE, May 2019
26. Fengcaia, C., Hongxiab, P.: The fault diagnosis research of gearbox based on Hilbert-Huang transform. Int. J. Educ. Manage. Eng. 2(4), 71 (2012)
Harmonic Fractal-Like Features Related to Epi-Chains of Genomes of Higher and Lower Organisms
Sergey V. Petoukhov1,2 and Vladimir V. Verevkin1
1 Mechanical Engineering Research Institute, Russian Academy of Sciences, M. Kharitonievsky pereulok, 4, Moscow, Russia
2 Moscow State Tchaikovsky Conservatory, Bolshaya Nikitskaya, 13/6, Moscow, Russia
Abstract. The paper is devoted to a new class of fractal-like and symmetric features revealed in long DNA sequences of eukaryotic and prokaryotic genomes, in addition to known data about fractal-like features in molecular genetic systems. This new class was discovered by means of the oligomer sums method described here. The method can be applied for comparative analysis of numerical peculiarities of any complete genomic DNA nucleotide sequence and of its special sparse shortened subsequences, termed DNA epi-chains. Application of the method revealed the so-called hyperbolic rules of oligomer cooperative organization in genomes. The article presents some results for the DNA sequence of the first human chromosome, which show a practical identity of the hyperbolic rules in the complete DNA sequence and in its epi-chains of the orders 2, 3, 100. Similar results hold for all human chromosomes and many other eukaryotic and prokaryotic genomes. The results support the fundamental ideas of P. Jordan and E. Schrödinger about quantum biology and living organisms as quantum entities. The described fractal-like features of long DNA sequences show the existence of new classes of symmetric relations in genetic systems and allow the development of new model approaches to genetically inherited biological structures, as well as some new methods in biotechnologies and in problems of artificial intelligence.
Keywords: Genomes · DNA · Hyperbolic rules · Oligomer sums method · DNA epi-chains · Quantum biology
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 175–184, 2021. https://doi.org/10.1007/978-3-030-80478-7_18

1 Introduction
An intensive study of fractal-like structures is a characteristic feature of the development of computer technology, the digital economy, artificial intelligence systems, and other branches of modern science. Researchers pay particular attention to the multitude of genetically inherited fractal-like structures that nature implements in living bodies. New methods of medical diagnostics, biotechnology, comparative analysis of biological objects, etc. are based on the analysis and modeling of these structures [1, 2]. The data available in scientific publications show the existence of fractal-like structures at various levels and branches of the hereditary organization of living bodies. In particular, oncological processes are associated with fractals: fractal patterns arise on a cell surface when a healthy human cell turns cancerous [3]. Fractals are frequently considered fingerprints of dynamic chaos. Correspondingly, fractal patterns on the surfaces of such cells give new material for understanding cancerous processes from the standpoint of the theory of dynamic chaos. The article “Fractals and Cancer” [4] shows a possible connection of cancer structures with fractal-like patterns generated by the tensor product of matrices, which is used in quantum mechanics and algebraic biology [5].

All the mentioned facts about fractal-like patterns in inherited biological structures draw attention to the question of a possible relation of fractals to the DNA sequences bearing genetic information. The molecular genetic coding system ensures the transmission of enormous amounts of genetic information along the chain of generations. As is known, mathematical fractals offer an opportunity for colossal compression of information. It seems quite possible that the genetic system uses this opportunity for information compression. Modern computer science knows a great number of methods of information compression, including many methods of fractal compression. But which compression methods are used for genetic information? The search for the corresponding bio-informational patents of Nature is one of the challenges of modern science. There are several publications on the relationship between the genetic system and fractals in various aspects [6–12]. For example, work [12] shows the existence of fractal globules in the three-dimensional architecture of whole genomes, where spatial chromosome territories exist and where maximally dense packing is achieved by a special fractal packing, which makes it possible to easily fold and unfold any genomic locus.
In contrast to work [12], our article is devoted not to the spatial packing of whole genomes in the form of fractal globules but to quite another type of fractal, which is related to oligomer cooperative compositions of long DNA nucleotide sequences in eukaryotic and prokaryotic genomes. These fractal-like structures were discovered by a new method for the analysis of DNA sequences, termed the oligomer sums method, and they had not been studied earlier by other authors [13, 14]. The described results about fractal features of genomes may lead to the discovery of those «genetic» methods of information compression that are used in genetic systems and in biological bodies as a whole.

Each DNA nucleotide sequence in eukaryotic and prokaryotic genomes can be represented as a sequence of monomers (such as A-C-A-T-G-T-…), or a sequence of doublets (such as AC-AT-GT-GG-…), or a sequence of triplets (such as ACA-TGT-GGA-…), etc. In each such fragmented representation of a DNA sequence, one can calculate the total amounts of oligomers (or n-plets) of length n that belong to the equivalence classes of A-oligomers, T-oligomers, C-oligomers, or G-oligomers; these classes combine all n-plets that start with the same nucleotide A, T, C, or G, respectively. For example, the class of C-oligomers contains the following n-plets: 4 doublets CA, CT, CC, and CG; 16 triplets CAA, CAT, CAC, CAG, CTA, …, CGG; etc. This method of analysis of long DNA sequences, termed the oligomer sums method, has revealed for eukaryotic and prokaryotic genomes the following hyperbolic rule of oligomer cooperative organization, which is related to the famous harmonic progression [13, 14]:
• For any of the classes of A-, T-, C-, or G-oligomers in any long DNA sequence of an individual genome, the total amounts $\Sigma_{N,n,1}(n)$ of their n-plets, corresponding to different n, are interrelated through the general expression $\Sigma_{N,n,1} \approx S_N/n$ with a high level of accuracy (here N refers to any of the nucleotides A, T, C, or G; $S_N$ refers to the number of monomers N; n = 1, 2, 3, 4, … is not too large compared to the full length of the nucleotide sequence). The phenomenological points with coordinates $[n, \Sigma_{N,n,1}]$ practically lie on the hyperbola having points $H_{N,1} = S_N/n$.

A sequence of the total amounts $\Sigma_{N,n,1}(n)$ of the n-plets at different values of n is termed an oligomer sums sequence (or, briefly, an OS-sequence). Figure 1 shows data regarding this rule and the OS-sequences for the DNA nucleotide sequence of human chromosome № 1. One can see that the deviations of the phenomenological quantities $\Sigma_{N,n,1}$ from the model values $S_N/n$ lie in the range −0.11% ÷ 0.11%, that is, they are very small. Initial data on this chromosome were taken from GenBank: https://www.ncbi.nlm.nih.gov/nuccore/NC_000001.11.
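The counting behind the rule can be sketched in a few lines. The code below is a minimal illustration, not the authors' software: the function names are hypothetical, an incomplete n-plet at the end of the sequence is dropped (an assumption), and the deviation is taken relative to the model value (also an assumption). A synthetic random sequence stands in for a real genomic record, so the printed numbers say nothing about real genomes.

```python
import random


def os_sequence(sequence, nucleotide, max_n=20):
    """Total amounts of n-plets that start with `nucleotide`, for n = 1..max_n.

    The sequence is cut into consecutive non-overlapping n-plets; an incomplete
    tail is dropped (assumption of this sketch).
    """
    totals = []
    for n in range(1, max_n + 1):
        usable = len(sequence) - len(sequence) % n
        # First letters of the n-plets sit at positions 0, n, 2n, ...
        totals.append(sum(1 for i in range(0, usable, n) if sequence[i] == nucleotide))
    return totals


def deviations_percent(sequence, nucleotide, max_n=20):
    """Relative deviation (in %) of each OS-sequence value from the model S_N / n."""
    s_n = sequence.count(nucleotide)
    totals = os_sequence(sequence, nucleotide, max_n)
    return [100.0 * (total - s_n / n) / (s_n / n) for n, total in enumerate(totals, start=1)]


# Toy fragment from the text: monomers, doublets AC-AT-GT-GG, triplets ACA-TGT-GGA
toy_totals = os_sequence("ACATGTGGA", "A", max_n=3)

# Synthetic stand-in for a long DNA sequence (not genomic data)
random.seed(0)
sequence = "".join(random.choice("ATCG") for _ in range(200_000))
dev_a = deviations_percent(sequence, "A")
print(toy_totals, [round(d, 2) for d in dev_a[:5]])
```

For n = 1 the deviation is zero by construction, since the total amount of 1-plets starting with N is exactly the number of monomers N.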
Fig. 1. Results of the analysis, by the oligomer sums method, of the DNA nucleotide sequence of human chromosome № 1. All abscissa axes show the values n = 1, 2, …, 20. The top row demonstrates that the model hyperbolic progressions $S_A/n$, $S_T/n$, $S_C/n$, $S_G/n$ (red lines) almost completely cover the OS-sequences of the phenomenological total amounts of those n-plets that start with a nucleotide A, T, C, or G, respectively (the ordinate axes show the corresponding amounts). The bottom row shows, in fractions of a percent, the slight alternating deviations of the real values of the OS-sequences from the model values. $S_A$, $S_T$, $S_C$, and $S_G$ refer to the numbers of nucleotides A, T, C, and G in this sequence.
The rest of the article contains the main section with the obtained results, a section with some concluding remarks, acknowledgments, and references.
2 DNA Epi-Chains and the Hyperbolic Rules for Oligomer Sums
This section presents some results of the study, by the oligomer sums method, of special subsequences of long nucleotide sequences in single-stranded DNA. These subsequences are termed «DNA epi-chains» [15]. The authors' initial results testify that the above-described hyperbolic rule of oligomer sums for genomes is also fulfilled for these epi-chains; this gives new material for the known theme of fractal-like structures in genetics. By definition, in a nucleotide sequence N1 of any DNA strand with sequentially numbered nucleotides 1, 2, 3, 4, … (Fig. 2a), epi-chains of order n are those subsequences that contain only nucleotides whose numbers differ from each other by the natural number n = 1, 2, 3, …. For example, in any single-stranded DNA, the epi-chains of the second order are the two nucleotide subsequences N2/1 and N2/2 in which the nucleotide sequence numbers differ by n = 2: the epi-chain N2/1 contains the nucleotides with odd numbers 1, 3, 5, … (Fig. 2b), and the epi-chain N2/2 contains the nucleotides with even numbers 2, 4, 6, … (Fig. 2c). By analogy, the epi-chains of the third order are the three nucleotide subsequences N3/1, N3/2, and N3/3, each of which has sequence numbers that differ by n = 3: these epi-chains contain the nucleotides with numbers 1, 4, 7, …, or 2, 5, 8, …, or 3, 6, 9, …, respectively (Figs. 2d–f). The epi-chain of the first order N1 coincides with the nucleotide sequence of the DNA strand (Fig. 2a).
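The definition above maps directly onto strided slicing. The following is a minimal sketch; the function name `epi_chain`, the restriction 1 ≤ m ≤ k, and the toy strand are assumptions made for illustration.

```python
def epi_chain(sequence, k, m):
    """Epi-chain of order k starting at nucleotide number m (1-based numbering).

    Returns the nucleotides numbered m, m + k, m + 2k, ...
    """
    if not 1 <= m <= k:
        raise ValueError("m must satisfy 1 <= m <= k")
    return sequence[m - 1::k]


strand = "ACATGTGGAC"            # toy strand; nucleotides are numbered 1..10
n2_1 = epi_chain(strand, 2, 1)   # nucleotides 1, 3, 5, 7, 9
n2_2 = epi_chain(strand, 2, 2)   # nucleotides 2, 4, 6, 8, 10
n3_2 = epi_chain(strand, 3, 2)   # nucleotides 2, 5, 8
print(n2_1, n2_2, n3_2)
```

With k = 1 and m = 1 the function returns the strand itself, matching the statement that the epi-chain of the first order coincides with the DNA strand.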
Fig. 2. A schematic representation of a single-stranded DNA and its initial epi-chains of nucleotides, denoted by black circles. a, a sequence N1 of numerated nucleotides of the DNA strand. b, an epi-chain of the second-order N2/1 beginning with nucleotide number 1. c, an epi-chain of the second-order N2/2 beginning with nucleotide number 2. d, an epi-chain of the third-order N3/1 beginning with nucleotide number 1. e, an epi-chain of the third-order N3/2 beginning with nucleotide number 2. f, an epi-chain of the third-order N3/3 beginning with nucleotide number 3.
The term “epi-chain” was coined from the Ancient Greek prefix «epi-», implying features that are "on top of" DNA strands. In any DNA strand, each nucleotide belongs to many epi-chains of different orders k. The symbol “N” in the designation of DNA epi-chains corresponds to the first letter of the word “nucleotides”. In the designation “Nk/m” of single-stranded DNA epi-chains, the numerator "k" in the index indicates the order of the epi-chain, and the denominator "m" indicates the number of the initial nucleotide of this epi-chain along the DNA strand (Fig. 2a). For example, the symbol N3/2 refers to the epi-chain of the third order whose initial nucleotide has the number 2 in the DNA strand: 2-5-8-… (Fig. 2e). Each DNA epi-chain of order k (k = 2, 3, 4, …) contains k times fewer nucleotides than the DNA strand and has its own arrangement of the nucleobases A, T, C, and G. Unexpectedly, despite these differences, the OS-sequences of the total amounts of those n-plets that start with a nucleotide A, T, C, or G are modeled by very similar hyperbolic harmonic-like progressions in the complete DNA strand and in its epi-chains (at this stage of the research, the authors studied OS-representations only of epi-chains with relatively small orders k). Figures 3–7 illustrate these results graphically with the OS-representations of the DNA nucleotide epi-chains N2/1, N3/1, N4/1, N10/1, and N50/1 of the same human chromosome № 1 as in Fig. 1.
Fig. 3. Results of the analysis, by the oligomer sums method, of the nucleotide sequence of the epi-chain of the second order N2/1 (Fig. 2b), which consists of the nucleotides with serial numbers 1-3-5-7-9-… in the DNA sequence of human chromosome № 1. All abscissa axes show the values n = 1, 2, …, 20. The top row demonstrates that the model hyperbolic progressions $S_A/n$, $S_T/n$, $S_C/n$, $S_G/n$ (red lines) almost completely cover the OS-sequences of the real total amounts of those n-plets that start with a nucleotide A, T, C, or G, respectively, in this epi-chain (the ordinate axes show the corresponding amounts). The bottom row shows, in fractions of a percent, the slight alternating deviations of the real values of the OS-sequences from the model values. $S_A$, $S_T$, $S_C$, and $S_G$ refer to the numbers of nucleotides A, T, C, and G in this sequence.

Fig. 4. Results of the analysis, by the oligomer sums method, of the nucleotide sequence of the epi-chain of the third order N3/1 (Fig. 2d), which consists of the nucleotides with serial numbers 1-4-7-10-13-… in the DNA sequence of human chromosome № 1. The top row demonstrates that the model hyperbolic progressions $S_A/n$, $S_T/n$, $S_C/n$, $S_G/n$ (red lines) almost completely cover the OS-sequences of the real total amounts of those n-plets that start with a nucleotide A, T, C, or G, respectively, in this epi-chain. The bottom row shows, in fractions of a percent, the slight alternating deviations of the real values of the OS-sequences from the model values. $S_A$, $S_T$, $S_C$, and $S_G$ refer to the numbers of nucleotides A, T, C, and G in this sequence.

Fig. 5. Results of the analysis, by the oligomer sums method, of the nucleotide sequence of the epi-chain of the fourth order N4/1, which consists of the nucleotides with serial numbers 1-5-9-13-… in the DNA sequence of human chromosome № 1. The top row demonstrates that the model hyperbolic progressions $S_A/n$, $S_T/n$, $S_C/n$, $S_G/n$ (red lines) almost completely cover the OS-sequences of the real total amounts of those n-plets that start with a nucleotide A, T, C, or G, respectively, in this epi-chain. The bottom row shows, in fractions of a percent, the slight alternating deviations of the real values of the OS-sequences from the model values. $S_A$, $S_T$, $S_C$, and $S_G$ refer to the numbers of nucleotides A, T, C, and G in this sequence.
Fig. 6. Results of the analysis, by the oligomer sums method, of the nucleotide sequence of the epi-chain of the 10th order N10/1, which consists of the nucleotides with serial numbers 1-11-21-31-41-… in the DNA sequence of human chromosome № 1. The top row demonstrates that the model hyperbolic progressions $S_A/n$, $S_T/n$, $S_C/n$, $S_G/n$ (red lines) almost completely cover the OS-sequences of the real total amounts of those n-plets that start with a nucleotide A, T, C, or G, respectively, in this epi-chain. The bottom row shows, in fractions of a percent, the slight alternating deviations of the real values of the OS-sequences from the model values. $S_A$, $S_T$, $S_C$, and $S_G$ refer to the numbers of nucleotides A, T, C, and G in this sequence.

Fig. 7. Results of the analysis, by the oligomer sums method, of the nucleotide sequence of the epi-chain of the 50th order N50/1, which consists of the nucleotides with serial numbers 1-51-101-151-201-… in the DNA sequence of human chromosome № 1. The top row demonstrates that the model hyperbolic progressions $S_A/n$, $S_T/n$, $S_C/n$, $S_G/n$ (red lines) almost completely cover the OS-sequences of the real total amounts of those n-plets that start with a nucleotide A, T, C, or G, respectively, in this epi-chain. The bottom row shows, in fractions of a percent, the slight alternating deviations of the real values of the OS-sequences from the model values. $S_A$, $S_T$, $S_C$, and $S_G$ refer to the numbers of nucleotides A, T, C, and G in this sequence.

Figures 3–7 show that in these epi-chains, which are sparse shortened subsequences of the complete DNA sequence and represent its different scales, the same hyperbolic rule is fulfilled that was formulated above for complete DNA sequences in eukaryotic and prokaryotic genomes. The rule is fulfilled in these epi-chains with the same high accuracy as in the complete DNA sequence.
Similar results were obtained by the authors in the study of epi-chains in the single-stranded DNA of other analyzed genomes (see some corresponding data in the preprint [15]). These results allow formulating an additional hyperbolic rule of eukaryotic and prokaryotic genomes, which the authors consider a candidate for the role of a universal genetic rule (the widest variety of genomes must be investigated further to verify the degree of its universality). The hyperbolic rule about interrelations of oligomers in epi-chains of long DNA sequences:
• In any nuclear chromosome of eukaryotic genomes and also in prokaryotic genomes, the above-noted hyperbolic rule is fulfilled not only for the complete nucleotide sequences but also for their epi-chains of order k (where k = 2, 3, 4, … is not too large compared to the full length of the nucleotide sequence).
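A check of this rule on epi-chains of several orders could be organized as in the sketch below. It is only an outline under stated assumptions: the helper names are hypothetical, the deviation is measured relative to the model value S_N/n, and a synthetic random strand replaces a real chromosome, so the resulting numbers are not evidence for or against the rule.

```python
import random


def epi_chain(sequence, k, m=1):
    """Nucleotides numbered m, m + k, m + 2k, ... (1-based numbering)."""
    return sequence[m - 1::k]


def max_abs_deviation_percent(sequence, max_n=20):
    """Largest |deviation| (in %) from S_N / n over N in A, T, C, G and n = 1..max_n."""
    worst = 0.0
    for nucleotide in "ATCG":
        s_n = sequence.count(nucleotide)
        if s_n == 0:
            continue  # nucleotide absent: the model value is undefined for it
        for n in range(1, max_n + 1):
            usable = len(sequence) - len(sequence) % n
            total = sum(1 for i in range(0, usable, n) if sequence[i] == nucleotide)
            model = s_n / n
            worst = max(worst, abs(total - model) / model * 100.0)
    return worst


# Synthetic stand-in for a DNA strand (not genomic data)
random.seed(1)
strand = "".join(random.choice("ATCG") for _ in range(120_000))
report = {k: max_abs_deviation_percent(epi_chain(strand, k)) for k in (1, 2, 3, 4, 10)}
for k, worst in report.items():
    print(f"order {k}: max |deviation| = {worst:.2f}%")
```

Because an epi-chain of order k is k times shorter than the strand, the same check on real data needs sequences long enough that the n-plet counts stay large for every k and n considered.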
3 Some Concluding Remarks
New fractal-like and symmetric features of the oligomer cooperative organization of eukaryotic and prokaryotic genomes are revealed and described. This expands knowledge of the fundamental structural peculiarities of molecular-genetic systems. Fractals are connected with the theory of dynamic chaos, which has many applications in engineering technologies. We believe that the discovery of the described new fractal-like properties of DNA-texts related to their epi-chains can lead to new ideas in theoretical and applied areas, including problems of artificial intelligence and the in-depth study of genetic phenomena for medical and biotechnological tasks [16–21]. The authors' oligomer sums method is proposed as a new effective tool for studying not only genomic sequences but also long genes, giant viruses, and amino acid sequences in long proteins. This method arose from the simulation of long DNA nucleotide sequences based on the quantum information approach [22, 23]. The success of the application of this method to the analysis of epi-chains of genomic DNA and other genetic sequences provides additional arguments for considering living organisms as quantum information entities. These results support the fundamental ideas of P. Jordan and E. Schrödinger, who were founders of quantum mechanics, about the need for the development of quantum biology [24]. They noted that the key difference between living and inanimate objects is as follows: inanimate objects are controlled by the average random movement of millions of their particles, while in a living organism, genetic molecules have a dictatorial effect on the entire organism due to a special mechanism of quantum amplification. Jordan claimed that «life’s missing laws were the rules of chance and probability (the indeterminism) of the quantum world that were somehow scaled up inside living organisms» [24].

Our presented results are based on the study of probabilities in long DNA sequences and describe the corresponding hyperbolic rules. The authors believe that further studies of genomic DNA epi-chains will be associated with a comparative analysis of OS-sequences representing different DNA epi-chains both of an individual organism and of different eukaryotes and prokaryotes. We expect quantum biology to develop further in the near future. The results described in this article significantly expand the scientific knowledge about fractal-like principles of the structure of the molecular genetic system (see, for example, the data in [6–12]).
Acknowledgments. The authors are grateful to their colleagues M. He, Z. Hu, I. Stepanyan, V. Svirin, and G. Tolokonnikov for research assistance.
References
1. Losa, G.A., Merlini, D., Nonnenmacher, T.F., Weibel, E.R. (eds.): Fractals in biology and medicine. Mathematics and biosciences in interaction. MBI, Birkhäuser Basel, Basel (2002). https://doi.org/10.1007/978-3-0348-8119-7
2. Buldyrev, S.V., Goldberger, A.L., Havlin, S., Peng, C.-K., Stanley, H.E.: Fractals in biology and medicine: from DNA to the heartbeat. In: Bunde, A., Havlin, S. (eds.) Fractals in Science, pp. 49–88. Springer, Heidelberg (1994). https://doi.org/10.1007/978-3-642-77953-4_3
3. Dokukin, M.E., Guz, N.V., Woodworth, C.D., Sokolov, I.: Emergence of fractal geometry on the surface of human cervical epithelial cells during progression towards cancer. New J. Phys. 17, 033019 (2015)
4. Baish, J.W., Jain, R.K.: Fractals and cancer. Cancer Res. 60, 3683–3688 (2000)
5. Petoukhov, S.V., He, M.: Symmetrical analysis techniques for genetic systems and bioinformatics: advanced patterns and applications. IGI Global, USA (2009)
6. Jeffrey, H.J.: Chaos game representation of gene structure. Nucl. Acids Res. 18(8), 2163–2170 (1990)
7. Peng, C.K., et al.: Long-range correlations in nucleotide sequences. Nature 356, 168–170 (1992)
8. Peng, C.K., Buldyrev, S.V., Goldberger, A.L., Havlin, S., Sciortino, F., Simons, M., Stanley, H.E.: Fractal landscape analysis of DNA walks. Physica A 191(1–4), 25–29 (1992)
9. Pellionisz, A.J.: The principle of recursive genome function. Cerebellum 7, 348–359 (2008). https://doi.org/10.1007/s12311-008-0035-y
10. Pellionisz, A.J., Graham, R., Pellionisz, P.A., Perez, J.C.: Recursive genome function of the cerebellum: geometric unification of neuroscience and genomics. In: Manto, M., Gruol, D.L., Schmahmann, J.D., Koibuchi, N., Rossi, F. (eds.) Handbook of the Cerebellum and Cerebellar Disorders, pp. 1381–1423 (2012)
11. Perez, J.C.: Codon populations in single-stranded whole human genome DNA are fractal and fine-tuned by the golden ratio 1.618. Interdiscip. Sci. Comput. Life Sci. 2, 228–240 (2010). https://doi.org/10.1007/s12539-010-0022-0
12. Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy, T., Telling, A., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., Sandstrom, R., Bernstein, B., Bender, M.A., Groudine, M., Gnirke, A., Stamatoyannopoulos, J., Mirny, L.A., Lander, E.S., Dekker, J.: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science 326(5950), 289–293 (2009). https://doi.org/10.1126/science.1181369
13. Petoukhov, S.V.: Hyperbolic rules of the oligomer cooperative organization of eukaryotic and prokaryotic genomes. Preprints 2020050471, 95 (2020). https://doi.org/10.20944/preprints202005.0471.v2, https://www.preprints.org/manuscript/202005.0471/v2
14. Petoukhov, S.V.: Genomes symmetries and algebraic harmony in living bodies. Symm. Cult. Sci. 31(2), 222–223 (2020). https://doi.org/10.26830/symmetry_2020_2_222
15. Petoukhov, S.: Nucleotide epi-chains and new nucleotide probability rules in long DNA sequences. Preprints 2019040011, 41 (2019). https://www.preprints.org/manuscript/201904.0011/v2
16. Khan, R., Debnath, R.: Human distraction detection from video stream using artificial emotional intelligence. IJIGSP 12(2), 19–29 (2020)
17. Erwin, D.R.N.: Improving retinal image quality using the contrast stretching, histogram equalization, and CLAHE methods with median filters. IJIGSP 12(2), 30–41 (2020)
18. Niaz, M., Shuaib, M., Khalid, H., Khalilur, R., Shahjahan, A.: Design and development of an intelligent home with automated environmental control. IJIGSP 12(4), 1–14 (2020)
19. Arora, N., Ashok, A., Tiwari, S.: Efficient image retrieval through hybrid feature set and neural network. IJIGSP 11(1), 44–53 (2019)
20. Basavaraj, S.A., Naveen, N.M., Surendra, P.: Automated paddy variety recognition from color-related plant agro-morphological characteristics. IJIGSP 11(1), 12–22 (2019)
21. Mahtab, A., Akhand, M.A.H., Hafizur Rahman, M.M.: Recognizing bangla handwritten numeral utilizing deep long short term memory. IJIGSP 11(1), 23–32 (2019)
22. Petoukhov, S.V.: The rules of long DNA-sequences and tetra-groups of oligonucleotides. Version 6 (22 May 2020). arXiv:1709.04943v6
23. Petoukhov, S.V., Petukhova, E.S., Svirin, V.I.: Symmetries of DNA alphabets and quantum informational formalisms. Symm. Cult. Sci. 30(2), 161–179 (2019). https://doi.org/10.26830/symmetry_2019_2_161
24. McFadden, J., Al-Khalili, J.: The origins of quantum biology. Proc. Royal Soc. A 474(2220), 1–13 (2018). https://doi.org/10.1098/rspa.2018.0674
Author Index

A
Abramov, Maxim V., 101
Alexander, Dobrovolskiy, 111
Alifov, Alishir A., 1
Aristov, Vladimir V., 29

C
Chernoivanov, V., 152

D
Deart, Vladimir Yu., 10
Dosko, S. I., 66, 74

F
Firsov, Georgy I., 39

G
Gavrilenkov, Sergey I., 126
Gavriushin, S. S., 91
Gavriushin, Sergey S., 126
Gorshenin, Andrey, 82

I
Izvozchikova, Vera, 57
Izvozchikova, Vera V., 49

K
Khubbatulin, Mark E., 162
Klychnikov, Vladimir V., 162
Korepanova, Anastasia A., 101
Krasnova, Irina A., 10
Kucherov, K. V., 66, 74
Kuzmin, Victor, 82

L
Lapin, Dmitriy V., 162
Liu, Bin, 66

M
Mankov, Vladimir A., 10
Mazurov, M. E., 138
Meshchikhin, I. A., 91
Mezhenin, Aleksandr, 57
Mikryukov, A. A., 138

O
Oliseenko, Valerii D., 101

P
Petoukhov, S., 152
Petoukhov, Sergey V., 175
Petr, Mikheev, 111
Polyakov, Vladimir, 57
Prishhepa, Angelina, 57

S
Sergey, Dudnikov, 111
Shardakov, Vladimir M., 49
Spasenov, A. Yu., 66, 74
Statnikov, Isak N., 39
Stepanyan, Ivan V., 29

T
Tokareva, Marina A., 49
Tolokonnikov, G., 152
Tsoi, Yu., 152

U
Utenkov, V. M., 66, 74

V
Verevkin, Vladimir V., 175
Vykhovanets, Valery S., 20

Y
Yuganov, E. V., 74

Z
Zykov, Anatoly, 57

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): CSDEIS 2020, AISC 1402, pp. 185–186, 2021. https://doi.org/10.1007/978-3-030-80478-7