Lecture Notes on Data Engineering and Communications Technologies 121
Zhengbing Hu Sergei Gavriushin Sergey Petoukhov Matthew He Editors
Advances in Intelligent Systems, Computer Science and Digital Economics III
Lecture Notes on Data Engineering and Communications Technologies Volume 121
Series Editor Fatos Xhafa, Technical University of Catalonia, Barcelona, Spain
The aim of the book series is to present cutting edge engineering approaches to data technologies and communications. It will publish latest advances on the engineering task of building and deploying distributed, scalable and reliable data infrastructures and communication systems. The series will have a prominent applied focus on data technologies and communications with aim to promote the bridging from fundamental research on data science and networking to data engineering and communications that lead to industry products, business knowledge and standardisation. Indexed by SCOPUS, INSPEC, EI Compendex. All books published in the series are submitted for consideration in Web of Science.
More information about this series at https://link.springer.com/bookseries/15362
Zhengbing Hu · Sergei Gavriushin · Sergey Petoukhov · Matthew He
Editors
Advances in Intelligent Systems, Computer Science and Digital Economics III
Editors
Zhengbing Hu, Faculty of Applied Mathematics, National Technical University of Ukraine, Kyiv, Ukraine
Sergei Gavriushin, Automated Production Computer Systems, Bauman Moscow State Technical University, Moscow, Russia
Sergey Petoukhov, Mechanical Engineering Research Institute, Russian Academy of Sciences, Moscow, Russia
Matthew He, Halmos College of Arts and Sciences, Nova Southeastern University, Davie, FL, USA
ISSN 2367-4512 ISSN 2367-4520 (electronic)
Lecture Notes on Data Engineering and Communications Technologies
ISBN 978-3-030-97056-7 ISBN 978-3-030-97057-4 (eBook)
https://doi.org/10.1007/978-3-030-97057-4

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland.
Preface
The book comprises high-quality refereed research papers presented at the Third International Symposium on Computer Science, Digital Economy and Intelligent Systems (CSDEIS2021), held in Moscow, Russia, on December 25–26, 2021, and organized jointly by the Mechanical Engineering Research Institute of the Russian Academy of Sciences, Moscow State Technical University, and the International Research Association of Modern Education and Computer Science. The topics discussed in the book include state-of-the-art papers in computer science and its technological applications; intelligent systems and intellectual approaches; and digital economics and methodological approaches. It is an excellent source of references for researchers, graduate students, engineers, management practitioners, and undergraduate students interested in computer science and its applications in engineering and management.

The development of artificial intelligence systems and their applications in various fields is among the most urgent tasks of modern science and technology. This development is based on the rapid progress of computer technologies, quantum computing, and digital communications, which are changing the lives and professional activities of people around the world. In particular, these changes lead to deep transformations of economic and financial activities and to the formation of the so-called digital economy.

One of the many scientific and technological fields in which these changes are especially significant is mechanical engineering, whose aspects are represented in many articles in this book. It is the discipline that applies engineering, physics, engineering mathematics, and materials science principles to design, analyze, manufacture, and maintain mechanical systems. It is one of the oldest and broadest of the engineering disciplines, involving the design, production, and operation of machinery. In particular, mechanical engineering uses tools such as CAD, CAM, and product life cycle management to design and analyze manufacturing plants, industrial equipment and machinery, heating and cooling systems, transport systems, aircraft, watercraft, robotics, medical devices, weapons, and others. Today, mechanical engineers are pursuing developments in such areas as composites, mechatronics, and nanotechnology. The field also overlaps with aerospace engineering, metallurgical engineering, civil engineering, electrical engineering, manufacturing engineering, chemical engineering, industrial engineering, and other engineering disciplines to varying degrees. Mechanical engineers may also work in the fields of biomedical engineering, specifically in biomechanics, transport phenomena, biomechatronics, bionanotechnology, and the modeling of biological systems.

The training of strong specialists in the mentioned fields can be improved and accelerated through the close cooperation of universities preparing young specialists and leading research institutes with distinguished scientists. International cooperation plays a great role in the corresponding scientific and technological progress. Taking this into account, the Third International Symposium on Computer Science, Digital Economy and Intelligent Systems was conducted on December 25–26, 2021, in Moscow, Russia. The organization of such a symposium is one example of growing Russian-Chinese cooperation in different fields of science and education.

The best contributions to the conference were selected by the program committee from all submissions for inclusion in this book. We are grateful to Springer-Verlag and to Fatos Xhafa, the editor responsible for the series "Lecture Notes on Data Engineering and Communications Technologies," for their great support in publishing the conference proceedings.

Zhengbing Hu
Sergei Gavriushin
Sergey Petoukhov
Matthew He
Organization
Conference Organizers and Supporters

Moscow State Technical University, Russia
Mechanical Engineering Research Institute of the Russian Academy of Sciences, Russia
International Research Association of Modern Education and Computer Science, Hong Kong
Polish Operational and Systems Society, Poland
Hubei University of Technology, China
International Center of Informatics and Computer Science, Ukraine
Contents
Fuzzy Assessment of the Sustainability of the Agricultural Engineering Industry (Nataliya Mutovkina) . . . 1
Methodology for Assessing the Risk Level of a Technical Re-equipment Project Based on Fuzzy Logic (Nataliya Mutovkina) . . . 11
Digital System for Monitoring and Management of Livestock Organic Waste (A. Yu. Izmailov, A. S. Dorokhov, A. Yu. Briukhanov, V. D. Popov, E. V. Shalavina, M. Yu. Okhtilev, and V. N. Koromyslichenko) . . . 22
Research on RPG Game Recommendation System (Shenghui Li and Wen Cheng) . . . 34
LSTM-Based Load Prediction for Communication Equipment (Rui Guo, Yongjun Peng, Zhipeng Gong, Anping Wan, and Zhengbing Hu) . . . 47
Application of MEA-LSTM Neural Network in Stock Balance Prediction (Zhongzhen Yan, Kewei Zhou, Xinyuan Zhu, and Hao Chen) . . . 60
Future Design Forearm Prosthesis Control System (D. A. Fonov and E. G. Korzhov) . . . 72
Map of Virtual Promotion of a Product (Sergey Orekhov, Andrii Kopp, and Dmytro Orlovskyi) . . . 81
Analysis of the Technical Condition of Agricultural Machinery Using Neural Networks (Efim V. Pestryakov, Alexander S. Sayapin, Mikhail N. Kostomakhin, and Nikolai A. Petrishchev) . . . 92
Information Spaces for Big Data Problems in Fuzzy Bayesian Decision Making (Peter Golubtsov) . . . 102
Digital Twins as an Advanced Tool for the Approbation of Longevity Estimation Methods (Irina V. Gadolina and Irina M. Petrova) . . . 115
Algorithms Optimization for Procedural Terrain Generation in Real Time Graphics (Aleksandr Mezhenin and Vera Izvozchikova) . . . 125
Business Information Systems for Innovative Projects Evaluation at the Conceptual Design Stage (Dmitry Rakov) . . . 138
Analysis of Nonlinear Deformation of Elastic-Plastic Membranes with Physical Non-linearity (Sergey A. Podkopaev and Sergey S. Gavriushin) . . . 150
The Universal Multilevel Relationship Between the Stochastic Organization of Genomes and the Deterministic Structure of DNA Alphabets (Sergey V. Petoukhov) . . . 164
Projecting an Educational Program for the Preparation of Bachelors in the Profile "Digital Economy" (Nataliya Mutovkina) . . . 175
RON Loss Prediction Based on Model of Light Gradient Boosting Machine (Haikuan Yang, Hua Yang, Junxiong Wang, Kang Zhou, and Bing Cai) . . . 187
Application of Emergency Grain Transportation Based on Improved Ant Colony Algorithm (Yongqing Zhu, Hua Yang, Li Feng, Haikuan Yang, and Kang Zhou) . . . 200
Research on Grain Pile Temperature Prediction Based on CNN-GRU Neural Network (Weihu Liu, Shuo Liu, Yang Wang, Guangbing Li, and Litao Yu) . . . 214
Research on Prediction Model of Grain Temperature Based on Hybrid Model (Yang Wang, Shuo Liu, Weihu Liu, Yongfu Wu, and Wenbing Ma) . . . 227
Scenario Forecasting of University Development Based on a Combined Cognitive Model (Andrey Mikryukov and Mikhail Mazurov) . . . 241
Author Index . . . 255
Fuzzy Assessment of the Sustainability of the Agricultural Engineering Industry

Nataliya Mutovkina
Tver State Technical University, Tver 170012, Russia
Abstract. The article proposes a method of fuzzy integrated assessment of the sustainability of the agricultural engineering industry. The technique makes it possible to work with a set of heterogeneous indicators. Agricultural engineering is one of the main elements of the machine-building sector of the economy and can be called a pillar of the development of agriculture: a strong direct relationship exists between the levels of development of agriculture and of agricultural engineering. The degree of development of the machine-building industry depends on the demand for agricultural machinery, the growing intensity of agricultural production, the increasing number of farms, and their efficiency. In turn, sustainable, well-developed agricultural production is the key to the food security of the population; this is the relevance of the study. The method of fuzzy integrated assessment allows one not only to estimate the sustainability of agricultural engineering by heterogeneous criteria but also to make management decisions under conditions of missing information. The peculiarity of the method is that it works with three groups of indicators: economic, social, and technological. The technique does not involve the use of time series or trend detection. The process is based on the normalization of sustainability indicators at the regional, federal, and global levels. The boundaries of the terms of the sustainability indicators are fuzzy variables that vary depending on the overall state of the economy. The methodology provides for the possibility of taking into account the opinions of experts. The method is quite universal and easily adaptable for assessing the stability of other sectors of the economy and their components. Based on the analysis of the values of individual sustainability indicators relative to their falling within the desired boundaries, it is possible to propose specific measures to improve the state of the industry and enhance its development. This is the significance of the research performed.

Keywords: Fuzzy assessment · Assessment methodology · Sustainability of the industry · Agricultural engineering · Expert assessments · Management decisions
1 Introduction

In itself, the concept of "sustainability" is somewhat vague, and the degree of stability is different for each individual system. Development is usually classified by levels: non-sustainable development, below-average sustainability, conditionally sustainable development, and sustainable development. At the same time, the borders between these levels are themselves blurred and individual. When considering a complex system, one can judge how steadily it develops only on the basis of data on its state over a certain representative period. As a rule, a gradually increasing trend in key indicators indicates sustainable growth; an intermittent tendency shows unstable development; a decreasing trend signals destruction; and a constant level indicates stagnation of the system. However, such data are often difficult to collect or simply unavailable. Such situations arise when a trade secret is involved, or when the system was formed recently and data on its state and the results of its work are not yet available or are not sufficient to identify a trend. In these cases, it is advisable to conduct the assessment by comparing the values of indicators with industry averages and with average and relative values for the region, the country, and the world. Experts can also be involved to adjust the comparison results by taking weighting factors into account. The weights characterize the impact of the system and its subsystems on other systems, the significance of the system and the results of its work for the population and the economy as a whole, the sufficiency of the development level for the local economy, etc.

The work aims to form a method for a fuzzy comprehensive assessment of the sustainability of agricultural engineering. The technique allows measures to improve the industry's position to be proposed on the basis of the estimation results. The problem of assessing the sustainability of the development of agricultural engineering is the problem of estimating a complex socio-economic system based on a scheme of heterogeneous indicators. The mathematical implementation of the assessment consists of several stages, which will be discussed in detail later.
1.1 Review of the Literature and Current Assessment Methods
1.1.1 Development of the Agricultural Engineering Industry in Russia and the World

Enterprises that specialize in the production of agricultural machinery and equipment for mechanizing all types of farm work (tillage, sowing, harvesting) are the basic units of agricultural engineering. Therefore, one of the essential conditions for ensuring food security at all economic levels (international, federal, regional, and local/municipal) is the development of this branch of engineering. The analysis of publications devoted to the assessment of the state and development of the agricultural engineering industry showed the following:

1) The most popular types of agricultural machinery in Russia and the world are tractors, self-propelled machines, and trailed and mounted machines [1].
2) The equipment level of agriculture in Russia, in comparison, for example, with Canada (similar in soil and climatic conditions, yield, and field contours), remains low: in Russia there are three tractors per 1,000 hectares of arable land, while in Canada there are 16. A similar situation is observed for harvesting machines [2]. Moreover, production indicators in the Russian agricultural machinery industry are highly volatile, with positive and negative results alternating from year to year [3].
3) The cost of production of agricultural machinery in the Russian Federation is currently higher than in the main supplier countries of these products. This is due
to general economic factors: a high tax burden, an inflated cost of borrowed funds, and, as a result, an increased cost of raw materials and components. In addition, foreign producers receive significant state support in the form of export subsidies (direct and indirect), tax incentives and deductions, and industrial subsidies [1, 4].
4) In Russia, the production of agricultural machinery is mainly focused on the domestic market. The priority markets for increasing export supplies are therefore historical partners, such as the CIS and the EU, and new destinations: North and South Africa, Australia, China, and North America [1, 5].
5) Over the past six years, the peak of agricultural machinery production in Russia was recorded in 2017. However, in 2018 the positive dynamics could not be maintained, and the industry reduced output to the level of four years earlier [3].

1.1.2 Methods for Assessing the Effectiveness of Industrial Development

In [6], the assessment of the level of sustainable development of the machine-building industry is based on the estimation of three groups of indicators: financial, production, and innovation sustainability. This method assumes reducing the indicators to a homogeneous form and considering the coefficients of production, economic, and innovation stability in dynamics. The overall level of the indexes of sustainable development of the machine-building industry is formed as the weighted geometric mean of the complex production, financial, and innovation stability indicators. However, to apply this technique, one needs a representative sample of stability indicator values over time; moreover, the technique does not provide for working with experts. In [7], the authors used indicators that characterize the main components of industrial potential: material and technical, innovation and investment, and others. At the same time, the authors judge the potential by the change in its key indicators, again in dynamics.

In [4], for the economic assessment of the quality of equipment, it is proposed to use an optimal criterion for the quality of work performed by the unit in the appropriate conditions of use, based on the calculation of compensating costs and equalizing accruals relative to the operating time of the machine. However, this does not address assessing the sustainability of the entire industry. In [8], the evaluation method is based on the use of similarity measures: the article proposes a new similarity measure for collaborative recommender systems based on Newton's law of universal gravitation, interpreting mutual mass and relative distance through the rating data of the user-item matrix. This method is not suitable for assessing the sustainability of agricultural engineering because the required statistical data are lacking. The method of pairwise comparison of alternatives proposed by T. Saati [9] has proven itself well; since its publication, it has been successfully used in evaluation studies and continues to be used to this day [10, 11].
1.2 The Possibilities of Fuzzy Logic in Assessing the Sustainability of Development
Methods of fuzzy logic and fuzzy sets have proven themselves well in solving estimation problems where the initial data are heterogeneous, differently named indicators [12–14]. Often it is necessary to rank not the values of the indicators themselves but their imprecise estimates. Various methods can be used for this purpose; for example, [15] proposes a new ranking procedure based on defuzzification that stems from the concepts of the geometric mean and the height of a fuzzy number. Fuzzy-multiple modifications offer broader opportunities for capturing the uncertainty of external conditions and expert assessments, as well as greater compactness, clarity, and ease of use, in contrast to precise data-mining methods. Among the most basic elements of fuzzy logic is the concept of a linguistic variable, whose values are words or sentences in a natural or formal language. The definition of this variable uses the concept of a fuzzy variable described by a triple $(\alpha, X, A)$, where $\alpha$ is the name of the variable, $X$ is a universal set that is the domain of definition of $\alpha$, and $A$ is a fuzzy set on $X$ that describes the constraints on $\alpha$. A linguistic variable roughly describes complex phenomena and processes that cannot be defined by precise quantitative characteristics, and fuzzy sets play the same role in defining this variable as words or sentences do in natural language. Therefore, fuzzy set theory and fuzzy logic have considerable flexibility, for example, in describing and analyzing complex, not fully defined economic tasks. The usual way of defining a fuzzy set $A \subseteq X$ is through the membership function $\mu_A(x)$, $x \in X$, which takes values in the interval [0, 1]. With its help, the fuzzy set $A$ is defined as the set of ordered pairs $\{(x, \mu_A(x))\}$, $x \in X$ [12]. When solving applied problems, fuzziness is first introduced into the source data (fuzzification), then fuzzy inference is performed, and finally the required solution is found with the help of defuzzification. Methods for the defuzzification of a fuzzy variable are discussed in detail in [16].
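To make these notions concrete, here is a minimal Python sketch (the term set, the triangular parameters, and the function names are illustrative assumptions, not taken from the paper) of fuzzification with triangular membership functions and centre-of-gravity defuzzification:

```python
# A minimal sketch of fuzzification and centroid defuzzification for one
# linguistic variable. The term set and the grid are illustrative assumptions.

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Term set of a linguistic variable "indicator level" on the universe [0, 1].
terms = {
    "very low": (0.0, 0.0, 0.33),   # left-shouldered triangle
    "low":      (0.0, 0.33, 0.66),
    "average":  (0.33, 0.66, 1.0),
    "high":     (0.66, 1.0, 1.0),   # right-shouldered triangle
}

def fuzzify(x):
    """Degrees of membership of a crisp value x in every term."""
    return {name: tri(x, *abc) for name, abc in terms.items()}

def defuzzify_centroid(mu, grid_steps=100):
    """Centre of gravity (COG) of the fuzzy set obtained by clipping each
    term at its activation degree mu[name] and taking the pointwise max."""
    num = den = 0.0
    for k in range(grid_steps + 1):
        x = k / grid_steps
        m = max(min(mu[name], tri(x, *abc)) for name, abc in terms.items())
        num += x * m
        den += m
    return num / den if den else 0.0

print(fuzzify(0.5))                      # partial membership in "low"/"average"
print(defuzzify_centroid(fuzzify(0.5)))  # crisp value near 0.5
```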
2 Method of Fuzzy Assessment of Development Sustainability

2.1 Creating a List of Indicators

At the first stage, a list of the most significant indicators and criteria of sustainability is formed and the indicators are distributed into groups (Fig. 1). Given the need to take into account the specifics of the industry, it was decided to divide all the criteria into three groups: economic, social, and technological. The following indicators were recognized as the most informative.

Economic indicators: k11 is the number of enterprises, as a % of the total number of enterprises; k12 is the production index, as a % of the previous year; k13 is the investment in fixed assets, as a % of total investment; k14 is the share of profitable organizations, as a % of the total number of organizations in the industry; k15 is the profitability of products, %; k16 is the return on assets, %; k17 is the share of repaid debt on the obligations of enterprises (at the end of the year), %; k18 is the share of repaid receivables (at the end of the year), %.

Social indicators: k21 is the difference between the number of employed and those who moved to another job, in %; k22 is the average accrued monthly salary, rub.; k23 is the number of tractors per 1,000 hectares of arable land, pcs.; k24 is the load of arable land per tractor, ha; k25 is the average number of accidents per 1,000 employees per year.

Technological indicators: k31 is the degree of validity of machinery and equipment, %; k32 is the degree of suitability of vehicles, %; k33 is the proportion of fully updated machinery and equipment, %; k34 is the proportion of fully updated vehicles, %; k35 is the level of use of the average annual production capacity of enterprises, in %.
Fig. 1. System of indicators of the sustainability of agricultural engineering development
2.2 Ranking of Indicators by Importance

The weight coefficients of the importance of the indicators $k_{ij}$, $i = \overline{1, n}$, $j = \overline{1, m}$, and of their groups are determined on the basis of expert assessments. Here $n$ is the number of groups, and $m$ is the number of indicators in each group. The ranking is based on the equality:

$$\sum_{j=1}^{m} \omega_{ij} = 1, \quad 0 < \omega_{ij} < 1 \qquad (1)$$
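For instance, raw expert importance scores can be normalized so that they satisfy constraint (1); a minimal sketch (the function name and the example scores are assumed for illustration):

```python
# Normalize raw expert importance scores so that the resulting weights
# satisfy constraint (1): all positive and summing to one within a group.

def normalize_weights(raw_scores):
    total = sum(raw_scores)
    if total <= 0 or any(s <= 0 for s in raw_scores):
        raise ValueError("all raw importance scores must be positive")
    return [s / total for s in raw_scores]

# Example: hypothetical expert scores for the five social indicators k21..k25.
print(normalize_weights([3, 5, 4, 2, 1]))  # -> weights summing to 1.0
```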
For evaluation, it is recommended to use from four to ten indicators in each group. Fewer than four indicators will not give a sufficiently informative assessment, while exceeding this number will lead to unnecessary complication of the evaluation model.

2.3 Determination of Acceptable and Target Values of Indicators; Linguistic Variables

The acceptable values $x^{v}_{k_{ij}}$ and the target values $\bar{x}_{k_{ij}}$ of the stability indicators are determined by experts. The target values are the desirable ones that ensure maximum sustainability of the industry. This is necessary in order to further define the membership functions for each indicator. The following linguistic variables are also introduced: $\alpha$ is the level of indicator $x_{k_{ij}}$, whose term set $\alpha = \{\alpha_1, \alpha_2, \alpha_3, \alpha_4\}$ consists of the following terms: $\alpha_1$ is a very low level of the indicator, $\alpha_2$ is a low level, $\alpha_3$ is an average level, and $\alpha_4$ is a high level; $Y$ is the comprehensive assessment of the sustainability of the industry, $Y_1$ is the estimate of the financial situation of the industry, $Y_2$ is the assessment of the social efficiency of the industry, and $Y_3$ is the estimate of the technological equipment of the industry. The term set of the variables $Y, Y_1, Y_2, Y_3$ is $\beta = \{\beta_1, \beta_2, \beta_3, \beta_4\}$, where $\beta_1$ is non-sustainable development, $\beta_2$ is stability below average, $\beta_3$ is conditionally stable, and $\beta_4$ is stable.
2.4 Setting Membership Functions

A membership function is defined for each sustainability indicator, and its choice is primarily determined by the nature of the stability indicator. If an increase in the value of the indicator signifies an increase in the sustainability of development, the membership function has the form:

$$f^{\uparrow}\left(x_{k_{ij}}; x^{v}_{k_{ij}}, \bar{x}_{k_{ij}}\right) =
\begin{cases}
0, & x_{k_{ij}} \le x^{v}_{k_{ij}} \\
\dfrac{x_{k_{ij}} - x^{v}_{k_{ij}}}{\bar{x}_{k_{ij}} - x^{v}_{k_{ij}}}, & x^{v}_{k_{ij}} < x_{k_{ij}} < \bar{x}_{k_{ij}} \\
1, & x_{k_{ij}} \ge \bar{x}_{k_{ij}}
\end{cases} \qquad (2)$$

If an increase in the value of the indicator signifies a decrease in the sustainability of development, the membership function has the form:

$$f^{\downarrow}\left(x_{k_{ij}}; x^{v}_{k_{ij}}, \bar{x}_{k_{ij}}\right) =
\begin{cases}
0, & x_{k_{ij}} \ge x^{v}_{k_{ij}} \\
\dfrac{x^{v}_{k_{ij}} - x_{k_{ij}}}{x^{v}_{k_{ij}} - \bar{x}_{k_{ij}}}, & \bar{x}_{k_{ij}} < x_{k_{ij}} < x^{v}_{k_{ij}} \\
1, & x_{k_{ij}} \le \bar{x}_{k_{ij}}
\end{cases} \qquad (3)$$
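A direct transcription of formulas (2) and (3) into code might look as follows (a sketch; the names f_up and f_down are mine, not the paper's):

```python
# Normalization of a raw indicator value to [0, 1] per formulas (2) and (3).
# x_v is the acceptable value, x_t the target value of the indicator.

def f_up(x, x_v, x_t):
    """Formula (2): growth of the indicator means growth of sustainability."""
    if x <= x_v:
        return 0.0
    if x >= x_t:
        return 1.0
    return (x - x_v) / (x_t - x_v)

def f_down(x, x_v, x_t):
    """Formula (3): growth of the indicator means decline of sustainability."""
    if x >= x_v:
        return 0.0
    if x <= x_t:
        return 1.0
    return (x_v - x) / (x_v - x_t)

# Example with the profitability indicator k15 (acceptable 3.0, target 8.0):
print(f_up(3.7, 3.0, 8.0))   # 0.14, cf. Region 1 in Sect. 3
```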
2.5 Integrated Assessment of Development Sustainability

The integrated assessment of the sustainability of agricultural engineering development is determined by the formula:

$$Y = \frac{\sum_{j=1}^{3} \omega_j Y_j}{\sum_{j=1}^{3} \omega_j} \qquad (4)$$

where $\omega_j$ is the weight of a group of stability indicators and $Y_j$ is the stability score for the group of parameters:

$$Y_j = \frac{\sum_{i=1}^{n} \omega_{ij}\, x_{k_{ij}}}{\sum_{i=1}^{n} \omega_{ij}} \qquad (5)$$

where $x_{k_{ij}}$ is the normalized value of indicator $k_{ij}$ obtained from (2) or (3). In practice, the ideal variant, in which all the indicators are sustainable, is usually not realized. The sustainability of development can instead be characterized by parameters whose values lie near the sustainability area. Therefore, one can speak of a scale of sustainability of development (Table 1).
Table 1. The scale of sustainability of the development of the agricultural engineering industry

| Sustainability interval | Verbal assessment |
|---|---|
| 0.00 ≤ Y < 0.25 | Not sustainable development |
| 0.25 ≤ Y < 0.50 | Below-average stability |
| 0.50 ≤ Y < 0.75 | Conditionally stable |
| 0.75 ≤ Y ≤ 1.00 | Stable |

For defuzzification, one of the methods discussed in detail in [16, 17] can be adopted. Defuzzification is the procedure of producing a crisp value from a fuzzy set. There are several types of defuzzification methods acting on different criteria: maxima methods such as FOM, MOM, or LOM; distribution methods such as COG, WA, WFM, or FM; and area methods such as COA and ECOA, based on fuzzy set geometry.
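A minimal sketch of the aggregation in formulas (4) and (5) together with the verbal scale of Table 1 (all numeric inputs in the example are illustrative, not taken from the paper):

```python
# Weighted aggregation of normalized indicators per formulas (4)-(5) and
# mapping of the result onto the verbal scale of Table 1.

def group_score(weights, normalized_values):          # formula (5)
    num = sum(w * x for w, x in zip(weights, normalized_values))
    return num / sum(weights)

def integrated_score(group_weights, group_scores):    # formula (4)
    num = sum(w * y for w, y in zip(group_weights, group_scores))
    return num / sum(group_weights)

def verbal(y):                                        # Table 1
    if y < 0.25:
        return "not sustainable development"
    if y < 0.50:
        return "below-average stability"
    if y < 0.75:
        return "conditionally stable"
    return "stable"

y_groups = [group_score([1, 1, 1], [0.2, 0.5, 0.8]),  # economic (equal weights)
            group_score([1, 1], [0.1, 0.3]),          # social
            group_score([1, 1], [0.4, 0.6])]          # technological
y = integrated_score([0.4, 0.3, 0.3], y_groups)
print(round(y, 2), verbal(y))                         # -> 0.41 below-average stability
```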
3 Experimental Part

The initial data for the experimental calculations were taken from official sources of statistical information [18–20] and others. The level of development of agricultural engineering was assessed for two regions of the Russian Federation and for Russia as a whole; the numerical values of the sustainability indicators for 2019 were considered (Table 2).

Table 2. Initial data for evaluation, acceptable and target values

| Indicator | Region 1 | Region 2 | Federal level | x^v (acceptable) | x̄ (target) |
|---|---|---|---|---|---|
| k11 | 0.15 | 0.40 | 0.38 | 0.10 | 0.35 |
| k12 | 104.7 | 112.3 | 110.5 | 100.0 | 112.0 |
| k13 | 0.11 | 0.40 | 0.27 | 0.10 | 0.50 |
| k14 | 48.7 | 72.1 | 68.5 | 35.0 | 100.0 |
| k15 | 3.7 | 6.1 | 4.6 | 3.0 | 8.0 |
| k16 | 1.3 | 2.8 | 3.2 | 1.2 | 5.0 |
| k17 | 62.30 | 89.22 | 74.16 | 50.00 | 100.00 |
| k18 | 74.57 | 90.95 | 82.07 | 60.00 | 100.00 |
| k21 | -2.0 | -4.8 | 1.2 | 0.0 | 3.0 |
| k22 | 38 752 | 44 128 | 48 524 | 43 000 | 60 000 |
| k23 | 2 | 3 | 7 | 5 | 12 |
| k24 | 412 | 345 | 396 | 300 | 105 |
| k25 | 9 | 11 | 10 | 0 | 0 |
| k31 | 37.7 | 44.2 | 58.8 | 60.0 | 100.0 |
| k32 | 42.8 | 41.3 | 62.2 | 60.0 | 100.0 |
| k33 | 7.3 | 11.2 | 18.7 | 15.0 | 30.0 |
| k34 | 9.5 | 13.0 | 22.4 | 15.0 | 30.0 |
| k35 | 16.3 | 18.7 | 35.5 | 30.0 | 60.0 |
The values of the membership functions were calculated using formulas (2) and (3). With equal weights $\omega_{ij}$ within the groups, the estimates for the groups of indicators computed using formula (5) are presented in Table 3.

Table 3. Sustainability assessments

| Output linguistic variables | Region 1 | Region 2 | Federal level |
|---|---|---|---|
| Y1 | 0.20 | 0.74 | 0.59 |
| Y2 | 0.00 | 0.01 | 0.20 |
| Y3 | 0.00 | 0.00 | 0.20 |
| Y | 0.08 | 0.30 | 0.35 |
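As a cross-check, the following self-contained sketch (assuming equal in-group weights, as stated above) reproduces the score Y1 = 0.20 for Region 1 from the economic indicators of Table 2:

```python
# Reproducing Y1 for Region 1 (Table 3) from the Table 2 data. All eight
# economic indicators are increasing, so formula (2) applies to each.

def f_up(x, x_v, x_t):  # formula (2)
    return 0.0 if x <= x_v else 1.0 if x >= x_t else (x - x_v) / (x_t - x_v)

# (value for Region 1, acceptable x_v, target x_t) for k11..k18
economic = [(0.15, 0.10, 0.35), (104.7, 100.0, 112.0), (0.11, 0.10, 0.50),
            (48.7, 35.0, 100.0), (3.7, 3.0, 8.0), (1.3, 1.2, 5.0),
            (62.30, 50.0, 100.0), (74.57, 60.0, 100.0)]

normalized = [f_up(*row) for row in economic]
y1 = sum(normalized) / len(normalized)  # formula (5) with equal weights
print(round(y1, 2))                     # -> 0.2
```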
The complex assessment of the sustainability of agricultural engineering development is determined by formula (4) under the assumption that $\omega_1 = 0.4$, $\omega_2 = 0.3$, $\omega_3 = 0.3$. Thus, the first region is characterized by non-sustainable development of the agricultural engineering industry, while in the second region and in the country as a whole stability is below average. For each state of the resulting variable, recommendations and measures aimed at improving the sustainability of the agricultural engineering industry were determined; they are shown in Table 4.

Table 4. Measures to strengthen the sustainability of the development of the agricultural engineering industry

| Verbal assessment | Possible measures to strengthen the sustainability of the industry's development |
|---|---|
| Not sustainable development | Improvement of the personnel training system for the agricultural engineering industry. Strengthening the financing of the industry through state subsidies. Diversification of production at the main supplier-dealers of large Russian manufacturers of agricultural machinery. Development and implementation of innovations in the industry to improve the quality of equipment |
| Below-average stability | Co-financing and subsidizing agricultural producers, providing financial assistance in the purchase of expensive equipment. Stimulating the growth of investments in research and development work. Development of new types of competitive agricultural machinery |
| Conditionally stable | Implementation of investment projects related to agriculture: construction of asphalt roads in rural areas (which will save on equipment repair costs), development of the trade-in mechanism, and stimulation of the production of components for agricultural machinery. Ensuring equal competitive conditions with foreign manufacturers of agricultural machinery |
| Stable | Investing in the development of particular areas of cattle breeding and crop production, taking into account the region's peculiarities and natural and climatic conditions, which will increase the justified demand for agricultural machinery. Creating favorable conditions for the development of exports of agricultural machinery and equipment |
Based on the evaluation results in the considered example, a plan should be formed from the measures of the first and second groups.
4 Summary and Conclusion

The advantages of the proposed method are as follows:

1. The method makes it possible to handle heterogeneous indicators measured on different scales.
2. The methodology does not require consideration of the initial indicators in dynamics and does not involve trend analysis.
3. The methodology involves a comparative analysis of the industry's regional, federal, and global levels of development. At the same time, the boundaries of the terms of the stability indicators are fuzzy variables that vary depending on the general state of the economy.
4. The methodology provides for the possibility of taking into account the opinions of experts.
5. The methodology is quite universal and easily adaptable for assessing the sustainability of other sectors of the economy and their components.
6. Based on the analysis of the values of individual sustainability indicators relative to their falling within certain boundaries, it is possible to propose specific measures to improve the state of the industry and enhance its development.
7. To implement the methodology, specialized software can be developed that determines the sustainability of the development of any other branch of the economy.

Thus, the developed methodology has shown its applicability for assessing and ranking the sustainability of agricultural machinery production in the regions of the Russian Federation using a fuzzy-multiple aggregation of normalized estimates of the sustainability of its three subsystems: economic, social, and technological.
References

1. Strategy for the development of agricultural engineering in Russia for the period up to 2030, approved by the Order of the Government of the Russian Federation of July 7, 2017, N 1455-p. https://minpromtorg.gov.ru/common/upload/files/strategy_tll_2030.pdf. Accessed 06 Apr 2021
2. Food and Agriculture Organization Corporate Statistical Database (FAOSTAT). http://www.fao.org/faostat/en/. Accessed 30 Mar 2021
3. Butov, A.M.: The market of agricultural machines-2019. National Research University Higher School of Economics, 87 p. (2019). https://dcenter.hse.ru/data/2019/12/18/1523096077/Market%20agricultural%20machines-2019.pdf. Accessed 03 Apr 2021
4. Kovaleva, Ye.V.: Assessment of agricultural machinery quality under its full and partial reuse. Agric. Eng. 3(97), 44–49 (2020). (In Rus.). https://doi.org/10.26897/2687-1149-2020-3-44-49
5. Mazilov, E.A., Demidova, O.S.: Trends of the agricultural machinery market in Russia. Econ. Bus. Theory Pract. 11–2(57), 72–76 (2019). (In Rus.)
6. Merenkov, A.O., Medvedeva, E.V.: Assessment of the sustainability of the development of the machine-building industry in the region. Regional Econ. Issues 2(43), 96–100 (2020). (In Rus.)
7. Kozlova, T.M., Boyko, O.G., Paltseva, G.N.: Assessment of the industrial potential of the regions of the Central Federal District. Bull. Tver State Univ. Ser. Econ. Manag. 3, 132–142 (2018). (In Rus.)
8. Verma, V., Aggarwal, R.K.: A new similarity measure based on gravitational attraction for improving the accuracy of collaborative recommendations. Int. J. Intell. Syst. Appl. (IJISA) 12(2), 44–53 (2020). https://doi.org/10.5815/ijisa.2020.02.05
9. Saati, T.: Decision-making. Method of the analysis of hierarchies, 278 p. Radio and Communication, Moscow (1993)
10. Listyaningsih, V., Utami, E.: Decision support system performance-based evaluation of village government using AHP and TOPSIS methods: Secang sub-district of Magelang regency as a case study. Int. J. Intell. Syst. Appl. (IJISA) 10(4), 18–28 (2018). https://doi.org/10.5815/ijisa.2018.04.03
11. Alguliyev, R.M., Nabibayova, G.Ch., Abdullayeva, S.R.: Evaluation of websites by many criteria using the algorithm for pairwise comparison of alternatives. Int. J. Intell. Syst. Appl. (IJISA) 12(6), 64–74 (2020). https://doi.org/10.5815/ijisa.2020.06.05
12. Zadeh, L.A.: Fuzzy sets. Inf. Control 8, 338–353 (1965)
13. Leonenkov, A.V.: Fuzzy modeling in MATLAB and fuzzyTECH, 736 p. BHV Petersburg, St. Petersburg (2005)
14. Nedosekin, A.O.: Fuzzy Sets and Financial Management, 184 p. AFA Library, Moscow (2003)
15. Veerraju, N., Prasannam, V.L., Rallabandi, L.N.P.K.: Defuzzification index for ranking of fuzzy numbers on the basis of geometric mean. Int. J. Intell. Syst. Appl. (IJISA) 12(4), 13–24 (2020). https://doi.org/10.5815/ijisa.2020.04.02
16. Amini, A., Nikraz, N.: Proposing two defuzzification methods based on output fuzzy set weights. Int. J. Intell. Syst. Appl. (IJISA) 8(2), 1–12 (2016). https://doi.org/10.5815/ijisa.2016.02.01
17. Runkler, T.A.: Extended defuzzification methods and their properties. IEEE Trans. 694–700 (1996)
18. Industrial production in Russia: 2019: Stat. sb., 286 p. Rosstat, Moscow (2019)
19. Russian Statistical Yearbook 2020: Stat. book, 700 p. Rosstat, Moscow (2020)
20. Socio-Economic Situation of Russia (January–February 2021). Federal State Statistics Service (Rosstat), 380 p. Moscow (2021)
Methodology for Assessing the Risk Level of a Technical Re-equipment Project Based on Fuzzy Logic

Nataliya Mutovkina
Tver State Technical University, Tver 170012, Russia

Abstract. The article proposes a methodology for assessing the level of risk of a technical re-equipment project. The technique covers the range of possible threats at different stages of the project life cycle and uses a fuzzy model for appraising the level of risk. The assessment is performed on the basis of two risk components: the likelihood of the risk occurring and the level of hazard that the risk poses to the enterprise. The assessment is then performed by type of risk, determining which risk has the highest level and which the lowest. Based on the results of the analysis, the management of the enterprise can take timely measures to minimize the highest risks in the entire risk population and management actions to prevent the growth of acceptable risks. Risk assessment at the stages of a technical re-equipment project makes it possible to determine controlling actions at these phases. The proposed methodology draws the attention of enterprise management to the most probable and most severe risks. The risk probability and the level of its severity for the enterprise are determined by experts. For this, the external and internal environment of the enterprise, the market for technical equipment, and the demand for and supply of the products manufactured by the enterprise are thoroughly studied. Before specifying estimates, experts identify the factors that are the root causes of the risks. The experts can be either employees of the enterprise itself or specialists engaged from outside. Expert assessments are subject to mandatory consistency checks. Elements of fuzzy logic fit well into the concept of risk estimation, since risks and their general properties, namely probability and severity, are characterized by some fuzziness and are usually described verbally using linguistic variables.

Keywords: Fuzzy logic · Fuzzy estimation · Linguistic variables · Risks · Probability of risk occurrence · Severity of risk · Level of risk · Technical re-equipment
1 Introduction

The analysis of the economic activities of enterprises in the agricultural engineering industry in Russia has shown that, at present, the industry tends to lag in technical, technological, and economic terms [1]. The reasons for this are as follows: the archaism of the production and technological structure of enterprises; excessive physical and moral depreciation of fixed assets;
a lack of sufficient innovation dynamics in the development of enterprises and of innovative activity; low labor productivity; and mediocre quality of strategic management.

As the analysis of investment projects implemented by agricultural engineering enterprises has shown, the most relevant projects for the owner and the investor are technical re-equipment projects. Technical re-equipment is a complex of measures to improve the technical and economic indicators of fixed assets or their parts based on the introduction of advanced equipment and technology, mechanization and automation of production, modernization, and the replacement of obsolete and physically worn-out equipment with new, more productive equipment [2]. Technical re-equipment can be viewed as a process of financing and investing in the consistent transformation of an idea into a product, passing through the stages of fundamental, applied, design, and technical research and development, production, and marketing. Technical re-equipment projects are usually considered from two positions. First, they can be a simple development of investments for complete or partial reconstruction of existing production with the replacement of worn-out equipment. Second, a technical re-equipment project may introduce innovations, as a result of which a product with new qualitative and quantitative characteristics is produced at the updated capacities.

A technical re-equipment project, like any investment project, is subject to various risks. These risks are caused by the influence of negative factors of the internal and external environment of the enterprise. As practice has shown, different types of risks arise at each stage of the project life cycle due to the specifics of each phase. There are also general risks, such as technical and financial risks. To minimize the risks of a technical re-equipment project and ensure its profitability, it is essential to carry out a timely risk assessment.

The purpose of the study is to propose a methodology for assessing the risk level of a technical re-equipment project that takes into account the different risks arising at particular stages of the project life cycle. The theory of fuzzy logic and procedures for the coordination of expert assessments make it possible to achieve the greatest objectivity in assessing the overall risk of a project based on a preliminary analysis of private, stage-by-stage risks. Risk assessment of a technical re-equipment project is therefore an essential part of risk management at any industrial enterprise.

1.1 Review of the Literature on the Research Topic

1.1.1 Features of Risks During Technical Re-equipment

The implementation of a technical re-equipment project is invariably associated with risk, and uncertainty is the cause of hazards. The theory of risk management offers many classifications of risks [2], but none of them fully covers the dangers of a technical re-equipment project. At each stage of the project life cycle there are general risks, typical for all projects, and specific risks. In addition, atypical threats are characteristic of technical re-equipment projects in a particular
industry. Thus, the types of hazards are determined by experts based on a detailed analysis of each stage of the life cycle of a technical re-equipment project. Risk management during technical re-equipment consists of the following processes:

1) Planning risk management.
2) Identification of risks.
3) Qualitative risk analysis.
4) Quantitative risk analysis.
5) Planning the response to identified risks.
6) Monitoring of risks [3].
The success of each stage depends on the quality of the analysis at the previous phase. Risk management should be carried out continuously: risks must be identified, assessed, and addressed regularly. This is due, first of all, to the fact that risk is a dynamic entity that varies over time. As a technical re-equipment project is implemented, both the number and the nature of the risks change, and the absence of hazards in the early stages of the project does not mean that there will be none in the future [4]. The main causes of risk are usually associated with a possible delay in implementing the project and, accordingly, a postponement of the start of obtaining an economic effect [5]. A clear division of risks by project stage therefore allows the causes of risks to be identified more accurately and attention to be focused on their systematic elimination. Hence, when managing risk, it is necessary to consider the features of the particular stage of the life cycle of a technical re-equipment project.

1.1.2 Risk Assessment Methods

Traditionally, a probabilistic approach has mainly been applied to risk assessment. However, when it became apparent that probabilistic models do not always yield adequate results (for example, for innovative projects it is impossible to conduct a reliable analysis of the variance of risk factors), the theory of fuzzy logic began to be applied in risk management [6, 7]. In [8], a hybrid multicomponent approach is proposed for risk assessment that combines a statistical approach with elements of sensitivity analysis and resistance testing. However, this risk assessment method cannot be applied in enterprises that have no database of past projects, and it is also not suitable for assessing risks that the company has not yet faced.

Risk assessment can also be based on the method of expert estimation [5, 9]. The experts can be either employees of the enterprise in question or specialists engaged from outside. An expert assessment of the likelihood of risks can be performed for various variants of a technical re-equipment project by type of risk. Each expert gives his or her assessment of a particular kind of risk as a percentage. Based on the calculation of the
coefficient of concordance, the consistency of the expert assessments is checked. If the estimates are consistent, the average probability of risk occurrence is calculated. However, this method is characterized by high subjectivity, which can lead to an incorrect risk assessment [10].

1.2 Fuzzy Logic in Risk Assessment
Fuzzy logic and fuzzy set theory are very effective tools for assessing expected risks under uncertainty. The use of methods based on the theory of fuzzy sets provides for the formalization of the initial parameters and target indicators in the form of a vector of interval values (a fuzzy interval), with hitting each interval characterized by a certain degree of uncertainty [11]. Based on the initial information, experience, and intuition, experts and the developers of the technical re-equipment project can quantitatively characterize the intervals of possible (permissible) values of the investigated quantities and their threshold values [12]. Following the theory of fuzzy logic, risk can be defined as a function of the probability of its occurrence (P) and the level of its severity for the enterprise (S) [4]:

$$R = f(P, S) \qquad (1)$$

Naturally, these arguments take on different values at different enterprises when technical re-equipment projects are formed. Each of these arguments can have several levels, described by linguistic variables [13], for example: VL is very low, L is low, M is medium, H is high, and VH is very high. The probability of the risk varies within $0 \le P \le 1$, and the level of severity (hazard) of the risk within $0 \le S \le 100$.
2 Methodology for Fuzzy Risk Assessment of a Technical Re-equipment Project

2.1 Risk Classification
The main risks identified by experts in the elaboration of a technical re-equipment project are technical, technological, marketing, informational, financial, organizational, ecological, and specific. The specific risk additionally takes into account the industry affiliation of the enterprise. Technical risk arises from possible equipment breakdowns, inconsistency and complexity in the interaction of equipment, and construction and installation work deviating from the project. Technological risk consists of a mismatch between the capacities of the new equipment and the manufactured products [14].
Marketing risk is associated with supplying the technical re-equipment with the necessary resources and with selling the products manufactured using the new equipment and technologies. The marketing risk of sales is classified into the risk of mistakenly choosing the target market segment, the risk of insufficient market segmentation, the risk of mistakenly choosing a sales strategy, and the risk of conducting ineffective advertising.

Information risk is subdivided into the risk of erroneous assessment and misalignment of the interests of the owners and the management of the enterprise, and the risk of leakage of confidential information, whether through the employees' fault or as a result of industrial espionage undertaken by competitors.

Financial risk arises as a result of mistakes made in the management of financial flows. Depending on the source of income, monetary hazards can be classified into interest rate risk, currency risk, and portfolio risk. Financial risk includes: the risk of not receiving the funds necessary for the development of a technical re-equipment project; self-financing risk; and risk when using external sources of financing.

Organizational risk includes personnel risk, the risk of a lack of qualified specialists to work on the new equipment, and the risk of delays in the supply of components for servicing the new equipment [14]. Ecological risk entails environmental degradation. The specific risk is determined by the industry to which the industrial enterprise belongs and by the peculiarities of the functioning of the particular business entity. In particular, specific risks include the risk of cancellation of government programs aimed at supporting enterprises in the industry [15].

2.2 The Fuzzy Risk Assessment Model
The level of risk at each stage of a technical re-equipment project is determined using a fuzzy logic model [4] implemented in MatLab. The risk level is assessed by the two properties presented above: the likelihood of occurrence and the level of seriousness it represents for the enterprise. For fuzzy modeling in the MatLab environment, a special extension package, the Fuzzy Logic Toolbox, is provided. When building the model, the membership functions, their ranges, and their types were defined for each variable. The variable "Risk Probability," for example, is shown in Fig. 1.
Fig. 1. Membership functions of the variable “Risk probability”
The vertices of the triangular functions correspond to the given linguistic variables. The functions of the output variable ER ("Estimation of Risk") are also triangular. Triangular membership functions were chosen because they provide a more flexible result and the ability to work with a model with a large number of options. The expert system of fuzzy logic rules for the given linguistic risk indicators is shown in Table 1.
Table 1. Inference rules for the "Estimation of Risk" variable

| P \ S | VL | L | M | H | VH |
|---|---|---|---|---|---|
| VL | VL | VL | L | M | H |
| L | VL | L | L | M | H |
| M | L | L | M | H | H |
| H | L | M | M | H | VH |
| VH | M | M | H | VH | VH |
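The rule base of Table 1 can be encoded directly as a lookup table; in the sketch below the numeric centres attached to the output terms are assumptions for illustration only, not values from the paper's MatLab model:

```python
# Table 1 encoded as a rule base: (P term, S term) -> ER term. A crude
# crisp score is obtained from an assumed centre attached to each output term.

RULES = {  # rows: probability P; columns: severity S
    "VL": {"VL": "VL", "L": "VL", "M": "L", "H": "M", "VH": "H"},
    "L":  {"VL": "VL", "L": "L",  "M": "L", "H": "M", "VH": "H"},
    "M":  {"VL": "L",  "L": "L",  "M": "M", "H": "H", "VH": "H"},
    "H":  {"VL": "L",  "L": "M",  "M": "M", "H": "H", "VH": "VH"},
    "VH": {"VL": "M",  "L": "M",  "M": "H", "H": "VH", "VH": "VH"},
}

CENTERS = {"VL": 0.1, "L": 0.3, "M": 0.5, "H": 0.7, "VH": 0.9}  # assumed

def estimate_risk(p_term, s_term):
    """Look up the rule and return (output term, assumed crisp centre)."""
    term = RULES[p_term][s_term]
    return term, CENTERS[term]

print(estimate_risk("M", "H"))  # -> ('H', 0.7)
```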
When an expert enters into the program his or her assessment of the probability and severity level of a risk in linguistic variables, the model calculates the risk level on a scale from 0 to 1. The number of applications of the fuzzy risk assessment model thus depends on the number of risk types identified at each stage.

2.3 Calculation of the Overall Level of Risk
The overall (final) level of risk is determined on the basis of the risk analysis at the stages of the project. The risk of a project stage is calculated using the formula:

$$R_i = \sum_{j=1}^{m} R_{ij}, \quad i = \overline{1, n}, \; j = \overline{1, m} \qquad (2)$$
where $R_{ij}$ are the individual risks assessed using the fuzzy model, $n$ is the number of stages, and $m$ is the number of types of risk identified at the $i$-th stage. Based on the fuzzy risk assessment model, the minimum ($R^{min}$) and maximum ($R^{max}$) values of the individual risk levels are established. Depending on the number of risk types, the minimum and maximum levels of risk at each stage of the project are determined according to the formulas:

$$R_i^{min} = m \cdot R^{min}, \qquad R_i^{max} = m \cdot R^{max} \qquad (3)$$

The range of variation is calculated using the formula:

$$V_i = R_i^{max} - R_i^{min} \qquad (4)$$

The boundaries of the equal risk intervals corresponding to the linguistic variables VL, L, M, H, and VH are determined according to the formula:

$$d_i = \left(R_i^{max} - R_i^{min}\right) / 5 \qquad (5)$$

Depending on the interval into which the value of $R_i$ falls, a conclusion is drawn about the level of risk at each stage of the project. By aggregating the $R_i$ values, one can draw a conclusion about the level of risk of the entire technical re-equipment project.
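A sketch of this stage-level bookkeeping follows; the single-risk extremes R_MIN = 0.08 and R_MAX = 0.92 are not stated explicitly in the paper but are implied by the stage bounds 0.24/2.76 (m = 3) and 0.32/3.68 (m = 4) used in Sect. 3:

```python
# Stage risk aggregation and interval classification per formulas (2)-(5).
# R_MIN and R_MAX are the extreme outputs of the fuzzy model for a single
# risk; 0.08 and 0.92 reproduce the bounds used in Sect. 3 of the paper.

R_MIN, R_MAX = 0.08, 0.92
TERMS = ["VL", "L", "M", "H", "VH"]

def stage_risk(individual_risks):                 # formula (2)
    return sum(individual_risks)

def stage_bounds(m):                              # formula (3)
    return m * R_MIN, m * R_MAX

def classify(r_i, m):                             # formulas (4)-(5)
    lo, hi = stage_bounds(m)
    d = (hi - lo) / 5                             # width of one interval
    k = min(int((r_i - lo) / d), 4)               # index of the interval
    return TERMS[k]

stage1 = [0.229, 0.355, 0.317]                    # Table 2, stage 1
r1 = stage_risk(stage1)
print(round(r1, 3), classify(r1, len(stage1)))    # -> 0.901 L
```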
3 Experimental Part

The experts identified four stages of the technical re-equipment project: justification of the choice of the project, project development, development of production facilities, and operation of the new technical system. The types of risk identified at each stage, as well as the values of P and S, are shown in Table 2.
Table 2. Calculation of individual risks and risks at the stages of the project

| Stage number | Stage of the project | Types of risk | Probability of occurrence | Severity of the risk, % | R_ij | R_i |
|---|---|---|---|---|---|---|
| 1 | Justification of the choice of the project | Technical | 0.15 | 20 | 0.229 | 0.901 |
| | | Marketing | 0.25 | 60 | 0.355 | |
| | | Informational | 0.30 | 40 | 0.317 | |
| 2 | Project development | Technical | 0.35 | 50 | 0.355 | 0.834 |
| | | Financial | 0.25 | 50 | 0.250 | |
| | | Specific | 0.15 | 25 | 0.229 | |
| 3 | Development of production facilities | Technological | 0.25 | 50 | 0.250 | 0.919 |
| | | Organizational | 0.20 | 70 | 0.440 | |
| | | Specific | 0.15 | 35 | 0.229 | |
| 4 | Operation of a new technical system | Technical | 0.35 | 55 | 0.424 | 1.462 |
| | | Financial | 0.25 | 60 | 0.355 | |
| | | Ecological | 0.10 | 70 | 0.433 | |
| | | Specific | 0.10 | 50 | 0.250 | |
The risk levels $R_{ij}$ were determined using the developed fuzzy risk assessment model, and the risk levels $R_i$ were calculated using formula (2). The common (total) risk was 4.116. The risk assessment should also be performed by type of risk (Table 3); in this way one can determine which risk is minimal and which is maximal.

Table 3. Risk assessment by type

| Number | Type of risk | Risk level | Share of common risk |
|---|---|---|---|
| 1 | Technical | 1.008 | 0.24 |
| 2 | Technological | 0.250 | 0.06 |
| 3 | Marketing | 0.355 | 0.09 |
| 4 | Informational | 0.317 | 0.08 |
| 5 | Financial | 0.605 | 0.15 |
| 6 | Organizational | 0.440 | 0.11 |
| 7 | Ecological | 0.433 | 0.11 |
| 8 | Specific | 0.708 | 0.17 |
In this example, the minimum risk is the technological risk (0.25), and the maximum is the technical risk (1.008). At the first three stages of the project, the risk bounds were $R_1^{min} = R_2^{min} = R_3^{min} = 0.24$ and $R_1^{max} = R_2^{max} = R_3^{max} = 2.76$; at the fourth stage, $R_4^{min} = 0.32$ and $R_4^{max} = 3.68$. These values were calculated using formula (3). The underlying values $R^{min}$ and $R^{max}$ were determined at $P = 0$, $S = 0$ (Fig. 2) and at $P = 1$, $S = 100$, respectively.
Fig. 2. Inference rules for the output variable "ER"
As can be seen from Fig. 2, the inference rules were drawn up in such a way that even if the expert gives a zero estimate of the probability and severity of a risk, the model still produces a nonzero estimate, since risk is always present under conditions of uncertainty. The risk intervals calculated by formula (5) are presented in Table 4.

Table 4. Risk intervals

| Stage number | Risk boundaries | VL | L | M | H | VH |
|---|---|---|---|---|---|---|
| 1, 2, 3 | Min | 0.240 | 0.744 | 1.248 | 1.752 | 2.256 |
| 1, 2, 3 | Max | 0.744 | 1.248 | 1.752 | 2.256 | 2.760 |
| 4 | Min | 0.320 | 0.992 | 1.664 | 2.336 | 3.008 |
| 4 | Max | 0.992 | 1.664 | 2.336 | 3.008 | 3.680 |
When the values $R_i$ from Table 2 are compared with the risk intervals shown in Table 4, it is evident that all values $R_i$ fall into the second interval (L); therefore, this technical re-equipment project has a low level of risk.
4 Summary and Conclusion

The correct assignment of a particular risk to a defined type makes it possible to draw up a proper risk management system for the technical re-equipment of an enterprise. The level of risk characterizes the degree of its acceptability for a particular enterprise and shows whether it is necessary to minimize this risk or not. If the analysis reveals unacceptable risks (H and VH levels), they should be eliminated first. In general, this approach allows one to assess risks throughout the entire life cycle of a technical re-equipment project, thereby reducing the high level of project uncertainty and increasing the efficiency of further risk management at all stages of the project life cycle. The proposed method is a universal method for assessing risks and can be applied to agricultural engineering enterprises and to industrial enterprises in other sectors of the economy. It is planned to apply the developed fuzzy risk assessment system at industrial enterprises of the Tver and neighboring regions and to compare the results of the system (risk assessment by type) with the actual risk levels. Furthermore, statistical processing of the deviations will allow conclusions about the quality of the fuzzy estimation system implemented in MATLAB. If successful, the system can be modified for commercialization.
References

1. Medvedeva, A.: Analysis of the agricultural machinery market in the context of the 2020 crisis: it all depends on the level of government support. AgroXXI: Agroindustrial portal (2020). https://www.agroxxi.ru/selhoztehnika/stati/analiz-rynka-selskohozjaistvennoi-tehniki-v-uslovijah-krizisa-2020-goda-vse-zavisit-ot-urovnja-gosudarstvennoi-podderzhki.html. Accessed 05 May 2021
2. Gendlina, Yu.B.: Risk factors in the implementation of a project for the technical re-equipment of a chemical enterprise. Bull. Cherepovets State Univ. 1(28), 114–119 (2011)
3. Aliyev, A.G., Shahverdiyeva, R.O.: Perspective directions of development of innovative structures on the basis of modern technologies. Int. J. Eng. Manuf. (IJEM) 8(4), 1–12 (2018). https://doi.org/10.5815/ijem.2018.04.01
4. Boldyrevsky, P.B., Igoshev, A.K., Kistanova, L.A.: Risk assessment of innovation processes. Econ. Anal. Theory Pract. 17(8), 1465–1475 (2018)
5. Babushkin, V.M., Abramov, V.A.: Investment project risk assessment. In: The Collection: New Technologies, Materials and Equipment of the Russian Aerospace Industry, vol. 2, pp. 796–801. Academy of Sciences of the Republic of Tatarstan, Kazan (2016)
6. Shtovba, S.D.: Introduction to the fuzzy set theory and fuzzy logic. (In Russ.). http://matlab.exponenta.ru/fuzzylogic/book1/index.php
7. Terano, T., Asai, K., Sugeno, M. (eds.): Fuzzy Systems Theory and Its Applications, 368 p. Mir Publ., Moscow (1993)
8. Badalova, A.G., Panovsky, V.N.: Risk management in the implementation of projects for the technical re-equipment of industrial enterprises. MSTU Bull. Stankin 1(32), 117–124 (2015)
9. Alovsat, G.: Development of models of manufacturing processes of innovative products at different levels of management. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 11(5), 23–29 (2019). https://doi.org/10.5815/ijitcs.2019.05.03
10. Shilyaev, A.A., Kosmatov, E.M.: Application of the method of integral risk assessment based on the theory of fuzzy logic for the assessment of investment projects in the energy sector. In: Proceedings of St. Petersburg State Technical University, vol. 512, pp. 31–37 (2010)
11. Lee, R.S.T.: Fuzzy logic and the resolution principle. J. ACM 19(1), 109–119 (1972)
12. Obeng, A.Y., Mkhize, P.L.: Impact of IS strategy and technological innovation strategic alignment on firm performance. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 9(8), 68–84 (2017). https://doi.org/10.5815/ijitcs.2017.08.08
13. Zadeh, L.A.: Outline of a new approach to the analysis of complex systems and decision processes. In: Mathematics Today, pp. 5–49. Znanie Publ., Moscow (1974). (in Rus.)
14. Osipova, I.V., Menshchikova, V.I.: Identification of risks of technical re-equipment of an industrial enterprise and key areas of their management. Soc.-Econ. Phenom. Process. 12(2), 91–97 (2017). (in Rus.)
15. Korchagina, M.V., Mukhiddinov, K.S.: Risks of innovation projects. Econ. Ecol. Territ. Format. 3(2), 52–56 (2019). https://doi.org/10.23947/2413-1474-2019-3-2-52-56. (in Rus.)
Digital System for Monitoring and Management of Livestock Organic Waste

A. Yu. Izmailov1, A. S. Dorokhov1, A. Yu. Briukhanov1, V. D. Popov1, E. V. Shalavina1(&), M. Yu. Okhtilev2, and V. N. Koromyslichenko3

1 Federal Scientific Agroengineering Center VIM (FSAC VIM), Moscow, Russia
2 Research and Experimental Centre for Intelligent Technologies “Petrokometa”, Saint Petersburg, Russia
3 Centre for Intelligent Technologies “Petroint”, Saint Petersburg, Russia
Abstract. The article describes an example of the successful application of digital monitoring and control technologies developed on an intelligent analytical software platform to address agroecological challenges. The study output is an interactive programme designed to monitor livestock/poultry waste (manure) management and coordinate the use of the resulting organic fertilisers considering environmental and economic indicators. The study applied the following methodological approaches: processing of expert knowledge, web-server access for users, and a microservice access framework to geo-information resources. The functions of the designed interactive programme are database maintenance, logistics arrangements, and summary report (digital passport) generation. The programme was implemented for IBM PC in the C++ programming language under Linux Debian x64; its size is 500 MB. It received copyright certificate No. 2021617771 of 19.05.2021. The database was created entirely on expert knowledge using no-code programming technology. The digital monitoring system can be scaled up to all territorial subjects of the Russian Federation and provide the grounds for establishing a state management system of organic resources and environmental sustainability in agriculture.

Keywords: Digital technology · Intelligent technology · Agriculture · Environmental safety · Organic fertiliser · Programme · Database
1 Introduction

At the current stage of agriculture development, the specialized enterprises and their managerial processes become increasingly complex. The key tasks in the digital transformation of the agro-industrial sector are to ensure high product quality and reduce environmental effects through global planning and providing optimal recommendations to economic players [1, 2]. The basis of the modern national information system in this sector is the data related to farming status and development trends. The content is posted in the information systems of federal and regional executive authorities. It means the industry is
using integrated top-level governance information. At the same time, there is a fairly wide range of proposals concerning the automation of certain types of activities, technological processes and machines [3, 4]. Obviously, the availability of these numerous proposals and separate estimates does not achieve the goal of organising the integrated monitoring of the status and operations management of the agricultural sector as an integral object. Such monitoring should be designed for product quality control, environmental effect assessment, integration of information systems, and harmonising the enterprise data with adjacent and higher-level systems. There is no point in talking about digital transformation without this monitoring being in place [5]. In the context of modern information richness of technological processes within the life cycle stages of agricultural production, the management and control activity can be very effective if the related data are integrated and undergo intelligent processing and their influence degree on the enterprise’s parameters as a whole, including the environmental performance, is carefully determined. This can be achieved through the monitoring understood as an object investigation with its status tracking and functioning control aimed to predict the performance results [6, 7]. In our case, such monitoring should aim at both agricultural product quality and the environmental safety level of production. The key issue in this respect in North-West Russia, the Leningrad Region, in particular, is the utilisation of organic waste generated in large-scale livestock and poultry farms, with its total amount being at least 4,800 thousand tons per year. Large-scale livestock and poultry complexes produce hundreds of tons of manure per day. However, they do not have a sufficient agricultural land area within 10 km from the farm for applying the resulting organic fertilisers within specified time limits. Therefore, the related cooperation between livestock and crop farms needs to be established. Out of 580 million tons of manure produced in Russia in 2020, only 240 million tons of organic fertilisers were prepared and used following environmental standards and technological regulations to improve soil fertility. By various estimates, the total environmental and economic damage due to the current state of things significantly exceeds 200 billion rubles per year. The study aimed to create an interactive program for monitoring the livestock/poultry waste (manure) management and coordinating the use of resulting organic fertilisers with due account for environmental and economic indicators. With this aim in view, the study completed the following objectives: 1) visualisation (display) of all large-scale agricultural organisations in the Leningrad Region on a digital map: location, name, specialisation, animal/poultry stock, available land area; 2) calculation and display of the current situation on the farms: amount of produced organic fertilisers and their nutrient content; the sufficiency of land for all organic fertiliser application; the number of required manure storages and composting pads; 3) calculation and display of the forecast situation on the farms, including the logistic patterns of organic fertilisers transfer from supplier farms to consumer farms with due account for the nutrient load norms; 4) creation of electronic passports of farms, districts and the whole region under consideration based on the re-
distribution of organic fertilisers between the farms and the data on the nutrient load within the boundaries of agricultural lands of the catchment area.
2 Theoretical Aspects of Research

The application of intelligent monitoring and management technologies based on an intelligent analytical software platform was found advisable to tackle the set objectives [8]. Such a platform provides developers with a unified technological environment – a set of technologies, ready software solutions and tools, supporting the full life cycle of an applied solution within the framework of the chosen development methodology [9]. In our case, this is a solution to the task of organising the monitoring and management to ensure the environmental safety of agricultural production. Application of such platform types was the subject of several foreign and Russian studies [10–13]. The combination of information technology with the needs of agricultural production was considered in such studies as [14–16]. The progress of information technology makes the daily life of consumers around the world easier. An ever-growing global population is driving exponential growth in food demand. Innovations are developing at a very fast pace today; therefore, their application in every industry sector is advisable to achieve the desired results. In our case, the use of digital technologies will reduce the adverse environmental impact, improve soil fertility and achieve the target crop yields that are essential in the context of import substitution. The system of organic fertiliser handling plays an important role in making farming more environmentally friendly [17]. It includes the techniques of raw material (animal and poultry manure) preparation for processing into organic fertilisers, which should maintain the soil fertility of agricultural lands and provide for yield gains in food and fodder crop production [18, 19]. Many scientific publications note that a properly organised system of organic and mineral fertilisation with due account for the nutrient balance and natural and climatic territorial conditions promotes the long-term environmental sustainability of agroecosystems [20–22]. An integrated system of agro-monitoring and management of engineering and organizational solutions is required to achieve this aim, as any dynamic system is based on such blocks as monitoring, evaluation, prediction and control. The intelligent analytical platform allows non-programmers to design intelligent systems in a professionally oriented language using a conceptual model for describing the subject area for heterogeneous data and knowledge analysis with a wide range of data visualisation tools. In this process, the following key approaches are applied:
– Ontology-driven methodology for describing and presenting the objects of the subject area, integration of heterogeneous information, formalisation of data and knowledge;
– Models, methods and algorithms for data and knowledge presentation and visualization;
– Methods and algorithms for automatic program synthesis for analysis and evaluation of the monitoring objects status;
– Unified methods for reliable recognition of the monitoring objects status under conditions of fuzzy information;
– Methods and algorithms for complex simulation of the monitoring objects performance.

The intelligent analytical platform-driven development of application solutions includes a set of the following technologies:
– Intelligent technologies for acquisition, formalisation and correction of all required data and knowledge associated with the objects of the subject area;
– Intelligent technologies for organisation of data-flow computing as part of a distributed information and computing environment, for creating a comprehensive and consistent system of heterogeneous mathematical models (polymodel complex);
– Intelligent technologies for providing a user-friendly interface and dialogue with the information system, data display (presentation, visualisation) using cognitive (intellectual) graphics.

This was the basis for monitoring the production and coordination of organic waste application (turnover) by agricultural enterprises in the Leningrad Region. The architecture of the software solution of the system for displaying the monitoring and coordination results is shown in Fig. 1. The system includes an expert database, user access to a web server, and a microservice access framework to geo-information resources.
Fig. 1. The software architecture of the system displaying the results of production monitoring and coordination of organic waste application (turnover) by agricultural enterprises in the Leningrad Region
The architecture of the expert database is shown in Fig. 2. The database was developed on an entirely expert knowledge basis following the no-code programming technology [23]. Shared information space in terms of general access is implemented using web-based technologies. In general, the digital platform, obviously, should contain three segments: a public access segment, a segment of the internal shared space of an enterprise, and a regional/federal access segment. The proposed division will not disrupt the already established information and management ties and will ensure a painless transition to a digital platform for the agricultural industry. The conducted study analysed:
– The state of livestock, poultry and crop production in the region under consideration;
– Technological solutions applied on the complexes;
– Technical and technological characteristics of the equipment used;
– Available land for fertilisation.

All the data obtained were combined into a single structure (Fig. 2).
Fig. 2. The architecture of the expert database.
A mathematical model of limiting the nutrient application per hectare of agricultural land was taken as a basis for creating a prediction and logistics system associated with the distribution of organic fertilisers [24, 25]. The limiting indicators by the application dose were total nitrogen (170 kg ha−1) and total phosphorus (25 kg ha−1) [26]. The indicator, the limit value of which is reached faster, should be considered the most significant in the calculation of the dose of organic fertilisation. The programme is designed for use in engineering ecology for information support of executive authorities and farm managers in the Leningrad Region to improve the
environmental and productive efficiency of farms as far as the preparation and application of organic fertilisers are concerned. The scientific product novelty lies in the models and algorithms for monitoring and controlling technological processes in the agroecosystems. These models and algorithms are the basis for creating intellectual expert systems (computer programmes). The problem of ineffective use of secondary resources is solved by creating a nationwide management system.
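As an illustration of this limiting rule, the sketch below picks the dose at which the first nutrient limit is reached; only the 170 kg N/ha and 25 kg P/ha limits come from [26], while the per-ton nutrient contents are hypothetical.

```python
# Minimal sketch of the limiting-nutrient rule: the admissible organic
# fertiliser dose per hectare is capped by whichever nutrient limit is
# reached first. The per-ton nutrient contents below are hypothetical.
N_LIMIT_KG_HA = 170.0  # total nitrogen limit per hectare [26]
P_LIMIT_KG_HA = 25.0   # total phosphorus limit per hectare [26]

def max_dose_t_ha(n_kg_per_t: float, p_kg_per_t: float) -> float:
    """Maximum fertiliser dose (t/ha) so that neither nutrient limit is exceeded."""
    return min(N_LIMIT_KG_HA / n_kg_per_t, P_LIMIT_KG_HA / p_kg_per_t)

# Example: compost containing 5.1 kg N and 1.2 kg P per ton (illustrative values)
dose = max_dose_t_ha(5.1, 1.2)
print(f"admissible dose: {dose:.1f} t/ha")  # phosphorus is the binding limit here
```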
3 Practical Implementation

In 2020–2021 IEEP – branch of FSAC VIM, Research and Experimental Centre for Intelligent Technologies “Petrokometa” and Centre for Intelligent Technologies “Petroint”, together with the Committee for Agro-Industrial and Fishery Complex of the Leningrad Region and under support of the EcoAgRAS project of the South-East Finland Russia CBC 2014–2020 Programme, designed an interactive programme for monitoring the production and effective application of organic fertilisers. The programme is based on the created digital passports of rural areas, algorithms for information acquisition and analysis, and prediction models with due account for scenarios of technological development and management strategies.
Fig. 3. Screenshot of an interactive programme for monitoring the production and effective application of organic fertilisers.
The programme was implemented for IBM PC in the C++ programming language under Linux Debian x64; its size is 500 MB. It received copyright certificate No. 2021617771 of 19.05.2021. Such a digital tool allows assessing the current situation and simulating development scenarios and their impact on the environmental sustainability of agroecosystems. The programme algorithm implements the principle of nutrient balance and optimises inter-farm relations to achieve environmental sustainability. Figure 3 is a screenshot of the programme demonstrating the logistics relationships of agricultural enterprises in the Leningrad Region established to ensure balanced nutrient use. The programme implements the following functions.

1. Maintaining a database and displaying the current situation in agricultural organisations in the Leningrad Region in terms of:
– Animal/poultry stock;
– Farm specialisation;
– Available area of agricultural land for organic fertiliser application;
– Technological solutions in place for animal/poultry manure processing into an organic fertiliser, indicating units of equipment and specifications of main structures;
– Indicators and characteristics of the resulting organic fertiliser.
This way the information on the characteristics of agricultural organisations is accumulated, stored and updated. At any time, the user can receive, upon request, the required information associated with an individual farm, a district, or the whole region.

2. Logistical arrangements for the use of organic fertilisers – the distribution of all resulting organic fertilisers between supplying enterprises, which produce more fertilisers than they apply to their own fields, and consumer enterprises, which need more fertilisers than they produce. As a result, the user receives tabular data, which shows the supplier all its consumer enterprises, the fertiliser transportation distance to each consumer and the applied fertiliser amount. The calculations are based on the farm land area, the crop rotation in place and the agrochemical soil survey; a simplified sketch of this matching step is given after this list.

3. Summary report generation – generalised information on the flows of organic fertilisers and the material and technical equipment needed to implement the system of logistics relationships.

4. Provision of analytical information on the machines and equipment involved. The programme allows identifying the individual (by farm) and general (district, region) need for storage facilities and composting pads with due account for the required volume, and for machines and equipment for preparation, transportation and application of organic fertilisers. This way an analytical block is formed to assess technological development scenarios and to make management decisions.

Figures 4, 5, and 6 show examples of information from the analytical block of the programme.
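The sketch below illustrates the kind of supplier-to-consumer matching described in function 2, using a simple nearest-consumer-first rule over hypothetical farms; the actual programme additionally accounts for crop rotation and agrochemical soil-survey data.

```python
# Illustrative greedy matching of fertiliser surpluses to deficits by
# transport distance; all farm names, positions (km) and amounts (tons)
# are hypothetical.
from math import hypot

suppliers = {"Farm A": ((0.0, 0.0), 1200.0), "Farm B": ((30.0, 10.0), 800.0)}
deficits = {"Farm C": ((5.0, 3.0), 700.0), "Farm D": ((25.0, 12.0), 900.0),
            "Farm E": ((40.0, 0.0), 400.0)}

def dist(a, b):
    return hypot(a[0] - b[0], a[1] - b[1])

transfers = []
for s_name, (s_pos, surplus) in suppliers.items():
    # serve the nearest consumers first until this supplier's surplus is used up
    for c_name in sorted(deficits, key=lambda c: dist(deficits[c][0], s_pos)):
        c_pos, need = deficits[c_name]
        if surplus <= 0 or need <= 0:
            continue
        amount = min(surplus, need)
        transfers.append((s_name, c_name, round(dist(s_pos, c_pos), 1), amount))
        surplus -= amount
        deficits[c_name] = (c_pos, need - amount)

for s, c, km, t in transfers:
    print(f"{s} -> {c}: {t:.0f} t over {km} km")
```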
Fig. 4. Analytical block of the programme “The need for machines and equipment” for the whole Leningrad Region, with the total amount of resulting organic fertilisers exceeding 4,800 thousand tons per year.
As can be seen from Fig. 4, to upgrade the livestock industry and improve its environmental status, the greatest need is for machines and equipment for transporting solid and liquid organic fertilisers from suppliers to consumers. This also points to the lack of well-established logistical links between the farms.
Fig. 5. Analytical block of the programme “The need for special structures” on the example of one district in the Leningrad Region.
The data from Fig. 5 can be a basis for the investment plans of upgrading a particular farm, district or the whole region.
Fig. 6. Analytical block of the programme “Availability of land area for organic fertilizer application” on the example of one district in the Leningrad Region.
As can be seen from Fig. 6, only one farm in the district lacks the land for applying the organic fertiliser it produces. However, the district as a whole has enough unfertilised lands where organic fertiliser brought from other districts can be applied. Table 1 shows the number of suppliers and consumers by district in the Leningrad Region.

Table 1. The number of suppliers and consumers by districts in the Leningrad Region.

| No. | District | Suppliers | Consumers |
|---|---|---|---|
| 1 | Boksitogorskij District | 1 | 2 |
| 2 | Volosovskij District | 0 | 18 |
| 3 | Volkhovskij District | 2 | 6 |
| 4 | Vsevolozhskij District | 3 | 4 |
| 5 | Vyborgskij District | 5 | 7 |
| 6 | Gatchinskij District | 3 | 12 |
| 7 | Kingiseppskij District | 0 | 5 |
| 8 | Kirishskij District | 0 | 4 |
| 9 | Kirovskij District | 2 | 4 |
| 10 | Lodeinopolskij District | 1 | 2 |
| 11 | Lomonosovskij District | 2 | 7 |
| 12 | Luzhskij District | 2 | 13 |
| 13 | Priozerskij District | 2 | 9 |
| 14 | Slantsevskij District | 0 | 3 |
| 15 | Tikhvinskij District | 0 | 3 |
| 16 | Tosnenskij District | 2 | 8 |
| | TOTAL | 25 | 107 |
According to Table 1, there are 25 supplier farms and 107 consumer farms of organic fertilisers in the Leningrad Region. These farms are unevenly distributed across the districts. Therefore, it is of utmost importance to establish interaction between farms in terms of organic fertiliser transfer to achieve the nutrient load balance across the whole region. The electronic passports generated by the programme for each agricultural enterprise, district and the whole region contain detailed information on how to achieve the optimal nutrient balance and the required material, technical and resource support. According to the literature review, other researchers have focused on creating similar digital tools in agriculture and related sectors [27–29]. The programme developed in Sweden [30] calculates the nutrient balance and required amount of organic fertiliser within a particular location. However, it does not include dynamic monitoring of farms and does not suggest any redistribution routes for excess manure in the region.
4 Conclusions

The study outcomes are a good basis and a clear application example of information technologies together with intelligent information processing to address the challenges of organising environmentally friendly agricultural production. The created system of digital monitoring and data analysis has a huge potential for improvement. It can complete both individual production tasks and global scale tasks, for example, predicting the environmental sustainability of rural areas. The main functions of the developed programme are maintaining a database and mapping the current situation on agricultural enterprises; logistical arrangements for the use of organic fertilisers – the distribution of all resulting organic fertilisers between supplying enterprises and consumer enterprises; and report generation. Digital maps based on the selected geographic information system were used to position the enterprises and identify the inter-farm connections in terms of organic fertiliser distribution. This way the programming was combined with spatial visualisation, and the process of agro-monitoring and nutrient load management became interactive. Monitoring and programming results received to date were used as a guide in drafting a plan for the modernisation of the livestock industry in the Leningrad Region in terms of preparation and use of manure-based organic fertilisers. By mastering such a system, the Leningrad Region claims to be the leader in implementing modern digital intellectualised technologies for predicting and planning the environmental sustainability of agricultural production. In the future, the system can be scaled up for all territorial subjects of the Russian Federation, with the Ministry of Agriculture being a key coordinator. The created pilot facility can provide the grounds for establishing a state system of monitoring and managing organic resources in agriculture, of which at least 580 million tons per year are generated in Russia in the form of livestock/poultry manure alone.
Acknowledgments. The research was performed under support of the projects “Introduction of the ecological system of agriculture is the basis for sustainable development of border rural area – EcoAgRas” of the South-East Finland - Russia CBC 2014–2020 Programme (Grant contract №17086-LIP1601-KS1441) and “Water-driven rural development in the Baltic Sea Region – WaterDrive” of the Interreg Baltic Sea Region Programme.
References

1. Dubois, M., et al.: The agricultural innovation under digitalization. In: Handbook of Research on Business Transformations in the Era of Digitalization, pp. 276–303. IGI Global (2019). https://doi.org/10.4018/978-1-5225-7262-6.ch015
2. Kumar, K.: How can wireless devices connected over internet help in increasing agricultural productivity?, pp. 1–4. ResearchGate. Principles and Practices of Scientific Work (2018)
3. Anke, J., et al.: PROMISE: product lifecycle management and information tracking using smart embedded systems. In: Handbook of Research on Business Transformations in the Era of Digitalization, pp. 559–566. IGI Global (2008). https://doi.org/10.4018/978-1-59904-832-1.ch025
4. Takata, S., et al.: Maintenance: changing role in life cycle management. CIRP Ann. 53(2), 643–655 (2004)
5. Raymond, C., et al.: Integrating local and scientific knowledge for environmental management. J. Environ. Manage. 91, 1766–1777 (2010)
6. Okhtilev, M., Sokolov, B.V., Yusupov, R.M.: Intelligent Technologies for Monitoring and Controlling the Structural Dynamics of Complex Technical Objects. Nauka, Moscow (2006)
7. Okhtilev, M.Yu.: Artificial intelligence systems and their application in automated systems for monitoring the state of organizational and technical objects. GUAP, Saint Petersburg (2018)
8. Sun, Z., Stranieri, A.: The nature of intelligent analytics. In: Intelligent Analytics with Advanced Multi-industry Applications, pp. 1–21. IGI Global, Hershey (2021)
9. Hu, Z., Tereykovskiy, I.A., Tereykovska, L.O., Pogorelov, V.V.: Determination of structural parameters of multilayer perceptron designed to estimate parameters of technical systems. Int. J. Intell. Syst. Appl. (IJISA) 9(10), 57–62 (2017)
10. Sun, Z.: An introduction to intelligent analytics ecosystems. PNG UoT BAIS 6(3), 1–11 (2021)
11. Karabutov, N.: Frameworks in problems of structural identification systems. Int. J. Intell. Syst. Appl. (IJISA) 9(1), 1–19 (2017). https://doi.org/10.5815/ijisa.2017.01.01
12. Sharda, R., Kalgotra, P.: The blossoming analytics talent pool: an overview of the analytics ecosystem. In: Cochran, J.J. (ed.) INFORMS Analytics Body of Knowledge, Chap. 9, pp. 311–326 (2018)
13. Omary, S., Sam, A.: Web-based two-way electricity monitoring system for remote solar mini-grids. Int. J. Eng. Manuf. (IJEM) 9(6), 24–41 (2019). https://doi.org/10.5815/ijem.2019.06.03
14. Bhavikatti, S., et al.: Automated roof top plant growth monitoring system in urban areas. Int. J. Eng. Manuf. (IJEM) 9(6), 14–23 (2019). https://doi.org/10.5815/ijem.2019.06.02
15. Akwu, S., et al.: Automatic plant irrigation control system using arduino and GSM module. Int. J. Eng. Manuf. (IJEM) 10(3), 12–26 (2020). https://doi.org/10.5815/ijem.2020.03.02
16. Mwemezi, K., Sam, A.: Development of innovative secured remote sensor water quality monitoring & management system: case of Pangani water basin. Int. J. Eng. Manuf. (IJEM) 9(1), 47–63 (2019). https://doi.org/10.5815/ijem.2019.01.05
17. Miller, C., et al.: Quantifying the uncertainty in nitrogen application and groundwater nitrate leaching in manure based cropping systems. Agric. Syst. 184(C), 1–14 (2020). https://doi.org/10.1016/j.agsy.2020.102877
18. Jackson, L.L., Keeney, D.R., Gilbert, E.M.: Swine manure management plans in North-Central Iowa: nutrient loading and policy implications. J. Soil Water Conserv. 55, 205–212 (2000)
19. Wortmann, C., et al.: Manure use planning: an evaluation of a producer training program. J. Ext. 43(4), Article Number 4RIB5 (2005)
20. Rotz, C.A.: Management to reduce nitrogen losses in animal production. J. Anim. Sci. 82(suppl. 13), E119–E137 (2004)
21. Groenestein, C.M., et al.: Livestock housing. In: Options for Ammonia Mitigation: Guidance from the UNECE Task Force on Reactive Nitrogen, pp. 14–25. Centre for Ecology and Hydrology, Edinburgh (2014)
22. Luostarinen, S., Kaasinen, S.: Manure nutrient content in the Baltic Sea countries. Natural Resources Institute Finland (Luke), Helsinki (2016)
23. Beaulieu, M.-C., Bucci, A.: Programming without code: teaching classics and computational methods. In: Heath, S. (ed.) DATAM: Digital Approaches to Teaching the Ancient Mediterranean, pp. 127–148. The Digital Press at the University of North Dakota, US (2020)
24. Briukhanov, A., et al.: Digital methods for agro-monitoring and nutrient load management in the Russian part of the Baltic Sea catchment area. In: IOP Conference Series: Earth and Environmental Science, vol. 578, pp. 1–7 (2020)
25. Shalavina, E.V., et al.: Interactive programme for monitoring and distribution of organic fertilisers produced in agricultural organizations. Technol. Mach. Equip. Mech. Crop Livestock Prod. 2(103), 81–91 (2020)
26. HELCOM Recommendation 28E/4. Annex III “Criteria and Measures Concerning the Prevention of Pollution from Land-Based Sources” to Convention on the Protection of the Marine Environment of the Baltic Sea Area (1992). https://helcom.fi/about-us/convention/annexes-to-the-convention-2/annex-iii/
27. Metson, G., et al.: Optimizing transport to maximize nutrient recycling and green energy recovery. Resour. Conserv. Recycl. 9-10, Article ID 100049 (2020)
28. Hu, Y., et al.: Logistics network management of livestock waste for spatiotemporal control of nutrient pollution in water bodies. ACS Sustain. Chem. Eng. 7(22), 18359–18374 (2019). https://doi.org/10.1021/acssuschemeng.9b03920
29. Spiegal, S., et al.: Manuresheds: advancing nutrient recycling in US agriculture. Agric. Syst. 182, Paper ID 102813 (2020). https://doi.org/10.1016/j.agsy.2020.102813
30. Akram, U., et al.: Enhancing nutrient recycling from excreta to meet crop nutrient needs in Sweden – a spatial analysis. Sci. Rep. (9), Article ID 10264 (2019). https://doi.org/10.1038/s41598-019-46706-7
Research on RPG Game Recommendation System

Shenghui Li(&) and Wen Cheng

Communication and Information System, Wuhan Research Institute of Posts and Telecommunications, Wuhan 430074, Hubei, China
Abstract. In RPG games, how to make players quickly adapt to and like the game has always been a focus of game merchants. Good prop recommendation can not only improve the players’ game experience but also bring benefits to game providers. RPG games differ from other game genres in that player levels change quickly and there is a strong correlation between levels and equipment, so the corresponding props must be recommended in real time according to player levels. Previous recommendation systems often used offline features, which made it impossible to capture players’ level changes in real time and ultimately affected the recommendation results. Similarly, offline sample splicing leads to inconsistency between the features used by online services and those used in model training samples. To solve these problems, we use real-time computing to replace the previous offline framework: making features real-time avoids the problem of outdated features, and real-time sample splicing, which saves the samples requested and purchased by users directly to HBase, solves the inconsistency between online and offline features. The experimental results show that, with this scheme, the exposure ARPU of the real-time recommendation is 14.75% higher than the offline effect. Finally, by solving the problem of data skew, the overall performance of the system is improved by more than 50%. The experiments prove that the model captures the player’s status changes well once the features are made real-time, and that using real-time technology to ensure the consistency of online and offline features significantly improves the prediction effect of the model.

Keywords: RPG game · Recommendation system · Real time features · Real time sample splicing
1 Introduction

With the ever-increasing amount of information and data, it is difficult for users to quickly choose what they like from a large amount of information [1, 2]. A recommendation system can help users find what they are interested in from massive data according to their interests, hobbies, needs and other information [3, 4]. To encourage players’ purchases in the game, Paul Bertens proposed using the large amount of data generated by players in the game for model training, achieving better results by employing machine learning algorithms that can predict
the rating of an item or product by a particular user [5]. Pengbo Li built a system for the problem of the lack of personalized commodity recommendation and low pertinence in offline shopping [6]; although a recommendation system was built, it was unable to capture fast-changing features due to the offline method. Jinhai Li built a recommendation system based on the SSM framework using collaborative filtering ideas, with which the management, promotion and warehousing departments of small and micro enterprises can formulate more efficient strategic guidelines based on the recommendation results and data visualization [7]; this provides a conceptual basis for building our system. A role-playing game (RPG) is a game genre in which players play one or more roles in a virtual world. Players fight the enemy by manipulating game characters, improve their level, collect equipment, complete the game’s tasks, and experience the game plot. In-game commodity recommendation can provide personalized recommendation services for players, help them find props suitable for them, enhance the players’ experience, and increase their online time in the game. For game providers, the recommendation system can also bring greater benefits to the company [8]. In RPG games, player levels change rapidly, as shown in Table 1, and the needs at different levels are different. The use of props is closely related to the player’s level. This requires the recommendation system to be able to recommend relevant props according to the rapidly changing level.

Table 1. Grade change over time

| Grade change | Time consumed |
|---|---|
| 0–10 level | 0.5 h |
| 10–20 level | 2.0 h |
| 20–30 level | 12 h - 2 days |
| 30–40 level | 1–4 days |
| 40–50 level | 3–10 days |
| …… | …… |
From the perspective of people and things, features are divided into prop features, user features and interactive features. Interaction features can also be regarded as a kind of user features. User features are generally obtained through clear records or statistical mining, such as user’s geographical location, access period, equipment used, gender and age, which can be obtained at the time of access; Interactive features count the user’s portrait preference through the historical user’s operation behavior [9]. The features of props are generally their own attributes and statistical data, such as inherent attributes such as category, keyword and theme, as well as effect statistics in a historical window, such as exposure, click, buy, forwarding, etc. In RPG games, there are problems with using fast-changing features as offline features. In the process of making offline features, the user’s behavior data and prop distribution information of some time ago will be periodically entered into Redis to
provide features for online services. In this way, the features become outdated and inaccurate; that is, changes in the user’s level cannot be captured in real time. Figure 1 illustrates the problem of using offline features online. Suppose we write features to Redis daily. A user is at level 20 on March 2 and is upgraded to level 25 on March 3. If the player initiates a request on March 3, the recommended props are still based on the offline features of the user at level 20, because the updated feature has not yet been written to Redis. Similar problems exist for other rapidly changing characteristics.
Fig. 1. Problems with offline features
Sample splicing extracts, summarizes and splices features from multiple data sources for model training. Figure 2 shows the inconsistency between offline sample splicing and the features used by the online service. In the offline sample splicing process, the user’s features from previous days are selected as the sample features for model training. Take a user request as an example to simulate how online user features are used: if the user initiates a request on March 3, since the user features of March 3 have not been updated to Redis, the online service uses the offline features of March 1. At this point there is a conflict: the online service uses the features of March 1, while the model training uses the features of March 2.
Fig. 2. Online and offline features are inconsistent
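The core idea of fixing this can be sketched as follows (a conceptual illustration with an in-memory dict standing in for HBase and hypothetical field names): the features actually used to answer a request are snapshotted at serving time, and the purchase label is later joined to exactly that snapshot.

```python
# Conceptual sketch of real-time sample splicing: the features actually used
# to serve a request are snapshotted immediately, and the later purchase
# event is joined to that snapshot, so training never sees fresher features
# than the online service did. A dict stands in for HBase here.
import time

feature_store = {}  # request_id -> (timestamp, feature snapshot)

def on_serve(request_id: str, features: dict) -> None:
    """Called when the ranking service answers a request."""
    feature_store[request_id] = (time.time(), dict(features))  # copy, not reference

def on_purchase(request_id: str, item_id: int):
    """Called when a purchase event for the request arrives; emits a sample."""
    served = feature_store.get(request_id)
    if served is None:
        return None  # exposure expired or lost; drop the label
    _, features = served
    return {**features, "item_id": item_id, "label": 1}

on_serve("req-42", {"level": 25, "top_items": [1001, 1003]})
print(on_purchase("req-42", 1001))  # positive sample with serving-time level 25
```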
At present, real-time frameworks have become the development trend in recommendation and are widely used in many large Internet companies. For example, a highly
visited advertising space on a website needs its traffic-driving effect monitored in real time. If the conversion rate is very low, operators need to replace it with other advertisements in time to avoid wasting traffic resources. In this example, it is necessary to count the exposures and clicks of the advertising space in real time as a reference for operational decision-making. To solve the above two problems, this paper uses real-time computation to generate real-time features instead of offline features for the rapidly changing features, and uses real-time sample splicing to replace the previous offline sample splicing to solve the inconsistency between online and offline features.
2 Real Time Framework Description

2.1 Real Time Computing Framework Flink
At present, the main real-time processing frameworks are Flink [10, 11], Storm [12–14] and Spark Streaming [15, 16]. Their advantages and disadvantages are briefly compared in Table 2.

Table 2. Comparison of real-time frameworks

| | Flink | Spark Streaming | Storm |
|---|---|---|---|
| Fault tolerance | Distributed snapshots (checkpoint mechanism based on Chandy-Lamport) | WAL and RDD lineage mechanism | Record ACKs |
| Processing model and delay | Single event processing, sub-second low latency | All events in an event window, high delay of seconds | Single event processing, sub-second low latency |
| Throughput | High | Medium | Low |
| Data processing assurance | Exactly once | Exactly once | At least once (exactly-once semantics also available) |
Flink is a distributed computing engine that can perform both stream computing and batch processing. It adopts a native stream processing model to ensure low latency, its API and fault-tolerance support are relatively complete, and it is comparatively simple to use and easy to deploy. Flink can process stateful data and, through its state mechanism, ensures that data will not be lost during job failover; it allows users to process data according to event time and advances time through watermarks. From the perspective of our needs, we must process user requests and exposure and purchase logs in real time simultaneously, with high requirements for the throughput and delay of the real-time framework, so the Flink framework is used for real-time computing [17, 18].
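The watermark mechanism mentioned above can be illustrated without the Flink API; the plain-Python sketch below (with illustrative parameters) closes a tumbling window only once the watermark, which trails the maximum observed event time, has passed the window's end.

```python
# Plain-Python illustration of the watermark idea (not the Flink API):
# events arrive out of order, and a watermark trailing the maximum observed
# event time decides when a 10-second tumbling window may close.
MAX_DELAY = 5          # allowed out-of-orderness, seconds
WINDOW = 10            # tumbling window size, seconds

windows, max_event_time = {}, 0

def on_event(event_time: int, value: int):
    global max_event_time
    start = event_time - event_time % WINDOW
    windows.setdefault(start, []).append(value)
    max_event_time = max(max_event_time, event_time)
    watermark = max_event_time - MAX_DELAY
    # fire every window whose end is not later than the watermark
    for s in [s for s in windows if s + WINDOW <= watermark]:
        print(f"window [{s}, {s + WINDOW}) -> count={len(windows.pop(s))}")

for t, v in [(1, 1), (12, 1), (8, 1), (27, 1)]:  # note the late event at t=8
    on_event(t, v)  # the late event still lands in window [0, 10)
```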
2.2 Distributed Message Queue Kafka
Kafka is a distributed message queue. When saving messages, Kafka classifies them by topic. The sender of messages is called the producer and the receiver the consumer; in addition, a Kafka cluster consists of multiple Kafka instances, each called a broker [19].

(1) Message persistence. Kafka relies heavily on the file system to store and cache messages. A data persistence queue can be built on a simple append-only file scheme. Because writes are sequential appends, Kafka adopts a disk structure with time complexity O(1): it provides constant-time performance even when storing a large amount of information (TB level), and performance has little relationship with data size. Precisely because Kafka persists messages, stored messages can continue to be reused after a machine restart. At the same time, Kafka can well support online or offline processing and integration with other storage and stream processing frameworks [20].

(2) High throughput. High throughput is a main design goal of Kafka. Kafka writes data to disk and makes full use of sequential disk reading and writing; it also adopts zero-copy technology for data writing and data synchronization.

(3) Data backup. Kafka can specify the number of replicas for each topic and persistently back up data, which can prevent data loss and improve availability to a certain extent.
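A minimal producer/consumer pair using the kafka-python client is sketched below; the broker address, topic name and group ID are hypothetical stand-ins for the game's reporting pipeline described later.

```python
# Minimal producer/consumer sketch with the kafka-python client; broker,
# topic and group names are hypothetical.
import json
from kafka import KafkaProducer, KafkaConsumer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)
# one purchase event in the format of Table 3 below
producer.send("purchase_stream", {
    "plat": 1, "open_id": "u-001", "world_id": 7, "role_id": "r-9",
    "count": 2, "item_id": 1001, "game": "demo_rpg",
})
producer.flush()

consumer = KafkaConsumer(
    "purchase_stream",
    bootstrap_servers="localhost:9092",
    group_id="feature-builder",      # consumers in a group share partitions
    value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    auto_offset_reset="earliest",
)
for record in consumer:
    print(record.value["item_id"], record.value["count"])
    break
```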
3 Project Implementation

3.1 Design and Implementation of Real-Time Recommendation System
The real-time mystery store recommendation system designed in this paper includes three modules: a data development module, an offline training module and an online service module. The data development module receives and processes real-time streams, generates real-time user features, real-time interaction features and real-time prop features, and produces model training samples. User features record the basic information of users, such as level and game age; interaction features describe the players’ use of props in the game; prop features are the properties of the props themselves, such as sales. The offline training module trains the model on the positive and negative samples generated by the real-time tasks and saves the model to HDFS. The online service receives user requests, pulls real-time features and the model, and returns the ranked results to the front end for display (Fig. 3).
Fig. 3. Overall framework of the real-time system
3.2 Design and Implementation of Data Development Module
The real-time task receives and processes the real-time exposure and purchase flow, requests the dapan and datamore according to the user information to construct the offline user characteristics, then sorts the number of stores purchased by the user to obtain the real-time interaction characteristics, calculates the purchase distribution of props at each level to obtain the real-time props characteristics, and finally writes the spliced positive and negative samples into HBase (Fig. 4).
Fig. 4. Data development module processing flow
(1) Data source The data source of this paper comes from the shopping mall on the game side. The player’s behavior data is reported to Kafka, and then consumed by real-time tasks. The data format is shown in the table below, mainly including platform, player’s unique identification, purchased goods, purchased quantity and other information (Table 3).
Table 3. Real time purchase stream data format

| Field column name | Field type | Description |
|---|---|---|
| plat | int | Platform type |
| open_id | string | Uniquely identifies a user |
| world_id | int | Cell ID |
| role_id | string | Role ID |
| count | int | Purchase quantity |
| item_id | int | Commodity ID |
| game | string | Game name |
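A small helper mirroring the Table 3 layout might look as follows; the JSON wire format and the field values are assumptions for illustration.

```python
# Hypothetical helper mirroring the Table 3 record layout; field names and
# types follow the table, the JSON wire format is an assumption.
import json
from dataclasses import dataclass

@dataclass
class PurchaseEvent:
    plat: int       # platform type
    open_id: str    # uniquely identifies a user
    world_id: int   # cell ID
    role_id: str    # role ID
    count: int      # purchase quantity
    item_id: int    # commodity ID
    game: str       # game name

def parse_purchase(raw: bytes) -> PurchaseEvent:
    return PurchaseEvent(**json.loads(raw))

evt = parse_purchase(b'{"plat":1,"open_id":"u-001","world_id":7,'
                     b'"role_id":"r-9","count":2,"item_id":1001,"game":"demo_rpg"}')
print(evt.item_id, evt.count)
```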
(2) Get offline user features. In addition to the player’s behavior data in the game, the player’s historical behavior characteristics are also needed to further characterize the player. Here, the player’s portrait is mainly recorded by dapan and datamore. dapan depicts the payment situation of players across all games and the basic information of users, such as total payment amount and gender. datamore depicts the player’s portrait in the current game, such as level, VIP level and payment category. Acquisition method: according to the user information, request the dapan and datamore interfaces, and finally splice the returned results.

(3) Get real-time interactive features. Redis is used to record the number of each item purchased by players in the mall. In order to strengthen the impact of recent purchases, we multiply the purchase quantity by a time attenuation factor, so we additionally record a timestamp of the latest update to the item purchase quantity. The key stored in Redis has the format game name + open_id + world_id + role_id, and the value is a string of the form item_id + cumulative purchase quantity. After a user purchases a new product, all product information previously purchased is fetched from Redis, and the purchase quantities are updated and sorted. The top n items by purchase quantity are taken as the real-time interaction feature.

(4) Get real-time prop features. Since we need to count the purchase quantity of props, the real-time task initially computed statistics keyed by prop. I found that when the traffic was large, the worker kept restarting, as shown in Fig. 5: the time in the red box is obviously shorter than the others, indicating that the task has been restarted automatically. Observing the log, one finds that the task is abnormal. The execution of the task can then be viewed through the Storm UI, as shown in Table 4. From the table, we can see that the amount of data processed by different tasks varies by an order of magnitude, which is the phenomenon of data skew.
Fig. 5. The automatic restart of work.

Table 4. The execution of the task

| Task | Load | Number of processed data |
|---|---|---|
| 001 | 0.039 | 584540 |
| 002 | 0.56 | 520540 |
| 003 | 0.222 | 35720 |
| 004 | 0.692 | 25720 |
Fig. 6. Descending curve of purchase quantity
To further confirm whether there is data skew, I queried the purchase quantities of props from the HBase table over a period of time; the skew can be clearly seen from Fig. 6, where the abscissa is the ID of the item and the ordinate is the number of items purchased. The number of purchased props is extremely uneven. Thus, it can be determined that the task has a data skew problem.
Fig. 7. Multi-threaded optimization
Therefore, I use props + level grouping to break up the data, as shown in Fig. 7. Similarly, I need to summarize and synchronize the data through the third-party component Redis. To ensure the efficiency of synchronization, I create an additional thread to handle data synchronization. Since both the main thread and the auxiliary thread need to modify and access critical resources, I ensure the safety of data synchronization through a semaphore mechanism and allow the main thread and the auxiliary thread to run in parallel when updating different props. Note that when using a thread pool to manage multiple threads, the number of available threads should not be set too large: too large a value may cause Redis operations to time out, while too small a value makes the optimization ineffective. In this experiment, I set the parameter to 5, the best value obtained after many experiments. After solving these problems, as shown in Fig. 8, the data processed by each task is more uniform and the overall load is reduced; the task load base has been reduced by 10 times.
Fig. 8. Improved effect
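The grouping idea above can be sketched as follows (a conceptual illustration with plain dicts instead of a real stream): the stream is keyed by (item, level) instead of by item alone, and the partial counts are merged in a second stage, as done via Redis in the paper.

```python
# Conceptual sketch of the "props + level" key salting: keying the stream by
# item_id alone concentrates hot items on a few tasks, while keying by
# (item_id, level) spreads the load; partial counts are merged afterwards.
from collections import Counter

events = [  # (item_id, player_level) purchase events, hypothetical
    (1001, 21), (1001, 22), (1001, 21), (2002, 35), (1001, 23), (2002, 36),
]

partial = Counter((item, level) for item, level in events)  # salted keys

# second stage: merge the partial counts per item (in the paper, via Redis)
totals = Counter()
for (item, _level), n in partial.items():
    totals[item] += n

print(dict(partial))  # load is spread across (item, level) keys
print(dict(totals))   # {1001: 4, 2002: 2}
```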
After solving the data skew problem and using multithreading to optimize the system, the overall throughput of the system increased by more than 50%.

(5) Solve the problem of inconsistency between online and offline features. The features in effect at purchase time are recorded in real time, and the spliced samples are landed in HBase, ensuring that the samples used for model training are consistent with the features used by the online service.

3.3 Model Services and Online Services Design and Implementation
After the data development module accumulates positive and negative sample data, Spark is used to process the data and train the model, which is written to HDFS. The processing flow is briefly introduced below; a sketch of the data-enhancement step follows this list.

(1) Pull the sample data and enhance the positive samples. During training, we found that some props were purchased very few times, so it is difficult for the model to learn information about them. Therefore, we up-sample the training data for props with relatively few purchases (data enhancement). The current practice is to count the purchase times of all props and sort them to obtain the median number of purchases; median * n is then taken as the target number of purchases for a prop. If a prop appears fewer times than this target, its data are repeated until the target is reached. In training tests, we found that the model performs best when n is 5.

(2) Partition the dataset. The training samples are divided by time: the data of the previous week are used as the training set and the eighth day as the test set.

(3) Model training. The XGBoost [21] algorithm is used to train the model. Due to the large number of samples, Spark [22] is used here.

(4) Online service. User requests are split by user ID for ABTest. The main recommendation process is as follows: (1) request dapan and datamore according to the user information and build the user features; (2) pull real-time interaction features and real-time prop features from Redis according to the user information; (3) call the model service, input the built samples into the model, and return the sorted list.
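The data-enhancement step (1) can be sketched as follows; sample records are simplified to bare item IDs, and the median * n rule follows the description above.

```python
# Sketch of the median-based positive-sample enhancement: props purchased
# fewer than median*n times have their samples repeated until they reach
# that target count.
from collections import Counter
from statistics import median

def enhance(samples: list[int], n: int = 5) -> list[int]:
    counts = Counter(samples)
    target = median(counts.values()) * n
    out = list(samples)
    for item, c in counts.items():
        if c < target:
            # repeat this prop's samples until the target count is reached
            out.extend([item] * int(target - c))
    return out

raw = [1] * 500 + [2] * 40 + [3] * 2    # prop 3 is rarely purchased
print(Counter(enhance(raw)))             # props 2 and 3 are up-sampled to median*5
```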
4 System Implementation Test and Effect

Through ABTest, we divide the online user requests into three parts: the first part uses the offline algorithm, the second the real-time rule prop pool, and the third the real-time full prop pool. The rule item pool refers to the pool recommended by
different levels of servers. We can see that the exposure ARPU of the real-time rule prop pool is 14.75% higher than that of the offline algorithm (Table 5).

Table 5. Comparison of the effects of the offline algorithm and real time - rule pool

| | Number of exposed users | Number of users purchased | Purchase amount | Exposure purchase rate | Exposure ARPU | Exposure ARPU increase |
|---|---|---|---|---|---|---|
| Offline algorithm | 32179 | 3037 | 700521 | 9.44% | 21.77 | 0% |
| Real time - rule pool | 3242 | 377 | 80987 | 11.63% | 24.98 | 14.75% |
| Real time - full prop pool | 3315 | 438 | 87097 | 13.21% | 26.27 | 20.67% |
Rule item pool: servers of different levels recommend different pools – the levels are manually limited, consistent with the offline algorithm. Full props pool: the union of all recommended props, with no level limit. As can be seen from the table, the model using real-time samples performs significantly better than the model using offline samples. The real-time rule pool performs worse than the real-time full prop pool, indicating that once the features are real-time, the model learns the level limit by itself and does so better than a manual level restriction.
5 Conclusion

Through experiments, it is found that making rapidly changing features real-time solves the problem of poor recommendations caused by the rapid level changes in RPG games, while landing purchased samples in real time solves the problem of inconsistent online and offline features. The statistical data show that the optimized system performs significantly better than the previous offline algorithm and meets expectations. Applying real-time technology to RPG game recommendation can significantly improve the effect of the model. The current recommendation system only supports real-time features and real-time sample splicing, and does not provide online training. The advantage of offline training is that it is relatively stable: a large amount of data can be used for training and evaluation, and an ineffective model can easily be replaced. However, the advantages of online training are also obvious: training samples give real-time feedback, models are produced in real time, and the data are highly timely. In the future, we plan to deploy online training, and we believe that better results can be achieved in RPG game recommendation.
References

1. Hui, A.: Research on diversity and novelty of personalized recommendation system. University of Electronic Science and Technology (2020)
2. Wang, G., Liu, H.: Overview of personalized recommendation system. Comput. Eng. Appl. 7 (2012)
3. Chen, C., Zhang, X., Ju, S., et al.: AntProphet: an intention mining system behind Alipay’s intelligent customer service bot. In: IJCAI, pp. 6497–6499 (2019)
4. Yong-hong, L., Xiao-Liang, L.: Research of data mining based on e-commerce. In: 2010 3rd International Conference on Computer Science and Information Technology, vol. 4, pp. 719–722. IEEE (2010)
5. Bertens, P., Guitart, A., Chen, P.P., Perianez, A.: A machine-learning item recommendation system for video games. In: 2018 IEEE Conference on Computational Intelligence and Games (CIG), pp. 1–4 (2018). https://doi.org/10.1109/CIG.2018.8490456
6. Li, P., Zhang, G., Chao, L., Xie, Z.: Personalized recommendation system for offline shopping. In: 2018 International Conference on Audio, Language and Image Processing (ICALIP), pp. 445–449 (2018). https://doi.org/10.1109/ICALIP.2018.8455252
7. Cai, Q.: Discussion on role design of multiplayer online role-playing games. Dissertation of Institute of Visual Communication Design, Kunshan University of Science and Technology, pp. 1–197 (2006)
8. Li, J., Zhou, B.: Research on recommended system construction based on SSM framework. In: 2021 IEEE 4th International Conference on Computer and Communication Engineering Technology (CCET), pp. 207–211 (2021). https://doi.org/10.1109/CCET52649.2021.9544434
9. Sinitsyn, R.B., Yanovsky, F.J.: Kernel estimates of the characteristic function for radar signal detection. In: European Radar Conference, EURAD 2005, Paris, France, pp. 53–56 (2005). https://doi.org/10.1109/EURAD.2005.1605562
10. Mika, P.: Flink: semantic web technology for the extraction and analysis of social networks. J. Web Semant. 3(2–3), 211–223 (2005)
11. Carbone, P., Katsifodimos, A., Ewen, S., et al.: Apache flink: stream and batch processing in a single engine. Bull. IEEE Comput. Soc. Tech. Commit. Data Eng. 36(4) (2015)
12. Iqbal, M.H., Soomro, T.R.: Big data analysis: apache storm perspective. Int. J. Comput. Trends Technol. 19(1), 9–14 (2015)
13. Evans, R.: Apache storm, a hands-on tutorial. In: 2015 IEEE International Conference on Cloud Engineering, p. 2. IEEE (2015)
14. Van Der Veen, J.S., Van Der Waaij, B., Lazovik, E., et al.: Dynamically scaling apache storm for the analysis of streaming data. In: 2015 IEEE First International Conference on Big Data Computing Service and Applications, pp. 154–161. IEEE (2015)
15. Ichinose, A., Takefusa, A., Nakada, H., et al.: A study of a video analysis framework using Kafka and spark streaming. In: 2017 IEEE International Conference on Big Data (Big Data), pp. 2396–2401. IEEE (2017)
16. Chintapalli, S., Dagit, D., Evans, B., et al.: Benchmarking streaming computation engines: storm, flink and spark streaming. In: 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp. 1789–1792. IEEE (2016)
17. Stankovic, J.A.: Misconceptions about real-time computing: a serious problem for next-generation systems. Computer 21(10), 10–19 (1988)
18. Kreps, J., Narkhede, N., Rao, J.: Kafka: a distributed messaging system for log processing. In: Proceedings of the NetDB, vol. 11, pp. 1–7 (2011)
LSTM-Based Load Prediction for Communication Equipment

Rui Guo1, Yongjun Peng1,2, Zhipeng Gong3, Anping Wan3, and Zhengbing Hu4

1 The PLA Rocket Force Command College, Wuhan, China
2 Department of Information Communication, National University of Defense Technology, Wuhan 430010, China
3 Department of Mechanical Engineering, Zhejiang University City College, Hangzhou 310015, China
[email protected]
4 Faculty of Applied Mathematics, National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute", Kyiv 03056, Ukraine
Abstract. To reduce the failure rate of communication equipment and to achieve proactive, preventive maintenance, it is important to grasp changes in equipment load in a timely and accurate manner. This paper proposes a load prediction model for communication equipment units based on a multivariate LSTM neural network. The power influencing factors are extracted by correlation analysis, and the mean absolute percentage error is used as the criterion for evaluating the model, which predicts the power output of communication equipment units over a future period and in turn provides a reference for operation and maintenance personnel when scheduling equipment. Case validation shows that, compared with traditional deep learning methods, the error of the multivariate LSTM neural network in predicting equipment power is as low as 0.94%, while the error rates of the RNN and GRU methods are 3.35% and 2.24% respectively, demonstrating that the proposed prediction method is more effective and generally applicable.

Keywords: Communication equipment · Multivariable long short-term memory · Output prediction · Deep learning
1 Introduction
In recent years, with the development of artificial intelligence technology, more and more machine learning models have been applied to the prediction of time series data. Deep learning is the most popular branch of machine learning. Convolutional neural networks (CNN) [1] can effectively handle spatial information and are now widely used in machine vision, while Long Short-Term Memory (LSTM) neural networks [2] handle temporal information better, effectively learning the patterns in historical time series data and efficiently predicting the data distribution at a future point in time; their applications in speech recognition, machine translation and industrial big data
are already quite mature. Elzayady et al. [3] proposed a CNN-LSTM prediction model combining a convolutional neural network with a long short-term memory neural network to achieve accurate prediction of future coal storage in power plants. Chen Bo et al. [4] used an LSTM model to predict coal-mill time-series currents, obtaining the unit performance degradation trend and accurately predicting unit maintenance time nodes. Xiao Liang et al. [5] used an intelligent scheduling algorithm combining LSTM with an ant colony algorithm to predict the coal consumption of a unit under given conditions, which is more accurate and efficient than traditional methods. Electricity load forecasting is significant for energy management, infrastructure planning, etc., and machine learning models are gaining ground in this field [6–9]. Communication equipment accumulates a large amount of data during operation, and these data are inextricably linked to the operating state of the unit; mining the potential value behind them is of great significance for reducing the incidence of equipment failure and guiding optimal operation of communication equipment units. For example, Wan Xiang [10] combined the MapReduce framework of a power plant's big data analysis platform with improved association rule algorithms to optimize the operating status of communication equipment, and finally developed a performance monitoring and evaluation system that assesses and optimizes the unit across a comprehensive set of indices. Going one step further, using machine learning to predict the future operating state of a unit under set working conditions (such as power) from the historical operation data of communication equipment would allow effective decisions to be made in advance and the relevant unit parameters to be adjusted, yielding higher economic benefits; this is undoubtedly of great significance for the operation and maintenance of communication equipment units. At present, the management of communication equipment is still based on "reactive" maintenance. In fault handling, for example, maintenance only starts after a fault has occurred, which significantly reduces efficiency [11]. Proactive, preventive equipment maintenance becomes possible if the operating load of communication equipment units can be predicted accurately and efficiently and multiple units can be scheduled accordingly. Therefore, by establishing a multi-parameter LSTM neural network model, this paper predicts the power value at future points in time from the parameters in the operational history of communication equipment that are highly correlated with power (e.g., atmospheric temperature, fuel quantity), and uses the predicted power trend as the basis for unit load scheduling, making "proactive" maintenance possible.
2 Extraction and Analysis of Unit Performance Characteristic Parameters
The operating capacity of communication equipment units is closely related to the ambient temperature, while the operating parameters of each load section are also a prerequisite for determining whether operation is possible. The data used in this paper are the operating data of two communication equipment units over a total of one week,
from December 24, 2019 to January 1, 2020, comprising 60 status parameters such as air mass flow rate, fuel quantity and power output. Using data of such high dimensionality directly for model building is undesirable, and only some of these state parameters are strongly correlated with the communication equipment power considered in this paper. Data pre-processing and feature extraction are therefore needed first: the state parameters most correlated with the equipment power are selected as inputs for subsequent model training and data prediction, reducing the data dimensionality and improving model training efficiency.
Fig. 1. Correlation analysis: a-mass air flow; b-atmospheric temperature; c-inlet air temperature of compressor; d-outlet steam flow of medium pressure superheater; e-communication equipment power
First, the sample data measured under shutdown conditions were excluded and all data were sorted by time (the number of hours the unit had been in operation). Second, this paper uses the Pearson correlation coefficient [12] to measure the correlation between
the different parameters and the power of the communication equipment. The correlation with equipment power exceeds 50% for four state parameters: air mass flow (the fuel quantity indicator is implicitly included in the air mass flow and reflected through the air-fuel ratio), atmospheric temperature, inlet air temperature and bearing temperature (as shown in Fig. 1), and these are selected as the inputs of the model.
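This screening step can be sketched in a few lines of pandas; the file and column names below (e.g. "power", "run_hours") are hypothetical placeholders and not taken from the paper.

import pandas as pd

# load the unit operation records (hypothetical file name)
df = pd.read_csv("unit_operation_data.csv")

# exclude shutdown samples and sort by hours in operation
df = df[df["power"] > 0].sort_values("run_hours")

# Pearson correlation of every state parameter with unit power
corr = df.corr(numeric_only=True)["power"].drop("power")

# keep parameters whose absolute correlation with power exceeds 0.5
selected = corr[corr.abs() > 0.5].index.tolist()
print(selected)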
3 Load Forecasting Model

3.1 Introduction to Multivariate LSTM Time Series Prediction
The LSTM, a variant of the RNN, was first proposed by Hochreiter and Schmidhuber in 1997. Compared with the traditional recurrent neural network, the LSTM effectively avoids gradient explosion or vanishing during model training. Almost all exciting RNN-based results have been achieved with LSTMs, which have consequently become a focus of deep learning. The LSTM is now widely used for a variety of tasks owing to its powerful learning capability and excellent practicality [13]. The basic structural unit of the LSTM neural network is shown in Fig. 2.
Fig. 2. Basic unit structure of the LSTM network model (forgetting gate, input gate and output gate acting on the cell state C_t)
Compared with the traditional RNN, memory units are added to each neural unit in the hidden layer to control the memory of information along the time series. Each neural unit passes information through the "gate" structures (forgetting gate, input gate, cell state update and output gate), which determine the degree to which past and immediate information is remembered or forgotten, giving the network a long-term memory function.
Precisely because the LSTM uses input, forgetting and output gates to regulate the information trade-off between the neurons of each network layer and to control the rate of information accumulation, the model can more easily capture long-distance dependencies, which is more practical and accurate for huge time series such as power plant unit operation data.
Fig. 3. Schematic diagram of multivariable input LSTM network prediction
Compared with traditional univariate-input LSTM prediction, multivariate-input LSTM prediction is more accurate; its principle is shown in Fig. 3. The multivariate-input LSTM network establishes a sliding window between the data at a set time step n and the data to be predicted at moment t: the input data and the predicted quantity over the period [t − n, t − 1] are used as features, the value to be predicted at moment t is used as the label, and the past n multivariate inputs are used to predict the label value at the current moment; the window then slides over the data set until the last label. Compared with a univariate-input LSTM model that uses only the quantity to be predicted as input, this method considers more of the relevant factors and achieves higher prediction accuracy.
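As a minimal illustration of this windowing, the following numpy sketch builds the feature tensor and label vector; the function and variable names are illustrative only, not from the paper.

import numpy as np

def make_windows(data, target, n):
    """data: (T, F) array of multivariate inputs; target: (T,) power series."""
    X, y = [], []
    for t in range(n, len(data)):
        X.append(data[t - n:t])   # past n multivariate observations as features
        y.append(target[t])       # value to be predicted at moment t as label
    return np.array(X), np.array(y)

# e.g. n = 3, matching the window length used later in Sect. 3.3.1:
# X, y = make_windows(features, power, n=3)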
3.2 Model Building
The proposed multivariate-input LSTM method for predicting the power of communication equipment units has three main components: pre-processing of the unit operation data, LSTM model building and training, and importing new data for prediction.
Step 1: Data collection (this paper uses the operation data of two communication equipment units, about 16,000 records of 60-dimensional data in total);
Step 2: Data pre-processing, which performs outlier handling, missing-value completion, normalization, and conversion of the data into a form usable for supervised learning;
Step 3: Dividing the data of unit 1 into training and test sets and setting the data of unit 3 as the validation set;
Step 4: Building the multivariate-input LSTM model and substituting the training and test set data into the model for training;
Step 5: Substituting the validation set data into the model for prediction;
Step 6: Evaluating the model by two metrics, the loss value during model training and the root mean square error of the prediction; the smaller both are, the higher the prediction accuracy of the model;
Step 7: Generating true-value/predicted-value graphs to visually evaluate the predictive performance of the model.
Based on the LSTM neural network model and the principle of multivariate-input time series prediction, the power prediction model proposed in this paper uses the power curve before the prediction time point together with the power influencing factors to predict the power at the prediction time point. The output of the model is the power value at the predicted time point, as shown in Eq. (1):

\hat{L}_d = [\hat{l}_{0,d}, \hat{l}_{1,d}, \ldots, \hat{l}_{i,d}, \ldots, \hat{l}_{T,d}]    (1)

The historical power values are:

L_{t,d_w} = [l_{t,d_1}, l_{t,d_2}, l_{t,d_3}, \ldots, l_{t,d_w}]    (2)

where l_{t,d_w} is the power value of the first w data samples before the prediction time point. The power influencing factors are:

F_{t,d} = [a_{t,d}, b_{t,d}, c_{t,d}, d_{t,d}]    (3)

where a_{t,d} is the air mass flow, b_{t,d} the atmospheric temperature, c_{t,d} the inlet air temperature, and d_{t,d} the bearing temperature. In the established power prediction model, the input at time t is composed of the historical power values and the power influencing factors, expressed as I_input = [L_{t,d_w}, F_{t,d}], and the output value is O_output = \hat{l}_{t,d}.
3.3 Model Training
3.3.1 Data Pre-processing
Data pre-processing consists mainly of two steps: data vectorization and normalization. Neural networks are based on linear algebra, so the raw data cannot be fed directly into the model for training; doing so would produce large errors or even data type mismatches, and the raw data must first be transformed into vectors usable for supervised learning [14]. Data vectorization stitches together the power influencing factors and the historical power values: a sliding window of step 3 is selected to extract and stitch the data, which are finally transformed into a vector. Neural networks commonly use gradient-descent-based backpropagation for model training, and normalizing the data to a standard interval is more
conducive to model training and solving. The minimum-maximum normalization method is used to scale each element of the vector to the interval [0, 1]; its formula is shown in Eq. (4):

X_{norm} = (X - X_{min}) / (X_{max} - X_{min})    (4)
where X_{max} and X_{min} are the maximum and minimum values respectively, X is the original value, and X_{norm} is the normalized value.
3.3.2 Evaluating Indicators
Commonly used prediction evaluation metrics include the root mean square error (RMSE), the mean absolute error (MAE) and the mean absolute percentage error (MAPE). In this paper, when comparing the LSTM algorithm with other methods, RMSE and MAPE are used to analyze the performance of the algorithms, as follows:
e_{RMSE} = \sqrt{ (1/N) \sum_{i=1}^{N} (\hat{L}_i - L_i)^2 }    (5)

e_{MAPE} = (1/N) \sum_{i=1}^{N} | (\hat{L}_i - L_i) / L_i | \times 100%    (6)
where N is the total number of training samples, \hat{L}_i is the predicted power, and L_i is the actual power. In this paper, MAPE is used as the loss function for evaluating the convergence of the model during iteration, and RMSE is used as a criterion to further evaluate the prediction accuracy of the model.
3.3.3 Hyperparameter Setting
During model training, feeding all the data through the network once is called a round (epoch). The number of rounds determines the training effect of the model: too few rounds lead to underfitting and too many to overfitting. After several experimental adjustments, this paper finally uses an LSTM network with one hidden layer of 50 nodes; the input is the multidimensional array obtained after data pre-processing, the output layer is a fully connected layer with one linearly activated neuron, and the output is the predicted power value of the communication equipment. The learning rate of the overall model is set to 0.01, and to prevent overfitting a dropout layer that filters out 20% of the network parameters is added before the output layer. The pre-processed data of unit 1 are divided into training and test sets (ratio 4:1) and imported into the model for training; 32 samples are fed to the model per training step, for a total of 500 rounds.
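The configuration just described could be expressed, for instance, in Keras; the following sketch is an illustration under stated assumptions (the input shape of three time steps by five variables follows Sects. 2 and 3.1 and is not given explicitly in the paper), not the authors' actual code.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import RootMeanSquaredError

model = Sequential([
    LSTM(50, input_shape=(3, 5)),       # one hidden LSTM layer with 50 nodes
    Dropout(0.2),                       # filters out 20% of the parameters
    Dense(1, activation="linear"),      # single linearly activated output neuron
])
model.compile(optimizer=Adam(learning_rate=0.01),
              loss="mean_absolute_percentage_error",  # MAPE loss, Sect. 3.3.2
              metrics=[RootMeanSquaredError()])       # RMSE as extra criterion
# model.fit(X_train, y_train, validation_data=(X_test, y_test),
#           batch_size=32, epochs=500)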
3.3.4 Training Process Evaluation
The trend of the loss function when the unit 1 data are fed into the multivariate LSTM power prediction model built in this paper is shown in Fig. 4.
Fig. 4. Variation of model training error
Fig. 5. Comparison of error trends of different models
As can be seen from Fig. 4, during the first 500 iterations the loss function on the LSTM model's training set decreases significantly and finally converges to zero, while the loss function on the test set is at a low level from the beginning, indicating
the effectiveness of this LSTM model in mining and learning the multi-factor load features of unit 1. Figure 5 below compares the LSTM model used in this paper with a traditional RNN and the GRU variant. As can be seen from Fig. 5, during power prediction the loss value of the LSTM model converges faster than that of the RNN and is more stable than that of the GRU in the late iterations.
3.4 Analysis of Prediction Results
To test the prediction performance of the multivariate-input LSTM model, a new batch of data (the validation set) is input to verify the prediction accuracy. After data pre-processing, the equipment power and the feature-extracted power influencing factors of unit 3 are input to the model as the validation set, and the error trend of the prediction process is shown in Fig. 6.
Fig. 6. Variation of power prediction error
As the figure shows, when the data of unit 3 are used as the validation set, the loss value during prediction follows the same trend as for unit 1 and the error tends to 0 after 500 model iterations; the model can predict the equipment power value almost exactly, indicating its excellent generalization ability. The prediction results for unit 3 are visualized against the real values in Fig. 7. Panel (a) compares 600 data points collected during general operation of the equipment, where the power values fluctuate considerably, to test the model on unstable, oscillating data; panel (b) compares 100 data points whose power is stably distributed between 370 kW and 380 kW, to test the accuracy of the model's prediction of specific power values at future time points.
Fig. 7. Results of power prediction
The prediction results of the LSTM neural network model overlap closely with the actual values, both for long-range prediction with large data fluctuations and for short-range prediction with high data stability. Compared with the other two methods, the multivariate LSTM time series prediction adopted in this paper is clearly more accurate than the RNN for communication equipment load, while the overall stability of its results is comparable to that of the GRU; the specific error-index values must be examined for a further comparison. The RMSE and MAPE values of each algorithm are shown in Table 1.
Table 1. Evaluation indices

Algorithm   e_MAPE / %   e_RMSE / kW
RNN         3.35         7.478
GRU         2.24         6.143
LSTM        0.94         4.071
The RMSE values of the RNN and GRU predictions are 7.478 kW and 6.143 kW respectively, while that of the LSTM model is 4.071 kW; the MAPE values of the RNN and GRU predictions are 3.35% and 2.24% respectively, while that of the LSTM model is only 0.94%. For short-term load prediction in power systems, Wu L. et al. [15] proposed a hybrid GRU-CNN neural network model whose best MAPE was 2.8%. The comparisons of RMSE and MAPE show that the LSTM neural network model predicts communication equipment power more accurately than the other two methods, and Fig. 7 shows that the error between the LSTM model's predicted and true values stays in a lower, more stable range, indicating that the LSTM model has better application value in communication equipment performance prediction.
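For reference, the two indices in Table 1 can be computed from prediction and ground-truth arrays as in the generic numpy sketch of Eqs. (5) and (6) below; this is not code from the paper.

import numpy as np

def rmse(pred, actual):
    # root mean square error, Eq. (5)
    return np.sqrt(np.mean((pred - actual) ** 2))

def mape(pred, actual):
    # mean absolute percentage error, Eq. (6)
    return np.mean(np.abs((pred - actual) / actual)) * 100.0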
4 Conclusions
To predict the operating load and output load changes of communication equipment accurately and efficiently, this paper proposes a load prediction model for communication equipment units based on a multivariate-input LSTM neural network. Exploiting the LSTM's ability to learn long-distance time-series dependencies, the model identifies the changing pattern of equipment power along the time dimension and the non-linear influence of air flow and other factors along the influence dimension. An LSTM time series model is developed to predict the power curves of two different communication equipment units and to verify the accuracy and reliability of the method; comparing the predicted values with the true values shows that:
(1) The LSTM-based load prediction model can accurately predict the change of unit power on a future day; the mean absolute percentage error between predicted and actual values is 0.94%.
(2) Compared with traditional deep learning models, the LSTM model adopted in this paper has higher accuracy and generalization ability.
(3) Adjusting the unit in time according to the prediction results of the LSTM-based load prediction model can effectively improve the operating efficiency of the communication equipment unit and enhance its economy.
In this paper, a deep learning algorithm is applied to the load forecasting and scheduling optimization of communication equipment units, promoting the application of artificial intelligence technology in the field of thermal power generation. Compared with traditional load scheduling methods, the method proposed in this paper achieves genuinely accurate load forecasting for communication equipment units.
Acknowledgments. This research is financially supported by the National Natural Science Foundation of China (51705455), the Aviation Science Foundation (20183333001) and a China Postdoctoral Science Foundation funded project (2018T110587).
References
1. Yang, H.F., Dillon, T.S., Chen, Y.P.P.: Optimized structure of the traffic flow forecasting model with a deep learning approach. IEEE Trans. Neural Netw. Learn. Syst. 28(10), 2371–2381 (2017)
2. Rao, K.S., Devi, G.L., Ramesh, N.: Air quality prediction in Visakhapatnam with LSTM based recurrent neural networks. Int. J. Intell. Syst. Appl. (IJISA) 11(2), 18–24 (2019). https://doi.org/10.5815/ijisa.2019.02.03
3. Elzayady, H., Badran, K.M., Salama, G.I.: Arabic opinion mining using combined CNN-LSTM models. Int. J. Intell. Syst. Appl. (IJISA) 12(4), 25–36 (2020). https://doi.org/10.5815/ijisa.2020.04.03
4. Chen, B., Wang, Y., Tao, Q., He, P., Chen, L.: Coal mill performance prediction model of coal-fired unit based on long-term and short-term memory neural network. Therm. Power Gener. 1–7 (2021)
5. Liang Xiao, H., Yifei, H.T., Duanchao, L., Weiheng, G., Yi, S.: Application of LSTM and ant colony algorithm in intelligent power plant scheduling. Autom. Instr. 39(05), 98–102 (2018)
6. Chaudhuri, A.K., Ray, A., Banerjee, D.K., Das, A.: A multi-stage approach combining feature selection with machine learning techniques for higher prediction reliability and accuracy in cervical cancer diagnosis. Int. J. Intell. Syst. Appl. (IJISA) 13(5), 46–63 (2021). https://doi.org/10.5815/ijisa.2021.05.05
7. Pasha, M.K.: Machine learning and artificial intelligence based identification of risk factors and incidence of gastroesophageal reflux disease in Pakistan. Int. J. Educ. Manag. Eng. (IJEME) 11(5), 23–31 (2021). https://doi.org/10.5815/ijeme.2021.05.03
8. Kundu, M., Nashiry, M.A., Dipongkor, A.K., Sumi, S.S., Hossain, M.A.: An optimized machine learning approach for predicting Parkinson's disease. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 13(4), 68–74 (2021). https://doi.org/10.5815/ijmecs.2021.04.06
9. Jiao, R., Zhang, T., Jiang, Y., et al.: Short-term non-residential load forecasting based on multiple sequences LSTM recurrent neural network. IEEE Access 6, 59438–59448 (2018)
10. Wan, X.: Research on operation optimization of thermal power units based on big data mining technology. Wuhan University (2017)
11. Fen, Z., Yanqin, Z., Chong, C., et al.: Management and operation of communication equipment based on big data. In: 2016 International Conference on Robots & Intelligent System (ICRIS), pp. 246–248. IEEE (2016)
12. Zhang, R., et al.: Using a machine learning approach to predict the emission characteristics of VOCs from furniture. Build. Environ. 196, 107786 (2021)
13. Yu, Y., Si, X., Hu, C., Zhang, J.: A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 31(7), 1235–1270 (2019). https://doi.org/10.1162/neco_a_01199
14. Gibson, A., Patterson, J.: Deep Learning: A Practitioner's Approach. O'Reilly Media, Boston (2017)
15. Wu, L., Kong, C., Hao, X., et al.: A short-term load forecasting method based on GRU-CNN hybrid neural network model. Math. Probl. Eng. 2020 (2020)
Application of MEA-LSTM Neural Network in Stock Balance Prediction

Zhongzhen Yan, Kewei Zhou, Xinyuan Zhu, and Hao Chen

School of Computer Science and Technology, Hubei University of Technology, Wuhan 430068, China
Abstract. In the financial industry, the effective control of stock data has long concerned every department in the field. Stocks are affected by a large number of controllable and uncontrollable factors, making future development difficult to predict, so computers and data must be enlisted. In past research, however, recurrent neural networks suffer from the vanishing gradient problem, and the long short-term memory (LSTM) neural network is prone to overfitting when its weights and thresholds are randomly generated by the network itself or set manually from the experience of many experiments. Therefore, in view of the characteristics of stock data, this paper proposes a model that uses the mind evolutionary algorithm to optimize the long short-term memory recurrent neural network so that it can fit the data to the greatest possible degree. This model retains the characteristics of the LSTM neural network while alleviating its overfitting problem; it avoids the gradient vanishing that affects recurrent neural networks (RNN) on overly long time series and also relatively alleviates the overfitting of the traditional LSTM neural network. Finally, a comparison between the LSTM and mind evolutionary algorithm-long short-term memory (MEA-LSTM) models shows that the prediction performance of the proposed model is better than that of the traditional LSTM.

Keywords: Data prediction · Neural network · MEA-LSTM
1 Introduction
In the financial industry, financial data have always been an important intangible asset. They contain not only a large amount of transaction information but also much information about the factors influencing the future direction of funds. Whether financial data are controlled effectively therefore affects the fault tolerance of each financial company's future development. The audience of stock data is influenced by many factors, some controllable and some uncontrollable. For uncontrollable factors, human intervention is impossible, so computers must be used to simulate the development trend of stock data over large amounts of data, helping decision-makers reach decisions in a short time.
This paper proposes a model that combines mind evolution with the LSTM long short-term memory recurrent network. The feature of the model is that it retains the LSTM's characteristic filtering and use of time series data while at
the same time avoiding the gradient vanishing of the recurrent neural network. Most importantly, it mitigates the overfitting problem, allowing the model to reach its optimum without overfitting so that the prediction performance can be optimized as far as possible. The parameter problem in the objective function has always been a focus of research, and the accuracy of the parameters determines the degree of data fitting; the LSTM neural network is no exception. The two stages from input layer to hidden layer and from hidden layer to output layer are the key points at which the LSTM network processes data, and improper selection of the weights W_f and thresholds b_f in these functions leads to overfitting of the model. Existing remedies for overfitting include dropout, regularization and data adjustment; although these methods are commonly used, the weights and thresholds remain prone to local optima. Therefore, the optimal solution for the network is calculated by mind evolution from the sample data. Because the convergence and dissimilation operations adopted by the mind evolutionary algorithm (MEA) differ mechanically from the genetic algorithm [1], using MEA mind evolution to solve the weight and threshold problems of the LSTM recurrent neural network has a more pronounced effect.
2 Related Work
At present, stock price forecasting in the financial field is mainly represented by the LSTM. Early results such as [8–14] focused on simple LSTM models, and the accuracy of stock price prediction using an LSTM model alone is not high enough. [15, 16] combined the attention mechanism with the LSTM model, but the hyperparameters were insufficiently optimized and the parameter search was inefficient. [17, 18] combined a CNN convolutional neural network with the LSTM model, achieving higher accuracy and better robustness but still lacking some interpretability. [19–22] compared RNN and LSTM models on prediction tasks: the LSTM is structurally more complex than the RNN, the parameters of its dynamic model and the number of training samples are relatively large, and its training results are more reliable, yet a method is still needed to properly tune the parameters of the clustering algorithm to improve accuracy. [23] applied a three-stage architecture combining Toeplitz inverse covariance clustering (TICC), temporal pattern attention with long short-term memory (TPA-LSTM) and multivariate LSTM fuzzy neural networks (MLSTM-FCN and MALSTM-FCN) to pattern discovery and prediction of stock indices, but carried out no forward-looking index tracking and constructed no index prediction model for other trading strategies. [24, 25] studied bidirectional LSTM models; the proposed models were compared with various existing regression models and outperformed them, but no combined test of static and dynamic characteristics was carried out. [26] proposed a three-stage method based on natural language processing and deep learning to analyze and understand past and present market scenarios and predict the future value of stocks. [27] proposed a hybrid method for predicting future stock
prices using LSTM and ensemble empirical mode decomposition. [28] compared the performance of a long short-term memory (LSTM) neural network model and a support vector machine (SVM) regression model. [29] proposed a two-dimensional attention LSTM model for prediction, and [30] an associated network model. [31] took a model-independent approach: instead of fitting the data to a specific model, a deep learning architecture is used to identify the potential dynamics in the data; the results show that the CNN architecture can identify changes of trend, owing to the sudden changes that take place in the stock market, though it does not surpass the other architectures. [32] proposed a new multi-input LSTM model that extracts valuable information from low-correlation factors and discards their harmful noise through additional input gates controlled by convincing factors called the mainstream; the results show that the model is superior to the ordinary LSTM model, but it must use networks such as a CNN to obtain this current information.
3 Algorithm Description

3.1 Theory of the Mind Evolutionary Algorithm
The MEA is an evolutionary algorithm based on the GA (genetic algorithm) that simulates the progress of human thinking. To avoid the defects of the GA, the MEA uses two complementary operations, convergence and dissimilation, which are optimized as a whole [2]. The convergence operation is the process in which each individual within a subgroup competes to become the optimal individual of that subgroup. The dissimilation operation is the process in which subgroups compete to become the optimal subgroup while new individuals are constantly searched for in the space. During execution, these two processes are repeated until no better individuals or subgroups can be found. In its solution space the mind evolutionary algorithm can remember more than one generation of evolution information, so it randomly generates individuals of different sizes; based on this feature, a fitness function is used to find the winner individuals and temporary individuals with the highest scores and best performance [3]. These individuals are used as subgroup centres and new individuals are added around them. Finally, the centre matrices of the winning subpopulations and of the temporary subpopulations are determined according to the fitness values of the initial population [4]. "Convergence" means ranking the individuals in each winning subgroup by fitness value, keeping the individual with the best fitness and directly eliminating the remaining weaker individuals. Figure 1 below is a line chart of the convergence operation of the superior and temporary subpopulations; the abscissa is the number of convergence steps and the ordinate is the score. The superior subpopulation eliminates low-scoring individuals according to the score, while high-scoring individuals are selected from the temporary subpopulation in preparation for the dissimilation operation. "Alienation" (dissimilation) means comparing the winner subgroups retained by the convergence operation with the optimal subgroup among the temporary subgroups. If the temporary optimal subgroup is better than a retained winner subgroup, the
Fig. 1. Convergence operation between the superior subpopulation and the temporary subpopulation
temporary optimal subgroup replaces it and the winner subgroup is demoted to a temporary subgroup; if the temporary subgroups become insufficient, they are continuously replenished. Iteration continues until the optimal subpopulation and the optimal individual no longer change, at which point the optimal solution is obtained [5].
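The convergence and dissimilation loop described above can be sketched as follows; this is an illustrative simplification (fixed subgroup sizes, Gaussian scattering around each centre, re-seeding of the worst subgroup), not the authors' exact implementation.

import numpy as np

def mea_optimize(fitness, dim, n_groups=5, group_size=10, sigma=0.1, iters=100):
    # initialize subgroup centres at random points of the solution space
    centres = [np.random.uniform(-1, 1, dim) for _ in range(n_groups)]
    for _ in range(iters):
        new_centres = []
        for c in centres:
            # convergence: scatter individuals around the centre, keep the winner
            group = [c] + [c + sigma * np.random.randn(dim) for _ in range(group_size)]
            new_centres.append(min(group, key=fitness))
        # dissimilation: rank subgroups; the worst centre is re-seeded elsewhere
        new_centres.sort(key=fitness)
        new_centres[-1] = np.random.uniform(-1, 1, dim)
        centres = new_centres
    return centres[0]   # centre of the winning subgroup, i.e. best solution found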
3.2 LSTM Neural Network Theory
The LSTM has the same overall structure as the RNN recurrent neural network, but its repeated module is different, as shown in Fig. 2 below. Unlike the single neural network layer of the RNN, there are four network layers that interact in a very special way. The symbols are as follows: X_t is the input-layer data at the current time, X_{t-1} the input-layer data at the previous time, and X_{t+1} the input-layer data at the next time; h_{t-1}, h_t and h_{t+1} denote the hidden states at the previous, current and next times respectively. These hidden states are part of a complete cell-state chain and merely represent the hidden state of the cell state at a particular time. σ (the sigmoid function, whose output lies in [0, 1]: 0 means completely discarded, 1 completely passed) and tanh are the activation functions, while × and + denote multiplication and addition. The four layers form the three gate structures of the LSTM neural network (forgetting gate, input gate and output gate). The advantage of this design is that, by adding input, forgetting and output gates, variable self-recurrent weights are obtained, so that the integration scales can change dynamically at different times even when the model parameters are fixed, avoiding gradient vanishing or gradient explosion [6].
4 Implementation of the MEA-LSTM Algorithm
Although the LSTM algorithm can process time series data, the weights W and thresholds b in its hidden layer are set manually, which may lead to overfitting during prediction; overfitting also occurs when the data sample size is very small. Therefore, an approach based on the MEA evolutionary algorithm is presented, which optimizes the weights and thresholds of the LSTM neural network [7].
Fig. 2. LSTM hidden layer structure
The MEA-LSTM model proceeds as follows: first, the data are divided into training and test sets and a solution space is formed according to the topology of the LSTM neural network; then the MEA algorithm converges, dissimilates and iterates over the solution space until the stopping conditions are met, yielding the optimal weights and thresholds. Before the LSTM network is trained, the weights and thresholds are computed by the MEA algorithm instead of being initialized randomly or set manually. In this way, the overfitting of the neural network caused by the weights and thresholds can be prevented, after which network training and data analysis and prediction are carried out.
The sample data are processed by the MEA algorithm. When searching for the optimal individual, the solving process is transformed into finding the minimum:

W: \min \{ f(x) \mid x \in X \}    (1)

where f(x) ∈ Y; X and Y are the discrete point sets of the individuals and of their values respectively, X also being the solution space, x ∈ X. A mapping f(x) → Y is thus produced, where f is the fitness function and each discrete point x of the solution space X is mapped through f(x) to a value in the range Y. If a minimum exists, then F = min f(x) holds and a sequence {F_n} = {F, F_1, F_2, \ldots, F_i, \ldots} is formed, where {F_n} ∈ Y. Through initialization, winner individuals of size N are obtained; with M the number of individuals in a subgroup excluding the central optimal individual, a subgroup mapping is formed:

S: X \to X^{M+1}    (2)

S_N(x, \sigma), \quad (x = x_1, x_2, \ldots, x_M)    (3)
where x ∈ X. Through the normal distribution N(x, σ), M new individuals are randomly generated around the optimal individual to form a subgroup together with it.

f(x_z) = \min f(x_i)    (4)
Equation (4) means that the fitness function is used to let the M new individuals compete with the previously selected best individual within a subpopulation; the subgroup represented by the new optimal individual is selected for the dissimilation operation, x_z denoting the new optimal individual.

f(x_{z1}) \le f(x_{z2}) \le f(x_{z3}) \le \ldots \le f(x_{zN})    (5)
Equation (5) shows that the optimal subgroup f(x_{z1}) is selected according to the fitness values of the N subgroups, x_{z1} being the optimal solution. If the optimal solution has not been obtained, step (2) is repeated, and the optimal solution produced by the above iterative process is used as the weights and thresholds of the LSTM network. Once the optimal weights and thresholds are obtained, they enter the forgetting gate of the LSTM neural network. The proportion of the two kinds of data is obtained by multiplying the hidden state of the previous time and the input data of the current time by their corresponding weights; combined with the sum of the set thresholds, the activation function decides whether the data are discarded or added, realizing the function of forgetting or remembering:

f_t = \sigma(W_f \cdot [h_{t-1}, X_t] + b_f)    (6)
where W_f is the weight, b_f the threshold, and σ the sigmoid activation function [33]. After the forgetting step, the information enters the input gate, whose data control consists of two parts used to generate the new information to be updated. The first part is the "input gate" layer: like the forgetting gate, it computes the proportion of the hidden state of the previous time and the input data of the current time, and the sigmoid determines which values to update, denoted i_t. The second part is a tanh layer that generates a new candidate value \tilde{C}_t as the replacement content, which may be added to the cell state. Finally, the values produced by the two parts are combined for the update:

i_t = \sigma(W_i \cdot [h_{t-1}, X_t] + b_i)    (7)

\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, X_t] + b_c)    (8)
where W_i and W_c are weights and b_i, b_c thresholds. Next comes the update of the cell state C_t. From the computations of the forgetting gate and the input gate, the cell state at this time is composed of the product of the previous cell state and the forgetting gate output f_t, plus the product of the candidate value \tilde{C}_t and i_t obtained from the input gate:
C_t = C_{t-1} \times f_t + i_t \times \tilde{C}_t    (9)
Finally, the information enters the output gate, which consists of two steps. In the first step, the output gate's σ function computes the weighted combination of the hidden state at the previous time and the input data at the current time to determine which part of the cell state C_t is output. In the second step, the cell state C_t is passed through the tanh layer and multiplied by the output of the first step to obtain the final hidden-state output h_t at this time:

O_t = \sigma(W_o \cdot [h_{t-1}, X_t] + b_o)    (10)

h_t = O_t \times \tanh(C_t)    (11)
where W_o is the weight and b_o the threshold. Since the network has more than one layer during training, the hidden layer of the LSTM repeats steps (6) to (11) as many times as there are hidden layers [34, 35]. The neural network process optimized by MEA mind evolution is shown in Fig. 3.
Fig. 3. Optimization application process of MEA-LSTM neural network
Figure 3 shows the experimental flow of the model. The network structure of the LSTM is determined from the pre-processed datasets. Before the weights and thresholds for network training are fixed, the datasets are processed by the mind evolutionary algorithm: the optimal weights and thresholds for the dataset are determined over multiple iterations and then assigned to the LSTM network for training and analysis.
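As an illustration of how the MEA-produced parameters plug into Eqs. (6)-(11), the following numpy sketch performs one forward step of the LSTM cell; the dictionary layout of W and b is an assumption made for readability, not the authors' code.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """W, b: dicts with keys 'f', 'i', 'c', 'o' holding the MEA-optimized parameters."""
    z = np.concatenate([h_prev, x_t])             # [h_{t-1}, X_t]
    f_t = sigmoid(W["f"] @ z + b["f"])            # forgetting gate, Eq. (6)
    i_t = sigmoid(W["i"] @ z + b["i"])            # input gate, Eq. (7)
    c_tilde = np.tanh(W["c"] @ z + b["c"])        # candidate state, Eq. (8)
    c_t = c_prev * f_t + i_t * c_tilde            # cell state update, Eq. (9)
    o_t = sigmoid(W["o"] @ z + b["o"])            # output gate, Eq. (10)
    h_t = o_t * np.tanh(c_t)                      # hidden output, Eq. (11)
    return h_t, c_t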
5 Experimental Test and Prediction

5.1 Contrast Experiment
The experimental data of this paper are the stock data of Shanghai Pudong Development Bank from 1999 to 2018, comprising 4496 records, which are used as the dataset to validate the model. The experiment uses the two columns "date" and "high-low" of the dataset: date is the date value used as the index, and high-low is the daily range of the stock data (the difference between the highest and lowest price each day). Outlier handling must be considered first: owing to various factors, the data for some days are blank, and to reduce the error the mean of the data is used to fill in the blank values. The data are then normalized to the range [-1, 1], which helps the MEA algorithm find the optimal solution. The weights and thresholds calculated by the MEA algorithm are assigned to W and b in the LSTM model. The same dataset is used to test the traditional LSTM and Bi-LSTM models. The test results are as follows:

Table 1. Comparison of LSTM, Bi-LSTM and MEA-LSTM model experiments

        LSTM         Bi-LSTM      MEA-LSTM
MSE     0.135849675  0.058648317  0.017463103
MAE     0.241698654  0.152438745  0.088843648
From the evaluation figures in Table 1 it can be concluded that the mean square error and mean absolute error of the MEA-LSTM model are better than the results of the other two models. The model involves no manual setting when choosing the initial hyperparameters, handing them entirely to the MEA algorithm for processing and calculation, which accelerates the convergence of the model to a certain extent; it also alleviates the gradient vanishing caused by long time spans and indirectly improves the efficiency and accuracy of model prediction.
5.2 Model Prediction Results
In the two figures, Fig. 4 combines the training data, test data and prediction data. The blue line is the real dataset running through the whole timeline,
Fig. 4. Distribution of three kinds of data
Fig. 5. Distribution of predicted value and original data
the orange line is the test data, and the green line represents the results predicted by the model. Because the initial weights of the LSTM are obtained by the MEA algorithm and no longer tuned by repeated trials, the model can converge in a short time to an optimal state, keeping the predicted values within an acceptable error of the real values. Since the data processed by an LSTM are generally time series, and the figure shows that the time span of the stock data is very long, prediction over such long series would to some extent cause gradient vanishing; however, because the weight optimization is handled by the MEA algorithm, the gradient vanishing of the LSTM is indirectly avoided and the accuracy of data prediction is ensured. Comparing the predicted data with the real data and enlarging the view, the predictions shown in Fig. 5 fit the real values to the greatest extent. While avoiding gradient vanishing, the model has not suffered from overfitting, which proves its effectiveness.
6 Conclusion
Experiments show that the LSTM recurrent neural network optimized by the MEA mind evolutionary algorithm locks onto the optimal weights faster and prevents gradient vanishing. The MEA is optimized on the basis of the GA genetic algorithm; it can converge to the global optimal solution and its computational efficiency is much higher than that of the standard GA. The LSTM model optimizes its own weight selection with the MEA mechanism, which not only speeds up the convergence of the LSTM model to a certain extent but also improves the prediction accuracy. Observation of the data shows that the MEA-optimized LSTM neural network fits the stock-range data to a great extent; the optimization effect is higher than that of conventional methods and the prediction matches reality. The disadvantage is that the gradient explosion problem of the LSTM model cannot be solved, so a model suitable for long time series with higher accuracy is needed to avoid such problems, which is the work to be studied in the future.
Acknowledgment. The authors would like to thank the reviewers for their useful comments and advice.
References
1. Sun, C., Sun, Y., Xie, K.: Thought evolution: an efficient evolutionary computing method, University of Science and Technology of China. In: Proceedings of the Third Global Conference on Intelligent Control and Automation, pp. 118–121 (2000)
2. Zhang, Y., Zhang, L., Dong, Z.: An MEA-tuning method for design of the PID controller. Math. Probl. Eng. 2019, 11 (2019). Article ID 1378783
3. Wei, Z., et al.: Prediction of diabetic complications based on BP neural network optimized by the mind evolutionary algorithm. China Med. Equip. 17(10), 1–4 (2020)
4. Daihai, Y., Yuewen, J., Liangyuan, W.: Research on wind power generation right trading based on MEA algorithm and principal agent model. Guangdong Electr. Power 29(11), 57–63 (2016)
5. Keming, X., Yuxia, Q.: Sequence model and convergence analysis of mind evolutionary algorithm. Comput. Eng. 33(3), 180–182 (2007)
6. Zhaolan, C., Xiaoqiang, Z., Yue, L.: Prediction of railway freight volume based on LSTM network. Acta Sin. Sin. 11, 15–21 (2020)
7. Sun, W., Xiu, X., Xiao, H., Yu, X., Zhang, L., Mingshuang, L.: SOC state assessment of battery energy storage system based on MEA-BP neural network. Electr. Appl. Energy Effic. Manag. Technol. (01), 51–54+83 (2018)
8. Hasan, M.M., Roy, P., Sarkar, S., Khan, M.M.: Stock market prediction web service using deep learning by LSTM. In: 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0180–0183 (2021)
9. Baek, Y., Kim, H.Y.: ModAugNet: a new forecasting framework for stock market index value with an overfitting prevention LSTM module and a prediction LSTM module. Expert Syst. Appl. 113, 457–480 (2018)
10. Chen, K., Zhou, Y., Dai, F.: A LSTM-based method for stock returns prediction: a case study of China stock market. In: 2015 IEEE International Conference on Big Data (Big Data), pp. 2823–2824 (2015)
11. Reghin, D., Lopes, F.: Value-at-risk prediction for the Brazilian stock market: a comparative study between parametric method, Feedforward and LSTM neural network. In: 2019 XLV Latin American Computing Conference (CLEI), pp. 1–11 (2019)
12. Nelson, D.M., Pereira, A.C., De Oliveira, R.A.: Stock market's price movement prediction with LSTM neural networks. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 1419–1426 (2017)
13. Zhao, Z., Rao, R., Tu, S., Shi, J.: Time-weighted LSTM model with redefined labeling for stock trend prediction. In: 2017 IEEE 29th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 1210–1217 (2017)
14. Mesquita, C.M., De Oliveira, R.A., Pereira, A.C.M.: Combining an LSTM neural network with the variance ratio test for time series prediction and operation on the Brazilian stock market. In: 2020 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2020)
15. Jin, Z., Yang, Y., Liu, Y.: Stock closing price prediction based on sentiment analysis and LSTM. Neural Comput. Appl. 32(13), 9713–9729 (2019). https://doi.org/10.1007/s00521-019-04504-2
16. Cheng, L., Huang, Y., Wu, M.: Applied attention-based LSTM neural networks in stock prediction. In: 2018 IEEE International Conference on Big Data (Big Data), pp. 4716–4718 (2018)
17. Lai, Y., Chen, R., Caraka, R.E.: Prediction stock price based on different index factors using LSTM. In: 2019 International Conference on Machine Learning and Cybernetics (ICMLC), pp. 1–6 (2019)
18. Eapen, J., Bein, D., Verma, A.: Novel deep learning model with CNN and bi-directional LSTM for improved stock market index prediction. In: 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC), pp. 0264–0270 (2019)
19. Kumar, K., Haider, M.T.U.: Enhanced prediction of intra-day stock market using metaheuristic optimization on RNN–LSTM network. New Gener. Comput. 39, 231–272 (2021)
20. Moghar, A., Hamiche, M.: Stock market prediction using LSTM recurrent neural network. Procedia Comput. Sci. 170, 1168–1173 (2020)
21. Naik, N., Mohan, B.R.: Study of stock return predictions using recurrent neural networks with LSTM. In: Macintyre, J., Iliadis, L., Maglogiannis, I., Jayne, C. (eds.) Engineering Applications of Neural Networks. EANN 2019. Communications in Computer and Information Science, vol. 1000. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-20257-6_39
22. Achkar, R., Elias-Sleiman, F., Ezzidine, H., Haidar, N.: Comparison of BPA-MLP and LSTM-RNN for stocks prediction. In: 2018 6th International Symposium on Computational and Business Intelligence (ISCBI), pp. 48–51 (2018)
23. Ouyang, H., Wei, X., Wu, Q.: Discovery and prediction of stock index pattern via three-stage architecture of TICC, TPA-LSTM and multivariate LSTM-FCNs. IEEE Access 8, 123683–123700 (2020)
24. Istiake Sunny, M.A., Maswood, M.M.S., Alharbi, A.G.: Deep learning-based stock price prediction using LSTM and bi-directional LSTM model. In: 2020 2nd Novel Intelligent and Leading Emerging Sciences Conference (NILES), pp. 87–92 (2020)
25. Mootha, S., Sridhar, S., Seetharaman, R., Chitrakala, S.: Stock price prediction using bi-directional LSTM based sequence to sequence modeling and multitask learning. In: 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), pp. 0078–0086 (2020)
26. Chatterjee, I., Gwan, J., Kim, Y.J., Lee, M.S., Cho, M.: An NLP and LSTM based stock prediction and recommender system for KOSDAQ and KOSPI. In: Singh, M., Kang, D.K., Lee, J.H., Tiwary, U.S., Singh, D., Chung, W.Y. (eds.) Intelligent Human Computer Interaction. IHCI 2020. LNCS, vol. 12615. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-68449-5_40
27. Yujun, Y., Yimei, Y., Jianhua, X.: A hybrid prediction method for stock price using LSTM and ensemble EMD. Complexity 2020, 16 (2020). Article ID 6431712
28. Lakshminarayanan, S.K., McCrae, J.P.: A comparative study of SVM and LSTM deep learning algorithms for stock market prediction. In: AICS 2019, pp. 446–457 (2019)
29. Yu, Y., Kim, Y.: Two-dimensional attention-based LSTM model for stock index prediction. J. Inf. Process. Syst. 15(5), 1231–1242 (2019)
30. Ding, G., Qin, L.: Study on the prediction of stock price based on the associated network model of LSTM. Int. J. Mach. Learn. Cybern. 11(6), 1307–1317 (2019). https://doi.org/10.1007/s13042-019-01041-1
31. Selvin, S., Vinayakumar, R., Gopalakrishnan, E.A., Menon, V.K., Soman, K.P.: Stock price prediction using LSTM, RNN and CNN-sliding window model. In: 2017 International Conference on Advances in Computing, Communications and Informatics (ICACCI), pp. 1643–1647 (2017)
32. Li, H., Shen, Y., Zhu, Y.: Stock price prediction using attention-based multi-input LSTM. In: Proceedings of the 10th Asian Conference on Machine Learning, vol. 95, pp. 454–469. PMLR (2018)
33. Sinitsyn, R.B., Yanovsky, F.J.: Copula ambiguity function for wideband random radar signals. In: 2011 IEEE International Conference on Microwaves, Communications, Antennas and Electronic Systems (COMCAS 2011), Tel-Aviv, Israel, 7–9 November 2011, p. 5 (2011). https://doi.org/10.1109/COMCAS.2011.6105843
34. Yanovsky, F.J., Russchenberg, H.W.J., Ligthart, L.P., Fomichev, V.S.: Microwave Doppler-polarimetric technique for study of turbulence in precipitation. In: IGARSS 2000. IEEE 2000 International Geoscience and Remote Sensing Symposium Proceedings, vol. 5 (Cat. No. 00CH37120), pp. 2296–2298, Honolulu, HI, USA (2000). https://doi.org/10.1109/IGARSS.2000.858387
35. Sinitsyn, R.B., Yanovsky, F.J.: Kernel estimates of the characteristic function for radar signal detection. In: European Radar Conference, EURAD 2005, Paris, France, pp. 53–56 (2005). https://doi.org/10.1109/EURAD.2005.1605562
Future Design Forearm Prosthesis Control System

D.A. Fonov1,2 and E.G. Korzhov3

1 Bauman University, St. 2nd Baumanskaya, building 5, Moscow 105005, Russia
2 Smirnov Design, Kashirskoe highway, 1, building 2, Moscow 115230, Russia
3 National University of Science and Technology MISiS, Leninskiy Prospekt 4, Moscow, Russia
Abstract. This article describes/identifies in the field of control system for upper limb prosthetics. Consider the methods and control systems of modern bionic prostheses, their advantages and disadvantages. Proposed a method of control by electromyographic sensors. Proposed a method for managing a bionic prosthesis using the technology of reading electric potentials from the surface of the forearm, as well as using machine learning technologies. Keywords: Controlled prosthesis Electroencephalogram Human-machine interface Bionic Tensometric
Bioprosthetic
1 Introduction

The new world brings a lot of new technologies, equipment, devices and machines, and with them comes a huge number of injuries which, in particular, lead to the loss of upper limbs. Our hands are among the most important elements of the body. Any damage, even a bruise, brings discomfort in self-care. The human hand is a very complex set of mechanisms with many sensory-motor functions, which in its full embodiment is almost impossible to reproduce with modern technologies. It is known that the main organs with which we influence the environment are the hands. 15% of people on the planet have impairments of body functions and structures that interfere with physical activity and social life [1], and more than 50 million people a year become disabled. According to the International Classification of Functioning (ICF), the loss of the little finger is recognized as a 10% decrease in the working capacity of the hand, and the loss of the thumb as a 50% decrease, which is already serious harm to health [2]. Despite the modern level of rehabilitation and the highest skills in restoring the working capacity of a damaged limb, amputation is still a relevant topic. If a person has lost one limb, the standard of living declines despite good social rehabilitation. In addition, there are cases when both hands are missing. In both cases, prosthetics are necessary. This article discusses prostheses and the modern and conservative approaches to managing them; at the end of the article, the most effective and utilitarian,
according to the authors, method of controlling them is proposed. Some machine learning methods are also covered; machine learning is a division of artificial intelligence that builds systems that learn from data [3]. In particular, some filtering methods are proposed for use: neural networks, regression models, Markov chains, Bayes chains, k-means [4, 5]. In the past, some control models have been described in which machine learning is widely used, and these can significantly help readers of this article to delve deeper into the topic of machine learning [6–10]. A technology such as EEG is considered: the brain wave signals are known as electroencephalography (EEG) signals [11]. In the article "Mathematical Modeling of DC Motors for the Construction of Prostheses" one can find the information necessary for mathematical modeling of a biorobotic system, a neurointerface and a mechanical drive. That article studies the static and dynamic characteristics of a DC motor with independent excitation by mathematical modeling in the MATLAB environment, enabling the further construction of models of electric drive systems with this engine type for the prosthesis [12].
2 Literature Review

In the article "Development of a neurodevice with biological feedback for compensating for lost motor functions" [13], the authors focus on the applicability of EEG, EMG and EOG technology for managing prostheses, paying special attention to the functionality provided by the combination of these technologies, in particular: "In terms of information content and accuracy, the data were comparable. ... the efficiency of hybridization of EEG and EMG signals using a neurodevice was investigated: it made it possible to increase the classification accuracy in all subjects by an average of 12.5%, to an average value of 86.8% (from 75 to 97%)". However, the authors did not consider the applicability of these technologies in terms of ease of use and adaptability. According to the experimental data collected in the course of our research and the selection of the optimal technology on which to build a prosthesis control system, we can say definitely that EEG technology cannot at present be applied to control anything: in practice, the authors did not succeed in taking from the surface of the human cranium any high-quality signal, or combination of signals, on the basis of which a command could be reliably programmed. In the next reviewed article, "Prosthetic Myoelectric Control Strategies: A Clinical Perspective", the authors are absolutely accurate in describing modern approaches to managing commercial hand prostheses, while discussing in the abstract the use of machine learning technologies to decipher signals received using EMG [14]. The content given below in our article is an organic addition to and extension of the knowledge that article reveals.
The article "A Novel Hand Prosthesis Control Scheme Implementing a Tongue Control System" describes control of the prosthesis with the help of the human mouth and tongue: "The control signals for the hand are determined by myoelectric signals from the arm and volitionally generated signals by tongue through an inductive interface with a mouth piece" [15]. From a practical point of view, and judging by interviews with potential users, this is not a very convenient solution, since in the course of dealing with the limitation arising from the loss of a hand, a limitation in speaking arises.
3 Prosthetics

Prosthetics is a complex of medical and social measures aimed at compensating for anatomical and functional defects of a person with the help of prosthetic and orthopedic means and devices. The main task of prosthetics is the maximum possible restoration of the functions of the lost organ and the person's return to social activity. The latter circumstance is of great psychological importance and affects the timing of development and the skill of managing the prosthesis. The options for prosthetics depend on the level of amputation, the degree of rehabilitation of the post-amputation stump, and the patient's wishes regarding the functionality of the prosthesis. Naturally, most people, owing to psychological trauma, the need for lengthy rehabilitation and, often the limiting factor, cost, choose either a passive prosthesis that carries a purely aesthetic function or a simple mechanical prosthesis equipped with one or two motors to perform simple movements; but such a device does not restore the colossal motor potential of the human hand. Therefore, the main developments are now aimed at creating bionic prostheses. A bionic prosthesis is a means of rehabilitation after injuries of the musculoskeletal system. This device must reproduce the biomechanics of the lost limb and to some extent restore its functionality. Now, when undergoing prosthetics, a person is forced to choose between the aesthetics and the functionality of the replaced limb. There are several types of prostheses on the market today:
1. Common cosmetic type: a passive prosthesis designed to recreate the physical appearance of the limb. These prostheses contain personally designed and fitted inner sockets, or "interfaces", for a user. The outer design looks like normal skin and covers the whole construction like a glove made of silicone or a similar material.
2. Simple helper device designed to perform certain actions. This type of prosthesis is slightly more advanced than the common type. It is equipped with a special holder or receiver to perform a desired working action for everyday self-service or even to work with special tools (a hammer, a wrench, scissors, a chisel, a screwdriver clamp, etc.). With such prostheses the user has a rather functional set, which is nevertheless very limited in its own movements. The outer design of this type can be the same as the common type or look like a real special tool.
3. A body-powered helper device is an active (mechanical) traction type of prosthesis. This design consists of tractions that are controlled by rods (cables,
belts), driven by the efforts of the person himself without any electric motors. Thus, the user gets the advanced opportunity to control the grasping movement and use it in various fields of activity.
4. Bioelectric, myoelectric or bionic prostheses: these are among the most advanced limb prostheses. According to the classification of the Ministry of Labor of the Russian Federation, they are called prostheses with an external energy source. The management of such prostheses is based on reading the electrical potentials of human tissues. Sensors that read the change in electric potential are built into the stump receptacle. This information is transmitted to a microprocessor, and as a result the prosthesis performs certain gestures and grips (Fig. 1). Most often, the range of movements programmed into the microprocessor is limited. One of the most advanced prostheses, the i-limb by Touch Bionics, has 24 grips. But these movements are not controlled in a natural way, since the user must constantly switch between them and perform the same action (tension of the flexor muscles) to trigger each of them. This brings so much inconvenience that users often give up all this "advanced functionality" and use only the grasping movement, as in ordinary hooks.
Fig. 1. Schematic diagram of the device “Bioelectric prosthesis”
4 Electric Potentials

4.1 Research Methodology
The methodology of the study consisted of work with focus groups of people with amputated upper limbs; theoretical study of the functions and structures of the human hand and of all organs of the human body involved in hand control; and practical research using modern EMG and EEG equipment and invasive sensors.
4.2 The Connection Between Nerve Impulses and Muscle Activity
There are many nerves in our body. A nerve is a biological structure for informing the body and transmitting commands, consisting of many neuron cells. Muscles have many neuromuscular connections to receive commands from the central nervous system and to inform the central nervous system of the actions that take place. Coordination of the work of the muscles of the hand is especially necessary for performing motor acts in which whole muscle groups are responsible for fine motor skills, opening and closing of the hand, rotational movements, etc. In the light of modern ideas about the mechanisms of coordination of movements, muscles are not only the executive motor apparatus but also a peculiar sense organ. In the muscle fibers and tendons there are special nerve endings, proprioceptors, that send feedback pulses to the central nervous system cells; in addition, more than 17,000 tactile sensors are responsible for sensation in the skin of the palmar surface of the human hand. As a result, a closed loop is created between the central nervous system and the muscles: impulses from various CNS formations travelling along the motor nerves cause muscle contractions, while impulses sent by muscle receptors and tactile sensors inform the central nervous system of each element and moment of movement. This cyclic communication system provides precise control of movements and their coordination. Although various parts of the central nervous system are involved in the control of skeletal muscle movements during motor acts, the leading role in ensuring their interaction and setting the goal of the motor reaction belongs to the cerebral cortex and cerebellum, especially when making complex precise movements. In the cerebral cortex, the motor and sensory zones form a single system, with each muscle group corresponding to a certain section of these zones. The control of the contractile activity of muscle cells is carried out by motor neurons, nerve cells whose bodies lie in the spinal cord and whose long branches, the axons, come to the muscle in the motor nerve. Upon entering the muscle, the axon branches into many endings, each of which is connected to a separate group of muscle fibers. The signal transmission speed from the brain is about 270 km/h, which corresponds to roughly 0.01 s for signal transmission from the brain to the hand. The target muscles in our case are the flexor and extensor of the wrist in the forearm. The electric potentials that arise are given in Table 1.

Table 1. Electric potential values
Action              Potential
Maintenance         163.5 ± 14.27 µV
Squeeze and hold    186.65 ± 10.09 µV
Pulling force       332.87 ± 95.6 µV
Electric potentials arise at the moment of signal transmission from the brain to the destination, or from a stimulus to the brain. When a potential is generated, the electric charge in the cells changes, which leads to an electric current with a potential difference. These pulses are sufficient for cutaneous sensors to detect the electrical excitation.
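As a simple illustration of how such cutaneous readings could be turned into a control decision, the following sketch computes the RMS level of a window of surface-EMG samples and compares it against a threshold placed between the "maintenance" and "squeeze and hold" levels of Table 1. This is a minimal sketch under assumed parameters (window size, threshold value, function names), not the authors' implementation.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Minimal sketch, not the authors' code: detect a "squeeze and hold" action
// from a window of surface-EMG samples (in microvolts) by comparing its RMS
// level against a threshold between the Table 1 levels.

double rmsMicrovolts(const std::vector<double>& window) {
    double sumSq = 0.0;
    for (double v : window) sumSq += v * v;
    return std::sqrt(sumSq / static_cast<double>(window.size()));
}

bool isSqueezeDetected(const std::vector<double>& window) {
    const double kThresholdUv = 175.0;  // assumed: between 163.5 uV and 186.65 uV
    return rmsMicrovolts(window) > kThresholdUv;
}

int main() {
    // Fake 200-sample window at roughly the "squeeze and hold" level.
    std::vector<double> window(200, 190.0);
    std::printf("squeeze detected: %s\n", isSqueezeDetected(window) ? "yes" : "no");
    return 0;
}
```

In a real system the threshold would be calibrated per user, since, as Table 1 shows, the measured levels carry a large spread.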
To summarize, let's look at how all these movements occur and what is behind the act of taking a cup of coffee. It all starts with the brain. Seeing the cup, we want coffee. In the brain, more precisely in the upper parts of the precentral gyrus, a nerve impulse arises that passes through the midbrain, then through the cerebellum (for more precise coordination of muscle movements) and then through the spinal cord along the nerve fibers to the muscles. Thanks to the myelin sheath, which acts as an insulator, the transmission speed of the pulse reaches 120 m per second. In the muscles there is a so-called motor unit (a section of muscle controlled by a single nerve fiber). It contains the neuromuscular synapse (the place where the nerve joins the muscle). The impulse passes through the synapse and spreads over the muscle area, causing its contraction or relaxation. This leads to a change in the length and a displacement of the muscle body in space. Muscles are tightly and firmly fused with tendons, which in turn are attached to certain places on the bone. The displacement of the muscle, followed by the tendon, moves the bone to a different position in space. We have considered the simplest case, when only one muscle works; in reality the muscles work in extensor-flexor pairs, so the osteoarticular framework is constantly under control. Complex movements, such as flexion of a finger or pronation of the wrist (rotation of the palm relative to the axis of the forearm), require precise combined action of various muscle groups. In addition to sending commands, the body also has a feedback system. In the motor unit there are special sensory cells that send impulses back to the brain. The most important are signals about proprioception, the muscle's own position in space.
5 Proposed Prosthesis Control Method

Based on the foregoing, signals can be read in the following ways:
1. Reading from the brain: signals are read either with implanted electrodes or with EEG;
2. Reading at the level of the spinal cord: implantation of electrodes in the spinal cord;
3. Reading at the level of a nerve fiber: implantation of electrodes into the nerves extending from the spinal cord;
4. Reading from muscle fiber: attaching sensors directly to the muscle fiber (invasive EMG);
5. Reading from the skin: attaching sensors to the skin (non-invasive EMG).
Since non-invasive methods are a priority, it is preferable to use either EEG or cutaneous EMG. Taken separately, however, these methods are of a relative nature, since there are obstacles that distort the signals (for EEG, the skull bones, skin and hair; for EMG, the skin and adjacent muscles). When designing a prosthesis and the methods of interacting with a person to control an artificial limb, the emphasis was placed on lightness, simplicity and aesthetics of the
structure; functionality (easy, simple, fast); small size; convenience; and aesthetic appeal. To provide the artificial limb with a full set of functionality, the focus was placed on interaction with the electrical potentials of the muscles. An informational analysis of control systems for upper limb prostheses showed that the principles and methods of managing prostheses are quite conservative: the same management methods have been used practically unchanged for about a hundred years. Modern mechanical forearm prostheses are assembled from modular (interchangeable) structures. The formation of these conditions and systems for working with prostheses was influenced by many factors, including the need to obtain the cheapest, but at the same time effective, means for the rehabilitation of people who have lost their upper limbs. At present, the control signal is taken using electromyographic sensors located on the preserved arm segments, where the sensors record the mechanical or electrical activity of large muscle groups, the flexors and extensors. When a person without a hand needs to make a movement with a finger, the brain generates a corresponding signal that travels along the nerves leading to the muscles of the limb. But since the hand is absent, the muscle contraction cannot produce the desired movement. If we read the nerve impulses along the way, then, after analyzing and processing the data, we can form a set of commands for controlling the prosthesis. As the most suitable option for detecting and recording impulses, we chose the Myo armband by the Canadian company Thalmic Labs. Myo works using electromyography, a method for studying the bioelectric potentials that arise in skeletal muscles when muscle fibers are excited; in other words, it records the electrical activity of muscles. Myo measures the electrical signals in the muscles of the arm and detects the planned gesture even before its actual execution, before the hand moves. By detecting the planned signals before their actual execution, it is possible to build basic models of the functionality of the human hand. If the patient is missing one upper limb, then using two Myo armbands one can do the following: take impulses from the healthy upper limb and form a reference functional model of the hand; take impulses from the affected limb (during amputation, the group of functional muscles loses its normal appearance, atrophies, truncates and shrinks, and a rough muscle scar can also form, which limits the possibility of phantom movement) and form a base of movements of the group of active truncated muscles; then, using machine learning, level out the differences between the signals of the reference model and the model of the affected limb, and build a functional model for the muscle group of the truncated hand. Such an approach will help to fully teach the prosthesis to understand and interact with the central nervous system of the patient and will make it possible to restore practically all the functions of the lost limb. This approach is the basic innovation of this article. It consists in collecting a colossal amount of information about a person and his nervous system, in particular about how his brain interacts with the muscles of the hands, what impulses arise in muscle tissues, and what levels of voltage and electrical
potential arise in the tissues from which the reading is made. Based on this information, through machine learning and averaging of the information about electrical impulses and the movements at which they arise, it is possible to build an ideal and utilitarian mathematical model describing the control of a human hand in terms of the interaction of the brain and muscle tissues. If the patient has lost both limbs, the following methodology of interaction is used: the reference models are the movements of healthy limbs of volunteers and patients, who replenish the database of the reference model. The differences between the reference model and the signals received from a truncated limb can be mitigated using machine learning technologies (neural networks, regression models, Markov chains, Bayes chains, k-means) or by an analytical method of deriving dependencies, for example the least squares method, in which we construct trajectories from a point cloud and then compare them and find the dependencies (a minimal sketch is given below). The leveling of discrepancies between the injured limbs and the reference models is carried out by constructing models of the movements of the amputated groups of active muscles and the connections between the functional sets of healthy and damaged limbs. This approach is, in fact, a modification of the basic approach and theoretically makes it possible to restore the functions of a lost limb to almost everyone in need.
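As an illustration of the analytical path mentioned above, the following sketch fits a linear dependence y = a·x + b between samples of the affected-limb signal and the reference-model signal by ordinary least squares. The data values and the assumption of a purely linear dependence are illustrative only; they stand in for the trajectory comparison the authors describe.

```cpp
#include <cstdio>
#include <vector>

// Minimal sketch (an assumption, not the authors' implementation): map the
// affected-limb signal x onto the reference-model scale y via y = a*x + b,
// with a and b found by ordinary least squares.
struct LinearFit { double a, b; };

LinearFit leastSquares(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());
    double sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (size_t i = 0; i < x.size(); ++i) {
        sx += x[i]; sy += y[i];
        sxx += x[i] * x[i]; sxy += x[i] * y[i];
    }
    const double a = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    const double b = (sy - a * sx) / n;
    return {a, b};
}

int main() {
    std::vector<double> affected  = {100, 120, 140, 160};  // uV, invented values
    std::vector<double> reference = {160, 190, 220, 250};  // uV, invented values
    LinearFit f = leastSquares(affected, reference);
    std::printf("y = %.3f * x + %.3f\n", f.a, f.b);
    return 0;
}
```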
6 Conclusion

Thus, at the present stage of development of technologies for reading biological signals from the human body, it can be concluded that EMG is the best method for constructing control systems for various means of rehabilitation. Muscle signals are still the only commercially and clinically viable biological signals for the control of upper limb prostheses. Despite several decades of academic research in this area, commercial devices are still driven almost entirely by algorithms from 60 years ago. Thanks to the research done, it can be stated that, with the help of modern information processing technologies and modified systems for the identification of muscle activity, we and other developers can push prosthetic control systems to a new level. This technique will allow us to restore the functionality of the missing limb, and even to expand the functional set of a person with an amputated limb, with the help of the variety of signals arising in the process of muscle activity.
References
1. Fonov, D.A., Fediuk, R.S.: Mechanical-bionic prosthetic prosthesis. In: High Technologies in Mechanical Engineering: Materials of the All-Russian Scientific and Technical Internet Conference, p. 263. Samara State Technical University, Samara (2016). ISBN 978-5-7964-1961-8
2. International Classification of Functioning, Disability and Health, p. 342. WHO, Geneva (2001)
3. Lamba, T.: Optimal machine learning model for software defect prediction. Int. J. Intell. Syst. Appl. (2), 36–48 (2019)
4. El Agha, M., Ashour, W.M.: Efficient and fast initialization algorithm for k-means clustering. Int. J. Intell. Syst. Appl. (IJISA) 4(1), 21–31 (2012). https://doi.org/10.5815/ijisa.2012.01.03
5. Chatterjee, B., Saha, H.N.: Parameter training in MANET using artificial neural network. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 11(9), 1–8 (2019). https://doi.org/10.5815/ijcnis.2019.09.01
6. Srivastava, S., Tripathi, B.K.: Development of hybrid learning machine in complex domain for human identification. Int. J. Intell. Syst. Appl. (IJISA) 11(1), 55–66 (2019). https://doi.org/10.5815/ijisa.2019.01.06
7. Lamba, T., Mishra, A.K.: Optimal machine learning model for software defect prediction. Int. J. Intell. Syst. Appl. (IJISA) 11(2), 36–48 (2019). https://doi.org/10.5815/ijisa.2019.02.05
8. Hagras, S., Mostafa, R.R., Abou El-Ghar, M.: A biometric system based on single-channel EEG recording in one-second. Int. J. Intell. Syst. Appl. (IJISA) 12(5), 28–40 (2020). https://doi.org/10.5815/ijisa.2020.05.03
9. Roche, A.D., et al.: Prosthetic myoelectric control strategies: a clinical perspective. Curr. Surg. Rep. 2, 44 (2014). https://doi.org/10.1007/s40137-013-0044-8
10. Nashiry, M.A., et al.: An optimized machine learning approach for predicting Parkinson's disease. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 13(4), 68–74 (2021). https://doi.org/10.5815/ijmecs.2021.04.06
11. Saif, A.S., Hossain, M.R., Ahmed, R., Chowdhury, T.: A review based on brain computer interaction using EEG headset for physically handicapped people. Int. J. Educ. Manag. Eng. 2, 34–43 (2019)
12. Fonov, D.A., Meshchikhin, I.A., Korzhov, E.G.: Mathematical modeling of DC motors for the construction of prostheses. In: Hu, Z., Petoukhov, S., He, M. (eds.) CSDEIS 2019. AISC, vol. 1127, pp. 16–27. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39216-1_2
13. Yadav, R.K., Kishor, J., Yadava, R.L.: Effects of superstrate on electromagnetically and gap coupled patch antennas. IJWMT 4(3), 26–35 (2014). https://doi.org/10.5815/ijwmt.2014.03.03
14. Bogdanov, E.A., et al.: Development of a neurodevice with biological feedback for compensating for lost motor functions. In: VESTNIK RSMU, no. 2 (2016)
15. Johansen, D., Popović, D.B., Struijk, L.N.S.A., Sebelius, F., Jensen, S.: A novel hand prosthesis control scheme implementing a tongue control system. IJEM 2(5), 14–21 (2012)
Map of Virtual Promotion of a Product

Sergey Orekhov, Andrii Kopp, and Dmytro Orlovskyi

Department of Computer Science and Software Engineering, National Technical University Kharkiv Polytechnic Institute, Kharkiv 61002, Ukraine
Abstract. Research in the field of search engine optimization on the Internet indicates the emergence of a new information technology: virtual promotion. The purpose of this technology is to increase the level of sales of goods over the Internet. The result of the technology's work is the formation of a so-called market map. This is a special online sales scheme based on the customer's journey map on the Internet, and it rests on a new business principle: we earn when we attract a new client, and the costs are repaid by attracting more and more customers. The more customers are attracted by web content, the more profit; consequently, income generation is based on the use of WEB services to attract potential customers. The main idea of the article is that the market map is now formed from WEB services. It must ensure that the process of attracting new customers and retaining existing ones is as efficient as possible. When creating the map, one should take into account the classic structures for forming marketing sales channels. From this point of view, virtual promotion is an information environment. This environment includes two channels. The first channel functions to disseminate product knowledge. The second channel is called the marketing channel; it is a control channel. The purpose of the first channel is to transfer information (knowledge) about the product to the Internet. It combines the actions of transporting, storing and searching for information about a product, depending on the needs of a potential buyer. The marketing channel includes a network of websites, Telegram channels, marketplaces and video blogs. In other words, the marketing channel is formed by real businesses that buy and sell information about products, services, or potential customers. Therefore, our task is to form a virtual promotion map that describes the structure of the product's marketing channel through the virtual space from the standpoint of the chain of technologies used.

Keywords: Virtual promotion · Search engine optimization · Customer journey map
1 Introduction and Related Works

Requirements for modern marketing describe not only the process of creating a product concept (selection of goods, prices and the target segment of potential buyers), but also the formation of constant communication with real and potential customers [1, 2]. The set of marketing communications forms the complex of product promotion.
The concept of "product promotion" was first introduced in the works [1, 2]. It is known that promotion is part of the "4P" principle: product, place, price and promotion. The promotion system is formed in the following areas [2]:
1) Advertising (paid or free) as an impersonal presentation of the product description.
2) Sales promotion, as a short-term action to stimulate the purchase of goods or services.
3) Public relations, as impersonal distribution of product descriptions through various information channels.
4) Personal selling, as the presentation of a product description during personal communication with a potential buyer.
Marketing [2] describes the communication channel as shown in Fig. 1.
Fig. 1. Marketing communication process
This diagram reflects the set of problems that exist in the process of marketing communications, namely:
1) Formation of a message and construction or selection of a channel that connects the sender and the recipient in the information space.
2) It is not clear through which channels the message should be sent, or what kind of feedback is needed in a given period of time. To resolve this, it is necessary to eliminate the "white noise" in the communication channel and correct the coding and decoding errors; therefore, a channel must be chosen that meets the requirements of the sender.
3) Formation of a feedback channel in response to a message.
Classic marketing answers these questions only in part, and only if communication takes place in the real information space rather than in the virtual one (the Internet) [1–5]. To understand today's challenges, it is necessary to trace the evolution of marketing communications since the advent of the Internet.
Thus, promotion is part of the marketing function of stimulating demand. The function of generating demand and stimulating sales [2, 3] is carried out mainly in the information space. This space was originally shaped by newspapers, television, radio and rumors. Now the main information space exists on the Internet, and all of the above information channels are there as well. The Internet has become an information space for everyone: for the company, for customers, and for competitors. Online competition is intensifying because it is now impossible to hide any information. You can win the competition if you are the first to inform a potential buyer about your product. The principle "first into the mind, first on the market" works flawlessly [1]. To implement this principle, customer relationship management (CRM) technology was proposed [4]. It creates an information environment within the enterprise for the accumulation of marketing data on current prices, orders and customers. Comparing internal data with external data reveals competitive advantages. In terms of promotion, CRM collects data about current marketing communications with customers. The range of communication channels is small: e-mail, phone calls and lists of receipts. However, the advantage of this technology is the ability to form integrated marketing communications. The goal of this approach is to get the most out of existing communication channels. Classical marketing theory is based on the principle "money - goods - money". Previously, the problem was to produce the goods that are needed. Now, thanks to globalization, there are enough products, but there is the problem of finding a solvent buyer for a product. Therefore, a business should start by attracting a potential client according to a new principle: "you attract, earn, spend". Within this principle, there are two approaches: order-driven or customer-driven. Therefore, the product is now a complex of three elements: audience, marketing channel and product message. Thus, the scheme shown in Fig. 1 changes to a new one (Fig. 2).

Fig. 2. The new principle of on-line business
The new sales scheme shows that the message about the product and its properties exists in parallel with the real product in the warehouse. The marketing channel itself is separate from the warehouse and works in parallel. That is, a virtual product description allows you to make money without even having the real product. This is the principle of "derivatives": when you buy shares in a company, you buy a future opportunity to receive dividends from it. The main elements of the Internet promotion complex then change to the following:
1) Advertising on the Internet becomes contextual, with short text messages displayed depending on the user's search [5, 6].
2) Sales promotion is transformed into working with user profiles on social networks (social media marketing).
3) Public relations are carried out through a corporate website based on search engine marketing [7–9].
4) Personal selling is carried out on behalf of an intelligent agent (bot) on trading platforms.
We will call this new marketing channel scheme virtual promotion. Virtual promotion (VP) is a collection of actions, efforts and technologies performed on the Internet with the aim of increasing the demand for and sales of goods. The goal of virtual promotion is to make a profit by influencing the formation of the buyers' subjective perception of goods or services.
2 Problem Statement

The new object of research [10] has the same nature as a logistics system for the distribution of goods. The first step, then, is to form a product data management system in the distributed hierarchical system that is the Internet. Thus, this object needs to be made concrete, using the analogy of an information dissemination system, or rather a system for disseminating knowledge about a product or service. We believe that virtual promotion is a set of Internet technologies that form two channels of communication with the buyer: the dissemination of product knowledge and the marketing of this knowledge. The first channel is a channel for transferring product knowledge. It is formed on the basis of operations of transferring and storing knowledge, as well as processing requests for the transfer of knowledge. For simplicity, we will consider only the two main operations in this channel: transfer and storage of knowledge. The process of transferring information on the network is based on a set of URLs of real sites that host software components. This process operates according to the OSI model [11]. In other words, it is a set of software components located at specified IP addresses. Each component performs either a transfer function or a storage function for product knowledge. The marketing channel is based on a variety of websites, Telegram channels, video blogs and marketplaces. This channel is formed by enterprises that buy and sell knowledge about a product, its packaging, a potential customer, or product news on the Internet. Free placement of knowledge about the product for storage is also possible
in the specified web content; that is, a website or a Telegram channel receives knowledge for storage on a paid or free basis. The moment of uploading knowledge about a product to a server on the Internet will then be considered the fact of transferring knowledge to the Internet. Thus, the distribution channel sets the configuration of the virtual promotion, and the marketing channel sets its organizational structure. Virtual promotion is successful if a customer places an order or makes a request to download data about our product from one of the specified IP addresses. Therefore, we have a two-level system in which the levels (the two channels) actively interact. The marketing level controls the distribution of product knowledge. The knowledge dissemination channel also includes information technologies such as contextual advertising, content management systems, social media marketing and others. Virtual promotion is thus aimed at forming an organizational system for managing the channel for the dissemination of knowledge on the Internet. We will call this organizational structure the market map, or virtual promotion map. This map shows the two levels at which transportation, storage and transfer of product knowledge to a potential buyer are carried out. Let us consider the classification of the channel for distributing product knowledge from the perspective of logistics theory. Two logistics concepts are now popular: just-in-time and continuous replenishment. The first method is based on assessing the demand from potential buyers; in that case, we would know the channels through which the company communicates its message about the product to potential customers. In our case such an approach is inapplicable, since virtual promotion itself synthesizes the future channel of communication with customers. Of course, one could form a forecast value, for example based on probability theory, but such forecasts would be ineffective. According to the second concept, knowledge about the product should be placed at given points of the Internet to await activation by a potential buyer. This is exactly the option that we need. To do this, we perform three actions: concentration, formatting and dissemination of product knowledge on the Internet. The first operation is aimed at uploading product knowledge, for example, to a corporate website or YouTube. At the same time, these technologies must be tuned so that product knowledge is transferred on demand to the Internet as quickly as possible. The second operation means processing the knowledge depending on the user's request on the Internet. We form combinations of knowledge and sort them into specific groups or sets; that is, knowledge is formatted depending on the external requests for receiving it. In other words, we provide the user with the right technology to encourage a purchase of our product. The third operation, "scattering", involves the transfer of product knowledge to other storage nodes in specified formats. We duplicate product knowledge on the Internet. We assume that such duplication is accompanied by reformatting, since we work within the anti-plagiarism systems of programs and components. Such a channel for disseminating knowledge is considered intensive, since the number of consumers is unlimited.
Having a detailed description of the phenomenon of virtual promotion, it is possible to formulate the research task and give a verbal description of the information technology that implements a hierarchical distributed system
for the dissemination of product knowledge in the form of a sequence of stages (Fig. 3). The result is a virtual promotion map, which consists of two parts: a set of technologies and a set of nodes where such technologies should be applied.

Fig. 3. The process of building a virtual promotion map
What is a virtual promotion map? It is, in effect, a training sample that we feed into the information technologies existing on the Internet in order to teach the virtual space to respond to a request for a given product or service within a certain period of time. A reaction is understood as the appearance of orders for the purchase of a particular product or service in a given geographic location. According to the theory of artificial intelligence, training an artificial system requires four components: a knowledge representation model, a training sample, a training method, and a criterion for stopping the learning process [7, 12]. We call the product knowledge representation model the semantic kernel. It is formed at the first stage of the implementation of the virtual promotion process [7, 12]. The kernel allows product knowledge to be concentrated in a given place. Then we synthesize a system for disseminating this knowledge on the Internet. To do this, we form the nodes of the promotion map in order to understand where to place the
semantic core. In the next step, we fill these nodes with copies of our semantic core, indicating the original source. Having formed the map, we analyze its effectiveness based on new metrics of virtual promotion, which build on the metrics of WEB analytics [5, 6]. We need to change this map constantly to adapt to the current configuration of the Internet. The aim is to ensure that knowledge is placed on the Internet sites that are most frequently viewed by potential buyers of a given product or service.
3 Proposed Approach

In this work, based on the description of virtual promotion from the standpoint of logistics theory, an approach to synthesizing the structure of channels for the dissemination of product knowledge is formulated. Such a structure is realized by a virtual product promotion map built as the following sequence of stages (Fig. 4).

Fig. 4. Solution schema
Figure 4 displays, in the form of a UML diagram [13], the algorithm by which the marketing strategy is transformed into a virtual promotion map by sequentially performing the following steps. First, a parametric and structural synthesis of the marketing channel for transferring knowledge to the Internet is performed. The channel has two levels: product knowledge
dissemination and marketing. The first level describes the set of technologies involved in the dissemination of knowledge on the Internet. The second describes the main sources of product knowledge, knowledge consolidation centers, intermediate points for knowledge customization, and the final Internet sites where a potential consumer receives and pays for a product. As an example, consider the following chain: a corporate website, a social network, a Telegram channel and a marketplace where a customer can make the actual purchase. Such a chain is supported by technologies such as contextual advertising, SEO of the corporate website, and social media marketing. Sites at the marketing level are represented by specific user accounts, for example on the social network Facebook. Map nodes are characterized by their own goals, strategies, material and information flows, management components and the software supporting their functioning. An obvious problem arises of reconciling such disparate systems into a single whole. Thus, the first stage is aimed at the structural and parametric synthesis of the marketing channel in order to accurately determine its composition at the two levels. The algorithm for implementing virtual promotion then includes the following actions [5, 6]:
1) Calculation of the attendance of the corporate WEB site.
2) Formation of the profile of the buyer who usually visits the WEB site.
3) Calculation of the level of costs for interaction with this profile.
4) Assessment of the attractiveness and content of the WEB site by potential buyers.
5) Selection of a set of products (semantic cores), knowledge of which is located in a given node on the Internet.
At the moment, there is no mathematical formula for the criterion for assessing the choice of the structure of the virtual promotion marketing channel. Such a criterion would assess the organizational structure needed for the dissemination of product knowledge. As a first approximation, however, the metrics for assessing the quality of SEO, or the criterion of the cost of placing knowledge (advertising) on a given Internet site, can be taken as a basis. The criterion will make it possible to form the structure of the sales channel on the set of admissible nodes of the virtual promotion map, that is, to solve the problem of parametric synthesis. In classical logistics, this problem is solved either on the basis of a modification of the transport problem or through the inventory management problem. In our case, these approaches are not applicable, and an alternative solution is required. In our situation, two criteria must be evaluated: the cost of placing knowledge on an Internet site, and site attendance by potential buyers of the product. A criterion is also needed that evaluates the ability of a node to transfer knowledge to another node; this is a throughput-type criterion. The presence of these criteria does not contradict the theory of logistics, where two criteria are also taken into account: minimizing costs and maximizing the level of customer service.
However, it is necessary to introduce one more criterion, one that shows the reliability of the virtual map under external influences. First of all, this reliability depends on the influence, or overlay, of competitors' maps on our map on the Internet. At the second stage, a new model of product knowledge presentation on the Internet is formed. In [7], we proposed an approach to creating a semantic core of web content. The third stage is the synthesis of the second level of the marketing channel in the form of a system of technologies concentrated in each node. This pool of technologies manages the distribution channel through product knowledge. We choose the most effective pool from several options, based on expert opinion or using optimization methods according to the three selected criteria. This is an iterative process. An example of such a pool of technologies is the following: if the node of the marketing channel is a profile on a social network, then the technologies in the pool should be content analysis, promotion in social networks and web analytics of the profile. As a result, the map is specified as an RDF schema [14, 15].
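To make the RDF representation more tangible, the sketch below encodes one hypothetical map node as subject-predicate-object triples; the vp: vocabulary (vp:hostsKernel, vp:usesTechnology and so on) is invented for illustration and is not the schema defined in [14, 15].

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Minimal sketch (an assumption, not the authors' schema): a virtual promotion
// map node expressed as RDF-style triples. Each triple links a node to the
// semantic kernel it hosts, the technologies applied at it, and its neighbors.
struct Triple { std::string s, p, o; };

int main() {
    std::vector<Triple> map = {
        {"vp:Node1", "rdf:type",          "vp:SocialProfile"},
        {"vp:Node1", "vp:hostsKernel",    "vp:SemanticKernel_ProductA"},
        {"vp:Node1", "vp:usesTechnology", "vp:ContentAnalysis"},
        {"vp:Node1", "vp:usesTechnology", "vp:SocialMediaMarketing"},
        {"vp:Node1", "vp:linksTo",        "vp:Node2"},  // knowledge transfer edge
    };
    for (const Triple& t : map)
        std::printf("%s %s %s .\n", t.s.c_str(), t.p.c_str(), t.o.c_str());
    return 0;
}
```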
Fig. 5. A sample VP application
The knowledge gained about virtual promotion was applied to increase the effectiveness of a WEB project in the USA. For this, a three-level knowledge distribution management system was formed [16] around a company that sells this service to many potential buyers (Fig. 5).
This three-level system has become a testing ground for forming variants of the virtual promotion map. First, the nodes of the virtual promotion map were selected and the semantic core was formed. The evolution of the core of the web project is shown in Fig. 5. Next, a pool of technologies was proposed for each node of the map, and software was selected to support the operation of the selected version of the map. Using key performance indicators [17] and metrics, the map variant was evaluated. After adjustments were made, a new version of the map was evaluated. This process was repeated several times, which is also reflected in Fig. 5. The final version of the technology pool for this web project included:
1) a content management system (PHP application);
2) semantic kernel building (JavaScript application);
3) content marketing with search engine optimization and social media marketing.
This pool made it possible to increase traffic to the corporate website in 2021, as can be seen in Fig. 5. As a result, according to the scheme in Fig. 4, the following actions are required:
1) Develop a method for synthesizing variants of the virtual promotion map.
2) Propose models for the synthesis of map elements.
3) Develop a method for solving the problem of coordinating the levels of the map and its elements at one level.
4) Propose a method for choosing a rational version of the map.
5) Test a version of the map using real software solutions on the example of a real web project.
4 Summary and Conclusion

The first results of solving the problems indicated in this article are given in [7, 10]. The main problem stated there is the formation of knowledge for transmission and storage in virtual space. We have proposed a model for representing knowledge about a product or service, called the semantic core. This research has demonstrated the existence of the core, as well as the fact that the system shown in Fig. 5 as a whole can be trained to transfer and store the semantic core we need. Thus, it is now possible to implement the tasks and assumptions specified in the article. The descriptions of the new object of research, called "virtual promotion", show two interesting facts:
1) Firstly, a new effect appears in the virtual space: the logistics of knowledge. These are models, methods and technologies for transporting and storing knowledge in the network. This theory originates from knowledge management, but uses logistics methods precisely to transport knowledge about a product, a service, or an individual person in the virtual space.
2) Secondly, production cycles for the creation, transportation, storage and sale of knowledge are formed in the virtual space. That is, it is not the goods themselves that are sold, but the knowledge of how to satisfy the given needs with the help of goods.
References
1. Kotler, P., Keller, K.: Marketing Management, p. 812. Prentice Hall, Hoboken (2012)
2. Kotler, P., Armstrong, G., Saunders, J., Wong, V.: Principles of Marketing, p. 1036. Prentice Hall Europe, Hoboken (1999)
3. Rowley, J.: Understanding digital content marketing. J. Mark. Manag. 24(5–6), 517–540 (2008)
4. Rahman, A., Khan, M.N.A.: An assessment of data mining based CRM techniques for enhancing profitability. Int. J. Educ. Manag. Eng. (IJEME) 2, 30–40 (2017)
5. Hashimova, K.K.: Development of an effective method of data collection for advertising and marketing on the internet. Int. J. Math. Sci. Comput. (IJMSC) 3, 1–11 (2021)
6. Hashimova, K.K.: Analysis method of internet advertising-marketing information's dynamic changes. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 5, 28–33 (2017)
7. Godlevsky, M.D., Orekhov, S.V.: Theoretical fundamentals of search engine optimization based on machine learning. CEUR-WS 1844, 23–32 (2017)
8. Kelsey, T.: Introduction to Search Engine Optimization, p. 126. Apress, New York (2017)
9. Veglis, A., Giomelakis, D.: Search Engine Optimization, p. 104. Future Internet (2021)
10. Orekhov, S.V., Malyhon, H.V.: Virtual promotion knowledge management technology. Bull. Nat. Tech. Univ. KhPI, Ser. Syst. Anal. Control Inf. Technol. 1(3), 74–79. NTU KhPI, Kharkiv (2020)
11. Alani, M.M.: Guide to OSI and TCP/IP Models, p. 50. Springer, Cham (2014)
12. Sihare, S.R.: Roles of e-content for e-business: analysis. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 1, 24–30 (2018)
13. Zaretska, I., Kulankhina, O., Mykhailenko, H., Butenko, T.: Consistency of UML design. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 9, 47–56 (2018)
14. Sharma, K., Marjit, U., Biswas, U.: RDF link generation by exploring related links on the web of data. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 10, 62–68 (2018)
15. Rahman, M.M., Akter, F.: An efficient approach for web mining using semantic web. Int. J. Educ. Manag. Eng. (IJEME) 5, 31–39 (2018)
16. Al-Mutairi, S.B., Qureshi, M.R.J.: A novel framework for strategic alliance of knowledge management systems. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 6(4), 38–45 (2014)
17. Taniguchi, A., Onosato, M.: Effect of continuous improvement on the reporting quality of project management information system for project management success. Int. J. Inf. Technol. Comput. Sci. 10(1), 1–15 (2018)
Analysis of the Technical Condition of Agricultural Machinery Using Neural Networks

Efim V. Pestryakov, Alexander S. Sayapin, Mikhail N. Kostomakhin, and Nikolai A. Petrishchev

Federal Research Center VIM (FSBI FNAC VIM), 5, 1st Institutsky proezd, 109428 Moscow, Russian Federation
Abstract. This article is devoted to the increasing role of artificial intelligence in agriculture, namely in diagnosing the state of agricultural machinery. New software that uses artificial intelligence is being developed all over the world. In parallel with the programs, new computing devices are being developed that allow large amounts of information to be stored and processed. All this makes it possible to build complex computer systems in which many third-party devices are involved. Agriculture should keep up with these trends.

Keywords: Artificial intelligence · Signal processing · Database · CUDA · Automation · Microcontrollers
1 Introduction

In the near future, agriculture will be inseparably linked to the field of information technology. This is the result of the huge arrays of information that require high-quality and fast analysis. For these purposes, separate areas of IT are currently being developed, such as Big Data technology and artificial intelligence algorithms, primarily based on neural networks. But first, in order for the software to work, it needs to be provided with that array of information. There are many ways to do this, and this article presents work in this direction by scientists of the FGBNU FNAC VIM. The significance of this study is mainly due to the constantly increasing need for the predictability of machinery, as well as for its improvement. This situation is a consequence of the fact that technical means in agriculture, in particular tractors, are becoming more technically complex, and all production processes are becoming more and more automated. It follows that any abnormal situation can lead to a failure of the entire technological process. It is also worth noting that planned maintenance is often more economically advantageous than urgent repairs. Diagnostics and predictability are key factors that fit into the concept of "lean production", as well as into the digitalization and automation of agriculture. Very little research and development has been carried out in this direction, which allows us
to speak about the novelty of the idea. To begin with, let’s consider the implementation of the technical component of this task.
2 Technical Devices and Methods

Information is collected and analyzed using an external LCARD E14-140M ADC/DAC module on the USB bus of a PC (personal computer). Additional STM32 and Arduino microcontrollers are also used, receiving data positionally from each sensor and redirecting it either to the LCARD module or directly to the computer. All information is processed by the main proprietary software, written in the C/C++ programming language and running on the personal computer, and by special real-time firmware on the microcontrollers, also written in C/C++ [1, 2]. The choice of programming language is not accidental, because the special driver, the dynamic link library (dll), the static library (lib) and the programming interface (API) for managing the E14-140M module are written in this language. Technically, to work with the LCARD programming interface, two header files are also provided: "Lusbapi.h", which describes the main application API, and "LusbapiTypes.h", an auxiliary file where the main data structures and types are defined, for more convenient and fast software development. The microcontrollers, as mentioned above, can send data to the E14-140M module through its built-in ADC/DAC, or to the central computing device (the PC) itself using, for example, the serial data transfer interface UART or RS-485, which is emulated on the personal computer as a COM port (a minimal reading sketch is given below). Unlike the LCARD module, each microcontroller has its own data exchange protocol. To process, structure and further analyze all the information received, special software was developed in C/C++ in the Visual Studio 2019 Community integrated development environment, using the external Qt 5.15 library to build the GUI [3]. This software runs under Windows 7, 8 and 10. It is worth noting the official recommendation from Microsoft to use only Windows 10, because other versions of this operating system are almost unsupported, which can lead to instability and, eventually, to failure of the program. In addition, protection against viruses plays a significant role, and it is much worse in older systems. In the future, it is planned to port the software to other operating systems, mainly Linux distributions. In fact, the only obstacle is…
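The sketch below shows one way the emulated COM port could be read on the Windows build described above, using the standard Win32 serial API. The port name, baud rate and raw-byte handling are assumptions for illustration; the real FNAC VIM exchange protocols are not reproduced here.

```cpp
#include <windows.h>
#include <cstdio>

// Minimal sketch (not the FNAC VIM source): open the virtual COM port and
// read raw bytes arriving from a microcontroller over UART/RS-485.
int main() {
    HANDLE port = CreateFileA("\\\\.\\COM3", GENERIC_READ | GENERIC_WRITE,
                              0, nullptr, OPEN_EXISTING, 0, nullptr);
    if (port == INVALID_HANDLE_VALUE) {
        std::fprintf(stderr, "cannot open COM3\n");
        return 1;
    }
    DCB dcb = {};
    dcb.DCBlength = sizeof(dcb);
    GetCommState(port, &dcb);
    dcb.BaudRate = CBR_115200;  // assumed MCU baud rate
    dcb.ByteSize = 8;
    dcb.Parity   = NOPARITY;
    dcb.StopBits = ONESTOPBIT;
    SetCommState(port, &dcb);

    unsigned char buf[64];
    DWORD bytesRead = 0;
    if (ReadFile(port, buf, sizeof(buf), &bytesRead, nullptr)) {
        // Dump whatever the device-specific protocol sent, byte by byte.
        for (DWORD i = 0; i < bytesRead; ++i) std::printf("%02X ", buf[i]);
        std::printf("\n");
    }
    CloseHandle(port);
    return 0;
}
```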
3 Implementation of the Method The software itself is a multiprocess and multithreaded application that works together with a database to accumulate information. MySQL is used as the database management system (DBMS). The program connects to the database through the ODBC3 driver supplied with the development environment. The algorithm of the software for data collection and accumulation is shown in Fig. 1. As can be seen from the presented flowchart, all work is coordinated by the "Central Process". It manages all other software processes except the
Fig. 1. The algorithm of the program
database server. The database operates independently of the central process; however, it accumulates information and exchanges it with the central process. A minimal sketch of such an ODBC connection follows.
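The sketch below shows one sensor sample being stored through ODBC3, matching the MySQL-over-ODBC setup described above. The driver string, credentials, and table layout are illustrative assumptions, not the authors' actual schema.

```cpp
// Sketch: inserting one sample into MySQL via the ODBC3 C API.
#include <windows.h>
#include <sql.h>
#include <sqlext.h>
#include <cstdio>

bool store_sample(double value) {
    SQLHENV env; SQLHDBC dbc; SQLHSTMT stmt;
    SQLAllocHandle(SQL_HANDLE_ENV, SQL_NULL_HANDLE, &env);
    SQLSetEnvAttr(env, SQL_ATTR_ODBC_VERSION, (SQLPOINTER)SQL_OV_ODBC3, 0);
    SQLAllocHandle(SQL_HANDLE_DBC, env, &dbc);

    char conn[] = "DRIVER={MySQL ODBC 8.0 Driver};"        // assumed DSN
                  "SERVER=localhost;DATABASE=diag;UID=user;PWD=pass;";
    if (!SQL_SUCCEEDED(SQLDriverConnect(dbc, nullptr, (SQLCHAR*)conn,
                                        SQL_NTS, nullptr, 0, nullptr,
                                        SQL_DRIVER_NOPROMPT)))
        return false;

    char query[128];
    std::snprintf(query, sizeof(query),
                  "INSERT INTO samples(value) VALUES(%f)", value);
    SQLAllocHandle(SQL_HANDLE_STMT, dbc, &stmt);
    bool ok = SQL_SUCCEEDED(SQLExecDirect(stmt, (SQLCHAR*)query, SQL_NTS));

    SQLFreeHandle(SQL_HANDLE_STMT, stmt);
    SQLDisconnect(dbc);
    SQLFreeHandle(SQL_HANDLE_DBC, dbc);
    SQLFreeHandle(SQL_HANDLE_ENV, env);
    return ok;
}
```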
Fig. 2. Interface for E14-140M
Figure 2 shows the interface for initializing the E14-140M module from the computer software. Figure 3 shows the process responsible for processing the information coming from the microcontrollers. Work with the CAN (Controller Area Network) interface and other interfaces is carried out in a similar way. First, the device developed by the employees of the FSBI FNAC VIM (Fig. 4) is connected to the tractor's CAN interface and reads the values transmitted on the CAN bus in hexadecimal form.
Fig. 3. Interface for microcontrollers
Fig. 4. CAN device
These program interfaces exist solely so that the user can monitor and control certain processes in the program; all the necessary information can be collected and processed in the background. The form for working with the CAN interface is shown in Fig. 5.
Fig. 5. Data from the CAN bus
As can be seen, a CAN message line consists of 3 blocks: (1) "Extended ID" or "Standard ID" is the identifier of the device inside the machine from which the information was received. In our test, CAN worked in the "Extended" format, the more modern, extended version of the standard; its identifier is 29 bits long. (2) The "DLC" field shows how many bytes of data arrived. According to the CAN standard this is up to 8 bytes, so a packet may also be incomplete.
(3) "Data" is the field with the data itself. The standard size is 8 bytes, presented in hexadecimal notation. Each byte can represent either a separate indicator or part of one. It is also worth noting that the CAN bus carries a higher-level protocol of its own, which may change depending on the machine. The J1939 protocol is often used on such equipment, but this needs to be analyzed separately for each case. A minimal frame layout in code is shown below.
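As a minimal illustration of the frame layout just described (our own sketch; the mapping of data bytes to physical quantities is protocol-specific, e.g. J1939, and must be decoded per machine):

```cpp
// Sketch: a plain-C++ view of one CAN frame: a 29-bit extended identifier,
// a DLC byte count, and up to 8 data bytes printed in hexadecimal.
#include <cstdint>
#include <cstdio>

struct CanFrame {
    uint32_t id;        // 29-bit extended identifier (or 11-bit standard)
    uint8_t  dlc;       // number of valid data bytes, 0..8
    uint8_t  data[8];   // payload
};

void print_frame(const CanFrame& f) {
    std::printf("ID=0x%08X DLC=%u DATA=", f.id & 0x1FFFFFFFu, f.dlc);
    for (uint8_t i = 0; i < f.dlc && i < 8; ++i)
        std::printf("%02X ", f.data[i]);
    std::printf("\n");
}
```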
4 Neural Network Development Once the collection and structuring of information in the database has been set up, the question of its further analysis arises. Since information is accumulated in real time, forming large data banks (Big Data) over time, the employees of the FGBNU FNAC VIM are developing a deep learning neural network for qualitative analysis. Finding a solution to a real-time problem is difficult, as it may depend on the type of problem, the sensitivity of the problem, and the type of solution expected. A soft computing approach helps to find a solution in unpredictable situations. The neural network (NN) is one of the soft computing techniques; an NN requires weights to be assigned among neurons to calculate a result [4]. One of the most significant areas in the development of today's technical systems for controlling and diagnosing objects of various designations is the improvement of the mathematical apparatus used to estimate the controlled parameters [6]. Deep learning differs from traditional machine learning techniques in that it can automatically learn representations from data such as images, video or text, without hand-coded rules or human domain knowledge. Its highly flexible architectures can learn directly from raw data and can increase their predictive accuracy when provided with more data. This development is not third-party software; it is integrated into the main program. However, it requires additional hardware, in the form of a video card, and supporting software. Modern neural networks make wide use of graphics accelerators (GPUs). The main manufacturers of these devices are NVidia and AMD. We use an NVidia graphics card: the choice is due to the fact that NVidia provides the special CUDA and cuDNN software. CUDA is a software and hardware architecture for parallel computing, which allows computing performance to be significantly increased through the use of NVidia GPUs. The NVIDIA CUDA Deep Neural Network library (cuDNN) is a library of primitives for GPU-accelerated deep neural networks. cuDNN provides carefully configured implementations of standard procedures such as forward and backward convolution, pooling, normalization, and activation layers. Both of these libraries provide a programming interface (API) for the C/C++ programming language, from which high-performance program code is generated; a minimal cuDNN call sequence is sketched below. Related soft computing work also covers active control techniques, where cancelling sources are generated by artificial means to deliver destructive interference with an unwanted source, reducing the level of vibration or disturbance at the desired site [10].
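Returning to the cuDNN API mentioned above, the following sketch shows the descriptor-based call sequence for a single forward convolution in cross-correlation mode (cf. Eq. (1) below). Tensor sizes, filter counts, and the algorithm choice are placeholder assumptions, and error checking is omitted; this follows the cuDNN 7-era C API rather than the authors' actual code.

```cpp
// Sketch: one forward convolution through cuDNN.
#include <cudnn.h>

void forward_conv(const float* x, const float* w, float* y) {
    cudnnHandle_t h;
    cudnnCreate(&h);

    cudnnTensorDescriptor_t xd, yd;
    cudnnFilterDescriptor_t wd;
    cudnnConvolutionDescriptor_t cd;
    cudnnCreateTensorDescriptor(&xd);
    cudnnCreateTensorDescriptor(&yd);
    cudnnCreateFilterDescriptor(&wd);
    cudnnCreateConvolutionDescriptor(&cd);

    // N=1, C=3, H=W=224 input; K=16 filters of size 3x3 (all assumed).
    cudnnSetTensor4dDescriptor(xd, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                               1, 3, 224, 224);
    cudnnSetFilter4dDescriptor(wd, CUDNN_DATA_FLOAT, CUDNN_TENSOR_NCHW,
                               16, 3, 3, 3);
    // pad=1, strides u=v=1, dilation=1, cross-correlation mode.
    cudnnSetConvolution2dDescriptor(cd, 1, 1, 1, 1, 1, 1,
                                    CUDNN_CROSS_CORRELATION,
                                    CUDNN_DATA_FLOAT);
    cudnnSetTensor4dDescriptor(yd, CUDNN_TENSOR_NCHW, CUDNN_DATA_FLOAT,
                               1, 16, 224, 224);

    const float alpha = 1.0f, beta = 0.0f;
    cudnnConvolutionForward(h, &alpha, xd, x, wd, w, cd,
                            CUDNN_CONVOLUTION_FWD_ALGO_IMPLICIT_GEMM,
                            nullptr, 0,      // no extra workspace needed
                            &beta, yd, y);
    cudnnDestroy(h);
}
```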
In order to observe a system and identify its features, a model of the flexible-structure system obtained via system identification schemes is required. The parameters of this model should be adjusted until its outputs match the measured outputs. Such a model is essential for analyzing, simulating and predicting the system's behavior [7].
Fig. 6. Neural network training
The algorithm shown in Fig. 6 is an enlarged scheme of the interaction between the main blocks of the program: the central process, the database, and the neural network. The neural network is trained on the database according to criteria specified within the network itself. The analyzed information is then transmitted to the central process to present the result to the software user. The C/C++ programming language, in which all the software is written, provides a low level of abstraction, which in turn means high efficiency. In general, CUDA technology is the heart of this work; all research in this direction is built on it. Below are the formulas that our software uses. This section describes the various convolution formulas implemented in the convolution functions. The convolution terms described in Table 1 below apply to all the convolution formulas that follow.
Normal Convolution (using cross-correlation mode)

$$y_{n,k,p,q} = \sum_{c}^{C} \sum_{r}^{R} \sum_{s}^{S} x_{n,\,c,\,p+r,\,q+s} \cdot w_{k,c,r,s} \qquad (1)$$

Table 1. Convolution terms [12]

Term | Description
x | Input (image) tensor
w | Weight tensor
y | Output tensor
n | Current batch size
c | Current input channel
C | Total input channels
H | Input image height
W | Input image width
k | Current output channel
K | Total output channels
p | Current output height position
q | Current output width position
G | Group count
pad | Padding value
u | Vertical subsample stride (along height)
v | Horizontal subsample stride (along width)
dil_h | Vertical dilation (along height)
dil_w | Horizontal dilation (along width)
r | Current filter height
R | Total filter height
s | Current filter width
S | Total filter width
C_g | C/G
K_g | K/G

Convolution with Padding

$$x_{<0,<0} = 0, \qquad x_{>H,>W} = 0$$

(input values outside the image height and width are taken as zero)

$$y_{n,k,p,q} = \sum_{c}^{C} \sum_{r}^{R} \sum_{s}^{S} x_{n,\,c,\,p+r-\mathrm{pad},\,q+s-\mathrm{pad}} \cdot w_{k,c,r,s} \qquad (2)$$
Convolution with Subsample-Striding

$$y_{n,k,p,q} = \sum_{c}^{C} \sum_{r}^{R} \sum_{s}^{S} x_{n,\,c,\,(p \cdot u)+r,\,(q \cdot v)+s} \cdot w_{k,c,r,s} \qquad (3)$$

Convolution with Dilation

$$y_{n,k,p,q} = \sum_{c}^{C} \sum_{r}^{R} \sum_{s}^{S} x_{n,\,c,\,p+(r \cdot dil_h),\,q+(s \cdot dil_w)} \cdot w_{k,c,r,s} \qquad (4)$$
Convolution Using Convolution Mode

$$y_{n,k,p,q} = \sum_{c}^{C} \sum_{r}^{R} \sum_{s}^{S} x_{n,\,c,\,p+r,\,q+s} \cdot w_{k,\,c,\,R-r-1,\,S-s-1} \qquad (5)$$

Convolution Using Grouped Convolution

$$C_g = \frac{C}{G}, \qquad K_g = \frac{K}{G}$$

$$y_{n,k,p,q} = \sum_{c}^{C_g} \sum_{r}^{R} \sum_{s}^{S} x_{n,\,C_g \cdot \mathrm{floor}(k/K_g)+c,\,p+r,\,q+s} \cdot w_{k,c,r,s} \qquad (6)$$
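To make the formulas concrete, here is a naive CPU reference rendering of Eq. (2): cross-correlation with zero padding, unit stride, and no dilation or groups. It is purely illustrative; cuDNN replaces this loop nest with GPU-optimized kernels.

```cpp
// Naive reference for Eq. (2). Indexing follows Table 1: NCHW input x,
// KCRS weights w, NKPQ output y.
#include <vector>

void conv2d_ref(const std::vector<float>& x, const std::vector<float>& w,
                std::vector<float>& y, int N, int C, int H, int W,
                int K, int R, int S, int pad) {
    const int P = H + 2 * pad - R + 1;   // output height
    const int Q = W + 2 * pad - S + 1;   // output width
    y.assign((size_t)N * K * P * Q, 0.0f);
    for (int n = 0; n < N; ++n)
      for (int k = 0; k < K; ++k)
        for (int p = 0; p < P; ++p)
          for (int q = 0; q < Q; ++q) {
            float acc = 0.0f;
            for (int c = 0; c < C; ++c)
              for (int r = 0; r < R; ++r)
                for (int s = 0; s < S; ++s) {
                  int hi = p + r - pad, wi = q + s - pad;
                  if (hi < 0 || hi >= H || wi < 0 || wi >= W)
                      continue;          // x is zero outside its bounds
                  acc += x[((n * C + c) * H + hi) * W + wi]
                       * w[((k * C + c) * R + r) * S + s];
                }
            y[((n * K + k) * P + p) * Q + q] = acc;
          }
}
```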
Of course, one can use other technologies or even abandon graphics accelerators entirely, but the efficiency of neural networks then drops tenfold, while the task is precisely to achieve high software efficiency. Graphics accelerators perform mathematical operations much more efficiently, and all machine learning rests on simple mathematical calculations. This work is extremely complex, but at the same time the most promising given modern realities. At the moment, the result of the work done is a number of new diagnostic devices, such as a prototype of a counter-indicator for determining the technical condition and loading level of hydraulic systems, a prototype of a tractor wheel slip controller, a prototype of a pump pressure pulsation indicator, and much more. A whole range of new software packages on this topic is also being developed. As work on neural-network-based tractor diagnostics continues, it is planned to transfer the specific mathematical apparatus to the software and hardware complexes described above.
5 Conclusions (1) Currently existing technologies are already capable of performing the huge number of complex calculations that is indispensable for diagnosing technical means and predicting their condition. (2) The main problem is that existing software is designed around old principles and construction models, and often does not even support modern technical solutions. The information system presented here has been developed taking into account the latest achievements in information technology. (3) At the moment there is still much work to be done, in particular the development of a deep learning neural network on the basis of the existing one. But even at this stage the system has shown its effectiveness. Further development of the neural network for the diagnosis and prediction of the technical condition of agricultural machinery will benefit all specialists associated with this field.
References
1. Saha, T., et al.: Construction and development of an automated greenhouse system using Arduino Uno. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 9(3), 1–8 (2017). https://doi.org/10.5815/ijieeb.2017.03.01
2. Agarkhed, J., Ashalatha, R.: Security and privacy for data storage service scheme in cloud computing. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 9(4), 7–12 (2017). https://doi.org/10.5815/ijieeb.2017.04.02
3. Anda, I., Isah, R.O., Enesi, F.A.: A safety data model for data analysis and decision making. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 9(4), 21–30 (2017). https://doi.org/10.5815/ijieeb.2017.04.04
4. Stroustrup, B.: Programming: Principles and Practice Using C++, 2nd edn. LLC I. D. Williams, Moscow (2016). ISBN 978-5-8459-1949-6 (Russian translation)
5. Peter, P., Crawford, L.S.: Handbook. Full Description of the Language, 2nd edn., p. 880. Alfa-Book LLC (2017). ISBN 978-5-9908911-6-6 (in Russian)
6. Shlee, M.: Qt 5.10. Professional Programming in C++, p. 1072. BHV-Petersburg, St. Petersburg (2018). ISBN 978-5-9775-3678-3 (in Russian)
7. Aarti, M., Karande, D.R.: Weight assignment algorithms for designing fully connected neural network. Int. J. Intell. Syst. Appl. (IJISA) 10(6), 68–76 (2018). https://doi.org/10.5815/ijisa.2018.06.08
8. Dharmajee Rao, D.T.V., Ramana, K.V.: Winograd's inequality: effectiveness for efficient training of deep neural networks. Int. J. Intell. Syst. Appl. (IJISA) 10(6), 49–58 (2018). https://doi.org/10.5815/ijisa.2018.06.06
9. Zhengbing, H., Tereykovskiy, I.A., Tereykovska, L.O., Pogorelov, V.V.: Determination of structural parameters of multilayer perceptron designed to estimate parameters of technical systems. Int. J. Intell. Syst. Appl. (IJISA) 9(10), 57–62 (2017). https://doi.org/10.5815/ijisa.2017.10.07
10. Awadalla, M.H.: Spiking neural network and bull genetic algorithm for active vibration control. Int. J. Intell. Syst. Appl. (IJISA) 10(2), 17–26 (2018). https://doi.org/10.5815/ijisa.2018.02.02
11. Abuljadayel, A., Wedyan, F.: An approach for the generation of higher order mutants using genetic algorithms. Int. J. Intell. Syst. Appl. (IJISA) 10(1), 34–45 (2018). https://doi.org/10.5815/ijisa.2018.01.05
12. NVIDIA CUDA Toolkit. https://developer.nvidia.com/cuda-toolkit
13. NVIDIA cuDNN. https://developer.nvidia.com/cudnn
14. Microsoft Win32 API documentation. https://docs.microsoft.com/en-us/windows/win32/
15. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning, 2nd edn., p. 652. DMK Press, Moscow (2018). ISBN 978-5-97060-618-6 (Russian translation by Slinkin, A.A.)
16. Kernighan, B.W., Ritchie, D.M.: The C Programming Language, p. 288 (2019). ISBN 978-5-907144-14-9
17. Nikolenko, S., Kadurin, A., Arkhangelskaya, E.: Deep Learning, p. 480. Piter, St. Petersburg (2018). ISBN 978-5-496-02536-2 (Programmer's Library series)
Information Spaces for Big Data Problems in Fuzzy Bayesian Decision Making
Peter Golubtsov (Lomonosov Moscow State University, Moscow, Russia), [email protected]
Abstract. We introduce and study fuzzy information spaces, which naturally emerge in decision making problems based on a fuzzy representation of uncertainty. Such spaces, whose elements represent information, become especially important in big data problems, when all the relevant data cannot be collected in one place. First, we carefully consider decision making problems where uncertainty is described in terms of fuzzy sets and show that the critical part in constructing the optimal decision is the transition from a prior fuzzy distribution to a posterior one, similar to the Bayesian transition in probability theory. In the case of a data stream, such a fuzzy distribution update procedure can be applied iteratively. Then we show that the information contained in the data can be represented by elements of a certain mathematical structure, an information space. In terms of this structure, it is possible to uniformly describe the transformation of prior information into posterior information, the accumulation of information obtained from different sources, etc. Besides, many intuitively expected properties of the very concept of "information" acquire an adequate mathematical expression in terms of the properties of the information space. Keywords: Fuzzy distributions · Decision making · Big data streams · Sequential Bayesian inference · Information spaces · Information quality · Distributed parallel processing
1 Introduction The focus of this study lies at the intersection of three huge directions: big data processing, especially within the MapReduce framework; Bayesian decision making, originally developed in a probabilistic context; and fuzzy sets as a model of uncertainty. We will show how the most important probabilistic concepts, and in particular Bayesian inference, can be developed in fuzzy settings. In the process we will see how the information from a fuzzy experiment can be elegantly represented by a certain fuzzy set, an element of a fuzzy information space. In its turn, the rich algebraic structure of such information spaces will be shown to provide a key element for a MapReduce implementation of the corresponding distributed decision-making algorithm in big data settings. Fuzzy sets [35] are used more and more often instead of probability theory when describing uncertainty in decision-making problems. This is due to several reasons. One of them is that the probabilistic models do not always adequately represent
uncertainty. Besides, in many decision-making problems it is not the average but the maximum possible error that matters, which naturally leads to a minimax approach. In such cases, it is more natural to describe uncertainty by a set, and an experiment by a multivalued mapping [14]. Finally, if we are interested in the maximum possible error with "weights" that take into account the different "possibilities" of different experiment observations, fuzzy set theory becomes a natural candidate for describing uncertainty. As shown in [15], the most important concepts of probability theory and mathematical statistics have a rather abstract nature and are easily transferred to other theories of uncertainty. In particular, the key role in decision-making procedures is played by the transition from a prior fuzzy distribution to a posterior one, similar to the Bayesian principle in the theory of statistical decisions [4]. Note that many studies [8, 27] developing fuzzy approaches to describing uncertainty strive to make them as close to probability theory as possible, which allows interpreting probabilistic results in possibilistic terms and vice versa. Other directions [1, 33] aim at complementing the probabilistic model with fuzzy constructions, thus leading to generalized approaches. It has also been recognized that the algebra of fuzzy sets is well suited for sequential data processing, such as fuzzy clustering of sequential data [22, 32]. Another important aspect of fuzzy set theory is its close relationship with fuzzy logic (see, e.g., [6, 24]) and fuzzy inference [5, 16], which naturally generalize traditional logic and inference. All these directions heavily influenced this research. In this work, we use one of the simplest uncertainty models based on the concept of a fuzzy set. At the same time, as shown in the paper, even such a simple model of uncertainty provides a full-fledged foundation for the formulation and analysis of decision-making problems. What is more important for the scope of this paper, the analysis of the optimal decision-making process leads to the construction of rather rich information spaces. Elements of the information space, in fact, fully represent the information contained in the data. In terms of this structure, it turns out to be possible to uniformly describe the "accumulation" of information obtained from different sources, the transformation of prior information into posterior information, the comparison of information according to its accuracy, etc. At the same time, many of the intuitively expected properties of the very concept of "information" obtain an accurate mathematical expression in terms of the properties of the information space. Thus, the main research objectives of this work are to introduce fuzzy information spaces and study their properties. As shown in [20], the construction of an adequate information space makes it possible to effectively parallelize the information accumulation process using the MapReduce distributed data analysis model [7] and, in particular, to "parallelize" the initially sequential Bayesian procedure for updating information. This ability to adapt processing to a parallel and distributed form is especially valuable in the context of big data problems [3, 11]. In [21], probabilistic information spaces for finite or countable measurable spaces are considered. This work continues that research and considers similar information spaces arising in a fuzzy experiment.
As will be shown, a fuzzy model leads to richer information spaces than a probabilistic one. As a result, this study enriches our knowledge of information spaces by extending it to a wide class of decision-making problems in fuzzy settings.
2 Fuzzy Uncertainty Model In this section, we will briefly consider one of the simplest fuzzy models for describing uncertainty and experiment. A more detailed study can be found in [15].

2.1 Fuzzy Distribution and Fuzzy Experiment
By a fuzzy distribution on a set X we mean a normalized fuzzy set, i.e. a function $\pi : X \to [0,1]$ such that $\sup_{x \in X} \pi(x) = 1$. The fuzzy distribution $\pi$ of an unknown element $\xi \in X$ describes the possibility $\pi(x)$ that $\xi = x$. The values of x at which $\pi(x) = 0$ will be called impossible (with respect to the distribution $\pi$). The set of all possible values of $\xi$ is determined by the support of the distribution $\pi$: $\mathrm{supp}\,\pi = \{x \in X \mid \pi(x) > 0\}$. Imitating probability-theoretic terminology, we say that a property $\Phi(\xi)$ is satisfied almost surely with respect to the distribution $\pi$ (or $\pi$-almost surely) if it is satisfied on $\mathrm{supp}\,\pi$. Obviously, any non-empty "crisp" subset $Q \subseteq X$ can be considered as a fuzzy distribution determined by the membership function $\pi_Q(x) = 1$ for $x \in Q$ and $\pi_Q(x) = 0$ for $x \notin Q$. A fuzzy experiment [15] from X to Y is specified by a fuzzy transition distribution $q(y|x)$: for any fixed $x \in X$, $q(\cdot|x)$ is a fuzzy distribution on Y. Thus, $q(y|x)$ determines the possibility of observing $\omega = y$ provided that $\xi = x$. In fact, $q(y|x)$ is a function of two variables with values in $[0,1]$ such that $\sup_{y \in Y} q(y|x) = 1$ for any $x \in X$. The data obtained from the experiment q as a result of the observation $\omega = y$ are described by the pair $(q, y)$, i.e. the model of the experiment and its result.

2.2 Fuzzy Joint Distribution
The distribution $\pi$ on X and the transition distribution q from X to Y generate a joint distribution $\sigma$ on $X \times Y$: $\sigma(x,y) = \pi(x)\,q(y|x)$. Thus, the possibility of $(x,y) \in X \times Y$ is determined by the product of the possibility of x and the possibility of y for that x. On the other hand, the joint distribution $\sigma$ contains complete information about the distribution $\pi$ as well as essential information about the transition distribution q. Indeed, consider the marginal distributions (shadows of a fuzzy set [35, 36], marginal fuzzy constraints [9]) on the spaces X and Y, respectively: $\sigma_X(x) = \sup_{y \in Y} \sigma(x,y)$, $\sigma_Y(y) = \sup_{x \in X} \sigma(x,y)$. Then the distribution $\pi$ coincides with the X-marginal distribution for $\sigma$, that is, $\pi(x) = \sigma_X(x)$. It is natural to interpret the Y-marginal distribution $\sigma_Y$ as the full distribution of observations $\omega \in Y$ generated by the distribution $\pi$ and the transition distribution q: $\sigma_Y(y) = \sup_{x \in X} \pi(x)\,q(y|x)$. Note that the generated joint distribution $\sigma$ and, accordingly, the full distribution $\sigma_Y$ can be defined in different ways [9, 13, 37]. In this paper, we will use the multiplicative form $\pi(x)\,q(y|x)$, since this leads to closer relations with the probabilistic framework.
2.3 Conditional Distribution
Let $\sigma$ be some joint distribution on $X \times Y$. By analogy with probability theory [2], we say that $u(y|x)$ is conditional for $\sigma$ with respect to X if $\sigma$ is generated by its marginal $\sigma_X$ and the X-conditional u: $\sigma(x,y) = \sigma_X(x)\,u(y|x)$. Similarly, $v(x|y)$ is conditional for $\sigma$ with respect to Y if $\sigma(x,y) = \sigma_Y(y)\,v(x|y)$. Obviously, if the joint distribution $\sigma$ on $X \times Y$ is generated by the distribution $\pi$ on X and the transition distribution q from X to Y, then q is one of the variants of the X-conditional distribution.

Theorem 1 [15]. Conditional distributions exist for any joint distribution $\sigma$ on $X \times Y$. The variants of the distributions u and v conditional with respect to X and Y are uniquely defined for all possible values of x and y, i.e. on $\mathrm{supp}\,\sigma_X$ and $\mathrm{supp}\,\sigma_Y$, respectively, by the expressions:

$$u(y|x) = \frac{\sigma(x,y)}{\sup_{y \in Y} \sigma(x,y)} = \frac{\sigma(x,y)}{\sigma_X(x)}, \qquad v(x|y) = \frac{\sigma(x,y)}{\sup_{x \in X} \sigma(x,y)} = \frac{\sigma(x,y)}{\sigma_Y(y)}.$$

A joint distribution $\sigma$ on $X \times Y$ is called independent if it is completely determined by its marginal distributions: $\sigma(x,y) = \sigma_X(x)\,\sigma_Y(y)$. Obviously, in this case the marginal distributions $\sigma_X(x)$ and $\sigma_Y(y)$ are also conditional, namely $u(y|x) = \sigma_Y(y)$ and $v(x|y) = \sigma_X(x)$, and they do not depend on x and y, respectively. From Theorem 1 we immediately find that for the distribution $\pi$ on X and the transition distribution q from X to Y, the conditional distribution of $\xi$ under the condition that the observation $\omega = y$ is

$$\pi(x|y) = \frac{\pi(x)\,q(y|x)}{\sup_{x \in X} \pi(x)\,q(y|x)},$$

similar to Bayes' formula for probability densities, $p(x|y) = \frac{p(x)\,q(y|x)}{\int_X p(x)\,q(y|x)\,dx}$, where $\int_X (\cdot)\,dx$ is replaced by $\sup_{x \in X} (\cdot)$.
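For a finite X the fuzzy Bayes transition above amounts to a pointwise product followed by sup-normalization. The sketch below is ours (the discretization of X and the function names are assumptions, not the paper's):

```cpp
// Sketch: the fuzzy Bayes transition on a finite X = {0,...,n-1}.
// prior[i] is pi(x_i); like[i] is q(y|x_i) for the observed y. The
// integral of the probabilistic Bayes formula is replaced by sup (max).
#include <vector>
#include <algorithm>

std::vector<double> fuzzy_posterior(const std::vector<double>& prior,
                                    const std::vector<double>& like) {
    std::vector<double> post(prior.size());
    double sup = 0.0;
    for (size_t i = 0; i < prior.size(); ++i) {
        post[i] = prior[i] * like[i];          // pi(x) * q(y|x)
        sup = std::max(sup, post[i]);
    }
    if (sup > 0.0)                             // normalize: sup becomes 1
        for (double& v : post) v /= sup;
    return post;                               // all-zero if incompatible
}
```

Feeding each result back as the next prior yields exactly the sequential update scheme discussed in Sect. 3.4 below.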
3 Decision-Making Problems Consider decision-making problems under a fuzzy description of uncertainty. By analogy with the theory of statistical decisions [4], we start with the simplest problem with prior information.

3.1 A Trivial Decision-Making Problem

Let X be some fixed set, the space of studied objects, Z a fixed space of decisions, and $h : X \times Z \to [0, +\infty)$ a loss function. The value $h(x,z)$ determines the quality of the decision z provided that the unknown object $\xi = x$. If it is known a priori that the possible values of $\xi$ are determined by some fuzzy distribution $\pi$ on X, we define the possible losses for the decision z as $h_\pi(z) = \sup_{x \in X} \pi(x)\,h(x,z)$
and say that the decision $z_\pi$ is optimal with respect to the prior distribution $\pi$ if it provides the minimum of the function $h_\pi(z)$. It is natural to call the problem of finding the optimal decision $z_\pi$ the Bayesian problem for the fuzzy theory of uncertainty, since it is similar to the Bayesian decision problem in mathematical statistics, where the prior information is determined by a probability density function $p(x)$ and the function $h_p(z)$ has the form $h_p(z) = \int_X p(x)\,h(x,z)\,dx$.

3.2 Optimal Decision Strategy
Let us now turn to the problem of choosing the optimal decision strategy in a fuzzy experiment. Let a fuzzy experiment from X to Y be described by a fuzzy transition distribution q. By a (deterministic) decision strategy we mean some mapping d from the outcome space Y of the experiment to the space of decisions Z, i.e. $d : Y \to Z$. If some mapping d is chosen as the decision strategy, then for the unknown element equal to x and the experiment result y, the loss for the decision $d(y)$ will be $h(x, d(y))$. Since for a fixed x the measurement result y is distributed according to $q(y|x)$, the risk function corresponding to a given x and decision strategy d is $H(x,d) = \sup_{y \in Y} q(y|x)\,h(x, d(y))$. Finally, the problem of choosing the optimal decision strategy for the prior distribution $\pi$ consists in finding a mapping $d_\pi$ that minimizes

$$H_\pi(d) = \sup_{x \in X} \pi(x)\,H(x,d) = \sup_{x \in X} \pi(x) \sup_{y \in Y} \bigl(q(y|x)\,h(x, d(y))\bigr), \qquad \min_{d : Y \to Z} H_\pi(d).$$
Note that this problem is much more complicated than the trivial one, since the minimum is sought in the space of all mappings from Y to Z. A well-known result of mathematical statistics, emphasized in [4], states that the problem of finding the optimal decision strategy can be reduced to a similar trivial problem for the conditional distribution.

3.3 Fuzzy Bayes Principle
Theorem 2 (Fuzzy Bayes principle) [15]. Let, for a given prior distribution $\pi$ on X and a transition distribution q from X to Y, the conditional distribution from Y to X be $\nu$. Let, in addition, for any possible y (with respect to the full induced distribution $u(y) = \sup_{x \in X} \pi(x)\,q(y|x)$) there exist a solution $z_{\nu(\cdot|y)}$ of the trivial problem that is optimal with respect to the distribution $\nu(\cdot|y)$. Then the optimal decision strategy $d_\pi$ and the corresponding loss $H_\pi(d_\pi)$ are given by the formulas

$$d_\pi(y) = z_{\nu(\cdot|y)}, \qquad H_\pi(d_\pi) = \sup_{y \in Y} u(y)\, h_{\nu(\cdot|y)}\bigl(z_{\nu(\cdot|y)}\bigr).$$

The Bayes principle provides a "pointwise" construction of the optimal decision strategy $d_\pi$. Namely, upon observing the result y, we first construct the conditional
distribution $\nu(\cdot|y)$, and then find the optimal decision $z_{\nu(\cdot|y)}$ for this conditional distribution. It is important to note that the first step of this procedure, the construction of the conditional distribution $\nu(\cdot|y)$, does not depend on the specific decision problem being solved. This means that the conditional distribution contains all the "information" required to find the optimal decision in any decision-making problem.

3.4 Accumulation of Information in Big Data Streams
The Bayesian transition from a prior to a posterior probability distribution [2, 23] is widely used in processing big data streams [30, 31, 38], since it allows recalculating the prior distribution into the posterior and using it as the prior for the next experiment. In [23] this idea is expressed in the form: "today's posterior is tomorrow's prior"; see also [25, 28]. The fuzzy Bayesian transition fully preserves this property (Fig. 1). The transformation of the distribution at each step has the form

$$\pi_k(x) = \frac{\pi_{k-1}(x)\,q_k(y_k|x)}{\sup_{x \in X} \pi_{k-1}(x)\,q_k(y_k|x)}.$$

Finally, at any moment, based on the current updated distribution $\pi_k$, an optimal decision $z_{\pi_k}$ can be constructed.
Fig. 1. Sequential update of fuzzy distribution.
4 Fuzzy Information Space As shown above, upon obtaining the outcome y of the experiment q, the transition from the prior distribution $\pi(x)$ to the posterior (conditional) one,

$$\pi(x|y) = \frac{\pi(x)\,q(y|x)}{\sup_{x \in X} \pi(x)\,q(y|x)},$$

reflects the effect of information refinement as a result of an experiment and is key in decision-making problems. According to this formula, the conditional distribution
$\pi(\cdot|y)$ is completely determined by the product of two functions on the set X, namely the prior distribution $\pi(\cdot)$ and the function $q(y|\cdot)$. The expression in the denominator, in fact, "normalizes" the function $\pi(\cdot)\,q(y|\cdot)$, turning it into a fuzzy distribution.

4.1 Information Space Definition
We say that two functions $r_1, r_2 : X \to \mathbb{R}$ are equivalent, $r_1 \sim r_2$, if they are proportional, more precisely, $r_2 = a r_1$ for some $a > 0$. The equivalence class of the function r will be denoted by $\bar{r} = [r] = \{a r \mid a > 0\}$. Let $\mathbb{X}$ be the set of equivalence classes of all nonnegative bounded functions on X. We define on $\mathbb{X}$ a composition operation $[u] \circ [v] = [u \cdot v]$. An element $\bar{r} \in \mathbb{X}$ will be interpreted as information about the unknown $\xi \in X$, and the set $\mathbb{X}$ with the operation $\circ$ will be called the information space for X. It is natural to interpret a fuzzy distribution $\pi$ as the prior information $\bar{\pi} = [\pi] \in \mathbb{X}$, the element $\bar{r} = [q(y|\cdot)]$ as the information delivered by the experiment q and result y, and the conditional distribution $\bar{u} = [\pi(\cdot|y)] = \bar{\pi} \circ \bar{r}$ as the posterior information. A special role in $\mathbb{X}$ is played by $\bar{0} = [0] = \{0\}$. This element will be interpreted as impossible information. Obviously, if the function $r(x)$ is not identically zero, it can be normalized, and the class $\bar{r} = [r]$ can be identified with the resulting fuzzy distribution. We denote by $\bar{1} = [1]$ the class of constant nonzero functions. For two independent experiments $q_1(y_1|x)$ and $q_2(y_2|x)$ with observations in $Y_1$ and $Y_2$, respectively, $q(y_1, y_2|x) = q_1(y_1|x)\,q_2(y_2|x)$ describes the joint experiment with observations in $Y_1 \times Y_2$. Thus, if $\bar{r}_1 = [q_1(y_1|\cdot)]$ is the information from the first experiment and $\bar{r}_2 = [q_2(y_2|\cdot)]$ from the second, then the information obtained from the two experiments, $\bar{r} = [q(y_1, y_2|\cdot)]$, equals $\bar{r} = \bar{r}_1 \circ \bar{r}_2$.

4.2 Main Properties of Fuzzy Information
Theorem 3. The information space $(\mathbb{X}, \circ, \bar{1})$ is a commutative monoid, namely,

$$u \circ v = v \circ u, \qquad (u \circ v) \circ w = u \circ (v \circ w), \qquad u \circ \bar{1} = u \qquad \forall\, u, v, w \in \mathbb{X}.$$

By virtue of the latter property, $\bar{1}$ represents the absence of information. Obviously, this element corresponds to the distribution $\pi(x) \equiv 1$. We say that pieces of information $u, v \in \mathbb{X}$ are incompatible (mutually contradictory) if $u \circ v = \bar{0}$.

Theorem 4. Let $\pi$ be some distribution on X and q a transition distribution from X to Y. Then $[\pi(\cdot)]$ and $[q(\omega|\cdot)]$ are compatible for any possible $\omega \in Y$ (w.r.t. the full distribution $q(y) = \sup_{x \in X} \pi(x)\,q(y|x)$), i.e., $[\pi(\cdot)]$ and $[q(\omega|\cdot)]$ are a.s. compatible.

Theorem 5. Let $q_1$ and $q_2$ be transition distributions from X to $Y_1$ and $Y_2$, respectively. Then $[q_1(\omega_1|\cdot)]$ and $[q_2(\omega_2|\cdot)]$ are a.s. compatible for any possible pair of observations $(\omega_1, \omega_2) \in Y_1 \times Y_2$ (w.r.t. the full independent joint distribution of $(\omega_1, \omega_2)$ on $Y_1 \times Y_2$, which has the form $q(y_1, y_2) = \sup_{x \in X} \pi(x)\,q_1(y_1|x)\,q_2(y_2|x)$).
4.3 Types of Fuzzy Information
By the support of information $\bar{r} \in \mathbb{X}$ we mean the support of its representatives $r \in \bar{r}$, i.e. $\mathrm{supp}\,\bar{r} = \mathrm{supp}\,r$. We say that $\bar{r}$ has complete support if $\mathrm{supp}\,\bar{r} = X$ and denote the set of such elements by $\mathbb{X}_c$. Information $\bar{r}$ with complete support does not in any way restrict the set of values that the unknown element $\xi$ can take. Obviously, the set $\mathbb{X}_c$ is closed with respect to composition and contains $\bar{1}$. Thus, $(\mathbb{X}_c, \circ, \bar{1})$ forms a commutative monoid, a submonoid of the information space $(\mathbb{X}, \circ, \bar{1})$. Moreover, all elements of $\mathbb{X}_c$ are compatible with each other. An element $u \in \mathbb{X}$ is invertible if there is an element $v \in \mathbb{X}$ for which $u \circ v = \bar{1}$. In this case, the information v destroys the information u. The set $\mathbb{X}_i$ of all invertible elements of $\mathbb{X}$ forms an Abelian group $(\mathbb{X}_i, \circ, \bar{1})$. An element w is called cancellable if $u \circ w = v \circ w$ implies $u = v$. Cancellability of the information w means that it can be "excluded" from the accumulated information. Obviously, $\mathbb{X}_i \subseteq \mathbb{X}_c \subseteq \mathbb{X}$.

Theorem 6. Information u is cancellable iff it has complete support, i.e., $u \in \mathbb{X}_c$.

Theorem 7. Information u is invertible iff $\inf_{x \in X} u(x) > 0$.

4.4 Quality of Information
We say that information u is no less accurate than v, and denote this by $u \succcurlyeq v$, if for some information w we have $u = v \circ w$, that is, the information v can be "refined" to u by adding w. If $u \succcurlyeq v$ and $v \succcurlyeq u$, we say that u and v have comparable accuracy. If $u \succcurlyeq v$ but not $v \succcurlyeq u$, we say that u is strictly more accurate than v and write $u \succ v$.

Theorem 8. $u \succcurlyeq v$ iff $u = O(v)$, that is, for $u \in \bar{u}$ and $v \in \bar{v}$ there is a constant $C > 0$ such that $u(x) \le C\,v(x)$ for any $x \in X$. Moreover, $\mathrm{supp}\,u \subseteq \mathrm{supp}\,v$.

Corollary. If $u(x) \le v(x)$ for any $x \in X$, i.e. $u \subseteq v$ as fuzzy sets, then $u \succcurlyeq v$.

Theorem 9. The relation $\succcurlyeq$ is a preorder on $\mathbb{X}$, compatible with composition: $u \succcurlyeq v$ implies $u \circ w \succcurlyeq v \circ w$.

Interest in gigacycle fatigue tests (N > 10^8) exists primarily because the life cycle of many machines includes long-term operation up to N > 10^8 cycles. Until recently there was no technical ability to carry out such tests in full, but in the 1980s this technical problem was solved: for the first time, gigacycle fatigue tests at ultrasonic frequencies of 20-30 kHz were performed [16]. Because the specimens warm up during these tests, researchers usually use forced cooling. Some specimens of materials for which an unlimited endurance limit had previously been postulated ultimately failed at N > 5·10^8 and beyond, after overcoming a certain "plateau" on the fatigue curve. However, we should note that scientists disagree on the correspondence between test results at regular and ultrasonic frequencies: the authors of [17] obtained different durabilities, with durability at ultrasonic frequencies turning out to be much greater. Figure 1 shows the results of fatigue tests of high-strength steel specimens normalized at 430 °C [18]. The authors of [18] carried out the experiments with loading asymmetry R = -1 and stopped testing after the number of cycles reached N = 10^8 (in Fig. 1, we have marked the points of the censored tests with arrows).
Fig. 1. Fatigue curves of specimens of three types of steel, obtained using tests at ultrasonic frequencies, from [18].
Upon closer examination of Fig. 1, one may notice that if the researchers had chosen the test base N_b = 10^6, they could have come to the incorrect conclusion that an unlimited endurance limit exists.
4 Experiment Let us define three types of fatigue curves: 1) the initial one, which is the result of tests under harmonic loading; 2) the secondary fatigue curve [13], which represents the test result under irregular loading (in the English-language literature it is sometimes called the Gassner curve [19]); 3) finally, the so-called modified fatigue curve introduced here (see Sect. 5 below), a calculated-experimental one. The shape of the modified fatigue curve coincides with the original one, but the constants of its equation are determined from the results of tests under irregular loading. Fatigue testing of smooth steel specimens (Fig. 2) was carried out on a servo-hydraulic testing machine (Fig. 3) both under regular loading and under block loading of two types.
Fig. 2. Steel specimen
The shapes of the loading blocks, with a constant loading frequency of 30 Hz, are shown in Fig. 4.
Fig. 3. Servo-hydraulic testing machine
Figure 4 shows an example that illustrates the idea of constructing the modified fatigue curve.
Fig. 4. Experiment on fatigue with loading by blocks of two types: 1 - initial fatigue curve; 2 - curve extrapolated below the endurance limit; 3 - 7-step test block; 4 - 5-step test block, shortened.
The steel specimens were tested under loading by 7-step (3) and 5-step, truncated, blocks (4). The shortened block was similar to the original one but without the two lowest loading steps, which were omitted (see Fig. 4). The experimental durabilities, expressed in the number of blocks before failure, differ significantly. If durability were assessed according to Miner's linear hypothesis using the original fatigue curve (1) shown in Fig. 4, then the calculated durability for the original and
shortened blocks would have coincided. This is because, under the linear hypothesis, the two omitted lower steps would not damage the material according to Miner's rule: the lowest amplitudes in the 7-step block lie below the unlimited endurance limit (1). Comparing the test results under loading by blocks (3) and (4), we can conclude that the behaviour of the modified fatigue curve is closer to reality: it is a curve continued below the endurance limit, namely line 2 in Fig. 4.
5 Method The modified fatigue curve does not always coincide with the curve extended below the fatigue limit; a more complex mathematical algorithm, described below, is used to construct it.

Modified Fatigue Curve Construction Method. Miner's hypothesis assumes that under irregular loading with variable amplitudes, the sum of the relative fatigue damage at the moment of failure is

$$D = \sum d_i = 1 \qquad (2)$$

Engineers using (2) to estimate durability often incur a systematic error: experiments [13, 14] and others show that

$$D = \sum d_i \ne 1 \qquad (3)$$

The relative damage in expressions (2) and (3) at the i-th stage of loading is defined as

$$d_i = n_i / N_i \qquad (4)$$

where $n_i$ is the number of cycles with amplitude $\sigma_{ai}$ until fracture. The limiting number of cycles $N_i$ is determined from the fatigue curve equation. Here and below, we consider Basquin's equation [13]:

$$\lg(N) = A - m \lg(\sigma_a) \qquad (5)$$

Modified Fatigue Curve (Definition). The modified fatigue curve is an equation similar to the original fatigue curve (5), but transformed in a particular way to minimise the error in predicting durability based on the results of the experiments performed. We assume that this equation accounts for the effect of overloading on the shape and parameters of the curve. We write its equation in the form:

$$\lg(N) = A^* - m^* \lg(\sigma_a) \qquad (6)$$
I. V. Gadolina and I. M. Petrova
Lemma Based on the fatigue test results under at least two levels of irregular loading with a similar amplitude distribution spectrum, it is possible to construct a modified fatigue curve, and the equation is unique. In Eq. (6) the constants A and m are defined from the numerical decision of the equation system for the error (7): W¼
i2 Xk h Xw m lg n r A ij i¼1 j¼1 ij
ð7Þ
Here: k is the number of trials under irregular loading; w is the steps number in the block.
6 Results and Discussion The results of the fatigue experiment under block loading, together with the 5% trust boundaries for the distribution functions, are presented in Table 1.

Table 1. Results of the fatigue experiment under block loading

 | Initial (7-step block) | 5-step test block, shortened
Experimental values (block numbers until failure) | 94; 110; 120 | 127; 128; 138; 140
Median | LIb = 110 | LSb = 133
5% trust boundaries | 92 … 106 | 121 … 139
We see that the difference between the test results of the initial and shortened blocks is prominent, which confirms the success of the experiment. To numerically determine the roots that minimize the residual W (i.e., to determine the parameters A* and m* of the modified fatigue curve), a detailed description of the spectrum and information about the operating time under irregular loading are needed. In each experiment it is desirable to test several samples at the same loading level to reduce the effect of scatter; we evaluated the results by the median value of the operating time to failure for a series of experiments. To find the unknowns A* and m* by minimizing the residual W in the system of Eqs. (7), we used the standard least-squares algorithm implemented in MATLAB 6.5. To find an approximation close to the minimum point of the error function, we used the adaptive random search method [20]. In Fig. 5 we show the dependence of the logarithm of the total error W on the parameters m and A, i.e. an example of an error surface for an experiment testing laboratory samples with the original and shortened blocks. The sought values m* and A* are the point at which W (the residual of the non-linear equation) is minimal, namely m* = 6.8 and A* = 21.5. These values will serve as the new constants in Basquin's equation, providing an unbiased longevity estimate by Miner's hypothesis in future estimations.
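For illustration, the minimization of the residual (7) can be sketched as below. For a fixed m* the least-squares-optimal A* has a closed form (the mean of the lg-sums), so only a one-dimensional scan over m* is needed; this scan stands in for the adaptive random search of [20]. The block spectra in the sketch are hypothetical placeholders, not the measured data of this experiment.

```cpp
// Sketch: fitting the modified-curve constants (m*, A*) from Eq. (7).
#include <cmath>
#include <vector>
#include <cstdio>

struct Block { std::vector<double> sigma, n; };  // amplitudes, cycle counts

int main() {
    std::vector<Block> trials = {                 // assumed spectra
        {{400, 360, 320}, {1.2e5, 3.4e5, 9.1e5}},
        {{400, 360},      {1.6e5, 4.4e5}}};

    double best_m = 0, best_A = 0, best_W = 1e300;
    for (double m = 3.0; m <= 12.0; m += 0.01) {
        std::vector<double> lgsum;
        double A = 0;
        for (const Block& b : trials) {
            double s = 0;
            for (size_t j = 0; j < b.sigma.size(); ++j)
                s += b.n[j] * std::pow(b.sigma[j], m);
            lgsum.push_back(std::log10(s));
            A += lgsum.back();
        }
        A /= trials.size();                       // optimal A* for this m*
        double W = 0;
        for (double v : lgsum) W += (v - A) * (v - A);
        if (W < best_W) { best_W = W; best_m = m; best_A = A; }
    }
    std::printf("m* = %.2f, A* = %.2f (W = %g)\n", best_m, best_A, best_W);
}
```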
Fig. 5. The error surface for the experiments with irregular loading
7 Conclusions Because refining calculation methods is essential in engineering practice, both at the design stage and when guaranteeing safe life prolongation, we developed the concept of employing digital twins for the approbation of new methods. The proposed method enables the exchange of information between the physical and digital twins. Our approach utilizes previously obtained empirical information on the laws of fatigue damage accumulation under irregular service loading through the newly proposed "modified fatigue curve" conception, and the corresponding lemma is formulated. When submitting and probing advanced design methods, digital twins are essential due to the new possibilities they bring; the authors have discussed these options.
References
1. Liu, J., et al.: A new method of reusing the manufacturing information for the slightly changed 3D CAD model. Eng. Comput. Sci. (2020). https://doi.org/10.1007/s10845-019-01512-w
2. Grieves, M.W.: Virtually intelligent product systems: digital and physical twins. Complex Syst. Eng. Theory Pract. 7, 175 (2018)
3. Sergeev, V.I. (ed.): Digital Technologies in Logistics and Management Supply Chains. Analytical Review. Publishing House of Higher School of Economics, Moscow (2020)
4. Fangming, Y.: Real-time construction of 3D welding torch in virtual space for welding training simulator. Int. J. Eng. Manuf. (IJEM) 9(5), 34–45 (2019). https://doi.org/10.5815/ijem.2019.05.03
5. Chang, R., He, Y.: "Design of wind power generator's vane which based on computer" course based on CDIO model. Int. J. Eng. Manuf. (IJEM) 1(1), 27–33 (2011)
6. Urbonavicius, A., Saeed, N.: IoT leak detection system for building hydronic pipes. Int. J. Eng. Manuf. (IJEM) 9(5), 1–21 (2019). https://doi.org/10.5815/ijem.2019.05.01
7. Mehmood, I., Anwar, S., Dilawar, A., Zulfiqar, I., Abbas, R.M.: Managing data diversity on the internet of medical things (IoMT). Int. J. Inf. Technol. Comput. Sci. (IJITCS) 12(6), 49–56 (2020). https://doi.org/10.5815/ijitcs.2020.06.05
8. Durodola, J.F.: Machine learning for design, phase transformation and mechanical properties of alloys. Progress Mater. Sci. 123, 100797 (2022)
9. Lee, J., Bagheri, B., Kao, H.-A.: A cyber-physical systems architecture for industry 4.0-based manufacturing systems. Manuf. Lett. 3, 18–23 (2020). www.sciencedirect.com
10. Kim, J., Kim, S.-A.: Noise barrier tunnels. Sustainability 12, 2940 (2020). https://doi.org/10.3390/su12072940
11. Jo, J.: Numerical method of fatigue life prediction for each fatigue phase of steel structures. Kor. J. Steel Struct. 7, 231–239 (1995)
12. Miner, M.A.: Cumulative damage in fatigue. J. Appl. Mech. 3, 159–164 (1944)
13. Kogaev, V.: Calculations for Strength for Variables in time Strains (1993). (in Russian)
14. Sonsino, C.M.: Effects on lifetime under spectrum loading. MP Mater. Test. 52(7–8), 440–451 (2010)
15. Bathias, C.: There is no infinitive fatigue life in metallic materials. Fatigue Fract. Eng. Mater. Struct. 22(7), 559–566 (1999)
16. Stanzl-Tschegg, S.: Fatigue crack growth and threshold measured at very high frequencies (20 kHz). Metal Sci. 14(4), 137–143 (1980)
17. Xu, W., et al.: An ultra-high frequency vibration-based fatigue test and its comparative study of a titanium alloy in the VHCF regime. Metals 10, 1415 (2020). https://doi.org/10.3390/met10111415
18. Kanazawa, K., Abe, T., Nishijima, S.: Fundamental fatigue properties of hard steels, NRIM fatigue datasheet technical document no. 9. National Research Institute for Metals, Tokyo (1989)
19. Sonsino, C.M., Heim, R., Melz, T.: Lightweight-structural durability design by consideration of variable amplitude loading. Int. J. Fatigue 92(Pt 2), 328–336 (2016). https://doi.org/10.1016/j.ijfatigue.2015.07.030
20. Makhutov, N.A., Petrova, I.M., Gadolina, I.V., Kulakov, A.V.: On peculiarities in construction of a modified fatigue curve. J. Mach. Manuf. Reliab. 39(4), 338–342 (2010)
Algorithms Optimization for Procedural Terrain Generation in Real Time Graphics
Aleksandr Mezhenin (ITMO University, Kronverksky Avenue 49, St. Petersburg, Russia) and Vera Izvozchikova (Orenburg State University, Avenue Pobedy 13, Orenburg, Russia)
Abstract. This article discusses the optimization (simplification) of procedurally generated landscapes for real-time graphics applications. The problem of keeping a locality recognizable, with sufficient detail and preservation of characteristic features, is addressed. The mathematical models of the proposed solution are presented. The main stages of the proposed algorithm are described: the formation of the initial high-poly landscape model based on a height map, and the selection from this high-poly model of the control points that most accurately convey the curvature and features of the landscape. At the simplification step, for the selection of control points, it is proposed to use the Ramer–Douglas–Peucker algorithm, adapted for the three-dimensional case on the basis of majority logic. At the stage of constructing the polygonal mesh (triangulation), the Delaunay method and the Hausdorff metric are applied. Objective and subjective approaches to assessing the quality of the optimization results are considered: the objective one is based on measuring the geometric similarity of polygonal models by calculating the Hausdorff distance; the subjective one is based on an adapted Double Stimulus Impairment Scale (DSIS) quality testing method. A prototype of the polygonal mesh optimizer was developed and tested in the Unity3D environment, and the results obtained are presented. The considered approaches can be used in computer games, in various simulators, navigation systems and training programs, as well as in the film industry. This work continues a series of publications by the authors on methods and models for constructing computer graphics tools. Keywords: Procedural landscape generation · Polygonal mesh optimization · Ramer–Douglas–Peucker algorithm · Hausdorff metric · Level of detail · Delaunay triangulation · Visual quality assessment · Double Stimulus Impairment Scale · Unity3D
1 Introduction Procedural landscape generation (PGL) is the automatic creation of a three-dimensional landscape model without human intervention or with minimal human contribution. This approach is often used in computer games, in various simulators and training programs, as well as in the film industry. One of the most common methods of procedural landscape creation is the generation of a height map using various stochastic or fractal algorithms [1, 2]. The height map in this case is a black-and-white image in which each pixel contains information about the height of the future point of the
landscape polygonal mesh. There are various programs that allow creating such maps, for example World Machine [3] or L3DT (Large 3D Terrain Generator) [4]. Procedural landscape generation is used in many industries: in the gaming industry, in cinema, in training programs, simulators, etc. In some cases the main task of PGL is to achieve the maximum realism of the landscapes; in others, to create a virtual environment quickly and variably [5, 21]. When the terrain is generated for real-time rendering, as in games or, for example, navigation applications, the 3D model needs to be optimized, since otherwise a significant delay between rendered frames may be encountered. In computer games such a delay can make playing uncomfortable or, in the worst cases, impossible. In navigation applications, however, render delays can cause extremely negative consequences: there the 3D models contain information about real terrain, and correct navigation requires a timely and reliable display of the terrain data received from the satellite. At the landscape generation stage, optimization is possible only at the level of the polygonal mesh, that is, by reducing the number of polygons. At the same time, it is highly desirable to keep the landscape recognizable, not to distort its silhouette, and, in special cases, to preserve distinctive details. All of the above indicates a problem of optimizing the landscape model while maintaining sufficient detail. The approach described above yields a landscape represented by a regular polygonal mesh: the higher the resolution of the height map, the more polygons in this mesh and the more detailed the landscape. If the generated model is to be drawn in real time, for example in a computer game or on a navigation monitor, the polygonal mesh must be optimized so as to best preserve the landscape details while reducing the number of polygons. Description of the Unity3D System. Unity3D, a cross-platform environment for developing games and interactive applications, was chosen as the platform for testing the proposed approaches [5]. The platform has many built-in functions and libraries that accelerate product development, and includes a ready-made real-time rendering system for 2D and 3D graphics, programmed physics (interaction of objects with the environment or with each other), collision detection and much more. This work is a continuation of a series of publications by the authors in the field of methods and models for constructing computer graphics tools [8–12].
2 Existing Optimization Methods In modern computer graphics systems, especially those operating in real time, several types of algorithms are used to reduce computational costs. One of these algorithmic approaches is to display an object at different levels of detail depending on its visual resolution. When creating each subsequent level of simplification, small model details that do not actually affect the image structure are combined and replaced with larger ones. This technology is known as Level of Detail (LOD).
Let us consider a simple version of the LOD algorithm, which reduces the number of points with a certain step evenly over the entire plane. There are many more complex LOD methods [6] that partially change the polygonal mesh depending on the detail of the terrain fragment or the distance between the fragment and the camera. In this work, we consider only a fragment of a complexly modified mesh in order to understand how the LOD method works with different types of landscapes. Obviously, if the level is too low, the terrain model can lose much detail, but if the detail is increased, there is a risk of exceeding the limit on the number of polygons in the model. The quality of the LOD result also strongly depends on the "roughness" of the landscape, that is, on how detailed it is. Estimation of Optimization Algorithms Based on the Hausdorff Metric. With regard to this study, to estimate the similarity measure of a geometric form, it is proposed to use the Hausdorff metric [8]. For each optimization algorithm, its value will be calculated as the measure of deviation between the initial model and its optimized counterpart. We assume that the smaller the Hausdorff distance, the better the detail of the high-poly model is preserved; a brute-force computation of this distance is sketched below. The process of measuring the Hausdorff distance is discussed in detail in Sect. 5. Consider two different landscapes that differ from each other in the degree of smoothness (Fig. 1). For each landscape, we perform a series of actions: optimize the landscape using the LOD method, examining and saving all levels of detail possible for the given polygonal mesh; then, for each optimized mesh, calculate the Hausdorff distance relative to the original high-poly model [8, 9]. The results of the measurements of the dependence of the Hausdorff distance on the level of detail for the Flattened terrain and Detailed landscape models are presented in Table 1.
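For reference, a brute-force version of the symmetric Hausdorff distance between two meshes sampled as point clouds might look as follows. The O(n·m) nearest-point search is our simplification for clarity; practical tools accelerate these queries with spatial indexing.

```cpp
// Sketch: symmetric Hausdorff distance between two sampled point clouds.
#include <cmath>
#include <vector>
#include <algorithm>

struct P3 { double x, y, z; };

static double d2(const P3& a, const P3& b) {
    double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// sup over A of the distance to the nearest point of B
static double oneSided(const std::vector<P3>& A, const std::vector<P3>& B) {
    double worst = 0.0;
    for (const P3& a : A) {
        double best = 1e300;
        for (const P3& b : B) best = std::min(best, d2(a, b));
        worst = std::max(worst, best);
    }
    return std::sqrt(worst);
}

double hausdorff(const std::vector<P3>& A, const std::vector<P3>& B) {
    return std::max(oneSided(A, B), oneSided(B, A));  // symmetric form
}
```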
Fig. 1. Detailed landscape and Flattened terrain
Thus, in the study of the Terrain object in Unity and the LOD method by which it is optimized by default, the following problems were identified: on flat, poorly detailed areas the polygon density is too high; after the LOD method runs, fragments remain that could be optimized better; the decisions on selecting the areas where the polygonal mesh is densified are not always clear; the areas closest to the camera are simplified while the resolution of distant ones is increased, although the exact opposite is expected; and the only parameter controlling how the LOD method works in Unity is Pixel Error, yet manipulating its value does not solve the listed problems, since it only affects the maximum allowable mesh density.
Table 1. Hausdorff distance vs. level of detail for the two test landscapes

Flattened terrain (LOD / Hausdorff) | Detailed landscape (LOD / Hausdorff)
1 / 0.231377 | 1 / 1.459385
2 / 0.818949 | 2 / 4.363617
3 / 0.818949 | 3 / 4.850229
4 / 3.195169 | 4 / 6.310234
5 / 4.560953 | 5 / 6.2938
6 / 5.476715 | 6 / 6.473554
A histogram of measurements of the dependence of the Hausdorff distance on the level of detail for the Flattened terrain and Detailed Landscape models is shown in Fig. 2.
Fig. 2. Graphs of the dependence of the Hausdorff distance on the level of detail
Most iterative algorithms for simplifying polygonal models can be divided into three categories: vertex decimation, edge collapse, and face decimation [10]. Since both in nature and in various virtual environments the landscape is more often "rough" than smoothed, an approach to optimizing 3D landscape models better than LOD is needed.
3 Proposed Approach The terrain created in Unity using the Terrain object is a 3D model with a regular triangulated polygon mesh. The built-in LOD method works automatically and, as noted, has a number of drawbacks, the main one of which is its
controversial adaptability to landscape detail. In this regard, the idea arose to represent the terrain model not as a regular grid but as an irregular one, a Triangulated Irregular Network (TIN), built from filtered, unevenly spaced points of the original model given by the height map. The filtering of points should be carried out so that the most important points, characterizing the distinctive details of the given landscape, remain at the output. It is important to note that, within the framework of this study, it was decided not to implement the approach as a partial dynamic modification of the polygonal mesh, as actually happens for the Terrain object. Instead, it is proposed to build and optimize the entire landscape model, assuming that these are fragments of some larger-scale model which is partially optimized depending on the camera distance. Mathematical Apparatus of the Proposed Solution. The following pipeline is proposed: the initial high-poly landscape model is presented in the form of a height map; points that most accurately convey the curvature and features of the landscape (control points) are selected from the original high-poly model; an irregular triangulated mesh is constructed from the obtained points. At the step of selecting control points, it is proposed to use the Ramer–Douglas–Peucker (RDP) algorithm adapted for the three-dimensional case, and to apply the Delaunay method at the stage of constructing the polygonal mesh. Let us take a closer look at each of the algorithms.
Fig. 3. An example of discarding points by the Ramer–Douglas–Peucker algorithm. Top: baseline; bottom: simplified result
The RDP algorithm reduces the number of points on a curve [13, 14]. Figure 3 shows an example of 2D simplification by this algorithm. The algorithm takes an array of points (i.e., the line to be simplified) and a threshold value epsilon (ε). It begins by connecting the first and last points of the original line, forming a “baseline”. The algorithm then finds the point farthest from this line. If its distance to the baseline is greater than ε, the point is kept, and the algorithm recursively splits the array of points into two segments and repeats the procedure. If the farthest point is closer to the baseline than ε, then all points between the endpoints can be discarded, since their distances to the baseline are also less than ε.
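A minimal sketch of this recursion in Python follows; the function name and the explicit perpendicular point-to-baseline distance are our own choices, not code from the paper.

```python
import numpy as np

def rdp(points, eps):
    """Ramer-Douglas-Peucker simplification of a polyline given as an (N, 2) array."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return pts
    a, b = pts[0], pts[-1]                      # ends of the baseline
    ab = b - a
    norm = np.linalg.norm(ab)
    if norm == 0.0:                             # degenerate baseline (closed loop)
        d = np.linalg.norm(pts - a, axis=1)
    else:                                       # perpendicular distance to the baseline
        d = np.abs(ab[0] * (pts[:, 1] - a[1]) - ab[1] * (pts[:, 0] - a[0])) / norm
    i = int(np.argmax(d))
    if d[i] > eps:                              # farthest point is significant: recurse
        left = rdp(pts[: i + 1], eps)
        right = rdp(pts[i:], eps)
        return np.vstack([left[:-1], right])    # drop the duplicated split point
    return np.array([a, b])                     # everything within eps: keep only ends

# usage: a noisy sine sampled at 200 points collapses to a handful of key points
x = np.linspace(0, 2 * np.pi, 200)
line = np.column_stack([x, np.sin(x)])
print(len(rdp(line, eps=0.05)))
```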
The described RDP algorithm is applicable to the two-dimensional case, when the input points form a curved line. On the basis of this algorithm, solutions have been developed for the three-dimensional case [15, 16]. These works derive a 3D Douglas–Peucker algorithm from an analysis of the nature of the 2D algorithm and study its application to the automated global generalization of digital elevation models (DEM). The algorithm proposed by those authors is based on determining the origin of the vector set, and a dynamic point-to-plane distance weighting factor is used to improve the selection of the key points. In our opinion, this approach gives good results but is rather difficult to implement.

Adaptation of the Ramer–Douglas–Peucker Algorithm for the Three-Dimensional Case Based on Majority Logic. In the present work, it is proposed to adapt the RDP algorithm to the three-dimensional case by using majority logic to select the reference points (by analogy with a plurality voting system, in which the candidate with the most votes, not necessarily an absolute majority, wins). The input to the algorithm is a two-dimensional array of points and an epsilon value. The following steps are then performed: two status flags are attached to each point of the array, one recording whether the point has passed the row check, the other the column check; the rows of the array of points are processed sequentially, and if a point is discarded its row status is set to FALSE; the columns of the array of points are processed sequentially, and if a point is discarded its column status is set to FALSE; only points with both statuses TRUE are included in the optimized polygonal mesh, and if at least one status is FALSE the point is recognized as not carrying important information.

Delaunay triangulation is an algorithm for constructing an irregular triangulated mesh of adjacent non-overlapping triangles from a given set of points. Its main property is that, for any triangle of the triangulation, all points from the given array, except the points that are its vertices, lie outside the circle circumscribed around that triangle. This condition excludes the appearance of overly long and irregular triangles in the polygonal mesh, which has a positive effect on the processing speed of the 3D model. The use of Delaunay triangulation and an irregular triangulated mesh has a significant advantage for the problems posed here: since the nodes of the triangulation can be unevenly located on the surface, a TIN can have higher resolution in areas with a highly variable surface, that is, where there is more detail, and lower resolution in areas with low detail. The resulting algorithm, built on the adapted Ramer–Douglas–Peucker algorithm, Delaunay triangulation and the Hausdorff metric, will be called RDP/D.
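The following sketch illustrates one possible reading of this pipeline under the stated both-statuses-TRUE rule: RDP keep-masks over the rows and columns of the height map, followed by Delaunay triangulation of the surviving control points using SciPy. All names are ours, and for brevity the point-to-chord distance is measured vertically rather than perpendicularly, which is a simplification of the original algorithm.

```python
import numpy as np
from scipy.spatial import Delaunay

def rdp_keep_mask(z, eps):
    """Boolean keep-mask of a 1-D height profile under RDP; the distance of a
    sample to the chord between the segment ends is measured vertically."""
    n = len(z)
    keep = np.zeros(n, dtype=bool)
    keep[0] = keep[n - 1] = True
    stack = [(0, n - 1)]
    while stack:
        a, b = stack.pop()
        if b - a < 2:
            continue
        t = np.arange(a + 1, b)
        chord = z[a] + (z[b] - z[a]) * (t - a) / (b - a)
        d = np.abs(z[a + 1:b] - chord)
        i = int(np.argmax(d))
        if d[i] > eps:
            m = a + 1 + i
            keep[m] = True
            stack += [(a, m), (m, b)]
    return keep

def heightmap_to_tin(H, eps):
    """Row/column RDP with the both-statuses-TRUE rule, then Delaunay triangulation."""
    H = np.asarray(H, dtype=float)
    row_ok = np.array([rdp_keep_mask(r, eps) for r in H])        # row check
    col_ok = np.array([rdp_keep_mask(c, eps) for c in H.T]).T    # column check
    keep = row_ok & col_ok                                       # both TRUE -> control point
    keep[[0, 0, -1, -1], [0, -1, 0, -1]] = True                  # always keep corners
    ii, jj = np.nonzero(keep)
    tri = Delaunay(np.column_stack([ii, jj]))                    # TIN over the (x, y) plane
    vertices = np.column_stack([ii, jj, H[ii, jj]])
    return vertices, tri.simplices
```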
4 Optimization Quality Assessment

Objective Metrics and Subjective Visual Quality Assessment. An important aspect in this area is the assessment of the results obtained. Quality assessment of 3D models is a fundamental issue in computer graphics [17–20]. Quality assessment metrics may allow a wide range of processes to be guided and evaluated, such as level of detail
creation, compression, filtering, and so on. Some quality assessment metrics exist for geometric surfaces and can be used for quantitative assessments. Most computer graphics assets are composed of geometric surfaces on which several texture images can be mapped to make the rendering more realistic; however, there is not enough research on evaluating texture-mapped 3D models. In this context, a subjective study assessing the perceived quality of textured meshes with a pairwise comparison protocol is of interest [17].

Analysis of the Similarity of a Geometric Shape. The criteria used in the simplification process are highly differentiated and do not give a total value of the simplification error. In fact, many algorithms do not return measures of the approximation error introduced by simplifying the polygon mesh [8]. In most cases, the Root Mean Square Error (RMSE) is used to estimate the accuracy of 3D model reconstruction or to solve simplification problems:

RMSE = \sqrt{ \frac{1}{N_{Rows} N_{Cols}} \sum_{i=1}^{N_{Rows}} \sum_{j=1}^{N_{Cols}} \left( f_{i,j} - d_{i,j} \right)^{2} }   (1)
This approach, according to the authors, does not allow obtaining reliable results. One possible solution to this problem is to use the Hausdorff metric, which gives a quantitative estimate of the similarity of polygonal objects [8, 22]. For simplicity, consider discrete 3D models represented by triangular meshes, since this is the most general way of representing such data. The triangular mesh M is a representation of the set of points P in R^3 (vertices) and the set of triangles T that describe the connectivity between the vertices of P. Let S and S′ denote two continuous surfaces, and

d(p, S′) = \min_{p′ \in S′} \lVert p - p′ \rVert_{2},   (2)

where \lVert \cdot \rVert_{2} is the Euclidean norm. The Hausdorff distance between S and S′ is then

d(S, S′) = \max_{p \in S} d(p, S′).   (3)
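A point-sampled approximation of these distances can be sketched with SciPy, whose directed_hausdorff function computes the one-sided distances of Eqs. (2)–(3) between two point sets (MeshLab, used later in the paper, additionally samples the faces of the meshes; the arrays below are synthetic stand-ins):

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

rng = np.random.default_rng(0)
S = rng.random((1000, 3))                        # points sampled on surface S
S2 = S + 0.01 * rng.standard_normal((1000, 3))   # points sampled on surface S'

d_direct = directed_hausdorff(S, S2)[0]    # d(S, S'): the direct distance
d_inverse = directed_hausdorff(S2, S)[0]   # d(S', S): the inverse distance
d_symmetric = max(d_direct, d_inverse)     # symmetric Hausdorff distance
```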
It is important to note that the metric is not symmetric, i.e. d(S, S′) ≠ d(S′, S). We will call d(S, S′) the direct distance and d(S′, S) the inverse distance.

Subjective Quality Tests. For a qualitative assessment of the developed RDP/D algorithm, it was proposed to conduct a survey using the Double Stimulus Impairment Scale (DSIS) method [18]. The DSIS visual quality assessment methodology was designed and is used primarily to assess video quality after applying various compression algorithms; in addition, there are adaptations of this technique for assessing the quality of 3D graphics [19, 20]. In a subjective quality assessment test, a set of video sequences is presented in a predetermined order to a group of subjects who are asked to rate their visual quality on a specific scale. The test is conducted according to precise methods and in a controlled
test environment to obtain reliable and repeatable results while avoiding involuntary exposure to external factors. According to the DSIS method, the respondents are sequentially presented with pairs of sequences, that is, stimuli A and B, and are asked to evaluate the quality of the second stimulus. When processing the results, “outliers” are removed, i.e., estimates that differ greatly from the other estimates. Thereafter, statistics are calculated to describe the distribution of scores by subject for each of the test conditions. For the DSIS methodology, the mean opinion score (MOS) is calculated as

MOS_{j} = \frac{1}{N} \sum_{i=1}^{N} s_{ij},   (4)

where N is the number of subjects that are not outliers, and s_{ij} is the assessment of test condition j by subject i.
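A small sketch of this computation follows. The paper does not specify the outlier-screening rule, so the z-score screen used here is only an illustrative assumption (formal DSIS procedures such as ITU-R BT.500 prescribe their own screening):

```python
import numpy as np

def mos(scores, z_thresh=1.5):
    """MOS_j of Eq. (4) after a simple z-score outlier screen over subjects.
    scores: (subjects x conditions) array of DSIS ratings."""
    s = np.asarray(scores, dtype=float)
    z = np.abs(s - s.mean(axis=0)) / (s.std(axis=0) + 1e-12)
    inliers = (z > z_thresh).mean(axis=1) < 0.5   # subject kept if most ratings agree
    return s[inliers].mean(axis=0)

ratings = np.array([[5, 4, 4], [4, 4, 5], [5, 5, 4], [1, 1, 1]])  # last subject deviates
print(mos(ratings))   # -> [4.67 4.33 4.33] once the deviating subject is screened out
```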
5 Experimental Results

During testing, the following independent variables were set: the size of the landscape in points; the algorithm for generating height maps (Perlin Noise or Diamond-Square); the parameter ε of the Ramer–Douglas–Peucker algorithm, on which the number of discarded points depends; and the lod parameter of the Level of Detail optimization algorithm. The optimizer prototype is implemented in Unity3d, in C# (Fig. 4).
Fig. 4. Interface of the optimizer prototype
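Of the two height-map generators named among the independent variables, Diamond-Square is compact enough to sketch here. The prototype itself is written in C# inside Unity; the following stand-alone Python version is our own illustration of the algorithm:

```python
import numpy as np

def diamond_square(n, roughness=0.6, rng=None):
    """Generate a (2**n + 1) x (2**n + 1) height map by the Diamond-Square method."""
    rng = np.random.default_rng() if rng is None else rng
    size = 2 ** n + 1
    h = np.zeros((size, size))
    h[0, 0], h[0, -1], h[-1, 0], h[-1, -1] = rng.uniform(-1, 1, 4)  # seed corners
    step, scale = size - 1, 1.0
    while step > 1:
        half = step // 2
        # diamond step: center of each square = mean of its corners + noise
        for i in range(half, size, step):
            for j in range(half, size, step):
                avg = (h[i - half, j - half] + h[i - half, j + half] +
                       h[i + half, j - half] + h[i + half, j + half]) / 4
                h[i, j] = avg + rng.uniform(-scale, scale)
        # square step: midpoint of each edge = mean of its diamond neighbours + noise
        for i in range(0, size, half):
            for j in range((i + half) % step, size, step):
                s, c = 0.0, 0
                for di, dj in ((-half, 0), (half, 0), (0, -half), (0, half)):
                    ii, jj = i + di, j + dj
                    if 0 <= ii < size and 0 <= jj < size:
                        s += h[ii, jj]
                        c += 1
                h[i, j] = s / c + rng.uniform(-scale, scale)
        step, scale = half, scale * roughness   # halve the step, damp the noise
    return h

terrain = diamond_square(7)   # a 129 x 129 height map
```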
The results of the optimizer are presented in Fig. 5. As an example, it can be seen that the density of points is higher precisely in places of detailed landscape - that is, on hills and depressions. As part of the pilot data collection, 10 sets of randomly shaped and sized landscapes were generated. Each set includes a reference model and two optimized counterparts. The polygon count of each original terrain was reduced by
about 90%. Examples of the landscapes involved in the pilot experiment are shown in Fig. 5.

Estimation of Optimization Algorithms Based on the Hausdorff Metric. To compare the models on the basis of the Hausdorff metric, the MeshLab software product was used [1, 2]. MeshLab is an open-source system for processing and editing 3D triangulated meshes; it provides a set of tools for editing, cleaning, checking, rendering, texturing and transforming polygonal meshes, as well as a number of functions for comparing models, including ones based on the Hausdorff metric. In the process of
Fig. 5. Examples of landscapes compared in the experiment. Left to right: original high-poly model, LOD optimization, RDP/D optimization
measurements, the user selects which model is the original and which is the optimized one, and indicates on the basis of which components the comparison is to be made. The data sampling function can be used here to improve the accuracy of the calculations (Fig. 5). For this study, the Hausdorff metric is computed for each optimization algorithm between the original model and its optimized counterpart. If in one case the value of the Hausdorff distance is smaller than in the other, this indicates that the first option preserves the detail of the high-poly model better than the second. The measurement results are presented in Table 2. All the resulting models were analyzed in the MeshLab package. Figure 6 shows a histogram of the Hausdorff metric values obtained for each optimization method.

Double Stimulus Impairment Scale Poll. According to the DSIS method described above, the respondents were presented with renders of landscape scenes before and after optimization by the LOD and RDP/D methods. The survey presents five three-dimensional landscape models of various shapes and details (Fig. 7). To obtain an
Table 2. Hausdorff metrics obtained as a result of the main experiment, together with their differences and ranks

№ | Hausdorff LOD | Hausdorff RDP/D | Absolute difference | Difference rank
1 | 0.231 | 0.821 | 0.590 | 3
2 | 0.475 | 1.663 | 1.189 | 7
3 | 0.273 | 0.804 | 0.532 | 2
4 | 1.674 | 3.161 | 1.487 | 9
5 | 2.679 | 3.443 | 0.764 | 6
6 | 0.134 | 0.395 | 0.261 | 1
7 | 0.494 | 1.211 | 0.717 | 5
8 | 2.499 | 4.484 | 1.985 | 12
9 | 0.902 | 2.304 | 1.402 | 8
10 | 2.380 | 3.899 | 1.520 | 10
11 | 2.063 | 3.709 | 1.646 | 11
12 | 0.310 | 0.951 | 0.641 | 4
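The absolute-difference and rank columns of Table 2 are exactly the quantities used by a Wilcoxon signed-rank analysis of paired measurements; the paper does not name such a test, so the call below is our assumption. SciPy reproduces the rank column from the table's values:

```python
import numpy as np
from scipy.stats import rankdata, wilcoxon

hausdorff_lod = np.array([0.231, 0.475, 0.273, 1.674, 2.679, 0.134,
                          0.494, 2.499, 0.902, 2.380, 2.063, 0.310])
hausdorff_rdp = np.array([0.821, 1.663, 0.804, 3.161, 3.443, 0.395,
                          1.211, 4.484, 2.304, 3.899, 3.709, 0.951])

diff = np.abs(hausdorff_rdp - hausdorff_lod)
print(rankdata(diff).astype(int))   # [3 7 2 9 6 1 5 12 8 10 11 4], as in Table 2
print(wilcoxon(hausdorff_lod, hausdorff_rdp))   # paired signed-rank test
```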
Fig. 6. Values of the Hausdorff metric for each optimization method
Fig. 7. Types of landscape scenes presented for assessment
Fig. 8. Distribution histograms of the estimates for each optimization method
estimate for both optimization methods, 10 questions were formulated. Methods in pairs “original-optimization” were alternated, that is, a situation in which questions with the same optimization method were asked several times in a row was avoided. Distribution histograms were plotted for all estimates for each of the optimization methods (Fig. 8). A survey of respondents based on the DSIS method showed that the assessment of the visual quality of the results of the developed RDP/D method is significantly higher than that of the LOD method.
6 Conclusion

The paper discusses approaches to the development and optimization of three-dimensional landscape models for Real Time 3D applications. The relevance of creating and optimizing landscapes for games, GIS and simulators is shown. We also reviewed the Unity3D interactive application development system and explored the built-in method for creating terrain models. While working with the Terrain object in Unity3D, it was discovered that the LOD optimization method applied to the generated terrain model by default has a number of controversial features. In this regard, a possible alternative for optimizing landscape models was proposed: a combination of the RDP/D algorithms. Based on the proposed idea, a prototype of a procedural optimizer was developed in Unity3D, which, using the mentioned algorithms, builds a model from a height map and then optimizes it. Finally, the process of approbation and evaluation of the proposed optimization method is presented. As part of the study, the RDP/D method was compared with the LOD method built into Unity3D. To date, a quantitative assessment of the optimization results by the RDP/D method based on the Hausdorff metric has shown a result comparable with the LOD method, without a clear superiority. However, the polls showed that the visual quality of the developed method is
convincingly higher than that of the LOD method. The results obtained can be regarded as positive. It is planned to continue work in this area.
References

1. Valencia-Rosado, L.O., Starostenko, O.: Methods for procedural terrain generation: a review. In: Carrasco-Ochoa, J.A., Martínez-Trinidad, J.F., Olvera-López, J.A., Salas, J. (eds.) MCPR 2019. LNCS, vol. 11524, pp. 58–67. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-21077-9_6
2. Rose, T.J., Bakaoukas, A.G.: Algorithms and Approaches for Procedural Terrain Generation - A Brief Review of Current Techniques (2016). https://ieeexplore.ieee.org/document/7590336
3. World Machine: The Premier 3D Terrain Generator. http://www.bundysoft.com/L3DT/
4. Large 3D Terrain Generator. http://www.bundysoft.com/L3DT
5. Unity Manual. https://docs.unity3d.com/Manual/index.html
6. Terrain Level of Detail. https://graphics.pixar.com/library/LOD2002/4-terrain.pdf
7. Hausdorff Distance. https://en.wikipedia.org/wiki/Hausdorff_distance
8. Mezhenin, A., Zhigalova, A.: Similarity analysis using Hausdorff metrics (2019). http://ceur-ws.org/Vol-2344/short5.pdf
9. Mezhenin, A., Polyakov, V., Prishhepa, A., Izvozchikova, V., Zykov, A.: Using virtual scenes for comparison of photogrammetry software. In: Zhengbing, H., Petoukhov, S., He, M. (eds.) Advances in Intelligent Systems, Computer Science and Digital Economics II, pp. 57–65. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-80478-7_7
10. Polyakov, V., Mezhenin, A.: Procedural generation of virtual space. In: Kovalev, S., Tarassov, V., Snasel, V., Sukhanov, A. (eds.) Proceedings of the Fourth International Scientific Conference “Intelligent Information Technologies for Industry” (IITI’19), pp. 623–632. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-50097-9_64
11. Mezhenin, A., Shevchenko, A.: Optimization of procedurally generated landscapes. In: CEUR Workshop Proceedings, vol. 2744 (2020)
12. Mezhenin, A., Izvozchikova, V., Grigoreva, A., Shardakov, V.: Point cloud registration hybrid method. In: CEUR Workshop Proceedings, vol. 2893 (2020)
13. Ramer, U.: An iterative procedure for the polygonal approximation of plane curves. Comput. Graph. Image Process. 1(3), 244–256 (1972)
14. Douglas, D.H., Peucker, T.K.: Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. Can. Cartographer 10(2), 112–122 (1973)
15. Fei, L., He, J., Ma, C., Yan, H.: Three dimensional Douglas-Peucker algorithm and the study of its application to automated generalization of DEM. Acta Geodaetica et Cartographica Sinica 35(3), 278–284 (2006)
16. Fei, L., He, J.: A three-dimensional Douglas-Peucker algorithm and its application to automated generalization of DEMs. Int. J. Geogr. Inf. Sci. 23(6), 703–718 (2009). https://doi.org/10.1080/13658810701703001
17. Guo, J., Vidal, V., Cheng, I., Basu, A., Baskurt, A., Lavoue, G.: Subjective and objective visual quality assessment of textured 3D meshes. ACM Trans. Appl. Percept. 14(2), 1–20 (2017). https://doi.org/10.1145/2996296
18. De Simone, F., Goldmann, L., Lee, J.-S., Ebrahimi, T., Baroncini, V.: Subjective Evaluation of Next-Generation Video Compression Algorithms: A Case Study (2014)
19. Vanhoey, K., Sauvage, B., Kraemer, P., Lavoué, G.: Visual quality assessment of 3D models: on the influence of light-material interaction. ACM Trans. Appl. Percept. 15(1), 1–18 (2018). https://doi.org/10.1145/3129505
20. Nehmé, Y., Farrugia, J.-P., Dupont, F., Le Callet, P., Lavoué, G.: Comparison of subjective methods, with and without explicit reference, for quality assessment of 3D graphics (2019)
21. Wang, S., Li, W.: The implementation of 3D scene walkthrough in air pollution visualization. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 2(1), 44–50 (2010)
22. Bogoya, J.M., Vargas, A., Cuate, O., Schütze, O.: A (p, q)-averaged Hausdorff distance for arbitrary measurable sets. Math. Comput. Appl. 23, 51 (2018). https://doi.org/10.3390/mca23030051
Business Information Systems for Innovative Projects Evaluation at the Conceptual Design Stage

Dmitry Rakov

Mechanical Engineering Research Institute of the Russian Academy of Sciences (IMASH RAN), Moscow 101000, Russia
Abstract. The article deals with the creation of an information environment to support business decisions. The field and objects under consideration are limited to the economic aspects of creating innovative engineering solutions. The main drawback of modern approaches is the problem of making business decisions with an incomplete information field and under conditions of uncertainty. A special place in the life cycle is occupied by the stage of conceptual decision making with subsequent evaluation. The article considers the application of the morphological approach to the analysis, synthesis and forecasting of innovative solutions. The morphological representation makes it possible to clearly define and structure information, and to generate and evaluate a morphological set of possible options. One of the main tasks of the morphological decision space is to take economic and business elements into account. The creation of a classification morphological matrix will improve the accuracy of the identification of these systems. The novelty of the method lies in the use of the morphological approach and cluster analysis to evaluate and select innovative solutions. The proposed approach eliminates the drawbacks of classical morphological methods. The advantages of the approach include the relative simplicity of evaluating the generated variants and choices; thus, the curse of dimensionality of combinatorial problems can be overcome. As an example, the problem of selecting rational technology options for gas turbine engines is considered. The power of the morphological set is 559,872 variants.

Keywords: Production engineering · Business information systems · Digital business models · Conceptual design · Morphological matrix
1 Introduction

The need to create innovative solutions in business causes developers’ interest in various methods to support innovation processes, namely CAI (Computer Aided Innovation) [1, 2]. CAI can extend the capabilities of traditional CAD/CAM systems [3]. At present, the creation of competitive products based on innovative engineering solutions (ES) is particularly relevant. Relatively old ESs quickly become obsolete. Therefore, one of the main required characteristics is the adaptability of synthesized ES and the ability to respond dynamically to changing environmental conditions. This is primarily due to the influence of market conditions and the economic situation. In this regard, the
task of generating solutions that embody the latest scientific and technological achievements and discoveries, and that have high technical and economic indicators, acquires particular importance. This predetermines the need for deep system analysis. As a result of the analysis and synthesis carried out, an economically sound solution must be selected from a variety of alternatives. This problem should be solved with the help of software in the first stages of creating innovative systems. The greatest information uncertainty exists in the conceptualization and acceptance phase; thereafter, the uncertainty decreases as the project develops. In contrast, the cost of project execution at this phase is minimal (Fig. 1) [4]. However, the impact of the business and engineering decisions made is at its maximum at the conceptual stage [5]. The more ES options are analysed, the higher the quality of the study and the confidence in achieving the requirements and goals of the project. For this reason, the selection and consideration of alternatives is the main task of the design process [4, 6]. As can be seen, the importance of the conceptual decisions made is very high at the initial stages of the project and drops as the project progresses, while the cost of changing decisions becomes disproportionately high, which can lead to a significant increase in the cost of the project or to its closure. This means that already at the initial stages it is necessary to calculate in detail the economic and investment components of the decisions to be made. A mistake will be difficult or impossible to correct later.
Fig. 1. Changes in the information component and costs as the project progresses
The above is confirmed by statistics from Airbus (Fig. 2). At the design stage, which accounts for about 5% of project cost, approximately 70% of the decisions determining the quality of the project are made. In the project search phase, it is possible to use CAIs, which are tools designed to assist decision makers. CAIs can be used to select solutions for some unstructured and weakly structured problems, including multicriteria problems.
Fig. 2. Dependence of the cost of the stages and the importance of decisions made on the various stages of a project. (Source - Deutsche Airbus, Design to Cost, Hamburg)
The search for solutions usually belongs to the class of unstructured or weakly structured problems. In such problems, decisions are made under conditions of incomplete and inaccurate information due to:

• the impossibility, in principle, of obtaining exact information about the developed object,
• complicated conditions of functioning,
• some unreliability and insufficiency of the source information,
• the possibility of properties manifesting themselves whose existence was not assumed.

Uncertainty in these tasks is systematic because of the complexity of creating new solutions, limited time to make decisions, subjective factors, etc. Solving the structural synthesis problem will, from a practical point of view, help reduce uncertainty in the study of solutions in the early stages of design. The methodology will allow a more thorough and comprehensive study of new solutions,
which can limit the number of changes at the manufacturing stage (Fig. 3). This will reduce costs, shorten implementation time, and improve competitiveness.
Fig. 3. Comparison of project management efficiency
The considered approach for the analysis and synthesis of solutions is a development of F. Zwicky’s “morphological box” method. The scientist formulated it thus: the purpose of morphological research is to see the perspective of a complete “field of knowledge” about the subject [7, 8]. Morphological methods have been actively developed in different countries and have been used in various fields, including [4, 8]:

• analysis and development of technical systems,
• forecasting the development of various fields of technology,
• support of innovative projects,
• development of business plans,
• improvement of the studied objects,
• safe development scenarios,
• consideration of environmental aspects in design, etc.
Classical morphological methods have several disadvantages. The disadvantages include the complexity of evaluating the generated variants and the problem of choosing among a large number of variants, as well as the dimensionality curse of
combinatorial problems [7]. In order to improve morphological methods, an advanced morphological approach and the Okkam program were developed [9, 10]. In business in particular, morphological methods have been used for:

• sustainability research [11],
• analysis of intelligent energy services [12],
• temporal data processing in B2C systems [13],
• technological forecasting and social change [14, 15],
• studies of energy scenarios [16].
The fields of artificial intelligence and learning are very promising for the morphological approach [17, 18].
2 Using Morphological Matrices for Project Analysis

A morphological matrix (MM) has been developed for searching for promising variants of engineering solutions for the finish machining of gas turbine engine blades (Table 1). Its dimensionality, corresponding to the essential features of an engineering solution, was established with the help of expert evaluations. First of all, 13 features, essential for the technical system’s appearance, and their options were singled out.
Fig. 4. Correlation between morphological set of variants, generated variants and selected variants
To reduce the number of alternatives analyzed, expert evaluation criteria were introduced, together with their weighting coefficients of importance.
Subsequently, each alternative was assigned a score for each criterion. As a result, 16,000 variants were generated, of which 130 were selected for further clustering (Fig. 4) [19].

Table 1. Morphological matrix

Attribute | Option 1 | Option 2 | Option 3 | Option 4
Execution of the movement for finishing machining | The machining tool | The table | The machining tool and table | –
Measuring module | Optical | Mechanical | Combination | –
Machining tool | Tape | Circle | Tor | Cone cutter
Degree of freedom of the machining tool | 1 | 2 | 3 | >3
Portal base location | Horizontal | Vertical | – | –
Rotation of the part relative to its longitudinal axis | None | Yes | – | –
Rotation of the part relative to the transverse axis | None | Yes | – | –
Diagram | Cube | Portal | Console with Tool Head | –
End support for the blade | Yes | No | Option can be selected | –
Surface monitoring | Full | Selective (by machining area only) | Interactive (during machining) | –
Relative positioning of the measuring unit and carriage | On one platform | On two platforms | – | –
Number of parts to be machined | 1 | 2 | 3 and larger (petal device) | –
Measurements | Roughness | Macro geometry | Combined | –
The research was carried out with the Okkam software [9]. The experts selected nine criteria and their weighting scores; subsequently, the attributes and options were evaluated. Business decisions must meet a system of criteria that can be formalized. The requirements developed will reduce project execution time and formalize and structure information. Business decisions should be based on the following main components:

• input information must be accurate, reliable and sufficient,
• decisions must be rational or optimal,
• decision makers must have adequate qualifications.
Decisions made must comply with a number of rules, namely:

• consistency of the decision with the set objectives,
• the decision must be substantiated and objective,
• the wording should be concise and clear,
• the decision should be adaptive, meaning that it should react flexibly to changes in internal conditions as well as to changes in the external environment,
• the decision should be made promptly,
• information processing, including development processes, should be automated.

The validity of decisions requires:

• analysis of decision effectiveness,
• application of scientifically based approaches,
• application of cost and functional analysis, and of methods of forecasting and modeling a decision.

The condition of objectivity requires:

• reliable and high-quality information,
• multiple options: the possibility of synthesizing several or many solutions,
• the possibility of comparing the set of generated solutions.
The use of automation in decision implementation will significantly reduce the development time of the decision and increase its validity. For this purpose, it is desirable to formalize decision making, and the proposed morphological approach is suitable for this. A number of economic criteria were chosen as morphological criteria:

• transaction cost,
• productivity,
• payback time,
• price,
• cost of operation.

Also, a number of criteria indirectly related to the economic part of the project were chosen:

• processing accuracy,
• ease of operation,
• mass,
• dimensions.
The power of the morphological set is 559872 variants. The reference variants were selected from this set. Then the generation of variants with their evaluation and selection was performed, resulting in a set of variants for further analysis (Fig. 5). Then the clustering of options using a given measure of similarity is carried out. For the final analysis 130 generated rational variants were selected, grouped into 17 clusters (Fig. 6). After the clustering of variants, the final sets of choices from the morphological matrix were analyzed for optimization and experimental studies.
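A compact sketch of this workflow is given below. The option counts are read off Table 1 (their product is the stated 559,872), while the per-option scores standing in for the expert criteria and weights are random placeholders, since the actual Okkam scores are not given in the paper:

```python
import numpy as np
from math import prod
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist

# option counts per attribute, read off Table 1; their product is the
# power of the morphological set
counts = [3, 3, 4, 4, 2, 2, 2, 3, 3, 3, 2, 3, 3]
print(prod(counts))                                  # -> 559872

rng = np.random.default_rng(1)
# placeholder per-option scores standing in for the weighted expert criteria
option_scores = [rng.uniform(0.0, 1.0, c) for c in counts]

def score(variant):
    """Additive estimate of a variant given as one option index per attribute."""
    return sum(option_scores[a][o] for a, o in enumerate(variant))

# sample the morphological space and keep the 130 best-scoring variants
variants = np.array([[rng.integers(c) for c in counts] for _ in range(5000)])
best = variants[np.argsort([-score(v) for v in variants])[:130]]

# group the selection into 17 clusters by Hamming distance of option choices
Z = linkage(pdist(best, metric="hamming"), method="average")
labels = fcluster(Z, t=17, criterion="maxclust")
```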
Fig. 5. Morphological matrix and variants generation (screenshot)
Fig. 6. Location of clusters in the morphological field of solutions (screenshot)
3 Results and Discussion

Calculations showed that among all clusters, clusters 8 and 17 have the highest score (value of the target function). Accordingly, these clusters were investigated. These clusters contain the variants with the best estimates of the target function:

• 1.09 for variant 9 of cluster 8 (Fig. 7),
• 1.10 for variant 1 of cluster 17.
Fig. 7. Location of cluster 8 in the morphological field of solutions (screenshot)
The analysis of the MM showed that most clusters that do not contain reference variants have lower average estimates of the target function than clusters that include reference variants. Therefore, it is reasonable to expand the set of reference variants and to include among them the variants with higher estimates of the target function, thereby expanding the set of analyzed technical solutions. As a result of the MM analysis, a set of solutions was formed for the method of finish machining of workpieces and for the technical systems that implement the corresponding method. The method for finish machining of a gas turbine engine blade blank includes measuring the geometric characteristics of the blank, comparing the measured shape of the blank with the theoretical shape of the finished part, determining the areas of the blank for finish machining, forming the working tool path, setting the cutting modes based on the measured data, and performing the finish machining; the working tool used has the form of a circle (wheel). The workpiece is moved into the machining area and the geometrical parameters of the workpiece are scanned in orthogonal coordinates (in the transverse, longitudinal and vertical directions along the Y, X, Z axes) by a measuring module moving uniformly along the axis of the rotating workpiece, after which the measured parameters of the workpiece shape are sent to the control unit. The control unit performs mathematical processing of the obtained data, compares the measured shape of the workpiece with the theoretical shape of the part, and determines the set of areas of the workpiece surface to be machined, on the basis of which the workpiece is machined by the working tool. Machining is carried out by joint movement of the working tool along the three orthogonal coordinates Y, X, Z, and the current information from the force-moment sensor that controls the parameters of the finishing operation is transmitted to the control unit for generating the control commands. During all operations, the working tool is rotated relative to the longitudinal axis of the housing X, and the workpiece is rotated relative to its longitudinal axis during machining [16].
Fig. 8. Competing solutions synthesized on the basis of morphological analysis
As a result, several competing solutions, synthesized on the basis of morphological analysis, were selected (Fig. 8). The proposed approach can be seen as an alternative or supplement to multi-attribute decision making (MADM) methods [20]. In the future, it is possible to build anticipative models for the considered method, which will take into account changes in the environment and external conditions.
4 Summary and Conclusion

The article considers the application of the morphological approach to the analysis and synthesis of innovative solutions for production engineering. It also shows the feasibility of applying the approach to business information systems. The morphological representation makes it possible to clearly define and structure business information and choose a rational set of variants. The task of the approach is the synthesis of a morphological space of decisions with primary consideration of business aspects. The developed program makes it possible to synthesize a morphological set of variants. The proposed approach eliminates the drawbacks of classical morphological methods. The advantages of the approach include the relative simplicity of evaluating the generated variants and choices. Thus the curse of dimensionality of combinatorial problems can be overcome.
As an example, the problem of selecting rational technology options for gas turbine engines is considered. The power of the morphological set is 559,872 variants. As a result, several optimal variants remained for further research. In this way it was possible to significantly reduce the time needed to analyze the variants and make a decision. It can be concluded that the proposed method significantly increases the productivity of the process of analysis, synthesis and decision making.
References 1. Husig, S., Kohn, S.: Computer aided Innovation - state of the art from a new product development perspective. Comput. Ind. 60(8), 551–562 (2009) 2. Kohn, S., Husig, S.: Development of an empirical based categorization scheme for CAI software. Int. J. Comput. Appl. Technol. 30(2), 33–46 (2007) 3. Werner, D., Weidlich, C., Guenther, R., Blaurock, B., Joerg, E.: Engineers’ CAx education —it’s not only CAD. Comput. Aided Des. 36(14), 1439–1450 (2004) 4. Bardenhagen, A., Rakov, D.: Analysis and synthesis of aircraft configurations during conceptual design using an advanced morphological approach. In: Deutscher Luft- und Raumfahrtkongress, Darmstadt, Germany (2019) 5. Chakrabarti, A., Bligh, T.: An approach to functional synthesis of solutions in mechanical conceptual design. In: Part I: Introduction and Knowledge Representation Engineering Design Centre. UK Research in Engineering Design. Department of Engineering, University of Cambridge, pp. 127–141 (1999) 6. Kendal, S., Creen, M.: An Introduction to Knowledge Engineering. Springer, London (2007) 7. Zwicky, F.: Discovery, Invention, Research - Through the Morphological Approach. The Macmillan Company, Toronto (1969) 8. Ritchey, T.: General morphological analysis as a basic scientific modelling method. Technol. Forecast. Soc. Chang. 126, 81–91 (2018) 9. Rakov, D.: Okkam - advanced morphological approach as method for computer aided innovation (CAI). In: MATEC Web of Conferences, ICMTMTE (2019) 10. Bardenhagen, A., Rakov, D.: Advanced morphological approach in Aerospace design during conceptual stage. Facta Univ. Ser.: Mech. Eng. 17(3), 321–332 (2019) 11. Isenmann, R., Zinn, S.: Morphological box for education on sustainable development: approach and examples at the munich university of applied sciences. In: 5th Responsible Management Education Research Conference, CBS Cologne (2015) 12. Paukstadt, U., Becker, J.: From energy as commodity to energy as service—a morphological analysis of smart energy services. J. Bus. Res. 73, 207–242 (2021) 13. Knolmayer, G.F., Borean, A.: A morphological box for handling temporal data in B2C systems. In: Cellary, W., Estevez, E. (eds.) Software Services for e-World. I3E 2010, vol. 341, pp. 215–225. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-16283-1_ 25 14. Johansen, I.: Scenario modelling with morphological analysis. Technol. Forecast. Soc. Chang. 126, 116–125 (2018) 15. Ngo, T., Kashani, A., Imbalzano, G., Nguyen, K., Hui, D.: Additive manufacturing (3D printing): a review of materials, methods, applications and challenges. Compos. B Eng. 143, 172–196 (2018) 16. Witt, T., Stahlecker, K., Geldermann, J.: Morphological analysis of energy scenarios. Int. J. Energy Sect. Manag. 12(4), 525–546 (2018)
17. Alvarez-Dionisi, L.E., Mittra, M., Balza, R.: Teaching artificial intelligence and robotics to undergraduate systems engineering students. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 11 (7), 54–63 (2019). https://doi.org/10.5815/ijmecs.2019.07.06 18. Njenga, S., Oboko, R., Omwenga, E., Maina, E.: Use of intelligent agents in collaborative M-learning: case of facilitating group learner interactions. Int. J. Mod. Educ. Comput. Sci. 9 (10), 18–28 (2017). https://doi.org/10.5815/ijmecs.2017.10.03 19. Kondrat’ev, I.M., Rakov, D.L.: Advanced morphological approach to finding novel solutions for automated finishing of GTE blades. In: Journal of Physics: Conference Series, p. 032021. IOP Publishing (2019) 20. Adriyendi, M.: Multi-attribute decision making using simple additive weighting and weighted product in food choice. Int. J. Inf. Eng. Electron. Bus. 7(6), 8–14 (2015). https:// doi.org/10.5815/ijieeb.2015.06.02
Analysis of Nonlinear Deformation of Elastic-Plastic Membranes with Physical Non-linearity

Sergey A. Podkopaev and Sergey S. Gavriushin

Bauman Moscow State Technical University, Building 1, 5, 2nd Baumanskaya st., 105005 Moscow, Russia
Abstract. A method for the numerical investigation of the stress-strain state of 3D models taking into account geometrical and physical nonlinearities was developed. Problems of solid mechanics involving large elastic strains and plastic strains were considered. This paper considers a physically nonlinear calculation model of a spherical shell and a rectangular plate. We describe methods of solving nonlinear equations. Numerical methods were used to investigate the efficiency of the method of continuation by the best parameter in a calculation model with an extreme point of the solution curve. A novel computational procedure for solving nonlinear problems was developed. The study developed a numerical algorithm for studying the processes of nonlinear deformation of multi-parametric systems; the algorithm was implemented as a software program.

Keywords: Nonlinear deformation · Post-buckling behavior · Discrete switching · Complex resistance · Limit point · Continuation by parameter · Change of the subspace of parameters
1 Introduction

In industry, for the modeling of manufacturing technological processes (stamping, etc.), it is necessary to solve problems taking into account both physical and geometric nonlinearities. In modern construction and mechanical engineering, the use of new materials that allow both elastic and inelastic deformations is becoming increasingly popular. Many studies have considered the properties of new materials and the development of methods for modeling structures taking into account large deformations. From the point of view of the mechanics of a deformable solid, we are talking about nonlinearly elastic media, in the deformation of which it is necessary to take into account the geometric nonlinearity within the framework of finite deformations. The problems of constructing a nonlinear theory of elasticity are considered in a large number of articles and summarized in monographs (Levitas V.I., Novozhilov V.V., Chernykh K.F., Shalashilin V.I., Bathe K.J., Miehe C.) [7, 17, 19, 26, 30, 39]. The problem of constructing a system of resolving equations for modeling finite deformations of continuous media is the subject of scientific research that has recently been widely carried out. Numerous schemes and techniques for the nonlinear analysis of three-dimensional bodies have been developed, constructed for various physically nonlinear media (nonlinear elastic, elastoplastic, viscoelastic, etc.). The
variety of different methods for solving this class of problems is due to the possibility of using various tensors describing the kinematics and arbitrary flows of a continuous medium, which include tensors such as the strain gradient tensor, tensors of strain measures (the right and left Cauchy–Green tensors, the Almansi strain measure, the right Piola tensor), the tensor of the logarithmic strain measure, distortion tensors, etc. The formulation and algorithms for solving elastoplastic problems under the assumption of small elastic deformations have been sufficiently developed, and there are many publications in which aspects of analytical and numerical solutions are considered (Ilyushin A.A., Malinin N.N., Hughes T.J.R.) [15, 18, 35]. Traditionally, in the theory of flow, total deformations and their velocities are represented as the sum of an elastic part and a plastic part, which allows an efficient solution of problems with small elastic deformations. But this formulation is not applicable to the modeling of a number of practically important tasks (metal processing, geomechanical tasks). In these cases, in addition to taking into account large plastic deformations, it is necessary to take into account large elastic deformations, as well as geometric nonlinearity. Therefore, when formulating elastic physical relations, the authors propose to use the approach used in solving the problems of deformation of hyperelastic materials; that is, it is assumed that there is a potential energy function of elastic deformation, with the help of which the physical relations are expressed. One of the approaches to solving elastoplastic problems taking into account large deformations is the generalization of methods for studying small elastoplastic deformations to finite deformations. For example, an additive representation of the strain rate tensor as the sum of elastic and plastic parts is used, but such a decomposition is valid only for small elastic deformations. For the case of finite elastic and plastic deformations, a multiplicative representation of the deformation gradient has been developed in the form of the product of the elastic and inelastic parts of the deformation gradient. According to this concept, a material point is successively subjected to inelastic and elastic deformations, respectively (Golovanov A.I., Paymushin V.N., Fish J., Zienkiewicz O.C.) [12, 20, 32, 45]. Recently, computational mechanics has developed greatly, using finite-dimensional models of continuous media, numerical methods for solving algebraic equations and computer modeling, and allowing one to study a wide range of nonlinear problems. In particular, a popular method for the numerical study of problems in the mechanics of deformable bodies is the finite element method (Bazhenov V.G., Korobeynikov S.N., Bathe K.-J., Hughes T.-J.R.) [3, 16, 30, 35]. Nevertheless, despite the progress made in computational mechanics, solving problems using the finite element method currently remains a very difficult task, mainly due to the strong nonlinearity that arises when taking into account geometric nonlinearity and the nonlinear behavior of the material. The article examines three-dimensional deformable solids with physical nonlinearity that are subject to large deformations. The authors have developed an innovative algorithm for studying the deformation of hyperelastic and elastoplastic three-dimensional bodies, taking into account geometric nonlinearity. The constitutive relations for elastoplastic bodies are obtained, and a technique has been developed for studying the supercritical state of three-dimensional bodies taking into account geometric and physical nonlinearities.
Currently, commutation devices used in the industrial Internet of Things are becoming increasingly common due to the digital industrial revolution (Industry 4.0) [2, 11, 22, 23, 29, 33, 36, 40, 43, 44]. This paper considers devices in the form of square plates and axisymmetric membranes. Here, the term membrane (Fig. 1a) refers to a thin-walled axisymmetric shell that rapidly changes its deflection under external load; this process of stability loss without shell failure will be referred to as snapping [1, 4, 5, 27, 28, 31]. The elastic characteristic of a membrane, i.e., the relationship between the displacement of a point on the membrane and the external load (Fig. 1b), is an important operational characteristic [25, 29].
Fig. 1. a) Example of a membrane b) Type of elastic characteristics of the membrane
Under external load, upon reaching the first critical point (extremum) (point B, Fig. 1b), corresponding to the value of the extreme pressure p_cr1, the membrane bypasses the unstable section BC of the characteristic, rapidly changes its deflection, and continues deforming on section CD of the elastic characteristic. When the external load is removed from the shell, there is a reverse step-like change of deflection corresponding to the second critical pressure p_cr2 [14, 21]. It should be noted that the stresses in the membrane remain elastic after the initial snap (first step). The problem of the post-buckling behavior of axisymmetric membranes was thoroughly studied in [8–10, 13, 23, 24].
2 Numerical Solution Methods

Determination of the stress-strain state is a direct boundary value problem of the spatial theory of elasticity. The solution of the problem is reduced to the solution of a system of partial differential equations. The authors developed a solution to this problem that consists in reducing the geometrically nonlinear volumetric problem (3D) of the spatial theory of plasticity to a physically nonlinear plane problem (2D) and a
geometrically nonlinear one-dimensional problem (1D). When solving a two-dimensional problem, a cross-section is usually considered, and when solving a one-dimensional problem, the longitudinal axis of the bar element. The solution of the one-dimensional problem is not considered here. Let us consider the solution of a physically nonlinear plane problem for the cross-section of a bar. The proposed design model describing the stress-strain state is a system of nonlinear equations and includes the following relationships:

– kinematic laws of the distribution of relative deformations in the cross-section;
– stress-strain curves;
– relationships of the deformation theory of plasticity;
– the assumption of the equivalence of the distribution of relative deformations in the elastic and elastoplastic stages, which is not always valid, especially for shear deformations [6].
The system of equations of the computational model establishes a connection between the vector of the external load f and the vector of deformation parameters u; it may have no solutions, or it may have several solutions. There are three types of numerical problem in the investigation of the section model: two-dimensional numerical integration, the boundary-value two-dimensional problem, and the solution of systems of nonlinear equations [8, 31, 36, 41]. Numerical integration over the cross-section is done using the Gaussian quadrature rule for triangles; therefore, the investigated area should first be triangulated. The boundary-value problem is solved using the variational method in the form of the finite element method. To solve a system of nonlinear equations, one can use methods of discrete or continuous continuation by the parameter, as well as combinations of these methods [34, 38, 42].

2.1 Methods of Discrete Continuation by Parameter
Methods of discrete continuation by parameter are iteration methods. The most common are the simple iteration method, the generalized secant method, and the Newton-Raphson method and its modifications. Some of these methods have a physical interpretation in the form of the methods of variable elasticity [4], additional loads, additional deformations, and a combined method [5]. We should note that the above-mentioned methods of elastic solution are generic and can be used not just for reducing a problem to a system of nonlinear equations. In the method of discrete continuation by parameter, it is best to use the parameters obtained from the elastic solution as the first approximation of the vector of deformation parameters u:

u = \langle d \rangle^{-1} f,   (1)
where 〈d〉 is a symmetric matrix. Zero elements of the stiffness matrix 〈d〉 indicate that there is no cross-talk between the corresponding force parameters.
For the Newton-Raphson method, which has quadratic convergence, the new vector of deformation parameters at the i-th iteration is given by:

u^{(i+1)} = u^{(i)} - J^{+}_{(i)} \left[ f - F(u^{(i)}) \right],   (2)

where J^{+}_{(i)} is the Moore-Penrose pseudoinverse of the Jacobi matrix of \psi(u) = f - F(u).
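A minimal numerical sketch of iteration (2) is given below; the forward-difference Jacobian and the toy residual function are our own illustrative choices:

```python
import numpy as np

def newton_raphson(F, f, u0, tol=1e-10, max_iter=50, h=1e-7):
    """Newton-Raphson iteration (2) for psi(u) = f - F(u) = 0, using a
    forward-difference Jacobian and the Moore-Penrose pseudoinverse."""
    u = np.asarray(u0, dtype=float)
    for _ in range(max_iter):
        psi = f - F(u)
        if np.linalg.norm(psi) < tol:
            break
        J = np.empty((psi.size, u.size))
        for j in range(u.size):                 # J = d(psi)/du, column by column
            du = np.zeros_like(u)
            du[j] = h
            J[:, j] = ((f - F(u + du)) - psi) / h
        u = u - np.linalg.pinv(J) @ psi         # u_(i+1) = u_(i) - J+ psi(u_(i))
    return u

# toy check: a hardening-type law F(u) = u + u**3 under load f = 2 gives u = 1
print(newton_raphson(lambda u: u + u**3, np.array([2.0]), np.array([0.5])))
```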
2.2 Methods of Continuous Continuation by Load Parameter

Methods of continuous continuation by parameter are slower than the discrete ones. However, they allow simulating the stress-strain state of the section over the whole loading history, including transient processes such as corrosion, temperature influence and so on. Besides, some of them enable investigation of the “post-buckling” behavior of the section and can find all possible solutions. The most common methods here are the Euler method (method of sequential loadings) and explicit and implicit Runge-Kutta methods of different orders. In matrix form, the system of resolving equations of the section model is

\psi(u) = f - F(u) = 0,   (3)

where u is the vector of deformation parameters. After introducing the parameter t into the load vector f and differentiating \Psi(u, t) with respect to time t, expression (3) is transformed as follows:

J(u, t) \frac{du}{dt} + \frac{\partial \Psi(u, t)}{\partial t} = 0,   (4)

where J(u, t) = \partial \Psi(u, t) / \partial u is the m \times m Jacobi matrix (tangent stiffness matrix). The above expression in matrix form is given by:

\frac{du}{dt} = -J^{-1}(u, t) \frac{\partial \Psi(u, t)}{\partial t}.   (5)

This is a system of ordinary first-order linear differential equations. The unknown vector function of deformation parameters u(t) satisfies the initial condition

u(t_0) = u(0) = 0,   (6)
ð6Þ
which corresponds to the initial unloaded state of the section. This problem statement is a method of continuous continuation by parameter and can be reduced to integration of a Cauchy problem. The system of ordinary differential Eqs. (5) can be solved by Euler methods (method of sequential loadings), Runge-Kutta, Adams-Schtermer methods as so on.
Analysis of Nonlinear Deformation of Elastic-Plastic Membranes
2.3
155
Methods of Continuous Continuation by the Best Parameter
The solution of the system (5) forms a continuous curve K in a (m + 1)-dimensional Euclidean space Rm+1{u, t}. However, the curve K is non-monotonous and non-smooth even for simple loadings and has singular points with regular points. In the regular points, the determinant det( J) 6¼ 0 and the matrix rank are used. In singular points, det (J) = 0, rank(J) = r < m. The continuation of solution in singular points is unfeasible, which usually terminates the solution. Thus, one can either obtain only one solution of may or obtain no solutions at all. To be able to continue the solution, one can switch a parameter (multiple switching sometimes). However, the choice of optimal parameter is not straightforward and its change complicated the solution [10]. In terms of computational efficiency, introducing a new continuation parameter s, which is curve length K of a multiple solution of the system (4) in Rm+1 [10, 24], is preferable. This problem statement was dubbed a method of continuation by the best parameter. [14]. After introducing the continuation parameter s, we have the following: yðu; sÞ ¼ 0;
u ¼ uðsÞ;
t ¼ tðsÞ
ð7Þ
By differentiating the vector function y with respect to the continuation parameter s and adding the expression for the curve length K, we get the following system of equations: duT du dt 2 ds ds þ ds ¼ 1; ð8Þ dW du dW dt þ ¼ 0 du ds dt ds This system of nonlinear equations can be explicitly solved with respect to derivatives using numerical methods. The last matrix equation is an underdefined system of m ordinary linear equations for m + 1 unknowns dv/ds, it can be written was follows: dW du du dt T dv ; ; ð9Þ ¼ 0 , J ¼ 0; du ds ds ds ds where v = {u, t}T – extended column vector of deformation parameters, v 2 Rm+1; J – extended Jacobi matrix m (m + 1). The Eq. (9) with the initial condition v|s=0 = 0 (which corresponds to the unloaded state) is an implicit Cauchy problem. We should note that there are extreme points among singular points (det(J) = 0) of the solution curve of the simulation model. In the extreme points, rank(J) = m. Significantly singular points of the curve K where bifurcation is possible (rank(J) rank() m) are very rare. As det(J) 6¼ 0 in extreme points, the continuation of solution in them is possible. Let us write a system of ordinary differential Eqs. (9) in the explicit form: v0 ¼
dv ¼ J a O ¼ ortðJ; QÞ; ds
ð10Þ
where v′ – is an unknown ort of the of the tangent line to curve K in the point v and 0 orthogonal h to i the row vectors of the matrix J, v 2 Rm þ 1 ; J a ¼ QJ – complemented Jacobi matrix (m + 1) (m + 1) formed by joining a (m + 1)-dimensional row vector Q from the bottom; 1 J a – pseudo-inverse complemented Jacobi matrix. Essentially, (9) comes down to determining the ort v′ from the given row vectors J i;j , i = 1,m; j = {1,2,…,m + 1} of the matrix J and vector Q which is linearly independent relative to rows J i;j . We should note that introducing a pseudo-inverse Moor-Penrose matrix allows improving the stability of numerical solution for rank(J) < m. This situation is only possible with one non-zero load. Computation error of solution of the system (9) will decrease as Q approaches the desired vector v′. Evidently, at iteration k + 1, vector Q(k+1) can be set the value of v0ðkÞ from the previous iteration. Generally, Q(1) = {0, …, 0,1} at first iteration. Let us consider two explicit schemes of continuous continuation by numerical integration of the Cauchy problem (9) with initial condition v|s=0 = 0 by the parameter s. Let us introduce the following notation: 0 0 vð s k Þ ¼ vðkÞ ; v ðsk Þ ¼ v ðkÞ; J ðvðsk ÞÞ ¼ J vðkÞ ¼ JðkÞ Qðsk Þ ¼ QðkÞ
ð11Þ
The Euler method algorithm at the iteration k = 1, 2, 3, … looks as follows:

\[ s_1 = 0, \quad v_{(1)} = 0, \quad Q_{(1)} = \{0, \ldots, 0, 1\}; \]
\[ s_{k+1} = s_k + \Delta s_k, \qquad v'_{(k)} = \mathrm{ort}\left(J_{(k)}, Q_{(k)}\right); \]
\[ v_{(k+1)} = v_{(k)} + v'_{(k)}\,\Delta s_k, \qquad Q_{(k+1)} = \left(v'_{(k)}\right)^{T}. \qquad (12) \]
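A minimal sketch of the Euler scheme (12), reusing the ort function above (illustrative only: step-size control and stopping criteria are omitted):

```python
def euler_continuation(jacobian, dim, ds, n_steps):
    """Continue the solution curve from the unloaded state v|_{s=0} = 0.
    jacobian(v) must return the extended m x (m+1) Jacobi matrix at v."""
    v = np.zeros(dim)                   # v(1) = 0, dim = m + 1
    Q = np.zeros(dim); Q[-1] = 1.0      # Q(1) = {0, ..., 0, 1}
    path = [v.copy()]
    for _ in range(n_steps):
        t = ort(jacobian(v), Q)         # v'(k) = ort(J(k), Q(k))
        v = v + ds * t                  # v(k+1) = v(k) + v'(k) * ds
        Q = t                           # Q(k+1) = v'(k)
        path.append(v.copy())
    return np.array(path)
```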
The algorithm for the four-stage fourth-order Runge-Kutta method, with a local error of order O(Δs_k^5), is as follows:

\[ s_1 = 0, \quad v_{(1)} = 0, \quad Q_{(1)} = \{0, \ldots, 0, 1\}; \qquad s_{k+1} = s_k + \Delta s_k; \]
\[ v'_{1(k)} = \mathrm{ort}\big(J(v_{(k)}),\; Q_{(k)}\big); \]
\[ v'_{2(k)} = \mathrm{ort}\big(J(v_{(k)} + 0.5\,v'_{1(k)}\Delta s_k),\; v'_{1(k)}\big); \]
\[ v'_{3(k)} = \mathrm{ort}\big(J(v_{(k)} + 0.5\,v'_{2(k)}\Delta s_k),\; v'_{2(k)}\big); \]
\[ v'_{4(k)} = \mathrm{ort}\big(J(v_{(k)} + v'_{3(k)}\Delta s_k),\; v'_{3(k)}\big); \]
\[ v_{(k+1)} = v_{(k)} + \tfrac{1}{6}\big(v'_{1(k)} + 2v'_{2(k)} + 2v'_{3(k)} + v'_{4(k)}\big)\Delta s_k; \qquad Q_{(k+1)} = \left(v'_{4(k)}\right)^{T}. \qquad (13) \]
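One step of the scheme (13) can be sketched in the same spirit (again our illustration; the second returned value becomes Q(k+1) for the next step):

```python
def rk4_step(jacobian, v, Q, ds):
    """One step of the four-stage Runge-Kutta scheme (13)."""
    k1 = ort(jacobian(v), Q)
    k2 = ort(jacobian(v + 0.5 * ds * k1), k1)
    k3 = ort(jacobian(v + 0.5 * ds * k2), k2)
    k4 = ort(jacobian(v + ds * k3), k3)
    v_next = v + (ds / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return v_next, k4                   # Q(k+1) = v'4(k)
```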
We should note that the vector v should be scaled to ensure the stability of the numerical solution process.
3 Numerical Solution Results

To test the program, we used a uniaxial tension-compression diagram in the form of a sine curve with a limited domain of definition (Fig. 2):
Fig. 2. Uniaxial tension-compression test diagram
This curve has extreme points (ε_m, σ_y) and (−ε_m, −σ_y), and the continuation of the solution through them is of interest. Evidently, for this diagram, conventional methods of sequential loadings and the Newton-Raphson method will only yield one solution and will not converge upon reaching the extreme point. We should note that the downward branch of the diagram on the section [|ε_m|, |ε_u|] is not purely conventional: it is characteristic of some real materials. The calculations were done using the Euler method and the fourth-order four-stage Runge-Kutta method. The exact curves were obtained using the method of sequential displacements. Initially, the numerical calculation was done for a square plate under a pressure of 11.25 MPa. The plate side length was 0.10 m, and the plate thickness was 0.005 m. The mechanical properties of the material are the following: Young modulus 3 GPa, Poisson ratio ν = 0.46. The upper edges have no vertical displacement, but they can move in their plane. As the plasticity criterion, the study utilized the nonlinear isotropic hardening law with the Huber-Mises criterion. Due to symmetry, we only simulated a quarter of the plate; the simulation model is shown in Fig. 3.
Fig. 3. Quarter plate finite element model
Points A and B lie on the central axis. Figure 4 shows the plot of vertical displacement of points A and B vs. pressure. Using this chart, one can evaluate the thickness of the central axial section.
Fig. 4. Vertical displacement of points A and B vs. pressure
Figure 5 shows the deformed state with the field of vertical displacement for the load of 11.25 MPa. The deformation process can be subdivided into three stages. At the first stage (load less than 0.75 MPa), only elastic strains are observed. Upon reaching the load of 0.75 MPa, plastic strains start occurring at the center point A of the upper edge. Maximum plastic strains are then observed at point A for loads up to 2.35 MPa. Figure 5 indicates that the maximum plastic strain intensity occurs on the side edge when the load reaches 11.25 MPa.
Fig. 5. Deformed state of the plate
Fig. 6. Initial shape of the circular plate
At the second stage, we solved the problem of deformation of a circular plate under the pressure P = 0.12 MPa applied to the lower plate surface. The plate radius is 0.04 m, the plate thickness is 0.012 m, and G = 0.44 MPa. The boundary conditions were as follows: the upper edge of the plate has no vertical displacement, but it can move in its plane. Figures 6 and 7 show the undeformed and deformed shapes of the plate. The plate first bends and then starts to inflate: the edges of the plate are displaced outward; at the last stage, the plate bends upward, and the edges of the plate are displaced inward. Figure 8 shows the chart of the displacement of the upper face center point A vs. the load. In both simulations, the Runge-Kutta method exhibits good convergence. In the second simulation, the Euler method also exhibited fair convergence due to the small step increment of the parameter s.
Fig. 7. Deformed state of the plate
Fig. 8. Displacement of the center point A vs. load
The validity of the obtained results is ensured by the use of tried-and-tested methods of solid body mechanics during the construction of the mathematical models, as well as by the physical and mathematical integrity of the models used. The authors' software was written based on the developed mathematical models and numerical algorithms. To further verify the obtained results, these problems were also solved using the finite element method in the ANSYS software. The results of the authors' program are in good agreement with those obtained from ANSYS, which testifies to the validity of the presented results.
4 Conclusion

The paper reviewed existing numerical solution methods and methods of investigation of the post-buckling behavior of axisymmetric spherical shells and membranes with physical and geometric nonlinearities. The authors created a modern mathematical model and a computational method for solving the problem of deformation of 3D bodies taking into account geometric and physical nonlinearities. A rational mathematical model for describing the processes of nonlinear deformation was selected. Based on the developed mathematical models and computational algorithm, a finite-element simulation complex was created. The complex can be used for simulating structures under finite strains, and it can also be used for simulating different manufacturing processes. In future studies, the authors plan to create an algorithm for separating elastic and plastic deformations and also to study the mechanism of contact interaction.
References

1. Alfutov, N.A.: Fundamentals of Calculation on the Stability of Elastic Systems. Mechanical Engineering (Calculator library), p. 488, with illustrations (1977)
2. Andreeva, L.E.: Elastic Elements of Devices. Tutorial. Mashinostroenie, p. 456 (1982)
3. Bazhenov, V.G., Kibets, A.I.: Numerical modeling of three-dimensional problems of unsteady deformation of elastic-plastic structures by the finite element method. Proc. Russ. Acad. Sci. Solid State Mech. 1, 52–59 (1994)
4. Biderman, V.L.: Mechanics of Thin-Walled Designs. Statics. Mechanical Engineering (Calculator library), p. 488, with illustrations (1977)
5. Birger, I.A.: Calculation of structures taking into account plasticity and creep. Izvestia of the USSR Acad. Sci. Mech. 2, 113–119 (1965)
6. Chernov, N.L., Shebanin, V.S., Kupchenko, Yu.V., Walid, I., Artyushkin, I.A.: Strength of sections of thin-walled steel rods with limited plastic deformations. Izv. vuzov. Constr. Archit. 4, 1–5 (1990)
7. Chernykh, K.F.: Nonlinear Theory of Elasticity in Machine-Building Calculations. Mashinostroenie, p. 336 (1986)
8. Valishvili, N.V.: Methods for Calculating the Shells of Rotation on Electronic Digital Computers. Mashinostroenie, p. 278 (1976)
9. Volmir, A.S.: Resistance of Deformable Systems. Fizmatgiz, p. 984 (1967)
10. Gavrushin, S.S.: Development of Methods for Calculating and Designing Elastic Shell Structures of Instrument Devices. Dissertation, Moscow, p. 316 (1994)
11. Gavrushin, S.S., Baryshnikova, O.O., Boriskin, O.F.: Numerical Methods in Dynamics and Strength of Machines. Publishing House of Bauman Moscow State Technical University, p. 492 (2012)
12. Golovanov, A.I., Sultanov, L.U.: Investigation of the supercritical elastic-plastic state of three-dimensional bodies taking into account finite deformations. Izvestiya vuzov. Aviation Equipment 4, 13–16 (2008)
13. Grigolyuk, E.I., Lopanitsyn, E.A.: Finite Deflections, Stability and Post-Buckling Behavior of Thin Shallow Shells. MSTU “MAMI”, Moscow, p. 162 (2004)
14. Grigolyuk, E.I., Kabanov, V.V.: The Stability of Shells. Main Editorial Board of Physical and Mathematical Literature, “Nauka”, Moscow, p. 360 (1978)
15. Ilyushin, A.A.: Mechanics of a Continuous Medium. Moscow State University, p. 287 (1978)
16. Korobeynikov, S.N.: Nonlinear Deformation of Solids. Novosibirsk, p. 262 (2000)
17. Levitas, V.I.: Large Elastic-Plastic Deformations of Materials at High Pressure. Naukova Dumka, Kiev, p. 232 (1987)
18. Malinin, N.N.: Applied Theory of Plasticity and Creep. Mashinostroenie, p. 400 (1975)
19. Novozhilov, V.V.: Fundamentals of the Nonlinear Theory of Elasticity. Gostekhizdat, p. 212 (1948)
20. Paimushin, V.N.: Relations of the theory of thin shells of the Timoshenko theory type under arbitrary displacements and deformations. Appl. Mech. Tech. Phys. 55(5(327)), 135–149 (2014)
21. Report at the International Scientific Conference “Boundary Problems of Continuum Mechanics and Their Applications” dedicated to the 100th anniversary of the birth of G.G. Tumashev and the 110th anniversary of the birth of Kh.M. Mushtari. In: Proceedings of the N.I. Lobachevsky Mathematical Center, vol. 42, pp. 5–19. Kazan Mathematical Society, Kazan (2010)
22. Podkopaev, S.A., Gavrushin, S.S., Nikolaeva, A.S.: Analysis of the process of nonlinear straining of corrugated membranes. In: Mathematical Modeling and Experimental Mechanics of a Deformable Solid, Issue 1. Tver State Technical University, Tver (2017). http://elibrary.ru/item.asp?id=29068499
23. Podkopaev, S.A., Gavrushin, S.S., Nikolaeva, A.S., Podkopaeva, T.B.: Calculation of the working characteristics of promising designs of microactuators. In: Mathematical Modeling and Experimental Mechanics of a Deformable Solid, Issue 1. Tver State Technical University, Tver (2017). http://elibrary.ru/item.asp?id=29068499
24. Podkopaev, S.A., Gavrushin, S.S., Podkopaeva, T.B.: Methods for studying the post-buckling behavior of axisymmetric membranes. In: The First International Symposium on Computer Science, Digital Economy and Intelligent Systems (CSDEIS2019), 4–6 October 2019, Moscow, Russia (2019)
25. Ponomarev, S.D., Biderman, V.L., Likharev, K.K., et al.: Strength Calculations in Mechanical Engineering, vol. 2. Mashgiz (1958)
26. Shalashilin, V.I., Kostrichenko, A.V., Knyazev, E.N., Zuev, N.N.: Continuation according to the best parameter in nonlinear static problems solved by the finite element method. Izvestiya of High. Educ. Inst. Aviation Equipment 4, 18–24 (1997)
27. Feodosyev, V.I.: Elastic Elements of Precision Instrument Engineering. Oborongiz (1949)
28. Feodosyev, V.I.: To the calculation of the clapping membrane. Appl. Math. Mech. 10(2), 295–306 (1946)
29. Belhocine, A.: Exact analytical solution of boundary value problem in a form of an infinite hypergeometric series. Int. J. Math. Sci. Comput. (IJMSC) 3(1), 28–37 (2017). https://doi.org/10.5815/ijmsc.2017.01.03
30. Bathe, K.J., Ramm, E., Wilson, E.L.: Finite element formulations for large deformation dynamic analysis. Int. J. Numer. Methods Eng. 9, 353–386 (1975)
31. Crisfield, M.A.: A fast incremental/iterative solution procedure that handles “snap-through”. Comput. Struct. 13(1), 55–62 (1981)
32. Fish, J., Shek, K.: Computational aspects of incrementally objective algorithms for large deformation plasticity. Int. J. Numer. Methods Eng. 4, 839–851 (1999)
33. Chuma, F.M., Mwanga, G.G.: Stability analysis of equilibrium points of Newcastle disease model of village chicken in the presence of wild birds reservoir. Int. J. Math. Sci. Comput. (IJMSC) 5(2), 1–18 (2019). https://doi.org/10.5815/ijmsc.2019.02.01
34. Gupta, N.K., Venkatesh: Experimental and numerical studies of dynamic axial compression of thin-walled spherical shells. Int. J. Impact Eng. 30, 1225–1240 (2004)
35. Hughes, T.J.R., Winget, J.: Finite rotation effects in numerical integration of rate constitutive equations arising in large-deformation analysis. Int. J. Numer. Methods Eng. 15, 1862–1867 (1980)
36. Marguerre, K.: Zur Theorie der gekrümmten Platte großer Formänderung. In: Proceedings of the 5th International Congress for Applied Mechanics, Cambridge, Massachusetts, pp. 93–101. Wiley, New York (1939)
37. Moshizi, M.M., Bardsiri, A.K.: The application of metaheuristic algorithms in automatic software test case generation. Int. J. Math. Sci. Comput. (IJMSC) 1(3), 1–8 (2015). https://doi.org/10.5815/ijmsc.2015.03.01
38. Mescall, J.: Numerical solution of nonlinear equations for shells of revolution. AIAA J. 4(11), 2041–2043 (1966)
39. Miehe, C.: A theory of large-strain isotropic thermoplasticity based on metric transformation tensors. Arch. Appl. Mech. 66(1), 45–64 (1995)
40. Oyeleke, O.D., Thomas, S., Nzerem, P., Koyunlu, G.: Design and construction of a prototype wireless power transfer device. Int. J. Eng. Manuf. (IJEM) 9(2), 16–30 (2019). https://doi.org/10.5815/ijem.2019.02.02
41. Reissner, E.: Asymmetrical deformations of thin shells of revolution. In: Proceedings of Symposia in Applied Mathematics, vol. 3, pp. 27–52. American Mathematical Society (1950)
42. Riks, E.: The application of Newton's method to the problem of elastic stability. J. Appl. Mech. 39, 1060–1065 (1972)
43. Phuyal, S., Bista, D., Izykowski, J., Bista, R.: Design and implementation of cost efficient SCADA system for industrial automation. Int. J. Eng. Manuf. (IJEM) 10(2), 15–28 (2020). https://doi.org/10.5815/ijem.2020.02.02
44. Fataliyev, T., Mehdiyev, S.: Industry 4.0: the oil and gas sector security and personal data protection. Int. J. Eng. Manuf. (IJEM) 10(1), 1–14 (2020). https://doi.org/10.5815/ijem.2020.02.01
45. Zienkiewicz, O., Taylor, R.L.: The Finite Element Method, 5th edn., vol. 2: Solid Mechanics. Butterworth-Heinemann, p. 459 (2000)
The Universal Multilevel Relationship Between the Stochastic Organization of Genomes and the Deterministic Structure of DNA Alphabets

Sergey V. Petoukhov

Mechanical Engineering Research Institute, Russian Academy of Sciences, M. Kharitonievsky pereulok, 4, Moscow, Russia
Abstract. The article is devoted to regular interrelations between stochastic and deterministic properties in the molecular-genetic system. Some numeric interrelations are shown between composition probabilities of n-plets of different n-texts in single-stranded DNAs of eukaryotic and prokaryotic genomes. Besides this, numeric interrelations are shown between the probabilities rules of genomic DNAs and the deterministic system of DNA n-plet alphabets having binary-oppositional molecular characteristics of the deterministic type. The whole set of equality relations between the percentage groupings of n-plets in genomic DNA n-texts, revealed by the author, indicates that the stochastic organization of genomic DNAs is a highly limited stochastic organization with many internal numeric interrelations among the summary percentages of separate groupings. The described author's results are useful for the creation of new approaches in artificial intelligence problems, and also for the development of quantum biology and the author's concept of genetic stochastic resonances.

Keywords: DNA alphabets · Binary oppositions · Genomes · Probabilities · Matrices · Tensor product · Stochastic resonance
1 Introduction

In artificial intelligence systems, engineers strive to reproduce the properties of natural intelligence, the physiological foundations of which are genetically inherited from generation to generation. The disclosure of the informational patents of living nature regarding natural intelligence is associated with mathematical modeling of the genetic coding system and the identification of stochastic rules of organization of information sequences in genetic DNA molecules. Genetics as a science began with the discovery by G. Mendel of stochastic rules for the inheritance of traits in experiments on the crossing of organisms. Many processes in living bodies are stochastic and proceed against a background of noise or are accompanied by noise. For example, the expressions “gene noise” or “cell noise”, which are known in biology, reflect the fact that even genetically identical cells within the same tissue exhibit different levels of protein expression, different sizes and
structures due to the stochastic nature of the interactions of individual molecules in cells [1–4]. This stochastic nature of genetic inheritance is manifested, in particular, in the fact that all people, even identical twins, have different fingerprints. In general, living bodies can be viewed as a mysterious phenomenon of a kind of block-stochastic organization, the structural features of which are subject to study. These features include the genetically inherited multi-block structures of living bodies, in which individual blocks as a whole (globally) are similar to each other, although they differ significantly locally (like fingers in humans). It means that in living bodies stochastic-like phenomena are regularly connected with deterministic-like phenomena in some unknown ways. The article presents new phenomenological facts of the existence of regular connections between stochastic and deterministic properties at the level of the molecular-genetic system. P. Jordan, who was one of the founders of quantum informatics and the author of the first article on quantum biology, stated that “life's missing laws are the rules of chance and probability of the quantum world” [5]. In line with these phenomenological facts and Jordan's statement, the author studies possible probabilities rules in the nucleotide sequences of single-stranded DNAs in eukaryotic and prokaryotic genomes. The first results of this study were described in the articles [6–9]. The mentioned results revealed the existence of universal rules of probabilities in the percentage compositions of DNA sequences of the genomes of higher and lower organisms. In particular, they generalize the well-known second Chargaff's rule about the approximate equality of the percentages of adenine and thymine (%A ≈ %T) and also of the percentages of cytosine and guanine (%C ≈ %G) in long single-stranded DNAs [10–12]. The revealed universal probabilities rules of genomes provoke the following question: are the probabilities rules of the stochastic organization of genomes an independent class of genetic phenomena, or are they structurally correlated with some other regular phenomena of the molecular-genetic system? The purpose of this article is to demonstrate the results obtained by the author about the multiple hidden connections of the named numeric rules of the stochastic organization of genomes with the multilevel system of structured DNA alphabets and also with the binary-oppositional molecular features of the nucleotides A, T, C, and G. The presented results are based on the analysis of many genomic DNA sequences, whose initial data were taken from the GenBank (https://www.ncbi.nlm.nih.gov/genbank/). In particular, the set of the analyzed genomes includes the following: 1) all 24 human chromosomes, which differ in their length, the number and type of genes, etc.; 2) all chromosomes of the fruit fly Drosophila melanogaster, all chromosomes of the house mouse Mus musculus, all chromosomes of the nematode Caenorhabditis elegans, all chromosomes of the plant Arabidopsis thaliana, and many other plants; 3) 19 bacterial genomes of different groups, both from Bacteria and Archaea. In this article of limited volume, the stated results are demonstrated by numerical data regarding the single-stranded DNA sequence of the human chromosome №1, which contains about 250 million nucleotides A, T, C, and G.
2 Research Methods

As is known, there are DNA alphabets of 4 nucleotides, 16 duplets, 64 triplets, etc. (each such alphabet of n-plets consists of 4^n elements of length n). These alphabets are briefly termed “n-alphabets”. Each of the n-alphabets can be conveniently represented in the form of the appropriate square matrix from the tensor family of matrices [C, A; T, G]^(n), where (n) means the tensor (or Kronecker) power of the matrix of the nucleotides C, A, T, and G (see more details in [6–9, 13]). Figure 1 shows the first matrices of this family for the alphabets of 4 nucleotides, 16 duplets, and 64 triplets.
Fig. 1. The first members of the tensor family of matrices [C, A; T, G]^(n), containing the alphabets of 4 nucleotides, 16 duplets, and 64 triplets, respectively, in a strictly ordered form associated with the binary numbering of columns and rows.
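As a hedged illustration of how this tensor family can be generated programmatically, the following Python sketch replaces numeric multiplication in the Kronecker product by string concatenation (the function names are ours, not from the article):

```python
import numpy as np

BASE = np.array([["C", "A"], ["T", "G"]], dtype=object)

def str_kron(A, B):
    """Kronecker product in which 'multiplication' is string concatenation."""
    ra, ca = A.shape
    rb, cb = B.shape
    out = np.empty((ra * rb, ca * cb), dtype=object)
    for i in range(ra):
        for j in range(ca):
            for k in range(rb):
                for l in range(cb):
                    out[i * rb + k, j * cb + l] = A[i, j] + B[k, l]
    return out

def nplet_matrix(n):
    """The n-th member of the family: a 2**n x 2**n matrix of all 4**n n-plets."""
    M = BASE
    for _ in range(n - 1):
        M = str_kron(M, BASE)
    return M
```

For n = 2, nplet_matrix(2) reproduces the duplet matrix of Fig. 1, with the top row [CC, CA, AC, AA]; the binary numbering of the rows and columns follows from the two molecular oppositions described next.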
All columns and rows of these matrices are enumerated by binary numbers based on the following binary-oppositional molecular features of the nucleotides A, T, C, G: 1) two of these nucleotides are purines (A and G), and the other two (C and T) are pyrimidines, which gives the numerical representation C = T = 1, A = G = 0, used for the natural binary numeration of the matrix columns; 2) two of these nucleotides are keto molecules (T and G), and the other two (C and A) are amino molecules, which gives the numerical representation C = A = 1, T = G = 0, used for the natural binary numeration of the matrix rows. These alphabetical matrices allow convenient presentations of the results of the study of probabilities rules in genomic DNAs. For this study, the author proposed the following effective method. Any long DNA sequence is analyzed as a multi-linguistic bunch of n-texts, each of which is written in its own n-alphabet. For example, the 1-textual representation shows a DNA sequence as a text written in the alphabet of 4 nucleotides (such as C-A-G-G-T-A-…); the 2-textual representation shows the same DNA sequence in the form of another text written in the alphabet of 16 duplets (such as CA-GG-TA-…); the 3-textual representation shows the same DNA sequence in the form of a text written in the alphabet of 64 triplets (such as CAG-GTA-…); and so on. The author has calculated the percent of each of the kinds of n-plets in the corresponding n-textual representations of many genomic DNA sequences under n = 1, 2, 3, 5. For any analyzed DNA, the received percentage values of n-plets inside its n-texts are input into the appropriate cells of the family of n-alphabet matrices [C, A; T, G]^(n) (such as in Fig. 1). As a result, numeric matrices of percentage values of n-plets appear, which characterize the stochastic organization of the analyzed genomic DNA.
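A minimal sketch of this n-text calculation (our illustrative code; real genomic data would additionally require, e.g., filtering of undetermined 'N' symbols):

```python
from collections import Counter

def nplet_percentages(dna, n):
    """Split a DNA string into consecutive non-overlapping n-plets
    (the n-text) and return the fraction of each kind of n-plet."""
    nplets = [dna[i:i + n] for i in range(0, len(dna) - n + 1, n)]
    total = len(nplets)
    return {p: c / total for p, c in Counter(nplets).items()}

# e.g. nplet_percentages("CAGGTA", 2) -> {"CA": 1/3, "GG": 1/3, "TA": 1/3}
```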
Figure 2 shows such percentage matrices for the case of the DNA sequence of the human chromosome №1 (initial data of this DNA were taken from the GenBank: https://www.ncbi.nlm.nih.gov/nuccore/NC_000001.11).

Matrix of the 4 nucleotides (n = 1):

%C = 0.2085    %A = 0.2910
%T = 0.2918    %G = 0.2087

Matrix of the 16 duplets (n = 2), with rows and columns numbered 11, 10, 01, 00:

        11         10         01         00
11   0.05409    0.07274    0.05033    0.09504
10   0.07134    0.01031    0.07429    0.07137
01   0.06008    0.06312    0.04402    0.06008
00   0.09568    0.07286    0.05046    0.05419

(The original figure also contains the corresponding 8 × 8 matrix for the 64 triplets; its individual percentage values are listed in Fig. 6 below.)
Fig. 2. Matrices of percent of n-plets in n-texts of the DNA of the human chromosome №1 (n = 1, 2, 3). The arrangement of the n-plets percentage values inside matrices corresponds to the n-plets arrangement in the tensor family of matrices in Fig. 1.
3 Results

At first glance, the sets of phenomenological percentages in the different matrices in Fig. 2 have a chaotic character and are not numerically related to each other. In particular, the percentage of n-plets of the same letter composition depends on the sequence of letters and can differ several times: for example, %CG = 0.0103, while %GC = 0.0440. But the arrangement of these percentages in the cells of the n-plet matrices [C, A; T, G]^(n) allowed the discovery of a family of genomic rules (laws) for the stochastic organization of genomic DNAs. The rules relate to the conservation of sums within groupings of these percentages. For example, in these percentage matrices, any pair of rows enumerated by bit-reversed binary numbers (for example, 11 and 00, or 10 and 01) has, with high precision, the same percentage sum in each, although the individual percentages inside these rows differ significantly. Figure 3 illustrates these equalities of percentage sums in such rows from the matrices of 16 duplets and 64 triplets (from Fig. 2) and also of 256 tetraplets (from the preprint [14]) in the case of the human chromosome №1.
Percentage sums in the matrix rows of the 16 duplets:

Row 11: 0.05409 + 0.07274 + 0.05033 + 0.09504 = 0.27219
Row 00: 0.09568 + 0.07286 + 0.05046 + 0.05419 = 0.27319
Row 10: 0.07134 + 0.01031 + 0.07429 + 0.07137 = 0.22730
Row 01: 0.06008 + 0.06312 + 0.04402 + 0.06008 = 0.22731

Percentage sums in the matrix rows of the 256 tetraplets (individual summands omitted):

Row 1111: 0.0830   Row 0000: 0.0838
Row 1110: 0.0664   Row 0001: 0.0666
Row 1101: 0.0569   Row 0010: 0.0580
Row 1100: 0.0659   Row 0011: 0.0648
Row 1011: 0.0579   Row 0100: 0.0570
Row 1010: 0.0466   Row 0101: 0.0476
Row 1001: 0.0563   Row 0110: 0.0563
Row 1000: 0.0666   Row 0111: 0.0664
Fig. 3. The sums of percentages in the rows of the matrices of 16 duplets and 64 triplets (from Fig. 2) and also of 256 tetraplets in the considered case of the human chromosome №1. Pairs of rows with bit-reversed numerations are highlighted by bold frames. In each frame, both sums are practically the same, despite the strong difference in the values of the summands.
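The row rule is easy to verify programmatically. In the sketch below (our illustration), the "bit-reversed" partner of a row in the authors' sense is the row whose n-bit binary number is the bitwise inversion of the given one; with the ordering of Fig. 1, row index r pairs with row index 2^n − 1 − r:

```python
def paired_row_sums(P):
    """For a 2**n x 2**n percentage matrix P ordered as in Fig. 1, return
    (row index, partner index, sum, partner sum) for each pair."""
    size = P.shape[0]
    sums = P.sum(axis=1)
    return [(r, size - 1 - r, sums[r], sums[size - 1 - r])
            for r in range(size // 2)]
```

Applied to the duplet matrix of Fig. 2, this returns the pairs (0.27219, 0.27319) and (0.22730, 0.22731) shown in Fig. 3; applying it to the transposed matrix gives the column sums of Fig. 4.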
A similar phenomenological rule holds for pairs of matrix columns: in the considered percentage matrices, any pair of columns enumerated with bit-reversed binary numbers (for example, 11 and 00, or 10 and 01) has, with high precision, the same percentage sum in each column, although the individual percentages inside these columns differ significantly. Figure 4 illustrates this. The described phenomenological rules testify that the stochastic organization of genomic DNAs is closely related to features of the deterministic system of DNA n-alphabets. Such discovered interrelations between stochastic and deterministic features in the molecular-genetic system seem to be useful for a deeper understanding of the interrelations of stochastic and deterministic properties in the genetically inherited physiological systems mentioned above in the Introduction. The received results give new abilities to develop quantum-information modeling of genetic phenomena and to create new approaches for artificial intelligence systems. The described phenomenological rules of probabilities in the compositions of genomic DNAs, related to a system of molecular binary numerations of n-plets, draw additional attention to the fact that the binary-oppositional principle (Yin-Yang principle) plays a more essential role at different levels of genetic organization than was known earlier. The whole set of equality relations between the percentage groupings of n-plets in genomic DNA n-texts, revealed by the author, indicates that the stochastic organization of genomic DNAs is a highly limited stochastic organization with many internal numeric interrelations among the summary percentages of separate groupings. This limited stochasticity of genomic DNAs is manifested not only in the facts described above but also in other universal genomic rules of numerical interrelationships between the percentage characteristics of n-plet groupings in different n-texts of the same genomic DNA. Let us describe one more example of the limited stochasticity, related to pairs of mirror-complemented n-plets in single-stranded DNA. These are two n-plets that become identical if one of them is read in the opposite order with the simultaneous replacement of each of its nucleotides by the complemented nucleotide (A ↔ T, C ↔ G). For example, the triplet ACG is mirror-complemented to the triplet CGT (but not to TGC). The work [12] showed the existence of the following symmetry principle in long DNA. In any long DNA sequence of nucleotides (more than 50 000 nucleotides), one can take an arbitrary kind of n-plets (n = 2, 3, 4, 5, 6) and then calculate the total quantity of all such n-plets (including n-plets overlapping each other). It occurs that this DNA also contains approximately an equal quantity of mirror-complemented n-plets (including all mirror-complemented n-plets overlapping each other). In our research, we initiatively study DNA n-texts (under n = 2, 3, 4, …), which were not considered at all by other authors till now. One should note that each of the n-texts has no overlapping of its n-plets, which are separated from each other. Do such n-texts in genomic DNA (and also in sufficiently long DNA sequences in general) have a similar symmetry principle related to pairs of mirror-complemented n-plets? Yes, our study confirms the existence of a similar symmetry principle in single-stranded DNA sequences of eukaryotic and prokaryotic genomes. Figures 5 and 6 show some examples of this symmetry principle of n-plets of the single-stranded DNA of the human chromosome №1, using the above-presented percentage data from Fig. 2. Those n-plets which coincide with their mirror-complemented ones (CG, AT, GC, TA) are not shown here.
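A small sketch of the mirror-complement operation itself (our illustration, matching the ACG → CGT example above); combined with the n-text percentages computed earlier, it allows checking the equalities of Figs. 5 and 6 pair by pair:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def mirror_complement(nplet):
    """Read the n-plet in the opposite order and replace each nucleotide
    by its complement (A <-> T, C <-> G); e.g. 'ACG' -> 'CGT'."""
    return nplet[::-1].translate(COMPLEMENT)
```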
The same equality (or symmetry) principle also holds for the percentages of mirror-complemented tetraplets in the 4-textual representation of this chromosomal DNA.
Percentage sums in the matrix columns of the 16 duplets:

Column 11: 0.05409 + 0.07134 + 0.06008 + 0.09568 = 0.28119
Column 00: 0.09504 + 0.07137 + 0.06008 + 0.05419 = 0.28068
Column 10: 0.07274 + 0.01031 + 0.06312 + 0.07286 = 0.21903
Column 01: 0.05033 + 0.07429 + 0.04402 + 0.05046 = 0.21910

Percentage sums in the matrix columns of the 256 tetraplets (individual summands omitted):

Column 1111: 0.0982   Column 0000: 0.0980
Column 1110: 0.0670   Column 0001: 0.0671
Column 1101: 0.0555   Column 0010: 0.0556
Column 1100: 0.0604   Column 0011: 0.0602
Column 1011: 0.0555   Column 0100: 0.0555
Column 1010: 0.0476   Column 0101: 0.0476
Column 1001: 0.0490   Column 0110: 0.0488
Column 1000: 0.0668   Column 0111: 0.0674
Fig. 4. The sums of percentages in the columns of the matrices of 16 duplets and 64 triplets (from Fig. 2) and also of 256 tetraplets (from the preprint [14]) in the considered case of the human chromosome №1. Pairs of columns with bit-reversed numerations are highlighted by bold frames. In each frame, both sums are practically the same, despite the strong difference in the values of the summands.
% in duplet 1     % in duplet 2       % in duplet 1     % in duplet 2

%CC = 0.05409     %GG = 0.05419       %AC = 0.05033     %GT = 0.05046
%CA = 0.07274     %TG = 0.07286       %AA = 0.09504     %TT = 0.09568
%CT = 0.07134     %AG = 0.07137       %TC = 0.06008     %GA = 0.06008
Fig. 5. Percentage equalities for mirror-complemented duplets (denoted as 1 and 2), whose corresponding pairs are presented inside bold frames, in the case of the 2-textual representation of the DNA of human chromosome №1. All percent values are taken from Fig. 2.
% triplet 1        % triplet 2          % triplet 1        % triplet 2

%CCC = 0.01385     %GGG = 0.01382       %ACC = 0.01183     %GGT = 0.01185
%CCA = 0.01878     %TGG = 0.01895       %ACA = 0.01977     %TGT = 0.01988
%CCT = 0.01853     %AGG = 0.01848       %ACT = 0.01622     %AGT = 0.01614
%CCG = 0.00291     %CGG = 0.00291       %ACG = 0.00254     %CGT = 0.00259
%CAC = 0.01524     %GTG = 0.01534       %AAT = 0.02375     %ATT = 0.02388
%CAA = 0.01861     %TTG = 0.01884       %AAA = 0.03693     %TTT = 0.03725
%CAT = 0.01789     %ATG = 0.01781       %AAC = 0.01447     %GTT = 0.01445
%CTC = 0.01758     %GAG = 0.01756       %ATC = 0.01317     %GAT = 0.01327
%CTA = 0.01275     %TAG = 0.01284       %ATA = 0.01942     %TAT = 0.01939
%CGC = 0.00251     %GCG = 0.00253       %AGC = 0.01441     %GCT = 0.01437
%CGA = 0.00227     %TCG = 0.00233       %AAG = 0.01988     %CTT = 0.02009
%TCC = 0.01588     %GGA = 0.01600       %AGA = 0.02237     %TCT = 0.02226
%TCA = 0.01964     %TGA = 0.01947       %GCC = 0.01255     %GGC = 0.01256
%TAC = 0.01103     %GTA = 0.01115       %GCA = 0.01456     %TGC = 0.01457
%TAA = 0.01986     %TTA = 0.01981       %GAC = 0.00962     %GTC = 0.00956
%CAG = 0.02104     %CTG = 0.02088       %GAA = 0.01960     %TTC = 0.01972
Fig. 6. Percentage equalities for mirror-complemented triplets (denoted as 1 and 2), whose corresponding pairs are presented inside bold frames, in the case of the 3-textual representation of the DNA of human chromosome №1. All percent values are taken from Fig. 2.
The described equality principle for the percentages of mirror-complemented n-plets in single-stranded DNAs of eukaryotic and prokaryotic genomes additionally shows that the system of probabilities in n-texts of genomic DNAs has many special deterministic interconnections among the percentage values of its members. Briefly speaking, the stochastic organization of genomes is a special kind of very limited stochasticity, having internal deterministic-like regularities.
4 Concluding Remarks

The described results state the existence of the following hidden interrelations, which were unknown earlier:

– interrelations between the composition probabilities of n-plets of different n-texts in single-stranded DNAs of eukaryotic and prokaryotic genomes;
– interrelations between the described probabilities rules of genomic DNAs and the deterministic system of DNA n-plet alphabets having binary-oppositional (Yin-Yang) molecular characteristics of the deterministic type.

These results are useful for a deeper understanding of the genetically inherited bases of natural intelligence and for their use in creating artificial intelligence. They concern the themes of publications of many authors in different fields [15–20]. The discovered connections between the stochastic and deterministic features of the genetic system attract attention to the supposed participation of stochastic resonances in genetic processes. The concept of resonance is one of the long-known fundamental concepts of science. But relatively recently, in 1981, a new important concept of “stochastic resonance” entered science [21–23]. It reflects a physical phenomenon, the existence of which was confirmed in subsequent years by many authors. Stochastic resonance is usually described as the phenomenon of amplification of a periodic signal in nonlinear systems under the influence of noise of a certain power (https://en.wikipedia.org/wiki/Stochastic_resonance). It is accompanied by the transfer of some of the noise energy into signal energy, which increases the energy of the weak signal without suppressing the noise. Over the past time, this concept has proven its usefulness in various fields of knowledge and has taken a prominent place in modern science. In particular, it is actively used to simulate physiological phenomena [24, 25]. More than 13,000 publications have been devoted to the theory and applications of stochastic resonance, including a series of articles in the journal Nature. One of the fundamental biological questions is how the information recorded at the level of DNA molecules is amplified to dictate the ordered features of physiological macrostructures. The author develops a concept about an important role of genetic stochastic resonances in this amplification and ordering [14]. This concept takes into account special features of molecular-genetic systems, including the following: the chirality of genetic molecules; the universal stochastic features of genomes, existing against the background of the internal electromagnetic, vibrational, and acoustic noises of organisms; the well-known fact that an organism is a set of genetically encoded cyclic processes that respond to external periodic influences, for example, a change in solar activity in the daily cycle “day-night”; the non-linear system of structured alphabets of n-plets of DNA, etc.

Acknowledgments. The author is grateful to his colleagues E. Fimmel, M. He, Z. Hu, Yu.I. Manin, I. Stepanyan, V. Svirin, and G. Tolokonnikov for research assistance.
References

1. Chalancon, G., et al.: Interplay between gene expression noise and regulatory network architecture. Trends Genet. 28(5), 221–232 (2012). https://doi.org/10.1016/j.tig.2012.01.006
2. Horikawa, K., Ishimatsu, K., Yoshimoto, E., Kondo, S., Takeda, H.: Noise-resistant and synchronized oscillation of the segmentation clock. Nature 441(7094), 719–723 (2006). https://doi.org/10.1038/nature04861
3. Raser, J.M., O'Shea, E.K.: Noise in gene expression: origins, consequences, and control. Science 309(5743), 2010–2013 (2005). https://doi.org/10.1126/science.1105891
4. Yampolsky, L.Y., Scheiner, S.R.: Developmental noise, phenotypic plasticity, and allozyme heterozygosity in Daphnia. Evolution 48(5), 1715–1722 (1994). https://doi.org/10.2307/2410259
5. McFadden, J., Al-Khalili, J.: The origins of quantum biology. In: Proceedings of the Royal Society A, vol. 474, no. 2220, pp. 1–13 (2018)
6. Petoukhov, S.V.: Hyperbolic rules of the cooperative organization of eukaryotic and prokaryotic genomes. Biosystems 198, 104273 (2020)
7. Petoukhov, S.V., Svirin, V.I.: Stochastic rules in nucleotide sequences in genomes of higher and lower organisms. Int. J. Math. Sci. Comput. (IJMSC) 7(2), 1–13 (2021). https://doi.org/10.5815/ijmsc.2021.02.01
8. Petoukhov, S.V.: Algebraic harmony and probabilities in genomes. Long-range coherence in quantum code biology. Biosystems 209, 104503 (2021). https://doi.org/10.1016/j.biosystems.2021.104503
9. Petoukhov, S.V.: Algebraic rules for the percentage composition of oligomers in genomes. Preprints 2021, 2021010360, 3rd version, 84 p. (2021). https://doi.org/10.20944/preprints202101.0360.v3
10. Albrecht-Buehler, G.: Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions. In: Proceedings of the National Academy of Sciences USA, vol. 103, no. 47, pp. 17828–17833 (2006)
11. Chargaff, E.: Preface to a grammar of biology: a hundred years of nucleic acid research. Science 172, 637–642 (1971)
12. Prabhu, V.V.: Symmetry observation in long nucleotide sequences. Nucleic Acids Res. 21, 2797–2800 (1993)
13. Petoukhov, S.V., He, M.: Symmetrical Analysis Techniques for Genetic Systems and Bioinformatics: Advanced Patterns and Applications. IGI Global, Hershey (2010)
14. Petoukhov, S.V.: Tensor rules in the stochastic organization of genomes and genetic stochastic resonance in algebraic biology. Preprints 2021, 2021100093 (2021). https://doi.org/10.20944/preprints202110.0093.v1
15. Khan, R., Debnath, R.: Human distraction detection from video stream using artificial emotional intelligence. Int. J. Image Graph. Signal Process. (IJIGSP) 12(2), 19–29 (2020). https://doi.org/10.5815/ijigsp.2020.02.03
16. Erwin, D.R.N.: Improving retinal image quality using the contrast stretching, histogram equalization, and CLAHE methods with median filters. Int. J. Image Graph. Signal Process. (IJIGSP) 12(2), 30–41 (2020). https://doi.org/10.5815/ijigsp.2020.02.04
17. Mostakim, M.N., Mahmud, S., Jewel, M.K.H., Rahman, M.K., Ali, M.S.: Design and development of an intelligent home with automated environmental control. Int. J. Image Graph. Signal Process. (IJIGSP) 12(4), 1–14 (2020). https://doi.org/10.5815/ijigsp.2020.04.01
18. Arora, N., Ashok, A., Tiwari, S.: Efficient image retrieval through hybrid feature set and neural network. Int. J. Image Graph. Signal Process. (IJIGSP) 11(1), 44–53 (2019). https://doi.org/10.5815/ijigsp.2019.01.05
19. Anami, B., Naveen, N.M., Surendra, P.: Automated paddy variety recognition from color-related plant agro-morphological characteristics. Int. J. Image Graph. Signal Process. (IJIGSP) 11(1), 12–22 (2019). https://doi.org/10.5815/ijigsp.2019.01.02
20. Mahtab Ahmed, M.A.H., Akhand, M.M., Rahman, H.: Recognizing Bangla handwritten numerals utilizing deep long short term memory. Int. J. Image Graph. Signal Process. (IJIGSP) 11(1), 23–32 (2019). https://doi.org/10.5815/ijigsp.2019.01.03
21. Benzi, R., Sutera, A., Vulpiani, A.: The mechanism of stochastic resonance. J. Phys. A: Math. Gen. 14, 453–457 (1981)
22. Benzi, R., Sutera, A.: Stochastic resonance in two dimensional Landau Ginzburg equation. J. Phys. A: Gen. Phys. (2004). https://doi.org/10.1088/0305-4470/37/32/L01
23. Benzi, R.: Stochastic resonance: from climate to biology. Nonlin. Process. Geophys. 17(5), 431–441 (2010). https://doi.org/10.5194/npg-17-431-2010
24. Anishchenko, V.S., Neiman, A.B., Moss, F., Shimansky-Geier, L.: Stochastic resonance: noise-enhanced order. Phys. Usp. 42, 7–36 (1999). https://doi.org/10.1070/PU1999v042n01ABEH000444
25. Krauss, P., Tziridis, K., Schilling, A., Schulze, H.: Cross-modal stochastic resonance as a universal principle to enhance sensory processing. Front. Neurosci. 12, 578 (2018). https://doi.org/10.3389/fnins.2018.00578
Projecting an Educational Program for the Preparation of Bachelors in the Profile “Digital Economy”

Nataliya Mutovkina

Tver State Technical University, Tver 170012, Russia

Abstract. In the modern education system, taking into account the digitalization of various areas of people's lives and the competence approach that is still relevant today, there is an urgent need to form educational programs that meet the requirements of the labor market and the expectations of applicants. The article considers an optimization-methodological approach to designing an educational program to prepare a bachelor of the “Digital Economy” profile. This approach is currently being applied at Tver State Technical University to develop educational programs and curricula for the preparation of students at the bachelor's and master's levels. The result of the design is an educational program and a curriculum based on it. The set of disciplines is formed depending on the requirements of the selected professional standards and the professional competencies formulated based on these standards. The novelty of the approach is optimization design, a complex iterative process in which experts evaluate the conformity between the skills, knowledge, and other characteristics prescribed in professional standards and the academic disciplines that allow them to be developed. At each examination, an assessment of the consistency of expert opinions is carried out. The connecting link in the evaluation of correspondences is professional competencies. The experts are representatives of the teaching staff of the graduating department, the educational and methodological department of the educational organization, and representatives of regional employers. The significance of the proposed approach is that with its help, the most optimal combination and interrelation of academic disciplines and practices is achieved, which allows students to obtain all the necessary competencies by the end of their studies.

Keywords: Educational process · Educational program · Digital economy · Competencies · Expert assessments · Curriculum · Coordination
1 Introduction

In the context of the active development of the digital economy, professional training in this area is becoming relevant. However, the organization of education in the Russian education system is regulated by certain regulatory legal documents of the federal level. Thus, each educational organization forms educational programs for the preparation of students depending on the existing national state academic standards of higher education and professional standards. For example, in the higher education system of the Russian Federation, there is a Federal state educational standard in the direction of
“Economics” [1], but not “Digital Economy.” Therefore, the digital economy is considered here only as one of the possible profiles of student training, along with such profiles as “Accounting, analysis and audit,” “Finance and Credit,” etc. However, the “Digital Economy” profile is increasingly in demand owing to the implementation of the “Digital Economy of the Russian Federation” program [2]. This program defines the state policy to create the necessary conditions for the development of means, methods, and technologies for obtaining, storing, processing, and exchanging discrete digital information, which is now the main determining factor of the well-being and success of any person in all areas of business and a purposeful vector of the effectiveness of the production of goods and services needed by people and demanded by society in all spheres of socio-economic activity. However, forming an educational program in the “Digital Economy” profile does not have a single correct solution. The structure and content of the program are influenced by various factors, for example, regional characteristics, the material, technical, and staffing support of the educational organization, the demand for the profile among applicants, etc. Therefore, forming the educational program is transformed into the task of optimizing the preparation of students in the “Digital Economy” profile. In this regard, this article aims to develop an optimization-methodological approach to the design of an educational program for the preparation of a bachelor in the “Digital Economy” profile.

1.1 Bachelor's Degree Profile “Digital Economy”
Graduates of the bachelor's degree profile “Digital Economy” should have competencies that allow them to work at high-tech enterprises developing “end-to-end” information and communication technologies and managing digital platforms. Such enterprises are, as a rule, full-fledged participants in the global market, forming a system of “startups,” research projects, and industry units that support the development of the digital economy. Thus, graduates of the “Digital Economy” profile can find themselves in the profession as economists with the knowledge and skills to apply modern information and communication technologies, adapt them, if necessary, to the needs of an economic entity, and integrate existing technologies with new digital technologies in leading sectors of the economy: bio- and medical technologies, nanotechnology, robotics, cognitive high humanitarian technologies, and resource-saving environmental management. One of the leading elements of the digital economy is artificial intelligence. Russia has great potential in developing artificial intelligence and will soon show itself in this market as a strong player [3]. Furthermore, based on the integrative and interdisciplinary nature of the training of students at Tver State Technical University in the field of Economics, it is planned to introduce disciplines related to the use of artificial intelligence in economics into the curriculum as part of the new training profile. The digital economy tools allow markets to develop even in conditions of economic crisis, since these tools contribute to the prompt and flexible response of financial entities to changes in market conditions and better fulfillment of consumer requests. Furthermore, the training of students in the “Digital Economy” profile is perfectly consistent with the state program “Digital Economy of the Russian Federation” [2], since education and personnel constitute one of the priority areas of implementation outlined in the Program.
The training of new personnel should be treated as a central problem of the digitalization of the Russian economy. Therefore, the leading teachers of the Department of Accounting and Finance are involved in developing an educational program for the “Digital Economy” profile.

1.2 Literature Review and Regulatory Framework
The issues of designing educational programs have been considered and remain relevant for scientists in many countries of the world. For example, the optimization of the educational process through the introduction of a personalized e-learning system is shown in [4], and the article [5] is devoted to developing an educational program on e-business. Moreover, the problems and prospects of teaching the digital economy are raised in the works [6–10], and the possibilities of using information technology in teaching other courses, as well as the development of the digital economy, are discussed in the publications [11–15]; however, a method of forming an educational program in the “Digital Economy” profile taking into account the competence approach has not yet been presented in any work. The educational program in the “Digital Economy” profile should be developed following the requirements set out in such regulatory documents as:

1) The Federal Law “On Education in the Russian Federation” [16].
2) The Order of the Ministry of Education and Science of Russia “On approval of the Procedure for organizing and implementing educational activities for educational programs of higher education – bachelor's degree programs, specialty programs, master's degree programs” [17].
3) The Federal state educational standard of higher education – bachelor's degree in the field of training 38.03.01 Economics [1].
4) The program “Digital Economy of the Russian Federation” [2].

The total complexity of mastering the educational program for all forms of education is 240 credits. The term of education under the bachelor's degree program in full-time education is four years.
2 Methodology of the Formation of the Educational Program

2.1 Choosing Professional Standards
According to the competence approach to the design of educational programs, three groups of competencies are currently distinguished: universal, general professional, and professional. The competencies of the first two groups are fixed in the standard, but professional competencies are formulated by the developers of the educational program independently, based on the professional standards corresponding to the professional activities of graduates. The register of professional standards (the list of types of professional activity) is posted on the specialized website of the Ministry of Labor and Social Protection of the Russian Federation, “Professional Standards” [18]. One or more generalized labor
functions are allocated from each selected professional standard. In the absence of professional standards corresponding to the professional activity of graduates, professional competencies are determined based on the analysis of the requirements for professional competencies imposed on graduates in the labor market, generalization of domestic and foreign experience, consultations with leading employers, employers’ associations of the industry in which graduates are in demand, and other sources [1]. The Portal of Federal State Educational Standards of Higher Education [19] presents 43 professional standards in the field of Economics and finance. For the “Digital Economy” profile, the most appropriate standards are shown in Fig. 1.
Fig. 1. Choosing professional standards
The color highlights the standards taken as the basis for the design of the educational program for the “Digital Economy” profile at the Department of Accounting and Finance of Tver State Technical University. The choice is justified by the demand for such graduates in the regional labour market. Almost all experts agreed with this choice.

2.2 Analysis of Labour Functions
The selected standards are analyzed according to the characteristics of the labour functions presented in them. Each labour function is described using descriptors such as skills, knowledge, actions, and other features. Actions, as a rule, are unique and correspond to the names of the labour functions. Therefore, it is recommended to focus on the analysis of skills, knowledge, and other characteristics. The stages of the analysis are as follows (a sketch of one possible consistency check is given after the list):

1) Starting from the first labour function specified in the standard, each non-repeating skill, knowledge item, and characteristic is assigned a unique code. When viewing the following elements, the codes may be repeated. In this case, the formulation of the descriptor is analyzed, and a value from 0 to 1 is assigned. Zero or a value close to
zero means the absence of this descriptor in the labour function or, in the second variant, an insignificant match. One implies a repeat of the descriptor. If a new descriptor is encountered, then a new code is entered for it.

2) The sum of points is calculated for each descriptor (knowledge, skills, and other characteristics) and each labour function.

3) A linguistic description of the most common knowledge, skills, characteristics, and labour functions is derived.

4) An academic discipline (or several academic disciplines) is aligned by experts, ensuring the corresponding knowledge, skills, and characteristics among students. At this stage, the experts can act either together or independently of each other. In an independent assessment, it is necessary to check the consistency of the expert assessments. These methods are well developed and do not need additional analysis.
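The text leaves the particular consistency measure open. Purely as an illustrative assumption, one standard option is Kendall's coefficient of concordance W for a matrix of expert rankings (values of W close to 1 indicate highly consistent expert opinions):

```python
import numpy as np

def kendall_w(ranks):
    """Kendall's W for an (experts x items) matrix of ranks without ties:
    W = 12*S / (m**2 * (n**3 - n)), where S is the sum of squared
    deviations of the item rank totals from their mean."""
    m, n = ranks.shape
    totals = ranks.sum(axis=0)
    S = ((totals - totals.mean()) ** 2).sum()
    return 12.0 * S / (m ** 2 * (n ** 3 - n))
```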
2.3 Formulation of Professional Competencies
The linguistic description of the most common knowledge, skills, and characteristics is transformed into professional competencies. First, the competencies are formulated based on a semantic analysis of knowledge, skills, and features, the methodology of which is presented in [20]. Then the resulting formulations of the competencies are adjusted to take into account the key competencies of the digital economy. The key competencies include communication and cooperation in the digital environment, self-development in conditions of uncertainty, creative thinking, information and data management, and critical thinking in the digital environment [21]. When forming competencies, it is necessary to move from convergent thinking (narrowing the range of possibilities and searching for the one correct answer) to divergent thinking (the formation of complementary mental abilities associated with creative thinking, innovation, imagination, and ingenuity). The final list of professional competencies is drawn up in the form of a table.
Determination of Disciplines for Preliminary Inclusion in the Educational Program
Experts select the most significant academic disciplines, the study of which will contribute to the formation of the formulated professional competencies. First, the disciplines that are repeated most often in the educational program model are selected; then the list is continued with all other disciplines. At the same time, a semantic selection of academic disciplines is carried out. For example, it is advisable to combine the courses "Risk Theory" and "Risk Management Theory" into a single course called "Risk Management Theory in the Digital Economy." Likewise, the disciplines "Legal regulation in the digital economy," "Legislative system," and "Legal foundations in the information world" should be considered as one course with the general title "Legal regulation in the digital economy." Thus, a list of non-repeating disciplines is formed, the study of which will contribute to the formation of the previously defined professional competencies among students.
2.5 Final Selection of Disciplines and Formation of the Curriculum
When designing an educational program, it is necessary to consider the disciplines that contribute to the development of students' professional competencies as well as the competencies specified in the academic standard [1]. The universal competencies are presented in Table 1; they can also be found in other educational standards.
Table 1. Universal competencies

Category (group) of universal competencies | Code | Name of the competence
Systemic and critical thinking | UC-1 | Able to search for, critically analyze, and synthesize information and to apply a systematic approach to solving tasks
Development and implementation of projects | UC-2 | Able to determine the range of tasks within the set goal and choose the best ways to solve them based on existing legal norms, available resources, and limitations
Teamwork and leadership | UC-3 | Able to carry out social interaction and realize his or her role in the team
Communication | UC-4 | Able to carry out business communication in oral and written forms in the state language of the Russian Federation and in foreign language(s)
Cross-cultural interaction | UC-5 | Able to perceive the intercultural diversity of society in socio-historical, ethical, and philosophical contexts
Self-organization and self-development (including health care) | UC-6 | Able to manage his or her time and to build and implement a trajectory of self-development based on the principles of lifelong education
Self-organization and self-development (including health care) | UC-7 | Able to maintain the proper level of physical fitness to ensure full-fledged social and professional activities
Life safety | UC-8 | Able to create and maintain safe living conditions in everyday life and professional activities to preserve the natural environment, ensuring sustainable development of society, including in the event of a threat and occurrence of emergencies and military conflicts
Inclusive competence | UC-9 | Able to use basic defectological knowledge in social and professional spheres
Economic culture, including financial literacy | UC-10 | Able to make informed economic decisions in various areas of life
Civic position | UC-11 | Able to form an intolerant attitude to corrupt behavior
General professional competencies (Table 2) are also present in educational standards in areas related to Economics.
Table 2. General professional competencies

Code | Name of the competence
GPC-1 | Able to apply knowledge (at an intermediate level) of economic theory in solving applied problems
GPC-2 | Able to collect, process, and statistically analyze the data necessary for solving economic tasks
GPC-3 | Able to analyze and meaningfully explain the nature of economic processes at the micro and macro levels
GPC-4 | Able to offer economically and financially sound organizational and managerial solutions in professional activities
GPC-5 | Able to use modern information technologies and software tools in solving professional tasks
Therefore, the list of disciplines formed at the previous stage is supplemented with courses responsible for developing the student competencies presented in Tables 1 and 2. In addition, semantic links are established between the disciplines, which determine the order in which they are studied by semester and the duration of each of them.
3 Demo Example

3.1 Initial Data
The initial data for modeling the educational program are the professional standards listed above, the digital competencies, and the provisions of [21]. Modeling can be performed in any spreadsheet editor that supports mathematical, statistical, and logical calculations and object-oriented programming. In this case, Microsoft Excel is selected as the primary modeling environment. A fragment of the source data layout for the professional standard 08.001 is shown in Fig. 2.
Fig. 2. A fragment of the simulation of the educational program on the profile “Digital Economy”
The final coincidence scores are calculated on the same Excel sheet, and the corresponding academic disciplines are indicated. As can be seen even from the above fragment, some fields are repeated, so the transition to the selection of domains is carried out.

3.2 Designing the Structure of the Educational Program
The selection of academic disciplines is carried out. Stage 1: Using the built-in functions of Microsoft Excel, a list containing the unique names of disciplines is formed, and the frequency of their repetition in the model areas is established. Stage 2: A map of professional competencies is formed (Table 3).

Table 3. Professional competencies

Code | Name of the competence
PC-1 | Able to work with payment systems, analyze their functionality, and assess the risks of using payment systems
PC-2 | Able to apply various methods and tools for obtaining information and assess the reliability of the information received
PC-3 | Able to work with large amounts of information
PC-4 | Able to draw conclusions and prepare reports and presentations based on the results of information analysis, using modern information technologies
PC-5 | Able to assess the effectiveness of projects with features of the digital economy and the risks associated with them
PC-6 | Able to develop project implementation scenarios depending on various conditions of the internal and external environment
PC-7 | Able to predict the required amount of financing for digital economy projects
PC-8 | Able to use econometric methods to forecast the development of the digital market in the short, medium, and long term
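Stage 1 is carried out with Excel's built-in functions; the same unique-list-and-frequency step can be sketched in Python with pandas. The column name and the entries here are illustrative assumptions, not data from the model:

import pandas as pd

# Illustrative input: one row per (labour function, aligned discipline).
df = pd.DataFrame({"discipline": [
    "Risk Theory", "Risk Management Theory", "Econometrics",
    "Econometrics", "Payment Systems", "Econometrics"]})

# Unique discipline names and how often each repeats across the model.
freq = df["discipline"].value_counts()
print(freq)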
Stage 3: Based on the frequency distribution and the semantic comparison {S, K, O} ↔ ⟨PC⟩ ↔ {Disciplines} (skills, knowledge, and other characteristics mapped to professional competencies and then to disciplines), a list of disciplines to be included in the educational program and curriculum is formed (Fig. 3).
Fig. 3. A fragment of the formation of the final list of academic disciplines for professional competencies
3.3 Formation of the Curriculum
After establishing the academic disciplines responsible for developing students' professional competencies, the list is supplemented with courses that take the universal and general professional competencies into account. A fragment of the structure of the educational program, with the order and duration of the study of disciplines, is shown in Fig. 4.
Fig. 4. A fragment of the structure of the educational program on the profile “Digital Economy”
Next, the types of classes for each discipline (lectures, practical classes, laboratory classes) are designed; the number of hours for each kind of class and for students' independent work is determined; and the form of final assessment of students' knowledge (test or exam) is established.
4 Summary and Conclusion

Thus, the proposed optimization and methodological approach to the design of educational programs makes it possible to: 1) take into account the dynamism of the educational environment associated with the digitalization of society's life and, therefore, changes in preferences and requirements for the quality of education; 2) monitor changes in the labor market in terms of requirements for graduates of educational organizations; 3) take into account changes in the legislation regulating the educational sphere;
4) take into account the regional peculiarities of the educational system and of the educational organization in which new academic programs are planned to be introduced; 5) form precisely the list of academic disciplines and practices that fully meets the preferences, requirements, and changes listed above. As a result, the educational program and curriculum for the "Digital Economy" profile have been developed using the proposed optimization approach to design, and the characteristics of the main components of the educational program project are presented in tabular and graphical forms. The proposed method is the basis of a technology for forming digital skills among students and for making reasonable adjustments to the curriculum. The approach is based on statistical and expert methods of obtaining and evaluating information. An educational program designed using the proposed approach makes it possible to prepare modern students as people of a new digital formation. People with creative thinking and digital skills are more competitive and in demand in the labor market; this is the practical significance of the presented research results.
References

1. The Federal state educational standard of higher education – bachelor's degree in training 38.03.01 Economics, approved by order of the Ministry of Science and Higher Education of the Russian Federation dated 12.08.2020 N 954. http://fgosvo.ru/uploadfiles/FGOS%20VO%203++/Bak/380301_B_3_31082020.pdf. Accessed 02 Nov 2021
2. The program "Digital Economy of the Russian Federation," approved by the Decree of the Government of the Russian Federation dated 28.07.2017 No. 1632-r. http://static.government.ru/media/files/. Accessed 02 Nov 2021
3. Mutovkina, N.Yu., Tarasov, V.B.: Statistical approach to assessing the level of development of artificial intelligence methodology in Russia and abroad. In: Materials of the Conference "Artificial Intelligence: Problems and Solutions", pp. 178–185 (2018)
4. Agbonifo, O., Obolo, O.: Genetic algorithm-based curriculum sequencing model for personalised e-Learning system. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 10(5), 27–35 (2018)
5. Alghamdi, T., Jamjoom, A.: Developing e-business strategies curriculum case study in the information systems department. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 4(2), 1–7 (2012)
6. Shmelkova, L.V.: Personnel for the digital economy: a look into the future. Additional Prof. Educ. Country World 8(30), 1–4 (2016)
7. Kupriyanovsky, V., et al.: Skills in the digital economy and the challenges of the education system. Int. J. Open Inf. Technol. 5(1), 19–25 (2017)
8. Loginova, S.L.: The specifics of the digital economy in higher education. In: International Scientific Conference "Global Challenges and Prospects of the Modern Economic Development", pp. 1–12 (2019)
9. Korepin, V., Dorozhkin, E.M., Mikhaylova, A.V., Davydova, N.N.: Digital economy and digital logistics as new area of study in higher education. Int. J. Emerg. Technol. Learn. 15(13), 137–154 (2020)
10. Anisimova, N.Yu.: Analysis of the continuous training system in the digital economy. Bull. Inst. Friendsh. Peoples Caucasus 2(58), 28–35 (2021)
11. Li, C.-X.: Research and reflections on college mathematics teaching based on information educational technology. Int. J. Educ. Manag. Eng. (IJEME) 1(2), 43–47 (2011)
12. Alahmadi, E.S., Qureshi, M.R.J.: Induction of interactive methods to teach software engineering course. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 7(6), 43–49 (2015)
13. Niemelä, P., Helevirta, M.: K-12 curriculum research: the chicken and the egg of math-aided ICT teaching. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 9(1), 1–14 (2017)
14. Wiradinata, T., Antonio, T.: The role of curriculum and incubator towards new venture creation in information technology. Int. J. Educ. Manag. Eng. (IJEME) 9(5), 39–49 (2019)
15. Purnahayu, I., Eosina, P., Susetyo, B., Nurhayati, I.: Indonesian universities readiness in providing professional HR in geospatial information. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 12(3), 26–32 (2020)
16. Federal Law No. 273-FL dated 29.12.2012 "On Education in the Russian Federation" (2021). http://www.consultant.ru/document/cons_doc_LAW_140174/. Accessed 02 Nov 2021
17. Order of the Ministry of Education and Science of the Russian Federation No. 301 dated 05.04.2017 "On Approval of the Procedure for organizing and implementing educational activities for educational programs of higher education – bachelor's degree programs, specialty programs, Master's degree programs" (2020). http://www.consultant.ru/document/cons_doc_LAW_220229/. Accessed 02 Nov 2021
18. Professional standards. https://profstandart.rosmintrud.ru/obshchiy-informatsionnyy-blok/natsionalnyy-reestr-professionalnykh-standartov/reestr-professionalnykh-standartov/. Accessed 03 Nov 2021
19. Portal of Federal State Educational Standards of Higher Education. http://fgosvo.ru/docs/101/69/2/8. Accessed 03 Nov 2021
20. Karpagam, K., Saradha, A.: A mobile based intelligent question answering system for education domain. Int. J. Inf. Eng. Electron. Bus. (IJIEEB) 10(1), 16–23 (2018)
21. Spiridonov, O.V.: Accounting of digital technologies in professional standards. https://profstandart.rosmintrud.ru/upload/medialibrary/ff9/12.11.2020.pdf. Accessed 05 Nov 2021
RON Loss Prediction Based on Model of Light Gradient Boosting Machine

Haikuan Yang, Hua Yang, Junxiong Wang, Kang Zhou, and Bing Cai
Institute of Mathematics and Computer, Wuhan Polytechnic University, Wuhan 430023, China
Abstract. The emissions produced by gasoline combustion cause great pollution to the environment, and sulfur and olefin in the tail gas are the main pollutants. In the gasoline refining process, reducing the content of sulfur and olefin also reduces the octane number, which represents the enterprise's profit. To address the problem of large octane number loss in the gasoline refining process, this paper establishes a prediction model of octane number (RON) content based on PCA dimension reduction of the variables affecting RON content and the Light Gradient Boosting Machine algorithm. In the end, 283 data samples were selected for training and 122 samples for testing. Five forecasting methods, namely linear regression, deep neural network, decision tree, support vector machine, and gradient boosting decision tree, were compared with the Light Gradient Boosting Machine algorithm, and the results were visualized using Python and other tools. RMSE and MAE were the smallest among the results predicted by the Light Gradient Boosting Machine model, which showed good prediction accuracy and fit and revealed the nonlinear relationship between the variables and the octane number; this can help enterprises abate octane loss and increase profits.

Keywords: RON · Python · Light gradient boosting machine · PCA · Deep neural network · Linear regression · Decision tree
1 Introduction

Octane number (expressed as RON) is an important factor affecting gasoline combustion performance. The higher the octane number content of a gasoline, the better its combustion performance and the higher its economic value [1]. Every unit of octane number reduction is equivalent to a loss of about 150 yuan/ton. Taking a 1 million tons/year FCC gasoline plant as an example, if RON loss can be reduced by 0.3 units, the economic benefit will reach 45 million yuan [2]. However, in the process of desulfurization and olefin reduction, the octane number suffers a large loss, so reducing the octane number loss during gasoline refining can effectively improve the enterprise's income. Sun Jinfang et al. [3] proposed a research method based on a multi-layer perceptron neural network for the loss of gasoline in the refining process. Zheng Bin et al. [4] put forward a research method based on a random forest regression algorithm according
to the problem that it is difficult to detect the octane number of gasoline in refined oil sales enterprises. Zhou Wen [5] put forward a neural network regression method aimed at guaranteeing the octane number content of gasoline in the cleaning process. Wang Ningning [6] addressed the problem of octane number loss in the process of desulfurization and olefin reduction in enterprises, mined the data with a BP neural network, and established a prediction model of octane number loss. Models for predicting octane number with machine learning algorithms are generally divided into two types: linear models and nonlinear models. For example, linear regression is a linear model, while deep neural network (DNN), support vector machine regression (SVM), and decision tree are nonlinear models. The Light Gradient Boosting Machine is a basic machine learning algorithm with obvious advantages in processing small amounts of data: it can be used with low memory and also supports parallel learning. It is rare to predict octane number with a light gradient boosting machine algorithm. Based on this, in this study 325 variables affecting RON content were used, and 20 related variables were obtained by dimension reduction. With RON content as the dependent variable, a prediction model of RON content was established with the light gradient boosting machine regression algorithm.
2 Problem Modeling

2.1 Dimension Reduction by Principal Component Analysis
Many factors affect the loss of octane number, and the information carried by the different factors is complicated [7]. The key to establishing the model is to find, among numerous variables, the relevant variables that affect the octane number content. Dimension reduction is a mapping that can eliminate irrelevant variables in the data. Commonly used dimensionality reduction techniques include the Pearson correlation coefficient (PCC), the Spearman correlation coefficient (Rank IC), genetic algorithms (GA), principal component analysis (PCA), and so on [8]. In this paper, PCA is used to extract features and reduce the dimension of the data, and the variables related to the predicted values are found [9]. Following the basic idea of PCA and based on an analysis of the original data, this paper finds 20 comprehensive variables affecting octane number loss from among 354 variables. These new variables are independent of each other and reflect the total variance of the original variables to the maximum extent.
2.2 Light Gradient Boosting Machine
The light gradient boosting machine is a fast, distributed, high-performance gradient boosting framework based on the decision tree algorithm. LightGBM is a new member of the boosting family of models, provided by Microsoft. It is an efficient implementation of GBDT and mainly consists of a gradient-based one-side sampling (GOSS) algorithm and an exclusive feature bundling (EFB) algorithm [10]. Its advantages are faster training, lower memory usage, higher accuracy, support for parallel learning, the ability to handle large-scale data, and support for the direct use of categorical features [11]. In this study, LightGBM is used to predict octane number loss; the specific modeling steps are as follows: (1) After preprocessing the data set, the dimension of the data set is reduced to 20 related variables by PCA, and the related variables are normalized. (2) The training and test sets are divided: 70% of the data is used as the training set and 30% as the test set, to check the generalization ability of the model. (3) The LightGBM algorithm is used to build a prediction model to predict RON content. (4) RMSE and MAE are compared to evaluate the performance of the model.
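A minimal end-to-end sketch of steps (1)-(4), assuming the scikit-learn and lightgbm packages; the synthetic array stands in for the plant data set, and the parameter choices are illustrative rather than the paper's:

import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error
import lightgbm as lgb

# Illustrative stand-in for the plant data: 405 samples, 354 raw variables.
rng = np.random.default_rng(0)
X_raw = rng.normal(size=(405, 354))
y = rng.normal(loc=89.0, scale=1.0, size=405)   # synthetic RON values

# (1) Reduce to 20 components with PCA and normalize into (0, 1).
X = PCA(n_components=20).fit_transform(X_raw)
X = MinMaxScaler().fit_transform(X)

# (2) 70/30 train/test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.7, random_state=0)

# (3) Fit the LightGBM regressor (default-style parameters; the paper's
# hyperparameters are not reported here).
model = lgb.LGBMRegressor(n_estimators=200)
model.fit(X_tr, y_tr)

# (4) Evaluate with RMSE and MAE.
pred = model.predict(X_te)
print("RMSE:", mean_squared_error(y_te, pred) ** 0.5)
print("MAE:", mean_absolute_error(y_te, pred))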
2.3 Evaluation Indicators
According to the prediction effect of the model, the root mean square error, the mean absolute error, and the running time are compared. Their specific meanings are as follows.

2.3.1 Root Mean Square Error
The root mean square error (RMSE) in mathematical statistics is the square root of the expected value of the squared difference between the estimated and true parameter values. RMSE is a convenient measure of the "average error" and evaluates the deviation of the predicted values from the true values. The smaller the RMSE value, the closer the experimental data obtained by the prediction model are to the true values. RMSE is very sensitive to a group of extra-large or extra-small errors in a measurement, so it reflects the accuracy of measurement well and is widely used in engineering measurement. The calculation formula is as follows:

RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (R_i - P_i)^2}    (1)

where N represents the number of samples, R_i represents the true octane number of sample i, and P_i represents the predicted octane number of sample i.

2.3.2 Mean Absolute Error
The mean absolute error (MAE) is the average of the absolute values of the deviations between all single observed values and the arithmetic mean [13]. The mean absolute error avoids the problem of positive and negative errors canceling each other out, so it accurately reflects the degree of deviation of the predictions from the true values; because absolute values are taken, there is no cancellation of positive and negative terms, and the metric better reflects the actual error of the measured values. The calculation formula is as follows:
MAE = \frac{1}{N} \sum_{i=1}^{N} \lvert R_i - P_i \rvert    (2)
where N represents the number of samples, R_i represents the true octane number of sample i, and P_i represents the predicted octane number of sample i.

2.3.3 Running Time
The running time of the model is the time from the beginning of training to the completion of testing:

T_{run} = T_{end} - T_{start}    (3)

where T_{run} is the actual time the program runs, T_{start} is the time when the predictive model starts executing, and T_{end} is the time when it finishes executing.
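The three indicators translate directly into Python; in this sketch, time.perf_counter is an implementation choice for measuring the running time, not the paper's tooling:

import time
import numpy as np

def rmse(r, p):
    # Formula (1): root mean square error.
    return float(np.sqrt(np.mean((np.asarray(r) - np.asarray(p)) ** 2)))

def mae(r, p):
    # Formula (2): mean absolute error.
    return float(np.mean(np.abs(np.asarray(r) - np.asarray(p))))

t_start = time.perf_counter()
# ... train and test the model here ...
t_end = time.perf_counter()
t_run = t_end - t_start   # Formula (3): running time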
3 Method and Analysis

3.1 Preprocessing of Data
In the original data set most of the data are normal, but some variables have null values, missing data, etc., so the data must be processed before use. Different preprocessing methods are adopted for different data anomalies: (1) When all the collected values of a data sample are null, the whole row of data is deleted. (2) For variables with many missing data points, the whole column of the variable is deleted. (3) When some indicator values are null, the null value is replaced by the average value of the variable. (4) According to the technical requirements and operating experience of gasoline catalytic cracking, the value range of each index affecting the octane number is summarized; by judging whether the variables fall within the specified range, sample points outside the range are deleted by the maximum-minimum clipping method. (5) Outliers are removed using the Pauta criterion. Pauta criterion: it is assumed that the observed variable is measured with the same accuracy, giving the initial data x_1, x_2, ..., x_n. First the arithmetic mean \bar{x} of the observations is calculated, then the residuals v_i = x_i - \bar{x} (i = 1, 2, ..., n), and then the standard error \sigma according to Bessel's formula (4). For a data point x_m, one judges whether its residual lies within the 3\sigma range: if v_m is within 3\sigma, the data point is considered normal; otherwise, |v_m| = |x_m - \bar{x}| > 3\sigma, the data point contains a gross error and should be deleted [12]. Bessel's formula is as follows:
\sigma = \left[ \frac{1}{n-1} \sum_{i=1}^{n} v_i^2 \right]^{1/2} = \left[ \frac{\sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 / n}{n-1} \right]^{1/2}    (4)
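A sketch of the Pauta (3-sigma) rule built on formula (4), assuming the variable is given as a one-dimensional numpy array; ddof=1 reproduces Bessel's formula:

import numpy as np

def pauta_filter(x):
    """Keep only points within 3 sigma of the mean (Pauta criterion)."""
    x = np.asarray(x, dtype=float)
    residuals = x - x.mean()          # v_i = x_i - x_bar
    sigma = x.std(ddof=1)             # Bessel's formula, Eq. (4)
    return x[np.abs(residuals) <= 3 * sigma]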
3.2 Steps of Dimension Reduction by Principal Component Analysis
The steps of PCA dimension reduction are as follows: (1) Start from the original data: a matrix X with m rows and n columns. (2) Find the sample mean. (3) Center each column of X (each column representing an attribute) by subtracting that column's average value. (4) Find the covariance matrix:

Cov = \frac{1}{m} X^{T} X    (5)

(5) Find the eigenvalues of the covariance matrix and the corresponding eigenvectors. (6) Arrange the eigenvectors as rows of a matrix, ordered from top to bottom by decreasing eigenvalue, and take the first k rows to form a matrix P. (7) Y = PX is the data after reduction to dimension k.
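The seven steps translate directly into numpy. This sketch keeps the paper's convention Cov = (1/m) X^T X on column-centered data; the shapes (405 x 354 reduced to 20) mirror the text, but the random input is only a placeholder:

import numpy as np

def pca_reduce(X, k):
    """Reduce X (m samples x n features) to k dimensions, steps (1)-(7)."""
    Xc = X - X.mean(axis=0)                 # steps (2)-(3): center columns
    cov = Xc.T @ Xc / X.shape[0]            # step (4): Eq. (5)
    eigvals, eigvecs = np.linalg.eigh(cov)  # step (5): symmetric matrix
    order = np.argsort(eigvals)[::-1]       # step (6): sort by eigenvalue
    P = eigvecs[:, order[:k]].T             # top-k eigenvectors as rows
    return Xc @ P.T                         # step (7): Y = PX, samples x k

X = np.random.default_rng(0).normal(size=(405, 354))
Y = pca_reduce(X, 20)
print(Y.shape)   # (405, 20)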
3.3 Prediction of Octane Number Using LightGBM Model
(1) Linear model and nonlinear model. Due to the diversity of oil refining equipment and the complexity of the process, there is a highly nonlinear relationship among the operating variables; there are relatively many main operating variables, and their relationship to octane number (RON) loss is relatively complex. If a linear prediction model (such as linear regression) is adopted, it may not meet the needs of the modeling, and it is difficult to achieve the desired results. Therefore, we give priority to a nonlinear prediction model for octane number (RON) loss. (2) General machine learning prediction model and deep learning prediction model. Based on the efficient organization and parameter fitting ability of machine learning algorithms, we adopt a machine learning method to solve the nonlinear modeling problem. There are 405 records in the data set; after dimension reduction, the number of feature variables is reduced from 367 to 20, and the amount of data is relatively small. Using a deep learning method can easily lead to repeated training and over-fitting, which affects the prediction effect, and it requires relatively more computing resources and running time. Therefore, we use an ordinary machine learning algorithm to construct the prediction model of octane number loss. The Light Gradient Boosting Machine is a typical representative of common machine learning prediction models; it was proposed by Microsoft Corporation in 2017. A large number of studies have shown that it trains faster and achieves higher accuracy than traditional machine learning prediction models, can be used with low memory, and supports parallel learning. According to the characteristics of the sample data, and through comprehensive analysis, the LightGBM algorithm is selected to construct the octane number loss prediction model. Different machine learning algorithms have different prediction effects, and a single algorithm model cannot by itself demonstrate the advantages and disadvantages of the prediction effect. Therefore, this paper chooses different algorithms to predict the octane number (RON) loss and makes a comparative analysis, further illustrating the superiority of the LightGBM model over other models. We use the following five methods for comparison with LightGBM: (1) linear regression; (2) deep neural network; (3) decision tree; (4) support vector machine regression; (5) gradient boosting decision tree.
The first comparison method is mainly used to verify that a linear model is not suitable for this data set, while the last four methods are used to further verify the effectiveness of LightGBM according to the actual prediction effect. The specific experimental steps are as follows: (1) According to the prediction requirements of the model, preprocess the data set, divide the corresponding training and test sets, and set appropriate model parameters. (2) Put the training set data into the six selected models for training, and obtain the corresponding trained octane number (RON) loss prediction models. (3) Put the test set into the models for testing, select appropriate evaluation indexes to evaluate and verify the models, and visualize the comparison results.

3.3.1 Model Framework
After the data set is reduced and the appropriate model is selected, we establish, solve, and verify the octane number loss prediction model in three steps. (1) Preprocessing the reduced data set includes three steps: completeness analysis, normalization, and data set division. (2) The octane number (RON) loss prediction model based on LightGBM is trained using the training set data, and the test set data is then predicted to obtain the corresponding prediction results. (3) After training and testing the five comparison methods and obtaining all predicted results, the results are analyzed according to RMSE (root mean squared error), MAE (mean absolute error), and running time [13].
3.3.2 Data Analysis and Pretreatment
After dimensionality reduction we have 405 sample records, each containing 20 variables and one octane number (RON) loss value. As these are real data samples collected from the cracking gasoline refining unit and have already been processed, there is no need to clean the data again. We only need to analyze and preprocess them according to the data characteristics and the input format of each subsequent model, as follows: (1) Data integrity analysis. Because there may be null data in the data set, we calculate the data integrity of each variable with the following formula:

I_i = \frac{N_i}{Sum_i}    (6)
where I_i is the completeness of variable i, N_i is the number of non-empty values of variable i, and Sum_i is the number of samples of variable i. Using this formula, the data integrity of each variable can be calculated. (2) Normalization. Because the value ranges of the different variables differ, each one-dimensional variable is normalized to facilitate the subsequent training of the algorithms. After normalization the data can be conveniently processed and trained, which speeds up the running of the program without affecting the characteristics of the data [14]. There are two forms of normalization: changing numbers into decimals between (0, 1), or changing dimensional expressions into dimensionless expressions [15]. For convenience of data processing, this paper uses the former method to map the data into the range (0, 1), with the following formula:

x_{normalization} = \frac{x - Min}{Max - Min}    (7)
(3) Data set partition. 70% of the data set is selected as the training set and 30% as the test set.
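A compact sketch of steps (1)-(3) for a pandas DataFrame; the sequential 70/30 split is a simplification, since the paper does not state how the rows are shuffled:

import pandas as pd

def preprocess(df: pd.DataFrame):
    # (1) Integrity I_i = N_i / Sum_i for every variable, Eq. (6).
    integrity = df.notna().sum() / len(df)
    # (2) Min-max normalization into (0, 1), Eq. (7).
    norm = (df - df.min()) / (df.max() - df.min())
    # (3) 70/30 split: first 70% train, remainder test (illustrative).
    cut = int(0.7 * len(norm))
    return integrity, norm.iloc[:cut], norm.iloc[cut:]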
4 Simulation

4.1 Comparison of Experimental Results Between Different Models
The experimental prediction results of the six models are shown in Figs. 1, 2, 3, 4, 5, 6 and 7.
Fig. 1. Prediction results of linear regression
Fig. 2. Prediction results of deep neural network
In Fig. 1, the linear regression algorithm was used, and the predicted values deviated too much from the real values. In Fig. 2, the deep neural network [13] was used, and the predicted values deviated significantly from the real values. Figures 3, 4, and 5 show the predictions of the decision tree, support vector machine, and gradient boosting decision tree algorithms, respectively; the predicted values are close to the real values, and the prediction results are good. In Fig. 6, the light gradient boosting algorithm was used to predict the RON value; the predicted values are very close to the real values, and the prediction effect is better than that of the other algorithms.
Fig. 3. Prediction results of decision tree
Fig. 4. Prediction results of SVR
Fig. 5. Prediction results of GBDT
Fig. 6. Prediction results of LightGBM
4.2 Error Comparison Between Different Models
According to the experimental results, line graphs of the root mean square error, mean absolute error, and running time of the different models are drawn, as shown in Fig. 7. It can be seen that the root mean square error and mean absolute error of linear regression are relatively large; those of the deep neural network, decision tree, support vector machine regression, and gradient boosting tree are relatively low; and those of the light gradient boosting machine are the smallest. The running times of the six algorithms are roughly identical.

Fig. 7. Comparison of experimental results

The experimental results of the six prediction methods are shown in Table 1. The table lists the root mean square error, mean absolute error, and running time of each method.

Table 1. Statistical table of experimental results

Method | Root mean square error | Mean absolute error | Run time
Linear regression | 164.0372 | 44.5632 | 0.00099
DNN | 5.275955 | 4.1333 | 0.0
Decision Tree | 0.6517 | 0.4344 | 0.0
SVM | 0.5796 | 0.5192 | 0.01004
GBDT | 0.5194 | 0.4577 | 0.000997
LightGBM | 0.4602 | 0.4225 | 0.0

To obtain these results, we put the training set into each model for training; after training, we input the test set into each model and compare the final results with the real values to calculate the root mean square error, mean absolute error, and running time. The smaller the values of the three evaluation indexes, the better the predicted results. All the experimental results are shown in Table 1, and the comparison of error results is shown in Fig. 7. At the same time, to compare the prediction results of the models more intuitively, we take the octane number loss value of each sample as the abscissa and the corresponding predicted octane number loss as the ordinate, draw the scatter diagram corresponding to each model (Figs. 1, 2, 3, 4, 5 and 6), and analyze the training and testing effects of each model by displaying the octane number loss of the training set, the test set, and the original data. According to the above charts:
The prediction effect of the linear regression (LR) method is far worse than that of the nonlinear prediction models, but its model is not complicated and fewer parameters need to be trained, so its running time is minimal. The experimental results further verify that the linear regression model is not suitable for this data set. The root mean square error of the LightGBM method is 0.4602 and its mean absolute error is 0.4225, the smallest among all models, while its running time is at a medium level; it is thus superior to the other models in both prediction effect and running time. The experimental results further verify that the LightGBM model is suitable for octane number prediction.
5 Conclusions

Aiming at the problem of large octane number loss in the desulfurization and olefin-reduction process of oil refining enterprises, this study processed, cleaned, and normalized the octane number data of petrochemical enterprises and used the light gradient boosting machine algorithm together with five comparison algorithms (LR, DNN, DT, SVM, and GBDT). When predicting octane number content, the light gradient boosting machine obtained the lowest root mean square and absolute errors and the highest prediction accuracy among the six models. The model constructed in this study provides a way to overcome the modeling difficulties caused by influencing factors such as complex raw material composition, helps ensure that an enterprise can reduce the loss of octane number in the process of desulfurization and olefin reduction, and is thus conducive to improving the profitability of the enterprise. In the future, it can help enterprises find the variables affecting octane number content and then reduce the loss of octane number.

Acknowledgment. This project is supported by the university scientific research project "Multi-objective intelligent optimization algorithm and its application in pork emergency storage and transportation," project number 2020Y20.
References

1. Wei, W., Zefei, L., Yan, H.: Selection of gasoline blending octane number model based on oil properties. J. Petrol. (Petrol. Process.) 22(006), 39–44 (2006)
2. Sun, J., Zhiwen, W., Wang, K., et al.: Octane number prediction analysis based on principal component dimension reduction and multilayer perceptual neural network. J. Guangxi Univ. Sci. Technol. 32(3), 7 (2021)
3. Wang, N.: Construction of prediction model for octane number loss of gasoline based on BP neural network. Intell. Comput. Appl. 11(02), 76–79 (2021)
4. Wen, Z.: Study on octane number loss of gasoline based on neural network regression model. Light Ind. Sci. Technol. 37(06), 69–70 (2021)
5. Liu, C., Wei, J., Huang, T.: Looking for the main variables in the model of reducing octane number loss in the gasoline refining process. J. Chifeng Univ. (Nat. Sci. Ed.) 36(12), 7–11 (2020)
6. Jiang, W., Tong, G.: Construction and analysis of gasoline octane number loss prediction model based on improved PCA-RFR algorithm. J. Petrol. (Petrol. Process.), 1–9 (2021)
7. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18(7), 1527–1554 (2006)
8. Rahmani, B., Javadi, S., Shahdany, S.: Evaluation of aquifer vulnerability using PCA technique and various clustering methods. Geocarto Int. 36(18), 2117–2140 (2021). https://doi.org/10.1080/10106049.2019.1690057
9. Kim, G.B., Hwang, C.I., Choi, M.R.: PCA-based multivariate LSTM model for predicting natural groundwater level variations in a time-series record affected by anthropogenic factors. Environ. Earth Sci. 80(18) (2021)
10. Duan, S., et al.: LightGBM low-temperature prediction model based on LassoCV feature selection. Math. Probl. Eng. (2021)
11. Zhang, Y., Chen, R., Xu, C., Yang, G., Lu, X., Fang, K.: Station environmental temperature prediction based on LSTM-LightGBM model. Comput. Measur. Control, 1–11 (2021)
12. See, C.S., Luong, G.K., Robin, C.Y.H., Yong, S.K.: Coupling normalization with moving window in backpropagation neural network (BNN) for passive microwave soil moisture retrieval. Int. J. Comput. Intell. Syst. 14(1) (2021)
13. Islam, A., Redoun Islam, S.M., Rafizul Haque, S.M., Islam, M., Khan, M.: Rice leaf disease recognition using local threshold based segmentation and deep CNN. Int. J. Intell. Syst. Appl. 13(5), 35–45 (2021). https://doi.org/10.5815/ijisa.2021.05.04
14. Gautham, S.K., Koundinya, A.: CNN-based security authentication for wireless multimedia devices. Int. J. Wirel. Microw. Technol. 11(4), 1–10 (2021). https://doi.org/10.5815/ijwmt.2021.04.01
15. Gustisyaf, A., Sinaga, A.: Implementation of convolutional neural network to classification gender based on fingerprint. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 13(4), 55–67 (2021). https://doi.org/10.5815/ijmecs.2021.04.05
Application of Emergency Grain Transportation Based on Improved Ant Colony Algorithm

Yongqing Zhu, Hua Yang, Li Feng, Haikuan Yang, and Kang Zhou
Institute of Mathematics and Computer, Wuhan Polytechnic University, Wuhan 430023, China
Abstract. China is a large grain country. In recent years, with the improvement of China's agricultural science and technology, grain output has grown larger and larger, and grain has become a necessity of people's lives. At the same time, the development of the transportation industry has promoted grain transportation. However, in the process of grain transportation there are often problems of wasted transportation time and cost. A large number of research results show that the ant colony algorithm works well for logistics transportation problems. Recent studies have shown that feature migration can speed up the convergence of the ant colony algorithm, while matching learning can increase the diversity of its solutions; this has been applied successfully in the TSP problem model. This paper addresses the problem by adding selection, crossover, and mutation operations to the ant colony algorithm. The paper first introduces the background, improvements, and applications of the ant colony algorithm, then introduces the CVRP model and uses the obtained data for modeling. Finally, MATLAB is used to run the two algorithms, and the experimental results are evaluated through the final shortest path and the average of all shortest paths. Compared with the traditional ant colony algorithm, the improved algorithm converges faster, and the stability of the algorithm and the quality of the solutions are also improved. To remedy the shortcomings of the ant colony algorithm, this paper incorporates a genetic algorithm into it and proves experimentally that the improved algorithm has better convergence, stability, and solution quality; this is the innovation of this article.

Keywords: Grain transportation · Ant colony algorithm · Experiment results
1 Introduction

In recent years, China's transportation has become more and more convenient. This promotes the development of the logistics and transportation industry, so various commodities can reach thousands of households through logistics transportation, which brings great convenience to people's lives.
Logistics and transportation have played a very important role, especially in disasters such as earthquakes and floods. Such situations are urgent and require reaching the disaster area as quickly as possible. In the recent emergency of the pneumonia epidemic, people were isolated at home and could not go out to buy daily necessities; vehicles were required to transport daily necessities to each community, and reaching the communities quickly was very important. We can treat the above problems as a VRP problem. A large number of studies have shown that the ant colony algorithm is very effective at solving the VRP problem, so the introduction first presents the ant colony algorithm.

1.1 Ant Colony Algorithm
The Italian scholar Marco Dorigo first proposed the ant colony algorithm in 1991. Early scholars used the ant colony algorithm to study the TSP problem; later, a large number of scholars used the TSP problem to test various improvements of the ant colony algorithm. The ant colony algorithm is now well applied in path optimization, vehicle scheduling, and image coloring problems. By observing and studying the behavior of ants in nature, scientists found that each ant releases pheromone as it moves, and the ants communicate through pheromone. The shorter the path, the higher the pheromone concentration, and the more likely the ants are to choose that path; in this way, the shortest path can eventually be found.

1.2 Shortcomings and Improvements of the Ant Colony Algorithm
However, because ants have a memory function, they easily move to places with high pheromone concentration, which makes it difficult for the algorithm to generate new solutions after a certain number of cycles. To solve this local-optimum problem, scholars began to study the ant colony algorithm, and some research results have been obtained. Guo Yongmei and others studied the open-loop vehicle routing problem with time windows in the context of emergency logistics. They took the difference in road traffic performance and the priority of material transportation as constraints and, considering the poor convergence and local-optimum tendencies of traditional optimization algorithms, used an artificial fish swarm algorithm to improve the ant colony algorithm, which improved the quality of its solutions. Experiments proved that the improved ant colony algorithm has significant advantages in solution convergence and accuracy [1]. Zhang Ran and others proposed a multi-objective-optimization Q-learning quantum ant colony routing algorithm. The algorithm comprehensively considers the average trust value and the cost of the path in the transition probability for selecting the next node. In the path cost function, they added quantum computing to avoid falling into a local optimal solution, and by introducing the idea of Q-learning they mapped the pheromone to the Q value, which speeds up convergence. Simulation results show that the improved algorithm improves performance indicators such as packet delivery rate, average end-to-end delay, and average node energy consumption [2]. Ning Tao and others adopted two-stage modeling: the dynamic VRP problem is converted into a static VRP problem, and the path and pheromone are updated with an improved quantum revolving gate. Mathematical software was used to prove that the improved ant colony algorithm is effective [3]. Ren Gang and others proposed an ant colony algorithm based on genetic operators; by adding the crossover and mutation operations of the biological genetic process, it expanded the search range of the traditional ant colony algorithm and produced better solutions [4]. Sun Lijuan used the TSP problem to study the ant colony algorithm. She combined a genetic algorithm with MMAS: the genetic algorithm optimizes four control parameters of the ant colony algorithm, while MMAS controls the range of the pheromone and improves the path-finding behavior, so the global search ability is improved [5]. Zhang Danxia studied the logistics network of Henan Province; by improving the heuristic function of the traditional ant colony algorithm, the accuracy of the search was improved, and the local-optimum problem was effectively avoided [6]. Cheng Liang and others proposed an ant colony algorithm combining the roulette algorithm with the 2-opt optimization algorithm to solve CVRP problems; experiments proved that the improved algorithm improves performance and solution quality [7]. G. Narendrababu Reddy and others used an improved ant colony optimization algorithm to study the scheduling of cloud tasks. They perform the multi-objective task scheduling process by allocating a pheromone amount corresponding to the efficiency of each virtual machine. Experiments show that the improved algorithm reduces the maximization and imbalance degree relative to the traditional ACO algorithm and improves the performance of task scheduling [8].
1.3 Application of the Ant Colony Algorithm
To solve the problem of insufficient vehicles in the early stage of emergency transportation, Li Zhuo and others adopted a transportation mode of joint distribution by a company's own vehicles and third-party leased vehicles to study the combined optimization problem of hybrid vehicle routing. They used the heuristic strategy of the ant colony algorithm and the pheromone positive-feedback mechanism to generate progeny populations, used a non-dominated sorting strategy to guide the algorithm's multi-objective selection process, and expanded the search space by introducing variable neighborhood descent search. Experiments show that the algorithm performs well on problems of different scales and different distribution types [9]. Chen Tianheng and others used the ant colony algorithm to establish linkage monitoring logic between camera monitoring devices; experiments proved that the optimal visual presentation of the current state of the observed object can be obtained quickly and robustly [10]. He Wen and others used the ant colony algorithm to improve a BP neural network: first, the factors influencing radiation intensity are analyzed and the optimal influencing factors are selected; then the ant colony algorithm is used to train the weights and thresholds to predict the radiation intensity. The results show that this method effectively improves the prediction accuracy of radiation intensity [11]. To use the grid effectively, Saeed Molaiy and others researched a job scheduling algorithm that can better allocate jobs to grid resources, proposing a new algorithm based on the ant colony optimization (ACO) metaheuristic. The experimental results show that the ACO algorithm optimizes the total response time and improves the utilization rate [12]. In summary, the ant colony algorithm has good applications in different fields, but it also has some shortcomings. Researchers have proposed various improvement methods: they mainly use other algorithms to optimize the search space, use parameters to improve the performance of the algorithm, or improve the ant colony algorithm through the pheromone and the heuristic function. The application model of this article is emergency grain transportation; for convenience of calculation, we regard this model as a CVRP problem.
2 VRP Problem Description and Experimental Model Hypothesis

2.1 VRP Problem Introduction
The VRP problem is also called the vehicle routing problem. In 1959, Dantzig and Ramser first raised the VRP problem; their model was a logistics transportation model for gasoline delivery to gas stations, and they also proposed a corresponding solution algorithm. In 1964, Clarke and Wright proposed an effective heuristic algorithm. After the publication of these two papers, the VRP problem became a hot topic in operations research and combinatorial optimization. After nearly 50 years of research, the VRP problem has broad application prospects in logistics transportation, express delivery, and takeaway services. The research model in this paper belongs to the vehicle routing problem with capacity constraints. The optimization goal is dual-objective: the shortest path and the least cost. The vehicle cost consists of two parts, the fixed cost of the vehicle and the driving cost; the main constraints are the maximum load and the transportation distance. When the constraints are met, a vehicle starts from a warehouse and passes through all demand points, and the goal is to minimize the total transportation cost and transportation distance. Referring to Yuan Wentao's paper [13], we establish the following mathematical model, in which the cost objective combines the fixed dispatch cost F with the distance-proportional driving cost (weight w):

Objective functions:  min Z_1 = F + w \sum_{i} \sum_{j} c_{ij} x_{ij},  min Z_2 = \sum_{i} \sum_{j} c_{ij} x_{ij}

subject to:

\sum_{i} q_i y_{ki} \le D    (1)

\sum_{k} y_{ki} = 1    (2)

\sum_{i} x_{ijk} = y_{kj}    (3)

\sum_{j} x_{ijk} = y_{ki}    (4)

\sum_{i \in S} \sum_{j \in S} x_{ijk} \le |S| - 1    (5)

x_{ijk} and y_{ki} take only the values 0 or 1 (in the usual notation, x_{ijk} = 1 if vehicle k travels from i to j, and y_{ki} = 1 if customer i is served by vehicle k). The above are the objective function and constraint conditions of the CVRP problem; F indicates the fixed cost of each dispatched vehicle. The optimization goal is dual-objective (minimum cost and shortest distance).
2.2 Experimental Model
We transform the actual problem into a mathematical model. The emergency transportation problem can be described as follows: a certain city has a transportation warehouse with m vehicles, and there are n demand cities. Each vehicle is required to reach all other cities quickly, and the driving distance and transportation cost of the vehicles should be as small as possible. When building the mathematical model, we require that the model be simplified yet conform to the actual situation, so we make some assumptions. Following Deng Lijuan's paper [14], we make the following model assumptions, which are finally expressed in the code of the algorithm: 1) The central warehouse has m vehicles with the same load and volume, and the maximum load capacity of each vehicle is Q. 2) The customer set is C = {i} (i = 1, 2, …, n); there are n customers in total, and each customer is visited by exactly one vehicle. 3) The distance from customer i to customer j is C_ij. 4) Each city is visited only once; in other words, its in-degree and out-degree are both 1.
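Under the assumptions above, a candidate set of routes can be scored against the model's objective and capacity constraint, as in the following sketch; F, w, and the demo coordinates are illustrative assumptions, not values from the paper:

import math

def route_cost(routes, coords, demand, Q, F=100.0, w=1.0):
    """Total cost of a set of routes; each route starts/ends at depot 0."""
    def dist(a, b):   # Euclidean distance between two city indices
        return math.hypot(coords[a][0] - coords[b][0],
                          coords[a][1] - coords[b][1])
    total = 0.0
    for route in routes:
        if sum(demand[c] for c in route) > Q:   # capacity constraint (1)
            return float("inf")                 # infeasible solution
        path = [0] + route + [0]
        total += F + w * sum(dist(path[i], path[i + 1])
                             for i in range(len(path) - 1))
    return total

coords = {0: (0, 0), 1: (2, 1), 2: (1, 3), 3: (4, 2)}   # depot + 3 cities
demand = {1: 3, 2: 4, 3: 2}
print(route_cost([[1, 3], [2]], coords, demand, Q=6))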
3 Improvement of the Ant Colony Algorithm Process

3.1 Innovation
By adding the genetic algorithm, the following improvements are obtained: 1) The selection operation keeps individuals with better fitness, which is convenient for subsequent optimization operations. 2) The crossover operation generates new individuals and enhances the search ability of the algorithm. 3) The mutation operation accelerates convergence and finds the optimal solution faster; at the same time, it also generates new individuals and increases diversity.

3.2 Improved Ant Colony Algorithm Process
This paper proposes an improved ant colony algorithm based on genetic operations. First, all ants generate initial solutions according to the state transition rules. We choose better individuals according to their fitness ratio, then perform crossover operations between individuals to generate new populations, and finally perform mutation operations to generate the final populations. Excellent individuals are selected to update the pheromone. We repeat the above operations until the termination condition of the ant colony algorithm is reached, and the optimal solution is output.

3.3 Algorithm Flow
By referring to Yu Dian's paper [15], we finally determined the algorithm flow:
1) Initialize the location information of the cities, and set the important parameter values of the ant colony algorithm and the genetic algorithm.
2) All ants start from the starting city and choose the next point according to the state transition rule.
3) After all the ants have completed the journey, m paths are generated. These paths become the initial chromosome population; each chromosome is a path, and each path is represented as a code string.
4) The fitness function is set as the reciprocal of the ant's path length, and the roulette method is used to select outstanding individuals according to the value of the fitness function.
5) After producing outstanding individuals, two individuals are randomly selected for single-point crossover.
6) The crossed individuals undergo the mutation operation, which can change the gene value at a certain position of the chromosome.
7) The low-fitness parents are replaced with the newly generated offspring, which keeps the population size unchanged; a reasonable fitness threshold needs to be set.
8) We judge whether the genetic termination condition (the maximum number of iterations) is met. If it is, the genetic algorithm ends, and the optimal individual it generated is used to update the pheromone; if not, we return to step 4.
9) Finally, we judge whether the termination condition of the ant colony algorithm is satisfied. If it is, the code outputs the optimal path; if not, we return to step 2 and repeat steps 2–9.
3.4 Core Technologies
Encoding
The problem model used in this paper is the CVRP model. By comparing encoding methods, we found that the most suitable is integer encoding, which uses a sequence of integers to represent a path. We define the following rule: the path 1-2-4-5-3 is represented as 1 2 4 5 3.

State Transition Rules
The state transition rule is very important: it determines the walking path of the ant. The state transition rule is determined by the heuristic factor and the pheromone. Referring to Tao Lihua's paper [16], we set the formula of the state transition rule as follows:
$$p_{i,j}^{k}(t) = \frac{\tau_{i,j}(t)^{\alpha}\,\eta_{i,j}(t)^{\beta}}{\sum_{s \in J_k(i)} \tau_{i,s}(t)^{\alpha}\,\eta_{i,s}(t)^{\beta}} \qquad (6)$$
Here k represents the number of iterations, $\tau_{i,j}$ is the pheromone, $\eta_{i,j}$ is the heuristic factor, and $s$ ranges over the unvisited points $J_k(i)$. The formula gives the probability of selecting the next point at iteration k.

Pheromone Update
The ant colony pheromone update uses the global pheromone update rule. Compared with the local update method, global updating is relatively stable: it updates the pheromone after an ant completes the whole tour, whereas local updating does so every time the ant moves. By referring to Zhang Haijun's paper [17], we finally determined the following global pheromone update method:

$$\Delta\tau_{ij} = \sum_{k=1}^{m} \Delta\tau_{ij}^{k} \qquad (7)$$

$$\Delta\tau_{ij}^{k} = \frac{Q}{L_k} \qquad (8)$$
$L_k$ is the total path length of the k-th ant, and $Q$ is the pheromone increase intensity factor.

Crossover and Mutation Operations
By referring to Ding Bei's paper [18], we finally determined the following crossover and mutation methods. Single-point crossover lowers the possibility of destroying individual characteristics and individual fitness, so we choose this method. The crossover process is shown in Fig. 1 below: a crossover point is randomly selected, and the gene fragments after the crossover point are exchanged.
Fig. 1. Chromosome crossover process
The mutation operation adopts the exchange (swap) mutation method: two random numbers not greater than the chromosome length L are generated, and the genes at the corresponding positions are exchanged. For example, if the chromosome is $[x_1, x_2, x_3, x_4, x_5]$ and the random numbers generated are 3 and 4, after the exchange it becomes $[x_1, x_2, x_4, x_3, x_5]$.

Distance Representation
We use the Euclidean distance formula to calculate the distance between two cities: $d_{ij} = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$. Finally, a symmetric matrix represents the distance between any two cities (we assume that the distance from i to j is the same as from j to i). A code sketch of these core formulas follows.
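The core formulas of this section translate into code as follows; a hedged Python sketch (the experiments themselves use Matlab) with assumed data structures: tau and eta are dictionaries keyed by city pairs, and the default alpha, beta, Rho, and Q follow Table 1.

```python
# State transition probability, Eq. (6), and global pheromone update,
# Eqs. (7)-(8). Data structures and names are illustrative assumptions.
def transition_probability(i, j, unvisited, tau, eta, alpha=1, beta=5):
    """Probability that an ant at city i selects city j next, Eq. (6)."""
    weight = lambda s: (tau[i, s] ** alpha) * (eta[i, s] ** beta)
    return weight(j) / sum(weight(s) for s in unvisited)

def global_pheromone_update(tau, ant_paths, path_length, rho=0.9, Q=10):
    """Eqs. (7)-(8): after all tours, each ant k deposits Q / L_k on its edges."""
    for edge in tau:
        tau[edge] *= rho                 # residual pheromone
    for path in ant_paths:
        L = path_length(path)
        for i, j in zip(path, path[1:]):
            tau[i, j] += Q / L           # delta tau_ij^k = Q / L_k
    return tau
```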
4 Simulation Experiment and Result Analysis

4.1 Experimental Design Process
1) Represent the experimental data with the relevant variables.
2) Design the operating processes of the two algorithms.
3) Add the model assumptions and objective functions to the two algorithm processes.
4) Write the complete code of the algorithms in Matlab.
5) Check the code to ensure smooth operation.
6) After the code is executed, obtain the results needed for evaluation; finally, make performance judgments and draw the corresponding conclusions.

4.2 Lab Environment
We use Matlab 2016a to write the algorithm program. The algorithm runs on Windows 10 with an Intel(R) Core(TM) i7-5500U CPU @ 2.40 GHz processor. The experimental data comes from the c101 instance in the standard TSP problem database.
4.3 Experimental Feasibility
1. The experimental code can be run on Matlab 2016.
2. The experiment can be completed in the laboratory, so the cost is not high.
3. If technical or theoretical problems arise during the experiment, teachers and classmates can be consulted for useful advice.

4.4 Code Program Flow
Procedure: the code execution process of the improved ant colony algorithm proposed in this paper is as follows:

Begin
  Initialize the relevant parameters of the algorithm
  While (the ant colony iteration count has not reached its maximum)
    All ants choose the next point according to the state transition rule,
      until every ant has walked the entire course
    Generate the initial population from the ants' paths
    While (the genetic iteration count has not reached its maximum)
      Perform selection, crossover, and mutation on the population
    End while
    Use the optimal (or better) individuals generated by the genetic
      algorithm to update the pheromone on the path
  End while
  Output the final result and stop the algorithm
End

4.5 Experimental Simulation
To verify the performance of the improved ant colony algorithm, this article uses the c101 data set from the international standard TSP problem database. To reduce calculation time and complexity, both algorithms are set to perform 100 cycles. The parameters of the two algorithms are shown in Table 1. We test the two algorithms with the corresponding code; the final experimental results are shown in the following four figures (Figs. 2–5).
Fig. 2. Iterative diagram of improved ant colony algorithm
Figures 2 and 3 show the optimal path length at each iteration of the two algorithms. Figures 4 and 5 show the shortest travel paths found over all cycles of the two algorithms.
Fig. 3. Iterative diagram of ant colony algorithm
Fig. 4. The shortest path of the improved ant colony algorithm
Fig. 5. The shortest path of ant colony algorithm
Table 1. Experimental parameters

Parameter                     | α | β | Rho | iter_max | Q  | Cap | m  | popnum | pc  | pm
Ant colony algorithm          | 1 | 5 | 0.9 | 100      | 10 | 200 | 50 | 0      | 0   | 0
Improved ant colony algorithm | 1 | 5 | 0.9 | 100      | 10 | 200 | 50 | 80     | 0.6 | 0.01
Parameter description: m is the number of ants and n is the number of demand cities; α indicates the importance of the pheromone and β the importance of the heuristic factor; Rho represents the pheromone residual coefficient (its complement is the global pheromone evaporation coefficient); Q represents the pheromone enhancement coefficient; Cap represents the maximum load of the vehicle; iter_max represents the maximum number of iterations; pc represents the crossover probability; pm represents the mutation probability; popnum represents the total population size.

4.6 Experimental Results
According to the above results, the superiority of the algorithm is illustrated from the following aspects:
1) Convergence performance: comparing Fig. 2 and Fig. 3, the improved algorithm converges after about 15 iterations, whereas the traditional ant colony algorithm converges after about 35 iterations. The improved algorithm therefore has better convergence.
2) Average value of the optimal path: we sum all the shortest paths and take the average. The average of the improved algorithm is 659.60 km, while that of the traditional ant colony algorithm is 666.26 km; the average optimal path is thus reduced by 6.66 km, so the improved ant colony algorithm is more stable.
3) Shortest path: after the program runs, we obtain the shortest paths of the two experiments. The global shortest path of the improved ant colony algorithm is 645.30 km, against 653.36 km for the traditional algorithm, a reduction of 8.06 km. This shows that the improved ant colony algorithm improves the accuracy of the solution.
In summary, the improved ant colony algorithm speeds up convergence and improves both the quality of the optimal solution and the stability of the algorithm.
5 Conclusions

Following the principle of minimizing transportation cost and time, this paper combines the advantages of the ant colony algorithm and the genetic algorithm. First, genetic operations are added to the ant colony algorithm. Next, both the traditional ant colony algorithm and the improved algorithm are used to solve the emergency transportation problem. Finally,
the experimental results are obtained and analyzed to reach the final conclusions. To address the insufficient pheromone in the early stage of the ant colony algorithm, a genetic algorithm is incorporated. First, all ants generate the paths of the first iteration. Then all paths undergo selection, crossover, and mutation to generate new paths. Finally, the path with the best fitness is selected to update the pheromone. These steps are repeated until the algorithm meets its stop condition, and the final experimental result is output. Comparing the results of the two algorithms shows that the genetic operations speed up the convergence of the ant colony algorithm in the early stage; at the same time, the improved algorithm has better stability and solution quality. The improved ant colony algorithm can therefore better solve the problems of time and cost. The improved method provides a practical reference for grain transportation enterprises and the government, saving them time and cost. Since China pays more and more attention to the grain issue, this improved path optimization method also responds to China's grain policy.
References
1. Guo Yongmei, H., Dawei, C.: Improved ant colony algorithm to solve the emergency logistics open-loop vehicle routing problem with time windows. J. Chang'an Univ. (Nat. Sci. Ed.) 37(06), 105–112 (2017). https://doi.org/10.19721/j.cnki.1671-8879.2017.06.015
2. Zhang, R., Gao, Y., Zhao, Y., Ding, Y.M.: Micro-nano satellite routing algorithm based on Q-learning quantum ant colony. Comput. Eng. 1–10 (2021)
3. Tao, N., Xuan, J., Yingqi, W., Liang, X.: Dynamic vehicle routing problem with random demand based on quantum ant colony algorithm. J. Dalian Jiaotong Univ. 39(05), 107–110 (2018)
4. Gang, R.: GOP-MRPGA: genetic operator pre-parallel genetic algorithm based on MapReduce big data computing model. J. Henan Inst. Technol. 28(05), 7–10+21 (2020)
5. Sun, L.J., Wang, L.J., Wang, R.C.: Improved ant colony algorithm and its application in TSP. J. Commun. (10), 111–116 (2004)
6. Zhang, D.X., Huang, X.H.: Research on logistics path optimization based on improved ant colony algorithm – taking Henan logistics network as an example. J. Henan Univ. Technol. (Soc. Sci. Ed.) 37(02), 56–60+96 (2021)
7. Cheng, L., Gan, H.C., Liu, Y.: Research on CVRP problem based on improved ant colony algorithm. J. Chongqing Univ. Technol. (Nat. Sci. Ed.) 38(05), 81–86 (2021)
8. Narendrababu Reddy, G., Phani Kumar, S.: MACO-MOTS: modified ant colony optimization for multi objective task scheduling in cloud environment. Int. J. Intell. Syst. Appl. 11(1), 73–79 (2019)
9. Li, Z., Li, Y.Z., Li, W.X.: Multi-objective optimization model and solution algorithm of emergency material transportation path. J. Comput. Appl. 39(09), 2765–2771 (2019)
10. Chen, T.H., et al.: Optimal design of substation video surveillance linkage scheme based on ant colony algorithm. Power Syst. Protect. Control 44(02), 134–139 (2016)
11. Wen, H., Shuang, Q., Houhe, C.: Solar power station irradiation intensity prediction based on ant colony BP neural network. J. Electr. Power Syst. Autom. 28(07), 26–31 (2016)
12. Molaiy, S., Effatparvar, M.: Scheduling in grid systems using ant colony algorithm. Int. J. Comput. Netw. Inf. Secur. 6(2), 16–22 (2014)
13. Yuan, W.T., Sun, H.: CVRP logistics distribution route optimization and application research. Softw. Guide 15(11), 140–143 (2016)
14. Deng, L.J., Zhang, J.H.: Hybrid ant colony algorithm for dual-objective time window VRP. Complex Syst. Complex. Sci. 17(04), 73–84 (2020)
15. Yu Dian, W., Shan, Y.: Improved genetic-ant colony algorithm to solve the multidimensional 0/1 knapsack problem. Softw. Guide 19(03), 87–90 (2020)
16. Tao, L.H., Ma, Z.N., Shi, P.T., Wang, R.F.: Dynamic ant colony genetic algorithm based on TSP problem. Mach. Design Manuf. (12), 147–149+154 (2019). https://doi.org/10.19356/j.cnki.1001-3997.2019.12.037
17. Zhang Haijun, X., Cheng, T., Han, Y.: CVRP problem based on improved ant colony algorithm. Fire Control Command Control 44(01), 67–71 (2019)
18. Ding, B., Wei, Z., Sun, R.: Research on minimum cost distribution strategy based on genetic algorithm. J. Hefei Univ. Technol. (Nat. Sci. Ed.) 41(02), 273–278 (2018)
Research on Grain Pile Temperature Prediction Based on CNN-GRU Neural Network

Weihu Liu1, Shuo Liu1(✉), Yang Wang1, Guangbing Li2, and Litao Yu3

1 Institute of Mathematics and Computer, Wuhan Polytechnic University, Wuhan 430023, China
2 Qianjiang Jujin Rice Industry Co. Ltd., Qianjiang 433100, China
3 Qianjiang Bureau of Agriculture and Rural Affairs, Qianjiang 433100, China
Abstract. The temperature of grain stored in a silo is an important indicator of food security. Monitoring and reasonably predicting grain temperature can therefore safeguard grain to a great extent. Because grain pile temperature behaves as a nonlinear sequence, this paper adopts a hybrid neural network based on a Convolutional Neural Network (CNN) and a Gated Recurrent Unit network (GRU) for the grain pile temperature prediction model: the CNN extracts the vector features of the input data, the GRU then learns from these input features, and finally the grain pile temperature is predicted. We build the model on existing grain bin data, train and test it, and set up experimental comparisons. The experimental results show that the RMSE of the model is 0.049 and the mean absolute error MAE is 0.036; the temperature prediction is more accurate and has less error than other methods.

Keywords: Grain pile temperature · Convolutional neural networks · Gated Recurrent Unit Network · Nonlinear sequences · Temperature prediction
1 Introduction

Food is an important strategic material related to national livelihood and the most basic material foundation for sustainable economic development; for a populous country like ours, ensuring food security is always the top priority [1]. During long-term storage, the temperature and humidity of the grain bin and other factors directly affect the temperature of the grain, which can lead to heating, moldy grain, microbial proliferation, and a series of other conditions affecting food security. Therefore, reasonable prediction of grain temperature, timely access to grain pile temperature information, and good preventive protection measures can effectively guarantee food security. In real grain silos, the temperature prediction of grain piles can be viewed as a nonlinear time series problem due to the complexity of the environment and the long-term interactions within piles of grain. Traditional grain temperature prediction is mainly based on mathematical-statistical models or simulations using numerical
simulation software. With the rapid development of deep learning in recent years providing new ideas for research in various fields [2], more and more scholars are turning to deep learning methods to study grain pile temperature. The CNN has the features of weight sharing and internal linkage, which allow it to mine feature information from nonlinear problems well. The internal structure of the LSTM (Long Short-Term Memory) network is relatively complex; the GRU [3] optimizes the network structure of the LSTM by reducing the number of "gates", which makes the training process easier and faster, and less prone to overfitting. Considering the advantages of the CNN and GRU models and the characteristics of grain pile data: although the GRU neural network can solve the long-time-dependence problem, it cannot effectively extract the relationships between input features, while the CNN network can make up for exactly this deficiency. In this paper, a prediction method based on the CNN-GRU model is proposed. The CNN is used to effectively extract the feature information of the data and feed it into the GRU model, so that the internal information of the data can be mined, the long-time-dependence problem can be solved, and the grain pile temperature can be predicted more accurately. Experiments show that the model yields better prediction results. The main contributions of this paper are the following two points:
(1) Traditional prediction methods directly use networks such as LSTM/GRU for prediction, without considering the influence of the correlations among feature data on the prediction results. In this paper, we first use the CNN network for feature extraction and dimensionality reduction, which improves the prediction accuracy of the GRU network.
(2) The method in this paper is applicable not only to common grain pile temperature prediction, but also to the prediction of other sequences whose features are harder to mine.
2 Related Work

The prediction methods for grain pile temperature fall broadly into three types: traditional mathematical models, machine learning, and the recently developed deep learning. Traditional prediction methods include building mathematical models and running simulations in numerical simulation software. For example, Jian et al. [4] developed a three-dimensional transient combination model to predict grain temperature in a barn. Wang et al. [5] simulated the variation of grain pile temperature based on the multiphysics numerical simulation software COMSOL. Traditional methods are computationally intensive, time-consuming, and produce large errors. Machine learning algorithms have been widely used in the field of prediction research thanks to their good nonlinear processing capability. Duan et al. [6] proposed linear least squares regression and support vector machine (SVM) regression with different kernel functions to predict the average temperature of the surface layer of the grain pile. Han et al. [7] developed a temperature prediction model for mechanical ventilation of grain piles using a random forest algorithm. Guo et al. [8] addressed the problems of traditional grain bin temperature prediction methods by combining gray GM(1, N) models with improved neural networks to improve prediction accuracy. Shi [9] used a BP neural network to predict the average temperature of grain silos. Although machine learning can handle nonlinear and non-smooth data well, it struggles to dig deep into the data features, suffers from human interference, and its prediction accuracy is not high. With the development of deep learning, Chen et al. [10] proposed a prediction method based on the LSTM model for the difficult problem of predicting environmental variables in greenhouses. Yan et al. [11] used the LSTM model to predict grain pile temperature. Although the above methods can improve prediction accuracy to some extent, they do not consider feature extraction from the processed data. CNNs are commonly used in areas such as image and text classification [12]; their special network structure, with its shared weights, can handle data features well. The Recurrent Neural Network (RNN) [13] incorporates the concept of timing into neural networks, making it possible to better solve timing-related problems. LSTM [14] and GRU, variants of the RNN, can effectively solve the gradient-vanishing and gradient-explosion problems of RNN models [15]. The GRU network merges the forget gate and the input gate of the LSTM into a single update gate, so it has fewer training parameters and is easier to train than the LSTM. In this paper, we utilize a CNN as a feature extractor and then use a GRU model for prediction to improve accuracy. Although CNN-GRU is mainly used in areas such as image processing and natural language processing, it is also frequently used for time series tasks, including energy consumption prediction [16] and stock market forecasting [17]. However, to the best of our knowledge, no CNN-GRU model has been used to study grain pile temperature prediction.
3 CNN-GRU Method

3.1 CNN
A convolutional neural network (CNN) [18] usually includes a convolutional layer, a pooling layer, and a fully connected layer, with local connectivity and weight sharing. The convolution kernels in the convolutional layer extract the feature relationships of the input data, reducing the difficulty of feature extraction and making it more effective. The pooling layer effectively reduces the complexity of the network, reduces the number of training parameters, and performs dimensionality reduction on the data features.

3.2 GRU
The GRU [19, 20] is a simpler variant of the LSTM, with a simpler structure than the LSTM network; there are only two gates in the model: an update gate and a reset gate. The update gate determines the extent to which state information from the previous moment is carried into the current state. The reset gate determines the degree of retention of the previous moment's information and the new input information. The GRU hidden layer cell structure is shown in Fig. 1.
Fig. 1. GRU hidden layer cell structure
3.3 CNN-GRU
In many machine learning applications, feature extraction is an important step that allows predictive models to generate meaningful information and make accurate predictions. In recent years, researchers have used convolutional operations for feature extraction, achieving groundbreaking results in computer vision. Their usefulness, however, is not limited to image-type data: they also apply to time series data [21]. With the convolution operation, we can create an automatic feature extraction model and let it learn how to extract features by optimizing its weights. However, CNN networks lose some valuable information by ignoring the correlation between the local and the whole when performing pooling operations. The GRU neural network has a special gate structure that takes into account the before-and-after relationships of temporal data, resolves long-time dependencies, and achieves effective prediction of serial data. After exploiting the spatial feature extraction capability of the convolutional layer, we therefore introduce a GRU model that can resolve long-time dependencies, compensating for the information lost when training the CNN network. Considering that the grain pile temperature has the characteristics of a nonlinear sequence, and that the temperature at a certain moment is related to the data at the current and historical moments, this paper uses a CNN-GRU neural network to predict the grain pile temperature. The framework of the CNN-GRU hybrid model is shown in Fig. 2. It contains an input layer, a CNN layer, a GRU layer, and an output layer. The input contains the sequential characteristics of the factors affecting grain temperature, including outdoor temperature, indoor temperature, outdoor humidity, and indoor humidity. To couple the feature information of these factors and fully explore the connections and patterns among them, the convolutional layer is used for feature extraction and for mining the feature information. The pooling layer then performs dimensionality reduction and reduces the number of training data parameters. In this way, the features in the historical sequence can be fully extracted, and the maximum feature vector obtained through the pooling layer yields the optimal data features. The GRU layer iteratively learns the features output by the CNN, more fully learns their internal patterns and long-term dependencies, and can reduce the training time.
Fig. 2. Structure of CNN-GRU hybrid model
(1) Input layer: The surface temperature of the grain pile at a certain moment and the associated outdoor air temperature, outdoor air humidity, indoor air humidity, and indoor temperature are concatenated into a brand-new time series feature vector; the data are expressed as a two-dimensional matrix of time feature vectors, pre-processed, and input into the prediction model.
(2) CNN layer: The CNN layer mainly captures the characteristic patterns in the input historical sequence. Convolutional layers (Conv1D) are designed with different numbers of layers according to the nonlinearity and sparsity of the input data, the number of convolution kernels is selected experimentally, and the ReLU function is chosen as the activation function. A maximum pooling (MaxPooling1D) is performed after each convolution operation to downscale the extracted high-dimensional features, reduce the parameters, and speed up the operation. The convolution and pooling layers are described by the following equations:

$$c = f(x \circledast W_c + b_c) \qquad (1)$$

$$p = \mathrm{maxpooling}(c) \qquad (2)$$

Here $c$ is the output of the convolutional layer; $f(x)$ is the activation function, generally the linear rectification function (Rectified Linear Unit, ReLU); $\circledast$ is the convolution operation; $W_c$ is the weight matrix; $b_c$ is the bias; $p$ is the output of the pooling layer; and $\mathrm{maxpooling}(x)$ is the maximum pooling function.
(3) GRU layer: The GRU layer trains on the global feature vectors extracted by the CNN layer. The number of GRU layers and the number of neurons can be adjusted experimentally to reach their optimal values. The activation function is the tanh function, and a Dropout layer can be added to prevent overfitting. The output of the GRU layer is inverse-normalized to obtain the prediction.
4 Data Source and Processing

The data in this article were researched and recorded from May to early September 2018, with a total of 162 records. The data set monitors the external temperature, external humidity, granary temperature, granary humidity, and grain pile temperature. The granary temperature is measured at five testing points (the four corners and the center), giving five sets of data indicators. The grain pile temperature includes four sets of data: surface, upper, middle, and lower. Readings were recorded at 8:00, 14:00, and 20:00 every day. The experiments use the surface grain temperature. Part of the original data is shown in Table 1.

Table 1. Part of the original data

No. | Time  | External Tem/°C | External Hum/% | Granary Hum/% | Granary Tem1/°C | Granary Tem2–4/°C | Granary Tem5/°C | Surface grain Tem/°C
1   | 08:00 | Null | Null | 50.1 | 29.2 | … | 29.2 | Null
2   | 14:00 | Null | Null | 52.1 | 29.8 | … | 29.8 | Null
3   | 20:00 | 30.3 | 68.4 | 46.9 | 30.3 | … | 27.7 | 25.9
4   | 08:00 | 28.1 | 79.2 | 50.1 | 26.2 | … | 26.2 | 25.9
5   | 14:00 | 28.1 | 79.2 | 40.6 | 25.8 | … | 25.8 | 25.9
6   | 20:00 | 26.2 | 78.6 | 41.8 | 28.5 | … | 25.8 | 25.7
7   | 08:00 | 25.2 | 84.9 | 40.0 | 26.9 | … | 26.9 | 25.5
8   | 14:00 | 25.2 | 84.9 | 40.0 | 27.6 | … | 27.7 | 25.5
9   | 20:00 | 33.7 | 61.5 | 41.8 | 28.3 | … | 25.8 | 25.4
Since data anomalies and missing values are inevitable during experimental data collection, the data must be processed before model training to improve the accuracy of the experiment. The measure taken is to automatically fill in the missing values using linear regression on the existing values of each variable, iterated 5 times, as sketched below. In addition to filling in missing values, abnormal data are removed to avoid affecting the experimental results.
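The filling step can be approximated with scikit-learn; this is a hedged reconstruction of the described procedure rather than the authors' exact code, and the file name is an assumption.

```python
# Fill missing values by iteratively regressing each variable on the
# others with linear regression for 5 rounds; 'Null' entries become NaN.
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import LinearRegression

df = pd.read_csv('grain_bin_data.csv', na_values='Null')   # assumed file
num = df.select_dtypes('number')                           # numeric columns only
imputer = IterativeImputer(estimator=LinearRegression(), max_iter=5)
df[num.columns] = imputer.fit_transform(num)
```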
5 Experimental Analysis

5.1 Experimental Data Pre-processing
Before training the model, the cleaned data are normalized, and the processed data set is divided into a training set and a test set at a ratio of 4:1. Data normalization is an important issue when expressing the feature vector: when different feature data are listed together, their differing scales cause the small values to be overwhelmed by the large values, so normalization is needed to ensure that every feature is treated equally. In this paper, min-max normalization (Min-Max Normalization) [22] is used; it maps all feature data into the interval [0, 1]. The calculation formula is as follows:

$$x^{*} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (3)$$
In the formula, $x^{*}$ denotes the normalized data, $x$ is the original feature value, $x_{max}$ is the maximum of the feature data, and $x_{min}$ is its minimum.
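Equation (3) corresponds directly to scikit-learn's MinMaxScaler; a brief sketch with toy values:

```python
# Min-max normalization of Eq. (3): x* = (x - x_min) / (x_max - x_min).
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[30.3, 68.4], [28.1, 79.2], [26.2, 78.6]])  # toy feature rows
scaler = MinMaxScaler()
X_norm = scaler.fit_transform(X)                          # each column in [0, 1]
# scaler.inverse_transform(X_norm) recovers the original scale, matching
# the inverse normalization mentioned in Sect. 3.3.
```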
5.2 Experimental Environment
Hardware platform: Intel Core i7-4720HQ CPU; RAM: 8 GB; operating system: Windows 10; compiler environment: Python 3.7; frameworks: Keras, TensorFlow.

5.3 Evaluation Criteria
In this paper, the mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination (R²) are selected to evaluate model performance. MAE and RMSE measure the error between the predicted value and the true value: the smaller the value, the better the prediction. R² measures the fitting effect of the model: the closer its value is to 1, the better the fit. The calculation formulas are as follows:

$$MAE = \frac{1}{N}\sum_{i=1}^{N} |Y_i - X_i| \qquad (4)$$

$$RMSE = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (Y_i - X_i)^2} \qquad (5)$$

$$R^{2} = 1 - \frac{\sum_{i=1}^{N} (Y_i - X_i)^2}{\sum_{i=1}^{N} (Y_i - \bar{Y})^2} \qquad (6)$$

In the formulas, $X_i$ is the predicted value, $Y_i$ is the true value, $N$ is the sample size, and $\bar{Y}$ is the average of the true values.
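In code, Eqs. (4)–(6) amount to a few numpy lines; a small sketch with the notation above (array contents are toy values):

```python
# MAE (Eq. 4), RMSE (Eq. 5) and R^2 (Eq. 6); Y holds true values, X predictions.
import numpy as np

def evaluate(Y, X):
    Y, X = np.asarray(Y, float), np.asarray(X, float)
    mae = np.mean(np.abs(Y - X))
    rmse = np.sqrt(np.mean((Y - X) ** 2))
    r2 = 1.0 - np.sum((Y - X) ** 2) / np.sum((Y - Y.mean()) ** 2)
    return mae, rmse, r2

print(evaluate([25.9, 25.7, 25.5], [25.8, 25.6, 25.6]))
```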
5.4 Analysis of Experimental Parameters
The parameters that must be set manually in the CNN-GRU model mainly include the number of convolutional layers, the number of convolution kernels, the number of GRU layers, and the number of neurons in the GRU layers. The number of convolutional layers and convolution kernels reflects the feature extraction capability of the CNN, while the number of GRU layers and GRU neurons reflects the ability of the GRU neural network to learn long-term dependence from the data.

5.4.1 Number of Convolutional Layers
Theoretically, the more convolutional layers, the better the feature representation and the more data can be processed. However, the number of parameters then grows exponentially, and the training difficulty and training time of the model increase significantly. In this paper the input samples do not have many dimensions, so the number of convolutional layers need not be large. We explore the effect of 1 to 6 convolutional layers on the prediction results of the model while keeping its other parameters constant, as shown in Table 2.

Table 2. Effect of the number of convolutional layers on the prediction results

Number of convolutional layers | 1     | 2     | 3     | 4     | 5     | 6
RMSE                           | 0.103 | 0.086 | 0.094 | 0.100 | 0.120 | 0.126
MAE                            | 0.066 | 0.066 | 0.074 | 0.076 | 0.089 | 0.096
R²                             | 0.981 | 0.987 | 0.984 | 0.982 | 0.974 | 0.972
As can be seen in Table 2, when the number of convolutional layers is in the range 1–4, the error fluctuation of the model is small and the gap is not significant. As the number of convolutional layers increases to 5, the error fluctuates noticeably and becomes larger, and as the number of layers grows, the model training time lengthens. In summary, the model in this paper uses two convolutional layers.

5.4.2 Number of Convolution Kernels
The number of convolution kernels reflects the feature extraction capability. If the number of convolution kernels is too small, the potential features of the data cannot be fully explored; if it is set too large, the model becomes complex and the training time increases significantly. Following the previous section, the number of convolutional layers is set to two. Based on general experience, the number of convolution kernels in the second layer is set to twice that of the first layer. The same control-variable approach is adopted to compare different combinations of kernel numbers. The experimental results are shown in Table 3.
Table 3. Effect of the number of convolution kernels on the prediction results

First layer | Second layer | RMSE  | MAE   | R²
8           | 16           | 0.100 | 0.082 | 0.982
16          | 32           | 0.066 | 0.049 | 0.992
32          | 64           | 0.054 | 0.040 | 0.995
64          | 128          | 0.059 | 0.043 | 0.995
As can be seen in Table 3, the error first decreases and then increases with the number of convolution kernels, and the coefficient of determination first increases and then decreases. When the first layer has 32 kernels and the second layer has 64, the prediction error is smallest and the coefficient of determination is closest to 1. Therefore, in this paper, the numbers of convolution kernels of the two layers are set to 32 and 64, respectively.

5.4.3 Number of GRU Layers and Neurons
In general, different numbers of GRU layers may also change the prediction accuracy, and more layers lead to a more complex network structure and inefficient training. The same control-variable method as above was used to compare the effect of different numbers of GRU layers on the prediction results, as shown in Table 4.

Table 4. Effect of the number of GRU layers on the prediction results

Number of GRU layers | 1     | 2     | 3     | 4
RMSE                 | 0.044 | 0.053 | 0.062 | 0.107
MAE                  | 0.035 | 0.043 | 0.053 | 0.090
R²                   | 0.997 | 0.995 | 0.988 | 0.980
As can be seen in Table 4, when the number of GRU layers is set to 1, the prediction error is minimal. When the number of layers continues to increase, the error grows instead, probably because the model overfits during training. Therefore, in this paper the number of GRU layers is set to one. With one GRU layer fixed, the effect of the number of neurons (10/20/30/40) on the prediction results is investigated while keeping the other model parameters constant, as shown in Table 5.
Table 5. Effect of the number of GRU neurons on the prediction results

Number of neurons | 10    | 20    | 30    | 40
RMSE              | 0.069 | 0.048 | 0.059 | 0.054
MAE               | 0.049 | 0.041 | 0.044 | 0.041
R²                | 0.992 | 0.996 | 0.994 | 0.995
From the analysis of Table 5, when the number of neurons is 10/20/30/40, the error fluctuation range is small and the prediction accuracy of the model varies within a stable range. The number of GRU neurons therefore has little influence, and it is set to 20 in the model.
6 Model Comparison Analysis

The model building process includes the following steps (a code sketch of the resulting configuration is given after this list):
(1) Determine the input and output: the input features are outside temperature, outside humidity, grain bin humidity, and grain bin temperatures 1–5; the output feature is the average surface grain temperature.
(2) CNN layer settings: in this experiment we choose one-dimensional convolution (Conv1D), with two convolutional-and-pooling stages. The numbers of convolution kernels are 32 and 64, the convolution width Kernel_size is 2, a MaxPooling1D with pool size 1 follows each convolution, and the activation function is the ReLU function.
(3) GRU layer settings: the model builds one GRU hidden layer with 20 neurons, adds a Dropout layer to prevent overfitting, and adds a fully connected layer between the hidden layer and the output layer to connect the two.
(4) Selection of model hyperparameters: to train a better model, several hyperparameters such as the number of iterations, activation function, batch size, dropout rate, and optimization function are tuned experimentally using the control-variable method, with the root mean square error as the selection metric. Finally, we set the number of iterations to 200, the batch size to 32, the activation function to tanh, the dropout rate to 0.1, and the optimization function to Adam.

The experiments train the model on a 4:1 training/test split and predict the average surface-layer temperature of the grain pile at each time point. The variation of the loss values for the training and test sets is shown in Fig. 3; the fit of the CNN-GRU neural network is quite good.
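Under these settings, the network can be sketched in Keras as follows. This is a minimal, hedged reconstruction from the reported hyperparameters, not the authors' released code; the window length and feature count are illustrative assumptions.

```python
# CNN-GRU per steps (1)-(4): two Conv1D layers (32 and 64 kernels, size 2,
# ReLU), MaxPooling1D (pool size 1) after each, one GRU layer (20 units),
# Dropout 0.1, and a fully connected output; Adam optimizer.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, GRU, Dropout, Dense

n_steps, n_features = 3, 8   # assumed input window and feature count

model = Sequential([
    Conv1D(32, kernel_size=2, padding='same', activation='relu',
           input_shape=(n_steps, n_features)),
    MaxPooling1D(pool_size=1),
    Conv1D(64, kernel_size=2, padding='same', activation='relu'),
    MaxPooling1D(pool_size=1),
    GRU(20, activation='tanh'),
    Dropout(0.1),
    Dense(1),   # predicted average surface grain temperature
])
model.compile(optimizer='adam', loss='mse')
# model.fit(X_train, y_train, epochs=200, batch_size=32,
#           validation_data=(X_test, y_test))
```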
Fig. 3. Model loss value change curve
Fig. 4. Fitting diagram of predicted and true values of each model
To verify the effectiveness of the CNN-GRU neural network on the grain pile temperature prediction problem, three methods, GRU, LSTM, and BP, were selected for comparison experiments. The data were trained and tested several times for each model, and the predicted values of each model are plotted against the true values, as shown in Fig. 4. The figure shows that the trend of the predicted values of the BP neural network deviates widely from the true values, so that model has the worst prediction effect. The prediction trends of LSTM, GRU, and CNN-GRU are roughly similar, but the predicted values of CNN-GRU are closest to the true values, showing that the model predicts more accurately than the LSTM and GRU models. To display the prediction results of each model more visually, the results of the evaluation criteria are listed in Table 6.
Table 6. Results of different evaluation indicators

Models  | MAE   | RMSE  | R²
CNN-GRU | 0.036 | 0.049 | 0.996
GRU     | 0.134 | 0.175 | 0.936
LSTM    | 0.189 | 0.232 | 0.893
BP      | 0.532 | 0.653 | 0.223
The results of the different evaluation metrics in Table 6 show that the CNN-GRU model has the smallest error: its predicted values are closest to the true values, so its prediction is the best. For example, the RMSE measures the error between the predicted and true values; the BP neural network yields 0.653, the LSTM model 0.232, the GRU model 0.175, and the CNN-GRU model 0.049. R² measures the model's ability to fit the data, and the closer its value is to 1, the better the fit; the R² of CNN-GRU is 0.996, indicating the best fit. In summary, the CNN-GRU model handles the nonlinear time series of grain pile temperature better than the other models and can predict the grain pile temperature reasonably accurately.
7 Conclusion

For the problem of grain pile temperature prediction, this paper proposes a grain pile temperature prediction model based on the CNN-GRU neural network, motivated by the highly nonlinear nature of the sequence. The CNN model is first used for feature extraction, and the GRU model then learns long-term dependencies on the input features to predict the results. The CNN-GRU model is built by processing the existing data, comparing and analyzing parameters such as the number of convolutional layers to find the most suitable values, and then training the model and making predictions. Comparison with the GRU, LSTM, and BP prediction models verifies that the CNN-GRU model has higher accuracy in grain pile temperature prediction, is more effective for grain condition prediction, and can play an important role in grain situation monitoring. In the prediction of many sequence problems, mathematical methods or machine learning methods achieve low accuracy, so the model proposed in this paper has wide application value and prospects for grain pile temperature prediction and practical life. In future work, we plan to use more advanced methods to refine the model and further improve its prediction accuracy.

Acknowledgment. Hubei Provincial Outstanding Young and Middle-aged Science and Technology Innovation Team project: Research on multi-source perception and intelligent detection technology of grain quality information. Project Number: T2021009.
References
1. Tian, W., Chen, L.: Some thoughts about grain storage safety. Mod. Food (15), 65–68 (2015)
2. Hsieh, S.: Tourism demand forecasting based on an LSTM network and its variants. Algorithms 14(8), 243 (2021)
3. Tan, X., Zhang, X.: GRU deep neural network based short-term railway freight demand forecasting. J. China Railw. Soc. 42(12), 28–35 (2020)
4. Jian, F., Jayas, D.S., White, N.D.G., Alagusundaram, K.: A three-dimensional, asymmetric, and transient model to predict grain temperatures in grain storage bins. Trans. ASAE 48(1), 263–271 (2005)
5. Wang, Z., Zhang, X., Chen, X.: Study on temperature field of grain heap under non-manual intervention. J. Huaiyin Inst. Technol. 30(1), 60–64 (2021)
6. Duan, S., Yang, W., Xiao, L., Zhang, Y.: A method for predicting surface temperature of storage grain depot based on meteorological data. J. Chin. Cereals Oils Assoc. 35(2), 152–158 (2020)
7. Han, J., Nan, S., Li, J., Guo, C.: Research on prediction and control of mechanical ventilation temperature of grain pile based on random forest algorithm. J. Henan Univ. Technol. (Nat. Sci. Ed.) 40(5), 108–114 (2019)
8. Guo, L., Lian, F.: Temperature prediction of granary based on SOM clustering algorithm and grey improved neural network. Cereals Oils 32(11), 97–100 (2019)
9. Shi, R.: Application of BP neural network in forecasting average temperature of granary. Softw. Guide 14(08), 42–44 (2015)
10. Chen, L., Pei, X., Liu, Y.: Prediction of greenhouse environment variables based on LSTM. J. Shenyang Ligong Univ. 37(05), 13–19 (2018)
11. Yan, Z., Dong, Z.J., Shuang, R.S.: Research on grain pile temperature prediction based on deep learning algorithm. Grain Sci. Econ. 44(11), 52–56 (2019)
12. Zheng, Y., Li, G., Li, Y.: Survey of application of deep learning in image recognition. Comput. Eng. Appl. 55(12), 20–36 (2019)
13. Graves, A., Mohamed, A.R., Hinton, G.: Speech recognition with deep recurrent neural networks. In: International Conference on Acoustics, Speech and Signal Processing, pp. 6645–6649 (2013)
14. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
15. Bengio, Y.: Learning long-term dependencies with gradient descent is difficult. IEEE Trans. Neural Netw. 5(2), 157–166 (1994)
16. Ge, F., Lei, J.: Research on short-term power load forecasting based on CNN-GRU SA model. Mod. Inf. Technol. 5(07), 150–154 (2021). https://doi.org/10.19850/j.cnki.2096-4706.2021.07.039
17. Dang, J., Cong, X.: Research on hybrid stock index prediction model based on CNN and GRU. Comput. Eng. Appl. 57(16), 167–174 (2021)
18. Zhou, F., Jin, L., Dong, J.: Review of convolutional neural network research. Chin. J. Comput. 40(06), 1229–1251 (2017)
19. Che, Z., Purushotham, S., Cho, K., Sontag, D., Liu, Y.: Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 8(1), 6085 (2018)
20. Gao, S., et al.: Short-term runoff prediction with GRU and LSTM networks without requiring time step optimization during sample generation. J. Hydrol. 589, 125188 (2020)
21. Elmaz, F., Eyckerman, R., Casteels, W., Latré, S., Hellinckx, P.: CNN-LSTM architecture for predictive indoor temperature modeling. Build. Environ. 206, 108327 (2021)
22. Cao, X.H., Stojkovic, I., Obradovic, Z.: A robust data scaling algorithm to improve classification accuracies in biomedical data. BMC Bioinform. 17, 359 (2016)
Research on Prediction Model of Grain Temperature Based on Hybrid Model

Yang Wang1, Shuo Liu1(✉), Weihu Liu1, Yongfu Wu1, and Wenbing Ma2

1 School of Mathematics and Computer, Wuhan Polytechnic University, Wuhan 430072, China
2 Wuhan National Rice Trading Center Co., Ltd., Wuhan 430000, Hubei, China
Abstract. Grain temperature is an important indicator that affects the amount of grain during storage, and accurate prediction of grain temperature has great social significance and application value. On this basis, a combination prediction model is constructed from ensemble empirical mode decomposition (EEMD), the long short-term memory neural network (LSTM), and support vector regression (SVR). EEMD decomposition makes the input data mutually independent, reducing the correlation between them; the LSTM enhances the memory of previous grain temperature data; and finally the SVR corrects the prediction error that the LSTM produces as the input variables increase, yielding a better forecasting effect. The average surface temperature and influencing-factor data recorded in 2018 at warehouse No. 1 by Wuhan National Rice Trading Center Co., Ltd. are used as the modeling object. The experimental results show that the EEMD-LSTM-SVR combined model proposed in this paper achieves RMSE, MAE, and MSE values of 0.0761, 0.0014, and 0.0021, respectively, which proves that it forecasts better than the LSTM, SVR, EEMD-LSTM, and EEMD-SVR models. The research results of this paper can help the grain storage department carry out timely risk warning work and formulate reasonable risk management policies.

Keywords: Food temperature · Ensemble empirical mode decomposition · Long and short-term memory neural network · Support vector regression · Regression prediction
1 Introduction

Food security is a prerequisite for national health and national security, and food storage security is an important part of it. During storage, changes in the internal temperature of the granary affect the quality and quantity of the grain in the silo. Therefore, grasping how the grain temperature within the granary changes with time [1] and exploring its laws are important means of consolidating the foundation of national food security. The prediction methods for grain pile temperature can be roughly divided into three types: the traditional establishment of mathematical models, machine learning, and the recently developed deep learning. Traditional conventional prediction methods
include, for example, the work of Wang Ke et al. [2], who, based on the heat and mass transfer theory of hygroscopic porous media, used finite element analysis software to predict the grain temperature in the warehouse. Based on thermodynamic principles, Jin Libing et al. [3] used boundary conditions such as the heat transfer parameters of the silo enclosure structure and the grain storage environment to obtain cloud diagrams of the temperature field of an underground granary in different months of a year. Traditional modeling methods involve a large amount of data calculation, are time-consuming, and have large errors. Machine learning has been widely used in prediction research by virtue of its good nonlinear processing capabilities. For example, Wang Qiyang [4] took machine learning as the basic idea and proposed two stored-grain quality prediction models based on support vector regression, with corresponding optimization algorithms, giving full play to the advantages of machine learning in judging and predicting grain storage conditions. Nan Shaowei et al. [5] used a random forest algorithm to establish a prediction model for the mechanical ventilation temperature of grain piles and studied the internal relationship between grain pile temperature changes and their influencing factors. With the continuous development and improvement of deep learning technology, deep learning has been applied in more and more scenarios. For example, Guo Pingfei [6] proposed using an LSTM model with an improved activation function for grain condition monitoring and research; Duan Shanshan [7] established a temperature field prediction model based on the SVR model with different kernel functions. To further develop better prediction models and improve the accuracy and stability of prediction, a large number of experimental studies have found that combined models can improve prediction accuracy compared with single models. For example, Ni Fan [8] proposed a particle swarm optimization algorithm to improve SVR-based prediction of stored-grain ventilation temperature; although the accuracy of this hybrid model improved, it does not consider the SVR's lack of memory of historical data features. Zhang Feilong [9] used the EMD algorithm to decompose the fluctuation of grain production in the regions of Hohhot and examined the fluctuation characteristics of each region through component analysis, achieving good prediction results; however, variable decomposition algorithms are still rarely used in the field of grain temperature prediction. Based on the above ideas, and given the nonlinearity, time delay, and weak anti-interference ability of the grain temperature sequence, this article fully considers the 10 main external factors that affect grain temperature and uses an LSTM-SVR model combined with a variable decomposition method to predict the grain temperature. EEMD, which is suitable for nonlinear and non-stationary signal processing, decomposes the environmental factors and reduces the correlation between the environmental factor data. The SVR model is then used to correct the prediction error of the LSTM model, improving the prediction accuracy of the entire model. This method utilizes both the memory ability of the LSTM for time series data and the generalization ability of the SVR, which makes the prediction of grain temperature better. At the same time, the LSTM, SVR, EEMD-SVR, EEMD-LSTM, and EEMD-LSTM-SVR combined models are compared. The research results of this paper can bring actual economic benefits to the safe storage of food and have good promotion value in the food storage sector.
2 Data Source and Preprocessing

The data in this paper come from the temperature records of May to early September 2018 monitored by the research project on shallow ground energy and air-cooled grain-pile surface temperature control green grain storage technology of Wuhan National Rice Trading Center Co., Ltd. The original data form a continuous time series of length 162. The data set includes grain temperature data for three bins; the bin temperature is measured at five detection points (the four corners and the center), giving five sets of data indicators. We choose the No. 1 warehouse, which has relatively few missing data, and the average surface temperature, which greatly affects grain storage, as the research object, and model and analyze 10 environmental series: external temperature, external humidity, warehouse humidity, upper warehouse temperature, upper-middle warehouse temperature, middle warehouse temperature, middle-lower warehouse temperature, lower warehouse temperature, maximum temperature, and minimum temperature. Observation shows that the original data contain a small number of missing values. The random forest algorithm [10] fills in missing values by constructing multiple decision trees, so that the filled data carry randomness and uncertainty and better reflect the true distribution of the unknown data. Therefore, the random forest algorithm is used to fill in the missing values of the original data.
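The random-forest filling can be sketched with scikit-learn by plugging a RandomForestRegressor into IterativeImputer; an approximation of the described procedure, with assumed names, not the authors' exact code.

```python
# Fill missing values by iteratively regressing each gappy variable on the
# others with a random forest, which adds the randomness described above.
import pandas as pd
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

df = pd.read_csv('warehouse1.csv')                  # assumed data file
num = df.select_dtypes('number')
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=100, random_state=0))
df[num.columns] = imputer.fit_transform(num)
```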
3 Model Design

3.1 Introduction to LSTM
The LSTM [11] adds forget gates, input gates, and output gates to the recurrent neural network (RNN), which enables the model to selectively memorize valid information and delete invalid information, remedying the RNN's problems of gradient explosion and easy gradient disappearance. The LSTM network model learns long-term dependence on information and can use the state of the previous moment to deduce the state of the next moment, achieving the "memory" function and thereby effectively improving the prediction accuracy of the model. The forward propagation formulas are as follows:

(1) Forget gate layer:

$$f_t = \sigma(w_f \cdot [h_{t-1}, x_t] + b_f) \qquad (1)$$

(2) Input gate layer:

$$i_t = \sigma(w_i \cdot [h_{t-1}, x_t] + b_i) \qquad (2)$$

(3) New memory cell:

$$\tilde{c}_t = \tanh(w_c \cdot [h_{t-1}, x_t] + b_c) \qquad (3)$$

(4) Final memory cell:

$$c_t = f_t * c_{t-1} + i_t * \tilde{c}_t \qquad (4)$$

(5) Output gate layer:

$$o_t = \sigma(w_o \cdot [h_{t-1}, x_t] + b_o) \qquad (5)$$

$$h_t = o_t * \tanh(c_t) \qquad (6)$$

In the above formulas, $w_f$, $w_i$, $w_c$, $w_o$ represent the weights; $x_t$ is the current input; $h_{t-1}$ is the output at the previous moment; $c_{t-1}$ is the cell state at the previous moment; $\tilde{c}_t$ is the candidate cell state and $c_t$ the new cell state; $h_t$ is the output at the current moment; $f_t$, $i_t$, and $o_t$ are the outputs of the forget gate, input gate, and output gate at moment $t$; $\sigma$ and $\tanh$ are activation functions; and $b_f$, $b_i$, $b_c$, $b_o$ are the biases.

3.2 Introduction to SVR
SVR [12] is an important branch of the SVM [13] and is mainly used for regression prediction problems. Its solution process is very similar to that of the SVM: the data are mapped from a low-dimensional space to a high-dimensional space to find the optimal hyperplane. The simplified SVR model expression is:

$$f(x) = \sum_{i=1}^{l} \beta_i\, k(x_i, x) + c \qquad (7)$$

In the above formula, $c$ is the constant that needs to be optimized, $\beta_i$ is the vector coefficient, $x_i$ is a support vector, and $k(x_i, x)$ is the inner product kernel function. The radial basis kernel function (RBF) fits quickly, has strong generalization ability, and is widely used; its formula is:

$$k(x_i, x) = \exp\left(-\frac{\|x - x_i\|^2}{2\sigma^2}\right) \qquad (8)$$

In the above formula, $\sigma$ is the width of the kernel function; its role is to limit the radial range of the function.
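An SVR with this RBF kernel is available, for example, in scikit-learn; a hedged sketch with stand-in data (the authors work in Matlab). Note that sklearn's gamma corresponds to 1/(2σ²) in the kernel-width notation above.

```python
# SVR with an RBF kernel, cf. Eqs. (7)-(8). The arrays are random
# stand-ins for the 10 environmental factors and the target temperature.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_train = rng.random((100, 10))
y_train = rng.random(100)

model = SVR(kernel='rbf', C=1.0, gamma='scale')
model.fit(X_train, y_train)
y_pred = model.predict(X_train[:5])
```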
3.3 Combination Model Construction
The traditional neural network grain temperature prediction model is limited by the granary data source, ignores the influence of some external environmental factors on grain temperature, and fails to make effective use of external influencing factors, which greatly affects its prediction performance. This article starts from grain temperature data collected by a real enterprise. It not only fully considers the environmental factors that affect grain temperature but also, to highlight their influence on grain temperature prediction, refers to Liu Shanghui's [14] discussion of the contributions of individual factors in multiple regression: the influences of single factors on the dependent variable are additive only when the different independent variables are mutually independent, that is, when there is no correlation between them; only then does the total influence equal the sum of the influences of the single independent variables on the dependent variable. Therefore, EEMD decomposition is used to decompose the environmental factors into mutually independent IMF components and an RES component, reducing the correlation of the input variables. Each IMF represents the oscillating changes of the original signal in a different frequency band, reflecting local characteristics of the signal, while the final residual component RES reflects the slow change in the signal. The decomposed sequences are taken as the input of the mixed model. The decomposition process is as follows:

$$x_i(t) = x(t) + n_i(t) \qquad (9)$$

$$x_i(t) = \sum_{j=1}^{J} d_{i,j}(t) + r_{i,J}(t) \qquad (10)$$

$$c_j(t) = \frac{1}{M}\sum_{i=1}^{M} c_{i,j}(t) \qquad (11)$$
In the above formulas, $x(t)$ is the original signal, $n_i(t)$ is a standard normally distributed noise signal, and $x_i(t)$ is the new signal; the subscript $i$ denotes the $i$-th added white noise signal and the corresponding noisy signal, with $i$ ranging from 1 to $M$, where $M$ is the number of decompositions. $c_j(t)$ represents the $j$-th IMF obtained by EEMD, with $j$ ranging from 1 to $J$, where $J$ is the number of IMFs obtained by the decomposition. Although the LSTM can use the state of the previous moment to deduce the state of the next moment to realize the memory function, and has been widely used in the field of grain temperature prediction, the increase in the number of input variables of the neural network after EEMD decomposition gradually slows the convergence speed of the model, and problems such as overfitting gradually appear. Therefore, considering the one-sidedness of a single neural network's predictions, the SVR model is used to correct the prediction error generated by the LSTM model. By combining the advantages of each model, the accuracy of grain temperature prediction can be improved. Figure 1 is a flow chart of grain temperature prediction based on the EEMD-LSTM-SVR multivariate input model.

Model input: EEMD is used to decompose the external temperature, external humidity, warehouse humidity, upper warehouse temperature, upper-middle warehouse temperature, middle warehouse temperature, middle-lower warehouse temperature, lower warehouse temperature, maximum temperature, and minimum temperature into intrinsic mode function components IMF1, IMF2, …, IMFm of different frequency characteristics and a residual component RES.
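This decomposition step can be sketched with the PyEMD package (distributed as EMD-signal); a hedged illustration in which a synthetic series stands in for one environmental factor.

```python
# EEMD decomposition of one series into IMFs plus a residual, Eqs. (9)-(11).
import numpy as np
from PyEMD import EEMD

t = np.linspace(0, 1, 162)                  # length matches the data series
signal = np.sin(8 * np.pi * t) + 0.5 * t    # synthetic stand-in series

eemd = EEMD(trials=100)       # M noise-added decompositions, cf. Eq. (11)
imfs = eemd.eemd(signal)      # rows: IMF1 ... IMFm and the residual
print(imfs.shape)
```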
232
Y. Wang et al.
Fig. 1. Work flow chart of EEMD-LSTM-SVR multivariate mixed model
temperature data and the decomposed data, normalize all the data, convert it into a data set suitable for LSTM and SVR model training, divide the training set and the test set, the division ratio of the training set and the test set is 9:1.
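As a minimal sketch of this input-preparation step (the authors work in Matlab; Python is used here only as an illustrative alternative), the fragment below assumes the third-party PyEMD package (installed as EMD-signal); the factor names and the synthetic series are hypothetical stand-ins for the real granary data.

```python
import numpy as np
from PyEMD import EEMD  # pip install EMD-signal

def eemd_decompose(series, trials=100, noise_width=0.2):
    """Decompose one series into its IMFs plus a residue, cf. Eqs. (9)-(11)."""
    s = np.asarray(series, dtype=float)
    imfs = EEMD(trials=trials, noise_width=noise_width).eemd(s)
    res = s - imfs.sum(axis=0)          # whatever remains is the residue RES
    return np.vstack([imfs, res])

rng = np.random.default_rng(0)
# synthetic stand-ins for the environmental series listed above
factors = {name: rng.normal(0.0, 1.0, 720).cumsum()
           for name in ("external_temp", "external_humidity", "warehouse_humidity")}
features = np.vstack([eemd_decompose(s) for s in factors.values()])

# min-max normalization and the 9:1 train/test split described above
lo = features.min(axis=1, keepdims=True)
hi = features.max(axis=1, keepdims=True)
features = (features - lo) / (hi - lo + 1e-12)
split = int(features.shape[1] * 0.9)
train, test = features[:, :split], features[:, split:]
```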
Model training: The decomposed data are used as input data and the average surface temperature data as output data. Initialize the LSTM and SVR model parameters and substitute the training set into the LSTM and SVR models for training. Training stops when a model reaches the maximum number of iterations or the minimum of the loss function, and the training results are saved. The test set is then substituted into each model to obtain its prediction results.

Model mixing: Define the minimum error-sum-of-squares function f(x) shown below, introduce the unknown weight variables w1 and w2, and substitute the test-set results of the LSTM and SVR models into it. The optimal weights w1 and w2 are obtained when the value of the function is smallest. The test result of the mixed model is obtained by multiplying the prediction results of LSTM and SVR by the optimal weights and adding them:

$$f(x) = \sum_{i=1}^{n} \left( x_i - \sum_{k=1}^{K} w_k x_{ik} \right)^2 \quad (12)$$
Weight acquisition: A quadratic programming algorithm is used in the process of obtaining the weights. It transforms the nonlinear optimization problem with equality and inequality constraints into a quadratic programming problem. The constraint conditions are set so that the optimal weights are obtained when f(x) satisfies the boundary conditions of the nonlinearly constrained minimum, min f(x):

$$\begin{cases} \min f(x) \\ A x \le b, \quad A_{eq} x = b_{eq} \\ lb \le x \le ub \end{cases} \quad (13)$$
Model saving: The weights are multiplied by the prediction results of the LSTM and SVR models and superimposed to obtain the final prediction results, and the model evaluation indicators RMSE, MAE and MSE are output. In the above formulas, w_k is a weight, x_{ik} is the fitting result of the k-th model, and x_i is the true value; the value of k depends on the number of models, and the value of i on the number of test samples. lb and ub are custom boundary constraints. When the constraints A x ≤ b and A_eq x = b_eq are satisfied and the value of f(x) is minimized, the optimal weights w_k are saved.
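The paper realizes this step with Matlab's fmincon; the sketch below is a hedged Python equivalent using SciPy's SLSQP solver. The sum-to-one equality constraint and the [0, 1] bounds are assumptions standing in for the paper's A_eq x = b_eq and lb ≤ x ≤ ub, and the base-model predictions are synthetic stand-ins.

```python
import numpy as np
from scipy.optimize import minimize

def f(w, preds, y_true):
    """Eq. (12): sum of squared errors of the weighted blend of model outputs."""
    return np.sum((y_true - preds @ w) ** 2)

rng = np.random.default_rng(1)
y_true = np.sin(np.linspace(0.0, 3.0, 50))
preds = np.column_stack([y_true + rng.normal(0, 0.05, 50),   # stand-in LSTM output
                         y_true + rng.normal(0, 0.03, 50)])  # stand-in SVR output

constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1.0}]  # assumed Aeq*x = beq
bounds = [(0.0, 1.0), (0.0, 1.0)]                               # assumed lb and ub
opt = minimize(f, x0=np.array([0.5, 0.5]), args=(preds, y_true),
               method="SLSQP", bounds=bounds, constraints=constraints)
w1, w2 = opt.x
final_prediction = preds @ opt.x   # weighted superposition of LSTM and SVR outputs
```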
4 Experimental Analysis

The construction, training and prediction of the EEMD, LSTM and SVR models were carried out in a Windows + Matlab environment. The main configuration includes CPU: Intel i5-5200K @ 1.2 GHz; operating system: Windows 10. The experiment selects the temperature data from May to September as the training set and predicts the daily average temperature of the grain pile for the next 16 days. The parameter selection of the models is described below.

4.1 Model Evaluation Indicators
In order to compare the prediction results of different prediction methods, this paper uses three indicators [15], root mean square error (RMSE), mean absolute error (MAE) and mean square error (MSE), to measure the prediction results of each model:

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2} \quad (14)$$

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |Y_i - y_i| \quad (15)$$

$$MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_i - y_i)^2 \quad (16)$$
Here the predicted values are Y_i ∈ {Y_1, Y_2, …, Y_n}, the true values are y_i ∈ {y_1, y_2, …, y_n}, and n is the number of samples.
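The three indicators translate directly into code; a small illustrative sketch:

```python
import numpy as np

def rmse(Y, y):  # Eq. (14)
    return np.sqrt(np.mean((Y - y) ** 2))

def mae(Y, y):   # Eq. (15)
    return np.mean(np.abs(Y - y))

def mse(Y, y):   # Eq. (16)
    return np.mean((Y - y) ** 2)
```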
4.2 Model Impact Factor Analysis
The SVR and LSTM models are built with Matlab library functions, trained on the training data, and their parameters are adjusted continuously to improve single-model prediction accuracy, laying the foundation for the subsequent experiments. The parameter selection and adjustment process is described below.

4.2.1 Number of Hidden Layers of LSTM

In theory, as hidden layers are added the time overhead grows out of control with the number of neurons, and the memory overhead also becomes very large, making parallel optimization difficult. On the other hand, gradients vanish between layers, which is the most fatal point: when the number of layers exceeds three, the vanishing of gradients between layers becomes very obvious. In addition, in the sequence model the update iterations of the LSTM layers close to the input slow down, convergence quality and efficiency drop sharply, and the model easily falls into a local minimum. Keeping the other model parameters unchanged, the influence of 1 to 4 hidden layers on the prediction results of the multi-input LSTM model is examined, as shown in Table 1.

Table 1. The influence of the number of hidden layers on the prediction results

Hidden layers   1 layer   2 layers   3 layers   4 layers
RMSE            0.0245    0.0183     0.0315     0.0421
MAE             0.2341    0.2162     0.2467     0.3465
MSE             0.0322    0.0183     0.0561     0.0672
It can be seen from Table 1 that when 2 hidden layers are selected, the RMSE, MSE and MAE of the model are all better than for other depths. Each indicator first falls and then gradually rises, showing that the training effect changes with the number of layers; therefore 2 hidden layers are selected.

4.2.2 SVR Parameter Selection

For SVR models, the type of kernel function affects the fitting effect, the size of the penalty factor causes over-fitting or under-fitting, and the value of the kernel coefficient affects the accuracy of the model. By searching combinations of these parameters, the model that best fits the current data, neither over-fitting nor under-fitting, can be found. Tables 2, 3 and 4 report the influence of the penalty factor, the kernel coefficient and the kernel function, respectively.

Table 2. The influence of the penalty factor on the prediction result

Penalty factor   0.2      0.4      0.8      0.9
RMSE             0.1642   0.0835   0.0296   0.0346
MAE              0.6528   0.2976   0.2462   0.2354
MSE              0.3214   0.2341   0.0521   0.1234

Table 3. The influence of the kernel coefficient on the prediction result

Kernel coefficient   0.2      0.4      0.8      0.9
RMSE                 0.1642   0.0835   0.0296   0.0346
MAE                  0.6528   0.2976   0.2462   0.2354
MSE                  0.3214   0.2341   0.0521   0.1234

Table 4. The influence of the kernel function on the prediction results

Kernel function   RBF      LINEAR   POLY
RMSE              0.0296   0.2684   0.4213
MAE               0.2462   0.3598   1.2346
MSE               0.0521   0.0387   0.2467
It can be seen from Tables 2, 3 and 4 that when the RBF kernel function is selected, the penalty factor is 0.8 and the kernel coefficient is 0.1, the three indicators of the model, RMSE, MAE and MSE, fall to their lowest values and then gradually rise, indicating that the prediction effect of the model is best at this point. Therefore, the RBF kernel is chosen, the penalty factor is set to 0.8, and the kernel coefficient to 0.1.
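In scikit-learn terms the selected configuration corresponds to the epsilon-SVR sketched below. This is a hedged illustration only: the paper uses Matlab, and the data here are synthetic stand-ins.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 8))                  # stand-in for the decomposed inputs
y = X @ rng.normal(size=8) + rng.normal(0, 0.1, 200)

# epsilon-SVR with the parameters selected above: RBF kernel, C = 0.8, gamma = 0.1
svr = SVR(kernel="rbf", C=0.8, gamma=0.1, epsilon=0.1)
svr.fit(X[:180], y[:180])
print(svr.score(X[180:], y[180:]))             # R^2 score on the held-out part
```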
4.3 Model Prediction
Model prediction is considered from three directions: whether environmental factors are considered, whether the data are decomposed, and whether the models are combined. The average surface temperature and the environmental factors are each decomposed by EEMD to verify the importance of considering environmental factors, and the results are contrasted with the non-decomposed case; the combined model is compared with the uncombined models to verify the advantages of combination.

Regardless of environmental factors: the average surface temperature sequence is decomposed by EEMD into 7 IMFs and 1 RES, 8 dimensions in total. The data are normalized, divided into training and test sets, and the temperature of the first three days is used to predict the temperature of the next day. The data are transformed into a set suitable for LSTM and SVR model training, and the LSTM and SVR model parameters are initialized. Following the parameter analysis above, the LSTM model has 3 input features, 1 output feature and 2 network layers; the number of iterations (epoch) is 150, Batch_size is 32, and the time step time_step is 12. The training samples are input into 7 identical LSTMs, and the training results of the individual LSTM models are superimposed. SVR uses ε-support vector regression with the RBF kernel, C = 0.8, gamma = 0.2 and epsilon = 0.1; the data are substituted into the SVR model for training. Using the fmincon function built into Matlab with default boundary conditions, the optimal weights w1 and w2 that minimize the custom error loss function are found, substituted into the test-set results of LSTM and SVR, and the weighted results are superimposed. The model training effect is shown in Fig. 2.

Fig. 2. Comparison of fitting curves of model test set

Figure 2 shows that for the first few samples the prediction results of all models are close to the true value, after which the differences between the models grow. Compared with LSTM, the predicted values of SVR are closer to the true values, but the predictions of the combined model always fluctuate near the true values, indicating that the combined model trains better: it combines the advantages of the single models, overcomes the shortcomings of single-model prediction, and has high feasibility and accuracy.

Considering environmental factors: the external temperature, external humidity, warehouse humidity, warehouse temperatures 1 to 5, maximum temperature and minimum temperature are decomposed by the EEMD algorithm, giving 49 IMFs and 10 RES components, 59 dimensions in total. The average surface temperature data are added, all the data are normalized and converted into a data set suitable for LSTM and SVR model training, and training and test sets are divided. Following the experimental analysis above, the LSTM model has 59 input features, 1 output feature and 2 network layers; the number of iterations (epoch) is 150, Batch_size is 32, and the time step time_step is 12.
The decomposed data are input into the LSTM for training and the training results are superimposed. SVR again uses ε-support vector regression with the RBF kernel, C = 0.8, gamma = 0.2 and epsilon = 0.1; the data are substituted into the SVR model for training to obtain the fitting results on the training and test sets. Using the fmincon function built into Matlab with default boundary conditions, the optimal weights w1 and w2 that minimize the custom error loss function are found, substituted into the test sets of LSTM and SVR, and the training results are superimposed. The model training effect is shown in Fig. 3.

After the model is trained, the statistics of MAE, RMSE and MSE on the test set are given in Table 5 (environmental factors not considered) and Table 6 (environmental factors considered). As can be seen in Fig. 3, after considering and decomposing the environmental factors, the training effect of the model improves. The error of the model increases over the first few samples and then gradually stabilizes; where the LSTM later begins to drift from the true values, correction by the SVR model brings the prediction closer to them, and the prediction of the combined model is always more stable. Tables 5 and 6 show that the training effect of the model improves step by step and that environmental factors have a great influence on grain temperature. This result verifies the advantages of the hybrid model in grain temperature prediction.
Fig. 3. Comparison of fitting curves of model test set

Table 5. Analysis and comparison of results without considering environmental factors

       LSTM     SVR      EEMD-LSTM-SVR
MAE    0.0406   0.0197   0.0185
RMSE   0.4423   0.2332   0.2233
MSE    0.1956   0.0544   0.0498

Table 6. Analysis and comparison of results considering environmental factors

       LSTM     EEMD-LSTM   EEMD-SVR   EEMD-LSTM-SVR
MAE    0.0296   0.0049      0.0034     0.0014
RMSE   0.2462   0.1560      0.0878     0.0761
MSE    0.0521   0.0214      0.0074     0.0021
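The experiments above were run in Matlab; for illustration, a hedged Keras sketch of the LSTM branch with the hyperparameters reported in Sect. 4.3 (59 input features, 2 layers, 150 epochs, batch size 32, time step 12) follows. The layer width of 64 units and the synthetic data are assumptions, not values from the paper.

```python
import numpy as np
import tensorflow as tf

TIME_STEP, N_FEATURES = 12, 59        # values reported in Sect. 4.3

model = tf.keras.Sequential([
    tf.keras.Input(shape=(TIME_STEP, N_FEATURES)),
    tf.keras.layers.LSTM(64, return_sequences=True),  # width 64 is an assumption
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(1),         # one output: the average surface temperature
])
model.compile(optimizer="adam", loss="mse")

rng = np.random.default_rng(3)        # synthetic stand-in for the decomposed data
X = rng.normal(size=(500, TIME_STEP, N_FEATURES)).astype("float32")
y = rng.normal(size=(500, 1)).astype("float32")
model.fit(X, y, epochs=150, batch_size=32, validation_split=0.1, verbose=0)
```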
5 Conclusions

Addressing the non-linearity, time delay and weak anti-interference ability of grain temperature series, this paper defines a hybrid model based on variable decomposition. Unlike previous studies, it takes into account the importance of environmental factors to changes in grain temperature and reduces the correlation of the environmental factors, which improves the prediction accuracy of grain temperature and provides a reference for granary temperature prediction. On the other hand, considering the disadvantages of single-model prediction, a hybrid-model experiment is added, which has obvious auxiliary decision-making significance for enterprise food security management, verifies the practicability of the hybrid model in the grain temperature domain, expands the application scope of deep learning technology, and provides a new perspective for in-depth discussion of the economic operation and scheduling of the granary system. Using this prediction method, the temperature change trend in the granary can be detected in time, so that manual intervention can be carried out promptly.

There is room for improvement in the accuracy of grain temperature prediction. This article considers only some of the external environmental factors that affect grain temperature changes, while in fact these factors are diverse; the advantages of other regression prediction models over the models proposed here are not discussed, and automatic optimization of the model parameters has not been carried out. A follow-up article will discuss the addition of external factors, the optimization of model parameters, and the comparison and analysis of more regression prediction models, contributing to the safe grain storage of enterprises.
References

1. Liu, Z., Zhu, Y., Zhen, T.: Current situation and methods of research on granary temperature in grain storage ecosystem. J. Henan Univ. Technol. (Nat. Sci. Ed.) 37(02), 123–128 (2016)
2. Wang, K., Wang, Y., Yu, X., Yu, H.: Numerical simulation of lateral cooling and water retention ventilation in storage silos under different ventilation temperatures. Cereals Oils Food Sci. Technol. 29(03), 198–207 (2021)
3. Jin, L., Xue, Y., Liang, X., Wang, Z.: Numerical simulation and experimental research on the temperature field of underground granary. J. Henan Univ. Technol. (Nat. Sci. Ed.) 40(05), 120–125 (2019)
4. Wang, Q.: Research on the prediction method and application of stored grain quality based on machine learning. Jilin University (2021)
5. Han, J., Nan, S., Li, J., Guo, C.: Research on prediction and control of mechanical ventilation temperature of grain piles based on random forest algorithm. J. Henan Univ. Technol. (Nat. Sci. Ed.) 40(05), 107–113 (2019)
6. Guo, P.: Research on grain condition monitoring and early warning model based on deep learning. Henan University of Technology (2019). https://doi.org/10.27791/d.cnki.ghegy.2019.000175
7. Duan, S., Yang, W., Xiao, L., Zhang, Y.: A method for predicting the surface temperature of stored grain piles based on meteorological data. J. Chin. Cereals Oils Assoc. 35(02), 152–158 (2020)
8. Ni, F.: Research on temperature modeling method of stored grain transverse ventilation process based on support vector machine. Beijing University of Posts and Telecommunications (2017)
9. Zhang, F.: Research on the fluctuation of grain output in Hohhot area. Inner Mongolia Agricultural University (2017)
10. Shen, L., Hu, G., Chen, L., Tan, H.: Application of missing forest algorithm in missing value filling. China Health Stat. 31(05), 774–776 (2014)
11. Li, H.: Analysis of the impact of corona virus disease 2019 on stock returns based on time series and LSTM models. In: Proceedings of 2nd International Conference on Education Technology, Economic Management and Social Sciences (ETEMSS 2021), p. 11. Wuhan Zhicheng Times Cultural Development Co., Ltd. (2021)
12. Zhao, R., Cheng, X., Xu, X., Song, T., Sun, Y.: The warning and prevention system of greenhouse diseases based on PSO-SVR model. J. Jiangsu Agric. Sci. 37(04), 854–860 (2021)
13. Yuan, J., Chang, Q., Zhao, H., Tang, F.: Study on the classification method of grain mildew prediction based on SVM. J. Chin. Cereals Oils Assoc. 1–13 (2021)
14. Liu, S.: Analysis method and realization of the contribution of each factor to regression in multiple regression. J. Math. Med. (06), 14–15 (2005)
15. Tang, L., Cui, Z., Liu, K.: Analysis and construction of EEMD smart model and fuzzy forecasting through improved Bayesian estimation. J. Phys.: Conf. Ser. 1982(1) (2021)
Scenario Forecasting of University Development Based on a Combined Cognitive Model

Andrey Mikryukov and Mikhail Mazurov

Plekhanov Russian University of Economics, Moscow, Russia
[email protected]

Abstract. A new approach is proposed to solving the problem of predicting the performance indicators of a university as a weakly structured system on the basis of an aggregated model that includes an ensemble of gray fuzzy cognitive maps (FCMs). A combination of cognitive modeling, interval mathematics and causal algebra is used to construct the gray FCM. Aggregation of gray FCM models gives a synergistic effect in which the disadvantages of some models are compensated by the advantages of others. The use of the interval approach in describing the relationships between concepts made it possible to reduce the uncertainty of expert assessments of the strength of the relationships between the concepts of the cognitive map and to increase the efficiency of the forecasting process. The model based on the ensemble of gray FCMs increased the reliability of the solution by using several options for formalizing the knowledge and experience of experts. The novelty of the approach lies in the use, when constructing the fuzzy cognitive map, of special constructions in the form of interval estimates of the strength of connections between concepts, providing an adequate cognitive model, and in the use of a predictive ensemble of gray FCMs, on the basis of which a more accurate result is obtained in comparison with a single FCM and with existing models of classical FCM. Under the given constraints, the proposed approach yields the most acceptable scenario for planning the increment of the basic indicators of the university's activity to target values by identifying the latent factors affecting them and calculating the set of impulse influences on the latent factors that ensures the achievement of the goal.

Keywords: Scenario prediction · Gray fuzzy cognitive map · Interval fuzzy set
1 Introduction

The purpose of developing a combined cognitive model is to build the most preferable scenario for the development of the university. The relevance of the problem is due to the need to develop scientifically grounded proposals for achieving the required values of the university's target indicators in the international QS ranking by 2025, by calculating the necessary increments of latent factors considering their correlations with the target indicators. In turn, the main indicator, called the functional F (university rating R), is the sum of the products of the values of the target indicators and their weight coefficients [1].
The analysis showed that the problem posed belongs to the class of semi-structured problems, which must be solved under conditions of a limited amount of initial data and several kinds of uncertainty. One approach that has proven itself well in modeling semi-structured systems is fuzzy cognitive modeling, which combines methods of linguistic, analytical and statistical description. Its indisputable advantage over other approaches is the possibility of formally describing immeasurable factors, representing the interdependence between objects and components as fuzzy relations of mutual influence using methods of fuzzy causal algebra, and using incomplete, fuzzy and even contradictory information [2]. Fuzzy cognitive modeling best reflects the uncertainty and dynamics of the states of the concepts of a cognitive map and their interrelationships: the FCM model describes the behavior of a complex system, and each FCM concept describes one of its factor characteristics.

Complex semi-structured systems characterized by high uncertainty and stochasticity include social, socio-economic and organizational-technical systems. To solve the stated problem, the article substantiates the application of the methodology of fuzzy cognitive modeling, designed for analysis and decision-making in poorly structured situations, in the form of a model based on an ensemble of gray FCMs, which allows choosing the most preferable alternative.

The article presents the results of developing approaches to modeling semi-structured systems based on FCM. To reduce the uncertainty of expert assessments of the strength of connections between FCM concepts, which significantly affect the model, methods of interval mathematics underlying the gray FCM are applied; the models combine cognitive modeling, interval mathematics and causal algebra. To improve the accuracy of the modeling results based on gray FCMs, a predictive ensemble is used to obtain an adequate cognitive model. The novelty lies in the use, when constructing the fuzzy cognitive map, of special constructions in the form of interval estimates of the strength of connections between concepts, providing an adequate cognitive model, as well as in the use of a predictive ensemble of gray FCMs, on the basis of which a more accurate result is obtained compared to existing models.

The proposed approach made it possible, under the given constraints, to obtain the most acceptable scenario for planning the increment of the basic indicators of the university's activity to the target values by identifying the latent factors affecting them and calculating the set of impulse influences on these factors that guarantees the achievement of the set goal.
2 Literature Review

A significant number of works by domestic and foreign scientists are devoted to methods of cognitive modeling of semi-structured systems, a class that includes socio-economic systems. Works [2–13] consider the tools used in cognitive modeling, which can provide exploratory and evaluative analysis of various strategies for the behavior of such systems and, among other things, make it possible to predict scenarios of their development. The difficulties of constructing adequate models are associated with the lack of reliable data and the presence of many uncertainties and hidden quantitative and qualitative patterns inherent in such systems, which do not always allow obtaining the desired simulation results.

The works [6, 7] show the approaches and features of using fuzzy cognitive methods when modeling semi-structured systems with an insufficient number of quantitative indicators. The advantages of fuzzy cognitive maps are their simplicity and clarity, the identification of the structure of causal relationships between elements of a complex system that are difficult to analyze quantitatively by traditional methods, the use of the knowledge and experience of experts in a specific subject area, and adaptation to the uncertainty of the initial data and the conditions of the problem being solved. In [9], methods of constructing and analyzing FCMs are considered, including the analysis of their stability, reliability and accuracy, as well as methods for determining their system characteristics. The works [9, 12, 13] present the varieties of FCM: classical, generalized, relational, production, etc. Among generalized FCMs, gray, interval-valued, rough and intuitionistic maps are distinguished; they use approaches that reduce the influence of some uncertainty factors, which in turn improves some system characteristics of the FCM and increases the accuracy and reliability of the simulation results. Works [14–16] show that a promising modeling method is the composition of models into an ensemble, whose effectiveness has been proved experimentally.

The analysis showed that, to build a cognitive model of the university's activity as a semi-structured system, it is advisable to use a variety of generalized FCM called gray FCM, whose feature is the ability to represent the strength of connections between concepts using special constructions in the form of interval estimates [5, 13]. In this case, the error in assessing the strength of connections between concepts, which is of a pronounced subjective nature, is significantly reduced: it is not collapsed to an averaged point value but represented as interval values. To improve the accuracy and reliability of the final forecasting result, an approach based on an ensemble of gray FCMs was developed, which implements several options for formalizing the knowledge and experience of experts, as well as a weighted voting method that yields a more accurate result than a separate FCM. The developed cognitive model made it possible, under the given constraints, to find the most acceptable scenario for planning the increment of baseline indicators to target values.
In the course of the study, the following tasks were solved: a combined cognitive model of scenario forecasting of measures to achieve the required values of the target indicators of the university's activity in the international institutional ranking QS was developed; on the basis of the developed model, the calculation of the most preferable variant of the set of required intensities of influence on the control variables (latent factors) for a given increment of the value of the target factor (university rating) was carried out.
3 Scenario Forecasting Methodology of University Performance Indicators Based on a Gray Fuzzy Cognitive Maps Ensemble

In the context of the problem under consideration, a scenario is understood as a dynamic sequence of possible events of change in the values of cause factors and consequence factors that affect the target indicators [4]. To solve this problem, an approach to scenario forecasting based on a gray fuzzy cognitive map is proposed, whose feature is the use of a special construction that reduces the uncertainty (spread) of expert estimates of the states of the cognitive map concepts.

As is known, an FCM is specified by a tuple of sets [4]:

$$FCM = \langle C, F, W \rangle,$$

where C = {C_j} is the set of concepts, the vertices of the graph, representing the factors most significant in the problem under consideration; F = {F_k} is the set of directed arcs of the graph of connections between concepts; W = {W_ij} is the set of weights of the FCM links (a link can strengthen, W_ij > 0, or weaken, W_ij < 0, the influence of concept C_i on concept C_j). The values of the weights W_ij are set using a fuzzy linguistic scale: each value of the linguistic variable is assigned a certain interval in [0, 1] for positive relationships or in [−1, 0] for negative relationships. At an arbitrary discrete time t = 0, 1, 2, …, n, the state of the FCM is described by the equation

$$X_i(t+1) = f\left[X_i(t) + \sum_{j=1}^{n} W_{ij} X_j(t)\right], \quad i = 1, 2, \ldots, n, \quad (1)$$

where X_i(t) is the value of the state variable of the i-th concept C_i at time t; n is the number of FCM concepts; f is the concept's nonlinear function. The initial conditions for the calculation are determined by the vector X(0) = (X_1(0), X_2(0), …, X_n(0))^T. Each concept is characterized by a term-set of a linguistic variable

$$T_i = \{T_i^1, T_i^2, \ldots, T_i^{m_j}\}, \quad (2)$$

where m_j is the number of typical states of the i-th concept. Each term T_i^k is described by a fuzzy set with a membership function μ_{T_i^k}(x). The connections between the typical states of each pair of concepts are specified by fuzzy variables described by the corresponding fuzzy sets. The values of the weights (link strengths) W_ij are set using a fuzzy linguistic scale, an ordered set of linguistic values (terms) of link strength estimates, for example of the form LINK_POWER = {Does not affect; Weak; Average; Strong; Very strong}. Each of these values is associated with a certain numerical range belonging to the segment [0, 1] for positive links, or to the segment [−1, 0] for negative links (Table 1).
Table 1. Assessment of the strength of the connection between concepts

Linguistic value   Numerical range   Term designation   Point estimate of link strength
No effect          0                 Z                  0
Very weak          (0; 0.15)         VL                 0.12
Weak               (0.15; 0.35)      L                  0.23
Medium             (0.35; 0.60)      M                  0.47
Strong             (0.60; 0.85)      H                  0.72
Very strong        (0.86; 1.0)       VH                 0.93
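Encoded directly, this scale is a simple lookup table; the sketch below copies the intervals and point estimates from Table 1 (negative links would mirror the ranges into [−1, 0]).

```python
# (lower bound, upper bound, point estimate) for positive links, from Table 1
LINK_POWER = {
    "Z":  (0.00, 0.00, 0.00),   # no effect
    "VL": (0.00, 0.15, 0.12),   # very weak
    "L":  (0.15, 0.35, 0.23),   # weak
    "M":  (0.35, 0.60, 0.47),   # medium
    "H":  (0.60, 0.85, 0.72),   # strong
    "VH": (0.86, 1.00, 0.93),   # very strong
}
```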
In the general case, a weighted digraph with arbitrary weights W_ij ∈ [−1, 1] is described by the dynamics of its state change in time. The state of the digraph (FCM) is determined by the set of states of its concepts C_i (i = 1, 2, …, n), each described by a state variable X_i(t) taking values in the interval [0, 1]. The strength of the connection between concept C_i and concept C_j is chosen by the expert as one of the linguistic values presented in the table, together with some "point" estimate of the strength, a number within the corresponding range (if there are several experts, the weight W_ij is averaged).

The disadvantage of this approach is the subjectivity of expert opinions, which can lead to a significant scatter of estimates of the states of concepts and thereby to a significant decrease in the reliability of the results of the constructed cognitive model due to incorrect assessment of the link weights. To eliminate this drawback, it is proposed to use special constructions implemented in fuzzy cognitive models based on gray FCMs, which describe the weights of connections between concepts of the cognitive map not by point estimates but by interval numbers given on fuzzy interval sets [6]. In this case, the equation of state of the FCM (1) can be represented by the expression

$$x_i(t+1) = f\left(X_i(t) \oplus \left(\bigoplus_{j=1,\, j \ne i}^{n} W_{ji} \otimes X_j(t)\right)\right), \quad i = 1, 2, \ldots, n, \quad (3)$$

where f is the activation function, and the link weights W_ji and the state variables X_i(t+1), X_i(t) are interval numbers, elements of fuzzy interval sets. The operations of addition ⊕ and multiplication ⊗ of interval numbers are specified on fuzzy interval sets. A gray set A ⊆ X can be represented as

$$A = \{\langle x, [\underline{x}, \overline{x}] \rangle \mid x \in X\}, \quad (4)$$
where the elements $x \in [\underline{x}, \overline{x}] \subseteq A$ of the gray set take values in the range [0, 1], $\underline{x}$ and $\overline{x}$ being the lower and upper bounds of the gray number and X the universal set. The number $\delta x = \overline{x} - \underline{x}$ is called the grayness of the number x, and $x_0 = (\underline{x} + \overline{x})/2$ is the "bleached" (central) value of this number. The weights of links between gray FCM concepts are set as gray numbers $[\underline{W}_{ij}, \overline{W}_{ij}]$. The state variables of concepts are also described by gray numbers, whose values lie in the interval $[\underline{X}_i, \overline{X}_i]$ and are determined by Eq. (3).

In general, the use of the interval approach has several advantages [13]:
– knowledge of the probabilistic characteristics of uncertain factors, which are rarely precisely known in practice, is not required;
– with the minimax approach, strict estimates are obtained for the desired quantities themselves, and not for probabilities or mathematical expectations, which is important when there are few parameter measurements and only one or several realizations.

The accuracy of the interval result is determined by the following factors: the uncertainty of the initial data; rounding when performing operations that modify or generate interval objects; the approximate nature of the numerical method used; and the degree to which the dependencies between the interval objects (variables and constants) participating in the calculation are taken into account.

At the first stage of the study, the latent factors that affect the target indicators were identified and interpreted and their significance was assessed using the methods of factor analysis [17–19]. At the next stage, a cognitive model was developed, including an ensemble of gray FCMs built with various options for formalizing the knowledge and experience of experts in the given subject area. Figure 1 shows a gray cognitive map in the form of a signed directed graph reflecting the influence of a set of factors on the basic indicators of the university's activity and the rating indicator [20, 21]. The cognitive map reflects the relationships among the latent factors, the baseline indicators and the functional, taking into account the correlation dependences between the functional and the basic indicators obtained by factor analysis in [18], as well as expert assessments of the mutual influence of the latent factors and their impact on the target indicators. The relationship between the basic indicators and the functional is determined by formula (5), in accordance with the rules of the international institutional rating QS:
X
6 i¼1 wi xi ;
ð5Þ
where w_i is the weight of the corresponding indicator and x_i is its value. In Fig. 1, the following designations are adopted: F is the functional; R is the university rating. The basic indicators are set by the rules of the international institutional rating QS [1]: AR, academic reputation; ER, reputation with the employer; RS/T, the ratio of the number of students to the number of teachers; CT, the citation indicator of teachers; IT, the number of international teachers; IS, the number of international students.
Fig. 1. Cognitive map of the scorecard based on the graph of the relationship of factors, baseline indicators and functionality (Legend: F – functional; R – rating; F1–F24 – latent factors)

The weighted values of the factors are set on the basis of expert assessments, taking into account the interval scale (the lower and upper values and the "bleached" central value of the number): F1 – "Scientific schools and dissertation councils" (0.5–0.7; 0.6); F2 – "Joint research projects" (0.2–0.4; 0.3); F3 – "Availability of basic departments" (0.1–0.3; 0.2); F4 – "Number of publications in the Scopus database" (0.5–0.7; 0.6); F5 – "Popular areas of training" (0.3); F6 – "Qualification level of the teaching staff" (0.1–0.3; 0.2); F7 – "Number of teaching staff" (0.5–0.7; 0.6); F8 – "Level of competence of students" (0.4–0.6; 0.5); F9 – "Teachers with language training" (0.3–0.5; 0.4); F10 – "Places in the hostel" (0.1–0.3; 0.2); F11 – "Demand for graduates from employers" (0.2–0.4; 0.3); F12 – "Areas for educational activities" (0.2–0.4; 0.3); F13 – "Level of payment for teaching staff" (0.3–0.5; 0.4); F14 – "Stimulating factors" (0.1–0.3; 0.2); F15 – "Expansion of the teachers' social package" (0.2–0.4; 0.3); F16 – "Change in the structure of employment of the teaching staff" (0.2–0.4; 0.3); F17 – "Share of teaching staff planning to build an international scientific career" (0.1–0.3; 0.2); F18 – "Academic mobility of the teaching staff" (0.2–0.4; 0.3); F19 – "Convergence of educational programs with foreign universities" (0.3–0.5; 0.4); F20 – "Foreign entrant campaign" (0.2–0.4; 0.3); F21 – "Increase in the number of online MOOC courses" (0.2–0.4; 0.3); F22 – "Implementation of individual educational trajectories" (0.3–0.5; 0.4); F23 – "Implementation of distance learning technologies" (0.2–0.4; 0.3); F24 – "Close relationship with the employer" (0.3–0.5; 0.4).

To improve the accuracy of the obtained solution, the cognitive model is built as a predictive ensemble that combines the forecasting results of several cognitive map models (Fig. 2).
In [14–16, 22–27], it is noted that a promising direction for improving the accuracy of solutions is the unification (composition) of a set of separate algorithms into one system. In this case, the errors of individual algorithms are mutually compensated. The paper [23] experimentally proved the effectiveness of using the ensemble organization for image recognition.
Fig. 2. Predicting ensemble based on the composition of gray fuzzy cognitive maps
Currently, several methods for constructing predictive ensembles are known. The advantages of an ensemble of models over any separate model included in it are due to the following reasons [23]:

1. The ensemble reduces the root mean square error.
2. Ensembles of models trained on different subsets of the initial data have a greater chance of finding the global optimum.

To form the output value of the ensemble, the method of weighted voting, which has proven itself well, is used. Based on the results of testing on historical data, each FCM model is assigned a weight factor that takes its mean square error into account. The output value is determined according to the formulas

$$Y(x) = F(y_1(x), y_2(x), \ldots, y_m(x)) = \sum_{i=1}^{m} a_i(x)\, y_i(x), \qquad \sum_{i=1}^{m} a_i(x) = 1, \quad \forall x \in X, \quad (6)$$

where x = (x_1, x_2, …, x_k) is the input vector; y_i(x) is the output value of the i-th FCM; a_i is the weighting factor of the i-th model; m is the number of FCMs; Y(x) is the output signal of the predicting ensemble; and F is the function for obtaining the resulting solution.
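A minimal numerical sketch of the two building blocks, the interval state update of Eq. (3) and the weighted voting of Eq. (6), is given below. The sigmoid activation, the interval-arithmetic product bounds and all numbers are illustrative assumptions, not the authors' data; in the paper the voting weights are instead assigned from testing on historical data.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gray_fcm_step(x_lo, x_hi, w_lo, w_hi):
    """One update of Eq. (3). w[i, j] is the gray weight of the influence of
    concept j on concept i; the diagonal is zero. An interval product is
    bounded by the min/max over the four endpoint products, and since the
    sigmoid is monotone it can be applied to the interval endpoints directly."""
    prods = np.stack([w_lo * x_lo, w_lo * x_hi, w_hi * x_lo, w_hi * x_hi])
    s_lo = prods.min(axis=0).sum(axis=1)   # lower bound of the summed influences
    s_hi = prods.max(axis=0).sum(axis=1)   # upper bound of the summed influences
    return sigmoid(x_lo + s_lo), sigmoid(x_hi + s_hi)

def ensemble_vote(outputs, a):
    """Eq. (6): weighted voting over the m FCM outputs, with sum(a) = 1."""
    a = np.asarray(a, dtype=float)
    return np.tensordot(a / a.sum(), np.asarray(outputs), axes=1)

rng = np.random.default_rng(4)
n = 5                                            # toy map with five concepts
w_lo = rng.uniform(-0.2, 0.4, (n, n)); np.fill_diagonal(w_lo, 0.0)
w_hi = np.clip(w_lo + 0.2, -1.0, 1.0); np.fill_diagonal(w_hi, 0.0)
x_lo = rng.uniform(0.2, 0.4, n); x_hi = x_lo + 0.1

lo, hi = gray_fcm_step(x_lo, x_hi, w_lo, w_hi)
bleached = (lo + hi) / 2                         # central ("bleached") states

# three hypothetical FCMs voting on the target concept with fixed weights
print(ensemble_vote([bleached[0], bleached[0] * 0.95, bleached[0] * 1.05],
                    a=[0.5, 0.3, 0.2]))
```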
The problem to be solved is to find, based on the developed cognitive model, the most cost-effective scenario for incrementing the values of the latent factors so as to obtain the required value of the university's rating within the international institutional ranking of universities QS. The presence of causal relationships between the latent factors and the baseline indicators makes it possible to solve this scenario forecasting problem under the given constraints. The cognitive-map-based approach to scenario forecasting includes the following stages [21]:
– generating scenarios and assessing the influence of increments in the factor values on the increase of the university's rating;
– adjusting the scenario at discrete time intervals (in our case, annually), taking into account the achieved position of the university in the international QS ranking. At the end of the next time interval, a new scenario is built to achieve a new target rating value.
4 Scenario Forecasting Results of University Indicators

After describing the relationships between the factors by equations and setting the interval values of the weights of their mutual influences and the values of the initial increments of the factors, it is possible to analyze the dynamics of the changes in the factors and the development of the system of indicators. To form possible strategies for the development of the system, its self-development must first be predicted, that is, the dynamics of the values of the basic reference points must be studied in the absence of external control influences. As a result of such a forecast, a vector of values of input influences on the latent factors F = (F1, F2, …, F24) was obtained that achieves the required value of the target factor (university rating) at time t, taking the resource restrictions into account. The values of the initial data that are not quantitative in nature (qualification level of the teaching staff, level of competencies of students, etc.) are determined by an expert method. The preliminary values of the intensity of mutual influence between the measurable factors of the cognitive model were established on the basis of correlation and factor analysis [18]; the coefficients were then refined in accordance with the logic of the transition of the system from one stationary state to another as a result of external impulse influences.

Table 2 shows the weights of connections between concepts, given by experts, in a regular FCM, in a gray FCM, and in an ensemble of gray FCMs consisting of three heterogeneous FCMs; the link weights of the ensemble were given by three different experts. Scenario forecasting based on the three variants of cognitive models (classical FCM, gray FCM, and an ensemble of gray FCMs) made it possible to obtain the most preferable values of the increments of the factors influencing the basic indicators that are necessary to achieve the required values of the latter, based on the analysis of the degree of influence of each factor on the rating.
Table 2. Weights of connections between FCM concepts

Link weight Wij   Plain FCM   Gray FCM    Ensemble of gray FCMs
                                          1st FCM     2nd FCM     3rd FCM
F1–AR             0.6         [0.5–0.7]   [0.5–0.7]   [0.3–0.5]   [0.6–0.8]
F1–F6             0.7         [0.6–0.8]   [0.6–0.7]   [0.4–0.6]   [0.4–0.6]
F2–AR             0.3         [0.2–0.4]   [0.2–0.4]   [0.4–0.6]   [0.3–0.5]
F2–IT             0.4         [0.3–0.5]   [0.3–0.5]   [0.3–0.5]   [0.4–0.6]
F2–F4             0.5         [0.4–0.6]   [0.4–0.6]   [0.3–0.5]   [0.2–0.4]
…                 …           …           …           …           …
F23–AR            0.3         [0.2–0.4]   [0.2–0.4]   [0.4–0.6]   [0.2–0.4]
F23–ER            0.3         [0.2–0.4]   [0.3–0.5]   [0.3–0.5]   [0.2–0.4]
F24–F11           0.4         [0.3–0.5]   [0.3–0.5]   [0.5–0.7]   [0.4–0.6]
F24–F3            0.2         [0.1–0.3]   [0.1–0.3]   [0.2–0.3]   [0.3–0.5]
Table 3. Fuzzy cognitive map effects table

Impulse effect on the latent factor   F1      F2       …   F22       F23       F24
Rating increment ΔR                   0.001   0.0007   …   0.00004   0.00002   0.0006
Table 3 shows the results of the effects in the classical FCM; similar tables were constructed for the other two variants of the cognitive model. Sequentially setting "weak" increments of about 10% in the values of the above factors made it possible to assess the sensitivity of the target indicator (the university rating) to control actions in these areas of regulation, on the basis of which the most preferable scenario forecasting alternative was chosen. The cognitive model is verified against the criterion of completeness and consistency of the influence of cause factors on consequence factors and the absence of violations of the transitivity of causal relations [20]. To check the adequacy and accuracy of the cognitive model, it was tested on the retrospective period 2014–2020 using the available statistics on the measurable factors of the model. The general correctness of the model at this stage was confirmed by the closeness of the factor growth rates calculated with the model to the actual growth rates. A comparative analysis of the results of scenario forecasting based on the conventional FCM, the gray FCM and the ensemble of gray FCMs showed that the deviation of the modeling result, the value of the target concept (university rating), from its real value on historical data is smallest for the cognitive model based on the predictive ensemble (Table 4).
Table 4. Comparative assessment of scenario forecasting results

No.   Cognitive model type    Scatter of estimates of the target FCM concept, %   Average prediction error, %
1     FCM                     17.3                                                14.1
2     Gray FCM                14.6                                                11.7
3     Ensemble of gray FCM    10.4                                                8.2
The weighted average assessment of the state of the target concept based on the ensemble of gray FCMs is preferable because it has a lower scatter when modeling various effects on the latent factors than a separate gray FCM or the classical FCM, owing to the reduced influence of the subjectivity of expert assessments on the modeling results and to the reduction of modeling errors provided by the ensemble organization of the cognitive model. Scenario analysis of the options for the development of the situation made it possible to choose the most preferable option, which, under the given restrictions, ensures the achievement of the required planned value of the target indicator with minimal resource consumption for the increment of the latent factors.
5 Conclusion

During the study, a new approach was proposed to the development of a fuzzy cognitive model for scenario forecasting of measures to achieve the required values of the target performance indicators of the university in the international institutional QS ranking, based on the use of gray FCMs. A feature of gray FCMs is the ability to move from point estimates of expert opinions on the strength of the relationships between concepts to interval estimates, which made it possible to use the available data more fully and to reduce the uncertainty arising from the spread of expert opinions.

To improve the accuracy and reliability of the forecasting results, the fuzzy cognitive model was developed as a predictive ensemble of gray FCMs. Its distinctive features are the high interpretability of the process and intermediate results of fuzzy cognitive modeling, adequate consideration of various types of uncertainty within a unified cognitive model, and increased accuracy of the simulation results due to their averaging over a set of models and the decrease in the expected value of the mean square error.

The practical significance of the results lies in the possibility of choosing, based on the performed calculations, the most preferable scenario for the increment of the latent factors to achieve the required values of the target indicators of the university's development. The main directions of further research are related to improving the mathematical foundations of FCM construction, assessing the adequacy, structural complexity and stability of FCMs, and choosing learning algorithms that provide the desired characteristics of the FCM to achieve the set goals.
Acknowledgments. The reported study was funded by RFBR according to the research projects No. 19-07-01137 and 20-07-00926.
References

1. International Institutional Ranking QS World University Rankings. https://www.qs.com/rankings/. Accessed 17 Oct 2021
2. Axelrod, R.M.: Structure of Decision: The Cognitive Maps of Political Elites. Princeton University Press, Princeton, NJ (1986)
3. Yarushev, S.A., Averkin, A.N.: Modular forecasting system based on fuzzy cognitive maps and neural-fuzzy networks. In: 7th All-Russian Scientific and Practical Conference Fuzzy Systems, Soft Computing, and Intelligent Technology (in Russian), vol. 1, pp. 180–189. Politekhnika-servis, St. Petersburg, Russia (2017)
4. Kuznetsov, O.P.: Cognitive modeling of semi-structured situations. http://posp.raai.org/data/posp2005/Kuznetsov/kuznetsov.html. Accessed 12 Oct 2021
5. Roberts, F.S.: Discrete Mathematical Models with Applications to Social, Biological, and Environmental Problems. Nauka, Moscow, Russia (1986)
6. Carvalho, J.P., Tom, J.A.B.: Rule-based fuzzy cognitive maps – fuzzy causal relationships. In: Mohammadyan, M. (ed.) Computational Intelligence for Modeling, Control and Automation: Evolutionary Computing and Fuzzy Logic for Intelligent Control, Knowledge, and Information Retrieval. IOS Press (1999)
7. Silov, V.B.: Adoption of strategic decisions in a fuzzy environment. INPRO-RES, Moscow, Russia (1995)
8. Gorelova, G.V., Zakharova, E.N., Rodchenko, S.A.: Study of semi-structured problems of socio-economic systems: a cognitive approach. Publishing House of the Russian State University, Rostov-on-Don (2006)
9. Borisov, V.V., Kruglov, V.V., Fedulov, A.S.: Fuzzy Models and Networks, 2nd edn. Hot Line – Telecom, Moscow, Russia (2015)
10. Borisov, V.V., Luferov, V.S.: The method of multivariate analysis and forecasting of the state of complex systems and processes based on fuzzy cognitive temporal models. Control Syst. Commun. Secur. (2), 1–23 (2020)
11. Kosko, B.: Fuzzy cognitive maps. Int. J. Man-Mach. Stud. 24, 65–75 (1986)
12. Miao, Y., Miao, C.Y., Tao, X.H., Shen, Z.Q., Liu, Z.Q.: Transformation of cognitive maps. IEEE Trans. Fuzzy Syst. 18(1), 114–124 (2010)
13. Salmeron, J.L., Palos-Sanchez, P.R.: Uncertainty propagation in fuzzy grey cognitive maps with Hebbian-like learning algorithms. IEEE Trans. Cybern. 49(1), 211–220 (2017)
14. Goncharov, M.: Model ensembles. http://www.businessdataanalytics.ru/download/ModelEnsembles.pdf. Accessed 17 Oct 2021
15. Kuncheva, L.I.: Combining Pattern Classifiers: Methods and Algorithms. Wiley, Hoboken (2004)
16. Zhou, Z.-H.: Ensemble Methods: Foundations and Algorithms. Machine Learning & Pattern Recognition. Chapman & Hall/CRC (2012)
17. Sokolov, G.A.: Introduction to regression analysis and planning of regression experiments in economics. Infra-M, Moscow, Russia (2016)
18. Mikryukov, A.A., Gasparian, M.S., Karpov, D.S.: Development of proposals for promoting the university in the international institutional ranking QS based on statistical analysis methods. Stat. Econ. 17(1), 35–43 (2020)
19. Tereshchenko, O.V., Kurilovich, E.I., Knyazeva, I.A.: Multivariate Statistical Analysis of Data in the Social Sciences. BSU, Minsk (2012)
20. Avdeeva, Z.K., Kovriga, S.V., Makarenko, D.I., Maksimov, V.I.: Cognitive approach in management. Problemy Upravlenia 3, 2–8 (2007)
21. Bolotova, L.S.: Artificial intelligence systems: models and technologies based on knowledge. Finance and Statistics, Moscow (2012)
22. Terekhov, S.A.: Brilliant committees of smart machines. In: IX Russian Scientific and Technical Conference "Neuroinformatics-2007": Lectures on Neuroinformatics, MEPhI, Moscow, Russia, Issue no. 02, pp. 11–42 (2007)
23. Vorontsov, K.V.: Lectures on algorithmic compositions. http://www.ccas.ru/voron/download/Composition.pdf. Accessed 16 Oct 2021
24. Borovikov, V.P.: Neural Networks. Statistica Neural Networks. Methodology and Technologies of Modern Data Analysis, 2nd edn. Hot Line – Telecom, Moscow, Russia (2008)
25. Karabutov, N.: Structural identifiability of nonlinear dynamic systems under uncertainty. Int. J. Intell. Syst. Appl. (IJISA) 12(1), 12–22 (2020). https://doi.org/10.5815/ijisa.2020.01.02
26. Bodyanskiy, Y.V., Tyshchenko, O.K., Deineko, A.O.: An evolving neuro-fuzzy system with online learning/self-learning. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 7(2), 1–7 (2014). https://doi.org/10.5815/ijmecs.2015.02.01
27. Vijayalakshmi, V., Venkatachalapathy, K.: Comparison of predicting student's performance using machine learning algorithms. Int. J. Intell. Syst. Appl. (IJISA) 11(12), 34–45 (2019). https://doi.org/10.5815/ijisa.2019.12.04
Author Index

B
Briukhanov, A. Yu., 22

C
Cai, Bing, 187
Chen, Hao, 60
Cheng, Wen, 34

D
Dorokhov, A. S., 22

F
Feng, Li, 200
Fonov, D.A., 72

G
Gadolina, Irina V., 115
Gavriushin, Sergey S., 150
Golubtsov, Peter, 102
Gong, Zhipeng, 47
Guo, Rui, 47

H
Hu, Zhengbing, 47

I
Izmailov, A. Yu., 22
Izvozchikova, Vera, 125

K
Kopp, Andrii, 81
Koromyslichenko, V. N., 22
Korzhov, E.G., 72
Kostomakhin, Mikhail N., 92

L
Li, Guangbing, 214
Li, Shenghui, 34
Liu, Shuo, 214, 227
Liu, Weihu, 214, 227

M
Ma, Wenbing, 227
Mazurov, Mikhail, 241
Mezhenin, Aleksandr, 125
Mikryukov, Andrey, 241
Mutovkina, Nataliya, 1, 11, 175

O
Okhtilev, M. Yu., 22
Orekhov, Sergey, 81
Orlovskyi, Dmytro, 81

P
Peng, Yongjun, 47
Pestryakov, Efim V., 92
Petoukhov, Sergey V., 164
Petrishchev, Nikolai A., 92
Petrova, Irina M., 115
Podkopaev, Sergey A., 150
Popov, V. D., 22

R
Rakov, Dmitry, 138

S
Sayapin, Alexander S., 92
Shalavina, E. V., 22

W
Wan, Anping, 47
Wang, Junxiong, 187
Wang, Yang, 214, 227
Wu, Yongfu, 227

Y
Yan, Zhongzhen, 60
Yang, Haikuan, 187, 200
Yang, Hua, 187, 200
Yu, Litao, 214

Z
Zhou, Kang, 187, 200
Zhou, Kewei, 60
Zhu, Xinyuan, 60
Zhu, Yongqing, 200