Advances in Artificial Systems for Power Engineering 3030805301, 9783030805302

This book comprises refereed papers presented at The International Conference on Artificial Intelligence and Power Engin


English Pages 240 [252] Year 2021


Table of contents :
Contents
Optimal Dispatch of Combined Cooling, Heating and Power Microgrid with Advanced Adiabatic Compressed Air Energy Storage
1 Introduction
2 Modeling of AA-CAES Under CCHP Mode
2.1 Description of AA-CAES Combined Cooling, Heating and Power Generation
2.2 Dispatch Model of Trigeneration AA-CAES
3 Optimal Dispatch of CCHP System
3.1 Objective
3.2 Constraints
4 Case Study
4.1 System Setting
4.2 Results and Discussion
5 Conclusions
References
Chatbots as a Tool to Optimize the Educational Process
1 Introduction
2 Literature Review
3 Methodology for Using Chatbots in the Activities of the University
3.1 Chatbots in the Organization of the Educational Process
3.2 Methods of Studying Chatbots by Students in the Discipline “Artificial Intelligent and Neurotechnology”
3.3 Chatbots as a Training Tool
4 An Example of Creating a Chatbot in the Telegram
5 Results and Discussion
6 Conclusions
References
Intelligent System of Computer Aided Processes Planning
1 Introduction
2 Constructing of Conceptual Models
3 Constructing of Knowledge Bases
4 Conclusions
References
Micro-level Modeling of Traffic Flows Through Signalized Crossroads of an Arbitrary Structure
1 Introduction
2 Review of Mathematical Micro-level Models of TFs and Their Application to Signalized Intersections
3 The Proposed Models
3.1 Representation of Crossroads Structure and Traffic Organization
3.2 Traffic Flow Through an Intersection as an Event-Switched Process
4 Ways of the Proposed Approach Application for Traffic Control Options Optimization
5 Discussion and Conclusions
References
Polypolar Coordination by the Multifocal Lemniscates
1 Introduction
2 Polypolar Lemniscus Coordinate System
3 Metric Coordinates ρ
4 Angular Coordinates φ
5 Coordinate Families
6 φ-parameterization
7 1f-Polar CS: Passage to the Limit
8 Symmetries in Polypolar Coordination
9 Focal Approximation of Empirical Form
10 Conclusion
References
Non-intrusive Load Identification Decision Method Based on Time Signatures
1 Introduction
2 Establish Load Characteristics Library
2.1 Load Characteristics Selection
2.2 Load Characteristics Modeling
3 Non-intrusive Load Identification Decision Method Combining Time Characteristics
3.1 Mean-Shift Clustering Algorithm
3.2 Bayesian Classification Method
4 Experimental Results and Analysis
4.1 Parameter Settings
4.2 Parameter Settings
5 Conclusions
References
Research on Non-intrusive Load Identification Method Based on Support Vector Machine
1 Introduction
2 SVM Nonlinear Classifier
2.1 SVM Linear Classifier
2.2 SVM Nonlinear Classifier Based on RBF Kernel Function
3 Detection and Feature Extraction of Electrical Appliances
3.1 Event Detection
3.2 Characteristic Calculation
4 Construction of Load Classifier
4.1 Two Class Classifier Training
4.2 Multiple Classifier Training
5 Load Classification Test Analysis
5.1 Testing Environment
5.2 Analysis of Test Results
6 Conclusions
References
Intelligent Detection of Electricity Stealing by Replacing Instrument Transformer Based on Daily Load Date Mining
1 Introduction
2 Relationship Between Electricity Stealing Replacing Instrument Transformer and Line Loss
3 Using Pearson Correlation Coefficient to Identity Electricity Stealing of Replacing Instrument Transformer
3.1 Principle of Pearson Correlation Coefficient
3.2 Application of Pearson Correlation in Electricity Stealing Detection
4 Case Verification and Analysis
4.1 Typical Case Verification
4.2 Typical Case Analysis
5 Conclusions
References
Fuzzy Management of Teacher-Student Interaction in Distance Learning Settings
1 Introduction
2 Literature Review on the Research Issue
3 Theoretical Aspects of the Study
4 A Cognitive Model of Interaction of the Teacher and Students
4.1 The Model of Information Transmission
4.2 Model of Perception of Information
4.3 The Information Interaction Model
5 Practical Implementation and Results
6 Conclusions
References
Evaluation Method of Distributed Renewable Energy Access to Distribution Network Based on Variable Weight Theory
1 Introduction
2 Index System of Carrying Capacity of Distributed Renewable Energy Access Distribution Network
2.1 Refer to the Evaluation Index Proposed in DL/T + 2041–2019
2.2 Calculation Index of Safety and Reliability
2.3 Calculation Index of Operation Economy Index
3 Fuzzy Comprehensive Evaluation Method Based on Variable Weight Theory
3.1 Variable Weight Theory
3.2 Fuzzy Comprehensive Evaluation Method Based on Variable Weight Theory
4 Theoretical Example Analysis
4.1 Basic Information of Calculation Example
4.2 Index Calculation Results
4.3 Index Weight Calculation
5 Conclusions
References
Research on Realization Technology of Arc Grounding Fault on Distribution Network on Field Test Data
1 Instruction
2 Distribution Network Practical-Test-Platform
3 Modeling Analysis of Intermittent Arc Grounding Fault
3.1 Arc Grounding Fault Model
3.2 Development Process of Intermittent Arc Grounding Fault
4 Realization Technology of Arc Grounding Fault
4.1 Arc Grounding Fault Simulation Device
4.2 Simulation Process of Intermittent Arc Grounding Fault
5 Numerical Simulations
5.1 Neutral Ungrounded System
5.2 Low Resistance Grounding System
6 Summary and Conclusion
References
Research on Reactive Power Compensation Configuration of Wind Farm Integration
1 Introduction
2 Composition of Reactive Power Loss in Wind Power Engineering
3 Principle of Reactive Power Compensation Configuration
4 Theoretical Calculation Method of Reactive Power Compensation Configuration
4.1 Calculation of the Line Reactive Loss and Charging Power
4.2 Calculation of the Transformer Reactive Power Loss
4.3 Calculation of the Reactive Power Compensation Device Capacity
5 Case Study
6 Conclusions
References
Two Stage Stochastic Scheduling Model of Integrated Energy System with Renewable Energy Considering Demand Response
1 Introduction
2 Coordination of Generator Side Reserve and Demand Response
2.1 Interruptible Load and Positive Rotation Reserve
2.2 Negative Spinning Reserve and TOU Price
3 Analysis of the Response Model of Electricity Price
3.1 Elasticity Index
3.2 Modeling of Consumer Price Response in Integrated Energy System
4 Two Stage Model Based on Scenarios
4.1 Objective Function
4.2 First Stage Constraints
4.3 Second Stage Constraints
4.4 Joint Constraint of the First Stage and the Second Stage
5 Example Analysis
6 Conclusions
References
Evaluation of Distribution Equipment Utilization Based on Data Driven
1 Instruction
2 Influence Factors of Distribution Equipment Utilization
3 Indicators of Distribution Equipment Utilization
3.1 In-Serve Equipment Utilization
3.2 Retired Equipment Utilization
3.3 Comprehensive Utilization of Distribution Network Equipment
4 Data Processing and Key Influence Factors Mining
4.1 Normalization of Influence Factors
4.2 Multiple Linear Regression Modeling
5 Equipment Utilization Prediction Based on Convolution Neural Network
5.1 Convolution Neural Network
5.2 Basic Process of Equipment Utilization Prediction
6 Case Study
6.1 Key Factor Screening
6.2 CNN Prediction Model
6.3 Analysis of Prediction Results of Target Year
7 Conclusion
References
Approaches to Validation of Quantification of the Variable “Relationship Between Users” in the Context of Social Engineering Attacks
1 Introduction
2 Related Works
3 Statement of the Problem
4 Research Methods
4.1 Conducting Additional Survey/Surveys
4.2 Empirical Validity: Correlating Questions
5 Discussion of the Results
6 Conclusions
References
Parametric Oscillations at Delays in the Forces of Elasticity and Damping
1 Introduction
2 Equations of the System
3 Solving Equations
4 Stability of Stationary Movements
5 Calculations
6 Conclusions
References
A Coordinated Dispatching Model for HDR-PV Hybrid Power System: A Zero-Sum Game Approach
1 Introduction
2 A Hybrid Power System Architecture for HDR and PV
3 Zero-sum Game Dispatching Model of HDR-PV Hybrid Power System
3.1 Zero-sum Game Pattern
3.2 Operation Model of HDR Generation
3.3 Output Power Model of PV Plant
4 Optimization Method
4.1 Optimization Objective
4.2 Constraints
4.3 Conversion of Lower Level Problem
5 Case Study
5.1 System Parameters
5.2 Results Analysis
6 Conclusions
References
Category Splices and Modeling with Their Help Chemical Systems and Biomolecules
1 Introduction
2 Categorical Splices and Their Properties
3 Category Model of Systems of Quantum Particles
4 Category Chemical Bond Models for Algebraic Biology
5 Conclusion
References
Consensus Algorithm Based Distributed Coordinated Control for ESSs Integrated Off-grid PV Station
1 Introduction
2 Graph Theory Based Consistency Control
3 Distributed Coordinated Control for Off-grid PV Station
4 Consensus Algorithm Based Distributed Coordinated Control
4.1 Consistency Control
4.2 Droop Control
5 Simulation Analysis
6 Conclusions
References
Algebraic Harmony in Genomic DNA-Texts and Long-Range Coherence in Biological Systems
1 Introduction
2 The Harmonic Progression, Resonance Phenomena, and Helical Antennas
3 Hyperbolic Genomic Rules and Fröhlich’s Theory of Long-Range Coherence in Biological Systems
4 Presentation of Oligomer Cooperative Properties of Genomes in the Form of Numeric Genetic Mandalas
5 Some Discussing Remarks
6 Conclusions
References
Comparative Analysis of Inductive Density Clustering Algorithms Meanshift and DBSCAN
1 Introduction
2 Related Works
3 Inductive MeanShift
4 Inductive DBSCAN
5 Numerical Experiments
6 Conclusion
References
Author Index

Advances in Intelligent Systems and Computing 1403

Zhengbing Hu Bo Wang Sergey Petoukhov Matthew He   Editors

Advances in Artificial Systems for Power Engineering

Advances in Intelligent Systems and Computing Volume 1403

Series Editor
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland

Advisory Editors
Nikhil R. Pal, Indian Statistical Institute, Kolkata, India
Rafael Bello Perez, Faculty of Mathematics, Physics and Computing, Universidad Central de Las Villas, Santa Clara, Cuba
Emilio S. Corchado, University of Salamanca, Salamanca, Spain
Hani Hagras, School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
László T. Kóczy, Department of Automation, Széchenyi István University, Gyor, Hungary
Vladik Kreinovich, Department of Computer Science, University of Texas at El Paso, El Paso, TX, USA
Chin-Teng Lin, Department of Electrical Engineering, National Chiao Tung University, Hsinchu, Taiwan
Jie Lu, Faculty of Engineering and Information Technology, University of Technology Sydney, Sydney, NSW, Australia
Patricia Melin, Graduate Program of Computer Science, Tijuana Institute of Technology, Tijuana, Mexico
Nadia Nedjah, Department of Electronics Engineering, University of Rio de Janeiro, Rio de Janeiro, Brazil
Ngoc Thanh Nguyen, Faculty of Computer Science and Management, Wrocław University of Technology, Wrocław, Poland
Jun Wang, Department of Mechanical and Automation Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong

The series “Advances in Intelligent Systems and Computing” contains publications on theory, applications, and design methods of Intelligent Systems and Intelligent Computing. Virtually all disciplines such as engineering, natural sciences, computer and information science, ICT, economics, business, e-commerce, environment, healthcare, life science are covered. The list of topics spans all the areas of modern intelligent systems and computing such as: computational intelligence, soft computing including neural networks, fuzzy systems, evolutionary computing and the fusion of these paradigms, social intelligence, ambient intelligence, computational neuroscience, artificial life, virtual worlds and society, cognitive science and systems, Perception and Vision, DNA and immune based systems, self-organizing and adaptive systems, e-Learning and teaching, human-centered and human-centric computing, recommender systems, intelligent control, robotics and mechatronics including human-machine teaming, knowledge-based paradigms, learning paradigms, machine ethics, intelligent data analysis, knowledge management, intelligent agents, intelligent decision making and support, intelligent network security, trust management, interactive entertainment, Web intelligence and multimedia. The publications within “Advances in Intelligent Systems and Computing” are primarily proceedings of important conferences, symposia and congresses. They cover significant recent developments in the field, both of a foundational and applicable character. An important characteristic feature of the series is the short publication time and world-wide distribution. This permits a rapid and broad dissemination of research results. Indexed by DBLP, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST). All books published in the series are submitted for consideration in Web of Science.

More information about this series at http://www.springer.com/series/11156


Editors
Zhengbing Hu, School of Educational Information Technology, Central China Normal University, Wuhan, Hubei, China
Bo Wang, School of Electrical and Automation, Wuhan University, Wuhan, China
Sergey Petoukhov, Mechanical Engineering, Russian Academy of Sciences, Moscow, Russia
Matthew He, Halmos College of Natural Sciences, Nova Southeastern University, Plantation, FL, USA

ISSN 2194-5357 ISSN 2194-5365 (electronic) Advances in Intelligent Systems and Computing ISBN 978-3-030-80530-2 ISBN 978-3-030-80531-9 (eBook) https://doi.org/10.1007/978-3-030-80531-9 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

Optimal Dispatch of Combined Cooling, Heating and Power Microgrid with Advanced Adiabatic Compressed Air Energy Storage
Manshang Wang, Jiayu Bai, Bosheng Chen, Laijun Chen, and Wei Wei

Chatbots as a Tool to Optimize the Educational Process
Nataliya Mutovkina

Intelligent System of Computer Aided Processes Planning
Georgy B. Evgenev

Micro-level Modeling of Traffic Flows Through Signalized Crossroads of an Arbitrary Structure
Andrey M. Valuev

Polypolar Coordination by the Multifocal Lemniscates
T. Rakcheeva

Non-intrusive Load Identification Decision Method Based on Time Signatures
Zhengqi Tian, Zengkai Ouyang, Meimei Duan, Guofang Xia, Zhong Zheng, Xiaoxing Mu, and Chao Zhou

Research on Non-intrusive Load Identification Method Based on Support Vector Machine
Yixuan Huang, Qifeng Huang, Hanmiao Cheng, Kaijie Fang, Xiaoquan Lu, and Tianchang Liu

Intelligent Detection of Electricity Stealing by Replacing Instrument Transformer Based on Daily Load Date Mining
Gaojun Xu, Xin Zhang, Li Sun, Weimin He, Shuangshuang Zhao, Weijiang Wu, and Jian He

Fuzzy Management of Teacher-Student Interaction in Distance Learning Settings
N. Yu. Mutovkina

Evaluation Method of Distributed Renewable Energy Access to Distribution Network Based on Variable Weight Theory
Peng Wang, Chengliang Zhu, Hongjian Wang, Sen Wang, and Hengrui Ma

Research on Realization Technology of Arc Grounding Fault on Distribution Network on Field Test Data
Xiaoyong Yu, Lifang Wu, Weixiang Huang, Shaonan Chen, and Liqun Yin

Research on Reactive Power Compensation Configuration of Wind Farm Integration
Junfang Wang, Caifu Chen, Shuang Zhang, Zheng Ren, Xiaolu Chen, and Xinyu Wang

Two Stage Stochastic Scheduling Model of Integrated Energy System with Renewable Energy Considering Demand Response
Qiao Chen, Yimin Qian, Kai Ding, Yi Wang, Chengliang Zhu, and Sen Wang

Evaluation of Distribution Equipment Utilization Based on Data Driven
Xintong Li, Shuo Liang, Yangjun Zhou, and Xiaoyong Yu

Approaches to Validation of Quantification of the Variable “Relationship Between Users” in the Context of Social Engineering Attacks
Anastasiia Khlobystova, Maxim Abramov, and Tatiana Tulupyeva

Parametric Oscillations at Delays in the Forces of Elasticity and Damping
Alishir A. Alifov

A Coordinated Dispatching Model for HDR-PV Hybrid Power System: A Zero-Sum Game Approach
Qingmiao Zhang, Yang Si, Xuelin Zhang, and Xiaotao Chen

Category Splices and Modeling with Their Help Chemical Systems and Biomolecules
Georgy K. Tolokonnikov

Consensus Algorithm Based Distributed Coordinated Control for ESSs Integrated Off-grid PV Station
Xiaoling Su, Zhengxi Li, Yang Si, Yongqing Guo, and Wenhao Xu

Algebraic Harmony in Genomic DNA-Texts and Long-Range Coherence in Biological Systems
Sergey V. Petoukhov

Comparative Analysis of Inductive Density Clustering Algorithms Meanshift and DBSCAN
Zhengbing Hu, Irina Lurie, Oleksii K. Tyshchenko, Nataliia Savina, and Volodymyr Lytvynenko

Author Index

Optimal Dispatch of Combined Cooling, Heating and Power Microgrid with Advanced Adiabatic Compressed Air Energy Storage

Manshang Wang (1), Jiayu Bai (2, corresponding author), Bosheng Chen (1), Laijun Chen (3), and Wei Wei (2)

(1) State Grid Zhenjiang Power Supply Company, Zhenjiang 212001, China
(2) State Key Lab of Control and Simulation of Power Systems and Generation Equipments, Department of Electrical Engineering, Tsinghua University, Haidian District, Beijing 100084, China
(3) New Energy (Photovoltaic) Industry Research Center, Qinghai University, Xining 810016, China

Abstract. Advanced adiabatic compressed air energy storage (AA-CAES) is a promising large-scale energy storage technology that is inherently capable of combined cooling, heating and power (CCHP) generation, with the additional merits of high energy efficiency, long service life and zero carbon emissions. In this paper, a dispatch-oriented AA-CAES model for integrated energy systems is developed, in which the relationships between the charging, discharging, heating and cooling power and the states of charge (SOC) of the air storage (AS) and thermal energy storage (TES) are established in terms of the air and heat transfer oil mass flow rates. The economic dispatch of a CCHP microgrid with AA-CAES is then formulated as a mixed-integer quadratic program (MIQP). The simulation results demonstrate that the proposed AA-CAES dispatch model delivers more practical scheduling strategies, and they verify the economic benefits of operating AA-CAES in CCHP mode in reducing system operation cost and increasing renewables accommodation.

Keywords: Advanced adiabatic compressed air energy storage · Combined cooling, heating and power · Integrated energy system · Economic dispatch

1 Introduction

Faced with an increasingly serious energy crisis and environmental pollution, it is necessary to develop renewable energy and to improve the overall efficiency of energy utilization. The concept of the integrated energy system was first proposed by U.S. and European researchers, with the aim of promoting the deep penetration of renewable generation and improving the economy of the energy system by dovetailing different energy systems [1]. The combined cooling, heating and power (CCHP) microgrid, also referred to as a trigeneration system, is a typical integrated energy system whose energy efficiency is much higher than that of independently operated energy supply systems, and it has been put into practice in some schools and hospitals [2].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 1–14, 2021. https://doi.org/10.1007/978-3-030-80531-9_1


Much research has been carried out on the operation and dispatch of CCHP-based microgrids. Ref. [2] proposed a two-stage coordinated control scheme for CCHP systems to cope with the forecast errors of renewable output and load, comprising a first-stage economic dispatch and a second-stage real-time adjustment. Ref. [3] established a dynamic model of cooling and heating systems that takes thermal inertia into consideration, and proposed a joint dispatch strategy of energy and reserve for CCHP-based microgrids. A model predictive control approach was employed in Ref. [4] to realize efficient online scheduling of a CCHP microgrid. The partial-load features of CCHPs are modeled in Ref. [5], and a multi-time-scale dispatch model is presented with the aim of minimizing the impact of the microgrid on the main grid.

Compressed air energy storage (CAES) is recognized as one of the large-scale energy storage technologies and has attracted much attention in recent years due to its merits of long service time, high reliability, and low environmental impact [6]. Compared with conventional CAES, advanced adiabatic compressed air energy storage (AA-CAES) recovers and fully utilizes the compression heat produced as a by-product, which brings about higher energy efficiency and fuel independence [7]. In addition, conventional CAES as well as AA-CAES is inherently capable of combined cooling, heating and power generation, and can thus play an important role in integrated energy systems [8]. AA-CAES operating in CCHP mode involves the conversion of multiple energy flows, such as mechanical energy, electrical energy, thermal energy, cold energy and pressure potential energy, with complex coupling among control and state variables. There has been abundant research on the thermodynamic modeling of AA-CAES under CCHP mode, which has been further applied to system efficiency evaluation [7, 9] and operation strategy analysis [8, 10]. However, little research has addressed the optimal dispatch of a CCHP microgrid integrated with AA-CAES, which requires a multi-energy dispatch model of AA-CAES that balances model accuracy and computational efficiency.

The remainder of the paper is arranged as follows. The dispatch model of AA-CAES with combined cooling, heating and power generation is established in Sect. 2. The optimal dispatch model of the CCHP system integrated with AA-CAES is proposed in Sect. 3. A case study is carried out in Sect. 4, and conclusions are summarized in Sect. 5.

2 Modeling of AA-CAES Under CCHP Mode

2.1 Description of AA-CAES Combined Cooling, Heating and Power Generation

The schematic diagram of AA-CAES with combined cooling, heating and power generation is shown in Fig. 1. The system consists of a motor; multi-stage compressors, e.g., a high-pressure compressor (HPC), medium-pressure compressor (MPC) and low-pressure compressor (LPC) for three-stage compression; inter-coolers denoted by HX1, HX2 and HX3; multi-stage turbines, e.g., a high-pressure turbine (HPT) and low-pressure turbine (LPT) for two-stage expansion; inter-heaters denoted by HX4 and HX5; a generator; air storage (AS); low-temperature thermal energy storage (LTES) and high-temperature thermal energy storage (HTES); oil pumps (OP) at the outlets of the HTES and LTES; a heating-side heat exchanger (HX); and throttle valves (TV).


The ambient air is compressed and stored in the AS during the charging process, while the compression heat is recovered by heat transfer oil (HTO) through the inter-coolers. In the discharging phase, the high-pressure air is released from the AS and heated by hot HTO through the inter-heaters before entering the turbines to do work. The low-temperature turbine outlet air can be utilized for cold supply, and the residual compression heat stored in the HTES can be used for heating.

Fig. 1. Structure of an AA-CAES combined with cooling, heating and power generation (flows shown: power, air, high-temperature HTO, low-temperature HTO, water)

2.2 Dispatch Model of Trigeneration AA-CAES

2.2.1 Charging and Discharging Constraints

The charging power and discharging power of AA-CAES can be written as linear functions of the air mass flow rate, as given by [11]:

$$P_{\mathrm{CAESc},t} = c_{\mathrm{com}}\, m_{c,t} \tag{1}$$

$$P_{\mathrm{CAESe},t} = c_{\mathrm{exp}}\, m_{e,t} \tag{2}$$

where $P_{\mathrm{CAESc},t}$ and $P_{\mathrm{CAESe},t}$ are the charging and discharging power during the unit time period $t$, $m_{c,t}$ and $m_{e,t}$ are the air mass flow rates at the compression side and expansion side respectively, and $c_{\mathrm{com}}$ and $c_{\mathrm{exp}}$ are constant coefficients determined by the system design parameters, which can be calculated by [12]

$$c_{\mathrm{com}} = \frac{1}{\eta_m} \sum_{i=1}^{N_c} \frac{c_{pa}}{\eta_{c,i}}\, T_{c,i}^{\mathrm{in}} \left( \beta_{c,i}^{\frac{k-1}{k}} - 1 \right) \tag{3}$$

$$c_{\mathrm{exp}} = \eta_g \sum_{i=1}^{N_e} c_{pa}\, \eta_{e,i}\, T_{e,i}^{\mathrm{in}} \left( 1 - \beta_{e,i}^{\frac{1-k}{k}} \right) \tag{4}$$

where $\eta_m$ is the motor efficiency, $\eta_{c,i}$ and $\beta_{c,i}$ are the isentropic efficiency and compression ratio of the $i$th stage compressor, $\eta_{e,i}$ and $\beta_{e,i}$ are the isentropic efficiency and expansion ratio of the $i$th stage turbine, $T_{c,i}^{\mathrm{in}}$ and $T_{e,i}^{\mathrm{in}}$ are the inlet temperatures of the $i$th stage compressor and turbine, $\eta_g$ is the generator efficiency, $N_c$ and $N_e$ are the numbers of stages of the compression and expansion trains, $c_{pa}$ is the specific heat capacity of air at constant pressure, and $k$ denotes the adiabatic coefficient of air. The heat capacities of the hot and cold fluids flowing through the heat exchangers during the charging and discharging processes are generally assumed identical [13], as given by

$$c_{pa}\, m_{c,t} = c_{p\mathrm{HTO}}\, m_{\mathrm{HTOc},t} \tag{5}$$

$$c_{pa}\, m_{e,t} = c_{p\mathrm{HTO}}\, m_{\mathrm{HTOe},t} \tag{6}$$
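As a rough illustration of Eqs. (3) to (6), the following Python sketch evaluates the two power coefficients and the matched HTO flow rate from per-stage design parameters. The function and argument names are hypothetical and chosen only for this example; units are whatever consistent set the caller supplies (for instance kJ/(kg·K), K, kg/s), and nothing here is taken from the authors' implementation.

```python
def compression_coefficient(cpa, k, eta_m, eta_c, beta_c, t_in_c):
    """c_com as in Eq. (3): electrical work per unit air mass flow while charging.

    eta_c, beta_c and t_in_c are per-stage sequences of equal length.
    """
    exponent = (k - 1.0) / k
    stage_work = sum(
        cpa / eta * t_in * (beta ** exponent - 1.0)
        for eta, beta, t_in in zip(eta_c, beta_c, t_in_c)
    )
    return stage_work / eta_m


def expansion_coefficient(cpa, k, eta_g, eta_e, beta_e, t_in_e):
    """c_exp as in Eq. (4): electrical output per unit air mass flow while discharging."""
    exponent = (1.0 - k) / k
    stage_work = sum(
        cpa * eta * t_in * (1.0 - beta ** exponent)
        for eta, beta, t_in in zip(eta_e, beta_e, t_in_e)
    )
    return eta_g * stage_work


def matched_hto_flow(cpa, cp_hto, m_air):
    """HTO mass flow per heat exchanger implied by Eq. (5) or Eq. (6)."""
    return cpa * m_air / cp_hto


# Illustrative single-stage numbers (assumed, not from the paper):
# a pressure ratio of 2**3.5 with k = 1.4 doubles the ideal temperature.
c_com = compression_coefficient(1.0, 1.4, 1.0, [1.0], [2 ** 3.5], [300.0])
c_exp = expansion_coefficient(1.0, 1.4, 1.0, [1.0], [2 ** 3.5], [300.0])
```

Because both coefficients depend only on design parameters, they can be computed once before the dispatch problem is built, which is what keeps Eqs. (1) and (2) linear in the mass flow rates.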

where $m_{\mathrm{HTOc},t}$ and $m_{\mathrm{HTOe},t}$ are the HTO mass flow rates of each heat exchanger at the compression and expansion sides, and $c_{p\mathrm{HTO}}$ is the specific heat capacity of HTO. The charging and discharging power limits are written as

$$\mu_{\mathrm{CAESc},t}\, P_{\mathrm{CAESc}}^{\min} \le P_{\mathrm{CAESc},t} \le \mu_{\mathrm{CAESc},t}\, P_{\mathrm{CAESc}}^{\max} \tag{7}$$

$$\mu_{\mathrm{CAESe},t}\, P_{\mathrm{CAESe}}^{\min} \le P_{\mathrm{CAESe},t} \le \mu_{\mathrm{CAESe},t}\, P_{\mathrm{CAESe}}^{\max} \tag{8}$$

$$\mu_{\mathrm{CAESc},t} + \mu_{\mathrm{CAESe},t} \le 1 \tag{9}$$

where $\mu_{\mathrm{CAESc},t}$ and $\mu_{\mathrm{CAESe},t}$ are binary variables representing the charging and discharging conditions of AA-CAES at time slot $t$, $P_{\mathrm{CAESc}}^{\min}$ and $P_{\mathrm{CAESe}}^{\min}$ are the lower bounds of the charging and discharging power, and $P_{\mathrm{CAESc}}^{\max}$ and $P_{\mathrm{CAESe}}^{\max}$ are the upper bounds. Constraint (9) is required to avoid simultaneous charging and discharging. The start-up and shut-down conditions of the AA-CAES compressors and turbines, denoted by the binary variables $y_{\mathrm{CAESc},t}$, $z_{\mathrm{CAESc},t}$, $y_{\mathrm{CAESe},t}$ and $z_{\mathrm{CAESe},t}$, are described as

$$y_{\mathrm{CAESc},t} - z_{\mathrm{CAESc},t} = \mu_{\mathrm{CAESc},t} - \mu_{\mathrm{CAESc},t-1} \tag{10}$$

$$y_{\mathrm{CAESc},t} + z_{\mathrm{CAESc},t} \le 1 \tag{11}$$

$$y_{\mathrm{CAESe},t} - z_{\mathrm{CAESe},t} = \mu_{\mathrm{CAESe},t} - \mu_{\mathrm{CAESe},t-1} \tag{12}$$

$$y_{\mathrm{CAESe},t} + z_{\mathrm{CAESe},t} \le 1 \tag{13}$$

where $y_{\mathrm{CAESc},t} = 1$, $z_{\mathrm{CAESc},t} = 0$ indicates that the compression subsystem of AA-CAES starts up at time slot $t$, $y_{\mathrm{CAESc},t} = 0$, $z_{\mathrm{CAESc},t} = 1$ indicates that the compression subsystem shuts down at time slot $t$, and the expansion-side symbols have similar meanings.

2.2.2 Heating Power Constraints

The residual compression heat in the HTES can be utilized for centralized heat supply. Considering that the district heat-supply temperature is about 80 °C, which is much lower than the temperature of the HTO stored in the HTES, the heat capacity of the heating water is assumed to be less than that of the HTO. The heating power of AA-CAES can be taken as a linear function of the HTO mass flow rate, as given by

$$H_{\mathrm{CAES},t} = c_{\mathrm{heat}}\, m_{\mathrm{HTOh},t} \tag{14}$$

where $H_{\mathrm{CAES},t}$ is the heating power of AA-CAES, $m_{\mathrm{HTOh},t}$ is the HTO mass flow rate during the heating process, and $c_{\mathrm{heat}}$ is a constant determined by the system design parameters, e.g., the heating-side heat exchanger effectiveness $\varepsilon_h$, the temperature in the HTES $T_{\mathrm{TESH}}$, and the return water temperature of the centralized heat supply $T_w^{\mathrm{in}}$, which is calculated by

$$c_{\mathrm{heat}} = \varepsilon_h\, c_{p\mathrm{HTO}} \left( T_{\mathrm{TESH}} - T_w^{\mathrm{in}} \right) \tag{15}$$

The heating power limits are given by

$$\mu_{h,t}\, H_{\mathrm{CAES}}^{\min} \le H_{\mathrm{CAES},t} \le \mu_{h,t}\, H_{\mathrm{CAES}}^{\max} \tag{16}$$

where $H_{\mathrm{CAES}}^{\min}$ and $H_{\mathrm{CAES}}^{\max}$ are the lower and upper bounds of the heating power, and $\mu_{h,t}$ is a binary variable representing the heating condition of AA-CAES.
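The logic of constraints (7) to (13) and (16) can be checked on a candidate schedule with a few lines of Python. The sketch below is illustrative only: the helper names are hypothetical, and the particular choice of start-up and shut-down flags (a flag is raised exactly when the on/off state changes) is one assignment that satisfies Eqs. (10) to (13), which is also what a cost-minimizing solver would select when both flags carry a positive cost.

```python
def startup_shutdown(mu, mu_prev=0):
    """Derive start-up (y) and shut-down (z) flags from an on/off schedule.

    Uses y_t = 1 when the unit switches on and z_t = 1 when it switches off,
    so that y_t - z_t = mu_t - mu_{t-1} and y_t + z_t <= 1 hold in every slot.
    mu_prev is the assumed state before the first slot.
    """
    y, z = [], []
    for mu_t in mu:
        delta = mu_t - mu_prev
        y.append(1 if delta > 0 else 0)
        z.append(1 if delta < 0 else 0)
        mu_prev = mu_t
    return y, z


def modes_are_exclusive(mu_c, mu_e):
    """Constraint (9): charging and discharging never share a time slot."""
    return all(c + e <= 1 for c, e in zip(mu_c, mu_e))


def power_within_limits(mu, power, p_min, p_max):
    """Constraints (7), (8) and (16): mu*p_min <= power <= mu*p_max per slot."""
    return all(m * p_min <= p <= m * p_max for m, p in zip(mu, power))


def heating_coefficient(eps_h, cp_hto, t_tesh, t_w_in):
    """c_heat as in Eq. (15)."""
    return eps_h * cp_hto * (t_tesh - t_w_in)


# Example schedule (assumed values): off, on, on, off, on.
y_flags, z_flags = startup_shutdown([0, 1, 1, 0, 1])
```

Multiplying both bounds by the binary status forces the power to zero whenever the subsystem is off, which is why a power of 1.0 with status 0 fails `power_within_limits`.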

2.2.3 Cooling Power Constraints The cooling power of AA-CAES supplied by low-temperature air at the outlet of finalstage turbine is expressed as QCAES,t = μcl,t ccool me,t

(17)

where QCAES,t is cooling power, μcl,t is a binary variable representing cold supplying condition of AA-CAES, ccool is a constant calculated by [14]

out , T > T out cpa T0 − Te,N 0 e,Ne e (18) ccool = out 0, T0 ≤ Te,Ne out is the air temperature at the outlet of the where T0 is the ambient temperature, Te,N e final-stage turbine.

2.2.4 Operation Constraints of AS The temperature in the AS are generally assumed unchanged within the operation cycle, and the pressure of AS at the end of time slot t denoted by pAS,t is expressed as   pAS,t = (1 − μloss )pAS,t−1 + α mc,t − me,t Δt (19) where μloss is the air leakage factor, the constant coefficient α = kRg TASAS is calculated based on the ideal gas equation, where VAS and TAS are the volume and temperature of

6

M. Wang et al.

AS, Rg is gas constant. Δt is the unit dispatch slot. To ensure a safe operation of AS, the following constraints are considered. min max pAS ≤ pAS,t ≤ pAS

(20)

where $p_{AS}^{\min}$ and $p_{AS}^{\max}$ are the lower and upper bounds of AS working pressure. The state of charge (SOC) of AS is defined by the pressure, as given by

$$SOC_{AS,t} = \left(p_{AS,t} - p_{AS}^{\min}\right) / \left(p_{AS}^{\max} - p_{AS}^{\min}\right) \tag{21}$$

The terminal SOC of AS is set to its initial state to allow continuous operation of the system.

2.2.5 Operation Constraints of HTES

The SOC of TES is generally determined by the stored heat energy based on the law of conservation of energy, see Refs. [11, 15, 16]. However, this model is imprecise for AA-CAES with double-tank TES because it neglects the heat dissipation of LTES. The SOC of HTES can be accurately measured by the mass of stored HTO, as given by

$$SOC_{HTES,t} = \left(M_{HTES,t} - M_{HTES}^{\min}\right) / \left(M_{HTES}^{\max} - M_{HTES}^{\min}\right) \tag{22}$$

where $M_{HTES}^{\min}$ and $M_{HTES}^{\max}$ are the lower and upper bounds of the mass of HTO stored in HTES, and $M_{HTES,t}$ is the mass of hot HTO stored in HTES at the end of time slot t, which is calculated by

$$M_{HTES,t} = M_{HTES,t-1} + \left(N_c\, m_{HTOc,t} - N_e\, m_{HTOe,t} - m_{HTOh,t}\right) \Delta t \tag{23}$$

where $m_{HTOh,t}$ is the mass flow rate of HTO used for heat supply. The operation constraint of HTES is given as follows

$$M_{HTES}^{\min} \le M_{HTES,t} \le M_{HTES}^{\max} \tag{24}$$

Note that the HTO circulates continuously between HTES and LTES; thus the mass constraint of LTES is not given separately. Similar to AS, the terminal SOC of HTES is set to its initial state in order to complete a cycle.
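The two storage-state recursions, (19) for the AS pressure and (23) for the HTES mass, together with the SOC definitions (21) and (22), can be sketched as one-step update functions. All numbers in the demonstration calls are illustrative assumptions chosen for easy arithmetic, not parameters from the case study.

```python
# Sketch of the storage-state updates in Eqs. (19)-(24).
# The demo values below are made up; they are not the paper's parameters.

def step_air_storage(p_prev, m_c, m_e, alpha, mu_loss, dt):
    """Eq. (19): AS pressure at the end of one dispatch slot."""
    return (1.0 - mu_loss) * p_prev + alpha * (m_c - m_e) * dt

def step_htes_mass(m_prev, n_c, n_e, m_hto_c, m_hto_e, m_hto_h, dt):
    """Eq. (23): hot-HTO mass in the HTES at the end of one dispatch slot."""
    return m_prev + (n_c * m_hto_c - n_e * m_hto_e - m_hto_h) * dt

def soc(value, lower, upper):
    """Eqs. (21)/(22): normalised state of charge, after checking bounds (20)/(24)."""
    if not lower <= value <= upper:
        raise ValueError("storage state outside its operating bounds")
    return (value - lower) / (upper - lower)

p1 = step_air_storage(p_prev=8.0, m_c=2.0, m_e=0.0, alpha=0.5, mu_loss=0.0, dt=1.0)
m1 = step_htes_mass(m_prev=10.0, n_c=3, n_e=2, m_hto_c=1.0, m_hto_e=0.0, m_hto_h=0.5, dt=1.0)
print(p1, soc(p1, 6.0, 10.0))
print(m1, soc(m1, 0.0, 50.0))
```

In an optimization model these recursions would appear as equality constraints linking consecutive time slots, with the terminal SOC fixed to the initial one.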

3 Optimal Dispatch of CCHP System

The CCHP based microgrid is composed of AA-CAES, a wind plant (WP), a photovoltaic (PV) plant, a gas turbine (GT), a gas boiler (GB), a heat pump (HP), and an electric chiller (EC), as illustrated in Fig. 2.

Fig. 2. Structure of a CCHP based microgrid

3.1 Objective

The objective of the economic dispatch of the CCHP microgrid is to minimize the total operation cost, including the fuel cost and the start-up and shut-down cost of GT, the operation cost of GB, the purchasing cost of electricity from the upstream power grid, and the start-up and shut-down costs of the AA-CAES compression and expansion subsystems, which are denoted by $C_{GT,F}$, $C_{GT,S}$, $C_{GB}$, $C_{Grid}$, $C_{CAESc,S}$ and $C_{CAESe,S}$ respectively.

$$\min\; C_{GT,F} + C_{GT,S} + C_{GB} + C_{Grid} + C_{CAESc,S} + C_{CAESe,S}, \qquad
\left\{ \begin{aligned}
C_{GT,F} &= \sum_t \left( \alpha_2 P_{GT,t}^2 + \alpha_1 P_{GT,t} \right) \\
C_{GT,S} &= \sum_t \left( c_{GT,u}\, y_{GT,t} + c_{GT,d}\, z_{GT,t} \right) \\
C_{GB} &= \sum_t \left( \beta_2 H_{GB,t}^2 + \beta_1 H_{GB,t} \right) \\
C_{Grid} &= \sum_t C_{E,t}\, P_{Grid,t} \\
C_{CAESc,S} &= \sum_t \left( c_{CAESc,u}\, y_{CAESc,t} + c_{CAESc,d}\, z_{CAESc,t} \right) \\
C_{CAESe,S} &= \sum_t \left( c_{CAESe,u}\, y_{CAESe,t} + c_{CAESe,d}\, z_{CAESe,t} \right)
\end{aligned} \right. \tag{25}$$

where $\alpha_1$ and $\alpha_2$ are the fuel cost coefficients of GT, $\beta_1$ and $\beta_2$ are the fuel cost coefficients of GB, $P_{GT,t}$, $H_{GB,t}$ and $P_{Grid,t}$ are the generating power of GT, the heating power of GB, and the power purchased from the upstream power grid at time slot t respectively, $c_{GT,u}$ and $c_{GT,d}$ are the single start-up and shut-down costs of GT, $y_{GT,t}$ and $z_{GT,t}$ are the binary variables representing the start-up and shut-down conditions of GT, $C_{E,t}$ is the time-of-use (TOU) electricity price, $c_{CAESc,u}$ and $c_{CAESc,d}$ are the single start-up and shut-down costs of the AA-CAES compression subsystem, and $c_{CAESe,u}$ and $c_{CAESe,d}$ are the single start-up and shut-down costs of the expansion subsystem.

3.2 Constraints

3.2.1 Components Operation Constraints

The operation constraints of AA-CAES are given in (1)–(24); the models of the other components are given as follows.

(1) GT operation constraints:

$$\mu_{GT,t} P_{GT}^{\min} \le P_{GT,t} \le \mu_{GT,t} P_{GT}^{\max} \tag{26}$$

$$y_{GT,t} - z_{GT,t} = \mu_{GT,t} - \mu_{GT,t-1} \tag{27}$$

$$y_{GT,t} + z_{GT,t} \le 1 \tag{28}$$

where $P_{GT}^{\min}$ and $P_{GT}^{\max}$ are the lower and upper bounds of GT output, and $\mu_{GT,t}$ is a binary variable describing the working condition of GT.

(2) GB operation constraints:

$$H_{GB}^{\min} \le H_{GB,t} \le H_{GB}^{\max} \tag{29}$$

where $H_{GB}^{\min}$ and $H_{GB}^{\max}$ are the lower and upper bounds of the heating power of GB.

(3) WP operation constraints:

$$0 \le P_{w,t} \le P_{w,t}^{pre} \tag{30}$$

where $P_{w,t}$ is the actual output of WP at time slot t, and $P_{w,t}^{pre}$ is the wind power prediction.

(4) PV operation constraints:

$$0 \le P_{PV,t} \le P_{PV,t}^{pre} \tag{31}$$

where $P_{PV,t}$ and $P_{PV,t}^{pre}$ are the actual output and prediction of the PV plant at time slot t.

(5) EC operation constraints:

$$Q_{EC,t} = COP_{EC}\, P_{EC,t} \tag{32}$$

$$P_{EC}^{\min} \le P_{EC,t} \le P_{EC}^{\max} \tag{33}$$

where $Q_{EC,t}$ and $P_{EC,t}$ are the cold power output and electric power input of EC respectively, $COP_{EC}$ is the refrigeration coefficient, and $P_{EC}^{\min}$ and $P_{EC}^{\max}$ are the minimal and maximal power inputs.

(6) HP operation constraints:

$$H_{HP,t} = COP_{HP}\, P_{HP,t} \tag{34}$$

$$P_{HP}^{\min} \le P_{HP,t} \le P_{HP}^{\max} \tag{35}$$

where $H_{HP,t}$ and $P_{HP,t}$ are the heat power supply and electric power input of HP respectively, $COP_{HP}$ is the heating coefficient, and $P_{HP}^{\min}$ and $P_{HP}^{\max}$ are the minimal and maximal electrical inputs of HP.

(7) Upstream power grid operation constraints:

$$0 \le P_{Grid,t} \le P_{Grid}^{\max} \tag{36}$$

where $P_{Grid}^{\max}$ is the maximal power purchased from the upstream grid, determined by the distribution line capacity; the minimal power is set to zero to avoid active power reversal.
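To make the role of the start-up and shut-down indicators concrete, the sketch below derives $y_{GT,t}$ and $z_{GT,t}$ from a given on/off schedule so that (27) and (28) hold, and then evaluates the GT terms of the objective (25). The coefficients follow Table 4; the schedule, the initial state and the function names are illustrative assumptions.

```python
# Sketch: start-up/shut-down indicators (Eqs. 27-28) and GT cost terms of Eq. (25).
# Coefficients follow Table 4; the schedule and initial state are made-up examples.

ALPHA1, ALPHA2 = 105.0, 0.63      # yuan/MW and yuan/MW^2
C_GT_UP, C_GT_DOWN = 20.0, 20.0   # yuan per start-up / shut-down

def startup_shutdown(mu, mu_initial=0):
    """Return (y, z) with y_t - z_t = mu_t - mu_{t-1} and y_t + z_t <= 1."""
    y, z, prev = [], [], mu_initial
    for state in mu:
        delta = state - prev
        y.append(1 if delta > 0 else 0)
        z.append(1 if delta < 0 else 0)
        prev = state
    return y, z

def gt_cost(power, mu, mu_initial=0):
    """Return (fuel cost, start-up plus shut-down cost) for one GT schedule."""
    y, z = startup_shutdown(mu, mu_initial)
    fuel = sum(ALPHA2 * p ** 2 + ALPHA1 * p for p in power)
    switching = sum(C_GT_UP * yt + C_GT_DOWN * zt for yt, zt in zip(y, z))
    return fuel, switching

mu = [0, 1, 1, 0]                 # GT off, on, on, off
power = [0.0, 2.0, 3.0, 0.0]      # MW in each slot
print(startup_shutdown(mu))
print(gt_cost(power, mu))
```

In the optimization itself $y$ and $z$ are decision variables; the closed-form rule used here is simply the unique binary solution of (27) and (28) for a fixed schedule.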


3.2.2 Energy Balance Constraints

(1) Electric power balance:

$$P_{GT,t} + P_{w,t} + P_{PV,t} + P_{CAESe,t} + P_{Grid,t} = P_{CAESc,t} + P_{HP,t} + P_{EC,t} + P_{Load,t} \tag{37}$$

where $P_{Load,t}$ is the power load at time slot t.

(2) Heating power balance:

$$H_{GB,t} + H_{HP,t} + H_{CAES,t} = H_{Load,t} \tag{38}$$

where $H_{Load,t}$ is the heating load at time slot t.

(3) Cooling power balance:

$$Q_{CAES,t} + Q_{EC,t} = Q_{Load,t} \tag{39}$$

where $Q_{Load,t}$ is the cooling load at time slot t.

To sum up, the optimal dispatch model of the CCHP microgrid is given by (1)–(39), where constraint (17) contains the product of a binary and a continuous variable, which can be equivalently replaced by linear inequalities as illustrated in [17]. As a result, the dispatch problem is expressed as a mixed integer quadratic programming (MIQP) problem and can be efficiently solved by commercial solvers.
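One common way to remove such a product is the big-M reformulation; whether [17] uses exactly this form is an assumption here. For $Q = \mu\, c\, m$ with binary $\mu$, $c \ge 0$ and $0 \le m \le m^{\max}$ (an assumed upper bound on the flow rate), the product is replaced by the four linear inequalities $0 \le Q \le c\, m^{\max} \mu$ and $c\, m - c\, m^{\max}(1 - \mu) \le Q \le c\, m$. The sketch below checks numerically that these inequalities leave a single feasible value of $Q$, equal to the product.

```python
# Sketch: big-M linearization of Q = mu * c * m (binary mu, 0 <= m <= m_max).
# m_max and the test values are illustrative assumptions.

def linearized_bounds(mu, m, c, m_max):
    """Interval of Q allowed by the four linear inequalities."""
    big_m = c * m_max
    lower = max(0.0, c * m - big_m * (1 - mu))
    upper = min(big_m * mu, c * m)
    return lower, upper

for mu in (0, 1):
    for m in (0.0, 1.5, 4.0):
        lo, hi = linearized_bounds(mu, m, c=2.0, m_max=4.0)
        # The interval collapses to one point, the original bilinear product.
        assert lo == hi == mu * 2.0 * m

print("linearization is exact for binary mu")
```

With (17) linearized in this way, only the quadratic fuel cost terms in (25) remain nonlinear, which is consistent with the MIQP classification above.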

4 Case Study

4.1 System Setting

The CCHP microgrid consists of a 2 MW/10 MWh AA-CAES, a 3 MW GT, a 1 MW GB, a 6 MW wind plant composed of 4 × 1.5 MW wind turbine generators, a 2 MW PV plant, and an HP and an EC, both with a capacity of 1 MW. The load and renewables forecasts are depicted in Fig. 3. The TOU electricity price is given in Table 1. The design parameters of AA-CAES are listed in Table 2. The values of the component design parameters and cost coefficients are listed in Table 3 and Table 4 respectively.

Fig. 3. Left: Electrical, heating and cooling load forecast. Right: WP and PV output prediction.

Table 1. TOU electricity price

Period time         | Detail time              | Purchase price (¥/kWh)
Valley period       | 0:00–8:00, 23:00–24:00   | 0.3089
Intermediate period | 11:00–16:00, 20:00–23:00 | 0.6268
Peak period         | 7:00–11:00, 16:00–20:00  | 1.044

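Table 2. Design parameters of AA-CAES

Param.         | Value    | Param.         | Value    | Param.         | Value    | Param.             | Value
$N_c$          | 3        | $N_e$          | 2        | $\eta_m$       | 0.98     | $\eta_g$           | 0.98
$\eta_{c,i}$   | 0.88     | $\beta_{c,i}$  | 4.621    | $T_{AS}$       | 293.15 K | $V_{AS}$           | 3243.4 m³
$\eta_{e,i}$   | 0.82     | $\beta_{e,i}$  | 7.696    | $T_{e,1}^{in}$ | 443.72 K | $T_{e,2}^{in}$     | 440.84 K
$T_{c,1}^{in}$ | 293.15 K | $T_{c,2}^{in}$ | 320.56 K | $T_{c,3}^{in}$ | 327.24 K | $T_{TESH}$         | 476.78 K
$c_{pa}$       | 1.004    | $c_{pHTF}$     | 1.577    | $\varepsilon_h$ | 0.85    | $T_{e,N_e}^{out}$  | 275.28 K
$T_{win}$      | 313.15 K | $T_0$          | 293.15 K |                |          |                    |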

Table 3. Value of the component parameters Param.

Value

Param.

Value

Param.

Value

Param.

Value

min PCAESc min HCAES min MHTES

1.4 MW

2 MW

0.6 MW 6 MPa

max PCAESe max pAS

2 MW

1 MW

min PCAESe min pAS

0

max PCAESc max HCAES max MHTES

140.24 t

α

0.0363

μloss

0.001

ccom

0.2193

ccool

0.0280

0

max HGB max PHP

1 MW

0.2 MW

0.601

cexp

0.3269

cheat

min PGT min PEC

1.2 MW

3 MW

0

max PGT max PEC

COPHP

4

COPEC

3

min HGB min PHP max PGrid

1 MW

0

10 MPa

1 MW

5 MW

Table 4. Value of the cost coefficients

Param.        | Value       | Param.        | Value
$\alpha_1$    | 105 ¥/MW    | $\alpha_2$    | 0.63 ¥/MW²
$\beta_1$     | 126 ¥/MW    | $\beta_2$     | 1.12 ¥/MW²
$c_{GT,u}$    | 20 ¥        | $c_{GT,d}$    | 20 ¥
$c_{CAESc,u}$ | 10 ¥        | $c_{CAESc,d}$ | 10 ¥
$c_{CAESe,u}$ | 10 ¥        | $c_{CAESe,d}$ | 10 ¥


To investigate the economic benefits that AA-CAES operating in multi-energy supply mode brings to the CCHP microgrid, the following five scenarios are considered in the case study:

(i) AA-CAES is installed in the CCHP microgrid and takes part in system electricity, heating and cooling dispatch (base case).
(ii) AA-CAES is not installed in the microgrid.
(iii) AA-CAES only works in charging and discharging modes.
(iv) AA-CAES only takes part in system electricity and heating dispatch.
(v) AA-CAES only takes part in system electricity and cooling dispatch.

4.2 Results and Discussion

The system electrical and heating scheduling results of the base case in summer and winter are shown in Fig. 4 and Fig. 5 respectively, and the scheduling results of AA-CAES are plotted in Fig. 6. As seen, AA-CAES charges during the valley hours, and the microgrid sometimes purchases electricity from the upstream power grid even when the generating power within the microgrid is sufficient, see t = 3 in Fig. 4 and t = 1 in Fig. 5. The cooling supply of AA-CAES must coincide with its discharging process. The residual compression heat is utilized for heat supply in both winter and summer.

Fig. 4. Left: System electrical scheduling results of base case (summer). Middle: System cold scheduling results of base case (summer). Right: System heat scheduling results of base case (summer).

Fig. 5. Left: System electrical scheduling results of base case (winter). Right: System heat scheduling results of base case (winter).


Fig. 6. Left: Scheduling results of AA-CAES for base case (summer). Right: Scheduling results of AA-CAES for base case (winter).

The wind power and PV curtailment and the optimized system operation cost under the different scenarios are shown in Table 5. As seen, the operation cost is the lowest when AA-CAES is installed in the microgrid and operates in CCHP mode (scenario 1), with a decrease of operation cost by 23.19% and 25.35% in summer and winter respectively compared with the system without AA-CAES (scenario 2). Given the three-stage compression and two-stage expansion configuration of the adopted AA-CAES, the residual HTO accounts for approximately 1/3 of the total compression heat within a complete operation cycle. As a result, the system operation cost falls markedly, by 5.97% and 7.62% in summer and winter respectively, when AA-CAES takes part in electricity and heating dispatch instead of only taking part in electricity dispatch. However, the total cost decreases only slightly, by 0.51%, when AA-CAES takes part in electricity and cooling dispatch instead of only taking part in electricity dispatch, because the cooling supply of AA-CAES, which comes from the low-temperature turbine outlet air, is much smaller than its discharging capacity. Besides, since the wind and PV curtailment penalties are not included in the objective function, renewables shedding may remain at a relatively high level when the system operation cost is optimal.

Figure 7 displays the scheduling results of AA-CAES under the base case in winter when the HTES is modelled by heat energy. The optimized operation cost is 29887 ¥, which is 1091 ¥ lower than the cost obtained with the proposed HTES model based on conservation of mass. However, it can be observed that the actual SOC of HTES is higher than the upper bound during the period of 5–8, and the actual SOC of HTES during the period of 19–24 is less than the lower bound. The HTES model formulated by stored heat energy enlarges the operational feasible region of AA-CAES, resulting in infeasible scheduling strategies.


Table 5. Scheduling results under different scenarios

         | Summer                 |                      |          | Winter                 |                      |
Scenario | Wind curtailment (MWh) | PV curtailment (MWh) | Cost (¥) | Wind curtailment (MWh) | PV curtailment (MWh) | Cost (¥)
1        | 3.952                  | 0                    | 30978    | 7.895                  | 0                    | 23481
2        | 22.232                 | 0                    | 41634    | 9.991                  | 0.288                | 31453
3        | 1.786                  | 0                    | 33574    | 5.169                  | 0                    | 25417
4        | 5.235                  | 0                    | 31568    | 7.895                  | 0                    | 23481
5        | 2.149                  | 0                    | 33402    | 5.169                  | 0                    | 25417
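Several of the relative cost reductions quoted in the discussion can be recomputed from the costs in Table 5. The short sketch below does this for four of the comparisons; the helper name `reduction_pct` is illustrative.

```python
# Operation costs (yuan) copied from Table 5: {scenario: (summer, winter)}.
COST = {3: (33574, 25417), 4: (31568, 23481), 5: (33402, 25417),
        2: (41634, 31453), 1: (30978, 23481)}
SUMMER, WINTER = 0, 1

def reduction_pct(reference, improved):
    """Relative cost reduction of `improved` with respect to `reference`, in percent."""
    return 100.0 * (reference - improved) / reference

heat_summer = reduction_pct(COST[3][SUMMER], COST[4][SUMMER])   # electricity+heating vs electricity only
heat_winter = reduction_pct(COST[3][WINTER], COST[4][WINTER])
cool_summer = reduction_pct(COST[3][SUMMER], COST[5][SUMMER])   # electricity+cooling vs electricity only
base_winter = reduction_pct(COST[2][WINTER], COST[1][WINTER])   # base case vs no AA-CAES

print(round(heat_summer, 2), round(heat_winter, 2),
      round(cool_summer, 2), round(base_winter, 2))
# prints 5.97 7.62 0.51 25.35
```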

Fig. 7. The scheduling results adopting the HTES model modelled by heat energy

5 Conclusions

In this paper, a dispatch model of AA-CAES combined cooling, heating and power generation is developed, with consideration of the pressure dynamics of AS and the variation of stored HTO mass in HTES. The air and HTO mass flow rates are the link variables between multi-energy supply and the SOC of AS and HTES. The proposed AA-CAES model is incorporated into the economic dispatch of the CCHP microgrid, and is verified to give rise to more practical and reliable scheduling strategies. The simulation results also show that AA-CAES under CCHP mode contributes to reducing the system operation cost and accommodating renewables.

Acknowledgements. This work is supported by State Grid Jiangsu Electric Power Co. Ltd. Science and Technology Project (J2020083).


References

1. Sun, H., et al.: Integrated energy management system: concept, design, and demonstration in China. IEEE Electrification Mag. 6(2), 42–50 (2018)
2. Li, G., et al.: Optimal dispatch strategy for integrated energy systems with CCHP and wind power. Appl. Energy 192, 408–419 (2017)
3. Wang, J., Zhong, H., Xia, Q., Kang, C., Du, E.: Optimal joint-dispatch of energy and reserve for CCHP-based microgrids. IET Gener. Transm. Distrib. 11, 785–794 (2017)
4. Gu, W., Wang, Z., Wu, Z., Luo, Z., Tang, Y., Wang, J.: An online optimal dispatch schedule for CCHP microgrids based on model predictive control. IEEE Trans. Smart Grid 8(5), 2332–2342 (2016)
5. Bao, Z., Zhou, Q., Yang, Z., Yang, Q., Xu, L., Wu, T.: A multi-time-scale and multi-energy-type coordinated microgrid scheduling solution - Part I: model and methodology. IEEE Trans. Power Syst. 30, 2257–2266 (2015)
6. Nikolaidis, P., Poullikkas, A.: Cost metrics of electrical energy storage technologies in potential power system operations. Sustain. Energy Technol. Assess. 25, 43–59 (2018)
7. Jiang, R., Yang, X., Xu, Y., Yang, M.: Design/off-design performance analysis and comparison of two different storage modes for trigenerative compressed air energy storage system. Appl. Therm. Eng. 175, 115335 (2020)
8. Han, Z., Guo, S.: Investigation of operation strategy of combined cooling, heating and power (CCHP) system based on advanced adiabatic compressed air energy storage. Energy 160, 290–308 (2018)
9. Han, Z., Sun, Y., Li, P.: Thermo-economic analysis and optimization of a combined cooling, heating and power system based on advanced adiabatic compressed air energy storage. Energy Convers. Manage. 212, 112811 (2020)
10. Han, Z., Guo, S., Wang, S., Li, W.: Thermodynamic analyses and multi-objective optimization of operation mode of advanced adiabatic compressed air energy storage system. Energy Convers. Manage. 174, 45–53 (2018)
11. Bai, J., Chen, L., Liu, F., Mei, S.: Interdependence of electricity and heat distribution systems coupled by an AA-CAES-based energy hub. IET Renew. Power Gener. 14(3), 399–407 (2020)
12. Luo, X., et al.: Modelling study, efficiency analysis and optimisation of large-scale adiabatic compressed air energy storage systems with low-temperature thermal storage. Appl. Energy 162, 589–600 (2016)
13. Bai, J., Wei, W., Chen, L., Mei, S.: Modeling and dispatch of advanced adiabatic compressed air energy storage under wide operating range in distribution systems with renewable generation. Energy 206, 118051 (2020)
14. Jiang, R., Yin, H., Peng, K., Xu, Y.: Multi-objective optimization, design and performance analysis of an advanced trigenerative micro compressed air energy storage system. Energy Convers. Manage. 186, 323–333 (2019)
15. Li, R., Chen, L., Yuan, T., Li, C.: Optimal dispatch of zero-carbon-emission micro energy internet integrated with non-supplementary fired compressed air energy storage system. J. Mod. Power Syst. Clean Energy 4(4), 566–580 (2016). https://doi.org/10.1007/s40565-016-0241-4
16. Li, Y., Miao, S., Yin, B., Han, J., Zhang, S., Hong, J., Luo, X.: Combined heat and power dispatch considering advanced adiabatic compressed air energy storage for wind power accommodation. Energy Convers. Manage. 200, 112091 (2019)
17. Wei, W., Wang, J.: Modeling and Optimization of Interdependent Energy Infrastructures. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-25958-7

Chatbots as a Tool to Optimize the Educational Process

Nataliya Mutovkina

Tver State Technical University, Tver 170026, Russia

Abstract. The article discusses the possibilities of optimizing the educational process in higher educational institutions by introducing chatbots into their activities. Chatbots make it possible to rationalize not only the educational process but also other activities of higher educational institutions. They provide support for informational and administrative functions. Also, the study of chatbots, the technology for their development, and their application features is appropriate for students of various directions and training profiles. These programs, created to simulate human behavior when communicating with one or more interlocutors, are increasingly part of everyday life and are becoming more and more popular. The proposed method of introducing chatbots into the activities of universities is novel: it provides for the use of chatbots developed by students in the work of the organizational units of the university. The results are of practical significance: students not only create thematic chatbots as part of the study of a specific course but also get the opportunity to implement them in the activities of an educational organization. They thus gain experience not only in learning and creative activities but also in organizational and management work. The proposed methodology provides for assessing the effectiveness of the implementation of chatbots in the activities of the university. The assessment is performed according to a complex system of criteria and is based on statistical and expert methods.

Keywords: Chatbot · Intelligent agent · DialogFlow · Telegram · Educational process · Rules · Training · Communication

1 Introduction

The technology of building chatbots has long been used in various spheres of life, and the range of its applications is continuously expanding. A chatbot is a computer program that can “communicate” with a person in a natural language via text or voice and interact with them through a simple, intuitive interface [1]. Chatbots can easily replace a human when users ask typical questions and want to get answers quickly. One of the areas where chatbots can be used successfully is higher education. Firstly, chatbots are appropriate for solving some tasks of organizing the educational process. Secondly, chatbots are one of the elements of training in the framework of the discipline “Artificial Intelligence and Neurotechnology”. Thirdly, chatbots can be an efficient teaching tool for presenting educational material for particular courses.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 15–25, 2021. https://doi.org/10.1007/978-3-030-80531-9_2


N. Mutovkina

Identifying the significant features of the use of chatbots in the educational process and the practice of their use in higher education institutions makes it possible to determine the main advantages and disadvantages of chatbots, as well as to form general recommendations aimed at improving their effectiveness. The purpose of the work is to show the capabilities of chatbots to optimize the educational process and university activities in general. The proposed methodology for introducing chatbots into the activities of universities involves not only the teaching staff and methodological workers but also students. To achieve this goal, it is necessary to solve the following problems: consider theoretical approaches to the development and implementation of chatbots and the problems they solve; analyze the possibilities of using chatbots in the organization of the educational process; present a methodology for studying chatbots by students in the framework of the discipline “Artificial Intelligence and Neurotechnology”; and consider chatbots as a learning entity that can, to some extent, replace a teacher. The practical part offers an example of creating and implementing a chatbot in the activities of a university. The effectiveness of using chatbots is evaluated at the Department of Accounting and Finance of the Tver State Technical University (TvSTU). Finally, recommendations are given to improve the efficiency of using chatbots in the activities of the university.

2 Literature Review

Currently, more and more scientists are turning to the field of artificial intelligence in their research. In particular, software entities that allow conducting thematic dialogues are of increasing interest. Many of these entities are so well adapted to dialogue that it is sometimes difficult for users to determine whether they are communicating with a human or a computer. This technology is based on speech recognition, neural networks, and a broad linguistic apparatus. As this technology develops, so does the intelligence of chatbots. A well-designed chatbot can automatically collect content from various sources, accurately recognize what the user said, and select an appropriate response in the user’s language [2]. There are two types of chatbots [3]:

1) Chatbots based on a set of rules and pre-defined algorithms that are written into the program to respond to user requests. These chatbots are the simplest and have significant limitations in use;
2) Chatbots based on the principles of machine learning (artificial intelligence methods that allow a computer program to learn independently, solving many similar tasks in the process of interacting with a human).

One of the areas where chatbots can be used successfully is higher education. Firstly, chatbots are appropriate for solving some tasks of organizing the educational process. Secondly, chatbots are one of the elements of training in the framework of the discipline “Artificial Intelligence and Neurotechnology”. Thirdly, chatbots can be an efficient teaching tool for presenting educational material for particular courses. Some issues related to the topic of this research have been considered in the works of many scientists, for example, [1, 4–9] and others. However, each researcher has their own


experience of using chatbots. This also applies to educational organizations. Therefore, it is advisable to form a systematic approach to the study of theoretical and practical aspects of the use of chatbots in higher education institutions. The variety of goals and tasks that can be solved using chatbots should also be taken into account. The effectiveness of chatbots can be evaluated using the criteria from [10, 11].

3 Methodology for Using Chatbots in the Activities of the University

3.1 Chatbots in the Organization of the Educational Process

The Department of Accounting and Finance of Tver State Technical University plans to develop and use chatbots to provide students with organizational and methodological information related to certain types of educational load. Thus, it seems appropriate to create and implement chatbots in the educational process that are dedicated to the procedure for writing and defending coursework and for completing internships. Creating chatbots requires careful structuring of information and dividing it into logical blocks. With the help of the chatbots “Course work” and “Internship”, students should be provided with methodological data promptly and in full:

– information about the Department, the address of the Department’s page on the University’s website, e-mail addresses of teachers, and the consultation schedule;
– dates of organizational meetings on practice, excerpts from the orders of the faculty on sending students to practice, and deadlines for the delivery and defense of term papers and reports on the internship;
– procedure and options for places of practice;
– guidelines for writing term papers and reports on practice, and requirements for writing papers at the University;
– links to the electronic library system of the University with recommendations for the design of the bibliographic list;
– forms of documents that must be issued during the internship, and forms of title pages of term papers and reports on the practice;
– presentations that reflect the algorithm of the teacher’s work with the student when writing a course work or when completing an internship.

A chatbot can point students to the correct sources of information. New students do not always know who to turn to with their questions, which are usually the same: when the next test will be; how many points are needed to pass the test or exam; whether there are changes in the schedule; where specific academic buildings and classrooms are, etc. The bot can perform various functions: automatically sending documents and messages, scheduling meetings, providing answers to simple questions, interviewing students, and searching for necessary information.

3.2 Methods of Studying Chatbots by Students in the Discipline “Artificial Intelligent and Neurotechnology”

The curriculum of most modern areas and profiles of higher education students includes disciplines related to the development and implementation of intelligent information


systems and technologies. One of these disciplines is the course “Artificial Intelligence and Neurotechnology”, which is attended by undergraduate students studying in the profile “Digital economy” of TvSTU. This course includes the study of chatbots as elements of artificial intelligence, their development, configuration, and training. It is recommended to structure the study of the module “Creating and using chatbots” as follows:

1) organization of lectures on the development of chatbots and their practical application;
2) a test on the lecture materials, taken by the students;
3) organization of lectures and practical classes on the topic of creating a chatbot with the help of DialogFlow. DialogFlow is intended to recognize meaning in natural-language text, so it is well suited as a tool for developing a chatbot. It is a natural-language understanding service from Google;
4) completion of the final task by students.

The final task for the course consists of two parts:

1) creating, configuring, and publishing each student’s bot. To do this, the student must first create a DialogFlow agent. This is a software entity that accepts the user’s questions and generates responses to them. An agent can be linked to bots in several different messengers, such as WhatsApp, Viber, Telegram, etc. It is advisable to integrate the DialogFlow agent with a chatbot in Telegram, since Telegram makes it possible to create and launch a chatbot within a few minutes [12];
2) cross-evaluation of bots. After each bot is published, its ID is sent for evaluation to five randomly chosen students in the course.

The bot is evaluated according to the following criteria:

1) the bot must state what topic it communicates about;
2) the bot is tested on the most common phrases and questions, such as greetings, its name, and its author’s details;
3) the Turing test procedure is applied to the bot;
4) the bot should be asked at least five questions about the subject area that it has stated;
5) the student acting as an expert forms a general impression of the adequacy of the answers and the comfort of interaction with the bot.

Each item is evaluated on a scale from 0 to 2. The main criterion is the level of compliance of the bot’s behavior with human behavior: “0” means that the bot responds incorrectly to a question; “1” means that the bot reacts to the question ambiguously; “2” means that the bot’s response is adequate and expected.
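The cross-evaluation step can be sketched in a few lines. The example below is only an illustration under stated assumptions: the student identifiers are hypothetical, each bot is identified with its author, and the draw does not try to balance how many bots each reviewer receives.

```python
import random

def assign_reviewers(students, reviewers_per_bot=5, seed=None):
    """Map each student's bot to randomly chosen classmates (never the author)."""
    rng = random.Random(seed)
    assignment = {}
    for author in students:
        others = [s for s in students if s != author]
        assignment[author] = rng.sample(others, reviewers_per_bot)
    return assignment

def total_score(item_scores):
    """Sum of the five criterion scores, each on the 0-2 scale."""
    if len(item_scores) != 5 or any(s not in (0, 1, 2) for s in item_scores):
        raise ValueError("expected five scores from {0, 1, 2}")
    return sum(item_scores)

students = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]   # hypothetical IDs
assignment = assign_reviewers(students, seed=42)
print(assignment["s1"])
print(total_score([2, 2, 1, 1, 1]))
```

A fixed seed makes the draw reproducible, which can be convenient if the assignment has to be audited later.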


3.3 Chatbots as a Training Tool

The high mobility of modern students, their constant exposure to intensive information flows, the need to process and absorb large amounts of information, as well as the increasing popularity of distance learning are the main factors in the rapid spread of messengers, which are quick message services [13]. These chat products are installed on the smartphones of almost all students of higher educational institutions. In this way, teachers and students have the opportunity to establish quick contact and transmit information in a compressed form. Of course, there are limitations here, since not all educational information and materials can be delivered via chatbots. This way of interaction between the teacher and the student is possible when students are introduced to certain theoretical aspects of particular disciplines, or when they prepare coursework and reports on practice. The information that is transmitted via the messenger must be structured in a certain way, and the messages must not be long. Programming and launching a chatbot based on machine learning is a complex process that requires qualified developers and interface specialists, as well as significant time and resource costs. However, it is now possible to quickly create a simple chatbot without special technical skills or knowledge of programming languages. One technology that allows this is DialogFlow from Google. This service provides the user with fast and convenient training of the agent using voice and text commands, with further implementation of the trained model in the developed application. Practice shows that the solution with DialogFlow is the most convenient, because to work with it, the user only needs to create a Google account, and further interaction with applications is carried out automatically by Google services. An example of working with DialogFlow is shown below.

Students can access chatbots around the clock (“24 by 7”) as they study the subjects provided for in the curriculum, prepare reports on practice and research work, and write coursework and projects. The algorithm of interaction between students and teachers, structured in the chatbot, allows students to navigate freely through the stages of the work and the requirements for it. Feedback via a chatbot makes it possible to identify issues that remain unsolved and to supplement the methodological materials for writing term papers and completing an internship. In general, the chatbot is a useful additional tool for working with students, engaging and convenient to use for both students and teachers.

4 An Example of Creating a Chatbot in the Telegram

This section presents an example of creating a chatbot as part of the students’ study of the course “Artificial Intelligence and Neurotechnology”. In the future, such a chatbot can be used as a training tool on the basics of neural networks. According to the methodology described in the third section of the article, students first get acquainted with the TensorFlow library and the DialogFlow service [14]. Then each student creates their own agent in DialogFlow. Every user with a Google account has access to this resource. In addition, the Telegram messenger must be installed, since in this case it is needed to create the chatbot.


Figure 1 shows an example of creating an intelligent agent named Tobias in DialogFlow.

Fig. 1. Creating an agent in DialogFlow.

Figure 2 shows the creation of a chatbot named Bob in Telegram.

Fig. 2. Creating a chatbot in Telegram.

In the classroom, students use a practice manual that contains all the explanations and recommendations for creating, launching, and training a chatbot. The practice manual provides step-by-step instructions with screenshots of intermediate results. A new bot is created using the BotFather bot. Setting up a new bot is easy: its characteristics are set using the settings built into Telegram. After the chatbot configuration is completed, students switch to DialogFlow again to integrate their chatbot with the agent created earlier. Integration begins by clicking the “Integrations” item and the “Telegram” button on the panel that opens. After completing the integration of the bot with the DialogFlow agent, the agent needs to be trained. Among all the tools in DialogFlow, the most important are the rules. Rules are a description of the agent’s reaction to defined introductory phrases or events transmitted via messengers. In DialogFlow, all work with rules is carried out in the “Intents” section. Actions for creating a new rule and editing already created rules are also presented in the author’s practice manual. Students create response rules for their chatbots following the chosen topic. This is probably the longest and most demanding part of the work. Each rule must contain possible input phrases and possible responses from the chatbot, which are returned in reply to input phrases that match or are similar in meaning to those specified in the section “Training phrases”.
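The idea of a rule, a set of training phrases paired with a response, can be illustrated with a small stand-alone sketch. It does not use DialogFlow and does not reproduce its matching algorithm: the rule contents are hypothetical, and similarity is approximated here by simple word overlap (Jaccard index), which is an assumption made only for illustration.

```python
import re

# Hypothetical rules: each has training phrases and one response.
RULES = {
    "greeting": {
        "phrases": ["hello", "hi there", "good morning"],
        "response": "Hello! I can talk about the basics of neural networks.",
    },
    "neuron": {
        "phrases": ["what is a neuron", "explain an artificial neuron"],
        "response": "An artificial neuron sums weighted inputs and applies an activation function.",
    },
}
FALLBACK = "Sorry, I did not understand. Could you rephrase?"

def tokens(text):
    """Lower-case word set of a message."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Jaccard overlap between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def reply(message, threshold=0.5):
    """Return the response of the best-matching rule, or the fallback."""
    best_score, best_response = 0.0, FALLBACK
    words = tokens(message)
    for rule in RULES.values():
        for phrase in rule["phrases"]:
            score = similarity(words, tokens(phrase))
            if score > best_score:
                best_score, best_response = score, rule["response"]
    return best_response if best_score >= threshold else FALLBACK

print(reply("Hi there!"))
print(reply("What is a neuron?"))
print(reply("Tell me about the weather"))
```

Adding a training phrase to a rule widens the set of messages that reach its response, which is the same lever students use when they refine their rules.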

Fig. 3. Example of a dialog with Bob.


After the rules have been created and the agent trained, you can hold a conversation with the chatbot on the specified topic. An example of such a dialog is shown in Fig. 3. The easiest way to increase the intelligence of an agent is to train it through live conversations. To do this, use the “Training” section in the left-panel menu of DialogFlow. Training an agent on real dialogues is very effective. For example, if an agent answers typical user questions in a general chat, such as questions about changes in the class schedule, new dialogs in which users ask numerous questions will appear in the “Training” section every day. If these questions are assigned to rules daily and new rules are created as needed, the chatbot will become smarter every day. Then, using the other features of DialogFlow, students will obtain a full-fledged intelligent agent that can communicate on a given topic and on related topics as well as a human. Subsequently, such agents can be used in the educational process as a training tool for presenting educational material for particular courses, thereby reducing the burden on lecturers.

5 Results and Discussion

According to the chatbot assessment procedure proposed in Sect. 3.2, third-year students of the “Finance and Credit” profile assessed the effectiveness of working with Bob. The assessment results are presented in Table 1.

Table 1. Evaluating the effectiveness of the “Bob” chatbot

| Expert | Identifiability | Answers to elementary questions | Turing test | Answers to questions from the subject area | Comfort of interaction |
|---|---|---|---|---|---|
| 1 | 2 | 2 | 1 | 1 | 1 |
| 2 | 2 | 2 | 1 | 2 | 1 |
| 3 | 1 | 1 | 1 | 1 | 1 |
| 4 | 1 | 1 | 0 | 1 | 1 |
| 5 | 2 | 1 | 1 | 2 | 1 |
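The expert scores in Table 1 can be summarised per criterion. The short sketch below only averages the values from the table; the variable names are illustrative, and no assumption is made about the maximum of the scoring scale, which is defined in Sect. 3.2.

```python
from statistics import mean

criteria = [
    "Identifiability",
    "Answers to elementary questions",
    "Turing test",
    "Answers to questions from the subject area",
    "Comfort of interaction",
]

# Rows are experts 1-5; columns follow the order of `criteria` (values from Table 1).
scores = [
    [2, 2, 1, 1, 1],
    [2, 2, 1, 2, 1],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [2, 1, 1, 2, 1],
]

# Transpose the matrix so that each column (criterion) can be averaged.
mean_by_criterion = {
    name: mean(column) for name, column in zip(criteria, zip(*scores))
}
weakest = min(mean_by_criterion, key=mean_by_criterion.get)

for name, value in mean_by_criterion.items():
    print(f"{name}: {value:.1f}")
print("Lowest-rated criterion:", weakest)
```

The lowest average falls on the Turing test criterion, which agrees with the experts’ remark that the bot differs noticeably from a human.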

According to these results, the chatbot needs further training. The experts unanimously point out that the bot performs its assigned functions but differs noticeably from a human. They noted rather formal answers to questions and the absence of any emotional coloring. To assess the effectiveness of chatbots, three groups of indicators can be proposed: 1) characteristics of how the chatbot performs its tasks; 2) metrics reflecting the demand for the chatbot in target audiences; 3) metrics for determining the adequacy of the dialogues themselves.


Fig. 4. Chatbot effectiveness rating tree.

Figure 4 shows a three-level evaluation tree. The second level is represented by the following indicators:

1.1) reducing the load on the teaching and methodical staff;
1.2) the rate of increase in the number of users;
1.3) frequency of mentions of the chatbot in target audiences;
2.1) the number of users;
2.2) the number of messages read;
3.1) average session duration;
3.2) the percentage of errors.

The third level of criteria comprises: 2.1.1) involved users; 2.1.2) active users; 2.1.3) repeat users. The curly brackets show the significance of each indicator. It is set by experts and can vary depending on the situation and the subjectivity of the experts themselves. Since the listed indicators have different units of measurement, they must be expressed on a single scale, for example, from 1 to 10 points, where 10 points is the highest (best) mark. The simplest way to obtain a final grade is to use a criteria convolution.
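One common form of criteria convolution is the additive (weighted-sum) convolution, sketched below under the assumption that every indicator has already been converted to the 1 to 10 scale. The text does not specify which convolution is meant, and the weights and scores used here are hypothetical placeholders rather than the significance values of Fig. 4. For brevity the sketch treats the second-level indicators as a flat list and ignores the third level of the tree.

```python
def additive_convolution(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of indicator scores that share one scale (e.g. 1-10)."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same indicators")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[k] * scores[k] for k in scores) / total_weight


# Hypothetical expert scores (1-10) for the second-level indicators 1.1 to 3.2.
scores = {"1.1": 7, "1.2": 5, "1.3": 4, "2.1": 6, "2.2": 8, "3.1": 5, "3.2": 9}
# Hypothetical significance weights; in practice they are set by experts.
weights = {"1.1": 0.25, "1.2": 0.10, "1.3": 0.10, "2.1": 0.20,
           "2.2": 0.10, "3.1": 0.10, "3.2": 0.15}

final_grade = additive_convolution(scores, weights)
print(f"Final grade: {final_grade:.2f}")
```

Dividing by the sum of the weights keeps the final grade on the same 1 to 10 scale even when the expert weights are not normalised to one.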

6 Conclusions

The introduction of chatbots into the practice of working with students at the Department of Accounting and Finance will improve interaction with students and allow them to get answers to their questions without contacting the teacher. This will save lecturers from having to repeatedly answer the same standard, often merely clarifying, questions from students. Based on a study of the practice of using chatbots in the educational process, we offer some recommendations aimed at increasing the effectiveness of their use:


1. Compliance with the principle of chatbot activity. In other words, when a user logs in, for example, to the Department’s website, the chatbot must immediately announce itself. It should greet the user and explain the rules for working with it. This might look like this: “Hello! I am an interactive bot of the Department of Accounting and Finance. To ask a question, select the topic you are interested in”.

2. The chatbot must have a properly built dialogue algorithm. This algorithm implies a precise selection of the types of questions and answers. The questions answered by the chatbot must correspond to the following types: Who? What? Where? Why? Questions with the answers “Yes” and “No” are allowed. It is better not to address rhetorical questions at all; otherwise, students will never receive a clear answer to their question.

3. The chatbot must be equipped with the ability to use buttons. These buttons include “Yes”, “No”, “Ask a new question”, “Feedback”, and “Class Schedule”, as well as buttons with links to external sites (the site of the educational institution, sites with upcoming student conferences).

4. When creating chatbots, proven tools must be used. To create a chatbot, the educational institution does not need specialized developers or significant financial resources. There are several free tools with which it is possible to create a fully functional system of automatic student counseling. In addition to the tool used here, one can use, for example, mobilemonkey.com (a marketing platform that supports free chatbot creation features) or chatfuel.com (a platform for creating chatbots based on the social networks Facebook and Telegram).

Thus, the chatbot is a very beneficial tool in the organization of the educational process. Chatbots are effective and easy to use for both students and teachers. Among other things, they meet the needs of representatives of the younger generation, who are acquiring knowledge in the conditions of digitalization.

References

1. Aristova, A.S., Beznasyuk, Yu.S., Vidiker, P.K., Voronovich, N.E.: The use of chatbots in the educational process. In: The Proceedings of the 2nd International Conference on Digital Transformation of Society, Economy, Management, and Education, pp. 95–99 (2020)
2. Zhukova, A.O., Sergeeva, I.I.: Chatbot-technology of the future. In: Malyavkina, L.I. (eds.) Formation of the Digital Infrastructure of Enterprises: Theoretical and Applied Aspects. Collection of Scientific Papers of the National Scientific and Practical Conference, pp. 15–19 (2020)
3. Kat’kalo, V.S., Volkova, D.L.: Corporate Training for the Digital World, 2nd edn., 248 p. (2018). Reprint
4. Wirawan, K.T., Sukarsa, I.M., Bayupati, I.P.A.: Balinese historian chatbot using full-text search and artificial intelligence markup language method. Int. J. Intell. Syst. Appl. 11(8), 21–34 (2019). https://doi.org/10.5815/ijisa.2019.08.03
5. Alvarez-Dionisi, L.E., Mittra, M., Balza, R.: Teaching artificial intelligence and robotics to undergraduate systems engineering students. Int. J. Mod. Educ. Comput. Sci. 11(7), 54–63 (2019). https://doi.org/10.5815/ijmecs.2019.07.06
6. Os’kin, A.F., Os’kin, D.A.: Chatbots and their application in the educational process. Newsl. Assoc. Hist. Comput. 47, 166–167 (2018)
7. Nikonova, E.Z., Krivolapova, E.A.: Elements of artificial intelligence in education. Int. J. Adv. Stud. 2–2(8), 13–18 (2018)
8. Firsova, A.E.: Prospects for the use of chatbots in higher education. In: Improving Educational and Methodological Work at the University in a Changing Environment. Proceedings of the II National Interuniversity Scientific and Methodological Conference, pp. 188–193 (2018)
9. Makarchuk, T.A., Tkachuk, E.V.: New channels of communication between students and University teachers using a chatbot. In: High Technologies and Innovations in Science. Collection of Selected Articles of the International Scientific Conference, Saint-Petersburg, pp. 187–190 (2020)
10. Banerjee, P., Sarkar, A.: Quality evaluation of component-based software: an empirical approach. Int. J. Intell. Syst. Appl. 10(12), 80–91 (2018). https://doi.org/10.5815/ijisa.2018.12.08
11. Rawat, B., Dwivedi, S.K.: Selecting appropriate metrics for evaluation of recommender systems. Int. J. Inf. Technol. Comput. Sci. 11(1), 14–23 (2019). https://doi.org/10.5815/ijitcs.2019.01.02
12. Sutikno, T., Handayani, L., Stiawan, D., Riyadi, M.A., Subroto, I.M.I.: WhatsApp, Viber and Telegram: which is the best for instant messaging? Int. J. Electr. Comput. Eng. 6(3), 909–914 (2016)
13. Kadek Darmaastawan, I., Sukarsa, M., Buana, P.W.: LINE messenger as a transport layer to distribute messages to partner instant messaging. Int. J. Mod. Educ. Comput. Sci. 11(3), 1–9 (2019). https://doi.org/10.5815/ijmecs.2019.03.01
14. DialogFlow. https://dialogflow.cloud.google.com/. Accessed 20 Aug 2020

Intelligent System of Computer Aided Processes Planning

Georgy B. Evgenev
Bauman Moscow State Technical University, 5b1, Baumanskaya Street, 105005 Moscow, Russian Federation

Abstract. A new methodology for creating intelligent systems of computer aided processes planning is proposed. The methodology is based on the presentation of knowledge in the language of business prose, accessible to non-programmers, which distinguishes it from analogues. Words of any national language can be used for representing knowledge. The pyramid of knowledge is described. A process flow diagram based on the IDEF3 standard is proposed. The intelligent system for designing technological processes, SPRUT-TP, is described. This system has been implemented at more than 20 enterprises in Russia.

Keywords: Artificial Intelligence · Intelligent systems · Computer aided processes planning

1 Introduction

At present, close attention is paid to methods of digitalization of production [1, 4]. Before the digital revolution, written sources were the carriers of knowledge. As a result of the digital revolution, software became the carrier of knowledge. Software was originally built on an algorithmic basis using algorithmic languages. Non-programmer knowledge carriers could not enter their knowledge into the computer themselves. Intermediaries were needed: a developer (algorithmist) and a programmer (Fig. 1a). At the beginning of the development of artificial intelligence, rather complex languages for describing knowledge were created. As a result, the scheme of the process of entering knowledge into a computer did not change. Again, intermediaries were needed, in the form of knowledge engineers and programmers (Fig. 1b). In this case, there was, as a rule, a semantic gap between the participants of the process. Each of the intermediaries represented the knowledge while introducing their own considerations. As a result, the knowledge did not fully reflect the point of view of the knowledge carrier. The digital revolution should radically change this scheme and allow non-programmer knowledge carriers to enter their knowledge into a computer without intermediaries (Fig. 1c). This became possible with the methodology of expert programming [5–8]. In this methodology, knowledge is described in the language of business prose, as close as possible to the literary language, but so formalized that it is possible to automatically generate software corresponding to the source texts. Below are examples of the application of expert programming.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 26–39, 2021. https://doi.org/10.1007/978-3-030-80531-9_3

Fig. 1. Knowledge entry schemes

The Internet of knowledge is built on an ontological basis [5], the root object of which is metaontology. From the point of view of the problems associated with artificial intelligence (AI), ontology is an explicit specification of the conceptualization of knowledge. Metaontology operates with general concepts and relationships that do not depend on a specific subject area. Metaontology should contain concepts and relationships that are necessary both for subject ontology and for ontologies of tasks and optimization.

Traditional product design includes the stages defined by standards. These include preliminary, engineering and detail design. Preliminary design should contain fundamental solutions that give a general idea of the device and the principle of the product operation. Engineering design includes final engineering solutions that give a full picture of the developed product layout, as well as initial data for the development of working documentation. The result of detail design is the documentation, which makes it possible to manufacture the product in accordance with technical requirements.

The most common computer encyclopedia currently is Wikipedia (Fig. 2). Wikipedia is an open-source, multilingual, universal Internet encyclopedia with free content. Its conceptual principles are multilingualism and the ability of users to replenish and adjust content. An encyclopedia is a review of all branches of human knowledge or a range of disciplines cumulatively constituting a separate branch of knowledge. In this case, we are interested in the metacategory “Technique”. Only information about various devices and mechanisms that do not exist in nature and are manufactured by humans should be directly placed in this category.


Traditional wiki systems have a number of flaws, among which, in particular, is the lack of consistency in the content. Due to the frequent duplication of data on wikis, the same information may be contained on several different pages. When changing this information on one wiki page, users should ensure that the data is also updated on all other pages. Another drawback is the difficulty of accessing the knowledge available on wikis. Large wiki sites contain thousands of pages. Performing complex search queries and comparing information obtained from different pages is a task that is quite laborious in traditional wiki systems. The program can only show the text of a Wikipedia article in a certain context and cannot take additional steps related to the object. Traditional wikis use flat classification systems (tags) or classifiers organized in taxonomy. The inability to use typed properties generates a huge number of tags or categories. In this regard, semantic wiki appeared (Fig. 2). A semantic wiki is a web application that uses machine-processed data with well-defined semantics in order to extend the functionality of the wiki system. Regular wiki systems are filled with structured text and untyped hyperlinks. Semantic wiki systems allow you to specify the type of links between articles, the type of data within articles, as well as information about pages (metadata). A variety of such systems, based on the technology of expert programming described below for creating knowledge bases, can be called “Expertopedia” (Fig. 2).

Fig. 2. Wikipedia and Expertopedia

The semantics of Expertopedia is determined by the metaontology used in all cases. There is an opportunity to work with databases. For this it is necessary to make a choice of a database from among the available ones and the table in it. Next, the access conditions are formed and the fields and properties used in the database and in the creation of the knowledge base are coordinated. Actually, the operation is performed by the corresponding DBMS.


It is possible to create and modify geometric 3D models based on the results of calculations, using the capabilities of various CAD systems. At the same time, it is necessary to form a model and parameterize it. Another type of mathematical knowledge necessary for performing calculations are models of continuous systems based on differential-algebraic systems of equations. To be included in an intelligent design system, there must be a tool that has the capabilities to generate the models mentioned, and provides: support for object-oriented modeling technology that is compatible with the UML language; convenient and adequate description of the model in the generally accepted mathematical language without writing any program code; automatic construction of a computer model corresponding to a given mathematical one, with the possibility of autonomous use of this computer model. The foundation for intelligent systems construction is knowledge bases [7, 8]. When making their structure, it is advisable to take advantage of the centuries-old experience of material products creation. Figure 3 represents the knowledge pyramid corresponding to the principles of product design described above.

Fig. 3. Pyramid of knowledge

At the top level of the pyramid, there is a conceptual model of digital intelligent production, it corresponds to the preliminary design of the system. The conceptual model gives grounds to generate a production system; that corresponds to engineering design. Software tools are at the pyramid base; they are the detail design of the system. The transformation of knowledge during the transition from one level to another is provided by appropriate toolkits. Constructing the upper levels of the pyramid utilizes engineering knowledge in its natural forms presented in books and practices.


2 Constructing of Conceptual Models

In constructing conceptual models, it is advisable to use methods and means approved by international standards [7, 8]. The IDEF3 methodology is the most suitable for the design of technological processes (TP). IDEF3 is a recognized standard for describing technological processes; it defines the notation for representing meta-models of the process structure and the sequence of changes in the properties of the manufactured object. IDEF3 documentation facilities and modeling aids make it possible to accomplish the following tasks: 1) to document knowledge about the variants of executing technological operations and the manufacturing steps of particular products, in order to develop knowledge bases for generating the TP structure for processing a particular part or an assembly-welding unit; 2) to build diagrams of the states of processed objects during technological processes. The IDEF3 method has been applied many times in the use of the described system. Let us consider the application of the IDEF3 method by the example of a technological process meta-model (TPM) for the processing of cylindrical toothed gears (Fig. 4). The first TPM functional element (UoB) is a block blanking operation; it is decomposed separately as a secondary model (Fig. 5). This model starts with an exclusive-OR (XOR) junction. Such junctions are most often used in TPM formation. This junction, in addition to the formal identifier J6, has its own name “Billet” and is of the fan-out junction type. Each outgoing arrow has its own name: “casting”, “forming” and “circle”. Thus, we can assume that “Billet” is a character variable that takes one of the three indicated values. The listed blanking operations are performed depending on these values. If the variable takes the value “casting”, then the corresponding casting operation is performed. In the case of the “forming” value, the “Saw-cutting” operation, which prepares the billet for forming, and the “Forming” operation itself are performed.
In the case when the value of the variable “Billet” is “circle”, the billet of the wheel is cut off from the corresponding rolled section by means of the “Milling-cutting” operation. In all cases, the billet is fed to the “Annealing” operation to improve the machinability of the material. After blanking operations, the main surfaces of the rim, the disc and the wheel hub of the gear and the axial bore are processed in the “CNC Lathe” operation (Fig. 4). In a general case, the described TP meta-model is an AND-OR graph. The AND junctions determine an unconditional sequence of operations. The OR junctions involve enumerated variables with a fixed set of legitimate values, which determine the selection of a variant for the technological process. These variables are divided into two classes: free and bound. The values of free variables can be chosen by the production engineer; the bound ones are determined by the design documentation. In the described TP metamodel, the bound variables are: “Spline way”, “Heat treatment” and “Accuracy degree”. The free variables are “Billet”, “Grooving” and “Toothing”.


Fig. 4. Process Flow Diagram (PFDD) of cylindrical gears

3 Constructing of Knowledge Bases

The above process diagrams in the IDEF3 standard represent conceptual knowledge models of TP structural synthesis. It is necessary to enter this knowledge into a computer and ensure automatic generation of routing technological processes depending on the values of the controlling free and bound variables. The language of such knowledge representation should be as simple as possible and accessible to non-programmers. For production engineers, it is most natural to fill in standard technological documentation, for example, route sheets. For this reason, the SPRUT-TP system uses modernized standard route sheets that represent knowledge of the structure of TP operations. In order to be able to generate process diagrams in the IDEF3 standard, the standard technological lines of type A and type B must be supplemented with additional lines that set the conditions under which operations are included in the final technological process. These conditions should allow logical connectives of the exclusive-OR type to be described. To define logical connectives in the route map form, there are lines of the types “Condition” and “End of condition”. These lines, together with the standard technological lines between them, represent an analogue of the condition-action rule. The whole array of such information can be considered as an analogue of a knowledge base of the production type, where rules are ordered in time. Figure 6 shows the form used to enter, in Russian, the diagram of processing cylindrical toothed gears; it relates to the IDEF3 diagram presented in Fig. 5 [7, 8].
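How “Condition” and “End of condition” lines turn a route sheet into a set of condition-action rules can be sketched as follows. This is a minimal illustration, not the SPRUT-TP implementation: the tuple encoding of the sheet and the names `ROUTE_SHEET` and `generate_route` are assumptions, nested conditions are not handled, and the operations are the blanking operations and the “CNC Lathe” operation named in Sect. 2.

```python
# Hypothetical encoding of a modernized route sheet: (line type, payload).
ROUTE_SHEET = [
    ("Condition", {"Billet": "casting"}),
    ("Operation", "Casting"),
    ("End of condition", None),
    ("Condition", {"Billet": "forming"}),
    ("Operation", "Saw-cutting"),
    ("Operation", "Forming"),
    ("End of condition", None),
    ("Condition", {"Billet": "circle"}),
    ("Operation", "Milling-cutting"),
    ("End of condition", None),
    ("Operation", "Annealing"),   # unconditional: performed in all cases
    ("Operation", "CNC Lathe"),
]


def generate_route(sheet, variables):
    """Select the operations whose enclosing condition holds for `variables`."""
    route = []
    condition_holds = True  # outside any Condition block, operations always apply
    for line_type, payload in sheet:
        if line_type == "Condition":
            condition_holds = all(variables.get(k) == v for k, v in payload.items())
        elif line_type == "End of condition":
            condition_holds = True
        elif line_type == "Operation":
            if condition_holds:
                route.append(payload)
        else:
            raise ValueError(f"unknown line type: {line_type!r}")
    return route


print(generate_route(ROUTE_SHEET, {"Billet": "forming"}))
```

Because the three conditions test mutually exclusive values of the same variable, exactly one branch is selected, which reproduces the behaviour of the XOR junction “Billet”.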


Fig. 5. Secondary diagram of blanking operations for cylindrical toothed gears

Fig. 6. The fill-in form for the diagram of processing cylindrical tooth gears (in Russian)


Fig. 7. IDEF0 diagram

Thus, SPRUT-TP succeeded in combining software as a service (SaaS), because the end user, a production engineer, designs a specific technological process directly in the standard route sheet, with platform as a service (PaaS), which is used by the developer of knowledge bases for the structural synthesis of group technological processes (Fig. 4). However, in addition to structural synthesis, any design should take into account parametric synthesis. With regard to technological design, it consists in normalization, which defines the norms of per-piece time and preparatory-final time. In parametric synthesis, knowledge bases are usually not related to the time parameter. To represent conceptual models in this case, it is advisable to use the IDEF0 standard [7, 8]. Figure 7 gives the external representation of the object function in the IDEF0 standard. The IDEF0 functional model can be considered an equivalent of the condition-action rule: the controls define the condition, and the action consists in converting inputs into outputs by using a mechanism or by calling appropriate software. In expert programming [7, 8], the production rule is called a “knowledge module” (KM). The mechanisms of knowledge modules should ensure the implementation of all the functions that may be required in the formation of knowledge bases. These include the following basic functions: formula evaluation (including assignment of values to variables), definition of values by tables, selection of values from databases, update of values in databases, entry of values into databases, calculation of values with subprograms, calculation of values with the methods generated from knowledge modules, and calculation of values using executable exe-modules or dll-libraries generated by other systems. An appropriate mechanism has also been developed for the generation of 3D models.
Knowledge modules, which are elementary generating systems, are combined into structured generating systems that carry KM models. The model of structured generating systems, from the AI point of view, is a semantic network. The KM semantic network is an acyclic directed graph (Fig. 8). Acyclicity is required for the semantic network to perform its functional purpose: to ensure the determination of output variable values from the given input variables. In expert programming, KM ranked semantic networks are generated automatically [2]. This realizes the first element of the basic set of structured programming, “sequence”. The second element of structured programming (selection) is provided by the presence of KM preconditions. The third element (associated with cycle generation) is provided with the help of the dedicated FinCalc variable. Its appearance starts a cycle ensuring the repeated execution of a set of KMs until the value of this variable is changed. From the point of view of the IDEF0 standard, a KM ranked semantic network implements a process consisting of the operations performed by the KMs. The actions that convert the properties of the information model are carried out by the KM mechanisms. In fact, the KM semantic network, formed automatically, contains an algorithm for converting information, relieving non-programmers of the need to form this algorithm themselves. Knowledge modules may also be considered as frames. Thus, expert programming integrates all ways of knowledge representation.
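One way to picture the automatic ordering of knowledge modules is a topological sort of the graph in which a module depends on every module that produces one of its input variables. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9+). The module names and variables are hypothetical, and the source does not state which ranking algorithm SPRUT-TP actually uses.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical knowledge modules: name -> (input variables, output variables).
MODULES = {
    "ChipLoad": ({"material"}, {"So"}),
    "CutDepth": ({"allowance"}, {"t_"}),
    "CutSpeed": ({"So", "t_"}, {"Vt"}),
    "SpindleRpm": ({"Vt", "diameter"}, {"n"}),
}


def rank_modules(modules):
    """Order modules so that each runs after the producers of its inputs."""
    # Map every output variable to the module that computes it.
    producer = {var: name for name, (_, outs) in modules.items() for var in outs}
    # Predecessors of a module are the producers of its input variables;
    # inputs with no producer are treated as externally supplied data.
    graph = {
        name: {producer[v] for v in ins if v in producer}
        for name, (ins, _) in modules.items()
    }
    # static_order() raises CycleError if the network is not acyclic.
    return list(TopologicalSorter(graph).static_order())


order = rank_modules(MODULES)
print(order)
```

The sort fails with `CycleError` when two modules feed each other, which illustrates why acyclicity is required for output values to be determined from the inputs.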

Fig. 8. KM ranked semantic network

The foundation for building a clear knowledge is a dictionary. A dictionary has a name and methods for sorting and searching for words, as well as importing words from text documents. A dictionary consists of words, each of which has a name-identifier, a common name and type (integer, real or symbolic). There are methods of adding and removing words, and determining the inclusion of words into knowledge modules. Words can be connected with associative lists of acceptable values. Associative lists, like words, have an identification name, a list name, and a value type. Associative lists are connected with the methods of adding, deleting, sorting and searching for a list. Lists consist of elements, each of which must have a value and can be added or deleted.
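The dictionary structure described above can be modelled with two small classes. This is only an illustrative data model: the class and method names are assumptions, the three value types are mapped to Python’s `int`, `float` and `str`, and the import of words from text documents is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """A dictionary word with an optional associative list of acceptable values."""
    identifier: str       # name-identifier, e.g. "Billet"
    common_name: str      # human-readable name
    value_type: type      # int, float or str (integer, real or symbolic)
    allowed_values: list = field(default_factory=list)  # empty list = unrestricted

    def accepts(self, value) -> bool:
        """Check the value type and, if present, the associative list."""
        if not isinstance(value, self.value_type):
            return False
        return not self.allowed_values or value in self.allowed_values


class Dictionary:
    """A named collection of words with add, remove, search and sort methods."""

    def __init__(self, name: str):
        self.name = name
        self._words: dict[str, Word] = {}

    def add(self, word: Word) -> None:
        self._words[word.identifier] = word

    def remove(self, identifier: str) -> None:
        self._words.pop(identifier, None)

    def find(self, identifier: str):
        return self._words.get(identifier)

    def sorted_identifiers(self) -> list[str]:
        return sorted(self._words)


gears = Dictionary("Cylindrical gears")
gears.add(Word("Billet", "Billet type", str, ["casting", "forming", "circle"]))
gears.add(Word("z_", "Number of teeth", int))
print(gears.sorted_identifiers())
```

Here the word “Billet” carries an associative list with the three values used in the IDEF3 example, so an out-of-list value can be rejected before it reaches a knowledge module.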


Knowledge modules are based on dictionaries. Each module has a literary name, an identifier name, a precondition name, and a version. A module is associated with the methods allowing adding, selecting an analogue module, translating and testing the module, defining the inclusion of the KM in the knowledge bases and other modules, as well as the removing the module itself. KM has its own dictionary, which is a subset of terms from the knowledge base dictionary and includes input and output variables. In addition, KM can have a precondition that defines the scope of the module definition and contains a set of interrelated logical expressions. A knowledge module can be compound and include other modules. Every module has its own mechanism, by means of which the input variables are converted to the output variables. When designing products, calculations can be performed both on the basis of engineering procedures and on the basis of mathematical models [8]. Engineering procedures apply formulas, tables and databases. Mathematical models use solutions of linear, nonlinear and differential systems of equations. Geometric models form a special kind of mathematical models. Below on Fig. 9, there is an external representation of the knowledge module for calculating the cutting speed during lathe turning. This module is used in the normalization of boring cylindrical axial holes. The calculation is carried out according to the formula. 
KM: “VtTok” - Calculation of the cutting speed during lathe turning

Preconditions

| Name | Description | Type | Condition |
|---|---|---|---|
| NaimPer$ | Step name | STRING | Bore |
| ElDet$ | Part element | STRING | Cylindrical axial hole |
| So | Chip load, mm/rev | REAL | (0,) |
| t_ | Cutting depth, mm | REAL | (0,) |

Input properties

| Name | Description | Type | Value |
|---|---|---|---|
| yv | Yv index | REAL | 0.25 |
| Cv | Cv coefficient | REAL | 141 |
| t_ | Cutting depth, mm | REAL | 0.1 |
| So | Chip load, mm/rev | REAL | 0.03 |
| xv | Xv index | REAL | 0.15 |

Mechanism - Formula

Vt = Cv/(t_^xv * So^yv)

Output properties

| Name | Description | Type | Value |
|---|---|---|---|
| Vt | Base cutting speed, m/min | REAL | 77 |

Fig. 9. KM with mechanism-formula
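A knowledge module of this kind maps naturally onto a guarded function: the preconditions decide whether the module applies, and the mechanism computes the output. The sketch below is an assumed Python rendering of the formula from Fig. 9, not code generated by SPRUT-TP; the function names are illustrative. The value it returns for the inputs listed in Fig. 9 should not be expected to match the output printed in the figure, which may show default or illustrative values or rely on factors not visible in this excerpt.

```python
def cutting_speed(cv: float, t: float, so: float, xv: float, yv: float) -> float:
    """Mechanism of the module: Vt = Cv / (t_^xv * So^yv)."""
    return cv / (t ** xv * so ** yv)


def vt_tok(step_name: str, part_element: str,
           cv: float, t: float, so: float, xv: float, yv: float):
    """Apply the module only when its preconditions hold; otherwise return None."""
    preconditions_hold = (
        step_name == "Bore"
        and part_element == "Cylindrical axial hole"
        and so > 0   # chip load in the open interval (0, +inf)
        and t > 0    # cutting depth in the open interval (0, +inf)
    )
    if not preconditions_hold:
        return None
    return cutting_speed(cv, t, so, xv, yv)


vt = vt_tok("Bore", "Cylindrical axial hole",
            cv=141, t=0.1, so=0.03, xv=0.15, yv=0.25)
print(f"Vt = {vt:.1f} m/min")
```

Returning `None` when a precondition fails mirrors the role of preconditions as the selection element: the module simply does not fire outside its scope of definition.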


Knowledge modules can be used not only for calculations, but also for generating text information. Using knowledge modules, you can form text variables (Fig. 10). Text constants are enclosed in quotation marks, for example “Mill the wheel rim m = ”. Numeric variables are converted to text using the Str() function. Individual text components are connected using the + sign. If the variable m_ = 1 and z_ = 33, then the module shown in Fig. 10 will generate the following text: “Mill the wheel rim m = 1, z = 33 finally”.

KM: "FrSdPrC3" - Shaping the content of the finishing transition

Preconditions

| Name | Description | Type | Condition |
|---|---|---|---|
| ViZubKol$ | Tooth type | STRING | cylindrical, conical |
| n_ | Number of simultaneously processed parts | INTEGER | 1 |
| HarObr$ | Processing nature | STRING | Rough |
| StToch | Accuracy degree | INTEGER | 10, 9 |

Input properties

| Name | Description | Type | Value |
|---|---|---|---|
| m_ | Part module, mm | REAL | |
| z_ | Number of teeth | INTEGER | |

Mechanism - Formula

SodPer$ = " Mill the wheel rim m = "+Str(m_)+", z = "+ Str(z_)+" finally "

Output properties

| Name | Description | Type | Value |
|---|---|---|---|
| SodPer$ | Transition content | STRING | |

Fig. 10. KM for text generating
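The text-generating module of Fig. 10 can be mirrored in Python as a guarded string-building function. This is an assumed rendering for illustration: the exact formatting produced by the system’s Str() function is not documented here, so the `g` format is used to print a whole-number module as “1”, and the surrounding spaces of the original constants are dropped to match the sample output given in the text.

```python
def finishing_transition(tooth_type: str, parts: int, nature: str,
                         accuracy: int, m: float, z: int):
    """Return the transition text, or None when a precondition of Fig. 10 fails."""
    preconditions_hold = (
        tooth_type in ("cylindrical", "conical")
        and parts == 1
        and nature == "Rough"
        and accuracy in (10, 9)
    )
    if not preconditions_hold:
        return None
    # Counterpart of: "Mill the wheel rim m = " + Str(m_) + ", z = " + Str(z_) + " finally"
    return f"Mill the wheel rim m = {m:g}, z = {z} finally"


text = finishing_transition("cylindrical", parts=1, nature="Rough",
                            accuracy=9, m=1.0, z=33)
print(text)
```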

Functional dependencies are often tabular. To enter such dependencies into knowledge bases, modules with mechanisms in the form of tables are used. An example of such a module for assigning numerical values is shown in Fig. 11. The table attached to the module can have a header and a sidebar, each of which can in general be multi-level. In Fig. 11, the top level of the header contains the values of the symbolic variable “Replacement fixture”, and the bottom level contains the ranges of “Part modulus, mm”. The sidebar also has two tiers: the upper tier holds the values of the variable “Setup character”, and the lower tier the values of the variable “Type of feed of the modular worm cutter”.


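KM: "NzTpzbCh" - Establishing Tpz base for worm wheels

Preconditions

| Name | Description | Type | Condition |
|---|---|---|---|
| ViZubKol$ | Cogwheel type | STRING | worm |

Input properties

| Name | Description | Type | Value |
|---|---|---|---|
| ZamPris$ | Replacing fixtures | STRING | |
| VidPod$ | Feed type of modular hob cutter | STRING | |
| HarNal$ | Setup characteristic | STRING | |
| m_ | Part module, mm | REAL | |

Mechanism - Table

Configuring properties in a table: header levels ZamPris$ and m_; sidebar levels HarNal$ and VidPod$; cell value tpzb.

| HarNal$ | VidPod$ | With replacement of installation devices, m_ (0,6] | With replacement, m_ (6,12] | With replacement, m_ (12,) | Without replacing the installation devices, m_ (0,6] | Without replacing, m_ (6,12] | Without replacing, m_ (12,) |
|---|---|---|---|---|---|---|---|
| without replacing the milling slide | radial | 29 | 38 | 47 | 17 | 23 | 27 |
| without replacing the milling slide | tangential | 31 | 40 | 50 | 19 | 24 | 30 |
| with replacing the milling slide | radial | 39 | 52 | 67 | 27 | 37 | 42 |
| with replacing the milling slide | tangential | 41 | 56 | 72 | 29 | 40 | 52 |

Output properties

| Name | Description | Type | Value |
|---|---|---|---|
| tpzb | Basic preparatory and final time standard, min | REAL | |

Fig. 11. KM with table mechanism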
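A table mechanism of this kind amounts to a lookup keyed by the sidebar values, the header value and the range into which the numeric argument falls. The sketch below encodes the values of Fig. 11 under the assumption that the first three columns belong to “with replacement of installation devices” and the last three to “without”; the short keys `"with"` and `"without"` and all function and constant names are illustrative.

```python
import math

# Upper bounds of the half-open m_ ranges (0,6], (6,12], (12,) from Fig. 11.
M_UPPER_BOUNDS = [6, 12, math.inf]

# (HarNal$, VidPod$) -> {ZamPris$ key: tpzb values for the three m_ ranges}
TPZB_TABLE = {
    ("without replacing the milling slide", "radial"):
        {"with": [29, 38, 47], "without": [17, 23, 27]},
    ("without replacing the milling slide", "tangential"):
        {"with": [31, 40, 50], "without": [19, 24, 30]},
    ("with replacing the milling slide", "radial"):
        {"with": [39, 52, 67], "without": [27, 37, 42]},
    ("with replacing the milling slide", "tangential"):
        {"with": [41, 56, 72], "without": [29, 40, 52]},
}


def lookup_tpzb(fixture_replacement: str, module_mm: float,
                setup: str, feed: str) -> int:
    """Return the base preparatory-final time (min) for the given arguments."""
    if module_mm <= 0:
        raise ValueError("part module must be positive")
    # Index of the first range whose upper bound is not exceeded.
    column = next(i for i, upper in enumerate(M_UPPER_BOUNDS) if module_mm <= upper)
    return TPZB_TABLE[(setup, feed)][fixture_replacement][column]


print(lookup_tpzb("with", 8, "without replacing the milling slide", "tangential"))
```

Keeping the range bounds in a separate list means a new column can be added to the table without touching the lookup logic.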

4 Conclusions

Research on the application of artificial intelligence methods relies on labor-intensive techniques that are not accessible to specialists in planning mechanical engineering processes [9–11]. The digital revolution should enable knowledge carriers who are not programmers to enter knowledge into the computer without intermediaries. This can be done by means of the expert programming methodology, in which knowledge is described in the language of business prose, which is very close to literary language but is formalized so that it becomes possible to automatically generate software matching the source texts. Business prose can be written in any natural language, and software can be generated in different programming languages.

Knowledge bases are generated from knowledge modules, each representing a condition-action rule that has an identifier and a name, a precondition, input and output properties, and a mechanism for converting the former into the latter. Modules are automatically translated into subprograms in the programming language selected by the user. Thus, the user can choose both the input language of knowledge representation and the target language of software generation.

To automate technological preparation in computer-integrated production, there are systems of two classes: systems that automate the design and standardization of technological processes (CAPP) and systems that automate the programming of operations on CNC machines (CAM). The function of CAPP is to form a complete set of technological documentation (routing and operation sheets, tooling lists, materials, etc.) on the basis of design documentation (specifications, assembly drawings and part drawings) from CAD systems. CAPP systems must also perform planning and standardization of all operations, which is necessary for the proper work of production scheduling systems.

Russia possesses all the technologies necessary for realizing the 4IR. It should be specially mentioned that the tenth technology, artificial intelligence, described in sufficient detail in this paper, is the most important technology for the further development of the systems involved in the 4IR. Systems created on the basis of this technology could be given the name “Industry 5.0”. Bauman Moscow State Technical University holds annual conferences titled “Effective methods of automation of technological preparation and production planning”. In fact, these conferences are devoted to the Industrial revolution in Russia. In 2017 the conference was attended by 555 specialists from 248 enterprises in 95 Russian cities. Further research in this area should be aimed at creating integrated systems for the semi-automatic design of mechanical engineering products and for designing the processes of their manufacture.

References

1. Phuyal, S., Bista, D., Izykowski, J., Bista, R.: Design and implementation of cost efficient SCADA system for industrial automation. Int. J. Eng. Manuf. 10(2), 15–28 (2020). https://doi.org/10.5815/ijem.2020.02.02
2. Fataliyev, T., Mehdiyev, S.: Industry 4.0: the oil and gas sector security and personal data protection. Int. J. Eng. Manuf. 10(2), 1–14 (2020). https://doi.org/10.5815/ijem.2020.02.01
3. Anusha, K., Mahadevaswamy, U.B.: Automatic IoT based plant monitoring and watering system using raspberry PI. Int. J. Eng. Manuf. (IJEM) 8, 55–67 (2018)
4. Kharola, A.: Artificial neural networks based approach for predicting LVDT output characteristics. Int. J. Eng. Manuf. 8(4), 21–28 (2018). https://doi.org/10.5815/ijem.2018.04.03
5. Evgenev, G.B. (ed.): Osnovy avtomatizatsii tekhnologicheskikh protsessov i proizvodstv. T. 2: Metody proyektirovaniya i upravleniya [Fundamentals of technological processes automation and production. Vol. 2: Methods of design and management], p. 479. Bauman MGTU Publishing House, Moscow (2015). (in Russian)
6. SPRUT-Technology: effective CAD/CAM/CAE tools (in Russian)
7. Evgenev, G.B.: Sprut ExPro - sredstvo generatsii mnogoagentnykh sistem proektirovaniya v mashinostroenii [Sprut ExPro - a tool for generating multi-agent design systems in engineering]. Part 1. Izvestiya vysshikh uchebnykh zavedeniy. Mashinostroenie, vol. 6, pp. 66–77 (2017). (in Russian)
8. Evgenev, G.B.: Sprut ExPro - sredstvo generatsii mnogoagentnykh sistem proektirovaniya v mashinostroenii [Sprut ExPro - a tool for generating multi-agent design systems in engineering]. Part 2. Izvestiya vysshikh uchebnykh zavedeniy. Mashinostroenie, no. 7, pp. 60–71 (2017). (in Russian)
9. Xu, X., Wang, L., Newman, S.T.: Computer-aided process planning – a critical review of recent developments and future trends. Int. J. Comput. Integr. Manuf. 24, 1–31 (2011)
10. Al-wswasi, M., Ivanov, A., Makatsoris, H.: A survey on smart automated computer-aided process planning (ACAPP) techniques. Int. J. Adv. Manuf. Technol. 97(1–4), 809–832 (2018). https://doi.org/10.1007/s00170-018-1966-1
11. Nasr, E.A., Kamrani, A.K.: Computer-Based Design and Manufacturing: An Information-Based Approach. Springer, Boston (2007). https://doi.org/10.1007/b101244

Micro-level Modeling of Traffic Flows Through Signalized Crossroads of an Arbitrary Structure

Andrey M. Valuev

Mechanical Engineering Research Institute of the Russian Academy of Sciences, 4, Malyi Kharitonievsky pereulok, 101990 Moscow, Russia

Abstract. The aim of the paper is to develop a general form of a micro-level model of the passage of traffic flows (TFs) through a signalized crossroads of an arbitrary structure. The paper develops the recently proposed approach based on representing the traffic model as an event-switched process, a class of hybrid dynamical systems. The model reproduces specific features of drivers’ behavior in the area of a signalized intersection, where they have to change their driving modes frequently because of the passage of conflict points, the presence of straight and curved segments of permitted routes and the action of traffic lights. To illustrate this, examples are presented whose main features correspond to real intersections in the Moscow urban road network (URN). It is shown how the proposed model makes it possible to choose and optimize a wide variety of control options.

Keywords: Signalized intersection · Traffic organization · Conflict points · Traffic separation scheme · Traffic safety · Microscopic models of traffic flows · Event-switched process · Computational experiments

1 Introduction

Traffic organization and control for URNs face many difficulties under present-day conditions of heavy traffic, most of them related to signalized crossroads of major highways. The passage of TFs through crossroads areas is especially complicated both for drivers and for intelligent control systems (ICSs). Drivers must be very attentive and react immediately to changes in route curvature, to the movement of preceding or neighboring vehicles (especially when passing through points of route branching or merging) and to traffic light signals. An ICS must handle the characteristics of movement of vehicle chains on separate permitted routes and their expected reactions to control signals. The most exact prediction of these takes into account the impact of control parameters on the individual behavior of drivers and the relationships between the movements of neighboring vehicles that guarantee safe traffic. The most adequate way to make such predictions is to simulate traffic flows on the basis of their micro-level models. To support this type of simulation efficiently, modern ways of monitoring TFs and of collecting and processing observation data have been elaborated [1–4]. Simulation results may then be applied in many ways in the management of vehicular traffic systems, including the most recent ones [5, 6].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 40–49, 2021. https://doi.org/10.1007/978-3-030-80531-9_4


The problem lies, however, in the adequacy and accuracy of the traffic models themselves. The analysis below of the most widely used types of models shows their limited ability to reproduce important features of traffic through crossroads. All international experience in modeling road traffic shows that emergency situations cannot be reproduced by a general model and are taken into account in other ways [7]; all models of road traffic known to us are models of safe traffic. So the goal of the paper is to propose a universal way of constructing micro-level models for signalized intersections of various structures and traffic organization, aimed at simulating safe traffic flows through them in order to choose the optimal control options. Section 2 presents a review of existing models. In Sect. 3 the principal features of crossroads structure and of traffic flows through crossroads are characterized, and on this basis the general model of traffic flows through an intersection in the form of an event-switched process is introduced. Section 4 presents ways of applying the proposed approach to the choice of various traffic control options.

2 Review of Mathematical Micro-level Models of TFs and Their Application to Signalized Intersections

The common feature of micro-level models of TFs is that they try to reproduce the movements of individual vehicles and so represent traffic in the most natural way. Models of this type are now used for simulation for many purposes, from forecasting the evolution of a current traffic situation under different assumptions to substantiating principal decisions on traffic organization and road network development. It must be emphasized, however, that for a given traffic situation it is impossible to predict exactly the trajectories of the individual vehicles involved because of the lack of data about them. The real aim of TF micromodeling is to predict the evolution of the entire traffic situation when a definite control by traffic lights is applied; individual deviations of the static and dynamic characteristics of vehicles and of drivers’ behavior are averaged, but the control impact on the entire TF may be derived from its impact on individual driving. There are three principal types of micro-level models, formulated in terms of random streams of events, cellular automata and dynamic systems. Models of random streams may represent a discrete succession of events related to the appearance of individual vehicles at crossroads entries and the termination of some stages of their routes [8, 9]. In principle they may take into account the specific features of the structure, geometry and traffic organization of a particular crossroads by defining a complicated set of streams, but no such complex models are actually known. The principal drawback of this class of models is the impossibility of representing the existing feedbacks between TF intensities on routes through the intersection and their densities, as well as the action of traffic lights. Models representing TFs as dynamic systems, known as car-following models, have been used for decades [10].
Traditional models are in fact not models of the entire TF, even on one lane; they describe only the movement of a pair of successive vehicles referred to as the leader and the follower. Lane changing may also be described in these terms by taking more adjacent vehicles into consideration. Nevertheless, car-following models in the narrow sense can represent a TF only on a short time interval between qualitative changes, and such changes are inevitable and frequent for TFs through an intersection. A form of the model of the entire TF on a segment of a road network was put forward recently [11]. For this purpose, the formalism of a class of hybrid dynamic systems, namely event-switched processes (ESP) [12], was developed for TF simulation [13]. This type of model is developed further in this paper. Another way to unify the movements of individual cars in an integral model without its explicit formulation is based on the swarm approach [14]. The third form of the TF model is the cellular automaton (CA). The original model by Nagel and Schreckenberg [15] is based on dividing lanes into a set of relatively large cells, each of which at any moment of discrete time either contains exactly one car or is empty. In recent works the cell length is set to 7.5 m and the time step to 1 s [16]. Traffic through an intersection is usually modeled as a CA, but a CA with such parameters yields a very crude reproduction of real traffic characteristics (as may be concluded by comparing the cell length with the lengths of routes through the intersection, usually measured in tens of meters). To compensate for the crudeness of CA models, artificial randomization is used. However, CA models have an advantage over traditional car-following models, as they can reproduce the logic of qualitative changes in the traffic process. Both the virtues of CA and the deficiencies of traditional CA models may be seen from the following formulation of the principal features of the algorithm for passing a crossroads: “Within 100 m before traffic lights the vehicle changes lane under its purpose according to the road laws… Additional speed decreasing takes place under the following conditions: if a vehicle is located near the turning point (at the turning point it stops); if a traffic light is red; if there is the collision threat on the crossroad. A vehicle moves under the foregoing rules with randomization. A vehicle turns if it is located in the turning point and has got the corresponding target” [16].
The second type of CA model is a discretization of the real process in time and space without the introduction of any fixed cells. For multi-lane traffic on intercity highways this form of CA model has been developed successfully [17]; this approach may be applied to signalized intersections as well.
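To make the cell-based CA of Nagel and Schreckenberg [15] concrete, the following Python fragment sketches one update step of the standard single-lane rule set (acceleration, slowing down to the gap, randomization, movement) on a ring road. It is a minimal illustration only: the function name and the default values of `v_max` and `p_slow` are assumptions of this sketch, and intersections, lane changes and traffic lights are not modeled. With 7.5 m cells and a 1 s step, a speed of one cell per step corresponds to 7.5 m/s.

```python
import random


def nasch_step(positions, speeds, road_length, v_max=5, p_slow=0.3, rng=random):
    """One parallel update of a single-lane ring road with `road_length` cells.

    positions: distinct cell indices of the cars; speeds: cells per time step.
    Returns the new (positions, speeds) lists in the same car order.
    """
    n = len(positions)
    order = sorted(range(n), key=lambda k: positions[k])
    new_positions = list(positions)
    new_speeds = list(speeds)
    for idx, k in enumerate(order):
        ahead = order[(idx + 1) % n]
        # Number of empty cells between car k and the car in front of it.
        gap = (positions[ahead] - positions[k] - 1) % road_length
        v = min(speeds[k] + 1, v_max)          # 1. acceleration
        v = min(v, gap)                        # 2. slowing down to avoid a collision
        if v > 0 and rng.random() < p_slow:    # 3. randomization
            v -= 1
        new_speeds[k] = v
        new_positions[k] = (positions[k] + v) % road_length  # 4. movement
    return new_positions, new_speeds
```

Setting `p_slow=0` removes the randomization and makes the step deterministic, which is convenient for checking the rules by hand.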

3 The Proposed Models

The proposed models incorporate three main elements: the model of the crossroads structure, the model of traffic organization and control, and the model of the traffic itself.

3.1 Representation of Crossroads Structure and Traffic Organization

The crossroads structure is determined by a set of permitted routes and their mutual disposition. Structurally, the crossroads is an oriented graph. Its arcs are segments of routes and are labeled with the codes of all routes to which they belong; its nodes belong to several classes. Nodes are treated as conflict points if they are points of crossing, merging or branching of routes. A node is a crossing point if it has two pairs of arcs belonging to two different routes, Route 1 and Route 2, each pair consisting of an arc entering the node and an arc exiting it. A merging point has two entering arcs and one exiting arc, and a branching point has two exiting arcs and one entering arc. Conflict points serve as singular points (SPs) for the trajectories of passing vehicles, so the nodes of crossing, merging and branching are designated SPC, SPM and SPB, respectively.
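The graph representation just described can be sketched in a few lines of Python. The fragment below is an illustration under simplifying assumptions, not the author's data structure: routes are given as hypothetical lists of node names, arcs are labeled with the routes that use them, and nodes are classified only by counting distinct entering and exiting arcs (it does not check that the arcs at a crossing belong to different routes).

```python
from collections import defaultdict


def build_arcs(routes):
    """Map each arc (tail, head) to the set of route codes that use it."""
    arc_routes = defaultdict(set)
    for code, nodes in routes.items():
        for tail, head in zip(nodes, nodes[1:]):
            arc_routes[(tail, head)].add(code)
    return dict(arc_routes)


def classify_nodes(routes):
    """Return {node: "SPC" | "SPM" | "SPB"} for the conflict points."""
    incoming = defaultdict(set)
    outgoing = defaultdict(set)
    for tail, head in build_arcs(routes):
        outgoing[tail].add((tail, head))
        incoming[head].add((tail, head))
    kinds = {}
    for node in set(incoming) | set(outgoing):
        n_in, n_out = len(incoming[node]), len(outgoing[node])
        if n_in == 2 and n_out == 2:
            kinds[node] = "SPC"   # crossing: two arcs in, two arcs out
        elif n_in == 2 and n_out == 1:
            kinds[node] = "SPM"   # merging: two arcs in, one arc out
        elif n_in == 1 and n_out == 2:
            kinds[node] = "SPB"   # branching: one arc in, two arcs out
    return kinds
```

A possible use, with made-up node names: `classify_nodes({"R1": ["a", "x", "b"], "R2": ["c", "x", "d"]})` marks the node `x` shared by the two routes as a crossing point.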


Geometrically, routes are smooth curves. Other types of SPs are points on stop lines that either serve as entries to the intersection area or, for complex crossroads, terminate and originate sections of routes passed during one green phase for the corresponding crossroads section. They are denoted by SPES and SPIS, respectively. In addition, some nodes may be defined that are not SPs; they separate straight and curved segments of routes. It is reasonable to define the initial arcs of routes so that they always contain the whole vehicle queues in front of the SPES. Analysis of real road intersections shows that there is always a single route or a succession of routes between a connected entry and exit. Traffic organization at a signalized intersection aims to minimize possible conflicts, so the permitted routes through the intersection are distributed among the phases of the traffic light cycle (TLC), forming a so-called phase-wise traffic separation scheme (TSS); passage along some routes on more than one TLC phase is admitted. Passage schemes for individual phases as a rule do not admit traffic along routes crossing at one point and minimize traffic along routes branching at one point; in most cases simultaneous passage of SPMs along several routes may be excluded entirely. If SPISs exist, they divide the routes and the whole crossroads into two or more sections, and TSSs with separate TLCs are defined for each section. Examples of the crossroads area structure and TSSs are presented for a single-section crossroads in Fig. 1 and for a double-section one in Fig. 2.

3.2 Traffic Flow Through an Intersection as an Event-Switched Process

Like other car-following models, the proposed ones include one basic element: a description of the movement of a pair of cars moving one after another (“leader–follower”) [7], including the choice of acceleration, which serves as the vehicle control.
The acceleration is assumed to be given by a dependence specific to each driving mode; its arguments may be the coordinates and speeds of the vehicle itself and of its leader and, possibly, the acceleration of the leader. For movement along a road with a speed limit, the following modes are distinguished: normal acceleration (1); uniform movement with the maximum (or optimal for the particular vehicle) speed (2); maintaining the minimum safe distance to the leader, in other words the tightest pursuit (3); normal braking (4); and immobility (5). Switching between the modes of an individual vehicle takes place when its trajectory (and, in some cases, the trajectories of related vehicles) reaches the corresponding manifold in the (joint) space of vehicle states; some switching conditions act only in periods determined by the traffic light control. The position on the current arc and the velocity along it are continuous-valued state variables of the i-th vehicle, while the movement mode j(i, t), the route number r(i, t) and, in some cases, other variables serve as its discrete-valued state variables. Another type of switching is related to the appearance of a vehicle at the route beginning, its passing an SPC or SPM, or its reaching the route end. These events change the set of vehicles on a given route and, for the vehicle itself (let it be the i-th one), change the current leader number L(i, t) and the current route number r(i, t). To describe the TF dynamics, we denote by $x_i(t)$ the current position of the vehicle with number i on the route and proceed to the model formulation. In the mode j(i, t) the movement of the i-th vehicle is subject to the equation

$$\ddot{x}_i = U_i\bigl(\Delta x_i(t),\ \dot{x}_{L(i,t)}(t),\ \dot{x}_i(t),\ \ddot{x}_{L(i,t)}(t),\ j(i,t)\bigr), \tag{1}$$


Fig. 1. The principal system of routes at the intersection of Profsoyuznaya and Obrucheva streets in Moscow (a) and passage schemes for the 1st (b) and the 2nd (c) phases of a three-phase TSS.

Fig. 2. Fragment of a complex intersection structure like Serpukhovskaya Square in Moscow (a) and passage schemes for the 1st (b) and 2nd (c) phases for its eastern sections. Legend: markers denote SPES, SPM as an exit, SPIS and SPC.

where $\Delta x_i(t) = x_{L(i,t)}(t) - \mathrm{LEN}_{L(i,t)} - x_i(t)$ is the distance between the rear bumper of the leader and the front bumper of the follower. The movement must satisfy the speed restriction

$$\dot{x}_i(t) \le v_{\max}(x_i(t)). \tag{2}$$

The form and parameters of the dependences in (1) are determined by the restrictions

$$-b_{\mathrm{norm\,max}\,i} \le \ddot{x}_i \le a_{\mathrm{norm\,max}\,i} \tag{3}$$

(for j = 1, 2, 4, 5 we have, respectively, $U_i = a_{\mathrm{norm\,max}\,i}$, $0$, $-b_{\mathrm{norm\,max}\,i}$, $0$) and by the law of safe distance, which must also be respected at all times,

$$\Delta x_i(t) \ge S_{\mathrm{SAFE}}\bigl(\dot{x}_{L(i,t)}(t),\ \dot{x}_i(t)\bigr). \tag{4}$$

When (4) turns to equality, the following relationship holds, connecting $v_0 = \dot{x}_{L(i,t)}(t)$ and $v_1 = \dot{x}_i(t)$:

$$U_i\bigl(\Delta x_i(t), v_0, v_1, j(i,t)\bigr) = \frac{(v_0 - v_1) - \dot{v}_0\,\partial S_{\mathrm{SAFE}}(v_0, v_1)/\partial v_0}{\partial S_{\mathrm{SAFE}}(v_0, v_1)/\partial v_1}, \quad j(i,t) = 3. \tag{5}$$
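Relation (5) can be recovered by differentiating the equality case of (4) with respect to time. The short derivation below is a sketch under the assumptions that the equality is maintained over an interval of time, that the leader length is constant and that the partial derivative of the safe distance with respect to the follower speed is nonzero; the time derivatives of $v_0$ and $v_1$ are the accelerations of the leader and of the follower.

```latex
\begin{align*}
% Equality case of (4): the gap equals the safe distance
\Delta x_i(t) &= S_{\mathrm{SAFE}}(v_0, v_1),
  \qquad v_0 = \dot{x}_{L(i,t)}(t), \quad v_1 = \dot{x}_i(t), \\
% Differentiate both sides in time; d(\Delta x_i)/dt = v_0 - v_1
v_0 - v_1 &= \frac{\partial S_{\mathrm{SAFE}}}{\partial v_0}\,\dot{v}_0
           + \frac{\partial S_{\mathrm{SAFE}}}{\partial v_1}\,\dot{v}_1, \\
% Solve for the follower's acceleration, which is U_i in mode 3
\ddot{x}_i = \dot{v}_1 &= \frac{(v_0 - v_1) - \dot{v}_0\,\partial S_{\mathrm{SAFE}}/\partial v_0}
                              {\partial S_{\mathrm{SAFE}}/\partial v_1}.
\end{align*}
```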


To complete the model formulation we must express the conditions for switching events and their consequences. These events are divided into two classes: the first class contains events that may happen at any time, and the second contains events that may take place only within corresponding periods depending on control (i.e., definite TLC phases) or on SP passage by individual vehicles. A specific feature of the model is that the trajectory of a vehicle, or a part of it, may be defined in two variants, and the ultimate version is chosen in the following way. The 1st variant is defined ignoring the possibility of the next event of the 2nd class; if it shows that the problematic event does not take place, then this variant is taken as the ultimate one, otherwise it is regarded as a trial one. In the latter case, the trajectory on which this event happens is determined and taken as the ultimate one. For both variants all possible events of the 1st class are treated. So we put forward the principle of drivers’ rational anticipation of the dynamics of the local traffic situation and of the consequences of their control choice; it may explain the observed features of normal safe traffic through the crossroads area and yields a constructive way to simulate drivers’ control choice in their usual problematic situations. In both variants the succession of switching events is defined as follows: the expected moments of the next events of all regarded types are calculated for the vehicle in question moving in its present mode; the earliest of them is treated as the moment of the next event, and the corresponding changes of state variables are performed. Now we formulate the conditions of events of the 1st class and the corresponding changes. The conditions for changing mode 1 to 2 and mode 4 to 5 are, respectively, $\dot{x}_i(t) = v_{\max}$ and $\dot{x}_i(t) = 0$. Changing mode 1 or 2 to 3 takes place when (4) turns to equality.
A vehicle leaves the route end, and is thereby excluded from consideration, when its trajectory reaches the hyperplane $x_i(t) = X_{r(i,t)}$. In that case, the mode of its follower changes from 3 to 1. Analogous conditions and consequences with respect to the present and the next route determine the transition of a vehicle to another route through an SPM or SPB, thus liberating this SP, or the capture of SPCs, SPMs or SPBs. Changing mode 1 or 2 to 3 happens when inequality (4) turns to equality. The same condition applies to changing mode 5 to 3 with respect to the new leader in the subsequent passage of an SPM. The moment of changing mode 5 to 1 for a vehicle in front of a stop line is simply the moment when the next green phase begins. Events of a different kind are control changes that do not immediately produce the desired effect (unlike the cases above) and must therefore be made in advance, namely the beginning of braking in order to reduce the velocity on turns or when approaching an SPC occupied by another car, or to stop before an SPES or SPIS when it is impossible to pass it during the current green phase. The latest moment $t_{SB}$ at which braking must start in order to reduce the speed to $v_0$ at the position $x_0$ is found from the condition

$$x_0 = x_i(t_{SB}) + \frac{\bigl(\dot{x}_i(t_{SB})\bigr)^2 - (v_0)^2}{2\,b_{\mathrm{norm\,max}\,i}}. \tag{6}$$

The same condition (6) with $v_0 = 0$ is introduced 1) for reaching the next stop line, if the moment of its crossing on the trial trajectory does not belong to the green phase for the route, and 2) for reaching an SPM, if the competing vehicle reaches it earlier. This condition is applied only in determining the 2nd trajectory variant. After determining the required moment of the beginning of braking, we “slide” along the trial trajectory and replace it with the braking segment from the found moment $t_{SB}$.
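Condition (6) can be turned into a small calculation of where and when braking has to begin. The Python sketch below is illustrative only: the function names are hypothetical, and `braking_start_time` additionally assumes that the vehicle moves uniformly at its current speed along the trial trajectory until braking starts, which is a simplification not required by the model itself.

```python
def braking_distance(v, v0, b_norm):
    """Distance needed to slow from speed v to v0 at constant deceleration b_norm."""
    if b_norm <= 0:
        raise ValueError("deceleration must be positive")
    return max(0.0, (v * v - v0 * v0) / (2.0 * b_norm))


def braking_start_position(x0, v, v0, b_norm):
    """Position x_i(t_SB) from condition (6): x0 minus the braking distance."""
    return x0 - braking_distance(v, v0, b_norm)


def braking_start_time(x_now, v, x0, v0, b_norm):
    """Time until braking must start, assuming uniform motion at speed v.

    Returns 0.0 if the vehicle is already at or beyond the latest start point.
    """
    if v <= 0:
        raise ValueError("the vehicle must be moving")
    x_sb = braking_start_position(x0, v, v0, b_norm)
    if x_sb <= x_now:
        return 0.0
    return (x_sb - x_now) / v
```

For example, a vehicle moving at 10 m/s that must stop at a stop line 50 m ahead with a deceleration of 2.5 m/s² needs 20 m to stop, so under these assumptions braking begins 30 m from the current position, after 3 s.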


It follows from the above that the trajectories of vehicles on each route must be computed recursively: each trajectory is computed as a whole, in the order in which the vehicles pass the route. First the entire trajectory of the 1st vehicle is computed; then the trajectory of the 2nd vehicle, linked by safety conditions with the 1st trajectory at each moment of their simultaneous motion; and so on. If routes merge at an SPM, the calculation of trajectories alternates between both routes, in the order of SPM passing. First a trajectory is determined in a trial version and compared with the trajectory of the competing car. This trajectory is treated as the ultimate one if it reaches the SPM earlier; otherwise it is determined finally with a braking segment before the SPM. The introduced set of relationships forms models for single-section and multi-section crossroads of any structure, traffic organization and control.
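The event-selection rule of Sect. 3.2 (the expected moments of all candidate events are computed and the earliest is taken as the next event) can be sketched as follows. This Python fragment is a deliberately reduced illustration: it covers only modes 1, 2 and 4 and three kinds of events (reaching the speed limit, coming to a stop, reaching the route end), it ignores the leader and the traffic lights, and all names in it are assumptions of this sketch.

```python
import math


def time_to_reach(distance, v, accel):
    """Smallest t >= 0 with v*t + accel*t**2/2 == distance, or None if never."""
    if distance <= 0:
        return 0.0
    if accel == 0:
        return distance / v if v > 0 else None
    disc = v * v + 2.0 * accel * distance
    if disc < 0:
        return None  # the vehicle stops before covering the distance
    t = (-v + math.sqrt(disc)) / accel
    return t if t >= 0 else None


def next_event(mode, v, dist_to_route_end, a_norm, b_norm, v_max):
    """Return (event name, time until it happens) for the earliest candidate."""
    candidates = {}
    if mode == 1:      # normal acceleration
        accel = a_norm
        candidates["1->2: speed limit reached"] = max(0.0, (v_max - v) / a_norm)
    elif mode == 2:    # uniform movement
        accel = 0.0
    elif mode == 4:    # normal braking
        accel = -b_norm
        candidates["4->5: vehicle stops"] = v / b_norm
    else:
        raise ValueError("only modes 1, 2 and 4 are covered by this sketch")
    t_end = time_to_reach(dist_to_route_end, v, accel)
    if t_end is not None:
        candidates["route end reached"] = t_end
    if not candidates:
        raise ValueError("no event can happen in this state")
    return min(candidates.items(), key=lambda item: item[1])
```

In a fuller implementation the candidate set would also contain the events tied to the leader (equality in (4)), to conflict points and to TLC phases; the selection of the minimum would stay the same.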

4 Ways of the Proposed Approach Application for Traffic Control Options Optimization

To improve unfavorable situations resulting from heavy traffic, it is reasonable to choose among many control options, both parametric and structural [18]. To illustrate this idea, consider the intersection presented in Fig. 1. For it there are three levels of control on the existing set of routes: the choice of the TSS, the choice of the order of the passage schemes constituting the chosen TSS, and the assignment of phase durations. Moreover, it is possible to shift slightly the time limits for some passage directions within the same phase. As to phase durations, they must be sufficient for each route to pass, on average, the number of vehicles arriving at its beginning point over the entire TLC. Their preliminary minimal values may be determined by treating the TFs on various routes independently. For more exact estimates we must take into account the following fact: the real time available for the passage of the chain of vehicles that starts along a route on a given green phase may be either less or greater than the phase duration; it depends on the moment when the conflict points on the route are left by the tail vehicle of the previous cluster and the moment when each such point is reached by the head vehicle of the present cluster. Thus, by changing the order of phases (passage schemes) we may use the intersection throughput more efficiently. Let us consider the problem of optimizing the TLC phase durations for the crossroads and its TSS shown in Fig. 1, defining the 3rd phase passage scheme as symmetrical to that shown in Fig. 1c. Let the entire TLC duration be given, and let N_i denote the number of vehicles entering the intersection area by lane i. A preliminary estimate of the time required to pass N_i vehicles along a route is shown in Table 1; these times depend only slightly on the route.

Table 1. The minimum and maximum time required to pass a definite count of vehicles through permitted routes and respective green phase duration rounded to seconds.

N     TminN, s   TmaxN, s   TroundN, s
2     4.51       4.51       5
3     6.81       6.81       7
4     8.82       8.82       9
5     10.70      10.70      11
6     12.51      12.51      13
7     14.28      14.28      15
8     16.01      16.03      17
9     17.73      17.76      18
10    19.42      19.48      20
11    21.10      21.19      22
12    22.77      22.89      23
13    24.44      24.59      25
14    26.09      26.28      27
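The rounded durations in Table 1 coincide with the maximum passage times rounded up to the next whole second. The Python sketch below reproduces that relation and inverts it to find how many vehicles fit into a given green phase; the data are transcribed from Table 1, while the function names and the idea of using the table as a direct lookup are assumptions of this illustration.

```python
import math

# Maximum passage times TmaxN (s) transcribed from Table 1, keyed by N.
T_MAX = {
    2: 4.51, 3: 6.81, 4: 8.82, 5: 10.70, 6: 12.51, 7: 14.28, 8: 16.03,
    9: 17.76, 10: 19.48, 11: 21.19, 12: 22.89, 13: 24.59, 14: 26.28,
}


def green_duration(n_vehicles):
    """Whole-second green time sufficient for n_vehicles on any route."""
    return math.ceil(T_MAX[n_vehicles])


def max_vehicles(green_seconds):
    """Largest tabulated N whose maximum passage time fits into the green time.

    Returns 0 if even the smallest tabulated count does not fit.
    """
    fitting = [n for n, t in T_MAX.items() if t <= green_seconds]
    return max(fitting, default=0)
```

For instance, `green_duration(8)` gives the 17 s listed in the table, and `max_vehicles(20)` returns 10.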


Table 2 shows the possible calculated delays caused by the presence of SPs and compares the moments at which these SPs are liberated on the previous phase with the moments of their capture by the first vehicle on the next phase. The calculations assume that the previous phase duration is as shown in the last row of Table 1. Results of this comparison for three variants (with 8, 12 or 15 vehicles passing on the 1st phase on routes from entries 1, 2, 6 and 7) are shown in the upper part of Table 2 (rows with Route1 = 1, 2, 6, 7). The remaining rows present the reverse situation, in which the passage scheme for the 1st phase is that shown in Fig. 1c and 8, 12 or 15 vehicles pass from entries 3, 4 and 5.

Table 2. Calculation of delays on SPs for the three variants of passage schemes succession (times in seconds).

Route1  Route2  tLIB1   tCAPT1  tRES1   tLIB2   tCAPT2  tRES2   tLIB3   tCAPT3  tRES3
1       4       22.86   30.04   7.18    29.13   36.04   6.91    33.85   41.04   7.19
2       4       23.17   29.52   6.34    29.65   35.52   5.87    34.51   40.52   6.01
2       5       26.65   30.17   3.51    33.01   36.17   3.15    37.80   41.17   3.36
6       3       29.22   28.69   −0.53   35.69   34.69   −1.00   40.55   39.69   −0.86
6       4       26.73   27.33   0.60    33.21   33.33   0.12    38.09   38.33   0.24
6       5       24.92   27.33   2.41    31.43   33.33   1.90    36.33   38.33   2.00
7       4       26.52   27.95   1.43    32.88   33.95   1.08    37.67   38.95   1.29
7       5       24.42   27.95   3.54    30.84   33.95   3.12    35.67   38.95   3.29
3       6       24.20   33.33   9.13    30.49   39.33   8.84    35.23   44.33   9.10
4       1       25.59   27.33   1.73    31.92   33.33   1.41    36.69   38.33   1.64
4       2       25.11   27.33   2.22    31.46   33.33   1.87    36.24   38.33   2.09
4       6       23.13   30.90   7.78    29.55   36.90   7.35    34.38   41.90   7.52
4       7       23.68   30.90   7.22    30.09   36.90   6.82    34.90   41.90   7.00
5       2       25.73   31.02   5.29    32.06   37.02   4.96    36.83   42.02   5.19
5       6       23.13   29.11   5.98    29.57   35.11   5.54    34.40   40.11   5.71
5       7       23.69   28.69   5.00    30.10   34.69   4.59    34.92   39.69   4.77

From the data in Table 2 we conclude that for the second variant there are positive time reserves for all routes (not less than 8.84 s for Route 3, 1.41 s for Route 4 and 4.77 s for Route 5). So the traffic intensities on these routes may be even greater than the supposed phase duration provides for; to accommodate them, the 2nd phase beginning times may be assigned earlier individually for each of these routes without earlier termination of the 1st phase on any route. On the contrary, if the passage scheme for the 1st phase is that shown in Fig. 1b, then the phase duration may be insufficient for Route 3 if the traffic intensity on it is maximal. It must be emphasized that the calculated estimates of the phase durations and shifts must be confirmed or corrected by the same analysis of TF passage at the 3rd phase. In this way the preferable succession of passage schemes of a given TSS and reasonable values of the TLC parameters for it can be established. The choice between TSSs, if possible, demands the same computations. If there are no branching routes, then the only required data are the values of N_i; otherwise the proportions of branching must be known. The proposed model may yield even more exact predictions and recommendations by simulating the passage of all TFs over the entire TLC. All the above is especially important for multi-section crossroads, whose control demands a rational combination of options for all sections.
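The time reserves in Table 2 are, up to rounding of the tabulated values, the differences between the capture and liberation moments of a conflict point, and a negative reserve signals a conflict between successive phases. The Python sketch below illustrates this bookkeeping on three rows of the first variant transcribed from Table 2; the function names and the tuple layout are assumptions of this example.

```python
# (Route1, Route2, tLIB1, tCAPT1) for three rows of Table 2, first variant.
ROWS = [
    (1, 4, 22.86, 30.04),
    (6, 3, 29.22, 28.69),
    (3, 6, 24.20, 33.33),
]


def time_reserve(t_lib, t_capt):
    """Reserve between liberation of an SP by the previous phase and its
    capture by the first vehicle of the next phase, in seconds."""
    return round(t_capt - t_lib, 2)


def conflicting_pairs(rows):
    """Route pairs for which the SP is captured before it is liberated."""
    return [(r1, r2) for r1, r2, t_lib, t_capt in rows
            if time_reserve(t_lib, t_capt) < 0]
```

With the rows above, only the pair of routes 6 and 3 is reported, in line with the negative entries in the corresponding row of Table 2.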

5 Discussion and Conclusions

Crossroads passage, the most complicated element of TF dynamics, has not yet found reliable micro-level models, except probably in the simplest cases. In this paper a model of TF passage through a signalized intersection is put forward; crossroads of various structures and traffic organization may be treated in the proposed terms. The principal ways of applying the model to the choice of traffic control options are demonstrated. As to computational aspects, more experience has been obtained with analogous models [13] and should be useful for this one. However, many problems remain for its real application. The first is its parametric identification, which may not be universal and must reflect the features of TF composition in a particular area. We obtained a certain amount of the necessary local information without the use of specialized equipment, but obtaining experimental data by advanced technical means and their initial processing for all the objectives stated here is a rather complex legal and organizational problem. Assessing the role of stochastic factors in TFs remains both a theoretical and a practical problem. The introduced model enables calculation with both uniform deterministic and stochastic representations of incoming TFs, but the interpretation and treatment of monitoring data revealing their probabilistic characteristics is still an unresolved problem. We hope to find some approaches to it in our further research.

References 1. Zhou, C., Weng, Z., Chen, X., Zhizhe, S.: Integrated traffic information service system for public travel based on smart phones applications: a case in China. Int. J. Intell. Syst. Appl. (IJISA) 5(12), 72–80 (2013) 2. Goyal, K., Kaur, D.: A novel vehicle classification model for urban traffic surveillance using the deep neural network model. Int. J. Educ. Manage. Eng. (IJEME) 6(1), 18–31 (2016) 3. Tourani, A., Shahbahrami, A., Akoushideh, A., Khazaee, S., Suen, C.Y.: Motion-based vehicle speed measurement for intelligent transportation systems. Int. J. Image Graph. Sig. Process. (IJIGSP) 11(4), 42–54 (2019) 4. Memon, S., Bhatti, S., Thebo, L.A., Talpur, M.M.B., Memon, M.A.: A video based vehicle detection, counting and classification system. Int. J. Image Graph. Sig. Process. (IJIGSP) 10(9), 34–41 (2018) 5. Danileviˇcius, A., Bogdeviˇcius, M.: Investigation of traffic light switching period affect for traffic flow dynamic processes using discrete model of traffic flow. Proc. Eng. 187, 198–205 (2017) 6. Adebiyi, R.F., Abubilal, K.A., Mu’azu, M.B., Adebiyi, B.H.: Development and simulation of adaptive traffic light controller using artificial bee colony algorithm. Int. J. Intell. Syst. Appl. (IJISA) 10(8), 68–74 (2018)


7. Tarko, A.P.: Use of crash surrogates and exceedance statistics to estimate road safety. Accid. Anal. Prev. 45(1), 230–240 (2012) 8. Babicheva, T.S.: The use of queuing theory at research and optimization of traffic on the signal-controlled road intersections. Proc. Comput. Sci. 55, 469–478 (2015) 9. Yang, S., Yang, X.: The application of the queuing theory in the traffic flow of intersection. Int. J. Math. Comput. Phys. Electr. Comput. Eng. 8(6), 986–989 (2014) 10. Treiber, M., Kesting, A.: Traffic Flow Dynamics: Data, Models and Simulation. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-32460-4 11. Glukharev, K.K., Ulyukov, N.M., Valuev, A.M., Kalinin, I.N.: On traffic flow on the arterial network model. In: Kozlov, V.V., Buslaev, A.P., Bugaev, A.S., Yashina, M.V., Schadschneider, A., Schreckenberg, M. (eds.) Traffic and Granular Flow ’11, pp. 399–411. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-39669-4_38 12. Valuev, A.M.: A new model of resource planning for optimal project scheduling. Math. Model. Anal. 12(2), 255–266 (2007) 13. Solovyev, A.A., Valuev, A.M.: Organization of traffic flows simulation aimed at establishment of integral characteristics of their dynamics. Adv. Syst. Sci. Appl. 18(2), 1–10 (2018) 14. Pavlovsky, V.E., Pavlovsky, V.V.: Mathematical models of swarm aspects in traffic flows. In: Proceedings of the 18th International Conference on Computational and Mathematical Methods in Science and Engineering CMMSE, pp. 9–13 (2018) 15. Nagel, K., Schreckenberg, M.: A cellular automaton model for freeway traffic. J. Phys. I France 2, 2221–2229 (1992) 16. Churbanova, N.G., Chechina, A.A., Trapeznikova, M.A., Sokolov, P.A.: Simulation of traffic flows on road segments using cellular automata theory and quasigasdynamic approach. Math. Montisnigri 46, 72–90 (2019). https://doi.org/10.20948/mathmon-2019-46-7 17. 
Kerner, B.S., Klenov, S.L., Hermanns, G., Schreckenberg, M.: Effect of driver overacceleration on traffic breakdown in three-phase cellular automaton traffic flow models. Phys. A 392(18), 4083–4105 (2013) 18. Solovyev, A.A., Valuev, A.M.: Problems of optimization of control for crossroads with multistage passage of traffic flows. In: 2020 13th International Conference on Management of Large-Scale System Development (MLSD), 5 p. IEEE Xplore Digital Library (2020). https://ieeexplore.ieee.org/document/9247655

Polypolar Coordination by the Multifocal Lemniscates
T. Rakcheeva
Mechanical Engineering Research Institute of the Russian Academy of Sciences, 4 Maly Kharitonievskiy Pereulok, Moscow 101990, Russia [email protected]

Abstract. A polypolar coordinate system (CS) is introduced. The work is devoted to the formation of a polypolar CS based on a family of isofocus lemniscates parametrized along the radius. The complete set of foci of the lemniscate family is defined as its structural origin. Functional definitions of the polypolar coordinates are introduced: metric ρ and angular ϕ. The mathematical substantiation of the necessary properties concerning uniqueness, monotonicity, orthogonality and ranges of values is given. It is proved that the coordinate families of curves ρ = const and ϕ = const of a polypolar lemniscate CS are, for any connectivity, mutually orthogonal conjugate families. The basic provisions of polypolar lemniscate coordination and the passage to the classical limit have been developed. The concept of ϕ-parameterization is introduced, and its features are considered in the critical range of different connectivity. The introduced polypolar lemniscate CS (LCS), like the classical polar one, characterizes a point on the plane with polar coordinates ρ and ϕ, but it has not one center-pole but several, organized into a certain structure of poles. The metric component can be arbitrary and rather complex, adjusted manually or automatically, and for any shape of the metric component the angular component is orthogonal. The polypolar CS makes possible such applications as combining different metrics in one CS, forming an individual LCS-coordination for any object image, the focal representation of forms and their symmetries, and curvilinear symmetries on multifocal lemniscates. The LCS can be defined as a universal curvilinear CS (CCS), tunable to a special form of the subject area. Keywords: Foci · Curves · Lemniscates · Curvilinear coordinate system · Metrics · Symmetries · Approximation · Representation of forms

1 Introduction In the solution of both applied and theoretical problems, much is determined by an adequate choice of factors, that is, of the variables in whose space the solution is sought [1–9]. Despite the universality and simplicity of the widely used rectilinear Cartesian coordinate system, a class of other, curvilinear CS (CCS) has been developed. The use of curvilinear coordinates can greatly simplify many tasks, for example, when the surface or curve under study has a constant value on a coordinate family. For this reason, curvilinear coordinate systems come in a wide variety of shapes and organizations.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 50–62, 2021. https://doi.org/10.1007/978-3-030-80531-9_5


Well-organized coordinate systems, both rectilinear and curvilinear, are characterized by the origin, the guides and the metric units along these guides. Thus, a rectangular Cartesian CS has its origin at the intersection of the coordinate axes and has equal orthogonal guides with equal metric units. In the general case, the two degrees of freedom of a point of the plane are coordinated differently in different CSs. The simplest curvilinear CS, the polar one, characterizes a point relative to a single center by two coordinates: the polar radius ρ and the polar angle ϕ, where ρ is the Euclidean distance from the pole to the point and ϕ is the radian measure of the angle relative to the polar axis. Such coordination is provided by two orthogonal families: a family of concentric circles ρ = const and a family of straight lines ϕ = const passing through the pole; at each point of the plane, except for the pole, one circle and one straight line intersect orthogonally. In addition to the classical polar CS, there are a number of other curvilinear coordinate systems (CCS): cylindrical, parabolic, ellipsoidal, bipolar, etc. Universal well-organized orthogonal CCSs have functional-analytical conjugate coordinate families: parabolas in a parabolic CCS, spheres and cones in a spherical one, ellipses and hyperbolas in an elliptic one, cylinders and planes in a cylindrical one, etc. This limits their versatility in application. Special CCSs, such as geodetic or hydrodynamic ones, have coordinate families associated with the physics of a specific applied problem (in these examples, the shape of the Earth or field lines), which are also described by analytical formulas and have orthogonal conjugate families. But their use is limited from the outset to highly specialized applications. To achieve applied universality, it is possible, of course, to represent one of the coordinate families by an arbitrary empirical curve, given, for example, pointwise.
However, even in this case it is impossible to build a well-organized CCS, because no analytical apparatus is available, and the coordination will be limited to this one task. In this paper, we propose a different, polypolar coordinate system (PPL), which, like the classical polar CS, characterizes a point in the plane with two coordinates, the polar radius ρ and the polar angle ϕ, but has not one center-pole but several (a finite number of) poles. Such coordination is provided by families of multifocal lemniscates. Multifocal lemniscates are smooth closed curves without self-intersections, not necessarily simply connected, containing a finite number of foci within them [1, 2]. A lemniscate is completely determined by a system of a finite number k of foci and a numeric parameter, the radius R: it is the locus of points for which the product of the distances to all k foci remains constant and equal to R^k (Fig. 1a). The defining invariant of the lemniscate can be written as:

$$\prod_{j=1}^{k} r_j = R^k, \tag{1}$$

where r_j is the Euclidean distance from any point (x, y) of the lemniscate to the j-th focus f_j with coordinates (a_j, b_j). A lemniscate with k foci will be called a kf-lemniscate, and the focal system of k foci f = {f_1, …, f_k} a kf-system. The left side of this invariant, which in the complex representation is the modulus of a polynomial, is a nonnegative function and defines an applicative surface P(x, y) touching the plane only at the focus points. By definition, a lemniscate is a level line of this surface. Thus, each lemniscate corresponds to the horizontal section of the surface P(x, y) at the level R^k, and each level line of this surface is projected onto the lemniscate with the corresponding radius R (Fig. 1a). As a consequence, for a fixed kf-system, lemniscates with different radii form a family of nested curves, from k-connected for small values of the radius R to a single connected curve for large values; moreover, curves with a larger radius enclose curves with a smaller radius without intersections [2, 3]. In a certain range of the radius the lemniscates vary widely in form, which makes them usable as approximating functions for a wide range of curve shapes [1–3]. The family of multifocal lemniscates forms a class of approximating functions for smooth closed curves (a continuous basis). These properties of the family of multifocal lemniscates make it possible to use them to generalize the classical polar representation in the form of a polypolar coordination of a plane or space. The peculiarity of the proposed CS lies in combining a certain degree of universality in describing the metric component with an analytically organized mathematical apparatus. The latter manifests itself in providing the properties required of well-organized coordinate systems: unambiguous coordination, monotonicity, range of values, periodicity, and orthogonality of conjugate families. Thus: The purpose of this work is to prove the possibility of constructing a polypolar coordinate system based on multifocal lemniscates.
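The defining invariant (1) and its reading as the modulus of a complex polynomial can be checked numerically. The following is only a minimal sketch: the function names and the sample 3f-system are hypothetical and are not taken from the paper.

```python
import math

def distance_product(point, foci):
    """Left-hand side of invariant (1): product of distances from point to all foci."""
    x, y = point
    prod = 1.0
    for a, b in foci:
        prod *= math.hypot(x - a, y - b)
    return prod

def polynomial_modulus(point, foci):
    """|P(z)| for P(z) = prod_j (z - f_j), with the foci taken as complex roots."""
    z = complex(*point)
    value = complex(1.0, 0.0)
    for a, b in foci:
        value *= z - complex(a, b)
    return abs(value)

# Hypothetical 3f-system and test point, chosen only for illustration.
foci = [(-1.0, 0.0), (1.0, 0.0), (0.0, 1.5)]
point = (0.7, -0.4)

lhs = distance_product(point, foci)
# Radius R of the lemniscate passing through `point`, from prod r_j = R**k.
radius = lhs ** (1.0 / len(foci))
```

Both functions return the same number, and the product vanishes exactly at a focus, which is where the surface P(x, y) touches the plane.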

2 Polypolar Lemniscus Coordinate System Let us introduce on the plane an absolute Cartesian reference system (ACS), in which the k focus points are given coordinates: f_j = (a_j, b_j), j = 1, …, k. These k foci form a focal system, which we will call a kf-system, and the corresponding lemniscates are kf-lemniscates.

Fig. 1. LCS-coordination: a) 3f-lemniscate family, b) 2f-LCS, c) 2f limit transition to the 1f one

An arbitrary point of the plane with ACS coordinates (x, y) has, in a lemniscatic coordinate system (LCS), polar coordinates (ρ, ϕ), where ρ is the metric coordinate and ϕ is the angular coordinate; these are, respectively, functions of the focal coordinates ρ_j and ϕ_j, the polar coordinates with respect to each focus f_j. The classical polar CS is then the special case of the polypolar one in which the focal system consists of a single focus. The structure of the polypolar LCS should allow a limiting transition from the polypolar to the classical polar coordinate system as the k foci merge into one. Thus, the introduced PPL coordination must satisfy the following requirements:
a) existence and mutual uniqueness: each point (x, y) corresponds to one pair of polypolar coordinates (ρ, ϕ), and each pair (ρ, ϕ) coordinates one point (x, y);
b) the existence of orthogonal isoparametric conjugate families, their continuity and monotonicity in the coordinate parameters ρ (for ϕ = const) and ϕ (for ρ = const);
c) the existence of a limiting transition to the classical unipolar coordinate system when the polypolar kf-structure degenerates into a monopolar one, i.e. when all k foci approach one point.

3 Metric Coordinates ρ Let us define the function ρ of the radius-vectors from an arbitrary point of the plane to the focus system as the geometric mean

$$\rho = \sqrt[k]{r_1 \cdot r_2 \cdot \ldots \cdot r_k} \tag{2}$$

of the focal polar radii ρ_j ≡ r_j, equal, respectively, to

$$\rho_j \equiv r_j = \sqrt{(x - a_j)^2 + (y - b_j)^2}. \tag{3}$$

Thus, the introduced metric coordinate ρ of an arbitrary point (x, y) determines its distance to the kf-system {f_1, …, f_k}. As a function of distance, the coordinate ρ must exist and be single-valued, be positive everywhere except at the origin, vanish at the origin, and range from 0 to ∞. Indeed, the lemniscate metric coordinate ρ exists for any point (x, y) of the plane, since for any point each focal polar radius ρ_j (3) exists and has a single non-negative value. ρ takes the value zero only at the k poles of the focal system, which together represent the origin of the kf-system. The metric coordinate ρ can take any non-negative value in the range [0, ∞), since the radius of the kf-lemniscate varies continuously from 0 to ∞. The monotonicity of the ρ coordinate follows from the property noted above: a lemniscate with a larger radius encloses lemniscates with a smaller radius without contact or intersection. Like the family of concentric circles, the family of isofocus kf-lemniscates coordinates the distance to the kf-system, the structural origin of the coordinate system, and the radius R serves as the metric (radial) coordinate ρ of the polypolar LCS. On the one hand, each point (x, y) of the plane has a definite and unique value ρ, its distance to the kf-system; on the other hand, each value of the radius R corresponds one-to-one to a certain lemniscate and, therefore, to a certain value of the metric coordinate ρ. Thus: The metric polypolar coordinate ρ exists for any point (x, y). Factorization of the polypolar plane with respect to the coordinate parameter ρ is one-to-one [3].
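As an illustration of definitions (2) and (3), the sketch below evaluates ρ for a 2f-system with foci at (−1, 0) and (1, 0). The helper name `rho` and the sample points are assumptions made for this example (it needs Python 3.8+ for `math.prod`).

```python
import math

def rho(x, y, foci):
    """Metric coordinate (2): geometric mean of the focal radii (3)."""
    radii = [math.hypot(x - a, y - b) for a, b in foci]
    return math.prod(radii) ** (1.0 / len(foci))

# 2f-system with a = 1 (illustrative values).
foci_2f = [(-1.0, 0.0), (1.0, 0.0)]

rho_at_focus = rho(1.0, 0.0, foci_2f)             # a pole of the system
rho_at_center = rho(0.0, 0.0, foci_2f)            # r1 = r2 = 1
rho_at_tip = rho(math.sqrt(2.0), 0.0, foci_2f)    # r1 * r2 = (sqrt(2)+1)(sqrt(2)-1) = 1
```

The coordinate vanishes at a pole, while the center and the point (√2, 0) both have ρ = 1 = a, i.e. they lie on the same coordinate curve ρ = const (the Bernoulli lemniscate R = a).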


The isometric family of coordinate curves ρ = const, which is the family of isofocus lemniscates, leads to an important consequence: a kf-lemniscate ρ = const ≡ R, which satisfies the condition of constant distance to the kf-system, can be regarded on the plane as a multifocal analog of the polar circle, that is, a polypolar circle centered at the origin of the kf-system. In contrast to the classical circle, which is monopolar, we will call the kf-lemniscate a polypolar circle; an additional justification is that any lemniscate “surrounds” all its foci.

4 Angular Coordinates ϕ We introduce the angular coordinate of the polypolar lemniscatic CS as the arithmetic mean of the polar focal angles ϕ_j:

$$\phi = (\phi_1 + \phi_2 + \ldots + \phi_k)/k, \tag{4}$$

each of which is defined as the classical polar angle relative to the j-th pole-focus:

$$\phi_j = \operatorname{arctg}\frac{y - b_j}{x - a_j}. \tag{5}$$

The validity of this definition of the angular coordinate in one direction is obvious: for any point (x, y) of the plane, its direction to each pole-focus, i.e. the angle ϕ_j, can be determined unambiguously, and hence so can the mean ϕ of these angles. That is, for each point of the plane there is a unique coordinate ϕ. The range of variation of the angular coordinate ϕ follows from the argument principle. A lemniscate is naturally defined on the complex plane. By the argument principle, the argument of a polynomial changes by 2πk when a point makes one complete traversal of a closed contour with k zeros of the polynomial inside it. For a fixed value of ρ, the point (x, y) belongs to a definite lemniscate with k foci, and a complete traversal of this lemniscate changes the sum of the focal angles by 2πk. Defining ϕ as the arithmetic mean of the polar angles ϕ_j therefore gives the range of variation [0, 2π]. As an orientation coordinate, ϕ has the properties of a direction [3]: for any kf-structure, the coordinate ϕ exists everywhere except at the structural origin of coordinates, where it is not defined; it is non-negative everywhere and increases monotonically when the focus structure is traversed in the positive direction along the isometric curve ρ = const. Thus: The angular coordinate ϕ of an arbitrary point (x, y) determines its direction to the kf-structure. The angular coordinate ϕ has a periodic closure with the usual range of values from 0 to 2π. The zero isoline, where ϕ = const = 0, is, unlike the polar axis of the classical CS, a straight axial line only in the special case of a symmetric kf-structure; in the general case it is a curved line.
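The range argument can be illustrated numerically for a simply connected 2f-lemniscate. The sketch below is an assumption-laden example, not the author's procedure: it uses the closed polar form of the curve r1·r2 = R² for foci (±a, 0) and R > a, namely r² = a² cos 2θ + sqrt(R⁴ − a⁴ sin² 2θ), samples one counterclockwise traversal, and accumulates each focal angle continuously (the `unwrap` helper is hypothetical).

```python
import math

def cassini_point(theta, a, R):
    """Point of the 2f-lemniscate r1*r2 = R**2 (R > a) in direction theta from the center."""
    r_sq = a * a * math.cos(2 * theta) + math.sqrt(R ** 4 - a ** 4 * math.sin(2 * theta) ** 2)
    r = math.sqrt(r_sq)
    return r * math.cos(theta), r * math.sin(theta)

def unwrap(angles):
    """Remove 2*pi jumps so that consecutive samples differ by less than pi."""
    out = [angles[0]]
    for ang in angles[1:]:
        step = (ang - out[-1] + math.pi) % (2 * math.pi) - math.pi
        out.append(out[-1] + step)
    return out

def mean_angle_along(a, R, n=4000):
    """Polypolar angle (4) sampled along one full traversal of the lemniscate."""
    thetas = [2 * math.pi * i / n for i in range(n + 1)]
    pts = [cassini_point(t, a, R) for t in thetas]
    phi1 = unwrap([math.atan2(y, x + a) for x, y in pts])   # angle about focus (-a, 0)
    phi2 = unwrap([math.atan2(y, x - a) for x, y in pts])   # angle about focus (+a, 0)
    return [(p + q) / 2 for p, q in zip(phi1, phi2)]

phi = mean_angle_along(a=1.0, R=1.05)
total_change = phi[-1] - phi[0]
is_monotone = all(later > earlier for earlier, later in zip(phi, phi[1:]))
```

For this sample curve the sum of the two focal angles grows by 2π·2 over the traversal, so their mean ϕ grows by 2π, and the sampled values of ϕ increase at every step.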


5 Coordinate Families In general, each individual polar angle ϕ_j changes in different directions during a traversal of the lemniscate. Therefore, for the introduced polypolar coordinate ϕ, its uniqueness and the monotonicity of its change along an arbitrary lemniscate have to be proved, i.e. that no two different points with the same value of ρ can have the same value of ϕ. In addition, the most important task is to determine, for an arbitrary kf-system, the family of isoparametric curves satisfying the condition ϕ = const, the family conjugate to the family of isometric kf-lemniscates. In this paper, we give proofs for the simplest case of a 2f-system (for the general case of a kf-system, the proofs are given in [10]).

2f-coordinate system. Consider a two-polar system with one parameter a, which coordinates a system of two foci. Without loss of generality, both foci can be positioned on the x axis, symmetrically about the origin of the ACS (Fig. 1b) at a distance a, i.e. the 2f-system is specified by foci with coordinates f_1 = (−a, 0), f_2 = (a, 0). The finite invariant equation of the isometric family of 2f-lemniscates is:

$$r_1^2 r_2^2 \equiv \left[(x + a)^2 + y^2\right] \cdot \left[(x - a)^2 + y^2\right] = R^4 = \text{const}. \tag{6}$$

The differential equation for the lemniscate family is:

$$\left[(x + a) r_2^2 + (x - a) r_1^2\right] dx + \left[y r_2^2 + y r_1^2\right] dy = 0,$$

which, after substituting r_1^2 and r_2^2 from (6), reduces to x(x^2 + y^2 − a^2) dx + y(x^2 + y^2 + a^2) dy = 0. Replacing y′ with −1/y′, we get the differential equation:

$$-y\left(x^2 + y^2 + a^2\right) dx + x\left(x^2 + y^2 - a^2\right) dy = 0, \tag{7}$$

the solution of which is known to be a family of gradient curves. Despite its apparent simplicity, solving it presents certain difficulties, so let us approach the problem from the other side. Consider the invariant equation for the angular coordinate:

$$\phi_1 + \phi_2 \equiv \operatorname{arctg}\frac{y}{x + a} + \operatorname{arctg}\frac{y}{x - a} = \text{const}. \tag{8}$$

From it we obtain the differential equation for the isoparametric family of curves on which the value of ϕ is invariable:

$$\left[-\frac{y}{(x + a)^2 + y^2} - \frac{y}{(x - a)^2 + y^2}\right] dx + \left[\frac{x + a}{(x + a)^2 + y^2} + \frac{x - a}{(x - a)^2 + y^2}\right] dy = 0.$$

Reducing to a common denominator, we finally obtain a differential equation for the family conjugate to the lemniscates:

$$-y\left(x^2 + y^2 + a^2\right) dx + x\left(x^2 + y^2 - a^2\right) dy = 0.$$

Comparing this equation with Eq. (7), we see that they are identical. Consequently, the solution of the differential Eq. (7) is the family of invariant curves (8).
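The conclusion of this derivation, that the curves (8) are orthogonal to the lemniscates (6), can be spot-checked numerically. The sketch below compares central-difference gradients of the two invariants at a few points in the upper half-plane; the value a = 1, the sample points and all function names are assumptions made for the example, and `atan2` is used in place of arctg so that the angles are continuous for y > 0.

```python
import math

A = 1.0  # half-distance between the two foci (assumed value)

def level_function(x, y):
    """r1^2 * r2^2, constant on each 2f-lemniscate (Eq. 6)."""
    return ((x + A) ** 2 + y ** 2) * ((x - A) ** 2 + y ** 2)

def angle_sum(x, y):
    """phi1 + phi2, constant on each conjugate curve (Eq. 8)."""
    return math.atan2(y, x + A) + math.atan2(y, x - A)

def gradient(func, x, y, h=1e-6):
    """Central-difference gradient of func at (x, y)."""
    return ((func(x + h, y) - func(x - h, y)) / (2 * h),
            (func(x, y + h) - func(x, y - h)) / (2 * h))

def cosine_between_gradients(x, y):
    """Cosine of the angle between the normals of the two curve families at (x, y)."""
    gx, gy = gradient(level_function, x, y)
    px, py = gradient(angle_sum, x, y)
    return (gx * px + gy * py) / (math.hypot(gx, gy) * math.hypot(px, py))

# Sample points with y > 0, away from the foci and from the branch cut of atan2.
samples = [(0.3, 0.8), (1.7, 0.4), (-0.6, 1.2), (2.5, 2.0)]
cosines = [cosine_between_gradients(x, y) for x, y in samples]
```

All cosines come out as zero up to finite-difference error, which is what mutual orthogonality of the families ρ = const and ϕ = const requires.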


Thus, for the 2f-system it has been proved that the family of isoparametric curves ϕ = const of the angular coordinate is the family of gradient curves of the family of lemniscates and vice versa, and that the conjugate families of isoparametric curves ρ = const and ϕ = const are mutually orthogonal (Fig. 1b, 4b). It remains to show that the variation of ϕ along an arbitrary 2f-lemniscate is monotonic. To do this, consider the derivative of ϕ in the direction along the lemniscate. Let us evaluate the scalar product of the gradient of ϕ and the tangent vector at an arbitrary point of the lemniscate. The components of the gradient of the coordinate ϕ are:

$$\operatorname{grad}\phi = \left\{2x\left(x^2 + y^2 - a^2\right),\ 2y\left(x^2 + y^2 + a^2\right)\right\},$$

and the components of the tangent to the lemniscate are:

$$\operatorname{tang}\rho = \left\{4x\left(x^2 + y^2 - a^2\right),\ 4y\left(x^2 + y^2 + a^2\right)\right\}.$$

Since, as shown above, the families of lemniscates and conjugate curves are orthogonal, the components of grad ϕ and tang ρ are equal to within a constant factor, and, therefore, the required scalar product is certainly non-negative. Indeed:

$$(\operatorname{grad}\phi \cdot \operatorname{tang}\rho) = 8\left(x^2 + y^2\right) R^4 \ge 0. \tag{9}$$

It is also seen from this that the scalar product vanishes at special points: at the poles, where R = 0, and at the coordinate origin of the ACS, where x = y = 0 (the features of the LCS are considered in detail in [10]). Figure 2 illustrates the behavior of the angular components of a point moving along a certain lemniscate and making a complete revolution around it (the family of 2f-lemniscates, parametrized along the radius, is shown in Fig. 1b). Each panel of Fig. 2 shows the intersecting graphs of the two polar coordinates ϕ_1 and ϕ_2 and the graph of the polypolar coordinate ϕ (middle lines with point marks). The graphs in Fig. 2a refer to the Bernoulli lemniscate (R = a, Fig. 1b), those in Fig. 2b to a convex-concave simply connected lemniscate next to the Bernoulli lemniscate (R > a), and those in Fig. 2c to a convex lemniscate of large radius (R >> a). As can be seen from the figures, for small values of the radius R the angular coordinates ϕ_1 and ϕ_2 show large variations in their derivatives, and for large values of R the focal angles ϕ_1 and ϕ_2 are closest to each other and to the polypolar coordinate ϕ. For nonconvex (and disconnected) lemniscates, the behavior of ϕ_1 and ϕ_2 is nonmonotonic; at the same time, the polypolar coordinate ϕ is monotonic everywhere [10]. Thus, the assertions proved for a 2f-system and valid for the general case of a kf-system [10] can be formulated as follows: The family of isoparametric curves ϕ = const of the angular coordinate is the family of gradient curves of the family of lemniscates ρ = const; the conjugate families of isoparametric curves ρ = const and ϕ = const are mutually orthogonal.
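The closed form in (9) follows from the two component vectors quoted in the text together with R⁴ = r_1² r_2² from (6). The sketch below simply evaluates both sides at sample points; the function names are hypothetical.

```python
def scalar_product_9(x, y, a):
    """Left side of (9), built from the two component vectors quoted in the text."""
    minus = x * x + y * y - a * a
    plus = x * x + y * y + a * a
    grad = (2 * x * minus, 2 * y * plus)
    tang = (4 * x * minus, 4 * y * plus)
    return grad[0] * tang[0] + grad[1] * tang[1]

def closed_form_9(x, y, a):
    """Right side of (9): 8 (x^2 + y^2) R^4, with R^4 = r1^2 r2^2 taken from (6)."""
    r_fourth = ((x + a) ** 2 + y ** 2) * ((x - a) ** 2 + y ** 2)
    return 8 * (x * x + y * y) * r_fourth

# Integer sample: x = 2, y = 1, a = 1 gives vectors (16, 12) and (32, 24).
left = scalar_product_9(2, 1, 1)
right = closed_form_9(2, 1, 1)
```

For the integer sample both sides equal 800, and the product is zero both at a focus (R = 0) and at the center x = y = 0, in line with the special points named above.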

Fig. 2. Graphs of angular coordinates for 2f-lemniscates with different radius R: a) f = [−1, 1], R = 1; b) f = [−1, 1], R = 1.05; c) f = [−1, 1], R = 2.25

Figure 3 shows the coordinate grids of conjugate isoparametric families of curves, ρ = const (closed curves enclosing the foci) and ϕ = const (open curves emanating from the foci), for polypolar coordinate systems with different numbers and configurations of poles: an asymmetric case (a) with k = 3 and a symmetric case (b) with k = 4. The figure illustrates the shape and mutual arrangement of the conjugate families of lemniscates and their gradient curves, in particular their mutual orthogonality and the shape of the separatrices for both symmetric and asymmetric kf-structures.

Fig. 3. Polypolar LCS: a) 3f-asymmetric, b) 4f-symmetric

6 ϕ-parameterization The singularities of the ϕ-parameterization are associated 1) with singular points and 2) with singular varieties of the space of kf-lemniscates.

1) The scalar product (9), as noted, vanishes in two cases: at the poles of the focal system, where the focal radius r_j = 0, and at the origin of the ACS, where the lemniscate loses its smoothness. The first case takes place in a polypolar system with any number of poles. In the classical polar CS the only such singular point is the pole, and in the proposed polypolar CS these are the k poles of the kf-system, where ρ = 0 and the angular coordinate ϕ is not defined. The second case is specific to a polypolar CS: when the k poles are located at the vertices of a regular k-gon, the lemniscate passing through the central point has kinks there. The singular point of the two-pole CS lies on the Bernoulli lemniscate (R = a); it is the point of contact of the two loops of the figure-eight (Fig. 1b). The peculiarity of this point is the ambiguity of ϕ. During a traversal of the figure-eight it is passed twice: the first time with ϕ = π/2 and the second with ϕ = 3π/2. In the general case of a kf-system, this is a k-fold point with equal focal radii, a single ρ and different angles ϕ: ϕ(j) = j2π/k, j = 1, …, k. An example of a 4-pole CS with a symmetrical organization is shown in Fig. 3b. For an LCS with k foci, a similar situation arises when the focus poles are located at the vertices of a regular k-gon, forming a kf-structure with a full symmetry group. Such a kf-lemniscate, like Bernoulli’s lemniscate, has the shape of k petals with one common point at the center of symmetry.

2) The vanishing of both components of the gradient of ϕ for the 2f-LCS leads to the solution {y = 0; x = a(r_1 − r_2)/(r_1 + r_2)}, corresponding to the interfocal segment (Fig. 1b). This is a special line of the 2f-LCS, the interfocal separatrix: the locus of pairs of symmetric points of a doubly connected lemniscate that have the same coordinates (ρ, ϕ). The monotonicity of the angular coordinate ϕ for disconnected lemniscates requires special consideration (Fig. 4a, b). For a 2f-system (Fig. 1b), the traversal of the two loops proceeds from the polar axis to the interfocal separatrix, passes to the other loop, traverses it completely, returns to the first loop and completes the period. Interfocal separatrices are clearly visible in Fig. 3a; in Fig. 3b they are the lines connecting the poles with the center. Other separatrices, orthogonal to the interfocal ones, present no peculiarities for coordination; they divide the zones in which gradient curves belong to different poles and also set the order of traversal of disconnected lemniscates (Fig. 1b, Fig. 3). In this regard, one of the most important questions for a kf-structure of arbitrary configuration is how to combine the disconnected loops of a multiply connected lemniscate into a single closed coordinate line ρ = const along which the angular coordinate ϕ changes monotonically from 0 to 2π, i.e. how to organize a unified ϕ-parameterization on a multiply connected lemniscate. The answer to the question of whether the required ϕ-parameterization is possible for any multiply connected metric contour ρ = const of an arbitrary kf-LCS, and of how to organize the corresponding traversal of the disconnected forms, is positive (the constructive proof is given in [10]): A closed single traversal of multiply connected lemniscates with continuous and monotone ϕ-parameterization in the range [0, 2π] exists and is unique. The structure of fragmentation and transitions on the entire set of multiply connected lemniscates is determined by the family of separatrices separating the focus basins.

7 1f-Polar CS: Passage to the Limit In the limiting case in which all foci coincide at one point, the polypolar coordinate system reduces to the classical unipolar CS. In the case of one focus, as is known, the isometric family consists of circles of different radii, and the gradient curves are radial straight lines. With simple transformations it is easy to check the passage to the limit in the obtained equations with the substitutions k = 1 and a_j = b_j = 0, j = 1, …, k. Indeed, the metric and angular coordinates are obtained from (2, 3) and (4, 5) in their natural form: {ρ^2 ≡ r^2 = x^2 + y^2; ϕ = arctg(y/x)}. The finite invariant equations for the radial coordinate (family of kf-lemniscates) (6) and for the angular one (8) degenerate into {ρ ≡ r = const, ϕ = const}. The solutions of the differential equations are families of concentric circles and of lines passing through the origin: {x^2 + y^2 = C_1, y = C_2 x}. Thus, the passage to the limit gives the known formulas and objects of the classical polar coordinate system. As all k foci approach one point without restriction and continuously, the objects of a polypolar CS are continuously transformed into objects of the classical polar CS. In the limit, the kf-structure turns into a single focus center, the kf-lemniscate family into a family of circles ρ = const with this center, and the family of gradient curves into a family of radial straight lines passing through the focal center, which are gradient lines of the circles. The gradient straight lines y = Cx of the 1f-system are the asymptotes of the corresponding gradient curves ϕ = const of the kf-system with the same center (Fig. 1c).
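The passage to the limit can also be observed numerically for the 2f-system by letting the focal half-distance a shrink. This is a minimal sketch under stated assumptions: the sample point lies in the upper half-plane so that `atan2` has no jump, and the name `polypolar_2f` and the chosen values of a are hypothetical.

```python
import math

def polypolar_2f(x, y, a):
    """(rho, phi) for foci (-a, 0) and (a, 0); written for points with y > 0."""
    r1 = math.hypot(x + a, y)
    r2 = math.hypot(x - a, y)
    rho = math.sqrt(r1 * r2)                                  # Eq. (2) with k = 2
    phi = 0.5 * (math.atan2(y, x + a) + math.atan2(y, x - a))  # Eq. (4) with k = 2
    return rho, phi

x, y = 0.6, 0.8                                   # sample point, classical polar radius 1
polar = (math.hypot(x, y), math.atan2(y, x))      # classical polar coordinates

errors = []
for a in (1.0, 0.1, 0.01, 0.001):
    rho, phi = polypolar_2f(x, y, a)
    errors.append((a, abs(rho - polar[0]), abs(phi - polar[1])))
```

The deviations of both coordinates from their classical polar values shrink as a decreases, which matches the continuous transformation of the polypolar objects into the classical ones described above.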

8 Symmetries in Polypolar Coordination The focal structure preserves, in compressed form, information about the shape of the represented curve, and it also inherits the symmetry of that shape, in contrast, for example, to the system of degrees of freedom of the harmonic representation. On the other hand, the lemniscate, being uniquely determined by the focal structure, contains the focal symmetries in its form. A focal structure consisting of three or more foci can be either symmetric or asymmetric and, depending on the configuration, can have different symmetry groups (Fig. 3). The corresponding lemniscates have the same symmetry group. Summarizing, we can formulate the statement [8, 9]: A lemniscate and its focal structure have the same symmetry group. This statement rests on the generating invariant (1) and on the fact that both the focal structure and the corresponding lemniscate are represented in the same coordinate system and, therefore, both withstand shape-preserving transformations. Any focal structure is the limiting form of the confocal lemniscate family as the radius ρ tends to zero. Thus: the ϕ-parameterization establishes a correspondence between the symmetries of an arbitrary kf-lemniscate and its focal structure. On the polypolar plane, the classical planar symmetry structures are realized: reflections, rotations, etc. To construct symmetries, all transformations are calculated in the polypolar coordinates ρ, ϕ. In the ACS coordinates, the constructed symmetries are a composition of polypolar symmetries and symmetries of the structural origin of the PPL-coordinates. The polypolar lemniscate plays the same role in the kf-LCS as the circle in the classical polar CS. With respect to a single kf-lemniscate, the same symmetry groups can be constructed: rotations, reflections, inversions, i.e. curvilinear symmetries on multifocal lemniscates. Such transformations of motifs of arbitrary form are possible for both a simply connected and a multiply connected circle-lemniscate. The focal representation of a polypolar circle allows its symmetry and the shape of the motif to be changed by controlling the foci. Combining kf-lemniscate symmetries and form-motifs yields a wide variety of ornaments, both rosette and parquet, and interactive control of the foci in a computer experiment makes it possible to transform the ornament continuously.
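The statement that a lemniscate shares the symmetry group of its focal structure can be illustrated for foci at the vertices of a regular k-gon: the metric coordinate ρ of a point is unchanged by the rotations and reflections that map the foci onto themselves. The sketch below uses k = 4; the helper names and the sample point are assumptions for this example.

```python
import math

def regular_foci(k, radius=1.0):
    """Foci at the vertices of a regular k-gon centered at the origin, one vertex on the x axis."""
    return [(radius * math.cos(2 * math.pi * j / k),
             radius * math.sin(2 * math.pi * j / k)) for j in range(k)]

def rho(x, y, foci):
    """Metric coordinate (2): geometric mean of the distances to the foci."""
    product = 1.0
    for a, b in foci:
        product *= math.hypot(x - a, y - b)
    return product ** (1.0 / len(foci))

def rotate(x, y, angle):
    """Rotate the point (x, y) about the origin by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y

k = 4
foci = regular_foci(k)
point = (0.9, 0.35)                          # arbitrary sample point
rotated = rotate(*point, 2 * math.pi / k)    # rotation from the symmetry group of the foci
mirrored = (point[0], -point[1])             # reflection in the x axis, also a symmetry here

rho_point = rho(*point, foci)
rho_rotated = rho(*rotated, foci)
rho_mirrored = rho(*mirrored, foci)
rho_center = rho(0.0, 0.0, foci)             # all focal radii equal 1 at the center
```

The three values of ρ coincide, so the point and its images lie on the same kf-lemniscate; the center has ρ equal to the polygon radius, which corresponds to the k-petal lemniscate through the center of symmetry.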

9 Focal Approximation of Empirical Form The most significant of the applications is the approximation of empirical curves given pointwise [4]. In [1], devoted to the analysis of approximation possibilities of the multifocal lemniscates, Hilbert showed that by choosing an appropriate number of foci and their location on the plane, we can get the lemniscate, which is close to any predetermined smooth closed curve. Appropriate methods of producing the focal approximation of an arbitrary curve, determining by the k foci of the kf -system and the radius of the approximating lemniscate, are developed [3]. An example of a representation of an arbitrary shape (contour of Africa defined by a set of points) is shown in Fig. 4a. The result of the approximation is the foci of the 6f -system - marked with “stars”. The family of confocal 6f lemniscates is also shown here. By visually manipulating the position of the foci and their number, one can also solve the problem of interactively generating forms for design, diagnostic and other purposes. Moreover, you can associate your own (individual) coordinate system with the subject image.

Fig. 4. Focal representation of shape: a) Africa’s contour (set of points), 6f-focal structure (marked by «stars») and family of confocal 6f-lemniscates; b) own (individual) LCS

The focal representation of the curve shape by multifocal lemniscates allows you to adjust the polypolar coordinate system so that the metric component matches the shape of a given curve (Fig. 4a). Adjustment to an applied task can be done in a manual-visual mode. The metric component can be arbitrary, rather complex, adjusted manually or automatically, and for any shape of the metric component, the conjugate angular component is orthogonal (Fig. 4b). Thus, you can associate your own (individual) coordinate system with any objective shape.


10 Conclusion
The polypolar coordinate kf-system appears to be a well-organized CS among the large list of available curvilinear systems. Its features and possibilities follow from the capabilities of the considered class of multifocal lemniscates, which allow for a wide variety of applications. The metric structure of the polypolar LCS admits different local metrics associated with individual foci. In this case, the provisions proved above remain valid; in particular, all symmetry possibilities remain. The peculiarity of the focal-lemniscate CS proposed in this work is that it combines freedom in describing the metric component with an analytical apparatus that ensures the analytical organization of the LCS and hence the properties required of well-organized coordinate systems. Freedom of representation of the metric component is provided by focal approximation of the shape of the empirical curve (surface). The family of confocal lemniscates defining the metric component is parameterized by the parameter that preserves the form of the applied problem, and the conjugate coordinate family is orthogonal. Thus, having the degrees of freedom that allow an arbitrary approximate description of the smooth shape of the metric component, the LCS retains the analytical apparatus of the classical curvilinear CS, and by adjusting the shape to a particular subject area of the application, we obtain a special CCS [4, 8, 11, 12]. In other words, the LCS can be defined as a universal CCS that is tunable to a special form of the subject area.

References
1. Hilbert, D.: Gesammelte Abhandlungen, vol. 3, p. 435. Springer, Berlin (1935)
2. Markushevich, A.I.: The Theory of Analytical Functions, vol. 1, p. 486. Nauka, Moscow (1967)
3. Rakcheeva, T.A.: Multifocus lemniscates: Approximation of curves. Comput. Math. Math. Phys. 50, 1956–1967 (2010). https://doi.org/10.1134/S0965542510110187
4. Rakcheeva, T.: Focal Model in the Pattern Recognition Problem. In: Hu, Z., Petoukhov, S.V., He, M. (eds.) AIMEE2018 2018. AISC, vol. 902, pp. 127–138. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-12082-5_12
5. Al-Jubouri, H.A.: Integration colour and texture features for content-based image retrieval. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 12(2), 10–18 (2020). https://doi.org/10.5815/ijmecs.2020.02.02
6. Mahmoud, S.M., Habeeb, R.S.: Analysis of large set of images using MapReduce framework. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 11(12), 47 (2019). https://doi.org/10.5815/ijmecs.2019.12.05
7. Fahim, A.: A clustering algorithm based on local density of points. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 9(12), 9 (2017). https://doi.org/10.5815/ijmecs.2017.12.02
8. Rakcheeva, T.A.: Quasilemniscates in the task of approximation of the curve forms. Intellect. Syst. 13(1–4), 79–96 (2009)
9. Rakcheeva, T.A.: Symmetries of polypolar coordination. Vestnik MGOU. Ser. Phys.-Math. 1, 10–20 (2011)
10. Rakcheeva, T.A.: Polypolar lemniscate coordinate system. Comput. Res. Model. 1(3), 256–263 (2009)
11. Gourav, T.S., Singh, H.: Computational approach to image segmentation analysis. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 9(7), 30–37 (2017). https://doi.org/10.5815/ijmecs.2017.07.04
12. Hamd, M.H., Mohammed, M.Y.: Multimodal biometric system based face-iris feature level fusion. Int. J. Mod. Educ. Comput. Sci. (IJMECS) 11(5), 1–9 (2019). https://doi.org/10.5815/ijmecs.2019.05.01

Non-intrusive Load Identification Decision Method Based on Time Signatures
Zhengqi Tian1(B), Zengkai Ouyang1, Meimei Duan1, Guofang Xia1, Zhong Zheng2, Xiaoxing Mu1, and Chao Zhou1
1 State Grid Jiangsu Electric Power Company Marketing Service Center, Nanjing 210019, China
2 State Grid Putian Electric Power Supply Company, Putian 351100, China

Abstract. In household scenarios the load types are uncertain and the load signature database used for non-intrusive load identification is incomplete, both of which easily reduce identification accuracy. This paper proposes a load identification method to cope with these problems. In addition to electrical signatures, the method uses time signatures, which include the length of operation time, the time period of operation, the working periodicity and vacation features. First, a piece-wise normalized mean-shift clustering method is used to cluster the detected load event features and obtain the number of potential load types. Then the time signatures and power signatures of the load events are counted to obtain their probabilities, and the Bayesian method is used to identify the load by decision-making. Finally, the method is tested on the AMPds public dataset, and the experimental results show that it has a good identification effect in this scenario.
Keywords: Non-intrusive · Load identification · Time signature · Mean-shift clustering · Bayesian decision-making

1 Introduction
Non-intrusive load monitoring is an important technology for making the power grid more intelligent [1]. Given only the real-time electricity consumption information from a smart meter, it can extract information such as the breakdown of power consumption by appliance and appliance start-up times. Compared with traditional intrusive load measurement, it has the advantages of low cost, convenient installation and strong practicability. Moreover, with the rapid development of the ubiquitous internet, non-intrusive load monitoring can not only provide a basis for users to formulate smart power strategies and improve the power consumption patterns of residential users [4, 5], but also help power companies analyze users’ electricity consumption behavior, so as to strengthen demand-side load management and optimize the power grid structure [6].
In recent years, the rapid development of computer science and intelligent measurement technology has provided a variety of solutions and ideas for load identification, which is the key to realizing non-intrusive load monitoring [7–9]. Reference [10] introduces deep sequence translation into the study of non-intrusive load identification and proposes a non-intrusive load decomposition method based on it. Reference [11] presents a non-intrusive load identification algorithm based on feature fusion and deep learning to overcome the limitations of identifying a load from a single device feature. This method combines V-I trajectory image features with numerical power features and trains a back-propagation (BP) neural network on the composite features to achieve non-intrusive load identification. Reference [12] presents a Bayesian-optimized bidirectional long short-term memory (LSTM) method to handle the growth in dimensionality as the number of electrical appliances increases. A non-causal model is also introduced to deal with the inherent characteristics of multiple appliances.
The above methods identify loads well when there are few load types or the user load composition is known, but problems remain in some home scenarios, such as an incompletely established feature library, which leads to insufficient load identification. For scenarios with an incomplete user load feature library, a large number of loads and low-power loads, this paper proposes a load identification method that adds non-electrical characteristics to the active and reactive power characteristics. First, a mean-shift clustering algorithm based on load events is used to preliminarily determine the number of load categories. Then, after introducing time characteristics such as load running time length and running time period, a large amount of load characteristic data is statistically analyzed and used for training to obtain the load characteristic database, and the Bayesian criterion is adopted to carry out non-intrusive load decision identification [13, 14]. Finally, an actual test is conducted on the public AMPds (Almanac of Minutely Power dataset) data set, and the test results show that the identification success rate of the non-intrusive load identification method considering non-electrical characteristics meets expectations.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 63–76, 2021. https://doi.org/10.1007/978-3-030-80531-9_6

2 Establish Load Characteristics Library
Since load characteristics are important parameters reflecting the load operation state [15], selecting them reasonably is the first step of load identification. Load characteristics are normally divided into electrical and non-electrical characteristics. Electrical characteristics usually include active power, reactive power, current and voltage amplitude, and current harmonics [16, 17]. Non-electrical characteristics include time, weather, temperature and other boundary factors [18]. These characteristics lay the foundation for the actual load identification.
2.1 Load Characteristics Selection
Active power is an efficient load identification feature: it is the electrical feature that varies most significantly during load operation, and it reflects the energy consumption of the load [19]. Reactive power is an important parameter showing the inherent nature of the load, which can be classified into resistive, capacitive and


inductive loads. Therefore, active power and reactive power are selected as the electrical characteristics for load identification. In a complex operating environment with a large variety of loads, electrical characteristics such as power alone are generally not enough to distinguish loads with similar electrical characteristics, while using too many electrical characteristics may make the computation excessive. On the basis of appropriate electrical characteristics, this paper therefore introduces the time characteristics of the load as non-electrical characteristics for load identification. Due to the uncertainty of external factors, some non-electrical characteristics cannot be accurately correlated with the load [20]. This paper therefore uses the time characteristics of the load, such as start-stop time and running duration, as more stable and reliable non-electrical features.
2.2 Load Characteristics Modeling
2.2.1 Power Characteristics
Active power P and reactive power Q are computed from the real-time voltage and current data sampled over a period of time, and are defined as follows,
$$P = \frac{1}{T}\sum_{t=0}^{T} V(t)\,I(t) \tag{1}$$
$$Q = \frac{1}{T}\sum_{t=0}^{T} V(t + T/4)\,I(t) \tag{2}$$

where T is the waveform period, V(t) represents the voltage at the general entrance and at each branch of the circuit, and I(t) is the current. To improve the efficiency of sequence statistics and training, this paper divides active power and reactive power into data segments, sorted from small to large into m1 and m2 levels respectively, as shown in the following formulas.
$$P = \{P_1, P_2, \ldots, P_{m_1}\} \tag{3}$$
$$Q = \{Q_1, Q_2, \ldots, Q_{m_2}\} \tag{4}$$
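As a minimal sketch of Eqs. (1)–(4), the code below computes P and Q from one sampled waveform period and maps a power value to a level. It assumes a periodic waveform with a sample count divisible by four, so that the quarter-period shift is a circular shift; the function names, the test waveform and the interval edges are hypothetical and are not taken from the paper.

```python
import numpy as np

def active_reactive_power(v, i):
    """Average P and Q over one waveform period, following Eqs. (1) and (2)."""
    v = np.asarray(v, dtype=float)
    i = np.asarray(i, dtype=float)
    shift = len(v) // 4                 # quarter period in samples (assumed exact)
    v_shifted = np.roll(v, -shift)      # V(t + T/4) for a periodic waveform
    p = float(np.mean(v * i))
    q = float(np.mean(v_shifted * i))
    return p, q

def power_level(value, edges):
    """1-based level of a power value for ascending, possibly unequal, interval edges."""
    return int(np.digitize(value, edges)) + 1

# Hypothetical test waveform: 311 V peak voltage, 10 A peak current lagging by 30 degrees.
t = np.arange(200) / 200.0
voltage = 311.0 * np.sin(2 * np.pi * t)
current = 10.0 * np.sin(2 * np.pi * t - np.pi / 6)
p, q = active_reactive_power(voltage, current)

# Hypothetical unequal interval edges in watts; the paper's own segmentation is given in Table 2.
edges = [30, 100, 1000, 2000]
level = power_level(p, edges)
```

With these inputs P equals the product of the RMS voltage, the RMS current and cos 30°, and the magnitude of Q equals the same product with sin 30°; the sign of Q depends on the direction of the quarter-period shift in Eq. (2).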

Furthermore, an unequal-interval segmentation method can be used for statistics and subsequent identification, so as to distinguish different types of load, especially low-power loads, which are difficult to identify. To obtain accurate power segmentation intervals, it is necessary to compile statistics on the load characteristic distribution in the load database and build a P-Q rectangular feature distribution map. As shown in Fig. 1, the load distribution is dense within the range P ∈ (0, 1000) and Q ∈ (0, 20), which therefore needs to be segmented more finely.
2.2.2 Time Characteristics
In addition to the electrical characteristics, the internal time characteristics of household electricity load can also be used as a potential feature to distinguish loads. This paper


Fig. 1. Statistic diagram of load power signature

concretizes time into four main features: running time length, running time period, running periodicity, and vacation versus non-vacation.
A) Load running time length L
For common residential loads, the running time length L usually shows some regularity. To verify this, the running times of the loads in the load database are summarized in a box plot, as shown in Fig. 2. Figure 2 shows that most residential loads run for less than 100 min, while a small number run for more than 1400 min, that is, almost all day. At the same time, the running time distribution of some loads is not confined to one interval, that is, there are many “outliers”. Different loads therefore show a certain specificity in running time length, which can be used as a reference feature for load identification.

Fig. 2. Statistic diagram of load running time length

Based on the statistical time interval of each load, the time-length characteristic L is divided into m3 grades, as shown in the following formula.
$$L = \{L_1, L_2, \ldots, L_{m_3}\} \tag{5}$$


B) Load running time period t
According to the usual daily schedule and load running times, the load running time period can be divided into 9 parts. The peak period of electrical appliance use is the 14 h from 7:00 a.m. to 9:00 p.m., which is divided into two-hour intervals. The period from 9:00 p.m. to 7:00 a.m. of the next day is divided into two intervals: the late night period from 9:00 p.m. to 12:00 midnight and the early morning period from 0:00 a.m. to 7:00 a.m. This can be expressed by the following formula.
$$t = \{t_1, t_2, \ldots, t_9\} \tag{6}$$

C) Load running periodicity T
Normally, the working pattern of a load is either periodic or non-periodic. Electrical appliances with regular running times, such as refrigerators, are defined as periodic loads. The starting and stopping times of non-periodic electrical appliances, such as TV sets and washing machines, are uncertain. Load running periodicity can therefore be labeled as follows,
$$T = \{T_1, T_2\} \tag{7}$$
where T1 is non-periodic and T2 is periodic.
D) Vacation and non-vacation
Residents use electric appliances differently at different times: for example, holidays spent at home result in longer TV usage, while travel away from home results in less use of electrical equipment. This is also an indispensable time characteristic.
Different loads can therefore be divided and counted according to each load feature, and a load feature database can be built to provide training samples for the subsequent Bayesian classification decisions.
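The nine running-time periods of Eq. (6) can be sketched as a simple lookup. The index order used here (1 for the early morning period, 2 to 8 for the two-hour daytime slots, 9 for the late night period) is an assumption chosen to agree with the time periods that survive in Table 2; the function name is hypothetical.

```python
def running_time_period(hour):
    """Map an hour of the day (0-23) to a running time period index 1..9."""
    if not 0 <= hour < 24:
        raise ValueError("hour must be in the range 0..23")
    if hour < 7:
        return 1                       # early morning, 0:00-7:00
    if hour >= 21:
        return 9                       # late night, 21:00-24:00
    return 2 + (hour - 7) // 2         # seven two-hour slots between 7:00 and 21:00

# Example: an event starting at 14:30 falls in the 13:00-15:00 slot.
period = running_time_period(14)
```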

3 Non-intrusive Load Identification Decision Method Combining Time Characteristics
For household scenarios with uncertain load categories, this paper statistically analyzes the load events observed in the scenario over a period of time. By clustering load events of the same kind, the number of categories of electrical equipment is determined. The time characteristics are then calculated for the classes obtained from the clustering, and the load is identified by the decision-making method.
3.1 Mean-Shift Clustering Algorithm
The mean-shift algorithm is a non-parametric clustering algorithm based on kernel density estimation. For a probability density function f(x), let x_i (i = 1, …, n) be the n sample points in d-dimensional space. The kernel density estimate of f(x) is as follows,
$$\hat{f}(x) = \frac{\sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right)\omega(x_i)}{h^{d}\sum_{i=1}^{n}\omega(x_i)} \tag{8}$$


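where ω(x_i) ≥ 0 is the weight coefficient of sampling point x_i, h is the clustering radius, and K(x) is the kernel function, whose profile k satisfies K(x) = k(‖x‖²). The gradient ∇f(x) of the probability density function f(x) is estimated as follows.
$$\nabla\hat{f}(x) = \frac{2\sum_{i=1}^{n}(x - x_i)\,k'\!\left(\left\|\frac{x_i - x}{h}\right\|^{2}\right)\omega(x_i)}{h^{d+2}\sum_{i=1}^{n}\omega(x_i)} \tag{9}$$
Assuming that g(x) = −k′(x) and G(x) = g(‖x‖²), ∇f̂(x) can be expressed as follows,
$$\nabla\hat{f}(x) = \frac{2}{h^{2}}\left[\frac{\sum_{i=1}^{n} G\!\left(\frac{x_i - x}{h}\right)\omega(x_i)}{h^{d}\sum_{i=1}^{n}\omega(x_i)}\right]\left[\frac{\sum_{i=1}^{n}(x_i - x)\,G\!\left(\frac{x_i - x}{h}\right)\omega(x_i)}{\sum_{i=1}^{n} G\!\left(\frac{x_i - x}{h}\right)\omega(x_i)}\right] \tag{10}$$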

where the expression in the second square bracket is the mean-shift vector, which is proportional to the probability density gradient. Unlike other clustering algorithms, the mean-shift clustering algorithm does not need the number of clustering centers to be set in advance and selects it adaptively from the probability density of the data distribution. The selection of the clustering radius is therefore the main factor affecting the clustering effect of the mean-shift algorithm. If the radius is too large, low-power equipment cannot be identified. If the radius is too small, the identification result for high-power equipment will be redundant.
Active power P and reactive power Q are adopted as the load characteristics for clustering in this paper; they are the most common load characteristics and reflect the energy consumption level and the type of the load. In addition, they are effectively discriminative and adaptable when constructing a two-dimensional feature space for clustering. However, the difference in the value ranges of active power and reactive power makes the cluster radius harder to select. To solve this problem, a segmental normalized radius selection method is proposed in this paper. According to the active power and reactive power characteristics of load events detected by the CUSUM sliding window method, the clustering radius h is selected as follows,
h = {(P1,10minmax minmax }    (11)

where P is the active power of the load, Pmax and Pmin are respectively the maximum and minimum active power after segment-wise normalization, ε is the parameter controlling the selection range of the clustering radius, and h is the clustering radius, which changes with the distribution in the different power segments. Assuming that X is a set of n load event data points in the P-Q two-dimensional feature space, the basic form of the drift vector at any point x in the space can be expressed as
$$M_h = \frac{1}{K}\sum_{x_i \in S_h}(x_i - x) \tag{12}$$


where x_i (i = 1, …, n) are the n sample points, S_h is the two-dimensional disc with x as the center and h as the radius, and K is the number of the n sample points that fall in the region S_h. The region S_h, which contains the points whose distance from x is less than the cluster radius h, can be expressed as follows.
$$S_h(x) = \{\, y : \|y - x\|^{2} < h^{2} \,\} \tag{13}$$
In the process of clustering, the position of the center x is updated by calculating the drift vector.
$$\hat{x} = x + M_h(x) = \frac{1}{K}\sum_{x_i \in S_h} x_i \tag{14}$$
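The update in Eqs. (12)–(14) can be sketched as a flat-kernel mean-shift with unit weights and a single radius h. This is an illustrative sketch rather than the authors' implementation: the segmented radius selection is not modelled, the rule that merges modes closer than h is an added assumption, and the function name and the two synthetic groups of P-Q points are hypothetical.

```python
import numpy as np

def mean_shift_flat(points, h, max_iter=100, tol=1e-6):
    """Flat-kernel mean-shift: move each point to the mean of its h-neighbourhood."""
    points = np.asarray(points, dtype=float)
    modes = []
    for start in points:
        x = start.copy()
        for _ in range(max_iter):
            # S_h(x): sample points whose squared distance to x is below h^2 (Eq. 13)
            in_window = np.sum((points - x) ** 2, axis=1) < h ** 2
            x_new = points[in_window].mean(axis=0)     # x + M_h(x), Eq. (14)
            shift = np.linalg.norm(x_new - x)
            x = x_new
            if shift < tol:
                break
        modes.append(x)
    # Assumption: modes closer than h to an accepted center belong to that center.
    centers = []
    for m in modes:
        if not any(np.linalg.norm(m - c) < h for c in centers):
            centers.append(m)
    return np.array(centers)

# Hypothetical load events (P in W, Q in var): two tight groups.
events = [(10, 1), (11, 1), (10, 2), (500, 100), (501, 100), (500, 101)]
centers = mean_shift_flat(events, h=5.0)
```

Each returned center stands for one candidate load state, and the number of centers is found by the procedure itself instead of being fixed in advance.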

Through the above calculations, the probability density clustering results based on the power characteristics of load events are obtained, and each cluster center represents a load type. Using the mean-shift clustering algorithm to obtain the load states reduces the influence of overlapping load power characteristics in the feature space and determines the number of load categories in scenarios with uncertain load categories, which provides the basis for the subsequent load identification using the Bayesian method.
3.2 Bayesian Classification Method
The Bayesian classifier is a decision algorithm based on the Bayesian maximum a posteriori criterion and the Bayesian hypothesis. Assuming that the features in each dimension of the object's feature space are independent of each other, the naive Bayesian classification algorithm is obtained. Let z = {a1, a2, …, am} be an item to be classified, where each aj is a characteristic property of z, and let C = {y1, y2, …, yn} be the category set; then P(y1|z), P(y2|z), …, P(yn|z) can be calculated. The item z belongs to category yk, that is, z ∈ yk, if the following condition is satisfied.
$$P(y_k \mid z) = \max\{P(y_1 \mid z), P(y_2 \mid z), \ldots, P(y_n \mid z)\} \tag{15}$$
Using the Bayesian classifier to decide the load type requires the following steps.
Step 1, build a label model for the load history data and generate the training sample set.
Step 2, calculate P(yi) (i = 1, …, n) for each type of load. Since different types of loads are affected by external factors such as holidays, the use frequency of each type of load can be used as the basis for calculating P(yi).
Step 3, calculate the conditional probability P(aj|yi) of each load characteristic attribute.
Step 4, calculate P(z|yi)P(yi) for each category. Since several different load characteristics are used as classification attributes, and the characteristic attributes are conditionally independent, the probability for each class is expanded as follows.
$$P(z \mid y_i)\,P(y_i) = P(y_i)\prod_{j=1}^{m} P(a_j \mid y_i) \tag{16}$$


Step 5, determine the type of load. The class corresponding to the largest value of P(z|yi)P(yi) is taken as the type of z, so the type of load is obtained by maximizing the numerator of the Bayesian probability formula.
$$P(y_i \mid z) = \frac{P(z \mid y_i)\,P(y_i)}{P(z)} \tag{17}$$
where the denominator P(z) is a constant for all classes. Therefore, to identify user load scenarios, the collected load event characteristics are first clustered by the mean-shift algorithm, and the number of load categories is preliminarily obtained. On this basis, the load event characteristics are counted and labeled, and the probability is calculated using the Bayesian method, so as to determine the load type corresponding to the load event characteristics. The specific process is shown in Fig. 3.
Fig. 3. Flow chart of load identification (blocks: load characteristic collection; characteristic classification parameter; load characteristic label; load characteristic library; extraction of load characteristics; sample training by Bayesian; mean-shift clustering; Bayesian posterior probability calculation; determine the type of load)
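The decision rule of Eqs. (15)–(17) can be sketched as a small categorical naive Bayes classifier over label tuples. The sketch makes several assumptions that are not in the paper: priors are estimated from class frequencies in the training set (the paper derives them from usage frequency), Laplace smoothing is added so that unseen labels do not produce zero probabilities, and the class names and label tuples are hypothetical toy data.

```python
from collections import Counter, defaultdict
from math import prod

def train(samples, labels, smoothing=1.0):
    """Estimate P(y) and a function returning P(a_j | y) from labelled tuples."""
    class_counts = Counter(labels)
    n_features = len(samples[0])
    # Distinct values seen for each feature, needed for Laplace smoothing.
    n_values = [len({s[j] for s in samples}) for j in range(n_features)]
    feature_counts = defaultdict(Counter)      # (class, feature index) -> value counts
    for sample, y in zip(samples, labels):
        for j, a in enumerate(sample):
            feature_counts[(y, j)][a] += 1
    priors = {y: c / len(labels) for y, c in class_counts.items()}

    def cond(a, j, y):
        return (feature_counts[(y, j)][a] + smoothing) / (
            class_counts[y] + smoothing * n_values[j])

    return priors, cond

def classify(z, priors, cond):
    """Return the class maximizing P(y) * prod_j P(a_j | y), as in Eq. (16)."""
    scores = {y: p * prod(cond(a, j, y) for j, a in enumerate(z))
              for y, p in priors.items()}
    return max(scores, key=scores.get), scores

# Hypothetical label tuples: (P level, Q level, L grade, time period, periodicity).
train_x = [(1, 1, 5, 1, 2), (1, 1, 5, 3, 2), (7, 6, 1, 7, 1), (7, 6, 2, 8, 1)]
train_y = ["fridge", "fridge", "oven", "oven"]
priors, cond = train(train_x, train_y)
predicted, scores = classify((7, 6, 1, 8, 1), priors, cond)
```

Because `classify` takes the priors as an argument, usage-frequency priors such as those described in the parameter settings can be substituted for the empirical ones without changing the rest of the sketch.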

4 Experimental Results and Analysis
This paper validates the identification algorithm using the AMPds public dataset, which records two years of energy consumption data for a household in Vancouver, Canada. It covers more than 20 different types of electrical equipment, as well as data on external factors such as weather and temperature. The data are sufficient and reliable and can be used to represent an actual household electricity scenario.


Eleven typical household electrical appliances in this dataset are selected to build the load database, including traditional household loads such as bedroom appliances, washing machines and TV sets, some modern electronic equipment loads, and some unidentified low-power loads and multistate loads. These loads are denoted {E1, E2, E3, E4, E5, E6, E7, E8, E9, E10, E11}. The power characteristics of the loads are detected, and the distribution range of the power characteristics is calculated by the kernel density estimation method. The feature information is recorded in a MySQL database, as shown in Table 1.
Table 1. Database of load characteristics
Type of load

Active power P/W State 1

Reactive power Q/Var

State 2

State 3

State 1

State 2

State 3

24–36

34–41

23–42

88–142

5–15

E1

4–28

0–3

E2

3–77

E3

4360–4850

E4

155–215

E5

4–52

0–18

E6

712–808

26–44

E7

28–50

2–18

E8

114–164

173–257

6–13

6–14

E9

35–40

1585–1896

13–20

273–366

E10

28–43

130–370

10–16

2–41

E11

40–46

2467–2983

36–44

121–133

317–353

363–457

−1–11 360–460

240–360

490–610

434–506

3329–3581

−1–11

126–158

4.1 Parameter Settings
For more accurate clustering results, the normalization coefficients of active and reactive power satisfy the following condition,
$$P/Q = \alpha \tag{18}$$
where α is a normalization parameter, which is determined by the contribution ratio of P and Q calculated by the PCA algorithm. For the power values of the different loads in this paper, α ranges from 3 to 30. Based on the load characteristic information, the segmented normalized clustering radius h is set as follows.
$$h = \begin{cases} 5, & 0 < P < 30 \\ 8, & 30 < P < 100 \\ 10, & 100 < P < 1000 \\ 20, & 1000 < P < 2000 \\ 40, & P > 2000 \end{cases} \tag{19}$$
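The segmented radius of Eq. (19) amounts to a threshold lookup, sketched below. Eq. (19) uses strict inequalities and does not say which segment a boundary value such as P = 30 belongs to; assigning it to the higher segment is an assumption made here, and the function name is hypothetical.

```python
def clustering_radius(p):
    """Segmented clustering radius h for an active power value p in watts (Eq. 19)."""
    if p <= 0:
        raise ValueError("active power must be positive")
    # (upper bound of the segment in W, radius h); boundary values go to the next segment.
    for upper, h in ((30, 5), (100, 8), (1000, 10), (2000, 20)):
        if p < upper:
            return h
    return 40

# Example: radii for a low-power, a mid-power and a high-power load event.
radii = [clustering_radius(p) for p in (20, 500, 3000)]
```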


At the same time, according to the load power and time characteristic information in the database, the Bayesian classifier is trained using the load modeling method proposed in this paper, with m1 = 8, m2 = 7 and m3 = 5. The resulting labels are shown in Table 2.
Table 2. Classified labels of Bayesian method
Label

Active power P/W

Reactive power Q/Var

1

0–30

0–10

360

13:00–15:00

6

400–1000

200–400

15:00–17:00

7

1000–2000

>400

17:00–19:00

8

>2000

9

Time length L/min

Time period t

Periodicity T

19:00–21:00 21:00–24:00

In addition, because different loads are used with different frequency, the value of P(yi) in the Bayesian decision model is determined by the use frequency of each load. In this paper, based on statistics of working days and rest days in a week as well as holidays and other factors, the loads are divided into daily operation, daily operation except holidays, and non-daily operation, so as to measure the sparseness of load usage. The corresponding P(yi) values are 1/8, 1/12 and 1/16, respectively.
4.2 Parameter Settings
To keep the external conditions consistent, three months of load power consumption data within one quarter are selected. The five load characteristics labeled in Table 2 are used to train the Bayesian decision model, and five groups of cases are selected to test the method. In each group of test cases, the specific type and number of devices running per day are unknown, which simulates a general home load running scenario and verifies the load identification capability of the method. Taking one set of data as an example, the power curve of load operation on that day is shown in Fig. 4. To identify the loads on this day, mean-shift clustering is first carried out on the detected load event power characteristics to obtain the distribution of operating load types, and the clustering results are shown in Fig. 5. The clustering results show that, among the loads operating that day, there are 3 kinds of load operation in the low power range of 0 W–30 W, 5 kinds in the 30 W–100 W range, 3 kinds in the 100 W–1000 W range, and 1 kind in the high power range of 1000 W–2000 W. Thus, the


Fig. 4. Sample graph of load power

Fig. 5. Load clustering result (panels a–d)

load information of 12 different types of load operation on this day can be obtained preliminarily. As part of the loads have multiple states and the clustering results of the 12 types represent 12 different load states, the load types of each state need to be further distinguished. Next, the power characteristics of the load clustering and the time characteristics of its corresponding load events are labeled according to Table 2. The probability values are calculated through the Bayesian decision model after training, so as to finally determine the load types corresponding to the load events. The actual loads of the day are E1, E2, E5, E7, E8, E10 and E11. 28 load events with different characteristics are selected from the detection results of the daily load events,


and 25 of them (about 89%) are accurately identified by the method described in this paper. To further demonstrate the effectiveness of the proposed method, it is compared with two standalone methods on the loads of the five groups of test data. One method directly matches the mean-shift clustering results of the electrical characteristics against the database. The other directly substitutes the load event labels into the trained Bayesian classifier for identification. The identification results of each method are shown in Fig. 6.
Fig. 6. Test result of load identification (vertical axis: accuracy rate/%; horizontal axis: test scene/day, 1–5; series: clustering algorithm, Bayesian classification, integration of the two methods)

As shown in Fig. 6, because it lacks time characteristics and its clustering centers deviate under error disturbance, the mean-shift clustering algorithm alone has the lowest accuracy in identifying the load types. The Bayesian classification method alone is also limited, because the load power characteristic distributions overlap and the operating load state information has not first been obtained through clustering. The load identification algorithm proposed in this paper, which combines the mean-shift clustering algorithm with the Bayesian classification method, therefore has a good identification effect in scenes with uncertain load categories.

5 Conclusions
To address the load identification problems caused by uncertain load categories in household scenarios and incomplete load characteristics in the equipment database, this paper proposes a statistical load identification method that integrates time characteristics. First, the mean-shift clustering algorithm is used to cluster load power characteristics and obtain the categories of the running load states. Then, to overcome the error caused by the migration of clustering centers, the time characteristics of loads are introduced into the method. At the same time, the load characteristics in the database are statistically analyzed to determine the parameters of the Bayesian classification model. Finally, the load characteristics in the test data are substituted into the Bayesian classification model to determine the load types.


In this paper, the AMPds public dataset is used to verify the method. Experimental results show that the proposed method has a good identification effect in scenarios with uncertain load categories and can also effectively identify low-power and multi-state loads.
Acknowledgment. This work was supported by the State Grid Corporation of China (SGCC) Technology Project “Research and application of key technologies of intelligent load sensing for residential users (5400-201918180A-0-0-00)”. The authors would also like to thank the reviewers and editors for their generous help.

References
1. Zhang, L., Zhang, T., Zhang, H.W.: Research on a method of load identification based on multi parameter hidden Markov model. Power Syst. Protect. Control 47(20), 81–90 (2019)
2. Qi, B., Han, L.: A non-intrusive residential load identification algorithm based on genetic optimization. Electr. Measure. Instrum. 54(17), 11–17 (2017)
3. Kang, W.T., Lin, X.H., Shi, S.B.: Non-intrusive load identification method based on two-dimensional discrete fuzzy numbers. Electr. Measure. Instrum. 56(16), 13–18 (2019)
4. Li, J., Chung, J.Y., Xiao, J.: On the design and implementation of a home energy management system. In: International Symposium on Wireless and Pervasive Computing (ISWPC), Hong Kong, pp. 1–6 (2011)
5. Zhang, J., Sun, W.J., Wang, T.: Studies on requirements and architecture for automated demand response system. Proc. CSEE 35(16), 4070–4076 (2015)
6. Jiang, F., Yang, H.G.: Non-intrusive load identification method based on selected Bayes classifier. Electr. Power Constr. 40(2), 98–103 (2019)
7. Shi, C.K., Zhang, B., Sheng, W.X.: A discussion on technical architecture for flexible intelligent interactive power utilization. Power Syst. Technol. 37(10), 2868–2874 (2013)
8. Wu, Z.J., Yin, X.B., Chen, Z.: Identification method of load customers based on similarity of fuzzy clustering curves. Electr. Power Eng. Technol. 38(03), 151–156 (2019)
9. Liu, Y.Z., Guan, W.Y.: Parameter identification based on equivalent modeling of AWS wave farm. Electr. Power Eng. Technol. 38(02), 69–74 (2019)
10. Ren, W.L., Xu, G.: Non-intrusive load decomposition method based on deep sequence translation model. Power Syst. Technol., 1–11 (2019)
11. Wang, S.X., Guo, L.Y.: Non-intrusive load identification algorithm based on feature fusion and deep learning. Automation of Electric Power Systems, pp. 1–9 (2019). http://kns.cnki.net/kcms/detail/32.1180.TP.20191022.1625.016.html
12. Kaselimi, M., Doulamis, N., Doulamis, A.: Bayesian-Optimized Bidirectional LSTM Regression Model for Non-intrusive Load Monitoring, pp. 2747–2751. Institute of Electrical and Electronics Engineers Inc., Brighton, United Kingdom (2019)
13. Seevers, J.-P., Johst, J., Weiß, T., Meschede, H., Hesselbach, J.: Automatic time series segmentation as the basis for unsupervised, non-intrusive load monitoring of machine tools. Proc. CIRP 81, 695–700 (2019). https://doi.org/10.1016/j.procir.2019.03.178
14. Cao, J.J., Qin, L.J.: Application of modified Bayes classifier in load forecasting of power system. Sci-tech Innov. Prod. 5, 108–110 (2014)
15. Gao, H.H.: Signature Analysis of Non-intrusive Load Identification, pp. 25–29. Shandong University, Shandong (2019)
16. Wang, H.J., Wen, R.: Application of improved bird swarm algorithm in home appliance load disaggregation. In: Proceedings of the CSU-EPSA, pp. 1–5, November 2019
17. Zhang, B.D., Jing, Z.P.: An identification method of load harmonic current based on BP neural network. Power Syst. Protect. Control 40(20), 94–98 (2012)
18. Li, B., Men, D.Y., Yan, Y.Q.: Bus load forecasting based on numerical weather prediction. Autom. Electr. Power Syst. 39(1), 137–140 (2015)
19. He, H.J., Wang, H., Xiao, Y.: Non-Intrusive load monitoring model based on Bi-LSTM algorithm. South. Power Syst. Technol. 13, 20–26 (2019)
20. Kim, H., Marwah, M., Arlitt, M.F.: Unsupervised disaggregation of low frequency power measurements. In: DBLP, Mesa, Arizona, USA, pp. 747–758 (2012)

Research on Non-intrusive Load Identification Method Based on Support Vector Machine

Yixuan Huang, Qifeng Huang, Hanmiao Cheng(B), Kaijie Fang, Xiaoquan Lu, and Tianchang Liu

State Grid Jiangsu Electric Power Company Marketing Service Center, Nanjing 210019, China

Abstract. Based on the hardware of a smart electric power meter and aimed at the requirements of non-intrusive load identification, this paper studies a method of offline learning and online classification based on the support vector machine (SVM). A two-sided cumulative sum (CUSUM) event detection method is used to cut out the waveform windows around the start-up and shut-down of electrical appliances, and the electrical characteristics of the appliances are then calculated. Using feature data collected from appliance samples, the parameters and support vectors of several binary classification decision functions are optimized offline with a MATLAB toolbox. A multi-class classifier is then constructed from these two-class classifiers. Finally, the parameters of the multi-class classifier are stored in the smart electric power meter to identify the type of appliance load online. To verify the feasibility of the proposed method, an appliance load classification test was carried out in the laboratory on the hardware platform of a single-phase electric power meter. The test results show that the accuracy of non-intrusive appliance load identification is above 80%. The proposed method moves the bulk of the parameter optimization to a computer, which eliminates the online learning process, reduces the demand on the calculation and storage resources of the meter, and has practical value for research on the user side of the energy Internet.

Keywords: Non-intrusive load identification · Classification algorithm · Two-sided cumulative sum · Event detection · Support vector machine · Linear classifier · Supervised learning

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 77–86, 2021. https://doi.org/10.1007/978-3-030-80531-9_7

1 Introduction

Demand side response is an important focus of China’s energy reform. Residential power load monitoring enables residential users and power companies to grasp load details in real time. Better demand side response can be achieved through information interaction, so as to ensure the security, stability and economy of power system operation [1–3]. As one method of residential electric load monitoring, non-intrusive load monitoring (NILM) collects the voltage and current signals at the bus end of the household and uses a matching algorithm to identify online the type, the start and stop times and the operating power of electrical appliances. More information, such as the energy consumption level and the regular usage pattern of electrical appliances, can be obtained by further data analysis [4].

The load classification algorithm is one of the core technologies of NILM. Existing load classification algorithms can be divided into four levels according to their complexity. The first calculates the multidimensional feature distance between unknown and known load samples for type matching; it needs little computation but has low accuracy [4–7]. The second is convex optimization, which achieves type matching by minimizing an objective function; it requires repeated iterations and a lot of computation [8, 9]. The third trains a classifier through supervised learning and then uses the classifier for load classification, with high accuracy [10–12]. The fourth is deep learning, which decomposes the complete load curve with a deep learning model to obtain the energy consumption data of each appliance [13, 14]. The power quality analysis method based on deep learning studied in [15] is also a useful reference. Among these algorithms, considering both computation and accuracy, a classifier based on a machine learning algorithm is the best choice for engineering implementation.

This paper describes the principle of the SVM classifier and describes in detail how “offline learning, online classification” is realized. The effect of load identification is tested on a single-phase electric power meter. The experimental results show that the classification accuracy for electrical appliances is above 80% under laboratory test conditions. The SVM application method realized in this paper has practical value, because it avoids the online learning process and reduces the demand on the calculation and storage resources of the electric power meter.

2 SVM Nonlinear Classifier

Practice shows that classifying the types of electrical appliances according to the changes of electrical characteristics at start-up and shut-down is a linearly inseparable problem, so a nonlinear classifier is needed. The basic principle of the SVM nonlinear classifier is briefly introduced below; for more details, refer to [16].

2.1 SVM Linear Classifier

For a group of linearly separable samples $X_1, X_2, \ldots, X_m$, there is a hyperplane that divides the samples into two categories. The classification hyperplane is determined by

$$ w^{T} x + b = 0 \tag{1} $$

where $x = \{x_1, x_2, \ldots, x_k\}$ is the sample attribute vector, $w = \{w_1, w_2, \ldots, w_k\}$ is the vector of weights of the sample attributes, and $b$ is the offset. The decision function of the linear classifier corresponding to formula (1) is

$$ f(x) = w^{T} x + b \tag{2} $$


If $f(x) > 0$, the sample is marked as the positive class; if $f(x) \le 0$, it is marked as the negative class. Using training samples with known labels, $w$ and $b$ in formula (2) are iteratively adjusted with the classical perceptron learning strategy to obtain a linear classifier that performs well on the training samples [17]. If, in addition, the maximum classification margin is taken as the optimization objective and the estimates of $w$ and $b$ are obtained with an optimization algorithm, the resulting classifier is the SVM linear classifier, which generalizes better than an ordinary linear classifier.

2.2 SVM Nonlinear Classifier Based on RBF Kernel Function

Starting from the SVM linear classifier, a transformation function can map the attribute vector of a sample from the original space to a higher-dimensional feature space in which the samples become linearly separable. After introducing the transformation function $\varphi(\cdot)$, the decision function in formula (2) becomes

$$ f(x) = w^{T} \varphi(x) + b \tag{3} $$

Optimizing $w$ and $b$ requires the dot product of vectors in the high-dimensional space. The kernel function method can be introduced to avoid computing it directly. Common kernel functions include the linear kernel, polynomial kernel, Gaussian kernel, Laplace kernel, sigmoid kernel and their combinations. After introducing the kernel function, Eq. (3) can be transformed into

$$ f(x) = \sum_{i=1}^{m} w_i \, \kappa(x, x_i) + b \tag{4} $$

where $x_i$ is the $i$-th support vector and $w_i$ is its coefficient. In this paper, the Gaussian kernel function, also known as the radial basis function (RBF), is selected because it generally performs well. Its expression is

$$ \kappa(x, x_i) = \exp\left(-\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}}\right) \tag{5} $$

where $\sigma$ is the bandwidth of the Gaussian kernel. Substituting formula (5) into formula (4) gives the classification decision function

$$ f(x) = \sum_{i=1}^{m} w_i \exp\left(-\frac{\lVert x - x_i \rVert^{2}}{2\sigma^{2}}\right) + b \tag{6} $$

where the coefficients $w_i$ and the offset $b$ are the parameters to be optimized and $\sigma$ is a hyper-parameter set in advance. When training the nonlinear SVM classifier, if the classifier is allowed a certain classification error rate on the training samples, a regularization term $C$ should be added to reduce the generalization error of the classifier on the test samples.
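As an illustration of formulas (5) and (6), the decision function can be evaluated with a few lines of Python. This is a minimal sketch rather than the meter implementation: the function names, the two support vectors and the coefficient values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def rbf_kernel(x, xi, sigma):
    """Gaussian (RBF) kernel of formula (5)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def decision_value(x, support_vectors, coefs, b, sigma):
    """Decision function of formula (6): weighted kernel sum plus offset."""
    return sum(w * rbf_kernel(x, sv, sigma)
               for w, sv in zip(coefs, support_vectors)) + b

# Hypothetical model with two support vectors, for illustration only.
svs = [(0.0, 0.0), (1.0, 1.0)]
coefs = [1.0, -1.0]
value = decision_value((0.0, 0.0), svs, coefs, b=0.0, sigma=1.0)
label = "positive" if value > 0 else "negative"
```

Only the support vectors, their coefficients, the offset and the bandwidth are needed at classification time, which is why storing these parameters in a meter is sufficient for online classification.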


3 Detection and Feature Extraction of Electrical Appliances

3.1 Event Detection

To extract the characteristic values of the start and stop of electrical appliances, the start and stop events must be detected first. In this paper, the bilateral (two-sided) cumulative sum (CUSUM) detection algorithm is used [17]. For a sampling sequence $X = \{x(k)\}$ $(k = 1, 2, \ldots)$, the statistics $g$ of the nonparametric bilateral CUSUM are defined in formulas (7) and (8). Formula (7) calculates the forward change statistic, and formula (8) calculates the reverse change statistic.

$$ \begin{cases} g_0^{+} = 0 \\ g_k^{+} = \max\left(0,\; g_{k-1}^{+} + x_k - (\mu_0 + \beta)\right) \end{cases} \tag{7} $$

$$ \begin{cases} g_0^{-} = 0 \\ g_k^{-} = \max\left(0,\; g_{k-1}^{-} - x_k + (\mu_0 + \beta)\right) \end{cases} \tag{8} $$

where $\mu_0$ is the mean value of the steady-state series and $\beta$ is the noise. When $g_k^{+}$ or $g_k^{-}$ exceeds the set threshold, the sequence is considered to have a positive or negative change event. For a more detailed derivation, refer to [18]. In this paper, the active power curve is used to detect the start and stop events of electrical appliances, and the threshold is set to 150 W.

3.2 Characteristic Calculation

Both event detection and feature extraction rely on the calculation of basic electrical parameters. In this paper, the voltage and current RMS values are calculated as follows:

$$ \begin{cases} U_{rms} = \sqrt{\dfrac{1}{N}\sum_{k=1}^{N} u_k^{2}} \\ I_{rms} = \sqrt{\dfrac{1}{N}\sum_{k=1}^{N} i_k^{2}} \end{cases} \tag{9} $$
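Returning to the event detection of Sect. 3.1, the two-sided CUSUM statistics can be sketched in Python as follows. Several points are assumptions of this sketch and are not stated in the paper: the noise allowance β is subtracted in both directions (the conventional form, under which both statistics stay at zero for a steady signal), the statistics are reset after each detection, and μ0 is re-estimated as the sample at which the event fires. The synthetic trace and the value of β are hypothetical; only the 150 W threshold comes from the text.

```python
def cusum_events(power, mu0, beta, threshold):
    """Two-sided CUSUM on an active power series.

    Returns (index, direction) tuples, with direction +1 for a rise and
    -1 for a drop. Resetting after a detection and re-estimating mu0 from
    the current sample are assumptions of this sketch.
    """
    events = []
    g_pos = g_neg = 0.0
    for k, x in enumerate(power):
        g_pos = max(0.0, g_pos + (x - mu0) - beta)   # accumulates rises
        g_neg = max(0.0, g_neg - (x - mu0) - beta)   # accumulates drops
        if g_pos > threshold:
            events.append((k, +1))
            mu0, g_pos, g_neg = x, 0.0, 0.0
        elif g_neg > threshold:
            events.append((k, -1))
            mu0, g_pos, g_neg = x, 0.0, 0.0
    return events

# Synthetic trace: 100 W baseline, a 1500 W appliance on for five samples.
trace = [100.0] * 5 + [1600.0] * 5 + [100.0] * 5
events = cusum_events(trace, mu0=100.0, beta=20.0, threshold=150.0)
```

With this trace, the function reports a rise at sample 5 and a drop at sample 10. On real data, μ0 would normally be estimated from a window of steady-state samples rather than from a single sample.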

The apparent power, active power and reactive power are calculated as follows:

$$ S = U_{rms} I_{rms} \tag{10} $$

$$ P = \frac{1}{N}\sum_{k=1}^{N} u_k i_k \tag{11} $$

$$ Q = \sqrt{S^{2} - P^{2}} \tag{12} $$
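Formulas (9) to (12) translate directly into code. The sketch below assumes that the voltage and current samples cover a whole number of cycles; the waveform amplitudes, the 64-point cycle and the 60 degree phase lag are hypothetical test values.

```python
import math

def electrical_parameters(u, i):
    """Compute RMS values and powers from equal-length sample lists."""
    n = len(u)
    u_rms = math.sqrt(sum(v * v for v in u) / n)       # formula (9)
    i_rms = math.sqrt(sum(c * c for c in i) / n)       # formula (9)
    s = u_rms * i_rms                                  # apparent power, formula (10)
    p = sum(v * c for v, c in zip(u, i)) / n           # active power, formula (11)
    q = math.sqrt(max(s * s - p * p, 0.0))             # reactive power, formula (12)
    return u_rms, i_rms, s, p, q

# One full cycle sampled at 64 points; the current lags the voltage by 60 degrees.
N = 64
u = [311.0 * math.sin(2 * math.pi * k / N) for k in range(N)]
i = [10.0 * math.sin(2 * math.pi * k / N - math.pi / 3) for k in range(N)]
u_rms, i_rms, s, p, q = electrical_parameters(u, i)
```

The `max(..., 0.0)` guard only protects against a slightly negative argument caused by rounding when the load is purely resistive. Note that formula (12) yields the magnitude of the reactive power, so its sign (inductive or capacitive) is not recovered.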


Taking the appliance turn-on event as an example, four electrical characteristics are selected and defined as follows:

$$ \Delta P = P_{t_2} - P_{t_1} \tag{13} $$

$$ \Delta Q = Q_{t_2} - Q_{t_1} \tag{14} $$

$$ \Delta T = T_{t_2} - T_{t_1} \tag{15} $$

$$ \Delta S = S_{t_2} - S_{t_1} \tag{16} $$

where $\Delta P$ is the active power increment, $\Delta Q$ is the reactive power increment, $\Delta T$ is the step duration, and $\Delta S$ is the fluctuation increment. Starting from the power mutation time detected by the CUSUM algorithm, the instants $t_1$ and $t_2$ are obtained by shifting a time $T$ before and after that time, respectively. The value of $T$ should be chosen from experience; in this paper it is 3 s. Note that in formula (16) $S$ denotes the variance of the power curve, not the apparent power of formula (10), and its calculation time window is 5 s. Taking the turn-on of an electric kettle as an example, $t_1$, $t_2$ and the time window for calculating $S$ are marked in Fig. 1.
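A possible way to compute three of these features around a detected event is sketched below. It is an interpretation under stated assumptions: the power series are sampled once per second, the variance windows are placed immediately before t1 and immediately after t2 (the paper shows their placement only in Fig. 1), and the step duration of formula (15) is omitted because the text does not define how it is measured. All names and sample values are hypothetical.

```python
from statistics import pvariance

def event_features(p, q, event_idx, fs=1, offset_s=3, window_s=5):
    """Return (delta_p, delta_q, delta_s) for an event at index event_idx.

    p, q: active and reactive power series; fs: samples per second.
    The window placement around t1 and t2 is an assumption of this sketch.
    """
    t1 = event_idx - offset_s * fs
    t2 = event_idx + offset_s * fs
    w = window_s * fs
    delta_p = p[t2] - p[t1]                       # formula (13)
    delta_q = q[t2] - q[t1]                       # formula (14)
    var_before = pvariance(p[t1 - w + 1:t1 + 1])  # variance of the window ending at t1
    var_after = pvariance(p[t2:t2 + w])           # variance of the window starting at t2
    return delta_p, delta_q, var_after - var_before   # formula (16)

# Hypothetical kettle-like turn-on detected at index 10.
p = [100.0] * 10 + [900.0, 1500.0, 1600.0,
                    1600.0, 1610.0, 1590.0, 1600.0, 1600.0, 1600.0, 1600.0]
q = [0.0] * 20
delta_p, delta_q, delta_s = event_features(p, q, event_idx=10)
```

The caller has to make sure that both windows lie inside the recorded series, which in a meter means buffering at least `offset_s + window_s` seconds on each side of the event.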

Fig. 1. Diagram of calculating time

4 Construction of Load Classifier

Taking the electric kettle, induction cooker, hair dryer and rice cooker as classification objects, a multi-class classifier is constructed with the 1-to-1 (one-versus-one) method. Six two-class classifiers, one for each pair of the four appliance types, are trained and then combined into a classification voting system. The classifier parameters are stored in the memory of the watt-hour meter and used to classify the test load.


4.1 Two Class Classifier Training

First, the start-up and stop load waveforms of the electrical appliances are recorded. Each appliance is turned on several times under different background noises, the instantaneous sampling values of current and voltage at turn-on are recorded, and the related electrical characteristic quantities are calculated according to formulas (13) to (16). Each appliance was tested 50 times in different environments, giving a total of 200 samples for the 4 kinds of appliances. 70% of the samples are used as the training set and 30% as the test set.

The classifier is trained with LIBSVM, an open source toolbox for MATLAB, and the parameters are optimized by cross validation. To minimize the training error and the generalization error, σ and C are adjusted, and the model parameters w, b and the support vectors x_i are obtained. Because the attributes differ in order of magnitude, the attribute values of the samples should be normalized before the classifier is trained. The MATLAB version used in this paper is R2014 and the LIBSVM version is 3.23. The training function is called as follows:

    model = svmtrain(train_label, train_data, options)

Here the input parameter train_label is the sample label, in this paper a 140 × 1 array; train_data is the feature data of the training samples, a 140 × 4 matrix; options are the attributes of the training model, including the support vector machine type, the kernel function type and the hyper-parameters, such as the parameter of the Gaussian kernel function and the regularization term. The training results are returned in a model structure, whose main parameters are shown in Table 1.

Table 1. Main parameters of model structure

| Parameter | Definition | Symbol |
|---|---|---|
| rho | Offset | −b |
| sv_coef | Support vector coefficient | w |
| SVs | Supporting vectors | x_i |
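Two preparation steps mentioned above, normalizing the attributes and choosing the kernel hyper-parameters, can be sketched as follows. The sketch is in Python rather than the MATLAB used in the paper, and it does not call LIBSVM; it only builds the options string. The helper names and the feature rows are hypothetical. The flags `-s 0` (C-SVC), `-t 2` (RBF kernel), `-c` (cost C) and `-g` (gamma) are standard LIBSVM options, and because LIBSVM writes the RBF kernel as exp(−γ‖u − v‖²), the bandwidth σ of formula (5) corresponds to γ = 1/(2σ²).

```python
def minmax_scale(rows):
    """Scale each column of a list of feature rows to the range [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def libsvm_options(sigma, c):
    """Build a LIBSVM options string for C-SVC with an RBF kernel."""
    gamma = 1.0 / (2.0 * sigma ** 2)   # convert bandwidth sigma to LIBSVM gamma
    return f"-s 0 -t 2 -c {c:g} -g {gamma:g}"

# Hypothetical feature rows in the order (dP, dQ, dT, dS).
rows = [[1500.0, 10.0, 1.0, 40.0],
        [800.0, 200.0, 2.0, 90.0],
        [1200.0, 50.0, 1.5, 10.0]]
scaled = minmax_scale(rows)
options = libsvm_options(sigma=0.5, c=10)
```

Whatever scaling is used for training must also be applied to the features computed in the meter, so the per-attribute minimum and maximum would have to be stored alongside the classifier parameters.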

After one round of training, the svmpredict function is called to test the classifier, and the training parameters are adjusted according to the ROC curve. The next round of training is then carried out, until the classifier with the best performance on the data set is obtained. Unknown samples can then be classified according to formula (6).

4.2 Multiple Classifier Training

The training above yields the two-class classifiers. Based on the principle of 1-to-1 voting, these are combined into a multi-class classifier by the program in the electric energy meter. First, a two-class classifier is constructed for every pair of classes, and each classifier is used to classify the sample. Finally, the votes of each class are counted, and the class with the most votes is the classification result. Taking 3 categories as an example, the 1-to-1 classification voting process is shown in Fig. 2.

Fig. 2. Classification process of 1-to-1 multiple classifier

In Fig. 2, a sample belonging to class C1 is assumed to pass through the three two-class classifiers in turn. If the classifiers are correct, the results returned are C1, C1 and either C2 or C3. C1 therefore receives 2 votes, so the classification result is C1.
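The voting scheme described above can be sketched as follows for the four appliance types. The pairwise classifiers here are hypothetical stand-ins that compare a single power feature with made-up class centers; in the paper each pair is a trained SVM decision function of the form of formula (6).

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(x, classes, binary_classifiers):
    """Classify x by majority vote over all pairwise classifiers.

    binary_classifiers maps a (class_a, class_b) pair to a function that
    returns a positive value for class_a and a non-positive one for class_b.
    """
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[a if binary_classifiers[(a, b)](x) > 0 else b] += 1
    return votes.most_common(1)[0][0], votes

classes = ["kettle", "induction cooker", "hair dryer", "rice cooker"]
pairs = list(combinations(classes, 2))   # 4 classes give 6 pairwise classifiers

# Hypothetical class centers (watts) used only to fake the pairwise decisions.
centers = {"kettle": 1800.0, "induction cooker": 2100.0,
           "hair dryer": 1200.0, "rice cooker": 700.0}
classifiers = {
    (a, b): (lambda x, a=a, b=b: abs(x - centers[b]) - abs(x - centers[a]))
    for a, b in pairs
}
winner, votes = one_vs_one_predict(1750.0, classes, classifiers)
```

For k classes the scheme needs k(k − 1)/2 classifiers, which is 6 for the four appliances here. Ties are resolved arbitrarily in this sketch; a real implementation would need an explicit tie-breaking rule.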

5 Load Classification Test Analysis

5.1 Testing Environment

To test the non-intrusive load identification function of the electric power meter, a physical test platform is built according to a typical household appliance configuration. The platform can test both the accuracy of load classification and the error of energy collection. Its schematic diagram is shown in Fig. 3. In Fig. 3, the AC power supply provides the working power for the electrical appliances. The load identification electric power meter under test is connected in series in the total current loop. Four common electric power meters are connected in series with the shunt switches to provide a reference standard for calculating the error of the collected energy. The power control unit switches the AC power supply and the shunt switches on and off. The acquisition unit collects the energy information of all electric energy meters and the identification results of the load identification electric power meter. The electrical appliance control unit controls the working state of the electrical appliances according to the predetermined test process.

[Figure: block diagram. Blocks: AC power supply; power control unit; electric power meter under test; shunt switch; electric power meters #1 to #4; electric kettle, hairdryer, rice cooker, induction cooker; acquisition unit; electrical appliance control unit.]

Fig. 3. Principle block diagram of testing platform for load identification electric power meter

5.2 Analysis of Test Results

The test is carried out in the sequence of a single appliance, two appliances and three appliances activated one after another. Each test lasts 5 min and is repeated 10 times. The test process is controlled automatically by a computer program, and the identification accuracy for appliance turn-on events is shown in Table 2.

Table 2. Identification accuracy of electrical appliance opening event

| Types of electrical appliances | Times of accurate identification | Identification accuracy |
|---|---|---|
| Electric kettle | 65 | 92.8% |
| Hairdryer | 60 | 85.7% |
| Rice cooker | 63 | 90.0% |
| Induction cooker | 58 | 82.8% |

It can be seen from Table 2 that the identification accuracy of each of the above appliances is above 80%. Identification errors mainly occur when multiple appliances are activated. For example, the error rate for turn-on events is relatively high when the induction cooker is already operating. The accuracy of energy collection is statistical data and is not the focus of this paper, so it is not discussed here.


6 Conclusions

For the purpose of engineering application, this paper studies a method for classifying electrical appliances in an electric power meter using a nonlinear SVM classification algorithm. Based on this method, an electric power meter with a load identification function is developed. In the laboratory test environment, the classification accuracy of the meter is above 80% for every appliance tested, so the method has practical application value.

Although the proposed method achieves good results in the laboratory, it still has some shortcomings. One is that the number of two-class classifiers needed to construct the multi-class classifier increases rapidly with the number of appliance categories. The other is that the electrical characteristics of some appliances are highly similar when they are turned on or off, so they cannot be distinguished by the SVM algorithm. In view of these two problems, the feature library of appliance types should be further expanded. By considering the time of use, user habits and other non-electrical characteristics, accurate classifiers can be constructed with unsupervised learning methods to further improve the accuracy of the load identification algorithm for engineering applications.

Acknowledgment. This work was supported by State Grid Corporation of State Grid Jiangsu Electric Power Co., Ltd. Project “Research on load data acquisition technique and device development for power user (B310EG204PZJ)”. The authors would also like to thank the reviewers and editors for their generous help.

References

1. Li, Y., Kong, X.: Prompt checking for lost fee due to voltage loss in three-phase three-wire watt-hour meters. Electr. Meas. Instrum. 42(474), 24–26 (2005)
2. Wang, X., Su, H., Song, T., Huang, Q.: Differential customer baseline load forecasting based on load subdivision. Electr. Power Eng. Technol. 37(6), 33–38 (2018)
3. Dong, B., Mao, W., Li, F., Su, D.: The technique of day-ahead optimized scheduling with multi-type of flexible loads. Electr. Power Eng. Technol. 37(6), 97–102 (2018)
4. Hart, G.W.: Nonintrusive appliance load monitoring. Proc. IEEE 80(12), 1870–1891 (1992)
5. Hong, Y., Chou, J.: Non-intrusive energy monitoring for micro-grids using hybrid self-organizing feature-mapping networks. Energies 5, 2578–2593 (2012)
6. Qu, H.: Research and Implementation of Non-intrusive Identification Method Based on Transient Characteristics (2012)
7. Li, Y.: Research on the Development and Application of Non-intrusive Multifamily Load Identification Device. Southeast University (2017)
8. Zhang, Z.: Design and Implementation of Non-intrusive Load Identification System Based on Steady-state Characteristics. University of Electronic Science and Technology of China (2018)
9. Li, P., Yu, Y.: Non-intrusive method for on-line power load decomposition. J. Tianjin Univ. 42(4), 303–308 (2009)
10. Zhou, M., Song, X., Tu, J.: Residential electricity consumption behavior analysis based on non-intrusive load monitoring. Power Syst. Technol. 42(10), 3268–3274 (2018)
11. Cao, M., Wei, L., Zou, J.: Research on non-intrusive load monitoring based on transient state process. Water Resour. Power 36(8), 177–180 (2018)
12. Liu, R.: Research on Household Load Identification Combining Improved Nearest Neighbour Method and Support Vector Machine. Chongqing University (2014)
13. Jiang, B.: Deep Learning Based Method for Non-intrusive Residential Appliances Load Disaggregation. Hefei University of Technology (2017)
14. Li, C.: Optimization methods of Non-intrusive Load Monitoring based on Deep Neural Network. Huazhong University of Science and Technology (2017)
15. Zhang, J.: Detection and Recognition of Power Quality Disturbance Signals. Nanchang University (2017)
16. Zhou, Z.: Machine Learning. Tsinghua University Press (2016)
17. Kubat, M.: An Introduction to Machine Learning. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-20010-1
18. Niu, L., Jia, H.: Transient event detection algorithm for non-intrusive load monitoring. Autom. Electr. Power Syst. 35(9), 30–35 (2011)

Intelligent Detection of Electricity Stealing by Replacing Instrument Transformer Based on Daily Load Date Mining

Gaojun Xu1(B), Xin Zhang1, Li Sun1, Weimin He1, Shuangshuang Zhao1, Weijiang Wu1, and Jian He2

1 Marketing Service Center, State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210019, China
2 State Grid Xiongan New Area Electric Power Supply Company, Xiongan New Area 071600, China

Abstract. We study the intelligent detection of electricity stealing by replacing the instrument transformer, based on the correlation of daily line loss fluctuation. Qualitative analysis shows a close correlation between the daily load of the electricity-stealing user and the line loss. The Pearson correlation coefficient is introduced to quantify the degree of correlation between them, and a method for detecting this electricity stealing mode is constructed for practical application. The accuracy and practicability of the method are illustrated by a detailed analysis of actual cases. The method solves the problem that electricity stealing by high-voltage users who furtively replace the instrument transformer is difficult to detect, because abnormal electrical characteristic quantities such as voltage, current and phase are absent.

Keywords: Electricity stealing · Replacing instrument transformer · Line loss · Pearson correlation coefficient

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 87–97, 2021. https://doi.org/10.1007/978-3-030-80531-9_8

1 Introduction

There are many ways to steal electricity, and they are increasingly intelligent, professional and covert [1, 2]. Stealing is often done by enlarging the measurement error of the quantities on which the power measured by the electric energy metering device depends, including voltage, current and phase [3, 4]. According to the component of the metering circuit involved, it can be divided into internal and external electricity stealing [5]. Such electricity stealing can be identified easily through the characteristics of abnormal electricity consumption, such as voltage loss, undervoltage, current loss, current imbalance and phase change. However, one hidden way of stealing electricity, privately replacing the instrument transformer, which often appears among high-voltage (HV) users, is difficult to find because it produces no abnormal electrical characteristics. As the actual transformation ratio on site is usually several times that recorded in the system, there is a huge difference between the actual and the measured electricity consumption, which causes great losses to the power grid company.


In this context, analyzing the relationship between abnormal electricity consumption and power loss in a low-voltage area or on a medium-voltage line, so as to lock in abnormal users, has become a research hotspot. Reference [6] finds abnormal electricity meter users in the low-voltage area by calculating the relationship between power loss and electricity consumption with the Pearson correlation coefficient. Reference [7] proposes a method based on the fluctuation of the real-time line loss rate, which sets an alarm threshold, monitors abnormal lines routinely, and analyzes and checks suspected electricity stealing. Reference [8] locates suspected electricity stealing by HV power users, based on distance and spectrum analysis, by comparing the similarity between the line loss and the user load. However, existing research has not analyzed the correlation between different types of electricity stealing and power loss, nor has it considered practical factors in application.

This paper first qualitatively analyzes the relationship between electricity stealing by replacing the instrument transformer and line loss. Second, it introduces the Pearson correlation coefficient to quantify the degree of correlation, and formulates the discriminant model and process in combination with the actual situation. Then, the feasibility of the method is illustrated by a typical case. Finally, we summarize the three characteristics of electricity stealing by replacing the instrument transformer on an abnormal line, and point out the significance of data mining based on the correlation of daily load fluctuation for the actual monitoring of electricity stealing.

2 Relationship Between Electricity Stealing Replacing Instrument Transformer and Line Loss

Line loss includes technical line loss and management line loss. Technical line loss, also known as theoretical line loss, is determined by the load condition of the power grid and the parameters of the power supply equipment, and can be obtained through theoretical calculation. The corresponding line loss rate is the theoretical line loss rate [9, 10]. Taking Jiangsu Electric Power Co., Ltd. as an example, a loss rate of 0–10% on a 10 (20) kV line is considered normal. Management line loss is determined by the human management of the line and is related to the maintenance and management of the power grid, for example inaccurate file information on the connection between line and customer transformer, abnormal operating conditions of metering devices, and electricity stealing. According to the definition and composition of line loss, the line loss can be expressed as

$$ \Delta J = \Delta J_T + \Delta J_M \tag{1} $$

where $\Delta J$ is the line loss, $\Delta J_T$ is the technical line loss and $\Delta J_M$ is the management line loss. When other factors affecting the management line loss are excluded and a user on the line steals electricity by installing a larger transformation ratio without permission, the management line loss can be expressed as

$$ \Delta J_M = J \, \frac{N' - N}{N} \tag{2} $$

where $J$ is the daily load of the electricity-stealing user recorded in the marketing system, $N$ is the comprehensive ratio of the current transformer recorded in the marketing system, and $N'$ is the actual on-site ratio at the electricity-stealing user, which is larger than $N$. According to the definition of daily load, the value of $J$ is

$$ J = 3 N U I \cos\phi \, t \tag{3} $$

where $U$ is the voltage, $I$ is the current, $\cos\phi$ is the power factor and $t$ is the time. According to (1) and (2), the line loss can be expressed as

$$ \Delta J = \Delta J_T + J \, \frac{N' - N}{N} \tag{4} $$

According to (4), when the electricity consumption of the other users on the line is relatively stable, the recorded consumption of the user who steals electricity by replacing the instrument transformer and the line loss fluctuate synchronously. Therefore, the electricity stealing can be screened out by checking for this synchronization over the same period of time.
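A small numerical sketch may make formulas (2) and (4) concrete. All values are hypothetical: the recorded ratio of 80, the on-site ratio of 120, the 10000 kWh daily load and the 300 kWh technical loss are assumptions chosen only for illustration.

```python
def unmetered_energy(recorded_kwh, n_recorded, n_actual):
    """Energy missed by the meter, formula (2): J * (N' - N) / N."""
    return recorded_kwh * (n_actual - n_recorded) / n_recorded

# Hypothetical user: the system records ratio N = 80 while a transformer
# with ratio N' = 120 is installed on site.
recorded = 10000.0                       # daily load J booked in the system, kWh
missing = unmetered_energy(recorded, n_recorded=80, n_actual=120)
line_loss = 300.0 + missing              # formula (4) with an assumed technical loss
loss_to_load = line_loss / recorded      # the ratio that should stay nearly constant
```

Because the unmetered energy is a fixed multiple of the recorded daily load, any rise or fall of that load reappears in the line loss, scaled by (N′ − N)/N. This is the synchronization that the detection method relies on.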

3 Using Pearson Correlation Coefficient to Identify Electricity Stealing of Replacing Instrument Transformer

3.1 Principle of Pearson Correlation Coefficient

The Pearson correlation coefficient measures the statistical linear correlation between two random variables or two signals. Its equation is

$$ r = \frac{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^{2}} \, \sqrt{\sum_{i=1}^{n} \left(Y_i - \bar{Y}\right)^{2}}} \tag{5} $$

where $r$ is the Pearson correlation coefficient, which ranges from −1 to 1, $X_i$ and $Y_i$ are the sample values of the two variables, $\bar{X}$ and $\bar{Y}$ are the sample means of the two variables, and $n$ is the number of samples. When $r = 1$, there is a linear relationship between X and Y with probability 1. When $r = 0$, X and Y are linearly uncorrelated. When $0 < r < 1$, the trends of X and Y are generally consistent. When $-1 < r < 0$, the trends of X and Y are generally opposite [11–17].

3.2 Application of Pearson Correlation in Electricity Stealing Detection

This paper focuses on high-loss lines. The Pearson correlation coefficient r is calculated from the daily load of each user and the line loss over a period of high loss rate. It reflects the correlation between their fluctuations and thereby distinguishes electricity stealing by replacing the instrument transformer. The specific judgment process is shown in Fig. 1.
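Formula (5) can be computed directly with the Python standard library. The sketch below is illustrative; the daily load values are hypothetical, and the line loss series is constructed as an exact linear function of the load so that the expected coefficient is known.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series, formula (5)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Seven days of hypothetical daily load (kWh) and a loss series that follows it.
load = [1000.0, 1400.0, 1200.0, 1600.0, 900.0, 1300.0, 1500.0]
loss = [0.5 * v + 200.0 for v in load]
r_sync = pearson(load, loss)                  # synchronous fluctuation
r_anti = pearson(load, [-v for v in load])    # opposite fluctuation
```

The function divides by zero when either series is constant, for example for a user with no load during the whole window, so such users need to be filtered out before the coefficient is computed.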

[Figure: flow chart. Selection of abnormal line loss (1. line loss rate > 0; 2. line loss rate increases suddenly and keeps high loss, or remains high loss all the time) → high line loss rate? (Y/N) → time dimension selection of power consumption (within 7–15 days) → calculate the Pearson correlation coefficient of the high voltage special transformer users under the line → r > 0.9 and ∆J/J has little fluctuation → electricity theft detection of replacing instrument transformer → End.]

Fig. 1. Flow chart of electricity stealing by replacing instrument transformer detection based on the correlation of daily load fluctuation

a) Selection of analysis objects for abnormal line loss: The abnormal line to be analyzed should meet the following conditions: first, the line loss rate is greater than 0; second, the line loss rate increases suddenly and stays high, or remains high all the time. Lines with negative line loss cannot be assessed by this method; a negative loss is usually caused by an error in the file recording the connection between line and customer transformer.

b) Time dimension selection of daily load: The time dimension directly affects how well the correlation coefficients of different users can be told apart, and therefore the accuracy of the judgment. When the time window is too short, the correlation coefficient fluctuates greatly and the difference between users is not obvious. At the same time, to reduce the influence of power loss fluctuations caused by abnormal measurement, acquisition and other factors, the time window should not be too long either; it is generally 7–15 days.

c) Calculation of the correlation coefficient of the high-voltage special transformer users under the line: A medium-voltage line supplies high-voltage special transformer users and low-voltage public substation assessment users. The station area assessment gateway need not be considered when there is no abnormal measurement at the gateway.

d) Detection of electricity stealing by replacing the instrument transformer: When r > 0.9, the recorded consumption of the user and the line loss fluctuate in strong synchronization, and the larger r is, the higher the degree of suspicion. When r = 1, the two fluctuate in complete synchronization. If, at the same time, the calculated value of ΔJ/J fluctuates little, the suspected user can be further locked in.
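Steps b) to d) above can be sketched as a small screening routine. It is an illustration under assumptions, not the authors' system: the paper only states that ΔJ/J should fluctuate little, so the bound `cv_max` on the relative spread of that ratio is a hypothetical parameter, as are the function names and the seven days of data.

```python
import math
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return cov / den if den > 0 else 0.0

def screen_users(line_loss, user_loads, r_min=0.9, cv_max=0.1):
    """Return (user, r) pairs whose daily load tracks the line loss.

    line_loss: daily line loss (kWh) over the analysis window (7-15 days).
    user_loads: dict mapping a user name to that user's daily load (kWh).
    cv_max: hypothetical bound on the relative spread of loss / load.
    """
    suspects = []
    for name, load in user_loads.items():
        if min(load) <= 0:
            continue                     # skip users with no load in the window
        r = pearson(load, line_loss)
        ratios = [dj / j for dj, j in zip(line_loss, load)]
        cv = pstdev(ratios) / mean(ratios)
        if r > r_min and cv < cv_max:
            suspects.append((name, round(r, 3)))
    return sorted(suspects, key=lambda item: -item[1])

load_a = [4000.0, 5200.0, 4600.0, 5000.0, 4300.0, 5500.0, 4800.0]
line_loss = [0.5 * v + 100.0 for v in load_a]   # loss follows user A
users = {
    "HV user A": load_a,
    "HV user B": [3000.0, 2900.0, 3100.0, 3000.0, 2950.0, 3050.0, 3000.0],
    "HV user C": [0.0] * 7,
}
suspects = screen_users(line_loss, users)
```

Only user A is returned in this example. Sorting by r mirrors the remark that a larger coefficient means a higher degree of suspicion, and the result would still need on-site inspection to be confirmed.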

4 Case Verification and Analysis

4.1 Typical Case Verification

During the prevention and control of the novel coronavirus epidemic in 2020, this method was applied successfully to detect abnormal medium-voltage line loss in Jiangsu Province. In one example, a user who had privately replaced an instrument transformer was found on a 10 kV line in Suqian, Jiangsu Province. The line supplies 13 users: 6 high-voltage users and 7 low-voltage users. Before January 21, 2020, the line loss rate had been high for a long time, fluctuating around 20%. From January 22 to February 21, 2020, because of the Spring Festival and the epidemic, the high-voltage enterprise users on the line had not fully resumed work and production, and the line loss fluctuated normally under light load. On February 22, however, the line loss rose suddenly and then stayed at about 20%, with a daily power loss of about 7000 kWh. The abnormal fluctuation of the line loss during this period is shown in Fig. 2.


Fig. 2. Abnormal fluctuation curve of 10 kV line loss (series: power supply, power sale, line loss rate; axes: date, electric power (kWh), line loss rate (%))

Fig. 3. Daily load fluctuation curve between high-voltage user and line loss (legend: line loss, HV users 1–6; axes: date, electric power (kWh))


After the abnormal increase of line loss on February 22, the daily load fluctuation curves of the HV users and the line loss are shown in Fig. 3. HV users 5 and 6 had no load for a long time and are therefore not shown. In Fig. 3, the daily load of HV user 4 and the line loss fluctuate almost in complete synchrony; their correlation coefficient from February 22 to February 28 is 0.98, a strong positive correlation, so HV user 4 is highly suspected of stealing electricity. The on-site inspection found signs that the nameplates of the user's current transformers (CTs) had been reinstalled. The ratio test showed that two CTs were 600/5 A and one was 500/5 A, whereas the ratios recorded in the marketing system are all 400/5 A. On March 26, after HV user 4 had been penalized for stealing electricity, the line loss returned to normal and stayed at about 1%. The normal fluctuation of the line loss is shown in Fig. 4.

Fig. 4. Normal fluctuation curve of 10 kV line loss (series: power supply, power sale, line loss rate; axes: date, electric power (kWh), line loss rate (%))

4.2 Typical Case Analysis

1) Comparison of the correlation coefficients of different users in different periods

With electricity stealing: Taking the window from February 22 to February 28 as an example, the correlation coefficients of the high-voltage users are listed in Table 1. HV users 5 and 6 have no load, so no correlation coefficient can be calculated for them. HV users 1 and 3 are weakly negatively correlated with the line loss, HV user 2 is weakly positively correlated, and HV user 4 is positively correlated with a coefficient close to 1.

Table 1. Correlation coefficients of high-voltage users

Data interval   HV user 1   HV user 2   HV user 3   HV user 4   HV user 5   HV user 6
2/22–2/28       −0.19       0.33        −0.35       0.98        −           −

Without electricity stealing: Taking the window from March 26 to April 12 as an example, the correlation coefficients of the high-voltage users are listed in Table 2. All coefficients are low, which indicates that the users' daily loads are not closely related to the line loss during normal operation.

Table 2. Correlation coefficients of high-voltage users

Data interval   HV user 1   HV user 2   HV user 3   HV user 4   HV user 5   HV user 6
3/26–4/12       −0.03       0.38        0.14        0.05        −           −

The comparison shows that the Pearson correlation coefficient of the user stealing electricity by replacing the instrument transformer is close to 1, much higher than that of normal customers. Once the line returns to normal operation, the correlation coefficients of all users are small.

2) Influence of the time dimension on the correlation coefficient

Starting from February 22, the Pearson correlation coefficient of the four HV users with load was calculated over windows of different lengths. The resulting trend is shown in Fig. 5. When the window is shorter than 7 days, the correlation coefficients of HV users 1–3 fluctuate strongly. When it is shorter than 5 days, the coefficient of HV user 2 is relatively large, which can mislead the judgment. The coefficient of HV user 4, by contrast, stays above 0.97 for every window length, a consistently strong correlation. An appropriate choice of the time dimension is therefore very important to the result. To distinguish abnormal users more reliably and to limit the effect of daily load data lost through metering equipment failures or collection gaps, a window of at least 7 but no more than 15 days is generally chosen in engineering practice.

3) Calculation and analysis of the ΔJ/J value

From February 22 to March 22, the line loss rate remained high. The ratio of the line loss to the daily electricity consumption of each user over this period is shown in Fig. 6.


Fig. 5. Variation trend of correlation coefficient of high-voltage user in different time dimensions (series: HV users 1–4; axes: time dimension, 2–30 days; correlation coefficient, −1 to 1)

Fig. 6. Fluctuation curve of ΔJ/J (panels: HV user 1, HV user 2, HV user 3, HV user 4; axes: date, ΔJ/J)


In Fig. 6, the fluctuation ranges of ΔJ/J are as follows: HV user 1, between 5.29 and 13.46; HV user 2, between 110.90 and 264.99; HV user 3, between 1.03 and 3.63; HV user 4, between 0.38 and 0.44. The ΔJ/J value of HV user 4 thus fluctuates around 0.4 and has the smallest fluctuation, which further reflects the linear correlation between ΔJ and J (HV user 4) and also indicates that the technical line loss was relatively stable in this period.
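The comparison of ΔJ/J ranges can be made explicit with a small Python sketch. The ranges are the ones reported above, with the third range taken to belong to HV user 3, as the panel order of Fig. 6 suggests. The helper names and the choice of the max/min ratio as a steadiness measure are assumptions made for illustration; the paper itself only compares the ranges qualitatively.

```python
def daily_ratios(line_loss, user_load):
    """Daily ratio of line loss to user consumption; days without load are skipped."""
    return [dj / j for dj, j in zip(line_loss, user_load) if j > 0]

def relative_range(lo, hi):
    """Largest ratio divided by smallest ratio; 1.0 would mean a perfectly steady ratio."""
    return hi / lo

# Reported (min, max) of the daily ratio for each user with load.
reported = {
    "HV user 1": (5.29, 13.46),
    "HV user 2": (110.90, 264.99),
    "HV user 3": (1.03, 3.63),
    "HV user 4": (0.38, 0.44),
}
spread = {user: relative_range(lo, hi) for user, (lo, hi) in reported.items()}
steadiest = min(spread, key=spread.get)
print(steadiest, round(spread[steadiest], 2))
```

For HV user 4 the largest daily ratio is only about 1.16 times the smallest, whereas for the other users it is more than twice the smallest, which is what singles this user out.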

5 Conclusions

The introduction and analysis of the typical case above show that the method based on the correlation of daily load fluctuation is clearly effective in detecting electricity stealing by replacing the instrument transformer. This type of electricity stealing has the following characteristics:

1. The Pearson correlation coefficient between the user's load and the line loss is greater than 0.9, and the larger the value, the higher the suspicion of stealing electricity.
2. During a period with electricity stealing, the correlation coefficient is large regardless of the length of the time dimension.
3. The ΔJ/J value fluctuates only slightly.

Therefore, if an abnormal line shows these characteristics, electricity stealing by replacing the instrument transformer is very likely on that line. Because the current, voltage and phase quantities show no abnormality, users stealing electricity in this way cannot be detected by monitoring their own electrical quantities. Instead, one can monitor abnormal lines and use the correlation of daily load fluctuation to find these users.

References

1. Xiao, J., Zhao, F.P., Cheng, Y.Y., Zhou, F., Liu, R.M.: Study on a new type of electric larceny using half-wave rectifying method. In: Advanced Materials Research, vol. 986–987, pp. 1655–1660 (2014)
2. Yang, X.L., Tao, X.F., Xiong, X., Qi, M.Y., Sun, M.: Detection method for electricity stealing based on deep forest algorithm. Smart Power 47(10), 85–92 (2019)
3. Wang, H., Liu, F.: Application of wireless communication technology in electricity larceny prevention. Electr. Meas. Instrum. 52(1), 124–128 (2015)
4. Guo, L.C., Peng, Z.W., Fan, Q.: A survey of electric energy metering countermeasures to electric power stealing. High Volt. Appara. 46(5), 86–88 (2010)
5. Zhao, X.L.: New Technology of Anti-stealing Electricity in Electric Watt-Hour Meter, pp. 83145. China Electric Power Press, Beijing (2013)
6. Wang, J., Wu, X.M., Wang, A.F.: The application of Pearson correlation coefficient algorithm in searching for the users with abnormal watt-hour meters. Power Demand Side Manag. 16(2), 52–54 (2014)
7. Tang, W.B.: The research of anti-stealing electric energy based on analysis of calculating line loss. MA.Eng. dissertation, Guangxi University, Guangxi (2015)
8. Wu, D.: Electricity stealing identification method based on curve similarity. Electr. Power 50(2), 181–184 (2017)
9. Wu, A.G., Ni, B.S.: Analysis and Calculation of Power System Line Loss, 3rd edn., pp. 23. China Electric Power Press, Beijing (2013)
10. Lin, Z.Q., Zhu, J.C.: Analysis of factors affecting medium-voltage distribution line loss based on mutual information and countermeasure theory. Smart Power 45(12), 75–79 (2017)
11. Komaroff, E.: Relationships between p-values and Pearson correlation coefficients, type 1 errors and effect size errors, under a true null hypothesis. J. Stat. Theor. Pract. 14(381), 129–133 (2020)
12. Li, G.Z., Li, L.Z., Zhou, N.: Exploration of correlation coefficient. J. Inf. Eng. Univ. 10(3), 318–321 (2009)
13. Xu, W.C.: A review on correlation coefficients. J. Guangdong Univ. Technol. 29(3), 12–17 (2012)
14. Zafar, S., Soni, M.K.: A novel crypt-biometric perception algorithm to protract security in MANET. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 6(12), 64–71 (2014). https://doi.org/10.5815/ijcnis.2014.12.08
15. Agrawal, A., Brijpuria, P.: A dynamic object identification protocol for intelligent robotic systems. Int. J. Image Graph. Signal Process. (IJIGSP) 7(8), 35–41 (2015). https://doi.org/10.5815/ijigsp.2015.08.04
16. Naik, N.M., Kulkarni, G.S., Prakash, K.B.: Assessment of the deterioration of used engine oil soaked fly ash concrete and its analysis using automated SEM analysis. Int. J. Eng. Manuf. (IJEM) 6(3), 1–11 (2016). https://doi.org/10.5815/ijem.2016.03.01
17. Manjunatha, B.A., Gogoi, P., Akkalappa, M.T.: Data mining based framework for effective intrusion detection using hybrid feature selection approach. Int. J. Comput. Netw. Inf. Secur. (IJCNIS) 11(8), 1–12 (2019). https://doi.org/10.5815/ijcnis.2019.08.01

Fuzzy Management of Teacher-Student Interaction in Distance Learning Settings

N. Yu. Mutovkina
Tver State Technical University, Tver, Russia

Abstract. The article addresses the problem of improving the effectiveness of interaction between the subjects of the educational process in higher education. The key persons in the educational process are teachers and students: teachers are regarded as transmitters of educational information, and students as its consumers. Taking into account the identified factors that influence the effectiveness of the educational process, the article proposes models of the transfer of educational information, of its perception, and of distance learning with interactive exchange of information between the teacher and the students. All these models are represented as fuzzy graphs. A system for evaluating interaction based on fuzzy sets and the postulates of fuzzy logic is developed, and a technology for effective teamwork is described. Improving the effectiveness of interaction between teachers and students directly affects the quality of education. The quality of the educational process depends on the level of training of students, the success of the professional activities of graduates, and the success of the professional self-realization of teachers.

Keywords: Educational process · Teachers · Students · The interaction between teachers and students · Fuzzy sets · Fuzzy relationships · Fuzzy graphs · Membership function

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 98–107, 2021. https://doi.org/10.1007/978-3-030-80531-9_9

1 Introduction

The interaction of the subjects of the educational process is understood as the manifestation of individual ways of acting and communicating between the teacher and the students, directed at each other and determined by their functional roles and personal positions. The result of the interaction is mutual change in the activities, communication, and relationships of the participants in the educational process, as well as in their personal development. Like any purposeful process, the interaction between teachers and students should be managed, specially organized, and diagnosable, which means that it must have criteria and appropriate assessment methods. In what follows, communication between the teacher and the students is called subject-subject interaction, and the interplay of the teacher and the students with educational information is called subject-object cooperation. Educational information changes over time, which teachers should take into account when transmitting it to students.

With the development of information and communication technologies (ICT), the educational process has moved beyond the classroom and is now reflected on the Internet and in software solutions grouped under the name Learning Management System (LMS). In general, an LMS is an integrated platform for creating educational content, delivering training materials, and managing training [1]. The main components hosted in such software environments are lecture materials (including online lectures), teaching materials, workshops and tasks for laboratory work, and test tasks [2].

Before considering subject-subject interaction, it is necessary to investigate how effective transmission and perception of educational information are achieved. Fuzzy modeling can be used for this purpose. The sets in the model are the sets of the components listed above, together with the thematic content of the discipline, external investment in the distance learning process, external support for motivation, and external interference. Logical reasoning based on statistical information and on expert assessments of how the effectiveness of distance learning depends on internal and external factors makes it possible to establish links between the selected sets. Defining these connections, their strength, and their direction allows probable management actions and decisions regarding the transmission and perception of educational information to be formulated more clearly.

Thus, the purpose of this study is to create a model of subject-subject interaction between teachers and students in distance learning; working with this model makes it possible to determine measures that optimize and improve the effectiveness of the educational process. The developed model helps the teaching staff of higher educational institutions identify the advantages and disadvantages of teaching specific courses remotely.

Since assessments differ as to how necessary and how sufficient the development of the components of academic disciplines is, and since the resource availability of the educational process and the possible hindrances also vary, it seems appropriate to use the theory of fuzzy sets and fuzzy relations in the modeling.

2 Literature Review on the Research Issue

High-quality training of specialists requires detailed study and improvement not only of all stages of the learning process but also of teaching methods, together with knowledge of the psychology of the teacher and the trainee. These processes, as a whole and in their parts, remain insufficiently studied primarily because they belong to the class of weakly structured systems. In such systems, information about the parameters and variables is imprecise, so fuzzy numbers are used to describe them. Expert evaluation methods [3], as well as methods of fuzzy logic and soft computing, are well proven for the study of complex, weakly structured, difficult-to-formalize systems and processes. The fuzzy solution obtained for such problems makes it possible to take the incompleteness and inaccuracy of the source data into account from the outset.

The fuzzy modeling whose results are described in this article rests on the concept of a fuzzy set, defined in the well-known work of L.A. Zadeh, “Fuzzy sets” [4]. Arithmetic operations with fuzzy numbers are presented in [5–9] and other publications. Operations on fuzzy sets and fuzzy relations are discussed in detail in [8, 10–15] and other works.


The problem of optimizing and managing the interaction between teachers and students has been considered in many works, for example [1, 16, 17]. However, these studies did not take into account the ambiguity of personal perception of how effectively educational information is transmitted and perceived, or the variability of the influence of external and internal factors on the components of academic disciplines [18]. The main external factors are investment in the educational process and interference of various kinds, which can include, among other things, a lack of funding for the educational process and restrictions on the teacher’s freedom to choose the means and methods of conveying relevant learning information.

The review of publications and of the available tools for solving weakly structured problems establishes that fuzzy graphs are the most appropriate form for representing the models of transmission and perception of educational information and the models of interaction between teachers and students. Graphs are “topological” rather than “geometric” objects: they primarily express relationships between vertices rather than the relative positions of vertices and edges on a plane. Therefore, all properties of fuzzy relations are also valid for fuzzy graphs. Fuzzy graphs are used to represent models of systems in which the direction of fuzzy relationships between elements is essential. Such systems include fuzzy semantic networks, social schemes, fuzzy algorithms, and so on. Fuzzy graphs are useful because a connection between vertices in them can be not only definite but also conjectural.

3 Theoretical Aspects of the Study

Let us introduce the following notation:

– ED is the volume of the discipline (transmitted information), which varies in the interval [0%, 100%];
– A1 is the set of online lectures;
– A2 is the set of presentations of lecture material and methodological guidelines;
– A3 is the set of tasks for practical and (or) laboratory work;
– A4 is the set of test tasks;
– I1 is the amount of external investment in the distance learning process, which includes financial support for the technical, software and methodological provision of distance learning;
– H1 is the set of external obstacles that negatively affect the transmission of educational information;
– A5 = A1 ∪ A2 is the educational information transmitted by the teacher;
– A3* ⊆ A3 is the set of practical and (or) laboratory works performed;
– A4* ⊆ A4 is the set of completed test tasks;
– A6 is the set of motivating factors (internal incentives) to assimilate information;
– I2 is the set of external factors that support motivation;
– H2 is the set of possible obstacles to the perception of educational information.

Motivation to assimilate educational information increases the effectiveness of its perception. The performance of practical and laboratory work and of test tasks reflects the final result of transmitting educational information [19]. The sets I1, I2, H1, and H2 influence the sets that are the main components of distance learning. All the listed fuzzy sets are connected by fuzzy relations with the corresponding membership functions µRk ∈ [0, 1], k = 1, 2, 3:

$$
\begin{aligned}
R_1 &= \{((I_1, A_1), \mu_{R_1}(I_1, A_1)), \ldots, ((I_1, A_4), \mu_{R_1}(I_1, A_4))\},\\
R_2 &= \{((H_1, A_1), \mu_{R_2}(H_1, A_1)), \ldots, ((H_1, A_4), \mu_{R_2}(H_1, A_4))\},\\
R_3 &= \{((A_1, A_2), \mu_{R_3}(A_1, A_2)), \ldots, ((A_3, A_4), \mu_{R_3}(A_3, A_4))\}
\end{aligned}
\tag{1}
$$
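One simple way to hold the relations of Eq. (1) in a program is a dictionary that maps each ordered pair to its membership degree. The sketch below assumes this representation; the helper `make_relation` and every numeric degree are hypothetical placeholders for expert assessments that the paper does not list.

```python
from itertools import combinations

COMPONENTS = ["A1", "A2", "A3", "A4"]

def make_relation(pairs, degrees):
    """Build a fuzzy relation {(x, y): mu} and check that every mu lies in [0, 1]."""
    if len(pairs) != len(degrees):
        raise ValueError("one membership degree is required per pair")
    for pair, mu in zip(pairs, degrees):
        if not 0.0 <= mu <= 1.0:
            raise ValueError(f"membership degree out of range for {pair}: {mu}")
    return dict(zip(pairs, degrees))

# Hypothetical expert assessments (not taken from the paper).
R1 = make_relation([("I1", a) for a in COMPONENTS], [0.30, 0.20, 0.25, 0.10])
R2 = make_relation([("H1", a) for a in COMPONENTS], [0.05, 0.05, 0.10, 0.10])
# R3 runs over the ordered pairs (A1, A2), ..., (A3, A4).
R3 = make_relation(list(combinations(COMPONENTS, 2)), [0.8, 0.6, 0.5, 0.7, 0.6, 0.9])

print(R1[("I1", "A4")], len(R3))
```

Because the keys are ordered pairs, the same structure also serves as the edge list of a directed fuzzy graph.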

In the modeling, the linguistic variable “INFLUENCE” is introduced with three levels: “weak”, “moderate”, and “strong”. Each of these levels is a separate fuzzy variable. The domain of definition is the segment [0%, 100%]. It is assumed that for relation R1 experts assess the impact of investment (I1) on the level of development and relevance of components A1, A2, A3, A4; for relation R2 they assess the impact of interference (H1) on A1, A2, A3, A4; and for R3 the mutual impact of the components is determined from the following considerations:

– the teacher transfers educational information through lectures, presentations of educational material, and methodological instructions for practical and laboratory work (along with the tasks for these works). Test tasks are a form of interim control of how well students have learned the material. In general, the result of training is evaluated by how well students perform practical, laboratory, and test tasks [19];
– in turn, the level of success in completing tasks depends on the quality of the lecture and practical material and the links between them, the quality of teaching, the quality of the test tasks, and their correspondence to the lecture material and the workshop;
– in general, the quality of the learning process is judged by the results of the final tasks (an exam in the form of testing), but in distance learning it is advisable to also take the results of practical and laboratory work during the semester (trimester) into account when setting the final grade in the discipline;
– in general, components A1, A2, A3, and A4 form a system whose key feature is the continuity of educational material:

$$
A_2 = f(A_1), \quad A_3 = f(A_1, A_2), \quad A_4 = f(A_1, A_2, A_3).
\tag{2}
$$
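The linguistic variable “INFLUENCE” introduced above can be illustrated as follows. The paper does not specify the shape of the membership functions, so the triangular terms and their breakpoints (0, 50, 100) in this sketch are purely an assumption, as are the names `triangular`, `INFLUENCE` and `fuzzify`.

```python
def triangular(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    if x == b:
        return 1.0
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

# Assumed terms of the linguistic variable on the domain [0, 100] percent.
INFLUENCE = {
    "weak": (0, 0, 50),        # full membership at 0 %, none from 50 %
    "moderate": (0, 50, 100),  # peak at 50 %
    "strong": (50, 100, 100),  # none up to 50 %, full membership at 100 %
}

def fuzzify(percent):
    """Degree to which an influence of `percent` belongs to each level."""
    if not 0 <= percent <= 100:
        raise ValueError("influence must lie in [0, 100]")
    return {level: triangular(percent, *abc) for level, abc in INFLUENCE.items()}

print(fuzzify(30))
```

With these assumed terms, an influence of 30% is "weak" to degree 0.4 and "moderate" to degree 0.6; an expert's verbal assessment can be mapped back to a number in the same way.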

It is hypothesized that the potential of each component A depends on the initial amount of information in the discipline, on the strength of the influence of investment and interference, and on the potential of the input component. Under this hypothesis, the efficiencies (potentials) of A1, A2, A3, A4 are calculated by the formulas:

$$
\begin{aligned}
P_{A_1} &= \underbrace{\tilde R_0 + \tilde R_1}_{\tilde B_1} - \tilde R_2 = \max\{\mu_{\tilde B_1}(ED, I_1) - \mu_{\tilde R_2}(H_1, A_1),\ 0\},\\
\mu_{\tilde B_1}(ED, I_1) &= \mu_{\tilde R_0}(ED, A_1) + \mu_{\tilde R_1}(I_1, A_1) - \mu_{\tilde R_0}(ED, A_1) \cdot \mu_{\tilde R_1}(I_1, A_1)
\end{aligned}
\tag{3}
$$

$$
\begin{aligned}
P_{A_2} &= \underbrace{P_{A_1} + \tilde R_1}_{\tilde B_2} - \tilde R_2 = \max\{\mu_{\tilde B_2}(P_{A_1}, I_1) - \mu_{\tilde R_2}(H_1, A_2),\ 0\},\\
\mu_{\tilde B_2}(P_{A_1}, I_1) &= P_{A_1} + \mu_{\tilde R_1}(I_1, A_2) - P_{A_1} \cdot \mu_{\tilde R_1}(I_1, A_2)
\end{aligned}
\tag{4}
$$

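$$
\begin{aligned}
P_{A_3} &= \underbrace{\underbrace{P_{A_1} + P_{A_2}}_{\tilde B_3} + \tilde R_1}_{\tilde B_4} - \tilde R_2 = \max\{\mu_{\tilde B_4}(P_{A_1,A_2}, I_1) - \mu_{\tilde R_2}(H_1, A_3),\ 0\},\\
\mu_{\tilde B_3}(P_{A_1}, P_{A_2}) &= P_{A_1} + P_{A_2} - P_{A_1} \cdot P_{A_2},\\
\mu_{\tilde B_4}(P_{A_1,A_2}, I_1) &= \mu_{\tilde B_3}(P_{A_1}, P_{A_2}) + \mu_{\tilde R_1}(I_1, A_3) - \mu_{\tilde B_3}(P_{A_1}, P_{A_2}) \cdot \mu_{\tilde R_1}(I_1, A_3)
\end{aligned}
\tag{5}
$$

$$
\begin{aligned}
P_{A_4} &= \underbrace{\underbrace{\tilde B_3 + P_{A_3}}_{\tilde B_5} + \tilde R_1}_{\tilde B_6} - \tilde R_2 = \max\{\mu_{\tilde B_6}(P_{A_1,A_2,A_3}, I_1) - \mu_{\tilde R_2}(H_1, A_4),\ 0\},\\
\mu_{\tilde B_5}(\tilde B_3, P_{A_3}) &= P_{A_1 \oplus A_2} + P_{A_3} - P_{A_1 \oplus A_2} \cdot P_{A_3},\\
\mu_{\tilde B_6}(P_{A_1,A_2,A_3}, I_1) &= \mu_{\tilde B_5}(\tilde B_3, P_{A_3}) + \mu_{\tilde R_1}(I_1, A_4) - \mu_{\tilde B_5}(\tilde B_3, P_{A_3}) \cdot \mu_{\tilde R_1}(I_1, A_4)
\end{aligned}
\tag{6}
$$

The potentials of the components A3*, A4*, A5, A6 can be found using the formulas:

$$
P_{A_6} = \tilde R_1 - \tilde R_2 = \max\{\mu_{\tilde R_1}(I_2, A_6) - \mu_{\tilde R_2}(H_2, A_6),\ 0\}
\tag{7}
$$

$$
\begin{aligned}
P_{A_5} &= \underbrace{\underbrace{P_{A_1} + P_{A_2}}_{\tilde B_3} + P_{A_6}}_{\tilde B_7} - \tilde R_2 = \max\{\mu_{\tilde B_7}(\tilde B_3, P_{A_6}) - \mu_{\tilde R_2}(H_2, A_5),\ 0\},\\
\mu_{\tilde B_7}(\tilde B_3, P_{A_6}) &= P_{A_1 \oplus A_2} + P_{A_6} - P_{A_1 \oplus A_2} \cdot P_{A_6}
\end{aligned}
\tag{8}
$$

$$
\begin{aligned}
P_{A_3^*} &= \underbrace{P_{A_5} + P_{A_6}}_{\tilde B_8} - \tilde R_2 = \max\{\mu_{\tilde B_8}(P_{A_5}, P_{A_6}) - \mu_{\tilde R_2}(H_2, A_3^*),\ 0\},\\
\mu_{\tilde B_8}(P_{A_5}, P_{A_6}) &= P_{A_5} + P_{A_6} - P_{A_5} \cdot P_{A_6}
\end{aligned}
\tag{9}
$$

$$
\begin{aligned}
P_{A_4^*} &= (P_{A_5} + P_{A_6}) + P_{A_3^*} - \tilde R_2 = \underbrace{\tilde B_8 + P_{A_3^*}}_{\tilde B_9} - \tilde R_2 = \max\{\mu_{\tilde B_9}(\tilde B_8, P_{A_3^*}) - \mu_{\tilde R_2}(H_2, A_4^*),\ 0\},\\
\mu_{\tilde B_9}(\tilde B_8, P_{A_3^*}) &= P_{A_5 \oplus A_6} + P_{A_3^*} - P_{A_5 \oplus A_6} \cdot P_{A_3^*}
\end{aligned}
\tag{10}
$$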

Knowing the potentials of the components and the factors that affect their effectiveness, one can determine the coefficients of mutual influence of the elements. Management actions in this case consist in varying not only the external influences (which is often quite difficult) but also the internal coefficients of influence. In practice, this can mean improving the structure and content of the work program of the discipline, optimizing the set of methods and means of teaching it, and so on.
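To make the chain of formulas (3) and (4) concrete, here is a short worked example. The three membership degrees are hypothetical values chosen only for illustration: µR̃0(ED, A1) = 0.01, µR̃1(I1, A1) = µR̃1(I1, A2) = 0.10 and µR̃2(H1, A1) = µR̃2(H1, A2) = 0.05.

$$
\begin{aligned}
\mu_{\tilde B_1}(ED, I_1) &= 0.01 + 0.10 - 0.01 \cdot 0.10 = 0.109, & P_{A_1} &= \max\{0.109 - 0.05,\ 0\} = 0.059,\\
\mu_{\tilde B_2}(P_{A_1}, I_1) &= 0.059 + 0.10 - 0.059 \cdot 0.10 = 0.1531, & P_{A_2} &= \max\{0.1531 - 0.05,\ 0\} = 0.1031.
\end{aligned}
$$

The example shows how each potential feeds into the next one: the investment raises the potential through the sum a + b − ab, and the interference then lowers it through the bounded difference.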


4 A Cognitive Model of Interaction of the Teacher and Students

The cognitive model of interaction between teachers and students in distance learning is based on the models of transmission of educational information and of its perception (Fig. 1).

Fig. 1. Models of transmission – 1) and perception – 2) of information

4.1 The Model of Information Transmission

The purpose of the system analysis of the cognitive model of the operational transfer of educational information by a teacher in distance learning is to reveal the mechanism of information transfer and to understand how individual factors influence the effectiveness of the transfer. It is assumed that the teacher works offline, without feedback from the audience. The information transfer model is based on the following assumptions:

– the investments I1 should increase the overall information transfer rate, that is, the overall potential of the model. Active independent work of the teacher and a deep awareness of their intellectual activity should also help to increase the transfer rate of educational information;
– external interference H1 reduces the volume and quality of the transmitted information;
– the value of the total potential also depends on the initial amount of information already transmitted.

The total potential of the information transfer model is determined by the formula:

$$
K_{TI} = \left(P_{A_1} + P_{A_2} + P_{A_3} + P_{A_4} + \tilde R_1\right) - \tilde R_2
\tag{11}
$$
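The transmission model can be sketched in Python as follows. The sketch rests on two assumptions that the text does not state explicitly: every "+" between fuzzy quantities is read as the sum a + b − ab written out in Eq. (3), and every "−" as the bounded difference max(a − b, 0). The function names and the scalar arguments `r1` and `r2` of `total_potential` are hypothetical.

```python
from functools import reduce

def osum(a, b):
    """Sum of membership degrees a + b - a*b, as written out in Eq. (3)."""
    return a + b - a * b

def bdiff(a, b):
    """Bounded difference max(a - b, 0)."""
    return max(a - b, 0.0)

def transmission_potentials(mu_r0, mu_r1, mu_r2):
    """Eqs. (3)-(6); mu_r1[i] and mu_r2[i] are the influences of I1 and H1 on A(i+1)."""
    p1 = bdiff(osum(mu_r0, mu_r1[0]), mu_r2[0])  # Eq. (3)
    p2 = bdiff(osum(p1, mu_r1[1]), mu_r2[1])     # Eq. (4)
    b3 = osum(p1, p2)
    p3 = bdiff(osum(b3, mu_r1[2]), mu_r2[2])     # Eq. (5)
    b5 = osum(b3, p3)
    p4 = bdiff(osum(b5, mu_r1[3]), mu_r2[3])     # Eq. (6)
    return [p1, p2, p3, p4]

def total_potential(potentials, r1=0.0, r2=0.0):
    """Eq. (11) under the stated reading of '+' and '-' (an assumption)."""
    return bdiff(osum(reduce(osum, potentials), r1), r2)

# Start of the course (1 % of the information), no investment, no interference.
p = transmission_potentials(0.01, [0.0] * 4, [0.0] * 4)
k_ti = total_potential(p)
print([round(v, 6) for v in p], round(k_ti, 4))
```

For this setting the sketch gives a total potential that rounds to 0.0773, the value reported in Sect. 5 for the case without external motivation and threats. The other reported values depend on expert assessments that are not listed in the text, so they are not reproduced here.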


The coefficients of mutual influence of the components can be found from the system of equations:

$$
\begin{cases}
P_{A_1} = 1 - \left(\mu_{\tilde R_3(A_1,A_2) \oplus \tilde R_3(A_1,A_3)}(A_1, A_2, A_3) + \mu_{\tilde R_3}(A_1, A_4)\right),\\
P_{A_2} = \mu_{\tilde R_3}(A_1, A_2) - \mu_{\tilde R_3(A_2,A_3) \oplus \tilde R_3(A_2,A_4)}(A_2, A_3, A_4),\\
P_{A_3} = \mu_{\tilde R_3(A_1,A_3) \oplus \tilde R_3(A_2,A_3)}(A_1, A_2, A_3) - \mu_{\tilde R_3}(A_3, A_4),\\
P_{A_4} = \mu_{\tilde R_3(A_1,A_4) \oplus \tilde R_3(A_2,A_4)}(A_1, A_2, A_4) + \mu_{\tilde R_3}(A_3, A_4)
\end{cases}
\tag{12}
$$

4.2 Model of Perception of Information

The purpose of constructing and analyzing a cognitive model of the operational perception of educational information is to determine the mechanism of perception and the influence of individual factors on the effectiveness of the learning components. The same assumptions hold for the model of information perception as for the model of information transmission, except that they are considered from the students' side. The total potential of the information perception model is determined by the formula:

$$
K_{AI} = \left(P_{A_3^*} + P_{A_4^*} + P_{A_5} + P_{A_6}\right) - \tilde R_2
\tag{13}
$$

The coefficients of mutual influence of the components can be found from the system of equations:

$$
\begin{cases}
P_{A_3^*} = \mu_{\tilde R_3(A_5,A_3^*) \oplus \tilde R_3(A_6,A_3^*)}(A_5, A_6, A_3^*) - \mu_{\tilde R_3}(A_3^*, A_6),\\
P_{A_4^*} = \mu_{\tilde R_3(A_3^*,A_4^*) \oplus \tilde R_3(A_5,A_4^*)}(A_3^*, A_5, A_4^*) + \mu_{\tilde R_3}(A_6, A_4^*) - \mu_{\tilde R_3}(A_4^*, A_6),\\
P_{A_5} = \mu_{\tilde R_3}(A_6, A_5) - \left(\mu_{\tilde R_3(A_5,A_3^*) \oplus \tilde R_3(A_5,A_4^*)}(A_5, A_3^*, A_4^*) + \mu_{\tilde R_3}(A_5, A_6)\right),\\
P_{A_6} = \mu_{\tilde R_3(A_3^*,A_6) \oplus \tilde R_3(A_4^*,A_6)}(A_3^*, A_4^*, A_6) + \mu_{\tilde R_3}(A_5, A_6) - \left(\mu_{\tilde R_3(A_6,A_3^*) \oplus \tilde R_3(A_6,A_4^*)}(A_6, A_3^*, A_4^*) + \mu_{\tilde R_3}(A_6, A_5)\right)
\end{cases}
\tag{14}
$$
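The perception model of Eqs. (7)-(10) and (13) can be sketched in the same way as the transmission model. As before, reading "+" as a + b − ab and "−" as max(a − b, 0) is an assumption, and the function names, the dictionary keys and the scalar argument `r2` are hypothetical.

```python
from functools import reduce

def osum(a, b):
    """Sum of membership degrees a + b - a*b, as written out in Eq. (3)."""
    return a + b - a * b

def bdiff(a, b):
    """Bounded difference max(a - b, 0)."""
    return max(a - b, 0.0)

def perception_potentials(b3, mu_i2_a6, mu_h2):
    """Eqs. (7)-(10); b3 combines P_A1 and P_A2, mu_h2 holds the influence of H2 on each component."""
    p6 = bdiff(mu_i2_a6, mu_h2["A6"])          # Eq. (7)
    p5 = bdiff(osum(b3, p6), mu_h2["A5"])      # Eq. (8)
    b8 = osum(p5, p6)
    p3s = bdiff(b8, mu_h2["A3*"])              # Eq. (9)
    p4s = bdiff(osum(b8, p3s), mu_h2["A4*"])   # Eq. (10)
    return {"A3*": p3s, "A4*": p4s, "A5": p5, "A6": p6}

def total_perception(p, r2=0.0):
    """Eq. (13) under the stated reading of '+' and '-' (an assumption)."""
    return bdiff(reduce(osum, [p["A3*"], p["A4*"], p["A5"], p["A6"]]), r2)

# Weak transmission (P_A1 = P_A2 = 0.01), external motivation I2 = 0.1, no obstacles H2.
b3 = osum(0.01, 0.01)
no_obstacles = dict.fromkeys(["A6", "A5", "A3*", "A4*"], 0.0)
p = perception_potentials(b3, 0.1, no_obstacles)
k_ai = total_perception(p)
print({k: round(v, 6) for k, v in p.items()}, round(k_ai, 5))
```

With P_A1 = P_A2 = 0.01, I2 = 0.1 and H2 = 0, the result rounds to 0.60279, which agrees with the value given in Sect. 5 for a motivated student under weak information transfer.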

4.3 The Information Interaction Model

The information interaction model (Fig. 2) is formed by integrating the first two models. The integration can create additional closed loops, which either enhance or reduce the effect of both transmitting and perceiving educational information.


Fig. 2. Information interaction model

5 Practical Implementation and Results

The proposed models were implemented at the Department of Accounting and Finance of Tver State Technical University to optimize the educational process in a remote format. The modeling results lead to the following conclusions:

1) Even with small investments (a total influence of 0.145), but with no external negative factors at all, the overall potential for information transfer is high (KTI = 0.7234). Here µR̃0(ED, A1) = 0.01, which corresponds to the beginning of teaching the discipline (1% of the information transmitted). Further stimulation of remote data transmission with zero external threats raises the model's potential: an increase in total influence by 8.55% gives a 7.5% increase in potential;
2) With neither external motivation nor external threats, the efficiency of transmitting educational information is expected to be extremely low: for µR̃0(ED, A1) = 0.01 it is only 0.0773. If there is even one very small threat (H1 = 0.01), the potential of the model is 0.0294. When the influence of negative factors reaches 10%, the potential of the model is zero. However, a motivational component of the same magnitude raises the potential of the model to 0.3529;
3) Equal and equally high values of the influence of motivational (I1 = 0.99) and destructive (H1 = 0.99) factors give a zero value of the model's potential; such a learning process is therefore not useful;
4) Equal and equally low values of the influence of motivational (I1 = 0.18549) and destructive (H1 = 0.18549) factors give a positive value of the model's potential (KTI = 0.0556);
5) If the information transmission model is set by the parameters of item 1), and the information perception model has neither external interference nor motivating factors, the potential of the perception model is 0.67647. With a negative impact (H2 = 0.1) and zero external motivation, KAI = 0.51959. However, an increase in the motivational component by the same amount raises the potential of the information perception model by more than 20%;
6) With the potential of the information transfer model equal to 0.0773, but with I2 = 0.1 and H2 = 0, the potential of the information perception model is 0.60279. This confirms the hypothesis that a student motivated to study the discipline can master it even when information transfer is inefficient.

Further experiments also confirmed all the hypotheses put forward earlier.

6 Conclusions

The results of the presented models lead to the conclusion that the interaction between teachers and students in distance learning should be managed according to the following principles:

– ensuring profile differentiation and a practice-oriented focus in students’ education;
– providing internal motivation for learning;
– independent project activity of students;
– variability of the content of training, taking into account the needs, aptitudes, abilities, and cognitive interests of students, as well as the features of the industrial and economic environment and regional socio-cultural conditions;
– preparation of special corrective programs based on the results of preliminary control, with a statement of the main content of the training material;
– implementation of impersonal continuous current, mid-term and final control as feedback for assessing the degree of perception of educational material;
– formation of the communicative qualities of the student’s personality;
– combination of group and individual work, with particular attention to the development of the student’s personal style of activity;
– training in an environment of attention, care, and cooperation;
– formation of a creative personality that can solve problems in non-standard conditions and apply the acquired knowledge flexibly and independently in a variety of life situations;
– introduction of various forms of remote training (remote business games, Olympiad classes, collective mutual control of tasks).


Evaluation Method of Distributed Renewable Energy Access to Distribution Network Based on Variable Weight Theory

Peng Wang1(B), Chengliang Zhu2, Hongjian Wang2, Sen Wang2, and Hengrui Ma3

1 State Grid Hubei Electric Power Co., Ltd., Wuhan 430077, Hubei, China
2 State Grid Jiaxing Power Supply Company of State Grid Zhejiang Electric Power Co., Ltd., Jiaxing 314000, Zhejiang, China
3 Tus-Institute for Renewable Energy, Qinghai University, Xining 810016, Qinghai, China

Abstract. Distributed renewable energy is widely used on the load side because it is clean and efficient. If the scale of distributed photovoltaic access exceeds the actual carrying capacity of the distribution network, it leads to transformer and line overload, line voltage deviation beyond limits, excessive harmonics, protection failure and so on, which affects the safe and stable operation of the power grid. Therefore, this paper proposes a comprehensive evaluation method for the carrying capacity of a distribution network with distributed generation. The method first selects appropriate evaluation indexes to establish an index system for the comprehensive evaluation of the carrying capacity of the new distribution network. Then, the index weights are calculated by a multi-level fuzzy comprehensive evaluation algorithm based on the variable weight theory. Finally, a distribution line in a city is taken as an example to evaluate a distribution network with new loads and a high proportion of distributed generation. The simulation results show that the proposed method can evaluate the distributed photovoltaic carrying capacity, and its grade, for buses at all levels of the distribution network. The method is widely applicable and scalable, and can provide guidance for the healthy and orderly development of distributed renewable energy.

Keywords: Renewable energy · Distribution network · Carrying capacity · Variable weight theory · Evaluation method

1 Introduction

By 2030, the installed capacity of wind power and photovoltaic in Hubei will reach 10 million kW [1]. Up to now, the installed capacity of renewable energy in Hubei has reached 10.0534 million kW, breaking through the 10 million kW mark for the first time. The installed capacity of renewable energy ranks second in Central China, and the energy structure of Hubei Province is becoming greener and lower-carbon [2]. Renewable energy accounts for 13.16% of the total installed generation capacity. Wind power is 3.7135 million kW, solar energy is 5.4826 million kW, biomass is 0.8573 million kW, and hydropower is 36.7522 million kW, accounting for 4.86%, 7.18%, 1.12% and 49.66% of the total installed capacity respectively.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 108–121, 2021. https://doi.org/10.1007/978-3-030-80531-9_10

However, from the perspective of the construction cycle, the phenomenon of “Determine power supply construction according to the grid” exists in renewable energy construction projects, which affects the formulation and implementation of power grid planning that adapts to the development of renewable energy [3–5]. The construction cycle of a renewable energy project, which is built at a single point, is much shorter than that of the power grid. Because renewable energy planning and implementation currently lack rigidity, renewable energy is built beyond the planned scale; however, due to the limited carrying capacity of the power grid, wind, electricity and water will be abandoned (curtailed).

The existing research on comprehensive evaluation methods for distributed generation access to the distribution network mainly considers the impact of distributed generation on the distribution network in terms of reliability, security and economy, and establishes evaluation index systems for distribution networks with distributed generation [6–8]. On the reliability evaluation of distribution networks with distributed generation, experts and scholars have studied the role of distributed generation in the calculation of distribution network reliability indexes, reliability index calculation methods that consider the correlation between wind speed and load, and comprehensive evaluation methods for power quality with distributed generation. Based on this research, a comprehensive evaluation index system has been established from four aspects: economy, service quality, safety and environmental protection benefits.

Although the above evaluation index system fully considers the access of distributed generation, with the publication of “DL/T 2041–2019 Guidelines for Evaluation of Distributed Generation Access to Power Grid”, the original index system, which considers only distributed generation, can hardly meet the requirements of comprehensive evaluation of existing distribution networks with distributed generation. It is therefore urgent to establish an evaluation index system that takes the indexes of this guide into account, so as to accurately evaluate the carrying capacity of the distribution network.

2 Index System of Carrying Capacity of Distributed Renewable Energy Access Distribution Network

2.1 Refer to the Evaluation Index Proposed in DL/T 2041–2019

2.1.1 Thermal Stability Evaluation Index

Thermal stability assessment should be based on the grid operation mode, transmission and transformation equipment limits, load conditions, power generation conditions, distributed generation characteristics and other factors to calculate the reverse load rate λ. The reverse load rate λ shall be calculated according to formula (1).

\[ \lambda = \frac{P_D - P_L}{S_e} \times 100\% \tag{1} \]


Where PD is the output of distributed generation; PL is the equivalent power load at the same time, that is, the load minus the output of power sources other than distributed generation; Se is the actual operation limit of the transformer or line. The maximum value λmax of the reverse load rate λ in the evaluation period should be used as the evaluation index for thermal stability evaluation. Values of λ in special periods of the evaluation period that are caused by power grid load fluctuation, such as legal holidays, may be left out of consideration. The capacity Pm of new distributed generation in the assessment area shall be calculated according to formula (2):

\[ P_m = (1 - \lambda_{max}) \times S_e \times k_r \tag{2} \]

Where kr is the equipment operation margin coefficient, generally taken as 0.8.

2.1.2 Short Circuit Current Check

The short circuit current should be checked according to formula (3).

\[ I_{XZ} < I_m \tag{3} \]

Where IXZ is the short-circuit current of the system bus and Im is the allowable short-circuit current limit, which shall be selected as the minimum breaking current limit of the corresponding circuit breakers on all equipment connected to the bus and feeder.

2.1.3 Voltage Deviation Check

Voltage deviation check should be based on the principle of local reactive power balance and of the grid voltage not exceeding its limits after the distributed generation is connected. The maximum positive and negative voltage deviations of the assessment area shall be calculated from the maximum and minimum operating voltage of the power grid in the evaluation period, in combination with the voltage limits given in GB/T 12325, and are expressed as ΔUH and ΔUL respectively. According to the capacity of the distributed generation to be checked and the requirements of GB/T 33593, the maximum positive and negative voltage deviations of the area caused by the new distributed generation connection shall be calculated according to formula (4), and are expressed as δUH and δUL respectively.

\[ \delta U(\%) = \frac{R_L P_{max} + X_L Q_{max}}{U_N^2} \tag{4} \]

Where Qmax is the maximum positive and negative value of reactive power calculated according to the requirements of GB/T 33593 for the power factor of different types of distributed generation; UN is the rated voltage of the bus in the area; RL and XL are the resistance and reactance components of the grid impedance. The resistance component can be ignored in a high-voltage power grid. The voltage deviation should be checked according to formula (5).

\[ \Delta U_H > \delta U_H \quad \text{or} \quad \Delta U_L < \delta U_L \tag{5} \]
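The quantities of Sects. 2.1.1 and 2.1.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the sample values are hypothetical, all powers are assumed to share one unit, and formula (4) is evaluated exactly as written, so whether its result still needs a factor of 100 to read as a percentage depends on the units chosen, which the text does not state.

```python
def reverse_load_rate(p_dg, p_load, s_e):
    """Reverse load rate in percent, following formula (1)."""
    return (p_dg - p_load) / s_e * 100.0


def new_dg_capacity(lambda_max_pct, s_e, k_r=0.8):
    """Capacity P_m of new distributed generation, following formula (2).

    lambda_max_pct is given in percent, so it is converted to a fraction first.
    k_r defaults to 0.8, the margin coefficient quoted in the text.
    """
    return (1.0 - lambda_max_pct / 100.0) * s_e * k_r


def voltage_deviation(r_l, x_l, p_max, q_max, u_n):
    """Right-hand side of formula (4), returned exactly as written."""
    return (r_l * p_max + x_l * q_max) / u_n ** 2


# Hypothetical samples over an evaluation period (not data from the paper).
dg_output = [300.0, 800.0, 1200.0]       # P_D at three sampled times
equivalent_load = [900.0, 700.0, 600.0]  # P_L at the same times
s_e = 2000.0                             # operation limit of the transformer or line

rates = [reverse_load_rate(pd, pl, s_e)
         for pd, pl in zip(dg_output, equivalent_load)]
lambda_max = max(rates)                  # evaluation index for thermal stability
p_m = new_dg_capacity(lambda_max, s_e)
print(rates, lambda_max, p_m)
```

Taking the maximum over the sampled period mirrors the rule that λmax, not an average, is the thermal stability index.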


2.1.4 Harmonic Check

The harmonic current should be checked according to formula (6).

\[ I_{xz,h} < I_h \tag{6} \]

Where Ixz,h is the h-th harmonic current value and Ih is the h-th harmonic current limit specified in GB/T 14549.

2.2 Calculation Index of Safety and Reliability

2.2.1 “N − 1” Passing Rate of Medium Voltage Line

A medium voltage line passes the “N − 1” check if, under the maximum load operation mode, all the load of the line can be transferred to other lines for power supply after a switch outage occurs in the substation. The proportion of such lines reflects the load transfer capacity of the medium voltage lines under the maximum load operation mode. The calculation formula is as follows:

\[ L_{N-1} = \frac{n_t}{n_l} \times 100\% \tag{7} \]

Where nt is the number of switchable supply lines; nl is the total number of medium voltage lines in the area.

2.2.2 Short Circuit Capacity

After the distributed generation is connected to the grid, it will provide short-circuit current to the short-circuit point. When the distributed generation capacity reaches a certain degree, the over-current protection device will not work correctly. The short-circuit capacity is the short-circuit current calculated under the maximum operation mode (minimum impedance) multiplied by the short-circuit point voltage when a three-phase short circuit occurs in the feeder; the apparent power value is taken to reflect the response ability of the feeder to a fault. The calculation formula is as follows:

\[ S_{short} = \left( I_f^{(3)} + \sum_{i \in G} I_i^{(3)} \right) \times U_f \tag{8} \]

Where I_f^(3) is the short-circuit current under the maximum operation mode; I_i^(3) is the short-circuit current injected into the short-circuit point by the i-th distributed generation; and Uf is the voltage at the short-circuit point. For the power grid to safely cut off the fault point and ensure the normal power supply of other loads, the short-circuit capacity should be less than the breaking capacity of the feeder circuit breaker.

2.2.3 SAIFI, SAIDI and ASAI

According to the configuration and automation degree of the switch devices in the distribution network, the distribution network is divided into blocks according to the switch positions. If any component fault in the same block has the same impact on the load point, the equivalent failure rate λs and the average equivalent repair time γs of block s are as follows:

\[ \lambda_s = \sum_{i=1}^{m} \lambda_i \tag{9} \]

\[ \gamma_s = \frac{\sum_{i=1}^{m} \lambda_i \gamma_i}{\lambda_s} \tag{10} \]

Where λi is the failure rate of the i-th element of block s; m is the total number of components of block s; and γi is the repair time of the i-th element of block s. Then the average annual outage time of block s is:

\[ U_s = \gamma_s \lambda_s \tag{11} \]

The calculation formulas of SAIFI (System Average Interruption Frequency Index), SAIDI (System Average Interruption Duration Index) and ASAI (Average Service Availability Index) are as follows:

\[ \mathrm{SAIFI} = \frac{\sum \lambda_s N_s}{\sum N_s} \tag{12} \]

\[ \mathrm{SAIDI} = \frac{\sum U_s N_s}{\sum N_s} \tag{13} \]

\[ \mathrm{ASAI} = 1 - \frac{\sum U_s N_s}{8760 \sum N_s} \tag{14} \]

Where Ns is the number of users in block s.

2.3 Calculation Index of Operation Economy Index

2.3.1 Line Loss Rate

Line loss rate refers to the percentage of the active power loss of the feeder in the input power at the beginning of the feeder. The calculation formula of line loss rate is as follows:

\[ F_{loss} = \frac{\sum_{i \in L} I_i^2 r_i + \sum_{j \in T} I_j^2 r_j}{P_l^{max}} \times 100\% \tag{15} \]

Where L and T are the sets of branches and distribution transformers respectively; Ii is the current amplitude of branch i; Ij is the current amplitude of distribution transformer branch j; ri is the resistance of branch i; rj is the resistance of transformer branch j; Plmax is the power supply value of the line.
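The block-based reliability indexes of Sect. 2.2.3 can be sketched as follows. The `Block` container, the function names and the two sample blocks are hypothetical and serve only to show how formulas (9)–(14) chain together; failure rates are assumed to be in failures per year and repair times in hours, consistent with the 8760 h in formula (14).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Block:
    failure_rates: Sequence[float]  # lambda_i of each element, failures/year
    repair_times: Sequence[float]   # gamma_i of each element, hours/failure
    customers: int                  # N_s, number of users in the block


def block_equivalents(block):
    """Return (lambda_s, gamma_s, U_s) of one block, formulas (9)-(11)."""
    lam_s = sum(block.failure_rates)
    weighted = sum(l * g for l, g in zip(block.failure_rates, block.repair_times))
    gamma_s = weighted / lam_s
    u_s = gamma_s * lam_s
    return lam_s, gamma_s, u_s


def reliability_indices(blocks):
    """Return (SAIFI, SAIDI, ASAI) over all blocks, formulas (12)-(14)."""
    total_customers = sum(b.customers for b in blocks)
    saifi_sum = 0.0
    saidi_sum = 0.0
    for b in blocks:
        lam_s, _, u_s = block_equivalents(b)
        saifi_sum += lam_s * b.customers
        saidi_sum += u_s * b.customers
    saifi = saifi_sum / total_customers
    saidi = saidi_sum / total_customers
    asai = 1.0 - saidi_sum / (8760.0 * total_customers)
    return saifi, saidi, asai


# Two hypothetical blocks (not data from the paper).
blocks = [
    Block(failure_rates=[0.1, 0.2], repair_times=[2.0, 4.0], customers=100),
    Block(failure_rates=[0.05], repair_times=[6.0], customers=300),
]
saifi, saidi, asai = reliability_indices(blocks)
print(saifi, saidi, asai)
```

Keeping the per-block step separate from the system-level aggregation follows the structure of the text, where each block is first reduced to an equivalent element.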


2.3.2 Maximum Load Rate of Line and Distribution Transformer

The maximum load rate of lines and distribution transformers refers to the ratio of the maximum load of lines and distribution transformers to the maximum transmission active power of lines and distribution transformers. The calculation formula is as follows:

\[ F_{load} = \frac{\sum_{i=1}^{n} \left( P_i^{max} / \cos\varphi_i \right)}{S} \times 100\% \tag{16} \]

Where Pimax is the maximum value of the i-th load; cos ϕi is the power factor of the i-th load; n is the total number of loads on the line or transformer; S is the maximum transmission active power allowed by the line and distribution transformer.

2.3.3 Average Load Rate of Line and Distribution Transformer

The average load rate of lines and distribution transformers refers to the average value of the load rates of feeders and distribution transformers during the year. The calculation formula is as follows:

\[ F_{uti} = \frac{P_{ave}}{P_{max}} \times 100\% \tag{17} \]

Where Pave is the average load.
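The three operation economy indexes of Sect. 2.3 reduce to short ratio calculations. The sketch below is an illustration only: the function names and numbers are hypothetical, the loss term follows formula (15) literally (no phase factor is added), and currents, resistances and powers are assumed to be in mutually consistent units.

```python
def line_loss_rate(branch_i, branch_r, xfmr_i, xfmr_r, p_supply):
    """Line loss rate in percent, following formula (15)."""
    branch_loss = sum(i * i * r for i, r in zip(branch_i, branch_r))
    xfmr_loss = sum(i * i * r for i, r in zip(xfmr_i, xfmr_r))
    return (branch_loss + xfmr_loss) / p_supply * 100.0


def max_load_rate(p_max, cos_phi, s_limit):
    """Maximum load rate in percent, following formula (16)."""
    apparent = sum(p / c for p, c in zip(p_max, cos_phi))
    return apparent / s_limit * 100.0


def average_load_rate(p_ave, p_max):
    """Average load rate in percent, following formula (17)."""
    return p_ave / p_max * 100.0


# Hypothetical values (A, ohm, W for the loss example; kW and kVA below).
f_loss = line_loss_rate([10.0, 20.0], [0.5, 0.2], [5.0], [2.0], 9000.0)
f_load = max_load_rate([400.0, 300.0], [0.8, 0.75], 1000.0)
f_uti = average_load_rate(600.0, 800.0)
print(f_loss, f_load, f_uti)
```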

3 Fuzzy Comprehensive Evaluation Method Based on Variable Weight Theory

3.1 Variable Weight Theory

The variable weight theory addresses the problem of index scores that deviate from the normal value: for an index whose score is too high or too low, the influence of this abnormal index on the evaluation result is reduced by reducing its weight:

\[ w'_j = \frac{w_j x_j^{T-1}}{\sum_{l=1}^{m} w_l x_l^{T-1}} \tag{18} \]

In the formula, wj and w'j are the weights before and after the weight change; xj is the score of index j; m is the number of indicators; T ∈ (0, 1] is the equilibrium coefficient. This project considers that the mean value of all index scores is closest to the normal value. The more an index deviates from the mean value, the closer it is to an abnormal value, and the smaller the equilibrium coefficient should be. The closer an index is to the mean value, the more normal it is, and the larger the equilibrium coefficient should be. However, existing evaluations based on the variable weight theory do not consider how the equilibrium coefficient should change when an index deviates from the normal value to different degrees, which affects the evaluation effect.
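A minimal sketch of the weight update in formula (18) is given below. It assumes a single equilibrium coefficient T shared by all indexes and a denominator that includes the original weights, so that the updated weights sum to 1; the function name and the sample numbers are hypothetical, and scores must be strictly positive.

```python
def variable_weights(weights, scores, t):
    """Apply the variable weight update to constant weights.

    weights: original weights w_j
    scores:  index scores x_j (must be > 0)
    t:       equilibrium coefficient T in (0, 1]
    """
    if not 0.0 < t <= 1.0:
        raise ValueError("T must lie in (0, 1]")
    terms = [w * x ** (t - 1.0) for w, x in zip(weights, scores)]
    total = sum(terms)
    return [v / total for v in terms]


# Hypothetical example with two equally weighted indexes.
base = [0.5, 0.5]
scores = [100.0, 25.0]
print(variable_weights(base, scores, 1.0))  # T = 1 leaves the weights unchanged
print(variable_weights(base, scores, 0.5))
```

With T = 1 the factor x^(T−1) equals 1 for every index, so the constant weights are recovered; for T < 1 the factor is larger for lower scores, which is what shifts the weights.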


Therefore, for the score xij of index j of object Ai, the equilibrium coefficient Tij used in this project is calculated as follows:

(19)

It should be noted that the above formula only guarantees qualitatively that the more xij deviates from the mean value of each index, the closer Tij is to 0; conversely, the closer xij is to the mean value of each index, the closer Tij is to 1. In other words, although the above formula gives a calculated value of Tij, it is not a quantitative value and can only represent a trend.

3.2 Fuzzy Comprehensive Evaluation Method Based on Variable Weight Theory

The steps of single-level fuzzy comprehensive evaluation based on the variable weight theory are as follows:

(1) Determine the set of alternative objects (schemes), that is, the operation and planning objects (schemes) of the existing power grid. Different schemes can be distinguished by the penetration of distributed generation. The set is represented by A = {A1, A2, ..., Ai, ..., An}, where n is the number of objects (schemes);

(2) Determine the index set, which consists of the evaluation indexes of Sect. 2. The indexes are represented by X = {X1, X2, ..., Xj, ..., Xm}, where m is the number of evaluation indexes;

(3) According to the calculation method of each index in Sect. 2, all indexes of the object (scheme) are calculated, and each evaluation index is made dimensionless by the efficacy coefficient method. The formula is as follows:

\[ x_{ij} = c + \frac{x'_{ij} - m_j}{M_j - m_j}\, d \tag{20} \]

In the formula, xij is the score of index j of object (scheme) i; x'ij is the actual calculated value of index j of object (scheme) i; Mj and mj are the satisfactory value and the disallowed value of index j respectively; c and d are constants, usually taken as c = 60 and d = 40;

(4) Determine the weight of each index


According to the established index system, the factors of the index layer are compared in pairs, and the specific values are obtained by using the method in Table 1. For the case of n factors, the comparison matrix A is formed as follows:

\[ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix} \tag{21} \]

Where aij is the importance of factor i relative to factor j, and its value reference is shown in Table 1.

Table 1. The scaling meaning of comparison matrix A

| Standard metric | Meaning |
|---|---|
| 1 | Index ai is equally important to aj |
| 3 | Index ai is slightly more important than aj |
| 5 | Index ai is more important than aj |
| 7 | Index ai is much more important than aj |
| 9 | Index ai is absolutely more important than aj |
| Reciprocal of scale | If the ratio of importance of indicator i to indicator j is aij, then the ratio of importance of indicator j to indicator i is aji = 1/aij |

Calculate the index weight according to the geometric average method (root method), as shown in Eq. (22):

\[ W_i = \frac{\left( \prod_{j=1}^{n} a_{ij} \right)^{1/n}}{\sum_{k=1}^{n} \left( \prod_{j=1}^{n} a_{kj} \right)^{1/n}} \quad (i, j = 1, 2, \cdots, n) \tag{22} \]

Then, according to the variable weight theory in Sect. 3.1, the weight of each index of the object (scheme) is redefined, which is represented by Wi = {wi1, wi2, ..., wim};

(5) Calculate the comprehensive score of the scheme

The comprehensive score of object (scheme) i is obtained by multiplying the weight of each index calculated in step (4) by the score of that index calculated by the efficacy coefficient method in step (3), and summing over all indexes. The calculation formula is as follows:

\[ P_i = \sum_{j=1}^{m} w_{ij} x_{ij} \tag{23} \]


(6) Determine the fuzzy comprehensive evaluation result based on the principle of maximum membership degree

The evaluation grade of the result is determined by a percentage scoring system, and the project is evaluated on five grades, as shown in Table 2.

Table 2. Evaluation grades and score ranges

| Level | 1 (Bad) | 2 (Poor) | 3 (Moderate) | 4 (Good) | 5 (Excellent) |
|---|---|---|---|---|---|
| Score | [0, 20] | [20, 40] | [40, 60] | [60, 80] | [80, 100] |
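Steps (3) to (6) above can be sketched end to end as follows. This is an illustration under assumptions rather than the authors' implementation: the function names and sample values are hypothetical, the variable weight adjustment of step (4) is left out so that constant weights are used, and because the score ranges in Table 2 share their end points, a boundary score is assigned to the higher grade.

```python
import math


def efficacy_score(actual, disallowed, satisfactory, c=60.0, d=40.0):
    """Dimensionless score of one index, following formula (20)."""
    return c + (actual - disallowed) / (satisfactory - disallowed) * d


def geometric_mean_weights(matrix):
    """Index weights from a pairwise comparison matrix, following Eq. (22)."""
    n = len(matrix)
    roots = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(roots)
    return [r / total for r in roots]


def comprehensive_score(weights, scores):
    """Weighted sum of index scores, following formula (23)."""
    return sum(w * x for w, x in zip(weights, scores))


def grade(score):
    """Map a score to the five grades of Table 2 (boundary rule is assumed)."""
    for lower, name in [(80.0, "Excellent"), (60.0, "Good"),
                        (40.0, "Moderate"), (20.0, "Poor")]:
        if score >= lower:
            return name
    return "Bad"


# Hypothetical two-index example: index 1 is slightly more important (scale 3).
comparison = [[1.0, 3.0],
              [1.0 / 3.0, 1.0]]
weights = geometric_mean_weights(comparison)

scores = [
    efficacy_score(2.6, disallowed=5.2, satisfactory=0.0),    # a loss-type index
    efficacy_score(100.0, disallowed=0.0, satisfactory=100.0),
]
total = comprehensive_score(weights, scores)
print(weights, scores, total, grade(total))
```

Note that with c = 60 and d = 40 an index sitting exactly at its disallowed value still scores 60, and one at its satisfactory value scores 100.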

4 Theoretical Example Analysis

This chapter takes the Suizhou 220 kV transmission line as the basic information for the simulation calculation and, combined with literature data, analyzes the carrying capacity of a distribution network with distributed renewable energy access, considering the safety and reliability evaluation indexes and the operation economy indexes under different renewable energy penetration levels.

4.1 Basic Information of Calculation Example

The distribution network of a certain distribution line in a city is composed of 10 nodes, and the load parameters of each node are shown in the table. The electric vehicle charging load and electric heating load of the line can be obtained from “The Special Planning for Electric Vehicle Charging Infrastructure Construction” and “The Clean Heating Implementation Plan” of the city, as shown in Table 3. Combined with the conventional load, the annual total load of the Beijing Guangzhou line from 2020 to 2022 can be predicted, as shown in Table 4.

Table 3. Electric vehicle charging load and electric heating load planning (kW, kvar)

| Load nodes | 2020 Electric vehicle | 2020 Electric heating | 2021 Electric vehicle | 2021 Electric heating | 2022 Electric vehicle | 2022 Electric heating |
|---|---|---|---|---|---|---|
| Clothing factory | 200 | 160 | 280 | 300 | 320 | 300 |
| Food factory | 120 | 0 | 160 | 100 | 200 | 100 |
| Residential area | 0 | 0 | 0 | 0 | 120 | 120 |
| Public institutions | 80 | 0 | 0 | 0 | 40 | 0 |
| Industrial park 1 | 160 | 150 | 200 | 150 | 200 | 150 |
| Industrial park 2 | 120 | 320 | 160 | 320 | 200 | 320 |

Table 4. Total load forecast result (unit: kW, kvar)

| Load nodes | 2020 Electric vehicle | 2020 Electric heating | 2021 Electric vehicle | 2021 Electric heating | 2022 Electric vehicle | 2022 Electric heating |
|---|---|---|---|---|---|---|
| Clothing factory | 874.2 | 172.4 | 1299.9 | 247.0 | 1791.1 | 322.6 |
| Food factory | 898.2 | 179.4 | 540.7 | 102.8 | 613.5 | 112.9 |
| Residential area | 421.9 | 83.1 | 634.9 | 120.7 | 876.1 | 157.8 |
| Public institution | 108.5 | 21.7 | 61.6 | 11.7 | 68.3 | 12.6 |
| Industrial 1 | 872.7 | 173.6 | 796.8 | 151.4 | 1015.0 | 184.3 |
| Industrial 2 | 830.3 | 165.4 | 703.3 | 133.6 | 881.2 | 160.4 |

4.2 Index Calculation Results

4.2.1 Refer to the Evaluation Index Calculation Results Proposed in DL/T 2041–2019

4.2.1.1 Thermal Stability Evaluation Index Score

When λ ≤ 0, take score as 100;
When 0 < λ ≤ 80%, take score as (0.8 − λ)/0.8 × 100;
When λ > 80%, take score as 0.


4.2.1.2 Short Circuit Current Check Index Score

When IXZ < Im, take score as (Im − IXZ)/Im × 100;
When IXZ > Im, take score as 0.

4.2.1.3 Voltage Deviation Check Index Score

When ΔUH > δUH, take score as (ΔUH − δUH)/ΔUH × 100;
When ΔUL < δUL, take score as (δUL − ΔUL)/ΔUL × 100.
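The piecewise scoring rules of Sects. 4.2.1.1 and 4.2.1.2 can be written as two small functions. The sketch below is only an illustration: the function names are hypothetical, λ is passed as a fraction (0.8 for 80%), and the case IXZ = Im, which the text does not cover, is assumed to score 0.

```python
def thermal_stability_score(lam):
    """Score of the thermal stability index; lam is the reverse load rate as a fraction."""
    if lam <= 0.0:
        return 100.0
    if lam <= 0.8:
        return (0.8 - lam) / 0.8 * 100.0
    return 0.0


def short_circuit_score(i_xz, i_m):
    """Score of the short circuit current check.

    i_xz is the bus short-circuit current and i_m the allowable limit.
    Equality is treated as a failed check (assumption, not stated in the text).
    """
    if i_xz < i_m:
        return (i_m - i_xz) / i_m * 100.0
    return 0.0


# Hypothetical inputs (not data from the paper).
print(thermal_stability_score(-0.1))
print(thermal_stability_score(0.2))
print(thermal_stability_score(0.9))
print(short_circuit_score(15.0, 20.0))
```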

4.2.1.4 Harmonic Check Index Score

Take 100 for pass and 0 for no pass.

4.2.2 Scoring Results of Each Index

The satisfactory values and disallowed values of each index for the efficacy coefficient method are given below:

(1) For the N − 1 passing rate, the satisfactory value is 100 and the disallowed value is the minimum value, 0;
(2) For the N − 1 passing rate, the satisfactory value is 100 and the disallowed value is the minimum value, 0;
(3) For the reliability indexes, the allowable and satisfactory values are obtained from the literature [9–11]. The allowable value is slightly higher than the average value of the domestic urban reliability indexes, and the satisfactory values are the highest limit values of each index;
(4) For the maximum load rate and the average annual load rate, the satisfactory value is 100% and the minimum limit value is 0;
(5) For the line loss rate, the satisfactory value is the minimum limit value, 0, and the disallowed value is the national average line loss rate, 5.2%.

The scoring results of each index under different penetration levels are shown in Tables 5 and 6.

Table 5. Scoring results of each index under different permeability

| Evaluation index | Distributed generation penetration 0% | 15% | 30% |
|---|---|---|---|
| N − 1 pass rate | 100 | 100 | 100 |
| Short circuit capacity | 79.09 | 77.12 | 76.24 |
| Average outage frequency | 94.15 | 95.12 | 95.44 |
| Average outage duration | 82.13 | 83.14 | 83.65 |
| Average power supply availability | 99.84 | 99.85 | 99.87 |
| Line loss rate | 95.15 | 96.21 | 97.01 |
| Maximum line load rate | 84.40 | 84.15 | 83.14 |
| Annual average load rate of line | 82.42 | 80.12 | 78.37 |
| Maximum load rate of distribution transformer | 79.17 | 78.23 | 77.61 |
| Annual average load rate of distribution transformer | 85.23 | 83.15 | 81.14 |

Table 6. Refer to the evaluation index score results proposed in “DL/T 2041–2019”

| Distributed power penetration | Thermal stability assessment | Short circuit current check | Voltage deviation check | Harmonic check index |
|---|---|---|---|---|
| 0% | 100 | 82.63 | 87.85 | 100 |
| 15% | 100 | 82.63 | 87.85 | 100 |
| 30% | 100 | 82.63 | 87.85 | 100 |

4.3 Index Weight Calculation

The comprehensive score is obtained as shown in Table 7.

Table 7. Comprehensive score/grade

| Scoring items | Distributed generation penetration 0% | 15% | 30% |
|---|---|---|---|
| DL/T 2041–2019 | 90.72 | 90.72 | 90.72 |
| Safety and reliability | 94.23 | 91.32 | 90.21 |
| Operation economic | 86.29 | 85.27 | 84.21 |


5 Conclusions

The evaluation indexes of distribution network carrying capacity proposed in this paper describe the safety and reliability, power quality, operation economy and flexibility of the distribution network well, and can be used for quantitative calculation. This paper takes the evaluation criteria of DL/T 2041–2019, safety and reliability, and operation economy as the carrying capacity evaluation standards, and builds a carrying capacity index system for distributed renewable energy access to the distribution network that adapts to the evaluation requirements of different renewable energy penetration levels. The variable weight problem of the index weights is also addressed. When selecting and evaluating the indexes of distribution network carrying capacity, many indicators pose a multi-attribute problem that is difficult to weight directly. Moreover, the indicators are considered against different backgrounds, so subjective influence on the evaluation results is difficult to avoid. In current research the index weight is constant, and once the index weight changes, the uncertainty of the result increases. The evaluation method of distributed renewable energy access distribution network carrying capacity based on the variable weight theory fills this gap in current research to a certain extent. In the proposed method, the calculation weight of indicators is not constant, but varies according to the relationship between indicators or the needs of the actual situation.

Acknowledgements. Project supported by Youth Program of National Natural Science Foundation of China (51907096); Natural Science Foundation of Qinghai Province (2019-ZJ-950Q).

References

1. Fan, J.L., Wang, J.X., Hu, J.W., et al.: Optimization of China’s provincial renewable energy installation plan for the 13th five-year plan based on renewable portfolio standards. Appl. Energy 254 (2019)
2. Amighini, A.: China’s Race to Global Technology Leadership. Ledizioni-LediPublishing (2019)
3. Irina, I.: Significant factors affecting the selection of rational options for power supply in an off-grid zone. In: E3S Web of Conferences. EDP Sciences, vol. 77, p. 02006 (2019)
4. Abdin, Z., Mérida, W.: Hybrid energy systems for off-grid power supply and hydrogen production based on renewable energy: a techno-economic analysis. Energy Convers. Manage. 196, 1068–1079 (2019)
5. Che, Y., Jia, J., Zhao, Y., et al.: Vulnerability assessment of urban power grid based on combination evaluation. Saf. Sci. 113, 144–153 (2019)
6. Lei, D., Yang, Y., Hu, M., et al.: Solutions for the distributed photovoltaic access distribution network relay protection. In: 2020 5th Asia Conference on Power and Electrical Engineering (ACPEE), pp. 1492–1496. IEEE (2020)
7. Razavi, S.E., Rahimi, E., Javadi, M.S., et al.: Impact of distributed generation on protection and voltage regulation of distribution systems: a review. Renew. Sustain. Energy Rev. 105, 157–167 (2019)
8. Ehsan, A., Yang, Q.: State-of-the-art techniques for modelling of uncertainties in active distribution network planning: a review. Appl. Energy 239, 1509–1523 (2019)
9. Wenxia, L., Min, Z., Jianhua, Z., et al.: Reliability modeling and quantitative analysis of distribution network considering electric vehicle charging and discharging. In: Proceedings of the CSU-EPSA, vol. 25, no. 4, pp. 1–6 (2013)
10. Shojaabadi, S., Abapour, S., Abapour, M., et al.: Optimal planning of plug-in hybrid electric vehicle charging station in distribution network considering demand response programs and uncertainties. IET Gener. Transm. Distrib. 10(13), 3330–3340 (2016)
11. Zhang, Q., Zhu, Y., Wang, Z., et al.: Reliability assessment of distribution network and electric vehicle considering quasi-dynamic traffic flow and vehicle-to-grid. IEEE Access 7, 131201–131213 (2019)

Research on Realization Technology of Arc Grounding Fault on Distribution Network on Field Test Data

Xiaoyong Yu(B), Lifang Wu, Weixiang Huang, Shaonan Chen, and Liqun Yin

Electric Power Research Institute, Guangxi Power Grid Corporation, Nanning 530023, China

Abstract. Based on the distribution network practical-test-platform, the research work on the realization technology of arc grounding fault in distribution network can be carried out. In this paper, the arc grounding fault model is established, the development process of intermittent arc grounding fault is described in detail, and an arc grounding fault simulation device and a simulation flow are designed. The simulation example of the distribution network practical-test-platform is built on the PSCAD software, and the intermittent arc grounding fault of the neutral ungrounded system and the low resistance grounding system is simulated and analyzed. Keywords: Distribution network practical-test-platform · Arc grounding fault · Neutral ungrounded system · Low resistance grounding system

1 Introduction

The distribution network is mainly composed of distribution lines, transformers and loads, and its neutral point is mainly grounded through a low resistance. Arc grounding fault is a common fault type in distribution networks. When a line falls to the ground or is touched by a branch, an arc grounding fault is easily produced, which seriously threatens people’s lives and property and the safety of equipment [1, 2]. Research data show that the probability of intermittent arc grounding faults in neutral ungrounded power grids is very high, accounting for about 60% of all grounding faults. An intermittent arc grounding fault is the repeated process in which the grounding arc is extinguished and then reignited, with the arc voltage increasing each time [3, 4]. This is because the free charge accumulated on the phase-to-ground capacitance of the sound phases keeps increasing and the displacement voltage gradually rises, so the transient voltage increases with the number of grounding arc re-ignitions. This kind of over-voltage exists for a long time and poses a great threat to power equipment with weak insulation [5–7]. The possibility of arc grounding over-voltage in the low resistance grounding system can be analyzed from three aspects: the composition of feeders in the distribution network, the grounding fault current and the fault removal time [8, 9].

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 122–132, 2021. https://doi.org/10.1007/978-3-030-80531-9_11

Research on Realization Technology of Arc Grounding Fault


The main work of this paper is as follows. An arc grounding fault model is established, the development process of the intermittent arc grounding fault is described in detail, and an arc grounding fault simulation device and a simulation flow are designed. A simulation example of the distribution network practical-test-platform is built in PSCAD, and the intermittent arc grounding fault of the neutral ungrounded system and of the low resistance grounding system is simulated and analyzed.

2 Distribution Network Practical-Test-Platform Based on the distribution network practical-test-platform shown in Fig. 1, the research work of arc grounding fault realization of distribution network can be carried out.

Fig. 1. Schematic diagram of distribution network practical-test-platform

The distribution network practical-test-platform is equipped with cable simulator and overhead line simulator, which can simulate cables and overhead lines with different lengths and parameters on the premise of ensuring no obvious distortion of line transient characteristics. The distribution network practical-test-platform can simulate the neutral ungrounded system, the low resistance grounding system and so on. The distribution network practical-test-platform can simulate arc grounding fault, single-phase short-circuit fault, two-phase short-circuit fault and other faults.

3 Modeling Analysis of Intermittent Arc Grounding Fault

3.1 Arc Grounding Fault Model

The dynamic arc model based on energy balance theory describes the arc as a cylindrical gas channel, as shown in Fig. 2.


X. Yu et al.

Fig. 2. Schematic diagram of gas channel (labels: arc current ia, arc voltage ua, scattering energy)

Therefore, the general form of the arc dynamic model equation is

$$\frac{i_a}{u_a} = \frac{1}{r_a} = F(Q) = F\left(\int (P - P_0)\,dt\right) \tag{1}$$

where $i_a$ is the dynamic arc current, $u_a$ is the dynamic arc voltage, $r_a$ is the dynamic arc resistance, $P$ is the input power of the arc, $P_0$ is the output power of the arc, and $Q$ is the energy accumulated in the arc. $Q$ is related to the temperature and ionization degree of the arc, as shown below.

$$\frac{dQ}{dt} = P - P_0 \tag{2}$$

The dynamic arc equation of Mayr is

$$\frac{d\ln g}{dt} = \frac{1}{\theta}\left(\frac{P}{P_0} - 1\right) = \frac{1}{\theta}\left(\frac{E\,i_a}{P_0} - 1\right) \tag{3}$$

where $g$ is the arc conductance, $\theta$ is the time constant of the arc, and $E$ is the arc column voltage gradient.

3.2 Development Process of Intermittent Arc Grounding Fault

A characteristic of the arc burning process is that the arc current passes through zero every half cycle. When the arc current passes through zero, the input power of the arc is zero and the arc temperature drops, which favors arc extinction. For a period before and after the natural current zero, the arc resistance becomes so large that it is the main factor limiting the current. Therefore, at the end of one half cycle and the beginning of the next, the arc current equals the ratio of arc voltage to arc resistance, and it is limited so strongly by the arc resistance that it is practically zero. Although the arc current actually crosses zero only at a single instant, it is approximately zero throughout this short period before and after the natural zero crossing, and this period is called the zero rest time of the arc current. The zero rest time depends on the arc current process on the one hand and on the circuit conditions (voltage, current and circuit constants) on the other. It usually lasts from a few to tens of microseconds. The re-ignition and extinction of the arc are closely related to the zero crossing of the arc current. During the zero rest time, the arc gap gradually changes from a conductor to an insulating medium, and the extinction of the arc mainly depends on this process.
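The Mayr equation (3) of Section 3.1 can be explored numerically. The following is a minimal sketch, not the model used in the paper: it assumes the input power is P = g·u² (from i = g·u), holds P constant over each time step, and uses hypothetical values for the time constant, the dissipated power and the step size.

```python
import math

def mayr_step(g, u, theta, p0, dt):
    """Advance the arc conductance g by one step dt using equation (3).

    d(ln g)/dt = (1/theta) * (P/P0 - 1), with input power P = g * u**2.
    P is held constant over the step, so ln g changes linearly in time.
    """
    p_in = g * u * u
    return g * math.exp(dt / theta * (p_in / p0 - 1.0))

def simulate(u_of_t, g0, theta, p0, dt, steps):
    """Return the conductance trajectory for a prescribed arc voltage u_of_t(t)."""
    g = g0
    trajectory = [g]
    for k in range(steps):
        g = mayr_step(g, u_of_t(k * dt), theta, p0, dt)
        trajectory.append(g)
    return trajectory

# Hypothetical parameters, chosen only for illustration.
THETA = 1e-4   # arc time constant, s
P0 = 1e3       # dissipated power, W
DT = 1e-6      # time step, s

# With zero arc voltage the input power is zero and g decays as exp(-t/theta).
decay = simulate(lambda t: 0.0, g0=1.0, theta=THETA, p0=P0, dt=DT, steps=100)
```

With zero input power the conductance falls by a factor of e over one time constant, which mirrors the description in Section 3.2 of the arc gap turning from a conductor into an insulating medium around the current zero.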

4 Realization Technology of Arc Grounding Fault

4.1 Arc Grounding Fault Simulation Device

The schematic diagram of the arc grounding simulation device is shown in Fig. 3.

Fig. 3. The schematic diagram of the arc grounding simulation device (labels: line, PT, fast switch, M, d, CPU)

Each insulator is connected with a graphite rod, one of which is connected to the line of phase A through the fast switch K, and the other is connected to the ground through a wire. The two graphite rods are moved toward each other under the control of the stepper motor M. When the breakdown gap is reached, an arc is formed between the two graphite tips. The length of the arc channel depends on the distance between the two graphite tips. Continuous and intermittent arc grounding faults can be simulated by properly adjusting the distance between the two graphite rods. By adjusting the moving speed of the graphite rod through the stepper motor M, the discharge frequency of the arc grounding fault can be controlled. At a given discharge distance, the arc grounding fault can be made to occur at any electrical angle by controlling the opening or closing of the fast switch K.

4.2 Simulation Process of Intermittent Arc Grounding Fault

The simulation process of the intermittent arc grounding fault is shown in Fig. 4.

Fig. 4. Flow chart of intermittent arc grounding fault simulation (nodes: Start; The arc has burned; Is the stable arc voltage less than the air withstand voltage?; Does the arc current cross zero?; The arc has been extinguished; Is the system recovery voltage less than the air withstand voltage?; Does the system recovery voltage change to normal phase voltage?; Arc permanently extinguished; End)

The thermal breakdown process of stable arc grounding fault can be automatically simulated by arc model based on energy balance theory. The electric breakdown process of intermittent arc grounding fault needs to judge the re-ignition and extinguishment of arc by comparing stable arc voltage, recovery voltage and air withstand voltage.
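The comparison of stable arc voltage, recovery voltage and air withstand voltage can be sketched as a small state machine. This is only one possible reading of the flow chart in Fig. 4, whose branch connections are not fully recoverable from the text; the state names, the function `next_state` and its parameters are hypothetical.

```python
from enum import Enum

class ArcState(Enum):
    BURNING = "burning"
    EXTINGUISHED = "extinguished"
    PERMANENTLY_EXTINGUISHED = "permanently extinguished"

def next_state(state, *, stable_arc_voltage, recovery_voltage,
               withstand_voltage, current_crosses_zero,
               recovery_is_normal_phase_voltage):
    """Apply the decisions of Fig. 4 once (assumed interpretation)."""
    if state is ArcState.BURNING:
        # Assumed reading: the arc goes out only at a current zero, and only
        # if the stable arc voltage is below the air withstand voltage.
        if stable_arc_voltage < withstand_voltage and current_crosses_zero:
            return ArcState.EXTINGUISHED
        return ArcState.BURNING
    if state is ArcState.EXTINGUISHED:
        # Recovery voltage at or above the withstand voltage: the gap breaks
        # down again and the arc re-ignites.
        if recovery_voltage >= withstand_voltage:
            return ArcState.BURNING
        # Otherwise the arc stays out for good once the recovery voltage has
        # returned to the normal phase voltage.
        if recovery_is_normal_phase_voltage:
            return ArcState.PERMANENTLY_EXTINGUISHED
        return ArcState.EXTINGUISHED
    return ArcState.PERMANENTLY_EXTINGUISHED
```

Calling `next_state` once per simulation step with the instantaneous quantities reproduces the extinguish, re-ignite and permanently-extinguish outcomes named in the flow chart, under the stated assumptions.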

5 Numerical Simulations

In this paper, a simulation example of the distribution network practical-test-platform is built in PSCAD. The ignition and extinction of the arc are represented by the opening and closing states of the Faults element in PSCAD.

5.1 Neutral Ungrounded System

The simulation process of the intermittent arc grounding fault for the neutral ungrounded system is as follows: at 0.5035 s, an arc grounding fault occurs in the distribution network practical-test-platform. The arc is extinguished when the positive sequence current reaches zero and re-ignited when the arc voltage reaches its maximum, as shown in Table 1. The simulation waveforms of arc voltage, arc current and arc resistance are shown in Fig. 5, Fig. 6 and Fig. 7, respectively.

Table 1. Statistical table of arc extinction and re-ignition

Arc extinguishing times   Time/second   Arc re-ignition times   Time/second
The first time            0.5234        The first time          0.5354
The second time           0.5537        The second time         0.5652
The third time            0.5837        The third time          0.59525
The fourth time           0.6137
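The timing pattern in Table 1 is easier to see as intervals. The short sketch below uses only the instants listed in the table; the variable names are illustrative.

```python
extinction = [0.5234, 0.5537, 0.5837, 0.6137]   # s, from Table 1
reignition = [0.5354, 0.5652, 0.59525]          # s, from Table 1

# Time the gap stays open after each extinction before the arc re-ignites.
open_gap = [round(r - e, 5) for e, r in zip(extinction, reignition)]

# Spacing between successive extinctions.
extinction_spacing = [round(b - a, 5) for a, b in zip(extinction, extinction[1:])]
```

In this data set each re-ignition follows the preceding extinction by roughly 0.012 s, and successive extinctions are spaced by roughly 0.030 s.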

Fig. 5. Arc voltage waveform

Fig. 6. Arc current waveform


Fig. 7. Arc resistance waveform

As can be seen from Fig. 5, Fig. 6 and Fig. 7, the arc voltage increases every time from the first arc grounding fault to the fourth arc re-ignition. This is because free charge keeps accumulating on the ground capacitance of the non-fault lines, so the transient voltage grows with the number of grounding arc re-ignitions. The simulation waveforms of neutral point voltage and three-phase voltage are shown in Fig. 8 and Fig. 9, respectively.

Fig. 8. Voltage waveform of neutral point


Fig. 9. Three-phase voltage waveforms

As can be seen from Fig. 8 and Fig. 9, during the intermittent arc grounding fault, transient over-voltages are generated on the non-fault lines because each arc re-ignition causes high-frequency oscillation of the system. As the number of arc re-ignitions increases, the over-voltage grows gradually, reaching a maximum of 3.187 pu (the reference value is 8.164 kV). The maximum transient over-voltage at the neutral point is 2.13 pu.

5.2 Low Resistance Grounding System

The intermittent arc grounding fault of the low resistance grounding system is simulated under the same fault conditions as for the neutral ungrounded system. The simulation waveforms of arc voltage, arc current and arc resistance are shown in Fig. 10, Fig. 11 and Fig. 12, respectively.

Fig. 10. Arc voltage waveform


Fig. 11. Arc current waveform

Fig. 12. Arc resistance waveform

As can be seen from Fig. 10, Fig. 11 and Fig. 12, the arc voltage decreases every time from the first arc grounding fault to the fourth arc re-ignition. Compared with the neutral ungrounded system, the arc voltage of the low resistance grounding system is greatly reduced, so the intermittent arc grounding over-voltage is well suppressed in the low resistance grounding system. The simulation waveforms of neutral point voltage and three-phase voltage are shown in Fig. 13 and Fig. 14, respectively. From Fig. 13 and Fig. 14, the maximum transient over-voltage of the non-fault lines is 1.7 pu and the maximum transient over-voltage at the neutral point is 0.9628 pu. No higher over-voltage appears on the non-fault lines, and their over-voltage amplitude remains unchanged as the number of grounding arc re-ignitions increases.
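The per-unit maxima reported in Sections 5.1 and 5.2 can be put side by side in kilovolts. The sketch below assumes that the 8.164 kV reference value quoted for the neutral ungrounded system also applies to the low resistance grounding case, which the text implies (same simulation conditions) but does not state explicitly; the dictionary layout is illustrative.

```python
BASE_KV = 8.164  # reference value stated in Section 5.1

over_voltage_pu = {
    "neutral ungrounded": {"non-fault line": 3.187, "neutral point": 2.13},
    "low resistance grounding": {"non-fault line": 1.7, "neutral point": 0.9628},
}

# Convert every per-unit maximum to kV on the common base.
over_voltage_kv = {
    system: {place: pu * BASE_KV for place, pu in values.items()}
    for system, values in over_voltage_pu.items()
}

# Relative reduction of the non-fault line maximum with low resistance grounding.
reduction = 1.0 - 1.7 / 3.187
```

On this base the non-fault line maximum falls from about 26.0 kV to about 13.9 kV, a reduction of about 47%.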


Fig. 13. Voltage waveform of neutral point

Fig. 14. Three-phase voltage waveforms

6 Summary and Conclusion In this paper, a simulation example of distribution network practical-test-platform is built on PSCAD software. Based on the simulation flow of arc grounding fault designed in this paper, the realization of intermittent arc grounding fault in neutral ungrounded system and low resistance grounding system is simulated and analyzed. From the simulation results of the intermittent arc grounding fault in neutral ungrounded system, we can see that the arc voltage increases every time from the first arc grounding fault to the fourth arc re-ignition. This is due to the increasing accumulation of free charge on the ground capacitance of non-fault lines, which leads to the increase of transient voltage with the increase of the number of grounding arc re-ignition.


From the simulation results of the intermittent arc grounding fault in low resistance grounding system, we can see that the arc voltage decreases every time from the first arc grounding fault to the fourth arc re-ignition. Compared with the neutral ungrounded system, the arc voltage of the low resistance grounding system can also be greatly reduced. Therefore, the low resistance grounding system can reduce the insulation requirements for the primary equipment of the distribution network and reduce the investment cost of the distribution network. Acknowledgment. This work was supported by “Science and Technology Project of Guangxi Power Grid Company” (GXKJXM20180195).

References

1. Wang, Q., Liu, X., Liu, Z.: Research on diagnosis and positioning of single-phase arc grounding fault in the distribution network. In: 2018 Chinese Automation Congress (CAC), Xi’an, China, pp. 2141–2146 (2018)
2. Li, S., Xue, Y., Xu, B.: Simulation analysis of arc grounding fault in non-solidly earthed network. In: 2017 IEEE Power & Energy Society General Meeting, Chicago, IL, pp. 1–5 (2017)
3. Li, S., Xue, Y., Feng, G., Xu, B.: Simulation analysis of intermittent arc grounding fault applying with improved cybernetic arc model. J. Eng. 2019(16), 3196–3201 (2019)
4. Khakpour, A., Franke, S., Gortschakow, S., Uhrlandt, D., Methling, R., Weltmann, K.D.: An improved arc model based on the arc diameter. IEEE Trans. Power Delivery 31(3), 1335–1341 (2016)
5. Xu, Y., Guo, M., Chen, B., Yang, G.: Modeling and simulation analysis of arc in distribution network. Power Syst. Prot. Control 7, 57–64 (2015)
6. Liu, B., Tang, J., Wu, X., Wang, J., Yang, C.: Analysis of arc model and its application in single-phase grounding fault simulation in distribution networks. In: 2018 China International Conference on Electricity Distribution (CICED), Tianjin, pp. 1865–1871 (2018)
7. Baghipour, R., Hosseini, S.M.: Placement of DG and capacitor for loss reduction, reliability and voltage improvement in distribution networks using BPSO. Int. J. Intell. Syst. Appl. (IJISA) 4(12), 57–64 (2012). https://doi.org/10.5815/ijisa.2012.12.08
8. Zhang, B., Sun, Y., Shi, F., Zhang, H., Liu, S., Zhang, Y.: Detection of arc grounding fault in distribution network based on the harmonic component. In: 2018 13th IEEE Conference on Industrial Electronics and Applications (ICIEA), Wuhan, pp. 2559–2564 (2018)
9. Gu, R., Cai, X., Chen, H., et al.: Modeling and simulating of single-phase arc grounding fault in non-effective earthed networks. Autom. Electr. Power Syst. 23(13), 63–67 (2009)

Research on Reactive Power Compensation Configuration of Wind Farm Integration Junfang Wang(B) , Caifu Chen, Shuang Zhang, Zheng Ren, Xiaolu Chen, and Xinyu Wang Power Grid Technology Center, State Grid East Inner Mongolia Electric Power Research Institute, Hohhot 010020, China

Abstract. Large-scale wind power integration has a great impact on the stability of the grid voltage. To reduce this impact and improve the safety of wind power integration, it is necessary to study the reactive power compensation configuration of wind farm integration. Firstly, based on the actual situation of wind power projects, the principle of reactive power compensation configuration is studied. Secondly, a theoretical calculation method of reactive power compensation configuration is proposed. On this basis, a reactive power compensation configuration example is analyzed for a practical 300 MW wind farm project. Finally, the accuracy and effectiveness of the proposed calculation method are verified through modeling and simulation in PSASP software. The reactive power compensation configuration method studied in this paper is applicable to all wind farms connected to the power system and provides important support for voltage stability in wind power integration projects. It is of great significance for ensuring the safe and stable operation of the power grid. Keywords: Mathematical modeling · Fluidized bed · Dehydration · Granulation

1 Introduction

To cope with the global energy crisis and environmental pollution, vigorously developing clean energy has become an important measure for countries around the world [1, 2]. As the most mature renewable energy technology, wind power has developed greatly in recent years [3]. However, the operation of a wind farm requires a considerable amount of reactive power support. As the scale and capacity of wind farm integration continue to grow, the reactive power shortage of the system will become larger and larger and the impact on the system voltage will increase if the reactive power is not compensated in time. The safe and stable operation of the grid will be challenged. Therefore, to reduce the impact of wind farm integration on the power system and improve the security of wind power access to the power system, it is necessary to study the reactive power compensation configuration of wind farm integration.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 133–143, 2021. https://doi.org/10.1007/978-3-030-80531-9_12


At present, the research on the wind farm reactive compensation configuration mainly adopts the estimation method [4], the method of compensation capacity determined by reactive compensation degree, the method of compensation capacity determined by power factor and the method of compensation capacity determined by voltage regulation demand [5, 6]. There are few methods to calculate the capacity of reactive compensation based on the reactive compensation principle, which makes the calculation standardization and result accuracy insufficient. It is difficult to meet the requirements of safe and stable operation of the power grid. In view of the above problems, this paper proposes the research on reactive power compensation configuration of wind farm integration. Firstly, based on the actual situation of wind power project, the reactive power compensation configuration principle is studied to ensure that the reactive power compensation configuration conforms to the relevant principles and regulations of power grid. Secondly, based on the reactive power compensation principle of wind farm, the reactive power compensation theoretical calculation method is proposed to obtain the best reactive power compensation scheme. On this basis, combined with the actual 300 MW wind farm project, the reactive power configuration calculation example is analyzed. Finally, the accuracy and validity of the proposed method are verified by PSASP modeling and simulation.

2 Composition of Reactive Power Loss in Wind Power Engineering

Typical wind farms are generally equipped with dozens or hundreds of wind turbines. Each wind turbine is equipped with a package transformer, and the two are connected by parallel-laid cables. Each package transformer sends the wind power through the collection line to the bus on the low-voltage side of the main transformer of the booster station. After being stepped up to the high-voltage side, the wind power is finally sent out through the outgoing line of the wind farm. The wind turbines, package transformers, collection lines, main transformer and the equipment in the booster station constitute the main electrical system of the wind farm. The wind farm integration diagram is shown in Fig. 1.

Fig. 1. Diagram of wind farm integration


In the wind farm integration system, the reactive power loss is mainly composed of six parts: the wind turbine, the line from wind turbine to package transformer, the package transformer, the collection line, the main booster transformer and the outgoing line of the wind farm.

a) Wind turbine

At present, the widely used wind turbines can be roughly divided into three categories: constant frequency constant speed asynchronous generator (squirrel cage asynchronous generator), constant frequency variable speed doubly fed asynchronous generator (wound asynchronous generator) and direct drive permanent magnet synchronous generator [7].

The constant frequency constant speed asynchronous generator has a reactive power compensation device connected in parallel at the generator terminal. The compensation capacity is generally 30%–50% of the wind turbine capacity, but it cannot meet the reactive power requirements when the wind turbine starts or disconnects from the grid. When the wind turbines start or disconnect from the grid at the same time, reactive power of 50%–70% of the wind turbine capacity has to be absorbed from the grid again. Since it is unlikely that all wind turbines start or disconnect from the grid at the same time, it is recommended to increase the reactive compensation capacity to 30% of the total installed capacity.

The constant frequency variable speed doubly fed asynchronous generator is equipped with a control unit in the rotor winding assembly. In normal operation, the control unit can control the frequency, amplitude and phase of the rotor current to keep the stator frequency, terminal voltage and power factor constant, without the need for the grid to provide reactive power. During a wind farm fault or low voltage ride through, the grid side frequency converter of the control unit can send out reactive power to adjust the terminal voltage, but the generated reactive power cannot meet the needs of the wind turbine. It is recommended to increase the reactive compensation capacity to 10% of the total installed capacity.

The direct drive permanent magnet synchronous generator is equipped with a full-power converter at the generator terminal. During normal operation and wind farm faults, the full-power converter can adjust the reactive power, and the permanent magnet synchronous generator does not need to absorb reactive power from the system. Therefore, for the wind turbine part of the wind farm, it is not necessary to increase the reactive compensation capacity.

b) The line from wind turbine to package transformer

The outlet voltage of the wind turbine is generally 0.69 kV, and the turbine is connected to the corresponding package transformer by low-voltage cables. Because of the large current, parallel-laid cables are generally used.


c) Package transformer

The package transformer steps up the voltage of the wind turbine from 690 V to 10 kV or 35 kV. One wind turbine corresponds to one package transformer.

d) Collection line

After the power of the wind turbine is boosted by the package transformer, it is sent to the booster station of the wind farm through the collection line. The collection line in the wind farm can be built in three ways: overhead line, cable, or a hybrid of cable and overhead line.

e) Booster transformer

The booster transformer in the booster station receives the power delivered by the collection lines on its low-voltage side and steps it up to the voltage of the outgoing line.

f) Outgoing line of the wind farm

The outgoing line of the wind farm is the transmission line from the wind farm to the public grid. The booster transformer boosts the wind power, which is then connected to the power system through the outgoing line of the wind farm.
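The recommendations in item a) can be collected in a small lookup. This is an illustrative sketch only: the dictionary keys, the function name and the 100 MW example farm are hypothetical, while the fractions are the ones recommended in the text.

```python
# Recommended reactive compensation, as a fraction of total installed capacity,
# for the three wind turbine categories described in item a).
EXTRA_COMPENSATION_FRACTION = {
    "squirrel_cage_asynchronous": 0.30,
    "doubly_fed_asynchronous": 0.10,
    "direct_drive_permanent_magnet": 0.00,
}

def extra_compensation_mvar(generator_type, installed_capacity_mw):
    """Return the recommended compensation for the wind turbine part, in Mvar."""
    return EXTRA_COMPENSATION_FRACTION[generator_type] * installed_capacity_mw

# Hypothetical 100 MW wind farm, one value per generator category.
example = {kind: extra_compensation_mvar(kind, 100.0)
           for kind in EXTRA_COMPENSATION_FRACTION}
```

For a direct drive permanent magnet farm the turbine part contributes nothing, so the compensation is determined by the lines and transformers alone.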

3 Principle of Reactive Power Compensation Configuration According to the “Technical Regulations for Wind Farm Access to the Power Grid (Q/GDW 1392-2015)” [8]: The reactive power source of the wind farm includes the wind turbine and the reactive power compensation device of the wind farm. The wind turbine installed in the wind farm should be able to meet the dynamic adjustment of power factor in the range of leading 0.95 to lag 0.95. The wind farm shall make full use of the reactive capacity and regulation capacity of the wind turbine. When the reactive capacity of the wind turbine can not meet the demand of system voltage regulation, the wind farm shall be equipped with appropriate capacity of reactive compensation device, and if necessary, dynamic reactive compensation device should be equipped. The reactive capacity of the wind farm shall be configured according to the principle of basic balance of the divided (voltage) layer and divided (power) area, and meet the requirements of maintenance and standby. For wind farms directly connected to the public grid, the configured capacitive reactive capacity can compensate the sum of the inductive reactive power of the collection system (including the collection line and the wind turbine package transformer), the inductive reactive power of the main transformer and the half of the inductive reactive


power of the outgoing line of the wind farm when the wind power is fully discharged. The configured inductive reactive capacity can compensate the sum of the capacitive reactive charging power of the wind farm itself and the half of the reactive charging power of the outgoing line of the wind farm. For the wind farms in the wind farm group that are connected to the public power grid through the 220 kV (or 330 kV) wind power collection system boosting the voltage to the 500 kV (or 750 kV) level, the configured capacitive reactive capacity can compensate the sum of the inductive reactive power of the collection system, the main transformer and the outgoing line of the wind farm when the wind power is fully discharged. The configured inductive reactive capacity can compensate the sum of the capacitive charging reactive power of the wind farm itself and all the reactive charging power of the outgoing line of the wind farm.
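As a worked illustration of the power factor requirement quoted above, the reactive power range that corresponds to a power factor between leading 0.95 and lagging 0.95 at active power $P$ follows from the power triangle. The 3400 kW figure is only an example value, and evaluating the range at rated active power is an assumption, not something the regulation text above specifies.

$$|Q| \le P \tan(\arccos 0.95) = P \cdot \frac{\sqrt{1 - 0.95^2}}{0.95} \approx 0.3287\,P$$

$$P = 3400\ \text{kW} \;\Rightarrow\; |Q| \lesssim 0.3287 \times 3400 \approx 1118\ \text{kvar}$$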

4 Theoretical Calculation Method of Reactive Power Compensation Configuration

4.1 Calculation of the Line Reactive Loss and Charging Power

Let the line reactive power loss be $Q_L$ (kvar). Its calculation formula is formula (1):

$$Q_L = 3I^2 \cdot X \tag{1}$$

In the formula, $I$ is the current (A) flowing through the line, given by formula (2); $X$ is the line equivalent reactance (Ω), given by formula (3);

$$I = \frac{P}{\sqrt{3}\,U \cdot \cos\varphi} \tag{2}$$

In the formula, $P$ is the line active power (kW); $U$ is the line voltage (kV); $\cos\varphi$ is the line power factor;

$$X = x \cdot L \tag{3}$$

In the formula, $x$ is the reactance per unit length of wire (Ω/km); $L$ is the line length (km);

Let the line charging power be $Q_C$. Its calculation formula is formula (4):

$$Q_C = U^2 \cdot \omega \cdot C / 1000 = U^2 \cdot 2\pi \cdot f \cdot c \cdot L / 1000 \tag{4}$$

In the formula, $f$ is the line frequency (Hz), the value is 50 Hz; $C$ is the single phase to ground capacitance of conductor (μF); $c$ is the single phase to ground capacitance of conductor per unit length (μF/km);
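Formulas (1) to (4) can be checked numerically. The sketch below is an illustration, not the authors' calculation: the function names are hypothetical, and the inputs are the 220 kV outgoing-line parameters that the case study lists in Table 2, combined with two assumptions (full 300 MW loading and unity power factor). Note that 3·I²·X comes out in var when I is in A and X in Ω, so the code divides by 1000 to report kvar.

```python
import math

def line_current_a(p_kw, u_kv, cos_phi):
    """Formula (2): line current in A from active power (kW) and line voltage (kV)."""
    return p_kw / (math.sqrt(3) * u_kv * cos_phi)

def line_reactive_loss_kvar(p_kw, u_kv, cos_phi, x_ohm_per_km, length_km):
    """Formulas (1) and (3); the var result is divided by 1000 to give kvar."""
    current = line_current_a(p_kw, u_kv, cos_phi)
    reactance = x_ohm_per_km * length_km
    return 3 * current**2 * reactance / 1000

def line_charging_kvar(u_kv, c_uf_per_km, length_km, f_hz=50.0):
    """Formula (4): U in kV and capacitance in microfarad give kvar after /1000."""
    return u_kv**2 * 2 * math.pi * f_hz * c_uf_per_km * length_km / 1000

# Assumed loading: 300 MW at unity power factor on the 22 km, 220 kV line.
q_loss = line_reactive_loss_kvar(300_000, 220, 1.0, 0.31, 22)
q_charge = line_charging_kvar(220, 0.0115, 22)
```

Under these assumptions the results are about 12682 kvar and 3847 kvar, within roughly 0.1% of the values reported for the outgoing line in the case study.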


4.2 Calculation of the Transformer Reactive Power Loss

Let the transformer reactive power loss be $Q_T$ (kvar). Its calculation formula is formula (5):

$$Q_T = n \cdot \left(\frac{U_K\%}{100} \cdot \frac{S^2}{S_N^2} + \frac{I_0\%}{100}\right) \cdot S_N \tag{5}$$

In the formula, $n$ is the number of transformers; $U_K\%$ is the percentage value of transformer short-circuit voltage; $I_0\%$ is the percentage value of transformer no-load current; $S$ is the operating apparent power of the transformer (kVA); $S_N$ is the rated transformer capacity (kVA).

4.3 Calculation of the Reactive Power Compensation Device Capacity

According to the reactive power compensation configuration principle, the capacitive reactive power compensation capacity of the wind farm is the sum of the line loss and the transformer loss. For a wind farm directly connected to the public grid, the calculation formula is formula (6):

$$Q_R = Q_{XT} + Q_{JL} + Q_{ZT} + \frac{1}{2} Q_{SL} \tag{6}$$

In the formula, $Q_R$ is the capacitive reactive power compensation capacity to be configured; $Q_{XT}$ is the reactive power loss of the package transformers; $Q_{JL}$ is the reactive power loss of the collection line; $Q_{ZT}$ is the reactive power loss of the main transformer; $Q_{SL}$ is the reactive power loss of the outgoing line of the wind farm. For a wind farm connected to the public grid through the wind power collection system, the calculation formula is formula (7):

$$Q_R = Q_{XT} + Q_{JL} + Q_{ZT} + Q_{SL} \tag{7}$$

The inductive reactive power compensation capacity of the wind farm is the sum of the line charging power. For a wind farm directly connected to the public grid, the calculation formula is formula (8):

$$Q_G = Q_{JC} + \frac{1}{2} Q_{SC} \tag{8}$$

In the formula, $Q_G$ is the inductive reactive power compensation capacity to be configured; $Q_{JC}$ is the charging power of the collection line; $Q_{SC}$ is the charging power of the outgoing line of the wind farm. For a wind farm connected to the public grid through the wind power collection system, the calculation formula is formula (9):

$$Q_G = Q_{JC} + Q_{SC} \tag{9}$$
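Formulas (6) to (9) differ only in the share of the outgoing-line term that is counted. A minimal sketch of that rule follows; the function names and the `direct_connection` flag are hypothetical, and all inputs are assumed to be in the same unit.

```python
def capacitive_compensation(q_xt, q_jl, q_zt, q_sl, direct_connection=True):
    """Formula (6) for a directly connected wind farm, formula (7) otherwise."""
    share = 0.5 if direct_connection else 1.0
    return q_xt + q_jl + q_zt + share * q_sl

def inductive_compensation(q_jc, q_sc, direct_connection=True):
    """Formula (8) for a directly connected wind farm, formula (9) otherwise."""
    share = 0.5 if direct_connection else 1.0
    return q_jc + share * q_sc
```

Keeping the connection type as a single flag makes it explicit that the half share applies to the outgoing line only, while the package transformer, collection line and main transformer terms are always counted in full.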

5 Case Study

Taking an actual wind farm as the example, this paper analyzes the reactive power compensation configuration of wind power project A. Wind farm A has a capacity of 300 MW. Sixteen direct-drive wind turbines with a unit capacity of 3200 kW and seventy-three direct-drive wind turbines with a unit capacity of 3400 kW are installed. Each wind turbine is connected to its package transformer as one generator-transformer unit. The package transformers are located inside the wind turbines, with capacities of 3600 kVA and 3800 kVA respectively. After the power is boosted to 35 kV by the package transformers, it is sent to the 35 kV bus of the 220 kV booster transformer through twelve 35 kV collection lines. After being boosted to 220 kV, it is finally sent to the 500 kV transformer substation through one 220 kV outgoing line.

a) Wind turbine

The project adopts direct-drive permanent magnet synchronous generators, which do not need to absorb reactive power from the system. Therefore, no increase of the reactive compensation capacity is considered.

b) The line from wind turbine to package transformer

The package transformer in the project is located inside the wind turbine, so the 0.69 kV line is very short. Its reactive loss and charging power can be ignored and are not considered in the project.

c) Package transformer

The short-circuit voltage percentage and no-load current percentage of the package transformers in the project are 7% and 0.2% respectively. According to the transformer reactive loss formula (5), the reactive losses of the 3600 kVA and 3800 kVA package transformers are respectively:

$$Q_{T1} = 16 \cdot \left(7\% \cdot \frac{3200^2}{3600^2} + 0.2\%\right) \cdot 3600 = 3300.976\ \text{kvar} \tag{10}$$

$$Q_{T2} = 73 \cdot \left(7\% \cdot \frac{3400^2}{3800^2} + 0.2\%\right) \cdot 3800 = 16099.931\ \text{kvar} \tag{11}$$


According to the calculation, the total reactive power loss of the eighty-nine booster package transformers is 19400.907 kvar.

d) Collection line

There are 12 circuits of 35 kV collection lines in the wind farm, and the conductor models are LGJ-95/20 and LGJ-240/30. The total length of LGJ-95/20 is 33.5 km and the total length of LGJ-240/30 is 90.3 km. According to the calculation, the reactive loss and charging power of the 35 kV collection lines are shown in Table 1.

Table 1. Calculation results of reactive power loss and charging power of 35 kV collection line

                            Reactive power loss (kvar)   Charging power (kvar)
Total data of 35 kV lines   24152.26                     428.58

e) Booster transformer

Three 100 MVA booster transformers are installed in the project. The short-circuit voltage percentage and no-load current percentage are 14% and 0.45% respectively. According to the reactive loss formula (5), the reactive loss of the three 35/220 kV booster transformers is as follows:

$$Q_T = 3 \cdot (14\% \cdot 1 + 0.45\%) \cdot 100000 = 43350\ \text{kvar} \tag{12}$$

f) Outgoing line of the wind farm

The outgoing line of this project adopts 2 × LGJ-240 conductor. The line parameters are shown in Table 2:

Table 2. Parameters of 220 kV outgoing line

Line model    Length (km)   Capacitance to ground (µF/km)   Reactance (Ω/km)
2 × LGJ-240   22            0.0115                          0.31

According to formula (1), (2) of line reactive loss and formula (4) of line charging power, the reactive loss and charging power of 220 kV outgoing line can be calculated as shown in Table 3. In conclusion, it can be calculated that the reactive power loss and charging power of the 300 MW wind power project are 93.244 Mvar and 2.35 Mvar respectively when the wind power is fully generated.

Research on Reactive Power Compensation Configuration

141

Table 3. Calculation results of reactive power loss and charging power of 220 kV outgoing line

| Line | Reactive loss (kvar) | Charging power (kvar) |
|---|---|---|
| 220 kV outgoing line | 12682.60 | 3844.83 |
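The Table 3 values can be approximated from the Table 2 parameters with a minimal Python sketch. Formulas (1), (2) and (4) are not reproduced in this part of the text, so the sketch assumes the textbook forms Q_L = 3·I²·X·l for the series reactive loss and Q_C = ω·C·U²·l for the charging power. It further assumes a full 300 MW transfer at unity power factor, a 220 kV nominal voltage and a 50 Hz system. The function names are hypothetical.

```python
import math


def line_reactive_loss_kvar(p_mw, u_kv, x_ohm_per_km, length_km):
    """Series reactive loss 3 * I^2 * X * l at unity power factor (assumed form), in kvar."""
    i_ka = p_mw / (math.sqrt(3) * u_kv)  # line current in kA (MW / kV)
    q_mvar = 3 * i_ka ** 2 * x_ohm_per_km * length_km  # kA^2 * ohm = Mvar
    return q_mvar * 1000


def line_charging_kvar(u_kv, c_uf_per_km, length_km, freq_hz=50.0):
    """Charging power omega * C * U^2 * l (assumed form), in kvar."""
    omega = 2 * math.pi * freq_hz
    c_farad = c_uf_per_km * 1e-6 * length_km
    q_var = omega * c_farad * (u_kv * 1e3) ** 2
    return q_var / 1e3


q_loss = line_reactive_loss_kvar(p_mw=300, u_kv=220, x_ohm_per_km=0.31, length_km=22)
q_charge = line_charging_kvar(u_kv=220, c_uf_per_km=0.0115, length_km=22)
print(f"Reactive loss   ~ {q_loss:.1f} kvar")
print(f"Charging power  ~ {q_charge:.1f} kvar")
```

With these assumptions both results land within a few kvar of the tabulated 12682.60 kvar and 3844.83 kvar; the small differences presumably come from rounding of constants such as ω.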

According to the “Technical Regulations on Reactive Power Configuration and Voltage Control of Wind Farms (NB/T 31099-2016)” [9], the “Technical Performance and Test Specification for Reactive Power Compensation Devices of Wind Farms (Q/GDW 11064-2013)” [10] and the “Technical Regulations for Wind Farm Access to Power System (GB/T 19963-2011)” [11], and allowing a certain margin, this paper proposes to configure one ±32 Mvar dynamic reactive power compensation device (SVG) on each 35 kV bus of the booster station, with the compensation capacity adjustable from 32 Mvar (capacitive) to 32 Mvar (inductive). The compensation capacity should meet the requirements of dynamic continuous regulation, and the response time shall not exceed 30 ms. At the same time, the reactive power compensation device of the wind farm shall operate reliably while the wind turbines are connected to the grid and during low voltage ride through and high voltage ride through. In total, 96 Mvar of SVG capacity is configured under the three main transformers.

Finally, a PSASP simulation model of wind farm A is built for four operating modes: small mode in winter, small mode in summer, large mode in winter and large mode in summer. The simulation checks whether the configured compensation capacity meets the operation requirements once the wind farm is equipped with ±96 Mvar of SVG (Figs. 2, 3, 4 and 5).

Fig. 2. Large generation of wind power and thermal power in small mode of winter

The calculation shows that the voltage of each node of the system is reasonable in all modes of the reactive power check. The capacitive and inductive reactive power compensation capacities configured in the wind farm meet the operation requirements.


Fig. 3. Large generation of wind power and thermal power in small mode of summer

Fig. 4. Small generation of thermal power in large mode of winter

Fig. 5. Small generation of thermal power in large mode of summer


6 Conclusions

Based on the principle of wind farm reactive power compensation, this paper puts forward a theoretical calculation method for reactive power compensation. The reactive power compensation configuration is then analyzed for an actual wind farm project. Finally, the accuracy and effectiveness of the calculation method are verified by PSASP modeling and simulation. The reactive power compensation configuration method studied in this paper is applicable to all wind farms connected to the power system. It provides important support for voltage stability in wind power integration projects and is of great significance for ensuring the safe and stable operation of the power grid.

References

1. Kitchenham, A.: Teachers and technology: a transformative journey. J. Transform. Educ. 4(3), 202–225 (2006)
2. Kang, C., Yao, L.: Key scientific issues and theoretical research framework for power systems with high proportion of renewable energy. Autom. Electr. Power Syst. 41(9), 2–11 (2017)
3. Zhang, P., et al.: Optimal allocation of reactive power source for wind farms. Power Syst. Prot. Control 20 (2008)
4. Liu, Y., et al.: Research on the wind farm reactive power compensation capacity and control target. In: 2011 Asia-Pacific Power and Energy Engineering Conference. IEEE (2011)
5. Yang, Y., et al.: Wide-scale adoption of photovoltaic energy: Grid code modifications are explored in the distribution grid. IEEE Ind. Appl. Mag. 21(5), 21–31 (2015)
6. Hassink, P., et al.: Dynamic reactive compensation system for wind generation hub. In: 2006 IEEE PES Power Systems Conference and Exposition. IEEE (2006)
7. She, X., et al.: Wind energy system with integrated functions of active power transfer, reactive power compensation, and voltage conversion. IEEE Trans. Ind. Electron. 60(10), 4512–4524 (2012)
8. State Grid Corporation of China. Technical regulations on wind farm access to power grid (Q/GDW 1392-2015). China Electric Power Press, Beijing (2016)
9. National Energy Administration. Technical regulations for reactive power allocation and voltage control of wind farms (NB/T 31099-2016), Beijing
10. State Grid Corporation of China. Technical performance and test specification for reactive power compensation device of wind farm (Q/GDW 11064-2013). China Electric Power Press, Beijing (2014)
11. Dai, K., Bergot, A., Liang, C., et al.: Environmental issues associated with wind energy–a review. Renew. Energy 75, 911–921 (2015)

Two Stage Stochastic Scheduling Model of Integrated Energy System with Renewable Energy Considering Demand Response

Qiao Chen1(B), Yimin Qian1, Kai Ding1, Yi Wang1, Chengliang Zhu2, and Sen Wang2

1 State Grid Hubei Electric Power Research Institute, Wuhan 430015, Hubei, China
2 State Grid Jiaxing Power Supply Company of State Grid Zhejiang Electric Power Co., Ltd., Jiaxing 314000, Zhejiang, China

Abstract. The increasing proportion of photovoltaic power generation in Northwest China has brought great difficulties and challenges to power system dispatching. Therefore, this paper takes demand side response as virtual reserve to overcome the limitation that traditional scheduling reserve takes only a single form. A two-stage stochastic model based on the fuzzification of uncertain parameters is adopted to consider the randomness and fluctuation of renewable energy output in the integrated energy system, and the economic characteristics of generator side reserve and virtual reserve are also considered. A scheduling model of the integrated energy system with renewable energy, based on the probability of prediction error scenarios, is established. The bacterial colony chemotaxis algorithm is used to solve the problem, and simulation analysis on an actual system proves the effectiveness and feasibility of the proposed scheduling model.

Keywords: Prediction error scenario probability · Two stage stochastic model · Demand response · Bacterial colony chemotaxis

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 144–155, 2021. https://doi.org/10.1007/978-3-030-80531-9_13

1 Introduction

For a long time, China has paid great attention to the renewable energy industry, and the grid-connected operation of renewable energy has gradually improved. However, the problem of wind and photovoltaic power curtailment still exists. From the point of view of grid-connected operation, the causes of curtailment are as follows: renewable energy planning is relatively concentrated, and the peak regulation capacity of the power grid is insufficient; the construction of external transmission channels does not match the scale of power supply construction, and the transmission capacity of the power grid is limited; there are weak links in the power grid, and some areas are affected by grid constraints. From the perspective of market mechanism, the reasons are: the national unified market mechanism is not perfect, and the capacity to consume renewable energy power across provinces is small; the mechanism for demand side resources to participate in providing power auxiliary services is still in the pilot stage, which affects the consumption of renewable energy. Therefore, in order to improve the penetration rate of renewable energy and the operation efficiency of the power system, it is necessary to mobilize more kinds of reserve resources to participate in the dispatching of power systems with renewable energy.

The integrated energy system [1, 2] is considered to be the main form of energy in the future. It involves the production, transmission, distribution and consumption of energy in human society. In the process of planning, construction and operation, an integrated system of energy production, supply and marketing is formed by organic coordination and optimization of energy production, transmission and distribution (energy supply network), conversion, storage and consumption; this system is the physical carrier of the energy Internet. The integrated energy system integrates multiple energy subsystems, realizes energy transposition and cascade utilization through the energy conversion elements in the system, and reasonably allocates different energy through supply and demand signals, so that each energy subsystem has a more flexible operation mode, which can effectively solve the problems in the grid-connected operation of renewable energy. When renewable energy power is in surplus, the integrated energy system can absorb, convert and even store it; when renewable energy power is insufficient, the integrated energy system can allocate other energy to fill the gap. In addition, renewable energy can be converted into other energy forms through the integrated energy system, and transported or absorbed by the pipe network and load of other energy systems in the integrated energy system.

Demand response (DR) [3, 4] refers to the response of power users to price and market conditions, which changes the supply-demand relationship of the previous power consumption mode. Demand response has the function of peak load shifting.
It can solve the problems caused by the randomness and volatility of renewable energy output by providing reserve generation capacity resources. The application of integrated energy systems brings new development for demand response. The integration of electricity, heat, natural gas and other forms of energy enables all types of energy users to actively participate in demand response projects. Using the complementarity of different energy sources in the energy interconnection system, inelastic energy loads can also replace electrical energy with other forms of energy during peak power consumption, and so actively participate in demand response planning. From the perspective of the power system, all types of energy users can reduce power demand during peak hours, thus improving the effect of demand response optimal scheduling; from the perspective of users, the energy consumption on the demand side has hardly changed, thus maintaining the comfort of consumers.

Therefore, this paper takes the integrated energy system with renewable energy as the research object. Considering demand response, time-of-use (TOU) pricing and interruptible load (IL) are introduced as virtual reserve, which is coordinated with generator side reserve. A two-stage stochastic model based on scenario probability is adopted: the first stage reflects the operation cost under the day-ahead electricity market environment, and the second stage considers the scenario probability to reflect the actual operation of the power system. On this basis, an integrated energy system scheduling model with renewable energy based on the probability of prediction error scenarios is established. The scheduling model not only reflects the randomness and volatility of photovoltaic output, but also fully reflects the economic characteristics of generator side reserve and virtual reserve. The bacterial colony chemotaxis (BCC) optimization algorithm is used to solve the problem, and the effectiveness and feasibility of the proposed scheduling model are verified by simulation of a classic standard scenario and an actual scenario.

2 Coordination of Generator Side Reserve and Demand Response

Demand response can be divided into two types: incentive-based and price-based [5–8]. The first obtains the right to interrupt the user's load by means of economic compensation. The second leads users to reduce or shift their load according to the electricity price so as to reduce their own electricity charges. The two forms can be used in coordination to complement each other. This paper analyzes one form of each type: interruptible load, and the TOU price, which is widely used at present.

2.1 Interruptible Load and Positive Spinning Reserve

Interruptible load is a kind of reserve resource for emergencies. Once a fault with low probability and high risk occurs, the reserve service market must be utilized. There are two types of IL: low-price interruptible load (ILL) and high-compensation interruptible load (ILH). The former obtains the right of load interruption through a tariff discount. Even if there is no fault, the discount must be paid, so this cost is certain to occur. For the latter, the electricity price is the normal daily price, and customers are compensated after an interruption. This cost is related to the accident rate, so it is a risk cost [9].

The cost of positive spinning reserve has two parts: an electricity (energy) cost and a capacity cost. The capacity cost is not caused by faults; once the contract is signed the capacity must be paid for, so it is also a deterministic cost. The electricity cost, however, is mostly caused by faults and is therefore a risk cost [10].

Over a long-time-span scheduling cycle, the first stage of the scheduling model consists of the capacity cost of positive spinning reserve and the electricity price discount cost of ILL. The second stage consists of the electricity cost of positive spinning reserve and the compensation cost of ILH, and the model seeks the lowest total reserve cost. To reduce the cost, ILH should be adopted when serious faults with low probability occur. However, if the probability of failure is high, ILH cannot be used too much, because it will increase the cost. Therefore, in this part of the cost model, positive spinning reserve is the main reserve and ILL serves as the auxiliary reserve.

2.2 Negative Spinning Reserve and TOU Price

As with the positive spinning reserve in the previous part, the cost of negative spinning reserve can be divided into an electricity cost and a capacity cost, which belong to different stages of the scheduling model.


In the traditional scheduling mode, when photovoltaic output is in surplus, it is often necessary to abandon light, that is, to curtail photovoltaic output, because of the limited deep peak regulation capacity of the system. TOU pricing uses market-oriented means to guide power users to change their power consumption structure and pattern, effectively shifting peak load and smoothing the load curve. Increasing electricity consumption during the low load period can reduce the amount of abandoned light and improve the utilization rate of renewable energy.

3 Analysis of the Response Model of Electricity Price

3.1 Elasticity Index

According to economic theory, the price elasticity of electricity demand describes the change in electricity demand caused by a change in electricity price:

$$\varepsilon = \frac{\Delta d}{d_0} \cdot \frac{p_0}{\Delta p} \tag{1}$$

Here $\Delta d$ is the increment of the original electricity quantity $d_0$, and $\Delta p$ is the increment of the original electricity price $p_0$. The price response can be modeled as multi-period or single-period. Multi-period response is more practical and reasonable than single-period response, so the model in this paper is built on multi-period response. In this model there are two kinds of elasticity coefficient: the self-elasticity coefficient, which describes the response of consumers to the price in the current period, and the cross-elasticity coefficient, which describes their response to the price in other periods [11]. Following formula (1), the two coefficients $\varepsilon_{ii}$ and $\varepsilon_{ij}$ are defined as:

$$\varepsilon_{ii} = \frac{\Delta d_i}{d_i} \cdot \frac{p_i}{\Delta p_i} \tag{2}$$

$$\varepsilon_{ij} = \frac{\Delta d_i}{d_i} \cdot \frac{p_j}{\Delta p_j} \tag{3}$$

where i and j denote periods i and j.

3.2 Modeling of Consumer Price Response in Integrated Energy System

For an n-period horizon (n = 24 here), the user price response model is established as follows:

$$\begin{bmatrix} \Delta d_1 / d_1 \\ \Delta d_2 / d_2 \\ \vdots \\ \Delta d_{24} / d_{24} \end{bmatrix} = E \begin{bmatrix} \Delta p_1 / p_1 \\ \Delta p_2 / p_2 \\ \vdots \\ \Delta p_{24} / p_{24} \end{bmatrix} \tag{4}$$




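where

$$E = \begin{bmatrix} \varepsilon_{(1,1)} & \varepsilon_{(1,2)} & \cdots & \varepsilon_{(1,24)} \\ \varepsilon_{(2,1)} & \varepsilon_{(2,2)} & \cdots & \varepsilon_{(2,24)} \\ \vdots & \vdots & & \vdots \\ \varepsilon_{(24,1)} & \varepsilon_{(24,2)} & \cdots & \varepsilon_{(24,24)} \end{bmatrix}$$

is the electricity price elasticity matrix. The electricity consumption of power users after responding to the electricity price is expressed as formula (5):

$$d_t = d_{t0} \left\{ 1 + E(t,t) \times \frac{\Delta p_t}{p_{t0}} + \sum_{h=1,\, h \neq t}^{24} E(t,h) \times \frac{\Delta p_h}{p_{h0}} \right\} \tag{5}$$

where $d_t$ is the total load of TOU users in period t after the TOU price response, and $d_{t0}$ is their load before the TOU price is implemented.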
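Formula (5) can be illustrated with a minimal Python sketch. It assumes a single self-elasticity value on the diagonal of E and a single cross-elasticity value everywhere else; the values −0.2 and 0.033 are the ones quoted later in the example analysis. The three-period horizon, the loads and the prices are hypothetical illustration values, and the function name `tou_response` is not from the paper.

```python
def tou_response(d0, p0, p, self_eps, cross_eps):
    """Load after TOU pricing following Eq. (5), with a uniform elasticity matrix.

    d0: loads before TOU, p0: prices before TOU, p: TOU prices (one entry per period).
    """
    n = len(d0)
    # Relative price change in each period, delta_p_h / p_h0
    rel = [(p[h] - p0[h]) / p0[h] for h in range(n)]
    d = []
    for t in range(n):
        # Self term (h == t) plus cross terms (h != t) of row t of E
        shift = sum((self_eps if h == t else cross_eps) * rel[h] for h in range(n))
        d.append(d0[t] * (1.0 + shift))
    return d


# Hypothetical 3-period case: price raised in period 0, unchanged in 1, lowered in 2
d0 = [100.0, 100.0, 100.0]
p0 = [40.0, 40.0, 40.0]
p_tou = [50.0, 40.0, 30.0]
d_after = tou_response(d0, p0, p_tou, self_eps=-0.2, cross_eps=0.033)
print([round(x, 3) for x in d_after])
```

In this toy case load moves out of the period whose price was raised and into the period whose price was lowered, while the period with an unchanged price keeps its load because the two cross effects cancel.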

4 Two Stage Model Based on Scenarios

See references [12, 13] for the description of scenario generation, scenario reduction and the stopping criterion. To keep the model reasonable, the following assumptions are made:

1) This paper focuses on the impact of renewable energy such as photovoltaic in the integrated energy system on the reserve capacity of the power system. Therefore, load forecasting error, forced outage rate of generating units and random changes on the demand side are not taken into account.
2) The cost function of unit operation is consistent with that of reserve.
3) The dispatching department holds the interruption right for the interruptible load, which means the interruptible load can respond to the reserve demand immediately.
4) The output cost of renewable energy is zero. However, in order to comply with government policy, the renewable energy output in the integrated energy system should be absorbed as much as possible, so the abandoned light cost is included in the objective function.

4.1 Objective Function

The objective function minimizes the system operation cost. The first line of formula (6) is the first stage of the dispatching model, which reflects the operation cost under the day-ahead electricity market environment, including the unit active power output cost, the positive and negative spinning reserve capacity cost and the electricity price discount cost for compensating ILL customers. The second line is the second stage of the scheduling model, in which the scenario probability is considered to reflect the actual operation of the power system; it includes the electricity cost of positive and negative spinning reserve, the outage compensation cost of ILH users and the light abandonment cost.

$$\begin{aligned} F_c = {} & \sum_{t=1}^{T} \left\{ \sum_{i=1}^{N} \left[ f(P_{it}) + \xi_i^{c+} R_{it}^{c+} + \xi_i^{c-} R_{it}^{c-} \right] + f_{ILL} - \xi_t^{d} \left( d_t - d_{t0} \right) \right\} \\ & + \sum_{t=1}^{T} \sum_{s=1}^{S} \rho_s \left\{ \sum_{i=1}^{N} f(R_{its}) + f_{ILH} + \xi_{sp} SP_{ts} \right\} \end{aligned} \tag{6}$$

where T and N are the total number of time periods and the total number of units, respectively. K, S and M are the number of users signing ILL contracts with the power grid company, the total number of scenarios, and the number of users signing ILH contracts with the power grid company, respectively. $P_{it}$ is the planned output of unit i in period t. $f(x) = a_i x^2 + b_i x + c_i$ is the fuel cost function of a conventional unit. $\xi_i^{c+}$ is the positive spinning reserve capacity cost of unit i, and $\xi_i^{c-}$ is the negative spinning reserve capacity cost of unit i. $R_{it}^{c+}$ and $R_{it}^{c-}$ are the positive and negative spinning reserve capacities of unit i in period t; $f_{ILL}$ is the electricity price discount cost of compensating ILL users; $\xi_t^{d}$ is the electricity price of period t; $d_t - d_{t0}$ is the load difference before and after the implementation of the TOU price. $\rho_s$ is the occurrence probability of scenario s; $R_{its}$ is the output change of unit i in period t under scenario s compared with the first stage. $\xi_{sp}$ is the cost coefficient of light abandonment; $SP_{ts}$ is the amount of light abandoned by the photovoltaic power stations in period t under scenario s.

4.2 First Stage Constraints

(1) Power balance constraint

$$\sum_{i=1}^{N} P_{it} + P_{wt} = L_t (1 - \eta) + d_t \tag{7}$$

where $P_{wt}$ is the planned output of renewable energy in period t in the integrated energy system; $L_t$ is the total load in period t; $\eta$ is the proportion of users participating in the TOU price; $d_{t0}$ and $d_t$ are the electricity consumption of TOU users before and after the implementation of the TOU price, with $d_{t0} = L_t \eta$.

(2) Output constraints of conventional units

$$P_{i\min} \le P_{it} \le P_{i\max} \tag{8}$$

where $P_{i\min}$ and $P_{i\max}$ are the minimum and maximum output of unit i.

(3) Ramp rate constraints for conventional units

$$-r_{di} T_{60} \le P_{it} - P_{i(t-1)} \le r_{ui} T_{60} \tag{9}$$

where $P_{i(t-1)}$ is the active power output of unit i in period t−1; $r_{ui}$ and $r_{di}$ are the upward and downward ramp rates of the active power output of the i-th conventional unit; $T_{60}$ is one operation period.

(4) Renewable energy output constraints in integrated energy system

$$P_{w\min} \le P_{wt} \le P_{w\max} \tag{10}$$

where $P_{w\min}$ and $P_{w\max}$ are the minimum and maximum output of renewable energy in the integrated energy system.


(5) Reserve constraint

$$0 \le R_{it}^{c+} \le P_{i\max} - P_{it} \tag{11}$$

$$0 \le R_{it}^{c-} \le P_{it} - P_{i\min} \tag{12}$$

(6) Price discount cost

$$f_{ILL} = \sum_{k=1}^{K} U_{ILL}^{k} \xi_{ILL}^{k} P_{ILL}^{k} \tag{13}$$

where $\xi_{ILL}^{k}$ is the average compensation paid by the power enterprise to user k for the interruptible power specified in the contract, and $P_{ILL}^{k}$ is the load of user k that the enterprise may interrupt according to the contract.

4.3 Second Stage Constraints

(1) Power balance constraint

$$\sum_{i=1}^{N} P_{its} + P_{Wts} - SP_{ts} = P_L \tag{14}$$

where $P_{its}$ is the actual output of unit i in period t under scenario s, and $P_{Wts}$ is the renewable energy output in the integrated energy system in period t under scenario s. The load term $P_L$ is

$$P_L = L_t + d_t - \sum_{k=1}^{K} U_{ILL}^{ts} P_{ILL}^{ts} - \sum_{m=1}^{M} U_{ILH}^{mts} P_{ILH}^{mts} \tag{15}$$

(2) Output constraints of conventional units

$$P_{i\min} \le P_{its} \le P_{i\max} \tag{16}$$

(3) Ramp rate constraints for conventional units

$$-r_{di} T_{60} \le P_{its} - P_{i(t-1)s} \le r_{ui} T_{60} \tag{17}$$

(4) Abandoned wind/light constraint

$$0 \le SP_{ts} \le P_{Wts} \tag{18}$$

(5) Compensation cost of power outage

$$f_{ILH} = \sum_{m=1}^{M} U_{ILH}^{mts} \xi_{ILH}^{mts} P_{ILH}^{mts} \tag{19}$$

In the above formula, $f_{ILH}$ is the compensation cost to be paid by the power company once ILH is called after an accident; $\xi_{ILH}^{mts}$ is the cost of interrupting a unit of load as specified in the contract signed between the power company and user m.


4.4 Joint Constraint of the First Stage and the Second Stage

(1) Unit output coordination

$$P_{its} = P_{it} + R_{its}^{e+} - R_{its}^{e-} \tag{20}$$

(2) Reserve constraints

$$0 \le R_{its}^{e+} \le R_{it}^{c+} \tag{21}$$

$$0 \le R_{its}^{e-} \le R_{it}^{c-} \tag{22}$$

$$R_{its} = R_{its}^{e+} - R_{its}^{e-} \tag{23}$$

where $R_{its}^{e+}$ and $R_{its}^{e-}$ are the positive and negative spinning reserve power of unit i in period t under scenario s, respectively.

(3) Interruptible load constraints

$$0 \le P_{ILL}^{ts} \le P_{ILL} \tag{24}$$

$$0 \le P_{ILH}^{mt} \le P_{ILH\max}^{m} \tag{25}$$
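The two lines of objective (6), together with the reserve coupling in Eqs. (21)–(23), can be evaluated for a given schedule with a minimal Python sketch. It follows Eq. (6) literally, including applying f to the second-stage output change $R_{its}$. All function and argument names are hypothetical, the array layouts (`P[i][t]`, `R_e_up[i][t][s]`, `SP[t][s]`) are assumptions made for illustration, and this is a cost evaluator only, not the BCC solver used in the paper.

```python
def fuel_cost(x, a, b, c):
    """Quadratic cost f(x) = a*x^2 + b*x + c of one conventional unit."""
    return a * x * x + b * x + c


def first_stage_cost(units, P, R_up, R_dn, f_ill, price, d, d0):
    """First line of Eq. (6): day-ahead cost summed over periods t and units i."""
    total = 0.0
    for t in range(len(price)):
        for i, u in enumerate(units):
            total += fuel_cost(P[i][t], u["a"], u["b"], u["c"])
            total += u["xi_up"] * R_up[i][t] + u["xi_dn"] * R_dn[i][t]
        total += f_ill[t] - price[t] * (d[t] - d0[t])
    return total


def second_stage_cost(units, rho, R_e_up, R_e_dn, R_up, R_dn, f_ilh, xi_sp, SP):
    """Second line of Eq. (6): scenario-weighted cost, checking Eqs. (21)-(22)."""
    total = 0.0
    for s, prob in enumerate(rho):
        for t in range(len(SP)):
            for i, u in enumerate(units):
                up, dn = R_e_up[i][t][s], R_e_dn[i][t][s]
                # Deployed reserve may not exceed the capacity bought in stage one
                if not (0 <= up <= R_up[i][t] and 0 <= dn <= R_dn[i][t]):
                    raise ValueError("reserve deployment exceeds procured capacity")
                r_its = up - dn  # Eq. (23)
                total += prob * fuel_cost(r_its, u["a"], u["b"], u["c"])
            total += prob * (f_ilh[t][s] + xi_sp * SP[t][s])
    return total


# Toy case: unit 1 of Table 1, one period, two equally likely scenarios.
# The schedule, prices and abandonment coefficient below are hypothetical.
units = [{"a": 100, "b": 200, "c": 10, "xi_up": 19, "xi_dn": 16}]
stage1 = first_stage_cost(units, P=[[0.4]], R_up=[[0.1]], R_dn=[[0.1]],
                          f_ill=[0.0], price=[40.0], d=[1.0], d0=[1.0])
stage2 = second_stage_cost(units, rho=[0.5, 0.5],
                           R_e_up=[[[0.1, 0.0]]], R_e_dn=[[[0.0, 0.1]]],
                           R_up=[[0.1]], R_dn=[[0.1]],
                           f_ilh=[[0.0, 0.0]], xi_sp=50.0, SP=[[0.0, 0.02]])
print(round(stage1, 6), round(stage2, 6), round(stage1 + stage2, 6))
```

Keeping the two stages in separate functions mirrors the structure of Eq. (6): the first depends only on day-ahead decisions, while the second is an expectation over scenarios and is where the coupling constraints are enforced.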

5 Example Analysis

The BCC algorithm is applied to an actual integrated energy system with renewable energy. The scheduling cycle is 1 day, divided into 24 periods. The integrated energy system includes 22 thermal power units and 3 photovoltaic power stations. The parameters of the thermal power units are shown in Table 1. The market parameters of ILL and ILH are shown in Table 2 and Table 3, respectively. The output prediction curve of renewable energy in the integrated energy system is shown in Fig. 1. The economic characteristic curve of each unit in the system is shown in Fig. 2.

Table 1. Data of thermal generator

| Unit | $P_{i\min}$ | $P_{i\max}$ | $a_i$ | $b_i$ | $c_i$ | $\xi_i^{c+}$ | $\xi_i^{c-}$ |
|---|---|---|---|---|---|---|---|
| 1 | 0.05 | 0.50 | 100 | 200 | 10 | 19 | 16 |
| 2 | 0.05 | 0.60 | 120 | 150 | 10 | 20 | 15 |
| 3 | 0.05 | 1.00 | 40 | 180 | 20 | 18 | 14 |
| 4 | 0.05 | 1.20 | 60 | 100 | 10 | 19 | 21 |
| 5 | 0.05 | 1.00 | 40 | 180 | 20 | 18 | 15 |
| 6 | 0.05 | 0.60 | 100 | 150 | 10 | 16.5 | 14.5 |

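Table 1. (continued)

| Unit | $P_{i\min}$ | $P_{i\max}$ | $a_i$ | $b_i$ | $c_i$ | $\xi_i^{c+}$ | $\xi_i^{c-}$ |
|---|---|---|---|---|---|---|---|
| 7 | 0.05 | 1.00 | 40 | 180 | 20 | 18 | 15 |
| 8 | 0.05 | 0.60 | 110 | 150 | 10 | 16.5 | 14.5 |
| 9 | 0.05 | 1.00 | 70 | 180 | 20 | 18 | 15 |
| 10 | 0.05 | 0.60 | 80 | 150 | 10 | 16.5 | 17 |
| 11 | 0.05 | 1.00 | 90 | 180 | 20 | 18 | 12 |
| 12 | 0.05 | 0.60 | 100 | 150 | 10 | 16.5 | 13.5 |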

Table 2. Parameters of ILL market

| User i | Upper limit | Low price cost coefficient ($/(MW·h)) |
|---|---|---|
| 1 | 2 | 5 |
| 2 | 3 | 7 |
| 3 | 4 | 9 |
| 4 | 5 | 11 |

Table 3. Parameters of ILH market

| User i | Upper limit | High compensation cost coefficient ($/(MW·h)) |
|---|---|---|
| 5 | 2 | 30 |
| 6 | 3 | 35 |
| 7 | 4 | 40 |
| 8 | 5 | 45 |

The validity of the model is verified with the actual load data of an industrial user in a certain region. The user runs three-shift production, its process is easy to adjust, and its electricity consumption per unit of output value is high. The self-elasticity coefficient is −0.2 and the cross-elasticity coefficient is 0.033. The periods are divided as follows: 23–7 is the valley period; 7–8 and 11–18 are the flat period; 8–11 and 18–23 are the peak period. Before the implementation of the time-of-use tariff, the price is 40 $/(MW·h); after the implementation, the prices of the peak, flat and valley periods are 27 $/(MW·h), 42 $/(MW·h) and 59 $/(MW·h), respectively. Two modes are compared: mode 1 does not consider DR, and mode 2 considers DR. Figure 3 shows the load curve before and after the implementation of the time-of-use price. It is clear from the figure that the time-of-use price changes the power consumption habits of the user side and thus achieves peak load shifting.
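The period division quoted above can be written as a small lookup, sketched here in Python. The sketch assumes that hour h stands for the interval from h:00 to (h+1):00 and that each boundary hour belongs to the period that starts there; the function name `tou_period` is hypothetical.

```python
def tou_period(hour):
    """Classify an hour of the day (0-23) as 'valley', 'flat' or 'peak'."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in 0..23")
    if hour >= 23 or hour < 7:  # 23-7
        return "valley"
    if 8 <= hour < 11 or 18 <= hour < 23:  # 8-11 and 18-23
        return "peak"
    return "flat"  # 7-8 and 11-18


# Count how many hours fall into each period
counts = {}
for h in range(24):
    counts[tou_period(h)] = counts.get(tou_period(h), 0) + 1
print(counts)
```

Under this reading the day splits evenly into eight valley, eight flat and eight peak hours.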


Fig. 1. PV prediction curve

Fig. 2. Economic characteristic curve of each unit

For example, Fig. 2 shows that unit 1 has the highest cost, so to reduce cost unit 1 should produce less output or even zero output. In the peak load period, mode 1 can only rely on more output from unit 1, which increases the power generation cost and gives poor economy; mode 2 can cope with the peak load through interruptible load, the time-of-use price and other measures, does not depend on the output of unit 1, and is therefore more flexible and economical. Because spinning reserve in mode 1 has limited capacity and a high price, its capacity to absorb photovoltaic energy is very limited. The time-of-use tariff in mode 2, however, increases power consumption in the valley period of the load curve, so mode 2 can absorb more renewable energy output, as shown in Fig. 4.

The optimization results of the two modes are shown in Table 4. The total reserve cost in mode 1 is lower than that in mode 2. However, because the abandoned light cost and the fuel cost are significantly lower in mode 2, its total operating cost is also lower than that of mode 1. This means that, through coordinated optimization of supply side and demand side reserve, conventional units can work at a more economical level, the amount of abandoned light is greatly reduced, and a high utilization of renewable energy is achieved.


Fig. 3. Load curves: with vs. without TOU

Fig. 4. Contrast curve of output status of PV before and after demand response

Table 4. The optimization of two different modes

| | Total operation cost | Fuel cost | Generator side reserve cost | Demand side reserve cost | Abandoned light cost |
|---|---|---|---|---|---|
| Mode 1 | 31275.72 | 24369.44 | 4326.12 | | 2580.16 |
| Mode 2 | 29864.87 | 23673.85 | 3383.41 | 1646.22 | 1161.39 |
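The comparison drawn from Table 4 can be checked with a few lines of Python. The sketch assumes that the blank demand side reserve entry for mode 1 means zero, since mode 1 does not consider DR; the dictionary keys are hypothetical labels.

```python
# Cost components from Table 4; the blank Mode 1 demand-side entry is assumed to be 0.
table4 = {
    "Mode 1": {"total": 31275.72, "fuel": 24369.44, "gen_reserve": 4326.12,
               "demand_reserve": 0.0, "abandoned_light": 2580.16},
    "Mode 2": {"total": 29864.87, "fuel": 23673.85, "gen_reserve": 3383.41,
               "demand_reserve": 1646.22, "abandoned_light": 1161.39},
}


def component_sum(row):
    """Sum of the four cost components of one mode."""
    return (row["fuel"] + row["gen_reserve"]
            + row["demand_reserve"] + row["abandoned_light"])


def reserve_total(row):
    """Generator side plus demand side reserve cost."""
    return row["gen_reserve"] + row["demand_reserve"]


for name, row in table4.items():
    print(f"{name}: components - total = {component_sum(row) - row['total']:+.2f}, "
          f"reserve = {reserve_total(row):.2f}")

m1, m2 = table4["Mode 1"], table4["Mode 2"]
saving = m1["total"] - m2["total"]
light_cut = 1 - m2["abandoned_light"] / m1["abandoned_light"]
print(f"Total cost saving of mode 2: {saving:.2f} ({saving / m1['total']:.1%})")
print(f"Abandoned light cost reduction: {light_cut:.1%}")
```

The components add up to the stated totals for both modes. Mode 2 carries the higher total reserve cost, yet its total operation cost is lower by 1410.85, about 4.5%, and its abandoned light cost is roughly 55% lower.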

6 Conclusions

In this paper, the impact of demand response on the power system is considered. Time-of-use price and interruptible load are introduced as virtual reserve to coordinate with generator side reserve. A two-stage stochastic model based on scenario probability is adopted to fully consider the randomness and volatility of renewable energy output, and the economic characteristics of generator side reserve and virtual reserve are fully reflected. The feasibility of the proposed scheduling model is verified by an example, which shows that demand response plays an important role in the scheduling of integrated energy systems with renewable energy.


Acknowledgements. Project supported by Youth Program of National Natural Science Foundation of China (51907096); Natural Science Foundation of Qinghai Province (2019-ZJ-950Q).

References

1. Gu, W., Wang, J., Lu, S., et al.: Optimal operation for integrated energy system considering thermal inertia of district heating network and buildings. Appl. Energy 199, 234–246 (2017)
2. Quelhas, A., Gil, E., McCalley, J.D., et al.: A multiperiod generalized network flow model of the US integrated energy system: Part I—Model description. IEEE Trans. Power Syst. 22(2), 829–836 (2007)
3. Huang, W., Zhang, N., Kang, C., et al.: From demand response to integrated demand response: review and prospect of research and application. Prot. Control Mod. Power Syst. 4(1), 12 (2019)
4. Monfared, H.J., Ghasemi, A., Loni, A., et al.: A hybrid price-based demand response program for the residential micro-grid. Energy 185, 274–285 (2019)
5. Xifan, W., Yunpeng, X., Xiuli, W.: Study and analysis on supply-demand interaction of power systems under new circumstances. Proc. CSEE 34(29), 5018–5028 (2014)
6. Yuewen, J., Chong, C., Buzhen, W.: Stochastic simulation particle swarm optimization algorithm for power system unit commitment problem with wind farm. Trans. China Electrotechnical Soc. 24(6), 129–137 (2009)
7. Sun, Y.Z., Wu, J., Li, G.J., et al.: Dynamic economic dispatch considering wind power penetration based on wind speed forecasting and stochastic programming. Proc. CSEE 29(4), 41–47 (2009)
8. Osório, G.J., Lujano-Rojas, J.M., Matias, J.C.O., et al.: A probabilistic approach to solve the economic dispatch problem with intermittent renewable energy sources. Energy 82, 949–959 (2015)
9. Zhang, X., Zhao, J., Chen, X.: Multi-objective unit commitment fuzzy modeling and optimization for energy-saving and emission reduction. Proc. CSEE 22, 71–76 (2010)
10. Luo, Y., Xue, Y., Ledwich, G., et al.: Coordination of low price interruptible load and high compensation interruptible load. Autom. Electr. Power Syst. 31(11), 17–21 (2007)
11. Liu, D., Guo, J., Huang, Y., et al.: Dynamic economic dispatch of wind integrated power system based on wind power probabilistic forecasting and operation risk constraints. Proc. CSEE 33(16), 9–15 (2013)
12. Bahmanifirouzi, B., Farjah, E., Niknam, T.: Multi-objective stochastic dynamic economic emission dispatch enhancement by fuzzy adaptive modified theta particle swarm optimization. J. Renew. Sustain. Energy 4(2), (2012)
13. Paterakis, N.G., Erdinc, O., Bakirtzis, A.G.: Load-following reserves procurement considering flexible demand-side resources under high wind power penetration. IEEE Trans. Power Syst. 30, 1–14 (2014)

Evaluation of Distribution Equipment Utilization Based on Data Driven

Xintong Li(B), Shuo Liang, Yangjun Zhou, and Xiaoyong Yu

Electric Power Research Institute of Guangxi Power Grid Co., Ltd., Nanning 530023, China

Abstract. In order to effectively quantify the relationship between the technical level of equipment planning, operation, maintenance and control and asset utilization, and to implement the accurate strategy of equipment utilization improvement, a data-driven distribution equipment utilization evaluation method is proposed. A distribution equipment utilization evaluation model with multi-dimensional influence factors of power grid-load-management is constructed. The process visualization of in-service equipment utilization, retired equipment life-cycle utilization and comprehensive utilization of regional equipment in time domain is carried out. Multiple linear regression algorithm is used to screen the key influencing factors with high sensitivity, and it is used to construct the hyper-parameter optimization convolutional neural network model for the trend prediction of various equipment utilization indexes. Finally, the feasibility and effectiveness of the proposed method are verified through the case analysis of the actual system, which can provide technical support for the fine management of the whole life cycle of distribution assets. Keywords: Power distribution network · Equipment utilization · Life-cycle management (LCM) · Multiple linear regression analysis · Convolutional neural network (CNN)

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 156–173, 2021. https://doi.org/10.1007/978-3-030-80531-9_14

1 Introduction

In recent years, the construction of power distribution networks has been vigorously promoted. Power grid investment continues to grow while cost control has become increasingly tight [1]. The utilization of equipment assets has received more and more attention because of its close relationship to corporate benefits. Power grid companies attach great importance to asset life cycle management [2] in order to move equipment asset management from extensive to intensive. Power distribution equipment accounts for a high proportion of power grid assets and covers a wide area. Unreasonable distribution of network load, uneven operation and maintenance practice and uneven equipment quality result in low efficiency and uneven utilization of power distribution equipment in the power grid [3]. Using the concept of life cycle management to analyze the current state and development trend of distribution equipment utilization, so as to improve utilization, optimize the investment efficiency of enterprises and adapt to the new development needs under the power reform, has become the mainstream direction of lean management of distribution networks [4].

Most of the existing literature focuses on theoretical research into evaluation criteria and evaluation methods for improving equipment utilization. The evaluation index system of distribution network equipment utilization includes load factor [5], capacity-load ratio [6] and life cycle utilization [7, 8]; equipment utilization is calculated from the index definitions to evaluate the asset utilization efficiency. Some studies analyze the factors that affect asset utilization and put forward comprehensive evaluation models of equipment operation efficiency. Reference [9] selects distribution network power supply reliability, grid structure, load characteristics, grid construction margin and load supply capacity as influence factors, and constructs a comprehensive evaluation model of distribution network equipment based on the grey relational variable weight method and fuzzy integral theory. However, the indexes of that evaluation system are mainly indicators that appear frequently in the views of domestic and foreign experts and in classical studies. In reference [10], an equipment operation efficiency evaluation model based on the load duration curve is proposed, and the system operation efficiency is calculated by considering the influence of power grid security and equipment operation economy. In reference [11], a power grid asset utilization evaluation model is established based on the Data Envelopment Analysis (DEA) method, which can evaluate the risk and equipment utilization efficiency under different load and technical levels. Reference [12] puts forward an evaluation index system for retired and in-service equipment.
The utilization efficiency of decommissioned equipment is evaluated by the equipment life cycle utilization efficiency, while that of in-service equipment is evaluated by the annual maximum utilization and other indexes. The influence of the “N-X” criterion, user power demand, load characteristics, distributed generation and other factors on utilization is also analyzed. The above references lack deep mining of the influence factors, so they cannot fully reflect the operation efficiency of distribution equipment across the grid side, load side and management side. Describing the level of equipment operation and control only through the quantitative relationship between the influence factors of operation efficiency remains limited.

It is also necessary to predict the utilization efficiency of distribution equipment in the future statistical cycle, so as to guide the accurate implementation of planning schemes. At present, prediction algorithms mainly include time series models, support vector machines (SVM) and neural networks. The application of prediction models to the distribution network mainly focuses on load forecasting, photovoltaic forecasting and power supply reliability prediction [13, 14]. Equipment utilization research, however, still uses traditional evaluation methods, so the overall evaluation of utilization efficiency from historical big data remains insufficient. Reference [15] uses the Apriori algorithm and a convolutional neural network to mine the main influence factors of the operation efficiency of distribution equipment, and quantitatively measures the relationship between the operation efficiency and the main influence factors. However, the influence factors only include the four aspects of equipment, structure, operation and environment, and the evaluation indexes do not distinguish between equipment types precisely, so it is difficult to evaluate equipment utilization comprehensively.

Improving distribution equipment utilization is a complex non-mechanism modeling problem because of the diversity and uncertainty of the influence factors. This paper proposes a data-driven evaluation method of distribution equipment utilization. A multi-dimensional heterogeneous information system is built from the influence factors of the power grid, load and management sides, which have the obvious statistical characteristics of large samples, and the convolutional neural network model is applied to the equipment utilization in an actual distribution area, which helps to make reasonable predictions of equipment characteristics in long-term power planning.

2 Influence Factors of Distribution Equipment Utilization

Data driven means that an information flow is formed by collecting data and is refined and summarized according to different requirements when making decisions or optimizing products and operations, so that actions are carried out scientifically under the support or guidance of data. According to the principles of “collectability” and full coverage of the index system [16], every selected indicator should be effectively collectable, which is the basic condition for a data-driven method, and the selected indicators should cover other similar indicators in their range, so as to reflect the actual situation of the distribution network comprehensively. The operation efficiency, which is often the joint result of the power grid side, user side and management side [17], directly reflects the actual operation effect of equipment. This paper considers the three dimensions of power grid side, user side and management side, covering user load characteristics, grid structure, distributed generation output characteristics, equipment operation status, power grid operation and maintenance management and operation technology, so as to analyze the factors affecting equipment utilization comprehensively. The index system of influence factors of distribution equipment utilization is shown in Fig. 1.

(1) Load side factors mainly consider the load characteristics of the users served by the equipment, which are mainly affected by regional differences, seasonal climate conditions, industrial structure, and user categories and their proportions. The operation state of the power system can be inferred from the analysis of power load characteristics [18].

(2) The distribution network includes substations, lines and connected distributed generation (DG). The number of main transformers and the main connection mode affect the utilization of a substation.
The installed capacity of distributed generation will account for 9% of the total installed capacity of China’s power generation by 2020. As the installed capacity of distributed generation increases year by year, the penetration and absorption capacity of distributed generation will also affect the equipment utilization of the active distribution network [19]. Therefore, in addition to the network topology and equipment operation status, the influence of the output characteristics of grid-connected DG on the power flow distribution of the distribution network should also be considered.

(3) When equipment failure occurs in the distribution area, power supply enterprises need to locate the fault promptly and carry out emergency repair work. Orderly operation and maintenance of equipment is conducive to improving the reliability of power supply and supports the lean management of equipment [20]. Improving the automation and control level of distribution network dispatching can effectively improve the electronic management of power grid equipment over the whole life cycle and improve the utilization of distribution network equipment.

[Figure 1: tree of the 24 influence factors.
Load side (user load characteristics): 1. Load peak-valley difference rate; 2. Total user load fluctuation rate; 3. User load ratio; 4. User average load rate; 5. Maximum user load rate.
Grid side (network structure, DG acceptance capability, power supply capacity, equipment failure, static parameters): 6. Single transformation rate of substation; 7. Connection rate of medium voltage line; 8. Connection rate between medium voltage substations; 9. Utilization ratio of medium voltage outgoing line interval; 10. DG penetration ratio; 11. Total DG output fluctuation rate; 12. Average DG output rate; 13. Proportion of light load equipment; 14. Proportion of heavy load equipment; 15. Proportion of overload equipment; 16. Equipment failure rate; 17. Mean time to repair; 18. Average life of equipment; 19. Equipment rated capacity; 20. Equipment actual power.
Management side (operation and maintenance management level, operation technical level): 21. Planned outage delay transmission rate; 22. Average recovery time for emergency repair; 23. Distribution automation coverage rate; 24. Live operation rate.]

Fig. 1. Influence factors of distribution network equipment utilization

3 Indicators of Distribution Equipment Utilization

3.1 In-Service Equipment Utilization

To evaluate the utilization of electrical equipment over its life cycle, both the theoretical design value and the actual use value should be considered. The utilization of power equipment over a fixed time scale can be evaluated intuitively with the general capacity factor index. The utilization of in-service equipment can be defined as the ratio of the actual generated or transmitted power to the theoretical generated or transmitted power, which is directly affected by the total load, life cycle and total capacity of the equipment. The utilization of multiple in-service equipment can be comprehensively defined as the ratio of the actual power consumption to the total rated power of all equipment:

$$\eta_{in} = \frac{E_{in}}{S_N \times T} = \frac{\sum_{i=1}^{n_{in}} E_{in}^{i}}{\sum_{i=1}^{n_{in}} S_N^{i} \times T} \qquad (1)$$

where $\eta_{in}$ is the utilization of in-service equipment, $E_{in}$ is the actual total power consumption of in-service equipment in the evaluation period, $S_N$ is the total rated capacity of in-service equipment, $T$ is the given assessment cycle (generally one month, one year or another cycle), $E_{in}^{i}$ is the actual power consumption of the ith equipment in the evaluation period, $S_N^{i}$ is the rated capacity of the ith equipment, and $n_{in}$ is the quantity of in-service equipment.

3.2 Retired Equipment Utilization

According to the theory of asset life cycle management [21], the utilization of retired equipment in a given evaluation period can be defined as the ratio of the actual to the theoretical ampacity of the equipment over the life cycle. The life cycle utilization of several retired equipment is defined as the ratio of the accumulated actual power consumption of the individual equipment to their accumulated rated power consumption over the life cycle:

$$\eta_{re} = \frac{E_{re}}{S_N \times T_d} = \frac{\sum_{i=1}^{n_{re}} E_{re}^{i}}{\sum_{i=1}^{n_{re}} S_N^{i} \times T_d^{i}} \qquad (2)$$

where $\eta_{re}$ is the life cycle utilization of the retired equipment, $E_{re}$ is the actual total power consumption of the retired equipment in the whole life cycle, $T_d$ is the design life of the retired equipment, $E_{re}^{i}$ is the actual power consumption of the ith retired equipment in the whole life cycle, $S_N^{i}$ and $T_d^{i}$ are the rated capacity and design life of the ith retired equipment, and $n_{re}$ is the total number of retired equipment.

3.3 Comprehensive Utilization of Distribution Network Equipment

The entropy weight method relies on the dispersion of the data itself and uses the degree of variation of information entropy to calculate the weight of each index, which makes it an objective weighting method [22]. Considering the importance and characteristics of different types of equipment and the proportion of retired and in-service equipment, the entropy weight method is used to calculate the comprehensive utilization of equipment in the specified region within a given statistical period. The specific steps are as follows:

Step 1: Calculate the utilization of in-service equipment and of retired equipment in the evaluation year.
Step 2: Construct the evaluation index matrix of equipment utilization, as shown in (3):

$$D = (d_{ij})_{b \times m} = (D_1, D_2, \ldots, D_i, \ldots, D_m) \qquad (3)$$

where $b$ is the number of distribution areas in the evaluated district, $m$ is the number of categories of equipment to be evaluated (in-service and retired equipment), $D$ is the index matrix formed by the $b \times m$ indicator values, $D_i$ is the column vector of the ith evaluation index, that is, the utilization of the ith type of equipment in the $b$ distribution areas, and $d_{ij}$ is the jth evaluation index value of the ith distribution area.

Step 3: Calculate the entropy value of each evaluation index. The entropy value of the jth evaluation index is given by (4):

$$e_j = -k \sum_{i=1}^{b} p_{ij} \ln(p_{ij}) \qquad (4)$$

where $k = 1/\ln(b)$ and $p_{ij}$ is the share of the ith distribution area in the total score of all distribution areas on the jth index:

$$p_{ij} = d_{ij} \Big/ \sum_{i=1}^{b} d_{ij} \qquad (5)$$

Step 4: Calculate the entropy weight of each evaluation index. The entropy weight of the jth evaluation index is given by (6):

$$w_j = (1 - e_j) \Big/ \sum_{j=1}^{m} (1 - e_j) \qquad (6)$$

where $1 - e_j$ is the dispersion degree of the jth evaluation index. If the values of the jth evaluation index are the same for all distribution areas, the entropy value $e_j$ reaches its maximum; the index then provides no effective information, so its entropy weight should be the minimum.

Step 5: Calculate the target value. The comprehensive utilization $\eta_i$ of the equipment of the ith distribution area in the evaluation year is given by (7):

$$\eta_i = \sum_{j=1}^{m} w_j p_{ij} \qquad (7)$$
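The capacity-factor definitions in (1) and (2) and the entropy weight steps in (3) to (7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the values in `D_EXAMPLE` are hypothetical, all index values are assumed positive, at least two distribution areas are assumed, and at least one index is assumed to vary between areas (otherwise the denominator of (6) is zero).

```python
import math


def utilization(energies, capacities, periods):
    """Capacity-factor utilization as in Eqs. (1) and (2): sum(E_i) / sum(S_i * T_i)."""
    return sum(energies) / sum(s * t for s, t in zip(capacities, periods))


def entropy_weights(d):
    """Return (p, w) for an index matrix d with b rows (areas) and m columns (indexes)."""
    b, m = len(d), len(d[0])
    k = 1.0 / math.log(b)  # requires b >= 2
    column_sums = [sum(row[j] for row in d) for j in range(m)]
    p = [[row[j] / column_sums[j] for j in range(m)] for row in d]  # Eq. (5)
    e = [-k * sum(row[j] * math.log(row[j]) for row in p if row[j] > 0)
         for j in range(m)]  # Eq. (4), treating 0 * ln(0) as 0
    total_dispersion = sum(1.0 - ej for ej in e)
    w = [(1.0 - ej) / total_dispersion for ej in e]  # Eq. (6)
    return p, w


def comprehensive_utilization(d):
    """Eq. (7): weighted sum of the normalized shares p_ij for each area."""
    p, w = entropy_weights(d)
    return [sum(wj * pij for wj, pij in zip(w, row)) for row in p]


# Hypothetical utilization values: rows are distribution areas, columns are
# (in-service utilization, retired life-cycle utilization).
D_EXAMPLE = [[0.30, 0.20], [0.40, 0.25], [0.35, 0.60]]
```

For in-service equipment, `periods` would hold the same assessment cycle T for every item; for retired equipment it would hold the individual design lives. In `D_EXAMPLE` the second column is more dispersed than the first, so it receives the larger entropy weight.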

4 Data Processing and Key Influence Factors Mining

The influence factor system of equipment utilization, based on the three dimensions of power grid, load and management, covers the massive multi-source heterogeneous data generated in the whole process of distribution network operation, dispatching, management and maintenance, which is too complex to feed directly into a prediction model. Therefore, the multiple linear regression algorithm is applied to mine the correlation between the influence factors and the utilization, so as to select the key influence factors of utilization. To reduce the dimension of the input variables and improve the effectiveness of the prediction model, the key factors are used as the multi-dimensional feature vector input of the prediction model.

4.1 Normalization of Influence Factors

To eliminate the influence of dimension and order of magnitude among different influence factors, min-max standardization is used to normalize the initial data, so that the data are mapped into the interval [0, 1]. Assume that the index system of influence factors has $n$ sub-indexes, the index set of influence factors is $\{C_1, C_2, \ldots, C_n\}$, and $c_{ij}$ is the initial value of the jth influence factor index in the ith evaluation year. The index matrix is shown in (8):

$$C = (c_{ij})_{b \times n} = \{C_1, C_2, \ldots, C_n\} \qquad (8)$$

If the influence factor index is positive (the larger the value, the better), it is normalized by (9). If the index is reverse (the smaller the value, the better), it is normalized by (10).

$$c_{ij}^{*} = \frac{c_{ij} - \min(c_{1j}, c_{2j}, \ldots, c_{bj})}{\max(c_{1j}, c_{2j}, \ldots, c_{bj}) - \min(c_{1j}, c_{2j}, \ldots, c_{bj})} \qquad (9)$$

$$c_{ij}^{*} = \frac{\max(c_{1j}, c_{2j}, \ldots, c_{bj}) - c_{ij}}{\max(c_{1j}, c_{2j}, \ldots, c_{bj}) - \min(c_{1j}, c_{2j}, \ldots, c_{bj})} \qquad (10)$$
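As an illustration of (9) and (10), the following Python sketch normalizes each influence factor column. The function names are hypothetical, and mapping a constant column to zeros is an assumption made here to avoid division by zero; the paper does not say how that case is handled.

```python
def min_max_normalize(column, positive=True):
    """Apply Eq. (9) to a positive index or Eq. (10) to a reverse index."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.
        return [0.0 for _ in column]
    if positive:
        return [(c - lo) / (hi - lo) for c in column]  # Eq. (9)
    return [(hi - c) / (hi - lo) for c in column]      # Eq. (10)


def normalize_columns(matrix, positive_flags):
    """Normalize every column of a samples-by-factors matrix."""
    columns = [min_max_normalize([row[j] for row in matrix], flag)
               for j, flag in enumerate(positive_flags)]
    return [list(row) for row in zip(*columns)]
```

For example, `normalize_columns([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]], [True, False])` treats the first factor as positive and the second as reverse.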

After normalization, the matrix $C^{*} = (c_{ij}^{*})_{b \times n}$ is obtained, where $c_{ij}^{*}$ is the dimensionless, normalized value of the jth influence factor in the ith evaluation year.

4.2 Multiple Linear Regression Modeling

In the key influence factor determination module, data samples of the equipment utilization factors of each distribution area in different evaluation years are obtained. According to the multiple linear regression analysis model, the independent variable matrix $X$ is built from the utilization influence factors of each distribution area, and the dependent variable matrix $Y$ from the comprehensive utilization. The multiple linear regression model is shown in (11):

$$Y = X\beta + \varepsilon \qquad (11)$$

where $Y = [Y_1, Y_2, \ldots, Y_i, \ldots, Y_r]^{T}$, $r$ is the total number of data samples, and $Y_i$ is the comprehensive utilization of the equipment of the ith data sample in the distribution area; $X = [e, X_1, X_2, \ldots, X_k, \ldots, X_r]^{T}$, where $e = [1, 1, \ldots, 1]^{T}$ is the $n \times 1$ vector of ones; the influence factor column vector of the kth data sample is $X_k = [X_{1k}, X_{2k}, \ldots, X_{nk}]^{T}$; the regression coefficient column vector is $\beta = [\beta_0, \beta_1, \ldots, \beta_n]^{T}$; and $\varepsilon = [\varepsilon_0, \varepsilon_1, \ldots, \varepsilon_n]^{T}$ is the random error term, with $\varepsilon \sim N(0, \sigma^2)$. The regression coefficients are obtained by the least squares method, which minimizes the residual sum of squares over all observations. The regression coefficients serve as sensitivities: according to them, the importance of each influence factor for the comprehensive equipment utilization is analyzed, the key factors affecting the utilization of distribution network equipment are screened, and the multi-dimensional indexes of the grid side, load side and management side of the distribution network are mined and analyzed. The sensitivity is calculated as follows:

$$\beta = (X^{T}X)^{-1}X^{T}Y \qquad (12)$$
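A minimal sketch of the least squares estimate in (12), written with the Python standard library only, is given below. It solves the normal equations $(X^{T}X)\beta = X^{T}Y$ by Gaussian elimination instead of forming the inverse explicitly. The helper names and the screening rule in `key_factors` (absolute coefficient above a threshold) are assumptions made for illustration, and the sketch assumes $X^{T}X$ is non-singular.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def least_squares(samples, targets):
    """Return [beta_0, beta_1, ..., beta_n] from the normal equations of Eq. (12)."""
    x = [[1.0] + list(row) for row in samples]  # prepend the intercept column e
    cols = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(cols)] for i in range(cols)]
    xty = [sum(r[i] * y for r, y in zip(x, targets)) for i in range(cols)]
    return solve_linear_system(xtx, xty)


def key_factors(beta, threshold):
    """Indices (1-based) of factors whose sensitivity exceeds the threshold.

    Assumption: sensitivity is taken as the absolute regression coefficient.
    """
    return [j for j in range(1, len(beta)) if abs(beta[j]) > threshold]
```

Each row of `samples` holds the normalized influence factors of one data sample, and `targets` holds the corresponding comprehensive utilization values.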

5 Equipment Utilization Prediction Based on Convolution Neural Network

A convolutional neural network (CNN) is a feedforward neural network with a deep structure and convolution computation, and is widely used in deep learning. A CNN applies convolution and pooling to the original data to extract local features, and finally passes these local features to the fully connected layer. Its local connections and weight sharing allow it to extract input features effectively [23]. In this paper, a CNN is used to construct the prediction model of distribution equipment utilization and to determine the correlation between key influence factors and equipment utilization indexes, so as to accomplish multi-dimensional correlation analysis of distribution equipment utilization under different optimized combinations of independent variables.


5.1 Convolution Neural Network

The multi-dimensional input matrix of influence factors is converted into N two-dimensional square feature maps as the input of the CNN, and the feature maps are convolved and pooled to extract features. To simplify the convolutional layer operation, the rolling window size is set equal to the number of feature vectors. The convolutional layer extracts the input features: each convolution kernel slides over the input matrix, and at every window position the element-wise products of kernel and input are summed, which extracts and maps features at multiple depths. “Same convolution” is selected as the convolution method, and different features are extracted by convolving multiple kernels with the feature maps. The ReLU activation function then applies a nonlinear mapping to the neurons. The pooling layer summarizes the features obtained by convolution and compresses the data through operations such as maximum pooling and average pooling, which helps to reduce overfitting and improve the fault tolerance of the model. The fully connected layer is embedded at the output end of the network in the form of a BP neural network; it merges the pooled features and uses feedforward calculation and backpropagation to find the quantitative relationship between input and output. The output vector consists of three indicators: in-service equipment utilization, retired equipment utilization and comprehensive equipment utilization in the distribution area. The setting of hyperparameters affects the performance of the prediction model [24].
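The convolution, activation and pooling operations described above can be illustrated with a small pure-Python sketch on a single feature map. It is not the model used in the paper: the function names are hypothetical, only one kernel and one channel are handled, and for an even kernel size the extra padding row and column are placed at the bottom and right, which is an assumption about how “same” padding is applied.

```python
def conv2d_same(image, kernel):
    """'Same' convolution: zero-pad so the output has the input's height and width."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    top, left = (kh - 1) // 2, (kw - 1) // 2  # assumption: extra padding goes bottom/right
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    rr, cc = r + i - top, c + j - left
                    if 0 <= rr < h and 0 <= cc < w:  # cells outside the map count as zero
                        acc += image[rr][cc] * kernel[i][j]
            out[r][c] = acc
    return out


def relu(fmap):
    """Element-wise ReLU activation."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping maximum pooling with a size x size window."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]
```

Applied to a 12 × 12 feature map with a 4 × 4 kernel, `conv2d_same` returns a 12 × 12 map and `max_pool` reduces it to 6 × 6.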
In this paper, the performance of the prediction model is optimized by adjusting the learning rate in CNN training: a learning rate that is too small slows model training, while one that is too large makes training unstable. In addition, to alleviate the poor generalization of overly powerful neural networks, dropout, which randomly drops neurons, is introduced. The connection weights of the dropped neurons are set to zero, and these neurons do not take part in the forward calculation or backpropagation of network training, which avoids overfitting and increases the diversity of data. The root mean squared error (RMSE) and the mean absolute percentage error (MAPE) are used as performance indicators to evaluate the error between the model predictions and the true values. The RMSE is given by:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(p_i - \hat{p}_i\right)^2} \qquad (13)$$

where $p_i$ is the actual value of the utilization of the ith equipment, $\hat{p}_i$ is the predicted value of the utilization of the ith equipment, and $N$ is the number of data samples.

5.2 Basic Process of Equipment Utilization Prediction

The basic process of data-driven distribution equipment utilization prediction is shown in Fig. 2.


Fig. 2. Data-driven equipment utilization prediction in distribution network

(1) Collect the original data of each distribution area from historical big data, including the distribution network structure, grid operation and maintenance data, asset management system data, equipment historical operation data and other related information.

(2) Define the utilization of in-service equipment and the whole life cycle utilization of retired equipment by the capacity factor, and calculate the comprehensive whole-life-cycle utilization of the equipment in different evaluation years using the entropy weight method.

(3) Calculate the values of the influence factors of equipment utilization in the grid side, load side and management side dimensions, use the multiple linear regression algorithm to explore the correlation between the influence factors and the comprehensive utilization, and screen out the key influence indicators with high sensitivity in each dimension to determine the upgrade priorities of the indicators in different dimensions.

(4) Take the square feature maps built from the key influence factors in step (3) as the input of the deep convolutional neural network, which reduces the dimension of the prediction model input data; take the three equipment utilization indicators in step (2) as the output variables of the network, and generate the training and test sample sets.

(5) Optimize the hyperparameters of the convolutional neural network model and train it to obtain the optimized prediction model.

(6) Enter the values of the key factors of the distribution area in the target year into the model obtained in step (5) to obtain the predicted equipment utilization. The model can thus predict the equipment utilization of the distribution area to be evaluated in a future target year. Asset utilization trends are predicted in combination with regional characteristics, and corresponding measures for improving equipment utilization are formulated.

(7) Establish a multi-level benchmarking model and carry out a multi-level benchmarking evaluation, combining horizontal and vertical benchmarking, on the predicted results for the in-service and retired equipment of the distribution area in the target year.
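The two error metrics named in Sect. 5.1 for assessing the trained model can be sketched as follows. The RMSE follows (13). The MAPE expression is not printed in this section, so the usual definition, $\frac{100\%}{N}\sum_{i}\left|\frac{p_i - \hat{p}_i}{p_i}\right|$, is assumed here; the function names are hypothetical, and all actual values are assumed to be non-zero.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, Eq. (13)."""
    n = len(actual)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error in percent (standard definition, assumed)."""
    n = len(actual)
    return 100.0 / n * sum(abs((p - q) / p) for p, q in zip(actual, predicted))
```

Both functions take the actual and predicted utilization values of the test samples as equally long sequences.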

6 Case Study

6.1 Key Factor Screening

Taking Nanning City, Guangxi as an example, historical line utilization management data of different distribution areas in Nanning are selected to illustrate the utilization level of in-service and retired equipment over the life cycle. Multiple linear regression is then applied to mine the correlation of indicators from a large number of data samples of the influence factors, and the sensitivities of the load-side, grid-side and management-side factors are benchmarked horizontally to determine the importance of each influence factor. The data in the example are mainly obtained from Nanning Power Supply Bureau and the Guangxi Distribution Network Data Analysis and Management platform. Taking the medium-voltage distribution network in Shanglin District of Nanning as an example, the in-service and retired lines of each distribution area are selected for analysis, and the changes over time of the utilization of in-service equipment, the full life cycle utilization of retired equipment and the comprehensive utilization of all equipment from 2017 to 2019 are visualized in Fig. 3. Fig. 3 shows that the utilization of distribution equipment in Shanglin District has improved to varying degrees every year. After the life cycle management of retired equipment is considered, the comprehensive utilization of distribution equipment changes significantly. The sensitivities of the influence factors on the load, power grid and management sides of the distribution area in Nanning are shown in Fig. 4. On the load side, for every 1% change in the load peak-valley difference rate and the total user load fluctuation rate, the equipment utilization changes by more than 0.36%.
Nanning Power Supply Bureau should control load operation characteristics through overall planning of user load types and implement demand-side response methods such as time-of-use electricity prices to suppress excessive load fluctuations in the region. The load side can be adjusted relatively flexibly, so measures there deserve attention as a way to improve utilization.

On the grid side, the network structure influence factors are more sensitive. For every 1% change in an index of this type, the equipment utilization changes by more than 0.35% in most cases, which shows that the network structure indexes strongly affect equipment utilization in the region. Since it is more difficult to change the grid structure after the equipment is put into operation, the number of main transformers in the substation and the design of the main wiring mode in the station should be given attention when planning the distribution system, so as to improve the rationality of the distribution network structure. In addition, the sensitivity of the DG penetration ratio of the distribution network in this region is relatively high. The installation proportion of distributed power should be planned scientifically to alleviate the system uncertainty caused by the growing proportion of distributed power and to reduce its impact on equipment utilization. Besides, the sensitivities of the equipment rated capacity and the equipment failure rate are high. This region can focus on improving the equipment capacity margin, selecting equipment according to the load growth rate, and strengthening equipment failure management to reduce the number of failures.

On the management side, the distribution automation coverage rate of the Nanning power supply area has the highest sensitivity, followed by the live operation rate, which reflects the importance of the operation technology level of the distribution network to equipment utilization. The management department should give priority to improving the automation level of the distribution network, strengthen the monitoring and analysis of the operation status, and improve the management of live operation.

The key influence factor indicators with sensitivity higher than 0.35 are selected as the input feature vector of the convolutional neural network prediction model. Fig. 4 shows that the influence factors with sensitivity higher than 0.35 are load peak-valley difference rate, total user load fluctuation rate, user average load rate, single transformation rate of substation, connection rate of medium voltage line, connection rate between medium voltage substations, DG penetration ratio, equipment failure rate, equipment rated capacity, equipment actual power, distribution automation coverage rate and live operation rate. These 12 influence factor indicators are used as the input parameters of the prediction model.

Fig. 3. Trend of equipment utilization in the time domain in Shanglin District

Fig. 4. Sensitivity comparison of load-grid-management side factors

6.2 CNN Prediction Model

The distribution network production management data, equipment operation parameters (including all in-service lines and retired lines) and other relevant data of a total of 50 distribution areas in Nanning from 2000 to 2019 are selected for statistical analysis, and the multi-dimensional influence factor indicators on the grid, load and management sides and the equipment utilization are calculated.
The data sampling interval is one year, which is also the statistical evaluation period, giving a total of 1000 data samples, of which 975 are used to train the model and 25 are used as the test set to measure the error and accuracy of the model output. All experiments are repeated 10 times in MATLAB R2019b to determine the final prediction results and to reduce the influence of random error.

6.2.1 CNN Parameter Settings

The twelve-dimensional influence factors are taken as input feature vectors and normalized. The input matrix is transformed into two-dimensional square feature maps by the rolling window method. The rolling window size is the same as the number of feature vectors, so the dimension of each two-dimensional matrix is 12 × 12. This paper sets up two two-dimensional convolutional layers, two maximum pooling layers and one flatten layer, and adds Dropout regularization to further prevent overfitting. The numbers of convolutional layer filters are 32 and 64 respectively, the step (stride) is 1, the convolution kernel size is 4 × 4, the pool size is 2 × 2, and the final flattened output is a one-dimensional array with a length of 576. The convolutional neural network structure is shown in Fig. 5.

[Fig. 5. Convolutional neural network architecture: Input 12×12×1 → Conv1 (4×4, 32 filters) 12×12×32 → Pool1 (2×2) 6×6×32 → Conv2 (4×4, 64 filters) 6×6×64 → Pool2 (2×2) 3×3×64 → Flatten 576 → Dropout 0.4]

The hyperparameters of CNN training considered in this paper are the learning rate and the Dropout regularization weight coefficient. They are tuned with the control variable method based on 1000 iterations. The CNN learning rate is gradually attenuated without setting the regularization term to find the optimal rate that minimizes the squared difference loss of the model prediction. The results are shown in Table 1.

Table 1. Squared difference loss of different learning rates

| Learning rate | Dropout weight coefficient | Squared difference loss |
|---|---|---|
| 0.1 | 0 | 0.00896 |
| 0.01 | 0 | 0.00689 |
| 0.001 | 0 | 0.00807 |
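The layer dimensions quoted in Sect. 6.2.1 and Fig. 5 can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the authors' MATLAB implementation. It assumes stride-1 convolutions with zero ("same") padding, which the text does not state but which is consistent with the 12×12 feature maps shown after Conv1. The function names are hypothetical.

```python
def conv_same(shape, filters):
    """Stride-1 convolution with zero ('same') padding (assumed): height and width are kept."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, pool=2):
    """Non-overlapping pool x pool max pooling: height and width are divided by pool."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def layer_shapes(input_shape=(12, 12, 1)):
    """Return (layer name, output shape) pairs for the layer stack described in the text."""
    conv1 = conv_same(input_shape, 32)   # 4x4 kernels, 32 filters
    pool1 = max_pool(conv1)              # 2x2 pooling
    conv2 = conv_same(pool1, 64)         # 4x4 kernels, 64 filters
    pool2 = max_pool(conv2)              # 2x2 pooling
    flat = pool2[0] * pool2[1] * pool2[2]
    return [("Input", input_shape), ("Conv1", conv1), ("Pool1", pool1),
            ("Conv2", conv2), ("Pool2", pool2), ("Flatten", flat)]


for name, shape in layer_shapes():
    print(name, shape)
```

Under these assumptions the last pooling layer yields a 3×3×64 volume, so the flattened vector has 3 · 3 · 64 = 576 elements, matching the length given in the text.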

Table 2. Squared difference loss of different dropout weight coefficients

| Dropout weight coefficient | Number of iterations | Squared difference loss |
|---|---|---|
| 0.5 | 800 | 0.00646 |
| 0.4 | 800 | 0.00430 |
| 0.3 | 800 | 0.00723 |
| 0.2 | 800 | 0.00769 |
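The two-stage selection behind Tables 1 and 2 (vary one hyperparameter while the other is held fixed, then keep the value with the smallest loss) can be sketched as follows. The loss values are copied from the two tables. The helper name `best_setting` is hypothetical, and the sketch reproduces only the final selection step, not the training runs themselves.

```python
# Squared difference losses copied from Table 1 (Dropout coefficient fixed at 0).
LEARNING_RATE_LOSS = {0.1: 0.00896, 0.01: 0.00689, 0.001: 0.00807}

# Squared difference losses copied from Table 2 (800 iterations each).
DROPOUT_LOSS = {0.5: 0.00646, 0.4: 0.00430, 0.3: 0.00723, 0.2: 0.00769}


def best_setting(loss_by_value):
    """Return the hyperparameter value with the smallest recorded loss."""
    return min(loss_by_value, key=loss_by_value.get)


# Stage 1: choose the learning rate; stage 2: choose the Dropout coefficient.
best_learning_rate = best_setting(LEARNING_RATE_LOSS)
best_dropout = best_setting(DROPOUT_LOSS)
print(best_learning_rate, best_dropout)
```

Tuning one variable at a time keeps the number of training runs small, at the cost of not exploring combinations of the two hyperparameters.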

On the basis of Table 1, the Dropout regularization weight coefficient is adjusted to find the smallest squared difference loss of the model prediction. The results are shown in Table 2. It can be seen from Tables 1 and 2 that the squared difference loss is smallest when the learning rate is set to 0.01, and that the loss after iteration is smallest when the Dropout weight coefficient is set to 0.4. Based on this result, the learning rate in this paper is set to 0.01, and the Dropout regularization weight coefficient for CNN model training is set to 0.4.

6.2.2 Error Analysis

After the convolutional neural network model parameters are determined, the 25 test samples of the power distribution areas are evaluated, and the average values of the root mean square error and the mean absolute percentage error are calculated and compared with the prediction results of the traditional BP neural network, as shown in Table 3. To show more intuitively the differences between the prediction results of the different neural network models and the actual utilization level of equipment, the predicted values of the three types of equipment utilization in each distribution area in Nanning are compared with the actual values, as shown in Fig. 6.

Table 3. Comparison of model prediction results

| Model | Output index | yRMSE | yMAPE |
|---|---|---|---|
| CNN model | In-service equipment utilization | 0.0540 | 0.0160 |
| CNN model | Retired equipment utilization | 0.0756 | 0.0147 |
| CNN model | Comprehensive utilization | 0.0656 | 0.0133 |
| BP model | In-service equipment utilization | 0.1059 | 0.0504 |
| BP model | Retired equipment utilization | 0.1316 | 0.0450 |
| BP model | Comprehensive utilization | 0.1121 | 0.0434 |

Fig. 6. Model prediction curve comparison diagram

It can be seen from Table 3 and Fig. 6 that, compared with the traditional BP neural network model, the root mean square error of the CNN prediction model proposed in this paper decreases by 0.0519, 0.0560 and 0.0465 for in-service equipment utilization, retired equipment utilization and comprehensive equipment utilization, respectively, and the mean absolute percentage error decreases by 0.0344, 0.0303 and 0.0301, respectively. The CNN model fits the actual utilization curve most closely, matches the actual situation better, and can be used to evaluate the development trend of the equipment utilization capacity of each distribution area in Nanning.
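For reference, the two error measures and the reductions quoted above can be reproduced as follows. This is a minimal sketch: the text does not give the formulas, so the standard definitions of RMSE and MAPE (as a fraction rather than a percentage, which matches the magnitude of the values in Table 3) are assumed. The function and dictionary names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; actual values must be non-zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)


# (RMSE, MAPE) per output index, copied from Table 3.
CNN = {"in-service": (0.0540, 0.0160), "retired": (0.0756, 0.0147),
       "comprehensive": (0.0656, 0.0133)}
BP = {"in-service": (0.1059, 0.0504), "retired": (0.1316, 0.0450),
      "comprehensive": (0.1121, 0.0434)}


def reductions():
    """Error reduction of the CNN model relative to the BP model for each index."""
    return {name: (round(BP[name][0] - CNN[name][0], 4),
                   round(BP[name][1] - CNN[name][1], 4))
            for name in CNN}


for name, (d_rmse, d_mape) in reductions().items():
    print(f"{name}: RMSE -{d_rmse:.4f}, MAPE -{d_mape:.4f}")
```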


6.3 Analysis of Prediction Results of Target Year

According to administrative divisions, the distribution network of Nanning in 2020 can be divided into 12 districts. The predicted values of the key influence factors of equipment utilization in each district are used as the input parameters of the CNN model, and the predicted equipment utilization levels of the distribution districts in the planning target year are shown in Figs. 7, 8 and 9. Horizontal benchmarking of the predicted values of the three equipment utilization indicators shows that Qingxiu District, Xixiangtang District and Jiangnan District remain at a higher level than the other districts in 2020. They can be regarded as equipment management benchmarking districts that help the other districts develop equipment management and control plans. However, the predicted values of Long’an District, Mashan District and Shanglin District are all at a low level, which means that their overall equipment asset management situation is not optimistic. These districts should be designated as priority governance zones when the distribution equipment planning strategy for the forecast target year is formulated. Figure 8 shows the differences in the actual operating efficiency of in-service equipment among the 12 distribution districts in Nanning during the forecast year. The utilization of in-service equipment in most distribution districts is in the range of 30% to 40%; that of Qingxiu and Xingning districts is relatively high and can reach more than 40%. However, the utilization of in-service equipment in Shanglin, Long’an and Mashan districts is poor, at about 20%, which shows that there is still a lot of room for improvement in the operation efficiency of these districts, so it is necessary to focus on improving the management and control of equipment operation.
Figure 9 shows that, once the service capacity within the actual service life cycle is considered, the differences in equipment utilization become more significant. The utilization of retired equipment in the Nanning power distribution network is uneven over the whole life cycle: the highest values exceed 60%, while some distribution districts, such as Mashan and Shanglin, are at around 35%, so the life cycle management of equipment assets still needs to be strengthened.

Fig. 7. Prediction of comprehensive equipment utilization in the target year


Fig. 8. Prediction of in-service equipment utilization in the target year

Fig. 9. Prediction of retired equipment utilization in Nanning power distribution district

Table 4. Prediction of improvement degree of annual utilization in Shanglin and Qingxiu

| Index | Forecast annual growth rate of Shanglin | Forecast annual growth rate of Qingxiu |
|---|---|---|
| In-service equipment utilization | 4.87% | 5.41% |
| Retired equipment utilization | 1.01% | 3.49% |
| Comprehensive utilization | 2.81% | 4.59% |

Taking Shanglin District as an example, Qingxiu District, the best benchmarking district and the one that differs greatly from it in equipment utilization, is selected for comparison.


The differences in comprehensive equipment utilization, in-service equipment utilization and retired equipment utilization between the two districts are 24.96%, 21.10% and 29.157%, respectively. Shanglin District has more room for improvement in the asset management of retired equipment. To improve its overall equipment utilization level, it can focus on effective management of equipment assets over the whole life cycle, for example by balancing the conflict between a high load rate and a long service life of distribution equipment and by coordinating the uneven load distribution among equipment. The predicted annual growth rates of equipment utilization in 2020 relative to 2019 for Shanglin and Qingxiu districts are compared in Table 4. In 2020, the utilization of in-service equipment in Shanglin District increases faster than that of retired equipment, and the effect of equipment operation and control improves more obviously than in previous years. However, compared with the benchmarking district, the annual growth rates of in-service and retired equipment utilization still tend to be flat. Shanglin District can continue to strengthen load regulation and control, optimize the system coordination control strategy, and steadily improve user load management. The results in Figs. 7, 8 and 9 indicate that predicting equipment utilization in Nanning with the CNN is valid and feasible.

7 Conclusion

Equipment utilization is an important index for measuring the asset management level of power grid enterprises, and it can be used to optimize the input-output efficiency of the distribution network effectively. This paper studies the prediction of distribution network equipment utilization. A series of practical results have been obtained in multi-dimensional data mining and analysis of influence factors, formulation of evaluation indicators, evaluation of equipment asset trends, implementation of improvement measures and so on, which makes up for the insufficient coverage of index analysis dimensions and the lack of equipment utilization trend prediction in existing research. Data-driven utilization prediction places high requirements on the original statistical data. Because of the small number of training samples in this study, the accuracy of the prediction results is affected to some extent. However, the results of applying this method to Nanning show that the CNN converges well and can provide an important reference for the equipment management of power grid enterprises in the future.

Acknowledgment. This project is supported by Key Projects of China Southern Power Grid (GXKJXM20170389).


Approaches to Validation of Quantification of the Variable “Relationship Between Users” in the Context of Social Engineering Attacks

Anastasiia Khlobystova1,2(B), Maxim Abramov1,2, and Tatiana Tulupyeva1,2,3

1 Laboratory of Theoretical and Interdisciplinary Problems of Informatics, St. Petersburg Federal Research Center of the Russian Academy of Sciences (SPC RAS), St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, 14-th Linia, VI, No. 39, St. Petersburg 199178, Russia
{aok,mva,tvt}@dscs.pro
2 Mathematics and Mechanics Faculty, St. Petersburg State University, Universitetskaya Emb., 7-9, St. Petersburg 199034, Russia
3 Branch of RANEPA, North-West Institute of Management, Sredniy prosp. VI, 57, St. Petersburg 199034, Russia

Abstract. The purpose of this study is to propose approaches to validating the quantification of the variable “relationship between users” in the context of social engineering attacks, and to consider the accuracy of the proposed model, checks of the correlation between the formulations used in the research and the desired characteristics, and other conditions that must be met when conducting this kind of research. The contribution to the theory of the research area consists in providing approaches to validating the quantification of the variable. The practical significance of the results consists in forming the basis for the subsequent use of estimates of “types of relationship between users” in analyzing the propagation trajectories of multistep social engineering attacks. Knowledge of such estimates would make it possible to identify vulnerabilities in the social graph of company employees. In addition, this study is one of the components of the foundation for the follow-up diagnostics of information systems in order to identify vulnerabilities to social engineering attacks, and can be used in solving social computing problems. The novelty of the study lies in validating the quantification of the variable “types of relationship between users”.

Keywords: Information security · Social engineering attacks · Validation · Types of relationships · Social user graph

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 174–180, 2021. https://doi.org/10.1007/978-3-030-80531-9_15

1 Introduction

The significant growth in the introduction of information technologies in different areas of life, such as the economy, production, energy, agriculture, education, health care and a number of others [12, 15], requires increased attention to the security issues of digital technologies [4, 14, 19]. The most important aspect here is the user through whom the social engineering attack is carried out. The term “social engineering attack” means a set of applied psychological and analytical methods that attackers use to covertly motivate users of a public or corporate network to violate the established rules and policies in the field of information security [1]. The importance of research in this area is confirmed by statistics cited by a number of experts; in particular, HackerU [18] indicates that in 2019 more than 70% of successful cyber-attacks were carried out using social engineering methods. In addition, it is noted that about 76% of employed Russians do not have the skills to protect their data, since they received their education before the period of digital transformation [18]. In this context, one of the largest banks in Russia, Sberbank, expressed the need to include measures for countering fraud based on social engineering methods in the national program “Digital Economy” [16]. Open Source Intelligence (OSINT) often accompanies a successful social engineering attack: malefactors are keen to obtain as much information as possible about the attacked object, for example from its profile on a social networking service [17]. Therefore, an important factor in research on social engineering attacks is the analysis of the user’s profile on social networks, with the subsequent determination of the level of the user’s vulnerability to the manipulations of the social engineering malefactor. The authors presented the development of approaches, methods and models for increasing the protection of users against social engineering attacks in [1, 7, 8]. In particular, approaches to the quantification of different types of relationships indicated by users in online social networks were discussed in [7, 8].
With the proposed approaches, numerical estimates can be obtained that make it possible to estimate the probability of an attacker passing along a certain trajectory (a certain scenario) of the attack development. This study aims to explore approaches to validating the quantification of the variable “relationship between users” in the context of social engineering attacks, and to consider the accuracy of the proposed model, checks of the correlation between the formulations used in the research and the desired characteristics, and other conditions that must be met when conducting this kind of research. The contribution to the theory of the research area consists in providing approaches to validating the quantification of the variable. The practical significance of the results consists in forming the basis for the subsequent use of estimates of “types of relationship between users” in analyzing the propagation trajectories of multistep social engineering attacks. Knowledge of such estimates would make it possible to identify vulnerabilities in the social graph of company employees. In addition, this study is one of the components of the foundation for the follow-up diagnostics of information systems in order to identify vulnerabilities to social engineering attacks, and can be used in solving social computing problems. The novelty of the study lies in validating the quantification of the variable “types of relationship between users”. This paper is organized as follows. Section 2 describes the related works. Section 3 describes the statement of the problem. Section 4 describes approaches to checking the validity of quantified values. Section 5 discusses the results of such research. Finally, Sect. 6 presents conclusions.


2 Related Works

A number of sources related to the validation of human-operated and human-interactive systems were reviewed to explore validation issues. In [10], formal verification and validation processes were presented for an approach based on Bayesian networks used to model the relationship between performance shaping factors and human errors; however, this approach is not applicable to the studies under consideration. The authors of [3] studied a method for the formal verification of human-interactive systems. As a verification method, the article presents a process capable of both verifying a specification and providing a counterexample when a specification is violated. However, the proposed method is aimed at checking the system with respect to human interaction with it, not at checking how well the system models or analyzes user behavior. In [5], verification and validation challenges for human factors professionals involved in the design and development of complex human-operated systems are discussed. The paper describes how the problem of verification and validation is addressed now and what is needed to address future challenges. It is also noted that everyday system operation is in fact a form of validation, but that a formally adequate answer to how to sufficiently validate a human-machine system is impossible. As a measure to improve the reliability of the evaluation results, it is proposed to use 2-dimension linguistic variables [9]. In addition, the authors present some operational laws, score functions and accuracy functions. This information can be used in follow-up studies, for example when aggregating types of relationships between users from different online social networks, but cannot be applied in the one-dimensional case considered in this study. A decision-making model based on interval type-2 fuzzy linguistic variables is described in [20].
The authors note that it is important to take into account both people’s own preferences and external social relationships when assessing decision-making. The results obtained are recommended for use in large-scale group decision making in advertising strategy and commodity recommendation through social media. This model is also useful for further research and can be applied in the context of social engineering attacks. As an approach to checking the results, [13] presents the design of a fuzzy decision support system (F.D.S.S.), which is compared with a fuzzy TOPSIS method. Fuzzy TOPSIS is a method that can help in the objective and systematic evaluation of alternatives on multiple criteria [11]. This approach can be used at the next phase of the study, namely at the stage of modeling the process of an attacker’s passing along a certain trajectory. The field of psychology can serve as a source of approaches to validating the results obtained, because one of the objects of its research is the behavior and relationships of people in large and small social groups. For example, there is empirical validity, which describes how closely scores on a test correspond (correlate) with behavior as measured in other contexts [2]. For the current studies, the correlation can be checked between the following: “estimates of the probability of an attacker passing from one user to another” and “an estimate of the strength of the type of relationship between two users”. The basis for this research was the work [7, 8], in which approaches to building estimates of the intensity of interactions between users were studied and quantified estimates were obtained. For this, the method of constructing estimates of the probability of alternatives from non-numerical, inaccurate and incomplete information [9] applied to linguistic variables [21] can be used.

3 Statement of the Problem

The purpose of studies [7, 8] was to obtain intensity estimates for $m$ different types of relationship. To achieve this goal, a survey of experts was conducted and, based on their answers, a set of pairs $\{(O_i, F_i)\}_{1 \le i \le n}$ was obtained, where $O_i = (R_{j_1}, R_{j_2}, \ldots, R_{j_k})$, $1 \le k \le m$, is an ordered set of ranked relationships (relationships rank) and $F_i$ is the number of experts who ranked the types of relationships in this order (order frequency). After that, the method by Khovanov [9] was applied to $\{(O_i, F_i)\}_{1 \le i \le n}$, using the Bayesian model of randomization of the uncertainty of the choice of the vector of probability estimates from the set of all permissible vectors, on the basis of which the aggregate estimates of the respondents were obtained. The purpose of this study is to propose approaches to validating the quantification of variables, and to consider the accuracy of the proposed model, checks of the correlation between the formulations used in the research and the desired characteristics, and other conditions that must be met when conducting this kind of research.
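The first step described above, turning raw expert answers into pairs of an ordering and its frequency, can be sketched as follows. This covers only the counting step and is not an implementation of the method from [9]. The function name and the relationship labels other than “best friend” are hypothetical and serve only as an illustration.

```python
from collections import Counter


def order_frequencies(expert_rankings):
    """Return (ordering, frequency) pairs: each distinct ordering of relationship
    types with the number of experts who gave exactly this ordering."""
    counts = Counter(tuple(ranking) for ranking in expert_rankings)
    return counts.most_common()


# Hypothetical answers of three experts, each ranking relationship types
# from the strongest to the weakest.
answers = [
    ["best friend", "relative", "colleague"],
    ["best friend", "relative", "colleague"],
    ["relative", "best friend", "colleague"],
]

pairs = order_frequencies(answers)
for ordering, frequency in pairs:
    print(frequency, ordering)
```

The frequencies always sum to the number of surveyed experts, which gives a simple consistency check on the collected data.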

4 Research Methods

This section considers approaches to checking the validity of the quantified values of variables and the empirical validity of the ongoing research.

4.1 Conducting Additional Survey/Surveys

This approach is based on the possibility of conducting an additional study or a series of sequential studies. It is assumed that each subsequent study will be carried out after some predetermined time. The experts may be new participants with the required set of competencies, experts who took part in the previous survey, or a mix of both. In the last two cases, however, it is important to introduce a unique expert identifier that stays the same across all surveys in order to track changes and detect inaccuracies. Experts are invited to study the ordered results obtained from the quantification and, if they disagree with the position of some relationship types, to indicate the intended place of those types in $O_i$ (relationships rank). After the corrected results are received, a second quantification procedure is performed. Variables whose position has not been changed by any of the experts are fixed for the following quantification. This is one of the options for validating the results obtained, since the obtained model is compared with the way real people perceive it. The advantage of this approach is quick correction of the results obtained; it can also be used to compare the accuracy of approaches.
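The rule that variables left in place by every expert are fixed for the next quantification round can be sketched as follows. This is an illustration under stated assumptions: each expert returns a complete reordering of the same relationship types, corrections are keyed by the unique expert identifier mentioned above, and all names and labels are hypothetical.

```python
def unchanged_positions(published_order, corrections_by_expert):
    """Return {relationship type: position} for the types that no expert moved.

    published_order: list of relationship types as produced by the quantification.
    corrections_by_expert: {expert_id: reordered list of the same types}.
    """
    fixed = {}
    for position, relation in enumerate(published_order):
        moved = any(correction[position] != relation
                    for correction in corrections_by_expert.values())
        if not moved:
            fixed[relation] = position
    return fixed


# Hypothetical example: expert "e2" swaps the two middle relationship types.
published = ["best friend", "relative", "colleague", "acquaintance"]
corrections = {
    "e1": ["best friend", "relative", "colleague", "acquaintance"],
    "e2": ["best friend", "colleague", "relative", "acquaintance"],
}

print(unchanged_positions(published, corrections))
```

Keeping the expert identifier as the dictionary key makes it possible to compare the same expert's answers across survey rounds.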


4.2 Empirical Validity: Correlating Questions

One of the tasks facing this kind of research is the correct wording of the questions posed to experts. Studies conducted in the context of social engineering attacks require assessments of the actions of attackers, which in turn are based on data about users and assumptions about their interaction. On the one hand, expert assessments in this case should be obtained from experienced attackers, that is, social engineers. However, this approach seems absurd. On the other hand, psychologists and typical users of social networks can act as experts. With this approach it is much more realistic to gather the required number of experts. However, the wording that the research requires (for example, “malefactor passing from user i to user j, if user i is marked on the “VKontakte” online social network as user j’s best friend”) conflicts with “face validity”, which refers to the transparency or relevance of a test as it appears to test participants [6]. Because an ordinary user, or even an experienced psychologist, cannot have the knowledge needed to assess the actions of intruders during a social engineering attack, such a formulation will be neither correct nor understandable. The following wording is closer to the respondents’ understanding: “Imagine the following situation: You have been invited to join the group “VK”. Please estimate the probability that you would respond to the request if you received an invitation from a person who is marked in your account “VK” as “best friend””. As noted earlier, however, this approach requires checking “empirical validity”. This check implies the correspondence of diagnostic indicators to real behavior and observed actions, which is not possible until the final software product has been implemented and its compliance with real social engineering attacks has been verified.

5 Discussion of the Results

In the previous section, two approaches to confirming the validity of the obtained quantified estimates of the values of the variable “types of relationship between users” were considered. Both proposed approaches involve additional research with the participation of experienced, qualified experts, which in turn leads to increased material costs and time burdens. The approach proposed in Sect. 4.1 demonstrates the reliability of the results obtained, provided that the additional research is conducted correctly and a sufficient number of qualified experts is involved. Empirical validity, discussed in Sect. 4.2, is a very important part of research into social engineering attacks. As already noted, because the actions of intruders during a social engineering attack cannot be assessed correctly, a correct study design is required to obtain quantified values of “types of relationship between users”. Namely, it is necessary to find formulations of questions that experts are able to answer and that at the same time correspond to the desired estimates, that is, the probability of an attacker’s transition between users. However, this is very difficult to achieve. It is also worth noting that the considered approaches can be applied together. For example, an expert (with competence in the field of psychology or cybercrime) may be asked to take two surveys in parallel, the first aimed at assessing the interaction between users and the second at assessing the likelihood of the spread of an attack between users (with the same type of relationship as in the parallel question). This approach takes less time to implement in practice while remaining effective.
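One way to compare the answers from two parallel surveys is a rank correlation between the two sets of estimates for the same relationship types. The sketch below uses Spearman's coefficient, which is an assumption of this illustration: the text does not name a specific correlation measure. The function names are hypothetical, and both input lists are assumed to contain at least two distinct values.

```python
import math


def average_ranks(values):
    """1-based ranks of the values; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))
```

For instance, `spearman(interaction_estimates, attack_estimates)` would be close to 1 when both hypothetical surveys order the relationship types in the same way, and close to -1 when the orders are reversed.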

6 Conclusions

This article has discussed approaches to validating the quantification of variables. The theoretical significance of the research consists in providing approaches to validating the variable based on expert opinion. The practical significance of the results consists in forming the basis for the subsequent use of estimates of “types of relationship between users” in analyzing the propagation trajectories of multistep social engineering attacks. This study contributes to the development of the field of information security, namely to the development of diagnostics of information systems for determining vulnerabilities to social engineering attacks, and can be used in solving social computing problems.

Acknowledgements. The research was carried out in the framework of the project on state assignment SPIIRAN № 0073-2019-0003, with the financial support of the RFBR (project № 20-07-00839 Digital twins and soft computing in social engineering attacks modelling and associated risks assessment; project № 18-01-00626 Methods of representation, synthesis of truth estimates and machine learning in algebraic Bayesian networks and related knowledge models with uncertainty: the logic-probability approach and graph systems).


Parametric Oscillations at Delays in the Forces of Elasticity and Damping

Alishir A. Alifov

Mechanical Engineering Research Institute of the Russian Academy of Sciences, Moscow 101990, Russia

Abstract. Parametric oscillations in a nonlinear system with an energy source of limited power are considered in the presence of delays in the forces of elasticity and damping. The system model includes a rod as the oscillatory system and an electric motor as the energy source. The nonlinear equations of motion are solved by the method of direct linearization. Using this method, equations are derived for determining the non-stationary and stationary values of the amplitude and phase of oscillations and of the speed of the energy source. The stability conditions of stationary oscillations are obtained on the basis of the Routh-Hurwitz criteria. To obtain information about the effect of delays on the dynamics of the system, calculations were carried out and a number of amplitude-frequency dependences were constructed for various combinations of delays in the forces of elasticity and damping. For a number of points on the amplitude-frequency dependences, the ranges of steepness of the energy source characteristic are shown within which the stationary modes of motion are stable.

Keywords: Parametric oscillations · Non-ideal energy source · Elasticity · Damping · Delay

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 181–188, 2021. https://doi.org/10.1007/978-3-030-80531-9_16

1 Introduction
Delay is widespread in technological processes and equipment. It is present, for example, in the textile and chemical industries under the name of transport lag, which has attracted growing interest lately. “Pure lag links are often found in various manufacturing processes where material is transported from one point to another using conveyor belts; in systems for controlling sheet thickness during rolling; in systems of magnetic recording and reproduction, etc.” [1]. In mechanical systems, delay is caused by imperfect elastic properties of materials, internal friction in them, etc. [2]. Delay gives rise to oscillatory processes that can be either useful or harmful. Many works have been devoted to the study of oscillations in systems with delay, for example, [3–7, etc.]. The overwhelming majority of them do not take into account the interaction between the oscillatory system and the energy source; only a relatively small number of works do. The relevance of accounting for this interaction has greatly increased because environmental problems have become quite acute. One of the ways to address these problems is the rational use of the energy consumed by humanity. As noted in [8, 9], “In any industry, saving energy resources, materials and components is an urgent problem. A significant part of the energy is consumed by various drives of machines and technological equipment”. In this context, the world-famous theory of oscillatory systems with limited excitation, systematically presented by V.O. Kononenko in the monographs [10, 11] and further developed by his many followers [12–14, etc.], comes to the fore.

One of the main problems in the dynamics of nonlinear systems is, as is well known, the high labor cost of the analysis [15–20]. Various approximate methods of nonlinear mechanics are used to study the dynamics of these systems [21–30]. The methods of direct linearization [31–37, etc.] differ significantly from these methods: they are quite simple to use and require incomparably lower labor costs (by several orders of magnitude). This is especially valuable from a practical point of view, namely for calculating technical systems and technological processes at the design stage. The aim of this work is to study, using the methods of direct linearization, parametric oscillations in the presence of various delays and an energy source of limited power. The paper presents the model and the equations of motion of the system, the solutions of the equations, the conditions for the stability of stationary motions, calculations and conclusions.

2 Equations of the System
The analysis of the effect of delay on the dynamics of a parametric oscillatory system with an energy source of limited power is based on the model studied in [38–40]. The equations of motion of the system given in these works are written in the form

$$
\begin{aligned}
\ddot{y} + \left(\omega^{2} + \tilde{c}_{2}\sin\varphi\right) y &= -\left(\beta_{1}\dot{y} + \gamma y^{3}\right),\\
I\ddot{\varphi} &= L(\dot{\varphi}) - H(\dot{\varphi}) - 0.5\,c_{2}y^{2}\cos\varphi - 0.5\,c_{3}\sin 2\varphi - c_{4}\cos\varphi,
\end{aligned}
\tag{1}
$$

where

$$
c = \frac{\pi^{4}}{4l^{3}}EI_{x}\left(1 - \frac{P_{0}}{P_{1}}\right),\quad P_{0} = f_{0}c_{1},\quad P_{1} = \frac{\pi^{2}}{l^{2}}EI_{x},\quad c_{2} = -\frac{\pi^{2}r_{1}c_{1}}{2l},\quad c_{3} = c_{1}r_{1}^{2},\quad c_{4} = f_{0}r_{1}c_{1},
$$

$$
\gamma = \frac{\pi^{4}c_{1}}{8l^{2}},\quad \omega^{2} = \frac{c}{m},\quad m = \frac{1}{4}m_{1}l,\quad \tilde{c}_{2} = \frac{c_{2}}{m},\quad \beta_{1} = \frac{\beta}{m},
$$

$\varphi$ is the angular coordinate of the motor rotor rotation, $I$ is the moment of inertia of the motor rotor, $m$ is the mass of the rod, $m_{1}$ is the mass of a unit length of the rod, $E$ is the elastic modulus of the rod material, $f_{0}$ is the static deformation of the spring, $I_{x}$ is the moment of inertia of the cross section of the rod, $\beta$ is the resistance force coefficient, $L(\dot{\varphi})$ is the engine torque, and $H(\dot{\varphi})$ is the moment of resistance to rotation of the engine rotor. Supplementing system (1) with delays for the elastic force and damping, we represent it in the form

$$
\begin{aligned}
\ddot{y} + \left(\omega^{2} + \tilde{c}_{2}\sin\varphi\right) y &= -\left(\beta_{1}\dot{y} + \gamma y^{3} + c_{\tau}y_{\tau} + k_{\eta}\dot{y}_{\eta}\right),\\
I\ddot{\varphi} &= L(\dot{\varphi}) - H(\dot{\varphi}) - 0.5\,c_{2}y^{2}\cos\varphi - 0.5\,c_{3}\sin 2\varphi - c_{4}\cos\varphi,
\end{aligned}
\tag{2}
$$

where $c_{\tau} = \bar{c}_{\tau}/m$, $\bar{c}_{\tau} = \text{const}$, $k_{\eta} = \bar{k}_{\eta}/m$, $\bar{k}_{\eta} = \text{const}$, $y_{\tau} = y(t-\tau)$, $\dot{y}_{\eta} = \dot{y}(t-\eta)$, and $\tau = \text{const}$ and $\eta = \text{const}$ are the delays. Using the method of direct linearization [31], we replace the nonlinear function $F(y) = \gamma y^{3}$ with the linear function

$$
F_{*}(y) = c_{F}\,y, \tag{3}
$$

where $c_{F}$ is the linearization coefficient given by the expression

$$
c_{F} = c_{F}(a) = \bar{N}_{3}\gamma a^{2}. \tag{4}
$$

In (4), $a$ is the maximum value of the variable $y$, i.e. $a = \max|y|$, and $\bar{N}_{3} = (2r+3)/(2r+5)$ is a numerical coefficient that depends on the linearization accuracy parameter $r$. The value of $r$ is not restricted in principle and can be chosen [31] from the interval (0, 2). For $r = 1.5$ we obtain $\bar{N}_{3} = 3/4$, which coincides with the value that arises when the widely used asymptotic averaging method of nonlinear mechanics is applied to the first equation of (2) [20, 21]. Taking (3) into account, Eqs. (2) take the form

$$
\begin{aligned}
\ddot{y} + \left(\omega^{2} + \tilde{c}_{2}\sin\varphi\right) y &= -\left(\beta_{1}\dot{y} + c_{F}y + c_{\tau}y_{\tau} + k_{\eta}\dot{y}_{\eta}\right),\\
I\ddot{\varphi} &= M(\dot{\varphi}) - 0.5\,c_{2}y^{2}\cos\varphi - 0.5\,c_{3}\sin 2\varphi - c_{4}\cos\varphi,
\end{aligned}
\tag{5}
$$

where $M(\dot{\varphi}) = L(\dot{\varphi}) - H(\dot{\varphi})$.
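As a minimal illustration of (3) and (4), the following Python sketch evaluates the coefficient N̄3 and the linearization coefficient cF for a given amplitude. The function names and the sample values are hypothetical and serve only to show the arithmetic; they are not taken from [31].

```python
def n3_coefficient(r: float) -> float:
    """Numerical coefficient N3 = (2r + 3) / (2r + 5) for accuracy parameter r."""
    return (2.0 * r + 3.0) / (2.0 * r + 5.0)


def linearization_coefficient(gamma: float, amplitude: float, r: float = 1.5) -> float:
    """Linearization coefficient c_F = N3 * gamma * a**2 of the cubic term gamma * y**3."""
    return n3_coefficient(r) * gamma * amplitude ** 2


if __name__ == "__main__":
    # Illustrative values only: gamma = 0.2, amplitude a = 2.
    print(n3_coefficient(1.5))
    print(linearization_coefficient(0.2, 2.0))
```

With r = 1.5 the first function gives 3/4, so the cubic term is replaced by the linear term 0.75·γ·a²·y.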

3 Solving Equations
To solve Eqs. (5), we use the method of change of variables with averaging [31], which makes it possible to consider both stationary and non-stationary processes. In [31], for an equation of the general form $\ddot{x} + \omega^{2}x = H(t, x, \dot{x})$ with linearized functions, equations of the standard form (ESF) were obtained on the basis of the change of variables $x = vp^{-1}\cos\psi$, $\dot{x} = -v\sin\psi$, $\psi = pt + \xi$, $v = \max|\dot{x}|$. The ESF determine the non-stationary values of $v$ and $\xi$, while the amplitude of oscillations is given by $a = vp^{-1}$. With the help of the ESF, processes can be considered both in the presence of an external influence (in the resonance region and its immediate vicinity) and in its absence. For the second equation of (5), we use the averaging procedure described in [35–37] for calculating the interaction of oscillatory systems with energy sources. Using the ESF together with $\dot{y}_{\eta} = -ap\sin(\psi - p\eta)$ and $y_{\tau} = a\cos(\psi - p\tau)$, the solution of (5) is represented as

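$$
y = a\cos\psi,\quad \dot{y} = -ap\sin\psi,\quad \psi = pt + \xi,\quad \dot{\varphi} = \Omega,\quad p = \Omega/2, \tag{6}
$$

and we obtain the following equations for the non-stationary values of $a$, $\xi$ and $\Omega$:

$$
\begin{aligned}
\frac{da}{dt} &= -\frac{a}{4pm}\left(B - \tilde{c}_{2}\sin 2\xi\right),\\
\frac{d\xi}{dt} &= \frac{1}{4pm}\left(E + \tilde{c}_{2}\cos 2\xi\right),\\
\frac{d\Omega}{dt} &= \frac{1}{I}\left[M(\Omega) - \frac{c_{2}a^{2}}{8}\right],
\end{aligned}
\tag{7}
$$

where

$$
B = 2\left[p\left(\beta_{1} + k_{\eta}\cos p\eta\right) - c_{\tau}\sin p\tau\right],\qquad
E = 2\left[m\left(\omega^{2} - p^{2}\right) + c_{F}(a) + c_{\tau}\cos p\tau\right].
$$

The conditions $\dot{a} = 0$, $\dot{\xi} = 0$, $\dot{\Omega} = 0$ provide the relationships for determining the stationary values of $a$, $\xi$ and $\Omega$. Since $c_{F}(a)$ depends on the amplitude, in the general case we have

$$
c_{F}(a) = -\left[m\left(\omega^{2} - p^{2}\right) + c_{\tau}\cos p\tau\right] \pm 0.5\sqrt{\tilde{c}_{2}^{2} - B^{2}}, \tag{8}
$$

which in case (4) takes the explicit form of the amplitude-frequency dependence

$$
a^{2} = \frac{-\left[m\left(\omega^{2} - p^{2}\right) + c_{\tau}\cos p\tau\right] \pm 0.5\sqrt{\tilde{c}_{2}^{2} - B^{2}}}{\bar{N}_{3}\gamma}. \tag{9}
$$

The phase of stationary oscillations is given by $\tan 2\xi = -B/E$. Taking into account that $\omega^{2} - p^{2} \approx 2\omega(\omega - p)$, the boundary frequencies of the resonance zone follow from (9) at $a = 0$:

$$
p_{\pm} = \omega + 0.5\,m^{-1}\omega^{-1}\left[c_{\tau}\cos p\tau \pm 0.5\sqrt{\tilde{c}_{2}^{2} - B^{2}}\right],
$$

and, accordingly, the width of the zone is $\Delta = p_{+} - p_{-} = 0.5\,m^{-1}\omega^{-1}\sqrt{\tilde{c}_{2}^{2} - B^{2}}$. For resonance to be excited, the condition $\tilde{c}_{2} > B$ must hold. The relation for determining the stationary values of the velocity $\Omega$ has the form

$$
M(\Omega) - S(\Omega) = 0, \tag{10}
$$

where $S(\Omega) = c_{2}a^{2}/8$ represents the load on the motor from the oscillating system.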
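The amplitude-frequency relation (9) can be evaluated numerically along the following lines. This is only a sketch under stated assumptions: the function names are hypothetical, B is computed as 2[p(β1 + kη cos pη) − cτ sin pτ], the products pη and pτ are passed directly as phase angles in radians (as they are specified in the calculations of Sect. 5), and the value of c̃2 is supplied by the caller.

```python
import math


def damping_term(p, beta1, k_eta, c_tau, p_eta, p_tau):
    """B = 2[p(beta1 + k_eta*cos(p*eta)) - c_tau*sin(p*tau)].

    p_eta and p_tau are the phase products p*eta and p*tau in radians.
    """
    return 2.0 * (p * (beta1 + k_eta * math.cos(p_eta)) - c_tau * math.sin(p_tau))


def stationary_amplitudes(p, omega, m, c2_tilde, gamma, beta1,
                          k_eta=0.0, c_tau=0.0, p_eta=0.0, p_tau=0.0, n3=0.75):
    """Return the real positive amplitudes a that satisfy relation (9)."""
    b = damping_term(p, beta1, k_eta, c_tau, p_eta, p_tau)
    disc = c2_tilde ** 2 - b ** 2
    if disc < 0.0:
        return []  # parametric excitation too weak: no stationary oscillation
    base = -(m * (omega ** 2 - p ** 2) + c_tau * math.cos(p_tau))
    amplitudes = []
    for sign in (1.0, -1.0):
        a_squared = (base + sign * 0.5 * math.sqrt(disc)) / (n3 * gamma)
        if a_squared > 0.0:
            amplitudes.append(math.sqrt(a_squared))
    return amplitudes


if __name__ == "__main__":
    # Illustrative call without delays, at exact resonance p = omega.
    print(stationary_amplitudes(p=1.0, omega=1.0, m=1.0,
                                c2_tilde=0.05, gamma=0.2, beta1=0.02))
```

Sweeping p over a frequency grid and collecting the returned amplitudes gives the two branches of an amplitude-frequency curve; an empty list marks frequencies at which (9) has no real solution.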

4 Stability of Stationary Movements
Stationary modes must be investigated for stability. For this purpose, we compose the equations in variations for (7) and use the Routh-Hurwitz criteria. The stability criteria are the inequalities

$$
D_{1} > 0,\quad D_{3} > 0,\quad D_{1}D_{2} - D_{3} > 0, \tag{11}
$$

where

$$
\begin{aligned}
D_{1} &= -\left(b_{11} + b_{22} + b_{33}\right),\\
D_{2} &= b_{11}b_{33} + b_{11}b_{22} + b_{22}b_{33} - b_{23}b_{32} - b_{12}b_{21} - b_{13}b_{31},\\
D_{3} &= b_{11}b_{23}b_{32} + b_{12}b_{21}b_{33} - b_{11}b_{22}b_{33} - b_{12}b_{23}b_{31} - b_{13}b_{21}b_{32},
\end{aligned}
$$

$$
b_{11} = \frac{Q}{I},\quad b_{12} = -\frac{c_{2}a}{4I},\quad b_{13} = 0,\quad b_{21} = 0,\quad b_{22} = 0,\quad b_{23} = -\frac{aE}{2pm},
$$

$$
b_{31} = -\frac{1}{2},\quad b_{32} = -\frac{1}{pm}\bar{N}_{3}\gamma a,\quad b_{33} = -\frac{B}{2pm},\quad Q = \frac{d}{d\Omega}M(\Omega).
$$
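A check of the inequalities (11) for a given 3×3 matrix of coefficients b_ij might be sketched in Python as follows. The function name is hypothetical. As an assumption of this sketch, D1, D2 and D3 are computed as the coefficients of the characteristic polynomial λ³ + D1λ² + D2λ + D3 of the matrix, with D3 taken as the full negative determinant; the extra term this adds vanishes when b13 = 0 and b22 = 0.

```python
def routh_hurwitz_stable(b):
    """Check D1 > 0, D3 > 0, D1*D2 - D3 > 0 for a 3x3 matrix b (list of rows)."""
    (b11, b12, b13), (b21, b22, b23), (b31, b32, b33) = b
    d1 = -(b11 + b22 + b33)                       # minus the trace
    d2 = (b11 * b22 + b11 * b33 + b22 * b33
          - b12 * b21 - b13 * b31 - b23 * b32)    # sum of principal 2x2 minors
    det = (b11 * (b22 * b33 - b23 * b32)
           - b12 * (b21 * b33 - b23 * b31)
           + b13 * (b21 * b32 - b22 * b31))
    d3 = -det                                     # minus the determinant
    return d1 > 0 and d3 > 0 and d1 * d2 - d3 > 0


if __name__ == "__main__":
    # Illustrative matrices only: all eigenvalues negative, then one positive.
    print(routh_hurwitz_stable([[-1, 0, 0], [0, -2, 0], [0, 0, -3]]))
    print(routh_hurwitz_stable([[1, 0, 0], [0, -2, 0], [0, 0, -3]]))
```

In an actual calculation the entries would be filled from the expressions for b_ij at a stationary point (a, ξ, Ω), with Q taken from the slope of the energy source characteristic.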

5 Calculations
To obtain information about the amplitude-frequency dependence and the stability of stationary motions, calculations were carried out with the following parameters: $\omega = 1\ \text{s}^{-1}$, $m = 1\ \text{kgf}\cdot\text{s}^{2}\cdot\text{cm}^{-1}$, $c_{2} = 0.05\ \text{kgf}\cdot\text{cm}^{-1}$, $\beta = 0.02\ \text{kgf}\cdot\text{s}\cdot\text{cm}^{-1}$, $k_{\eta} = 0.06\ \text{kgf}\cdot\text{s}\cdot\text{cm}^{-1}$, $c_{\tau} = 0.05\ \text{kgf}\cdot\text{cm}^{-1}$, $\gamma = \pm 0.2\ \text{kgf}\cdot\text{cm}^{-3}$. The linearization accuracy parameter is $r = 1.5$, and the corresponding linearization coefficient is $\bar{N}_{3} = 3/4$. The calculated values for the delays are $p\eta = 0, \pi/2, \pi$ and $p\tau = 0, \pi/2, \pi, 3\pi/2$.

The amplitude-frequency curves $a(p)$ for various combinations of delays are shown in Fig. 1 ($\gamma > 0$) and Fig. 2 ($\gamma < 0$). They coincide with the curves that can be obtained using the widely used averaging method. Solid lines everywhere represent the case of no delays. Three different broken-line styles correspond to $p\tau = \pi/2$, $p\tau = \pi$ and $p\tau = 3\pi/2$, respectively. The curves for $p\eta = 3\pi/2$ are the same as for $p\eta = \pi/2$. Stable oscillations correspond to slopes $|Q|$ of the energy source characteristic that lie within the shaded sector. In the case of $\gamma < 0$, the upper branch of the amplitude-frequency curve is stable for the delay combinations $p\eta = 0$, $p\tau = \pi/2$ and $p\eta = \pi$, $p\tau = 3\pi/2$, starting from approximately $p \approx 0.99$. In the case of $\gamma > 0$, the lower branch for the combination $p\eta = \pi/2$, $p\tau = \pi$ and both the upper and lower branches for the combination $p\eta = \pi/2$, $p\tau = \pi/2$ are completely unstable. The curves of the load $S(p)$ on the energy source presented in Fig. 3 correspond to $p\eta = 0$, $\gamma = 0.2$. Their designations, depending on the delay $p\tau$, are the same as in Fig. 1. At $p\eta = 0$, $\gamma = -0.2$, the load curves tilt to the left.


Fig. 1. Amplitude curves: γ > 0 (panels: a) pη = 0; b) pη = π/2; c) pη = π)

Fig. 2. Amplitude curves: γ < 0 (panels: a) pη = 0; b) pη = π/2; c) pη = π)

Fig. 3. Load curves: γ > 0, pη = 0

6 Conclusions
Under the influence of delays in the forces of elasticity and damping, the region of resonant amplitudes of parametric oscillations may shift. The delays also affect the stability of stationary oscillations and, accordingly, their realizability, along with the properties of the energy source that drives the system. The nonlinear differential equations of motion of the system can be solved with little labor and time using the direct linearization method.


References
1. Babakov, N.A., Voronov, A.A., Voronova, A.A., et al. (eds.): Theory of automatic control: Textbook for universities on spec. “Automation and telemechanics”. Part I. Theory of linear automatic control systems. Higher school, Moscow, Russia (1986). (in Russian)
2. Encyclopedia of mechanical engineering. https://mash-xxl.info/info/174754/
3. Rubanik, V.P.: Oscillations of Quasilinear Systems with Time Lag. Nauka, Moscow (1969). (in Russian)
4. Zhirnov, B.M.: Single-frequency resonant vibrations of a frictional self-oscillating system with a delay under an external perturbation. Appl. Mech. 14(9), 102–109 (1978). (in Russian)
5. Abdiev, F.K.: Delayed self-oscillations of a system with an imperfect energy source. Izv. AN AzSSR Ser. Phis.-Tekh. Math. Nauk. (4), 134–139 (1983). (in Russian)
6. Zhou, B.: Input delay compensation of linear systems with both state and input delays by adding integrators. Syst. Control Lett. 82, 51–63 (2015)
7. Padhan, D.G., Reddy, B.R.: A new tuning rule of cascade control scheme for processes with time delay. In: Conference on Power, Control, Communication and Computational Technologies for Sustainable Growth, pp. 102–105 (2015)
8. Haq, Q.A.U.: Design and implementation of solar tracker to defeat energy crisis in Pakistan. Int. J. Eng. Manuf. (2), 31–42 (2019). https://doi.org/10.5815/ijem.2019.02.03. Published Online March 2019 in MECS (http://www.mecs-press.net)
9. Volkov, A.N., Matsko, O.N., Mosalova, A.V.: The choice of energy-saving laws of motion of mechatronic drives of technological machines. Scientific and technical statements of SPbPU. Nat. Eng. Sci. 24(4), 141–149 (2018). (in Russian). https://doi.org/10.18721/JEST.24414
10. Kononenko, V.O.: Vibrational Systems with Limited Excitation. Nauka, Moscow (1964). (in Russian)
11. Kononenko, V.O.: Vibrating Systems with Limited Power-Supply. Iliffe, London (1969)
12. Alifov, A.A., Frolov, K.V.: Interaction of Nonlinear Oscillatory Systems with Energy Sources, p. 327. Hemisphere Pub. Corp. Taylor & Francis Group, New York (1990)
13. Krasnopolskaya, T.S., Shvets, A.Yu.: Regular and chaotic dynamics of systems with limited excitation. Regular and chaotic dynamics, M.-Izhevsk, Russia (1964). (in Russian)
14. Cveticanin, L., Zukovic, M., Cveticanin, D.: Non-ideal source and energy harvesting. Acta Mech. 228(10), 3369–3379 (2017). https://doi.org/10.1007/s00707-017-1878-4
15. Ashwin, P., Coombes, S., Nicks, R.: Mathematical frameworks for oscillatory network dynamics in neuroscience. J. Math. Neurosci. 6(1), 1–92 (2016). https://doi.org/10.1186/s13408-015-0033-6
16. Chen, D.-X., Liu, G.-H.: Oscillatory behavior of a class of second-order nonlinear dynamic equations on time scales. Int. J. Eng. Manuf. 6, 72–79 (2011). https://doi.org/10.5815/ijem.2011.06.11. Published Online December 2011 in MECS (http://www.mecs-press.net)
17. Gourary, M.M., Rusakov, S.G.: Analysis of oscillator ensemble with dynamic couplings. In: Hu, Z., Petoukhov, S.V., He, M. (eds.) AIMEE2018 2018. AISC, vol. 902, pp. 161–172. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-12082-5_15
18. Ziabari, M.T., Sahab, A.R., Fakhari, S.N.S.: Synchronization new 3D chaotic system using brain emotional learning based intelligent controller. Int. J. Inf. Technol. Comput. Sci. (IJITCS) 7(2), 80–87 (2015). https://doi.org/10.5815/ijitcs.2015.02.10
19. Bhansali, P., Roychowdhury, J.: Injection locking analysis and simulation of weakly coupled oscillator networks. In: Li, P., et al. (eds.) Simulation and Verification of Electronic and Biological Systems, pp. 71–93. Springer, Dordrecht (2011). https://doi.org/10.1007/978-94-007-0149-6_4
20. Karabutov, N.: Frameworks in problems of structural identification systems. Int. J. Intell. Syst. Appl. (IJISA) 1, 1–19 (2017). https://doi.org/10.5815/ijisa.2017.01.01


21. Bogolyubov, N.N., Mitropolsky, Y.: Asymptotic Methods in the Theory of Nonlinear Oscillations. Nauka, Moscow (1974). (in Russian)
22. Vibrations in technology: directory. In: Blekhman, I.I. (ed.) Oscillations of Nonlinear Mechanical Systems, vol. 2. Engineering, Moscow (1979). (in Russian)
23. Tondl, A.: On the interaction between self-excited and parametric vibrations. National Research Institute for Machine Design Bechovice. Series: Monographs and Memoranda, no. 25 (1978)
24. Migulin, V.V., Medvedev, V.I., Mustel, E.R., Parygin, V.N. (eds.): Fundamentals of the theory of oscillations: Textbook. Management, 2nd edn. Rev. Nauka, Moscow, Russia (1988). (in Russian)
25. Hayashi, C.: Nonlinear Oscillations in Physical Systems. Princeton University Press, Princeton (2014)
26. Moiseev, N.N.: Asymptotic Methods of Nonlinear Mechanics. Nauka, Moscow (1981). (in Russian)
27. Butenin, N.V., Neymark, Y., Fufaev, N.A.: Introduction to the Theory of Nonlinear Oscillations. Nauka, Moscow (1976). (in Russian)
28. Andronov, A.A., Vitt, A.A., Khaikin, S.E.: Oscillation Theory. Nauka, Moscow (1981). (in Russian)
29. Biderman, V.L.: The Theory of Mechanical Vibrations: Textbook for Universities. High School, Moscow (1980). (in Russian)
30. Wang, Q., Fu, F.: Numerical oscillations of Runge-Kutta methods for differential equations with piecewise constant arguments of alternately advanced and retarded type. I. J. Intell. Syst. Appl. 4, 49–55 (2011). Published Online June 2011 in MECS (http://www.mecs-press.org/)
31. Alifov, A.A.: Methods of Direct Linearization for Calculation of Nonlinear Systems. RCD, Moscow, Russia (2015). (in Russian). ISBN: 978-5-93972-993-2
32. Alifov, A.A.: Method of the direct linearization of mixed nonlinearities. J. Mach. Manuf. Reliab. 46(2), 128–131 (2017). https://doi.org/10.3103/S1052618817020029
33. Alifov, A.A., Farzaliev, M.G., Dzhafarov, Je.N.: Dynamics of a self-oscillatory system with an energy source. Russ. Eng. Res. 38(4), 260–262 (2018). https://doi.org/10.3103/S1068798X18040032
34. Alifov, A.A.: On the calculation by the method of direct linearization of mixed oscillations in a system with limited power-supply. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 23–31. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_3
35. Alifov, A.A.: On the calculation of oscillatory systems with limited excitation by methods of direct linearization. J. Probl. Mech. Eng. Autom. (4), 92–97 (2017). (in Russian)
36. Alifov, A.A.: About direct linearization methods for nonlinearity. In: Hu, Z., Petoukhov, S., He, M. (eds.) AIMEE 2019. AISC, vol. 1126, pp. 105–114. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39162-1_10
37. Alifov, A.A.: About application of methods of direct linearization for calculation of interaction of nonlinear oscillatory systems with energy sources. In: Proceedings of the Second International Symposium of Mechanism and Machine Science (ISMMS 2017), Baku, Azerbaijan, 11–14 September 2017, pp. 218–221 (2017)
38. Kononenko, V.O.: Interaction of a parametric oscillatory system with an energy source. Izv. Academy of Sciences of the USSR. REL. Mechanics and mechanical engineering, no. 5 (1960)
39. Kononenko, V.O., Frolov, K.V.: On the interaction of a nonlinear oscillatory system with an energy source. Izv. Academy of Sciences of the USSR. REL. Mechanics and mechanical engineering, no. 5 (1961)
40. Frolov, K.V.: Selected works: in 2 vol. vol. 1. Vibration and technology. Nauka, Moscow, Russia (2007). (in Russian)

A Coordinated Dispatching Model for HDR-PV Hybrid Power System: A Zero-Sum Game Approach

Qingmiao Zhang1, Yang Si1,2, Xuelin Zhang2,3, and Xiaotao Chen1

1 Qinghai Key Lab of Efficient Utilization of Clean Energy (Tus-Institute for Renewable Energy), Qinghai University of China, Xining 810016, China
{siyang,chenxiaotao}@qhu.edu.cn
2 State Key Laboratory of Control and Simulation of Power System and Power Generation Equipment, Tsinghua University of China, Beijing 100091, China
3 Technical Institute of Physics and Chemistry of China, Beijing 100190, China
[email protected]

Abstract. With the increasing proportion of large-scale photovoltaic (PV) plants in Qinghai’s power grid, energy storage has become the key to the grid’s reliable operation. However, because of its limited environmental adaptability, electrochemical energy storage cannot meet the needs of the Qinghai power grid. This paper puts forward the technical route of using a hot dry rock (HDR) power system instead of traditional chemical energy storage systems to improve the power grid’s reliability. A hybrid power system consisting of an HDR generation and PV plants is constructed, and its operating models are given. The zero-sum game method is used to establish the coordinated dispatching model of the hybrid power system. Finally, the proposed dispatching model is verified by an example with actual data of the HDR in the Gonghe basin. The results show that the HDR generation can effectively enhance the credible capacity of PV plants, thus reducing the influence of power fluctuation on the power grid.

Keywords: Zero-sum game · HDR · Hybrid power system · Coordinated dispatching

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 189–202, 2021. https://doi.org/10.1007/978-3-030-80531-9_17

1 Introduction
With the rapid development of clean energy in the world energy system, the large-scale photovoltaic (PV) power station has become a central component of the energy system [1, 2]. In China, Qinghai province has a unique advantage in clean energy resources, and the clean energy industry has become a pillar industry of its economy. By the end of June 2020, the region had accumulated 28.01 million kilowatts of clean energy, accounting for 88% of the total installed capacity, including 16.08 million kilowatts of PV and wind power. This percentage makes Qinghai the first province in China where the total installed capacity of new energy accounts for more than 50% [3]. However, because solar energy is affected by weather and environment, its fluctuation and randomness restrict the further development of the clean energy industry in Qinghai. With the large-scale construction of PV plants, the high proportion of renewable energy in Qinghai’s power grid has created a higher demand for flexible power sources. The clean energy delivery channels face a lack of reliable power support [4].

In 2018, China first discovered large-scale and high-quality hot dry rock (HDR) resources in the Gonghe basin of Qinghai. The area of HDR resources in the Gonghe basin reached 246.9 km², of which the geothermal resource base of the HDR with a buried depth of 3–10 km is 163816 EJ, equivalent to 55.909 billion tons of standard coal [5]. The utilization of HDR resources has gradually become a new research hot spot of geothermal energy in China. Compared with other clean energy sources, HDR generation is stable and reliable and is not affected by the environment or climate change. These characteristics make it the best choice as a supporting power source for Qinghai’s power grid with its high proportion of renewable energy [6]. HDR generation has great potential to support the grid connection of PV plants, which can improve the reliability of the power grid and enhance the capability of clean energy delivery. Thus, HDR generation provides a new technical route for Qinghai to realize a completely clean energy supply in the future.

Scholars worldwide have carried out much research on the grid connection of PV plants in power grids with a high proportion of renewable energy. From the perspective of multi-energy complementary coordinated dispatching, some scholars use the optimized dispatching of stable power sources such as hydropower stations and energy storage power stations to suppress the fluctuation of the output power of PV plants [7–9]. According to the operation mechanism of the hybrid power system, a day-ahead dispatching model of a PV plant and a photothermal plant is established in [10]; the system’s stability is improved by using the inertia of the thermal energy storage system of the photothermal power station. In [11], an HDR enhanced geothermal system within a multi-energy hybrid system is proposed, which is combined with renewable energy such as wind and PV to meet the cooling, heating, and electric loads of an independent micro-energy network.

Other scholars have discussed the coordinated dispatching of hybrid power systems from the perspective of the game between multiple energy sources. Literature [12] uses a cooperative game to simulate the potential cooperative behaviors of multiple grid-connected microgrids to obtain higher efficiency and economy. Some research analyzes the game model and mathematical essence of the robust economic scheduling problem and regards nature as a virtual game player. A zero-sum game pattern is constituted by taking power grid dispatchers and nature as players, which provides a new idea for the reliable operation of power systems with high renewable energy penetration [13–15].

Although the above research models and analyzes the operation mechanism of multi-energy systems from different perspectives, it seldom involves hybrid power systems in which HDR generation participates. From the standpoint of game theory, research in which the grid dispatcher acts as a player who improves the grid connection capability and reliability of PV plants by optimizing the dispatching strategy of HDR generation has not yet been carried out. Therefore, based on the stability characteristics of HDR generation, this paper puts forward a hybrid power system architecture with coordinated HDR-PV operation to support the grid-friendly connection of PV plants. A zero-sum game pattern with the power grid dispatcher and nature as the players is established. The flexible operation model of the HDR generation and the output power model of the PV plants are constructed. Furthermore, the proposed zero-sum game dispatching model is transformed into an upper-layer problem of power grid dispatching and a lower-layer problem of PV plant power. The lower-layer problem is further converted into constraints of the upper-layer problem by the KKT conditions, thereby solving the dispatching problem. Finally, the feasibility of the proposed model and method is analyzed by an example using the actual data of HDR resources in Qinghai.

The remainder of this paper is organized as follows. The architecture of the hybrid power system is given in Sect. 2. The zero-sum game dispatching models are proposed in Sect. 3. The optimization method is addressed in Sect. 4. Section 5 simulates the cases based on actual data of the Gonghe basin. Conclusions and further developments are discussed in Sect. 6.

2 A Hybrid Power System Architecture for HDR and PV
The HDR generation proposed in this paper consists of an HDR enhanced geothermal system (EGS), a thermal storage exchanger (TSE), a thermal energy storage tank, and two geothermal power generation systems (GPGS I and GPGS II), as shown in Fig. 1. The EGS flexibly distributes the geothermal working fluid between GPGS I and the TSE through the injection pump and distributor. GPGS I uses the thermal energy of the geothermal fluid directly to generate electricity. The TSE transfers thermal energy from the high-temperature geothermal fluid to the thermal storage fluid. GPGS II generates electricity from the thermal energy stored in the thermal energy storage tank. GPGS I and GPGS II are connected to the low-voltage side of the transformer substation and supply the load through transmission lines. Large-scale PV plants are connected to the low-voltage side of the same transformer substation.

Fig. 1. Composition diagram of HDR generation and PV plants (diagram labels: thermal energy storage tank, TSE, D, GPGS I, GPGS II, PV plants, transformer substation, transmission lines, loads, P, production well, injection well)


3 Zero-sum Game Dispatching Model of HDR-PV Hybrid Power System

3.1 Zero-sum Game Pattern

In actual operation, the hybrid power system consisting of HDR generation and PV plants is influenced by both the power grid dispatcher and nature (solar conditions, weather), which stand in a relationship of rivalry and competition. Hence, they match the constituent elements of a two-person zero-sum game. On the one hand, the power grid dispatcher attempts to maximize the credible capacity of the connected PV plants by formulating the output dispatching strategy of the HDR generation. On the other hand, nature determines the real-time output power of the PV plants through uncertain weather. In extreme weather, the operating conditions deteriorate, which increases the fluctuation of the PV output power and reduces the reliability of the power grid. Whether the hybrid power system can maintain safe and economical operation depends on the outcome of the game between the power grid dispatcher and nature. The ultimate goal of the game is to restrain the worst effects of nature's uncertain factors, improve the credible capacity of the PV plants as much as possible, and maximize the generation profits of the HDR generation. The game model based on two-person zero-sum game theory is

$$\max_{g} F(g,\xi), \quad \min_{\xi} F(g,\xi) \quad \text{s.t.} \quad G(g,\xi) \le 0, \quad H(g,\xi) = 0 \tag{1}$$

where $g$ is the variable that the power grid dispatcher can dispatch, $\xi$ is the uncertainty variable controlled by nature, $F(g,\xi)$ is the profit objective function, and $G(g,\xi)$ and $H(g,\xi)$ are the inequality and equality constraints, respectively. In this decision-making problem with uncertain variables, nature controls $\xi$ to minimize $F$, while the power grid dispatcher controls $g$ to maximize $F$. Therefore, $g$ and $\xi$ in (1) can be seen as the strategy sets of the two sides of the game.

The Nash equilibrium solution $(g^*, \xi^*)$ of a two-person zero-sum game is usually described by a max-min problem. The power grid dispatcher uses the weather forecast to make the dispatching strategy $g$, and nature responds in real time with the uncertainty $\xi$. That is, in $(g^*, \xi^*)$, $g$ acts first and $\xi$ acts later. Anticipating the worst result of nature's action $\xi$, the dispatcher adopts the best strategy $g^*$ to maximize the profit in the worst case brought about by $\xi$. For (1), the optimization decision problem is therefore equivalent to the max-min problem

$$F(g^*,\xi^*) = \max_{g} \min_{\xi} F(g,\xi) \tag{2}$$
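To make the "g acts first, ξ acts later" reading of the max-min problem concrete, the following is a minimal sketch for a finite game. It assumes that both players choose among a small number of discrete strategies and that the profit F is given as a table; the function name and the payoff numbers are hypothetical and are not taken from the paper.

```python
def max_min_strategy(payoff):
    """Return (index of the max-min strategy, guaranteed payoff).

    payoff[i][j] is the dispatcher's profit F(g_i, xi_j) when the
    dispatcher plays strategy i and nature answers with scenario j.
    """
    # For each dispatcher strategy, nature picks the scenario that minimizes F.
    worst_case = [min(row) for row in payoff]
    # The dispatcher picks the strategy whose worst case is largest.
    best = max(range(len(payoff)), key=lambda i: worst_case[i])
    return best, worst_case[best]


# Hypothetical profit table: 3 dispatch strategies x 3 weather scenarios.
F = [
    [10.0, 4.0, 7.0],
    [8.0, 6.0, 9.0],
    [12.0, 1.0, 5.0],
]

g_star, value = max_min_strategy(F)
print(g_star, value)
```

In this toy table the strategy with the highest possible profit (the third row) is not selected, because its worst case is the lowest; this is the conservative behavior that the max-min formulation is meant to capture.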

A Coordinated Dispatching Model for HDR-PV Hybrid Power System


3.2 Operation Model of HDR Generation

The HDR EGS consists of a production well, an injection well, a distributor, and an injection pump. Equations (3) and (4) give the models of the geothermal working fluid distributor.

$$m_{GI} = \alpha_s^t m_{HDR} \tag{3}$$

$$m_{EX} = (1 - \alpha_s^t) m_{HDR} \tag{4}$$

where $m_{GI}$ and $m_{EX}$ are the mass flow of geothermal fluid directly utilized by GPGS I in the geothermal extraction cycle and the mass flow of geothermal fluid supplied to the TSE for storage, respectively; $\alpha_s^t$ is the geothermal energy coefficient used for power generation at time $t$; and $m_{HDR}$ is the total mass flow of geothermal fluid obtained by the geothermal extraction cycle.

At present, the Organic Rankine Cycle (ORC) power generation system is used to convert geothermal energy into electric energy. Accordingly, the GPGS of the HDR generation proposed in this paper adopts the ORC power generation system. The power model of GPGS I is given by

$$P_{GI}^t = \eta\, m_{GI}\, c_p^{HDR} (T_{HDR} - T_{Well}) \tag{5}$$

where $P_{GI}^t$ is the output power of GPGS I, $\eta$ is the generation efficiency, $c_p^{HDR}$ is the specific heat capacity of the geothermal working fluid, $T_{HDR}$ is the temperature of the geothermal working fluid in the production well, and $T_{Well}$ is the temperature of the injected geothermal working fluid.

GPGS II converts the thermal energy stored in the high-temperature storage tank into electric energy to realize flexible dispatching of the HDR generation. Its power model is given by

$$P_{GII}^t = \eta\, m_{GII}^t\, c_p^{TS} (T_H - T_L) \tag{6}$$

where $P_{GII}^t$ is the output power of GPGS II, $m_{GII}^t$ is the mass flow rate of thermal energy storage working fluid supplied to GPGS II from the high-temperature storage tank, and $c_p^{TS}$ is the specific heat capacity of the thermal energy storage working fluid. $T_H$ and $T_L$ are the temperature of the working fluid at the outlet of the high-temperature storage tank and its temperature at the inlet of the low-temperature storage tank after power generation, respectively.

The TSE adopts a tube-shell heat exchanger, and its simplified model is

$$(1 - \alpha_s^t - \alpha_R^t)\, m_{HDR}\, c_p^{HDR} (T_{HDR} - T_{Well}) = m_{TS}^t\, c_p^{TS} (T_H - T_L) + Q_{cur}^t \tag{7}$$

where $\alpha_R^t$ is the geothermal energy coefficient required by GPGS I to dispatch the reserve, $m_{TS}^t$ is the mass flow rate of thermal energy storage fluid in the TSE when thermal energy is stored, and $Q_{cur}^t$ is the waste heat that cannot be stored in the thermal energy storage tanks.

The thermal energy storage tanks consist of a low-temperature tank and a high-temperature tank. The energy storage state model of the high-temperature tank is

$$S_{Tank}^t = \eta_T S_{Tank}^{t-1} + (m_{TS}^t - m_{GII}^t - m_R^t)\, c_p^{TS} (T_H - T_L)\, \tau \tag{8}$$


where $S_{Tank}^t$ is the thermal energy stored in the high-temperature tank at time $t$, $\eta_T$ is the insulation coefficient, $m_R^t$ is the mass flow rate of the extra thermal energy storage working fluid released by the high-temperature tank when the reserve of GPGS II is dispatched, and $\tau$ is the dispatching time interval.

The HDR generation provides a reserve for the system to balance the uncertainty of the PV plants' output power. The reserve consists of two parts: one is provided by GPGS I; the other is provided by GPGS II, which generates from the thermal energy stored in the thermal energy storage tank. Equations (9)–(11) give the reserve models of the HDR generation.

$$R_{GI}^t = \eta\, \alpha_R^t\, m_{HDR}\, c_p^{HDR} (T_{HDR} - T_{Well}) \tag{9}$$

$$R_{GII}^t = \eta\, m_R^t\, c_p^{TS} (T_H - T_L) \tag{10}$$

$$R^t = R_{GI}^t + R_{GII}^t \tag{11}$$

where Eq. (9) is the reserve model of GPGS I and $R_{GI}^t$ is the reserve power of GPGS I; Eq. (10) is the reserve model of GPGS II and $R_{GII}^t$ is the reserve power provided by GPGS II; Eq. (11) is the total reserve model of the HDR generation and $R^t$ is the total reserve power that the HDR generation provides for the hybrid power system.

3.3 Output Power Model of PV Plant

The output power model of the PV plant is

$$\bar{P}_{PV}^t = \lambda_{PV}^t P_{PV} \tag{12}$$

$$\xi_{PV}^t = \lambda_{\xi}^t P_{PV} \tag{13}$$

$$0 \le P_{PV} \le P_{PV}^{install} \tag{14}$$

where $\bar{P}_{PV}^t$ is the credible output power of the PV plant and $\lambda_{PV}^t$ is the solar irradiance coefficient according to the weather forecast. $P_{PV}$ and $P_{PV}^{install}$ are the credible capacity and the installed capacity of the PV plant connected to the power grid, respectively. $\lambda_{\xi}^t$ is the actual solar irradiance coefficient decided by nature, and $\xi_{PV}^t$ is the real output power of the PV plant for the credible capacity $P_{PV}$.

Because the actual output power of the PV plant is uncertain, there is always a deviation between the real output power and the predicted power. Therefore, the output power model of the PV plant includes the uncertainty constraint

$$\left|\xi_{PV}^t - \bar{P}_{PV}^t\right| \le \varphi \bar{P}_{PV}^t \tag{15}$$

where $\varphi$ is the prediction error; that is, the relative deviation of the actual PV output power $\xi_{PV}^t$ from the predicted value $\bar{P}_{PV}^t$ does not exceed $\varphi$.
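The models of Sects. 3.2 and 3.3 are simple enough to evaluate by hand, and a short numerical sketch may help with units and orders of magnitude. The fluid and efficiency parameters below are the values listed in Table 1 of the case study; the shares α, the irradiance coefficient, and the credible capacity passed to the functions are hypothetical illustration values, and the function names are not from the paper.

```python
ETA = 0.132      # ORC generator efficiency (Table 1: 13.2 %)
M_HDR = 75.0     # mass flow of production well, kg/s (Table 1)
CP_HDR = 4.2     # specific heat of geothermal fluid, kJ/(kg.degC) (Table 1)
T_HDR = 200.0    # production well outlet temperature, degC (Table 1)
T_WELL = 40.0    # minimum temperature of recharge well, degC (Table 1)


def gpgs1_power(alpha_s):
    """Eqs. (3) and (5): output power of GPGS I in kW for power share alpha_s."""
    m_gi = alpha_s * M_HDR                      # Eq. (3)
    return ETA * m_gi * CP_HDR * (T_HDR - T_WELL)  # Eq. (5), kJ/s = kW


def gpgs1_reserve(alpha_r):
    """Eq. (9): reserve power of GPGS I in kW for reserve share alpha_r."""
    return ETA * alpha_r * M_HDR * CP_HDR * (T_HDR - T_WELL)


def pv_band(lambda_pv, p_pv, phi):
    """Eqs. (12) and (15): credible output and the interval allowed
    for the real PV output, in the units of p_pv."""
    p_bar = lambda_pv * p_pv                    # Eq. (12)
    return p_bar, (1 - phi) * p_bar, (1 + phi) * p_bar


# Hypothetical operating point: 60 % of the heat to power, 20 % held as reserve.
print(gpgs1_power(0.6), gpgs1_reserve(0.2))
# Hypothetical PV point: irradiance coefficient 0.5, credible capacity 100 MW,
# prediction error 20 %.
print(pv_band(0.5, 100.0, 0.2))
```

With all geothermal fluid sent to GPGS I (alpha_s = 1), Eq. (5) gives roughly 6.65 MW for these parameters, which sets the scale of the reserve that GPGS I alone can offer.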


4 Optimization Method

4.1 Optimization Objective

In the coordinated operation of the hybrid power system of HDR generation and PV plants, it is necessary, on the one hand, to provide a reserve that enhances the credible capacity of the PV plant and reduces its impact on the reliability of the power grid. On the other hand, considering the construction cost, the HDR generation is required to maximize its electricity sales profits. A bi-level programming model can describe this zero-sum game problem. Equation (16) gives the optimization objective of the two-person zero-sum game model proposed in this paper.

$$\max_{P_{GI}^t,\, P_{GII}^t} \; \min_{\xi_{PV}^t} \; \sum_{t=1}^{T} c_e^t \left(P_{GI}^t + P_{GII}^t + \bar{P}_{PV}^t\right) \tag{16}$$

where $c_e^t$ is the time-of-use (TOU) price. According to the decisions of the game players, the max-min problem in (16) is split into an upper-level decision problem and a lower-level decision problem. Equation (17) is the optimization objective of the power grid dispatcher in the upper-level problem: the overall profit of the hybrid power system is maximized by dispatching the GPGS of the HDR generation.

$$\max \sum_{t=1}^{T} c_e^t \left(P_{GI}^t + P_{GII}^t + \bar{P}_{PV}^t\right) \tag{17}$$

Given the real-time PV output power, the lower-level problem minimizes the credible capacity of the PV plants, which reduces the profit of the PV plant and increases the influence of its uncertainty on the reliability of the power grid, as shown in

$$\min \sum_{t=1}^{T} c_e^t \bar{P}_{PV}^t \tag{18}$$

4.2 Constraints

To ensure the stable and continuous operation of the HDR generation, the upper-level problem must also satisfy the following operation constraints. Equations (19) and (20) are safety constraints guaranteeing that at least 10% of the geothermal energy is used for power generation, which maintains the minimal output of the GPGSs and ensures the reliability of the HDR generation. Equation (21) is a heat abandonment constraint requiring that the heat dissipation power is not negative.

$$0.1 \le \alpha_s^t \le 1 \tag{19}$$

$$0.1 \le \alpha_s^t + \alpha_R^t \le 1 \tag{20}$$

$$Q_{cur}^t \ge 0 \tag{21}$$

With the thermal energy storage tanks and GPGS II, the HDR generation provides up and down reserve capacity through a flexible dispatching strategy, tracks the predicted output power of the PV plant, and can adjust its output power as the PV output power changes. Equation (22) gives the credible capacity constraint of the PV plant in the lower-level problem.

$$\left|\xi_{PV}^t + R^t - \bar{P}_{PV}^t\right| \le \sigma \bar{P}_{PV}^t \tag{22}$$

where $R^t$ is the total reserve provided by the HDR generation and $\sigma$ is the allowable fluctuation of the power grid. This constraint indicates that the deviation of the actual output power $\xi_{PV}^t$ of the PV plant from the dispatching value $\bar{P}_{PV}^t$ does not exceed the allowable fluctuation rate $\sigma$ of the power grid when the reserve $R^t$ of the HDR generation is dispatched.

4.3 Conversion of Lower Level Problem

By the KKT conditions, the lower-level problem is converted into equivalent constraints of the upper-level problem, so that the original bi-level model is transformed into a single-level nonlinear optimization problem. The lower-level problem can be written as

$$\min \sum_{t=1}^{T} c_e^t \bar{P}_{PV}^t \quad \text{s.t.} \quad \left|\xi_{PV}^t - \bar{P}_{PV}^t\right| \le \varphi \bar{P}_{PV}^t \; : \; \kappa, \qquad \left|\xi_{PV}^t + R^t - \bar{P}_{PV}^t\right| \le \sigma \bar{P}_{PV}^t \; : \; \theta \tag{23}$$

The dual variables associated with the constraints in (23) are $\kappa$ and $\theta$. The KKT conditions corresponding to the lower-level problem (23) are then

$$c_e^t + \kappa_+^t - \kappa_-^t + \theta_+^t - \theta_-^t = 0 \tag{24}$$

$$0 \le \kappa_+^t \le M u_{\kappa+}, \quad 0 \le \kappa_-^t \le M u_{\kappa-} \tag{25}$$

$$0 \le \theta_+^t \le M u_{\theta+}, \quad 0 \le \theta_-^t \le M u_{\theta-} \tag{26}$$

$$-M(1 - u_{\kappa+}) \le -(1 + \varphi)\bar{P}_{PV}^t + \xi_{PV}^t \le 0 \tag{27}$$

$$-M(1 - u_{\kappa-}) \le (1 - \varphi)\bar{P}_{PV}^t - \xi_{PV}^t \le 0 \tag{28}$$

$$-M(1 - u_{\theta+}) \le \xi_{PV}^t + R^t - (1 + \sigma)\bar{P}_{PV}^t \le 0 \tag{29}$$

$$-M(1 - u_{\theta-}) \le -\xi_{PV}^t - R^t + (1 - \sigma)\bar{P}_{PV}^t \le 0 \tag{30}$$


Equation (24) is the Lagrangian stationarity condition corresponding to problem (23). Equations (27)–(30) are the complementary slackness conditions. The big-M method is used to linearize the KKT conditions of (23) by introducing the Boolean variables $u_\theta$, $u_\kappa$ and a sufficiently large number $M$. The proposed model is thereby transformed into a MILP, which can be solved with MATLAB R2016b and the CPLEX solver.
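The role of the Boolean variables in the big-M reformulation can be checked with a few lines of code. The sketch below is not the authors' MATLAB/CPLEX model; it only assumes the generic pattern of Eqs. (25) and (27), namely 0 ≤ κ ≤ M·u together with −M(1 − u) ≤ g ≤ 0, and verifies that every feasible combination satisfies the complementarity κ·g = 0. The value of M and the test points are hypothetical.

```python
def big_m_feasible(kappa, g, u, big_m):
    """Check the big-M pair for one multiplier.

    kappa : dual variable (must be >= 0)
    g     : value of the primal constraint function (must be <= 0)
    u     : binary switch; u = 1 forces g = 0, u = 0 forces kappa = 0
    """
    return (
        u in (0, 1)
        and 0 <= kappa <= big_m * u
        and -big_m * (1 - u) <= g <= 0
    )


def constraint_27(p_bar, xi, phi):
    """Constraint function of Eq. (27): -(1 + phi) * p_bar + xi."""
    return -(1 + phi) * p_bar + xi


M = 1e4  # assumed "sufficiently large" constant

# Hypothetical points: with phi = 0.25 and p_bar = 40, the upper bound on xi is 50.
g_active = constraint_27(40.0, 50.0, 0.25)   # constraint holds with equality
g_slack = constraint_27(40.0, 45.0, 0.25)    # constraint is slack

# Enumerate a small grid and confirm complementarity on every feasible point.
for kappa in (0.0, 3.0):
    for g in (g_active, g_slack):
        for u in (0, 1):
            if big_m_feasible(kappa, g, u, M):
                assert kappa * g == 0
```

The constant M has to exceed every value that the multipliers and constraint functions can take at the optimum; if it is chosen too small, the reformulation cuts off valid solutions, and if it is chosen far too large, the MILP relaxation becomes weak.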

5 Case Study

5.1 System Parameters

The scenario is constructed and analyzed based on the HDR of the Gonghe basin. Solar irradiance data and TOU electricity prices are obtained from local historical data, as shown in Fig. 2. The operating parameters of the HDR generation and PV plant are shown in Table 1.

Fig. 2. Solar irradiance and TOU electricity price curves

5.2 Results Analysis

To analyze the dispatching performance of the HDR-PV hybrid power system proposed in this paper, the zero-sum game results are compared in the following three cases.

Case 1: only GPGS I supports the grid connection of the PV plant; GPGS II is not used.
Case 2: GPGS II is used only to recover waste heat for power generation and does not participate in supporting the grid connection of the PV plant.
Case 3: both GPGS I and GPGS II are involved in supporting the grid connection of the PV plants.

Table 1. System parameters

| Parameter name | Numerical value |
|---|---|
| Production well outlet temperature (°C) | 200 |
| Mass flow of production well (kg/s) | 75 |
| Minimum temperature of recharge well (°C) | 40 |
| Initial temperature of thermal energy storage fluid (°C) | 25 |
| Condensed water temperature (°C) | 15 |
| Specific heat capacity of thermal energy storage fluid (kJ/(kg·°C)) | 1.938 |
| Production well specific heat (kJ/(kg·°C)) | 4.2 |
| HDR circulating pump power (kW) | 551 |
| ORC generator efficiency (%) | 13.2 |
| Insulation coefficient (%) | 99 |
| TSE efficiency (%) | 90 |
| Installed capacity of PV plant (MW) | 200 |
| Prediction error of output power of PV plant (%) | ≤ 20 |
| Power fluctuation allowed by the grid (%) | < 10 |

In each case, the allowable fluctuation of the power grid is 3%, and the solar irradiance and TOU price parameters are the same. The GPGS profits and the credible capacity of the PV plant are shown in Table 2.

Table 2. Results of cases

| Parameters | Case 1 | Case 2 | Case 3 |
|---|---|---|---|
| Credible capacity of PV plant (MWp) | 70.519 | 70.519 | 141.04 |
| Profits of GPGS I (¥) | 85916 | 69875 | 58181 |
| Profits of GPGS II (¥) | | 60750 | 65126 |
| Loss of waste heat (MWh) | 146.193 | 0 | 0 |
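The totals implied by Table 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only the Case 2 and Case 3 entries of the table and computes the total HDR profit per case, the relative difference between the two cases, and the ratio of the credible capacities; the variable names are illustrative.

```python
# (profit of GPGS I, profit of GPGS II) in yuan, copied from Table 2
profits = {
    "Case 2": (69875, 60750),
    "Case 3": (58181, 65126),
}

total_case2 = sum(profits["Case 2"])
total_case3 = sum(profits["Case 3"])

# Relative drop of the total HDR profit when going from Case 2 to Case 3
relative_drop = (total_case2 - total_case3) / total_case2

# Credible capacity of the PV plant in MWp (Table 2), Case 3 versus Case 2
capacity_ratio = 141.04 / 70.519

print(total_case2, total_case3)
print(f"{100 * relative_drop:.1f} %", f"{capacity_ratio:.2f}")
```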

From Table 2, it can be seen that in Case 1, because only GPGS I supports the grid connection of the PV plants, a large amount of geothermal energy has to be abandoned when a down reserve is provided, which generates a large heat loss. In Case 2, the TSE and the heat storage tank are used to recover waste heat and generate electricity through GPGS II, so the waste heat loss is eliminated while the credible capacity of the PV plant is maintained. In Case 3, because the reserve is provided by both GPGS I and GPGS II, the credible capacity of the PV plant is doubled, and the profits of GPGS I decrease. GPGS II recovers the waste thermal energy when GPGS I provides a down reserve, so its profit is improved. However, the total profits of the HDR generation are 5.6% less than those of Case 2.

The dispatching curves of the HDR generation are shown in Fig. 3. From 0:00 to 7:00, the system maintains a continuous low output power, and geothermal energy is stored in the thermal energy storage tank through the TSE. From 8:00 to 23:00, GPGS I keeps a stable maximum output power. This result accords with the high cost of GPGS I and with the fact that the geothermal mining cycle needs to maintain continuous output. GPGS II mainly uses the stored thermal energy to generate electricity at full power during the high-price period in order to obtain a high profit. During the main PV generation period from 14:00 to 18:00, the reserve is provided to cope with the uncertainty of the PV plant's output power.

Fig. 3. Dispatching curves of the HDR generation

A further analysis of GPGS I and GPGS II in the HDR generation is shown in Fig. 4. Through the zero-sum game model, the system dispatcher uses the lower bound of the forecast value as the credible generation of the PV plant and dispatches the HDR generation with the full generation strategy. Under this strategy, the actual output power curve of the PV plant must lie above the credible generation curve, so in real-time operation the HDR generation only needs to provide a down reserve to absorb surplus PV power. Fig. 4 also shows that from 10:00 to 19:00, GPGS I continuously dispatches down reserve power to stabilize the fluctuation of the PV plant. From 15:00 to 17:00, the dispatched down reserve power of GPGS I reaches its upper limit, and the PV power is largest at this time. GPGS I can no longer meet the down reserve demand, so the system dispatches GPGS II to provide additional down reserve. This dispatching process shows that the down reserve is provided primarily by GPGS I, because thermal energy exchange and storage incur losses. The waste heat generated by dispatching the down reserve of GPGS I can be stored in the thermal energy storage tank. The down reserve of GPGS II is needed only if GPGS I cannot meet the down reserve requirements. In this dispatching strategy, GPGS I is mainly responsible for providing the reserve, and GPGS II mainly improves economic benefits by recovering geothermal energy. Therefore, this strategy can provide a down reserve for PV plants without abandoning heat and fully meets the financial requirements of the HDR generation.

Fig. 4. Reserve dispatched of GPGS I and II

Figure 5 shows the results of the operation of the hybrid power system. During the low-price period (0:00–7:00), the HDR generation maintains the minimum output power and stores the continuous geothermal energy in the thermal energy storage tank. During the PV power generation period (8:00–18:00), the HDR generation provides a down reserve for the forecast output power of the PV plant according to the day-ahead dispatching plan. After that, the HDR generation generates at full power according to the dispatching plan to maximize profits.

Fig. 5. Coordinated operation of HDR and PV hybrid power system

It can be seen from the above analysis that the prediction error of the PV output power and the allowable fluctuation rate of the power grid have a significant influence on the grid connection of the PV plants supported by the HDR generation. The prediction error is set to 10%–20% and the allowable fluctuation rate to 0%–5%. The results are shown in Fig. 6.

Fig. 6. Sensitivity analysis

It can be seen from Fig. 6 that as the allowable power fluctuation of the power grid increases, the credible capacity of the PV plant supported by the HDR generation increases accordingly. As the prediction error rises, the credible capacity of the PV plant decreases. When the allowable fluctuation is 0%, the credible capacity of the PV plant with a 10% prediction error is about twice that with a 20% prediction error. Improving the prediction accuracy of the output power therefore plays a significant role in the grid-friendly connection of the PV plant. When the prediction error is 20%, the credible capacity of the PV plant with a 5% fluctuation rate is about 1.4 times that with a 0% fluctuation rate. This result shows that improving the grid structure also plays an essential part in the grid connection of the PV plants.

6 Conclusions

In this paper, a hybrid power system consisting of HDR generation and PV plants is constructed based on the characteristics of HDR resources in the Gonghe basin of Qinghai. A coordinated dispatching model of the hybrid power system is proposed using the zero-sum game method to optimize operation reliability. The dispatching model is verified by a case simulation with actual data. The results show that the HDR system can effectively improve the reliability of the PV plants and the power grid, much as an electrochemical energy storage system does. In the hybrid power system, the optimal capacity ratio between the PV plants and the HDR generation is 10:1. At the same time, the HDR generation inevitably loses some profit by providing the reserve for PV plants, so the reserve price of the HDR generation participating in the ancillary service market needs to be studied further.

Acknowledgements. This work has been supported by the Joint Fund Project of the National Natural Science Foundation of China (U1766203) and the Basic Research Project of Qinghai Province (2018-ZJ726).


Category Splices and Modeling with Their Help Chemical Systems and Biomolecules

Georgy K. Tolokonnikov

Federal Scientific Agro-Engineering Center VIM, Russian Academy of Sciences, 1st Institute Passage, 5, Moscow, Russia

Abstract. The main task of algebraic biology is to predict the properties of organisms from the genome using rigorous algebraic methods, in particular the properties of intelligence, which can be modeled by conventional and strong artificial intelligence (AI). For this task, new categorical methods used in categorical systems theory are proposed. These methods are based on the theory of categorical splices, with which the behavior of quantum mechanical particles is modeled, in particular within the proposed representation of molecules, including the biomolecules RNA and DNA, as categorical systems. Thus, new algebraic and categorical methods (associative algebras with identities, PROP, categorical splices) are brought to bear on the analysis of the genome. The listed results are new and original.

Keywords: Genome · Categorical systems · Categorical splices · Polycategories · Hierarchies of systems · Chemical bond · Covalent · Ionic · Hydrogen bond

1 Introduction

One of the main tasks of algebraic biology is to predict the properties of organisms from the genome using rigorous algebraic methods, in particular to predict the properties of intelligence that can be modeled by conventional and strong artificial intelligence (AI). This work is related to the application of categorical systems theory in algebraic biology and artificial intelligence [1]. New categorical methods used in categorical systems theory are proposed. These methods are based on the theory of categorical splices, with which the behavior of quantum mechanical particles is modeled, in particular within the proposed representation of molecules, including the biomolecules RNA and DNA, as categorical systems. Thus, new algebraic and categorical methods (associative algebras with identities, PROP, categorical splices) are involved in the analysis of the genome.

The next section provides definitions and properties of categorical splices and examples of models of categorical splices. In the third section, a categorical model of stationary systems of charged quantum-mechanical particles is presented. The model is based on the well-known Hellmann-Feynman theorem [2]. The fourth section is devoted to categorical models for fundamental chemical approximations of a quantum-mechanical system of particles, based on the concept of chemical bonding. For alkanes, associative algebras with identities arise; for other homologous series of hydrocarbons, categorical splices arise as algebraic objects, including, in particular, PROP and other categorical constructions. A further task is to study the algebraic polycategory objects that model DNA and RNA for the needs of algebraic biology. The final section contains conclusions and a brief summary of the work.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 203–212, 2021. https://doi.org/10.1007/978-3-030-80531-9_18

2 Categorical Splices and Their Properties

Let us introduce a four-sorted first-order logic language with variables $p$, $p'$, $\bar{p}$, $\bar{p}'$ responsible for splices, convolutions, inner splices and inner convolutions, respectively. Convolutions are represented using: the specified letters $p'$, $\bar{p}$, $\bar{p}'$; functional symbols of ; names ; values ; predicates co-regions $M_b$, $M_r$, $M_t$, $M_l$; and a graphic representation of formulas similar to that introduced by Hatcher [2] when constructing the first-order language of category theory, for example

The properties of predicates and other letters of the signature are determined by axioms. Formulas $x_i = \pi_i(p)$, $b_i = C_i(p)$ will be denoted by $\pi_i$, $C_i$. The convolution $S$ is defined by the ratio , the symbol ev has the opposite , and the opposite convolution is determined by $S = M_b\, \mathrm{ev}\, M_l$. The complete splices can be represented as a conjunction formula , divided into four subformulas in accordance with the sort of variables, located in the 2 × 2 quadrants of the table

| a | b |
| c | d |

Equivalences between co-regions and between names, which correspond to the equality sign, are depicted by connecting the right and left letters of the equality with dashed lines. The predicates $M_t$, $M_r$, $M_l$, $M_b$ are determined by the following formula transformations. $M_b$ moves from the upper quadrants to the lower quadrants only those pairs $C_k \pi_k$ that have co-region connections; $M_t$ moves from the lower quadrants to the upper quadrants only those pairs $C_k \pi_k$ that have co-region connections; $M_l$ moves vertically and inside a pair from the right quadrants to the left ones; $M_r$ moves connected pairs from the left quadrants to the right ones. The function symbol ev translates $\pi_i$, $C_j$ to $\mathrm{ev}(\pi_i)$, $\mathrm{ev}(C_j)$ and places the = sign in the appropriate places.

If no restrictions are imposed on the choice of splices and convolutions, we speak of free general splices. Special categorical splices are constructed from a fixed set of convolutions and a fixed set of splices; the original splices are called generators, and all the rest are obtained by all possible applications of these convolutions to splices. The splices model the set of affine simplexes (Fig. 1). For clarity, consider the following formula as an example.


Fig. 1. Modeling simplices by splices

We will omit the conjunction sign, replace equalities with a line connecting the right and left sides of the equalities. The table for the formula is

For convolution, we have a table

Applying $M_l$, we get

A complete restoration of the original splice is possible as follows. We apply a transfer $M_r$ to the final table in relation to $C_w \pi_w \bar{C}_j \bar{\pi}_j \bar{C}_v \bar{\pi}_v$; we get a table that coincides with the previous final table. Then we apply $M_t$ to it, which gives the original table and, thereby, the original splice.

For a given splice, it is possible to construct a new ′-dual splice by transferring the contents of the right parts of the splice table to the left and, at the same time, vice versa (splice-convolution duality).

Similarly, one can construct a new ¯-dual splice by transferring the contents of the upper parts of the splice table to the lower parts and, at the same time, vice versa (“outside-inside” duality). If the names in the splice formula are replaced by the corresponding co-domains and, simultaneously, vice versa, then a splice is again obtained, the Cπ-dual of the original splice, in which the equalities of co-domains turn into equalities of names and vice versa (“name-co-domain” duality). For splices of the segments, such a transformation leads to the table

graphically it looks like this (Fig. 2)

Fig. 2. Example of “name-to-realm” duality for splices

Let formulas be given in the language of categorical splices. The transformations of formulas of the types ′, ¯, Cπ defined above are called their ′-, ¯-, Cπ-duality transformations. The fundamental role played by duality in the theory of categories is well known and is logically substantiated in [2]. An analog of the theorem proved in [2] is valid for the listed dualities in the theory of categorical splices.

Theorem. Let a provable well-formed formula A be given in the formal theory of categorical splices. Then the dual formulas ′(A), ¯(A), Cπ(A) are also provable in this theory.

Let there be k splices with sets of variables of four sorts, functional symbols, and predicates. It is not difficult to construct a unified formal theory of higher categorical k-splices. Consider a table with 2k columns: in the first k columns, in order, insert $a_i$, $c_i$, $i = 1, \ldots, k$, and in the second k columns insert $b_i$, $d_i$, $i = 1, \ldots, k$, as indicated below


In convolutional polycategories, poly-arrows have inputs and outputs. They can be modeled using 2-splices. Let us look at the example of traditional categories. Two arrows, where the beginning of one arrow coincides with the end of the other, together with the convolution corresponding to the composition of the arrows, correspond to the formula (the conjunction sign between the letters is omitted)

The polygraph and convolution are shown in the diagram below (Fig. 3).

Fig. 3. Graphical representation of the splices formula

In order not to write out awkward superscripts, the second pairs C (2) , π (2) will be denoted by D, ω:

We have a table



Applying $M_b$ and ev from the convolution $S = M_l\, \mathrm{ev}\, M_b$, we get ($\mathrm{ev}(C, D, \ldots) = C, D, \ldots$)


In this model of the category, the way the result of a composition is obtained from the original arrows is recorded together with the result itself. Convolutions of several arrows for a category in Set form a nerve of this category; the nerve construction carries over to convolutional polycategories and categorical splices.
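The remark that a composition is recorded together with the arrows it came from can be illustrated by a small data-structure sketch. This is only a loose analogy for ordinary categories, not an implementation of the splice calculus: the class `Arrow`, the function `convolve`, and the field `parts` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Arrow:
    name: str
    domain: str
    codomain: str
    # Arrows from which this arrow was obtained (empty for generators).
    parts: Tuple["Arrow", ...] = ()


def convolve(f: Arrow, g: Arrow) -> Arrow:
    """Compose g after f when the end of f coincides with the beginning of g.

    The result keeps references to f and g, so the composition and its
    origin are stored together.
    """
    if f.codomain != g.domain:
        raise ValueError("arrows are not composable")
    return Arrow(f"{g.name}*{f.name}", f.domain, g.codomain, (f, g))


f = Arrow("f", "A", "B")
g = Arrow("g", "B", "C")
h = convolve(f, g)
print(h.name, h.domain, h.codomain, [a.name for a in h.parts])
```

Repeating `convolve` on a chain of composable arrows yields nested `parts` tuples, which is roughly the information that a simplex of the nerve of a category keeps about a string of composable arrows.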

3 Category Model of Systems of Quantum Particles

We treat atomic nuclei and electrons as quantum particles whose systems are modeled by the convolutional polycategories and categorical splices considered above, which are categorical systems [3, 4]. The chemical system Chs is modeled by a second-order convolutional polycategory $Chs = (Chs_0, Chs_1)$, which is defined as follows. The arrows of the polycategory $Chs_0$ are atoms, whose names are collected in the periodic table and which generate the polycategory, and molecules, obtained from atoms using the convolutions of the polycategory. The polycategory $Chs_1$ is related to chemical reactions. Let us represent atoms as categorical systems. The sets of atoms are polygraphs of the polycategory $Chs_0$, consisting of polyarrows. Consider the hydrogen atom H; it is the only polygraph P. The projection is $\pi_1(P) = \omega_1(P) = H$. Similarly, for each of the atoms A, projections are defined with the same equality $\pi_1(P) = \omega_1(P) = A$. To define the regions and co-regions of the arrows as projections $C_i(P)$, $D_i(P)$, $i = 1, 2, 3, \ldots$, let us turn to the quantum mechanics of particles.

Atoms and molecules are formed by the electrical interaction of nuclei and electrons. In the classical and quantum cases, the inputs and outputs of molecules, as systems, are associated with the charge (+ with outputs, − with inputs). We consider stationary charges; in this case only electrostatic interaction remains. In the classical case, the lines of force go from positive to negative charges without crossing. To describe the regions and co-regions of the arrows, we use classical electrostatics, which, by virtue of the Hellmann-Feynman theorem, turns out to be adequate for the quantum case as well.
In Bader's well-known "atoms in molecules" approach [5], the classification of chemical bonds is based on the density of the electron cloud of a molecule; nevertheless, it is proved there that this is topologically equivalent to considering the lines of force of the electrostatic field, as we do here.

For each charge q located far from other charges, consider a sphere of small radius r centered on the charge. We divide the sphere into $n_q$ equal convex areas (for example, curvilinear triangles or squares with a side of 2πr/100); by the spherical symmetry of the electric field, each area corresponds to a charge $q/n_q$. We number these areas in some fixed way. The density of the lines of force, which represents the strength of the electric field on the surface of the sphere, is chosen so that there is one line of force per area. We take the charges $q/n_q$, together with the corresponding area numbers, as the co-areas and areas of the arrow representing the charge. When the charge q approaches other charges, the field strength on the surface of the sphere changes; however, if the radius r is made small enough, this change can be neglected. If there are N ≥ 2 charges, we require a division into areas such that

$$\left|\frac{q_i}{n_{q_i}}\right| = \left|\frac{q_j}{n_{q_j}}\right|, \quad i, j = 1, 2, 3, \ldots, N.$$

Consider several stationary positive and negative charges in a certain region of space, and associate arrows with them in the way described. Each line of force starts at a positive charge and, without intersecting other lines of force, ends at a negative charge, connecting area i with some area k. The sum of the charges can be positive, negative, or zero. If the total charge is zero, each line of force is closed. If the total charge is nonzero, the lines of force for which no suitable area was left on the other charges remain open. The set of pairs of areas connected by lines of force corresponds to the spatial arrangement of the charges and thus to the configuration of their electrostatic field. When defining the areas and co-areas of the polyarrows one could, in addition to the area number and the amount of charge, indicate the coordinates of the space cell in which the area is located, having first divided space into such cells. In what follows we restrict ourselves to defining areas and co-areas without spatial coordinates.

We define an elementary convolution as follows: it transfers an area with a positive charge to a certain point in space, transfers an area with a negative charge to another point in space, and uses a line of force that starts at the first area and ends at the second. A convolution is a set of elementary convolutions satisfying several requirements. First, the number of elementary convolutions is chosen so that the total charge of the result equals the sum of the initial charges. Second, the transfer of areas must not change the relative position of areas belonging to the same charge. Third, the location of all charges in space must correspond to the equilibrium state of the initial charges representing the nuclei. The third condition always holds in the quantum case; in the classical case it suffices to "smear" the negative charges in space in the form of the electron cloud of the molecule.
Smearing corresponds to a sufficiently fine splitting of the initial negative charges into smaller ones. In the quantum case, point electrons need not be split: the role of the negatively charged particles that balance the mutual repulsion of the positively charged nuclei is played by the probability density of the electrons in the molecule. The convolution of several positive and negative charges, described by the constructed polyarrows, is a composite system with corresponding regions and co-regions. It is important to emphasize that a convolution can yield several molecules of different substances. The interior of the diagrams after a convolution is applied corresponds to an additional polygraph that records the history of the formation of the polygraph from the original polygraph. The polycategory Chs0 contains a convolution $\bar{S}$, inverse to the convolution S, acting on the additional polygraph. The presented correspondence between regions and co-regions and the lines of force of the electric field can be carried out with a predetermined accuracy sufficient for our purposes. A similar correspondence holds, thanks to the Hellmann-Feynman theorem [6], in the quantum-mechanical case, for which the electron cloud is divided into sufficiently small volume charges.

4 Category Chemical Bond Models for Algebraic Biology

An exact representation of a molecule as a system in the form of a polycategory arrow requires knowledge of the wave function of the molecule in stationary and other states; therefore, as in quantum chemistry, one has to be content with approximate methods. Categorical models can also be built for the available approximations. The main approximation in quantum chemistry [6] is the adiabatic approximation, which treats the motions of nuclei and electrons separately. It can be assumed that chemical bonds are determined by the density distribution of the electron charge in the electron cloud, specified by the wave function of the molecule, which ensures the equilibrium position of the nuclei by compensating their electrical interaction with each other, as justified by the Hellmann-Feynman theorem. Electron clouds in different molecules can share the same substructures, and each such substructure can correspond to a particular type of chemical bond. The main types of chemical bonds in biochemistry are covalent bonds, hydrogen bonds, and aromatic hydrocarbon bonds. After some refinement, structural chemical formulas become equivalent to the corresponding splices and their convolutions, in which the covalent strokes are matched with the co-regions of the splices. For example, a carbon atom and its divalent modification correspond to the splices in Fig. 4. Methane is obtained by simple convolutions (grouped into one convolution).

Fig. 4. Convolutions for obtaining divalent carbon and methane

Alkane models are represented by associative algebras with identities; the multiplication operation is ternary and corresponds to a carbon atom, and the identities express the equivalence of the trees obtained when different H atoms are chosen as the root of the alkane tree. Hydrocarbons with double and triple bonds are modeled using different types of convolutions for σ, πx, and πy bonds (Fig. 5).

Fig. 5. Category splice for ethylene (σ- and π-bonds are marked)


For the aromatic bond, a convolution with six H atoms (for benzene) or with a larger number of H atoms (for other aromatic hydrocarbons) is selected. Cyclic hydrocarbons are modeled in a natural way. The properties of the algebras, PROPs, and splices arising in this way, studied by strictly mathematical methods, yield the properties of biomolecules. RNA and DNA biomolecules, which are essential for algebraic biology and its predictions, are also modeled by splices. It is not difficult to write out the splices corresponding to nucleotides and nucleosides; the further construction of RNA and DNA molecules from them reduces to algebraic convolutions. We emphasize the strictly mathematical nature of these models and of the operations in them, which equips algebraic biology with a number of categorical and algebraic research methods.

Let us return to the polycategory Chs and construct the polycategory Chs1 as follows. The objects of the polycategory, which can make up the areas and co-areas of the polyarrows, are the molecules of all substances that can be obtained from atomic nuclei and electrons by the convolutions of the polycategory Chs0. The arrows are categorical reactions, that is, pairs (set of reagent molecules, set of product molecules of the categorical reaction). Convolutions in the polycategory Chs1 are determined by basic compositional convolutions of two reactions: a convolution is possible if all reagents of the second reaction are among the products of the first. When there are two reagents and one reaction product, the polyarrows realize an associative algebra of reactions. In the general case, a polycategory is an algebraic object, and the whole arsenal of categorical methods becomes available to algebraic biology for biochemical reactions associated with the genome and with matrix genetics [7], which gave rise to algebraic biology.
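The composition rule for reactions in Chs1 can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: reactions are modeled as pairs of multisets (the text speaks of sets; tracking multiplicities is an added assumption), and the names `make_reaction`, `can_compose`, and `compose`, as well as the example molecules, are hypothetical and do not come from the source.

```python
from collections import Counter

def make_reaction(reagents, products):
    """A categorical reaction as a pair (reagent multiset, product multiset)."""
    return (Counter(reagents), Counter(products))

def can_compose(first, second):
    """True if every reagent of `second` occurs among the products of `first`."""
    _, products_first = first
    reagents_second, _ = second
    return all(products_first[m] >= k for m, k in reagents_second.items())

def compose(first, second):
    """Compose two reactions: products of `first` feed the reagents of `second`."""
    if not can_compose(first, second):
        raise ValueError("reagents of the second reaction are not all produced by the first")
    reagents_first, products_first = first
    reagents_second, products_second = second
    # Counter subtraction keeps only positive counts: products not consumed by `second`.
    leftover = products_first - reagents_second
    return (reagents_first, leftover + products_second)

# Hypothetical example: methane combustion followed by carbonic acid formation.
r1 = make_reaction(["CH4", "O2", "O2"], ["CO2", "H2O", "H2O"])
r2 = make_reaction(["CO2", "H2O"], ["H2CO3"])
r12 = compose(r1, r2)
print(dict(r12[0]), "->", dict(r12[1]))
```

In this sketch the composite reaction keeps the reagents of the first reaction and returns the unconsumed products of the first reaction together with the products of the second; whether unconsumed products should be carried along is a design choice of the illustration, not a statement from the text.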

5 Conclusion

One of the main tasks of algebraic biology is to predict the properties of organisms from the genome using rigorous algebraic methods, in particular, to predict the properties of intelligence that can be modeled by conventional and strong artificial intelligence (AI). It is gradually becoming clear that the popular methods of artificial neural networks [8–12] are not sufficient for modeling strong AI, whereas algebraic biology should in the future provide the necessary models of AI. This report offers new categorical methods used in categorical systems theory. These methods are based on the theory of categorical splices, the elements of which are presented in this work. With their help, the behavior of quantum-mechanical particles, in particular RNA and DNA biomolecules, is modeled as that of categorical systems. Thus, new algebraic and categorical methods (associative algebras with identities, PROPs, categorical splices) are brought to the analysis of the genome. The listed results are new and original.

References

1. Tolokonnikov, G.K., Petoukhov, S.V.: New mathematical approaches to the problems of algebraic biology. In: Hu, Z., Petoukhov, S., He, M. (eds.) AIMEE 2019. AISC, vol. 1126, pp. 55–64. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-39162-1_6
2. Hatcher, W.S.: The Logical Foundations of Mathematics, p. 320. Pergamon Press (1982)
3. Tolokonnikov, G.K.: Mathematical foundations of the theory of biomachsystems. In: Biomachsystems. Theory and Applications, vol. 1, pp. 31–213. Rosinformagrotekh, Moscow (2016)
4. Tolokonnikov, G.K.: Categorical models of artificial neural networks and system information, categorical splices, and the paradigm of categorical systems theory. Biomachsystems 2(1), 127–174 (2018)
5. Bader, R.: Atoms in Molecules: A Quantum Theory. Oxford University Press (1990)
6. Gribov, L.A., Mushtakova, S.P.: Quantum Chemistry, p. 390. Moscow (1999)
7. Petoukhov, S.: Matrix Genetics, Algebra of Genetic Code, Noise Immunity. RHD, Moscow (2008)
8. Dharmajee Rao, D.T.V., Ramana, K.V.: Winograd's inequality: effectiveness for efficient training of deep neural networks. IJISA 6, 49–58 (2018)
9. Karande, A.M., Kalbande, D.R.: Weight assignment algorithms for designing fully connected neural network. IJISA 6, 68–76 (2018)
10. Hu, Z., Tereykovskiy, I.A., Tereykovska, L.O., Pogorelov, V.V.: Determination of structural parameters of multilayer perceptron designed to estimate parameters of technical systems. IJISA 10, 57–62 (2017)
11. Awadalla, M.H.A.: Spiking neural network and bull genetic algorithm for active vibration control. IJISA 10(2), 17–26 (2018)
12. Abuljadayel, A., Wedyan, F.: An approach for the generation of higher order mutants using genetic algorithms. IJISA 10(1), 34–35 (2018)

Consensus Algorithm Based Distributed Coordinated Control for ESSs Integrated Off-grid PV Station

Xiaoling Su1(B), Zhengxi Li2, Yang Si1, Yongqing Guo1, and Wenhao Xu1

1 Qinghai Key Lab of Efficient Utilization of Clean Energy, Qinghai University, Xining 810016, Qinghai, China
2 State Grid Qinghai Electric Power Company Economic and Technological Research Institute, Xining 810016, Qinghai, China

Abstract. Small capacity, simple structure, a harsh operating environment, and insufficient operation and maintenance cause serious power quality and reliability problems in energy storage system (ESS) integrated off-grid photovoltaic (PV) stations in remote areas. In this paper, a distributed coordinated control strategy based on a consensus algorithm is proposed for off-grid PV stations, taking the operating characteristics of PV and ESSs into account. The primary power control strategy is improved to optimize PV and ESS operation. A frequency disturbance observer is designed for the secondary control to eliminate the static control error of droop control. The feasibility and effectiveness of the proposed control method are validated by simulation results.

Keywords: Off-grid PV station · Distributed control · Consensus algorithm · Droop control

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021
Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 213–221, 2021. https://doi.org/10.1007/978-3-030-80531-9_19

1 Introduction

Electric power plays an important role in quality of life. China has constructed over 2000 off-grid photovoltaic (PV) power stations on the Qinghai-Tibet Plateau to supply electricity to remote rural communities. This approach reflects both the advantages and the disadvantages of the natural environment and climate: the rich solar energy resource makes PV an ideal option, while the remote location makes power grid construction impractical. It is a reasonable and practical response to the social development and environmental protection requirements of this area. These off-grid PV power stations are hybrid microgrids containing PV units, energy storage systems (ESSs), and loads, and are also called ESSs integrated off-grid PV stations. Because of their small capacity and simple structure, their operation strategies are relatively simple. The ESSs store the electric energy generated by PV and supply power at night, so a 24-h power supply is hardly achieved. In addition, the harsh operating environment and insufficient operation and maintenance cause serious power quality and reliability problems. Thanks to the New Rural Pastoral Areas Construction Policy, both the quality of life and the power demand have increased in these areas, and the supply from the existing off-grid PV stations is no longer sufficient, which makes capacity expansion and transformation imperative [1].

Microgrids use droop control to operate as self-controlled systems in islanded mode [2, 3]. A secondary voltage/frequency control layer is introduced to eliminate the static error caused by the primary droop control [4–6]. Traditional centralized hierarchical control requires a complex communication network to transmit system information, and the master controller is responsible for all the calculation and signal processing of the whole microgrid. The communication and control structure of the microgrid has to be re-established whenever distributed generation is connected to or disconnected from the microgrid [7–9]. System robustness and flexibility are therefore relatively low, and plug-and-play operation cannot be supported. Consequently, centralized hierarchical control is not suitable for off-grid PV stations with variable structures.

Distributed control technology combines the advantages of centralized and decentralized control, with distinct features such as less information transmission and processing, better robustness, and higher flexibility [10–15]. Distributed coordinated control algorithms based on robust control [16], adaptive control [17], neural networks [18], consensus control [19], and other secondary control algorithms have been widely used in microgrids to eliminate voltage and frequency deviations in islanded mode. A distributed secondary and optimal active power sharing control for islanded microgrids is proposed in [20], which also analyzes the impact of communication delay on system stability. A distributed control strategy based on consensus theory that considers communication and control delays is proposed for active power control in microgrids in [21].

Consensus algorithms have been widely used in distributed computing [22]. The ideas of the statistical consensus theory of DeGroot reappeared two decades later in the aggregation of uncertain information obtained from multiple sensors [23] and medical experts [24]. In networks of ESSs integrated off-grid PV stations, consensus means reaching an agreement on a certain quantity of interest that depends on the state of all equipment, such as frequency or voltage.

This paper proposes a distributed coordinated control strategy based on a consensus algorithm, constrained by the operating environment and communication network of off-grid PV stations in remote areas. The primary power control strategies of the units are improved according to their different working modes in order to control the voltage and frequency of these PV-ESSs hybrid microgrids and to prevent the ESSs from overcharging, over-discharging, or operating at a poor state of charge (SoC) for a long time, thereby extending their service life.

2 Graph Theory Based Consistency Control

Graph Laplacians, whose spectral properties are discussed in [25], are important graph-related matrices that play a crucial role in the convergence analysis of consensus and alignment algorithms. Graph theory based consistency control makes all individuals reach a consensus over a sparse communication network. In distributed control, the information exchange between nodes is described by a graph. The PV and ESS units in an off-grid PV station communicate in duplex mode, so the network topology is treated as an undirected graph. Consider the undirected communication network G(V, E), where V = {v1, ..., vn} is the set of n nodes of the network, E ⊆ V × V is the set of communication lines, and {vi, vj} ∈ E denotes a communication connection between nodes i and j. This paper takes the off-grid PV station frequency as the quantity associated with the nodes of the network. The coupling matrix A = (aij)n×n describes the relationship between the nodes of G and characterizes the network topology. Its off-diagonal element aij is the corresponding element of the adjacency matrix of the node connection graph: if there is an information link between node i and node j, that is, (vi, vj) ∈ E, then aij > 0; otherwise aij = 0. The diagonal elements are aii = 0. The consensus-based first-order linear system equation of the multi-agent network for the i-th node in the microgrid is

$$\dot{x}_i(t) = u_i(t), \quad i = 1, 2, 3, \ldots, n \tag{1}$$

where $x_i$ is the state variable of node i, which communicates only with its neighboring nodes. The system achieves uniform convergence only when the state variables of all nodes reach consensus. Here $u_i(t)$ is the control protocol of node i, determined by the information fed back from its neighboring nodes:

$$u_i(t) = -c \sum_{j=1}^{n} a_{ij}\left(x_i(t) - x_j(t)\right) \tag{2}$$

where c is the coupling strength (coupling weight), whose value determines how fast the network converges to consensus. The output of each node is determined only by its own state and the current states of its neighboring nodes, which eventually achieves consistency control. Define the Laplacian matrix of the network from the adjacency matrix as

$$L = \left(l_{ij}\right)_{n \times n} \tag{3}$$

where $l_{ii} = \sum_{j=1}^{n} a_{ij}$ and, for $i \neq j$, $l_{ij} = -a_{ij} \le 0$. The system equation can therefore be rewritten in vector form, with $x(t) = [x_1(t), \ldots, x_n(t)]^{T}$ and $u(t) = [u_1(t), \ldots, u_n(t)]^{T}$, as

$$\dot{x}(t) = u(t) = -cLx(t) \tag{4}$$

L is time-invariant if the network topology is fixed. The system stability is determined by the eigenvalues of the Laplacian matrix L. By the definition of its elements, the Laplacian matrix is real and symmetric, so it is diagonalizable; it has one eigenvalue equal to zero and, provided the communication graph is connected, all remaining eigenvalues are greater than zero. Under the distributed consensus protocol of Eq. (4), the state of each node in the system eventually converges to the global average of the initial states.
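The convergence to the global average can be checked numerically. The following is a minimal sketch, not the authors' simulation: it integrates the consensus dynamics of Eqs. (1) and (2) with a forward-Euler step on a hypothetical four-node path graph. The adjacency matrix, the initial frequencies, the coupling strength `c`, the step `dt`, and the function names are all assumptions chosen for illustration.

```python
def laplacian(adjacency):
    """Build L from a symmetric adjacency matrix with zero diagonal:
    l_ii = sum_j a_ij, l_ij = -a_ij for i != j."""
    n = len(adjacency)
    return [[sum(adjacency[i]) if i == j else -adjacency[i][j]
             for j in range(n)] for i in range(n)]

def simulate_consensus(adjacency, x0, c=1.0, dt=0.01, steps=2000):
    """Forward-Euler integration of dx/dt = -c * L * x."""
    L = laplacian(adjacency)
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        dx = [-c * sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [x[i] + dt * dx[i] for i in range(n)]
    return x

# Hypothetical 4-node path graph 1-2-3-4 with unit link weights.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
x0 = [49.6, 50.2, 50.0, 50.4]   # assumed initial node frequencies in Hz
final = simulate_consensus(A, x0)
average = sum(x0) / len(x0)
print(final, average)
```

Because L is symmetric with zero row sums, each Euler step preserves the sum of the states, so the only possible agreement value is the average of the initial states; the step size must stay below 2 divided by the product of c and the largest eigenvalue of L for the discrete iteration to remain stable.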


3 Distributed Coordinated Control for Off-grid PV Station

Distributed coordinated control achieves error-free voltage and frequency control by adjusting the output power of the PV and ESS units in the off-grid PV station, maintaining the power balance as well as an optimal distribution of active power output between PV and ESSs. Figures 1 and 2 show the consensus algorithm based distributed coordinated control strategy for the ESS and PV units, respectively; it comprises droop control and consistency control. The droop control, that is, the local control strategy, regulates the frequency by monitoring the real-time active power output of the ESS and PV units. The consistency control exchanges information through the communication network, adjusts the reference values of the ESS and PV units according to the operating parameters of each node, the operating status of the PV units, and the SoC of the ESSs, and distributes the load reasonably.

Fig. 1. Distributed coordinated control strategy for ESSs

Fig. 2. Distributed cooperative control strategy for PV

4 Consensus Algorithm Based Distributed Coordinated Control

4.1 Consistency Control

For the secondary consistency control, a frequency disturbance observer is used. The observer updates its next-step output value from its own output value and the output values of the observers of its neighboring units:

$$\Delta f = f_i^{\mathrm{next}} - f_{mi} = -\int \sum_{j=1}^{n} a_{ij}\left(f_i - f_j\right)\mathrm{d}t \tag{5}$$

where $f_i^{\mathrm{next}}$ is the next-step frequency of node i, $f_{mi}$ is the current measured value, and $f_i$ and $f_j$ denote the calculated frequency values of nodes i and j, respectively. The frequency response characteristics are defined according to the SoC of the ESSs and the operating characteristics of the PV units:

$$f = \begin{cases} f_N, & SOC \le SOC_{\min} \\ f_N + \lambda \cdot (SOC - SOC_{\min}), & 1 > SOC > SOC_{\min} \end{cases} \tag{6}$$

$$P_{PV} = \begin{cases} P_N, & f \le f_N \\ P_N - m \cdot (f - f_N), & f > f_N \end{cases} \tag{7}$$

where

$$m = \frac{P_N}{f_{\max} - f_N} \tag{8}$$

Here $f_N$ is the rated frequency of the off-grid PV station, $P_N$ is the rated output power of PV, and $f_{\max}$ is the upper limit of the off-grid PV station frequency. Similarly,

$$\lambda = \frac{f_{\max} - f_N}{1 - SOC_{\min}} \tag{9}$$

where $f_N$ is the rated frequency of the ESSs, $f_{\max}$ is the upper limit of the ESSs frequency, and $SOC_{\min}$ is the lower limit of the ESSs SoC.

4.2 Droop Control

The PV and ESS inverters adopt a droop control strategy, which emulates the P-f regulation characteristics of traditional rotating machines, to regulate the voltage and frequency of the off-grid PV station automatically. The droop control laws of the ESSs and the PV units are designed as Eqs. (10) and (11), respectively.

$$f = f_N + \lambda \cdot (SOC - SOC_{\min}) + \Delta f_{upper} \tag{10}$$

$$P_{PV} = P_N - m \cdot \left(f - f_N - \Delta f_{upper}\right) \tag{11}$$

where $\Delta f_{upper}$ is the output of the secondary consistency control and one of the inputs of the primary droop control. The control signals of the PV and ESSs are calculated according to Eqs. (10) and (11). In other words, the PV and ESSs in the off-grid PV station are operated by the distributed coordinated control system according to the operating status and parameters of each node.
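The piecewise characteristics of Eqs. (6) and (7) can be sketched in a few lines. This is an illustration under explicit assumptions rather than the authors' controller: the slopes are taken as m = P_N / (f_max − f_N) and λ = (f_max − f_N) / (1 − SOC_min), which is one reading of Eqs. (8) and (9); the secondary correction Δf_upper is omitted; and the numerical values (f_max = 50.5 Hz, SOC_min = 0.2, P_N = 40 kW) and function names are hypothetical.

```python
def ess_frequency_reference(soc, f_n=50.0, f_max=50.5, soc_min=0.2):
    """Frequency set by an ESS from its state of charge, after Eq. (6).
    Assumed slope: lam = (f_max - f_n) / (1 - soc_min)."""
    lam = (f_max - f_n) / (1.0 - soc_min)
    if soc <= soc_min:
        return f_n                      # low SoC: hold rated frequency
    return f_n + lam * (soc - soc_min)  # higher SoC: raise frequency

def pv_power_reference(f, p_n=40.0, f_n=50.0, f_max=50.5):
    """PV active power (kW) as a function of frequency, after Eq. (7).
    Assumed slope: m = p_n / (f_max - f_n)."""
    m = p_n / (f_max - f_n)
    if f <= f_n:
        return p_n                      # at or below rated frequency: full output
    return p_n - m * (f - f_n)          # above rated frequency: curtail PV

for soc in (0.1, 0.6, 1.0):
    f_ref = ess_frequency_reference(soc)
    print(soc, f_ref, pv_power_reference(f_ref))
```

With these assumed slopes, a fully charged ESS drives the frequency to its upper limit, at which point the PV output is curtailed to zero; at or below SOC_min the frequency stays at its rated value and the PV delivers its rated power.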

5 Simulation Analysis

A simulation model of a 100 kWp off-grid PV station in Xueshan Township, Maqin County, Guoluo Prefecture is developed in MATLAB. The rated AC voltage is 220 V at 50 Hz. The station contains 400 PV modules of 250 Wp each, ESSs consisting of 348 lead-acid batteries of 1200 Ah each, and load. One additional set of PV and ESSs is installed in the simulation model to increase the number of nodes in order to test the consensus algorithm based distributed coordinated control strategy, as shown in Fig. 3.

Fig. 3. Off-grid stand-alone photovoltaic power station structure.

The output power of PV is 40 kW. The output power of each of the two ESSs is 10 kW; both have adjustable capacity, which means that they are ready to charge or discharge flexibly, but their SoC values differ. In addition, the total load in this off-grid PV station simulation model is 100 kW. The load increases by 40 kW at 4 s; Figs. 4, 5 and 6 give the simulation results.

Fig. 4. Output power of the first ESSs (PESSs1; active power versus time).

Fig. 5. Output power of the second ESSs (PESSs2; active power (kW) versus time).

Fig. 6. Off-grid stand-alone PV station frequency (frequency (Hz) versus time).

Figures 4 and 5 show that, before 4 s, the output power of each of the two ESSs is at its initial value of 10 kW. When the load increases at 4 s, the output power of the first ESSs increases from 10 kW to 40 kW and that of the second ESSs increases from 10 kW to about 20 kW. The output of the PV remains unchanged because the ESSs are still within their adjustable range. In Fig. 6, the off-grid PV station frequency is 50 Hz before 4 s; when the load increases at 4 s, the frequency drops because of the power shortage. The control system then adjusts the output power of the ESSs according to the control law, and the frequency returns to the rated value of 50 Hz.


6 Conclusions

Off-grid PV stations effectively improve the power quality in remote pastoral areas; however, most of them adopt simple control strategies and have a relatively simple structure, which causes serious power quality problems and reduces reliability. To solve these problems, improve power quality, and provide a sufficient power supply to the farmers and herdsmen living in these areas, this paper designs a distributed coordinated control strategy for off-grid PV stations based on a consensus algorithm. First, the secondary consistency control is responsible for power distribution: the power demand is assigned reasonably to PV and ESSs to balance power in the off-grid PV station and to avoid overcharging and over-discharging the ESSs. Second, the voltage and frequency are regulated automatically by droop control to improve the power supply quality. The simulation results validate the accuracy and feasibility of the consensus algorithm based distributed coordinated control strategy for off-grid PV stations in remote areas.

Acknowledgments. This work is supported by Research on Key Technologies of Self-supporting Micro Renewable Energy Network in Qinghai Agricultural and Pastoral Areas under grant number 2018-ZJ-748.

References

1. Nejabatkhah, F., Li, Y.W., Nassif, A.B., Kang, T.: Optimal design and operation of a remote hybrid microgrid. CPSS Trans. Power Electron. Appl. 3(1), 3–13 (2018)
2. Meng, X., Liu, J., Liu, Z.: A generalized droop control for grid-supporting inverter based on comparison between traditional droop control and virtual synchronous generator control. IEEE Trans. Power Electron. 34(6), 5416–5438 (2019)
3. Guerrero, J.M., Hang, L., Uceda, J.: Control of distributed uninterruptible power supply systems. IEEE Trans. Ind. Electron. 55(8), 2845–2859 (2008)
4. Chen, L., Wang, R., Zhen, T.: Model predictive control of virtual synchronous generator to improve dynamic characteristic of frequency for isolated microgrid. Autom. Electr. Power Syst. 42(3), 40–47 (2018)
5. Han, H., Hou, X., Yang, J., et al.: Review of power sharing control strategies for islanding operation of AC microgrids. IEEE Trans. Smart Grid 7(1), 200–215 (2016)
6. Han, Y., Li, H., Shen, P., et al.: Review of active and reactive power sharing strategies in hierarchical controlled microgrids. IEEE Trans. Power Electron. 32(3), 2427–2451 (2017)
7. Liu, W., Geng, G., Jiang, Q., Fan, H., Yu, J.: Model-free fast frequency control support with energy storage system. IEEE Trans. Power Syst. 35(4), 3078–3086 (2020)
8. Wang, Y., et al.: Aggregated energy storage for power system frequency control: a finite-time consensus approach. IEEE Trans. Smart Grid 10(4), 3675–3686 (2019)
9. Morstyn, T., Savkin, A.V., Hredzak, B., Agelidis, V.G.: Multi-agent sliding mode control for state of charge balancing between battery energy storage systems distributed in a DC microgrid. IEEE Trans. Smart Grid 9(5), 4735–4743 (2018)
10. Avila, N.F., Chu, C.: Distributed pinning droop control in isolated AC microgrids. IEEE Trans. Ind. Appl. 53(4), 3237–3249 (2017)
11. Chen, L., Wang, Y., Zheng, T.: Consensus-based distributed control for parallel-connected virtual synchronous generator. Control Theory Appl. 34(2), 1–8 (2017)
12. Meng, L., et al.: Distributed voltage unbalance compensation in islanded microgrids by using a dynamic consensus algorithm. IEEE Trans. Power Electron. 31(1), 827–838 (2016)
13. Zhang, Z., Chow, M.Y.: Convergence analysis of the incremental cost consensus algorithm under different communication network topologies in a smart grid. IEEE Trans. Power Syst. 27(4), 1761–1768 (2012)
14. Shahab, M.A., Mozafari, B., Soleymani, S., Dehkordi, N.M., Shourkaei, H.M., Guerrero, J.M.: Distributed consensus-based fault tolerant control of islanded microgrids. IEEE Trans. Smart Grid 11(1), 37–47 (2020)
15. Shafiee, Q., Guerrero, J.M., Vasquez, J.C.: Distributed secondary control for islanded microgrids: a novel approach. IEEE Trans. Power Electron. 29(2), 1018–1031 (2014)
16. Mahdian Dehkordi, N., Sadati, N., Hamzeh, M.: Distributed robust finite-time secondary voltage and frequency control of islanded microgrids. IEEE Trans. Power Syst. 32(5), 3648–3659 (2017)
17. Bidram, A., Davoudi, A., Lewis, F.L., et al.: Distributed adaptive voltage control of inverter-based microgrids. IEEE Trans. Energy Convers. 29(4), 862–872 (2014)
18. Cai, H., Hu, G., Lewis, F.L., Davoudi, A.: A distributed feedforward approach to cooperative control of AC microgrids. IEEE Trans. Power Syst. 31(5), 4057–4067 (2016)
19. Guo, F., Wen, C., Mao, J., et al.: Distributed cooperative secondary control for voltage unbalance compensation in an islanded microgrid. IEEE Trans. Industr. Inf. 11(5), 1078–1088 (2017)
20. Chen, G., Guo, Z.: Distributed secondary and optimal active power sharing control for islanded microgrids with communication delays. IEEE Trans. Smart Grid 10(2), 2002–2014 (2019)
21. Deng, S., Chen, L., Zheng, T.: Active power distributed control of microgrids considering system time delays. Power Syst. Technol. 43(5), 1536–1542 (2019)
22. Olfati-Saber, R., Fax, J.A., Murray, R.M.: Consensus and cooperation in networked multi-agent systems. Proc. IEEE 95(1), 215–233 (2007)
23. Benediktsson, J.A., Swain, P.H.: Consensus theoretic classification methods. IEEE Trans. Syst. Man Cybern. 22(4), 688–704 (1992)
24. Weller, S.C., Mann, N.C.: Assessing rater performance without a "gold standard" using consensus theory. Med. Decision Making 17(1), 71–79 (1997)
25. Olfati-Saber, R., Murray, R.M.: Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans. Autom. Control 49(9), 1520–1533 (2004)

Algebraic Harmony in Genomic DNA-Texts and Long-Range Coherence in Biological Systems

Sergey V. Petoukhov1,2(B)

1 Mechanical Engineering Research Institute, Russian Academy of Sciences, M. Kharitonievsky Pereulok, 4, Moscow, Russia
2 Moscow State Tchaikovsky Conservatory, Bolshaya Nikitskaya, 13/6, Moscow, Russia

Abstract. The paper is devoted to topical problems in the development of quantum biology. Connections between the hyperbolic rules of the cooperative oligomer organization of DNA-texts of eukaryotic and prokaryotic genomes and Fröhlich's well-known theory of quantum long-range coherence in biological systems are considered for the first time. These new hyperbolic rules in long helical DNA, which are related to the harmonic progression 1, 1/2, …, 1/n, make it possible to discuss a connection of Fröhlich's theory with fractal-like phenomena in the cooperative organization of long DNA-texts and also with helical antennas, which emit and absorb circularly polarized electromagnetic waves. The harmonic progression is related to standing waves in resonators and, in particular, to harmonics in music. It is noted that the algebra-harmonic features of genomes are reminiscent of the well-known ancient practice of meditation using music and 4-sector mandalas. The described materials develop ideas of quantum biology and can lead to new ideas in theoretical and applied areas, including problems of artificial intelligence and the in-depth study of genetic phenomena for medical and biotechnological tasks.

Keywords: Genomes · DNA · Hyperbolic rules · Harmonic progression · Helical antennas · Fröhlich's theory · Quantum biology

1 Introduction

The aim of this research is to present and discuss new genetic materials and ideas useful for the development of quantum biology. Recent publications have shown the existence of hyperbolic rules for the oligomer cooperative organization of long DNA texts in genomes of higher and lower organisms [1–4]. These rules are associated with the harmonic progression (1), whose historical name is connected with harmonics in music and with series of standing waves in resonators:

$$1/1,\ 1/2,\ 1/3,\ 1/4,\ \ldots,\ 1/n \tag{1}$$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021. Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 222–231, 2021. https://doi.org/10.1007/978-3-030-80531-9_20

The aforementioned hyperbolic rules of genomes were revealed by the author’s method of analyzing long DNA sequences of nucleotides A (adenine), T (thymine), C (cytosine), and G (guanine). This method, called the method of oligomer sums, is as follows. It interprets any long DNA text as a multi-layered text (or as a set of parallel texts). For example, the first layer of the text CAGATCCGAAT… represents it as a text of 1-letter words C-A-G-…, the second layer as a text of 2-letter words CA-GA-TC-CG-…, the third layer as a text of 3-letter words CAG-ATC-…, and in general the n-th layer as a text of n-letter words (or n-plets, or oligomers of length n). In each of the studied n-layers, the total number of oligomers (n-plets) that start with a given nucleotide is calculated. For example, for human chromosome 1, whose DNA contains a text of about 230 million nucleotides A, T, C, and G, counting the n-plets starting with A in each of the first 20 layers gives a sequence of 20 totals ranging from 67070277 (n = 1) down to 3354107 (n = 20). It turns out that this sequence is modeled with high accuracy (deviations are in the range of ±0.03%) by the sequence of 20 components of the following vector $\bar{A}$:

$$\bar{A} = 67070277 \cdot [1,\ 1/2,\ 1/3,\ \ldots,\ 1/20] \tag{2}$$

where the multiplier 67070277 is the number of nucleotides A in the given DNA-text, and the indicated coordinates are the first 20 members of the harmonic progression (1). Vectors of this type, whose coordinates are related to the members of a harmonic progression, will be conventionally called harmonic vectors. Knowing the number of A nucleotides in the genomic DNA text, it is possible to predict with high accuracy the values of the 19 other sums of n-plets (oligomers of length n), since, as it turns out, there is a regular relationship between them, reflecting the cooperative organization of a multimillion DNA chain. A similar result, expressed with high precision by the harmonic vectors $\bar{T}$, $\bar{C}$, and $\bar{G}$ (3), is obtained for the same DNA text when it is represented as multilayered and the total numbers of n-plets starting with each of the other three letters T, C, or G are calculated (n = 1, 2, 3, …, 20):

$$\begin{aligned}
\bar{T} &= 67244164 \cdot [1,\ 1/2,\ 1/3,\ \ldots,\ 1/20],\\
\bar{C} &= 48055043 \cdot [1,\ 1/2,\ 1/3,\ \ldots,\ 1/20],\\
\bar{G} &= 48111528 \cdot [1,\ 1/2,\ 1/3,\ \ldots,\ 1/20],
\end{aligned} \tag{3}$$

where the multiplier for each vector, whose components are the first members of the harmonic progression (1), is equal to the number of the corresponding nucleotides in this DNA. Similar harmonic relationships of n-plet sums in each of the classes of A-, T-, C-, and G-oligomers were obtained by the author for multilayer representations of DNA texts of: (1) all 24 human chromosomes; (2) all chromosomes of drosophila, a house mouse, a nematode, and many plants; (3) 19 genomes of bacteria and archaea; (4) many extremophiles living in extreme conditions, for example, under radiation at a level 1000 times higher than the level fatal for humans. All the numerical data on these interconnections can be found in the works [1, 3].

If we take only each k-th letter of the DNA text of a genome (k = 2, 3, …, 10, …, 50, …, 100, …), the resulting shortened DNA texts are new sequences of the letters A, T, C, G. It turns out that for each of these new texts the same harmonic rules for the interconnection of the sums of n-plets are fulfilled with high accuracy [5]. This suggests that DNA texts are fractal-like. These rules of harmonic relationships of oligomer sums in long DNA texts are candidates for the role of universal genetic rules.

What are the possible physical mechanisms of such a cooperative organization of a great number of nucleotides in long DNA texts, and what role can helical DNA with such cooperative properties play? This article presents the author’s thoughts and materials on these fundamental issues, which are connected with the development of quantum biology. The article includes five sections and conclusions.
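The method of oligomer sums described above can be sketched in a few lines of Python. This is only an illustrative reading of the verbal description, not the author's software: the function names `oligomer_sums`, `harmonic_vector`, and `epi_chain` are hypothetical, and the sketch assumes that each layer is formed from consecutive, non-overlapping n-plets with an incomplete trailing word ignored. The toy text is the example CAGATCCGAAT from the Introduction; the two numbers used for the deviation check are the values quoted for human chromosome 1.

```python
from collections import Counter

NUCLEOTIDES = "ATCG"


def oligomer_sums(dna, max_n=20):
    """Count, for each layer n, the n-plets starting with each nucleotide.

    Assumption: layer n splits the text into consecutive, non-overlapping
    n-letter words; an incomplete word at the end of the text is ignored.
    """
    sums = {}
    for n in range(1, max_n + 1):
        # First letters of the complete n-plets of layer n.
        first_letters = Counter(dna[i] for i in range(0, len(dna) - n + 1, n))
        sums[n] = {letter: first_letters.get(letter, 0) for letter in NUCLEOTIDES}
    return sums


def harmonic_vector(total, max_n=20):
    """Model values total * [1, 1/2, ..., 1/max_n], as in expression (2)."""
    return [total / n for n in range(1, max_n + 1)]


def epi_chain(dna, k, start=0):
    """Shortened text made of every k-th letter, beginning at index `start`."""
    return dna[start::k]


toy = "CAGATCCGAAT"
layers = oligomer_sums(toy, max_n=3)
print(layers[2])  # first letters of the 2-plets CA-GA-TC-CG-AA

# Compare the modeled 20th component of expression (2) with the quoted count.
model_a = harmonic_vector(67070277, max_n=20)
deviation = (3354107 - model_a[-1]) / model_a[-1]
print(f"layer 20: model {model_a[-1]:.2f}, relative deviation {deviation:.4%}")

print(epi_chain(toy, 2))  # every second letter of the toy text
```

For a real chromosome the same functions would be applied to the full sequence string; the relative deviation computed here for the 20th layer lies inside the ±0.03% range stated in the text.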

2 The Harmonic Progression, Resonance Phenomena, and Helical Antennas

Since the time of Pythagoras, the harmonic progression has been associated with harmonics of music and standing waves in resonators. This allows us to suppose that the connection between the harmonic progression and the numerical structure of helical DNA texts is resonant and reflects the existence of a certain system of standing waves. This is consistent with the known hypotheses that helical DNA molecules act as helical antennas for electromagnetic waves (see, for example, the discussion in [6, 7]) and therefore, like other helical chiral biomolecules, emit and absorb electromagnetic waves of a certain circular polarization, which gives biomolecules the ability to exchange radio waves of selective polarization.

Regarding DNA as helical antennas, it is necessary to recall the important role of helical antennas in communication technology for space communications, radar, cellular telephony, and much more. Helical antenna theory and applications are described in many books, for example, in [8]. Helical antennas are largely insensitive to manufacturing errors, and they can operate in one of two principal modes: axial mode or normal mode. In the axial mode, a helical antenna functions as a directional antenna radiating a beam off the ends of the helix, along the antenna’s axis (in this mode the diameter and pitch of the helix are comparable to a wavelength). It radiates circularly polarized radio waves, which distinguishes it from other antennas with directional radiation. In radio transmission, circular polarization is often used where the relative orientation of the transmitting and receiving antennas cannot be easily controlled, such as in animal tracking and spacecraft communications (for example, the rotation of a spacecraft does not influence the communication).

At present, quantum polarization states are being intensively studied all over the world with the aim of mastering quantum information technologies, especially in the field of quantum communication. The direction of rotation of the circular polarization of the space receiving-transmitting antenna must coincide with the direction of rotation of the ground receiving and transmitting antenna working with the space antenna. In space communication, polarization isolation is used, that is, antennas with opposite directions of rotation of polarization operate at the same frequency. In other words, for these helical antennas the factor of chirality (left or right polarization) is very important for communication.


But the problem of chirality (or biological dissymmetry) is also quite important for molecular biological systems, as has been known since the famous experiments of Louis Pasteur in 1848 (see, for example, [9, 10]). Helical chiral biomolecules emit and absorb electromagnetic waves of the corresponding circular polarization. This gives helical chiral biomolecules the opportunity to exchange radio waves of the corresponding circular polarization selectively with helical biomolecules of the same kind of chirality. The author believes that the principle of the chiral stereo-chemical organization of biomolecules in living nature is deeply related to the informational principle of communication among biomolecules based on electromagnetic waves of appropriate circular polarization inside biological bodies. The existence of chirality in biomolecule structures and also in their electromagnetic waves of appropriate circular polarization is very significant for pharmacology and some methods of physiotherapy. It should be added that many species (some crustaceans, arachnids, cephalopods, and vertebrates) have a useful informational ability to perceive polarized light with their organs of sight. Here it should be recalled that not only the DNA molecule has a helical configuration: many proteins also have helical substructures, known as alpha-helices. The recent work [11] describes the discovery of left-handed helices in the tips of human spermatozoon tails; its authors speculate that these helical structures in particular can “play a role in controlling the swimming direction of spermatozoa”, that is, play the role of antennas for communication with the environment. When considering DNA helices as analogs of helical antennas, it should be taken into account that DNA is a dynamic structure capable of greatly changing its spatial configuration in different conditions. For this reason, DNA helices should be considered as dynamic antennas with parameters that vary in time.
Because of this dynamism, it seems that DNA is constantly searching for and producing wave information with circular polarizations. The data on DNA epi-chains [5] show that long helical DNA nucleotide sequences have fractal-like features. In this regard, the attention of researchers seeking to understand the functioning of DNA as an antenna structure should also be drawn to fractal antennas, which are another important type of antenna with many applications in technology. There are many publications on the theory and applications of fractal antennas and on fractal methods of information transfer (see, for example, [12]).

Antenna-like helical structures are involved in the biomechanics of coordinated movements. For example, the unicellular organism Mixotricha paradoxa moves due to the 250 thousand helical bacteria Treponema spirochetes on its surface, whose helical flagella twist coherently, providing targeted movement (https://en.wikipedia.org/wiki/Mixotricha_paradoxa). Electrical and mechanical vibrations in living bodies are closely related, since many of their structures are piezoelectric: nucleic acids, actin, dentin, tendons, bones, etc. For this reason, electromagnetic phenomena are accompanied by phenomena of vibration mechanics, which can also be connected with harmonics and standing mechanical waves. This thought correlates with the following thought of the famous Russian biophysicist S. Shnoll on the interaction of biological macromolecules: “From possible consequences of interaction of macromolecules of enzymes, which are carrying out conformational (cyclic) fluctuations, we shall consider pulsations of pressure - sound waves. The range of numbers of turns of the majority of enzymes corresponds to acoustic sound frequencies. We shall consider … a fantastic picture of “musical interactions” among biochemical systems, cells, bodies, and a possible physiological role of these interactions. … It leads to pleasant thoughts about the nature of the hearing, about an origin of musical perception and about many other things, which belong to the area of biochemical aesthetics already” [13].

The harmonic progression (1) has interesting mathematical features, including its connection with mathematical tools that are used in studying visual perception and in creating methods of computer vision, image recognition, realistic computer graphics, and systems of artificial intelligence; a great number of publications are devoted to these topics, including, for example, [14–19]. The study of the mathematical properties of the harmonic progression leads to new mathematical models in algebraic biology.
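One elementary property behind the association of the harmonic progression (1) with standing waves, recalled at the start of this section, can be written out for the textbook case of an ideal string (a one-dimensional resonator) fixed at both ends. This is a standard illustration and not part of the author's analysis; the symbols $L$ (resonator length), $v$ (wave speed), $\lambda_n$ and $f_n$ (wavelength and frequency of the n-th mode) are introduced here only for this example.

$$\lambda_n = \frac{2L}{n}, \qquad f_n = \frac{v}{\lambda_n} = n\,\frac{v}{2L}, \qquad n = 1, 2, 3, \ldots$$

$$\frac{\lambda_n}{\lambda_1} = \frac{1}{n}, \qquad \frac{f_n}{f_1} = n .$$

Under these assumptions, the wavelengths of the admissible standing waves, measured in units of the fundamental wavelength $\lambda_1 = 2L$, are exactly the members 1, 1/2, 1/3, …, 1/n of the harmonic progression, while the corresponding frequencies are the integer multiples of the fundamental frequency that music theory calls harmonics.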

3 Hyperbolic Genomic Rules and Fröhlich’s Theory of Long-Range Coherence in Biological Systems

The hyperbolic rules of the oligomer cooperative (or coherent) organization of genomes give evidence of the existence of long-range coherence in biological systems. This section considers a possible connection of this genomic cooperative organization with Fröhlich’s theory of long-range coherence in biological systems. The foundations of this theory were presented in the works [20–22]. Many other works consider Fröhlich’s theory and its applications, for example [23–26].

Let us briefly recall Fröhlich’s theory regarding a possible role for collective quantum effects in biological systems. The theory is based on the idea of vibrations in cells, which can resonate with microwave electromagnetic radiation due to a special biological quantum coherence phenomenon, which can exist under a large energy of metabolic drive. According to Fröhlich, a phenomenon quite similar to Bose-Einstein condensation may occur in substances under certain conditions, which are possible for biological bodies. In a certain sense, the mechanism described by Fröhlich is a special trap for the energy of metabolic processes in a living organism. Some authors have obtained results in their biophysical studies that testify in favor of the validity of Fröhlich’s theory with respect to its applicability to biological bodies. For example, the work [24] presents the experimental observation of Fröhlich condensation in a protein structure. In particular, the authors observed a local increase of electron density in a long α-helix motif consistent with a subtle longitudinal compression of the helix. The analysis shows that the experimental results “can only be explained by Fröhlich condensation, a phenomenon predicted almost half a century ago” [24]. Fröhlich’s theory of long-range coherence in biological systems and its consequences are related to medical diagnosis [23]. The book [25] considers Fröhlich’s theory in relation to brain functioning and artificial intelligence. Taking into account Hameroff’s hypothesis [27, 28] that microtubules in the cytoskeletons of neurons might act as “dielectric waveguides”, Penrose supposed that microtubules are subject to similar quantum-coherent behavior.


F. Fröhlich (the son of Herbert Fröhlich) wrote the article “Genetic code as Language” regarding quantum coherence states and long-range communication in genomes [29]. He discussed in detail the importance of Fröhlich’s theory for understanding genetic informatics based on resonant mechanisms. In particular, he emphasized that, for many reasons, the genome must have a certain long-distance communication between its parts. Works [3, 31] should be added to the topic of quantum coherence and language. Since long DNA sequences have quantum long-range interaction between their elements, quantum entanglement in this quantum genetic system can be essential for its surprisingly organized construction. The thought about the important role of quantum entanglement in DNA organization is not new: see, for example, the article [32] about entanglement as a glue for DNA constructions where lattice vibrations, or phonons, are significant. All these materials give evidence that Fröhlich’s idea of a large-scale collective quantum phenomenon based on resonances in biological systems can be used for understanding the described hyperbolic (harmonic) rules of the cooperative organization of long DNA-texts.

4 Presentation of Oligomer Cooperative Properties of Genomes in the Form of Numeric Genetic Mandalas

Structural analogies of the system of 4-nucleotide DNA alphabets with the Yin-Yang schemes of the ancient Chinese “Book of Changes” (I-Ching) and also with 4-section mandalas have long been noted [33–36]. For thousands of years, millions of Buddhists, Hindus, and other believers have been creating mandalas as a tool of meditation to achieve “enlightenment” and healing. The creator of analytical psychology C. Jung and his associate, the physicist W. Pauli, considered the mandala to be an innate archetype of the unconscious and the conjugation of a cosmogram with a psychogram, capable of harmonizing the psyche. Revealing the connection of long DNA texts with the harmonic progression 1/n allows the author to build for them genomic mandalas (or graphic 4-sector representations) of an algebraic-harmonic numerical character, in which each of the four sectors represents one of the four harmonic vectors (Fig. 1). In this numeric mandala (Fig. 1, left), each of the 4 sectors represents the corresponding harmonic vector for a specific long DNA in the classes of n-plets starting with A, T, C, or G (the symbol N means any of the letters A, T, C, G). For example, for the first human chromosome, each of these 4 sectors numerically corresponds to one of the above 4 harmonic vectors (2, 3). In other words, with this approach, 4-sector mandalas turn out to be sets of four harmonic vectors. In the normalized mandala of a DNA-text (Fig. 1, right), obtained by dividing all the numbers of each sector by the number of nucleotides of the corresponding type A, T, C, or G, the sequence of numbers in each sector is a sequence of members of the harmonic progression (1).
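The relation between the numeric mandala and its normalized form can be illustrated with a short Python sketch. It is a hedged illustration only: the function names `numeric_mandala` and `normalized_mandala` are hypothetical, and the sectors are filled with the model harmonic vectors (2) and (3) for human chromosome 1 rather than with actual n-plet counts, which are not reproduced in this chapter.

```python
# Nucleotide totals of human chromosome 1, taken from expressions (2) and (3).
TOTALS = {"A": 67070277, "T": 67244164, "C": 48055043, "G": 48111528}


def harmonic_vector(total, max_n=20):
    """Model values total * [1, 1/2, ..., 1/max_n]."""
    return [total / n for n in range(1, max_n + 1)]


def numeric_mandala(totals, max_n=20):
    """Four sectors, one model harmonic vector per starting nucleotide."""
    return {letter: harmonic_vector(total, max_n) for letter, total in totals.items()}


def normalized_mandala(mandala, totals):
    """Divide every number of a sector by the total of its nucleotide."""
    return {
        letter: [value / totals[letter] for value in sector]
        for letter, sector in mandala.items()
    }


mandala = numeric_mandala(TOTALS)
normalized = normalized_mandala(mandala, TOTALS)

# Each normalized sector reduces to the harmonic progression 1, 1/2, ..., 1/20.
for letter, sector in normalized.items():
    print(letter, [round(value, 4) for value in sector[:5]])
```

With real data, the sectors of `mandala` would hold the counted oligomer sums, and the normalized sectors would match the harmonic progression only approximately, within the deviations reported by the author.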
For long DNA texts, in addition to the described mandala of the first order, based on n-plets that start with one of the 4 monoplets A, T, C, and G, there are mandalas of the second, third, and higher orders, based correspondingly on n-plets that start with one of the 16 doublets, one of the 64 triplets, etc. Sectors of such higher-order mandalas correspond to other types of harmonic vectors, whose coordinates are equal to ratios of members of the harmonic progression. Examples of such harmonic vectors of the second order, $\overline{AA}$, and of the third order, $\overline{AAA}$, are shown by expressions (4) for one of the 16 mandalas of the second order and for one of the 64 mandalas of the third order; they represent the sums of n-plets that start with the doublet AA and with the triplet AAA correspondingly (the symbols ΣAA and ΣAAA denote the total amounts of the doublet AA and the triplet AAA in the considered long DNA-text):

$$\begin{aligned}
\overline{AA} &= \Sigma AA \cdot [0,\ 2/2,\ 3/2,\ 4/2,\ 5/2,\ 6/2,\ \ldots],\\
\overline{AAA} &= \Sigma AAA \cdot [0,\ 0,\ 3/3,\ 4/3,\ 5/3,\ 6/3,\ \ldots]
\end{aligned} \tag{4}$$

Fig. 1. The representation of a set of 4 harmonic vectors of a long DNA-text in the form of 4 sectors of a mandala (or a circular diagram), where the symbols AN…N, TN…N, CN…N, and GN…N refer to the total amounts of all oligomers that start with the corresponding nucleotide A, T, C, or G in the considered DNA-text (here N denotes any of the nucleotides A, T, C, and G). Left: a general view of a genomic algebra-harmonic numeric mandala. Right: a normalized form of such a mandala, where the numbers AN…N, TN…N, CN…N, and GN…N are divided by the amount of the corresponding nucleotide A, T, C, or G.

Briefly speaking, there is a great fractal-like hierarchy of DNA mandalas tied to the harmonic progression (1). Taking into account the conjugation of this progression with the harmonics of music, one can think of a whole “choir” of such mandalas for each long DNA-text, which can be matched to corresponding ensembles of physical factors (acoustic, light, vibrational, etc.) for various biological and physiological purposes, including the further development of ancient technologies of mandala-musical meditation for “enlightenment” and recovery. Here one can recall that ancient teachings of India and China affirmed the view of musical harmony as something primordial. Buddhists have thousands of years of mandala-meditation practice associated with the idea of musical harmony, reflected, for example, in different sequences of many scalable bells in ancient Buddhist monasteries.

5 Some Discussing Remarks

The discovery of the hyperbolic (harmonic) rules of the oligomer cooperative organization of long nucleotide sequences in helical DNA allows considering Fröhlich’s theory of long-range coherence in biological systems, which appeals to vibrations and resonances, in combination with ideas of helical antennas and electromagnetic waves of circular polarization. Until now, Fröhlich’s theory was mainly considered in its possible application to microtubules of tubulin, other proteins, and brains (see, for example, [25]). Our data allow considering the application of Fröhlich’s theory to the cooperative fractal-like phenomena of the organization of long DNA-texts, that is, to the genetic informational phenomena of the deepest biological level. Our data about the cooperative (or coherent) organization of genomic DNA-texts support Fröhlich’s theory of long-range coherence in biological systems and also its foresight about quantum mechanical principles in the functioning of living bodies. From the point of view of this theory, a living body is a quantum-mechanical system of traps for the energy of metabolic processes and for energy from the outside world.

Fröhlich’s theory allowed considering the genetic code as a language built on vibrational and resonant mechanisms [29]. Our data clarify that it makes sense to consider long DNA texts not as a single linguistic text but as a multilayer woven text structure, whose different layers are mutually connected in their numerical characteristics through the harmonic progression (1), which is directly related to standing waves in resonators.

Our results support the fundamental ideas of P. Jordan and E. Schrödinger, who were among the founders of quantum mechanics, about the need for the development of quantum biology [37]. They noted that the key difference between living and inanimate objects is as follows: inanimate objects are controlled by the average random movement of millions of their particles, while in a living organism, genetic molecules have a dictatorial effect on the entire organism due to a special mechanism of quantum amplification. Jordan claimed that “life’s missing laws were the rules of chance and probability (the indeterminism) of the quantum world that were somehow scaled up inside living organisms” [37]. Our presented results are based on studying probabilities in long DNA sequences and describe the corresponding hyperbolic rules.

The author believes that the discovery of the described new properties of long DNA-texts related to their hyperbolic harmonic rules can lead to new ideas in theoretical and applied areas, including problems of artificial intelligence and in-depth study of genetic phenomena for medical and biotechnological tasks.

6 Conclusions

(1) New materials have been obtained for the development of quantum biology;
(2) Fröhlich’s theory of quantum long-range coherence in biological systems seems to be appropriate for modeling the hyperbolic rules of the oligomer cooperative organization of long DNA-texts in genomes of higher and lower organisms;
(3) The discovery of fractal-like phenomena of the cooperative organization of long DNA-texts gives new materials for considering helical DNA molecules as electromagnetic antennas, which emit and absorb electromagnetic waves of circular polarization for molecular intercommunications;
(4) The specificity of long DNA texts as sequences of four nucleotides and their oligomers, which obey the universal hyperbolic harmonic rules of the cooperative organization of genomes, allows us to consider the ancient practices of musical-mandala meditation as having a certain basis in the genetic organization of living organisms. The statements of C. Jung and W. Pauli about mandalas as an innate archetype of the unconscious should also be considered in connection with these new genetic rules.

Acknowledgments. The author is grateful to his colleagues M. He, Z. Hu, I. Stepanyan, V. Svirin, and G. Tolokonnikov for research assistance.

References

1. Petoukhov, S.V.: Hyperbolic rules of the cooperative organization of eukaryotic and prokaryotic genomes. Biosystems 198, 104273 (2020)
2. Petoukhov, S.V.: Modeling inherited physiological structures based on hyperbolic numbers. Biosystems (2020). https://doi.org/10.1016/j.biosystems.2020.104285
3. Petoukhov, S.V.: Hyperbolic rules of the oligomer cooperative organization of eukaryotic and prokaryotic genomes. Preprints 2020, 2020050471 (2020). https://doi.org/10.20944/preprints202005.0471.v2, https://www.preprints.org/manuscript/202005.0471/v2
4. Petoukhov, S.V.: Genomes symmetries and algebraic harmony in living bodies. Symmetry: Cult. Sci. 31(2), 222–223 (2020). https://doi.org/10.26830/symmetry_2020_2_222
5. Petoukhov, S.V.: Nucleotide epi-chains and new nucleotide probability rules in long DNA sequences. Preprints 2019, 2019040011 (2019). https://doi.org/10.20944/preprints201904.0011.v2, https://www.preprints.org/manuscript/201904.0011/v2
6. Blank, M., Goodman, R.: DNA is a fractal antenna in electromagnetic fields. Int. J. Radiat. Biol. 87(4), 409–415 (2011). https://doi.org/10.3109/09553002.2011.538130
7. Foster, K.R.: Comments on DNA as a fractal antenna. Int. J. Radiat. Biol. 87(12), 1208–1209 (2011). https://www.tandfonline.com/doi/full/10.3109/09553002.2011.626490?src=recsys
8. Kraus, J.D., Marhefka, R.J.: Antennas: For All Applications, 3rd edn. McGraw-Hill Higher Education, New York (2002)
9. Darvas, G.: Symmetry. Birkhauser, Basel (2007)
10. Kizel, V.A.: Optical activity and dissymmetry in living systems. Soviet Phys. Uspekhi 23(6), 277–295 (1980)
11. Zabeo, D., et al.: A lumenal interrupted helix in human sperm tail microtubules. Sci. Rep. 8(1), 2727 (2018). https://doi.org/10.1038/s41598-018-21165-8. PMID: 29426884, PMCID: PMC5807425
12. Potapov, A.A.: Fractals in radiophysics and radar. Elements of the theory of fractals: a review. J. Commun. Technol. Electron. 45(11), 1157–1164 (2000)
13. Shnoll, S.E.: Physical-Chemical Factors of Biological Evolution. Nauka, Moscow (1989). (in Russian)
14. Khan, R., Debnath, R.: Human distraction detection from video stream using artificial emotional intelligence. IJIGSP 12(2), 19–29 (2020)
15. Erwin, E., Ningsih, D.R.: Improving retinal image quality using the contrast stretching, histogram equalization, and CLAHE methods with median filters. IJIGSP 12(2), 30–41 (2020)
16. Mostakim, Md.N., Mahmud, S., Jewel, Md.K.H., Rahman, Md.K., Ali, Md.S.: Design and development of an intelligent home with automated environmental control. IJIGSP 12(4), 1–14 (2020)
17. Arora, N., Ashok, A., Tiwari, S.: Efficient image retrieval through hybrid feature set and neural network. IJIGSP 11(1), 44–53 (2019)
18. Anami, B.S., Naveen, N.M., Surendra, P.: Automated paddy variety recognition from color-related plant agro-morphological characteristics. IJIGSP 11(1), 12–22 (2019)
19. Ahmed, M., Akhand, M.A.H., Rahman, M.M.H.: Recognizing Bangla handwritten numeral utilizing deep long short term memory. IJIGSP 11(1), 23–32 (2019)
20. Fröhlich, H.: Long range coherence and the action of enzymes. Nature 228, 1093 (1970)
21. Fröhlich, H.: Introduction. Theoretical physics and biology. In: Fröhlich, H. (ed.) Biological Coherence and Response to External Stimuli, pp. 3–24. Springer, Heidelberg (1988). https://doi.org/10.1007/978-3-642-73309-3_1, ISBN 978-0-387-18739-6
22. Fröhlich, H., Kremer, F.: Coherent Excitations in Biological Systems. Springer, Heidelberg (1983). https://doi.org/10.1007/978-3-642-69186-7. ISBN 978-3-642-69186-7
23. Hyland, G.J.: Coherent GHz and THz excitations in active biosystems, and their implications. In: The Future of Medical Diagnostics? - Proceeding of Matra Marconi UK, Directorate of Science, Internal Report, Portsmouth, UK, pp. 14–27 (1998)
24. Lundholm, I.V., et al.: Terahertz radiation induces non-thermal structural changes, associated with Fröhlich condensation in a protein crystal. Struct. Dyn. 2, 054702 (2015)
25. Penrose, R.: Shadows of the Mind: A Search for the Missing Science of Consciousness. Oxford University Press Inc., New York (1994). ISBN 0 19 853978 9
26. Vasconcellos, A.R., Vannucchi, F.S., Mascarenhas, S., Luzzi, R.: Frohlich condensate: emergence of synergetic dissipative structures in information processing biological and condensed matter systems. Information 3(4), 601–620 (2012). https://doi.org/10.3390/info3040601
27. Hameroff, S.R.: Chi: a neural hologram? Am. J. Clin. Med. 2(2), 163–170 (1974)
28. Hameroff, S.R.: Ultimate Computing. Biomolecular Consciousness and Nano-Technology. North-Holland, Amsterdam (1987)
29. Fröhlich, F.: Genetic code as Language. In: Fröhlich, H. (ed.) Biological Coherence and Response to External Stimuli, pp. 192–204. Springer, Heidelberg (1988). https://doi.org/10.1007/978-3-642-73309-3_11, ISBN 978-0-387-18739-6
30. Holland, J.M.: Studies in Structure. MacMillan Press, London (1972)
31. Petoukhov, S.V.: Connections between long genetic and literary texts. The quantum-algorithmic modelling. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds.) ICCSEEA 2019. AISC, vol. 938, pp. 534–543. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-16621-2_50
32. Rieper, E., Anders, J., Vedral, V.: Quantum entanglement between the electron clouds of nucleic acids in DNA (2011). arXiv:1006.4053v2
33. Petoukhov, S.V.: Genetic code and the ancient Chinese “Book of Changes”. Symmetry: Cult. Sci. 10(3–4), 211–226 (1999)
34. Petoukhov, S.V.: Matrix Genetics, Algebras of the Genetic Code, Noise-Immunity. RChD, Moscow (2008). ISBN 978-5-93972-643-6. (in Russian)
35. Petoukhov, S.V., He, M.: Symmetrical Analysis Techniques for Genetic Systems and Bioinformatics: Advanced Patterns and Applications. IGI Global, Hershey (2009)
36. Hu, Z.B., Petoukhov, S.V., Petukhova, E.S.: I-Ching, dyadic groups of binary numbers and the geno-logic coding in living bodies. Prog. Biophys. Mol. Biol. 131, 354–368 (2017). https://doi.org/10.1016/j.pbiomolbio.2017.08.018
37. McFadden, J., Al-Khalili, J.: The origins of quantum biology. Proc. R. Soc. A 474(2220), 1–13 (2018). https://doi.org/10.1098/rspa.2018.0674

Comparative Analysis of Inductive Density Clustering Algorithms Meanshift and DBSCAN

Zhengbing Hu1, Irina Lurie2, Oleksii K. Tyshchenko3(B), Nataliia Savina4, and Volodymyr Lytvynenko2

1 National Aviation University, Liubomyra Huzara Avenue 1, Kyiv 03058, Ukraine
2 Kherson National Technical University, Beryslavs’ke Hwy 24, Kherson 73008, Ukraine
3 Institute for Research and Applications of Fuzzy Modeling, CE IT4Innovations, University of Ostrava, 30. dubna 22, 701 03 Ostrava, Czech Republic
4 National University of Water and Environmental Engineering, Soborna Street 11, Rivne 33000, Ukraine

Abstract. The article presents an inductive model of objective clustering based on the MeanShift clustering technique. An algorithm for dividing the set of original data into two equal-power subsets is employed. The balance criterion is used as an external criterion. To test the proposed model, the “Jain” and “Flame” data sets from the School of Computing of the University of Eastern Finland were used. The inductive DBSCAN algorithm was used for comparison with the obtained results. Based on the simulation results, ways of further improving the proposed model are outlined in order to increase the clustering objectivity for the examined data.

Keywords: Inductive clustering · MeanShift · DBSCAN · External quality measure · Internal clustering quality · Density-based clustering

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 232–242, 2021. https://doi.org/10.1007/978-3-030-80531-9_21

1 Introduction

Clustering is a principal process for extracting semi-empirical patterns from processed data. The clustering problem is an unsupervised learning task that reduces to dividing a collection of data objects into subsets so that the elements of one subset differ substantially from the elements of all other subsets according to some specified criteria. Clustering may be a preprocessing stage in more sophisticated data applications. At present, there is a wide variety of clustering algorithms. Each method has benefits and limitations and is aimed at a particular kind of data. A high level of subjectivity is one of the essential drawbacks of the available procedures, i.e., high-quality processing of one data sample does not guarantee the same level of results on a different, comparable data set. The objectivity of clustering can be increased by applying inductive methods for modeling complex systems based on the group method of data handling (GMDH) [1, 2], where the data are processed on two subsets of equal power and the final decision is based on the partition of the objects into clusters according to external relevance criteria and internal criteria for evaluating the clustering quality. The inductive clustering methodology [3–5] is based on three primary principles:

1. The principle of heuristic self-organization, which implies sequential enumeration of candidate models of increasing complexity in order to pick the best models by a suitable external criterion or a group of criteria for judging the quality of data grouping.
2. The principle of external addition, the central idea of which is the requirement to use “fresh information” for objective verification of a model.
3. The principle of non-finality of decisions, the idea of which is to generate not a single result but a certain set of intermediate results with the subsequent selection of the best ones.

Implementing these principles in a modified form creates the preconditions for an inductive framework for objective clustering of complex data. Therefore, the development of hybrid systems for clustering objects on the basis of inductive methods for modeling complex systems is an important task from both a theoretical and a practical point of view. There are many different clustering algorithms. Some of them split a data set into a known number of groups, while others choose the number of clusters automatically. The aim of this study is to investigate the effective implementation of the inductive clustering technique on the basis of the algorithm for determining the maximum probability density within the objective inductive clustering methodology, and to compare it with the DBSCAN algorithm.

The paper has six sections. Section 2 briefly reviews the state-of-the-art works on the subject. Section 3 describes the inductive clustering strategy based on the MeanShift procedure. Section 4 describes an inductive clustering method based on the DBSCAN approach. Section 5 presents the empirical results and a comparative interpretation of the two clustering tools. Section 6 concludes the paper.

2 Related Works

A classification of numerous clustering algorithms by category is given in [6]. Each method has its own benefits and drawbacks. The choice of a suitable clustering technique is determined by the kind of data being analyzed and the scope of the task at hand [7, 8]. Nonparametric methods that can identify clusters of arbitrary shape also produce hierarchical descriptions of data. In broader terms, clustering procedures can be classified into agglomerative and iterative algorithms. Hierarchical cluster analysis works adequately for a small number of objects. Still, it is not always suitable for big data because of the complexity of the agglomerative algorithm and the huge size of the dendrogram. In iterative algorithms, the data are quickly separated into several groups, the number of which is determined according to the corresponding criteria [9–11]. However, iterative clustering tools have their own drawbacks. The first is that the global minimum of the standard-deviation criterion is not always guaranteed to be attained; the second is that the outcome often depends on the choice of the initial cluster centers, and their optimal choice is not always unambiguous [12]. Inductive clustering tools may reduce these limitations to a significant degree.

The basic ideas of forming an inductive scheme for clustering objects, as well as their further development, are described in [3]. In [13, 14], the authors show how to perform objective cluster analysis and discuss the benefits of inductive models in comparison with conventional data clustering methods. In [15, 16], the authors presented an enhanced inductive clustering model based on the k-means scheme. An algorithm for dividing the original data into two equal-power subsets was proposed and implemented. These papers also consider the assessment of model stability. A two-stage density-based inductive clustering algorithm was developed for processing big data in bioinformatics [17]. A practical implementation of the DBSCAN procedure within the framework of inductive objective clustering is proposed in [18]. In that paper, the optimal parameters of the algorithm are determined from the maximum value of a complex quality criterion, which is computed as the geometric mean of Harrington’s desirability indices for the internal and external clustering quality criteria. In [19], a new inductive approach to mean shift clustering, which determines the location of the maximal probability density, was introduced for the first time.

3 Inductive MeanShift

MeanShift is a nonparametric technique of feature space analysis for locating the maxima of the probability density, the so-called mode-seeking algorithm. The MeanShift strategy assigns data points to clusters iteratively by shifting points towards the regions of highest data density, which serve as cluster centroids (centers) [20]. For nonparametric estimation of the data density distribution, one of the most widely used estimates is the Rosenblatt–Parzen estimate. The density is estimated as the total influence of the sample elements, where each element’s contribution is described by a bell-shaped (kernel) function K(x) that depends on the distance to this element [21]. The density estimate f(x) with a smoothing parameter (bandwidth) P at an arbitrary point x takes the form:

$$f(x) = \frac{1}{N P^{d}} \sum_{i=1}^{N} K\left(\frac{x - x_i}{P}\right).$$

One can use the classical Gaussian kernel as K(x):

$$K_G\left(\frac{x - x_i}{P}\right) = \exp\left(-\frac{\|x - x_i\|^2}{2P^2}\right).$$
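As a minimal sketch of the density estimate above, the following Python fragment evaluates f(x) with the Gaussian kernel exactly as it is written in the text (without a normalizing constant). The function names and the four-point sample are illustrative assumptions and are not taken from the original study, whose code was written in R.

```python
import math

def gauss_kernel(x, xi, bandwidth):
    """Gaussian kernel K_G = exp(-||x - xi||^2 / (2 P^2)), as in the text."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def density_estimate(x, sample, bandwidth):
    """Rosenblatt-Parzen estimate f(x) = 1 / (N P^d) * sum_i K((x - x_i) / P)."""
    n = len(sample)   # N: number of sample elements
    d = len(x)        # d: dimension of the feature space
    total = sum(gauss_kernel(x, xi, bandwidth) for xi in sample)
    return total / (n * bandwidth ** d)

# Hypothetical 2-D sample: a tight group near the origin and one distant point.
sample = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
dense = density_estimate((0.0, 0.0), sample, bandwidth=0.5)
sparse = density_estimate((2.5, 2.5), sample, bandwidth=0.5)
print(dense > sparse)  # the estimate is larger inside the tight group
```

A smaller bandwidth P makes the estimate more peaked around individual points, while a larger P smooths neighbouring groups into a single mode, which is why P is the parameter tuned by the inductive scheme.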


However, in practical tasks, in order to reduce computational costs, bounded kernels are used, such as the Epanechnikov kernel:

$$K_{Ep}\left(\frac{x - x_i}{P}\right) = \left(1 - \frac{\|x - x_i\|^2}{P^2}\right) \cdot I\left(\|x - x_i\| \le P\right),$$

where I(·) denotes the indicator function. In this approach, the clusters correspond to local maxima of the density estimate (modes) [20, 22]. Data items are assigned to clusters using the MeanShift procedure, which converges along the gradient to the corresponding local maximum. The iterative procedure starts from a point $x_0$ and moves sequentially to the shifted point $x_{k+1} = m(x_k)$ until convergence, where

$$m(x) = \frac{\sum_{i=1}^{N} x_i \cdot K(x - x_i)}{\sum_{i=1}^{N} K(x - x_i)}.$$

The vector (m(x) − x) defines the “mean shift,” and its direction coincides with the direction of the maximal density increase at the point x. Clustering algorithms based on the mean shift procedure yield high-quality partitions; however, the main obstacle to applying this approach is its high computational complexity [23]. The inductive scheme for MeanShift is as follows:

Step 1. Start.

Step 2. The input data are given as a matrix $\Omega = \{x_{ij}\}$; $i = \overline{1, n}$; $j = \overline{1, m}$, where n is the number of rows (the objects under study) and m is the number of columns (the features describing the objects).
Step 3. Division of the matrix Ω into two equal-power subsets according to the scheme mentioned earlier. The resulting subsets $\Omega^A$ and $\Omega^B$ can be written formally as: $\Omega^A = \{x_{ij}^A\}$; $\Omega^B = \{x_{ij}^B\}$; $i = \overline{1, n_A}$, $n_A = n_B$; $n_A + n_B = n$; $j = \overline{1, m}$.
Step 4. Setting up the MeanShift scheme. For each equal-power subset:
Step 5. Selecting an initial bandwidth value P and a step size h.
Step 6. Sequential clustering and fixing the same number of clusters for each subset.
Step 7. Calculation of the internal clustering quality criteria [24, 25]:
Silhouette index [26]: $SWC = \frac{1}{K} \sum_{j=1}^{K} S_{x_j}$;
Dunn index [27]: $DI(k) = \min_{i \in k} \{\dots\}$;
Calinski–Harabasz index [28]: $QC_{CH} = \frac{QCB \cdot (N - K)}{QCW \cdot (K - 1)} \to \max$.
Step 8. Calculation of the external balance criterion [29]:

$$ECB = \frac{(IC_A - IC_B)^2}{(IC_A + IC_B)^2}.$$


Step 9. P = P + h; return to Step 6.
Step 10. If the final bandwidth value P is reached, the maximum of ECB is determined.
Step 11. Defining the optimal bandwidth value, for which ECB → max.
Step 12. Clustering the data (the subsets Ω of the objects under examination) and fixing the clusters.
Step 13. End.
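The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' R implementation: the update m(x) uses the Epanechnikov kernel as written in the text, modes closer than P/2 are merged (an assumed rule), the data are split by simple alternation (a stand-in for the splitting scheme, which the text does not spell out), and the internal criterion is a hypothetical placeholder (mean distance to the cluster centroid) rather than the Silhouette, Dunn, or Calinski–Harabasz index. The bandwidth is chosen where ECB is maximal, following Steps 10 and 11.

```python
import math

def shift_point(x, data, bandwidth):
    """One update x -> m(x), weighting neighbours with the Epanechnikov kernel."""
    num, den = [0.0] * len(x), 0.0
    for xi in data:
        r = math.dist(x, xi)
        if r <= bandwidth:                        # indicator I(||x - xi|| <= P)
            w = 1.0 - (r / bandwidth) ** 2
            den += w
            num = [n + w * v for n, v in zip(num, xi)]
    return tuple(n / den for n in num) if den > 0 else tuple(x)

def mean_shift(data, bandwidth, tol=1e-6, max_iter=200):
    """Move every point to its mode; points whose modes coincide share a label."""
    modes, labels = [], []
    for x in data:
        for _ in range(max_iter):
            nxt = shift_point(x, data, bandwidth)
            done = math.dist(nxt, x) < tol
            x = nxt
            if done:
                break
        for idx, mode in enumerate(modes):
            if math.dist(x, mode) < bandwidth / 2:  # assumed merge radius
                labels.append(idx)
                break
        else:
            modes.append(x)
            labels.append(len(modes) - 1)
    return labels

def internal_criterion(data, labels):
    """Placeholder internal measure: mean distance of points to their centroid."""
    total = 0.0
    for c in set(labels):
        pts = [p for p, l in zip(data, labels) if l == c]
        centre = [sum(col) / len(pts) for col in zip(*pts)]
        total += sum(math.dist(p, centre) for p in pts)
    return total / len(data)

def balance_criterion(ic_a, ic_b):
    """ECB = (IC_A - IC_B)^2 / (IC_A + IC_B)^2 (Step 8)."""
    return (ic_a - ic_b) ** 2 / (ic_a + ic_b) ** 2

def inductive_mean_shift(data, p_start, p_end, h):
    """Steps 3-11: sweep P, keep runs with equal cluster counts, maximise ECB."""
    subset_a, subset_b = data[0::2], data[1::2]     # assumed alternating split
    best, p = None, p_start
    while p <= p_end + 1e-12:
        la, lb = mean_shift(subset_a, p), mean_shift(subset_b, p)
        if len(set(la)) == len(set(lb)):            # Step 6: same cluster count
            ic_a = internal_criterion(subset_a, la)
            ic_b = internal_criterion(subset_b, lb)
            if ic_a + ic_b > 0:
                ecb = balance_criterion(ic_a, ic_b)
                if best is None or ecb > best[1]:
                    best = (p, ecb)
        p += h                                      # Step 9
    return best

# Hypothetical data: two compact groups of four points each.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2),
        (5.0, 5.0), (5.2, 5.0), (5.0, 5.2), (5.2, 5.2)]
labels = mean_shift(data, 1.0)
best = inductive_mean_shift(data, 0.5, 2.0, 0.5)
```

The function returns the pair (P, ECB) for the selected bandwidth, or None if no bandwidth in the sweep produced the same number of clusters on both subsets.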

4 Inductive DBSCAN

Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [30] assumes that clusters are regions with a dense concentration of points. A similarity matrix is used in this case. DBSCAN relies on the following terms: the ε-neighborhood of an object x is U(x, ε) = {y ∈ V : ρ(x, y) ≤ ε}; a core point of rank M (for the given ε) is an object whose ε-neighborhood contains at least M other objects. For the given value M, an object y is said to be directly density-reachable from an object x if y ∈ U(x, ε) and the object x is a core point. An object y is density-reachable from an object x if there exist objects $x_1, \dots, x_n$ with $x_1 = x$, $x_n = y$ such that, for all i = 1, …, n − 1, the object $x_{i+1}$ is directly density-reachable from $x_i$. Note that the DBSCAN scheme determines the number of clusters K in the course of its operation. The general flow of the DBSCAN procedure is as follows:

Step 0. Set the values of the parameters ε and M and put K = 0.
Step 1. If all objects x ∈ V have already been viewed, the algorithm stops. Otherwise, an unviewed object x is selected and marked as viewed.
Step 2. If x is a core object, a new cluster is created (K = K + 1) and the algorithm passes on to Step 3; otherwise, the point x is marked as “noise” (this point may appear later in the ε-neighborhood of some other point and be included in one of the clusters) and the algorithm passes on to Step 1.
Step 3. The created cluster includes all the objects that are density-reachable from the core object x; afterwards, the algorithm returns to Step 1.

This paper proposes an inductive modification of the DBSCAN workflow:

Step 1. Start.
Step 2. The input data are given as a matrix $\Omega = \{x_{ij}\}$; $i = \overline{1, n}$; $j = \overline{1, m}$, where n is the number of rows (the objects under study) and m is the number of columns (the features describing the objects).
Step 3. Division of the matrix Ω into two equal-power subsets according to the aforementioned scheme. The resulting subsets $\Omega^A$ and $\Omega^B$ can be written formally as: $\Omega^A = \{x_{ij}^A\}$; $\Omega^B = \{x_{ij}^B\}$; $i = \overline{1, n_A}$, $n_A = n_B$; $n_A + n_B = n$; $j = \overline{1, m}$.


Step 4. Setting up the DBSCAN scheme. For each equal-power subset:
Step 5. Selecting an initial EPS value and a step size h. MinPts = 3.
Step 6. Sequential clustering and fixing the same number of clusters for each subset.
Step 7. Calculation of the internal clustering quality criteria:
Silhouette index: $SWC = \frac{1}{K} \sum_{j=1}^{K} S_{x_j}$;
Dunn index: $DI(k) = \min_{i \in k} \{\dots\}$;
Calinski–Harabasz index: $QC_{CH} = \frac{QCB \cdot (N - K)}{QCW \cdot (K - 1)} \to \max$.
Step 8. Calculation of the external balance criterion:

$$ECB = \frac{(IC_A - IC_B)^2}{(IC_A + IC_B)^2}.$$

Step 9. EPS = EPS + h; return to Step 6.
Step 10. If the final value of EPS is reached, the maximum of ECB is determined.
Step 11. Defining the optimal value of EPS, for which ECB → max.
Step 12. Clustering the data (the subsets Ω of the objects under examination) and fixing the clusters.
Step 13. End.
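The general DBSCAN flow (Steps 0 to 3), which the inductive scheme calls for every EPS value, might look like the sketch below. It is a plain-Python illustration under assumptions, not the authors' R code: the function names, the NOISE marker, and the nine-point data set are hypothetical, and the Euclidean distance stands in for ρ. A core object is one whose ε-neighborhood contains at least M other objects, as defined in the text.

```python
import math

NOISE = -1  # hypothetical marker for objects labelled as "noise"

def region_query(data, idx, eps):
    """Indices of all objects in the eps-neighbourhood U(x, eps) of data[idx]."""
    return [j for j, y in enumerate(data) if math.dist(data[idx], y) <= eps]

def dbscan(data, eps, min_pts):
    """Return (labels, K): a cluster index per object (or NOISE) and the cluster count."""
    labels = [None] * len(data)               # None = not yet viewed
    k = 0                                     # Step 0: K = 0
    for i in range(len(data)):                # Step 1: take the next unviewed object
        if labels[i] is not None:
            continue
        neighbours = region_query(data, i, eps)
        if len(neighbours) - 1 < min_pts:     # fewer than M *other* objects
            labels[i] = NOISE                 # may be re-labelled later
            continue
        labels[i] = k                         # Step 2: a new cluster is created
        queue = [j for j in neighbours if j != i]
        while queue:                          # Step 3: add density-reachable objects
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = k                 # former noise becomes a border object
            if labels[j] is not None:
                continue
            labels[j] = k
            reachable = region_query(data, j, eps)
            if len(reachable) - 1 >= min_pts: # expand only through core objects
                queue.extend(reachable)
        k += 1
    return labels, k

# Hypothetical data: two compact groups and one isolated object.
data = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2), (0.2, 0.2),
        (5.0, 5.0), (5.2, 5.0), (5.0, 5.2), (5.2, 5.2),
        (10.0, 10.0)]
labels, n_clusters = dbscan(data, eps=0.5, min_pts=3)

# Sweeping EPS (Step 9 of the inductive scheme) changes the number of clusters.
for eps in (0.1, 0.5, 8.0):
    print(eps, dbscan(data, eps, 3)[1])
```

In the inductive scheme, such a function would be applied to both subsets at each EPS value, and only runs with the same number of clusters on both subsets would be passed on to the balance criterion.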

5 Numerical Experiments

The experimental part of the study consisted of clustering different data sets using two inductive methods, MeanShift and DBSCAN. The code is written in R. The data for the experiments were taken from the open repository [31]. The clustering quality of the algorithms was investigated for clusters of different kinds. In all the experiments, the number of features was m = 2, which made it possible to visualize the results clearly. The results of the experiments are given in the tables. The first column of each table contains the names of the test data sets; the next two columns contain the clustering results obtained by MeanShift and DBSCAN (Figs. 1 and 2).

Fig. 1. Dependence of the external criterion on the clusters’ quantity for Inductive MeanShift (left) and Inductive DBSCAN (right) for the Jain data set


Fig. 2. Dependence of the external criterion on the clusters’ quantity for Inductive MeanShift (left) and Inductive DBSCAN (right) for the Flame data set

Fig. 3. Visualization of the clustering outcome for Inductive MeanShift (left) and Inductive DBSCAN (right) for the Jain data set

Notably, the inductive clustering approach with the MeanShift and DBSCAN methods demonstrated nearly identical results for the different data sets. The maximum value of the external balance criterion is also attained for a number of clusters that repeats across the data sets. In the graphical representation, the clusterings show specific differences: the inductive DBSCAN gave the best outcome on the Jain data set, while the inductive MeanShift proved its efficiency on the Flame data. DBSCAN identifies the “noise” in these data sets, which can be seen in the clustering results (Figs. 3 and 4). The verification of the inductive clustering quality based on the Silhouette index for the Jain and Flame data sets shows that the outcome is almost identical (Figs. 5 and 6). The inductive MeanShift performs slightly better in terms of the Silhouette index for the Flame data set, although negative Silhouette values are slightly higher in this case.


Fig. 4. Visualization of the clustering outcome for Inductive MeanShift (left) and Inductive DBSCAN (right) for the Flame data set

Fig. 5. Defining clusters based on the Silhouette index for the Jain data set (IndMeanShift – left and IndDBSCAN - right)

As the results show, the inductive MeanShift algorithm is more suitable for spherical clusters, while the inductive DBSCAN method is less sensitive to the shape of the clusters. The analysis of the numerical results shows that, for the inductive approach to density-based algorithms, no single procedure is superior to the other in all criteria (simplicity and transparency of the algorithm, implementation complexity, execution speed, the ability to detect “noise,” the absence of input parameters, insensitivity to the number of objects and to the shape of clusters). Consequently, before solving a clustering problem, it is necessary to carry out a thorough preliminary analysis of the task under study and to choose the most appropriate algorithm for the particular case. The lack of a universal clustering scheme (optimal according to all quality criteria) motivates the development and modification of computational algorithms for its solution (Table 1).


Fig. 6. Defining clusters based on the Silhouette index for the Flame data set (IndMeanShift – left and IndDBSCAN - right)

Table 1. Merits of the clustering quality measures

| Data set | Quality measure | Inductive MeanShift | Inductive DBSCAN |
|---|---|---|---|
| Jain | Number of clusters | 2 | 6 |
| Jain | Algorithm parameters | Bandwidth P = 3.778988 | EPS = 2.586, MinPts = 3 |
| Jain | Silhouette index | 0.2965784 | 0.3735486 |
| Jain | Dunn index | 0.01297478 | 0.074308863 |
| Jain | Calinski–Harabasz index | 111.2689369 | 104.1714315 |
| Jain | Entropy | 0.3793774 | 0.24634656 |
| Flame | Number of clusters | 2 | 6 |
| Flame | Algorithm parameters | Bandwidth P = 1.3102459 | EPS = 0.8351, MinPts = 3 |
| Flame | Silhouette index | 0.3321364 | 0.2228894 |
| Flame | Dunn index | 0.05479457 | 0.06109933 |
| Flame | Calinski–Harabasz index | 110.7979775 | 51.80709531 |
| Flame | Entropy | 0.65087335 | 0.80747949 |
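For orientation, the internal indices listed in Table 1 can be computed along the following lines. The sketch uses the textbook definitions of the Silhouette, Dunn, and Calinski–Harabasz indices, which are assumed here because the source formulas are only partly recoverable; in particular, QCB and QCW are taken to be the between-cluster and within-cluster sums of squared distances. The function names and the four-point data set are hypothetical, and at least two clusters are required.

```python
import math

def clusters_of(data, labels):
    """Group the objects by cluster label."""
    groups = {}
    for point, label in zip(data, labels):
        groups.setdefault(label, []).append(point)
    return list(groups.values())

def centroid(points):
    return [sum(col) / len(points) for col in zip(*points)]

def silhouette(data, labels):
    """Mean of s(x) = (b - a) / max(a, b) over all objects."""
    groups = clusters_of(data, labels)
    scores = []
    for own in groups:
        for x in own:
            if len(own) == 1:
                scores.append(0.0)        # usual convention for singleton clusters
                continue
            # a: mean distance to the other members of the own cluster
            a = sum(math.dist(x, y) for y in own) / (len(own) - 1)
            # b: smallest mean distance to the members of any other cluster
            b = min(sum(math.dist(x, y) for y in g) / len(g)
                    for g in groups if g is not own)
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

def dunn(data, labels):
    """Smallest distance between clusters divided by the largest cluster diameter."""
    groups = clusters_of(data, labels)
    separation = min(math.dist(x, y)
                     for i, g in enumerate(groups) for h in groups[i + 1:]
                     for x in g for y in h)
    diameter = max(math.dist(x, y) for g in groups for x in g for y in g)
    return separation / diameter

def calinski_harabasz(data, labels):
    """QC_CH = QCB * (N - K) / (QCW * (K - 1))."""
    groups = clusters_of(data, labels)
    n, k = len(data), len(groups)
    overall = centroid(data)
    qcb = sum(len(g) * math.dist(centroid(g), overall) ** 2 for g in groups)
    qcw = sum(math.dist(x, centroid(g)) ** 2 for g in groups for x in g)
    return qcb * (n - k) / (qcw * (k - 1))

# Hypothetical example: two well-separated pairs of points.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
sil = silhouette(data, labels)
di = dunn(data, labels)
ch = calinski_harabasz(data, labels)
```

Higher values of all three indices indicate more compact and better separated clusters, which is the sense in which the MeanShift and DBSCAN columns of Table 1 can be compared.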

6 Conclusion

The paper presented the results of implementing the inductive technology of objective clustering on the basis of the MeanShift algorithm. The inductive modeling methodology was further improved for selecting the optimal clustering during model operation by following the object-criterion strategy. The Jain and Flame data sets were used as experimental data. The analyzed data were split into two subsets of equal power, which were then processed using the inductive clustering algorithms. The balance criterion was adopted as an external measure. A comparative study of the proposed technique against the inductive DBSCAN procedure was carried out. The evaluation of the clustering quality confirmed that the MeanShift and DBSCAN algorithms yielded roughly equivalent results. Nevertheless, the inductive MeanShift algorithm distinguishes objects with nonlinear patterns more precisely, while the inductive DBSCAN algorithm is less sensitive to nonlinear objects.

Acknowledgment. The study of Oleksii K. Tyshchenko was additionally supported by the National Science Agency of the Czech Republic within the project TACR TL01000351.

References

1. Ivakhnenko, A.G.: Objective clusterization on the basis of the theory of self-organization of models. Soviet J. Automat. Inform. Sci. 20(5), 1–9 (1987)
2. Stepashko, V.: Inductive modeling from historical perspective. In: Proceedings of the 12th International Scientific and Technical Conference on Computer Sciences and Information Technologies, CSIT 2017, vol. 1, pp. 537–542 (2017)
3. Madala, H.R., Ivakhnenko, A.G.: Inductive Learning Algorithms for Complex Systems Modeling. CRC Press, Boca Raton (1994)
4. Stepashko, V.S.: Theoretical aspects of GMDH as a method of inductive modelling. Manag. Syst. Mach. 2, 31–38 (2003). (in Russian)
5. Hu, Zh., Bodyanskiy, Ye.V., Tyshchenko, O.K., Boiko, O.O.: An evolving cascade system based on a set of neo-fuzzy nodes. Int. J. Intell. Syst. Appl. (IJISA) 8(9), 1–7 (2016)
6. Jain, A.K., Dubes, R.C.: Algorithms for Clustering Data. Prentice-Hall Advanced Reference Series. Prentice-Hall Inc., Upper Saddle River (1988)
7. Izonin, I., Trostianchyn, A., Duriagina, Z., Tkachenko, R., Tepla, T., Lotoshynska, N.: The combined use of the Wiener polynomial and SVM for material classification task in medical implants production. Int. J. Intell. Syst. Appl. (IJISA) 10(9), 40–47 (2018)
8. Babichev, S., Škvor, J., Fišer, J., Lytvynenko, V.: Technology of gene expression profiles filtering based on wavelet analysis. Int. J. Intell. Syst. Appl. (IJISA) 10(4), 1–7 (2018)
9. Hu, Zh., Bodyanskiy, Ye.V., Tyshchenko, O.K., Samitova, V.O.: Fuzzy clustering data given in the ordinal scale. Int. J. Intell. Syst. Appl. (IJISA) 9(1), 67–74 (2017)
10. Hu, Zh., Bodyanskiy, Ye.V., Tyshchenko, O.K., Samitova, V.O.: Possibilistic fuzzy clustering for categorical data arrays based on frequency prototypes and dissimilarity measures. Int. J. Intell. Syst. Appl. (IJISA) 9(5), 55–61 (2017)
11. Hu, Zh., Bodyanskiy, Ye.V., Tyshchenko, O.K., Tkachov, V.M.: Fuzzy clustering data arrays with omitted observations. Int. J. Intell. Syst. Appl. (IJISA) 9(6), 24–32 (2017)
12. Celebi, M.E., Kingravi, H.A., Vela, P.A.: A comparative study of efficient initialization methods for the k-means clustering algorithm. Expert Syst. Appl. 40(1), 200–210 (2013)
13. Stepashko, V.S.: Elements of Inductive Modeling Theory - State and Prospects of Informatics Development in Ukraine: Monographic Arts, pp. 471–486. Scientific Thought, Kyiv (2010)
14. Osypenko, V.V., Reshetjuk, V.M.: The methodology of inductive system analysis as a tool of engineering researches analytical planning. Ann. Warsaw Univ. Life Sci. 58, 67–71 (2011)
15. Babichev, S., Taif, M.A., Lytvynenko, V.: Estimation of the inductive model of objects clustering stability based on the k-means algorithm for different levels of data noise. Radio Electron. Comput. Sci. Manag. 4, 54–60 (2016)
16. Lurie, I., Podlevskyi, A., Savina, N., Voronenko, M., Pashnina, A., Lytvynenko, V.: Inductive technology of the target clusterization of enterprise’s economic indicators of Ukraine. In: CEUR Workshop Proceedings, vol. 2353, pp. 848–859 (2019)
17. Lytvynenko, V., Lurie, I., Krejci, J., Voronenko, M., Savina, N., Taif, M.: Two step density-based object-inductive clustering algorithm. In: CEUR Workshop Proceedings, vol. 2386, pp. 117–135 (2019)
18. Babichev, S., Lytvynenko, V., Osypenko, V.: Implementation of the objective clustering inductive technology based on the DBSCAN clustering algorithm. In: Proceeding of the XIIth IEEE International Scientific and Technical Conference, pp. 479–484 (2017)
19. Babichev, S., Vyshemyrska, S., Lytvynenko, V.: Implementation of DBSCAN clustering algorithm within the framework of the objective clustering inductive technology based on R and KNIME tools. Radio Electron. Comput. Sci. Control 1, 77–88 (2019)
20. Cheng, Y.: Mean shift, mode seeking, and clustering. IEEE Trans. Pattern Anal. Mach. Intell. 17(8), 790–799 (1995)
21. Pestunov, I.A., Berikov, V.B., Sinyavskiy, Yu.N.: Segmentation of multispectral images based on an ensemble of nonparametric clustering algorithms. Herald of Siberian State Aerosp. Univ. Named After Academician M.F. Reshetnev 5(31), 56–64 (2010). (in Russian)
22. Pestunov, I.A., Sinyavskiy, Yu.N.: Analysis and synthesis of signals and images: a nonparametric algorithm for clustering remote sensing data based on a grid approach. Autometry 42(2), 90–99 (2006). (in Russian)
23. Freedman, D., Kisilev, P.: Fast mean shift by compact density representation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1818–1825 (2009)
24. Xu, R., Wunsch, D.C.: Clustering. IEEE Press Series on Computational Intelligence. Wiley, Hoboken (2009)
25. Aggarwal, C.C., Reddy, C.K.: Data Clustering: Algorithms and Applications. CRC Press, Boca Raton (2014)
26. Kaufman, L., Rousseeuw, P.: Finding Groups in Data. An Introduction to Cluster Analysis. Wiley, Hoboken (2005)
27. Bezdek, J.C., Dunn, J.C.: Optimal fuzzy partitions: a heuristic for estimating the parameters in a mixture of normal distributions. In: Proceeding of the IEEE Transactions on Computers, pp. 835–838 (1975)
28. Calinski, R.B., Harabasz, J.: A dendrite method for cluster analysis. Comm. Statistics 3(1), 27p (1974)
29. Osypenko, V.V.: Two approaches to solving the problem of clustering in the broad sense from the standpoint of inductive modeling. Power Autom. 1, 83–97 (2014). (in Ukrainian)
30. Ester, M., Kriegel, H., Sander, J., Xu, X.: A density-based algorithm for discovering clusters in large spatial databases with noise. In: Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (KDD 1996), vol. 96, no. 34, pp. 226–231. AAAI Press (1996)
31. Franti, P., Sieranoja, S.: Clustering basic benchmark. http://cs.joensuu.fi/sipu/datasets/

Author Index

A
Abramov, Maxim, 174; Alifov, Alishir A., 181
B
Bai, Jiayu, 1
C
Chen, Bosheng, 1; Chen, Caifu, 133; Chen, Laijun, 1; Chen, Qiao, 144; Chen, Shaonan, 122; Chen, Xiaolu, 133; Chen, Xiaotao, 189; Cheng, Hanmiao, 77
D
Ding, Kai, 144; Duan, Meimei, 63
E
Evgenev, Georgy B., 26
F
Fang, Kaijie, 77
G
Guo, Yongqing, 213
H
He, Jian, 87; He, Weimin, 87; Hu, Zhengbing, 232; Huang, Qifeng, 77; Huang, Weixiang, 122; Huang, Yixuan, 77
K
Khlobystova, Anastasiia, 174
L
Li, Xintong, 156; Li, Zhengxi, 213; Liang, Shuo, 156; Liu, Tianchang, 77; Lu, Xiaoquan, 77; Lurie, Irina, 232; Lytvynenko, Volodymyr, 232
M
Ma, Hengrui, 108; Mu, Xiaoxing, 63; Mutovkina, N. Yu., 98; Mutovkina, Nataliya, 15
O
Ouyang, Zengkai, 63
P
Petoukhov, Sergey V., 222
Q
Qian, Yimin, 144
R
Rakcheeva, T., 50; Ren, Zheng, 133
S
Savina, Nataliia, 232; Si, Yang, 189, 213; Su, Xiaoling, 213; Sun, Li, 87
T
Tian, Zhengqi, 63; Tolokonnikov, Georgy K., 203; Tulupyeva, Tatiana, 174; Tyshchenko, Oleksii K., 232
V
Valuev, Andrey M., 40
W
Wang, Hongjian, 108; Wang, Junfang, 133; Wang, Manshang, 1; Wang, Peng, 108; Wang, Sen, 108, 144; Wang, Xinyu, 133; Wang, Yi, 144; Wei, Wei, 1; Wu, Lifang, 122; Wu, Weijiang, 87
X
Xia, Guofang, 63; Xu, Gaojun, 87; Xu, Wenhao, 213
Y
Yin, Liqun, 122; Yu, Xiaoyong, 122, 156
Z
Zhang, Qingmiao, 189; Zhang, Shuang, 133; Zhang, Xin, 87; Zhang, Xuelin, 189; Zhao, Shuangshuang, 87; Zheng, Zhong, 63; Zhou, Chao, 63; Zhou, Yangjun, 156; Zhu, Chengliang, 108, 144

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG 2021 Z. Hu et al. (Eds.): AIPE 2020, AISC 1403, pp. 243–244, 2021. https://doi.org/10.1007/978-3-030-80531-9