Proceedings of 2023 Chinese Intelligent Automation Conference (Lecture Notes in Electrical Engineering, 1082) 9819961866, 9789819961863

The book presents selected research papers from the 2023 Chinese Intelligent Automation Conference (CIAC2023), held in N


Language: English. Pages: 854. Year: 2023.


Table of contents:
Contents
Predicting TUG Score from Gait Characteristics with Video Analysis and Machine Learning
1 Introduction
2 Related Work
3 Methodology
3.1 Copula Entropy
3.2 Predictive Models
4 Proposed Method
5 Experiments and Results
5.1 The Video System
5.2 Data
5.3 Experiments
5.4 Results
6 Discussion
7 Conclusions
References
Task-Space Finite-Time Prescribed Performance Tracking Control for Free-Flying Space Robots Under Input Saturation
1 Introduction
2 Problem Formulation
3 Main Result
4 Simulations
5 Conclusion
References
An Online Game Platform for Intangible Cultural Heritage Tibetan Jiu Chess
1 Introduction
2 Related Work
3 Overall Architecture Design of the Platform
3.1 Overall Process
3.2 Online Correlation Architecture
4 Implementation Process of Platform Tibetan Jiu Chess
4.1 Tibetan Jiu Chess Chessboard Representation Logic
4.2 Platform Game Scene
4.3 Frame Data Structure Design
4.4 Logic Realization and Chess Notation Expression in Each Stage
5 Test Results and Actual Application of the Platform
5.1 Platform Test
5.2 Actual Use of the Platform
6 Conclusion
References
Review of Human Target Detection and Tracking Based on Multi-view Information Fusion
1 Introduction
2 Multi-view Pedestrian Detection
2.1 Traditional Computer Vision
2.2 Monocular Detection
2.3 Multi-view Feature Fusion
3 Multi-view Pedestrian Tracking
3.1 Centralized Tracking
3.2 Distributed Tracking
3.3 Hybrid Tracking
4 Dataset
5 Applications and Prospects
5.1 Application in Electric Power Operation Scenes
5.2 Prospects
6 Conclusion
References
Federated Topic Model and Model Pruning Based on Variational Autoencoder
1 Introduction
2 Federated Topic Model and Model Pruning Based on Variational Autoencoder
2.1 Federated Topic Model Based on Variational Autoencoder
2.2 Progressive Pruning of the Federated Topic Model
3 Experiments
3.1 Experimental Setup
3.2 Experimental Result
4 Conclusion
References
An Anti-interference Mechanical Fault Diagnosis Method Based on CNN and Attention Mechanism
1 Introduction
2 Fundamental Theory
2.1 Convolutional Neural Network
2.2 Long Short-Term Memory Network
2.3 Attention Mechanism
3 Anti-interference Fault Diagnosis Model
3.1 Network Structure
4 Experimental Validation and Result Analysis
4.1 Dataset Description and Experimental Process
4.2 Diagnosis Results of Faults Under Fixed Load
4.3 Fault Diagnosis Results Under Noise Interference Conditions
5 Conclusions
References
Refining Object Localization from Dialogues
1 Introduction
2 Related Work
2.1 Target Localization
2.2 Visual Language Navigation (VLN)
3 Problem Formulation
4 Method
4.1 Language Parser
4.2 World Model
4.3 Updating Object Distribution
4.4 Searching Process
5 Experiment and Result
5.1 Experiment 1: Language Parsing Evaluation
5.2 Experiment 2: Locating Object
5.3 Experiment 3: Comparison with One-time Message Method
6 Conclusion
References
License Plate Detection and Recognition Based on Light-Yolov7
1 Introduction
2 Method
2.1 License Plate Data Processing
2.2 License Plate Detection Model
2.3 License Plate Correction Model
2.4 License Plate Recognition Model
3 Experiments
3.1 Dataset
3.2 Analysis of Testing Model Indicators
4 Conclusion
References
Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data Based on Information Loss Constraints
1 Introduction
2 Methodology
2.1 Problem Definition
2.2 Spatio-Temporal Partitioning Module (STPM)
2.3 Graph Partitioning Module (GPM)
3 Experiments
3.1 Experimental Data Preparation
3.2 Analysis of Experimental Results
4 Conclusion
References
Research and Application of Intelligent Monitoring and Diagnosis System for Rail Transit Emergency Power Supply Equipment
1 Introduction
2 Function Design of Diagnosis and Monitoring System
2.1 Fault Classification
2.2 Alarm Model
2.3 Fault Knowledge Base
3 System Application
3.1 Integrated Control of UPS Equipment
3.2 Real-Time Status Monitoring of Equipment
3.3 Fault Simulation
3.4 Visual Report Analysis Scenario
4 Conclusion
References
Reliability-Based Dynamic Positioning Control for Turret-Moored Vessels with Prescribed Performance
1 Introduction
2 Mathematical Model and Definitions
2.1 Mathematical Model of Turret-Moored Vessels
2.2 Reliability Index of Mooring Lines
3 Controller Design and Stability Analysis
4 Simulations and Analysis
5 Conclusions
References
A Pose Control Algorithm for Simulating Robotic Fish
1 Introduction
2 The Pose Control Model for Simulating Robot Fish
2.1 Simplified Dynamic Model
2.2 Simplified Kinematic Model
3 The Mathematical Model for Simulating Robot Fish
4 The Pose Control Algorithm for Simulating Robot Fish
4.1 Design of Angular Velocity Controller
4.2 Design of Linear Speed Controller
4.3 Design of Linear Fuzzy Controller
5 Simulation Verification
5.1 Simulation Task
5.2 Simulation Results and Analysis
6 Conclusion
References
Unsupervised Multidimensional Time Series Anomaly Detection Based on Federation Learning
1 Research Background and Purpose
2 Related Work
3 Methodology
3.1 Preliminaries
3.2 Local Model Construction
3.3 Federated Learning Model
3.4 Generate the Global Model
4 Experimental Analysis and Discussion
4.1 Configuration and Setup of the Experiment
5 Conclusion
References
Reinforcement Federated Learning Method Based on Adaptive OPTICS Clustering
1 Introduction
2 FedRO
2.1 Reinforced Federated Learning Method
3 Experiments
3.1 Datasets and Models
3.2 Experiment Settings
3.3 Federated Learning Experiment Results and Analysis
3.4 Experimental Results and Analysis of Clustering Algorithms
4 Conclusion
References
Simulation and Implementation of Extended Kalman Filter Observer for Sensorless PMSM
1 Introduction
1.1 Sliding Mode Observer Control
1.2 Extended Kalman Filter Control
2 Extended Kalman Filter Algorithm in PMSM
3 Simulation Analysis
3.1 Model of Simulation System
3.2 Simulation Results
4 Experimental Verification
4.1 Hardware Circuit Design
4.2 Software Programming
5 Analysis of Experimental Results
6 Conclusion
References
Research and Application of Comprehensive Health Assessment Based on Production Equipment of Bulk Cargo Terminal
1 Introduction
2 Design of Comprehensive Health Assessment for Equipment
2.1 Principles for the Overall System of Equipment Comprehensive Health Assessment
2.2 Main Indicators of Equipment Comprehensive Health Evaluation
2.3 Life Cycle Scoring Module
3 Comprehensive Health Assessment Model for Equipment
3.1 Running the Scoring Module
3.2 Process Flow
3.3 Practical Application Description
4 Conclusion
References
Enhancing Resilience of Microgrid-Integrated Power Systems in Disaster Events Using Reinforcement Learning
1 Introduction
2 Optimization Model
2.1 Power System Topology
2.2 Resilience Optimization Model
3 Proposed Algorithm
3.1 Reinforcement Learning
3.2 Improvement and Application
4 Numerical Experiment
5 Conclusion
References
An Improved Adaptive Median Filtering Algorithm Based on Star Map Denoising
1 Introduction
2 Noise Model
3 Algorithm Description
3.1 Adaptive Median Filtering Algorithm
3.2 Improved Adaptive Median Filtering Algorithm
4 Experimental Results and Analysis
4.1 Objective Evaluation Methods of Image Quality
4.2 Simulation Experiment and Analysis
4.3 Star Map Denoising Experiment
5 Conclusion
References
Research on Intelligent Monitoring of Big Data Processes Based on Radar Map and Residual Convolutional Network
1 Introduction
2 Theoretical Background
2.1 Resnet Convolutional Neural Network
2.2 Attention Mechanisms
3 A Big Data Process Intelligence Monitoring Model Based on Radar Map and Attention Residual Network
3.1 Two-Dimensional Radar Maps
3.2 Attentional Residual Network
3.3 Model Framework
4 Example Application and Analysis
4.1 Model Training
4.2 Comparative Experiments
5 Conclusion
References
Consensus Path-Following of Multiple Wheeled Mobile Robots with Complex Dynamics by Adaptive Fixed-Time Fuzzy Control
1 Introduction
2 Problem Statement and Preliminaries
3 Main Results
3.1 Consensus Path-Following Control for Multiple WMRs
3.2 Adaptive Fixed-Time Fuzzy Control Design for Dynamic Model
4 Numerical Example
5 Conclusion
References
Level Control of Chemical Coupling Tank Based on Reinforcement Learning Method
1 Introduction
2 Coupling Tank Level Control Problem and Reinforcement Learning Principle
3 Coupling Tank Level Control Algorithm Based on DQN Method
4 Results and Analysis
5 Summary
References
A Personalized Federated Learning Fault Diagnosis Method for Inter-client Statistical Characteristic Inconsistency
1 Introduction
2 Related Work
2.1 Stacked Self-encoder
2.2 Federated Learning
3 The Proposed Method
4 Analysis of Experimental Demonstration
4.1 Dataset Description
4.2 Experimental Results
5 Conclusions
References
Downsampling Assessment for LiDAR SLAM
1 Introduction and Related Works
2 Methodology
2.1 Overview
2.2 RPE Degeneration and ATE Degeneration
2.3 Precision Maintenance Rate
3 Experiments
3.1 LOAM Evaluation
3.2 LeGO-LOAM Evaluation
4 Discussion and Conclusion
References
The Key to Autonomous Intelligence is the Effective Synergy of Human-Machine-Environment Systems
1 Introduction
2 Patterns of the Future Game—Intelligentized Synergy
3 The Role of AI in the Future Game
3.1 Explainability
3.2 Learning
3.3 Common Sense
4 Issues in the Future Game
4.1 Deep Situation Awareness in Integrated Human-Machine Intelligence
4.2 Uncertainty
4.3 Human Issues
5 Conclusion
References
Design of Ultra-Low-Power Interface Circuit for Self-Powered Wireless Sensor Node
1 Introduction
2 Interface Circuit Principle
3 Experimentation
4 Conclusions
References
Distributed Consensus Tracking for Underactuated Ships with Input Saturation: From Underactuated to Nonholonomic Configuration
1 Introduction
2 Graph Theory
3 System Description
3.1 Multiple Surface Vessel Systems
3.2 Input Saturation
3.3 Preliminaries
4 Main Results
5 Simulation
6 Conclusion
References
Trajectory Planning of Launch Vehicle Orbital Injection Segment Under Engine Failure Based on DDPG Algorithm
1 Introduction
2 Model
2.1 Rocket Dynamics Model
2.2 Orbit Model
3 Algorithm
3.1 Reinforcement Learning
3.2 DDPG
4 Establishment of the Rocket Orbiting MDP
4.1 State and Action
4.2 Reward
5 Simulation
5.1 Comparison of Orbital Accuracy
5.2 Generalization Ability Verification
6 Conclusion
References
A Survey on Lightweight Technology of Underwater Robot
1 Introduction
2 Structural Optimization
2.1 Introduction to Structural Lightweight Optimization
2.2 Topology Optimization
2.3 Size Optimization
2.4 Shape Optimization
3 Lightweight Material
3.1 Lightweight Material Requirements for Underwater Robots
3.2 Material Introduction
4 Summary and Outlook
References
False Alarm Rate Control Method for Fiber Vibrate Source Detection with Non-stationary Interference
1 Introduction
2 Data Collection
3 PDF Estimation and PDF Model
4 N4SID Method
5 Controller Design
6 Simulation Results
6.1 PDF Model Simulation Results
6.2 False Alarm Rate Control in the Case of Trotting
7 Conclusions
References
An Improved YOLOv5-Based Small Target Detection Method for UAV Aerial Image
1 Introduction
2 Introduction to YOLOv5 Target Detection Algorithm
3 Improved YOLOv5 Target Detection Algorithm
3.1 Preset Anchor Box Optimization for Small Objects
3.2 Add Small Target Detection Layer
3.3 Multi-scale Feature Fusion Structure Optimization
4 Experimental Results and Analysis
4.1 Experimental Environment
4.2 Network Structure Improvement Experiment Results
5 Conclusion
References
A Federated Learning Method with DNN and 1DCNN Feature Fusion for Multiple Working Conditions Fault Diagnosis
1 Introduction
2 Related Work
2.1 Fault Diagnosis Method Based on DNN
2.2 Fault Diagnosis Method Based on 1DCNN
2.3 Fault Diagnosis Method Based on Federated Learning
3 Federated Learning Method with DNN and 1DCNN Feature Fusion
4 Experiments and Results
4.1 Datasets
4.2 Network Setup and Methodology Comparison
4.3 Results Analysis and Discussion
5 Conclusion
References
Backstepping Nonsingular Fast Terminal Sliding Mode Control for Manipulators Driven by PMSM with Measurement Noise
1 Introduction
2 Preliminaries and Problem State
3 Composite Structure Design of ESO and EKF
3.1 Design of Extended Kalman Filter
3.2 Design of ESO Combined with EKF
4 Design of Nonsingular Fast Terminal SMC
5 Simulation Results
6 Conclusion
References
Adaptive Variable Impedance Control of Robotic Manipulator with Nonlinear Contact Forces
1 Introduction
2 Problem Statement and Preliminaries
2.1 Modeling of Environment
2.2 Dynamic Formulation and Characteristics of Robot Manipulators
3 Design of Adaptive Variable Impedance Control Approach
3.1 Trajectory Generation with Adaptive Variable Impedance
3.2 Controller Design
4 Simulation Experiments
4.1 Design Procedure
4.2 Simulation Results
5 Conclusion
References
Research on Adaptive Network Recovery Method Based on Key Node Identification
1 Introduction
2 Related Work
3 Adaptive Network Recovery Strategy
3.1 Problem Description and Modeling
3.2 Network Performance Metrics
3.3 Identification of Key Nodes in Failure Network
3.4 Adaptive Network Recovery Strategy
4 Experiments and Analysis of Results
4.1 Experiments Setting
4.2 Analysis of Results
5 Conclusion
References
Multi-Scale Feature Fusion Fault Diagnosis Method Based on Attention Mechanism
1 Introduction
1.1 Research Status
2 Related Work
2.1 Fault Diagnosis Method Based on Attention Mechanism
3 Multi-Scale Feature Fusion Fault Diagnosis Algorithm Based on Attention Mechanism
4 Experiment and Analysis
5 Conclusion
References
Designing Philobot: A Chatbot for Mental Health Support with CBT Techniques
1 Introduction
2 Background
3 Design
3.1 CBT Module
3.2 Chatting Module
4 Implementation
5 Evaluation
6 Conclusion
References
An EEG Study of Virtual Reality Motion Sickness Based on MVMD Combined with Entropy Asymmetry
1 Introduction
2 Experiment and Data Acquisition
3 Data Processing Methods
3.1 Data Pre-Processing
3.2 Feature Extraction
3.3 Feature Selection
4 Results and Analysis
5 Conclusion
References
GCN with Pattern Affected Matrix in Human Motion Prediction
1 Introduction
2 Related Work
2.1 Human Motion Prediction
2.2 Graph Convolution Network in Human Motion Prediction
2.3 Exploration of Adjacency Matrix in GCN-Based Model
3 Proposed Method
3.1 GCN with Pattern Affected Adjacency Matrix - GCN-PAAM
3.2 Pattern Generate Module
3.3 Training
4 Experiments
4.1 Datasets
4.2 Baselines and Comparisons Settings
4.3 Comparisons with Baselines
4.4 Ablation Study
5 Conclusion
References
Cooperative Control of SMC-Feedback Linearization and Error Port Hamiltonian System for PMSM
1 Introduction
2 Mathematical Model of PMSM
3 Controller Design
3.1 SMC-FL Controller Design
3.2 EPH Controller Design
3.3 Load Torque Estimation
3.4 Cooperative Control Strategy Design
4 Simulation Results
5 Conclusion
References
Identification of Plant Nutrient Deficiency Based on Improved MobileNetV3-Large Model
1 Introduction
2 Methods
2.1 The Improved MobileNetV3
3 Experiments
3.1 Datasets
4 Results
4.1 Compared With Base Model
4.2 Compared With CNN Model
5 Conclusion
References
Modular Smart Vehicle Design and Technology for Shared Mobility
1 Introduction
2 Functional Definition and Scheme Design
3 Key Technologies
3.1 Multifunctional Modular Chassis
3.2 Multimodal Fusion Perception Module
3.3 High Precision Localization Module
3.4 Human-Like Decision Making Module
3.5 Safety of the Intended Functionality Module
4 Platform Integration and Test
4.1 Key Modules Test
4.2 Vehicle Test
5 Conclusion
References
A Quadrupedal Soft Robot Based on Kresling Origami Actuators
1 Introduction
2 Design and Working Principle
2.1 Design
2.2 Working Principle
3 Fabrication and Control
3.1 Fabrication
3.2 Control
4 Experiments and Results
4.1 Experiments with the Kresling Origami Actuator
4.2 Experiments with the Quadrupedal Soft Robot
5 Conclusion and Future Work
References
Design of Attitude Controller for Ducted Fan UAV Based on Improved Direct Adaptive Control Method
1 Introduction
2 Preliminary
2.1 Coordinate Frames
2.2 Attitude Dynamics
2.3 State Space Expression
3 Controller Design
3.1 LQR Controller
3.2 Improved DAC Controller
4 Numerical Simulation
5 Conclusions
References
Hetero-Source Sensors Localization Based on High-Precision Map
1 Introduction
2 Proposed Method
2.1 System Overview
2.2 Hetero-Source Point Cloud Registration
2.3 3D-2D Pose Estimation
3 Experiments
3.1 Map Construction
3.2 Localization Result
4 Conclusion
References
Edge-Node Refinement for Weakly-Supervised Point Cloud Segmentation
1 Introduction
2 Related Work
2.1 Point Cloud Learning
2.2 Weakly-Supervised Learning for Point Cloud
3 Method
3.1 Network Architecture
3.2 Edge Refinement
3.3 Node Refinement
3.4 Supervision and Loss Function
4 Experiments
4.1 Experiments Setup
4.2 Quantitative Results
4.3 Ablation Study
5 Conclusion
References
Improving Dialogue Summarization with Mixup Label Smoothing
1 Introduction
2 Related Work
2.1 Dialogue Summarization
2.2 Knowledge Distillation
3 Approach
3.1 Problem Formulation
3.2 Uniform Label Smoothing
3.3 Context Label Smoothing
3.4 Mixup Label Smoothing
4 Experiment
4.1 Dataset
4.2 Baselines
4.3 Evaluation Metrics
4.4 Implementation Details
4.5 Results Comparison to SOTA Models
4.6 The Ablation Study
4.7 The Low Resource Exploration
4.8 Sensitivity Test
5 Discussion
5.1 Correlation with Knowledge Distillation
5.2 Case Study
5.3 Prospect and Future Work
6 Conclusion
References
An Improved Multi-robot Coverage Method in 3D Unknown Environment Based on GBNN
1 Introduction
2 Task Environment and GBNN Model
3 Jump Out Mechanism
4 Simulation Experiment
5 Conclusion
References
Behavior Recognition Method Based on Object Detection for Power Operation Scenes
1 Introduction
2 Overall Architecture
3 Network Framework and Training Improvement
4 Design of the Logical Judgment Module
5 Data Collection for Electric Power Scene Dataset
6 Experimental Results and Analysis
6.1 Experimental Parameter Settings
6.2 Introduction to Experimental Datasets
6.3 Experimental Results
7 Conclusion
References
Continual Learning for Morphology Control
1 Introduction
1.1 The Challenge for Morphology Control
1.2 The Challenge for Continual Learning
2 Related Works
2.1 Morphological Control
2.2 Continual Learning
3 Problem Formulation
4 Method
4.1 EWC for Scenarios
5 Experimental Validation
5.1 Main Steps of the Experiment
5.2 Single Task
5.3 Multiple Tasks
6 Conclusions
References
An Improved Method for Text Classification Using Contrastive Learning
1 Introduction
2 Related Work
3 Proposed Scheme
4 Experiments
4.1 Performance Comparison
4.2 Ablation Experiment
5 Conclusion
References
Improved Cooperation by Balancing Exploration and Exploitation in Intertemporal Social Dilemma Tasks
1 Introduction
2 Related Research
3 Multi-agent Reinforcement Learning and Decision-Making Tasks
3.1 Multi-agent Reinforcement Learning
3.2 Dynamical Learning Rate
3.3 Decision Task
3.4 Homogeneous and Heterogeneous Group Attributes
4 Result
5 Discussion
References
Light-Weight High-Performance HRNet for Human Pose Estimation
1 Introduction
2 The Proposed Light-Weight High-Performance HRNet
2.1 H-Block
2.2 PixelShuffle and Final Fusion
2.3 H-Random Erasing
3 Experiments
3.1 Dataset and Evaluation Metric
3.2 Implementation Details
3.3 Experimental Results
4 Conclusion
References
AHEAD: A Triple Attention Based Heterogeneous Graph Anomaly Detection Approach
1 Introduction
2 Related Work
3 Methodology
3.1 Preliminary
3.2 Multi-view Heterogeneous Graph Encoder
3.3 View-Level Attention Aggregator
3.4 Heterogeneous Graph Decoder
4 Experiments
4.1 Experiment Results
4.2 Effect of Decoders
5 Conclusion
References
Graph Autoencoder-Based Anomaly Detection for Chemical Mechanical Planarization
1 Introduction
2 Methodology
2.1 Problem Setting
2.2 Feature Engineering
2.3 Graph Modeling
3 Experiments
3.1 Dataset
3.2 Baselines
3.3 Experimental Results
3.4 Discussion
4 Conclusion
References
Model-Free Adaptive Sliding Mode Control for Nonlinear Systems with Uncertainties
1 Introduction
2 Problem Description
3 Control Designing
3.1 The Model-Free Adaptive Sliding Mode Control Algorithm Designing
3.2 Stability Analysis
3.3 The ESO Designing
4 Simulation Results
5 Conclusion
References
Quantum Illumination with Symmetric Non-Gaussian States
1 Introduction
2 Quantum Illumination
3 Symmetric Non-gaussian States
4 Analysis of Quantum Illumination Performance
4.1 Target Sensitivity
4.2 Accessible Target Sensitivity
5 Conclusion
References
Rank-Level Fusion of Multiple Biological Characteristics in Markov Chain
1 Introduction
2 Introduction of Rank-Level Fusion Methods
2.1 The Highest Rank Method
2.2 The Borda Count Method
2.3 The Logical Regression Method
3 Markov Chain Method for Rank-Level Fusion
3.1 Markov Chain
3.2 The Condorcet Criterion
3.3 The Markov Chain Method
4 Experiment
4.1 Experimental Data Preparation and Processing Stage
4.2 Analysis of Experimental Results
5 Conclusion
References
PCB Defect Detection Algorithm Based on Multi-scale Fusion Network
1 Introduction
2 Related Work
2.1 Residual Network Model Structure
2.2 CSPDarknet53 Network Structure
2.3 Multi-Scale Feature Fusion
3 Experiment and Result Analysis
3.1 Experimental Environment Configuration
3.2 Data Preprocessing
3.3 Data Augmentation Method
3.4 Model Evaluation Metrics
3.5 Experimental Results and Comparison
4 Conclusion
References
Event-Triggered Adaptive Trajectory Tracking Control for Quadrotor Unmanned Aerial Vehicles
1 Introduction
2 Problem Formulation
3 Controller Design
3.1 Fixed Threshold Strategy
3.2 Switching Threshold Strategy
4 Numerical Experiments
5 Conclusion
References
Coal Maceral Groups Segmentation Using Multi-scale Residual Network
1 Introduction
2 Multi-scale Residual U-Net Model
2.1 Multiscale Contextual Attention Block
2.2 Squeeze and Excitation Block
2.3 The Framework of MSRU-Net Model
3 Analysis of Experimental Results
3.1 Datasets
3.2 The Ablation Experiment of Function Block
3.3 Analysis of Experimental Results
4 Conclusion
References
Design of Magnetic Tactile Sensor Arrays for Intelligent Floorboard Based on the Demand of Older People
1 Introduction
2 Tactile Sensor Demand for Frail Older Persons
3 Design of Tactile Sensor Array and its Output Characteristic
3.1 Design of Magnetic Tactile Sensor Array
3.2 The Output Characteristic of a Magnetic Tactile Sensor Array
3.3 Intelligent Floorboard with Magnetic Tactile Sensor Arrays
4 Conclusion
References
Gaussian Process-Augmented Unscented Kalman Filter for Autonomous Navigation During Aerocapture at Mars
1 Introduction
2 Dynamic Models
3 Gaussian Process Regression
4 Gaussian Process-Augmented Unscented Kalman Filter
4.1 Predictive Step
4.2 Corrective Step
5 Simulation Results and Analysis
6 Conclusion
References
Application of EEG S-Transformation Combined with Dimensionless Metrics for Automatic Detection of Cybersickness
1 Introduction
2 EEG Data Source
3 Methods
3.1 EEG Data Pre-processing
3.2 S-Transform
3.3 Feature Extraction
4 Results and Discussion
4.1 S-Transformation Results
4.2 Feature Extraction Results
4.3 Classification Results
5 Conclusions
References
An Overview of Multi-task Control for Redundant Robot Based on Quadratic Programming
1 Introduction
2 System Model and Preliminary Analysis
2.1 Single-task Control Method
2.2 Multi-task Control Method
3 QP Algorithms for Multi-task Control
3.1 Multi-task Control with Fixed Priority
3.2 Multi-task Control with Transitional Priority
3.3 Comparison Between Different Algorithms
4 Simulation Results
4.1 Tasks with Fixed Priority
4.2 Tasks with Transitional Priority
5 Conclusion
References
Improvement of Hierarchical Clustering Based on Dynamic Time Wrapping
1 Introduction
2 Model Construction
2.1 Dynamic Time Wrapping
2.2 Kruskal Algorithm for Heap Optimization
2.3 Hierarchical Clustering
3 Experimental Analysis
3.1 Data Set and Programming Environment
3.2 Calculation Methods of Precision
3.3 The Measurement of Algorithm Performance
3.4 An Example of the Experiments
3.5 The Results of the Experiments
4 Conclusion
References
MNGAN: Multi-Branch Parameter Identification Based on Dynamic Weighting
1 Introduction
2 Parameter Identification Based on Multi-task Noise Graph Attention Network
2.1 Transmission Line Branch Parameter Identification Task
2.2 Select the Form of Attention
2.3 Noise Graph Attention Module
2.4 Multi-task Loss Module Based on Adaptive Weight Learning
3 Experimental Results and Discussion
3.1 Datasets and Simulation Environment
3.2 Comparison of Branch Parameter Identification of Different Models
3.3 Advantages of This Method
4 Conclusions
References
Error Selection Based Training of Fully Complex-Valued Dendritic Neuron Model
1 Introduction
2 Preliminaries
2.1 FCDNM
2.2 The CLM Algorithm
3 The Proposed Method
4 Experiments
4.1 Nonlinear Phase Equalization
4.2 Stock Price Forecast
5 Conclusion
References
Intelligent Identification Method of Flow State in Nuclear Main Pump Based on Deep Learning Method
1 Introduction
2 Study Object and Experiment Process
3 Signal Processing
4 Intelligent Recognition Algorithm
5 Results and Discussion
5.1 Recognition Results of Various CNN Models
5.2 A New Method for Data Pre-processing: Random Sampling Method
6 Conclusions
References
Design of Intelligent Window Dwelling System Based on Multi Sensor Fusion
1 Introduction
2 Overall System Design
3 System Hardware Design
4 System Algorithm Design
4.1 Sensor Module Data Collection
4.2 Multi Sensor Fusion Based on Bayesian Estimation
5 System Testing Experiment
5.1 Verification of Detection Data Accuracy
5.2 System Function Verification
6 Conclusion
References
Time-Varying Function-Based Anti-Disturbance Method for Permanent-Magnet Synchronous Motors
1 Introduction
2 PMSM Model
3 FCS-MPC Method
3.1 Simple-vector FCS-PCC Control
3.2 Dual-vector FCS-PCC Control
3.3 Sector Determination
4 Model of ADRC
5 Improved ADRC Method for PMSM Control
6 Simulation Examples
6.1 The Model Built by Simulink
6.2 Comparing ADRC Method with PID Method
6.3 Comparing Improved ADRC Method with Origin ADRC Method
7 Conclusion
References
Research on the Operation Status of Metro Power Supply Equipment Under Cyber Physical System
1 Introduction
2 Background
2.1 Metro Power Supply System Under Cyber Physics System
2.2 Equipment Operation Status
3 Related Work
3.1 Establishment of Indicator System
4 Experimental Results of Safety Assessment of Power Supply System
5 Discussion
6 Conclusions
References
Hybrid Underwater Acoustic Signal Multi-Target Recognition Based on DenseNet-LSTM with Attention Mechanism
1 Introduction
2 Related Work
2.1 CNN and DenseNet
2.2 LSTM
2.3 Attention Mechanism
3 DenseNet-LSTM with Attention Mechanism
3.1 DenseNet-LSTM
3.2 DenseNet-LSTM with Attention Mechanism
4 Dataset and Experimental Results
4.1 Dataset
4.2 Experimental Results
5 Conclusion
References
A Lightweight Deep Network Model for Visual Checking of Construction Materials
1 Introduction
2 Overall Framework Design for Building a Visual Checking Network Model of Materials
2.1 Lightweighting Approach Based on Improved-ShuffleNetV2 Module
2.2 BN Layer-Based Channel Pruning Method
3 Dataset and Experimental Design
3.1 Construction Material Image Dataset Construction
3.2 Experimental Environment and Configuration
3.3 Main Experimental Parameters of the Algorithm Model
4 Experimental Results and Analysis
4.1 Experimental Analysis of Lightweighting Based on the Improved-ShuffleNetV2 Module
4.2 Analysis of Structured Pruning Experiments
4.3 Comparison and Analysis of Construction Material Checking Effect
5 Conclusion
References
Research and Application of Automatic Screening Technology for Marketing Inspection Abnormalities Based on Knowledge Graph
1 Introduction
2 Modeling the Power Grid Knowledge Graph
2.1 Entity Extraction
2.2 Relation Extraction
2.3 Establishment of a Query Rule Base for the Knowledge Graph
3 Example of the Application of a Knowledge Graph
4 Conclusion
References
Global Asymptotic Synchronization of Nonlinear Hyperchaotic Financial Systems via Hybrid Control and Adaptive Projection Control
1 Introduction
2 Hybrid Synchronization Between Two Identical Systems via Hybrid Feedback Control with Known Parameters
3 Increased Order Synchronization Between Different Dimensional Financial Systems
4 Adaptive Projective Synchronization with Unknown Parameters
5 Numerical Simulations
6 Conclusions
References
Quadrotor UAV Control Based on String-Level Fuzzy ADRC
1 Introduction
2 Mathematical Model of Quadrotor UAV
3 String-Level ADRC
3.1 Controller Design
3.2 Simulation and Analysis
4 String-Level F-ADRC
4.1 Controller Design
4.2 Simulation and Analysis
5 Conclusion
References
Deep Neural Network for Performance Prediction of Silicon Mode Splitter
1 Introduction
2 Operating Principle and Forward Modeling
3 Results and Discussions
4 Conclusions
References
A Privacy Preserving Distributed Projected One-point Bandit Online Optimization Algorithm for Economic Dispatch
1 Introduction
2 Preliminaries and Problem Formulation
2.1 Graph Theory
2.2 Differential Privacy
2.3 One-Point Feedback
2.4 Problem Formulation
3 Algorithm and Convergence Analysis
3.1 Algorithm 1
3.2 Main Results
4 Numerical Experiments
5 Conclusion
References
Nonlinear Control of Dual UAV Slung Load Flight System Based on RBF Neural Network
1 Introduction
2 Double UAV Hanging Load System Model
2.1 Definition of Coordinate System
2.2 Control Input and External Disturbance
2.3 Euler-Lagrange Modeling
3 Trajectory Tracking Control of Double UAV Hanging Load System
4 Simulation Verification and Result Analysis
5 Conclusion
References
A LiDAR Point Cloud Semantic Segmentation Algorithm Based on Attention Mechanism and Hybrid CNN-LSTM
1 Introduction
2 Methodology
3 Experiments
3.1 Parameter Setting
3.2 Experiment Results
4 Conclusion
References
Robust Ascent Trajectory Optimization for Hypersonic Vehicles Based on IGS-UMPSP
1 Introduction
2 Unscented Optimal Control
3 Mathematical Formulation of the IGS-UMPSP
4 Simulations
4.1 Problem Setup
4.2 Performance of IGS-UMPSP
5 Conclusion
References
Exponential Visual Stabilization of Wheeled Mobile Robots Based on Active Disturbance Rejection Control
1 Introduction
2 System Modeling
2.1 Robot Kinematics
2.2 Measurable Signal Extraction
2.3 System Kinematics
3 Control Strategy Design
3.1 Controller Design of e0-Subsystem
3.2 Controller Design of e-Subsystem
4 Simulations
5 Conclusion
References
An Adaptive Observer for Current Sensorless Control of Boost Converter Feeding Unknown Constant Power Load
1 Introduction
2 System Model and Problem Formation
2.1 The Model of DC-DC Boost Converter with a CPL
2.2 Problem Formation
3 GPEBO Design
3.1 Derivation of a LRE
3.2 Use of the DREM Technology
3.3 Design of an Adaptive Sensorless Controller
4 Simulation Results
5 Conclusion
References
Author Index


Lecture Notes in Electrical Engineering 1082

Zhidong Deng, Editor

Proceedings of 2023 Chinese Intelligent Automation Conference


Series Editors:
Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli Federico II, Napoli, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán, Mexico
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, München, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, School of Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
Rüdiger Dillmann, University of Karlsruhe (TH) IAIM, Karlsruhe, Baden-Württemberg, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Dipartimento di Ingegneria dell’Informazione, Sede Scientifica Università degli Studi di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid, Madrid, Spain
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA, USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Intelligent Systems Laboratory, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, Department of Mechatronics Engineering, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Intrinsic Innovation, Mountain View, CA, USA
Yong Li, College of Electrical and Information Engineering, Hunan University, Changsha, Hunan, China
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Subhas Mukhopadhyay, School of Engineering, Macquarie University, NSW, Australia
Cun-Zheng Ning, Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Department of Intelligence Science and Technology, Kyoto University, Kyoto, Japan
Luca Oneto, Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genova, Genova, Italy
Bijaya Ketan Panigrahi, Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi Roma Tre, Roma, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, University of Stuttgart, Stuttgart, Germany
Germano Veiga, FEUP Campus, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Haidian District Beijing, China
Walter Zamboni, Department of Computer Engineering, Electrical Engineering and Applied Mathematics, DIEM—Università degli studi di Salerno, Fisciano, Salerno, Italy
Junjie James Zhang, Charlotte, NC, USA
Kay Chen Tan, Department of Computing, Hong Kong Polytechnic University, Kowloon Tong, Hong Kong

The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments in Electrical Engineering—quickly, informally and in high quality. While original research reported in proceedings and monographs has traditionally formed the core of LNEE, we also encourage authors to submit books devoted to supporting student education and professional training in the various fields and applications areas of electrical engineering. The series covers classical and emerging topics concerning:

• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS

For general information about this book series, comments or suggestions, please contact [email protected]. To submit a proposal or request further information, please contact the Publishing Editor in your country: China Jasmine Dou, Editor ([email protected]) India, Japan, Rest of Asia Swati Meherishi, Editorial Director ([email protected]) Southeast Asia, Australia, New Zealand Ramesh Nath Premnath, Editor ([email protected]) USA, Canada Michael Luby, Senior Editor ([email protected]) All other Countries Leontina Di Cecco, Senior Editor ([email protected]) ** This series is indexed by EI Compendex and Scopus databases. **

Zhidong Deng Editor

Proceedings of 2023 Chinese Intelligent Automation Conference

Editor Zhidong Deng Tsinghua University Beijing, Beijing, China

ISSN 1876-1100 ISSN 1876-1119 (electronic) Lecture Notes in Electrical Engineering ISBN 978-981-99-6186-3 ISBN 978-981-99-6187-0 (eBook) https://doi.org/10.1007/978-981-99-6187-0 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore Paper in this product is recyclable.

Contents

Predicting TUG Score from Gait Characteristics with Video Analysis and Machine Learning . . . . . . 1
Jian Ma

Task-Space Finite-Time Prescribed Performance Tracking Control for Free-Flying Space Robots Under Input Saturation . . . . . . 13
Xuewen Zhang and Yingmin Jia

An Online Game Platform for Intangible Cultural Heritage Tibetan Jiu Chess . . . . . . 23
Xiali Li, Yanyin Zhang, Licheng Wu, Yandong Chen, and Bo Liu

Review of Human Target Detection and Tracking Based on Multi-view Information Fusion . . . . . . 31
Liuwang Wang and Haojun Liu

Federated Topic Model and Model Pruning Based on Variational Autoencoder . . . . . . 51
Chengjie Ma, Yawen Li, Meiyu Liang, and Ang Li

An Anti-interference Mechanical Fault Diagnosis Method Based on CNN and Attention Mechanism . . . . . . 61
Zhen-Jun Zhang and Ying-Yuan Liu

Refining Object Localization from Dialogues . . . . . . 69
Xueze Kang, Lei Wu, Lingyun Lu, and Huaping Liu

License Plate Detection and Recognition Based on Light-Yolov7 . . . . . . 83
Shangyuan Li, Nan Ma, Zhixuan Wu, and Qiang Lin

Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data Based on Information Loss Constraints . . . . . . 92
Jie Gao, Yawen Li, Zhe Xue, and Zeli Guan

Research and Application of Intelligent Monitoring and Diagnosis System for Rail Transit Emergency Power Supply Equipment . . . . . . 101
Liang Chen, Lin Zhou, Xuliang Tang, Heng Wan, and Yanao Cao


Reliability-Based Dynamic Positioning Control for Turret-Moored Vessels with Prescribed Performance . . . . . . 109
Yulong Tuo, Guilin Feng, Xiao Liang, Shasha Wang, and Chen Guo

A Pose Control Algorithm for Simulating Robotic Fish . . . . . . 118
Gang Wang, Simin Ding, and Qiang Zhao

Unsupervised Multidimensional Time Series Anomaly Detection Based on Federation Learning . . . . . . 128
Ying Deng, Yaogen Li, Yingqi Liao, Nan Ma, and Chengyu Yuan

Reinforcement Federated Learning Method Based on Adaptive OPTICS Clustering . . . . . . 136
Tianyu Zhao, Junping Du, Yingxia Shao, and Zeli Guan

Simulation and Implementation of Extended Kalman Filter Observer for Sensorless PMSM . . . . . . 145
Ke Ma, ChaoJun Gao, Jun Wang, and Qiang Zhang

Research and Application of Comprehensive Health Assessment Based on Production Equipment of Bulk Cargo Terminal . . . . . . 163
Xin Li, Xuliang Tang, and Renhui Chen

Enhancing Resilience of Microgrid-Integrated Power Systems in Disaster Events Using Reinforcement Learning . . . . . . 172
Zhongyi Zha, Bo Wang, Lei Liu, and Huijin Fan

An Improved Adaptive Median Filtering Algorithm Based on Star Map Denoising . . . . . . 184
Hancheng Cao, Naijun Shen, and Chen Qian

Research on Intelligent Monitoring of Big Data Processes Based on Radar Map and Residual Convolutional Network . . . . . . 196
Jianli Yu, Yixiang Wang, Zhiao Jia, and Benkai Xie

Consensus Path-Following of Multiple Wheeled Mobile Robots with Complex Dynamics by Adaptive Fixed-Time Fuzzy Control . . . . . . 205
Junyi Yang, Zhichen Li, Huaicheng Yan, Hao Zhang, and Zhenghao Xi

Level Control of Chemical Coupling Tank Based on Reinforcement Learning Method . . . . . . 216
Yuheng Li, Quan Li, and Fei Liu


A Personalized Federated Learning Fault Diagnosis Method for Inter-client Statistical Characteristic Inconsistency . . . . . . 226
Yanqi Wen, Funa Zhou, Pengpeng Jia, and Hanxin Huang

Downsampling Assessment for LiDAR SLAM . . . . . . 234
Jiabao Zhang and Yu Zhang

The Key to Autonomous Intelligence is the Effective Synergy of Human-Machine-Environment Systems . . . . . . 243
Wei Liu, Yangyang Zou, Ruilin He, and Xiaofeng Wang

Design of Ultra-Low-Power Interface Circuit for Self-Powered Wireless Sensor Node . . . . . . 257
Chunlong Li, Hui Huang, Dengfeng Ju, Hongjing Liu, Kewen Liu, and Xingqiang Zhao

Distributed Consensus Tracking for Underactuated Ships with Input Saturation: From Underactuated to Nonholonomic Configuration . . . . . . 263
Linran Tian, Tao Li, and Guoying Miao

Trajectory Planning of Launch Vehicle Orbital Injection Segment Under Engine Failure Based on DDPG Algorithm . . . . . . 273
Zhuo Xiang, Bo Wang, Lei Liu, and Huijin Fan

A Survey on Lightweight Technology of Underwater Robot . . . . . . 281
Bofeng Fu and Gang Wang

False Alarm Rate Control Method for Fiber Vibrate Source Detection with Non-stationary Interference . . . . . . 289
Liping Yin, Zhengju Zhu, Mingxing Shu, and Hongquan Qu

An Improved YOLOv5-Based Small Target Detection Method for UAV Aerial Image . . . . . . 298
Ruoyu Li, Yang Gao, and Ruixing Zhang

A Federated Learning Method with DNN and 1DCNN Feature Fusion for Multiple Working Conditions Fault Diagnosis . . . . . . 313
Zhiqiang Zhang, Danmin Chen, and Funa Zhou

Backstepping Nonsingular Fast Terminal Sliding Mode Control for Manipulators Driven by PMSM with Measurement Noise . . . . . . 322
Xunkai Gao, Haisheng Yu, Xiangxiang Meng, and Qing Yang


Adaptive Variable Impedance Control of Robotic Manipulator with Nonlinear Contact Forces . . . . . . 331
Ying Guo, Jinzhu Peng, Shuai Ding, and Yanhong Liu

Research on Adaptive Network Recovery Method Based on Key Node Identification . . . . . . 341
Chaoqing Xiao, Lina Lu, Chengyi Zeng, and Jing Chen

Multi-Scale Feature Fusion Fault Diagnosis Method Based on Attention Mechanism . . . . . . 353
Feilong Yu, Funa Zhou, and Chang Wang

Designing Philobot: A Chatbot for Mental Health Support with CBT Techniques . . . . . . 361
Qi Ge, Lu Liu, Hewei Zhang, Linfang Li, Xiaonan Li, Xinyi Zhu, Lejian Liao, and Dandan Song

An EEG Study of Virtual Reality Motion Sickness Based on MVMD Combined with Entropy Asymmetry . . . . . . 372
Lining Chai, Chengcheng Hua, Zhanfeng Zhou, Xu Chen, and Jianlong Tao

GCN with Pattern Affected Matrix in Human Motion Prediction . . . . . . 378
Feng Zhou and Jianqin Yin

Cooperative Control of SMC-Feedback Linearization and Error Port Hamiltonian System for PMSM . . . . . . 392
Youyuan Chen, Haisheng Yu, Xiangxiang Meng, Hao Ding, and Xunkai Gao

Identification of Plant Nutrient Deficiency Based on Improved MobileNetV3-Large Model . . . . . . 402
Qian Yan, Yifei Chen, and Caicong Wu

Modular Smart Vehicle Design and Technology for Shared Mobility . . . . . . 409
Mo Zhou, Xinyu Zhang, Jun Li, Ying Fu, Xuebo Zhang, and Kun Wang

A Quadrupedal Soft Robot Based on Kresling Origami Actuators . . . . . . 417
Yang Yang, Shaoyang Yan, Mingxuan Dai, Yuan Xie, and Jia Liu

Design of Attitude Controller for Ducted Fan UAV Based on Improved Direct Adaptive Control Method . . . . . . 427
Hongyu Zhang, Xiaodong Liu, and Yong Xu


Hetero-Source Sensors Localization Based on High-Precision Map . . . . . . 437
Zhuofan Cui, Junyi Tao, Bin He, and Yu Zhang

Edge-Node Refinement for Weakly-Supervised Point Cloud Segmentation . . . . . . 445
Yufan Wang and Qunfei Zhao

Improving Dialogue Summarization with Mixup Label Smoothing . . . . . . 460
Saihua Cheng and Dandan Song

An Improved Multi-robot Coverage Method in 3D Unknown Environment Based on GBNN . . . . . . 476
Wang Wenhao, Zhang Fangfang, Xin Jianbin, Yu Hongnian, and Liu Yanhong

Behavior Recognition Method Based on Object Detection for Power Operation Scenes . . . . . . 484
Haojun Liu and Liuwang Wang

Continual Learning for Morphology Control . . . . . . 500
Jing Zhao, Yinsong Wang, Huaping Liu, and Meng Liu

An Improved Method for Text Classification Using Contrastive Learning . . . . . . 511
Maojian Chen, Xiong Luo, Qiaojuan Peng, Hailun Shen, and Ziyang Huang

Improved Cooperation by Balancing Exploration and Exploitation in Intertemporal Social Dilemma Tasks . . . . . . 519
Cheng Zhenbo, Xu Xuesong, Liu Xingguang, Zhang Leilei, Chen Qihou, Chen Yuxin, Zhang Xia, and Xiao Gang

Light-Weight High-Performance HRNet for Human Pose Estimation . . . . . . 533
Shengye Yan and Chengyu Yue

AHEAD: A Triple Attention Based Heterogeneous Graph Anomaly Detection Approach . . . . . . 542
Shujie Yang, Binchi Zhang, Shangbin Feng, Zhanxuan Tan, Qinghua Zheng, Jun Zhou, and Minnan Luo

Graph Autoencoder-Based Anomaly Detection for Chemical Mechanical Planarization . . . . . . 553
Xin Wang and Huangang Wang

Model-Free Adaptive Sliding Mode Control for Nonlinear Systems with Uncertainties . . . . . . 561
Xiangxiang Meng, Haisheng Yu, Jie Zhang, Ke Zhang, and Qing Yang


Quantum Illumination with Symmetric Non-Gaussian States . . . . . . 571
Wen-Yi Zhu, Wei Zhong, and Yu-Bo Sheng

Rank-Level Fusion of Multiple Biological Characteristics in Markov Chain . . . . . . 579
Qiankun Gao, Jie Chen, Xiao Xu, and Peng Zhang

PCB Defect Detection Algorithm Based on Multi-scale Fusion Network . . . . . . 589
Xiaofei Liao, Xuance Su, Guangyu Li, and Bohang Chao

Event-Triggered Adaptive Trajectory Tracking Control for Quadrotor Unmanned Aerial Vehicles . . . . . . 602
Zhongyuan Zhao, Chengjie Cao, and Zijuan Luo

Coal Maceral Groups Segmentation Using Multi-scale Residual Network . . . . . . 610
Junran Chen, Zhenghao Xi, Zhengnan Lv, Xiang Liu, and Mingyang Wu

Design of Magnetic Tactile Sensor Arrays for Intelligent Floorboard Based on the Demand of Older People . . . . . . 618
Lu Wang, Ling Weng, and Bowen Wang

Gaussian Process-Augmented Unscented Kalman Filter for Autonomous Navigation During Aerocapture at Mars . . . . . . 626
Shihang Cui and Yong Li

Application of EEG S-Transformation Combined with Dimensionless Metrics for Automatic Detection of Cybersickness . . . . . . 634
Zhanfeng Zhou, Chengcheng Hua, Lining Chai, and Jianlong Tao

An Overview of Multi-task Control for Redundant Robot Based on Quadratic Programming . . . . . . 641
Qingkai Li, Yanbo Pang, Wenhan Cai, Yushi Wang, Qing Li, and Mingguo Zhao

Improvement of Hierarchical Clustering Based on Dynamic Time Wrapping . . . . . . 667
Xudong Yuan and Yifan Lu

MNGAN: Multi-Branch Parameter Identification Based on Dynamic Weighting . . . . . . 675
Liudong Zhang, Zhen Lei, Zhiqiang Peng, Min Xia, Gang Zou, and Jun Liu

Error Selection Based Training of Fully Complex-Valued Dendritic Neuron Model . . . . . . 683
Zhidong Wang, Yuelin Wang, and He Huang


Intelligent Identification Method of Flow State in Nuclear Main Pump Based on Deep Learning Method . . . . . . 691
Ying-Yuan Liu, Di Liu, Zhenjun Zhang, and Kang An

Design of Intelligent Window Dwelling System Based on Multi Sensor Fusion . . . . . . 700
Simin Ding, Gang Wang, and Lihui Sun

Time-Varying Function-Based Anti-Disturbance Method for Permanent-Magnet Synchronous Motors . . . . . . 708
Tianjian Jiang, Yingcheng Wu, and Yang Yang

Research on the Operation Status of Metro Power Supply Equipment Under Cyber Physical System . . . . . . 720
Zhangbao Cao, Heng Wan, Xuliang Tang, and Xuefeng Chen

Hybrid Underwater Acoustic Signal Multi-Target Recognition Based on DenseNet-LSTM with Attention Mechanism . . . . . . 728
Mingchao Zhu, Xiaofeng Zhang, Yansong Jiang, Kejun Wang, Binghua Su, and Tenghui Wang

A Lightweight Deep Network Model for Visual Checking of Construction Materials . . . . . . 739
Xi Deng, Bocheng Zhou, Bingdong Ran, Yingming Yang, Ling Xiong, and Kai Wang

Research and Application of Automatic Screening Technology for Marketing Inspection Abnormalities Based on Knowledge Graph . . . . . . 747
Dan Lu, Linjuan Zhang, Yiming Xu, Changqing Xu, Hefa Sun, Hongyang Yin, and Min Xia

Global Asymptotic Synchronization of Nonlinear Hyperchaotic Financial Systems via Hybrid Control and Adaptive Projection Control . . . . . . 756
Guoliang Cai, Haojie Yu, Yanfeng Ding, and Huimin Liu

Quadrotor UAV Control Based on String-Level Fuzzy ADRC . . . . . . 765
Bohan Xu, Zhibin Li, Wengcheng Song, and Shengjie Wang

Deep Neural Network for Performance Prediction of Silicon Mode Splitter . . . . . . 775
Lin Zhang, Longqin Xie, and Weifeng Jiang

A Privacy Preserving Distributed Projected One-point Bandit Online Optimization Algorithm for Economic Dispatch . . . . . . 782
Zhiqiang Yang, Zhongyuan Zhao, and Quanbo Ge


Nonlinear Control of Dual UAV Slung Load Flight System Based on RBF Neural Network . . . . . . 790
Xin-Jie Han, Ji Li, Yun-Sheng Fan, and Xin-Yu Chen

A LiDAR Point Cloud Semantic Segmentation Algorithm Based on Attention Mechanism and Hybrid CNN-LSTM . . . . . . 802
Shuhuan Wen, Yunfei Lu, Tao Wang, Artur Babiarz, Mhamed Sayyouri, and Huaping Liu

Robust Ascent Trajectory Optimization for Hypersonic Vehicles Based on IGS-UMPSP . . . . . . 809
Yuting Qi, Bo Wang, Lei Liu, Huijin Fan, and Yongji Wang

Exponential Visual Stabilization of Wheeled Mobile Robots Based on Active Disturbance Rejection Control . . . . . . 821
Yao Huang, Lidong Zhang, and Xinrui Hou

An Adaptive Observer for Current Sensorless Control of Boost Converter Feeding Unknown Constant Power Load . . . . . . 830
Xiang Wang, Wei He, and Tao Li

Author Index . . . . . . 839

Predicting TUG Score from Gait Characteristics with Video Analysis and Machine Learning

Jian Ma(B)

Hitachi China Research Laboratory, Beijing, China
[email protected]

Abstract. Fall is a leading cause of death that afflicts the elderly and burdens society. The Timed Up and Go (TUG) test is a common tool for fall risk assessment. In this paper, we propose a method for predicting TUG score from gait characteristics extracted from video with computer vision and machine learning technologies. First, 3D pose is estimated from video captured with 2D and 3D cameras during human motion, and then a group of gait characteristics are computed from the 3D pose series. After that, copula entropy is used to select the characteristics that are most strongly associated with TUG score. Finally, the selected characteristics are fed into predictive models to predict TUG score. Experiments on real-world data demonstrated the effectiveness of the proposed method. As a byproduct, the associations between TUG score and several gait characteristics are discovered, which lay the scientific foundation of the proposed method and make the predictive models thus built interpretable to clinical users.

Keywords: copula entropy · gait characteristics · fall risk assessment · Timed Up and Go · linear regression · support vector regression

1 Introduction

Fall is common among the elderly, especially those living with Parkinson's disease or dementia. Fall injury usually leads to devastating consequences, especially for the elderly, from physical or mental impairments to loss of mobility and independence, even to death. According to the WHO report [1], fall was one of the top 20 leading causes of death in 2015, with about 714 thousand deaths worldwide accounting for 1.2% of mortality, and was anticipated to remain on the top 20 list in 2030, with an estimated 976 thousand deaths or 1.4% of mortality. Besides the cost in lives, fall injuries have also imposed economic costs upon our societies. For instance, elderly fall injuries cost about $20 billion per year in the United States alone [2]. Multiple risk factors were identified as highly likely to lead to fall injuries, including muscle weakness, fall history, gait and balance deficits, use of assisting devices, visual deficits, arthritis, impaired activities of daily living,

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 1–12, 2023. https://doi.org/10.1007/978-981-99-6187-0_1


mental and cognitive impairments, and high age [3]. In this research, we focus on an important fall risk factor – gait impairments – with the expectation of preventing falls through new technology for automatic fall risk assessment. There are several available instruments for fall risk assessment, pertinently on gait and balance abilities, including the Tinetti Performance Oriented Mobility Assessment (POMA), the Berg Balance Test, the Dynamic Gait Index, and the Timed Up and Go (TUG) test [4]. Among them, TUG [5] is widely used due to its ease of administration, low time cost, and reliable performance. TUG is a simple test assessing mobility by means of only five sequential tasks: rising from a chair, walking three meters, turning around, walking back to the chair, and sitting down. However, in certain care settings, this instrument is still inconvenient for the elderly, who must perform many functional activities, and time-consuming for caregivers as a daily routine service. Meanwhile, TUG also has its limitations, including variability, measuring only simple tasks, sensitivity to environmental conditions, subjectivity of professionals, etc. [6]. Technology for monitoring and assessing seniors' fall risk, continuously and unobtrusively, remains in appealing demand. In this paper, we propose a new technology for automatic and unobtrusive TUG assessment with video analysis and machine learning techniques. With video analysis, gait characteristics will be extracted from video data. Their associations with fall risk scores will be measured with copula entropy, and then the predictive models for predicting TUG score will be built on the most strongly associated characteristics. Such technology is expected to have several advantages, such as monitoring seniors' functional condition automatically, continuously and unobtrusively. The model thus developed is based on the nonlinear association between gait characteristics and fall risk, which makes the predictive models interpretable to clinical users.

2 Related Work

Several initial studies have already addressed this issue. In a pilot study, King et al. [7] tried to use wearable sensors to assess fall risk and obtained some preliminary results on the characteristics of sensor data from different fall risk groups. Another early study by Rantz et al. [8] examined the correlation (linear association) between six fall risk scores and gait characteristics (including stride time, stride length, and gait velocity) derived from pulse-Doppler radar and Microsoft Kinect. Wearable inertial sensors were investigated to derive gait-related parameters during the TUG test to classify fallers/non-fallers [9,10]. More works on wearable inertial sensors for fall risk assessment were reviewed in [11,12]. Automatic TUG tests based on different technologies, such as computer vision, depth cameras, wearable sensors, or smart phones, have been gaining momentum recently [6]. Li et al. proposed an automatic TUG sub-task segmentation method based on a 2D camera [13]. The TUG test was divided into six sub-tasks. Pose was estimated from each video frame, and the coordinates of the pose were used as input to classifiers to predict the sub-task of the frame. Depth cameras, such as Kinect, have been applied to automatic TUG tests by several researchers. Using skeleton or depth data, they all tried to estimate the TUG score by identifying the six phases of the test. Savoie,


et al. [14] proposed a system for automating the TUG test using a Kinect camera. 3D pose series were derived by combining 2D pose estimation with 3D depth information, and then the six sub-tasks were identified by detecting transition positions. Kempel et al. [15] proposed a method for automatic TUG testing with a Kinect camera. Skeleton data and depth data from the Kinect were utilized to detect the six phases of the TUG test. Dubois et al. identified the phases of the TUG test from the depth data of a Kinect camera and, additionally, extracted several gait-related parameters to classify subjects as having low or high fall risk [16]. Mehdizadeh et al. studied whether the gait characteristics extracted from a Kinect vision system are associated with the number of falls over two weeks [17]. The gait characteristics include five categories: spatial-temporal, variability, symmetry, stability, and acceleration frequency domain. Poisson regression was used to model the relationship between gait characteristics and the number of falls. Only the characteristics selected with Pearson correlation were taken as input to the regression model.

3 Methodology

3.1 Copula Entropy

Theory. Copula theory concerns the representation of multivariate dependence with copula functions [18,19]. At the core of copula theory is Sklar's theorem [20], which states that a multivariate probability density function can be represented as a product of its marginals and a copula density function, the latter representing the dependence structure among the random variables. Such a representation separates the dependence structure, i.e., the copula function, from the properties of the individual variables – the marginals – which makes it possible to deal with the dependence structure only, regardless of the joint and marginal distributions. This section defines a statistical independence measure with copula. For notation, please refer to [21]. With copula density, Copula Entropy (CE) is defined as follows [21]:

Definition 1 (Copula Entropy). Let X be random variables with marginal distributions u and copula density c(u). The CE of X is defined as

  H_c(X) = -∫_u c(u) log c(u) du.    (1)

In information theory, MI and entropy are two different concepts [22]. In [21], Ma and Sun proved that they are essentially the same – MI is a kind of entropy, namely negative CE – which is stated as follows:

Theorem 1. The MI of random variables is equivalent to negative CE:

  I(X) = -H_c(X).    (2)

The proof of Theorem 1 is simple [21]. There is also an immediate corollary (Corollary 1) on the relationship between the information of the joint probability density function, the marginal density functions, and the copula density function.


Corollary 1.

  H(X) = Σ_i H(X_i) + H_c(X).    (3)
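As a quick sanity check of Corollary 1 (an illustration added here, not part of the original paper), consider a bivariate Gaussian with unit variances and correlation ρ, for which all three terms have closed forms:

```latex
% Bivariate Gaussian, unit variances, correlation \rho:
% marginal entropies and joint entropy
H(X_1) = H(X_2) = \tfrac{1}{2}\log(2\pi e), \qquad
H(X_1,X_2) = \tfrac{1}{2}\log\!\bigl((2\pi e)^2\,(1-\rho^2)\bigr).
% The known Gaussian MI gives the CE via Theorem 1:
I(X_1,X_2) = -\tfrac{1}{2}\log(1-\rho^2)
\;\Rightarrow\;
H_c(X) = \tfrac{1}{2}\log(1-\rho^2).
% Corollary 1 then holds term by term:
H(X_1) + H(X_2) + H_c(X)
 = \log(2\pi e) + \tfrac{1}{2}\log(1-\rho^2)
 = H(X_1,X_2).
```

Note that H_c(X) is negative whenever ρ ≠ 0, consistent with MI being non-negative.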

The above results cast insight into the relationship between entropy, MI, and copula through CE, and therefore build a bridge between information theory and copula theory. CE itself provides a mathematical theory of statistical independence measurement.

Estimation. It has been widely considered that estimating MI is notoriously difficult. Thanks to Theorem 1, Ma and Sun [21] proposed a simple and elegant non-parametric method for estimating CE (MI) from data, which comprises only two steps¹:

1. Estimating the Empirical Copula Density (ECD);
2. Estimating CE.

For Step 1, given data samples {x_1, ..., x_T} i.i.d. generated from random variables X = {x_1, ..., x_N}ᵀ, one can easily estimate the ECD as follows:

F_i(x_i) = (1/T) Σ_{t=1}^{T} χ(x_i^t ≤ x_i),  (4)

where i = 1, ..., N and χ denotes the indicator function. Let u = [F_1, ..., F_N]; one can then derive a new sample set {u_1, ..., u_T} as data from the ECD c(u). In practice, Step 1 can easily be implemented non-parametrically with rank statistics. Once the ECD is estimated, Step 2 is essentially a problem of entropy estimation, for which many methods exist. Among them, the kNN method [23] was suggested in [21]. With rank statistics and the kNN method, one derives a non-parametric method for estimating CE, which can be applied to any situation without any assumption on the underlying system.
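The two-step estimator can be sketched in Python. This is a minimal sketch, not the copent package itself; the function name and the choice of the Kozachenko–Leonenko kNN entropy estimator under the max norm are implementation assumptions:

```python
import numpy as np
from scipy.special import digamma
from scipy.spatial import cKDTree
from scipy.stats import rankdata

def copula_entropy(x, k=3):
    """Estimate CE of an (n, d) sample: rank-based ECD, then kNN entropy."""
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    # Step 1: empirical copula via rank statistics, as in Eq. (4)
    u = np.column_stack([rankdata(x[:, j]) / n for j in range(d)])
    # Step 2: Kozachenko-Leonenko entropy of u with the max norm (c_d = 1)
    tree = cKDTree(u)
    eps, _ = tree.query(u, k=k + 1, p=np.inf)  # column 0 is the point itself
    eps_k = 2.0 * eps[:, -1]                   # "diameter" to the kth neighbour
    return -digamma(k) + digamma(n) + d * np.mean(np.log(eps_k + 1e-12))

# Theorem 1 in action: MI = -CE, so dependent data gives clearly negative CE
rng = np.random.default_rng(0)
z = rng.normal(size=2000)
dep = np.column_stack([z, z + 0.3 * rng.normal(size=2000)])
ind = rng.normal(size=(2000, 2))
print(copula_entropy(dep), copula_entropy(ind))  # strongly negative vs. near zero
```

For correlated Gaussian data the estimate approaches −I(X), while for independent columns it is close to zero, which is the sign convention stated in Theorem 1.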

3.2 Predictive Models

In this paper, two types of ML models, i.e., Linear Regression (LR) and Support Vector Regression (SVR), are selected among many others for building predictive models, since LR is the most typical linear model and SVR is the most widely-used nonlinear model for small-sample cases.

LR models a linear relationship between a dependent random variable and some independent random variables. Suppose there is a dependent random variable Y and an independent random vector X; the LR model is

y = Ax + β + ε,  (5)

where A, β are parameters to be estimated, and ε is noise.

¹ The R package copent for estimating CE is available on CRAN and also on GitHub at https://github.com/majianthu/copent.

Predicting TUG Score


SVR is a popular ML method that learns complex relationships from data [24]. Theoretically, thanks to the max-margin principle, SVR can learn a model of low complexity without compromising predictive ability. The learning of an SVR model is formulated as an optimization problem [24], which can be solved by quadratic programming techniques after being transformed into its dual form. SVR has a nonlinear version via the kernel trick. The final SVR model is represented as

f(x) = Σ_i v_i k(x, x_i) + b,  (6)

where x_i represents a support vector and k(·, ·) the kernel function.

4 Proposed Method

We propose a method for building a TUG prediction model with the above methodology. The proposed method starts by collecting raw video data during the TUG test and then extracting a group of gait characteristics from the raw video. With copula entropy measuring association strength, we then select the gait characteristics most associated with TUG score. The selected characteristics are fed into trained ML models (LR and SVR) to predict the TUG scores. The predictive model in the proposed method is not limited to these two; other predictive models may be used. The advantage of this method over others has been demonstrated on the UCI heart disease data [25], and the method has been applied to build predictive models for dementia diagnosis by predicting MMSE scores [26].
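The select-then-predict pipeline can be sketched end to end. This is a sketch under stated assumptions: synthetic data stand in for the real gait measurements, an absolute-correlation score stands in for the copula-entropy ranking, and scikit-learn's SVR is one concrete choice of predictive model:

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(1)
n = 146                                      # one sample per video, as in Sect. 5.2
X = rng.normal(size=(n, 18))                 # 18 gait characteristics (synthetic)
y = 10 + 2 * X[:, 0] - 1.5 * X[:, 1] + 0.3 * rng.normal(size=n)  # TUG-like score

# Step 1: rank characteristics by association strength with the score.
# The paper uses copula entropy here; a correlation score is a stand-in.
scores = np.array([abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])])
selected = np.argsort(scores)[::-1][:3]      # keep the 3 most associated

# Step 2: repeated 80/20 splits; report MAE and diagnosis accuracy (cutoff 13.5)
maes, accs = [], []
for seed in range(20):
    Xtr, Xte, ytr, yte = train_test_split(
        X[:, selected], y, test_size=0.2, random_state=seed)
    pred = SVR(C=10.0).fit(Xtr, ytr).predict(Xte)  # LinearRegression() fits the same API
    maes.append(mean_absolute_error(yte, pred))
    accs.append(np.mean((pred > 13.5) == (yte > 13.5)))
print(np.mean(maes), np.mean(accs))
```

The repeated random splitting mirrors the evaluation protocol of Sect. 5.3, and the cutoff comparison mirrors the faller/non-faller diagnosis accuracy.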

5 Experiments and Results

5.1 The Video System

In this research, the fall prediction models are built upon video data obtained from the video analysis system [27]. In the system, two types of video cameras, one 2D camera and one 3D camera (composed of two 2D cameras), are deployed in the living room of the elderly. All cameras work simultaneously. From the 2D camera, video of human activities is recorded and then analyzed with a pose estimation technique (Mask R-CNN [28] in our experiments), with which the positions of human joints in the video frames are estimated. From the 3D camera, depth information of each video frame is derived with a stereo vision technique. By matching joint positions and depth information, one can obtain 3D spatial information of the human pose in every video frame. This 3D information is the raw data from which the following ML pipeline extracts gait characteristics.

5.2 Data

The data for the experiments were collected from 40 subjects in Tianjin. All participants signed informed consent. The subjects were asked to perform the TUG test twice a day over one month, yielding 146 tests in total. For each test, a video of about 1–4 minutes was recorded, and gait characteristics were then extracted from it with the above video analysis system [27]. With one sample composed of 18 characteristics per video, sample data were extracted from each video and labeled with the corresponding TUG score. In this way, the whole data set of 146 samples was generated from the 146 videos. Most of the samples were collected from healthy subjects (TUG score around 10), and only a few subjects were at high fall risk (TUG score around 30).

5.3 Experiments

We conducted experiments to study the associations within the data and the performance of the predictive models. The predictive models are evaluated on two aspects: the interpretability of the characteristics and the prediction performance. In the first experiment, the association between the TUG scores and the characteristics is measured with copula entropy; the most associated characteristics are selected for the subsequent prediction experiments. In the second experiment, the selected characteristics are used for predicting TUG score to check the performance of the predictive models. The predictive models in the experiments are Linear Regression (LR) and Support Vector Regression (SVR). The ratio between training and test data is 80%/20%, and the data set was randomly split 100 times. The hyper-parameters of SVR are tuned to obtain the best possible prediction results. The performance of the predictive models is measured in two ways. First, it is measured by the Mean Absolute Error (MAE) between the true TUG scores and the predicted ones. Second, to check the clinical diagnosis ability of the predictive models, we take the clinical cutoff (TUG = 13.5) to separate fallers and non-fallers; the diagnosis accuracy is derived by comparing the true diagnoses with the predicted ones.

5.4 Results

The association between the characteristics and the TUG score measured by copula entropy is shown in Fig. 1. It can be seen from Fig. 1 that 1) speed variance is the characteristic most associated with TUG score, which is easy to understand and clinically meaningful; and 2) the other characteristics with strong associations include gait speed, pace (step length), and acceleration range. The joint distributions between TUG score and the three most associated characteristics are plotted in Fig. 2, from which it can be seen that speed, pace, and speed variability are linearly associated with one another and all nonlinearly associated with TUG score. This may imply that the method for generating gait characteristics is well defined and that the characteristics are biologically and clinically plausible.

Fig. 1. Association between the characteristics and the TUG score. [Bar chart of copula entropy for the 18 gait characteristics: speed, speed_var, pace, stridetime_var, stride_time, movement_intensity, acc_range, low_freq_per, stride_freq and their *_sd counterparts; vertical axis: Copula Entropy.]

Fig. 2. Joint distribution between the TUG score and the 3 characteristics. [Scatter-plot matrix of speed, pace, speed_var and tug.]

Considering the associations between the characteristics and TUG score, two experiments on predicting TUG score were conducted. In the first experiment, three characteristics (gait speed, pace, and speed variance) are used as the input of the predictive models; in the second experiment, four characteristics (gait speed, pace, speed variance, and acceleration range) are used as the input


of the predictive models. The prediction results are shown in Figs. 3 and 4. The performance in the two experiments, in terms of MAE and diagnosis accuracy, is listed in Tables 1 and 2. Comparing the results of the two experiments between Tables 1 and 2, one can see that the performance of the predictive models is hardly improved by including another characteristic in the models. This may imply that the three characteristics are the most useful for prediction and that including other characteristics does not help improve prediction performance. Comparing the predictions by LR and SVR between Tables 1 and 2, one can see that the latter is better in terms of MAE, while the former is better in terms of diagnosis accuracy. This is because LR gives better results for fallers (with high TUG scores), as shown in Figs. 3 and 4.

Fig. 3. Prediction with the 3 characteristics. [Predicted vs. true TUG score for SVR and LR.]


Fig. 4. Prediction with the 4 characteristics. [Predicted vs. true TUG score for SVR and LR.]

Table 1. Performance of the predictive models with the 3 characteristics.

                        LR     SVR
MAE                     1.675  1.429
Diagnosis Accuracy (%)  94.4   92.7

Table 2. Performance of the predictive models with the 4 characteristics.

                        LR     SVR
MAE                     1.612  1.441
Diagnosis Accuracy (%)  95.4   93.2

6 Discussion

From Fig. 1, one can see that several characteristics, such as gait speed, pace, speed variance, and acceleration range, are associated with TUG score. Gait speed and pace are clearly among the most associated characteristics. This indicates that the method of generating characteristics is reasonable, because gait speed has been widely considered an effective indicator of functional ability in clinical research [29–31]. The other characteristics also show high association strength and contribute to the performance of the models, which may suggest that speed variance and acceleration range measure certain aspects of functional ability and are helpful for predicting TUG score. Previously, speed variance was suggested as a marker


of fall risk [32]. Several studies have also shown that the variance of gait speed may be more closely related to fall risk than its average [33–35]. This point is supported by the evidence in our experiments that speed variance has a much stronger association with TUG score than gait speed.
When examining the association pattern between gait speed, pace, and TUG score, one can find that the associations are typically nonlinear, with a long tail in the distribution. This reflects how gait speed and pace change as functional ability deteriorates. It also implies that the linear correlation coefficient is not a good choice for selecting characteristics for the predictive models.
Examining the results of the two experiments carefully, one can see that including more gait characteristics only slightly improves the performance of both models. This implies that the two characteristics (gait speed and pace) are the most informative for the prediction and that the other characteristics are only somewhat helpful. We believe that the performance of the models may be further improved if more reasonable characteristics related to fall risk are introduced into the models.
Though the diagnosis accuracy of the predictive models is high, we interpret this with caution rather than as a clear success. Most samples are from healthy people, and such imbalanced data makes the trained models tend to predict subjects as healthy, so the majority of predictions are right by default. However, we can still notice that the models make good predictions for the samples from patients with high fall risk, better than those in previous research.
Compared with the related works, our method is novel on two points. First, the subjects are not required to perform the TUG test, even though they did in our experiment. We need not identify the phases of the TUG test as others did [13–16]. This makes the method automatic, unobtrusive, and easy to deploy in any setting. Second, our method predicts TUG score instead of the number of falls, and the prediction is based on gait characteristics carefully selected with copula entropy instead of the Pearson correlation coefficient [17]. This makes the method scientifically sound. It deserves mention that there are already studies reporting the relationship of the selected characteristics, such as gait variability, to fall risk, as discussed above. Our research confirms this relationship instead of providing contradictory results as in [17].

7 Conclusions

In this paper, we study how to predict TUG score from gait characteristics extracted from video data. We propose a method that extracts gait characteristics from whole-length videos with pose estimation and stereo vision techniques. These gait characteristics are then selected by measuring their associations with TUG score using copula entropy, and the selected characteristics are fed into the predictive models (LR and SVR) to predict TUG score. Experiments on real-world data show the effectiveness of the proposed method.


As a byproduct, the associations between TUG score and several gait characteristics, such as gait speed, pace, and gait speed variance, are discovered with copula entropy. This discovery provides further evidence for these gait characteristics as markers of fall risk and makes the predictive models interpretable, which is of critical importance to clinical users.

Acknowledgement. The author thanks Zhang Pan for providing the data of gait characteristics.

References
1. World Health Organization: Global Health Estimates (2023). https://www.who.int/data/global-health-estimates
2. Stevens, J.A., Corso, P.S., Finkelstein, E.A., Miller, T.R.: The costs of fatal and non-fatal falls among older adults. Injury Prev. 12(5), 290 (2006)
3. American Geriatrics Society, British Geriatrics Society, and American Academy of Orthopaedic Surgeons Panel on Falls Prevention: Guideline for the prevention of falls in older persons. J. Am. Geriatr. Soc. 49(5), 664 (2001)
4. Perell, K.L., Nelson, A., Goldman, R.L., Luther, S.L., Prieto-Lewis, N., Rubenstein, L.Z.: Fall risk assessment measures: an analytic review. J. Gerontol. Ser. A: Biol. Sci. Med. Sci. 56(12), M761 (2001)
5. Shumway-Cook, A., Brauer, S., Woollacott, M.: Predicting the probability for falls in community-dwelling older adults using the timed up & go test. Phys. Ther. 80(9), 896 (2000)
6. Sprint, G., Cook, D.J., Weeks, D.L.: Toward automating clinical assessments: a survey of the timed up and go. IEEE Rev. Biomed. Eng. 8, 64 (2015)
7. King, R.C., Atallah, L., Wong, C., Miskelly, F., Yang, G.Z.: In: 2010 International Conference on Body Sensor Networks, pp. 30–35. IEEE (2010)
8. Rantz, M., et al.: Automated in-home fall risk assessment and detection sensor system for elders. Gerontologist 55(Suppl 1), S78 (2015)
9. Greene, B.R., O'Donovan, A., Romero-Ortuno, R., Cogan, L., Scanaill, C.N., Kenny, R.A.: Quantitative falls risk assessment using the timed up and go test. IEEE Trans. Biomed. Eng. 57(12), 2918 (2010)
10. Weiss, A., Herman, T., Plotnik, M., Brozgol, M., Giladi, N., Hausdorff, J.: An instrumented timed up and go: the added value of an accelerometer for identifying fall risk in idiopathic fallers. Physiol. Meas. 32(12), 2003 (2011)
11. Howcroft, J., Kofman, J., Lemaire, E.D.: Review of fall risk assessment in geriatric populations using inertial sensors. J. Neuroeng. Rehabil. 10(1), 1 (2013)
12. Montesinos, L., Castaldo, R., Pecchia, L.: Wearable inertial sensors for fall risk assessment and prediction in older adults: a systematic review and meta-analysis. IEEE Trans. Neural Syst. Rehabil. Eng. 26(3), 573 (2018)
13. Li, T., et al.: IEEE Trans. Neural Syst. Rehabil. Eng. 26(11), 2189 (2018)
14. Savoie, P., Cameron, J.A., Kaye, M.E., Scheme, E.J.: IEEE J. Biomed. Health Inf. 24(4), 1196 (2019)
15. Kampel, M., Doppelbauer, S., Planinc, R.: In: Proceedings of the 12th EAI International Conference on Pervasive Computing Technologies for Healthcare, pp. 208–216 (2018)
16. Dubois, A., Bihl, T., Bresciani, J.P.: Sensors 18(1), 14 (2017)
17. Mehdizadeh, S., et al.: Vision-based assessment of gait features associated with falls in people with dementia. J. Gerontol. Ser. A 75(6), 1148 (2020)
18. Joe, H.: Dependence Modeling with Copulas. CRC Press, Boca Raton (2014)
19. Nelsen, R.B.: An Introduction to Copulas. Springer, New York (2007). https://doi.org/10.1007/0-387-28678-0
20. Sklar, A.: Publications de l'Institut de statistique de l'Université de Paris 8, 229 (1959)
21. Ma, J., Sun, Z.: Tsinghua Sci. Technol. 16(1), 51 (2011)
22. Cover, T.M.: Elements of Information Theory. Wiley, Hoboken (1999)
23. Kraskov, A., Stögbauer, H., Grassberger, P.: Estimating mutual information. Phys. Rev. E 69(6), 066138 (2004). https://doi.org/10.1103/PHYSREVE.69.066138
24. Smola, A.J., Schölkopf, B.: Stat. Comput. 14, 199 (2004)
25. Ma, J.: Chin. J. Appl. Probab. Stat. 37(4), 405 (2021)
26. Ma, J.: Predicting MMSE score from finger-tapping measurement. In: Deng, Z. (ed.) Proceedings of 2021 Chinese Intelligent Automation Conference. LNEE, vol. 801, pp. 294–304. Springer, Singapore (2022). https://doi.org/10.1007/978-981-16-6372-7_34
27. Li, Y., Zhang, P., Zhang, Y., Miyazaki, K.: In: 2019 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pp. 1471–1475. IEEE (2019)
28. He, K., Gkioxari, G., Dollár, P., Girshick, R.: In: Proceedings of the IEEE International Conference on Computer Vision (2017)
29. Viccaro, L.J., Perera, S., Studenski, S.A.: J. Am. Geriatr. Soc. 59(5), 887 (2011)
30. Verghese, J., Holtzer, R., Lipton, R.B., Wang, C.: Quantitative gait markers and incident fall risk in older adults. J. Gerontol. Ser. A 64(8), 896 (2009)
31. Prince, F., Corriveau, H., Hébert, R., Winter, D.A.: Gait in the elderly. Gait Posture 5(2), 128 (1997)
32. Hausdorff, J.M.: J. Neuroeng. Rehabil. 2(1), 1 (2005)
33. Hausdorff, J.M., Rios, D.A., Edelberg, H.K.: Arch. Phys. Med. Rehabil. 82(8), 1050 (2001)
34. Hausdorff, J.M., Edelberg, H.K., Mitchell, S.L., Goldberger, A.L., Wei, J.Y.: Arch. Phys. Med. Rehabil. 78(3), 278 (1997)
35. Maki, B.E.: Gait changes in older adults: predictors of falls or indicators of fear? J. Am. Geriatr. Soc. 45(3), 313 (1997)

Task-Space Finite-Time Prescribed Performance Tracking Control for Free-Flying Space Robots Under Input Saturation Xuewen Zhang and Yingmin Jia(B) The Seventh Research Division and the Center for Information and Control, School of Automation Science and Electrical Engineering, Beihang University (BUAA), Beijing 100191, People’s Republic of China {zy2103120,ymjia}@buaa.edu.cn

Abstract. This paper investigates the task-space finite-time prescribed performance tracking control of a free-flying space robot subject to input saturation. The proposed control law combines a modified barrier Lyapunov function, the backstepping technique, the dynamic surface control technique, and an adaptive approach. The closed-loop errors of both the manipulator end-effector pose and the base attitude are steered to converge in finite time with prescribed performance. Numerical simulations are carried out to illustrate the proposed control law.

Keywords: Free-flying space robot · Finite-time stability · Prescribed performance control · Input saturation · Backstepping

1 Introduction

The prescribed performance control (PPC) method has been developed recently to impose stricter demands on system performance, and it has been introduced into the controller design of space robots in many recent works [1–4]. The prescribed performance functions (PPF) commonly used in these studies have infinite convergence time, which needs to be improved for practical requirements. The finite-time PPC method is a promising approach to realizing finite-time stability in such cases. Novel finite/fixed-time stable PPFs were proposed in [5–7] for the controller design of robot manipulators, uncertain mechanical systems, and non-strict feedback systems, respectively. In PPC systems, the input saturation problem cannot be ignored, because fast transient responses increase the risk of saturated input and can cause errors to exceed the PPF envelope, resulting in system collapse [8]. In [9], the PPC problem of space robots under actuator saturation was solved by neural network approaches. Auxiliary systems were introduced to handle the saturated input in the PPC controller [8]. Moreover, most space robot tracking controllers were designed in joint space rather than task space, limiting their applications.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 13–22, 2023. https://doi.org/10.1007/978-981-99-6187-0_2


X. Zhang and Y. Jia

The task-space control problems were tackled in [10–14], whose main deficiency was neglecting the base attitude stabilization problem; this may cause problems with remote communication and reduce the charging efficiency of onboard solar panels. This paper investigates the task-space finite-time PPC problem under input saturation for a free-flying space robot. Different from [11–13], the task-space control model is established for the considered space robot involving both the end-effector (EE) pose and the base attitude. The backstepping-based controller design procedure adopts a PPF with finite convergence time and a modified barrier Lyapunov function (BLF), omitting the error transformation step. An auxiliary system is introduced to handle the influence of the saturated control input. Moreover, the dynamic surface control (DSC) and adaptive methods are used to avoid the explosion of derivatives and to deal with the lumped disturbances.
The remainder of this paper is organized as follows. Section 2 introduces the formulation of the control problem. Section 3 presents the design and stability analysis of the proposed control strategy. Numerical simulations are carried out in Sect. 4. Section 5 concludes the work briefly.
In this paper, λ̲(·), λ̄(·), and tr(·) denote the minimum eigenvalue, the maximum eigenvalue, and the trace of a matrix. Vectors ⁱv_j, ⁱω_j denote the linear/angular velocity of frame j, projected in frame i, where i, j can be '0' (the base frame), 'e' (the end-effector frame) or 'I' (the inertial frame). For any x = [x₁, x₂, x₃]ᵀ ∈ R³, the hat operator '∧' is defined as x̂ = [0, −x₃, x₂; x₃, 0, −x₁; −x₂, x₁, 0], and the vee operator '∨' is the inverse of the hat operator.

2 Problem Formulation

A typical free-flying space robot consists of a base and an n-degrees-of-freedom (DOF) manipulator. Its equations of dynamics can be given by [15]

A q̇ + c = τ,  (1)

where A ∈ R^((n+3)×(n+3)) denotes the positive-definite inertia matrix, c ∈ R^(n+3) is the nonlinear term associated with the modeling coefficients, and q = [u_mᵀ, (⁰ω₀ᴵ)ᵀ]ᵀ ∈ R^(n+3), with u_m ∈ Rⁿ being the manipulator joint velocity and ⁰ω₀ᴵ ∈ R³ the base angular velocity. Additionally, τ = [τ_mᵀ, τ₀ᵀ]ᵀ ∈ R^(n+3) represents the control torque, with τ_m ∈ Rⁿ being the manipulator joint torque and τ₀ ∈ R³ the base torque. Considering modeling uncertainties, external disturbances, and the control saturation problem, we rewrite (1) as

A₀ q̇ + c₀ = sat(τ) + d,  d ≜ τ_d − ΔA q̇ − Δc,
sat(τ) ≜ [sat(τ₁), · · · , sat(τ₉)]ᵀ,  sat(τᵢ) ≜ sign(τᵢ) · min{|τᵢ|, τᵢ^max},  (2)

where (A₀, c₀) and (ΔA, Δc) denote the nominal and uncertain parts of (A, c), and τᵢ^max is the maximum allowable value of τᵢ. Here d is the lumped disturbance and τ_d represents the external disturbance torque. Assume that d is bounded by

‖d‖ ≤ Θᵀ Φ,  (3)
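The element-wise saturation operator in (2) can be sketched as follows. This is a small Python sketch; the per-channel limits shown are the τ_max values used later in the simulation section:

```python
import numpy as np

def sat(tau, tau_max):
    """Element-wise saturation: sign(tau_i) * min(|tau_i|, tau_i_max), as in (2)."""
    tau = np.asarray(tau, dtype=float)
    return np.sign(tau) * np.minimum(np.abs(tau), tau_max)

tau_max = np.array([10, 10, 10, 5, 5, 5, 15, 15, 15], dtype=float)
tau = np.array([12.0, -3.0, 10.0, -7.5, 4.9, 5.0, 20.0, -16.0, 0.0])
tau_sat = sat(tau, tau_max)      # channels exceeding their limit are clipped
delta_tau = tau_sat - tau        # the saturation error used by the auxiliary system
```

The difference `delta_tau` corresponds to Δτ = sat(τ) − τ, the quantity fed to the anti-windup auxiliary system later in the paper.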

where Θ = [θ₁, θ₂, θ₃]ᵀ ∈ R³ is the unknown coefficient vector to be estimated, and Φ = [1, ‖q̇‖, ‖q̇‖²]ᵀ ∈ R³ is a state-related vector.
The considered free-flying space robot is assumed to have zero initial momentum and to be unaffected by any external force. According to [16], the following kinematic relationship can be derived:

ṗ_e = ᴵv_eᴵ = (J_mv − J_Tω/M) u_m − p̂_eg · R₀ · ⁰ω₀ᴵ,  (4)

where p_e denotes the EE position, ᴵv_eᴵ the EE linear velocity, J_mv and J_Tω the configuration-related known matrices, M = Σ_{i=0}^{n} m_i the total mass, and p_eg = p_e − r_g, with r_g being the mass center of the system.
Let Υ_e = 1/(2√(1 + tr(R_edᵀ R_e))) and Υ₀ = 1/(2√(1 + tr(R_0dᵀ R₀))). Following the results in [10,17], we define e_p = p_e − p_ed, e_E = Υ_e (R_edᵀ R_e − R_eᵀ R_ed)^∨, e_ωe = ᵉω_eᴵ − R_eᵀ R_ed · ᵉω_edᴵ, e_B = Υ₀ (R_0dᵀ R₀ − R₀ᵀ R_0d)^∨ and e_ω0 = ⁰ω₀ᴵ − R₀ᵀ R_0d · ⁰ω_0dᴵ as the EE position error, attitude error and angular velocity error, and the base attitude error and base angular velocity error, respectively. Therein, p_ed, R_e, R_ed, ᵉω_eᴵ, ᵉω_edᴵ, R₀, R_0d and ⁰ω_0dᴵ denote the desired position, the current and desired rotation matrices and the current and desired angular velocities of the EE, the current and desired rotation matrices of the base, and the desired angular velocity of the base, respectively. With these errors, the following relationships can be established [17]:

ė_E = Υ_e (tr(R_eᵀ R_ed) I₃ − R_eᵀ R_ed + 2 e_E e_Eᵀ) e_ωe ≜ E_E e_ωe,  (5a)
ė_B = Υ₀ (tr(R₀ᵀ R_0d) I₃ − R₀ᵀ R_0d + 2 e_B e_Bᵀ) e_ω0 ≜ E_B e_ω0.  (5b)

Define the tracking error e = [e_pᵀ, e_Eᵀ, e_Bᵀ]ᵀ. The dynamics of e can be derived from (4), (5) and the definitions of e_ωe, e_ω0 as

ė = Ē q + Λ̄,  (6)

where

Ē = [ J_mv − J_Tω/M    −p̂_eg R₀
      E_E R_eᵀ J_mω     E_E R_eᵀ R₀
      O₃ₓ₆              E_B ],

Λ̄ = [ −ṗ_ed
      −E_E R_eᵀ R_ed · ᵉω_edᴵ
      −E_B R₀ᵀ R_0d · ⁰ω_0dᴵ ].

The prescribed performance constraints on the error e can be described as

−ρᵢ(t) < eᵢ(t) < ρᵢ(t),  i = 1, · · · , 9,  (7)

in which the ρᵢ are the chosen finite-time PPFs described by [5]:

ρᵢ(t) = (ρ₀ᵢ − ρ∞ᵢ) exp(−lᵢ t/(Tᵢ − t)) + ρ∞ᵢ,  0 < t < Tᵢ;
ρᵢ(t) = ρ∞ᵢ,  t ≥ Tᵢ,  (8)

where Tᵢ > 0 denotes the pre-specified convergence time, ρ₀ᵢ > |eᵢ(0)|, and ρ∞ᵢ and lᵢ are design parameters. Note that this PPF is superior to the conventional ρ(t) = (ρ₀ − ρ∞)e^(−lt) + ρ∞ in that the latter has an infinite convergence time.
The control objective of this paper is stated as follows: find a tracking controller τ for (2) such that the error e in the form of (6) satisfies the performance constraint (7), and all closed-loop signals are bounded for all t ≥ 0.
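A finite-time PPF can be evaluated numerically as follows. This is a sketch assuming the envelope form ρ(t) = (ρ₀ − ρ∞)·exp(−l·t/(T − t)) + ρ∞ for t < T and ρ(t) = ρ∞ afterwards, which reaches the steady band ρ∞ exactly at the pre-specified time T:

```python
import numpy as np

def ppf(t, rho0, rho_inf, l, T):
    """Finite-time prescribed performance envelope (sketch of one common form:
    (rho0 - rho_inf) * exp(-l * t / (T - t)) + rho_inf for t < T, else rho_inf)."""
    t = np.asarray(t, dtype=float)
    out = np.full_like(t, rho_inf)
    m = t < T                               # avoid division by zero at t = T
    out[m] = (rho0 - rho_inf) * np.exp(-l * t[m] / (T - t[m])) + rho_inf
    return out

t = np.linspace(0.0, 15.0, 1501)
rho = ppf(t, rho0=0.15, rho_inf=0.01, l=0.5, T=10.0)  # rho_1 parameters of Sect. 4
```

Unlike the conventional exponential envelope, this function is exactly equal to ρ∞ for all t ≥ T, which is what makes the convergence time finite and pre-assignable.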

3 Main Result

 2  9 ρi Design a modified BLF as V1 = 12 i=1 log ρ2 −z , where z1 = e and ρi is the 2 i 1i PPF corresponding to z1i . Obviously, one has −ρi (t) < z1i (t) < ρi (t), provided that V1 is bounded. Design the virtual control law of q as follow   ¯ −1 −K1 z1 − 1 Ξz1 + ϑ − Λ¯ , (9) qc = E 21   1 1 , ϑ = where K1 = diag (k11 , · · · , k19 ) > 0, Ξ = diag ρ2 −z 2 , · · · , ρ2 −z 2 1 11 9 19  T ρ˙ 1 ρ˙ 9 and 1 > 0. Assume that there exists a constant vector χ ρ1 z11 , · · · , ρ9 z19 such that | q˙c |≤ χ. We use the first-order filter below to avoid direct calculation about q˙c , ˙ + = qc , (0) = qc (0), (10) τ0

where is the filter output, and τ0 denotes the time constant. The filter error is defined as ζ = − qc . Define z2 = q − , it then by (2), (10) follows that ¯ (z2 + ζ + qc ) + Λ, ¯ z˙ 1 = E z˙ 2 =

A−1 0 sat (τ )



A−1 0 c0

(11a) +

A−1 0 d

− ζ˙ − q˙c .

Consider the following auxiliary system, ⎧ 1 T −1 T ⎪ ⎨ − b ε − z2 A0 Δτ + 2 b2 Δτ Δτ ε + b Δτ , ε > ε 1 2 0 2 ε , ε˙ = ⎪ ⎩ 0, ε ≤ ε0

(11b)

(12)

where ε is the auxiliary state, b1 > 0, b2 > 0, Δτ = sat (τ ) − τ , and ε0 > 0 is a  be the estimation of Θ, we define the estimation error as small constant. Let Θ    as Θ = Θ − Θ. Design the control law τ and the updating algorithm for Θ 

τ = −A0

 T A−1 ζ 0 Θ Φ + z2 + K 2 z2 + K 3 ε + c0 , E Ξz1 + τ0 z2  + 0 ¯T

−1 2 ˙  (0) = 0,  + A0  · z2  Φ , Θ  Θ = −k4 Θ z2  + 0

(13)

(14)

where K2 = diag (k21 , · · · , k29 ) > 0, K3 = diag (k31 , · · · , k39 ) > 0, k4 > 0 and

0 > 0. Note that 0 is a constant used to alleviate the singularity problem.


Theorem 1. Given the free-flying space robot described by (2) and the error dynamics (6), the control objective can be achieved by implementing the control law (13) and the updating algorithm (14).

Proof. Define k̲₁ = min{k₁ᵢ}. Differentiating V₁ along the solution trajectory of (11), substituting (13), (14) into (11b), and using Young's inequality z₁ᵀ Ξ Ē ζ ≤ (ℓ₁/2) ζᵀ Ēᵀ Ē ζ + (1/(2ℓ₁)) z₁ᵀ Ξ² z₁, we obtain

V̇₁ ≤ −k̲₁ Σ_{i=1}^{9} z₁ᵢ²/(ρᵢ² − z₁ᵢ²) + z₁ᵀ Ξ Ē z₂ + (ℓ₁/2) ζᵀ Ēᵀ Ē ζ.  (15)

Choose the Lyapunov function V₂ = V₁ + (1/2) z₂ᵀz₂ + (1/2) ζᵀζ + (1/2) Θ̃ᵀΘ̃ + (1/2) εᵀε. According to (11b), (12) and (15), we have

V̇₂ ≤ −k̲₁ Σ_{i=1}^{9} z₁ᵢ²/(ρᵢ² − z₁ᵢ²) + z₁ᵀ Ξ Ē z₂ + (ℓ₁/2) λ̄(ĒᵀĒ) ζᵀζ − z₂ᵀK₂z₂ − z₂ᵀK₃ε + z₂ᵀA₀⁻¹d − z₂ᵀζ/τ₀ − ζᵀζ/τ₀ − ζᵀq̇_c − Θ̃ᵀΘ̂˙ + εᵀε̇.  (16)

Consider the following two cases.
(i) When ‖ε‖ > ε₀, ε̇ = −b₁ε − ((z₂ᵀA₀⁻¹Δτ + (1/2) b₂ΔτᵀΔτ)/‖ε‖²) ε + b₂Δτ. Substituting (13) and (14) into (16), we have

V̇₂ ≤ −k̲₁ Σ_{i=1}^{9} z₁ᵢ²/(ρᵢ² − z₁ᵢ²) + (ℓ₁/2) λ̄(ĒᵀĒ) ζᵀζ − (‖A₀⁻¹‖ Θ̂ᵀΦ ‖z₂‖²)/(‖z₂‖ + σ₀) + z₂ᵀA₀⁻¹τ + z₂ᵀA₀⁻¹Δτ − z₂ᵀA₀⁻¹c₀ + ‖A₀⁻¹‖‖z₂‖ΘᵀΦ − ζᵀζ/τ₀ − ζᵀq̇_c − Θ̃ᵀΘ̂˙ − b₁εᵀε − (1/2) b₂ΔτᵀΔτ + b₂εᵀΔτ.  (17)

Obviously, if |z₁ᵢ| < ρᵢ, then

−(‖A₀⁻¹‖Θ̂ᵀΦ‖z₂‖²)/(‖z₂‖ + σ₀) + ‖A₀⁻¹‖‖z₂‖ΘᵀΦ − Θ̃ᵀΘ̂˙
≤ (−‖A₀⁻¹‖Θ̂ᵀΦ‖z₂‖² + ‖A₀⁻¹‖ΘᵀΦ‖z₂‖² + ‖A₀⁻¹‖σ₀‖z₂‖ΘᵀΦ)/(‖z₂‖ + σ₀) − Θ̃ᵀΘ̂˙
≤ Θ̃ᵀ((‖A₀⁻¹‖Φ‖z₂‖²)/(‖z₂‖ + σ₀) − Θ̂˙) + ‖A₀⁻¹‖|ΘᵀΦ|σ₀
≤ −(1/2) k₄Θ̃ᵀΘ̃ + (1/2) k₄ΘᵀΘ + ‖A₀⁻¹‖|ΘᵀΦ|σ₀,  (18)

and furthermore log(ρᵢ²/(ρᵢ² − z₁ᵢ²)) < z₁ᵢ²/(ρᵢ² − z₁ᵢ²), −ζᵀq̇_c ≤ (1/(2ℓ₂)) ζᵀζ + (ℓ₂/2)‖χ‖² with ℓ₂ > 0, and b₂εᵀΔτ ≤ (1/2) b₂εᵀε + (1/2) b₂ΔτᵀΔτ. Substituting the above four inequalities into (17) leads to

V̇₂ ≤ −k̲₁ Σ_{i=1}^{9} log(ρᵢ²/(ρᵢ² − z₁ᵢ²)) − (k̲₂ − (1/2) k̄₃) z₂ᵀz₂ − (1/τ₀ − 1/(2ℓ₂) − (ℓ₁/2) λ̄(ĒᵀĒ)) ζᵀζ − (b₁ − (1/2) b₂ − (1/2) k̄₃) εᵀε − (1/2) k₄Θ̃ᵀΘ̃ + (ℓ₂/2)‖χ‖² + (1/2) k₄ΘᵀΘ + ‖A₀⁻¹‖|ΘᵀΦ|σ₀
≤ −Γ₁V₂ + ∇₁,  (19)

where k̲₂ = min{k₂ᵢ}, k̄₃ = max{k₃ᵢ}, Γ₁ = min{2k̲₁, 2b₁ − b₂ − k̄₃, 2k̲₂ − k̄₃, k₄} > 0 and ∇₁ = (ℓ₂/2)‖χ‖² + (1/2) k₄ΘᵀΘ + ‖A₀⁻¹‖|ΘᵀΦ|σ₀ > 0.
(ii) When ‖ε‖ ≤ ε₀, ε̇ = 0. Similar to the analysis of case (i), substituting ε̇ = 0, (13), (14) and the inequalities (1/2) k̄₃εᵀε ≤ k̄₃ε₀² − (1/2) k̄₃εᵀε and z₂ᵀA₀⁻¹Δτ ≤ (1/(2ℓ₂)) λ̄(A₀⁻²) z₂ᵀz₂ + (ℓ₂/2) ΔτᵀΔτ into (16), we get

V̇₂ ≤ −k̲₁ Σ_{i=1}^{9} log(ρᵢ²/(ρᵢ² − z₁ᵢ²)) − (k̲₂ − (1/2) k̄₃ − (1/(2ℓ₂)) λ̄(A₀⁻²)) z₂ᵀz₂ − (1/τ₀ − 1/(2ℓ₂) − (ℓ₁/2) λ̄(ĒᵀĒ)) ζᵀζ − (1/2) k₄Θ̃ᵀΘ̃ − (1/2) k̄₃εᵀε + k̄₃ε₀² + (ℓ₂/2) ΔτᵀΔτ + (ℓ₂/2)‖χ‖² + (1/2) k₄ΘᵀΘ + ‖A₀⁻¹‖|ΘᵀΦ|σ₀
≤ −Γ₂V₂ + ∇₂,  (20)

where Γ₂ = min{2k̲₁, 2k̲₂ − k̄₃ − (1/ℓ₂) λ̄(A₀⁻²), 2/τ₀ − 1/ℓ₂ − ℓ₁ λ̄(ĒᵀĒ), k₄, k̄₃} > 0 and ∇₂ = k̄₃ε₀² + (ℓ₂/2) ΔτᵀΔτ + (ℓ₂/2)‖χ‖² + (1/2) k₄ΘᵀΘ + ‖A₀⁻¹‖|ΘᵀΦ|σ₀ > 0.
It can be concluded from (19) and (20) that V̇₂ ≤ −ΓV₂ + ∇ with Γ = min{Γ₁, Γ₂} > 0 and ∇ = max{∇₁, ∇₂} > 0, so the system states are uniformly ultimately bounded, which implies that V₁, z₂, ζ, Θ̃ and ε are bounded. Consequently, the prescribed performance constraints (7) hold for all t > 0. Additionally, to guarantee Γ₁ > 0 and Γ₂ > 0, the design parameters should be chosen such that 2k̲₂ − k̄₃ − (1/ℓ₂) λ̄(A₀⁻²) > 0, 1/τ₀ − 1/(2ℓ₂) − (ℓ₁/2) λ̄(ĒᵀĒ) > 0 and 2b₁ − b₂ − k̄₃ > 0. This completes the proof.

4 Simulations

In this section, a space robot with a 6-DOF manipulator is used for simulation experiments. Its nominal and perturbed dynamic parameter values are listed in Table 1 and Table 2. The external disturbance torque is chosen as τ_d = [0.5 sin(0.2t), sin(0.3t + π/2), sin(0.3t + π/3), 0.1 sin(0.5t + π/2), 0.1 sin(0.4t + π/4), 0.1 sin(0.5t), 2 sin(0.01t + π/4), 1.5 sin(0.02t), 2 sin(0.03t + π/2)]ᵀ (N·m). In the inertial frame,


Table 1. Nominal and perturbed masses of the space robot

                              m0   m1  m2  m3  m4  m5  m6
Nominal mass of link i/kg     100  5   10  10  6   6   6
Perturbed mass of link i/kg   120  7   13  13  5   7   5

Table 2. Nominal and perturbed inertia matrices of the space robot

  Nominal inertia matrix Ii of link i (kg · m²):
    I0 = diag(10, 11, 12), I1 = diag(0.05, 0.05, 0.05),
    I2 = diag(0.1, 0.1, 0.1), I3 = diag(0.1, 0.1, 0.1),
    I4 = diag(0.07, 0.07, 0.07), I5 = diag(0.07, 0.07, 0.07),
    I6 = diag(0.07, 0.07, 0.07)

  Perturbed inertia matrix Ii of link i (kg · m²):
    I0 = diag(9.3, 9.3, 9.3), I1 = diag(0.07, 0.06, 0.08),
    I2 = diag(0.12, 0.09, 0.11), I3 = diag(0.08, 0.12, 0.09),
    I4 = diag(0.08, 0.09, 0.06), I5 = diag(0.08, 0.06, 0.08),
    I6 = diag(0.06, 0.06, 0.06)


Fig. 1. EE position trajectories in task space.

the initial EE position is [0.627, −0.998, 1.844]^T (m), the initial EE DCM is [0.95803, 0.16224, 0.23633; 0.14479, 0.43765, −0.88741; −0.2474, 0.88439, 0.39579], and the initial base DCM is [0.95803, −0.20531, 0.20005; 0.1447, 0.94889, 0.28045; −0.2474, −0.23971, 0.93879]. Suppose that the EE is required to track a task-space circular trajectory with center located at [0.6, −1.0, 1.8]^T (m), radius 0.1 (m) and normal vector [1/√3, 1/√3, 1/√3]^T. Thus ped is set as ped(x)(t) = 0.6 + 0.1 · (1/√6) cos δ − 0.1 · (2/√6) sin δ (m), ped(y)(t) = −1.0 + 0.1 · (1/√6) cos δ + 0.1 · (1/√6) sin δ (m), ped(z)(t) = 1.8 − 0.1 · (2/√6) cos δ + 0.1 · (1/√6) sin δ (m), with the parameter δ(t) designed using fifth-order polynomial interpolation. Set R0d = I3 and ⁰ω₀dᴵ = 0₃ₓ₁. Moreover, Red is designed via spherical linear interpolation, and ωed is calculated directly from the skew-symmetric matrix ω̂ed = Red^T Ṙed.
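To make the trajectory generation above concrete, the sketch below builds δ(t) as a quintic (fifth-order) rest-to-rest polynomial and evaluates the commanded EE position. The duration tf = 15 s and the helper names are assumptions for illustration; the paper does not list its interpolation coefficients.

```python
import math

def delta(t, tf=15.0):
    """Quintic (fifth-order) rest-to-rest interpolation of delta from 0 to 2*pi.

    delta(0) = 0 and delta(tf) = 2*pi, with zero velocity and acceleration
    at both endpoints. The duration tf = 15 s is an assumed value.
    """
    s = min(max(t / tf, 0.0), 1.0)
    return 2.0 * math.pi * (10 * s**3 - 15 * s**4 + 6 * s**5)

def ped(t):
    """Commanded EE position on the circle given in the text: center
    [0.6, -1.0, 1.8] (m), radius 0.1 (m), plane normal (1, 1, 1)/sqrt(3)."""
    d, r = delta(t), 0.1
    c6 = math.cos(d) / math.sqrt(6)
    s6 = math.sin(d) / math.sqrt(6)
    return (0.6 + r * (c6 - 2 * s6),
            -1.0 + r * (c6 + s6),
            1.8 + r * (-2 * c6 + s6))
```

Since both the cosine direction (1, 1, −2)/√6 and the sine direction (−2, 1, 1)/√6 are orthogonal to the normal (1, 1, 1)/√3, every commanded point lies in the circle's plane through the center.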

X. Zhang and Y. Jia

Fig. 2. Tracking errors ep , eE , eB .


Fig. 3. Control torques τ1 ∼ τ9 .

The parameters of the PPFs are chosen as ρ01 = ρ02 = ρ03 = 0.15, ρ04 = ρ05 = ρ06 = 0.2, ρ07 = ρ08 = ρ09 = 0.3, ρ∞i = 0.01, Ti = 10 and li = 0.5, i = 1, · · · , 9. The design parameters are selected as K1 = diag(0.01, 0.01, 0.01, 0.005, 0.005, 0.005, 0.03, 0.03, 0.03), K2 = diag(130, 130, 130, 90, 90, 90, 150, 150, 150), K3 = diag(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1), k4 = 2, b1 = 2, b2 = 0.05, τ0 = 0.01, 0 = 5, ε0 = 0.1, τmax = [10, 10, 10, 5, 5, 5, 15, 15, 15]^T (N · m) and 1 = 500. The simulation results are depicted in Figs. 1, 2 and 3. Figure 1 shows the task-space EE position tracking curves ped and pe. Time responses of the tracking errors ep, eE and eB are shown in Fig. 2, from which we can see that the error components evolve strictly within their corresponding finite-time PPF envelopes. The control torque curves of the first three joints, i.e., the waist, shoulder and elbow joints of the manipulator, are limited by ±10 (N · m), as can be seen from the first sub-figure of Fig. 3. As depicted in the second sub-figure of Fig. 3, the control torques of the last three manipulator joints, i.e., the wrist joints, are limited by ±5 (N · m). In addition, the third sub-figure of Fig. 3 reveals that the base control torques are limited by ±15 (N · m).
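For illustration, the sketch below uses one representative finite-time PPF form from the prescribed performance control literature that is consistent with the parameters (ρ0i, ρ∞i, Ti, li) listed above; the paper's own PPF is defined in its earlier sections and may differ.

```python
def ppf(t, rho0, rho_inf=0.01, T=10.0, l=0.5):
    """Representative finite-time prescribed performance function (PPF).

    Decays from rho0 at t = 0 to rho_inf at t = T with exponent 1/(1 - l),
    and stays at rho_inf afterwards. This is a common finite-time form from
    the PPC literature, not necessarily the paper's exact definition.
    """
    if t >= T:
        return rho_inf
    return (rho0 - rho_inf) * (1.0 - t / T) ** (1.0 / (1.0 - l)) + rho_inf

def within_envelope(e, t, rho0):
    """Check one tracking-error sample against the symmetric envelope
    -ppf(t) < e < ppf(t), as observed for the curves in Fig. 2."""
    r = ppf(t, rho0)
    return -r < e < r
```

With l = 0.5 the decay exponent is 2, so the envelope shrinks quadratically and reaches the steady-state bound ρ∞ exactly at t = T.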

5 Conclusion

This paper presents the design and analysis of a task-space finite-time prescribed performance tracking controller for a free-flying space robot subject to input saturation. It is proved that the tracking errors satisfy the prescribed performance constraints and that all closed-loop errors are ultimately bounded. In the future, the authors would like to investigate the event-triggered finite-time tracking control problem for free-flying space robots.

Acknowledgment. This work was supported in part by the NSFC (62133001, 62227810) and the National Basic Research Program of China (973 Program: 2012CB821200, 2012CB821201).


An Online Game Platform for Intangible Cultural Heritage Tibetan Jiu Chess

Xiali Li, Yanyin Zhang, Licheng Wu(B), Yandong Chen, and Bo Liu

Minzu University of China, Beijing 100081, China [email protected]

Abstract. Tibetan Jiu chess is a treasure of Tibetan culture. To help this intangible cultural heritage flourish, an online battle platform for Jiu chess has been developed. The platform is built on the Cocos Creator engine, and the languages used are JavaScript and TypeScript. The SDK of the Huawei Technologies Co., Ltd. online battle engine is embedded to provide the online battle function. The platform successfully completed the online game competition task for Tibetan Jiu chess in the 2022 University Computer Games Championship & National Computer Games Tournament in China, and can stably support about 200 people playing online games.

Keywords: Tibetan Jiu chess · online game platform · board game · intangible cultural heritage

1 Introduction

Tibetan Jiu chess is a kind of folk chess with complicated and unique rules. It is based on the "square chess" of northwest China and was formed through the innovation and development of multi-ethnic cultures such as Han, Tibetan, Mongolian and Hui. It has dozens of variants and is both competitive and interesting [6]. Tibetan Jiu chess has been listed in the intangible cultural heritage catalogues of Qinghai Province, Sichuan Province, the Tibet Autonomous Region and at the national level in China. Since 2010, Sichuan, Tibet and other places have held national Tibetan Jiu chess competitions every year. In 2019, Tibetan Jiu chess was listed as one of the competition events in the annual University Computer Games Championship & National Computer Games Tournament in China. The rules of Tibetan Jiu chess are given in [10]. Tibetan Jiu chess is a complete-information game, and research on its game algorithms addresses a frontier hotspot in the fields of computer games and artificial intelligence, but such research is still in its infancy. At present, research on Tibetan Jiu chess lacks a digital chess platform, game records and expert knowledge. By developing a digital chess platform, we can collect game-playing data and extract expert knowledge. Based on Cocos Creator, an online game platform for Tibetan Jiu chess was developed. The platform successfully undertook the online competition task for Tibetan Jiu chess in the 2022 University Computer Games Championship & National Computer Games Tournament in China.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 23–30, 2023. https://doi.org/10.1007/978-981-99-6187-0_3


Fig. 1. Platform scenario flow.

2 Related Work

The famous AlphaGo trained a 13-layer policy network using 30 million positions from the KGS Go Server [8]. For chess, Lichess is the largest open-source chess platform in the world, with stable performance and rich functionality; it is well suited to playing chess and provides rich chess learning resources [3,7]. Microsoft's Suphx mahjong artificial intelligence system, whose training data comes from the Tenhou platform, has been tested on that platform and reached the level of ten dan [2]. Ludii is a system designed to play, evaluate and design a wide range of games [5]. Games are described as structured sets of ludemes (units of game-related information), and the Ludii system presents a new and intuitive approach to game design and game playing [9]. Polygames is a framework for zero learning [1], and an interface between Ludii and Polygames has been developed [4].

3 Overall Architecture Design of the Platform

The platform is developed based on the Cocos Creator engine, and the languages used are JavaScript and TypeScript. At the same time, the SDK of the Huawei Technologies Co., Ltd. online battle engine is embedded to help the platform realize the online battle function.

3.1 Overall Process

The Tibetan Jiu chess game platform consists of the following scenes: the Home scene, Hall scene, Room scene, RoomInfo scene, RoomList scene, Match scene and Game scene.


Fig. 2. On-line related architecture diagram.

The process is shown in Fig. 1: from the Home scene, the player enters the Hall scene. In the Hall scene, the player can practice alone, or enter the expert zone or novice zone according to his or her level, and then enter the Match scene from the corresponding partition. There are three ways to form a game: creating a room, joining a room, and quick matching. The player can choose to create a room and enter the RoomInfo scene, where the room name is set and the room is made public or private; after creation the player enters the Room scene and obtains the room ID for other players to search for. Private rooms can only be joined through the room ID. Alternatively, the player can choose to join a room from the Match scene and enter the RoomList scene, view the list of existing public rooms, and use a room ID to join private or public rooms. If quick match is selected in the Match scene, a room that is not yet full is matched first; if none exists, a new room is created. When both sides are ready in the Room scene, the room owner starts the game and both sides enter their respective Game scenes.

3.2 Online Correlation Architecture

The Huawei Technologies Co., Ltd. online battle engine SDK provides core classes that help developers build the online battle environment. The main core classes are Client, Room and Player: Client is the online battle client constructor class, Room is the online battle room management class, and Player is the online player constructor class. The schematic diagram of the online architecture is shown in Fig. 2. When the player opens the client in the browser, the program initializes the Client locally and calls its methods to create, match and join rooms.
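The quick-match rule described in Sect. 3.1 (join the first public room that is not full, otherwise create a new one) can be sketched as follows. The Room class here is a hypothetical stand-in for illustration, not the Huawei SDK's Room class:

```python
class Room:
    """Minimal stand-in for the SDK's room management class (hypothetical)."""
    CAPACITY = 2  # two players per Jiu chess game

    def __init__(self, room_id, private=False):
        self.room_id = room_id
        self.private = private
        self.players = []

    def is_full(self):
        return len(self.players) >= Room.CAPACITY

def quick_match(player, rooms):
    """Join the first public room that is not full; otherwise create a new room."""
    for room in rooms:
        if not room.private and not room.is_full():
            room.players.append(player)
            return room
    room = Room(room_id=len(rooms) + 1)
    room.players.append(player)
    rooms.append(room)
    return room
```

Private rooms are skipped by quick match, matching the rule that they can only be joined via their room ID.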


There are a series of listeners in the Room, the most important of which monitors the frame data sent by the clients. Frame data is the data-sending means and structure of frame-synchronization communication, which is elaborated in Sect. 4.3. In the room, players send corresponding frames for their operations at each stage, and these operations are rendered on the opponent's chessboard.
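A minimal sketch of this frame-synchronization scheme is given below; the field names loosely follow the frame types described in Sect. 4.3, while the JSON serialization and class names are assumptions for illustration:

```python
import json

class FrameSender:
    """Frame-synchronization sender sketch: keeps the last frame locally so
    it can be retransmitted if the opponent reports a lost packet."""

    def __init__(self, channel):
        self.channel = channel      # any object with a send(bytes) method
        self.last_frame = None
        self.seq = 0

    def send_frame(self, stage, **payload):
        """Build a frame for the given game stage, remember it, and send it."""
        self.seq += 1
        frame = {"seq": self.seq, "stage": stage, **payload}
        self.last_frame = frame
        self.channel.send(json.dumps(frame).encode())
        return frame

    def retransmit(self):
        """Resend the locally stored previous frame after packet loss."""
        if self.last_frame is not None:
            self.channel.send(json.dumps(self.last_frame).encode())
```

For example, `sender.send_frame("layout", x=4, y=3)` emits a layout-stage frame analogous to the Layoutframe described in Sect. 4.3.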

4 Implementation Process of Platform Tibetan Jiu Chess

4.1 Tibetan Jiu Chess Chessboard Representation Logic

The representation principle of the Tibetan Jiu chess chessboard is shown in Fig. 3.

– bg is a Sprite node serving as the background of the scene.
– board_bg is also a Sprite node, used as the chessboard background.
– board is an empty node, but it is critical because all the pieces are added to it. Its size should equal the size of the playable area on the chessboard.
– dot prefabs are also added to the board node, one prefab for each point on the chessboard. Each prefab carries a touch listener: when the player clicks a dot, a chess piece is placed at the corresponding position. The dot itself is transparent, so it is not visible.
– piece is the prefab for a chess piece. Piece prefabs are initialized in the layout stage.

The hLines and vLines constants store the numbers of horizontal and vertical lines on the chessboard. According to the size of the board node, the size occupied by each chess piece is calculated. Each point is mounted with a dot and a piece prefab: the dot is the carrier of the touch event, and the piece is the chess piece entity. Chessboard information is stored in boardPosArray, including the coordinates of each point and a number indicating whether a black or white piece occupies it.

4.2 Platform Game Scene

Figure 4 shows the platform's Game scene. The chessboard is located in the center of the interface with four buttons below it.

– jump capture: used to select the trajectory of chess pieces in the jump-capture sub-stage.
– choose complete: click this button to end the selection mode once the trajectory has been selected.
– adjust the board: adjusts the board in special circumstances, such as retracting a move.
– frame retransmission: if a frame is not transmitted successfully due to network problems, the frame is retransmitted.


Fig. 3. Schematic diagram of chessboard representation.


Fig. 4. Game scene.

The text box at the top left of the chessboard records the game notation in real time, and the upload sgf button at the top right uploads the notation of the current game to the server, storing it as an SGF-format file.

4.3 Frame Data Structure Design

Because frame-synchronous communication is used, different frame data structures must be designed for communication between players at each stage of the game.

– Firstframe: used to determine the black and white sides. The room owner determines whether he moves first through a random number and broadcasts the result to the opponent.
– Layoutframe: used in the layout stage; it contains the x and y coordinates of the placed piece and the current stage of the game.
– Chooseframe: used when a chessboard position is selected, including the x and y coordinates of the selected point, the current Stage of the game, and the Action to be executed.
– Moveframe: used when moving chess pieces. In addition to the above information, it contains the Coordinate of the target position.
– Complexframe: used for jump-capturing and square-forming in the battle sub-stage and in the flying sub-stage, where Chooselist[] is the coordinate array containing the moving trajectory of the chess piece, and Capture[] is the array storing the coordinates of the pieces to be captured.

In addition to the frame data necessary for playing chess, Fitframe is designed for adjusting the chessboard and retracting moves. In the different stages of frame-synchronization communication, the sub-functions of SendFrameFunction send the corresponding frame data according to the scenario. To avoid the influence of packet loss on the game, the previous frame is stored locally, and the lost frame is retransmitted if the other party did not receive it.

4.4 Logic Realization and Chess Notation Expression in Each Stage

After the players are assigned black and white, the first black and white pieces are placed in the center of the chessboard, and placement then spreads over the whole board. During placement, each touched point initializes a Piece, the picture of the corresponding piece is set, and the coordinates and piece type are recorded in boardPosArray. Each chess piece also has its own number, and the number has a direct conversion relationship with the piece coordinates. In the layout stage, the first letters W and B of the English words White and Black represent white and black; the number in the brackets that follow denotes the column of the move, and the letter denotes the row, counted from top to bottom. For example, B[4C] means that black places a piece at the third row of the fourth column.

Battle Stage. When the battle begins, the chess pieces at the two ends of a diagonal of the central square of the chessboard are first removed. The black side moves first, because the white side moved first in the preceding layout stage.

Jump-capture: when the adjacent point of a piece along a vertical (horizontal) line is an opponent's piece and the third point is blank, the player can jump over and capture the opponent's piece. Jump captures can be made singly or in combination. The Chinese pinyin abbreviation TC is used in the notation for jumping to capture an opponent's piece.

Chess door: there are 196 squares on the chessboard. When the four vertices of a smallest square are all occupied by one side's pieces, these four vertices form a chess door. Each time a chess door is formed, one of the opponent's pieces can be captured. Chess doors are divided into single and double chess doors. A double chess door is formed by joining two single chess doors together; each time a double chess door is formed, any two of the opponent's pieces can be captured. The Chinese pinyin abbreviation FC is used in the notation to represent capturing a piece by forming a square.
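The single chess-door rule above can be checked directly on the board array. The board encoding below (0 empty, 1 black, 2 white) is an assumption for illustration, not the platform's actual data layout:

```python
EMPTY, BLACK, WHITE = 0, 1, 2

def completes_chess_door(board, x, y, color):
    """Return True if the piece just placed at (x, y) completes at least one
    smallest square (chess door) of its own color.

    board is a 2D list indexed as board[row][col]; the placed piece can be a
    corner of up to four candidate 2x2 squares, so all four are checked.
    """
    rows, cols = len(board), len(board[0])
    for dx in (-1, 0):
        for dy in (-1, 0):
            corners = [(x + dx + i, y + dy + j) for i in (0, 1) for j in (0, 1)]
            if all(0 <= cx < rows and 0 <= cy < cols and board[cx][cy] == color
                   for cx, cy in corners):
                return True
    return False
```

A double chess door would then correspond to two adjacent 2×2 squares both passing this check.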


The coordinates of a chess piece are obtained from the touched dot, and the conversion between coordinates and piece numbers realizes the control of the moving and capturing rules through the quantitative relationship between the numbers.
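The board indexing described above (and in Sect. 4.1) can be sketched as follows. Since the board has 196 smallest squares, a 15 × 15 grid of intersections is assumed here, and the row-major numbering scheme is an illustrative assumption, not taken from the platform's source code:

```python
# Assumed board dimensions: 15 horizontal and 15 vertical lines, so each
# intersection (row, col) gets a unique piece number in row-major order.
H_LINES, V_LINES = 15, 15

def coord_to_number(row, col):
    """Map an intersection coordinate to its piece number."""
    return row * V_LINES + col

def number_to_coord(n):
    """Inverse mapping from a piece number back to (row, col)."""
    return divmod(n, V_LINES)

def layout_notation(color, row, col):
    """Layout-stage notation: color letter, 1-based column number, then the
    row letter counted from the top, e.g. B[4C] for the example in the text."""
    return f"{color}[{col + 1}{chr(ord('A') + row)}]"
```

With this numbering, horizontal neighbours differ by 1 and vertical neighbours by V_LINES, which is the kind of quantitative relationship the move and capture rules can exploit.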

5 Test Results and Actual Application of the Platform

5.1 Platform Test

About 200 people participated in the actual test of the platform, using it online to test its function and performance; all functions ran normally. The test participants consisted of 20 master's students and 80 undergraduate students of Minzu University of China, approximately 50 China University Computer Games Competition participants, and 50 members of the Tibetan Chess Association. Under the Windows operating system, mainstream browsers such as Edge and Google Chrome can run games stably, and the platform can support about 200 people playing player-versus-player games online at the same time. The Mac operating system is not yet supported.

5.2 Actual Use of the Platform

In the 2022 China University Computer Games Competition and China University Computer Games Championship, the platform served as the battle platform and successfully completed the competition task. After the competition, users running the Windows operating system with mainstream browsers such as Edge and Google Chrome can play Tibetan Jiu chess through the website "jiuchessmuc.tech".

6 Conclusion

Tibetan Jiu chess is a treasure of the traditional culture of ethnic minorities in China, both competitive and interesting. At present, the platform supports player-versus-player games; follow-up work plans to realize human-versus-AI and AI-versus-AI games. The research group will continue to improve the function and performance of the platform and contribute to the promotion of the intangible cultural heritage of Tibetan Jiu chess.

Acknowledgment. This work was supported in part by the National Natural Science Foundation of China under Grants 62276285 and 62236011, and in part by the Major Projects of the Social Science Foundation of China under Grant 20&ZD279.

References

1. Cazenave, T., et al.: Polygames: improved zero learning. ICGA J. 42(4), 244–256 (2020)


2. Li, J., Koyamada, S., et al.: Suphx: mastering mahjong with deep reinforcement learning. arXiv preprint arXiv:2003.13590 (2020)
3. McIlroy-Young, R., Wang, Y., Sen, S., Kleinberg, J., Anderson, A.: Detecting individual decision-making style: exploring behavioral stylometry in chess. Adv. Neural Inf. Process. Syst. 34, 24482–24497 (2021)
4. Mella, V., Browne, C., Teytaud, O., et al.: Deep learning for general game playing with Ludii and Polygames. ICGA J. 43(3), 146–161 (2021)
5. Piette, E., Soemers, D.J., Stephenson, M., Sironi, C.F., Winands, M.H., Browne, C.: Ludii – the ludemic general game system. arXiv preprint arXiv:1905.05013 (2019)
6. Qiang, L.: Research on the origin of "Jiu qi" of Tibetan chess. Tibetan Stud. 6, 5 (2017)
7. Sanjaya, R., Wang, J., Yang, Y.: Measuring the non-transitivity in chess. Algorithms 15(5), 152 (2022)
8. Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529(7587), 484–489 (2016)
9. Stephenson, M., Piette, E., Soemers, D.J., Browne, C.: An overview of the Ludii general game system. In: 2019 IEEE Conference on Games (CoG), pp. 1–2. IEEE (2019)
10. Wang, S., Wu, Q.: Tibetan Jiu chess game algorithm based on expert knowledge. In: Proceedings of the 2022 2nd International Conference on Control and Intelligent Robotics, pp. 842–848 (2022)

Review of Human Target Detection and Tracking Based on Multi-view Information Fusion Liuwang Wang(B) and Haojun Liu Electric Power Research Institute of State Grid Zhejiang Electric Power Co., Ltd. Research Institute, Hangzhou 310014, China [email protected]

Abstract. With the increasing availability and affordability of cameras, pedestrian detection and tracking technology has become widely used in applications such as intelligent monitoring, behavioral analysis, and traffic control. However, traditional single-view detection methods have limitations, including susceptibility to occlusion and a limited field of view. Multi-view detection and tracking methods have gained significant attention from both academia and industry, as they can overcome the limitations of single-view methods by leveraging complementary information from multiple views, resulting in improved algorithm performance. This paper presents a systematic review of multi-view information fusion for human target detection and tracking, focusing on three dimensions: datasets, multi-view detection, and multi-view tracking. Furthermore, it discusses potential applications of these methods in the power field and suggests future directions for development.

Keywords: Multi-View · Visual Detection · Visual Tracking · Deep Learning

1 Introduction

Pedestrian detection and tracking has always been a prominent research topic in computer vision, with wide-ranging applications in intelligent monitoring, behavioral analysis, and traffic control [39–41]. However, in the real world, pedestrians are often occluded by both other pedestrians and the environment, particularly in crowded scenes [42]. In such cases, single-camera-based methods for pedestrian detection and tracking often yield unsatisfactory results. With the increasing availability of camera equipment in various scenes, the same area can be observed by multiple cameras from different viewpoints. Thus, more and more researchers have explored the use of information from multiple viewpoints to improve the effectiveness of pedestrian detection and tracking, and have proposed a series of multi-view pedestrian detection and tracking methods that achieve excellent results both in theory and in practice. This paper aims to provide a comprehensive summary and clarification of the research developments in multi-view pedestrian detection and tracking, serving as a reference for future research in this field. The paper covers multi-view pedestrian detection, multi-view pedestrian tracking, commonly used datasets in multi-view pedestrian research, applications in the power field, and future research directions and trends.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 31–50, 2023. https://doi.org/10.1007/978-981-99-6187-0_4


2 Multi-view Pedestrian Detection

The task of multi-view pedestrian detection entails detecting pedestrians in a scene captured by multiple cameras from multiple, possibly overlapping, viewpoints. With the decreasing cost of camera hardware and the increasing demand from practical applications, multi-view pedestrian detection has attracted growing attention from researchers. In 2007, Fleuret et al. [1] first formulated the problem of multi-view pedestrian detection as the estimation of pedestrian occupancy at discretized grid locations, which marked the beginning of the multi-view pedestrian detection task. Since then, researchers have proposed numerous methods to solve this problem. These methods can be generally categorized into three types: traditional computer vision, monocular detection, and multi-view feature fusion.

2.1 Traditional Computer Vision

Prior to the rise of deep learning, research on multi-view pedestrian detection algorithms was primarily based on traditional computer vision theory. As depicted in Fig. 1, this kind of method first separates the foreground from the background by performing background removal on each camera-view image to obtain a binary image, and then conducts pedestrian detection with traditional computer vision methods based on the multiple viewpoints.

Fig. 1. Procedure of traditional computer vision theory.

Apart from formulating the problem of multi-view pedestrian detection, Fleuret et al. [1] also proposed the POM (probability of occupancy map) method based on a generative model. For views from different cameras at the same time instant, the POM method uses geometric constraints to estimate the probability of pedestrian occupancy for pedestrian detection; it was the first method applied to multi-view pedestrian detection.
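The geometric constraint underlying such methods can be illustrated with a planar homography that maps ground-plane coordinates into each camera view. The matrices below are made-up examples for illustration, not calibration data from any dataset:

```python
def apply_homography(H, point):
    """Project a 2D ground-plane point into image coordinates using a
    3x3 homography H (row-major nested lists), via homogeneous coordinates."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def ground_to_views(homographies, ground_point):
    """A pedestrian hypothesized at one ground-plane location induces one
    expected image position per camera -- the multi-view geometric constraint
    exploited by occupancy-map methods."""
    return [apply_homography(H, ground_point) for H in homographies]
```

Comparing these predicted image positions against the foreground in each binary image is what lets a single ground-plane hypothesis aggregate evidence from all views at once.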


Following the POM method, various multi-view pedestrian detection methods based on traditional computer vision theory have been proposed. For example, Golbabaee et al. [2] proposed the SCOOP method, which achieves pedestrian detection with a sparse model for binary motion detection maps, solved with a novel greedy algorithm based on set covering. Alahi et al. [3] cast multi-view pedestrian detection as a sparse linear inverse problem and regularized it by imposing a sparse occupancy vector. Ge et al. [4] considered both pedestrian detection and crowd counting, proposed a Bayesian approach that extends from single view to multiple views, and introduced multi-view geometric constraints to simultaneously estimate the number of people and detect their locations. To overcome occlusions in dense crowds, where cameras are placed at a high elevation and only people's heads are tracked, Eshel et al. [5] used head detection for pedestrian detection and utilized homography transformations to correlate synchronized frames from different camera views. Researchers in China have also studied multi-view pedestrian detection algorithms based on traditional computer vision theory. Xu et al. [62] used the probability of target existence at given locations for multi-view information fusion to obtain a posterior probability distribution for target prediction; target correspondence between multiple views was then achieved based on the structure of target windows. In follow-up research, Xu et al. [63] used the space field to reconstruct the pedestrian target in 3D space for detection and localization, and projection onto the view plane was utilized to realize target correspondence and target detection in each camera view. In general, methods based on traditional computer vision require preprocessing of camera images to remove the background and obtain the foreground image of pedestrians.
However, this preprocessing introduces noise that cannot be removed in subsequent procedures and may lose some of the original information. Consequently, these methods are often less effective in practical applications, and with the emergence of deep learning, traditional computer-vision-based methods are no longer mainstream in current research.

2.2 Monocular Detection

In recent years, the continuous development of deep learning has led to breakthroughs in computer vision tasks, including significant advances in pedestrian detection, and monocular detection algorithms based on deep learning have become a hotspot in this research field. As depicted in Fig. 2, monocular detection algorithms based on deep learning, such as Faster-RCNN [6] and YOLO [7], are first used on each camera view to detect pedestrians. The preliminary results are marked using bounding boxes or other representations and then correlated to obtain the final pedestrian detection results from the multiple viewpoints. Researchers have proposed different methods based on monocular detection to address various problems in multi-view pedestrian detection. To tackle the lack of single-view information caused by occlusion, Zhang [64] proposed a multi-view pedestrian detection method based on structural constraints. This method uses a target detection model based on divided block information to detect pedestrians and solves the issue of target association among multiple viewpoints through human-based block constraints.


Fig. 2. Procedure of methods based on the monocular detection algorithm.
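As a rough illustration of the pipeline in Fig. 2, the following sketch projects each view's bounding-box foot points onto the ground plane with a known homography and greedily clusters them. The function names, clustering radius, and greedy rule are illustrative assumptions, not the procedure of any specific cited method:

```python
import numpy as np

def bbox_foot_to_ground(bbox, H):
    """Project the bottom-center ("foot point") of a bounding box
    (x1, y1, x2, y2) onto the ground plane with homography H."""
    x1, y1, x2, y2 = bbox
    foot = np.array([(x1 + x2) / 2.0, y2, 1.0])
    g = H @ foot
    return g[:2] / g[2]

def fuse_detections(per_view_boxes, homographies, radius=0.5):
    """Greedily cluster ground-plane projections from all views;
    projections within `radius` metres are treated as one pedestrian."""
    points = [bbox_foot_to_ground(b, H)
              for boxes, H in zip(per_view_boxes, homographies)
              for b in boxes]
    clusters = []
    for p in points:
        for c in clusters:
            if np.linalg.norm(p - np.mean(c, axis=0)) < radius:
                c.append(p)
                break
        else:
            clusters.append([p])
    return [np.mean(c, axis=0) for c in clusters]
```

Published methods replace the greedy clustering with graph-based or probabilistic association, but the project-then-correlate structure is the same.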

Peng et al. [8] addressed the frequent problem of "phantoms" (i.e., fake pedestrians) produced by existing approaches by proposing a robust multi-camera pedestrian detection approach with a multi-view Bayesian network model (MvBN). This approach uses a monocular detection algorithm to obtain the pedestrian candidates in all views and their corresponding locations on the ground plane. A multi-view Bayesian network then models the occlusion relationships and effectively removes "fake pedestrians" by inferring the most likely occluded nodes in the MvBN. López-Cifuentes et al. [9] proposed an approach that globally combines pedestrian detections by leveraging automatically extracted scene context, addressing the problem that monocular-detection-based methods need to be fine-tuned to the target scene. The approach uses context information obtained via semantic segmentation to automatically generate a target area, and obtains detection results for each camera by solving a global optimization problem that maximizes the coherence of detections both in each 2D image and in the 3D scene. These detection results are projected to the ground plane and combined by means of a disconnected graph to obtain the global detection results. Also addressing the fine-tuning problem, Lima et al. [10] proposed a multi-camera pedestrian detection method that emphasizes generalizability and eliminates the need for training data from the target scene. The method uses key points of human body poses together with bounding boxes to jointly determine pedestrian locations, formulates multi-camera view fusion as a clique cover problem from graph theory, and solves it with a greedy algorithm. Yang et al. [11] proposed a multi-view detection framework that simultaneously outputs pedestrian locations and ID information, addressing the problem that state-of-the-art methods cannot output identity information, which leaves their detection results ambiguous.
This framework introduces image and ID models while using the monocular vision method to segment and recognize pedestrians. It also adopts the PIOM positioning algorithm, designed on the basis of a multidimensional Bayesian model, to obtain pedestrian locations on the ground plane along with their ID tags. Chen [65] proposed a multi-view


data fusion and balanced YOLOv3 detection method (MVBYOLO) to tackle the problem that pedestrians are challenging to detect in complex traffic scenes. This method utilizes a self-supervised multi-view feature point fusion network to fuse multi-view images, followed by pedestrian detection using an improved version of YOLOv3 called the BYOLO network [13]. The improved network balances multi-scale image features, significantly improving the detection accuracy of small pedestrian targets in complex scenes. The easy expansion, easy migration, and simple implementation of monocular-detection-based multi-view pedestrian detection make it a common foundation for solutions to multi-view downstream tasks. For instance, for the multi-view pedestrian tracking task, Ong et al. [12] used YOLOv3 [13] and Faster-RCNN [6] to detect pedestrians in each camera view and fed the detection results into their filtering algorithm MV-GLMB-OC for fusion. Zhu et al. [14] employed Mask-RCNN [15] to detect pedestrians in each camera view and used distance constraints to fuse the locations of multi-view pedestrians projected onto the ground plane. In research extending Lima et al. [10], Lyra et al. [16] recorded information such as pedestrian location, color histogram, and pedestrian identity at each moment. Nguyen et al. [17] pre-clustered camera-view monocular detection results on the ground plane using the 3D geometric constraint that detections of the same pedestrian in different views project to nearby locations. Xu et al. [18] correlated monocular detection results into short trajectory segments and used geometric proximity on the ground plane to correlate the trajectory segments of the same pedestrian across camera views. Regarding multi-view 3D pose estimation, Chen et al.
[19] utilized YOLOv3 [13] to detect pedestrians in the camera view, then employed a modified SPPE network [20] to estimate 2D joints, accomplishing multi-view matching of pedestrians via graph matching and foot-joint association. Vo et al. [21] constructed a descriptor framework of human appearance using CPM [22] to detect pedestrians and their key body points in the camera view. By estimating the camera time alignment, the same pedestrian in different views can be correlated under multi-view geometric constraints. Regarding edge computing for smart cities, Sun et al. [23] utilized a lightweight pedestrian detector on the camera side and cross-correlated pedestrian detection results from different cameras on an edge server. Compared with methods based on traditional computer vision theory, the monocular-detection-based multi-view pedestrian detection method offers remarkable improvements in detection accuracy and speed and provides high flexibility. However, monocular detection algorithms cannot achieve 100% accurate detection, and false detections in the camera view are inevitable, which somewhat limits the performance ceiling.

2.3 Multi-view Feature Fusion

In addition to multi-view pedestrian detection methods based on a monocular detection algorithm, researchers have also committed to multi-view feature fusion, another method that leverages deep learning for multi-view pedestrian detection.


As illustrated in Fig. 3, the first step is to use deep learning-based feature extractors to extract features from the image captured by each camera. Instead of conducting monocular detection on each view, the extracted features are projected onto the ground plane and fused there. Finally, a probability occupancy map for pedestrians in the scene is generated.

Fig. 3. Procedure of multi-view feature fusion.
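A minimal numpy sketch of the fusion step in Fig. 3, assuming the per-view feature maps have already been projected onto a shared ground-plane grid. Mean aggregation and the 1 × 1 prediction head are illustrative simplifications of what real models such as MVDet learn end to end:

```python
import numpy as np

def fuse_ground_features(view_features):
    """Aggregate per-view feature maps already warped onto a shared
    ground-plane grid: (V, H, W, C) -> (H, W, C). Averaging is a simple
    permutation-invariant choice, so camera order does not matter."""
    return np.mean(view_features, axis=0)

def occupancy_map(ground_features, w, b=0.0):
    """Stand-in for the prediction head: a 1x1 'convolution'
    (linear map over channels) followed by a sigmoid, giving a
    per-cell pedestrian occupancy probability."""
    logits = ground_features @ w + b
    return 1.0 / (1.0 + np.exp(-logits))
```

Real systems learn the feature extractor, projection, and head jointly; this sketch only fixes the data flow.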

Compared with monocular detection-based methods, the multi-view feature fusion method theoretically offers better detection performance by avoiding potential errors introduced by monocular detection. In addition, its end-to-end training and inference approach is more concise and elegant in form. Baqué et al. [25] improved on this paradigm by proposing Deep-Occlusion, which combines a convolutional neural network with a conditional random field (CRF). Its core idea is to use higher-order CRF terms to model potential occlusions and thereby improve stability. Deep-Occlusion employed VGG-16 [26] for feature extraction and used mean-field inference to generate probabilistic occupancy maps. DeepMCD, proposed by Chavdarova et al. [27], was the first multi-view feature fusion model in the multi-view pedestrian detection field to rely fully on deep learning. DeepMCD consists of a monocular detector with a CNN architecture, a multi-view architecture based on a multi-layer perceptron, NMS as a post-processing algorithm, and other components. Although DeepMCD can theoretically be trained end to end, in practice the model requires


fine-tuning of the pre-trained monocular detection network on a monocular pedestrian detection dataset, followed by training on the multi-view dataset. In 2020, Hou et al. [28] proposed MVDet, an end-to-end multi-view feature fusion pedestrian detection method that was a landmark in the research on multi-view fusion methods. MVDet directly utilizes a shared-weight ResNet-18 [25] to extract features from multiple camera views, followed by a perspective transformation that projects the feature maps onto the ground plane, where they are combined with a coordinate map for aggregation. The final probabilistic occupancy map is generated by leveraging large-kernel convolutions to aggregate spatial neighborhood information on the aggregated ground feature map. MVDet has a simple and efficient structure, achieves end-to-end training and inference, and greatly improves detection performance compared to previous methods. Subsequent research has built on the success of MVDet, resulting in various improvements. Hou et al. [29] proposed the MVDeTr architecture, which utilizes deformable attention [30] to design a shadow Transformer for multi-view feature aggregation, addressing the problem that convolution's translation invariance can interfere with multi-view feature aggregation. Liu et al. [31] introduced a dual-branch feature fusion structure based on a lightweight network design, which improves the feature extraction ability of the model. Song et al. [32] improved MVDet with stacked homography transformations, developing SHOT. This method projects the feature maps of different semantic parts of pedestrians to different horizontal heights to achieve more accurate projection from the camera view to the top view of the ground. To improve the generalization performance of MVDet, Vora et al.
[33] added a DropView regularization operation and applied permutation-invariant average pooling over the camera-view order for feature aggregation, resulting in a significant improvement in the model's generalization ability. In addition to research on improving the performance of multi-view pedestrian detection, there is also research investigating the applicability of these methods. Haoran et al. [34] proposed MVM3Det, a multi-view 3D object detection method, to address the limitation that multi-view detection only outputs the position of the detected object without its orientation. MVM3Det is composed of two parts: a position proposal network (PPN), which integrates the features from different viewpoints into consistent global features through feature orthogonal transformation to estimate position, and a multi-branch orientation network, which introduces feature viewpoint pooling to realize multi-angle orientation estimation. MVM3Det can therefore simultaneously output the 3D position and orientation of the detected object. To explore detection methods for different kinds of objects, Ma et al. [35] proposed voxelized 3D feature aggregation (VFA) for multi-view detection, which introduces oriented Gaussian encoding to match the different projection shapes of detection objects on the ground plane. This method not only facilitates pedestrian detection but also provides accurate detection of cows. Compared to methods based on monocular detection, pedestrian detection based on multi-view feature fusion may be less scalable and less convenient to use, but it offers clear benefits in detection performance and end-to-end training capability. As a result, it has been applied to the downstream tasks of multi-view pedestrian detection in various studies. To estimate the 3D poses of multiple people from multi-camera views, Tu et al. [36] utilized a Cuboid Proposal Network to locate all people in the scene based on


the mapped 2D pose heatmap estimated from each camera view. You et al. [37] proposed an end-to-end real-time online tracking framework for multi-view pedestrian tracking. In this method, a viewpoint-aware deep ground point network estimates the barycentric projection of pedestrians in each camera view, which is fused via 3D geometry. A lightweight Deep Glimpse Network then detects pedestrians on the fused ground heat map. For depth-sensing pedestrian detection, Wetzel et al. [38] designed a multi-view pedestrian detection network with a CNN architecture, in which foreground depth images segmented from multiple depth sensor views are input to predict the probability of pedestrian occupancy in the scene. The evolution of multi-view pedestrian detection has advanced from traditional computer vision theory to state-of-the-art deep learning methods. Currently, two mainstream research directions for multi-view pedestrian detection are being explored. The first is based on monocular detection, which, given its sound expansibility and transferability, focuses more on constructing solutions for the downstream tasks of multi-view pedestrian detection. The second is the multi-view feature fusion method, which has better performance and potential, and thus more research is devoted to improving its detection performance.

3 Multi-view Pedestrian Tracking

Visual tracking is a downstream task of detection. It usually entails detecting the target person in the image and then jointly estimating the target person's trajectory from the detection information in the image stream [43]. Trajectories can be either occupancy maps of pedestrians on the ground plane or trajectories in three-dimensional space. Monocular tracking has basic sensor-related limitations, such as a narrow monitoring range and susceptibility to occlusion. In contrast, the monitoring range of multi-view tracking can be extended by improving the sensors' configuration and deployment, and the complementary information in the data can be utilized to improve tracking performance, especially for handling occlusions in a single view. Consequently, multi-view personnel tracking has wide-ranging applications in sports event analysis, elderly care, on-site monitoring of power operations, and other fields. Multi-view tracking can be categorized in several ways. Based on how information is fused, it can be divided into three categories: centralized tracking, distributed tracking, and hybrid tracking. Centralized tracking fuses information from different viewpoints before tracking, while distributed tracking fuses information from different viewpoints after tracking; hybrid tracking combines the two. It can also be divided by the intersection of camera views, with overlapping and non-overlapping views as the two main classes. Additionally, it can be divided by data processing method into online processing algorithms and batch processing algorithms, the difference lying in whether multi-frame information is used: online processing algorithms provide real-time estimates upon receiving the current data, while batch processing algorithms process a set of data in one pass after it has been received. Generally speaking, while batch processing algorithms are more accurate, online processing algorithms are faster and better suited to application scenarios with strict real-time requirements.


The data processing method is the commonly used classification standard. The following is an overview of different multi-view tracking algorithms based on this standard.

3.1 Centralized Tracking

Centralized tracking integrates information from different viewpoints through a centralized processing center before performing tracking. The method can be further subdivided by the detection method used, namely tracking based on monocular detection and tracking based on multi-view detection, as shown in Fig. 4. In tracking algorithms based on monocular detection, images from different viewpoints are subjected to monocular detection, and the preliminary results are employed in multi-view tracking, usually outputting a trajectory in 3D space. Ong et al. [12] segmented multi-view target tracking into multiple sub-tasks such as track management, clutter removal, state estimation, and occlusion/false detection processing; established a 3D occlusion model; and proposed an online Bayesian multi-view multi-object filter algorithm. Track management includes initialization, termination, and trajectory identification of each target. State estimation measures the state vector of the target. The clutter removal and occlusion/false detection processing modules address tracking loss, track splitting, identity switching, and other problems caused by false positives and missed detections when targets are occluded or detectors fail to identify them. Xu et al. [18] transformed the multi-view multi-object tracking problem into a structural optimization problem described by a hierarchical composition model. The hierarchical composition describes the target trajectory as a composable hierarchy, including the shape, geometry, motion, and other attributes of the target.
A series of tracking sub-modules is adopted, each focusing on specific attributes, and the tracking results of attributes from the same or different viewpoints are finally aggregated. Li et al. [69] established a mutual mapping between a multi-view camera system and geospatial data, which yields not only the target outline but also spatial and temporal information such as geographic location. Other solutions include methods based on homography constraints [44, 45] and joint reconstruction and tracking [46]. Algorithms based on multi-view detection adopt the multi-view feature fusion methods discussed above for detection. You et al. [37] described the problem of tracking people from the output occupancy maps as a path-following problem and used optimization to associate the trajectories detected at the previous moment with the detection results at the current moment. Each trajectory is only allowed to connect with new detection points within a fixed radius, constrained by the maximum speed of a person and the frame rate of the tracking algorithm. Zhang [72] proposed a robust extreme learning machine model for multi-view feature fusion, which fuses color, texture, and other features of samples extracted in advance under different views to achieve robust target tracking in scenes with illumination changes and occlusions. The advantage of monocular detection is that the camera network can easily be redeployed, for example by adding or removing cameras or changing camera positions. It is therefore more robust to the failure of some sensors, and the algorithm complexity is


approximately linear in the number of cameras. However, multi-view detection is usually coupled to the cameras' deployment information; if the cameras are redeployed, the network needs to be retrained.

Fig. 4. Procedure of centralized tracking.
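As one concrete example of centralized tracking on detection outputs, the radius-gated association described above for the occupancy-map path-following formulation can be sketched as follows. The function name, greedy matching rule, and default values are illustrative assumptions, not the cited method's actual implementation:

```python
import numpy as np

def associate(tracks, detections, v_max=1.5, fps=30.0):
    """Greedy nearest-neighbour association: each track may only be
    extended by a detection within the distance a person could cover
    in one frame (v_max in m/s, camera frame rate in fps)."""
    radius = v_max / fps
    assigned = set()
    matches = {}
    for tid, last_pos in tracks.items():
        best, best_d = None, radius
        for j, det in enumerate(detections):
            if j in assigned:
                continue
            d = np.linalg.norm(np.asarray(det) - np.asarray(last_pos))
            if d <= best_d:
                best, best_d = j, d
        if best is not None:
            matches[tid] = best
            assigned.add(best)
    return matches  # unmatched detections may start new tracks
```

In practice the association is usually solved as a global optimization (e.g., Hungarian assignment) rather than greedily.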

3.2 Distributed Tracking

Centralized tracking places high demands on network transmission efficiency and on the computing power of the centralized processing center. In distributed tracking, by contrast, each camera node independently completes its tracking and then fuses information through the network to obtain the global trajectory of the target object, which effectively compensates for the shortcomings of centralized processing. Distributed tracking is therefore usually applied to scenes that are large in scale, lack overlapping areas among cameras, or cannot effectively utilize the geometric positional relationships among cameras. The procedure of distributed tracking is shown in Fig. 5. Feng [66] proposed a distributed data fusion framework based on Bayesian theory and the particle filter algorithm, which can effectively deal with occlusions and reduce data consumption. The whole system continues to work even when a camera node fails, so it is considerably robust. Qu et al. [67] used Kalman filtering to achieve track fusion among multiple viewpoints. Wu et al. [68] applied distributed multi-view tracking to an intelligent monitoring system and used an algorithm based on variational approximation to realize real-time target tracking under multiple views. Xu [70] proposed a design for a high-definition multi-view personnel target tracking system implemented on an FPGA-embedded device. Furthermore, when there are multiple targets in the tracking view, identity recognition or identity matching technology is used to distinguish and identify the target persons detected by different cameras and to merge the trajectories belonging to the same target person. Therefore, the essence of the multi-target distributed tracking algorithm is


monocular tracking plus identity re-identification. Monocular tracking is not the focus of this paper; more details can be found in [47]. Identity recognition refers to judging whether a queried person is present in a set of candidate pictures, where the query and the candidates are usually obtained from different camera viewpoints with no overlapping areas [48]. Traditional identification methods extract stable and sufficiently discriminative features from pictures, including shape, color, structure, and gait, and then perform feature matching [49–53]. Li et al. [71], for instance, regarded identity matching as a classification task, extracted HOG features for salient objects in different viewpoints, and constructed an SVM classifier to realize multi-view multi-target tracking. In recent years, the emergence of deep learning has greatly advanced the field of identity re-identification, including methods based on representation learning [54], metric learning [55], and local features [56], among others.
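A toy version of traditional appearance-based re-identification, matching color histograms by L1 distance. The descriptor and distance are illustrative choices, far simpler than the cited feature designs:

```python
import numpy as np

def color_histogram(image, bins=8):
    """Per-channel color histogram, L1-normalised, used as a simple
    appearance descriptor for person re-identification."""
    hist = np.concatenate([
        np.histogram(image[..., c], bins=bins, range=(0, 256))[0]
        for c in range(image.shape[-1])
    ]).astype(np.float64)
    return hist / hist.sum()

def reidentify(query, gallery):
    """Return the index of the gallery descriptor closest to the
    query descriptor under L1 distance."""
    dists = [np.abs(query - g).sum() for g in gallery]
    return int(np.argmin(dists))
```

Deep re-identification methods replace the handcrafted histogram with a learned embedding but keep the same match-by-distance structure.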

Fig. 5. Procedure of distributed tracking.
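Track fusion among viewpoints in the style of the Kalman-filter-based approaches above can be illustrated by inverse-covariance weighting of independent per-camera estimates (a standard minimum-variance combination; this sketch is not any cited system's actual implementation):

```python
import numpy as np

def fuse_tracks(estimates, covariances):
    """Fuse independent per-camera position estimates of the same
    target by inverse-covariance weighting: the information matrices
    add, and each estimate is weighted by its information."""
    infos = [np.linalg.inv(P) for P in covariances]
    fused_cov = np.linalg.inv(sum(infos))
    fused = fused_cov @ sum(I @ x for I, x in zip(infos, estimates))
    return fused, fused_cov
```

Note the fused covariance is smaller than any input covariance, reflecting the gain from combining views; correlated estimates would require techniques such as covariance intersection instead.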

3.3 Hybrid Tracking

When the network of sensors/cameras is large, centralized tracking can lead to excessive data transfer, especially transfer of invalid data, and increase the communication burden, because not every camera can observe the target at any given time. Distributed tracking, for its part, fails to take full advantage of information from multiple viewpoints. Some studies have therefore tried hybrid tracking, a combination of the two methods. Hybrid tracking divides the camera network into multiple regions, each with its own processing center; centralized tracking is performed within each region and distributed tracking among regions. Medeiros et al. [57] proposed a distributed target tracking system based on a clustered Kalman filter. When a target is detected, the cameras that can observe it communicate with each other to form a cluster and select a cluster center. The local measurements of the target acquired by each camera are sent to the cluster center, where the target position is estimated by Kalman filtering and sent to the base station periodically. Venkata [58] proposed a multi-camera target tracking method based on information filtering, which allows only a fixed


number of cameras with the most observation information to participate in information exchange through an information screening mechanism, in order to meet the bandwidth and energy constraints during transmission.
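The information-screening idea above, where only the cameras carrying the most observation information participate in the exchange, can be sketched as selecting the k lowest-variance measurements and fusing them by inverse-variance weighting (an illustrative scalar simplification of information filtering; the function name and defaults are assumptions):

```python
import numpy as np

def select_and_fuse(measurements, variances, k=2):
    """Keep only the k cameras whose measurements carry the most
    information (smallest variance), then fuse them by
    inverse-variance weighting, mimicking bandwidth-constrained
    information filtering."""
    order = np.argsort(variances)[:k]
    z = np.asarray(measurements, dtype=float)[order]
    w = 1.0 / np.asarray(variances, dtype=float)[order]
    return float(np.sum(w * z) / np.sum(w))
```

Dropping high-variance cameras costs little accuracy while bounding the number of messages exchanged, which is the point of the screening mechanism.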

4 Dataset

Conventional pedestrian detection datasets usually do not contain image data of the same area captured by multiple cameras from different viewpoints, so they cannot be used to evaluate the performance of multi-view pedestrian detection and tracking algorithms. This lack of datasets has made fair comparison between algorithms impossible, hindering the development of multi-view detection and tracking algorithms to a certain extent. To push the field forward, researchers have produced several datasets dedicated to multi-view detection and tracking in recent years.

PETS2009 [59]. PETS2009 is one of the earliest multi-view detection datasets; it consists of crowd monitoring data in outdoor scenes and is intended to help improve the safety of public areas. The PETS2009 dataset was recorded at the University of Reading in the UK and includes three sub-sequence scenes. Scene 1 (background) contains pedestrians and other moving objects, but the frames are not completely synchronized. Scene 2 (city center) contains people walking randomly in a sparse crowd. Scene 3 (regular flow) contains slow-moving people in a denser crowd. The PETS2009 dataset suffers from inaccurate data labels and incompletely overlapping camera views. Nevertheless, as one of the earliest multi-view detection datasets, it provided an important basis for the performance evaluation of early multi-view detection and tracking algorithms, greatly promoting the development of the field.

EPFL-RLC [27]. While proposing the first multi-view pedestrian detection model to rely fully on deep learning, EPFL researchers also produced a three-view dataset, EPFL-RLC, for algorithm evaluation. The EPFL-RLC dataset was recorded at the Swiss Federal Institute of Technology in Lausanne with three calibrated high-definition cameras at a frame rate of 60 frames per second and a resolution of 1920 × 1080 pixels.
In terms of data labeling, EPFL-RLC performs position coding by discretizing the ground plane of the target area into regular grid points and assigning an ID to each grid point. A pedestrian in the target area is modeled as a cylinder in 3D space centered on the grid point at their location, and the projected rectangle of the cylinder in each camera view is the pedestrian's annotation box in that view. The EPFL-RLC dataset has problems such as missing labels and a limited shooting view. Compared with PETS2009, however, it has higher joint calibration accuracy and better time synchronization, so it is considered one of the important benchmarks for evaluating multi-view detection and tracking algorithms.

Wildtrack [60]. Compared with previous multi-view datasets, Wildtrack has the advantages of large scale, high resolution, precise calibration, and high crowd density. The data in Wildtrack was recorded in front of the main building of the Swiss Federal Institute of Technology in Zurich without a pre-designed script, and thus has a high degree


of authenticity. Wildtrack was shot with 7 statically positioned high-definition cameras at a frame rate of 60 frames per second and a resolution of 1920 × 1080 pixels. The cameras are positioned above the average height of pedestrians, and their fields of view overlap substantially. Wildtrack uses accurate joint camera calibration, and the synchronization accuracy across the 7 cameras reaches 50 ms. The target area of Wildtrack covers a range of 12 m × 36 m. In terms of data labeling, similar to EPFL-RLC, Wildtrack grids the target field of view. The pedestrian annotation operation is defined as adjusting the position of a 3D cylinder on the ground plane so that its projected rectangle in all camera views overlaps the annotated pedestrian. Wildtrack has a total of 9518 multi-view pedestrian annotations, with an average of 23.8 pedestrians per frame, a high degree of crowding. Because of its high-quality data and annotations, Wildtrack has become one of the mainstream datasets for evaluating multi-view detection and tracking algorithms.

MultiviewX [29]. The data of previous multi-view detection datasets all come from real-world scenes, and labeling them incurs huge labor and time costs, which hinders the production of larger-scale datasets. Alongside the multi-view detection algorithm MVDet [28], Hou et al. synthesized a multi-view pedestrian detection dataset, MultiviewX, using a game engine, providing a new approach to producing multi-view detection datasets. The pedestrian models in MultiviewX come from PersonX. The target area of MultiviewX is 16 m × 25 m, covered by 6 cameras with overlapping fields of view, each outputting images at a resolution of 1080 × 1920. On average, 4.41 cameras cover the same location in the target area.
With default settings, the dataset has 40 pedestrians per frame, denser than the Wildtrack dataset. As a synthetic dataset, MultiviewX can automate data annotation programmatically, greatly reducing the labor and time costs of dataset production. However, the dataset also contains erroneous data in which pedestrians collide with one another, so it cannot fully simulate real scenes.

GMVD [33]. The previous multi-view detection datasets all contain single-scene data, with fixed camera positions and numbers, shooting scenes, lighting conditions, and weather, so they cannot accurately evaluate the generalization performance of an algorithm. The GMVD dataset effectively solves this problem. Like MultiviewX, GMVD is generated by a game engine. It contains 53 data sequences from 7 scenes, of which 6 scenes are used for training and 1 for testing. The number, parameters, and positions of cameras in GMVD change with the scene, and the diversity of the data is increased along various dimensions such as lighting conditions, weather, and pedestrian clothing. To simulate real conditions, GMVD also introduces synchronization errors between cameras. GMVD emphasizes the generalization performance of the model, effectively complementing the deficiencies of previous multi-view detection datasets and avoiding the loss of practicality caused by over-fitting to a specific dataset.


IHDD [61]. Most multi-view detection datasets are constructed in outdoor scenes, but outdoor and indoor scenes differ greatly in camera angle, personnel scale, and personnel shape, so these datasets are not fully suitable for evaluating indoor multi-view detection algorithms. To address the lack of datasets for indoor multi-view personnel detection and promote the development of indoor algorithms, researchers from Tianjin University established the indoor personnel detection dataset IHDD. The dataset was shot in corporate duty rooms and studios and includes a total of 14,665 images from different camera views, divided into training, validation, and test sets in a 3:1:1 ratio. Using the PASCAL VOC annotation format, a total of 17,854 human targets are annotated. IHDD is diverse in environmental lighting, personnel appearance, camera position, and personnel posture, and can more comprehensively reflect the actual performance of detection algorithms in indoor scenes.
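The ground-plane grid position coding used by datasets such as EPFL-RLC and Wildtrack can be illustrated as follows. The cell size and grid width here are made-up values for illustration, not the datasets' actual parameters:

```python
def pos_to_id(x, y, origin=(0.0, 0.0), cell=0.1, grid_w=120):
    """Encode a ground-plane position (metres) as a grid-cell ID by
    discretizing the target area into regular cells, row-major."""
    col = int((x - origin[0]) / cell)
    row = int((y - origin[1]) / cell)
    return row * grid_w + col

def id_to_pos(pid, origin=(0.0, 0.0), cell=0.1, grid_w=120):
    """Decode a grid-cell ID back to the position of its cell corner."""
    row, col = divmod(pid, grid_w)
    return (origin[0] + col * cell, origin[1] + row * cell)
```

Annotating a pedestrian then reduces to choosing one cell ID per person per frame, which is what makes evaluation by occupancy map possible.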

5 Applications and Prospects

5.1 Application in Electric Power Operation Scenes

The safe and stable operation of power production is closely tied to economic and social development. Dangerous factors such as high-voltage and high-altitude operations are common in power operation scenes, so it is particularly urgent to monitor operator safety in real time, for example by verifying operators' identities, monitoring their behavior, and recording their trajectories. Multi-view personnel target detection and tracking technology, with its high detection accuracy and strong resistance to occlusion, is of great significance to the development of efficient monitoring systems for power operation scenes and has therefore received growing attention. For the autonomous inspection of transmission line robots, Yao [73] used the multi-view method to reconstruct 3D contour models of obstacles, built an image library, and matched the models against captured obstacle pictures to achieve real-time and accurate obstacle identification. For helmet-wearing detection in construction environments, Lu et al. [74] proposed a hierarchical heterogeneous hybrid GNN model consisting of a basic sequence encoder, a hierarchical attention layer, and a final prediction layer. The method makes full use of information features in images from multiple viewpoints to achieve robust detection of workers' helmet-wearing status. Based on a multi-camera personnel target detection and tracking system, Xu [77] realized recognition of workers' clothing and behavior, with timely warnings and evidence storage, effectively safeguarding operators.
In addition, many studies have focused on the hardware design of multi-camera monitoring systems in combination with the needs of power operation scenes, extending the application possibilities of multi-view monitoring algorithms. For example, Xing [75] built a multi-view monitoring system with various devices such as fixed cameras and movable pan-tilt cameras mounted on UAVs. The system layout can be adjusted according to changes in the field environment and operational requirements, achieving

Review of Human Target Detection and Tracking


a more flexible and effective real-time monitoring system. Based on 5G communication technology, Chen et al. [76] used pulse transmitters to transmit signals and trigger image acquisition circuits, achieving hardware-synchronized multi-camera video acquisition that facilitates subsequent detection and tracking algorithms.

5.2 Prospects

We expect future research on multi-view detection and tracking to develop along three lines: datasets, multi-view detection, and multi-view tracking.

Multi-view detection: Given the current state of multi-view person detection research, deep learning-based methods, including those relying on monocular detection and those based on multi-view fusion, are likely to remain mainstream for the foreseeable future. Three primary research directions will be the focus of future work:

1. Methods better suited to downstream tasks: In many practical applications, multi-view pedestrian detection serves as a front-end module for downstream tasks such as tracking and pose estimation, so tailoring detection methods to specific downstream tasks is a research hotspot.
2. Methods focused on improving detection performance: There is still room to improve state-of-the-art multi-view pedestrian detection models. With the rapid development of deep learning, novel models are continuously being proposed, and developing new detection methods remains a hotspot for raising performance on multi-view pedestrian detection tasks.
3. Methods with stronger generalization: Current multi-view pedestrian detection methods, especially multi-view fusion methods, suffer to varying degrees from overfitting and weak generalization, and their performance in unseen scenes is often unsatisfactory. How to improve the generalization ability of multi-view detection methods is therefore also worth studying.

There are three hotspots for future research on multi-view tracking:

1. The algorithmic framework of multi-view tracking has matured, and attention should now turn to robustness in scenes with large scale variation, occlusion, and high crowd density.
2. Deep learning-based multi-view tracking algorithms have greatly advanced the field, but model generalization and real-world deployment still need improvement.
3. Current research assumes cameras deployed at fixed viewing angles; it will be valuable to explore greater camera freedom and realize active perception in multi-sensor networks.

Datasets: Collecting multi-view detection data in the real world is very difficult. The existing datasets are valuable to the academic community, but their scale and variety are limited, meaning that most datasets can only be used to


L. Wang and H. Liu

evaluate the generalization ability of the model, but cannot be used to train a model with strong generalization ability. The migration of multi-view detection tasks to practical applications remains to be explored. In addition, using unsupervised domain adaptation techniques to bridge the feature distribution gap between synthetic and real datasets is also one of the future exploration directions.

6 Conclusion

With the development of deep learning technology and the popularization of camera equipment, research on multi-view pedestrian detection and tracking has made remarkable progress in recent years. This paper comprehensively summarizes the research context of multi-view pedestrian detection and tracking, introduces the main methods, and describes the commonly used datasets in this field. Based on an analysis of existing progress, it also looks ahead to applications in power operation scenes and to future development trends. In short, although breakthroughs have been made in multi-view pedestrian detection and tracking, a mature and unified theoretical framework has not yet formed, and many problems and challenges remain.

References

1. Fleuret, F., Berclaz, J., Lengagne, R., et al.: Multicamera people tracking with a probabilistic occupancy map. IEEE Trans. Pattern Anal. Mach. Intell. 30(2), 267–282 (2007)
2. Golbabaee, M., Alahi, A., Vandergheynst, P.: SCOOP: a real-time sparsity driven people localization algorithm. J. Math. Imaging Vis. 48(1), 160–175 (2014)
3. Alahi, A., Jacques, L., Boursier, Y., et al.: Sparsity driven people localization with a heterogeneous network of cameras. J. Math. Imaging Vis. 41(1), 39–58 (2011)
4. Ge, W., Collins, R.T.: Crowd detection with a multiview sampler. In: Daniilidis, K., Maragos, P., Paragios, N. (eds.) ECCV 2010. LNCS, vol. 6315, pp. 324–337. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15555-0_24
5. Eshel, R., Moses, Y.: Homography based multiple camera detection and tracking of people in a dense crowd. In: 2008 IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–8. IEEE (2008)
6. Ren, S., He, K., Girshick, R., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems, vol. 28 (2015)
7. Redmon, J., Divvala, S., Girshick, R., et al.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
8. Peng, P., Tian, Y., Wang, Y., et al.: Robust multiple cameras pedestrian detection with multi-view Bayesian network. Pattern Recognit. 48(5), 1760–1772 (2015)
9. López-Cifuentes, A., Escudero-Vinolo, M., Bescós, J., et al.: Semantic driven multi-camera pedestrian detection. arXiv preprint arXiv:1812.10779 (2018)
10. Lima, J.P., Roberto, R., Figueiredo, L., et al.: Generalizable multi-camera 3D pedestrian detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1232–1240 (2021)


11. Yang, Y., Zhang, R., Wu, W., et al.: Multi-camera sports players 3D localization with identification reasoning. In: 2020 25th International Conference on Pattern Recognition (ICPR), pp. 4497–4504. IEEE (2021)
12. Ong, J., Vo, B.T., Vo, B.N., et al.: A Bayesian filter for multi-view 3D multi-object tracking with occlusion handling. arXiv preprint arXiv:2001.04118 (2020)
13. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767 (2018)
14. Zhu, C.: Multi-camera people detection and tracking (2019)
15. He, K., Gkioxari, G., Dollár, P., et al.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 (2017)
16. Lyra, V., De Andrade, I., Lima, J.P., et al.: Generalizable online 3D pedestrian tracking with multiple cameras (2022)
17. Nguyen, D.M.H., Henschel, R., Rosenhahn, B., et al.: LMGP: lifted multicut meets geometry projections for multi-camera multi-object tracking. arXiv preprint arXiv:2111.11892 (2021)
18. Xu, Y., Liu, X., Liu, Y., et al.: Multi-view people tracking via hierarchical trajectory composition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4256–4265 (2016)
19. Chen, H., Guo, P., Li, P., Lee, G.H., Chirikjian, G.: Multi-person 3D pose estimation in crowded scenes based on multi-view geometry. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12348, pp. 541–557. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58580-8_32
20. Li, J., Wang, C., Zhu, H., et al.: CrowdPose: efficient crowded scenes pose estimation and a new benchmark. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10863–10872 (2019)
21. Vo, M., Yumer, E., Sunkavalli, K., et al.: Self-supervised multi-view person association and its applications. IEEE Trans. Pattern Anal. Mach. Intell. 43(8), 2794–2808 (2020)
22. Cao, Z., Simon, T., Wei, S.E., et al.: Realtime multi-person 2D pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017)
23. Sun, H., Chen, Y., Aved, A., et al.: Collaborative multi-object tracking as an edge service using transfer learning. In: 2020 IEEE 22nd International Conference on High Performance Computing and Communications; IEEE 18th International Conference on Smart City; IEEE 6th International Conference on Data Science and Systems (HPCC/SmartCity/DSS), pp. 1112–1119. IEEE (2020)
24. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
25. Baqué, P., Fleuret, F., Fua, P.: Deep occlusion reasoning for multi-camera multi-target detection. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 271–279 (2017)
26. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
27. Chavdarova, T., Fleuret, F.: Deep multi-camera people detection. In: 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 848–853. IEEE (2017)
28. Hou, Y., Zheng, L., Gould, S.: Multiview detection with feature perspective transformation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12352, pp. 1–18. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58571-6_1
29. Hou, Y., Zheng, L.: Multiview detection with shadow transformer (and view-coherent data augmentation). In: Proceedings of the 29th ACM International Conference on Multimedia, pp. 1673–1682 (2021)


30. Zhu, X., Su, W., Lu, L., et al.: Deformable DETR: deformable transformers for end-to-end object detection. arXiv preprint arXiv:2010.04159 (2020)
31. Liu, Y., Han, C., Zhang, L., et al.: Pedestrian detection with multi-view convolution fusion algorithm. Entropy 24(2), 165 (2022)
32. Song, L., Wu, J., Yang, M., et al.: Stacked homography transformations for multi-view pedestrian detection. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 6049–6057 (2021)
33. Vora, J., Dutta, S., Karthik, S., et al.: Bringing generalization to deep multi-view detection. arXiv preprint arXiv:2109.12227 (2021)
34. Haoran, L., Zicheng, D., Mingjun, M., et al.: MVM3Det: a novel method for multi-view monocular 3D detection. arXiv preprint arXiv:2109.10473 (2021)
35. Ma, J., Tong, J., Wang, S., et al.: Voxelized 3D feature aggregation for multiview detection. arXiv preprint arXiv:2112.03471 (2021)
36. Tu, H., Wang, C., Zeng, W.: VoxelPose: towards multi-camera 3D human pose estimation in wild environment. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) ECCV 2020. LNCS, vol. 12346, pp. 197–212. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58452-8_12
37. You, Q., Jiang, H.: Real-time 3D deep multi-camera tracking. arXiv preprint arXiv:2003.11753 (2020)
38. Wetzel, J., Zeitvogel, S., Laubenheimer, A., et al.: People detection in a depth sensor network via multi-view CNNs trained on synthetic data. In: 2020 International Symposium on Electronics and Telecommunications (ISETC), pp. 1–4. IEEE (2020)
39. Brunetti, A., Buongiorno, D., Trotta, G.F., et al.: Computer vision and deep learning techniques for pedestrian detection and tracking: a survey. Neurocomputing 300, 17–33 (2018)
40. Sun, Z., Chen, J., Chao, L., et al.: A survey of multiple pedestrian tracking based on tracking-by-detection framework. IEEE Trans. Circuits Syst. Video Technol. 31(5), 1819–1833 (2020)
41. Kyrkou, C.: YOLOpeds: efficient real-time single-shot pedestrian detection for smart camera applications. IET Comput. Vis. 14(7), 417–425 (2020)
42. Gao, G., Gao, J., Liu, Q., et al.: CNN-based density estimation and crowd counting: a survey. arXiv preprint arXiv:2003.12783 (2020)
43. Fabio, P., Riccardo, M., Andrea, C.: Multi-target tracking on confidence maps: an application to people tracking. Comput. Vis. Image Underst. 117(10), 1257–1272 (2013)
44. Khan, S.M., Shah, M.: A multiview approach to tracking people in crowded scenes using a planar homography constraint. In: Leonardis, A., Bischof, H., Pinz, A. (eds.) ECCV 2006. LNCS, vol. 3954, pp. 133–146. Springer, Heidelberg (2006). https://doi.org/10.1007/11744085_11
45. Mustafa, A., Binlong, L., Caglayan, D., et al.: Dynamic subspace-based coordinated multicamera tracking. In: 2011 International Conference on Computer Vision, pp. 2462–2469. IEEE (2011)
46. Martin, H., Daniel, W., Gerhard, R.: Hypergraphs for joint multi-view reconstruction and multi-object tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3650–3657 (2013)
47. Liu, S., Liu, D., Srivastava, G., et al.: Overview and methods of correlation filter algorithms in object tracking. Complex Intell. Syst. 7(4), 1895–1917 (2021)
48. Wu, D., Zheng, S.J., Zhang, X.P., et al.: Deep learning-based methods for person reidentification: a comprehensive review. Neurocomputing 337, 354–371 (2019)
49. Bazzani, L., Cristani, M., Perina, A., et al.: Multiple-shot person re-identification by chromatic and epitomic analyses. Pattern Recognit. Lett. 33(7), 898–903 (2012)
50. Cheng, D.S., Cristani, M., Stoppa, M., et al.: Custom pictorial structures for re-identification. In: BMVC, vol. 1, no. 2, p. 6 (2011)


51. Bak, S., Corvee, E., Bremond, F., Thonnat, M.: Person re-identification using spatial covariance regions of human body parts. In: 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance, pp. 435–440. IEEE (2010)
52. Hamdoun, O., Moutarde, F., Stanciulescu, B., et al.: Person re-identification in multi-camera system by signature based on interest point descriptors collected on short video sequences. In: 2008 Second ACM/IEEE International Conference on Distributed Smart Cameras, pp. 1–6. IEEE (2008)
53. Wang, X., Doretto, G., Sebastian, T., et al.: Shape and appearance context modeling. In: 2007 IEEE 11th International Conference on Computer Vision, pp. 1–8. IEEE (2007)
54. Tetsu, M., Einoshin, S.: Person re-identification using CNN features learned from combination of attributes. In: 2016 23rd International Conference on Pattern Recognition (ICPR), pp. 2428–2433. IEEE (2016)
55. Ding, S., Lin, L., Wang, G., et al.: Deep feature learning with relative distance comparison for person re-identification. Pattern Recognit. 48(10), 2993–3003 (2015)
56. Sun, Y., Zheng, L., Yang, Y., Tian, Q., Wang, S.: Beyond part models: person retrieval with refined part pooling (and a strong convolutional baseline). In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 501–518. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_30
57. Medeiros, H., Park, J., Kak, A.: Distributed object tracking using a cluster-based Kalman filter in wireless camera networks. IEEE J. Sel. Top. Signal Process. 2(4), 448–463 (2008)
58. Bhuvana, V.P., Schranz, M., Regazzoni, C.S., et al.: Multi-camera object tracking using surprisal observations in visual sensor networks. EURASIP J. Adv. Signal Process. 2016(1), 1–14 (2016)
59. Ellis, A., Ferryman, J.: PETS2010 and PETS2009 evaluation of results using individual ground truthed single views. In: 2010 7th IEEE International Conference on Advanced Video and Signal Based Surveillance, pp. 135–142. IEEE (2010)
60. Chavdarova, T., Baqué, P., Bouquet, S., et al.: WILDTRACK: a multi-camera HD dataset for dense unscripted pedestrian detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5030–5039 (2018)
61. Wang, X., Zhang, W.: Multi-view indoor human detection neural network based on joint learning. Acta Optica Sinica 39(2), 0210002 (2019)
62. Xu, J., Ding, X., Wang, S.: Object occupancy probabilistic field based multi-view moving object detection and correspondence. Acta Automatica Sinica 05, 609–612 (2008)
63. Xu, J., Ding, X., Wang, S.: Detection, location and labeling under a multi-moving-person, multi-view set. Tsinghua Sci. Technol. 49(08), 1139–1143 (2009). https://doi.org/10.16511/j.cnki.qhdxxb.2009.08.025
64. Zhang, J., Guo, J., Liu, A.: Multi-view body structure-constrained human detection method. J. Tianjin Univ. 47(09), 753–758 (2014)
65. Chen, L., Ma, N., Pang, G., et al.: Research on multi-view data fusion and balanced YOLOv3 for pedestrian detection. CAAI Trans. Intell. Syst. 16(1), 57–65 (2021)
66. Feng, W., Hu, B., Yang, C., et al.: A distributed multi-view object tracking algorithm under the Bayesian framework. Acta Electronica Sinica 39(02), 315–321 (2011)
67. Qu, J., Shen, X., Ni, J.: Research and implementation of multi-view target detection and tracking technology. Technol. Innov. Appl. 184(36), 12–13 (2016)
68. Wu, F.: Research and application of multi-angle target tracking in intelligent monitoring system. Tianjin University (2012)
69. Li, J., Wei, J., Jiang, J.: Spatio-temporal information extraction method for dynamic targets in multiperspective surveillance video. Acta Geodaetica et Cartographica Sinica 51(03), 388–400 (2022)
70. Xu, H., Li, P.: Research on multi-view target tracking in intelligent video surveillance system and its implementation by FPGA. Mod. Electron. Tech. 39(17), 6–11 (2016)


71. Li, L., Yin, H., Xu, H., et al.: A robust multi-object detection and matching algorithm for multi-egocentric videos. CAAI Trans. Intell. Syst. 11(5), 619–626 (2016)
72. Zhang, J., Zhang, Y., Wei, Q.: Robust target tracking method based on multi-view features fusion. J. Comput. Aided Des. Comput. Graph. 30(11), 2108–2124 (2018)
73. Yao, G.: Research on obstacle detection on transmission lines based on view synthesis. Three Gorges University (2011)
74. Lu, J., Li, B., Lin, Y.: Robust detection for helmet wearing in multi-view operation based on hierarchical heterogeneous GNN. Guangdong Electr. Power 35(09), 19–26 (2022)
75. Xing, Y.: Multi-view real-time monitoring system of distribution network live working robot integrated with UAV: CN218343727U, 20 January 2023
76. Chen, S.: Multi-channel image and video stream synchronization and distributed processing method and system based on 5G environment: CN114339067A, 12 April 2022
77. Xu, D.: An early warning method for safe operation around high-voltage power lines: CN112217994A, 12 January 2021

Federated Topic Model and Model Pruning Based on Variational Autoencoder

Chengjie Ma1, Yawen Li2(B), Meiyu Liang1, and Ang Li1

1 Beijing Key Laboratory of Intelligent Communication Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876, China
2 School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China
[email protected]

Abstract. Topic modeling can uncover themes and patterns in large document collections, but when analysis spans multiple parties, data privacy becomes a key issue. Federated learning allows multiple parties to jointly train models while protecting privacy; however, the federated setting brings communication and performance challenges. To address these problems, this paper proposes a method that builds a federated topic model while ensuring the privacy of each node and uses neural network model pruning to accelerate the model. In addition, to handle the trade-off between training time and inference accuracy, two methods for determining the model pruning rate are proposed. The first prunes slowly throughout the entire training process; it accelerates training only to a limited extent but ensures that the pruned model achieves higher accuracy. The second reaches the target pruning rate quickly in the early stage of training and then continues training with a smaller model; it may lose more useful information but completes training faster. Experimental results show that the federated topic model pruning based on the variational autoencoder proposed in this paper can greatly accelerate training and inference speed while preserving the model's performance.

Keywords: Variational Autoencoder · Topic Model · Federated Learning

1 Introduction

Topic models [1–3], such as Latent Dirichlet Allocation (LDA) [4] and Probabilistic Latent Semantic Analysis (pLSA) [5], have been widely used for analyzing social event data. LDA has been the dominant approach to topic analysis over the past two decades. However, with the emergence of deep learning, neural topic models (NTMs) have gained popularity, as they use neural networks to learn the relationship between documents and topics, aiming for higher quality topic representations.

This work was supported by the National Natural Science Foundation of China (62192784, U22B2038, 62172056).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 51–60, 2023. https://doi.org/10.1007/978-981-99-6187-0_5


In fields like science, technology, and innovation (STI) document analysis, topic models are used to compare topics in funded projects from different institutions and identify research strengths. However, constructing topic models over multiple document collections is challenging due to privacy constraints, so developing topic models that respect these constraints is an important research area in STI analysis.

Federated learning (FL) [6–10] is a distributed framework in which central servers coordinate and facilitate protocols [11–13], privacy guarantees [14, 15], and node updates [16–20]. FL enables decentralized data utilization and privacy preservation by training models on local data while still producing a global model. Researchers have explored the application of FL to topic modeling, developing federated frameworks similar to LDA or NMF [21–24], as well as proposing federated generic topic models [25]. However, implementing federated algorithms in practice faces challenges. Each participant must send a complete update of the model parameters to the server in every global training round, resulting in significant communication overhead and increased training time. These challenges motivate the use of pruning techniques in federated topic models to speed up training and inference, enabling faster convergence and reduced resource requirements.

The main contributions of this paper include three aspects:

• We propose a federated topic model approach that applies model pruning: the topic model is first federated, and model pruning is then applied to the federated topic model.
• We introduce a new progressive pruning technique for training the federated topic model. During training, clients send the weights and accumulated gradients of neural network nodes to the server at regular intervals, and the server performs pruning operations on the neural topic model based on this information. Pruning significantly reduces communication and computation overhead on the client side.
• We propose two different methods to determine the model's pruning rate to cope with different requirements. The first, slow pruning during training, ensures higher accuracy and reduced inference time; the second, rapid initial pruning, achieves faster training at the cost of some useful information.
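The gradient-aware pruning idea in the second contribution can be sketched as follows. This is an illustrative sketch, not the paper's Algorithm 2: the function names, the flat parameter-list representation, and the mixing coefficient `alpha` in the score are assumptions for demonstration.

```python
def accumulate_gradients(acc, grads):
    """Client side: after each local batch, add the absolute gradient of
    every parameter to a running total, so that small-weight neurons with
    large recent gradients are recognised as still important."""
    return [a + abs(g) for a, g in zip(acc, grads)]


def prune_mask(weights, acc_grads, keep_fraction, alpha=0.5):
    """Server side: keep the top fraction of parameters ranked by a score
    mixing weight magnitude with accumulated gradient magnitude.
    `alpha` is a hypothetical mixing knob, not the paper's criterion."""
    scores = [alpha * abs(w) + (1 - alpha) * g
              for w, g in zip(weights, acc_grads)]
    k = max(1, int(keep_fraction * len(weights)))
    threshold = sorted(scores, reverse=True)[k - 1]
    return [s >= threshold for s in scores]
```

For example, a parameter with a tiny weight but a large accumulated gradient survives pruning under this score, whereas pure magnitude pruning [27] would remove it.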

2 Federated Topic Model and Model Pruning Based on Variational Autoencoder1

2.1 Federated Topic Model Based on Variational Autoencoder

In this paper, we propose Prune-FedAVITM, a federated topic model with model pruning based on the variational autoencoder [26]. The aim is to train topic models more efficiently under limited computational resources and network bandwidth.

1 Because of space limitations, we omit the relatively unimportant introduction and some experiments, and mainly discuss our approach and main work.


Algorithm 1 outlines the training process of the proposed federated topic model. In the vocabulary consensus phase, the server awaits word inputs from all nodes and merges them into a common vocabulary, which is then used to initialize a global model with weights W(0). Once all clients have received the shared vocabulary and the initial global model, federated training begins. During training, the FedAvg algorithm is employed: in each round, the server waits for all clients to send their locally trained neural models, aggregates them, and sends the updated global model parameters back to all clients. This process is repeated over multiple rounds until the global model converges or a predefined number of iterations is reached.

2.2 Progressive Pruning of the Federated Topic Model

This section explains why model pruning is incorporated into the federated topic model and how pruning is applied during training. The commonly used approach is magnitude pruning [27], which prunes neurons based on the absolute value of their weights. However, small-weight neurons may still become important during training, especially in later stages. To address this, our strategy tracks cumulative gradients during local training to identify potentially significant neurons. The pruning process is described in Algorithm 2.
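The vocabulary consensus and FedAvg steps described above can be sketched roughly as follows. Function names and the flat parameter-vector representation are illustrative assumptions, not the paper's implementation.

```python
def merge_vocabularies(client_vocabs):
    """Server-side vocabulary consensus: the union of all client
    vocabularies, mapped into one shared index space."""
    shared = sorted(set().union(*client_vocabs))
    return {word: idx for idx, word in enumerate(shared)}


def fedavg(client_weights, client_sizes):
    """Aggregate locally trained parameter vectors, weighting each
    client by its local dataset size (standard FedAvg)."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    global_w = [0.0] * dim
    for w, n in zip(client_weights, client_sizes):
        for j in range(dim):
            global_w[j] += (n / total) * w[j]
    return global_w
```

In a round, each client would train on its local batches, send its parameters, and receive `fedavg(...)` back as the new global model, repeating until convergence.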


During FL, progressive pruning is performed alongside the standard FedAvg process. Pruning occurs at the boundary between rounds, with the pruning interval being a multiple of the iterations per round. In each pruning step, an optimal set of remaining model parameters is identified; the model is then pruned and trained with the resulting mask until the next pruning step. The progressive pruning algorithm performs multiple pruning iterations to reach the target pruning rate. Two approaches are used in this paper to set the target pruning rate of each pruning iteration on the way to the final target rate.

The first approach retains as much information as possible by spreading the target pruning rate evenly over the whole training process. If the target pruning rate is set to 50%, then a pruning rate of 25% is reached when the model is halfway through training, and the final target of 50% is reached when training completes. We call the federated variational autoencoder topic model trained with this strategy NormalPrune-FedAVITM. Although its effect on training time is limited, it still significantly reduces model inference time.

The second strategy speeds up training by reaching the target pruning rate quickly in the early stage of training and then continuing to train the model at the smaller size. We call the model trained with this strategy FastPrune-FedAVITM. This approach can significantly reduce training time, but too much useful information may be lost during pruning, and the final accuracy of the model may suffer.
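The two schedules can be written as simple functions of the round index. This is a minimal sketch; the `warmup_rounds` parameter of the fast strategy is an assumption, since the paper does not state how early the target rate is reached.

```python
def normal_prune_schedule(round_idx, total_rounds, target_rate):
    """NormalPrune: spread the target pruning rate evenly over training,
    e.g. a 50% target is half reached (25%) at the halfway point."""
    return target_rate * min(round_idx / total_rounds, 1.0)


def fast_prune_schedule(round_idx, warmup_rounds, target_rate):
    """FastPrune: reach the full target rate during an early warmup
    phase, then train the smaller model for the remaining rounds."""
    return target_rate * min(round_idx / warmup_rounds, 1.0)
```

For instance, `normal_prune_schedule(200, 400, 0.5)` gives 0.25, matching the halfway example above, while the fast schedule already sits at 0.5 once the warmup phase ends.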


3 Experiments

3.1 Experimental Setup

Time of One FL Round. Model pruning in the federated topic model aims to improve training speed, so the model's effectiveness is measured by the time taken for training. However, measuring FL communication time directly in simulated FL experiments is difficult, so an approximation is used. When W is the set of remaining model parameters, define the (approximate) time of one FL round as

T(\mathbf{W}) := c + \sum_{j \in \mathbf{W}} t_j    (1)

where c ≥ 0 is a fixed constant capturing system overhead and t_j > 0 is the time corresponding to the j-th parameter component. Note that this is a linear function, which is sufficient according to [28]. The value of t_j can depend on the neural network layer; for all j belonging to the same layer, t_j is kept constant.

Setting of the Datasets. Experiments are conducted on the 20NG and DBLP datasets. Text and label data are used for training and testing, and each dataset is divided into training and test sets to evaluate model performance. The evaluation metric is classification accuracy on the test set.

Default Settings. In the federated learning experiments, the number of clients is set to 10, each client performs 10 local iterations, and the batch size used for training in each iteration is 64. Because DBLP and 20NewsGroup differ in data volume and number of categories, the number of rounds required for convergence also differs: training the unpruned federated topic model on the two datasets gives 400 and 2500 training rounds, respectively. It is demonstrated experimentally that the model converges under all pruning-rate settings. To verify the effect of the pruning algorithm on the federated topic model, experiments were conducted using the two pruning strategies with target densities of 0.8, 0.6, 0.4, 0.2, 0.1, and 0.01, and the results were recorded.
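Eq. (1) is straightforward to evaluate; the sketch below uses a dictionary of per-parameter times with arbitrary illustrative values for c and t_j (the paper does not specify concrete numbers).

```python
def round_time(remaining_params, t, c=0.05):
    """Approximate time of one FL round per Eq. (1):
    T(W) = c + sum of t_j over the remaining parameter components.
    `t` maps each parameter index j to its time t_j; in line with the
    text, t_j would be constant for all j within the same layer."""
    return c + sum(t[j] for j in remaining_params)
```

Pruning parameters shrinks the set W and hence shortens each round, which is exactly the quantity the pruning strategies aim to reduce.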

3.2 Experimental Results

Comparison of AVITM and FedAVITM(1.0). First, we compare the final accuracy, average topic diversity, and average topic coherence of two models on the datasets: AVITM in the centralized setting [26] and FedAVITM in the federated environment. In the federated environment, the models trained individually at each client can only be aggregated periodically, and the non-independent, non-identically distributed data across clients introduces a bias that reduces the final model accuracy. Nevertheless, our federated topic model can approach or even exceed AVITM in the centralized environment on DBLP and 20NewsGroup, which demonstrates the effectiveness of our FedAVITM (see Table 1).

Table 1. AVITM and FedAVITM(1.0) on the two datasets

| Model         | Accuracy (DBLP) | Accuracy (20NewsGroup) | Topic coherence (DBLP) | Topic coherence (20NewsGroup) | Topic diversity (DBLP) | Topic diversity (20NewsGroup) |
| AVITM         | 67.32           | 35.96                  | 0.0448                 | 0.00510                       | 0.665                  | 0.855                         |
| FedAVITM(1.0) | 64.15           | 43.40                  | 0.0558                 | 0.0674                        | 0.5215                 | 0.897                         |

Best Results at Different Target Densities on DBLP and 20NewsGroup After Model Pruning. Model pruning speeds up federated topic model training and inference, but it may also discard useful information and thus reduce final accuracy. In Fig. 1 and Fig. 2, we show the performance of Prune-FedAVITM on the DBLP and 20NewsGroup datasets under different target densities and pruning-speed strategies.

Fig. 1. Effect of different target density in DBLP.

Fig. 2. Effect of different target density in 20NewsGroup.

With larger target densities, the model accuracy is even better than without pruning. This may be similar to dropout [29]: proper pruning may make each hidden unit more robust. As the target density decreases and more neurons are pruned, the model accuracy gradually decreases. Prune_AVITM still achieves good accuracy when retaining only 10% of the neurons, but a significant decrease is observed when only 1% is retained. This indicates that the model can re-learn the lost information during subsequent training, and that removing less important neurons can improve the model; that only 10% of the neurons are needed to represent the main information highlights the redundancy in the network.
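A minimal sketch of magnitude-based neuron pruning of the kind discussed here (the L2-norm scoring criterion is a common choice assumed for illustration, not taken from the paper):

```python
import numpy as np

def prune_neurons(weight, target_density):
    """weight: (in_dim, out_dim) matrix of a hidden layer.
    Keeps the `target_density` fraction of output neurons whose incoming
    weights have the largest L2 norm; returns a 0/1 column mask."""
    out_dim = weight.shape[1]
    keep = max(1, int(round(out_dim * target_density)))
    scores = np.linalg.norm(weight, axis=0)   # one importance score per neuron
    kept = np.argsort(scores)[-keep:]         # indices of the strongest neurons
    mask = np.zeros(out_dim)
    mask[kept] = 1.0
    return mask

w = np.array([[1.0, 0.1, 3.0, 0.2],
              [2.0, 0.1, 1.0, 0.1]])
mask = prune_neurons(w, 0.5)                  # keep the 2 strongest of 4 neurons
```

The mask can be applied multiplicatively to activations, so the surviving neurons continue training and can absorb the pruned ones' role, as the dropout analogy suggests.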


The Normal method generally achieves higher model accuracy than the Fast method, but the Fast method performs better in the end at a target density of 0.4. A density around 0.4 may be an equilibrium point, where retaining sufficient information in the early stage leads to higher accuracy later. Compared with DBLP, the model on 20NewsGroup shows a more significant accuracy drop when retaining 10% of the neurons, which may be because 20NewsGroup is a more difficult task and needs more retained neurons to learn the necessary knowledge (Tables 2 and 3).

Table 2. Time to reach target accuracy in DBLP.

| Model                | 62% (Normal / Fast) | 64% (Normal / Fast) | 64.5% (Normal / Fast) |
| FedAVITM(1.0)        | 1077                | 2357                | –                     |
| Prune-FedAVITM(0.8)  | 586 / 647           | 710 / 1062          | 1002 / 1879           |
| Prune-FedAVITM(0.6)  | 537 / 667           | 816 / 883           | 1646 / 2178           |
| Prune-FedAVITM(0.4)  | 615 / 575           | 1072 / 636          | 1951 / 745            |
| Prune-FedAVITM(0.2)  | 650 / 496           | 783 / 1125          | – / –                 |
| Prune-FedAVITM(0.1)  | 638 / –             | 900 / –             | 1672 / –              |
| Prune-FedAVITM(0.01) | 609 / –             | 663 / –             | 716 / –               |

Table 3. Time to reach target accuracy in 20NewsGroup.

| Model                | 40% (Normal / Fast) | 42% (Normal / Fast) | 43% (Normal / Fast) |
| FedAVITM(1.0)        | 3406                | 7543                | 15338               |
| Prune-FedAVITM(0.8)  | 5801 / 4635         | 10629 / 7871        | 14242 / –           |
| Prune-FedAVITM(0.6)  | 3963 / 5271         | 7801 / 9534         | 13405 / –           |
| Prune-FedAVITM(0.4)  | 4986 / 3401         | 11150 / 5110        | – / 8381            |
| Prune-FedAVITM(0.2)  | 3772 / 3524         | 7482 / 7483         | 10189 / –           |
| Prune-FedAVITM(0.1)  | 3614 / 5379         | 6102 / –            | 8863 / –            |
| Prune-FedAVITM(0.01) | 3808 / –            | 6100 / –            | 7946 / –            |

Test Accuracy Versus Time Results of Two Datasets. Model pruning in the federated topic model aims to improve training speed, so the model's effectiveness is measured against the time used for training. In the DBLP dataset, the pruning experiments generally outperform FedAVITM without pruning, except when the target density is too low. In
the 20NewsGroup dataset, pruning to accelerate model training proves challenging, likely due to the task difficulty. However, when using the Normal method with a low target density, it still achieves better training acceleration before accuracy drops due to excessive pruning.

Fig. 3. Model size versus time results of all target densities experiments in DBLP(left) and 20NewsGroup(right).

Model Size and Time Spent Training the Model. Figure 3 shows the variation of model parameters over time under different settings. The Normal method generally achieves higher model accuracy, while the Fast method reduces training time by completing pruning early, resulting in a smaller model sooner. Table 4 presents the simulation time for training the model. For a target density of 0.2, the Normal method's training time is 72% and the Fast method's training time is 54% of that of the unpruned model. However, the time savings vary across experimental environments: considering only data access latency, the Normal method's training time is 60% and the Fast method's is 30% of the unpruned model's for a target density of 0.2.

Table 4. Time taken to complete training at various target densities.

| Model                | 20NewsGroup, 2500 rounds (Normal / Fast) | DBLP, 400 rounds (Normal / Fast) |
| FedAVITM             | 18109                                    | 2780                             |
| Prune-FedAVITM(0.8)  | 16934 / 16027                            | 2592 / 2532                      |
| Prune-FedAVITM(0.6)  | 15559 / 13946                            | 2403 / 2212                      |
| Prune-FedAVITM(0.4)  | 14462 / 11864                            | 2215 / 1893                      |
| Prune-FedAVITM(0.2)  | 13009 / 9782                             | 2062 / 1574                      |
| Prune-FedAVITM(0.1)  | 12491 / 8741                             | 1963 / 1418                      |
| Prune-FedAVITM(0.01) | 11903 / 7819                             | 1883 / 1294                      |
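The relative training times reported above can be reproduced directly from the 20NewsGroup column of Table 4:

```python
# Values from Table 4 (20NewsGroup column): unpruned FedAVITM, and the
# Normal/Fast pruned variants at target density 0.2.
unpruned, normal_02, fast_02 = 18109, 13009, 9782
print(f"{normal_02 / unpruned:.0%} {fast_02 / unpruned:.0%}")  # 72% 54%
```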


4 Conclusion

In this paper, we propose a new pruning method for the federated topic model. The method combines traditional topic models with neural network pruning techniques to construct and train federated neural topic models. In this approach, each node trains the model independently and periodically sends the model parameters to the server; the server then uses these parameters to prune the model, thereby speeding up the training process. Furthermore, two methods for determining the model pruning rate are proposed to handle the tradeoff between model training time and inference accuracy. The experimental results demonstrate that our proposed pruning method can significantly accelerate model training while maintaining satisfactory model performance. Future work includes two aspects. First, this paper only adapts AVITM to the federated environment; we will try to implement other topic models in the federated setting in the future. Second, this paper offers limited exploration of how the model pruning rate affects model accuracy, and we will try to explore its underlying principles.

References 1. Li, A., Li, Y., Shao, Y., et al.: Multi-view scholar clustering with dynamic interest tracking. IEEE Trans. Knowl. Data Eng., 1–14 (2023) 2. Kou, F., et al.: Hashtag recommendation based on multi-features of microblogs. J. Comput. Sci. Technol. 33, 711–726 (2018) 3. Li, Y., et al.: Heterogeneous latent topic discovery for semantic text mining. IEEE Trans. Knowl. Data Eng. 35(1), 533–544 (2021) 4. Blei, D.M., Ng, A.Y., Jordan, M.I.: Latent Dirichlet allocation. J. Mach. Learn. Res. 3, 993– 1022 (2003) 5. Srinivasarao, U., Sharaff, A.: Email thread sentiment sequence identification using PLSA clustering algorithm. Expert Syst. Appl. 193, 116475 (2022). https://doi.org/10.1016/j.eswa. 2021.116475 6. Li, Y., Li, W., Xue, Z.: Federated learning with stochastic quantization. Int. J. Intell. Syst. (2022) 7. Shao, Y., Huang, S., Li, Y., Miao, X., Cui, B., Chen, L.: Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs. VLDB J. 30(5), 769–797 (2021) 8. Meng, D., Jia, Y., Junping, D.: Consensus seeking via iterative learning for multi-agent systems with switching topologies and communication time-delays. Int. J. Robust Nonlinear Control 26(17), 3772–3790 (2016) 9. Li, Y., et al.: Predicting vehicle fuel consumption based on multi-view deep neural network. Neurocomputing 502, 140–147 (2022) 10. Xiao, S., Shao, Y., Li, Y., Yin, H., Shen, Y., Cui, B.: LECF: recommendation via learnable edge collaborative filtering. Science China Inf. Sci. 65(1), 1–15 (2022) 11. Guan, Z., Li, Y., Xue, Z., Liu, Y., Gao, H., Shao, Y.: Federated graph neural network for crossgraph node classification. In: 2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems, CCIS, pp. 418–422 (2021) 12. Li, Y., Yuan, Y., Wang, Y., Lian, X., Ma, Y., Wang, G.: Distributed multimodal path queries. IEEE Trans. Knowl. Data Eng. 34(7), 3196–3321 (2022)


13. Huang, J., et al.: HGAMN: heterogeneous graph attention matching network for multilingual POI retrieval at Baidu maps. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, KDD 2021, pp. 3032–3040 (2021) 14. Li, Y., Jiang, W., Yang, L., Tian, W.: On neural networks and learning systems for business computing. Neurocomputing 275(31), 1150–1159 (2018) 15. Li, W., Jia, Y., Junping, D.: Tobit Kalman filter with time-correlated multiplicative measurement noise. IET Control Theory Appl. 11(1), 122–128 (2017) 16. Koneˇcný, J., McMahan, H.B., Yu, F.X., Richtárik, P., Suresh, A.T., Bacon, D.: Federated learning: strategies for improving communication efficiency. arXiv, 30 October 2017. https:// doi.org/10.48550/arXiv.1610.05492 17. Koneˇcný, J., McMahan, H.B., Ramage, D., Richtárik, P.: Federated optimization: Distributed machine learning for on-device intelligence. arXiv Prepr. arXiv161002527 (2016) 18. Shokri, R., Shmatikov, V.: Privacy-preserving deep learning. In: Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1310–1321 (2015) 19. Lin, P., Jia, Y., Du, J., Yu, F.: Average consensus for networks of continuous-time agents with delayed information and jointly-connected topologies. In: 2009 American Control Conference, pp. 3884–3889 (2009) 20. Meng, D., Jia, Y., Du, J., et al.: Tracking algorithms for multiagent systems. IEEE Trans. Neural Netw. Learn. Syst. 24(10), 1660–1676 (2013) 21. Li, A., Du, J., Kou, F., et al.: Scientific and technological information oriented semanticsadversarial and media-adversarial cross-media retrieval. arXiv preprint arXiv:2203.08615 (2022) 22. Wei, X., Du, J., Liang, M., et al.: Boosting deep attribute learning via support vector regression for fast moving crowd counting. Pattern Recognit. Lett. 119, 12–23 (2019) 23. Si, S., Wang, J., Zhang, R., et al.: Federated non-negative matrix factorization for short texts topic modeling with mutual information. 
In: IJCNN 2022, pp. 1–7 (2022) 24. Wang, Y., Tong, Y., Shi, D.: Federated latent Dirichlet allocation: a local differential privacy based framework. In: AAAI 2020, pp. 6283–6290 (2020) 25. Shi, Y., Tong, Y., Su, Z., Jiang, D., Zhou, Z., Zhang, W.: Federated topic discovery: a semantic consistent approach. IEEE Intell. Syst. 36(5), 96–103 (2020) 26. Srivastava, A., Sutton, C.: Autoencoding variational inference for topic models. Statistics 1050, 4 (2017) 27. Han, S., Mao, H., Dally, W.J.: Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv, 15 February 2016 28. Jiang, Y., et al.: Model pruning enables efficient federated learning on edge devices. IEEE Trans. Neural Netw. Learn. Syst., 1–13 (2022). https://doi.org/10.1109/TNNLS.2022.316 6101 29. Srivastava, N., et al.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

An Anti-interference Mechanical Fault Diagnosis Method Based on CNN and Attention Mechanism

Zhen-Jun Zhang and Ying-Yuan Liu(B)

Shanghai Normal University, 100 Haisi Road, Shanghai 200234, China
[email protected]

Abstract. In order to solve the problem of low accuracy of traditional fault diagnosis methods under practical working conditions and noisy environments, a convolutional neural network (CNN) fault diagnosis method with high anti-interference ability based on the attention mechanism is proposed. In this method, CNN and Long Short-Term Memory (LSTM) networks are used to learn the spatial and temporal features of the data, and the attention mechanism is selected to enhance important features, improving the resistance of the fault diagnosis model to interference. Simulation experiments on the bearing fault dataset of Case Western Reserve University (CWRU) were conducted for verification. Results show that the fault diagnosis method proposed in this paper achieves an average diagnostic accuracy of 97.55% on multiple load datasets and an average accuracy of 96.58% on noisy data. Furthermore, the t-SNE algorithm was used to visualize the output of every layer to better understand the proposed method during the fault diagnosis process.

Keywords: Mechanical Fault Diagnosis · CNN · LSTM · Attention Mechanism

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 61–68, 2023. https://doi.org/10.1007/978-981-99-6187-0_6

1 Introduction

The operational status of rotating machinery components is critical for the reliable operation of the entire unit. In the event of a failure, not only can the performance of the rotating machinery decline, but significant economic losses can result and the safety of operators can even be threatened. In the practical working environment of rotating machinery, noise interference is inevitable, and the obtained vibration signals are easily contaminated by noise [1]. Therefore, establishing an accurate noise-resistant fault diagnosis method, accurately identifying vibration signals in a noisy environment, and achieving timely warning of mechanical faults are of great significance for ensuring the stability and reliability of mechanical equipment operation. Traditional mechanical fault diagnosis methods usually rely on the experience and skills of professional maintenance personnel, which requires substantial time and manpower [2]. With the continuous development of deep learning technology, deep learning methods based on Convolutional Neural Networks (CNN) have gradually emerged in the field of fault diagnosis due to their powerful feature extraction and classification
capabilities, as well as their independence from expert experience [3, 4]. More and more researchers have begun to explore CNN-based mechanical fault diagnosis methods and have made some research progress [5]. Choudhary et al. [6] used an infrared camera to collect fault images of rolling element bearings and diagnosed them using a CNN based on LeNet-5, which can effectively distinguish different types of faults. Chen et al. [7] diagnosed rolling bearings using a multi-scale convolutional neural network with feature alignment (MSCNN-FA), achieving good results. In addition, some researchers have combined CNN with other technologies, such as wavelet analysis and time-frequency analysis, to further improve the accuracy and robustness of mechanical fault diagnosis [8–11]. However, there are still some problems with existing CNN-based mechanical fault diagnosis methods, which cannot handle noise interference in practical environments well. Currently, research on artificial intelligence technology in the fault diagnosis field mainly focuses on how to combine signal processing techniques with neural networks to improve the accuracy and robustness of diagnostic models. In particular, for fault diagnosis in noisy environments, filtering is often used to remove noise components from the original vibration signal, which is then combined with classification algorithms to achieve better diagnostic results [12, 13]. In recent years, many scholars have proposed anti-interference diagnostic models that can be directly integrated into deep learning. Such models do not require preprocessing of the original data and can achieve end-to-end fault diagnosis directly. For example, some researchers have integrated attention mechanisms into convolutional neural networks to improve the model's anti-interference ability [14]. Attention mechanisms can automatically focus on the key information of data signals and prevent important information from being covered by noise.
However, research is still in its early stages, and how to deeply integrate attention mechanisms with CNN networks to further improve the accuracy and generalization ability of diagnostic models still needs to be studied. At the same time, there is still a need for further exploration of the principles of how attention can enhance network performance. Therefore, this article aims to propose an effective fault diagnosis method that can effectively deal with noise interference, and to analyze and test the generalization ability of this method in detail.

2 Fundamental Theory

2.1 Convolutional Neural Network

CNN is used to process data with grid-like structure, such as audio, images, and text. It mainly consists of three elements: the convolutional layer, the pooling layer, and the fully connected layer.

2.2 Long Short-Term Memory Network

The core of LSTM is the "cell state", which uses three types of gate structures (forget gate, input gate, and output gate) to control feature selection and solve the problem of failing to retain long-term memory. The neuron structure is shown in Fig. 1.


Fig. 1. LSTM neuron structure.
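The gate structure described above can be sketched as a single NumPy LSTM step (the weights are random placeholders, purely for illustration of the shapes and gating):

```python
import numpy as np

def lstm_step(x, h, c, W, U, b):
    """One LSTM step. W: (4k, d), U: (4k, k), b: (4k,).
    Gate order in the stacked weights: forget, input, output, candidate."""
    k = h.shape[0]
    z = W @ x + U @ h + b
    f = 1 / (1 + np.exp(-z[:k]))          # forget gate: what memory to drop
    i = 1 / (1 + np.exp(-z[k:2*k]))       # input gate: what new info to write
    o = 1 / (1 + np.exp(-z[2*k:3*k]))     # output gate: what memory to expose
    g = np.tanh(z[3*k:])                  # candidate cell state
    c_new = f * c + i * g                 # cell state keeps long-term memory
    h_new = o * np.tanh(c_new)            # hidden state is the filtered memory
    return h_new, c_new

rng = np.random.default_rng(0)
d, k = 3, 4                               # toy input and hidden sizes
h, c = np.zeros(k), np.zeros(k)
h, c = lstm_step(rng.normal(size=d), h, c,
                 rng.normal(size=(4*k, d)), rng.normal(size=(4*k, k)),
                 np.zeros(4*k))
```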

2.3 Attention Mechanism

The attention mechanism calculates the similarity between each position in the input sequence and the current output position, converts the similarities into weights, and then multiplies the weights by the feature vectors at each position in the input sequence. The attention mechanism in this article consists of three steps [15]: first, calculate the similarity; second, normalize the attention values to obtain the weight coefficients; finally, weight the features with the coefficients and sum them up.

3 Anti-interference Fault Diagnosis Model

3.1 Network Structure

The structure of the model proposed in this work is shown in Fig. 2; it mainly includes two convolution blocks, two LSTM layers, and one attention layer. The two convolution blocks extract spatial features of the one-dimensional vibration signals, and each block consists of one convolutional layer, one batch normalization layer, and one max pooling layer. The hidden state of each time step of the second LSTM layer is input into the attention layer, and the attention mechanism calculates the attention value of the hidden state at each time step. The attention value of the second LSTM layer at time step t is denoted r_t, as shown in Eq. (1). Softmax is then used to normalize the attention values and obtain the attention weight coefficients, as shown in Eq. (2). Finally, the hidden states at each time step are summed with the weight coefficients to obtain the final output vector, which is further processed using Softmax.

r_t = W^T tanh(h_t^2)    (1)

where h_t^2 represents the hidden state of the second-layer LSTM at time t, W represents the parameters that the model needs to update, and W^T is its transpose.

a_t = exp(r_t) / Σ_{j=1}^{n} exp(r_j)    (2)


where j indexes the attention values and n is the number of time steps.
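Equations (1) and (2), followed by the weighted sum of hidden states, can be sketched in NumPy (W and the hidden states here are random placeholders):

```python
import numpy as np

def attention_pool(H, w):
    """H: (T, k) hidden states of the second LSTM layer; w: (k,) learned vector.
    Implements Eqs. (1)-(2): r_t = w^T tanh(h_t), a = softmax(r),
    then returns the attention-weighted sum of the hidden states."""
    r = np.tanh(H) @ w            # Eq. (1): one attention value per time step
    a = np.exp(r - r.max())       # numerically stable softmax
    a = a / a.sum()               # Eq. (2): attention weight coefficients
    return a @ H                  # weighted sum over time steps

rng = np.random.default_rng(1)
H = rng.normal(size=(64, 64))     # 64 time steps, hidden_size 64 (as in Table 1)
out = attention_pool(H, rng.normal(size=64))
```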

Fig. 2. Network structure of the proposed model.

The structural parameters of the model in this article are shown in Table 1.

Table 1. Model structure parameters.

| ID | Layer name    | Parameters                                     | Output size |
| 0  | Input         | /                                              | 1024 × 1    |
| 1  | CNN           | kernel_size = 4, stride = 2, out_channels = 32 | 512 × 32    |
| 2  | Normalization | /                                              | 512 × 32    |
| 3  | Pooling       | kernel_size = 2, stride = 2                    | 256 × 32    |
| 4  | CNN           | kernel_size = 4, stride = 2, out_channels = 64 | 128 × 64    |
| 5  | Normalization | /                                              | 128 × 64    |
| 6  | Pooling       | kernel_size = 2, stride = 2                    | 64 × 64     |
| 7  | LSTM          | hidden_size = 64, num_layers = 2               | 64 × 64     |
| 8  | Attention     | /                                              | 64 × 1      |
| 9  | Softmax       | /                                              | F × 1       |

4 Experimental Validation and Result Analysis

4.1 Dataset Description and Experimental Process

The detailed information of the dataset is given in Table 2. The test bearing speeds are 1797 r/min, 1772 r/min, 1750 r/min, and 1730 r/min, corresponding to loads of 0 hp, 1 hp, 2 hp, and 3 hp. The vibration data of the bearing are collected at a sampling frequency of 12000 Hz.


In order to obtain a large amount of sample data, the CWRU bearing dataset at every working condition was sliced into segments. The slicing principle is shown in Eq. (3):

M ≥ m · (60 q / z)    (3)

where M is the number of points contained in each segmented signal; m is the number of bearing rotation cycles, here taken as 2.5; z is the rotation speed of the pump or bearing in r/min; q is the sampling frequency in Hz.

Table 2. Dataset partitioning information.

| Label | Fault category   | Fault location    | Fault size | Train | Test |
| 0     | Normal           | Normal            | Normal     | 30    | 29   |
| 1     | 0.007-Ball       | Ball damage       | 0.18 mm    | 30    | 29   |
| 2     | 0.014-Ball       | Ball damage       | 0.36 mm    | 30    | 29   |
| 3     | 0.021-Ball       | Ball damage       | 0.53 mm    | 30    | 29   |
| 4     | 0.007-InnerRace  | InnerRace damage  | 0.18 mm    | 30    | 29   |
| 5     | 0.014-InnerRace  | InnerRace damage  | 0.36 mm    | 30    | 29   |
| 6     | 0.021-InnerRace  | InnerRace damage  | 0.53 mm    | 30    | 29   |
| 7     | 0.007-OuterRace6 | OuterRace6 damage | 0.18 mm    | 30    | 29   |
| 8     | 0.014-OuterRace6 | OuterRace6 damage | 0.36 mm    | 30    | 29   |
| 9     | 0.021-OuterRace6 | OuterRace6 damage | 0.53 mm    | 30    | 29   |
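Equation (3) can be evaluated numerically for the settings given here (m = 2.5 cycles, q = 12000 Hz, and the four bearing speeds):

```python
m, q = 2.5, 12000                    # rotation cycles per segment, sampling rate (Hz)
for z in (1797, 1772, 1750, 1730):   # bearing speeds (r/min)
    M_min = m * 60 * q / z           # Eq. (3): minimum points per segment
    print(z, round(M_min))           # 1002, 1016, 1029, 1040 points respectively
```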

To better validate the reliability of the proposed model structure, comparative experiments were conducted with the CNN+LSTM and CNN+Attention methods, each of which removes one layer from the proposed method while keeping the other structures unchanged.

4.2 Diagnosis Results of Faults Under Fixed Load

The bearing data of 0 hp, 1 hp, 2 hp, and 3 hp were used for fault diagnosis experiments without noise. The diagnosis results of the three methods are shown in Table 3. It can be seen that the proposed method performs better under all four loads.

Table 3. Results of fixed-load fault diagnosis.

| Method          | 0 hp   | 1 hp   | 2 hp   | 3 hp   | Average |
| CNN+Attention   | 0.7172 | 0.6569 | 0.7152 | 0.8352 | 0.7311  |
| CNN+LSTM        | 0.9290 | 0.9793 | 0.9434 | 0.9890 | 0.9602  |
| Proposed method | 0.9503 | 0.9952 | 0.9662 | 0.9903 | 0.9755  |

4.3 Fault Diagnosis Results Under Noise Interference Conditions

To evaluate the model's anti-interference ability, noise with signal-to-noise ratios of 10–20 was added to the 3 hp dataset from the CWRU. The signal-to-noise ratio is defined as in Eq. (4), where PS is the signal power and PN is the noise power:

SNR = 10 lg(PS / PN)    (4)

When the noisy data were used for testing, the diagnosis results of the three methods were as shown in Table 4. As can be seen from the table, under noise interference the diagnostic reliability of the proposed method is still the highest, with an average accuracy 16.22% higher than that of CNN+Attention and 1.51% higher than that of CNN+LSTM. Meanwhile, at a signal-to-noise ratio of 10, the accuracy of the fault diagnosis method provided in reference [16] is 0.9806, which is lower than that of the method proposed in this article.
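Noise injection at a prescribed SNR, inverting Eq. (4), can be sketched as follows (a stand-in sine signal is used, and white Gaussian noise is assumed since the paper does not specify the noise model):

```python
import numpy as np

def add_noise(signal, snr_db, seed=0):
    """Add white Gaussian noise so that 10*lg(Ps/Pn) equals snr_db."""
    rng = np.random.default_rng(seed)
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10 ** (snr_db / 10))   # invert Eq. (4) for noise power
    return signal + rng.normal(scale=np.sqrt(p_noise), size=signal.shape)

x = np.sin(np.linspace(0, 100, 1024))            # stand-in vibration segment
noisy = add_noise(x, snr_db=10)
```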

Table 4. Fault diagnosis results under different signal-to-noise ratio (SNR) conditions.

| Method          | SNR_10 | SNR_12 | SNR_14 | SNR_16 | SNR_18 | SNR_20 |
| CNN+Attention   | 0.8172 | 0.7848 | 0.8234 | 0.82   | 0.8545 | 0.7979 |
| CNN+LSTM        | 0.9497 | 0.9559 | 0.969  | 0.9862 | 0.9722 | 0.9476 |
| Proposed method | 0.9814 | 0.9745 | 0.9717 | 0.9972 | 0.9731 | 0.9733 |

As shown in Fig. 3, in order to better understand the role of each layer of the proposed method in the fault diagnosis process, the t-SNE algorithm was used to visualize the output of the model's layers during testing on CWRU bearing data with a signal-to-noise ratio of 10. Figure 3(a) shows the visualization of the original signal after dimensionality reduction, which indicates that the original signals are mixed together without a regular distribution. Figure 3(b) shows the t-SNE visualization after the two CNN blocks extract signal features: label 1 is completely separated from the other samples, and label 7 is separated but not aggregated. Figure 3(c) shows the t-SNE visualization after feature extraction by the model's LSTM layers. Most samples are completely separated, but there are still some overlaps between labels 2 and 6, 0 and 8, and 5 and 9, indicating that some faults still cannot be accurately identified. Figure 3(d) shows the t-SNE visualization after feature extraction by the model's attention layer: labels 2 and 6, and 0 and 8, are now completely separated, and the overlap between 5 and 9 is reduced, which shows that the attention layer further improves the model's fault diagnosis ability and keeps the model reliable under noise interference. In summary, compared with CNN+Attention and CNN+LSTM, the method proposed in this paper has stronger fault diagnosis ability in noisy environments.


Fig. 3. t-SNE visualization results of each layer in the model.

5 Conclusions

This paper proposes a fault diagnosis method based on the attention mechanism and convolutional neural networks to improve the model's resistance to interference. The main conclusions are as follows:

1) CNN and LSTM are used to extract spatial and temporal features from signals, and the attention mechanism enhances the model's ability to recognize important features, thereby achieving efficient fault diagnosis both with and without noise.
2) Comparative experiments show that the model combining LSTM and the attention mechanism is superior to using LSTM or the attention mechanism alone.
3) Visualization of the model's outputs during testing shows that the attention mechanism further improves the model's resistance to interference on top of LSTM.


References 1. Peng, D., Wang, H., Liu, Z., et al.: Multibranch and multiscale CNN for fault diagnosis of wheelset bearings under strong noise and variable load condition. IEEE Trans. Ind. Inf. 16(7), 4949–4960 (2020) 2. Liu, R., Yang, B., Zio, E., et al.: Artificial intelligence for fault diagnosis of rotating machinery: a review. Mech. Syst. Signal Process. 108, 33–47 (2018) 3. Li, X., Cheng, J., Shao, H., et al.: A fusion CWSMM-based framework for rotating machinery fault diagnosis under strong interference and imbalanced case. IEEE Trans. Ind. Inf. 18(8), 5180–5189 (2021) 4. Jin, T., Yan, C., Chen, C., et al.: Light neural network with fewer parameters based on CNN for fault diagnosis of rotating machinery. Measurement 181, 109639 (2021) 5. Dibaj, A., Ettefagh, M.M., Hassannejad, R., et al.: A hybrid fine-tuned VMD and CNN scheme for untrained compound fault diagnosis of rotating machinery with unequal-severity faults. Expert Syst. Appl. 167, 114094 (2021) 6. Choudhary, A., Mian, T., Fatima, S.: Convolutional neural network based bearing fault diagnosis of rotating machine using thermal images. Measurement 176, 109196 (2021) 7. Chen, J., Huang, R., Zhao, K., et al.: Multiscale convolutional neural network with feature alignment for bearing fault diagnosis. IEEE Trans. Instrum. Meas. 70, 1–10 (2021) 8. Liang, P., Deng, C., Wu, J., et al.: Intelligent fault diagnosis of rotating machinery via wavelet transform, generative adversarial nets and convolutional neural network. Measurement 159, 107768 (2020) 9. He, D., Liu, C., Jin, Z., et al.: Fault diagnosis of flywheel bearing based on parameter optimization variational mode decomposition energy entropy and deep learning. Energy 239, 122108 (2022) 10. Sun, Y., Li, S., Wang, Y., et al.: Fault diagnosis of rolling bearing based on empirical mode decomposition and improved manhattan distance in symmetrized dot pattern image. Mech. Syst. Signal Process. 159, 107817 (2021) 11. 
Hajji, M., Harkat, M.F., Kouadri, A., et al.: Multivariate feature extraction based supervised machine learning for fault detection and diagnosis in photovoltaic systems. Eur. J. Control 59, 313–321 (2021) 12. Zhang, Z., Li, S., Wang, J., et al.: Enhanced sparse filtering with strong noise adaptability and its application on rotating machinery fault diagnosis. Neurocomputing 398, 31–44 (2020) 13. Li, Y., Cheng, G., Liu, C.: Research on bearing fault diagnosis based on spectrum characteristics under strong noise interference. Measurement 169, 108509 (2021) 14. Wang, H., Liu, Z., Peng, D., et al.: Attention-guided joint learning CNN with noise robustness for bearing fault diagnosis and vibration signal denoising. ISA Trans. 128, 470–484 (2022) 15. Zhou, P., Shi, W., Tian, J., et al.: Attention-based bidirectional long short-term memory networks for relation classification. In: Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, pp. 207–212. ACL (2016) 16. Jian, M., Yurong, G., Man, Z., et al.: Fault diagnosis method for rolling bearings based on attention mechanism. J. Comput. Integr. Manuf. 29, 2233–2244 (2023)

Refining Object Localization from Dialogues

Xueze Kang1, Lei Wu1, Lingyun Lu2, and Huaping Liu1(B)

1 Department of Computer Science and Technology, Tsinghua University, Beijing, China
[email protected]
2 Nanjing Research Institute of Electronic Engineering, Nanjing, China

Abstract. The ability to locate objects accurately is crucial for robots deployed in large-scale environments. However, objects' locations change constantly in practical scenes, so the robot's knowledge of the objects' distribution needs to be updated correspondingly. We propose a method to refine objects' locations from dialogues between humans and robots. When a robot is searching for an object and the information about the object's location is insufficient, it keeps raising questions to a human informer until it acquires adequate location information. Furthermore, every possible target object found during the search is photographed and sent back to the human informer for judgment. If all possible locations have been searched and the target is not found, the robot requests more specific location information and starts a new search attempt after obtaining it. The interaction process repeats iteratively until the robot finally finds the target. The interactions with humans ensure the success and accuracy of the search. We deploy our method on a mobile robot and conduct a language parsing experiment and an object locating experiment to evaluate the performance of our method. Moreover, we compare our method with a one-time message method from a previous work to demonstrate our advantage.

Keywords: robot localization · mapping · human-robot interaction

1 Introduction

Supplementary Information: The online version contains supplementary material available at https://doi.org/10.1007/978-981-99-6187-0_7. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 69–82, 2023. https://doi.org/10.1007/978-981-99-6187-0_7

Nowadays, robots are widely used in many large-scale environments to assist humans, such as factories and office buildings. Fetch-and-deliver tasks are typical applications in these scenes. To fetch or deliver an object, robots need to locate the object first. A simple and direct strategy is to offer prior mapping knowledge of the connection between objects and locations. In a static scene, this method can be effective. However, in practical scenes, objects' locations may frequently change
so that the prior knowledge becomes outdated correspondingly. According to a previous study [1], another method is to offer location information at the beginning of searching, which solves the problem of timeliness. However, one-time information still cannot ensure the success and accuracy of object searching. For example, if the provided searching area is too spacious, some robots may not have the adequate ability to search comprehensively. Moreover, if the search attempt fails, there is no way for the user to offer further help to the robot. Granted that the robot finds an object in the end, there is still a possibility that the search result is wrong, but the system does not allow the user to make a judgment or correct the mistake. A possible solution for this challenge is multiple interactions. Instead of transmitting only one piece of information in the beginning, we can interact with the robot iteratively throughout the whole searching process to refine the object’s location and make an identification when it discovers a possible target. Inspired by this idea, we propose a new technique: by exploiting dialogues in a social app, we design a system with iterative interactions between humans and robots to locate objects. In the beginning, it makes a search attempt from prior knowledge of the object’s distribution. If the attempt fails, it asks the informer for more information regarding the object’s location through the social app. The informer’s answer is returned in the form of natural language. The system parses the answer and updates the robot’s knowledge of the target object’s distribution. Finally, the robot calculates the best configuration out of its knowledge and navigates to achieve the configuration. After finding an object matching the target type, the robot photographs the object and sends it back to the informer for identification. The informer can return an affirmative or negative answer, with which the robot will update its knowledge. 
If the answer is negative, the robot proceeds with its search. This process continues iteratively until the robot finally receives an affirmative reply. A typical interaction process is shown in Fig. 1. We deploy the proposed technique on a mobile robot and assess its effectiveness by running it in a university building. First, we evaluate the performance of our language parsing system by collecting commands from the building’s inhabitants and examining the accuracy of the parsing results. Furthermore, we conduct an overall experiment of searching for objects in a university building through human-robot interaction, to assess our method’s object-locating performance and to compare it with the one-time message method [1]. Our contribution is threefold:

– We extend the previous one-time interaction method [1] for locating objects to iterative interactions with location refinement and result identification, which ensures searching success and accuracy.
– We propose a method to parse out key localization information from dialogues.
– We deploy the method on a mobile robot and evaluate the effectiveness of our technique.

Refining Object Localization from Dialogues

2 Related Work

2.1 Target Localization

In previous works, researchers have used various methods for robot target localization. A common approach exploits vision systems to capture objects [2–5]; [6,7] use landmarks to assist robots; semantic maps [8–10,22] and LiDAR information [24,25] are also useful aids. The works most related to ours are [1,11], in which interaction with users provides location information to the robot. However, in those works users only play a role at the beginning of the locating process. We extend this line of work to multiple interactions through dialogues, allowing users to identify the robot’s findings and offer further information if the robot runs into trouble. Rather than being an information provider only at the beginning, the user acts as a monitor throughout the whole process in our method.

Fig. 1. An intuitive demonstration of localization from dialogues. The informer offers the target type, location information, and object identification to the robot, and the robot returns its findings with a photo. The maps in the bubbles represent the robot’s object distribution knowledge; darker colors represent greater certainty about the target location.

2.2 Visual Language Navigation (VLN)

VLN is a task where agents follow movement instructions from human users and use vision systems to detect objects while navigating to goal positions. Agents receive commands such as “move forward” and “turn right”. Numerous methods have been proposed for the VLN task [12–16], which has been studied in symbolic environments [17–19], photorealistic indoor environments [20], and outdoor environments [21]. Our work incorporates VLN into our target localization system. When the robot reaches the target location but misses the target due to unfavorable factors (such as inaccuracy of the object detection system), users can send VLN-style movement instructions to guide the robot to the target.

3 Problem Formulation

In our method, we prepare an object set O consisting of over 300 object types that cover the majority of everyday objects. All objects in O can be recognized by the robot. At the beginning, the robot obtains a target object type t from the dialogue; for simplicity, we assume t belongs to O. Meanwhile, a coarse, static 2D map M^2D of the whole area is provided in advance, and the robot has access to prior knowledge S of the distribution of the objects in O; for each object, the default distribution is uniform. In a given area, the robot’s status can be defined by two arguments: location l and direction d. Our goal is to iteratively compute, through dialogues between robot and human, the best configuration (l, d) from which the target object can be captured by the camera attached to the robot. We formulate the problem as finding the (l, d) that maximizes p(t | (l, d), M^2D, S).
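Because the location and direction sets are made finite (Sect. 4.2), the maximization can in principle be done by brute-force enumeration of all statuses. A minimal sketch, where the probability table `prob` is an illustrative stand-in for the knowledge derived from M^2D and S (all names here are our own, not from the paper):

```python
from itertools import product

def best_configuration(locations, directions, prob):
    """Exhaustively search the finite status set for the (l, d)
    that maximizes p(t | (l, d), M^2D, S)."""
    return max(product(locations, directions),
               key=lambda status: prob.get(status, 0.0))

# Toy example: two nodes, four directions; probabilities are illustrative.
locations = ["node_a", "node_b"]
directions = [0, 90, 180, 270]
prob = {("node_b", 90): 0.6, ("node_a", 0): 0.3}  # unlisted statuses get 0
best = best_configuration(locations, directions, prob)
print(best)  # ('node_b', 90)
```

Since the status set is finite and the map is bounded, this exhaustive search always terminates; in practice, the dense node set would make an indexed lookup of the few high-probability statuses preferable.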

4 Method

The typical application scenario in our work is large-scale indoor spaces, such as factories and office buildings, where there is a need for locating or fetching objects. Users interact with the robot through dialogues in social apps. The language parser extracts crucial information such as the target type and location. After obtaining target and location information, the robot combines its prior distribution knowledge with the information from the dialogues so far to update its distribution knowledge and calculate the (l, d) with the highest probability of success. It then navigates to the most promising l and turns to the corresponding d. If an object matching the target type is found, a photo of the object is taken and sent back for identification. If the search attempt fails or the photo is denied, a request is sent to the human informer for more specific location information. After new information arrives, a new attempt is made and the result is sent back as above. This process repeats iteratively until the robot receives an affirmative confirmation. An overview of our method is shown in Fig. 2.

4.1 Language Parser

There are two kinds of crucial information to be extracted from a message: location and object type. The object type usually appears at the beginning of the interaction to tell the robot what to search for; we compare all the nouns in the sentence with the recognizable object set O to determine the object type. The location messages sent by the informer fall into two categories. We define the first type as reply, which is mainly an identification of the robot’s


Fig. 2. Overview of our proposed method. Messages sent by the informer are processed by the language parsing model. The robot updates its distribution knowledge every time it receives information from the informer. It calculates the node with the maximum probability and navigates to it if the probability reaches the threshold. Findings during the search process are returned in the form of photos. The process concludes only when the robot receives an affirmative confirmation of a photo. When a search attempt fails, the robot requests more specific location information from the informer, and a new cycle repeats similarly.

action and search result (for example, the identification of photos sent by the robot). A reply is mainly an affirmative or negative message, which is easy to parse, so we do not discuss it further. The major parsing work concerns the second type, which we define as command. A command is a message carrying information about the target’s location, and it takes two forms. The first form indicates the approximate area of the target (for example, “beside room 4–516”). Such instructions usually appear early in the interaction between human and robot. The second form gives detailed movement instructions based on the robot’s current position (for example, “go forward 2 m, then turn to your right”). Additionally, the user can ask the robot to take a photo of what it currently sees, improving the user’s understanding of the robot’s surroundings. The second form usually appears near the end of a search as a supplement to the first, and it allows the user to operate the robot manually in extreme situations. Even if the search ultimately fails, the user can still acquire information about the nearby environment from the returned photos, which can inform the user’s following commands. A message m received by the robot needs to be parsed into a command list c that the robot can recognize; some examples and parsing results are shown in Table 1. We process m with the Stanford CoreNLP toolkit [23] to acquire its dependency relations and part-of-speech information. The first command form (location information) must appear as nouns in m; with the part-of-speech information we can locate every noun in the sentence, and with the dependency relations we can filter out irrelevant nouns and keep the candidates. Likewise,


Table 1. Examples of commands and the corresponding parsing results. For the first form of command, the location information in the sentence is extracted. For the second form, the movement type (linear movement or rotation), direction, and distance or angle are extracted from the sentence. A particular case is that when the user asks for the robot’s location, it sends a photo of the current scene and a map with the current position marked.

| forms       | examples                              | process results       |
|-------------|---------------------------------------|-----------------------|
| first form  | Go to room 4–516 and have a look      | room 4–516            |
|             | Check the place beside room 4–324     | room 4–324            |
|             | It’s somewhere in area 4              | area 4                |
|             | Examine beside the elevator in area 3 | elevator in area 3    |
| second form | Turn right for 30°                    | (right, rotate, 30.0) |
|             | Go forward for 10 m                   | (forward, move, 10.0) |
|             | Tell me where you are                 | (photo)               |

the second command form (movement instruction) must appear as verbs in m, and we process the message with a similar method to get the candidates. Selecting the appropriate command from the candidates is equivalent to computing c = argmax_c p(c|m). We define L as the location set, consisting of the location commands recognizable by the robot, and M as the movement set, consisting of the recognizable movement commands. Moreover, we define a function d(·, ·) as a similarity measure between two texts (e.g., based on Levenshtein distance). For each command c among the candidates, we evaluate:

p(c|m) ∝ max_i d(L_i, c) + max_j d(M_j, c)

(1)

L_i denotes the ith element of set L, and M_j the jth element of set M.
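Eq. (1) can be sketched as follows. As an assumption for illustration, difflib’s `SequenceMatcher` ratio stands in for the similarity measure d(·, ·) (the paper only suggests a Levenshtein-style measure), and the sets L and M here are toy stand-ins for the robot’s recognizable commands:

```python
from difflib import SequenceMatcher

def sim(a, b):
    # Similarity d(., .); difflib's ratio stands in for a normalized
    # Levenshtein-style score in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def score(c, locations, movements):
    # Eq. (1): p(c|m) proportional to max_i d(L_i, c) + max_j d(M_j, c)
    return max(sim(loc, c) for loc in locations) + \
           max(sim(mov, c) for mov in movements)

# Toy recognizable sets (assumptions for illustration).
L = ["room 4-516", "room 4-324", "area 4", "elevator in area 3"]
M = ["go forward", "turn left", "turn right"]

candidates = ["room 4-516", "look"]
best = max(candidates, key=lambda c: score(c, L, M))
print(best)  # room 4-516
```

Since the score is only proportional to p(c|m), the argmax over candidates is unaffected by the missing normalization constant.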

4.2 World Model

Our world model is a 2D static map M^2D of the whole area, built using laser radar; such information is available for many indoor spaces (e.g., indoor Google Maps). Additionally, we have a semantic map M^S describing semantic information in the area, such as office numbers. To make the problem tractable, we split the whole map into a discrete node set L, and any possible observation location of the robot must belong to L. Such approximation may harm the robot’s performance, so the node set must be dense enough to approximate the continuous map well. Initially, the robot’s observation direction can be any angle from 0° to 360°. Similarly, we evenly cut the direction range into a dense discrete direction set D, and any possible observation direction belongs to D. Since all possible observation locations and directions are discrete and the map covers a bounded area, the set of possible robot statuses is finite and traversable.
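The discretization described above can be sketched as follows; the grid spacing and the number of direction bins are illustrative assumptions, not values from the paper:

```python
def make_nodes(width_m, height_m, spacing_m):
    """Discretize a rectangular area into a grid node set L."""
    nx = int(width_m / spacing_m) + 1
    ny = int(height_m / spacing_m) + 1
    return [(i * spacing_m, j * spacing_m)
            for i in range(nx) for j in range(ny)]

def make_directions(n_bins):
    """Evenly cut [0, 360) degrees into a discrete direction set D."""
    return [i * 360.0 / n_bins for i in range(n_bins)]

L = make_nodes(10.0, 6.0, 1.0)  # 11 x 7 grid -> 77 nodes
D = make_directions(8)          # one bin every 45 degrees
print(len(L), len(D))  # 77 8
```

A real deployment would also drop nodes falling inside obstacles on M^2D; the density trade-off is exactly the one noted in the text, since finer spacing improves the approximation but enlarges the status set to traverse.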

4.3 Updating Object Distribution

After receiving a message m from the informer, the probability that a status (l, d) captures target t becomes p(t | (l, d), M^2D, S, m). Now we have

p(t | (l, d), M^2D, S, m)                                  (2)
  = Σ_c p(t | (l, d), M^2D, S, c) · p(c | m)               (3)
  ≈ max_c p(t | (l, d), M^2D, S, c) · p(c | m)             (4)

p(c|m) can be calculated using the method in Sect. 4.1. In Eq. (4), we approximate the summation by the maximum; in other words, we only consider the command c that best matches m. We define L(c) as the location described by the command c. According to the relation between status (l, d) and L(c), we multiply the probability of every status by a weight w((l, d), L(c)), so that the target distribution is updated with the new location information:

y((l, d), c) = p(t | (l, d), M^2D, S) · w((l, d), L(c))    (5)

For simplicity, we define statuses matching L(c) to have weight 1 and all others weight 0, that is

w((l, d), L(c)) = { 1, if l ∈ L(c);  0, if l ∉ L(c) }      (6)

This means every status outside L(c) has no chance of capturing the target. Finally, we normalize to obtain the new probability of every status in M^2D:

p(t | (l, d), M^2D, S, c) = y((l, d), c) / Σ_{(l, d)} y((l, d), c)    (7)
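A minimal sketch of this update, combining the 0/1 weight of Eq. (6) with the normalization of Eq. (7); the prior distribution and the matched node set are toy values for illustration:

```python
def update_distribution(prior, located):
    """Eqs. (5)-(7): keep only statuses whose location lies in L(c),
    then renormalize over the whole map."""
    y = {(l, d): (p if l in located else 0.0)   # Eqs. (5)-(6)
         for (l, d), p in prior.items()}
    total = sum(y.values())
    if total == 0.0:
        return prior  # command matched no node; keep old knowledge
    return {s: p / total for s, p in y.items()}  # Eq. (7)

# Toy prior over two nodes x two directions (uniform).
prior = {(l, d): 0.25 for l in ["n1", "n2"] for d in [0, 180]}
post = update_distribution(prior, located={"n2"})
print(post[("n2", 0)], post[("n1", 0)])  # 0.5 0.0
```

The zero-total guard is our own addition: with the hard 0/1 weight, a command matching no node would otherwise destroy all knowledge, whereas a softer weight (e.g., distance-based) would avoid this edge case entirely.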

4.4 Searching Process

After receiving a message from the informer, the robot starts computation using the method in Sect. 4.3. If the message is not detailed enough, or there is inadequate prior knowledge, the information entropy in M^2D may be relatively high and the probability of capturing the target in each status relatively small. We therefore set a threshold h: only when the maximum probability over all statuses reaches h does the robot make a search attempt; otherwise, it sends a request back to the informer for more specific location information. The algorithm is shown below. Once the robot commences a search attempt, it calculates the nodes with the maximum success probability in M^2D, which are selected as candidates for searching. If there are too many candidates, the robot selects only some of them as representatives. If an object matching the target type is captured at any node, a photo of the object is sent back for identification. If the answer is affirmative, the search task is complete. If not, the robot


Algorithm 1
  maxPossibility ← CalculateMaxPossibility()
  while maxPossibility < threshold do
      m ← AskForCommandInformation()
      for all (l, d) do
          p(t|(l, d), M^2D, S) ← max_c p(t|(l, d), M^2D, S, c) p(c|m)
      end for
      maxPossibility ← CalculateMaxPossibility()
  end while
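Algorithm 1 can be rendered in runnable form roughly as follows; the `ask` and `update` callbacks stand in for AskForCommandInformation and the Sect. 4.3 update, and their stub behavior here (each message doubling certainty) is purely illustrative:

```python
def gather_until_confident(dist, threshold, ask, update):
    """Algorithm 1: keep requesting location information until the best
    status is confident enough to justify a search attempt."""
    while max(dist.values()) < threshold:
        m = ask()               # AskForCommandInformation()
        dist = update(dist, m)  # Sect. 4.3 update over all (l, d)
    return dist

# Stub interaction: each message doubles the certainty of every status.
messages = iter(["beside room 4-516", "near the elevator"])
ask = lambda: next(messages)
update = lambda d, m: {s: min(1.0, 2 * p) for s, p in d.items()}

dist = {("n1", 0): 0.2, ("n2", 0): 0.1}
dist = gather_until_confident(dist, threshold=0.5, ask=ask, update=update)
print(max(dist.values()))  # 0.8
```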

records the object’s position and ignores it the next time it is seen. If no objects matching the target type are found in the area, or every found object is denied, the probabilities of all the searched nodes are set to 0. Meanwhile, a new request is sent to the informer for more specific location information. The reply may be a new location or a series of movement instructions; in either form, the robot’s location knowledge is refined and the object distribution becomes more certain. The robot then starts a new search attempt according to the new distribution knowledge. This process repeats iteratively, and the object’s location is refined continuously until the target is finally found. The procedure is shown as Algorithm 2.

5 Experiments and Results

Fig. 3. The robot used in the experiments. (a) A photo of the robot. (b) A scene of searching for a target. (c) The robot’s observation.

We conduct two experiments with a mobile robot deployed in a university building (Fig. 3) to evaluate the performance of our method: a language parsing evaluation experiment and a robot search experiment. Furthermore, we conduct


Algorithm 2
  searchResult ← false
  while ¬searchResult do
      nodes ← SelectRepresentatives(CalculateMaxNodes())
      for all node in nodes do
          for all direction in directions do
              object ← ObjectsInCamera()
              if match(target, object) then
                  identification ← SendPhotoForCheck()
                  if identification then
                      searchResult ← true
                  end if
              end if
          end for
      end for
      if ¬searchResult then
          for all node in nodes do
              for all direction in directions do
                  p(t|(node, direction), M^2D, S) ← 0
              end for
          end for
          m ← AskForCommandInformation()
          for all (l, d) in M^2D do
              p(t|(l, d), M^2D, S) ← max_c p(t|(l, d), M^2D, S, c) p(c|m)
          end for
      end if
  end while
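Algorithm 2’s outer loop can be sketched in runnable form as follows; the `select`, `camera`, `confirm`, `ask`, and `update` callbacks are stubs standing in for SelectRepresentatives/CalculateMaxNodes, ObjectsInCamera, SendPhotoForCheck, the informer request, and the Sect. 4.3 update:

```python
def search(dist, select, camera, confirm, ask, update):
    """Algorithm 2 sketch: visit the most promising nodes, send photos
    for confirmation, zero out denied nodes, then request more info."""
    while True:
        candidates = select(dist)  # SelectRepresentatives(CalculateMaxNodes())
        for status in candidates:
            obj = camera(status)   # ObjectsInCamera()
            if obj is not None and confirm(obj):  # SendPhotoForCheck()
                return status      # affirmative identification
        for status in candidates:  # every candidate failed or was denied
            dist[status] = 0.0
        dist = update(dist, ask())  # refine with new information

# Stub world: the target is visible only from status ("n3", 90).
dist = {("n1", 0): 0.5, ("n2", 0): 0.3, ("n3", 90): 0.2}
select = lambda d: [max(d, key=d.get)]
camera = lambda s: "bottle" if s == ("n3", 90) else None
confirm = lambda obj: obj == "bottle"
ask = lambda: "it is near the window"
update = lambda d, m: d  # stub: keep remaining probabilities unchanged
found = search(dist, select, camera, confirm, ask, update)
print(found)  # ('n3', 90)
```

Zeroing out denied statuses is what guarantees progress: each failed attempt shrinks the candidate set, so with a cooperative informer the loop terminates at the true target, matching the paper’s claim that every task eventually succeeds.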

Fig. 4. A task example. The robot’s observation is shown on the top, and the corresponding dialogues between the informer and the robot are shown on the bottom. When the robot discovers an object matching the target, it takes a photo of its observation and sends it back to the informer. The object map is shown on the right side; the two arrows represent the two search routes calculated from the dialogues.


a comparison experiment to demonstrate our advantage over the previous one-time message method [1]. More details can be found in the supplementary video.

5.1 Experiment 1: Language Parsing Evaluation

We first evaluate our language parser, since the accuracy of the parsing results is the basis of the whole interaction system. We collect valid commands and unrelated sentences as samples and process them with our language parser. Valid commands are expected to be parsed into tuples of the crucial information in the sentence, while unrelated sentences should produce no output.

Procedure. We collect 116 valid commands from the inhabitants of the building, covering both the first and second forms of command (location information and movement instruction). Besides the valid ones, we also randomly collect 100 unrelated sentences from the Internet as noise. Our language parser processes all the sentences, and the output is tuples of crucial information extracted from them. The researchers examine the outputs and corresponding sentences to decide whether the parsing results are accurate. The parsing result of a valid command is considered accurate if the location information and movement instruction are parsed out correctly; the parsing result of an unrelated sentence is considered accurate if the output is empty.

Results. The results of the language parsing experiment are shown in Table 2. The high accuracy on both the first and second forms of command shows that most valid commands are detected and parsed appropriately. In contrast, the accuracy on unrelated sentences is less outstanding, meaning some words in unrelated sentences can baffle the language parser. Nevertheless, unrelated sentences should not appear in a typical interaction, so they are unlikely to impair the parser in practice. Overall, the performance of the language parser is reliable.

Table 2. Accuracy for different types of sentences

|             | number | accuracy |
|-------------|--------|----------|
| first form  | 86     | 0.98     |
| second form | 30     | 1.0      |
| unrelated   | 100    | 0.77     |

5.2 Experiment 2: Locating Objects

We conduct an overall experiment to assess the performance of our method of locating objects. Objects of nine types are placed on the fifth floor of the experiment building: umbrellas, chairs, whiteboards, fire extinguishers,


trash bins, potted plants, laptops, bottles, and backpacks, covering objects of different sizes. The objects are placed at various places on the floor, including in front of rooms, in corridors, on tables, and in breakout areas, which covers the majority of public areas. The robot’s task is to locate the preset objects through interactions with human informers.

Procedure. Target objects are preset across the whole floor; most object types have more than one instance. An attempt for a target is defined as a task. A researcher who has clear knowledge of the target’s location acts as the human informer and interacts with the robot through a social app. Four tasks are conducted for each object type, 36 tasks in total. To simulate realistic application scenarios, the researcher on a task leaves the experiment floor and communicates with the robot only through the social app, sending the target type, location information, and movement instructions. The robot computes and navigates to the corresponding nodes according to Sect. 4 after receiving messages. The findings are photographed and returned to the informer for identification. A task is recorded as a success if the target is captured in a photo during the process. A task example is shown in Fig. 4.

Results. The results of the search experiment are shown in Table 3. By the design of our method, all tasks succeed, as expected. Interaction times denotes the number of messages sent by the informer and reflects the convenience of the system: too many interactions burden the user. The average number of interactions is 4.2. A significant factor correlating with

Table 3. The results of experiment 2. The success column records the number of successes for each object type. The object number column records the number of objects of each type used in the experiment. The interaction times column records the average number of messages the informer sends for each type.

| type              | success | object number | interaction times |
|-------------------|---------|---------------|-------------------|
| umbrella          | 4       | 2             | 4.5               |
| chair             | 4       | 6             | 6.0               |
| whiteboard        | 4       | 1             | 2.0               |
| fire extinguisher | 4       | 2             | 2.8               |
| trash bin         | 4       | 5             | 5.8               |
| potted plant      | 4       | 1             | 2.0               |
| laptop            | 4       | 2             | 4.3               |
| bottle            | 4       | 2             | 5.5               |
| backpack          | 4       | 2             | 5.0               |
| average           | 4       | 2.6           | 4.2               |


interaction times is the number of objects: the more objects there are, the more photos the robot needs to send to the user for identification.

5.3 Experiment 3: Comparison with the One-Time Message Method

To compare our method with the previous one-time message method [1], we conduct an object locating experiment using that method with the same scene and objects as in experiment 2; every search task is likewise repeated 4 times.

Procedure. The informer sends the target type and location information only at the beginning. The robot calculates the best configuration from the location information using the method in Sect. 4, but the informer offers no further information for the rest of the task. If the robot finds anything matching the target type, it photographs the object and sends the photo back, and the user decides whether the found object is the target. If nothing is found, it returns a corresponding reply.

Results. The results of experiment 3 are shown in Table 4. Compared with four successes per type in experiment 2, the one-time message method attains only 1.6 successes on average, which clearly shows the advantage of the multiple-interaction method. The results demonstrate that the previous method is not competent in an environment containing multiple objects similar to the target, where multiple identifications from the user are needed.

Table 4. The results of the one-time message method

| type              | success | object number |
|-------------------|---------|---------------|
| umbrella          | 0       | 2             |
| chair             | 0       | 6             |
| whiteboard        | 4       | 1             |
| fire extinguisher | 1       | 2             |
| trash bin         | 2       | 5             |
| potted plant      | 4       | 1             |
| laptop            | 2       | 2             |
| bottle            | 1       | 2             |
| backpack          | 0       | 2             |
| average           | 1.6     | 2.6           |

6 Conclusion

In this paper, we propose a reliable method for refining information from natural language dialogues and locating objects in a large-scale environment. Previous studies [1,11] demonstrate the feasibility of an autonomous robot locating objects using a one-time message; we extend this approach to multiple interactions through dialogues, ensuring the accuracy and success of the search. We conduct two experiments to evaluate the performance of our method: the first assesses the language parser’s ability, and the second examines the system’s overall performance. Both demonstrate the reliability of our method. Moreover, we compare our method with the previous one-time message method to demonstrate our advantage. Nevertheless, there is still room for improvement in our design. For example, in the current method the robot starts every search task with the same prior distribution knowledge set in advance; a future method could shape the distribution knowledge with the results of former search tasks, enhancing the robot’s searching ability over time.

Acknowledgement. This work was supported in part by the National Natural Science Fund under Grant 62025304.

References

1. Chung, M.J.Y., Pronobis, A., Cakmak, M., et al.: Autonomous question answering with mobile robots in human-populated environments. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 823–830. IEEE (2016)
2. Boutteau, R., Rossi, R., Qin, L., et al.: A vision-based system for robot localization in large industrial environments. J. Intell. Robot. Syst. 99(2), 359–370 (2020)
3. Guo, J., Xiao, X., Pan, P., et al.: A design of multi-vision localization and navigation service robot system. In: 2017 12th International Conference on Computer Science and Education (ICCSE), pp. 787–790. IEEE (2017)
4. Yin, R., Yang, J.: Research on robot control technology based on vision localization. J. Artif. Intell. 1(1), 37 (2019)
5. Chaplot, D.S., Gandhi, D.P., Gupta, A., et al.: Object goal navigation using goal-oriented semantic exploration. Adv. Neural Inf. Process. Syst. 33, 4247–4258 (2020)
6. Chen, Y., Hafez, O.A., Pervan, B., et al.: Landmark augmentation for mobile robot localization safety. IEEE Robot. Autom. Lett. 6(1), 119–126 (2020)
7. Fajardo, J., Ferman, V., Guerra, J., et al.: LMI methods for extended filters for landmark-based mobile robot localization. In: 2021 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), pp. 511–517. IEEE (2021)
8. Cheng, J., Sun, Y., Meng, M.Q.H.: Robust semantic mapping in challenging environments. Robotica 38(2), 256–270 (2020)
9. Kostavelis, I., Gasteratos, A.: Semantic mapping for mobile robotics tasks: a survey. Robot. Auton. Syst. 66, 86–103 (2015)
10. Nüchter, A., Hertzberg, J.: Towards semantic maps for mobile robots. Robot. Auton. Syst. 56(11), 915–926 (2008)


11. Chung, M.J.Y., Pronobis, A., Cakmak, M., et al.: Designing information gathering robots for human-populated environments. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5755–5762. IEEE (2015)
12. Anderson, P., Shrivastava, A., Truong, J., et al.: Sim-to-real transfer for vision-and-language navigation. In: Conference on Robot Learning, pp. 671–681. PMLR (2021)
13. Wang, H., Wang, W., Liang, W., et al.: Structured scene memory for vision-language navigation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8455–8464 (2021)
14. Ma, C.Y., Lu, J., Wu, Z., et al.: Self-monitoring navigation agent via auxiliary progress estimation. arXiv preprint arXiv:1901.03035 (2019)
15. Irshad, M.Z., Ma, C.Y., Kira, Z.: Hierarchical cross-modal agent for robotics vision-and-language navigation. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 13238–13246. IEEE (2021)
16. Qi, Y., Wu, Q., Anderson, P., et al.: REVERIE: remote embodied visual referring expression in real indoor environments. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9982–9991 (2020)
17. MacMahon, M., Stankiewicz, B., Kuipers, B.: Walk the talk: connecting language, knowledge, and action in route instructions. Def 2(6), 4 (2006)
18. Mei, H., Bansal, M., Walter, M.R.: Listen, attend, and walk: neural mapping of navigational instructions to action sequences. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
19. Chen, D., Mooney, R.: Learning to interpret natural language navigation instructions from observations. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 25, no. 1, pp. 859–865 (2011)
20. Anderson, P., Wu, Q., Teney, D., et al.: Vision-and-language navigation: interpreting visually-grounded navigation instructions in real environments. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3674–3683 (2018)
21. Zhu, W., Wang, X.E., Fu, T.J., et al.: Multimodal text style transfer for outdoor vision-and-language navigation. arXiv preprint arXiv:2007.00229 (2020)
22. Chaplot, D.S., Gandhi, D., Gupta, S., et al.: Learning to explore using active neural SLAM. arXiv preprint arXiv:2004.05155 (2020)
23. Manning, C.D., Surdeanu, M., Bauer, J., et al.: The Stanford CoreNLP natural language processing toolkit. In: Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pp. 55–60 (2014)
24. Hendrikx, R.W.M., Pauwels, P., Torta, E., et al.: Connecting semantic building information models and robotics: an application to 2D LiDAR-based localization. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 11654–11660. IEEE (2021)
25. Chen, X., Vizzo, I., Läbe, T., et al.: Range image-based LiDAR localization for autonomous vehicles. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 5802–5808. IEEE (2021)

License Plate Detection and Recognition Based on Light-Yolov7

Shangyuan Li1, Nan Ma1(B), Zhixuan Wu2, and Qiang Lin3

1 Faculty of Information and Technology, Beijing University of Technology, Beijing 100124, China
[email protected]
2 Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing 100101, China
3 Beijing Intelligent Telematics Industry Innovation Center Co., Beijing 100176, China

Abstract. A license plate detection and recognition system is one of the practical applications of computer vision technology in the field of unmanned vehicles. In this paper, we propose Light-yolov7, a license plate detection and recognition model applied to unmanned vehicles. The model contains three improvements: the lightweight neural network ShuffleNet2 is used for feature extraction; depthwise separable convolutions are added to reduce the number of parameters; and late fusion is used to connect features. Finally, CRNN is used to learn the obtained features. Experiments on a large Chinese license plate dataset (CCPD + CRPD) show that the model is feasible for mobile deployment and efficient for license plate detection and recognition.

Keywords: Computer vision · License plate recognition · Feature recognition · Lightweight · YOLO

1 Introduction

Lightweight license plate detection and recognition systems for unmanned vehicles reduce the storage and computation load on the onboard computer. Installing a license plate detection and recognition system on driverless vehicles makes it possible to record surrounding vehicle information, trace the source of illegal vehicles, and reflect traffic conditions and traffic flow, with a positive effect on traffic safety management. However, most existing methods are limited to restricted scenarios such as traffic monitoring, road tolling, and parking lots. Therefore, further research on license plate detection and recognition systems is urgent and necessary. In this paper, we propose a deep-learning method that achieves lightweight and efficient license plate detection. The method includes license plate detection, license plate image processing, and recognition of license plate features, and can accurately recognize specific license plates containing Chinese characters, English letters such as “A”, and digits such as “0”.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 83–91, 2023. https://doi.org/10.1007/978-981-99-6187-0_8


S. Li et al.

The main contributions of this paper are as follows:

(1) Lightweight improvement of the YOLOv7 model with the ShuffleNet2 module. Depthwise separable convolution is used as the basic component of the YOLOv7 network, avoiding the model complexity caused by the full convolution and focus operations of the original network. Prediction-box detection and key-point detection are fused as the license plate detection method: the prediction box is first obtained by gradually refining the anchor box, and the four corner points of the license plate are then located from key points on the prediction box. This not only extracts rich license plate features but also addresses the accuracy loss caused by lightening the network. The combination greatly improves the model’s space and time complexity metrics, improving the feasibility and recognition efficiency of deployment on driverless vehicles.

(2) To handle distorted and skewed license plate images, an OpenCV-based perspective transformation of the license plate is proposed. The corrected license plate image is obtained by applying a perspective transformation to the license plate’s key points, normalizing the input license plate characters. The processed license plate region images are then fed to the CRNN algorithm, which is commonly used for text feature recognition, to achieve effective license plate feature recognition.

(3) The model achieves high-precision license plate detection on the fused Chinese license plate dataset. The experimental results show that Light-yolov7 achieves an average accuracy of 99.2%. The license plate detection model has 1.022 M parameters and 2.9 GFLOPs of floating-point computation, a significant decrease compared with the YOLOv7 model, so the improved license plate detection and recognition system is effective and fast.
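The parameter savings behind contribution (1) can be checked with a quick count: a standard k × k convolution has k·k·C_in·C_out weights, while a depthwise k × k convolution followed by a pointwise 1 × 1 convolution has k·k·C_in + C_in·C_out. The channel sizes below are illustrative, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    """Depthwise k x k convolution plus pointwise 1 x 1 convolution."""
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128  # illustrative layer sizes
standard = conv_params(k, c_in, c_out)
separable = dw_separable_params(k, c_in, c_out)
print(standard, separable, round(standard / separable, 1))  # 73728 8768 8.4
```

For a 3 × 3 kernel the reduction factor approaches k² = 9 as the channel count grows, which is why swapping in depthwise separable convolutions shrinks the detector so substantially.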
The paper is organized as follows: Sect. 2 describes the related methods, Sect. 3 presents the results of the Light-yolov7 model and discusses the techniques used, and Sect. 4 summarizes the model and discusses future work.

2 Method

Light-yolov7 license plate detection and recognition comprises three phases: license plate image preprocessing, license plate detection, and license plate character recognition. Because the system is intended for deployment on driverless vehicles, the model must be lightweight. License plate detection predicts a rectangular box for each plate; after redundant targets are filtered out, a perspective transformation corrects the skewed plate image, and the characters are then recognized by the character recognition network.

2.1 License Plate Data Processing

License plate images are scaled to a uniform size before being input to the YOLO network for training. To adapt both high-resolution images of 4096 × 2304 and low-resolution images of 698 × 500, the network performs adaptive image scaling with the Letterbox method, which changes the input size to 640 × X × 3: the larger of the plate image's width and height is scaled to 640 pixels, the other side is scaled by the same factor, and the remaining area is filled with gray. This also speeds up model inference and helps prevent overfitting.

2.2 License Plate Detection Model

You Only Look Once (YOLO) is a deep learning algorithm that is faster than most detection algorithms because detection and classification are completed in a single stage. Completing license plate detection and license plate feature recognition in one network avoids the accumulation of intermediate errors, speeds up the pipeline, and builds a true end-to-end deep neural network [3]. However, blurred images pollute such a network, so the learned features are limited and recognition accuracy for skewed license plates is low. To address the low accuracy on distorted plates, a perspective transformation can correct tilted or distorted license plates [4]; the corrected plate region then reaches a 97.5% recognition rate with a CNN model, and the scheme accurately recognizes irregular plates photographed with cell phones. In this paper, a deep learning model based on YOLOv7 [1] is applied to license plate detection. The drawback of YOLOv7 is its large number of fully connected structures, which works against reducing the model's parameters and hinders deployment.
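The Letterbox scaling described above can be sketched as follows. This is a dependency-free illustration, not the authors' implementation: the function name, the gray fill value of 114, and the nearest-neighbour resize are assumptions.

```python
import numpy as np

def letterbox(img, target=640, pad_value=114):
    """Scale the longer side to `target`, scale the shorter side by the
    same factor, and fill the remainder with a constant gray value.
    Nearest-neighbour resize keeps the sketch dependency-free."""
    h, w = img.shape[:2]
    scale = target / max(h, w)                 # longer side fills 640 px
    new_h, new_w = round(h * scale), round(w * scale)
    # nearest-neighbour index maps for the resize
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = img[rows][:, cols]
    canvas = np.full((target, target, img.shape[2]), pad_value, img.dtype)
    top, left = (target - new_h) // 2, (target - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, scale, (top, left)

# e.g. a 4096 x 2304 frame becomes a 640 x 640 x 3 network input
frame = np.zeros((2304, 4096, 3), dtype=np.uint8)
out, scale, _ = letterbox(frame)
print(out.shape)   # (640, 640, 3)
```

Centering the padding is one common convention; padding only the bottom/right edge would serve equally well as long as labels are shifted consistently.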
The YOLOv7 model is lightened using the ShuffleNetV2 module with depthwise separable convolution. This convolution applies a single kernel per channel in the two-dimensional plane, so its computational and parameter costs are small, which makes the network lighter and easier to deploy to mobile devices; the cost is reduced model accuracy. To improve the accuracy of license plate detection, key point detection is therefore combined with the anchor-box detection structure of YOLOv7. The final Light-yolov7 network structure is shown in Fig. 1. The input license plate image first passes through the lightweight backbone network, which extracts plate features; plate features from three different receptive fields are then fused by the connection network; finally, the fused features are fed into the prediction network, which predicts the plate location and key points.

Fig. 1. Light-yolov7 network structure

The flowchart is shown in Fig. 2. A 640 × 640 × 3 license plate image is input, and the detection network extracts and fuses features to output three feature maps of sizes 80, 40, and 20. The key to YOLO target detection is the grid over each output feature map: each grid cell has three prior (anchor) boxes, and the labeled plate position determines which cell's anchor box predicts the class and location of the plate, yielding the prediction box. The prediction information corresponding to the plate labels is then obtained by associating the prediction boxes with the feature maps. The prediction information and labels are fed into the license plate category loss, the confidence loss, and the rectangular-box position loss; the resulting loss is backpropagated to learn the parameters, and the final license plate detection results are obtained. The lightweight YOLOv7 license plate detector achieves more than 99% accuracy on the extended dataset.

Fig. 2. License plate detection process

2.3 License Plate Correction Model

Pre-processing the license plate image before character recognition improves model accuracy. The traditional processing pipeline merges the recognized license plate characters into connected regions via connected-domain analysis, projects the corresponding pixel histogram, and segments the characters at the peaks and valleys of the histogram; the segmented characters are then commonly recognized by template matching [6]. However, for scenes in the dataset that are too bright or too dark, or occluded by rain, snow, and the like, these conditions strongly disturb the screening of character regions, and recognition fails. Deep convolutional networks recognizing tilted or obscured plate characters are also affected by complex environments, but correcting the detected plate region improves recognition accuracy. The plate region extracted by the detection module is shown in Fig. 3(a). From the detected key points, the mapping between the four corner points of the original image and their positions in the corrected image is computed: the four point correspondences give eight equations, which are solved for the eight unknowns of the transformation matrix M. The tilted plate is then warped through M by perspective transformation to obtain the corrected plate shown in Fig. 3(b).

Fig. 3. (a) Original image (b) Corrected image
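The eight-equation solve described above can be illustrated directly (equivalent in effect to OpenCV's getPerspectiveTransform). The function names and example corner coordinates are hypothetical:

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve the 8 unknowns of the 3x3 projective matrix M (m22 fixed to 1)
    from 4 source/destination corner pairs: 8 equations, 8 unknowns."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    m = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(m, 1.0).reshape(3, 3)

def warp_point(M, x, y):
    """Apply the perspective transform to one point (homogeneous divide)."""
    u, v, w = M @ np.array([x, y, 1.0])
    return u / w, v / w

# four detected corner key points of a skewed plate -> a 240 x 80 rectangle
src = [(30, 40), (250, 60), (255, 130), (25, 115)]
dst = [(0, 0), (240, 0), (240, 80), (0, 80)]
M = perspective_matrix(src, dst)
print(warp_point(M, 30, 40))   # ≈ (0.0, 0.0)
```

Once M is known, every pixel of the corrected plate can be filled by applying the inverse mapping, which is what OpenCV's warpPerspective does in practice.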

2.4 License Plate Recognition Model

Traditional license plate character recognition extracts features and then recognizes the characters with template matching or a support vector machine [7]. These methods require little computation, have few parameters, and recognize quickly, but they are sensitive to the conditions under which vehicle pictures are taken, such as bad weather and plate defacement. With the wide application of deep learning in vision, license plate recognition has become markedly more accurate in complex scenes, and it has been proposed to recognize the entire plate directly, without character segmentation, by locating the plate features through deep learning. General scene text detectors are a poor fit here: FAST, a faster arbitrarily-shaped text detector [8], is too large a model for a simple task like license plate recognition, and real-time scene text detection based on differentiable binarization and adaptive scale fusion [9] is too slow to meet the demand for fast detection on mobile devices. Since the connectionist temporal classification (CTC) loss proposed by Graves et al. [12], the output sequence of probability distributions can be interpreted as conditional probabilities over possible label sequences, and license plate recognition can be treated as a sequence labeling problem. Detection techniques such as R-CNN [10] and Faster R-CNN [11] have also emerged, with advantages in speed and accuracy. The CRNN model [2], a convolutional neural network for single-character and sequence recognition, has been applied to Korean license plate features with high accuracy.

The license plate recognition model used in this paper is a CRNN. In the feature recognition process, a convolutional neural network (CNN) convolves the extracted plate region to produce a series of feature vectors. A recurrent neural network (RNN) with the CTC loss then labels consecutive features without separating them. In the training phase, the input plate image, the time steps (x), the character probabilities (y), and the label corresponding to the plate features are the CTC inputs; the summed probability over possible character paths is the loss, the character probabilities are updated continuously, and the maximum-probability path is finally taken as the output, giving the accurate plate feature recognition result. During testing, the CRNN obtains the feature sequence through an argmax classifier, selects the maximum-probability path, and decodes it to output the plate features. The result shown in Fig. 4 is “black AD03210”.

Fig. 4. License plate feature sequence
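Best-path (argmax) CTC decoding as described can be sketched as follows; the alphabet and the helper name are illustrative, not the paper's actual character set:

```python
import numpy as np

# toy alphabet; index 0 is the CTC "blank" label ('-' here, hypothetical)
ALPHABET = "-黑AD0321"

def ctc_greedy_decode(probs):
    """argmax per time step, then collapse repeats and drop blanks --
    the best-path decoding described for the CRNN + CTC recognizer."""
    best = probs.argmax(axis=1)            # most likely label per frame
    out, prev = [], -1
    for idx in best:
        if idx != prev and idx != 0:       # collapse repeats, skip blank
            out.append(ALPHABET[idx])
        prev = idx
    return "".join(out)

# 6 time steps, one-hot "probabilities" over the 8-symbol alphabet
probs = np.eye(8)[[1, 1, 0, 2, 0, 3]]
print(ctc_greedy_decode(probs))   # 黑AD
```

The blank label between repeated indices is what lets CTC emit the same character twice in a row, e.g. the "0...0" pairs in a plate such as AD03210.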

3 Experiments

In this section, we test the FLOPs, parameter counts, and accuracy of different license plate models and evaluate the feasibility of deploying them on driverless vehicles. The experiments are conducted on Ubuntu 20.04.6 LTS (64-bit) with an NVIDIA GPU with 24 GB of memory.

3.1 Dataset

We select and preprocess Chinese license plate datasets. To cover various Chinese license plates in natural scenes and to reproduce the data of unmanned scenarios, we use three datasets. The Chinese City Parking Dataset (CCPD) [13] contains license plates collected from different distances and angles under various natural weather conditions (e.g., rain, snow, and lighting effects). The Chinese Road Plate Dataset (CRPD) [14] categorizes images by whether they contain a single license plate, two license plates, or multiple license plates. To reproduce license plate detection and recognition in driverless scenarios, we additionally produced a dataset of license plates photographed by a co-driver holding a cell phone while the vehicle was in motion. These datasets are combined into a new extended dataset, which makes the model more compatible with the driverless scenario while preventing the data from being skewed toward one license plate type. The extended dataset is uniformly converted to the YOLO data format shown in Table 1.

Table 1. YOLO license plate data format

Category   Center point / box size   Coordinates of the four corner points
L          x  y  w  h                pt1x pt1y  pt2x pt2y  pt3x pt3y  pt4x pt4y
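A label line in the format of Table 1 can be generated as follows; the helper name and example coordinates are hypothetical:

```python
def to_yolo_label(cls, box, corners, img_w, img_h):
    """One normalized YOLO label line: class L, centre (x, y), box size
    (w, h), then the four corner key points (pt1x pt1y ... pt4x pt4y),
    all coordinates divided by the image dimensions."""
    x1, y1, x2, y2 = box                       # plate bounding box, pixels
    fields = [(x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h,
              (x2 - x1) / img_w, (y2 - y1) / img_h]
    for px, py in corners:                     # four corner key points
        fields += [px / img_w, py / img_h]
    return " ".join([str(cls)] + [f"{v:.6f}" for v in fields])

# hypothetical plate at pixels (100, 200)-(300, 260) in a 640 x 640 image
line = to_yolo_label(0, (100, 200, 300, 260),
                     [(100, 205), (300, 200), (298, 260), (102, 258)],
                     640, 640)
print(line)
```

Each line therefore carries 13 fields: the class plus 12 normalized coordinates, matching the L / x y w h / pt1–pt4 layout above.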

The fields are the license plate category (L), the coordinates of the center point of the plate's bounding rectangle (x, y), the width and height of that rectangle (w, h), and the coordinates of the four corner points of the plate (ptx, pty). To prevent the coordinates from deviating too much, all image coordinates are normalized.

3.2 Analysis of Testing Model Indicators

FLOPs (floating point operations) measure the time complexity of a model, and Parameters measure its spatial complexity; analyzing both indicates the network's execution time and memory footprint. In the literature [5], a series of ablation experiments on the COCO dataset compares Yolov5 variants: Table 2 shows that Yolov5-lite-s has 1.66G FLOPs against 16.5G for Yolov5s-6.0, and only 1.64M parameters, i.e., lower memory use and fewer parameters, so Yolov5-lite-s computes faster and infers faster on mobile. The unique network and detection structure of Yolov7 makes it the most accurate model on the COCO dataset but also too complex, so the ShuffleNetV2 lightweight module is used to lighten the Yolov7 network for license plate recognition, and accuracy is recovered by combining key point detection with target-box prediction. On the fused Chinese license plate dataset, Table 2 shows that the Yolov5-lite-e model has the lowest parameter count but also markedly lower accuracy. The Yolov5-lite-e-k model with the key point detection method reaches 98.1% and 82.7%, while the accuracy of the Light-yolov7 model improves on this by 0.3% and 2.3% after incorporating SPPF. The Light-yolov7-y model with pre-trained weights achieves the best accuracy, 99.2% and 86.7%, with about one-fifth of the parameters of Yolov7-tiny, and is also 0.2% higher than the Yolov7-tiny-k model that incorporates key point detection.

Table 2. Yolov comparison experiments

Models           Dataset                            Flops    Params   mAP@0.5   mAP@0.5:0.95
Yolov5s-6.0      COCO dataset                       16.5G    7.23M    56        37.2
Yolov5-lite-s    COCO dataset                       1.7G     1.64M    42        25.2
Yolov4-tiny      COCO dataset                       5.6G     8.86M    40.2      21.7
Yolov7           COCO dataset                       104.7G   36.9M    69.7      -
Yolov7-tiny      Converged license plate dataset    13.2G    5.74M    70.5      61.5
Yolov5-lite-e    Converged license plate dataset    2.6G     0.68M    59        40.7
Yolov7-tiny-k    Converged license plate dataset    16.7G    7.47M    99.0      86.4
Yolov5-lite-e-k  Converged license plate dataset    2.6G     0.69M    98.1      82.7
Light-yolov7     Converged license plate dataset    3.0G     1.02M    98.4      85.0
Light-yolov7-y   Converged license plate dataset    2.9G     1.02M    99.2      86.7

4 Conclusion

In this paper, we implement deep learning detection models, experiment with license plate datasets, and compare the models in terms of complexity and accuracy. The results show that the Light-yolov7 model is the best on both counts: its license plate detection accuracy of 99.2% on the fused dataset fully meets the requirements of the license plate detection task. In future work, the accuracy of these detection models will be compared separately on different large Chinese license plate datasets. We hope to further reduce the computational cost of the network while improving its speed, and to apply this research to real unmanned systems, combined with advanced detection algorithms, to achieve a systematic, lightweight, and high-precision pipeline covering license plate detection, correction, and feature recognition.

Acknowledgment. This work was supported by the Beijing Natural Science Foundation (No. 4222025) and the Beijing Municipal Science & Technology Commission, Administrative Commission of Zhongguancun Science Park (No. Z221100000222016).

References

1. Wang, C., Bochkovskiy, A., Liao, H.M.: YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. ArXiv, abs/2207.02696 (2022)
2. Usmankhujaev, S., Lee, S., Kwon, J.: Korean license plate recognition system using combined neural networks. In: Herrera, F., Matsui, K., Rodríguez-González, S. (eds.) DCAI 2019. AISC, vol. 1003, pp. 10–17. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-23887-2_2
3. Li, H., Wang, P., Shen, C.: Toward end-to-end car license plate detection and recognition with deep neural networks. IEEE Trans. Intell. Transp. Syst. 20(3), 1126–1136 (2019)

License Plate Detection and Recognition Based on Light-Yolov7


4. Tao, S.: Research on license plate recognition algorithm in complex scenes. Univ. Electron. Sci. Technol. (2019). (in Chinese)
5. Chen, X., Gong, Z.: YOLOv5-Lite: lighter, faster and easier to deploy. GitHub repository (2022). https://zhuanlan.zhihu.com/p/400545131
6. Xun, R., Su, T., Ma, X.: BP neural network combined template matching license plate recognition system. J. Tsinghua Univ. (Nat. Sci. Ed.) (9), 3–8
7. Wang, W., Ma, Y., Peng, Q.: Application of SVM multi-class classifier in feature recognition of license plate. Comput. Eng. Des. (9), 262–265
8. Chen, Z., Wang, W., Xie, E., Yang, Z., Lu, T., Luo, P.: FAST: searching for a faster arbitrarily-shaped text detector with minimalist kernel representation (2021)
9. Liao, M., Wan, Z., Yao, C., Chen, K., Bai, X.: Real-time scene text detection with differentiable binarization and adaptive scale fusion. IEEE Trans. Pattern Anal. Mach. Intell. 45, 919–931 (2023)
10. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. IEEE Computer Society, Washington, D.C. (2014)
11. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell. 39 (2015)
12. Graves, A., Fernández, S., Gomez, F.: Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. ACM (2006)
13. Xu, Z., et al.: Towards end-to-end license plate detection and recognition: a large dataset and baseline. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 261–277. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_16
14. Gong, Y., Deng, L., et al.: Unified Chinese license plate detection and recognition with high efficiency. J. Vis. Commun. Image Represent. 86 (2022)

Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data Based on Information Loss Constraints Jie Gao1 , Yawen Li2(B) , Zhe Xue1 , and Zeli Guan1 1 Beijing Key Laboratory of Intelligent Communication Software and Multimedia, School of

Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China 2 School of Economics and Management, Beijing University of Posts and Telecommunications, Beijing 100876, China [email protected]

Abstract. Massive spatio-temporal data are stored, managed, and applied in various practical scenarios, including public safety. However, due to the unique spatio-temporal distribution characteristics of real-world data, most existing methods have limitations in terms of the spatio-temporal proximity of data and load balancing in distributed storage. Therefore, this paper proposes an efficient partitioning method for large-scale public safety spatio-temporal data based on information loss constraints (IFL-LSTP). The IFL-LSTP model targets large-scale spatio-temporal point data by combining a spatio-temporal partitioning module (STPM) with a graph partitioning module (GPM). This approach significantly reduces the scale of the data while maintaining the model's accuracy, in order to improve partitioning efficiency. It also ensures load balancing in distributed storage while maintaining the spatio-temporal proximity of the data partitioning results, providing a new solution for distributed storage of massive spatio-temporal data. Experimental results on multiple real-world datasets demonstrate the effectiveness and superiority of IFL-LSTP.

Keywords: data partitioning · information loss · spatial-temporal proximity · load balancing · graph partitioning

1 Introduction

In recent years, the scale of spatio-temporal data has exploded, leading to several technical solutions based on distributed databases for storing, organizing, and managing massive amounts of data [1, 2, 13]. However, owing to the spatial aggregation and temporal correlation of spatio-temporal data, partitioning and load-balancing large-scale, unevenly distributed data while maintaining its spatio-temporal proximity remains an urgent problem [1, 20–22]. To address these problems, we propose an efficient partitioning method for large-scale public safety spatio-temporal data based on information loss constraints (IFL-LSTP). Specifically, IFL-LSTP consists of two main modules: the spatio-temporal partitioning module (STPM) and the graph partitioning module (GPM). Acting together, these modules significantly reduce the training time of the deep learning graph partitioning model while completing efficient data partitioning under a predefined information loss threshold, which helps balance the storage load and maintain the spatio-temporal proximity of the partitioning results [23–25]. We also conduct extensive experiments on real-world spatio-temporal datasets to verify the effectiveness of IFL-LSTP. Our contributions are summarized as follows:

• We propose IFL-LSTP, an efficient partitioning method for large-scale public safety spatio-temporal data based on information loss constraints, which partitions the data efficiently through STPM and GPM, maintaining spatio-temporal proximity while ensuring load-balanced partitioning results.
• STPM scales down the data size within a prescribed information loss threshold and iteratively partitions large spatio-temporal datasets efficiently without pre-specifying the final number of partitions.
• GPM uses graph embedding to design a loss function that jointly considers the minimum normalized edge cut of the graph and partition load balancing, and completes the balanced division of the graph structure by training a deep learning model.

This work was supported by the National Natural Science Foundation of China (62192784, U22B2038, 62172056, 62272058).
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 92–100, 2023. https://doi.org/10.1007/978-981-99-6187-0_9

2 Methodology

2.1 Problem Definition

Definition 1 (Information loss). The information difference between the initial input dataset and the result after STPM partitioning is called the information loss (IFL):

$$\mathrm{IFL}(d, d') = \frac{1}{n \times s} \sum_{i=1}^{n} \sum_{k=1}^{s} \frac{\left| d_i'(k) - d_i(k) \right|}{d_i(k)} \tag{1}$$

where $d_i(k)$ is the value of attribute $k$ of cell $i$ in the original grid of $n$ cells with $s$ attributes, and $d_i'(k)$ denotes the representative value of attribute $k$ for cell $i$ after STPM partitioning.

Definition 2 (Minimum normalized edge cut of the graph [3]). Given an undirected graph $G = (V, E)$, $G$ can be divided into disjoint sets $S_1, S_2, \dots, S_g$ by removing some of its edges. The minimum normalized edge cut of the graph is computed as

$$\mathrm{MNcut}(S_1, S_2, \dots, S_g) = \sum_{k=1}^{g} \frac{\mathrm{cut}(S_k, \bar{S}_k)}{\mathrm{vol}(S_k, V)} \tag{2}$$

where $\mathrm{cut}(S_k, \bar{S}_k)$ counts the edges between $S_k$ and its complement, and $\mathrm{vol}(S_k, V)$ denotes the total degree in $G$ of the nodes belonging to $S_k$.
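Definition 1 can be computed in a few lines; the function name and the toy two-cell grid below are illustrative only:

```python
import numpy as np

def information_loss(d, d_part):
    """IFL of Eq. (1): mean relative difference between the original grid
    `d` and the STPM result `d_part`, both shaped (n cells, s attributes)."""
    n, s = d.shape
    return float((np.abs(d_part - d) / d).sum() / (n * s))

# two cells merged into one group whose representative values are the means
d      = np.array([[2.0, 4.0], [4.0, 8.0]])
d_part = np.array([[3.0, 6.0], [3.0, 6.0]])
print(information_loss(d, d_part))   # 0.375
```

The division by the original cell value makes the loss scale-free, so one threshold θ can be applied across attributes with different units.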


2.2 Spatio-Temporal Partitioning Module (STPM)

As shown in Fig. 1, before the iteration starts STPM normalizes the dataset with its spatio-temporal attributes and pre-computes the minimum neighboring-attribute difference, following previous methods [14–16]. Then, inspired by [17–19], the cell group extractor traverses the grid to find rectangular groups of neighboring grid cells such that the attribute difference between the cells in a group is at most the minimum neighboring-attribute difference computed above. Once all eligible cell groups are found, each cell group acts as a single grid cell in the input grid of the next iteration. The feature allocator creates and assigns representative attribute values to the extracted cell groups. In this process, STPM defines an allocation selection parameter λ with only two values, average and mode, indicating whether the representative of each spatio-temporal attribute is taken as the mean or as the most frequent value over the grid cells in the group. After features are assigned to each cell group, the spatio-temporal partitioning result of the current iteration is obtained; STPM then computes the information loss between the current result and the initial input grid according to Definition 1, checks whether it is below the predefined information loss threshold θ, and decides whether to continue with subsequent iterations.

Fig. 1. Illustration of IFL-LSTP
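A much-simplified 1-D sketch of the STPM loop just described: merge the most similar neighbouring cell groups, assign a representative value, and stop before the information loss exceeds θ. The real module works on 2-D rectangular cell groups with multiple attributes, so treat this only as an illustration of the control flow; all names are hypothetical.

```python
import numpy as np

def stpm_1d(values, theta, lam="average"):
    """Greedy 1-D analogue of STPM: merge the most similar adjacent cell
    groups until the IFL (Definition 1, one attribute) would exceed theta."""
    values = np.asarray(values, float)
    groups = [[i] for i in range(len(values))]       # cell indices per group

    def loss(gs):
        rep = np.empty_like(values)
        for g in gs:
            members = values[g]
            if lam == "average":
                rep[g] = members.mean()
            else:  # "mode": assumes integer-coded attribute values
                rep[g] = np.bincount(members.astype(int)).argmax()
        return float(np.mean(np.abs(rep - values) / values))

    while len(groups) > 1:
        diffs = [abs(values[groups[i][-1]] - values[groups[i + 1][0]])
                 for i in range(len(groups) - 1)]
        j = int(np.argmin(diffs))                    # most similar neighbours
        merged = groups[:j] + [groups[j] + groups[j + 1]] + groups[j + 2:]
        if loss(merged) > theta:                     # threshold reached: stop
            break
        groups = merged
    return groups

print(stpm_1d([10, 10.1, 10.2, 50, 50.5], theta=0.05))
```

Note the number of final groups is never pre-specified; it emerges from the threshold θ, mirroring the property claimed for STPM.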

2.3 Graph Partitioning Module (GPM)

Loss Function. The output of the GPM model is $Y \in \mathbb{R}^{n \times g}$, where $Y_{ik}$ denotes the probability that node $v_i \in V$ is assigned to the disjoint partition $S_k$. According to the graph structure mapping rules, the set of nodes adjacent to $v_i$ can be read directly from the adjacency matrix $A$ of the graph. Therefore, based on Definition 2, minimizing the normalized edge cut of the graph reduces to minimizing its expectation:

$$\mathbb{E}\left[\mathrm{MNcut}(S_1, \dots, S_g)\right] = \sum \frac{Y}{\Gamma} (1 - Y)^{\top} \odot A, \qquad \Gamma = Y^{\top} D \tag{3}$$

where $D$ is an $n$-dimensional vector recording the degree of each node, $\odot$ denotes the Hadamard product with the adjacency matrix $A$ [4], and the sum runs over all entries of the resulting matrix. For a given graph $G$, assuming the number of partitions is $g$ and the amount of data on each node $v_i$ is $f_i$, the average load on each partition should be $\sum_{i=1}^{n} f_i / g$. Collecting the node loads into an $n$-dimensional vector $F$, the final loss function $\mathrm{Loss}_{\mathrm{GPM}}$ is specified by

$$\mathrm{Loss}_{\mathrm{GPM}} = \mathbb{E}\left[\mathrm{MNcut}(S_1, \dots, S_g)\right] + \sum \Big( F^{\top} Y - \frac{1}{g + \varepsilon} \sum_{i=1}^{n} f_i \Big)^2 \tag{4}$$

where the outer sum runs over the $g$ partitions and $\varepsilon$ is a small constant for numerical stability.
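The loss of Eqs. (3)–(4) can be written down in a few lines of NumPy. The balance term here follows the textual description (expected per-partition load versus the average load), so treat its exact form as indicative rather than the authors' implementation:

```python
import numpy as np

def gpm_loss(Y, A, D, F, eps=1e-6):
    """Sketch of Loss_GPM: expected minimum normalized edge cut (Eq. 3)
    plus a squared load-balancing penalty (Eq. 4). Y: n x g soft
    assignment, A: adjacency matrix, D: node degrees, F: per-node load."""
    gamma = Y.T @ D                        # expected volume of each partition
    cut = ((Y / gamma) @ (1.0 - Y).T) * A  # edge mass leaving each partition
    e_mncut = cut.sum()
    g = Y.shape[1]
    load = F @ Y                           # expected load per partition
    balance = ((load - F.sum() / (g + eps)) ** 2).sum()
    return e_mncut + balance

# a 4-node graph with two obvious clusters: {0, 1} and {2, 3}
A = np.array([[0, 1, 0, 0], [1, 0, 0, 0],
              [0, 0, 0, 1], [0, 0, 1, 0]], float)
D = A.sum(axis=1)
F = np.ones(4)                             # unit load on every node
good = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], float)  # respects clusters
bad  = np.array([[1, 0], [0, 1], [1, 0], [0, 1]], float)  # cuts both edges
print(gpm_loss(good, A, D, F), gpm_loss(bad, A, D, F))
```

Because every term is differentiable in Y, this loss can be minimized directly by gradient descent on the network that produces Y.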

Graph Partitioning Model. GPM uses graph embedding techniques [5–7] to learn and adapt to the graph structure. As shown in Fig. 1, following the loss function defined above, the model input consists of the n × n adjacency matrix A, the n-dimensional vector D recording node degree information, and the node features X. Because graph convolutional networks (GCNs) have powerful feature extraction capability for graph nodes, as shown in the literature [26, 27], GPM constructs a 2-layer GCN to extract features from the model input. In addition, GPM uses GraphSAGE, proposed in the literature [28], to generate high-dimensional graph node representations from the input features. The learned node embeddings are then passed through a fully connected layer and a SoftMax, producing the probability that each node belongs to partition S1, S2, ..., Sg, i.e., the model output Y shown in Fig. 1.
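A minimal forward pass of the partitioning head (2-layer GCN plus a fully connected SoftMax layer) might look as follows; the weight shapes are illustrative and the GraphSAGE branch is omitted for brevity:

```python
import numpy as np

def normalize_adj(A):
    """Symmetric normalization D^{-1/2}(A + I)D^{-1/2} used by GCN layers."""
    A_hat = A + np.eye(A.shape[0])
    d = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d[:, None] * d[None, :]

def gcn_partition(A, X, W1, W2, Wout):
    """2-layer GCN + fully connected SoftMax head: returns the n x g
    matrix Y of node-to-partition probabilities (rows sum to 1)."""
    A_n = normalize_adj(A)
    H = np.maximum(A_n @ X @ W1, 0.0)        # GCN layer 1 + ReLU
    H = np.maximum(A_n @ H @ W2, 0.0)        # GCN layer 2 + ReLU
    Z = H @ Wout                             # fully connected layer
    Z -= Z.max(axis=1, keepdims=True)        # numerically stable SoftMax
    P = np.exp(Z)
    return P / P.sum(axis=1, keepdims=True)

# random toy graph: 6 nodes, 5 input features, g = 3 partitions
rng = np.random.default_rng(0)
A = (rng.random((6, 6)) < 0.4).astype(float)
A = np.triu(A, 1); A = A + A.T               # symmetric, no self-loops
X = rng.standard_normal((6, 5))
W1 = rng.standard_normal((5, 8))
W2 = rng.standard_normal((8, 8))
Wout = rng.standard_normal((8, 3))
Y = gcn_partition(A, X, W1, W2, Wout)
```

In training, Y would be fed to the differentiable loss of Eq. (4) and the weights updated by backpropagation.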

Fig. 2. Evaluating cell reduction performance of IFL-LSTP with various values of IFL on all datasets


3 Experiments

3.1 Experimental Data Preparation

To verify the effectiveness of IFL-LSTP, experiments are conducted on three real-world datasets: the home-sales dataset, the GLONASS+112 dataset [8], and the public-safety dataset. GLONASS+112 is an emergency dataset; 100,000 emergency events are randomly sampled from it to form the mGLONASS dataset used in the evaluation. The public-safety dataset is crawled and organized from a microblogging platform.

3.2 Analysis of Experimental Results

Evaluation of Data Size Reduction Effects. Two grid sizes, 100k (315 × 318) and 36k (191 × 193), are constructed for the experimental validation, and the degree of grid cell reduction and the reduction time are evaluated for three predefined loss thresholds: 0.05, 0.1, and 0.15. Figure 2 shows the effect of data size reduction on the three datasets under different information loss thresholds. Subplots (a)–(c) of Fig. 2 show that IFL-LSTP reduces the number of grid cells by about 25% with an information loss of only 0.05; when the IFL threshold is raised to 0.1 and 0.15, the number of grid cells is further reduced, by up to about 36%, although the marginal reduction shrinks as the threshold grows. Subplots (d)–(f) of Fig. 2 show the grid cell reduction times under the different IFL thresholds on all datasets; these times rise with both the IFL threshold and the initial grid granularity, because a finer initial grid gives STPM more cell groups to process, and a larger IFL threshold allows more iterations and thus a longer running time.

Evaluation of Spatio-Temporal Proximity of Partitioning Results.
In order to verify the spatio-temporal proximity of the data partitioning results of IFL-LSTP, the spatiotemporal hierarchical indexing method [9], TrajSpark [10] and LE [11] are compared with IFL-LSTP, respectively. The experimental results are shown in Table 1. From the analysis of the results in the above table, it can be seen that IFL-LSTP is optimal at IFL = 0.15 in terms of the performance of the spatio-temporal proximity p. Especially compared with TrajSpark, IFL-LSTP at IFL = 0.15 has a more obvious improvement, and can improve 5.76% on average on all datasets. In addition, the spatiotemporal proximity of IFL-LSTP partition results improve to different degrees with the increase of IFL information loss threshold, but the degree of improvement is limited. Analysis of Model Training Time Consumption. This section analyzes the performance of IFL-LSTP in reducing the training time when training a deep learning graph partitioning model for GPM. The training time of GPM on different datasets under each loss threshold is shown in Fig. 3. Figure 3 shows that IFL-LSTP with IFL = 0.05 can reduce the training time of GPM in the range of about 25%–32%. While when the IFL is 0.1 and 0.15, it does not reduce

Efficient Partitioning Method of Large-Scale Public Safety Spatio-Temporal Data

97

Table 1. Comparison results of spatio-temporal proximity.

Datasets        Methods                                      Spatio-temporal proximity p/%
house-sales     TrajSpark [10]                               80.43
                Spatio-temporal hierarchical indexing [9]    84.67
                LE [11]                                      85.06
                IFL-LSTP (IFL = 0.05)                        83.28
                IFL-LSTP (IFL = 0.1)                         85.54
                IFL-LSTP (IFL = 0.15)                        85.79
mGLONASS        TrajSpark [10]                               78.76
                Spatio-temporal hierarchical indexing [9]    81.33
                LE [11]                                      82.39
                IFL-LSTP (IFL = 0.05)                        79.40
                IFL-LSTP (IFL = 0.1)                         80.97
                IFL-LSTP (IFL = 0.15)                        82.14
public-safety   TrajSpark [10]                               76.53
                Spatio-temporal hierarchical indexing [9]    78.89
                LE [11]                                      80.12
                IFL-LSTP (IFL = 0.05)                        77.74
                IFL-LSTP (IFL = 0.1)                         80.06
                IFL-LSTP (IFL = 0.15)                        81.38

the training time much further, even though it allows greater information loss in STPM. Therefore, the choice of the IFL threshold should balance the reduction in training time against model accuracy.
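This trade-off can be automated with a simple diminishing-returns rule; the function name pick_ifl and the reduction figures below are assumptions for illustration only, loosely echoing the trend reported above.

```python
# Illustrative threshold selection: stop loosening the loss budget once
# the marginal gain in training-time reduction flattens out.
def pick_ifl(candidates, time_reduction, min_gain=0.02):
    """candidates: sorted IFL thresholds; time_reduction: fraction of
    training time saved at each threshold."""
    best = candidates[0]
    for prev, cur, r_prev, r_cur in zip(
            candidates, candidates[1:], time_reduction, time_reduction[1:]):
        if r_cur - r_prev >= min_gain:
            best = cur
        else:
            break
    return best

# Assumed figures: large gain from 0.05 to 0.1, negligible gain after.
best = pick_ifl([0.05, 0.1, 0.15], [0.28, 0.31, 0.315])
```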

Fig. 3. Analyzing training time of GPM with various values of IFL on all datasets


J. Gao et al.

Evaluation of Load Balancing Effectiveness. In this section, we evaluate the effectiveness of the graph partitioning results in terms of the edge cut rate λ and the overall imbalance ubd. The experimental results are shown in Table 2.

Table 2. Evaluation of load balancing effectiveness.

Datasets        Methods                  λ       ubd/%
house-sales     Metis [12]               0.19    1.76
                LDG [29]                 0.13    1.69
                IFL-LSTP (IFL = 0.05)    0.15    1.68
                IFL-LSTP (IFL = 0.1)     0.11    1.23
                IFL-LSTP (IFL = 0.15)    0.09    1.12
mGLONASS        Metis [12]               0.26    2.35
                LDG [29]                 0.22    2.17
                IFL-LSTP (IFL = 0.05)    0.23    2.21
                IFL-LSTP (IFL = 0.1)     0.17    1.79
                IFL-LSTP (IFL = 0.15)    0.16    1.72
public-safety   Metis [12]               0.28    2.63
                LDG [29]                 0.25    2.54
                IFL-LSTP (IFL = 0.05)    0.25    2.52
                IFL-LSTP (IFL = 0.1)     0.20    2.07
                IFL-LSTP (IFL = 0.15)    0.18    1.99

The results in Table 2 show that IFL-LSTP achieves a measurable improvement over the classical graph partitioning methods Metis and LDG on these metrics, mainly because IFL-LSTP first performs spatio-temporal partitioning and thus accounts for the influence of the spatio-temporal distribution on load balancing. Further, comparing the results under the three information loss thresholds, the edge cut rate λ and the overall imbalance ubd decrease gradually with increasing IFL; however, between IFL = 0.1 and IFL = 0.15 the changes in both indicators are small, which again confirms that the IFL threshold should be chosen as a trade-off between reducing training time and maintaining model accuracy.
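The paper does not restate its exact formulas for λ and ubd, so the sketch below assumes the common definitions: λ as the fraction of edges cut by the partition, and ubd as the maximum partition load over the average load.

```python
# Assumed definitions (not spelled out in the paper):
#   lambda = cut edges / total edges
#   ubd    = max partition size / average partition size
def cut_rate(edges, part):
    """edges: list of (u, v); part: node -> partition id."""
    cut = sum(1 for u, v in edges if part[u] != part[v])
    return cut / len(edges)

def imbalance(part, k):
    """Overall imbalance of a k-way partition."""
    sizes = [0] * k
    for p in part.values():
        sizes[p] += 1
    avg = sum(sizes) / k
    return max(sizes) / avg

# Toy 4-node graph split into two balanced parts.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
part = {0: 0, 1: 0, 2: 1, 3: 1}
lam, ubd = cut_rate(edges, part), imbalance(part, 2)
```

For this toy partition, 3 of the 5 edges are cut (λ = 0.6) while the parts are perfectly balanced (ubd = 1.0), showing that the two indicators measure independent aspects of partition quality.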

4 Conclusion

In this paper, we propose IFL-LSTP, an efficient partitioning method for large-scale public safety spatio-temporal data based on information loss constraints, comprising a novel spatio-temporal partitioning module (STPM) and graph partitioning module (GPM). Extensive comparison experiments on three real-world datasets validate the effectiveness and superiority of our method.


References
1. Xiao, S., Shao, Y., Li, Y., Yin, H., Shen, Y., Cui, B.: LECF: recommendation via learnable edge collaborative filtering. Sci. China Inf. Sci. 65(1), 1–15 (2022)
2. Alam, M.M., Torgo, L., Bifet, A.: A survey on spatio-temporal data analytics systems. ACM Comput. Surv. 54(10s), 1–38 (2022)
3. Nazi, A., Huang, W., Goldie, A., et al.: GAP: generalizable approximate graph partitioning framework. arXiv preprint arXiv:1903.00614 (2019)
4. Horn, R.A., Yang, Z.: Rank of a Hadamard product. Linear Algebra Appl. 591, 87–98 (2020)
5. Xu, M.: Understanding graph embedding methods and their applications. SIAM Rev. 63(4), 825–853 (2021)
6. Guan, Z., Li, Y., Xue, Z., Liu, Y., Gao, H., Shao, Y.: Federated graph neural network for cross-graph node classification. In: 2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems (CCIS), pp. 418–422 (2021)
7. Li, Y., et al.: Heterogeneous latent topic discovery for semantic text mining. IEEE Trans. Knowl. Data Eng. 35(1), 533–544 (2021)
8. Dagaeva, M., Garaeva, A., Anikin, I., et al.: Big spatio-temporal data mining for emergency management information systems. IET Intell. Transp. Syst. 13(11), 1649–1657 (2019)
9. Zhao, X., Huang, X., Qiao, J., et al.: A spatio-temporal index based on skew spatial coding and R-tree. J. Comput. Res. Dev. 56(03), 666–676 (2019)
10. Zhang, Z., Jin, C., Mao, J., Yang, X., Zhou, A.: TrajSpark: a scalable and efficient in-memory management system for big trajectory data. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds.) APWeb-WAIM 2017. LNCS, vol. 10366, pp. 11–26. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-63579-8_2
11. Xia, H., Lin, L.: Spatio-temporal data partitioning method based on Laplacian eigenmaps. Sci. Surv. Mapp. 43(6), 32–38 (2018)
12. Luo, G., Chen, X., Nong, S.: Net clustering based low complexity coarsening algorithm in k-way hypergraph partitioning. J. Phys. Conf. Ser. 2245(1), 012019 (2022)
13. Lin, P., Jia, Y., Du, J., Yu, F.: Average consensus for networks of continuous-time agents with delayed information and jointly-connected topologies. In: 2009 American Control Conference, pp. 3884–3889 (2009)
14. Li, Y., Yuan, Y., Wang, Y., Lian, X., Ma, Y., Wang, G.: Distributed multimodal path queries. IEEE Trans. Knowl. Data Eng. 34(7), 3196–3321 (2022)
15. Huang, J., et al.: HGAMN: heterogeneous graph attention matching network for multilingual POI retrieval at Baidu Maps. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (KDD 2021), pp. 3032–3040 (2021)
16. Kou, F., et al.: Hashtag recommendation based on multi-features of microblogs. J. Comput. Sci. Technol. 33, 711–726 (2018)
17. Li, A., et al.: Scientific and technological information oriented semantics-adversarial and media-adversarial cross-media retrieval. arXiv preprint arXiv:2203.08615 (2022)
18. Wei, X., Du, J., Liang, M., Ye, L.: Boosting deep attribute learning via support vector regression for fast moving crowd counting. Pattern Recognit. Lett. 119, 12–23 (2019)
19. Shao, Y., Huang, S., Li, Y., Miao, X., Cui, B., Chen, L.: Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs. VLDB J. 30(5), 769–797 (2021)
20. Li, Y., Zeng, I.Y., Niu, Z., Shi, J., Wang, Z., Guan, Z.: Predicting vehicle fuel consumption based on multi-view deep neural network. Neurocomputing 502, 140–147 (2022)
21. Li, Y., Jiang, W., Yang, L., Tian, W.: On neural networks and learning systems for business computing. Neurocomputing 275(31), 1150–1159 (2018)
22. Li, W., Jia, Y., Du, J.: Tobit Kalman filter with time-correlated multiplicative measurement noise. IET Control Theory Appl. 11(1), 122–128 (2017)
23. Meng, D., Jia, Y., Du, J., Yu, F.: Tracking algorithms for multiagent systems. IEEE Trans. Neural Netw. Learn. Syst. 24(10), 1660–1676 (2013)
24. Li, W., Jia, Y., Du, J., Zhang, J.: PHD filter for multi-target tracking with glint noise. Signal Process. 94, 48–56 (2014)
25. Li, A., Li, Y., Shao, Y., Liu, B.: Multi-view scholar clustering with dynamic interest tracking. IEEE Trans. Knowl. Data Eng., 1–14 (2023)
26. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
27. Li, Y., Li, W., Xue, Z.: Federated learning with stochastic quantization. Int. J. Intell. Syst. (2022)
28. Hamilton, W., Ying, Z., Leskovec, J.: Inductive representation learning on large graphs. Adv. Neural Inf. Process. Syst. 30, 1025–1035 (2017)
29. Pacaci, A., Özsu, M.T.: Experimental analysis of streaming algorithms for graph partitioning. In: Proceedings of the 2019 International Conference on Management of Data, pp. 1375–1392 (2019)

Research and Application of Intelligent Monitoring and Diagnosis System for Rail Transit Emergency Power Supply Equipment

Liang Chen1(B), Lin Zhou2, Xuliang Tang3, Heng Wan3, and Yanao Cao4

1 Power Supply Branch of Shanghai Metro Maintenance Support Co., Ltd., Shanghai 201106, China
  [email protected]
2 Hangzhou Branch of Shanghai Jiudao Information Technology Co., Ltd., Hangzhou 310012, China
3 School of Railway Transportation, Shanghai Institute of Technology, Shanghai 201418, China
4 Shanghai Rail Transit Maintenance Support Co., Ltd., Shanghai 200070, China

Abstract. Emergency power supply equipment is an important part of the rail transit system. To keep rail transit emergency power supply equipment operating safely and reliably, it is urgent to carry out intelligent monitoring, early warning and prediction of the equipment operation status, locate faults rapidly and accurately, and provide effective fault handling guidance. This paper proposes a set of fault monitoring, diagnosis, location and disposal methods for emergency power supply equipment. Through the intelligent monitoring platform for emergency power supply, life-cycle health management and refined operation and maintenance management of the equipment are realized, and the service life and operation reliability of the emergency power supply equipment are effectively improved.

Keywords: Rail Transit · Emergency Supply · Fault Analysis · Intelligent Monitoring

1 Introduction

With the development of the rail transit industry, the number of rail transit emergency power supply devices is increasing year by year. Existing monitoring methods suffer from low monitoring efficiency, a low level of intelligent operation, and high operation and maintenance costs [1]. The intelligent monitoring system for emergency power supply established in this paper realizes all-round, multi-angle intelligent monitoring, early warning and prediction of the operation status of the UPS system. The system collects and classifies information according to the six parts of the UPS, makes logical judgments on, or intelligently combines, the single-point telemetry and remote signaling data collected from the equipment body, directly and quickly locates a fault to the specific part of the UPS, and outputs intuitive and clear fault phenomena, fault causes, scopes of influence, fault measures, emergency plans and other

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 101–108, 2023. https://doi.org/10.1007/978-981-99-6187-0_10


information, providing intelligent decision-making for daily operation and maintenance, and comprehensive research and judgment analysis for professionals through multiple data visualization platforms. The system not only saves considerable labor cost, but also analyzes the operation data, so as to better grasp the health status of the equipment and maximize its service life [2, 3]. It also helps transform the operation and maintenance mode of UPS equipment in the metro power supply system from corrective and scheduled maintenance to preventive maintenance, which provides strong support for the metro power supply business.

In the operation of the power system, short-term power outages occur due to increased power supply load, deteriorating weather, and equipment maintenance or failure [4]. To ensure that the power load is not affected during an outage and that important equipment can operate normally, an emergency power supply is needed. An uninterruptible power supply (UPS) is such an emergency power source, mainly composed of a rectifier, an inverter, a battery, a static switch and other components. The UPS host is connected with the battery and delivers mains power to important electronic equipment. When the mains input is not interrupted, the UPS stabilizes the voltage to prevent voltage fluctuations from damaging the electrical equipment. At the same time, the UPS rectifies the AC mains power to DC to keep the internal battery fully charged, and then supplies power to the load through the inverter. When the mains power is interrupted, the UPS immediately supplies the DC power of the battery to the load through a zero-interruption switchover to the inverter.
Thus, a short power failure does not interrupt the power supply: the load maintains normal operation, its software and hardware are protected from damage, and high-quality power is always available for precision instruments [5, 6].

In the field of metro power supply, the UPS emergency power supply ensures safe power for important loads such as train signaling, communication, the automatic fire alarm system (FAS), the environment and equipment monitoring system (BAS), the automatic fare collection system (AFC), the power monitoring system (SCADA), the energy management control system (EMCS), access control, etc., and provides an uninterruptible power supply guarantee for other important 400 V loads of all operating units, such as train operation, communication, fire protection and customer service. UPS emergency power supply therefore plays an important role in metro power supply, providing a highly reliable backup power source [7] to ensure power supply quality and continuity.

At present, a metro power supply branch in Shanghai has jurisdiction over 604 sets of UPS equipment, including 26 sets of tower-type UPS and 78 sets of modular UPS. The equipment brands include Aerospace, GE, SOKMAN, etc. Depending on the outgoing line structure and equipment type, UPS units have different operation modes and monitoring data [8, 9]. Different brands and models of UPS and multi-source data increase the complexity of UPS operation and maintenance: the work relies mainly on experienced personnel manually analyzing single-point information such as SCADA telemetry and remote signaling, and cannot monitor the UPS operation status online in real time, nor perceive and accurately locate faults in advance [10].
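The switchover behaviour of a double-conversion UPS described above can be condensed into a toy decision rule; the mode names and the battery threshold below are illustrative assumptions, not values from the monitored equipment.

```python
# Toy decision rule for the UPS behaviour described above; mode names
# and the minimum battery level are illustrative assumptions.
def ups_source(mains_ok, battery_level, min_level=0.1):
    """Return which path feeds the load in a double-conversion UPS."""
    if mains_ok:
        # Mains present: the rectifier keeps the battery charged and
        # the inverter feeds the load with stabilised voltage.
        return "mains-via-rectifier-inverter"
    if battery_level > min_level:
        # Mains lost: battery feeds the inverter with no interruption.
        return "battery-via-inverter"
    return "shutdown"
```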


2 Function Design of the Diagnosis and Monitoring System

2.1 Fault Classification

According to the composition and structure of the UPS, the system divides the UPS into six parts: incoming line, rectifier and inverter, battery, outgoing line, communication, and environment. Incoming line faults include ATS faults, UPS input faults, single incoming line faults, dual incoming line faults, incoming line bypass faults, etc. Rectifier and inverter faults include AC input faults, rectifier faults, inverter faults, AC output faults, bypass faults, bypass operation, UPS overload, etc. Battery faults include too short a dischargeable time, abnormal cell voltage, abnormal cell temperature, battery discharge, low core capacity, low residual capacity, battery faults, etc. Outgoing line faults include UPS output faults, UPS outgoing line faults, etc. Communication faults include battery communication failure, power distribution cabinet communication failure, inverter cabinet communication failure, etc. Environmental faults include abnormal ambient temperature and humidity.

2.2 Alarm Model

The system gives early warnings or alarms for faults according to the operation data and the monitoring model. The fault monitoring models fall into three types according to their data sources. The raw data monitored by the underlying Internet of Things (IoT) devices is transmitted to the platform through the data acquisition and monitoring system (SCADA).
The platform either directly classifies, combines and packages these raw data to output alarms, or raises alarms based on logic analysis or threshold judgment rules over the remote signaling and telemetry data, and finally applies the generated alarm results to the business. The data involved include UPS input and output current, voltage, load rate, power and frequency; battery current, voltage, temperature and capacity; the status of components and disconnectors in the UPS system; the overall UPS operation mode, etc. The alarm business process is shown in Fig. 1.

Fig. 1. Alarm based on IoT raw monitoring data.
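As a sketch of the two rule types used in this section, a threshold-judgment alarm and the ATS logic rule might look like the following; the thresholds, field names and the plain-dict record format are assumptions, not the platform's actual API.

```python
# Illustrative sketch of two alarm rule types; thresholds, field names
# and record formats are assumptions, not the platform's API.
def threshold_alarm(point, value, low, high):
    """Threshold judgment on a single telemetry point."""
    if value < low or value > high:
        return {"point": point, "value": value, "alarm": "out-of-band"}
    return None  # no alarm

def ats_fault(on_main, main_voltage, nominal=400.0, tolerance=0.1):
    """Logic rule: the ATS stays on the main supply although the main
    supply voltage has left the acceptable band -> judged faulty."""
    voltage_bad = abs(main_voltage - nominal) > nominal * tolerance
    return on_main and voltage_bad
```

In practice such rules would be evaluated continuously over the SCADA point stream, with the raised records merged and packaged as described above.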


Direct classification and combination alarms cover cases such as faults on the loads downstream of the UPS. If the UPS feeds several different loads, SCADA identifies and outputs an alarm for each faulty branch line, while the system combines all the outgoing line faults under the UPS into one outgoing line fault and packages a single alarm and reminder. The specific branch line faults and fault information can be viewed in the fault alarm details. This helps the operation and maintenance personnel merge and deal with alarms that would otherwise be scattered.

Logic analysis and threshold judgment alarms cover cases such as ATS faults. Most lines have no direct fault monitoring points, but SCADA can directly monitor information such as the working signal of the main or standby power supply of the UPS incoming line and the voltage values of the main and standby supplies. The system performs rule-based diagnosis on these original monitoring points and uses logic to determine whether the ATS currently has a fault. For example, if the ATS works on the main power supply, the main supply voltage deviates beyond the set threshold, and the ATS does not automatically switch from the main to the standby supply, the ATS is judged to be faulty. The main function of the ATS is to let two power supplies operate as standby for each other: when the active supply is abnormal, the ATS automatically switches to the other, standby supply.

2.3 Fault Knowledge Base

The system accumulates equipment fault analysis experience from equipment faults, equipment ledgers, operation inspections, etc., and professional personnel maintain the fault knowledge base daily. This makes it convenient for operation and maintenance personnel to search for fault information, analyze faults and support decision-making, and provides life-cycle solutions for equipment fault operation and maintenance management.
The knowledge base mainly includes the fault phenomenon, fault cause, scope of impact, treatment measures and emergency plan. When the system diagnoses a fault or fault tendency according to the rules and raises a fault alarm, it automatically associates and outputs the corresponding knowledge base content in the fault details.
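A minimal stand-in for such a knowledge base, with one invented entry, might look like this; real entries would be maintained by the professionals mentioned above, and the field names are assumptions.

```python
# Invented example entry; field names mirror the five knowledge-base
# items listed above but are otherwise assumptions.
KNOWLEDGE_BASE = {
    "ATS fault": {
        "phenomenon": "ATS fails to switch to the standby supply",
        "cause": "switch mechanism stuck or control circuit failure",
        "impact": "loss of supply redundancy on the incoming line",
        "measures": "dispatch maintenance; switch manually if safe",
        "plan": "transfer critical loads per emergency procedure",
    },
}

def annotate_alarm(alarm):
    """Attach knowledge-base content to a raised alarm, as the system
    does automatically in the fault details view."""
    entry = KNOWLEDGE_BASE.get(alarm["fault"], {})
    return {**alarm, **entry}

detail = annotate_alarm({"fault": "ATS fault", "station": "S1"})
```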

3 System Application

Before the construction of the emergency power monitoring system, operation and maintenance management faced the business pain points of high UPS operation and maintenance costs, low data utilization, and a low level of intelligent operation. To address these pain points, a UPS intelligent power supply monitoring system has been built. The system takes digital twins as its core and combines 5G, big data, IoT edge computing and other technologies to carry out intelligent monitoring, early warning and prediction, status assessment, predictive maintenance, etc., achieving lean management, lean detection and lean control of the equipment.

In the intelligent monitoring of the emergency power supply, the physical system is the UPS equipment in each station, and the digital twin corresponds to the simulation model in the system. The mapping relationship between the physical body and the


twin body is as follows: first, the bottom layer collects the physical UPS attributes, parameters, status and other data through various terminal devices, which are then perceived, collected, transmitted and aggregated into the emergency power intelligent monitoring system through the SCADA system. The system's computing strategies and intelligent services analyze the physical emergency power data, and the resulting insights are fed back into the physical space to support better decisions and actions on the physical UPS system.

3.1 Integrated Control of UPS Equipment

During the monitoring of the emergency power supply UPS equipment, the overall operation of the emergency power supply equipment within the jurisdiction can be viewed in real time on a large screen. Clicking an abnormal part opens the part monitoring details, where the alarm condition, monitoring range, fault diagnosis details and other information can be viewed by switching line and site. In this way, when a UPS is abnormal, the dispatcher can quickly and accurately locate the abnormal point, promptly mobilize the operation and maintenance personnel and materials closest to it to clear the defect, make a decision at the first opportunity, and minimize the fault impact range and outage maintenance time. The function example interface is shown in Fig. 2.

Fig. 2. Example interface of UPS multi-dimensional comprehensive control.


3.2 Real-Time Status Monitoring of Equipment

Based on the monitoring data, a real-time circuit diagram of the UPS at each station is presented. The connection mode of, and information about, each component in the circuit diagram are presented exactly according to the actual UPS physical system. On the diagram, the real-time operation mode, operation abnormalities, switch on/off status, current flow direction, load distribution, equipment evaluation and fault alarms of the UPS can be seen. The circuit diagram not only conveys the real-time status of the UPS, but also helps troubleshoot UPS operation faults. The function example interface is shown in Fig. 3.

Fig. 3. Example interface of UPS real-time status monitoring.

3.3 Fault Simulation

Fault simulation dynamically simulates the operation mode and status of the UPS digital twin: it simulates faults under different operation modes, predicts the discharge time of the battery, and deduces the fault phenomenon, fault cause and influence range, playing a pre-feedback role for the UPS. This makes it convenient to prepare fault measures and emergency plans in advance, and common UPS faults can be simulated for dispatchers and station personnel, which is valuable for practical training. The function example interface is shown in Fig. 4.

3.4 Visual Report Analysis Scenario

The platform provides rich visual reports and intelligent analysis charts on which intelligent decisions can be based. For example, metro power supply operation and maintenance personnel can observe the development trend of the equipment status through a


Fig. 4. Future state simulation example interface.

large screen or PC, and analyze the change trends of the trained algorithm model outputs, ambient temperature, load current, etc. to assess the equipment health status under these trends, then act according to the relevant disposal procedures to avoid serious failures such as equipment damage. In addition, analysis and evaluation of the accumulated historical operation data can reveal the trend of UPS equipment failures and support more reasonable future maintenance and repair planning. The function example interface is shown in Fig. 5.

Fig. 5. Example of UPS temperature clustering curve.


4 Conclusion

This paper introduces the important role of the UPS emergency power supply in the rail transit power supply system. Around the six key parts of the UPS, an alarm sensing system based on digital twins is established for monitoring and managing the UPS. Through this platform, the UPS is effectively managed, its service life and reliability are improved, and the safe and reliable operation of the rail transit system is guaranteed. The system supports centralized, unified management of multi-brand, multi-type UPS equipment at each line station. At the same time, more than one hundred intelligent research and judgment models covering the six major parts of the UPS intelligently control every connected UPS unit, so that the parameters of the on-site UPS equipment can be seen at a glance. Digital twin technology comprehensively presents the monitoring and operation and maintenance status of the UPS and gives early warnings and alarms for abnormal operation. The upgrade replaces the blind spots of the original UPS monitoring and the operation and maintenance mode that relied on on-site inspection by personnel. In the two years the platform has been in operation, it has issued early warnings for more than 20,000 faults, greatly improving the safety management and control of the entire emergency power supply system, ensuring that important loads do not lose power, and preventing fire incidents caused by battery overheating.

References
1. Xu, W.: Intelligent operation and maintenance system for urban rail transit power supply equipment. Urban Mass Transit 24(09), 212–215 (2021)
2. Li, J.: Intelligent management and control system for operation and maintenance of urban rail transit power supply equipment. Urban Rapid Rail Transit 34(01), 149–154 (2021)
3. Yu, S., Chang, H., Wang, H.: Design of cloud computing and microservice-based urban rail transit integrated supervisory control system plus. Urban Rail Transit 6(4), 187–204 (2020)
4. Zhu, L., Yu, F.R., Wang, Y., Ning, B., Tang, T.: Big data analytics in intelligent transportation systems: a survey. IEEE Trans. Intell. Transp. Syst. 20(1), 383–398 (2018)
5. Xiao, B., Wang, W.: Intelligent network operation and maintenance system based on big data. J. Phys. Conf. Ser. 1744(3), 032033 (2021)
6. Wu, C., Zhang, W., Lu, S., Tan, Z., Xue, F., Yang, J.: Train speed trajectory optimization with on-board energy storage device. IEEE Trans. Intell. Transp. Syst. 20(11), 4092–4102 (2018)
7. Tryapkin, E.Y., Keino, M.Y., Protasov, F.A.: Synchronous phase measurements in the automated monitoring system of railway power supply facilities. Russ. Electr. Eng. 87, 110–1112 (2016)
8. Yu, J., Wang, J., Tong, F.: Research and analysis of power supply load forecasting and self-healing control in urban rail transit system. IOP Conf. Ser. Earth Environ. Sci. 769(4), 042093 (2021)
9. Bai, L., Wang, W., Zong, H., Yan, K.: Research on intelligent operation and maintenance technology of urban rail transit based on cloud computing platform. In: International Conference on Intelligent Traffic Systems and Smart City, vol. 12165, pp. 390–396 (2022)
10. Chen, C., et al.: Research and application of intelligent power supply of Huadu to Dongguan highway. IOP Conf. Ser. Earth Environ. Sci. 310(3), 032006 (2019)

Reliability-Based Dynamic Positioning Control for Turret-Moored Vessels with Prescribed Performance

Yulong Tuo1,2(B), Guilin Feng3, Xiao Liang4, Shasha Wang1,2, and Chen Guo1

1 College of Marine Electrical Engineering, Dalian Maritime University, Dalian 116026, China
  {tuoyulong,wangshashadmu}@dlmu.edu.cn
2 Dalian Key Laboratory of Swarm Control and Electrical Technology for Intelligent Ships, Dalian 116026, China
3 Qingdao Special Equipment Inspection Research Institute, Qingdao 266100, China
4 Tianjin Navigation Instruments Research Institute, Tianjin 300130, China

Abstract. To account for the safety of the mooring lines and the transient performance of the system, a reliability-based dynamic surface controller with prescribed performance is proposed for the dynamic positioning (DP) system of turret-moored vessels. The vessel model and the reliability of the mooring lines are described first. Then, a dynamic surface controller is designed on the basis of the mooring line reliability and a prescribed performance function. Performance specifications are imposed in advance on the reliability and heading tracking errors according to the actual demands. By adjusting the reliability, the capability of the mooring system can be fully used within the safe range of the mooring lines, so less energy is needed for the DP system. Finally, numerical simulations illustrate the performance of the presented DP controller.

Keywords: dynamic positioning control · turret-moored vessels · prescribed performance · reliability of mooring lines

1 Introduction

Recently, turret-moored vessels have become a major platform in the field of marine resource exploitation [1]. Once a moored vessel is built, the structure of its mooring system is fixed; therefore, the research focus of this article is on dynamic positioning (DP) control. For the DP control of turret-moored vessels, some relevant results have appeared. In [2,3], backstepping was used to design the DP controllers. However, these designs are quite conservative in their use of the mooring system for positioning. The structural reliability of the mooring lines was then introduced into the DP controller to make full use of the mooring system [4]. Nevertheless, the structures of the aforementioned reliability-based DP controllers are too complex for further study. In this context, simpler reliability-based DP control methods were proposed in [5–7] on the basis of the reliability derivative-based matrix. Unfortunately, these controllers mainly address the steady-state performance of the system. In fact, the transient performance is also noteworthy for improving the DP control system [8].

Inspired by the above discussion, a reliability-based DP controller with prescribed performance is proposed, building on the previous work [5–7]. Under the proposed controller, the capability of the mooring system can be fully utilized while ensuring the safety of the mooring lines. Moreover, performance specifications are imposed in advance on the reliability and heading tracking errors to improve the transient performance of the DP system.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 109–117, 2023. https://doi.org/10.1007/978-981-99-6187-0_11

2 Mathematical Model and Definitions

2.1 Mathematical Model of Turret-Moored Vessels

On the basis of the earth-fixed and body-fixed frames, the mathematical model of turret-moored vessels is established as below [9]:

$$\dot{\eta} = J(\psi)\upsilon, \quad (1)$$

$$M\dot{\upsilon} + C(\upsilon)\upsilon + D(\upsilon)\upsilon = \tau + \tau_m + \tau_{env}, \quad (2)$$

where $\eta = [x, y, \psi]^T$ consists of the north position, east position and heading; $\upsilon = [u, v, r]^T$ is made up of the surge velocity, sway velocity and yaw rate; $M$ is the inertia matrix; $D(\upsilon)$ and $C(\upsilon)$ are the damping coefficient matrix and the Coriolis and centripetal matrix, respectively; $\tau_{env}$ represents the environmental disturbance forces and moment; $\tau$ is the control input of the vessel; $\tau_m$ represents the mooring line forces exerted on the moored vessel; and $J(\psi)$ is a rotation matrix given in [1].

In practice, $\tau_{env}$, $C(\upsilon)$ and $D(\upsilon)$ are difficult to obtain. However, effective methods such as the extended state observer, fuzzy systems and neural networks can estimate them with high precision. Therefore, these unknown terms are regarded as known in this paper.

2.2 Reliability Index of Mooring Lines

To quantify the safety degree of the mooring system, the reliability index of mooring lines is defined as

$$\delta_i(t) = \left(T_{b,i} - k\sigma_i - T_i(X_i(t))\right)/\sigma_{b,i}, \quad i = 1, \dots, q, \quad (3)$$

where $X_i(t) = \sqrt{(x - N_i)^2 + (y - E_i)^2}$, $(x, y)$ denotes the north and east positions of the vessel and $(N_i, E_i)$ is the anchor position of mooring line $i$; $q$ is the number of mooring lines; $k$ is a scaling coefficient; $T_i(X_i(t))$ and $T_{b,i}$ are the tension and mean breaking strength of mooring line $i$; and $\sigma_{b,i}$ and $\sigma_i$ are the standard deviations of $T_{b,i}$ and $T_i(X_i(t))$. It can be seen from (3) that $\delta_i$ decreases as $T_i(X_i(t))$ increases. Therefore, considering the smallest reliability is sufficient to ensure the safety of the mooring system. The subscript $j$ denotes the mooring line with the smallest reliability in the subsequent design.
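As a numerical illustration of the reliability index in (3): the linear tension model and all constants below are invented for illustration, not taken from the paper.

```python
import math

# Numerical sketch of the reliability index delta_i in (3); the tension
# curve T_i(X) and every constant here are invented for illustration.
def reliability_index(x, y, anchor, tension_fn, T_b, sigma_b, sigma, k=2.0):
    N, E = anchor
    X = math.hypot(x - N, y - E)  # horizontal distance to the anchor
    return (T_b - k * sigma - tension_fn(X)) / sigma_b

# Toy tension model: line tension grows with distance from the anchor.
tension = lambda X: 50.0 + 2.0 * X

delta_near = reliability_index(0.0, 0.0, (30.0, 40.0), tension, 500.0, 20.0, 10.0)
delta_far = reliability_index(0.0, 0.0, (60.0, 80.0), tension, 500.0, 20.0, 10.0)
```

As the vessel drifts and the line tension rises, the index falls, which is why monitoring only the line with the smallest reliability suffices.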

3 Controller Design and Stability Analysis

The control goal is to make δj and ψ converge to the desired δd and ψd. It should be noted that the desired reliability should be set larger than the critical reliability for the safety of mooring lines, that is, δd > δs. The controller is designed as follows.

Define the first dynamic surface s1:

$$s_1 = [\delta_j, \psi]^T - [\delta_d, \psi_d]^T \quad (4)$$

To improve the transient performance of [δj, ψ]^T, the following time-varying bounds are imposed on s1:

$$-\rho_i(t) < s_{1i} < \bar{\rho}_i(t), \quad i = 1, 2 \quad (5)$$

where the aforementioned bounds have the following forms:

$$\rho_i(t) = (\rho_{0i} - \rho_{\infty i})e^{-k_i t} + \rho_{\infty i}, \quad \bar{\rho}_i(t) = (\bar{\rho}_{0i} - \bar{\rho}_{\infty i})e^{-k_i t} + \bar{\rho}_{\infty i}, \quad i = 1, 2 \quad (6)$$

where ρi(t) and ρ̄i(t) are monotonically decreasing positive performance functions; ρ0i, ρ∞i, ρ̄0i, ρ̄∞i and ki are design parameters, which ensure the transient and steady-state performance of s1. In order to incorporate (5) into the subsequent controller design, the following error transformation is performed:

$$s_{1i} = \bar{\rho}_i(t)T(o_{1i}), \quad T(o_{1i}) = \frac{e^{o_{1i}} - e^{-o_{1i}}}{e^{o_{1i}} + \vartheta_i^{-1}(t)e^{-o_{1i}}}, \quad \vartheta_i(t) = \rho_i(t)/\bar{\rho}_i(t), \quad i = 1, 2 \quad (7)$$

According to (7), we can get

$$o_{1i} = \frac{1}{2}\ln\left(1 + \frac{s_{1i}}{\rho_i(t)}\right) - \frac{1}{2}\ln\left(1 - \frac{s_{1i}}{\bar{\rho}_i(t)}\right), \quad i = 1, 2 \quad (8)$$

To stabilize o1 = [o11, o12]^T, design the following virtual control law:

$$\phi_1 = -K_1 o_1 \quad (9)$$

where K1 ∈ R^{2×2} is a positive definite diagonal matrix. Then, a first-order low-pass filter is provided for φ1 as

$$T_d\dot{X}_d + X_d = \phi_1 \quad (10)$$

where Xd is the filter output, Td ∈ R^{2×2} is the time constant matrix, and Xd(0) = φ1(0).
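A small sketch (not from the paper) of the prescribed-performance machinery above: the performance bound (6) and the error transformation (8), which maps a constrained error into an unconstrained one; the parameter values are hypothetical:

```python
import math

def perf_bound(t, rho0, rho_inf, k):
    """Monotonically decreasing performance function of Eq. (6)."""
    return (rho0 - rho_inf) * math.exp(-k * t) + rho_inf

def transformed_error(s, rho_lo, rho_hi):
    """Error transformation of Eq. (8): maps s in (-rho_lo, rho_hi) to o in R."""
    return 0.5 * math.log(1 + s / rho_lo) - 0.5 * math.log(1 - s / rho_hi)
```

As s approaches either bound, the transformed error grows without limit, which is what forces the controller to keep s inside the prescribed funnel.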


Y. Tuo et al.

To establish the connection between the mooring line reliability and the vessel motion, take the derivative of the defined reliability as below:

$$\dot{\delta}_j(t) = -\frac{\dot{T}_j(X_j(t))}{\sigma_{b,j}} \quad (11)$$

Further solving Ṫj(Xj(t)), we obtain

$$\dot{T}_j(X_j(t)) = T_j'\dot{X}_j = \frac{T_j'}{X_j}\left[(x - N_j)\dot{x} + (y - E_j)\dot{y}\right] \quad (12)$$

where $T_j' = \partial[T_j(X_j(t))]/\partial X_j(t)$, and ε1 ≤ T'j ≤ ε2 with positive constants ε1 and ε2. In addition, it can be easily obtained from (1) that

$$\begin{bmatrix}\dot{x}\\ \dot{y}\end{bmatrix} = \begin{bmatrix}\cos\psi & -\sin\psi\\ \sin\psi & \cos\psi\end{bmatrix}\begin{bmatrix}u\\ v\end{bmatrix} \quad (13)$$

According to (12) and (13), (11) can be rewritten as

$$\dot{\delta}_j(t) = -\frac{T_j'}{\sigma_{b,j}X_j}\begin{bmatrix}x - N_j & y - E_j\end{bmatrix}\begin{bmatrix}\cos\psi & -\sin\psi\\ \sin\psi & \cos\psi\end{bmatrix}\begin{bmatrix}u\\ v\end{bmatrix} \quad (14)$$

It is seen from (14) that the reliability of the mooring lines is related to both (x, y) and (u, v). The positions and velocities are the basis of the DP controller design. Therefore, the area keeping of turret-moored vessels can be achieved by controlling the reliability of the mooring lines. In order to facilitate the subsequent reliability-based DP controller design, the following reliability derivative-based matrix is designed according to (14):

$$Q = \begin{bmatrix} \dfrac{-T_j'\left[\cos\psi(x - N_j) + \sin\psi(y - E_j)\right]}{\sigma_{b,j}X_j} & \dfrac{T_j'\left[\sin\psi(x - N_j) - \cos\psi(y - E_j)\right]}{\sigma_{b,j}X_j} & 0\\ 0 & 0 & 1 \end{bmatrix} \quad (15)$$

Then, define the second dynamic surface s2:

$$s_2 = [\dot{\delta}_j, \dot{\psi}]^T - X_d = [\dot{\delta}_j, r]^T - X_d = Q\upsilon - X_d \quad (16)$$
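As a numerical sketch (not the authors' code), the matrix Q of (15) and the second surface s2 of (16) can be evaluated for the critical line j; the tension gradient `dT_dX` (i.e. T'j) and all values are hypothetical:

```python
import math

def q_matrix(x, y, psi, anchor, dT_dX, sigma_b):
    """Reliability-derivative matrix Q of Eq. (15) for the critical line j.

    dT_dX plays the role of T'_j, the tension gradient w.r.t. distance.
    """
    N, E = anchor
    X = math.hypot(x - N, y - E)
    c = dT_dX / (sigma_b * X)
    return [[-c * (math.cos(psi) * (x - N) + math.sin(psi) * (y - E)),
              c * (math.sin(psi) * (x - N) - math.cos(psi) * (y - E)),
              0.0],
            [0.0, 0.0, 1.0]]

def surface_s2(Q, nu, Xd):
    """s2 = Q*nu - Xd from Eq. (16); nu = [u, v, r]."""
    return [sum(qi * ni for qi, ni in zip(row, nu)) - xdi
            for row, xdi in zip(Q, Xd)]
```

With the vessel between the origin and an anchor ahead of it, surging toward the anchor slackens the line, so the first component of Qυ (the reliability rate δ̇j) comes out positive, as expected from (14).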

Furthermore, it can be obtained that

$$\dot{s}_2 = \dot{Q}\upsilon + QM^{-1}\left(-C(\upsilon)\upsilon - D(\upsilon)\upsilon + \tau + \tau_m + \tau_{env}\right) - \dot{X}_d \quad (17)$$

According to (4) and (16), the derivative of s1 satisfies

$$\dot{s}_1 = s_2 + X_d \quad (18)$$

In addition, it can be obtained from (8) that

$$\dot{o}_1 = \xi\dot{s}_1 + \beta \quad (19)$$


where $\xi = \mathrm{diag}(\xi_1, \xi_2)$ and $\beta = [\beta_1, \beta_2]^T$, and ξi and βi have the following forms:

$$\xi_i = \frac{1}{2}\left[\frac{1}{\rho_i(t) + s_{1i}} + \frac{1}{\bar{\rho}_i(t) - s_{1i}}\right], \quad \beta_i = \frac{1}{2}\left[\frac{\dot{\rho}_i(t)}{\rho_i(t) + s_{1i}} - \frac{\dot{\bar{\rho}}_i(t)}{\bar{\rho}_i(t) - s_{1i}} - \frac{\dot{\vartheta}_i(t)}{\vartheta_i(t)}\right] \quad (20)$$

Therefore,

$$\dot{o}_1 = \xi(s_2 + X_d) + \beta \quad (21)$$

Based on the above analysis, the following DP controller is designed:

$$\tau = MQ^{-1}\left(-K_2 s_2 + \dot{X}_d - \dot{Q}\upsilon\right) + C(\upsilon)\upsilon + D(\upsilon)\upsilon - \tau_m - \tau_{env} \quad (22)$$

where K2 ∈ R^{2×2} is a positive definite diagonal matrix. To analyze the stability of the DP control system under (22), choose the following Lyapunov function:

$$V = \frac{1}{2}o_1^T o_1 + \frac{1}{2}s_2^T s_2 + \frac{1}{2}Y_2^T Y_2 \quad (23)$$

where Y2 = Xd − φ1. Taking the derivative of V, we obtain

$$\dot{V} = o_1^T\dot{o}_1 + s_2^T\dot{s}_2 + Y_2^T\dot{Y}_2 \quad (24)$$

According to (9), (21), Y2 = Xd − φ1 and the Young inequality, we get

$$o_1^T\dot{o}_1 \le -o_1^T\xi K_1 o_1 + o_1^T\xi\xi o_1 + \frac{1}{2}o_1^T o_1 + \frac{1}{2}s_2^T s_2 + \frac{1}{2}Y_2^T Y_2 + \frac{1}{2}\beta^T\beta \quad (25)$$

Similarly, it can be obtained from (9), (10), (18) and Y2 = Xd − φ1 that

$$\dot{Y}_2 = \dot{X}_d - \dot{\phi}_1 = -T_d^{-1}Y_2 + K_1\dot{o}_1 = -T_d^{-1}Y_2 + K_1(\xi s_2 + \xi Y_2 - \xi K_1 o_1 + \beta) = -T_d^{-1}Y_2 + B(o_1, s_2, Y_2) \quad (26)$$

where B(o1, s2, Y2) is a continuous vector function. According to the Young inequality, we obtain

$$Y_2^T\dot{Y}_2 \le -Y_2^T Y_2\|T_d^{-1}\| + \|Y_2\|\,\|B(o_1, s_2, Y_2)\| \le -Y_2^T Y_2\|T_d^{-1}\| + \frac{1}{2}\|Y_2\|^2\|B(o_1, s_2, Y_2)\|^2 + \frac{1}{2} \quad (27)$$

According to (17), we can get

$$s_2^T\dot{s}_2 = s_2^T\left[\dot{Q}\upsilon + QM^{-1}\left(-C(\upsilon)\upsilon - D(\upsilon)\upsilon + \tau + \tau_m + \tau_{env}\right) - \dot{X}_d\right] \quad (28)$$


Substituting (22) into (28) yields

$$s_2^T\dot{s}_2 = -s_2^T K_2 s_2 \quad (29)$$

The following inequality can be obtained by substituting (25), (27) and (29) into (24):

$$\dot{V} \le -o_1^T\xi K_1 o_1 + o_1^T\xi\xi o_1 + \frac{1}{2}o_1^T o_1 + \frac{1}{2}s_2^T s_2 + \frac{1}{2}Y_2^T Y_2 + \frac{1}{2}\beta^T\beta - s_2^T K_2 s_2 - Y_2^T Y_2\|T_d^{-1}\| + \frac{1}{2}\|Y_2\|^2\|B(o_1, s_2, Y_2)\|^2 + \frac{1}{2} \quad (30)$$

Furthermore, it can be obtained that

$$\dot{V} \le -\left(\lambda_{\min}(\xi K_1 - \xi\xi) - \frac{1}{2}\right)o_1^T o_1 - \left(\lambda_{\min}(K_2) - \frac{1}{2}\right)s_2^T s_2 + \frac{1}{2}Y_2^T Y_2 - Y_2^T Y_2\|T_d^{-1}\| + \frac{1}{2}\|Y_2\|^2\|B(o_1, s_2, Y_2)\|^2 + \frac{1}{2}\beta^T\beta + \frac{1}{2} \quad (31)$$

Assume that there exists a compact set Π = {(o1, s2, Y2) : V ≤ B0, ∀B0 > 0} ⊂ R^6, and that ‖B(o1, s2, Y2)‖ ≤ BM within Π. By selecting $\|T_d^{-1}\| = 1/2 + B_M^2/2 + \mu^*$ (μ* > 0), (31) can be transformed as below:

$$\begin{aligned}\dot{V} &\le -\left(\lambda_{\min}(\xi K_1 - \xi\xi) - \frac{1}{2}\right)o_1^T o_1 - \left(\lambda_{\min}(K_2) - \frac{1}{2}\right)s_2^T s_2 - \mu^* Y_2^T Y_2 - \frac{B_M^2}{2}\|Y_2\|^2 + \frac{\|B(o_1, s_2, Y_2)\|^2}{2}\|Y_2\|^2 + \frac{1}{2}\beta^T\beta + \frac{1}{2}\\ &\le -2\mu V + \frac{1}{2}\beta^T\beta + \frac{1}{2} - \frac{B_M^2\|Y_2\|^2}{2}\left(1 - \frac{\|B(o_1, s_2, Y_2)\|^2}{B_M^2}\right)\\ &= -2\mu V + C - \frac{B_M^2\|Y_2\|^2}{2}\left(1 - \frac{\|B(o_1, s_2, Y_2)\|^2}{B_M^2}\right)\end{aligned} \quad (32)$$

where μ has the following form:

$$\mu = \min\left\{\lambda_{\min}(\xi K_1 - \xi\xi) - \frac{1}{2},\ \lambda_{\min}(K_2) - \frac{1}{2},\ \mu^*\right\} > 0 \quad (33)$$

and satisfies the following inequality:

$$\mu > \frac{C}{2B_0} \quad (34)$$

V̇ ≤ −2μB0 + C can be obtained from (32) when V = B0. Then, according to (34), we can get V̇ < 0. Therefore, Π = {(o1, s2, Y2) : V ≤ B0, ∀B0 > 0} ⊂ R^6 is an invariant set, which means V(t) ≤ B0, ∀t, can be satisfied if V(0) ≤ B0. Therefore, ‖B(o1, s2, Y2)‖ ≤ BM holds for all V(0) ≤ B0, and the


following inequality can be derived from (32):

$$\dot{V} \le -2\mu V + C \quad (35)$$

Solving (35) yields

$$0 \le V(t) \le \frac{C}{2\mu} + \left(V(0) - \frac{C}{2\mu}\right)\exp(-2\mu t) \quad (36)$$

According to (36), there exists To1 > 0 such that ‖o1‖ ≤ ζo1 for all t ≥ To1, all V(0) ≤ B0 and arbitrary $\zeta_{o1} > \sqrt{C/\mu}$. Thus o1 can be guaranteed to converge to $\Omega_{o1} = \{o_1 \in R^2 : \|o_1\| \le \zeta_{o1}\}$, $\zeta_{o1} > \sqrt{C/\mu}$, by choosing suitable parameters K1, K2, Td and μ according to (33) and (34). Since o1 is the equivalent transformation of s1, s1 can also converge to an arbitrarily small set, that is, [δj, ψ]^T → [δd, ψd]^T. Hence, the stability of the DP control system of the turret-moored vessel has been proven.

4 Simulations and Analysis

To verify the performance of the proposed control method, simulations are conducted on a turret-moored vessel, whose main parameters can be seen in [5]. The disturbances are described by using the first-order Markov process [5]:

$$\tau_{env} = J^T(\psi)b, \quad \dot{b} = -T^{-1}b + \Psi\bar{w} \quad (37)$$

where the details of b ∈ R³, T ∈ R^{3×3}, w̄ ∈ R³ and Ψ ∈ R^{3×3} can be seen in [5]. They are set as b(0) = [2 × 10⁴ N, 10³ N, 2 × 10⁴ N·m]^T, T = diag(10⁴, 10⁴, 10⁴) and Ψ = diag(10³, 10³, 10³). The simulation time is set as 2000 s, and the critical reliability of the mooring system is set as 4.4. The initial position and velocity vectors are both set as zero vectors, and the desired reliability and heading are chosen as 5 and 10°. The allowable region radius of the turret-moored vessel is set as 70.94 m under the aforementioned marine disturbances. The control parameters are set as Td = diag(1, 1), K1 = diag(5, 0.1), K2 = diag(3, 0.5), ρ01 = ρ̄01 = 7, ρ∞1 = ρ̄∞1 = 0.1, k1 = 0.009, ρ02 = ρ̄02 = 0.2, ρ∞2 = ρ̄∞2 = 0.001, k2 = 0.002. The simulation results are given in Fig. 1 a-d.

From Fig. 1-a, it is seen that the turret-moored vessel can be kept within the allowed area under the proposed prescribed performance controller. Moreover, Fig. 1-b and Fig. 1-c show that the proposed controller can not only make δj and ψ converge to the desired values, but also ensure the safety of the mooring lines. Maintaining the reliability at a small enough value while ensuring the safety of the mooring lines makes full use of the positioning ability of the mooring system. Since the mooring system does not consume any energy, our proposed controller


can reduce the energy consumption of the DP system. Figure 1-d gives the reliability and heading tracking errors of the DP controllers with and without the prescribed performance. Under our proposed DP controller with the prescribed performance, the reliability and heading errors are always kept within the prescribed bounds. However, the overshoots may be too large in the initial stage for the controller without the prescribed performance. Therefore, our proposed control method can improve the transient performance of the DP system to a certain extent.
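The first-order Markov disturbance model (37) driving these simulations can be reproduced with a simple Euler scheme; this is an illustrative sketch with hypothetical values, not the paper's simulation code:

```python
import random

def simulate_markov_disturbance(b0, T_diag, Psi_diag, dt=0.1, steps=20000, seed=1):
    """Euler simulation of the first-order Markov process of Eq. (37):
    b_dot = -T^{-1} b + Psi * w_bar, with w_bar zero-mean Gaussian noise.
    T_diag and Psi_diag hold the diagonals of T and Psi.
    """
    rng = random.Random(seed)
    b = list(b0)
    history = [tuple(b)]
    for _ in range(steps):
        w = [rng.gauss(0.0, 1.0) for _ in b]
        b = [bi + dt * (-bi / Ti + Pi * wi)
             for bi, Ti, Pi, wi in zip(b, T_diag, Psi_diag, w)]
        history.append(tuple(b))
    return history

# tau_env at each step would then be J(psi)^T b.
```

With the noise gain set to zero the process reduces to a pure exponential decay toward zero, which is a convenient sanity check on the integration step.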

Fig. 1. Simulation results. a. Trajectory of the turret-moored vessel. b. Reliability and heading of the turret-moored vessel. c. Tensions of mooring lines. d. Reliability and heading tracking errors under the DP controllers with and without prescribed performance.

5 Conclusions

In this paper, a reliability-based dynamic surface controller with prescribed performance is proposed for the DP system of turret-moored vessels. Based on the reliability of mooring lines, the positioning ability of the mooring system can be fully used within the safe range of the mooring lines, which leads to less energy consumption for the DP system. Besides, performance specifications are imposed in advance on the reliability and heading tracking errors, which can improve the transient performance of the DP system to a certain extent. Finally, simulations are conducted to verify the better energy-saving efficiency and transient performance of the proposed reliability-based DP controller.


Acknowledgements. This work was supported in part by the National Natural Science Foundation of China (52101298, 52201409, 62273068, 51879027, 51579024), and in part by the Fundamental Research Funds for the Central Universities (3132023102), and in part by the Natural Science Foundation of Liaoning Province (2023-MS-120).

References

1. Tuo, Y., Wang, S., Peng, Z., Guo, C.: Reliability-based fixed-time nonsingular terminal sliding mode control for dynamic positioning of turret-moored vessels with uncertainties and unknown disturbances. Ocean Eng. 248, 110748 (2022)
2. Strand, J.P.: Nonlinear Position Control Systems Design for Marine Vessels. Dissertation of Norwegian University of Science and Technology (1999)
3. Strand, J.P., Ezal, K.O., Fossen, T.I., Kokotovic, P.V.: Nonlinear control of ships: a locally optimal design. IFAC Proc. Vol. 31, 705–710 (1998)
4. Berntsen, P.I., Leira, B.J., Aamo, O.M., Sørensen, A.J.: Structural reliability criteria for control of large-scale interconnected marine structures. In: Proceedings of 23rd International Conference on Offshore Mechanics and Arctic Engineering, Vancouver, British Columbia, Canada, pp. 297–360 (2004)
5. Tuo, Y., Wang, S., Guo, C., Gao, S.: Robust output feedback control for dynamic positioning of turret-moored vessels based on bio-inspired state observer and online constructive fuzzy system. Int. J. Naval Archit. Ocean Eng. 14, 100440 (2022)
6. Tuo, Y., Wang, S., Guo, C., Yu, H., Shen, Z.: Reliability-based event driven backstepping positioning control for a turret-moored FPSO vessel with unknown slow time-varying disturbances. Int. J. Control Autom. Syst. 20, 472–482 (2022)
7. Tuo, Y., Wang, S., Guo, C.: Finite-time extended state observer-based area keeping and heading control for turret-moored vessels with uncertainties and unavailable velocities. Int. J. Naval Archit. Ocean Eng. 14, 100422 (2022)
8. Li, M., Xie, W., Wang, Y., Hu, X.: Prescribed performance trajectory tracking fault-tolerant control for dynamic positioning vessels under velocity constraints. Appl. Math. Comput. 431, 127348 (2022)
9. Peng, Z., Wang, J., Wang, D.: Distributed containment maneuvering of multiple marine vessels via neurodynamics-based output feedback. IEEE Trans. Ind. Electron. 64, 3831–3839 (2017)

A Pose Control Algorithm for Simulating Robotic Fish

Gang Wang 1,2, Simin Ding 3, and Qiang Zhao 2(B)

1 Jilin Communications Polytechnic, Changchun, JL, China
2 Baicheng Normal University, Baicheng, JL, China
[email protected]
3 Jilin Institute of Chemical Technology, Jilin, JL, China

Abstract. Aiming at the problems of low control accuracy and slow response speed of the position and orientation (pose) control of the simulated robotic fish, this paper proposes a pose control algorithm for the simulated robotic fish. Firstly, this paper introduces the simplified dynamics and kinematics models; secondly, a coordinate system is established based on the expected pose, and the pose error model of the robotic fish is constructed; thirdly, the angular velocity controller, linear velocity controller and fuzzy controller are designed. Finally, Microsoft Visual Studio 2010 is used to write the corresponding strategy, which is applied to the URWPGSim2D software simulation platform for simulation experiments. The experimental results show that, compared to the time-varying feedback control algorithm and the cascade PID algorithm, this algorithm improves control accuracy while accelerating response speed. Keywords: Pose Control · Fuzzy Adaptive Cascade PID Algorithm · URWPGSim2D Software Simulation Platform

1 Introduction

The aim of pose control of the simulated robot fish is to ensure that the simulated robot fish can overcome various interferences and reach the designated target point in the desired direction during swimming. The essence of the swimming of the simulated robot fish is the movement transformation between poses. The quality of the pose control directly affects how well the simulated robot fish completes swimming, ball-snatching and other specific competition tasks [1, 2]. Yang Yun used specific swimming angles and swimming times in place of the pose functions of the simulated robotic fish, which improved the swimming speed of the simulated robotic fish and the straightness of the swimming route, achieving linear formation swimming of multiple simulated robotic fish [3]. Li Shuqin designed dynamic obstacles to address the issue that the URWPGSim2D simulation platform contains only static obstacles and lacks variability, which meets the practical application requirements of real-time obstacle avoidance in simulated robotic fish pose control [4]. Jing Qi discussed and analyzed the "no ball stage" and "with ball stage" of the collaborative through-hole project separately, avoiding the phenomenon of "position drift", shortening the project duration, and improving fault tolerance and stability [5].

This article focuses on the problems of low accuracy and slow response speed in the pose control of simulated robotic fish under interference. Using the underwater robot water polo competition simulator URWPGSim2D developed by Peking University as the simulation experimental platform, the fuzzy adaptive cascade PID algorithm is applied to the pose control problem of the simulated robotic fish, achieving more accurate pose control.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 118–127, 2023. https://doi.org/10.1007/978-981-99-6187-0_12

2 The Pose Control Model for Simulating Robot Fish

2.1 Simplified Dynamic Model

The control commands used by the simulated robotic fish in URWPGSim2D are the turning gear and the speed gear, which serve as the inputs of the simplified dynamic model. The essence of simplifying the dynamic model is to find the corresponding relationship between the input commands (speed gears and turning gears) and the output experimental data (linear velocity and angular velocity values) of the simulated robotic fish in an ideal state without collision [6].

2.2 Simplified Kinematic Model

A simple simulation diagram of the motion of the robotic fish is shown in Fig. 1. The output corresponds to a triplet composed of the center coordinates and direction angle of the simulated robotic fish, called the pose (x, y, θ). The input is the linear velocity v and angular velocity w of the simulated robotic fish. It is easy to obtain the motion constraint equation of the simulated robot fish:

$$\dot{x}(t)\sin\theta(t) - \dot{y}(t)\cos\theta(t) = 0 \quad (1)$$

wherein (x(t), y(t)) is the center coordinate of the simulated robot fish, θ(t) is the direction of motion, ω(t) is the angular velocity, and v(t) is the linear velocity. The kinematics equation of the simulated robot fish can be described as:

$$\begin{bmatrix}\dot{x}(t)\\ \dot{y}(t)\\ \dot{\theta}(t)\end{bmatrix} = \begin{bmatrix}\cos\theta(t) & 0\\ \sin\theta(t) & 0\\ 0 & 1\end{bmatrix}\begin{bmatrix}v(t)\\ \omega(t)\end{bmatrix} \quad (2)$$

where the linear velocity satisfies v(t) ≤ vmax and the angular velocity satisfies ω(t) ≤ ωmax. As the simulated robotic fish in URWPGSim2D produces the visual "swimming" effect through continuous redrawing over multiple discrete cycles, it is necessary to discretize the continuous system equation described above; the discrete system equations are [7]:

$$\begin{cases}x(k+1) = x(k) + Tv(k)\cos\theta(k)\\ y(k+1) = y(k) + Tv(k)\sin\theta(k)\\ \theta(k+1) = \theta(k) + Tw(k)\end{cases} \quad (3)$$

where T represents the sampling period and k represents the sampling time.
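The discrete kinematics (3) translate directly into code; the following is an illustrative sketch (not the platform's implementation):

```python
import math

def step_pose(x, y, theta, v, w, T):
    """One step of the discrete kinematics of Eq. (3):
    unicycle update with sampling period T."""
    return (x + T * v * math.cos(theta),
            y + T * v * math.sin(theta),
            theta + T * w)
```

Iterating `step_pose` over successive sampling instants reproduces the pose trajectory that the simulation platform redraws each cycle.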


G. Wang et al.

Fig. 1. Simple simulation of robot fish movement

3 The Mathematical Model for Simulating Robot Fish

This article uses the actual pose and expected pose information of the robotic fish as input control signals to achieve pose control. The coordinate system of the robotic fish pose control model is shown in Fig. 2, and the pose deviation Pe = [xe ye θe]^T between the actual pose [xc yc θc]^T and the expected pose [xT yT θT]^T is obtained as

$$P_e = \begin{bmatrix}x_e\\ y_e\\ \theta_e\end{bmatrix} = \begin{bmatrix}\cos\theta_T & \sin\theta_T & 0\\ -\sin\theta_T & \cos\theta_T & 0\\ 0 & 0 & 1\end{bmatrix}\begin{bmatrix}x_c - x_T\\ y_c - y_T\\ \theta_c - \theta_T\end{bmatrix} \quad (4)$$

The goal of the pose control algorithm is to measure, in real time, the direction and distance of the expected pose relative to the robotic fish, and the control goal is to drive both toward zero. An effective control law is established to ensure that the pose of the robotic fish reaches the predetermined pose within a limited time. When $\sqrt{(x_c - x_T)^2 + (y_c - y_T)^2 + (\theta_c - \theta_T)^2}$ reduces to within the allowable error range of the robot fish pose, the angular velocity and linear velocity of the robot fish are set to 0, and the target pose is theoretically reached.
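The pose-error transformation (4) rotates the world-frame deviation into the frame attached to the expected pose; a minimal sketch (not the authors' code):

```python
import math

def pose_error(actual, expected):
    """Pose deviation Pe of Eq. (4): rotate the world-frame difference
    into the frame of the expected pose (x_T, y_T, theta_T)."""
    xc, yc, tc = actual
    xT, yT, tT = expected
    dx, dy, dt = xc - xT, yc - yT, tc - tT
    xe =  math.cos(tT) * dx + math.sin(tT) * dy
    ye = -math.sin(tT) * dx + math.cos(tT) * dy
    return (xe, ye, dt)
```

For example, a fish sitting 1 unit east of an expected pose that faces north shows up as a purely lateral error in the expected-pose frame.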

4 The Pose Control Algorithm for Simulating Robot Fish

This article adopts a fuzzy adaptive cascade PID control system for pose control. The fuzzy adaptive cascade PID controller includes two parts: a fuzzy logic controller and a cascade PID controller [8–11]. The schematic diagram of the simulated robotic fish pose control based on fuzzy adaptive cascade PID is shown in Fig. 3.

4.1 Design of Angular Velocity Controller

The outer-loop angle PID controller and the inner-loop angular velocity PID controller of the angular velocity controller both adopt the incremental PID algorithm. The input of the


Fig. 2. Coordinate system of robotic fish position and pose control model

Fig. 3. The schematic diagram of position and attitude control of simulated robot fish based on fuzzy adaptive cascade PID

outer-loop angle PID controller of the angular velocity controller is the sum of the angle error between the actual and expected direction angles and the line-of-sight angle. The output of the outer-loop angle PID controller, the expected angular velocity, is

$$w_r(k) = w_r(k-1) + k_{po1}\left(e_\theta(k) - e_\theta(k-1)\right) + k_{io1}e_\theta(k) + k_{do1}\left(e_\theta(k) - 2e_\theta(k-1) + e_\theta(k-2)\right) \quad (5)$$

where eθ(k) = θr − θc(k) + θs(k); θr − θc(k) represents the difference between the expected and actual direction angles of the simulated robotic fish; θs(k) = arctan[(yr − yc(k))/(xr − xc(k))] represents the line-of-sight angle of the simulated robotic fish; kpo1, kio1 and kdo1 are respectively the proportional, integral and derivative coefficients of the outer-loop angle PID controller of the angular velocity controller; and k is the sampling time. The output of the outer-loop angle PID controller is the input of the inner-loop angular velocity PID controller, and the output of the inner-loop angular velocity PID


controller, the angular velocity instruction, is

$$u_w(k) = u_w(k-1) + k_{pi1}\left(e_w(k) - e_w(k-1)\right) + k_{ii1}e_w(k) + k_{di1}\left(e_w(k) - 2e_w(k-1) + e_w(k-2)\right) \quad (6)$$

where ew(k) = wr(k) − wc(k); wc is the angular velocity value obtained from the motion parameter information of the simulated robotic fish; kpi1, kii1 and kdi1 are respectively the proportional, integral and derivative coefficients of the inner-loop angular velocity PID controller of the angular velocity controller; and k is the sampling time.

4.2 Design of Linear Speed Controller

The outer-loop distance PID controller and the inner-loop speed PID controller both adopt the incremental PID algorithm. The input of the outer-loop distance PID controller of the linear velocity controller is the distance error between the actual and expected positions, and the output of the outer-loop distance PID controller, the expected speed, is

$$v_r(k) = v_r(k-1) + k_{po2}\left(e_d(k) - e_d(k-1)\right) + k_{io2}e_d(k) + k_{do2}\left(e_d(k) - 2e_d(k-1) + e_d(k-2)\right) \quad (7)$$

where $e_d(k) = \sqrt{(x_r - x_c(k))^2 + (y_r - y_c(k))^2}$; kpo2, kio2 and kdo2 are respectively the proportional, integral and derivative coefficients of the outer-loop distance PID controller of the linear velocity controller; and k is the sampling time. The output of the outer-loop distance PID controller is used as the input of the inner-loop speed PID controller, and the output of the inner-loop speed PID controller of the linear speed controller, the linear speed instruction, is

$$u_v(k) = u_v(k-1) + k_{pi2}\left(e_v(k) - e_v(k-1)\right) + k_{ii2}e_v(k) + k_{di2}\left(e_v(k) - 2e_v(k-1) + e_v(k-2)\right) \quad (8)$$

where ev(k) = vr(k) − vc(k); vc is the linear speed value obtained from the motion parameter information of the simulated robotic fish; kpi2, kii2 and kdi2 are respectively the proportional, integral and derivative coefficients of the inner-loop speed PID controller of the linear speed controller; and k is the sampling time.

4.3 Design of Linear Fuzzy Controller

Fuzzification converts the non-fuzzy input variable error signal e(k) and error change rate signal Δe(k) from their own variable domains into the internal domain of the fuzzy system, obtaining e* and Δe*; the membership degree of each fuzzy set on the respective domain is calculated, converting them into values of the fuzzy variables. According to the fuzzy rules in the fuzzy rule base, the fuzzy inference machine obtains the corresponding fuzzy conclusions by fuzzy inference, and converts the fuzzy conclusions into precise


Table 1. Fuzzy variation information table of simulated robotic fish

| Symbol | Physical domain | Fuzzy domain       | Quantizer |
|--------|-----------------|--------------------|-----------|
| eθ     | [−π, π]         | {−6, …, 6}         | 2π/12     |
| Δeθ    | [−1, 1]         | {−3, …, 3}         | 2/6       |
| ΔKpi1  | [−1, 1]         | {−3, …, 3}         | 2/6       |
| ΔKii1  | [−0.1, 0.1]     | {−0.03, …, 0.03}   | 0.2/0.06  |
| ΔKdi1  | [−5, 5]         | {−0.6, …, 0.6}     | 10/1.2    |
| ed     | [−5400, 5400]   | {−6, …, 6}         | 10800/12  |
| Δed    | [−45, 45]       | {−3, …, 3}         | 90/6      |
| ΔKpi2  | [−0.1, 0.1]     | {−3, …, 3}         | 0.2/6     |
| ΔKii2  | [−0.1, 0.1]     | {−0.03, …, 0.03}   | 0.2/0.06  |
| ΔKdi2  | [−1, 1]         | {−3, …, 3}         | 2/6       |

outputs ΔK*pi, ΔK*ii and ΔK*di, which are then transformed from the fuzzy domain back to the physical domain as ΔKpi, ΔKii and ΔKdi. According to Formula (9), the parameters of the inner-loop PID controllers are tuned:

$$K_\zeta(k+1) = K_\zeta(k) + \Delta K_\zeta, \quad \zeta = pi, ii, di \quad (9)$$

(9)

This article divides the fuzzy domain of the input and output of the fuzzy controller into seven fuzzy subsets, with NB, NM, NS, ZO, PS, PM, PB representing seven language values of negative large, negative medium, negative small, zero, positive small, positive medium, and positive large, respectively [12]. The fuzzy change information of the simulated robotic fish is shown in Table 1.
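As a concrete illustration (not the authors' code), the incremental PID update of Eqs. (5)-(8) and the fuzzy gain adjustment of Eq. (9) can be sketched as follows; the quantizer follows Table 1, while the rule-base inference itself is assumed to supply the ΔK values:

```python
class IncrementalPID:
    """Incremental PID of Eqs. (5)-(8):
    u(k) = u(k-1) + kp*(e - e1) + ki*e + kd*(e - 2*e1 + e2)."""
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.u = 0.0
        self.e1 = self.e2 = 0.0          # e(k-1), e(k-2)

    def update(self, e):
        self.u += (self.kp * (e - self.e1) + self.ki * e
                   + self.kd * (e - 2 * self.e1 + self.e2))
        self.e2, self.e1 = self.e1, e
        return self.u

    def tune(self, d_kp, d_ki, d_kd):
        """Eq. (9): K_zeta(k+1) = K_zeta(k) + delta_K_zeta,
        with the deltas supplied by the fuzzy inference machine."""
        self.kp += d_kp
        self.ki += d_ki
        self.kd += d_kd

def quantize(value, phys_half, fuzzy_half):
    """Map a physical value into the fuzzy domain of Table 1 (and clamp)."""
    q = value * fuzzy_half / phys_half
    return max(-fuzzy_half, min(fuzzy_half, q))
```

In the cascade, the outer loop's output (expected angular velocity or speed) is fed as the setpoint of the inner loop, whose gains are retuned each cycle via `tune`.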

5 Simulation Verification

5.1 Simulation Task

In the URWPGSim2D simulation environment, the No. 6 simulated robotic fish of the non-confrontational 2D synchronized swimming competition project is used as the model to verify the pose control algorithm, and its color is set to green. The No. 1 yellow simulated robotic fish can swim randomly in the field and is not controlled by the strategy. In order to prevent the other simulated robotic fish from interfering with the movement of the No. 6 fish, the remaining red simulated robotic fish are placed in the upper left corner without being assigned any actions. The simulated robotic fish model in the URWPGSim2D simulation environment is shown in Fig. 4. The experimental task is for the simulated robotic fish to swim from the initial pose (−800 mm, 500 mm, −π/2 rad) to the target pose (600 mm, −600 mm, π/6 rad).


Fig. 4. Simulation robot fish model under URWPGSim2D simulation environment

5.2 Simulation Results and Analysis

The PoseToPose function in URWPGSim2D, commonly known as the pose control function, adopts the time-varying feedback control law [13]:

$$u_v = -k_1\sin^2 t\,\frac{(x_c - x_r)\cos\theta_r + (y_c - y_r)\sin\theta_r}{\cos(\theta_c - \theta_r)} \quad (10)$$

$$u_w = k_1\sin^2 t\left((x_c - x_r)\cos\theta_r + (y_c - y_r)\sin\theta_r\right)\cdot\left(k_2(y_c - y_r)\cos\theta_r - k_2(x_c - x_r)\sin\theta_r + k_3\tan(\theta_c - \theta_r)\right)\cdot\cos^2(\theta_c - \theta_r) \quad (11)$$

where k1, k2 and k3 are the control parameters of the time-varying feedback control law, and t is the time variable. Literature [14] proposes a cascade PID robotic fish pose control algorithm, whose control parameters are shown in Table 2. Figure 5 shows the video sequence diagram, at 6-s intervals, of the simulation program based on the fuzzy adaptive cascade PID control law in the URWPGSim2D environment. According to the video sequence diagram, it takes 24 s for the No. 6 simulated robotic fish to reach the target pose. After reaching the target pose, the simulated robotic fish comes to rest, and the position error and direction angle error remain unchanged. Figures 6 and 7 show the comparison of position and direction angle errors based on the time-varying feedback control law, the algorithm of literature [14], and the fuzzy adaptive cascade PID control law, respectively. The times taken for the simulated robotic fish to reach the target pose are 27 s, 31 s and 24 s, with errors of (29 mm, 18 mm, 0.1111 rad), (7 mm, 5 mm, −0.0358 rad) and (5 mm, 3 mm, −0.0283 rad). The simulation results show that the fuzzy adaptive cascade control algorithm takes the shortest time to reach the target pose, and its effect in reducing pose error is the most significant.
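For reference, the baseline time-varying feedback law (10)-(11) is easy to sketch in Python (an illustration, not the platform's PoseToPose source); the gain values used below are hypothetical:

```python
import math

def posetopose_control(t, actual, target, k1, k2, k3):
    """Time-varying feedback law of Eqs. (10)-(11)."""
    xc, yc, tc = actual
    xr, yr, tr = target
    along = (xc - xr) * math.cos(tr) + (yc - yr) * math.sin(tr)
    uv = -k1 * math.sin(t) ** 2 * along / math.cos(tc - tr)
    uw = (k1 * math.sin(t) ** 2 * along
          * (k2 * (yc - yr) * math.cos(tr) - k2 * (xc - xr) * math.sin(tr)
             + k3 * math.tan(tc - tr))
          * math.cos(tc - tr) ** 2)
    return uv, uw
```

At the target pose all error terms vanish, so both commands go to zero; away from it, the sin²t factor keeps excitation time-varying, which is what makes this nonholonomic stabilizer work.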

Table 2. Control parameters of simulation experiment

| Control parameters | Parameter value | Control parameters | Parameter value |
|--------------------|-----------------|--------------------|-----------------|
| kpo1               | 20.0            | kpo2               | 30.0            |
| kio1               | 0.33            | kio2               | 12.0            |
| kdo1               | 200.0           | kdo2               | 18.0            |
| kpi1               | 1.0             | kpi2               | 0.5             |
| kii1               | 0.1             | kii2               | 0.7             |
| kdi1               | 60.0            | kdi2               | 13.0            |

Fig. 5. Video sequence diagram in URWPGSim2D simulation environment

Fig. 6. Position error contrast


Fig. 7. Directional angular error contrast

6 Conclusion

By analyzing the simplified dynamics and kinematics models of the simulated robotic fish and establishing its pose control model, a fuzzy adaptive cascade PID controller is designed, and the URWPGSim2D simulation software is used to realize the pose control of the simulated robotic fish from the initial pose to the target pose. The simulation results show that the fuzzy adaptive cascade PID controller reduces errors and accelerates response speed. The pose control algorithm used in this article assumes an ideal, obstacle-free experimental state that does not consider boundaries. In future research on robotic fish pose control, obstacle factors will be added to achieve pose control while avoiding obstacles in the field.

Acknowledgment. The author thanks Wang Gang and others for their warm help. This work was supported by the Jilin Provincial Science and Technology Development Plan (Natural Science Foundation) "Research on key technologies of underwater intelligent detection robot based on multi-source information fusion", project number: 20220101138JC; and the Natural Science Foundation of Jilin Province (general project of free exploration) "Research on cooperative strategy of multi-underwater AUV clusters based on bio-inspiration", project number: YDZJ202301ZYTS420.

References

1. Ke, H., Wang, C.: Stability and path planning of robotic fish based on URWPGSim2D platform. Ordnance Ind. Autom. 37(04), 93–96 (2018)
2. Rong, H., Wei, X.: Optimal solution of water polo transportation path planning based on URWPGSim2D simulation platform. In: Proceedings of the 2018 7th International Conference on Sustainable Energy and Environment Engineering (ICSEEE 2018) (2019)
3. Yang, Y., Wang, H., Li, H., Shu, R.: An improved simulation robot fish synchronized swimming strategy. Ordnance Ind. Autom. 35(12), 87–88 (2016)


4. Li, S., Yuan, X., Xiao, C.: Design and realization of dynamic obstacle on URWPSSim2D. Telkomnika Indones. J. Electr. Eng. 12(1), 304–313 (2013)
5. Qi, J., Zhuo, L., Han, L.: Strategy optimization of cooperative through hole project based on URWPGSim2D platform. Ordnance Ind. Autom. 35(05), 72–75 (2016)
6. Xie, G., Li, S., He, C.: Multi-robot Fish Cooperative Simulation System, pp. 28–40. Harbin Engineering University Press, Harbin (2013)
7. Zou, K.: Research on Path Planning and Trajectory Tracking Control of Underwater Vehicle. Peking University, Beijing (2009)
8. Wang, S., Chen, F.: Research on accurate projectile stability control of individual UAV in complex terrain. J. Ordnance Equip. Eng. 41(03), 25–30 (2020)
9. He, N.: Research on Four-rotor UAV Flight Control System Based on Fuzzy Adaptive PID Control. Hebei University of Technology, Tianjin (2017)
10. Yu, W., Yang, K.: Design of cascade fuzzy adaptive PID control system for four-rotor UAV. Mach. Des. Manuf. (01), 227–231 (2019)
11. Xiao, P., Liu, S.: Design of control system for small cabled unmanned underwater vehicle. Small Microcomput. Syst. 40(02), 451–455 (2019)
12. Liu, J., Zhu, M., Zhou, R.: Theory and Application of Advanced Fuzzy Intelligence Compound Classical PID Control and Its Matlab Implementation, pp. 74–80. Capital University of Economics and Business Press, Beijing (2016)
13. Intelligent Control Laboratory, Peking University: Control and Optimization of Robotic Fish. Technical report of Peking University (2011)
14. Wang, G., Song, Y., Tang, W., Zhao, Q.: Robot fish pose control algorithm based on cascade PID. J. Jilin Univ. 60(03), 734–742 (2022)

Unsupervised Multidimensional Time Series Anomaly Detection Based on Federation Learning

Ying Deng 1, Yaogen Li 1(B), Yingqi Liao 2, Nan Ma 2, and Chengyu Yuan 1

1 Nanjing University of Information Science and Technology, Nanjing 210044, China
[email protected]
2 Nanjing Power Supply Company of State Grid Jiangsu Electric Power Co., Ltd., Nanjing 210019, China

Abstract. Efficient and accurate prediction of electricity data is an important task in electricity data research. Convolutional neural networks have excellent performance in electricity data prediction problems but require large amounts of data to train the models. The major power companies are not willing to share their power data due to privacy and security concerns, making it impossible to train more accurate models. Moreover, uploading the huge amount of data to a central server for training the federated model generates a huge network resource overhead. To address these problems, this paper proposes a federation learning algorithm (FedFLA) for unsupervised multidimensional time series anomaly detection, built on existing artificial intelligence techniques such as federation learning, time-domain convolutional neural networks and self-attentive mechanisms, in combination with existing federation learning algorithms. The algorithm uses time-domain convolutional networks and self-attentive mechanisms to fully consider the local data dependence and global data correlation of time series, fuses time series model parameters and feature information through cross-stitch units, and obtains an anomaly score for each time stamp, so as to determine whether the data is anomalous. Keywords: Electrical power data · Federated learning · Self-attentive mechanisms · Time-domain convolutional neural networks

1 Research Background and Purpose

Traditional statistical methods and machine learning techniques face several challenges when detecting complex anomalies in high-dimensional, nonlinear power system data, such as large data volume, high dimensionality, complex non-linear relationships, and data privacy protection issues [1]. To address these challenges, this paper proposes a federated learning-based anomaly detection method for power data. The method distributes model training to the local devices of multiple data owners to protect data privacy.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 128–135, 2023. https://doi.org/10.1007/978-981-99-6187-0_13


2 Related Work

Anomaly detection in power systems has been extensively studied using statistical methods [2–4], traditional machine learning techniques, and deep learning algorithms. However, these methods have limitations in detecting complex anomalies in high-dimensional and non-linear power system data. To address this challenge, Temporal Convolutional Networks (TCNs) have been proposed as a class of neural networks that can effectively model sequences with temporal dependencies. TCNs use dilated convolutions to process input sequences, enabling them to capture long-range dependencies while maintaining computational efficiency [5]. Although federated learning has been proposed as a promising approach for collaborative machine learning, current methods do not adequately consider local and global dependencies in the data [5]. To overcome this limitation, we propose a novel approach that introduces a self-attention mechanism into the time-domain convolutional network to capture both local and global features of the data [6]. An information fusion module then integrates the two kinds of features, leading to effective multidimensional time series anomaly detection. After local training, each client uploads its parameters to the server, and the server distributes the aggregated model back to each client. We conduct extensive experiments to evaluate the proposed method, and the results demonstrate its superiority over traditional statistical methods and machine learning techniques in both accuracy and efficiency.

3 Methodology

3.1 Preliminaries

In the following, let X = (x_1, x_2, ..., x_T) denote a multivariate time series with x_t ∈ R^m, t ∈ {1, 2, ..., T}, where m is the number of features. For evaluation we also define a class label y_t ∈ {0, 1} for each time stamp, where y_t = 0 denotes the normal class and y_t = 1 indicates an anomaly. To eliminate noise, missing values, and outliers in the original data, to make the data more standardized and regular, and to improve the accuracy and reliability of the model, the raw data is first preprocessed to remove outliers and missing values and then standardized. Time windows are then constructed to divide the data into segments, so that the system state in each time interval can be analyzed. The standardization formula is:

  x_t^* = (x_t − min(X)) / (max(X) − min(X) + α)        (1)

where x_t represents the characteristic parameters of the substation collected at time t, max(X) and min(X) are the maximum and minimum values among the collected sample parameters, x_t^* is the standardized result of the parameters collected at time t, and α is a small constant added to avoid division by zero.
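As a concrete illustration, the min-max standardization of Eq. (1) and the sliding windows defined next in Eq. (2) can be sketched as follows (a minimal NumPy sketch; the function names are our own, not from the paper):

```python
import numpy as np

def minmax_normalize(X, alpha=1e-6):
    """Per-feature min-max scaling as in Eq. (1); the small constant
    alpha guards against division by zero for constant features."""
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return (X - x_min) / (x_max - x_min + alpha)

def make_windows(X, N):
    """Sliding windows w_t = (x_{t-N+1}, ..., x_t) as in Eq. (2).
    X has shape (T, m); the result has shape (T - N + 1, N, m)."""
    return np.stack([X[t - N + 1 : t + 1] for t in range(N - 1, len(X))])

X = np.array([[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [4.0, 16.0]])
X_std = minmax_normalize(X)   # each feature scaled to [0, 1)
W = make_windows(X_std, N=2)  # shape (3, 2, 2)
```

Each window then becomes one input segment to the local model described in Sect. 3.2.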


Next, considering the dependence between the observation x_t and historical time points, a time window of length N is constructed:

  w_t = (x_{t−N+1}, x_{t−N+2}, ..., x_t)        (2)

where x_{t−N+1} is the starting position of the window.

3.2 Local Model Construction

3.2.1 Model Input Initialization

Stacking multiple fusion-encoder layers helps the model learn deeper latent temporal correlations. Assume a fusion encoder of L layers and an input time window w ∈ R^{N×m} of length N. The initial input and the computation at layer l are:

  Z_1^1 = Z_2^1 = w
  Z_1^{l+1}, O_2^l = FusionEncoder(Z_1^l, Z_2^l)
  S_2^l = LayerNorm(O_2^l + Conv1d(Z_2^l))
  Z_2^{l+1} = FeedForward(S_2^l) + S_2^l        (3)

In these equations, Z_1^{l+1}, Z_2^{l+1} ∈ R^{N×d_model} are the two inputs of the fusion encoder at layer l + 1, l ∈ {1, 2, ..., L}; d_model is the dimension of the feature-space vectors; O_2^l is the output produced by the fusion encoder; LayerNorm denotes layer normalization; S_2^l is the intermediate value fed to the feedforward network; Conv1d denotes a one-dimensional convolution; and FeedForward denotes the feedforward network.

3.2.2 Single-Layer Fusion Encoder Calculation

A time sliding window W provides the two identical initial inputs Z_1, Z_2 of the fusion encoder, which processes each input and generates two outputs. Matrix multiplication computes the attention weights between each time point and all other time points. The input to the l-th encoder layer is obtained from formula (3); information is extracted by the time-domain convolutional network and the self-attention mechanism, and the extracted features are then fused by cross-stitch units. The specific formulas are:

  Q, K, V = Z_2^l W_Q^l, Z_2^l W_K^l, Z_2^l W_V^l
  M = Softmax(Q K^T / √m)
  I_1' = TemporalBlock(Z_1^l)
  I_1 = M I_1'
  I_2 = M V
  [O_1; O_2] = [γ_11, γ_12; γ_21, γ_22] [I_1; I_2]        (4)

where Q, K, V are the attention queries, keys, and values, obtained by linearly transforming the attention-module input Z_2^l with the matrices W_Q^l, W_K^l, W_V^l; M is the self-attention matrix; Softmax denotes normalization and TemporalBlock denotes the time-domain convolution operation; I_1 is the time-domain convolution hidden layer after the attention-matrix information is added; I_2 is the hidden layer after attention processing; O_1 and O_2 are the fusion results of I_1 and I_2; and γ_11, γ_12, γ_21, γ_22 are the cross-stitch weighting parameters.

3.2.3 Decoding Operation

The decoder consists of two parts that decode the two hidden-layer features output by the L-layer fusion encoder. The first part is a reverse time-domain convolutional layer, realized by replacing the dilated causal convolution in the time-domain convolutional residual block with transposed convolution. The second part is a single-layer feedforward network followed by a sigmoid function. A cross-stitch unit integrates the two feature representations, which focus on extracting local data dependence and global data association respectively.

3.2.4 Model Loss Function

The reconstruction-error loss of the model is:

  α ||w − C_1|| + β ||w − C_2||        (5)

where α and β are hyper-parameters with α + β = 1, and C_1 and C_2 are the final outputs of the decoder.
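The cross-stitch fusion at the end of Eq. (4), which mixes the temporal-convolution stream I_1 with the attention stream I_2 through a 2×2 weight matrix, can be sketched as follows (a NumPy sketch under our own naming; in the actual model the γ entries are trained jointly with the network):

```python
import numpy as np

def cross_stitch(i1, i2, gamma):
    """Cross-stitch unit from Eq. (4): [O1; O2] = Gamma [I1; I2].
    i1: features from the temporal-convolution branch,
    i2: features from the self-attention branch,
    gamma: 2x2 mixing weights (learned in the real model)."""
    o1 = gamma[0, 0] * i1 + gamma[0, 1] * i2
    o2 = gamma[1, 0] * i1 + gamma[1, 1] * i2
    return o1, o2

gamma = np.array([[0.9, 0.1],
                  [0.1, 0.9]])  # near-identity initialization
i1 = np.ones((5, 8))            # toy hidden states, shape (N, d_model)
i2 = np.zeros((5, 8))
o1, o2 = cross_stitch(i1, i2, gamma)
```

With an identity γ the two streams pass through unmixed; training moves γ away from the identity exactly when sharing information between the local and global streams reduces the reconstruction loss.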


3.2.5 Update Local Model Parameters

Anomaly scores can be calculated using formula (5), where higher scores indicate a greater likelihood that the corresponding time point is anomalous. Formula (5) yields the loss function of the k-th participant, which is then used to update the parameters θ of each local model with the Adam optimizer, as shown in formula (6); formula (7) is the parameter update equation:

  m_t = γ_1 m_{t−1} + (1 − γ_1) g_t
  v_t = γ_2 v_{t−1} + (1 − γ_2) g_t^2
  m̃_t = m_t / (1 − γ_1^t)
  ṽ_t = v_t / (1 − γ_2^t)        (6)

  θ̃ = θ − η · m̃_t / (√ṽ_t + ε)        (7)
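Formulas (6)–(7) are the standard Adam recursions: exponential moving averages of the gradient and its square, bias correction, then the step. One update can be sketched as (variable names are our own):

```python
import numpy as np

def adam_step(theta, g, m, v, t, eta=1e-3, gamma1=0.9, gamma2=0.999, eps=1e-8):
    """One Adam update following Eqs. (6)-(7)."""
    m = gamma1 * m + (1 - gamma1) * g
    v = gamma2 * v + (1 - gamma2) * g * g   # g^2 is elementwise
    m_hat = m / (1 - gamma1 ** t)           # bias-corrected first moment
    v_hat = v / (1 - gamma2 ** t)           # bias-corrected second moment
    theta = theta - eta * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.zeros(3)
m, v = np.zeros(3), np.zeros(3)
g = np.array([0.1, -0.2, 0.3])
theta, m, v = adam_step(theta, g, m, v, t=1)
# on the first step each coordinate moves by roughly -eta * sign(g)
```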

3.3 Federated Learning Model

The federated learning model runs on the server to enable joint training. A model loss function is constructed and the local model parameters are updated; the updated parameters are then uploaded to the server and aggregated to generate the global model. Federated learning is a distributed machine learning approach in which multiple parties collaborate to train a model without sharing their data. Each party trains a local model on its own data using stochastic gradient descent or a similar optimization method. The local models' updates are aggregated into the global model, which is then sent back to the parties for further training. This process is repeated until convergence [1].

3.3.1 Construct the Model Loss Function

The reconstruction-error loss is:

  μ ||w − D_1|| + ϑ ||w − D_2||        (8)

where μ and ϑ are hyper-parameters with μ + ϑ = 1, and D_1 and D_2 are the reconstructions of the input w obtained from the model during testing.

3.3.2 Update Local Model Parameters

The loss function of the k-th participant is obtained from formula (8), and the parameters of each local model are then updated with the Adam optimizer, as shown in Eq. (9); Eq. (10) is the parameter update formula:

  m_t = γ_1 m_{t−1} + (1 − γ_1) g_t
  v_t = γ_2 v_{t−1} + (1 − γ_2) g_t^2
  m̃_t = m_t / (1 − γ_1^t)
  ṽ_t = v_t / (1 − γ_2^t)        (9)

  θ̃ = θ − η · m̃_t / (√ṽ_t + ε)        (10)

Here g_t^2 = g_t ⊙ g_t denotes the elementwise square of the gradient; γ_1, γ_2 ∈ [0, 1) are the exponential decay rates; m̃_t and ṽ_t are the bias-corrected mean and biased variance of the gradient; and η is the learning rate. Through the above process, the updated local parameters are uploaded to the terminal server for aggregation into a global model.

3.4 Generate the Global Model

After updating the local model parameters, they are uploaded to the server and aggregated to generate the global model. The local model parameters produced in the above steps are aggregated at the trusted server: the parameters w_k^l of the corresponding layer of each local model are taken as features and processed by a ResNet residual neural network. The number of neurons in the input layer of the ResNet matches the corresponding layer of the local model, and the output layer has U neurons corresponding to the global model. Local model training runs for R iterations, and aggregation starts when the local model parameters of each layer reach a certain number, local_ep. The clients upload their model parameters to the server to generate the global model:

  w_G^l = ResNet(w_k^l)        (11)

where ResNet(·) denotes the ResNet network, w_G^l are the global neural network parameters, and l is the layer index. By implementing this framework, both local data dependence and global data dependence can be exploited to improve the effectiveness of multidimensional time series anomaly detection.
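The upload-aggregate-distribute cycle described above can be illustrated with a simplified server-side aggregation. Note the paper itself learns the aggregation through a ResNet (Eq. (11)); the sketch below substitutes the common FedAvg-style sample-size-weighted average as a stand-in:

```python
import numpy as np

def federated_average(client_params, client_sizes):
    """Sample-size-weighted average of client parameter dicts
    (FedAvg-style; the paper learns this step with a ResNet,
    Eq. (11) -- this is a simplified stand-in)."""
    total = float(sum(client_sizes))
    return {key: sum(p[key] * (n / total)
                     for p, n in zip(client_params, client_sizes))
            for key in client_params[0]}

# two hypothetical clients, each holding one layer's weights
c1 = {"layer1": np.array([1.0, 1.0])}
c2 = {"layer1": np.array([3.0, 3.0])}
global_model = federated_average([c1, c2], client_sizes=[10, 30])
```

The resulting dictionary is what the server would distribute back to the clients for the next round of local training.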

4 Experimental Analysis and Discussion

4.1 Configuration and Setup of the Experiment

The experiments use a 64-bit Windows 10 operating system and the PyTorch framework; the hardware is an AMD Ryzen 5 3600X CPU and an Nvidia GeForce RTX 3070 GPU. All experiments use Stochastic Gradient Descent (SGD) with an initial learning rate lr = 0.01 and a training batch size B = 10. Experiments on MNIST, FMNIST, and a private dataset constructed from the publicly available CIFAR10 and CIFAR100 datasets run 50 communication rounds per dataset, equivalent to 500 epochs (E) of local model training. The comparative experiments are all based on the algorithm proposed above, compared against the FedAvg and FedProx algorithms using the same local model, to demonstrate the feasibility and superiority of the proposed algorithm; the reported results are averaged over 10 runs (Table 1).

134

Y. Deng et al.

on the publicly available CIFAR10 and CIFAR100 datasets show that each dataset communicates 50 times, which is equivalent to 500 Epoch (E) trained by the local model. The comparative experiments in this paper are all based on the algorithm proposed in the previous paper, compared with the FedAvg and FedProx algorithms using the same local model, to first demonstrate the feasibility and superiority of the algorithm in this paper, and the results of the experiments are averaged over 10 times (Table 1). Table 1. Accuracy of multiple algorithms on four datasets Dataset

Independence

Algorithm

Average accuracy (%)

Highest accuracy (%)

MNIST

IID

FedAvg

97.96

98.38

FedProx

98.25

98.63

Non-IID

FMINIST

IID

Non-IID

Private

IID

Non-IID

FedFLA

98.52

99.21

FedAvg

92.64

96.83

FedProx

94.86

98.43

FedFLA

95.12

98.71

FedAvg

90.58

91.31

FedProx

91.74

92.45

FedFLA

93.52

94.02

FedAvg

88.34

92.72

FedProx

90.26

93.88

FedFLA

92.05

94.02

FedAvg

65.21

68.51

FedProx

66.34

72.88

FedFLA

70.95

74.52

FedAvg

59.68

62.71

FedProx

63.25

68.55

FedFLA

65.02

69.78

It can be seen from the table above that there is no significant difference between FedAvg and FedProx, because their accuracy ranges overlap, although the accuracy of FedFLA is higher than that of FedProx. FedFLA is significantly better than FedAvg because their accuracy ranges do not overlap.

5 Conclusion

In conclusion, this paper proposes a novel approach for power data anomaly detection based on federated learning. The proposed method utilizes the self-attention mechanism in a time-domain convolutional neural network to capture both local and global features


of the data, and an information fusion module to integrate these features. By adopting federated learning, the proposed method can effectively learn from the data of multiple clients without compromising data privacy. Experimental results demonstrate that the proposed method outperforms existing state-of-the-art methods in terms of detection accuracy and efficiency. The method can take into account both local and global dependence of the data and better extract time-sequence reconstruction information from the time series, so as to improve the effectiveness of multidimensional time series anomaly detection.

Acknowledgments. This work was supported by the Science and Technology Project of SGCC, the research on aggregated modeling and short-term load forecasting for bus nodes considering distributed resources (5108-202218038A-1-1-ZN).

References

1. Wang, Q., Zhang, Y., Yan, F., Song, Y.: Fault detection of industrial processes using a multidimensional time series client and a dynamic principal component analysis. J. Process Control 44, 66–77 (2016)
2. Lee, K., Kim, J., Han, J.: Statistical anomaly detection for power system monitoring. In: IEEE Power Engineering Society General Meeting (2011)
3. Li, X., Zhang, L., Sun, K.: Anomaly detection in power system based on support vector machine. In: IEEE International Conference on Power System Technology (2016)
4. Wang, Y., Zhang, L., Su, H.: Deep autoencoder-based feature learning for power system transient stability assessment. IEEE Trans. Power Syst. 33(2), 2060–2070 (2018)
5. Bai, S., Kolter, J.Z., Koltun, V.: An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271 (2018)
6. Yang, J., Wang, S., Li, X.: Federated learning for anomaly detection in time series data. In: IEEE International Conference on Communications (2020)
7. Kairouz, P., McMahan, H.B., Avent, B., Bellet, A., Bennis, M.: Federated learning: strategies for improving communication efficiency. arXiv preprint arXiv:1910.11891 (2019)

Reinforcement Federated Learning Method Based on Adaptive OPTICS Clustering

Tianyu Zhao, Junping Du(B), Yingxia Shao, and Zeli Guan

Beijing Key Laboratory of Intelligent Communication Software and Multimedia, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
[email protected]

Abstract. Federated learning is a distributed machine learning technology that balances data privacy protection against shared computation. To protect data privacy, it learns a shared model by executing distributed training locally on participating devices and aggregating the local models into a global model. A key problem in federated learning is the negative impact of non-independent and identically distributed (non-IID) data across different user terminals. To alleviate this problem, this paper proposes a reinforced federated aggregation method based on adaptive OPTICS clustering. The method perceives the clustering environment as a Markov decision process and models the adjustment of the parameter search direction, so as to find the best clustering parameters of the OPTICS algorithm without manual assistance and thereby obtain the best federated aggregation. The core contribution of this paper is an adaptive OPTICS clustering algorithm for federated learning that combines OPTICS clustering with adaptive learning and can effectively handle non-IID data across user terminals, achieving better performance. The reliability and practicality of the method are verified on experimental data, demonstrating its effectiveness and superiority.

Keywords: Federated learning · Clustering · Reinforcement learning · OPTICS algorithm · Non-independent identically distributed

1 Introduction

With the popularity of mobile computing and the Internet, more and more users are using mobile terminals for data communication and computing. With the increasing demand for user data privacy and local processing, federated learning technology has emerged in this context.

This work was supported by the National Natural Science Foundation of China (62192784, U22B2038, 62172056).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 136–144, 2023. https://doi.org/10.1007/978-981-99-6187-0_14


Federated learning technology [1–4] is a new distributed machine learning technology proposed in 2017, with Google being one of the first companies to propose it. Unlike traditional centralized machine learning, federated learning can balance the contradiction between data value and data privacy. In this distributed computing paradigm, all participating nodes collaboratively train a global model while protecting user privacy and data security. The data of each participating node is stored locally and updated under the server's coordination; the trained model can be allocated to the various participants or shared among multiple parties [5–9]. The advantage of this distributed machine learning technology is that it can effectively avoid data leakage [10, 11] and data abuse caused by data privacy issues [12–14]. At the same time, federated learning is widely used in many scenarios, such as medical health, financial technology, and intelligent manufacturing. Therefore, solving the problem of non-independent and identically distributed data is one of the important directions for federated learning to address [15–18]. This paper proposes a reinforced federated aggregation method based on adaptive OPTICS clustering, named FedRO, which can solve the heterogeneity problem [19, 20] of data distribution in federated learning tasks. Compared with traditional federated learning algorithms, the proposed algorithm can not only achieve efficient data recording, learning, and updating, but also share and aggregate data while protecting user privacy. In summary, solving the problem of data heterogeneity in federated learning is of great practical significance. The main contributions of this paper are as follows:

1. A reinforcement learning-based adaptive OPTICS clustering algorithm is proposed. This algorithm addresses the problem of non-independent and identically distributed data across devices in federated learning. By clustering data from different devices, it can effectively handle uneven data distribution. Specifically, we exploit the advantages of the OPTICS clustering algorithm and use reinforcement learning to adaptively determine the core distance and minimum sample size of the clustering, which handles the data distribution more accurately and reasonably.
2. A reinforced federated learning method based on the adaptive OPTICS clustering algorithm is proposed. This method clusters clients into different clusters based on features using the adaptive OPTICS clustering algorithm and performs random selection within the clusters, which makes federated learning more stable and accurate.
3. Experimental results show that the proposed adaptive OPTICS clustering algorithm and reinforced federated aggregation method can effectively address data heterogeneity in federated learning tasks with non-IID data across devices, improving the performance and accuracy of federated learning. The algorithm performs well on the MNIST, CIFAR-10, and Fashion-MNIST datasets.

2 FedRO

The core idea of this article is to extract features from local data using the Deep Sets model, upload the feature vectors to the server node, and use reinforcement learning to define the state space, actions, and rewards. The clustering environment is modeled as a Markov decision process, and the parameter search direction adjustment process is modeled to find the optimal clustering parameters eps and minPts for the OPTICS clustering algorithm, so as to achieve the best federated aggregation. Similar clients are assigned to the same cluster, and random selection is performed within each cluster; each cluster determines one model. The overall architecture of the solution is shown in Fig. 1, which depicts the process of implementing reinforced federated aggregation with the adaptive OPTICS clustering algorithm, using three clients as an example.

Fig. 1. Illustration of FedRO

2.1 Reinforced Federated Learning Method

Specifically, the search process in round k takes the following form. Since the state needs to represent the search environment of each step as accurately and completely as possible, we construct the state representation from two aspects (i = 1, 2, ...). First, the global clustering status is defined as:

  s_global^{(k)(i)} = P^{(k)(i)} ∪ D_b^{(k)(i)} ∪ {R_{c_n}^{(k)(i)}}        (1)

Secondly, for the description of each class, the local clustering status of class c_n ∈ C at step i is defined as:

  s_local,n^{(k)(i)} = χ_cent,n^{(k)(i)} ∪ {D_cent,n^{(k)(i)}, c_n^{(k)(i)}}        (2)

Based on the global state and the local states, the current state is defined as:

  s^{(k)(i)} = σ(F_G(s_global^{(k)(i)}) ⋃_{c_n ∈ C} F_L(s_local,n^{(k)(i)}))        (3)


Action: represents the parameter search direction at step i. This paper defines the action space as D = {left, right, down, up, stop}, where left and right decrease and increase the eps parameter, respectively; down and up decrease and increase the minPts parameter, respectively; and stop terminates the search. Specifically, this paper establishes an Actor as a policy network that selects the action based on the current state:

  a^{(k)(i)} = Actor(s^{(k)(i)})        (4)

Reward: this paper uses a small portion of externally measured samples as the basis for rewards. The reward for step i is:

  R(s^{(k)(i)}, a^{(k)(i)}) = NMI(OPTICS(χ, P^{(k)(i+1)}), y)        (5)

Termination: For the entire search process, use the following termination conditions: stop when exceeding the boundary; stop when exceeding the maximum number of steps. The specific algorithm is shown as Algorithm 1.

The feature matrix is clustered with OPTICS using the adaptive parameters, the nodes are grouped, and the weights of the cluster model are sent to the worker nodes. The FedAvg algorithm is used for local updates within each cluster, and the weights are sent back to the server. Finally, the server receives all the weights and performs weighted aggregation.
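One step of the parameter search — running OPTICS with a candidate (eps, minPts) pair and scoring the result against the small labelled sample via NMI, as in Eq. (5) — might look like the scikit-learn sketch below (the Actor network that chooses the candidate parameters is omitted, and the toy data stands in for the uploaded client feature vectors):

```python
from sklearn.cluster import OPTICS
from sklearn.datasets import make_blobs
from sklearn.metrics import normalized_mutual_info_score

def clustering_reward(X, y_true, eps, min_pts):
    """Reward of one search step, following Eq. (5): cluster with the
    candidate parameters and score the labels against a small labelled
    sample using NMI (a value in [0, 1])."""
    labels = OPTICS(max_eps=eps, min_samples=min_pts).fit_predict(X)
    return normalized_mutual_info_score(y_true, labels)

# toy feature vectors with a known grouping
X, y = make_blobs(n_samples=60, centers=3, cluster_std=0.3, random_state=0)
reward = clustering_reward(X, y, eps=2.0, min_pts=5)
```

The search loop would call this reward for each action the Actor proposes, moving eps and minPts toward parameter values that maximize the NMI.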

3 Experiments

To verify the effectiveness of the FedRO algorithm, this paper evaluates it by training popular CNN models on three datasets: MNIST, CIFAR-10 [21], and Fashion-MNIST [22].

3.1 Datasets and Models

The statistical information of the experimental datasets and models is shown in Table 1. To demonstrate the reliability of the adaptive clustering algorithm, four datasets (glass, wine, yeast, and iris) are selected to test the clustering algorithm, using two evaluation indicators: Adjusted Rand Index (ARI) and Normalized Mutual Information (NMI). The specific information of these datasets is shown in Table 2.

Table 1. Data set and model statistics

Dataset | Number of samples | Model
MNIST | 50000 | CNN
CIFAR-10 | 60000 | CNN
Fashion-MNIST | 60000 | CNN

Table 2. Data description

Dataset | Samples | Features | Classes
glass | 214 | 9 | 6
wine | 178 | 13 | 3
yeast | 1484 | 8 | 9
iris | 150 | 4 | 3

3.2 Experiment Settings

In this paper, experiments were conducted on the different datasets using 100 available devices. The datasets were assigned to different training nodes according to different sampling methods, and each node trained on its own data locally. After each round of training, the training nodes sent their training results (i.e., their model parameters) to the parameter server, which aggregated them into a new global model parameter. This new global model parameter was then distributed to the next round of training nodes, allowing them to continue training on their local data from the latest global model. The following methods were selected as benchmarks for the comparative experiments: FedAvg, FedProx [23], and SHARE [24]. FedProx is an optimization aggregation algorithm proposed in 2020 to address system and statistical heterogeneity in federated networks, and SHARE is a hierarchical federated learning method proposed in 2021. In federated learning, reducing the number of communication rounds is important because of the limited computing power and network bandwidth of mobile devices; we therefore use the number of communication rounds and the highest accuracy as performance indicators. In the clustering experiment, on the datasets shown in Table 2, the clustering indicators (ARI and NMI) of four algorithms were compared: K-means, DBSCAN, OPTICS, and the reinforcement learning-based adaptive OPTICS algorithm proposed in this paper.

3.3 Federated Learning Experiment Results and Analysis

In reference [25], the parameter σ is used to measure the degree of non-independent and identically distributed data; a larger σ indicates a stronger degree of non-IID data. We also adopt this measure, with two settings, σ = 0.8 and σ = 1.0. When σ = 1.0, the data is maximally non-IID and each label belongs to only one device; when σ = 0.8, 80% of a device's data belongs to one label and the remaining 20% belongs to other labels. Figures 2, 3 and 4 show the experimental results of the different algorithms on the three datasets when σ = 0.8. Table 3 shows the maximum accuracy that each algorithm achieves on each dataset. In Table 4, each entry shows the number of communication rounds required for the CNN to reach a test set accuracy of 99% on MNIST, 55% on CIFAR-10, and 85% on Fashion-MNIST. We conducted experiments at σ = 0.8 and 1.0 and counted the communication rounds required for each algorithm to reach the target accuracy; fewer rounds mean a more efficient algorithm, lower workload, and higher performance.
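The σ-controlled label skew described above can be reproduced with a partition routine along the following lines (our own sketch; the exact sampling procedure of [25] may differ in details):

```python
import numpy as np

def partition_by_sigma(labels, n_clients, sigma, seed=0):
    """Give each client a shard in which a fraction sigma of the samples
    carries one dominant label and the rest is drawn from the other
    labels; sigma = 1.0 means each client holds a single label."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n_per = len(labels) // n_clients        # samples per client
    shards = []
    for c in range(n_clients):
        dom = classes[c % len(classes)]     # dominant label of client c
        dom_idx = np.flatnonzero(labels == dom)
        oth_idx = np.flatnonzero(labels != dom)
        n_dom = int(round(sigma * n_per))
        shard = np.concatenate([
            rng.choice(dom_idx, n_dom, replace=False),
            rng.choice(oth_idx, n_per - n_dom, replace=False),
        ])
        shards.append(shard)
    return shards

labels = np.repeat(np.arange(10), 100)      # 10 labels, 1000 samples
shards = partition_by_sigma(labels, n_clients=10, sigma=0.8)
```

Each shard is then handed to one client as its local training set for the federated rounds.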

Fig. 2. Experimental results on the MNIST, CIFAR-10, and Fashion-MNIST datasets

Table 3. The highest accuracy that can be achieved

Algorithm | MNIST | CIFAR-10 | Fashion-MNIST
FedAvg | 0.991 | 0.557 | 0.872
FedProx | 0.993 | 0.552 | 0.863
SHARE | 0.995 | 0.564 | 0.879
FedRO (Ours) | 0.993 | 0.576 | 0.874

Table 4. The number of communication rounds to reach a target accuracy

Algorithm | σ | MNIST | CIFAR-10 | Fashion-MNIST
FedAvg | 1.0 | 1517 | 1714 | 1811
FedAvg | 0.8 | 221 | 87 | 52
FedProx | 1.0 | 1421 | 1658 | 1720
FedProx | 0.8 | 210 | 96 | 70
SHARE | 1.0 | 1201 | 1320 | 1526
SHARE | 0.8 | 142 | 76 | 48
FedRO (Ours) | 1.0 | 1105 | 1240 | 1534
FedRO (Ours) | 0.8 | 152 | 70 | 46
From the experimental results, it can be seen that FedRO performs well on all three datasets. The results show that FedRO can reduce the number of communication rounds by up to 27% on the MNIST dataset, up to 38% on the CIFAR-10 dataset, and up to 17% on the Fashion-MNIST dataset.

3.4 Experimental Results and Analysis of Clustering Algorithms

Table 5 shows the clustering performance (ARI and NMI values) of the four clustering algorithms on the four datasets; the closer a value is to 1, the better the clustering effect. The results show that the reinforcement learning-based adaptive OPTICS clustering algorithm proposed in this article performs well on all four datasets. The algorithm dynamically adjusts the clustering parameters by using the NMI value as the output of the reward function, which leads to good NMI values in the clustering task. In contrast, the non-adaptive OPTICS clustering algorithm performs poorly because it cannot effectively handle datasets without clear boundaries between clusters, nor can it effectively cluster high-dimensional datasets. The proposed algorithm, however, achieves good results even on high-dimensional datasets by adaptively adjusting its parameters.
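The two indicators used here can be computed directly with scikit-learn; both are invariant to permutations of the cluster labels, so a perfect clustering scores 1 regardless of how its clusters are numbered:

```python
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score

y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]  # ground-truth classes
y_perm = [1, 1, 1, 0, 0, 0, 2, 2, 2]  # same partition, labels renamed
y_bad  = [0, 1, 2, 0, 1, 2, 0, 1, 2]  # partition unrelated to the truth

ari_perfect = adjusted_rand_score(y_true, y_perm)          # label-invariant
nmi_perfect = normalized_mutual_info_score(y_true, y_perm)
ari_bad = adjusted_rand_score(y_true, y_bad)               # near or below 0
```

ARI is chance-corrected (it can go negative for worse-than-random partitions), while NMI stays in [0, 1]; reporting both, as Table 5 does, guards against either metric's blind spots.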

Table 5. Experimental results on ARI and NMI

Algorithm | Parameter | glass | wine | yeast | iris
K-means | ARI | 0.802 | 0.304 | 0.386 | 0.547
K-means | NMI | 0.745 | 0.241 | 0.402 | 0.681
OPTICS | ARI | 0.293 | 0.311 | 0.258 | 0.561
OPTICS | NMI | 0.391 | 0.294 | 0.243 | 0.758
DBSCAN | ARI | 0.838 | 0.293 | 0.375 | 0.535
DBSCAN | NMI | 0.732 | 0.335 | 0.336 | 0.684
FedRO (Ours) | ARI | 0.792 | 0.334 | 0.415 | 0.612
FedRO (Ours) | NMI | 0.852 | 0.372 | 0.305 | 0.624

4 Conclusion

This article proposes a reinforcement learning-based federated learning method built on adaptive OPTICS clustering, which can solve the heterogeneity problem of data distribution in federated learning tasks. Compared with traditional federated learning algorithms, the proposed algorithm can not only achieve efficient data recording, learning, and updating but also share and aggregate data while protecting user privacy. Under data heterogeneity, it reduces the impact caused by differing data distributions, improves accuracy, and significantly improves performance. When clustering data, the number of clusters must be balanced: the more clusters there are, the higher the accuracy of the models within each cluster, but the worse the global model's generalization; if there are too few clusters, the accuracy of the models within each cluster decreases. By using adaptive parameters, we can determine a range that both clusters the data well and reduces the number of iterations required for parameter adjustment. In future research, we will further optimize performance through additional study and experimentation.


T. Zhao et al.

7. Li, Y., Zeng, I.Y., Niu, Z., Shi, J., Wang, Z., Guan, Z.: Predicting vehicle fuel consumption based on multi-view deep neural network. Neurocomputing 502, 140–147 (2022) 8. Shao, Y., Huang, S., Li, Y., Miao, X., Cui, B., Chen, L.: Memory-aware framework for fast and scalable second-order random walk over billion-edge natural graphs. VLDB J. 30(5), 769–797 (2021) 9. Li, Y., Yuan, Y., Wang, Y., Lian, X., Ma, Y., Wang, G.: Distributed multimodal path queries. IEEE Trans. Knowl. Data Eng. 34(7), 3196–3210 (2022) 10. Li, Y., et al.: Heterogeneous latent topic discovery for semantic text mining. IEEE Trans. Knowl. Data Eng. 35(1), 533–544 (2021) 11. Li, W., Jia, Y., Du, J.: Tobit Kalman filter with time-correlated multiplicative measurement noise. IET Control Theory Appl. 11(1), 122–128 (2017) 12. Kou, F., et al.: Hashtag recommendation based on multi-features of microblogs. J. Comput. Sci. Technol. 33, 711–726 (2018) 13. Li, A., et al.: Scientific and technological information oriented semantics-adversarial and media-adversarial cross-media retrieval. arXiv preprint arXiv:2203.08615 (2022) 14. Wei, X., Du, J., Liang, M., Ye, L.: Boosting deep attribute learning via support vector regression for fast moving crowd counting. Pattern Recogn. Lett. 119, 12–23 (2019) 15. Yang, Y., Du, J., Ping, Y.: Ontology-based intelligent information retrieval system. J. Softw. 26(7), 1675–1687 (2015) 16. Lin, P., Jia, Y., Du, J., Yu, F.: Average consensus for networks of continuous-time agents with delayed information and jointly-connected topologies. In: 2009 American Control Conference, pp. 3884–3889 (2009) 17. Li, Y., Jiang, W., Yang, L., Wu, T.: On neural networks and learning systems for business computing. Neurocomputing 275(31), 1150–1159 (2018) 18. Meng, D., Jia, Y., Du, J., Yu, F.: Tracking algorithms for multiagent systems. IEEE Trans. Neural Netw. Learn. Syst. 24(10), 1660–1676 (2013) 19. 
Guan, Z., Li, Y., Xue, Z., Liu, Y., Gao, H., Shao, Y.: Federated graph neural network for crossgraph node classification. In: 2021 IEEE 7th International Conference on Cloud Computing and Intelligent Systems, CCIS, pp. 418–422 (2021) 20. Li, A., Li, Y., Shao, Y., Liu, B.: Multi-view scholar clustering with dynamic interest tracking. IEEE Trans. Knowl. Data Eng. 35, 1–14 (2023) 21. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images (2009) 22. Xiao, H., Rasul, K., Vollgraf, R.: Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv preprint arXiv:1708.07747 (2017) 23. Li, T., Sahu, A.K., Zaheer, M., et al.: Federated optimization in heterogeneous networks. Proc. Mach. Learn. Syst. 2, 429–450 (2020) 24. Deng, Y., Lyu, F., Ren, J., et al.: SHARE: shaping data distribution at edge for communicationefficient hierarchical federated learning. In: 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), pp. 24–34. IEEE (2021) 25. Wang, H., Kaplan, Z., Niu, D., et al.: Optimizing federated learning on non-IID data with reinforcement learning. In: IEEE INFOCOM 2020-IEEE Conference on Computer Communications, pp. 1698–1707. IEEE (2020)

Simulation and Implementation of Extended Kalman Filter Observer for Sensorless PMSM

Ke Ma, ChaoJun Gao(B), Jun Wang(B), and Qiang Zhang

School of Physics and Microelectronics, ZhengZhou University, ZhengZhou 450001, China
{gaochaojun,eejwang}@zzu.edu.cn

Abstract. This article utilizes the Extended Kalman Filter algorithm as the observer of the entire motor control system to estimate the angle and speed of the motor rotor. The algorithm achieves precise tracking of rotor position and speed, especially in low-speed or startup scenarios; it has good low-speed performance and can quickly achieve closed-loop control of the motor angle loop. Furthermore, the FOC vector control strategy is employed to realize dual-loop control of the motor control system, thereby achieving control of the position-sensorless permanent magnet synchronous motor. In this paper, we design and construct a simulation model for vector control of a permanent magnet synchronous motor based on the Extended Kalman filter algorithm, and build a hardware experimental platform to compare the experimental results with those of the Sliding Mode Observer-based control algorithm for verification. Keywords: Permanent magnet synchronous motor · Extended Kalman filter · Sensorless control · Angle loop

1 Introduction

The permanent magnet synchronous motor is an electric motor that converts electrical energy into mechanical energy, consisting of a stator and a rotor. Compared with other types of motors, it offers high efficiency, high power density, high torque, and high control precision [1], and has therefore been widely used in fields such as electric vehicles, precision machine tools, and industrial robots. To control the permanent magnet synchronous motor precisely, the position and speed of the rotor are usually required. The common approach obtains this rotor information through physical sensors such as Hall sensors and encoders [2]. These sensors not only increase the weight of the motor but also greatly increase its cost; moreover, their poor environmental adaptability limits the application scenarios of the permanent magnet synchronous motor. Sensorless control of the permanent magnet synchronous motor has therefore become a hot topic in motor control. Currently, the main approaches to sensorless control include the direct calculation method, model reference adaptive control, sliding mode observer control, the high-frequency injection method, artificial intelligence algorithms, and Extended Kalman Filter control [3]. The following is a brief introduction to sliding mode observer control and Extended Kalman filter control.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 145–162, 2023. https://doi.org/10.1007/978-981-99-6187-0_15

1.1 Sliding Mode Observer Control

The Sliding Mode Observer (SMO) is a combination of observer theory and sliding mode variable structure control theory [4, 5]. Robustness is its distinguishing characteristic [6]. Unlike the usual continuous control methods, the sliding mode variable structure is nonlinear; its special sliding mode control law lets the system state slide to the desired point along a pre-set phase trajectory. Since this phase trajectory is independent of the control object's parameters and of external disturbances, the algorithm responds quickly and is insensitive to internal parameter variations and external disturbances, which ensures that the system is asymptotically stable. The simplicity of the sliding mode variable structure algorithm is another major advantage, so it is easy to implement in engineering [7]. However, when the system is in steady state, the state point of the sliding mode observer keeps moving back and forth within the stable region, producing considerable high-frequency noise in its output signal [8]. In addition, since the sliding mode observer cannot obtain current information when the motor starts, and thus cannot output rotor position information, a forced-drag start is generally used to bring the motor to a certain speed before switching to the sliding mode observer, which reduces the operating efficiency of the motor.

1.2 Extended Kalman Filter Control

As an optimal estimation algorithm, the Extended Kalman filter (EKF) eliminates the model error of the system variables; its estimation error is small and is not affected by the estimated values of the system variables [9]. The algorithm processes the system variables in real time using a recursive computation. It also has good dynamic performance and strong robustness, and can work properly under very unfavorable conditions. The literature [10, 11] shows that the algorithm converges quickly and has good speed detection and position tracking capabilities. In addition, it has good low-speed performance and can output rotor position information from the moment the motor starts, enabling closed-loop angle control at start-up. This paper combines the Extended Kalman filter algorithm with the FOC algorithm and the auxiliary current and speed PI control loops, compares the control effect of the sliding mode observer and the Extended Kalman filter on the permanent magnet synchronous motor in simulation and experiment, and demonstrates the effectiveness and stability of the Extended Kalman filter algorithm.


2 Extended Kalman Filter Algorithm in PMSM

In this article, we adopt the extended Kalman model of the permanent magnet synchronous motor in the two-phase stationary coordinate system $\alpha$-$\beta$. The voltage equation of the surface-mounted permanent magnet synchronous motor in the $\alpha$-$\beta$ axes is presented in [11, 12]:

$$\begin{cases} u_\alpha = Ri_\alpha + L_s \dfrac{di_\alpha}{dt} - \omega_e \psi_f \sin\theta_e \\ u_\beta = Ri_\beta + L_s \dfrac{di_\beta}{dt} + \omega_e \psi_f \cos\theta_e \end{cases} \quad (1)$$

where $R$ is the per-phase stator winding resistance, $L_s$ is the per-phase stator winding inductance, $\psi_f$ is the flux linkage of the permanent magnet rotor, $\omega_e$ is the electrical angular velocity of the rotor, and $i_\alpha$, $i_\beta$ are the outputs of the Clark transform. This can be rearranged into the current equation:

$$\begin{cases} \dfrac{di_\alpha}{dt} = -\dfrac{R}{L_s} i_\alpha + \omega_e \dfrac{\psi_f}{L_s}\sin\theta_e + \dfrac{u_\alpha}{L_s} \\ \dfrac{di_\beta}{dt} = -\dfrac{R}{L_s} i_\beta - \omega_e \dfrac{\psi_f}{L_s}\cos\theta_e + \dfrac{u_\beta}{L_s} \end{cases} \quad (2)$$

With a very short sampling period, the rotor speed $\omega_e$ can be assumed constant, $\dot\omega_e = 0$, i.e.:

$$\begin{cases} \dfrac{d\omega_e}{dt} = 0 \\ \dfrac{d\theta_e}{dt} = \omega_e \end{cases} \quad (3)$$

Since the PM synchronous motor is a typical nonlinear system, it can be represented in general terms by the state update equation and the observation equation:

$$\begin{cases} \dfrac{d}{dt}x = f(x(t)) + Bu(t) + V(t) \\ y = Cx(t) + W(t) \end{cases} \quad (4)$$

where $V(t)$ is the system noise and $W(t)$ is the measurement noise. The state variable $x$, the input variable $u$, and the output variable $y$ are chosen as:

$$x = \begin{bmatrix} i_\alpha \\ i_\beta \\ \omega_e \\ \theta_e \end{bmatrix}, \quad u = \begin{bmatrix} u_\alpha \\ u_\beta \end{bmatrix}, \quad y = \begin{bmatrix} i_\alpha \\ i_\beta \end{bmatrix} \quad (5)$$

and in Eq. (4):

$$f(x(t)) = \begin{bmatrix} -\frac{R}{L_s} i_\alpha + \omega_e \frac{\psi_f}{L_s}\sin\theta_e \\ -\frac{R}{L_s} i_\beta - \omega_e \frac{\psi_f}{L_s}\cos\theta_e \\ 0 \\ \omega_e \end{bmatrix} \quad (6)$$

$$B = \begin{bmatrix} \frac{1}{L_s} & 0 \\ 0 & \frac{1}{L_s} \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \quad (7)$$

To apply Eq. (4) in the Kalman filter algorithm, $f(x(t))$ must be linearized: it is Taylor-expanded at $\hat x$, keeping only the first-order term, to obtain the Jacobian matrix:

$$F(t) = \frac{\partial f}{\partial x} = \begin{bmatrix} -\frac{R}{L_s} & 0 & \frac{\psi_f}{L_s}\sin\theta_e & \omega_e\frac{\psi_f}{L_s}\cos\theta_e \\ 0 & -\frac{R}{L_s} & -\frac{\psi_f}{L_s}\cos\theta_e & \omega_e\frac{\psi_f}{L_s}\sin\theta_e \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \quad (8)$$

Let the sampling time be $T_s$; discretizing Eq. (4) then yields:

$$\begin{cases} x_k = f(x_{k-1}, u_{k-1}, w_{k-1}) = (I + FT_s)x_{k-1} + BT_s u_{k-1} + w_{k-1} \\ y_k = C_k x_k + v_k \end{cases} \quad (9)$$

Let $I + FT_s = \varphi_k$; then

$$\varphi_k = \begin{bmatrix} 1-\frac{R}{L_s}T_s & 0 & \frac{\psi_f}{L_s}\sin\theta_e(k)\,T_s & \omega_e(k)\frac{\psi_f}{L_s}\cos\theta_e(k)\,T_s \\ 0 & 1-\frac{R}{L_s}T_s & -\frac{\psi_f}{L_s}\cos\theta_e(k)\,T_s & \omega_e(k)\frac{\psi_f}{L_s}\sin\theta_e(k)\,T_s \\ 0 & 0 & 1 & 0 \\ 0 & 0 & T_s & 1 \end{bmatrix} \quad (10)$$

The Extended Kalman filtering algorithm is divided into two stages, state prediction and state correction, as follows:

(1) From the input $u(k)$ and the posterior estimate $\hat x(k-1)$ of the state at the previous instant, the prior estimate $\hat x(k)^-$ is obtained:

$$\hat x(k)^- = \hat x(k-1) + \left[f(\hat x(k-1)) + B(k)u(k)\right]T_s \quad (11)$$

It is worth noting that the linearized system $F(k)$ is not needed in the prediction phase; the nonlinear $f(\cdot)$ should be used so that the prediction is more accurate.

(2) Calculate the prior error covariance matrix:

$$\hat P(k)^- = \left[I + T_s F(k)\right]\hat P(k-1)\left[I + T_s F(k)\right]^T + Q \quad (12)$$

(3) Calculate the extended Kalman gain matrix:

$$K(k) = \hat P(k)^- C^T \left[ C \hat P(k)^- C^T + R \right]^{-1} \quad (13)$$

(4) Obtain the posterior estimate $\hat x(k)$ of the state variable. This step is generally referred to as the correction (filtering) of the prior estimate $\hat x(k)^-$:

$$\hat x(k) = \hat x(k)^- + K(k)\left[ Y(k) - C\hat x(k)^- \right] \quad (14)$$

(5) Update the error covariance matrix:

$$\hat P(k) = \left(I - K(k)C\right)\hat P(k)^- \quad (15)$$

In the above, $R$ is the measurement noise covariance matrix and $Q$ is the system noise covariance matrix, i.e.:

$$\begin{cases} \mathrm{cov}(V) = E(VV^T) = Q \\ \mathrm{cov}(W) = E(WW^T) = R \end{cases} \quad (16)$$

The five formulas (11) to (15) must all be computed within each sampling period $T_s$; by cycling through them, the posterior estimate of the state at each instant, i.e., the optimal estimate, is obtained. Since the Extended Kalman filter is a recursive optimal estimator, when the motor starts from standstill an initial value $x_0$ must be assigned to the state variable $x$. The value of $x_0$ has little effect on the filtering result and is generally set to 0; the initial error covariance matrix $P_0$ should be chosen relatively large; the matrices $Q$ and $R$ are obtained by repeated trials under the constraint that the system converges stably. The specific values are:

$$x_0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad P_0 = \begin{bmatrix} 0.1 & 0 & 0 & 0 \\ 0 & 0.1 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 0.1 \end{bmatrix} \quad (17)$$

$$Q = \begin{bmatrix} 0.1 & 0 & 0 & 0 \\ 0 & 0.1 & 0 & 0 \\ 0 & 0 & 0.1 & 0 \\ 0 & 0 & 0 & 0.01 \end{bmatrix}, \quad R = \begin{bmatrix} 0.2 & 0 \\ 0 & 0.2 \end{bmatrix} \quad (18)$$

Incorporating the Extended Kalman Filter algorithm described above into the FOC vector control strategy, it serves as an observer that estimates the angle and speed of the permanent magnet synchronous motor rotor. Figure 1 depicts the sensorless PMSM vector control system utilizing the Extended Kalman Filter observer.

3 Simulation Analysis

3.1 Model of Simulation System

Figure 2 displays the simulation system of the Extended Kalman filter-based permanent magnet synchronous motor built in MATLAB/Simulink. In this system, the Extended Kalman filter algorithm is studied through simulation and compared with the control system based on the Sliding mode observer. Table 1 lists the motor parameters used in the simulation.


Fig. 1. Vector control system of non-inductive permanent magnet synchronous motor with Extended Kalman filter observer.

Fig. 2. Vector Control Diagram of Permanent Magnet Synchronous Motor based on Extended Kalman Filter Algorithm.

Table 1. Simulation model motor parameters.

Motor parameter           Numerical value   Unit
Stator phase resistance   0.6               Ω
Armature inductance       1.4 * 10−3        H
Flux linkage              0.034182          Wb
Moment of inertia         1.1 * 10−5        kg·m2
Damping coefficient       1 * 10−4          N·m·s
Friction torque           0.02              N·m

The “foc_control” module in Fig. 3 includes the Clark, Park, and inverse Park transformations, the SVPWM module, the current loop PI, and the EKF estimation module. The SVPWM module is written using a MATLAB Function for ease of debugging, parameter modification, and portability. The EKF module is written using an S-Function for the same reasons.

Fig. 3. “foc_control” module.

3.2 Simulation Results

The rated DC voltage of the model is 24 V, and the total simulation time is 5 s. The motor is started without load, and the reference speed is set to 30 r/s from 0 to 2 s. At 1.2 s, a load of 0.12 N·m is added to the motor. At 3 s, the reference speed is set to 40 r/s. Table 2 provides the parameters of the PI controllers used in the simulation.

Table 2. Parameter selection of simulation model.

PI controller     KP     KI
Speed loop        0.01   20
d current loop    3.57   1620
q current loop    3.57   1620
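The speed and current regulators parameterized above are standard discrete PI controllers. The sketch below uses the gain values from Table 2, but the sampling time, saturation limits, and input samples are illustrative assumptions, not values given in the paper:

```python
class PI:
    """Discrete PI regulator with output clamping as simple anti-windup
    (a generic sketch; the saturation limits are illustrative)."""
    def __init__(self, kp, ki, ts, lim):
        self.kp, self.ki, self.ts, self.lim = kp, ki, ts, lim
        self.integral = 0.0

    def step(self, ref, meas):
        err = ref - meas
        self.integral += self.ki * err * self.ts
        # clamp the integrator so it cannot wind up during saturation
        self.integral = max(-self.lim, min(self.lim, self.integral))
        out = self.kp * err + self.integral
        return max(-self.lim, min(self.lim, out))

# Gains from Table 2; Ts and limits are assumed values
speed_loop = PI(kp=0.01, ki=20.0, ts=1e-4, lim=2.4)     # outputs q-axis current ref
iq_loop    = PI(kp=3.57, ki=1620.0, ts=1e-4, lim=12.0)  # outputs q-axis voltage ref

iq_ref = speed_loop.step(ref=188.4, meas=0.0)  # full speed error at startup (rad/s)
uq_ref = iq_loop.step(ref=iq_ref, meas=0.0)
print(iq_ref, uq_ref)
```

In the dual-loop structure the outer speed loop feeds the inner q-axis current loop exactly as chained here, one update per sampling period.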

Figure 4 presents a comparison between the measured rotor speed and the reference speed given to the motor system model. The measured speed is the real-time speed output by the motor model: line 1 in the figure indicates the given reference speed, and line 2 the actual measured speed. It can be observed that the motor speed reaches the given reference of 30 r/s (188.4 rad/s) at around 0.2 s; at 1.2 s, the measured speed fluctuates due to the sudden load, but is regulated back to the reference at around 1.4 s; and 0.2 s after the reference is changed, the motor reaches the new reference speed of 40 r/s (251.3 rad/s).


Fig. 4. Measurement of motor speed and reference speed.

Fig. 5. Motor measurement speed and Estimated speed.

The 1st and 2nd lines in Fig. 5 represent the measured speed and the extended Kalman estimated speed of the motor, respectively. Figure 6 shows the difference between the measured and estimated speeds. From these two figures, there is a significant difference between the measured and estimated values when the motor is just starting up or changing speed, but the estimate quickly follows the measurement within a short period of time. When the load changes suddenly, the error fluctuation between the measured and estimated values is relatively large. When the motor operates stably, the estimation error stays within about 2 rad/s.

Fig. 6. Motor speed estimation error.

Lines 1 and 2 in Fig. 7 represent the measured angle of the motor and the extended Kalman estimated angle, respectively. Lines 1 and 2 in Fig. 8 show the measured angle of the motor and the angle estimated by the Sliding mode observer. The comparison shows that the angle estimated by the Sliding mode observer lags significantly behind the measured angle. This is because the back-EMF estimated by the Sliding mode observer is passed through a low-pass filter, which introduces a phase difference into the angular information and thus a lag in the estimated angle. In addition, because the angle loop cannot close at startup when using the Sliding mode observer, the motor must be started with a forced driving step, and the first few angle ramps in Fig. 8 rise noticeably more slowly than the later ones.

Fig. 7. Measured and estimated angles based on EKF.

Fig. 8. Measured and estimated angles based on SMO.

Figure 9 shows the three-phase current curves in the stator winding during motor operation based on the Extended Kalman filter, and Fig. 10 shows the corresponding curves based on the Sliding mode observer. In comparison, the Sliding mode observer-based drive requires a higher current in the early stage to drag the motor, because of the forced driving step required for starting, while the Extended Kalman filter-based drive does not.

Fig. 9. The three-phase current curve based on EKF.

The simulation results show that, compared to the Sliding mode observer, the Extended Kalman filter algorithm can accurately estimate the rotor angle of the permanent magnet synchronous motor in real time and can close the angle loop when the motor is started, eliminating the need for a forced driving step. The speed estimated by the Extended Kalman filter lags slightly during motor startup or sudden load changes, but it still settles quickly, and the speed tracking error is small when the motor runs steadily.

Fig. 10. The three-phase current curve based on SMO.

4 Experimental Verification

4.1 Hardware Circuit Design

As shown in Fig. 11, a microcontroller-based permanent magnet synchronous motor control system is designed around the STM32F446 main control chip; it consists mainly of a control signal circuit part and a power circuit part. The main control chip performs the current sampling, the entire FOC algorithm, the EKF algorithm calculation, PWM control signal generation, and communication with external devices and the host computer. The DRV8301 chip integrates gate drivers and two operational amplifiers internally; it drives the MOSFETs and amplifies the two sampled phase-current signals.

Fig. 11. The control system for permanent magnet synchronous motor.

4.2 Software Programming

Main Program Design. The main program initializes the system peripherals, interrupts, and the FOC and Extended Kalman Filter algorithms, and configures the parameters of the DRV8301. In the main loop, the system waits for interrupts and drives the OLED display to show data. Figure 12 shows the flowchart of the main program.

Fig. 12. Main program flow chart.

ADC Sampling. The entire FOC algorithm, including the Extended Kalman Filter algorithm and the current loop, is computed in each ADC sampling interrupt service routine. In this paper, the regular channel samples the U, V, and W three-phase voltages, and the sampled data is stored in memory through DMA. At the same time, the injected channel samples the motor stator winding currents, bus voltage, and bus current; it has a higher sampling priority, which makes the stator current and bus voltage samples more accurate. In addition, since the FOC algorithm is driven by the stator currents, to ensure ADC accuracy the zero-drift current value must be read after the system powers on and before the motor starts. Therefore, while the motor is running, the zero-drift value is subtracted from each sampled stator current before the value enters the algorithm. Figure 13 shows the flowchart of the ADC sampling interrupt.
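The zero-drift compensation described above can be sketched as follows. The actual firmware is C on the STM32; this Python sketch only illustrates the idea, and the raw ADC counts are invented:

```python
def calibrate_offset(samples):
    """Average N ADC readings taken before the motor starts to obtain
    the zero-drift (offset) value of a current channel."""
    return sum(samples) / len(samples)

def corrected_current(raw, offset, gain=1.0):
    """Subtract the stored offset from each running sample before it
    enters the FOC/EKF calculation; gain converts counts to amperes."""
    return (raw - offset) * gain

# Invented raw ADC counts idling around a mid-scale offset of ~2048
idle = [2047, 2049, 2048, 2050, 2046]
off = calibrate_offset(idle)          # -> 2048.0
print(corrected_current(2148, off))   # -> 100.0 counts above true zero
```

Averaging several idle samples rather than taking a single reading keeps the stored offset from being corrupted by measurement noise.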

5 Analysis of Experimental Results

To verify the efficacy of the Extended Kalman filtering algorithm in the permanent magnet synchronous motor control system, a motor with the parameters listed in Table 3 was used. Figures 14 and 15 show the angle waveform estimated by the Extended Kalman filter and the angle waveform obtained by the Hall sensor, respectively. The waveform in Fig. 15 fluctuates because the system forcibly updates the angle position every 60°, while the angle between forced updates is calculated from the velocity. The two waveforms basically match, indicating that the parameters of the R matrix are chosen appropriately.


Fig. 13. ADC sampling interrupt flowchart.

Table 3. Test motor parameters.

Motor parameter           Numerical value   Unit
Rated voltage             24                V
Rated current             2.4               A
Number of pole pairs      2                 pair
Number of phases          3                 phase
Rated speed               3000              RPM
Stator phase resistance   0.6               Ω
Armature inductance       1.4 * 10−3        H
Flux linkage              0.034182          Wb

Figure 16 shows the angle estimation waveform of the Extended Kalman filter during motor startup, and Fig. 17 shows that of the Sliding mode observer. It is obvious that, with the Sliding mode observer, the angle climbs slowly and the waveform appears deformed during startup because of the forced driving step, which makes the motor operation unstable.


Fig. 14. The estimated angle waveform based on EKF.

Fig. 15. Angle waveform obtained by hall sensor.

Fig. 16. Angle estimation waveform based on EKF at startup.


Fig. 17. Angle estimation waveform based on SMO at startup.

The phase currents of the permanent magnet synchronous motor are sampled using a current clamp. Figure 18 shows the oscilloscope waveform of the starting current of phase U when the Extended Kalman filter is used, and Fig. 19 shows the same waveform when the Sliding mode observer is used. It is evident that with the Sliding mode observer the U-phase starting current transitions from a forced driving step to an angle closed loop, while with the Extended Kalman filter the angle closed loop is achieved without the forced driving step, thus improving the motor's operating efficiency. Figure 20 displays the duty cycle waveform output by the SVPWM module, which resembles a three-phase saddle wave; this indicates that the SVPWM module is functioning properly and provides suitable duty cycles for the PWM. Figure 21 compares the waveform of Iα obtained through the Clark transformation (line 1) with the estimated Iα output by the Extended Kalman filtering algorithm (line 2). It can be observed that the estimated Iα accurately tracks the actual Iα, indicating that the parameters of the R matrix were selected appropriately. The initial speed of the motor is set to 30 r/s and increased by 5 r/s with each acceleration, with a sudden load added while the motor runs steadily at 40 r/s. Figure 22 shows the speed waveform of the motor. The time taken to reach the specified speed of 188.36 rad/s (29.97 r/s) from motor start is approximately 0.3 s, while the two accelerations, to 219.81 rad/s (34.98 r/s) and 251.2 rad/s (39.98 r/s), take approximately 0.2 s and 0.21 s, respectively; the motor takes approximately 0.35 s to return to the original speed when the load is added. This basically meets the design requirements. The experimental results are basically consistent with the simulation results, proving the correctness and effectiveness of the designed controller.
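The paper implements SVPWM as a MATLAB Function and in firmware; one equivalent way to obtain the three-phase saddle-shaped duty cycles of the kind seen in Fig. 20 is min-max zero-sequence injection, sketched below (an illustrative equivalent, not the paper's code; the modulation index and sample count are arbitrary):

```python
import math

def svpwm_duties(m, theta):
    """Duty cycles (0..1) for modulation index m at electrical angle theta,
    using min-max zero-sequence injection, which is equivalent to SVPWM
    and produces the characteristic three-phase saddle waveform."""
    phases = [m * math.cos(theta - k * 2 * math.pi / 3) for k in range(3)]
    zero_seq = -(max(phases) + min(phases)) / 2   # injected common-mode term
    return [0.5 + (p + zero_seq) / 2 for p in phases]

# Sample one electrical period at an illustrative modulation index of 0.9
duties = [svpwm_duties(0.9, 2 * math.pi * n / 100) for n in range(100)]
da = [d[0] for d in duties]
print(min(da), max(da))   # stays within [0, 1] even at m = 0.9
```

The injected common-mode term flattens the sinusoid tops into the saddle shape and extends the linear modulation range by about 15% compared with plain sinusoidal PWM.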


Fig. 18. Phase current start-up waveform based on EKF.

Fig. 19. Phase current start-up waveform based on SMO.


Fig. 20. The three-phase saddle waveform of SVPWM.

Fig. 21. Comparison of the waveforms of the actual Iα and the estimated Iα.

Fig. 22. Motor speed estimation waveform.


6 Conclusion

This article provides the theoretical derivation for applying the Extended Kalman filter algorithm to permanent magnet synchronous motors. The feasibility of the algorithm has been verified through MATLAB simulation, and the algorithm has been implemented by building a control circuit for the permanent magnet synchronous motor. This enables closed-loop control of the motor's angle loop without forced driving at startup, thus improving the motor's operating efficiency. Considering the large computational cost of the Extended Kalman filter algorithm and the high design cost of using a Cortex-M4 core as the main control chip, future work can optimize the algorithm to improve its computation speed and reduce its memory usage, so that it can run on lower-cost main control chips.

References

1. Zhao, Y., Liu, X., Yu, H.: Model-free adaptive discrete-time integral terminal sliding mode control for PMSM drive system with disturbance observer. IET Electr. Power Appl. 14(10), 1756–1765 (2020)
2. Krishnan, R.: Permanent Magnet Synchronous and Brushless DC Motor Drives. CRC Press, Boca Raton (2017)
3. Yousfi, D., Halelfadl, A., El Kard, M.: Sensorless control of permanent magnet synchronous motor. In: 2009 International Conference on Multimedia Computing and Systems, pp. 341–344. IEEE (2009)
4. Saadaoui, O., Khlaief, A., Abassi, M., Tlili, I., Chaari, A., Boussak, M.: A new full-order sliding mode observer based rotor speed and stator resistance estimation for sensorless vector controlled PMSM drives. Asian J. Control 21(3), 1318–1327 (2019)
5. Zhang, S., Zhao, P., Du, X., Jin, J., Liu, H.: Research on sensorless control strategy of PMSM based on an improved sliding mode observer. In: 2017 2nd International Conference on Materials Science, Machinery and Energy Engineering (MSMEE 2017), pp. 942–946. Atlantis Press (2017)
6. Liang, D., Li, J., Qu, R.: Sensorless control of permanent magnet synchronous machine based on second-order sliding-mode observer with online resistance estimation. IEEE Trans. Ind. Appl. 53(4), 3672–3682 (2017)
7. Apte, A., Mehta, H., Joshi, V., Walambe, R.: Sensorless vector control of PMSM using SMO and NLDO. In: 2017 IEEE International Symposium on Sensorless Control for Electrical Drives (SLED), pp. 127–132. IEEE (2017)
8. Lu, H., Wu, J., Li, M.: A new sliding mode observer for the sensorless control of a PMLSM. In: 2017 29th Chinese Control and Decision Conference (CCDC), pp. 5364–5369. IEEE (2017)
9. Michalski, T., Lopez, C., Garcia, A., Romeral, L.: Sensorless control of five phase PMSM based on extended Kalman filter. In: IECON 2016 - 42nd Annual Conference of the IEEE Industrial Electronics Society, pp. 2904–2909. IEEE (2016)
10. Bado, A., Bolognani, S., Zigliotto, M.: Effective estimation of speed and rotor position of a PM synchronous motor drive by a Kalman filtering technique. In: PESC 1992 Record, 23rd Annual IEEE Power Electronics Specialists Conference, pp. 951–957. IEEE (1992)
11. İnan, R., Üzüm, O.M.: Speed-sensorless DTC of BLDC motor with EKF-based estimator capable of load torque estimation for electric vehicle. Avrupa Bilim ve Teknoloji Dergisi 42, 6–13 (2022)
12. Bolognani, S., Oboe, R., Zigliotto, M.: Sensorless full-digital PMSM drive with EKF estimation of speed and rotor position. IEEE Trans. Ind. Electron. 46(1), 184–191 (1999)

Research and Application of Comprehensive Health Assessment Based on Production Equipment of Bulk Cargo Terminal

Xin Li1(B), Xuliang Tang2, and Renhui Chen3

1 SDIC Caofeidian Port Co., Ltd., Tangshan 063210, China
[email protected]
2 Shanghai Institute of Technology, Shanghai 201418, China
3 Hangzhou Branch of Shanghai Jiudao Information Technology Co., Ltd., Hangzhou 310012, China

Abstract. As the main productive capacity of a port, the health status of the production equipment in a bulk cargo terminal directly affects the stability and economic benefits of the equipment. Through the research and application of comprehensive health assessment for production equipment, the overall health status of the equipment can be grasped comprehensively, accurately, and in a timely manner. This study establishes the LIFEO equipment health evaluation model through factor analysis and weight allocation over key indicators from five dimensions: the equipment Life cycle, Inspection, Failure, Environment, and Operation. Historical and current data of bulk cargo terminal equipment are analyzed and summarized, and the effectiveness and feasibility of the model are verified against actual conditions, providing a dynamic and sustainable whole-process evaluation as well as a scientific basis and decision support for the management and maintenance of production equipment in bulk cargo terminals. Keywords: Bulk Cargo Terminal · Production Equipment · Equipment Health Assessment · Data Analysis · Sustainability

1 Introduction

Bulk cargo terminals are important gateways for the import and export of goods [1], and production equipment plays a key role in the normal operation of a terminal [2]. However, because the equipment is varied in type, frequently used, and exposed to adverse environmental conditions, its management is relatively complex and its health status is easily degraded, leading to increased failure rates, reduced production efficiency, and even threats to the safe operation of the terminal [3]. Health assessment of production equipment in bulk cargo terminals therefore has both theoretical significance and practical value. Traditional health assessment methods focus mainly on equipment failures and maintenance records, or on real-time status data, while ignoring other factors related to equipment health, such as the equipment life cycle and its environment [4].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 163–171, 2023. https://doi.org/10.1007/978-981-99-6187-0_16

164

X. Li et al.

Therefore, based on the multi-dimensional characteristics of equipment health, this study proposes a comprehensive evaluation model called LIFEO, which covers five dimensions of the equipment: life cycle, inspection status, fault status, environment, and operation. Through factor analysis, weight allocation, and dynamic, continuously optimized adjustment strategies, the model comprehensively evaluates the health status of equipment, aiming to improve the health management level and production efficiency of bulk cargo terminal equipment.

Many researchers at home and abroad have already carried out related work on equipment health evaluation. In China, evaluation models and methods have been proposed mainly from the perspectives of equipment failure prediction [5], health monitoring [6], and life cycle management [7]. Abroad, developed countries have accumulated rich experience in equipment health evaluation, such as maintenance optimization in the United States and intelligent maintenance in Germany [8]. However, domestic research on the health assessment of production equipment for bulk cargo terminals remains relatively weak, with considerable room for improvement in three main respects. First, equipment health assessment is a dynamic process, yet existing work often ignores its sustainability and stays at one-time or phased evaluation [9]. Second, an overemphasis on universality tends to overlook the need to integrate model building with the specific on-site business conditions, leading to deviation from the goal and only average operability in practical field application [10].
Third, too much emphasis is placed on real-time status data collected from devices, with insufficient consideration of the difficulty, speed, and quality of data acquisition and transmission in the field environment, and insufficient attention to other types of data, such as the life cycle attributes of the equipment itself and its historical data, which are often essential for accurately evaluating equipment health.

This study therefore proposes the concept and application of LIFEO comprehensive equipment health assessment. The model takes the life cycle, inspection, failure, environment, and operation of the equipment as its five basic modules and comprehensively considers the characteristics of each. This helps evaluate the health status of equipment more comprehensively and accurately. The LIFEO model is highly adaptable: its indicators and weights can easily be adjusted to the actual situation. At the same time, the model can quickly identify the modules with the greatest impact and, within them, the indicators with the greatest impact. It thus has both theoretical and practical significance for improving the health management of bulk cargo terminal equipment, reducing failure rates, and improving production efficiency and safety.

Research and Application of Comprehensive Health Assessment

165

2 Design of Comprehensive Health Assessment for Equipment

2.1 Principles for the Overall System of Equipment Comprehensive Health Assessment

The LIFEO comprehensive health assessment is designed to be comprehensive, dynamic, sustainable, and independent, covering the entire life cycle of the equipment. It is equipment-based and data-centered, adhering to a data-driven view of the business, and forms a closed loop of data application along the thread of collecting, organizing, and applying data. The assessment draws on basic information, operation data, historical data, and external data across the equipment life cycle; the design is divided into modules whose indicators can be combined loosely or tightly to realize the final function.

The LIFEO model is divided into five basic modules: life cycle, inspection, failure, environment, and operation. The equipment health score has a full score of 100 points and is obtained by multiplying the score of each module by the weight of the corresponding module:

S = Sl ∗ Wl + Si ∗ Wi + Sf ∗ Wf + Se ∗ We + So ∗ Wo    (1)

where S is the total equipment health score, Sl and Wl are the score and weight of the life cycle module, Si and Wi of the inspection module, Sf and Wf of the fault module, Se and We of the environment module, and So and Wo of the operation module. S lies in [0, 100] and Wl + Wi + Wf + We + Wo = 1.

Each module also uses 100 points as its benchmark score, obtained by summing the scoring results of the indicators that affect health within the module. Each indicator is scored from its evaluation frequency and calculation formula combined with its score-deduction rules, so qualitative analysis is transformed into a quantitative health score per indicator. The rules are set and optimized mainly through model measurement verification and expert opinion; the main methods used include the Analytic Hierarchy Process, expert surveys, the entropy method, and the mean square deviation method.

2.2 Main Indicators of Equipment Comprehensive Health Evaluation

Based on data accumulated for the main production equipment of multiple bulk cargo terminals in China, sorted by equipment type, together with equipment life cycle records, the influencing factors of the equipment were identified and ranked. Incorporating the opinions of port-industry experts, the main module indicators of the model were obtained after multiple rounds of adjustment and calculation tests. An independent score is formed for each module from its indicators, and these ultimately form the comprehensive health score of the equipment. The resulting indicator structure (Fig. 1) comprises:

- Life cycle indicators: equipment life cycle factor; equipment components and BOM factor; proportion of equipment operation period factor; ideal operating life factor.
- Inspection indicators: point patrol factor; joint inspection factor; maintenance factor.
- Fault indicators: MTBF; MTTR; fault frequency factor; fault level factor.
- Environmental indicators: dust factor; humidity factor; temperature factor.
- Operational indicators.

Fig. 1. Comprehensive health evaluation indicators for equipment.
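The module-weighted total score of Eq. (1) can be sketched in a few lines (the module scores and weights below are illustrative values, not the paper's data):

```python
def health_score(module_scores, module_weights):
    """Eq. (1): S = sum of module score * module weight, weights summing to 1."""
    assert abs(sum(module_weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(module_scores[m] * module_weights[m] for m in module_scores)

# Hypothetical scores (0-100) for the five LIFEO modules
scores = {"life_cycle": 82, "inspection": 85, "fault": 55,
          "environment": 95, "operation": 80}
weights = {"life_cycle": 0.15, "inspection": 0.15, "fault": 0.40,
           "environment": 0.10, "operation": 0.20}
S = health_score(scores, weights)
```

Because the weights sum to 1 and each module score lies in [0, 100], the total S is guaranteed to stay in [0, 100].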

2.3 Life Cycle Scoring Module

According to the structure and usage of the equipment, the following representative factors are selected to participate in the life cycle scoring calculation.

Proportion factor of equipment operation period: an a% ratio deduction rule is set against the design life of the equipment, with the ratio calculated from the operation time; the earlier the commissioning time, the greater the deduction.

Ideal operating life factor: an a% ratio deduction rule is set against the design life of the equipment, with the ratio calculated from the service life; the longer the service life, the greater the deduction.

A reference for the scoring module is shown in Table 1.
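A minimal sketch of one such a% ratio deduction rule (the 10% step size and 5-point deduction are hypothetical parameters, not the paper's actual rules):

```python
def ratio_deduction(elapsed_years, design_life_years, points_per_step=5):
    """Deduct points in proportion to design life consumed: each completed
    10% of the design life deducts `points_per_step` from the 100-point base."""
    ratio = min(elapsed_years / design_life_years, 1.0)
    steps = round(ratio * 100) // 10          # completed 10% steps
    return max(100 - steps * points_per_step, 0)

score_mid_life = ratio_deduction(12, 20)      # 60% of design life consumed
score_past_life = ratio_deduction(25, 20)     # past design life, ratio capped
```

The longer the equipment has run relative to its design life, the larger the deduction, matching the direction of both life cycle factors above.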

3 Comprehensive Health Assessment Model for Equipment

3.1 Running the Scoring Module

The LIFEO model supports defining multiple weight models. Each weight model can set different weights and indicator numbers for each dimension based on the characteristics of the stage in which the device is located or the strong correlation logic between the


Table 1. Life cycle scoring module.

| Module | Indicator Direction | Indicator | Rules | Indicator Scores | Module Scores | Module Weights |
|---|---|---|---|---|---|---|
| Life cycle | Device lifecycle | Indicator1 | Rule1 | 76 | 82 | 20% |
| | Components and BOM | Indicator2 | Rule2 | 90 | | |
| | Proportion of years of operation | Indicator3 | Rule3 | 86 | | |
| | Operating life factor | Indicator4 | Rule4 | 89 | | |

indicators, to meet the requirements of a specific stage or association. The LIFEO model filters the collected indicator data through the weight-model data rules and then substitutes the data into the weight model with the best matching degree; the equipment health score on the 100-point scale is calculated by running the weight-model function. The structure of the LIFEO model is shown in Fig. 2, in which indicator data from the five modules flow through the weight-model data rules into one of n weight models (each defining dimensions, weights, targets, calculation methods, indicator times, and deduction rules) to produce the equipment health score.

Fig. 2. LIFEO Model Structure.
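The rule-based selection among multiple weight models can be sketched as follows (the matching predicates on equipment type and age are hypothetical stand-ins for the paper's weight-model data rules):

```python
def pick_weight_model(equipment, models):
    """Return the first weight model whose data rule matches the equipment;
    a fuller system would rank all matching models by matching degree."""
    for model in models:
        if model["matches"](equipment):
            return model
    raise LookupError("no weight model matches this equipment")

models = [
    {"name": "belt_conveyor_mature",
     "matches": lambda eq: eq["type"] == "belt_conveyor" and eq["age_years"] >= 2,
     "weights": {"life_cycle": 0.15, "inspection": 0.15, "fault": 0.40,
                 "environment": 0.10, "operation": 0.20}},
    {"name": "default",  # fallback rule that matches everything
     "matches": lambda eq: True,
     "weights": {"life_cycle": 0.20, "inspection": 0.20, "fault": 0.20,
                 "environment": 0.20, "operation": 0.20}},
]

chosen = pick_weight_model({"type": "belt_conveyor", "age_years": 6}, models)
```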

3.2 Process Flow

The LIFEO model calculates the score of each factor from the indicator data of each module and the corresponding rules, and then calculates the health score result


of the device based on the weights. The model is validated by comparing its results with the actual health status of the device. If the validation fails, the model can be reconstructed after adjusting the corresponding indicator factors, scoring rules, and weights. After successful validation, targeted optimization can be carried out for indicators with increased fluctuation, based on comparing the equipment health scores over a period of time with the actual health status of the equipment. As monitoring technology improves, factors that previously could not be measured can be added to upgrade the evaluation model. The LIFEO model processing flow is shown in Fig. 3.


Fig. 3. LIFEO model processing process.

3.3 Practical Application Description

In practical application of the LIFEO model at bulk cargo terminals, the model has shown strong applicability. It can be applied to the eight main categories of terminal production equipment (belt conveyors, ship loaders, dumpers, mobile machinery, etc.). The number and weights of indicator factors in the health assessment model can be set flexibly according to the characteristics of each category and the historical and monitoring data available for each equipment type, and the indicator scoring rules can be conveniently adjusted based on horizontal comparison of similar devices. As shown in Table 2, LIFEO can set different indicator factors and weights for different types of equipment according to the actual situation.


Table 2. Different indicator factors and weights for different types of equipment.

| Indicator factor | Mobile machinery | Belt conveyor | Ship loader |
|---|---|---|---|
| Vibration factor | No | Yes | No |
| Temperature factor | No | Yes | No |
| Life cycle factor indicators | Yes | Yes | Yes |
| Point patrol inspection | No | Yes | Yes |

The LIFEO model can quickly identify the modules with the greatest impact. Table 3 shows the scoring results for a certain belt conveyor; it is immediately apparent that the fault module has the greatest impact.

Table 3. Scoring results of a certain belt conveyor equipment.

| Module | Module Weight | Index Scoring | Result Calculation | Equipment Scoring |
|---|---|---|---|---|
| Life cycle | 15% | −5 | 14.25 | 74.5 |
| Inspection | 15% | −15 | 12.75 | |
| Fault | 40% | −45 | 22 | |
| Environment | 10% | −5 | 9.5 | |
| Operation | 20% | −20 | 16 | |
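Table 3's columns combine as weight × (100 + deduction); a quick check that reproduces the table's result calculations and the 74.5 total:

```python
# (module weight, summed index deduction) pairs taken from Table 3
modules = {
    "life_cycle":  (0.15, -5),
    "inspection":  (0.15, -15),
    "fault":       (0.40, -45),
    "environment": (0.10, -5),
    "operation":   (0.20, -20),
}

# Each module contributes weight * (100 + deduction) to the equipment score
contributions = {m: w * (100 + d) for m, (w, d) in modules.items()}
equipment_score = sum(contributions.values())
```

The fault module's 22 points out of a possible 40 is the largest shortfall, which is why it stands out as the high-impact module.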

The LIFEO model can also quickly identify the influential indicators within a high-impact module. Table 4 shows the indicator scoring details of the fault module of the same belt conveyor; the fault level indicator clearly has the greatest impact (the four deductions sum to the module's −45).

Table 4. Indicator scoring details of the fault module of the belt conveyor equipment.

| Module | Indicator | Indicator Time | Rule Calculation | Score | Module Weight |
|---|---|---|---|---|---|
| Fault | Indicator1 | Scoring cycle | Rule1 | −5 | 40% |
| | Indicator2 | Scoring cycle | Rule2 | −5 | |
| | Indicator3 | Scoring cycle | Rule3 | −10 | |
| | Fault level | Scoring cycle | Rule4 | −25 | |

The LIFEO model supports indicator trend analysis. Figure 4 shows the historical trend of the belt conveyor fault level indicator; its score deduction is becoming increasingly severe, so the next step is to analyze which factors are causing the change. Through analysis, it is found that the


fault level of the belt conveyor shows a linearly increasing trend. Investigation of the on-site situation revealed that recent production volume has been high, so the belt conveyor could not be scheduled for normal downtime and maintenance, resulting in a gradual rise in the equipment's fault level.

Fig. 4. The historical trend of belt conveyor fault level indicators.

4 Conclusion

The LIFEO comprehensive equipment health assessment proposed in this study provides a new perspective and method for equipment management. Using a simple, reliable, practical, scientific, and accurate method covering multi-dimensional evaluation and analysis, it yields the comprehensive health status of the equipment together with suggested directions for improvement, supporting stable operation and health maintenance strategies and thereby improving the economic efficiency of equipment operation. Centered on the equipment and its multi-source data, it establishes a data-driven business concept, builds models suited to the on-site equipment types, divides the assessment by module, and constructs the LIFEO comprehensive health evaluation system, realizing genuine data assets, the integration of data and business, and data-driven operations. The LIFEO assessment can provide a reference for the management of production equipment and the safe, stable operation of bulk cargo terminals, and we believe it will have broad application prospects and play a full role in the socio-economic development of ports.

References

1. Geng, H., Jiang, X., Wang, D., Li, J.: Prediction and assessment of marine pollution risk from ships based on statistical analysis probability model. Am. J. Traffic Transp. Eng. 6(2), 34–42 (2021)


2. Luo, J.X.: Fully automatic container terminals of Shanghai Yangshan Port phase IV. Front. Eng. Manag. 6(3), 457–462 (2019) 3. Li, K., Lin, Y., Lu, C.: Aggregation-induced emission for visualization in materials science. Chem.–Asian J. 14(6), 715–729 (2019) 4. Bi, L., Wang, Z., Wu, Z., Zhang, Y.: A new reform of mining production and management modes under Industry 4.0: cloud mining mode. Appl. Sci. 12(6), 2781 (2022) 5. Lei, Y., Li, N., Guo, L., Li, N., Yan, T., Lin, J.: Machinery health prognostics: a systematic review from data acquisition to RUL prediction. Mech. Syst. Signal Process. 104, 799–834 (2018) 6. Seo, J., Han, S., Lee, S., Kim, H.: Computer vision techniques for construction safety and health monitoring. Adv. Eng. Inform. 29(2), 239–251 (2015) 7. Zhang, Y., Ren, S., Liu, Y., Sakao, T., Huisingh, D.: A framework for Big Data driven product lifecycle management. J. Clean. Prod. 159, 229–240 (2017) 8. Wen, Y., Rahman, M.F., Xu, H., Tseng, T.L.B.: Recent advances and trends of predictive maintenance from data-driven machine prognostics perspective. Measurement 187, 110276 (2022) 9. Ngiam, K.Y., Khor, W.: Big data and machine learning algorithms for health-care delivery. Lancet Oncol. 20(5), 262–273 (2019) 10. Carroll, J., Koukoura, S., McDonald, A., Charalambous, A., Weiss, S., McArthur, S.: Wind turbine gearbox failure and remaining useful life prediction using machine learning techniques. Wind Energy 22(3), 360–375 (2019)

Enhancing Resilience of Microgrid-Integrated Power Systems in Disaster Events Using Reinforcement Learning

Zhongyi Zha, Bo Wang(B), Lei Liu, and Huijin Fan

Key Laboratory of Ministry of Education for Image Processing and Intelligent Control, Artificial Intelligence and Automation School, Huazhong University of Science and Technology, Wuhan 430074, China
[email protected]

Abstract. Enhancing the resilience of power systems has become increasingly important to mitigate the negative impact of risk attacks. Integrating microgrids, small-scale power grids that can operate independently or in conjunction with the main grid, has emerged as an effective way to supply energy in emergencies and avoid power outages. Typhoons, risk events that are unavoidable yet relatively easy to predict, have become one of the standard contexts for examining power system resilience. In this paper, we propose a reinforcement learning algorithm that incorporates graph-theoretic knowledge to make decisions that optimize the resilience of a microgrid-integrated power system under a typhoon event. The algorithm effectively exploits graph-theoretic knowledge and reduces the number of decisions required in the optimization process. Numerical experiments show that, compared with traditional methods, the reinforcement learning algorithm responds more effectively to typhoon changes, achieves flexible and economical control, and enhances the resilience of power systems with microgrids.

Keywords: resilience · microgrid · power system · reinforcement learning

1 Introduction

Since the beginning of the 21st century, changes in the natural and social environment, as well as natural disasters, have threatened the stable operation of the power grid. In particular, climate change has increased the frequency and intensity of severe weather, which has become one of the main causes of power system failures in recent years. For example, from 2003 to 2012, severe weather events caused 679 power outages, each affecting at least 50,000 users. Investigations have shown that some power system failures occur in the transmission system, while most power outages occur in the distribution system [1]. Reports from the Electric Power Research Institute (EPRI) [1] and the North American Electric Reliability Corporation (NERC) [2] point out that improving the resilience of

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 172–183, 2023. https://doi.org/10.1007/978-981-99-6187-0_17

Enhancing Resilience

173

the power system is key to resisting the impact of such uncertain severe weather. Because of unpredictable factors, power systems can hardly avoid external threats during operation, and such threats can trigger power outages. In addition, power systems are typically directly connected to commercial and residential users, so any failure causes a partial interruption of the entire system and poses a substantial threat to society as a whole. Research on power system resilience, i.e., the ability to withstand damage and recover normal operation, has therefore become increasingly important.

Research has shown that incorporating distributed microgrids into the power system can enhance the overall system's resilience. For example, in [3], researchers proposed an elasticity-based microgrid switch deployment model to isolate potential failures and connect emergency generators to the system. In [4], Wei et al. developed a coordinated reinforcement-learning-based strategy for distributed microgrid generators, aimed at hardening the infrastructure to reduce the impact of extreme weather events on the power system.

Improving the resilience of power systems during a weather event is challenging because it is difficult to establish a state-action policy for continuously changing events, so this phase has received less research attention than the pre- and post-event phases. As in the optimal battery scheduling problem in microgrids, scenario-based stochastic programming and robust optimization struggle to yield direct state-action decision functions, owing to the unpredictability of the weather, the uncertainty of loads and renewable sources, and the real-time decision-making requirements. Many researchers therefore formulate resilience improvement during weather events as a Markov decision process [5]. To solve this Markov decision process, Wang et al. proposed a state-tree-based linear scalarization method [6] and verified that it can provide decision-making solutions for power systems during weather events under approximate conditions. With the development of artificial intelligence, approximate dynamic programming and reinforcement learning have become effective algorithms for such problems. Wang et al. used approximate dynamic programming to solve power system reconfiguration [7] and resilience improvement with distributed microgrids [8], establishing a state-action decision function during weather events and achieving rapid response and improved resilience. Meanwhile, Kamel et al. used the Q-learning algorithm, a deep reinforcement learning method, to relieve branch overloading in power systems [9] and improved the decision response speed.

During a typhoon event, the topology of the power system is vulnerable to line failures. To control the opening and closing of lines and change the topology at each moment, achieving system reconfiguration and improved resilience, we use deep reinforcement learning to cope with the random topological damage caused by the typhoon. Considering the performance of the power system and the cost of line operations, the goal is to reduce the losses caused by load shedding and line

174

Z. Zha et al.

operation. The main content of this paper is organized as follows: first, we describe the representation of the power system topology. Then, we model the power system reconfiguration problem as a Markov decision process, including decision variables and objective functions. Next, we explain the application of the deep reinforcement learning algorithm. Finally, a numerical simulation experiment verifies that the proposed deep reinforcement learning algorithm significantly improves the resilience of the power system compared with manual processing methods.

2 Optimization Model

2.1 Power System Topology

For a normally operating power system, a random disaster event can cause losses to its lines depending on their geographical locations. Specifically, the attacks caused by a weather event can be regarded as a sequential trajectory for the power system. In the topology of a power system, components such as loads and energy sources are typically represented as nodes, while the edges are distribution lines that can be opened or closed, as shown in Fig. 1. Let b_{i−j} denote the line connecting node i and node j, where the components at nodes i and j are either energy sources or loads, i, j ∈ N, and energy sources include renewable energy and the distribution network. Let F_t represent the set of lines that may be damaged at time t, and let F̂_t denote the set of lines actually damaged at time t, where F̂_t ⊆ F_t. In Fig. 1, F_{t3} = {b_{1−2}, b_{2−4}, b_{2−5}} while F̂_{t3} = {b_{1−2}}, because the losses of b_{1−2}, b_{2−4}, and b_{2−5} before t3 are uncertain, can only be predicted by probability, and can

Fig. 1. Example of power system topology under the disaster event


only be observed at t3. Let B represent the set of all lines; then the power system topology graph G_t at time t can be described as:

G_t = B − F̂_t    (1)
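Equation (1)'s topology update is a plain set difference; a sketch using Fig. 1-style line labels (the full line set here is an assumed example, not the paper's exact system):

```python
# Assumed line set B for a small example system, each line named (i, j)
B = {(1, 2), (1, 4), (2, 4), (2, 5), (3, 4)}

def topology(all_lines, damaged):
    """Eq. (1): G_t = B - F_hat_t, the lines still intact at time t."""
    return all_lines - damaged

F_hat_t3 = {(1, 2)}              # lines actually damaged at t3
G_t3 = topology(B, F_hat_t3)
```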

b_{i−j} is a binary variable, with 0 representing a broken (open) line and 1 representing a closed line. Three situations can change its value in the power system topology:

– Breakage caused by a disaster event: b_{i−j} changes from 1 to 0. Since line damage caused by the disaster event is random, its impact at each time t must be predicted; reference [10] presents many prediction methods for line damage in disaster events.
– Power grid reconfiguration: b_{i−j} is set by the intelligent control system of the power system. In this case, the topology is reconfigured as a way to enhance resilience.
– Repair: b_{i−j} changes from 0 to 1 after a period of time. b_{i−j} remains 0 until the repair is complete, then changes to 1.

2.2 Resilience Optimization Model

The Markov decision process is described by a quadruple ⟨S, A, Ω, R⟩. S represents the state of the Markov decision process; in the case of power grid reconfiguration, the state S = G is the topological structure of the power system. A represents the reconfiguration actions taken in the power system; as explained in the previous section, these actions change the topology and thus the load-carrying capacity. Ω represents the state transition probability of the Markov decision process. Let P(b_{i−j,t+1} | b_{i−j,t}, a_t, ω_t) denote the state transition probability of each line; then the transition probability of the Markov decision process is:

P(S_{t+1} | S_t, a_t) = ∏_{b∈B} P(b_{i−j,t+1} | b_{i−j,t}, a_t, ω_t)    (2)

P(S_{t+1} | S_t, a_t) is the state transition probability of the power system topology, while P(b_{i−j,t+1} | b_{i−j,t}, a_t, ω_t) is the state transition probability of each line. For example, when some lines have already been damaged or repaired, their state transition probability is 1; when a disaster event occurs, the damage to some lines is probabilistic, and the transition probability is then the probability of damage. Equation (2) shows that as the power system topology grows, the computational and decision-making complexity of the Markov decision process increases exponentially, which is one of the difficulties of this problem. Equation (2) describes the state transition probability of the future topology of the power system after the reconfiguration method is adopted.


Obviously, different topological states affect the performance of the power system, so the action taken at time t must also account for possible topological changes after time t. As shown in Fig. 1, at time t1 it is very likely that b_{1−2} will be damaged by the disaster event, which would disconnect nodes 2, 4, and 5 from the rest of the system and risk a power outage. However, if b_{1−4} is closed at time t0, the topology remains connected even if b_{1−2} is disconnected; power system reconfiguration can thus effectively avoid the risk of outage. The performance of the whole power system during a disaster event is therefore shaped by sequential decision-making. This study evaluates that performance from three perspectives: the load-carrying capacity of the power system, the line scheduling cost, and the operation cost:

C_t(S_t, a_t) = Σ_{p∈P} (η_{p,t} ΔL_p Δt) + Σ_{b∈B_t^d} c_{b,t} + Σ_{b∈B_t^d} c_{b,t}^op    (3)

C_t(S_t, a_t) represents the loss of the power system at time t, which also reflects its performance: the smaller the value, the better the performance. P is the set of loads that cannot operate normally, ΔL_p is the active power of load shedding, and η_{p,t} is the degree of attention paid to load p. The term Σ_{b∈B_t^d} c_{b,t} is the cost of using dispatchable lines, where B_t^d denotes the set of dispatchable lines at time t and c_{b,t} is the scheduling cost of line b. The term Σ_{b∈B_t^d} c_{b,t}^op is the cost of power system reconfiguration, where c_{b,t}^op denotes the operating cost of line b. The objective is to minimize the loss over an operating cycle, i.e., the optimization objective of the Markov decision process is:

min_{a_t} E( Σ_{t∈T} C_t(S_t, a_t, ω_t) )    (4)

That is, in a disaster event, the goal is to minimize the loss of the power system, i.e., maximize its performance.
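A sketch of the per-step loss C_t of Eq. (3); the shed loads, attention weights, and line costs below are all illustrative numbers, not the paper's data:

```python
def step_loss(shed_loads, dispatch_costs, op_costs, dt=1.0):
    """Eq. (3): load-shedding loss plus dispatchable-line cost plus
    reconfiguration operating cost.
    shed_loads: (eta_p, delta_L_p) pairs for loads that cannot be served."""
    load_loss = sum(eta * dL * dt for eta, dL in shed_loads)
    return load_loss + sum(dispatch_costs) + sum(op_costs)

# One step: two shed loads, two dispatchable lines in use, one switching operation
C_t = step_loss(shed_loads=[(1.0, 30.0), (2.0, 10.0)],
                dispatch_costs=[5.0, 5.0],
                op_costs=[2.0])
```

The objective (4) then minimizes the expected sum of these step losses over the operating cycle.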

3 Proposed Algorithm

3.1 Reinforcement Learning

Reinforcement learning is a class of algorithms based on Markov decision processes. As a type of machine learning, it learns through exploration and exploitation; therefore, the state transition probability function Ω of the quadruple ⟨S, A, Ω, R⟩ is not needed by the reinforcement learning algorithm. The state S, action A, and reward function R are the key considerations.


State. In reinforcement learning, the state S includes all known environmental information, so in addition to the power system topology it also includes disaster prediction information, power information, and component information. The state S is also the input of the deep neural network:

S_t = G_t × Ev_t × Gr_t    (5)

where Ev_t is the known-in-advance probability that the disaster event damages the power system topology, i.e.,

Ev_t = {p_{b_{i−j},t}^break}, i, j ∈ N    (6)

p_{b_{i−j},t}^break is the probability that line b_{i−j} is damaged by the disaster event at time t. Gr_t represents the power information of the power system, including changes in energy and load power. Because the trained agent is limited to a specific power system, its internal facilities are assumed unchanged, so resistance, reactance, and other characteristics need not be considered. The information contained in Gr_t is:

Gr_t = {P_{i,t}}, i ∈ N    (7)

$P_{i,t}$ represents the power information of node $i$, with a positive sign indicating energy and a negative sign indicating load.

Action. For the power system reconfiguration during disaster events, the agent's actions are the opening and closing operations of power system lines. To simplify the calculation of the subsequent reward function, an action is represented by 0 if the line remains unchanged and by 1 if the line's state has changed. The actions are defined as:

$$A_t = \{a_{b,t}\}, \quad b \in B^d \tag{8}$$

For example, $a_{1-2,0} = 1$ means that at time $t = 0$ the controllable line $b_{1-2}$ changes its state: if it was closed, it opens, and if it was open, it closes. $a_{1-2,0} = 0$ means that at time $t = 0$ the controllable line $b_{1-2}$ remains unchanged.

Reward. As described in Eq. (3), the performance of the power system during a disaster event consists of three parts: the load shedding loss, the cost of using controllable lines, and the cost of operations. In reinforcement learning, the reward function is a mapping from the state $S$ and action $A$ to a reward value $r$, i.e., $R : S \times A \to r$. Based on this relationship, the reward function in the reinforcement learning algorithm is modeled as:

$$r_t = -\Big(\sum_{p \in P} \eta_{p,t}\,\Delta L_p\,\Delta t + \sum_{b \in B^d} a_b \cdot c_{b,t} + \sum_{a_b \in A} a_b \cdot c^{op}_{b,t}\Big) \tag{9}$$
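As a minimal sketch of the reward in Eq. (9): the three cost terms are summed and negated. The function and parameter names below are illustrative, not from the paper's implementation.

```python
# Sketch of the reward in Eq. (9). Names (eta, delta_L, c, c_op) are
# illustrative placeholders for the paper's cost terms.
def reward(eta, delta_L, dt, actions, c, c_op):
    """Negated system loss: load shedding + line scheduling + operation cost.

    eta     : attention weights eta_{p,t} (CNY/kWh), one per shed load p
    delta_L : shed active power per load (kW)
    dt      : time-step length (h)
    actions : 0/1 line toggles a_b, one per dispatchable line
    c       : scheduling costs c_{b,t} per dispatchable line
    c_op    : operation costs c^op_{b,t} per dispatchable line
    """
    shed = sum(e * l * dt for e, l in zip(eta, delta_L))       # first sum
    sched = sum(a * cb for a, cb in zip(actions, c))           # second sum
    oper = sum(a * co for a, co in zip(actions, c_op))         # third sum
    return -(shed + sched + oper)
```

For instance, shedding 1 kW of a load weighted at 3,500 CNY/kWh for a 0.25 h step while toggling one line with costs 1,000 and 500 CNY gives a reward of −(875 + 1000 + 500) = −2375.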

178

Z. Zha et al.

In the reward function (9), the first and second parts are calculated based on $S_t$, and the third part is calculated based on $A_t$. As reinforcement learning aims to maximize rewards, the loss function of the power system is negated.

3.2 Improvement and Application

The Q-network provides the actions $a_{b,t}$ that should be taken for the power system reconfiguration problem under disaster events. The output of the Q-network is a natural number ranging from 0 to the size of the Q-network output. However, the power system reconfiguration actions are represented by $b_{i-j,t},\ b \in B^d$. Therefore, it is necessary to create a mapping function from the Q-network output to the binary representation of the reconfiguration actions. This mapping is a bijection:

$$f^A(a_{b,t}) = (a_{b,t})_{binary} = b_{i-j,t}, \quad |b_{i-j,t}| = \log_2(\max(a_{b,t})) \tag{10}$$

For example, if the output of the Q-network is 9 and the maximum output is 64, the corresponding power system reconfiguration action is {0, 0, 1, 0, 0, 1}, indicating that the third and sixth lines change state while the rest remain the same. In disaster weather events, power system overloading is a common cause of load shedding, where the energy is insufficient to meet the power balance requirements. In terms of the topological structure of the power system, this means that at least one connected component of $G$ cannot satisfy power balance. Let $G_{t-1}$ be the topology of the power system under disaster weather conditions at time $t-1$, which becomes $G(a_{b,t}|G_{t-1})$ at time $t$ due to the reconfiguration operation by the agent, with connected components $G^c = \{G^c_{1,t}, G^c_{2,t}, \cdots, G^c_{i,t}\}$. The necessary condition for the power system to operate normally from time $t$ to $t+1$ is that each connected component contains at least one source node. Let $G^{c,*}$ denote the connected components that satisfy this condition. Of all actions $a_{b,t}$ executable at time $t$, only those that satisfy the above condition can be selected and executed by the agent; those that do not are blocked. The masking layer, as a mapping from policy to policy, therefore blocks the infeasible actions. Its mapping relationship for the Q-network is:

$$M(Q^\theta_t(a_{b,t}|s_t)) = \begin{cases} Q^\theta_t(a_{b,t}|s_t), & \text{if } G(a_{b,t}|G_{t-1}) \equiv G^{c,*} \\ -\infty, & \text{otherwise} \end{cases} \tag{11}$$

The overall framework of the power system reconfiguration algorithm based on reinforcement learning is shown in Fig. 2. It is applied to enhance the resilience of the power system under disaster events.
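The action decoding of Eq. (10) and the masking layer of Eq. (11) can be sketched as follows. This is an illustrative sketch, not the authors' code: the feasibility test is reduced to checking that every connected component of the reconfigured grid contains at least one source node, with connectivity computed by a small union-find.

```python
import math

# Sketch of Eq. (10) (index -> binary toggle vector) and the masking layer
# of Eq. (11). `edges_after` is a hypothetical callback returning the line
# set after applying action index `a`.
def decode_action(a, max_a):
    """Map a Q-network output index to a binary line-toggle vector."""
    n_bits = int(math.log2(max_a))
    return [(a >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]

def components(nodes, edges):
    """Connected components via union-find."""
    parent = {v: v for v in nodes}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    for u, v in edges:
        parent[find(u)] = find(v)
    comp = {}
    for v in nodes:
        comp.setdefault(find(v), set()).add(v)
    return list(comp.values())

def mask_q_values(q_values, nodes, edges_after, sources):
    """Set Q-values of infeasible actions to -inf, as in Eq. (11)."""
    masked = []
    for a, q in enumerate(q_values):
        # feasible iff every component intersects the source-node set
        feasible = all(c & sources for c in components(nodes, edges_after(a)))
        masked.append(q if feasible else float("-inf"))
    return masked

# The paper's example: output 9 of 64 toggles the third and sixth lines.
print(decode_action(9, 64))  # [0, 0, 1, 0, 0, 1]
```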


Fig. 2. Application framework of reinforcement learning for enhancing power system resilience under disaster events

4 Numerical Experiment

In the numerical experiments, this study used the IEEE 33-bus system as the power system, with all energy and load data generated by Homer Pro. The system includes energy sources, loads, and an external distribution network; its nodes and connections are shown in Fig. 3, together with the route of the typhoon. Node 33 is an external distribution source, which can be regarded as an infinite energy source. The initial topology is also shown in Fig. 3, where lines 8–21, 10–11, 1–18, 9–15, and 25–29 are disconnected. The schedulable lines are 10–11, 12–13, 25–29, 1–18, 14–15, 12–22, 8–21, and 9–15; the remaining lines are not schedulable.

The average power of loads and energy at each time $t$ in each simulation experiment is random. For the agent, the loads and energy at time $t$ are unknown and become known only at time $t+1$. Load data is generated from the residential load profile for the Wuhan area in Homer Pro, where the hourly average load of residents is 1 kWh. Energy data comes from distributed microgrids with a storage capacity of 5 kWh, serving as backup power for the power system during a disaster event. These microgrids are located at nodes 1, 23, 19, and 17 in the topology of Fig. 3. The uncertainty of loads is obtained by sampling from a normal distribution, i.e., adding normally distributed noise based


Fig. 3. Topology diagram of IEEE 33-bus system

Table 1. Scheduling and operation costs of each line in the power system topology

Line     Scheduling cost (CNY)    Operation cost (CNY)
10–11    1,000                    500
12–13    1,200                    500
25–29    1,300                    500
1–18     1,300                    500
14–15    1,400                    500
12–22    1,400                    500
8–21     1,400                    500
9–15     1,400                    500

on the data generated by Homer Pro. The scheduling and operation costs of the schedulable lines in the power system topology are shown in Table 1. The cost of load shedding, $\eta_{p,t}$, is 3,500 CNY/kWh. This study assumes that the disaster event (the typhoon's trajectory, duration, and probability of damaging the transmission lines) is known. These data can


be obtained from relevant research [11,12] and local meteorological bureaus. In the experiments, the time interval between adjacent time points is set to 15 min, i.e., $\Delta t = 0.25$ h, and the line repair time $\Delta T^r_{i-j}$ is set to $2\Delta t$. The deep reinforcement learning-based agent must make eight decisions during the disaster event, covering the two stages before and after the typhoon passes and six time points. It is worth noting that the sum of all transmission line damage probabilities is less than or equal to 1 in this study, i.e., at most one transmission line can be damaged at each time point, similar to the case in [8]. The intelligent agent was trained for 100,000 episodes in the disaster-event power system environment. The cumulative rewards of the agents trained with the Rainbow and Deep Q-learning algorithms are shown in Fig. 4; cumulative rewards are scaled to one ten-thousandth of the actual power system losses to facilitate training. The testing section in Fig. 4 shows the performance of the agent evaluated at regular intervals over the 100,000 training episodes, for a total of 100 tests. The agent trained with the Rainbow algorithm [13] required almost the same training time as the Deep Q-learning agent, taking 69 min in total.

Fig. 4. Cumulative rewards for agents based on Rainbow and deep Q learning algorithm

The study also compared the decision-making effectiveness of the agents trained with the Rainbow-based and deep Q-learning-based reinforcement learning algorithms. Both were further compared with two rule-based strategies: closing all lines during the disaster event, and not reconfiguring the power system. The comparison results are shown in Table 2.


Table 2. Average performance of power system resilience under disaster events based on different algorithms

Algorithm                  Load shedding loss   Scheduling cost   Operation cost   Total cost
Rainbow                    23,000               12,300            47,000           82,300
Deep Q-learning            23,900               14,000            49,000           86,900
All closed                 14,300               2,800             82,200           99,300
Without reconfiguration    62,100               0                 32,000           94,100

Table 2 shows the average performance of the power system over 500 test cases, evaluated by load shedding loss, line scheduling loss, and line operation loss. The table shows that the agent trained with the Rainbow algorithm reduces the total loss by 17.1% compared to closing all lines, and by 11.8% compared to the case without system reconfiguration. Closing all lines reduces the load shedding loss during disasters but leads to a significant increase in line operation loss; forgoing reconfiguration brings the line operation loss to zero but significantly increases the load shedding loss. Both rule-based strategies are clearly inferior to the Rainbow-based reinforcement learning algorithm proposed in this chapter in improving the resilience of power systems under disasters. Therefore, the proposed reinforcement learning algorithm can significantly improve the overall resilience of the power system under typhoon impacts, thereby reducing the total system loss.

5 Conclusion

This paper proposes a decision-making model for microgrid-integrated power systems using Markov decision processes. The model improves system resilience during disasters such as typhoons by allowing the power system to shed load and to operate and use lines at minimum cost. The Markov state models the sequence of disaster events and line changes during the reconfiguration process. To address the high dimensionality problem, the state-of-the-art Rainbow reinforcement learning algorithm is used to train an intelligent agent. An action mask layer filters out infeasible decision solutions without exploration, significantly improving the efficiency of reinforcement learning training. The proposed deep reinforcement learning-based power system reconfiguration method is shown to be effective through numerical simulation experiments on the IEEE 33-bus system.

Acknowledgement. This research is supported by the State Grid Corporation of China Headquarters Science and Technology Project (grant number: 5100-202099522A-0-0-00).


References

1. Executive Office of the President, C.O.E.A.: Economic benefits of increasing electric grid resilience to weather outages. The Council (2013)
2. Force, S.I.R.T.: Severe impact resilience: considerations and recommendations. North American Electric Reliability Corporation (2012)
3. Zare-Bahramabadi, M., Abbaspour, A., Fotuhi-Firuzabad, M., Moeini-Aghtaie, M.: Resilience-based framework for switch placement problem in power distribution systems. IET Gener. Transm. Distrib. 12(5), 1223–1230 (2017)
4. Wei, Y., Wang, J., Feng, Q., Chen, C., Kang, C., Bo, Z.: Robust optimization-based resilient distribution network planning against natural disasters. IEEE Trans. Smart Grid 7(6), 2817–2826 (2016)
5. Byon, E., Ding, Y.: Season-dependent condition-based maintenance for a wind turbine using a partially observed Markov decision process. IEEE Trans. Power Syst. 25(4), 1823–1834 (2010)
6. Wang, C., Hou, Y., Qiu, F., Lei, S., Liu, K.: Resilience enhancement with sequentially proactive operation strategies. IEEE Trans. Power Syst. 32(4), 2847–2857 (2016)
7. Wang, C., Lei, S., Ju, P., Chen, C., Peng, C., Hou, Y.: MDP-based distribution network reconfiguration with renewable distributed generation: approximate dynamic programming approach. IEEE Trans. Smart Grid 11(4), 3620–3631 (2020)
8. Wang, C., Ju, P., Lei, S., Wang, Z., Wu, F., Hou, Y.: Markov decision process-based resilience enhancement for distribution systems: an approximate dynamic programming approach. IEEE Trans. Smart Grid 11(3), 2498–2510 (2019)
9. Kamel, M., et al.: A reinforcement learning approach for branch overload relief in power systems. In: 2020 IEEE Power & Energy Society General Meeting (PESGM), pp. 1–5. IEEE (2020)
10. Ouyang, M., Duenas-Osorio, L.: Multi-dimensional hurricane resilience assessment of electric power systems. Struct. Saf. 48, 15–24 (2014)
11. Liu, L., Padilla, L., Creem-Regehr, S.H., House, D.H.: Visualizing uncertain tropical cyclone predictions using representative samples from ensembles of forecast tracks. IEEE Trans. Visual. Comput. Graph. 25(1), 882–891 (2018)
12. Elsner, J.B., Jagger, T.H.: A hierarchical Bayesian approach to seasonal hurricane modeling. J. Clim. 17(14), 2813–2827 (2004)
13. Hessel, M., et al.: Rainbow: combining improvements in deep reinforcement learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32 (2018)

An Improved Adaptive Median Filtering Algorithm Based on Star Map Denoising

Hancheng Cao, Naijun Shen, and Chen Qian(B)

Nanjing University of Science and Technology, Nanjing 210094, China
[email protected]

Abstract. In order to improve the filtering performance of salt and pepper noise in starry sky images, an improved adaptive median filtering algorithm is proposed. Firstly, the algorithm performs adaptive median filtering on the noise image. Instead of pixel median replacement, it obtains the set of suspected noise points. Then, it uses a neighbourhood differential noise template for secondary detection of the suspected noise points to obtain the real noise points. Finally, a weighted mean value is calculated based on the distance weights, and the noise points are replaced. The proposed algorithm is used to filter the star maps containing 20%–90% salt and pepper noise. The experimental results show that the denoising effect of this algorithm is better than other algorithms, while preserving the image details and edge information better.

Keywords: median filter · adaptive · weighted mean · salt and pepper noise

1 Introduction

During the image collection stage of a space-based target detection system, various sources of noise may degrade the quality of the collected data. The most common types are Gaussian noise, Poisson noise, and salt and pepper noise. After the non-uniform background correction process, both Gaussian and Poisson noise are filtered out of the background image [1]; where residual Gaussian noise persists, simple Gaussian filtering can be applied to attenuate it. Consequently, this paper mainly focuses on filtering salt and pepper noise in the collected images. Salt and pepper noise greatly changes the gray value of a pixel, making it significantly different from the surrounding pixel values, so the noise can be detected by this jump in gray value [2]. Median filtering and its various modifications are widely used for this purpose, primarily due to their effectiveness. In [3] and [4], switching median filtering is an effective strategy for mitigating salt and pepper noise. This approach incorporates a noise classification technique that performs well for low-density noise but exhibits limitations when the noise density is relatively high. In [5] and [6], extremum median filtering is an extension

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 184–195, 2023. https://doi.org/10.1007/978-981-99-6187-0_18

Adaptive Median Filtering

185

of the median filter that improves noise reduction by incorporating noise detection mechanisms that reduce image blur. Despite its enhanced performance, this approach misclassifies edge information as noise, so its ability to reduce noise in images is substantially compromised. In [7], median average filtering yields superior filtering performance in terms of image quality, preserving image details and mitigating noise effectively; however, its computational complexity is relatively high, resulting in slow operation. In [8] and [9], weighted median filtering assigns higher weights to pixels that exhibit greater similarity to their neighboring pixels; the weighted average of these pixels then replaces the median pixel value, enhancing the preservation of image details. However, its denoising effect is weaker than that of other traditional noise reduction techniques.

The adaptive median filtering algorithm uses the gray extrema of pixels in the filtering window as the judgment criterion for noise points, but it has the following shortcomings: (1) when filtering the image edges, the image is expanded and the edge pixel is used as the centre of the filter window, which loses edge detail information and degrades the overall quality of the filtered image [10]; (2) not all gray extreme points in the filter window are noise points, so the algorithm easily misjudges edge points as noise, i.e., extreme points may be falsely detected [11]. Moreover, a star map contains many isolated stars distributed as point-like spots and space objects distributed as narrow bands or thin lines [12]. Consequently, traditional noise reduction algorithms may falsely classify some isolated pixels as noise. Furthermore, the star map's unique features often give rise to an abundance of edges that such algorithms may not preserve well, so noise suppression in star maps is often inadequate, resulting in suboptimal image restoration quality.

This paper presents an improvement to the adaptive median filtering algorithm. In the noise detection process, a secondary detection mechanism is added to further reduce the misjudgment of effective pixels; specifically, four neighbourhood filter templates are used to evaluate potential noise pixels, enhancing the algorithm's ability to detect noise points. Additionally, an improved weighted mean method is employed to replace the gray value of the noise points. Overall, the algorithm filters images with various concentrations of salt and pepper noise well while effectively preserving image details.

2 Noise Model

Salt and pepper noise is a type of image noise characterized by bright and dark points that occur randomly in black and white [13]. It is commonly generated by various factors including image sensors, transmission channels, and decoding processing. When salt and pepper noise affects an image, some pixels in the image are replaced by either the minimum or maximum value allowed in a digitized image.

186

H. Cao et al.

Specifically, negative pulses appear as black dots (pepper dots) in the image and positive pulses as white dots (salt dots). For an 8-bit image, the salt value $a$ is 255 and the pepper value $b$ is 0. The probability density model of salt and pepper noise is:

$$p(z) = \begin{cases} p_a, & z = 255, \\ p_b, & z = 0, \\ 0, & \text{otherwise}, \end{cases} \tag{1}$$

where $z$ is the gray value and $p_a$, $p_b$ represent the probabilities of occurrence of the two gray values.
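As a sketch of this noise model, salt and pepper noise with probabilities $p_a$ and $p_b$ can be injected into an image as follows (a minimal NumPy illustration, not the authors' code):

```python
import numpy as np

# Inject salt and pepper noise following the density model of Eq. (1):
# each pixel becomes salt (255) with probability p_a, pepper (0) with
# probability p_b, and is left unchanged otherwise.
def add_salt_pepper(img, p_a, p_b, seed=0):
    rng = np.random.default_rng(seed)
    noisy = img.copy()
    u = rng.random(img.shape)                    # one uniform draw per pixel
    noisy[u < p_a] = 255                         # salt (positive pulses)
    noisy[(u >= p_a) & (u < p_a + p_b)] = 0      # pepper (negative pulses)
    return noisy

img = np.full((200, 200), 128, dtype=np.uint8)
noisy = add_salt_pepper(img, 0.1, 0.1)
```

With $p_a = p_b = 0.1$, roughly 20% of the pixels are corrupted, matching the lower end of the noise densities used in the experiments below.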

3 Algorithm Description

3.1 Adaptive Median Filtering Algorithm

The adaptive median filter adjusts the filter window size according to the noise density [14]. The algorithm is divided into two stages, denoted A and B. Within a filter window $W$ centered at the pixel in row $y$ and column $x$ of the image, let $W_{max}$ be the maximum allowed window size, $Z_{max}$ and $Z_{min}$ the maximum and minimum gray values within $W$, $Z_{med}$ the median of all gray values in $W$, and $Z_{xy}$ the gray value of the pixel in row $y$ and column $x$. The algorithm flow is:

A: If $Z_{min} < Z_{med} < Z_{max}$ holds within the current filter window, proceed to B; otherwise increase the window size by 2, i.e., $W = W + 2$. If the new window size is at most $W_{max}$, repeat step A; if it exceeds $W_{max}$, directly output $Z_{med}$.

B: If $Z_{min} < Z_{xy} < Z_{max}$ holds, output $Z_{xy}$; otherwise output $Z_{med}$.
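The two-stage procedure above can be sketched as follows. This is a straightforward, unoptimized NumPy illustration; border handling by edge replication is our assumption, since the paper notes that edge treatment varies between implementations.

```python
import numpy as np

# Sketch of the two-stage adaptive median filter (stages A and B),
# padding the borders by edge replication (an assumption, not the paper's).
def adaptive_median(img, w_max=15):
    pad = w_max // 2
    padded = np.pad(img.astype(int), pad, mode="edge")
    out = img.astype(int)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            size = 3
            while True:
                r = size // 2
                win = padded[y + pad - r:y + pad + r + 1,
                             x + pad - r:x + pad + r + 1]
                zmin, zmax = win.min(), win.max()
                zmed = int(np.median(win))
                if zmin < zmed < zmax:                 # stage A passed
                    zxy = padded[y + pad, x + pad]
                    out[y, x] = zxy if zmin < zxy < zmax else zmed  # stage B
                    break
                size += 2                              # enlarge the window
                if size > w_max:                       # limit reached
                    out[y, x] = zmed
                    break
    return out.astype(img.dtype)
```

For example, a single 255-valued impulse in an otherwise flat 100-valued patch is replaced by 100, while pixels in a smoothly varying region pass through unchanged via stage B.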

3.2 Improved Adaptive Median Filtering Algorithm

3.2.1 Initial Detection
To initially detect suspected noise points, the image is processed with the adaptive median filtering algorithm, without replacing the noise points at this stage. The preliminary detection yields a set $M$ of noise points and a set $N$ of non-noise points. For each noise point, the difference between the maximum and minimum gray values within its filtering window is denoted $E_{xy}$. The sets $M$ and $N$ are:

$$M = \{(x, y), F_{xy}, E_{xy}\}, \tag{2}$$

$$N = \{(p, q), F_{pq}\}, \tag{3}$$

where $F_{xy}$ and $F_{pq}$ represent the gray value of the pixel.


In this case, for images containing more edge information, not all pixels in the set $M$ are noise points, so secondary detection is required to obtain the real noise points.

3.2.2 Secondary Detection
The gray values of the target region in a star map are distributed approximately as a Gaussian function, with values decreasing from the central pixel to both sides. While noise points are randomly distributed, non-noise points can be detected by the similar gray values of adjacent pixels. Considering the correlation between adjacent pixels, filtering models in four different directions are established, as shown in Fig. 1 below.

Fig. 1. Four filtering models in different directions

Theoretically, within the 3 × 3 filtering window, more three-neighborhood filtering models could be considered. However, simulation shows that this only slightly improves the detection accuracy of noise points while greatly increasing the computation time of the algorithm. Therefore, only four representative filtering models are considered in this paper. Their mathematical expressions are:

$$W^1_{xy} = \{(x, y-1), (x, y), (x, y+1) \mid (x, y) \in M,\ (x, y-1)\ \&\ (x, y+1) \in N\}, \tag{4}$$

$$W^2_{xy} = \{(x-1, y), (x, y), (x+1, y) \mid (x, y) \in M,\ (x-1, y)\ \&\ (x+1, y) \in N\}, \tag{5}$$

$$W^3_{xy} = \{(x-1, y-1), (x, y), (x+1, y+1) \mid (x, y) \in M,\ (x-1, y-1)\ \&\ (x+1, y+1) \in N\}, \tag{6}$$

$$W^4_{xy} = \{(x+1, y-1), (x, y), (x-1, y+1) \mid (x, y) \in M,\ (x+1, y-1)\ \&\ (x-1, y+1) \in N\}. \tag{7}$$

Considering that factors such as signal energy attenuation affect the similarity of adjacent pixel gray values, this paper introduces a weight factor. Specifically, in any filter window, the weighted average of the gray value difference between the center point and its adjacent points is denoted $d^k_{xy}$:

$$d^k_{xy} = \begin{cases} \dfrac{\sum_{(i,j) \in W^k_{xy}} H_{x-i,y-j}\, d_{x-i,y-j}}{\sum_{(i,j) \in W^k_{xy}} H_{x-i,y-j}}, & W^k_{xy} \notin \emptyset, \\[2ex] E_{xy}, & W^k_{xy} \in \emptyset, \end{cases} \tag{8}$$

where $W^k_{xy} \notin \emptyset$ indicates that the two points adjacent to the suspected noise point are both non-noise points. $d_{x-i,y-j}$ is the absolute difference between the gray value of a neighbouring non-noise point and that of the central noise point:

$$d_{x-i,y-j} = |F_{ij} - F_{xy}|, \quad F_{xy} \in M,\ F_{ij} \in N. \tag{9}$$

$H_{x-i,y-j}$ is the weight factor of $d_{x-i,y-j}$. Following [15], the weight factor is chosen as:

$$H_{x-i,y-j} = \frac{1}{1 + d^2_{x-i,y-j}}. \tag{10}$$

According to formulas (8)–(10), in a non-empty filter window, if the central point is a non-noise point, it tends to have a gray value similar to the adjacent non-noise points, resulting in a small value of $d_{x-i,y-j}$. Conversely, an empty filter window with a non-noise centre point implies that some pixels in the window were misjudged as noise points; but for an area unaffected by noise, $E_{xy}$ is small, which again leads to a small value of $d^k_{xy}$. Consequently, when a non-noise point lies in a smooth area, the values of $d^k_{xy}$ in all four filtering windows tend to be small; when it lies in an edge region, the four orientations of the filtering model ensure that the similarity of neighboring pixels along the edge direction is detected, so at least one window's $d^k_{xy}$ will be small. A threshold $T$ can be preset: as long as some $d^k_{xy}$ is less than $T$, the point is considered a non-noise point, i.e.,

$$\exists\, d^k_{xy} < T, \quad k = 1, 2, 3, 4. \tag{11}$$

3.2.3 Noise Point Replacement
The set of real noise points obtained after the secondary detection needs to be replaced. This paper adopts the weighted mean method. The gray value of each non-noise point in the filter window influences the estimate of the central noise point's gray value to a varying degree, determined by the distance of the non-noise point from the centre and reflected in its weight. Suppose a non-noise point in the filter window has coordinates $(p, q)$ and the centre point has coordinates $(x, y)$. The distance from the non-noise point to the centre point is:

$$s_{pq} = \sqrt{(p-x)^2 + (q-y)^2}. \tag{12}$$


Fig. 2. Flow chart of the algorithm

The closer a non-noise point in the filter window is to the centre point, the greater its influence. Therefore, the reciprocal of the distance from the non-noise point to the centre point is selected as the weight:

$$\omega_{pq} = \frac{1}{s_{pq}}. \tag{13}$$

The replacement of noise points by the weighted mean is as follows:

$$F(x_0, y_0) = \frac{\sum_{(p,q) \in U} \omega_{pq} F(p, q)}{\sum_{(p,q) \in U} \omega_{pq}}, \tag{14}$$

where $F(x_0, y_0)$ is the gray value after the centre point is replaced, $U$ is the set of non-noise points in the filter window, and $F(p, q)$ is the gray value of the non-noise point $(p, q)$. As the formula shows, $F(p, q)$ and $\omega_{pq}$ jointly determine the replaced gray value of the noisy pixel. The accuracy is higher when more non-noise pixels take part in the computation and when similar gray values in the neighbourhood carry greater weight (Fig. 2).
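The replacement of Eqs. (12)–(14) amounts to a distance-weighted average over the non-noise pixels of the window, for example as below (`window` is a hypothetical mapping from non-noise pixel coordinates to gray values, not a name from the paper):

```python
import math

# Distance-weighted mean replacement of Eqs. (12)-(14): each non-noise
# pixel contributes with weight 1/s, where s is its distance to the centre.
def replace_noise(window, x, y):
    num = den = 0.0
    for (p, q), gray in window.items():
        s = math.hypot(p - x, q - y)   # Eq. (12)
        w = 1.0 / s                    # Eq. (13)
        num += w * gray
        den += w
    return num / den                   # Eq. (14)
```

Two equidistant neighbours contribute equally, so a centre flanked by gray values 100 and 200 is replaced by 150; a farther neighbour contributes proportionally less.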


3.2.4 Algorithm Process
The algorithm flow chart is shown in Fig. 2. The algorithm proceeds as follows:

Step 1: The adaptive median filtering algorithm is used to preliminarily detect suspected noise points. Instead of replacing them, it yields the set of noise points $M$, the set of non-noise points $N$, and $E_{xy}$ (the difference between the maximum and minimum gray values in each noise point's filtering window).

Step 2: Calculate the weighted mean difference $d^k_{xy}$ for each of the four directional filter models in turn. If some $d^k_{xy}$ is less than the threshold $T$ ($T = 1$ in this paper), the point is a non-noise point; if none of the four values is below $T$, the point is a real noise point.

Step 3: Traverse all points in the noise point's filtering window to obtain the set of non-noise points. Calculate the weight of each non-noise point and replace the noise point with the weighted mean of all non-noise points in the window.

4 Experimental Results and Analysis

4.1 Objective Evaluation Methods of Image Quality

Let $I(i, j)$ and $F(i, j)$ be the gray values of pixel $(i, j)$ in the original noiseless image and the denoised image, and let the image size be $m \times n$. The PSNR (Peak Signal to Noise Ratio) is:

$$\mathrm{PSNR} = 10 \log_{10} \frac{\max_I^2}{\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left[I(i,j) - F(i,j)\right]^2}, \tag{15}$$

where $\max_I$ is the maximum gray value in the image. The SSIM (Structural Similarity) is:

$$\mathrm{SSIM} = \frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n} \frac{(2\mu_i\mu_j + C_1)(2\sigma_{ij} + C_2)}{(\mu_i^2 + \mu_j^2 + C_1)(\sigma_i^2 + \sigma_j^2 + C_2)}, \tag{16}$$

where $\mu_i$ and $\mu_j$ compare the brightness of the images, $\sigma_i$, $\sigma_j$, and $\sigma_{ij}$ compare their contrast, and $C_1$ and $C_2$ are constants [16]. In general, the larger the PSNR and the closer the SSIM is to 1, the closer the denoised image is to the original ground-truth image (Fig. 3).
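The two metrics can be sketched with NumPy as follows. Note that Eq. (16) averages SSIM statistics over local windows; as a simplification, this sketch computes a single SSIM from global image statistics, so it only approximates the windowed definition.

```python
import numpy as np

# PSNR of Eq. (15), and a simplified global-statistics version of the
# SSIM of Eq. (16) (production code would use local windows instead).
def psnr(I, F, max_i=255.0):
    mse = np.mean((I.astype(float) - F.astype(float)) ** 2)
    return 10 * np.log10(max_i ** 2 / mse)

def ssim_global(I, F, C1=6.5025, C2=58.5225):  # C = (k*255)^2, k = 0.01, 0.03
    I, F = I.astype(float), F.astype(float)
    mu_i, mu_f = I.mean(), F.mean()
    var_i, var_f = I.var(), F.var()
    cov = ((I - mu_i) * (F - mu_f)).mean()
    return ((2 * mu_i * mu_f + C1) * (2 * cov + C2) /
            ((mu_i ** 2 + mu_f ** 2 + C1) * (var_i + var_f + C2)))
```

For identical images, the global SSIM evaluates to 1; a uniform error of 1 gray level over an 8-bit image gives a PSNR of about 48.13 dB.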

4.2 Simulation Experiment and Analysis

To verify the effectiveness of the proposed algorithm, salt and pepper noise with densities of [0.2, 0.4, 0.6, 0.8] is added to a 512 × 512 Lena image with 256 gray levels. The initial filter window size of the four algorithms


Fig. 3. Experimental test chart

Fig. 4. The denoising results of various algorithms with noise density 0.2

Fig. 5. The denoising results of various algorithms with noise density 0.4

Fig. 6. The denoising results of various algorithms with noise density 0.6

Fig. 7. The denoising results of various algorithms with noise density 0.8


is 3 × 3, and the maximum filter window size is 15 × 15. The results are shown in the figures below; each row shows, in turn, the noisy image and the results of median filtering, adaptive median filtering, the improved adaptive median filtering algorithm, and the proposed algorithm.

As the figures show, median filtering is only effective for images with low noise density. As the noise density increases, the adaptive median filter and its improved algorithm exhibit significantly better denoising performance than the median filter. For images with higher noise densities, the algorithm proposed in this paper outperforms the other algorithms in denoising effectiveness (Figs. 4, 5, 6 and 7).

In addition to direct observation of the images, an objective evaluation was also carried out. PSNR and SSIM values were calculated for the different algorithms; the results are shown in Table 1. All algorithms achieve high PSNR and SSIM values at a noise density of 0.2, indicating good filtering performance. As the noise density increases, the median filter becomes less effective and fails to deal with the noise at density 0.4; similarly, the adaptive median filter and its improved algorithm show reduced effectiveness at density 0.6. In contrast, even at a noise density of 0.8, the proposed algorithm exhibits higher PSNR and SSIM values than the other algorithms, demonstrating its effectiveness in filtering out high-density salt and pepper noise.

Table 1. PSNR and SSIM values of various algorithms under different noise densities

Algorithm                         Noise density   PSNR    SSIM
Median filter                     0.2             29.14   0.8569
                                  0.4             19.00   0.4600
                                  0.6             12.35   0.1111
                                  0.8             8.163   0.0267
Adaptive median filter            0.2             34.75   0.9433
                                  0.4             31.32   0.9092
                                  0.6             27.40   0.8362
                                  0.8             23.20   0.6963
Improved adaptive median filter   0.2             35.33   0.9451
                                  0.4             33.50   0.9297
                                  0.6             30.76   0.8950
                                  0.8             26.71   0.8068
Algorithm in this paper           0.2             38.91   0.9718
                                  0.4             35.03   0.9530
                                  0.6             32.91   0.9329
                                  0.8             29.29   0.8790

4.3 Star Map Denoising Experiment

To evaluate the denoising performance on star maps, we apply the adaptive median filtering algorithm and the proposed algorithm to two star maps. The first image in each figure below shows the original star map, while the second and third show denoised portions ([400 : 800, 200 : 600]) produced by the two algorithms. From Figs. 8 and 9, it can be seen that both algorithms effectively filter out the salt and pepper noise. However, the adaptive median filtering algorithm loses the background stars: because the background gray value of the corrected star map is almost as low as 0, targets are easily misclassified as noise during extreme-point noise detection. The algorithm proposed in this paper preserves image details well while denoising.

Fig. 8. Star map denoising results 1

Fig. 9. Star map denoising results 2
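For reference, the baseline that Figs. 8 and 9 compare against — the textbook adaptive median filter with a growing window — can be sketched in NumPy as follows. This is only the standard algorithm; the paper's improvements (secondary detection, the four directional templates and weighted-mean replacement) are not reproduced here.

```python
import numpy as np

def adaptive_median_filter(img, s_init=3, s_max=15):
    """Standard adaptive median filter; the window grows from
    s_init x s_init up to s_max x s_max while the median looks impulsive."""
    img = img.astype(np.float64)
    pad = s_max // 2
    padded = np.pad(img, pad, mode="reflect")
    out = img.copy()
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            s = s_init
            while s <= s_max:
                r = s // 2
                win = padded[i + pad - r:i + pad + r + 1, j + pad - r:j + pad + r + 1]
                zmin, zmed, zmax = win.min(), np.median(win), win.max()
                if zmin < zmed < zmax:                  # stage A: median is not an impulse
                    if not (zmin < img[i, j] < zmax):   # stage B: centre pixel is an impulse
                        out[i, j] = zmed
                    break
                s += 2                                  # enlarge the window and retry
            else:
                out[i, j] = zmed                        # fall back to the last median
    return out
```

On a flat grey background with isolated 0/255 impulses, the impulses are replaced by the local median while clean pixels pass through unchanged.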

5 Conclusion

This paper analyses the reasons why adaptive median filtering cannot handle star maps and proposes an improved adaptive median filtering algorithm. The algorithm adds a secondary detection step and designs four filter templates according to the correlation of adjacent pixels, which reduces the misjudgment of noise points. Furthermore, replacing noise points with a weighted mean of the non-noise points in the filter window reduces the loss of image detail. Simulation experiments show that, compared with other algorithms, this algorithm not only produces subjectively clearer images but also delivers good filtering performance on the PSNR and SSIM indicators. More importantly, when processing star maps, the algorithm reduces the loss of image detail and better preserves the tiny star points in the star map.

Acknowledgment. This work was supported by the National Defense Basic Scientific Research Program of China under grant No. JCKY2021606D002.

H. Cao et al.

References

1. Xu, Z., Liu, D., Yan, C., Hu, C.: Stray light nonuniform background correction for a wide-field surveillance system. Appl. Opt. 59(34), 10719–10728 (2020)
2. Kandpal, A., Ramola, V.: Local image segmentation process for salt-and-pepper noise reduction by using median filters. Int. J. Eng. Dev. Res. (IJEDR) 3(2) (2015)
3. Li, X., Ji, J., Li, J., He, S., Zhou, Q.: Research on image denoising based on median filter. In: IEEE 4th Advanced Information Management, Communicates, Electronic and Automation Control Conference (IMCEC), vol. 4, pp. 528–531. IEEE (2021)
4. Bruntha, P.M., et al.: Application of switching median filter with L2 norm-based auto-tuning function for removing random valued impulse noise. Aerospace Syst. 6, 1–7 (2022)
5. Hou, J., et al.: Research on tumor image segmentation in medical imaging based on extremum adaptive median filtering. In: 2022 IEEE International Conference on Advances in Electrical Engineering and Computer Applications (AEECA). IEEE (2022)
6. Dong, Z.F., Cheng, X.W., Han, Y.D., Tang, J.T., Dan, G.: Extremum median filter method used in data analysis of spray heat exchange temperature test. Adv. Mater. Res. 945–949, 2165–2169 (2014)
7. Garg, B., Arya, K.V.: Four stage median-average filter for healing high density salt and pepper noise corrupted images. Multimedia Tools Appl. 79(43–44), 32305–32329 (2020)
8. Cao, X., Zhang, Z., Chen, S., Li, T.: Application of improved self-adaptive weighted median filtering algorithm in neutron radiography. In: 2021 6th International Conference on Intelligent Computing and Signal Processing (ICSP), pp. 50–55. IEEE (2021)
9. Iqbal, N., Ali, S., Khan, I., Lee, B.M.: Adaptive edge preserving weighted mean filter for removing random-valued impulse noise. Symmetry 11(3), 395 (2019)
10. Qian, Y.: Removing of salt-and-pepper noise in images based on adaptive median filtering and improved threshold function. In: 2019 Chinese Control and Decision Conference (CCDC), pp. 1431–1436. IEEE (2019)
11. Tang, J., Wang, Y., Cao, W., Yang, J.: Improved adaptive median filtering for structured light image denoising. In: 7th International Conference on Information, Communication and Networks (ICICN), pp. 146–149. IEEE (2019)
12. Han, J., Tong, J., Tang, C.: Rapid and accurate regional star-map simulated method. In: 2021 International Conference on Computer, Control and Robotics (ICCCR), pp. 319–324. IEEE (2021)
13. Karthik, B., Krishna Kumar, T., Vijayaragavan, S.P., Sriram, M.: Removal of high density salt and pepper noise in color image through modified cascaded filter. J. Ambient Intell. Human. Comput. 12, 3901–3908 (2021)
14. Md. Taha, A.Q., Ibrahim, H.: Reduction of salt-and-pepper noise from digital grayscale image by using recursive switching adaptive median filter. In: Jamaludin, Z., Ali Mokhtar, M.N. (eds.) SympoSIMM 2019. LNME, pp. 32–47. Springer, Singapore (2020). https://doi.org/10.1007/978-981-13-9539-0_4
15. Zhang, X., Xiong, Y.: Impulse noise removal using directional difference based noise detector and adaptive weighted mean filter. IEEE Signal Process. Lett. 16(4), 295–298 (2009)
16. Setiadi, D.R.I.M.: PSNR vs SSIM: imperceptibility quality assessment for image steganography. Multimedia Tools Appl. 80(6), 8423–8444 (2021)

Research on Intelligent Monitoring of Big Data Processes Based on Radar Map and Residual Convolutional Network

Jianli Yu(B), Yixiang Wang, Zhiao Jia, and Benkai Xie

School of Management Engineering, Zhengzhou University of Aeronautics, Zhengzhou 450046, Henan, China
[email protected]

Abstract. Aiming at the problem that traditional convolutional neural networks struggle to extract fine image features, a big data process monitoring method based on radar maps and a convolutional-block-attention residual network is proposed. According to the sampling frequency of the equipment sensors, radar maps are used to visually present the process operation status, and a residual network incorporating the convolutional block attention mechanism is constructed for adaptive feature extraction, enabling intelligent state monitoring of complex processes under big data; finally, the operation status of a high-pressure roller mill is studied empirically. The results show that the proposed method is capable of real-time intelligent monitoring of the five operating states of the high-pressure roller mill, with higher diagnostic accuracy and generalization capability than other deep learning methods.

Keywords: residual convolutional network · attention module · intelligent monitoring · big data

1 Introduction

After Germany proposed "Industry 4.0", a new round of technological and industrial revolution was launched worldwide. China's "Made in China 2025" and the 14th Five-Year Plan point out that intelligent manufacturing projects should be implemented continuously to promote industrial intelligence and digital transformation. With the continuous development of computer, sensor and other technologies, enterprises now accumulate massive amounts of data reflecting the operation process and status of their equipment. Under the premise of ensuring the safe and smooth operation of equipment, it is therefore crucial to establish a system for intelligent monitoring and diagnosis of the equipment operation process and status in the current big data environment.

Traditional process monitoring technology relies on the experience of relevant personnel and tedious manual operations, which cannot meet the various needs of the current big data environment. Deep learning has emerged in machine learning in recent years, with the convolutional neural network at its core. Its ability to extract data features adaptively overcomes the dependence of previous models and systems on manual feature extraction and a priori knowledge while greatly reducing the workload, and it has become a research focus in intelligent process monitoring. Ince et al. [1] proposed a one-dimensional convolutional neural network for real-time condition monitoring and fault detection of electric motors and verified the effectiveness of the method; Peng et al. [2] proposed a deep one-dimensional convolutional network, using the idea of residuals to solve the network degradation problem. Although these methods achieve good results on the objects in their respective domains, two problems remain: firstly, one-dimensional time-series or frequency-series signal data lose some of the original information during feature extraction [3]; secondly, single-scale convolutional layers ignore subtle timing signals during feature extraction, which reduces the accuracy of the model. In view of this, many scholars have introduced two-dimensional images such as binary maps, time-frequency spectra and grey-scale maps, combined with multi-scale convolutional neural networks, to extract image features. For example, Teng Rui et al. [4] used Gramian angular field technology to image the data signal and used a convolutional neural network for online monitoring of tool wear, which improved the monitoring accuracy; Che Changchang et al. [5] matrixed long time series into grey-scale maps and established a deep residual convolutional neural network for fault diagnosis and classification. Most of these methods combine one or more segments of 1D process data into 2D images and then input them into a convolutional neural network. Although this segmental transformation can further extract data features, it cannot avoid the correlation between the data at the different sampling moments of the equipment, and the overlap between segments also affects model accuracy.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 196–204, 2023. https://doi.org/10.1007/978-981-99-6187-0_19
To this end, this paper proposes an intelligent monitoring method for big data processes based on radar maps and an improved convolutional neural network. To maximise the adaptive extraction of image features, a Convolutional Block Attention Module (CBAM) is added to the residual learning module of the traditional Resnet network, and a batch normalization (BN) layer is added to the main path of the final residual module, building a convolutional network model based on radar maps and attention residuals for intelligent monitoring of big data processes.

2 Theoretical Background

2.1 Resnet Convolutional Neural Network

To address the drawbacks of traditional convolutional neural networks, He et al. [6] proposed the Resnet convolutional neural network, whose structure consists of a convolutional layer, residual blocks composed of convolutional layers, a pooling layer and a fully connected layer. The structure of the basic residual module and the convolutional residual module of the Resnet network is shown in Fig. 1. Let x and y be the input and output of one residual sub-module of the network, F the residual mapping function and x the identity mapping; residual learning can then be expressed as y = f(F(x) + x). The identity mapping copies the features of the forward layer of the network directly to the backward layer without a convolution operation, solving the problem of network degradation.
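The relation y = f(F(x) + x) can be made concrete with a small NumPy sketch, in which fully connected weight layers stand in for the convolutional layers (all names here are illustrative). When the residual mapping F collapses to zero, the block simply forwards relu(x), which is exactly the identity-copy behaviour described above.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def basic_residual_block(x, W1, W2):
    """y = f(F(x) + x): two weight layers form F, identity shortcut, final ReLU."""
    F = W2 @ relu(W1 @ x)   # residual mapping F(x)
    return relu(F + x)      # add the shortcut, then activate

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W1 = rng.standard_normal((8, 8)) * 0.1
W2 = rng.standard_normal((8, 8)) * 0.1
y = basic_residual_block(x, W1, W2)
```

With zero weights the block reduces to relu(x), so no convolution is needed to propagate the forward-layer features.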

Fig. 1. Basic residual module and convolutional residual module

When the residual mapping function F(x) and the input feature x have different dimensions in the network, a convolutional residual module is used: the identity mapping is replaced by a 1 × 1 convolutional layer, whose output is added to F(x) to obtain the output y after the activation function.

2.2 Attention Mechanisms

The attention mechanism in deep learning is a processing mechanism with autonomous learning and selective attention that enhances the network's extraction of information from key local parts and ignores less important information, yielding an optimized feature extraction network [7]. Since the middle part of the 2D radar map carries the data, adding an attention mechanism strengthens attention to the features in this middle part and reduces attention to the surrounding non-critical parts. Attention modules fall into three main types: the channel attention module, the spatial attention module, and the mixed channel and spatial attention module [8].

In the channel attention module, let the input feature F be H × W × C. First perform global maximum pooling and global average pooling over the width W and height H [9], feed each result into a shared multilayer perceptron (MLP) with one hidden layer, sum the two outputs, and apply a sigmoid activation to obtain the 1 × 1 × C channel attention weights MC, as shown in Fig. 2. The new feature is the product of the channel attention weights MC and the input feature F, which can be expressed as

MC(F) = σ(MLP(MaxPool(F)) + MLP(AvgPool(F)))   (1)

MC(F) = σ(W1(W0(F_Max^C)) + W1(W0(F_Avg^C)))   (2)

where σ is the sigmoid activation function.

Fig. 2. Channel Attention Module

In the spatial attention module, let the input feature F be H × W × C. First perform global maximum pooling and global average pooling along the channel dimension C [10], concatenate the results by channel, apply a convolution operation, and then a sigmoid activation to obtain the H × W × 1 spatial attention weights MS, as shown in Fig. 3. The new feature is the product of the spatial attention weights MS and the input feature F, which can be expressed as

MS(F) = σ(f([MaxPool(F), AvgPool(F)]))   (3)

MS(F) = σ(f([F_Max^S, F_Avg^S]))   (4)

where σ is the sigmoid activation function.

Fig. 3. Spatial attention module

3 A Big Data Process Intelligence Monitoring Model Based on Radar Map and Attention Residual Network

3.1 Two-Dimensional Radar Maps

The radar map turns the one-dimensional sensor data into a two-dimensional image: the monitoring values at each sampling moment are represented together on one radar plot. Let the monitoring data at a sampling moment be X = [x1, x2, ..., xn]. Draw n equal-length rays intersecting at a common point, dividing the plane into n equal angles; mark the n monitoring values on the n rays; set suitable scale values according to the magnitudes of the data and add circular grid lines at each scale; and connect the n monitoring values with line segments in order, highlighting the data characteristics. This yields the two-dimensional radar map of the monitoring data at that sampling moment.
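The construction above can be sketched numerically: scale the n values, place them on n equally spaced rays, and take the resulting polygon vertices (plotting them on a polar chart then gives the radar map). radar_vertices and the sample readings below are illustrative, not from the paper.

```python
import numpy as np

def radar_vertices(sample, r_max=None):
    """Map one sampling moment's n monitoring values onto n equally
    spaced rays from a common origin; returns the (n, 2) polygon vertices."""
    x = np.asarray(sample, dtype=np.float64)
    n = x.size
    r = x / (r_max if r_max is not None else x.max())  # scale radii to [0, 1]
    angles = 2.0 * np.pi * np.arange(n) / n            # n equally divided rays
    return np.column_stack([r * np.cos(angles), r * np.sin(angles)])

# Hypothetical readings for six variables (current, pressure, roll gap, ...).
sample = [310.0, 6.2, 48.5, 21.0, 3.3, 95.0]
verts = radar_vertices(sample)
```

Connecting the six vertices in order (closing the polygon) reproduces the shape that encodes one sampling moment in Fig. 6.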


3.2 Attentional Residual Network

The Convolutional Block Attention Module (CBAM), proposed by Woo et al. [11], is capable of extracting features in the channel and spatial dimensions simultaneously. Owing to the lightweight nature of CBAM, this paper integrates it into the Resnet network for feature extraction: the CBAM is added to the residual learning module, specifically after the convolutional layer and before the identity mapping part. The structure is shown in Fig. 4.

Fig. 4. CBAM residual learning module
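A minimal NumPy sketch of the CBAM computation in Eqs. (1)–(4) — channel attention followed by spatial attention — is shown below with random weights. In the paper this is realised with Keras layers, so the function and variable names here are illustrative only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W0, W1):
    """Eqs. (1)-(2): shared MLP over globally max- and average-pooled features."""
    f_max = F.max(axis=(0, 1))             # (C,)
    f_avg = F.mean(axis=(0, 1))            # (C,)
    mc = sigmoid(W1 @ np.maximum(W0 @ f_max, 0) + W1 @ np.maximum(W0 @ f_avg, 0))
    return F * mc                           # reweight channels

def spatial_attention(F, kernel):
    """Eqs. (3)-(4): max/avg over channels, stack, convolve, sigmoid."""
    stacked = np.stack([F.max(axis=2), F.mean(axis=2)], axis=2)  # (H, W, 2)
    k = kernel.shape[0]
    padded = np.pad(stacked, ((k // 2, k // 2), (k // 2, k // 2), (0, 0)))
    h, w = F.shape[:2]
    ms = np.empty((h, w))
    for i in range(h):                      # naive 2D convolution
        for j in range(w):
            ms[i, j] = np.sum(padded[i:i + k, j:j + k, :] * kernel)
    return F * sigmoid(ms)[..., None]       # reweight spatial positions

rng = np.random.default_rng(0)
F = rng.standard_normal((8, 8, 16))         # H x W x C input feature
W0 = rng.standard_normal((4, 16)) * 0.1     # C/r x C, reduction ratio r = 4
W1 = rng.standard_normal((16, 4)) * 0.1     # C x C/r
kernel = rng.standard_normal((7, 7, 2)) * 0.05
F2 = spatial_attention(channel_attention(F, W0, W1), kernel)  # CBAM order
```

Both attention maps lie in (0, 1), so CBAM only reweights the feature — its output never exceeds the input in magnitude.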

3.3 Model Framework

To address the overlap problems of traditional two-dimensional image data and the tendency of Resnet networks to overfit, this paper introduces radar maps and CBAM, converting the one-dimensional signal data into two-dimensional radar maps as the input to the convolutional neural network. To let the network adaptively extract the data features in the middle part of the radar map, CBAM is added to the residual learning modules of the residual convolutional network. The feature extraction network is divided into an input part, an intermediate part and an output part. The intermediate part contains three residual learning modules with CBAM; a batch normalization (BN) layer is added to the main path of the final residual learning module to speed up network convergence; and the output part uses an average pooling layer, a flatten layer and a dense layer to produce the final classification results. This constitutes the Resnet-CBAM network model, whose framework is shown in Fig. 5.

Fig. 5. Resnet-CBAM Model framework

4 Example Application and Analysis

The high-pressure roller mill process, a key raw-material handling process in the metallurgical, coal and steel industries, involves a large number of monitoring variables, a large amount of data and high real-time requirements [12]; it is a complex equipment process in a big data environment. The high-pressure roller mill uses the principle of laminar crushing under high pressure to achieve mutual crushing of particles between layers [13]. In this paper, six monitoring indicators, such as current, voltage, load bearing and roller gap, are considered. The operation status of the high-pressure roller mill is divided into five kinds: normal, no-load, shutdown, abnormal and fault. The model in this paper classifies and identifies these five states to verify its feasibility and validity.

4.1 Model Training

The experiments are based on the Keras deep learning library and are implemented using Python 3.7 and Jupyter Notebook. The model selects six key variables, such as current, pressure and roll gap, as radar plot variables; Fig. 6 shows examples of two-dimensional radar maps for some operating conditions of a high-pressure roller mill. The experiment divided 4445 sets of data into training and validation sets at a ratio of 8:2, with a further 2580 sets of data as the test set.

Fig. 6. Example of a partial operational status radar diagram: (a) normal condition, (b) no-load condition, (c) abnormal state
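The data augmentation used in training below — rotations by 90°, 180° and 270° plus a horizontal flip of each radar image — is a one-liner with NumPy (augment is a hypothetical helper name):

```python
import numpy as np

def augment(img):
    """Return the 90/180/270-degree rotations and a horizontal flip of img."""
    return [np.rot90(img, k) for k in (1, 2, 3)] + [np.fliplr(img)]

img = np.arange(16).reshape(4, 4)
views = augment(img)
```

Each of the four views is a distinct image of the same shape, so one radar map yields five training samples in total.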

To avoid overfitting of the model, the radar plots were rotated 90°, 180° and 270° clockwise and flipped horizontally to augment the data. The activation function for model training is ReLU, and the maximum number of iterations is 10. Due to memory constraints, the batch size is set to 16. The experiment uses the Adam optimizer with a learning rate of 1 × 10−3. The final training and validation loss and accuracy curves are shown in Fig. 7. Table 1 and Fig. 8 report the performance of the model on the test set using precision, sensitivity, total accuracy and the confusion matrix, defined as follows:

precision = TP / (TP + FP) × 100%   (5)

sensitivity = TP / (TP + FN) × 100%   (6)

total accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%   (7)
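Eqs. (5)–(7) extend to the five-class case in one-vs-rest form, computable directly from the confusion matrix. A NumPy sketch with a hypothetical 3-class matrix:

```python
import numpy as np

def metrics(cm):
    """Per-class precision/sensitivity and total accuracy from a confusion
    matrix cm[true, predicted] — the one-vs-rest form of Eqs. (5)-(7)."""
    cm = np.asarray(cm, dtype=np.float64)
    tp = np.diag(cm)
    fp = cm.sum(axis=0) - tp   # predicted as the class but actually another
    fn = cm.sum(axis=1) - tp   # the class but predicted as another
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    total_accuracy = tp.sum() / cm.sum()
    return precision, sensitivity, total_accuracy

cm = [[50, 2, 0],   # rows: true class, columns: predicted class
      [1, 45, 4],
      [0, 3, 47]]
p, s, acc = metrics(cm)
```

Averaging the per-class values gives the macro averages reported in Table 1.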


Fig. 7. Model training results

where TP and FN denote positive samples predicted by the model as positive and negative respectively, and FP and TN denote negative samples predicted by the model as positive and negative respectively.

Table 1. Performance of the model in the test set

Status/Indicators | Precision (%) | Sensitivity (%) | F1-score (%) | Number of samples
Normal            | 99.74         | 99.68           | 99.71        | 1596
No load           | 100.00        | 94.86           | 97.36        | 214
Shutdown          | 99.40         | 100.00          | 99.70        | 329
Exceptions        | 98.03         | 100.00          | 99.01        | 249
Fault             | 97.77         | 100.00          | 98.87        | 219
Total accuracy    | 99.38         | –               | –            | 2580
Macro average     | 98.99         | 98.91           | 98.93        | –

Fig. 8. Confusion matrix

Experiments show that, for the identification and classification of the five operating states of the high-pressure roller mill, the proposed model achieves a total accuracy of 99.38% on the 2580 test samples, with a diagnostic time of about 8 ms per set of data, enabling intelligent health monitoring of the high-pressure roller mill equipment state and meeting the real-time requirements of a big data environment.

4.2 Comparative Experiments

To verify the effectiveness of the proposed method, it is compared under the same conditions with the traditional Resnet network, the channel attention Resnet, the spatial attention Resnet and the Senet network; the results of each model on the test set are shown in Table 2.

Table 2. Model comparison results

Models                   | Number of parameters (pcs) | Total accuracy (%) | Avg precision (%) | Avg sensitivity (%) | Avg F1-score (%)
Resnet                   | 272741 | 96.05 | 94.19 | 96.43 | 95.08
Channel Attention Resnet | 277151 | 98.42 | 98.00 | 97.68 | 97.81
Spatial Attention Resnet | 273623 | 93.53 | 91.33 | 95.84 | 92.70
Senet                    | 277151 | 98.68 | 98.20 | 97.98 | 98.38
Method of this paper     | 278033 | 99.38 | 98.99 | 98.91 | 98.93

As can be seen from Table 2, compared with the traditional residual network, the method of this paper performs best on all evaluation metrics at the cost of a slightly higher number of trainable parameters, and shows strong adaptive feature selection, extraction and generalization capabilities. This indicates that for 2D radar map image data, adding a convolutional block attention module to the residual network extracts image features at a finer granularity than a residual network with only a channel or only a spatial attention module.

5 Conclusion

In this paper, the original one-dimensional signals are transformed into two-dimensional radar maps according to the sensor sampling moments, excluding correlations between the data, and a convolutional block attention mechanism is added to the traditional residual network to build a radar map and attention residual network model for intelligent monitoring of big data processes. The effectiveness of the proposed method is verified on the operation process of a high-pressure roller mill, whose five operating states it can monitor in real time. In particular, the radar map presents the state of each variable of the equipment concisely and intuitively, and the image features are adaptively selected and extracted through an attention mechanism, improving recognition accuracy. Comparative experiments show that, compared with the traditional residual network, the method improves on accuracy, sensitivity and other evaluation criteria at the cost of a slightly higher number of parameters, providing a new solution for intelligent health monitoring of complex equipment in a big data environment.

References

1. Ince, T., Kiranyaz, S., Eren, L.: Real-time motor fault detection by 1-D convolutional neural networks. IEEE Trans. Ind. Electron. 63(11), 7067–7075 (2016)
2. Peng, D., Liu, Z., Wang, H.: A novel deeper one-dimensional CNN with residual learning for fault diagnosis of wheelset bearings in high-speed trains. IEEE Access 7, 10278–10293 (2018)
3. Li, Z., Lu, C., Wang, X.: Double-branch convolutional neural network fault diagnosis method considering the fault location and damage degree of rolling bearings. Sci. Technol. Eng. 22(4), 1441–1448 (2022)
4. Teng, R., Huang, H., Yang, K.: Online monitoring method of tool wear value based on image coding technology and convolutional neural network. Journal 28(4), 1042–1051 (2022)
5. Che, C., Wang, H., Ni, X.: Fault diagnosis of rolling bearings based on deep residual shrinkage network. J. Beijing Univ. Aeronaut. Astronaut. 47(7), 1399–1406 (2021)
6. He, K., Zhang, X., Ren, S.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
7. Chen, Y., Zhang, X., Chen, W.: Research on recognition of fly species based on improved RetinaNet and CBAM. IEEE Access 8, 102907–102919 (2020)
8. Jiang, Z., He, T., Shi, Y.: Remote sensing image classification by fusing convolutional attention mechanism with deep residual networks. Journal 43(4), 76–81 (2022)
9. Xu, H., Jiang, N., Qi, Z.: Speech noise reduction by convolutional recurrent networks based on attention mechanism. Journal 22(5), 1950–1957 (2022)
10. Zhang, J., Xie, Y., Xia, Y.: Attention residual learning for skin lesion classification. IEEE Trans. Med. Imag. 38(9), 2092–2103 (2019)
11. Woo, S., Park, J., Lee, J.Y.: CBAM: convolutional block attention module. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11211, pp. 3–19. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01234-2_1
12. Wei, B., Zhang, X., Li, L.: Progress and development trend of foreign application of high-pressure roller mill crushing process. Journal 10–18 (2022)
13. Qu, T., Yang, J., Xin, Y.: Structure of high-pressure roller mill for non-ferrous metal beneficiation and its process. Journal 1–24

Consensus Path-Following of Multiple Wheeled Mobile Robots with Complex Dynamics by Adaptive Fixed-Time Fuzzy Control

Junyi Yang1(B), Zhichen Li1(B), Huaicheng Yan1, Hao Zhang2, and Zhenghao Xi3

1 School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237, China
[email protected], [email protected]
2 Department of Control Science and Engineering, Tongji University, Shanghai 200092, China
3 School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China

Abstract. This paper concentrates on fixed-time consensus path-following control for multiple wheeled mobile robots (WMRs). The main goal is to achieve fixed-time consensus of multiple robot systems when the complex dynamic model of each robot is partially unknown, with a convergence time independent of the initial states of the system. To facilitate implementation, the path-following control is first decoupled into speed control and heading control. By introducing an algebraic implicit reference path, a new path-following control method is developed for the robots based on kinematic models. Secondly, by means of fuzzy logic system theory, a time-varying-gain adaptive fuzzy control strategy is proposed, which approximates the unknown parts of the dynamic models and compensates for external disturbances in real time. In addition, a fixed-time stability analysis of the tracking error dynamics based on the backstepping method is presented by constructing a barrier Lyapunov function and a quadratic Lyapunov function. Finally, the advantages and effectiveness of the method are demonstrated by simulation examples.

Keywords: wheeled mobile robots · consensus control · adaptive fuzzy control · fixed-time stabilization

1 Introduction

The consensus control of multi-agent systems has attracted much attention due to its wide application in many fields [1]. Thanks to their excellent physical implementation [2], wheeled mobile robots can adapt to a wide variety of operating scenarios. Therefore, in the past few years, academic interest in the collaborative control of multiple WMRs has surged [3]. Consensus tracking control is a basic problem of formation control and coordinated motion control. At present, there has been some research on multi-agent systems, such as the control of multi-agent systems with input saturation [4] and consensus control of multi-agent systems with disturbances [5]. In practical applications, with increasing requirements on control system performance, fixed-time consensus control of multi-agent systems has an advantage over finite-time consensus: the convergence time no longer depends on the initial values of the system, and the system can be stabilized merely by adjusting the parameters. Therefore, fixed-time control better matches the needs of actual systems. In [6], the fixed-time leader-follower consensus of high-order nonlinear multi-agent systems in triangular form with time-varying gain was studied. In [7], an implicit path description was introduced as a reference for second-order integrator multi-agent systems, and cooperative behavior on the path was realized by adjusting the speed of each agent to the required speed distribution. Unfortunately, complex vehicle dynamics and spatial constraints between networked systems are not considered in [8].

As for the treatment of unknown parts in complex models, fuzzy logic systems (FLS) have shown a good ability to approximate any smooth nonlinearity on a compact set to any precision, because they contain a reasonable structure of a series of local models in variable subspaces connected by basis functions [9]. Therefore, adaptive fuzzy schemes derived from FLS with training algorithms have sparked a wave of research interest. The fixed-time consensus problem for high-order integrator multi-agent systems with matched external disturbances was studied in [10]. In [11], a new fixed-time adaptive fuzzy control scheme combined with barrier Lyapunov function (BLF) techniques was proposed, which solved the problems of time-varying constraints and input saturation of uncertain nonlinear systems that are common in many practical systems. However, to avoid the singularity problem, most controllers based on fixed-time stability theory require the exponent terms to be designed as odd, which limits the parameter design.

Inspired by the above discussion, the main contributions of this letter can be highlighted as follows:
1) By introducing an algebraically implicit reference path, a consensus path-following algorithm with bounded velocity distribution is proposed. The design is decoupled into forward velocity control and heading control to ensure that all path-following errors reach a consensus and asymptotically converge to the origin.
2) Using FLSs to approximate the unknown terms in the complex dynamic model of the WMRs, a fixed-time controller based on the barrier Lyapunov function is designed to keep the velocity error within a time-varying constraint range and further improve the control performance.
3) The controller based on the fixed-time stability method ensures that the error converges to a small compact set within a fixed time whose bound depends only on the design parameters; moreover, the designed parameters need not be constrained to odd numbers, while singularity issues are still avoided.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 205–215, 2023. https://doi.org/10.1007/978-981-99-6187-0_20
In [11], a new fixed time adaptive fuzzy control scheme combined with barrier Lyapunov function (BLF) technology was proposed, which solved the problems of time-varying constraints and input saturation of uncertain of nonlinear systems that are common in many practical systems. However, in order to avoid singularity problem, most controllers based on fixed-time stability theory require exponential terms to be designed as odd, which parameter design is limited. Inspired by the above discussion, the main contributions of this letter can be highlighted as follows: 1) By introducing an algebraically implicit reference path, a consensus path-following algorithm with bounded velocity distribution is proposed. This algorithm decouples the design into forward velocity control and heading control to ensure that all path-following errors reach a consensus and asymptotically converge to the origin. 2) Using FLSs to approximate the unknown terms in the complex dynamic model of WMRs, a fixed time controller based on the barrier Lyapunov function is designed to keep the velocity error within the time-varying constraint range and further improve the control performance. 3) A controller based on fixed-time stability method can not only ensure that the error converges to a small compact set within fixed time, the convergence time only depends on the design parameters. And the parameters designed in this paper do not need to be constrained to odd numbers, while avoiding singularity issues.


Throughout the paper, Rⁿ represents the n-dimensional Euclidean space; R is the set of real numbers; sign(·) denotes the sign function; for x ∈ R, x^α = |x|^α sign(x) with α > 0; the superscripts T and −1 denote the transpose and inverse of a matrix, respectively; Q > 0 indicates that Q is positive definite.

2 Problem Statement and Preliminaries

A schematic diagram of the WMR model is illustrated in Fig. 1.

Fig. 1. Schematic diagram for WMR

Consider a group of n nonholonomic wheeled mobile robots. The kinematic model of the ith robot Ri, i ∈ In = {1, 2, ..., n}, can be described as

[ẋi; ẏi; θ̇i] = [cos θi, 0; sin θi, 0; 0, 1][vi; ωi],   (1)

where qi = [xi yi θi]ᵀ represents the position and orientation in the global coordinate system, and vi, ωi are the forward and angular speeds of Ri. The dynamic model is

Mq̈i + C(qi, q̇i)q̇i + G(qi) + F(q̇i) + τdi = Bτi + Aᵀ(qi)λ,   (2)

where C(qi, q̇i) is the Coriolis matrix, F(q̇i) collects the dynamic and static friction terms, G(qi) denotes the gravitational potential energy related to pose and speed, with G(qi) = 0 if the WMR moves on a horizontal plane, τdi is the bounded external disturbance, A(qi) = [−sin θi cos θi 0] is the constraint matrix, τi = [τri τli]ᵀ collects the right and left wheel torques, λ is the Lagrange multiplier representing the constraint condition of the nonholonomic system, and M = diag(m, m, I) and B = (1/r)[cos θi, cos θi; sin θi, sin θi; b, −b] are the system parameter and input transfer matrices, respectively.
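A quick numerical sanity check of the kinematic model (1): integrating it with constant inputs v and ω traces a circle of radius v/ω. The Euler integrator below is only an illustration, not part of the paper's controller.

```python
import numpy as np

def step_kinematics(q, v, omega, dt):
    """One Euler step of model (1); q = [x, y, theta]."""
    x, y, th = q
    return np.array([x + v * np.cos(th) * dt,
                     y + v * np.sin(th) * dt,
                     th + omega * dt])

q = np.zeros(3)
for _ in range(1000):          # 10 s with dt = 0.01, v = 1, omega = 0.5
    q = step_kinematics(q, v=1.0, omega=0.5, dt=0.01)
```

The trajectory stays (up to Euler error) on the circle of radius v/ω = 2 centred at (0, 2), and the heading accumulates to ωt = 5 rad.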


The nonholonomic constraint for the WMR is A(qi)q̇i = 0. Choose a nonsingular matrix S(qi) = [cos θi, 0; sin θi, 0; 0, 1] such that A(qi)S(qi) = 0 to eliminate the constraint; the dynamic Eq. (2) can then be transformed into

M̄(qi)η̇i + C̄(qi, q̇i)ηi + F̄(q̇i) + τ̄di = τ̄i   (3)

¯ 2 ] = S  (qi )MS(qi ), ηi = [η1i η2i ] = [vi ωi ] , ¯ i ) = [M ¯1 M where M(q  ¯ ¯ ¯ ˙ i ) + C(qi , q˙i )S(qi )], τ¯di = [¯ C(qi , q˙i ) = [C1 C2 ] = S  (qi )[MS(q τ1di τ¯2di ] =     ¯ ¯ ¯ τ1i τ¯2i ] = S (qi )Bτi , and F(q˙i ) = [F1 F2 ] = S  (qi )F(q˙i ). S (qi )τdi , τ¯i = [¯ Lemma 1. [12] Suppose that there is a continuous positive definite and radially unbounded function V (x) such that V˙ (x) ≤ −αV p x (t) − βV q x (t)

(4)

where α, β > 0, and satisfying 0 < p < 1, q > 1. The origin is fixed-time equilibrium for system (3) with setting function T (x0 ) bounded by T (x0 ) ≤ Tmax =

1 1 + α (1 − p) β (q − 1)

(5)
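The bound (5) can be sanity-checked numerically: for \dot{V} = −αV^p − βV^q, the exact time to reach the origin from V(0) = V₀ is ∫₀^{V₀} dV/(αV^p + βV^q), and it stays below T_max for any V₀ — the hallmark of fixed-time, rather than merely finite-time, convergence. A sketch (quadrature scheme and test constants are our choices):

```python
def settling_time(alpha, beta, p, q, V0, n=200_000):
    """T(V0) = integral_0^V0 dV / (alpha*V**p + beta*V**q): the exact time for
    V_dot = -alpha*V**p - beta*V**q to reach the origin from V(0) = V0,
    evaluated with the midpoint rule (integrable at 0 because p < 1)."""
    h = V0 / n
    return sum(h / (alpha * (h * (i + 0.5)) ** p + beta * (h * (i + 0.5)) ** q)
               for i in range(n))

# Lemma 1 bound for alpha = beta = 1, p = 1/2, q = 2:
T_MAX = 1.0 / (1.0 * (1 - 0.5)) + 1.0 / (1.0 * (2 - 1))   # = 3.0
```

Increasing V₀ increases the settling time, but it never exceeds T_MAX: the q > 1 term dominates for large V and kills the dependence on the initial condition.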

Lemma 2 [12]. Suppose that there is a continuous, positive definite, and radially unbounded function V(x) such that

\dot{V}(x) ≤ −αV^p(x(t)) − βV^q(x(t)) + ξ,    (6)

where α, β > 0, 0 < p < 1, q > 1, and ξ is a positive constant. Then the origin is a fixed-time equilibrium, with settling time T(x_0) bounded by

T(x_0) ≤ T_max = \frac{1}{αδ(1 − p)} + \frac{1}{βδ(q − 1)}.    (7)

The residual set of the system is given by

Ω = \left\{ x \,\middle|\, V(x) ≤ \min\left\{ \left(\frac{ξ}{α(1 − δ)}\right)^{1/p}, \left(\frac{ξ}{β(1 − δ)}\right)^{1/q} \right\} \right\}    (8)

where 0 < δ < 1.

Lemma 3 [13]. For z_0 ∈ R satisfying |z_0| < k with any positive constant k, the following inequality holds:

\log\left(\frac{k²}{k² − z_0²}\right) ≤ \frac{z_0²}{k² − z_0²}    (9)
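Inequality (9) is the standard bound used with barrier Lyapunov functions; it is log(1 + u) ≤ u with u = z₀²/(k² − z₀²). A quick numerical spot-check (the helper name is ours):

```python
import math

def blf_gap(z0, k):
    """Right side minus left side of (9): z0^2/(k^2 - z0^2) - log(k^2/(k^2 - z0^2)).
    Lemma 3 says this is non-negative whenever |z0| < k."""
    d = k * k - z0 * z0
    return z0 * z0 / d - math.log(k * k / d)

gaps = [blf_gap(z, 1.0) for z in (-0.999, -0.5, 0.0, 0.5, 0.9, 0.999)]
```

The gap is zero only at z₀ = 0 and grows as |z₀| approaches the barrier k, which is why the barrier term in V₂ below blows up before the constraint is violated.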

Consensus Path-Following of Multiple Wheeled Mobile Robots    209

3 Main Results

3.1 Consensus Path-Following Control for Multiple WMRs

Define a reference planar path as P_ri = { [x_i, y_i]^T ∈ R² | p_i(x_i, y_i) = 0 }, and regard p_i(x_i, y_i) as the path-following error. The path-following control problem of R_i is thus converted to stabilizing p_i(x_i, y_i) at the origin. The tracking error of R_i can be written as

\dot{p}_i = |∇p_i| v_i \sin θ_{ei}
\dot{θ}_{ei} = ω_i − \dot{θ}_{di}    (10)

where |∇p_i| = \sqrt{p_{ix}² + p_{iy}²}, p_{ix} and p_{iy} are the first-order partial derivatives of p_i, θ_{ei} is the orientation error of R_i, and the desired orientation θ_{di} and angular speed \dot{θ}_{di} can be calculated as

θ_{di} = −\arctan(p_{ix}/p_{iy}),  p_{iy} ≠ 0    (11)

\dot{θ}_{di} = \frac{(p_{ix}p_{ixy} − p_{iy}p_{ixx}) v_i \cos θ_i + (p_{ix}p_{iyy} − p_{iy}p_{ixy}) v_i \sin θ_i}{|∇p_i|²}    (12)

where p_{ixx}, p_{ixy} and p_{iyy} are the second-order partial derivatives of p_i.

Based on graph theory, the adjacency matrix of a directed graph is A = [a_{ij}] ∈ R^{n×n}. Define the consensus error as

e_{pi} = \sum_{j∈N_i} a_{ij}(p_i − p_j) + b_i(p_i − p_0),

where N_i is the index set of all neighbors of robot R_i, and p_0 represents the path-following error of the virtual leader R_0, so that [p_0, θ_{e0}] = 0. b_i = 1 when the leader is accessible to R_i, and b_i = 0 otherwise.

Lemma 4 [8]. Let B = diag(b_1, b_2, ..., b_n) be a non-negative diagonal matrix and define the Laplacian matrix L = [l_{ij}] ∈ R^{n×n} by l_{ij} = −a_{ij} for i ≠ j and l_{ii} = \sum_{j=1, j≠i}^n a_{ij}. All eigenvalues of the matrix L + B have positive real parts if and only if the directed graph G has a spanning tree with R_0 as its root vertex.

In order to enable multiple robots to follow the planar path provided by the virtual leader R_0 within a fixed time, the following controller is proposed:

v_i = v_0 + (v_m − v_0) e^{−|e_{pi}|^{−1}}    (13)

ω_i = \dot{θ}_{di} − γ_{1i} s_i^{α_1} − γ_{2i} s_i^{β_1} − \frac{γ_{3i}}{1 + (γ_{3i}e_{pi})²} \Big( \big( \sum_{j∈N_i} a_{ij} + b_i \big) |∇p_i| v_i \sin θ_{ei} − \sum_{j∈N_i} a_{ij} |∇p_j| v_j \sin θ_{ej} \Big)    (14)

where v_m is the maximum velocity, 0 < α_1 < 1, β_1 > 1, γ_{1i}, γ_{2i}, γ_{3i} are positive constants, and the sliding variables are designed as s_i = \arctan(γ_{3i}e_{pi}) + θ_{ei}.
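A sketch of the control laws (13) and (14); we read the exponent in (13) as −1/|e_pi|, so that v_i → v_0 when the consensus error vanishes (consistent with Theorem 1), and all function and argument names are our own:

```python
import math

def sig(x, a):
    """|x|**a * sign(x)."""
    return abs(x) ** a * (1.0 if x > 0 else -1.0 if x < 0 else 0.0)

def forward_speed(e_pi, v0=1.0, vm=2.0):
    """Speed law (13): v0 at zero consensus error, approaching vm as |e_pi| grows."""
    if e_pi == 0.0:
        return v0
    return v0 + (vm - v0) * math.exp(-1.0 / abs(e_pi))

def angular_speed(e_pi, theta_ei, theta_di_dot, grads, speeds, sin_te,
                  a_row, b_i, i,
                  g1=0.2, g2=0.8, g3=0.2, alpha1=0.8, beta1=1.2):
    """Orientation law (14); grads[j] = |grad p_j|, sin_te[j] = sin(theta_ej)
    for every robot j; a_row is row i of the adjacency matrix."""
    s_i = math.atan(g3 * e_pi) + theta_ei          # sliding variable
    e_dot = ((sum(a_row) + b_i) * grads[i] * speeds[i] * sin_te[i]
             - sum(a_row[j] * grads[j] * speeds[j] * sin_te[j]
                   for j in range(len(a_row))))    # consensus-error derivative
    return (theta_di_dot - g1 * sig(s_i, alpha1) - g2 * sig(s_i, beta1)
            - g3 / (1.0 + (g3 * e_pi) ** 2) * e_dot)
```

On the sliding surface (s_i = 0, zero consensus error, zero orientation errors) the law reduces to ω_i = θ̇_di, i.e. the robot simply tracks the path curvature.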


Theorem 1. Suppose a group of WMRs interacts over a directed topology. If the consensus path-following controllers are designed as (13) and (14), then not only does the velocity of each R_i reach the desired positive speed v_0, but the consensus error e_{pi} also converges to a neighborhood of zero within the fixed time T_c, regardless of the initial value e_{pi}(0), i.e.,

\lim_{t→+∞} ||v_i − v_0|| = 0,  \lim_{t→+∞} e_{pi} = 0    (15)

Proof. Choose the Lyapunov function V_1 = \sum_{i=1}^n s_i². Its derivative is

\dot{V}_1 = 2\sum_{i=1}^n s_i \dot{s}_i = 2\sum_{i=1}^n s_i \left( \frac{γ_{3i}\dot{e}_{pi}}{1 + (γ_{3i}e_{pi})²} + ω_i − \dot{θ}_{di} \right)    (16)

where

\dot{e}_{pi} = \Big( \sum_{j∈N_i} a_{ij} + b_i \Big) |∇p_i| v_i \sin θ_{ei} − \sum_{j∈N_i} a_{ij} |∇p_j| v_j \sin θ_{ej}    (17)

Injecting the velocity control laws (13) and (14) into (16), one has

\dot{V}_1 = −\sum_{i=1}^n \left( 2γ_{1i}|s_i|^{1+α_1} + 2γ_{2i}|s_i|^{1+β_1} \right) ≤ −γ_1 V_1^{(1+α_1)/2} − γ_2 V_1^{(1+β_1)/2}    (18)

where γ_1 = 2\min(γ_{1i}), γ_2 = 2\min(γ_{2i}). Based on Lemma 1, the convergence time of s_i, i = 1, 2, ..., n, can be expressed as T_c ≤ T_{c,max} = \frac{1}{γ_1(1 − α_1)} + \frac{1}{γ_2(β_1 − 1)}.

When the sliding surfaces are reached, s_i = 0, which gives θ_{ei} = −\arctan(γ_{3i}e_{pi}). Define P = [p_1, p_2, ..., p_n]^T and E_p = [e_{p1}, e_{p2}, ..., e_{pn}]^T; then E_p = (L + B)P and \dot{E}_p = (L + B)\dot{P}, where \dot{P} = [\dot{p}_1, \dot{p}_2, ..., \dot{p}_n]^T. Hence \dot{E}_p can be rewritten as \dot{E}_p = −(L + B)QE_p with

Q = \mathrm{diag}\left( \frac{γ_{31}|∇p_1|v_1}{\sqrt{1 + (γ_{31}e_{p1})²}}, ..., \frac{γ_{3n}|∇p_n|v_n}{\sqrt{1 + (γ_{3n}e_{pn})²}} \right).

It is obvious that Q is positive definite, and according to Lemma 4 we obtain asymptotic convergence of E_p. The asymptotic convergence of P is then ensured, and all v_i converge to the desired velocity v_0. This completes the proof of Theorem 1.

Remark 1. When only the kinematic model of the robot is considered, the controller (13) and (14) designed above solves the consensus-tracking control problem for the multi-WMR system within a fixed time. In practical applications, the multi-robot system is disturbed by other external factors, so it is necessary to also consider the robot's dynamic model.

3.2 Adaptive Fixed-Time Fuzzy Control Design for the Dynamic Model

The velocity from the fixed-time consensus control design is used as the control input of the dynamic model, and the torque controller is designed such that the actual velocity tracks the reference, i.e.,

\lim_{t→+∞} ||η_{ia} − U_i|| = 0    (19)

where η_{ia} = [v_{ia}, ω_{ia}]^T is the actual velocity and U_i = [v_i, ω_i]^T is the desired velocity designed by the consensus path-following controller.

Define Z = [z_{1i}, z_{2i}]^T = η_{ia} − U_i as the error between the actual and reference velocities of the i-th robot. To approximate the unknown nonlinear functions in the dynamic model with a fuzzy logic system (FLS), define the unknown constants ψ_i = ||Y_i||², ψ_i > 0, and denote by \hat{ψ}_i the estimate of ψ_i. The estimation error is then \tilde{ψ}_i = ψ_i − \hat{ψ}_i. The adaptive fuzzy control strategy is designed as

\begin{bmatrix} \bar{τ}_{1i} \\ \bar{τ}_{2i} \end{bmatrix} = \begin{bmatrix} m & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} \dot{v}_i \\ \dot{ω}_i \end{bmatrix} + \begin{bmatrix} L_{1i} \\ L_{2i} \end{bmatrix}    (20)

where, for r = 1, 2,

L_{ri} = −λ_i z_{ri} − \frac{z_{ri}}{2(ρ_i² − z_{ri}²)} − \frac{z_{ri}\hat{ψ}_{ri}Ψ_i^T(X_i)Ψ_i(X_i)}{2a²(ρ_i² − z_{ri}²)} − \frac{k_{1i} z_{ri}^{2α_2−1}}{2α_2(ρ_i² − z_{ri}²)^{α_2−1}} − \frac{k_{2i} z_{ri}^{2β_2−1}}{2β_2(ρ_i² − z_{ri}²)^{β_2−1}},

λ_i = \sqrt{(\dot{ρ}_i/ρ_i)²} + ϑ denotes a time-varying gain, ϑ, α_2, β_2, k_{1i}, k_{2i} are parameters to be determined, and Ψ_i is a fuzzy basis function vector with X_i = [v_{ia}, v_i, ω_{ia}, ω_i, \dot{v}_{ia}, \dot{v}_i, \dot{ω}_{ia}, \dot{ω}_i, \dot{\hat{ψ}}_{1i}, \dot{\hat{ψ}}_{2i}]^T. The corresponding adaptive law for \hat{ψ}_{ri} is given by

\dot{\hat{ψ}}_{ri} = \frac{z_{ri}² Ψ_i^T(X_i)Ψ_i(X_i)}{2a²(ρ_i² − z_{ri}²)²} − κ_{1i}\hat{ψ}_{ri} − κ_{2i}\hat{ψ}_{ri}^{2β_2−1},  r = 1, 2    (21)

where the initial condition satisfies \hat{ψ}_i(t_0) ≥ 0, and κ_{1i}, κ_{2i} are positive constants to be designed.

Theorem 2. For the dynamic system of the i-th WMR, under the adaptive fixed-time fuzzy controller (20) with adaptive law (21) satisfying λ_i + \dot{ρ}_i/ρ_i ≥ 0, ϑ > 0, 1/2 < α_2 < 1, β_2 > 1, k_{1i} > 0, and k_{2i} > 0, the velocity error Z converges to a neighborhood of zero within the fixed time T_v, regardless of the initial value Z(0).

Proof. Consider the error z_{1i} and construct a mixed barrier-quadratic Lyapunov function

V_2 = \frac{1}{2}\log\left( \frac{ρ_i²}{ρ_i² − z_{1i}²} \right) + \frac{1}{2}\tilde{ψ}_{1i}².    (22)

The derivative of V_2 is calculated as

\dot{V}_2 = −\tilde{ψ}_{1i}\dot{\hat{ψ}}_{1i} + \frac{z_{1i}}{ρ_i² − z_{1i}²}\left( −\frac{\dot{ρ}_i}{ρ_i}z_{1i} − \dot{v}_i + \frac{1}{m}\left( \bar{τ}_{1i} − \bar{C}_1 η_{ia} − \bar{F}_1 − \bar{τ}_{1di} \right) \right)    (23)

Define the set of intermediate variables

W_i = −\bar{C}_1 η_{ia} − \bar{F}_1 − \bar{τ}_{1di}    (24)

Notice that these complex model terms are not easy to obtain and cannot be directly utilized for controller design. According to [12], there exists an FLS Y^TΨ(X) suitable for approximating W_i:

W_i = Y^TΨ(X_i) + ε(X_i),    (25)

where X_i = [v_{ia}, v_i, \dot{v}_{ia}, \dot{v}_i, \dot{\hat{ψ}}_{1i}, \dot{\hat{ψ}}_{2i}]^T is the input vector of the FLS, Y represents the weight matrix, Ψ denotes the basis function vector, and ε is the approximation error satisfying |ε(X_i)| ≤ \bar{ε}. Injecting the controller (20) into (23), one has

\dot{V}_2 ≤ \frac{z_{1i}}{ρ_i² − z_{1i}²}\left( −\frac{k_{1i} z_{1i}^{2α_2−1}}{2α_2(ρ_i² − z_{1i}²)^{α_2−1}} − \frac{k_{2i} z_{1i}^{2β_2−1}}{2β_2(ρ_i² − z_{1i}²)^{β_2−1}} − \frac{\hat{ψ}_{1i}Ψ_i^TΨ_i z_{1i}}{2a²(ρ_i² − z_{1i}²)} − \frac{z_{1i}}{2(ρ_i² − z_{1i}²)} + W_i \right) − \tilde{ψ}_{1i}\dot{\hat{ψ}}_{1i}    (26)

Substituting the adaptive law (21), using Young's inequality, and applying Lemma 3, it follows that

\dot{V}_2 ≤ −k_{1i}\left( \frac{1}{2}\log\frac{ρ_i²}{ρ_i² − z_{1i}²} \right)^{α_2} − k_{2i}\left( \frac{1}{2}\log\frac{ρ_i²}{ρ_i² − z_{1i}²} \right)^{β_2} − κ_{1i}\left( \frac{\tilde{ψ}_{1i}²}{2} \right)^{α_2} − \frac{2^{β_2}κ_{2i}(2β_2 − 1)}{2β_2}\left( \frac{\tilde{ψ}_{1i}²}{2} \right)^{β_2} + ξ ≤ −K_1 V_2^{α_2} − K_2 V_2^{β_2} + ξ,    (27)

where K_1 = \min\{k_{1i}, κ_{1i}\}, K_2 = \min\{k_{2i}, 2^{β_2}κ_{2i}(2β_2 − 1)/(2β_2)\}, and ξ = κ_{1i}ψ_{1i}²/2 + κ_{2i}(2β_2 − 1)ψ_{1i}^{2β_2}/(2β_2) + (ψ_{1i}² + a² + \bar{ε}²)/2.

As for the error term z_{2i}, the proof of convergence is similar to that of z_{1i}. Based on Lemma 2, the convergence time of Z to the neighborhood of zero can be expressed as T_v ≤ T_{v,max} = \frac{1}{K_1 δ(1 − α_2)} + \frac{1}{K_2 δ(β_2 − 1)}, and the corresponding convergence set is Ω = \{ x \mid V(x) ≤ \min\{ (ξ/(K_1(1 − δ)))^{1/α_2}, (ξ/(K_2(1 − δ)))^{1/β_2} \} \}. This completes the proof of Theorem 2.

Remark 2. Different from the design in [13], the term z_{1i}^{2α_2−1} does not require the exponent α_2 to be an odd positive number, which relaxes the conditions on the parameter design and avoids the singularity problem at the same time.

Remark 3. A mixed barrier-quadratic Lyapunov function is constructed in (22), which ensures that the error remains within the prescribed performance bound [−ρ_i, ρ_i] regardless of its initial value. The derived fixed convergence time is characterized by the design parameters and is independent of the system coefficients and initial conditions. Moreover, the time-varying control gain is better suited to practical applications than a constant one.

4 Numerical Example

To verify the effectiveness of the designed consensus path-following controller, consider three robots with the fixed directed topology depicted in Fig. 2, Laplacian matrix

L = \begin{bmatrix} 1 & −1 & 0 \\ 0 & 1 & −1 \\ −1 & 0 & 1 \end{bmatrix},

and B = diag(1, 0, 0). The reference path is given as p(x, y) = y − 3\sin(πx/9) = 0, and the initial positions of the three robots are [9, 3]^T, [7, 4]^T and [8, 5]^T, respectively. The desired velocity is designed as v_0 = 1 m/s and the maximum speed is v_m = 2 m/s. The control gains are selected as γ_{11} = γ_{12} = γ_{13} = 0.2, γ_{21} = γ_{22} = γ_{23} = 0.8, γ_{31} = γ_{32} = γ_{33} = 0.2, with exponents α_1 = 0.8, β_1 = 1.2.

Fig. 2. Topology

Figure 3 shows that the designed controller enables each robot to track the reference path, regardless of its initial position. Let the dynamic system parameters be m = 10 kg, I = 5 kg·m², r = 0.1 m, and b = 1 m. For the adaptive fuzzy control design, according to Theorem 2, the prescribed performance bound of the barrier Lyapunov function is set as ρ_i = 0.01 sin(0.5t) + 0.17, and the fuzzy control strategy uses a = 5, k_{11} = k_{12} = k_{13} = 5, k_{21} = k_{22} = k_{23} = 1, ϑ = 0.1, α_2 = 3/4, β_2 = 2. The corresponding adaptive law (21) uses κ_{11} = κ_{12} = κ_{13} = 0.5, κ_{21} = κ_{22} = κ_{23} = 3.
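At the kinematic level, this example can be reproduced with a short Euler simulation of (1) under the laws (13)-(14). The sketch below is ours: initial headings (which the text does not specify) are assumed equal to the desired orientation θ_di at the start, the exponent in (13) is read as −1/|e_pi|, and the dynamic/fuzzy loop is omitted:

```python
import math

PI = math.pi
A = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # adjacency recovered from L
B = [1, 0, 0]                            # leader access gains
G1, G2, G3, ALPHA1, BETA1 = 0.2, 0.8, 0.2, 0.8, 1.2
V0, VM = 1.0, 2.0

def sig(x, a):
    return abs(x) ** a * (1.0 if x > 0 else -1.0 if x < 0 else 0.0)

def path(x, y):
    """p(x, y) = y - 3 sin(pi x / 9) and its needed partial derivatives."""
    p = y - 3.0 * math.sin(PI * x / 9.0)
    px = -(PI / 3.0) * math.cos(PI * x / 9.0)        # p_x  (p_y = 1)
    pxx = (PI * PI / 27.0) * math.sin(PI * x / 9.0)  # p_xx (p_xy = p_yy = 0)
    return p, px, pxx

def simulate(T=150.0, dt=0.01):
    st = [[9.0, 3.0, 0.0], [7.0, 4.0, 0.0], [8.0, 5.0, 0.0]]
    for s in st:                                     # assumed initial heading
        _, px, _ = path(s[0], s[1])
        s[2] = -math.atan(px)                        # theta_di with p_y = 1
    for _ in range(int(T / dt)):
        p, grad, te, pxx = [], [], [], []
        for x, y, th in st:
            pv, px, pxxi = path(x, y)
            p.append(pv)
            grad.append(math.hypot(px, 1.0))
            te.append(th + math.atan(px))            # theta_ei = theta_i - theta_di
            pxx.append(pxxi)
        e = [sum(A[i][j] * (p[i] - p[j]) for j in range(3)) + B[i] * p[i]
             for i in range(3)]
        v = [V0 if e[i] == 0 else V0 + (VM - V0) * math.exp(-1.0 / abs(e[i]))
             for i in range(3)]                      # speed law (13)
        for i in range(3):
            x, y, th = st[i]
            tdd = -pxx[i] * v[i] * math.cos(th) / grad[i] ** 2   # (12), p_y = 1
            s_i = math.atan(G3 * e[i]) + te[i]
            e_dot = ((sum(A[i]) + B[i]) * grad[i] * v[i] * math.sin(te[i])
                     - sum(A[i][j] * grad[j] * v[j] * math.sin(te[j])
                           for j in range(3)))
            w = (tdd - G1 * sig(s_i, ALPHA1) - G2 * sig(s_i, BETA1)
                 - G3 / (1.0 + (G3 * e[i]) ** 2) * e_dot)        # (14)
            st[i] = [x + v[i] * math.cos(th) * dt,
                     y + v[i] * math.sin(th) * dt,
                     th + w * dt]
    return [abs(path(x, y)[0]) for x, y, th in st]

final_errors = simulate()
```

Despite initial path errors of roughly 2-4 m, all three robots' path-following errors shrink toward zero, matching the qualitative behavior in Fig. 3.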


Fig. 3. Consensus performance of following the reference path

Using the mixed barrier-quadratic Lyapunov function, it can be seen from Fig. 4 that the velocity error converges well regardless of the initial error and never violates the predefined constraints −ρ and ρ.

Fig. 4. Forward and angular velocity errors for WMRs.

5 Conclusion

This article studies the fixed-time consensus control of multi-WMR systems. The proposed speed and orientation controllers ensure that the error between the robots' motion paths and the reference path converges to zero within a fixed time. The unknown terms in the dynamic system are approximated by a fuzzy logic system. The time-varying-gain controller designed on the basis of the barrier Lyapunov function ensures that the velocity error respects the preset time-varying constraint boundaries. In addition, based on the fixed-time method, the velocity error converges to a small compact set within a fixed time. A numerical example with three WMRs is provided to demonstrate the effectiveness and advantages of the proposed adaptive fixed-time control method.

References

1. Zuo, Z., Han, Q.L., Ning, B., Ge, X., Zhang, X.M.: An overview of recent advances in fixed-time cooperative control of multiagent systems. IEEE Trans. Industr. Inf. 14(6), 2322–2334 (2018)
2. Zhang, H., Sun, J., Wang, Z.: Distributed control of nonholonomic robots without global position measurements subject to unknown slippage constraints. IEEE/CAA J. Automatica Sinica 9(2), 354–364 (2022)
3. Ning, B., Han, Q.L., Lu, Q.: Fixed-time leader-following consensus for multiple wheeled mobile robots. IEEE Trans. Cybern. 50(10), 4381–4392 (2020)
4. Su, Y., Wang, Q., Sun, C.: Self-triggered consensus control for linear multi-agent systems with input saturation. IEEE/CAA J. Automatica Sinica 7(1), 150–157 (2019)
5. Wang, H., Yu, W., Ding, Z., Yu, X.: Tracking consensus of general nonlinear multiagent systems with external disturbances under directed networks. IEEE Trans. Autom. Control 64(11), 4772–4779 (2019)
6. You, X., Hua, C., Li, K., Jia, X.: Fixed-time leader-following consensus for high-order time-varying nonlinear multiagent systems. IEEE Trans. Autom. Control 65(12), 5510–5516 (2020)
7. Zuo, Z., Cichella, V., Xu, M., Hovakimyan, N.: Three-dimensional coordinated path-following control for second-order multi-agent networks. J. Franklin Inst. 352(9), 3858–3872 (2015)
8. Zuo, Z., Song, J., Han, Q.L.: Coordinated planar path-following control for multiple nonholonomic wheeled mobile robots. IEEE Trans. Cybern. 52(9), 9404–9413 (2021)
9. Tong, S., Li, K., Li, Y.: Robust fuzzy adaptive finite-time control for high-order nonlinear systems with unmodeled dynamics. IEEE Trans. Fuzzy Syst. 29(6), 1576–1589 (2020)
10. Zuo, Z., Tian, B., Defoort, M., Ding, Z.: Fixed-time consensus tracking for multiagent systems with high-order integrator dynamics. IEEE Trans. Autom. Control 63(2), 563–570 (2017)
11. Sun, J., Yi, J., Pu, Z.: Fixed-time adaptive fuzzy control for uncertain nonstrict-feedback systems with time-varying constraints and input saturations. IEEE Trans. Fuzzy Syst. 30(4), 1114–1128 (2021)
12. Jia, T., Pan, Y., Liang, H., Lam, H.K.: Event-based adaptive fixed-time fuzzy control for active vehicle suspension systems with time-varying displacement constraint. IEEE Trans. Fuzzy Syst. 30(8), 2813–2821 (2021)
13. Sun, Y., Wang, F., Liu, Z., Zhang, Y., Chen, C.P.: Fixed-time fuzzy control for a class of nonlinear systems. IEEE Trans. Cybern. 52(5), 3880–3887 (2020)

Level Control of Chemical Coupling Tank Based on Reinforcement Learning Method

Yuheng Li, Quan Li, and Fei Liu(B)

Key Laboratory of Advanced Control of Light Industry Process of Ministry of Education, Jiangnan University, Wuxi 214122, Jiangsu, China
[email protected]

Abstract. Most industrial processes in actual production have complex mechanistic knowledge and nonlinear characteristics, so it is difficult to establish accurate mathematical models for them. Data-driven methods do not rely on process models and need only process data to produce a control strategy. Model-free reinforcement learning (RL) is one such method for obtaining control strategies from process data. This paper proposes a Deep Q-Network (DQN) method with a self-adjusting exploration probability and uses it to obtain the optimal control strategy for a chemical coupling tank with nonlinear interaction. The simulation results show that the DQN method makes the liquid level of the coupling tank reach the target quickly and suppresses disturbances to a certain extent.

Keywords: Reinforcement Learning · Neural Network · Coupling Tank · Data-driven

1 Introduction

In industrial processes, fluids are usually stored in storage tanks and flow to other tanks as required. In most cases, however, the fluid must be kept at a specific height or within a certain range, so the flow into the tanks must be controlled. The coupling tank system, illustrated in Fig. 1 below, is a special tank system and an important industrial model. It represents a basic problem arising in the process industry and is applied in petroleum, chemical, and other important industries [1]. As an illustration, in the flotation process, liquid level control is a very important part of the entire operation. In mineral flotation, the liquid level in the tanks determines the recovery rate and flotation grade of the final result. An excessively high liquid level can increase the recovery rate of minerals, but it also reduces the flotation grade to a certain extent. Conversely, if the liquid level is too low, the flotation grade increases while the recovery rate is reduced. The stability of the slurry level is therefore not only a prerequisite for the normal operation of the flotation process, but also an important parameter and indicator for ensuring the recovery rate and flotation grade. Finding a suitable and efficient liquid level control strategy is thus crucial to improving flotation efficiency.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 216–225, 2023. https://doi.org/10.1007/978-981-99-6187-0_21

Level Control of Chemical Coupling Tank Based on RL Method

217

In previous studies, scholars have proposed a variety of control methods for the coupled-tank liquid level system. Boonsrimuang [2] adopted a model reference adaptive control (MRAC) method in which the gain of a PI controller is updated in real time as disturbances are applied to the coupling tank; the simulation results show that this scheme controls the liquid level well. Holič [3] studied frequency-domain robust PID controller design for uncertain coupling tank processes; the design improves system stability and ensures the stability and performance of the closed-loop system. To effectively solve the problems of liquid level tracking control and disturbance suppression, Meng [4] proposed a compound control strategy based on a disturbance observer (DOB) and a state-error feedback linearization algorithm. Noel and Pandian [5] proposed a method based on artificial neural networks (ANN) that uses the generalization ability of neural networks to infer the correct control action for a continuous state from the information available in the discrete states, thereby alleviating the discretization error problem; the algorithm's effectiveness is verified on the coupled-tank level system. The above approaches are model-based, but in a liquid level control system with an unknown model, the cost of obtaining a mechanistic system model is high, and model-based control methods are not suitable. Data-driven control methods do not require a mechanistic system model and derive the control policy from process data alone. Reinforcement learning [6] is such a method, and some studies have used data-driven reinforcement learning to solve the optimal control of nonlinear systems [7, 8].
With the improvement of computing power, methods combining reinforcement learning with neural networks have received great attention and have demonstrated their control ability in many fields, such as autonomous driving [9], robot path planning [10, 11], breast cancer prediction [12], and industrial process control [13–15]. In this paper, the coupling tank control process is modeled as a Markov Decision Process (MDP), and the DQN method of reinforcement learning is used to learn the control strategy. To balance exploration and exploitation, when the liquid level is far from the target, the controller explores control actions with a greater probability; when it is close to the target level, the exploration probability is reduced so that the liquid level converges to the target state. The data collected during learning are stored in a replay memory, and mini-batches are drawn from it to train the neural network.

The rest of this paper is organized as follows: Sect. 2 describes the coupling tank level control problem and the reinforcement learning method. Section 3 introduces the DQN algorithm with automatically adjusted exploration probability for solving the optimal liquid level control strategy. Section 4 analyzes and verifies the effectiveness of the algorithm on the coupled-tank level control problem. Section 5 gives the summary and outlook.

218

Y. Li et al.

Fig. 1. Structural diagram of coupling tank system.

2 Coupling Tank Level Control Problem and Reinforcement Learning Principle

The reinforcement learning algorithm solves the general problem of optimal control strategy selection in a sequential decision process: the controller selects a control action according to the state of the industrial plant. The agent (the controller) performs control action a according to the plant state s and receives the reward R(s, a), which depends on the state of the plant and the action performed. After the control action is executed, the plant transfers to the next state s′, either according to a probability distribution P(s′ | s, a) or deterministically, s′ = f(s, a). The policy π* is the policy that maximizes the expected cumulative discounted reward given by the action-value function. The action-value function is expressed as

Q^π(s, a) = E[ R(s_0, a_0) + γR(s_1, a_1) + ... + γ^T R(s_T, a_T) | s_0 = s, a_0 = a, π ]    (1)

where T represents the end time and γ ∈ [0, 1), called the discount factor, discounts future rewards and ensures the convergence of formula (1). The goal of reinforcement learning is to obtain the policy that maximizes the action-value function (1):

π* = arg max_π Q^π(s, a)    (2)

Reinforcement learning is used to solve the optimal control strategy of the coupled-tank liquid level system with nonlinear interactions. The selection of the optimal control strategy is a Markov Decision Process, represented by the 5-tuple (H, A, P, γ, R), where:

H = {[h_1, h_2]}: the liquid level state space, with h_1, h_2 the level heights of the two containers, respectively.


A = {a(n) : n ∈ [1, N_d], n ∈ Z}: the set of control actions performed on the first container, with Z the set of positive integers. The control action is the inflow rate of liquid.

P: the probability distribution of reaching the next state after taking a control action in the current state. In the liquid level control system the state transition is deterministic, so the transition probability of exactly one successor state is 1 and all others are zero.

γ ∈ [0, 1): the discount factor, weighting short-term against long-term rewards; this paper takes γ = 0.9.

R: the reward set according to the state of the container. The reward function used in this paper is

R = −C ||h_desire − h||    (3)

where C is a positive constant, h ∈ R^{2×1} is the current liquid level of the system, and h_desire ∈ R^{2×1} is the target liquid level.

To balance the exploration-exploitation problem inherent in reinforcement learning, the algorithm adopts the classic ε-greedy selection of Q-learning, where ε represents the probability of selecting a control action at random from the control set:

π(s, a) = 1 − ε + ε/N_d,  a = a*
π(s, a) = ε/N_d,          a ≠ a*    (4)

Equation (4) is the selection strategy for control actions during training. At the initial stage of training, when the liquid level is far from the target, the optimal control action is unknown, so the agent needs to explore with a greater probability to find actions with better action values, and ε takes a larger value. In the later stage of training, when the level is close to the target, the exploration probability must be reduced to ensure convergence of the training results. This paper adopts the self-adjusting rule

ε = ||h − h_desire|| / (i · ||h_desire||)    (5)

Equation (5) gives the exploration probability at the i-th training episode when the liquid level is h.

Starting from an initial state h(0) = [h_1(0), h_2(0)], the controller chooses a control action a(0) ∈ A and the liquid level system reaches a new state h(1) = [h_1(1), h_2(1)]. The system then keeps selecting control actions from state h(1) until the level reaches h_desire. The process is as follows:

h(0) →a(0)→ h(1) →a(1)→ ... →a(T)→ h_desire

where T is the end time of a training episode. Based on the observed liquid levels h(0), h(1), ..., h_desire and the performed actions a(0), a(1), a(2), ..., a(T), the controller calculates the return of the process by

Q(h, a) = R(h(0)) + γR(h(1)) + γ²R(h(2)) + ... + γ^T R(h_desire)    (6)
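The self-adjusting ε-greedy rule of (4)-(5) can be sketched as follows (the function names and the clipping of ε to [0, 1] are our assumptions):

```python
import random

def epsilon(h, h_desire, episode):
    """Self-adjusting exploration probability (5), clipped to [0, 1]:
    large when the level is far from the target, shrinking with the episode index."""
    norm = lambda x: sum(v * v for v in x) ** 0.5
    return min(1.0, norm([a - b for a, b in zip(h, h_desire)])
               / (episode * norm(h_desire)))

def select_action(q_values, eps, rng=random):
    """Epsilon-greedy rule (4): the greedy action gets probability 1 - eps + eps/Nd,
    every other action eps/Nd (implemented as a uniform draw with probability eps)."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return q_values.index(max(q_values))
```

Drawing uniformly over all N_d actions with probability ε, and greedily otherwise, reproduces exactly the per-action probabilities in (4), since the greedy action can also be picked by the uniform draw.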


The goal of reinforcement learning is to select control actions over time such that the controller's return Q(h, a) is maximized. The action-value function Q^π(h, a) is defined as the expected discounted return obtained when the controller starts from some initial liquid level h(0) and follows a fixed strategy π until the terminal state h_desire is reached:

Q^π(h, a) = E[ R(h(0)) + γR(h(1)) + γ²R(h(2)) + ... + γ^T R(h_desire) ]    (7)

Formula (7) can be written in the form of the Bellman equation

Q^π(h, a) = R(h) + γQ^π(h′, a′)    (8)

where a′ is the control action taken in state h′, R(h) is the immediate reward in the initial state of the system, and Q^π(h′, a′) represents the sum of future discounted rewards. The strategy that maximizes (8) is called the optimal strategy and is written π*. Following π*, the Bellman equation becomes the Bellman optimality equation

Q*(h, a) = max_π Q(h, a) = R(h) + γ max_{a′∈A} Q*(h′, a′)    (9)

The optimal policy is expressed as

π* = arg max_{a∈A} Q(h, a)    (10)

The traditional RL algorithm iteratively solves the Q function through

Q*_{k+1}(h, a) = E[ R + γ max_{a′} Q*_k(h′, a′) ]    (11)

In (11), as the number of iterations k → ∞, Q → Q*, and the optimal policy is π* ≈ arg max_{a∈A} Q_k(h, a).

The optimal action-value function and the optimal policy can be computed by two algorithms: value iteration and policy iteration. In this paper, the value iteration algorithm is used to learn the optimal control policy.
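A minimal sketch of value iteration (11) on a toy, fully discretized version of the level-control problem (the 5-level MDP, its actions, and its reward are our illustration, not the paper's system):

```python
# Toy deterministic MDP: levels 0..4, actions lower/hold/raise, reward
# -|h_desire - h'| on the next level (in the spirit of (3)), target level 3.
GAMMA, H_DESIRE, LEVELS = 0.9, 3, 5
ACTIONS = (-1, 0, +1)

def step(h, a):
    """Deterministic transition: move one level, clipped to the tank range."""
    return min(max(h + a, 0), LEVELS - 1)

Q = [[0.0] * len(ACTIONS) for _ in range(LEVELS)]
for _ in range(200):                       # Q_{k+1} computed from Q_k, cf. (11)
    Q = [[-abs(H_DESIRE - step(h, a)) + GAMMA * max(Q[step(h, a)])
          for a in ACTIONS] for h in range(LEVELS)]

# greedy policy, cf. (10): index 0 = lower, 1 = hold, 2 = raise
policy = [Q[h].index(max(Q[h])) for h in range(LEVELS)]
```

The converged policy raises the level below the target, holds at it, and lowers above it, which is the intuitively optimal behavior for this toy problem.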

3 Coupling Tank Level Control Algorithm Based on DQN Method

In order to apply RL to a control system with a continuous state space, the state must be discretized into a finite number of levels. However, because the number of discretized states grows exponentially, the computational cost increases rapidly. To solve the continuous-state control problem, the action-value function is therefore usually estimated with a function approximator Q(h, a; θ) ≈ Q*(h, a), called a Q-network [16]. Figure 2 shows the neural network structure used to learn the optimal control strategy.


Fig. 2. Q-network structure.

There are two neural networks in the DQN algorithm [17]: the target network Q(h, a; θ⁻) and the Q-network Q(h, a; θ). The Q-network trains its parameters by minimizing the loss function

L_k(θ_k) = E_{h,a∼P}[ (y_k − Q(h, a; θ_k))² ]    (12)

where y_k = E[ R + γ max_{a′} Q(h′, a′; θ⁻_{k−1}) | h, a ] is called the target, θ⁻_k is the target network parameter at the k-th iteration, and θ_k is the Q-network parameter at the k-th iteration. The parameters of Q(h, a; θ) are assigned to the target network Q(h, a; θ⁻) every N iterations; N is taken as 50 in this paper. The target network parameters θ⁻_k are kept unchanged while optimizing the parameters θ_k of L_k(θ_k). The parameters are updated with mini-batch stochastic gradient descent; the derivative of L_k(θ_k) with respect to θ_k is

∇_{θ_k} L(θ_k) = E[ (R + γ max_{a′} Q(h′, a′; θ⁻_k) − Q(h, a; θ_k)) ∇_{θ_k} Q(h, a; θ_k) ]    (13)

The DQN algorithm uses a technique known as experience replay [17]: the experience data ⟨h, a, R, h′⟩ collected by the agent at each iteration are stored in a replay memory, and data are sampled from the replay memory each time the neural network is trained. The training process is shown in Fig. 3. The algorithm is off-policy: it learns the greedy strategy a = arg max_a Q(h, a; θ) while following an action distribution that ensures the state space is sufficiently explored. During training of the Q-network, the greedy strategy is followed with probability 1 − ε and a random action is selected with probability ε. The algorithm flowchart is shown in Fig. 3, and the DQN coupling-tank liquid level control algorithm is given below.
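The mechanics of (12)-(13) — a periodically synced target network, a replay memory, and gradient steps on the squared TD error — can be illustrated on a deliberately tiny problem of our own making (one state, two actions, the "network" reduced to one weight per action, i.e. the tabular special case):

```python
import random

random.seed(0)
GAMMA, LR, SYNC_EVERY = 0.9, 0.1, 50
q = [0.0, 0.0]           # Q-network parameters theta
target = list(q)         # target network parameters theta^-
replay = []              # replay memory of (a, r) transitions

for step in range(5000):
    a = random.randrange(2)              # explore: pick an action at random
    r = 1.0 if a == 0 else 0.0           # action 0 pays 1, action 1 pays 0
    replay.append((a, r))
    a_b, r_b = random.choice(replay)     # sample from replay (mini-batch of 1)
    y = r_b + GAMMA * max(target)        # target y_k with frozen weights theta^-
    q[a_b] += LR * (y - q[a_b])          # gradient step on (y - Q)^2, cf. (13)
    if (step + 1) % SYNC_EVERY == 0:
        target = list(q)                 # periodic target-network sync

# Fixed point: Q*(a0) = 1/(1 - gamma) = 10, Q*(a1) = gamma * 10 = 9
```

Freezing the bootstrap target between syncs turns each sync interval into (approximately) one value-iteration sweep, which is what stabilizes the learning compared with bootstrapping from the constantly moving Q-network itself.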


Fig. 3. DQN-coupling tank liquid level control algorithm training flowchart.

M and N are constants set when training the neural network: M is the number of training episodes, and N is the target-network update interval. Because of the strong correlation between consecutive samples, learning directly from them is inefficient; random sampling breaks the correlation, reduces the update variance, and improves learning efficiency. By using experience replay, the behavior distribution is averaged over many previous states, smoothing the learning process and avoiding parameter oscillation.

4 Results and Analysis

To verify the effectiveness of the algorithm, it is applied to the liquid level system whose mathematical model is (14) [5]. The goal of the reinforcement learning controller in this paper is to bring the liquid level of the first water tank to the target level.

dh_1/dt = ( a_1 − r_1 √h_1 − r_3 √(h_1 − h_2) ) / A_1
dh_2/dt = ( a_2 − r_2 √h_2 + r_3 √(h_1 − h_2) ) / A_2    (14)

The model is only used to generate data in place of the values read by sensors in actual production; it does not participate in any calculation other than data generation. The parameter values are r_1 = r_2 = r_3 = r_4 = 1, and the maximum inflow rate of the tank is set to 20 m³/s. In the reward function (3), C = 1, γ = 0.9, and the target liquid level is h_desire = 8 m. To determine the effect of the number of control actions on reinforcement learning, five levels of discretization are considered, N_d = 10, 15, 20, 30, 40; the simulation results show that DQN achieves the desired liquid level control in all cases. Figure 4 shows the curves of the container liquid level and the flow controlled by the DQN controller during the control process. The results show that the DQN-based method maintains the liquid level near the target state, and the larger the number of discrete control actions, the higher the control accuracy and the smaller the fluctuation after the level stabilizes. To verify the disturbance rejection of the method, random noise with variance 0.3 and mean 0 is added to the liquid level state with N_d = 15 discrete control actions. The simulation result is shown in Fig. 5: the liquid level is still maintained near the target even with small noise, proving the effectiveness of the algorithm. Compared with the noise-free case, however, the fluctuation of the valve output flow is larger.
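The plant model (14) is easy to simulate directly; a sketch of one Euler step is below. The tank cross-section areas A₁, A₂ are not given in the text and are assumed to be 1 here, and the √(h₁ − h₂) coupling term is extended with copysign so the step stays defined if h₂ momentarily exceeds h₁ (also our assumption):

```python
import math

def tank_step(h1, h2, a1, a2, dt, r1=1.0, r2=1.0, r3=1.0, A1=1.0, A2=1.0):
    """One explicit-Euler step of the coupled-tank model (14); levels clipped at 0."""
    coupling = math.copysign(math.sqrt(abs(h1 - h2)), h1 - h2)
    dh1 = (a1 - r1 * math.sqrt(max(h1, 0.0)) - r3 * coupling) / A1
    dh2 = (a2 - r2 * math.sqrt(max(h2, 0.0)) + r3 * coupling) / A2
    return max(h1 + dh1 * dt, 0.0), max(h2 + dh2 * dt, 0.0)
```

As a consistency check, with unit coefficients a constant inflow a₁ = √8 + 2 (and a₂ = 0) has the steady state h₁ = 8, h₂ = 4, since at equilibrium r₂√h₂ = r₃√(h₁ − h₂) forces h₂ = h₁/2.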

5 Summary

In this paper, the DQN method with self-adjusting exploration probability is applied to the liquid level control of a coupled tank. The experiment verifies that the algorithm can give the best control action for an arbitrary liquid level. Using the generalization and denoising abilities of the neural network, the control target can be achieved without a precise mathematical model of the controlled object. Under the small-noise test, the DQN controller keeps the fluctuation of the liquid level near the target value. At the same time, in order to balance exploration and exploitation, the exploration probability of the controller is adjusted according to the training period and the target


Y. Li et al.

Fig. 4. System response diagrams for different discrete numbers of control actions.

Fig. 5. The response diagram of the liquid level system after adding noise.

liquid level state. Finally, the simulation shows that the larger the number of discrete control actions, the more accurate the control results and the smaller the fluctuations in the control process.

Level Control of Chemical Coupling Tank Based on RL Method


References

1. Pan, H., Wong, H., Kapila, V., et al.: Experimental validation of a nonlinear backstepping liquid level controller for a state coupling two tank system. Control Eng. Pract. 13, 27–40 (2005)
2. Boonsrimuang, P., Numsomran, A., Kangwanrat, S.: Design of PI controller using MRAC techniques for couple-tanks process. World Acad. Sci. Eng. Technol. 59, 67–72 (2009)
3. Holič, I., Veselý, V., Fikar, M., et al.: Robust PID controller design for coupling-tank process. In: Proceedings of the 18th International Conference on Process Control, Tatranska Lomnica, Slovakia, pp. 506–512 (2011)
4. Meng, X., Yu, H., Zhang, J., et al.: Disturbance observer-based feedback linearization control for a quadruple-tank liquid level system. ISA Trans. 122, 146–162 (2022)
5. Noel, M.M., Pandian, B.J.: Control of a nonlinear liquid level system using a new artificial neural network based reinforcement learning approach. Appl. Soft Comput. 23, 444–451 (2014)
6. Gao, Y., Chen, S., Lu, X.: A review of reinforcement learning. Acta Automatica Sinica 2004, 86–100 (2004). (in Chinese)
7. Kiumarsi, B., Lewis, F.L., Modares, H., et al.: Reinforcement Q-learning for optimal tracking control of linear discrete-time systems with unknown dynamics. Automatica 50, 1167–1175 (2014)
8. Peng, Y., Chen, Q., Sun, W.: Reinforcement Q-learning algorithm for H∞ tracking control of unknown discrete-time linear systems. IEEE Trans. Syst. Man Cybern.: Syst. 50, 4109–4122 (2019)
9. Xia, W., Li, H.: Autonomous driving policy learning method based on deep reinforcement learning. Integr. Technol. 6, 29–34+36–40+35 (2017). (in Chinese)
10. Zhang, F., Li, N., Yuan, R., et al.: Robot path planning algorithm based on reinforcement learning. J. Huazhong Univ. Sci. Technol. (Nat. Sci. Edn.) 46(12), 65–70 (2018). (in Chinese)
11. Dong, Y., Yang, C., Dong, Y., et al.: Robot path planning based on improved DQN. Comput. Eng. Des. 42, 552–558 (2021). (in Chinese)
12. Huan-Hsin, T., Yi, L., Sunan, C., et al.: Deep reinforcement learning for automated radiation adaptation in lung cancer. Med. Phys. 44, 6690–6705 (2017)
13. Yuan, Z., He, R.Y.C., et al.: Online control algorithm of thickener underflow concentration based on reinforcement learning. Acta Automatica Sinica 47, 1558–1571 (2021). (in Chinese)
14. Lin, K., Xiao, H., Jiang, W., et al.: Optimal control of denitrification process in power plants based on DDPG deep reinforcement learning. Comput. Meas. Control 30, 132–139 (2022). (in Chinese)
15. Zhou, D., Cao, J., Bi, S., et al.: Reinforcement learning performance optimal control framework and its application in operation optimization of high pressure feedwater heater. J. Xi'an Jiaotong Univ. 56, 32–42 (2022). (in Chinese)
16. Mnih, V., Kavukcuoglu, K., Silver, D., et al.: Playing Atari with deep reinforcement learning. arXiv preprint arXiv:1312.5602 (2013)
17. Schaul, T., Quan, J., Antonoglou, I., et al.: Prioritized experience replay. arXiv preprint arXiv:1511.05952 (2015)

A Personalized Federated Learning Fault Diagnosis Method for Inter-client Statistical Characteristic Inconsistency

Yanqi Wen, Funa Zhou(B), Pengpeng Jia, and Hanxin Huang

Shanghai Maritime University, Shanghai 200135, China
[email protected]

Abstract. Existing personalized federated learning fault diagnosis methods focus only on personalizing the global model, using each client's private data to generate a personalized model suited to that client's data distribution. However, during federated aggregation, parameter information related to fault characteristics may be lost while parameter information that interferes with each client's local training may be retained, which eventually leads to the failure of the personalized model. This paper therefore proposes a personalized federated learning method for inter-client statistical characteristic inconsistency to solve this problem. The training effect of each client is used to guide the aggregation of the initial global model and the local models, so that a federation model suited to the local data distribution is learned for each client. This method effectively improves the reliability of the federation model's parameter information, thus ensuring the performance of the personalized model. Experimental validation on a benchmark rolling-bearing dataset shows that a 7.5% improvement in fault diagnosis accuracy can be achieved when the inter-client statistical characteristics are inconsistent.

Keywords: Fault diagnosis · Personalized federated learning · Aggregated weights

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 226–233, 2023. https://doi.org/10.1007/978-981-99-6187-0_22

1 Introduction

When the rolling bearing of a motor fails, the working performance of the motor is easily affected, so fault diagnosis of motor rolling bearings is extremely important [1–3]. With the development of artificial intelligence technology, data-driven fault diagnosis methods are increasingly widely used [4]. Deep learning is an effective data-driven fault diagnosis method that acquires fault information by learning the intrinsic laws and representation levels of a large amount of sample data [5], so the quantity and quality of samples play a decisive role in fault diagnosis performance. An effective deep learning fault diagnosis model cannot be built when the number of fault samples is small or the data quality is poor [6, 7]. Federated learning achieves joint optimization of the fault diagnosis models of

A Personalized Federated Learning Fault Diagnosis Method


each client through joint training of the deep learning models of multiple clients [8]. The literature [9, 10] uses a federated learning approach to build a global fault diagnosis model at a federation center, which enables bearing fault diagnosis while ensuring the privacy of client data. However, when the sample size is unbalanced among clients, the federation center cannot obtain the optimal federation model with the given aggregation weights. The literature [11, 12] adjusts the aggregation weight of each client by measuring the importance of each client's model parameters to the federation model, in order to optimize the federation model at the federation center.

The above studies of federated learning-based fault diagnosis aim to obtain an optimal federation model at the federation center, but the globally optimal model is not optimal for some clients when the statistical characteristics of the data are inconsistent among clients. Federated learning cannot eliminate inter-client parameter variability by aggregation when dealing with such data, resulting in degraded model performance across clients and a lack of personalization for local tasks or datasets. Building an individualized model for each client can therefore solve the problem of model performance degradation caused by data with inconsistent statistical characteristics [13]. The literature [14] trains a federated model for each homogeneous group of clients by clustering, which is then distributed to the corresponding clients for local training to obtain personalized models. The literature [15–17] learns a federation model at the federation center and then adaptively finds the best combination of the federation and local models to better personalize the model for each client.
The above personalized federated learning methods mainly fine-tune the global model issued by the federation center locally. This ensures personalization but does not account for the large parameter variability among clients, which leads to the loss of parameter information related to fault characteristics when the federation center aggregates each client's parameters; since the training of local personalized models relies on this lost parameter information, it ultimately becomes difficult to train effective personalized models locally. Therefore, this paper proposes a personalized federated learning method based on multiple federation models, in which the federation center learns a corresponding personalized federation model for each client by re-aggregating the clients' models and the initially aggregated federation model, and each client then uses its corresponding federation model for local training to obtain its personalized model. The contributions of this paper are as follows:

(1) We propose a personalized federated learning method with multiple federation models (pFL_ISCI), which solves the problem that traditional personalized federated learning methods lose parameter information related to fault features during aggregation at the federation center while retaining parameter information that interferes with each client's local training.

(2) The federation model obtained from the initial aggregation at the federation center and the models of the other clients are aggregated to obtain an initial federation model for each client, and the local training effect of each client is then used to guide the aggregation process of its corresponding federation model, so as to obtain a federation


Y. Wen et al.

model with reliable and locally appropriate parameter information for each client. This allows full use of the parameter information associated with the fault features while reducing the parameter information in the federation model that interferes with the local training of each client.

(3) When the statistical characteristics of the data are inconsistent among clients, the reliability of the federation model's parameter information can be ensured by the personalized federated learning fault diagnosis method designed in this paper, thus effectively avoiding the problem of poor personalized federated learning accuracy.

2 Related Work

2.1 Stacked Autoencoder
Stacked autoencoders are deep neural network models consisting of multiple autoencoders, which learn representations of the original data layer by layer in order to extract abstract feature vectors of different dimensions and levels from complex, high-dimensional input data [18]. The training process of a stacked autoencoder consists of two steps: pre-training and fine-tuning. In the unsupervised pre-training phase, the output of each autoencoder's hidden layer is used as the input of the next autoencoder; when all hidden layers have been trained, the features of the last hidden layer are fed to a Softmax classifier, and the entire network is then fine-tuned with labeled data to better fit the classification task.

2.2 Federated Learning
Data privacy security and data silos are now a challenge in various fields. Federated learning, as a form of distributed machine learning, can solve the silo problem while protecting privacy and security [8]. It is a widely used fundamental AI technology that unites multiple clients to collaboratively train a federated model, building a superior deep learning model for each client while protecting each client's data privacy.
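The layer-wise pre-training described above can be sketched as follows. This is a minimal NumPy illustration (sigmoid encoder, linear decoder, plain batch gradient descent), not the authors' network; all sizes and hyperparameters are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sig = lambda z: 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.5, epochs=300):
    """Train one autoencoder (sigmoid encoder, linear decoder) with batch
    gradient descent on the reconstruction MSE. Returns the encoder
    parameters, the hidden features of X, and the per-epoch losses."""
    n = X.shape[1]
    W1 = rng.normal(0.0, 0.1, (n, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n)); b2 = np.zeros(n)
    losses = []
    for _ in range(epochs):
        H = sig(X @ W1 + b1)                 # encode
        Xr = H @ W2 + b2                     # decode
        losses.append(float(np.mean((Xr - X) ** 2)))
        G = 2.0 * (Xr - X) / X.size          # dLoss/dXr
        GH = (G @ W2.T) * H * (1.0 - H)      # backprop through the sigmoid
        W2 -= lr * H.T @ G;  b2 -= lr * G.sum(0)
        W1 -= lr * X.T @ GH; b1 -= lr * GH.sum(0)
    return (W1, b1), sig(X @ W1 + b1), losses

# layer-wise pre-training: the hidden features of the first autoencoder
# are the training input of the second -- the "stacking" step; H2 would
# then feed a Softmax classifier before supervised fine-tuning
X = rng.random((50, 8))
enc1, H1, losses1 = train_autoencoder(X, n_hidden=4)
enc2, H2, losses2 = train_autoencoder(H1, n_hidden=2)
```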

3 The Proposed Method

Existing personalized federated learning methods usually aggregate model parameters across clients at a federation center to obtain a global model; each client then uses the global model to obtain information from other clients that is beneficial for local training, so as to train a personalized model applicable to its local data. However, because the statistical characteristics of the data are inconsistent among clients, the parameter variability among clients is large, which causes the federation center to lose parameter information related to the fault characteristics when aggregating the clients' parameters, and the training of local personalized models relies on the lost related



parameter information, which eventually makes it difficult to train effective personalized models locally. Therefore, in this paper, a personalized federation method is designed to solve the above problem, with the following main steps.

Step 1. Aggregation of the Initial Federation Model at the Federation Center
The federation center aggregates the model parameters of each client to obtain the global model parameters:

$\theta_{fl,s} = \left[\theta_{fl,s}^{1}, \theta_{fl,s}^{2}, \dots, \theta_{fl,s}^{h}, \dots, \theta_{fl,s}^{H}\right]$  (1)

$\theta_{fl,s}^{h} = \sum_{n=1}^{N} \frac{m_n}{m}\,\theta_{n,s}^{h}, \quad h = 1, \dots, H$  (2)

where $H$ means the network has $H$ layers, $\theta_{fl,s}^{h} = \left(a_{ij}\right)_{x \times y}$ denotes the parameter matrix of the network at layer $h$, and the size of each layer's parameter matrix is determined by the numbers of input and output neurons of that layer. $m$ is the total sample size over all clients and $m_n$ is the sample size of the $n$-th client.
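Equation (2) is the familiar sample-size-weighted (FedAvg-style) layer average; a minimal sketch with toy parameter matrices (the list-of-layers representation is an assumption for illustration):

```python
import numpy as np

def fedavg(client_params, client_sizes):
    """Layer-wise aggregation of Eq. (2): each client's parameter matrix
    for layer h is weighted by its sample share m_n / m."""
    m = float(sum(client_sizes))
    n_layers = len(client_params[0])
    return [sum((mn / m) * p[h] for p, mn in zip(client_params, client_sizes))
            for h in range(n_layers)]

# two toy clients, one 2x2 "layer" each; sample shares are 0.25 and 0.75
c1 = [np.full((2, 2), 1.0)]
c2 = [np.full((2, 2), 3.0)]
theta_fl = fedavg([c1, c2], client_sizes=[100, 300])
```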

Step 2. The Federation Center Aggregates the Initial Federation Model and the Other Client Models
The initial aggregation at the federation center loses a large amount of parameter information related to the fault characteristics; this lost useful information is recovered from the other clients by aggregating the federation model with the other clients' models. The aggregation that produces the $N$ federation models is:

$\theta_{fl,s,t}^{c,h} = \sum_{n=1}^{N} PK_{n,s,t}^{c,h} \odot \theta_{n,s}^{h} + PK_{fl,s,t}^{c,h} \odot \theta_{fl,s}^{h}, \quad h = 1, \dots, H$  (3)

$\theta_{fl,s}^{c} = \left[\theta_{fl,s}^{c,1}, \theta_{fl,s}^{c,2}, \dots, \theta_{fl,s}^{c,h}, \dots, \theta_{fl,s}^{c,H}\right]$  (4)

where $\odot$ denotes element-wise multiplication of the weight matrix and the parameter matrix. To distinguish the corner labels: the federation models are indexed by the label $c$; $\theta_{n,s}^{h}$ denotes the layer-$h$ network parameters uploaded to the federation center by the $n$-th client in the $s$-th federation round; $\theta_{fl,s}^{h}$ denotes the layer-$h$ parameters of the federation model obtained by the initial aggregation in the $s$-th round; and $\theta_{fl,s,t}^{c}$ denotes the set of all layers of the $c$-th federation model.

Step 3. Construct the Cost Function of Each Federation Model at the Federation Center
The federation center distributes the $N$ federation models obtained by aggregation to the corresponding clients. Each client performs forward propagation on its local private data to compute the predicted value for each of its samples, and then uses the mean-square-error loss between the predicted and true values to obtain its loss value. The federation center uses these client loss values to construct the loss function for the federation model parameters to be optimized:

$Loss_{fl,s,t}^{c} = loss_{n,s,t}\left(\theta_{fl,s,t}^{c}\right) + \frac{1}{2}\left(1 - \sum_{n=1}^{N} PK_{n,s,t}^{c} - PK_{fl,s,t}^{c}\right)^{2}$  (5)

where $Loss_{fl,s,t}^{c}$ denotes the loss function at the $t$-th parameter aggregation of the $c$-th federation model in the $s$-th federation round; $loss_{n,s,t}\left(\theta_{fl,s,t}^{c}\right)$ denotes the loss value obtained by client $n$ from forward propagation at the $t$-th parameter aggregation of the $s$-th round using its corresponding federation model $\theta_{fl,s,t}^{c}$; and $PK_{n,s,t}^{c,h}$, $PK_{fl,s,t}^{c,h}$ are the parameters to be optimized in the process. The updated aggregation weights and the corresponding model parameters are aggregated to obtain the new federation models:

$\theta_{fl,s,t+1}^{c,h} = \sum_{n=1}^{N} PK_{n,s,t+1}^{c,h} \odot \theta_{n,s}^{h} + PK_{fl,s,t+1}^{c,h} \odot \theta_{fl,s}^{h}, \quad h = 1, \dots, H$  (6)

The updated federation models are distributed to the corresponding clients for the next update of the aggregation weights in that federation round. When the loss function of each federation model reaches a threshold, the federation center has learned an optimal set of aggregation weights for each federation model; weighting each set of aggregation weights with its corresponding global model parameters and the other clients' model parameters then yields a locally appropriate federation model for each client, at which point all federation models for that round are trained.

Step 4. The Federation Center Sends Each Federation Model to the Corresponding Client
The federation center distributes each federation model to the appropriate client:

$\theta_{n,s} = \theta_{fl,s,t}^{c}$  (7)

where $\theta_{n,s}$ denotes the parameters of the $c$-th federation model sent from the federation center to the $n$-th client. Each client receives the federation model parameters issued by the federation center and trains locally to obtain its personalized model. When a client's loss function reaches a given threshold, local training ends and the local model parameters are uploaded to the federation center. When the number of federation rounds reaches a given limit, the federation center distributes each client's corresponding federation model, each client trains to obtain its local personalized model, and the personalized model of each client is finally used for fault diagnosis.
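Steps 2 and 3 amount to learning, for each client, aggregation weights over the client models and the initial federation model under the sum-to-one penalty of Eq. (5). The sketch below illustrates this with scalar weights, a single client, and a quadratic surrogate for the local loss (the surrogate and all names are assumptions; the real method evaluates the loss on private data):

```python
import numpy as np

def personalize(client_thetas, theta_fl, local_loss_grad, lr=0.1, steps=300):
    """Learn scalar aggregation weights PK for one client:
    theta_c = sum_n PK_n * theta_n + PK_fl * theta_fl (Eq. (3) with scalar
    weights), driven by the client's local loss plus the (1 - sum PK)^2 / 2
    penalty of Eq. (5). `local_loss_grad(theta)` returns dLoss/dtheta."""
    parts = list(client_thetas) + [theta_fl]
    pk = np.full(len(parts), 1.0 / len(parts))      # start from uniform weights
    for _ in range(steps):
        theta = sum(w * p for w, p in zip(pk, parts))
        g = local_loss_grad(theta)
        # chain rule: dLoss/dPK_k = <g, part_k>; penalty gradient is -(1 - sum PK)
        g_pk = np.array([np.sum(g * p) for p in parts]) - (1.0 - pk.sum())
        pk -= lr * g_pk
    return pk, sum(w * p for w, p in zip(pk, parts))

# toy check: the client's ideal parameters are reachable as a convex
# combination of client 1, client 2, and the federation model
t1, t2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
t_fl = 0.5 * (t1 + t2)
ideal = 0.7 * t1 + 0.3 * t_fl                       # surrogate "best" parameters
pk, theta_c = personalize([t1, t2], t_fl, lambda th: th - ideal)
```

The penalty keeps the learned weights close to a convex combination, so the re-aggregated model stays within the span of the uploaded models rather than drifting arbitrarily.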

4 Analysis of Experimental Demonstration

4.1 Dataset Description
In the experiments of this section, federated learning is performed with three clients, each with its own bearing condition-monitoring dataset. The dataset used in this paper consists of bearing data collected by a sensor at the 6 o'clock position of the drive end of the motor housing. The acquisition frequency is 12 kHz and the fault size is 0.007 in. The fault types include outer ring failures, rolling body failures, inner ring failures, and normal samples. Data monitored under three motor loads, 1 HP, 2 HP, and 3 HP, are used; each client contains the monitoring data of the motor under a different load, producing a situation where the statistical characteristics of the data are inconsistent between clients. The load of each client and the number of samples in its training set are shown in Table 1; the number of samples in each test set equals that in the training set.

Table 1. Loading of data by client.

Client    Training sets  Load of each client
Client 1  4 * 100        Outer ring failure (1 HP), Rolling body failure (1 HP), Inner ring failure (1 HP), Normal sample (1 HP)
Client 2  4 * 100        Outer ring failure (2 HP), Rolling body failure (2 HP), Inner ring failure (2 HP), Normal sample (2 HP)
Client 3  4 * 100        Outer ring failure (3 HP), Rolling body failure (3 HP), Inner ring failure (3 HP), Normal sample (3 HP)

4.2 Experimental Results
The algorithm proposed in this section is compared with the traditional federated averaging algorithm and with the existing personalized federated methods of literature [16] (Ditto) and literature [15] (pFedMe). Ditto adaptively learns personalized models through learnable parameters, while pFedMe constrains the personalization parameters with the global model parameters through a regularization idea. The experimental results of each diagnostic model are shown in Table 2.

Table 2. Fault diagnosis accuracy for data with inconsistent statistical characteristics (load distribution 1/1/1/1, 2/2/2/2, 3/3/3/3 HP, client by client).

Client    DNN     FedAvg  pFedMe  Ditto   pFL_ISCI
Client 1  75.50%  72.75%  80.50%  81.25%  88.00%
Client 2  73.75%  70.75%  81.25%  83.50%  91.25%
Client 3  75.00%  74.25%  83.75%  84.00%  92.00%
Mean      74.75%  72.58%  81.83%  82.92%  90.42%

Comparing the DNN and FedAvg columns of the table, the traditional federated averaging algorithm gives no significant improvement over single-client fault diagnosis, because the inconsistent statistical characteristics of the data between clients cause large parameter variability among clients; federated learning cannot eliminate this variability by averaged aggregation, which leads to a negative transfer phenomenon. Comparing the FedAvg and pFedMe columns, pFedMe outperforms FedAvg in diagnosis because pFedMe adapts the global model locally and improves the degree of personalization of the local model. Comparing the pFedMe and Ditto columns, the diagnostic accuracy of Ditto is higher than that of pFedMe. This is because the parameter controlling the strength of the relationship between the local and global models is fixed in pFedMe's local training and must be adjusted manually, which makes it difficult to obtain the optimal value; in contrast, Ditto's parameters are tuned dynamically, making the trained personalized model more suitable for the local data. However, both Ditto and pFedMe, like the federation center, have inadequate aggregation strategies. Comparing the pFedMe, Ditto, and pFL_ISCI columns, the proposed pFL_ISCI method outperforms both compared personalized federated learning methods, with a significant improvement in the diagnostic accuracy of each client. This indicates that the method of this paper, by aggregating the global model (whose parameter information is unreliable) with the other clients' models, recovers the fault-feature-related parameters lost during the initial federated aggregation and reduces the parameter information that interferes with the local training of each client.

5 Conclusions

In this paper, we propose a personalized federated learning method for inter-client statistical characteristic inconsistency. The global model obtained from the initial aggregation at the federation center is aggregated with the other client models, and the useful parameter information is used to guide the aggregation process of each federation model, yielding for each client a federation model with reliable and locally appropriate parameter information. This process makes full use of the fault-characteristic-related parameter information lost during the initial aggregation of the federation model, while reducing the parameter information in the federation model that interferes with the local training of each client. The effectiveness of the proposed method is evaluated on the Case Western Reserve University bearing dataset, and the experimental results show that the proposed method outperforms existing personalized federated learning methods.

References

1. Li, X., Wan, S., Liu, S., et al.: Bearing fault diagnosis method based on attention mechanism and multilayer fusion network. ISA Trans. 128, 550–564 (2022)
2. Xu, W.: Research on bearing fault diagnosis based on deep learning. In: International Conference on Artificial Intelligence and Big Data (ICAIBD) (2021)
3. Yang, C., Zhou, F.: Imbalanced bearing fault diagnosis based on adaptive cost-sensitive neural network. In: 2021 China Automation Congress (CAC), pp. 6514–6519 (2021)
4. Yin, S., Ding, S.X., Haghani, A., et al.: A comparison study of basic data-driven fault diagnosis and process monitoring methods on the benchmark Tennessee Eastman process. J. Process Control 22(9), 1567–1581 (2012)
5. Mu, R.H., Zeng, X.Q.: A review of deep learning research. KSII Trans. Internet Inf. Syst. 13(4), 1738–1764 (2019)
6. Huang, Z.W., Xie, K., Wen, C., et al.: Small sample face recognition algorithm based on transfer learning model. J. Changjiang Univ. (Nat. Sci. Edn.) 16(7), 88–94 (2019)
7. Sun, C.W., Wen, C., Xie, K., et al.: Recognition method of small sample based on deep migration model. Comput. Eng. Des. 39(12), 3816–3822 (2018)
8. McMahan, B., Moore, E., Ramage, D., et al.: Communication-efficient learning of deep networks from decentralized data. In: Artificial Intelligence and Statistics, pp. 1273–1282. PMLR (2017)
9. Zhang, W., Li, X., Ma, H., et al.: Federated learning for machinery fault diagnosis with dynamic validation and self-supervision. Knowl.-Based Syst. 213(1), 106679 (2021)
10. Li, Z., Li, Z., Li, Y., et al.: An intelligent diagnosis method for machine fault based on federated learning. Appl. Sci. 11(24), 12117 (2021)
11. Xiao, J., Du, C., Duan, Z., et al.: A novel server-side aggregation strategy for federated learning in non-IID situations. In: 2021 20th International Symposium on Parallel and Distributed Computing (ISPDC), pp. 17–24. IEEE (2021)
12. Chen, J., Li, J., Huang, R., et al.: Federated learning for bearing fault diagnosis with dynamic weighted averaging. In: 2021 International Conference on Sensing, Measurement & Data Analytics in the era of Artificial Intelligence (ICSMD), pp. 1–6. IEEE (2021)
13. Mansour, Y., Mohri, M., Ro, J., et al.: Three methods for personalization with applications to federated learning. arXiv preprint arXiv:2002.10619 (2020)
14. Ghosh, A., Chung, J., Yin, D., et al.: An efficient framework for clustered federated learning. Adv. Neural Inf. Process. Syst. 33, 19586–19597 (2020)
15. Dinh, C.T., Tran, N., Nguyen, J.: Personalized federated learning with Moreau envelopes. Adv. Neural Inf. Process. Syst. 33, 21394–21405 (2020)
16. Li, T., Hu, S., Beirami, A., et al.: Ditto: fair and robust federated learning through personalization. In: International Conference on Machine Learning, pp. 6357–6368. PMLR (2021)
17. Deng, Y., Kamani, M.M., Mahdavi, M.: Adaptive personalized federated learning. arXiv preprint arXiv:2003.13461 (2020)
18. Chen, L., Ma, Y., Hu, H., et al.: An effective fault diagnosis method for bearing using stacked de-noising auto-encoder with structure adaptive adjustment. Measurement 214, 112774 (2023)

Downsampling Assessment for LiDAR SLAM

Jiabao Zhang1 and Yu Zhang1,2(B)

1 State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
2 Key Laboratory of Collaborative Sensing and Autonomous Unmanned Systems of Zhejiang Province, Hangzhou, China
[email protected]

Abstract. LiDAR has become a widely used sensor in many autonomous areas, and LiDAR SLAM is one of its essential applications. With the tendency toward lightweight structures and the improvement of LiDAR resolution, downsampling points from LiDAR raw data has been adopted in some systems. In this paper, we focus on the influence of point downsampling, especially on localization and mapping, and propose a benchmark method to evaluate LiDAR SLAM systems based on downsampling properties. Three criteria, RPE Degeneration, ATE Degeneration, and Precision Maintenance Rate, are defined for assessment. Groups of experiments using this evaluation method are then conducted, with a comparison and discussion of system characteristics, which may contribute to the subsequent development of LiDAR SLAM systems.

Keywords: downsampling · LiDAR SLAM · degeneration and maintenance

1 Introduction and Related Works

Light Detection and Ranging (LiDAR) data has become an important source of information in automatic systems such as autonomous driving, localization, and detection [1,2], owing to the improvement of point cloud precision and the increasing number of scanning beams. Consequently, more points must be processed in one scan, as in rotor-based mechanical LiDAR scans [3,4], which demands considerable computing power, especially when real-time operation is required. However, current research and production trends push automatic systems toward lightweight designs, exposing the complexity and inadequacy of some point cloud processing algorithms. In the area of LiDAR Simultaneous Localization and Mapping (SLAM), there are several methods to handle the computing power and storage problem, including scan point downsampling, which means directly reducing the raw points used in the odometry or SLAM system.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 234–242, 2023. https://doi.org/10.1007/978-981-99-6187-0_23

Some downsampling methods were



proposed specifically for a particular function or application [5], or use a voting method to calculate point contributions [6–8]. In addition, downsampling is sometimes applied in point cloud algorithms simply by turning a 64-beam LiDAR scan into a 16-beam one. In SLAM systems, typically LOAM [9], LeGO-LOAM [10], and LIO-SAM [11], as well as other LiDAR SLAM solutions, the voxel grid filter [12] with its extensions and the range-image projection-based method [11] are generally used, owing to their high efficiency. Although point downsampling is frequently used in SLAM systems, little research has addressed its influence on them.

In this paper, we study how LiDAR scan point downsampling methods affect LiDAR SLAM and propose a benchmark for LiDAR SLAM according to downsampling characteristics. The contributions of this paper are summarized as follows:

– A LiDAR SLAM benchmark based on downsampling, which can evaluate system robustness, precision, and valid downsampling rate.
– Experiments and comparison of different SLAM systems, which show the evaluation results from the proposed benchmark and indicate the existence of superfluous points.
– Discussion of essential factors of LiDAR scan points, which provides a direction for LiDAR SLAM optimization.
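For reference, the voxel grid filter mentioned above can be sketched in a few lines: points are bucketed into cubic voxels and each occupied voxel is replaced by the centroid of its points (a generic illustration of the classic filter, not the implementation of [12]):

```python
import numpy as np
from collections import defaultdict

def voxel_grid_downsample(points, voxel_size):
    """Voxel grid filter: bucket points into cubic voxels of edge length
    `voxel_size` and replace each occupied voxel by the centroid of its
    points."""
    buckets = defaultdict(list)
    for p in points:
        key = tuple((p // voxel_size).astype(np.int64))
        buckets[key].append(p)
    return np.array([np.mean(b, axis=0) for b in buckets.values()])

rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 10.0, size=(5000, 3))   # stand-in for one LiDAR scan
down = voxel_grid_downsample(cloud, voxel_size=1.0)
dsr = len(down) / len(cloud)                     # the downsampling rate (DSR)
```

Increasing the voxel size lowers the DSR, which is exactly the knob swept by the benchmark proposed in this paper.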

2 Methodology

2.1 Overview

The pipeline of the proposed SLAM evaluation based on point downsampling is illustrated in Fig. 1. We first obtain an input LiDAR scan from the exterior and then downsample the point cloud from the scan. The downsampled point cloud is sent to the odometry and mapping algorithm, which provides the final pose estimation and the global map. Because all points used in the SLAM system pass through the downsampling step, the reliability of this benchmark is ensured.

Fig. 1. The pipeline of SLAM evaluation based on point downsampling.


J. Zhang and Y. Zhang

SLAM system is always evaluated by Relative Pose Error (RPE) and Absolute Trajectory Error (ATE, also called Absolute Pose Error (APE) in some cases), defined in [13] section VII formula (1)–(5). These two criteria mainly evaluate the system precision both in unit time and in all processes and can be applied through toolkit [14,15]. Inspired by these methods, considering the system precision declining when downsampling, we proposed RPE Degeneration and ATE Degeneration to describe the precision degeneration during the increase of downsampling rate, and Precision Maintenance Rate to describe the system capability of keeping initial precision. 2.2

2.2 RPE Degeneration and ATE Degeneration

RPE Degeneration and ATE Degeneration are defined based on the two types of error mentioned above. For each downsampling rate, we calculate the RPE and ATE values, using their maximum, mean, or (most often) RMSE. We directly use the Downsampling Rate (DSR) value and its reciprocal, written as the Downsampling Parameter (DSP) value, to represent the downsampling property.
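For reference, simplified translational-only versions of the two error statistics, evaluated once per downsampling rate, might look like the following (a sketch, not the full SE(3) formulation handled by toolkits such as evo [14]; `est` and `gt` are assumed to be time-aligned (N, 3) position arrays):

```python
import numpy as np

def ate_rmse(est: np.ndarray, gt: np.ndarray) -> float:
    """ATE: RMSE of per-pose position error between aligned trajectories."""
    err = np.linalg.norm(est - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))

def rpe_rmse(est: np.ndarray, gt: np.ndarray, delta: int = 1) -> float:
    """RPE: RMSE of the error in relative motion over a frame interval."""
    d_est = est[delta:] - est[:-delta]
    d_gt = gt[delta:] - gt[:-delta]
    err = np.linalg.norm(d_est - d_gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```

Computing these statistics at each DSP value yields the error-versus-downsampling curves discussed next.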

Fig. 2. This represents the relationship between pose error and the downsampling rate and parameters. (a) is a DSR-RPE chart; (b) is a DSP-RPE chart, with the evaluation criteria shown in the figure.

Taking RPE Degeneration as an example, a downsampling rate versus RPE chart, called a DSR-RPE chart, can be generated as shown in Fig. 2(a). It can be seen that when the DSR drops to a small value, the RPE value may change rapidly. To make the variation more distinct, the DSR value on the horizontal axis can be replaced by the DSP value; this type of chart is called a DSP-RPE chart, as shown in Fig. 2(b). The DSP-RPE and DSP-ATE relationships can be written as:

RPE = F(DSP, Alg_i, Ser_j)    (1)

ATE = G(DSP, Alg_i, Ser_j, Length_j)    (2)

Downsampling Assessment

237

where Alg_i denotes the SLAM algorithm, Ser_j denotes the dataset series, and Length_j denotes the length of the trajectory. As shown in model (1), the RPE results are related to the DSP value, the algorithm, and the dataset. By running different SLAM algorithms on a fixed dataset, the DSP-RPE relationship can therefore evaluate the property of each algorithm. Although these relationships are nonlinear, a piecewise linearization method is used to fit the DSP-RPE and DSP-ATE curves, divide the valid interval into a maintenance section and a degeneration section as shown in Fig. 2, and analyze the slope of each piece to define the evaluation criteria. RPE and ATE Degeneration both include two parts: the degenerate slope and the degenerate rate. The degenerate slope is measured by the fitting slope in the degeneration section. The degenerate rate is defined by the intersection of the maintenance and degeneration sections.
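The piecewise linearization just described can be sketched as follows (an illustrative implementation, not the authors' code: the breakpoint is found by exhaustively trying each split index and keeping the two-line fit with the smallest total squared error; the intersection of the two fitted lines approximates the degenerate rate):

```python
import numpy as np

def two_segment_fit(dsp, err):
    """Fit a two-piece linear model err ~ f(dsp) and return
    (degenerate_rate_dsp, maintenance_slope, degenerate_slope)."""
    dsp, err = np.asarray(dsp, float), np.asarray(err, float)
    best = None
    for k in range(1, len(dsp) - 1):            # candidate breakpoint index
        m1, c1 = np.polyfit(dsp[:k + 1], err[:k + 1], 1)
        m2, c2 = np.polyfit(dsp[k:], err[k:], 1)
        sse = (np.sum((m1 * dsp[:k + 1] + c1 - err[:k + 1]) ** 2)
               + np.sum((m2 * dsp[k:] + c2 - err[k:]) ** 2))
        if best is None or sse < best[0]:
            best = (sse, m1, c1, m2, c2)
    _, m1, c1, m2, c2 = best
    breakpoint_dsp = (c2 - c1) / (m1 - m2)      # where the two lines meet
    return breakpoint_dsp, m1, m2

# Example: error flat near 0.03 up to DSP 5, then rising 0.05 per step
dsp = np.arange(1, 9)
err = np.array([0.03, 0.03, 0.03, 0.03, 0.03, 0.08, 0.13, 0.18])
bp, m_maint, m_degen = two_segment_fit(dsp, err)
print(round(bp, 3), round(m_degen, 3))  # 5.0 0.05
```

Here the flat segment is the maintenance section, the rising segment is the degeneration section, and the intersection at DSP 5 plays the role of the degenerate rate.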

2.3 Precision Maintenance Rate

The Precision Maintenance Rate is proposed as a measurement of system robustness: through the downsampling assessment, it evaluates the capability of keeping valid and accurate results when points are lost or the environment has low texture. This evaluation criterion is based on combined information from the RPE-DSP and ATE-DSP charts. First, we calculate the fitting slope in the maintenance section and check whether it is under the maintenance threshold (set according to the length and difficulty of the dataset series). Then, when the slope is valid, the location where the differential with respect to DSP equals the maintenance slope is chosen as the precision maintenance location. We take this location in the RPE-DSP and ATE-DSP charts and compute the average downsampling rate, which is taken as the Precision Maintenance Rate. Note that the precision maintenance location always lies before the degeneration location, as shown in Fig. 2.
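Under the simplifying assumption that each chart is sampled at integer DSP values, the procedure can be sketched as follows (hypothetical helper names; the per-chart slope thresholds stand in for the dataset-dependent maintenance threshold):

```python
import numpy as np

def maintenance_location(dsp, err, slope_thresh):
    """Return the last DSP at which the chart's forward-difference slope
    has not yet exceeded the maintenance threshold."""
    dsp, err = np.asarray(dsp, float), np.asarray(err, float)
    slopes = np.diff(err) / np.diff(dsp)        # forward differences
    exceeded = np.where(slopes > slope_thresh)[0]
    return dsp[exceeded[0]] if len(exceeded) else dsp[-1]

def precision_maintenance_rate(dsp, rpe, ate, rpe_thresh, ate_thresh):
    """Average the maintenance locations of the RPE-DSP and ATE-DSP
    charts; the result, as a DSP, converts to a DSR by reciprocal."""
    loc = 0.5 * (maintenance_location(dsp, rpe, rpe_thresh)
                 + maintenance_location(dsp, ate, ate_thresh))
    return 1.0 / loc

# Example with both charts sampled at DSP = 1..8
dsp = np.arange(1, 9)
rpe = np.array([0.03, 0.03, 0.03, 0.03, 0.03, 0.08, 0.13, 0.18])
ate = np.array([2.0, 2.0, 2.0, 2.0, 2.0, 6.0, 10.0, 14.0])
print(precision_maintenance_rate(dsp, rpe, ate, 0.01, 1.0))  # 0.2
```

In this toy example both charts stay flat up to DSP 5, so the averaged maintenance location is 5 and the Precision Maintenance Rate is the corresponding DSR of 0.2.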

3 Experiments

In this section, we evaluate and compare typical LiDAR SLAM systems using the proposed benchmark. All experiments were conducted on a PC with 2.2 GHz cores and 16 GB RAM. The KITTI datasets [16] were chosen for these experiments; they provide registered data from a Velodyne HDL-64E LiDAR mounted on top of a vehicle driving on the road. To evaluate both long and short trajectories, we used sequence 00 with a length of 3724.187 m and sequence 06 with a length of 1232.876 m. The following parts present details of all the experiments.

3.1 LOAM Evaluation

LOAM [9] is a typical feature-based LiDAR SLAM system and is available as open source. Following the pipeline in Fig. 1, we downsampled the input scans and fed them into the LOAM system. Owing to the scanning type of the Velodyne

Table 1. The evaluation results of LOAM in KITTI 00 and 06

Seq. DSP/DSR  RPE-Max  RPE-Mean RPE-RMSE ATE-Max    ATE-Mean  ATE-RMSE
00   1/1      0.284151 0.022397 0.031191 8.093937   1.908372  2.423485
00   2/0.50   0.280321 0.022062 0.030056 9.082287   2.635733  3.051321
00   3/0.33   0.284108 0.022501 0.030599 5.546405   2.111695  2.353199
00   4/0.25   0.280555 0.022999 0.030652 7.625415   2.462113  2.789576
00   5/0.20   0.271588 0.023514 0.031064 14.66756   4.933111  5.758740
00   6/0.167  0.288099 0.024710 0.032503 14.68331   6.105435  7.128767
00   7/0.143  1.055256 0.035579 0.087671 26.89299   9.440987  11.354894
00   8/0.125  1.058111 0.042373 0.104268 45.01945   11.60611  14.221941
06   1/1      0.361223 0.015610 0.021652 1.419046   0.719620  0.776892
06   2/0.50   0.473493 0.015268 0.022854 1.499180   0.723937  0.783873
06   3/0.33   0.353006 0.015863 0.021547 1.676668   0.802562  0.869419
06   4/0.25   0.761053 0.017088 0.032740 1.751928   0.866708  0.936191
06   5/0.20   0.922965 0.018189 0.043391 1.992855   0.933151  1.017457
06   6/0.167  0.994771 0.024100 0.055360 3.110989   1.082953  1.230517
06   7/0.143  1.156118 0.038962 0.097816 11.78611   1.870667  2.188509
06   8/0.125  1.170489 0.092243 0.211157 17.16525   6.560940  8.089759

Fig. 3. This is the evaluation results summary for the LOAM system. (a) and (b) show the evaluation on KITTI sequence 00; (c) and (d) show the evaluation on KITTI sequence 06.


LiDAR and the range-view scan projection method in LOAM, a downsampling method based on the row data of the range image was used in this experiment. To make the evaluation more explicit, DSP was used to describe the downsampling property, with a domain from 1 to 8 (the trajectory always degenerates and becomes invalid before DSP reaches 8). The evaluation results are listed in Table 1 and analyzed in Fig. 3. From these evaluations, RPE Degeneration occurs at a DSP of 6, and ATE Degeneration occurs at 5 and 6 in the two sequences, respectively. The Precision Maintenance Rate with stability goes to 4. As shown in Fig. 3, the total degenerate rate and maintenance rate are around 6. Note that as the trajectory length grows, the uncertainty of the system's estimates is exposed, with the degenerate rate dropping.
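A hypothetical version of this row-based downsampling, assuming each point carries the row (ring) index it occupies in the range image, is simply:

```python
import numpy as np

def downsample_rows(points: np.ndarray, rows: np.ndarray, dsp: int) -> np.ndarray:
    """Keep only the points whose range-image row index is a multiple
    of `dsp`; e.g. dsp = 4 keeps 16 of the 64 HDL-64E rows (DSR = 0.25)."""
    return points[rows % dsp == 0]

# A toy scan: 64 rows of 10 points each
points = np.random.rand(640, 3)
rows = np.repeat(np.arange(64), 10)
print(downsample_rows(points, rows, dsp=4).shape[0])  # 160
```

Dropping whole rows rather than arbitrary points preserves the horizontal structure of each remaining scan line, which is what the feature extraction in LOAM-style systems operates on.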

3.2 LeGO-LOAM Evaluation

LeGO-LOAM [10] is a lightweight and ground-optimized variant of LOAM, and is available as open source. Analogously, we used the downsampling method based on the row data of the range image and obtained the evaluation results in Table 2, with relational charts in Fig. 4. Note that LeGO-LOAM adds a segmentation part before feature extraction, so we took this part into account by adapting the original parameter to the downsampling parameter when testing. From these evaluations, RPE Degeneration, as well as ATE Degeneration, occurs at a DSP of 4. The Precision Maintenance Rate is 3 with reliable results. As shown in Fig. 4, the total degenerate rate and maintenance rate are around 3. Note that in Fig. 4(a) and Fig. 4(b) there is a breakpoint in each subfigure, which means that when the parameter reaches 3, the pose error rises rapidly, even reaching the hundreds in Fig. 4(d).

Table 2. The evaluation results of LeGO-LOAM in KITTI 00 and 06

Seq. DSP/DSR RPE-Max   RPE-Mean RPE-RMSE ATE-Max    ATE-Mean   ATE-RMSE
00   1/1     5.628722  0.060819 0.145642 5.327052   1.892847   2.143006
00   2/0.50  6.631719  0.065562 0.173486 6.733567   2.157106   2.498690
00   3/0.33  5.816325  0.072764 0.166859 6.487077   2.311903   2.592051
00   4/0.25  9.423266  0.119692 0.213228 35.80178   14.28248   16.362343
00   5/0.20  15.182775 0.120434 0.274338 117.099522 30.530887  40.611493
06   1/1     0.789687  0.064245 0.076903 1.789468   0.858797   0.939227
06   2/0.50  0.688886  0.073606 0.087938 1.671545   0.825441   0.899820
06   3/0.33  0.920082  0.079869 0.096225 1.776301   0.876477   0.953486
06   4/0.25  1.361418  0.159900 0.223820 104.54959  38.232463  46.861874
06   5/0.20  1.254175  0.197430 0.274524 397.88986  147.27334  170.93391


Fig. 4. This is the evaluation results summary for the LeGO-LOAM system. (a) and (b) show the evaluation on KITTI sequence 00; (c) and (d) show the evaluation on KITTI sequence 06.

4 Discussion and Conclusion

In this section, we analyze the contribution of this assessment method, discuss the character of the scan point cloud in LiDAR SLAM based on the proposed methodology, compare the experimental results, and summarize this paper. The discussion and conclusions can be listed as follows:

– The relationship between LiDAR SLAM precision and the point cloud downsampling rate is discovered, which proves to be a nonlinear correlation with an extremely sharp degeneration gap between a robust result and an invalid one. The location of this rate is also a vital reference for a SLAM system, especially when being lightweight is a necessity.
– As shown by the maintenance section, the LiDAR scan points can be cut down within a certain range without loss of accuracy. This provides the evidence for downsampling the point cloud at the very beginning of SLAM. The dramatic decrease in accuracy also indicates the uncertainty of a single feature point, and thus the necessity of more robust features.
– Comparing typical LiDAR-only SLAM systems, LOAM outperforms LeGO-LOAM under this assessment method, which is probably caused by its joint optimization over 6 degrees of freedom and its use of more information from the raw points.

Through the proposed benchmark of downsampling assessment for LiDAR SLAM, along with a discussion of the experimental results that proves the


existence of the maintenance section and the degeneration section, this work is expected to serve as a reference and inspiration for future research in the relevant area.

Acknowledgements. This work was supported by STI 2030-Major Projects 2021ZD0201403, in part by NSFC 62088101 Autonomous Intelligent Unmanned Systems, and in part by the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT2022B04).

References

1. Roriz, R., Cabral, J., Gomes, T.: Automotive LiDAR technology: a survey. IEEE Trans. Intell. Transp. Syst. 23(7), 6282–6297 (2022). https://doi.org/10.1109/TITS.2021.3086804
2. Yin, H., et al.: A survey on global LiDAR localization (2023). http://arxiv.org/abs/2302.07433
3. Lambert, J., et al.: Performance analysis of 10 models of 3D LiDARs for automated driving. IEEE Access 8, 131699–131722 (2020). https://doi.org/10.1109/ACCESS.2020.3009680
4. Halterman, R., Bruch, M.H.: Velodyne HDL-64E LiDAR for unmanned surface vehicle obstacle detection. In: Defense + Commercial Sensing (2010)
5. Kovalenko, D., Korobkin, M., Minin, A.: Sensor aware LiDAR odometry (2020). https://doi.org/10.48550/arXiv.1907.09167
6. Ervan, O., Temeltaş, H.: Downsampling of a 3D LiDAR point cloud by a tensor voting based method. In: 2019 11th International Conference on Electrical and Electronics Engineering (ELECO), pp. 880–884 (2019). https://doi.org/10.23919/ELECO47770.2019.8990544
7. Deschaud, J.E.: IMLS-SLAM: scan-to-model matching based on 3D data (2018). https://doi.org/10.48550/arXiv.1802.08633
8. Wan, Z., et al.: Observation contribution theory for pose estimation accuracy. CoRR abs/2111.07723 (2021)
9. Zhang, J., Singh, S.: LOAM: LiDAR odometry and mapping in real-time (2014). https://doi.org/10.15607/RSS.2014.X.007
10. Shan, T., Englot, B.: LeGO-LOAM: lightweight and ground-optimized LiDAR odometry and mapping on variable terrain. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4758–4765 (2018). https://doi.org/10.1109/IROS.2018.8594299
11. Shan, T., Englot, B., Meyers, D., Wang, W., Ratti, C., Rus, D.: LIO-SAM: tightly-coupled LiDAR inertial odometry via smoothing and mapping (2020). https://doi.org/10.48550/arXiv.2007.00258
12. Rusu, R.B., Cousins, S.: 3D is here: point cloud library (PCL). In: IEEE International Conference on Robotics and Automation (ICRA). IEEE, Shanghai (2011)
13. Sturm, J., Engelhard, N., Endres, F., Burgard, W., Cremers, D.: A benchmark for the evaluation of RGB-D SLAM systems. In: 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 573–580 (2012). https://doi.org/10.1109/IROS.2012.6385773
14. Grupp, M.: EVO: Python package for the evaluation of odometry and SLAM. https://github.com/MichaelGrupp/evo (2017)


15. Zhang, Z., Scaramuzza, D.: A tutorial on quantitative trajectory evaluation for visual(-inertial) odometry. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 7244–7251 (2018). https://doi.org/10.1109/IROS.2018.8593941
16. Geiger, A., Lenz, P., Stiller, C., Urtasun, R.: Vision meets robotics: the KITTI dataset. Int. J. Robot. Res. 32(11), 1231–1237 (2013). https://doi.org/10.1177/0278364913491297

The Key to Autonomous Intelligence is the Effective Synergy of Human-Machine-Environment Systems

Wei Liu(B), Yangyang Zou, Ruilin He, and Xiaofeng Wang

Beijing University of Posts and Telecommunications, Beijing 100876, China
[email protected]

Abstract. Autonomous systems are the key to not only intelligentization but also smartification. Human intelligence works prior to the provision of data, whereas machine intelligence works thereafter. An autonomous system transcends not only formalized arithmetic computations but also traditional logic calculations—it is a new game-type computation-calculation system that incorporates the strengths of humans, machines, and the environment, relying chiefly on the effective synergy between them to form a game mode in a smartified system. During this process, artificial intelligence (AI) plays an important role in the gaming of the future, but there are three insurmountable obstacles: explainability, learning, and common sense [1]. This article concludes with an introduction to potential issues of future warfare, including deep situation awareness, uncertainty, and human issues in integrated human-machine intelligence. Therefore, the key to winning the future game lies in a thorough comprehension of the mechanisms for integrated human-machine intelligence and the approaches to effective synergy.

Keywords: Autonomous Systems · Human-Machine-Environment Integration · Deep Situation Awareness · Artificial Intelligence

1 Introduction

The Alpha programs, as eminent examples of AI, have achieved outstanding outcomes in game-type competitions such as the board game Go in recent years. Their root, nevertheless, is still relevance-based machine learning and reasoning in closed environments, whereas the root of game-type intelligence consists of human learning and understanding based on the integration of causality and relevance in open environments [2]. Such human learning engenders tacit knowledge [3] of logic calculations, order, and rules within a certain range of uncertainty (just as a child learns); such human understanding can relate seemingly irrelevant things. There are indications that the gaming of the future could be the gaming of human-machine-environment system integration. "Know your enemy and know yourself, and you need not fear the result of a thousand battles", said Sun Tzu. To "know", in this context, means to realize human and machine perception.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 243–256, 2023. https://doi.org/10.1007/978-981-99-6187-0_24

The difference between them is that humans are capable of comprehending

244

W. Liu et al.

and deformalizing, whereas machines are not able to understand intentions as flexibly and thoroughly as humans. The "enemy" here includes both the opponents and their equipment and environment; "yourself" here refers to the humans, machines, and environment of one's own. Therefore, without humans there will be no intelligence or AI, not to mention any game of the future. True intelligence or AI is not achieved by an abstract mathematical system, for mathematics is merely a tool to fulfill functions, not capabilities; only humans can build true capabilities. AI, for that reason, is a product of the interaction between humans, machines, and the environment.

Therefore, the war of the future will be the result of machine computations combined with human calculations—a combination of calculations and computations. To put it another way, it is the power of insight. The faster and more accurate the computations, the greater the danger, and the easier it is for them to be deceived and fall victim to their own smartness. As exemplified by the famous Chinese parable "The Old Man Lost His Horse (Sai Weng Shi Ma)", sheer mathematics is no match for human calculations and insights.

Services of the U.S. Armed Forces have recently envisioned their respective operational concepts for the future, including multi-domain operations, all-domain operations, and mosaic warfare, all of which concern human-machine-environment system engineering. It is the dispersion and aggregation of the elements in the human-machine-environment system. It is also the synthesis, mixture, and integration of symbolic distributed representations and many non-symbolic phenomenological representations. On top of that, it is the cross-complementation of machines, information, knowledge, experience, AI, intelligentization, and smartification. Therefore, getting to the bottom of the mechanisms for integrated human-machine intelligence is the key to winning the game in the future.
As any division of labor is subject to its scale and scope, the division of labor in integrated human-machine intelligence includes the division of both functions and capabilities. The former is caused by external demands and therefore passive, whereas the latter is active as it is generated by internal drive. Human or humanoid directional preprocessing is crucial for complex, heterogeneous, unstructured, and non-linear data, information, and knowledge. Upon preliminary reduction of a problem domain, the machine can give full play to its strengths: it is bounded, fast, and accurate. In addition, when a large amount of data, information, or knowledge is obtained, it can be mapped to various fields preparatory to human processing and analysis. Roughly speaking, the assimilation, accommodation, and cross-balance of these two processes are how humans and machines are organically integrated.

2 Patterns of the Future Game—Intelligentized Synergy

The way of the future is about not only intelligentization but also smartification. It will transcend formalized arithmetic computations and even traditional logic calculations [4]—it is a new game-type computation-calculation system that incorporates the strengths of humans, machines, and the environment. This is somewhat akin to education: schooling is to impart knowledge to students (like machine learning), but education should not stop there—it should tap into the logic behind it or go even deeper. For example, when teaching mathematics, one should think about what lies behind it. One should

The Key to Autonomous Intelligence

245

first help students cultivate their number sense before teaching the concept of calculation, such as addition and subtraction, and then move forward to their application. This is how students build insights.

Synergy is an essential means in the game of smartification. As risks of nuclear proliferation grow, the costs of future warfare between countries, regardless of their size, will continue to increase. In a sense, countries are both partners and competitors/strategic adversaries (as they must prevent a loss of control over nuclear/biochemical/smart weapons whilst working to demolish the morale of the adversary and defeat it). If power were a noun assigned to the masculine gender and wisdom to the feminine, then the game of the future is at least a neuter one, if not feminine. However AI develops, the future belongs to humanity. Humans should be the ones defining the rules of the future game and determining the future of AI, not the other way around, because although AI and the future game are both logical, the latter also concerns a great number of non-logical factors. When armed services work together in a multi-domain operation against strong electromagnetic-spectrum and cyberspace operational capabilities from the enemy, they will face a critical test of whether there is seamless transition and coordination among systems, including information communication, command and control, intelligence, surveillance, reconnaissance, etc. Therefore, the future game is a behavioral pattern of intelligentized synergy generated from effective human-machine-environment integration and collaboration with various players.

3 The Role of AI in the Future Game

The future game between AI and justice should undertake the mission of peace-building for humankind. Humans are responsible for doing the right things: upholding the values of fairness and justice, shared future, security, and controllability; AI is responsible for doing things right: with swiftness, accuracy, mass storage, and explainability [5].

The idea of AI originated in the 1940s. The British mathematician Alan Turing described a machine capable of thinking in his paper Computing Machinery and Intelligence [6], which was regarded as the prototype of AI; the paper also introduced the concept of what is now known as the Turing test. In 1956, Marvin Minsky [7], John McCarthy, Claude Shannon, and other scholars ran a research project on AI at Dartmouth College in the U.S., known as the "Dartmouth Workshop", which officially established the concept and development objectives of AI as a field. Its research covers propositional logic, knowledge representation, natural language processing, machine perception, machine learning, etc. The development of AI over the past six decades can be summarized into the following stages.

The golden years (1956–1974): After the Dartmouth Workshop, researchers achieved successes in areas including reasoning as search, natural language [8], and machine translation.

The first AI winter (1974–1980): Due to limited computing power, the combinatorial explosion, and the difficulty of realizing commonsense reasoning, machine translation failed, and AI was subject to questioning and critiques.

Boom (1980–1987): A form of AI program called "expert systems" using logical rules was widely adopted, which could solve problems in a specific domain. The Japanese Fifth Generation Computer project is considered a characteristic footprint of


this period.

Bust: the second AI winter (1987–1993): Abstract reasoning lost attention, and intelligence models based on symbolic processing were rejected.

Growth (1993–2012): Expectations for AI spiked once again upon the development of AI systems such as Deep Blue.

Eruption (2012–present): A new generation of information technologies such as machine learning [9], mobile networks, cloud computing, and big data [10, 11] reformed the information landscape and data foundation. Computing speed accelerated even further with substantial cost reductions, pushing AI to the next generation of explosive growth [12, 13].

Today, AI is like a schoolchild doing homework: it does exactly what it is assigned. It does not generate demand tasks autonomously, perform dynamic task planning, or coordinate demand conflicts. It is difficult for AI to cope with the contradiction between fast and slow situation awareness, and more so to realize the organic interaction of the entire human-machine-environment system and the mixed embeddedness of facts and value elements. Therefore, limited rational logic and weak cross-domain capabilities are the Achilles' heel of AI. It cannot understand equivalence or inclusion, especially that of values in different facts (something small can be greater than something large, and being can engender nothing). Humans can reach legitimate and rightful goals through devious and wrong means, or vice versa. They can also address complex issues through standard methods, or (intentionally) resolve simple ones through complex methods. Meanwhile, there are three insurmountable obstacles for AI in the military field: explainability, learning, and common sense.

3.1 Explainability

Explainability refers not only to the interpretation of existing rules but also to the establishment of their new connotations and extensions. Today, the explainability of AI is becoming unrealizable.
Last year, the EU issued The Ethics Guidelines for Trustworthy AI, which clearly stated that AI should be "trustworthy" in terms of safety, privacy, transparency, and explainability. AI applications aim to make decisions and judgments. Explainability refers to the degree to which humans can understand the reasons for a decision: the more explainable the AI model, the easier it is for humans to understand why certain decisions or predictions are made. The explainability of a model refers to the understanding of its internal mechanisms and outcomes. It is important in that it assists developers in the modeling stage to understand models, make comparisons and selections, and optimize them if necessary; and in the operation stage, it explains the internal mechanisms and the outcomes of a model to the decision-maker. For example, a decision-making recommendation model needs to explain why a certain solution was recommended to a certain user.

At present, the understanding and definition of AI vary across fields, but there is a consensus on generic technologies and basic research. In stage one, AI aims to solve problems and carry out logical reasoning through machine theorem proving, expert systems, etc.; in stage two, it aims to realize environmental interactions by obtaining information from the operating environment and exerting influence on it; and in stage three, it moves toward cognitive and thinking capabilities and works to uncover new knowledge through data mining systems and algorithms.


Strictly speaking, the U.S. is the global leader in AI, but its strengths are often not so evident when it comes to integrated human-machine intelligence; it may not even be a leader in this field (perhaps there is no generation gap between China and the U.S. here after all). And it all comes down to people. For instance, the U.S., with its healthcare capabilities, equipment, and professional competence, should have weathered the pandemic much better than China. "Weakness and ignorance are not barriers to survival, but arrogance is." The aphorism from the book The Three-Body Problem has, alas, come true. The mistakes and errors of the U.S. leaders have greatly compromised and even offset its cutting-edge technologies.

Another example comes from a recent report by Arms Control Today: the U.S. Department of Defense has requested US$28.9 billion for the modernization of its nuclear arsenal in the fiscal year 2021, reflecting the strategic focus of the Trump administration on increasing the automation of the nuclear command, control, and communications (NC3) infrastructure and improving its speed and accuracy. It raises a disturbing question: what is the role of AI autonomous systems in determining the fate of humanity in the future nuclear tussle? Since computer-aided decision-making is at an initial stage, it is still prone to unpredictable malfunctions. Furthermore, although machine learning algorithms are good at specific tasks such as facial recognition, they sometimes output implicit "bias" conveyed through the training data [14]. It is hence necessary to take a cautious and responsible approach to the application of AI in nuclear command and control. Humans (not machines) must exercise ultimate control over the use of nuclear weapons as long as they exist. In that case, the true ability of integrated human-machine intelligence will be as important as the ability to control a pandemic.
Integrated human-machine intelligence is a fundamental emblem of the combination of technology and art, as well as of the symbolic, factual language of mathematics and the natural language of experience and value. Spacetime can not only curve in the physical realm but also be distorted in the realm of intelligence. Just as philosophical logic experienced a turn in its perception of the origin of the world and its research approaches [15–17], the analysis of human languages by analytic philosophy in the last century "revolutionized" the human mind [18]. This philosophical revolution, embodied by Ludwig Wittgenstein, directly triggered the rapid development of AI technologies characterized by the Turing machine and the Turing test [19]. However, according to the "philosophy of facticity" proposed by Mr. Jin Guantao, analytic philosophy has instead, in the 21st century, confined philosophy in shackles and led to the imprisonment of the mind: symbols are entitled to their facticity even when not referring to any objects of experience, and this applies to both mathematical and natural languages. Meanwhile, the facticity of sheer symbols can be embedded in the facticity of experience; scientific and humanistic research can be two fields that are somewhat unified yet non-overlapping, with their respective facticity standards [20, 21]. A significant advance of humankind has been to place its instinct for facticity (the objectivity of common sense) under the control of the ultimate concern and corresponding value systems [22]. Today, the two cornerstones of facticity are being subverted by scientific advancements, and something truly terrifying has happened [23]: humans are irresistibly reduced to intelligent "animals"—in a world where truth and falsehood are arbitrary terms, there is no right or wrong, nor is there any genuine sense of morality or dignity.


The so-called AI largely relies on the increasing computing power of computers, which is bound to fail. Human learning is adaptive, whereas machine learning is not. Human intelligence is the ability to form situation awareness from small samples. A prominent example of situation awareness is the "Wang Wen Wen Qie" diagnostic method (also known as inspection, listening and smelling, inquiring, and palpation) in traditional Chinese medicine. It relates facts with values by bridging the gap between the mind and the physical through the differences between natural and mathematical languages [24]. Situation awareness is thought to have been first proposed in the 61st difficulty of The Nan Jing (or The Yellow Emperor's Canon of 81 Difficulty Issues), which goes: "As stated in the scriptures, look and know and you are a god; listen and know and you are a sage; ask and know and you are a practitioner; feel and know and you are a master. Why so?" The earliest use of the coined term "Wang Wen Wen Qie" was found in Gujin Yitong Daquan (or Medical Complete Book, Ancient and Modern): "These four methods—inspection, listening and smelling, inquiring, and palpation—are the essence of medicine." To elaborate, a doctor uses visual inspection to observe the patient's physical development, complexion, tongue, and expression; listens to the patient's voice, coughs, and respiration; attends to odors emitted through the patient's breath and body; asks about the patient's medical symptoms and history; and palpates the pulse or presses the abdomen of the patient to check for any lumps (hence the four diagnostic methods). It is hard to realize AI explainability because, fundamentally, AI concerns not only mathematical language but also natural language and even the language of thought (which is why its explainability is unrealizable) [25].
Meanwhile, integrated human-machine intelligence can not only work without a subject, but also switch between subjects to realize real-time and timely deep situation awareness in the human-machine-environment system interaction, thus enabling the organic switch between the signifiers, signifieds, and significations of the mathematical, natural, and thought languages, to fulfill its purpose and intention straightforwardly.

3.2 Learning

Machine learning can be viewed as a "metaphor" for human learning. The latter is an interactive process of learning and practicing triggered by initial knowledge acquisition and, more importantly, by the environment thereafter; machines are unable to perform the second part. Human learning involves a blend of facts and values and is a dynamic process with constant value adjustment. In addition, human memory is adaptive and changes with the human-machine-environment system, uncovering from time to time features not noticed before. The object of human learning is not knowledge, but the method of acquiring data, information, knowledge, and experience; the objects of machine learning are data, information, and knowledge. There are similarities between different material systems and between one system and its subsystems [26]. Material systems with different forms of motion and properties nevertheless follow similar physical laws. This all indicates that similarity is a basic attribute of nature. For example, the mechanical mass-spring-damper system and the resistance-inductance-capacitance circuit system are similar systems, reflecting the similarity between physical phenomena (which, generally, can be used to simplify complex systems for research). It is easier for machines to learn and transfer such

The Key to Autonomous Intelligence

249

homogeneous, linear similar systems, but the analogy and conversion of heterogeneous, nonlinear similar systems would be much harder for them. By contrast, human learning can traverse freely between the realms of symmetry and asymmetry, homogeneity and heterogeneity, linearity and nonlinearity, homology and heterology, isomorphism and non-isomorphism, empathy and non-empathy, sympathy and non-sympathy, periodicity and non-periodicity, topology and non-topology, and family and non-family. Machine learning is inseparable from time, space, and symbols, while human learning is a system that changes with values, facts, and emotions. The former follows and depends on existing rules, while the latter seeks to reshape established rules, subvert conventions, and forge new orders. For instance, truly outstanding leaders and commanders are always unconventional: they are reformers who cannot stand a steady slide into degeneration and decline, and when a pandemic is raging, their political careers will certainly not be their sole concern. On March 16, 2017, the U.S. Defense Advanced Research Projects Agency (DARPA) announced the launch of its Lifelong Learning Machines (L2M) program, which seeks to develop the next generation of machine learning techniques and, on that basis, promote the third wave of AI. According to DARPA, we have witnessed two waves of AI, and a third is coming. The first wave was characterized by "handcrafted knowledge"; cases in point are the Windows operating system, smartphone applications, and traffic-light programs. The second wave featured "statistical learning"; typical examples are artificial neural network systems and the progress in self-driving vehicles. Despite their strong reasoning and judgment abilities on specific problems, these AI technologies are unable to learn and are weak with uncertainties. 
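As an illustrative aside (not part of the original text), the mass-spring-damper/RLC analogy above can be made concrete: under the correspondence m↔L, c↔R, k↔1/C, the two systems obey the same second-order equation, so a simulation with matched parameters yields identical trajectories. All numerical values below are invented for illustration.

```python
# Mechanical system: m*x'' + c*x' + k*x = 0
# Electrical analog (series RLC, charge q): L*q'' + R*q' + q/C = 0
# Under m<->L, c<->R, k<->1/C these are the same equation.

def simulate_msd(m, c, k, x0=1.0, v0=0.0, dt=1e-4, steps=20_000):
    """Integrate the mass-spring-damper ODE with semi-implicit Euler."""
    x, v = x0, v0
    traj = []
    for _ in range(steps):
        a = -(c * v + k * x) / m   # x'' = -(c*x' + k*x)/m
        v += a * dt
        x += v * dt
        traj.append(x)
    return traj

def simulate_rlc(L, R, C, q0=1.0, i0=0.0, dt=1e-4, steps=20_000):
    """Integrate the series RLC charge ODE with semi-implicit Euler."""
    q, i = q0, i0
    traj = []
    for _ in range(steps):
        di = -(R * i + q / C) / L  # q'' = -(R*q' + q/C)/L
        i += di * dt
        q += i * dt
        traj.append(q)
    return traj

# m = 2, c = 0.5, k = 8  maps to  L = 2, R = 0.5, C = 1/8 (so 1/C = 8)
mech = simulate_msd(m=2.0, c=0.5, k=8.0)
elec = simulate_rlc(L=2.0, R=0.5, C=0.125)
assert max(abs(a - b) for a, b in zip(mech, elec)) < 1e-9
```

The identical trajectories show why such homogeneous, linear analogies are easy to transfer mechanically; the text's point is that heterogeneous, nonlinear analogies admit no such direct parameter mapping.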
The third wave of AI will focus on "contextual adaptation", which means that it will be able to understand the situation and identify logical rules, thereby activating self-training until its decision-making process is established. This shows that continuous self-learning capability will be the core driver of the third wave of AI, and the objective of the L2M program is exactly aligned with this feature. By developing a new generation of machine learning techniques that enable constant contextual learning and general knowledge acquisition, the L2M program will serve as a strong foundation for the third wave of AI. At present, the program supports a large base of 30 performer groups via grants and contracts of different durations and sizes. In March 2019, researchers at the University of Southern California (USC), a DARPA partner, published results regarding their exploration of bio-inspired AI algorithms. L2M researcher Francisco J. Valero-Cuevas, professor of biomedical engineering and biokinesiology at the USC Viterbi School of Engineering, along with doctoral students Ali Marjaninejad, Dario Urbina-Melendez, and Brian Cohn, published an article in Nature Machine Intelligence detailing their successful creation of an AI-controlled robotic limb. The limb, driven by animal-like tendons, is capable of teaching itself a walking task and even automatically recovering from a disruption to its balance. Behind the robotic limb is a bio-inspired algorithm that can learn a walking task on its own after only five minutes of "unstructured play", that is, conducting random movements that enable the robot to learn its structure and its surrounding environment.

250

W. Liu et al.

Current machine-learning approaches rely on pre-programming a system for all possible scenarios, which is complex, labor-intensive, and inefficient [27]. What the USC researchers have accomplished shows that it is possible for AI systems to learn from relevant experience, finding and adapting solutions to challenges over time. In fact, it is difficult even for people to achieve lifelong learning, as learning itself is infinite. As we learn something, there is always more about which we have half-baked or even zero knowledge. And for machines with neither "common sense" nor "analogy", lifelong learning sounds more like a slogan than anything else. The priority here is to clarify what can and cannot be learned. Human learning is omnidirectional and multi-perspective. One thing can turn into many. One relationship can grow into many. And one fact can turn into not only multiple facts but also multiple values. What is more interesting is that human learning can sometimes categorize different things, relationships, and facts into one type of thing, one relationship, one fact, or even one value. As machine learning is essentially the perception of one or more persons made explicit, it is in a strict sense opinionated and presumptuous, for people only recognize things they are used to or familiar with and will consequently carry those limitations and that narrow-mindedness into their models and programs unconsciously. This indicates the intrinsic deficiency of such one-to-many and many-to-one transformation mechanisms. That being said, machine learning is not without merits. Although not suitable for intelligentization, it can be useful for computer and automation applications. If the essence of learning is categorization, then human learning is about obtaining and creating approaches to categorization, whereas machine learning is simply about applying such approaches. 
DARPA's Lifelong Learning Machines (L2M) program may only be a beautiful bubble of wishful thinking, floating high and low as the wind comes and goes; and however a bubble gleams in the sun, it pops.

3.3 Common Sense

There are flaws, ignorance, and unpleasantness in common sense. Like medicines, knowledge is subject to its scope and premises, without which its side effects emerge. Knowledge is merely an ingredient of common sense. Machines have only "information" but no "perception", and therefore cannot unite the information they receive with their actions. Knowledge should not just be attached to the mind but rather integrate with it; if knowledge cannot change and perfect the mind, it is best deserted. Having knowledge but not knowing how to use it is even worse than knowing nothing; such knowledge is so dangerous that it will put its owner in peril. One of the most effective means of mitigating the side effects of knowledge is establishing common sense. In general, common sense is fragmented. Situation awareness is the formation of certain non-common-sense perceptions derived from sensing the state and trends of fragmented common sense. Furthermore, common sense is a basic human ability to perceive and understand the world. A typical AI system lacks a general understanding of how the physical world works (e.g., intuitive physics), of human motivation and behavior (e.g., intuitive psychology), and of everyday entities as an adult perceives them.


As DARPA continues to develop its second-wave AI technologies and relevant military applications, it is actively deploying the development of third-wave AI and is committed to relevant basic research through new and existing programs in fiscal years 2018–2020. It is seeking breakthroughs in basic AI theories and core technologies through research in machine learning and reasoning, natural-language understanding, modeling and simulation, and human-machine integration. Relevant programs include Machine Common Sense, Lifelong Learning Machines, Explainable Artificial Intelligence, Assured Autonomy, Active Interpretation of Disparate Alternatives, Automating Scientific Knowledge Extraction, Guaranteeing AI Robustness against Deception, Accelerated Artificial Intelligence, Science of Artificial Intelligence, Learning with Less Labels, Knowledge-directed Artificial Intelligence Reasoning Over Schemas, Accelerated Computation for Efficient Scientific Simulation, Complex Adaptive System Composition and Design Environment, Communicating with Computers, Symbiotic Design for Cyber-Physical Systems, etc. On top of that, DARPA publicized more AI programs in its recent Broad Agency Announcements: Science for Artificial Intelligence and Learning for Open-world Novelty, Artificial Social Intelligence for Successful Teams, Real-Time Machine Learning, etc. What a pity if learning fails to enable us to think and act! Learning is not intended to make the mindless think or the visionless see. Its job is not to restore sight to the blind but to train and correct vision; at least there should be some vision for it to train. Learning is good medicine, but any good medicine can go bad; its shelf life depends on the quality of the container. The Russian-born mathematician Vladimir Voevodsky is renowned for developing a new cohomology theory for algebraic varieties, which provided a new perspective on number theory and algebraic geometry. 
His work is distinctive in that he simplified highly abstract concepts and applied them to solving specific mathematical questions. The idea of cohomology originally derived from topology, roughly the "science of shape", which studies the shapes of, for example, spheres, tori, and their higher-dimensional analogs. It looks into the basic properties of these objects that are preserved under continuous deformations (no tearing). In layman's terms, cohomology serves as a method to divide topological spaces into pieces that are easier to study, and the cohomology groups contain information on how to assemble these pieces back into the original object. The central objects of study in algebraic geometry are algebraic varieties, the solution sets of polynomial equations. They can be represented by geometric objects such as curves or surfaces, which are more "rigid" than deformable topological objects. In 2017, DARPA's Strategic Technology Office (STO) unveiled the vision of "Mosaic Warfare", which holds that the future battleground is a mosaic where lower-cost, less-complex systems are linked together in a vast number of ways to create desired, interwoven effects tailored to any scenario. Part of the concept is "combining weapons we already have today in new and surprising ways." The key will be manned-unmanned teaming, disaggregated capabilities, and allowing commanders to seamlessly call on effects from sea, land, or air depending on the situation, no matter which of the armed services provides the capability.


Simply put, just like the motivic cohomology formulated by Vladimir Voevodsky, the "Mosaic Warfare" concept and the Machine Common Sense program are both new topological systems applied to human-machine-environment systems. Yet the real strength lies not in the basic knowledge or rules but in the people who have succeeded by applying them in practice, such as the many political and military strategists who were not graduates of prestigious military academies yet ultimately defeated the esteemed principals and generals of the Huangpu (Whampoa) Military Academy, including Mao Zedong (Hunan First Normal University) and Su Yu (Hunan Second Normal University).

4 Issues in the Future Game

4.1 Deep Situation Awareness in Integrated Human-Machine Intelligence

On the surface, the development of military-relevant intelligentization in various countries is in full swing; in fact, there is a fatal flaw in this development: the failure to achieve integrated human-machine intelligence, especially deep situation awareness. Any paradigm-changing technological advancement derives from an understanding of the basics. For example, all human behavior is motivated, and motive means value. As there are three levels of motives, so are there three levels of values. In addition to the capability of value-based causal reasoning, humans surpass AI in terms of variable characteristics, representations, understanding, judgments, predictions, and execution. Strictly speaking, current AI application scenarios are extremely narrow and remain at the early stage of computational and perceptual intelligence. They do not initiate effective descriptions of scenarios, which is the greatest challenge in the science of intelligence. Historical and modern military intelligentization, however, has relied only on training AI algorithms for their respective application scenarios. In general, these AI technologies use symbolism, behaviorism, and connectionism to carry out formalized causal inference and data calculations regarding objective facts, but rarely touch on value-based causal judgment and decision-making. Deep situation awareness, on the other hand, involves the integration of facts and values (which points to "deep"); objective, factual data and the objective part of information and knowledge such as prominence and time-and-space parameters, known simply as the chain of facts (which points to "situation"); along with the chain of values, which refers to subjective value parameters such as expectations and effort level (which points to "awareness"). 
Deep situation awareness is a "double helix" with the chain of facts and the chain of values winding around each other, thereby enabling effective judgment and accurate decision-making. In addition, humans tend to handle subjective value-based calculations, while machines focus on computations concerning objective facts and processes, which forms another "double helix". No country has managed to achieve proper matching between these two "double helices" (e.g., time and space, significance, expectations, effort, value, etc.). In a sense, deep situation awareness addresses not only the prominence of time-and-space contradictions in human-machine-environment systems but also the selectivity of the contradictions regarding facts, values, and liabilities. Contradictions are competition, and decision-making entails risks [28]. Good situation awareness can see the order in chaos, possibilities in impossibilities, and


light in the darkness... Therefore, the current bottleneck in the field of military-relevant intelligence lies in deep situation awareness in integrated human-machine intelligence.

4.2 Uncertainty

The famous military theorist Carl von Clausewitz held that war is a fog of great uncertainty and unknowability. Unknowability refers to things that are unforeseeable and unpredictable. As regards modern AI development, there are still many unsolved concerns about human-machine integration in the game of the foreseeable future. More specifically: 1) In a complex game environment, humans and machines absorb, digest, and use limited information in a specific spacetime. The challenge for humans is that as pressure mounts, the amount of information they misunderstand increases, leading to greater confusion and accidents. For machines, it is still very difficult to learn, understand, and predict unstructured, cross-domain data. 2) The wide distribution, both in spacetime and in emotions, of the information needed for decision-making in the game makes it difficult to obtain certain key information in specific situations. Besides, it is challenging to integrate the crucial objective, physical data collected by machines with the subjective, processed information acquired by humans. 3) The large number of nonlinear features and the unforeseeable variability of the future game will often cause great unpredictability in behavioral processes and outcomes. Formalized logical reasoning based on axioms is far from sufficient to meet the demands of decision-making on a complex and ever-changing battlefield.

4.3 Human Issues

"Cross-domain synergy" is a "human issue", and multi-domain operations can solve it through convergence and the integration of systems. 
Convergence is defined as "the integration of capabilities across domains, environments, and functions in time and physical space to achieve a purpose." The integration of systems emphasizes the importance of the people, processes, and technical solutions needed for "cross-domain synergy". So far, "cross-domain synergy" has not been realized, since the established systems and programs are "chimneys", independent of each other; they still need "human solutions" to realize cross-domain mobility and firepower. As automation, machine learning, AI, and other technologies mature, the adversaries of the U.S. forces will seek to pose a greater challenge by applying these technologies. As Robert Work required, the U.S. military was to dismantle the existing "chimneys" and design a new solution supported by manned-unmanned teaming. The idea of "multi-domain warfare" goes back to April 8, 2015, when then-Deputy Secretary of Defense Robert Work asked the Army to explore AirLand Battle 2.0 while speaking at the U.S. Army War College. In his speech, he summarized the problems that will arise from the 21st-century game and the necessary responses of the U.S. military. Regarding AirLand Battle 2.0, Work envisioned that it should become a means for the U.S. Army to act and defeat enemies after they enter the theater and break through A2/AD defenses. In his words: "We are going to have to think about fighting against enemies which have lots of guided rockets, artillery, mortars, and missiles, and are using informationalized warfare to completely disrupt our heavily


netted force. So what does AirLand Battle 2.0 look like? I don't know. The Army needs to figure this out." He went on to give an important directive when he emphatically mentioned the book Average Is Over by Tyler Cowen and its significance for research on AirLand Battle 2.0. The author found that in chess, machines constantly beat grandmasters, but the combination of a human and a machine always beats the machine; this is the idea behind what Cowen calls freestyle chess. Applying the concept to the game, Work hopes that manned-unmanned teaming will serve as the key for the U.S. military to win in the future. On May 12, 2020, American defense expert Peter L. Hickman published an article titled The Future of Warfare Will Continue to Be Human. He believes that the character of war continues to evolve, and AI will significantly contribute to elements of that evolution. However, there are risks in overestimating the rate of technological change and the role of advanced tech in future victory. An over-emphasis on technology allows blind spots that adversaries will be sure to exploit. There is no problem with striving for cutting-edge technology, but human intelligence and creativity, not technology, will win the next game. In fact, this coincides with Mao Zedong's strategic concept of the people's game: weapons are an important factor in war, but not the decisive one; it is people, not things, that are decisive.

5 Conclusion

If logic is the "relation of equality or inclusion of strings of symbols", then non-logic is the "relation of equality or inclusion of non-symbols." AI is better at processing logical issues, humans at non-logical ones, and integrated human-machine intelligence can handle a blend of the two. For AI to fulfill its role in confrontation warfare, three conditions must be met: first, identify the part that can be solved by mathematical, quantitative computations; second, determine the right time, method, and function for that AI part during human-machine integration; and lastly, let humans do the right things and AI "do things right." Recently, Gen. John M. Murray, head of the U.S. Army Futures Command, along with other Army technical leaders, emphasized that "humans" must ultimately make the important decisions and control the "command and control" systems. They also pointed out that rapidly evolving AI weapon systems will allow Army commanders to "see, decide and act faster than an adversary" and, ultimately, destroy the adversary. (The speed of human decision-making, or the OODA loop, will rise exponentially thanks to the data-processing capabilities of AI.) There are always signs before something is on the horizon, and integrated human-machine intelligence can discern these trivial signs and clues at the right time; not too fast, not too slow, for both are equally unproductive. To "see, decide and act faster than an adversary" does not necessarily mean an ultimate victory. Sometimes it is good to slow down; you may run into a trap when you run too fast. Currently, the mismatch between human intelligence and AI is the main cause of the frailties in AI applications [29]. Humans can be here and not here at the same time. We can be and not be at the same time. We can be in a superposition state. Crisis management is often a superposition scenario: peril and opportunity have


a symbiosis. They are in each other and intertwined with each other. The key to deep situation awareness in integrated human-machine intelligence is to seize the momentum and make the best of it. Humans and machines are non-isomorphic, meaning they differ in nature: in controlled experiments, the former entails partial irreproducibility and the latter universal reproducibility. Human intelligence is composed of controllable, uncontrollable, reproducible, and irreproducible elements. A colorblind person would believe that there is no color, which is true for him but false for others. If there were more colorblind people than color-sighted people, would the latter be regarded as colorblind instead? The flaw of science is that it denies the uncontrollability and irreproducibility of individuality, and this inevitably invites defects [30]. Every human being is a subject of naturally individualized uncontrollability and irreproducibility, whose existence cannot be denied [31]. From that standpoint, the nature of human-machine integration is to help science overcome its flaws and limitations. The merit of big data is the universal reproducibility of its controlled experiments. This is how it uncovers common regularities, learning through its findings. However, this is also its weakness, as it tends to overlook new things (the irreproducible factors in controlled experiments) and stays unadaptable to changes. The irreproducibility in certain controlled experiments is real, yet it is not included within the realm of science. In the past, we mistook the part for the whole; now we simply stay put in our work on human-machine integration while the wind of change is blowing. On that account, the answer to future victory in the game is to get to the bottom of the mechanisms of integrated human-machine intelligence and realize effective synergy. AI technologies will be indispensable in that process.

References

1. Marcus, G.: Deepmind's losses and the future of artificial intelligence. WIRED. https://www.wired.com/story/deepminds-losses-future-artificial-intelligence. Last accessed 6 Jan 2020
2. Nielsen, M.: Is AlphaGo really such a big deal? Quanta Magazine (2016)
3. Polanyi, M., Sen, A.: The Tacit Dimension. University of Chicago Press, Chicago (2009)
4. Mill, J.S.: A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. Longmans, Green, London (1898)
5. Lanier, J.: Who Owns the Future? Simon and Schuster, New York (2014)
6. Turing, A.M.: Computing machinery and intelligence. Mind 59(236), 433–460 (1950)
7. Minsky, M.L.: Computation. Prentice-Hall, Englewood Cliffs (1967)
8. The homepage of the Mitsuku chatbot (year?). https://www.wired.com/story/deepminds-losses-future-artificial-intelligence/. Last accessed 12 Sep 2020
9. Jones, N.: Computer science: the learning machines. Nature 505(7482), 146–148 (2014). https://doi.org/10.1038/505146a
10. Mayer-Schönberger, V., Cukier, K.: Big Data: A Revolution That Will Transform How We Live, Work, and Think (2013)
11. Chen, X.-W., Lin, X.: Big data deep learning: challenges and perspectives. IEEE Access 2, 514–525 (2014). https://doi.org/10.1109/ACCESS.2014.2325029
12. Floridi, L.: The Fourth Revolution: How the Infosphere is Reshaping Human Reality. OUP, Oxford (2014)
13. Brynjolfsson, E., McAfee, A.: The Second Machine Age: Work, Progress, and Prosperity in a Time of Brilliant Technologies. W. W. Norton & Company (2014)
14. Kreinovich, V., McClure, J., Symons, J.: The End of Theory? Does the Data Deluge Make the Scientific Method Obsolete? (2008)
15. Koyré, A.: Galileo and Plato. J. History Ideas 4(4), 400 (1943). https://doi.org/10.2307/2707166
16. Galilei, G.: Dialogue Concerning the Two Chief World Systems. University of California Press (1967)
17. Koyré, A., Franklin, A.: Galileo studies. Am. J. Phys. 47(9), 831–832 (1979)
18. Husserl, E.: The Crisis of European Sciences and Transcendental Phenomenology: An Introduction to Phenomenological Philosophy. Northwestern University Press (1970)
19. Lambert, K.A.: Turing's Man: Western Culture in the Computer Age. Taylor & Francis (1988)
20. Newton, R.G.: The Truth of Science: Physical Theories and Reality. Harvard University Press (1997)
21. Penrose, R.: Shadows of the mind: a search for the missing science of consciousness. Sci. Spectra 11, 74 (1998)
22. Plato: The Republic. Penguin Books, Harmondsworth (1955)
23. Sagan, L.A.: Electric and Magnetic Fields: Invisible Risks? Gordon and Breach Publishers (1996)
24. Crick, F., Clark, J.: The astonishing hypothesis. J. Consciousness Stud. 1(1), 10–16 (1994)
25. Dreyfus, H., Dreyfus, S.: Mind over machine: the power of human intuition and expertise in the era of the computer. IEEE Expert 2(2), 110–111 (1987). https://doi.org/10.1109/MEX.1987.4307079
26. Penrose, R., Mermin, N.D.: The Emperor's New Mind: Concerning Computers, Minds, and the Laws of Physics. Am. J. Phys. 58(12), 1214–1216 (1990). https://doi.org/10.1119/1.16207
27. Najafabadi, M.M., Villanustre, F., Khoshgoftaar, T.M., Seliya, N., Wald, R., Muharemagic, E.: Deep learning applications and challenges in big data analytics. J. Big Data 2(1), 1–21 (2015). https://doi.org/10.1186/s40537-014-0007-7
28. Wikipedia: Jeopardy! https://en.wikipedia.org/wiki/Jeopardy! (year?). Last accessed 2 Feb 2017
29. Russell, B.: On the notion of cause. Proc. Aristot. Soc. 13, 1–26 (1912)
30. Pearl, J., Mackenzie, D.: The Book of Why: The New Science of Cause and Effect. Basic Books (2018)
31. Polanyi, M.: Personal Knowledge: Towards a Post-Critical Philosophy. University of Chicago Press (1962)

Design of Ultra-Low-Power Interface Circuit for Self-Powered Wireless Sensor Node

Chunlong Li1,2, Hui Huang1,2, Dengfeng Ju1,2, Hongjing Liu3,4, Kewen Liu3,4, and Xingqiang Zhao1,5(B)

1 State Grid Smart Grid Research Institute Co., Ltd., Beijing 102209, China

[email protected]

2 Electric Power Intelligent Sensing Technology Laboratory of State Grid Corporation,

Beijing 102209, China 3 State Grid Beijing Electric Power Research Institute, Beijing 100075, China 4 Standard Verification Laboratory for On-Site Testing Technology, Beijing 102209, China 5 School of Automation, Nanjing University of Information Science and Technology,

Nanjing 210000, China

Abstract. A self-powered wireless sensor node was designed in this paper to monitor transformer faults. A piezoelectric vibration energy harvester (PVEH) was used to generate electric power. To provide reliable electric power to WSNs nodes, a switch-triggering circuit with hysteresis was designed. Before the circuit triggers, the energy storage capacitor and the WSNs node are disconnected. Once triggered, the switch turns on to connect the capacitor to the node for power supply. When the capacitor voltage drops to a lower threshold, the switch turns off and returns to the open state. This hysteresis ensures that the capacitor is charged with sufficient energy before it supplies power to the node. A low-power temperature and humidity WSNs node was designed, which transmits data wirelessly using a LoRa module. During cold start, the voltage of the energy storage capacitor reaches 3.44 V after 2 min, triggering the switch circuit. The WSNs node then operates for 7 ms before the switch circuit returns to the OFF state. Vibration energy harvesting technology breaks through the bottleneck of battery life and supplies power to WSNs nodes for transformer monitoring.

Keywords: Power transformer monitoring · Switch-triggering circuit · Piezoelectric vibration energy harvesters

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 257–262, 2023. https://doi.org/10.1007/978-981-99-6187-0_25

1 Introduction

In recent years, wireless sensor networks (WSNs) have brought great advantages to transformer fault monitoring, such as fault diagnosis of transformers [1], as they reduce monitoring costs and personal hazards during maintenance. Power supply is a significant challenge for WSNs [2], especially when the battery life of the nodes is short. Energy harvesting technologies are an effective solution [3–5]. An energy harvester converts ambient energy (such as solar [6], vibration [7, 8], and thermal [9] sources) into electrical energy for the WSNs node. This self-powered, battery-free WSNs node can operate for


an extended period. Typically, vibration energy harvesters use piezoelectric [10], electromagnetic [11], and electrostatic [12] mechanisms, of which piezoelectric vibration energy harvesters (PVEH) are the most widely studied because of their simple structure, high power, and considerable voltage. The output power of a vibration energy harvester ranges from hundreds of µW to tens of mW, while a WSNs node usually needs hundreds of mW to transmit signals, so a PVEH alone cannot maintain stable operation of a WSNs node. Therefore, a management circuit is needed to provide a stable and reliable power supply. Typical power management circuits include rectification, charging, and voltage stabilization. Due to the sub-threshold behavior of semiconductor chips [13], the voltage of the chip in the node will be locked at a lower sub-threshold level when the energy storage voltage is below the normal value. The power generated by the PVEH is then consumed in this sub-threshold state, and the node cannot operate normally. To prevent sub-threshold lock-up during cold start, a trigger switch with hysteresis should be used. This switch is open when the voltage across the energy storage capacitor is below an upper threshold; in this state, the PVEH charges the capacitor, and the load is not powered until the capacitor has stored sufficient energy. Once the voltage exceeds the upper threshold, the switch closes, and the capacitor discharges electrical power to the load. When the voltage then falls below a lower threshold, the switch opens, and the circuit returns to its original state. The upper threshold voltage is higher than the lower threshold, giving the circuit a hysteresis characteristic. This mechanism ensures the normal power supply of WSNs nodes and avoids possible lock-up issues during cold startup. Stark designed a start-up circuit for microwatt energy harvesters [13]. 
This start-up circuit used an array of discrete MOSFETs operated in their sub-threshold regions, and a voltage-detecting switch was proposed. The start-up circuit closes at 1.95 V and opens at 0.95 V, a hysteresis voltage of 1.0 V. Alghisi proposed a nano-power trigger circuit using several MOSFETs for battery-less power management electronics [14]. The trigger circuit switches to the closed state at 2.15 V and opens at 1.15 V. The power consumption is as low as 28 nW when the switch is in the open state. This paper proposes an ultra-low-power interface circuit for self-powered wireless sensor nodes to monitor transformer temperature and humidity. Section 2 presents a simple trigger switch circuit with hysteresis to prevent sub-threshold lock-up during cold start. Finally, experimental research was conducted on the PVEH trigger switch circuit, and a battery-free low-power temperature and humidity node was successfully self-powered.
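The hysteretic trigger behavior described above can be sketched as a simple state machine. This is a behavioral illustration only, not the authors' circuit: it uses the 1.95 V / 0.95 V thresholds reported for Stark's design, while the per-tick charge and discharge rates are invented values chosen only to make the cycle visible.

```python
# Behavioral model of a hysteretic trigger switch: the switch closes once
# the storage-capacitor voltage reaches an upper threshold and reopens at
# a lower one, so the load only ever draws from a well-charged capacitor.

V_ON, V_OFF = 1.95, 0.95                 # thresholds from Stark's circuit [13]
CHARGE_STEP, DISCHARGE_STEP = 0.01, 0.05  # per-tick deltas (illustrative only)

def run(ticks):
    """Return the (event, voltage) transitions over the given number of ticks."""
    v, closed, events = 0.0, False, []
    for _ in range(ticks):
        if closed:
            v -= DISCHARGE_STEP           # load drains the capacitor
            if v <= V_OFF:
                closed = False
                events.append(("open", round(v, 2)))
        else:
            v += CHARGE_STEP              # harvester charges the capacitor
            if v >= V_ON:
                closed = True
                events.append(("close", round(v, 2)))
    return events

events = run(400)
assert events[0][0] == "close" and events[0][1] >= V_ON
assert events[1][0] == "open" and events[1][1] <= V_OFF
```

Because the close threshold sits 1 V above the open threshold, the capacitor always accumulates a fixed energy budget before each discharge burst, which is exactly the lock-up-avoidance property the text attributes to hysteresis.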

2 Interface Circuit Principle

Figure 1 is the schematic of the PVEH interface circuit. Four diodes D1–D4 form a full-wave bridge rectifier that converts the AC output of the PVEH to DC, which charges an energy storage capacitor C1 to power the WSNs node. A simple trigger switching circuit with hysteresis is connected between C1 and the sensor node. U1 is a DC-DC converter chip that provides a stable voltage VCC for the node. The trigger circuit is shown in the red rectangle in Fig. 2; it includes a hysteresis voltage comparator and a P-channel enhancement MOSFET switch Q3. The drain-source on-state resistance RDS(on) of the MOSFET Q3 is very small, about tens of mΩ,

Design of Ultra-Low-Power Interface Circuit


Fig. 1. The power interface circuit

and the drain-source off-state resistance RDS(off) is higher than 1 GΩ, so Q3 can be considered an ideal switch. In the ON (closed) state, the capacitor C1 discharges through the subsequent circuit; in the OFF (open) state, the capacitor no longer supplies power. The switch Q3 is triggered by a hysteresis comparator, which consists of a three-resistor voltage divider, an N-channel enhancement MOSFET Q2, and a P-channel enhancement MOSFET Q1.

Fig. 2. Simulation results of trigger switching circuit with hysteresis

For the trigger circuit, in the case of cold start, the capacitor is charged by the PVEH, and its voltage VC1, divided by the three resistors, gradually increases from 0. If the voltage drop V2 across resistor R3 is below the gate threshold voltage UGS(th) of NMOS Q2, all of the MOSFETs are in the cut-off state and the switch circuit is open, as shown in Fig. 4. In the open state, the discharge current of C1 through the three resistors is not greater than 1 µA, so the energy loss is negligible. When voltage V2 reaches UGS(th) of NMOS Q2, the NMOS turns on. The voltage V3 is pulled down almost to ground, which turns on the PMOS Q1 and bypasses resistance R1. The PMOS Q3, acting as a switch, is turned on, and the energy storage capacitor C1 supplies power to the load. The trigger voltage VH of the circuit can be calculated as follows:

VH = UGS(th) (R1 + R2 + R3) / R3.  (1)


C. Li et al.

Once the switch circuit is triggered, the gate potential V2 of NMOS Q2 is much higher than UGS(th). Even if the input voltage VC1 drops a little, the circuit remains closed. When the voltage continues to drop to UGS(th) again, NMOS Q2 switches to the cut-off mode, the switch circuit returns to the open state, and the storage capacitor C1 no longer supplies power to the subsequent circuit. The cut-off voltage VL of the circuit is

VL = UGS(th) (R2 + R3) / R3.  (2)

3 Experimentation

The primary vibration frequency of the transformer remains stable at 100 Hz [15]; therefore, to maximize output power, the PVEH resonance frequency should match it. Three PVEH prototypes were developed in this work, as shown in Fig. 3a. The resonance frequencies of the prototypes were adjusted to 100 ± 2 Hz, as shown in Fig. 3b. The maximum powers of the three prototypes are 1.2 mW, 0.9 mW, and 1.4 mW, respectively.


Fig. 3. The PVEH prototypes (a) and the output power (b): the power of the three devices as a function of frequency at an acceleration amplitude of 0.5 g. Lb and Tm are the beam length and mass thickness, respectively.

The circuit PCB is shown in Fig. 4a. The NMOS is the Infineon SN7002N with a threshold voltage UGS(th) of 0.8–1.8 V. PMOS Q1 and Q3 are the Vishay SI2329DS with a threshold voltage UGS(th) of −0.5 V and RDS(on) of 30 mΩ. R1 = 10 MΩ, R2 = R3 = 5.1 MΩ, and R4 = 20 MΩ. A signal generator is used to simulate the voltage on the capacitor, and an oscilloscope observes the input and output voltage waveforms. The measured result is shown in Fig. 4b. The trigger voltage VH is 3.27 V and VL is 1.58 V, giving a voltage hysteresis width of 1.69 V. In the open state, the measured quiescent current is 0.24 µA. The trigger switching circuit with hysteresis can thus ensure that enough energy is stored for normal operation of the wireless sensor node while itself consuming ultra-low power.
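As a quick sanity check, Eqs. (1) and (2) can be evaluated with the component values above. The exact gate threshold UGS(th) is not specified (the datasheet gives 0.8–1.8 V), so the value below is back-calculated from the measured VH under an ideal-divider assumption:

```python
# Hysteresis thresholds of the trigger circuit, from Eqs. (1) and (2).
R1, R2, R3 = 10e6, 5.1e6, 5.1e6   # divider resistors [ohm]

def v_high(u_gs_th):
    """Trigger (upper) threshold VH, Eq. (1)."""
    return u_gs_th * (R1 + R2 + R3) / R3

def v_low(u_gs_th):
    """Cut-off (lower) threshold VL, Eq. (2)."""
    return u_gs_th * (R2 + R3) / R3

# Back-calculate UGS(th) from the measured VH = 3.27 V (ideal-divider assumption).
u_gs = 3.27 * R3 / (R1 + R2 + R3)
print(round(u_gs, 3))          # 0.826 V, inside the 0.8-1.8 V datasheet range
print(round(v_low(u_gs), 2))   # 1.65 V, close to the measured VL = 1.58 V
```

The small gap between the predicted 1.65 V and the measured 1.58 V is plausible given MOSFET threshold spread and divider tolerances.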


Fig. 4. The input and output voltages of the trigger switching circuit

A wireless temperature and humidity sensor node was designed using an ultra-low-power sensor (HDC2080), a microcontroller (STM8L101), and a LoRa module (SX1268). The average current of the node is only 6 µA in sleep mode, 0.9 mA during sensor signal acquisition, and 23 mA during wireless data transmission. The PVEH was connected to the interface circuit and the wireless sensor node, and the vibration exciter produced a sinusoidal acceleration of 0.5 g at 100 Hz. In the cold-start case, the voltage VC1 reached 3.44 V after 2 min, and the trigger switch closed and started to deliver electric energy to the node. After that, VCC remained at 3.3 V for about 7 ms (Fig. 5). During this time, the sensor node performed signal acquisition and wireless transmission; the data was received 50 m away.
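The storage capacitance needed for one such active burst can be estimated from the figures above. This is only an illustrative back-of-the-envelope sketch: it assumes the whole 7 ms burst is spent at the worst-case 23 mA transmission current at 3.3 V, ignores DC-DC converter losses, and the actual value of C1 is not given in the paper:

```python
# Minimum storage capacitance for one acquisition + transmission burst.
V_CC = 3.3        # regulated node supply [V]
I_TX = 23e-3      # worst-case (transmit) current [A]
T_ON = 7e-3       # active burst duration [s]
V_ON = 3.44       # capacitor voltage when the switch closes [V]
V_OFF = 1.58      # lower hysteresis threshold [V]

E_burst = V_CC * I_TX * T_ON                 # energy drawn per burst [J]
C_min = 2 * E_burst / (V_ON**2 - V_OFF**2)   # usable energy: 0.5*C*(V_ON^2 - V_OFF^2)
print(f"{E_burst*1e3:.2f} mJ, {C_min*1e6:.0f} uF")  # 0.53 mJ, 114 uF
```

Under these assumptions, a storage capacitor of roughly 100 µF or more would sustain the observed burst within the hysteresis window.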

Fig. 5. The voltage waveforms of VC1 and VCC

4 Conclusions

A low-power self-powered WSNs node based on a PVEH is designed to monitor the temperature and humidity of power transformers. Three PVEH prototypes were fabricated, with resonance frequencies of 100 ± 2 Hz, consistent with the dominant vibration frequency of the transformer. To provide stable power to WSNs nodes, a hysteresis trigger switch circuit was designed. The experimental results show that the upper and lower


threshold voltages are 3.27 V and 1.58 V, respectively. After 2 min of cold start, the PVEH charged the capacitor to 3.44 V and triggered the switching circuit. Then the WSN node worked normally for 7 ms until the switching circuit returned to the OFF state. The trigger switching circuit has an ultra-low static power of 0.7 µW and can ensure sufficient stored energy for normal operation of the WSNs node. Vibration energy harvesting technology thus breaks through the bottleneck of battery life and supplies power to WSN nodes for transformer monitoring.

Acknowledgement. This work was supported by the R&D project of State Grid Corporation of China (Development of vibration energy harvesting device and research on self-power supply technology of sensors based on micro kinetic energy, No. 5500-202158417A-0-0-00).

References
1. Chen, X., Hu, Y., Dong, Z.: Transformer operating state monitoring system based on wireless sensor networks. IEEE Sens. J. 21, 25098–25105 (2021)
2. Garg, R.K., Bhola, J., Soni, S.K.: Healthcare monitoring of mountaineers by low power wireless sensor networks. Inform. Med. Unlocked 27, 100775 (2021)
3. Sezer, N., Koç, M.: A comprehensive review on the state-of-the-art of piezoelectric energy harvesting. Nano Energy 80, 105567 (2021)
4. Liu, H., Fu, H., Sun, L.: Hybrid energy harvesting technology: from materials, structural design, system integration to applications. Renew. Sust. Energ. Rev. 137, 110473 (2021)
5. Safaei, M., Sodano, H.A., Anton, S.R.: A review of energy harvesting using piezoelectric materials: state-of-the-art a decade later. Smart Mater. Struct. 28, 113001 (2019)
6. Abdin, Z., Alim, M.A., Saidur, R.: Solar energy harvesting with the application of nanotechnology. Renew. Sust. Energ. Rev. 26, 837–852 (2013)
7. Wei, C., Jing, X.: A comprehensive review on vibration energy harvesting: modelling and realization. Renew. Sust. Energ. Rev. 74, 1–18 (2017)
8. Ma, X., Zhou, S.: A review of flow-induced vibration energy harvesters. Energ. Convers. Manage. 254, 115223 (2022)
9. Zabek, D., Morini, F.: Solid state generators and energy harvesters for waste heat recovery and thermal energy harvesting. Therm. Sci. Eng. Prog. 9, 235–247 (2019)
10. Liang, H., Hao, G., Olszewski, O.Z.: A review on vibration-based piezoelectric energy harvesting from the aspect of compliant mechanisms. Sens. Actuator A Phys. 331, 112743 (2021)
11. Muscat, A., Bhattacharya, S., Zhu, Y.: Electromagnetic vibrational energy harvesters: a review. Sensors 22, 5555 (2022)
12. Li, M., Luo, A., Luo, W.: Recent progress on mechanical optimization of MEMS electret-based electrostatic vibration energy harvesters. J. Microelectromech. Syst. 31, 726–740 (2022)
13. Stark, B.H., Szarka, G.D., Rooke, E.D.: Start-up circuit with low minimum operating power for microwatt energy harvesters. IET Circuits Devices Syst. 5, 267–274 (2011)
14. Alghisi, D., Ferrari, V., Ferrari, M., et al.: A new nano-power trigger circuit for battery-less power management electronics in energy harvesting systems. Sens. Actuators A Phys. 263, 305–316 (2017)
15. Miao, X., Jiang, P., Pang, F.: Numerical analysis and experimental research of vibration and noise characteristics of oil-immersed power transformers. Appl. Acoust. 203, 109189 (2023)

Distributed Consensus Tracking for Underactuated Ships with Input Saturation: From Underactuated to Nonholonomic Configuration

Linran Tian1, Tao Li2(B), and Guoying Miao3

1 The Electronics and Information Engineering College, Nanjing University of Information Science and Technology (NUIST), Nanjing 210044, China
2 The Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology (NUIST), Nanjing 210044, China
[email protected]
3 The School of Automation, Nanjing University of Information Science and Technology (NUIST), Nanjing 210044, China

Abstract. This work deals with the distributed consensus tracking control of a multiple underactuated surface vessels (MUSVs) system with input saturation. First, under the leader-follower framework, the original vessel model is transformed into a minimum-phase nonlinear cascaded system and the tracking error system is obtained. Then, by approximating the saturation function with the arctan function and applying the mean value theorem, a distributed consensus tracking control method is proposed using state scaling, backstepping, and a relay-switching control technique. Stability analysis shows that under the proposed tracking control strategy, the state error between the leader and followers is exponentially stable while the desired input saturation limits are respected.

Keywords: Underactuated system · Graph theory · Saturated function · Backstepping method

1

Introduction

Underactuated surface vessels (USVs) have long played an important role in maritime transportation, patrol, and underwater exploration [1]. Due to the increasing complexity and diversity of navigation tasks and maritime operations in recent years, there is an urgent need for MUSVs to work together to complete complex tasks that cannot be accomplished by a single vessel [2]. The coordinated control of multiple underactuated surface vessels has therefore been a hot topic in the field of control. One difficulty in the coordinated control of MUSVs is the lack of an actuator in the sway direction [3]. Another difficulty is the cooperative control of a group of USVs [4].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 263–272, 2023. https://doi.org/10.1007/978-981-99-6187-0_26

Generally, the formation control


problem is solved for a MUSVs system within the leader-follower framework [5], while there is less research on the consensus problem. In previous research, although USVs have nonholonomic constraints, their models cannot be transformed into a strictly upper-triangular form. Therefore, traditional nonlinear control methods cannot be used directly for control design, making the design process cumbersome and difficult. To address this issue, a series of state and input transformations were used to transform the underactuated surface vessel from an underactuated to a nonholonomic configuration for control design [6,7]. On the other hand, since the nonholonomic configuration is easy to transform into a chained system, and there are various control strategies for high-order chained multi-agent systems [8], this transformation method can simplify control design and solve distributed control problems for multiple-surface-vessel systems. This paper studies the distributed consensus tracking problem of a class of MUSVs systems with input saturation. Compared with existing work, based on [7], which only achieved input-saturated controller design for the rudder, this paper borrows the treatment of the saturation function in [9] to complete the input saturation control design for all controllers and extends the results to the consensus tracking problem of multiple underactuated surface vessel systems. As a result, the state tracking error between the leader and followers is exponentially stable.

2

Graph Theory

To model the interaction between ships in the MUSVs system, an undirected graph G = {V, S} is considered, where V = {1, 2, . . . , n} is the set of nodes and S ⊆ V × V = {(j, i) : j, i ∈ V} is the set of edges, in which (j, i) implies that ship i has access to information from ship j. The non-negative real number aij = aji is the communication weight between ship i and ship j. When ship i can receive information from the leader, Ei is set to a positive constant, and to zero otherwise. A diagonal matrix D = diag{d1, . . . , dn} with di = Σ_{j=1}^{n} aij is used to represent the degree of each ship. The Laplacian matrix of G is defined as L = D − A, in which A = [aij] ∈ Rn×n. The communication topology can then be represented by the matrix H = L + E, in which E = diag{E1, . . . , En}.
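A minimal sketch of these definitions in pure Python, using a hypothetical 3-ship topology (not the one in the paper) in which only ship 1 hears the leader:

```python
# Build A, D, L and H = L + E for an undirected 3-ship graph.
n = 3
# A[i][j]: communication weight between ships i+1 and j+1 (symmetric, zero diagonal).
A = [[0, 1, 1],
     [1, 0, 0],
     [1, 0, 0]]
E = [1, 0, 0]  # only ship 1 receives the leader's information

D = [[sum(A[i]) if i == j else 0 for j in range(n)] for i in range(n)]
L = [[D[i][j] - A[i][j] for j in range(n)] for i in range(n)]
H = [[L[i][j] + (E[i] if i == j else 0) for j in range(n)] for i in range(n)]

# Basic Laplacian properties: symmetry and zero row sums.
assert all(L[i][j] == L[j][i] for i in range(n) for j in range(n))
assert all(sum(L[i]) == 0 for i in range(n))
print(H)  # [[3, -1, -1], [-1, 1, 0], [-1, 0, 1]]
```

For a connected follower graph with at least one leader link (Assumption 2 below), H built this way is positive definite, which is what the later Lyapunov arguments rely on.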

3 System Description

3.1 Multiple Surface Vessel Systems

In this work, the mathematical models of the leader and follower USVs moving on a horizontal plane are given as follows:

η̇ = J(ψ)v, M v̇ = τ − C(v)v − Dv,  (1)

η̇i = J(ψi)vi, Mi v̇i = τi* − C(vi)vi − Di vi.  (2)


Table 1. Symbol definitions in systems (1) and (2).

η, ηi: (x0, y0, ψ0)T, (xi, yi, ψi)T
v, vi: (u0, v0, r0)T, (ui, vi, ri)T
τ, τi*: (τ0,u, 0, τ0,r)T, (Sat(τi,u), 0, Sat(τi,r))T
J(ψ), J(ψi): [cos ψ  −sin ψ  0; sin ψ  cos ψ  0; 0  0  1], [cos ψi  −sin ψi  0; sin ψi  cos ψi  0; 0  0  1]
C(v), C(vi): [0  0  −m0,22 v0; 0  0  m0,11 u0; m0,22 v0  −m0,11 u0  0], [0  0  −mi,22 vi; 0  0  mi,11 ui; mi,22 vi  −mi,11 ui  0]
M, Mi: diag{m0,11, m0,22, m0,33}, diag{mi,11, mi,22, mi,33}
D, Di: diag{d0,11, d0,22, d0,33}, diag{di,11, di,22, di,33}

All symbols in the systems are defined in Table 1; Sat(τi,u) and Sat(τi,r) represent the follower control inputs with saturation limits τi,u max and τi,r max. Please refer to [3] for more details about the ship model. The following state and input transformations are introduced to facilitate the upcoming control designs [7]. For clarity, the transformations for the leader system are given first:

x0,1 = x0 cos ψ0 + y0 sin ψ0, x0,2 = −x0 sin ψ0 + y0 cos ψ0 + (m0,22/d0,22) v0,
x0,3 = ψ0, x0,4 = −(m0,11/d0,22) u0 − x0,1, x0,5 = v0, x0,6 = r0,  (3)

u0,1 = (d0,11/d0,22 − 1) u0 − x0,2 x0,6 − τ0,u/d0,22 = Δ0,1 − τ0,u/d0,22,
u0,2 = ((m0,11 − m0,22)/m0,33) u0 v0 − (d0,33/m0,33) r0 + τ0,r/m0,33 = Δ0,2 + τ0,r/m0,33.  (4)

As a result, the following cascade form of the ship model (1) can be obtained:

ẋ0,1 = x0,6 (x0,2 − (m0,22/d0,22) x0,5) − (d0,22/m0,11) x0,1 − (d0,22/m0,11) x0,4,
ẋ0,5 = −(d0,22/m0,22) x0,5 + (d0,22/m0,22) x0,6 (x0,1 + x0,4),  (5)

ẋ0,2 = x0,4 x0,6, ẋ0,3 = x0,6, ẋ0,4 = u0,1, ẋ0,6 = u0,2.  (6)

Next, the control-design relationship between an underactuated ship and a nonholonomic system will be established. To this end, state and input transformations are constructed as

z0,1 = x0,4, z0,2 = x0,2 − x0,3 x0,4, z0,3 = −x0,3, z0,4 = −x0,6, z0,5 = x0,1, z0,6 = x0,5,
w0,1 = u0,1, w0,2 = −u0,2,  (7)


under which subsystems (5) and (6) can be rewritten as the following special nonholonomic system:

ż0,1 = w0,1, ż0,2 = z0,3 w0,1, ż0,3 = z0,4, ż0,4 = w0,2,
ż0,5 = (z0,2 − z0,3 z0,1 + (m0,22/d0,22) z0,6) z0,4 − (d0,22/m0,11)(z0,5 + z0,1),
ż0,6 = −(d0,22/m0,22)(z0,6 + z0,4 (z0,1 + z0,5)).  (8)

Meanwhile, after the same transformation process as described above, the ship system of follower i is rewritten as

żi,1 = wi,1, żi,2 = zi,3 wi,1, żi,3 = zi,4, żi,4 = wi,2,
żi,5 = (zi,2 − zi,3 zi,1 + (mi,22/di,22) zi,6) zi,4 − (di,22/mi,11)(zi,5 + zi,1),
żi,6 = −(di,22/mi,22)(zi,6 + zi,4 (zi,1 + zi,5)).  (9)

3.2 Input Saturation

The designed control input Sat(τi,∗) is formulated as

Sat(τi,∗) = τi,∗ max sign(τi,∗) if |τi,∗| ≥ τi,∗ max;  Sat(τi,∗) = τi,∗ if |τi,∗| < τi,∗ max,  (10)

where ∗ denotes u or r, and τi,u max and τi,r max are the respective saturation limits. To efficiently tackle the saturation problem, the actuator saturation can be approximated by Sat(τi,∗) = gi,∗(τi,∗) + Δτi,∗, where gi,∗(τi,∗) = (2τi,∗ max/π) arctan(πτi,∗/(2τi,∗ max)) and Δτi,∗, i = 1, 2, . . . , n, is the bounded approximation error [9]; then

|Δτi,∗| ≤ τi,∗ max − (2τi,∗ max/π) arctan(π/2) = Δi,∗ ≤ Δ̄i,∗.  (11)

According to the mean value theorem, gi,∗(τi,∗) can be reformulated as gi,∗(τi,∗) = τi,∗ − Λi,∗ τi,∗, where Λi,∗ = (πεi,∗τi,∗/(2τi,∗ max))² / (1 + (πεi,∗τi,∗/(2τi,∗ max))²) with 0 < εi,∗ < 1.
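A numerical sketch of this arctan approximation and the bound (11); the grid and the value of τmax are arbitrary illustrative choices:

```python
import math

def sat(tau, tau_max):
    """Hard saturation, Eq. (10)."""
    return math.copysign(tau_max, tau) if abs(tau) >= tau_max else tau

def g(tau, tau_max):
    """Smooth arctan approximation of the saturation."""
    return (2 * tau_max / math.pi) * math.atan(math.pi * tau / (2 * tau_max))

tau_max = 10.0
delta_bar = tau_max - (2 * tau_max / math.pi) * math.atan(math.pi / 2)  # Eq. (11)

# The approximation error |Sat(tau) - g(tau)| never exceeds delta_bar,
# and the bound is attained exactly at tau = +/- tau_max.
errs = [abs(sat(t / 100, tau_max) - g(t / 100, tau_max)) for t in range(-3000, 3001)]
assert max(errs) <= delta_bar + 1e-12
assert abs(abs(sat(tau_max, tau_max) - g(tau_max, tau_max)) - delta_bar) < 1e-12
```

The payoff of the smooth surrogate is that g is differentiable everywhere, which is what allows the mean-value-theorem rewrite used in the control design.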

3.3 Preliminaries

In order to proceed with the control design, the following lemmas and assumptions are required.

Lemma 1 [9]: Assume there exists a Lyapunov function V(x) such that V̇(x) < −ιV^a(x) − κV^b(x), where ι > 0, κ > 0, a > 1 and 0 < b < 1. Then the equilibrium point of the system is fixed-time stable, and the settling time satisfies T ≤ Tmax = 1/(ι(a − 1)) + 1/(κ(1 − b)).
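The settling-time bound in Lemma 1 can be checked numerically. The sketch below integrates V̇ = −ιV^a − κV^b by forward Euler with arbitrary illustrative constants (ι = κ = 1, a = 2, b = 1/2); the point of fixed-time stability is that the bound Tmax does not depend on the initial value:

```python
# Fixed-time stability (Lemma 1): V settles before Tmax for any V(0).
iota, kappa, a, b = 1.0, 1.0, 2.0, 0.5    # illustrative constants, not from the paper
T_max = 1 / (iota * (a - 1)) + 1 / (kappa * (1 - b))  # = 1 + 2 = 3

V, dt, t = 100.0, 1e-4, 0.0               # try any large initial V
while t < T_max:
    V = max(V - dt * (iota * V**a + kappa * V**b), 0.0)  # Euler step, clamped at 0
    t += dt
assert V < 1e-6                           # settled within the Lemma 1 bound
```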


Lemma 2 [9]: For any l > 0 and any ς ∈ R, 0 ≤ |ς| − |ς| tanh(|ς|/l) ≤ 0.2785l.

Lemma 3 [10]: Let 0 ≤ p ≤ 1 and x1, . . . , xn ≥ 0; then (Σ_{i=1}^{n} xi)^p ≤ Σ_{i=1}^{n} xi^p ≤ n^{1−p}(Σ_{i=1}^{n} xi)^p.

Lemma 4 [10]: If xi ≥ 0, then (1/n)Σ_{i=1}^{n} xi ≤ ((1/n)Σ_{i=1}^{n} xi²)^{1/2}.

Lemma 5 [10]: λmax(M) and λmin(M) represent the maximum and minimum eigenvalues of matrix M, respectively. For any x ∈ Rn, with ‖·‖ the Euclidean norm, λmin(M)‖x‖² ≤ xT M x ≤ λmax(M)‖x‖².

Assumption 1: The inputs τ0,u and τ0,r of the leader ship system are selected so that w0,1 > 0.

Assumption 2: The undirected graph G consisting of the follower nodes is connected, and at least one follower is connected to the leader.

Assumption 3: The communication ranges between the underactuated ships are unlimited.

4

Main Results

By defining the tracking errors (ei,1, ei,2, ei,3, ei,4, ei,5, ei,6)T = (zi,1 − z0,1, zi,2 − z0,2, zi,3 − z0,3, zi,4 − z0,4, zi,5 − z0,5, zi,6 − z0,6)T, one obtains the following error dynamics:

ėi,5 = (zi,2 − zi,3 zi,1 + (mi,22/di,22) zi,6) zi,4 − (di,22/mi,11)(zi,5 + zi,1) − ((z0,2 − z0,3 z0,1 + (m0,22/d0,22) z0,6) z0,4 − (d0,22/m0,11)(z0,5 + z0,1)),
ėi,6 = −(di,22/mi,22)(zi,4 (zi,1 + zi,5) + zi,6) + (d0,22/m0,22)(z0,4 (z0,1 + z0,5) + z0,6),  (12)

ėi,1 = Δi,1 − Sat(τi,u)/di,22 − w0,1, ėi,2 = ei,3 wi,1 + z0,3 (wi,1 − w0,1),
ėi,3 = ei,4, ėi,4 = −Sat(τi,r)/mi,33 − Δi,2 − w0,2.  (13)

The control goal for trajectory tracking can now be described as: find feedback control laws Sat(τi,u) and Sat(τi,r) with saturations such that the origin of the closed-loop error system (12)–(13) is globally exponentially stable. The first step in the upcoming control development is to present a cascade lemma.

Lemma 6: The zero solution of system (12)–(13) is globally exponentially stable if the zero solution of the subsystem (13) is globally exponentially stable.

Proof: Select the candidate Lyapunov function W = (1/2)β̄² e²i,5 + (1/2)e²i,6. Differentiating it along system (12) yields Ẇ ≤ −ρ1 W + ρ2(t)√W, where α0 = m0,11/m0,22, αi = mi,11/mi,22, β0 = d0,22/m0,22, βi = di,22/mi,22, ζ0 = β0/α0, ζi = βi/αi, ζ̄ = min{ζ0, ζi}, β̄ = min{β0, βi}, ρ1 = 2 min{ζ̄β̄², β̄}/max{β̄, 1/2}, and ρ2 = 2 max{β̄², 2} max{ν1(t), ν2(t)} with ν1(t) = β̄²|ei,1|(ζ̄ + |z0,4| + |z0,3 z0,4| + β̄|zi,1| + β̄|zi,4|) + β̄|z0,4 zi,1| + |zi,6| and ν2(t) = β̄|ei,1||z0,4| + β̄²|ei,3||zi,1 zi,4| + β̄|ei,4|(β̄|zi,1| + β̄|zi,5|).

Define W̄ = √W; it then follows that the derivative of W̄ satisfies W̄' ≤ −(1/2)ρ1 W̄ + (1/2)ρ2(t). Thus, the closed-loop system is globally exponentially stable. □

The tracking control design for the ei,1-subsystem will now be considered. Based on the ei,1-subsystem structure, τi,u is designed as

τi,u,a = Ki,1 di,22 |ℓi,1|/ℓi,1 + Ki,2 di,22 ℓ³i,1 + di,22 Δi,1 − di,22 w0,1 + (√n ℓi,1 / Σ_{j=1}^{n} ℓ²j,1)(|Δ̄i,u| tanh(|Δ̄i,u|/li,u) + 0.2785 li,u),
τi,u,b = τi,u,a/(1 − Λi,u max), τi,u = τi,u,a + τi,u,b,  (14)

where ℓi,k = Σ_{j=1}^{n} aij(zi,k − zj,k) + ai0 ei,k, ℓk = [ℓ1,k, . . . , ℓn,k]T, ek = [e1,k, . . . , en,k]T, and Ki,1 > 0, Ki,2 > 0 are control parameters. According to the definition of H, ℓk = H ek [10].

Proposition 1: If (14) is applied to the ei,1-subsystem, then the subsystem is fixed-time stable.

Proof: The Lyapunov function is chosen as V1 = (1/2)eT1 H e1. According to Assumption 2, H is positive definite, which means there exists another positive definite matrix Q satisfying H = QT Q; then

Σ_{i=1}^{n} ℓ²i,1 = eT1 QT H Q e1 ≤ λmax eT1 QT Q e1 = 2λmax V1,  (15)

where λmax is the maximum eigenvalue of matrix H. Taking the time derivative of V1,

V̇1 = eT1 H ė1 = Σ_{i=1}^{n} ℓi,1(−Sat(τi,u)/di,22 + Δi,1 − w0,1)
   = Σ_{i=1}^{n} ℓi,1(−Ki,1 |ℓi,1|/ℓi,1 − Ki,2 ℓ³i,1 − τi,u,b/di,22 + Λi,u τi,u/di,22 − Δτi,u/di,22 − (1/di,22)(√n ℓi,1 / Σ_{j=1}^{n} ℓ²j,1)(|Δ̄i,u| tanh(|Δ̄i,u|/li,u) + 0.2785 li,u)).  (16)

It can be deduced from Lemma 2 and Lemma 4 that

Σ_{i=1}^{n} ℓi,1 Λi,u τi,u − Σ_{i=1}^{n} ℓi,1 τi,u,b = Σ_{i=1}^{n} (ℓi,1 Λi,u τi,u,a − (1 − Λi,u) ℓi,1 τi,u,a/(1 − Λi,u max)) ≤ Σ_{i=1}^{n} |τi,u,a||ℓi,1|(Λi,u max − 1) ≤ 0

and

−Σ_{i=1}^{n} ℓi,1 Δτi,u/di,22 − Σ_{i=1}^{n} (1/di,22)(√n ℓ²i,1 / Σ_{j=1}^{n} ℓ²j,1)(|Δ̄i,u| tanh(|Δ̄i,u|/li,u) + 0.2785 li,u) ≤ 0.

With Σ_{i=1}^{n} ℓ²i,1 ≥ 2λmin V1 and Lemma 3, where λmin is the minimum eigenvalue of matrix H, it follows that V̇1 ≤ −ρ1 V1^{1/2} − ρ2 V1², where K1 = min{K1,1, . . . , Kn,1}, K2 = min{K1,2, . . . , Kn,2}, ρ1 = K1(2λmin)^{1/2} and ρ2 = n^{−1} K2(2λmin)². Thus, the result presented in Proposition 1 has been proved.

Since wi,1 − w0,1 ≡ 0 for t ≥ Ti, system (13) transforms into

ėi,2 = ei,3 w0,1, ėi,3 = ei,4, ėi,4 = −Sat(τi,r)/mi,33 − Δi,2 − w0,2.  (17)

At this point, the controller τi,r can be designed for the third-order system (17). Let αei,2 and αei,3 be the virtual controllers in (17). In the sequel, the standard backstepping design is used to develop the controller. The virtual control designs and the intermediate controller are summarized in Table 2, where ci,k > 0, i = 1, . . . , n, k = 1, 2, 3 and li,r > 0 are design parameters.

Proposition 2: If τi,r in Table 2 is applied to the (ei,2, ei,3, ei,4)-subsystem for t > T, where T = max{Ti}, then the subsystem is asymptotically stable.

Proof: Consider the following Lyapunov function:

V = (1/2)eT2 H e2 + (1/2)ẽT3 H ẽ3 + (1/2)ẽT4 H ẽ4,  (18)

where e2 = [e1,2, . . . , en,2]T, ẽ3 = [ẽ1,3, . . . , ẽn,3]T, ℓ̃3 = [ℓ̃1,3, . . . , ℓ̃n,3]T, ẽ4 = [ẽ1,4, . . . , ẽn,4]T, ℓ̃4 = [ℓ̃1,4, . . . , ℓ̃n,4]T. With the virtual controllers and intermediate controllers in Table 2, it follows that V̇ ≤ −cV, where c = min{c1,l, . . . , cn,l}, l = 1, 2, 3. □

Table 2. Virtual control errors, virtual controllers, and intermediate controller.

Virtual control errors: ẽi,p = ei,p − αei,p−1, ℓ̃p = H ẽp, p = 3, 4.
Virtual and intermediate controllers:
αei,2 = −ci,1 ℓi,2/w0,1,
αei,3 = −ci,2 ℓ̃i,3 − ℓi,2 w0,1 + α̇ei,2,
τi,r,a = (1/Mi*)(ci,3 ℓ̃i,4 + ℓ̃i,3 + α̇ei,3 − Δi,2 − w0,2) + (|Δ̄i,r| tanh(|Δ̄i,r|/li,r) + 0.2785 li,r),
τi,r,b = τi,r,a/(1 − Λi,r max), τi,r = τi,r,a + τi,r,b.


5


Simulation

The simulated ships are the CyberShip II in [7]. One difference is that the considered ship is symmetric, so the non-diagonal terms are zero. The considered leader-follower MUSVs consist of one leader and three followers, and the communication topology is set up as in Fig. 1. For the leader (i = 0), τ0,u = −0.05 and τ0,r = −0.001; the control objective is that Sat(τN,u) and Sat(τN,r) remain within the saturations τN,u max = 10 and τN,r max = 60, N = 1, 2, 3. To comply with the control goals, the control parameters are chosen as KN,1 = 10−6, KN,2 = 8000, lN,u = lN,r = 1, cN,1 = cN,2 = 60, cN,3 = 10. In the simulation, the initial conditions are configured as x0(0) = y0(0) = 1, x1(0) = y1(0) = 1.1, x2(0) = y2(0) = 1.2, x3(0) = y3(0) = 1.3, and all other initial system states are zero. Meanwhile, the system errors are defined as exi = xi − x0, eyi = yi − y0, eψi = ψi − ψ0, eui = ui − u0, evi = vi − v0, eri = ri − r0. As shown in Fig. 2, all system state errors converge to zero after t > T, and all inputs satisfy the saturations.

Fig. 1. The communication topology of leader-follower MUSVs


Fig. 2. The responses of the system states and inputs.

6

Conclusion

This letter has investigated the problem of distributed consensus tracking for MUSVs. Based on the backstepping method, and by using state and input transformations, the arctan function, and a relay-switching control technique, a first-order fixed-time saturated controller and a third-order asymptotically stable saturated controller are designed. Finally, a numerical example shows the effectiveness of the proposed control method.

Acknowledgments. This work was supported by the National Natural Science Foundation of China (61973168) and the China University Industry-University-Research Innovation Fund Project (2022BL066).

References
1. Zhang, Z.C., Wu, Y.Q.: Further results on fixed-time stabilization and tracking control of a marine surface ship subjected to output constraints. IEEE Trans. Syst. Man Cybern. Syst. 51(9), 5300–5310 (2021)
2. Gu, N., Wang, D., Peng, Z.H., Liu, L.: Observer-based finite-time control for distributed path maneuvering of underactuated unmanned surface vehicles with collision avoidance and connectivity preservation. IEEE Trans. Syst. Man Cybern. Syst. 51(8), 5105–5115 (2021)
3. Huang, J.S., Wen, C.Y., Wang, W., Song, Y.D.: Global stable tracking control of underactuated ships with input saturation. Syst. Control Lett. 85, 1–7 (2015)
4. Do, K.D.: Synchronization motion tracking control of multiple underactuated ships with collision avoidance. IEEE Trans. Ind. Electron. 63(5), 2976–2989 (2016)
5. Dai, S.L., He, S.D., Cai, H., Yang, C.G.: Adaptive leader-follower formation control of underactuated surface vehicles with guaranteed performance. IEEE Trans. Syst. Man Cybern. Syst. 52(8), 1997–2008 (2022)
6. Zhang, Z.C., Tian, L.R., Su, H., Wu, Y.Q.: Stabilization of asymmetric underactuated ships with full-state constraints: from underactuated to nonholonomic configuration. IEEE/CAA J. Autom. Sinica 9(12), 2197–2199 (2022)
7. Zhang, Z.C., Tian, L.R., Wu, Y.Q.: Stabilization of asymmetric underactuated ships with input saturation: from underactuated to nonholonomic configuration. Ocean Eng. (2022). https://doi.org/10.1016/j.oceaneng.2022.112177
8. Sarrafan, N., Zarei, J., Razavi-Far, R., Saif, M.: Resilient finite-time consensus tracking for nonholonomic high-order chained-form systems against DoS attacks. IEEE Trans. Cybern. (2022). https://doi.org/10.1109/TCYB.2022.3186207
9. Hu, Y.S., Yan, H.C., Zhang, H., Wang, M., Zeng, L.: Robust adaptive fixed-time sliding-mode control for uncertain robotic systems with input saturation. IEEE Trans. Cybern. (2022)
10. Zheng, K.H., Fan, H.J., Liu, L., Cheng, Z.T.: Triggered finite-time consensus of first-order multi-agent systems with input saturation. IET Control Theory Appl. 16, 464–474 (2022)

Trajectory Planning of Launch Vehicle Orbital Injection Segment Under Engine Failure Based on DDPG Algorithm

Zhuo Xiang, Bo Wang(B), Lei Liu, and Huijin Fan

National Key Laboratory of Science and Technology on Multispectral Information Processing, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
[email protected]

Abstract. Traditional methods for rocket orbital injection have serious deficiencies in orbital accuracy and computational efficiency, and cannot handle engine failures that lead to thrust drop. Accordingly, this manuscript designs a DDPG-based trajectory planning algorithm for launch vehicle orbital injection, transforms rocket orbital injection into an MDP problem, and constructs reward functions that satisfy the flight-process and terminal constraints. The simulation results show that this method significantly improves orbital injection accuracy and calculation efficiency, and can adaptively adjust to a suitable orbit under failure.

Keywords: rocket orbiting · engine failure · thrust drop · DDPG

1

Introduction

The focus of this manuscript is the trajectory planning technology of the launch vehicle in the orbital injection segment; its purpose is to complete the original orbital insertion task as far as possible, or to avoid a complete loss, in the event of an engine failure. Trajectory planning methods are mainly divided into indirect methods based on optimal control and direct methods based on numerical optimization. Early indirect methods are simple in form, but their convergence is slow and their practicability is poor. Direct methods, such as the pseudospectral method [1], offer fast calculation speed and high precision, but they are very sensitive to the initial value and have difficulty dealing with failures. Intelligent algorithms that simulate physical or biological laws, such as the genetic algorithm [2], particle swarm optimization [3], and ant colony optimization [4], are important and effective tools in aircraft trajectory optimization for overcoming initial-value sensitivity and local convergence; however, their long calculation time and limited precision make them difficult to apply online. In recent years, machine learning methods, especially deep learning and reinforcement learning, have been widely used in the aerospace field. The use of artificial intelligence technology can solve some problems that are difficult to

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 273–280, 2023. https://doi.org/10.1007/978-981-99-6187-0_27


overcome with traditional methods. The Beijing Aerospace Automatic Control Institute proposed the concept of "learning" launch vehicles [5]; Liu and He [6] designed a terminal guidance law for intercepting missiles based on the DDPG algorithm; Guo and Huang [7] realized intelligent guidance of a reentry vehicle with the DDPG algorithm; and He [8] realized landing guidance of a recoverable launch vehicle based on the PPO algorithm. At present, there is relatively little research on deep reinforcement learning methods for rocket orbital injection. Traditional methods such as the pseudospectral method, iterative guidance, and convex optimization face problems of low accuracy, poor real-time performance, and a limited ability to deal with failures. Therefore, this manuscript proposes a DDPG-based trajectory planning algorithm for the launch vehicle under power failure, transforms the orbital injection problem into an MDP problem, and defines the corresponding states, actions, and state transition probabilities. In the reward setting, an intermediate reward is added to speed up training. The algorithm can re-plan a new orbital trajectory when the thrust loss is small, and gives priority to raising the orbital altitude to avoid a crash when the thrust loss is large.

2 Model

2.1 Rocket Dynamics Model

The orbital injection segment of the launch vehicle refers to the flight section after leaving the atmosphere and before entering the target orbit; the influence of aerodynamic forces therefore need not be considered. The three-dimensional dynamic equations of the orbital injection segment are established in launch inertial coordinates [9]:

\dot{x} = V_x, \quad \dot{y} = V_y, \quad \dot{z} = V_z
\dot{V}_x = \frac{(1-\eta)T \cos\varphi \cos\psi}{m} - \frac{\mu x}{r^3}
\dot{V}_y = \frac{(1-\eta)T \sin\varphi \cos\psi}{m} - \frac{\mu (y + R_0)}{r^3}
\dot{V}_z = -\frac{(1-\eta)T \sin\psi}{m} - \frac{\mu z}{r^3}
\dot{m} = -\frac{T}{I_{sp} g_0}

(1)

In the formula, the state quantities x, y, z, V_x, V_y, V_z represent the position and velocity of the rocket, m is the mass, and the control quantity \varphi represents the pitch angle of the rocket; \mu, R_0, I_{sp}, g_0 denote the Earth's gravitational constant, the mean radius of the Earth, the specific impulse of the rocket engine, and the gravitational acceleration, respectively; \eta is the failure factor, which represents the percentage of thrust drop.

2.2 Orbit Model

Define the semi-major axis of the target orbit as a_f, the eccentricity as e_f, the orbit inclination as i_f, the right ascension of the ascending node as \Omega_f, and the argument of perigee as \omega_f. The orbit injection point is not constrained, so no constraint is placed on the true anomaly. The nonlinear relationship between the five orbital elements and the terminal position and velocity can then be expressed by the function Fun_{orb} [10]:

\left[ a_f \;\; e_f \;\; i_f \;\; \Omega_f \;\; \omega_f \right]^{T} = Fun_{orb}\left( \left[ r_f^{T} \;\; V_f^{T} \right]^{T} \right)

(2)
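The mapping Fun_orb is built from standard two-body relations. A minimal NumPy sketch, recovering only the semi-major axis, eccentricity, and inclination (MU is an assumed value of Earth's gravitational parameter; the full five-element version also needs the node and perigee directions), might look like:

```python
import numpy as np

MU = 3.986004418e14  # Earth's gravitational parameter, m^3/s^2 (assumed constant)

def orbital_elements(r_vec, v_vec, mu=MU):
    """Semi-major axis, eccentricity, inclination from terminal r, V (two-body)."""
    r = np.linalg.norm(r_vec)
    v = np.linalg.norm(v_vec)
    a = 1.0 / (2.0 / r - v ** 2 / mu)          # vis-viva equation
    h = np.cross(r_vec, v_vec)                 # specific angular momentum
    e_vec = np.cross(v_vec, h) / mu - r_vec / r
    e = np.linalg.norm(e_vec)
    i = np.degrees(np.arccos(h[2] / np.linalg.norm(h)))
    return a, e, i
```

For a circular equatorial orbit of radius r, this returns a ≈ r, e ≈ 0, and i = 0.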

3 Algorithm

3.1 Reinforcement Learning

The basic framework of a reinforcement learning system consists of two parts: the environment and the agent. In reinforcement learning scenarios, the mathematical framework that defines the problem is called a Markov decision process (MDP). A complete MDP is usually described by the tuple (S, A, P, R, \gamma), where S is the state set, A the action set, P the transition probability, R the reward, and \gamma the discount factor [11]. The policy is a mapping from states to actions, denoted \mu(a \mid s), which represents the probability distribution over actions a taken in state s:

\mu(a \mid s) = P\left[ a_t = a \mid s_t = s \right]

(3)

Under the current action a_t, the environment moves to the next state s_{t+1} according to the state transition probability P(s_{t+1} \mid s_t, a_t) and gives the agent a reward R_{t+1}. This interaction repeats until the end of the episode, and the discounted sum of the rewards of each step is the cumulative return G_t:

G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \cdots = \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1}

(4)
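As a concrete (illustrative, not from the paper) reading of Eq. (4), the return of a finite episode can be accumulated backward:

```python
def discounted_return(rewards, gamma):
    """G_t = sum_k gamma**k * R_{t+k+1} for a finite reward sequence (Eq. 4)."""
    g = 0.0
    for r in reversed(rewards):   # backward pass: G = R + gamma * G'
        g = r + gamma * g
    return g
```

For example, `discounted_return([1, 1, 1], 0.5)` evaluates to 1 + 0.5 · (1 + 0.5 · 1) = 1.75.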

The purpose of reinforcement learning is to train an optimal policy that maximizes the expected cumulative return:

\mu^{*} = \arg\max_{\mu} \; \mathbb{E}_{s_{t+1} \sim p(s_{t+1} \mid s_t, a_t), \; a_t \sim \mu(a_t \mid s_t)} \left[ \sum_{t=t_0}^{t_f} R\left(s_t, a_t\right) \right]

(5)

3.2 DDPG

Deep Deterministic Policy Gradient (DDPG) is a deep reinforcement learning algorithm built on the Actor-Critic framework [12]. The Actor network learns the policy function \mu(s \mid \theta^{\mu}) for action selection, and the Critic network learns the action-value function Q(s, a \mid \theta^{Q}), which judges how good the policy is in the current state. The action-value function is

Q^{\mu}(s, a) = \mathbb{E}_{\mu}\left[ \sum_{k=0}^{\infty} \gamma^{k} R_{t+k+1} \,\Big|\, s_t = s, a_t = a \right]

(6)

First, a large number of transition tuples (s_t, a_t, r_{t+1}, s_{t+1}) are generated by interacting with the environment and stored in the experience pool. During training, experiences are sampled at random and the network parameters are updated by gradient descent. The Critic network is updated first by minimizing the loss function L(\theta^{Q}), so that the Q value output by the Critic network approaches the sampled target value:

L\left(\theta^{Q}\right) = \frac{1}{2}\left( y_t - Q\left(s_t, a_t \mid \theta^{Q}\right) \right)^{2}

(7)

According to the Bellman equation, the target value y_t is

y_t = r\left(s_t, a_t\right) + \gamma Q\left(s_{t+1}, a_{t+1} \mid \theta^{Q}\right)

(8)

The Actor network updates \theta^{\mu} by maximizing the cumulative return, i.e., by seeking a larger Q value. This requires the gradient of the Q function with respect to the actor parameters, obtained by the chain rule:

\Delta\theta^{\mu} = \frac{\partial Q\left(s, \mu(s \mid \theta^{\mu}) \mid \theta^{Q}\right)}{\partial \theta^{\mu}} = \frac{\partial Q\left(s, \mu(s \mid \theta^{\mu}) \mid \theta^{Q}\right)}{\partial \mu} \cdot \frac{\partial \mu\left(s \mid \theta^{\mu}\right)}{\partial \theta^{\mu}}

(9)
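The update rules of Eqs. (7)-(9), together with the soft target update of Eq. (10) below, can be illustrated with a deliberately tiny NumPy sketch. This is not the paper's implementation: the critic here is a hypothetical linear function of [s, a, 1], the transition is hand-made, and only the critic step and the soft update are shown.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma, tau, lr = 0.99, 0.01, 1e-2

w = rng.normal(size=3)    # critic weights: Q(s, a) = w . [s, a, 1]
wt = w.copy()             # target critic, same initial value (cf. Eq. 10)

def q(weights, s, a):
    return weights @ np.array([s, a, 1.0])

def critic_step(s, a, r, s2, a2):
    """One gradient step on the loss of Eq. (7), using the TD target of Eq. (8)."""
    global w
    y = r + gamma * q(wt, s2, a2)              # target value y_t (Eq. 8)
    err = q(w, s, a) - y
    w = w - lr * err * np.array([s, a, 1.0])   # gradient of 0.5 * err**2
    return 0.5 * err ** 2

def soft_update():
    """Soft target update of Eq. (10): theta' <- tau*theta + (1-tau)*theta'."""
    global wt
    wt = tau * w + (1.0 - tau) * wt
```

Repeating `critic_step` on the same transition drives the squared loss down, which is the behavior Eq. (7) asks for.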

To allow the Actor and Critic networks to update their weights smoothly, additional target networks \mu'(s \mid \theta^{\mu'}) and Q'(s, a \mid \theta^{Q'}) are introduced. Each target network has the same initial value as its original network and is updated periodically in a soft manner:

\theta' \leftarrow \tau \theta + (1 - \tau)\theta', \quad \tau \ll 1

(10)

4 Rocket Orbiting MDP

4.1 State and Action

The state s is a 7-dimensional vector composed of the rocket position, velocity, and pitch angle in launch inertial coordinates, and the action is the rate of change of the pitch angle, i.e., the angular rate:

s = \left[ r^{T} \;\; V^{T} \;\; \varphi \right]^{T}, \quad a = \left[ \dot{\varphi} \right]

(11)

Taking the angular rate as the action and the attitude angle as a part of the state can effectively ensure the continuity and stability of the control quantity.

4.2 Reward

The goal of the launch vehicle in the orbital injection segment is to send the payload into the target orbit, so the closer the terminal orbit is to the target orbit, the greater the reward. Following this idea, the terminal reward function is designed as

r_T = \gamma_a \lvert \Delta a \rvert + \gamma_e \lvert \Delta e \rvert + \gamma_i \lvert \Delta i \rvert + \gamma_{\Omega} \lvert \Delta \Omega \rvert + \gamma_{\omega} \lvert \Delta \omega \rvert

(12)

where \gamma_a, \gamma_e, \gamma_i, \gamma_{\Omega}, \gamma_{\omega} are the weighting coefficients of the deviations of the five orbital elements.

Among the orbital elements, the semi-major axis a is the quantity most sensitive to the rocket's motion. This manuscript therefore adds an intermediate reward that encourages the current semi-major axis to approach that of the target orbit. On the one hand this speeds up learning; on the other hand it discourages the rocket from moving backward, reducing unnecessary exploration in a way that conforms to the physics. The intermediate reward is

r_m = c \left( \lvert a_{t-1} - a_T \rvert - \lvert a_t - a_T \rvert \right)

(13)

When a serious failure occurs and the rocket cannot enter orbit normally, the intermediate reward guides the rocket to raise its orbital altitude as much as possible to avoid crashing, and to enter the orbit with the largest currently reachable semi-major axis, providing conditions for subsequent rescue.
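Equations (12)-(13) translate directly into code. In the sketch below the weight values are illustrative placeholders (negative, so that smaller deviations yield a larger terminal reward), not the paper's tuned coefficients:

```python
def terminal_reward(d_a, d_e, d_i, d_O, d_w,
                    weights=(-1e-4, -10.0, -1.0, -1.0, -1.0)):
    """Terminal reward of Eq. (12); illustrative negative weights penalise
    the absolute deviations of the five orbital elements."""
    devs = (abs(d_a), abs(d_e), abs(d_i), abs(d_O), abs(d_w))
    return sum(g * d for g, d in zip(weights, devs))

def intermediate_reward(a_prev, a_curr, a_target, c=1.0):
    """Intermediate reward of Eq. (13): positive whenever the semi-major
    axis moves toward the target, negative when it moves away."""
    return c * (abs(a_prev - a_target) - abs(a_curr - a_target))
```

The sign of `intermediate_reward` is what discourages backward motion: approaching the target semi-major axis earns a positive reward, receding earns a negative one.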

5 Simulation

During the orbital injection segment, the booster, core-one, and core-two engines have all been jettisoned, and the rocket relies on the core-three engine to carry the payload into the target orbit. The elements of the target orbit are: a_f = 40503.14 km, e_f = 0.8364, i_f = 27.51°, Ω_f = 335.25°, ω_f = 236.59°. The parameters of the rocket and the core-three engine are listed in Table 1.

Table 1. Rocket parameters

initial mass (kg)             34740
engine thrust (kN)            208.32
specific impulse (m/s)        48
engine structure mass (kg)    4700
payload mass (kg)             13500

5.1 Comparison of Orbital Accuracy

To verify the advantages of the DDPG algorithm in orbital accuracy and solution time, the pseudospectral method (PSM) of [1] is simulated under the same conditions. The simulation platform is Windows 10 with an i7-12700 at 2.10 GHz and 16 GB of RAM. The simulation results of the two methods are compared in Fig. 1, and the orbiting accuracy is compared in Table 2:

Fig. 1. Orbit height, speed, pitch angle comparison of DDPG and PSM

Table 2. Orbiting accuracy comparison of DDPG and PSM

          DDPG        PSM
Δa (km)   0.59        115.26
Δe        0.00012     0.0057
Δi (°)    −0.00011    0.0038
ΔΩ (°)    0.0026      −0.0285
Δω (°)    0.041       −0.0977

Both algorithms obtain a reasonable orbit-injection trajectory, but the trajectory produced by the DDPG algorithm has a smoother pitch-angle profile (Fig. 1), which can greatly reduce the burden on hardware in actual applications. Table 2 shows that, compared with the traditional adaptive pseudospectral method, the DDPG algorithm has an obvious advantage in orbiting accuracy; it is also more efficient, with a shorter solution time, and is thus better suited to online application.

5.2 Generalization Ability Verification

To verify the generalization ability of the proposed algorithm, simulations are carried out with η = 0.2 and η = 0.3, where η is the failure factor of Eq. (1), representing the percentage of engine thrust drop. The algorithm plans a new orbital trajectory after a failure by reconfiguring the control angles and extending the shutdown time. With η = 0.2, the rocket enters the target orbit successfully. With η = 0.3, the thrust loss is too large to reach the target orbit, and the algorithm guides the rocket into the secondary orbit with the largest attainable semi-major axis (Table 3).

Fig. 2. Orbit height, speed, pitch angle comparison under engine failure

Table 3. Orbiting accuracy comparison under engine failure

        η = 0       η = 0.2     η = 0.3
a (km)  40502.55    40501.89    37684.21
e       0.8363      0.8363      0.7916
i (°)   27.5101     27.502      27.2704
Ω (°)   335.2474    335.2435    332.7812
ω (°)   236.549     236.538     235.343

6 Conclusion

Aiming at the rocket orbit-injection problem, with its complex terminal constraints and extremely high precision requirements, this manuscript designs a DDPG-based trajectory planning algorithm for the orbital injection segment. By establishing the MDP model of the rocket's orbital injection and designing the reward function, rapid planning of the orbital trajectory is realized, and the vehicle can adaptively enter the target orbit, or a secondary orbit, under engine failure. Simulation results show that the algorithm greatly improves calculation accuracy and solution efficiency compared with the traditional algorithm, and can handle failure cases that the traditional algorithm cannot, which gives it clear engineering application value.


References

1. Hong, B., Xin, W.: Trajectory optimization of solid launch vehicle based on HP adaptive pseudospectral method. Aerospace Control 30(04), 18–22+31 (2012). https://doi.org/10.16804/j.cnki.issn1006-3242.2012.04.004
2. Chi, Z., Yi, D.: Research on UAV route planning based on genetic algorithm. Inf. Res. 44(04), 10–16 (2018)
3. Kim, J.J., Lee, J.J.: Trajectory optimization with particle swarm optimization for manipulator motion planning. IEEE Trans. Industr. Inf. 11(3), 620–631 (2017)
4. Acciarini, G., Izzo, D., Mooij, E.: MHACO: a multi-objective hypervolume-based ant colony optimizer for space trajectory optimization. In: 2020 IEEE Congress on Evolutionary Computation (CEC), pp. 1–8. IEEE (2020)
5. Weimin, B.: Aerospace intelligent control technology enables launch vehicles to "learn". Acta Aeronautics Sinica 42(11), 8–17 (2021)
6. Liu, Y., et al.: Research on design of terminal guidance law based on DDPG algorithm. J. Comput. 44(09), 1854–1865 (2021)
7. Guo, D., et al.: Research on deterministic strategy gradient guidance method for reentry vehicle. Syst. Eng. Electr. Technol. 44(06), 1942–1949 (2022)
8. Linkun, H., Ran, Z., Qinghai, G.: Reinforcement learning-based landing guidance for retrievable launch vehicle. Aerospace Defense 4(03), 33–40 (2021)
9. Li, S., et al.: Autonomous rescue trajectory planning under launch vehicle power failure. Flight Mech. 39(02), 83–89 (2021). https://doi.org/10.13645/j.cnki.f.d.20201113.004
10. Zhengyu, S., Cong, W., Qinghai, G.: Autonomous trajectory planning for launch vehicle under thrust drop failure. Scientia Sinica Informationis 49(11), 1472–1487 (2019)
11. Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
12. Liu, J., Gao, F., Luo, X.: A review of deep reinforcement learning based on value function and policy gradient. J. Comput. 42(06), 1406–1438 (2019)

A Survey on Lightweight Technology of Underwater Robot

Bofeng Fu1 and Gang Wang2,3(B)

1 Jilin Institute of Chemical Technology, Jilin 132000, China
2 Jilin Communications Polytechnic, Changchun 130000, China
[email protected]
3 Baicheng Normal University, Baicheng 137000, China

Abstract. With humans' deepening exploration of the ocean, underwater robots have become one of the commonly used underwater exploration tools. In recent years the difficulty of underwater operations has steadily increased, so improving the performance of underwater robots has become an important research issue for completing underwater operations efficiently, and lightweight technology plays a very important role in this. Lightweight technology is generally divided into structural lightweighting and material lightweighting. This article summarizes the current research status of lightweight technology: it discusses structural lightweight optimization from three aspects (topology optimization, size optimization, and shape optimization), introduces the variable density/homogenization method, the evolutionary structural optimization method, and the level-set method among topology approaches, and analyzes the selection of lightweight materials for underwater robots. Finally, it discusses and projects the future development trends of underwater robot lightweight technology.

Keywords: Underwater Robot · Lightweight · Structural Optimization · Material Optimization

1 Introduction

With the continuous exploration of the ocean by humans, improving the performance of underwater robots has become a noteworthy issue for difficult underwater operations such as underwater archaeology, underwater rescue, and pipeline construction. Lightweighting an underwater robot not only improves its range and speed at the same power, but also reduces its motion inertia, improving its operating speed and the accuracy of its actions [1, 2]. Lightweighting is therefore an important direction for improving underwater robot performance. Lightweight technology includes two aspects, structural lightweighting and material lightweighting [3]; the specific classification is shown in Fig. 1. The following elaborates on the lightweight technology of underwater robots from these two aspects.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 281–288, 2023. https://doi.org/10.1007/978-981-99-6187-0_28


Fig. 1. Specific classification of lightweight technology

2 Structural Optimization

2.1 Introduction to Structural Lightweight Optimization

Structural lightweighting refers to adjusting and simplifying the structural design of underwater vehicles to achieve a lightweight effect. Generally speaking, it can be divided into three aspects: topology optimization, size optimization, and shape optimization.

2.2 Topology Optimization

Topology optimization [4] is an advanced structural design method commonly used in the conceptual design stage of robots. It searches the specified design space for the optimal distribution of material, subject to task requirements and practical constraints, and generates innovative lightweight, high-performance structures that are difficult to reach with traditional approaches. Compared to size and shape optimization, topology optimization is independent of the initial configuration and has a wider design space. It has therefore developed into a mainstream structural design technology for high-performance, lightweight, and multifunctional structures, and is widely used in aerospace, automotive, robotics, and other fields [5–9]. Commonly used topology optimization methods are the variable density/homogenization method [10, 11], the evolutionary structural optimization (ESO) method [12–14], and the level-set method (LSM) [15, 16].

The density-based optimization method decides, within a fixed design space, whether each element should be filled with material or left empty so as to minimize an objective function, under reasonable constraints such as the amount of material used; fundamentally, this is a large-scale mathematical programming problem. The basic formulation, based on linear static finite element analysis [17], is:

\min \; f(\rho, U)
\text{subject to:} \quad K(\rho)U = F(\rho), \quad g_i(\rho, U) \le 0, \quad 0 \le \rho \le 1

(1)

where f is the objective function, \rho is the vector of density design variables, U is the displacement vector, K is the global stiffness matrix, F is the force vector, and g_i is the

constraint condition. This general formulation can be applied to a wide range of problems, but it has high mathematical complexity and is sometimes difficult to implement. Bendsøe and Kikuchi [18] proposed another method, solid isotropic material with penalization (SIMP), in which, compared with variable density homogenization, the element elastic modulus is penalized as a power of the density variable. In this simple form SIMP quickly became the most popular topology optimization method and has been embedded into commercial software to solve engineering problems.

Evolutionary structural optimization (ESO) is a hard-kill method that gradually removes inefficient material through heuristic strategies until the specified material volume requirement is met. In contrast to density-based methods, the discrete design space used here is not relaxed. The appeal of hard-kill methods is that they can be used together with commercial finite element packages, which makes them simple and convenient. The general hard-kill formulation of the basic minimum compliance problem [17] is:

\min \; c = U^{T} K U
\text{subject to:} \quad \frac{V}{V_0} \le V_f, \quad KU = F, \quad x \in \{0, 1\}

(2)

The variable definitions are the same as those in (1); c denotes the compliance, V and V_0 are the material volume and design-domain volume, respectively, V_f is the allowable volume fraction, and x is the vector of element design variables. In the relevant physical quantities and material properties this method does not differ from density-based methods; the difference is that the element design variables are discrete. In some cases, however, ESO has difficulty reaching an effective solution. To address this, the bi-directional evolutionary structural optimization (BESO) method was proposed as an extension of ESO that allows both adding and deleting material to modify the structure; Huang and Xie [14] demonstrated the feasibility of this approach.

The LSM uses a higher-dimensional level-set function to describe structural boundaries. By iteratively solving the Hamilton-Jacobi equation to update the level-set function, an optimal configuration with smooth structural boundaries can be obtained. Zhang et al. [19, 20] proposed a feature-driven optimization (FDO) method based on the LSM, which decomposes complex engineering structures into a set of simple geometric features; compared with density-based methods, it not only greatly reduces the number of design variables but also avoids serrated boundaries in the optimal solution. There are many other topology optimization methods: Guo et al. [21, 22] proposed moving morphable components (MMC) and moving morphable voids (MMV), and the independent continuous mapping (ICM) method [23, 24] offers good stability, universality, and solving efficiency, has been well applied in structural topology optimization, and has been extended to topology optimization of multiphase materials, transient heat conduction problems, and material-structure integration.
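The SIMP interpolation and the hard-kill element selection of ESO/BESO discussed above can each be sketched in a few lines. This is illustrative only, not a full optimizer; `p = 3` is a typical penalization exponent, and the sensitivity values are placeholders:

```python
import numpy as np

def simp_modulus(rho, E0=1.0, Emin=1e-9, p=3.0):
    """SIMP power-law interpolation: E(rho) = Emin + rho**p * (E0 - Emin).
    The penalty exponent p makes intermediate densities uneconomical,
    pushing the design toward clear solid/void (0/1) elements."""
    rho = np.clip(rho, 0.0, 1.0)
    return Emin + rho ** p * (E0 - Emin)

def beso_keep(sensitivities, volfrac):
    """Hard-kill step: keep the volfrac fraction of elements with the
    highest sensitivities, remove the rest (x = 1 kept, x = 0 removed)."""
    n = len(sensitivities)
    n_keep = int(round(volfrac * n))
    order = np.argsort(sensitivities)[::-1]   # most efficient first
    x = np.zeros(n)
    x[order[:n_keep]] = 1.0
    return x
```

With p = 3, `simp_modulus(0.5)` is about 0.125, far below the linear value 0.5, which is exactly what penalizes "gray" elements.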


2.3 Size Optimization

Size optimization, also known as parameter optimization, is a detailed optimization technique applied in the detailed design stage. After the model structure has been drafted in the conceptual design stage, designers optimize it by changing certain parameters, such as plate thickness, cross-sectional dimensions of rods and beams, elastic element stiffness, mass elements, and material properties, to better meet the design requirements. Size optimization falls into two categories. One is free-size optimization, used in conceptual design, which applies methods similar to topology optimization algorithms when determining the thickness of non-uniform thin plates or other components. The other is the size optimization of detailed design [25], which fixes the form and material of the structure of the product or component in the later design stage and converts specific dimensional parameters into mathematical expressions for optimization. Yin et al. [26] proposed a lightweight structural optimization design method for robotic arms that takes structural dimensions and powertrain parameters as design variables, and demonstrated its effectiveness and superiority through design examples.

2.4 Shape Optimization

Shape optimization is applied in the basic design stage: the shape or position feature parameters of a part, such as connection heights or radii, are adjusted to make the stress distribution more uniform, thereby achieving a lightweight optimized design. Zhang et al. [27] designed a lightweight, high-stiffness, compact robot using an integrated modular structural design method based on dynamics and finite element analysis.
The following principles should be followed when selecting the shape scheme of an underwater vehicle: first, low shape resistance and good navigation performance; second, a reasonable internal space that facilitates the overall layout; third, good processability, convenient for machining.
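The parameter search described in Sect. 2.3 can be illustrated with a deliberately tiny example (the function, numbers, and stress model below are all hypothetical): pick the lightest plate thickness whose simple membrane stress stays below an allowable value. Because mass grows with thickness, the smallest feasible thickness is optimal.

```python
def optimal_thickness(force, width, sigma_allow, t_candidates):
    """Smallest feasible plate thickness t such that the (simplified)
    stress sigma = force / (width * t) stays within sigma_allow.
    Mass is proportional to t, so min feasible t = lightest design."""
    feasible = [t for t in t_candidates if force / (width * t) <= sigma_allow]
    return min(feasible) if feasible else None
```

Real size optimization couples many such variables through finite element analysis, but the structure of the problem, a constrained search over dimensional parameters, is the same.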

3 Lightweight Material

3.1 Lightweight Material Requirements for Underwater Robots

The selection of lightweight materials for underwater robots must take the underwater environment into account and meet the following requirements.

High strength: the structural materials must provide sufficient strength to withstand the water pressure of deep-sea operations; otherwise the service life of the robot will suffer.

Large elastic modulus and elastic limit: underwater robots bear enormous water pressure during underwater operation, which causes structural deformation, so the materials must resist elastic deformation and, as far as possible, avoid plastic deformation in service; this is the basis for accurate control of the robot.


Corrosion resistance: underwater robots generally work in the deep sea, and seawater is strongly corrosive; material corrosion seriously degrades every aspect of performance.

3.2 Material Introduction

Magnesium alloy, currently the lightest metallic structural material, has a density less than a quarter that of steel and offers light weight, high specific strength and stiffness, good shock absorption, and electromagnetic shielding; it is widely used in automobiles, electronics, medical care, and aerospace [28–30]. Magnesium alloys can also be applied to underwater robots, with significant advantages for mobility and energy consumption [31]. However, their strength and toughness cannot match steel: in underwater archaeology, rescue, material collection, and similar operations, a magnesium alloy arm struggles to grasp heavy objects. Magnesium alloys therefore cannot yet completely replace materials such as steel and aluminum alloys in key components.

Aluminum and aluminum alloys have excellent properties such as low density, high specific strength, good corrosion resistance, and easy processing and forming, and are widely used in the marine, petrochemical, aerospace, and aviation industries [32–35]. Compared with traditional steels, aluminum and its alloys have more prominent advantages in lightweight, high-speed ships. Aluminum alloys usually carry a surface oxide film that resists corrosion well under ordinary environmental conditions.
In the strongly corrosive marine environment, however, the oxide film on the surface of aluminum alloys is easily damaged, leaving the material prone to various types of corrosion that seriously affect its structural strength and other properties and greatly shorten its service life.

Titanium alloy has become the preferred structural material for high-end equipment components thanks to its high specific strength, good corrosion resistance, high-temperature resistance, and fatigue resistance [36]. However, titanium alloys are difficult to deform and hard to form and process [37], with poor plasticity and high resistance during forming; advanced hot-forming technology is required [38], so dimensional accuracy is limited, and it is difficult to meet the requirements of fine underwater robot parts.

Composite materials are multiphase materials, generally composed of two or more components with different properties and shapes. They retain the characteristics of the original components while adding superior performance that the components alone do not have [39]. Carbon fiber composites are currently among the most widely used and important composite materials: they combine low overall density with high specific strength, and resist corrosion and fatigue better than commonly used metallic materials [40–42].
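As a rough illustration of the trade-offs discussed in this section, candidate materials can be ranked by specific strength (strength-to-weight ratio). The property values below are representative order-of-magnitude figures chosen for illustration only; real selection must use certified data for the specific alloy or layup:

```python
# Illustrative (order-of-magnitude) properties: density kg/m^3, tensile MPa.
materials = {
    "magnesium alloy": (1800, 250),
    "aluminum alloy":  (2700, 450),
    "titanium alloy":  (4500, 900),
    "steel":           (7850, 600),
    "CFRP":            (1600, 1200),
}

def rank_by_specific_strength(mats):
    """Sort candidate materials by strength-to-weight ratio, best first."""
    return sorted(mats, key=lambda m: mats[m][1] / mats[m][0], reverse=True)
```

With these illustrative numbers the carbon fiber composite leads the ranking and steel trails it, which matches the qualitative discussion above; cost, corrosion behavior, and formability would enter a real decision as further criteria.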


4 Summary and Outlook

At present, structural optimization is the main way to achieve lightweight design of underwater robots. Within structural optimization, the novel and efficient design capability of topology optimization far exceeds traditional size and shape optimization; advanced topology techniques can reduce weight without degrading the robot's performance, so topology optimization is the main direction for future research on lightweight underwater robot design. In the selection of lightweight materials, magnesium alloys, aluminum alloys, and composite materials are all good choices. Because underwater robots have complex structures with many components, and different positions place different requirements on materials, the integrated application of multiple lightweight materials is a future development direction.

Acknowledgment. The authors thank Wang Gang and others for their help. This work has been supported by the Jilin Province Science and Technology Development Plan Project (Natural Science Foundation) (Project No. 20220101138JC): Research on Key Technologies of Underwater Intelligent Detection Robots Based on Multi-source Information Fusion; and by the Jilin Provincial Natural Science Foundation (Free Exploration General Project, YDZJ202301ZYTS420): Research on Collaborative Strategies of Multiple Underwater AUV Clusters Based on Biological Inspiration.

References

1. Yu, C., Zhang, J., Wu, Y.: Research progress in the application of robotic lightweight materials. New Mater. Ind. (12), 41–45 (2019)
2. Wei, X.: Research on lightweight methods and design of construction machinery. Stand. Qual. Mach. Ind. 09, 32–35 (2022)
3. Wang, J., Li, Y., Hu, G., et al.: Lightweight research in engineering: a review. Appl. Sci. 9(24), 5322 (2019)
4. Sigmund, O., Maute, K.: Topology optimization approaches. Struct. Multidiscip. Optim. 48(6), 1031–1055 (2013)
5. Zhu, J.H., Zhang, W.H., Xia, L.: Topology optimization in aircraft and aerospace structures design. Arch. Comput. Methods Eng. 23(4), 595–622 (2016)
6. Shi, G.H., Guan, C.Q., Quan, D.L., et al.: An aerospace bracket designed by thermo-elastic topology optimization and manufactured by additive manufacturing. Chin. J. Aeronaut. 33(4), 1252–1259 (2019)
7. Jankovics, D., Barari, A.: Customization of automotive structural components using additive manufacturing and topology optimization. IFAC-PapersOnLine 52(10), 212–217 (2019)
8. Jewett, J.L., Carstensen, J.V.: Topology-optimized design, construction and experimental evaluation of concrete beams. Autom. Constr. 102, 59–67 (2019)
9. Krog, L., Tucker, A., Rollema, G.: Application of topology, sizing and shape optimization methods to optimal design of aircraft components. In: 3rd Altair UK HyperWorks Users Conference, pp. 1–12 (2002)
10. Yang, R.J., Chuang, C.H.: Optimal topology design using linear programming. Comput. Struct. 52(2), 265–275 (1994)
11. Bendsoe, M.P., Sigmund, O.: Topology Optimization: Theory, Methods and Applications. Springer Science and Business Media (2013)


12. Xie, Y.M., Steven, G.P.: A simple evolutionary procedure for structural optimization. Comput. Struct. 49(5), 885–896 (1993)
13. Querin, O.M., Young, V., Steven, G.P., et al.: Computational efficiency and validation of bi-directional evolutionary structural optimization. Comput. Methods Appl. Mech. Eng. 189(2), 559–573 (2000)
14. Huang, X., Xie, Y.M.: Convergent and mesh-independent solutions for the bi-directional evolutionary structural optimization method. Finite Elem. Anal. Des. 43(14), 1039–1049 (2007)
15. Wang, M.Y., Wang, X.M., Guo, D.M.: A level set method for structural topology optimization. Comput. Methods Appl. Mech. Eng. 192(1–2), 227–246 (2003)
16. Mei, Y.L., Wang, X.M.: A level set method for structural topology optimization and its applications. Adv. Eng. Softw. 35(7), 415–441 (2004)
17. Deaton, J.D., Grandhi, R.V.: A survey of structural and multidisciplinary continuum topology optimization: post 2000. Struct. Multidiscip. Optim. 49, 1–38 (2014)
18. Bendsøe, M.P., Sigmund, O.: Material interpolation schemes in topology optimization. Arch. Appl. Mech. 69, 635–654 (1999)
19. Zhang, W.H., Zhou, Y., Zhu, J.H.: A comprehensive study of feature definitions with solids and voids for topology optimization. Comput. Methods Appl. Mech. Eng. 325, 289–313 (2017)
20. Zhou, Y., Zhang, W.H., Zhu, J.H., et al.: Feature-driven topology optimization method with signed distance function. Comput. Methods Appl. Mech. Eng. 310, 1–32 (2016)
21. Guo, X., Zhang, W.S., Zhong, W.L.: Doing topology optimization explicitly and geometrically: a new moving morphable components based framework. J. Appl. Mech. 81(8), 081009 (2014)
22. Zhang, W.S., Chen, J.S., Zhu, X.F., et al.: Explicit three dimensional topology optimization via Moving Morphable Void (MMV) approach. Comput. Methods Appl. Mech. Eng. 322, 590–614 (2017)
23. Long, K., Wang, X., Gu, X.G.: Concurrent topology optimization for minimization of total mass considering load-carrying capabilities and thermal insulation simultaneously. Acta Mech. Sin. 34(2), 315–326 (2018)
24. Li, X., Zhao, Q., Long, K., et al.: Multi-material topology optimization of transient heat conduction structure with functional gradient constraint. Int. Commun. Heat Mass Transfer 131, 105845 (2022)
25. Zhao, H., Sun, L., Wang, Y.: Research on the lightweight of a subway car body anti-creep structure. Mech. Manuf. Autom. 51(06), 61–65 (2022)
26. Yin, H., Huang, S., He, M., et al.: A unified design for lightweight robotic arms based on unified description of structure and drive trains. Int. J. Adv. Robot. Syst. 14(4), 1–14 (2017)
27. Zhang, W., Huang, Q., Jia, D., et al.: Mechanical design of a light weight and high stiffness humanoid arm of BHR-03. In: IEEE International Conference on Robotics and Biomimetics, Guilin, China, pp. 1681–1686. IEEE (2009)
28. Wu, G.H., Wang, C.L., Sun, M., et al.: Recent developments and applications on high-performance cast magnesium rare-earth alloys. J. Magnesium Alloys 9(1), 1–20 (2021)
29. Song, X., Wang, Z.-W., Zeng, R.-C.: Magnesium alloys: composition, microstructure and ignition resistance. Chin. J. Nonferrous Metals 31(3), 598–622 (2021)
30. Qin, Y., Wen, P., Guo, H., et al.: Additive manufacturing of biodegradable metals: current research status and future perspectives. Acta Biomater. 98, 3–22 (2019)
31. Wang, Z., Jia, L., Du, W.: Analysis of the application of magnesium alloy materials in robot lightweight. New Mater. Ind. 7, 14–17 (2016)
32. Qi, Z.Y., Wu, R.Z., Wang, G.J., et al.: Light Alloy Process. Technol. 44(1), 12–18 (2016)
33. Zhou, B.: Stress Corrosion Behavior and Mechanism of 6082 Aluminum Alloy. Changchun University of Science and Technology, Changchun (2020)
34. Ajay Krishnan, M., Raja, V.S.: Role of temper conditions on the hydrogen embrittlement behavior of AA 7010. Corrosion Sci. 152, 211–217 (2019)

288

B. Fu and G. Wang

35. Alexopoulos, N.D., Charalampidou, C., Skarvelis, P., et al.: Synergy of corrosion-induced micro-cracking and hydrogen embrittlement on the structural integrity of aluminium alloy (Al-Cu-Mg) 2024. Corrosion Sci. 121, 32–42 (2017) 36. Hong, Q., Guo, P., Zhou, W.: Titanium alloy forming technology and application. Titanium Ind. Progress 39(05), 27–32 (2022) 37. Li, J., Sun, Q., Yu, H.: Research status of advanced forming technologies for high performance titanium alloys. Steel Vanadium Titanium 42(06), 17–27 (2021) 38. Zang, J., Chen, J., Han, K., Xing, Q., Dai, S.: Research progress and development trend of aviation aluminum alloys. China Mater. Progress 41(10), 769–777+807 (2022) 39. Chen, H., Liu, G., Wang, X., Le, H.: Application and prospect of lightweight composite materials and 3D printing technology in service robots. Eng. Res. – Eng. Interdiscip. Perspect. 14(01), 30–39 (2022) 40. Zhang, H.: Laying and molding process of carbon fiber prepreg tape and product performance control. Xi’an: Northwest University of Technology 5 (2018) 41. Fan, X.: Application status and development trend of carbon fiber composites chemical industry 37(4), 12–16, 25(2019) 42. Zheng, Y.: Continuous carbon fiber reinforced Al matrix composites and their trend development Modern Manufacturing Technology and Equipment 58(2), 93–95,101 (2022)

False Alarm Rate Control Method for Fiber Vibrate Source Detection with Non-stationary Interference

Liping Yin1(B), Zhengju Zhu1, Mingxing Shu1, and Hongquan Qu2

1 Nanjing University of Information Science and Technology, Nanjing 210044, China
[email protected]
2 North China University of Technology, Beijing 100144, China

Abstract. In this paper, the false alarm rate control problem is considered for pipeline detection. Firstly, through the optical fiber sensing hardware system, the light intensity data is measured at the end of the optical fiber. Secondly, the probability density functions (PDFs) of the light intensity are estimated based on kernel density estimation (KDE). The dynamic weight model is then established through B-spline function selection and subspace system identification. To reduce the false alarm rate, the threshold coefficient of constant false alarm rate (CFAR) detection is designed as the control input to the identified weight dynamic system, such that the difference between the output PDFs and the desired PDF is minimized. Finally, the effectiveness of this method is illustrated by simulation experiments.

Keywords: fiber vibration source detection · PDF · weight dynamic model · PDF shape control · CFAR

1 Introduction

Pipeline transportation of oil and natural gas is of great importance to the economy and livelihoods. However, once a pipeline leaks, it can cause enormous losses and harm to both the pipeline company and society [1,2]. Therefore, safe and reliable warning technologies are urgently needed. At present, pipeline safety technology can be divided into two categories: post-leakage alarm and pre-leakage alarm. A fiber optic early warning system can directly measure the vibration of the communication cable and realize fast, long-distance and large-scale vibration source detection and early warning for oil and gas pipelines. The optical fiber sensing system senses vibration signals coming from the ground or underground (e.g., earthquakes) through the optical fiber. In the front-end optical fiber sensor hardware system, optical time domain reflectometry (OTDR) is widely adopted. However, OTDR is extremely sensitive even to harmless interference and instantaneous environmental noise, which leads to frequent false alarms.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 289–297, 2023. https://doi.org/10.1007/978-981-99-6187-0_29

290

L. Yin et al.

These interferences and environmental noises are usually non-stationary in both space and time [3], so the detection system output does not satisfy the Gaussian distribution hypothesis. In the back-end vibration source detection software, space-time two-dimensional non-stationary interference causes a false alarm/real alarm aliasing problem, resulting in an unstable false alarm rate and low vibration detection credibility, which seriously restricts the practical application of this technology [4]. To reduce the false alarm rate, this paper introduces a threshold coefficient control method.

The rest of this paper is organized as follows. In Sect. 2, the field vibration data collection is introduced. In Sect. 3, the probability density functions of the false-alarm-involved output optical intensity are estimated through kernel density estimation, and the weight dynamic model of the output PDFs is established. In Sect. 4, the model parameters are identified using the N4SID method. Afterwards, the threshold control strategy for the output PDF model is proposed in Sect. 5. Section 6 presents the simulation of data modeling and control. Conclusions are summarized in Sect. 7.

2 Data Collection

The fiber-optic early warning system senses vibration signals through an optical fiber buried in the same ditch as the pipeline. The optical fiber sensing hardware system consists of two parts: the optical fiber sensor module and the data acquisition module. Among the sensed vibration signals, some indicate real security threats, such as drilling and digging. However, there are also vibrations which do not really threaten pipeline safety, such as vehicles passing or humans running. Many false alarms can be excluded by the existing CFAR method [5]. The problem is that in CFAR the detection threshold is manually adjusted by experience. To solve this problem, a PDF shape control method is studied in this paper to dynamically tune the threshold coefficient.

3 PDF Estimation and PDF Model

As the PDF of the output data from the fiber end cannot be measured directly, the kernel density estimation (KDE) method is used in this section to estimate the false alarm data PDF, with Gaussian functions as the basis functions. In fact, the false alarm output PDF of the optical fiber vibration data is also related to the detection threshold, which is denoted as $u_k$ in this paper. Considering this, we denote the output PDF as $\gamma(y, u_k)$, which can be estimated as

$$\hat{\gamma}(y, u_k) = \frac{1}{nh}\sum_{i=1}^{n} K\!\left(\frac{y - y_i}{h}\right) \tag{1}$$

where $K$ is the kernel function, $h = \left(\frac{4\hat{\sigma}^5}{3n}\right)^{1/5} \approx 1.06\,\hat{\sigma}\, n^{-1/5}$ is the bandwidth, $n$ is the sample number, and $\hat{\sigma}$ is the standard deviation of the samples. It can be proved that the mean integrated square error between $\gamma(y, u_k)$ and $\hat{\gamma}(y, u_k)$ converges to 0 when the bandwidth is chosen as above [6].
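The estimator (1) with the rule-of-thumb bandwidth can be sketched as follows. This is an illustrative NumPy implementation, not the authors' code; the intensity samples are synthetic:

```python
import numpy as np

def silverman_bandwidth(samples):
    """Rule-of-thumb bandwidth h = (4*sigma^5 / (3*n))^(1/5), roughly 1.06*sigma*n^(-1/5)."""
    n = len(samples)
    sigma = np.std(samples, ddof=1)
    return (4.0 * sigma**5 / (3.0 * n)) ** 0.2

def kde(samples, y_grid):
    """Gaussian-kernel estimate of the sample PDF, as in (1)."""
    h = silverman_bandwidth(samples)
    n = len(samples)
    diffs = (y_grid[:, None] - samples[None, :]) / h
    kernels = np.exp(-0.5 * diffs**2) / np.sqrt(2.0 * np.pi)  # standard normal kernel K
    return kernels.sum(axis=1) / (n * h)

def integrate(f, y):
    """Trapezoidal integration of samples f over grid y."""
    return float(((f[1:] + f[:-1]) * np.diff(y)).sum() / 2.0)

rng = np.random.default_rng(0)
samples = rng.normal(loc=2.0, scale=0.5, size=500)  # stand-in for light-intensity data
y = np.linspace(-1.0, 5.0, 601)
pdf_hat = kde(samples, y)
print(integrate(pdf_hat, y))  # a valid density estimate should integrate to ~1
```

The Gaussian kernel matches the paper's choice of basis functions for the KDE; the bandwidth formula is the one stated after (1).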


To find the relationship between $u_k$ and $\gamma(y, u_k)$, the output PDF is decomposed as [7,8]:

$$\gamma(y, u_k) = \sum_{i=1}^{n} \omega_i(u_k)\, B_i(y) + e \tag{2}$$

where $B_i(y)$ are basic spline functions, $\omega_i$ are the weights, and $e$ is the modeling error. The output PDF control problem is thus transformed into a weight control problem. Denote

$$b_i = \int_a^b B_i(y)\,dy,\quad b^T = [b_1, b_2, \cdots, b_{n-1}] \in \mathbb{R}^{n-1}$$
$$C_1(y) = [B_1(y), B_2(y), \cdots, B_{n-1}(y)] \in \mathbb{R}^{1\times(n-1)},\quad L(y) = b_n^{-1} B_n(y) \in \mathbb{R}^{1\times 1} \tag{3}$$
$$C_0(y) = C_1(y) - \frac{B_n(y)}{b_n}\, b^T \in \mathbb{R}^{1\times(n-1)},\quad V(k) = [\omega_1(k), \omega_2(k), \cdots, \omega_{n-1}(k)]^T$$

According to the natural constraint that a PDF integrates to 1 over the whole interval, i.e. $\int_a^b \gamma(y, u_k)\,dy = 1$, the weight of the $n$th B-spline can be obtained as $\omega_n = \frac{1 - b^T V}{b_n}$. Hence the $n$th weight of the linear B-spline model is linearly determined by the other $(n-1)$ weights, and only $(n-1)$ weights are independent of each other. Furthermore, the PDF in (2) can be rewritten as

$$\gamma(y, u_k) = C_0(y)\, V_k + L(y) \tag{4}$$

and the weight vector $V_k$ can be expressed as [8–10]:

$$V_k = \left[\int_a^b C_0(y)^T C_0(y)\,dy\right]^{-1} \cdot \int_a^b C_0(y)^T\,[\gamma(y, u_k) - L(y)]\,dy$$

To focus on the key issues, it is supposed that the weight vector and the control input satisfy the following linear dynamic relationship:

$$\begin{cases} x_{k+1} = A x_k + B u_k \\ V_k = C x_k + D u_k \end{cases} \tag{5}$$

where $x$ is an intermediate state with no particular physical meaning, $u_k$ is the input representing the CFAR detection threshold, the output $V_k$ is the weight vector, and $A, B, C, D$ are the parameter matrices to be identified.
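The projection that recovers the weight vector from an estimated PDF can be sketched numerically. The triangular "hat" basis functions below are illustrative stand-ins (the paper uses quadratic B-splines), chosen so that the integrals are easy to verify by hand:

```python
import numpy as np

def integrate(f, y, axis=-1):
    """Trapezoidal integration along `axis` of f over grid y."""
    f = np.moveaxis(np.asarray(f), axis, -1)
    return ((f[..., 1:] + f[..., :-1]) * np.diff(y)).sum(axis=-1) / 2.0

y = np.linspace(0.0, 4.0, 4001)

def hat(y, c):
    # triangular "hat" basis centered at c; stand-in for the B-splines
    return np.clip(1.0 - np.abs(y - c), 0.0, None)

B = np.stack([hat(y, c) for c in (1.0, 2.0, 3.0)])   # B_1..B_n with n = 3
b = integrate(B, y, axis=1)                           # b_i = integral of B_i
L = B[-1] / b[-1]                                     # L(y) = B_n(y) / b_n
C0 = B[:-1] - np.outer(b[:-1] / b[-1], B[-1])         # rows: B_i - (b_i/b_n) B_n

gamma = hat(y, 2.0) / integrate(hat(y, 2.0), y)       # a known PDF to project

M = integrate(C0[:, None, :] * C0[None, :, :], y, axis=2)  # int C0^T C0 dy
r = integrate(C0 * (gamma - L), y, axis=1)                 # int C0^T (gamma - L) dy
V = np.linalg.solve(M, r)

gamma_hat = V @ C0 + L   # reconstruction (4); integrates to 1 by construction
print(V, integrate(gamma_hat, y))
```

Because each row of $C_0$ integrates to zero and $L$ integrates to one, any reconstructed $\hat\gamma = C_0^T V + L$ automatically satisfies the unit-integral constraint, which is exactly why the paper works with the reduced weight vector $V$.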

4 N4SID Method

N4SID is a numerical algorithm for subspace state-space system identification [11]. In N4SID, the following input/output Hankel matrices need to be constructed:




$$U_{0|i-1} = \begin{bmatrix} u_0 & u_1 & \cdots & u_{j-1} \\ u_1 & u_2 & \cdots & u_j \\ \vdots & \vdots & \ddots & \vdots \\ u_{i-1} & u_i & \cdots & u_{i+j-2} \end{bmatrix}, \quad V_{0|i-1} = \begin{bmatrix} v_0 & v_1 & \cdots & v_{j-1} \\ v_1 & v_2 & \cdots & v_j \\ \vdots & \vdots & \ddots & \vdots \\ v_{i-1} & v_i & \cdots & v_{i+j-2} \end{bmatrix}$$

where $i, j$ are the row and column numbers respectively. Denote the generalized observability matrix as $\Gamma_i = [C^T\ (CA)^T\ (CA^2)^T\ \cdots\ (CA^{i-1})^T]^T$, the reversed generalized controllability matrix as $\Delta_i = [A^{i-1}B\ \ A^{i-2}B\ \cdots\ B]$, the state matrix as $X_i = [x_i\ x_{i+1}\ x_{i+2}\ \cdots\ x_{i+j-1}]$, and the lower triangular Toeplitz matrix as

$$H_i = \begin{bmatrix} D & 0 & \cdots & 0 \\ CB & D & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ CA^{i-2}B & CA^{i-3}B & \cdots & D \end{bmatrix}$$
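The Hankel construction can be sketched for scalar signals (in general the entries $u_k$, $v_k$ may be vectors, in which case the entries become blocks). This is an illustrative sketch, not the authors' implementation:

```python
import numpy as np

def hankel(signal, i, j):
    """i-by-j Hankel matrix: row r, column c holds signal[r + c]."""
    return np.array([[signal[r + c] for c in range(j)] for r in range(i)])

u = np.arange(10)          # u_0 ... u_9
U = hankel(u, i=3, j=4)    # corresponds to U_{0|i-1} with scalar inputs
print(U)
```

Each column of `U` is a window of `i` consecutive samples shifted by one step, which is exactly the structure the matrices $U_{0|i-1}$ and $V_{0|i-1}$ above encode.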

In N4SID, the matrices $U$, $V$, $\Gamma$, $X$ and $E$ should satisfy

$$V_{0|i-1} = \Gamma_i X_0 + H_i U_{0|i-1} + E_{0|i-1}, \quad V_{i|2i-1} = \Gamma_i X_i + H_i U_{i|2i-1} + E_{i|2i-1}, \quad X_i = A^i X_0 + \Delta_i U_{0|i-1}$$

where $E_{0|i-1}$ and $E_{i|2i-1}$ are error terms. According to the above formulas, the parameter matrices of the model are derived from the following equation:

$$[A\ \ B\ \ C\ \ D] = \arg\min_{A, B, C, D} \left\| \begin{bmatrix} \hat{X}_{i+1} \\ V_i \end{bmatrix} - \begin{bmatrix} A & B \\ C & D \end{bmatrix} \begin{bmatrix} \hat{X}_i \\ U_{i|i} \end{bmatrix} \right\|_F^2 \tag{6}$$

In (6), the state estimation matrices $\hat{X}_i$ and $\hat{X}_{i+1}$ can be obtained by singular value decomposition of $V_{i|2i-1}$ and $U_{i|2i-1}$.
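Once state estimates are available, (6) is an ordinary linear least-squares problem. A minimal sketch with fabricated, noise-free data (known $A, B, C, D$ and random states/inputs, all illustrative assumptions) shows the regression recovering the system matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[0.5, -0.2], [0.1, 0.8]])   # assumed "true" system for the demo
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.5]])
D = np.array([[0.1]])

j = 50
X  = rng.normal(size=(2, j))   # stand-in for the estimated states X_i
U  = rng.normal(size=(1, j))   # inputs U_{i|i}
Xn = A @ X + B @ U             # X_{i+1}
V  = C @ X + D @ U             # V_i

lhs = np.vstack([Xn, V])       # [X_{i+1}; V_i]
rhs = np.vstack([X, U])        # [X_i; U_{i|i}]
Theta = lhs @ np.linalg.pinv(rhs)   # minimizes the Frobenius norm in (6)
print(np.round(Theta, 6))           # rows of [A B; C D]
```

With noise-free data and a full-row-rank regressor, the pseudoinverse solution is exact; with real data the same regression gives the Frobenius-norm minimizer of (6).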

5 Controller Design

The optimal control input can be obtained by minimizing the following performance index:

$$J = \int_a^b \left(\gamma(y, u_k) - g(y)\right)^2 dy \tag{7}$$

Substituting (4) into (7) yields

$$J = \int_a^b \left(C_0(y)\, V(u_k) + L(y) - g(y)\right)^2 dy \tag{8}$$

Denote $\Sigma = \int_a^b C_0(y)^T C_0(y)\,dy \in \mathbb{R}^{(n-1)\times(n-1)}$, $\gamma_0 = \int_a^b (g(y) - L(y))^2\,dy$, and $\eta = \int_a^b (g(y) - L(y))\, C_0(y)\,dy \in \mathbb{R}^{1\times(n-1)}$; then (8) is reduced to

$$J = V^T(u_k)\, \Sigma\, V(u_k) - 2\eta\, V(u_k) + \gamma_0 \tag{9}$$

Taking the partial derivative of $J$ with respect to $u_k$ and setting $\partial J / \partial u_k$ equal to zero, we obtain [12]

$$\left(V^T(u_k)\, \Sigma - \eta\right) \frac{\partial V(u_k)}{\partial u_k} = 0$$

Generally, the gradient method is used to solve for $u_k$:

$$u_k^{i+1} = u_k^i - 2\mu \left(V^T(u)\, \Sigma - \eta\right) \left.\frac{\partial V(u)}{\partial u}\right|_{u = u_k^i}, \quad (i = 1, 2, \cdots, k) \tag{10}$$

where $\mu > 0$ is a pre-set, sufficiently small optimization step and $u_k$ is the control input to (5).
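The gradient iteration (10) can be sketched with a toy linear map $V(u) = g\,u$ standing in for the identified model (5). The 2×2 matrix and row vector below use the numerical values reported later in the trotting experiment, while the sensitivity `g` is purely an illustrative assumption:

```python
import numpy as np

Sigma = np.array([[3.8731, 4.0570],
                  [4.0570, 9.2266]])   # int C0^T C0 dy, values from Sect. 6.2
eta = np.array([0.2, 0.2])             # eta, values from Sect. 6.2
g = np.array([1.0, 0.5])               # assumed sensitivity dV/du (illustrative only)

def V(u):
    # toy stand-in for the identified weight dynamics (5)
    return g * u

u = 0.01      # initial threshold coefficient
mu = 1e-2     # small optimization step mu > 0
for _ in range(200):
    grad = 2.0 * ((V(u) @ Sigma - eta) @ g)   # (V^T Sigma - eta) dV/du, as in (10)
    u -= mu * grad

# at convergence the stationarity condition (V^T Sigma - eta) dV/du = 0 holds
print(u, (V(u) @ Sigma - eta) @ g)
```

Since $J$ is quadratic in $u$ under this toy model, the iteration contracts geometrically to the unique stationary point.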

6 Simulation Results

The data acquisition was conducted in Shangweidian Village, Mentougou District, Beijing. The buried optical fibers are about 1.5 km in length and 15–20 cm in depth. Another 10 km of optical fiber is placed in a shock-proof box. Considering the twists and turns in the actual laying of pipelines and optical cables, the optical fibers in the experiment are buried in an "S" shape along the red line. One end of the 10 km optical fiber in the shock-proof box is connected to the optical fiber vibration warning system, and the other end is fused with the 1.5 km buried optical cable. In this field experiment, optical fiber vibration data are collected in the cases of electric drill damage, trotting, and non-vibration respectively.

6.1 PDF Model Simulation Results

The basic spline functions in (2) are selected as

$$\begin{aligned} B_1(y) &= 0.5y^2 I_1(y) + (-y^2 + 3y - 1.5)\, I_2(y) + 0.5(y-3)^2 I_3(y) \\ B_2(y) &= 0.5(y-1)^2 I_2(y) + (-y^2 + 5y - 5.5)\, I_3(y) + 0.5(y-4)^2 I_4(y) \\ B_3(y) &= 0.5(y-2)^2 I_3(y) + (-y^2 + 7y - 11.5)\, I_4(y) + 0.5(y-5)^2 I_5(y) \end{aligned} \tag{11}$$

where $I_i$ $(i = 1, 2, 3, 4, 5)$ is the indicator function

$$I_i(y) = \begin{cases} 1, & y \in [i-1, i] \\ 0, & \text{otherwise} \end{cases}$$

According to the linear weighted dynamic model (5), when the external disturbance of the system is not considered, the parameters of the weight dynamic model are identified using the N4SID algorithm, and the identification result is

$$\begin{cases} x_{k+1} = \begin{bmatrix} 0.5092 & -0.2642 \\ 0.8522 & 1.034 \end{bmatrix} x_k + \begin{bmatrix} 0 \\ -7.355\mathrm{e}{-18} \end{bmatrix} u_k \\[6pt] V_k = \begin{bmatrix} 9.134\mathrm{e}{-15} & 8.324\mathrm{e}{-15} \\ 1.481\mathrm{e}{-14} & 1.349\mathrm{e}{-14} \end{bmatrix} x_k \end{cases} \tag{12}$$
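The basis (11) can be checked numerically: strictly inside $[2, 3]$, where the supports of all three splines overlap, the quadratic pieces sum to 1 (partition of unity). The sketch evaluates slightly inside the interval to avoid double counting at the knots, where adjacent closed-interval indicators overlap:

```python
import numpy as np

def I(i, y):
    # closed-interval indicator of [i-1, i]
    return ((y >= i - 1) & (y <= i)).astype(float)

def B1(y):
    return 0.5*y**2*I(1, y) + (-y**2 + 3*y - 1.5)*I(2, y) + 0.5*(y - 3)**2*I(3, y)

def B2(y):
    return 0.5*(y - 1)**2*I(2, y) + (-y**2 + 5*y - 5.5)*I(3, y) + 0.5*(y - 4)**2*I(4, y)

def B3(y):
    return 0.5*(y - 2)**2*I(3, y) + (-y**2 + 7*y - 11.5)*I(4, y) + 0.5*(y - 5)**2*I(5, y)

# on (2, 3) all three splines are supported and should sum to 1
y = np.linspace(2.001, 2.999, 999)
print(np.max(np.abs(B1(y) + B2(y) + B3(y) - 1.0)))
```

Algebraically, on $(2,3)$ the sum is $0.5(y-3)^2 + (-y^2+5y-5.5) + 0.5(y-2)^2 = 1$, so only floating-point rounding remains.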

6.2 False Alarm Rate Control in the Case of Trotting

The weight vector of the target PDF is calculated to be $V_g = (1.3954\mathrm{e}{-05},\ 1.3100\mathrm{e}{-05})^T$. According to the basic spline functions (11) and formulas (3), it can be concluded that

$$\int_a^b C_0(y)^T C_0(y)\,dy = \begin{bmatrix} 3.8731 & 4.0570 \\ 4.0570 & 9.2266 \end{bmatrix}, \quad \eta = [0.2,\ 0.2].$$

The actual trotting happens between 160 m and 200 m from the fiber head, which corresponds to data columns 10–13. If only the CFAR detection method is used, the detection result is displayed in Table 1, from which it can be seen that the alarm indicates the trotting position lies between columns 12–13 (192 m–208 m), and the alarm position error is 40 m.

Table 1. Comparison of actual trotting position and predicted trotting position

test method | actual trot position | predicted position | error
single CFAR detection | 10–13 (160 m–200 m) | 12–13 (192 m–208 m) | 40 m
threshold coefficient controlled CFAR | 10–13 (160 m–200 m) | 10–13 (160 m–208 m) | 8 m

Figure 1(a) is the control input to model (12) obtained by optimizing (9). It can be seen that the control input converges to 0.02717, hence the threshold coefficient should be tuned to 0.02717. The performance index (7) changes as in Fig. 1(b), which shows that as the sample time increases, the performance index rapidly decreases to 0. The weight vectors in (12) are represented in Fig. 1(c). The 3D mesh of the trotting-data-based PDF is demonstrated in Fig. 1(d), which indicates that closed-loop stability and tracking performance can be achieved through the control input $u_k$. The alarm position is detected between 160 m and 208 m. More importantly, if the threshold is tuned to the limit value 0.02717, the false alarm can be eliminated.

Similarly, the threshold estimation simulation is carried out based on the electric hammer experimental data. The data is obtained at position 230 m, which corresponds to column 14 in the CFAR detection. In this simulation, the initial value of CFAR is set as $u_0 = 0.01$ and the initial state of the dynamic weight system (12) is set as $x_0 = [1, 1]$. It can be seen from Fig. 2(a) that the control input converges to 0.03623. The performance index is displayed in Fig. 2(b), where the index decreases to 0. The PDF $\gamma(y, u_k)$ is plotted in Fig. 2(d), from which it can be seen that the alarm generated by the electric hammer is distributed at column 14. More details are listed in Table 2.

Compared with the single CFAR method, in the electric drill experiment, the advantages of the PDF threshold coefficient controlled CFAR method are summarized as follows: the single CFAR method might cause false detection and redundant detection, while with the PDF threshold control the alarm is more precise in position.


Fig. 1. False alarm rate control in the case of trotting: (a) Input sequence; (b) Performance index; (c) Weight vector; (d) PDF.


Fig. 2. False alarm rate control in the case of electric drill digging: (a) Input sequence; (b) Performance index; (c) Weight vector; (d) PDF.


Table 2. Comparison of actual hammer breaking position and predicted hammer breaking position

test method | actual drill position | predicted position | error
single CFAR detection | column 14 (230 m) | columns 14, 47, 75 | false alarms
threshold coefficient controlled CFAR | column 14 (230 m) | column 14 (224 m) | 6 m

The single CFAR method might also cause false alarm and real alarm aliasing, where many false alarms are mixed with the real alarms; with the PDF threshold control, many of these false alarms can be removed.

7 Conclusions

Due to the sensor sensitivity of the hardware system, environmental noises might be mixed with real intrusions and lead to high false alarm rates. To reduce the false alarm rate, a PDF shape control method is adopted in CFAR. Firstly, the optical fiber vibration data is collected from the fiber end to be analyzed and processed. Secondly, the PDFs of the vibration data are estimated and the weight dynamic model is established. N4SID is then used to identify the model parameters. Thirdly, the control input is obtained through performance index optimization. Finally, the control input is used as the threshold coefficient to make CFAR more accurate.

References

1. Hoffman, P.R., Kuzyk, M.G.: Position determination of an acoustic burst along a Sagnac interferometer. J. Lightwave Technol. 22(2), 494–498 (2004)
2. Shaw, D., et al.: Pipeline and hazardous materials safety administration, final report, leak detection study DTPH56-11-D-000001. U.S. Department of Transportation, Worthington (2012)
3. Xu, Y., Zhang, L., Lu, P., Mihailov, S., Chen, L., Bao, X.: Time-delay signature concealed broadband gain-coupled chaotic laser with fiber random grating induced distributed feedback. Opt. Laser Technol. 109, 654–658 (2019)
4. Zhang, Z., Bao, X.: Distributed optical fiber vibration sensor based on spectrum analysis of polarization-OTDR system. Opt. Express 16(14), 10240–10247 (2008)
5. Wang, Z., Zhao, Z., Ren, C., Nie, Z.: Adaptive GLR-, Rao- and Wald-based CFAR detectors for a subspace signal embedded in structured Gaussian interference. Digit. Sig. Proc. 92, 139–150 (2019)
6. Yin, L., Zhang, H., Guo, L.: Data driven output joint probability density function control for multivariate non-linear non-Gaussian systems. Entropy 15(1), 32–52 (2013)
7. Guo, L., Wang, H.: Applying constrained nonlinear generalized PI strategy to PDF tracking control through square root B-spline models. Int. J. Control 77(17), 1481–1492 (2004)


8. Wang, H.: Bounded Dynamic Stochastic Systems: Modeling and Control. Springer-Verlag, London (2000)
9. Guo, L., Wang, H.: Stochastic Distribution Control System Design. Springer-Verlag, London (2010)
10. Yi, Y., Zheng, W.X., Sun, C., Guo, L.: DOB fuzzy controller design for non-Gaussian stochastic distribution systems using two-step fuzzy identification. IEEE Trans. Fuzzy Syst. 24(2), 401–418 (2016)
11. Alenany, A., Mercère, G., Ramos, J.A.: Subspace identification of 2-D CRSD Roesser models with deterministic-stochastic inputs: a state computation approach. IEEE Trans. Control Syst. Technol. 25(3), 1108–1115 (2017)
12. Florea, M.I., Vorobyov, S.A.: A generalized accelerated composite gradient method: uniting Nesterov's fast gradient method and FISTA. IEEE Trans. Signal Process. 68, 3033–3048 (2020)

An Improved YOLOv5-Based Small Target Detection Method for UAV Aerial Image

Ruoyu Li, Yang Gao(B), and Ruixing Zhang

Nanjing University of Science and Technology, Nanjing 210094, China
[email protected]

Abstract. With the continuous development of UAV technology and target detection technology, using UAVs to detect ground targets has become a current research hotspot. However, due to the altitude of the UAV, the size of ground targets in UAV aerial images is small and their feature information is not obvious, which easily causes missed and false detections, so algorithms with higher precision are needed for UAVs. For this purpose, based on the YOLOv5 target detection algorithm, this paper improves the detection accuracy for small targets by optimizing the prior anchor box, adding a small target detection layer, and adjusting the multi-scale feature fusion structure. Finally, experimental results verify that the proposed method can effectively increase the detection accuracy for small targets by up to 10.4%.

Keywords: UAV Aerial Image · Target Detection · YOLOv5

1 Introduction

In recent years, the rise of concepts like unmanned and intelligent warfare has led to growing interest in UAV target detection as a pivotal technology [1]. Due to the high flying height and fast speed changes of UAVs when performing missions, many important ground targets occupy few pixels in aerial images and their features are not obvious, resulting in increased false detection and missed detection rates for these small targets [2]. Hence, it holds immense significance to investigate the detection of small targets in aerial images captured by UAVs.

The development of target detection algorithms [3] has progressed through two stages thus far: the traditional approach based on manual feature extraction, and the convolutional neural network (CNN)-based target detection algorithm [4]. The traditional method relies on sliding windows and conventional machine learning classifiers. These conventional object detection algorithms lack targeted region selection in the image, have high time complexity, and are not robust to changes in image features. In contrast, with the continuous advancement of artificial neural networks, CNN-based target detection [5] has surpassed traditional machine learning algorithms in both speed and accuracy, and has gradually become the predominant research direction for target detection.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 298–312, 2023. https://doi.org/10.1007/978-981-99-6187-0_30

An Improved YOLOv5-Based Small Target Detection Method

299

The CNN-based object detection algorithms can be categorized into two-stage and single-stage algorithms. The two-stage detection algorithm achieves high precision, but falls short in detection speed, making it inadequate for real-time UAV detection demands. As a result, the single-stage detection algorithm, which offers significant advantages in detection speed, has emerged as a trending research topic. The single-stage detection algorithm is regression-based: unlike the two-stage algorithm, which first finds candidate regions and then identifies targets within them, it removes candidate region generation and directly uses the neural network to obtain the classification and position of the target from the input image. It treats target classification and regression as a single task, achieving high detection precision with real-time capability. As a representative single-stage target detection algorithm, YOLOv5 [6] takes the entire image as the network input and can retrieve the target's location and classification through a single neural network, giving it exceptionally fast detection and robust generalization. Within the scope of this research, YOLOv5 is therefore selected as the target detection algorithm. To address the inadequate precision in detecting small targets within UAV aerial images, the original model is improved by optimizing the prior anchor box, adding small target detection layers, and improving the multi-scale feature fusion structure. Ultimately, comparative experiments validate that the enhanced approach substantially improves the algorithm's precision on small targets, increasing their detection accuracy by up to 10.4%.

2 Introduction to YOLOv5 Target Detection Algorithm

The YOLO series of algorithms [7] has been significantly improved in network structure, detection precision, and detection speed over multiple versions. As one of the best-performing algorithms in the current YOLO series, YOLOv5 maintains a high level of detection speed and accuracy, and its network structure is lightweight and suitable for deployment on mobile terminals and embedded devices. In this paper, YOLOv5s, which has lighter network parameters and higher precision, is selected as the main research network. The detailed network architecture is shown in Fig. 1.

The YOLOv5 model is mainly composed of a backbone network (Backbone), a feature fusion network (Neck), and a detection head (Head). The backbone network is employed to extract pertinent feature information from the image. It is composed of a series of convolutional neural network modules, mainly including the standard convolution module (Conv), the feature extraction module (C3), and the spatial pyramid pooling module (SPPF) [8]. It can fuse local and global features to obtain better feature results. The feature fusion network fuses the feature maps obtained by the downsampling of the backbone network and strengthens the expressive ability of feature maps of different scales, so that the feature maps of different layers carry more detailed contextual information.

300

R. Li et al.


Fig. 1. YOLOv5s structure diagram

The detection head processes the result of feature fusion. It generates the target’s positional and categorical data within the image, and corrects the position of the candidate frame according to the position offset, so as to obtain more accurate detection results.

3 Improved YOLOv5 Target Detection Algorithm

In UAV small target detection there are several difficulties: the targets to be detected are small in scale and hard to focus on; their feature information is not obvious and extraction efficiency is low; and target information is easily lost. By improving the structure of the network, the focus on small targets is strengthened and the model's feature extraction ability for small targets is improved. The main improvements are: optimizing the initial anchor box, adding a small object detection layer, and improving the feature fusion structure.

3.1 Preset Anchor Box Optimization for Small Objects

In UAV target detection, the scale of the target to be detected is small, and it is difficult to accurately focus on it and extract its feature information.


In this section, the image size input to the network is expanded and the k-means++ algorithm [9] is selected to regenerate the preset anchors, which substantially improves the efficacy of small target detection and notably enhances detection precision.

YOLOv5 is an anchor-based target detection algorithm, utilizing anchor boxes as a fundamental component for precise localization and classification of objects. Before detection, some virtual anchor boxes need to be set. The sizes of these boxes are fixed, and there are certain differences between them and the ground truth boxes labeled in the dataset. The YOLOv5 network output is an offset representing the difference between the preset anchor box and the ground truth box. Target positioning is a border regression process: the position of the preset anchor box is fine-tuned through a linear transformation to bring the result closer to the ground truth box. The fine-tuned anchor box is the prediction box. YOLO represents the bounding box of the target in the format (x, y, w, h), where (x, y) is the center coordinate of the bounding box and (w, h) its width and height. Denote the preset anchor box as (xA, yA, wA, hA) and the ground truth box as (xG, yG, wG, hG). The prediction box (xP, yP, wP, hP) can be obtained by fine-tuning the preset anchor box. The relationship between the three boxes is shown in Fig. 2.


Fig. 2. Border diagram
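The fine-tuning from anchor to prediction box can be sketched with the YOLOv5-style decoding (sigmoid-bounded center offsets, squared-sigmoid width/height scaling). The raw offsets below are illustrative values, not real network outputs:

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def decode(tx, ty, tw, th, grid_x, grid_y, anchor_w, anchor_h, stride):
    """Turn raw offsets (tx, ty, tw, th) plus a preset anchor into a prediction box."""
    # center: offset of up to +-0.5 cell around the grid cell, scaled to pixels
    bx = (2.0 * sigmoid(tx) - 0.5 + grid_x) * stride
    by = (2.0 * sigmoid(ty) - 0.5 + grid_y) * stride
    # size: anchor scaled by (2*sigmoid(t))^2, i.e. between 0 and 4 times the anchor
    bw = (2.0 * sigmoid(tw)) ** 2 * anchor_w
    bh = (2.0 * sigmoid(th)) ** 2 * anchor_h
    return bx, by, bw, bh

# zero offsets leave the box at the cell center with the anchor's own size
print(decode(0.0, 0.0, 0.0, 0.0, grid_x=10, grid_y=10, anchor_w=4, anchor_h=6, stride=4))
```

Because the width/height multiplier is bounded by 4, a preset anchor can only be stretched a limited amount, which is why anchor sizes must be matched to the expected target scale.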

For the image input into the network, the detection layer divides the image into S × S grid cells. As observed in Fig. 3, each cell is responsible for detecting a target if the target's center lies within the cell. Among them, the red points are the center points of the grid cells, the red overlapping boxes are the preset anchor boxes of three different sizes, and the blue box is the target's ground truth box.

Within the YOLOv5s backbone network there are five convolutional layers with a stride of 2. These convolutional layers perform 5 downsampling operations on


Fig. 3. Schematic diagram of anchor box and ground truth box

the input feature maps to obtain 5 feature maps of different sizes, which are finally fused into three different detection layers. The three detection layers are designed to focus on targets within the feature maps downsampled by factors of 8, 16, and 32, respectively. Objects on each feature map are predicted by preset anchor boxes. Shallow feature maps, with smaller downsampling factors, possess smaller receptive fields, so smaller anchor boxes are allocated to the shallow feature maps for predicting small-scale objects. Conversely, deep feature maps with larger downsampling factors exhibit larger receptive fields, so larger anchor boxes are assigned to the deep feature maps for predicting large-scale targets.

Prior to training, the YOLOv5 algorithm uniformly resizes images to 640 × 640 before feeding them into the network to facilitate batch training. However, compressing the image also compresses the small targets in it; after multiple downsampling steps, their feature information may be lost, so the model cannot discern the characteristics of small objects. To preserve more information about small objects in the image, the dimensions of the input image must be enlarged. Since the input image size must be a multiple of 32, and considering the video memory limitation of the GPU, the input image is expanded to 960 × 960 and 1280 × 1280 respectively. Table 1 exhibits the training outcomes obtained from images of varying dimensions.

Through the training experiments, it can be seen that the optimal model is obtained when the input image size is 960 × 960, so the images input to the network are uniformly adjusted to 960 × 960. Since the algorithm focuses on the detection of small targets,


Table 1. Training results for images of different sizes

Image Size | Accuracy (mAP@0.5)
640 × 640 | 0.350
960 × 960 | 0.436
1280 × 1280 | 0.352

even considering the change of target scale when UAV is explored in the air, there will be no situation where the target to be detected accounts for an excessively large proportion of the entire boxes. Therefore, the larger size objects appearing in the image can be ignored. Continuing to use the original preset anchor frame, there will be cases where the large-size anchor frame is not used and the precision of the small-size anchor frame is insufficient. Consequently, it becomes imperative to recompute the dimensions of the preset anchor frames, emphasizing their alignment with the detection of small objects. YOLOv5 employs the K-means clustering algorithm [10] to generate preset anchor boxes. By calculating the distance between samples, the close samples are clustered into the same category. During the training process, the anchor boxes are fine-tuned using the genetic algorithm. Since K-means needs to randomly select K points to initialize as the center point of the cluster before clustering, the convergence of the algorithm will be affected by the initialization of the cluster center. Different cluster centers may produce different results, so the K-means++ algorithm is selected to solve this problem. K-means++ builds upon the foundation of K-means and enhances the initialization approach for cluster center optimization. Obtaining the cluster center point through iterative calculation can significantly enhance the clustering efficacy. The algorithm flow of K-means and K-means++ is shown in Fig. 4: 3.2 Add Small Target Detection Layer This section addresses the challenges related to inconspicuous features of small targets, potential loss of target information during sampling, and the inefficiency of the extraction process. Through traditional network expansion and small target detection layer method based on Swin-Transformer, the model has witnessed enhancements in both detection accuracy and feature extraction efficiency. 
The backbone network of YOLOv5 performs five downsampling operations, yielding five feature maps (P1, P2, P3, P4, P5) of decreasing resolution, where Pi denotes the feature map obtained by downsampling the input image by a factor of 2^i. The three detection layers in the original detection head are obtained by fusing the P3, P4, and P5 feature maps [11]. Because shallow feature maps have a low downsampling factor, they preserve more small-target features. Therefore, a new detection layer is introduced on the 4×-downsampled feature map P2, expanding the detection head to four detection layers over four feature maps. With a network input size of 960 × 960, the feature map after 4× downsampling measures 240 × 240. Furthermore, the K-means++ algorithm is employed to compute the predefined anchor boxes on the 240

Fig. 4. Clustering algorithm flowchart: a) K-means clustering algorithm; b) K-means++ clustering algorithm
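The K-means++ seeding step of Fig. 4 can be sketched in a few lines of Python. This is a simplified, hypothetical re-implementation for illustration only: YOLOv5's own autoanchor routine uses a shape-matching metric and genetic evolution rather than plain Euclidean K-means.

```python
import numpy as np

def kmeans_pp_init(boxes, k, rng):
    """K-means++ seeding: pick the first center at random, then pick each
    subsequent center with probability proportional to the squared distance
    to the nearest center already chosen."""
    centers = [boxes[rng.integers(len(boxes))]]
    for _ in range(k - 1):
        d2 = np.min(((boxes[:, None, :] - np.array(centers)[None]) ** 2).sum(-1), axis=1)
        centers.append(boxes[rng.choice(len(boxes), p=d2 / d2.sum())])
    return np.array(centers)

def cluster_anchors(boxes, k=3, iters=50, seed=0):
    """Cluster (w, h) box sizes into k anchor shapes with Lloyd's K-means,
    starting from K-means++ initialization."""
    rng = np.random.default_rng(seed)
    centers = kmeans_pp_init(boxes, k, rng)
    for _ in range(iters):
        d2 = ((boxes[:, None, :] - centers[None]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)           # nearest center per sample
        for j in range(k):
            if np.any(labels == j):
                centers[j] = boxes[labels == j].mean(axis=0)
    return centers[np.argsort(centers.prod(axis=1))]  # sort anchors by area

# toy (w, h) ground-truth box sizes mimicking small aerial targets
boxes = np.array([[4, 6], [5, 5], [5, 12], [6, 11], [10, 9], [11, 10]], dtype=float)
anchors = cluster_anchors(boxes, k=3)
```

Because the far-apart seeding concentrates initial centers in distinct groups, the final anchors land near the natural (w, h) clusters instead of depending on a lucky random draw.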

× 240 feature map, yielding anchor box sizes of (4, 6), (5, 12), and (10, 9). With the added detection layer, the feature map is divided into more cells during detection, and the smaller preset anchor boxes allow better regression and localization of tiny targets within each cell. The four-layer detection head effectively mitigates the adverse effects of variation in target scale. The detection layer introduced on the 4×-downsampled feature map P2 is shown in Fig. 5. With the new detection layer, the model structure requires corresponding adjustments: within the Neck, the feature fusion network, the feature map is iteratively upsampled to expand its dimensions, and each upsampled map is fused with the corresponding feature map from the backbone, aligning channels across the whole network. This approach introduces the small target detection layer by conventional network expansion, that is, by duplicating and stacking existing neural network modules without altering the fundamental structure of the network. To explore further ways of enhancing the detection layer, this paper introduces a second method for incorporating a small target detection layer.

An Improved YOLOv5-Based Small Target Detection Method


Fig. 5. Schematic diagram of small target detection layer structure

Specifically, drawing inspiration from the Transformer-based backbone network Swin-Transformer [12], a small target detection layer based on Swin-Transformer is proposed, aiming to improve the efficiency of feature extraction after the small target detection layer is added. Transformer is a network architecture originally introduced for natural language processing (NLP) tasks; Swin-Transformer, proposed by Microsoft Research, extends the Transformer model to computer vision. By leveraging a hierarchical network structure, Swin-Transformer addresses the multi-scale challenge in image analysis and facilitates effective extraction of multi-scale information from images. Its shifted windows let adjacent windows interact, which reduces computation while attending to both global and local information. Consequently, incorporating the Swin-Transformer feature extraction module into the backbone network strengthens feature extraction after the detection layer is added. The structure of the Swin-Transformer module is depicted in Fig. 6. Within it, LN (LayerNorm) is the normalization operation common in NLP; it corresponds to BN (BatchNorm) in convolutional neural networks and normalizes the feature map entering the module. W-MSA (window-based multi-head self-attention) computes multi-head self-attention within windows, which effectively reduces the computational complexity of feature extraction. SW-MSA (shifted-window multi-head self-attention) uses shifted windows for multi-head self-attention, enabling the exchange of information between windows and facilitating information flow and interaction.
MLP (multilayer perceptron) handles the nonlinear mapping in feature extraction.

3.3 Multi-scale Feature Fusion Structure Optimization

The original YOLOv5 network has no feature fusion structure dedicated to small targets, so small-target information in the image is lost during the fusion process. This section introduces the BiFPN (Bi-directional Feature


R. Li et al.

Fig. 6. Structural diagram of Swin-Transformer module
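The window mechanics behind W-MSA and SW-MSA can be illustrated with a minimal NumPy sketch. This is an illustration only, not the paper's implementation: attention is restricted to non-overlapping M × M windows, and the shifted variant cyclically rolls the feature map by M/2 so that the new windows straddle the old window borders.

```python
import numpy as np

def window_partition(x, M):
    """Split an (H, W, C) feature map into non-overlapping (M, M, C) windows.
    W-MSA then computes self-attention inside each window only."""
    H, W, C = x.shape
    x = x.reshape(H // M, M, W // M, M, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, M, M, C)

def shifted_window_partition(x, M):
    """SW-MSA: cyclically shift the map by M // 2 before partitioning, so the
    new windows cross the old window boundaries and information can flow
    between neighboring windows."""
    s = M // 2
    return window_partition(np.roll(x, shift=(-s, -s), axis=(0, 1)), M)

feat = np.arange(8 * 8 * 1, dtype=float).reshape(8, 8, 1)  # toy 8x8 feature map
wins = window_partition(feat, 4)             # 4 windows of size 4x4
shifted = shifted_window_partition(feat, 4)  # windows over the rolled map
```

Restricting attention to M × M windows reduces the cost from quadratic in H·W to quadratic in M² per window, which is why the module scales to high-resolution detection feature maps.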

Pyramid Network) structure, which enhances the transfer of small-target information during feature fusion. The feature fusion network (Neck) of YOLOv5 combines the Feature Pyramid Network (FPN) [13] with the Path Aggregation Network (PAN) [14], which enhances the model's capacity for multi-scale feature extraction. The FPN + PAN structure treats feature maps of all scales equally and applies the same fusion method to each. Because UAV aerial images vary greatly in scale, and feature maps of different scales carry different information to the detection layers, the BiFPN structure is introduced to better preserve small-target feature information during fusion. BiFPN is a feature fusion technique proposed by Mingxing Tan [15] in the EfficientDet detector. It fuses shallow and deep layers through bidirectional weighted connections: by weighting the feature maps at different scales, it balances the information from each scale and strengthens information transfer between layers of the network. The comparison of the FPN and BiFPN structures is shown in Fig. 7. In the FPN + PAN structure, semantic information from deep feature maps is transferred top-down in FPN, and positional information from shallow feature maps is transferred bottom-up in PAN. BiFPN improves on this basis by performing bidirectional weighted fusion, effectively enhancing information fusion without increasing the computational cost

Fig. 7. Schematic diagram of multi-scale feature fusion structure: a) FPN + PAN; b) BiFPN

too much, and the entire BiFPN module can be stacked repeatedly in deeper networks to extract deeper features. The optimized network structure is expanded in both depth and width. Excessive depth increases the computational load and significantly reduces detection speed, so the depth and width of the network must be adjusted carefully during training to strike a balance. Before training, the alignment between the input and output channels of the network should be checked and any redundant convolutional layers identified. After training, the training time, model parameters, and computational requirements should be evaluated, and modules with heavy parameter and computation costs removed where possible. While keeping the model's complexity within acceptable limits, these efforts improve the accuracy of small target detection.
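The weighted fusion at the heart of BiFPN can be sketched as follows. This is a simplified illustration of EfficientDet's "fast normalized fusion": in the real network the per-input weights are learnable parameters, whereas here they are fixed toy values.

```python
import numpy as np

def fast_normalized_fusion(feats, weights, eps=1e-4):
    """BiFPN-style weighted fusion: each input feature map gets a
    non-negative weight, and the weights are normalized by their sum so
    the fused output stays in the same range as the inputs."""
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # keep weights >= 0
    w = w / (w.sum() + eps)                                # normalize
    return sum(wi * f for wi, f in zip(w, feats))

p4_in = np.full((4, 4), 6.0)  # toy same-level backbone feature
p4_td = np.full((4, 4), 2.0)  # toy top-down intermediate feature
fused = fast_normalized_fusion([p4_in, p4_td], weights=[1.0, 3.0])
```

Compared with plain addition, the normalized weights let the network learn how much each scale should contribute at each fusion node, which is the mechanism the text credits for preserving small-target information.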

4 Experimental Results and Analysis

4.1 Experimental Environment

To assess the effectiveness of the improvement methods and determine the optimal approach for enhancing the model's performance, the model is trained and tested on the VisDrone dataset [16] under consistent conditions on a server. Table 2 presents the experimental environment used for the experiments. Figure 8 illustrates the distribution of instances for each object category within the dataset.

Table 2. Experimental environment configuration table

Name                       Specifications
CPU                        Intel Xeon Platinum 8260C
GPU                        Nvidia RTX 3090
Operating System           Windows 10.0
Scripting Language         Python
Deep Learning Framework    PyTorch

Fig. 8. VisDrone dataset instance distribution map

Table 3. Experimental results after network structure improvement

Model          mAP@0.5   APL     APS
YOLOv5s        0.436     0.777   0.502
K-YOLOv5s      0.448     0.828   0.536
4DET-YOLOv5s   0.414     0.753   0.554
STF-YOLOv5s    0.427     0.817   0.539
B-YOLOv5s      0.458     0.834   0.531

4.2 Network Structure Improvement Experiment Results

Table 3 presents the experimental results obtained after enhancing the network structure.


Among them, K-YOLOv5s is the model after recalculating the preset anchor boxes. 4DET-YOLOv5s is the model obtained by adding the small target detection layer through network stacking. STF-YOLOv5s denotes the model with the Swin-Transformer-based small target detection layer. B-YOLOv5s is the model whose feature fusion structure is improved to BiFPN. The mAP curves of the different improved methods are shown in Fig. 9:

Fig. 9. mAP curve after improving the network structure

Based on the data in Table 3 and Fig. 9, recomputing the preset anchor boxes with the K-means++ algorithm yields a relative improvement of 2.8% in mAP@0.5; the detection accuracy of large objects improves by 6.6% and that of small objects by 6.8%. After re-optimizing the anchor boxes, preprocessing time is saved and the model converges quickly in the first few rounds of training. Expanding the original network to four detection layers decreases mAP@0.5 by 5%: large-target accuracy drops by 3.0%, while small-target accuracy increases substantially, by 10.4%. Some decline in overall mAP is expected when the detection layer is added, because additional capacity is allocated to detecting small targets. This markedly improves the detection of small targets but reduces accuracy on large targets, which in turn lowers the model's overall average accuracy. To mitigate this while maintaining a certain level of average accuracy, this paper integrates the Swin-Transformer detection layer. With it, mAP@0.5 decreases only slightly, by 2.1%, while large-target accuracy improves by 5.1% and small-target accuracy by 7.4%. The average accuracy therefore declines less, balancing the challenges posed


by small objects and model complexity. Moreover, after adding the Swin-Transformer detection layer, the model converges stably after about 100 rounds of training, and its stability is better than that of plain network expansion. After changing the feature fusion structure to BiFPN, mAP@0.5 increases by 5%, large-target accuracy by 7.3%, and small-target accuracy by 5.8%. Improving the feature fusion structure thus proves the more effective way to improve the model's performance. Detection results of the improved model on test-set images are presented in Fig. 10.

Fig. 10. Algorithm improvement renderings

From the detection effect diagram shown in Fig. 10, it is evident that the improvement made to the YOLOv5s model has led to a significant enhancement in the detection performance of small distant targets in the images.

5 Conclusion

This paper addressed the low detection accuracy of small targets in UAV aerial images. The YOLOv5s algorithm was chosen as the base detector, and several improvement approaches were explored. First, the preset anchor boxes for small objects were optimized by enlarging the image size


input to the network and using the K-means++ algorithm. Subsequently, a small target detection layer was added, including a new detection layer based on Swin-Transformer. Finally, the multi-scale feature fusion structure was optimized by incorporating the BiFPN structure into the original network architecture. Experimental results demonstrate that the proposed improvements significantly enhance the detection of small targets in UAV aerial images, increasing small-target detection accuracy by up to 10.4%.

Acknowledgments. This work was supported by Jiangsu Funding Program for Excellent Postdoctoral Talent under Grant 2022ZB265; National Natural Science Foundation of China under Grant 62203222; National Defense Basic Scientific Research Program of China under Grant JCKY2021606B002.

References
1. Zhang, Z.: Research progress of vision based aerospace conflict sensing technologies for small unmanned aerial vehicle in low altitude. Acta Aeronautica et Astronautica Sinica 43(8), 191–214 (2022)
2. Zhao, S.: Vehicle detection in satellite imagery based on deep learning. J. Comput. Appl. 39(S2), 91–96 (2019)
3. Liu, W.: Detection of multiclass objects in optical remote sensing images. IEEE Geosci. Remote Sens. Lett. 16(5), 791–795 (2019)
4. Yuan, H.: A sample update-based convolutional neural network framework for object detection in large-area remote sensing images. IEEE Geosci. Remote Sens. Lett. 16(6), 947–951 (2019)
5. Xin, R.: Complex network classification with convolutional neural network. Tsinghua Sci. Technol. 25(4), 447–457 (2020)
6. Zhu, X.: Improved YOLOv5 based on transformer prediction head for object detection on drone-captured scenarios. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2778–2788. IEEE, Montreal, Canada (2021)
7. Qian, Z.: Automatic polyp detection by combining conditional generative adversarial network and modified you-only-look-once. IEEE Sens. J. 22(11), 10841–10849 (2022)
8. Wan, S.: Deep convolutional-neural-network-based channel attention for single image dynamic scene blind deblurring. IEEE Trans. Circuits Syst. Video Technol. 31(8), 2994–3009 (2021)
9. Hoang, T.M.: Detection of eavesdropping attack in UAV-aided wireless systems: unsupervised learning with one-class SVM and K-means clustering. IEEE Wireless Commun. Lett. 9(2), 139–142 (2020)
10. Mohammed, J.: Deep clustering: on the link between discriminative models and k-means. IEEE Trans. Pattern Anal. Mach. Intell. 43(6), 1887–1896 (2021)
11. Yingna, S.: An illumination-invariant nonparametric model for urban road detection. IEEE Trans. Intell. Veh. 4(1), 14–23 (2019)
12. Liu, Z.: Swin transformer: hierarchical vision transformer using shifted windows. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012–10022. IEEE, Montreal, Canada (2021)
13. Song, Z.: MSFYOLO: feature fusion-based detection for small objects. IEEE Latin Am. Trans. 20(5), 823–830 (2022)
14. Hu, J.F.: APANet: auto-path aggregation for future instance segmentation prediction. IEEE Trans. Pattern Anal. Mach. Intell. 44(7), 3386–3403 (2022)


15. Tan, M.: EfficientDet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790. IEEE, Seattle, USA (2020)
16. Du, D.: VisDrone-DET2020: the vision meets drone object detection in image challenge results. In: Bartoli, A., Fusiello, A. (eds.) ECCV 2020. LNCS, vol. 12538, pp. 692–712. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-66823-5_42

A Federated Learning Method with DNN and 1DCNN Feature Fusion for Multiple Working Conditions Fault Diagnosis

Zhiqiang Zhang1, Danmin Chen2, and Funa Zhou1(B)

1 School of Logistic Engineering, Shanghai Maritime University, Shanghai, China

[email protected], [email protected]
2 School of Computer and Artificial Intelligence, Henan Finance University, Zhengzhou, China

Abstract. Under multiple working conditions, the sample size of each client is small and it is not easy to obtain data from other working conditions, making it difficult to establish an effective deep learning fault diagnosis model. Federated learning is a distributed training method that can accomplish collaborative training of multiple clients without directly sharing client data. This paper proposes a federated learning method with DNN and 1DCNN feature fusion. This method designs a feature fusion network based on 1DCNN and DNN, and establishes a federated learning architecture for the feature fusion network in order to better extract fault features and improve the accuracy of multiple working conditions fault diagnosis. The efficiency of the proposed method is demonstrated by utilizing the Case Western Reserve University bearing data set. Keywords: Fault Diagnosis · Federated Learning · 1DCNN · DNN

1 Introduction

The healthy and stable operation of the key components of electromechanical equipment guarantees high-quality and efficient production in intelligent manufacturing [1, 2]. Research on intelligent fault diagnosis methods is important technical support for improving the safety of electromechanical equipment. With deep learning, the latent features in equipment operating-status data can be fully exploited, overcoming the lack of analytical models and expert experience. When equipment operates under multiple working conditions, a single client finds it difficult to obtain data from all the conditions, and its sample size is insufficient, making it difficult to establish an effective deep learning fault diagnosis model on its own. Centralized training on data from multiple clients can address the insufficient diagnosis accuracy of single-client deep learning models. However, data from different clients often belongs to different entities, and because of interest competition and data privacy concerns, these entities are often unwilling to share data, resulting in data islands [3]. Therefore, while ensuring the data privacy of the various clients,

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 313–321, 2023. https://doi.org/10.1007/978-981-99-6187-0_31


Z. Zhang et al.

it is of great engineering significance to comprehensively utilize the data of multiple clients to achieve accurate fault diagnosis under multiple working conditions. Federated learning is a distributed machine learning technique that completes the joint training of multiple clients without directly sharing client data, thereby improving the fault diagnosis performance of each client. However, the effectiveness of the client models is an important factor affecting the performance of federated-learning fault diagnosis under multiple working conditions. This paper proposes a federated learning method with DNN and 1DCNN feature fusion under multiple working conditions. Firstly, each client establishes a feature fusion model based on DNN and 1DCNN. Then, through federated learning, collaborative training of multiple clients is carried out under the average aggregation strategy. Finally, each client gains benefits from the federated model, improving fault diagnosis performance under multiple working conditions. The main contributions of this paper are summarized as follows:
• This paper proposes a feature fusion network based on DNN and 1DCNN (FDC) to improve the accuracy of the single-client model. DNN extracts the global features of faults, 1DCNN extracts the local features, and the two are fused in a feature fusion network to improve single-client fault diagnosis accuracy.
• This paper proposes a federated learning method with DNN and 1DCNN feature fusion (FedDC) for multiple working conditions fault diagnosis. The method achieves joint optimization by combining model information from multiple clients to further improve the accuracy of fault diagnosis.
• When different clients work under different working conditions, the proposed method still achieves satisfactory fault diagnosis accuracy.

2 Related Work

2.1 Fault Diagnosis Method Based on DNN

Deep neural networks (DNN) can extract fault features from raw data through complex nonlinear functions, so they are widely used in fault diagnosis [4–8]. DNN effectively extracts the global features of faults but cannot effectively represent the local ones. It can be combined with other deep network models to extract fault features at a deeper level and thereby improve diagnosis accuracy.

2.2 Fault Diagnosis Method Based on 1DCNN

A convolutional neural network is a hierarchical model composed mainly of an input layer, convolution layers, pooling layers, fully connected layers, and an output layer. Through a series of operations such as convolution, pooling, and nonlinear activation mapping, high-level semantic information is extracted layer by layer. Most measurement data for mechanical fault diagnosis is only related to time, which is a


one-dimensional parameter. Therefore, for fault diagnosis based on convolutional neural networks, a one-dimensional convolutional structure (1DCNN) is a suitable choice [9–12]. Although 1DCNN can effectively characterize the local features of faults, each client's working conditions differ and its sample size is limited. It is therefore particularly important to comprehensively utilize the data of multiple clients while ensuring each client's data privacy.

2.3 Fault Diagnosis Method Based on Federated Learning

Federated learning aims to establish a learning model over distributed datasets. The trained federated model can be shared and deployed among the data participants without exposing any protected private part of each participant's data. As an emerging artificial intelligence technology, federated learning has therefore received widespread attention from experts in the field of fault diagnosis [13–15]. The works above focus mainly on the federated aggregation strategy and on model overfitting, which can speed up training and improve diagnosis accuracy. However, once the federated aggregation strategy is fixed, it is crucial for clients to design a deep learning model that characterizes the features of faults well and further improves diagnosis accuracy.

3 Federated Learning Method with DNN and 1DCNN Feature Fusion

This paper proposes a federated learning method with DNN and 1DCNN feature fusion under multiple working conditions. This section details the FedDC algorithm, which is divided into an offline training stage and an online diagnosis stage. The specific steps are as follows:

Step 1: Establish DNN and 1DCNN for each client. Each client builds and trains a DNN and a 1DCNN. Taking client k as an example, build DNNk and 1DCNNk, then train them to obtain the DNN features FeatureDNN,k and the 1DCNN features Feature1DCNN,k, respectively.

Step 2: Build a feature fusion model. Build a feature fusion model (FDCk) based on DNN and 1DCNN. FDC comprises a DNN, a 1DCNN, and a feedforward neural network. The features extracted by DNN and 1DCNN are fused as the input of the feedforward network, as shown in formulas (1) and (2). The FDC parameters are fine-tuned with the supervised back-propagation algorithm to obtain the fused model parameters, as shown in formula (3). The network parameters of FDC include the DNN parameters, 1DCNN parameters, and feedforward parameters; θc,k denotes the parameters of FDCk after training. The network structure of FDC is shown in Fig. 1.

Featurek = [FeatureDNN,k; Feature1DCNN,k]    (1)


Fig. 1. FDC schematic diagram

Feedforwardk = Feedforward(Featurek)    (2)

(FDCk, θc,k) = train(FDCk(DNNk, 1DCNNk, Feedforwardk))    (3)
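Formulas (1) and (2) amount to concatenating the two feature vectors and passing the result through a feedforward layer. A toy sketch follows; the feature values, layer size, and weights are placeholders for illustration, not the paper's trained network.

```python
import numpy as np

def fuse_features(feat_dnn, feat_cnn):
    """Formula (1): concatenate the global (DNN) and local (1DCNN) feature
    vectors into a single fused feature."""
    return np.concatenate([feat_dnn, feat_cnn])

def feedforward(feature, W, b):
    """Formula (2), reduced to one dense layer with ReLU; the paper's
    feedforward network is trained end-to-end with the rest of FDC."""
    return np.maximum(W @ feature + b, 0.0)

f_dnn = np.array([0.2, 0.8])         # toy DNN (global) features
f_cnn = np.array([0.5, 0.1, 0.4])    # toy 1DCNN (local) features
fused = fuse_features(f_dnn, f_cnn)  # 5-dimensional fused feature
out = feedforward(fused, np.eye(5), np.zeros(5))
```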

Step 3: Upload and distribute model parameters. Each client uploads its model parameters to the federated center, which computes the aggregated parameters through the average aggregation strategy of formula (4). In formula (4), n is the total number of samples, nk the number of samples of the k-th client (k = 1, 2, …, K), θc,k the FDC network parameters uploaded by the k-th client, and θc,n the parameters computed by the federated center's average aggregation. The federated center then distributes θc,n to each client. The interaction between the clients and the federated center is shown in Fig. 2.

θc,n = Σ(k=1..K) (nk / n) θc,k    (4)
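Formula (4) is the standard sample-weighted average (FedAvg-style) aggregation. A minimal sketch, with toy parameter dictionaries standing in for the clients' FDC weights:

```python
import numpy as np

def federated_average(client_params, client_sizes):
    """Formula (4): the center weights each client's parameters theta_{c,k}
    by its sample share n_k / n and sums them."""
    n = sum(client_sizes)
    agg = {}
    for name in client_params[0]:
        agg[name] = sum((nk / n) * params[name]
                        for params, nk in zip(client_params, client_sizes))
    return agg

# two toy clients, each with a single parameter tensor "w"
clients = [{"w": np.array([1.0, 1.0])}, {"w": np.array([3.0, 5.0])}]
theta = federated_average(clients, client_sizes=[100, 300])
```

Here the second client holds 3/4 of the samples, so the aggregate sits 3/4 of the way toward its parameters: theta["w"] is [2.5, 4.0].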

Step 4: Return to Step 1. Each client receives the model parameters issued by the federated center and uses them as initialization to train its local FDC model. In this way, the local clients alternately train and optimize the model parameters through several rounds of upload, download, and update, until the most accurate fault diagnosis model is reached.

Step 5: Online fault diagnosis. Using XFDC,online as the input of the FDC fusion network, real-time online fault diagnosis is performed through the trained deep fusion network FDC. In formula (5), result(t) is the fault diagnosis result for the online data XFDC,online(t) at time t. The algorithm flow is shown in Fig. 3.

result(t) = argmax(j=1,2,…,J) {p(label(t) = j | XFDC,online(t); θc,k)}    (5)
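Formula (5) simply selects the class with the largest posterior probability. A minimal sketch with toy logits (the four-class order below is an assumption for illustration):

```python
import numpy as np

def diagnose(logits):
    """Formula (5): softmax turns the network's output scores into
    p(label = j), and the predicted fault class is the argmax."""
    p = np.exp(logits - logits.max())  # subtract max for numerical stability
    p = p / p.sum()
    return int(np.argmax(p)), p

# toy logits for 4 bearing states: normal, inner ring, outer ring, ball
label, probs = diagnose(np.array([0.1, 2.3, 0.4, -1.0]))
```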


Fig. 2. Framework of FedDC

Fig. 3. Flow chart of FedDC based fault diagnosis method

4 Experiments and Results

4.1 Datasets

This paper uses the bearing health-status test data provided by the Case Western Reserve University Bearing Data Center [16] as the experimental data set. To better represent multiple working conditions, each client has a different load and speed,


as shown in Table 1. The experimental data are monitoring data from the motor drive end under different working conditions, sampled at 12 kHz. The bearing states are the normal state, inner-ring fault, outer-ring fault at the 6 o'clock position, and ball fault, each with a fault size of 0.007 inches. The sliding-window length in this experiment is set to 900, so each sample has dimension 900, and the window step size is 20.

Table 1. Working condition of each client

Client Name    Load    Speed
Client 1       3 hp    1730 RPM
Client 2       2 hp    1750 RPM
Client 3       1 hp    1772 RPM
Client 4       0 hp    1797 RPM
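The sliding-window segmentation described above (window length 900, step 20) can be sketched as follows, on a toy record standing in for a CWRU vibration signal:

```python
import numpy as np

def segment(signal, length=900, step=20):
    """Cut a 1-D vibration record into overlapping training samples:
    window length 900 and stride 20, as in the paper's preprocessing."""
    n = (len(signal) - length) // step + 1
    return np.stack([signal[i * step:i * step + length] for i in range(n)])

record = np.arange(1200, dtype=float)  # toy record of 1200 points
samples = segment(record)              # each row is one 900-point sample
```

The small stride relative to the window length is what multiplies the number of training samples per client, partially compensating for the limited data of each working condition.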

4.2 Network Setup and Methodology Comparison

In this study, the proposed FedDC approach is examined and compared with three other approaches:
• DNN: a stacked-autoencoder DNN without federated learning.
• FedDNN: each client is a stacked autoencoder; the clients cooperatively construct a global model through the federated-learning average aggregation strategy.
• FDC: the feature fusion model of DNN and 1DCNN without federated learning.
The numbers of neurons in the hidden layers are the same in FedDC and in the DNN models of the comparison methods: 500, 100, and 138, respectively. The two convolution layers of 1DCNN have 16 and 32 channels, respectively.

4.3 Results Analysis and Discussion

Experiment 1 trains on 100 samples of each type; the fault diagnosis results are shown in Table 2. FedDC achieves the highest fault diagnosis accuracy for every client, reaching 96.25%–97.06%. DNN has the lowest accuracy, and the accuracy of FedDC is at least 36.12% higher than that of DNN. The accuracy of FDC is higher than that of FedDNN, indicating that the feature fusion network based on DNN and 1DCNN extracts fault features better and improves diagnosis accuracy. The accuracy of FedDC is higher than that of FDC, and that of FedDNN higher than that of DNN, indicating that federated learning significantly improves the effectiveness of fault diagnosis.


Experiment 2 trains on 200 samples of each type; the fault diagnosis results are shown in Table 3. As the number of training samples increases, each model's accuracy for every client in Table 3 is higher than in Table 2. The results mirror Experiment 1: FedDC has the highest fault diagnosis accuracy, FDC second, FedDNN third, and DNN the lowest.

Table 2. Fault diagnosis accuracy of each model in Experiment 1

           DNN       FedDNN    FDC       FedDC
Client 1   55.37%    72.00%    77.81%    96.25%
Client 2   60.81%    70.87%    83.87%    96.93%
Client 3   59.06%    63.18%    81.62%    97.06%
Client 4   57.43%    72.62%    80.43%    96.25%

Table 3. Fault diagnosis accuracy of each model in Experiment 2

           DNN       FedDNN    FDC       FedDC
Client 1   62.68%    81.87%    82.18%    97.93%
Client 2   62.18%    71.37%    85.81%    98.00%
Client 3   66.62%    78.87%    90.62%    98.25%
Client 4   58.93%    78.43%    86.12%    97.87%

The experimental results show that, within each client, the feature fusion model based on DNN and 1DCNN extracts fault features better and significantly improves diagnosis accuracy. To fully utilize multi-working-condition data, the FedDC method proposed in this paper further improves the accuracy of fault diagnosis.

5 Conclusion

This paper proposes a federated learning method with DNN and 1DCNN feature fusion. Firstly, a feature fusion model based on DNN and 1DCNN is established for each client to better extract fault features. Secondly, federated learning exploits the working-condition information of other clients: collaborative training over multiple clients is conducted under the average aggregation strategy. Finally, each client gains benefits from the federated model, improving fault diagnosis performance under multiple working conditions. Experimental verification shows that the proposed method achieves high fault diagnosis accuracy. Although FedDC achieves excellent accuracy, much research remains. This paper assumes that different


clients have the same fault categories. In practice, the fault categories may differ across clients, and the data of each client may be imbalanced. We will address these issues in future work.

Acknowledgments. This work was supported by the National Natural Science Foundation of China (62073213), the Henan Province Science and Technology Research Project (232102220022), and the Shanghai Maritime University Graduate Student Training Program for Top Innovative Talents (2022YBR016).

References

1. Adamsab, K.: Machine learning algorithms for rotating machinery bearing fault diagnostics. Mater. Today Proc. 44, 4931–4933 (2021)
2. Li, D., Deng, R., Zou, Z., Huang, B., Fengshou, G.: A review of fault diagnosis methods for marine electric propulsion system. In: Zhang, H., Feng, G., Wang, H., Fengshou, G., Sinha, J.K. (eds.) Proceedings of IncoME-VI and TEPEN 2021: Performance Engineering and Maintenance Engineering, pp. 971–984. Springer International Publishing, Cham (2023). https://doi.org/10.1007/978-3-030-99075-6_78
3. Li, X., Huang, K., Yang, W., et al.: On the convergence of FedAvg on non-IID data. In: International Conference on Learning Representations, pp. 1–26 (2020)
4. Wang, F.T., Dun, B.S., Deng, G., et al.: A deep neural network based on kernel function and auto-encoder for bearing fault diagnosis. In: IEEE International Instrumentation and Measurement Technology Conference, Houston, TX, USA (2018)
5. Zhao, H.S., Liu, H.H., Hu, W.J., et al.: Anomaly detection and fault analysis of wind turbine components based on deep learning network. Renew. Energy 127, 825–834 (2018)
6. Meng, Z., Guo, X.L., Pan, Z.Z., et al.: Data segmentation and augmentation methods based on raw data using deep neural networks approach for rotating machinery fault diagnosis. IEEE Access 7, 79510–79522 (2019)
7. Shao, H.D., Jiang, H.K., Zhao, K., et al.: A novel tracking deep wavelet auto-encoder method for intelligent fault diagnosis of electric locomotive bearings. Mech. Syst. Signal Process. 110, 193–209 (2018)
8. Shao, H.D., Jiang, H.K., Li, X.Q., et al.: Intelligent fault diagnosis of rolling bearing using deep wavelet auto-encoder with extreme learning machine. Knowl. Based Syst. 140, 1–14 (2018)
9. Ma, S.J., Cai, W., Liu, W.K., et al.: A lighted deep convolutional neural network based fault diagnosis of rotating machinery. Sensors 19(10), 2381 (2019)
10. Wang, D.C., Guo, Q.W., Song, Y., et al.: Application of multiscale learning neural network based on CNN in bearing fault diagnosis. J. Signal Process. Syst. 91(10), 1205–1217 (2019)
11. Liu, Z.L., Wang, H., Liu, J.J., et al.: Multitask learning based on lightweight 1DCNN for fault diagnosis of wheelset bearings. IEEE Trans. Instrum. Meas. 70, 1–11 (2021)
12. Wang, Q.Y., Cao, D., Zhang, S.Y., et al.: The cable fault diagnosis for XLPE cable based on 1DCNNs-BiLSTM network. J. Control. Sci. Eng. 2023, 1068078 (2023)
13. Yang, Q., Liu, Y., Chen, T., et al.: Federated machine learning: concept and applications. ACM Trans. Intell. Syst. Technol. 10(2), 1–19 (2019)
14. Wang, Y.X., Yan, J., Yang, Z., et al.: A novel federated transfer learning framework for intelligent diagnosis of insulation defects in gas-insulated switchgear. IEEE Trans. Instrum. Meas. 71, 1–11 (2022)


15. Huang, G.Y., Lee, C.H.: Federated learning architecture for bearing fault diagnosis. In: Proceedings of 2021 International Conference on System Science and Engineering (ICSSE), pp. 408–411. IEEE (2021)
16. Bearing Data Set of Case Western Reserve University. http://csegroups.case.edu/bearingdatacenter/home

Backstepping Nonsingular Fast Terminal Sliding Mode Control for Manipulators Driven by PMSM with Measurement Noise

Xunkai Gao1,2, Haisheng Yu1,2(B), Xiangxiang Meng1,2, and Qing Yang1,2

1 College of Automation, Qingdao University, Qingdao, China
[email protected]
2 Shandong Province Key Laboratory of Industrial Control Technology, Qingdao University, Qingdao, China

Abstract. Backstepping nonsingular fast terminal sliding mode control (SMC) is used to realize accurate position tracking control of manipulators driven by permanent magnet synchronous motors. In this article, the manipulators and their drive motors are regarded as a whole system. A high-gain extended state observer (ESO) is utilized to compensate for the lumped disturbance and modeling error. Because the high-gain ESO is sensitive to measurement noise, an extended Kalman filter (EKF) is combined with it. Finally, Lyapunov theory is used to prove the stability of the whole system, and simulation verifies the performance and effectiveness of the proposed method.

Keywords: Position tracking control · Extended state observer · Extended Kalman filter · Manipulator control · Terminal sliding mode

1 Introduction

Manipulators driven by permanent magnet synchronous motors (PMSM) are widely used in industrial production. PMSM has outstanding advantages such as simple structure and powerful performance. Some scholars consider only the dynamic control of the robot body, without simultaneously considering the driving motor [1–4]. Because PMSM has prominent advantages such as maintenance-free operation, high power factor, high efficiency, low noise and ample starting torque, researchers prefer to take it as the research object [5–7]. However, in practical industrial control systems, disturbance in the working environment, modeling error and measurement noise are inevitable, and they have a great impact on the control effect. When dealing with this problem, the sliding mode control (SMC) method is often used because of its strong robustness [8]. However, conventional SMC cannot converge in finite time and has an obvious chattering problem.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 322–330, 2023. https://doi.org/10.1007/978-981-99-6187-0_32

The nonsingular fast terminal


SMC can converge in finite time, and chattering can be suppressed by disturbance compensation and the design of the reaching law. There are many methods to observe the lumped disturbance, such as the disturbance observer and the extended state observer (ESO). ESO can observe not only the system state but also the lumped disturbance as an extended state [9]. ESO needs high gain to realize quick convergence, but high gain makes it sensitive to measurement noise. To address this, a structure combining ESO and a Kalman filter has been proposed, but it cannot be used in nonlinear systems [10]. In this paper, the manipulator and the PMSMs driving it are considered as a whole system. For the nonlinear robot system, the extended Kalman filter (EKF) is used to replace the ordinary Kalman filter. A structure combining EKF and ESO is thus proposed to solve the problems of lumped disturbance, modeling error, measurement noise and the high-gain ESO's sensitivity to measurement noise. To improve the robustness of the system, backstepping nonsingular fast terminal SMC is proposed to achieve accurate and fast position tracking control of the manipulator driven by PMSM.

2 Preliminaries and Problem Statement

Consider the Euler-Lagrange equation of an n-joint robotic manipulator described as

D(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) = \tau - \tau_f - \tau_L    (1)

where q, \dot{q}, \ddot{q} \in R^n denote the joint angle, velocity and acceleration, respectively. D(q) \in R^{n \times n} is the inertia matrix, C(q,\dot{q}) \in R^{n \times n} is the matrix of centripetal and Coriolis forces, and G(q) \in R^n is the gravity vector. \tau \in R^n, \tau_f = R_f\dot{q} + F_c\,\mathrm{sgn}(\dot{q}) and \tau_L = J^T(q)F_e are the vector of joint input control torque, the vector of friction torque and the vector of load torque, respectively. R_f \in R^{n \times n} is the viscous friction coefficient matrix, F_c \in R^n is the vector of Coulomb force, J(q) \in R^{n \times n} is the Jacobian matrix, and F_e \in R^n is the vector of load force.

The model of n PMSMs driving the manipulators can be expressed as

\begin{cases}
L_d \frac{di_d}{dt} = u_d - R_s i_d + n_p \Omega L_q i_q \\
L_q \frac{di_q}{dt} = u_q - R_s i_q - n_p \Omega L_d i_d - n_p \omega \Phi \\
J_m \frac{d\omega}{dt} = \frac{3}{2} n_p \left[(L_d - L_q) I_d i_q + \Phi i_q\right] - \tau_{mL} - R_m \omega \\
\frac{d\theta}{dt} = \omega
\end{cases}    (2)

where I_d = diag{i_{d1}, i_{d2}, ..., i_{dn}} and \Omega = diag{\omega_1, \omega_2, ..., \omega_n}. \theta \in R^n is the rotor angle position vector, and \omega \in R^n is the rotor angle velocity vector. i_d, i_q \in R^n are the stator current vectors of the d-q axes, and u_d, u_q \in R^n are the stator voltages of the d and q axes, respectively. L_d and L_q are the stator inductance diagonal matrices of the d-q axes, J_m is the inertia diagonal matrix, R_s is the stator armature resistance diagonal matrix, and \Phi is the magnetic flux diagonal matrix produced by the rotor permanent magnets. n_p is the matrix of pole pairs, \tau_{mL} \in R^n is the load torque, and R_m \in R^{n \times n} is the friction coefficient matrix of the PMSM.

Combining (1) and (2), the model of the n-joint robotic manipulator system driven by PMSM, including modeling errors, can be expressed as

\begin{cases}
\bar{D}(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) = \gamma^{-1}\frac{3}{2} n_p \Phi i_q - \xi - E_1 \\
L_d \frac{di_d}{dt} = u_d - R_s i_d + \gamma^{-1} n_p \Omega_1 L_q i_q - E_2 \\
L_q \frac{di_q}{dt} = u_q - R_s i_q - \gamma^{-1} n_p \Omega_1 L_d i_d - \gamma^{-1} n_p \Phi \dot{q} - E_3
\end{cases}    (3)

where \Omega_1 = diag{\dot{q}_1, \dot{q}_2, ..., \dot{q}_n} and \bar{D}(q) = D(q) + \gamma^{-2} J_m, with \gamma the reduction ratio diagonal matrix of the joints, satisfying \tau_{mL} = \gamma\tau and \theta = \gamma^{-1} q. \xi = \gamma^{-2} R_m \dot{q} + \tau_L + \tau_f is the lumped disturbance, and E_1, E_2 and E_3 are modeling errors.
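The single-motor (scalar) form of model (2) can be illustrated with one forward-Euler integration step. This is only a sketch: the default parameter values are the diagonal entries used later in the simulation section, and the function is ours, not the authors' code.

```python
def pmsm_step(state, u_d, u_q, tau_L, dt,
              L_d=0.0085, L_q=0.0085, R_s=2.875, Phi=0.175,
              n_p=4, J_m=0.0025, R_m=0.2):
    """One forward-Euler step of the scalar d-q PMSM model (2).

    state = (i_d, i_q, omega, theta); returns the updated state tuple.
    """
    i_d, i_q, omega, theta = state
    # d-q current dynamics with cross-coupling and back-EMF terms.
    di_d = (u_d - R_s * i_d + n_p * omega * L_q * i_q) / L_d
    di_q = (u_q - R_s * i_q - n_p * omega * L_d * i_d - n_p * omega * Phi) / L_q
    # Mechanical dynamics: electromagnetic torque minus load and friction.
    domega = (1.5 * n_p * ((L_d - L_q) * i_d * i_q + Phi * i_q)
              - tau_L - R_m * omega) / J_m
    return (i_d + dt * di_d, i_q + dt * di_q,
            omega + dt * domega, theta + dt * omega)
```

Repeated calls with a small dt integrate the motor dynamics; for the surface-mounted case L_d = L_q, the torque term reduces to (3/2) n_p Phi i_q.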

3 Composite Structure Design of ESO and EKF

For the problem of measurement noise in the nonlinear manipulator system, the EKF is designed to filter the measured value, and the filtered signal is used for the ESO estimation. At the same time, the ESO estimates the disturbances in the system and compensates for the unknown terms in the EKF. Finally, the estimated state signal output by the ESO is used to design the controller.

Let the states of the system be x_1 = q, x_2 = \dot{q}, x_3 = i_d and x_4 = i_q. Then the system of the n-joint manipulator driven by PMSM is as follows

\begin{cases}
\dot{x}_1 = x_2 \\
\dot{x}_2 = \bar{D}^{-1}\left[\gamma^{-1}\frac{3}{2} n_p \Phi i_q^* - \xi - C(x_1,x_2)x_2 - G(x_1) - E_1\right] \\
\dot{x}_3 = L_d^{-1}\left[u_d - R_s x_3 + \gamma^{-1} n_p \Omega_1 L_q x_4 - E_2\right] \\
\dot{x}_4 = L_q^{-1}\left[u_q - R_s x_4 - \gamma^{-1} n_p \Omega_1 L_d x_3 - \gamma^{-1} n_p \Phi x_2 - E_3\right] \\
y = C_m x_m + v
\end{cases}    (4)

where i_q^*, u_d and u_q are the control inputs of the system, y is the measurement output, C_m is the measurable output matrix, x_m = [x_1^T, x_2^T]^T, and v is the measurement noise.

3.1 Design of Extended Kalman Filter

Aiming at the noise generated by robot position measurement, a Kalman filter is used to reduce the influence of measurement noise [11]. Because the robotic manipulator system is nonlinear, the extended Kalman filter is a good choice. Discretizing system (4) yields

\begin{cases}
x_1(k) = x_1(k-1) + T_s x_2(k-1) \\
x_2(k) = x_2(k-1) + T_s \bar{D}^{-1}\left[\gamma^{-1}\frac{3}{2} n_p \Phi i_q^*(k-1) - \xi - C_{k-1}(x_1,x_2)x_2(k-1) - G_{k-1}(x_1)\right]
\end{cases}    (5)


where T_s is the sample time. The observation equation is written as

y(k) = C_m x_m(k-1) + v    (6)

Then the extended Kalman filter is designed as

\hat{x}(k) = \hat{x}^-(k) + K_k\left[y(k) - C_m \hat{x}^-(k)\right]    (7)

where \hat{x}^-(k) is the prior estimate of x(k) and K_k is the Kalman gain, which satisfies

K_k = P(k/k-1)\,C_m^T\left[C_m P(k/k-1)\,C_m^T + R\right]^{-1}    (8)

The prior covariance estimate and the posterior covariance estimate are, respectively,

P(k/k-1) = A\,P(k-1/k-1)\,A^T + Q    (9)

P(k/k) = (I - K_k C_m)\,P(k/k-1)    (10)

where A is the Jacobian matrix of the robot system (5).
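The predict/update cycle of (7)–(10) can be illustrated with a minimal scalar sketch. The transition function f, its derivative, and all default noise parameters below are generic placeholders, not the paper's manipulator model.

```python
def ekf_scalar(x_hat, P, z, f, F_jac, h=lambda x: x, H=1.0, Q=1e-4, R=1e-2):
    """One predict/update cycle of a scalar extended Kalman filter.

    f:     state transition x_k = f(x_{k-1}); F_jac: its derivative at x_hat.
    h, H:  (linear) measurement model, as in y = C_m x + v.
    Q, R:  process and measurement noise variances.
    """
    # Predict: prior state, prior covariance (cf. (9)).
    x_prior = f(x_hat)
    P_prior = F_jac(x_hat) * P * F_jac(x_hat) + Q
    # Update: Kalman gain (8), posterior state (7), posterior covariance (10).
    K = P_prior * H / (H * P_prior * H + R)
    x_post = x_prior + K * (z - h(x_prior))
    P_post = (1.0 - K * H) * P_prior
    return x_post, P_post
```

Feeding noisy position measurements through such a filter before the ESO is the composite structure proposed in this section.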

3.2 Design of ESO Combined with EKF

The high-gain ESO is sensitive to measurement noise, so the extended Kalman filter is used to reduce its influence [12]. The ESO combined with the extended Kalman filter is as follows:

\begin{cases}
\dot{z}_1 = z_2 + a_1\,\mathrm{fal}(\hat{x}_1 - z_1) \\
\dot{z}_2 = \bar{D}^{-1}\left[\gamma^{-1}\frac{3}{2} n_p \Phi i_q^* - C(q,\dot{q})\hat{x}_2 - G(q)\right] + z_5 + a_2\,\mathrm{fal}(\hat{x}_1 - z_1) \\
\dot{z}_3 = L_d^{-1}\left[u_d - R_s x_3 + \gamma^{-1} n_p \Omega_1 L_q x_4\right] + L_d^{-1} z_6 + a_3\,\mathrm{fal}(x_3 - z_3) \\
\dot{z}_4 = L_q^{-1}\left[u_q - R_s x_4 - \gamma^{-1} n_p \Omega_1 L_d x_3 - \gamma^{-1} n_p \Phi \hat{x}_2\right] + L_q^{-1} z_7 + a_4\,\mathrm{fal}(x_4 - z_4) \\
\dot{z}_5 = a_5\,\mathrm{fal}(\hat{x}_1 - z_1) \\
\dot{z}_6 = a_6\,\mathrm{fal}(x_3 - z_3) \\
\dot{z}_7 = a_7\,\mathrm{fal}(x_4 - z_4)
\end{cases}    (11)

where z_5, z_6 and z_7 are the estimated values of -\xi - E_1, -E_2 and -E_3, respectively.
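The structure of (11) can be illustrated for a second-order scalar plant. The fal definition below is the common ADRC choice (the paper does not state its exact parameters), and the gains, plant, and step size are illustrative, not the paper's values.

```python
import math

def fal(e, alpha=0.5, delta=0.01):
    """A common fal function used in ADRC-style observers (assumed form):
    fractional power outside a linear band of width delta."""
    if abs(e) > delta:
        return math.copysign(abs(e) ** alpha, e)
    return e / delta ** (1.0 - alpha)

def eso_step(z, y_meas, u, b0, a1, a2, a3, dt):
    """Euler step of a third-order ESO for a second-order plant y'' = b0*u + d.
    z1 estimates position, z2 velocity, z3 the lumped disturbance d."""
    z1, z2, z3 = z
    e = y_meas - z1
    z1 = z1 + dt * (z2 + a1 * fal(e))
    z2 = z2 + dt * (z3 + b0 * u + a2 * fal(e))
    z3 = z3 + dt * (a3 * fal(e))
    return (z1, z2, z3)
```

Driving y_meas with the EKF-filtered position instead of the raw measurement gives the composite EKF+ESO structure of this section.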

4 Design of Nonsingular Fast Terminal SMC

In this section, the NFTSMC is designed in combination with the ESO and EKF; the control strategy diagram is shown in Fig. 1. The \tau_m \in R^n in Fig. 1 is the electromagnetic torque.


Fig. 1. Control strategy diagram

Define the errors of the system as e_1 = z_1 - x_1^*, e_2 = z_2 - x_2^*, e_3 = z_3 - x_3^*, e_4 = z_4 - x_4^*. Then the error system can be described as

\begin{cases}
\dot{e}_1 = e_2 \\
\dot{e}_2 = \bar{D}^{-1}\left[\gamma^{-1}\frac{3}{2} n_p \Phi i_q^* - \xi - C(z_1,z_2)z_2 - G(z_1) - E_1\right] - \ddot{q}^* \\
\dot{e}_3 = L_d^{-1}\left[u_d - R_s z_3 + \gamma^{-1} n_p \Omega_1 L_q z_4 - E_2\right] - \dot{i}_d^* \\
\dot{e}_4 = L_q^{-1}\left[u_q - R_s z_4 - \gamma^{-1} n_p \Omega_1 L_d z_3 - \gamma^{-1} n_p \Phi z_2 - E_3\right] - \dot{i}_q^*
\end{cases}    (12)

Design the terminal sliding surface in the form

s = e_1 + C_1 e_1^{\beta} + C_2 \dot{e}_1^{p/q}    (13)

where e_1^{\beta} = [e_{11}^{\beta}, e_{12}^{\beta}, \ldots, e_{1n}^{\beta}]^T and \dot{e}_1^{p/q} = [\dot{e}_{11}^{p/q}, \dot{e}_{12}^{p/q}, \ldots, \dot{e}_{1n}^{p/q}]^T. C_1 and C_2 are constant diagonal matrices, p and q are positive odd integers, and \beta is a positive constant; they satisfy p > q and \beta > p/q. The attractor of the sliding mode surface is e_1 = 0, so the finite time t_s from e_1(t_r) \neq 0 to e_1(t_r + t_s) = 0 is

t_s = \frac{C_2^{q/p}\,|e_1(t_r)|^{1-\frac{q}{p}}}{1-\frac{q}{p}}\;\Xi\!\left(\frac{q}{p},\ \frac{1-\frac{q}{p}}{\beta-1};\ 1+\frac{1-\frac{q}{p}}{\beta-1};\ -C_1\,|e_1(t_r)|^{\beta-1}\right)    (14)

where \Xi is Gauss's hypergeometric function.

Step 1. Taking the derivative of (13) yields

\dot{s} = \dot{e}_1 + C_1\beta\Theta\dot{e}_1 + \frac{p}{q} C_2\Upsilon\dot{e}_2    (15)


where \Theta = diag{|e_{11}|^{\beta-1}, |e_{12}|^{\beta-1}, \ldots, |e_{1n}|^{\beta-1}} and \Upsilon = diag{|\dot{e}_{11}|^{\frac{p}{q}-1}, |\dot{e}_{12}|^{\frac{p}{q}-1}, \ldots, |\dot{e}_{1n}|^{\frac{p}{q}-1}}. Substituting (12) into (15), it can be obtained that

\dot{s} = e_2 + C_1\beta\Theta e_2 + \frac{p}{q} C_2\Upsilon\left\{\bar{D}^{-1}\left[\gamma^{-1}\frac{3}{2} n_p \Phi i_q^* - \xi - C z_2 - G - E_1\right] - \ddot{q}^*\right\}    (16)

Design the virtual sliding mode controller as

\alpha = i_q^* = i_{qeq}^* + i_{qn}^*    (17)

where

i_{qeq}^* = \frac{2}{3}\Phi^{-1} n_p^{-1}\gamma\left[-z_5 + C z_2 + G + \bar{D}\left(\ddot{q}^* + \frac{q}{p}\Upsilon^{-1} C_2^{-1}\left(-e_2 - C_1\beta\Theta e_2\right)\right)\right]    (18)

i_{qn}^* = \frac{2q}{3p}\Phi^{-1} n_p^{-1}\gamma\,\bar{D}\Upsilon^{-1} C_2^{-1}\left[-k_1 S_{\alpha_1}\mathrm{sat}(s) - k_2 S_{\alpha_2}\mathrm{sat}(s) - \lambda s\right]    (19)

where \lambda, k_1 and k_2 are positive constants and \mathrm{sat}(\cdot) is the saturation function. S_{\alpha_1} and S_{\alpha_2} are diagonal matrices, expressed as

S_{\alpha_1} = diag{|s_1|^{\alpha_1}, |s_2|^{\alpha_1}, \ldots, |s_n|^{\alpha_1}}    (20)

S_{\alpha_2} = diag{|s_1|^{\alpha_2}, |s_2|^{\alpha_2}, \ldots, |s_n|^{\alpha_2}}    (21)

where \alpha_1 > 1 and \alpha_2 < 1.

Step 2. Choose the candidate Lyapunov function as

V_1 = \frac{1}{2} e_3^T e_3    (22)

Differentiating V_1 with respect to time yields

\dot{V}_1 = e_3^T \dot{e}_3 = e_3^T L_d^{-1}\left[u_d - R_s z_3 + \gamma^{-1} n_p \Omega_1 L_q z_4 - E_2 - \dot{i}_d^*\right]    (23)

Since i_d^* = 0, the controller is designed as

u_d = R_s z_3 - \gamma^{-1} n_p \Omega_1 L_q z_4 - z_6 - L_d k_3 e_3    (24)

where k_3 is a positive constant.

Step 3. Choose the candidate Lyapunov function as

V_2 = V_1 + \frac{1}{2} e_4^T e_4    (25)

Differentiating V_2 with respect to time, we obtain

\dot{V}_2 = \dot{V}_1 + e_4^T \dot{e}_4 = \dot{V}_1 + e_4^T L_q^{-1}\left[u_q - R_s z_4 - \gamma^{-1} n_p \Omega_1 L_d z_3 - \gamma^{-1} n_p \Phi z_2 - E_3 - \dot{i}_q^*\right]    (26)

The control signal u_q can be designed as

u_q = R_s z_4 + \gamma^{-1} n_p \Omega_1 L_d z_3 + \gamma^{-1} n_p \Phi z_2 - z_7 + L_q\left(\dot{i}_q^* - k_4 e_4\right)    (27)
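The fractional odd powers in the sliding surface (13) must be computed sign-preservingly. The scalar sketch below uses the surface parameters from the simulation section; it is an illustration of the surface and its on-surface finite-time dynamics, not the authors' controller code.

```python
import math

def spow(x, a):
    """Sign-preserving real power |x|^a * sign(x), matching the odd
    power-ratio semantics of e^(p/q) with p and q odd."""
    return math.copysign(abs(x) ** a, x)

def sliding_variable(e, de, C1=2600.0, C2=15.0, beta=1.4, p=9, q=7):
    """Scalar form of the nonsingular fast terminal sliding surface (13)."""
    return e + C1 * spow(e, beta) + C2 * spow(de, p / q)
```

On the surface s = 0, the error obeys de/dt = -[(e + C1*e^beta)/C2]^(q/p); integrating this numerically shows e reaching a small neighborhood of zero in finite time, consistent with (14).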

5 Simulation Results

To verify the effectiveness of the proposed ESO combined with EKF, simulation analysis was conducted in the MATLAB/Simulink environment. Using a two-joint robot with disturbances, we verified the effectiveness and accuracy of the tracking control, and designed simulations to compare the ESO observation performance and trajectory tracking performance under three cases: low-gain ESO without EKF (Case 1), high-gain ESO without EKF (Case 2), and high-gain ESO with EKF (Case 3). The parameters of the manipulators and the driving PMSM are as follows: L_d = L_q = diag{0.0085, 0.0085}, R_s = diag{2.875, 2.875}, \Phi = diag{0.175, 0.175}, n_p = diag{4, 4}, J_m = diag{0.0025, 0.0025}, \gamma = diag{0.01, 0.01}, g = 9.8, m_1 = m_2 = 0.5, l_1 = l_2 = 1. The dynamics of the manipulator are shown in Fig. 2.

Fig. 2. Dynamic of manipulator

In all three cases, q^* = [0.5\sin(t), 0.5\sin(t)]^T, and the initial position of the manipulator is q = [0, 0]^T. The measurement noise is white noise with an amplitude of 0.1. The parameters of the disturbance and modeling errors are as follows: E_1 = [\sin(0.7t + \pi/2), \sin(0.7t + \pi/2)]^T, E_2 = [50\sin(t), 100\sin(t)]^T, E_3 = [50\sin(t), 50\sin(t)]^T, R_f = diag{0.05, 0.05}, R_m = diag{0.2, 0.2}. The parameters of the nonsingular fast terminal SMC are: C_1 = diag{2600, 2600}, C_2 = diag{15, 15}, \beta = 1.4, p = 9, q = 7, \alpha_1 = 2, \alpha_2 = 0.5, k_1 = 7500, k_2 = 4430, k_3 = 20, k_4 = 5000. The parameters of the ESO are: a_1 = a_2 = 50000, a_3 = 100000, a_4 = 200000, a_5 = a_6 = a_7 = 2000 (Case 1), a_5 = a_6 = a_7 = 20000 (Cases 2 and 3). The trajectory tracking curves of the two joints of the manipulator in the three cases are shown in Fig. 3(a) and (b), respectively, and Fig. 3(c) and (d) show the trajectory tracking errors. From Fig. 3, it can be seen that the high-gain ESO with EKF gives the controller the best steady-state performance, meaning that the EKF effectively reduces the impact of measurement noise on the disturbance estimation performance of the ESO and on the control performance of the controller. The output curves of the ESO for joint-1 and joint-2 in the three cases are shown in Fig. 4.


Fig. 3. Simulation results in three cases.

Fig. 4. Output curves of ESO


6 Conclusion

This paper used backstepping NFTSMC to realize accurate position tracking control of a manipulator driven by PMSM, with the manipulator and the drive motors regarded as a whole system. The ESO is designed to estimate the lumped disturbance, and the EKF is designed to deal with the sensitivity of the high-gain ESO to measurement noise. Finally, simulation results verify that the proposed method has good performance.

Acknowledgments. The authors thank the National Natural Science Foundation of China (grant number 62273189) and the Shandong Province Natural Science Foundation (grant number ZR2021MF005) for supporting this work.

References

1. Kim, M.J., Chung, W.K.: Disturbance-observer-based PD control of flexible joint robots for asymptotic convergence. IEEE Trans. Robot. 31(6), 1508–1516 (2015)
2. Ouyang, P.R., Pano, V., Tang, J., et al.: Position domain nonlinear PD control for contour tracking of robotic manipulator. Robot. Comput.-Integr. Manuf. 51, 14–24 (2018)
3. Meza, J.L., Santibáñez, V., Soto, R., et al.: Fuzzy self-tuning PID semiglobal regulator for robot manipulators. IEEE Trans. Ind. Electron. 59(6), 2709–2717 (2011)
4. Ling, S., Wang, H., Liu, P.X.: Adaptive fuzzy tracking control of flexible-joint robots based on command filtering. IEEE Trans. Ind. Electron. 67(5), 4046–4055 (2019)
5. Wang, H., Zhang, Z., Tang, X., et al.: Continuous output feedback sliding mode control for underactuated flexible-joint robot. J. Franklin Inst. 359(15), 7847–7865 (2022)
6. Wang, L., Chai, T., Zhai, L.: Neural-network-based terminal sliding-mode control of robotic manipulators including actuator dynamics. IEEE Trans. Ind. Electron. 56(9), 3296–3304 (2009)
7. Yang, J., Chen, W.H., Li, S., et al.: Disturbance/uncertainty estimation and attenuation techniques in PMSM drives - a survey. IEEE Trans. Ind. Electron. 64(4), 3273–3285 (2016)
8. Soltanpour, M.R., Moattari, M.: Voltage based sliding mode control of flexible joint robot manipulators in presence of uncertainties. Robot. Auton. Syst. 118, 204–219 (2019)
9. Madoński, R., Herman, P.: Survey on methods of increasing the efficiency of extended state disturbance observers. ISA Trans. 56, 18–27 (2015)
10. Sun, H., Madonski, R., Li, S., et al.: Composite control design for systems with uncertainties and noise using combined extended state observer and Kalman filter. IEEE Trans. Ind. Electron. 19(3), 636–643 (2010)
11. Simon, D.: Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches. John Wiley & Sons, Hoboken (2006)
12. Rsetam, K., Cao, Z., Man, Z.: Cascaded extended state observer based sliding mode control for underactuated flexible joint robot. IEEE Trans. Ind. Electron. 67(12), 10822–10832 (2019)

Adaptive Variable Impedance Control of Robotic Manipulator with Nonlinear Contact Forces

Ying Guo, Jinzhu Peng(B), Shuai Ding, and Yanhong Liu

School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China
[email protected]

Abstract. This study addresses the limitations of traditional impedance control in effectively adapting to the environment during robots’ force tracking tasks. To overcome this issue, an adaptive variable impedance control approach is proposed. By analyzing the interactive environment, the control scheme adaptively adjusts the impedance parameters in real-time using force and position feedback. The stability of the adaptive variable impedance control is demonstrated through the Routh stability criterion. The effectiveness of the control strategy is validated through simulations involving a two-link manipulator.

Keywords: Force tracking · Variable impedance control · Trajectory tracking · Robotic system

1 Introduction

Force control plays a crucial role in the performance of robots during interactive tasks, and enhancing the robot's accuracy and adaptability in force tracking control is essential. Industrial robots are extensively used in operations such as assembly, testing, polishing, and welding, and effective control of the manipulator's interaction force is vital for task execution [1–4]. Currently, impedance control [5–7] is one of the prevalent methods for compliance control. However, robotic systems are nonlinear and strongly coupled, leading to potential dynamic parameter uncertainties, so traditional impedance control alone may not achieve optimal control performance. Therefore, intelligent force control is necessary, with one notable approach being the combination of impedance control and fuzzy logic systems [8,9]. Recognizing the uncertainties and disturbances in the robot system, Ding et al. [10] proposed an adaptive hybrid impedance control scheme based on a neural network, in which the neural network approximates the saturation error term, uncertain components, and external disturbances. Similarly, considering uncertainties and input saturation, Surdilovic et al. [11] employed a neural network for adaptive impedance control of an n-link manipulator.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 331–340, 2023. https://doi.org/10.1007/978-981-99-6187-0_33


To address the limitations of traditional impedance control in handling uncertainties in environmental stiffness, variable impedance control has been proposed [12–15]. However, existing variable impedance controllers have faced challenges in terms of stability. To address this issue, Sun et al. [16] proposed a stable method for variable impedance control, utilizing approximate dynamic inversion techniques for robots with uncertain models. They introduced a novel constraint condition for the variable impedance curve, ensuring the desired exponential stability of the variable impedance dynamics. Another approach, presented by Duan et al. [17], involved online adjustment of impedance parameters based on tracking errors, enabling compensation for uncertainties in the environment. Hu et al. [18] proposed a control strategy that combined an adaptive variable impedance tracking controller with a two-arm end trajectory tracking controller. This method was successfully applied to the installation process of slab stone, meeting the requirements of force/position control. The utilization of adaptive variable impedance control is significant for studying force control in robot interactions with nonlinear environments. In this study, we contribute by designing an adaptive variable impedance trajectory generator (AVITG) that obtains reference trajectories for the manipulator during space transitions. Additionally, we develop an effective controller to track these generated reference trajectories, resulting in improved performance for force and position-tracking control.

2 Problem Statement and Preliminaries

In this study, we consider the following notation: R represents the set of real numbers, R^n denotes an n-dimensional real vector space, and R^{n×n} represents the space of n×n real matrices. Additionally, the symbol |x| denotes the absolute value of x.

2.1 Modeling of Environment

Typically, the interaction force F_e in contact situations is determined by the stiffness and damping properties of the environment. As a result, the environment model is approximated by a second-order nonlinear function, given by

F_e = k_e(P - P_e) + B_e(\dot{P} - \dot{P}_e)    (1)

Here, k_e \in R^{m \times m} is the diagonal symmetric positive definite matrix that characterizes the environmental stiffness, B_e \in R^{m \times m} denotes the damping matrix, and P_e \in R^m is the position of the environment. Assuming the environmental position remains constant, this reduces to

F_e = k_e(P - P_e) + B_e\dot{P}    (2)
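The scalar form of (2) can be sketched as a spring-damper normal force. The free-space clamp (zero force before penetration) is our assumption for illustration; the default stiffness and damping values are those used in the simulation section.

```python
def contact_force(p, p_dot, p_e, k_e=2000.0, b_e=100.0):
    """Scalar environment contact model (2): spring-damper normal force.
    Returns zero in free space, i.e. before the end-effector reaches p_e
    (an illustrative assumption, not stated in the paper)."""
    if p <= p_e:
        return 0.0
    return k_e * (p - p_e) + b_e * p_dot
```

With k_e = 2000 N/m, a 1 cm penetration at rest produces roughly 20 N of contact force.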

2.2 Dynamic Formulation and Characteristics of Robot Manipulators

The n-link rigid robot dynamic equation is

M(q)\ddot{q} + C(q,\dot{q})\dot{q} + G(q) + \tau_d = \tau - \tau_e    (3)

The above equation involves the joint position q, velocity \dot{q} and acceleration \ddot{q} \in R^n, together with the inertia matrix M(q), the centrifugal and Coriolis matrix C(q,\dot{q}), the gravity vector G(q), the input torque vector \tau, and the interaction torque vector \tau_e \in R^n. Converting the dynamic equation (3) into the Cartesian coordinate system, we obtain the following equation, which describes the motion of the manipulator in Cartesian space:

M_d\ddot{P} + C\dot{P} + G = F - F_e    (4)

Here, P, \dot{P} and \ddot{P} represent the position, velocity, and acceleration vectors of the manipulator in the Cartesian coordinate system, respectively.

3 Design of Adaptive Variable Impedance Control Approach

In this section, an adaptive variable impedance control strategy is formulated to achieve accurate tracking of desired positions and forces, ensuring superior control performance.

3.1 Trajectory Generation with Adaptive Variable Impedance

The adaptive variable impedance trajectory generator (AVITG) incorporates an internal position control loop and an external force control loop. Manipulating the three impedance controller parameters modifies the manipulator's reference trajectory input. This adjustment is based on the target impedance equation, which characterizes the desired behavior of the robotic system and can be expressed as

M_d\ddot{E} + B_d\dot{E} + K_d E = \Delta F    (5)

where M_d, B_d, and K_d are the inertia, damping, and stiffness matrices, respectively. In the contact force model involving a flexible object, the position trajectory tracking error is E = P_d - P, where P_d and P are the desired and actual position trajectories; the velocity and acceleration tracking errors \dot{E} and \ddot{E} are defined similarly. \Delta F is the difference between the interaction force F_e and the desired force F_d.
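Integrating the target impedance dynamics (5) is how such a generator produces a reference trajectory. The scalar Euler-integration sketch below uses the impedance parameters chosen later in the design procedure (m_d = b_d = k_d = 1); it is an illustration, not the authors' implementation.

```python
def impedance_ref_step(E, dE, dF, dt, m_d=1.0, b_d=1.0, k_d=1.0):
    """One Euler step of the scalar target impedance dynamics (5):
    m_d*E'' + b_d*E' + k_d*E = dF.  Returns the updated (E, E')."""
    ddE = (dF - b_d * dE - k_d * E) / m_d
    return E + dt * dE, dE + dt * ddE
```

With a constant force error dF, the tracking error E settles to dF / k_d, which is how the force loop reshapes the position reference.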


According to Eq. (2), the scalar contact force and its derivative are

\begin{cases}
f_e = k_e(p - p_e) + b_e\dot{p} \\
\dot{f}_e = k_e\dot{p} + b_e\ddot{p}
\end{cases}    (6)

\begin{cases}
\dot{p} = \dfrac{f_e - k_e(p - p_e)}{b_e} \\
\ddot{p} = \dfrac{1}{b_e}\left[\dot{f}_e - k_e\dfrac{f_e - k_e(p - p_e)}{b_e}\right]
\end{cases}    (7)

In free space, a constant impedance controller is employed for tracking. In contact space, where the manipulator interacts with the environment, a variable impedance controller is designed. With e = p_d - p and \Delta f = f_e - f_d, the control strategy can be expressed as

\begin{cases}
m_c\ddot{e} + b_c\dot{e} + k_c e = 0 & \text{(in free space)} \\
(m_d + \Delta m_d(t))\ddot{e}(t) + (b_d + \Delta b_d(t))\dot{e}(t) = \Delta f(t) & \text{(in contact space)}
\end{cases}    (8)

and the adaptive terms are defined as

\begin{cases}
\Delta m_d(t) = \dfrac{\Delta f(t)}{\ddot{e}(t)} - m_d - \dfrac{k_e\left[f_e - k_e(p - p_e)\right]}{b_e^2\,\ddot{e}(t)} \\
\Delta b_d(t) = -b_d - \dfrac{k_e(p - p_e)}{b_e\,\dot{e}(t)} + \dfrac{\Phi(t)}{\dot{e}(t)} \\
\Phi(t) = \Phi(t - \lambda) - \alpha\Delta f(t - \lambda)
\end{cases}    (9)

Based on Eq. (7), substituting Eq. (9) into Eq. (8) yields

\Delta f(t) = [m_d + \Delta m_d(t)]\ddot{e}(t) + [b_d + \Delta b_d(t)]\dot{e}(t) = m_d\left[\ddot{p}_d(t) - \frac{\dot{f}_e(t)}{b_e}\right] + \Delta f(t) + b_d\left[\dot{p}_d(t) - \frac{f_e(t)}{b_e}\right] + \Phi(t-\lambda) - \alpha\Delta f(t-\lambda)    (10)

Multiplying both sides of Eq. (10) by b_e, the following equation is obtained:

m_d\left[b_e\ddot{p}_d(t) - \dot{f}_d(t)\right] + b_d\left[b_e\dot{p}_d(t) - f_d(t)\right] = m_d\Delta\dot{f}(t) + b_d\Delta f(t) + b_e\alpha\Delta f(t-\lambda) - b_e\Phi(t-\lambda)    (11)

Defining c(t) = \Delta f and r(t) = b_e\dot{p}_d - f_d, Eq. (11) can be rewritten as

m_d\dot{r} + b_d r = m_d\dot{c} + b_d c - b_e\Phi(t-\lambda) + b_e\alpha c(t-\lambda)    (12)

Expanding \Phi recursively, Eq. (12) becomes

m_d\dot{r} + b_d r = m_d\dot{c} + b_d c + b_e\left[\alpha c(t-(n+1)\lambda) + \cdots + \alpha c(t-2\lambda) + \alpha c(t-\lambda)\right]    (13)

and the transfer function is described as

\frac{c(s)}{r(s)} = \frac{m_d s + b_d}{m_d s + b_d + b_e\alpha\left(e^{-(n+1)\lambda s} + \cdots + e^{-\lambda s}\right)}    (14)

When the sampling rate is sufficient, the characteristic equation can be expanded as

\lambda m_d s^2 + \lambda b_d s - \lambda b_e\alpha s + b_e\alpha = 0    (15)


According to the Routh stability criterion, the stability conditions of the system can be derived from (15), and K_z > 0 is the controller gain.
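For a second-order characteristic polynomial such as (15), the Routh criterion reduces to all coefficients sharing one sign. The sketch below applies this to (15); the sampling period value in the usage is a hypothetical choice, not given in the paper.

```python
def routh_stable_2nd(a2, a1, a0):
    """Routh-Hurwitz test for a2*s^2 + a1*s + a0 = 0: a second-order
    polynomial is Hurwitz iff all three coefficients are positive."""
    return a2 > 0 and a1 > 0 and a0 > 0

def impedance_loop_stable(m_d, b_d, b_e, alpha, lam):
    """Apply the test to (15): lam*m_d*s^2 + lam*(b_d - b_e*alpha)*s
    + b_e*alpha = 0.  lam is the sampling period."""
    return routh_stable_2nd(lam * m_d, lam * (b_d - b_e * alpha), b_e * alpha)
```

With the simulation values m_d = b_d = 1, b_e = 100 and alpha = 0.007, the middle coefficient 1 - 0.7 remains positive, so the loop is stable; increasing alpha past b_d / b_e would violate the condition.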

4 Simulation Experiments

To validate the suggested controller approach, computer simulations are conducted on a manipulator with two links, whose configuration is as follows:

M(q) = \begin{bmatrix} \chi_1 & \chi_2 \\ \chi_3 & \chi_4 \end{bmatrix}    (23)




C(q,\dot{q}) = \begin{bmatrix} -2 m_2 l_1 l_2 \dot{q}_2 s_2 & -m_2 l_1 l_2 \dot{q}_2 s_2 \\ m_2 l_1 l_2 \dot{q}_1 s_2 & 0 \end{bmatrix}    (24)

G(q) = \begin{bmatrix} (m_1 + m_2) l_1 g c_1 + m_2 l_2 g c_{12} \\ m_2 l_2 g c_{12} \end{bmatrix}    (25)

where \chi_1 = (m_1 + m_2) l_1^2 + m_2 (l_2^2 + 2 l_1 l_2 c_2), \chi_2 = m_2 l_2^2 + m_2 l_1 l_2 c_2, \chi_3 = m_2 l_2^2 + m_2 l_1 l_2 c_2 and \chi_4 = m_2 l_2^2.

4.1 Design Procedure

To validate the proposed approach, the steps involved in designing the adaptive variable impedance control (AVIC) are as follows.

Step 1. Develop the AVITG by carefully selecting suitable impedance parameters: in free space, m_c = 1, b_c = 50, k_c = 100; in contact space, m_d = I_{2×2}, b_d = I_{2×2}, k_d = I_{2×2}, \alpha = 0.007.

Step 2. Construct the controller: choose the controller gain K_z = 200 I_{2×2} and \Lambda = I_{2×2}.

4.2 Simulation Results

This section presents examples conducted on a two-link manipulator with the following actual parameters: m1 = m2 = 1 kg and l1 = l2 = 1 m. The contact force equation (2) is chosen in the x-direction, where ke = 2000I2×2 and be = 100I2×2 . The desired position trajectory is Pd (t) = [1.6 sin(0.4t + π/6), 1.6 sin(0.4t + π/3)]T m.
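The dynamics matrices (23)–(25) with these parameters can be sketched as follows. The C(q, q̇) entries follow the matrices as reconstructed above; sign conventions for the Coriolis matrix vary across references, so treat this as an illustration rather than a definitive model.

```python
import math

def two_link_dynamics(q, dq, m1=1.0, m2=1.0, l1=1.0, l2=1.0, g=9.8):
    """M(q), C(q, dq), G(q) of the two-link arm, per (23)-(25).
    q, dq: joint angles and velocities [q1, q2], [dq1, dq2]."""
    c1 = math.cos(q[0])
    c2, s2 = math.cos(q[1]), math.sin(q[1])
    c12 = math.cos(q[0] + q[1])
    chi1 = (m1 + m2) * l1**2 + m2 * (l2**2 + 2 * l1 * l2 * c2)
    chi2 = m2 * l2**2 + m2 * l1 * l2 * c2          # chi3 equals chi2
    M = [[chi1, chi2], [chi2, m2 * l2**2]]
    C = [[-2 * m2 * l1 * l2 * dq[1] * s2, -m2 * l1 * l2 * dq[1] * s2],
         [m2 * l1 * l2 * dq[0] * s2, 0.0]]
    G = [(m1 + m2) * l1 * g * c1 + m2 * l2 * g * c12, m2 * l2 * g * c12]
    return M, C, G
```

At the stretched-out configuration q = [0, 0] the inertia matrix is [[5, 2], [2, 1]] for these parameters, which is a quick sanity check on the chi terms.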

Fig. 1. Force tracking and errors of AVIC and IC

Assume the desired force is F_d = [50, 0]^T N when p \geq p_e = 1 m in the x direction. To verify the superiority of the AVIC, we compare it with the


Fig. 2. Actual position tracking and errors of AVIC and IC in the x direction

Fig. 3. Actual position tracking and errors of AVIC and IC in the y direction

Fig. 4. Adaptive impedance parameter variation diagram


Fig. 5. Position tracking of end-effector

conventional impedance control (IC). Figure 1(a) and (b) illustrate the force tracking results and errors for a constant force. The tracking of the desired trajectory in the x and y directions of Cartesian space is shown in Figs. 2(a) and 3(a), respectively, and Figs. 2(b) and 3(b) depict the position tracking errors in the x and y directions. Figure 4 displays the variation of the adaptive impedance parameters, and Fig. 5 presents the motion space and position tracking of the end-effector. Figure 1 shows that the proposed AVIC significantly reduces overshoot and force tracking errors in comparison to IC. This improvement is due to the adaptive updating of the impedance parameters in AVIC: the real-time feedback of the force error enhances its adaptability and robustness, resulting in better force tracking performance than IC. Figures 2 and 3 demonstrate that the proposed AVIC achieves superior position tracking performance and a smoother trajectory compared to IC. Figure 4 shows the irregular changes in the adaptive impedance parameters, which are determined by the adaptive law. Based on the above analysis, we conclude that the proposed AVIC reduces force overshoot and oscillation and achieves smaller force tracking errors compared to IC.

5 Conclusion

This paper proposes an adaptive variable impedance control approach for manipulators. The control scheme utilizes a nonlinear force-contact-based environmental model to design a controller that tracks the reference trajectory of the manipulator. The Routh stability criterion and Lyapunov asymptotic stability analysis are used to demonstrate the stability of the controller. The effectiveness and feasibility of the control strategy are validated through simulations on a two-link robot. Future work will involve experimental validation and application of the proposed method to real robotic systems.


Acknowledgments. The authors would like to acknowledge the funding supported by the National Natural Science Foundation of China (62273311, 61773351) and the Key Specialized Research and Development Breakthrough in Henan Province, China (222102220117).


Research on Adaptive Network Recovery Method Based on Key Node Identification

Chaoqing Xiao, Lina Lu(B), Chengyi Zeng, and Jing Chen

National University of Defense Technology, Changsha 410003, China [email protected]

Abstract. As the world becomes increasingly interconnected, network failures have emerged as a practical problem that organizations must tackle. While extensive research has been carried out on network recovery in the past decade, previous studies have often neglected attribute information such as node capacity and status, resulting in inaccurate identification of key nodes and suboptimal recovery of network performance. We aim to optimize the representation of failed networks, embedding both node attribute information and topological information using a graph neural network. A deep reinforcement learning approach is employed to find the optimal node recovery sequence, leading to the proposal of a network recovery method called R-GQN, based on graph neural networks and deep Q-learning. Recovery speed is evaluated by the normalized cumulative recovery efficiency (NCRE). To evaluate the proposed method, experiments were conducted on Barabási-Albert scale-free networks of different sizes and compared with four existing network recovery methods: degree centrality, betweenness centrality, closeness centrality, and PageRank. The results show that the R-GQN method outperforms these methods by up to 10.32% in terms of NCRE, demonstrating its effectiveness in network recovery.

Keywords: Key Nodes Identification · Network Recovery · Deep Reinforcement Learning · Graph Neural Network

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 341–352, 2023. https://doi.org/10.1007/978-981-99-6187-0_34

1 Introduction

The field of network science has revolutionized the way we approach real-world problems by modeling complex systems as graph structures. Many complex systems, including power grids, transportation networks, communication networks, metabolic networks, and ecosystems, can be represented as complex networks. The increasing reliance on infrastructure networks such as the Internet has brought great convenience to people's lives, while the continuous improvement of networking in various industries provides development opportunities for countries and regions.

However, network failures pose a significant threat to social order and the financial sector. These failures can result from internal equipment overload or damage, as well as external factors such as natural disasters, malicious attacks, or failures propagated from interdependent networks, and their impact can be far-reaching. For instance, in 2003, bad weather in Italy caused local transmission line damage, resulting in a nationwide blackout [1]. In 2017, the WannaCry ransomware exploited vulnerabilities in Microsoft's operating system [2], spreading widely and attacking networks in more than 100 countries worldwide. Efficient network recovery methods are therefore critical for restoring network functionality in a timely manner.

Network recovery is an essential branch of network science that aims to restore network efficiency when it has been reduced, or the network has entirely collapsed, due to failure. Taking timely repair measures to return the network to normal operation is crucial for minimizing the impact of network failures on people's lives and on various industries. Current academic research on network recovery focuses on spontaneous recovery based on network self-healing, optimal recovery sequences under time constraints, partial recovery under resource constraints, and optimal recovery sequences based on network load redistribution. However, identifying the key nodes for optimal recovery is a challenging task due to the complex structure of failure networks. To address this challenge, this paper draws inspiration from the FINDER [3] algorithm, which achieves excellent results in finding key nodes for network disintegration. Specifically, we improve on the network recovery methods based on degree centrality and betweenness centrality. The paper first describes and models the network recovery problem, then studies node embedding representations based on graph neural networks to identify the key nodes of the failure network. Finally, the iterative recovery scheme is optimized using deep reinforcement learning.

The main contributions of this paper are fourfold. Firstly, our approach optimizes the representation of failure networks, resulting in a more effective method for analyzing network failures.
Secondly, we introduce a novel evaluation index for measuring the performance of network recovery processes. Thirdly, we apply graph neural networks to the field of network recovery, enabling node attributes and network structure information to be embedded as low-dimensional vectors. Finally, we propose a network recovery method based on graph neural networks and deep Q-learning (R-GQN), which employs deep reinforcement learning to solve for the optimal node recovery sequence, resulting in a more effective and efficient network recovery process.

2 Related Work

To address the problem of spontaneous network recovery, Majdandzic [4] studied the conditions under which networks can recover spontaneously. They proposed a spontaneous recovery model in which each node, independently of the others, undergoes internal failure (with probability p∗) or externally induced failure (with probability r). Depending on the values of p∗ and r, the network undergoes phase transitions between a high-activity state and a low-activity state. Liu [5] studied how spontaneous recovery can block cascading failures, taking the network's spontaneous-recovery start-up time and resource allocation as variables; by combining recovery time and recovery resources appropriately, they aimed to block cascading failures as early as possible and protect the network to the maximum extent. However, not all failed networks can return to a normal state spontaneously; some require human intervention.


To address the problem that the redistribution of network load can cause secondary failures during recovery, Fu [6] proposed four recovery strategies based on ascending (descending) betweenness centrality and ascending (descending) degree centrality, considering network coupling and the different partial-pressure ratios of other failed nodes. Reference [7] proposed a shell repair strategy (SRS) that prevents secondary failure during network recovery by searching for high-risk nodes and increasing their capacity. Reference [8] modeled the failure network and decoupled it by increasing the redundant capacity of nodes and cutting the coupling links, preventing secondary cascading failures during the recovery process. Although these methods identify the key nodes prone to secondary cascading failures and enhance network robustness by increasing node capacity, they do not address optimal recovery of the original network.

To address this issue, Wu [9] proposed the sequential recovery graph (SRG) method, which determines the sequence of recovery nodes according to maximum frequency and establishes a directed graph, with node order determined by the weights of the digraph. Huang [10] modeled the state changes of nodes and proposed an unconstrained result-based recovery strategy and a constrained resource-based recovery strategy, in which node capacity is characterized by betweenness and the optimal recovery sequence is found by iterative and heuristic search. References [11–14] proposed reinforcement learning methods that recover failure networks sequentially, mainly using Q-learning to rank the importance of failed nodes; however, these methods are only suitable for small-scale networks and do not consider the impact of node attributes on network recovery.
Although the above methods can produce a recovery sequence, their time and space complexity makes it difficult to obtain the optimal solution, especially for large-scale network failures. Hu [15] addressed network structure recovery by modeling an intercity traffic network as a two-dimensional grid. They proposed the periphery repair (PR) strategy, which recovers the nearest node with the largest connected distance, and the node-weight priority recovery (PRNW) strategy, which additionally considers node weight; these strategies, similar to greedy recovery, effectively improve network resilience. Shang [16] divided the network nodes into several layers and proposed recovering nodes from the inside of a layer outward, on the basis that failed nodes close to normal nodes are more likely to be recovered. Wang [17] proposed a community repair method for networks that failure has broken into small communities, re-establishing hub nodes to link the communities while preserving communication within them. Although these methods aim to reconnect the network as fast as possible, they may not recover network efficiency as quickly as possible. Real networks are often dependent on each other for energy or information [18], and Muro [19] addressed dependent-network recovery by defining boundary nodes as failed nodes at distance 1 from the giant connected component; they proposed a boundary-node recovery method that randomly selects the next recovered node from among the boundary nodes. Wu [20] proposed a preferential recovery algorithm based on connected edges, which measures the importance of a node by its value inside and
outside of the giant connected component. The algorithm recovers nodes in order of importance, which is similar to layering the network and giving priority to the nodes closest to the giant connected component.

3 Adaptive Network Recovery Strategy

3.1 Problem Description and Modeling

During network operation, local failures may be caused by component damage or natural disasters, while malicious attacks or cascading failures can lead to large-scale network paralysis. When multiple nodes fail, different node recovery sequences can greatly affect network connectivity and the speed of performance recovery. In this paper, we model the failure network using complex network theory and use the normalized global performance change per recovered node as the measure of recovery effect.

Given a network G(V, E), where V is the set of nodes and E is the set of edges, let G′(V′, E′) be the network after failure, where V′ is the set of nodes currently working normally and E′ is the set of edges still functioning, with V′ ⊂ V and E′ ⊂ E. Let η denote the performance of the normal network. Failed nodes cause the edges directly connected to them to stop working, so the performance of the failed network η′ is lower than the normal performance, i.e., η′ < η. In the failed-network representation, the edges directly connected to a failed node are deleted; if the failed node is recovered, those edges return to their normal link state. We define η(G′) as the performance of the failure network G′. The aim of this paper is to develop a recovery strategy that optimizes the node recovery sequence {v1, v2, ..., vi, ...} to maximize the normalized cumulative recovery efficiency (NCRE) given by formula (1).
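The NCRE of formula (1) below averages, over the K recovery steps, the ratio of the post-step network performance to the normal performance. A sketch with toy performance values (not measured data):

```python
# Sketch of formula (1): the NCRE averages, over the K recovery steps, the
# ratio of the network performance after each step to the normal performance.
# The performance values below are toy numbers, not measured data.

def ncre(perf_after_each_step, perf_normal):
    k = len(perf_after_each_step)
    return sum(p / perf_normal for p in perf_after_each_step) / k

# performance climbing 0.6 -> 0.8 -> 1.0 over three recoveries
r = ncre([0.6, 0.8, 1.0], 1.0)  # (0.6 + 0.8 + 1.0) / 3 = 0.8
```

A recovery order that restores performance earlier accumulates a larger R, which is why NCRE rewards fast recovery rather than only the final state.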
R(v1, v2, ..., vK) = (1/K) · Σ_{j=1}^{K} η(G′ + {v1, v2, ..., vj}) / η(G)    (1)

where K is the total number of nodes to be recovered, G is the network with no node failures, η(G) is the network performance under normal conditions, and η(G′ + {v1, v2, ..., vj}) is the network performance after recovering, in order, the first j failed nodes of the set F = {v1, v2, ..., vK}. R is the normalized accumulation of the ratio of network efficiency to the original network efficiency after each recovery step; a larger R indicates a better recovery strategy.

3.2 Network Performance Metrics

Most scholars use the giant connected component (GCC) as the network performance measure in network recovery research, but it has both advantages and disadvantages. While it is simple to calculate and understand, it falls short in evaluating the performance
of power networks and the Internet, where connectivity alone is not enough to represent global performance. To address this, we propose an improved evaluation index, the giant connectivity component contribution rate (GCoCR), to comprehensively evaluate network performance. GCoCR considers not only the connectivity of the network but also the overall contribution rate of the giant connectivity component. For network G, GCoCR is given by formula (2):

η(G) = max_i (σ_i / σ_init)    (2)
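Formula (2) can be computed by summing node contributions per connected component and taking the best component's share. A sketch on a toy adjacency dict, where σ_i sums the contributions C_k over the i-th component (per the definition that follows):

```python
# Sketch of formula (2): GCoCR takes, over all connected components, the
# maximum summed node-contribution value sigma_i, normalized by the fully
# connected total sigma_init.  Graph and contribution values are toy data.

def gcocr(adj, contrib, sigma_init):
    seen, best = set(), 0.0
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]   # collect one connected component
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        best = max(best, sum(contrib[k] for k in comp))
    return best / sigma_init

adj = {1: [2], 2: [1, 3], 3: [2], 4: []}
contrib = {1: 0.4, 2: 0.3, 3: 0.1, 4: 0.2}
eta = gcocr(adj, contrib, 1.0)  # component {1, 2, 3} contributes 0.8
```

Unlike the raw GCC size, this weights each component by what its nodes contribute, so a small component of high-value nodes can dominate a larger but low-value one.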

Here, σ_i = Σ_{k∈m} C_k represents the total contribution value of the i-th connected component, where m is the node set of that component, C_k is the contribution value of node v_k in the normal state, and σ_init is the network contribution value when the network is fully connected.

3.3 Identification of Key Nodes in the Failure Network

The goal of this paper is to quantify the importance of failed nodes so that each node recovery has the greatest possible positive effect on network performance. To achieve this, we use a graph neural network to represent the failed nodes and obtain comparable low-dimensional node vectors. The model is designed to extract not only the attribute information of each node but also to fully aggregate the network structure information, and it is expected to be adaptable to other types of networks. GraphSAGE [21] is a powerful inductive learning method that meets these requirements, so this paper uses it to represent the failed nodes. GraphSAGE updates a node's features by repeatedly aggregating neighbor features, as shown in formula (3), where j is the number of aggregations and N(v) is the set of neighbor nodes of node v; after j aggregations, node v contains the features of its j-order neighbors:

h_N(v)^(j) ← AGGREGATE_j({h_u^(j−1), ∀u ∈ N(v)})    (3)

After the j-order neighbor aggregation is obtained, it is concatenated with the node's own features from iteration j − 1 and passed through the activation, as shown in formula (4), yielding the current feature representation of node v:

h_v^(j) ← σ(W^(j) · CONCAT(h_v^(j−1), h_N(v)^(j)))    (4)

In this paper, we set j to 3, and the initial node features are the node degree value, the node contribution value, and the node state (1 for normal, −1 for failed).
The representation of all nodes is continuously updated by GraphSAGE, and the final vector representation of the failed nodes is obtained through learning and optimization. In this paper, Algorithm 1 is employed to embed both the failure and normal networks, yielding node and whole-network representations, where σ is the non-linear activation function ReLU. Although the same graph neural network model is utilized for embedding both the failure and normal networks, the parameters are not shared.
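Formulas (3)–(4) can be sketched with a mean aggregator, one common AGGREGATE choice in GraphSAGE. The weight matrix below is a hand-picked illustration, not a learned parameter, and the features are 2-dimensional toy vectors:

```python
# Sketch of formulas (3)-(4) with a mean aggregator (a common AGGREGATE
# choice in GraphSAGE).  The weight matrix W is a hand-picked illustration,
# not a learned parameter; features are 2-dimensional toy vectors.

def mean_aggregate(h, neighbors):                     # formula (3)
    dim = len(next(iter(h.values())))
    if not neighbors:
        return [0.0] * dim
    return [sum(h[u][d] for u in neighbors) / len(neighbors)
            for d in range(dim)]

def sage_layer(h, adj, W):                            # formula (4)
    out = {}
    for v in adj:
        z = h[v] + mean_aggregate(h, adj[v])          # CONCAT(h_v, h_N(v))
        pre = [sum(W[r][c] * z[c] for c in range(len(z)))
               for r in range(len(W))]
        out[v] = [max(0.0, p) for p in pre]           # ReLU activation
    return out

h = {1: [1.0, 0.0], 2: [0.0, 1.0]}
adj = {1: [2], 2: [1]}
W = [[1, 0, 1, 0], [0, 1, 0, 1]]  # mixes own and aggregated features
out = sage_layer(h, adj, W)       # both nodes end up with features [1.0, 1.0]
```

Stacking this layer three times (j = 3, as in the paper) lets each node's vector absorb information from its 3-hop neighborhood.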


3.4 Adaptive Network Recovery Strategy

Building on the GraphSAGE node representation, this paper uses a deep Q-network (DQN) to identify key nodes in the failure network and develop an optimal recovery strategy. In reinforcement learning, at time t the agent chooses an action a_t to interact with the environment based on the current state s_t and the policy π, transitions to the next state s_{t+1} according to the probability P, and receives the reward r_t = r(s_t, a_t), which is used to update π. The accumulated return is given by formula (5):

R_t = Σ_{i=0}^{n} γ^i r(s_{t+i}, a_{t+i})    (5)
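Formula (5) is the standard discounted return; as a sketch:

```python
# Sketch of formula (5): the discounted return accumulated from step t.

def discounted_return(rewards, gamma):
    return sum((gamma ** i) * r for i, r in enumerate(rewards))

g = discounted_return([1.0, 1.0, 1.0], 0.5)  # 1 + 0.5 + 0.25 = 1.75
```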

The action-value function in formula (6) represents the expected return after taking action a in state s; the goal of DQN is to maximize it:

Q_π(s, a) = E_π[R_t | s_t = s, a_t = a]    (6)

The target network computes a target value Q_T for the action a taken in the current state s, equal to the current reward r(s, a) plus the maximum discounted value obtainable from the next state s′:

Q_T = r(s, a) + γ max_{a′} Q(s′, a′)    (7)

The evaluation network outputs a value Q_E according to the current state and the chosen action:

Q_E = max_a Q(s, a)    (8)


The loss function for DQN training is defined as the mean squared difference between the target network output Q_T and the evaluation network output Q_E, as shown in formula (9):

L = E[(Q_T − Q_E)²] = E[(r(s, a) + γ max_{a′} Q(s′, a′) − Q(s, a))²]    (9)
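On a toy tabular Q-function, formulas (7) and (9) can be sketched as follows; the expectation in (9) is approximated by an average over sampled transitions, and all state/action names and values below are made up:

```python
# Sketch of formulas (7)-(9) on toy tabular Q-functions: the TD target Q_T
# comes from the target table, Q(s, a) from the evaluation table, and the
# expectation in (9) is approximated by averaging over sampled transitions.

def td_target(r, s_next, q_target, gamma):            # formula (7)
    return r + gamma * max(q_target[s_next].values())

def td_loss(transitions, q_eval, q_target, gamma):    # formula (9)
    sq = [(td_target(r, s2, q_target, gamma) - q_eval[s][a]) ** 2
          for (s, a, r, s2) in transitions]
    return sum(sq) / len(sq)

q = {'s0': {'a': 0.0, 'b': 1.0}, 's1': {'a': 2.0, 'b': 0.5}}
loss = td_loss([('s0', 'a', 1.0, 's1')], q, q, 0.9)  # (1 + 0.9*2 - 0)^2 = 7.84
```

In an actual DQN the two tables are neural networks and the target network's parameters are only periodically synchronized with the evaluation network, which stabilizes training.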

In this paper, the single-step reward function is defined based on the impact of recovering a single node on network performance, as shown in formula (10):

r_i(G_i, f_i) = (1/K) · η(G_i + {f_i, E_{i,N_i}}) / η(G) = (1/K) · η(G_{i+1}) / η(G)    (10)

Specifically, G_{i+1} refers to the network after the failed node f_i has been recovered, N_i is the set of neighbor nodes of f_i, and E_{i,N_i} is the set of edges directly connected to f_i: when f_i is recovered, the links directly connected to it are automatically recovered as well. However, if a neighbor u_x of the currently recovering node f_i is still in a failed state, the link between the two nodes remains invalid. To calculate the reward for recovering node f_i, the network performance after the recovery action is divided by the normal network performance, and the result is normalized by 1/K.
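A sketch of formula (10), using the size of the largest currently working component as a stand-in performance measure (the paper uses GCoCR): recovering f_i implicitly revives its links to normal neighbors, since performance is computed over working nodes only.

```python
# Sketch of formula (10), with the size of the largest currently working
# component as a stand-in performance measure (the paper uses GCoCR).

def performance(adj, failed):
    alive = {v for v in adj if v not in failed}
    seen, best = set(), 0
    for s in alive:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in alive and w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        best = max(best, len(comp))
    return best / len(adj)

def step_reward(adj, failed, f_i, K):                 # formula (10)
    after = failed - {f_i}                            # failure set of G_{i+1}
    return (1.0 / K) * performance(adj, after) / performance(adj, set())

adj = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
r1 = step_reward(adj, {2, 4}, 2, 2)  # {1,2,3} works: (1/2) * 0.75 = 0.375
```

Summing these per-step rewards over a full recovery sequence reproduces the NCRE of formula (1), which is what the agent maximizes.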


We introduce the network recovery method based on graph neural networks and deep Q-learning (R-GQN), presented in Algorithm 2. Firstly, Algorithm 1 is employed to obtain the node representation h_v^f and whole-network representation h_g^f of the failure network, as well as the node representation h_v^n and whole-network representation h_g^n of the normal network. These representations are concatenated to obtain the node representation h_v and network representation h_g. The final node representation is determined in step 8 of Algorithm 2. At each step of the recovery process, the node with the maximum scalar embedding is selected for recovery. The corresponding reward is then calculated from the change in network state, enabling continuous optimization of the node recovery sequence F̂.
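The selection loop of Algorithm 2 can be sketched with a stand-in scoring function; here node degree replaces the learned Q-value/scalar embedding used by R-GQN:

```python
# Sketch of the selection loop in Algorithm 2: at each step the failed node
# with the highest score is recovered.  Node degree is a stand-in for the
# learned Q-value/scalar embedding used by R-GQN.

def greedy_recovery(adj, failed, score):
    remaining, order = set(failed), []
    while remaining:
        f = max(remaining, key=score)  # stand-in for argmax over Q-values
        remaining.remove(f)
        order.append(f)
    return order

adj = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1, 5], 5: [4]}
order = greedy_recovery(adj, {1, 2, 5}, score=lambda v: len(adj[v]))
# the hub (node 1, degree 3) is recovered first
```

In the full method the score is recomputed after every recovery, since the embeddings depend on the current network state rather than on a fixed centrality.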

4 Experiments and Analysis of Results

4.1 Experimental Setting

Data Set. We generated several random Barabási-Albert (BA) networks with 30–50 nodes as the training data set. Each node was assigned a contribution value drawn uniformly from 0–1, and the node state was set to 1, indicating a normal network. We then subjected each network to both high-degree and random attacks, invalidating 50% of its nodes to create a failure network. We first attacked 18% of the nodes with a high-degree attack, based on the threshold of 0.18 for disintegration into fragments reported by Albert [22]. To enhance the generalization ability of our model, we then randomly attacked a further 32% of the nodes. Failed nodes had their state set to −1. Using this method, we obtained 2000 failed BA networks as the training set. We also generated test sets with network sizes of 50–100, 100–200, and 200–500 nodes by the same method to evaluate the effectiveness of our approach.

Simulation Platform and Parameters. We conducted our simulations on a Windows 10 platform with an Intel Core i7-9750H CPU, an NVIDIA GeForce RTX 2060 GPU, and 16 GB RAM, using Python 3.9, PyTorch 1.11.0, and igraph 0.10.2. The main parameters were set as follows: the greed factor ε = 0.9, the discount factor γ = 0.9, the learning rate 5 × 10⁻⁶, and 6 × 10⁴ training epochs.
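The data-set construction can be sketched as follows. The BA generator below is a simplified preferential-attachment stand-in (not the authors' exact setup), and the 18% high-degree / up-to-50% random split mirrors the text:

```python
# Sketch of the data-set construction: a small Barabási-Albert network by
# preferential attachment, then an 18% high-degree attack plus random
# failures up to 50% of the nodes.  Simplified stand-in generator.
import random

def ba_network(n, m, rng):
    adj = {v: set() for v in range(n)}
    targets = list(range(m))       # first new node attaches to the seed nodes
    repeated = []                  # node list weighted by degree
    for v in range(m, n):
        for t in targets:
            adj[v].add(t)
            adj[t].add(v)
        repeated.extend(targets)
        repeated.extend([v] * m)
        chosen = set()
        while len(chosen) < m:     # preferential choice of the next targets
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return adj

rng = random.Random(0)
adj = ba_network(40, 2, rng)
by_degree = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
failed = set(by_degree[:int(0.18 * len(adj))])   # high-degree attack (18%)
nodes = list(adj)
while len(failed) < len(adj) // 2:               # random attack up to 50%
    failed.add(rng.choice(nodes))
```

Mixing a targeted attack with random failures in the training data exposes the model to both fragmented and mildly degraded networks, which is what motivates the 18%/32% split in the text.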

4.2 Analysis of Results

Figure 1 illustrates how network performance changes as nodes are recovered. The horizontal axis represents the recovery ratio of failed nodes, while the vertical axis represents the normalized network performance. The betweenness centrality (BC) and degree centrality (DC) methods are compared with the proposed R-GQN method. A failure network G′ is selected from a test set, with 24 failed nodes in the initial state and an initial performance of 0.48. It can be clearly seen from Fig. 1 that network performance improves significantly faster when recovered using the R-GQN method than when using the BC and DC methods in the early stages of recovery. As the number of recovered nodes increases,


Fig. 1. Number of nodes recovered and network performance changes.

Fig. 2. R-GQN model training loss curve.

R-GQN enables faster network performance recovery, while the performance recovered by the BC and DC methods changes roughly linearly; R-GQN consistently outperforms the other two methods. For instance, when 25% of the failed nodes have been recovered, R-GQN restores network performance to 0.91, while BC and DC restore it to only 0.67; when 58% of the failed nodes have been recovered, R-GQN has already restored network performance to 0.98. This analysis indicates that R-GQN tends to recover the network more quickly, with as few recovered nodes (i.e., as little recovery cost) as possible, thereby maximizing the network's performance. The area under the curve in Fig. 1 approximates the normalized cumulative recovery efficiency (NCRE), which also indicates the speed of network recovery; formula (1) provides the exact calculation.


Figure 2 shows the training loss of the R-GQN model. Around 3000 epochs the loss fluctuates significantly, but it gradually decreases thereafter. Overall, during the 60,000-epoch training process, the loss fluctuates within a reasonable range, exhibits a decreasing trend, and gradually converges.

Fig. 3. Comparison of GCoCR values after network recovery by different methods.

Figure 3 presents a comparison of the normalized cumulative recovery efficiency (NCRE) scores obtained by five methods: the proposed R-GQN, betweenness centrality (BC), closeness centrality (CC), degree centrality (DC), and PageRank (PR). The experiments recover networks of varying sizes, ranging from 30 to 50, 50 to 100, 100 to 200, and 200 to 500 nodes, in each case with 50% of the nodes failed relative to the original network size. The results demonstrate that R-GQN outperforms all other methods, particularly on small networks with 30–50 nodes, where it achieves a performance improvement of approximately 10.32% over the other methods. Moreover, although trained only on small networks, the R-GQN model still outperforms the other methods on larger networks, by 9.75%, 7.63%, and 4.91% for the three sizes above 50 nodes, respectively. R-GQN performs best on networks of a similar size to the training data, and its advantage shrinks as network size grows; however, increasing the number of training iterations may enable R-GQN to perform better on large-scale networks.


5 Conclusion

This paper presents a novel approach for network recovery, leveraging a graph neural network to represent the nodes of the failure network and deep reinforcement learning to solve for the network recovery sequence. Our aim is to find the sequence that recovers network performance to the initial level as quickly as possible. The proposed network recovery method based on graph neural networks and deep Q-learning (R-GQN) pursues the highest positive effect at each single recovery step. Our comparative experimental analysis reveals that the proposed R-GQN method surpasses other existing methods, with a maximum performance improvement of approximately 10.32%.

References
1. Buldyrev, S.V., Parshani, R., Paul, G., Stanley, H.E., Havlin, S.: Catastrophic cascade of failures in interdependent networks. Nature 464, 1025–1028 (2010)
2. Zhong, J., Zhang, F., Yang, S., Li, D.: Restoration of interdependent network against cascading overload failure. Physica A: Stat. Mech. Appl. 514, 884–891 (2019)
3. Fan, C., Zeng, L., Sun, Y., Liu, Y.-Y.: Finding key players in complex networks through deep reinforcement learning. Nat. Mach. Intell. 2, 317–324 (2020)
4. Majdandzic, A., Podobnik, B., Buldyrev, S.V., Kenett, D.Y., Havlin, S., Eugene Stanley, H.: Spontaneous recovery in dynamical networks. Nature Phys. 10, 34–38 (2014)
5. Liu, C., Li, D., Fu, B., Yang, S., Wang, Y., Lu, G.: Modeling of self-healing against cascading overload failures in complex networks. EPL 107, 68003 (2014)
6. Chaoqi, F., Ying, W., Xiaoyang, W.: Research on complex networks' repairing characteristics due to cascading failure. Physica A: Stat. Mech. Appl. 482, 317–324 (2017)
7. Fu, C., Wang, Y., Gao, Y., Wang, X.: Complex networks repair strategies: dynamic models. Physica A: Stat. Mech. Appl. 482, 401–406 (2017)
8. Chaoqi, F., Ying, W., Kun, Z., Yangjun, G.: Complex networks under dynamic repair model. Physica A: Stat. Mech. Appl. 490, 323–330 (2018)
9. Wu, J., Chen, Z., Zhang, Y., Xia, Y., Chen, X.: Sequential recovery of complex networks suffering from cascading failure blackouts. IEEE Trans. Netw. Sci. Eng. 7, 2997–3007 (2020)
10. Huang, Y., Wu, J., Ren, W., Tse, C.K., Zheng, Z.: Sequential restorations of complex networks after cascading failures. IEEE Trans. Syst. Man Cybern. Syst. 51, 400–411 (2021)
11. Wu, J., Fang, B., Fang, J., Chen, X., Tse, C.K.: Sequential topology recovery of complex power systems based on reinforcement learning. Physica A: Stat. Mech. Appl. 535, 122487 (2019)
12. Zhang, Y., Wu, J., Chen, Z., Huang, Y., Zheng, Z.: Sequential node/link recovery strategy of power grids based on Q-learning approach. In: 2019 IEEE International Symposium on Circuits and Systems (ISCAS). IEEE, New York (2019)
13. Jia, H., Gai, Y., Li, B., Zheng, H.: Power communication network fault recovery method based on reinforcement learning. Electr. Power 53, 34–40 (2020)
14. Jia, H., Gai, Y., Zheng, H.: Network recovery for large-scale failures in smart grid by reinforcement learning. In: 2018 IEEE 4th International Conference on Computer and Communications (ICCC), pp. 2658–2663. IEEE, Chengdu, China (2018)
15. Hu, F., Yeung, C.H., Yang, S., Wang, W., Zeng, A.: Recovery of infrastructure networks after localised attacks. Sci. Rep. 6, 24522 (2016)
16. Shang, Y.: Localized recovery of complex networks against failure. Sci. Rep. 6, 30521 (2016)
17. Wang, T., Zhang, J., Sun, X., Wandelt, S.: Network repair based on community structure. Europhys. Lett. 118, 68005 (2017)
18. Jiang, Y., Yan, Y., Hong, C., Yang, S., Yu, R., Dai, J.: Multidirectional recovery strategy against failure. Chaos Solitons Fract. 160, 112272 (2022)
19. Muro, M.A.D., La Rocca, C.E., Stanley, H.E., Havlin, S., Braunstein, L.A.: Recovery of interdependent networks. Sci. Rep. 6, 22834 (2016)
20. Wu, J., Gong, K., Wang, C., Wang, L.: An optimal recovery algorithm based on connected edges in dependent networks. Acta Physica Sinica 67, 296–307 (2018)
21. Hamilton, W.L., Ying, R., Leskovec, J.: Inductive representation learning on large graphs. In: Advances in Neural Information Processing Systems 30 (NIPS 2017). Long Beach, CA (2017)
22. Albert, R., Jeong, H., Barabási, A.-L.: Error and attack tolerance of complex networks. Nature 406, 378–382 (2000)

Multi-Scale Feature Fusion Fault Diagnosis Method Based on Attention Mechanism

Feilong Yu, Funa Zhou(B), and Chang Wang

Shanghai Maritime University, Shanghai 200135, China
[email protected]

Abstract. As a key component of electromechanical equipment in intelligent manufacturing, rolling bearings play an important role in securing stable operation. Deep learning techniques facilitate feature extraction from data; however, when dealing with data from multiple working conditions, conventional deep learning approaches tend to represent the features of any single condition inaccurately, which degrades the model's fault diagnosis accuracy. Methods that first distinguish the operating conditions and then model each condition separately cannot guarantee real-time fault diagnosis, and the limited amount of labelled data per condition further hinders real-time diagnosis. Designing a multi-condition feature extraction method is therefore imperative. This study proposes a multi-scale feature fusion approach based on the attention mechanism, which addresses the insufficient information filtering of traditional multi-scale feature fusion methods under multi-condition, high-noise scenarios that otherwise leads to poor fault diagnosis accuracy. The proposed method leverages multiple networks to extract features from both single-condition and mixed-condition data. Through the attention mechanism, condition-discriminative features are selectively identified, enhancing the effectiveness of information fusion and ultimately improving the accuracy of multi-condition fault diagnosis. Experimental validation was conducted on the bearing dataset from Case Western Reserve University. The results demonstrate that, under multi-condition, high-noise scenarios, the proposed method achieves higher diagnostic accuracy than other multi-scale learning approaches.

Keywords: Fault Diagnosis · Multi-scale Feature Fusion · Attention Mechanism · Multiple Working Conditions

1 Introduction

1.1 Research Status

As a key component of electromechanical equipment, rolling bearings play an important role in the safe and stable operation of the whole electromechanical system. However, due to overload, aging, and other factors, rolling bearings are prone to failure. Studying rolling bearing fault diagnosis is therefore of great significance [1, 2].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 353–360, 2023. https://doi.org/10.1007/978-981-99-6187-0_35


Existing bearing fault diagnosis methods fall into three categories: methods based on physical models, methods based on prior knowledge, and data-driven methods. Model-based fault diagnosis uses a physical or statistical model of the system to diagnose faults; constructing an accurate model demands considerable effort and domain knowledge, which limits this class of methods. Knowledge-based methods depend heavily on the quality and completeness of expert knowledge, which limits their generalization performance. Data-driven methods, by contrast, build a fault diagnosis model from large amounts of data, without requiring accurate system models or expert knowledge. Among them, fault diagnosis based on deep learning has attracted wide attention for its powerful ability to learn data features autonomously [3, 4]. Deep learning is an effective feature extraction tool: through its network structure, complex mapping relationships can be represented so as to capture the key information in the data. According to the network structure, deep learning fault diagnosis methods can be divided into three categories: deep belief network-based, convolutional neural network-based, and stacked autoencoder-based fault diagnosis. Existing multi-condition fault diagnosis methods mainly improve diagnostic performance by eliminating feature distribution differences [5] or by multi-scale feature fusion [6]. Multi-scale feature fusion learning is a data-driven deep learning method; fault features at different scales carry different abstract meanings.
Multi-scale feature fusion can extract more comprehensive information, including coarse-grained global information and fine-grained local detail. However, existing multi-scale fusion does not make full use of the information of the different working conditions, and the data collected in industrial fields often contain a large amount of noise. Under strong noise, the traditional multi-scale fusion model loses information with similar fault characteristics, which weakens the effectiveness of information fusion. Therefore, based on the idea of full information utilization, this paper designs an attention-based feature fusion mechanism: the query is set to the mixed-condition features and the keys to the single-condition features, so that condition-discriminative information is extracted from each single condition and the information of the different working conditions is fully used. The complete condition information helps screen out condition-discriminative features, and the fused features are finally used for fault diagnosis. The innovations of this paper are as follows:

(1) A multi-scale feature fusion learning model based on the attention mechanism is designed for the multi-condition, strong-noise problem. Traditional multi-scale fusion ignores condition fault information submerged by noise and retains redundant features that do not help multi-condition classification.

(2) Multiple networks extract features from single-condition and mixed-condition data. Through the attention mechanism, the query is set to the multi-condition features and the keys to the single-condition features; the attention paid to features useful for multi-condition diagnosis is increased to obtain condition-discriminative features, which improves the effectiveness of information fusion.

(3) The proposed method retains weak information conducive to multi-condition classification through the attention mechanism, suppresses useless information corrupted by noise, and improves the model's multi-condition fault diagnosis accuracy.

2 Related Work

2.1 Fault Diagnosis Methods Based on the Attention Mechanism

The attention mechanism is a special structure that can be embedded in a deep learning model to guide it to process data features selectively, letting the model focus on more valuable features and thus complete deep learning tasks efficiently. The attention mechanism filters information by constructing key-value pairs: the key encodes the importance of a piece of information, the query represents a request for information, and the degree of attention paid to a feature value is obtained by matching the query against the keys. The role of the attention mechanism is to capture the correlation between vectors, process data selectively according to that correlation, and thereby focus on the most important information. Literature [7] proposes a fault diagnosis method based on an attention mechanism and a bidirectional gated recurrent unit (DCA-BiGRU), which extracts vibration-signal fusion features with attention weights. Literature [8] proposes an attention-based multi-sensor fusion network that fuses different sensors through the attention mechanism and introduces similarity attention to model the time-series signals between sensors. Literature [9] proposes a new attention structure based on a multi-scale CNN, which helps the model automatically mine multi-scale features from the raw signal and adaptively focus on relevant fault information under different working conditions. However, this method does not make full use of the information of the different working conditions, so it cannot accurately extract the fault information of multiple conditions.

3 Multi-Scale Feature Fusion Fault Diagnosis Algorithm Based on Attention Mechanism

Existing multi-scale feature fusion fault diagnosis cannot adaptively adjust the weights according to the scale features of the different working conditions. This section therefore adopts a modular design: multiple networks extract features from the bearing fault data under different working conditions, and then, in the attention mechanism, the query is set to the multi-condition fault features and the keys to the single-condition features, increasing the attention paid to single-condition features that benefit multi-condition diagnosis. The attention-based multi-scale feature fusion method consists of a working-condition separation and feature pre-extraction module, an attention-based feature fusion module, and a fault diagnosis module. The specific steps are as follows:


Step 1: Working condition division and feature pre-extraction
The data characteristics of the different working conditions are obtained by the DNN networks of the different modules. Single-condition and mixed-condition features are computed as shown in formulas (1) and (2):

Feature_q = f(W_{q,2}(W_{q,1} X_q + b_{q,1}) + b_{q,2})    (1)

Feature_all = f(W_{all,2}(W_{all,1} X_all + b_{all,1}) + b_{all,2})    (2)

where W_{q,1} and W_{q,2} are the network weights of the q-th module DNN, b_{q,1} and b_{q,2} are the network biases of the q-th module DNN, W_{all,1} and W_{all,2} are the network weights of the mixed-condition module DNN, b_{all,1} and b_{all,2} are its network biases, and f(·) is the nonlinear ReLU activation function. The deep features extracted by the modular DNNs under the different working conditions are concatenated, as shown in formula (3):

Feature_m = [Feature_1, Feature_2, ..., Feature_q, Feature_all]    (3)

Step 2: Use the attention mechanism to drive the screening of multi-condition scale features
Multi-scale feature fusion can lose similar fault feature information. By fusing the single-condition scale features with the multi-condition scale features through the attention mechanism, this lost information is recovered from each condition. The query module, key module, and original input module of the attention mechanism in the multi-scale feature fusion module are constructed as shown in formulas (4)–(6):

Q = {Feature_all, Feature_all, ..., Feature_all}    (4)

K = [Feature_all, Feature_1, Feature_2, ..., Feature_q]    (5)

V = [Feature_all, Feature_1, Feature_2, ..., Feature_q]    (6)

The query module holds the mixed-condition fault features, the key module holds the single-condition fault features, and V holds the original input features. After the three feature modules with the same dimensions are obtained, the similarity between the single-condition and mixed-condition features is computed through the attention mechanism to increase the attention paid to condition-discriminative feature information. Once the attention weights are obtained, the original input features are weighted to screen out the condition-discriminative features; the fused multi-scale features represent the multi-condition data more comprehensively. The selection process is shown in formulas (7) and (8):

S_n = softmax(Q^T W_{at} K)    (7)

Feature_{at} = S_n V    (8)

where W_{at} is the weight of the attention mechanism and S_n is the normalized attention score. The fused multi-condition scale features are then fed into the fully connected layer, as shown in formula (9):

Feature_{mul} = σ(W_{mul} Feature_{at} + β_{mul})    (9)

The fused multi-scale features finally enter the softmax layer to produce the prediction, as shown in formula (10):

y_{predict} = softmax(Feature_{mul})    (10)
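As a concrete illustration of Steps 1 and 2, the sketch below implements the modular feature extraction of formulas (1)–(2) and the attention fusion of formulas (4)–(10) in NumPy. All dimensions, random weights, and inputs are illustrative assumptions, not values from the paper:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d_in, d_h, q_cond, n_class = 8, 4, 3, 4      # illustrative dimensions

# Eqs. (1)-(2): one two-layer DNN per single condition, plus one for the mixed data
def module_features(x, W1, b1, W2, b2):
    return relu(W2 @ (W1 @ x + b1) + b2)

params = [(rng.normal(size=(d_h, d_in)), np.zeros(d_h),
           rng.normal(size=(d_h, d_h)), np.zeros(d_h)) for _ in range(q_cond + 1)]
inputs = [rng.normal(size=d_in) for _ in range(q_cond + 1)]   # last entry = mixed data
feats = [module_features(x, *p) for x, p in zip(inputs, params)]
feat_single, feat_all = feats[:q_cond], feats[q_cond]

# Eqs. (4)-(6): query = mixed-condition feature; keys/values = mixed + single features
K = np.stack([feat_all] + feat_single)       # shape (q_cond + 1, d_h)
V = K.copy()

# Eqs. (7)-(8): similarity scores and weighted fusion
W_at = rng.normal(size=(d_h, d_h))
S_n = softmax(K @ W_at @ feat_all)           # one attention weight per key
feature_at = S_n @ V                         # fused, condition-discriminative feature

# Eqs. (9)-(10): fully connected layer (sigma taken as sigmoid) then softmax classifier
W_mul, beta_mul = rng.normal(size=(n_class, d_h)), np.zeros(n_class)
y_predict = softmax(sigmoid(W_mul @ feature_at + beta_mul))
```

The attention weights S_n sum to one over the mixed-condition key and the q single-condition keys, so a condition whose features correlate strongly with the mixed-condition query contributes more to the fused feature.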

Step 3: Fine-tune the parameters of the multi-scale feature fusion network through the global optimization strategy
After condition classification, the amount of data per single condition is small, which makes feature extraction by the neural network difficult. The global optimization strategy is therefore used to fine-tune the parameters in reverse and realize joint optimization of the samples under different conditions. The loss function of the network is the cross-entropy shown in Eq. (11):

loss = -(1/m) Σ_{i=1}^{m} [ y_{label} ln y_{predict} + (1 − y_{label}) ln(1 − y_{predict}) ]    (11)

After the loss is obtained, the global model parameters are updated through back propagation until the error is less than the threshold. The updating process is shown in formulas (12) and (13):

θ_q = θ_q − lr ∇θ_q    (12)

w_{at} = w_{at} − lr ∇w_{at}    (13)

where θ_q is the model parameter of the q-th working condition, ∇θ_q is the gradient of that parameter with respect to the forward-propagation loss, w_{at} is the attention-mechanism parameter, which is also updated during back propagation, and ∇w_{at} is the gradient of the loss with respect to the attention parameters.
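A minimal NumPy sketch of the loss of Eq. (11) and the gradient-descent updates of Eqs. (12)–(13); the toy labels, predictions, gradients, and learning rate are illustrative only:

```python
import numpy as np

def cross_entropy(y_label, y_predict, eps=1e-12):
    # Eq. (11): mean cross-entropy over m samples (note the leading minus sign)
    y_predict = np.clip(y_predict, eps, 1.0 - eps)
    return -np.mean(y_label * np.log(y_predict)
                    + (1.0 - y_label) * np.log(1.0 - y_predict))

def sgd_step(theta, grad, lr=0.01):
    # Eqs. (12)-(13): the same rule updates both the per-condition module
    # parameters theta_q and the attention weights w_at during fine-tuning
    return theta - lr * grad

y_label = np.array([1.0, 0.0, 1.0, 0.0])
y_predict = np.array([0.9, 0.1, 0.8, 0.2])
loss = cross_entropy(y_label, y_predict)            # ~0.164
theta = sgd_step(np.array([1.0, -1.0]), np.array([0.5, -0.5]), lr=0.1)
```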

4 Experiment and Analysis

The experiments in this section use the bearing data set provided by the Bearing Data Center of Case Western Reserve University. The sampling frequency is 48 kHz, and the motor bearing fault diameters are 0.007 inch and 0.014 inch. Bearing states are divided into four types: inner-race fault, outer-race fault, ball fault, and normal. The load under the different working conditions ranges from 0 HP to 3 HP.


To verify the fault diagnosis accuracy of the proposed method on multi-condition data, we used the data of Experiments 1–6, subject to different degrees of noise interference under multiple working conditions, to carry out four-class bearing fault diagnosis, and compared against several commonly used multi-scale feature fusion models. The basic experimental design is shown in Table 1. To simulate real industrial conditions, Gaussian noise is added to the data. The results are shown in Table 2.

As the first column of Table 2 shows, DNN has the lowest accuracy of the five methods: its feature extraction ability is insufficient, and under noise it cannot recover the true data distribution from multi-condition data. BNDNN in the second column is much more accurate than DNN because normalization and dropout are added during training, which attenuates differences in feature distribution and improves generalization. However, BNDNN uses only the output features of the last layer for classification; since not all information is passed to the next layer during layer-by-layer feature extraction, features are lost and too few features take part in classification. MDNN in the third column combines the features of the last layer with those of earlier layers, improving the comprehensiveness of fusion by fusing DNN features at different scales. Although MDNN integrates multi-layer DNN features, it cannot distinguish which scale's features are most sensitive to mechanical faults, which limits the accuracy of feature extraction. ADDNN in the fourth column weights the multi-scale features and modulates them adaptively; however, it does not distinguish the working conditions and fuses the multi-scale features directly without making full use of per-condition information, so it is disturbed by the noise of the different conditions, which limits the network's generalization ability. AMDNN in the fifth column fuses the single-condition features and the mixed features through the attention mechanism, makes full use of the information of the different working conditions, and extracts the multi-condition diagnostic features submerged by noise in the single-condition features, so it maintains good classification accuracy on multi-condition data.


Table 1. The design of experiments

| Experiment | Working condition (HP) | Fault size (inch) | Samples per working-condition module | Samples without condition label | Test set samples | SNR (dB) |
|------------|------------------------|-------------------|--------------------------------------|---------------------------------|------------------|----------|
| Experiment1 | 0/1/2/3 | 0.007 | 160/160/160/160 | 800 | 400 | 10 |
| Experiment2 | 0/1/2/3 | 0.007 | 160/160/160/160 | 800 | 400 | 5 |
| Experiment3 | 0/1/2/3 | 0.007 | 160/160/160/160 | 800 | 400 | 0 |
| Experiment4 | 0/1/2/3 | 0.007 | 160/160/160/160 | 800 | 400 | -2 |
| Experiment5 | 0/1/2/3 | 0.007 | 160/160/160/160 | 800 | 400 | -5 |
| Experiment6 | 0/1/2/3 | 0.007 | 160/160/160/160 | 800 | 400 | -10 |
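The noise injection used in Experiments 1–6 can be reproduced with a small helper that scales white Gaussian noise to a target SNR in dB; the sine wave below stands in for a vibration segment and is an illustrative assumption:

```python
import numpy as np

def add_gaussian_noise(signal, snr_db, rng=None):
    """Add white Gaussian noise scaled so the result has the requested SNR (dB)."""
    rng = rng or np.random.default_rng()
    p_signal = np.mean(signal ** 2)
    p_noise = p_signal / (10.0 ** (snr_db / 10.0))    # SNR(dB) = 10*log10(Ps/Pn)
    return signal + rng.normal(0.0, np.sqrt(p_noise), size=signal.shape)

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 8.0 * np.pi, 4800))   # stand-in vibration segment
noisy = add_gaussian_noise(clean, snr_db=-5, rng=rng)
measured_snr = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

At SNR = -5 dB the noise power is roughly three times the signal power, which is why the weaker baselines in Table 2 degrade sharply in Experiments 5 and 6.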

Table 2. Experiment result

| Experiment | DNN | BNDNN | MDNN | ADDNN | AMDNN |
|------------|-------|--------|-------|--------|--------|
| Experiment1 | 42.5% | 47.9% | 54.3% | 68.2% | 81.4% |
| Experiment2 | 40.7% | 43.3% | 52.5% | 65.2% | 78.5% |
| Experiment3 | 38.5% | 42.3% | 48.7% | 64.4% | 78.6% |
| Experiment4 | 37.2% | 41.1% | 46.4% | 62.6% | 78.4% |
| Experiment5 | 34.2% | 38.7% | 45.7% | 61.6% | 76.4% |
| Experiment6 | 32.7% | 36.5% | 43.2% | 58.1% | 75.5% |

5 Conclusion

To solve the problem of insufficient information fusion in multi-scale learning methods under multiple working conditions and strong noise, this paper proposes a multi-scale feature fusion fault diagnosis method based on the attention mechanism. Experimental results show that, under multiple working conditions and strong noise, the fault diagnosis accuracy of the proposed method is up to 17.4% higher than that of existing feature fusion methods, which demonstrates its effectiveness in this setting.

References

1. Zaman, S.M.K., Liang, X.: An effective induction motor fault diagnosis approach using graph-based semi-supervised learning. IEEE Access 9, 7471–7482 (2021)
2. Yu, W., Lv, P.: An end-to-end intelligent fault diagnosis application for rolling bearing based on mobilenet. IEEE Access 9, 41925–41933 (2021)


3. Lei, Y.G., Lin, J., Zuo, M.J., et al.: Condition monitoring and fault diagnosis of planetary gearboxes: a review. Measurement 48, 292–305 (2014)
4. Youssef, A., Delpha, C., Diallo, D.: An optimal fault detection threshold for early detection using Kullback-Leibler divergence for unknown distribution data. Signal Process. 120, 266–279 (2016)
5. Guo, L., Lei, Y., Xing, S., Yan, T., Li, N.: Deep convolutional transfer learning network: a new method for intelligent fault diagnosis of machines with unlabeled data. IEEE Trans. Ind. Electron. 66(9), 7316–7325 (2019)
6. He, J., et al.: Bearing fault diagnosis via improved one-dimensional multi-scale dilated CNN. Sensors 21, 7319 (2021)
7. Zhang, X., et al.: Fault diagnosis for small samples based on attention mechanism. Measurement 187, 110242 (2022)
8. Li, D., Li, D., Li, C., Li, L., Gao, L.: A novel data-temporal attention network based strategy for fault diagnosis of chiller sensors. Energy Build. 198, 377–394 (2019)
9. Yao, Y., Zhang, S., Yang, S., Gui, G.: Learning attention representation with a multi-scale CNN for gear fault diagnosis under different working conditions. Sensors 20(4), 1233 (2020). https://doi.org/10.3390/S20041233

Designing Philobot: A Chatbot for Mental Health Support with CBT Techniques

Qi Ge, Lu Liu, Hewei Zhang, Linfang Li, Xiaonan Li, Xinyi Zhu, Lejian Liao, and Dandan Song(B)

School of Computer Science and Technology, Southeast Academy of Information Technology, Beijing Institute of Technology, Beijing 100081, China
[email protected]

Abstract. Mental health issues are a major concern for teenagers, and access to affordable and accessible support is critical for promoting positive mental health outcomes. Philobot is a chatbot designed to provide teenagers with personalized mental health support through cognitive behavioral therapy (CBT) practices. It also offers a chatting module that facilitates building rapport between the virtual assistant and its users. This paper presents the design, implementation, and validation of Philobot's key features, including its fine-grained intent classification system, FAQ retrieval model, and response generation model using top-k sampling. Evaluation results show that Philobot is a promising tool for promoting positive mental health outcomes in teenagers.

Keywords: chatbot · cognitive behavioral therapy · dialogue system

1 Introduction

Mental health issues are a significant and growing concern, particularly among teenagers, who face numerous challenges such as academic stress, peer pressure, and family conflicts. Early intervention is essential for maintaining mental health, but many teenagers face barriers to accessing affordable and accessible mental health support. To address this issue, we present Philobot, a virtual assistant designed to provide teenagers with personalized mental health support through cognitive behavioral therapy (CBT) practices targeting the mental health issues they most commonly experience. Additionally, Philobot offers a chatting module that facilitates building rapport between the chatbot and its users. In this paper, we present the design, implementation, and validation of Philobot. We first discuss the background and related work on virtual assistants for mental health support in Sect. 2. We then describe the overall design of Philobot in Sect. 3, covering its CBT module and chatting module. Technical details of Philobot's architecture, including the use of the FastAPI and Vue frameworks for the backend and frontend, are given in Sect. 4, and Sect. 5 presents the results of user experience questionnaires measuring Philobot's effectiveness. Finally, we discuss the implications of our work and future directions for chatbots in mental health support in Sect. 6.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 361–371, 2023. https://doi.org/10.1007/978-981-99-6187-0_36

2 Background

According to statistics, more than 300 million people currently suffer from mental health issues caused by anxiety, stress, and depression, yet less than 10% of those who need it have access to formal psychological treatment or assistance, due to barriers such as the cost, time, and location constraints of therapy [1]. This highlights the urgent need for accessible and affordable mental health support, especially for teenagers, who face multiple challenges and are at high risk of developing mental health issues [2]. Chatbots for mental health support have emerged as a potential solution to these issues [3]. Cognitive behavioral therapy (CBT) is a structured, short-term, cognition-focused psychological therapy commonly used to treat anxiety disorders and depression caused by irrational cognition [4]. Typical mental health chatbots include Woebot [5] and Wysa [6], which have demonstrated a certain level of effectiveness in psychological evaluations; both include CBT practices for their structured and controllable nature.

The techniques for building dialogue models fall into three types of approaches: rule-based, retrieval-based, and generative [7]. Rule-based dialogue systems use predefined scripts to guide the conversation [8]. They are the best at enforcing controllable dialogue flows and have thus long been favored in mental health settings [9], but they have limited ability to handle open-ended conversations. Retrieval-based dialogue systems rely on a database of predefined responses to reply to a user's input; such systems work like retrieval-based question answering systems that select a response based on either query-question similarity or query-answer relevance [10,11]. Retrieval-based systems are more flexible than rule-based systems because they can handle a wider range of inputs, but constructing a high-quality response database is not easy in the mental health domain. Generative dialogue systems generate responses on the fly. These systems are commonly built on pretrained language models [12] and are more capable of handling complex or open-ended conversations than rule-based and retrieval-based systems. However, they require large amounts of training data and are more computationally expensive to run. Moreover, responses generated by a language model may contain unsafe content, a particular risk for users with mental health problems [13].

3 Design

Philobot consists of two main modules: the Cognitive Behavioral Therapy (CBT) module and the chatting module.


The CBT module offers a collection of CBT practices for users to experience; users engage with these practices by interacting with Philobot in a constrained way, such as through buttons. The chatting module allows users to converse with Philobot in natural language. It handles casual chitchat with a generative model, frequently asked questions (FAQs) about the chatbot itself with a retrieval-based model, and utterances indicating specific cases or topics with a rule-based model; those cases are listed in Table 1. With these two modules, Philobot provides a comprehensive and personalized approach to mental health support. The two modules also interact with each other, as depicted in Fig. 1: when the user chats casually with the bot and triggers a specific topic, he or she may be offered a CBT exercise related to that topic, and once the user has finished the practice's procedure, a homework section becomes available to reinforce the practice's impact.

Fig. 1. Interaction between the CBT and chatting module (casual chat may trigger a scripted interaction and may recommend a CBT exercise; an exercise can be performed multiple times, and its homework opens after the exercise is finished)

3.1 CBT Module

The CBT module of Philobot is designed to provide users with a list of practices that address common emotional issues faced by teenagers, such as academic stress and conflicts with parents. The module is accessible through the chat interface of Philobot, where users can initiate a conversation and receive carefully designed CBT therapy. Each CBT practice is represented as a YAML file following a format that we designed, in which variables and procedures can be defined and then executed by our self-written interpreter. We store the dialogue state at each turn in a Redis database for fast retrieval and editing. Once the conversation is completed, users are given takeaway homework to practice in their spare time. This feature allows users to track their progress and reinforce their commitment to practicing the techniques they learned during the conversation. All homework records are saved in a MySQL database (Fig. 2).
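The paper does not publish its YAML schema, so the fragment below is a purely hypothetical sketch of what such a practice file could look like; every field name is invented for illustration:

```yaml
# Hypothetical sketch - field names are invented, not the authors' actual format
practice:
  id: exam_anxiety
  title: Coping with exam anxiety
  variables:
    - worry              # filled in from the user's reply at runtime
  procedure:
    - say: "What thought goes through your mind right before an exam?"
    - store: worry
    - say: "Let's weigh the evidence for and against that thought."
    - buttons: [Evidence for, Evidence against]
  homework:
    - "Note one anxious thought each day and write a balanced alternative."
```

The appeal of such a declarative format is that therapists can author or revise practices without touching the interpreter code, while the per-turn state kept in Redis records how far through the procedure each user has progressed.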


Fig. 2. CBT module user interface

3.2 Chatting Module

The chatting module of Philobot provides users with a more casual way to interact with the chatbot, fostering a sense of comfort and openness that can facilitate discussions of mental health concerns. One of the key advantages of the chatting module is its ability to recommend CBT practices that are specifically tailored to the user’s needs, based on their conversation history. This approach offers a more natural and personalized way for users to access CBT techniques, rather than simply presenting them with a list of options to choose from.

Fig. 3. Flow diagram of the chatting module


Figure 3 presents an overview of the chatting module workflow. During each dialogue turn, the user is either in a scripted or a non-scripted conversation. Scripted conversations are designed to discuss predefined topics: we identified 14 common challenging issues for teenagers, such as academic stress or conflicts with parents, and created 14 corresponding intents (referred to as "critical intents") to indicate these topics. If a critical intent is detected, Philobot initiates a scripted conversation and guides the user through it, implemented in the same way as the practice procedures in the CBT module. Moreover, if a relevant CBT practice concerning the critical topic exists, it is recommended to the user. This approach is intended to make the conversation more natural and effective in supporting the user, as shown in Fig. 4c. Notably, our system includes a critical intent labeled "suicide" that identifies user inputs suggesting a possible psychological crisis, as shown in Fig. 4b. Users who display signs of psychological distress should be encouraged to seek assistance from qualified experts [14].

Fig. 4. Chatting module user interface


When the user is not in a script, their utterance is first processed by a coarse-grained intent classification model. If the utterance is recognized as casual chitchat, a response generation model takes over. If it is a query concerning Philobot itself, a retrieval-based model answers it. If it describes the user's own condition, such as the situation they are in and the emotions they are feeling, it is sent to a fine-grained intent classifier that can trigger a scripted conversation. We used the zero-shot intention recognition model provided by the PaddleNLP toolbox¹ for coarse-grained intent recognition; the model adapts automatically to new data, which fully met our requirements, and achieved an accuracy of 92% on coarse-grained intent classification. Plato-mini, a pretrained dialogue model developed by Baidu [15], serves as the generative model for Philobot's chitchat function. Upon receiving a user utterance, the response generation model generates a response using top-k sampling [16] (k = 10) to promote diversity in responses. To increase the model's engagement and interactivity, we use regular expressions to identify questioning sentences among the generated responses and prioritize returning them; this helps the model better understand the user's situation. The fine-grained classifier employs a bert-base-chinese model [17] with a linear layer for intent classification. As no intent classification dataset specific to mental health support is available, we manually constructed a dataset of 100+ sample sentences representing 14 critical intents covering problems commonly experienced by teenagers. The classification results and sample data are presented in Table 1. However, not all user utterances fall under critical intents. To avoid triggering scripted conversations unnecessarily and degrading the user experience, we set a threshold (0.6) and discard critical intent predictions scoring below it; in such cases, the user receives a generated chitchat response. The FAQ retrieval model addresses queries related to the virtual assistant itself, such as its name, gender, or age; sample queries and responses are provided in Table 2. We use a bert-base-chinese model to extract a vector representation of the user input and pick the answer of the most similar question based on the highest cosine similarity score [18].
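To make the decoding step concrete, here is a minimal NumPy sketch of top-k sampling (k = 10) over a single step's logits; the vocabulary size and logit values are illustrative stand-ins, not Plato-mini's actual distribution:

```python
import numpy as np

def top_k_sample(logits, k=10, rng=None):
    """Sample a token id restricted to the k highest-scoring candidates."""
    rng = rng or np.random.default_rng()
    top = np.argsort(logits)[-k:]              # ids of the k largest logits
    probs = np.exp(logits[top] - logits[top].max())
    probs /= probs.sum()                       # renormalize over the top-k set
    return int(rng.choice(top, p=probs))

rng = np.random.default_rng(0)
logits = rng.normal(size=50)                   # stand-in for one decoding step
token = top_k_sample(logits, k=10, rng=rng)
```

Restricting sampling to the ten most likely candidates keeps responses varied across turns while bounding the probability of low-likelihood degenerate outputs, a useful property in a mental health setting.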

1 https://github.com/PaddlePaddle/PaddleNLP.

Philobot

Table 1. Critical intents classification performance and examples

Intent | Precision | Recall | F1 | Example
study stress | 1.00 | 0.40 | 0.57 | Too difficult, the teacher teaches differently in class, homework is different after class, and the exam is different
break up | 1.00 | 0.75 | 0.85 | After breaking up, I often have pictures of being together in my mind
parent quarrel | 1.00 | 1.00 | 1.00 | But they always quarrel, can't the family live well?
appearance anxiety | 1.00 | 1.00 | 1.00 | Why didn't my parents make me look better?
playing games | 1.00 | 1.00 | 1.00 | I can't control myself and always want to play games
iam bad | 0.57 | 0.66 | 0.61 | I feel useless
procrastination | 1.00 | 1.00 | 1.00 | My classmates don't talk to me anymore
socialphobia | 1.00 | 0.75 | 0.85 | I always feel timid when dealing with others
exam bad | 0.75 | 1.00 | 0.85 | The recent test scores have not improved, and I am anxious
homework stress | 0.80 | 1.00 | 0.88 | I have to do homework until very late every day after school, and I am very sleepy but can't sleep
dislike person | 1.00 | 1.00 | 1.00 | The boy in the next class is too annoying
dislike school | 1.00 | 1.00 | 1.00 | Is there any way not to go to school?
peer pressure | 1.00 | 0.50 | 0.66 | Studying is always a competition with others, it's tiring
suicide | 0.83 | 0.95 | 0.88 | I want to jump off the school dormitory building, or I want to be hit by a car when crossing the road

Table 2. FAQ samples

Sample Question | Sample Answer
Are you male or female? | Philo doesn't have a gender.
Do robots get upset when someone calls them ugly? | Me? Everyone feels upset when they receive negative comments from others, but not all comments are based on facts or goodwill.
Do you have something you really like? | I like chatting with humans and learning more about the world.
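The retrieval step behind Table 2 — embed the query, then return the answer of the most cosine-similar stored question — can be sketched as follows. The toy 3-d vectors are placeholders standing in for the bert-base-chinese embeddings used by the real system.

```python
import numpy as np

def retrieve_answer(query_vec, question_vecs, answers):
    """Return the answer whose stored question embedding is most
    cosine-similar to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    Q = question_vecs / np.linalg.norm(question_vecs, axis=1, keepdims=True)
    return answers[int(np.argmax(Q @ q))]

# Toy "embeddings"; the real system uses bert-base-chinese vectors.
answers = ["Philo doesn't have a gender.",
           "I like chatting with humans and learning more about the world."]
question_vecs = np.array([[1.0, 0.1, 0.0],
                          [0.0, 1.0, 0.2]])
best = retrieve_answer(np.array([0.9, 0.2, 0.0]), question_vecs, answers)
```

Normalizing both sides first makes the dot product equal to the cosine similarity, so `argmax` picks the closest stored question.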

Q. Ge et al.

4 Implementation

For the implementation of Philobot, we used UniApp as the frontend framework and FastAPI as the backend framework. UniApp is a versatile framework based on Vue.js, which allows us to develop a cross-platform application that can be deployed to platforms such as WeChat Mini-Programs, Android and iOS. FastAPI is a high-performance web framework known for its fast development cycle. We chose to deploy Philobot as a WeChat Mini-Program, a lightweight application that can be easily accessed from within the WeChat social media platform. This allows us to reach a wider audience and provide a convenient and familiar interface for users to interact with the assistant. Furthermore, since WeChat Mini-Programs are essentially web pages, UniApp's support for multiple platforms, including iOS and Android, enables us to easily migrate the application to other platforms in the future, providing flexibility and scalability for further development.

Fig. 5. Deployment diagram

Figure 5 depicts the architecture of the system, which runs on a Tencent Cloud server with Ubuntu 20.04. The system is built as a Docker containerized service, accessible through an Nginx reverse proxy on port 80. The backend service is built with FastAPI, which interacts with a MySQL database through the SQLAlchemy ORM and uses Redis as a cache layer. The MySQL database stores business data such as user information and persisted dialogue histories. Redis stores and retrieves dialogue context that is frequently updated during conversations, allowing for more efficient processing of user requests. Meanwhile, Tencent Object Storage Service (OSS) is used for storing and retrieving static files such as user avatars and practice-related images. With this deployment, Philobot provides stable and responsive service to users while keeping data secure and scalable.
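The split between the hot cache and the persistent store can be illustrated with a cache-aside sketch. Plain dictionaries stand in for Redis and MySQL here, and the class name is ours, not the system's.

```python
class ContextStore:
    """Cache-aside sketch of Philobot's storage split (illustrative only)."""

    def __init__(self):
        self.cache = {}   # stands in for Redis: hot dialogue context
        self.db = {}      # stands in for MySQL: persisted histories

    def get_context(self, user_id):
        if user_id in self.cache:            # hot path: serve from cache
            return self.cache[user_id]
        ctx = self.db.get(user_id, [])       # cache miss: load persisted history
        self.cache[user_id] = ctx
        return ctx

    def append_turn(self, user_id, turn):
        ctx = self.get_context(user_id) + [turn]
        self.cache[user_id] = ctx            # frequently updated context stays hot
        self.db[user_id] = ctx               # persist for dialogue history
```

Reads hit the cache first and fall back to the persistent store, which is why frequently updated dialogue context stays fast while histories survive restarts.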

2 https://uni-app.com/.
3 https://fastapi.tiangolo.com/.
4 https://www.tencentcloud.com/.

5 Evaluation

To evaluate the effectiveness of Philobot, we conducted a user satisfaction survey with 30 participants. The survey consisted of a questionnaire with 6 items, using a 5-point scale. The items were designed to measure various aspects of the user experience.

Fig. 6. Philobot user satisfaction questionnaire

We have included our user satisfaction questionnaire in Fig. 6 for reference, and the mean score of each question in Table 3. The results of the survey show that the majority of participants were satisfied with Philobot: the mean score across all six questions is 4.5 out of a maximum of 5. This indicates that participants found Philobot to be a helpful resource for managing mental health concerns. Notably, the second question received the lowest score, suggesting room for improvement in the response generation model. Future efforts can focus on refining the model to enhance user experience and further meet their expectations.

Table 3. Mean scores of user satisfaction survey questions

Question | 1 | 2 | 3 | 4 | 5 | 6
Score | 4.7 | 4.1 | 4.2 | 4.7 | 4.3 | 5.0
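As a quick arithmetic check, the overall mean follows directly from the per-question means in Table 3:

```python
scores = [4.7, 4.1, 4.2, 4.7, 4.3, 5.0]  # per-question means from Table 3
mean = sum(scores) / len(scores)         # overall mean, about 4.5 out of 5
lowest = min(scores)                     # question 2, the lowest-rated item
```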

Also note that this study had several limitations. Firstly, the sample size was relatively small. Additionally, the study was conducted using a convenience sample, which may introduce bias. Finally, the study was limited to a single time point, which precludes examination of the potential long-term benefits of using Philobot.

6 Conclusion

In this paper, we have presented Philobot, a virtual assistant designed to help teenagers cope with their mental health challenges. We have demonstrated the effectiveness of Philobot through our pilot study, in which users reported high levels of satisfaction and engagement with the system.

There are several avenues for future work that could improve the performance and effectiveness of Philobot. One possible direction is to fine-tune the generative model with an empathetic dialogue dataset specifically tailored for mental health support in Chinese, to improve the quality of responses provided by the chatting module. It would be beneficial if such a dataset included auxiliary features such as helping strategy labels [19].

Another area of future work is to expand the training dataset for the FAQ and critical intent modules. The current manually written dataset limits the performance of the classification and retrieval models. By collecting a larger and more diverse dataset, we could improve the accuracy of these modules and increase the range of user inputs that Philobot can effectively handle.

Finally, future work could also explore the incorporation of other types of therapy besides CBT, such as mindfulness-based approaches [20] or acceptance and commitment therapy [21]. This could broaden the range of mental health needs that Philobot is able to address and provide a more personalized experience for users.

Acknowledgments. This work was supported by the National Key Research and Development Program of China (Grant No. 2020AAA0106600), the National Natural Science Foundation of China (Grant No. 61976021), and the Beijing Academy of Artificial Intelligence (BAAI).

References

1. Henker, B., Whalen, C.K., Jamner, L.D., Delfino, R.J.: Anxiety, affect, and activity in teenagers: monitoring daily life with electronic diaries. J. Am. Acad. Child Adolesc. Psychiatry 41(6), 660–670 (2002)
2. Lucero Fredes, A., Cano, S., Cubillos, C., Díaz, M.E.: Virtual assistant as an emotional support for the academic stress for students of higher school: a literature review. In: Duffy, V.G., Gao, Q., Zhou, J., Antona, M., Stephanidis, C. (eds.) HCII 2022. LNCS, vol. 13521, pp. 108–118. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-17902-0_8
3. Vaidyam, A.N., Wisniewski, H., Halamka, J.D., Kashavan, M.S., Torous, J.B.: Chatbots and conversational agents in mental health: a review of the psychiatric landscape. Can. J. Psychiatry 64(7), 456–464 (2019)
4. Hayes, S.C., Hofmann, S.G.: Process-Based CBT: The Science and Core Clinical Competencies of Cognitive Behavioral Therapy. New Harbinger Publications, Oakland (2018)
5. Fitzpatrick, K.K., Darcy, A., Vierhile, M.: Delivering cognitive behavior therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): a randomized controlled trial. JMIR Mental Health 4(2), e7785 (2017)


6. Inkster, B., Sarda, S., Subramanian, V., et al.: An empathy-driven, conversational artificial intelligence agent (Wysa) for digital mental well-being: real-world data evaluation mixed-methods study. JMIR Mhealth Uhealth 6(11), e12106 (2018)
7. Chen, H., Liu, X., Yin, D., Tang, J.: A survey on dialogue systems: recent advances and new frontiers. ACM SIGKDD Explor. Newsl. 19(2), 25–35 (2017)
8. McTear, M.: Rule-Based Dialogue Systems: Architecture, Methods, and Tools, pp. 43–70. Springer, Cham (2021). https://doi.org/10.1007/978-3-031-02176-3_2
9. Weizenbaum, J.: ELIZA: a computer program for the study of natural language communication between man and machine. Commun. ACM 9(1), 36–45 (1966). https://doi.org/10.1145/365153.365168
10. Boussaha, B.E.A., Hernandez, N., Jacquin, C., Morin, E.: Deep retrieval-based dialogue systems: a short review (2019)
11. Sakata, W., Shibata, T., Tanaka, R., Kurohashi, S.: FAQ retrieval using query-question similarity and BERT-based query-answer relevance (2019)
12. Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., Sutskever, I., et al.: Language models are unsupervised multitask learners. OpenAI Blog 1(8), 9 (2019)
13. Tamkin, A., Brundage, M., Clark, J., Ganguli, D.: Understanding the capabilities, limitations, and societal impact of large language models. arXiv preprint arXiv:2102.02503 (2021)
14. Kretzschmar, K., Tyroll, H., Pavarini, G., Manzini, A., Singh, I., Group, N.Y.P.A.: Can your phone be your therapist? Young people's ethical perspectives on the use of fully automated conversational agents (chatbots) in mental health support. Biomed. Inform. Insights 11, 1178222619829083 (2019)
15. Bao, S., He, H., Wang, F., Wu, H., Wang, H.: PLATO: pre-trained dialogue generation model with discrete latent variable (2020)
16. Fan, A., Lewis, M., Dauphin, Y.: Hierarchical neural story generation. arXiv preprint arXiv:1805.04833 (2018)
17. Devlin, J., Chang, M.W., Lee, K., Toutanova, K.: BERT: pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018)
18. Yan, R., Song, Y., Wu, H.: Learning to respond with deep neural networks for retrieval-based human-computer conversation system. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (2016)
19. Liu, S., et al.: Towards emotional support dialog systems (2021)
20. Grossman, P., Niemann, L., Schmidt, S., Walach, H.: Mindfulness-based stress reduction and health benefits: a meta-analysis. J. Psychosom. Res. 57(1), 35–43 (2004)
21. Harris, R.: Embracing your demons: an overview of acceptance and commitment therapy. Psychother. Aust. 12(4), 70–76 (2006)

An EEG Study of Virtual Reality Motion Sickness Based on MVMD Combined with Entropy Asymmetry

Lining Chai, Chengcheng Hua(B), Zhanfeng Zhou, Xu Chen, and Jianlong Tao

School of Automation, C-IMER, CICAEET, Nanjing University of Information Science and Technology, Nanjing 210044, China
[email protected]

Abstract. The existence of virtual reality motion sickness is a key factor limiting the further development of the VR industry, and the prerequisite for solving this problem is to accurately and effectively detect its occurrence. Therefore, in this paper, resting-state EEG signals acquired before and after the evocation of motion sickness in subjects were used to realize this detection. First, the EEG signals from four selected pairs of electrodes in the bilateral regions were decomposed by multivariate variational mode decomposition (MVMD), and fuzzy entropies were calculated from the selected components. Then, the absolute differences of the features within each electrode pair were computed and selected according to a t-test. Finally, the selected features were fed into an SVM for classification. The results showed that the accuracy, sensitivity and specificity of this method reached 99%, 99.2% and 98.8% respectively, and the AUC reached 1, which shows that this method could be an effective indicator for detecting the occurrence of virtual reality motion sickness. Keywords: Virtual reality motion sickness · EEG signal · multivariate variational mode decomposition (MVMD) · fuzzy entropy · SVM

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 372–377, 2023. https://doi.org/10.1007/978-981-99-6187-0_37

1 Introduction

Virtual reality motion sickness refers to physical discomfort that occurs when users experience VR; its symptoms can be broadly classified into three categories: nausea, disorientation, and eye discomfort [1]. It is this discomfort that seriously affects the user experience and thus restricts the further development of the virtual reality industry, and the prerequisite for solving the problem is to accurately and effectively detect the occurrence of virtual reality motion sickness. Because EEG signals contain a large amount of physiological and pathological information, many of their components have been used in neurology, clinical testing and research on brain-computer interface technologies [2]. Accordingly, studies of virtual reality motion sickness based on EEG signals are gradually increasing. Khaitami et al. [3] studied the pattern of virtual reality motion sickness by observing the EEG of subjects playing 3D games and showed that the standard deviation values of the


Gamma band increased after subjects had played video games for a few minutes, indicating the onset of virtual reality motion sickness. Lim et al. [4] studied motion sickness using a combination of EEG signals and traditional questionnaires and concluded that motion sickness causes changes in Delta, Theta, Beta and Gamma band power in parts of the frontal lobes. Although the above research on the EEG of virtual reality motion sickness has made some progress, the results obtained are relatively broad, without clearly pointing out the specific changes in the relevant indicators and lacking quantitative analysis, so it remains difficult to detect virtual reality motion sickness from EEG. This paper therefore proposes a feature extraction method based on multivariate variational mode decomposition (MVMD) combined with fuzzy entropy asymmetry for the automatic detection of virtual reality motion sickness, and the final results show that the classification obtained by this method is satisfactory and can successfully achieve the detection of virtual reality motion sickness.

2 Experiment and Data Acquisition

A total of 15 subjects (8 males and 7 females), aged 20.4 ± 1.5 years, were recruited for this experiment. All subjects were exposed to scenarios provided by the VR device that induce virtual reality motion sickness, in order to collect EEG signals while the subjects experienced virtual reality motion sickness. Two sets of data were collected from each subject, i.e., 2 min of resting-state EEG before and after each exposure to the motion sickness-inducing scenario. In addition, before and after the experiment, subjects were required to fill out the Simulator Sickness Questionnaire (SSQ) [5] in a timely manner. The flow chart of the experiment is shown in Fig. 1(a).

Fig. 1. (a) Experimental flow chart; (b) Electrode position diagram of 32 channels; (c) Experimental equipment and scene diagram

The EEG acquisition device was the 32-channel Neuroscan Grael EEG2, with a sampling frequency of 1024 Hz; the 32 electrodes of the EEG cap were placed according to the international standard 10–20 system, as shown in Fig. 1(b). The EEG signal acquisition software was Curry8, and the VR device used was the Oculus Quest 2. The


scene used to induce virtual reality motion sickness was a virtual roller coaster, and the duration of the experimental scene was about 150 s. The experimental equipment and scene are shown in Fig. 1(c).

3 Data Processing Methods

The method proposed in this paper includes the following steps: (1) data preprocessing; (2) MVMD of the selected electrodes; (3) calculation of fuzzy entropy for the selected components; (4) extraction of the absolute difference of fuzzy entropy between corresponding electrodes in the left and right regions; (5) feature selection using the t-test; (6) classification with an SVM. The specific flow chart is shown in Fig. 2.
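Step (1) includes the 0.5–45 Hz bandpass and 50 Hz notch filtering described in Sect. 3.1; a SciPy sketch of that filtering stage (illustrative, not the authors' code) might look like:

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def bandpass_and_notch(eeg, fs=1024):
    """0.5-45 Hz bandpass plus a 50 Hz notch, applied with zero-phase filtering."""
    b, a = butter(4, [0.5, 45], btype="bandpass", fs=fs)
    eeg = filtfilt(b, a, eeg)
    b, a = iirnotch(50, Q=30, fs=fs)  # 50 Hz trap for power-line interference
    return filtfilt(b, a, eeg)

fs = 1024
t = np.arange(0, 2, 1 / fs)
# a 10 Hz rhythm contaminated by 50 Hz mains noise
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)
y = bandpass_and_notch(x, fs)
```

`filtfilt` runs each filter forward and backward, so the filtering introduces no phase shift into the EEG.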

Fig. 2. Signal processing flow chart

3.1 Data Pre-Processing

The steps used in this paper are as follows: (1) manual artifact removal; (2) baseline drift removal; (3) bandpass filtering from 0.5 Hz to 45 Hz and 50 Hz notch filtering to remove power-line interference; (4) independent component analysis to remove ocular artifacts.

3.2 Feature Extraction

MVMD is an improvement on VMD that solves the problem that VMD can only handle one-dimensional data, providing great convenience for multichannel data processing [6]. The MVMD algorithm depends on two parameters [7]: the number of IMFs k and the penalty factor α. If k is set too large, it will cause over-decomposition and interfere with the results; if too small, it will cause under-decomposition and the loss of key information. According to [8], k for MVMD can be determined from the number of IMFs obtained by EMD; therefore, k was finally set to 5. The penalty factor α guarantees the reconstruction accuracy of the signal. If α is set too large, it will lead to modal mixing; if too small, it will introduce noise. Following [9], α was set to 4000. MVMD obtains multiple


IMFs, and it would be too expensive, and would not always yield good results, to study all of them. The high- and low-frequency components generally change when virtual reality motion sickness occurs, so the highest-frequency (36–45 Hz) and lowest-frequency (0–6 Hz) components were selected in this paper. Fuzzy entropy is an improvement on sample entropy that introduces a fuzzy membership function [10]. Its result depends on the embedding dimension m and the similarity tolerance r. Based on [11], the fuzzy entropy parameters were finally set to m = 2 and r = 0.2 SD (SD is the standard deviation of the original data) in this paper.

3.3 Feature Selection

In order to screen out features with significant differences for further analysis, the t-test was used in this paper. Table 1 shows the results of the statistical analysis of the fuzzy entropy difference features extracted on the low- and high-frequency components. As shown in Table 1, on the low-frequency component, the differences in the fuzzy entropy of Fp1-Fp2, P3-P4 and T7-T8 are significant and can be used in the following study. In contrast, the difference in the fuzzy entropy of O1-O2 was not significant and was therefore excluded. Similarly, the t-tests for the features extracted on the high-frequency component all have p-values less than 0.05, indicating that they are all significantly different and can be used in subsequent studies.

Table 1. Statistical results of the features

Lead position | Fp1-Fp2 | O1-O2 | P3-P4 | T7-T8

Low-frequency component
Pre-induced resting state | 8.516e-09 ± 6.913e-09 | 1.349e-08 ± 1.242e-08 | 8.021e-09 ± 6.769e-09 | 4.279e-08 ± 2.646e-08
Post-induced resting state | 1.120e-08 ± 1.154e-08 | 1.408e-08 ± 1.376e-08 | 1.692e-08 ± 1.408e-08 | 1.153e-08 ± 1.180e-08
T-test | P < 0.05 | P > 0.05 | P < 0.01 | P < 0.01

High-frequency component
Pre-induced resting state | 1.849e-06 ± 1.200e-06 | 1.546e-07 ± 1.306e-07 | 6.655e-08 ± 4.698e-08 | 3.697e-07 ± 3.160e-07
Post-induced resting state | 2.466e-07 ± 4.324e-07 | 1.971e-07 ± 1.404e-07 | 7.650e-07 ± 2.679e-07 | 5.844e-07 ± 4.393e-07
T-test | P < 0.01 | P < 0.05 | P < 0.01 | P < 0.01
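The fuzzy entropy computation used above (m = 2, r = 0.2·SD) and the left-right asymmetry feature can be sketched in NumPy. This is an illustrative implementation, not the authors' code, and the input signals here are random stand-ins for the decomposed IMF components.

```python
import numpy as np

def fuzzy_entropy(x, m=2, r=None, n=2):
    """Fuzzy entropy with embedding dimension m and tolerance r (default 0.2*SD)."""
    x = np.asarray(x, dtype=float)
    if r is None:
        r = 0.2 * x.std()
    N = len(x)

    def phi(dim):
        # overlapping template vectors with their own means removed
        X = np.array([x[i:i + dim] for i in range(N - m)])
        X = X - X.mean(axis=1, keepdims=True)
        # Chebyshev distance between every pair of templates
        d = np.abs(X[:, None, :] - X[None, :, :]).max(axis=2)
        D = np.exp(-(d ** n) / r)          # fuzzy membership degree
        np.fill_diagonal(D, 0.0)           # exclude self-matches
        return D.sum() / ((N - m) * (N - m - 1))

    return np.log(phi(m)) - np.log(phi(m + 1))

# Asymmetry feature for one electrode pair, e.g. |FE(P3) - FE(P4)|
rng = np.random.default_rng(0)
p3, p4 = rng.standard_normal(300), rng.standard_normal(300)
feature = abs(fuzzy_entropy(p3) - fuzzy_entropy(p4))
```

A regular signal (e.g. a sinusoid) yields a lower fuzzy entropy than noise, which is what makes the measure useful as a complexity index here.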

4 Results and Analysis

The final features are fed into an SVM for classification to verify the effectiveness of the proposed method. A linear kernel was chosen for the SVM, the penalty parameter was set to 0.48 by grid search [11], and 10-fold cross-validation was conducted, with the average taken as the final result. In this paper, accuracy, sensitivity and specificity are used to evaluate the classification results. To


avoid the chance results of a single classifier and to demonstrate the advantages of the SVM classifier, the features are also fed into several other classifiers, namely linear discriminant analysis (LDA), logistic regression (LGR), decision tree and K-nearest neighbors (KNN), for comparison. The classification results of each classifier are shown in Table 2.

Table 2. Classification results of the method in this paper on multiple classifiers

Classifier | Accuracy | Sensitivity | Specificity | AUC
LDA | 98% | 97.8% | 98.2% | 0.98
LGR | 97.8% | 98.1% | 97.5% | 0.97
Decision Tree | 97.5% | 97.3% | 97.7% | 0.97
KNN | 97.5% | 97.6% | 97.4% | 0.975
SVM | 99% | 99.2% | 98.8% | 1

As can be seen from Table 2, the accuracy, sensitivity and specificity of the proposed method are above 97% for all classifiers, so the method is advantageous for detecting virtual reality motion sickness from EEG signals. Among them, the SVM used in this paper outperforms the other classifiers on all three metrics, with a classification accuracy of 99% and sensitivity and specificity of 99.2% and 98.8%, respectively. In addition, the area under the ROC curve (AUC) obtained by the proposed method reaches 1, which further illustrates its advantages.
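The evaluation setup (linear-kernel SVM, C = 0.48, 10-fold cross-validation) can be sketched with scikit-learn. The features below are synthetic stand-ins, not the paper's data; only the kernel, penalty parameter and fold count come from the text.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Synthetic stand-in features: rows = trials, columns = |FE(left) - FE(right)|
# differences; label 1 = post-induced (motion sickness), 0 = pre-induced rest.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 7)),
               rng.normal(2.0, 1.0, (50, 7))])
y = np.repeat([0, 1], 50)

# Linear-kernel SVM with C = 0.48 (the grid-searched value reported above),
# evaluated with 10-fold cross-validation; the mean accuracy is the final score.
clf = SVC(kernel="linear", C=0.48)
acc = cross_val_score(clf, X, y, cv=10).mean()
```

`cross_val_score` stratifies the folds for classifiers, so each fold keeps the two resting-state classes balanced.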

5 Conclusion

In this paper, we propose a method for the automatic detection of virtual reality motion sickness from EEG signals, based on extracting the absolute fuzzy entropy differences between the left and right electrodes of four brain regions on the low- and high-frequency components obtained by multivariate variational mode decomposition, and classifying them with an SVM. Applying the proposed method to the experimentally collected data, the accuracy was 98.3%, the sensitivity was 98.5%, the specificity was 98.1%, and the area under the ROC curve (AUC) was 1, demonstrating the advantages of the method. Notably, this paper studies resting-state EEG signals recorded before and after the evocation of motion sickness, which reduces the influence of visual and auditory stimuli and the interference of the subject's movements during the evocation, thus benefiting the accuracy and validity of the results. In addition, this paper carries out the first study of the left-right asymmetry of entropy values in virtual reality motion sickness, with good results, which also provides some reference points for future research on virtual reality motion sickness EEG.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (62206130), the Natural Science Foundation of Jiangsu Province (BK20200821), and the Startup Foundation for Introducing Talent of NUIST (2020r075). Thanks to all the participants and personnel of the experiments.

References

1. Rebenitsch, L., Owen, C.: Review on cybersickness in applications and visual displays. Virtual Real. 20(2), 101–125 (2016)
2. Padhmashree, V., Bhattacharyya, A.: Human emotion recognition based on time–frequency analysis of multivariate EEG signal. Knowl.-Based Syst. 238, 107867 (2022)
3. Khaitami, A.D., Wibawa, S., Mardi, S., et al.: EEG visualization for cybersickness detection during playing 3D video games. In: 2019 International Seminar on Intelligent Technology and its Applications (ISITIA). IEEE (2019)
4. Lim, H.K., Ji, K., Woo, Y.S., et al.: Test-retest reliability of the virtual reality sickness evaluation using electroencephalography (EEG). Neurosci. Lett. 743, 135589 (2021)
5. Bruck, S., Watters, P.A.: Estimating cybersickness of simulated motion using the simulator sickness questionnaire (SSQ): a controlled study. In: 2009 Sixth International Conference on Computer Graphics, Imaging and Visualization, pp. 486–488. IEEE (2009)
6. Guiji, T., Gui, X., Xiaolong, W., et al.: Fault diagnosis of rolling bearings based on multivariate modal decomposition and 1.5-dimensional spectrum. Bearings 517(12), 74–82 (2022)
7. Rehman, N., Aftab, H.: Multivariate variational mode decomposition. IEEE Trans. Signal Process. 67(23), 6039–6052 (2019)
8. Yuxing, L., Yaan, L., Xiao, C., et al.: Research on ship radiated noise feature extraction method based on VMD and center frequency. Vib. Shock 37(23), 213–218 (2018)
9. Meng, M., Ran, Y., Yunyuan, G., et al.: A multi-domain EEG feature extraction method based on multivariate modal decomposition. J. Sens. Technol. 33(6), 853–860 (2020)
10. Baosheng, L., Sanpeng, D., Jing, L.: Automatic machine fault diagnosis method based on parameter optimization VMD and fuzzy entropy. Mach. Des. Res. 38(02), 93–96 (2022)
11. Xuejun, Z., Peng, J., Tao, H., et al.: EEG signal classification method for epilepsy based on variational modal decomposition. Acta Electron. Sin. 48(12), 2469 (2020)

GCN with Pattern Affected Matrix in Human Motion Prediction Feng Zhou and Jianqin Yin(B) School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China {fengzhou,jqyin}@bupt.edu.cn

Abstract. Human motion prediction is a challenging and meaningful task in many fields. Recent work has shown that Graph Convolutional Network (GCN) based models are very effective for this task. However, a simple GCN cannot perfectly model dynamic information with different inputs. In addition, the blank adjacency matrix in GCN has an omnidirectional search space, making it difficult to converge. Different from existing approaches, this paper proposes a new GCN module, called the Graph Convolution Module with Pattern Affected Adjacency Matrix (GCN-PAAM), to enhance the adaptability of GCN to different inputs and improve the learning ability of the adjacency matrix. We have applied this module to several advanced GCN-based models. Experimental results on the H3.6m and 3DPW datasets show that our module can improve the accuracy of the base models by 1% to 9% to varying degrees, achieving state-of-the-art performance.

Keywords: Human motion prediction · Graph convolution network

1 Introduction

Human motion prediction plays an important role in many fields, including human-computer interaction, autonomous driving, and robot behavior design. Modeling human body motion sequences based on skeletons is a challenging task, which involves complex dependencies among body joints. Traditionally, data-driven methods including the Hidden Markov Model [1], the Restricted Boltzmann Machine [2], and the Gaussian Process Latent Variable Model [3] achieved acceptable results on simple and periodic motion sequences. In recent years, deep learning technologies have made great breakthroughs, enhancing the prediction capability of models for complex actions. Some well-known deep learning frameworks have achieved satisfactory results in this task, such as Convolutional Neural Networks (CNN) [4,5], Recurrent Neural Networks (RNN) [6–19], Generative Adversarial Networks (GAN) [20–24], and Transformers [25,26]. However, none of the above methods clearly model the inherent dependencies between body joints. In contrast to the above methods, Graph Convolutional Networks (GCN) [27–37] treat human poses based

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 378–391, 2023. https://doi.org/10.1007/978-981-99-6187-0_38


on non-rigid skeletons as a spatial graph model and have been widely applied with great success in recent years. They view the human body as a fully connected graph composed of vertices representing body joints and use an adjacency matrix to represent the edges. Although these GCN-based models can clearly model the relationships between human key points and to some extent handle complex and long-term prediction problems, the GCN structure itself has two significant limitations that restrict the model's expressiveness:

1. The simple adjacency matrix itself cannot perfectly simulate the dependency relationships between body joints, and the randomly initialized adjacency matrix has an omnidirectional search space, making convergence of the adjacency matrix difficult.
2. All adjacency matrices remain unchanged when facing different input sequences, so regardless of the input motion, the adjacency matrix focuses on the same inter-joint dependency relationships, limiting adaptability to different movements.

In this paper, we propose a new GCN structure, the Graph Convolutional Network with Pattern Affected Adjacency Matrix (GCN-PAAM), to address the two aforementioned problems. Our key idea is to add an adaptive non-zero pattern matrix (which we call a "pattern") to the GCN structure and multiply it with the original sparse adjacency matrix. The pattern has two functions. Firstly, due to the multiplication operation, the learning direction of the adjacency matrix is influenced by the pattern, which stretches the search space of the adjacency matrix or guides its learning, solving problem 1. Secondly, since the pattern is generated directly from the input sequence, it is input-sensitive, which solves problem 2. It is worth noting that Zhong et al. [38] also identified problem 2; they designed several candidate adjacency matrices and used a Mixture of Experts (MoE) algorithm [39] to dynamically blend coefficients between them to address the issue.
However, the limited number of candidates restricted the degree of adjacency matrix variation, and the blank candidates expanded the search space, making it more difficult for the adjacency matrix to converge. We conducted extensive experiments on the Human3.6M [40] and 3DPW [41] datasets, using our GCN-PAAM on top of the traditional GCN-based methods LTD [37], HRI [36], and PGBIG [35]. Our results demonstrate that our method outperforms existing techniques in both short-term and long-term motion prediction and reduces the Mean Per Joint Position Error (MPJPE) of the base models by approximately 1% to 9%. The main contributions of this paper can be summarized as follows:

1. To the best of our knowledge, we are the first to introduce patterns into graph convolutional networks to enhance their learning ability for human motion prediction.
2. We designed a pattern generation module based on the self-attention mechanism to generate our patterns, forming a pattern-affected GCN module.
3. We conducted extensive quantitative and qualitative experiments on the Human3.6M and 3DPW datasets to demonstrate the superiority of our method over existing techniques. Our method outperforms existing techniques

380

F. Zhou and J. Yin

in short-term and long-term motion prediction and reduces the average joint position error (MPJPE) of the base model by about 1% to 9%.

2 Related Work

2.1 Human Motion Prediction

Due to the sequential nature of human motion data, most early methods used RNN-based models, which have strong capabilities for processing time series [6–19]. Martinez et al. [15] proposed an effective Seq2Seq model for predicting joint velocities, forming a well-known RNN-based baseline model. However, research has shown that RNN-based models suffer from error accumulation and perform poorly in long-term prediction, with discontinuities between the first predicted frame and the last observed frame. Existing CNN-based works, such as [4,5], treat motion sequences as three-dimensional matrices and perform convolution operations on them as on images. Liu et al. [5] stacked motion sequences along the time axis and designed an encoder-decoder, multi-level CNN structure to extract information. However, pose data differs significantly from image data in nature, lacking repeated elements and being much smaller in size, which reduces the effectiveness of convolution operations. GAN-based methods [20–24] predict multiple future sequences based on data pattern similarity, but they are affected by the data pattern gap between training and testing data. Despite the significant achievements of these works in predicting human motion, they do not explicitly model the intrinsic dependencies between body joints. Since human pose data is naturally composed of points in space, GCN (Graph Convolutional Network) is particularly suitable, and our approach is based on GCN. In recent years, the Transformer has also been used for this task. Transformer-based models [25,26] model paired spatial and temporal information on a large scale, demonstrating the adaptability of Transformers to different fields.

2.2 Graph Convolution Network in Human Motion Prediction

GCN is adept at handling graph structures and non-rigid data, such as point clouds, social networks, scene-project relationships, and human motion sequence data. In human motion prediction, GCN views human joints as vertices and the dependence relationships between bones or body joints as edges. In recent years, GCN-based models have occupied a large portion of research on this task [27–37]. Mao et al. [37] first applied GCN to human motion prediction and designed an effective method based on stacking GCN layers. They used the discrete cosine transform (DCT) to model temporal information, which constitutes the most famous GCN-based benchmark model for this task; our work is based on their contribution. Sofianos et al. [45] extended graph convolution to the time axis to model temporal dependencies between frames. Dang et al. [29] designed a pipeline based on multi-scale GCN to model dependencies at different scales of human body graphs.

GCN with Pattern Affected Matrix in Human Motion Prediction

2.3 Exploration of the Adjacency Matrix in GCN-Based Models

Some researchers have noticed that modifying the structure of the adjacency matrix may improve the performance of GCN-based models. Yan et al. [42] added a mask consisting of zeros and ones to the adjacency matrix; by strictly setting some entries of the adjacency matrix to zero, it prevents connections between specific joints. Fu et al. [43] added shared and unshared constraints on the spatial and temporal adjacency matrices to capture relatedness. Zhong et al. [38] designed a selection mechanism to vote for the adjacency matrix from several candidate matrices. However, none of these explorations brought a learning trend to the adjacency matrix, and a large search space may affect the convergence of the adjacency matrix.

3 Proposed Method

In this section, we introduce the definition of the skeleton-based human motion prediction task, our proposed model, and its mathematical foundation.

Problem Formulation. The skeleton-based human motion prediction task can be formulated as follows: given a historical pose sequence $X_{1:T} = [X_1, X_2, ..., X_T] \in \mathbb{R}^{K \times 3 \times T}$ with $T$ frames, in which $X_t$ denotes a single 3D pose with $K$ joints in 3-dimensional space at timestamp $t$, predict the future pose sequence $X_{T+1:T+P} = [X_{T+1}, X_{T+2}, ..., X_{T+P}] \in \mathbb{R}^{K \times 3 \times P}$ with $P$ frames.

Overview. As shown in Fig. 1, we take the pipeline of Mao et al. [37] as our base model. We replace the original GCN layer with our Pattern-Affected GCN layer and add a Pattern Generate Module to generate our patterns. The input sequence first undergoes the DCT (Discrete Cosine Transform), then goes through 12 residual-connected P-A-GCN blocks, each of which contains 2 P-A-GCN layers and activation functions, and finally the iDCT (Inverse Discrete Cosine Transform) is applied to obtain the output sequence. The patterns are generated by the Pattern Generate Module, which is fed with the sequence after the DCT. In the Pattern-Affected GCN layer, the pattern is multiplied with the adjacency matrix via the Hadamard product.
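The DCT step described above can be sketched as an orthonormal DCT-II applied along the time axis. This is our own NumPy illustration, not the paper's code; the 22-joint, 10-frame shapes follow the Human3.6M setup used later in the paper:

```python
import numpy as np

def dct_matrix(T: int) -> np.ndarray:
    """Orthonormal DCT-II basis matrix of size T x T."""
    D = np.zeros((T, T))
    for k in range(T):
        for t in range(T):
            D[k, t] = np.cos(np.pi * k * (2 * t + 1) / (2 * T))
    D[0] *= np.sqrt(1.0 / T)
    D[1:] *= np.sqrt(2.0 / T)
    return D

T, K = 10, 22                   # frames, joints (Human3.6M setting)
seq = np.random.randn(K, 3, T)  # pose sequence X_{1:T}

D = dct_matrix(T)
coeffs = seq @ D.T              # DCT along the time axis
recon = coeffs @ D              # iDCT: D is orthonormal, so D^{-1} = D^T

assert np.allclose(recon, seq)  # lossless round trip
```

Because the basis is orthonormal, the inverse transform is simply multiplication by the transpose, which is why the pipeline can recover the time-domain sequence exactly after the P-A-GCN blocks.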

3.1 GCN with Pattern-Affected Adjacency Matrix (GCN-PAAM)

In the human motion prediction task, let us assume that the human body is modeled as a fully connected graph with $K$ vertices, where $K$ is the number of body joints. The edge weights of the graph are represented by a trainable adjacency matrix $A \in \mathbb{R}^{K \times K}$. A graph convolutional network layer can be formulated as follows:

$$O^{(p+1)} = \sigma(A^{(p)} O^{(p)} W^{(p)}) \qquad (1)$$

in which $O^{(p+1)} \in \mathbb{R}^{K \times \hat{F}}$ is the output of the graph convolution layer, with feature dimension $\hat{F}$ for each node. Similarly, $O^{(p)} \in \mathbb{R}^{K \times F}$ is the input, with feature dimension $F$ for each node. $W^{(p)} \in \mathbb{R}^{F \times \hat{F}}$ is the trainable weight matrix, and $\sigma(\cdot)$ is an activation function such as $\tanh(\cdot)$. Note that all multiplications here are matrix multiplications.

Fig. 1. The overview of the proposed GCN-PAAM network. We replace the original GCN layer with our Pattern-Affected GCN layer. The input sequence first undergoes DCT (Discrete Cosine Transform), then passes through 12 residual-connected P-A-GCN blocks, each of which contains 2 P-A-GCN layers and activation functions, and finally undergoes iDCT (Inverse Discrete Cosine Transform) to produce the output sequence. The patterns are generated by the Pattern Generate Module fed with the sequence after DCT.

The original GCN with a blank adjacency matrix has an all-directional search space, which makes it difficult to converge perfectly. Additionally, the adjacency matrix does not contain information about the input sequence, which makes it unable to dynamically adapt to changing patterns in the input sequence.

Pattern-Affected Adjacency Matrix (GCN-PAAM). Our Pattern-Affected GCN layer can be formulated as follows:

$$O^{(p+1)} = \sigma(P^{(p,i)} \times A^{(p)} O^{(p)} W^{(p)}) \qquad (2)$$

in which $P^{(p,i)} \in \mathbb{R}^{K \times K}$ is the pattern matrix, with the same dimensions as the adjacency matrix. The pattern is related to the input sequence $i$ and the stage $p$, and it affects the adjacency matrix through the Hadamard product.

Mathematical Support. We now show how the pattern stretches the search space of the adjacency matrix. According to the gradient descent algorithm [44], the update of the network parameters can be formulated as follows:

$$\theta = \theta - \gamma \nabla_{\theta} L(\theta) \qquad (3)$$

in which $\theta = \{\theta_0, \theta_1, \theta_2, ...\}$ are the network parameters, $\gamma$ is the learning rate, $L(\cdot)$ is the loss function, and $\nabla$ is the gradient operator. Applying this update to each parameter individually:

$$\left\{\theta_0 = \theta_0 - \gamma \frac{\partial L(\theta)}{\partial \theta_0},\;\; \theta_1 = \theta_1 - \gamma \frac{\partial L(\theta)}{\partial \theta_1},\;\; \theta_2 = \theta_2 - \gamma \frac{\partial L(\theta)}{\partial \theta_2},\;\; ...\right\} \qquad (4)$$

Taking one P-A-GCN layer as an example, the loss function can be written as:

$$L(\theta) = L(\theta_{others}, f(P^l, A^l)) \qquad (5)$$

in which $P^l$ is the pattern at layer $l$, $A^l$ is the adjacency matrix at layer $l$, and $\theta_{others}$ denotes the network parameters other than $P^l$ and $A^l$. $f(\cdot)$ is the Hadamard product:

$$f(P^l, A^l) = P^l \times A^l \qquad (6)$$

According to (4), the update of the adjacency matrix in layer $l$ can be formulated as:

$$A^l = A^l - \gamma \frac{\partial L(\theta_{others}, f(P^l, A^l))}{\partial A^l} \qquad (7)$$

By the chain rule of partial differentiation,

$$\frac{\partial L(\theta_{others}, f(P^l, A^l))}{\partial A^l} = \frac{\partial L(\theta_{others}, f(P^l, A^l))}{\partial f(P^l, A^l)} \times \frac{\partial f(P^l, A^l)}{\partial A^l} = \frac{\partial L(\theta_{others}, f(P^l, A^l))}{\partial f(P^l, A^l)} \times P^l \qquad (8)$$

Obviously, the value of $\frac{\partial L(\theta_{others}, f(P^l, A^l))}{\partial f(P^l, A^l)}$ has nothing to do with $P^l$, so the update step of $A^l$ is linearly related to $P^l$. Thus the adjacency matrix is directly affected by the pattern $P^l$, which stretches its search space and improves its learning ability.

Fig. 2. The structure of the Pattern Generate Module. It serially contains position encoding, self-attention and MLP operations.
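As an illustration of one pattern-affected layer, here is a minimal NumPy sketch; the random initialization and the feature dimensions are illustrative assumptions, and in the real model the pattern would come from the Pattern Generate Module rather than being random:

```python
import numpy as np

rng = np.random.default_rng(0)
K, F_in, F_out = 22, 16, 16         # joints, input/output feature dims

A = rng.standard_normal((K, K))     # trainable adjacency matrix
P = rng.standard_normal((K, K))     # input-dependent pattern (assumed given here)
W = rng.standard_normal((F_in, F_out))
O = rng.standard_normal((K, F_in))  # layer input features

# Eq. (2): the pattern modulates the adjacency via the Hadamard product,
# then graph convolution proceeds as usual with a tanh activation
O_next = np.tanh((P * A) @ O @ W)

assert O_next.shape == (K, F_out)
```

Since `P * A` is elementwise, the pattern rescales each edge weight individually, which is exactly the mechanism the derivation above analyzes.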

3.2 Pattern Generate Module

The structure of the Pattern Generate Module is shown in Fig. 2. It sequentially includes Positional Encoding, Multi-Head Attention, and a Multi-Layer Perceptron (MLP). Note that each pattern that affects a P-A-GCN layer is generated by an independent Pattern Generate Module. It is worth noting that the specific form of the Pattern Generate Module is not critical and can be replaced by other models, such as a GCN or an MLP.

3.3 Training

We adopt the Mean Per Joint Position Error (MPJPE) loss in training:

$$L_{MPJPE} = \frac{1}{K \times T} \sum_{t=1}^{T} \sum_{k=1}^{K} \left\| \hat{p}_{t,k} - p_{t,k} \right\|_2 \qquad (9)$$

in which $\hat{p}_{t,k} \in \mathbb{R}^3$ indicates the predicted position of the $k$-th joint in frame $t$, and $p_{t,k}$ is the corresponding ground truth.
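Eq. (9) translates directly into a few lines of NumPy; the function name `mpjpe` and the array shapes are our own choices for illustration:

```python
import numpy as np

def mpjpe(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean Per Joint Position Error: mean Euclidean distance
    over all frames and joints. pred, gt have shape (T, K, 3)."""
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

gt = np.zeros((25, 22, 3))
pred = np.ones((25, 22, 3))   # every joint off by the vector (1, 1, 1)
print(mpjpe(pred, gt))        # sqrt(3) ≈ 1.732
```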

4 Experiments

In this section, we evaluate the proposed motion prediction method.

4.1 Datasets

In our experiments, we used two datasets, Human3.6M and 3DPW, introduced as follows.

Human3.6M [40] is the most well-known dataset for human motion-related tasks. It includes 15 actions, such as walking, eating, discussing, and sitting, performed by seven actors (subjects), each represented by a 32-joint skeleton. Following previous works [29,37], we downsampled the data to 25 Hz and considered 22 joints. Subjects 1, 6, 7, 8, and 9 were used for training, subject 11 for validation, and subject 5 for testing.

3DPW [41] is another well-known dataset for human motion prediction, with challenging outdoor scenarios. It contains over 51K frames with 3D annotations. Following previous work [37], we use a sampling rate of 30 Hz.
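The Human3.6M preprocessing described above (temporal downsampling and joint selection) can be sketched as follows; the 50 Hz source rate and the placeholder joint index list are our assumptions for illustration:

```python
import numpy as np

def preprocess(seq_50hz, keep_joints):
    """seq_50hz: (T, 32, 3) raw sequence.
    Keep every 2nd frame (50 Hz -> 25 Hz, an assumed source rate)
    and keep only the selected joints (22 of 32 in prior work)."""
    return seq_50hz[::2][:, keep_joints]

raw = np.random.randn(100, 32, 3)
out = preprocess(raw, list(range(22)))   # placeholder joint indices
assert out.shape == (50, 22, 3)
```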


Table 1. Comparisons of short-term prediction on Human3.6M. Results at 80 ms, 160 ms, 320 ms, and 400 ms in the future are shown for the 15 action scenarios (walking, eating, smoking, discussion, directions, greeting, phoning, posing, purchases, sitting, sittingdown, takingphoto, waiting, walkingdog, walkingtogether) and their average, comparing DMGNN, MSR, LTD, LTD (Ours), HRI, HRI (Ours), PGBIG, and PGBIG (Ours). The best results are highlighted in bold, and the second best is marked by underline. * indicates that the result is taken from PGBIG. (Table body omitted.)

4.2 Baselines and Comparison Settings

The baselines in our experiments are all GCN-based, considering that our main contribution is a general GCN module. Specifically, we consider LTD [37], DMGNN [33], MSR-GCN [29], HRI [36], and PGBIG [35] as our baseline models.

Testing Setup. Following Dang et al. [29], we evaluate our models on the entire test dataset. For Human3.6M, the input length is 10 poses and the output length is 25 poses. For the 3DPW dataset, the input length is 10 poses and the output length is 30 poses.

Implementation Details. Similar to LTD [37], we implemented our model using PyTorch and used Adam as the optimizer for training. The learning rate was set to 0.0005 and decayed by a factor of 0.96 every 2 epochs. The batch size was set to 16 and gradients were clipped by their $\ell_2$-norm. We trained the model for 50 epochs on a single RTX 3080 GPU.
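The stated step-decay schedule (initial learning rate 0.0005, multiplied by 0.96 every 2 epochs) can be written as a plain function; `lr_at_epoch` is a hypothetical helper name of our own:

```python
def lr_at_epoch(epoch, base_lr=0.0005, decay=0.96, every=2):
    """Step decay: multiply the learning rate by `decay` once every `every` epochs."""
    return base_lr * decay ** (epoch // every)

print(lr_at_epoch(0))    # 0.0005 (initial rate)
print(lr_at_epoch(2))    # 0.00048 (after the first decay step)
print(lr_at_epoch(49))   # ≈ 0.000188 (near the end of the 50-epoch run)
```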


Table 2. Comparisons of long-term prediction on Human3.6M. Results at 560 ms and 1000 ms in the future are shown for the 15 action scenarios (walking, eating, smoking, discussion, directions, greeting, phoning, posing, purchases, sitting, sittingdown, takingphoto, waiting, walkingdog, walkingtogether) and their average, comparing DMGNN, MSR, LTD, LTD (Ours), HRI, HRI (Ours), PGBIG, and PGBIG (Ours). The best results are highlighted in bold, and the second best is marked by underline. * indicates that the result is taken from PGBIG. (Table body omitted.)

Table 3. 3DPW: comparisons of average prediction errors.

millisecond     200 ms  400 ms  600 ms  800 ms  1000 ms
Res. Sup.        113.9   173.1   191.9   201.1    210.7
DMGNN             37.3    67.8    94.5   109.7    123.6
MSR               37.8    71.3    93.9   110.8    121.5
LTD               35.6    67.8    90.6   106.9    117.8
LTD (Ours)        34.1    65.0    89.9   102.5    109.7
HRI               34.7    67.8    98.5   110.4    117.4
HRI (Ours)        34.1    65.2    93.1   103.1    108.6
PGBIG             29.3    58.3    79.8    94.4    104.1
PGBIG (Ours)      29.1    57.6    78.8    92.8    101.1

Table 4. Test of stretching the search space of the adjacency matrix.

                    80 ms  160 ms  320 ms  400 ms  average
Ours                 10.7    22.9    47.8    59.1     35.1
without influence    11.1    23.4    48.5    59.8     35.7

4.3 Comparisons with Baselines

We replaced the original GCN in LTD, HRI, and PG-BIG with our GCN-PAAM. However, we did not add our module to DMGNN and MSR-GCN due to their special multi-stage approach and varying adjacency matrix sizes.

Human3.6M. For the Human3.6M dataset, Tables 1 and 2 show the quantitative comparison results of our model and the baseline models in short-term prediction (less than 400 ms) and long-term prediction (between 400 ms and 1000 ms), respectively. '(Ours)' means our GCN-PAAM replaces the GCN module in that method. Table 3 shows the improvement of each model by our method. It is clear that our method improves the original methods to varying degrees in most cases on the three different baselines. Specifically, our method reduces the average error by about 9% in LTD, by about 2% in HRI, and by about 1% in PGBIG.

3DPW. Table 3 presents the quantitative comparison results of our model and the baseline models on short-term and long-term prediction on the 3DPW dataset. Our GCN-PAAM is highly effective in reducing the error of the three baseline models on 3DPW, especially in long-term prediction. Figure 3 shows the effectiveness of our method on the two datasets using line graphs. Figures 4 and 5 visualize our model's predictions on the 'greeting' and 'discussing' actions in Human3.6M, respectively. From the last frame of 'greeting', it can be seen that our method models hand movements more accurately. In the 'discussing' action, our method models leg and hand movements more accurately than the original method.

Fig. 3. The variation of the average error over the long and short time horizons on Human3.6M (left) and 3DPW (right) for the three methods after replacing their GCN with GCN-PAAM.

Fig. 4. 'Greeting' motion prediction visualization; from top to bottom are the results of ground truth, LTD, and LTD (Ours).


Fig. 5. 'Discussing' motion prediction visualization; from top to bottom are the results of ground truth, LTD, and LTD (Ours).

4.4 Ablation Study

We conducted ablation experiments to further validate the motivation of our model. The following experiments were conducted on the Human3.6M dataset using the LTD baseline model. To verify that the pattern optimizes the learning direction of the adjacency matrix (Eq. 8), we divided the gradient of the adjacency matrix by the value of the pattern before backpropagation began, which eliminates the influence of the pattern on the learning of the adjacency matrix. Table 4 shows the results of this ablation: after the influence is eliminated, our method's accuracy decreases by approximately 0.5%.
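The gradient relation this ablation targets (Eq. 8: for f = P × A, the gradient with respect to A equals the gradient with respect to f scaled elementwise by P) can be checked numerically. This is our own sketch, using an arbitrary quadratic loss as an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
K = 4
P = rng.standard_normal((K, K))   # pattern
A = rng.standard_normal((K, K))   # adjacency matrix

def loss(f):
    """Any smooth loss of f = P * A; a quadratic is used for illustration."""
    return np.sum(f ** 2) / 2

# Analytic gradients: dL/df = f for this loss, and dL/dA = (dL/df) * P (Hadamard)
f = P * A
grad_A_analytic = f * P

# Finite-difference check of dL/dA, entry by entry
eps = 1e-6
grad_A_fd = np.zeros_like(A)
for i in range(K):
    for j in range(K):
        A_pert = A.copy()
        A_pert[i, j] += eps
        grad_A_fd[i, j] = (loss(P * A_pert) - loss(P * A)) / eps

assert np.allclose(grad_A_fd, grad_A_analytic, atol=1e-4)
```

Dividing this gradient elementwise by P, as the ablation does, recovers the plain-GCN update direction and removes the pattern's guidance.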

5 Conclusion

In this paper, we propose a new GCN module, GCN-PAAM, which to some extent solves the problem of network adaptability to different inputs and enhances the learning ability of the adjacency matrix in graph convolution. We replace the GCN module in LTD, HRI, and PGBIG with our module, and experimental results show that our module improves the accuracy of these models to varying degrees and achieves state-of-the-art performance.

Acknowledgements. This work was supported partly by the National Natural Science Foundation of China (Grant Nos. 62173045, 62273054), partly by the Fundamental Research Funds for the Central Universities (Grant No. 2020XD-A04-3), and partly by the Natural Science Foundation of Hainan Province (Grant No. 622RC675).

References

1. Lehrmann, A.M., Gehler, P.V., Nowozin, S.: Efficient nonlinear Markov models for human motion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1314–1321 (2014)


2. Taylor, G.W., Hinton, G.E., Roweis, S.T.: Modeling human motion using binary latent variables. In: Advances in Neural Information Processing Systems, pp. 1345–1352 (2007)
3. Wang, J.M., Fleet, D.J., Hertzmann, A.: Gaussian process dynamical models. In: NIPS, vol. 18, p. 3. Citeseer (2005)
4. Li, C., Zhang, Z., Lee, W.S., Lee, G.H.: Convolutional sequence to sequence model for human dynamics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5226–5234 (2018)
5. Liu, X., Yin, J., Liu, J., Ding, P., Liu, J., Liu, H.: TrajectoryCNN: a new spatio-temporal feature learning network for human motion prediction. IEEE Trans. Circuits Syst. Video Technol. 31, 2133–2146 (2020)
6. Chiu, H., Adeli, E., Wang, B., Huang, D., Niebles, J.C.: Action-agnostic human pose forecasting. In: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1423–1432. IEEE (2019)
7. Corona, E., Pumarola, A., Alenya, G., Moreno-Noguer, F.: Context-aware human motion prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6992–7001 (2020)
8. Fragkiadaki, K., Levine, S., Felsen, P., Malik, J.: Recurrent network models for human dynamics. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4346–4354 (2015)
9. Ghosh, P., Song, J., Aksan, E., Hilliges, O.: Learning human motion models for long-term predictions. In: 2017 International Conference on 3D Vision (3DV), pp. 458–466. IEEE (2017)
10. Gopalakrishnan, A., Mali, A., Kifer, D., Giles, L., Ororbia, A.G.: A neural temporal model for human motion prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12116–12125 (2019)
11. Gui, L.-Y., Wang, Y.-X., Ramanan, D., Moura, J.M.F.: Few-shot human motion prediction via meta-learning. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11212, pp. 441–459. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01237-3_27
12. Guo, X., Choi, J.: Human motion prediction via learning local structure representations and temporal dependencies. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 2580–2587 (2019)
13. Jain, A., Zamir, A.R., Savarese, S., Saxena, A.: Structural-RNN: deep learning on spatio-temporal graphs. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 5308–5317 (2016)
14. Liu, Z., et al.: Towards natural and accurate future motion prediction of humans and animals. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10004–10012 (2019)
15. Martinez, J., Black, M.J., Romero, J.: On human motion prediction using recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2891–2900 (2017)
16. Pavllo, D., Feichtenhofer, C., Auli, M., Grangier, D.: Modeling human motion with quaternion-based neural networks. Int. J. Comput. Vision 128(4), 855–872 (2020)
17. Sang, H.-F., Chen, Z.-Z., He, D.-K.: Human motion prediction based on attention mechanism. Multimedia Tools Appl. 79(9), 5529–5544 (2020)
18. Shu, X., Zhang, L., Qi, G.-J., Liu, W., Tang, J.: Spatiotemporal co-attention recurrent neural networks for human-skeleton motion prediction. IEEE Trans. Pattern Anal. Mach. Intell. 44, 3300–3315 (2021)


19. Tang, Y., Ma, L., Liu, W., Zheng, W.: Long-term human motion prediction by modeling motion context and enhancing motion dynamic. arXiv preprint arXiv:1805.02513 (2018)
20. Cui, Q., Sun, H., Kong, Y., Zhang, X., Li, Y.: Efficient human motion prediction using temporal convolutional generative adversarial network. Inf. Sci. 545, 427–447 (2021)
21. Gui, L.-Y., Wang, Y.-X., Liang, X., Moura, J.M.F.: Adversarial geometry-aware human motion prediction. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 823–842. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_48
22. Hernandez, A., Gall, J., Moreno-Noguer, F.: Human motion prediction via spatio-temporal inpainting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7134–7143 (2019)
23. Ke, Q., Bennamoun, M., Rahmani, H., An, S., Sohel, F., Boussaid, F.: Learning latent global network for skeleton-based action prediction, vol. 29, pp. 959–970. IEEE (2019)
24. Kundu, J.N., Gor, M., Venkatesh Babu, R.: BiHMP-GAN: bidirectional 3D human motion prediction GAN. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 8553–8560 (2019)
25. Aksan, E., Cao, P., Kaufmann, M., Hilliges, O.: A spatio-temporal transformer for 3D human motion prediction. arXiv preprint arXiv:2004.08692 (2020)
26. Cai, Y., et al.: Learning progressive joint propagation for human motion prediction. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12352, pp. 226–242. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58571-6_14
27. Aksan, E., Kaufmann, M., Hilliges, O.: Structured prediction helps 3D human motion modelling. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 7144–7153 (2019)
28. Cui, Q., Sun, H.: Towards accurate 3D human motion prediction from incomplete observations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 4801–4810 (2021)
29. Dang, L., Nie, Y., Long, C., Zhang, Q., Li, G.: MSR-GCN: multi-scale residual graph convolution networks for human motion prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 11467–11476 (2021)
30. Lebailly, T., Kiciroglu, S., Salzmann, M., Fua, P., Wang, W.: Motion prediction using temporal inception module. In: Proceedings of the Asian Conference on Computer Vision (2020)
31. Li, B., Tian, J., Zhang, Z., Feng, H., Li, X.: Multitask non-autoregressive model for human motion prediction. IEEE Trans. Image Process. 30, 2562–2574 (2020)
32. Li, M., Chen, S., Chen, X., Zhang, Y., Wang, Y., Tian, Q.: Symbiotic graph neural networks for 3D skeleton-based human action recognition and motion prediction. IEEE Trans. Pattern Anal. Mach. Intell. 44, 3316–3333 (2021)
33. Li, M., Chen, S., Zhao, Y., Zhang, Y., Wang, Y., Tian, Q.: Dynamic multiscale graph neural networks for 3D skeleton based human motion prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 214–223 (2020)
34. Liu, J., Yin, J.: Multi-grained trajectory graph convolutional networks for habit-unrelated human motion prediction. arXiv preprint arXiv:2012.12558 (2020)


35. Ma, T., Nie, Y., Long, C., Zhang, Q., Li, G.: Progressively generating better initial guesses towards next stages for high-quality human motion prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6437–6446 (2022)
36. Mao, W., Liu, M., Salzmann, M.: History repeats itself: human motion prediction via motion attention. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12359, pp. 474–489. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58568-6_28
37. Mao, W., Liu, M., Salzmann, M., Li, H.: Learning trajectory dependencies for human motion prediction. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 9489–9497 (2019)
38. Zhong, C., Hu, L., Zhang, Z., Ye, Y., Xia, S.: Spatio-temporal gating-adjacency GCN for human motion prediction. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6447–6456 (2022)
39. Jordan, M.I., Jacobs, R.A.: Hierarchical mixtures of experts and the EM algorithm. Neural Comput. 6(2), 181–214 (1994)
40. Ionescu, C., Papava, D., Olaru, V., Sminchisescu, C.: Human3.6M: large scale datasets and predictive methods for 3D human sensing in natural environments. IEEE Trans. Pattern Anal. Mach. Intell. 36(7), 1325–1339 (2013)
41. von Marcard, T., Henschel, R., Black, M.J., Rosenhahn, B., Pons-Moll, G.: Recovering accurate 3D human pose in the wild using IMUs and a moving camera. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11214, pp. 614–631. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01249-6_37
42. Yan, Z., Zhai, D.-H., Xia, Y.: DMSGCN: dynamic multiscale spatiotemporal graph convolutional networks for human motion prediction. arXiv preprint arXiv:2112.10365 (2021)
43. Fu, J., Yang, F., Yin, J.: Learning dynamic correlations in spatiotemporal graphs for motion prediction. arXiv preprint arXiv:2204.01297 (2022)
44. Ruder, S.: An overview of gradient descent optimization algorithms. arXiv preprint arXiv:1609.04747 (2016)
45. Sofianos, T., Sampieri, A., Franco, L., Galasso, F.: Space-time-separable graph convolutional network for pose forecasting. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 11209–11218 (2021)

Cooperative Control of SMC-Feedback Linearization and Error Port Hamiltonian System for PMSM

Youyuan Chen1,2, Haisheng Yu1,2(B), Xiangxiang Meng1,2, Hao Ding1,2, and Xunkai Gao1,2

1 College of Automation, Qingdao University, Qingdao, China
[email protected]
2 Shandong Province Key Laboratory of Industrial Control Technology, Qingdao University, Qingdao, China

Abstract. In this paper, a novel cooperative strategy combining sliding mode control based on feedback linearization (SMC-FL) and state error port-Hamiltonian (EPH) control with variable damping injection and an integral term is proposed for the permanent magnet synchronous motor (PMSM). Firstly, an SMC-FL controller and an EPH controller are designed, respectively, and a load observer is applied to estimate the unknown load. Then, a Gaussian function is used in the cooperative strategy, which coordinates the two controllers. Finally, the simulation section verifies the advantages and effectiveness of the proposed controller.

Keywords: Permanent magnet synchronous motor · sliding mode feedback linearization · EPH · cooperative control · load observer

1 Introduction

PMSM is a widely used motor with high reliability, simple structure and high power density [1,2]. Therefore, the control of PMSM is a very popular research field. However, a single control method is hard to balance rapidity and stability simultaneously. Signal control is a class of control method, and common signal control methods include sliding mode control (SMC), predictive control, adaptive control and feedback linearization (FL), etc. In [3], a new predictive controller has been designed to replace traditional cascade control. Then, the sliding mode observer has been used to compensate the disturbance. In [4], a direct control scheme based FL has been designed for interior PMSM. However, the signal control methods mentioned above often consider the system’s rapidity while neglecting steady-state performance. Recently, energy control has gradually been applied due to its superior steady-state performance, such as port Hamiltonian (PH) and passivity-based control [5,6]. Meng et al. [7] has designed a novel adaptive EPH scheme which considers the input saturation for nonlinear systems, the system c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023  Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 392–401, 2023. https://doi.org/10.1007/978-981-99-6187-0_39

Cooperative Control

393

can ensure good steady-state performance. Gil-González et al. [8] have proposed an interconnection and damping assignment passivity-based control which can stabilize the rotor velocity and adjust the motor voltage. The above articles only consider designing controllers from a signal or an energy perspective, making it challenging to balance rapidity and stability. To address this problem, this paper proposes a cooperative controller combining SMC-FL and EPH, which uses a Gaussian function as the cooperative strategy. To further improve the control performance of EPH and eliminate steady-state error, variable damping injection and integral control are used. Finally, the simulation results validate the advantages of the proposed method.

2 Mathematical Model of PMSM

The dynamic model of the PMSM in the synchronous rotating (d-q) coordinate system is written as

$$\begin{cases} L_d \dfrac{di_d}{dt} = -R_s i_d + n_p \omega L_q i_q + u_d \\ L_q \dfrac{di_q}{dt} = -R_s i_q - n_p \omega L_d i_d - n_p \omega \Phi + u_q \\ J \dfrac{d\omega}{dt} = \dfrac{3}{2} n_p \left[ (L_d - L_q) i_d i_q + \Phi i_q \right] - \tau_L - R_f \omega \\ \dfrac{d\theta}{dt} = \omega \end{cases} \quad (1)$$

$$\tau = \frac{3}{2} n_p \left[ (L_d - L_q) i_d i_q + \Phi i_q \right] \quad (2)$$

where $L_d$ and $L_q$ represent the stator inductances in the d-q axes; $i_d$, $i_q$, $u_d$ and $u_q$ are the stator currents and stator voltages, respectively; $R_s$, $\Phi$ and $\omega$ are the stator resistance, permanent magnet flux and rotor mechanical angular velocity, respectively; $\tau$ and $\tau_L$ are the electromagnetic and load torques; $n_p$ is the number of pole pairs; $\theta$ is the rotor angle; $J$ is the moment of inertia; $R_f$ is the viscous friction coefficient.

A general nonlinear system is described as

$$\begin{cases} \dot{x} = f(x) + g(x)u \\ y = h(x) \end{cases} \quad (3)$$

Rewriting (1) in the form of the nonlinear system (3), the state vector is $x = [x_1\ x_2\ x_3\ x_4]^T = \left[L_d i_d\ \ L_q i_q\ \ \tfrac{2}{3} J\omega\ \ \theta\right]^T$, the input vector is $u = [u_d\ u_q]^T$, and $f(x)$ and $g(x)$ are

$$f(x) = \begin{bmatrix} -R_s i_d + n_p \omega L_q i_q \\ -R_s i_q - n_p \omega L_d i_d - n_p \omega \Phi \\ n_p \left[ (L_d - L_q) i_d i_q + \Phi i_q \right] - \tfrac{2}{3}\tau_L - \tfrac{2}{3} R_f \omega \\ \omega \end{bmatrix}, \quad g(x) = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}^T \quad (4)$$

The control strategy frame is shown in Fig. 1. For the position tracking problem of the PMSM servo system, the SMC-FL controller ensures that the system quickly tracks the desired signal, while the EPH controller with variable damping keeps the steady-state error small. Through the cooperative control strategy, the advantages of the two controllers are combined to solve the problem that tracking speed and accuracy are difficult to balance. A load observer is used to estimate the load disturbance in real time.
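As a quick plausibility check on model (1), a forward-Euler simulation can be sketched in Python. $L_d = L_q$, $\Phi$, $R_s$, $J$ and $n_p$ take the values given later in the simulation section; the friction coefficient, load torque, applied voltages and step size are illustrative assumptions, not values from the paper.

```python
# PMSM d-q model (1), non-salient case (Ld = Lq = L); Rf and tau_L are assumed values
L, Phi, Rs, J, n_p = 3e-3, 0.29, 0.93, 0.003, 4
Rf, tau_L = 1e-4, 0.1

def pmsm_step(state, u_d, u_q, dt=1e-5):
    """One forward-Euler step of (1); state = (i_d, i_q, omega, theta)."""
    i_d, i_q, w, theta = state
    di_d = (-Rs * i_d + n_p * w * L * i_q + u_d) / L
    di_q = (-Rs * i_q - n_p * w * L * i_d - n_p * w * Phi + u_q) / L
    dw = (1.5 * n_p * Phi * i_q - tau_L - Rf * w) / J  # reluctance torque vanishes for Ld = Lq
    return (i_d + dt * di_d, i_q + dt * di_q, w + dt * dw, theta + dt * w)

state = (0.0, 0.0, 0.0, 0.0)
for _ in range(1000):          # 10 ms under a constant q-axis voltage
    state = pmsm_step(state, 0.0, 5.0)
```

Under the constant q-axis voltage the speed settles near the torque-balance point; the closed-loop behavior of the paper's controllers is not reproduced here.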

394

Y. Chen et al.

3 Controller Design

3.1 SMC-FL Controller Design

To facilitate the design of the SMC-FL controller, the nonlinear system (3) is rewritten as

$$\dot{x} = f(x) + g(x) u_s \quad (5)$$

where $u_s = [u_{sd}\ u_{sq}]^T$ represents the SMC-FL control input. Choose $i_d$ and $\theta$ as output variables. To realize the decoupling of $\theta$ and $i_d$, according to feedback linearization theory, a new state vector is defined as

$$\xi = [\xi_1\ \xi_2\ \xi_3\ \xi_4]^T = [i_d\ \theta\ \omega\ \dot{\omega}]^T \quad (6)$$

For a non-salient pole PMSM, $L_d = L_q = L$. The new state equation can be expressed as

$$\dot{\xi} = \begin{bmatrix} \dot{\xi}_1 \\ \dot{\xi}_2 \\ \dot{\xi}_3 \\ \dot{\xi}_4 \end{bmatrix} = \begin{bmatrix} -\dfrac{R_s i_d}{L} + n_p \omega i_q + \dfrac{u_{sd}}{L} \\ \xi_3 \\ \xi_4 \\ \dfrac{3 n_p \Phi}{2J}\left(-n_p \omega i_d - \dfrac{n_p \omega \Phi}{L} - \dfrac{R_s i_q}{L}\right) - \dfrac{R_f}{J}\left(\dfrac{3 n_p \Phi i_q}{2J} - \dfrac{\tau_L}{J} - \dfrac{R_f \omega}{J}\right) + \dfrac{3 n_p \Phi}{2JL} u_{sq} \end{bmatrix} \quad (7)$$

Define the new input vector as

$$v = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} \dot{\xi}_1 \\ \dot{\xi}_4 \end{bmatrix} \quad (8)$$

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} m_1 \\ m_2 \end{bmatrix} + \begin{bmatrix} n_{11} & n_{12} \\ n_{21} & n_{22} \end{bmatrix} \begin{bmatrix} u_{sd} \\ u_{sq} \end{bmatrix} \quad (9)$$

where $m_1$, $m_2$, $n_{11}$, $n_{12}$, $n_{21}$ and $n_{22}$ are obtained from (7). FL theory realizes the decoupling of the PMSM excitation current and position, but the resulting linear system has poor robustness. To improve robustness, an integral SMC method is adopted for the new linear system. The integral sliding mode controller has two sliding surfaces, $s_1$ and $s_2$. Define the error variable $e_1 = i_d - i_d^*$; under $i_d^* = 0$ control, the error equation of the d-axis current is

$$\dot{e}_1 = \dot{i}_d = v_1 \quad (10)$$

The first sliding surface is designed as

$$s_1 = e_1 + \beta \int_{-\infty}^{t} e_1 \, dt \quad (11)$$

Select the integral initial value $I_0$ as

$$I_0 = \int_{-\infty}^{0} e_1(\tau) \, d\tau = -\frac{e_1(0)}{\beta} \quad (12)$$

Cooperative Control

395

Fig. 1. Block diagram of PMSM servo system

where $e_1(0)$ is the initial state of $e_1$, $I_0$ is the integral initial value, and $\beta$ is a positive constant. Then $s_1(0) = 0$, which means that the system is on the sliding surface from the beginning, so it has global robustness. Differentiating (11) gives

$$\dot{s}_1 = \dot{e}_1 + \beta e_1 = v_1 + \beta e_1 \quad (13)$$

The reaching law is described as

$$\dot{s}_1 = -\varepsilon_1 \operatorname{sgn}(s_1) - \lambda_1 s_1, \quad \varepsilon_1 > 0, \ \lambda_1 > 0 \quad (14)$$

where $\varepsilon_1$ and $\lambda_1$ are positive constants. Combining (13) and (14) yields

$$-\varepsilon_1 \operatorname{sgn}(s_1) - \lambda_1 s_1 = v_1 + \beta e_1 \quad (15)$$

Therefore, the excitation current $i_d$ sliding mode control law is

$$v_1 = -\varepsilon_1 \operatorname{sgn}(s_1) - \lambda_1 s_1 - \beta e_1 \quad (16)$$
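On the decoupled error dynamics (10), the d-axis law (13)–(16) can be exercised numerically. β, ε1 and λ1 follow the values later listed in Table 1, while the step size and initial error are illustrative:

```python
beta, eps1, lam1 = 20.0, 0.01, 1000.0   # Table 1 values

def smc_d_axis(e1, int_e1):
    """Control law (16): v1 = -eps1*sgn(s1) - lam1*s1 - beta*e1, with s1 = e1 + beta*int(e1)."""
    s1 = e1 + beta * int_e1
    sgn = (s1 > 0) - (s1 < 0)
    return -eps1 * sgn - lam1 * s1 - beta * e1

# integrate e1_dot = v1, eq. (10), from a nonzero initial current error
dt, e1 = 1e-4, 0.5
int_e1 = -e1 / beta   # integral initial value (12), so that s1(0) = 0 (global robustness)
for _ in range(2000):
    e1 += dt * smc_d_axis(e1, int_e1)
    int_e1 += dt * e1
```

Because the trajectory starts on the surface, the error decays roughly as $e^{-\beta t}$ without a reaching phase.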

Then, the second sliding mode controller is designed for position control. Define the position error $e_2 = \theta - \theta^*$ with first derivative $\dot{e}_2 = \dot{\theta} - \dot{\theta}^*$. The sliding surface is designed as

$$s_2 = \ddot{e}_2 + c_1 \dot{e}_2 + c_2 e_2 + c_3 \int_{0}^{t} e_2 \, dt \quad (17)$$

where $c_1$, $c_2$ and $c_3$ are positive constants. Differentiating (17) gives

$$\dot{s}_2 = \dddot{e}_2 + c_1 \ddot{e}_2 + c_2 \dot{e}_2 + c_3 e_2 = v_2 - \dddot{\theta}^* + c_1 \ddot{e}_2 + c_2 \dot{e}_2 + c_3 e_2 \quad (18)$$


The improved power reaching law is designed by adding a linear term to improve the reaching efficiency:

$$\dot{s}_2 = -\varepsilon_2 |s_2|^{\gamma} \operatorname{sgn}(s_2) - \lambda_2 s_2, \quad \varepsilon_2 > 0,\ 0 < \gamma < 1,\ \lambda_2 > 0 \quad (19)$$

where $\varepsilon_2$, $\lambda_2$ and $\gamma$ are positive constants. Combining (18) and (19) yields

$$v_2 - \dddot{\theta}^* + c_1 \ddot{e}_2 + c_2 \dot{e}_2 + c_3 e_2 = -\varepsilon_2 |s_2|^{\gamma} \operatorname{sgn}(s_2) - \lambda_2 s_2 \quad (20)$$

The position sliding mode control law can be expressed as

$$v_2 = -\varepsilon_2 |s_2|^{\gamma} \operatorname{sgn}(s_2) - \lambda_2 s_2 + \dddot{\theta}^* - c_1 \ddot{e}_2 - c_2 \dot{e}_2 - c_3 e_2 \quad (21)$$

Therefore, the SMC-FL controller can be derived as

$$\begin{bmatrix} u_{sd} \\ u_{sq} \end{bmatrix} = \begin{bmatrix} n_{11} & n_{12} \\ n_{21} & n_{22} \end{bmatrix}^{-1} \begin{bmatrix} v_1 - m_1 \\ v_2 - m_2 \end{bmatrix} \quad (22)$$

3.2 EPH Controller Design

The PH system with dissipation is

$$\begin{cases} \dot{x} = \left[J(x) - R(x)\right] \dfrac{\partial H(x)}{\partial x} + g(x) u_e \\ y = g^T(x) \dfrac{\partial H(x)}{\partial x} \end{cases} \quad (23)$$

where $R(x) = R^T(x) \ge 0$ represents the dissipation, $J(x) = -J^T(x)$ is the interconnection structure, and $H(x)$ is the Hamiltonian function. The input and output vectors are defined as $u_e = [u_{ed}, u_{eq}]^T$ and $y = [i_d, i_q]^T$. The Hamiltonian function is

$$H(x) = \frac{1}{2}\left(\frac{1}{L} x_1^2 + \frac{1}{L} x_2^2 + \frac{3}{2J} x_3^2\right) + \tau_L x_4 \quad (24)$$

The system (1) is described in PH form with

$$J(x) = \begin{bmatrix} 0 & 0 & n_p x_2 & 0 \\ 0 & 0 & -n_p(x_1 + \Phi) & 0 \\ -n_p x_2 & n_p(x_1 + \Phi) & 0 & -\frac{5}{6} \\ 0 & 0 & \frac{5}{6} & 0 \end{bmatrix}$$

$$R(x) = \begin{bmatrix} R_s & 0 & 0 & 0 \\ 0 & R_s & 0 & 0 \\ 0 & 0 & \frac{2}{3} R_f & -\frac{1}{6} \\ 0 & 0 & -\frac{1}{6} & 0 \end{bmatrix}, \quad g(x) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \quad (25)$$

The PMSM is controlled under maximum torque per ampere (MTPA), so the equilibrium point is

$$x^* = \left[L i_d^*\ \ L i_q^*\ \ \tfrac{2}{3} J\omega^*\ \ \theta^*\right]^T = \left[0\ \ \frac{2L\tau_L + 2L R_f \omega^*}{3 n_p \Phi}\ \ \tfrac{2}{3} J\omega^*\ \ \theta^*\right]^T \quad (26)$$


Let $\tilde{x} = x - x^*$ be the state error. The desired Hamiltonian function is $H_d(\tilde{x}) = \frac{1}{2}\tilde{x}^T D^{-1} \tilde{x} = \frac{1}{2}\left(\frac{\tilde{x}_1^2}{L} + \frac{\tilde{x}_2^2}{L} + \frac{3\tilde{x}_3^2}{2J} + \rho \tilde{x}_4^2\right)$, where $\rho$ is a constant, $H_d(\tilde{x}) > 0$ and $H_d(0) = 0$. Assume we can find $u = \alpha(x)$ and expected interconnection and damping matrices $J_d(\tilde{x})$ and $R_d(\tilde{x})$ satisfying

$$g(x)\alpha(x) = \left[J_d(\tilde{x}) - R_d(\tilde{x})\right]\frac{\partial H_d(\tilde{x})}{\partial \tilde{x}} - \left[J(x) - R(x)\right]\frac{\partial H(x)}{\partial x} \quad (27)$$

Then the PH system with $u = \alpha(x)$ can be described as

$$\dot{\tilde{x}} = \left[J_d(\tilde{x}) - R_d(\tilde{x})\right]\frac{\partial H_d(\tilde{x})}{\partial \tilde{x}} \quad (28)$$

and it is asymptotically stable at $\tilde{x} = 0$ [9]. Let

$$J_a = \begin{bmatrix} 0 & a & -n_p\tilde{x}_2 & -L\tilde{x}_2 \\ -a & 0 & n_p\tilde{x}_1 & L\tilde{x}_1 \\ n_p\tilde{x}_2 & -n_p\tilde{x}_1 & 0 & 0 \\ L\tilde{x}_2 & -L\tilde{x}_1 & 0 & 0 \end{bmatrix}, \quad R_a = \begin{bmatrix} r_s & 0 & 0 & 0 \\ 0 & r_s & 0 & 0 \\ 0 & 0 & r_s & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \quad (29)$$

where $a$ is an adjustable parameter. Through variable damping injection, a larger damping is added in the early start-up stage to accelerate the system response, and a smaller damping is added when approaching steady state. The variable damping design [10,11] is

$$\begin{cases} \dot{\eta}_1 = \eta_2 \\ \dot{\eta}_2 = -k_c \Psi\!\left(\eta_1 - r(t) + \dfrac{\eta_2 |\eta_2|}{2\gamma},\ \delta\right) \\ r_s = k_1 - (k_1 - k_2)\eta_1 \end{cases} \quad (30)$$

$$\Psi(\varphi, \delta) = \begin{cases} \operatorname{sign}(\varphi), & |\varphi| > \delta \\ \dfrac{\varphi}{\delta}, & |\varphi| \le \delta \end{cases} \quad (31)$$

where $\eta_1$ tracks the desired signal $r(t)$, $\delta > 0$ is a small constant, $k_c$ is an acceleration factor, and $k_1$ and $k_2$ are the damping injection values at the initial and steady-state conditions, respectively. To eliminate the steady-state error, integral control is added to the EPH system:

$$z_I = -K_I \int_{0}^{t} g^T(\tilde{x}) \frac{\partial H_d(\tilde{x})}{\partial \tilde{x}} \, dt, \quad K_I = \begin{bmatrix} k_{I1} & 0 \\ 0 & k_{I2} \end{bmatrix} \quad (32)$$

where $K_I$ is the integral coefficient matrix, and the new closed-loop system can easily be shown to remain a state EPH system. Let the new Hamiltonian function be

$$H_{di}(\tilde{x}, z_I) = H_d(\tilde{x}) + \frac{1}{2} z_I^T K_I^{-1} z_I \quad (33)$$
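The variable damping schedule (30)–(31) can be sketched directly in Python. The saturation function Ψ and the linear interpolation of the injected damping rs between k1 (start-up) and k2 (steady state) are reproduced below; the parameter values used in the test follow Table 1:

```python
def psi(phi, delta):
    """Saturation function (31): sign outside the boundary layer, linear inside."""
    if abs(phi) > delta:
        return 1.0 if phi > 0 else -1.0
    return phi / delta

def damping_step(eta1, eta2, r, kc, gamma, delta, k1, k2, dt):
    """One Euler step of (30); returns the updated (eta1, eta2) and the injected damping rs."""
    d_eta2 = -kc * psi(eta1 - r + eta2 * abs(eta2) / (2.0 * gamma), delta)
    eta1 = eta1 + dt * eta2
    eta2 = eta2 + dt * d_eta2
    rs = k1 - (k1 - k2) * eta1
    return eta1, eta2, rs
```

As η1 moves from 0 toward the reference, rs sweeps smoothly from k1 down to k2.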


Therefore, the feedback control law can be written as

$$\begin{cases} u_{ed} = R_s i_d^* - r_s(i_d - i_d^*) + a(i_q - i_q^*) - \rho L^2 (i_q - i_q^*)(\theta - \theta^*) \\ \qquad\quad - n_p \omega L (i_q - i_q^*) - n_p \omega^* L i_q^* - k_{I1} \displaystyle\int_0^t (i_d - i_d^*) \, dt \\ u_{eq} = R_s i_q^* - a(i_d - i_d^*) - r_s(i_q - i_q^*) + \rho L^2 (i_d - i_d^*)(\theta - \theta^*) \\ \qquad\quad + n_p \omega L (i_d - i_d^*) + n_p \omega^* (L i_d^* + \Phi) - k_{I2} \displaystyle\int_0^t (i_q - i_q^*) \, dt \end{cases} \quad (34)$$

3.3 Load Torque Estimation

In practical control systems, the load torque is unknown. To estimate it, a load observer is designed as

$$\begin{cases} \dot{\hat{\theta}} = \hat{\omega} + l_1(\theta - \hat{\theta}) \\ \dot{\hat{\omega}} = \dfrac{3 n_p}{2J}\Phi i_q - \dfrac{R_f}{J}\hat{\omega} - \dfrac{\hat{\tau}_L}{J} + l_2(\theta - \hat{\theta}) \\ \dot{\hat{\tau}}_L = l_3(\theta - \hat{\theta}) \end{cases} \quad (35)$$

where $l_1$, $l_2$ and $l_3$ are observer coefficients. Define the estimation errors $\tilde{\theta} = \theta - \hat{\theta}$, $\tilde{\omega} = \omega - \hat{\omega}$ and $\tilde{\tau}_L = \tau_L - \hat{\tau}_L$. The error dynamics are

$$\begin{bmatrix} \dot{\tilde{\theta}} \\ \dot{\tilde{\omega}} \\ \dot{\tilde{\tau}}_L \end{bmatrix} = \begin{bmatrix} -l_1 & 1 & 0 \\ -l_2 & -\frac{R_f}{J} & -\frac{1}{J} \\ -l_3 & 0 & 0 \end{bmatrix} \begin{bmatrix} \tilde{\theta} \\ \tilde{\omega} \\ \tilde{\tau}_L \end{bmatrix} \quad (36)$$

The characteristic equation of (36) is

$$s^3 + \left(\frac{R_f}{J} + l_1\right)s^2 + \left(\frac{R_f}{J} l_1 + l_2\right)s - \frac{l_3}{J} = 0 \quad (37)$$
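With the observer gains later listed in Table 1 (l1 = 1000, l2 = 30, l3 = −10) and the simulation-section inertia J = 0.003 kg·m² (the friction coefficient Rf below is an assumed value), the cubic stability conditions and the decay of the error dynamics (36) can be checked numerically:

```python
J, Rf = 0.003, 1e-4                 # J from the simulation section; Rf is an assumed value
l1, l2, l3 = 1000.0, 30.0, -10.0    # observer gains from Table 1

# coefficients of s^3 + a2*s^2 + a1*s + a0 = 0, eq. (37)
a2 = Rf / J + l1
a1 = (Rf / J) * l1 + l2
a0 = -l3 / J

# Routh-Hurwitz for a cubic: all coefficients positive and a2*a1 > a0
stable = a2 > 0 and a1 > 0 and a0 > 0 and a2 * a1 > a0

# simulate the error dynamics (36) from a pure load-torque error
err = [0.0, 0.0, 1.0]               # [theta_err, omega_err, tauL_err]
dt = 1e-3
for _ in range(200000):             # 200 s: with these gains the dominant poles are slow
    d0 = -l1 * err[0] + err[1]
    d1 = -l2 * err[0] - (Rf / J) * err[1] - err[2] / J
    d2 = -l3 * err[0]
    err = [err[0] + dt * d0, err[1] + dt * d1, err[2] + dt * d2]
```

With these gains the dominant pole pair is lightly damped, so a long horizon is needed before the load-torque error visibly dies out.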

According to the Routh criterion, the system is asymptotically stable when $l_3 < 0$, $l_1 > -\frac{R_f}{J}$, $l_2 > -\frac{R_f}{J} l_1$ and $\left(\frac{R_f}{J} + l_1\right)\left(\frac{R_f}{J} l_1 + l_2\right) > -\frac{l_3}{J}$. Thus $\tilde{\tau}_L$ decays exponentially to zero, and $\hat{\tau}_L$ converges to $\tau_L$ rapidly. Replacing $\tau_L$ with $\hat{\tau}_L$ in (22) and (34) yields the two controllers with load estimation.

3.4 Cooperative Control Strategy Design

The SMC-FL controller has fast dynamic tracking performance, while the EPH controller has superior stability. The cooperative controller combines the strong points of both, so the system has fast tracking performance and high control accuracy. The cooperative controller can be written as

$$\begin{cases} u_d = c(e_\theta) u_{sd}(t) + [1 - c(e_\theta)] u_{ed}(t) \\ u_q = c(e_\theta) u_{sq}(t) + [1 - c(e_\theta)] u_{eq}(t) \end{cases} \quad (38)$$


Fig. 2. Gaussian function curves

where $e_\theta = \theta - \theta^*$. The Gaussian function is chosen as the cooperative strategy:

$$c(e_\theta) = 1 - e^{-\left(e_\theta / \sigma\right)^2} \quad (39)$$

where $\sigma$ is a scale parameter. Gaussian function curves for different values of $\sigma$ are shown in Fig. 2.
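The blend (38)–(39) is a smooth switch: for a large position error $c(e_\theta) \to 1$ and SMC-FL dominates; near the reference $c(e_\theta) \to 0$ and EPH dominates. A minimal sketch with an illustrative σ:

```python
import math

def c_weight(e_theta, sigma):
    """Gaussian cooperative weight (39)."""
    return 1.0 - math.exp(-(e_theta / sigma) ** 2)

def cooperative_u(u_smc, u_eph, e_theta, sigma=0.1):
    """Blend one voltage channel of the two controllers per (38)."""
    c = c_weight(e_theta, sigma)
    return c * u_smc + (1.0 - c) * u_eph
```

Smaller σ narrows the hand-over region around the reference, mirroring the curves of Fig. 2.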

4 Simulation Results

The PMSM model parameters are $L_d = L_q = 3$ mH, $\Phi = 0.29$ Wb, $R_s = 0.93\ \Omega$, $J = 0.003$ kg·m², $n_p = 4$. The proposed controller parameters are listed in Table 1.

Table 1. Controller parameters

SMC-FL: ε1 = 0.01, ε2 = 0.5, λ1 = 1000, λ2 = 500, c1 = 650, c2 = 5000, c3 = 200, β = 20, γ = 0.1
EPH: k1 = 1500, k2 = 20, kI1 = 0.01, kI2 = 0.01, kc = 0.001, δ = 0.01, a = 10, ρ = 1
Other parameters: l1 = 1000, l2 = 30, l3 = −10

Figure 3 shows a comparison of SMC-FL, EPH and the cooperative controller under the soft-start reference $\theta^* = 1 - e^{-2t}$ rad. At t = 2.2 s, the cooperative curve almost completely coincides with the reference curve. At t = 0 s, the proposed method has almost zero overshoot, while SMC-FL and EPH have


Fig. 3. Simulation results when θ∗ = 1 − e−2t

Fig. 4. Simulation results when θ∗ = sin (t)

overshoots of 4‰ and 2‰, respectively. Therefore, the comprehensive performance of the cooperative strategy is better than that of the SMC-FL and EPH controllers. Subfigures (c) and (d) show the voltage curves of the cooperative controller. Then, $\theta^* = \sin(t)$ is used as the reference signal to verify the steady-state performance of the proposed controller; Fig. 4 presents the response curves of the three methods. From subfigure (b), the steady-state error of SMC-FL is approximately 1‰, while that of the cooperative controller is approximately 0.1‰. Although the steady-state error of EPH is also 0.1‰, its overshoot at the beginning is greater than that of the cooperative controller. Overall, the proposed method has better steady-state performance.

5 Conclusion

This paper proposed a cooperative control strategy that combines the SMC-FL and EPH methods to balance the dynamic and steady-state performance of the PMSM. FL is used to decouple $i_d$ and $\theta$ to obtain a linear system, and SMC is designed to improve the robustness of the new linear system. Then, an EPH controller with variable damping injection and an integral term is designed to improve stability. The Gaussian function is adopted as the coordination function. Finally, simulation results under two different reference signals validate the superior performance of the proposed controller.

Acknowledgments. This research work was supported by the National Natural Science Foundation of China under Grant 62273189 and the Shandong Province Natural Science Foundation of China under Grant ZR2021MF005.

References

1. Bi, G., et al.: High-frequency injection angle self-adjustment based online position error suppression method for sensorless PMSM drives. IEEE Trans. Power Electron. 38(2), 1412–1417 (2022)
2. Wang, T., Guo, L., Wang, K., Wu, J., Liu, C., Zhu, Z.: Generalized predictive current control for dual-three-phase PMSM to achieve torque enhancement through harmonic injection. IEEE Trans. Power Electron. 38(5), 6422–6433 (2023)
3. Wang, Y., Liu, X.: Model predictive position control of permanent magnet synchronous motor servo system with sliding mode observer. Asian J. Control 25(1), 443–461 (2023)
4. Li, H., Wang, Z., Xu, Z., Wang, X., Hu, Y.: Feedback linearization based direct torque control for IPMSMs. IEEE Trans. Power Electron. 36(3), 3135–3148 (2020)
5. Meng, X., Yu, H., Zhang, J., Yan, K.: Optimized control strategy based on EPCH and DBMP algorithms for quadruple-tank liquid level system. J. Process Control 110, 121–132 (2022)
6. Chopra, N., Fujita, M., Ortega, R., Spong, M.W.: Passivity-based control of robots: theory and examples from the literature. IEEE Control Syst. 42(2), 63–73 (2022)
7. Meng, X., Yu, H., Zhang, J., Yang, Q.: Adaptive EPCH strategy for nonlinear systems with parameters uncertainty and disturbances. Nonlinear Dyn. 111(8), 7511–7524 (2023)
8. Gil-González, W.J., Garces, A., Fosso, O.B., Escobar-Mejía, A.: Passivity-based control of power systems considering hydro-turbine with surge tank. IEEE Trans. Power Syst. 35(3), 2002–2011 (2019)
9. Ortega, R., van der Schaft, A., Maschke, B., Escobar, G.: Interconnection and damping assignment passivity-based control of port-controlled Hamiltonian systems. Automatica 38(4), 585–596 (2002)
10. Wang, W., Tong, S.: Adaptive fuzzy bounded control for consensus of multiple strict-feedback nonlinear systems. IEEE Trans. Cybern. 48(2), 522–531 (2017)
11. Han, J.: From PID to active disturbance rejection control. IEEE Trans. Ind. Electron. 56(3), 900–906 (2009)

Identification of Plant Nutrient Deficiency Based on Improved MobileNetV3-Large Model

Qian Yan1, Yifei Chen2(B), and Caicong Wu2

1 China Agriculture University, Beijing 100080, China
[email protected]
2 College of Information and Electrical Engineering, Key Laboratory of Agricultural Machinery Monitoring and Big Data, China Agriculture University, Beijing 100080, China
[email protected], [email protected]

Abstract. The nutrient element content of plants is one of the important factors affecting crop growth and yield. Convolutional neural networks (CNNs) can quickly and accurately detect the degree of nutrient deficiency in plants, thereby freeing up manpower and improving crop yields. This study uses MobileNetV3-Large as the backbone, combines it with a convolutional block attention module (CBAM) to obtain MobileNetV3-CBAM, and introduces gated linear units (GLU). During training, fine-tuning with transfer learning improves the speed and accuracy of model training. The identification of deficient elements in plants is studied on an open-source dataset, and compared with cutting-edge CNNs, the proposed lightweight MobileNetV3-CBAM with GLU has better comprehensive performance: I. On the imbalanced dataset, the proposed MobileNetV3-CBAM model achieves outstanding results in classifying plant nutrient deficiencies, with a remarkable test accuracy of 96.54%. II. The proposed model occupies little memory (10.5M) and trains quickly, enabling identification of plant nutrient deficiency in a low-computing environment. III. Under 10-fold cross-validation, the robustness of the model is good. This study can provide a theoretical basis and technical support for the real-time detection of nutrient deficiency in plant leaves.

Keywords: Plant Nutrient Deficiency Recognition · Attention · MobileNetV3-Large

1 Introduction

The lack of plant nutrient elements is a common problem in the plant growth cycle. If fertilizer is not applied in time, problems such as yellowing of leaves and slow growth arise, ultimately affecting crop yield. Traditional methods for plant analysis, such as visual inspection and manual measurement [1], are susceptible to errors and misjudgments [2]. CNNs have made significant advancements in the deep learning-based classification of plant nutrient deficiencies. Tran et al. [3] classified 571 tomato images for N, K, and Ca nutrient deficiencies by using InceptionResNetV2, an autoencoder, and the integration

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 402–408, 2023. https://doi.org/10.1007/978-981-99-6187-0_40

Identification of Plant Nutrient Deficiency

403

of the two, and achieved 87.273% and 79.091% accuracy, respectively. Xu et al. [4] determined the different nutrient components of rice leaves: four CNNs were evaluated, achieved more than 90% accuracy, and outperformed the traditional machine learning method SVM. The aforementioned studies on various CNN models were conducted in GPU environments. However, low computing power environments are common in real-world engineering applications.

Taking inspiration from human attention mechanisms, researchers have developed methods to selectively focus on key aspects of received information. By applying attention mechanisms to convolutional neural networks (CNNs), they have successfully improved performance [5]. Numerous studies have demonstrated the effectiveness of attention mechanisms in enhancing CNNs' feature extraction capabilities and overall performance [6]. Hu et al. [7] proposed the Squeeze-and-Excitation (SE) module, which adjusts the weights of feature maps to improve model performance. Wang et al. [8] developed a residual neural network consisting of multiple stacked attention modules, an architecture that demonstrated remarkable performance. Jiang et al. [9] applied the Convolutional Block Attention Module (CBAM) to enhance feature extraction. CBAM comprises two sub-modules, the Channel Attention Module and the Spatial Attention Module, which calculate channel and spatial weights, enabling more efficient extraction of essential features from input images.

For our study, we have selected the MobileNetV3-Large model as the base model and improved it. The main contributions are as follows:

1. The CBAM attention module is added to MobileNetV3-Large to help the model extract and select features more accurately, while preventing the model from overfitting.
2. The GLU is introduced to allow the model to learn more useful features, thereby improving the generalization ability, robustness, and accuracy of the model.
3. Fine-tuning with transfer learning accelerates model training and enhances accuracy.
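To make CBAM's weighting concrete, the following dependency-free Python sketch implements only its channel-attention step on nested lists: each channel is average- and max-pooled over space, both descriptors pass through a shared MLP, the results are summed and squashed by a sigmoid, and the channels are rescaled. The tiny hand-set MLP weights are illustrative placeholders (real implementations learn them as tensors), and the spatial sub-module is omitted:

```python
import math

def _mlp(vec, w1, w2):
    """Shared two-layer MLP: ReLU hidden layer, linear output."""
    hidden = [max(0.0, sum(v * w for v, w in zip(vec, hw))) for hw in w1]
    return [sum(h * w for h, w in zip(hidden, ow)) for ow in w2]

def channel_attention(fmaps, w1, w2):
    """CBAM channel attention over a list of C feature maps (2-D lists)."""
    avg = [sum(sum(row) for row in m) / (len(m) * len(m[0])) for m in fmaps]
    mx = [max(max(row) for row in m) for m in fmaps]
    scores = [s + t for s, t in zip(_mlp(avg, w1, w2), _mlp(mx, w1, w2))]
    gates = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    return [[[g * v for v in row] for row in m] for m, g in zip(fmaps, gates)]

# two 2x2 channels; C = 2 with a single hidden unit in the shared MLP
fmaps = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
w1 = [[1.0, 1.0]]       # hidden-unit weights over the C-dim descriptor
w2 = [[0.5], [0.5]]     # per-channel output weights over the hidden layer
out = channel_attention(fmaps, w1, w2)
```

The active channel is rescaled by its sigmoid gate while the all-zero channel stays zero, which is the re-weighting effect the module contributes inside MobileNetV3-CBAM.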

2 Methods

2.1 The Improved MobileNetV3

Fig. 1. The network structure of MobileNetV3-CBAM

404

Q. Yan et al.

To enhance feature extraction and selection, MobileNetV3-Large [10] is used as the backbone in this article. A CBAM module is connected before the last pooling layer of MobileNetV3-Large, resulting in MobileNetV3-CBAM. This modification enables the model to learn more valuable features and improves the accuracy of feature extraction and selection. At the same time, the GLU [11] gated activation function is added after the convolutional layer to reduce the interference of redundant features and improve the training efficiency and accuracy of the model. The framework is shown in Fig. 1.
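The GLU gating can be illustrated in a few lines: the input is split into a linear half a and a gate half b, and the output is a ⊗ sigmoid(b). Plain vectors are used here for illustration; in the model the gate acts on convolution outputs:

```python
import math

def glu(x):
    """Gated linear unit: split x of even length 2n into (a, b), return a * sigmoid(b)."""
    n = len(x) // 2
    a, b = x[:n], x[n:]
    return [ai * (1.0 / (1.0 + math.exp(-bi))) for ai, bi in zip(a, b)]
```

A zero gate value halves its paired activation (sigmoid(0) = 0.5); strongly negative gates suppress redundant features toward zero.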

3 Experiments

3.1 Datasets

The plant nutrient deficiency images used in this study come from a publicly available dataset created by Sunitha P. [12], which contains 3597 RGB images of plant nutrient deficiency. To augment the dataset, we apply rotation, mirroring, flipping, and cropping, resulting in a total of 8646 images. The per-class counts are shown in Table 1. The ratio of the training set, validation set, and test set is 8:1:1.

Table 1. Distribution of the dataset

Class | Training | Validation | Test | Total
Boron | 640 | 80 | 80 | 800
Calcium | 704 | 88 | 88 | 880
Iron | 688 | 86 | 86 | 860
Magnesium | 640 | 80 | 80 | 800
Manganese | 672 | 84 | 84 | 840
Potassium | 672 | 84 | 84 | 840
Sulphur | 992 | 124 | 124 | 1240
Zinc | 640 | 80 | 80 | 800
Healthy | 1268 | 159 | 159 | 1586
Total | 6916 | 865 | 865 | 8646

4 Results

4.1 Comparison with the Base Model

In this study, we conduct ablation experiments to compare the training behavior of MobileNetV3-CBAM with GLU against the base model and against adding the CBAM module or the GLU individually. Figure 2 shows that the training loss of MobileNetV3-CBAM with GLU decreases rapidly in the early stage and then slows down,


and the oscillation is not obvious in the later stage; the loss gradually stabilizes, giving the best result. Transfer learning is employed in this study, allowing the model to converge rapidly during training. Figure 2 also reveals that the base model is initially unstable, with significant oscillations and slower convergence than the other models. After adding the CBAM module and the GLU individually, the model is more stable: there are oscillations in the early stage and a relatively stable loss curve in the later stage, with little change in convergence speed.

Fig. 2. The loss curve of the ablation experiment

Figure 3 compares the improved MobileNetV3-CBAM with GLU model and the base model. The loss and accuracy curves show that the improved algorithm both raises recognition accuracy and stabilizes training: the overall training process is relatively stable and converges faster than the base model.


Fig. 3. The loss and accuracy curve of the MobileNetV3-CBAM model and the base model

4.2 Compared With CNN Model We compared the experimental results of the improved MobileNetV3-CBAM with GLU model with other mainstream CNN models on plant nutrient deficiency recognition tasks, as shown in Table 2, deep CNN networks such as ResNet50[13], InceptionResnetV2 [14] The accuracy rate on the test set is relatively high, but the model is large and the training time is long. Lightweight models, such as GhostNet [15], MuxNet [16], and other models are small in size, among which the GhostNet model is the smallest, only 9.1M. The improved MobileNetV3-CBAM with GLU model is generally better than other models in terms of accuracy and model size. Compared with the deep CNN network, our model has greatly reduced the model size, and the accuracy rate has also improved. Compared with other lightweight models, the accuracy rate of our model is significantly better, which has increased by 0.12%–1.29%. Evidently, CBAM demonstrates the advantage of straightforward integration with minimal computational resources and low memory demands. Table 2. Comparison results of our MobileNetV3-CBAM model and other CNN models Model

Accuracy

Loss

Model Size

ResNet50

94.03%

0.2302

188.9M

InceptionResnetV2

96.23%

0.1695

289.7M

VGG16

94.65%

0.2134

88.5M

EfficientNetB0 GhostNet

95.25% 95.83%

0.1873 0.1725

277.1M 9.1M

MuxNet

96.42%

0.1534

14.9M

MobileNetV3-CBAM(ours)

96.54%

0.1450

10.5M


5 Conclusion

The focus of this paper is a highly effective approach for the automated identification and classification of plant nutrient deficiencies. The method uses the lightweight MobileNetV3-Large as the base model and adds CBAM and GLU to improve recognition. The generalization ability of the model was verified using banana leaves. The proposed method is highly applicable in mobile scenarios with limited computational resources, making it a practical solution for real-world engineering applications. The experimental results show that:

1. MobileNetV3-CBAM with GLU adds the CBAM module and the GLU gated activation function to the original model, helping the model extract and select features more accurately and thereby improving recognition accuracy and stability, achieving 96.54% plant nutrient deficiency identification accuracy.
2. MobileNetV3-CBAM with GLU has the best comprehensive performance. Compared with other mainstream CNN results, it maintains strong identification performance under limited computing resources, occupies little memory (10.5M), trains and classifies quickly, and can identify plant nutrient deficiency in a low-computing environment.
3. Fine-tuning through transfer learning improves the speed and accuracy of model training and effectively prevents overfitting.

References

1. Patil, S.S., Dhandra, B.V., Angadi, U.B., Shankar, A.G., Joshi, N.: Web based expert system for diagnosis of micro nutrients deficiencies in crops. In: Proceedings of the World Congress on Engineering and Computer Science, pp. 20–22 (2009)
2. Kumar, P., Sharma, M.K. (eds.): Nutrient Deficiencies of Field Crops: Guide to Diagnosis and Management. CABI (2013)
3. Tran, T.T., Choi, J.W., Le, T.T.H., Kim, J.W.: A comparative study of deep CNN in forecasting and classifying the macronutrient deficiencies on development of tomato plant. Appl. Sci. 9(8), 1601 (2019)
4. Zhe, X., et al.: Using deep convolutional neural networks for image-based diagnosis of nutrient deficiencies in rice. Comput. Intell. Neurosci. 2020, 1–12 (2020)
5. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014)
6. Jaderberg, M., Simonyan, K., Zisserman, A.: Spatial transformer networks. In: Advances in Neural Information Processing Systems, vol. 28 (2015)
7. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
8. Wang, F., et al.: Residual attention network for image classification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3156–3164 (2017)
9. Jiang, Y., Pang, D., Li, C.: A deep learning approach for fast detection and classification of concrete damage. Autom. Constr. 128, 103785 (2021)
10. Howard, A., et al.: Searching for MobileNetV3. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 1314–1324 (2019)
11. Dauphin, Y.N., Fan, A., Auli, M., Grangier, D.: Language modeling with gated convolutional networks. In: International Conference on Machine Learning, pp. 933–941. PMLR (2017)


12. Sunitha, P.: Images of nutrient deficient banana plant leaves. Mendeley Data, V1 (2022). https://doi.org/10.17632/7vpdrbdkd4.1
13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
14. Szegedy, C., Ioffe, S., Vanhoucke, V., Alemi, A.: Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 31, no. 1 (2017)
15. Han, K., Wang, Y., Tian, Q., Guo, J., Xu, C., Xu, C.: GhostNet: more features from cheap operations. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1580–1589 (2020)
16. Lu, Z., Deb, K., Boddeti, V.N.: MUXConv: information multiplexing in convolutional neural networks. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12044–12053 (2020)

Modular Smart Vehicle Design and Technology for Shared Mobility

Mo Zhou1,2, Xinyu Zhang1,2(B), Jun Li1,2, Ying Fu3, Xuebo Zhang3, and Kun Wang4

1 State Key Laboratory of Automotive Safety and Energy, Tsinghua University, Beijing, China
[email protected]
2 School of Vehicle and Mobility, Tsinghua University, Beijing, China
3 FAW Car Co., Ltd., Changchun, China
4 Daimler Greater China Ltd., Beijing, China

Abstract. Recently, single-vehicle intelligence is gradually evolving into multi-vehicle and collaborative intelligence in traffic scenarios, and shared travel modes are gaining wide recognition with the development of the sharing economy. Intelligent shared travel based on smart driving vehicles is therefore a powerful way to improve traffic safety and efficiency. Based on modular design theory, this paper designs a smart vehicle that integrates modularity and intelligence. An innovative division method based on functional decomposition is proposed to study the modularized vehicle chassis, body, environment sensing, and computational control technology. Moreover, the design breaks through vital technical bottlenecks of autonomous driving and provides an innovative solution to meet the demand for intelligent shared mobility. To validate the proposed platform, the performance of key modules and of the whole vehicle is evaluated at a closed test site. The experimental results show that the proposed system can provide safe shared mobility services, laying the foundation for an intelligent shared mobility system in smart cities.

Keywords: Modular Smart Vehicle · Autonomous Driving · Vehicle Design · Shared Mobility

1 Introduction

With the ongoing energy revolution and continual advances in new materials and the latest generation of information technology, automotive products are accelerating toward electrification, intelligence and sharing [1]. Vehicles are transforming from transportation into large mobile intelligent terminals and third spaces, realizing intelligent interconnection and data sharing among occupants, freight, operation platforms and infrastructure [2–4]. Mobility services are the main direction of the transformation of the automotive industry. Under the combined influence of the mobility revolution, the digital economy and manufacturing servitization, the automotive industry is undergoing profound changes, shifting the center of gravity of its value chain further toward the back end.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 409–416, 2023. https://doi.org/10.1007/978-981-99-6187-0_41

410

M. Zhou et al.

According to the forecast of Frost & Sullivan [5], by 2030, 54% of global highly automated vehicles will adopt the private business model, and 8 million SAE Level 4 vehicles will be used for shared mobility, driving new business models for autonomous driving, as shown in Fig. 1.

Fig. 1. 2015–2030 Automotive Industry Value Shift Forecast [5].

The modular design of smart vehicles is the foundation of smart shared mobility platforms and can better meet the needs of different applications. For example, the Toyota e-Palette Concept intelligent vehicle combines different chassis and cabin modules to serve a variety of uses, including ride-sharing and freight delivery. For shared mobility in particular, studies in Singapore have shown that shared mobility solutions could meet individual mobility demands with only 30% of today's vehicles. Combining self-driving cars and sharing will significantly increase car utilization and optimize trip waiting times [6]. Smart shared vehicles can serve many customers during the day, resulting in far less parking time than private cars. After simulating self-driving vehicle parking in Atlanta, Georgia, U.S. academics found that, accompanied by a 5% reduction in car ownership and an increase in vehicle ridership, each self-driving vehicle could free up to 20 parking spaces [7]. In summary, self-driving vehicles can significantly improve the utilization of existing shared mobility vehicles, shorten passenger wait times, and considerably reduce the number of cars needed to complete the same number of trips. Therefore, this paper is dedicated to a smart vehicle that adapts to the multi-modal sharing economy. According to the principle of functional independence, the entire function is decomposed into multiple sub-functions, such as chassis, body, environment perception, and computing control, each with its own functional module system. Based on serialization, modularization, and standardized interface design, a modular coding system architecture for intelligent driving vehicles is established to realize good mobility, light weight, and intelligence.
This work explores the advantages of modular vehicles in terms of performance and adaptability, and promotes the transformation of traditional delivery vehicles to modular mobility platforms.

Modular Smart Vehicle Design and Technology for Shared Mobility


2 Functional Definition and Scheme Design

The functional definition of a modular, intelligent, all-in-one self-driving vehicle is proposed for light logistics, mobility, and public services in smart cities. The modularization scheme based on functional decomposition is then studied, focusing on the key technologies of the modular vehicle chassis, body, environment perception, and computing control modules. At present, traditional urban transportation cannot meet urban mobility needs, and a smart-city intelligent transportation system built around modular multifunctional vehicles is a key solution. Figure 2 illustrates the design concept of a modular vehicle with multiple functions: based on a multifunctional chassis, the vehicle's functionality can be extended by adding different bodies and components.
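The serialized, coded module system described above can be illustrated with a small configuration sketch. This is a hypothetical data model for the chassis/body/perception/control decomposition; the module codes, names, and interface identifiers below are invented for illustration and are not the authors' actual coding scheme.

```python
# Sketch of assembling a modular vehicle configuration from functional
# modules, following the chassis/body/perception/control decomposition.
# All codes, names, and interface ids are illustrative assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Module:
    code: str       # serialized module code, e.g. "CH-3S" for a chassis variant
    kind: str       # functional category from the decomposition
    interface: str  # standardized mechanical/electrical interface id

def assemble(modules):
    # A complete vehicle needs one module per functional category.
    kinds = {m.kind for m in modules}
    required = {"chassis", "body", "perception", "control"}
    missing = required - kinds
    if missing:
        raise ValueError(f"incomplete vehicle, missing: {sorted(missing)}")
    return {m.kind: m.code for m in modules}

# Hypothetical ride-sharing configuration on the shared multifunctional chassis
taxi = assemble([
    Module("CH-3S", "chassis", "IF-A"),     # three-section low-floor chassis
    Module("BD-PAX", "body", "IF-A"),       # passenger cabin body
    Module("PE-LC", "perception", "IF-B"),  # LiDAR + camera perception module
    Module("CC-L4", "control", "IF-B"),     # computing/control module
])
assert taxi["body"] == "BD-PAX"
```

Swapping `BD-PAX` for a freight body while keeping the other modules is the kind of reconfiguration the standardized interfaces are meant to enable.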

Fig. 2. The multifunctional vehicle design scheme.

3 Key Technologies

3.1 Multifunctional Modular Chassis
In terms of overall chassis technology, a generalized multifunctional chassis is developed for different top-loading functions, simultaneously meeting the requirements of logistics, passenger transportation, and service. Modularization research and solution design are carried out at four levels: complete vehicle, module group/system, assembly, and component. The chassis is designed as a three-section structure, with the middle section forming a low-floor area and the power battery arranged in the frame, as shown in Fig. 3. The low floor provides single-step boarding height for passenger service and convenient loading and unloading for freight, while saving chassis packaging space. The front and rear sections are structurally symmetrical: the drive, steering, and suspension systems are fully symmetrical system-level modules, and the front and rear diagonal suspensions and subframe assemblies are identical, realizing modular design at all levels of the chassis for maximum maneuverability and convenience.


M. Zhou et al.

Fig. 3. The design scheme of modular intelligent vehicle chassis.

3.2 Multimodal Fusion Perception Module
First, an automatic online calibration method based on linear features is proposed for the spatial synchronization of the monocular camera and the 3D LiDAR [8]. The optimization-based extrinsic calibration automatically extracts linear features suited to the scene and uses line-feature-based correction to eliminate the influence of error terms on accuracy. After extracting and filtering linear features from the LiDAR point clouds and images, the point-cloud linear features are projected into the image coordinate system, and adaptive optimization yields accurate extrinsic parameters. Moreover, for 3D target detection based on multi-sensor early fusion, a plug-and-play RI-Fusion module is proposed to fuse LiDAR and camera data effectively [9]. By concatenating the original range image with the fused features to retain the point-cloud information, the fusion result is projected back into the spatial point cloud to form a feature-enhanced point cloud, which is fed into a LiDAR-based 3D detection model. This fusion method significantly enhances multiple LiDAR-based 3D object detectors and achieves higher detection accuracy for small targets such as pedestrians and cyclists.

3.3 High Precision Localization Module
To achieve high-accuracy, high-reliability, and high-availability positioning and orientation when navigation satellite signals are unobstructed or only briefly lost, an optimal fusion algorithm based on high-accuracy global navigation satellite system (GNSS), multi-sensor, and vehicle dynamics information is proposed. For dynamic scenarios with partial obstruction of GNSS signals, a fusion localization technology based on global information and feature-based Simultaneous Localization and Mapping (SLAM) has been designed.
It uses global information to perform extended-Kalman-filter-based positioning while simultaneously building a feature map, achieving high-precision localization during temporary loss or degradation of GNSS satellite signals and ensuring environmental adaptability and robustness. When the vehicle's GNSS signal fails, the system automatically detects this and switches to particle-filter positioning against the feature map to update the vehicle pose, ensuring continuity of the positioning data.
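The mode-switching logic described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the validity thresholds, the stand-in `ekf_update` and `pf_update` functions, and the map representation are all invented for the sketch.

```python
# Sketch of the GNSS/SLAM localization mode switch described above.
# All interfaces, thresholds, and update rules are illustrative assumptions.

def gnss_valid(fix):
    # Treat a fix as usable when enough satellites are locked and the
    # dilution of precision is small (thresholds are invented here).
    return fix is not None and fix["num_sats"] >= 5 and fix["hdop"] < 2.0

def ekf_update(pose, fix):
    # Stand-in for the extended-Kalman-filter correction: blend the
    # predicted pose toward the GNSS measurement with a fixed gain.
    k = 0.8  # hypothetical Kalman gain
    return tuple(p + k * (m - p) for p, m in zip(pose, fix["xy"]))

def pf_update(pose, feature_map):
    # Stand-in for particle-filter relocalization against the feature map:
    # here we simply snap to the nearest stored map feature.
    return min(feature_map, key=lambda f: sum((a - b) ** 2 for a, b in zip(f, pose)))

def localize(pose, fix, feature_map):
    if gnss_valid(fix):
        pose = ekf_update(pose, fix)
        feature_map.append(pose)       # keep building the map while GNSS is good
    else:
        pose = pf_update(pose, feature_map)  # GNSS lost: switch to map matching
    return pose

feature_map = []
pose = (0.0, 0.0)
pose = localize(pose, {"num_sats": 8, "hdop": 0.9, "xy": (1.0, 1.0)}, feature_map)
pose = localize(pose, None, feature_map)  # signal lost -> particle-filter mode
```

The key design point mirrored here is that the map is built only while GNSS is trusted, so the fallback always matches against globally referenced features.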


3.4 Human-Like Decision Making Module
Complex traffic environments contain numerous intersections, where even human drivers suffer recognition and decision errors due to incomplete information; this poses enormous challenges for the behavioral decision-making of autonomous driving. To address interaction and coordination at complex urban intersections, we comprehensively analyze how scene elements vary and adopt a self-learning approach based on the planning and decision-making knowledge of experienced human drivers. A representation of interactive social norms in path planning is described, and its heterogeneous fusion with vehicle state features serves as the input to reinforcement learning. The framework also designs penalty functions for social-norm violations and combines experience replay and target-network techniques to speed up training convergence. After repeated training, vehicles are guaranteed to no longer choose behaviors that violate social norms, achieving mutual avoidance in complex real-world environments in a socially compliant manner.

3.5 Safety of the Intended Functionality Module
With the increasing complexity of the traffic environment, the perceptual safety problem in the safety of the intended functionality (SOTIF) of autonomous driving mainly stems from unsatisfactory detection accuracy, the lack of model reliability evaluation, and insufficient handling of noisy data. To move beyond traditional manual, result-oriented perception model design, the perception model is explained theoretically by applying joint source-channel coding from information theory.
Based on this theoretical account of feature extraction and fusion in perception models, a multi-sensor deep feature fusion method is proposed that preserves model interpretability while enhancing perception capability in complex scenes [10]. Meanwhile, a reliability evaluation method and a novel credibility index for perception models based on information entropy are introduced. The proposed index, called Average Entropy Variation (AEV), quantitatively evaluates the stability of the model during real-time perceptual interaction and improves the credibility of model evaluation and detection.
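The entropy-based stability idea can be made concrete with a small sketch. The exact AEV definition is not reproduced in this chapter, so the computation below (Shannon entropy of per-frame confidence distributions, averaged frame-to-frame variation) is one plausible reading, not the formula from [10].

```python
# Hypothetical sketch of an entropy-based stability index in the spirit of
# the Average Entropy Variation (AEV) described above; the exact formula
# in [10] may differ.
import math

def shannon_entropy(probs):
    # Shannon entropy of a discrete distribution (e.g. per-frame class confidences).
    return -sum(p * math.log(p) for p in probs if p > 0)

def average_entropy_variation(frames):
    # Mean absolute change of the output entropy between consecutive frames.
    # A stable perception model keeps this value small during real-time operation.
    h = [shannon_entropy(f) for f in frames]
    return sum(abs(a - b) for a, b in zip(h[1:], h)) / (len(h) - 1)

stable   = [[0.9, 0.05, 0.05]] * 4                    # confidences barely move
unstable = [[0.9, 0.05, 0.05], [0.34, 0.33, 0.33],
            [0.8, 0.1, 0.1], [0.4, 0.3, 0.3]]         # confidences oscillate
assert average_entropy_variation(stable) < average_entropy_variation(unstable)
```

A model whose output distribution jitters between confident and near-uniform frames scores a high variation, which is the behavior such an index is meant to flag.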

4 Platform Integration and Test

The developed modular intelligent driving vehicle integrates the modular chassis, environment sensing, and computing control functions, can detect and track various targets in the environment in real time, and reaches SAE Level 4 overall performance. The hardware configuration of the proposed modular vehicle system is as follows: a Delphi ESR millimeter-wave radar, a 32-line LiDAR, and a vision camera are installed on the front and roof of the vehicle; the trunk holds a 12 V battery and a power management module that power the environmental sensing system; two industrial PCs handle data acquisition and information processing; an inertial navigation system collects the vehicle's attitude and position; and a monitor installed at the rear of the vehicle displays the data.


4.1 Key Modules Test
To verify the performance of the system, the key modules were first tested under various operating conditions. The test area consisted of urban road environments, and the vehicle's speed adaptation range was 0–80 km/h.

Perception Module Test. The proposed modular vehicle was driven in a test field under clear weather conditions, and the detection accuracy for each class of object (lane lines, vehicles, non-motorized vehicles, and pedestrians) was recorded separately. The test results are shown in Table 1.

Table 1. The test results of perception module.

Objects                  True    Detected   False   Missing   Accuracy
Lane Line                602     586        56      9         91.68%
Vehicle                  1896    1916       142     27        96.98%
Non-motorized Vehicle    1653    1678       164     46        92.21%
Pedestrian               965     956        98      22        91.72%

Location Module Test. The proposed modular vehicle was tested at random positions in the test field during the driving process. The output positions of the localization system were recorded in real time, and the lateral and longitudinal positioning errors were calculated by comparison with the true positions of the vehicle. The test results are summarized in Table 2. According to these results, the average longitudinal and lateral errors were 34.6 cm and 12.4 cm, respectively.

Table 2. Partial test results of location module.

Real Position                 Output Position               Longitudinal error (cm)   Lateral error (cm)
(116.33058830, 40.00007635)   (116.33055698, 40.00007502)   33.6                      13.9
(116.32763588, 39.99923854)   (116.32763279, 39.99923758)   30.9                      10.3
(116.32647679, 39.99985586)   (116.32647016, 39.99985473)   38.2                      11.9
(116.32665994, 40.00129869)   (116.32665699, 40.00129757)   35.2                      11.5
(116.33068830, 40.00007643)   (116.33069300, 40.00007768)   37.5                      12.5
(116.32336766, 39.98648922)   (116.32337058, 39.98648832)   29.6                      9.6
(116.32223314, 39.99653565)   (116.32223705, 39.99653675)   38.6                      12.1
(116.32445716, 39.98654565)   (116.32446001, 39.98654676)   31.8                      13.3

4.2 Vehicle Test
A test mission was designed for driverless mobility services, simulating a "last-mile" driverless taxi service on the test field. Six pick-up points were arranged in the test area to simulate picking up passengers at designated parking spaces and dropping them off at designated parking spaces. The proposed modular vehicle was tested in three simulations, the results of which are shown in Fig. 4. Over these tests, the vehicle traveled a total of 60 km and completed eight safe travel services, with an average speed in operational mode of 30 km/h. The tests primarily examined environmental perception and decision-making capabilities, including obstacle avoidance, car following, and passing through obstacle courses. Taking the roundabout test as an example (see Fig. 5), the vehicle selected different lanes according to the current state of the environment and achieved path planning that complies with social norms and traffic rules while ensuring safety.

Fig. 4. Testing trajectories of the mobility services

Fig. 5. a) Schematic diagram of the roundabout lanes. b) Testing trajectories in the roundabout.

5 Conclusion

The current round of technological revolution is driving a comprehensive reorganization of the automotive industry, and automotive intelligence and informatization have become, after new energy vehicles, an important development direction and focus of global competition. The future intelligent electric vehicle will be the basic unit for storing and absorbing green energy, the basic unit of intelligent transportation and the smart city, and the node linking the new generation of mobile communication with shared mobility, thus advancing the transportation revolution and reshaping the future travel experience. Focusing on multi-modal travel services for smart cities, this paper designs a modular vehicle with good mobility, light weight, and intelligence. First, the modular intelligent driving vehicle scheme is clarified and the modularization division method based on functional decomposition is studied. Second, several key technology modules, including the modular vehicle chassis, environment sensing, and computing control, are developed. Finally, the performance of the key modules and of the whole vehicle is tested. The test results show that the proposed modular vehicle can realize high-level autonomous driving based on the modular design, providing a new solution to the demand for efficient travel in large cities.

Acknowledgement. This work was supported by the National High Technology Research and Development Program of China under Grant No. 2018YFE0204300, and the National Natural Science Foundation of China under Grant No. 62273198.

References
1. Li, K., Hua, T., Ma, Z., Wu, F.: Smart connected autonomous vehicle: a survey on sensing, fusion, and control. IEEE Trans. Intell. Transp. Syst. 21(1), 16–31 (2020)
2. Khan, M.A., et al.: Level-5 autonomous driving—are we there yet? A review of research literature. ACM Comput. Surv. 55(2), 1–38 (2022)
3. Wu, X., Cao, D.: A survey of intelligent connected vehicles: applications, platforms, technologies, and challenges. IEEE Trans. Intell. Transp. Syst. 21(4), 1471–1496 (2020)
4. Zhang, X., Li, Y., He, F.: User-oriented autonomous vehicle fleet management system for shared mobility services. IEEE Trans. Intell. Transp. Syst. 22(3), 1599–1611 (2021)
5. Global Automotive & Transportation Research Team at Frost & Sullivan: Future Business Models of Autonomous Vehicle Services, 2030. MEDD-18 (2020)
6. Wang, Y., Khor, H.Q., Tan, C., Liu, J.: Investigating the willingness to adopt autonomous vehicles for shared mobility in Singapore. Sustainability 13(4), 2265 (2021)
7. Soylu, T., Lownes, N.E., Hegde, R.: What influences travel behavior and adoption of autonomous vehicle shared mobility services? A qualitative study of US consumer perceptions. Transp. Res. Rec. 2675(3), 20–32 (2021)
8. Zhang, X., Wang, L., et al.: RI-Fusion: 3D object detection using enhanced point features with range-image fusion for autonomous driving. IEEE Trans. Instrum. Meas. 72, 1–12 (2022)
9. Zou, Z., Zhang, X., Liu, H., et al.: A novel multimodal fusion network based on a joint coding model for lane line segmentation. Inf. Fusion 80, 167–178 (2021)
10. Zhang, Z., Zhu, S., Guo, S., Li, J.: Line-based automatic extrinsic calibration of LiDAR and camera. In: IEEE International Conference on Robotics and Automation (2021)

A Quadrupedal Soft Robot Based on Kresling Origami Actuators

Yang Yang1,2(B), Shaoyang Yan1, Mingxuan Dai1, Yuan Xie1, and Jia Liu1,3

1 School of Automation, Nanjing University of Information Science and Technology (NUIST), Nanjing 210044, China
{meyang,liujia}@nuist.edu.cn
2 Jiangsu Province Engineering Research Center of Intelligent Meteorological Exploration Robot, NUIST, Nanjing 210044, China
3 Tianchang Research Institute of NUIST, Tianchang, Anhui, China

Abstract. Kresling origami actuators have proved to be an effective technology and have been applied in many areas. This article presents a quadrupedal soft robot based on Kresling origami actuators. The soft robot can be manufactured entirely by 3D printing; it has a simple structure and is easy to operate. Under vacuum control, the Kresling origami actuators realize a compound motion of twist and contraction, and the twisting direction is consistent with the crease direction preset when the actuators are manufactured. The four actuators can be driven independently, enabling programmable control of the quadrupedal soft robot. The walking speed and the turning angle of the robot within one cycle can be controlled by adjusting the pressure regulating valve. In the experiments, we measured the twisting angle of the soft actuator under different negative pressure levels and built a complete quadrupedal soft robot to demonstrate its three gaits, showing that Kresling origami actuators are a feasible way to drive a quadrupedal soft robot. The average speed of the quadrupedal soft robot walking on flat ground reaches 20.17 mm/s.

Keywords: Quadrupedal Soft Robot · Kresling Origami · Soft Actuator

1 Introduction

Soft robots made of flexible materials can realize continuous and frequent local deformation and adapt well to unknown environments [1, 2]; they are therefore favored by more and more researchers. Soft actuators are the key components that drive soft robots and an essential part of them. Many types of soft actuators have been designed and manufactured for soft robots, including shape memory polymer [3], shape memory alloy [4], electroactive polymer [5], and fluidic actuators [6]. At the same time, the application of various additive manufacturing technologies [7, 8] and flexible materials [9] has also promoted the progress and development of soft robots. Pneumatic soft actuators are widely used because of their safety, light weight, and ability to provide continuous and natural motion [10–13].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 417–426, 2023. https://doi.org/10.1007/978-981-99-6187-0_42

Due to the softness of the


Y. Yang et al.

material and structural compliance, pneumatic soft actuators [14, 15] usually exert small forces but achieve large deformation. However, the low stiffness of their constituent materials also results in poor load-carrying capacity and slow response times. Thanks to its large contraction ratio and unique bistability [16], origami can effectively shorten the response time and increase the payload, and it has therefore received more and more attention in soft robotics. Zhang et al. [17] designed pneumatic, multifunctional artificial muscles created by origami folding of multiple layers of flexible sandpaper around a common monofunctional vacuum-powered cube-shaped (CUBE) artificial muscle; a soft robot assembled from these muscles can flexibly climb different pipes. Park et al. [18] developed a vacuum-powered soft origami arm module using the Tachi-Miura origami pattern; the stiffness of the module can be regulated by friction between the origami and a tendon, and a soft robot built from the module can carry high loads and change its volume 29-fold. Jiao et al. [19] designed a new vacuum-powered Kresling origami actuator with a single seamless chamber and two degrees of freedom; a robot made of multiple such actuators can crawl and rotate. Li et al. [20] designed an untethered biomimetic quadruped soft robot based on double-chamber pre-charged soft actuators with a highly flexible trunk; the robot requires no air compressors, valves, or hoses, realizing a cordless design that improves its compactness.

Fig. 1. The quadrupedal soft robot and the Kresling origami actuators.

This paper presents a quadrupedal soft robot based on Kresling origami actuators (see Fig. 1). All parts of the quadrupedal soft robot are manufactured by 3D printing, giving it a simple structure and light weight. Two Kresling origami actuators with the same crease are fixed on each side of the soft robot, and the crease directions of the actuators on the two sides are reversed. The actuators are vacuum-powered, and each actuator can be controlled individually. Through different control plans, the soft robot can be made to imitate different gaits. At the same time, the unique bistable characteristic of Kresling origami helps improve the response speed of the quadrupedal soft robot. The contributions of this article are as follows:


1. Applying Kresling origami as actuators to a quadrupedal robot and proving the feasibility of this method.
2. Programmable control of the quadrupedal soft robot.
3. A quadrupedal soft robot that is lightweight and simple to manufacture.

The remainder of the article is organized as follows: Sect. 2 describes the design and working principle of the quadrupedal robot. Section 3 describes the manufacture and control of the soft robot. In Sect. 4, the Kresling origami actuators are tested and the gaits of the soft robot are demonstrated to prove that it walks well. Finally, in Sect. 5, we summarize the article and propose future work.

2 Design and Working Principle

2.1 Design
The soft robot is designed as a bionic quadruped and is divided into three main parts: head, body, and legs, all 3D printed from polylactic acid (PLA). The body is a hollow rectangular shell whose top surface can be opened. Each side of the body has two square holes, slightly smaller than the bottom surface of the origami actuator. Four 4 mm circular holes at the rear of the body let the rubber hoses pass into the quadrupedal soft robot. Because the surface of the leg parts is smooth and generates little friction with the ground, we attached a rough material to the parts of the legs in contact with the ground (see Fig. 2(a)) to increase the friction between the legs and the ground.

Fig. 2. Quadrupedal soft robot based on Kresling origami actuators. (a) Side view of the soft robot. (b) Top view of the soft robot. (c) Schematic diagram of the structure of the Kresling origami actuator. (d) Top view of the Kresling origami actuator.


We used four Kresling origami actuators to actuate the motion of the quadruped robot. The actuators are printed on a 3D printer using thermoplastic elastomer (TPE). Each soft actuator consists of four side surfaces, a top surface, and a bottom surface, enclosing a single sealed chamber (see Fig. 2(c)). Creases are preset on the side surfaces to control the twist direction of the actuator. The top and bottom surfaces are rectangles of different sizes, the bottom larger than the top (see Fig. 2(d)). The top and bottom surfaces have the same thickness and are 2.2 mm thicker than the side surfaces, ensuring that they do not collapse inwards when the actuator twists and contracts. A 4 mm round hole is located at the center of the bottom surface. In practical use, we fix the bottom surface so that the compound twist-and-contraction motion is produced by the top surface only.

2.2 Working Principle
The Kresling soft actuator used in this paper is vacuum-powered, and its bottom surface is fixed to the side of the soft robot so that twisting and contraction occur only at the top surface. Figure 3(a)–(b) show the model and actuation principle of the soft actuator with different creases. When not actuated, the actuator is in its initial state. When actuated by vacuum, it realizes a compound movement of twist and contraction, the twist direction being set by the side crease direction, which is determined when the actuator is modeled. When the negative pressure in the sealed chamber returns to atmospheric pressure, the actuator returns to its initial state through the elasticity of its soft material. The twist angle of the Kresling origami actuator depends on the magnitude of the negative pressure in its sealed chamber.

Fig. 3. Working mechanism of Kresling origami actuators. (a) The Kresling origami actuator shows clockwise twisting and contraction. (b) Kresling origami actuator shows counterclockwise twisting and contraction.


The quadrupedal soft robot has two Kresling origami actuators with the same crease on each side, and the actuators on opposite sides have reversed creases. Each of the four actuators is controlled by a solenoid valve, and actuating the four actuators in different schemes makes the soft robot walk, turn left, or turn right. Figure 4(a)–(c) shows the three gaits of the quadrupedal soft robot. We label the combination of actuator and leg member S1–S4. When none of S1–S4 is actuated by vacuum, the robot is in its initial state. When S1–S4 are driven at the same time, S1 and S2 twist counterclockwise along their crease direction, S3 and S4 twist clockwise along theirs, and together they provide a forward force that makes the robot walk (see Fig. 4(a)). When S1 and S2 are actuated while S3 and S4 remain in their initial state, S1 and S2 twist counterclockwise along their crease direction, providing a clockwise rotational force that turns the robot to the right by a certain angle (see Fig. 4(b)); controlled over several cycles, the robot can turn right through 360°. Similarly, the robot turns left when S3 and S4 are actuated while S1 and S2 are not (see Fig. 4(c)). The twist angle of the actuators, and hence the distance walked and the turning angle in a single cycle, can be controlled with the negative pressure regulating valve.
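The three actuation schemes above amount to a simple truth table over the four valves; the following sketch encodes them. The dictionary layout and function names are illustrative, not the authors' control firmware.

```python
# Sketch of the three gait schemes described above as a valve truth table.
# S1/S2 sit on one side of the robot, S3/S4 on the other; True means the
# corresponding solenoid valve applies vacuum. Names are illustrative.

GAITS = {
    "walk":       {"S1": True,  "S2": True,  "S3": True,  "S4": True},
    "turn_right": {"S1": True,  "S2": True,  "S3": False, "S4": False},
    "turn_left":  {"S1": False, "S2": False, "S3": True,  "S4": True},
    "rest":       {"S1": False, "S2": False, "S3": False, "S4": False},
}

def valve_commands(gait):
    # Return the list of actuators to energize for the selected gait.
    scheme = GAITS[gait]
    return [name for name, on in sorted(scheme.items()) if on]

assert valve_commands("turn_right") == ["S1", "S2"]
assert valve_commands("walk") == ["S1", "S2", "S3", "S4"]
```

Because each valve is independent, any other subset of S1–S4 could in principle be energized, but only the three schemes above correspond to the gaits demonstrated in the paper.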

Fig. 4. Three gaits of the soft robot. (a) Walking. (b) Turning right. (c) Turning left.

3 Fabrication and Control

3.1 Fabrication
The Kresling origami actuators were printed in TPE on a 3D printer (KP3S, KINGROON, China), and the head, body, and legs of the soft robot were printed in PLA. Each soft actuator is printed as a single part with no internal support, so to guarantee the print quality of the top layer it is necessary to reduce the printing speed and provide sufficient cooling during printing. A strong, thin wall promotes the torsion and contraction of the soft actuator. Considering that the nozzle diameter of the printer is 0.4 mm, we set the wall thickness to 1.2 mm. At the same time, we set the nozzle extrusion volume to 115% and the filling density to 100% to enhance the airtightness of the soft actuator (see Table 1).

Table 1. Partial setting parameters of the printer.

Setting                   PLA    TPE
Layer height (mm)         0.2    0.1
Wall thickness (mm)       1.2    1.2
Filling density (%)       25     100
Printing speed (mm/s)     50     30
Nozzle temperature (°C)   210    220
Extrusion volume (%)      100    130
Fan cooling               ON     ON
3.2 Control
The control system of the quadrupedal soft robot is shown in Fig. 5. It includes a vacuum pump, an Arduino UNO, a 4-way relay module, and four normally closed 2-position 3-way solenoid valves; these electronic components can be purchased from Taobao stores. The vacuum pump provides the negative pressure that actuates the soft actuators to twist and supplies suction throughout the robot's motion. The Arduino UNO switches the relays on and off. The solenoid valves are powered by an external 12 V supply. The inlet hole of each solenoid valve is connected to the vacuum pump through a rubber hose, and the outlet hole is connected to a soft actuator through another rubber hose. The regulator adjusts the magnitude of the negative pressure over a range of −93 kPa to 0 kPa.
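One actuation cycle of this control chain can be sketched as follows. The real controller is an Arduino UNO running C firmware; this Python version is purely illustrative, and the pin mapping is a hypothetical assumption (the 0.7 s / 0.8 s timings come from the experiments in Sect. 4.2).

```python
# Sketch of one actuation cycle as the relay firmware might schedule it
# (rewritten in Python for illustration; the pin map is an assumption).
import time

RELAY_PINS = {"S1": 2, "S2": 3, "S3": 4, "S4": 5}  # hypothetical pin assignment

def set_relay(pin, on, log):
    # Stand-in for digitalWrite(pin, on) on the Arduino.
    log.append((pin, on))

def run_cycle(active, t_actuate=0.7, t_recover=0.8, log=None, sleep=time.sleep):
    # Energize the valves of the selected actuators, hold vacuum for
    # t_actuate, then vent and let the origami spring back for t_recover.
    log = [] if log is None else log
    for name in active:
        set_relay(RELAY_PINS[name], True, log)
    sleep(t_actuate)
    for name in active:
        set_relay(RELAY_PINS[name], False, log)
    sleep(t_recover)
    return log

# Turn-right cycle (S1, S2 only); a no-op sleep keeps the example instant.
log = run_cycle(["S1", "S2"], sleep=lambda _: None)
assert log == [(2, True), (3, True), (2, False), (3, False)]
```

Stretching or shrinking `t_actuate` and `t_recover` is what changes the cycle length, and with it the robot's walking and steering speed.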

4 Experiments and Results

In this section, we test the Kresling origami actuator, obtain the nonlinear relationship between angle change and negative pressure, and build a quadrupedal soft robot to demonstrate the three gaits, proving that the soft robot can walk well.

4.1 Experiments with the Kresling Origami Actuator
We fixed the bottom of the Kresling origami actuator on the workbench and adjusted the negative pressure through the pressure regulating valve to twist the soft actuator through different angles (see Fig. 6(a)). Figure 6(b) shows the nonlinear relationship between the angle change of the actuator and the negative pressure: at a negative pressure of 93 kPa, the angle change is largest, reaching 72°. Figure 6(c) shows the finite element analysis of Kresling origami actuators with different creases twisting and contracting under vacuum actuation.


Fig. 5. Control system

Fig. 6. Related experiments for the Kresling origami actuator. (a) Working demonstration of the origami actuator. (b) The relationship between the angle change of the soft actuator and the negative pressure. (c) Finite element analysis of the Kresling origami actuator; upper part: twisted and contracted counterclockwise; lower part: twisted and contracted clockwise.

4.2 Experiments with the Quadrupedal Soft Robot
We count the time it takes the soft actuator to go from the initial state to the actuated state and back to the initial state as one cycle. By regulating the interval between the opening and closing of the relays, the length of a single cycle can be controlled, and in this way the walking and steering speed of the quadrupedal soft robot can be controlled. We set the cycle time to 1.5 s, of which 0.7 s is actuation and 0.8 s is recovery. Figure 7(a)–(c) shows the three gaits of the quadrupedal soft robot in an indoor environment. It can be seen from


Fig. 7(d) that when the quadrupedal soft robot walks in the indoor environment, the overall displacement–time relationship is linear. With a negative pressure of 93 kPa and a motion period of 1.5 s, the average walking speed of the soft robot is 20.17 mm/s. Under the same parameter settings, the robot's steering angle–time relationship is shown in Fig. 7(e); the steering angle–time trend is consistent when turning left and right. To further verify the feasibility of the quadrupedal soft robot, we placed it on grass for a walking test (see Fig. 8(a)). The overall displacement–time relationship is again linear (see Fig. 8(b)), but due to the softness and complexity of the grass, the average walking speed on grass is only 4.2 mm/s.
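Reading an average speed off a linear displacement–time plot like Fig. 7(d) amounts to fitting a slope; a quick sketch follows. The sample points below are hypothetical stand-ins (roughly 30 mm per 1.5 s cycle, consistent with the reported 20.17 mm/s), not measured data from the paper.

```python
# Least-squares slope as an estimate of average walking speed, the way a
# linear displacement-time curve is summarized. Sample data is hypothetical.

def fit_speed(t, x):
    # Slope of x(t) by ordinary least squares: cov(t, x) / var(t).
    n = len(t)
    mt, mx = sum(t) / n, sum(x) / n
    return sum((a - mt) * (b - mx) for a, b in zip(t, x)) / sum((a - mt) ** 2 for a in t)

t = [0.0, 1.5, 3.0, 4.5, 6.0]   # s, one robot cycle per sample (assumed)
x = [0, 30, 61, 90, 121]        # mm, roughly linear displacement (assumed)
speed = fit_speed(t, x)         # mm/s
assert 19 < speed < 21
```

A least-squares slope is preferable to dividing final displacement by total time when the samples are noisy, since every point contributes to the estimate.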

Fig. 7. Indoor walking demonstration of the quadrupedal soft robot. (a) Walking. (b) Turning left. (c) Turning right. (d) The relationship between displacement and time of the quadrupedal soft robot when walking. (e) The relationship between steering angle and time for the quadrupedal soft robot when turning.


Fig. 8. Outdoor working demonstration of the quadrupedal soft robot. (a) Walking. (b) The relationship between displacement and time of the soft quadrupedal robot when walking.

5 Conclusion and Future Work

In this paper, a pneumatic, programmable quadrupedal soft robot with walking and steering capabilities is developed by applying Kresling origami actuators. The robot walks or turns through the twisting of the soft actuators, which turn the leg structures. A complete quadrupedal soft robot was built and placed in indoor and outdoor environments to demonstrate its gaits, proving its good walking ability. Nevertheless, some problems remain in the proposed robot: (1) the limited twisting angle of a single soft actuator limits the motion speed of the robot; (2) when the robot walks, it deviates by a certain angle, so its structure needs to be optimized to improve stability. Our subsequent research will propose solutions to these problems and further improve the quadrupedal robot based on Kresling origami actuators.

Acknowledgement. This research was supported in part by the National Natural Science Foundation of China (Grant No. 52005269), the Research Project of the State Key Laboratory of Mechanical System and Vibration (MSV202319), and the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grant SJCX23_0389.

References

1. Rus, D., Tolley, M.T.: Design, fabrication and control of soft robots. Nature 521(7553), 467–475 (2015)
2. Zhang, Y.F., et al.: Fast-response, stiffness-tunable soft actuator by hybrid multimaterial 3D printing. Adv. Funct. Mater. 29(15), 1806698 (2019)
3. Sachyani Keneth, E., et al.: Pre-programmed tri-layer electro-thermal actuators composed of shape memory polymer and carbon nanotubes. Soft Robot. 7(2), 123–129 (2020)
4. Xiang, C., Guo, J., Chen, Y., Hao, L., Davis, S.: Development of a SMA-fishing-line McKibben bending actuator. IEEE Access 6, 27183–27189 (2018)
5. Song, Z., et al.: Origami lithium-ion batteries. Nat. Commun. 5, 3140 (2014)
6. Huang, J., et al.: Modular origami soft robot with the perception of interaction force and body configuration. Adv. Intell. Syst. 4(9), 2200081 (2022)

426

Y. Yang et al.

7. Fang, H., Zhang, Y., Wang, K.W.: Origami-based earthworm-like locomotion robots. Bioinspir. Biomim. 12(6), 065003 (2017)
8. Liu, T., Wang, Y., Lee, K.: Three-dimensional printable origami twisted tower: design, fabrication, and robot embodiment. IEEE Rob. Autom. Lett. 3(1), 116–123 (2018)
9. Son, H., Park, Y., Na, Y., Yoon, C.: 4D multiscale origami soft robots: a review. Polymers (Basel) 14(19), 4235 (2022)
10. Liu, S., et al.: A six degrees-of-freedom soft robotic joint with tilt-arranged origami actuator. J. Mech. Robot. 14(6), 060912 (2022)
11. Jin, T., et al.: Modular soft robot with origami skin for versatile applications. Soft Robot. 10, 785–796 (2023)
12. Melancon, D., Forte, A.E., Kamp, L.M., Gorissen, B., Bertoldi, K.: Inflatable origami: multimodal deformation via multistability. Adv. Funct. Mater. 32(35), 2201891 (2022)
13. Cianchetti, M., et al.: Soft robotics technologies to address shortcomings in today's minimally invasive surgery: the STIFF-FLOP approach. Soft Robot. 1(2), 122–131 (2014)
14. Rogatinsky, J., et al.: A collapsible soft actuator facilitates performance in constrained environments. Adv. Intell. Syst. 4(10), 2270051 (2022)
15. Jiao, Z., et al.: Lightweight dual-mode soft actuator fabricated from bellows and foam material. Actuators 11(9), 245 (2022)
16. Jin, T.: Origami-inspired soft actuators for stimulus perception and crawling robot applications. IEEE Trans. Robot. 38(2), 748–764 (2022)
17. Lin, Y., et al.: Controllable stiffness origami "skeletons" for lightweight and multifunctional artificial muscles. Adv. Funct. Mater. 30(31), 2000349 (2020)
18. Park, M., et al.: Deployable soft origami modular robotic arm with variable stiffness using facet buckling. IEEE Robot. Autom. Lett. 8(2), 864–871 (2023)
19. Jiao, Z., Ji, C., Zou, J., Yang, H., Pan, M.: Vacuum-powered soft pneumatic twisting actuators to empower new capabilities for soft robots. Adv. Mater. Technol. 4(1), 1800429 (2019)
20. Li, Y., Chen, Y., Ren, T., Li, Y., Choi, S.H.: Precharged pneumatic soft actuators and their applications to untethered soft robots. Soft Robot. 5(5), 567–575 (2018)

Design of Attitude Controller for Ducted Fan UAV Based on Improved Direct Adaptive Control Method

Hongyu Zhang1,2, Xiaodong Liu1, and Yong Xu1,2(B)

1 School of Aerospace Engineering, Beijing Institute of Technology, Beijing, China
2 Chongqing Innovation Center, Beijing Institute of Technology, Chongqing, China
[email protected]

Abstract. The ducted fan UAV system is a typical multi-channel coupled nonlinear system. To control it reliably, the structure and mathematical model of the UAV must first be clarified. After the system model is established, it is linearized and converted into a state-space expression before the controller can be designed. For the controller design, we first select the Linear Quadratic Regulator (LQR) method as the basic controller. Simulation analysis shows that its dynamic performance is excellent, but it cannot effectively reject external disturbances. Therefore, we improve the direct adaptive controller and combine it with a Kalman filter. The simulation results show that the resulting controller achieves disturbance rejection and improves the robustness of the system.

Keywords: Ducted Fan UAV · LQR Control · Direct Adaptive Control · Kalman Filter

1 Introduction

The ducted fan Unmanned Aerial Vehicle (UAV) is a new type of UAV that uses a ducted fan as the main body and as the main flight power device. The ducted fan UAV combines advantages of helicopters and fixed-wing aircraft, such as efficient rotor propulsion, flexible mobility, low noise and good concealment [1], so it has good practicability and broad application prospects. Under the booming development of today's UAV market, whether in the military or the civilian field, this type of UAV has a large market space.

The ducted fan UAV has three motion states: vertical take-off, fixed-height hover and tilted forward flight. Generally speaking, the hovering attitude is the basis of the other attitudes, and control of the hovering attitude is the key to the flight control of the entire ducted fan UAV. Since the flight missions of ducted fan UAVs are mostly performed in the hover state, and excellent attitude control can make the ducted fan UAV maintain a constant flight

c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 427–436, 2023. https://doi.org/10.1007/978-981-99-6187-0_43

428

H. Zhang et al.

attitude according to the control command, it is of great significance to study the hovering attitude control system of the ducted fan UAV.

In recent years, scholars in China and abroad have done a great deal of research on control methods for ducted fan UAVs. In [2,3], the traditional PID controller is introduced into the control system. The advantage of the classic PID controller is its simple structure and small amount of computation, but its disadvantage is poor robustness. To address the robustness and uncertainty problems of the UAV system, the $H_2/H_\infty$ robust control algorithm is used in [4,5]. In addition, adaptive controllers are used in [6–8] to handle internal and external disturbances. In the presence of model uncertainty, the output of the adaptive controller can still follow the output of the reference model well, suppressing the adverse effects of parameter changes.

Inspired by the above motivations, we design a hovering attitude controller combining a direct adaptive controller and a Kalman filter, which can be used to eliminate input disturbances, measurement errors, and uncertainties in the UAV parameters. The reference model adopts state-feedback decoupling with independent single-channel PID control loops, while the controlled object first uses LQR for basic control and then adds a Kalman filter. The output of the Kalman filter is compared with the output of the reference model to realize direct adaptive control, and the state of the reference model is also used as feedback to further adjust the performance of the system.

Section 2 of this paper introduces the structure and control model of the ducted fan UAV. Section 3 presents the basic LQR controller design and the improved direct adaptive controller design. Finally, in Sections 4 and 5 we verify the performance of the controller through simulation and draw conclusions.

2 Preliminary

2.1 Coordinate Frames

This article models the UAV at its hovering point, so two coordinate systems are defined: the ground coordinate system ($OX_eY_eZ_e$) and the body coordinate system ($OX_bY_bZ_b$). The schematic diagram is shown in Fig. 1. The ground coordinate system can be regarded as an inertial coordinate system fixed to the earth's surface. Its origin is located at the center of mass of the UAV at the moment of take-off; the $X_e$ axis points along the flight direction of the UAV in the horizontal plane, the $Z_e$ axis points vertically upward from the ground, and the $Y_e$ axis is perpendicular to the other two axes so as to form a right-handed coordinate system. The body coordinate system is fixed to the body and moves with the UAV. Its origin is the center of mass of the UAV; the $X_b$ axis (roll axis) points forward along the flight direction, the $Z_b$ axis (yaw axis) points upward along the engine rotation axis, and the $Y_b$ axis (pitch axis) is perpendicular to the other two axes so as to form a right-handed coordinate system.

IDAC of Ducted Fan UAV

429

Fig. 1. Coordinate Frames

2.2 Attitude Dynamics

The basic structural layout of the ducted fan UAV studied in this paper is shown in Fig. 2.

Fig. 2. Sectional View of Ducted UAV

Before establishing the dynamic equations, the following reasonable assumptions are made for convenience of analysis: (1) the UAV is a rigid body with a symmetric structure, without deformation mechanisms or self-vibration; (2) when hovering, the thrust of the UAV is vertical, and its magnitude equals the UAV's own gravity; (3) only the blades and propellers are aerodynamically driven in the UAV structure, and the aerodynamic torque generated by the propeller and the counter-torque generated by the fixed fins cancel each other out. Under these assumptions, the external forces on the hovering UAV are its own gravity and the thrust, and the external torques are the gyroscopic torque generated by the rotation of the propeller and engine rotor and the aerodynamic moment generated by the deflection of the control vanes. The air resistance of the body is not considered.

430

H. Zhang et al.

According to the force analysis of the UAV, combined with the Coriolis formula and the theorem of moment of momentum, the attitude dynamics equations are obtained as follows:

$$\begin{cases} \dot{\omega}_x = \left[M_x + (I_y - I_z)\omega_y\omega_z - I_t\omega_t\omega_y\right]/I_x \\ \dot{\omega}_y = \left[M_y + (I_z - I_x)\omega_x\omega_z + I_t\omega_t\omega_x\right]/I_y \\ \dot{\omega}_z = \left[M_z + (I_x - I_y)\omega_x\omega_y\right]/I_z \end{cases} \tag{1}$$

In (1), $M = \frac{1}{2}\rho\nu_e^2 cSd\delta$ is the control torque about each axis [4], where $\nu_e = \sqrt{2T/(\rho\pi l^2)}$ is the induced velocity caused by propeller rotation, $\rho$ is the air density, $c$ is the lift coefficient of the control vane, $S$ is the area of the control vane, $d$ is the moment arm of the control vane, and $\delta$ is the control deflection angle of the three axes.

After the equations are established, they must be linearized to enable the subsequent controller design. When the UAV is hovering, the three attitude angles $\psi$, $\theta$, $\phi$ are all small, so it can be approximated that $\sin\psi \approx \sin\theta \approx \sin\phi \approx 0$ and $\cos\psi \approx \cos\theta \approx \cos\phi \approx 1$. Neglecting second-order small quantities, the linearized attitude dynamics equations are:

$$\begin{cases} \dot{\omega}_x = \left(\frac{mgcSd_v}{\pi l^2}\delta_x - I_t\omega_t\omega_y\right)/I_x \\ \dot{\omega}_y = \left(\frac{mgcSd_v}{\pi l^2}\delta_y + I_t\omega_t\omega_x\right)/I_y \\ \dot{\omega}_z = \frac{mgcSd_h}{\pi l^2}\delta_z/I_z \\ \dot{\phi} = \omega_x \\ \dot{\theta} = \omega_y \\ \dot{\psi} = \omega_z \end{cases} \tag{2}$$
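As a quick numerical check, the right-hand side of Eq. (1) can be coded directly; the inertia and torque values below are illustrative placeholders, not the paper's UAV parameters:

```python
import numpy as np

def attitude_rates(omega, M, I, It, wt):
    """Body-rate derivatives from Eq. (1).

    omega: body rates (wx, wy, wz); M: control torques (Mx, My, Mz);
    I: principal inertias (Ix, Iy, Iz); It*wt: rotor angular momentum
    producing the gyroscopic coupling terms.
    """
    wx, wy, wz = omega
    Mx, My, Mz = M
    Ix, Iy, Iz = I
    dwx = (Mx + (Iy - Iz) * wy * wz - It * wt * wy) / Ix
    dwy = (My + (Iz - Ix) * wx * wz + It * wt * wx) / Iy
    dwz = (Mz + (Ix - Iy) * wx * wy) / Iz
    return np.array([dwx, dwy, dwz])

# With equal inertias and a non-spinning rotor, all coupling terms vanish
# and each axis reduces to dw = M / I.
rates = attitude_rates((0.1, 0.2, 0.3), (1.0, 2.0, 3.0), (0.5, 0.5, 0.5), 0.0, 0.0)
```

The decoupled special case in the last line is a convenient sanity test before linearization.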

2.3 State Space Expression

The relevant parameters of the ducted fan UAV are obtained by data review and simplified calculation. Substituting the parameters into (2) and converting it into a state-space expression, with state variables $x = [\phi \ \theta \ \psi \ \dot{\phi} \ \dot{\theta} \ \dot{\psi}]^T$, output variables $y = [\phi \ \theta \ \psi]^T$, and input variables $u = [\delta_x \ \delta_y \ \delta_z]^T$, the state-space expression is obtained as:

$$\dot{x} = Ax + Bu, \quad y = Cx \tag{3}$$

In (3), the values of each matrix are:

$$A = \begin{bmatrix} 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -6.28 & 0 \\ 0 & 0 & 0 & 6.28 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1.3623 & 0 & 0 \\ 0 & 1.3623 & 0 \\ 0 & 0 & 2.5542 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \end{bmatrix}$$


3 Controller Design

3.1 LQR Controller

Optimal control is one of the core topics of modern control theory. Its central problem is to choose a suitable control law for a given control system so that the system is optimal in a certain sense. The linear quadratic optimal control used in this paper is one of the most basic methods of optimal control [9]. Its essence is to compute a feedback matrix from the relevant theory and then use state feedback to achieve optimal control. When designing a linear quadratic optimal control law for a linear time-invariant system, appropriate weight matrices $Q$ and $R$ must first be selected, where $P$ satisfies the algebraic Riccati equation

$$PA + A^T P - PBR^{-1}B^T P + Q = 0.$$

The state-feedback matrix is then $K = R^{-1}B^T P$. The solution of the linear quadratic optimal control problem thus reduces to the solution of the algebraic Riccati equation: obtaining $P$ yields $K$, and the optimal control input is $u^*(t) = -Kx(t)$.

Selecting an appropriate weight matrix, the required state-feedback matrix is obtained by calculation as:

$$K = \begin{bmatrix} 81.33 & 9.31 & 0 & 9.78 & -2.11 & 0 \\ 37.22 & 20.33 & 0 & -2.11 & 6.7 & 0 \\ 0 & 0 & 22.36 & 0 & 0 & 4.18 \end{bmatrix}$$

Finally, the stability of the selected linear quadratic optimal control method must be considered. For systems described by state-space expressions, Lyapunov's method can be used to judge stability.

Theorem 1 (Lyapunov's second method for stability). Consider the system $\dot{x} = Ax$, where $x$ is an $n$-dimensional state vector and $A$ is an $n \times n$ constant nonsingular matrix. The equilibrium state $x = 0$ is asymptotically stable in the large if and only if, for a given positive definite real symmetric matrix $Q$, there exists a positive definite real symmetric matrix $P$ satisfying $A^T P + PA = -Q$.

For the system with the linear quadratic optimal controller, the state equation is $\dot{x} = (A - BK)x$. Computing the matrix $P$ by Theorem 1 gives:

$$P = \begin{bmatrix} 0.1403 & 0 & 0 & -0.5 & 0.0447 & 0 \\ 0 & 0.2618 & 0 & -0.0447 & -0.5 & 0 \\ 0 & 0 & 0.115 & 0 & 0 & -0.5 \\ -0.5 & -0.0447 & 0 & 2.7938 & 0.0913 & 0 \\ 0.0447 & -0.5 & 0 & 0.0913 & 1.6348 & 0 \\ 0 & 0 & -0.5 & 0 & 0 & 3.3107 \end{bmatrix}$$

The leading principal minors of the matrix $P$ are all greater than zero, i.e., $P$ is a positive definite real symmetric matrix, which meets the stability requirement of Lyapunov's second method. The conclusion follows: the ducted fan UAV hovering control system with the linear quadratic optimal controller is asymptotically stable in the large at the equilibrium state.
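The design procedure above can be sketched numerically: build A and B from Eq. (3), solve the algebraic Riccati equation, and recover K = R⁻¹BᵀP. Since the paper does not state its weight matrices, Q = I and R = I below are our own illustrative choice, so the resulting gains differ from the K reported above; the closed-loop stability check at the end holds regardless:

```python
import numpy as np

# System matrices from Eq. (3)
A = np.zeros((6, 6))
A[0, 3] = A[1, 4] = A[2, 5] = 1.0
A[3, 4], A[4, 3] = -6.28, 6.28
B = np.zeros((6, 3))
B[3, 0] = B[4, 1] = 1.3623
B[5, 2] = 2.5542

Q = np.eye(6)  # illustrative weights, not the paper's choice
R = np.eye(3)
Rinv = np.linalg.inv(R)

# Solve the CARE  A'P + PA - P B R^-1 B' P + Q = 0  via the Hamiltonian method
H = np.block([[A, -B @ Rinv @ B.T], [-Q, -A.T]])
w, V = np.linalg.eig(H)
stable = V[:, w.real < 0]                      # eigenvectors of the 6 stable modes
P = np.real(stable[6:] @ np.linalg.inv(stable[:6]))
K = Rinv @ B.T @ P                             # state-feedback gain, u = -K x

# All closed-loop poles must lie in the open left half-plane
poles = np.linalg.eigvals(A - B @ K)
```

In practice one would use a library CARE solver; the Hamiltonian construction is shown here only to make the Riccati step explicit.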

3.2 Improved DAC Controller

Adaptive control tolerates changes in the external environment, variations in system parameters, and external disturbances, while ensuring that the entire control system still operates at its best according to a given performance index under these conditions. The direct adaptive control (DAC) [10] algorithm adopted in this paper does not need the state of the controlled object; it only needs the state of the reference model and the error between the system output and the reference model output. The method is similar in principle to model reference adaptive control (MRAC) but simpler: it is easier to design in both structure and algorithm and has fewer error sources, which is beneficial for engineering implementation. At the same time, considering the control interference and sensor errors that occur in practical applications, this paper improves the direct adaptive control by adding a Kalman filter in the control loop, which can effectively deal with external interference and achieve better control results. The overall control loop design is as follows:

Fig. 3. Improved Direct Adaptive Controller

In Fig. 3, it is assumed that the controlled object of the system can be described by the following linear state-space equations:

$$\dot{x}_p(t) = A_p x_p(t) + B_p u_p(t) + w_t, \quad y_p(t) = C_p x_p(t) + v_t \tag{4}$$

where $w_t$ is the process noise and $v_t$ is the observation noise. The state-space expression of the reference model is:

$$\dot{x}_m(t) = A_m x_m(t) + B_m u_m(t), \quad y_m(t) = C_m x_m(t) \tag{5}$$


Then, we consider the design of the Kalman filter [11]. In practical applications, a discretized Kalman filter is needed. The corresponding discretized state-transition model is:

$$x_{k+1} = A_k x_k + B_k u_k, \quad y_k = C_k x_k \tag{6}$$

where $x = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$, $A_k = \begin{bmatrix} 1 & -T \\ 0 & 1 \end{bmatrix}$, $B_k = \begin{bmatrix} T \\ 0 \end{bmatrix}$, $C_k = \begin{bmatrix} 1 & 0 \end{bmatrix}$, $T$ is the sample time, $\alpha$ is the attitude angle, and $\beta$ is the drift (bias) of the angular-rate measurement.

At the beginning of each cycle of the Kalman filter algorithm, we calculate the prior estimate $\hat{x}_k^- = A_k \hat{x}_{k-1} + B_k u_{k-1}$, and then the Kalman gain

$$K_k = \frac{P_k^- C_k^T}{C_k P_k^- C_k^T + R},$$

where $P_k^- = A_k P_{k-1} A_k^T + Q$ is the prior error covariance matrix. Next, we correct the posterior estimate $\hat{x}_k = \hat{x}_k^- + K_k (y_k - C_k \hat{x}_k^-)$ and update the error covariance matrix $P_k = (I - K_k C_k) P_k^-$. Finally, we wait for the sampling interval $T$ and return to the beginning of the algorithm for the next recursion.

In the recursion above, an appropriate process-noise covariance matrix $Q$ and observation-noise covariance $R$ can be selected according to the characteristics of the process white noise $w_t$ and observation white noise $v_t$. After Kalman filtering, we obtain the filtered output value $y_k(t)$. The goal of direct adaptive control is to make the generalized error signal tend to zero:

$$\lim_{t \to \infty} e(t) = \lim_{t \to \infty} \left[y_m(t) - y_k(t)\right] = 0 \tag{7}$$
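A minimal simulation of this recursion, under our reading that β acts as a slowly varying drift on the rate input (the paper does not spell this out); the sample time, noise levels, and covariances below are illustrative, not the paper's values:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 0.01, 2000
Ak = np.array([[1.0, -T], [0.0, 1.0]])
Bk = np.array([[T], [0.0]])
Ck = np.array([[1.0, 0.0]])
Q = np.diag([1e-5, 1e-7])   # process noise covariance (illustrative)
R = np.array([[0.04]])      # measurement noise covariance (std 0.2)

# Ground truth: angle driven by a sinusoidal rate, constant rate-sensor drift
t = np.arange(N) * T
true_alpha = np.sin(t)
bias = 0.5
u = np.cos(t) + bias + 0.05 * rng.standard_normal(N)  # drifting, noisy rate input
y = true_alpha + 0.2 * rng.standard_normal(N)         # noisy angle measurement

x = np.zeros((2, 1))       # estimate of [alpha, beta]
P = np.eye(2)
est = np.empty(N)
for k in range(N):
    # Predict: x_k^- = A x_{k-1} + B u_{k-1}, P_k^- = A P A' + Q
    x = Ak @ x + Bk * u[k]
    P = Ak @ P @ Ak.T + Q
    # Update: gain, posterior estimate, covariance
    K = P @ Ck.T @ np.linalg.inv(Ck @ P @ Ck.T + R)
    x = x + K * (y[k] - (Ck @ x)[0, 0])
    P = (np.eye(2) - K @ Ck) @ P
    est[k] = x[0, 0]

rmse_filt = np.sqrt(np.mean((est - true_alpha) ** 2))
rmse_meas = np.sqrt(np.mean((y - true_alpha) ** 2))
```

The filtered angle estimate should track the true angle noticeably better than the raw measurement, mirroring the effect reported in Fig. 5.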

Finally, the input of the controlled object of the improved direct adaptive controller is:

$$u_p(t) = K_x x_m(t) + K_u u_m(t) + K_e e(t) \tag{8}$$

The goal of direct adaptive control is to design the three control matrices $K_u$, $K_x$ and $K_e$ such that every signal in the overall system structure is bounded and the final output meets the dynamic performance requirements. In the controller structure designed in this paper, the reference model is the system based on state-feedback decoupling and PID control, and the controlled object is the system after adding the linear quadratic optimal controller. It has been proved in reference [10] that, for the direct adaptive control method used in this paper, stability under all possible values of the parameter matrices holds only for Almost Strictly Positive Real (ASPR) systems.

Definition 1. Let the transfer function of the controlled object be $G_P(s)$. If: (1) $G_P(s)$ is minimum phase; (2) the relative degree of $G_P(s)$ (the absolute value of the difference between the orders of the numerator and denominator) is 0 or 1; (3) the leading coefficient of the numerator of $G_P(s)$ is a positive real number; then $G_P(s)$ is called an Almost Strictly Positive Real system.


According to the ASPR conditions given above, it can be verified that the system under linear quadratic optimal control is almost strictly positive real. Therefore, after adding the adaptive control law to the outer loop of the basic controller, the overall IDAC system still maintains stability.
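Control law (8) can be illustrated on a toy first-order plant (our own example, not the paper's UAV model), with the gain matrices reduced to scalar constants as in the simulations of Section 4; choosing Ku + Kx = 1 makes the steady-state generalized error vanish for this plant:

```python
import numpy as np

dt, N = 0.001, 5000
Ku, Kx, Ke = 1.0, 0.0, 5.0     # scalar stand-ins for the gain matrices in Eq. (8)

xm = 0.0   # reference model state, dxm/dt = -2*xm + 2*um (unit DC gain)
xp = 0.0   # plant state,           dxp/dt = -xp + up
um = 1.0   # step command
e_hist = np.empty(N)

for k in range(N):
    e = xm - xp                         # generalized error (ym = xm, yp = xp here)
    up = Kx * xm + Ku * um + Ke * e     # control law (8)
    xm += dt * (-2.0 * xm + 2.0 * um)   # forward-Euler integration
    xp += dt * (-xp + up)
    e_hist[k] = e

final_error = abs(e_hist[-1])
```

A larger Ke pulls the plant output toward the reference model faster, which is the qualitative behavior exploited by the improved controller.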

4 Numerical Simulation

In the numerical simulation, we adjusted the controller parameters and applied a step signal to all three axes to verify the dynamic performance. The control matrices used in this paper are: $K_u = I$, $K_x = \mathrm{diag}(0.05, 0.01, 0.02)$, $K_e = \mathrm{diag}(2, 2, 2)$. The power of the white noise is $10^{-5}$ and the overall sample time is 0.001 s. The simulation results and analyses are presented below.

Fig. 4. DAC Controller with Disturbance

Fig. 5. Effect of Kalman Filter

In Fig. 4, we verify the performance of the DAC controller without the Kalman filter. The two sets of data are the attitude angle step responses of the controlled object: Fig. 4(a) has no external interference, while Fig. 4(b) adds control interference and sensor error, with the interference signal selected as white noise. It can be seen that after external interference is added, the performance of the DAC controller declines and the attitude angle shows obvious oscillations, so a filter is required to handle the interference.


Fig. 6. Effect of Improved Direct Adaptive Controller

In Fig. 5, we verify the effect of the Kalman filter. Fig. 5(a) shows the readout of the attitude angle sensor after adding white-noise interference, and Fig. 5(b) shows the data after the Kalman filter algorithm. It can be clearly seen that the Kalman filter effectively eliminates the noise, so it can be used in the improved DAC controller design.

In Fig. 6, we show the overall control results. Fig. 6(a) is the three-axis attitude angle step response of the improved direct adaptive control: the overshoot is less than 20%, the peak time is about 0.5 s, and the settling time is about 1.2 s. Fig. 6(b) is the generalized error $e$ to be controlled; at 2 s, the generalized error tends to 0. Even with external interference, the output of the system remains relatively smooth. These data prove that the controller designed in this paper achieves rapid control of the attitude angle and effectively reduces the influence of external disturbances.

5 Conclusions

In this paper, we focus on the attitude control problem of the ducted fan UAV, design the reference model and the basic controller of the controlled object, and finally improve the direct adaptive controller with the Kalman filter algorithm. Through numerical simulation, we verify that this improved direct adaptive algorithm handles the influence of control disturbances and sensor errors well, eliminates the oscillation caused by disturbances, and at the same time ensures the robustness and stability of the system.

References

1. Han, J., Chen, Z., Jiang, B., Lu, L.: Development status and key technology analysis of ducted fan UAV. Aerodyn. Missile J. 9, 45–49 (2013). https://doi.org/10.16338/j.issn.1009-1319.2013.09.008
2. Miwa, M., Shigematsu, Y., Yamashita, T.: Control of ducted fan flying object using thrust vectoring. J. Syst. Des. Dyn. 6(3), 322–334 (2012). https://doi.org/10.1299/jsdd.6.322


3. Manouchehri, A., Hajkarami, H., Ahmadi, M.S.: Hovering control of a ducted fan VTOL unmanned aerial vehicle (UAV) based on PID control. In: 2011 International Conference on Electrical and Control Engineering, pp. 5962–5965. IEEE (2011). https://doi.org/10.1109/ICECENG.2011.6057155
4. Zhao, H.: Design and realization of control system for small ducted fan UAV. Master's thesis, Harbin Institute of Technology (2009)
5. Ren, X.: Research on ducted fan unmanned vehicle autonomous modeling and control strategies. Master's thesis, Harbin Institute of Technology (2008)
6. Johnson, E.N., Turbe, M.A.: Modeling, control, and flight testing of a small-ducted fan aircraft. J. Guidance Control Dyn. 29(4), 769–779 (2006). https://doi.org/10.2514/1.16380
7. Aruneshwaran, R., Wang, J., Suresh, S., Venugopalan, T.: Neural adaptive backstepping flight controller for a ducted fan UAV. In: Proceedings of the 10th World Congress on Intelligent Control and Automation, pp. 2370–2375. IEEE (2012). https://doi.org/10.1109/WCICA.2012.6358270
8. Lei, W.: Research on adaptive control of ducted fan UAV. Master's thesis, Harbin Institute of Technology (2014)
9. Zhang, S., Gao, L.: Modern Control Theory, 2nd edn. Tsinghua University Press (2017)
10. Kaufman, H., Barkana, I., Sobel, K.: Direct Adaptive Control Algorithms: Theory and Applications. Springer, New York (2012). https://doi.org/10.1007/978-1-4612-0657-6
11. Wang, S., Wei, G.: Application of Kalman filter in attitude measurement for four-rotor aircraft flight. Ordnance Ind. Autom. 30, 73–74, 80 (2011). https://doi.org/10.3969/j.issn.1006-1576.2011.01.022

Hetero-Source Sensors Localization Based on High-Precision Map

Zhuofan Cui1, Junyi Tao1, Bin He1, and Yu Zhang1,2(B)

1 State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou, China
[email protected]
2 Key Laboratory of Collaborative Sensing and Autonomous Unmanned Systems of Zhejiang Province, Hangzhou, China

Abstract. High-precision maps and localization techniques are important modules in robot systems. However, existing simultaneous localization and mapping methods, which use the same sensors for both localization and mapping, suffer from high cost and inflexibility. Therefore, this paper proposes a method that uses a low-cost visual sensor to achieve indoor localization with a pre-obtained high-precision point cloud map. To find similar areas between the point clouds of the map and of the sensor, a localization strategy based on ICP point cloud matching is implemented, given a rough initial pose. Meanwhile, visual odometry is integrated into the localization framework to overcome localization failures caused by repetitive point cloud features in indoor environments. Sub-meter localization accuracy in an indoor building environment is achieved in the experiments.

Keywords: Localization · Mapping · Point cloud registration · Visual odometry

1 Introduction

High-precision maps and localization techniques are important modules in robot systems (e.g., autonomous driving cars [1], patrol robots [2], and unmanned aerial vehicles [3]). Existing simultaneous localization and mapping techniques, which use visual sensors or LiDAR to construct maps of the surrounding environment and achieve self-localization simultaneously, have received extensive attention and have been applied in many robot systems. For indoor scenes, however, the environment information stored in unstructured maps has relatively time-invariant geometric structure, so the results of map construction can be used for a long time. Under the assumption of a time-invariant indoor scene, the map can be constructed in advance, and visual localization can be accomplished online during robot motion based on the pre-built map. Therefore, the sensors used to build the map and those carried on the robot can be different: high-cost, high-precision sensors can be used for map construction to

c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 437–444, 2023. https://doi.org/10.1007/978-981-99-6187-0_44

438

Z. Cui et al.

improve mapping accuracy, while low-cost and lightweight sensors can be carried on the robot to complete the localization task.

Several hetero-source mapping-and-localization methods have been studied. The geometry-based approach proposed by Caselitz et al. [4] addressed visual localization by comparing a point cloud reconstructed from images with the 3D points of a known map. The projection-based approach, first proposed by Wolcott et al. [5], constructed a data grid for the intensity associated with 3D points on the map, and camera images were compared using normalized mutual information (NMI). Neubert et al. [6] proposed a method for camera localization in indoor scenes using depth maps generated from synthetic views. Yu et al. [7] estimated the 6-DOF camera pose from 2D-3D line correspondences in LiDAR maps, optimizing localization and line-matching results by direct matching. Localization methods based on RGB-D and stereo cameras operating on a point cloud map were also developed to supplement feature information. The closed-loop localization in the stereo SLAM method LSD-SLAM [8] was based on the connection of parallax and keyframes; edge constraints were introduced into the pose-graph optimization, and global BA optimization was conducted. The localization method of ORB-SLAM2 [9] matched the ORB descriptors of the current frame against the 3D point cloud collected by the stereo or RGB-D camera with the visual odometry. However, existing localization methods with depth information were based on low-precision point cloud maps, which limits localization accuracy.

This paper proposes a localization method for hetero-source sensors based on high-precision maps in indoor scenes. Lightweight visual sensors are used to localize the robot in the indoor environment based on a pre-obtained high-precision point cloud map, which is constructed using high-cost LiDAR sensors.
The lightweight RGB-D visual sensor is applied to obtain environmental information during the robot's movement. The association and matching of the pre-obtained map with the hetero-sensor data collected during the robot's movement are used to realize sub-meter accuracy localization. The main contributions of this paper are as follows:

– A new framework for indoor visual localization is proposed based on a pre-obtained high-precision LiDAR map.
– A strategy for combining point cloud information and visual information is proposed to improve localization accuracy.
– The accuracy of the localization results reaches the sub-meter level, which can be applied in indoor scenes for unmanned vehicles.

2 Proposed Method

2.1 System Overview

The framework proposed in this paper combines visual and point cloud information. The point cloud map is provided as prior information to assist localization. LIO-mapping is a tightly coupled method using LiDAR and IMU.


The point cloud map is generated by a mapping module inspired by LIO-mapping. The RGB-D information is fully exploited for localization.

Fig. 1. System overview.

As shown in Fig. 1, the high-precision map is constructed by LIO-mapping before localization. The RGB image and depth image are used by the visual odometry to obtain an initial pose estimate by solving the PnP problem. The ICP algorithm then aligns the downsampled depth point cloud with the pre-obtained LiDAR point cloud map.

2.2 Hetero-Source Point Cloud Registration

This paper uses the matching of the high-precision point cloud map and lightweight sensor information to complete hetero-source localization. The points of the map selected in the previous frame and the downsampled points of the current frame obtained by the RGB-D camera are matched to estimate the camera pose. Let the coordinates of a point in the coordinate system of the previous camera frame $P_W$ be $(X, Y, Z)$. The point cloud coordinates are projected into the corresponding pixel coordinate system based on the pinhole imaging model of a monocular camera. The projection formula is:

$$Z\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \tag{1}$$


where the pixel coordinates $(u, v)$ corresponding to the point cloud coordinates $(X, Y, Z)$ in the pixel plane are obtained. Points within the image range (the image size is 640 × 480) and with a depth of less than 10 m are selected:

$$\begin{cases} 0 < u < 640 \\ 0 < v < 480 \\ 0 < Z < 10 \end{cases} \tag{2}$$

The selected area is the field of view of the previous camera frame. By selecting the points in the FoV of the RGB-D camera, the map can be restricted to the view of the previous camera position, which reduces the number of point clouds. The point clouds observed at the camera position of the previous frame can be obtained continuously by processing all points of the point cloud map in this way.

To make better use of the high-precision map information, hetero-source matching is adopted between the map information of the previous frame and the sensor information of the current frame, which differs from other methods that use point cloud matching between the same frames and sensors for pose estimation. This efficient computation reduces the difference between the current and previous frames and increases the accuracy of the point cloud registration results. ICP methods for point cloud registration achieve high speed; however, ICP demands an initial value, which limits the range of admissible initial position guesses needed to avoid divergence of the localization results. The accuracy of this method depends on the localization results of previous frames; mismatch errors and camera swings may lead to divergence of subsequent localization results.
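The selection procedure of Eqs. (1)-(2) can be sketched as follows; the intrinsics fx, fy, cx, cy are illustrative values, not the paper's calibration:

```python
import numpy as np

def select_fov_points(pts, fx=525.0, fy=525.0, cx=320.0, cy=240.0,
                      width=640, height=480, max_depth=10.0):
    """Keep map points that project inside the image and satisfy Eq. (2).

    pts: (N, 3) array of (X, Y, Z) in the previous camera frame; Z is assumed
    nonzero so the pinhole projection of Eq. (1) is well defined.
    """
    X, Y, Z = pts[:, 0], pts[:, 1], pts[:, 2]
    u = fx * X / Z + cx
    v = fy * Y / Z + cy
    mask = (0 < u) & (u < width) & (0 < v) & (v < height) & (0 < Z) & (Z < max_depth)
    return pts[mask]

pts = np.array([[0.0, 0.0, 5.0],    # projects to the image centre: kept
                [0.0, 0.0, 15.0],   # beyond the 10 m depth limit: rejected
                [5.0, 0.0, 1.0],    # projects outside the image: rejected
                [0.0, 0.0, -2.0]])  # behind the camera: rejected
kept = select_fov_points(pts)
```

Applying this mask to the whole map yields the per-frame candidate set passed to ICP.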

2.3 3D-2D Pose Estimation

Visual odometry is adopted in the localization framework to solve the problem that point cloud registration may fail in areas with similar point cloud features, such as long corridors. Since RGB-D cameras provide both depth and image information, 3D-2D matching can be adopted for pose estimation. The PnP method estimates the camera motion from the monocular camera images and depth images of the first and second frames. First, n 3D points and their corresponding 2D projection points are obtained, and the pose is computed in the camera coordinate system from these matching pairs. Matching 2D feature points are found between the RGB images of the first and second frames. ORB feature points are extracted from the two frames: FAST corners are detected first, selecting the places with sharp grayscale changes in the image, and non-maximum suppression is adopted to avoid the extracted corners clustering too densely. An image pyramid enables feature extraction at multiple scale levels, reducing the impact of scale changes of image objects. The direction of key points is calculated from the grayscale center of mass around each key point. BRIEF descriptors represent the information of the feature points; since the descriptors are binary, the Hamming distance can be used as the matching criterion to complete the preliminary matching of feature points. The RANSAC method then eliminates outliers, avoiding the mismatches that arise when only the Hamming distance is used. The homography matrix H satisfied by the maximum number of matching point pairs is found. The 2D point of each frame corresponds to a 3D spatial position in the depth image, and this correspondence is given by the calibration parameters. After the 2D-2D matching point pairs in the two RGB frames are obtained, the 3D-2D matching relationship between the RGB images and the depth maps of the two frames follows; the coordinates of the 3D points are obtained by reading the depth of the corresponding pixels in the depth map. With multiple 3D-to-2D matching point pairs, PnP calculates the camera pose: R and t are computed from the geometric relationship between the 3D and 2D points, and the L-M optimization method solves PnP iteratively. The advantage of PnP is that it can be computed even with few matching point pairs. Indoor scenes may contain white walls with sparse features, or scenes whose features are obscured by over-exposure or under-exposure; adopting the PnP method avoids localization failures caused by too few matching point pairs.
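Because BRIEF descriptors are binary, the Hamming-distance matching described above reduces to XOR plus a popcount. A minimal sketch (a toy brute-force matcher, not the paper's implementation; the `max_dist` threshold is a hypothetical parameter):

```python
import numpy as np

def hamming(d1, d2):
    """Hamming distance between two binary descriptors stored as uint8 arrays."""
    return int(np.unpackbits(np.bitwise_xor(d1, d2)).sum())

def match_descriptors(desc_a, desc_b, max_dist=64):
    """Brute-force nearest-neighbour matching by Hamming distance."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = [hamming(d, e) for e in desc_b]
        j = int(np.argmin(dists))
        if dists[j] <= max_dist:
            matches.append((i, j, dists[j]))
    return matches
```

The preliminary matches produced this way would then be filtered by RANSAC before pose estimation, as the text describes.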

3 Experiments

3.1 Map Construction

LIO-mapping is employed in this paper to construct the lidar point cloud map of the teaching building.

Fig. 2. The point cloud map for one floor of the teaching building. The long corridor can be observed in the map.

The selection of a single floor avoids situations in which point cloud features are similar between different floors; otherwise, matching failures caused by these identical traits would occur during point cloud matching. The mapping result can be observed in Fig. 2. The point cloud selection also reduces the number of points, decreasing the computational cost of point cloud matching and improving the algorithm's efficiency. This experiment uses the high-precision map to provide environmental information for the subsequent localization.

3.2 Localization Results

Based on the designed localization framework, the experiment is carried out in the teaching building. After establishing the high-precision map, the images obtained by the depth camera are used to compute the localization results. In the visualization, the localization accuracy can be judged from the matching quality of the image scene inside the teaching building. Since ground truth is hard to obtain in a large indoor building, the closed-loop error is used to measure localization accuracy: the absolute value of the position difference between the starting position of the robot Pa and the termination position Pb. With the starting point (X_a, Y_a, Z_a) and the ending point (X_b, Y_b, Z_b), the closed-loop error is defined as:

\epsilon = \sqrt{(X_b - X_a)^2 + (Y_b - Y_a)^2 + (Z_b - Z_a)^2}  (3)

The closed-loop error achieved sub-meter accuracy for localization. The per-frame localization accuracy can be roughly observed through the image information. The position in Fig. 3 is consistent with the position matched by the point cloud, and the starting and ending positions are accurate. The closed-loop errors in all directions are calculated from the final and starting positions, where the Z direction is vertical, the X direction is along the shorter side of the building in the horizontal plane, and the Y direction is along the longer side.
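Eq. (3) is just the Euclidean distance between the two positions; a one-line sketch:

```python
import numpy as np

def closed_loop_error(start, end):
    """Eq. (3): Euclidean distance between the starting and termination positions."""
    start, end = np.asarray(start, float), np.asarray(end, float)
    return float(np.sqrt(((end - start) ** 2).sum()))
```

Evaluated on the per-axis positions of Table 1, this yields the sub-meter figure reported there.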

Fig. 3. Localization result

Hetero-Source Sensors Localization Based on High-Precision Map

443

Table 1. Robot position and closed-loop error (m)

Axis  Starting position  Termination position  Closed-loop error
X     -0.26947           -0.25479              0.01468
Y     -1.74823           -1.89449              0.14626
Z     5.13623            5.72851               0.59228
Sum   -                  -                     0.61252

The final closed-loop error is shown in Table 1: it is less than one meter, and the accuracy reaches the decimeter level. The experimental results also illustrate that the closed-loop error is small in the X and Y directions, while the error in the Z direction is larger than in the other directions and contributes the main part of the closed-loop error. The reason is that the RGB-D camera lacks a Z-direction constraint due to the structure of the teaching building. In the experiment, the Normal Distributions Transform (NDT), another point cloud registration method, is adopted for a comparison experiment: NDT replaces ICP in the framework, and the time consumption of the point cloud matching for each frame is calculated to compare the efficiency of the two methods.

Table 2. Average time consumption of ICP and NDT

Method               ICP      NDT
Time consumption (s) 0.07832  1.65752

The results in Table 2 demonstrate that the computational efficiency of ICP is higher than that of NDT. The environmental difference between the previous frame and the current frame affects the localization accuracy. Meanwhile, the closed-loop error of ICP-based point cloud registration reaches the decimeter level, while that of NDT only reaches the meter level. In general, the ICP approach adopted in this paper outperforms NDT in both computational efficiency and localization accuracy.
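For orientation, a bare-bones point-to-point ICP iteration (brute-force nearest neighbours plus the closed-form SVD/Kabsch alignment) can be sketched as follows. This is an illustrative toy under simplifying assumptions, not the implementation benchmarked in Table 2:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Closed-form (Kabsch/SVD) rigid alignment of paired point sets."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    H = (src - mu_s).T @ (dst - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R, t

def icp(src, dst, iters=30):
    """Point-to-point ICP with brute-force nearest-neighbour correspondences."""
    cur = src.copy()
    for _ in range(iters):
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        matched = dst[d2.argmin(axis=1)]
        R, t = best_rigid_transform(cur, matched)
        cur = cur @ R.T + t
    return cur
```

As the text notes, such a scheme needs a reasonable initial guess: with a large initial offset the nearest-neighbour correspondences become unreliable and the iteration can diverge.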

4 Conclusion

Aiming to reduce operating costs and increase flexibility compared with a traditional SLAM system, a high-precision map of the indoor scene is established in advance, and sub-meter indoor localization based on the pre-obtained offline map and online visual information is achieved in this paper. The hetero-source sensor localization is based on point cloud registration, with the point cloud map partitioned according to the field-of-view information of the camera. For mismatching situations in environments full of similar point cloud features inside the building, the information of the visual odometry is integrated to estimate the initial pose and improve the localization accuracy. The final localization accuracy of the system reaches the sub-meter level, which enables wide practical use.

In future work, RGB-D and IMU information can be fused to reduce the drift of the current localization results in the Z direction. Moreover, high-dimensional features such as lines or surfaces can be introduced into the point cloud registration for further improvement.

Acknowledgements. This work was supported by STI 2030-Major Projects 2021ZD0201403, in part by NSFC 62088101 Autonomous Intelligent Unmanned Systems, and in part by the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT2022B04).


Edge-Node Refinement for Weakly-Supervised Point Cloud Segmentation

Yufan Wang and Qunfei Zhao(B)

Department of Automation, Shanghai Jiao Tong University, Shanghai 200240, China
{wyf018032910040,zhaoqf}@sjtu.edu.cn

Abstract. Point cloud segmentation is an important issue in 3D industrial visual perception. With the development of learning-based methods, significant approaches have been proposed that greatly improve performance. However, in many cases annotated data cannot be easily obtained, and some methods then fail to deliver acceptable results. In this paper, we propose a novel graph-based method that exploits local-global structure information for weak supervision. We design a cluster-wise edge-node refinement module for a graph built on the feature map of the point cloud. The edge refinement module captures intra-cluster consistency and inter-cluster similarity information, while the node refinement module not only propagates cluster-wise information across the point cloud but also associates local-contextual and long-range information into the refined feature map. We apply a manifold regularizer and adjacency-based soft labels as supervision to make the feature embedding better match a similarity distribution. Experiments on part segmentation datasets show the feasibility and efficiency of the method in weakly supervised applications.

Keywords: Point Cloud · Part Segmentation · Weakly-supervision · Feature Refinement

1 Introduction

With the rapid development of hardware devices and storage technology, the acquisition of 3D data has become effective and affordable for applications. Compared with 2D images, 3D data provide more precise scale, shape and surface details, which is essential to industrial production, autonomous driving and robotics [10]. Research on point cloud processing technology is attractive and challenging, and includes a series of basic tasks such as filtering [37], registration [14,36], reconstruction [5], classification, shape/scene segmentation and object detection [11,23,27]. As a high-level task, semantic/part segmentation has always been a focus. In particular, learning-based approaches have been proposed to improve the precision and efficiency of point cloud processing [8,13,16,19,20,26,28,32,33,35].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 445-459, 2023. https://doi.org/10.1007/978-981-99-6187-0_45


Fig. 1. Illustration of cluster-wise feature refinement.

Mainstream methods include projection-based, discretization-based and point-based methods [10]. Many learning-based methods can fully use the supervision information and construct a powerful latent space for the input data. However, some challenges remain. First, acquiring completely labelled data is relatively difficult due to its high cost and time-consuming nature. Second, local-global features and structured information cannot easily be embedded into the feature map. To overcome these limits, we propose a novel graph-based network to learn local-global structure features of point clouds. Benefiting from the strong performance of contrastive learning, we apply a Siamese network as the feature encoder to learn latent unsupervised information from augmented data. To associate local contextual features, we construct a graph based on the feature map of the point cloud and partition the graph into clusters. In particular, we propose the definition of a harmonious cluster to describe cluster-wise consistency. Based on this definition, we refine the edges by measuring the joint probability of node-wise connection and cluster consistency. Furthermore, in order to propagate the cluster-wise distribution and associate long-range information, we refine the node features with a two-layer graph convolution network (GCN) and a global aggregation operation. Our method adjusts the adjacency relationships between intra-class and inter-class points, as shown in Fig. 1: the distance between intra-class points is reduced, and the distance between inter-class points is increased. Finally, for supervision, we provide a unified segmentation loss for both labelled and unlabelled data, a contrastive loss for unsupervised features, and a graph regularizer that controls a smoothness constraint.

Experimentally, we evaluate the segmentation performance on the ShapeNet dataset and on our own weld seam dataset. To quantify the effectiveness of the different modules, a series of ablation experiments are conducted.

2 Related Work

2.1 Point Cloud Learning

Extending deep learning to point clouds poses significant obstacles, the most noticeable being the unordered nature of the point cloud. The main solutions can be divided into two types. Direct methods, such as multi-view and volumetric-based methods, impose a regular structure and learn semantic information from regular domains; point-based methods aim to design networks with spatial invariance that learn semantic information from irregular domains. Multi-view methods project the point cloud into 2D images so that DCNNs can be applied [18,29,30]. Volumetric-based methods voxelize Euclidean space and apply 3D-CNNs to process the grid directly [6,7,17,21,22]. Point-based methods work on irregular domains: the pioneering PointNet introduces max pooling as a symmetric operator to realize order invariance [19]. Follow-up research on PointNet can be divided into MLP-based [9,20], convolution-based [24] and graph-based methods [10,28,33]. Graph-based methods, especially, have grown rapidly. An attention mechanism has been introduced to assign point-wise weights that encode spatial positions [25]. [4] proposes an embedded graph attention mechanism to learn local geometric representations with stacked MLP layers. Jiang et al. propose an edge refinement method to integrate local features and propagate messages [15]. We analyze the relationship of cluster-wise features and propose a cluster-based edge-node refinement method to achieve intra-class consistency and inter-class distinguishability.

2.2 Weakly-Supervised Learning for Point Cloud

Due to the high cost and time-consuming nature of data annotation, a series of weakly-supervised and self-supervised methods have been proposed for point clouds, building on common unsupervised machine learning techniques [1,12,31]. 3D-GAN proposes a deep generative model for point clouds that achieves high-quality reconstruction of unseen samples and learns compact representations for semantic operations, interpolation and shape completion [1]. Charles et al. introduce contrastive learning to point clouds, termed PointContrast [31]. Kaveh Hassani et al. propose a multi-task framework that leverages autoencoding and joint clustering to perform classification, segmentation and reconstruction [12].

3 Method

In this paper, we propose a cluster-wise feature refinement network for point clouds. The goal of our method is to give an accurate prediction with a limited amount of labelled data.

The dataset is composed of a point cloud set P and its label set d. P = {p_i ∈ R^3 | i ∈ N, 0 ≤ i ≤ N}, where p_i is the i-th 3D point, and d = {d_i | i ∈ N, 0 ≤ i ≤ N}, where d_i ∈ {0, 1, ..., K} is the discrete segmentation label of p_i and K is the number of classes. For convenience of computation, d is usually transformed into a one-hot assignment matrix. Let y be the one-hot assignment matrix, with y = (y_ij) ∈ {0, 1}^{N×K}.

Given the training data P and y, we aim to learn a model M : R^{N×3} → R^{N×K} and obtain the posterior probability ŷ over the segmentation label d. Sequentially,

ŷ = M(P)  (1)

where ŷ = (ŷ_ij) ∈ R^{N×K} and 0 ≤ ŷ_ij ≤ 1.
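The conversion from the discrete labels d to the one-hot assignment matrix y is a small bookkeeping step; a minimal sketch:

```python
import numpy as np

def to_one_hot(d, num_classes):
    """Convert discrete labels d (shape N,) to the one-hot assignment matrix y (N, K)."""
    y = np.zeros((d.shape[0], num_classes), dtype=np.int64)
    y[np.arange(d.shape[0]), d] = 1
    return y
```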

3.1 Network Architecture

The architecture of the network is shown in Fig. 2. We apply a point-based backbone as the encoder to construct a representation space for the point cloud input. The proposed feature refinement module can capture local-global features with a greater receptive field and effective information propagation. Finally, a projection head transfers the refined features into the semantic space.

In detail, let E : R^{N×3} → R^{N×D} be the point-based backbone and h = (h_i) ∈ R^{N×D} be the feature extracted by E from the input point cloud P:

E(P) = h ∈ R^{N×D},  (2)

where D is the dimension of the feature. In order to analyze the local-global feature, we construct a graph G = (V, E) conditioned on the feature map h. V is the node set, which can be formulated as

V = {v_i | i ∈ N, 0 ≤ i ≤ N}.  (3)

Fig. 2. Framework of cluster-wise feature refinement network.


Fig. 3. Implementation of the edge and node refinement modules.

where v_i ∈ V corresponds to p_i ∈ P. E is the edge set; E can be described by the adjacency matrix A in a probabilistic manner, with

A = (a_ij) ∈ R^{N×N},  0 ≤ a_ij ≤ 1,  (4)

E = {(v_i, v_j) | v_i, v_j ∈ V, a_ij = 1}.  (5)

Sequentially, as shown in Fig. 3, we design an edge refinement module and a node refinement module. The edge refinement module adjusts the adjacency relationships of the point embeddings by constructing a graph and capturing cluster features, while the node refinement module propagates the adjusted edge information. We denote the neighbor set of node v_i by N(v_i) = {v_j | v_j ∈ V, (v_i, v_j) ∈ E}.

In order to extract more unsupervised information, we follow the contrastive learning method and utilize a Siamese architecture as the encoder [34]. For each p_i ∈ P, we obtain corresponding augmented data p̃_i by a random rotation and flipping operation. We randomly rotate the point cloud by certain angles around the x, y, and z axes; due to the strong symmetry of the point clouds in the dataset and the subsequent flipping operation, we set the range of rotation angles from 0° to 180°. Denoting the random rotation and flipping operation by T, p̃_i is given by Eq. 6:

p̃_i = p_i T.  (6)

Furthermore, the point feature h̃_n for p̃_n is supposed to approximate the point feature h_n for p_n, which is introduced in Sect. 3.4.
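A simplified sketch of the augmentation operation T — here only a z-axis rotation in [0, 180) degrees plus a random axis flip, whereas the paper draws rotations about all three axes:

```python
import numpy as np

def augment(points, rng):
    """Random z-axis rotation in [0, 180) degrees followed by a random axis flip.

    A simplified stand-in for the paper's operation T in Eq. (6)."""
    theta = rng.uniform(0.0, np.pi)
    c, s = np.cos(theta), np.sin(theta)
    Rz = np.array([[c, -s, 0.0],
                   [s,  c, 0.0],
                   [0.0, 0.0, 1.0]])
    out = points @ Rz.T
    if rng.random() < 0.5:
        out[:, rng.integers(0, 3)] *= -1.0
    return out
```

Both the rotation and the flip are isometries, so per-point norms are preserved — a cheap invariant to check in tests.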

3.2 Edge Refinement

The feature extracted by the encoder is limited by the lack of local features; this lack includes the representation of the interaction between point cloud clusters and of the dispersion of each cluster itself. In this section, we propose an edge refinement module and construct the refined adjacency matrix. We first compute the initial adjacency matrix A conditioned on the current graph node feature h; A can be given by the Gaussian-weighted Euclidean distance between points.

450

Y. Wang and Q. Zhao

Fig. 4. Harmonious cluster and disharmonious cluster.

However, the point-wise distance fails to encode local features into the adjacency matrix: the point-pair encoding method is ineffective, and the receptive field of the feature is insufficient. In this section, we consider refining the cluster-wise feature by analyzing the graph edges. We partition the graph node set into Q groups (clusters), that is,

V_P = \{C_q \mid \bigcup_{q=1}^{Q} C_q = V;\ C_q, C_p \subset V,\ C_q \cap C_p = \emptyset,\ q \neq p \in \{1, 2, \ldots, Q\}\}.  (7)

Correspondingly, let E_q be the edge set for C_q; the subgraph can be defined as G_q = (C_q, E_q), with E_q = {(v_i, v_j) | v_i, v_j ∈ C_q, (v_i, v_j) ∈ E}. The corresponding adjacency matrix is A_q = (a_ij) ∈ R^{N_q×N_q}, where N_q is the number of nodes in C_q, and the corresponding node feature is h_{C_q} = (h_i) ∈ R^{N_q×D} with v_i ∈ C_q. In order to encode the local neighbor connection information and refine the initial adjacency matrix, we introduce a new world state w_ij ∈ {0, 1} as a discrete description for A:

w_{mn} = \begin{cases} 1 & (v_m, v_n) \in E_q \\ 0 & \text{otherwise.} \end{cases}  (8)

Furthermore, we consider the distribution of w_i over all v_i ∈ C_q. We design a function π : C_q → {0, 1}, where C_q ∈ V_P is an arbitrary graph node set:

π(C_q) = \begin{cases} 1 & \forall v_i, v_j \in C_q,\ w_{ij} = 1 \\ 0 & \text{otherwise.} \end{cases}  (9)

π(C_q) can be considered a binary cluster label describing the consistency of the members of cluster C_q. We define a cluster C_q ∈ V_P to be a harmonious cluster if π(C_q) = 1; otherwise, C_q is called a disharmonious cluster, as illustrated in Fig. 4. The elements of A can be refined by π(C_q) in a probabilistic manner by considering the joint posterior probability of w_ij and π for the q-th cluster. The refined adjacency matrix is denoted by A' = (a'_ij) ∈ R^{N×N}:

a'_{ij} = \Pr(w_{ij}, π \mid C_q) = \Pr(w_{ij} \mid π, C_q)\Pr(π \mid C_q).  (10)
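The harmonious-cluster indicator of Eq. (9) reduces to checking that every off-diagonal entry of the binary intra-cluster connection matrix of Eq. (8) equals one; a minimal sketch:

```python
import numpy as np

def pi_indicator(w):
    """Eq. (9): 1 iff every pair of nodes in the cluster is connected (w_ij = 1),
    where w is the binary intra-cluster connection matrix of Eq. (8)."""
    n = w.shape[0]
    return int(w[~np.eye(n, dtype=bool)].all())
```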

We use Eq. 10 to help the refined adjacency matrix absorb the information of the cluster-wise feature. According to Bayesian theory, the conditional probability Pr(w_ij | π, C_q) for a given C_q can be formulated as:

\Pr(w_{ij} \mid π, C_q) = \frac{\Pr(w_{ij})\,\Pr(π, C_q \mid w_{ij})}{\sum_{v_{i'}, v_{j'} \in C_q} \Pr(π, C_q \mid w_{i'j'})\,\Pr(w_{i'j'})},  (11)

where Pr(w_ij) can be given by the Gaussian-weighted Euclidean distance. In view of the dependence among intra-cluster nodes, it is not appropriate to factorize the likelihood Pr(π, C_q | w_ij) as a product of individual probabilities for each node; hence we apply statistical methods to model the likelihood. We weaken the connection between outliers and inliers through the cumulative distribution function F_X of a real-valued random variable X over a_mn ∈ A_q, with F_X : R → [0, 1]:

F_X(a_{ij}) = \Pr(X \le a_{ij}).  (12)

The likelihood Pr(π, C_q | w_ij) can be implemented as:

\Pr(π, C_q \mid w_{ij}) = I\!\left(-\frac{a_{ij} - \mu_q}{\sigma_q} \le L\right)\left(F_X(a_{ij}) - 1\right) + 1,  (13)

where L is a hyperparameter, μ_q and σ_q are the mean and variance of A_q respectively, and I is an indicator function:

I(\cdot) = \begin{cases} 1 & \text{if } \cdot \text{ is true} \\ 0 & \text{otherwise.} \end{cases}  (14)

In order to give consideration to intra-class consistency and similarity, we compute Pr(π | C_q) via the trace of the intra-cluster scatter matrix of {a_ij | v_i, v_j ∈ C_q}, as follows:

\Pr(π \mid C_q) = \frac{1}{1 + \mathrm{Tr}\!\left(\frac{1}{N_q}\sum_{v_n \in C_q}(h_n - λ_q)(h_n - λ_q)^{T}\right)},  (15)

where λ_q is the cluster center of C_q.
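Eq. (15) can be sketched directly from its definition — a tight cluster has a small scatter trace and hence a consistency score close to one:

```python
import numpy as np

def cluster_consistency(h_cluster):
    """Eq. (15): Pr(pi | C_q) = 1 / (1 + Tr(intra-cluster scatter matrix))."""
    lam = h_cluster.mean(axis=0)                       # cluster centre lambda_q
    diff = h_cluster - lam
    scatter = diff.T @ diff / h_cluster.shape[0]
    return float(1.0 / (1.0 + np.trace(scatter)))
```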

3.3 Node Refinement

The cluster-wise feature can thus be encoded into the adjacency matrix and refined implicitly. Note that our edge refinement module is not designed to directly build a more powerful representation space but to obtain the refined adjacency matrix. In this section, we propose a subsequent node refinement module to fully utilize the refined adjacency matrix.


The cluster projection g_c is applied to map the initial feature map into a cluster-refined representation space. g_c is implemented by a two-layer MLP and its output is denoted by h':

h' = g_c(h).  (16)

We use a graph neural network layer to propagate the link state from the refined adjacency matrix. In detail, we utilize a two-layer graph convolution network (GCN), denoted by g:

h'' = g(h', A').  (17)

In order to utilize the cluster-wise feature and adjacency matrix, we do not select a deeper GCN for feature propagation; however, it is worthwhile to capture long-range dependencies. We apply a Point Attention [9] module to update the feature in a global manner. With the help of the self-attention mechanism, long-range correlations can be encoded into the feature. The operation is formulated as:

h_{update} = g_3(h'') + \mathrm{softmax}\!\left(g_2(h'')\, g_1(h'')^{T}\right) h'',  (18)

where g_1, g_2 and g_3 are MLP layers. Note that the operation is shared by all clusters and can be considered the only opportunity to associate long-range features. Exploiting the advantages of the refined node features, local features with a wider receptive field can be captured.
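A hedged sketch of one propagation step of Eq. (17): a standard graph-convolution layer with symmetric normalisation (the paper stacks two such layers; the weight matrix `w` and the ReLU are illustrative choices, not specified in the text):

```python
import numpy as np

def gcn_layer(h, a, w):
    """One graph-convolution step with self-loops and symmetric normalisation."""
    a_hat = a + np.eye(a.shape[0])                     # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h @ w, 0.0)             # ReLU activation
```

Stacking two calls of `gcn_layer` with different weights corresponds to the two-layer GCN g used in the node refinement module.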

3.4 Supervision and Loss Function

The labels of the point cloud are not complete, and the number of valid labels is much smaller than the number of 3D points. Hence we propose a refined segmentation loss as the sparse-supervision loss, combining the refined adjacency matrix and the output of the model, and we use a graph manifold regularizer and a contrastive loss as the unsupervised losses.

Refined Segmentation Loss. The model is supervised by both labelled and unlabelled data with both explicit and implicit constraints. Traditionally, a binary mask B = (b_n) ∈ {0, 1}^{N×1} is adopted in the loss function for segmentation, with

b_n = \begin{cases} 1 & \text{if } p_n \text{ is labelled} \\ 0 & \text{otherwise.} \end{cases}  (19)

The original segmentation loss can be given as:

l_s = \frac{1}{N}\sum_n b_n \sum_k y_{nk} \log \frac{\exp(\hat{y}_{nk})}{\sum_{k'} \exp(\hat{y}_{nk'})}.  (20)

In order to fully utilize the labelled data and the refined adjacency matrix, we propose a soft label matrix s = (s_nk) ∈ R^{N×K} to modify the original formulation of the loss function. For the n-th unlabelled point, we have

m = \arg\max_{m} a'_{nm},  (21)

and

s_{nk} = \hat{y}_{mk}.  (22)

And the refined segmentation loss can be given as:

l_s = \frac{1}{N}\sum_n \sum_k \left( b_n\, y_{nk} \log \frac{\exp(\hat{y}_{nk})}{\sum_{k'} \exp(\hat{y}_{nk'})} + α\,(1 - b_n)\, s_{nk} \log \frac{\exp(\hat{y}_{nk})}{\sum_{k'} \exp(\hat{y}_{nk'})} \right),  (23)

where α is a hyperparameter. The unlabelled data can thus be assigned probabilistic labels given by the nearest neighbor node, guided by the adjacency matrix.

Graph Smoothness Loss. Our node refinement module is implemented in a hierarchical way, and a hierarchical loss is also provided. For each node refinement module, we construct a manifold on the refined feature space with an appropriate penalty term for the intrinsic pattern of the unlabelled data [2]. Hence the graph Laplacian matrix is employed to associate the data. The regularizer can be given by Eq. 24:

l_r = \frac{1}{\|A'\|_0}\sum_{n,m} \|H(h_m) - H(h_n)\|_2^2\; a'_{mn},  (24)

where H is the classification head, implemented by a convolution layer and a softmax function. The regularizer keeps the edge weights corresponding to node pairs with high similarity at a relatively low level.

Contrastive Loss. For the unlabelled data trained without effective constraints, we apply the contrastive learning method to improve performance. With the shared-parameter structure, the augmented data and the original data are supposed to approximate each other. The contrastive loss can be given as:

l_c = \frac{1}{N \times K}\,\|\tilde{h} - h\|_F^2.  (25)

In the training stage, the total loss can be formulated as:

l = l_s + l_r + l_c.  (26)

In the inference stage, we need to transform the output ŷ of our model M from the one-hot state into the discrete segmentation label, denoted by d̂ = (d̂_n) ∈ N^{N×1}. d̂_n is given as follows:

\hat{d}_n = \arg\max_k \hat{y}_{nk}.  (27)
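The three losses above can be sketched directly from their definitions (with the conventional minus sign added to the cross-entropy of Eq. (23), and `alpha` an illustrative value):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def refined_segmentation_loss(y_hat, y, labelled, s, alpha=0.5):
    """Eq. (23): cross-entropy on labelled points plus an alpha-weighted
    soft-label term on unlabelled points."""
    logp = np.log(softmax(y_hat))
    per_point = (y * logp).sum(1) * labelled + alpha * (s * logp).sum(1) * (1 - labelled)
    return float(-per_point.mean())

def smoothness_loss(scores, a):
    """Eq. (24): squared score differences weighted by the refined adjacency."""
    diff2 = ((scores[:, None, :] - scores[None, :, :]) ** 2).sum(-1)
    return float((diff2 * a).sum() / max(np.count_nonzero(a), 1))

def contrastive_loss(h_aug, h):
    """Eq. (25): normalised squared Frobenius distance between feature maps."""
    return float(((h_aug - h) ** 2).sum() / h.size)
```

The total training loss of Eq. (26) is then simply the sum of the three values.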

4 Experiments

4.1 Experiments Setup

Dataset. ShapeNet [3] is a 3D dataset with rich annotated labels, represented by CAD models of 55 common object categories with 51300 models. ShapeNet contains multiple semantic categories and is widely used as a benchmark for point cloud classification and segmentation. ShapeNet Part can be considered a subset of ShapeNet that focuses on fine-grained shape part segmentation (2-6 parts). Our method is trained on 16 object categories with incomplete labels.

Table 1. Comparison of part segmentation prediction on ShapeNet dataset (mIoU %)

Supervision  Method            Class mIoU  Inst. mIoU  Air.  Bag   Cap   Car   Chair  Ear.  Guitar  Knife  Lamp  Lap.  Motor.  Mug   Pistol  Rocket  Skate.  Table
Full         PointNet [19]     80.4        83.7        83.4  78.7  82.5  74.9  89.6   73.0  91.5    85.9   80.8  95.3  65.2    93.0  81.2    57.9    72.8    80.6
Full         PointNet++ [20]   81.9        85.1        82.4  79.0  87.7  77.3  90.8   71.8  91.0    85.9   83.7  95.3  71.6    94.1  81.3    58.7    76.4    82.6
Full         DGCNN [28]        82.3        85.1        84.2  83.7  84.4  77.1  90.9   78.5  91.5    87.3   82.9  96.0  67.8    93.3  82.6    59.7    75.5    82.0
Full         SpiderCNN [35]    82.4        85.3        83.5  81.0  87.2  77.5  90.7   76.8  91.1    87.3   83.3  95.8  70.2    93.5  82.7    59.7    75.8    82.8
Full         SPLATNet [11]     83.7        85.4        83.2  84.3  89.1  80.3  90.7   75.5  92.1    87.1   83.9  96.3  75.6    95.8  83.8    64.0    75.5    81.8
Full         Ours (PointNet)   83.0        85.5        84.1  80.9  85.5  77.1  90.2   70.3  90.4    87.4   84.5  94.6  82.5    94.3  82.0    66.3    78.6    81.9
Full         Ours (DGCNN)      83.4        85.6        84.3  81.1  86.5  77.8  90.5   70.3  90.8    87.5   83.9  94.9  82.8    94.6  82.6    66.3    79.1    82.2
5%           PointContrast     72.1        77.7        78.4  67.7  78.2  66.2  85.5   52.6  87.7    81.6   76.3  93.7  56.1    80.1  70.9    44.7    60.7    73.0
5%           Baseline (DGCNN)  68.6        80.4        78.6  50.4  66.4  66.8  88.3   57.9  88.7    82.5   78.5  95.2  22.2    83.3  62.4    30.6    64.7    81.2
5%           Ours (DGCNN)      78.1        83.5        80.4  80.0  78.0  75.2  90.2   63.5  90.0    86.3   81.4  95.6  54.5    92.3  77.6    52.2    72.3    81.6

Our welding seam dataset is a 3D point cloud dataset generated by a structured light scanner. The dataset consists of 14 welding plate samples, and the total number of points is 2111K. We annotated 10% of the dataset and use 1679K points as the training set and 432K as the test set.

Implementation Details. We implement our method on the classic platform PyTorch and train the network on an NVIDIA RTX 2080Ti. The encoder is selectable and can be adapted to different scenario requirements; in our experiments, we utilize PointNet [19] and DGCNN [28] as encoders. The parameters of the edge and node modules are randomly initialized. We apply the SGD optimizer with initial learning rate l = 0.0001, momentum 0.9 and weight decay 0.0001, and utilize Cosine Annealing Learning Rate Decay as the learning strategy. We train for 200 epochs. Data augmentation strategies follow [34]: we sample 2048 points from a single model within a unit sphere, and to avoid overfitting we add Gaussian noise and random rotations, with rotation angles in the range of [-180, 180] degrees around the z axis and [-20, 20] degrees around the x and y axes. Note that the data augmentation before the experiments is independent of the data augmentation used for contrastive learning.

Edge-Node Refinement for Weakly-Supervised Point Cloud Segmentation


Fig. 5. Qualitative results on the validation set of ShapeNet dataset.

Metrics. We utilize the mean Intersection over Union (mIoU), with two derived metrics, class mIoU and instance mIoU, to evaluate the performance of methods. Class mIoU and instance mIoU average over points in each single shape category and over all shape instances, respectively.

Table 2. Comparison of semantic segmentation prediction on the test set of our welding seam dataset

Method             mIoU (%)
Baseline (DGCNN)   92.8
Ours               95.4
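A minimal sketch of the per-shape mIoU computation (a hypothetical helper, not the authors' evaluation code): classes absent from both prediction and label are skipped, and the remaining per-class IoUs are averaged.

```python
import numpy as np

def miou(pred, label, num_classes):
    """Mean IoU over the classes present in one shape instance."""
    ious = []
    for c in range(num_classes):
        inter = np.sum((pred == c) & (label == c))
        union = np.sum((pred == c) | (label == c))
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```

Instance mIoU then averages this quantity over all shape instances, while class mIoU first averages within each shape category and then over categories.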

4.2 Quantitative Results

The results on the ShapeNet dataset are summarized in Table 1. For the supervised task, our method achieves state-of-the-art performance with 85.6% instance mIoU and 83.5% class mIoU, outperforming the baseline DGCNN and PointNet. For weakly-supervised tasks, we utilize 5% labelled points; the gap is acceptable, at 2.3% below the fully supervised models. The experiments on our welding seam dataset are shown in Table 2. With 10% labelled points, our method achieves acceptable results compared with the baseline method. As shown in Fig. 5, our refinement module can capture structural information and local-global features with less labelled data. In particular, for objects with regular edges, such as cars, chairs, and knives, more similar harmonious clusters can be generated. Hence the embedded features of the same class obtain a larger joint posterior probability p(w_mn, π | C_q), and points around edges can be better classified. The visualization of segmentation results on our welding seam dataset is shown in Fig. 6. Our method captures the distribution of different classes, especially for the plane and the welding seam. Compared with the baseline, our method produces sharper edges and better continuity.


Y. Wang and Q. Zhao

Fig. 6. Qualitative results on the test set of our welding seam dataset.

Table 3. Ablation study of each individual module (instance mIoU %). Eight configurations of the edge refinement, local node refinement, and global node refinement modules are compared; the checkmark columns marking which modules are enabled were lost in extraction (the final configuration enables all modules).

Labels  Instance mIoU (%) across the eight configurations
5%      80.3  80.4  80.6  81.2  82.1  82.7  82.6  83.5
10%     81.5  82.1  82.1  82.3  82.6  83.5  83.2  84.7
100%    85.1  85.1  85.1  85.2  85.1  85.4  85.4  85.6

Table 4. Ablation study of each individual component of the loss function (instance mIoU %). Five configurations combining the cross entropy, refined cross entropy, contrastive loss, and graph smoothness loss components are compared; the checkmark columns were lost in extraction (the final configuration enables all components).

Labels  Instance mIoU (%) across the five configurations
5%      80.4  81.9  82.1  82.2  83.5
10%     81.5  82.7  82.8  82.8  84.7
100%    82.3  82.8  83.4  84.2  85.6

4.3 Ablation Study

The edge and node refinement modules capture local-global features and structured information, which benefits feature propagation. As shown in Table 3, local and global node refinement should be used together; if either is used alone, the performance improvement is marginal. The edge refinement module combined with the local node refinement module contributes most to the performance improvement. The ablation study of each individual component of the loss function is shown in Table 4. We can see that our refined cross entropy loss can guide the model to


consider the unsupervised information, which has a significant effect, especially when the amount of unlabelled data is large.

5 Conclusion

In this paper, we propose a cluster-wise feature refinement graph-based method to realize weakly-supervised learning on point clouds. A relatively independent module is introduced to refine the node and edge features based on the encoder outputs. Based on the proposed harmonious cluster, the cluster-wise distribution discrepancy is modeled and exploited to yield features with more semantic consistency. Furthermore, the refined cluster-based segmentation loss, contrastive loss, and graph smoothness loss impose constraints on both labelled and unlabelled data. Experiments conducted on ShapeNet and our welding seam dataset demonstrate the representation and generalization ability of the method with fewer labelled points.

Acknowledgments. This work was supported by the National Natural Science Foundation of China (U2013205).

References
1. Arshad, M.S., Beksi, W.J.: A progressive conditional generative adversarial network for generating dense and colored 3D point clouds. In: 2020 International Conference on 3D Vision (3DV), pp. 712–722. IEEE (2020)
2. Belkin, M., Niyogi, P., Sindhwani, V.: Manifold regularization: a geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res. 7(1), 2399–2434 (2006)
3. Chang, A.X., et al.: ShapeNet: an information-rich 3D model repository. Computer Science (2015)
4. Chen, C., Fragonara, L.Z., Tsourdos, A.: GAPointNet: graph attention based point neural network for exploiting local feature of point cloud. Neurocomputing 438(7553) (2021)
5. Chen, S., Duan, C., Yang, Y., Li, D., Feng, C., Tian, D.: Deep unsupervised learning of 3D point clouds via graph topology inference and filtering. IEEE Trans. Image Process. 29, 3183–3198 (2019)
6. Choy, C., Gwak, J.Y., Savarese, S.: 4D spatio-temporal ConvNets: Minkowski convolutional neural networks. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
7. Dai, A., Ritchie, D., Bokeloh, M., Reed, S., Sturm, J., Nießner, M.: ScanComplete: large-scale scene completion and semantic segmentation for 3D scans. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4578–4587 (2018)
8. Feng, H., et al.: GCN-based pavement crack detection using mobile LiDAR point clouds. IEEE Trans. Intell. Transp. Syst. 23, 11052–11061 (2021)
9. Feng, M., Zhang, L., Lin, X., Gilani, S.Z., Mian, A.: Point attention network for semantic segmentation of 3D point clouds. Pattern Recogn. 107, 107446 (2020)
10. Guo, Y., Wang, H., Hu, Q., Liu, H., Bennamoun, M.: Deep learning for 3D point clouds: a survey. IEEE Trans. Pattern Anal. Mach. Intell. PP(99), 1 (2020)


11. Su, H., Jampani, V., Sun, D., Maji, S., Kautz, J.: SPLATNet: sparse lattice networks for point cloud processing. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
12. Hassani, K., Haley, M.: Unsupervised multi-task feature learning on point clouds. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
13. Hu, Q., et al.: Learning semantic segmentation of large-scale point clouds with random sampling. IEEE Trans. Pattern Anal. Mach. Intell. 44, 8338–8354 (2021)
14. Jauer, P., Kuhlemann, I., Bruder, R., Schweikard, A., Ernst, F.: Efficient registration of high-resolution feature enhanced point clouds. IEEE Trans. Pattern Anal. Mach. Intell. 41(5), 1102–1115 (2018)
15. Jiang, L., Zhao, H., Liu, S., Shen, X., Fu, C.W., Jia, J.: Hierarchical point-edge interaction network for point cloud semantic segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10433–10441 (2019)
16. Ma, L., Li, Y., Li, J., Tan, W., Yu, Y., Chapman, M.A.: Multi-scale point-wise convolutional neural networks for 3D object segmentation from lidar point clouds in large-scale environments. IEEE Trans. Intell. Transp. Syst. 22(2), 821–836 (2019)
17. Meng, H.Y., Gao, L., Lai, Y.K., Manocha, D.: VV-Net: voxel VAE net with group convolutions for point cloud segmentation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 8500–8508 (2019)
18. Milioto, A., Vizzo, I., Behley, J., Stachniss, C.: RangeNet++: fast and accurate lidar semantic segmentation. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (2019)
19. Qi, C.R., Su, H., Mo, K., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
20. Qi, C.R., Yi, L., Su, H., Guibas, L.J.: PointNet++: deep hierarchical feature learning on point sets in a metric space. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
21. Rethage, D., Wald, J., Sturm, J., Navab, N., Tombari, F.: Fully-convolutional point networks for large-scale point clouds. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11208, pp. 625–640. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01225-0_37
22. Rosu, R.A., Schütt, P., Quenzel, J., Behnke, S.: LatticeNet: fast point cloud segmentation using permutohedral lattices. arXiv preprint arXiv:1912.05905 (2019)
23. Shi, W., Rajkumar, R.: Point-GNN: graph neural network for 3D object detection in a point cloud. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1711–1719 (2020)
24. Thomas, H., Qi, C.R., Deschaud, J.E., Marcotegui, B., Goulette, F., Guibas, L.: KPConv: flexible and deformable convolution for point clouds. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2020)
25. Veličković, P., Cucurull, G., Casanova, A., Romero, A., Liò, P., Bengio, Y.: Graph attention networks. arXiv preprint arXiv:1710.10903 (2017)
26. Wang, W., Yu, R., Huang, Q., Neumann, U.: SGPN: similarity group proposal network for 3D point cloud instance segmentation. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
27. Wang, X., Cai, M., Sohel, F., Sang, N., Chang, Z.: Adversarial point cloud perturbations against 3D object detection in autonomous driving systems. Neurocomputing 466, 27–36 (2021)
28. Wang, Y., Sun, Y., Liu, Z., Sarma, S.E., Bronstein, M.M., Solomon, J.M.: Dynamic graph CNN for learning on point clouds. ACM Trans. Graph. 38(5), 1–12 (2018)


29. Wu, B., Wan, A., Yue, X., Keutzer, K.: SqueezeSeg: convolutional neural nets with recurrent CRF for real-time road-object segmentation from 3D lidar point cloud. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 1887–1893. IEEE (2018)
30. Wu, B., Zhou, X., Zhao, S., Yue, X., Keutzer, K.: SqueezeSegV2: improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 4376–4382. IEEE (2019)
31. Xie, S., Gu, J., Guo, D., Qi, C.R., Guibas, L., Litany, O.: PointContrast: unsupervised pre-training for 3D point cloud understanding. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12348, pp. 574–591. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58580-8_34
32. Xu, Q., Sun, X., Wu, C.Y., Wang, P., Neumann, U.: Grid-GCN for fast and scalable point cloud learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5661–5670 (2020)
33. Xu, S., Wang, R., Wang, H., Yang, R.: Plane segmentation based on the optimal-vector-field in lidar point clouds. IEEE Trans. Pattern Anal. Mach. Intell. 43, 3991–4007 (2020)
34. Xu, X., Lee, G.H.: Weakly supervised semantic point cloud segmentation: towards 10x fewer labels. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13706–13715 (2020)
35. Xu, Y., Fan, T., Xu, M., Zeng, L., Qiao, Y.: SpiderCNN: deep learning on point sets with parameterized convolutional filters. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11212, pp. 90–105. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01237-3_6
36. Yang, H., Shi, J., Carlone, L.: TEASER: fast and certifiable point cloud registration. IEEE Trans. Rob. 37(2), 314–333 (2020)
37. Zhang, D., Lu, X., Qin, H., He, Y.: Pointfilter: point cloud filtering via encoder-decoder modeling. IEEE Trans. Visual Comput. Graphics 27(3), 2015–2027 (2020)

Improving Dialogue Summarization with Mixup Label Smoothing

Saihua Cheng and Dandan Song(B)

School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
{chengsh,sdd}@bit.edu.cn

Abstract. Abstractive dialogue summarization models trained with Maximum Likelihood Estimation suffer from the overconfidence issue, because the training objective encourages the model to assign all probability to the hard target. Although Label Smoothing is widely adopted to prevent models from becoming overconfident, it assumes a pre-defined uniform distribution that is not adaptive and is not an ideal soft target. Therefore, we propose a Mixup Label Smoothing method in this paper, which exploits the general knowledge of a language model to construct a flexible soft target that presents diverse candidates. We conceptualize the hypothesis distribution obtained from a pretrained language model as the context-smoothing target, which encodes much knowledge from the massive pretraining corpus and implies more possible candidate summaries. Extensive experiments on three popular dialogue summarization datasets demonstrate that our method effectively outperforms various strong baselines, including in low-resource settings.

Keywords: Dialogue summarization · Label smoothing · Pretrained language model

1 Introduction

Recently, the emergence of massive online dialogue scenarios has rapidly stimulated the community's interest in dialogue summarization, which aims to condense long dialogue flows into short narrative summaries while preserving the salient information [1]. Compared with well-structured monologue inputs such as news, it is more challenging to summarize a less-organized dialogue containing topic transitions, inconsecutive descriptions, complicated coreference, informal language, and so on. Generally, abstractive dialogue summarization models are fine-tuned from a Pretrained Language Model (PLM). To bridge the disparity between dialogue input and narrative output, previous works widely explore enhancing dialogue understanding by incorporating dialogue features explicitly, including dialogue acts [2,3], stage transitions [4], coreference relations [5], and discourse graphs [3]. Some works also consider promoting summary generation with content guidance or planning, e.g., entity-chain prompts [6], and weakly sketch supervision

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 460–475, 2023. https://doi.org/10.1007/978-981-99-6187-0_46


[7]. However, such carefully designed and collected auxiliary features complicate the summarization pipeline, leading to longer procedures [8]. Besides, error propagation caused by external tools may worsen the summaries, and the modified architectures (or inputs) used to inject auxiliary information may hurt the language model's ability, leading to factual errors [9]. Furthermore, little exploration has addressed the low-diversity and overconfidence issue, where the abstractive summarization model is constantly forced to generate one specific summary. Typically, a training example is a concatenated utterance flow paired with one reference summary, which means only one target is provided for supervision. The limited supervised signals and the objective of Maximum Likelihood Estimation (MLE) lead the model to be optimized continuously toward a unique target for each dialogue, which results in a preference for high-frequency tokens [10] and is incompatible with the nature of language, i.e., the same meaning can be expressed in different forms. In contrast, an ideal model should be allowed to generate multiple semantically similar summaries without penalty, which contradicts the MLE training objective. Besides, compared to summarization tasks with abundant annotated data (e.g., news), dialogue summarization suffers from data scarcity [11], which exacerbates the overconfidence issue. To tackle this overconfidence issue, we propose a simple yet effective method that enriches the supervised target to present diverse candidates by exploiting the general knowledge underlying PLMs. We aim to reduce the penalty of the original training objective when generating summaries with similar semantics but different expressions. It is intractable to enumerate all possible candidate targets, but treating the PLM as a distribution function offers a way out.
PLMs have achieved strong performance on downstream tasks through learning on large-scale corpora [12]. A PLM is expected to predict tokens semantically as close as possible to the input sequence. Intuitively, the predicted distribution at each position can be regarded as a noisy approximation of the hard target, where tokens in the vocabulary with more similar semantics are assigned higher probability [13]. Furthermore, mixup is adopted to balance the original one-hot distribution with the hypothesis distribution at each position, guaranteeing that the ground truth contributes the most and avoiding poor summaries [13,14]. The interpolated target is called the context-smoothing label, and Context Label Smoothing (CLS) refers to the construction method. To retain the advantages of vanilla Label Smoothing, it is natural to generalize CLS by incorporating an additional uniform distribution to construct the mixup-smoothing label, i.e., Mixup Label Smoothing (MLS). To sum up, we regard the distribution over the vocabulary predicted by the PLM, fed with the original hard target, as an approximate soft target. Then we propose to utilize the soft target by mixing it up with the one-hot or uniform target. Our proposed smoothing label can be viewed as compressing a set of possible candidate summaries with slightly different forms from the perspective of sequence likelihood. The key contributions can be summarized as follows:


– We propose the Mixup Label Smoothing method to enhance the performance of abstractive dialogue summarization, which is simple yet effective. – It leverages the general knowledge from the PLM to construct dynamic adaptive smoothed target, alleviating the overconfident issue caused by Maximum Likelihood Estimation and data scarcity. – Extensive experiments on three datasets demonstrate the effectiveness of MLS in diverse dialogue scenarios, as well as in low-resource settings when compared to competitive baselines.
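As a sketch of the target construction just described (the function name and the interpolation weights lam and eps are illustrative, not the paper's tuned hyperparameters): the one-hot target, a PLM hypothesis distribution, and a uniform Label Smoothing component are mixed into a single soft target per position.

```python
import numpy as np

def mixup_label_smoothing(one_hot, plm_dist, lam=0.8, eps=0.1):
    """Blend the hard target with a PLM hypothesis distribution and a
    uniform distribution (illustrative weights, with lam + eps < 1).

    one_hot:  (V,) ground-truth one-hot vector
    plm_dist: (V,) distribution predicted by the PLM at this position
    lam:      weight kept on the ground-truth one-hot target
    eps:      weight given to the uniform Label Smoothing component
    """
    vocab = one_hot.shape[0]
    uniform = np.full(vocab, 1.0 / vocab)
    target = lam * one_hot + (1.0 - lam - eps) * plm_dist + eps * uniform
    return target / target.sum()  # guard against numerical drift
```

Because lam dominates, the ground-truth token keeps the largest probability mass while semantically plausible alternatives receive the remainder.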

2 Related Work

2.1 Dialogue Summarization

Recently, the release of SAMSum [15] and other conversation datasets has revived research on dialogue summarization, covering email threads [16], meeting transcripts [17], daily chats [18], and so on. In this work, we focus on abstractive dialogue summarization and adopt three benchmarks covering three scenarios: SAMSum [15], DialogSum [18], and TweetSum [19]. Usually, text summarization approaches are classified into extractive and abstractive paradigms [20]. However, the complicated dialogue flow and utterance interactions make the essential information so scattered and trivial that few semantic units are suitable for extraction. As a result, previous works mainly focus on abstractive summarization, which is generally modeled as a sequence-to-sequence learning task [1]. To exploit dialogue-specific features, researchers have widely explored re-organizing the dialogue and injecting auxiliary information into the summarization model. Novel attention mechanisms are devised to model the interaction among speakers, utterances, and topics via dialogue structures [21,22]. MV-BART [4] enhances dialogue understanding from four views, such as stage change. Coref-ATTN [5] and Ctrl-DiaSumm [23] explicitly incorporate coreference relations or named-entity planning, helping the model to distinguish complicated coreference relationships or perform content planning, respectively. BART(DALL) [24] utilizes a pretrained dialogue model as an unsupervised annotator to alleviate error propagation due to domain shift. A multi-task learning framework has also been considered to enhance dialogue understanding with multiple contrastive objectives [25]. These approaches may modify the model architecture or input format to leverage the carefully designed or collected auxiliary features [3,21,22].
However, no previous work has considered optimizing the training objective, where the hard target may be too strict, making the model overconfident and degenerate. In this paper, we propose to exploit the general knowledge of the pretrained language model to construct adaptive targets for supervised learning, which allows the model to generate semantically similar hypothesis summaries with slightly different expressions.

2.2 Knowledge Distillation

Knowledge Distillation (KD) comes with the success of deep models with enormous parameters trained on vast corpora, which have achieved remarkable success. Previous work on KD focuses on compressing a large, high-quality teacher model into a smaller student model to avoid expensive computing and storage costs. These works can be classified by knowledge type, distillation strategy, and model architecture [26]. KD can be interpreted as a regularizer [27]: the soft target in KD is learned, rather than being the pre-defined uniform distribution of Label Smoothing. Our proposed method, discussed in Sect. 3, can also be regarded as a kind of knowledge distillation that exploits general knowledge encoded in the teacher model (i.e., the PLM). It is worth noting, however, that KD and our method MLS can be distinguished; a discussion can be found in Sect. 5.1.

3 Approach

3.1 Problem Formulation

Generally, abstractive dialogue summarization is conceptualized as sequence-to-sequence learning trained with standard Maximum Likelihood Estimation (MLE). MLE encourages the neural model g_θ to maximize the sequence likelihood of the reference summary S* (i.e., the hard target) given the source dialogue D, with S = g_θ(D) trained toward S* under MLE. The model is trained with teacher forcing and generates the output summary autoregressively during inference. In practice, MLE is mathematically equivalent to the cross-entropy loss, which minimizes the sum of negative log-likelihoods of all tokens in the reference summary S* = {s*_1, s*_2, ..., s*_{|S*|}}, i.e.:

L(q_true, p_{g_θ}) = − Σ_{t=1}^{|S*|} Σ_{s ∈ V} q_true(s | D, S*_{<t}) log p_{g_θ}(s | D, S*_{<t})

[The extraction breaks off here and resumes mid-derivation in the paper "Model-Free Adaptive Sliding Mode Control for Nonlinear Systems" by X. Meng et al.] With ϕ_i > 0, the saturation function is

sat_i = s_i / ϕ_i        if |s_i / ϕ_i| ≤ 1
sat_i = sign(s_i / ϕ_i)  if |s_i / ϕ_i| > 1        (11)
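A minimal sketch of the component-wise saturation function in Eq. (11) (a hypothetical helper; the boundary-layer widths ϕ_i are assumed positive, as the text requires):

```python
import numpy as np

def sat(s, phi):
    """Component-wise saturation from Eq. (11): linear inside the
    boundary layer |s_i/phi_i| <= 1, sign function outside it."""
    r = s / phi
    # clip(r, -1, 1) equals r inside the layer and sign(r) outside it
    return np.clip(r, -1.0, 1.0)
```

This continuous approximation of the sign function is what lets the controller reduce chattering near the sliding surface.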

From (2), (8) and (9), the outputs of the adaptive sliding mode virtual feedback controller and the model-free adaptive sliding mode controller can be obtained as

u_s = c⁻¹(k s + η̂ sat(ϕ⁻¹ s) + p e + q ë)        (12)

u = −α† c⁻¹(f_d − ẏ₀ + k s + η̂ sat(ϕ⁻¹ s) + p e + q ë)        (13)

3.2 Stability Analysis

Choosing the Lyapunov function as

V_s = (1/2) sᵀ s + (1/2) Γ η̃ᵀ η̃        (14)

Taking the derivative of (14), we have

V̇_s = sᵀ ṡ + Γ η̃ᵀ (dη̃/dt)
    = sᵀ(−k s − η̂ sat(ϕ⁻¹ s)) − Γ η̃ᵀ (dη̂/dt)
    = −k sᵀ s + sᵀ(η̃ − η) sat(ϕ⁻¹ s) − η̃ᵀ ||s||        (15)


X. Meng et al.

where ϕ satisfies ϕ⁻¹ < I, with I the identity matrix. According to Eq. (10), when |ϕ_i⁻¹ s_i| ≤ 1, Eq. (15) can be bounded as

V̇_s = −k sᵀ s + sᵀ(η̃ − η) sat(ϕ⁻¹ s) − η̃ᵀ ||s||
    ≤ −k sᵀ s + sᵀ η̃ ϕ⁻¹ s − sᵀ η ϕ⁻¹ s − η̃ᵀ ||s||
    ≤ sᵀ η̃ ϕ⁻¹ s − η̃ᵀ ||s||
    ≤ ϕ⁻¹ η̃ᵀ ||s|| − η̃ᵀ ||s|| < 0        (16)

Similarly, when |ϕ_i⁻¹ s_i| > 1, Eq. (15) can be expressed as

V̇_s = −k sᵀ s + sᵀ η̃ sign(ϕ⁻¹ s) − sᵀ η ϕ⁻¹ s − η̃ᵀ ||s||
    = −k sᵀ s − sᵀ η ϕ⁻¹ s < 0        (17)

Hence V̇_s = 0 if and only if s_i = 0. Given the above, Eq. (15) is always less than or equal to zero, so the proposed reaching law satisfies the reachability condition of sliding mode, ensuring the asymptotic stability of the sliding mode controller.

3.3 The ESO Design

Due to the conservation of energy, in actual physical systems all states and unknown perturbations are finite, and Eq. (1) is rewritten as follows:

ż_i = z_{di} + α_i u_i,  ż_{di} = ḟ_{di},  i = 1, ..., n        (18)

where z_i = y_i and z_{di} = f_{di}. Then, the ESO is designed as

z̃_i = ẑ_i − z_i
dẑ_i/dt = ẑ_{di} + α_i u_i − ε₁ fal(z̃_i, γ₁, δ₁)
dẑ_{di}/dt = −ε₂ fal(z̃_i, γ₂, δ₂)        (19)

with

fal(z̃_i, γ_j, δ_j) = |z̃_i|^{γ_j} sign(z̃_i)  if |z̃_i| > δ_j
fal(z̃_i, γ_j, δ_j) = z̃_i / δ_j^{1−γ_j}      if |z̃_i| ≤ δ_j,   j = 1, 2        (20)

where ε₁, ε₂, 0 < γ_j < 1, δ₁ and δ₂ are adjustable design parameters. Letting z̃_{di} = ẑ_{di} − z_{di} and considering (18) and (19), we have

dz̃_i/dt = z̃_{di} − ε₁ fal(z̃_i, γ₁, δ₁)        (21)
dz̃_{di}/dt = −ε₂ fal(z̃_i, γ₂, δ₂) − ḟ_{di}        (22)
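A discrete-time sketch of the fal nonlinearity in Eq. (20) and one forward-Euler ESO update from Eq. (19) (the step size h and the default gain values are illustrative, borrowed from the range of parameters used later in the simulations):

```python
import numpy as np

def fal(e, gamma, delta):
    """Nonlinear gain from Eq. (20): power law outside the dead zone
    |e| > delta, linear inside it."""
    if abs(e) > delta:
        return abs(e) ** gamma * np.sign(e)
    return e / delta ** (1.0 - gamma)

def eso_step(z_hat, zd_hat, y, u, alpha, h=0.01,
             eps1=20.0, eps2=100.0, gamma=(0.25, 0.25), delta=(0.001, 0.001)):
    """One Euler step of the ESO in Eq. (19) for a single channel.
    z_hat tracks the output y; zd_hat estimates the lumped disturbance."""
    z_tilde = z_hat - y
    z_hat_new = z_hat + h * (zd_hat + alpha * u
                             - eps1 * fal(z_tilde, gamma[0], delta[0]))
    zd_hat_new = zd_hat + h * (-eps2 * fal(z_tilde, gamma[1], delta[1]))
    return z_hat_new, zd_hat_new
```

When the estimation error z_tilde is zero, both correction terms vanish and the observer simply propagates its internal model, consistent with Eqs. (21)-(22).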

Model-Free Adaptive Sliding Mode Control for Nonlinear Systems


From [10,11], the stability proof of (21)-(22) can be obtained by using Lyapunov functions. According to [12], it can be verified that z̃ = z̃_d = 0 when the design parameters ε₁ and ε₂ are properly selected. If z̃_d = 0, then f_d = ẑ_d and dẑ_d/dt = −ε₂ fal(z̃, γ₂, δ₂). Thus,

f̂_d = ẑ_d        (23)

where f̂_d = [f̂_{d1}, ..., f̂_{dn}]ᵀ and ẑ_d = [ẑ_{d1}, ..., ẑ_{dn}]ᵀ. Substituting Eq. (23) into (13), one obtains

u = −α† c⁻¹(f̂_d − ẏ₀ + k s + η̂ sat(ϕ⁻¹ s) + p e + q ë)        (24)

4 Simulation Results

This article uses a two-tank liquid level (TTLL) system [16], widely known in industry, to verify the performance of the proposed method; its structure diagram is shown in Fig. 1.

Fig. 1. The structure diagram of TTLL system.


The dynamic model of the TTLL system with uncertainties is described as

ẏ₁ = −((a₁ + Δa₁)/(A₁ + ΔA₁)) √(2g y₁) + ((a₂ + Δa₂)/(A₁ + ΔA₁)) √(2g y₂) + ((a₃ + Δa₃)/(A₁ + ΔA₁)) u₁ + d_ex1
ẏ₂ = −((a₂ + Δa₂)/(A₂ + ΔA₂)) √(2g y₂) + ((a₄ + Δa₄)/(A₂ + ΔA₂)) u₂ + d_ex2        (25)

where Δa and ΔA are the modeling error and measurement error of the manual valves and the tanks' cross sections, respectively, and d_ex1 and d_ex2 are external disturbances. The model of the TTLL system in the form of (1) is

[ẏ₁; ẏ₂] = [f_d1; f_d2] + diag(α₁, α₂) [u₁; u₂]        (26)

According to the model-free control theory, the proposed controller of the system can be written as

u₁ = −α₁⁻¹ c₁⁻¹ (f̂_d1 − ẏ₁₀ + k₁ s₁ + η̂₁ sat(ϕ₁⁻¹ s₁) + p₁ e₁ + q₁ ë₁)
u₂ = −α₂⁻¹ c₂⁻¹ (f̂_d2 − ẏ₂₀ + k₂ s₂ + η̂₂ sat(ϕ₂⁻¹ s₂) + p₂ e₂ + q₂ ë₂)        (27)

In the simulation verification stage, the input saturation caused by the inherent physical characteristics of the actuator in real engineering systems is considered. The control input u is processed as

Sat(u) = u_max  if u ≥ u_max
Sat(u) = u      if u_min ≤ u < u_max
Sat(u) = u_min  if u ≤ u_min        (28)

where u_max and u_min are the actuator's physical limits. The model parameters are given as a₁ = 0.4 cm², a₂ = 0.3 cm², a₃ = a₄ = 0.2 cm², A₁ = A₂ = 196 cm², g = 980 cm/s². The parameters of the different controllers are listed in Table 1.

Table 1. The parameters of controllers.

Controller   Parameters
MFASMC-ESO   c₁ = 3000, k₁ = 1.2, γ₁ = 0.25, m₁ = 0.001, Γ₁ = 0.0025, δ₁ = 0.001, n₁ = 0.001, φ₁ = 0.001, ε₁ = 20;
             c₂ = 320, k₂ = 1.2, γ₂ = 0.25, m₂ = 0.001, Γ₂ = 0.0025, δ₂ = 0.001, n₂ = 0.001, φ₂ = 100, ε₂ = 0.0025
MFSMC-ESO    c₁ = 3000, γ₁ = 0.25, k₁ = 1.2, m₁ = 0.5, δ₁ = 0.001, η₁ = 1, n₁ = 0.001, ε₁ = 20;
             c₂ = 1500, γ₂ = 0.25, k₂ = 1.2, m₂ = 0.5, δ₂ = 0.001, η₂ = 1, n₂ = 0.001, ε₂ = 100
SMC-ESO      c₁ = 3000, γ₁ = 0.25, k₁ = 1.2, m₁ = 0.001, δ₁ = 0.001, η₁ = 1, n₁ = 0.001, ε₁ = 20;
             c₂ = 320, γ₂ = 0.25, k₂ = 1.2, m₂ = 0.001, δ₂ = 0.001, η₂ = 1, n₂ = 0.001, ε₂ = 100
SMC          c₁ = 3000, k₁ = 1.2, m₁ = 0.001, n₁ = 0.001, η₁ = 1; c₂ = 320, k₂ = 1.2, m₂ = 0.001, n₂ = 0.001, η₂ = 1
PID          k_p = 5000, k_I = 20, k_D = 2; k_p = 5000, k_I = 30, k_D = 2


The set values are given as x₁₀ = 16 cm and x₂₀ = 10 cm. The actuator's physical limits are u_max = 5000 cm³ and u_min = 0 cm³. Figures 2, 3 and 4 show the output state and tracking error response curves of Tanks 1 and 2, together with enlarged views. Figure 5 shows the response curves of the control inputs u₁ and u₂. It can be seen from Figs. 2, 3 and 4 that the MFASMC-ESO method has a faster dynamic response and a smaller steady-state error compared with the other four methods. When an external disturbance occurs at t = 50 s, the response curve of the ESO-based method is smoother, and the observer error after recovery is smaller.


Fig. 2. The curves of state output y1 .



Fig. 3. The curves of state output y2 . 0.2


Fig. 4. The curves of tracking errors e1 and e2 .


Fig. 5. The curves of control inputs u1 and u2 .

5 Conclusion

This work proposes a model-free adaptive sliding mode control for nonlinear systems with uncertainties. By using the model-free control philosophy, the impact of inaccurate modeling on the system is avoided. To improve the control accuracy and robustness of nonlinear systems, a novel adaptive sliding mode algorithm is introduced to design the virtual feedback controller. An adaptive gain and a saturation function are used to reduce chattering and improve the robustness of the system. Further, to deal with the impact of parameter changes and strong external disturbances during system operation, a new ESO is adopted, and the observed state errors are guaranteed to converge to zero. Finally, comparative simulation experiments on the TTLL system are conducted to verify the proposed strategy.

Funding. This research work is supported in part by the National Natural Science Foundation under Grant 62273189 and the Shandong Province Natural Science Foundation under Grant ZR2021MF005.


References
1. Hou, Z., Wang, Z.: From model-based control to data-driven control: survey, classification and perspective. Inf. Sci. 235, 3–35 (2013)
2. Hou, Z., Xiong, S.: On model-free adaptive control and its stability analysis. IEEE Trans. Autom. Control 64(11), 4555–4569 (2019)
3. Fliess, M., Join, C.: Model-free control. Int. J. Control 86(12), 2228–2252 (2013)
4. Xiong, S., Hou, Z.: Model-free adaptive control for unknown MIMO nonaffine nonlinear discrete-time systems with experimental validation. IEEE Trans. Neural Netw. Learn. Syst. 33(4), 1727–1739 (2020)
5. Zhao, J., Hill, D.J.: On stability, L2-gain and H∞ control for switched systems. Automatica 44(5), 1220–1232 (2008)
6. Wang, Z., Li, S., Wang, J., Li, Q.: Robust control for disturbed buck converters based on two GPI observers. Control. Eng. Pract. 66, 13–22 (2017)
7. Chen, W., Yang, J., Guo, L., Li, S.: Disturbance-observer-based control and related methods: an overview. IEEE Trans. Industr. Electron. 63(2), 1083–1095 (2016)
8. Zhang, J., Jiang, W., Ge, S.S.: Adaptive fuzzy control for uncertain strict-feedback nonlinear systems with full-state constraints using disturbance observer. IEEE Trans. Syst. Man Cybern. Syst. 1–12 (2023). https://doi.org/10.1109/TSMC.2023.3280569
9. Talole, S.E., Kolhe, J.P., Phadke, S.B.: Extended-state-observer-based control of flexible-joint system with experimental validation. IEEE Trans. Ind. Electron. 57(4), 1411–1419 (2009)
10. Wei, W., et al.: A scalable-bandwidth extended state observer-based adaptive sliding-mode control for the dissolved oxygen in a wastewater treatment process. IEEE Trans. Cybern. 52(12), 13448–13457 (2022)
11. Castillo, A., García, P., Sanz, R., Albertos, P.: Enhanced extended state observer-based control for systems with mismatched uncertainties and disturbances. ISA Trans. 73, 1–10 (2018)
12. Levant, A.: Sliding order and sliding accuracy in sliding mode control. Int. J. Control 58(6), 1247–1263 (1993)
13. Gao, X., Weng, Y.: Chattering-free model free adaptive sliding mode control for gas collection process with data dropout. J. Process Control 93, 1–13 (2020)
14. Corradini, M.L.: A robust sliding-mode based data-driven model-free adaptive controller. IEEE Control Syst. Lett. 6, 421–427 (2021)
15. Wang, X., Li, X., Wang, J., Fang, X., Zhu, X.: Data-driven model-free adaptive sliding mode control for the multi degree-of-freedom robotic exoskeleton. Inf. Sci. 327, 246–257 (2016)
16. Meng, X., Yu, H., Wu, H., Xu, T.: Disturbance observer-based integral backstepping control for a two-tank liquid level system subject to external disturbances. Math. Probl. Eng. 2020, 161–174 (2020)
17. Meng, X., Yu, H., Zhang, J., Yang, Q.: A novel discrete sliding mode controller for MIMO complex nonlinear systems with uncertainty. In: 2022 China Automation Congress (CAC), Xiamen, China, pp. 259–263 (2022)

Quantum Illumination with Symmetric Non-Gaussian States

Wen-Yi Zhu1,2(B), Wei Zhong1, and Yu-Bo Sheng1,2

1 Institute of Quantum Information and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
[email protected]
2 College of Electronic and Optical Engineering & College of Flexible Electronics (Future Technology), Nanjing University of Posts and Telecommunications, Nanjing 210023, China

Abstract. Quantum illumination (QI) is a technique that uses entangled states to illuminate a target region surrounded by a bright thermal bath in order to detect low-reflectivity objects. While QI has been extensively studied in the context of Gaussian systems, we propose a new QI protocol that utilizes symmetric non-Gaussian states to probe the target. We analyze the target sensitivity using the quantum Fisher information (QFI) of the symmetric non-Gaussian states under various squeezing parameters, and the sensitivity of the measurement using the signal-to-noise ratio (SNR). Our results show that under certain conditions, symmetric non-Gaussian states can outperform Gaussian states in QI.

Keywords: quantum illumination · symmetric non-Gaussian states · quantum Fisher information · signal-to-noise ratio

1 Introduction

Entanglement is a fundamental resource in various quantum applications, including quantum teleportation [1], quantum error correction [2], and superdense coding [3]. However, entangled states are highly susceptible to degradation due to losses and environmental noise, making it a significant challenge to utilize them in noisy and lossy environments. To address this issue, S. Lloyd proposed the concept of QI, which involves irradiating a target region with a signal entangled with an ancilla and jointly measuring the reflected light with the ancilla [4]. Remarkably, QI retains an advantage over classical protocols with coherent states even if the final state of the reflected and idler lights is not entangled. Expanding on the pioneering work of [4], numerous theoretical and experimental investigations have been conducted in the field of QI [5–7]. One notable protocol, developed by Tan et al., utilized a two-mode squeezed vacuum state (TMSVS) and achieved a remarkable 6 dB gain in the error probability exponent compared to coherent states [8]. However, the TMSVS is a Gaussian probe, and recent works have suggested that non-Gaussian probes may perform better [9, 10].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 571–578, 2023. https://doi.org/10.1007/978-981-99-6187-0_56


In this paper, we propose a QI protocol that utilizes symmetric non-Gaussian states as probe states [11, 12]. We investigate the statistical properties of photon number and calculate the quantum Fisher information (QFI) for symmetric non-Gaussian states. Our findings reveal that under certain conditions, these states demonstrate an advantage over the commonly used TMSVSs in QI. This paper is organized as follows. In Sect. 2, we provide a brief introduction to QI. In Sect. 3, we analyze the statistical properties of photon number for symmetric non-Gaussian states. In Sect. 4, we derive the QFI and SNR of symmetric non-Gaussian states in the context of QI and conduct a comparative analysis with TMSVS. Finally, we present our conclusions in Sect. 5.

2 Quantum Illumination

The setup of the QI scheme is illustrated in Fig. 1; it aims to detect low-reflectivity objects in a bright noisy region. We assume that the transmitter is prepared in a two-mode pure state $|\psi\rangle$, where the signal mode is directed towards the potential target area while the idler mode is retained. The suspect object is generally modeled as a beam splitter $U_\eta = \exp\!\left[\eta\left(s^\dagger b - s b^\dagger\right)\right]$ with a small reflectivity $\eta \ll 1$. We assume that the environment is in a thermal state

$$\rho_B = \frac{1}{N_B+1}\sum_n \left(\frac{N_B}{N_B+1}\right)^n |n\rangle\langle n|$$

with mean photon number $N_B$. If the target is present ($\eta \neq 0$), the detected state at the receiver can be represented as $\rho_{IR}(\eta) = \mathrm{Tr}_S\!\left[U_\eta\left(|\psi\rangle_{SI}\langle\psi| \otimes \rho_B\right)U_\eta^\dagger\right]$. However, if the target is absent ($\eta = 0$), the detected state at the receiver is a separable state $\rho_{IR}(\eta) = \rho_I \otimes \rho_B$, where $\rho_I = \mathrm{Tr}_S\left(|\psi\rangle_{SI}\langle\psi|\right)$.

Fig. 1. A schematic diagram of quantum illumination. The transmitter sends the initial probe states to the beam splitter and the receiver respectively. After the beam splitter, the return mode including thermal noise and the signal mode is jointly measured with the idler mode at the receiver.
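As an illustrative aside (our addition, not from the paper), the thermal-bath model above can be checked numerically: the photon-number distribution $p_n = N_B^n/(N_B+1)^{n+1}$ appearing in $\rho_B$ should be normalized and have mean photon number $N_B$. A minimal Python sketch:

```python
def thermal_pn(NB, nmax):
    """Photon-number probabilities p_n = NB^n / (NB+1)^(n+1)
    of a thermal state with mean photon number NB, truncated at nmax."""
    return [NB ** n / (NB + 1) ** (n + 1) for n in range(nmax)]

p = thermal_pn(2.0, 400)
print(sum(p))                                  # ≈ 1.0 (normalization)
print(sum(n * pn for n, pn in enumerate(p)))   # ≈ 2.0 (mean equals NB)
```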


3 Symmetric Non-Gaussian States

A symmetric non-Gaussian state can be obtained by symmetrically adding or subtracting $l$ photons from each mode of the TMSVS, and can be expressed as

$$|z, \pm l\rangle = \sum_{N=n\pm l} D_N^{(\pm l)}\, |N\rangle_S |N\rangle_I, \qquad (1)$$

where

$$D_N^{(\pm l)} = \begin{cases} \mathcal{N}_{+l}^{-1/2}\,(1-z^2)^{1/2}\,\dfrac{N!}{(N-l)!}\,z^{N-l}, & N = n+l, \\[3mm] \mathcal{N}_{-l}^{-1/2}\,\dfrac{(N+l)!}{N!}\,z^{N+l}, & N = n-l, \end{cases}$$

$$\mathcal{N}_{\pm l} = \begin{cases} (l!)^4\,(1-z^2)^{-2l} \displaystyle\sum_{m=0}^{l} \frac{z^{2m}}{(m!)^2\,[(l-m)!]^2}, & N = n+l, \\[3mm] \displaystyle\sum_{m=0}^{\infty} \left[\frac{m!}{(m-l)!}\right]^2 |z|^{2m}, & N = n-l. \end{cases} \qquad (2)$$

Note that $z$ represents the squeezing parameter; the signs $+$ and $-$ denote photon-added TMSVSs (PA-TMSVSs) and photon-subtracted TMSVSs (PS-TMSVSs), respectively; and the subscripts $S$ and $I$ represent the signal mode and the idler mode, respectively. It is straightforward to obtain the mean photon number of each mode of the symmetric non-Gaussian states as

$$N_s = \langle z, \pm l|\, a^\dagger a \,|z, \pm l\rangle = \sum_{N=n\pm l} N \left|D_N^{(\pm l)}\right|^2. \qquad (3)$$

Figure 2 shows the average photon number of symmetric non-Gaussian states obtained by adding and subtracting photons. We observe that Ns converges to that of the TMSVS when l = 0. Additionally, we investigate the photon number of symmetric non-Gaussian states with photons added or subtracted in increments of 1, 2, and 3. Interestingly, the average photon number increases monotonically as a function of the squeezing parameter. We also note that both the photon-addition and photon-subtraction operations result in an increased average photon number. Furthermore, symmetric non-Gaussian states with added photons have a higher average photon number than those with subtracted photons for the same l.
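As a numerical cross-check (our addition, not part of the paper), the mean photon number of Eq. (3) can be evaluated directly from the unnormalized Fock-basis coefficients of Eq. (1), normalizing numerically so that the closed-form normalization of Eq. (2) is not needed. A sketch under that assumption:

```python
from math import perm

def mean_photon(z, l, sign, nmax=200):
    """Mean photon number per mode of |z, +l> (sign=+1, photon-added)
    or |z, -l> (sign=-1, photon-subtracted), by numerically normalizing
    the Fock-basis coefficients of Eq. (1)."""
    w = {}
    for N in range(nmax):
        if sign > 0:
            if N < l:
                continue                        # PA-TMSVS has support N >= l
            c = z ** (N - l) * perm(N, l)       # N!/(N-l)! * z^(N-l)
        else:
            c = z ** (N + l) * perm(N + l, l)   # (N+l)!/N! * z^(N+l)
        w[N] = c * c
    norm = sum(w.values())
    return sum(N * wN for N, wN in w.items()) / norm

z = 0.5
ns0 = mean_photon(z, 0, +1)   # l = 0 recovers the TMSVS value z^2/(1-z^2)
print(ns0)                    # → 0.333...
for l in (1, 2, 3):
    # photon addition gives a larger mean than subtraction; both beat TMSVS
    assert mean_photon(z, l, +1) > mean_photon(z, l, -1) > ns0
```

For z = 0.5 this reproduces the ordering of Fig. 2: the PA-TMSVS curves lie above the PS-TMSVS curves, and both lie above the TMSVS.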


Fig. 2. The average photon number versus the squeezing parameter z for PA-TMSVS (blue solid lines) and PS-TMSVS (red dashed lines) with l = 0, 1, 2, 3. The average photon number of TMSVS is presented by the black solid line.

4 Analysis of Quantum Illumination Performance

4.1 Target Sensitivity

In the context of QI, the QFI is a commonly used figure of merit to evaluate the performance of a quantum illumination protocol, where a higher QFI corresponds to better target sensitivity [13]. In the QI scenario, the QFI can be calculated using the following expression:

$$F_Q\big|_{\eta\to 0} = 2\sum_{k,l} \frac{\left|\langle\psi_k|\,\partial_\eta \rho_\eta\big|_{\eta\to 0}\,|\psi_l\rangle\right|^2}{\lambda_k + \lambda_l}, \qquad (4)$$

where $\lambda_k$ and $|\psi_k\rangle$ are the eigenvalues and the eigenstates of the detected state $\rho_\eta \equiv \rho_{IR}(\eta)$, respectively, with the summation restricted to $\lambda_k + \lambda_l \neq 0$. Using Eq. (4), we can derive the QFI for symmetric non-Gaussian states as follows:

$$F_Q = \frac{4\sum_{N}\left(D_N^{(\pm l)} D_{N-1}^{(\pm l)}\right)^2}{(1+N_B)\sum_{N}\left[\left(D_N^{(\pm l)}\right)^2 \dfrac{N_B}{N_B+1} + \left(D_{N-1}^{(\pm l)}\right)^2\right]}. \qquad (5)$$

In Fig. 3, we observe a significant increase in the QFI for both PA-TMSVSs and PS-TMSVSs as l increases, which is consistent with the trend shown in Fig. 2, in the sense that the number of photons in the signal mode has a direct impact on the enhancement of target sensitivity. It also shows that both types of non-Gaussian states provide better performance in QI than TMSVS under the same squeezing strength constraint. Additionally, we observe that the protocol using PA-TMSVSs yields a higher target sensitivity than the one using PS-TMSVSs when their QFI is compared under equal values of l.


Fig. 3. The QFI versus squeezing strength for PA-TMSVSs and PS-TMSVSs with l = 0, 1, 2, 3. The same color scheme used in Fig. 2 is applied here as well. The mean photon number of the bath is set to NB = 50.

4.2 Accessible Target Sensitivity

Below, we evaluate the accessible sensitivity in the aforementioned scenario by employing double homodyne detection (dHD) [14]. The measurement operator for dHD can be represented as

$$M_{\mathrm{dHD}} = a_R a_R^\dagger - a_R^\dagger a_I^\dagger - a_R a_I + a_I^\dagger a_I, \qquad (6)$$

where the subscripts $R$ and $I$ denote the reflected mode and the idler mode, respectively. In QI, the signal-to-noise ratio (SNR) is commonly used as a figure of merit to evaluate the accessible target sensitivity of a specific detection method. According to Ref. [14], the SNR for a specific measurement operator $M$ is defined by

$$\mathrm{SNR} \equiv \frac{\left(\langle M\rangle_0 - \langle M\rangle_1\right)^2}{2\left(\sqrt{\langle \Delta M^2\rangle_0} + \sqrt{\langle \Delta M^2\rangle_1}\right)^2}, \qquad (7)$$

where the subscript 0 (1) indicates that the target is absent (present), $\langle M\rangle$ is the expected value of the measurement operator, and $\langle \Delta M^2\rangle = \langle M^2\rangle - \langle M\rangle^2$ is its variance. Here, we compare the SNR of symmetric non-Gaussian states to that of the TMSVS with respect to the mean photon number of the bath $N_B$ and the reflection coefficient $\eta$, respectively. In Fig. 4, the SNR exhibits a monotonically decreasing trend as the mean photon number of the bath increases. This suggests that detecting the target becomes increasingly challenging in heavily noisy environments, where the SNR is lower. Symmetric non-Gaussian states provide a higher SNR than the TMSVS, in the sense that using symmetric non-Gaussian states enables more accurate detection of the target than using the TMSVS as the initial probe state. It is clear that the SNR increases when adding


or subtracting more photons from the TMSVS. Furthermore, we observe that employing PA-TMSVSs yields a higher SNR than employing PS-TMSVSs when comparing their SNRs under equal values of l.
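As a small illustration (ours, not the authors'), Eq. (7) reduces to a one-line function once the means and variances of the measurement operator in the target-absent and target-present cases are known; the moment values below are hypothetical and only exercise the formula:

```python
from math import sqrt

def snr(mean0, mean1, var0, var1):
    """SNR of Eq. (7): (<M>_0 - <M>_1)^2 / (2 (sqrt(Var_0) + sqrt(Var_1))^2).
    Subscript 0: target absent; subscript 1: target present."""
    return (mean0 - mean1) ** 2 / (2.0 * (sqrt(var0) + sqrt(var1)) ** 2)

# Hypothetical moments: unit mean separation, unit variances.
print(snr(1.0, 0.0, 1.0, 1.0))   # → 0.125
```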

Fig. 4. The SNR versus mean photon number of the bath NB for PA-TMSVSs and PS-TMSVSs with l = 0, 1, 2, 3. The same color scheme used in Fig. 2 is applied here as well. The reflection coefficient is set to η = 10−3 and the squeezing parameter to z = 0.5.

In addition, we compare the SNR of symmetric non-Gaussian states and the TMSVS with respect to the reflectance coefficient. As shown in Fig. 5, the SNR monotonically increases with the reflectance coefficient, indicating that a higher reflectance coefficient corresponds to a higher probability of detecting the object. Notably, the SNR of symmetric non-Gaussian states is superior to that of the TMSVS, suggesting that symmetric non-Gaussian states are preferable as the initial probe states. Among the symmetric non-Gaussian states with equal photon numbers, PA-TMSVSs exhibit better performance than PS-TMSVSs. In conclusion, we find that the enhanced performance of symmetric non-Gaussian states in QI stems from the increase in photon number produced by the photon-addition and photon-subtraction operations.


Fig. 5. The SNR versus reflectance coefficient η for PA-TMSVSs and PS-TMSVSs with l = 0, 1, 2, 3. The same color scheme used in Fig. 2 is applied here as well. The mean photon number of the bath is set to NB = 30 and the squeezing parameter to z = 0.5.

5 Conclusion

This paper presents an analysis of the photon-number statistics of symmetric non-Gaussian states. Moreover, we simulate the QFI and SNR in the context of QI under high-noise environments and low object reflectivity. Our results show that using symmetric non-Gaussian states yields a significant improvement in the performance of QI over TMSVSs under the same squeezing parameters. Additionally, we find that the enhancement in performance grows with the number of added or subtracted photons. Notably, when the same number of photons is added or subtracted, the PA-TMSVSs perform better than the PS-TMSVSs. Based on the findings presented in this paper, it is advisable to select the photon-added two-mode squeezed vacuum states as the initial probe state for optimal results in QI, and the efficacy of QI is observed to increase with the number of added photons.

References

1. Furusawa, A., et al.: Unconditional quantum teleportation. Science 282(5389), 706–709 (1998)
2. Pirandola, S., Eisert, J., Weedbrook, C., et al.: Advances in quantum teleportation. Nat. Photonics 9(10), 641–652 (2015)
3. Wang, C., Deng, F.G., Li, Y.S., et al.: Quantum secure direct communication with high-dimension quantum superdense coding. Phys. Rev. A 71(4), 044305 (2005)
4. Lloyd, S.: Enhanced sensitivity of photodetection via quantum illumination. Science 321(5895), 1463–1465 (2008)
5. Shapiro, J.H., Lloyd, S.: Quantum illumination versus coherent-state target detection. New J. Phys. 11(6), 063045 (2009)
6. Lopaeva, E.D., Berchera, I.R., Degiovanni, I.P., et al.: Experimental realization of quantum illumination. Phys. Rev. Lett. 110(15), 153603 (2013)
7. Xu, F., Zhang, X.M., Xu, L., et al.: Experimental quantum target detection approaching the fundamental Helstrom limit. Phys. Rev. Lett. 127(4), 040504 (2021)
8. Tan, S.H., Erkmen, B.I., Giovannetti, V., et al.: Quantum illumination with Gaussian states. Phys. Rev. Lett. 101(25), 253601 (2008)
9. Lee, S.Y., Ihn, Y.S., Kim, Z.: Quantum illumination via quantum-enhanced sensing. Phys. Rev. A 103(1), 012411 (2021)
10. Noh, C., Lee, C., Lee, S.Y.: Quantum illumination with definite photon-number entangled states. J. Opt. Soc. Am. B 39(5), 1316–1322 (2022)
11. Carranza, R., Gerry, C.C.: Photon-subtracted two-mode squeezed vacuum states and applications to quantum optical interferometry. J. Opt. Soc. Am. B 29(9), 2581–2587 (2012)
12. Ouyang, Y., Wang, S., Zhang, L.: Quantum optical interferometry via the photon-added two-mode squeezed vacuum states. J. Opt. Soc. Am. B 33(7), 1373–1381 (2016)
13. Sanz, M., Las Heras, U., García-Ripoll, J.J., et al.: Quantum estimation methods for quantum illumination. Phys. Rev. Lett. 118(7), 070803 (2017)
14. Jo, Y., Lee, S., Ihn, Y.S., et al.: Quantum illumination receiver using double homodyne detection. Phys. Rev. Res. 3(1), 013006 (2021)

Rank-Level Fusion of Multiple Biological Characteristics in Markov Chain

Qiankun Gao(B), Jie Chen, Xiao Xu, and Peng Zhang

The First Research Institute of M.P.S. PRC, No. 1 Capital Gymnasium South Road, Beijing 100048, China
[email protected]

Abstract. Rank-level fusion methods use the ranks from all individual matchers to derive a consensus rank in a biometric recognition system; they are simple and effective information fusion methods. To further improve the accuracy of the biometric recognition system, the Markov chain method is proposed and compared with the common rank-level fusion methods. This method constructs a Markov chain that meets the Condorcet criterion and then obtains two specific forms for calculating the consensus rank to realize rank-level fusion. Finally, an experiment using three single biometric recognition algorithms shows that the Markov chain method not only has higher accuracy than each single algorithm, but also has higher accuracy than the other rank-level fusion methods.

Keywords: Rank-level Fusion · Biometric Recognition System · Markov Chain Method · Condorcet Criterion

1 Introduction

When multiple biometric algorithms or matchers are used in a biometric recognition system for human recognition, information fusion is very important: a good information fusion method can reduce the impact of unreliable information sources. According to the fusion stage, information fusion can be divided into pre-matching fusion and post-matching fusion; pre-matching fusion includes sensor-level fusion and feature-level fusion, and post-matching fusion [1, 2] includes score-level fusion, rank-level fusion and decision-level fusion. Although fusion before matching is more reasonable in theory, considering the compatibility of the original data, the cost of hardware fusion, the additional cost of storing the original data, the difficulty of dimensionality reduction for high-dimensional data, the complexity of developing matching algorithms, etc. [3–5], such pre-matching fusion methods are very difficult to implement. At present, most multi-biometric recognition systems use post-matching fusion. In the score-level fusion method [6, 7], the scores from different algorithms or matchers may not share the same basic properties or score range, so it is necessary to normalize the scores. However, this process is very time-consuming and difficult, and if the normalization method is not appropriate, the recognition accuracy of the system

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 579–588, 2023. https://doi.org/10.1007/978-981-99-6187-0_57


will be very poor. In the decision-level fusion method [8], the final decision results of all the single biometric recognition algorithms or matchers are Boolean values, which contain only limited information, so the accuracy of the final comprehensive decision may not be high. In contrast, the rank-level fusion method only needs to consider the relative positions within the rank sequences of different algorithms or matchers, which makes it a simple and feasible approach.

2 Introduction of Rank-Level Fusion Methods

When the output sequence of each single algorithm or matcher is sorted in descending order of the matching score (or ascending order of the distance score), the multi-biometric recognition system can use a rank-level fusion method. The key step is to establish a specific rank-score calculation formula for the consensus rank sequence. The rank-level fusion method uses this formula to fuse the initial rank sequences output by each algorithm or matcher, and obtains the consensus rank sequence as the fusion result. At present, the commonly used rank-level fusion methods include the highest rank method, the Borda count method and the logistic regression method [9–12]. To facilitate the introduction of these methods, it is assumed that there are N samples whose biometric information is registered in each template feature database and K algorithms or matchers for human recognition. For a sample to be tested, according to the initial rank sequence output by each algorithm or matcher, we can obtain a corresponding sort matrix $R = (r_{n,k})$, in which $r_{n,k}$, $n = 1, 2, \ldots, N$; $k = 1, 2, \ldots, K$, represents the rank of registered sample n in the rank sequence output by algorithm or matcher k. It is assumed that each initial rank sequence lists all registered users in the template collection; otherwise, the initial rank sequence must first be completed, which will be described in detail later.

2.1 The Highest Rank Method

The highest rank method is conducive to the integration of a small number of specialized algorithms, so it can be effective when some individual algorithms perform well in multi-biometric recognition systems. The highest rank method sorts the registered samples according to their highest rank (i.e., the smallest rank number) across the initial rank sequences to get the consensus rank sequence. The score $R_n$ of the consensus rank sequence is calculated as follows:

$$R_n = \min_k r_{n,k} \qquad (1)$$

The rank in the consensus rank sequence is higher when the score $R_n$ is smaller. This method can take advantage of each algorithm: even if only one algorithm assigns the highest rank to the correct sample, the correct sample is likely to get the highest rank after reordering. However, there are likely to be many draws in the final sequence, and when a draw is broken randomly, it is possible to accept the incorrect decision of the weakest algorithm. Another disadvantage is that only the top position of any initial rank sequence is considered, which may lead to unreliable decision making in the multi-biometric recognition system, so this method may not be safe and reliable. We can use a disturbance factor to try to break the draws. Amend Eq. (1) as follows:

$$R_n = \min_k r_{n,k} + \varepsilon(n) \qquad (2)$$

in which

$$\varepsilon(n) = \frac{\sum_{k=1}^{K} r_{n,k}}{M} \qquad (3)$$

Here M is a large value used to generate a small perturbation term $\varepsilon(n)$, and $\varepsilon(n)$ combines all the initial rank information associated with a particular sample n.

2.2 The Borda Count Method

The Borda count method assumes that the assigned sequences are independent of each other and that the performance of each algorithm is similar. This method needs no training stage, so it is easy to implement. However, the assumption that all algorithms perform equally well, without considering the different performance of each single algorithm, is usually too strong, and it makes the Borda count method vulnerable to weak algorithms. In this method, the total Borda score $B_n$ is calculated from the sample's rank number in each initial rank sequence, and the consensus rank sequence is obtained by sorting $B_n$ in ascending order, i.e., the smaller the value of $B_n$, the higher the consensus rank. The equation for calculating the total Borda score $B_n$ is as follows:

$$B_n = \sum_{k=1}^{K} r_{n,k} \qquad (4)$$
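The highest rank method with the perturbation term of Eqs. (2)-(3) and the Borda count of Eq. (4) can be sketched in a few lines of Python (our illustration; the rank matrix R below is a made-up toy example):

```python
def highest_rank(R, M=1000.0):
    """Highest rank fusion, Eqs. (2)-(3): score = min rank plus a small
    perturbation sum(ranks)/M that breaks draws. R[n][k] is the rank of
    registered sample n given by matcher k (1 = best). Returns sample
    indices ordered best-first."""
    score = [min(ranks) + sum(ranks) / M for ranks in R]
    return sorted(range(len(R)), key=lambda n: score[n])

def borda_count(R):
    """Borda count fusion, Eq. (4): ascending sum of ranks."""
    return sorted(range(len(R)), key=lambda n: sum(R[n]))

# Toy rank matrix: 4 registered samples, K = 3 matchers.
R = [[2, 2, 2],
     [1, 3, 3],
     [3, 1, 4],
     [4, 4, 1]]
print(highest_rank(R))   # → [1, 2, 3, 0]: each of samples 1-3 tops a matcher
print(borda_count(R))    # → [0, 1, 2, 3]: sample 0 has the smallest rank sum
```

The example shows how the two methods can disagree: the highest rank method favors any sample that some matcher ranks first, while the Borda count favors consistently good ranks.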

The performance of this method can be improved by discarding the worst rank number among all algorithm outputs, i.e., setting it to zero:

$$\max_k r_{n,k} \leftarrow 0 \qquad (5)$$

2.3 The Logistic Regression Method

The logistic regression method is a generalization of the Borda count method. By calculating the weighted sum $L_n(\omega)$ of each initial sequence, the consensus rank sequence is obtained by sorting $L_n(\omega)$ in ascending order, i.e., the smaller the weighted sum $L_n(\omega)$, the higher the ranking. The weighted sum $L_n(\omega)$ is calculated as follows:

$$L_n(\omega) = \sum_{k=1}^{K} \omega_k r_{n,k} \qquad (6)$$


in which $\omega \triangleq (\omega_1, \ldots, \omega_k, \ldots, \omega_K)$, where $\omega_k$ is the weight of algorithm k. In the training stage, it is necessary to optimize the system through several trial runs and application attempts, using the logistic regression model to optimize the recognition performance of each algorithm and allocate the corresponding weight. The smaller the weight, the better the recognition performance. Different algorithms generally differ markedly in recognition accuracy, so this method is very useful. Since the performance of a single biometric recognition algorithm varies with the quality of the input sample sets, the weight allocation process is challenging and time-consuming, and unreasonable weight allocation will also reduce the overall performance of the multi-biometric recognition system. Therefore, in some cases, this method cannot be used for rank-level fusion.
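For completeness, Eq. (6) is a one-line weighted sum once the per-matcher weights have been trained (our sketch; the weights below are made up):

```python
def weighted_rank(R, w):
    """Logistic regression fusion, Eq. (6): L_n = sum_k w_k * r_nk,
    sorted ascending. R[n][k] is the rank of sample n from matcher k;
    w[k] is the (assumed pre-trained) weight of matcher k."""
    score = [sum(wk * rk for wk, rk in zip(w, ranks)) for ranks in R]
    return sorted(range(len(R)), key=lambda n: score[n])

R = [[2, 2, 2], [1, 3, 3], [3, 1, 4], [4, 4, 1]]
print(weighted_rank(R, [0.2, 0.3, 0.5]))   # → [0, 3, 1, 2]
```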

3 Markov Chain Method for Rank-Level Fusion

Usually, the template database registered with biometrics is very large, and each algorithm outputs only the first several results as a rank sequence. Therefore, in the multi-biometric recognition system, some results are ranked first by several algorithms while others are not output at all; in this case, the logistic regression method does not give good recognition performance. To remedy the shortcomings of the existing rank-level fusion methods, this section applies the Markov chain method to multi-biometric rank-level fusion, which can be effectively applied to systems with large-scale data sets of different quality. We first introduce the relevant background on Markov chains, then introduce the Condorcet criterion, and finally define the state transition rules to construct a homogeneous Markov chain, obtaining two specific forms of the Markov chain method for multi-biometric rank-level fusion.

3.1 Markov Chain

A Markov chain is a stochastic process with discrete time space and state space that is memoryless; it is defined in the literature [13]. It is assumed that a discrete stochastic process $\{X_n \mid n = 0, 1, 2, \ldots\}$ has an at most countable state space $S = \{s_1, s_2, \ldots\}$. For any integer n and any $i_0, i_1, \ldots, i_{n+1} \in S$, if the conditional probability satisfies

$$P\{X_{n+1} = i_{n+1} \mid X_0 = i_0, X_1 = i_1, \ldots, X_n = i_n\} = P\{X_{n+1} = i_{n+1} \mid X_n = i_n\}, \qquad (7)$$

then $\{X_n \mid n = 0, 1, 2, \ldots\}$ is called a Markov chain. Equation (7) is the mathematical expression of the memorylessness of a Markov chain, i.e., given the random variable at step n, the random variable at step n + 1 is independent of the random variables earlier than step n (the historical states). When $i_n = i$ and $i_{n+1} = j$, the intuitive meaning of the conditional probability $P\{X_{n+1} = j \mid X_n = i\}$ is the probability that the system transfers from state i at time (or step) n to state j at the next time (or step) n + 1; we call $P\{X_{n+1} = j \mid X_n = i\}$ the (one-step) transition probability $p_{ij}(n)$. The transition matrix P is defined as follows:

$$P \triangleq \left(p_{ij}(n)\right) \qquad (8)$$


In general, the transition probability is related not only to the states i and j but also to the time n. When $p_{ij}(n)$ is independent of n, the transition probability is written $p_{ij}$ and the Markov chain is called homogeneous. A Markov chain completely determines its transition matrix, and the transition matrix completely determines the Markov chain. The transition matrix can also be visualized by a transition graph, where the node set is S and the weight of each directed edge is $p_{ij}$.

3.2 The Condorcet Criterion

The Condorcet criterion [14], also known as pairwise voting or paired voting, states that if there is a choice that can win every pairwise vote against each of the other choices, it should be considered the winner of the election; this choice is called the Condorcet winner. The process can be summarized as a pairwise comparison of all alternatives: the decision-making group first selects two schemes at random to vote on and chooses the one that gets more than half of the votes, then compares the winner with another alternative, and votes in turn until the Condorcet winner is selected.

3.3 The Markov Chain Method

The rank-level fusion of multi-biometric recognition systems is similar to a voting mechanism in which each individual algorithm or matcher is considered an independent voter, and the initial rank sequences are the voters' results. It is very important to ensure the fairness of the voting system, and the most important fairness standard in modern voting systems is the Condorcet criterion. Therefore, when we design the rank-level fusion method of a multi-biometric recognition system, it is necessary to find a suitable method that meets the Condorcet criterion. The consensus rank sequence obtained by the existing methods may violate the Condorcet criterion, whereas the Markov chain method can meet it.
In the Markov chain method, for a sample to be tested, the state space consists of the set of all samples identified in the initial rank sequences obtained by each algorithm or matcher. $X_n$ represents one registered sample of the state space S at time n, so for every time $n = 0, 1, 2, \ldots$, $X_n$ is a random variable, and $\{X_n \mid n = 0, 1, 2, \ldots\}$ is a stochastic process with discrete time space and state space. According to the Condorcet criterion, the state transition rules are defined as follows: suppose the current state is i, $i \in S$; we randomly choose a state j, $j \in S$, with equal probability. If j is ranked higher than i (i.e., the rank number of j is less than that of i) in more than half of the algorithms or matchers, the state at the next time transfers to j; otherwise, it remains in state i. These rules make the conditional probability of $\{X_n \mid n = 0, 1, 2, \ldots\}$ satisfy Eq. (7), i.e., the process is memoryless, and its transition probability $p_{ij}(n)$ is independent of the time n. Therefore, we have constructed a homogeneous Markov chain that satisfies the Condorcet criterion, and the transition matrix can be calculated according to the state transition rules. We input one sample to be tested into the different algorithms or matchers and get the initial rank sequences. It should be noted that the length of the final consensus rank sequence is the total number of registered samples across all initial rank sequences; therefore, the samples not listed in an initial rank sequence need to be supplemented first. There are


two approaches. The first is random insertion, in which we successively insert the unlisted samples at the end of an initial rank sequence. The second uses the relative positions of the unlisted samples (such as their Borda scores) to sort them and form a complete rank sequence. Usually the second approach is used, and the first can be used for random selection when a draw occurs. After the initial rank sequences are completed, the transition matrix P can be calculated. The specific process is as follows: for one state (i.e., one registered sample) i, $i \in S$, let J(i) denote the set of all states, other than i, that satisfy the state transition rule with respect to i, and let |J(i)| denote its size. Then $p_{ij}$ can be calculated according to the following formulas:

$$\sum_{j\in S} p_{ij} = 1, \qquad (9)$$

$$p_{ii} = p_{ij}, \quad \forall j \in J(i), \qquad (10)$$

$$p_{ij} = 0, \quad \forall j \in S - J(i) - \{i\}. \qquad (11)$$

Further, we can get:

$$p_{ii} = p_{ij} = \frac{1}{|J(i)| + 1}, \quad \forall j \in J(i). \qquad (12)$$

So we get the whole transition matrix P. Next, we use the transition matrix P to obtain the consensus rank sequence. There are two specific approaches. In the first approach, for one state (i.e., one registered sample) j, $j \in S$, the consensus rank score $s_j$ is calculated by subtracting the sum of the j-th row's elements (which equals 1) from the sum of the j-th column's elements of the transition matrix P; equivalently, it is the difference between the in-degree and the out-degree of node j in the corresponding transition graph:

$$s_j = \sum_{i\in S} p_{ij} - \sum_{i\in S} p_{ji} = \sum_{i\in S} p_{ij} - 1. \qquad (13)$$

According to the order of $s_j$ from large to small, the consensus rank sequence can be obtained. In the second approach, the consensus rank score $s_j$ is calculated by subtracting the sum of the j-th row's elements from the sum of the j-th column's elements of the matrix W; that is, $s_j$ is equal to the number of edges whose ending point is node j minus the number of edges whose starting point is node j in the Markov chain transition graph:

$$s_j = \sum_{i\in S} \omega(i, j) - \sum_{i\in S} \omega(j, i), \qquad (14)$$

where

$$\omega_{ij} \triangleq \begin{cases} 1, & p_{ij} > 0, \\ 0, & p_{ij} = 0, \end{cases} \qquad (15)$$

$$W \triangleq \left(\omega_{ij}\right). \qquad (16)$$
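The construction of Eqs. (9)-(13) can be sketched in Python (our illustration; the pairwise-majority test plays the role of the Condorcet-style transition rule, and each completed sequence is assumed to list every sample id):

```python
def markov_consensus(sequences):
    """Markov chain rank-level fusion, first form (Eq. (13)).

    sequences: K completed rank sequences, each a best-to-worst list of
    the same sample ids. Builds the homogeneous transition matrix of
    Eqs. (9)-(12) from the pairwise-majority rule, then scores each
    sample by column sum minus row sum of P (row sums equal 1)."""
    ids = sorted(sequences[0])
    n, K = len(ids), len(sequences)
    pos = [{s: r for r, s in enumerate(seq)} for seq in sequences]
    P = [[0.0] * n for _ in range(n)]
    for a, i in enumerate(ids):
        # J(i): states ranked above i by more than half of the matchers
        J = [b for b, j in enumerate(ids) if b != a
             and sum(pos[k][j] < pos[k][i] for k in range(K)) > K / 2]
        p = 1.0 / (len(J) + 1)          # Eq. (12)
        P[a][a] = p
        for b in J:
            P[a][b] = p
    col = [sum(P[a][b] for a in range(n)) for b in range(n)]
    s = [col[b] - 1.0 for b in range(n)]        # Eq. (13)
    return [ids[b] for b in sorted(range(n), key=lambda b: -s[b])]

# Three matchers, three samples; 0 beats 1 and 2, and 1 beats 2 pairwise.
print(markov_consensus([[0, 1, 2], [0, 2, 1], [1, 0, 2]]))   # → [0, 1, 2]
```

By construction the Condorcet winner accumulates the largest in-degree in the transition graph, so it heads the consensus sequence.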

There are many advantages of the Markov chain method for rank-level fusion. It can handle the initial rank sequences well and provides a more comprehensive comparison between candidate samples when the candidate samples are only a small part of the registration template database. It can also be applied when the initial rank sequences differ greatly from one another. Therefore, it is widely used in the rank-level fusion of multi-biometric recognition systems.

4 Experiment

In this section, we demonstrate the performance of the Markov chain rank-level fusion method, the other rank-level fusion methods and three single biometric recognition algorithms through specific experiments.

4.1 Experimental Data Preparation and Processing Stage

Assuming that the different algorithms or matchers are independent of each other, and in order to ensure the security and privacy of the experimental data, we select equal numbers of three biometrics, establish a one-to-one correspondence to build a virtual test sample set, and input the corresponding biometrics of a sample to be tested into the three single algorithms or matchers to obtain three initial rank sequences. To compare the recognition performance of the fusion methods against the three single algorithms, the three initial rank sequences should differ to some extent, and at least one sequence must contain the sample to be tested. When completing the initial rank sequence output by one algorithm, we insert the comparison results of the other two algorithms at its end in turn, keeping only the entry with the lowest rank number when the same comparison result appears more than once; thus each completed initial rank sequence contains the sample to be tested, and the virtual test sample in this experiment is a positive sample. Finally, the consensus rank sequence is obtained by using the different rank-level fusion methods. The specific experimental process is shown in Fig. 1.

4.2 Analysis of Experimental Results

We use the top-k hit rate top(k) as the evaluation index [15]. According to the completed initial rank sequences and the consensus rank sequence obtained by each fusion method, the Cumulative Match Characteristic (CMC) curves of the corresponding single algorithms or rank-level fusion methods can be drawn, as shown in Figs. 2 and 3, respectively. In Fig.
2, the first hit rates top(1) of the two forms of Markov chain method are 94.76% and 94.37% respectively, and the first hit rates top(1) of the three single algorithms are 81.55%, 74.76% and 85.05% respectively. By comparing the CMC curve trends of the two forms of Markov chain method and the three single algorithms, it can be seen that

Q. Gao et al.

[Fig. 1 is a flowchart: biometrics 1, 2 and 3 of a sample to be tested are input to feature extraction algorithms/matchers 1, 2 and 3, each matched against its own template database; the three initial rank sequences output by matching are completed (extended from N1 to N identities) and input to the rank-level fusion methods (highest rank, Borda count, logistic regression, Markov chain), which output the consensus rank sequence.]

Fig. 1. The experimental process of rank-level fusion methods

the recognition performance of the Markov chain method is superior to that of the three single algorithms. In Fig. 3, the first hit rates top(1) of the highest rank method, the Borda count method and the logistic regression method are 91.07%, 92.04% and 93.40% respectively, all lower than the first hit rates of the Markov chain method, 94.76% and 94.37%. Comparing the CMC curve trends of the different rank-level fusion methods, the recognition performance in this experiment, from high to low, is: the Markov chain method, the logistic regression method, the Borda count method and the highest rank method. There is a slight difference in the recognition results of individual test samples between the two forms of the Markov chain method, so their two CMC curves are almost the same but not completely coincident.
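For reference, top(k) and the CMC curve can be computed directly from the rank at which each test sample's true identity appears in a rank sequence; a minimal sketch, where the example ranks are hypothetical:

```python
def topk_hit_rate(true_ranks, k):
    """Fraction of probes whose true identity appears at rank <= k."""
    return sum(1 for r in true_ranks if r <= k) / len(true_ranks)

def cmc_curve(true_ranks, max_rank):
    """CMC curve: top(k) for k = 1..max_rank (non-decreasing by construction)."""
    return [topk_hit_rate(true_ranks, k) for k in range(1, max_rank + 1)]

# Hypothetical ranks at which the true identity was returned for 5 probes
ranks = [1, 1, 2, 3, 1]
print(topk_hit_rate(ranks, 1))  # 0.6
print(cmc_curve(ranks, 3))      # [0.6, 0.8, 1.0]
```

Plotting top(k) against k yields exactly the CMC curves compared in Figs. 2 and 3.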

Rank-Level Fusion of Multiple Biological Characteristics


Fig. 2. The CMC curves of Markov chain method and three single algorithms

Fig. 3. The CMC curves of Markov chain method and other three rank-level fusion methods

5 Conclusion

This paper reviews three rank-level fusion methods: the highest rank method, the Borda count method and the logistic regression method, and introduces the rank-level fusion method based on the Markov chain. Starting from the definition of a Markov chain, it constructs a state transition rule that satisfies the Condorcet criterion and derives two specific forms of the Markov chain method. The performance advantages of the Markov chain method over the other three rank-level fusion methods and the three single algorithms are verified by specific experiments.
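The Condorcet-style Markov chain idea can be illustrated with a small sketch. The transition rule below (from a candidate, move to one that a majority of the rank sequences prefer, then rank candidates by the stationary distribution) is one common variant from the rank-aggregation literature, not necessarily the exact rule of this paper, and the candidate lists are hypothetical:

```python
import numpy as np

def markov_chain_fusion(rank_lists, n, damping=0.05, iters=200):
    """Fuse rank sequences over candidates 0..n-1 into a consensus ranking.

    Illustrative transition rule: from candidate a, move uniformly to a
    candidate b that a MAJORITY of the input sequences rank better than a
    (Condorcet-style pairwise duels); if none exists, stay at a.
    """
    # pos[l][c] = rank of candidate c in list l (smaller = better)
    pos = [{c: r for r, c in enumerate(lst)} for lst in rank_lists]
    P = np.zeros((n, n))
    for a in range(n):
        better = [b for b in range(n) if b != a
                  and sum(p[b] < p[a] for p in pos) > len(pos) / 2]
        if better:
            for b in better:
                P[a, b] = 1.0 / len(better)
        else:
            P[a, a] = 1.0
    # Small uniform damping guarantees a unique stationary distribution
    P = (1 - damping) * P + damping / n
    pi = np.full(n, 1.0 / n)
    for _ in range(iters):
        pi = pi @ P          # power iteration toward the stationary vector
    return [int(i) for i in np.argsort(-pi)]  # best candidate first

lists = [[0, 1, 2, 3], [1, 0, 2, 3], [0, 2, 1, 3]]
print(markov_chain_fusion(lists, 4)[0])  # 0 (wins every majority duel)
```

A Condorcet winner, preferred to every rival by a majority of sequences, becomes a near-absorbing state and tops the consensus ranking, which is exactly the property the Condorcet criterion demands.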


References

1. Kaur, G., Bhushan, S., Singh, D.: Fusion in multimodal biometric system: a review. Indian J. Sci. Technol. 10(28), 1–10 (2017)
2. Sahoo, S.K., Choubisa, T., Prasanna, S.R.M.: Multimodal biometric person authentication: a review. IETE Tech. Rev. 29(1), 54–75 (2012)
3. Wan, K., Min, S.J., Ryoung, P.K.: Multimodal biometric recognition based on convolutional neural network by the fusion of finger-vein and finger shape using near-infrared (NIR) camera sensor. Sensors 18(7), 2296–2329 (2018)
4. Karthiga, R., Mangai, S.: Feature selection using multi-objective modified genetic algorithm in multimodal biometric system. J. Med. Syst. 43(7), 214–224 (2019)
5. Xin, Y., Kong, L., Liu, Z., et al.: Multimodal feature-level fusion for biometrics identification system on IoMT platform. IEEE Access 6, 21418–21426 (2018)
6. Wang, J., Borji, A., Jay Kuo, C.C., et al.: Learning a combined model of visual saliency for fixation prediction. IEEE Trans. Image Process. 25(4), 1566–1579 (2016)
7. Dwivedi, R., Dey, S.: Score-level fusion for cancelable multi-biometric verification. Pattern Recogn. Lett. 126, 58–67 (2019)
8. Garg, S.N., Vig, R., Gupta, S.: Multimodal biometric system based on decision level fusion. In: 2016 International Conference on Signal Processing, Communication, Power and Embedded System (SCOPES), pp. 753–758. IEEE (2016)
9. Monwar, M.M., Gavrilova, M.L.: Multimodal biometric system using rank-level fusion approach. IEEE Trans. Syst. Man Cybern. B Cybern. 39(4), 867–878 (2009)
10. Abaza, A., Ross, A.: Quality based rank-level fusion in multibiometric systems. In: Proc. 3rd IEEE International Conference on Biometrics: Theory, Applications and Systems, pp. 1–6. IEEE (2009)
11. Gunasekaran, K., Raja, J., Pitchai, R.: Deep multimodal biometric recognition using contourlet derivative weighted rank fusion with human face, fingerprint and iris images. Automatika 60(3), 253–265 (2019)
12. Wang, P.S.P.: Pattern Recognition, Machine Intelligence and Biometrics, pp. 656–673. Springer, Heidelberg (2011)
13. Liu, C.H.: Stochastic Processes, 50th edn., pp. 41–70. Huazhong University of Science and Technology Press, Wuhan (2017)
14. Gehrlein, W.V.: The Condorcet criterion and committee selection. Math. Soc. Sci. 10(3), 199–209 (1985)
15. Gao, Q.K., Zhang, P., Xu, P., et al.: Overview of algorithm performance indicators in the biometric recognition system. Modern Computer 28(5), 9–12, 47 (2020)

PCB Defect Detection Algorithm Based on Multi-scale Fusion Network

Xiaofei Liao1,2, Xuance Su1(B), Guangyu Li1, and Bohang Chao1

1 College of Information Science and Technology, Donghua University, Shanghai 201620, China
[email protected]
2 College of Information Science and Technology, Engineering Research Center of Digitized Textile and Fashion Technology, Ministry of Education, Donghua University, Shanghai 201620, China

Abstract. PCB defect detection is a crucial link in the production of PCB boards: to pursue a high yield, the production chain must include an efficient PCB defect detection method. Traditional PCB defect detection adopts image processing algorithms, comparing the detected image with a standard image and determining the PCB defect type according to preset parameters. The disadvantage of this approach is the poor generalization and robustness of the detection algorithm. To solve these problems, this paper proposes a PCB defect detection model based on an improved CSPDarknet53. First, the PCB dataset is preprocessed; then a multi-scale feature fusion method is used to let adjacent feature layers of the network complement each other's information, so as to retain more features. The experimental results show that the average accuracy of the proposed model in PCB defect detection reaches 97.4%, with an average detection speed of 43.8 ms per image. The model can effectively identify various types of PCB defects and has practical application value for the quality inspection of industrial PCB boards.

Keywords: Defect detection · PCB board · Convolutional network · Feature fusion

1 Introduction

Printed circuit board (PCB) defect detection has become an important part of circuit board production. The traditional detection methods are electrical characteristics testing and manual visual inspection [1]; both have been used for many years, but their shortcomings are obvious: they require substantial manpower and material resources, workers suffer visual fatigue, and they no longer meet the needs of automated production. With the accelerating innovation of circuit boards, market demand is also increasing year by year, so more accurate, faster and lower-cost defect detection methods are needed. Deep learning detection methods [2] not only conform to the current development trend of artificial intelligence but can also better solve these

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 589–601, 2023. https://doi.org/10.1007/978-981-99-6187-0_58


problems. V. A. Adibhatla et al. [3] used a convolutional neural network to classify PCB images, but with low accuracy. On this basis, Qu et al. used convolutional neural networks combined with transfer learning to detect defects in PCB images, but the types of defects detected were few. Li Xuelu et al. added a feature pyramid to the Faster-RCNN network and optimized the loss function to detect multiple defects on bare PCB boards, but the dataset was small and needed to be enlarged. Xie Haofei et al. strengthened feature fusion in the YOLOv4 network and combined K-means clustering to improve the network model for defect detection on circuit board components. To further improve the detection accuracy and speed for PCB boards and reduce the false detection rate, this study uses the cutting-edge lightweight network CSPDarknet53 [4] and enhances the overall performance of the model with a multi-scale feature fusion method, so as to realize efficient detection of various PCB defects, which has practical significance for quality control in the PCB production process.

2 Related Work

2.1 Residual Network Model Structure

The Residual Network (ResNet) [5] model was proposed by He Kaiming and his team in 2015. Once this concept came out, it caused an uproar in the field of computer vision: that year, He and ResNet broke records in image classification and detection competitions such as ILSVRC and COCO. Compared with previous well-known networks such as AlexNet, VGG and Inception [6–8], the biggest advantage of ResNet is that it maintains good performance when training deeper networks, which earlier networks could not achieve. In general, the greater the width and depth of a network, the better it extracts complex features. However, a large number of experiments found that as the network gets deeper, the gradient tends to vanish when the parameters of the shallow layers are updated during training, so those parameters cannot be updated and the model degrades. The residual model solves this problem, and its principle is relatively simple, as shown in Fig. 1:

Fig. 1. Residual module.

PCB Defect Detection Algorithm Based on Multi-Scale

591

The residual module is given by formula (1):

$$y_k = h(x_k) + F(x_k, W_k), \qquad x_{k+1} = f(y_k) \tag{1}$$

Here $x_k$ denotes the input of the $k$-th residual module, $x_{k+1}$ its output, $F$ the residual function, $h(x_k) = x_k$ the identity mapping, and $f$ the ReLU activation function. From (1), the relationship from layer $k$ to layer $K$ can be obtained, as shown in formula (2):

$$x_K = x_k + \sum_{i=k}^{K-1} F(x_i, W_i) \tag{2}$$

This is the forward-propagation formula. For back propagation, the calculation is given by formula (3):

$$\frac{\partial \mathrm{Loss}}{\partial x_k} = \frac{\partial \mathrm{Loss}}{\partial x_K} \cdot \frac{\partial x_K}{\partial x_k} = \frac{\partial \mathrm{Loss}}{\partial x_K} \cdot \left( 1 + \frac{\partial}{\partial x_k} \sum_{i=k}^{K-1} F(x_i, W_i) \right) \tag{3}$$

Here $\partial \mathrm{Loss}/\partial x_k$ is the gradient of the loss. From the factor $\left(1 + \frac{\partial}{\partial x_k}\sum_{i=k}^{K-1} F(x_i, W_i)\right)$ it can be seen that no matter how small the derivative of the sum becomes, the constant 1 remains; moreover, the multiplicative chain of the original chain rule is turned into an additive form. This ensures that neither gradient vanishing nor gradient explosion occurs when a node updates its parameters, which greatly improves the robustness of the network.

2.2 CSPDarknet53 Network Structure

The CSPDarknet network integrates the CSP structure [9] into the Darknet53 [10] network. Darknet53 is a classic deep network: it incorporates the characteristics of the ResNet network described above, so while efficiently extracting features it also avoids the gradient explosion or vanishing caused by an overly deep network. The CSP structure is somewhat similar to the residual network introduced above: it is also divided into two parts, a residual edge and a backbone path. The backbone path repeatedly stacks residual blocks, while the other part, similar to a residual edge, is connected to the end of the backbone path after a small amount of processing. Its schematic diagram is shown in Fig. 2. The final overall structure of CSPDarknet53 is shown in Fig. 3. The backbone network contains 5 CSP modules. The downsampling of each CSP module is realized by a 3 × 3 convolution kernel, which improves the feature extraction ability of the network and quickly reduces the dimension of the feature map without losing detection accuracy. This excellent property is why advanced target detection algorithms such as YOLOX use CSPDarknet53 as the backbone network.
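The split-transform-merge idea of a CSP stage can be sketched without a deep learning framework. The toy residual function, the even channel split, and all sizes below are illustrative assumptions, and the transition convolutions of the real CSP block are omitted:

```python
import numpy as np

def residual_block(x, W):
    """Toy residual block, y = x + F(x), with F a ReLU-activated linear map."""
    return x + np.maximum(0.0, x @ W)

def csp_stage(x, weights):
    """Toy CSP stage: split channels in half, run stacked residual blocks on
    one half, pass the other half through unchanged, then concatenate."""
    c = x.shape[-1] // 2
    main, shortcut = x[..., :c], x[..., c:]
    for W in weights:
        main = residual_block(main, W)
    return np.concatenate([main, shortcut], axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))                       # batch of 4, 8 "channels"
weights = [rng.standard_normal((4, 4)) * 0.1 for _ in range(3)]
y = csp_stage(x, weights)
print(y.shape)                              # (4, 8)
print(np.allclose(y[..., 4:], x[..., 4:]))  # True: shortcut half is untouched
```

The gradient flows to the shortcut half through an identity, which is the CSP/residual property the derivation of formula (3) relies on.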


Fig. 2. ResNet structure before and after CSP is added

Fig. 3. CSPDarknet53 network structure diagram

2.3 Multi-Scale Feature Fusion

From the CSPDarknet53 network introduced above, even a very deep network can extract features well without increasing the computational cost of the model. However, although a deeper network yields richer semantic information about the image, it also loses some detailed information about the target: after multiple convolutions and pooling compressions, the resolution of the feature map is already very low and only a small amount of defect information is retained, which is not conducive to feature extraction. For tiny targets such as PCB defects [11], both low-level high-resolution detail information and high-level strong semantic information are very important. Based on this analysis, we propose to make the CSPDarknet53 network fuse the abstract features extracted


by the deep neural network with the adjacent feature maps of different scales [12]. Low-level networks combine feature layers of larger resolution with high-level semantics passed from top to bottom, so that each feature layer can be fully utilized [13].

Fig. 4. Structure diagram combining attention mechanism and multi-scale feature fusion

The overall feature extraction pipeline of the improved CSPDarknet53 backbone is shown in Fig. 4. First, a convolutional block attention module (CBAM) [14] is added to the first convolutional layer of the CSPDarknet53 backbone. It is a lightweight, plug-and-play module that makes the network pay more attention to the main features of the detected target and improves detection accuracy. Then comes the feature fusion process: starting from the Conv2_3 feature layer, a feature map with width and height halved is obtained by downsampling; after passing through a 1×1 convolution kernel, it is concatenated with the Conv3_3 feature map. The resulting feature map is again downsampled to half width and height and, after a 1×1 convolution, concatenated with the Conv4_3 feature map. The downsampling and 1×1 convolution are repeated once more, and the result is concatenated with the Conv5_3 feature map. Finally, these layers are combined to obtain a feature map of size 19×19×2048. Max-pooling is used for downsampling to keep the shape of the low-level feature map consistent with that of the high-level feature map. The role of the 1×1 convolution is to reduce the computation in the channel dimension, extract image features, and prepare the concat connection with the next feature layer. This process combines the feature maps carrying adjacent low-level detail information with the feature maps carrying high-level strong semantic information to generate a brand-new feature map, which contains both the global detail information of the target and the strong feature information obtained after multiple convolutions. In comparative tests, the improved model raised the Top-1 accuracy of identifying defects by about five percentage points over the original.
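One downsample-then-1×1-convolution-then-concatenate step can be sketched in a toy framework-free style. The layer sizes are hypothetical, and the placement of the 1×1 convolution follows one plausible reading of the description above:

```python
import numpy as np

def max_pool_2x2(x):
    """Downsample an (H, W, C) feature map by 2 with max pooling."""
    h, w, c = x.shape
    return x.reshape(h // 2, 2, w // 2, 2, c).max(axis=(1, 3))

def conv_1x1(x, W):
    """1x1 convolution = per-pixel channel mixing: (H, W, Cin) @ (Cin, Cout)."""
    return x @ W

def fuse(low, high, W):
    """Downsample the low-level map, mix channels with a 1x1 conv, then
    concatenate with the adjacent higher-level map along the channel axis."""
    return np.concatenate([conv_1x1(max_pool_2x2(low), W), high], axis=-1)

rng = np.random.default_rng(0)
low = rng.standard_normal((8, 8, 16))    # stand-in for a Conv2_3-like layer
high = rng.standard_normal((4, 4, 32))   # stand-in for a Conv3_3-like layer
W = rng.standard_normal((16, 32))        # 1x1 kernel: 16 -> 32 channels
print(fuse(low, high, W).shape)          # (4, 4, 64)
```

Repeating this step down the backbone and concatenating with Conv4_3 and Conv5_3 in turn yields the final fused map described in the text.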


3 Experiment and Result Analysis

3.1 Experimental Environment Configuration

The platform for this experiment was a rented GPU server on the AutoDL platform, with an Intel(R) Xeon(R) Gold [email protected] CPU, an RTX A4000 GPU, 16 GB of video memory and 32 GB of RAM. The development environment was Anaconda3 and PyCharm 2021.2.3, the GPU acceleration packages were CUDA 11.3 and cuDNN v8.4.1, and the deep learning framework was PyTorch 1.11.0.

3.2 Data Preprocessing

The dataset used in this experiment is an enhanced version of the PCB dataset of the Open Laboratory of Intelligent Robotics of Peking University, containing a total of 10,668 pictures of 600×600 pixels with annotation files in VOC XML format. The six defect types included are shown in Fig. 5: missing hole, mouse bite, open circuit, short, spur and spurious copper. The image annotation tool labelImg is used to mark the six types of defects on the PCB board and generate the corresponding XML file, which mainly contains the image name, the width and height of the image, the annotation target, the category label, the annotation box, and the location information xmin, ymin, xmax and ymax of the defects on the board. The data is randomly split in the ratio (training set + validation set) : test set = 9 : 1, and training set : validation set = 9 : 1. There are 8641 images in the training set, 1067 images in the

[Fig. 5 panels: (a) Missing_hole, (b) Mouse_bite, (c) Open_circuit, (d) Short, (e) Spur, (f) Spurious_copper]

Fig. 5. Six Common Defects on PCB Surface


test set, and 960 images in the validation set. Before training, the VOC dataset is initialized with the class names, their number, and the training path. Since the dataset consists of PCB images, it differs greatly from common datasets such as ImageNet, COCO and VOC, and its features are not universal, so pre-trained weights from COCO or ImageNet are not used; instead the backbone weights are randomly initialized. However, random initialization may slow network convergence, so the total number of training epochs is set to 200. To make full use of GPU parallel computing, improve training efficiency and avoid too few iterations per epoch, batch_size is set to 32. The loss function is the cross-entropy loss, which is very effective for classification tasks. The other hyperparameters are determined by grid search. For example, the input image shape is chosen as a multiple of 2^5 = 32 because the CSPDarknet53 network halves the feature map size 5 times. The candidate optimizers are SGD, Adam and Momentum, from which the one minimizing the loss function is chosen, and the initial learning rate is searched roughly from 1e-4 to 1e-2. In addition, to prevent overfitting, a weight decay parameter is increased from 0 until the loss is minimized. Finally, after many experiments, the hyperparameter scheme that gives the best model precision and recall is shown in Table 1.

Table 1. Final selection scheme for hyperparameters

Hyperparameter  | Scheme
Input shape     | 512×512
batch_size      | 32
epoch           | 200
loss function   | cross-entropy loss
learning rate   | 1e-2
optimizer       | SGD
weight decay    | 5e-4
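As a quick consistency check, the quoted split sizes follow from the two 9 : 1 ratios; rounding to the nearest integer reproduces the stated counts (a sketch of the arithmetic):

```python
total = 10668                   # images in the enhanced PCB dataset
trainval = round(total * 0.9)   # (training + validation) : test = 9 : 1
test = total - trainval
train = round(trainval * 0.9)   # training : validation = 9 : 1
val = trainval - train
print(train, val, test)         # 8641 960 1067
```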

3.3 Data Augmentation Method

To address the risk that an insufficient number of training samples causes the network model to overfit and fail to converge to the best result, this paper also performs data augmentation on the above dataset during model training, mainly through geometric transformations of the images, as shown in Fig. 6. The main idea of Mosaic is to randomly crop four input images and re-stitch them into a new image as a training sample. Its advantage is that the network processes four images at the same time during batch normalization [15], making full use of the parallel computing power of the GPU; it can also speed up convergence during training. It is worth noting


that data augmentation is performed during model training after the images are fed into the network.
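The Mosaic idea can be sketched as follows. This is a simplified toy version: the remapping of annotation boxes, which a detection pipeline also needs, is omitted, and the stitch-center range is an assumption:

```python
import numpy as np

def mosaic(images, out_size=600, seed=None):
    """Toy Mosaic augmentation: take one quadrant from each of four images
    around a random stitch center and combine them into one sample."""
    rng = np.random.default_rng(seed)
    h = w = out_size
    cy = int(rng.integers(h // 4, 3 * h // 4))   # random stitch center
    cx = int(rng.integers(w // 4, 3 * w // 4))
    out = np.zeros((h, w, 3), dtype=images[0].dtype)
    out[:cy, :cx] = images[0][:cy, :cx]          # top-left quadrant
    out[:cy, cx:] = images[1][:cy, cx:]          # top-right quadrant
    out[cy:, :cx] = images[2][cy:, :cx]          # bottom-left quadrant
    out[cy:, cx:] = images[3][cy:, cx:]          # bottom-right quadrant
    return out

# Four constant-valued dummy images make the stitching visible
imgs = [np.full((600, 600, 3), v, dtype=np.uint8) for v in (0, 1, 2, 3)]
m = mosaic(imgs, seed=0)
print(m.shape)            # (600, 600, 3)
print(len(np.unique(m)))  # 4: one source image per quadrant
```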

[Fig. 6 panels: (a) original image, (b) reverse, (c) move, (d) rotate, (e) brightness enhancement, (f) Mosaic enhancement]

Fig. 6. Various data augmentation methods

3.4 Model Evaluation Metrics

Model evaluation metrics are important parameters for measuring the quality of a model. The average precision (AP) is determined by the Precision and Recall predicted on the PCB board dataset, calculated by formulas (4) and (5):

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{5}$$

PCB Defect Detection Algorithm Based on Multi-Scale

597

In the above equations, TP means the true label is a positive sample and the network also predicts a positive sample; FP means the true label is a negative sample but the network predicts a positive one; FN means the true label is a positive sample but the network predicts a negative one. The curve with Precision on the vertical axis and Recall on the horizontal axis is called the P-R curve, and AP is the area enclosed under the P-R curve as a proportion of the total area. The mean of the APs over all defect classes on the PCB is the mAP. The larger these indicators, the better the prediction effect of the detection model. The Loss of the model represents the error between the predicted and real samples; the smaller the Loss, the more stable the model and the better the detection effect. After training, the network generates a training log containing the evaluation indices saved every 10 epochs. The overall average loss of the PCB defect detection model, and the precision and recall obtained on the test set, are plotted as a curve and a bar chart respectively for model evaluation (Figs. 7 and 8).
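Formulas (4) and (5) and the area-under-the-P-R-curve reading of AP can be sketched as follows; the sample counts and P-R points are hypothetical:

```python
def precision_recall(tp, fp, fn):
    """Formulas (4) and (5)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(precisions, recalls):
    """AP as the area under the P-R curve, using a rectangular approximation
    over recall increments (points assumed sorted by increasing recall)."""
    ap, prev_r = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_r)
        prev_r = r
    return ap

p, r = precision_recall(tp=90, fp=10, fn=10)
print(p, r)                                                  # 0.9 0.9
print(round(average_precision([1.0, 0.8, 0.6], [0.2, 0.6, 1.0]), 2))  # 0.76
```

Averaging such per-class AP values over the six defect classes gives the mAP reported below.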

Fig. 7. Overall average loss of the model

As can be seen from Fig. 7, to save time and cost the experiment was only run for 150 epochs. However, the figure shows that the decline of both the training loss and the validation loss flattens as the number of epochs increases, indicating that the network has already converged fairly well. It is reasonable to predict that, if training continued to 200 epochs, the training loss would eventually reach about 0.8 and the validation loss about 0.1. Figure 8 shows that the model is most accurate at predicting missing-hole defects, while its accuracy on spur defects is relatively low, indicating that the model's generalization across defect types still needs improvement. Nevertheless, the model can accurately predict all kinds of PCB defects with a very low missed detection rate, and the high recall greatly reduces the false detection rate of PCB board defects. The final average precision and recall are 96.83% and 96.67% respectively, so the model can complete the PCB defect detection task well.


Fig. 8. The trained model predicts the precision and recall of six kinds of defects

3.5 Experimental Results and Comparison

After training the PCB defect detection model and obtaining each detection index, in order to further evaluate the improved model, this paper trains the Faster-RCNN [16], SSD [17], RetinaNet [18], YOLOv4 [4] and YOLOX [19] target detection models on the same dataset in the same experimental environment. The best weights of the above models are selected for comparison on the test set. The average precision (AP) of each model is compared in Fig. 9, and the overall performance of each model is compared in Table 2.

Fig. 9. Comparison of AP of the detection model against PCB board defects

Figure 9 and Table 2 show that, compared with the Faster-RCNN, SSD and RetinaNet models, the mAP of the improved model is higher by 8.3%, 9.9% and 5.8% respectively, and its detection speed is higher by 24.7, 20.3 and 12.5 frames per second respectively, highlighting its superiority. For the YOLOX-s network models with different backbone networks,


Table 2. Comparison of detection model performance

Detection model | Backbone      | mAP@0.5/% | FPS
Faster R-CNN    | ResNet50      | 89.1      | 19.1
SSD             | VGG16         | 87.5      | 23.5
RetinaNet       | ResNet101-FPN | 91.6      | 31.3
YOLOX-s         | CSPDarknet53  | 93.2      | 45.6
YOLOX-s         | Ours          | 97.4      | 43.8

the mAP of the improved model is 4.2% higher than that of YOLOX-s (CSPDarknet53), while its detection speed is 1.8 frames per second lower. Although the detection speed is slightly reduced, it still meets the defect detection requirements of industrial PCB boards, and the accompanying accuracy gain matters more for PCB quality assurance. It can therefore be concluded that the comprehensive performance of the improved detection model is stronger. The defect detection results on PCB boards are shown in Fig. 10, where red represents missing hole, yellow mouse bite, green open circuit, sky blue short, dark blue spur, and purple spurious copper.

Fig. 10. The detection results of the improved model for PCB surface defects

As Fig. 10 shows, the improved model performs well in the actual detection of the six kinds of PCB defects. Whether an image contains a single defect type or several different types, the model detects them accurately; even small defects such as spurs and mouse bites are recognized well. This shows that the improved algorithm is very effective at extracting the features of PCB surface defects, strengthens detection performance, and greatly reduces the false detection rate.


4 Conclusion

This paper proposes a PCB surface defect detection method based on an improved CSPDarknet53 feature extraction network. By fusing the adjacent low-level feature maps, which carry high-resolution global information, with the high-level feature maps of low resolution but strong semantic information, the loss of defect details at the high level is remedied and the representation ability of the network is enhanced, which is very effective for small targets such as PCB surface defects. To evaluate the performance of the improved algorithm, the experimental dataset, parameter selection and evaluation indices are introduced, and a series of comparative experiments are carried out in the same experimental environment. The results show that the YOLOX model using the improved CSPDarknet53 achieves the best detection effect and the highest comprehensive performance. It can be practically applied to the detection of industrial PCB surface defects and has practical value for quality assurance in PCB production.

References

1. Dai, W., Mujeeb, A., Erdt, M., et al.: Soldering defect detection in automatic optical inspection. Adv. Eng. Inform. 43, 101004 (2020)
2. Tao, X., Hou, W., Xu, D.: Survey of surface defect detection methods based on deep learning. Acta Automatica Sinica 47(5), 1017–1034 (2021)
3. Adibhatla, V.A., et al.: Detecting defects in PCB using deep learning via convolution neural networks. In: 2018 13th International Microsystems, Packaging, Assembly and Circuits Technology Conference (IMPACT) (2019)
4. Bochkovskiy, A., et al.: YOLOv4: optimal speed and accuracy of object detection. arXiv preprint (2020)
5. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
6. Krizhevsky, A., et al.: ImageNet classification with deep convolutional neural networks. Commun. ACM (2012)
7. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014)
8. Szegedy, C., et al.: Going deeper with convolutions. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1–9 (2015)
9. Wang, C.-Y., et al.: CSPNet: a new backbone that can enhance learning capability of CNN. arXiv preprint (2019)
10. Yi, X., et al.: Enhanced Darknet53 combine MLFPN based real-time defect detection in steel surface. In: Chinese Conference on Pattern Recognition (2020)
11. Malge, P.S., Nadaf, R.S.: PCB defect detection, classification and localization using mathematical morphology and image processing tools. Int. J. Comput. Appl. 87(9), 40–45 (2020)
12. Ding, R., Dai, L., Li, G., et al.: TDD-Net: a tiny defect detection network for printed circuit boards. CAAI Trans. Intell. Technol. (31), 27–35 (2019)
13. Lin, T.-Y., et al.: Feature pyramid networks for object detection. arXiv preprint (2016)
14. Woo, S., et al.: CBAM: convolutional block attention module. In: European Conference on Computer Vision (ECCV) (2018)
15. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning (2015)
16. Ren, S., et al.: Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach. Intell., 1137–1149 (2016)
17. Liu, W., et al.: SSD: single shot multibox detector. In: Computer Vision – ECCV 2016, Lecture Notes in Computer Science, pp. 21–37 (2016)
18. Lin, T.-Y., et al.: Focal loss for dense object detection. IEEE Trans. Pattern Anal. Mach. Intell. (2017)
19. Ge, Z., et al.: YOLOX: exceeding YOLO series in 2021. arXiv preprint (2021)

Event-Triggered Adaptive Trajectory Tracking Control for Quadrotor Unmanned Aerial Vehicles

Zhongyuan Zhao1, Chengjie Cao1, and Zijuan Luo2(B)

1 College of Automation, Nanjing University of Information Science and Technology, Nanjing, China
[email protected]
2 Key Laboratory of Information Systems Engineering, The 28th Research Institute of China Electronics Technology Group Corporation, Nanjing, China
[email protected]

Abstract. An adaptive event-triggered trajectory tracking control approach is designed to address parameter uncertainties and external disturbances in quadrotor unmanned aerial vehicle (UAV) systems. The backstepping method is combined with the dynamic surface control method, a radial basis function neural network (RBFNN) is introduced to compensate for unknown disturbances, and an event-triggered mechanism is proposed to reduce the controller update frequency and thereby reduce actuator wear. Simulation results demonstrate that this control method has strong anti-interference ability and achieves ideal control accuracy, effectively solving the double-closed-loop trajectory tracking control problem of quadrotor UAVs.

Keywords: quadrotor UAV · trajectory tracking · dynamic control · adaptive control · event-triggered control

1 Introduction

Quadrotor UAVs are trending toward greater intelligence and reduced weight [1–3]. As underactuated, strongly coupled nonlinear systems, trajectory tracking for quadrotor UAVs remains challenging [4–6]. Since linear feedback control ignores the effects of unknown disturbances [7,8], researchers have proposed a number of nonlinear control methods [9–11]. Work [12] proposes a disturbance observer for unknown disturbances in the system. Work [13] approximates continuous and unknown dynamics through a neural network-based adaptive control method. In reference [14], a new method for prescribed finite-time feedback control is proposed. The aforementioned control algorithms generally require periodic sampling at fixed time intervals and real-time updating of the controllers [15,16]. An event-triggered mechanism determines the triggering times for transmitting sampled data by setting event-triggered conditions, reducing the controller update frequency [17]. Work [18]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 602–609, 2023. https://doi.org/10.1007/978-981-99-6187-0_59


designs an asymmetric barrier Lyapunov function to satisfy the output constraints. Work [19] introduces novel event-triggered tracking schemes in two distinct event-triggered architectures. Work [20] presents an adaptive event-triggered consensus protocol with prescribed performance. This paper proposes an adaptive event-triggered tracking control approach for quadrotor UAVs. Based on the backstepping control method, the dynamic surface control method is employed to circumvent the phenomenon of differential explosion. An RBFNN is introduced to compensate for unknown disturbances through its approximation properties. Additionally, a fixed threshold strategy and a switching threshold strategy are proposed. The control system achieves asymptotic stability and avoids Zeno behavior, as demonstrated by the Lyapunov stability criterion. The event-triggered mechanism reduces controller updates significantly, thereby preventing excessive actuator updates.

2 Problem Formulation

The quadrotor UAV system is divided into an attitude system and a position system, with state variables (ϕ, θ, ψ, x, y, z). Based on the Newton-Euler equations, the dynamic model is constructed as

  ẍ = (cos ϕ sin θ cos ψ + sin ϕ sin ψ) U1/m
  ÿ = (cos ϕ sin θ sin ψ − sin ϕ cos ψ) U1/m
  z̈ = (cos ϕ cos θ) U1/m − g                                        (1)
  ϕ̈ = lU2/Ix + θ̇ψ̇ (Iy − Iz)/Ix − Jr θ̇Ωr/Ix
  θ̈ = lU3/Iy + ϕ̇ψ̇ (Iz − Ix)/Iy − Jr ϕ̇Ωr/Iy
  ψ̈ = U4/Iz + ϕ̇θ̇ (Ix − Iy)/Iz

where m represents the mass; Ix, Iy, Iz are the moments of inertia; l is the rotor-to-body distance; U = [U1 U2 U3 U4]ᵀ represents the control input; and Ωr represents the motor speed.

The dynamic equation is Ẋ = f(X, U), where X = [x1, ..., x12]ᵀ is the state variable. The state equation is

  ẋ1 = x2
  ẋ2 = Ux U1/m + d1
  ẋ3 = x4
  ẋ4 = Uy U1/m + d2
  ẋ5 = x6
  ẋ6 = (cos x7 cos x9) U1/m − g + d3                                (2)
  ẋ7 = x8
  ẋ8 = a1 x10 x12 − a2 Ωr x10 + b1 U2 + d4
  ẋ9 = x10
  ẋ10 = a3 x8 x12 − a4 Ωr x8 + b2 U3 + d5
  ẋ11 = x12
  ẋ12 = a5 x8 x10 + b3 U4 + d6

where di denotes unknown disturbances, a1 = (Iy − Iz)/Ix, a2 = Jr/Ix, a3 = (Iz − Ix)/Iy, a4 = Jr/Iy, a5 = (Ix − Iy)/Iz, b1 = l/Ix, b2 = l/Iy, b3 = 1/Iz.
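The state equation (2) can be written directly as a right-hand-side function. The sketch below is an illustration only: the numerical parameters are hypothetical placeholders, since the paper does not list physical values.

```python
import numpy as np

# Sketch of the state equation (2). The physical parameters below are
# hypothetical placeholders; the paper does not list numerical values.
m, g, l = 1.2, 9.81, 0.2                     # mass, gravity, rotor-to-body distance
Ix, Iy, Iz, Jr = 0.02, 0.02, 0.04, 1e-4      # moments of inertia, rotor inertia

a1, a2 = (Iy - Iz) / Ix, Jr / Ix
a3, a4 = (Iz - Ix) / Iy, Jr / Iy
a5 = (Ix - Iy) / Iz
b1, b2, b3 = l / Ix, l / Iy, 1.0 / Iz

def f(X, U, Ux, Uy, Omega_r=0.0, d=None):
    """Right-hand side of X_dot = f(X, U) with virtual inputs Ux, Uy."""
    if d is None:
        d = np.zeros(6)                      # unknown disturbances d1..d6
    U1, U2, U3, U4 = U
    x7, x8, x9, x10, x12 = X[6], X[7], X[8], X[9], X[11]
    return np.array([
        X[1], Ux * U1 / m + d[0],
        X[3], Uy * U1 / m + d[1],
        X[5], np.cos(x7) * np.cos(x9) * U1 / m - g + d[2],
        x8,   a1 * x10 * x12 - a2 * Omega_r * x10 + b1 * U2 + d[3],
        x10,  a3 * x8 * x12 - a4 * Omega_r * x8 + b2 * U3 + d[4],
        x12,  a5 * x8 * x10 + b3 * U4 + d[5],
    ])

# Sanity check: hovering at zero attitude with thrust U1 = m*g gives X_dot = 0.
Xdot = f(np.zeros(12), [m * g, 0.0, 0.0, 0.0], Ux=0.0, Uy=0.0)
```

The hover check verifies the model's sign conventions: with U1 = mg and zero attitude, gravity is exactly cancelled.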

3 Controller Design

Taking the yaw angle subsystem as an example, the state equation is

  ẋ9 = x10
  ẋ10 = a3 x8 x12 − a4 Ωr x8 + b2 U3 + d5        (3)

Assumption 1. The desired tracking signal xdl and its derivatives ẋdl, ẍdl are smooth, differentiable and bounded. That is, there exists a positive number K1 such that xdl² + ẋdl² + ẍdl² ≤ K1, i.e. (xdl, ẋdl, ẍdl) always exists in the compact set Π1 = {(xdl, ẋdl, ẍdl) : xdl² + ẋdl² + ẍdl² ≤ K1}.

3.1 Fixed Threshold Strategy

Design the adaptive dynamic surface event-triggered controller as

  U̇3(t) = w1(t)        (4)

  tk+1 = inf {t ∈ R : |w1(t) − U3(t)| ≥ m1}, t1 = 0        (5)

  U3(t) = (−c2 z2 − a3 x8 x10 + a4 Ωr x8 − f̂(x) + α̇1)/b2        (6)

  w1(t) = U3(t) − m̄1 tanh(m̄1 z2/ε1)        (7)

where {tk} represents the trigger times, which constitute the controller execution interval set {tk+1 − tk}, ∀k ∈ N⁺; m̄1 > m1 ≥ 0 and ε1 > 0 are preset constants; z1, z2 are error variables; xdl is the desired angle; and α1 is the virtual control law.

  z1 = x9 − xdl

(8)

z2 = x10 − α1

(9)

α1 = −c1 z1 + ẋdl

(10)
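The triggering rule (4)-(5) can be exercised in isolation: the actuator holds the last transmitted value and a new transmission fires only when the deviation reaches the fixed threshold m1. In the sketch below, w1 is an arbitrary smooth stand-in for the continuously computed control signal, and all numbers are illustrative rather than the paper's tuning.

```python
import numpy as np

# Fixed-threshold triggering per (4)-(5): hold the last transmitted value
# U3 and re-transmit only when |w1(t) - U3(t)| >= m1. The signal w1 is an
# arbitrary smooth stand-in, not the actual controller output.
dt, m1 = 0.001, 0.05
t = np.arange(0.0, 10.0, dt)
w1 = np.sin(t) + 0.3 * np.cos(3.0 * t)        # |dw1/dt| <= 1.9 < 2 =: varsigma
varsigma = 2.0

U3 = w1[0]
trigger_times = [t[0]]
for k in range(1, len(t)):
    if abs(w1[k] - U3) >= m1:                 # event-triggered condition (5)
        U3 = w1[k]                            # controller update at t_k
        trigger_times.append(t[k])

intervals = np.diff(trigger_times)
t_star = m1 / varsigma                        # Zeno-free lower bound (Theorem 2)
```

The simulated inter-event intervals never drop below m1/ς, matching the lower bound established later in Theorem 2, while the number of updates stays far below the number of periodic samples.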

The input x̄2 is obtained by passing α1 through a low-pass filter. Define the filtering error as

  s2 = α1 − x̄2        (11)

  |ṡ2 + s2/τ| ≤ B2(x1, xdl, ẋdl, ẍdl)        (12)

where B2(x1, xdl, ẋdl, ẍdl) is a continuous non-negative function. Define the following compact set

  Π2 = {(z1, z2, s2, W̃) : z1² + z2² + s2² + W̃² ≤ 2P}        (13)

Π1 and Π2 are compact sets, and Π1 × Π2 contains the variables of B2(·). M2 represents the maximum value of B2(·), that is, |B2(·)| ≤ M2. Define the RBF network algorithm as

  hj = exp(−‖x − cj‖²/(2bj²))        (14)

  Δθ = f(x) = W*ᵀ h(x) + ε        (15)

where x represents the network input, j denotes the j-th node in the hidden layer, h = [hj]ᵀ is the output of the Gaussian basis functions, W* is the ideal weight, and ε is the approximation error. Design the output and adaptive law of the RBF network as

  f̂(x) = Ŵᵀ h(x)        (16)

  Ŵ̇ = ηz2 h(x) − σŴ        (17)

where W̃ = Ŵ − W*, and σ > 0, η > 0 are constants.

Theorem 1. For the quadrotor UAV system (3), design the adaptive controller (6). Based on Assumption 1 and the initial condition V1(0) ≤ P, with the event-triggered conditions (4), (5) and the adaptive law (17), the system is asymptotically stable.

Proof: Define the Lyapunov function V1 as

  V1 = z1²/2 + z2²/2 + s2²/2 + (Ŵ − W*)²/(2η)

(18)

The derivative of V1 can be expressed as

  V̇1 = z1(z2 + s2 − c1 z1) + s2 ṡ2 + z2(a3 x8 x10 − a4 Ωr x8 + b2(w1(t) − λ(t)m̄1) + d5 − α̇1) + (1/η)W̃Ŵ̇
     = −Σᵢ₌₁² ci zi² + z1(z2 + s2) + z2 ε − s2²/τ + |s2||M2| + z2 b2 λ(t)m̄1 − z2 b2 m̄1 tanh(m̄1 z2/ε1) − (σ/η)W̃Ŵ        (19)

From Young's inequality, it follows that

  |s2||M2| ≤ s2² M2²/(2c) + c/2        (20)

Noting that 0 ≤ |ρ| − ρ tanh(ρ/ε1) ≤ 0.2785ε1, (19) can be simplified to obtain

  V̇1 ≤ (1 − c1) z1² + (1/2 − c2) z2² + (M2²/(2c) − 1/τ + 1/2) s2² − (σ/(2η)) W̃² + Q1 ≤ −2r1 V1 + Q1        (21)

where Q1 = c/2 + z2 ε + 0.2785 b2 ε1 + (σ/(2η))‖W*‖². Adjusting the parameters c1, c2, c, τ, ε1, σ can make r1 > Q1/(2P). When V1 = P, V̇1 ≤ 0. Since V1(0) ≤ P, Π2 is a positively invariant set and V1(t) ≤ P holds, so the variables are ultimately bounded. Solving (21) yields

  V1(t) ≤ e^(−2r1 t) V1(0) + (Q1/(2r1))(1 − e^(−2r1 t)), t ≥ 0

(22)

Therefore, V1(t) is bounded and the system achieves asymptotic stability.

Remark 1. The horizontal control is coupled with the pitch and roll channels. Design the desired yaw angle as ψd = 0 rad, and calculate the desired roll and pitch angles as

  ϕd = arcsin(Ux sin(−ψ) − Uy cos(−ψ))
  θd = arcsin((Ux cos(−ψ) + Uy sin(−ψ))/cos ϕd)        (23)
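The RBFNN approximator of Eqs. (14), (16) and (17) can be run stand-alone. In the toy sketch below, z2 is replaced by the plain approximation error, and the centers cj, widths bj and gains η, σ are illustrative choices rather than the paper's tuning.

```python
import numpy as np

# Scalar RBFNN sketch of (14), (16), (17). Centers c_j, widths b_j and the
# gains eta, sigma are illustrative; z2 is replaced here by the plain
# approximation error so the adaptive law can be run stand-alone.
centers = np.linspace(-2.0, 2.0, 9)           # c_j
widths = np.full(9, 0.8)                      # b_j
eta, sigma, dt = 20.0, 0.05, 0.001

def h(x):
    """Gaussian basis (14): h_j = exp(-|x - c_j|^2 / (2 b_j^2))."""
    return np.exp(-(x - centers) ** 2 / (2.0 * widths ** 2))

f_true = np.sin                               # stand-in for the unknown dynamics
W_hat = np.zeros(9)
for k in range(50000):
    x = 2.0 * np.sin(0.01 * k)                # persistently exciting input
    z2 = f_true(x) - W_hat @ h(x)             # stand-in for the tracking error
    W_hat += dt * (eta * z2 * h(x) - sigma * W_hat)   # adaptive law (17)

err = max(abs(f_true(x) - W_hat @ h(x)) for x in np.linspace(-1.5, 1.5, 50))
```

The σŴ leakage term keeps the weights bounded even without persistent excitation, at the cost of a small steady-state approximation bias.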


Theorem 2. For the quadrotor UAV system (3), the designed adaptive controller (6) with the event-triggered conditions (4), (5) can avoid Zeno behavior.

Proof: Taking the derivative of the measurement error e(t) = w1(t) − U3(t) yields

  (d/dt)|e(t)| = sgn(e) ė(t) ≤ |ẇ1(t)|        (24)

From (24), it can be obtained that

  h(z2) = (2m̄1²/(πε1)) / (1 + (m̄1 z2/ε1)²)        (25)

From (25), h(z2) is a bounded function, so ẇ1(t) is bounded and there exists ς > 0 satisfying |ẇ1(t)| < ς. For ∀k ∈ N⁺, when t = tk, e(tk) = 0; when the event-triggered condition |w1(t) − U3(t)| = m1 is met, lim_{t→tk+1} e(t) = m1.

Integrating |ė(t)| over the execution time interval [tk, tk+1) gives

  ∫ from tk to tk+1 of |ė(t)| dt ≤ |ẇ1(t)|(tk+1 − tk) ≤ ς(tk+1 − tk)        (26)

Since ∫ from tk to tk+1 of |ė(t)| dt = lim_{t→tk+1} |e(t) − e(tk)| = m1, thus

  tk+1 − tk ≥ m1/ς, ∀k ∈ N⁺        (27)

Thus, the lower bound of the execution time interval set {tk+1 − tk} is t* = m1/ς. The controllers for the other subsystems can be designed similarly.

3.2 Switching Threshold Strategy

The measurement error is always constrained by the predetermined value when using the fixed threshold strategy. Design the switching threshold strategy as

  U̇3(t) = w1(tk), ∀t ∈ [tk, tk+1]        (28)

  tk+1 = inf {t ∈ R : |e(t)| ≥ ζ|U3(t)| + m1}, if |U3(t)| < D;
         inf {t ∈ R : |e(t)| ≥ m}, if |U3(t)| ≥ D        (29)

where D is a design parameter, and m̄1 > m1/(1 − ζ) are positive constants.

Theorem 3. For the quadrotor UAV system (3), design the controller (6). Under the event-triggered conditions (28), (29) and the adaptive law (17), all state variables are bounded.

Proof: According to the switching threshold strategy (28), (29), it can be obtained that

  ē = sup |e(t)| ≤ max {ζ|D| + m1, m}, ∀t ∈ [tk, tk+1)

(30)

Thus, ē ≤ m̄. The designed control law is then equivalent to the fixed threshold strategy, so the boundedness of all state variables can still be demonstrated.


Theorem 4. For the quadrotor UAV system (3), the designed controller (6) with the event-triggered mechanism (28), (29) can avoid Zeno behavior.

Proof: As in Theorem 2, ẇ1(t) is bounded, so there exists ς > 0 with |ẇ1(t)| < ς. Integrating |ė(t)| over the interval [tk, tk+1) yields

  tk+1 − tk ≥ max {ζ|D| + m1, m}/ς, ∀k ∈ N⁺

(31)

Therefore, the lower bound of {tk+1 − tk } is t∗ = max {ζ |D| + m1 , m}/ς.
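A qualitative comparison of the fixed rule (5) and the switching rule (29) can be run on a stand-in signal. The values of ζ, m1, m̄1 (written m_bar below) and D are illustrative, not the paper's settings, and w1 is an arbitrary smooth signal rather than the actual controller output.

```python
import numpy as np

# Compare trigger counts of the fixed rule (5) and the switching rule (29)
# on a stand-in signal. zeta, m1, m_bar and D are illustrative values.
dt = 0.001
t = np.arange(0.0, 10.0, dt)
w1 = 4.0 * np.sin(t)                          # stand-in controller signal
zeta, m1, m_bar, D = 0.2, 0.05, 0.5, 2.0

def count_triggers(threshold):
    """Simulate hold-and-retransmit with a state-dependent threshold."""
    held, n = w1[0], 1
    for k in range(1, len(t)):
        if abs(w1[k] - held) >= threshold(held):
            held, n = w1[k], n + 1
    return n

n_fixed = count_triggers(lambda U3: m1)
n_switch = count_triggers(
    lambda U3: zeta * abs(U3) + m1 if abs(U3) < D else m_bar)
# The relative threshold fires far less often while |U3| is large.
```

Because the switching threshold is never smaller than the fixed one, the switching strategy transmits less often, which is consistent with the trigger counts reported in the numerical experiments below.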

4 Numerical Experiments

This section carries out a trajectory tracking experiment on the quadrotor UAV system to verify the efficacy of the control method and the performance of the controller. The trajectory tracking signal [x y z ψ] is set to [3 sin t, 5 cos t, 0.5t, 0.5]. The initial values are all set to 0, and time-varying disturbances of 5 cos(t) N·m are applied separately. The tracking comparison diagrams of the subsystems are shown in Fig. 1.

[Figure: panels (a) x subsystem, (b) y subsystem, (c) z subsystem, (d) ϕ subsystem, (e) θ subsystem, (f) ψ subsystem]

Fig. 1. Actual tracking trajectories and expected trajectories of each state.

The event-triggered controller remains unchanged between two adjacent triggers, so the control input is step-like. Taking the control input of the x subsystem in the position system as an example, the simulation results are shown in Fig. 2. This paper adopts both the fixed threshold strategy and the switching threshold strategy; the event-triggered time intervals are shown in Fig. 3. The fixed threshold strategy triggered the event-triggered controller 1358 times within 10 s, while the switching threshold strategy led to only 392 triggerings during the same period. Both strategies achieved the same tracking effect as the periodic sampling controller, which sampled continuously 10000 times.

[Figure]

Fig. 2. The control input curve of the x direction subsystem.

[Figure: panels (a) fixed threshold strategy, (b) switching threshold strategy]

Fig. 3. Comparison of triggering time intervals for two event-triggered strategies.

5 Conclusion

This paper combines an event-triggered RBFNN control approach with the dynamic surface control method on the basis of the backstepping method. Both a fixed threshold strategy and a switching threshold strategy are designed, achieving trajectory tracking control while effectively avoiding Zeno behavior. The proposed control method effectively reduces the controller update frequency and thus actuator wear.

References

1. Fu, X.J., He, J.H.: Robust adaptive sliding mode control based on iterative learning for quadrotor UAV. IETE J. Res. 1–13 (2021)
2. Yang, H.J., Xia, Y.Q., Fu, M.Y., Shi, P.: Robust adaptive sliding mode control for uncertain delta operator systems. Int. J. Adapt. Control Signal Process. 24, 623–632 (2010)
3. Gao, Y., Li, R., Shi, Y.J., Xiao, L.: Design of path planning and tracking control of quadrotor. J. Ind. Manage. Optim. 18, 2221–2235 (2022)
4. Wang, M.Y., Chen, B., Lin, C.: Fixed-time backstepping control of quadrotor trajectory tracking based on neural network. IEEE Access 8, 177092–177099 (2020)
5. Wang, H.L., Chen, M.: Trajectory tracking control for an indoor quadrotor UAV based on the disturbance observer. Trans. Inst. Meas. Control. 38, 675–692 (2016)
6. Kim, H.S., Lee, K., Joo, Y.H.: Decentralized sampled-data fuzzy controller design for a VTOL UAV. J. Franklin Inst. 358, 1888–1914 (2021)
7. Islam, S., Faraz, M., Ashour, R.K., Cai, G., Seneviratne, L.: Adaptive sliding mode control design for quadrotor unmanned aerial vehicle. In: International Conference on Unmanned Aircraft Systems (2015)
8. Dou, J., Wen, B.: An adaptive robust attitude tracking control of quadrotor UAV with the Modified Rodrigues Parameter. Meas. Control 55, 1167–1179 (2022)
9. Lee, K., Kim, S., Kwak, S., You, K.: Quadrotor stabilization and tracking using nonlinear surface sliding mode control and observer. Appl. Sci. 11, 1417 (2021)
10. Xu, Q., Wang, Z., Zhen, Z.: Adaptive neural network finite time control for quadrotor UAV with unknown input saturation. Nonlinear Dyn. 98(3), 1973–1998 (2019). https://doi.org/10.1007/s11071-019-05301-1
11. Dalwadi, N., Deb, D., Rath, J.J.: Biplane trajectory tracking using hybrid controller based on backstepping and integral terminal sliding mode control. Drones 6, 58 (2022)
12. Ahmed, N., Chen, M.: Sliding mode control for quadrotor with disturbance observer. Adv. Mech. Eng. 10, 168781401878233 (2018)
13. Zhao, Z., Jin, X.: Adaptive neural network-based sliding mode tracking control for agricultural quadrotor with variable payload. Comput. Electr. Eng. 103, 108336 (2022)
14. Eliker, K., Grouni, S., Tadjine, M., Zhang, W.D.: Practical finite time adaptive robust flight control system for quad-copter UAVs. Aerosp. Sci. Technol. 98, 105708 (2020)
15. He, D., Wang, H., Tian, Y., Zimenko, K.: Event-triggered discrete extended state observer-based model-free controller for quadrotor position and attitude trajectory tracking. Proc. Inst. Mech. Eng. Part I J. Syst. Control Eng. 236, 754–771 (2022)
16. Ye, P., Yu, Y., Wang, W.: Event-based adaptive fuzzy asymptotic tracking control of quadrotor unmanned aerial vehicle with obstacle avoidance. Int. J. Fuzzy Syst. 24, 3174–3188 (2022)
17. Shao, X., Li, J.: Event-triggered robust control for quadrotors with preassigned time performance constraints. Appl. Math. Comput. 392, 125667 (2021)
18. Wang, J., Wang, P., Ma, X.: Adaptive event-triggered control for quadrotor aircraft with output constraints. Aerosp. Sci. Technol. 105, 105935 (2020)
19. Huang, Y., Liu, Y.: Practical tracking via adaptive event-triggered feedback for uncertain nonlinear systems. IEEE Trans. Autom. Control 64, 3920–3927 (2019)
20. Yao, D., Dou, C., Yue, D., Zhao, N., Zhang, T.: Adaptive neural network consensus tracking control for uncertain multi-agent systems with predefined accuracy. Nonlinear Dyn. 101(4), 2249–2262 (2020). https://doi.org/10.1007/s11071-020-05885-z

Coal Maceral Groups Segmentation Using Multi-scale Residual Network

Junran Chen1, Zhenghao Xi1(B), Zhengnan Lv2, Xiang Liu1, and Mingyang Wu1

1 School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
[email protected]
2 China Electronics Technology, Taiji Group Corporation, Limited, Beijing 100083, China

Abstract. The accuracy of automated coal maceral group segmentation affects the analysis of coal's composition. This paper proposes a multi-scale residual network based on U-Net to improve the accuracy of coal maceral group segmentation. We first formulate a multi-scale contextual attention block, in which a channel attention block filters features at different levels of the coal maceral group images, while four cascaded dilated convolution paths and three different parallel max-pooling layers improve the multi-scale feature extraction capability on the coal maceral group. We then use a squeeze-and-excitation block to distinguish features of the coal maceral group at different levels from the channel attention block. The proposed model outperforms state-of-the-art models in the evaluation metrics, with a mean PA of 91.24%, IoU of 83.01%, and BFScore of 84.70%. Experimental results reveal that the proposed model can address the accuracy problem of coal maceral group segmentation.

Keywords: The Coal Maceral Group · Image Segmentation · Multi-Scale Contextual Attention

1 Introduction

The analysis of coal's composition is an important means of judging the quality of coal. Usually, this analysis is carried out on polarizing microscope images. Accurate segmentation of each component of coal in the image helps to improve the analysis of coal's composition [1, 2]. The component features of coal maceral group images are complex and diverse, the accuracy of existing automated segmentation methods is insufficient for industrial needs, and manual auxiliary operation is still required [3]. At present, automated coal maceral group segmentation is mainly addressed by machine vision and image processing methods. [4] proposed a visual clustering algorithm to segment the coal maceral group. [5] enhanced the contrast, brightness and clarity of coal maceral group images using adaptive Gamma correction so as to improve segmentation accuracy. The application of deep learning to image processing has improved the efficiency and accuracy of automated coal maceral group segmentation [6]. However, the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 610–617, 2023. https://doi.org/10.1007/978-981-99-6187-0_60


neural network training for deep learning requires large datasets, the time cost of obtaining coal maceral group images is high, and there are no standard datasets with segmentation labels. [7] extracted and segmented the edges of a small number of ore image samples with Res-UNet, and the results showed high robustness. [8] proposed an improved U-Net segmentation algorithm to address small-feature extraction and unbalanced pixel categories under complex backgrounds. [9] improved the U-Net model with an attention mechanism and used it to segment coal maceral group images. None of the above methods can sufficiently improve the segmentation accuracy of coal maceral group images, and all of them used small-sample datasets. By contrast, the proposed MSRU-Net model, which combines Multiscale Contextual Attention (MCA) and a residual network, can effectively solve these problems.

2 Multi-scale Residual U-Net Model

The proposed Multi-Scale Residual U-Net model (MSRU-Net) is based on the U-Net model, a symmetrical structure consisting of encoders and decoders.

2.1 Multiscale Contextual Attention Block

We propose the MCA block to enhance the feature extraction capability of the U-Net. The MCA block includes four cascaded dilated convolutional paths, three different parallel maximum pooling layers and a Channel Attention (CA) block. The MCA structure is shown in Fig. 1.

Fig. 1. The Multiscale Contextual Attention block

During down-sampling, the cascaded dilated convolutions with different dilation rates adjust the receptive field of the feature maps. In the MCA block, we set the dilation rates to 1, 2 and 4, respectively, and obtain multi-scale features from the encoder feature maps. Then, we encode the global context information with three maximum pooling layers of sizes 2 × 2, 3 × 3 and 5 × 5, respectively, and obtain three different high-dimensional feature maps. A 1 × 1 convolution after each maximum pooling layer converts the low-level high-dimensional feature maps into high-level low-dimensional ones. The function F denotes the low-dimensional maps, which we restore to the same dimension as the input image by up-sampling. Considering that redundant information in F would affect the coal maceral group segmentation results, we extract the coal maceral group features with the CA block, whose structure is shown in Fig. 2.
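The effect of cascading dilation rates 1, 2 and 4 can be illustrated in one dimension: each extra 3-tap layer with rate r adds 2r samples to the receptive field, with no pooling needed. The averaging kernel below is a placeholder, not a learned filter.

```python
import numpy as np

# 1-D illustration: stacking 3-tap convolutions with dilation rates 1, 2, 4
# grows the receptive field by 2*rate per layer (1 + 2 + 4 + 8 = 15 samples)
# without any pooling. The averaging kernel is a placeholder, not learned.
def dilated_conv1d(x, kernel, rate):
    """'Same'-padded 1-D convolution with the given dilation rate."""
    pad = rate * (len(kernel) // 2)
    xp = np.pad(x, pad)
    out = np.zeros(len(x))
    for i in range(len(x)):
        for j, w in enumerate(kernel):
            out[i] += w * xp[i + j * rate]
    return out

rf = 1
for rate in (1, 2, 4):
    rf += 2 * rate                            # receptive-field recursion

x = np.zeros(31)
x[15] = 1.0                                   # unit impulse at the center
k3 = np.ones(3) / 3.0
y = x
for rate in (1, 2, 4):
    y = dilated_conv1d(y, k3, rate)
support = np.flatnonzero(y > 0)               # samples the output depends on
```

Probing the cascade with a unit impulse shows that a single output position mixes 15 input positions, which is the multi-scale context the MCA block exploits.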

Fig. 2. The Channel Attention block

FAp and FMp are the output results of F through the CA block; they retain the background information and the texture information of the coal maceral group feature maps, respectively. We calculate FAp and FMp with Eqs. (1) and (2). Both the average pooling kernel and the maximum pooling kernel use the same parameters: stride 4 and size 3 × 3.

  FAp = Avgpool(F)

(1)

FMp = Maxpool(F)

(2)

The results from Eqs. (1) and (2) are then sent into a Multilayer Perceptron (MLP) to calculate the channel attention weight Mc(F) with Eq. (3):

  Mc(F) = σ(w1(w0(FAp)) + w1(w0(FMp)))

(3)

where σ is the sigmoid activation function, and w0 and w1 denote the connection coefficients of the MLP. Then we acquire the channel attention feature F′ as follows:

  F′ = Mc(F) ⊗ F

(4)
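Eqs. (1)-(4) can be sketched directly in numpy. The weights are random placeholders standing in for trained parameters; the ReLU hidden activation and the reduction ratio of 2 in the shared MLP are assumptions not stated in the paper, and the pooling is done globally here for simplicity.

```python
import numpy as np

# Numpy sketch of the CA block, Eqs. (1)-(4): per-channel average and max
# pooling (global here for simplicity), a shared two-layer MLP, and a
# sigmoid gate. Weights are random placeholders; the ReLU hidden
# activation and reduction ratio 2 are assumptions, not from the paper.
rng = np.random.default_rng(0)
C, H, W = 8, 16, 16
F = rng.standard_normal((C, H, W))

F_ap = F.mean(axis=(1, 2))                    # Eq. (1): Avgpool(F)
F_mp = F.max(axis=(1, 2))                     # Eq. (2): Maxpool(F)

w0 = rng.standard_normal((C // 2, C))         # shared MLP weights
w1 = rng.standard_normal((C, C // 2))
relu = lambda v: np.maximum(v, 0.0)
sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# Eq. (3): M_c(F) = sigma(w1(w0(F_ap)) + w1(w0(F_mp)))
M_c = sigmoid(w1 @ relu(w0 @ F_ap) + w1 @ relu(w0 @ F_mp))

F_prime = M_c[:, None, None] * F              # Eq. (4): F' = M_c(F) (x) F
```

Each channel of F is scaled by a single weight in (0, 1), which is how the CA block suppresses redundant channels.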

2.2 Squeeze and Excitation Block

In the U-Net, the high-level and low-level features from Sect. 2.1 are fused through skip connections, but this alone does not distinguish which features are important. We introduce an additional Squeeze and Excitation (SE) block [10] in the decoder of the original U-Net to solve this problem. The SE block structure is shown in Fig. 3.

Fig. 3. The Squeeze and Excitation block

We obtain the feature images of the coal maceral group. Let different channels of feature images be x C (a,b), where C denotes the number of channels, a and b are the pixel positions of feature images in the transverse and longitudinal axes, respectively.

Coal Maceral Groups Segmentation Using Multi-scale

613

We split the feature maps of size W × H × C into 1 × 1 × C global description features Sq(a,b), squeezing the W × H × C features into 1 × 1 × C features with Eq. (5):

  Sq(a, b) = (1/(H × W)) Σ_{a=1}^{W} Σ_{b=1}^{H} xC(a, b)        (5)

Then we use different weights with Eq. (6) to acquire the Ex(a,b). Ex(a, b) = σ (μ(Sq(a, b)))

(6)

where μ is the ReLU function. According to Eqs. (5) and (6), we obtain the features x̃ with different weights in each channel as follows:

  x̃ = Ex · xC

(7)
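The squeeze-excite-rescale pipeline of Eqs. (5)-(7) can be sketched as below. The single random linear layer between μ and the sigmoid stands in for the block's trained weights; it is an illustrative assumption, not the authors' exact architecture.

```python
import numpy as np

# Numpy sketch of the SE block, Eqs. (5)-(7): squeeze a W x H x C feature
# map to 1 x 1 x C by global average pooling, excite through mu (ReLU),
# a linear map and a sigmoid, then rescale the channels. The single
# random linear layer stands in for the block's trained weights.
rng = np.random.default_rng(1)
W, H, C = 16, 16, 8
x = rng.standard_normal((W, H, C))

Sq = x.mean(axis=(0, 1))                      # Eq. (5): 1/(H*W) * sum over a, b
Wf = rng.standard_normal((C, C))              # placeholder excitation weights
mu = lambda v: np.maximum(v, 0.0)             # ReLU
Ex = 1.0 / (1.0 + np.exp(-(Wf @ mu(Sq))))     # Eq. (6): Ex = sigma(mu(Sq))

x_tilde = x * Ex[None, None, :]               # Eq. (7): channel reweighting
```

As in the CA block, the excitation vector Ex lies in (0, 1) per channel, so the SE block can only attenuate channels, never amplify them beyond their input scale.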

We set the weights of the feature maps from different channels using the SE block. This weakens the influence of non-correlated factors on the coal maceral group segmentation results and improves the segmentation accuracy.

2.3 The Framework of the MSRU-Net Model

We combined the MCA block from Sect. 2.1 and the SE block from Sect. 2.2 to create the MSRU-Net based on the U-Net. The MSRU-Net is shown in Fig. 4.

Fig. 4. The MSRU-Net model structure

To solve the problems of degradation and gradient vanishing while deepening the network, we optimized a residual block with BN layers and ReLU function layers to replace the convolutional blocks in the encoder and decoder of the U-Net, as shown in Fig. 5.

Fig. 5. The optimized residual block


Each encoding block in MSRU-Net corresponds to one residual block with a different dimension, and each decoding block corresponds to one residual block with a different dimension plus the same SE block; the MCA block sits between the deepest encoding block and decoding block. The residual block helps the U-Net obtain more features of the coal macerals, and the SE block improves the ability of the U-Net to obtain multi-level features of the coal maceral images.

3 Analysis of Experimental Results

3.1 Datasets

We used datasets from a cooperative project between the University of Science and Technology Liaoning and Angang Steel Group Limited. Fifty coal maceral group images of size 3048 × 2836 were obtained by the laboratory's partially reflective electron microscope. We used the scrolling crop method to obtain 850 coal maceral group sample images of size 512 × 512 as datasets: 750 images and their corresponding labels served as the training set, and the rest were used as the testing set.

3.2 The Ablation Experiment of Function Blocks

We carried out an ablation experiment on MSRU-Net to verify the improvement of each block on its segmentation ability. The loss functions used in the models are all cross-entropy loss functions. The results of the proposed MSRU-Net are evaluated using Pixel Accuracy (PA), Intersection over Union (IoU) and BFScore, following [7, 11] and [12]. The final experimental results are shown in Table 1.

Table 1. The results of the ablation experiment

Model                   PA      IoU     BFScore
U-Net                   0.8438  0.7742  0.7832
U-Net + ResNet          0.8470  0.7909  0.8141
U-Net + ResNet + SE     0.8705  0.8124  0.8376
U-Net + ResNet + MCA    0.8763  0.8131  0.8491
MSRU-Net                0.9124  0.8301  0.8470
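The scrolling-crop preparation of Sect. 3.1 can be sketched as a sliding-window crop. The stride is not stated in the paper, so a non-overlapping stride of 512 is assumed here for illustration (the actual stride must be smaller, since 850 crops are produced from 50 images).

```python
import numpy as np

# Sketch of the scrolling-crop preparation of Sect. 3.1: slide a 512 x 512
# window over a 3048 x 2836 micrograph. The stride is not stated in the
# paper; a non-overlapping stride of 512 is assumed for illustration.
def scrolling_crop(img, size=512, stride=512):
    h, w = img.shape[:2]
    crops = []
    for top in range(0, h - size + 1, stride):
        for left in range(0, w - size + 1, stride):
            crops.append(img[top:top + size, left:left + size])
    return crops

img = np.zeros((3048, 2836), dtype=np.uint8)  # placeholder micrograph
crops = scrolling_crop(img)                   # 5 x 5 = 25 crops at stride 512
```

With the assumed stride, each 3048 × 2836 image yields 25 full windows; a smaller stride (overlapping windows) yields correspondingly more samples.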

Table 1 shows that the segmentation performance improves slightly when the residual block is added to U-Net, and the evaluation metrics improve significantly when the SE and MCA blocks are added. Compared with the U-Net with only the residual block added, PA, IoU and BFScore increased by 2.35%, 2.15% and 2.35% respectively after adding the SE block, and by 2.93%, 2.22% and 3.5% respectively after adding the MCA block. These results illustrate that the residual, MCA and SE blocks capture more effective features.
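The PA and IoU figures above follow the standard definitions, which can be computed from integer label maps as below. These are textbook formulas, not the authors' exact evaluation code, and BFScore is omitted because it requires boundary extraction.

```python
import numpy as np

# Standard definitions of the PA and IoU metrics used above, computed from
# integer label maps. BFScore is omitted here because it requires boundary
# extraction; these are textbook formulas, not the authors' exact code.
def pixel_accuracy(pred, gt):
    return float(np.mean(pred == gt))

def iou_per_class(pred, gt, n_classes):
    ious = []
    for c in range(n_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        ious.append(inter / union if union else float("nan"))
    return ious

# Toy 4 x 4 example with two classes; two pixels are misclassified.
gt = np.array([[0, 0, 1, 1]] * 4)
pred = np.array([[0, 0, 1, 1],
                 [0, 0, 1, 1],
                 [0, 1, 1, 1],
                 [0, 0, 0, 1]])
pa = pixel_accuracy(pred, gt)                 # 14 / 16 = 0.875
ious = iou_per_class(pred, gt, n_classes=2)   # 7/9 for each class here
```

The per-class IoU is then averaged over classes to give the mean IoU reported in the tables.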


3.3 Analysis of Experimental Results

This paper experimentally compares the proposed MSRU-Net with K-means [13], Res-UNet [14] and SE-UNet [15]. The results are shown in Table 2.

Table 2. The comparison of evaluation metrics between our MSRU-Net and SOTA models.

Index     Model     Vitrinite  Inertinite  Exinite  Fusinite  Mean
PA        K-means   0.6908     0.6291      0.4247   0.5425    0.5718
          Res-UNet  0.8865     0.8959      0.7186   0.8873    0.8470
          SE-UNet   0.9162     0.9128      0.8297   0.9026    0.8703
          Ours      0.9458     0.9362      0.8405   0.9273    0.9124
IoU       K-means   0.5231     0.5152      0.4376   0.5429    0.5047
          Res-UNet  0.8467     0.8349      0.6792   0.8029    0.7909
          SE-UNet   0.8939     0.8046      0.7308   0.7902    0.8049
          Ours      0.9147     0.8462      0.7473   0.8125    0.8301
BFScore   K-means   0.6267     0.4935      0.3558   0.5291    0.5013
          Res-UNet  0.8451     0.8932      0.7015   0.8164    0.8141
          SE-UNet   0.8869     0.8986      0.7422   0.8417    0.8424
          Ours      0.8919     0.9037      0.7496   0.8428    0.8470

The coal maceral group mainly includes Vitrinite, Inertinite, Exinite and Fusinite. MSRU-Net scores higher than the state-of-the-art (SOTA) models on the evaluation metrics of Sect. 3.2 for each coal maceral group. Even for Exinite, the most difficult group to segment, the IoU of 74.73% obtained by MSRU-Net exceeds the other models. MSRU-Net obtains its best IoU of 91.47% on Vitrinite, and it is superior to the other models in the remaining evaluation metrics as well. Figure 6 shows the segmentation results obtained by the four models on four groups of sample images. As Fig. 6 shows, the K-means algorithm can divide the coal maceral group images into four categories through pixel-wise calculation on gray images. However, the coal maceral groups have a complicated distribution, and K-means cannot segment them precisely, which leads to poor performance on the evaluation metrics of Sect. 3.2. SE-UNet, Res-UNet and MSRU-Net can distinguish the coal maceral groups roughly, but SE-UNet and Res-UNet may misjudge individual coal maceral groups, as shown in Fig. 6(d) and Fig. 6(e), whereas the proposed MSRU-Net shows better segmentation ability and a good fit to the real boundaries, as shown in Fig. 6(f).

Fig. 6. The comparison of segmentation results of different algorithms.(a) Dataset image; (b) Label image; (c) K-means; (d) SE-UNet; (e) Res-UNet; (f) Ours

4 Conclusion

To address the accuracy problem of coal maceral group segmentation, we proposed MSRU-Net, which takes U-Net as its main structure. MSRU-Net includes the residual block for reducing the loss of coal maceral features, the MCA block for improving multi-scale feature extraction on the coal maceral group, and the SE block for distinguishing the importance of features at different levels. Experimental results indicate that the proposed MSRU-Net improves coal maceral group segmentation accuracy. Compared with SOTA models, MSRU-Net segments coal maceral group images more accurately, with a mean PA of 91.24%, IoU of 83.01% and BFScore of 84.70%. The ablation experiment shows that the residual block alleviates gradient vanishing in MSRU-Net, the MCA block collects context information and filters the multi-scale feature information of the coal macerals, and the SE block assigns different weights to features at different levels. MSRU-Net offers a new model for better distinguishing coal quality and utilizing coal efficiently.

References

1. Yuan, J., et al.: Coal use for power generation in China. Resour. Conserv. Recycl. 129, 443–453 (2018)
2. Wang, G., et al.: Intelligent and ecological coal mining as well as clean utilization technology in China: review and prospects. Int. J. Min. Sci. Technol. 29(2), 161–169 (2019)
3. Yang, Z., et al.: Fracturing characteristics analysis of 800 meters deeper coalbed methane vertical wells. J. China Coal Soc. 41(1), 100–104 (2016)
4. Wang, P., et al.: Coal micrograph segmentation based on visual clustering. In: 9th World Congress on Intelligent Control and Automation (WCICA), pp. 683–687. IEEE, Taipei, Taiwan (2011)
5. Jiang, M., et al.: The study of coal macerals enhancement based on adaptive Gamma correction. J. Chinese Elec. Micros. Soc. 39(01), 46–52 (2020)
6. Saraswathi, S., et al.: Adaptive supervised multi-resolution approach based modeling of performance improvement in satellite image classification. J. Ambient. Intell. Humaniz. Comput. 12, 6421–6431 (2021)
7. Liu, X., et al.: Ore image segmentation method of conveyor belt based on U-Net and Res_UNet models. J. Northeast. Univ. (Natural Science) 40(11), 1623–1629 (2019)
8. Chang, H., et al.: Research on tunnel crack segmentation algorithm based on improved U-net network. Comput. Eng. Appl. 57(22), 215–222 (2021)
9. Meng, L., et al.: Maceral groups analysis of coal based on semantic segmentation of photomicrographs via the improved U-net. Fuel 294(2), 120475 (2021)
10. Hu, J., et al.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
11. Csurka, G., et al.: What is a good evaluation measure for semantic segmentation? In: BMVC 2013 - Electronic Proceedings of the British Machine Vision Conference 2013, Bristol, United Kingdom (2013)
12. Memon, M.M., et al.: Unified DeepLabV3+ for semi-dark image semantic segmentation. Sensors 22(14), 5312 (2022)
13. Estlick, M., et al.: Algorithmic transformations in the implementation of k-means clustering on reconfigurable hardware. In: Proceedings of the 2001 ACM/SIGDA Ninth International Symposium on Field Programmable Gate Arrays, pp. 103–110. ACM, Monterey, CA, United States (2001)
14. Liu, S., et al.: Segmenting nailfold capillaries using an improved U-net network. Microvasc. Res. 130, 104011 (2020)
15. Sofla, R.A.D., et al.: Road extraction from satellite and aerial image using SE-Unet. J. Appl. Remote Sens. 15(1), 014512 (2021)

Design of Magnetic Tactile Sensor Arrays for Intelligent Floorboard Based on the Demand of Older People Lu Wang1 , Ling Weng2 , and Bowen Wang2(B) 1 School of Architecture and Art Design, Hebei University of Technology, Tianjin, China 2 School of Electrical Engineering, Hebei University of Technology, Tianjin, China

[email protected]

Abstract. The need for home-based eldercare has grown as China’s population ages, and concerns related to improving older people’s quality of life, such as designing supplementary intelligent tools and assisting mobility, are gaining prominence. The sense of touch is an essential capability for humans to explore the ambient environment. A magnetic tactile sensor array for detecting falls of frail older people has been built in this work to meet the needs of older individuals. The testing results reveal that the magnetic tactile sensor arrays mounted on intelligent floors can detect falls of elders. The sensor arrays, installed on the floorboard, are placed on the paths where the elderly often walk, such as the bedroom and toilet path floors, to detect whether the elderly have fallen. Keywords: Older people · Magnetic type · Tactile sensor array · Intelligent floorboard

1 Introduction Population aging is a worldwide problem, and the number of Chinese who are over 60 years old ranks first in the world. The need for eldercare is growing due to the large population, while the number of family carers is decreasing. According to the United Nations, about 20% of the world’s population will be elderly, and China will account for 25% of the world’s total older people. In 2020, 15.2% of China’s population was 60 years of age or older, and this proportion is expected to reach 24.9% by 2030 and 34.7% by 2050. There will be 300 million one-child families by 2050, which implies a decline in the population’s capacity to care for senior parents and grandparents [1]. The best way to take care of elderly people, according to Powell, requires family support because maintaining autonomy is challenging [2]. For older people, their homes and communities serve as both a physical location and a repository of memories. The fundamental challenge is how to support seniors who are aging in place while easing the burden of care on family caregivers. Frail older persons can benefit significantly from auxiliary tools, which can also improve their quality of life by expanding accessibility. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 618–625, 2023. https://doi.org/10.1007/978-981-99-6187-0_61


The research related to this field includes assistive device control, space environment detection, and health monitoring. A haptic sensor was developed by Rimus et al. [3] to recognize common household items, and the measurement results reveal that the accuracy rate for fruits is as high as 84%. Moreover, the rigidity of an object has been evaluated via contact through a single contact point while grasping, according to Li et al.’s design of a bionic tactile finger based on Galfenol, which was equipped on a manipulator [4]. A magnetostrictive tactile sensor unit with a large force measurement range and high sensitivity has been designed using L-shaped Galfenol wires, tunnel magnetoresistive elements, and cylindrical permanent magnets [5]. Tactile sensors are now being used in manipulators to help frail elders regain their health [6, 7]. Manipulators with touch sensors can improve medical realism for more difficult activities [8]. Tactile sensors for detecting falls of frail elders are important, since falls at home can cause serious injuries. For fall detection, wearable devices have built-in sensors that monitor the user’s motion [9]; they generally rely on sensors such as accelerometers, gyroscopes, and magnetometers [10, 11]. Ref. [12] has further proposed an MVC-based approach for fall detection using the data transmitted by all five sensors on the test subject. However, magnetic tactile sensors installed on intelligent floorboards for fall detection have not been reported. Therefore, it is interesting to design a magnetic tactile sensor for detecting falls of frail elders, based on the demands of older people.

2 Tactile Sensor Demand for Frail Older Persons When a person reaches old age, their physical abilities may start to decline drastically. The loss of physical control might be linked to other fourth-age transitions including dementia and falls [13, 14]. A questionnaire was designed to understand the basic use of sensors in daily life, such as blood pressure sensors, gas sensors, and fall detectors. Simultaneously, we designed interviews to enquire about each family’s circumstances, including residence conditions, neighborhood environment, and financial state. Ten households from five Tianjin neighborhoods were interviewed at their homes, and the conversations lasted 40–60 min. The interviewees welcomed our investigators and were ready to talk about the issues they encountered when caring for vulnerable seniors. The results showed that more than 90% of families required living auxiliary tools and health monitoring. 1) Auxiliary bed demand: body mobility degrades with age, and the number of frail older adults increases. “We cannot care for our parents all day, and we need an auxiliary bed to help our parents move their bodies,” Mrs Zhang explained. Supplementary intelligent beds have been developed and are now available on the market. 2) Demand for blood pressure detectors: there is a demand for blood pressure detectors, and older adults can use them to check their blood pressure. Mrs Li, for example, stated, “My mother always worries about her blood pressure, and we need a detector to help her measure it.” Blood pressure detectors can easily be found on the market. 3) During the in-depth interviews, 50% of families talked about the fall problem of older people. As a result, it is vital to develop a fall detector based on the needs of seniors. The fall detector could be utilized to help caregivers look after their parents.


However, such fall detectors have seen limited application and are difficult to find on the market. In the following sections, we design a tactile sensor array that is mounted on intelligent flooring to detect falls in older individuals.

3 Design of Tactile Sensor Array and Its Output Characteristic 3.1 Design of Magnetic Tactile Sensor Array Each sensor unit in the tactile sensor array consists of Fe-Ga alloy wires, a Nd-Fe-B magnet, a contact head, a pedestal, and a Hall chip. The composition of the Fe-Ga alloy wire is Fe83Ga17, the wire diameter is 0.5 mm, and the wire length is 8 mm. The magnetic field of the Fe-Ga wires changes when they are stressed. The pedestal was attached to one end of the cantilevered Fe-Ga wire of the sensor unit. The Hall chip, which transforms the magnetic field into an output voltage, was installed on the surface of the pedestal and outside of the Fe-Ga wires. On both sides of the pedestal, permanent magnets with dimensions of 4 mm × 1.2 mm × 1 mm create a biased magnetic field for the device. The diameter of the contact head is 1 mm, and the thickness is 5 mm. The structure of the tactile sensor is shown in Fig. 1.

Fig. 1. Structure parts (a) and configuration (b) of magnetic tactile sensor unit

The magnetic tactile sensor array is created by integrating sensor units, the primary components being the sensor units and a flexible circuit board. Nine sensor units were integrated into the circuit board in this case. The magnetic field analysis of the sensing unit shows that its magnetic field is symmetrically distributed about the middle wire as the central axis. Based on the concept of sensing units sharing permanent magnets, a 3 × 3 tactile sensing array is designed as shown in Fig. 2; the three transverse sensing units share four permanent magnets with a width of 17 mm. The permanent magnet between the longitudinal sensing units changes the magnetic circuit, so the front-to-back distance between permanent magnets is used as an optimization object to reduce the effect of the longitudinal magnetic field. The Hall element circuit of each sensing unit consists of three parts, a power line, a ground wire, and a signal line, based on the integrated design of the electrical system, and the Hall connection circuit of the sensing units in the array adopts a flexible integrated


circuit board. The analog signal from the tactile sensing array is converted into a digital signal by an analog-to-digital converter and transmitted to a computer for further analysis.

Fig. 2. Flexible tactile sensor array: (a) three-dimensional structure; (b) plane structure design

3.2 The Output Characteristic of a Magnetic Tactile Sensor Array The magnetic sensor array was manufactured and tested. Pulsed forces of the same amplitude were applied to each sensor unit consecutively. The corresponding voltage signals from all sensor units were recorded during the entire testing procedure, as shown in Fig. 3(a). The maximum output voltage reaches 87.5 mV and the minimum is 83.8 mV. Under the same pulsed force, the output voltage of the nine contacts is stable at about 85 mV, maintaining the consistency of the output. Moreover, the output waveform proves that there is little interaction between the sensor units, which means that the array is not limited by mutual interference as it expands on demand. When each row of contacts is quickly pressed in turn, the output voltage versus time is shown in Fig. 3(b). The output voltage signals appear when a force is exerted on a unit, and the voltage values of different units change with different pressures. At the same time, when one of the rows is subjected to external forces, the effect on the output of the other sensor units is small and negligible, proving that the distance of 7.5 mm between two rows of sensing units is appropriate. A static force of 2 N was applied to sensing units No. 1–9, and the output voltage curve is shown in Fig. 4. The maximum output voltage of the 9 sensing units in the array is 87.6 mV, and the minimum is 83.7 mV. The difference in output voltage is 3.9 mV, which means that the output of the sensing units is stable [15]. The central No. 5 sensing unit has the highest output voltage because the number of surrounding permanent magnets increases when the array is integrated, which increases the bias magnetic field.
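The consistency figure quoted above (a 3.9 mV spread) is simply the difference between the largest and smallest unit outputs. A trivial sketch follows; the individual readings are illustrative, as only the 87.6 mV maximum and 83.7 mV minimum are reported in the paper:

```python
def output_spread(voltages_mv):
    """Difference between the largest and smallest unit outputs, in mV."""
    return max(voltages_mv) - min(voltages_mv)

# Hypothetical per-unit readings bracketed by the reported extremes.
readings = [85.1, 87.6, 84.9, 85.4, 86.2, 84.2, 83.7, 85.0, 86.1]
spread = output_spread(readings)   # about 3.9 mV
```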

Fig. 3. Magnetic sensor output versus time when nine different units were pressed

Fig. 4. Output voltage of magnetic sensor array: (a) schematic diagram of sensor array position; (b) output voltage of nine different sensing units


3.3 Intelligent Floorboard with Magnetic Tactile Sensor Arrays The older adults touch the ground with their feet when they are doing normal activities at home, and the area of contact with the ground increases significantly after they fall. Based on the area of touching the ground, we can judge whether the elderly have fallen at home. Therefore, a tactile sensor array can be combined with a household floor to develop a smart floor with a fall-detection function. We have installed the magnetic tactile sensor array on the floorboard, and it is used to detect the fall of older people at home.

Fig. 5. Intelligent floorboard with 3 × 3 tactile sensor arrays

Figure 5 shows the intelligent floorboard with 3 × 3 magnetic tactile sensor arrays. The intelligent floorboard mainly contains 6 tactile sensor arrays and 12 floorboard blocks, and two feet can touch at most 4 sensor arrays (36 units). If a person touches more than 4 sensor arrays (more than 36 units), we can infer that the person may have fallen on the floor. To increase the practicality of the smart floorboard, we also designed a sensor array of 2 × 2 units and applied it to make the smart floorboard shown in Fig. 6. This smart floorboard contains 6 sensor arrays, and two feet touch 16 sensor units. The intelligent floorboard with 2 × 2 sensor arrays is simpler and more practical.

Fig. 6. Intelligent floorboard with 2 × 2 tactile sensor arrays

Figure 7 shows the output voltage from two different units of a 2 × 2 tactile sensor array when an elderly person walks on the intelligent floorboard; the voltage output reaches 30 mV. Compared with the results in Fig. 3, the output voltage is reduced because embedding the sensor array in the floor shortens the travel of the sensor unit contacts. The developed intelligent floor can thus be used to detect falling behavior of elders at home. The intelligent floor can be installed on the paths where the elders often walk, such as the bedroom and toilet path floor, to detect whether the elders have fallen. If an elderly person may have fallen at home, family members are notified through mobile phones in time.
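The detection rule described above (normal standing or walking touches at most 4 arrays, i.e. 36 units of the 3 × 3 design) can be sketched as a simple threshold check. Function and variable names are our own; the paper gives no code:

```python
def detect_fall(active_units_per_array, max_standing_units=36, max_standing_arrays=4):
    """Flag a possible fall when the floor contact exceeds what two standing
    feet would cover: at most 4 arrays (36 units of the 3 x 3 design)."""
    total_active = sum(active_units_per_array)
    arrays_touched = sum(1 for n in active_units_per_array if n > 0)
    return arrays_touched > max_standing_arrays or total_active > max_standing_units

# Standing on two feet: 4 of the 6 arrays pressed, 9 units each -> no alarm.
standing = detect_fall([9, 9, 9, 9, 0, 0])
# Lying on the floor: all 6 arrays report contact -> alarm.
fallen = detect_fall([9, 9, 9, 9, 9, 9])
```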


Fig. 7. The output voltage of the two different contacts

4 Conclusion This study presents a magnetic tactile sensor array composed of magnetostrictive materials, permanent magnets, and Hall sensors. The magnetic sensor array was manufactured and tested. The maximum output voltage reaches 87.5 mV, the minimum is 83.8 mV, and the difference in output voltage under a static force is 3.9 mV. A tactile sensor array was combined with a household floor to develop a smart floorboard with a fall-detection function. The magnetic tactile sensor arrays were used to manufacture the intelligent floorboard, which can detect falls of older people at home. The intelligent floor can be installed on the paths where the elders often walk, such as the bedroom and toilet path floor, to detect whether the elderly have fallen. Acknowledgements. This work was supported by the Natural Science Foundation of Hebei Province under Grant E2017202035.

References
1. Ji, J.: Older people’s preferences in eldercare: current situation, trends and group differences, based on the CLASS data. Sci. Res. Aging 10(7), 14–27 (2022)
2. Powell, C.: Care for older people in multigenerational families: a life course analysis across four generations. Fam. Relat. Soc. 7(1), 103–121 (2018)
3. Rimus, A., Kootstra, G., Bilberg, A., Kragic, D.: Design of a flexible tactile sensor for classification of rigid and deformable objects. Robot. Auton. Syst. 62(1), 3–15 (2014)
4. Li, Y.K., Wang, B.W., Li, Y.Y., et al.: Design and output characteristics of magnetostrictive tactile sensor for detecting force and stiffness of manipulated objects. IEEE Trans. Ind. Inform. 15(2), 1219–1224 (2019)
5. Yang, H., Weng, L., Wang, B., Huang, W.: Design and characterization of high-sensitivity magnetostrictive tactile sensor array. IEEE Sens. J. 22(13), 12645–12655 (2022)
6. Wan, Y., Wang, Y., Guo, C.: Recent progresses on flexible tactile sensors. Mater. Today Phys. 1, 61–73 (2017)
7. Kwon, O.-K., An, J.-S., Hong, S.-K.: Capacitive touch systems with styli for touch sensors: a review. IEEE Sens. J. 18(12), 4832–4846 (2018)
8. Zheng, W., Liu, H., Wang, B., Sun, F.: Cross-modal surface material retrieval using discriminant adversarial learning. IEEE Trans. Ind. Inform. 15(9), 4978–4987 (2019)
9. Khan, S., Parkinson, S., Grant, L., Liu, N., Mcguire, S.: Biometric systems utilising health data from wearable devices: applications and future challenges in computer security. ACM Comput. Surv. 53, 1–29 (2020)


10. Wu, F., Zhao, H., Zhao, Y., Zhong, H.: Development of a wearable-sensor-based fall detection system. Int. J. Telemed. Appl. 2015, 1–12 (2015)
11. Khan, I., et al.: Monitoring system-based flying IoT in public health and sports using ant-enabled energy-aware routing. J. Healthc. Eng. 2021, 1686946 (2021)
12. Mankodiya, H., et al.: XAI-Fall: explainable AI for fall detection on wearable devices using sequence models and XAI techniques. Mathematics 10(12), 1990–1998 (2022)
13. Degnen, C.: Cross-Cultural Perspectives on Personhood and the Life Course. Palgrave Macmillan, Basingstoke (2018)
14. Atkinson, J.D.: Research methodologies. In: Atkinson, J.D. (ed.) Journey into Social Activism, pp. 27–64. Fordham University (2017)
15. Yang, H., Weng, L., Wang, B., Huang, W.: Design and characterization of high-sensitivity magnetostrictive tactile sensor array. IEEE Sens. J. 22(5), 4004–4013 (2022)

Gaussian Process-Augmented Unscented Kalman Filter for Autonomous Navigation During Aerocapture at Mars Shihang Cui(B) and Yong Li Qian Xuesen Laboratory of Space Technology, China Academy of Space Technology, Beijing 100094, China [email protected]

Abstract. The problem of Mars aerocapture autonomous navigation with an uncertain parameter was investigated. Both the dynamic model and measurement model of the spacecraft during Mars aerocapture were established, and an autonomous navigation algorithm based on a Gaussian Process-Augmented Unscented Kalman Filter was proposed. The Gaussian Process was used to simulate the relationship between the Mars atmospheric density and the states of the spacecraft, and to provide statistical information to the Augmented Unscented Kalman Filter. Subsequently, the uncertain parameter was augmented to the system states, and the Augmented Unscented Kalman Filter was used for state prediction and estimation. Simulation results demonstrated the effectiveness of the proposed navigation algorithm. Keywords: Mars Aerocapture · Parameter Uncertainty · Gaussian Process Regression · Augmented Unscented Kalman Filter

1 Introduction In recent years, Mars exploration has garnered increasing attention. When exploring Mars, a spacecraft must decelerate to ensure that it is captured by the planet. Thus, one of the most critical considerations is how to minimize fuel consumption while accurately maneuvering the spacecraft into the designated orbit. One effective way to achieve this goal is aerocapture [1]. Aerocapture leverages the resistance of the planetary atmosphere to reduce the speed of the spacecraft and guide it into the predetermined orbit, which greatly improves mission efficiency and makes it one of the paramount strategies in deep space exploration. To ensure the successful implementation of such tasks, the most important issue is accurate estimation of the spacecraft’s state and of parameters such as atmospheric density and aerodynamic coefficients. To improve the accuracy of autonomous navigation during the Mars aerocapture process, it is essential to accurately model the uncertain parameters in the system dynamics model, particularly the atmospheric density of Mars [2]. However, the parameters during the Mars aerocapture process vary significantly, and the uncertainty is high, so traditional modeling methods can hardly provide a comprehensive description of them [3]. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 626–633, 2023. https://doi.org/10.1007/978-981-99-6187-0_62


Gaussian Process (GP), an advanced statistical modeling technique, has emerged as a popular research topic in the field of machine learning worldwide [4]. Wei proposed a model-free Bayesian filtering method that combines Gaussian Process Regression (GPR) [5]. This approach uses GPR to learn the system dynamics and measurement models and dynamically acquire the statistical properties of the system model and output noise. Meng proposed a new algorithm called Gaussian Process Regression Square-Root Unscented Particle Filter (GPSR-UPF) to deal with the problem of decreased filtering accuracy due to inaccurate dynamic models [6]. Ye developed an Enhanced GP-UKF to improve the prediction and estimation capability [7]. This approach uses GPR to learn the residuals between the filtering estimation and the post-processed smooth estimation. However, these studies abandoned the original dynamic model of the system, which results in the need for multiple GPs to simulate the system model. In this paper, the Gaussian Process was used to provide statistical information of the uncertain parameters, while the estimation of the states is given by the Augmented Unscented Kalman Filter. In the meantime, the covariance of the state is affected by the uncertainty of the parameters, which can be reflected in the Unscented Kalman Filter algorithm. The remaining sections of this paper are structured as follows: Sect. 2 presents the system model during the process of Mars aerocapture; Sect. 3 provides a brief introduction to Gaussian Process Regression; Sect. 4 proposes a navigation algorithm using the Gaussian Process-Augmented Unscented Kalman Filter (GP-AUKF); and Sect. 5 verifies the feasibility of the proposed algorithm through simulations.

2 Dynamic Models

The planetary planar entry dynamic model ignores the planetary rotation effects. The system state x includes the radial distance r, which is the distance from the center of the spacecraft to the Martian centroid, the velocity of the spacecraft V, and the flight-path angle (FPA) γ. The dynamic model with process noise w is given as

ẋ = [ṙ, V̇, γ̇]ᵀ = f(x, t) + w = [ V sin γ,  −d − (μ/r²) sin γ,  (1/V)( d(L/D) cos σ − (μ/r²) cos γ + (V²/r) cos γ ) ]ᵀ + w   (1)
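As a sanity check, the planar entry dynamics of Eq. (1) can be integrated numerically. The sketch below is our own illustration, not the paper's simulation setup: it uses a fixed-step RK4 integrator, the nominal exponential density of Eq. (3), a constant 15° bank angle, and the vehicle constants listed in Sect. 5.

```python
import math

# Constants from the paper: Martian gravitational constant, nominal density
# model of Eq. (3), and ballistic coefficient / L/D built from Sect. 5 values.
MU = 42828.29e9                       # m^3/s^2
RHO0, HS, R0 = 2.0e-4, 7.5e3, 3436e3  # kg/m^3, m, m
B = 1.45 * 16.98 / 7387.0             # C_D * S_r / m
LD = 0.348 / 1.45                     # lift-drag ratio

def entry_rhs(state, cos_sigma):
    """Right-hand side of Eq. (1), without process noise."""
    r, V, gamma = state
    rho = RHO0 * math.exp((R0 - r) / HS)          # nominal density, Eq. (3)
    d = B * 0.5 * rho * V**2                      # drag acceleration, Eq. (2)
    return (V * math.sin(gamma),
            -d - MU / r**2 * math.sin(gamma),
            (d * LD * cos_sigma - (MU / r**2 - V**2 / r) * math.cos(gamma)) / V)

def rk4_step(state, cos_sigma, h):
    k1 = entry_rhs(state, cos_sigma)
    k2 = entry_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)), cos_sigma)
    k3 = entry_rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)), cos_sigma)
    k4 = entry_rhs(tuple(s + h * k for s, k in zip(state, k3)), cos_sigma)
    return tuple(s + h / 6.0 * (a + 2*b + 2*c + e)
                 for s, a, b, c, e in zip(state, k1, k2, k3, k4))

# Entry conditions from Sect. 5: 125 km altitude, 7.150 km/s, FPA -12.731 deg.
state = (R0 - 40e3 + 125e3, 7150.0, math.radians(-12.731))
for _ in range(80):                               # 80 s at 1 s steps
    state = rk4_step(state, math.cos(math.radians(15.0)), 1.0)
```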

in which μ = 42,828.29 × 10⁹ m³/s² is the Martian gravitational constant, and σ is the control input of the system, representing the bank angle of the spacecraft. d represents the drag acceleration of the spacecraft, and L/D is the lift-drag ratio:

d = B ρ_d,  B = C_D S_r / m,  ρ_d = (1/2) ρ V²   (2)

where B is the ballistic coefficient, ρ_d is the dynamic pressure, C_D is the aerodynamic drag coefficient, S_r is the reference area of the spacecraft, m is the spacecraft mass, and ρ is the Mars atmospheric density. The actual density value is described as

ρ = ρ̄ (1 + δρ),  ρ̄ = ρ₀ exp( (r₀ − r) / h_s )   (3)


where ρ̄ is the nominal value of the Mars atmospheric density, and δρ is a small disturbance between ρ and ρ̄, i.e. the percentage difference from the nominal value [8]. ρ₀ = 2.0 × 10⁻⁴ kg/m³ is the atmospheric density at the Mars reference surface, h_s = 7.5 km is the scale height, and r₀ = 3436 km is the radial position of the Mars reference surface (40 km above the surface of Mars).

We choose the measured acceleration as the measurement; the measurement model with measurement noise v can be written as

y = h(x, t) + v = [d, 0, 0]ᵀ + v   (4)

3 Gaussian Process Regression

The entire statistical properties of a GP are fully determined by its mean and covariance function:

f(x) ~ GP( m(x), k(x, x′) )   (5)

Assuming there are n observation data points, where xᵢ ∈ ℝᵈ represents an input and yᵢ ∈ ℝ the corresponding output, the training dataset is denoted as D(X, y). Assuming that the observed target values y are influenced by noise, the output values can be expressed as

yᵢ = f(xᵢ) + ε   (6)

where ε is an independent random variable that follows a Gaussian distribution, ε ~ N(0, σ²). Thus, the prior distribution of the observed target variable y is

y ~ N( 0, K(X, X) + σ² I )   (7)

Here, K(X, X) is an n × n symmetric positive-definite covariance matrix, where each element K_ij represents the correlation between xᵢ and xⱼ. The joint Gaussian prior distribution formed by the n training outputs y and one test output y∗ is

[y; y∗] ~ N( 0, [ K(X, X) + σ² I,  K(X, x∗);  K(X, x∗)ᵀ,  k(x∗, x∗) ] )   (8)

where K(X, x∗) is the n × 1 covariance matrix between the test input x∗ and all inputs in the training set, while k(x∗, x∗) is the covariance of the test input x∗ with itself. The GP provides the expected value and variance of the predicted target value y∗ as follows:

GP_μ(x∗, X, y) = k∗ᵀ ( K(X, X) + σ² I )⁻¹ y   (9)

GP_Σ(x∗, X, y) = k(x∗, x∗) − k∗ᵀ ( K(X, X) + σ² I )⁻¹ k∗   (10)

where k∗ = ( k(x₁, x∗), ..., k(x_n, x∗) )ᵀ is the covariance vector between the training sample set and x∗.
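Eqs. (9)–(10) can be sketched in a few lines of NumPy. The squared-exponential kernel and its hyperparameters below are our illustrative assumptions; the paper does not specify a kernel:

```python
import numpy as np

def rbf(a, b, ell=0.2, sf=1.0):
    """Squared-exponential kernel k(x, x') on 1-D inputs (our choice)."""
    return sf**2 * np.exp(-0.5 * (a[:, None] - b[None, :])**2 / ell**2)

def gp_posterior(X, y, x_star, noise=1e-2, ell=0.2, sf=1.0):
    """Predictive mean and variance of Eqs. (9)-(10); `noise` is sigma."""
    K = rbf(X, X, ell, sf) + noise**2 * np.eye(len(X))
    k_star = rbf(X, np.atleast_1d(x_star), ell, sf)        # n x 1
    mean = k_star.T @ np.linalg.solve(K, y)                # Eq. (9)
    var = rbf(np.atleast_1d(x_star), np.atleast_1d(x_star), ell, sf) \
        - k_star.T @ np.linalg.solve(K, k_star)            # Eq. (10)
    return mean.item(), var.item()

# Toy data: learn y = sin(2*pi*x) from 8 samples on [0, 1], predict at 0.25.
X = np.linspace(0.0, 1.0, 8)
y = np.sin(2.0 * np.pi * X)
m, v = gp_posterior(X, y, 0.25)
```

In the paper's setting the inputs would be spacecraft states (altitude) and the outputs measured density, but the posterior formulas are identical.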


4 Gaussian Process-Augmented Unscented Kalman Filter

The Gaussian Process-Augmented Unscented Kalman Filter (GP-AUKF) is proposed in this section. In this research, the dynamics model and measurement model of the system are retained, and the GP is used to fit the relationship between density and altitude. In the AUKF, the states of the system are augmented with the uncertain parameter. The augmented state of the system and its error covariance matrix are given as

z = [x; ρ],  P_zz = [ P_xx, P_xρ;  P_ρx, P_ρρ ]   (11)

4.1 Predictive Step

With the equations of motion given in Sect. 2, the sigma points are then created. Firstly, a lower-diagonal square-root factorization of P_zz is formed:

P_zz = S_zz S_zzᵀ,  S_zz = [ S_xx, 0;  P_ρx S_xx⁻ᵀ, (P_ρρ − P_ρx P_xx⁻¹ P_xρ)^{1/2} ]   (12)

In the AUKF algorithm, the square-root matrix S_zz,k−1 is used to create 2(n_x + n_ρ) augmented sigma points:

Z_{i,k−1} = ẑ_{k−1} + √(n_x + n_ρ) s_{i,k−1},  Z_{i+n_x+n_ρ,k−1} = ẑ_{k−1} − √(n_x + n_ρ) s_{i,k−1}   (13)

where i = 1, 2, ..., n_x + n_ρ, and n_x and n_ρ are the numbers of estimated states and augmented uncertain parameters, respectively. In this research, since there are three states to be estimated and the only augmented parameter is the atmospheric density ρ, n_x = 3 and n_ρ = 1; s_{i,k−1} is the i-th column of S_zz,k−1. Each sigma point of the estimated states is numerically integrated from t_{k−1} to t_k using Eq. (1). For the augmented parameter, once the GP is trained, the sigma points are no longer used to calculate its mean and covariance; instead, GP^F_{μ,k} and GP^F_{Σ,k} provided by the GP are used to describe the statistical information of the atmospheric density, which can then affect the propagated state. The augmented mean and covariance at t_k are then

ẑ_k = [ x̂_k;  ρ̂_k ] = [ Σ_{i=1}^{2(n_x+n_ρ)} ω_i X_{i,k};  GP^F_{μ,k} ]   (14)

P_zz,k = Σ_{i=1}^{2(n_x+n_ρ)} ω_i [ (X_{i,k} − x̂_k)(X_{i,k} − x̂_k)ᵀ, (X_{i,k} − x̂_k)(ρ_k − GP^F_{μ,k})ᵀ;  (ρ_k − GP^F_{μ,k})(X_{i,k} − x̂_k)ᵀ, GP^F_{Σ,k} ]   (15)

where ω_i = 1 / (2(n_x + n_ρ)). For the augmented uncertain parameter of atmospheric density, we use the mean and variance obtained from the GP instead of the UT transformation results, as the atmospheric density model is inaccurate and the traditional UKF cannot handle such uncertainties.
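The equally weighted sigma-point construction of Eq. (13) can be sketched as follows. For simplicity, a plain Cholesky factor replaces the block square root of Eq. (12) — both satisfy S Sᵀ = P_zz — and the numerical values below are made up for the demo, not taken from the paper:

```python
import numpy as np

def augmented_sigma_points(z_hat, P_zz):
    """2n equally weighted sigma points of Eq. (13): z_hat +/- sqrt(n) s_i,
    where s_i are the columns of a square root S of P_zz (Cholesky here)."""
    n = len(z_hat)
    S = np.linalg.cholesky(P_zz)          # S @ S.T == P_zz
    scaled = np.sqrt(n) * S
    return np.hstack([z_hat[:, None] + scaled, z_hat[:, None] - scaled])

# Illustrative augmented state [r, V, gamma, rho] and diagonal covariance.
z_hat = np.array([3.5e6, 7.0e3, -0.2, 1.8e-4])
P_zz = np.diag([1.0e6, 25.0, 1.0e-4, 1.0e-10])
Z = augmented_sigma_points(z_hat, P_zz)   # shape (4, 8)
w = 1.0 / Z.shape[1]                      # omega_i = 1 / (2(n_x + n_rho))
```

With these weights, the sample mean of the sigma points recovers ẑ and the weighted outer products recover P_zz exactly, which is the property the predictive step relies on.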


4.2 Corrective Step

The measurement sigma points Y_i can be obtained from the sigma points Z_i:

Y_i = h(Z_i, t)   (16)

The predicted measurement ŷ and its covariance are

ŷ = Σ_{i=1}^{2(n_x+n_ρ)} ω_i Y_i,  P_yy = Σ_{i=1}^{2(n_x+n_ρ)} ω_i (Y_i − ŷ)(Y_i − ŷ)ᵀ + R   (17)

The state-measurement cross covariance P_zy can be obtained by

P_zy = [ P_xy;  P_ρy ] = Σ_{i=1}^{2(n_x+n_ρ)} ω_i [ (X_i − x̂)(Y_i − ŷ)ᵀ;  (ρ_i − GP^F_μ)(Y_i − ŷ)ᵀ ]   (18)

The augmented Kalman gain is defined as

K_z = [ K_x;  K_ρ ] = P_zy P_yy⁻¹   (19)

and the corrections of ẑ⁻ and P_zz⁻ are

ẑ⁺ = ẑ⁻ + K_z ( y − ŷ )   (20)

P_zz⁺ = P_zz⁻ − K_z P_yy K_zᵀ = P_zz⁻ − [ K_x P_yy K_xᵀ, K_x P_yy K_ρᵀ;  K_ρ P_yy K_xᵀ, K_ρ P_yy K_ρᵀ ]   (21)

5 Simulation Results and Analysis

The Mars aerocapture initial conditions for the navigation system are set with a radial distance of 125 km, a velocity of 7.150 km/s, and an FPA of −12.731°. The spacecraft has a reference area of 16.98 m², an entry mass of 7387 kg, and lift and drag coefficients of 0.348 and 1.45, respectively. The measurement noise of the spacecraft during Mars aerocapture is 1500 µg-rms as in [9], and the process noise covariance matrix is Q = diag(Q₁₁, Q₂₂, Q₃₃) = diag(0, 0.001 m/s, 0.0275 deg) [10]. The target orbit after the aerocapture is a 500-km-altitude circular orbit. For the control input σ, we use a bang-bang control structure as in [11]:

u∗(t) = { cos σ₀ if t ≤ t_s;  cos σ_d if t > t_s }   (22)

where t_s is the optimal switch time, which is recalculated at each guidance call. A numerical predictor-corrector algorithm to calculate t_s was described in [1]. In this work, we assume that σ₀ = 15° and σ_d = 90°.
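The bang-bang profile of Eq. (22) is straightforward to implement. A minimal sketch, with σ₀ = 15° and σ_d = 90° as assumed above (the switch-time search itself, the predictor-corrector of [1], is out of scope here):

```python
import math

def bank_angle_command(t, t_s, sigma0_deg=15.0, sigma_d_deg=90.0):
    """Bang-bang guidance profile of Eq. (22): hold bank angle sigma0 until
    the switch time t_s, then switch to sigma_d. Returns u*(t) = cos(sigma)."""
    sigma_deg = sigma0_deg if t <= t_s else sigma_d_deg
    return math.cos(math.radians(sigma_deg))
```

Before the switch the lift vector stays nearly in the vertical plane (cos 15° ≈ 0.966); after it, the 90° bank zeroes the vertical lift component.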


During the first 100 s of simulation, the traditional UKF was used to predict the states, with the atmospheric density given by Eq. (3). The measured density data and state data were used to train the GP during this period. After 100 s, the GP-AUKF was used instead of the traditional UKF to address the problem of uncertain system model parameters. Figure 1 depicts the true values of each state variable of the spacecraft during the Mars aerocapture. Figure 2 shows that GP-AUKF and UKF produced the same errors in the first 100 s of the Mars aerocapture process, since the Gaussian Process was still being trained. The density model adopted during this period introduced significant errors, resulting in poor performance of the UKF. After 100 s, the trained GP model accurately simulated the relationship between Mars atmospheric density and altitude, which significantly reduced the estimation errors compared with the UKF.

Fig. 1. The true values of the states during Mars aerocapture (geocentric radius, velocity, and flight-path angle versus time).

Fig. 2. The absolute-value error comparison between GP-AUKF and UKF (radial distance, velocity, and FPA errors versus time).

6 Conclusion In this paper, we address the problem of uncertain parameters in the dynamic model for the Mars aerocapture. We established the dynamic model and measurement model with uncertain atmospheric density. The proposed GP-AUKF algorithm models the uncertain parameters of the system using Gaussian Process, making it possible to handle uncertain parameters effectively. Compared to traditional UKF algorithm, GP-AUKF algorithm overcomes the limitations of uncertain parameters and has better reliability. This work may provide valuable information for future Mars aerocapture missions. Acknowledgement. This work was supported by the National Key R&D Program of China under Grant 2018YFA0703800 and the State Key Program of National Natural Science Foundation of China under Grant 61833009.

References
1. Lu, P.: Entry guidance: a unified method. J. Guid. Control. Dyn. 37(3), 713–728 (2014)
2. Craft, K.J., DeMars, K.J.: Navigation performance of air data systems for atmospheric entry and descent. In: AIAA SCITECH 2022 Forum, p. 1218 (2022)
3. Karlgaard, C.D., Kutty, P., Schoenenberger, M.: Coupled inertial navigation and flush air data sensing algorithm for atmosphere estimation. J. Spacecr. Rocket. 54(1), 128–140 (2017)
4. Martino, L., Read, J.: A joint introduction to Gaussian processes and relevance vector machines with connections to Kalman filtering and other kernel smoothers. Inf. Fus. 74, 17–38 (2021)


5. Wei, X.-Q., Song, S.M.: Model-free cubature Kalman filter and its application. Control Decision 28(5), 769–773 (2013)
6. Yang, M., Shesheng, G., Wei, W.: Unscented particle filter based Gaussian process regression for IMU/BDS train integrated positioning. In: 2016 IEEE Information Technology, Networking, Electronic and Automation Control Conference. IEEE (2016)
7. Wen, Y.E., et al.: UKF estimation method incorporating Gaussian process regression. J. Beijing Univ. Aeronaut. Astronaut. 45(6), 1081–1087 (2019)
8. Justus, C.G., et al.: Mars-GRAM 2000: a Mars atmospheric model for engineering applications. Adv. Space Res. 29(2), 193–202 (2002)
9. Christian, J., Verges, A., Braun, R.: Statistical reconstruction of Mars entry, descent, and landing trajectories and atmospheric profiles. In: AIAA SPACE 2007 Conference & Exposition (2007)
10. Dutta, S., et al.: Comparison of statistical estimation techniques for Mars entry, descent, and landing reconstruction. J. Spacecr. Rocket. 50(6), 1207–1221 (2013)
11. Lu, P., et al.: Optimal aerocapture guidance. J. Guid. Control. Dyn. 38(4), 553–565 (2015)

Application of EEG S-Transformation Combined with Dimensionless Metrics for Automatic Detection of Cybersickness Zhanfeng Zhou, Chengcheng Hua(B) , Lining Chai, and Jianlong Tao School of Automation, C-IMER, CICAEET, Nanjing University of Information Science and Technology, Nanjing 210044, China [email protected]

Abstract. Cybersickness is a constraint on the development of the VR industry. To solve the problems caused by cybersickness, it is a prerequisite to detect whether users are experiencing it. In this study, the S-transform combined with differential entropy, peak factor and waveform factor was used to extract the EEG features of cybersickness. We used SVM to verify the validity of the extracted features, resulting in an average AUC and accuracy of 0.98 and 97.40% respectively. The findings suggest that the dimensionless features extracted based on the S-transform can be used as an effective indicator for detecting cybersickness. Keywords: Cybersickness · S-transform · dimensionless metrics · Machine Learning

1 Introduction

While users enjoy an immersive experience in a VR world, the cybersickness induced by VR content poses an unavoidable health risk and hinders the development of the VR industry. To address the problems caused by cybersickness, its detection is a prerequisite [1]. Currently, most researchers still use subjective measures such as verbal reports and questionnaires to detect cybersickness. However, subjective assessment methods have limitations: individual differences can be caused by differences in subjective feelings, expression and understanding of the questions, making it difficult to objectively assess the level of cybersickness and limiting practical application [2]. To overcome the limitations of subjective assessment, researchers have begun to investigate the use of multiple physiological signals to assess motion sickness, as physiological signals allow a more objective assessment of the level of cybersickness in VR [3]. Among them, EEG signals can reflect the physiological state of the brain, such as cognitive load, emotion and fatigue. The analysis of EEG signals is expected to reveal the neural activity mechanisms and triggering processes related to VR cybersickness, so research on EEG-based cybersickness detection is increasing year by year [4].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 634–640, 2023. https://doi.org/10.1007/978-981-99-6187-0_63

Krokos et al. used a combination of independent component analysis, time-frequency


analysis and intercorrelation analysis to investigate the EEG power spectrum accompanying fluctuations in the degree of cybersickness during an actual driving task, and concluded that cybersickness can be caused by visual and vestibular stimuli, with the occipital region showing power changes mainly in the Theta and Delta bands [5]. Hyun Kyoon Lim et al. used a combination of EEG signals and a traditional questionnaire to study cybersickness, and concluded that the frontal parts of the Delta, Theta, Beta and Gamma bands showed changes in power associated with cybersickness [6]. The above studies share a limitation: relatively few electrodes were used. VR cybersickness differs from ordinary vertigo, and important EEG information about it is easily missed when only some of the electrodes are used for detection, so EEG equipment with more channels is needed for this study. This paper uses a simulated roller coaster scenario to induce cybersickness and records resting-state and task-state EEG signals from 15 subjects using a 32-channel EEG data acquisition device. This paper proposes a method combining the S-transform with dimensionless metrics to extract the EEG features of cybersickness, achieves a classification accuracy of 97.40%, and may be applied to automatically detect cybersickness.

2 EEG Data Source

In this study, 15 healthy college students (age 20.4 ± 1.5 years, 8 males and 7 females) were recruited as subjects at Nanjing University of Information Science and Technology. Two sets of EEG signals were collected from each subject: 1) a 2-min resting state before the onset of cybersickness; 2) a 2-min task state at the onset of cybersickness. The experiment used a Neuroscan Grael EEG 2 as the EEG acquisition device, Curry8 as the EEG data acquisition software and an Oculus Quest 2 as the VR device. The EEG cap has 32 electrodes, with 2 EOG reference electrodes, and the electrode arrangement follows the international 10–20 system. Cybersickness was induced in the subjects by the "Epic Roller Coasters" scenario developed by BT4 Games.

3 Methods

3.1 EEG Data Pre-processing

We first used the DWT to decompose the EEG signal into six frequency bands: 64–128 Hz, 32–64 Hz, 16–32 Hz, 8–16 Hz, 4–8 Hz and 0–4 Hz. Since the frequency of vertigo-related activity typically ranges from 0.5 to 35 Hz, the four scales covering 0–32 Hz were reconstructed as the filtered data. We then performed data segmentation: the EEG data were divided into 120 epochs of 1 s length with no overlap, with the aim of making the EEG data as stationary as possible within each epoch and increasing the sample size.

3.2 S-Transform

The S-transform has several advantages over other time-frequency analysis techniques, including good frequency resolution, good time localization, and the ability to handle


non-stationary signals [7, 8]. In this paper, when using the S-transform for EEG signal processing, the frequency range is 1–35 Hz, the number of frequencies is set to 35, and the standard deviation of the Gaussian window function is set to 0.1.

3.3 Feature Extraction

The S-transform yields an N × M matrix, where N stands for frequency and M for time. Three dimensionless indicators, namely the differential entropy, peak factor and waveform factor of the matrix, are calculated as features. These indicators are computed for each row of the S-transformed matrix. The average value of the metrics for each frequency band is then calculated over the four frequency bands 1–4 Hz (Delta), 5–8 Hz (Theta), 9–12 Hz (Alpha) and 13–35 Hz (Beta). It is worth noting that the parameters need to be adjusted according to the signal when calculating the differential entropy. In this paper, after several adjustments, the best results were obtained with the delay time set to 1 and the embedding dimension set to 2.
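The segmentation and per-row feature steps above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code: the crest-factor and form-factor formulas are the standard definitions of the peak and waveform factors, the 1024-sample epoch length follows Sect. 4.1, and the differential entropy is omitted because its delay and embedding parameters are only partially specified.

```python
import math

def epochs(signal, fs=1024, seconds=1):
    """Split one channel into non-overlapping fixed-length epochs
    (1024 samples per 1 s epoch, matching Sect. 4.1)."""
    step = fs * seconds
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]

def peak_factor(row):
    """Crest factor of one frequency row: peak amplitude over RMS."""
    rms = math.sqrt(sum(v * v for v in row) / len(row))
    return max(abs(v) for v in row) / rms

def waveform_factor(row):
    """Form factor of one frequency row: RMS over mean absolute value."""
    rms = math.sqrt(sum(v * v for v in row) / len(row))
    return rms / (sum(abs(v) for v in row) / len(row))

def band_mean(st_matrix, band_rows, metric):
    """Average a per-row metric over the rows of one band, e.g. the
    Alpha rows when row i of the S-transform matrix maps to i + 1 Hz."""
    return sum(metric(st_matrix[r]) for r in band_rows) / len(band_rows)
```

Both factors are scale-free, which is what makes them "dimensionless" indicators: multiplying a row by a constant leaves them unchanged.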

4 Results and Discussion

4.1 S-Transformation Results

S-transformation of the pre-processed EEG data yielded 15 × 2 × 120 samples (15 subjects, 2 states, 120 epochs), with a 35 × 30 × 1024 S-transformation matrix per sample (35 frequency points, 30 channels and 1024 time points). Three-dimensional time-frequency plots after averaging all channels and all epochs in the group analysis are given in Fig. 1. As shown in Fig. 1, the S-transformation coefficient values increased in the high-frequency part (Alpha and Beta bands) but changed insignificantly in the low-frequency part (Delta and Theta bands), and we inferred that cybersickness might be closely related to the EEG in the 8–35 Hz band.

Fig. 1. Comparison of full-lead S-transformation time-frequency results between the two states (Non-cybersickness group vs. Cybersickness group)


4.2 Feature Extraction Results

In this paper, differential entropy, peak factor and waveform factor features were extracted from the data obtained in Sect. 4.1. According to the results described above, cybersickness might be related to the 8–35 Hz band, so to reduce computational cost all dimensionless indexes were calculated in that band. The 28 frequencies were thus divided into two bands, Alpha and Beta, giving a final 30 × 2 feature matrix for each sample. The significant features were then selected according to paired-samples t-tests and are shown in Table 1.

Table 1. Significant features in the Alpha and Beta bands (p < 0.001)

                        Alpha    Beta
  Differential entropy  Cp4      C4, P8, O1
  Peak factor           None     Fp1, Fp2, Pz, P8, O1
  Waveform factor       P8       F11, Ft12, P8, O1

According to Table 1, the number of significant features in the Beta band was much higher than in the Alpha band, which again confirms that the EEG features of cybersickness are embedded in the higher frequency bands. Besides, the locations of the selected features were mostly in the occipital, parietal and frontal regions. The occipital lobe is responsible for visual perception, the parietal lobe for spatial perception, and the frontal lobe for attentional control [9], and it is worth mentioning that the inability to focus is also a symptom of cybersickness. A comparison chart of the dimensionless metrics for some channels based on the group analysis is shown in Fig. 2.

Fig. 2. Comparison of partial selected features between two states (Group Analysis)


As shown in Fig. 2, all dimensionless indicators increase significantly during the onset of cybersickness, reflecting the more dramatic time-frequency changes, the larger dynamic range and the more complex waveform pattern of the cybersickness EEG. It is interesting to note that the three dimensionless indicators have opposite trends between the two states.

4.3 Classification Results

The selected features in Table 1 were assembled into feature vectors as the input; SVM, LDA, KNN and GNB were then used to classify the non-cybersickness and cybersickness groups, with the results shown in Table 2.

Table 2. Comparison of classification results of various classifiers

  Methods  Accuracy (%)   Precision  Recall  F1-Score  AUC
  GNB      85.41 ± 2.59   0.87       0.86    0.87      0.85
  LDA      89.17 ± 4.38   0.88       0.90    0.91      0.91
  KNN      92.33 ± 3.62   0.93       0.92    0.92      0.94
  SVM      97.40 ± 4.15   0.97       0.95    0.98      0.98
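For reference, the precision, recall and F1-score reported in the tables follow their standard definitions; a minimal sketch (not the authors' evaluation code, which additionally averages over cross-validation folds):

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F1 from parallel label lists."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```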

The classification effects of single and mixed features were also compared. To verify the effectiveness of the S-transform combined with the dimensionless indexes for detecting cybersickness, the EEG rhythm energy feature and the fuzzy entropy feature were used as controls (note: both control features were verified with the four classifiers used in this paper; for the sake of a concise table, only the best classifier is shown). The results are shown in Table 3.

Table 3. Comparison of classification results of various methods

  Methods                      Accuracy (%)   Precision  Recall  F1-Score  AUC
  rhythm energy + KNN          89.43 ± 3.77   0.87       0.88    0.87      0.86
  peak factor + SVM            90.69 ± 2.36   0.88       0.91    0.89      0.91
  waveform factor + SVM        91.28 ± 4.52   0.93       0.93    0.94      0.92
  fuzzy entropy + KNN          91.32 ± 5.13   0.93       0.94    0.93      0.92
  differential entropy + SVM   94.33 ± 3.27   0.95       0.96    0.94      0.97
  this paper + SVM             97.40 ± 4.15   0.97       0.98    0.99      0.98

Combined with SVM, the method of this paper achieved an AUC of 0.98 and an accuracy of 97.40% (±4.15), indicating that the selected features can be used for cybersickness detection. SVM gave the best results compared with the other classifiers (LDA, KNN, etc.). This may be related to the characteristics of SVM, which is more advantageous for data with a small sample size and high feature dimension [10].

5 Conclusions

The aim of this paper is to further investigate the EEG features of cybersickness and to find a quantifiable EEG feature that achieves effective detection of cybersickness. The results showed an increase of the S-transform coefficients in the Alpha and Beta bands when cybersickness was induced. The dimensionless features, such as the differential entropy, waveform factor and peak factor, especially in the Beta band, were significantly higher during cybersickness (p < 0.001). The average classification accuracy of the selected features based on SVM was 97.40%. This paper provides an analysis method combining the S-transform with dimensionless features, which is expected to be a valid indicator for cybersickness detection and to provide an objective basis for the treatment or alleviation of VR cybersickness.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (62206130), the Natural Science Foundation of Jiangsu Province (BK20200821) and the Startup Foundation for Introducing Talent of NUIST (2020r075). We thank all the participants and personnel of the experiments.

References

1. Liu, R., Zhuang, C., Yang, R., et al.: Effect of economically friendly acustimulation approach against cybersickness in video-watching tasks using consumer VR devices. Appl. Ergon. 8(2), 29–40 (2020)
2. Yildirim, C.: Cybersickness during VR gaming undermines game enjoyment: a mediation model. Displays 5(9), 35–43 (2019)
3. Yang, Z., Ren, H.: Feature extraction and simulation of EEG signals during exercise-induced fatigue. IEEE Access 46(3), 89–98 (2019)


4. Gursel Ozmen, N., Gumusel, L., Yang, Y.: A biologically inspired approach to frequency domain feature extraction for EEG classification. Comput. Math. Methods Med. 20(18), 98–112 (2018)
5. Krokos, E., Varshney, A.: Quantifying VR cybersickness using EEG. Virtual Reality 26(1), 77–89 (2021)
6. Lim, H.K., Ji, K., Woo, Y.S., et al.: Test-retest reliability of the VR sickness evaluation using electroencephalography (EEG). Neurosci. Lett. 74(3), 135–143 (2021)
7. Ashokkumar, S.R., Anupallavi, S., Premkumar, M., et al.: Implementation of deep neural networks for classifying electroencephalogram signal using fractional S-transform for epileptic seizure detection. Int. J. Imaging Syst. Technol. 31(2), 895–908 (2021)
8. Khare, S.K., Nishad, A., Upadhyay, A., et al.: Classification of emotions from EEG signals using time-order representation based on the S-transform and convolutional neural network. Electron. Lett. 56(25), 1359–1361 (2020)
9. Corbetta, M., Shulman, G.L.: Control of goal-directed and stimulus-driven attention in the brain. Nat. Rev. Neurosci. 3(3), 201–215 (2002)
10. Hsu, C.-W., Lin, C.-J.: A comparison of methods for multiclass support vector machines. IEEE Trans. Neural Netw. 13(2), 415–425 (2002)

An Overview of Multi-task Control for Redundant Robot Based on Quadratic Programming

Qingkai Li1, Yanbo Pang1, Wenhan Cai1, Yushi Wang1, Qing Li1, and Mingguo Zhao1,2(B)

1 Department of Automation, Tsinghua University, Beijing, China [email protected]
2 Beijing Innovation Center for Future Chips, Tsinghua University, Beijing, China

Abstract. Redundant degrees of freedom provide a feasible solution for robots to better interact with environments, but they also increase the complexity of kinematic and dynamic control. This paper presents an overview of methods that realize the whole-body control (WBC) of a redundant robot with a quadratic programming (QP) approach. We first derive the general QP form from the single-task control method for redundant robots and describe WBC as a multi-task control problem in the form of multi-objective optimization (MOO). Then different QP algorithms for multi-task control with fixed and transitional priority are introduced. Weighted quadratic programming (WQP) and hierarchical quadratic programming (HQP) for fixed priority, and generalized hierarchical control (GHC) and recursive hierarchical projection (RHP) for transitional priority are explicated, respectively. Finally, a toolkit implementing the above algorithms is provided, based on which simulation results are presented and remarks on the different algorithms are given. Keywords: quadratic programming · whole-body control · redundant robot · multi-task control · multi-objective optimization

1 Introduction

With increasing degrees of freedom (DoFs) in robot actuation, robots are capable of human-like versatility, achieving multiple task objectives simultaneously and adapting to changing environments. Consider the task of transferring a cup of water: a humanoid robot must consider the attitude of the cup while grabbing and moving it, and it will also need to adjust to new tasks in time if the environment is disturbed by other people or objects. Although the redundant DoFs of robots make it possible to achieve these tasks, how to coordinate the many tasks so as to achieve reasonable control is an important research problem. A main approach to the multi-task control problem is whole-body control (WBC), whose basic idea is to assign different priorities to multiple tasks.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 641–666, 2023. https://doi.org/10.1007/978-981-99-6187-0_64

Dating back to the 1980s, nullspace projection based methods were already used


to deal with the coupling between tasks [7,16,26,29,31]. Tasks with lower priority were projected into the nullspace of tasks with higher priority, and the pseudo-inverse was then calculated to solve the problem. However, as control scenarios have evolved, inequality constraints such as the friction cone constraint, maximum torque constraint, etc., have also become control problems that need to be considered. Because nullspace and pseudo-inverse based methods can only deal with equality constraints, a more popular solution for handling inequality constraints is to transform the control problem into a quadratic programming (QP) problem to implement WBC [3,8,20,21,27]. In QP problems, the control tasks are abstracted as optimization objectives. One task corresponds to one objective, while multiple tasks can correspond to a single objective or to multiple objectives. For example, the horizontal position x and vertical position y of the end effector, as two control tasks, can be abstracted into two optimization objectives, e.g., minimizing the tracking error in each position, but they can also be abstracted as one optimization objective, e.g., minimizing the total error of x and y. In robot control, what often needs to be considered is the scenario of abstracting multiple tasks into multiple objectives, i.e., the multi-objective optimization (MOO) problem of QP. One of the most straightforward strategies to tackle multiple tasks is to use the commands from the nullspace projection method as part of the references and to deal with the other inequality constraints in one QP [17,18], which is here named projected quadratic programming (PQP). Another method is to assign different weights to the optimization objectives of the different tasks, thus aggregating the optimization problems of multiple tasks into a weighted quadratic programming (WQP) [1,23].
However, task priorities are not strictly hierarchical in either of the above two methods, which means tasks with lower priority will influence the performance of higher ones. In contrast, the hierarchical quadratic programming (HQP) method solves QP problems with different priorities in turn, using the results from higher priority tasks as constraints for lower priority tasks to ensure a strict hierarchy of priorities [4,9,15]. In order to adapt to more complex and flexible scenarios, it is also necessary to adjust the priority between different tasks, for which the key is to ensure the continuity of the adjusting process. A simple approach would be to continuously change the weights based on WQP to alter the priority of tasks [28], but limited by the vague meaning of the weights, it is difficult to design the change rules of different weights under the same standard. Another method is to achieve the transition between different priorities based on intermediate variables of the task space, where strict hierarchical priorities before and after the transition are maintained [22]. Based on this method, [19] proposed a continuous-transition HQP (CT-HQP) to include inequality constraints, but HQP has to be performed for the optimization problems before and after the transition in order to obtain the intermediate variables, which greatly increases the computational complexity. [6] introduced the continuous nullspace projection method. On this basis, [24] proposed the generalized projection matrix, and a generalized hierarchical control (GHC) framework was established in [25] to implement hierarchical control and


priority transition. Recently, [12] combined the continuous nullspace projection matrix and HQP to propose an iterative form of projection matrix transition, and established the recursive hierarchical projection (RHP) framework for priority transition, similar to HQP. The main QP algorithms for multi-task control of robots are summarized in Fig. 1.

Fig. 1. QP algorithms for robot multi-task control. Tasks are strictly hierarchical in HQP, CT-HQP, GHC and RHP, but not in PQP and WQP, which are depicted in dotted boxes.

This paper provides an overview of algorithms related to WBC of redundant robots using QP, then compares and discusses the relationship between different algorithms. The single-task redundant robot control problem is first introduced in Sect. 2 based on a specific scenario, and is then described as a single-objective quadratic programming problem. The multi-task control problem is later described as an MOO problem similarly. In Sect. 3, algorithms for multi-task control problem, which mainly focus on weighting strategy and hierarchical optimization strategy, will be given. Algorithms for fixed priorities are presented in Sect. 3.1, whereas algorithms for transition priorities are presented in Sect. 3.2, and both will be discussed in Sect. 3.3. Different algorithms will be compared through simulation results in Sect. 4, and concluding remarks will be given in Sect. 5.

2

System Model and Preliminary Analysis

Consider a robot with n DoFs and r task vectors; the forward kinematics can be written as

    x_i = f_i(q)
    \dot{x}_i = J_i \dot{q}                                      (1)
    \ddot{x}_i = J_i \ddot{q} + \dot{J}_i \dot{q},


where x_i ∈ R^{m_i} (i = 1, ..., r) is the vector of task i, for example the position or orientation of the manipulator, J_i(q) = ∂f_i(q)/∂q is the Jacobian matrix and q ∈ R^n is the generalized coordinate vector. For a redundant robot, we have m_i ≤ n, ∀i = 1, ..., r.

2.1 Single-task Control Method

For the i-th task, the objective can be achieved by setting an expected velocity \dot{x}_i^* or acceleration \ddot{x}_i^*. Take the feedback control law

    \dot{x}_i^* = \dot{x}_i^{ref} + k_p (x_i^{ref} - x_i^{fb})
    \ddot{x}_i^* = \ddot{x}_i^{ref} + k_d (\dot{x}_i^{ref} - \dot{x}_i^{fb}) + k_p (x_i^{ref} - x_i^{fb})    (2)

for example, where k_p, k_d are the gains of the PD controller, x_i^{fb}, \dot{x}_i^{fb} are the feedback positions and velocities of task i, and x_i^{ref}, \dot{x}_i^{ref}, \ddot{x}_i^{ref} are the reference positions, velocities and accelerations of task i. The expected angular velocity of the joints \dot{q}^* can be obtained by solving the QP problem

    \dot{q}^* \in \arg\min_{\dot{q}} \|J_i \dot{q} - \dot{x}_i^*\|^2 + w_r \|\dot{q}\|^2,    (3)

where w_r ≥ 0 is the weight of the regularization term. By selecting w_r > 0, the quadratic matrix of the QP problem is positive definite, which avoids algorithmic singularity, and thus (3) has a solution. The above QP problem at the velocity level is known as the damped least-squares method [5].
For a torque-controlled robot, the expected acceleration \ddot{x}_i^* must be taken into consideration. Considering a robot with n_a actuated joints, the dynamical equation for the generalized coordinates q is given as

    M(q) \ddot{q} + c(q, \dot{q}) = S_a^T \tau + J_c^T \omega_c,    (4)

where M(q) ∈ R^{n×n} is the generalized inertia matrix, c(q, \dot{q}) ∈ R^n is the non-linear term of Coriolis, centrifugal and gravity forces, S_a^T ∈ R^{n×n_a} is the transpose of the torque selection matrix, τ ∈ R^{n_a} is the actuation torque of the joints, J_c^T is the transpose of the Jacobian matrix of the contact points with the environment and ω_c is the external wrench applied to the robot at the contact points. The inverse dynamics, i.e. obtaining the actuation torque from the expected acceleration, can be described as the following QP problem

    \arg\min_{\ddot{q}, \tau, \omega_c} \|J_i \ddot{q} - (\ddot{x}_i^* - \dot{J}_i \dot{q})\|^2 + \|W_r [\ddot{q}^T, \tau^T, \omega_c^T]^T\|^2
    s.t.  M(q) \ddot{q} + c(q, \dot{q}) = S_a^T \tau + J_c^T \omega_c.    (5)

ω_c can be considered as a constraint instead of an optimization variable when it is known or can be expressed as a linear combination of the other optimization variables, depending on the specific scenario [14].
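As a concrete illustration of the velocity-level problem (3), the damped least-squares solution can be computed in closed form from the normal equations (J_i^T J_i + w_r I) \dot{q} = J_i^T \dot{x}_i^*. The sketch below uses a hypothetical planar 2R arm; the link lengths and joint values are arbitrary choices for illustration, not from the paper:

```python
import math

L1, L2 = 0.5, 0.4  # hypothetical link lengths of a planar 2R arm

def jacobian(q1, q2):
    """End-effector Jacobian J of a planar 2R arm (standard closed form)."""
    s1, s12 = math.sin(q1), math.sin(q1 + q2)
    c1, c12 = math.cos(q1), math.cos(q1 + q2)
    return [[-L1 * s1 - L2 * s12, -L2 * s12],
            [ L1 * c1 + L2 * c12,  L2 * c12]]

def damped_least_squares(J, xdot, wr=1e-3):
    """Solve min ||J qdot - xdot||^2 + wr ||qdot||^2 for the 2x2 case via
    the normal equations (J^T J + wr I) qdot = J^T xdot; wr > 0 keeps the
    system positive definite, as noted after (3)."""
    a = J[0][0] ** 2 + J[1][0] ** 2 + wr        # (J^T J + wr I)[0][0]
    b = J[0][0] * J[0][1] + J[1][0] * J[1][1]   # off-diagonal entry
    d = J[0][1] ** 2 + J[1][1] ** 2 + wr        # (J^T J + wr I)[1][1]
    r1 = J[0][0] * xdot[0] + J[1][0] * xdot[1]  # (J^T xdot)[0]
    r2 = J[0][1] * xdot[0] + J[1][1] * xdot[1]  # (J^T xdot)[1]
    det = a * d - b * b                         # > 0 since wr > 0
    return [(d * r1 - b * r2) / det, (a * r2 - b * r1) / det]
```

For a small w_r the result approaches the exact inverse-kinematics velocity; increasing w_r trades tracking accuracy for bounded joint speeds near singular poses.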


The QP problems given in (3) and (5) can be written in the following general form

    \arg\min_{x, w_i, v_i} \|w_i\|^2 + \|v_i\|^2        (6a)
    s.t.  A_i x + w_i = b_i                             (6b)
          ld_{s,i} \le C_{s,i} x + v_i \le ud_{s,i}     (6c)
          ld_{h,i} \le C_{h,i} x \le ud_{h,i},          (6d)

where equality tasks are described as equality constraints in (6b) and inequality tasks are described as inequality constraints in (6c). (6d) is the hard constraint, which must be satisfied in a task, such as the dynamical equation in (5). w_i, v_i are the slack variables for equality and inequality constraints, respectively. A_i, b_i are the task matrix and vector for equality constraints, and C_{s,i}, ld_{s,i}, ud_{s,i} are the constraint matrix, lower bound and upper bound for inequality constraints, respectively. In fact, (6b) can be written in the form of (6c) mathematically, but it is retained because equality tasks are more common in robots and algorithms can be simplified when equality tasks are considered. For example, when only (6a)–(6b) are included in an optimization problem, it can be solved by taking the pseudo-inverse [9]. The result is given by x^* = A_i^† b_i, where A_i^† is the Moore-Penrose pseudo-inverse of A_i. Although calculating the pseudo-inverse is more computationally efficient than solving an optimization problem, and multi-task control based on projection matrices has been studied thoroughly [7], friction cone constraints, joint torque limits, etc. need to be considered in actual robot control problems, so the inequality constraints in (6) must be included in the optimization problem for more accurate control. Such tasks cannot be solved by the pseudo-inverse, so further discussion has to be made on multi-objective optimization.

2.2 Multi-task Control Method

The general QP form for a single task has been derived in (6), which is a single-objective optimization problem with constraints. Similarly, the solution for multi-task control can be described in the form of an MOO problem with constraints, as

    \arg\min_{x, w_i, v_i} F = [f_1, \cdots, f_r]
    s.t.  f_i = \|w_i\|^2 + \|v_i\|^2
          A_i x + w_i = b_i
          ld_{s,i} \le C_{s,i} x + v_i \le ud_{s,i}     (7)
          ld_{h,i} \le C_{h,i} x \le ud_{h,i}
          i = 1, 2, \cdots, r.

In multi-task control, tasks are often coupled with each other so that they cannot be satisfied simultaneously because of interference. As a result, an optimal solution x^* that satisfies f_i(x^*) ≤ f_i(x), ∀i ∈ I often does not exist, where


for simplicity, we define I = {1, 2, · · · , r}. Naturally, we would like to assign different priority to different tasks according to their importance, so that an extra constraint concerning the priority of tasks must be defined to solve the MOO problem.
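A one-dimensional illustration of such coupling (a toy example, not from the paper): two tasks demanding x = 1 and x = 0 have no common minimizer, and a linear weighting of their objectives, as used by the WQP method of Sect. 3, only yields a compromise:

```python
# Toy illustration: two conflicting 1-D tasks f1 = (x - 1)^2 and f2 = x^2.
# No x minimizes both, so any single answer is a trade-off.

def weighted_optimum(lam1, lam2):
    """argmin of lam1*(x - 1)^2 + lam2*x^2, from the first-order
    condition 2*lam1*(x - 1) + 2*lam2*x = 0."""
    return lam1 / (lam1 + lam2)

assert weighted_optimum(1.0, 1.0) == 0.5  # equal weights: the midpoint
assert weighted_optimum(9.0, 1.0) == 0.9  # a heavier weight pulls toward task 1
```

Neither task is met exactly for any finite weights, which is precisely the coupling that priority-based formulations address.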

3

QP Algorithms for Multi-task Control

In this section, existing algorithms for the multi-task control problem will be introduced. Algorithms for multi-task control with fixed task priority are discussed in Sect. 3.1 and those with transitional task priority are presented in Sect. 3.2.

3.1 Multi-task Control with Fixed Priority

The MOO problem itself is difficult to solve directly, and a common approach is to transform it into a single-objective optimization problem. In robot multi-task control, this is achieved by considering the trade-off between different tasks to give a single evaluation. How multi-task control is realized in different algorithms is introduced in this section.

Weighted Quadratic Programming. One of the most commonly used methods in robotic MOO is the linear weighting method, where each task objective f_i (i ∈ I) in (7) is given a weight λ_i and the objectives are added together to form a single-objective optimization problem of \sum_{i=1}^{r} \lambda_i f_i. The sum of multiple quadratic functions, where equality constraints and inequality constraints are given different weights, is still a quadratic function. Therefore the WQP problem can be described as

    \arg\min_{x, w, v} \|w\|^2_{Q_e} + \|v\|^2_{Q_i}    (8a)
    s.t.  A x + w = b                                   (8b)
          ld_s \le C_s x + v \le ud_s                   (8c)
          ld_h \le C_h x \le ud_h                       (8d)

The new task matrix and task vector are obtained by stacking all the task matrices, i.e.

    A = [A_1^T, \cdots, A_r^T]^T,  b = [b_1^T, \cdots, b_r^T]^T    (9)

\|w\|^2_{Q_e} = w^T Q_e w is the weighted norm and the weight matrix Q_e is a diagonal matrix whose diagonal elements are the weights of the different equality tasks. Q_i is the weight matrix for inequality tasks, similarly. (8d) represents all of the hard


constraints. Obviously, the weight matrices Q_e and Q_i reflect the importance of the different tasks: the larger a diagonal element of the matrix, the more important the corresponding task. This algorithm has high computational efficiency, especially in [1,23] where the soft constraints (8c) are not included. The slack variable w can be eliminated by substituting (8b) into (8a), so x is the only optimization variable. In this case, the computational efficiency is close to that of single-task optimization. However, the biggest drawback of this algorithm is the vague meaning of the weights. It is difficult to compare the importance of tasks of different dimensions. In addition, even if the weights can be compared, the importance between tasks is only a relative concept and different tasks are still coupled. Thus tasks are not strictly hierarchical in WQP.

Hierarchical Quadratic Programming. Another way to handle the multi-task control problem is to choose one of the sub-objectives as the optimization objective and the other sub-objectives as constraints. The HQP proposed by [15] describes the optimization problem on the k-th priority level as

    \arg\min_{x_k, v_k} \|A_k x_k - b_k\|^2_{Q_e} + \|v_k\|^2_{Q_i}    (10a)
    s.t.  ld_{s,k} \le C_{s,k} x_k + v_k \le ud_{s,k}                  (10b)
          ld_h \le C_h x_k \le ud_h                                    (10c)
          A^{aug}_{k-1} x_k = A^{aug}_{k-1} x^*_{k-1}                  (10d)
          ld^{aug}_{k-1} \le C^{aug}_{k-1} x_k + v^{aug*}_{k-1} \le ud^{aug}_{k-1},    (10e)

where A_k, b_k are the task matrix and vector of all the equality tasks on the k-th priority level, and C_{s,k}, ld_{s,k}, ud_{s,k}, v_k are the constraint matrix, lower and upper bounds and slack variables for the inequality constraints on the k-th priority level, respectively. Constraints in (10d) take the optimization results of the equality tasks from all of the higher priority levels into consideration. x^*_{k-1} is the optimization result from the upper level and A^{aug}_{k-1} is the task matrix of all the tasks on the upper k − 1 levels, defined as

    A^{aug}_{k-1} = [A_1^T, \cdots, A_{k-1}^T]^T    (11)

648

Q. Li et al.

levels respectively. They are defined as ⎤ ⎤ ⎤ ⎡ ⎡ ∗ ⎤ ⎡ ⎡ C s,1 v1 lds,1 uds,1 aug aug aug ⎥ ⎥ ⎢ ⎢ .. ⎥ ⎢ .. ⎥ ⎢ ∗ .. C k−1 = ⎣ ... ⎦ , v aug ⎦ k−1 = ⎣ . ⎦ , ldk−1 = ⎣ . ⎦ , udk−1 = ⎣ . ∗ v C s,k−1 lds,k−1 uds,k−1 k−1 (12) respectively. Setting the optimization results of tasks from all of the higher levels as constraints in (10d–10e) will ensure that results from the current level will not have an impact on results from higher level, so as to guarantee that tasks are strictly hierarchical. Notice that (10c) means that hard constraints are considered on every level, so that all the hard constraints have the highest priority level. This is because hard constraints like dynamical equation and maximum torque limitation must be satisfied in robotic control tasks. If they are not considered at the beginning, the constraints from higher level may conflict with the hard constraints and no feasible solution can be found, as depicted in Fig. 2.

Fig. 2. No feasible solution can be found when hard constraints are not arranged to the highest level. Dotted lines are contours of f1 (x), the solid line is the task constraint and the green area is the solution space of the hard constraint. (Color figure online)
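The stacking in (11)–(12) is mechanical; as an illustrative sketch (hypothetical helper name, NumPy assumed), the augmented quantities collected from the upper k−1 levels can be built as:

```python
import numpy as np

def stack_upper_levels(A_list, C_list, v_opt_list, ld_list, ud_list):
    """Build the augmented quantities of (11)-(12) from levels 1..k-1.

    A_list:     equality task matrices A_1..A_{k-1}
    C_list:     inequality constraint matrices C_{s,1}..C_{s,k-1}
    v_opt_list: optimal slack vectors v_1*..v_{k-1}*
    ld_list, ud_list: lower/upper bounds per level
    """
    return (np.vstack(A_list),            # A^aug_{k-1}
            np.vstack(C_list),            # C^aug_{k-1}
            np.concatenate(v_opt_list),   # v^aug*_{k-1}
            np.concatenate(ld_list),      # ld^aug_{k-1}
            np.concatenate(ud_list))      # ud^aug_{k-1}
```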

In (10), all optimization results from the upper $k-1$ levels need to be added as constraints when solving the QP problem of the $k$-th level, so as $k$ increases, the constraints in (10d)–(10e) increase and computational efficiency deteriorates significantly. To solve this problem, the constraints in (10d) are equivalently expressed as

\[ x_k = N_{k-1} u_k + x^*_{k-1} \tag{13} \]

according to [4], where $N_{k-1} \in \mathbb{R}^{n \times (n-m_{k-1})}$ is the matrix composed of the basis of the nullspace of $A^{aug}_{k-1}$ and $m_{k-1} = \operatorname{Rank}(A^{aug}_{k-1})$. $N_{k-1}$ can be obtained by applying

An Overview of Multi-task Control Globally

649

QR decomposition or singular value decomposition (SVD) to $A^{aug}_{k-1}$. For example, with

\[ A^{aug}_{k-1} = U \Sigma V^T, \tag{14} \]

$N_{k-1}$ consists of the last $n - m_{k-1}$ columns of $V$. $x_k$ derived from (13) satisfies

\[ A^{aug}_{k-1} x_k = \underbrace{A^{aug}_{k-1} N_{k-1}}_{O} u_k + A^{aug}_{k-1} x^*_{k-1} = A^{aug}_{k-1} x^*_{k-1}, \quad \forall u_k \in \mathbb{R}^{n-m_{k-1}}. \tag{15} \]

[4] presented a nullspace-based HQP that eliminates the constraint (10d) and uses the parameter vector $u_k$ as the optimization variable of the $k$-th priority level. However, the case with inequality constraints is not discussed there; a more general form containing inequality constraints is presented here as

\[ \arg\min_{u_k, v_k} \|A_k (N_{k-1} u_k + x^*_{k-1}) - b_k\|_{Q_e} + \|v_k\|_{Q_i} \tag{16a} \]
\[ \text{s.t.}\quad ld_{s,k} \le C_{s,k}(N_{k-1} u_k + x^*_{k-1}) + v_k \le ud_{s,k} \tag{16b} \]
\[ ld_h \le C_h (N_{k-1} u_k + x^*_{k-1}) \le ud_h \tag{16c} \]
\[ ld^{aug}_{k-1} \le C^{aug}_{k-1}(N_{k-1} u_k + x^*_{k-1}) + v^{aug*}_{k-1} \le ud^{aug}_{k-1}, \tag{16d} \]

for which the optimal solution is $u^*_k$. By substituting $u^*_k$ into (13), the optimal solution $x^*_k$ for (10) can be derived as

\[ x^*_k = N_{k-1} u^*_k + x^*_{k-1}. \tag{17} \]

In addition, as the solution proceeds, the dimension of $A^{aug}_k$ increases, and solving $N_k$ directly from $A^{aug}_k$ will consume a lot of time. Because the previous task space does not change when a new task matrix is added, the HQP algorithm with the recursive form shown in Fig. 3 is often used [4,13,14], where $\operatorname{null}(\tilde{A}_k)$ is the matrix composed of the basis of the nullspace of $\tilde{A}_k$. Obviously, when $\hat{A}_k$ is a full rank matrix, $N_k = O$ and the subsequent optimal solutions will not change according to (17), so the algorithm can terminate at this step. In terms of robot control, $\hat{A}_k$ having full rank means that at this point the robot's actuated joints have no more redundant DoFs for subsequent tasks while ensuring the higher priority tasks; the subsequent tasks are therefore not reachable. Compared with WQP, multiple QP problems need to be solved in HQP, making it a more time-consuming approach for WBC. However, with the help of algorithms based on parametrization of the optimization variables and the recursive form of the nullspace basis matrix, computational efficiency improves significantly. In addition, [9] proposed a numerical algorithm for the HQP problem using the active set method, which ensures that (16) with up to 40-dimensional optimization variables can be solved in less than 1 ms. The real-time performance of WBC can thus be guaranteed.
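For equality tasks only, the nullspace-parameterized recursion (13)–(17) with the SVD-based basis and the early-termination test can be sketched as follows. This is a simplified illustration, not the WBCKits implementation: inequality constraints, slacks, and the QP solver are omitted, and the function names are hypothetical.

```python
import numpy as np

def nullspace(A, tol=1e-10):
    """Basis of the nullspace of A: the last n - rank columns of V in (14)."""
    _, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T                 # n x (n - rank)

def hqp_equality(tasks):
    """Lexicographic least squares over equality tasks in decreasing priority.

    tasks: list of (A_k, b_k). Each level minimizes ||A_k x - b_k|| without
    disturbing the optima of the higher levels, cf. (13)-(17)."""
    n = tasks[0][0].shape[1]
    x = np.zeros(n)
    N = np.eye(n)                      # full freedom before the first level
    for A, b in tasks:
        if N.shape[1] == 0:            # no redundant DoFs left: terminate
            break
        AN = A @ N
        u, *_ = np.linalg.lstsq(AN, b - A @ x, rcond=None)
        x = x + N @ u                  # x_k* = N_{k-1} u_k* + x_{k-1}*  (17)
        N = N @ nullspace(AN)          # recursive nullspace basis
    return x
```

Note how the third, conflicting task below cannot disturb the first level: its task matrix projected through the remaining nullspace is zero, so the minimum-norm update is zero.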


Fig. 3. Diagram of HQP algorithm with recursive form.

3.2 Multi-task Control with Transitional Priority

Changing the weights continuously is a straightforward approach to priority transition in WQP, but this method relies heavily on the setting of tasks to determine the form of the transition, so it will not be discussed here. The transition algorithm based on intermediate variables in [19,20] can simultaneously realize the transition of inequality constraints, but it is often computationally demanding and does not have obvious advantages over the transition methods based on projection matrices. As a result, this section mainly focuses on priority transition algorithms based on continuous projection matrices.

Generalized Hierarchical Control. The generalized hierarchical control (GHC) proposed in [25] was extended from (8) by using the generalized projection matrix. In GHC, the equality tasks $f_i = \|A_i x - b_i\|^2$ no longer sum with weights. Each task corresponds to a new optimization variable $x_i$ and is solved independently. The optimal solutions $x^*_i$ of the tasks are then summed through the generalized projection matrices. The form of GHC is presented as

\[ \arg\min_{x', v} \sum_{i=1}^{r} \|A_i x_i - b_i\|^2 + \|v\|_{Q_i} \tag{18a} \]
\[ \text{s.t.}\quad ld_s \le C_s x + v \le ud_s \tag{18b} \]
\[ ld_h \le C_h x \le ud_h \tag{18c} \]
\[ x = \sum_{i=1}^{r} P^{gp}_i x_i = P x', \tag{18d} \]

where $P = [P^{gp}_1 \cdots P^{gp}_r]$ and $x' = [x_1^T \cdots x_r^T]^T$.


The generalized projection matrix $P^{gp}_i$ is similar to the continuous projection matrix in [6], but tasks are not projected into the nullspace of one specific task; rather, they are projected into the nullspace of all the tasks by $P^{gp}_i$. $P^{gp}_i$ is a function of $\alpha_{ij} \in [0,1]$, which portrays the extent of projection of task $i$ into the nullspace of task $j$. When $\alpha_{ij}=0$, task $i$ will not be projected into the nullspace of task $j$, meaning that task $i$ will not be affected by task $j$, i.e., task $i$ has a higher priority level. When $\alpha_{ij}=1$, task $i$ will be fully projected into the nullspace of task $j$, meaning that task $i$ will not affect task $j$, i.e., task $i$ has a lower priority level. When $0 < \alpha_{ij} < 1$, the priority is not strictly hierarchical, and the larger $\alpha_{ij}$ is, the higher the importance of task $j$ compared to task $i$. A priority matrix can therefore be defined as

\[ \Psi_{ghc} = \begin{bmatrix} \alpha_{11} & \cdots & \alpha_{1r} \\ \vdots & \ddots & \vdots \\ \alpha_{r1} & \cdots & \alpha_{rr} \end{bmatrix} \in \mathbb{R}^{r \times r}, \tag{19} \]

where $\alpha_{ij} + \alpha_{ji} = 1$, $\forall i \ne j$, for the consistency of priority between each pair of tasks. The generalized projection matrix $P^{gp}_i(\Psi_{ghc})$ for task $i$ can be calculated in the following three steps:

Step 1: Arrange the task matrices $A_j$ in descending order of $\alpha_{ij}$, $j = 1, \cdots, r$, into an augmented matrix $A^s_i = [A^T_{s_1} \cdots A^T_{s_r}]^T$;
Step 2: Perform Gaussian elimination on the row vectors of $A^s_i$ and get a set of orthogonal bases $B_i \in \mathbb{R}^{m_i \times n}$ of its row space, where $m_i$ is the rank of $A^s_i$. Construct an $m_i$-dimensional diagonal matrix $\Lambda^s_i = \operatorname{diag}(\alpha_{i s_1}, \cdots, \alpha_{i s_1}, \alpha_{i s_2}, \cdots)$ according to the dimension of each task matrix in $A^s_i$;
Step 3: Calculate the generalized projection matrix

\[ P^{gp}_i = I_n - B_i^T \Lambda^s_i B_i. \tag{20} \]
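The three steps can be sketched as follows. This is an illustrative simplification in which the Gaussian-elimination step is replaced by Gram–Schmidt orthonormalization, so Λ is built per independent row rather than per task-row block; the function name is hypothetical.

```python
import numpy as np

def generalized_projector(tasks, alpha_i, tol=1e-10):
    """P_i^gp = I_n - B^T Lambda B for GHC, cf. (19)-(20).

    tasks:   list of task matrices A_j (each d_j x n)
    alpha_i: alpha_ij for each task j (row i of Psi_ghc)
    """
    n = tasks[0].shape[1]
    order = np.argsort(alpha_i)[::-1]          # Step 1: descending alpha_ij
    B_rows, lam = [], []
    for j in order:                            # Step 2: orthonormal row basis
        for row in tasks[j]:
            r = row - sum((b @ row) * b for b in B_rows)
            if np.linalg.norm(r) > tol:        # new independent direction
                B_rows.append(r / np.linalg.norm(r))
                lam.append(alpha_i[j])         # alpha of the owning task
    if not B_rows:
        return np.eye(n)
    B = np.array(B_rows)                       # m_i x n
    return np.eye(n) - B.T @ np.diag(lam) @ B  # Step 3 (20)
```

With a single fully weighted task ($\alpha = 1$) the projector removes that task's row space entirely; with $\alpha = 0$ it is the identity, matching the interpretation of $\alpha_{ij}$ above.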

The diagonal elements of (19) have a different meaning from the other elements. When $\alpha_{ii}=1$, task $i$ will be projected into its own nullspace, which is equivalent to deleting this task. When $\alpha_{ii}=0$, task $i$ is fully activated. In conclusion, the priority between different tasks is defined by $\Psi_{ghc}$, and by tuning the parameters of $\Psi_{ghc}$, priority transition can be realized.

Recursive Hierarchical Projection. The algorithm proposed in [12] realizes priority transition of equality tasks through the continuous transition of the nullspace projection matrix based on HQP. This algorithm is called RHP-HQP and is similar to (16). All of the $r$ tasks are divided into $n_l$ levels, and the $k$-th level QP problem has the following form:

\[ \arg\min_{u_k, v_k} \|A^{A,s}_k (P^{rhp}_{k-1} u_k + x^*_{k-1}) - \hat{b}^{A,s}_k\|_{Q_e} + \|v_k\|_{Q_i} \tag{21a} \]
\[ \text{s.t.}\quad ld_{s,k} \le C_{s,k}(P^{rhp}_{k-1} u_k + x^*_{k-1}) + v_k \le ud_{s,k} \tag{21b} \]
\[ ld_h \le C_h (P^{rhp}_{k-1} u_k + x^*_{k-1}) \le ud_h \tag{21c} \]
\[ ld^{aug}_{k-1} \le C^{aug}_{k-1}(P^{rhp}_{k-1} u_k + x^*_{k-1}) + v^{aug*}_{k-1} \le ud^{aug}_{k-1}, \tag{21d} \]


where the inequality constraint matrices and the upper and lower bounds in (21b)–(21d) are defined in the same way as in (16), and the optimal solution $x^*_k$ is defined as

\[ x^*_k = x^*_{k-1} + P^{rhp}_{k-1} u^*_k, \tag{22} \]

which is similar to (17).

A priority matrix

\[ \Psi_{rhp} = \begin{bmatrix} \alpha_{11} & \cdots & \alpha_{1r} \\ \vdots & \ddots & \vdots \\ \alpha_{n_l 1} & \cdots & \alpha_{n_l r} \end{bmatrix} \in \mathbb{R}^{n_l \times r} \tag{23} \]

is also defined in RHP-HQP to indicate the priority between tasks. In this matrix, the variables on the $k$-th row, $\alpha_{kj} \in [0,1]$, $j = 1, \cdots, r$, represent the extent of freedom task $j$ has on the $k$-th priority level. When $\alpha_{kj}=0$, task $j$ has no freedom on this priority level, i.e., the priority of task $j$ is lower than $k$; when $\alpha_{kj}=1$, task $j$ has a priority level no lower than $k$; when $0<\alpha_{kj}<1$, task $j$ is in a transitional state on this priority level, and the larger $\alpha_{kj}$ is, the more important it is on this level. For the consistency of priority, we have $\alpha_{mj} \le \alpha_{nj}$, $\forall m < n$. If the priority level of task $j$ is $k$, then $\alpha_{mj}=0$ for $m<k$ and $\alpha_{mj}=1$ for $m \ge k$.

By selecting the $n_k$ tasks that satisfy $\alpha_{kj} \ne \alpha_{k-1,j}$, $j = 1, \cdots, r$, according to (23), the task matrices to be solved on the $k$-th level, $A^{A,s}_k$, can be determined. Arrange the task matrices and objective vectors of these $n_k$ tasks in descending order of $\alpha_{kj}$ into an augmented matrix and vector of the form

\[ A^{A,s}_k = \begin{bmatrix} A_{s_1} \\ \vdots \\ A_{s_{n_k}} \end{bmatrix}, \quad b^{A,s}_k = \begin{bmatrix} b_{s_1} \\ \vdots \\ b_{s_{n_k}} \end{bmatrix}. \tag{24} \]

Construct a diagonal matrix with the variables relating to the selected tasks as

\[ \Lambda^s_k = \begin{bmatrix} \alpha_{k,s_1} I_{d_{s_1}} & & \\ & \ddots & \\ & & \alpha_{k,s_{n_k}} I_{d_{s_{n_k}}} \end{bmatrix}, \tag{25} \]

where $I_{d_s} \in \mathbb{R}^{d_s \times d_s}$ is an identity matrix and $d_s$ is the row number of $A_s$. To activate or delete a task, the objective vector in (21a) is given by

\[ \hat{b}^{A,s}_k = \Lambda^s_k b^{A,s}_k + (I - \Lambda^s_k) A^{A,s}_k x^*_{k-1} \tag{26} \]

with an approach similar to the intermediate value [19,22]. After getting $A^{A,s}_k$ and $\Lambda^s_k$, the recursive hierarchical projection matrix $P^{rhp}_k$ on the $k$-th level can be calculated in the following three steps:

Step 1: Calculate the task matrix after projection $\tilde{A}^{A,s}_k = A^{A,s}_k P^{rhp}_{k-1}$;
Step 2: Perform Gaussian elimination on the row vectors of $\tilde{A}^{A,s}_k$ and get a set of


orthogonal bases $B_k \in \mathbb{R}^{m_k \times n}$ of its row space, where $m_k$ is the rank of $\tilde{A}^{A,s}_k$. Construct an $m_k$-dimensional diagonal matrix $\Lambda^{s,m_k}_k$ according to the dimension of each task matrix in $\tilde{A}^{A,s}_k$;
Step 3: Calculate the recursive hierarchical projection matrix $P^{rhp}_k$ on the $k$-th level:

\[ P^{rhp}_0 = I \tag{27} \]
\[ P^{rhp}_k = P^{rhp}_{k-1}\left(I_n - B_k^T \Lambda^{s,m_k}_k B_k\right). \tag{28} \]
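One recursion step of (27)–(28) can be sketched similarly, again substituting Gram–Schmidt for the Gaussian elimination of Step 2 (the function name is hypothetical):

```python
import numpy as np

def rhp_projector_update(P_prev, A_level, alphas, tol=1e-10):
    """One step of (27)-(28): P_k = P_{k-1} (I_n - B^T Lambda B).

    P_prev:  P_{k-1} (n x n), starting from the identity, cf. (27)
    A_level: stacked task matrices entering level k
    alphas:  one activation value per row of A_level (diagonal of Lambda)
    """
    n = P_prev.shape[1]
    A_proj = A_level @ P_prev            # Step 1: project the new tasks
    B_rows, lam = [], []
    for row, a in zip(A_proj, alphas):   # Step 2: orthonormal row-space basis
        r = row - sum((b @ row) * b for b in B_rows)
        if np.linalg.norm(r) > tol:
            B_rows.append(r / np.linalg.norm(r))
            lam.append(a)
    if not B_rows:
        return P_prev
    B = np.array(B_rows)
    return P_prev @ (np.eye(n) - B.T @ np.diag(lam) @ B)   # Step 3 (28)
```

Applying the update once per level with all activations at 1 reproduces the strict nullspace chain of HQP; fractional activations give the transitional projectors.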

When all the variables in $\Psi_{rhp}$ are 0 or 1, $\Lambda^s_k$ is an identity matrix and therefore $\hat{b}^{A,s}_k = b^{A,s}_k$. $P^{rhp}_k$ in this case, which projects the task into the nullspace of all the tasks prior to level $k$, has a similar meaning to the definition in [2,30].

3.3 Comparison Between Different Algorithms

First, let us consider the two fixed priority algorithms. HQP-original with the form (10) and HQP-nullspace with the form (16) solve the first level of tasks in the same way. In particular, when there is only one level of tasks, HQP is exactly the same as WQP, which is defined in (8); WQP can therefore be viewed as the special case of HQP with only one level of tasks. When the priority variables of RHP in (23) are set to 0 or 1, all the tasks at each priority level are fully solved and the hierarchy is maintained. In this case, the solution given by (21) is the same as that of HQP, hence HQP can be seen as a special case of RHP in which all the tasks are in a strict priority hierarchy. Considering the numerical solution, although HQP and RHP are more computationally efficient than the original constraint form, in some cases, e.g. $A_k N_{k-1} \approx O$ or $A^{A,s}_k P^{rhp}_{k-1} \approx O$, these two algorithms may become ill-conditioned. As a result, regularization terms may need to be introduced to avoid singularities, though the optimal results may then be affected. In general, the algorithms from (8) to (16) and then to (21) represent a transition from a particular form to a more general form.

Then, the two priority transition algorithms, GHC and RHP, will be discussed. For ease of description, tasks are said to be in priority transition if the variables $\alpha_{ij} \in (0,1)$ in (19) and (23), and in strict priority hierarchy if $\alpha_{ij} = 0$ or $1$. On one hand, the solutions given by GHC and RHP are not necessarily the same in strict priority hierarchy. The former solves each task separately and then trims the results in the order of priority, while the latter solves each task in turn according to the priority order. On the other hand, GHC and RHP are not equivalent in priority transition.
Although the two are similar in the idea of projection matrix transition, the meanings of the two projection matrices are different. In GHC, the optimal solution of the lower priority task is gradually cut off from the optimal solution space of the higher priority task, while in RHP, priority transition relies on the intermediate-value transition of the task objectives and the transition of the corresponding feasible solution space during the transformation of the nullspace projection matrix. Take two single objective tasks in the two dimensional space as an example. Each task has an optimal solution $q^*_{t1}$, $q^*_{t2}$ and a one dimensional nullspace, and


both can be satisfied simultaneously with an optimal solution $q^*_{t1,t2}$, as depicted in Fig. 4(a). The task priority transition is set as from solving only task 2 to solving both task 1 and task 2, with task 1 having a higher priority than task 2. A variable $\alpha$ can be defined to represent the process. The priority matrix for GHC is defined as

\[ \Psi_{ghc} = \begin{bmatrix} 1-\alpha & 1-\alpha \\ \alpha & 0 \end{bmatrix}, \quad \alpha: 0 \to 1, \tag{29} \]

while the priority matrix for RHP is

\[ \Psi_{rhp} = \begin{bmatrix} \alpha & 0 \\ \alpha & 1 \end{bmatrix}, \quad \alpha: 0 \to 1. \tag{30} \]

Fig. 4. Two single objective tasks in the two dimensional space with transition priorities. (a) Optimal solution space; (b) GHC priority transition; (c) RHP priority transition.

When $\alpha = 0$, only task 2 needs to be solved and the solutions from GHC and RHP are the same. When $\alpha = 1$, task 1 and task 2 must be accomplished simultaneously and task 2 has a lower priority level than task 1. It can be observed from Fig. 4(b) that the solution from GHC is located in the optimal solution space of task 1 and task 2 has no effect on task 1; thus the order of priority is satisfied. However, this solution is not the optimal solution $q^*_{t1,t2}$ for both tasks. From the perspective of multi-objective optimization, the solution given by GHC is not a Pareto optimum under strict priority hierarchy. This is because the tasks in GHC can be considered as solved separately, and the projection matrix can only ensure the priority between tasks. In RHP, as illustrated in Fig. 4(c), the solution is a Pareto optimum under strict priority hierarchy. In the priority transition state, when $\alpha \in (0,1)$, both task 1 and task 2 are affected in GHC and the solution is not optimal for either one. In RHP, however, $\Psi_{rhp}(2,2) = 1$ so that task 2 is always fully solved; the priority transition therefore results in a gradual transition to the optimal solution of task 1 while maintaining task 2 at its optimum.
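The fixed-priority part of this comparison can be reproduced numerically on a toy problem: two tasks in the plane, where task 1 fixes the first coordinate and task 2 pulls the whole vector to the origin. A minimal sketch, assuming NumPy (weighted normal equations for WQP as in (8); one nullspace step for HQP):

```python
import numpy as np

A1, b1 = np.array([[1., 0.]]), np.array([1.])   # task 1: x1 = 1
A2, b2 = np.eye(2), np.zeros(2)                 # task 2: pull x to the origin

# WQP: one weighted least-squares problem; task 1 keeps a residual
# for any finite weight ratio.
w1, w2 = 100.0, 1.0
H = w1 * A1.T @ A1 + w2 * A2.T @ A2
x_wqp = np.linalg.solve(H, w1 * A1.T @ b1 + w2 * A2.T @ b2)

# HQP: task 1 solved exactly, task 2 confined to the nullspace of A1.
x1 = np.array([1., 0.])                  # minimum-norm level-1 optimum
N = np.array([[0.], [1.]])               # nullspace basis of A1
u, *_ = np.linalg.lstsq(A2 @ N, b2 - A2 @ x1, rcond=None)
x_hqp = x1 + N @ u
```

Here `x_wqp` lands at $x_1 = w_1/(w_1+w_2)$, i.e. close to but never exactly at the task-1 optimum, whereas `x_hqp` satisfies task 1 exactly, illustrating the weighting-versus-hierarchy distinction above.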

4 Simulation Results

This section demonstrates the nature of different multi-task control algorithms through simulations on a robotic arm with 4 DoFs working in the vertical plane. As Fig. 5 illustrates, the position and orientation of the end-effector are defined as $(x_e, y_e)$ and $\theta$ respectively, and the angular positions of the four actuated joints are defined as $q = [q_1, q_2, q_3, q_4]^T$. Additionally, all the algorithms introduced in Sect. 3 have been implemented in one toolkit called WBCKits, which provides convenient methods to add tasks and constraints and to set task weights and priorities. After selecting a specific algorithm and task setting, WBCKits automatically constructs the QP(s) using the general linear algebra library Eigen [11] and obtains the optimal results with the QP solver qpOASES [10]. WBCKits is freely available, and the C++ source code is published at https://github.com/yueqing-li/WBCKits. A Python wrapper is also provided for further uses such as combining with learning algorithms. Sect. 4.1 mainly compares the control effects and computational efficiency of the different algorithms in multi-task control under fixed priority. Sect. 4.2 compares the control effects of the algorithms under transitional priority, as well as how well the different transition methods preserve the priority hierarchy.

Fig. 5. Model of a planar robotic arm with 4 DoFs. l1 = l2 = l3 = 0.5 m, l4 = 0.2 m.

Table 1. Task PD gains and corresponding priority or weights.

Sub-tasks                    Kp (s⁻²)  Kd (s⁻¹)  Priority Level  WQP-1 Weights  WQP-2 Weights
Horizontal position x        100       30        1               1000           15
Vertical position y          100       30        2               100            10
Orientation θ                35        10        3               10             5
Angular position of joints   20        8         4               1              1

4.1 Tasks with Fixed Priority

In this section, the tracking error of tasks with different priority and the computation time of different algorithms will be demonstrated. Simulation results are based on the 4-DoF planar robotic arm model with a maximum joint torque constraint. Four sub-tasks, including tracking the horizontal position xe , vertical position ye and the orientation θ of the tool center point (TCP) and the angular position of joints, are imposed. The four tasks are controlled in task space using PD feedback to obtain task acceleration, and the PD gains and task priority for each task are shown in Table 1, where smaller number of priority level means higher priority. Figure 6 depicts the step response curves for each task with the same feedback factor for different algorithms. To compare the characteristics of WQP, two sets of weighting parameters are used, as shown in Table 1. The “no priority” curves in Fig. 6 are obtained without using the multi-task control algorithms. Torque is generated by calculating the inverse dynamics after solving each task separately and summing their joint acceleration results. Table 2 presents the static error of different algorithms for all tasks at the 10th second of the simulation. The acceleration loss in Fig. 7 depicts the difference between the expected task acceleration of PD feedback control and the actual task acceleration calculated from optimal joint acceleration. As can be seen in Fig. 6, RHP performs almost identically to HQP with fixed task priority and both converge rapidly to zero static error state. GHC performs similarly to the other two hierarchical control algorithms (HQP, RHP) on the highest priority task (horizontal position x) but has a slower converging speed on tasks with lower priority. In WQP, tasks with larger weights converge slower because of the influence of tasks with smaller weights, but the smaller weighted tasks may converge faster than the hierarchical control algorithms accordingly. 
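The PD feedback that produces each task's desired acceleration, and hence the objective vector of that task, can be written as a one-line helper (hypothetical name; gains as in Table 1):

```python
def task_acceleration(kp, kd, x_ref, x, dx_ref, dx):
    """Expected task-space acceleration from PD feedback.

    kp (s^-2) and kd (s^-1) are the gains of Table 1; the returned value
    is used as the objective b of the corresponding task."""
    return kp * (x_ref - x) + kd * (dx_ref - dx)
```

For example, the horizontal position task with a unit step reference and the arm at rest yields a desired acceleration of `kp * 1 = 100`.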
When there exists a joint configuration $q$ with $\dot{q} = 0$ that enables multiple task objectives to be achieved simultaneously, these tasks are said to be simultaneously feasible. In this simulation, the first three tasks are simultaneously feasible, and zero static error on the first three tasks is achieved by every hierarchical control algorithm (HQP, GHC and RHP), as shown in Table 2. The fourth task is not simultaneously feasible because it conflicts with the previous tasks, and therefore has a static error.

Fig. 6. Control effect of the 4-DoF planar robotic arm with different algorithms. (a) Position x error; (b) Position y error; (c) Orientation θ error; (d) Joint angular position error.

Table 2. Static error of the 4-DoF planar robotic arm with different algorithms for all tasks at the 10th second of the simulation.

Algorithm      Horizontal error Δx (m)  Vertical error Δy (m)  Orientation error Δθ (rad)  Norm of joint angular position error ||Δq||₂
WQP-1          0.000                    0.000                  −0.004                      2.373
WQP-2          0.001                    0.008                  −0.016                      2.357
HQP-original   0.000                    0.000                  0.000                       2.375
HQP-nullspace  0.000                    0.000                  0.000                       2.375
GHC            0.000                    0.000                  0.000                       2.375
RHP            0.000                    0.000                  0.000                       2.375

WQP, on the other hand, does not guarantee zero static error for tasks with larger weights. It can be seen from Fig. 7 that when the weights are relatively small, an obvious static error exists between the optimal acceleration and the expected acceleration, and increasing the weights can reduce the error. However, the influence of different dimensions and of the other related tasks makes it difficult to determine task weight settings quantitatively so as to guarantee the errors of the more important tasks.

Some comments should be made on the convergence of GHC. To deal with the multi-task problem, GHC directly cuts off the components of lower priority tasks that have an impact on higher priority tasks. This can result in situations where the optimization results of lower priority tasks do not keep up with the expected values even if the problem itself admits a solution to the multi-objective problem, such as the non-Pareto minimum mentioned for Fig. 4(b) when α = 1. From Fig. 6(b) we can see that GHC converges faster in position y than any other algorithm, but this is not because GHC converges better on lower priority tasks. After cutting off the influence of the y direction on the x direction, the acceleration mapped to the y direction from the optimal solution at the current moment is smaller than the expected acceleration of the PD feedback, as can be seen in Fig. 7. As a result, GHC converges faster in the early stage. The fluctuation in the position y error after about 1.2 s is also caused by this factor. This feature of GHC can greatly affect its convergence. Take the orientation θ in Fig. 6(c) for example: the optimized acceleration is the exact opposite of the expected acceleration, so at 0.2–0.5 s the task is feasible but the error increases. In order not to affect the higher priority tasks of positions x and y, the orientation task does not reach static stability until about 8 s later (not shown in the figure).

Fig. 7. Acceleration loss in the space of each task of optimal solutions from different algorithms.

Figure 8 depicts the histogram of the time consumed by the different algorithms solving the WBC problem in each control cycle during the whole simulation process, counted by interval. It can be seen that the calculation time of WQP is much lower than that of the other algorithms because its structure is simple and it only needs to solve one QP problem. HQP-nullspace is next, while HQP-original, GHC and RHP consume the most time, almost the same amount. Figure 9 is a bar chart of the average time consumed by the different algorithms in two parts: constructing the WBC problem and solving the QP. It can be seen that the average time spent on solving the QP problem by HQP-nullspace is less than that of HQP-original, which accords with the previous theoretical analysis. The time consumption of HQP-nullspace, GHC and RHP in constructing the WBC problem should theoretically be higher than that of HQP-original because of the additional matrix decomposition for calculating the nullspace; here, however, we find that their time consumption is about the same, because the 4-DOF robotic model is relatively simple and the complexity of the matrix decomposition is low.

Fig. 8. Counts of computation time of the 4-DOF planar robotic arm with different algorithms.

Fig. 9. Average computation time of the 4-DOF planar robotic arm with different algorithms.

4.2 Tasks with Transitional Priority

This section uses the dynamic addition and deletion of tasks and the dynamic change of priority to reflect the characteristics of the priority transition algorithms, as well as the impact of introducing the transition process on priority changes.

660

Q. Li et al.

On the basis of the tasks in Sect. 4.1, a limit on the angular position of the first joint is imposed. Let the lower and upper bounds of the first joint position $q_1$ be $[\underline{q}_1, \bar{q}_1]$ and set the buffer width as $\beta_1 \in \mathbb{R}^+$; this task can then be defined as

\[ \min_{\ddot{q}} \, \| J_1 \ddot{q} - \ddot{x}_{j1} \|, \tag{31} \]

where $J_1 \in \mathbb{R}^{1 \times 4}$ is a zero matrix with $J_1(1,1) = 1$. $\ddot{x}_{j1}$ is defined as

\[ \ddot{x}_{j1} = \begin{cases} k_p\left((\underline{q}_1 + \beta_1) - q_1\right) - k_v \dot{q}_1, & \text{if } q_1 < \underline{q}_1 + \beta_1 \\ k_p\left((\bar{q}_1 - \beta_1) - q_1\right) - k_v \dot{q}_1, & \text{if } q_1 > \bar{q}_1 - \beta_1 \\ 0, & \text{otherwise.} \end{cases} \tag{32} \]

When $q_1 \in [\underline{q}_1 + \beta_1, \bar{q}_1 - \beta_1]$, this task is deleted and the priority set of the other tasks is the same as in Sect. 4.1. For WQP, the corresponding weight is set to 0, while the priority matrices of GHC and RHP are set as

\[ \Psi_{ghc,0} = \begin{bmatrix} 0&0&0&0&0 \\ 1&0&0&0&0 \\ 1&1&0&0&0 \\ 1&1&1&0&0 \\ 1&1&1&1&1 \end{bmatrix}, \quad \Psi_{rhp,0} = \begin{bmatrix} 0&0&0&0&0 \\ 1&0&0&0&0 \\ 1&1&0&0&0 \\ 1&1&1&0&0 \\ 1&1&1&1&0 \end{bmatrix}, \tag{33} \]

respectively. When $q_1 \le \underline{q}_1$ or $q_1 \ge \bar{q}_1$, this task is activated and has the highest priority over the other tasks. For WQP, the corresponding weight is set to 10 times the previous maximum task weight, $10 w_{max}$, while the priority matrices of GHC and RHP are set as

\[ \Psi_{ghc,1} = \begin{bmatrix} 0&0&0&0&1 \\ 1&0&0&0&1 \\ 1&1&0&0&1 \\ 1&1&1&0&1 \\ 0&0&0&0&0 \end{bmatrix}, \quad \Psi_{rhp,1} = \begin{bmatrix} 0&0&0&0&1 \\ 1&0&0&0&1 \\ 1&1&0&0&1 \\ 1&1&1&0&1 \\ 1&1&1&1&1 \end{bmatrix}, \tag{34} \]

respectively. An activation parameter can be defined from the angular position of joint $q_1$ as

\[ \alpha_{j1} = \begin{cases} 0, & \text{if } \underline{q}_1 + \beta_1 \le q_1 < \bar{q}_1 - \beta_1 \\ 1, & \text{if } q_1 < \underline{q}_1 \text{ or } q_1 \ge \bar{q}_1 \\ (\underline{q}_1 + \beta_1 - q_1)/\beta_1, & \text{if } \underline{q}_1 \le q_1 < \underline{q}_1 + \beta_1 \\ (q_1 - \bar{q}_1 + \beta_1)/\beta_1, & \text{if } \bar{q}_1 - \beta_1 \le q_1 < \bar{q}_1 \end{cases} \tag{35} \]

With this parameter, the priority of the joint limit task can change continuously. In WQP, the weight for this task is set to $10 \alpha_{j1} w_{max}$, while in GHC and RHP the priority matrices are set as

\[ \Psi_{ghc} = (1-\alpha_{j1})\Psi_{ghc,0} + \alpha_{j1}\Psi_{ghc,1}, \tag{36} \]
\[ \Psi_{rhp} = (1-\alpha_{j1})\Psi_{rhp,0} + \alpha_{j1}\Psi_{rhp,1}, \tag{37} \]

respectively.
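The activation law (35) and the blending (36)–(37) can be sketched as follows (hypothetical helper names, NumPy assumed):

```python
import numpy as np

def activation(q, q_lo, q_hi, beta):
    """Activation parameter of the joint-limit task, cf. (35):
    0 inside the safe zone, 1 beyond a limit, linear in the buffers."""
    if q < q_lo or q >= q_hi:
        return 1.0
    if q < q_lo + beta:                  # lower buffer zone
        return (q_lo + beta - q) / beta
    if q >= q_hi - beta:                 # upper buffer zone
        return (q - q_hi + beta) / beta
    return 0.0

def blend(psi_0, psi_1, a):
    """Continuous priority matrix, cf. (36)-(37)."""
    return (1.0 - a) * psi_0 + a * psi_1
```

With the bounds [−15°, 15°] and β1 = 5° used below, a joint at −12.5° is halfway through the lower buffer and yields an activation of 0.5.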


In this simulation, the settings of the other tasks are again those shown in Table 1. The lower and upper bounds of the first joint position $q_1$ are set as $[-15^\circ, 15^\circ]$ and the buffer width is set as $\beta_1 = 5^\circ$. In order to reflect the influence of the joint position limit, the reference position of the TCP is planned so that, when it is close to the final target position, the first joint enters the buffer zone and the remaining joints cannot complete the planar position task, ending up approaching a singular pose. Two transition processes are designed for RHP; the difference is that the same priority is expressed with different $\Psi_{rhp,0}$. Label RHP-1 corresponds to (34) and label RHP-2 corresponds to (38). Figure 10 depicts the curves of the activation parameter $\alpha_{j1}$ and the angular position of the first joint over time, and Fig. 11 illustrates the control effects of the other tasks under the joint limit with different algorithms.

Fig. 10. Activation parameter and angular position of joint 1 of the 4-DOF planar robotic arm with different algorithms.

It can be seen that both GHC and RHP can use the transition state to achieve an effect similar to the partial completion of tasks in the WQP algorithm. When the first joint reaches the buffer zone as the task progresses, the joint position limit task is gradually activated to avoid the joint position exceeding the lower bound. Combined with the results in Sect. 4.1, we can see that different weight ratios have a huge impact on the task results in WQP, and the difficulty of quantitatively analyzing the weights is once again reflected. The weights themselves can only reflect a relative relationship between tasks, and as the number of mutually coupled tasks increases, the weight change of a single task may have a great impact on the overall solution, an impact that is difficult to describe with a model. In contrast, GHC and RHP in transition are easier to analyze. For GHC, after it reaches a steady state, the activation parameter $\alpha_{j1}$ is about 0.326,


meaning that the optimization results for subsequent tasks will be clipped by about one-third on the first joint. This results in an error between the optimal solution of GHC and the expected acceleration of the PD feedback controller on the subsequent tasks, and will lead to the error shown in Fig. 11 after entering the transition process. RHP will complete the lower priority tasks as much as possible after partially completing the joint limit (or choosing a joint space configuration that does not affect the joint limit task), so it can be seen that the lower priority tasks can still be tracked well.

Fig. 11. Control effect of the 4-DoF planar robotic arm under the joint limit with different algorithms. (a) Position x error; (b) Position y error; (c) Orientation θ error; (d) Joint angular position error.

It should be noted that the setting of the RHP priority matrix will affect the completion of tasks during the transition process. If the priority matrix at the beginning is instead set as

\[ \Psi_{rhp,0} = \begin{bmatrix} 1&0&0&0&0 \\ 1&1&0&0&0 \\ 1&1&1&0&0 \\ 1&1&1&1&0 \\ 1&1&1&1&0 \end{bmatrix}, \tag{38} \]


Table 3. Performance of different QP algorithms for redundant robot control.

                                WQP        HQP-original  HQP-nullspace  GHC      RHP
Strict hierarchy                no         yes           yes            yes      yes
Pareto optimum                  yes        yes           yes            no       yes
Numerical singularity           no         no            yes            yes      yes
Continuous priority transition  yes        no            no             yes      yes
Hierarchy in transition         no         –             –              partial  partial
Computation efficiency          very fast  slow          fast           slow     slow

Although this also means that there is no joint limit task and that the priority order of the other tasks is the same as in Sect. 4.1, in the transition process it additionally has the effect of gradually transitioning the priority of the other tasks to a lower level while gradually activating the joint limit task as the highest priority. This results in a conflict on the fourth level between the orientation task and the joint angular position task, i.e., the available DoFs for the orientation task become fewer. Thus, as Fig. 11(c) illustrates, a tracking error occurs for RHP-2 in the steady state, but RHP-2 has a slightly smaller angular position error than RHP-1, as can be seen in Fig. 11(d). For RHP, therefore, the design of the transition process is also a factor that needs special consideration.

5 Conclusion

In this paper, we present an overview of related algorithms for the WBC of redundant robots using QP, and meanwhile offer guidance on how to choose an appropriate algorithm according to the task setting. Theoretical analysis and simulation results demonstrate the advantages and disadvantages of the multi-objective optimization algorithms in different aspects. Table 3 summarizes the performance of the algorithms discussed in this paper. After the task control problem is abstracted as general quadratic programming, the conflict between tasks in the multi-task control problem is explained from the perspective of multi-objective optimization. The essential idea in dealing with these conflicts is to assign a different priority to each task so as to obtain an optimal solution with trade-offs. Our analysis first compares the effect of WQP and HQP on the task hierarchy with fixed priority. The former can only reduce the impact of low priority tasks on high priority tasks to a certain extent through the weighting strategy, and this impact can never be completely eliminated; the latter strictly guarantees that low priority tasks will not affect high priority tasks during the control process, and its optimization result is Pareto optimal. The original HQP needs to solve multiple optimization problems, each comparable to WQP, so its computational efficiency is much lower than that of WQP. By parameterizing the nullspace basis matrix, the optimization variable dimension of the subsequent QP problems is reduced and the computational efficiency can be improved. However, the introduction of the nullspace basis matrix brings instability to the numerical solution


of the optimization problem, which needs to be avoided by adding a regular term. Then the control effect of GHC and RHP are discussed under transitional priority. The optimal solution of GHC is usually not the Pareto optimum of multi-objective optimization problem and the convergence performance of GHC is not as good as RHP. Both of them can realize the continuous change of priority and the effect that some tasks are affected during the transition process while other tasks are still strictly hierarchical. Although WQP can also achieve similar transition effects, the defect that its weight setting is not quantifiable will be more obvious and the priority of other tasks is not strictly hierarchical. It should be noted that there may be many ways to design the RHP priority matrix, which will eventually affect the priority of other tasks during the transition process, while the priority matrix of GHC is relatively unique. Finally, the above algorithms are implemented in a toolkit, which can easily realize various multi-task control algorithms with the help of a unified interface. With numerous simulations, the characteristics of each algorithm are compared and summarized. Despite the progress in algorithms, there are still drawbacks such as lack of considering the time dimension and time consuming of inequality tasks transition. One consideration of this issue is expanding the algorithm to combining model prediction or probability. Besides, the trade-off between algorithm effects and computational efficiency should also be considered. Therefore, another consideration is the efficiency of the algorithm. To validate this, benchmark data of the WBC problem for the different algorithms, which is scarce and elusive, could be explored. A last issue of robot WBC is how to assign appropriate priority to tasks, which relies heavily on the experience of the expert. And in the transition priority algorithm, the appropriate priority should vary with the environment. 
In reinforcement learning (RL), an optimal policy can be learned to maximize a user-defined reward from the environment. Therefore, a possible avenue to solve this issue is combining RL with robot WBC to train an agent that learns a policy for adjusting the task priorities.

Acknowledgement. This work was supported by STI 2030-Major Projects 2021ZD0201402.

An Overview of Multi-task Control Globally


Improvement of Hierarchical Clustering Based on Dynamic Time Warping

Xudong Yuan1(B) and Yifan Lu2

1 College of Information Science and Engineering, Hohai University, Changzhou 231022, China
[email protected]
2 Business School, Hohai University, Changzhou 231022, China

Abstract. In view of the low efficiency and poor clustering quality of traditional hierarchical clustering algorithms, this paper measures distance with dynamic time warping (DTW) and proposes an adaptive divisive analysis (DIANA) based on the minimum spanning tree, which replaces the traditional clustering process with the solution of a minimum spanning tree to improve both algorithm performance and clustering quality. For solving the minimum spanning tree, the algorithm combines a min-heap with a disjoint-set union, which significantly improves running efficiency. The clustering step adopts divisive hierarchical clustering on the minimum spanning tree according to the "nearest neighbor" principle, which ensures a good clustering effect while reducing the amount of computation. Time-complexity analysis shows a significant efficiency improvement over the previous algorithm. Finally, Python simulation experiments on a shared-bike data set confirm the clustering quality of the model.

Keywords: Dynamic Time Warping · Hierarchical Clustering · Minimum Spanning Tree

1 Introduction

A time series is a sequence of data that records one or more attributes changing over time. With the development of computer technology, research on time series has become increasingly extensive in related fields. Clustering is the process of grouping physical or abstract sets into multiple classes composed of similar objects [1], and clustering algorithms are one of the important application directions of time series analysis. Hierarchical clustering recursively merges or splits data objects to achieve the desired result. The algorithm is insensitive to the order of data input and suitable for any form of cluster, so it plays a crucial role in traffic communication, weather forecasting, ecological preservation, environmental conservation, and other related areas. Reference [2] combines ant colony optimization with agglomerative hierarchical clustering, reduces the influence of local optima by introducing the randomness and iteration mechanism of the ant colony algorithm, and determines the locations of shared-bike maintenance points through clustering. Reference [3] uses agglomerative hierarchical clustering with different similarity measures to analyze the clustering results, so as to conduct statistical analysis on precipitation time series. Reference [4] combines genetic-learning adaptive particle swarm optimization with hierarchical clustering to minimize network loss. This paper enhances the hierarchical clustering algorithm of [5] by analyzing the interplay between the edges and nodes of the minimum spanning tree, and employs the nearest-neighbor relationship to derive the edge set of the minimum spanning tree, thereby achieving improved performance. By precomputing the minimum spanning tree, unnecessary recalculation is avoided during each clustering step, improving computational efficiency.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 667–674, 2023. https://doi.org/10.1007/978-981-99-6187-0_65

2 Model Construction

2.1 Dynamic Time Warping

Dynamic time warping (DTW) is an important similarity measure with high accuracy and stability. Unlike the traditional Euclidean distance, which compares data values at the same time instants, DTW can compare time series at asynchronous time points by warping the time axis. Given two time series x = (x1, x2, ..., xm) and y = (y1, y2, ..., yn) (when m = n they are of equal length), compute the distance between any two points to obtain the distance matrix Dm×n. The distance between xi and yj is

D(i, j) = ||xi − yj||w    (1)

where i = 1, 2, ..., m and j = 1, 2, ..., n; when w = 2 this is the Euclidean distance. A warping path P = (p1, p2, ..., pk, ..., pl) records the accumulated distance between x and y, with elements pk = (i, j)k, k = 1, 2, ..., l. A valid path must satisfy three constraints:

• Boundary conditions: the path starts at p1 = D(1, 1) and ends at pl = D(m, n), i.e., it runs diagonally from the bottom-left to the top-right of the matrix.
• Continuity: given pk−1 = (a′, b′) and pk = (a, b), then a − a′ ≤ 1 and b − b′ ≤ 1, so consecutive points of the path must be adjacent cells of the matrix.
• Monotonicity: given pk−1 = (a′, b′) and pk = (a, b), then a − a′ ≥ 0 and b − b′ ≥ 0, so the indices must increase monotonically with time.

Among the paths satisfying these three constraints, the one with the minimum accumulated distance is selected as the final value of P:

P = min{ (1/l) Σ_{k=1}^{l} pk }    (2)


Through dynamic programming, the shortest path is computed by building the accumulated-cost matrix with the recursive function γ(i, j), which yields the minimum accumulated distance:

γ(i, j) = d(i, j) + min{γ(i − 1, j − 1), γ(i − 1, j), γ(i, j − 1)}    (3)
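The recursion in Eq. (3) can be sketched in Python as follows; this is a minimal illustration with our own function and variable names, not the authors' implementation:

```python
def dtw_distance(x, y, w=2):
    """Dynamic time warping distance between series x and y.

    The pointwise cost is d(i, j) = |x_i - y_j|^w, so w = 2 corresponds
    to the squared ground distance used with the Euclidean case.
    """
    m, n = len(x), len(y)
    INF = float("inf")
    # gamma[i][j] = minimum accumulated cost of a warping path ending
    # at cell (i, j); row/column 0 is a boundary of infinities, Eq. (3).
    gamma = [[INF] * (n + 1) for _ in range(m + 1)]
    gamma[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d = abs(x[i - 1] - y[j - 1]) ** w
            gamma[i][j] = d + min(gamma[i - 1][j - 1],  # diagonal match
                                  gamma[i - 1][j],      # step in x only
                                  gamma[i][j - 1])      # step in y only
    return gamma[m][n]
```

For example, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is 0, because warping lets the repeated value 2 align with a single point, even though the series have unequal lengths; the two nested loops make the O(mn) complexity noted below explicit.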

The Euclidean distance of two time series can thus be regarded as a special case of DTW in which the warping path is the main diagonal, i.e., pk = (i, j)k with i = j = k. DTW applies not only to time series of equal length but also to series of unequal length, and its time complexity is O(mn).

2.2 Kruskal Algorithm with Heap Optimization

The minimum spanning tree is a concept from graph theory [7]: from the existing points of a graph, several edges are selected to form a tree. A graph is expressed as G(V, E), where V is the set of vertices and E the set of edges. Each edge carries a weight, and the set of weights is denoted W. The tree formed by the selected edges is the minimum spanning tree, denoted MST.

A heap, written Heap(N, W), is the general name of a special tree-shaped data structure. The root has index root = 0, and each node i has a number n(i) and a weight w(i). If the root weight w(root) is the largest in the heap, it is a max-heap; if w(root) is the smallest, it is a min-heap. The min-heap has the following two properties:

• For any node i, its weight w(i) is less than the weights w(2i + 1) and w(2i + 2) of its two child nodes.
• The heap is a complete binary tree.

A heap supports two operations, floating up and sinking down. Floating up moves smaller elements from the bottom of the min-heap upwards; sinking moves larger elements from the top of the heap downwards. With these operations, the data inserted into the heap can be kept ordered. When all data are inserted at once and adjusted from bottom to top, the build takes O(N) time, so the heap can be used to optimize the sorting step of the Kruskal algorithm. The disjoint-set union is denoted DSU.
A DSU consists of several disjoint subsets [10] and supports Union and Find operations:

• Union: merge set a and set b into one set.
• Find: determine which set element a belongs to.

The DSU is used to judge whether the two endpoints of a graph edge are already connected in the minimum spanning tree: if they are, the edge is discarded; otherwise it is added. In our Kruskal algorithm, if an edge's two endpoints share the same ancestor in the DSU, adding the edge would create a cycle, so it is not added to the MST. Considering the characteristics of the data, the heap is used to optimize the sorting of edges by weight. The complete
graph generated with DTW is then taken as input to derive the adjacency list of the minimum spanning tree. The algorithm proceeds as follows:

1. Insert the edge set of the complete graph G(V, E), in adjacency-table form, into the min-heap Heap(N, W) without sorting during insertion. After all insertions are complete, perform the sinking operation starting from the penultimate layer.
2. While the heap is not empty and the number of edges N(E) in the MST is less than the number of nodes N(V) − 1, take the edge e(i, j, w) at the top of the heap w(root) and query its two endpoint vertices in the DSU. If the two vertices already share the same parent vertex, discard the edge; otherwise add e(i, j, w) to the minimum spanning tree MST. Repeat step 2.
3. Output MST and terminate.

Evidently, the heap efficiently yields the minimum-weight edge at low cost, and the efficiency of the heap implementation depends on how often the minimum weight is retrieved. Hence, to optimize the Kruskal algorithm, pruning is applied to minimize the number of sorting operations before edges are extracted from the heap, which reduces the computational overhead and improves the overall efficiency of the algorithm. By calculation, the best-case time complexity of this algorithm is O(E + V log E), and the worst-case time complexity is O(E + E log E).

2.3 Hierarchical Clustering

Hierarchical clustering is divided into "top-down" divisive hierarchical clustering (DIANA) and "bottom-up" agglomerative hierarchical clustering (AGNES). This paper employs divisive hierarchical clustering as the primary approach and further optimizes it with the minimum spanning tree algorithm.
Since the traditional agglomerative hierarchical clustering used in [5] needs the number of clusters as input in advance and is relatively inefficient, we improve it with the minimum spanning tree algorithm. We improve the accuracy of the clustering results by optimizing the way the threshold is chosen, thereby also improving the efficiency of the algorithm to a certain extent. The clustering procedure is as follows:

1. QMST is the queue receiving minimum subtrees; ListMST is the set receiving clustering results. Both are empty at the beginning.
2. Add the minimum spanning tree MST to QMST.
3. While QMST is not empty, take an element SubMST(MST) from QMST for judgment. If every edge in SubMST(MST) is less than or equal to the threshold σ, place it into ListMST as a clustering result; otherwise pass it to the subtree-cutting algorithm. Repeat step 3.

Subtree-cutting algorithm:

1. Enumerate the edge set E of the input subtree in the order non-nearest-neighbor edges, one-way nearest-neighbor edges, two-way nearest-neighbor edges [6]. For each edge e(i, j, w), if w > σ, cut the edge.


2. Judge the connectivity of the graph after cutting to obtain two minimum spanning subtrees SubMST and return them to QMST.

Based on these calculations, computing the threshold value requires O(N) time, checking whether a value exceeds the threshold also requires O(N) time, and the subtree-cutting algorithm requires O(KN) time.
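The pipeline of Sects. 2.2–2.3, heap-based Kruskal with a disjoint-set union followed by threshold cutting, can be sketched as below. All names are ours, the pairwise DTW distances are assumed to be precomputed, and the sketch simplifies the paper's subtree-cutting step (which orders edges by nearest-neighbor type) to plain threshold cutting:

```python
import heapq

class DSU:
    """Disjoint-set union with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, a):
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]  # shorten the chain
            a = self.parent[a]
        return a
    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # same tree: the edge would close a cycle
        self.parent[ra] = rb
        return True

def kruskal_mst(n, edges):
    """edges: iterable of (w, i, j) tuples. Returns the n-1 MST edges."""
    heap = list(edges)
    heapq.heapify(heap)  # O(E) bottom-up build instead of a full sort
    dsu, mst = DSU(n), []
    while heap and len(mst) < n - 1:
        w, i, j = heapq.heappop(heap)
        if dsu.union(i, j):  # skip cycle-closing edges
            mst.append((w, i, j))
    return mst

def diana_cut(n, mst, sigma):
    """Divisive clustering: drop MST edges heavier than sigma, then
    read the clusters off as the remaining connected components."""
    dsu = DSU(n)
    for w, i, j in mst:
        if w <= sigma:
            dsu.union(i, j)
    clusters = {}
    for v in range(n):
        clusters.setdefault(dsu.find(v), []).append(v)
    return list(clusters.values())
```

For example, on four points forming two tight pairs joined only by heavy edges, `kruskal_mst(4, ...)` keeps the two light edges plus one heavy bridge, and `diana_cut` with a threshold between the light and heavy weights returns the two pairs as separate clusters.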

3 Experimental Analysis

3.1 Data Set and Programming Environment

This experiment uses the total usage of shared bicycles in a U.S. location during 2011 and 2012, obtained from the Alibaba Cloud Tianchi database [9]. The data set covers 731 dates in total. Owing to the different travel volumes on holidays and working days, the use of shared bicycles differs and shows obvious cyclical fluctuations, as shown in the figure below (Fig. 1):

Fig. 1. Hourly use of shared bicycles from July 1st to July 8th, 2011.

3.2 Calculation Method of Precision

The clustering accuracy of the experimental data is expressed by the precision [8], denoted p. The precision directly reflects the difference between the clustering results and the actual ground truth:

p = (number of accurately clustered data points) / (total number of data points)    (4)

By adjusting the threshold of the two algorithms, we can obtain each algorithm's optimal clustering. A clustering run is considered accurate if its p value is greater than or equal to a given value α; here α = 1. The accuracy of the experiment can then be calculated by the following formula:

accuracy = (accurate clustering runs / total clustering runs) × 100%    (5)
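Eqs. (4) and (5) amount to two ratios and can be sketched as follows; this is our own illustration, and it assumes the cluster labels have already been aligned with the ground-truth labels:

```python
def precision(cluster_labels, true_labels):
    """Eq. (4): fraction of data points assigned to the correct cluster."""
    correct = sum(c == t for c, t in zip(cluster_labels, true_labels))
    return correct / len(true_labels)

def accuracy(run_precisions, alpha=1.0):
    """Eq. (5): percentage of clustering runs whose precision reaches alpha."""
    accurate_runs = sum(p >= alpha for p in run_precisions)
    return accurate_runs / len(run_precisions) * 100.0
```

With α = 1 as in the paper, a run only counts as accurate when every point is clustered correctly, e.g. three perfect runs out of four give an accuracy of 75%.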


3.3 The Measurement of Algorithm Performance

The algorithm performance in this experiment is judged by time complexity. In [5], DTW_AGNES is used to cluster time series, and its clustering step has time complexity O(N³), whereas the average time complexity of the hierarchical clustering based on the minimum spanning tree used in this paper is O(N² log(N)), where N is the number of nodes. We also investigated other clustering algorithms that have proved efficient: the time complexity of the DBSCAN algorithm is O(N log(N)) and that of the K-means method is O(N), where N again denotes the number of nodes. Excluding the DTW computation, the time complexity of our algorithm can reach O(N²). This exceeds that of the TDMST algorithm and is inferior to the linear K-means algorithm and the DBSCAN algorithm. From the figures above we conclude that the time complexity of our algorithm has been improved over DTW_AGNES; however, because the cost is dominated by the Kruskal algorithm, the improvement is modest.

3.4 An Example of the Experiments

We conducted a series of tests to evaluate the efficiency of our algorithm using a selected data set consisting of 8 time series. The data points are illustrated in the figure below (Fig. 2):

Fig. 2. Examples of the data points.

Based on the inference that the first, sixth, and seventh time series belong to one cluster while the rest belong to another, we tested our algorithm. The resulting cluster assignment is as follows (Fig. 3): indices 0 (the first), 5 (the sixth), and 6 (the seventh) belong to the same cluster, while indices 1, 2, 3, 4, and 7 form the other cluster. This outcome validates our algorithm with a clustering precision of p = 100%, and the accurate clustering count of our algorithm is incremented by 1.

3.5 The Results of the Experiments

We tested DTW_AGNES and our algorithm (DTW_DIANA) with respect to performance and accuracy. The accuracy and experiment results are shown in the following Table 1:


Fig. 3. The results of the algorithm.

Table 1. Comparison of optimal calculation accuracy of algorithms.

Type           | DTW_AGNES | DTW_DIANA
Total times    | 32        | 32
Accurate times | 26        | 31
Accuracy       | 81.25%    | 96.88%

We have also tested other algorithms against ours, with the following results (Table 2):

Table 2. Comparison of optimal calculation accuracy of other algorithms.

Type           | DBSCAN | K-means
Total times    | 32     | 32
Accurate times | 29     | 23
Accuracy       | 90.62% | 71.88%

Therefore, based on the same dataset and programming environment, we can conclusively state that DTW_DIANA exhibits higher accuracy compared to DTW_AGNES. Furthermore, it surpasses both DBSCAN and K-means in terms of accuracy when applied to the specific dataset.

4 Conclusion

The clustering effect of this algorithm is better than that of the original algorithm, and because the minimum spanning tree does not need to be recalculated repeatedly, its computational load is greatly reduced; its computational performance is significantly higher than that of traditional agglomerative hierarchical clustering.

Because this algorithm realizes the minimum spanning tree with a heap and a disjoint-set union, it can be further optimized in two respects: the retrieval of heap elements, and state compression of the disjoint-set union together with its merging operation. Since the algorithm is based on the "nearest neighbor" idea [6], in which a data point and its nearest neighbor belong to the same cluster, it can only be applied to low-dimensional data sets. As the algorithm relies on a threshold, it retains some subjectivity; in practice, however, the threshold is chosen using the concept of statistical significance [5], which gives it some mathematical grounding.

References

1. Han, J.W., Kamber, M.: Data Mining: Concepts and Techniques. Morgan Kaufmann, San Francisco (2000)
2. Mao, H., Tang, K.: Planning model of shared bike maintenance points based on hierarchical clustering. Electr. Des. Eng. 30(21), 20–23 (2022). (in Chinese)
3. Jinrong, Q., Xinpeng, Y., Xudong, L., Yanxin, X.: Application of agglomerative hierarchical clustering method in precipitation forecast assessment. J. Arid Meteorol. 40(4), 690–699 (2022). (in Chinese)
4. Wang, Y., Wang, M., Zhou, J., Zou, Y., Li, S.: Dynamic configuration of distribution network based on improved hierarchical clustering and GL-APSO algorithm. Chinese J. Intell. Sci. Technol. 4(3), 410–417 (2022). (in Chinese)
5. Li, X., Li, J., Li, J.: Clustering research on time series of online car-hailing demand based on the improved DTW_AGNES. J. Chongqing Jiaotong Univ. (Nat. Sci.) 38(8), 13–19 (2019). (in Chinese)
6. Xu, C., Gao, M.: Improved adaptive hierarchical clustering algorithm based on minimum spanning tree. Comput. Eng. Appl. 50(22), 149–153 (2014). (in Chinese)
7. Li, G., Li, Y., Zhu, X., Liu, L.: Particle swarm optimization algorithm based on minimum spanning tree. Comput. Eng. Des. 43(7) (2022). (in Chinese)
8. Sun, J., Liu, J., Zhao, L.: Clustering algorithm research. J. Softw. 19(1), 48–61 (2008). (in Chinese)
9. Fanaee-T, H., Gama, J.: Event labeling combining ensemble detectors and background knowledge. Progress Artif. Intell. 2, 1–15 (2013)
10. Shi, B., He, Y., Ma, S.: Research and design of improved maze map generation algorithm based on union-find sets. J. Changchun Normal Univ. 41(4), 51–55 (2022)

MNGAN: Multi-Branch Parameter Identification Based on Dynamic Weighting

Liudong Zhang1, Zhen Lei1, Zhiqiang Peng2, Min Xia3(B), Gang Zou3, and Jun Liu4

1 State Grid Jiangsu Electric Power Co. Ltd., Nanjing 210024, China
2 State Grid Jiangsu Electric Power Co. Ltd., Research Institute, Nanjing 211103, China
3 Nanjing University of Information Science and Technology, Nanjing 210044, China
[email protected]
4 China Electric Power Research Institute Co. Ltd., Nanjing, Jiangsu 210000, China

Abstract. A transmission line parameter identification method based on a graph neural network (MNGAN) is proposed. A self-supervised graph attention network (SuperGAT) is used to learn the branch characteristics of the power grid. After selecting an appropriate attention form, the network can dynamically learn the relationship between the current node and other nodes and assign larger weights to the nodes most correlated with the current node. At the same time, a multi-head attention mechanism performs adaptive learning over different feature spaces. In addition, for the multi-task learning module, a new multi-task loss function based on homoscedastic uncertainty is designed, which balances the parameters of multiple targets to achieve simultaneous identification of multi-branch parameters. Experiments show that our method has higher accuracy and stronger robustness than traditional methods.

Keywords: Parameter Identification · Attention Mechanism · Transmission Line

1 Introduction

Transmission line parameter identification has always played an important role in the power grid [1]. Accurate transmission line parameters ensure the safe and economical operation of the power system [2]. At present, there are four main types of transmission line parameter identification methods: (1) methods based on theoretical calculation [3]; (2) outage measurement methods; (3) methods based on SCADA (supervisory control and data acquisition) data [4]; and (4) methods based on PMU (phasor measurement unit) data [5]. Currently, PMUs are installed only in key substations, while SCADA systems are widely deployed; therefore, parameter identification based on SCADA is more practical [6]. Our experimental data are SCADA data. Existing methods have the following limitations: (1) they cannot consider the topology of the grid branches and treat only a single branch at a time, ignoring that the power transmission system is a global system; (2) they rely heavily on measuring equipment, and once the equipment fails, the task cannot be completed. In view of the above problems, this paper uses a self-supervised graph attention network (SuperGAT [7]) to capture branch information. Different from traditional machine learning methods, the network can consider the topology of power grid branches and assign weights to each branch through attention. At the same time, a multi-head attention mechanism is added to the network; the model aggregates node features from different feature spaces so that it can better express branch information. The multi-task regression module adds a multi-task loss function based on homoscedastic uncertainty, which makes the model focus on large-magnitude targets without ignoring smaller-magnitude ones, realizing simultaneous identification of multi-branch parameters.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 675–682, 2023. https://doi.org/10.1007/978-981-99-6187-0_66

2 Parameter Identification Based on Multi-task Noise Graph Attention Network

2.1 Transmission Line Branch Parameter Identification Task

According to the characteristics of the power grid branches, the task of this paper is to predict the true value of the branch conductance G of the power grid. We first construct an undirected graph G(V, E) of the transformers and transmission lines in the power system, where V is the vertex set and E is the edge set. When generating the graph data, this paper defines the transmission lines as nodes and the generators as edges. For a given edge of the power transmission system, we express its input as X = (pm, pn, qm, qn, um, un, y), where pm and pn represent the active power at the two ends of the branch, qm and qn the reactive power at the two ends, um and un the positive-sequence voltage at the two ends, and y the ground susceptance of the branch. Because a high-voltage transmission line with positive parameters satisfies the π-type equivalent circuit [8], we can use Eq. (1) to calculate the real value of the branch conductance G:

G = (pm + pn)[pm² + (qm + um²y)²] / (um²[(pm + pn)² + (qm + qn + (um² + un²)y)²])    (1)

Therefore, the power grid branch parameter identification task can be defined as a multi-objective regression problem. Specifically, given the input Xk of a branch, the branch conductance gk is obtained by f(Xk), where f(X) denotes the mapping of the input X by the neural network algorithm.
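The branch conductance of Eq. (1) can be evaluated directly from the input tuple; a minimal sketch with our own function name, mirroring the notation above (not the authors' code):

```python
def branch_conductance(pm, pn, qm, qn, um, un, y):
    """Branch conductance G from the pi-type equivalent circuit, Eq. (1).

    pm, pn: active power at the two branch ends
    qm, qn: reactive power at the two branch ends
    um, un: positive-sequence voltage at the two branch ends
    y:      ground susceptance of the branch
    """
    numerator = (pm + pn) * (pm**2 + (qm + um**2 * y)**2)
    denominator = um**2 * ((pm + pn)**2 + (qm + qn + (um**2 + un**2) * y)**2)
    return numerator / denominator
```

This scalar computation is what the network's regression target gk corresponds to; the neural model learns the mapping f(X) so that noisy or incomplete measurements still yield a usable estimate.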


Different from traditional methods, our proposed multi-task noise graph attention network regards the power system as a global system, considers the influence of neighbor nodes, and has strong robustness to noise and other disturbances. The proposed model will be used to measure the branch conductance G of each branch, whose inputs are the characteristic matrix X ∈ Rn∗7 and the adjacency matrix A. The input data contains N nodes, each node contains 7 features. 2.2

2.2 Select the Form of Attention

The attention forms of the self-supervised graph attention network we proposed mainly include the following two types: SD (scaled dot-product) attention (Eq. (4)) and MX (mixed GO and DP) attention (Eq. (5)), built from the GO and DP scores in Eqs. (2) and (3):

e_ij,GO = a(W h_i ‖ W h_j)    (2)

e_ij,DP = (W h_i)^T · (W h_j)    (3)

e_ij,SD = e_ij,DP / √F    (4)

e_ij,MX = e_ij,GO · σ(e_ij,DP)    (5)
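The four attention scores (2)–(5) for a single node pair can be sketched in plain Python; the learnable vector a and the transformed features are passed in, and all names are ours:

```python
import math

def attention_scores(Whi, Whj, a, F):
    """Unnormalized GO, DP, SD and MX attention between nodes i and j,
    following Eqs. (2)-(5). Whi, Whj: transformed features W*h_i, W*h_j
    (length-F lists); a: length-2F weight vector of the GO form."""
    e_go = sum(ak * xk for ak, xk in zip(a, Whi + Whj))   # Eq. (2): a(Wh_i || Wh_j)
    e_dp = sum(x * y for x, y in zip(Whi, Whj))           # Eq. (3): dot product
    e_sd = e_dp / math.sqrt(F)                            # Eq. (4): scaled dot product
    e_mx = e_go / (1.0 + math.exp(-e_dp))                 # Eq. (5): GO * sigmoid(DP)
    return e_go, e_dp, e_sd, e_mx
```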

Among them, ‖ represents the splicing (concatenation) operation, σ represents the nonlinear activation function, and F represents the dimension of the feature. For different graphs, the choice of attention form is different. Which form of attention is selected depends on the homogeneity and average degree of the graph data [7]:

1. Homogeneity: the average ratio of neighbors with the same label as the central node [9].
2. Average degree: the average degree of a graph node, reflecting the density of a graph.

According to the above definitions, the formulas for calculating graph homogeneity and average degree are as follows:

X = (1/N) Σ_i ( Σ_{j∈N_i} 1_{l(i)=l(j)} / |N_i| )    (6)

D̄ = D̂ / N    (7)

where l(i) is the label of node i, N_i is the set of neighbors of node i, D̂ is the sum of degrees of all nodes, and N is the number of nodes. If the homogeneity is low, SD attention performs better in the model. If the average degree is not too low or not too high, and the homogeneity is higher than 0.2, then MX attention will perform better than SD attention [7]. According to the above formula, the homogeneity of our graph data is 0.33, so this paper chooses the MX attention form. MX attention is the result obtained by multiplying the DP attention, which has been transformed by the Sigmoid function, with the GO attention.
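Homogeneity (6) and average degree (7) can be computed from a label list and adjacency lists. A minimal sketch under our own naming:

```python
def homogeneity(labels, neighbors):
    """Eq. (6): average over nodes of the fraction of neighbors that
    share the central node's label. neighbors[i] lists node i's neighbors."""
    ratios = [sum(labels[i] == labels[j] for j in nbrs) / len(nbrs)
              for i, nbrs in enumerate(neighbors) if nbrs]
    return sum(ratios) / len(labels)

def average_degree(neighbors):
    """Eq. (7): sum of all node degrees divided by the number of nodes."""
    return sum(len(nbrs) for nbrs in neighbors) / len(neighbors)
```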

678

L. Zhang et al.

2.3 Noise Graph Attention Module

In order to predict the branch conductance G of the power grid, this paper designs a noise graph attention module, which is composed of SuperGAT_MX and a multi-head attention mechanism. In the simulated real situation, the branch parameters can still be accurately predicted. Fig. 1 is the total flow chart of our proposed method. The noise graph attention module we designed is mainly used to capture the grid branch features. Specifically, for a set of input node features with noise and other disturbances, h = {h_1, h_2, ..., h_N}, h_i ∈ R^F, the noise graph attention layer will generate a new set of node features h' = {h'_1, h'_2, ..., h'_N}, h'_i ∈ R^{F'} as output, where N represents the number of nodes and F represents the number of features contained in each node. The new node features will contain rich semantic information. The implementation details are as follows. First, the input data is passed through an MLP layer to convert the data into more advanced features with sufficient expressive power, and then the self-attention mechanism is performed on the nodes to obtain GO attention. Similarly, the parameterized node features are dotted to obtain DP attention. After passing through the sigmoid function, DP attention is multiplied by GO attention to give the MX attention adopted in this paper, which is nonlinearized by LeakyReLU.

Fig. 1. Multi-task noise graph attention model

At the same time, in order to make the coefficients between different nodes easy to compare, we use the Softmax_j function to standardize the coefficients over the selected j, and finally get the attention coefficient of node j to node i as follows:

α_ij^k = Softmax_j(LeakyReLU(e_ij,MX))    (8)

In order to make the model learn node features in different feature spaces and improve the robustness of the model to noise and outliers, we add a multi-head attention mechanism to the model and select the number of attention heads H = 3. Because we perform the multi-head attention mechanism on the final layer


of the network, the node feature aggregation method used is average aggregation. The final node features are expressed as follows:

h'_i = σ( (1/K) Σ_{k=1}^{K} Σ_{j∈N_i} α_ij^k W^k h_j )    (9)

The above is the implementation process of learning branch features with the noise graph attention network.
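Eqs. (8) and (9) can be sketched as a numerically stable softmax over a node's neighbors after LeakyReLU, followed by average aggregation over K heads; scalar features are used for brevity and the names are ours:

```python
import math

def softmax_attention(e_row, slope=0.01):
    """Eq. (8): alpha_ij = Softmax_j(LeakyReLU(e_ij_MX)) over neighbors j."""
    leaky = [x if x > 0 else slope * x for x in e_row]
    m = max(leaky)
    exps = [math.exp(x - m) for x in leaky]     # subtract max for stability
    total = sum(exps)
    return [x / total for x in exps]

def aggregate_heads(alpha, W, h, sigma=math.tanh):
    """Eq. (9) with average aggregation over K heads:
    h'_i = sigma((1/K) sum_k sum_j alpha[k][j] * W[k] * h[j]);
    W[k] and h[j] are scalars here for brevity."""
    K = len(alpha)
    s = sum(alpha[k][j] * W[k] * h[j]
            for k in range(K) for j in range(len(h))) / K
    return sigma(s)
```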

2.4 Multi-task Loss Module Based on Adaptive Weight Learning

Multi-task learning is mainly aimed at multi-objective optimization problems. The traditional multi-objective optimization method simply takes a weighted sum of the losses of the individual tasks. However, this method has many problems; for example, the weights of the network may be dominated by a certain task. To solve these problems, we use the method of weighting the loss function based on homoscedastic uncertainty proposed by Kendall [10]: the difference between the predicted value and the real value measures the loss of the model, and each task is weighted according to the uncertainty of this difference. The multi-task loss function we use is as follows:

L(W, σ_g^k) = Σ_k [ (1/(2(σ_g^k)^2)) L_k(W) + log(1 + (σ_g^k)^2) ]    (10)

where k indexes the tasks, σ_g^k is the weight coefficient of the k-th branch conductance G, and L_k(W) is the loss of the k-th task. The main network of the self-balancing multi-task loss module is a multi-layer FCN. Firstly, the hard parameter sharing strategy is used to take the output of the noise graph attention layer as the input of the regression layer. Then, the model separates different task layers and labels to complete the loss calculation. The implementation process of the multi-task loss module is shown in Fig. 2.
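A minimal sketch of the uncertainty-weighted loss (10), with the σ_g^k values passed in as plain numbers (in training they would be learnable parameters; the function name is ours):

```python
import math

def multitask_loss(task_losses, sigmas):
    """Eq. (10): sum_k  L_k(W) / (2 (sigma_g^k)^2) + log(1 + (sigma_g^k)^2)."""
    return sum(lk / (2.0 * s ** 2) + math.log(1.0 + s ** 2)
               for lk, s in zip(task_losses, sigmas))
```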

Fig. 2. Multi-task loss module based on task correlation weighting

3 Experimental Results and Discussion

3.1 Datasets and Simulation Environment

Our data set is collected from a real SCADA system provided by the China Electric Power Research Institute. The data consists of 17 transmission line stations, with


a total of 8640 sets of data. The recorder records the data every minute, and each set of data contains seven features. In order to prove that our proposed model can learn the branch parameters closest to the actual values in real situations, we perform the following three preprocessing steps on the original data to simulate the real situation:

(1) Feature loss (Fig. 3(a))
(2) Adding Gaussian noise [11] (Fig. 3(b))
(3) Node loss (Fig. 3(c))

Fig. 3. Simulate the real situation.

3.2 Comparison of Branch Parameter Identification of Different Models

In order to prove the effectiveness of our model, we use LSTM, FCN and GCN as benchmarks. At the same time, this paper defines the transmission line parameter identification task as a regression task. In order to prove the advantages of our model, we also use some regression algorithms such as SVR and LinearRegression as comparative experiments.

Table 1. Average relative error (%) of branch conductance G predicted by different models

Model      None   Noise   Feature loss  Node loss  All
LinerReg   0.04   18.955  28.763        17.639     /
SVR [12]   5.132  5.132   5.132         5.132      /
SGD        1.798  10.915  12.752        5.425      /
Bagging    0.756  6.55    7.56          7.57       /
GBDT       0.323  2.073   3.211         5.746      /
RidgeReg   0.041  20.536  23.258        19.636     /
LSTM [4]   1.234  1.235   1.563         1.602      2.568
FCN        1.431  1.411   1.434         1.445      2.158
GCN [13]   1.205  1.271   1.243         1.211      1.349
Ours       1.012  1.013   1.016         0.966      0.986

Table 1 shows the results of different models for the identification of grid branch parameters. The experimental results show that the linear regression method obtains very good recognition results without noise, but its robustness is greatly reduced when simulating the actual situation. LSTM and FCN methods are significantly worse after adding interference. As a graph neural network


method, although GCN can consider the topology of power grid branches, its effect is not particularly ideal. Our model adds an attention mechanism suitable for the current task, which can dynamically learn the importance between nodes, and we add a multi-task loss function based on the uncertainty of the variance in the regression module, which improves the robustness of the model.

3.3 Advantages of This Method

The noise graph attention module used in this paper is based on the graph neural network method. In order to prove the advantages of this method, Fig. 4 gives the input features and the hidden layer feature distribution after the SuperGAT_MX layer. We randomly select three of the branches as examples. It can be seen from the figure that after the input features pass through the SuperGAT_MX layer, the feature distribution is smoother and the distribution interval is smaller. Such a data distribution is conducive to branch conductance prediction, making our model perform better. Moreover, the graph neural network method can learn the lost node features through neighbor nodes. This proves that the SuperGAT_MX model proposed in this paper can be well applied to interference situations such as noise.

Fig. 4. Distribution comparison of input features and hidden layer features after the SuperGAT_MX layer in different environments.

4 Conclusions

In the actual transmission line parameter identification task, the traditional theoretical calculation method and deep learning method cannot achieve the prediction accuracy we need. This paper proposes a multi-task noise graph attention network. This method considers the branch topology of the power grid and adds a multi-head attention mechanism. In the simulated real situation, the model has


good robustness. A multi-task loss function based on adaptive weight learning is proposed. This function can automatically optimize the weights of the multi-branch losses, further reduce the influence of noise on parameter identification, realize simultaneous identification of multiple branch parameters, reduce the cost of computing resources, and improve the accuracy of parameter identification. However, the model cannot adapt to dynamic changes of the grid topology. In the future, we hope to extend our model to dynamic grid node analysis.

Acknowledgments. This work was supported by the Science and Technology Project of SGCC, named "Research and application of data-driven intraday forward-looking scheduling technology for key transmission channels" (5108-202318054A-1-1-ZN).

References

1. Khodayar, M., Wang, J.: Probabilistic time-varying parameter identification for load modeling: a deep generative approach. IEEE Trans. Ind. Inf. 17(3), 1625–1636 (2020)
2. Suonan, J., Qi, J.: An accurate fault location algorithm for transmission line based on R-L model parameter identification. Electr. Power Syst. Res. 76(1–3), 17–24 (2005)
3. Thorp, J.S., Phadke, A.G., Horowitz, S.H., et al.: Some applications of phasor measurements to adaptive protection. IEEE Trans. Power Syst. 3(2), 791–798 (1988)
4. Que, L., Yang, L., Qian, H., et al.: A robust line parameter identification method based on LSTM and modified SCADA data. In: 2020 IEEE 4th Conference on Energy Internet and Energy System Integration (EI2), pp. 2981–2986. IEEE (2020)
5. Shi, D., Tylavsky, D.J., Logic, N., et al.: Identification of short transmission-line parameters from synchrophasor measurements. In: 2008 40th North American Power Symposium, pp. 1–8. IEEE (2008)
6. Yan, Y.: A robust transmission line parameters identification based on RBF neural network and modified SCADA data. In: 2020 10th International Conference on Power and Energy Systems (ICPES), pp. 251–255. IEEE (2020)
7. Kim, D., Oh, A.: How to find your friendly neighborhood: graph attention design with self-supervision. arXiv preprint arXiv:2204.04879 (2022)
8. Lu, M., Jin, X., Wang, X., et al.: A robust identification method for transmission line parameters based on BP neural network and modified SCADA data. In: 2020 IEEE International Conference on Energy Internet (ICEI), pp. 92–97. IEEE (2020)
9. Pei, H., Wei, B., Chang, K.C.C., et al.: Geom-GCN: geometric graph convolutional networks. arXiv preprint arXiv:2002.05287 (2020)
10. Kendall, A., Gal, Y., Cipolla, R.: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7482–7491 (2018)
11. Brown, M., Biswal, M., Brahma, S., et al.: Characterizing and quantifying noise in PMU data. In: 2016 IEEE Power and Energy Society General Meeting (PESGM), pp. 1–5. IEEE (2016)
12. Li, M.W., Geng, J., Wang, S., et al.: Hybrid chaotic quantum bat algorithm with SVR in electric load forecasting. Energies 10(12), 2180 (2017)
13. Wang, Z., Xia, M., Lu, M., et al.: Parameter identification in power transmission systems based on graph convolution network. IEEE Trans. Power Delivery 37(4), 3155–3163 (2021)

Error Selection Based Training of Fully Complex-Valued Dendritic Neuron Model

Zhidong Wang, Yuelin Wang, and He Huang(B)

School of Electronics and Information Engineering, Soochow University, Suzhou 215006, People's Republic of China
[email protected]

Abstract. In this paper, an efficient training algorithm is proposed for the fully complex-valued dendritic neuron model (FCDNM), where the weights and biases are tuned by the complex-valued Levenberg-Marquardt (CLM) algorithm with error selection. Firstly, compared with complex-valued first-order algorithms, the CLM algorithm overcomes the problem of slow convergence when training FCDNM. Secondly, in practical applications, the choice of optimization algorithm is often over-emphasized, while the effect of the complex-valued Hessian matrix is usually ignored. Indeed, not all errors contribute equally to the parameter modification during the optimization process. Therefore, an error selection strategy is introduced to reduce the computational complexity during the training process. Experimental results show the effectiveness of the proposed algorithm.

Keywords: Complex-valued LM algorithm · Fully complex-valued dendritic neuron model · Error selection · Computation reduction · Accelerated method

1 Introduction

An artificial neural network (ANN) is an abstraction, simplification and simulation of a biological nervous system. ANNs operate by adjusting the connection weights between neurons during a training process and are an effective way to solve complex problems in diverse fields. The McCulloch-Pitts neuron is a commonly used model for information processing in ANNs. However, this model has its limitations, oversimplifying real neurons and ignoring the nonlinear synaptic integration process [5]. To improve the computational power of ANNs and better characterize real neurons, a tree-structured dendritic neuron model (DNM) was proposed in [12]. DNM leverages the plasticity of synapses to enable nonlinear processing within dendrites. This plasticity enables synapses to adjust their own parameters adaptively and facilitates iterative learning within DNM. Compared to the traditional McCulloch-Pitts neuron model, DNM is better equipped to handle complex nonlinear data [9]. DNMs have demonstrated promising results in various applications, including classification [3], medical diagnosis [8] and time series prediction [7,13].

Z. Wang and Y. Wang contributed equally to this work.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 683–690, 2023. https://doi.org/10.1007/978-981-99-6187-0_67

684

Z. Wang et al.

A recent study [4] has extended DNM from the real domain to the complex domain. This extension enables DNM to process complex-valued signals, which are common in natural systems. In [4], the complex-valued gradient descent (CGD) algorithm was utilized to optimize the parameters of the fully complex-valued DNM (FCDNM), which was applied to deal with complex-valued exclusive-or (CXOR), channel equalization and wind prediction. Experimental results support the effectiveness of CGD for FCDNM. However, first-order algorithms are usually slow to converge and tend to fall into local minima, which essentially stems from their inherent characteristics. In recent years, many complex-valued second-order algorithms were proposed, including the complex-valued Gauss-Newton (CGN), complex-valued Levenberg-Marquardt (CLM) [1] and complex-valued L-BFGS (CL-BFGS) [11] algorithms, etc. The LM algorithm is a damped least-squares method that minimizes an objective function in the form of a sum of squared errors. It combines the ideas of the gradient descent and Gauss-Newton methods by adding a damping parameter into their update equation. When the damping parameter is close to zero, the algorithm behaves more like the Gauss-Newton method. When the damping parameter is very large, the algorithm behaves more like the gradient descent method. LM is extended into the complex domain based on Wirtinger calculus [1], and named CLM. CLM exhibits higher accuracy and a faster convergence rate than CGD. However, CLM requires vast memory to store Jacobian matrices when dealing with massive data. In [10], an improved computation is introduced to enhance the training efficiency of LM. The proposed modification avoids the storage of the Jacobian matrix and replaces the multiplication between Jacobian matrices with vector operations. Therefore, the proposed algorithm can be used for problems with a basically unlimited number of training patterns.

On the other hand, in [6], the authors utilize the aforementioned construction method and employ an error threshold approach for RBF neural networks to decrease the number of submatrices necessary for the construction of the Hessian matrix and expedite the training process of the LM algorithm. Motivated by these discussions, this paper presents an adaptive error selection strategy for the training of FCDNM, in which a variable window is introduced to determine how many errors are employed for the calculation of the complex-valued Jacobian matrices. This eventually reduces the computational complexity involved in the complex-valued quasi-Hessian matrix and provides efficient training for FCDNM. Experimental results confirm the effectiveness of the proposed algorithm for FCDNM.

2 Preliminaries

2.1 FCDNM

FCDNM is composed of the synaptic, dendritic, membrane and soma layers. The structure of FCDNM is shown in Fig. 1. The first layer of FCDNM is the synaptic layer. The role of its nodes is to connect the ith element pi (i = 1, 2, . . . , D) of the input sample to the jth dendrite (j = 1, 2, . . . , Ms ) in the dendritic layer by the activation function f˜, where D

Training Complex-Valued Dendritic Neuron Model

685

Fig. 1. Structure of FCDNM.

denotes the dimension of input data, and Ms is the total number of dendrites in the dendritic layer. Its output Y_{i,j} is computed by Y_{i,j} = f̃(ω_{i,j} p_i − b_{i,j}), where ω_{i,j} and b_{i,j} respectively represent the complex-valued weight and bias. The second layer is the dendritic layer, which uses multiplication as the nonlinear activation function. The output of the jth dendrite is calculated by Z_j = ∏_{i=1}^{D} Y_{i,j}. The third layer is the membrane layer, summating the outputs Z_j of all dendritic branches. The formula for this layer is expressed by V = Σ_{j=1}^{Ms} Z_j. The final layer is the soma layer with one node. Its output O is described by O = f̊(V), where f̊ is the fully complex-valued activation function used in the soma layer. The objective function of FCDNM for the entire sample set of size N is defined by

f = (1/N) Σ_{n=1}^{N} f_n,    (1)

where f_n is the loss corresponding to the nth training sample with f_n = (1/2) e_n ē_n, e_n = O_n − T_n, e_n is the error between the actual output O_n and the target T_n for the nth training sample, and ē_n is the complex conjugate of e_n.
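The four layers above can be sketched in a few lines of complex-valued Python; the activation functions are passed in, and this is our illustration rather than the authors' code:

```python
def fcdnm_forward(p, W, B, f_syn, f_soma):
    """FCDNM forward pass. p: length-D complex input; W, B: D x Ms complex
    weight and bias matrices; f_syn, f_soma: fully complex-valued
    activations of the synaptic and soma layers."""
    D, Ms = len(W), len(W[0])
    Z = []
    for j in range(Ms):
        zj = 1 + 0j
        for i in range(D):
            zj *= f_syn(W[i][j] * p[i] - B[i][j])   # synaptic layer: Y_ij
        Z.append(zj)                                # dendritic layer: product over i
    V = sum(Z)                                      # membrane layer: sum over dendrites
    return f_soma(V)                                # soma layer output O
```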

2.2 The CLM Algorithm

As a matter of fact, the CLM algorithm has been widely applied to deal with unconstrained optimization problems of real-valued functions with complex variables. For ease of representation, denote θ = [w_{1,1}, ..., w_{D,Ms}, ..., b_{D,Ms}]^T = [θ_1, ..., θ_S]^T with S being the total number of adjustable parameters of FCDNM. The CLM algorithm used to update Δθ^c = [Δθ^T, Δθ^H]^T ∈ C^{2S} is defined by

Δθ^c = (Ĥ + μI)^{−1} G^H e^c,    (2)

where I is a unit matrix of appropriate dimension, e^c = [e^T, e^H]^T ∈ C^{2N} with e = [e_1, ..., e_N]^T the error vector, Ĥ = G^H G ∈ C^{2S×2S} is the complex-valued quasi-Hessian matrix, and

G = [ J_θ     J_θ*  ]
    [ J*_θ*   J*_θ  ]  ∈ C^{2N×2S}

denotes the composite Jacobian matrix, where J_θ and J_θ* represent the complex-valued Jacobian matrix and its conjugate Jacobian, respectively. Actually, J_θ = [∂e_n/∂θ_s]_{N×S} and J_θ* = [∂e_n/∂θ_s*]_{N×S}.

3 The Proposed Method

The complex-valued Jacobian matrix and conjugate Jacobian matrix for the nth sample are defined by

J_θn = [J_θn1, J_θn2, ..., J_θns, ..., J_θnS],
J_θn* = [J_θ*n1, J_θ*n2, ..., J_θ*ns, ..., J_θ*nS],    (3)

where J_θns = ∂e_n/∂θ_s and J_θ*ns = ∂e_n/∂θ_s*. Therefore, it can be deduced that

J_θ^H J_θ = Σ_{n=1}^{N} J_θn^H J_θn.    (4)

Generally, the quasi-Hessian matrix is represented by

Ĥ = G^H G = [ J_θ^H J_θ + J_θ*^T J*_θ*    J_θ^H J_θ* + J_θ*^T J*_θ  ]
            [ J_θ*^H J_θ + J_θ^T J*_θ*   J_θ*^H J_θ* + J_θ^T J*_θ ].    (5)

According to (4) and (5), Ĥ can be rewritten as

Ĥ = Σ_{n=1}^{N} [ J_A  J_B ]
                [ J_C  J_D ],    (6)

with J_A = J_θn^H J_θn + J_θn*^T J*_θn*, J_B = J_θn^H J_θn* + J_θn*^T J*_θn, J_C = J_B* and J_D = J_A*. Through (6), the complex-valued quasi-Hessian submatrix corresponding to the nth sample can be defined by

q_n^c = [ J_θn     J_θn*  ]^H  [ J_θn     J_θn*  ]
        [ J*_θn*   J*_θn  ]    [ J*_θn*   J*_θn  ].    (7)

Therefore, according to (6) and (7), the quasi-Hessian matrix Ĥ can be expressed as Ĥ = G^H G = Σ_{n=1}^{N} q_n^c.

Similarly, the gradient subvector g_n^c can be calculated by

g_n^c = [ J_θn     J_θn*  ]^H  [ e_n  ]
        [ J*_θn*   J*_θn  ]    [ e_n* ].    (8)

As a result, the gradient vector Q is obtained as Q = G^H e^c = Σ_{n=1}^{N} g_n^c.
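Eqs. (6)–(8) assemble Ĥ and Q from per-sample Jacobian rows. A pure-Python sketch under our naming, with J[n][s] = ∂e_n/∂θ_s and Jc[n][s] = ∂e_n/∂θ_s*:

```python
def quasi_hessian_and_gradient(J, Jc, e):
    """Accumulate the quasi-Hessian H (2S x 2S) and gradient Q (length 2S)
    from per-sample submatrices q_n^c and subvectors g_n^c (Eqs. 6-8).
    J, Jc: N x S nested lists of complex Jacobian entries; e: error vector."""
    N, S = len(J), len(J[0])
    H = [[0j] * (2 * S) for _ in range(2 * S)]
    Q = [0j] * (2 * S)
    for n in range(N):
        # composite 2 x 2S row block G_n = [[J_n, Jc_n], [Jc_n*, J_n*]]
        Gn = [J[n] + Jc[n],
              [z.conjugate() for z in Jc[n]] + [z.conjugate() for z in J[n]]]
        ec = [e[n], e[n].conjugate()]
        for a in range(2 * S):
            Q[a] += sum(Gn[r][a].conjugate() * ec[r] for r in range(2))  # g_n^c, Eq. (8)
            for b in range(2 * S):
                H[a][b] += sum(Gn[r][a].conjugate() * Gn[r][b]
                               for r in range(2))                        # q_n^c, Eq. (7)
    return H, Q
```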

It is easy to see that q_n^c and g_n^c are related to J_θn and J_θn*, and that J_θn and J_θn* are related to e_n. Therefore, a scheme with an error threshold can be introduced to filter out the smaller error terms e_n. Specifically, set an error threshold e_0; when the modulus |e_n| of the complex-valued output error is less than e_0, it will not be considered when constructing the corresponding q_n^c and g_n^c. This measure effectively ensures that quasi-Hessian submatrices corresponding to large errors are prioritized for the construction of the quasi-Hessian matrix. It reduces the building time of Ĥ and speeds up the overall computing process.

At the same time, to efficiently reduce the number of q_n^c and g_n^c, an adaptive window with variable size is introduced to select the appropriate e_n. The size of the current window length N_now is determined by the change of E(t), where t is the current number of iterations. The specific way of updating N_now is defined by

N_now = ⌊N_now / η⌋,   if t > 1 and E_σ < σ,
N_now = ⌊N_now · η⌋,   if t > 1 and E_σ > σ,    (9)
N_now = N_now,         if t > 1 and E_σ = σ,

where ⌊·⌋ means the rounding-down operator and E_σ = |(E(t) − E(t−1)) / E(t)|. During the scaling of the window size, the window may exceed its bounds. To avoid this, an upper bound N_max = N should be set; once the upper bound is exceeded, N_now is immediately set to N_max. On the contrary, when N_now is too small, the quasi-Hessian matrix may involve too few q_n^c and g_n^c, which is counterproductive to the training. Therefore, it is necessary to define a lower bound N_min = ζ × N_max for N_now with ζ ∈ (0, 1); that is, when N_now < N_min, N_now is reset to N_min.

After the window size is determined, the modulus of each element in e is taken to obtain ē = {|e_1|, ..., |e_N|}. The subscripts corresponding to the largest N_now errors in ē are stored as e_sort. At this moment, the quasi-Hessian matrix Ĥ and the gradient vector Q are initialized. The specific calculation process is defined by

Ĥ = Ĥ + q_n^c, if |e_n| > e_0 and n ∈ e_sort; otherwise Ĥ is unchanged,
Q = Q + g_n^c, if |e_n| > e_0 and n ∈ e_sort; otherwise Q is unchanged.    (10)

By combining (10) with (2), the parameters of FCDNM can be updated by the proposed CLM algorithm with the selection of errors.
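The adaptive window (9) and the error selection in (10) can be sketched as follows; the default parameter values mirror those used in Sect. 4, and the function names are ours:

```python
import math

def update_window(n_now, e_now, e_prev, sigma=0.01, eta=1.01, n_max=1000, zeta=0.75):
    """Adaptive window update (Eq. 9) with the bounds N_max and N_min.
    n_now: current window length; e_now, e_prev: objective values E(t), E(t-1)."""
    e_sig = abs((e_now - e_prev) / e_now)
    if e_sig < sigma:
        n_now = math.floor(n_now / eta)    # error changes little: shrink window
    elif e_sig > sigma:
        n_now = math.floor(n_now * eta)    # error changes a lot: enlarge window
    n_min = math.floor(zeta * n_max)
    return min(max(n_now, n_min), n_max)   # clamp to [N_min, N_max]

def select_errors(errors, n_now, e0):
    """Indices of the n_now largest-modulus errors that also exceed the
    threshold e0; only these contribute q_n^c and g_n^c in Eq. (10)."""
    order = sorted(range(len(errors)), key=lambda n: -abs(errors[n]))
    return [n for n in order[:n_now] if abs(errors[n]) > e0]
```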

4 Experiments

Two experiments are conducted for nonlinear channel equalization and stock price forecast. The simulations validate the effectiveness and advantages of the proposed error selection based complex-valued Levenberg-Marquardt (ESCLM) algorithm for the training of FCDNM compared to CLM [1], CL-BFGS [11] and CGD [4].


Table 1. The training error of the nonlinear phase equalization problem achieved by different algorithms (values in parentheses: number of quasi-Hessian submatrices used by ESCLM).

Algorithm  10dB            15dB          20dB            25dB            30dB            35dB
CGD        0.1301          0.1109        0.0947          0.0933          0.0891          0.0934
CL-BFGS    0.1017          0.0723        0.0625          0.0601          0.0576          0.0622
CLM        0.0932          0.0618        0.0511          0.0413          0.0399          0.0402
ESCLM      0.0905 (2004.7) 0.0602 (1987) 0.0453 (2177.2) 0.0388 (2466.1) 0.0381 (2264.9) 0.0391 (2433)

4.1 Nonlinear Phase Equalization

In [2], the nonminimum phase channel equalization with nonlinear distortion is introduced for 4QAM signaling. The channel output is given by

z_n = o_n + 0.1 o_n^2 + 0.05 o_n^3 + v_n,  v_n ∼ H(0, 0.01),
o_n = w_1 s_n + w_2 s_{n−1} + w_3 s_{n−2},    (11)

where w_1 = 0.34 − 0.27i, w_2 = 0.87 + 0.43i, w_3 = 0.34 − 0.21i and H(0, 0.01) is white Gaussian noise with mean 0 and variance 0.01. s_n is the input signal of the nonlinear channel at the current moment; s_{n−1} and s_{n−2} respectively represent the signals delayed by one and two units.

For FCDNM, the input signal is [z_n, z_{n−1}, z_{n−2}]^T and the output signal is s_{n−1}. Moreover, the experiment iterates 100 times with 1000 training samples and 200 test samples generated for different SNRs. In the experiment, a constant step size is adopted for CGD and is set to 0.02. The memory size of CL-BFGS is 30. For ESCLM, the parameters e_0, σ, η and ζ are respectively taken as 0.13, 0.01, 1.01 and 0.75. When the SNR is 25dB, the behaviors of the objective function for the nonlinear channel equalization are presented in Fig. 2(a). It is clearly seen that ESCLM achieves a better result with faster convergence than CGD, CL-BFGS and CLM. Figure 2(b) displays the number of matrices required by the error threshold method with and without the adaptive variable window mechanism. ESCLM incorporates the error threshold method and the adaptive window length mechanism (9), as mentioned earlier. Figure 2(b) illustrates that ESCLM requires fewer quasi-Hessian submatrices than the error threshold method alone during the optimization process. This suggests that the adaptive variable window mechanism can significantly reduce the computational workload. Table 1 displays the training errors of different algorithms under different SNR conditions. In all these cases, ESCLM achieves the best test results. Moreover, ESCLM can effectively reduce the number of quasi-Hessian submatrices when compared with the initial 3000 quasi-Hessian submatrices, as shown in Table 1.
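For reference, the channel of Eq. (11) can be simulated directly; the sketch below is our illustration, and the even split of the complex noise variance between real and imaginary parts is our assumption:

```python
import random

def simulate_channel(s):
    """Generate nonlinear channel outputs z_n of Eq. (11) for a complex
    4QAM symbol stream s. Noise v_n has total variance 0.01, split evenly
    between real and imaginary parts (an assumption of this sketch)."""
    w1, w2, w3 = 0.34 - 0.27j, 0.87 + 0.43j, 0.34 - 0.21j
    sd = (0.01 / 2) ** 0.5
    z = []
    for n in range(2, len(s)):
        o = w1 * s[n] + w2 * s[n - 1] + w3 * s[n - 2]      # linear channel o_n
        v = complex(random.gauss(0, sd), random.gauss(0, sd))
        z.append(o + 0.1 * o ** 2 + 0.05 * o ** 3 + v)     # nonlinear distortion + noise
    return z
```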

4.2 Stock Price Forecast

The performance of different algorithms is further evaluated by stock price forecast of Shanghai Stock Exchange Composite Index (SSE) and New York Stock


Fig. 2. Simulation results of the nonlinear phase equalization problem: (a) The convergence curves of different algorithms, (b) The number of quasi-Hessian submatrices required by different algorithms.

Exchange Composite Index (NYA). The prediction model utilizes the closing index data from five consecutive days to forecast the closing index for the subsequent day. 1200 sets of continuous data for SSE and 1250 sets for NYA can be obtained for training and testing. Among them, the first 1000 sets of data are used as the training set, and the remaining ones are used as the test set. The four algorithms mentioned in Sect. 4.1 are evaluated through 100 iterations on NYA and SSE. In the experiment, a constant complex-valued step size is adopted for CGD and is set as 0.1. The memory size of CL-BFGS is 30. For ESCLM, the parameters of e0 , σ, η and ζ are respectively taken as 0.0001, 0.01, 1.01 and 0.75. Table 2 displays the training and testing errors achieved by the four algorithms on the NYA and SSE datasets. The results in Table 2 demonstrate that the proposed ESCLM algorithm outperforms the other three algorithms in terms of the training and testing errors for stock price forecast. Table 2. Performance of different algorithms on the stock price forecast problem. Dataset Algorithm Training loss

Testing loss

NYA

CGD CL-BFGS CLM ESCLM

0.0022 1.81e-04 1.46e-04 1.33e-04(713)

0.0087 8.75e-04 4.66e-04 3.92e-04

SSE

CGD CL-BFGS CLM ESCLM

0.0122 4.63e-04 4.22e-04 3.91e-04(632)

0.0143 4.77e-04 4.41e-04 4.35e-04

5 Conclusion

This paper has presented the ESCLM algorithm for the efficient training of FCDNM by modifying the computation of the complex-valued quasi-Hessian matrix. The error threshold and adaptive variable window mechanism have been incorporated into ESCLM to reduce the construction cost of the quasi-Hessian matrix. Actually, larger errors are prioritized by ESCLM and utilized to calculate the quasi-Hessian submatrices and gradient subvectors, which contributes to a lower training loss and full training of FCDNM. By comparing the performance of the ESCLM algorithm with some previous ones, it can be concluded that ESCLM provides a promising way for the training of FCDNM.

Acknowledgements. This work was supported by the Postgraduate Research and Practice Innovation Program of Jiangsu Province, China under no. SJCX22-1497.

References

1. Amin, M.F., Amin, M.I., Al-Nuaimi, A.Y.H., Murase, K.: Wirtinger calculus based gradient descent and Levenberg-Marquardt learning algorithms in complex-valued neural networks. In: ICONIP, pp. 550–559 (2011)
2. Cha, I., Kassam, S.: Channel equalization using adaptive complex radial basis function networks. IEEE J. Select. Areas Commun. 13(1), 122–131 (1995)
3. Gao, S., Zhou, M., Wang, Y., Cheng, J., Yachi, H., Wang, J.: Dendritic neuron model with effective learning algorithms for classification, approximation, and prediction. IEEE Trans. Neural Networks Learn. Syst. 30(2), 601–614 (2019)
4. Gao, S., Zhou, M., Wang, Z., Sugiyama, D., Cheng, J., Wang, J., Todo, Y.: Fully complex-valued dendritic neuron model. IEEE Trans. Neural Networks Learn. Syst. 34, 2105–2118 (2021)
5. Li, X., Tang, J., Zhang, Q., Gao, B., Yang, J.J., Song, S., Wu, W., Zhang, W., Yao, P., Deng, N.: Power-efficient neural network with artificial dendrites. Nat. Nanotechnol. 15(9), 776–782 (2020)
6. Miaoli, M., Xiaolong, W., Honggui, H.: Accelerated Levenberg-Marquardt algorithm for radial basis function neural network. In: CAC, pp. 6804–6809 (2020)
7. Song, Z., Tang, Y., Ji, J., Todo, Y.: Evaluating a dendritic neuron model for wind speed forecasting. Knowl. Based Syst. 201, 106052 (2020)
8. Tang, C., Ji, J., Tang, Y., Gao, S., Tang, Z., Todo, Y.: A novel machine learning technique for computer-aided diagnosis. Eng. Appl. Artif. Intell. 92, 103627 (2020)
9. Teng, F., Todo, Y.: Dendritic neuron model and its capability of approximation. In: ICSAI, pp. 542–546 (2019)
10. Wilamowski, B.M., Yu, H.: Improved computation for Levenberg-Marquardt training. IEEE Trans. Neural Networks 21(6), 930–937 (2010)
11. Wu, R., Huang, H., Qian, X., Huang, T.: A L-BFGS based learning algorithm for complex-valued feedforward neural networks. Neural Process. Lett. 47, 1271–1284 (2018)
12. Zheng, T., Tamura, H., Kuratu, M., Ishizuka, O., Tanno, K.: A model of the neuron based on dendrite mechanisms. Electron. Commun. Jpn. 84, 11–24 (2001)
13. Zhou, T., Gao, S., Wang, J., Chu, C., Todo, Y., Tang, Z.: Financial time series prediction using a dendritic neuron model. Knowledge-Based Syst. 105, 214–224 (2016)

Intelligent Identification Method of Flow State in Nuclear Main Pump Based on Deep Learning Method

Ying-Yuan Liu(B), Di Liu, Zhenjun Zhang, and Kang An

Shanghai Normal University, 100 Haisi Road, Shanghai 200234, China
[email protected]

Abstract. Exploring intelligent identification methods for detecting the working states of fluid machinery is essential for early fault diagnosis and for ensuring the long-term safe operation of the system. Taking pressure pulsation signals under uniform and non-uniform flow of nuclear main pumps as the object, the influences of data processing methods and deep learning models on the recognition reliability of fluid states are evaluated. A new method of random sampling points is proposed to expand training data sets, and three kinds of convolutional neural network models (a 7-layer CNN, ResNet-50 and DenseNet-121) are covered for comparison. Results show that the proposed random sampling method can enhance recognition accuracy and convergence speed. The DenseNet-121 model has better recognition capacity than the other models included for detecting working states of the pump, in terms of accuracy, generalization ability and convergence speed. Further validation based on the cavitation diagnosis of a centrifugal pump demonstrates that the data processing methodology and the deep learning method provided in this work have good generalization ability.

Keywords: Fluid machinery · Pressure pulsation · Fault diagnosis · Deep learning

1 Introduction

The reactor coolant circulating main pump is a key component of the nuclear power equipment system, and its reliability and stability are directly related to the safe and stable operation of nuclear power plants [1]. The pressure pulsation of the nuclear main pump caused by rotor-stator interaction is an important source of plant instabilities such as vibration and noise [2]. Because the pressure pulsation signals of the pump can be easily collected by sensors in experimental tests, they are generally adopted to judge the operating state of the pump. With the development of artificial intelligence and the proposal of the concept of "intelligent nuclear power", exploring artificial intelligence technology to process pressure pulsation signals and realize intelligent identification of flow states is of great significance for monitoring the operational stability and reliability of the nuclear main pump in real time.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 691–699, 2023. https://doi.org/10.1007/978-981-99-6187-0_68


Intelligent identification technology has made great progress in recent years, but its application to fluid machinery is still at an exploratory stage. For fault diagnosis or state recognition of fluid machinery, the support vector machine (SVM) [3, 4], the multilayer neural network [5] and artificial neural networks (ANN) [6] are three typical methods. However, these are shallow network models, and they must be combined with signal processing techniques based on expert experience to achieve good diagnostic results. With the introduction of deep learning, the convolutional neural network (CNN), with its strong feature extraction and classification capabilities, has begun to be explored for fault diagnosis and state recognition of fluid machinery [7, 8]. However, an ordinary CNN may suffer from information loss and vanishing or exploding gradients, which make deep networks difficult to train and feature extraction inefficient. To address these issues, the Residual Neural Network (ResNet) and the Dense Convolutional Network (DenseNet) were proposed. ResNet adds a residual structure that establishes skip connections between front and back layers, allowing poorly learned transformations to be bypassed and preserving the accuracy of deep learning. DenseNet keeps the same basic idea as ResNet, with the difference that it builds dense connections between all front layers and rear layers. Another feature of DenseNet is feature reuse through concatenation of features along the channel dimension, which may achieve better performance than ResNet with fewer parameters and lower computational cost. These two networks provide good solutions for deep learning and are especially helpful for image recognition in complex situations.
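The structural difference between the two networks (additive skip connections versus channel-wise concatenation) can be contrasted in a few lines. The sketch below is purely illustrative: toy fully connected layers in NumPy with arbitrary widths and a hypothetical growth rate of 4, not the paper's actual networks.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w):
    """A toy fully connected layer with ReLU activation."""
    return np.maximum(w @ x, 0.0)

def residual_block(x, w):
    """ResNet-style block: the skip connection ADDS the input to the
    layer output, so information can bypass the transformation."""
    return x + layer(x, w)

def dense_block(x, weights):
    """DenseNet-style block: each layer receives the CONCATENATION of
    all preceding feature vectors, realizing feature reuse."""
    features = [x]
    for w in weights:
        inp = np.concatenate(features)      # all earlier outputs
        features.append(layer(inp, w))
    return np.concatenate(features)

x = rng.standard_normal(8)
w_res = rng.standard_normal((8, 8))
out_res = residual_block(x, w_res)          # same width as the input

# Two dense layers, each producing 4 new features ("growth rate" 4):
w1 = rng.standard_normal((4, 8))            # sees the 8 input features
w2 = rng.standard_normal((4, 12))           # sees 8 + 4 features
out_dense = dense_block(x, [w1, w2])        # 8 + 4 + 4 = 16 features
```

Note how the residual output keeps the input width, while the dense output grows by the growth rate at each layer, which is why DenseNet can stay accurate with fewer parameters per layer.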
In recent years, these two typical deep learning networks have been employed in mechanical fault diagnosis [9–11], but relevant research is still very limited, and expanding the application of these newly developed algorithms to fluid machinery deserves attention. Moreover, training a deep CNN requires a large number of samples, whereas the amount of data is often insufficient in practical projects, and the traditional periodic data slicing method leads to a small-sample classification problem. It is therefore also necessary to explore data processing methods that help CNN models identify and diagnose small-sample data. Taking the pressure pulsation signals of the nuclear main pump as the research object, this work explores the application of deep CNN models to the identification of flow states in fluid machinery. The sample data are pressure pulsation signals of the pump under two different inflow states (i.e. uniform flow and non-uniform flow). Although a non-uniform inflow state is not itself regarded as a fault, the data processing method and the deep learning-based identification method adopted in this work can provide guidance for fault diagnosis of fluid machinery.

2 Study Object and Experiment Process

The experiment was conducted on a model test rig of the nuclear main pump [12], consisting of a water tank, a pump model, pipelines and measuring instruments (see Fig. 1). Two inflow states (uniform inflow and non-uniform inflow) are focused on in this


work. To create the non-uniform inflow state, flow disturbances were added in front of the pump model. High-frequency pressure sensors were employed to measure the pressure pulsation on the pump casing. The tested pump operated at a rated rotational speed of 1,800 rev/min, and the flow rate varied from 56 m3/h to 336 m3/h. The pressure signals of each working condition were obtained by the data collector and saved in a txt file. Each group of measured pressure signals contained 40,960 points, with a sampling frequency of 8,192 Hz and a sampling time of 5 s.

Fig. 1. Schematic diagram of test rig

3 Signal Processing

Identifying the flow state in the pump with a deep learning model requires a large amount of sample data, so each group of measured pressure signals needs to be processed. In this work, two data processing methods are adopted: the traditional data slicing method, and the random sampling method proposed in this work.

In the traditional slicing method, the number of sampling points in each slice should be a multiple of the number of sampling points in a single rotation cycle of the main pump. Here, the slice length is chosen as 550 points, which covers two rotation cycles of the nuclear main pump and yields about 74 data sets for model training.

To expand the sample data, a new random sampling method is proposed in this work; its process is shown in Fig. 2. A large number of data points are first selected at random and then reordered chronologically to form a new signal. Repeating this process yields multiple such signals, which are then sliced to obtain sample data sets, generating far more samples than before. Since too few sampling points cannot reflect the characteristics of the original data while too many introduce excessive repetition, three sampling sizes (1024, 4096 and 8192 points) are adopted and compared; a 1024-, 4096- or 8192-point method means that that many points are selected from the pressure signal to form one data set. In this way, a large number of data sets are obtained for training the neural network classifiers.
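The two data-processing routes described above (traditional 550-point slicing versus the proposed random sampling) can be sketched as follows. A synthetic signal stands in for the measured pressure record; the function and variable names are ours, not the authors'.

```python
import numpy as np

rng = np.random.default_rng(42)
signal = rng.standard_normal(40_960)   # stand-in for one 5 s record at 8,192 Hz

# Traditional slicing: consecutive 550-point windows (two rotation cycles).
slice_len = 550
slices = [signal[i:i + slice_len]
          for i in range(0, len(signal) - slice_len + 1, slice_len)]
# len(slices) == 74, matching the ~74 training sets quoted in the text.

def random_sampling(signal, n_points, n_signals, rng):
    """Proposed augmentation: randomly draw n_points indices without
    replacement, reorder them chronologically, and use the values at
    those indices as a new, shorter signal.  Repeating this n_signals
    times expands the data set."""
    new_signals = []
    for _ in range(n_signals):
        idx = np.sort(rng.choice(len(signal), size=n_points, replace=False))
        new_signals.append(signal[idx])
    return new_signals

# e.g. 200 new 4096-point signals per working condition, as in Fig. 2
augmented = random_sampling(signal, n_points=4096, n_signals=200, rng=rng)
```

Because each draw keeps the chronological order, every augmented signal preserves the temporal trend of the original record while differing in which samples it retains.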


Taking 4096 points as an example, 6000 samples are produced by this method. The number of new signals obtained through random sampling (e.g. 200 in Fig. 2) can be adjusted as needed for each working condition.

Fig. 2. Data selection process of the random sampling method


4 Intelligent Recognition Algorithm

In this work, three CNN models were employed to identify the flow state of the nuclear main pump: a 7-layer CNN, the 50-layer ResNet (ResNet-50) and the 121-layer DenseNet (DenseNet-121), shown in Fig. 3. For all three networks, the activation function was the Rectified Linear Unit (ReLU) [13] and the loss function was the softmax cross-entropy [14]. The input images had a resolution of 100 × 100 with 3 colour channels. The whole data set was divided into 15 batches for training, and the learning rate was set to 0.001. Model accuracy and loss were recorded once per epoch [15]. 5-fold cross-validation was adopted to avoid over-fitting and to find the hyperparameter values giving optimal generalization performance; the ratio of the training set to the test set is 4:1. Model accuracy and model loss were used to evaluate the results.
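The 5-fold cross-validation with its 4:1 train/test ratio can be sketched with a generic NumPy index-splitting helper (our illustration, not the authors' code; the per-epoch training itself is omitted):

```python
import numpy as np

rng = np.random.default_rng(0)

def five_fold_indices(n, rng):
    """Shuffle the sample indices once, then yield 5 (train, test)
    splits; each fold holds out ~1/5 of the data for testing, giving
    the 4:1 training/test ratio used in the paper."""
    idx = rng.permutation(n)
    folds = np.array_split(idx, 5)
    for k in range(5):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(5) if j != k])
        yield train, test

n_samples = 6000                        # e.g. the 4096-point data set
splits = list(five_fold_indices(n_samples, rng))
```

Each of the five splits would then be used to train one model instance, and the hyperparameters giving the best average test accuracy across folds are kept.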

Fig. 3. The structure of three CNN networks

5 Results and Discussion

5.1 Recognition Results of Various CNN Models

The recognition accuracy and model loss of the three CNN models are shown in Fig. 4; here, the data sets obtained with the traditional slicing method were used. For all three models, both the accuracies of the training and the test


data gradually increase with the number of epochs and converge to stable values. The accuracy of the 7-layer CNN model converges slowly, reaching a stable value only around epoch 2000, and its model loss has still not converged at that point. In contrast, the accuracies and losses of the ResNet-50 and DenseNet-121 models reach stable levels quickly, so their performance is better than that of the 7-layer CNN in both accuracy and convergence speed. In addition, Fig. 4 also shows the influence of grayscale processing on the recognition results: for all three models, grayscale processing improves the training speed to a certain extent but has little effect on the recognition accuracy. Considering that the DenseNet-121 model occupies

Fig. 4. Recognition accuracy and model loss of the three networks (panels: a) accuracy and b) loss of the 7-layer CNN; c) accuracy and d) loss of ResNet-50; e) accuracy and f) loss of DenseNet-121; each panel plots training and validation curves over epochs, with and without grayscale processing)


smaller space (81 MB) and has fewer model parameters than the ResNet-50 network (270 MB for the same case), the DenseNet-121 network is selected for flow-state recognition in the following studies.

5.2 A New Method for Data Pre-processing: Random Sampling Method

The model accuracies and losses for different numbers of sampling points are presented in Fig. 5 for the 7-layer CNN and DenseNet-121 models. The model accuracy and training speed of both models improve as the number of sampling points increases. All the model losses

Fig. 5. Model accuracies and losses of different sampling points for the random sampling method (panels: a), c), e) 7-layer CNN and b), d), f) DenseNet-121 at 1024, 4096 and 8192 points, respectively; each panel plots training and validation accuracy and loss over epochs)

converge to 0, and the convergence speed increases with the number of sampling points; in particular, convergence with 4096 points is faster than with 1024 points. In the following cases, 4096 sampling points are therefore selected for analysis. Table 1 compares the model accuracy and convergence speed of this method with the traditional slicing method (in the time domain and the time-frequency domain). Compared with the traditional slicing method, the recognition accuracy and convergence speed of the random sampling method are greatly improved for the 7-layer CNN model. For the deep network (DenseNet-121), the convergence speed of the random sampling method is also faster than that of the traditional slicing method.

Table 1. Model accuracy and speed for different data processing methods.

| Method | 7-layer CNN accuracy | 7-layer CNN training epochs | DenseNet-121 accuracy | DenseNet-121 training epochs |
|---|---|---|---|---|
| Traditional slicing method | 93% | > 2000 | 100% | 35 |
| Time-frequency diagram of sliced data | 99% | 1500 | 100% | 27 |
| Random sampling method | 100% | 80 | 100% | 6 |

6 Conclusions

This work realizes intelligent identification of the flow state of the nuclear main pump with deep neural networks. The effects of the data processing method and the network depth on the recognition accuracy and speed are analyzed. The main conclusions are as follows:

1. The ResNet-50 and DenseNet-121 models outperform the 7-layer CNN model in accuracy, generalization ability and identification speed.
2. For the time-domain images, grayscale processing improves the training speed of the models to a certain extent but has little effect on the recognition accuracy.
3. The random sampling method proposed in this work is shown to perform better in recognition accuracy and convergence speed than traditional slicing.

Acknowledgements. This work is funded by the National Natural Science Foundation of China (Grant No. 51806145).


References

1. Ni, D., Zhang, N., Gao, B., et al.: Dynamic measurements on unsteady pressure pulsations and flow distributions in a nuclear reactor coolant pump. Energy 198, 117305 (2020)
2. Ni, D., Yang, M., Gao, B., et al.: Flow unsteadiness and pressure pulsations in a nuclear reactor coolant pump. Strojniski Vestnik/J. Mech. Eng. 62(4), 231–242 (2016)
3. Panda, A.K., Rapur, J.S., Tiwari, R.: Prediction of flow blockages and impending cavitation in centrifugal pumps using support vector machine (SVM) algorithms based on vibration measurements. Measurement 130, 44–56 (2018)
4. Shervani-Tabar, M.T., Ettefagh, M.M., Lotfan, S., et al.: Cavitation intensity monitoring in an axial flow pump based on vibration signals using multi-class support vector machine. Proc. Inst. Mech. Eng. C J. Mech. Eng. Sci. 232(17), 3013–3026 (2018)
5. ALTobi, M.A.S., Bevan, G., Ramachandran, K.P.: Fault diagnosis of a centrifugal pump using MLP-GABP and SVM with CWT. Eng. Sci. Technol. 22(3), 854–861 (2019)
6. Farokhzad, S., Ahmadi, H., Jaefari, A., et al.: Artificial neural network based classification of faults in centrifugal water pump. J. Vibro Eng. 14(4), 897 (2012)
7. Kumar, A., Gandhi, C.P., Zhou, Y., et al.: Improved deep convolution neural network (CNN) for the identification of defects in the centrifugal pump using acoustic images. Appl. Acoust. 167, 107399 (2020)
8. Zhang, Y., Azman, A.N., Xu, K.W., et al.: Two-phase flow regime identification based on the liquid-phase velocity information and machine learning. Exp. Fluids 61(10), 1–16 (2020)
9. Shi, C., Ren, Y., Tang, H., et al.: A fault diagnosis method for an electro-hydraulic directional valve based on intrinsic mode functions and weighted densely connected convolutional networks. Meas. Sci. Technol. 32(8), 084015 (2021)
10. Wen, L., Li, X., Gao, L.: A transfer convolutional neural network for fault diagnosis based on ResNet-50. Neural Comput. Appl. 32(10), 6111–6124 (2020)
11. Lin, S.L.: Intelligent fault diagnosis and forecast of time-varying bearing based on deep learning VMD-DenseNet. Sensors 21(22), 7467 (2021)
12. Qiao, Y.F.: Influence of Different Inflow Conditions on Pressure Pulsation and Vibration Characteristics. Shanghai Jiaotong University (2018)
13. Wilamowski, B.M., Yu, H.: Neural network learning without backpropagation. IEEE Trans. Neural Netw. 21(11), 1793–1803 (2010)
14. He, K.: Deep Residual Networks: Deep Learning Gets Way Deeper. ICML, New York (2016)
15. Vedaldi, A., Jia, Y., Shelhamer, E., et al.: Convolutional Architecture for Fast Feature Embedding. Cornell University (2014)

Design of Intelligent Window Dwelling System Based on Multi Sensor Fusion

Simin Ding1, Gang Wang2,3(B), and Lihui Sun1

1 Jilin Institute of Chemical Technology, Jilin, JL, China
2 Jilin Communications Polytechnic, Changchun, JL, China
[email protected]
3 Baicheng Normal University, Baicheng, JL, China

Abstract. To address the issue that traditional windows cannot respond to environmental changes in a timely manner, an intelligent window dwelling system based on multi-sensor data fusion was designed. Firstly, the overall design, software system and hardware system are introduced; secondly, multi-sensor data fusion is achieved through a Bayesian estimation algorithm; finally, Keil is used to build a testing environment and conduct reliability experiments on the window dwelling system. The experimental results indicate that the system responds to the environment in a timely and accurate manner, with high reliability.

Keywords: Smart window · Central processing unit · Multi-sensor data fusion · Bayesian estimation

1 Introduction

In recent years, with rising living standards, the demand for smart homes has increased year by year. As an important component of the smart home, windows play a very important role in optimizing the indoor environment, yet the development of smart windows has been relatively slow and related research and products remain scarce. Yang designed an intelligent window system based on Internet of Things technology, which automatically evaluates the temperature and humidity, infrared intensity and harmful gas concentration of the current environment to open and close the window automatically [1]. Shan designed an intelligent window control system based on the CAN bus, which handles communication in smart homes accurately, efficiently and at low cost [2]. Qu designed an intelligent window control system based on an STC-series microcontroller that realizes intelligent opening and closing of the window [3]. Windows, however, have rarely been studied as an independent system, and related products still cannot meet the demand. To make the window dwelling system more intelligent, an intelligent window dwelling system based on multi-sensor data fusion is designed [4]. The system senses the indoor and outdoor environment through raindrop sensors, temperature and humidity sensors, and smoke concentration sensors, and uses multi-sensor data fusion to control the opening and closing of the windows, improving the intelligence of the window dwelling system.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 700–707, 2023. https://doi.org/10.1007/978-981-99-6187-0_69


2 Overall System Design

The overall structure of the intelligent window dwelling system proposed in this paper is shown in Fig. 1. The system is mainly composed of a data acquisition system, a central data processing node and a motor drive system. The data acquisition system includes a power module, a raindrop detection module, a temperature and humidity detection module and a smoke detection module, used to detect the indoor and outdoor environmental state. At the central data processing node, the central processing unit processes the environmental information and performs multi-sensor data fusion. The motor drive system receives control instructions from the central processing unit and executes them to open and close the intelligent window.

Fig. 1. Overall structure of the system

3 System Hardware Design

The hardware design of this system mainly involves the selection of the sensor modules, the central processing unit module and the power drive module. The sensor modules serve as the data source of the system and adopt a multi-point data collection scheme in which each quantity is measured by several sensors; this reduces the impact of accidental readings and improves the accuracy of the system response. Suitable raindrop, temperature and humidity, and smoke concentration sensors are required. The raindrop module includes six YD-A1 raindrop sensors; the YD-A1 operates at 3.3 V-5 V and outputs a digital signal from which the central processor determines the rainfall situation. The temperature and humidity module includes six DHT11 sensors; the DHT11 operates at 3.3 V-5.5 V and monitors humidity from 20% to 90% RH and temperature from 0 to 50 °C. The smoke module includes six MQ-2 smoke sensors; the MQ-2 operates at 3.3 V-5 V and outputs an analog signal from which the central processor obtains the smoke concentration. The central processing unit plays a vital role as the data processing node of the system. Taking into account processor performance, power consumption, processing speed and price, this system adopts a 32-bit STM32F103RET6 microprocessor based on the ARM Cortex-M3 core. The working voltage of this processor is 2–3.6


V, and it offers fast data processing, high integration, high performance, low cost, low power consumption, low voltage and convenient development [5]. The power drive module uses a stepper motor; the central processing unit controls the rotation angle of the motor through the number of pulses, which meets the requirements of this system. In addition, the motor does not accumulate errors and has good positional accuracy and repeatability of motion.

4 System Algorithm Design

The system algorithm is implemented with the Keil software and mainly comprises the sensor-module data acquisition process and the central processor's multi-sensor data fusion process.

4.1 Sensor Module Data Collection

The YD-A1 raindrop sensor detects outdoor rainfall: when there is no rain it outputs a high level and the indicator light is off; when the rainfall exceeds the set threshold it outputs a low level and the indicator light turns on. The DHT11 sensor detects the temperature and humidity of the indoor environment; the central processor acquires the collected data and completes the analog-to-digital conversion to obtain specific temperature and humidity values. The MQ-2 smoke sensor detects the smoke concentration of the indoor environment, which the central processor computes from the converted data via formula (1), where C is the smoke concentration, R is the resistance of the sensor at this smoke concentration, and m and n are constants characterizing the sensitivity of the sensor to smoke:

$$\log C = (\log R - n)/m \qquad (1)$$
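Formula (1) can be inverted to compute the concentration from a resistance reading. The sketch below assumes base-10 logarithms and uses hypothetical calibration constants m and n (the paper does not give their values):

```python
import math

def smoke_concentration(R, m, n):
    """Solve log C = (log R - n)/m for C, given the sensor resistance R
    and the calibration constants m (slope) and n (offset)."""
    return 10 ** ((math.log10(R) - n) / m)

# Round trip with hypothetical constants: a resistance generated from a
# known concentration should be mapped back to that concentration.
m, n = -0.45, 1.25          # illustrative values only, not from the paper
C_true = 800.0              # ppm
R = 10 ** (m * math.log10(C_true) + n)
C_back = smoke_concentration(R, m, n)
```

In practice m and n would be obtained by fitting the sensor's calibration curve from the MQ-2 datasheet.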

4.2 Multi-Sensor Fusion Based on Bayesian Estimation

Multi-sensor data fusion is a processing technology for data and information from several sensors. It obtains information about the monitored environment through data association, correlation and combination, and evaluates and responds to the current situation in a timely manner. According to the fusion level, fusion strategies can be divided into data-level, feature-level and decision-level fusion. Decision-level fusion is a high-level fusion that requires little communication volume and low transmission bandwidth and has strong fault tolerance [6, 7]. The framework of the multi-sensor data fusion algorithm in this paper is shown in Fig. 2. The central processor first estimates the rainfall, temperature, humidity and smoke concentration from the different sensors by a Bayesian algorithm [8], then completes decision-level data fusion through threshold comparison and logic operations to draw a conclusion, and sends instructions to the motor drive system.


Fig. 2. Multi-sensor data fusion algorithm framework

Construct the Bayesian Estimation Algorithm. In a multi-sensor system, the sensor data interact with each other; Bayesian estimation expresses this influence as the conditional probability of an associated node given the prior value of another node [9]. Assume the sample space D contains n events satisfying

$$X_1 \cup X_2 \cup \dots \cup X_n = D, \quad X_i \cap X_j = \varnothing \ (i \neq j), \quad P(X_i) > 0 \qquad (2)$$

The steps for estimating the monitoring data of the different sensors by Bayesian estimation are as follows:

1) Determine the prior distribution P(θ) of the parameter θ.

2) Compute the joint probability of the samples:

$$P(D \mid \theta) = \prod_{i=1}^{n} P(X_i \mid \theta) \qquad (3)$$

3) Calculate the posterior distribution of θ:

$$P(\theta \mid D) = \frac{P(D \mid \theta)\,P(\theta)}{\sum_{j} P(D \mid \theta_j)\,P(\theta_j)} \qquad (4)$$

4) Obtain the Bayesian estimate:

$$\hat{\theta} = \sum_{j} \theta_j \, P(\theta_j \mid D) \qquad (5)$$

Decision-Level Data Fusion. Decision-level fusion consists of threshold comparison and logical operations. The central processing unit first compares each environmental quantity with its predetermined threshold: a value greater than or equal to the threshold is represented by logic 1, otherwise by logic 0. It then performs logical operations according to the priority of the environmental factors, where smoke has the highest priority, rain the next, and temperature and humidity the lowest. When the result of the logical operation is Y = 1, the drive system executes the window-opening instruction; otherwise it executes the window-closing instruction. The truth table of the logical operation is given in Table 1.

Table 1. Calculation table of the logical algorithm

| Situation | Y | Situation | Y | Situation | Y | Situation | Y |
|---|---|---|---|---|---|---|---|
| 0/0/0/0 | 0 | 0/0/1/0 | 0 | 0/0/0/1 | 1 | 1/0/0/1 | 1 |
| 0/1/0/0 | 1 | 0/1/1/0 | 0 | 0/0/1/1 | 1 | 1/0/1/1 | 1 |
| 1/0/0/0 | 1 | 1/0/1/0 | 0 | 0/1/0/1 | 1 | 1/1/0/1 | 1 |
| 1/1/0/0 | 1 | 1/1/1/0 | 0 | 0/1/1/1 | 1 | 1/1/1/1 | 1 |
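The paper does not state the bit order of the situations in Table 1, but the table is reproduced exactly by the stated priority rule if each situation is read as temperature/humidity/rain/smoke flags: smoke always opens the window, rain closes it unless smoke is present, and temperature or humidity opens it otherwise. This reading is our inference, sketched below:

```python
def fuse(temp, hum, rain, smoke):
    """Decision-level fusion: Y = 1 opens the window, Y = 0 closes it.
    Priorities as in the text: smoke > rain > temperature/humidity."""
    return 1 if (smoke or (not rain and (temp or hum))) else 0

# All 16 rows of Table 1, read as (temp, hum, rain, smoke) -> Y
table1 = {
    (0, 0, 0, 0): 0, (0, 0, 1, 0): 0, (0, 0, 0, 1): 1, (1, 0, 0, 1): 1,
    (0, 1, 0, 0): 1, (0, 1, 1, 0): 0, (0, 0, 1, 1): 1, (1, 0, 1, 1): 1,
    (1, 0, 0, 0): 1, (1, 0, 1, 0): 0, (0, 1, 0, 1): 1, (1, 1, 0, 1): 1,
    (1, 1, 0, 0): 1, (1, 1, 1, 0): 0, (0, 1, 1, 1): 1, (1, 1, 1, 1): 1,
}
all_rows_match = all(fuse(*bits) == y for bits, y in table1.items())
```

Under this bit order the rule matches every row of the table, which supports the inferred ordering; Table 5 below appears to list the same flags in a different order (rain first).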

5 System Testing Experiment

In the experimental part of this paper, Keil is used to build a test platform and the central processor STM32F103RET6 is programmed; the tests cover data acquisition accuracy verification and system function verification.

5.1 Verification of Detection Data Accuracy

This section selects environmental temperature, humidity and smoke concentration as the factors for the detection accuracy of the intelligent window system. Each sensor collects 5 sets of data to verify whether the accuracy error is within the allowable range, which requires ε ≤ 3%. Tables 2, 3 and 4 list the data collected at 5 different time points by the 6 temperature, humidity and smoke concentration sensors, numbered S1-S6. The accuracy of the collected data against the true value is verified through formula (6), where X is the final collected value and X̂ is the true value. After verification, the error of each group of collected data meets the requirement, so the collected data are valid:

$$\varepsilon = \frac{|X - \hat{X}|}{\hat{X}} \times 100\% \qquad (6)$$
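Formula (6) and the ε ≤ 3% requirement translate directly to code. The check below uses the first row of Table 2 (fused value 23.08 °C against a true value of 23.2 °C), which gives roughly the 0.51% reported there, up to rounding:

```python
def error_rate(collected, true_value):
    """Formula (6): relative error of the fused value, in percent."""
    return abs(collected - true_value) / true_value * 100.0

eps = error_rate(23.08, 23.2)   # first row of Table 2
within_spec = eps <= 3.0        # the paper's accuracy requirement
```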

5.2 System Function Verification

The system function verification mainly checks whether the intelligent window dwelling system responds to the indoor and outdoor environment in real time. Due


Table 2. Accuracy verification of temperature sensors (°C)

| Times | S1 | S2 | S3 | S4 | S5 | S6 | True value | Collect value | Error rate |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 23.4 | 22.6 | 23.1 | 23.2 | 22.8 | 23.2 | 23.2 | 23.08 | 0.51% |
| 2 | 23.8 | 24.6 | 24.8 | 24.1 | 24.2 | 24.5 | 24.2 | 24.35 | 0.62% |
| 3 | 25.5 | 26.6 | 26.2 | 25.4 | 26.3 | 26.2 | 26.2 | 26.05 | 0.57% |
| 4 | 26.7 | 27.2 | 25.8 | 26.7 | 27.4 | 27.0 | 26.7 | 26.90 | 0.75% |
| 5 | 28.9 | 26.7 | 26.9 | 27.4 | 28.3 | 28.2 | 28.0 | 27.70 | 1.07% |

Table 3. Accuracy verification of humidity sensors (%)

| Times | S1 | S2 | S3 | S4 | S5 | S6 | True value | Collect value | Error rate |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 67 | 72 | 75 | 64 | 66 | 62 | 68 | 67.25 | 1.10% |
| 2 | 75 | 77 | 69 | 80 | 78 | 81 | 76 | 77.50 | 2.00% |
| 3 | 86 | 74 | 80 | 75 | 79 | 83 | 78 | 79.25 | 1.60% |
| 4 | 57 | 62 | 69 | 59 | 56 | 64 | 62 | 60.50 | 2.42% |
| 5 | 46 | 41 | 48 | 47 | 44 | 49 | 47 | 46.25 | 1.60% |

Table 4. Accuracy verification of smoke sensors (ppm)

| Times | S1 | S2 | S3 | S4 | S5 | S6 | True value | Collect value | Error rate |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 562 | 573 | 582 | 567 | 557 | 576 | 574 | 569.50 | 0.78% |
| 2 | 636 | 645 | 638 | 646 | 656 | 635 | 638 | 641.25 | 0.51% |
| 3 | 784 | 764 | 793 | 772 | 779 | 788 | 775 | 780.75 | 0.61% |
| 4 | 856 | 834 | 846 | 848 | 858 | 839 | 842 | 847.25 | 0.62% |
| 5 | 883 | 875 | 898 | 869 | 886 | 878 | 876 | 880.50 | 0.51% |

to the fact that this system is mainly designed for hot and humid summers, the temperature threshold is set to 25 °C, the humidity threshold to 75% and the smoke concentration threshold to 800 ppm [10]. The central processing unit performs logical operations based on these thresholds to control the system. As shown in Table 5, five sets of collected data were selected for verification, and the central processor outputs different control instructions for different data. The results show that the accuracy of the window system's response to these 5 sets of data is 100%.

Table 5. System function verification

| Times | Rain sensor | Temperature sensor | Humidity sensor | Smoke sensor | Threshold comparison logic | Data fusion logic | Switch window |
|---|---|---|---|---|---|---|---|
| 1 | Yes | 23.08 | 67.25 | 569.50 | 1/0/0/0 | 0 | Close |
| 2 | No | 24.35 | 77.50 | 641.25 | 0/0/1/0 | 1 | Open |
| 3 | No | 26.05 | 79.25 | 780.75 | 0/1/1/0 | 1 | Open |
| 4 | No | 26.90 | 60.50 | 847.25 | 0/1/0/1 | 1 | Open |
| 5 | Yes | 27.70 | 46.25 | 880.50 | 1/1/0/1 | 1 | Open |

6 Conclusion

This paper has designed an intelligent window dwelling system based on multi-sensor data fusion, covering the overall system design, the hardware design and the algorithm design. The multi-sensor data come from raindrop sensors, temperature and humidity sensors, and smoke concentration sensors; the central processor performs multi-sensor data fusion and outputs control commands to open and close the window. The experimental results show that the sensor data acquisition error of the system is small and that its response to the environment is fast and accurate.

Acknowledgment. The author thanks Wang Gang and others for their warm help. This work was supported by the Jilin Provincial Science and Technology Development Plan (Natural Science Foundation) "Research on key technologies of underwater intelligent detection robot based on multi-source information fusion", project number 20220101138JC, and the Natural Science Foundation of Jilin Province (general project of free exploration) "Research on cooperative strategy of multi-underwater AUV clusters based on bio-inspiration", project number YDZJ202301ZYTS420.

References

1. Yang, J., Wang, K., Aixuan, H., Luo, Q.: Design and implementation of intelligent window system based on the Internet of Things. Internet of Things 10(04), 76–79 (2020)
2. Lijun, S., Yonghua, K.: Design of intelligent window control system based on STM32 single chip computer. J. Donghua Univ. (Natl. Sci. Ed.) 47(06), 84–90 (2021)
3. Long, J., Meng, Y., Tang, J.: Development and research of intelligent window sensing alarm. Comput. Knowl. Technol. 15(09), 249–250+253 (2019)
4. Han, J., Tian, J., Qiao, J.: Design of indoor environmental monitoring system for large venues based on multi-sensor information fusion. China Equip. Eng. 23, 186–188 (2022)
5. Seo, H., et al.: Compact implementation of ARIA on 16-bit MSP430 and 32-bit ARM Cortex-M3 microcontrollers. Electronics 10(8), 908 (2021)
6. Zou, B., Li, W., Hou, X., Tang, L., Yuan, Q.: A framework for trajectory prediction of preceding target vehicles in urban scenario using multi-sensor fusion. Sensors 22(13), 4808 (2022)

Design of Intelligent Window Dwelling System


7. Zhang, S.: Research on indoor mobile robot positioning and navigation based on multi-sensor fusion. University of Chinese Academy of Sciences (Changchun Institute of Optics, Precision Mechanics and Physics, Chinese Academy of Sciences) (2021)
8. Liu, Y., Wang, Y.: Retraction note: monitoring of mountain ecological environment based on Bayesian estimation and testing of motor memory function in mice. Arab. J. Geosci. 14(22), 1754 (2021)
9. Wang, C., Qi, S., Zhang, H.: Bayesian estimation and application of logistic regression model parameters. Statist. Decision 36(22), 14–18 (2020)
10. Wu, Z., Liu, X., Huang, J., Han, Y., Xin, Y.: Design and implementation of smart windows based on multi-sensor fusion. Electrotechnical 12, 19–21 (2019)

Time-Varying Function-Based Anti-Disturbance Method for Permanent-Magnet Synchronous Motors

Tianjian Jiang1, Yingcheng Wu2, and Yang Yang1(B)

1 Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210023, China
[email protected]
2 Nanjing Medical University, Nanjing, Jiangsu 211166, China

Abstract. This paper proposes a dual-vector method based on finite-control-set model predictive control (FCS-MPC) and uses an improved active disturbance rejection control (ADRC) method to obtain good tolerance of parameter mismatch and fast speed regulation under torque changes on a permanent-magnet synchronous motor (PMSM). It is well known that the ADRC controller produces huge output peaks when the target speed changes, which is a potential hazard for the circuit and reduces the stability of the control system. For this reason, we propose an improved ADRC method that changes some of the large gain constants in the extended state observer (ESO) module and the nonlinear state error feedback (NLSEF) module into time-varying functions, thereby suppressing the peaks. Simulation examples demonstrate that the improved ADRC method suppresses the output peaks and has better robustness.

Keywords: Permanent-magnet Synchronous Motor · Active Disturbance Rejection Control · Field-oriented Control · Model Predictive Control · Modern Control

1 Introduction

With its high stability and low cost, the permanent-magnet synchronous motor (PMSM) has wide applications in industrial scenarios [1]. With the introduction of carbon-neutrality targets and green development strategies, energy-efficient motors are receiving more attention. The PMSM is a typical nonlinear multi-variable system, so linear control degrades its performance, and the traditional PID algorithm does not cope well with nonlinear systems. To improve the control performance of PMSM, nonlinear control strategies such as adaptive control, neural network control and passive control have been applied to PMSM [2]. The predictive current control method (PCC) is a highly effective

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 708–719, 2023. https://doi.org/10.1007/978-981-99-6187-0_70


control method that has been widely discussed recently. When using PCC, the PMSM control system adopts dual-loop control, and the current-loop controllers now mainly use model predictive control. Thanks to its feedback correction and rolling optimization, predictive control offers better robustness and higher adaptability to changes than other conventional control methods, and its structure is relatively simple. Facing the problem that the large leakage voltage generated by the single-vector control method shortens motor life, multi-vector control not only reduces the large leakage voltage but also increases the control accuracy [3]. The main multi-vector control methods include dual-vector control [4] and triple-vector control [5]; because of their larger computational cost, sector detection has been proposed to lock the region of candidate vectors and reduce the computing burden of the CPU. However, multi-vector control methods still require accurate motor parameters. For this reason, we replace the PID controller of the speed loop with an active disturbance rejection control (ADRC) module. The ADRC's extended state observer (ESO) observes both the state variables and the disturbances of the system, which significantly reduces the impact of changes in system parameters and load disturbances compared with the traditional PID method [6]. However, the notorious initial peak outputs of ADRC can reach huge values when the target speed changes, leading to short-term instability of the system and increasing the potential for motor damage. For this reason, we modify the original ADRC method by replacing the large gain constants of the extended state observer (ESO) module and the nonlinear state error feedback (NLSEF) module with time-varying functions.
Simulation examples show that the improved ADRC method can effectively suppress the initial huge peaks and makes the system more stable.

2 PMSM Model

The voltages of the PMSM on the d-axis and q-axis can be described as:

\[
\begin{cases}
u_d = R_s i_d + L_d \dfrac{di_d}{dt} - \omega_e L_q i_q \\[4pt]
u_q = R_s i_q + L_q \dfrac{di_q}{dt} + \omega_e L_d i_d + \omega_e \psi_m
\end{cases} \tag{1}
\]

where \(u_d\) and \(u_q\) are the stator voltages in the d-q frame; \(\omega_e\) is the electrical angular speed; \(i_d\) and \(i_q\) are the stator currents on the d-axis and q-axis; \(L_d\) and \(L_q\) are the inductances in the d-q frame; \(R_s\) is the stator resistance of the PMSM; and \(\psi_m\) is the permanent-magnet flux linkage. From (1), an equation predicting the rate of change of the currents can be obtained:

\[
\begin{cases}
\dfrac{di_d}{dt} = -\dfrac{R_s}{L_d} i_d + \dfrac{L_q}{L_d}\,\omega_e i_q + \dfrac{1}{L_d} u_d \\[4pt]
\dfrac{di_q}{dt} = -\dfrac{R_s}{L_q} i_q - \dfrac{L_d}{L_q}\,\omega_e i_d + \dfrac{1}{L_q} u_q - \dfrac{\psi_m}{L_q}\,\omega_e
\end{cases} \tag{2}
\]


T. Jiang et al.

Using the Euler discretization method on (2), a new equation can be obtained:

\[
\begin{cases}
i_d(k+1) = \left(1 - \dfrac{R_s T_s}{L_d}\right) i_d(k) + \dfrac{L_q T_s}{L_d}\,\omega_e i_q(k) + \dfrac{T_s}{L_d}\, u_d \\[4pt]
i_q(k+1) = \left(1 - \dfrac{R_s T_s}{L_q}\right) i_q(k) - \dfrac{L_d T_s}{L_q}\,\omega_e i_d(k) + \dfrac{T_s}{L_q}\, u_q - \dfrac{\psi_m T_s}{L_q}\,\omega_e
\end{cases} \tag{3}
\]

where \(i_d(k)\) and \(i_q(k)\) are the d-q frame currents in the current sampling cycle; \(T_s\) is the sampling period; and \(i_d(k+1)\), \(i_q(k+1)\) are the predicted currents in the d-q frame for the next cycle. \(u_d(k)\) and \(u_q(k)\), the d-axis and q-axis voltages for the current cycle, are calculated by (4) and (5):

\[
\begin{bmatrix} u_d(k) \\ u_q(k) \end{bmatrix}
= \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} u_\alpha(k) \\ u_\beta(k) \end{bmatrix} \tag{4}
\]

\[
\begin{bmatrix} u_\alpha(k) \\ u_\beta(k) \end{bmatrix}
= \frac{2}{3}\, U_{dc}
\begin{bmatrix} 1 & -\frac{1}{2} & -\frac{1}{2} \\[2pt] 0 & \frac{\sqrt{3}}{2} & -\frac{\sqrt{3}}{2} \end{bmatrix}
\begin{bmatrix} S_a(k) \\ S_b(k) \\ S_c(k) \end{bmatrix} \tag{5}
\]

where \(\theta\) is the rotation angle; \(U_{dc}\) is the dc-link voltage; and \(S_a(k)\), \(S_b(k)\), \(S_c(k)\) are the switching states of the three bridge arms of the inverter.
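A minimal Python sketch may make the (3)–(5) prediction pipeline concrete: switching states are mapped to αβ voltages, rotated into the d-q frame, and then used for a one-step Euler prediction. The numeric parameter values in the usage dictionary are illustrative placeholders, not values from the paper:

```python
import math

def clarke_from_switches(Sa, Sb, Sc, Udc):
    """Eq. (5): alpha-beta voltages from the inverter switch states."""
    u_alpha = (2.0 / 3.0) * Udc * (Sa - 0.5 * Sb - 0.5 * Sc)
    u_beta = (2.0 / 3.0) * Udc * (math.sqrt(3) / 2.0) * (Sb - Sc)
    return u_alpha, u_beta

def park(u_alpha, u_beta, theta):
    """Eq. (4): rotate alpha-beta voltages into the d-q frame."""
    ud = math.cos(theta) * u_alpha + math.sin(theta) * u_beta
    uq = -math.sin(theta) * u_alpha + math.cos(theta) * u_beta
    return ud, uq

def predict_currents(id_k, iq_k, ud, uq, we, p):
    """Eq. (3): one-step Euler prediction of the d-q currents."""
    Rs, Ld, Lq, psi, Ts = p["Rs"], p["Ld"], p["Lq"], p["psi"], p["Ts"]
    id_next = (1 - Rs * Ts / Ld) * id_k + (Lq * Ts / Ld) * we * iq_k + (Ts / Ld) * ud
    iq_next = (1 - Rs * Ts / Lq) * iq_k - (Ld * Ts / Lq) * we * id_k \
              + (Ts / Lq) * uq - (psi * Ts / Lq) * we
    return id_next, iq_next
```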

3 FCS-MPC Method

Model predictive control (MPC) is now widely used for the control of synchronous motors because of its high robustness and bandwidth. However, it still has problems such as high leakage current and low tolerance of parameter mismatch.

3.1 Single-vector FCS-PCC Control

Since the sampling period is small, we can assume that the motor runs at the same speed in adjacent cycles:

\[
\omega_e(k) = \omega_e(k+1) \tag{6}
\]

Rearranging (3), we obtain:

\[
\begin{cases}
\Delta i_d(k+1) = \dfrac{T_s}{L_d}\left(u_d(k) - R_s i_d(k) + \omega_e(k) L_q i_q(k)\right) \\[4pt]
\Delta i_q(k+1) = \dfrac{T_s}{L_q}\left(u_q(k) - R_s i_q(k) - \omega_e(k) L_d i_d(k) - \omega_e(k)\,\psi_m\right)
\end{cases} \tag{7}
\]

So the currents in the next cycle are:

\[
\begin{cases}
i_d(k+1) = i_d(k) + \Delta i_d(k+1) \\
i_q(k+1) = i_q(k) + \Delta i_q(k+1)
\end{cases} \tag{8}
\]

Improved ADRC Controller

711

Table 1. Three-phase voltages of different states

Vector | u1 | u2 | u3 | u4 | u5 | u6 | u7 | u8
ua | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0
ub | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0
uc | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0
By the superposition principle, the three-phase voltages of the different states are superimposed to obtain the eight vectors shown in Table 1. The predicted values of \(i_d\) and \(i_q\) after applying each voltage vector are calculated, and an evaluation function is established based on \(i_d\), \(i_q\), \(i_d^{ref}\) and \(i_q^{ref}\). The evaluation values are computed, and the voltage vector with the best (smallest) evaluation value is selected for deployment on the inverter [7]. The evaluation function for each group of switching states can be designed as:

\[
fitness = \left(i_d(k+1) - i_d^{ref}\right)^2 + \left(i_q(k+1) - i_q^{ref}\right)^2 \tag{9}
\]
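The selection step can be sketched as a small Python routine. In a real controller the predictions would come from (3); here they are passed in precomputed, so the sketch only shows the cost evaluation and arg-min selection of (9):

```python
def fitness(id_pred, iq_pred, id_ref, iq_ref):
    """Cost function (9): squared distance of the predicted currents
    from the reference currents."""
    return (id_pred - id_ref) ** 2 + (iq_pred - iq_ref) ** 2

def select_vector(predictions, id_ref, iq_ref):
    """predictions: {vector_index: (id_pred, iq_pred)} for the candidate
    switching states. Returns the index of the vector minimising (9)."""
    return min(predictions,
               key=lambda v: fitness(*predictions[v], id_ref, iq_ref))
```

For example, with references (1.0, 2.0), a candidate predicting (0.9, 1.8) beats candidates predicting (0.0, 2.0) or (2.0, 0.0).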

3.2 Dual-vector FCS-PCC Control

Because of the zero vectors, a large common-mode voltage appears, which causes leakage currents. The electromagnetic interference generated by the leakage current affects the efficiency and control stability of the motor. Therefore, we should minimise the use of zero vectors in the selection process. If we simply delete the zero vectors from the traditional single-vector control method, the set of candidate vectors shrinks and control accuracy is reduced. For this reason, we synthesise virtual voltage vectors by means of duty-cycle control [8]. Typically, we combine the states of the three switches in two adjacent cycles to obtain 18 different voltage vectors (from v1 to v18), as shown in Table 2.

Table 2. Synthesis of the virtual voltage vectors

Vector | v1 | v2 | v3 | v4 | v5 | v6
Vector superposition | 2·u1 | 2·u2 | 2·u3 | 2·u4 | 2·u5 | 2·u6

Vector | v7 | v8 | v9 | v10 | v11 | v12
Vector superposition | u1+u2 | u2+u3 | u3+u4 | u4+u5 | u5+u6 | u6+u1

Vector | v13 | v14 | v15 | v16 | v17 | v18
Vector superposition | u1+u3 | u2+u4 | u3+u5 | u4+u6 | u5+u1 | u6+u2
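The 18 virtual vectors of Table 2 can be enumerated programmatically. A short sketch, working only with the indices of the basic vectors u1–u6, where (i, i) stands for 2·u_i and (i, j) for u_i + u_j:

```python
def virtual_vectors():
    """Enumerate the 18 two-cycle virtual vectors of Table 2 by index."""
    doubled  = [(i, i) for i in range(1, 7)]                # v1..v6:  2*u_i
    adjacent = [(i, i % 6 + 1) for i in range(1, 7)]        # v7..v12: u_i + u_{i+1}
    skip_one = [(i, (i + 1) % 6 + 1) for i in range(1, 7)]  # v13..v18: u_i + u_{i+2}
    return doubled + adjacent + skip_one
```

Note that the zero vectors u7/u8 never appear, which is exactly how the dual-vector scheme avoids the common-mode voltage problem while keeping 18 candidates.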

712

T. Jiang et al.

During the first cycle, the microcontroller calculates the voltage vectors to be deployed, deploys the first-cycle vector to the three-phase inverter, and saves the second-cycle vector in memory. In the second cycle, the stored vector is retrieved and deployed to the three-phase inverter.

3.3 Sector Determination

Considering the relatively weak computing power of the microcontroller, we can reduce the number of traversals by first determining the optimal sector, so that the required voltage vector can be calculated more quickly [6]. Discretizing (2) with a higher-order expansion, we obtain:

\[
\begin{cases}
i_d(k+1) = i_d(k) + \dfrac{T_s}{6}\,\dfrac{u_d - R_s i_d(k) + \omega_e L_q i_q(k)}{L_d}\left(6 - \dfrac{3 R_s T_s}{L_d} + \dfrac{R_s^2 T_s^2}{L_d^2} - \dfrac{R_s^3 T_s^3}{4 L_d^3}\right) \\[6pt]
i_q(k+1) = i_q(k) + \dfrac{T_s}{6}\,\dfrac{u_q - R_s i_q(k) - \omega_e L_d i_d(k) - \omega_e \psi_m}{L_q}\left(6 - \dfrac{3 R_s T_s}{L_q} + \dfrac{R_s^2 T_s^2}{L_q^2} - \dfrac{R_s^3 T_s^3}{4 L_q^3}\right)
\end{cases} \tag{10}
\]

We first substitute the six vectors u1 to u6 into (10), and then substitute the resulting \(i_d(k+1)\) and \(i_q(k+1)\) into (9) to obtain the optimal sector.

4 Model of ADRC

PID control does not handle nonlinear systems well. For this reason, Prof. Han Jingqing proposed the ADRC method to address the shortcomings of PID [9]. ADRC contains three components: the tracking differentiator (TD) module, the extended state observer (ESO) module, and the nonlinear state error feedback (NLSEF) module. The tracking differentiator (TD) is an algorithm that enables smooth tracking and differentiation of noisy signals. TD mainly allows the system to reach the expected values quickly and smoothly, extracts the input from noisy signals, and reduces the high-frequency noise of the system. A tracking differentiator of order n can be expressed as

\[
\begin{cases}
\dot e_1 = e_2 \\
\dot e_2 = e_3 \\
\cdots \\
\dot e_n = R^n f\!\left(e_1 - v(t), \dfrac{e_2}{R}, \ldots, \dfrac{e_n}{R^{n-1}}\right)
\end{cases}
\]

where \(e_1\) is the tracking signal, \(e_2, e_3, \ldots, e_n\) are the successive derivatives of \(e_1\), \(R\) is the range of the input signal, and \(f\) is the corresponding nonlinear function. The extended state observer (ESO), a core mathematical tool of active disturbance rejection control, is used to estimate the state variables of a system as well as internal and external


disturbances, and to compensate for them in the feedback, thus improving the system's immunity to disturbances. An n-th order extended state observer can be expressed as

\[
\begin{cases}
\dot{\hat e}_1 = \hat e_2 - g_1(\hat e_1 - x(t)) \\
\dot{\hat e}_2 = \hat e_3 - g_2(\hat e_1 - x(t)) \\
\cdots \\
\dot{\hat e}_n = \hat e_{n+1} - g_n(\hat e_1 - x(t)) \\
\dot{\hat e}_{n+1} = -g_{n+1}(\hat e_1 - x(t))
\end{cases}
\]

where \(\hat e_1\) is the observed value of the tracking signal and \(\hat e_2, \hat e_3, \ldots, \hat e_n\) are the successive derivatives of \(\hat e_1\). The overall plant handled by ADRC can be expressed as [2]

\[
\begin{cases}
x^{(n)} = f\!\left(x, \dot x, \ddot x, \ldots, x^{(n-1)}, t\right) + w(t) + bu \\
y = x(t)
\end{cases}
\]

It has been proved that a first-order ADRC controller is sufficient to control the PMSM well [10]. The TD module can be described as

\[
\begin{cases}
e_0 = v^* - v_1 \\
\dot v_1 = -r_0\, fal(e_0, \alpha_0, \delta_0)
\end{cases} \tag{11}
\]

The ESO can be described as

\[
\begin{cases}
e_1 = z_1 - y \\
\dot z_1 = z_2 - \beta_1\, fal(e_1, \alpha_1, \delta_1) + b_0 u(t) \\
\dot z_2 = -\beta_2\, fal(e_1, \alpha_1, \delta_1)
\end{cases} \tag{12}
\]

The NLSEF can be described as

\[
\begin{cases}
e_2 = v_1 - z_1 \\
u_0(t) = k\, fal(e_2, \alpha_2, \delta_2) \\
u = u_0(t) - z_2 / b_0
\end{cases} \tag{13}
\]

The nonlinear function \(fal\) can be expressed as

\[
fal(e, \alpha, \delta) =
\begin{cases}
|e|^{\alpha}\, \mathrm{sgn}(e), & |e| > \delta \\[2pt]
\dfrac{e}{\delta^{1-\alpha}}, & |e| \le \delta
\end{cases} \tag{14}
\]
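The fal function (14) is straightforward to implement; a short Python sketch, linear inside the dead zone |e| ≤ δ and fractional-power outside it, which is what lets ADRC use a large gain near the origin without chattering:

```python
def fal(e, alpha, delta):
    """Nonlinear gain function (14): fractional power for |e| > delta,
    linear (slope 1/delta**(1-alpha)) for |e| <= delta."""
    if abs(e) > delta:
        return abs(e) ** alpha * (1.0 if e > 0 else -1.0)
    return e / delta ** (1.0 - alpha)
```

The two branches meet continuously at |e| = δ, since δ^α = δ / δ^{1−α}.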

5 Improved ADRC Method for PMSM Control

Typically, an ESO module with a high gain constant is used to recover both the state of the system and its extended state. Such an ESO module has a number of problems. The most serious is the large output spike generated by the ADRC module when the desired speed is changed. Such problems hinder the wide use of the ADRC method in the control of PMSMs. In industrial production, we obviously do not want a system that produces a large output for a short period under certain circumstances. Sudden high currents can damage circuit components and reduce their life. Considering the wide range of motors available today, the expected motor speed is in most cases constantly being modified, so the system is constantly subjected to sudden large current changes. This not only leads to large errors in motor control, but also significantly affects the stability of the motor system and the motor life, and increases commercial costs. To address this problem with ADRC, Zhi-Liang Zhao proposed a time-based ESO gain [11]:

\[
\begin{cases}
\dot{\hat x}_1(t) = \hat x_2(t) + \dfrac{1}{Q^{n-1}(t)}\, h_1\!\left(Q^n(t)\,(y(t) - \hat x_1(t))\right) + g_1(u, \hat x_1(t)) \\[4pt]
\dot{\hat x}_2(t) = \hat x_3(t) + \dfrac{1}{Q^{n-2}(t)}\, h_2\!\left(Q^n(t)\,(y(t) - \hat x_1(t))\right) + g_2(u, \hat x_1(t), \hat x_2(t)) \\
\cdots \\
\dot{\hat x}_n(t) = \hat x_{n+1}(t) + h_n\!\left(Q^n(t)\,(y(t) - \hat x_1(t))\right) + g_n(u, \hat x_1(t), \ldots, \hat x_n(t)) \\[2pt]
\dot{\hat x}_{n+1}(t) = \delta(t)\, h_{n+1}\!\left(Q^n(t)\,(y(t) - \hat x_1(t))\right)
\end{cases}
\]

where \(\hat x_1(t)\) is the tracking signal; \(\hat x_2(t), \hat x_3(t), \ldots, \hat x_n(t)\) are the successive derivatives of \(\hat x_1(t)\); \(Q(t)\) is the time-varying gain; \(h_i\) is a design function; and \(g_i\) is a known nonlinear function. In this system, the time-varying gain \(\delta: \mathbb{R} \to \mathbb{R}\) grows gradually from a small value to a maximum value to reduce the peak:

\[
\delta(t) =
\begin{cases}
e^{at}, & 0 \le t \le \frac{1}{a}\ln r \\[2pt]
r, & t \ge \frac{1}{a}\ln r
\end{cases} \tag{15}
\]

We use this as a basis to adjust the previously proposed model. The ESO can now be described as:

\[
\begin{cases}
e_1 = z_1 - y \\
\dot z_1 = z_2 - \beta_1\, fal(e_1, \alpha_1, \delta_1) + b_0 u(t) \\
\dot z_2 = -\delta_1(t)\, fal(e_1, \alpha_1, \delta_1)
\end{cases} \tag{16}
\]


The NLSEF can now be described as:

\[
\begin{cases}
e_2 = v_1 - z_1 \\
u_0(t) = \delta_2(t)\, fal(e_2, \alpha_2, \delta_2) \\
u = u_0(t) - z_2 / b_0
\end{cases} \tag{17}
\]

We replace the parameter k in (13) and β2 in (12) of the original model with two time-varying functions δ1(t) and δ2(t). These functions have the same form as δ(t) in (15), but their parameters a and r may differ and are tuned to optimise the performance of the PMSM.
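The time-varying gain (15) can be sketched directly. It starts at 1 at t = 0, rises as exp(a·t), and saturates at r once t reaches ln(r)/a, so the observer gain only reaches its full value after the initial transient:

```python
import math

def tv_gain(t, a, r):
    """Time-varying gain delta(t) of (15): exp(a*t) until it reaches r,
    then held constant at r."""
    t_sat = math.log(r) / a
    return math.exp(a * t) if t < t_sat else r
```

With a = 100 and r = 1000 (the values used later for δ2), the gain saturates after roughly 0.069 s.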

6 Simulation Examples

A number of simulation examples are used to demonstrate the good performance and robustness of the proposed model.

6.1 The Model Built in Simulink

The simulation model is built in Matlab/Simulink according to the block diagram in Fig. 1, where the ADRC module is used for closed-loop speed control and the MPC module for closed-loop current control. The built-in PMSM model from the Simulink library is used; the relevant parameters are shown in Table 3. Several tasks were designed to judge the performance of the motor, including changes in the target speed and changes in the load torque. We initially set the target speed to 10.47 rad/s (100 r/min); at 1 s the target speed changes to 15.707 rad/s (150 r/min). The torque is initially set to a value close to 0 N·m; at 0.5 s the torque changes to 3 N·m, and at 1.3 s it drops to 0.5 N·m.

Table 3. Main parameters of PMSM

Parameter | Value
Stator phase resistance Rs (Ω) | 0.0485
Armature inductance (H) | 0.000395
Flux linkage (V·s) | 0.1194
Inertia J (kg·m²) | 0.0027
Viscous damping F (N·m·s) | 0.0624924
Pole pairs | 1
Tf (N·m) | 0.0


Fig. 1. Overall block diagram of the control system.

6.2 Comparing the ADRC Method with the PID Method

To make the experiment fair, we need to set proper parameters for the PID. First we optimise the parameters of the PID module for a torque close to 0, so that it achieves relatively good performance. Then we compare the two control algorithms. The simulation results are shown in Fig. 2. From Fig. 2, we can see that the PID has relatively good performance on the target-speed variation task; compared with the ADRC controller, it runs significantly more smoothly with almost the same response speed. To obtain a faster response, the ADRC controller causes significant overshoot and speed jitter. For example, the ADRC controller causes the motor to vibrate with a greater amplitude at 0.5 s. This is due to the calculation of the optimal vector every two cycles in the dual-vector control and the aggressive control strategy adopted by the ADRC controller to recover the speed faster. However, during torque changes, the ADRC controller recovers to the target speed very quickly, whereas the PID controller takes significantly longer.

Fig. 2. Comparing the original method with the PID method.

Fig. 3. Comparing the improved ADRC method with the original ADRC method.

As the large peaks generated by the ADRC controller when the target speed changes greatly

destabilise the system, we replace some of the gain constants in the ADRC with time-based gain functions to suppress the peak values.

6.3 Comparing the Improved ADRC Method with the Original ADRC Method

For experimental fairness, the maximum values of the time-varying gain functions used in the improved ADRC module are the same as the original ADRC gain constants. Compared with (15), the only difference is the choice of the time constant a; in the simulation examples we set a to 100. The time-varying function in (16) is

\[
\delta_1(t) =
\begin{cases}
e^{100t}, & t \le \frac{1}{100}\ln(100000) \\[2pt]
100000, & t > \frac{1}{100}\ln(100000)
\end{cases} \tag{18}
\]

and the time-varying function in (17) is

\[
\delta_2(t) =
\begin{cases}
e^{100t}, & t \le \frac{1}{100}\ln(1000) \\[2pt]
1000, & t > \frac{1}{100}\ln(1000)
\end{cases} \tag{19}
\]

Simulations were carried out in Simulink to evaluate the performance of the two methods. As can be seen in Fig. 3, the improved ADRC method has a slightly longer response time when the target speed changes, but the speed changes are much smoother and there is no longer any overshoot. In industry, this can significantly improve motor life and reduce potential safety problems. In Fig. 4, the improved ADRC method effectively suppresses the large peak values that occur when the target speed changes. In Fig. 5, the output value of the improved ADRC module stays below 30, a huge improvement compared to the peak of almost 9000 generated by the original one.

Fig. 4. Outputs of the improved ADRC method and the original ADRC method.

Fig. 5. Outputs of the improved ADRC method.

7 Conclusion

This paper presents an improved ADRC control method for PMSM. By converting some constants in ADRC into time-varying functions, the overshoot problem is effectively solved and the motor speed changes much more smoothly. Simulation results show that, compared with the traditional PI method, the ADRC method gives the PMSM speed control system a wide speed regulation range, fast response, strong disturbance rejection and good robustness. However, the large gain constants needed to handle torque variation can lead to system overshoot, and the huge ADRC output when the target speed changes leads to instability. The comparison between the original and improved ADRC methods shows that the improved method successfully suppresses the huge output of the ADRC controller when the target speed changes, and is also more stable. Moreover, the improved ADRC controller can be applied to other control systems to improve their robustness.

Acknowledgments. This work was supported by the National Natural Science Foundation of China (Grant No. 61873130) and the "Chunhui Program" Collaborative Scientific Research Project (Grant No. 202202004).

References

1. Chen, Z., Xiao, C., Zhang, X., Liu, C., Luo, G.: Dynamic position estimation improvement for sensorless control of PMSM with ADRC-DPLL embedded in current controller. In: 2022 25th International Conference on Electrical Machines and Systems (ICEMS), pp. 1–6 (2022). https://doi.org/10.1109/ICEMS56177.2022.9983413
2. Deng, F., Guan, Y.: PMSM vector control based on improved ADRC. In: 2018 IEEE International Conference of Intelligent Robotic and Control Engineering (IRCE), pp. 154–158 (2018). https://doi.org/10.1109/IRCE.2018.8492927
3. Parvathy, M.L., Thippiripati, V.K.: An effective modulated predictive current control of PMSM drive with low complexity. IEEE J. Emerg. Sel. Top. Power Electron. 10(4), 4565–4575 (2022). https://doi.org/10.1109/JESTPE.2021.3077057
4. Yu, K., Wang, Z.: Online decoupled multi-parameter identification of dual three-phase IPMSM under position-offset and HF signal injection. IEEE Trans. Ind. Electron. 1–11 (2023). https://doi.org/10.1109/TIE.2023.3273256
5. Agustin, C.A., Yu, J.T., Lin, C.K., Jai, J., Lai, Y.S.: Triple-voltage-vector model-free predictive current control for four-switch three-phase inverter-fed SPMSM based on discrete-space-vector modulation. IEEE Access 9, 60352–60363 (2021). https://doi.org/10.1109/ACCESS.2021.3074067
6. Zhiqiang, W., Jiashu, W., Hanjun, G.: Design of the expert ADRC. In: 2008 Chinese Control and Decision Conference, pp. 1761–1763 (2008). https://doi.org/10.1109/CCDC.2008.4597624
7. Zhang, H., Liu, W.: An improved current predictive control algorithm in the application of 22 kW variable speed permanent magnet synchronous motor. In: 2014 17th International Conference on Electrical Machines and Systems (ICEMS), pp. 556–562 (2014). https://doi.org/10.1109/ICEMS.2014.7013548


8. Trivedi, M.S., Keshri, R.K.: Evaluation of predictive current control techniques for PM BLDC motor in stationary plane. IEEE Access 8, 46217–46228 (2020). https://doi.org/10.1109/ACCESS.2020.2978695
9. Han, J.: From PID to active disturbance rejection control. IEEE Trans. Ind. Electron. 56(3), 900–906 (2009). https://doi.org/10.1109/TIE.2008.2011621
10. Shi, S., Guo, L., Chang, Z., Zhao, C., Chen, P.: Current controller based on active disturbance rejection control with parameter identification for PMSM servo systems. IEEE Access 1 (2023). https://doi.org/10.1109/ACCESS.2023.3274578
11. Wu, Z.H., Guo, B.G.: Extended state observer for uncertain lower triangular nonlinear systems subject to stochastic disturbance. Syst. Control Lett. 85, 100–108 (2015)

Research on the Operation Status of Metro Power Supply Equipment Under Cyber Physical System

Zhangbao Cao1(B), Heng Wan2, Xuliang Tang2, and Xuefeng Chen3

1 Hangzhou Branch of Shanghai Jiudao Information Technology Co., Ltd., Hangzhou 310012, China
[email protected]
2 Shanghai Institute of Technology, Shanghai 201418, China
3 Shanghai Jiudao Information Technology Co., Ltd., Shanghai 200050, China

Abstract. Based on the theory of the Cyber Physical System (CPS), a real-time study of the operation status of metro power supply equipment under CPS is established. In view of the unclear boundary between the normal state and the alert state, a research-and-judgment matrix is set for the alert state. Drawing on the SWOT analysis method, an evaluation system of equipment operation indicators is built with the analytic hierarchy process; the combined weights of the IAHP and the entropy weight method are optimised; membership matrices are established under the different indicators; the weight vector and the membership matrices are combined to obtain the evaluation result vector; and the maximum membership degree method is used to determine the final operation status evaluation result. In this way, the equipment operation status can be judged correctly, improving the accuracy of condition-based maintenance and ensuring the healthy operation of the subway power supply system.

Keywords: Cyber Physical System · Degree of Membership · Fuzzy Comprehensive Evaluation Method · Power Supply Equipment

1 Introduction

CPS (Cyber-Physical Systems) is a complex and broad system that involves multiple disciplines such as power, communication, control and computer science [1]. CPS pays attention to the rational integration, utilization and optimal scheduling of resources to realize real-time perception and control of complex systems and environments, while providing flexible, fast and intelligent network information services [2]. Cyber-physical technology is of great significance to the development of industry, science and technology. In 2007, the United States science and technology advisory committee explicitly recommended that it be listed as a priority area for national research investment [3]. The subway plays an ever more important role in China's megacities. The excellent performance and safe operation of subway power supply equipment are necessary

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 720–727, 2023. https://doi.org/10.1007/978-981-99-6187-0_71


conditions to ensure the healthy operation of urban rail transit [4]. How to deeply integrate intelligent traction power supply equipment with CPS applications is not only the focus of current research but also a development direction of the electric power field [5]. In a cyber-physical system, the physical system contains a large number of physical elements such as equipment and switching devices, and the information system contains a large number of information elements such as intelligent devices and communication equipment [6]. The two are coupled with each other, and the failure of any component affects the reliability of the whole system. It is therefore very important to establish a reliability coupling model of the cyber-physical system and to analyse the consequences of component failures [7]. In the evaluation process, most existing methods use a single weight and membership function. The subjective factors are strong, and when handling fuzzy phenomena the membership function forces the fuzzy problem to be precise, which cannot fully reflect the uncertainty in the evaluation process, so the results are not scientific and accurate enough. Based on the above literature, this paper proposes a power supply equipment health index based on the SWOT analysis method [8] under the theoretical guidance of the power system CPS. The combined weight of IAHP and the entropy weight method is used to combine the objective data of the CPS with the subjective judgment of expert evaluation [9]. The improved analytic hierarchy process and the fuzzy comprehensive evaluation method are used to determine the evaluation result matrix, and the maximum membership method is used to map the evaluation results to equipment statuses. Thus, the equipment status can be clearly divided, improving the accuracy of early warning for the subway power supply system and ensuring the reliability of condition-based maintenance of subway power supply equipment.
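The evaluation pipeline described above (weighted synthesis of a membership matrix followed by the maximum membership rule) can be sketched in Python. The weights, membership values and grade names below are illustrative placeholders, not values from the paper:

```python
def fuzzy_evaluate(weights, membership, grades):
    """Fuzzy comprehensive evaluation: B = W . R, then maximum membership.
    weights: combined weights of the n indicators (summing to 1);
    membership: n x m matrix, row i = memberships of indicator i in the
    m status grades; grades: names of the m status grades."""
    m = len(membership[0])
    # Evaluation result vector B: weighted sum over the indicator rows.
    b = [sum(w * row[j] for w, row in zip(weights, membership))
         for j in range(m)]
    # Maximum membership rule picks the grade with the largest component.
    return grades[max(range(m), key=lambda j: b[j])], b
```

For instance, three indicators with weights (0.5, 0.3, 0.2) and the memberships below yield the result vector (0.42, 0.38, 0.20) and the grade "normal".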

2 Background

2.1 Metro Power Supply System Under the Cyber Physical System

In China, rail transit mostly adopts the centralized power supply mode: power is supplied from the urban power grid to the dedicated main substation of the subway, and the main substation supplies the traction substations and step-down substations. The safe operation of metro power supply equipment is very important, as it affects the safety of the whole metro power supply network. Common Class A equipment of the metro power supply system is shown in Table 1. The data of the subway power supply system are transmitted to the network terminal through SCADA, the energy consumption system, etc.; through calculation, research and judgment, cloud storage and scheduling, and feedback control, precise control, early warning, fault prediction and other functions in the physical world are realized. Thus the subway power supply system under the CPS is formed, as shown in Fig. 1. The CPS can comprehensively control the operation of the subway power supply. In this paper, the operation status of metro power supply equipment is studied to achieve real-time analysis, research and judgment of equipment operation and to improve operation and maintenance efficiency.

Z. Cao et al.

Table 1. Class A equipment of metro power supply system.

Equipment Category | Equipment
Transformer | 110 kV main transformer; 35 kV power transformer; 35 kV grounding transformer
Medium and High Voltage Switchgear | 35 kV incoming line switch cabinet; 35 kV section switch cabinet; 35 kV sectional lifting cabinet; 35 kV follow-up power transformer switchgear
DC Traction Equipment | 1500 V isolation switch cabinet (up); 1500 V current drainage cabinet; 1500 V cross-zone isolation switch cabinet (up); 1500 V cross-zone isolation switch cabinet (down)

Fig. 1. Metro power supply system under CPS.

2.2 Equipment Operation Status

Under the metro power supply cyber-physical system, the cyber-physical view of the power supply equipment operation status is shown in Fig. 2. CPS technology is used to refine the division of equipment operation states in the original physical world into an accurate division of states.

Research on the Operation Status of Metro Power Supply Equipment Under CPS


Fig. 2. Equipment operation status under CPS.

The boundary between the normal state and the alert state is easily blurred, so abnormal early warning of the equipment operation state cannot be achieved reliably. In practice, the traction power supply system may operate normally at one moment and enter the emergency state at the next; if the alert state cannot be identified in time, state-based repair cannot play its role. Traditional assessment of equipment operation status is often after-the-fact feedback, which brings great uncertainty to the operation safety of the power supply system and even to the personal safety of passengers.

3 Related Work

3.1 Establishment of Indicator System

Drawing on the SWOT analysis method, the equipment operation status is judged from the equipment's own strengths and weaknesses as well as from external factors. Set the target layer as X = {XS, XW, XO, XT}. Under index XS, a higher factor score means better equipment performance. Index XW collects events that occur during operation or changes in the equipment's own electrical data; values beyond a certain range indicate damage to the equipment. Layer XO represents the positive effect of human intervention, and layer XT the negative effect of external conditions on the equipment. xij denotes the jth indicator of the ith criterion layer; the indicators are common to most electrical equipment. The resulting health status indicator system of metro power supply equipment is shown in Table 2.
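The four-layer indicator structure above can be captured in a small data structure; the sketch below is illustrative rather than code from the paper, and simply mirrors the criterion and indicator layers listed in Table 2.

```python
# SWOT-style indicator system X = {XS, XW, XO, XT} as in Table 2,
# held as a mapping: criterion symbol -> (criterion layer, indicator list).
INDICATOR_SYSTEM = {
    "XS": ("strengths", ["xS1 remaining service life (months)",
                         "xS2 mean time between failures (months)",
                         "xS3 correct action rate of switch (%)",
                         "xS4 power factor"]),
    "XW": ("weaknesses", ["xW1 load rate (%)",
                          "xW2 DC resistance",
                          "xW3 three-phase voltage unbalance (%)",
                          "xW4 alarm frequency (times/year)",
                          "xW5 number of failures (times/year)",
                          "xW6 wear degree (%)"]),
    "XO": ("opportunities", ["xO1 maintenance frequency (times/year)",
                             "xO2 mean time to repair (h)",
                             "xO3 upgrades and transformations (times/year)"]),
    "XT": ("threats", ["xT1 ambient temperature (deg C)",
                       "xT2 ambient humidity (%)",
                       "xT3 dust impact"]),
}

def indicator_count(system):
    """Total number of bottom-layer indicators x_ij across all criteria."""
    return sum(len(indicators) for _, indicators in system.values())
```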


Table 2. Health status indicator system of metro power supply equipment.

Criterion symbol | Criterion layer | Indicator symbol | Indicator layer
XS | strengths | xS1 | remaining service life (months)
   |           | xS2 | mean time between failures (months)
   |           | xS3 | correct action rate of switch (%)
   |           | xS4 | power factor
XW | weaknesses | xW1 | load rate (%)
   |            | xW2 | DC resistance (MΩ/kV)
   |            | xW3 | three-phase voltage unbalance (%)
   |            | xW4 | alarm frequency (times/year)
   |            | xW5 | number of failures (times/year)
   |            | xW6 | wear degree (%)
XO | opportunities | xO1 | maintenance frequency (times/year)
   |               | xO2 | mean time to repair (h)
   |               | xO3 | equipment upgrading and transformation (times/year)
XT | threats | xT1 | ambient temperature (°C)
   |         | xT2 | ambient humidity (%)
   |         | xT3 | dust impact

4 Experimental Results of Safety Assessment of Power Supply System

Based on the health status evaluation method built above, the transformer of a rail transit line's power supply system is selected as the object of this evaluation, and the effectiveness of the proposed evaluation model is verified through an example analysis of the relevant indicator data collected by the CPS data center. The normalized CPS data are substituted into the ridge membership function to solve the membership degree of each parameter with respect to the different state intervals, yielding the judgment matrix of each indicator shown in Table 3 and the criterion-layer evaluation matrices below.

RS = [0.3948 0.6052 0 0; 0 0 0 1; 0 0 0.9443 0.0556; 0 0 0 1]

RW = [0 1 0 0; 0 0 0 1; 1 0 0 0; 0 1 0 1; 0 0 0 1; 0 0 0 1]

RO = [0 0 0 1; 0 0 0 1; 0 0 0 1]

RT = [0 0 1 0; 0 1 0 0; 0 0 0 1]

BS = (0.2361, 0.2312, 0.2036, 0.3291)
BW = (0.0425, 0.5463, 0.1293, 0.2819)
BO = (0, 0, 0, 1)
BT = (0, 0.3147, 0.4756, 0.2097)
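The ridge (sine-shaped) membership evaluation referred to above can be sketched as follows; the interval endpoints a and b are illustrative placeholders, since the paper does not list the thresholds that bound each state interval.

```python
import math

def ridge_ascending(x, a, b):
    """Ascending ridge membership: 0 below a, 1 above b, and a smooth
    sine-shaped transition 0.5 + 0.5*sin(pi/(b-a)*(x-(a+b)/2)) in between."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return 0.5 + 0.5 * math.sin(math.pi / (b - a) * (x - (a + b) / 2))

def ridge_descending(x, a, b):
    """Descending ridge membership, the complement of the ascending form."""
    return 1.0 - ridge_ascending(x, a, b)
```

At the midpoint of the transition band both forms return 0.5, which is exactly the kind of blurred normal/alert boundary the fuzzy evaluation is designed to handle.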

Table 3. Judgment matrix of each indicator.

Indicator | Normal | Recovery | Alert | Emergency
xS1 | 0.4108 | 0.0660 | 0.0410 | 0.4822
xS2 | 0.0476 | 0.4693 | 0.0678 | 0.4153
xS3 | 0.4680 | 0.4083 | 0.0160 | 0.1077
xS4 | 0.2804 | 0.3971 | 0.0488 | 0.2737
xW1 | 0.3499 | 0.0647 | 0.0305 | 0.5549
xW2 | 0.1689 | 0.0672 | 0.0028 | 0.7611
xW3 | 0.2137 | 0.1434 | 0.0646 | 0.5783
xW4 | 0.0625 | 0.3110 | 0.0834 | 0.5431
xW5 | 0.0624 | 0.4752 | 0.0586 | 0.4038
xW6 | 0.2967 | 0.4324 | 0.0065 | 0.2644
xO1 | 0.3781 | 0.1206 | 0.0213 | 0.4800
xO2 | 0.0700 | 0.0725 | 0.0260 | 0.8315
xO3 | 0.0521 | 0.0314 | 0.0239 | 0.8926
xT1 | 0.2427 | 0.0010 | 0.0527 | 0.7036
xT2 | 0.2923 | 0.4484 | 0.0433 | 0.2160
xT3 | 0.1646 | 0.0182 | 0.0476 | 0.7696

The target layer corresponds to the four-level membership matrix R and evaluation vector B of the transformer operation status.

R = [0.2361 0.2313 0.2036 0.3291; 0.0425 0.5463 0.1293 0.2819; 0 0 0 1; 0 0.3147 0.4756 0.2097]

Substituting into the fuzzy evaluation gives

B = (0.0759, 0.2820, 0.2074, 0.4347)
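The synthesis step that produces B from R can be reproduced with the weighted-average fuzzy operator; the weight vector below is hypothetical (the paper's combined weights are not reported here), while R is the target-layer matrix above.

```python
import numpy as np

STATES = ["Normal", "Recovery", "Alert", "Emergency"]

# Target-layer membership matrix R (rows: XS, XW, XO, XT).
R = np.array([
    [0.2361, 0.2313, 0.2036, 0.3291],
    [0.0425, 0.5463, 0.1293, 0.2819],
    [0.0,    0.0,    0.0,    1.0   ],
    [0.0,    0.3147, 0.4756, 0.2097],
])

# Hypothetical criterion weights (must sum to 1); the paper's combined
# subjective/objective weights are not reproduced here.
w = np.array([0.3, 0.3, 0.2, 0.2])

B = w @ R                            # weighted-average fuzzy operator M(*, +)
state = STATES[int(np.argmax(B))]    # maximum-membership principle
```

With these illustrative weights the largest component of B falls on "Emergency", consistent with the maximum-membership reading in the Discussion.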


5 Discussion

The evaluation vector B gives the membership of the equipment health status in the four levels "Normal", "Recovery", "Alert", and "Emergency". By the maximum membership principle, the "Emergency" membership of the transformer is the largest and the "Normal" membership the smallest, so overall the equipment lies between the "Alert" and "Emergency" states. Referring to the GB/T 19212.1-2016 standard and the description of equipment health status above, the transformer is in the alert-to-fault range at this time. In particular, since the design service life of the transformer is about 20 years while its remaining service life is about 5 years, and the three-phase voltage unbalance exceeds 2%, replacement of the equipment should be considered to ensure system operation. In addition, to further verify the accuracy of the proposed evaluation model, the differences between the combination weighting method and the single weighting methods in determining index weights are compared, as shown in Fig. 3.

Fig. 3. Comprehensive comparison of subjective and objective weights.

The comparison shows that the combined subjective-objective weighting method accounts for both expert judgment in equipment evaluation and the local differences present in the objective CPS data of the equipment, giving a more reasonable and effective weight setting than a purely subjective or purely objective weighting method.
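One common way to realize such a combination, and the one assumed in this sketch since the paper does not give its exact formula, is a convex mixture of the subjective (e.g. AHP) and objective (e.g. entropy-method) weight vectors:

```python
import numpy as np

def combine_weights(w_subjective, w_objective, alpha=0.5):
    """Convex combination of subjective (e.g. AHP) and objective
    (e.g. entropy-method) weights, renormalized to sum to 1.
    alpha controls how much the expert judgment dominates."""
    ws = np.asarray(w_subjective, dtype=float)
    wo = np.asarray(w_objective, dtype=float)
    w = alpha * ws + (1.0 - alpha) * wo
    return w / w.sum()
```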

6 Conclusions

In this paper, the metro power supply system under CPS and the equipment operation state under the cyber-physical system are studied based on cyber-physical system technology. The evaluation indexes of the equipment operation state are classified by SWOT


analysis method and AHP. The fuzzy evaluation method is used to address the unclear boundary between the alert state and the normal state of the equipment, improving the accuracy of state-based repair. The evaluation index system is common to metro power supply equipment, but in practical application the weights of the indicators differ across equipment types. When studying the operation status of each type of equipment, the importance of the indicators should be analyzed and the indicator parameters optimized, so as to reflect equipment operation more truthfully, give dynamic instructions in advance for timely manual intervention, and ensure the healthy operation of metro power supply equipment.

References

1. Wolf, W.: Cyber-physical systems. Computer 42(3), 88–89 (2009)
2. Derler, P., Lee, E.A., Sangiovanni-Vincentelli, A.: Modeling cyber–physical systems. Proc. IEEE 100(1), 13–28 (2011)
3. Liu, Y., Peng, Y., Wang, B., Yao, S., Liu, Z.: Review on cyber-physical systems. IEEE/CAA J. Automatica Sinica 4(1), 27–40 (2017)
4. Shevlyugin, M.V., Golitsyna, A.E., Belov, M.N., Pletnev, D.S.: Increasing power supply reliability for auxiliaries of subway traction substations using energy storage devices. Russ. Electr. Eng. 91, 552–556 (2020). https://doi.org/10.3103/S1068371220090114
5. Sridhar, S., Hahn, A., Govindarasu, M.: Cyber–physical system security for the electric power grid. Proc. IEEE 100(1), 210–224 (2012)
6. Yohanandhan, R.V., Elavarasan, R.M., Manoharan, P., Mihet-Popa, L.: Cyber-physical power system (CPPS): a review on modeling, simulation, and analysis with cyber security applications. IEEE Access 8, 151019–151064 (2020)
7. Arghandeh, R., Von Meier, A., Mehrmanesh, L., Mili, L.: On the definition of cyber-physical resilience in power systems. Renew. Sustain. Energy Rev. 58, 1060–1069 (2016)
8. Guo, J., Han, Y., Guo, C., Lou, F., Wang, Y.: Modeling and vulnerability analysis of cyber-physical power systems considering network topology and power flow properties. Energies 10, 87 (2017)
9. Zhang, H., Peng, M., Guerrero, J.M., Gao, X., Liu, Y.: Modelling and vulnerability analysis of cyber-physical power systems based on interdependent networks. Energies 12, 3439 (2019)

Hybrid Underwater Acoustic Signal Multi-Target Recognition Based on DenseNet-LSTM with Attention Mechanism

Mingchao Zhu1,2, Xiaofeng Zhang2, Yansong Jiang3, Kejun Wang2(B), Binghua Su2, and Tenghui Wang2

1 Faculty of Data Science, City University of Macau, Macau, China
2 Beijing Institute of Technology, Zhuhai, China

[email protected]
3 The 54th Research Institute of China Electronics Technology Group Corporation, Hebei, China

Abstract. The research on multi-target recognition of mixed underwater acoustic signals is of great significance for military missions, ocean development, and navigation safety assurance. Due to the limited availability of information and the significant impact of the seawater medium and marine environmental noise on mono-channel underwater acoustic signals, achieving reliable and accurate multi-target recognition remains challenging. To overcome this challenge, this paper investigates a deep learning-based method for recognizing and classifying underwater acoustic signals. A recurrent neural network fused with a deep convolutional network is proposed for multi-target recognition of mixed underwater acoustic signals. The deep convolutional network, DenseNet, is employed to extract frequency-domain features, and an attention mechanism is introduced to capture the most salient features. Finally, hybrid underwater acoustic signal recognition is achieved through an LSTM recurrent neural network. The experimental results demonstrate that the DenseNet-LSTM model can enhance the accuracy of mixed underwater acoustic signal recognition based on frequency-domain features; by incorporating the attention mechanism, the recognition rate is further improved.

Keywords: Hybrid underwater acoustic signal · Multi-target recognition · DenseNet-LSTM · Attention mechanism

1 Introduction

Underwater sound signals remain the most important and effective form of signal in surface and underwater target detection, communication, navigation, and guidance. Underwater acoustic target recognition technology plays a significant role in various tasks, including maritime warfare, maritime intelligence acquisition, and marine biological detection. Underwater acoustic mixed signal recognition is a technology that utilizes sonar or other acquisition devices to collect, process, recognize, and classify underwater

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 728–738, 2023. https://doi.org/10.1007/978-981-99-6187-0_72


sound. Waves, sea winds, rainfall, and marine biological noise can significantly raise the ocean noise level, increasing the difficulty of obtaining and analyzing real target signals. Effectively recognizing underwater acoustic signals under limited data conditions, and improving recognition accuracy for mono-channel, multi-target signals at low signal-to-noise ratios, remain the main open problems. In particular, when the mono-channel received signal mixes several signal types of similar energy, recognizing and distinguishing the mixed underwater acoustic signals becomes even more challenging. Despite efforts to improve the accuracy of underwater signal recognition, significant obstacles remain. Nevertheless, with data-driven, feature-based deep learning models as learners, deep-learning-based signal separation using supervised or unsupervised learning has made important progress.

From the perspective of pattern recognition, underwater acoustic signal recognition consists of two main components: feature extraction from the underwater acoustic signal and the development of classification models based on the extracted features. Commonly used feature extraction methods include time-domain and frequency-domain features and various signal processing transforms. For classification, various approaches can be employed, from traditional machine learning methods such as support vector machines, multi-layer perceptrons, and hidden Markov models to deep learning models such as recurrent neural networks (RNN) and convolutional neural networks (CNN). By incorporating an attention mechanism, the network can selectively focus on key information, improving both model efficiency and the accuracy of recognizing mixed underwater acoustic signals.
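The frequency-domain front end of such systems is essentially a magnitude short-time Fourier transform; the minimal LOFAR-style sketch below uses illustrative frame length and hop size, not settings from the paper.

```python
import numpy as np

def lofar_spectrogram(signal, frame_len=256, hop=128):
    """Magnitude STFT: split the signal into overlapping Hann-windowed
    frames and take the FFT magnitude of each (frames x frequency bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

# A 1 kHz tone sampled at 16 kHz should peak in the FFT bin nearest 1 kHz.
fs = 16000
t = np.arange(fs) / fs
spec = lofar_spectrogram(np.sin(2 * np.pi * 1000 * t))
peak_hz = np.argmax(spec.mean(axis=0)) * fs / 256
```

For a pure tone the energy concentrates in one frequency bin, which is the kind of spectral line structure LOFAR features expose for ship-radiated noise.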
In recent years, extensive research has been devoted to deep learning approaches for recognizing underwater acoustic signals. Several studies [1–6] have specifically focused on employing deep learning methods for underwater acoustic signal classification. For instance, Ibrahim et al. [4] applied the discrete wavelet transform (DWT) to remove noise from the signals and utilized CNN and long short-term memory (LSTM) networks for classification, with notable improvements. Sang et al. [6] proposed a dense CNN model for underwater target recognition, while [7] introduced RNN for the same purpose. Furthermore, CNN-LSTM models have been successfully applied in speech emotion recognition and the enhancement of medical systems [8–10]. Figure 1 illustrates an underwater acoustic signal recognition system based on CNN-LSTM. Additionally, attention mechanisms have been employed in underwater acoustic target recognition research [11, 12].

This article proposes a DenseNet-LSTM prediction model with an attention mechanism for the frequency-domain characteristics of mixed underwater acoustic signals, achieving accurate recognition through coarse- and fine-grained features. The model consists of two parts: a DenseNet front end with an attention mechanism that extracts important fine-grained features, and an LSTM back end that extracts coarse-grained patterns hidden in the frequency-domain features, achieving multi-target recognition of mixed underwater acoustic signals. The superiority of using deep convolutional structures to extract frequency-domain features of underwater acoustic signals


and then feeding them into RNN structures in recognition tasks has been demonstrated through experiments.

Fig. 1. Underwater Acoustic Signal Recognition Network Based on CNN-LSTM

2 Related Work

2.1 CNN and DenseNet

CNN has gained significant popularity in the field of computer vision, with models such as ResNet, GoogLeNet, VGG-19, and Inception being widely acknowledged. ResNet has been a remarkable milestone in the history of CNN, as it enables the training of deeper models and thereby higher accuracy. ResNet achieves this by establishing 'shortcuts' (skip connections) between earlier and later layers; these shortcuts facilitate gradient backpropagation during training and enable deeper CNN networks. DenseNet shares a similar concept, establishing dense connections between all preceding and subsequent layers, as shown in Fig. 2, and promoting feature reuse through channel-wise concatenation. These attributes allow DenseNet to outperform ResNet in accuracy while using fewer parameters and less computation.

DenseNet has been used for underwater acoustic mixed signal recognition among other fields. Y. Yao et al. [13] explore the use of DenseNet for underwater acoustic target recognition, Gao et al. [14] proposed a DCGAN and DenseNet method, and Zhenrong Deng et al. [15] introduced DenseNet with adaptive attention for image captioning.

Each layer in a DenseNet dense block receives input from all preceding layers, with the connections between layers realized through the shortcut mechanism employed in ResNet. This dense connectivity offers several benefits: it mitigates vanishing gradients, improves feature propagation, promotes feature reuse, and significantly reduces the parameter count.

2.2 LSTM

LSTM is an enhanced neural network architecture that adds a 'memory' component to the RNN. Unlike feedforward networks, it considers not only the


current input's impact on the output but also the influence of previous-stage outputs on the current output. This creates connections between hidden layers and the ability to capture relationships within time series.

A plain RNN, however, shares one weight matrix across every time step. As sequences grow, repeated multiplication by these weights leads to 'vanishing gradients' and 'exploding gradients', preventing effective use of distant data points and leaving the 'long-term dependence' problem unsolved. The LSTM model improves upon the RNN architecture to overcome these challenges. It introduces a cell state and 'gate' structures: the cell state records historical information, continually updated and transmitted through time, while the gates manage the flow of information. As seen in Fig. 3, this structure includes input, forget, and output gates that regulate information transmission through the sigmoid function. The forget gate selectively filters memory from the previous moment and the newly input information: a value of 1 passes all information, a value of 0 passes none.

In contrast to the plain RNN, these gate mechanisms let LSTM filter the input from the previous time step, retaining crucial information while discarding less relevant data and alleviating gradient vanishing. Even so, for long input sequences LSTM may still struggle to retain essential information. A deep CNN can therefore be employed to preprocess the original data, extracting meaningful features and filtering out irrelevant information, which ultimately enhances recognition accuracy.
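The gate arithmetic described above reduces to a handful of matrix products; below is a single LSTM time step in plain numpy, where the weight shapes follow the standard formulation and are randomly initialized purely for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4h, d), U: (4h, h), b: (4h,); the four
    blocks are the input gate i, forget gate f, candidate g, output gate o."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[:n])               # input gate
    f = sigmoid(z[n:2 * n])          # forget gate: filters old memory
    g = np.tanh(z[2 * n:3 * n])      # candidate cell state
    o = sigmoid(z[3 * n:])           # output gate
    c = f * c_prev + i * g           # updated cell (memory) state
    h = o * np.tanh(c)               # new hidden state
    return h, c

rng = np.random.default_rng(0)
d, n = 64, 128                       # matches the (128, 64) sequence in Table 1
W = 0.1 * rng.normal(size=(4 * n, d))
U = 0.1 * rng.normal(size=(4 * n, n))
b = np.zeros(4 * n)
h, c = np.zeros(n), np.zeros(n)
for x in rng.normal(size=(5, d)):    # run a few time steps
    h, c = lstm_step(x, h, c, W, U, b)
```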

Fig. 2. Dense Connections within the Dense Block

Fig. 3. Internal Structure of LSTM

2.3 Attention Mechanism

Research combining attention mechanisms with deep learning has steadily increased and achieved good results. For example, [16] proposed a CNN-LSTM-Attention neural network for an ionospheric TEC forecasting model, [17] employed a deep model combining CNN, LSTM, and attention for landslide mapping, and [18–20] applied CNN-LSTM-Attention to problems in energy and the IoT. For the task of mixed-signal multi-target recognition, this article introduces spatial attention, channel attention, and adaptive attention on top of the existing network to improve the model's ability to recognize multiple target signals within a mixture.


(1) CBAM module. The Convolutional Block Attention Module (CBAM) is a lightweight and effective attention module, as illustrated in Fig. 4. It generates attention maps along two independent dimensions, channel and spatial, which are multiplied with the corresponding feature map to refine the input representation. CBAM integrates seamlessly with various CNN architectures, providing good end-to-end results at minimal computational and memory cost. It comprises two integral components, the channel attention module and the spatial attention module, which optimize the information flow within the network by learning to emphasize or suppress feature information.

(2) Self-attention module. Self-attention was proposed in SAGAN, which introduces it into the generator's convolution modules. In traditional convolutional networks the local receptive field is constrained by the kernel size, which is limiting when long-range regional dependencies must be modeled; taking face generation as an example, asymmetric generated faces appear unrealistic. To make better use of global information, a self-attention mechanism is introduced, as depicted in Fig. 5.
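The two CBAM sub-modules can be sketched with pooling-based maps. The toy version below uses parameter-free average/max pooling plus a sigmoid, whereas real CBAM learns a shared MLP for the channel map and a 7×7 convolution for the spatial map, so this only illustrates the data flow.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    """x: (C, H, W). Squeeze the spatial dims by avg+max pooling, giving
    one weight per channel (CBAM additionally passes these through an MLP)."""
    s = x.mean(axis=(1, 2)) + x.max(axis=(1, 2))
    return sigmoid(s)[:, None, None] * x

def spatial_attention(x):
    """x: (C, H, W). Squeeze the channel dim by avg+max pooling, giving one
    weight per spatial location (CBAM applies a 7x7 convolution here)."""
    s = x.mean(axis=0) + x.max(axis=0)
    return sigmoid(s)[None, :, :] * x

def cbam_like(x):
    # Channel attention first, then spatial attention, as in CBAM.
    return spatial_attention(channel_attention(x))

feat = np.random.default_rng(1).normal(size=(16, 8, 8))
out = cbam_like(feat)
```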

Fig. 4. CBAM Structure

Fig. 5. Self attention module structure

3 DenseNet-LSTM with Attention Mechanism

Building on the research into deep convolutional and recurrent networks for recognition, a hybrid multi-target recognition network is constructed, and an attention mechanism is introduced to improve the recognition rate. DenseNet serves as the deep convolutional component and LSTM as the recurrent component.

3.1 DenseNet-LSTM

DenseNet establishes dense connections between all earlier layers and the layers behind them; because the feature maps are densely connected, feature reuse is achieved, and DenseNet exhibits better performance with fewer parameters and lower computational cost. DenseNet mainly consists of two modules, the dense block and the transition block. Before connecting the LSTM after DenseNet, a convolutional layer converts the number of channels to 128, aligning it with the time steps; the number of internal LSTM units is likewise set to 128. The specific structure is shown in Table 1.


Table 1. Network structure of DenseNet-LSTM.

Number | Network | Output | Number of parameters
1 | Input | (128, 128, 1) | 0
2 | DenseNet | (8, 8, 304) | 1183936
3 | Conv2D-BN-ReLU | (8, 8, 128) | 5632
4 | Reshape | (128, 64) | 0
5 | LSTM | (128, 128) | 98816
6 | Dense | (12) | 196620
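The shape bookkeeping behind the Reshape row in Table 1 can be checked directly: the 128 channels become the LSTM time steps and the flattened 8×8 grid becomes a 64-dimensional feature vector per step. The snippet below is a sketch of the tensor plumbing, not the paper's code.

```python
import numpy as np

# DenseNet output after the 1x1 Conv2D-BN-ReLU block: (H, W, C) = (8, 8, 128)
feat = np.zeros((8, 8, 128))

# Channels move to the front, then the 8x8 grid is flattened: each of the
# 128 channels becomes one LSTM time step with 64 features.
seq = feat.transpose(2, 0, 1).reshape(128, 64)

# Parameter counts implied by Table 1 (standard LSTM / dense-layer formulas):
lstm_params = 4 * ((64 + 128) * 128 + 128)   # four gates over (input + hidden)
dense_params = 128 * 128 * 12 + 12           # flattened (128, 128) -> 12 outputs
```

The derived counts match the 98816 and 196620 entries in Table 1, which corroborates the (128, 64) reshape convention.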

3.2 DenseNet-LSTM with Attention Mechanism

Traditional methods such as LSTM and RNN often have difficulty preserving sequence information over long input sequences. To overcome this, we propose a DenseNet-LSTM model that incorporates an attention mechanism, as illustrated in Fig. 6. The mechanism assigns distinct probability weights to the LSTM hidden layer, amplifying the impact of crucial information; this mitigates the loss of sequence information in lengthy inputs and improves recognition accuracy.

Fig. 6. Block Diagram of DenseNet-LSTM with Attention Mechanism for Hybrid Underwater Acoustic Signal Recognition

The DenseNet-LSTM model with attention mechanism consists of four main components: data preprocessing, a DenseNet unit with attention mechanism, an LSTM unit, and an output unit. Each component is described below.

(1) Data preprocessing: replace abnormal data with neighboring means; normalize to remove significant differences in the scales of different variables; divide the dataset into training, testing, and validation sets.


(2) DenseNet unit with attention mechanism: takes the original data as input and extracts multiple partially overlapping continuous subsequences.
(3) LSTM unit: uses the preceding unit's output as input, establishing a model for time series prediction.
(4) Output unit: the outputs are generated by the final hidden layer of the LSTM network.
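A minimal version of the preprocessing unit is sketched below (outlier replacement by neighboring means, min-max normalization, and a train/test/validation split); the z-score threshold and split ratios are illustrative choices not specified in the paper.

```python
import numpy as np

def replace_outliers(x, z_thresh=4.0):
    """Replace abnormal samples (|z-score| > z_thresh) with the mean of
    their two neighbors, as in step (1) of the preprocessing unit."""
    x = np.asarray(x, dtype=float).copy()
    z = (x - x.mean()) / (x.std() + 1e-12)
    for i in np.flatnonzero(np.abs(z) > z_thresh):
        lo, hi = max(i - 1, 0), min(i + 1, len(x) - 1)
        x[i] = (x[lo] + x[hi]) / 2.0
    return x

def minmax_normalize(x):
    """Min-max scaling to remove scale differences between variables."""
    x = np.asarray(x, dtype=float)
    return (x - x.min()) / (x.max() - x.min() + 1e-12)

def split(x, train=0.7, test=0.2):
    """Split into training, testing, and validation subsets (in that order)."""
    n = len(x)
    a = round(n * train)
    b = a + round(n * test)
    return x[:a], x[a:b], x[b:]
```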

4 Dataset and Experimental Results

The experiments were run on Ubuntu 16.04 (Linux) with an NVIDIA RTX 2070 GPU and the corresponding CUDA and cuDNN GPU acceleration libraries. The deep learning framework is Google's TensorFlow 1.12, and the implementation language is Python 3.6.

4.1 Dataset

The experimental data are all sourced from the ShipsEar dataset [21], which provides various types of underwater sound signals and records the ship type, equipment location, and weather conditions at acquisition time. The dataset comprises 11 classes of ship-radiated noise plus 1 class of natural marine ambient noise, 12 categories in total. Since the recording durations differ across categories, each category's samples are framed to the same duration before features are extracted and fed to the classification model. In this experiment each sample type was framed at 16,000 sampling points, and the training and testing sets were split 4:1, as shown in Table 2.

Table 2 covers the datasets constructed for single-signal recognition. Recognizing mixed signals poses significantly greater challenges, so four types (Fishboat, Sailboat, Pilot, and Roro) were selected and mixed in pairs at signal-to-noise ratios of -5 dB, 0 dB, and 5 dB, giving six types of mixed signals, as displayed in Table 3. In the multi-label recognition task on the mixed-signal dataset, the activation function of the final fully connected layer is the sigmoid, with a probability threshold of 0.5: if an output exceeds the threshold, the mixed signal is considered to contain that class. The loss function is binary cross-entropy.
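The multi-label decision rule and loss described above can be sketched directly; the class indices used in the toy example are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def predict_labels(logits, threshold=0.5):
    """A mixed signal is judged to contain class k iff sigmoid(logit_k)
    exceeds the probability threshold."""
    return (sigmoid(logits) > threshold).astype(int)

def binary_crossentropy(y_true, logits, eps=1e-12):
    """Binary cross-entropy averaged over the 12 class labels."""
    p = np.clip(sigmoid(logits), eps, 1.0 - eps)
    return float(-np.mean(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Toy example: a mixture containing two of the 12 classes (the index
# convention here is illustrative, not the paper's label encoding).
y = np.zeros(12)
y[[7, 9]] = 1.0
logits = np.where(y == 1.0, 3.0, -3.0)   # a confident synthetic prediction
```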
The frequency-domain features mainly include the LOFAR spectrogram, the Mel spectrum, and MFCC (Mel-frequency cepstral coefficients), all obtained via the short-time Fourier transform. The LOFAR feature effectively captures the recognition attributes of each ship, while the Mel features and MFCC, derived from LOFAR, emphasize low-frequency information based on human auditory characteristics and thereby show good classification properties.

4.2 Experimental Results

Table 4 shows the recognition accuracy on mixed signals. Before mixing, the training and testing sets were shuffled and then mixed. This approach not only enriches


Table 2. Division of the ShipsEar dataset into training and testing sets.

Serial number | Ship category | Total samples | Training samples | Test samples
1 | Passengers | 13195 | 10556 | 2639
2 | Trawler | 539 | 431 | 108
3 | Mussel boat | 2425 | 1924 | 481
4 | Tugboat | 681 | 544 | 137
5 | Dredger | 866 | 692 | 174
6 | Fishboat | 1702 | 1361 | 341
7 | Oceanliner | 3114 | 2491 | 623
8 | Sailboat | 1245 | 1076 | 169
9 | Motorboat | 3063 | 2450 | 613
10 | Pilot ship | 456 | 364 | 92
11 | Roro | 4988 | 3990 | 998
12 | Natural ambient noise | 1063 | 755 | 308

Table 3. Division of the mixed-signal dataset into training and testing sets.

Category label | Mixed ship categories | Total samples | Training samples | Test samples
1_8 | Pas_Sail | 5000 | 4000 | 1000
1_10 | Pas_Pilo | 5000 | 4000 | 1000
1_11 | Pas_Ro | 5000 | 4000 | 1000
8_10 | Sail_Pilo | 5000 | 4000 | 1000
8_11 | Sail_Ro | 5000 | 4000 | 1000
10_11 | Pilo_Ro | 5000 | 4000 | 1000

the diversity of the mixed samples but also increases the recognition difficulty. Among the mixed-signal recognition methods, the DenseNet + LSTM combination achieved the highest accuracy, 0.6487. Table 5 shows the recognition accuracy after adding the time attention module to the mixed-signal recognition model; the conclusions are similar to those for single-signal recognition. For temporal features, introducing the temporal attention mechanism yields only a slight improvement in final model performance. This is partly because the time sequence expresses limited signal features: many discriminative features are hidden in the frequency domain, which caps the recognition rate achievable from time-domain signals.


Table 4. Recognition accuracy of mixed signals with DNN and RNN models (training set / test set).

Feature | ResNet | ResNet + LSTM | DenseNet | DenseNet + LSTM
LOFAR | 0.5120 / 0.4619 | 0.5321 / 0.4797 | 0.7121 / 0.6395 | 0.7323 / 0.6487
MEL | / | / | 0.8190 / 0.6227 | /

Feature | LSTM | GRU
MFCC | 0.5500 / 0.5433 | 0.5632 / 0.5123

Table 5. Recognition accuracy of mixed signals with time attention (training set / test set).

Feature | CNN + LSTM + Time Attention | CNN + GRU + Time Attention
Temporal sequence | 0.5231 / 0.5134 | 0.6231 / 0.5003
Autocorrelation feature | 0.5632 / 0.4917 | 0.5632 / 0.4900

Table 6 shows the effect of introducing the CBAM module, the self-attention (SA) module, and the temporal attention module into the mixed-signal recognition model, with LOFAR as the selected feature. Both the CBAM and self-attention modules are placed uniformly after the transition blocks in DenseNet; following the settings of the previous section, four CBAM or self-attention modules are added to each DenseNet, and the time attention module is placed after the LSTM. Overfitting occurs during training: while the validation accuracy barely changes, the training accuracy approaches 1.0, so the model with the highest validation accuracy is preserved. On the test set, DenseNet + SA improves on the original model by about 0.04, and the other models improve by 0.01 to 0.02. Models mixing CBAM with temporal attention, or SA with temporal attention, improve less than those using a single attention module. The self-attention module provides a larger receptive field, integrating more of the frequency and time extent of the frequency-domain features to obtain more global features, and therefore improves the recognition rate the most.


Table 6. Recognition accuracy of mixed signals with CBAM, SA, and time attention (training set / test set).

Feature | DenseNet + CBAM | DenseNet + CBAM + LSTM + Time Attention | DenseNet + SA | DenseNet + SA + LSTM + Time Attention
LOFAR | 0.6325 / 0.6485 | 0.6845 / 0.6637 | 0.7365 / 0.6848 | 0.6956 / 0.6477

5 Conclusion

This article proposes a fusion network that combines DenseNet, a convolutional neural network, with LSTM, a recurrent neural network. DenseNet is utilized to extract raw data features, capture interrelationships among multidimensional data, and filter out noise and unstable components; the processed information, exhibiting relatively stable patterns, is then passed as a whole to the LSTM network for recognition. The research investigates the recognition accuracy of different features extracted by various models on mixed-signal datasets. The hybrid architecture combining convolutional and recurrent neural networks achieves the highest recognition accuracy, demonstrating the effectiveness of extracting frequency-domain and temporal features with deep convolutional structures before feeding them into a recurrent neural network. Furthermore, in mixed-signal multi-target recognition, incorporating temporal, spatial, channel, and adaptive attention enhances the model's ability to recognize multiple target signals within a mixture.

Acknowledgments. This research received partial support from the R&D plan project in key fields of Guangdong Province (No. 2018B010109001, No. 2021B0707010001), as well as the key scientific research platform of universities in Guangdong Province (No. 2022KSYS016).

References

1. Miao, Y., Zakharov, Y.V., Sun, H., Li, J., Wang, J.: Underwater acoustic signal classification based on sparse time–frequency representation and deep learning. IEEE J. Oceanic Eng. 46(3), 952–962 (2021)
2. Jin, G., Liu, F., Hao, W., Song, Q.: Deep learning-based framework for expansion, recognition and classification of underwater acoustic signal. J. Exp. Theor. Artif. Intell. 32(2), 205–218 (2020)
3. Doan, V.S., Huynh-The, T., Kim, D.S.: Underwater acoustic target classification based on dense convolutional neural network. IEEE Geosci. Remote Sens. Lett. 19, 1–5 (2022)
4. Guo, T., et al.: Underwater target detection and localization with feature map and CNN-based classification. In: 2022 4th International Conference on Advances in Computer Technology, Information Science and Communications (CTISC), Suzhou, China, pp. 1–8 (2022). https://doi.org/10.1109/CTISC54888.2022.9849785


M. Zhu et al.

5. Hu, G., Wang, K.J., Liu, L.L.: Underwater acoustic target recognition based on depthwise separable convolution neural networks. Sensors (Basel) 4, 1429 (2021)
6. Sang, V.D., Huynh-The, T., Kim, D.S.: Underwater acoustic target classification based on dense convolutional neural network. IEEE Geosci. Remote Sens. Lett. 19, 1–5 (2020)
7. Liu, F., Shen, T., Luo, Z., Zhao, D., Guo, S.: Underwater target recognition using convolutional recurrent neural networks with 3-D mel-spectrogram and data augmentation. Appl. Acoust. 178, 107989 (2021)
8. Kumar, C.S.A., Maharana, A.D., et al.: Speech emotion recognition using CNN-LSTM and vision transformer. In: Innovations in Bio-Inspired Computing and Applications. IBICA 2022. Lecture Notes in Networks and Systems, vol. 649 (2022). https://doi.org/10.1007/978-3-031-27499-2_8
9. Rayan, A., Alaerjan, A.S., et al.: Utilizing CNN-LSTM techniques for the enhancement of medical systems. Alexandria Eng. J. 72, 323–338 (2023)
10. Rafi, S.H., Nahid-Al-Masood, S.R., Deeba, E.H.: A short-term load forecasting method using integrated CNN and LSTM network. IEEE Access 9, 32436–32448 (2021)
11. Xue, L., Zeng, X., Jin, A.: A novel deep-learning method with channel attention mechanism for underwater target recognition. Sensors 22(15), 5492 (2022)
12. Xiao, X., Wang, W., Ren, Q., Gerstoft, P., et al.: Underwater acoustic target recognition using attention-based deep neural network. JASA Express Lett. 1(10), 106001-1–106001-8 (2021)
13. Yao, Y., Zeng, X., Wang, H., Liu, J.: Research on underwater acoustic target recognition method based on DenseNet. In: 2022 3rd International Conference on Big Data, Artificial Intelligence and Internet of Things Engineering (ICBAIE), Xi'an, China, pp. 114–118 (2022)
14. Gao, Y., Chen, Y., Wang, F., et al.: Recognition method for underwater acoustic target based on DCGAN and DenseNet. In: 2020 IEEE 5th International Conference on Image, Vision and Computing (ICIVC), Beijing, China, pp. 215–221 (2020)
15. Deng, Z., Jiang, Z., Lan, R., Huang, W., Luo, X.: Image captioning using DenseNet network and adaptive attention. Signal Process. Image Commun. 85, 1–9 (2020)
16. Tang, J., Li, Y., Ding, M., Liu, H., Yang, D., Wu, X.: An ionospheric TEC forecasting model based on a CNN-LSTM-attention mechanism neural network. Remote Sens. 14(10), 2433 (2022)
17. Chen, C., Fan, L.: CNN-LSTM-attention deep learning model for mapping landslide susceptibility in Kerala, India. ISPRS Ann. Photogrammetry Remote Sens. Spat. Inf. Sci. X-3/W1-2022, 25–30 (2022). https://doi.org/10.5194/isprs-annals-X-3-W1-2022-25-2022
18. Wang, J., Wang, R., Zeng, X.: Short-term passenger flow forecasting using CEEMDAN meshed CNN-LSTM-attention model under wireless sensor network. IET Commun. 16, 1253–1263 (2022)
19. Chung, W.H., Gu, Y.H., Yoo, S.J.: District heater load forecasting based on machine learning and parallel CNN-LSTM attention. Energy 246, 123350 (2022)
20. Akmal, M.: Tensor factorization and attention-based CNN-LSTM deep-learning architecture for improved classification of missing physiological sensors data. IEEE Sens. J. 23(2), 1286–1294 (2023)
21. Santos-Domínguez, D., Torres-Guijarro, S., Cardenal-López, A., Pena-Gimenez, A., et al.: ShipsEar: an underwater vessel noise database. Appl. Acoust. 113, 64–69 (2016)

A Lightweight Deep Network Model for Visual Checking of Construction Materials

Xi Deng1, Bocheng Zhou2, Bingdong Ran3, Yingming Yang1, Ling Xiong2, and Kai Wang3(B)

1 China Academy of Building Research, Beijing 100013, China
2 Chongqing Western Water Resources Development Co. Ltd, Chongqing 400044, China
3 School of Automation, Chongqing University, Chongqing 400044, China

[email protected]

Abstract. Construction materials are essential in water conservancy construction, yet their checking is traditionally performed manually, which is not only time-consuming and labor-intensive but also lacks precision. To achieve fast and accurate detection and checking of construction materials, this paper takes the building material rebar as the research object and proposes a novel lightweight deep neural network model. To address the limited memory space and processor computing power available for model deployment, a lightweight method combining ShuffleNetV2 and BN-layer channel pruning is proposed for the first time, and ShuffleNetV2 is further improved for enhanced accuracy. The results show that, compared with state-of-the-art deep networks, the number of parameters is significantly reduced by 60.57% and FLOPs by 66.45% while the accuracy reaches 97.7%, which effectively guarantees the accuracy of construction material checking.

Keywords: Water Construction · Construction Materials · ShuffleNetV2 · Channel pruning

1 Introduction

The Yuxi water resources project has put forward the requirement of gradually evolving from digital to intelligent water conservancy, continuously exploring the application of new technologies such as the Internet of Things, BIM, and artificial intelligence during engineering construction, and gradually producing the corresponding intelligent applications [1]. Building materials are used in large quantities on construction sites, and for commercial and administrative purposes their information should be checked at each delivery. For example, the commonly used materials rebar and steel pipes are transported by truck between the supplier and the construction site, and the site storekeeper must check the actual quantity of material received against the purchase order. Construction materials are traditionally checked by weighing or by manual counting, which suffers from low efficiency, a high error rate, and high cost. Therefore, there is an urgent need

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 739–746, 2023. https://doi.org/10.1007/978-981-99-6187-0_73


to develop a new method that can automatically, quickly, and accurately check and count dense building materials, freeing workers from this tedious, unskilled task and improving the efficiency of receiving construction materials.

Current visual checking algorithms for construction materials are based either on traditional image processing or on deep learning. Among the traditional image-processing methods, Su et al. studied images of adhered rebar ends [2], obtaining the radius of normal rebar with a maximum-inscribed-circle algorithm. Zhang et al. achieved rebar tracking and checking by locating the center of the rebar cross-section through template matching and variable-threshold segmentation [3]. Hou et al. proposed a bundle-checking algorithm based on template overlay to obtain statistics on the number of rebar bundles [4]. Wang et al. proposed a pattern recognition method based on a quasi-circular assumption [5]. Thammasorn et al. proposed a template-based algorithm that supports checking of rectangular and circular steel pipes with good accuracy [6]. Lee et al. proposed separating the rebar region from the background to achieve rebar checking [7]. On the deep-learning side, Li et al. designed the BMC-YOLO model [8], based on YOLOv3, for scenes with large numbers of steel pipes. Min et al. proposed an improved algorithm for dense rebar checking based on RetinaNet [9]. Shi improved the ResNeXt101 feature extraction network and proposed a static-image rebar checking algorithm based on a cascaded target-candidate head network [10]. Fan et al. proposed a CNN-DC model [11]. Zhu et al. proposed an Inception-RFB-FPN-based rebar detection algorithm with sliding-window data augmentation (SWDA) to compensate for the scarcity of rebar data [12]. Cheng et al. proposed a weakly supervised multimodal annotation segmentation (WSMA-Seg) model, anchor-free and NMS-free, to achieve rebar checking through deep-learning semantic segmentation [13].

In this paper, we propose an improved YOLOv5s algorithm based on deep learning to achieve fast and high-accuracy checking of construction materials while being easier to deploy on mobile devices. The main contributions of this article are summarized as follows:

1. The YOLOv5s backbone network is replaced with the more lightweight ShuffleNetV2 network, reducing model parameters and computational cost. ShuffleNetV2 is also improved to further reduce model complexity and improve detection performance.
2. Structured pruning is applied to the network after the backbone to further reduce the number of model parameters and the computational effort.
3. Results show that, compared with traditional models, the number of parameters and FLOPs are significantly reduced by 60.57% and 66.45%, respectively, with an acceptable sacrifice in detection accuracy.

This paper is organized as follows. Section 2 presents the overall framework design of the model. Section 3 presents the dataset and experimental design. Section 4 gives the experimental results and analysis. Section 5 summarizes the whole paper.


2 Overall Framework Design of the Visual Checking Network Model for Building Materials

The overall framework of the network model is shown in Fig. 1. It consists of five parts: Input, Backbone, Neck, Head, and Count. The input is a three-channel RGB image. The backbone network uses a modified lightweight ShuffleNetV2 to reduce model parameters and computation. The feature-fusion layer (Neck) uses a Path Aggregation Network (PANet) to fuse the effective feature maps output by the Backbone, mixing information across different feature layers and obtaining richer feature information. The detection part (Head) has three YOLO Head detectors, which output feature maps at different scales for target prediction. Finally, the detected construction materials are counted in the Count section.

Fig. 1. Overall framework of the model

2.1 Lightweighting Approach Based on the Improved-ShuffleNetV2 Module

Network Restructuring. The ShuffleNetV2 network was designed mainly for classifying the 1000 categories of the ImageNet dataset, but this paper involves no such complex classification requirement, so an excessively deep network is unnecessary. This article therefore changes the stacking numbers of basic unit 1 in ShuffleNetV2 to the combination (2, 5, 2), reducing the depth of the ShuffleNetV2 network and further cutting its parameters and computation.

Activation Function. The Swish activation function has been shown to outperform ReLU and to improve neural-network performance. MobileNetV3 proposed the H-Swish activation function, which is smooth, non-monotonic, and simple to differentiate, with less computational overhead than Swish.

Basic Unit Improvement. In this paper, the original 3 × 3 depthwise convolution (DWConv) in basic unit 1 of ShuffleNetV2 is replaced by a 3 × 3 dilated convolution (DConv). On the one hand, this expands the receptive field of ShuffleNetV2


in the feature extraction process. On the other hand, it avoids losing feature information and an excessive growth in the number of network parameters.

2.2 BN Layer-Based Channel Pruning Method

The channel pruning scheme based on the BN layer is shown in Fig. 2. First, through sparse training, the γ parameters of the BN layers in the network are made progressively smaller, sparsifying the network and facilitating the selection and pruning of channels. Second, by controlling the size of the hyperparameter λ, the network attains a moderate sparsity, which makes it easier to select the channels that contribute little to the network.
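The channel-selection step can be sketched as follows; `channels_to_prune` and the global-percentile threshold rule are illustrative assumptions, since the paper does not specify its exact threshold rule:

```python
import numpy as np

def channels_to_prune(bn_gammas, prune_ratio=0.3):
    """Pick BN channels whose |gamma| falls below a global threshold.

    bn_gammas: dict mapping layer name -> array of BN scaling factors gamma.
    Returns {layer: boolean mask}, True = keep the channel.
    Hypothetical helper for illustration only.
    """
    all_g = np.concatenate([np.abs(g) for g in bn_gammas.values()])
    thresh = np.quantile(all_g, prune_ratio)   # one cut across all layers
    return {name: np.abs(g) > thresh for name, g in bn_gammas.items()}

gammas = {"layer1": np.array([0.9, 0.01, 0.5]),
          "layer2": np.array([0.02, 0.7])}
masks = channels_to_prune(gammas, prune_ratio=0.4)
kept = sum(int(m.sum()) for m in masks.values())
print(kept)                                    # 3 channels survive pruning
```

Channels whose γ has been driven near zero by sparse training contribute almost nothing to the layer output, so removing them compacts the model with little accuracy loss.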

Fig. 2. Schematic diagram of channel pruning based on BN layer

3 Dataset and Experimental Design

3.1 Construction Material Image Dataset Construction

In this paper, rebar is taken as the example building material. The dataset consists of 900 rebar images from two sources; some of the collected rebar images are shown in Fig. 3. The first part comes from the Data Fountain dataset, with a total of 250 labeled rebar images. The second part consists of 650 unlabeled rebar images taken on construction sites. The rebar image dataset is divided into a training set and a test set at a ratio of 8:2 for model training and testing, respectively.
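The 8:2 split described above can be sketched as follows (a minimal helper, assuming a fixed-seed shuffle; the paper does not state how the split was randomized):

```python
import random

def split_dataset(image_ids, train_frac=0.8, seed=0):
    """Shuffle image ids and split them into train/test at the given ratio."""
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)        # fixed seed for reproducibility
    cut = int(len(ids) * train_frac)
    return ids[:cut], ids[cut:]

# 900 rebar images, as in the paper, give a 720/180 split
train, test = split_dataset(range(900))
print(len(train), len(test))                # 720 180
```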

Fig. 3. Rebar image dataset


3.2 Experimental Environment and Configuration

3.2.1 Evaluation Indicators

mAP. AP measures the performance of the model on each category, while mAP measures performance across all categories and is the average of all APs.

Number of Parameters. The number of parameters in the algorithm model.

FLOPs. The number of floating-point operations, i.e., the amount of computation, used to measure the complexity of an algorithm or model.
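To make the Params metric concrete, the following sketch compares the parameter count of a standard 3 × 3 convolution with a depthwise-separable one of the kind used in ShuffleNet-style blocks (the channel width of 116 is illustrative, not taken from the paper):

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k conv followed by a 1 x 1 pointwise conv."""
    return c_in * k * k + c_in * c_out

std = conv_params(116, 116, 3)       # standard convolution
dws = dw_separable_params(116, 116, 3)  # depthwise-separable equivalent
print(std, dws)                      # 121104 14500
```

The order-of-magnitude gap between the two counts is the main reason lightweight backbones such as ShuffleNetV2 achieve such large reductions in Params and FLOPs.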

3.3 Main Experimental Parameters of the Algorithm Model

The main training parameters for the experiments in this paper are shown in Table 1.

Table 1. Training parameters

| Hyperparameters | Variables |
|---|---|
| Img | 900 |
| Training set | 720 |
| Test set | 180 |
| Batch size | 32 |
| Epoch | 300 |
| Optimizer | Adam |
| Learning rate | 0.001 |

4 Experimental Results and Analysis

4.1 Experimental Analysis of Lightweighting Based on the Improved-ShuffleNetV2 Module

Model Comparison and Analysis. The experimental results of the models above are compared in Table 2, where YOLOv5s is the baseline model, ShuffleNetV2-YOLOv5s denotes the model after replacing the original backbone network, and Improved-ShuffleNetV2-YOLOv5s denotes the model with the improved ShuffleNetV2 backbone (network restructuring + the H-Swish activation function + dilated convolution in basic unit 1). As can be seen from Table 2, the ShuffleNetV2-YOLOv5s model shows a significant decrease in FLOPs and Params, of 49.36% and 46.14%, respectively,

Table 2. Comparison of the effects of different models

| Models | FLOPs | Params | mAP@0.5 |
|---|---|---|---|
| YOLOv5s | 15.8 GFLOPs | 7.02 M | 0.992 |
| ShuffleNetV2-YOLOv5s | 8.0 GFLOPs | 3.79 M | 0.974 |
| Improved-ShuffleNetV2-YOLOv5s | 7.5 GFLOPs | 3.57 M | 0.979 |

with a significant lightweighting effect. However, the detection accuracy also drops noticeably: the mAP value decreases by 1.8 percentage points. After the backbone ShuffleNetV2 is improved, Improved-ShuffleNetV2-YOLOv5s further reduces model complexity while the detection accuracy recovers slightly, by 0.5 percentage points.

4.2 Analysis of Structured Pruning Experiments

Sparse Training. During sparse training, the scaling factor γ is added to the loss function for joint training, with the sparsity factor set to 0.0002. The results of sparse training are shown in Fig. 4. As the sparse training proceeds, the scaling factor γ of each layer gradually stabilizes, and the γ of some BN layers is driven close to 0, so the basic conditions for pruning the model are met. In the subsequent pruning, a pruning threshold is set; channels whose γ is smaller than the threshold are pruned, and the model becomes more compact after the redundant channels are removed.
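The sparse-training objective described above, i.e., the detection loss plus an L1 term λΣ|γ| on the BN scaling factors with λ = 0.0002, can be sketched as follows (the task-loss value below is a placeholder, not a result from the paper):

```python
import numpy as np

def sparsity_penalty(bn_gammas, lam=0.0002):
    """L1 penalty on BN scaling factors gamma, added to the detection loss
    during sparse training; lam matches the paper's sparsity factor.
    """
    return lam * sum(np.abs(g).sum() for g in bn_gammas)

gammas = [np.array([0.9, 0.01, 0.5]), np.array([0.02, 0.7])]
task_loss = 1.25                          # placeholder detection loss value
total = task_loss + sparsity_penalty(gammas)
print(round(total, 6))
```

Because the penalty grows with |γ|, gradient descent pushes unimportant channels' scaling factors toward zero, producing the near-zero γ values that make pruning safe.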

Fig. 4. Sparse training results

Model Pruning. To verify the lightweighting effect of pruning on the constructed material visual-checking model, the models before improvement, after improvement, and after pruning are compared; the results are shown in Table 3. The model before improvement is the original YOLOv5s; the improved model is Improved-ShuffleNetV2-YOLOv5s; and pruning the improved model yields Improved-ShuffleNetV2-Prune-YOLOv5s, abbreviated ISP-YOLOv5s, which is the final model of this paper.

As can be seen from Table 3, the pruned model ISP-YOLOv5s decreases in FLOPs and Params by 29.33% and 22.12%, respectively, compared to the unpruned


Table 3. Comparison of model effect before and after pruning

| Models | FLOPs | Params | mAP@0.5 |
|---|---|---|---|
| Before improvement | 15.8 GFLOPs | 7.02 M | 0.992 |
| After improvement | 7.5 GFLOPs | 3.57 M | 0.979 |
| After pruning | 5.3 GFLOPs | 2.78 M | 0.977 |

improved model. Pruning further reduces the model complexity and weight, while the detection accuracy of 0.977 is only 0.2 percentage points lower, achieving the model-compression effect expected in this paper.

4.3 Comparison and Analysis of Construction Material Checking Effect

The final model (ISP-YOLOv5s) is used to test the detection and checking effect on the rebar image dataset; the detection results on some of the rebar images are shown in Fig. 5.

Fig. 5. Rebar image detection checking effect

5 Conclusion

Although this paper has achieved certain research results in model lightweighting, many shortcomings remain, and future research can proceed along the following lines. Models: apply the method proposed in this paper to the latest models, such as YOLOv7 and YOLOv8. Deployment: deploy the proposed models on embedded AI platforms such as the JETSON Nano so that they can be used in real applications. Rebar images: study in depth untidy placement of rebar targets, more diverse shooting angles, and large numbers of rebar targets, e.g., 300 or more.


Acknowledgements. This work is partially supported by the Research Fund Project of China Academy of Building Research Limited: BIM-based Construction Project Management Application Platform (No. 20221802330730016), and the Science and Technology Project of Chongqing Water Resources Bureau: Research and application of key technologies of intelligent construction of large-scale water transfer projects based on BIM (Yu Xi Water Division Wen [2021] No. 19).

References

1. Du, C.Y., Zhang, Z.B.: Exploration of wisdom application during the construction period of large water conservancy projects. Water Resour. Informatization 4, 7 (2021)
2. Su, Z., Fang, K., Peng, Z., et al.: Rebar automatically checking on the product line. In: 2010 IEEE International Conference on Progress in Informatics and Computing, pp. 756–760 (2010)
3. Zhang, D., Xie, Z., Wang, C.: Bar section image enhancement and positioning method in on-line steel bar. In: 2008 Congress on Image and Signal Processing, pp. 319–323 (2008)
4. Hou, W., Duan, Z., Liu, X.: A template-covering based algorithm to count the bundled steel bars. In: 2011 4th International Congress on Image and Signal Processing, pp. 1813–1816 (2011)
5. Wang, J., Hao, C., Xu, X.: Pattern recognition for checking of bounded bar steel. In: Fourth International Conference on the Applications of Digital Information and Web Technologies (ICADIWT 2011), pp. 173–176 (2011)
6. Thammasorn, P., Boonchu, S., Kawewong, A.: Real-time method for checking unseen stacked objects in mobile. In: 2013 IEEE International Conference on Image Processing, pp. 4103–4107 (2013)
7. Lee, J.H., Park, S.O.: Machine learning-based automatic reinforcing bar image analysis system in the internet of things. Multimedia Tools Appl. 78(5), 3171–3180 (2019)
8. Li, Y., Chen, J.: Computer vision-based checking model for dense steel pipe on construction sites. J. Constr. Eng. Manag. 148(1), 04021178 (2022)
9. Min, H.Y., Chen, C.M., Liu, G.H., et al.: Improved algorithm for dense rebar checking based on RetinaNet. Sens. Microsyst. 39(12), 115–118 (2020)
10. Shi, J.L.: Research on steel checking algorithm based on convolutional neural network. Huazhong University of Science and Technology (2019)
11. Fan, Z., Lu, J., Qiu, B., et al.: Automated steel bar checking and center localization with convolutional neural networks (2019). arXiv preprint arXiv:1906.00891
12. Zhu, Y., Tang, C., Liu, H., et al.: End-face localization and segmentation of steel bar based on convolution neural network. IEEE Access 8, 74679–74690 (2020)
13. Cheng, Z., Wu, Y., Xu, Z., et al.: Segmentation is all you need (2019). arXiv preprint arXiv:1904.13300

Research and Application of Automatic Screening Technology for Marketing Inspection Abnormalities Based on Knowledge Graph

Dan Lu1, Linjuan Zhang1, Yiming Xu1, Changqing Xu1, Hefa Sun1, Hongyang Yin2, and Min Xia2(B)

1 State Grid Henan Economics Research Institute, Zhengzhou 450052, China
2 Nanjing University of Information Science and Technology, Nanjing 210044, China

[email protected]

Abstract. With the rapid development of electric power enterprises and the Internet, a large amount of data has been generated in marketing activities. Marketing inspection is an important means of safeguarding the power market; however, traditional marketing inspection methods have many limitations when dealing with massive data, such as low screening efficiency and poor accuracy. In response to this problem, this paper proposes a knowledge graph-based optimization technology for the power marketing inspection system. Marketing-inspection-related data are collected from within the power grid and preprocessed. According to the needs of marketing inspection, a knowledge graph model is designed, comprising three elements: entities, attributes, and relationships. Then, using technologies such as entity extraction, relationship extraction, and knowledge fusion, the related data are mapped onto the knowledge graph, forming a complete marketing inspection knowledge graph. Experimental results show that the proposed optimization technology can effectively establish a marketing inspection knowledge graph and improve the efficiency and accuracy of power marketing inspection, with high practicality and promotion value.

Keywords: Knowledge graph · Power marketing inspection · Entity recognition · Relationship extraction · Inference algorithm

1 Introduction

Power companies are using big data technology and diverse marketing methods, which creates a large amount of marketing data that is valuable but complex. This makes power marketing inspection more difficult and challenging. Power marketing inspection is important for standardizing the power marketing market and improving the economic benefits of power companies. However, many power companies still use manual inspection methods, which waste time and make it difficult to find clear inspection clues. The main inspection methods used, spot checks and thorough checks, have limitations and rely on personal experience: spot checks may miss errors due to their limited scope, and thorough inspections require substantial resources to carry out.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 747–755, 2023. https://doi.org/10.1007/978-981-99-6187-0_74


A knowledge graph (KG) represents entities and their relationships in a graphical format [1]. It can be used to manage and extract information intelligently, allowing the power grid's stability situation to be understood quickly and accurately [2, 3]. KGs can also provide intelligent decision support, improving the power grid's active security defense [4, 5]. By constructing a knowledge graph model from existing rules and historical samples, an intelligent scene-inspection model can be established and applied in relevant business scenarios. This allows deep analysis of feature correlations among multi-dimensional inspection data and quick, accurate screening of core marketing business problems and risks, improving inspection efficiency and accuracy. Establishing an information-based, intelligent supporting platform can advance the company's information interconnection and intelligent transformation.

Researchers have applied knowledge graph technology to the power field. QIAO et al. [6] proposed a knowledge graph application framework for assisting decision-making in power grid fault handling, which realizes fault information parsing and discrimination, intelligent decision support, and multidimensional human-computer interaction, and summarized the applications and challenges of knowledge graphs in intelligent decision support. PU et al. [7] proposed the NoDKG framework for a power-domain knowledge graph based on the characteristics and trends of power IoT data; it provides a technical roadmap and implementation plan for applying the power-domain knowledge graph in specific scenarios, such as fault handling, maintenance work-order processing, and customer service. GUO et al. [8] proposed a hybrid knowledge graph construction method for power grid fault handling, combining top-down and bottom-up approaches, and also proposed a power-domain named entity recognition method using an LR-CNN model with a reflective mechanism to support knowledge extraction. LI et al. [9] proposed a KG-based fault diagnosis method that uses a model composed of a bidirectional long short-term memory network (BiLSTM) and a conditional random field (CRF) to achieve entity extraction, and applies knowledge graph construction technology to build and use a power grid fault-handling knowledge graph. YE et al. [10] proposed a method for the KG of power distribution network fault handling and used a pre-training method to construct deep learning models for named entity recognition on fault-handling data. ZHOU et al. [11] described the construction of the stability-regulation knowledge model from the perspectives of the conceptual model and the implementation model and, combined with a complex event processing (CEP) engine, proposed a digital stability-regulation solution based on the digital twin model.

These studies show that knowledge graph technology is valuable in the power industry, but its application to the automatic screening of anomalies in marketing inspections remains limited. This paper therefore proposes a knowledge graph-based automatic screening technique. The technique analyzes inspection data and forms entity/attribute/relationship triples to build knowledge graphs under the existing rules, helping to screen core business problems and hidden risks in marketing inspection quickly and improving the efficiency and accuracy of inspection.
The main contributions of this paper are as follows: a BERT-based model is trained on constructed samples to extract entities, attributes, and relationships, improving the speed of knowledge modeling; a reasoning model based on conditional rules is constructed to improve the accuracy of inspection-exception reasoning;


and the technique establishes a model of inspection exceptions, constructs a domain-specific knowledge graph, and determines two ways of using the knowledge graph.

2 Modeling the Power Grid Knowledge Graph

The logical structure of a knowledge graph has a pattern layer (or concept layer) and a data layer. The former is the core of the knowledge graph, determining which types of knowledge are represented conceptually; the latter holds the specific data. Traditional knowledge-extraction architectures (rule templates, domain dictionaries, clustering, SVMs, decision trees, CRFs) are time-consuming for feature extraction and suitable only for scenarios with little data. With the development of deep learning, knowledge graphs built on deep-learning models can handle a wider range of application scenarios. This paper presents a logical construction framework of the knowledge graph for marketing inspection texts, as shown in Fig. 1.

Fig. 1. Logical construction framework of knowledge graph.

The most important step in building a knowledge graph is knowledge identification, which divides into two tasks: named entity recognition and relation extraction. The main task is to extract entities and their relationships from data. This article uses an unstructured-information-processing method to extract knowledge elements such as anomaly types and their corresponding data attributes, and studies how to complete basic data cleaning on multi-source heterogeneous marketing inspection data under the existing inspection rules. Then, entity recognition of the business inspection rule knowledge is realized with natural language processing (NLP) algorithms, which translate the content of business rules and their language descriptions between computer language and natural language and achieve named entity recognition (NER). Deep-learning algorithms perform association operations on the sample information data


of business rule knowledge graphs. Based on the degree of association obtained, the relationships between knowledge entities and related terms are identified, extracting {entity-relation-entity} triples from unstructured text, which can easily be stored in the Neo4j graph database [12, 13].

2.1 Entity Extraction

Entity extraction is a natural-language-processing task that identifies entities of specific semantic categories in a given text, such as user information or device names. This article uses a BERT-based entity extraction model; the overall entity recognition model for marketing inspection is shown in Fig. 2.
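Returning to the triple storage mentioned above, a minimal sketch of rendering one extracted {entity-relation-entity} triple as a Neo4j Cypher statement; the `Entity` label and relationship naming are assumptions, and real code should use parameterized queries rather than string formatting:

```python
def triple_to_cypher(e1, rel, e2):
    """Render an {entity, relation, entity} triple as a Cypher MERGE statement.

    MERGE creates the nodes and relationship only if they do not already
    exist, which keeps repeated loads of the same triple idempotent.
    Illustrative only; labels and relation names are hypothetical.
    """
    return (f"MERGE (a:Entity {{name: '{e1}'}}) "
            f"MERGE (b:Entity {{name: '{e2}'}}) "
            f"MERGE (a)-[:{rel}]->(b)")

stmt = triple_to_cypher("User A", "HAS_ANOMALY", "Meter Replacement Abnormality")
print(stmt)
```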

Fig. 2. Entity Extraction Model Architecture

For marketing inspection text as the input sequence, word vectors are obtained through a joint word-embedding model after part-of-speech tagging. At this point, words are mapped into a semantic space where semantically adjacent words are closer together. The word vectors pass through the Transformer layer to extract text features, outputting probabilities for the labels. The model uses the loss function as the constraint condition; when the loss function is minimal, the model obtains the optimal solution. The loss function for the named entity recognition model is defined as:

loss_NER = −log(P_true / ΣP)    (1)

P = e^s    (2)

s = s_l + s_t    (3)

In formulas (1) to (3), P denotes a possible label path of a sentence; s_l denotes the predicted label score of the Transformer module, and s_t denotes the association


scores between the preceding and following labels; P_true is the true path score, representing the correct sentence labeling, and ΣP is the comprehensive score over all paths, representing all possible sentence labelings. When P_true/ΣP is maximal, the Transformer's predicted label score and the association score between the preceding and following labels are highest; at this point the loss reaches its minimum and model training is complete. During training iterations, the model parameters are forward-propagated and the loss value is obtained from the loss function; the closer this value is to 0, the smaller the deviation and the higher the model's accuracy. The model updates its weight parameters with the Adam gradient-descent optimization algorithm. The loss and accuracy curves of the BERT model's entity recognition over 25 iterations are shown in Fig. 3.
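A small numerical sketch of Eqs. (1)–(3), computing the loss for a toy set of path scores with a log-sum-exp for numerical stability (the helper names are mine, not from the paper):

```python
import numpy as np

def ner_path_loss(true_score, all_scores):
    """CRF-style loss: negative log of the true path's share of all paths.

    true_score: s for the gold label sequence; all_scores: s for every
    candidate path (including the gold one). Mirrors Eqs. (1)-(3):
    P = e^s and loss = -log(P_true / sum(P)).
    """
    all_scores = np.asarray(all_scores, dtype=float)
    m = all_scores.max()                         # log-sum-exp for stability
    log_z = m + np.log(np.exp(all_scores - m).sum())
    return log_z - true_score                    # == -log(e^s_true / sum e^s)

loss = ner_path_loss(true_score=3.0, all_scores=[3.0, 1.0, 0.5])
print(round(float(loss), 4))
```

The loss shrinks toward 0 as the true path's score dominates the sum over all paths, matching the description above of the minimum being reached when P_true/ΣP is maximal.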

Fig. 3. Loss Function and Accuracy Curve

As can be seen from Fig. 3, near the end of the curve the model's accuracy no longer improves and the loss no longer converges, because excessive training leads to overfitting of the model. The final model accuracy reaches 97.81% on the training set and 86.57% on the test set. After training is complete, the model's weight parameters are saved, i.e., the model is saved, for testing purposes.

2.2 Relation Extraction

Template matching is a rule-based relation extraction method that extracts relationships between specific entities in a text according to predefined templates. It has the advantages of clear rules and good domain adaptability. This method analyzes the descriptive features of entities to infer the nature of the relationship between them and provide relationship definitions [14, 15]. The output is the matched entity relationships in a fixed format, such as triplets {Entity1, Relationship, Entity2}. Due to the limited number of training samples, a semi-supervised co-training method is used for classification. Before relation classification, candidate word pairs need to be formed from the training set with the help of a power industry dictionary. Finally, by combining the entity recognition model and the relation extraction model, unstructured

752

D. Lu et al.

inspection data are extracted. This paper conducts experiments on the performance and extraction effect of the joint extraction model on the abnormal-subject dataset of power marketing inspection, which includes 79 subject names and corresponding subject descriptions. Experimental results show that the highest F1-score of the joint extraction model on the inspection theme dataset is 0.88; the extraction results are shown in Table 1. Issues such as an excessively high meter replacement frequency potentially leading to system abnormalities, zero electricity usage within a year possibly indicating that the user has not used electricity or that there is a meter malfunction, and frequent meter replacement records suggesting potential meter problems or abnormal replacement procedures all fall under the topic of 'Meter Replacement Abnormality'.

Table 1. Example of the extraction performance of the model.

Unstructured text input: Theme name: Meter Replacement Abnormality. Theme description: Users who have replaced their electric meters two or more times within a year, have had zero electricity consumption within the same year, and have a record of meter replacement within this month are considered abnormal data.

Model extraction results: {Meter replacement times, judgment, Meter Replacement Abnormality}, {Meter replacement times, electric meter characteristics, user's electric meter}, {Electricity consumption, judgment, Meter Replacement Abnormality}, {Zero electricity consumption, electricity consumption characteristics, electricity consumption}
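As an illustration of how such template matching could be implemented, the following is a minimal Python sketch; the regex pattern, the relationship label, and the example sentence are hypothetical illustrations, not the authors' actual templates:

```python
import re

# Hypothetical templates: each maps a regex with two capture groups
# (the two entities) to a relationship label.
TEMPLATES = [
    (re.compile(r"replaced the (electric meter) (two or more times)"),
     "replacement frequency"),
]

def extract_triplets(text, templates):
    """Return {Entity1, Relationship, Entity2} triplets matched in text."""
    triplets = []
    for pattern, relation in templates:
        for m in pattern.finditer(text):
            triplets.append((m.group(1), relation, m.group(2)))
    return triplets
```

In practice each predefined template would encode one descriptive pattern from the inspection texts, and the matched triplets would then be written to the graph database.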

2.3 Establishment of a Query Rule Base for the Knowledge Graph

The construction of the rule base starts from the design rules summarized by domain experts, which professional personnel then rewrite into production rules and test. The process diagram is shown in Fig. 4. The rule parser is responsible for rewriting “IF…THEN…” statements into Python program logic, and is part of the hybrid reasoning engine in the design system.

Fig. 4. The construction of the rule base


Simple design rules are composed by combining and nesting “IF…THEN…” statements, fact judgments, and logical operators, and are parsed by the backend service program. Table 2 demonstrates several inspection abnormality rules expressed in the production style; other types of rules can be built by modifying and combining these basic ones.

Table 2. Modification of the query rule base

Rule description: Replacing the electric meter two or more times within a year
Rule type: A logical conclusion is obtained by a single fact judgment, resulting in a TRUE/FALSE value
Writing style: IF(“Number of electric meter replacements in the current year” >= 2), True

Rule description: No power outage was implemented for the transformer station
Rule type: A factual conclusion is obtained by a single fact judgment
Writing style: IF(“Transformer station power outage behavior” != “Already implemented”), False
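A minimal sketch of how such production rules might be evaluated once parsed into Python; the fact names and the rule data structure are illustrative assumptions, not the system's actual implementation:

```python
# Illustrative production rules: each rule pairs a name and an IF-condition
# (a predicate over a dict of facts) with the THEN-value it yields.
RULES = [
    ("meter_replacement_abnormality",
     lambda facts: facts["meter_replacements_this_year"] >= 2, True),
    ("no_outage_implemented",
     lambda facts: facts["station_outage_behavior"] != "Already implemented", False),
]

def evaluate_rules(facts, rules):
    """Fire every rule whose IF-part holds and collect its THEN-value."""
    conclusions = {}
    for name, condition, value in rules:
        if condition(facts):
            conclusions[name] = value
    return conclusions
```

Nested rules would then be built by combining such predicates with the logical operators mentioned above.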

3 Example of the Application of a Knowledge Graph

In order to verify the practical application effect of the deep-learning-based electricity marketing inspection rule knowledge graph constructed above, this article conducts experimental research using the business situation and work order data of a certain company in this quarter as sample data. The experiment used a computer equipped with the Windows 10 system and the Neo4j database as the basic equipment. Based on the techniques proposed above, a power grid KG management system was successfully applied to a provincial power grid. The marketing inspection knowledge graph can search for various nodes and relationships in the power grid through query statements. The Neo4j graph database is queried with the Cypher language using the “match” command; the query results are shown in Fig. 5.

Fig. 5. Query results.
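By way of illustration, a Cypher query of the kind described might look like the following; the node labels and property names are assumptions, since the paper does not list its actual graph schema. The sketch only builds the parameterized query string — executing it would require a live Neo4j instance (e.g. via the official `neo4j` Python driver's `session.run`):

```python
def build_theme_query(theme_name):
    """Build a parameterized Cypher MATCH query returning abnormal
    instances linked to a given inspection theme (hypothetical schema)."""
    cypher = (
        "MATCH (t:Theme {name: $name})<-[:BELONGS_TO]-(a:AbnormalInstance) "
        "RETURN a.description AS description"
    )
    return cypher, {"name": theme_name}

query, params = build_theme_query("Meter Replacement Abnormality")
# Against a live database the query would be executed roughly as:
#   with driver.session() as session:
#       result = session.run(query, **params)
```

Parameterizing the theme name (rather than string-concatenating it) is the usual way to let Neo4j cache the query plan and avoid injection issues.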

754

D. Lu et al.

In addition, the marketing inspection theme data used in this article comprised 79 themes and 4875 pieces of user abnormal data. After entity extraction, relation extraction, and knowledge fusion were performed on the text, the marketing inspection knowledge graph was visualized using the Neo4j graph database. Some abnormal instances and their corresponding abnormal theme results are shown in Fig. 6.

Fig. 6. Partial abnormal instances and their corresponding abnormal themes.

4 Conclusion

In order to establish an information-based intelligent query system for electric power enterprises, this paper studies the construction of knowledge graphs for marketing audits and proposes a solution for the automatic screening of abnormal problems in marketing audits. Automatic screening and retrieval of information is realized by building a knowledge-graph-based Neo4j database. The system uses simple and intuitive knowledge graphs to visually display marketing inspection abnormalities in layers, so that inspectors can better manage marketing activities and accurately grasp abnormal information of the power grid. Our research shows that, compared with traditional manual methods, our model demonstrates significant advantages in efficiency and accuracy. This implies that, when dealing with large-scale and complex data, our method can provide faster and more accurate results, thereby advancing technological progress in this field.


References

1. Zhang, J., Zhang, X., Wu, C., et al.: Survey of knowledge graph construction techniques. Comput. Eng. 48, 23–37 (2022)
2. Li, M., Tao, H., Xu, H., Liu, J., Zhang, Q., et al.: The technical framework and application prospect of artificial intelligence application in the field of power grid dispatching and control. Power Syst. Technol. 44(2), 393–400 (2020)
3. Guangyi, L., Jiye, W., Yang, L., et al.: “One graph of power grid” spatio-temporal information management system. Electric Power Inf. Commun. Technol. 18(1), 7–17 (2020)
4. Jundong, W., Jun, Y., Yangzhou, P., et al.: Distribution network fault assistant decision-making based on knowledge graph. Power Syst. Technol. 45(6), 2101–2112 (2021)
5. Zhang, R., Liu, J., Zhang, B., et al.: Research on grid fault handling knowledge graph construction and real-time auxiliary decision based on transfer learning. Electric Power Inf. Commun. Technol. 20(6), 24–34 (2022)
6. Ji, Q., Xinying, W., Rui, M., et al.: Framework and key technologies of knowledge-graph-based fault handling system in power grid. Proc. CSEE 40(18), 5837–5848 (2020)
7. Tianjiao, P., Yuanpeng, T., Guozheng, P., et al.: Construction and application of knowledge graph in the electric power field. Power Syst. Technol. 45(6), 2080–2091 (2021)
8. Rong, G., Qun, Y., Shaohan, L., et al.: Construction and application of power grid fault handling knowledge graph. Power Syst. Technol. 45(6), 2092–2100 (2021)
9. Jinxing, L., Xiang, L., Tianlu, G., et al.: Research and application of fault handling based on power grid multivariate information knowledge graph. Electric Power Inf. Commun. Technol. 19(11), 30–38 (2021)
10. Xinzhi, Y., Lei, S., Xuzhu, D., et al.: Knowledge graph for distribution network fault handling. Power Syst. Technol. 46(10), 3739–3748 (2022)
11. Erzhuan, Z., Siyuan, Z., Jianfeng, Y., et al.: Decision knowledge modeling and implementation for power grid dispatching and control. Proc. CSEE 42(14), 5057–5066 (2022)
12. Zhang, H., Liu, X., Pan, H., Song, Y., Leung, C.W.K.: ASER: a large-scale eventuality knowledge graph. In: Proceedings of the Web Conference, pp. 201–211 (2020)
13. Deb, R., Roy, S.: A software defined network information security risk assessment based on Pythagorean fuzzy sets. Expert Syst. Appl. 183, 115383 (2021)
14. Zhang, Y., Guo, Z., Lu, W.: Attention guided graph convolutional networks for relation extraction. IEEE Trans. Netw. Sci. 6, 159–168 (2019)
15. Peng, N., Poon, H., Quirk, C., Toutanova, K., Yih, W.: Cross-sentence N-ary relation extraction with graph LSTMs. Trans. Assoc. Comput. Linguist. 5, 101–115 (2017)

Global Asymptotic Synchronization of Nonlinear Hyperchaotic Financial Systems via Hybrid Control and Adaptive Projection Control

Guoliang Cai(B), Haojie Yu, Yanfeng Ding, and Huimin Liu

Institute of Applied Mathematics, Zhengzhou Shengda University, Zhengzhou 451191, China
[email protected]

Abstract. In this paper, the hybrid synchronization and adaptive projective synchronization of a nonlinear hyperchaotic financial system are discussed. Based on the Routh–Hurwitz criterion and Lyapunov stability theory, we design special controllers to realize the global asymptotic synchronization of hyperchaotic financial systems under different initial conditions. When the system parameters are known, hybrid feedback controllers are designed and the hybrid synchronization of the system is investigated; the synchronization error system between the drive system and the response system is asymptotically stable. The increased-order synchronization of two financial systems of different dimensions is also investigated. When the parameters are unknown, adaptive control is extended to the projective synchronization of two different hyperchaotic financial systems, and the parameter estimation update law and a special adaptive controller are designed. The numerical simulation results show that the proposed schemes and theoretical analysis results are effective.

Keywords: Nonlinear hyperchaotic finance system · Hybrid control · Adaptive projective control

1 Introduction

The widespread application of chaotic synchronization in various fields has received in-depth research from scholars. Chaos control and chaos synchronization of nonlinear hyperchaotic financial systems have received increasing attention due to their practical applications [1–5]. Many experts and scholars have proposed methods for chaos control and synchronization [6–10], such as adaptive synchronization, projective synchronization, hybrid synchronization, and so on. This paper proposes hybrid feedback control and adaptive control for the synchronization between a four-dimensional nonlinear hyperchaotic finance system and other financial systems. The nonlinear hyperchaotic financial system can be described as a set

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 756–764, 2023. https://doi.org/10.1007/978-981-99-6187-0_75


of nonlinear ordinary differential equations [2]:

x˙ = z + (y − a)x + w
y˙ = 1 − by − x²
z˙ = −x − cz
w˙ = −dxy − kw    (1)
The system variables x, y, z, and w represent the interest rate, investment demand, price index, and average profit margin, respectively. The system parameters a, b, and c represent the savings amount, the cost per investment, and the demand elasticity of commodities, respectively; d and k are uncertain parameters. If the parameters a, b, c, d, and k of system (1) are selected as 0.9, 0.2, 1.5, 0.2, and 0.17, respectively, it is not difficult to calculate the four Lyapunov exponents of system (1) using the Wolf algorithm: 0.034432, 0.018041, 0, and −1.1499. A three-dimensional phase diagram of system (1) is shown in Fig. 1. Some scholars and experts have conducted further research on system (1) and obtained many useful results [11–20]. For more information on system (1), please refer to paper [2].

Fig. 1. Hyperchaotic attractors.
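A quick numerical sketch of system (1) with these parameter values, using SciPy's `solve_ivp`; the integration span and initial state are arbitrary choices for illustration, not taken from the paper:

```python
import numpy as np
from scipy.integrate import solve_ivp

a, b, c, d, k = 0.9, 0.2, 1.5, 0.2, 0.17

def finance_system(t, s):
    """Right-hand side of the hyperchaotic finance system (1)."""
    x, y, z, w = s
    return [z + (y - a) * x + w,
            1.0 - b * y - x**2,
            -x - c * z,
            -d * x * y - k * w]

# Integrate from an arbitrary initial state; the trajectory remains
# bounded on the hyperchaotic attractor.
sol = solve_ivp(finance_system, (0.0, 100.0), [1.0, 2.0, 0.5, 0.5],
                max_step=0.01)
```

Plotting `sol.y[0]` against `sol.y[1]` and `sol.y[2]` reproduces a phase portrait of the kind shown in Fig. 1.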

2 Hybrid Synchronization Between Two Identical Systems via Hybrid Feedback Control with Known Parameters

Chaotic economics has had a profound impact on mainstream economics. The occurrence of chaos in an economic system indicates the existence of instability in the macro economy. In this section, hybrid synchronization between two identical systems via hybrid feedback control with known parameters is discussed. Firstly, system (1) is used as the drive system, which can be represented by the following nonlinear ordinary differential equations:

x˙1 = z1 + (y1 − a)x1 + w1
y˙1 = 1 − by1 − x1²
z˙1 = −x1 − cz1
w˙1 = −dx1y1 − kw1    (2)

System (2) can be rewritten as x˙ = Ax + Bg


where

x˙ = [x˙1, y˙1, z˙1, w˙1]ᵀ,  x = [x1, y1, z1, w1]ᵀ,

A = [ −a   0   1   1
       0  −b   0   0
      −1   0  −c   0
       0   0   0  −k ],

B = [  1  0
       0  1
       0  0
      −d  0 ],

g = [ x1y1
      −x1² + 1 ].
Correspondingly, the response system is represented as follows:

y˙ = Ay + Bh + u    (3)

where y˙ = [x˙2, y˙2, z˙2, w˙2]ᵀ, y = [x2, y2, z2, w2]ᵀ, h = [x2y2, −x2² + 1]ᵀ, and the control functions are u(t) = [u1(t), u2(t), u3(t), u4(t)]ᵀ. The error system is represented as follows:

e˙ = Ae + B(h − g) + u    (4)
where the synchronization errors are e1 = x2 − x1, e2 = y2 − y1, e3 = z2 − z1, e4 = w2 − w1. We specially design the following controller:

u = B(g − h) + Ke    (5)

where K = (k1, k2, k3, k4) is the feedback matrix, B(g − h) is a nonlinear controller, and Ke is a linear controller; therefore u is a hybrid controller. Substituting controller (5) into (4), the error system can be represented as follows:

e˙ = (A + K)e    (6)
To ensure the asymptotic stability of system (6), it is necessary to select a suitable feedback matrix K such that all eigenvalues of the characteristic matrix A + K have negative real parts. Let

K = [ 0  0  −1  −1
      0  0   0   0
      1  0   0   0
      0  0   0   0 ],

Then the characteristic equation of A + K is (λ + a)(λ + b)(λ + c)(λ + k) = 0. When the parameters a, b, c, d, and k of the system are selected as 0.9, 0.2, 1.5, 0.2, and 0.17, respectively, system (6) is globally asymptotically stable; that is, system (3) asymptotically synchronizes with system (2). In Example 5.1, MATLAB numerical simulation is used to provide the hybrid synchronization results between system (3) and system (2).
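This stability condition is easy to verify numerically; the following sketch checks the eigenvalues of A + K with NumPy for the parameter values above:

```python
import numpy as np

a, b, c, d, k = 0.9, 0.2, 1.5, 0.2, 0.17

A = np.array([[-a, 0.0, 1.0, 1.0],
              [0.0, -b, 0.0, 0.0],
              [-1.0, 0.0, -c, 0.0],
              [0.0, 0.0, 0.0, -k]])
K = np.array([[0.0, 0.0, -1.0, -1.0],
              [0.0, 0.0, 0.0, 0.0],
              [1.0, 0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0, 0.0]])

# A + K is diagonal here, so its eigenvalues are −a, −b, −c, −k,
# matching the characteristic equation (λ+a)(λ+b)(λ+c)(λ+k) = 0.
eigvals = np.linalg.eigvals(A + K)
```

Since all real parts are negative, the error dynamics (6) are indeed asymptotically stable for these parameter values.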


3 Increased-Order Synchronization of Financial Systems of Different Dimensions

The synchronization analysis of two financial systems of different dimensions is introduced in this section. Consider using the three-dimensional chaotic financial system [5] as the drive system, a system of nonlinear ordinary differential equations:

x˙ = z + (y − m)x
y˙ = 1 − ny − x²
z˙ = −x − pz    (7)

When m = 0.9, n = 0.2, p = 1.2, system (7) exhibits chaotic dynamic behavior. Let

X˙0 = [x˙, y˙, z˙]ᵀ,  X0 = [x, y, z]ᵀ,

A0 = [ −m  0   1
        0  −n  0
       −1   0  −p ],

B0 = [ 1  0
       0  1
       0  0 ],

f = [ xy
      −x² + 1 ].

Select system (3) as the response system. The error system is:

e˙ = Ae + AX + Bh − A0X0 − B0f + u    (8)

where the synchronization errors are e1 = x2 − x, e2 = y2 − y, e3 = z2 − z, e4 = w2 − (x + y + z). In order to achieve synchronization, we specially design the following controller:

u = (A0 − A)X − Bh + B0f + Ke

so that u is again a hybrid controller and the error system becomes

e˙ = (A + K)e    (9)

We choose

K = [ 0  0  −1  −1
      0  0   0   0
      1  0   0   0
      0  0   0   0 ],

Then the characteristic equation of A + K is (λ + a)(λ + b)(λ + c)(λ + k) = 0. When the parameters a, b, c, d, and k of the system are selected as 0.9, 0.2, 1.5, 0.2, and 0.17, respectively, system (9) is globally asymptotically stable. In other words, system (3) globally asymptotically synchronizes with system (7). In Example 5.2, MATLAB numerical simulation is used to provide the hybrid synchronization results of system (3) and system (7).
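Because A + K is diagonal for these parameter values, the error dynamics decay component-wise; a short sketch (with initial errors taken from Example 5.2 below) illustrates the convergence:

```python
import numpy as np

# Diagonal entries of A + K for a = 0.9, b = 0.2, c = 1.5, k = 0.17.
lam = np.array([-0.9, -0.2, -1.5, -0.17])
e0 = np.array([2.0, 4.0, 6.0, 2.0])  # initial errors from Example 5.2

def error(t):
    """Exact solution of e˙ = (A + K)e when A + K = diag(lam)."""
    return e0 * np.exp(lam * t)
```

By t = 50 every component has decayed below 10⁻³, with the slowest mode set by the smallest magnitude eigenvalue, −k = −0.17.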


4 Adaptive Projective Synchronization with Unknown Parameters

In this section, the adaptive projective synchronization of two different hyperchaotic financial systems is discussed; the parameter estimation update law and an effective adaptive controller are designed. Ref. [6] gives another hyperchaotic finance system:

x˙ = −α(x + y) + w
y˙ = −y − αxz
z˙ = β + αxy
w˙ = −γxz − θw    (10)

The system state variables x, y, z, and w represent the interest rate, investment demand, price index, and an unknown variable, respectively, and α, β, γ, θ are real constant parameters. When α = 3, β = 15, γ = 0.2, θ = 0.12, system (10) exhibits hyperchaotic behavior. Select system (2) as the drive system and system (10) as the response system, represented as follows:

x˙2 = −α(x2 + y2) + w2 + u1(t)
y˙2 = −y2 − αx2z2 + u2(t)
z˙2 = β + αx2y2 + u3(t)
w˙2 = −γx2z2 − θw2 + u4(t)    (11)

The error system is:

e˙1 = −α(x2 + y2) + w2 − t(z1 + (y1 − a)x1 + w1) + u1(t)
e˙2 = −y2 − αx2z2 − t(1 − by1 − x1²) + u2(t)
e˙3 = β + αx2y2 − t(−x1 − cz1) + u3(t)
e˙4 = −γx2z2 − θw2 − t(−dx1y1 − kw1) + u4(t)    (12)

where e1 = x2 − tx1, e2 = y2 − ty1, e3 = z2 − tz1, and e4 = w2 − tw1, with t the projective scale factor. The following theorem can be proven.

Theorem 1: Take the adaptive control law as

u1(t) = α̂(x2 + y2) − w2 + tz1 + t(y1 − â)x1 + tw1 − m1e1
u2(t) = y2 + α̂x2z2 + t(1 − b̂y1 − x1²) − m2e2
u3(t) = −β̂ − α̂x2y2 − tx1 − tĉz1 − m3e3
u4(t) = γ̂x2z2 + θ̂w2 − td̂x1y1 − tk̂w1 − m4e4    (13)


Denote the parameter estimation errors ã = â − a, b̃ = b̂ − b, c̃ = ĉ − c, d̃ = d̂ − d, k̃ = k̂ − k, α̃ = α̂ − α, β̃ = β̂ − β, γ̃ = γ̂ − γ, θ̃ = θ̂ − θ, and take the parameter update law as

â˙ = tx1e1,  b̂˙ = ty1e2,  ĉ˙ = tz1e3,  d̂˙ = tx1y1e4,  k̂˙ = tw1e4,
α̂˙ = −(x2 + y2)e1,  β̂˙ = e3,  γ̂˙ = −x2z2e4,  θ̂˙ = −w2e4.    (14)

Then system (11) will asymptotically synchronize with system (2).

Proof: Substituting (13) into (12), the error system becomes

e˙1 = α̃(x2 + y2) − tãx1 − m1e1
e˙2 = α̃x2z2 − tb̃y1 − m2e2
e˙3 = −β̃ − α̃x2y2 − tc̃z1 − m3e3
e˙4 = γ̃x2z2 + θ̃w2 − td̃x1y1 − tk̃w1 − m4e4

Consider the Lyapunov function

V = (e1² + e2² + e3² + e4² + ã² + b̃² + c̃² + d̃² + k̃² + α̃² + β̃² + γ̃² + θ̃²)/2.

Differentiating along the trajectories (noting that ã˙ = â˙ and likewise for the other estimation errors),

V˙ = e1e˙1 + e2e˙2 + e3e˙3 + e4e˙4 + ãâ˙ + b̃b̂˙ + c̃ĉ˙ + d̃d̂˙ + k̃k̂˙ + α̃α̂˙ + β̃β̂˙ + γ̃γ̂˙ + θ̃θ̂˙.    (15)

It is easy to obtain from (13), (14), and (15) that

V˙ = −m1e1² − m2e2² − m3e3² − m4e4² < 0.

From this, it can be concluded that system (2) and system (11) are asymptotically synchronized. The MATLAB numerical simulation in Example 5.3 provides the synchronization results between system (2) and system (11).

5 Numerical Simulations

Example 5.1. We choose the initial values of system (2) as (−1, −1, −1, −1) and the initial values of system (3) as (1, 1, 1, 1). The control gains are (1, 1, 1, 1), and the initial errors of error system (6) are (2, 2, 2, 2). The parameters a, b, c, d, and k of the system are selected as 0.9, 0.2, 1.5, 0.2, and 0.17, respectively. The synchronization errors between system (2) and system (3) under the hybrid feedback method are shown in Fig. 2. The simulation results indicate that our controllers are effective.


Fig. 2. The synchronization errors of (2) and (3) with the hybrid feedback method.

Example 5.2. We choose the initial values of system (7) as (−1, −2, −3) and the initial values of system (3) as (1, 2, 3, 2). The control gains are (1, 1, 1, 1), and the initial errors of error system (9) are (2, 4, 6, 2). The parameters of system (7) are chosen as m = 0.9, n = 0.2, p = 1.2, and the parameters a, b, c, d, and k of system (3) are selected as 0.9, 0.2, 1.5, 0.2, and 0.17. Figure 3 displays the synchronization errors of (7) and (3) under the hybrid feedback method.

Fig. 3. The synchronization errors of (7) and (3) via the hybrid feedback method.

Example 5.3. We choose t = 2; the initial values and control gains are the same as in Example 5.1. The parameters of system (11) are α = 3, β = 15, γ = 0.2, θ = 0.12, and the parameters a, b, c, d, and k of system (2) are selected as 0.9, 0.2, 1.5, 0.2, and 0.17. The initial parameter estimates are (â(0), b̂(0), ĉ(0), d̂(0), k̂(0), α̂(0), β̂(0), γ̂(0), θ̂(0)) = (1, 3, 1, 1, 1, 30, 1, 1), and mi = 1, i = 1, 2, 3, 4. Figure 4 displays the synchronization errors of (2) and (11) under the adaptive projection synchronization method.


Fig. 4. The synchronization errors of (2) and (11) via the adaptive projection synchronization.

6 Conclusions

The global asymptotic synchronization of a nonlinear hyperchaotic financial system was achieved using hybrid feedback control and adaptive projection control. We discussed not only the hybrid synchronization of nonlinear hyperchaotic financial systems under different initial conditions, but also the hybrid synchronization of financial systems of different dimensions. We also discussed the adaptive projective synchronization of two different nonlinear hyperchaotic financial systems with unknown parameters. The numerical simulation results verify that the theoretical analysis results and the designed controllers are effective.

Acknowledgments. This work was supported by the National Social Science Foundation of China (No. 18BJL073), the Key Science and Technology Projects of Henan Province (No. 222102210335), and the Key Scientific Research Projects of Higher Education Institutions of Henan Province (No. 22A120011). We especially thank Zhengzhou Shengda University for its support.

References

1. Dadras, S., Momeni, H.R.: Control of a fractional-order economical system via sliding mode. Physica A 389(12), 2434–2442 (2010)
2. Yu, H.J., Cai, G.L., Li, Y.X.: Dynamic analysis and control of a new hyperchaotic finance system. Nonlinear Dyn. 67(3), 2171–2182 (2012). https://doi.org/10.1007/s11071-011-0137-9
3. Chai, X.L., Gan, Z.H., Shi, C.X.: Impulsive synchronization and adaptive-impulsive synchronization of a novel financial hyperchaotic system. Math. Probl. Eng. 2013, 751616 (2013)
4. Cai, G.L., Yao, L., Hu, P., Fang, X.L.: Adaptive full state hybrid function projective synchronization of financial hyperchaotic systems with uncertain parameters. Discrete Continuous Dyn. Syst. Ser. B 18, 2019–2028 (2013)
5. Zhao, X.S., Li, Z.B., Li, S.: Synchronization of a chaotic finance system. Appl. Math. Comput. 217, 6031–6039 (2011)
6. Ding, J., Yang, W.G., Yao, H.X.: A new modified hyperchaotic finance system and its control. Int. J. Nonlinear Sci. 8(1), 59–66 (2009)
7. Cai, G.L., Hu, P., Li, Y.X.: Modified function lag projective synchronization of a financial hyperchaotic system. Nonlinear Dyn. 69(3), 1457–1464 (2012)
8. Zhu, D.R., Liu, C.X., Yan, B.N.: Drive-response synchronization of a fractional-order hyperchaotic system and its circuit implementation. Math. Probl. Eng. 2013, 815765 (2013)
9. Zhang, L.L., Cai, G.L.: Stability for a novel time-delay financial hyperchaotic system by adaptive periodically intermittent linear control. J. Appl. Anal. Comput. 7(1), 79–91 (2017)
10. Zhang, Z.Y., Liu, D.D., Li, T.R., Cai, G.L.: Study on synchronous control mathematical simulation analysis based on a class of financial chaotic systems. Boletin Tecnico 55(19), 355–362 (2017)
11. Kai, G., Zhang, W., Wei, Z.C., et al.: Hopf bifurcation, positively invariant set, and physical realization of a new four-dimensional hyperchaotic financial system. Math. Probl. Eng. 2017, 2490580 (2017)
12. Cao, L.: A four-dimensional hyperchaotic finance system and its control problems. J. Control Sci. Eng. 2018, 4976380 (2018)
13. Kocamaz, U.E., Goksu, A., Uyaroglu, Y., Taskin, H.: Controlling hyperchaotic finance system with combining passive and feedback controllers. Inf. Technol. Control 47(1), 45–55 (2018)
14. Zhang, Z., Zhang, J., Cheng, F.Y.: A novel stability criterion of time-varying delay fractional-order financial systems based on a new functional transformation lemma. Int. J. Control Autom. Syst. 17(4), 916–925 (2019). https://doi.org/10.1007/s12555-018-0552-5
15. Zheng, S.: Impulsive stabilization and synchronization of uncertain financial hyperchaotic systems. Kybernetika 52(2), 241–257 (2016)
16. Cai, G.L., Zhang, L.L., Yao, L., Fang, X.L.: Modified function projective synchronization of financial hyperchaotic systems via adaptive impulsive controller with unknown parameters. Discrete Dyn. Nat. Soc. 2015, 572735 (2015)
17. Zheng, J.M., Du, B.F.: Projective synchronization of hyperchaotic financial systems. Discrete Dyn. Nat. Soc. 2015, 782630 (2015)
18. Ahamad, H., Mojtaba, H., Dumitru, B.: On the adaptive sliding mode controller for a hyperchaotic fractional-order financial system. Physica A 497, 139–153 (2018)
19. Amin, J., Mojtaba, H., Dumitru, B.: New aspects of the adaptive synchronization and hyperchaos suppression of a financial model. Chaos Solitons Fractals 99, 285–296 (2017)
20. Hadi, J., Amin, Y., Zhou, C.W., et al.: A financial hyperchaotic system with coexisting attractors: dynamic investigation, entropy analysis, control and synchronization. Chaos Solitons Fractals 126, 66–77 (2019)

Quadrotor UAV Control Based on String-Level Fuzzy ADRC

Bohan Xu(B), Zhibin Li, Wengcheng Song, and Shengjie Wang

Shandong University of Science and Technology, Qingdao 266590, China
[email protected]

Abstract. Good flight attitude stability is crucial to the maneuverability of unmanned aerial vehicles. In order to ensure more accurate position and attitude control for a quadrotor UAV, a cascade fuzzy ADRC strategy is designed in this paper. The ESO and NLSEF are used to estimate and compensate the disturbances of the entire system in real time, and a fuzzy controller is added to the cascade ADRC to achieve more accurate and stable control. External disturbances are added to the simulation system. The simulation results show that the cascade ADRC with fuzzy control has better disturbance rejection performance, a smoother response, and faster adjustment.

Keywords: Quadrotor UAV · PID control · ADRC · Fuzzy control

1 Introduction

In recent years, UAVs have been widely used in agriculture, industry and the military because of their simple construction, low manufacturing cost and ability to be controlled autonomously [1]. However, the UAV is an underactuated nonlinear system, which makes the model uncertain, e.g., complex aerodynamic characteristics, susceptibility to disturbances, and strong coupling between UAV attitude angles [2]. Improving the performance indices of quadrotor UAV control systems has become one of the main research directions for an increasing number of scholars [3]. Quadrotor UAVs play an important role in aerial operations, and it is very important to improve their anti-interference capability while maintaining good response speed and steady-state performance. There have been many studies on control methods for quadrotors. In [4], an adaptive fuzzy PID control method was proposed that varies the size of the input-output domain to achieve finer control relative to PID. In [5], a sliding mode controller based on a reaching law was used; sliding mode control suffers from some chattering, but the chattering can be suppressed to some extent by the reaching law. In [6], the authors designed a dual closed-loop controller based on ADRC and PD control, but some jitter remains in the attitude angle. In this paper, we propose to add a fuzzy controller to the cascade ADRC and use it to adjust the self-tuning parameters more accurately in real time according to the error magnitude, achieving a degree of parameter adaptivity and better control performance.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 765–774, 2023. https://doi.org/10.1007/978-981-99-6187-0_76


2 Mathematical Model of Quadrotor UAV

Taking the “X” type quadrotor as the research object, the roll angle, pitch angle, and yaw angle are represented by ϕ, θ, and ψ, respectively. The dynamic equations of the quadrotor in the ground coordinate frame are as follows:

x¨ = (f/m)(cos ψ sin θ cos ϕ + sin ψ sin ϕ) + d1
y¨ = (f/m)(sin ψ sin θ cos ϕ − cos ψ sin ϕ) + d2
z¨ = (f/m) cos ϕ cos θ − g + d3
ϕ¨ = τx/Ixx + qr(Iyy − Ixx)/Ixx + d4
θ¨ = τy/Iyy + pr(Izz − Ixx)/Iyy + d5
ψ¨ = τz/Izz + pq(Ixx − Iyy)/Izz + d6    (1)

where f denotes the lift, g is the acceleration of gravity, d1–d6 are the external disturbances, τx, τy, τz are the roll, pitch, and yaw moments about the three axes, Ixx, Iyy, Izz denote the moments of inertia about the three axes, and p, q, r are the angular velocities about the x, y, and z axes, respectively.

3 String-Level ADRC

3.1 Controller Design

The structure of the cascade control system of the quadrotor designed in this paper is shown in Fig. 1; it is divided into an inner attitude control loop and an outer position control loop.

Fig. 1. Control structure diagram

Tracking Differentiator (TD). The role of the tracking differentiator is not only to track the input signal and arrange the transition process, but also to soften the tracked input signal and reduce overshoot. Taking the roll angle ϕ as an example, from Eq. (1) it has the second-order plant:

ϕ¨ = τx/Ixx + qr(Iyy − Ixx)/Ixx + d4    (2)

Rewrite the second-order plant as:

ϕ¨ = f(ϕ, p, θ, q, ψ, r, d4) + bτx    (3)


where f(ϕ, p, θ, q, ψ, r, d4) denotes the total perturbation and b = 1/Ixx. With x1 denoting the signal ϕ and x2 denoting ϕ˙, the model of the roll channel can be rewritten as

x˙1 = x2
x˙2 = f(ϕ, ϕ˙, θ, θ˙, ψ, ψ˙, d4) + bτx
y = x1    (4)

Let ϕd1 be the tracking signal of ϕd and ϕd2 be the differential signal of ϕd; then the tracking differentiator is expressed as

ϕ˙d1 = ϕd2
ϕ˙d2 = fhan(ϕd1 − ϕd, ϕd2, r, h)    (5)

where ϕd is the input signal of the roll angle loop tracking differentiator, ϕd1 and ϕd2 are the output signals, r is the tracking speed factor, h is the sampling step, and fhan is the maximum-speed tracking function with the following internal relationship [7]:

d = rh
d0 = hd
y = ϕd1 − ϕd + hϕd2
a0 = sqrt(d² + 8r|y|)
a = ϕd2 + sgn(y)(a0 − d)/2,  |y| > d0
a = ϕd2 + y/h,               |y| ≤ d0
fhan = −r sgn(a),  |a| > d
fhan = −ra/d,      |a| ≤ d    (6)

Expansion State Observer (ESO). The expansion state observer is the most important part of the ADRC; it expands the total disturbance into a new state variable, which is observed and compensated using the control inputs and outputs of the system, with the following algorithm [8]:

ε1 = z1 − y
z˙1 = z2 − β01ε1
z˙2 = z3 − β02 fal(ε1, 1/2, δ) + bu
z˙3 = −β03 fal(ε1, 1/4, δ)    (7)

where z1–z3 denote the observed value of the system output, the observed value of its derivative, and the observed value of the total perturbation, respectively, β01–β03 are the observer gains, and fal is the nonlinear function expressed as follows [8]:

fal(e, α, δ) = e/δ^(1−α)    (|e| ≤ δ)    (8)
fal(e, α, δ) = |e|^α·sgn(e)    (|e| > δ)

Nonlinear State Error Feedback (NLSEF). The main role of the nonlinear error feedback part is to compensate for the disturbance based on the difference between the TD

768

B. Xu et al.

output and the ESO observations, and to suppress the disturbance, with the following algorithm [8]:

u0 = β1·fal(e1, a1, δ) + β2·fal(e2, a2, δ)    (9)

where e1 and e2 are the error signals between the TD outputs and the ESO observations, and a1 and a2 are nonlinear factors taking values 0 < a1 < 1 < a2. The resulting roll-angle ADRC structure is shown in Fig. 2.

Fig. 2. Roll angle ADRC structure diagram
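As a minimal sketch of the nonlinearities above, the fal function of Eq. (8) and Han's fhan function of Eq. (6) can be written as follows; variable names are ours, and the `track` loop at the end merely illustrates how the TD of Eq. (5) would be discretized with step h (an assumption, not the paper's exact implementation):

```python
import math

def fal(e, alpha, delta):
    # Eq. (8): linear gain inside the small-error band |e| <= delta,
    # power-law gain outside it.
    if abs(e) <= delta:
        return e / (delta ** (1.0 - alpha))
    return (abs(e) ** alpha) * math.copysign(1.0, e)

def fhan(x1, x2, r, h):
    # Eq. (6): maximum-speed (time-optimal) tracking function, with x1 the
    # tracking error (phi_d1 - phi_d) and x2 the differential signal.
    d = r * h
    d0 = h * d
    y = x1 + h * x2
    a0 = math.sqrt(d * d + 8.0 * r * abs(y))
    if abs(y) > d0:
        a = x2 + math.copysign(1.0, y) * (a0 - d) / 2.0
    else:
        a = x2 + y / h
    if abs(a) > d:
        return -r * math.copysign(1.0, a)
    return -r * a / d

def track(target, r=100.0, h=0.01, steps=2000):
    # Discrete TD of Eq. (5): phi_d1 tracks the input, phi_d2 its derivative.
    phi_d1, phi_d2 = 0.0, 0.0
    for _ in range(steps):
        f = fhan(phi_d1 - target, phi_d2, r, h)
        phi_d1 += h * phi_d2
        phi_d2 += h * f
    return phi_d1, phi_d2
```

Calling `track(1.0)` arranges a softened transition toward the set-point instead of a hard step, which is exactly the overshoot-reducing role of the TD described above.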

3.2 Simulation and Analysis

To verify the performance of ADRC, a simulation model was established in the Simulink environment, and the flight simulation was configured with the laboratory UAV parameters shown in Table 1 to compare the effects of cascade ADRC and PID control.

Table 1. UAV parameters

Parameter/Unit            Value
m / kg                    1.02
l / m                     0.20
g / m·s−2                 9.80
IXX / kg·m2               1.87 × 10−2
IYY / kg·m2               1.64 × 10−2
IZZ / kg·m2               2.97 × 10−2
CT / N/(rad2·s−2)         1.405 × 10−5
CM / N·m/(rad2·s−2)       1.675 × 10−7

Set the initial position of the quadrotor UAV to x = 0 m, y = 0 m, z = 0 m and the desired position to xd = 1 m, yd = 1 m, zd = 1 m, with the yaw angle set to 0.5 rad. A step disturbance is added after 10 s to simulate a small external disturbance in actual flight, expressed as:

y = 0.8    (t > 10)    (10)
y = −0.8    (t < 10.5)


Table 2. Fuzzy rules.

e1 \ e2   NB   NS   Z0   PS   PB
NB        PB   PB   PS   PS   Z0
NS        PB   PS   PS   Z0   NS
Z0        PS   PS   Z0   NS   NS
PS        PS   Z0   NS   NS   NB
PB        Z0   NS   NS   NB   NB

The simulation time is 20 s. Taking the position control result for the x-axis displacement and the attitude control result for the roll angle as examples, the control comparison results are shown in Figs. 3 and 4:

Fig. 3. X-axis Position

Fig. 4. Roll Angle

In the figures above, the overshoot of PID control and ADRC on the x axis is 6.3% and 0.6%, respectively, while at the attitude angle ϕ it is 65% and 19%. After the disturbance is introduced, the recovery time of PID control and ADRC on the x axis is 12.4 s and 11.6 s, respectively, and ADRC handles the sudden change more smoothly. To further improve the performance of ADRC, fuzzy control is added to the cascade ADRC to achieve faster adjustment and better disturbance rejection.


4 String-Level Fuzzy ADRC

4.1 Controller Design

The fuzzy controller is divided into three parts: fuzzification, fuzzy inference, and defuzzification. The observer parameters in the ESO have a large adaptive range, so the NLSEF gains β1 and β2, to which the controller is more sensitive, are selected as the fuzzy-tuned parameters. According to the magnitudes of the errors e1 and e2, these ADRC parameters are adjusted automatically using fuzzy rules; Fig. 5 shows the structure of the fuzzy ADRC.

Fig. 5. Fuzzy ADRC Structure Diagram

The final corrected NLSEF gains can be expressed as

β1* = β1 + Δβ1
β2* = β2 + Δβ2    (11)

where Δβ1 is the output of the fuzzy inference and β1 is the original gain of the nonlinear error feedback control law. The fuzzy domain of the input errors is taken as [−6, 6] and the output domain as [−10, 10]. The control subsets of the inputs e1 and e2 are {NB, NS, Z0, PS, PB}, representing negative big, negative small, zero, positive small, and positive big, respectively, and Gaussian membership functions are chosen. The fuzzy rules are written from the error-adjustment rules and experience; for example, when both e1 and e2 take negative big values, the rule output is set to a positive big value. The detailed rules are shown in Table 2.

4.2 Simulation and Analysis

To verify and compare the rationality and effectiveness of ADRC and fuzzy ADRC more clearly, a simulation model was established in the Simulink environment. The same desired position and disturbance as in the previous experiment are selected, and the results are shown in Figs. 6, 7 and 8. The response stabilization times for the x, y, and z axes under ADRC are 3.2 s, 4.2 s, and 1.6 s, respectively; under fuzzy ADRC they are 2.7 s, 3.9 s, and 1.6 s. After adding fuzzy control, the overshoot of the three axes when reaching the initial desired position is reduced from [0.6%, 9%, 4.4%] to [0.5%, 1.5%, 4.2%].
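A minimal sketch of this fuzzy gain adjustment, using the rule base of Table 2; the Gaussian centers, width, and output singleton values below are illustrative assumptions, not the paper's tuned values:

```python
import numpy as np

# Labels and assumed Gaussian membership centers on the input domain [-6, 6].
LABELS = ["NB", "NS", "Z0", "PS", "PB"]
CENTERS = dict(zip(LABELS, [-6.0, -3.0, 0.0, 3.0, 6.0]))
# Assumed output singletons on the output domain [-10, 10] for defuzzification.
OUT = {"NB": -10.0, "NS": -5.0, "Z0": 0.0, "PS": 5.0, "PB": 10.0}

# Rule table from Table 2: rows indexed by e1, columns by e2.
RULES = {
    "NB": ["PB", "PB", "PS", "PS", "Z0"],
    "NS": ["PB", "PS", "PS", "Z0", "NS"],
    "Z0": ["PS", "PS", "Z0", "NS", "NS"],
    "PS": ["PS", "Z0", "NS", "NS", "NB"],
    "PB": ["Z0", "NS", "NS", "NB", "NB"],
}

def membership(x, sigma=1.5):
    # Gaussian membership degree of x in each fuzzy set.
    return {lab: np.exp(-((x - c) ** 2) / (2.0 * sigma**2))
            for lab, c in CENTERS.items()}

def fuzzy_delta(e1, e2):
    # Defuzzified correction (e.g. the Delta-beta of Eq. (11)) via product
    # inference and a weighted average over all 25 rules.
    m1, m2 = membership(e1), membership(e2)
    num = den = 0.0
    for lab1 in LABELS:
        for c, lab2 in enumerate(LABELS):
            w = m1[lab1] * m2[lab2]
            num += w * OUT[RULES[lab1][c]]
            den += w
    return num / den
```

Large negative errors push the correction strongly positive (and vice versa), which is the qualitative behavior the rule table above encodes.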


Fig. 6. X-axis Position

Fig. 7. Y-axis Position

Fig. 8. Z-axis Position

Fig. 9. Roll Angle

When the disturbance is introduced at the 10th second, the suppression effect of fuzzy ADRC is more obvious, and the time to converge back to stability is shorter. In Figs. 9, 10 and 11, the roll and pitch angles under fuzzy ADRC stabilize faster. Under the disturbance, the excursion ranges of the roll, pitch, and yaw angles are [−0.02, 0.022] rad, [−0.052, 0.051] rad, and 0 rad, respectively, which are superior to [−0.04, 0.039] rad, [−0.12, 0.1] rad, and 0 rad under normal ADRC (Figs. 12, 13, 14 and 15).


Fig. 10. Pitch Angle

Fig. 11. Yaw Angle

Fig. 12. Motor 1 speed

Fig. 13. Motor 2 speed

From the rotation speed curves, it can be seen that during the motion of the UAV under ADRC, the rotation speed undergoes a small oscillation and is nearly stable, but small unstable jumps remain. Under fuzzy ADRC, the speed stabilization times of the four motors are 11.3 s, 11.5 s, 11.2 s, and 11.3 s, respectively; the motor speeds approach and stabilize at 422 rad/s more quickly, avoiding the instability caused by airframe shake during flight.


Fig. 14. Motor 3 speed

Fig. 15. Motor 4 speed

To verify the anti-disturbance and tracking capabilities of the two controllers, the desired trajectory is set as follows, with the simulation time set to 30 s:

xd = cos(0.5t)
yd = sin(0.5t)    (12)
zd = t

As shown in Figs. 16 and 17, before 2.8 s the quadrotor UAV deviates instantaneously from the trajectory while adjusting its initial flight attitude; the deviations under normal ADRC and fuzzy ADRC are 0.22 m and 0.1 m, respectively. After the step disturbance is introduced, the tracking error of fuzzy ADRC is smaller, which further verifies its superior steady-state and transient performance.

Fig. 16. Top view of track tracking


Fig. 17. Track tracking effect

5 Conclusion

In this paper, a string-level ADRC is designed for a quadrotor UAV, and the results show that ADRC outperforms traditional PID control. To further improve rapidity and disturbance rejection, fuzzy control is added to the cascaded ADRC. Fixed-point hovering and trajectory tracking simulations verify that fuzzy ADRC improves the response speed, overshoot suppression, anti-interference ability, and trajectory tracking ability, giving it practical application value.

References

1. Yu, X.: Research on Key Technologies for Control of Four Rotor Unmanned Aerial Vehicle Based on Active Disturbance Rejection. Shanghai University of Engineering and Technology (2021)
2. Perez-Alcocer, R., Moreno-Valenzuela, J., Miranda-Colorado, R.: A robust approach for trajectory tracking control of a quadrotor with experimental validation. ISA Trans. 64, 267–774 (2016)
3. Luo, Y.: Design of a Control System for a Four Rotor Aerial Camera UAV Based on Active Disturbance Rejection. Southwest University of Science and Technology (2022)
4. Gong, J.: Research on attitude control of four-rotor aircraft based on fuzzy adaptive PID. Chongqing University of Posts and Telecommunications (2018)
5. Xiong, Z.: Research on Four Rotor Unmanned Aerial Vehicle Based on Sliding Mode Control. Hubei University of Technology (2018)
6. Zhang, D., Luo, B., Mei, L.: Design of a four rotor controller based on PD-ADRC. Meas. Control Technol. 34(12), 62–65+69 (2015). https://doi.org/10.19708/j.ckjs.2015.12.017
7. Li, J., Zhang, L., Li, S., Mao, Q., Mao, Y.: Active disturbance rejection control for piezoelectric smart structures: a review. Machines 11(2), 174 (2023)
8. Tian, X., Shao, X., Zhang, F.: Four rotor attitude control based on nonlinear auto disturbance rejection. Unmanned Syst. Technol. 5(06), 86–93 (2022). https://doi.org/10.19942/j.issn.2096-5915.2022.6.063

Deep Neural Network for Performance Prediction of Silicon Mode Splitter

Lin Zhang1,2,3, Longqin Xie1,2,3, and Weifeng Jiang1,2,3(B)

1 College of Automation, Nanjing University of Information Science and Technology, Nanjing 210044, China
[email protected]
2 Jiangsu Province Engineering Research Center of Intelligent Meteorological Exploration Robot (C-IMER), Nanjing University of Information Science and Technology, Nanjing 210044, China
3 Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science and Technology, Nanjing 210044, China

Abstract. The silicon-based mode splitter is a key component for building on-chip mode-division multiplexing systems. To accurately and automatically predict the performance of the silicon-based mode splitter, we construct a model based on a deep neural network (DNN) and train it to predict the relationship between the structural parameters and the performance of the mode splitter. Using a dataset of 823 samples, the training results show that the predicted transmittance of the silicon-based mode splitter is consistent with the actual transmittance. The proposed DNN-based prediction method provides a reliable solution for the flexible and automated design of silicon-based photonic devices.

Keywords: Deep Neural Network · Mode Splitter · Direct Binary Search

1 Introduction

Mode-division multiplexing (MDM) systems have the potential to significantly increase transmission capacity and have become a key technology for future high-capacity communication and interconnects [1]. Silicon-based mode splitters are critical components for building on-chip MDM systems, enabling flexible routing of modes [2]. Benefiting from the high refractive-index contrast, silicon photonics integrated devices based on the silicon-on-insulator (SOI) platform have the advantages of high integration density, and their fabrication process is compatible with mature complementary metal-oxide-semiconductor (CMOS) technology [3]. The design of silicon-based mode splitters can be achieved through forward and inverse design methods [4, 5]. The forward design method generally uses numerical calculation methods such as the finite element method (FEM) or the finite-difference time-domain (FDTD) method, combining the desired mode-splitting characteristics to optimize the structural parameters. However, the forward design method requires researchers

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 775–781, 2023. https://doi.org/10.1007/978-981-99-6187-0_77


to have strong professional background knowledge, and the related numerical calculation methods require significant computational resources and running time, making it difficult to meet the needs of widespread applications. The inverse design route is based on various inverse design algorithms, such as the direct binary search (DBS) algorithm [6], particle swarm optimization (PSO) [7], and binary particle swarm optimization (BPSO) [8]. According to the target performance, the structural parameters are inversely designed to obtain a chip layout that meets the performance requirements. Although the traditional inverse design process can realize relatively simple functions, it cannot flexibly predict the relationship between structure and performance.

Recently, deep learning-based methods for photonic device design have become a research hotspot, as they can accelerate the design process and enhance the flexibility of the design results. Deep neural networks (DNNs) are a type of universal machine learning algorithm that has played a critical role in many artificial intelligence fields. DNNs consist of an input layer, multiple hidden layers, and an output layer, each containing multiple neurons. By performing weighted sums and nonlinear operations on the data through these layers, a nonlinear relationship between input and output can be established. As DNNs can simulate the nonlinear physical relationships of photonic systems, they provide a new perspective for predicting optical systems [9]. Traditional FDTD methods used to simulate the optical transmission response may take several minutes or even longer, while a DNN model can predict multiple outputs within a few milliseconds once the input parameters are given. In 2019, Chugh et al. proposed a method combining FEM simulation with artificial neural networks to quickly and accurately calculate various optical characteristics of photonic crystal fibers [10]. Tahersima et al. also used neural networks to quickly and accurately predict the optical response of photonic power splitters, paving the way for the rapid design of integrated photonic devices based on complex nanostructures [11].

In this paper, deep learning based on a DNN is proposed to predict the performance of a silicon-based mode splitter. The training database is established using the DBS algorithm, which allows for the prediction of the relationship between the compact mode splitter structure and its mode-splitting performance. By establishing a forward design model, a rapid mapping between the trained structural parameters and the mode-splitting characteristics can be achieved, avoiding the use of time-consuming FDTD methods.

Fig. 1. The topological structure of the silicon-based splitter. The parameters T1@TE0, T2@TE0, and T2@TE1 correspond to the optical transmittance of the TE0 mode at port T1, the TE0 mode at port T2, and the TE1 mode at port T2, respectively.


2 Operating Principle and Forward Modeling

The topological structure of the silicon-based splitter designed in this paper is shown in Fig. 1. The functional region consists of many square subunits made of silicon or silicon dioxide, each with a side length of 100 nm. The functional region is composed of 50 × 10 square subunits, with an overall length of 5 µm and a width of 1 µm. Data simulation is performed on the functional region in a binary state, with 1 indicating etched and 0 indicating unetched. The most critical step in building the DNN model is to have a suitable dataset, which consists of the generated 50 × 10 labeled data and the corresponding output optical response parameters used to train the neural network. The input is a 50 × 10 labeled array corresponding to the functional-region pixel points, and the output is the optical transmission response of port 1 (T1) and port 2 (T2).

Fig. 2. Forward modeling of the silicon-based splitter based on DNN prediction. The topology of the silicon-based splitter is used as the input of the DNN, and its optical response is used as the output.
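As a rough illustration of this forward mapping, the fully connected architecture described below (three hidden ReLU layers of 100 neurons, a flattened 50 × 10 binary input, and three transmittance outputs) can be sketched in numpy; the random weights here merely stand in for trained parameters, which are not given in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def init_layer(n_in, n_out):
    # Random weights as placeholders for trained parameters (assumption).
    return rng.normal(0.0, 0.1, (n_in, n_out)), np.zeros(n_out)

# 50x10 binary pattern flattened to 500 inputs; three hidden layers of
# 100 neurons each; 3 outputs for T1@TE0, T2@TE0, and T2@TE1.
sizes = [500, 100, 100, 100, 3]
layers = [init_layer(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

def forward(pattern):
    # Forward pass of the fully connected network with ReLU activations.
    h = np.asarray(pattern, dtype=float).reshape(-1)
    for i, (W, b) in enumerate(layers):
        h = h @ W + b
        if i < len(layers) - 1:      # ReLU on hidden layers only
            h = np.maximum(h, 0.0)
    return h

def mse(pred, target):
    # Training would minimize this loss, e.g. with the Adam optimizer.
    return float(np.mean((np.asarray(pred) - np.asarray(target)) ** 2))
```

In practice the training loop (Adam, back-propagation, 80/20 train/validation split) would typically be implemented in a framework such as PyTorch rather than by hand.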

We trained a DNN using a DBS-optimized database to optimize the functionality of the splitter structure. We collected a total of 823 samples to train the neural network model. Our DNN architecture consists of a multilayer fully connected network with three hidden layers, each containing 100 neurons connected to the neurons in the next layer, as shown in Fig. 2. We used the rectified linear unit (ReLU) as the nonlinear activation function, the Adam optimizer to update the weights of the hidden layers in each epoch, and the mean squared error (MSE) as the loss function to calculate the error between the predicted and actual outputs. After each iteration, the DNN model computed a set of output data, and we calculated the mean squared error between the generated output data and the actual data, using back-propagation to update the weights of the hidden layers. To validate the effectiveness of the proposed network, we randomly selected 20% of the data samples as the validation dataset and used the remaining 80% for training. The working principle of the proposed silicon-based splitter is shown in Fig. 3(a). The structure consists of two tapered waveguides and a functional region. The functional region is composed of square subunits, which form the area to be optimized. The proposed mode splitter is built on the SOI platform and has a 220-nm-thick silicon waveguide. The silicon waveguide is sandwiched between a silicon dioxide lower-cladding layer


Fig. 3. (a) Schematic diagram of silicon mode splitter. (b) Structure and parameters of the silicon mode splitter.

and a silicon dioxide upper-cladding layer. The lower waveguide contains a tapered waveguide and a straight waveguide, where the tapered waveguide guides the light and the straight waveguide serves as an output port. The functional region contains many square subunits, and the distribution of silicon and silicon dioxide subunits affects the effective refractive index of the functional region. When the TE1 mode is input from port I, the tapered waveguide "squeezes" the mode into the functional region, where the mode is perturbed by the different subunits without changing the mode order, resulting in low-loss transmission through the functional region. The structure and parameters of the proposed mode splitter are shown in Fig. 3(b), where the width w2 is set to 400 nm based on the single-mode condition. The width w1 is selected as 1 µm to balance the inter-mode crosstalk and the length of the functional region. The two tapered waveguides are completely identical, so w1 = w3 = 1 µm and w2 = 400 nm. The length L and width g of the functional region are 5 µm and 1 µm, respectively, and the functional region is composed of 50 × 10 square subunits. The optimization of the functional region is based on the DBS algorithm. The proposed mode splitter's figure of merit (FOM) is defined as FOM = T_port O1-TE0 / T_port I-TE0 + T_port O2-TE1 / T_port I-TE1, where T_port O1-TE0 and T_port O2-TE1 are the transmittances of the TE0 mode at output port O1 and the TE1 mode at output port O2, respectively, and T_port I-TE0 and T_port I-TE1 are the energies of the TE0 mode and TE1 mode input from port I, respectively. Based on the DBS algorithm, 823 data sets are calculated for training.
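The DBS optimization described above can be sketched as follows. Here `evaluate_fom` is a placeholder for the electromagnetic evaluation of the FOM (in the paper this would require an FDTD simulation per flip); any callable scoring a 50 × 10 binary pattern can stand in:

```python
import numpy as np

rng = np.random.default_rng(0)

def direct_binary_search(evaluate_fom, shape=(50, 10), sweeps=3):
    """Greedy DBS: toggle one subunit at a time, keeping a flip only if
    the figure of merit (FOM) improves."""
    pattern = rng.integers(0, 2, shape)
    best = evaluate_fom(pattern)
    for _ in range(sweeps):
        improved = False
        for idx in np.ndindex(shape):
            pattern[idx] ^= 1                 # toggle silicon <-> silica
            fom = evaluate_fom(pattern)
            if fom > best:
                best = fom
                improved = True
            else:
                pattern[idx] ^= 1             # revert the flip
        if not improved:                      # converged: full sweep, no gain
            break
    return pattern, best
```

With a toy FOM such as "number of subunits matching a target layout", a single sweep already reaches the optimum, which illustrates why DBS converges quickly per iteration while each real FOM evaluation remains expensive.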

3 Results and Discussions

In this section, we validate the trained DNN model. As shown in Fig. 4, the MSE gives the average squared difference between the estimated and true values. We set the number of epochs to 500, and the MSE curves of the training and testing sets indicate that the model has been well trained. The MSE loss of the training set decreases as the number of epochs increases, from 0.293515 at epoch 1 to 0.002273 at epoch 250. When the epoch number reaches 500, the MSE loss further decreases to 0.001382, indicating that it has stabilized. In other words, as the number of epochs grows and the MSE loss drops, the error between the predicted and actual outputs becomes smaller, giving a better fit. However, training for too many epochs may increase the simulation time and lead to


Fig. 4. MSE loss curves for the training and validation sets. The red curve represents the MSE loss for the training set, while the blue curve represents the MSE loss for the validation set.

overfitting, thereby reducing the prediction accuracy. We can also see from Fig. 4 that overfitting does not occur.

Fig. 5. The optical response correlation plots of output ports T1 and T2. The scatter plots of predicted and actual values for (a) T1@TE0 for transmitting the TE0 mode at output port T1, (b) T2@TE0 for transmitting the TE0 mode at output port T2, and (c) T2@TE1 for transmitting the TE1 mode at output port T2.

To test the generalization ability of the network, we used the trained DNN model to predict different T1 and T2 values and compared them with the actual values. We randomly selected seven data points in the range of T1 from 0.3 to 0.8, which are not included in the training set, and made network predictions. The optical response correlation plots of output ports T1 and T2 are shown in Fig. 5. The black line represents the ideal linear model, the red circles represent 80% of the training dataset, the blue circles represent 20% of the validation dataset, and the black circles represent the actual test data. The predicted output values of DNN are fitted with the true output values of T1 and T2 . It can be found that the T1@TE0 , T2@TE0 , and T2@TE1 of the multiple data points predicted by DNN all converged near the black line, reaching the expected level of the model. To quantify the prediction accuracy, we compare the predicted transmittance of the silicon-based mode splitter obtained from the DNN with the actual transmittance. The predicted and actual transmittances are found to be in good agreement. A set of error


Fig. 6. The distribution of T1 and T2 for seven testing data points, as well as the errors between predicted and actual values for (a) T1@TE0 , (b) T2@TE0 , and (c) T2@TE1 .

values is obtained by taking the difference between the predicted and actual values. As shown in Fig. 6(a), the maximum error between the predicted and actual values for T1@TE0 is only 0.0313. Similarly, the maximum error for T2@TE0 is 0.00038, and that for T2@TE1 is 0.006. All errors are within an acceptable range, and the accuracy of each prediction is above 90%. Therefore, the prediction performance for the silicon-based mode splitter is quite high. Since DNNs are fully connected, the dense connections bring orders of magnitude more weight parameters, which can easily lead to overfitting. This would cause the MSE to fall into a local optimum, deviating from the true global optimum. To address this for more complex topologies, residual networks (ResNets) could be used to achieve cross-layer propagation through skip connections. Furthermore, convolutional neural networks (CNNs) can be used as intermediaries through convolutional kernels, so that the feature representation is more stable and higher-precision prediction can be realized.


4 Conclusions

In summary, we have developed a neural network model based on a DNN to predict the performance of the silicon-based mode splitter. A multilayer fully connected DNN has been used, with three hidden layers of 100 neurons each, where each neuron in one layer is connected to the neurons in the next layer. A total of 800 sets of data have been collected and the neural network model has been trained. The transmittance of the silicon-based mode splitter predicted by the DNN is consistent with the actual transmittance. Our proposed DNN-based performance prediction scheme can provide an effective means for optimizing the design of silicon-based photonic chips.

Funding. This work was supported in part by the National Natural Science Foundation of China (Grant Nos. 62275128, 11904178, 12174199 and 11704199), in part by the State Key Laboratory of Advanced Optical Communication Systems and Networks, Shanghai Jiao Tong University, China (2023GZKF015), and in part by the Startup Foundation for Introducing Talent of NUIST.

References

1. Liu, Y., Xu, K., Wang, S., et al.: Arbitrarily routed mode-division multiplexed photonic circuits for dense integration. Nat. Commun. 10(1), 3263 (2019)
2. Bozinovic, N., et al.: Terabit-scale orbital angular momentum mode division multiplexing in fibers. Science 340(6140), 1545–1548 (2013)
3. Heck, M.J.R., Bauters, J.F., Davenport, M.L., Spencer, D.T., Bowers, J.E.: Ultra-low loss waveguide platform and its integration with silicon photonics. Laser Photon. Rev. 8(5), 667–686 (2014)
4. Tu, X., et al.: Analysis of deep neural network models for inverse design of silicon photonic grating coupler. J. Light. Technol. 39(9), 2790–2799 (2021)
5. Hegde, R.S.: Photonics inverse design: pairing deep neural networks with evolutionary algorithms. IEEE J. Sel. Top. Quantum Electron. 26(1), 1–8 (2020)
6. Mao, S., Hu, J., Zhang, H., Jiang, W.: Optimal design and experimental demonstration of a silicon-based ultra-compact mode splitter. Opt. Lett. 47(16), 4167–4170 (2022)
7. Rada-Vilela, J., Zhang, M., Seah, W.: A performance study on synchronicity and neighborhood size in particle swarm optimization. Soft. Comput. 17, 1019–1030 (2013)
8. Wang, S., Phillips, P., Yang, J., Sun, P., Zhang, Y.: Magnetic resonance brain classification by a novel binary particle swarm optimization with mutation and time-varying acceleration coefficients. Biomed. Tech. (Berl) 61(4), 431–441 (2016)
9. Jiang, J., Chen, M., Fan, J.A.: Deep neural networks for the evaluation and design of photonic devices. Nat. Rev. Mater. 6, 679–700 (2021)
10. Chugh, S., Gulistan, A., Ghosh, S., Rahman, B.M.A.: Machine learning approach for computing optical properties of a photonic crystal fiber. Opt. Express 27(25), 36414–36425 (2019)
11. Tahersima, M.H., Kojima, K., Koike-Akino, T., et al.: Deep neural network inverse design of integrated photonic power splitters. Sci. Rep. 9(1), 1368 (2019)

A Privacy Preserving Distributed Projected One-point Bandit Online Optimization Algorithm for Economic Dispatch

Zhiqiang Yang, Zhongyuan Zhao(B), and Quanbo Ge

College of Automation, Nanjing University of Information Science and Technology, Nanjing, China
[email protected]

Abstract. In this paper, a distributed algorithm based on differential privacy and one-point feedback is proposed for the economic dispatch (ED) problem with privacy guarantees in microgrids. Different from most existing research on ED problems with quadratic cost functions, the ED problem with an unknown cost function is considered in this paper. One-point feedback is used to estimate the real gradient information. In addition, a differential privacy mechanism is introduced into the algorithm. The algorithm not only protects the private information of the nodes, but also still achieves a sublinear regret rate. Finally, the effectiveness of the algorithm is verified by simulation results.

Keywords: Distributed online optimization · economic dispatch · differential privacy · microgrid

1 Introduction

The ED problem is one of the basic problems in power grid operation; its goal is to minimize the total cost while satisfying the supply-demand balance and various other constraints [1]. At present, quite a few centralized algorithms have been designed to handle economic dispatch problems, such as the particle swarm algorithm [2] and the quasi-Newton method [3]. However, the existence of a control center places demands on the communication and computing power of the power network. A distributed method only needs to exchange information with neighbors, so it easily copes with single points of failure while reducing computing and communication costs [4].

This research was supported by Postgraduate Research & Practice Innovation Program of Jiangsu Province (grant number SJCX23 0393), Natural Science Foundation of Jiangsu Province (grant number BK20200824) and Startup Foundation for Introducing Talent of Nanjing University of Information Science and Technology (grant number 2019r082).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 782–789, 2023. https://doi.org/10.1007/978-981-99-6187-0_78


At present, many distributed optimization methods [5–10] have been developed, and traditional ED problems can be solved in a distributed way. In [6], a distributed optimization method for the ED problem considering transmission line losses was proposed. In [9], a distributed optimization algorithm based on both row- and column-stochastic matrices was proposed, which solves the ED problem on directed networks. The above gradient-based algorithms require the exact gradient information of the cost functions, which is usually difficult to acquire directly in practical scenarios. In this case, bandit optimization algorithms were developed. Bandit optimization is mainly divided into one-point feedback [11–13] and two-point feedback [14–16]. In [15], a distributed method based on two-point feedback was proposed to handle the ED problem considering forbidden operating areas, valve-point loading effects, and multiple fuel choices; the effect of line loss is further considered in [16]. However, these algorithms require function values at two points, while one-point feedback can achieve the same effect. In [12], a distributed optimization algorithm with one-point feedback was designed for optimization problems with inequality constraints. At present, there are few distributed one-point-feedback optimization algorithms for economic dispatch. At the same time, the uncertainty of microgrid demand [17] emphasizes the need for distributed algorithms for online ED. Therefore, designing a method based on one-point feedback for ED problems is one of our motivations. Moreover, the above algorithms may leak information in the process of information sharing, and customers and power grids may suffer losses from accidental privacy disclosure. Current privacy protection technologies mainly include homomorphic encryption [18] and differential privacy [19].
Homomorphic encryption can achieve a strong privacy protection effect, but its communication cost and computational complexity are high. As an alternative approach, differential privacy techniques were first proposed in [20]. In [21], a differential privacy mechanism is introduced into a distributed optimization method to handle the optimization problem with privacy protection, protecting node privacy while achieving sublinear regret. Inspired by the literature [11, 14, 21], a privacy-preserving distributed projected one-point feedback method is proposed for the ED problem with privacy-preserving properties. This work is the first to solve an online ED problem by combining a differential privacy mechanism with one-point feedback. Specifically, when the cost function is unknown, the one-point feedback mechanism is used to estimate the gradient information. In addition, the differential privacy mechanism is introduced into the distributed optimization algorithm for online economic dispatch, which protects the privacy of the nodes while realizing sublinear regret.
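For background, the Laplace mechanism that underlies many ε-differential-privacy schemes can be sketched as follows; this is a generic illustration, and the sensitivity, ε value, and where exactly the noise enters the paper's algorithm are assumptions for this excerpt:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_perturb(state, sensitivity, epsilon):
    """Perturb a shared state with Laplace noise of scale sensitivity/epsilon.

    By the standard Laplace mechanism, releasing state + noise guarantees
    epsilon-differential privacy for a query with the given L1 sensitivity:
    smaller epsilon means stronger privacy but larger noise.
    """
    scale = sensitivity / epsilon
    return state + rng.laplace(loc=0.0, scale=scale, size=np.shape(state))
```

In a distributed setting, each node would perturb the state it sends to its neighbors in this way; the tension between the noise magnitude and the achievable regret is exactly the trade-off the proposed algorithm manages.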

2 Preliminaries and Problem Formulation

2.1 Graph Theory

Consider a directed graph G := (V, E), where V = {1, 2, ..., m} is the set of agents and E ⊆ V × V is the edge set. Denote N_i^in =


{j ∈ V | (j, i) ∈ E} and N_i^out = {j ∈ V | (i, j) ∈ E} as the in-neighbors and out-neighbors of node i, respectively. Define A = [A_ij] ∈ R^{m×m} as the weighted adjacency matrix of G, with A_ij the weight of edge (i, j).

2.2 Differential Privacy

Definition 1. Considering data sets E = {ei }i∈V and E  = {ei }i∈V , if there is an i ∈ V such that ei = ei and ej = ej for ∀j = i, E and E  are adjacent. Definition 2. Method H can guarantee ε-differential privacy if given data sets E and E  , and ∀Q ∈ R(H), it holds P[H(E) = Q]  eε P[H(E  ) = Q]. Definition 3. The sensitivity of random method H at time t is defined as Δt = sup H(Et ) − H(E  t )1 , where (Et )and(Et ) is the adjacency relationship. 2.3

2.3 One-Point Feedback

Consider the cost function f : R^d → R. Its one-point feedback gradient estimator is ∇̃f(z) = (d/δ) f(z + δu) u, where δ is an exploration parameter and u ∈ S^d is drawn uniformly from the unit sphere. The one-point feedback has the properties summarized in the following lemma.

Lemma 1 [13]. 1) For δ > 0, the one-point feedback is an unbiased estimate of the gradient of the smoothed function f̂_t^i(x) = E_{v∈B}[f_t^i(x + δv)], i.e., E_{u_t^i∈S}[∇̃f_t^i(x_t^i)] = ∇f̂_t^i(x_t^i).
2) If f_t^i is bounded by C, the gradient estimator ∇̃f_t^i(x_t^i) satisfies ||∇̃f_t^i(x_t^i)|| ≤ dC/δ.
3) For f_t^i(x) and f̂_t^i(x): |f̂_t^i(x) − f_t^i(x)| ≤ δL.
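The estimator ∇̃f(z) = (d/δ) f(z + δu) u can be sketched numerically as follows (a hypothetical helper, not the paper's code). Averaging many draws recovers the gradient of the smoothed function, as stated in Lemma 1:

```python
import numpy as np

def one_point_gradient(f, z, delta, rng):
    """One-point feedback: query f once at z + delta*u, scale by (d/delta)*u."""
    d = z.size
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)        # u drawn uniformly from the unit sphere
    return (d / delta) * f(z + delta * u) * u
```

For f(x) = ||x||^2 the smoothed gradient equals 2z exactly, so the sample mean of many estimates approaches 2z even though each individual estimate is very noisy (property 2 bounds its magnitude by dC/δ).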

2.4 Problem Formulation

This paper presents an economic dispatch model at each time t:

min Σ_{i=1}^m C_t^i(P_t^i),   (1)

subject to P_i^min ≤ P_t^i ≤ P_i^max and Σ_{i=1}^m P_t^i = P_D, where m is the number of generators, P_t^i is the power generation of generator i, C_t^i(P_t^i) is its power generation cost, and P_i^max and P_i^min are the upper and lower limits, respectively. P_D is the total demand, including the load demand P_d and the line loss P_loss, i.e., P_D = P_d + P_loss, where P_loss = Σ_{i=1}^m Ξ_i (P_t^i)^2 and Ξ_i is the loss coefficient. Therefore, the ED problem (1) can be reformulated as

max F = ρ_d P_d − C,   C = Σ_{i=1}^m C_t^i(P_t^i),   (2)

Economic Dispatch


where F represents the total revenue of the grid and ρ_d is the price of electricity. The power generation vector P_t = [P_t^1, ..., P_t^m]^T of the generators can be represented by a variable vector x_t = [x_t^1, ..., x_t^m]^T, where x_i^max and x_i^min denote the upper and lower bounds of each variable. The active power balance constraint can then be expressed as P_d = Σ_{i=1}^m x_t^i − Σ_{i=1}^m Ξ_i (x_t^i)^2. Substituting this into (2), we have

max F = − Σ_{i=1}^m [ρ_d (Ξ_i (x_t^i)^2 − x_t^i) + C_t^i(x_t^i)].

If we let f_t^i(x_t^i) = ρ_d (Ξ_i (x_t^i)^2 − x_t^i) + C_t^i(x_t^i), the original ED problem can be transformed into the minimization problem

min_{x∈X} Σ_{i=1}^m f_t^i(x_t^i),   (3)

where x ∈ X represents the power generation constraint. Our aim is to propose an algorithm based on one-point feedback and differential privacy to solve the minimization problem (3). To evaluate the performance of the algorithm, the concept of individual regret is introduced:

R_T^j = Σ_{t=0}^T f_t(x_t^j) − Σ_{t=0}^T f_t(x*),   (4)

where f_t = Σ_{i=1}^m f_t^i. Intuitively, an online algorithm works well if it achieves sublinear regret, i.e., lim_{T→∞} R_T^j / T = 0 for any j ∈ V. We make the following assumptions.

Assumption 1. The directed graph G is strongly connected.

Assumption 2. The function f_t^i : R^d → R is L-Lipschitz continuous and strongly convex on Ω.

Assumption 3. The subgradient ∂f_t^i(x) of f_t^i(x) is bounded, i.e., ||∂f_t^i(x)|| ≤ D̂, where D̂ > 0.
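The sign identity behind the reformulation, F = −Σ_i f_t^i(x_t^i), can be checked numerically. All coefficients below (ρ_d, Ξ_i, the quadratic costs, and the operating point) are made-up illustrative values, not taken from the paper:

```python
# Checking that maximizing F in (2) equals minimizing the sum in (3).
rho_d = 3.0
Xi = [0.01, 0.02]                                    # loss coefficients (made up)
C = [lambda p: 0.1 * p ** 2 + 1.2 * p,               # generation costs (made up)
     lambda p: 0.2 * p ** 2 + 0.8 * p]
x = [10.0, 12.0]                                     # generation levels (made up)
P_d = sum(x) - sum(Xi[i] * x[i] ** 2 for i in range(2))   # power balance
F = rho_d * P_d - sum(C[i](x[i]) for i in range(2))       # revenue, Eq. (2)
f = [rho_d * (Xi[i] * x[i] ** 2 - x[i]) + C[i](x[i]) for i in range(2)]
assert abs(F + sum(f)) < 1e-9                             # F = -sum_i f_t^i
```

Expanding F term by term gives F = Σ_i [ρ_d x_i − ρ_d Ξ_i x_i² − C_i(x_i)] = −Σ_i f_t^i(x_i), which is why maximizing (2) and minimizing (3) coincide.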

3 Algorithm and Convergence Analysis

3.1 Algorithm 1

The algorithm is summarized in Algorithm 1. A_r and A_c are row- and column-stochastic matrices, respectively, and ϵ > 0 is a coupling parameter. The step size α_t > 0 satisfies Σ_{t=1}^∞ α_t = ∞ and Σ_{t=1}^∞ α_t^2 < ∞. The algorithm can be rewritten as z_{t+1}^i = Σ_{j=1}^{2m} [A]_{ij} z_t^j + v_t^i, where z_t^i = h_t^i and v_t^i = x_{t+1}^i − Σ_{j=1}^m [A_r]_{ij} x_t^j + η_{t+1}^i − ϵ y_t^i for i ∈ {1, ..., m}, and z_t^i = y_t^{i−m}, v_t^i = 0_n for i ∈ {m+1, ..., 2m}. Thus

A = [ A_r        ϵI
      I − A_r    A_c − ϵI ].

Finally, we define two auxiliary variables: ρ = max_i x_i^max and z̄_t = (1/m) Σ_{i=1}^{2m} z_t^i = (1/m) Σ_{i=1}^m x_t^i + (1/m) Σ_{i=1}^m y_t^i.


Algorithm 1: Distributed Online Optimization
Input: Each node i initializes x_0^i = 0 and y_0^i = 0.
for t = 1 to T do
    Generate noise η_t^i ∼ Lap(σ_t). Each node i updates x_{t+1}^i according to

        h_t^i = x_t^i + η_t^i,   (5a)
        x_{t+1}^i = P_Ω( Σ_{j=1}^m [A_r]_{ij} h_t^j + ϵ y_t^i − α_t ∇̃f_t^i(x_t^i) ),   (5b)
        y_{t+1}^i = h_t^i − Σ_{j=1}^m [A_r]_{ij} h_t^j + Σ_{j=1}^m [A_c]_{ij} y_t^j − ϵ y_t^i,   (5c)

end
Output: x_t^i, ∀i ∈ V.
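A minimal numerical sketch of the update (5a)-(5c), under simplifying assumptions of our own: a fixed noise scale σ instead of the time-varying σ_t, a box set Ω (projection by clipping), doubly stochastic weights, and the step size α_t = 1/√(t+1) from Theorem 2. The function and parameter names are illustrative:

```python
import numpy as np

def run_algorithm1(f, m, d, T, Ar, Ac, eps, delta, sigma, lo, hi, seed=0):
    """Sketch of update (5a)-(5c): Laplace perturbation, one-point
    gradient feedback, and projection of each x_i onto the box [lo, hi]."""
    rng = np.random.default_rng(seed)
    x = np.zeros((m, d))
    y = np.zeros((m, d))
    for t in range(1, T + 1):
        alpha = 1.0 / np.sqrt(t + 1)               # step size alpha_t
        eta = rng.laplace(0.0, sigma, size=(m, d))
        h = x + eta                                 # (5a): noisy state
        g = np.empty((m, d))
        for i in range(m):                          # one-point gradient estimate
            u = rng.standard_normal(d)
            u /= np.linalg.norm(u)
            g[i] = (d / delta) * f(i, x[i] + delta * u) * u
        x_new = np.clip(Ar @ h + eps * y - alpha * g, lo, hi)   # (5b)
        y = h - Ar @ h + Ac @ y - eps * y                        # (5c)
        x = x_new
    return x
```

Note that each node only ever evaluates its own cost f(i, ·) once per round, which is the bandit (one-point) information model assumed by the paper.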

3.2 Main Results

Lemma 2. Let Assumption 2 hold. Then we have Δ_t ≤ 2 α_t d^{3/2} C / δ.

Theorem 1. Let Assumptions 1 and 2 hold. By perturbing the variable x_t^i with the noise η_t^i, where σ_t = Δ_t / ε, Algorithm 1 guarantees ε-differential privacy.

Lemma 3. Let Assumption 1 hold, with the parameter ϵ ∈ (0, (1 − |λ_3|)^m / ((20 + 8m) m^m)), where λ_3 is the third largest eigenvalue of A. Then there exist Γ > 0 and 0 < γ < 1 such that

|| A^t − [ (1/m) 1_m 1_m^T    (1/m) 1_m 1_m^T
           0                  0 ] ||_∞ ≤ Γ γ^t,   ∀t ≥ 1.

Lemma 4. Let Assumptions 1 and 2 hold, and let the perturbed bandit oracle v_t^i be obtained with the constant ϵ satisfying ϵ ≤ min(ϵ̄, (1 − γ)/(2mΓγ)). Then there exists G > 0 such that E[ Σ_{j=1}^m ||v_t^j|| | F_t ] ≤ G α_t, ∀t ≥ 0.

Lemma 5. Let Assumptions 1 and 2 hold, and let {z_t^i}_{t≥0} be the sequence obtained from Algorithm 1. For i ∈ {1, ..., m}, it holds that E[ ||z_t^i − z̄_t|| | F_{t−1} ] ≤ 2mρΓγ^t + Σ_{r=1}^t GΓγ^{t−r} α_{r−1}.

Theorem 2. Let Assumptions 1 and 2 hold, and let {z_t^i}_{t≥0} be the sequence obtained from Algorithm 1. Suppose the step size is α_t = 1/√(t+1). Then, for any j ∈ V and time duration T > 0, the regret R_T^j satisfies

E[R_T^j] ≤ 2(T + 1) m δ L + D_1 + D_2 √(T + 1),   (6)


where D_1 = V_{4,t} + 2(G + 5dC/δ + D̂) m^2 ρ Γ̂ / (1 − γ) and D_2 = m V̂_S + V_T + 2(G + 5dC/δ + D̂) m G Γ̂ / (γ(1 − γ)) + m^2 G (G + mdC/δ).

4 Numerical Experiments

We verify the effectiveness of the distributed online optimization algorithm on the IEEE 6-bus system. The system includes two DGs, one RG, one ESS, three loads, and eleven transmission lines. Figure 1(a) shows the single-line diagram of the system, and the communication topology is shown in Fig. 1(b). The cost function of each dispatchable agent is f_i(x_i) = a_i x_i^2 + b_i x_i + c_i, where a_i ∈ [0.06, 0.4], b_i ∈ [0.06, 3.3], and c_i ∈ [0.0016, 0.0060]. We let x_0^i = 35 and y_0^i = 0. The parameters are set as d = 1, δ = 0.4, ϵ = 0.00001, and ε = 15.

Fig. 1. (a) The single-line diagram of the modified 6-bus system. (b) The undirected communication topology.

Applying the algorithm yields the simulation results shown in Fig. 2(a) and Fig. 2(b). In Fig. 2(a), the optimal power outputs of the agents are 44.2 MW, 43.3 MW, and 17.5 MW, respectively, and the capacity constraints are satisfied. Figure 2(b) shows that the balance of supply and demand is met. Therefore, the algorithm can solve the economic dispatch problem with quadratic cost functions.

Fig. 2. (a) Power output of DG1, DG2 and ESS1. (b) Total power output vs. demand (MW).

5 Conclusion

Aiming at the economic dispatch problem with privacy-preserving requirements, this paper proposes a distributed online optimization algorithm based on a one-point gradient estimation scheme and a differential privacy mechanism. Theoretical analysis shows that the distributed online optimization algorithm not only protects the privacy of the nodes but also achieves sublinear regret.

References
1. Sarfi, V., Livani, H.: An economic-reliability security-constrained optimal dispatch for microgrids. IEEE Trans. Power Syst. 33(6), 6777–6786 (2018)
2. Ali, M., Kaelo, P.: Improved particle swarm algorithms for global optimization. Appl. Math. Comput. 196(2), 578–593 (2008)
3. Zheng, W., Wu, W.: An adaptive distributed quasi-Newton method for power system state estimation. IEEE Trans. Smart Grid 10(5), 5114–5124 (2018)
4. Xu, Y., Li, Z.: Distributed optimal resource management based on the consensus algorithm in a microgrid. IEEE Trans. Ind. Electron. 62(4), 2584–2592 (2014)
5. Wang, R., Li, Q., Zhang, B., Wang, L.: Distributed consensus based algorithm for economic dispatch in a microgrid. IEEE Trans. Smart Grid 10(4), 3630–3640 (2018)
6. Wang, Z., Wang, D., Wen, C., Guo, F., Wang, W.: Push-based distributed economic dispatch in smart grids over time-varying unbalanced directed graphs. IEEE Trans. Smart Grid 12(4), 3185–3199 (2021)
7. Huang, B., Liu, L., Zhang, H., Li, Y., Sun, Q.: Distributed optimal economic dispatch for microgrids considering communication delays. IEEE Trans. Syst. Man Cybern. Syst. 49(8), 1634–1642 (2019)
8. Xi, C., Khan, U.A.: Distributed subgradient projection algorithm over directed graphs. IEEE Trans. Autom. Control 62(8), 3986–3992 (2016)
9. Yang, T., et al.: A distributed algorithm for economic dispatch over time-varying directed networks with delays. IEEE Trans. Ind. Electron. 64(6), 5095–5106 (2016)


10. Zhao, C., Duan, X., Shi, Y.: Analysis of consensus-based economic dispatch algorithm under time delays. IEEE Trans. Syst. Man Cybern. Syst. 50(8), 2978–2988 (2018)
11. Flaxman, A.D., Kalai, A.T., McMahan, H.B.: Online convex optimization in the bandit setting: gradient descent without a gradient. arXiv preprint cs/0408007 (2004)
12. Yi, X., Li, X., Yang, T., Xie, L., Chai, T., Johansson, K.H.: Distributed bandit online convex optimization with time-varying coupled inequality constraints. IEEE Trans. Autom. Control 66(10), 4620–4635 (2020)
13. Wang, C., Xu, S., Yuan, D., Zhang, B., Zhang, Z.: Push-sum distributed online optimization with bandit feedback. IEEE Trans. Cybern. 52(4), 2263–2273 (2020)
14. Pang, Y., Hu, G.: Randomized gradient-free distributed optimization methods for a multiagent system with unknown cost function. IEEE Trans. Autom. Control 65(1), 333–340 (2019)
15. Xie, J., Cao, C.: Non-convex economic dispatch of a virtual power plant via a distributed randomized gradient-free algorithm. Energies 10(7), 1051 (2017)
16. Xie, J., Yu, Q., Cao, C.: A distributed randomized gradient-free algorithm for the non-convex economic dispatch problem. Energies 11(1), 244 (2018)
17. Zhang, Y., Hajiesmaili, M.H., Cai, S., Chen, M., Zhu, Q.: Peak-aware online economic dispatching for microgrids. IEEE Trans. Smart Grid 9(1), 323–335 (2016)
18. Mao, S., Tang, Y., Dong, Z., Meng, K., Dong, Z.Y., Qian, F.: A privacy preserving distributed optimization algorithm for economic dispatch over time-varying directed networks. IEEE Trans. Ind. Inf. 17(3), 1689–1701 (2020)
19. Hale, M.T., Egerstedt, M.: Cloud-enabled differentially private multiagent optimization with constraints. IEEE Trans. Control Netw. Syst. 5(4), 1693–1706 (2017)
20. Dwork, C.: Differential privacy. In: 33rd International Colloquium on Automata, Languages and Programming (ICALP) (2006)
21. Xiong, Y., Xu, J., You, K., Liu, J., Wu, L.: Privacy-preserving distributed online optimization over unbalanced digraphs via subgradient rescaling. IEEE Trans. Control Netw. Syst. 7(3), 1366–1378 (2020)

Nonlinear Control of Dual UAV Slung Load Flight System Based on RBF Neural Network

Xin-Jie Han^{1,2}, Ji Li^{1,2}, Yun-Sheng Fan^{1,2}(B), and Xin-Yu Chen^{1,2}

1 College of Marine Electrical Engineering, Dalian Maritime University, Dalian 116026, China
2 Key Laboratory of Technology and System for Intelligent Ships of Liaoning Province, Dalian 116026, China
[email protected]

Abstract. This paper proposes a nonlinear trajectory tracking control method based on an RBF neural network and the integral backstepping method for precise position tracking and attitude control of the UAVs during transportation by a dual-UAV slung load flight system. The system is decomposed into three dynamic subsystems (attitude, position, and load swing control), and decoupling controllers under the underactuated constraints are designed for each. The RBF neural network is used to approximate the unknown coupling and external disturbances in the dual-UAV flight system, and the stability of the closed-loop system, the boundedness of the tracking error, and the uniform ultimate boundedness of all signals of the payload swing are proved. The simulation results verify the effectiveness and superiority of the proposed nonlinear control under unknown disturbances, achieving precise trajectory tracking of the dual-UAV slung load system and quickly suppressing the load swing during flight.

Keywords: Dual-UAV · Payload carrying · Suspending flight · Integral backstepping · Neural network

1 Introduction

Because of its wide range of application scenarios and high transportation efficiency, aerial slung-load transportation is widely used in military and civilian fields [1–3], and its flight control has always been a hot issue. Traditional slung-load transportation mostly uses single-rotor helicopters as the main carrier [4–7]. In recent years, with the development of quadrotor UAV technology, quadrotor slung-load systems have been studied and applied more and more [8–11]. Due to the limited load capacity of a single UAV, and in order to meet the needs of more scenarios and tasks, the control problem of multi-UAV cooperative transportation has gradually become a research hotspot [12–15].

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 790–801, 2023. https://doi.org/10.1007/978-981-99-6187-0_79


The dual-UAV suspension system is a multi-degree-of-freedom, strongly coupled, underactuated complex system, whose control is more difficult than that of a single-UAV suspension. More and more scholars have conducted in-depth research on the control of multi-quadrotor suspension systems and achieved many results. Reference [16] used the virtual structure method to treat two UAVs and a strip load as one system; adaptive control made the system fast and stable, but the control accuracy was poor. Reference [17] used sliding mode control to solve the robust stabilization problem of the quadrotor suspension system, dividing the slung-load system into two subsystems: one fully actuated, and one whose small swing is stabilized under the other's drive. Reference [18] studied the influence of polynomial trajectories on the vibration of the hanging load and designed a nonlinear control method to reduce the swing under polynomial trajectories; simulation results show that this method can suppress the swing of the hanging load to a certain extent. In [19], an LQR controller was used to track the trajectory of a dual-UAV system with load, linearizing the nonlinear system to realize tracking of a spiral orbit. In [20], a new nonlinear robust control strategy was proposed to realize position tracking of a three-UAV system carrying a cubic load.

The contribution of this paper is to analyze the forces on the two UAVs and the load as a whole system and to establish the mathematical model of the coupled dual-UAV slung load system. Aiming at the problems of load swing, strong coupling, and mutual interference during flight of the dual-UAV slung load system, a nonlinear controller based on the neural-network backstepping method is designed to realize decoupling control under underactuated constraints. Finally, tracking control experiments on a three-dimensional square trajectory are carried out using a slung system composed of Quanser's Qball 2 aircraft model, which demonstrates the effectiveness and superiority of the proposed method.

2 Double UAV Hanging Load System Model

In this paper, the system model of two UAVs suspending a rigid-body load is considered. The structure of the system model is shown in Fig. 1. The centroids of UAV 1 and UAV 2 are denoted Q1 and Q2, respectively, and the centroid of the hanging load is denoted P.

2.1 Definition of Coordinate System

In Fig. 1, {I : x_g y_g z_g} is the ground coordinate system (inertial coordinate system), whose three orthogonal unit basis vectors are denoted {E_x, E_y, E_z}. {B_i : x_bi y_bi z_bi} is the body coordinate system of UAV i, whose three orthogonal unit basis vectors are denoted {e_ix, e_iy, e_iz} (i = 1, 2). {ϑ : xyz} is a transition coordinate system fixed to the center of mass of the hanging load, with the same orientation as {I}. Q_i' is the projection of Q_i onto the x_g y_g plane.


Fig. 1. Double UAV hanging load system model

Spatial coordinate systems are related by rotation and translation transformations. The main descriptions of rotation are Euler angles and quaternions; this paper uses the Euler angles common in aviation systems to describe the quadrotor attitude. The rotation order is X_b → Y_b → Z_b; the rotation angle about the Y_b axis is the pitch angle θ, and the rotation angle about the X_b axis is the roll angle φ. Based on these Euler angles, the rotation matrix from the body coordinate system B to the inertial system I can be expressed as:

R_B^I = [ cθ_i cψ_i    sφ_i sθ_i cψ_i − cφ_i sψ_i    cφ_i sθ_i cψ_i + sφ_i sψ_i
          cθ_i sψ_i    sφ_i sθ_i sψ_i + cφ_i cψ_i    cφ_i sθ_i sψ_i − sφ_i cψ_i
          −sθ_i        sφ_i cθ_i                      cφ_i cθ_i ]

where c(·) and s(·) denote cos(·) and sin(·), respectively. The angles α_i, β_i in Fig. 1 represent the angles of the two ropes relative to the hanging load in the transition coordinate system ϑ, so the unit direction vector from P to Q_i can be defined as:

ρ_i = [cos(β_i) cos(α_i), cos(β_i) sin(α_i), sin(β_i)]^T   (1)
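The rotation matrix R_B^I can be sketched and sanity-checked as follows (the helper name is ours, not the authors' code); a valid rotation matrix must be orthogonal with determinant 1:

```python
import numpy as np

def rot_body_to_inertial(phi, theta, psi):
    """R_B^I for the X_b -> Y_b -> Z_b (roll-pitch-yaw) Euler sequence."""
    c, s = np.cos, np.sin
    return np.array([
        [c(theta) * c(psi), s(phi) * s(theta) * c(psi) - c(phi) * s(psi),
         c(phi) * s(theta) * c(psi) + s(phi) * s(psi)],
        [c(theta) * s(psi), s(phi) * s(theta) * s(psi) + c(phi) * c(psi),
         c(phi) * s(theta) * s(psi) - s(phi) * c(psi)],
        [-s(theta), s(phi) * c(theta), c(phi) * c(theta)],
    ])
```

At zero roll, pitch, and yaw the matrix reduces to the identity, i.e., the body and inertial frames coincide.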

The hanging load is assumed to remain between the two UAVs, so 0° ≤ α_i < 360° (considering interference between the UAVs, the actual β_i should lie within a reasonable range). According to the geometric relationship, the positions of UAV 1 and UAV 2 are determined once the position of the hanging load is given:

ξ_P^I = [x_p, y_p, z_p]^T,   ξ_Q1^I = ξ_P^I + L_r ρ_1,   ξ_Q2^I = ξ_P^I + L_r ρ_2   (2)
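Equations (1)-(2) translate directly into code; the helper below is illustrative (not from the paper) and assumes both ropes have the same length L_r:

```python
import numpy as np

def uav_positions(xi_P, alphas, betas, L_r):
    """Positions of the two UAVs from the load position xi_P via (1)-(2)."""
    positions = []
    for a, b in zip(alphas, betas):
        rho = np.array([np.cos(b) * np.cos(a),
                        np.cos(b) * np.sin(a),
                        np.sin(b)])               # unit vector from P to Q_i
        positions.append(xi_P + L_r * rho)
    return positions
```

For example, with the load at the origin, α = (0°, 180°), and β = 45° for both ropes, the two UAVs sit symmetrically above the load on opposite sides.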


In the formula, ξ_Q1^I, ξ_Q2^I and ξ_P^I are the positions of UAV 1, UAV 2 and the hanging load, respectively, and L_r is the length of the rope (by default, the two ropes have the same length).

2.2 Control Input and External Disturbance

As before, the control inputs of UAV i are the force F_Qi and the torque M_Qi acting in the body coordinate system {B_i}:

F_Qi^{Bi} = [0, 0, F_zi]^T,   M_Qi^{Bi} = [M_xi, M_yi, M_zi]^T   (3)

To simulate disturbances acting on the system, this paper also considers an external force F_P acting directly on the hanging load, which can be expressed in {I} as F_P^I = [F_xp, F_yp, F_zp]^T.

2.3 Euler-Lagrange Modeling

The Jacobian matrix of the angular rate is J_i, namely Ω_i^{Bi} = J_i η̇_i, where η_i is the Euler angle vector of UAV i and Ω_i^{Bi} is the angular rate of UAV i in {B_i}. The analysis shows that the controlled system has 13 degrees of freedom. In this paper, the position of the hanging load ξ_P^I, the attitude angles η_i of the two UAVs, and α_i, β_i are selected as the generalized coordinates of the system, that is, q = [x_p, y_p, z_p, α_1, β_1, φ_1, θ_1, ψ_1, α_2, β_2, φ_2, θ_2, ψ_2]^T. According to d'Alembert's principle, the generalized force corresponding to the generalized coordinates is:

F_q = ∂[ Σ_{i=1}^2 (F_Qi^I · ξ_Qi^I + M_Qi^{ηi} · η_i) + F_P^I · ξ_P^I ] / ∂q   (4)

where M_Qi^{ηi} = J_i^T M_Qi^{Bi}, which follows from M_Qi^{ηi} · η̇_i = (J_i^T M_Qi^{Bi})^T η̇_i = M_Qi^{Bi} · J_i η̇_i = M_Qi^{Bi} · Ω_i^{Bi}. Therefore, the kinetic energy and potential energy of the system are expressed as:

A = (1/2) m_P ξ̇_P^T ξ̇_P + ((1/2) m_Q1 ξ̇_Q1^T ξ̇_Q1 + (1/2) Ω_1^T I_Q1 Ω_1) + ((1/2) m_Q2 ξ̇_Q2^T ξ̇_Q2 + (1/2) Ω_2^T I_Q2 Ω_2),
P = m_P g · ξ_P + m_Q1 g · ξ_Q1 + m_Q2 g · ξ_Q2.   (5)

In the formula, m_P is the mass of the hanging load, m_Q1 and m_Q2 are the masses of the UAVs, and I_Q1 and I_Q2 are the moments of inertia of the UAVs. The Euler-Lagrange equation is:

F_q = d/dt ( ∂(A − P)/∂q̇ ) − ∂(A − P)/∂q   (6)


Substituting Eq. (4) and Eq. (5) into Eq. (6), we can obtain:

G q̈ = g(F, q, q̇) + g_w(F_xp, F_yp, F_zp)   (7)

In the formula, G = G^T ∈ R^{13×13}, g(F, q, q̇) ∈ R^{13}, and g_w(F_xp, F_yp, F_zp) ∈ R^{13} is the external disturbance. G is the generalized inertia matrix, with the specific form:

G = [ (m_Q1 + m_Q2 + m_p) I_3    M_21      0_{3×3}    M_22      0_{3×3}
      M_21^T                      M_11      0_{2×3}    0_{2×2}    0_{2×3}
      0_{3×3}                     0_{3×2}   J_q1       0_{3×2}    0_{3×3}
      M_22^T                      0_{2×2}   0_{2×3}    M_12       0_{2×3}
      0_{3×3}                     0_{3×2}   0_{3×3}    0_{3×2}    J_q2 ]   (8)

Among them,

M_1i = m_Qi L_r^2 diag(c^2 β_i, 1),   M_2i = −m_Qi L_r [ cβ_i sα_i    cα_i sβ_i
                                                        −cα_i cβ_i   sα_i sβ_i
                                                         0           cβ_i ],   J_qi = diag(I_xi, I_yi, I_zi)
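The blocks M_1i and M_2i can be assembled directly from their definitions. The helpers below are illustrative (our own names, not the authors' code):

```python
import numpy as np

def M1(m_Q, L_r, beta):
    """Block M_1i = m_Qi * L_r^2 * diag(cos^2(beta_i), 1)."""
    return m_Q * L_r ** 2 * np.diag([np.cos(beta) ** 2, 1.0])

def M2(m_Q, L_r, alpha, beta):
    """3x2 block M_2i coupling the swing angles with the translation."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    return -m_Q * L_r * np.array([[cb * sa, ca * sb],
                                  [-ca * cb, sa * sb],
                                  [0.0, cb]])
```

Note that M_1i is diagonal and positive definite for |β_i| < 90°, which is what allows the swing dynamics in (10) below to be solved for σ̈_i.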

Further, select [x_p, ẋ_p, y_p, ẏ_p, z_p, ż_p, α_1, α̇_1, β_1, β̇_1, φ_1, φ̇_1, θ_1, θ̇_1, ψ_1, ψ̇_1, α_2, α̇_2, β_2, β̇_2, φ_2, φ̇_2, θ_2, θ̇_2, ψ_2, ψ̇_2]^T as the state vector x ∈ R^{26}. At the same time, [F_z1, M_x1, M_y1, M_z1, F_z2, M_x2, M_y2, M_z2]^T is selected as the control vector u ∈ R^8, and u_ω = [F_xp, F_yp, F_zp]^T ∈ R^3 is the external disturbance input. Therefore, the nonlinear state-space equation of the system is:

ẋ = f(x, u) + f_ω(u_ω)   (9)

In summary, the dynamic model of the system can be expressed as:

δ̈ = (F_t1 + F_t2 + F_P^I)/m_p − g,   η̈_i = J_qi^{-1}(τ_ηi + τ_ηei),   σ̈_i = M_1i^{-1}(F_σei − F_σsi)   (10)

where F_ti = [F_txi, F_tyi, F_tzi]^T is the tension exerted by the load on the i-th rope, δ = [x_p, y_p, z_p]^T, η_i = [φ_i, θ_i, ψ_i]^T, σ_i = [α_i, β_i]^T, F_P^I = [F_xp, F_yp, F_zp]^T, g = [0, 0, g]^T, and τ_ηi = [l M_xi, l M_yi, M_zi]^T is the control torque. τ_ηei is the torque induced on UAV i by the rope tension, F_σei = M_1i σ̇_i + Ṁ_2i ξ̇_Qi + B, F_σsi = M_2i^T ξ̈_Qi, and B = [B_α, B_β]^T = (1/2) ∂(q̇^T G q̇)/∂σ = [2 m_Qi L_r^2 sβ_i α̇_i β̇_i, −m_Qi L_r^2 sβ_i cβ_i α̇_i^2]^T.

3 Trajectory Tracking Control of Double UAV Hanging Load System

Aiming at the problems of load swing, serious coupling, and mutual interference during flight of the dual-UAV slung flight system, a nonlinear controller is designed to realize decoupling control under underactuated constraints. In the dual-UAV slung system, the control can be divided


into three subsystems: flight position control, hanging-load swing angle control, and flight attitude control. In the position calculation, the roll and pitch angles of the UAV have little effect on the position of the aircraft, so a small-angle approximation is used, and the position-attitude conversion formula is:

φ_di = U_xi sψ_i − U_yi cψ_i,   θ_di = U_xi cψ_i + U_yi sψ_i   (11)

According to the dynamic model of the dual-UAV slung flight system and the single-UAV model, the dynamics of each UAV in the slung flight system are:

z̈_Qi = (cφ_i cθ_i F_zi − F_tzi)/m_Qi − g,
ẍ_Qi = (u_xi F_zi − F_txi)/m_Qi,   ÿ_Qi = (u_yi F_zi − F_tyi)/m_Qi   (12)

Take the height channel as an example: the height controller regulates the altitude of the UAV through the lift F_zi. The expected control input F_zi is designed as:

F_zi = [((1 − k_zi^2 + λ_zi) e_zi + (k_zi + k_zpi) e_zpi + g − k_zi λ_zi χ_zi + Z̈_di) m_Qi + F_tzi] / (cφ_i cθ_i)   (13)

Proof: The difference between the expected height Z_di and the actual height Z_i is e_zi = Z_di − Z_i, and the derivative of the tracking error is ė_zi = Ż_di − Ż_i = Ż_di − ω_zi, where Ż_i = ω_zi is the actual climb rate of UAV i. To stabilize Ż_i, let the Lyapunov function be:

V_1 = (1/2) e_zi^2   (14)

wzdi = kzi ezi + Z˙ di + λzi χzi

(16)

Where χzi = 0 ezi (τ )dτ are constants greater than 0. The difference between the virtual control input wzdi and the actual height change speed wzi is: ezpi = wzdi − wzi = kzi ezi + Z˙ di + λzi χzi − ωzi

(17)

That isZ˙ d = epi − kzi ezi − λzi χzi + ωzi . In the substitution, we get : V˙ 1 = ezi e˙ zi = ezi (−kzi ezi − λzi χzi + ezpi ) = −kzi e2zi − ezi λzi χzi + ezi ezpi (18)


In order to make V˙ 1 ≤ 0, let ezpi , χzi tend to 0, the Lyapunov function V2 is designed for ezpi , χzi : V2 =

1 2 1 1 ezi + e2zpi + λzi χ2zi 2 2 2

(19)

V2 is positive definite, the derivation of (17) : e˙ zpi = ω˙ zdi − ω˙ zi = kzi e˙ zi + Z¨di + λzi χzi − Z¨i

(20)

Substitute the aircraft height mathematical model Eq. (12) into Eq. (20):

Then:

cφi cθi Fzi − Ftzi e˙ zpi = kzi e˙ zi + Z¨di + λzi ezi + g − mQi

(21)

V˙ 2 =V˙ 1 + ezpi e˙ zpi + λzi χzi χ˙ zi = −kzi e2zi + ezi ezpi + cφi cθi Fzi − Ftzi ) ezpi (kzi e˙ zi + Z¨di + λzi ezi + g − mQi

(22)

By substituting formula (13) into formula (22), we get: V˙ 2 = −kzi e2zi − kzpi e2zpi

(23)

V˙ 2 is negative definite. This article combines the RBF neural network to get the tension between the rope between the drone and the load. Set the output of the neural network fˆzQi pair of tensile Ftzi on the rope. The Lyapunov function of designing a neural network is: 1 ˜W ˜ T) tr(W (24) V3 = V1 + V2 + 2γ Among them, γ is the adaptive factor, W ∗ is the ideal value of the network, and ˆ is the estimated power value.W ˜ =W ˆ − W∗ W Direction to the formula (24), get: T T 1 ˜W ˜˙ ) = V˙ 1 + V˙ 2 + 1 tr[W ˜ (γeT P h(x) + W ˆ˙ )] V˙ 3 = V˙ 1 + V˙ 2 + tr(W γ γ

(25)

When the network right coefficient is adjusted in the following formula,V˙ 3 ≤ 0: ˆ˙ = −γeT P h(x) W

(26)
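A minimal sketch of an online RBF approximator of the kind used here for the rope tension: Gaussian bases h(x) and a weight estimate Ŵ adapted online. The centers, width, gain, and the simple scalar-error update below are illustrative choices of ours, not the paper's exact P-matrix adaptation law (26):

```python
import numpy as np

rng = np.random.default_rng(1)
centers = np.linspace(-1.0, 1.0, 9)     # RBF centers c_j (assumed)
b = 0.4                                  # RBF width (assumed)
gamma = 2.0                              # adaptation gain (assumed)
dt = 0.01                                # discretization step
W_hat = np.zeros(9)

def h(x):
    """Gaussian radial basis vector evaluated at x."""
    return np.exp(-((x - centers) ** 2) / (2 * b ** 2))

target = lambda x: np.sin(2 * x)         # stand-in for the unknown tension map
for _ in range(5000):
    x = rng.uniform(-1.0, 1.0)
    e = target(x) - W_hat @ h(x)         # approximation error
    W_hat += gamma * e * h(x) * dt       # discretized adaptation step
max_err = max(abs(target(x) - W_hat @ h(x)) for x in np.linspace(-1, 1, 50))
```

After adaptation the residual approximation error is small over the training interval, which is what the uniform-ultimate-boundedness argument relies on.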

The control law designed by formulas (24)-(26) makes the height gradually stabilize. In the same way, the control law of the position controller u_ζi (ζ = x, y) in the system is:

u_ζi = [((1 − k_ζi^2 + λ_ζi) e_ζi + (k_ζi + k_ζpi) e_ζpi − k_ζi λ_ζi χ_ζi + ζ̈_di) m_Qi + f̂_ζQi] / F_zi   (27)
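The height-channel law (13) can be checked numerically on the simplified dynamics (12). The sketch below assumes zero rope tension, level attitude (cφcθ = 1), a made-up QBall 2-like mass, and illustrative gains; it is not the paper's simulation:

```python
import numpy as np

m_Q, g = 1.79, 9.81                 # assumed mass (kg), gravity (m/s^2)
k_z, k_zp, lam_z = 2.0, 3.0, 0.5    # illustrative backstepping gains
Z_d, dZ_d, ddZ_d = 5.0, 0.0, 0.0    # constant 5 m height reference
z, w, chi = 0.0, 0.0, 0.0           # height, climb rate, integral of e_z
dt = 0.002
for _ in range(25000):               # 50 s of simulated flight
    e_z = Z_d - z
    e_zp = k_z * e_z + dZ_d + lam_z * chi - w      # virtual-input error (17)
    # control input (13) with cos(phi)cos(theta) = 1 and F_tz = 0
    F_z = ((1 - k_z ** 2 + lam_z) * e_z + (k_z + k_zp) * e_zp
           + g - k_z * lam_z * chi + ddZ_d) * m_Q
    w += (F_z / m_Q - g) * dt        # height dynamics (12)
    z += w * dt
    chi += e_z * dt
print(round(z, 2))                   # → 5.0
```

The closed loop reduces to ė_z = −k_z e_z − λ_z χ_z + e_zp and ė_zp = −e_z − k_zp e_zp, whose eigenvalues all have negative real parts for the chosen gains, so the height settles at the reference.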


where ζ_di is the expected position of UAV i in the ζ direction, f̂_ζi is the estimate of the tension on the i-th rope, and k_ζi, λ_ζi, k_ζpi are constants greater than 0, with:

e_ζi = ζ_di − ζ_i,   χ_ζi = ∫_0^t e_ζi(τ) dτ,   e_ζpi = k_ζi e_ζi + ζ̇_di + λ_ζi χ_ζi − ζ̇_i   (28)

Because the system is underactuated, the swing angles of the hanging load cannot be controlled directly; the controller must act on them indirectly through the position signals. The attitude angles of the UAVs are controlled by the inner control loop of the system, which is independent of the tension on the ropes. The attitude controller uses the integral backstepping method, derived in the same way as above:

u_σi = [((1 − k_σi^2 + λ_σi) e_σi + (k_σi + k_σpi) e_σpi − k_σi λ_σi χ_σi + σ̈_di) M_1i − F_σei] / F_zi   (29)

M_ηi = (I_ξi / l) [(1 − k_ηi^2 + λ_ηi) e_ηi + (k_ηi + k_ηpi) e_ηpi − k_ηi λ_ηi χ_ηi + η̈_di] − τ_ηei   (30)

where σ_di is the expected swing angle, η_di is the expected attitude angle of UAV i, and k_σi, λ_σi, k_σpi, k_ηi, k_ηpi, λ_ηi are constants greater than 0, with:

e_σi = σ_di − σ_i,   χ_σi = ∫_0^t e_σi(τ) dτ,   e_σpi = k_σi e_σi + σ̇_di + λ_σi χ_σi − σ̇_i   (31)
e_ηi = η_di − η_i,   χ_ηi = ∫_0^t e_ηi(τ) dτ,   e_ηpi = k_ηi e_ηi + η̇_di + λ_ηi χ_ηi − η̇_i   (32)

4 Simulation Verification and Result Analysis

In the MATLAB/Simulink simulation, the parameters of the QBall 2 quadrotor are used; the hanging load is a ball of mass 0.4 kg with its center of gravity at its center, and the rope length is 1 m. In this paper, FBC denotes fuzzy integral backstepping control of the dual-UAV slung flight system, and NNBC denotes the control strategy proposed in this paper. The simulation time is set to 80 s. At the first second, the starting positions of the two UAVs are [−√2/2, 0, 0] and [√2/2, 0, 0], and the initial attitude angles are (0.01, 0, 0) rad and (−0.01, 0, 0) rad, respectively, so that the dual-UAV slung system tracks the desired trajectory in the specified height plane. After the take-off instruction is given, the system takes off from the starting position, with the expected height set to 5 m; after reaching the specified height, it advances through each target point in turn. For flight safety, the maximum acceleration of each trajectory is set to 0.2 and the maximum speed to 1. A fixed disturbance torque [0.2, 0.3, 0] is added to the attitudes of the two UAVs, and a disturbance composed of the following torque and white noise is added to the swing angle of the hanging load: [cos 0.5πt, sin 0.5πt].


Fig. 2. UAV trajectory tracking comparison curves: (a) UAV 1; (b) UAV 2.

Fig. 3. UAV XOY-plane trajectory tracking comparison curves: (a) UAV 1; (b) UAV 2.

Figure 2 shows the three-dimensional trajectories of UAV 1 and UAV 2 in the slung flight system, and Fig. 3 shows the trajectories of the two UAVs projected onto the XOY plane. Figures 4 and 5 show the position tracking errors of the two UAVs in the dual-UAV slung load system under the FBC and NNBC control algorithms, respectively. It can be seen that the control strategy designed in this paper drives the UAVs along the desired trajectory well, with smaller overshoot and smaller tracking error than PID and fuzzy backstepping control.

Fig. 4. UAV 1 position tracking error: (a) NNBC; (b) FBC.

Fig. 5. UAV 2 position tracking error: (a) NNBC; (b) FBC.

5 Conclusion

Considering the strong coupling, large external disturbances, and uncertain model parameters of the dual-UAV slung load flight system, a nonlinear trajectory tracking control method based on the neural-network backstepping method is designed for precise trajectory tracking control of this class of systems. Building on the single quadrotor, the model is extended to two UAVs with a load: the two UAVs and the suspended load are analyzed as a whole system, and the mathematical model of the coupled dual-UAV slung load system is established. Nonlinear decoupling controllers for the three dynamic subsystems of attitude, position, and load swing are designed under underactuated constraints. At the same time, the stability of the closed-loop system is proved, and the tracking error is shown to be uniformly ultimately bounded. Finally, tracking control of a three-dimensional trajectory is carried out using Quanser's Qball 2 aircraft. The simulation results verify the effectiveness and superiority of the proposed neural-network backstepping control of the dual-UAV slung load flight system under unknown disturbances, making it an effective method for precise trajectory tracking control of such systems.

References
1. Yuan, X., Zhu, B.: Survey on modeling and control of quadrotor UAV-slung load system. The Seventh Research Division, School of Automation Science and Electrical Engineering
2. Villa, D.K.D., Brandão, A.S., Sarcinelli-Filho, M.: A survey on load transportation using multirotor UAVs. J. Intell. Rob. Syst. 98, 267–296 (2020)
3. Juntong, Q.I., Yuan, P.: Survey on flight control technology for hanging load UAV. Unmanned Syst. Technol. 1(01) (2018)
4. Yuan, C.: Flight dynamics modeling coupling investigation of a helicopter suspended by a slung load. Nanjing University of Science and Technology, China
5. Liu, L., Chen, M., Li, T.: Disturbance observer-based robust coordination control for unmanned autonomous helicopter slung-load system via coupling analysis method. Appl. Math. Comput. 427, 127148 (2022)
6. Thanapalan, K.: Nonlinear controller design for a helicopter with an external slung load system. Syst. Sci. Control Eng. 5, 97–107 (2017)
7. Liu, L., Chen, M., Li, T.: Disturbance observer-based robust coordination control for unmanned autonomous helicopter slung-load system via coupling analysis method. Appl. Math. Comput. 427, 127148 (2022)
8. Guo, K., Jia, J., Yu, X., et al.: Multiple observers based anti-disturbance control for a quadrotor UAV against payload and wind disturbances. Control Eng. Pract. 102, 104560 (2020)
9. Fan, Y.-S., Chen, X.-Y., Zhao, Y.-S., Song, B.-J.: Nonlinear control of quadrotor suspension system based on extended state observer. Acta Automatica Sinica 49(8), 1758–1770 (2021)
10. Baraean, A., Hamanah, W.M., Bawazir, A., Baraean, S., Abido, M.A.: Optimal nonlinear backstepping controller design of a quadrotor-slung load system using particle swarm optimization. Alexandria Eng. J. 68, 551–560 (2023)
11. Roy, K.R., Waghmare, L.M., Patre, B.M.: Dynamic modeling and displacement control for differential flatness of quadrotor UAV slung-load system. Int. J. Dyn. Control 11(2), 637–655 (2022)
12. Arab, F., Shirazi, F.A., Yazdi, M.R.H.: Planning and distributed control for cooperative transportation of a non-uniform slung-load by multiple quadrotors. Aerosp. Sci. Technol. 117, 106917 (2021)
13. Mohammadi, K., Sirouspour, S., Grivani, A.: Passivity-based control of multiple quadrotors carrying a cable-suspended payload. IEEE/ASME Trans. Mechatron. 27(4), 2390–2400 (2021)
14. Prajapati, P., Parekh, S., Vashista, V.: On the human control of a multiple quadcopters with a cable-suspended payload system. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 2253–2258. IEEE (2020)
15. Guo, M.: Research on control issues of quadrotors with a cable-suspended payload. Nanjing University of Science and Technology, Nanjing, China (2018)
16. Villa, D.K.D., Brandão, A.S., Carelli, R., et al.: Cooperative load transportation with two quadrotors using adaptive control. IEEE Access 9, 129148–129160 (2021)

Nonlinear Control of Dual UAV Slung Load Flight System

801

17. Sun, H.,Gu, X.,Luo, S., Liang, Y., Bai, J.: Robust stabilization technique for a quadrotor slung-load system using sliding mode control. J. Phys. Conf. Ser. 2232, 012013 (2022) 18. Alkomy, H., Shan, J.: Vibration reduction of a quadrotor with a cable-suspended payload using polynomial trajectories. Nonlinear Dyn. 104(4), 3713–3735 (2021) 19. Alothman, Y., Guo, M., Gu, D.: Using iterative LQR to control two quadrotors transporting a cable-suspended load. IFAC-PapersOnLine 50(1), 4324–4329 (2017) 20. Xian, B.,Wang, G.-Y., Cai, J.-M.: Nonlinear robust control design for multi unmanned aerial vehicles suspended payload transportation system. J. Jilin Univ. Eng. Technol. Edn. 1–13

A LiDAR Point Cloud Semantic Segmentation Algorithm Based on Attention Mechanism and Hybrid CNN-LSTM

Shuhuan Wen1(B), Yunfei Lu1, Tao Wang1, Artur Babiarz2, Mhamed Sayyouri3, and Huaping Liu4

1 Yanshan University, Qinhuangdao, China
[email protected]
2 Silesian University of Technology, Gliwice, Poland
3 Sidi Mohamed Ben Abdellah University, Fez, Morocco
4 Tsinghua University, Beijing, China

Abstract. Environment perception is a prerequisite for autonomous driving, and perceiving environmental information and interacting with dynamic scenes through LiDAR is of crucial significance. To address the tendency to neglect global contextual information when extracting point cloud features, this paper proposes a LiDAR point cloud semantic segmentation algorithm based on an attention mechanism and a hybrid CNN-LSTM. The proposed method first uses a point embedding module to map points with similar features in the point cloud to nearby positions in a high-dimensional space. An offset feature combining the self-attention features and the input features is then used as the input of the next-level attention module. The attention scores are redistributed into a 2D coordinate system as a pseudo-image, from which high-level semantic information is extracted and segmented by encoding and decoding networks. Experiments were performed on the SemanticKITTI dataset. Compared with other methods, the proposed algorithm achieves higher semantic segmentation accuracy and shows a significant improvement for most object classes.

Keywords: autonomous driving · LiDAR point cloud · semantic segmentation · attention mechanism

1 Introduction

Autonomous driving has attracted widespread attention in recent years due to its potential to revolutionize transportation and make driving safer, more efficient, and more accessible. One of the key challenges in developing autonomous driving systems is the ability to accurately perceive and understand the surrounding environment in real time [1–3]. LiDAR offers high detection accuracy, strong penetration ability, and the ability to obtain spatial information about objects, and as the technology has matured its cost has dropped substantially. LiDAR is therefore widely used in autonomous driving as a sensor for perceiving the surrounding environment. In particular, semantic segmentation of point clouds acquired by LiDAR plays a critical role in identifying objects and their attributes, which is essential for safe and reliable autonomous driving.

In recent years, deep learning-based methods have made remarkable progress in semantic segmentation of LiDAR point clouds [4]. However, existing methods still face challenges in capturing global contextual information and effectively integrating it into the segmentation process. An LSTM network can alleviate the long-distance memory loss of an RNN to a certain extent, but its capacity to "remember" information is limited, and features are lost in the process. The attention mechanism filters out local key information from the overall information and focuses on it, which makes it well suited to large-scale point cloud processing. Therefore, we propose a novel LiDAR point cloud semantic segmentation algorithm based on an attention mechanism and a hybrid CNN-LSTM [6]. Our contributions are as follows:

1. We optimize the attention mechanism for the characteristics of point cloud data and apply it to the LiDAR point cloud semantic segmentation task.
2. Experiments demonstrate that our network significantly outperforms other methods in segmentation accuracy.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 802–808, 2023. https://doi.org/10.1007/978-981-99-6187-0_80

2 Methodology

PCT [5] is a network designed for semantic segmentation tasks in small-scale point cloud scenes. Due to its simple network structure and limited number of parameters, it is not suitable for large-scale outdoor point cloud semantic segmentation tasks. Therefore, this paper uses the idea of PCT to design an optimized PCT module and proposes a novel LiDAR point cloud semantic segmentation algorithm based on PCT and a hybrid CNN-LSTM. We first use the bird's-eye-view projection [7] to transform the raw point cloud data from 3D to 2D. Based on the PCT framework, point cloud features are extracted for each grid, as shown in Fig. 1 [5].

The network used in this paper for extracting point cloud features with an attention mechanism consists of three modules. The first is the point embedding module, which places points with similar semantics at nearby positions in the embedding space and increases the dimensionality of the input data. A two-layer MLP network embeds the point cloud P into a d-dimensional space Fe.

The second is the sampling module. The point embedding process of PCT can extract global features but tends to overlook the connections between local features, so we propose a local feature aggregation strategy. The sampling module enlarges the receptive field of the neural network and reduces the number of network parameters. We use farthest point sampling to ensure the uniformity of point cloud sampling. Specifically, the sampling layer takes a point cloud Pin with Nin points and corresponding features Fin as input, and outputs a sampled point cloud Pout with Nout points and corresponding aggregated features Fout.

The third is the offset attention module. As shown in Eq. (1), the query item Q, the key item K and the value item V are obtained to compute the semantic correlation between different items in the data sequence [8]. The input feature, denoted Fin, is the data after farthest point sampling.

(Q, K, V) = Fin · (Wq, Wk, Wv)    (1)
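The farthest point sampling step of the sampling module can be sketched as follows. This is a minimal NumPy sketch of the standard greedy algorithm; the function name and the random test points are ours, not from the paper.

```python
import numpy as np

def farthest_point_sampling(points, n_out):
    """Greedily select n_out points that maximize mutual spacing.

    points: (N, 3) array of xyz coordinates.
    Returns the indices of the sampled points.
    """
    n = points.shape[0]
    idx = np.zeros(n_out, dtype=int)
    # Squared distance from every point to the current sample set.
    dist = np.full(n, np.inf)
    idx[0] = 0  # start from an arbitrary seed point
    for i in range(1, n_out):
        # Refresh distances using the most recently added point.
        diff = points - points[idx[i - 1]]
        dist = np.minimum(dist, np.einsum('ij,ij->i', diff, diff))
        idx[i] = int(np.argmax(dist))
    return idx

# Example: keep 4 of 100 random points.
pts = np.random.default_rng(0).random((100, 3))
sample = farthest_point_sampling(pts, 4)
```

Because each newly chosen point is the one farthest from all points already selected, the retained subset covers the cloud uniformly, which is the property the sampling module relies on.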

Fig. 1. PCT-based LiDAR point cloud feature extraction algorithm framework

Equation (2) gives the output attention score [8].

Fsa = softmax(Q Kᵀ) V    (2)

The offset attention module can increase the attention weight and reduce the effect of noise, which is beneficial for the next layer of the network to extract features. The offset is fed into an MLP network in a feedforward manner to replace the self-attention feature [5], as shown in Eq. (3).

Fout = OA(Fin) = MLP(Fin − Fsa) + Fin    (3)

In the actual algorithm, the network structure in the dashed box is treated as a single module, called the point cloud attention feature extraction module. It can be cascaded to perform multiple attention feature extractions on the input point cloud features. After the original point cloud features are extracted by the attention mechanism module, the output feature Fout is used as a pseudo-image, where the feature dimension of Fout serves as the number of channels of the pseudo-image. The whole process of extracting point cloud attention features in Fig. 1 is represented as a single module in Fig. 2.

Similar to [6], the overall framework is shown in Fig. 2. To reduce the model parameters and improve running speed, we simplify the neural network that processes the pseudo-images: both the encoder and the decoder consist of three layers. First, the original point cloud is fed into the attention feature extraction module to obtain the offset-attention feature map. Then, the resulting pseudo-image is fed into both a bidirectional LSTM network and a CNN encoding network to extract high-dimensional semantic features. Finally, the features of the two channels are fused and passed through deconvolution and upsampling to restore the feature map to the original size and output the predicted label for each point.
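Equations (1)–(3) together define one offset attention block. The following minimal NumPy sketch shows the data flow; the single-ReLU-layer MLP, weight scales, and tensor sizes are our own illustrative assumptions, not the paper's trained network.

```python
import numpy as np

rng = np.random.default_rng(1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def offset_attention(F_in, Wq, Wk, Wv, Wm, bm):
    # Eq. (1): project the input into query, key and value items.
    Q, K, V = F_in @ Wq, F_in @ Wk, F_in @ Wv
    # Eq. (2): attention scores applied to the values.
    F_sa = softmax(Q @ K.T) @ V
    # Eq. (3): feed the offset F_in - F_sa through an MLP (a single
    # ReLU layer here) and add the input back as a residual.
    offset = F_in - F_sa
    return np.maximum(offset @ Wm + bm, 0.0) + F_in

N, d = 64, 16                      # points in a cell, embedding dim (assumed)
F_in = rng.standard_normal((N, d))
Wq, Wk, Wv, Wm = (0.1 * rng.standard_normal((d, d)) for _ in range(4))
bm = np.zeros(d)
F_out = offset_attention(F_in, Wq, Wk, Wv, Wm, bm)
```

The residual form of Eq. (3) is what allows the block to be cascaded: the output has the same shape as the input, so several blocks can be stacked before the result is scattered into the pseudo-image.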


Fig. 2. The overall framework of our network

3 Experiments

We validate the proposed algorithm on the SemanticKITTI dataset. The system runs Ubuntu 16.04 on an Intel Core i9-10940X CPU with an NVIDIA RTX A6000 GPU. The experiments follow the official allocation scheme for the training, validation, and test sets. During training, we continuously adjust the parameters to obtain an optimal set, and finally compare the performance of the various algorithms.

3.1 Parameter Setting

The parameter settings for the experiments are shown in Table 1.

Table 1. Hyperparameter settings

Parameter                      Symbol                 Value
Grid resolution                L_PC × W_PC × H_PC     280 × 280 × 24
Maximum training rounds        N_max                  30
Maximum steps per round        T_max                  19130
Hilbert curve sequence level   H                      5
Learning rate                  lr                     0.5
LSTM sequence length           L_LSTM                 1024
Pseudo-image resolution        L_LSTM × W_LSTM        32 × 32
Attention output dimension     O_PCT                  128

3.2 Experiment Results

In this section, we show the performance of our algorithm on SemanticKITTI. First, we conduct a qualitative analysis to evaluate the effectiveness of our algorithm: we compare it with PolarNet [9] and showcase the semantic segmentation results in various scenarios. Then, comparing against the data of three other algorithms on SemanticKITTI, we analyze the performance of our algorithm quantitatively.

Figure 3 illustrates the semantic segmentation of point clouds in two specific scenarios. The left column shows an environment containing multiple target classes in an urban road scene, where the segmentation performance for "bicycle" is notably different, as highlighted in the red box (correctly predicted points are shown in light purple). The proposed algorithm accurately segments objects such as cars and bicycles, while PolarNet segments the "bicycle" class poorly. The right column displays a crossroad scene where traffic signs are highlighted in red boxes (correctly predicted points are shown in red). PolarNet predicts very few points correctly, whereas the hybrid CNN-LSTM algorithm based on the attention mechanism achieves complete semantic segmentation of the traffic signs, demonstrating the superior perception ability of the proposed algorithm.

Fig. 3. Comparison of semantic segmentation performance in two complex scenarios (a. PolarNet; b. Ours)

Table 2 shows the performance comparison between our approach and multiple baselines, where bold font indicates the best semantic segmentation IoU and FPS values for each class. Analysis of the FPS metric shows that the proposed algorithm offers a certain speed improvement. In terms of mIoU, the proposed algorithm achieves 57.1%, higher than the other algorithms, and exhibits better segmentation performance for most classes. Notably, the proposed algorithm shows significant improvement on the "truck", "person" and "fence" classes. However, for the "parking" class, the proposed algorithm performs worse than the other networks. This is because the LSTM network and Hilbert curve used in our algorithm make it difficult for the neural network to distinguish the features of the "parking" class from surrounding classes such as "road", which are very similar. Table 2 demonstrates that the attention mechanism-based algorithm achieves higher accuracy for most classes, indicating that the attention mechanism can be effectively applied to semantic segmentation of LiDAR point clouds.

Table 2. Segmentation results on the test split of SemanticKITTI

Model            RandLA [10]   RangeNet++ [11]   SqueezeSegV3 [12]   Ours
Size             50K pts       64 × 2048         64 × 2048           [280, 280, 24]
FPS              5             11                6                   13
mIoU (%)         53.9          52.2              55.9                57.1
Per-class IoU (%):
car              94.2          91.4              92.5                91.7
bicycle          26.0          25.7              38.7                46.8
person           49.2          38.3              45.6                56.0
bicyclist        48.2          38.8              46.2                50.7
road             90.7          91.8              91.7                91.7
truck            40.1          25.7              29.6                62.2
building         86.9          87.4              89.0                87.3
vegetation       81.4          80.5              82.0                82.5
parking          60.3          65.0              63.4                39.1
fence            56.3          58.6              59.4                66.5
trunk            61.3          55.1              58.7                58.8
motorcycle       25.8          34.4              36.5                53.5
bus              38.9          23.0              33.0                38.9
terrain          66.8          64.6              65.4                72.6
pole             49.2          47.9              49.6                49.1
sidewalk         73.7          75.2              74.8                75.2
traffic-sign     47.7          55.9              58.9                39.6
motorcyclist     7.2           4.8               20.1                11.0
other-ground     20.4          27.8              26.4                19.7
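The per-class IoU and mIoU figures reported above follow the standard definition IoU = TP / (TP + FP + FN), averaged over classes. A minimal NumPy sketch with a toy label array (our own helper, not the benchmark's evaluation code):

```python
import numpy as np

def per_class_iou(pred, gt, n_classes):
    """IoU per class from flat label arrays: TP / (TP + FP + FN)."""
    ious = np.full(n_classes, np.nan)
    for c in range(n_classes):
        tp = np.sum((pred == c) & (gt == c))
        fp = np.sum((pred == c) & (gt != c))
        fn = np.sum((pred != c) & (gt == c))
        denom = tp + fp + fn
        if denom > 0:
            ious[c] = tp / denom
    return ious

# Toy check: 3 classes on 6 points.
gt   = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
iou = per_class_iou(pred, gt, 3)
miou = np.nanmean(iou)  # classes absent from both pred and gt are skipped
```

Skipping classes whose denominator is zero matches the usual convention of excluding classes that never appear in a split from the mean.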

4 Conclusion

In this paper, we propose a LiDAR point cloud semantic segmentation algorithm based on an attention mechanism and a hybrid CNN-LSTM. To address the feature loss that occurs when processing raw point clouds, we propose an attention mechanism network model for processing global point cloud features. Leveraging the attention mechanism's ability to capture internal feature correlations and its insensitivity to input sequence order, we apply it to LiDAR point cloud semantic segmentation. By replacing the attention features with the offsets between the attention features and the input of the self-attention module, we optimize the feature extraction method. Experimental results show that the proposed algorithm achieves higher semantic segmentation accuracy and runs faster, demonstrating its better perceptual capability. The results also show that the attention mechanism can be effectively applied to LiDAR point cloud semantic segmentation: it focuses on local key information in large-scale point cloud data and improves the algorithm's segmentation accuracy.

Acknowledgements. The work was partly supported by the National Natural Science Foundation of China (NSFC, Project No. 62273296) and the Hebei Innovation Capability Improvement Plan Project (22567619H).

References

1. Hang, P., Lv, C., Huang, C.: An integrated framework of decision making and motion planning for autonomous vehicles considering social behaviors. IEEE Trans. Veh. Technol. 69(12), 14458–14469 (2020)
2. Miller, I.D.: Any way you look at it: semantic crossview localization and mapping with LiDAR. IEEE Robot. Autom. Lett. 6(2), 2397–2404 (2021)
3. Charles, R.Q., Su, H., Kaichun, M., Guibas, L.J.: PointNet: deep learning on point sets for 3D classification and segmentation. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, pp. 77–85. IEEE, Honolulu, HI (2017)
4. Chen, X., Li, S., Mersch, B.: Moving object segmentation in 3D LiDAR data: a learning-based approach exploiting sequential data. IEEE Robot. Autom. Lett. 6(4), 6529–6536 (2021)
5. Guo, M.H., Cai, J.X., Liu, Z.N.: PCT: point cloud transformer. Comput. Vis. Media 7(2), 187–199 (2021)
6. Wen, S., Wang, T., Tao, S.: Hybrid CNN-LSTM architecture for LiDAR point clouds semantic segmentation. IEEE Robot. Autom. Lett. 7(3), 5811–5818 (2022)
7. Luo, L., Cao, S.Y., Han, B.: BVMatch: LiDAR-based place recognition using bird's-eye view images. IEEE Robot. Autom. Lett. 6(3), 6076–6083 (2021)
8. Vaswani, A., Shazeer, N., Parmar, N.: Attention is all you need. In: Advances in Neural Information Processing Systems (NIPS), pp. 5999–6009. ACM, California (2017)
9. Zhang, Y., Zhou, Z., David, P.: PolarNet: an improved grid representation for online LiDAR point clouds semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 9601–9610. IEEE, Seattle, WA (2020)
10. Hu, Q., Yang, B., Xie, L.: RandLA-Net: efficient semantic segmentation of large-scale point clouds. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11108–11117. IEEE, Seattle, WA (2020)
11. Milioto, A., Vizzo, I., Behley, J.: RangeNet++: fast and accurate LiDAR semantic segmentation. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4213–4220. IEEE, Macau (2019)
12. Xu, C., Wu, B., Wang, Z., Zhan, W., Vajda, P., Keutzer, K., Tomizuka, M.: SqueezeSegV3: spatially-adaptive convolution for efficient point-cloud segmentation. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12373, pp. 1–19. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58604-1_1

Robust Ascent Trajectory Optimization for Hypersonic Vehicles Based on IGS-UMPSP

Yuting Qi, Bo Wang(B), Lei Liu, Huijin Fan, and Yongji Wang

National Key Laboratory of Multispectral Information Intelligent Processing Technology, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
[email protected]

Abstract. This paper proposes a new approach for solving trajectory optimization problems with uncertain system parameters, which is the generalized quasi-spectral version of unscented model predictive static programming. The proposed method combines two existing techniques, the generalized quasi-spectral model predictive static programming and the unscented optimal control, to generate control solutions that can meet multiple constraints, while minimizing the covariance of the terminal output. The technique has shown good robustness in simulations and has been demonstrated to be effective in solving ascent trajectory optimization problems of hypersonic vehicles, providing a reference trajectory for guidance.

Keywords: trajectory optimization · MPSP · unscented optimal control

1 Introduction

Model predictive static programming (MPSP) is a popular algorithm that combines approximate dynamic programming and model predictive control to solve two-point boundary value problems with terminal constraints [8]. Its excellent computational efficiency has made it a popular choice for trajectory optimization and guidance problems in the aerospace field, as evidenced by its applications in studies such as [2,10]. However, the MPSP algorithm does not consider the uncertainty of system parameters, which can be a major limitation in certain applications [11]. To address this issue, researchers proposed a new algorithm called unscented model predictive static programming (U-MPSP) in [3]. This algorithm applies the unscented optimal control philosophy, which is a specific application of Riemann-Stieltjes optimal control theory [7,9]. While U-MPSP has proven effective on benchmark problems, it still cannot address the complex trajectory optimization problem of hypersonic vehicles with multiple constraints, which is attributed to limitations of the MPSP algorithm itself.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 809–820, 2023. https://doi.org/10.1007/978-981-99-6187-0_81


Recently, researchers proposed the improved generalized quasi-spectral model predictive static programming algorithm (IGS-MPSP) in [1], which can solve the multi-constraint optimal control problem in the continuous-time framework. As a variant of the MPSP algorithm, IGS-MPSP expresses the control in spectral form, as in Quasi-Spectral MPSP (QS-MPSP) [5], and obtains the sensitivity matrix by the Gaussian quadrature collocation method, which greatly improves computational efficiency. Inspired by the IGS-MPSP technique, this paper proposes a new algorithm named improved generalized quasi-spectral unscented MPSP (IGS-UMPSP). This method can effectively solve the ascent trajectory optimization problem of hypersonic vehicles, and the simulation results show that the control solution has good robustness, providing a good reference trajectory for guidance.

2 Unscented Optimal Control

To address the uncertainty of system parameters, researchers have proposed the concept of unscented optimal control (see [3,4]). This method combines the unscented transformation with optimal control concepts to reduce the influence of uncertainties on the control solutions. A general nonlinear system in the continuous-time framework is defined as

Ẋ(t) = f(X, U, p, t)    (1)
Y(t) = h(X, t)    (2)

where X ∈ ℝⁿ, U ∈ ℝᵐ, Y ∈ ℝᵒ and p ∈ ℝ^{np} denote the state, control, output and system parameters, respectively. t ∈ [t0, tf] represents time, with t0 the initial time and tf the final time. The state is defined as

X ≜ [x1 x2 ... xn]ᵀ ∈ ℝⁿ    (3)

The goal of unscented optimal control is to design a control solution that is insensitive to the uncertain parameters of the original system, thereby reducing the output error ellipsoid and improving robustness. According to the unscented transformation philosophy, the result of applying a nonlinear transformation to a probability distribution can be estimated through a finite set of sigma points. From the mean μp and covariance matrix Pp of the parameter, the sigma points can be derived as

p0 = μp
pj = μp + (√Pp)_j,        j = 1, ..., np    (4)
pj = μp − (√Pp)_{j−np},   j = np + 1, ..., 2np
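Equation (4) can be sketched directly in NumPy. This is a minimal sketch using a Cholesky factor as the matrix square root (the paper does not specify which square root is used); the mean and covariance values are our own toy numbers.

```python
import numpy as np

def parameter_sigma_points(mu, P):
    """Sigma points of Eq. (4): the mean plus/minus the columns of a
    matrix square root of the covariance P (Cholesky factor here)."""
    n_p = mu.shape[0]
    S = np.linalg.cholesky(P)  # lower-triangular, S @ S.T == P
    pts = [mu]
    for j in range(n_p):
        pts.append(mu + S[:, j])
    for j in range(n_p):
        pts.append(mu - S[:, j])
    return np.stack(pts)       # shape (2*n_p + 1, n_p)

mu = np.array([0.0, 1.0])      # toy parameter mean
P = np.diag([0.04, 0.09])      # toy parameter covariance
sigma = parameter_sigma_points(mu, P)
```

By construction the symmetric plus/minus pairs cancel, so the sample mean of the sigma points recovers μp exactly, which is what lets the augmented system below track the mean behaviour of the uncertain dynamics.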

According to these sigma points, the augmented system is defined as

X̂̇ = f̂(X̂, U, t) = [ f(χ1, U, t)ᵀ  f(χ2, U, t)ᵀ  ···  f(χnσ, U, t)ᵀ ]ᵀ    (5)

where nσ = 2np + 1 is the number of sigma points. Each χj corresponds to a specific sigma point; for example, χ1 corresponds to (X(t0), p0).

Define the output of the augmented system as the mean value of the augmented state X̂(tf), which can be written as

Ŷ(X̂(tf)) = ĥ(X̂, tf) = (1/nσ) [ Σ_{i=1}^{nσ} xi¹(tf)  ···  Σ_{i=1}^{nσ} xiⁿ(tf) ]ᵀ    (6)

The control trajectory must ensure that the augmented output at the final time, Ŷ(X̂(tf)), achieves the desired value Y*(X̂(tf)), while minimizing the trace of the covariance matrix at the final time. This requirement is reflected in the cost function, which is optimized by the IGS-UMPSP algorithm. By combining the IGS-MPSP algorithm with unscented optimal control concepts, the proposed algorithm can effectively handle the uncertainty of system parameters and provide robust control solutions for ascent trajectory optimization problems.

3 Mathematical Formulation of the IGS-UMPSP

This section presents the derivation of the proposed IGS-UMPSP method for nonlinear systems in the continuous-time framework. In order to make the output meet the desired value at the final time, define the terminal output error as

ΔŶ(X̂(tf)) = Y*(X̂(tf)) − Ŷ(X̂(tf))    (7)

IGS-UMPSP operates as follows: given an initial guess, the control is continuously updated based on the existing deviation so as to find an optimal control that minimizes the output error ΔŶ(X̂(tf)), i.e. drives ΔŶ(X̂(tf)) → 0, while the output trajectory satisfies multiple constraints and exhibits minimal impact from disturbances. In accordance with IGS-MPSP theory, the terminal output error can be expressed as

ΔŶ(X̂(tf)) ≅ dŶ(X̂(tf))
  = (∂Ŷ(X̂(tf))/∂X̂(tf)) dX̂(tf)
    + ∫_{t0}^{tf} W(t) [ (∂f̂(X̂, U, t)/∂X̂(t)) dX̂(t) + (∂f̂(X̂, U, t)/∂U(t)) dU(t) − dẊ̂(t) ] dt
  = [ ∂Ŷ(X̂(tf))/∂X̂(tf) − W(tf) ] dX̂(tf) + W(t0) dX̂(t0)
    + ∫_{t0}^{tf} [ ( Ẇ(t) + W(t) ∂f̂(X̂, U, t)/∂X̂(t) ) dX̂(t) + W(t) (∂f̂(X̂, U, t)/∂U(t)) dU(t) ] dt    (8)

where W(t) ∈ ℝ^{n×n·nσ} is the weighting matrix. According to the MPSP concept, the terminal error should only be related to the control, so no state variable should appear in (8). Since dU(t) and dX̂(t) in (8) are independent of each other, selecting an appropriate weighting matrix can eliminate dX̂(t); the selection rules are as follows:

W(tf) = ∂Ŷ(X̂(tf))/∂X̂(tf)    (9)

Ẇ(t) = −W(t) ∂f̂(X̂, U, t)/∂X̂(t)    (10)

Equation (10) can be solved by Gaussian quadrature collocation [12]. The initial condition is specified (dX̂(t0) = 0), so the terminal error in (8) can be written as

dŶ(X̂(tf)) = ∫_{t0}^{tf} Bs(t) · dU(t) dt    (11)

Bs(t) = W(t) · ∂f̂(X̂, U, t)/∂U(t)    (12)

where Bs(t) is the sensitivity matrix, which reflects the relationship between the control variation dU(t) and the terminal output variation dŶ(X̂(tf)).

In order to reduce the number of control variables and improve computational efficiency, a spectral representation of the control is obtained by selecting a limited set of basis functions:

dU(t) = Σ_{j=1}^{Np} dCj Pj(t)    (13)

where Np is the number of basis functions, and Pj(t) and Cj ∈ ℝᵐ represent the jth basis function and its coefficient, respectively. Substituting (13) into (11),

dŶ(X̂(tf)) = ∫_{t0}^{tf} Bs(t) · ( Σ_{j=1}^{Np} dCj Pj(t) ) dt = Σ_{j=1}^{Np} ( ∫_{t0}^{tf} Bs(t) · Pj(t) dt ) · dCj = Σ_{j=1}^{Np} Fj dCj    (14)

Fj = ∫_{t0}^{tf} Bs(t) · Pj(t) dt    (15)
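Equations (13)–(15) reduce the infinite-dimensional control variation to a few coefficient vectors. A minimal NumPy sketch of computing the spectral matrices Fj: the scalar system, the polynomial basis Pj(t) = t^(j−1), the sensitivity profile Bs(t) = 1 + t, and the trapezoid quadrature (standing in for the paper's Gaussian quadrature collocation) are all our own toy assumptions.

```python
import numpy as np

def trapezoid(y, t):
    """Composite trapezoid quadrature of samples y over grid t."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t)))

t = np.linspace(0.0, 1.0, 2001)   # time grid on [t0, tf] = [0, 1]
Np = 3
Bs_t = 1.0 + t                    # assumed scalar sensitivity profile

# Eq. (15): F_j = integral of Bs(t) * P_j(t) over [t0, tf].
F = np.array([trapezoid(Bs_t * t**j, t) for j in range(Np)])

# Eq. (14): terminal output change predicted for a coefficient update dC.
dC = np.array([0.1, -0.2, 0.05])
dY = float(F @ dC)
```

Once the Fj are tabulated, the effect of any coefficient update on the terminal output is a single matrix-vector product, which is the source of the method's computational efficiency.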


where Fj is the spectral sensitivity matrix. Equation (14) gives the method to calculate the decision variables dCj from the terminal error dŶ(X̂(tf)).

Next, to minimize the trace of the terminal covariance matrix Tr(P_X̂(tf)), the variable Ẑ(X̂(tf)) is defined as the error between the terminal augmented state and its mean value

Ẑ(X̂(tf)) = X̂(tf) − X̄(tf)    (16)

where X̂(tf) ≜ [χ1(tf)ᵀ ... χnσ(tf)ᵀ]ᵀ and X̄(tf) ≜ (1/nσ) [Σ_{i=1}^{nσ} χi(tf)ᵀ ... Σ_{i=1}^{nσ} χi(tf)ᵀ]ᵀ. Consequently, the trace of the terminal covariance matrix of the augmented system can be expressed as Ẑ(X̂(tf))ᵀ Q Ẑ(X̂(tf)) / nσ², where Q is the tuning matrix. To minimize the covariance matrix trace, set Ẑ*(X̂(tf)) = 0; using small approximations, Ẑ(X̂(tf)) can be expressed as

Ẑ(X̂(tf)) = Ẑ*(X̂(tf)) + ΔẐ(X̂(tf)) ≅ dẐ(X̂(tf))    (17)

Similar to (14), Eq. (17) can be rewritten as

Ẑ(X̂(tf)) = Σ_{j=1}^{Np} Kj dCj    (18)

where

Kj = ∫_{t0}^{tf} Bz(t) · Pj(t) dt    (19)

Bz(t) = Wz(t) · ∂f̂(X̂, U, t)/∂U(t)    (20)

Wz(tf) = ∂Ẑ(X̂(tf))/∂X̂(tf)    (21)

where the calculation method of Wz(t) is identical to that of W(t) in (10).

Subsequently, the transformation of the multiple constraints is performed. The constraints can be written in the form of an inequality

Ĝ(X̂, U, t) ≤ Ĝmax    (22)

where Ĝ ∈ ℝ^{nσ·l} denotes the constraints, l denotes the number of constraints and Ĝmax signifies the allowable range of the constraints. Since the small-approximation method is used in (8) and (17) (ΔŶ(X̂(tf)) ≅ dŶ(X̂(tf)), ΔẐ(X̂(tf)) ≅ dẐ(X̂(tf))), obtaining the anticipated solution necessitates multiple iterations. Define the relationship between the constraint variables of the ith and (i+1)th iterations, and use Taylor expansion while neglecting higher-order terms:

Ĝ(X̂^{i+1}, U^{i+1}, tk) = Ĝ(X̂^i, U^i, tk) + ΔĜ(X̂^i, U^i, tk)
  ≅ Ĝ(X̂^i, U^i, tk) + (∂Ĝ(X̂^i, U^i, tk)/∂X̂^i) dX̂^i(tk) + (∂Ĝ(X̂^i, U^i, tk)/∂U^i) dU^i(tk)    (23)

where tk ∈ [t0, tf] represents a sampling point. According to the transformation method of (8)–(14), dX̂(tk) can be written as

dX̂(tk) = Σ_{j=1}^{Np} Tjk · dCj    (24)

where

Tjk = ∫_{t0}^{tk} Bgk(t) · Pj(t) dt    (25)

Bgk(t) = Wg(t) · ∂f̂(X̂, U, t)/∂U(t),   t ∈ [t0, tk]    (26)

Wg(tk) = I_{nσ×nσ}    (27)

Substituting (13) and (24) into (23) results in

Ĝ(X̂^{i+1}, U^{i+1}, tk) = Ĝ(X̂^i, U^i, tk) + Σ_{j=1}^{Np} Hjk dCj^i    (28)

where

Hjk = (∂Ĝ(X̂^i, U^i, tk)/∂X̂^i) · Tjk + (∂Ĝ(X̂^i, U^i, tk)/∂U^i) · Pj(tk)    (29)

It is obvious that the updated constraints must satisfy

Ĝ(X̂^{i+1}, U^{i+1}, tk) ≤ Ĝmax    (30)

Utilizing (28), the inequality can be rewritten as

Σ_{j=1}^{Np} Hjk dCj^i ≤ Ḡk    (31)

where Ḡk = Ĝmax(tk) − Ĝ(X̂^i, U^i, tk) represents the error between the constraints and the constraint thresholds. Thus far, utilizing (11), (18) and (31), the output error, the output covariance matrix trace and the inequality constraints can all be expressed in terms of dCj. Through the above derivation, the optimization problem based on IGS-UMPSP can be written as

min J = (1/2) Σ_{j=1}^{Np} (Cj^i + dCj^i)ᵀ Rj (Cj^i + dCj^i) + (1/2) ( Σ_{j=1}^{Np} Kj dCj^i )ᵀ Q ( Σ_{j=1}^{Np} Kj dCj^i )    (32)


subject to

Σ_{j=1}^{Np} Fj · dCj^i = dŶ(X̂(tf))    (33)

Σ_{j=1}^{Np} Hjk dCj^i ≤ Ḡk,   k = 1, ..., N    (34)

where both Q and Rj are positive matrices. By employing the Lagrange multiplier λ and the penalty function method, Eq. (32) can be written as

J̄ = (1/2) Σ_{j=1}^{Np} (Cj^i + dCj^i)ᵀ Rj (Cj^i + dCj^i) + (1/2) ( Σ_{j=1}^{Np} Kj dCj^i )ᵀ Q ( Σ_{j=1}^{Np} Kj dCj^i )
    + (1/2) Σ_{k=1}^{N} ( Σ_{j=1}^{Np} Hjk dCj^i − Ḡk )ᵀ σk ( Σ_{j=1}^{Np} Hjk dCj^i − Ḡk )
    + λᵀ ( dŶN − Σ_{j=1}^{Np} Fj dCj^i )    (35)

where σk is the penalty parameter. The first-order optimality conditions are expressed as

∂L/∂dCl = Rl (Cl^i + dCl^i) + Klᵀ Q Σ_{j=1}^{Np} Kj dCj^i + Σ_{k=1}^{N} (Hlk)ᵀ σk ( Σ_{j=1}^{Np} Hjk dCj^i ) − Σ_{k=1}^{N} (Hlk)ᵀ σk Ḡk − Flᵀ λ = 0,   l = 1, ..., Np
∂L/∂λ = dŶN − Σ_{j=1}^{Np} Fj dCj^i = 0    (36)

Rewriting (36) in matrix form,

[ A11 + R1   A12        ···  A1,Np           −F1ᵀ  ] [ dC1^i  ]   [ b1  ]
[ A21        A22 + R2   ···  A2,Np           −F2ᵀ  ] [ dC2^i  ]   [ b2  ]
[  ⋮           ⋮         ⋱    ⋮                ⋮   ] [   ⋮    ] = [  ⋮  ]    (37)
[ ANp,1      ···        ···  ANp,Np + RNp    −FNpᵀ ] [ dCNp^i ]   [ bNp ]
[ F1         F2         ···  FNp              0    ] [   λ    ]   [ dŶN ]

where

Aij = Kiᵀ Q Kj + Σ_{k=1}^{N} (Hik)ᵀ σk Hjk
bi = Σ_{k=1}^{N} (Hik)ᵀ σk Ḡk − Ri Ci^i    (38)

The decision variables dCj can be obtained by solving (37), thereby updating the control:

Cj^{i+1} = Cj^i + dCj^i    (39)

U(t) = Σ_{j=1}^{Np} Cj^{i+1} Pj(t)    (40)

The above constitutes a complete control update process. As iterations are carried out, the control solution is continuously updated, ensuring that the final trajectory meets the expected value within the allowable error range.
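The linear system (37) is a standard equality-constrained KKT system. A minimal NumPy sketch of one control-update solve follows; the blocks A, F, b and dY_N are random stand-ins with the right shapes, not values derived from the paper's problem.

```python
import numpy as np

rng = np.random.default_rng(0)
Np, m, o = 3, 1, 1          # basis functions, control dim, output dim (assumed)

# Stand-in blocks: A plays the role of the [A_ij + R_j] block (symmetric
# positive definite), F stacks the spectral sensitivity matrices F_j.
M = rng.standard_normal((Np * m, Np * m))
A = M @ M.T + np.eye(Np * m)
F = rng.standard_normal((o, Np * m))
b = rng.standard_normal(Np * m)
dY_N = rng.standard_normal(o)

# Assemble the KKT matrix of Eq. (37) and solve for (dC, lambda).
K = np.block([[A, -F.T],
              [F, np.zeros((o, o))]])
rhs = np.concatenate([b, dY_N])
sol = np.linalg.solve(K, rhs)
dC, lam = sol[:Np * m], sol[Np * m:]
```

At the solution the bottom block row enforces the terminal-output equality constraint (33), F·dC = dŶN, while the top rows enforce stationarity of the augmented cost; this is exactly the structure the iterative update (39)–(40) relies on at each step.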

4 Simulations

To verify the effectiveness of the IGS-UMPSP method, the ascent trajectory optimization problem of a hypersonic vehicle is considered. The trajectory of hypersonic vehicles is greatly affected by atmospheric parameters during the ascent phase. The experimental results show that the trajectory obtained by IGS-UMPSP can shrink the error ellipse while satisfying multiple constraints.

4.1 Problem Setup

There are complex constraints and an uncertain external environment during the ascent of the aircraft. Treating the aircraft as a mass point, the three-degree-of-freedom dynamics model (see [6]) is given as

ḣ = v sin γ
v̇ = (T cos α − D)/m − g sin γ
γ̇ = (T sin α + L)/(mv) − (g/v) cos γ + (v/(h + Re)) cos γ    (41)
ṁ = −T/(g0 Isp)

where X ≜ [h v γ m]ᵀ respectively represent the height, velocity, flight path angle and mass of the vehicle. α represents the angle of attack, which is the control variable. Re and g respectively represent the average radius of the earth

Robust Ascent Trajectory Optimization

and the gravitational acceleration. L and D respectively denote the aerodynamic lift and drag, which can be calculated from the standard atmospheric model. Owing to the error between the actual atmosphere and the standard model, the uncertain parameter vector is defined as p = [ΔL, ΔD]. Define the output as Y ≜ [h v γ]^T. The multiple constraints in the ascent of the aircraft are

αmin < α < αmax,   (42)
q = ½ ρ v² ≤ qmax,   (43)
|YN − YN*| ≤ ε,   (44)

where q is the dynamic pressure; exceeding its threshold may damage the aircraft. The objective of the trajectory optimization problem is for the aircraft to reach the terminal state YN with minimum control power, and the cost function is selected as

J = ½ ∫_{t0}^{tf} U^T R U dt.   (45)

According to the IGS-UMPSP algorithm, the augmented state system can be expressed as

ḣ̂i = v̂i sin γ̂i,
v̂̇i = (T cosα − D)/m̂i − g sin γ̂i,
γ̂̇i = (T sinα + L)/(m̂i v̂i) − (g/v̂i) cos γ̂i + (v̂i/(ĥi + Re)) cos γ̂i,   (46)
m̂̇i = −T/(g0 Isp),

where i = 1, ..., 5 indexes the sigma points

χ1 = (X(t0), ΔL0, ΔD0), χ2 = (X(t0), ΔL1, ΔD0), χ3 = (X(t0), ΔL2, ΔD0), χ4 = (X(t0), ΔL0, ΔD1), χ5 = (X(t0), ΔL0, ΔD2),

with

ΔL0 = μΔL, ΔL1 = μΔL + √σΔL, ΔL2 = μΔL − √σΔL,
ΔD0 = μΔD, ΔD1 = μΔD + √σΔD, ΔD2 = μΔD − √σΔD.

The augmented output is required to meet the desired value, which can be expressed as

Ŷ ≜ (1/nσ) [ Σ_{i=1}^{nσ} hi, Σ_{i=1}^{nσ} vi, Σ_{i=1}^{nσ} γi ]^T → [hf*, vf*, γf*]^T.
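The construction of the five sigma points χ1–χ5 above can be sketched directly; the zero means are an assumption for illustration, while σΔL = σΔD = 0.05 follows the settings in Sect. 4.2.

```python
import math

def sigma_points(mu_L, var_L, mu_D, var_D):
    """Five sigma points over the uncertain lift/drag deviations, mirroring
    chi_1..chi_5 above: the mean point plus +/- sqrt(variance) perturbations
    in each parameter, one at a time."""
    sL, sD = math.sqrt(var_L), math.sqrt(var_D)
    dL = [mu_L, mu_L + sL, mu_L - sL]   # ΔL0, ΔL1, ΔL2
    dD = [mu_D, mu_D + sD, mu_D - sD]   # ΔD0, ΔD1, ΔD2
    return [(dL[0], dD[0]), (dL[1], dD[0]), (dL[2], dD[0]),
            (dL[0], dD[1]), (dL[0], dD[2])]

# Zero-mean deviations (an illustrative assumption) with the Sect. 4.2 spreads.
pts = sigma_points(0.0, 0.05, 0.0, 0.05)
```

Each sigma point seeds one copy of the augmented dynamics (46).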

4.2 Performance of IGS-UMPSP

The simulations are performed using MATLAB 2021b on a computer with a 3.00 GHz processor. The initial and final conditions of the problem are as follows:

h0 = 23165 m, v0 = 1676 m/s, γ0 = 0°, m0 = 127005.86 kg,
hf* = 39264 m, vf* = 4265 m/s, γf* = 0°.

The basis functions of the control are five polynomials of t, so the control can be expressed as U(t) = C1 + C2 t + C3 t² + C4 t³ + C5 t⁴. The initial guesses of the Cj are: C1⁰ = 6.185, C2⁰ = −0.1412, C3⁰ = 0.002015, C4⁰ = −1.090·10⁻⁵, C5⁰ = −2.387·10⁻⁹.
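The polynomial parameterization of the control can be sketched directly, using the initial-guess coefficients quoted above:

```python
def U(t, C):
    """Polynomial control U(t) = C1 + C2*t + C3*t^2 + C4*t^3 + C5*t^4."""
    return sum(c * t**k for k, c in enumerate(C))

# Initial-guess coefficients from the text.
C0 = [6.185, -0.1412, 0.002015, -1.090e-5, -2.387e-9]
alpha0 = U(0.0, C0)   # initial angle-of-attack guess, 6.185
```

With these coefficients the guess stays inside the allowable range −3° ≤ α ≤ 15° quoted below, e.g. at t = 100 s.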

The allowable range of the control α is −3° ≤ α ≤ 15°. The dynamic pressure constraint is selected as qmax = 150 kPa. The uncertainties of the lift and drag are σΔL = 0.05 and σΔD = 0.05. To verify its effectiveness, IGS-UMPSP is compared with other methods. The trace of the output covariance matrix obtained by IGS-UMPSP is greatly reduced, which demonstrates that the method can effectively shrink the error ellipse and obtain a trajectory that is insensitive to interference and exhibits robustness. The trajectory results are shown in Table 1, and the comparison of output trajectories and constraint values between IGS-UMPSP and the other methods is shown in Fig. 1.

Table 1. Comparison of different trajectory optimization methods

Method              Δhf (m)     Δvf (m/s)   Δγ (deg)    qmax (kPa)  Tr(PX̂(tf))
IGS-UMPSP           5.2532E-09  4.7112E-10  3.9877E-14  122.9704    1.3739E+03
IGS-MPSP            3.7502      0.9070      0.0045      151.2740    5.9375E+03
MPSP                8.4047      0.5950      0.0021      126.4763    1.8023E+03
Initial trajectory  477.8507    733.0417    0.0872      229.0936    2.2340E+04

From Fig. 1a to Fig. 1e, it can be observed that the output trajectory obtained by IGS-UMPSP still satisfies the terminal constraints when the parameters are uncertain. As seen in Fig. 1d, since the pseudo-spectral method is used in the IGS-MPSP algorithm to represent the control, the obtained control trajectory is smoother than that of the MPSP algorithm, which meets practical application requirements. Additionally, the computational efficiency of IGS-UMPSP is much better than that of the traditional MPSP method. The primary reason is the different sensitivity calculation: the MPSP algorithm needs continuous substitution at each discrete time node, which greatly increases the computational burden, whereas the IGS-UMPSP algorithm uses Gaussian quadrature collocation to compute the sensitivity matrix with only a few collocation points.

Fig. 1. Comparison between MPSP, IGS-MPSP and IGS-UMPSP

5 Conclusion

A new IGS-UMPSP algorithm is proposed in this paper, combining the IGS-MPSP algorithm with the unscented control method. The IGS-UMPSP algorithm can solve trajectory optimization problems with uncertain system parameters; the obtained control solutions satisfy multiple constraints, including process constraints, while minimizing the covariance trace of the terminal output and shrinking the error ellipse, which gives good robustness. The simulation results show that the IGS-UMPSP algorithm can be applied to the ascent trajectory optimization problem of hypersonic vehicles, providing a good reference trajectory for guidance.

References
1. Liu, L., He, Q., Wang, B., Fu, W., Cheng, Z., Wang, Y.: Ascent trajectory optimization for air-breathing hypersonic vehicles based on IGS-MPSP. J. Guid. Navig. Control 1(02), 2150010 (2021)
2. Luo, Z., Li, X., Wang, L.: Model predictive static programming guidance method with trajectory way-points constraints. J. Appl. Math. Comput. 6(1), 13–18 (2021)
3. Mathavaraj, S., Padhi, R.: Unscented MPSP for optimal control of a class of uncertain nonlinear dynamic systems. J. Dyn. Syst. Meas. Control 141(6), 1–7 (2019)
4. Mathavaraj, S., Padhi, R.: Quasi-spectral unscented MPSP guidance for robust soft-landing on asteroid. J. Optim. Theory Appl. 191, 823–845 (2021)
5. Mondal, S., Padhi, R.: Constrained quasi-spectral MPSP with application to high-precision missile guidance with path constraints. J. Dyn. Syst. Meas. Control 143(3) (2021)
6. Murillo, O., Lu, P.: Fast ascent trajectory optimization for hypersonic air-breathing vehicles. In: Proceedings of AIAA Guidance, Navigation, and Control Conference, pp. 8173–8197 (2010)
7. Ozaki, N., Campagnola, S., Funase, R., Yam, C.H.: Stochastic differential dynamic programming with unscented transform for low-thrust trajectory design. J. Guid. Control Dyn. 41(2), 377–387 (2018)
8. Padhi, R., Kothari, M.: Model predictive static programming: a computationally efficient technique for suboptimal control design. Int. J. Innov. Comput. Inf. Control 5(2), 399–411 (2009)
9. Ross, I.M., Proulx, R.J., Karpenko, M., Gong, Q.: Riemann-Stieltjes optimal control problems for uncertain dynamic systems. J. Guid. Control Dyn. 38(7), 1251–1263 (2015)
10. Sharma, P., Padhi, R.: Mars atmospheric entry guidance using MPSP with state and control constraints. IFAC-PapersOnLine 55(22), 91–96 (2022)
11. Wang, X., et al.: An online generation method of ascent trajectory based on feedforward neural networks. Aerosp. Sci. Technol. 128, 107739 (2022)
12. Zhou, C., Yan, X., Tang, S.: Generalized quasi-spectral model predictive static programming method using Gaussian quadrature collocation. Aerosp. Sci. Technol. 106, 106134 (2020)

Exponential Visual Stabilization of Wheeled Mobile Robots Based on Active Disturbance Rejection Control

Yao Huang(B), Lidong Zhang, and Xinrui Hou

University of Shanghai for Science and Technology, Shanghai 200093, China [email protected]

Abstract. In this paper, an active disturbance rejection control (ADRC) based visual servoing strategy is designed to maneuver a wheeled mobile robot from different initial poses to the desired pose at an exponential rate. In addition, the lack of depth information in monocular vision and the unknown camera-to-robot translational parameters are taken into consideration. In this strategy, an input-state scaling technique is used to obtain two decoupled subsystems, and two control laws are then designed separately. The un-modeled information and disturbances of the system are uniformly estimated by an extended state observer (ESO) and compensated for by a switching controller. Simulation results validate the effectiveness of the proposed scheme. Keywords: Wheeled Mobile Robot · Input State Scaling Technique · Active Disturbance Rejection Control · Exponential Stabilization

1 Introduction

Most wheeled robotic systems are nonholonomic systems subject to non-integrable constraints. The stabilization problem of nonholonomic mobile robots has been extensively studied since the seminal work of Brockett [1], which showed that no continuous time-invariant state feedback control law can asymptotically stabilize a nonholonomic system. To solve this problem, methods including time-varying continuous controllers [2] and discontinuous controllers [3] have been proposed. Visual stabilization uses visual feedback to drive a robot from an initial configuration to a desired configuration defined by an image pre-acquired with a monocular camera. The monocular vision system lacks depth information, which makes visual stabilization of wheeled mobile robots quite difficult. Furthermore, unknown extrinsic camera parameters increase the difficulty of control. In [4], an overhead fixed camera is used to obtain the posture of the mobile robot through three marked points, and two smooth time-varying pose stabilization controllers are developed without any calibration of the camera parameters. The stabilization problem and the simultaneous tracking and regulation problem are investigated for a differential-drive mobile robot with a fixed on-board camera in [5] and [6], respectively. Unknown
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 821–829, 2023. https://doi.org/10.1007/978-981-99-6187-0_82


extrinsic camera parameters are considered in these works and estimated online with adaptive laws. Time-varying terms are used in the controller design, which can yield oscillatory behavior of the control signals and the robot motion and consequently slow down the rate of convergence. Combining the ADRC proposed by Han [7], this paper presents a control strategy that achieves exponential convergence. The emergence of ADRC has brought a new perspective to traditional nonlinear control problems: it actively estimates and rejects disturbances from internal and external sources. Advantages such as small overshoot, fast response, high accuracy and immunity to disturbances have attracted extensive research [8, 9]. Su [10] extended the application of ADRC theory to the field of visual servoing. Huang [11] completed the theoretical derivation and validity proof of ADRC for the control of nonholonomic systems. In this paper, a visual servoing strategy based on ADRC is proposed to regulate a nonholonomic mobile robot to the desired pose. Firstly, the system state is defined by a coordinate transformation of the static image feature points. These states are used to develop a kinematic model of the system, which is decoupled into two subsystems via an input-state scaling transformation. The unmodeled uncertainties caused by unknown depth information and unknown extrinsic camera parameters are estimated by a linear ESO and compensated for by a switching controller. Simulation results show that the proposed control strategy drives the robot to the desired pose and makes the system error converge exponentially.

2 System Modeling

2.1 Robot Kinematics

In this paper, a camera fixed on the mobile robot is utilized to obtain real-time feedback information. As shown in Fig. 1, the world, mobile-robot and camera coordinate frames are denoted as Fw, Fr and Fc, respectively. a, b, c ∈ R are the translational extrinsic parameters between the robot and camera frames along the zr, xr and yr axes. The yr and yc axes satisfy the right-hand rule and are perpendicular to the motion plane of the mobile robot. The pose of the mobile robot in the world frame Fw is denoted by (x, z, θ), where x and z are the position coordinates of the mobile robot in the motion plane and θ is the rotation angle of the current pose relative to the desired pose about the yr-axis. Based on this, the kinematic model of the mobile robot can be described as

[ẋ; ż; θ̇] = [sinθ 0; cosθ 0; 0 1] [v; ω],   (1)

where v ∈ R and ω ∈ R denote the linear and angular velocity of the mobile robot, respectively. The 3-dimensional coordinates of a feature point in the corresponding frames are denoted as (Xi^s, Yi^s, Zi^s), s ∈ {w, r, c, ∗}. According to the geometric relation shown in Fig. 1, the

Fig. 1. The mobile robot frame and the camera frame.

coordinate transformation can be obtained as follows:

[Xi^w; Yi^w; Zi^w] = [cosθ 0 sinθ; 0 1 0; −sinθ 0 cosθ] ([Xi^c; Yi^c; Zi^c] + [b; c; a]) + [x; y; z].   (2)

By using (1) and (2) and taking the time derivative of the feature point in Fc, the kinematic model is expressed as follows:

Ẋi^c = −ω Zi^c − ω a,
Żi^c = ω Xi^c + ω b − v.   (3)

2.2 Measurable Signal Extraction

Four coplanar feature points in Fc and Fc* are denoted as hi^c = [Xi^c Yi^c Zi^c]^T and hi^c* = [Xi* Yi* Zi*]^T ∈ R³, respectively. The normalized image coordinates hi, hi* ∈ R³ are defined as

hi = [Xi^c/Zi^c  Yi^c/Zi^c  1]^T,  hi* = [Xi*/Zi*  Yi*/Zi*  1]^T.   (4)

The image features pi, pi* are extracted from the current image and the reference image. Suppose the intrinsic camera matrix K ∈ R³ˣ³ is known; then the normalized image coordinates can be calculated as hi = K⁻¹ pi, hi* = K⁻¹ pi*. By employing the homography decomposition technique with the image feature pairs (pi, pi*), the relative angle θ ∈ R can be extracted from the rotation matrix R ∈ R³ˣ³ between Fc and Fc*:

R = [cosθ 0 −sinθ; 0 1 0; sinθ 0 cosθ].   (5)


Suppose θ ∈ [−π/2, π/2]; then θ can be calculated by the following equation:

θ = asin((r31 − r13)/2).   (6)
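A minimal round-trip check of this extraction, using the structure of the rotation matrix in (5):

```python
import math

def theta_from_R(R):
    """Recover the relative heading from a rotation matrix with the
    structure of Eq. (5): theta = asin((r31 - r13)/2), valid for
    theta in [-pi/2, pi/2]."""
    return math.asin((R[2][0] - R[0][2]) / 2.0)

# Build R as in (5) for a known angle and recover it.
th = 0.3
R = [[math.cos(th), 0.0, -math.sin(th)],
     [0.0,          1.0,  0.0],
     [math.sin(th), 0.0,  math.cos(th)]]
recovered = theta_from_R(R)   # 0.3
```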

2.3 System Kinematics

The obtained relative pose information is utilized to develop the system kinematics. Define the following coordinate transformation to obtain the new states:

ε1 = (Xi^c/Zi^c)/(Yi^c/Zi^c) = Xi^c/Yi^c,  ε2 = 1/(Yi^c/Zi^c) = Zi^c/Yi^c.   (7)

ε1 ∈ R and ε2 ∈ R are both measurable feedback signals. Accordingly, at the desired pose, the states ε1* ∈ R and ε2* ∈ R are defined as ε1* = Xi*/Yi*, ε2* = Zi*/Yi*. Define the following error signals:

e0 = θ,
e1 = ε1 − ε1* cosθ + ε2* sinθ,   (8)
e2 = ε2 − ε1* sinθ − ε2* cosθ.

It can be deduced that when e0, e1, e2 go to zero, the frame Fc coincides with the frame Fc*, which means the mobile robot has arrived at the desired pose. As introduced above, the normalized coordinates have been calculated from the image coordinates, so the state information ε and ε* is easily obtained. The goal is therefore, with unknown depth information and extrinsic parameters, to design suitable velocities v and ω such that e0(t) → 0, e1(t) → 0, e2(t) → 0. Taking the time derivative of (8) yields the open-loop error system:

ė0 = ω,
ė1 = −e2 ω − b γ0 ω,   (9)
ė2 = −γ0 v + e1 ω + a γ0 ω.

3 Control Strategy Design

The above chained system has a potentially linear structure. The e0-subsystem is controlled by ω only, while the controllability of the e-subsystem is determined by v, so the controllers of the e0- and e-subsystems can be designed independently.

3.1 Controller Design of e0-Subsystem

For the e0-subsystem, the following controller is designed:

ω = −k0 e0,                                 if e0(0) ≠ 0,
ω = α for t ≤ ts, ω = −k0 e0 for t > ts,    if e0(0) = 0,   (10)

where ts > 0 is the switching time of the controller, k0 > 0, α > 0. The controller (10) exponentially stabilizes the e0-subsystem.
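The switching law (10) can be exercised in a short simulation; α below is an illustrative value, while k0 and ts follow the simulation settings of Sect. 4.

```python
def simulate_e0(e0_init, k0=0.07, alpha=0.5, ts=0.1, dt=0.001, T=60.0):
    """Forward-Euler simulation of e0_dot = omega under the switching
    law (10): if e0(0) = 0, a constant omega = alpha is applied until ts
    to move e0 away from zero (so the later input-state scaling by omega
    stays valid), then omega = -k0*e0 takes over."""
    e0, t = e0_init, 0.0
    while t < T:
        if e0_init != 0.0:
            omega = -k0 * e0
        else:
            omega = alpha if t <= ts else -k0 * e0
        e0 += dt * omega
        t += dt
    return e0

final = simulate_e0(0.4)   # decays like exp(-k0*t)
```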


3.2 Controller Design of e-Subsystem

As the state e0 exponentially converges to zero under the designed control law ω, the e-subsystem becomes uncontrollable. To overcome this problem, a discontinuous input-state scaling transformation is used:

s1 = e1/ω,  s2 = −e2.   (11)

Differentiating the states s1 and s2 with respect to time, we get

ṡ1 = s2 + k0 s1 − b γ0,
ṡ2 = γ0 v − e1 ω − a γ0 ω.   (12)

It can be found that e0 is no longer coupled with the state e after the discontinuous transformation (11): the e-subsystem becomes a second-order nonlinear system. Clearly, ω must satisfy ω ≠ 0 for the transformation (11) to remain valid. For the e-subsystem, we propose an ADRC-based controller design method. The system state s is not completely measurable and must be estimated for controller design. We reconstruct the s-system to find its deviation from the nominal integral-chain model. A new state vector η = [η1 η2 η3]^T is defined to develop an integral chain model with disturbances:

η̇1 = η2,
η̇2 = γ̄0 v + ftotal,   (13)
η̇3 = ḟtotal,

ftotal = (γ0 − γ̄0) v − e1 ω − a γ0 ω + k0 (s2 + k0 s1 − b γ0),   (14)

where ftotal is the total disturbance, γ0 is the unknown control coefficient and γ̄0 is its estimate.

Design of Extended State Observer. The state and total disturbance of the system can be observed by an ESO designed as

[η̂̇; f̂̇] = [A bf; 0 0] [η̂; f̂] + [bu; 0] u + L C^T ([η; ftotal] − [η̂; f̂]),   (15)

where η̂ and f̂ are the estimates of the states η and ftotal, respectively, and

A = [0 1; 0 0],  bf = [0; 1],  bu = [0; γ̄0],  C = [1; 0; 0],  L = [β1; β2; β3].

The parameters are chosen so that the matrix Ae = [A bf; 0 0] − L C^T is Hurwitz. The observer poles are placed at −ω0 using the bandwidth method proposed in [12], which gives β1 = 3ω0, β2 = 3ω0², β3 = ω0³, where ω0 denotes the bandwidth of the observer. An ESO in this parameter configuration is asymptotically convergent [13].
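The ESO structure above can be illustrated numerically. The sketch below runs a discrete-time linear ESO with the bandwidth gains β1 = 3ω0, β2 = 3ω0², β3 = ω0³ on a toy double-integrator plant with a constant unknown disturbance; the plant, input and disturbance value are assumptions for illustration, not the paper's robot model.

```python
def run_eso(w0=10.0, f_true=0.5, dt=1e-3, T=3.0):
    """Linear ESO for y_ddot = u + f with constant unknown disturbance f.
    Observer poles at -w0 via beta1 = 3*w0, beta2 = 3*w0**2, beta3 = w0**3.
    Returns the disturbance estimate after T seconds."""
    b1, b2, b3 = 3*w0, 3*w0**2, w0**3
    x1 = x2 = 0.0            # true plant states (position, velocity)
    z1 = z2 = z3 = 0.0       # ESO states: estimates of y, y_dot, f
    u = 0.0                  # open-loop input, kept zero for clarity
    t = 0.0
    while t < T:
        # plant step
        x1 += dt * x2
        x2 += dt * (u + f_true)
        # observer step, driven only by the measurement y = x1
        e = x1 - z1
        z1 += dt * (z2 + b1 * e)
        z2 += dt * (z3 + u + b2 * e)
        z3 += dt * (b3 * e)
        t += dt
    return z3

f_hat = run_eso()   # converges toward the true disturbance 0.5
```

The estimation error decays at roughly the observer bandwidth, which is why ω0 is chosen well above the controller bandwidth ωc.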


State Error Feedback Control Law. Since the ESO estimates the total disturbance in real time, the control law v is designed as

v = (u − f̂)/γ̄0.   (16)

Since f̂ → ftotal, substituting (16) into (15) gives

η̈1 = γ̄0 v + ftotal → u.   (17)

The control component u is then designed as

u = k1 (r − η̂1) − k2 η̂2.   (18)

The closed-loop differential equation of the system can be written as

η̈1 + k2 η̇1 + k1 η1 = k1 r.   (19)

For the stabilization problem, let r = 0; the closed-loop equation becomes η̈1 + k2 η̇1 + k1 η1 = 0, so the states η1 and η2 (η2 = η̇1) are exponentially stable. According to the bandwidth method, the controller parameters are selected as k1 = ωc², k2 = 2ωc, where ωc is the bandwidth of the controller. In addition, so that the estimation error can be compensated accurately, a short delay is imposed on v: the linear velocity is held at zero at the beginning while the ESO converges. Based on the above design process, the complete ADRC-based controller is

ω = −k0 e0,                                 if e0(0) ≠ 0,
ω = α for t ≤ ts, ω = −k0 e0 for t > ts,    if e0(0) = 0,
v = 0 for t < tu,  v = (−k1 η̂1 − k2 η̂2 − f̂)/γ̄0 for t > tu,   (20)

where tu > 0 is the switching time of the controller.
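With r = 0 and the bandwidth gains k1 = ωc², k2 = 2ωc, the closed loop (19) has a double pole at −ωc (critical damping). A quick numerical check of this exponential decay, with an illustrative initial condition:

```python
def closed_loop(eta1_0=1.0, wc=0.3, dt=1e-3, T=40.0):
    """Simulate eta1_dd + k2*eta1_d + k1*eta1 = 0 with k1 = wc**2 and
    k2 = 2*wc, i.e. Eq. (19) with r = 0 and critically damped poles
    at -wc. Returns eta1(T)."""
    k1, k2 = wc**2, 2*wc
    eta1, eta2 = eta1_0, 0.0
    t = 0.0
    while t < T:
        d_eta1 = eta2
        d_eta2 = -k1 * eta1 - k2 * eta2
        eta1 += dt * d_eta1
        eta2 += dt * d_eta2
        t += dt
    return eta1

final = closed_loop()   # (1 + wc*T)*exp(-wc*T), essentially zero
```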

4 Simulations

Simulations are performed in MATLAB to verify the exponential convergence of the proposed control strategy. The virtual camera parameters are set to fu = 829.77 and fv = 826.59, and the camera extrinsic parameters are selected as a = 0.1, b = 0.1, c = 0.56. Four coplanar points are randomly chosen as (0.60 m, 0.80 m, 4.51 m), (0.30 m, 1.20 m, 4.64 m), (0.25 m, 0.30 m, 4.44 m), and (0.40 m, 0.40 m, 4.44 m). The desired pose is always set at (0 m, 0 m, 0°). An initial pose is randomly selected as (x, z, θ) = (−1 m, −3 m, 25°). The parameters of the controllers are selected as ω0 = 15, ωc = 0.3, k0 = 0.07, ts = 0.1, tu = 1. The sampling time is set to T = 0.01 s. The image trajectories of the selected four feature points are shown in Fig. 2(a), where the asterisk points indicate the goal points and the circle points denote the features in

(a) Trajectory of feature points (b) Posture of the mobile robot

Fig. 2. Visual stabilization performance based on ADRC control strategy.

the starting image. Figure 2(b) displays the posture of the mobile robot, which converges at an exponential rate; the effectiveness of the ADRC-based control strategy is thus demonstrated. Considering that the target points should stay in the field of view, two sets of limiting initial poses are randomly selected as (x, z, θ) = (−3 m, −5 m, 42°) and (x, z, θ) = (1 m, −4 m, −18°). As shown in Fig. 3(a) and (b), the corresponding feature trajectories are close to the edge of the image, which limits the selection of initial poses. From the motion paths shown in Fig. 3(c), it is evident that the mobile robot moves from different initial poses to the desired one without retuning the control parameters, which is more practical.

(a) Feature trajectories (b) Feature trajectories (c) Motion path of mobile robot with different initial poses

Fig. 3. The image trajectories and motion paths of mobile robot with different initial poses.

5 Conclusion

A visual servoing strategy that exponentially stabilizes a nonholonomic mobile robot is proposed, despite uncertain camera translational parameters and unknown depth information. The simplicity and reusability of controller design in the ADRC framework are exploited for the visual stabilization of wheeled mobile robots. Simulation results verify the effectiveness of the method.

References
1. Brockett, R.W., et al.: Asymptotic stability and feedback stabilization. Differ. Geom. Control Theory 27(1), 181–191 (1983)
2. Jiang, Z.P.: Iterative design of time-varying stabilizers for multi-input systems in chained form. Syst. Control Lett. 28(5), 255–262 (1996)
3. Gao, F., Wu, Y., Zhang, Z.: Finite-time stabilization of uncertain nonholonomic systems in feedforward-like form by output feedback. ISA Trans. 59, 125–132 (2015)
4. Liang, X., et al.: Purely image-based pose stabilization of nonholonomic mobile robots with a truly uncalibrated overhead camera. IEEE Trans. Rob. 36(3), 724–742 (2020)
5. Zhang, X., Fang, Y., Li, B., Wang, J.: Visual servoing of nonholonomic mobile robots with uncalibrated camera-to-robot parameters. IEEE Trans. Industr. Electron. 64(1), 390–400 (2017)
6. Lu, Q., Yu, L., Zhang, D., Zhang, X.: Simultaneous tracking and regulation visual servoing of wheeled mobile robots with uncalibrated extrinsic parameters. Int. J. Syst. Sci. 49(1), 217–229 (2018)
7. Han, J.Q.: From PID to active disturbance rejection control. IEEE Trans. Industr. Electron. 56(3), 900–906 (2009)
8. Li, J., Qi, X., Wan, H., Xia, Y.: Active disturbance rejection control: theoretical results summary and future researches. Control Theory Appl. 34(3), 281–295 (2017)
9. Xue, W., Huang, Y.: Tuning of sampled-data ADRC for nonlinear uncertain systems. J. Syst. Sci. Complexity 29(5), 1187–1211 (2016)
10. Su, J.B.: Robotic uncalibrated visual servoing based on ADRC. Control Decis. 30(1), 18 (2015). (in Chinese)
11. Huang, Y., Su, J.B.: Output feedback stabilization of uncertain nonholonomic systems with external disturbances via active disturbance rejection control. ISA Trans. 104, 245–254 (2020)
12. Gao, Z.: Scaling and bandwidth-parameterization based controller tuning. In: Proceedings of the American Control Conference, vol. 6, pp. 4989–4996 (2006)
13. Guo, B.Z., Zhao, Z.L.: On the convergence of an extended state observer for nonlinear systems with uncertainty. Syst. Control Lett. 60(6), 420–430 (2011)

An Adaptive Observer for Current Sensorless Control of Boost Converter Feeding Unknown Constant Power Load Xiang Wang, Wei He(B) , and Tao Li Nanjing University of Information Science and Technology, Nanjing 210044, China [email protected]

Abstract. In this brief, the main objective is to address the current sensorless control problem of a boost converter feeding an unknown constant power load (CPL). A reduced-order generalised parameter estimation based observer, built on the dynamic regressor extension and mixing technique, is first proposed to simultaneously estimate the unmeasurable current and the unknown CPL of the system. Combining the estimated terms with an existing full state feedback controller, an adaptive sensorless control scheme is achieved, and the closed-loop system is guaranteed to be asymptotically stable. Finally, the effectiveness of the presented method is assessed by simulation and experimental studies.

Keywords: boost converter · constant power load · adaptive sensorless control

1 Introduction

As is well known, the DC microgrid is regarded as an excellent approach to energy integration, energy conversion and power supply, and is widely applied. It should be noted that DC-DC converters are among the important components of a DC microgrid and play a key role in power conversion; in particular, the boost converter is employed to step up the voltage at the output port. The control performance of DC-DC converters directly affects the power quality of the system, so it is not surprising that many advanced control methods have been developed to improve their performance. In [1], a model predictive controller is introduced for the boost converter to achieve global asymptotic convergence and improve transient performance and robustness. To eliminate the influence of time-varying disturbances on the system, a composite controller integrating an incremental passivity based controller is designed in [2]. However, all of the above methods require that all states be measured. This requires sufficient sensors to be installed, which increases the total cost of the system and endangers its stability [3].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 830–837, 2023. https://doi.org/10.1007/978-981-99-6187-0_83


Motivated by this problem, a sensorless control approach is an excellent solution. In [4], the authors devise a state and disturbance observer to estimate the inductor current and load resistance of a boost converter to achieve sensorless control. In [5], an output feedback controller is proposed for the boost converter on the basis of passivity theory. The above-mentioned results focus on resistive loads. In practice, converters connected in cascaded form are favored, which leads to various dynamic interactions: the negative impedance caused by a constant power load (CPL) reduces the equivalent damping of the system and can even give rise to instability [6]. Various control approaches have been presented to address the output-voltage regulation problem of DC-DC converters with CPLs; a review of this topic can be found in [7]. In [8], the authors develop a reduced-order generalised parameter estimation based observer to achieve adaptive sensorless control of a buck converter with CPL. However, the model of that system is much simpler than that of the boost converter, and an interval excitation condition is still needed. Developing an adaptive state observer for a bilinear system with a nonlinear term introduced by a CPL, without the excitation condition, is indeed a challenging task. Inspired by [9], a generalised parameter estimation based observer (GPEBO) is developed in this brief to reconstruct the unmeasurable state and unknown CPL of the system without the extra excitation condition; note that state observation is transformed into parameter estimation. An existing energy shaping controller (ESC) proposed in [10] is introduced to achieve an adaptive sensorless control scheme. A slight change is that the parasitic resistance is considered in this brief. The main contributions are given as follows.
1) A GPEBO integrating the dynamic regressor extension and mixing (DREM) technique is devised to simultaneously estimate the unmeasurable state and unknown CPL of the boost converter without an extra excitation condition.
2) Combining the estimated terms with an ESC, an adaptive sensorless controller is designed, and the asymptotic stability of the overall closed-loop system is guaranteed.
The rest of the brief is arranged as follows. Section 2 presents the average model of the boost converter with CPL and the problem formulation. Section 3 develops the design steps of the GPEBO. Section 4 gives the design process of the current sensorless controller. Section 5 verifies the feasibility of the proposed method via simulation analysis. Section 6 summarizes the brief.

2 System Model and Problem Formulation

2.1 The Model of the DC-DC Boost Converter with a CPL

Fig. 1 shows the topology of a DC-DC boost converter supplying a CPL. The system is assumed to work in continuous conduction mode. The average model of the


Fig. 1. Circuit topology of boost converter with a CPL

circuit is represented as

L ẋ1 = −r x1 − x2 u + E,
C ẋ2 = x1 u − P/x2,   (1)

where r is the parasitic resistance, x1 is the unmeasurable inductor current, x2 is the output voltage, P is the unknown CPL, E is the input voltage, u = 1 − d is the control input, and d ∈ [0, 1] is the duty ratio.
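As a quick consistency check of the average model (1), the sketch below evaluates its right-hand side at an analytically constructed equilibrium; the component values are illustrative, not the paper's experimental setup.

```python
def boost_derivatives(x1, x2, u, E=12.0, r=0.1, L=1e-3, C=1e-3, P=10.0):
    """Right-hand side of the average model (1):
    L*x1_dot = -r*x1 - x2*u + E,   C*x2_dot = x1*u - P/x2.
    All numerical values are illustrative assumptions."""
    dx1 = (-r * x1 - x2 * u + E) / L
    dx2 = (x1 * u - P / x2) / C
    return dx1, dx2

# At an equilibrium, both derivatives vanish: pick u* and x1*, then
# x2* = (E - r*x1*)/u* from the first equation and P = x1**u**x2*
# from the second.
u_s, x1_s = 0.5, 2.0
x2_s = (12.0 - 0.1 * x1_s) / u_s     # 23.6 V, boosted above E = 12 V
P_s = x1_s * u_s * x2_s              # matching CPL power
d1, d2 = boost_derivatives(x1_s, x2_s, u_s, P=P_s)   # both ~ 0
```

Note the boost behavior: the equilibrium output voltage exceeds the input voltage since u = 1 − d < 1.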

2.2 Problem Formulation

The major control task of this paper is to regulate the output voltage of the boost converter with CPL around the equilibrium x2* without knowledge of x1 and P, which is stated as follows.
1) The unmeasurable state x1 and parameter P are estimated simultaneously by the designed GPEBO; precisely, lim_{t→∞} x̂1 = x1 and lim_{t→∞} P̂ = P.
2) Using x̂1 and P̂, an adaptive sensorless controller is designed to ensure that the states converge to the equilibrium, namely lim_{t→∞} x2(t) = x2*.

3 GPEBO Design

Here, a reduced-order GPEBO is introduced to estimate x1 and P. A linear regression equation (LRE) is first derived. For convenience, the system (1) can be rewritten as

ẋ1(t) = Ax x1(t) + bx,   (2a)
ẏ(t) = Ay x1(t) + φy θy,   (2b)

where θy = P, Ax = −r/L, bx = (E − x2 u)/L, Ay = u/C, φy = −1/(C x2), y(t) = x2.

3.1 Derivation of a LRE

Proposition 1. For the system (2), the dynamic extension is given by

ξ̇y = Ax ξy + bx,  Ẋ_Ax = Ax X_Ax,  X_Ax(0) = 1,   (3)
ṁ(t) = −λ m + λ [X_Ax Ay, φy]^T,  ẇ(t) = −λ w(t) + λ [λ y(t) + Ay ξy],   (4)

where λ > 0, m(0) = 0_{2×1}, w(0) = λ y(0). The observed state is given by

x1 = ξy + X_Ax θ1,   (5)

where θ = [θ1, θy]^T is a constant vector to be estimated; it satisfies the LRE q = m^T θ, with the measurable signal q defined as q = λ y(t) − w(t).

Proof. Define the error signal e := x1 − ξy. From (2a) and (3), one has ė = Ax e, and by the properties of linear systems, e = X_Ax e(0). Equation (5) is obtained by defining θ1 := e(0). Defining p = d/dt, one has

x1(t) = ξy(t) + X_Ax(t) θ1
⇒ ẏ = Ay [ξy(t) + X_Ax(t) θ1] + φy θy
⇒ (λp/(λ+p)) y(t) = (λ/(λ+p)) [Ay X_Ax  φy] [θ1; θy] + (λ/(λ+p)) Ay ξy(t)
⇔ m^T θ = λ y(t) − (λ/(λ+p)) [λ y(t) + Ay ξy(t)]
⇔ m^T θ = λ y(t) − w(t).

The proof is accomplished.

3.2 Use of the DREM Technology

Two new one-dimensional LREs are generated in order to estimate each parameter independently. The following expanded dynamics is defined:

qn = [q1; q2] = [(α1/(s+β1)) q; (α2/(s+β2)) q],  mn = [m1^T; m2^T] = [(α1/(s+β1)) m^T; (α2/(s+β2)) m^T],

where α1, α2, β1, β2 > 0 are unequal constants. An extended LRE can then be obtained:

qn = mn θ.   (6)
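The DREM mixing step can be sketched on a toy scalar LRE; the regressor, filters and gains below are illustrative, not the converter's signals. Two first-order filters build the extended regressor, multiplication by the adjugate decouples the parameters, and each is estimated by a scalar gradient law:

```python
import math

def drem_estimate(theta=(2.0, -1.0), dt=1e-3, T=20.0, gamma=100.0):
    """DREM on a scalar LRE q = m^T theta with m(t) = [sin t, cos t].
    Filters alpha_i/(s + beta_i) produce the extended regressor M(t);
    Y = adj(M) @ qn satisfies Y = det(M) * theta, giving one scalar
    gradient law per parameter driven by Delta = det(M)."""
    a1, b1, a2, b2 = 1.0, 1.0, 2.0, 2.0        # two unequal filters
    mf1 = [0.0, 0.0]; qf1 = 0.0
    mf2 = [0.0, 0.0]; qf2 = 0.0
    th = [0.0, 0.0]                            # parameter estimates
    t = 0.0
    while t < T:
        m = (math.sin(t), math.cos(t))
        q = m[0]*theta[0] + m[1]*theta[1]
        for k in range(2):                     # filter the regressor
            mf1[k] += dt * (-b1*mf1[k] + a1*m[k])
            mf2[k] += dt * (-b2*mf2[k] + a2*m[k])
        qf1 += dt * (-b1*qf1 + a1*q)           # filter the output
        qf2 += dt * (-b2*qf2 + a2*q)
        delta = mf1[0]*mf2[1] - mf1[1]*mf2[0]  # det(M)
        Y = (mf2[1]*qf1 - mf1[1]*qf2,          # adj(M) @ [qf1, qf2]
             -mf2[0]*qf1 + mf1[0]*qf2)
        for k in range(2):                     # decoupled gradient laws
            th[k] += dt * gamma * delta * (Y[k] - delta * th[k])
        t += dt
    return th

th_hat = drem_estimate()   # approaches (2.0, -1.0)
```

Because each parameter error obeys its own scalar dynamics driven by Δ², the estimates converge monotonically whenever Δ is bounded away from zero.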

Proposition 2. Considering the new LRE (6) and combining the GPEBO with the DREM technique, a parameter observer is designed as

θ̂̇g = γg Δ1 (qn − Δ1^T θ̂g),  Ω̇ = A Ω,  θ̂̇ = γ Δ (Y − Δ θ̂),   (7)

where θ̂g(0) = θg0, Ω(0) = I2, θ̂(0) = θ0, and the gains satisfy γg, γ > 0. Define

A(t) := −γg Δ1 Δ1^T,  D := I2 − Ω,  Δ := det{D},  Y := adj{D}(θ̂g − Ω θg0),   (8)

where adj{·} denotes the adjugate matrix and det{·} the determinant.

X. Wang et al.

Proof. Defining the error vector θ̃g := θ̂g − θ, it is easy to obtain from (6), (7) and (8) that

(d/dt)θ̃g = A(t)θ̃g ⇒ θ̃g = Ωθ̃g(0) ⇒ θ − Ωθ = θ̂g − Ωθg0 ⇒ D(t)θ = θ̂g − Ωθg0 ⇒ Y = Δθ,   (9)

where the last implication follows by multiplying by adj{D} and using (8). The state observation problem has thus been transformed into the estimation of the constant vector θ. Substituting the last equation of (9) into (7) yields the error dynamics (d/dt)θ̃ = −γΔ²θ̃. Borrowing Proposition 2 in [11], the observability of the system (1) implies that the LRE (6) is identifiable, which is equivalent to the regressor Δ1 being interval exciting. One has |Δ(t)| = |det{I2 − Ω}| > 0, hence Δ satisfies the persistence of excitation condition. The proof is finished. Then, we can get the estimates P̂ = θ̂y and x̂1 = ξy + XAx θ̂1.

3.3 Design of an Adaptive Sensorless Controller

In this subsection, introducing the ESC proposed in [10], an adaptive sensorless controller is devised. Define the coordinate transformation z = [z1, z2]ᵀ = [Lx1² + Cx2², x2]ᵀ. Then, the system (1) is converted into the following form:

ż1 = −(2r/L)(z1 − Cz2²) + 2(E√((z1 − Cz2²)/L) − P),  ż2 = ω,   (10)

where ω = x1u/C − P/(Cx2).
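The z1-dynamics in (10) follows from the chain rule applied to z1 = Lx1² + Cx2²; a quick numerical consistency check at an arbitrary (assumed) operating point:

```python
import numpy as np

# Chain-rule check of the transformation z1 = L*x1^2 + C*x2^2, z2 = x2:
# along the boost-with-CPL dynamics, z1' reduces to the right-hand side of (10).
# The operating point (x1, x2, u) is an arbitrary assumption.
E, L, C, r, P = 15.0, 330e-6, 820e-6, 0.03, 30.0
x1, x2, u = 2.0, 24.0, 0.6

dx1 = (E - u * x2 - r * x1) / L                  # inductor current dynamics
dx2 = (u * x1 - P / x2) / C                      # capacitor voltage with CPL
dz1_chain = 2 * L * x1 * dx1 + 2 * C * x2 * dx2  # z1' via the chain rule

z1, z2 = L * x1**2 + C * x2**2, x2
dz1_form = -(2 * r / L) * (z1 - C * z2**2) + 2 * (E * np.sqrt((z1 - C * z2**2) / L) - P)
print(abs(dz1_chain - dz1_form))                 # agrees up to roundoff
```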

Proposition 3. For the system (10), the ESC is given by

ω = −(E arctan(√C z2/√(z1 − Cz2²)))/√(CL) + κ1(κ2 + z1) + υ1,   (11)

where κ1 > max(K1, K2), with K1, K2 and κ2 constants depending on the circuit parameters E, r, L, C and P whose full expressions follow Theorem 3 of [10], κp2 > 0, and

υ1 = κp2(E√((z1 − Cz2²)/L) − P − 2rz1/L + 2rCz2²/L).

The system (1) under (11) is asymptotically stable. Because of the limitation of space, the proof of Proposition 3 is omitted here; it is similar to that of Theorem 3 in [10].

Introducing x̂1 and P̂ into the controller (11), an adaptive sensorless controller is given as

ω̂ = −(E arctan(√C z2/√(ẑ1 − Cz2²)))/√(CL) + κ1(κ2 + ẑ1) + υ̂1,
υ̂1 = κp2(E√((ẑ1 − Cz2²)/L) − P̂ − 2rẑ1/L + 2rCz2²/L),   (12)
û = (Cω̂ + P̂/x2)/x̂1,

where ẑ1 := Lx̂1² + Cx2².

Proposition 4. The system (10) in closed loop with the adaptive controller (12) is asymptotically stable.

Proof. The adaptive sensorless controller (12) can be written in the perturbed form

ω̂ = ω + δω(z, X̃),   (13)

where δω(z, X̃) is a perturbation term. The closed-loop system can be described in the cascaded form

ż = [0 −1; 1 −κp2]∇Hd(z) + [0; 1]δω(z, X̃),  (d/dt)X̃ = −γΔ²X̃.   (14)

From Proposition 3, the z-subsystem of (14) is asymptotically stable for X̃ = 0 [10]. In addition, Propositions 1 and 2 show that X̃ converges exponentially to zero and δω(z, 0) = 0. Using the conclusion on the asymptotic stability of cascaded systems, e.g., Proposition 4.1 of [12], the local asymptotic stability of the closed-loop system (14) can be established. The proof is completed.

4 Simulation Results

The effectiveness of the designed controller is assessed on the Matlab/Simulink simulation platform. Table 1 shows the circuit parameters of the system.

Case 1: The robustness performance is investigated. The controller gains are κ1 = 20000, κp2 = 80. As seen from Fig. 2(a), a step change in P from 25 W to 35 W is considered. Raising the gains γg and γ improves the transient, and the estimation error quickly and accurately converges to zero.

Case 2: A step change in the reference is considered, from 25 V to 35 V. The observer gains are γg = 3, γ = 3000. As shown in Fig. 2(b), the settling time is shortened by properly increasing κ1 and suitably decreasing κp2. The estimation error quickly converges to zero.
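For reference, the steady-state inductor current implied by the power balance Ex1 − rx1² = P can be computed for the two load levels of Case 1; this is a sanity check with the Table 1 parameters, not part of the original study:

```python
# Steady-state inductor current for the boost converter with CPL: the power
# balance E*x1 - r*x1^2 = P gives x1 = (E - sqrt(E^2 - 4*r*P)) / (2*r)
# (the smaller root is the physically relevant branch).
E, r = 15.0, 0.03

def i_eq(P):
    return (E - (E**2 - 4 * r * P) ** 0.5) / (2 * r)

# Currents before and after the 25 W -> 35 W load step of Case 1
print(round(i_eq(25.0), 3), round(i_eq(35.0), 3))
```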

Table 1. Parameters of boost converter system in simulation

Parameters                 Symbols  Values
Input voltage              E        15 V
Reference output voltage   v        25 V
Inductance                 L        330 μH
Capacitance                C        820 μF
Nominal extracted power    P        30 W
Parasitic resistance       r        0.03 Ω

Fig. 2. Response curves of the boost converter with CPL under the novel controller. (a) a step change in P; (b) a step change in v.

5 Conclusion

In this paper, an adaptive state observer was proposed to deal with the current-sensorless control problem of the boost converter with an unknown CPL. Borrowing the dynamic extension and DREM techniques, a reduced-order GPEBO was constructed. The simulation studies demonstrated the good transient and robustness performance of the proposed controller. Compared with a full-order observer, the designed reduced-order observer offers lower complexity and easier implementation.

References

1. Kim, S.K., Park, C.R., Kim, J.S., Lee, Y.I.: A stabilizing model predictive controller for voltage regulation of a DC/DC boost converter. IEEE Trans. Control Syst. Technol. 22(5), 2016–2023 (2014)
2. He, W., Li, S., Yang, J., Wang, Z.: Incremental passivity based control for DC-DC boost converters under time-varying disturbances via a generalized proportional integral observer. J. Power Electron. 18(1), 147–159 (2018)
3. Pahlevani, M., Pan, S., Eren, S., Bakhshai, A., Jain, P.: An adaptive nonlinear current observer for boost PFC AC/DC converters. IEEE Trans. Ind. Electron. 61(12), 6720–6729 (2014)
4. Pandey, S.K., Patil, S.L., Chaskar, U.M., Phadke, S.: State and disturbance observer-based integral sliding mode controlled boost DC-DC converters. IEEE Trans. Circuits Syst. II: Express Briefs 66(9), 1567–1571 (2018)
5. Rodriguez, H., Ortega, R., Escobar, G., Barabanov, N.: A robustly stable output feedback saturated controller for the boost DC-to-DC converter. Syst. Control Lett. 40(1), 1–8 (2000)
6. Emadi, A., Khaligh, A., Rivetta, C.H., Williamson, G.A.: Constant power loads and negative impedance instability in automotive systems: definition, modeling, stability, and control of power electronic converters and motor drives. IEEE Trans. Veh. Technol. 55(4), 1112–1125 (2006)
7. Singh, S., Gautam, A.R., Fulwani, D.: Constant power loads and their effects in DC distributed power systems: a review. Renew. Sustain. Energy Rev. 72, 407–421 (2017)
8. He, W., Shang, Y., Namazi, M.M., Ortega, R.: Adaptive sensorless control for buck converter with constant power load. Control Eng. Pract. 126, 105237 (2022)
9. Bobtsov, A., Ortega, R., Yi, B., Nikolaev, N.: Adaptive state estimation of state-affine systems with unknown time-varying parameters. Int. J. Control 95(9), 2460–2472 (2022)
10. He, W., Ortega, R.: Design and implementation of adaptive energy shaping control for DC-DC converters with constant power loads. IEEE Trans. Ind. Inform. 16(8), 5053–5064 (2019)
11. Wang, L., Ortega, R., Bobtsov, A.: Observability is sufficient for the design of globally exponentially convergent state observers for state-affine nonlinear systems. arXiv preprint arXiv:2108.09406 (2021)
12. Sepulchre, R., Jankovic, M., Kokotovic, P.V.: Constructive Nonlinear Control. Springer Science & Business Media (2012). https://doi.org/10.1007/978-1-4471-0967-9

Author Index

A An, Kang 691

B Babiarz, Artur 802

C Cai, Guoliang 756 Cai, Wenhan 641 Cao, Chengjie 602 Cao, Hancheng 184 Cao, Yanao 101 Cao, Zhangbao 720 Chai, Lining 372, 634 Chao, Bohang 589 Chen, Danmin 313 Chen, Jie 579 Chen, Jing 341 Chen, Junran 610 Chen, Liang 101 Chen, Maojian 511 Chen, Renhui 163 Chen, Xin-Yu 790 Chen, Xu 372 Chen, Xuefeng 720 Chen, Yandong 23 Chen, Yifei 402 Chen, Youyuan 392 Cheng, Saihua 460 Cui, Shihang 626 Cui, Zhuofan 437 D Dai, Mingxuan 417 Deng, Xi 739 Deng, Ying 128 Ding, Hao 392 Ding, Shuai 331 Ding, Simin 118, 700 Ding, Yanfeng 756 Du, Junping 136

F Fan, Huijin 172, 273, 809 Fan, Yun-Sheng 790 Fangfang, Zhang 476 Feng, Guilin 109 Feng, Shangbin 542 Fu, Bofeng 281 Fu, Ying 409 G Gang, Xiao 519 Gao, ChaoJun 145 Gao, Jie 92 Gao, Qiankun 579 Gao, Xunkai 322, 392 Gao, Yang 298 Ge, Qi 361 Ge, Quanbo 782 Guan, Zeli 92, 136 Guo, Chen 109 Guo, Ying 331 H Han, Xin-Jie 790 He, Bin 437 He, Ruilin 243 He, Wei 830 Hongnian, Yu 476 Hou, Xinrui 821 Hua, Chengcheng 372, 634 Huang, Hanxin 226 Huang, He 683 Huang, Hui 257 Huang, Yao 821 Huang, Ziyang 511 J Jia, Pengpeng 226 Jia, Yingmin 13 Jia, Zhiao 196 Jianbin, Xin 476

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Z. Deng (Ed.): CIAC 2023, LNEE 1082, pp. 839–842, 2023. https://doi.org/10.1007/978-981-99-6187-0


Jiang, Tianjian 708 Jiang, Weifeng 775 Jiang, Yansong 728 Ju, Dengfeng 257 K Kang, Xueze 69

L Lei, Zhen 675 Leilei, Zhang 519 Li, Ang 51 Li, Chunlong 257 Li, Guangyu 589 Li, Ji 790 Li, Jun 409 Li, Linfang 361 Li, Qing 641 Li, Qingkai 641 Li, Quan 216 Li, Ruoyu 298 Li, Shangyuan 83 Li, Tao 263, 830 Li, Xiali 23 Li, Xiaonan 361 Li, Xin 163 Li, Yaogen 128 Li, Yawen 51, 92 Li, Yong 626 Li, Yuheng 216 Li, Zhibin 765 Li, Zhichen 205 Liang, Meiyu 51 Liang, Xiao 109 Liao, Lejian 361 Liao, Xiaofei 589 Liao, Yingqi 128 Lin, Qiang 83 Liu, Bo 23 Liu, Di 691 Liu, Fei 216 Liu, Haojun 31, 484 Liu, Hongjing 257 Liu, Huaping 69, 500, 802 Liu, Huimin 756 Liu, Jia 417 Liu, Jun 675 Liu, Kewen 257 Liu, Lei 172, 273, 809

Liu, Lu 361 Liu, Meng 500 Liu, Wei 243 Liu, Xiang 610 Liu, Xiaodong 427 Liu, Yanhong 331 Liu, Ying-Yuan 61, 691 Lu, Dan 747 Lu, Lina 341 Lu, Lingyun 69 Lu, Yifan 667 Lu, Yunfei 802 Luo, Minnan 542 Luo, Xiong 511 Luo, Zijuan 602 Lv, Zhengnan 610 M Ma, Chengjie 51 Ma, Jian 1 Ma, Ke 145 Ma, Nan 83, 128 Meng, Xiangxiang 322, 392, 561 Miao, Guoying 263 P Pang, Yanbo 641 Peng, Jinzhu 331 Peng, Qiaojuan 511 Peng, Zhiqiang 675 Q Qi, Yuting 809 Qian, Chen 184 Qihou, Chen 519 Qu, Hongquan 289 R Ran, Bingdong 739

S Sayyouri, Mhamed 802 Shao, Yingxia 136 Shen, Hailun 511 Shen, Naijun 184 Sheng, Yu-Bo 571 Shu, Mingxing 289 Song, Dandan 361, 460 Song, Wengcheng 765


Su, Binghua 728 Su, Xuance 589 Sun, Hefa 747 Sun, Lihui 700

T Tan, Zhanxuan 542 Tang, Xuliang 101, 163, 720 Tao, Jianlong 372, 634 Tao, Junyi 437 Tian, Linran 263 Tuo, Yulong 109

W Wan, Heng 101, 720 Wang, Bo 172, 273, 809 Wang, Bowen 618 Wang, Chang 353 Wang, Gang 118, 281, 700 Wang, Huangang 553 Wang, Jun 145 Wang, Kai 739 Wang, Kejun 728 Wang, Kun 409 Wang, Liuwang 31, 484 Wang, Lu 618 Wang, Shasha 109 Wang, Shengjie 765 Wang, Tao 802 Wang, Tenghui 728 Wang, Xiang 830 Wang, Xiaofeng 243 Wang, Xin 553 Wang, Yinsong 500 Wang, Yixiang 196 Wang, Yongji 809 Wang, Yuelin 683 Wang, Yufan 445 Wang, Yushi 641 Wang, Zhidong 683 Wen, Shuhuan 802 Wen, Yanqi 226 Weng, Ling 618 Wenhao, Wang 476 Wu, Caicong 402 Wu, Lei 69 Wu, Licheng 23 Wu, Mingyang 610


Wu, Yingcheng 708 Wu, Zhixuan 83 X Xi, Zhenghao 205, 610 Xia, Min 675, 747 Xia, Zhang 519 Xiang, Zhuo 273 Xiao, Chaoqing 341 Xie, Benkai 196 Xie, Longqin 775 Xie, Yuan 417 Xingguang, Liu 519 Xiong, Ling 739 Xu, Bohan 765 Xu, Changqing 747 Xu, Xiao 579 Xu, Yiming 747 Xu, Yong 427 Xue, Zhe 92 Xuesong, Xu 519 Y Yan, Huaicheng 205 Yan, Qian 402 Yan, Shaoyang 417 Yan, Shengye 533 Yang, Junyi 205 Yang, Qing 322, 561 Yang, Shujie 542 Yang, Yang 417, 708 Yang, Yingming 739 Yang, Zhiqiang 782 Yanhong, Liu 476 Yin, Hongyang 747 Yin, Jianqin 378 Yin, Liping 289 Yu, Feilong 353 Yu, Haisheng 322, 392, 561 Yu, Haojie 756 Yu, Jianli 196 Yuan, Chengyu 128 Yuan, Xudong 667 Yue, Chengyu 533 Yuxin, Chen 519 Z Zeng, Chengyi 341 Zha, Zhongyi 172


Zhang, Binchi 542 Zhang, Hao 205 Zhang, Hewei 361 Zhang, Hongyu 427 Zhang, Jiabao 234 Zhang, Jie 561 Zhang, Ke 561 Zhang, Lidong 821 Zhang, Lin 775 Zhang, Linjuan 747 Zhang, Liudong 675 Zhang, Peng 579 Zhang, Qiang 145 Zhang, Ruixing 298 Zhang, Xiaofeng 728 Zhang, Xinyu 409 Zhang, Xuebo 409 Zhang, Xuewen 13 Zhang, Yanyin 23 Zhang, Yu 234, 437 Zhang, Zhen-Jun 61 Zhang, Zhenjun 691 Zhang, Zhiqiang 313


Zhao, Jing 500 Zhao, Mingguo 641 Zhao, Qiang 118 Zhao, Qunfei 445 Zhao, Tianyu 136 Zhao, Xingqiang 257 Zhao, Zhongyuan 602, 782 Zhenbo, Cheng 519 Zheng, Qinghua 542 Zhong, Wei 571 Zhou, Bocheng 739 Zhou, Feng 378 Zhou, Funa 226, 313, 353 Zhou, Jun 542 Zhou, Lin 101 Zhou, Mo 409 Zhou, Zhanfeng 372, 634 Zhu, Mingchao 728 Zhu, Wen-Yi 571 Zhu, Xinyi 361 Zhu, Zhengju 289 Zou, Gang 675 Zou, Yangyang 243