English Pages 965 [928] Year 2023
Lecture Notes in Electrical Engineering 1090
Yingmin Jia Weicun Zhang Yongling Fu Jiqiang Wang Editors
Proceedings of 2023 Chinese Intelligent Systems Conference Volume II
Lecture Notes in Electrical Engineering Volume 1090
Series Editors Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli Federico II, Napoli, Italy Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán, Mexico Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, München, Germany Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China Shanben Chen, School of Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore Rüdiger Dillmann, University of Karlsruhe (TH) IAIM, Karlsruhe, Baden-Württemberg, Germany Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China Gianluigi Ferrari, Dipartimento di Ingegneria dell’Informazione, Sede Scientifica Università degli Studi di Parma, Parma, Italy Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid, Madrid, Spain Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA, USA Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China Janusz Kacprzyk, Intelligent Systems Laboratory, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland Alaa Khamis, Department of Mechatronics Engineering, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt Torsten Kroeger, Intrinsic Innovation, Mountain View, CA, USA Yong Li, College of Electrical and Information Engineering, Hunan University, Changsha, Hunan, China Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, 
Singapore Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA Subhas Mukhopadhyay, School of Engineering, Macquarie University, NSW, Australia Cun-Zheng Ning, Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA Toyoaki Nishida, Department of Intelligence Science and Technology, Kyoto University, Kyoto, Japan Luca Oneto, Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genova, Genova, Genova, Italy Bijaya Ketan Panigrahi, Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India Federica Pascucci, Department di Ingegneria, Università degli Studi Roma Tre, Roma, Italy Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China Gan Woon Seng, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore Joachim Speidel, Institute of Telecommunications, University of Stuttgart, Stuttgart, Germany Germano Veiga, FEUP Campus, INESC Porto, Porto, Portugal Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Haidian District Beijing, China Walter Zamboni, Department of Computer Engineering, Electrical Engineering and Applied Mathematics, DIEM—Università degli studi di Salerno, Fisciano, Salerno, Italy Junjie James Zhang, Charlotte, NC, USA Kay Chen Tan, Department of Computing, Hong Kong Polytechnic University, Kowloon Tong, Hong Kong
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments in Electrical Engineering, quickly, informally and in high quality. While original research reported in proceedings and monographs has traditionally formed the core of LNEE, we also encourage authors to submit books devoted to supporting student education and professional training in the various fields and application areas of electrical engineering. The series covers classical and emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please contact [email protected]. To submit a proposal or request further information, please contact the Publishing Editor in your country: China Jasmine Dou, Editor ([email protected]) India, Japan, Rest of Asia Swati Meherishi, Editorial Director ([email protected]) Southeast Asia, Australia, New Zealand Ramesh Nath Premnath, Editor ([email protected]) USA, Canada Michael Luby, Senior Editor ([email protected]) All other Countries Leontina Di Cecco, Senior Editor ([email protected]) ** This series is indexed by EI Compendex and Scopus databases. **
Editors
Yingmin Jia, School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
Weicun Zhang, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
Yongling Fu, School of Mechanical Engineering and Automation, Beihang University, Beijing, China
Jiqiang Wang, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, Zhejiang, China
ISSN 1876-1100 ISSN 1876-1119 (electronic) Lecture Notes in Electrical Engineering ISBN 978-981-99-6881-7 ISBN 978-981-99-6882-4 (eBook) https://doi.org/10.1007/978-981-99-6882-4 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore Paper in this product is recyclable.
Contents
Two Performance Indicators of Reaction Wheel Underactuated Configuration
    Jun Xie, Jia-jia Feng, Yu-jie Zhang, and Qiang Chen
MQB-RRT*: An Improved Path Planning Algorithm Based on Improving Initial Solution and Fast Convergence
    Tao Chen, Xinmin Chen, Feifan Yu, and Yue Lin
Research on Attention Mechanism Based Assisted Diagnosis of Pulmonary Embolism
    HuaTao Li, ZhongYi Hu, and MingZhe Hu
Method of Inverter with Inductive Load Based on Instantaneous Current
    Jiang Tao, Huang YuLong, He Chao, and Liu Shuai
Early Gastric Cancer Screening Framework Based on Multimodal Fusion
    Meichen Lu and Yi Chai
Cooperative Anti-disturbance Control for Leader-Follower Ships with Disturbance and Loss-of-Effectiveness Fault
    Chengmei Mu and Xinjiang Wei
Adaptive Fault-Tolerant Control for a Class of Nonlinear Systems with Quantized Input and Quantized States
    Zhihao Jiang, Wenjie Chen, Yan Lin, and Xiantao Sun
Improved GWO-WOA and Fuzzy NN DWA Based Path Planning Algorithm for the UAV in Dynamic Environment
    Bo Li, Siqi Wang, Wenwei Luo, Hang Xiong, and Chaolu Temuer
Circumnavigation Control of Fixed-Wing UAVs Using Distance Measurements
    Jie Wang and Baoli Ma
Cooperative Path Planning for Multi-vehicle Systems by an Integrated Intelligent Algorithm of IABC and DWA
    Ying Tan, Jian Zhang, Zhonghua Miao, and Jin Zhou
Event-Triggered Adaptive Dynamic Surface Control for Wheeled Mobile Robots with Unknown Skidding
    Wenlong Yue, Yu Wan, and Xuehui Gao
A K-Means and GMM-Based Fusion and Detection Algorithm Against FDI Attacks on Remote Estimator
    Jinxing Hua and Fei Hao
Research on Surface Quality and Wheel Wear of Internal Thread in High-Speed Grinding
    Zhang Zhaojing, Dou Yongqiang, Zhang Rong, Shi Wei, Zheng Jigui, and Zhao Yongsheng
Fast Video Object Segmentation Network Based on Multi-scale Attention Feature Fusion
    Fan Zhou, Chaoli Wang, and Zhanquan Sun
Design of Terminal Guidance Law for Air Defense Missile Based on Variable Structure Control Theory
    Shujun Yang, Jirong Ma, Juanzhi Lu, and Duansong Li
A Trajectory Prediction Method via Affine Objective Fuzzy Clustering
    Na Wang, Liang Luo, Yue Lei Cui, and Xin Hai Zhang
Analysis on the Key Technology of Guidance and Control of Missile and Gun Combined Air Defense Missile
    Shujun Yang, Jirong Ma, Juanzhi Lu, and Duansong Li
Evaluating RNN and Its Improved Models for Lithium Battery SoH and BRL Prediction
    Feifan Yu, Jiqiang Wang, and Xinmin Chen
MDIoT: IoT Device Identification Method Based on Traffic Characteristics
    Hanxi Zheng, Ruijun Liu, Huanpu Yin, and Haisheng Li
Motor Fault Diagnosis Based on Improved Support Vector Machine
    Caixiang Guo, Jin Li, and Chenxi Yang
Research on Automatic Detection and Sorting System of Spoiled Fruit Based on Deep Learning
    Bingbing Hou, Lei Cheng, Tiedan Hua, Wenle Wang, and Fengyun Li
Based on Graph Model: A Method for Locating and Reconstructing Entanglement Ship Trajectory
    Yongze Zhu, Shuangxin Wang, and Jingyi Liu
A Weighted Degree Maximum-Based Base Station Frequency Allocation Algorithm
    Xi Zhong, Min Liang, and Yaping Ji
Adaptive Time-Varying Parameter Estimation of Nonlinearly Parameterized Systems
    Fujin Luan, Xinkai Chen, Jing Na, Yashan Xing, and Guanbin Gao
Static Characteristics Simulation Analysis of a Plate Type Flow Control Type Counterbalance Valve
    Kan Li, Yingqi Shen, Guolei Si, Junhui Chen, and Teng Li
Faster Convergence Rate of Sampled-Data Systems with Artificially Designed Optimal Time Delay Parameter
    Wenwen Li, Jianqiang Liang, and Mingxing Li
Multi-robot Formation Control Based on Improved Virtual Spring Path Planning Method
    Yimei Chen, Minghao Zhang, and Baoquan Li
LQR-Based Adaptive Optimal Control for Aircraft Engine
    Jinsong Zhao, Yan Lin, and Lin Li
A Terrain Aided Navigation Method Based on Point Cloud Digital Elevation Map Construction
    Xiaolong Wang, Junzhi Zhu, Rui Chen, and Long Zhao
Adaptive Fault-Tolerant Control for a Class of Nonlinear Systems
    Zhichao Wang, Yan Lin, and Lin Li
Traffic Police Dynamic Gesture Recognition Based on Spatiotemporal Attention ST-GCN
    Xiru Wu, Yu Zhao, and Qi Chen
Radar-Based 3D Skeleton Estimation Enhanced with Joint Temporal-Spatial Constraints
    Guangyu Mei, Zhongping Cao, Guoli Wang, and Xuemei Guo
Heuristic-Based Bi-RRT* Path Planning Algorithm for Unmanned Systems in Complex Channel Environment
    Xiru Wu, Rili Wu, and Junce Jiang
Improved Safety Helmet-Wearing Detection Algorithm Based on YOLOv5
    Dongmei Chen, Huanyu Zhao, Wei Liu, and Dongsheng Du
Research on Low-Cost Missile Borne Landing Point Positioning Device Based on RDSS/SMS
    Shiyao Gao, Shan Su, A. Liangmushage, Kai Wu, and Lu Chen
A Prediction Algorithm for Lower Limb Movement Intention Based on Plantar Pressure
    Hao Li, Junyu Quan, Longfei Jia, Jing Chen, Shitong Zhou, and Zhiyuan Yu
Visual Localization and Map Construction Based on Ground Texture
    Xin Chen and Lei Yu
Parameter Optimization of Tracked Vehicle Steering Control Strategy Based on Particle Swarm Optimization Algorithm
    Yunfeng Wang, Hongcai Li, Yue Ma, and Xuzhao Hou
Design of Planar Torsion Spring with High Linearity for Series Elastic Actuator
    Yuqiao Cheng, Xiubo Xia, Yongling Fu, Jian Sun, and Pu Zhang
Fault Diagnosis Method of Rolling Bearings Via Wavelet-Stacked Feature Extraction
    Na Wang, Yue Lei Cui, Liang Luo, and Zi Cong Wang
A Visual-Based Aircraft Pose Estimation Method During Take-Off
    Feng Liu, Jong Zhang, Hao Guo, and Xue Chen
Edge Detection of Carbon Electrode Image Based on Improved Image Restoration and Improved Canny Operator Fusion
    Feng Jia, Xiaobin Li, and Yanling Xu
Landing Point Control Technology of Parafoil System Based on Sliding Mode Control in a Complex Environment
    Weitao Lu, Hao Sun, Qinglin Sun, and Zengqiang Chen
Nail Piece Detection Based on Lightweight Deep Learning Network
    Chen Zhao, Chunbo Xiu, and Xin Ma
Cosolute Interactions with the Tryptophan Peptide
    Bailang Liu, Xiaojing Teng, and Toshiko Ichiye
Research on High-Voltage Pulse Ignition Power Supply Technology Based on µC_OS_II
    Xu Zhao, Changlu Yue, Xiuhua Xu, Cong Hu, and Lei Yang
Path Planning of Mobile Robot Based on DBSCAN Clustering and Improved BA*-APF Hybrid Algorithm
    Yicheng Li, Bingshan Liu, and Linyuan Hou
Method and Testing of Shaft Angle Digital Conversion Based on Improved CORDIC Algorithm
    Sixian Sun, Qijia Zheng, Liping Ren, Linxue An, Yufeng He, and Shuyue Han
Multinomial Regression with Group Structure for Screening Biomarkers of Breast Cancer
    Chenxi Xi, Fugen Gao, and Juntao Li
Path Planning of Mobile Robot Based on Improved A* Algorithm
    Ziyang Zhou, Liming Wang, Yuquan Xue, Xiang Ao, Liang Liu, and Yuxuan Yang
Research on the Technology of Using Turning Instead of Grinding for Aerospace Titanium Alloy Thin Wall Parts
    Kong Guizhen, Zhang Huajin, Guo Yaxing, Yang Qiang, Li Dongwei, and Zhang Zhe
Machine Learning in Molecular Dynamics Simulation
    Xiaojing Teng
Quality Evaluation of Airfoil Hybrid Mesh Based on Graph Neural Network
    Huaiqing Wang, Yufei Pang, Sumei Xiao, and Zhichao Wang
Fixed-Time Consensus Control for Heterogeneous Multi-agent Systems Without Velocity Measurements of Neighbors
    Yichao Ao and Qifeng Zhang
Correlation Filter Feature Selection Strategy Based on Inland Ship Tracking
    Lei Xiao, Feiyan Nie, Hanjie Ma, and Zhongyi Hu
Heterogeneous Vehicle Platoon Control Based on Predictive Constant Time Headway Strategy
    Yangzhou Chen and Bingzhuang Yan
Improved Northern Goshawk Optimization Method for Intercepting Maneuvering Targets with Pulse Correction Projectiles
    Yuming Zhang, Jian Fu, Xin Lei, Yifan Yang, and Hongyu Gao
Analysis of Probabilistic Energy Flow for Integrated Electricity and Heat Systems Considering Source-Load Uncertainty
    Taihao Liu, Yunzhong Song, Huimin Xiao, and Fuzhong Wang
A Fire Alarm System for Agricultural Sheds Designed with Zigbee
    Yongsheng Xie, Xiaokai Du, and Linbing Wei
Cooperative Control and Management for UAS in Distributed Dynamic Kill Web
    Sun Zhangjun, Tang Qiang, and Li Hao
Research on Fractional Order Unidirectional Sliding Mode Control for Fixed-Rudder Two-Dimensional Correction Projectile
    Xin Lei, Jian Fu, Liangming Wang, Yuming Zhang, and Shouyi Guo
UAV Path Planning Based on Improved Artificial Potential Field Method
    YingKai Ma and ShuRong Li
Research on Stiffness Design Basis and Dynamic Response of Series Elastic Actuator
    Xiubo Xia, Yuqiao Cheng, Yongling Fu, and Jian Sun
Dynamic Estimation of Loads on Wind Turbine Blades Based on Sensor Optimization and Kalman Filter
    Hang Chen, Shanbi Wei, Yu Wang, and Yi Chai
Improved Deadbeat Predictive Current Control for Open-Winding Permanent Magnet Synchronous Generators
    Wenfeng Wang and Yue Ma
An Effective Method for Fault Localization Based on Combination of Convolution and LSTM
    Jinfeng Li and Haihao Yu
Bearing-Only Formation Control for Nonlinear Multi-agent Systems with Unknown Dead-Zone Inputs
    Haoruo Geng, Qin Wang, Zitao Chen, and Yang Yi
Adaptive Event-Triggered Output Feedback Tracking Control for Uncertain Nonlinear Systems with Sensor Failures
    Chen Sun, Yan Lin, and Lin Li
CARDNet: A Denoiser Based on Contrast-Aware and Residual-Dense Block
    Qiang Cai, Ying Cao, Chen Wang, Haisheng Li, and Mengxu Ma
An Automatic and Efficient Calibration Method for LiDAR-Camera in Targetless Environments
    Fengli Yang, Juzhi Zhu, and Long Zhao
Modeling Analysis of Force-Thermal Coupling for High-Speed Planetary Roller Screw
    Zheng Jigui, Yang Bin, Tian Qing, Guo Yaxing, Shi Wei, and Cui Zhenglei
Optimal Output Tracking for Unknown Linear Discrete-Time Systems Based on Adaptive Dynamic Programming and Output Feedback
    Kexin Fan, Xuan Cai, Jinghan Wu, Xin Wu, and Qiaoshen Xiao
LSTM-TD3-Based Control for Delayed Drone Combat Strategies
    Bingyu Ji, Jun Wang, Hailin Zhang, and Ya Zhang
Design of Adaptive Sliding Mode Controller Based on Neural Network Compensation for Stewart Platform
    Weixiang Zeng, Wenlin Yang, Yunting Wang, and Weilun Situ
Research on Adversarial Robustness Properties of Image Classification Networks Based on Deep Vision
    Qiaoyi Li, Zhengjie Wang, Xiaoning Zhang, Hongbao Du, Bai Xu, and Yang Li
Author Index
Two Performance Indicators of Reaction Wheel Underactuated Configuration
Jun Xie, Jia-jia Feng, Yu-jie Zhang, and Qiang Chen
Abstract The underactuated configuration of reaction wheels is of great engineering significance. To improve the control performance of the control system, this paper derives two performance optimization indicators for the reaction wheel underactuated configuration from the mechanism of underactuated control, providing a reference for the optimization design of such configurations. The two indicators are the area of the actuated Euler axis and the error angle of the underactuated axis. The indicators are analyzed by simulation on an engineering example. The results show that optimizing the two proposed indicators effectively improves the control performance of the system under underactuated control, which indicates that the indicators are feasible and effective and have engineering application value.

Keywords Spacecraft · Reaction wheel · Underactuated configuration · Performance indicator
J. Xie · J. J. Feng (B) · Y. J. Zhang · Q. Chen
Beijing Institute of Control Engineering, Beijing 100190, China
e-mail: [email protected]
Science and Technology on Space Intelligent Control Laboratory, Beijing 100190, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_1

1 Introduction
Reaction wheels are mainly used for the attitude control of spacecraft. An underactuated configuration of reaction wheels is one in which the control torque generated by the wheels has fewer than three degrees of freedom, which generally means that only one or two reaction wheels are installed. On the one hand, an underactuated configuration can improve the autonomous control performance of the spacecraft and maintain its normal operation in case of actuator failure; on the other hand, selecting the underactuated control mode at the design stage can reduce the design cost of the spacecraft control system. It is therefore of great engineering significance to study the underactuated configuration of reaction wheels (see [1, 2]).

At present, there is little literature on the optimization of underactuated reaction wheel configurations; most of the literature focuses on underactuated control theory and algorithms. In [3, 4], the attitude maneuver of a spacecraft with two working reaction wheels was studied using Ritz approximation theory and the particle swarm optimization algorithm. In [5–8], the underactuated control of zero-momentum spacecraft was studied for the cases of one and two reaction wheels. In [9–11], the differential smoothing characteristics of the system and the indirect Legendre pseudospectral method were used to study the attitude motion and tracking problem of spacecraft.

In fact, the essence of underactuated control is to control the underactuated axis through the coupling between the actuated axes and the underactuated axis. Therefore, this coupling relationship should be taken into account when designing an underactuated reaction wheel configuration. In this paper, two configuration optimization performance indicators are proposed for spacecraft with two reaction wheels, which provide a reference for the optimization design of underactuated configurations of reaction wheels.
2 Spacecraft Dynamics Equation and Two Performance Indicators

2.1 Spacecraft Dynamics Equation
According to Krishnan's research (see [5]), if the angular momentum of the control system is zero, a spacecraft with only two non-parallel reaction wheels is controllable. The angular momentum of the spacecraft control system can be unloaded through external-torque devices such as magnetic torquers or thrusters until the condition that the angular momentum of the system is zero is met. When the angular momentum of the spacecraft control system is zero, the spacecraft dynamics equation with two reaction wheels can be expressed as:
\[
\begin{aligned}
J\dot{\omega} &= -[n_1, n_2]\,\dot{h} = u_c \\
n_1 &= [\sin(\beta_1)\cos(\alpha_1);\ \cos(\beta_1);\ \sin(\beta_1)\sin(\alpha_1)] \\
n_2 &= [\sin(\beta_2)\cos(\alpha_2);\ \cos(\beta_2);\ \sin(\beta_2)\sin(\alpha_2)]
\end{aligned}
\tag{1}
\]
where $J = \mathrm{diag}\{I_1, I_2, I_3\}$ is the moment of inertia matrix of the spacecraft, $\omega = [\omega_1, \omega_2, \omega_3]^T$ is the angular velocity of the spacecraft relative to the inertial coordinate system, expressed in the body coordinate system, and $h$ is the angular momentum of the reaction wheels relative to the spacecraft. The angles $\alpha_i \in [\alpha_{i,\min}, \alpha_{i,\max}]$ and $\beta_i \in [\beta_{i,\min}, \beta_{i,\max}]$, $i = 1, 2$, are the installation angles of the reaction wheels, and $n_1$, $n_2$ are the unit vectors along the installation directions of the two reaction wheels. The spacecraft attitude kinematics equation expressed in quaternions is:
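\[
\begin{aligned}
\dot{q} &= 0.5\,(q_4\,\omega - \omega^{\times} q) \\
\dot{q}_4 &= -0.5\,\omega^{T} q
\end{aligned}
\tag{2}
\]
where the vector part $q = [q_1\ q_2\ q_3]^T$ and the scalar part $q_4$ describe the Euler axis and the rotation angle of the spatial rotation, respectively, and $\omega^{\times}$ denotes the cross-product (skew-symmetric) matrix of $\omega$.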
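As a minimal sketch of Eqs. (1) and (2), the following Python snippet builds a wheel axis from its installation angles and evaluates the quaternion kinematics. The function names and all numerical values are illustrative assumptions, not taken from the paper; the final line checks that the kinematics preserve the unit norm of the quaternion.

```python
import numpy as np

def wheel_axis(alpha, beta):
    """Unit vector of a wheel installation direction, as parameterized in Eq. (1)."""
    return np.array([np.sin(beta) * np.cos(alpha),
                     np.cos(beta),
                     np.sin(beta) * np.sin(alpha)])

def skew(v):
    """Cross-product matrix v^x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def quaternion_rates(q, q4, omega):
    """Right-hand side of Eq. (2): returns (q_dot, q4_dot)."""
    q_dot = 0.5 * (q4 * omega - skew(omega) @ q)
    q4_dot = -0.5 * (omega @ q)
    return q_dot, q4_dot

# Illustrative values only (assumed, not from the paper).
n1 = wheel_axis(np.deg2rad(30.0), np.deg2rad(60.0))
q = np.array([0.1, -0.2, 0.3])
q4 = np.sqrt(1.0 - q @ q)
omega = np.array([0.01, -0.02, 0.005])

q_dot, q4_dot = quaternion_rates(q, q4, omega)
# Time derivative of q^T q + q4^2; it should vanish for any omega.
norm_rate = 2.0 * (q @ q_dot + q4 * q4_dot)
```

The vanishing `norm_rate` is a quick sanity check on the signs in Eq. (2): the two terms in the derivative of the squared norm cancel exactly.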
2.2 Two Performance Indicators

Area of the actuated Euler axis. Rearranging formula (1) gives:
\[
\dot{\omega} = -[J^{-1}n_1,\ J^{-1}n_2]\,\dot{h}
\tag{3}
\]
Define
\[
\bar{n}_1 = \frac{J^{-1}n_1}{\|J^{-1}n_1\|}, \qquad
\bar{n}_2 = \frac{J^{-1}n_2}{\|J^{-1}n_2\|}, \qquad
\bar{n}_w = \frac{\bar{n}_1^{\times}\bar{n}_2}{\|\bar{n}_1^{\times}\bar{n}_2\|}
\]
where $\bar{n}_1$, $\bar{n}_2$ are the unit vectors of the spacecraft rotation axes actuated by the control along the $n_1$, $n_2$ directions, and $\bar{n}_w$ is the unit vector of the spacecraft's underactuated axis. In the underactuated control mode, the angular velocity $\omega$ of the spacecraft must lie in the plane determined by $\bar{n}_1$ and $\bar{n}_2$, and therefore satisfies:
\[
\bar{n}_w^{T}\omega = 0
\tag{4}
\]
Since, by (3), $\dot{\omega}$ is a linear combination of $\bar{n}_1$ and $\bar{n}_2$, the area of the plane region enclosed by $\bar{n}_1$, $\bar{n}_2$ directly represents the control torque acting on the spacecraft, that is, the control torque range of the actuated axes. This area therefore directly represents the control capability in the underactuated control mode.
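The construction of the actuated and underactuated axes can be sketched as follows. This is an illustration under assumed values: the inertia matrix, the wheel directions and the wheel momentum rate `h_dot` are hypothetical, and the function name is not from the paper. The residual at the end checks that the angular acceleration produced by the two wheels has no component along the underactuated axis.

```python
import numpy as np

def actuated_axes(J, n1, n2):
    """Return (n1_bar, n2_bar, nw_bar): normalized J^-1 n_i and their normalized cross product."""
    J_inv = np.linalg.inv(J)
    a1 = J_inv @ n1
    a2 = J_inv @ n2
    n1_bar = a1 / np.linalg.norm(a1)
    n2_bar = a2 / np.linalg.norm(a2)
    cross = np.cross(n1_bar, n2_bar)
    nw_bar = cross / np.linalg.norm(cross)  # undefined if the two wheels are parallel
    return n1_bar, n2_bar, nw_bar

# Assumed example: diagonal inertia, wheels along the body x and y axes.
J = np.diag([10.0, 12.0, 8.0])
n1 = np.array([1.0, 0.0, 0.0])
n2 = np.array([0.0, 1.0, 0.0])
n1_bar, n2_bar, nw_bar = actuated_axes(J, n1, n2)

# Angular acceleration from Eq. (1) for an arbitrary wheel momentum rate.
h_dot = np.array([0.3, -0.7])
omega_dot = -np.linalg.inv(J) @ np.column_stack([n1, n2]) @ h_dot
residual = nw_bar @ omega_dot  # component along the underactuated axis
```

With wheels on the x and y axes the underactuated axis is the body z axis, which makes the result easy to verify by hand.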
In the underactuated control mode, in order to improve the control performance of the control system, it is necessary to maximize the moment of momentum envelope $S$ enclosed by the unit vectors $\bar{n}_1$, $\bar{n}_2$:
\[
S = \|\bar{n}_1^{\times}\bar{n}_2\|
\tag{5}
\]
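To make the role of $S$ concrete, the sketch below evaluates Eq. (5) and runs a coarse brute-force search over one installation angle of the second wheel. The inertia values, the fixed first wheel and the grid are assumptions for illustration, any additional constraints on the angles are ignored, and a real design would likely use a proper optimizer.

```python
import numpy as np

def wheel_axis(alpha, beta):
    """Wheel installation direction as parameterized in Eq. (1)."""
    return np.array([np.sin(beta) * np.cos(alpha),
                     np.cos(beta),
                     np.sin(beta) * np.sin(alpha)])

def envelope_area(J, n1, n2):
    """S from Eq. (5): norm of the cross product of the normalized J^-1 n_i."""
    J_inv = np.linalg.inv(J)
    a1 = J_inv @ n1
    a2 = J_inv @ n2
    n1_bar = a1 / np.linalg.norm(a1)
    n2_bar = a2 / np.linalg.norm(a2)
    return np.linalg.norm(np.cross(n1_bar, n2_bar))

# Assumed setup: first wheel fixed on the body x axis, second wheel swept
# in the x-z plane (beta_2 = 90 deg) through alpha_2 in [0, 180] deg.
J = np.diag([10.0, 12.0, 8.0])
n1 = wheel_axis(0.0, np.pi / 2)
alphas = np.linspace(0.0, np.pi, 181)
areas = [envelope_area(J, n1, wheel_axis(a, np.pi / 2)) for a in alphas]
best_alpha = alphas[int(np.argmax(areas))]
```

In this diagonal-inertia example the search returns `best_alpha` at 90 degrees, where the two actuated axes are orthogonal and $S$ reaches its upper bound of 1.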
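That is, maximizing the momentum envelope area can be expressed as:
\[
\begin{cases}
\text{find } \alpha_i, \beta_i,\ i = 1, 2, \quad \alpha_i \in [\alpha_{i,\min}, \alpha_{i,\max}],\ \beta_i \in [\beta_{i,\min}, \beta_{i,\max}] \\
\max\ S \\
\text{s.t. } f(\alpha_i, \beta_i) < 0
\end{cases}
\tag{6}
\]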
where $f(\alpha_i, \beta_i)$ represents the constraints on $\alpha_i$, $\beta_i$.

Error angle of the underactuated axis. The potential function method originated from the Lyapunov direct method and was initially applied to the trajectory planning of space manipulators and robots. The main idea is to construct a non-negative potential function that takes its minimum value at the global equilibrium point of the system; when the time derivative of the potential function is negative, the system state converges to the equilibrium point. To satisfy state constraints, the potential function should take its maximum value near obstacles or states that need to be avoided. If the controller is designed so that the time derivative of the total potential function is kept negative, the system always moves from high to low potential and approaches the global equilibrium point while avoiding the obstacles. The potential function generally consists of an attractive potential function and a repulsive potential function. The attractive potential function $V_a$ is equivalent to a Lyapunov function and has its minimum value at the equilibrium point. The repulsive potential function $V_r$ affects the motion state of the system by constructing a virtual control. The total potential function of the system is the sum of the attractive and repulsive potential functions. Given the quaternion $[q_{c1}, q_{c2}, q_{c3}, q_{c4}]$ of the commanded attitude, the error quaternion is expressed as:
\[
\begin{bmatrix} e_1 \\ e_2 \\ e_3 \\ e_4 \end{bmatrix}
=
\begin{bmatrix}
q_{c4} & q_{c3} & -q_{c2} & -q_{c1} \\
-q_{c3} & q_{c4} & q_{c1} & -q_{c2} \\
q_{c2} & -q_{c1} & q_{c4} & -q_{c3} \\
q_{c1} & q_{c2} & q_{c3} & q_{c4}
\end{bmatrix}
\begin{bmatrix} q_1 \\ q_2 \\ q_3 \\ q_4 \end{bmatrix}
\tag{7}
\]
The kinematic equation of the error quaternion is as follows:
\[
\begin{aligned}
\dot{e} &= 0.5\,(e_4\,\omega - \omega^{\times} e) \\
\dot{e}_4 &= -0.5\,\omega^{T} e
\end{aligned}
\tag{8}
\]
e = [e1 e2 e3 ]T . In reference [13], the absorption potential function Va should decrease as the distance between the current attitude and the desired attitude decreases.
\[
V_a = e^T e + (1 - e_4)^2 + 0.5\lambda\,\omega^T \omega = 2(1 - e_4) + 0.5\lambda\,\omega^T \omega
\tag{9}
\]
where $\lambda$ is a weighting coefficient that adjusts the ratio between the attitude error and the angular velocity error. In underactuated attitude control, rotation about the underactuated axis $\bar{n}_w$ cannot be performed. Therefore, when the attitude error $e$ is nearly parallel to $\bar{n}_w$, the repulsive potential function $V_r$ must be large enough to provide a repulsive torque that drives the spacecraft away from $\bar{n}_w$; when $e$ is perpendicular to $\bar{n}_w$, $V_r$ should be at its minimum. In addition, to avoid jitter of the control system when a very small attitude error approaches the direction of the underactuated axis, $V_r$ is set to zero once the attitude error falls below a preset value $\varepsilon$. $V_r$ is designed as follows:
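\[
V_r =
\begin{cases}
\dfrac{(\bar{n}_w^T e)^2 / \|e\|^2}{1 - (\bar{n}_w^T e)^2 / \|e\|^2} = \dfrac{(\bar{n}_w^T e)^2}{1 - e_4^2 - (\bar{n}_w^T e)^2}, & \|e\| \ge \varepsilon \\[2ex]
0, & \|e\| < \varepsilon
\end{cases}
\tag{10}
\]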
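This function has the following properties:
\[
V_r =
\begin{cases}
\infty, & \bar{n}_w \parallel e, \; \|e\| \ge \varepsilon \\
0, & \bar{n}_w \perp e \ \text{or} \ \|e\| < \varepsilon
\end{cases}
\tag{11}
\]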
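Therefore, from this point of view, the underactuated axis error angle $\vartheta = \langle e, \bar{n}_w \rangle$ is an important indicator of the underactuated configuration of the reaction wheel. That is, the optimized underactuated error angle can be expressed as:
\[
\begin{cases}
\text{find } \alpha_i, \beta_i, \quad i = 1, 2, \quad \alpha_i \in [\alpha_{i,\min}, \alpha_{i,\max}], \; \beta_i \in [\beta_{i,\min}, \beta_{i,\max}] \\
\min \cos(\vartheta) \\
\text{s.t. } f(\alpha_i, \beta_i) < 0
\end{cases}
\tag{12}
\]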
From the perspective of the area of the actuated Euler axis and the error angle of the underactuated axis, the optimization of the underactuated configuration of the reaction wheel can be expressed as:
\[
\begin{cases}
\text{find } \alpha_i, \beta_i, \quad i = 1, 2, \quad \alpha_i \in [\alpha_{i,\min}, \alpha_{i,\max}], \; \beta_i \in [\beta_{i,\min}, \beta_{i,\max}] \\
\min \; \cos(\vartheta) + 1/S \\
\text{s.t. } f(\alpha_i, \beta_i) < 0
\end{cases}
\tag{13}
\]
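The error quaternion of Eq. (7) and the two potential functions of Eqs. (9) and (10) can be evaluated numerically. The following is a minimal sketch, not the authors' implementation: the function names are hypothetical, NumPy is an assumed dependency, and the quaternion is assumed to be of unit norm with the scalar part stored as the fourth element.

```python
import numpy as np

def error_quaternion(q, qc):
    """Error quaternion e = M(qc) q from Eq. (7); scalar part is element 4."""
    qc1, qc2, qc3, qc4 = qc
    M = np.array([
        [ qc4,  qc3, -qc2, -qc1],
        [-qc3,  qc4,  qc1, -qc2],
        [ qc2, -qc1,  qc4, -qc3],
        [ qc1,  qc2,  qc3,  qc4],
    ])
    return M @ np.asarray(q, dtype=float)

def attractive_potential(e4, omega, lam):
    """V_a from Eq. (9), using e^T e = 1 - e4^2 for a unit quaternion."""
    return 2.0 * (1.0 - e4) + 0.5 * lam * float(np.dot(omega, omega))

def repulsive_potential(e_vec, e4, n_w, eps):
    """V_r from Eq. (10); n_w is the unit vector of the underactuated axis."""
    if np.linalg.norm(e_vec) < eps:
        return 0.0  # small-error branch of Eq. (10)
    a = float(np.dot(n_w, e_vec))
    denom = 1.0 - e4**2 - a**2
    # denom vanishes when e is parallel to n_w, where V_r grows without bound
    return np.inf if denom <= 0.0 else a**2 / denom

if __name__ == "__main__":
    # Maneuver used in Sect. 3.1: from [0, 0, 0, 1] to the target quaternion.
    q_now = [0.0, 0.0, 0.0, 1.0]
    q_cmd = [0.3604, 0.4397, 0.0223, 0.8224]
    e = error_quaternion(q_now, q_cmd)
    n_w = np.array([0.0, 0.0, 1.0])  # assumed axis normal to both wheels
    print(e, repulsive_potential(e[:3], e[3], n_w, eps=1e-3))
```

The sign convention of the resulting error quaternion follows the matrix in Eq. (7) as printed; the threshold `eps` passed in the usage lines is an illustrative value only.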
2.3 Controller Design
To verify the effectiveness of the indicators described in this paper, the engineering examples in reference [12] are selected for simulation verification. The controller is the underactuated control method based on the potential function (see [13]). The potential function is $V_p = V_a + V_r$. When $\|e\| \ge \varepsilon$, the time derivative of the potential function $V_p$ is:
\[
\begin{aligned}
\dot{V}_p &= \dot{V}_a + \dot{V}_r \\
&= \lambda\,\omega^T \dot{\omega} - 2\dot{e}_4 + \frac{2\,\bar{n}_w^T e\;\bar{n}_w^T \dot{e}}{1 - e_4^2 - (\bar{n}_w^T e)^2} + \frac{(\bar{n}_w^T e)^2\,[\,2 e_4 \dot{e}_4 + 2\,\bar{n}_w^T e\;\bar{n}_w^T \dot{e}\,]}{[\,1 - e_4^2 - (\bar{n}_w^T e)^2\,]^2} \\
&= \omega^T (e + \lambda J^{-1} u_c) - \omega^T \left[ \frac{(V_r + 1)\,\bar{n}_w^T e\; e^{\times} \bar{n}_w + V_r e_4 e}{1 - e_4^2 - (\bar{n}_w^T e)^2} \right] \\
&= \omega^T \left[ \lambda J^{-1} u_c + e - \frac{(V_r + 1)\,\bar{n}_w^T e\; e^{\times} \bar{n}_w + V_r e_4 e}{1 - e_4^2 - (\bar{n}_w^T e)^2} \right] \\
&= \omega^T [\, \lambda J^{-1} u_c + Q(e, \bar{n}_w) \,]
\end{aligned}
\tag{14}
\]
The function $Q(e, \bar{n}_w)$ is
\[
Q(e, \bar{n}_w) = e - \frac{(V_r + 1)\,\bar{n}_w^T e\; e^{\times} \bar{n}_w + V_r e_4 e}{1 - e_4^2 - (\bar{n}_w^T e)^2}.
\]
The following underactuated controller is designed:
\[
u_c = \lambda^{-1} J \left\{ -g\,\omega - [\, Q(e, \bar{n}_w) - \bar{n}_w^T Q(e, \bar{n}_w)\,\bar{n}_w \,] \right\}
\tag{15}
\]
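where $g$ is the gain applied to the attitude angular velocity. Substituting the function $Q(e, \bar{n}_w)$ into the controller above gives:
\[
u_c =
\begin{cases}
0, & \|e\| < \varepsilon \\[1ex]
\lambda^{-1} J \left\{ -g\,\omega + \dfrac{\cos\vartheta}{\|e\| \sin^4\vartheta}\, e^{\times} \bar{n}_w - \left( 1 - \dfrac{e_4 \cos^2\vartheta}{\|e\|^2 \sin^4\vartheta} \right) (e - \|e\| \cos\vartheta \cdot \bar{n}_w) \right\}, & \|e\| \ge \varepsilon, \; \vartheta \ne 0 \\[2ex]
\tau\, h_{wi} \; (i = 1, 2), & \|e\| \ge \varepsilon, \; \vartheta = 0
\end{cases}
\tag{16}
\]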
The key parameters are λ = 1000 and g = 50. τ is the torque amplitude output by the reaction wheel. A complete stability proof of the controller is given in reference [13].
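The control law of Eq. (15) can be sketched in a few lines for the branch where the attitude error is at least ε and ϑ ≠ 0. This is an illustrative sketch under stated assumptions, not the authors' code: the function names and the inertia matrix `J` used in the usage lines are hypothetical, while λ = 1000 and g = 50 are the values quoted above.

```python
import numpy as np

def skew(v):
    """Cross-product matrix v^x, so that skew(v) @ w equals np.cross(v, w)."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def q_function(e_vec, e4, n_w):
    """Q(e, n_w) as written below Eq. (14); requires e not parallel to n_w."""
    a = float(n_w @ e_vec)
    denom = 1.0 - e4**2 - a**2
    v_r = a**2 / denom  # repulsive potential of Eq. (10)
    return e_vec - ((v_r + 1.0) * a * (skew(e_vec) @ n_w) + v_r * e4 * e_vec) / denom

def control_torque(e_vec, e4, omega, n_w, J, lam=1000.0, g=50.0):
    """Eq. (15): Q is fed back with its component along n_w removed."""
    Q = q_function(e_vec, e4, n_w)
    Q_perp = Q - (n_w @ Q) * n_w
    return (J @ (-g * omega - Q_perp)) / lam

if __name__ == "__main__":
    e_vec = np.array([-0.3604, -0.4397, -0.0223])
    e4 = float(np.sqrt(1.0 - e_vec @ e_vec))
    n_w = np.array([0.0, 0.0, 1.0])      # assumed underactuated axis
    J = np.diag([10.0, 12.0, 8.0])       # hypothetical inertia matrix
    print(control_torque(e_vec, e4, np.zeros(3), n_w, J))
```

Removing the component of Q along the underactuated axis reflects the fact that no feedback term can be realized about that axis; the small-error and ϑ = 0 branches of Eq. (16) are not covered by this sketch.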
3 Simulation Verification
3.1 Area of the Actuated Euler Axis Simulation Verification
The simulation covers three working conditions: the attitude quaternion maneuver from [0, 0, 0, 1] to [0.3604, 0.4397, 0.0223, 0.8224] is carried out for three wheel installation configurations.
Fig. 1 Angular momentum of two wheels (unit: N.ms)
The simulation results are as follows:
Simulation case 1: The installation vectors of the two wheels are n1 = [1; 0; 0] and n2 = [0.98; 0.199; 0], with S = 0.2613. The simulation results are shown in Figs. 1 and 2.
Simulation case 2: The installation vectors of the two wheels are n1 = [1; 0; 0] and n2 = [0.7; 0.7141; 0], with S = 0.8057. The simulation results are shown in Figs. 3 and 4.
Simulation case 3: The installation vectors of the two wheels are n1 = [1; 0; 0] and n2 = [0; 1; 0], with S = 1. The simulation results are shown in Figs. 5 and 6.
The three simulation cases confirm that the larger the area of the actuated Euler axis enclosed by the two wheels, the better the control effect and the higher the control performance of the attitude maneuver.
Fig. 2 Controlled attitude angle
Fig. 3 Angular momentum of two wheels (unit: N.ms)
Fig. 4 Controlled attitude angle
Fig. 5 Angular momentum of two wheels (unit: N.ms)
Fig. 6 Controlled attitude angle
3.2 Error Angle of the Underactuated Axis Simulation Verification
Different attitude maneuvers are carried out for the same wheel mounting structure. The simulation results are as follows:
Simulation case 1: The attitude quaternion maneuvers from [0, 0, 0, 1] to [0.3604, 0.6397, 0.3923, 0.5541], for which the cosine of the underactuated axis error angle is cos(ϑ) = 0.3923. The simulation results are shown in Figs. 7 and 8.
Simulation case 2: The attitude quaternion maneuvers from [0, 0, 0, 1] to [0.4604, 0.3397, 0.1523, 0.8059], for which the cosine of the underactuated axis error angle is cos(ϑ) = 0.1523. The simulation results are shown in Figs. 9 and 10.
Simulation case 3: The attitude quaternion maneuvers from [0, 0, 0, 1] to [0.3604, 0.4397, 0.0223, 0.8224], for which the cosine of the underactuated axis error angle is cos(ϑ) = 0.0223. The simulation results are shown in Figs. 11 and 12.
The three simulation cases confirm that the larger the underactuated axis error angle ϑ, that is, the farther the attitude error e is from being parallel to n̄w, the better the control effect and the higher the control performance of the attitude maneuver.
Fig. 7 Angular momentum of two wheels (unit: N.ms)
Fig. 8 Controlled attitude angle
Fig. 9 Angular momentum of two wheels (unit: N.ms)
Fig. 10 Controlled attitude angle
Fig. 11 Angular momentum of two wheels (unit: N.ms)
Fig. 12 Controlled attitude angle
4 Conclusions
In this paper, two optimization indicators, the area of the actuated Euler axis and the error angle of the underactuated axis, are proposed to analyze the influence of the reaction wheel configuration on the control performance of the control system. The simulation results verify that increasing the area of the actuated Euler axis or increasing the error angle of the underactuated axis (that is, reducing cos(ϑ)) effectively improves the control effect of the control system. The two optimization indicators can be applied to the performance analysis of existing underactuated configuration schemes of reaction wheels, and they provide engineering guidance for the selection and optimization of such configuration schemes.
References
1. Hong-xin, W., Hu, J., Yong-chun, X.: Spacecraft intelligent autonomous control: past, present and future. Aerospace Cont. Appl. 42(1), 1–6 (2016). https://doi.org/10.3969/j.issn.1674-1579.2016.01.001
2. Hong-xin, W., Shu-ping, T.: Spacecraft control: present and future. Aerospace Cont. Appl. 38(5), 1–7 (2012). https://doi.org/10.3969/j.issn.1674-1579.2012.05.001
3. Xin-sheng, G., Li-qun, C., Yan-zhu, L.: Nonholonomic motion planning for the attitude of rigid spacecraft with two momentum wheel actuators. Control Theory Appl. 21(5), 781–784 (2004). https://doi.org/10.3969/j.issn.1000-8152.2004.05.022
4. Xin-sheng, G., Peng-wei, S.: Nonholonomic motion planning for the attitude of underactuated spacecraft using particle swarm optimization. J. Astronaut. 27(6), 1233–1237 (2006). https://doi.org/10.3321/j.issn:1000-1328.2006.06.022
5. Krishnan, H., McClamroch, N.M., Reyhanoglu, M.: Attitude stabilization of a rigid spacecraft using two momentum wheel actuators. J. Guidance, Cont. Dynam. 18(2), 256–263 (1995). https://doi.org/10.2514/3.21378
6. Horri, N.M., Hodgart, S.: Attitude stabilization of an underactuated satellite using two wheels. In: IEEE Aerospace Conference, pp. 2629–2635 (2003)
7. Horri, N.M., Palmer, P., Hodgart, S.: Practical implementation of attitude control algorithms for an underactuated satellite. J. Guidance, Cont. Dynam. 35(1), 40–50 (2012). https://doi.org/10.2514/1.54075
8. Inumoh, L.O., Pechev, A., Horri, N.M., et al.: Three-axis attitude control of a satellite in zero-momentum mode using a tilted wheel methodology. In: AIAA Guidance Navigation and Control Conference, Minneapolis, Minnesota, USA, August 13–16 (2006)
9. Aguilar, C.O.: Attitude Control of a Differentially Flat Underactuated Rigid Spacecraft. University of Alberta, Edmonton (2005)
10. Tsiotras, P.: Feasible trajectory generation for underactuated spacecraft using differential flatness. In: Proceedings of the AAS/AIAA Astrodynamics Conference, Girdwood, AK, August 16–19 (1999)
11. Cai, W.W., Yang, L.P., Zhu, Y.W.: Optimal reorientation of asymmetric underactuated spacecraft using differential flatness and receding horizon control. Adv. Space Res. 55(1), 343–353 (2015). https://doi.org/10.1016/j.asr.2014.10.014
12. Jia-jia, F.: An Optimization Method for Reaction Wheel Configuration. 2021CCC
13. Yan-ning, G., Chuang-jiang, L., Guang-fu, M.: Spacecraft autonomous attitude maneuver control by potential function method. ACTA Aeronautica ET Astronautica Sinica 32(3), 457–464 (2011). CNKI:11-1929/V.20101111.0915.033
MQB-RRT*: An Improved Path Planning Algorithm Based on Improving Initial Solution and Fast Convergence
Tao Chen, Xinmin Chen, Feifan Yu, and Yue Lin
Abstract To improve the efficiency of the RRT* (rapidly-exploring random tree*) algorithm in searching for initial paths and generating high-quality paths, an enhanced variant named MQB-RRT* (Multipoint sampling and Back-end optimization based on RRT*) is proposed. First, the depth-parent-node concept is incorporated to enhance path smoothness. Second, a novel multi-point sampling method inspired by the roulette wheel selection strategy is introduced. Finally, a back-end optimization technique based on triangle optimization is employed to refine the paths generated iteratively by the algorithm. The experimental results demonstrate the superiority of the MQB-RRT* algorithm in terms of both the efficiency of searching for initial paths and the quality of the generated paths.
Keywords Path planning · Sampling-based algorithms · RRT · Optimal path planning
T. Chen · X. Chen · F. Yu · Y. Lin (B)
Ningbo Institute of Materials Technology Engineering, Chinese Academy of Sciences, Ningbo 315201, China
Qianwan Institute of CNITECH, Ningbo 315336, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_2

1 Introduction
As one of the important components of kinematics, path planning is widely used in UAVs (Unmanned Aerial Vehicles) [1], robots [2] and AGVs (Automated Guided Vehicles) [3]. The path planning problem is mainly to find a collision-free feasible path from the starting point to the end point in a map [4]. Sampling-based algorithms are very popular in path planning applications because they are not limited by the complexity of the controlled object and perform well in high-dimensional state spaces; among them, the probabilistic roadmap (PRM) algorithm [5] and the rapidly-exploring random tree (RRT) algorithm [6] are the most widely used. As a classical sampling-based algorithm, RRT can quickly find a feasible path through random sampling and collision detection, but because it does not consider the cost of the generated path, the path is usually not optimal. RRT* [7] was later proposed; it adds the ChooseParent and neighbor-node Rewire procedures to RRT, which preserves the completeness of the algorithm and ensures that the optimal solution is found as the number of samples tends to infinity. It can be regarded as a milestone in the development of RRT. To improve the performance of RRT*, many researchers have proposed improved algorithms that are now widely used. RRT*-Connect [8] combines RRT-Connect with RRT*; it finds shorter paths and converges faster than RRT*. Based on RRT*, Informed-RRT* [9] introduces a new sampling rule: after the initial path is found, the sampling area is re-selected and restricted according to each new path, which greatly improves the efficiency of finding the optimal path. RRT*-SMART [10] generates beacons near the obstacles around the path after RRT* finds the initial path; these beacons are taken into account in path re-selection, which speeds up convergence. P-RRT* [11] uses the idea of the artificial potential field in the sampling process to adjust the generated points and obtain a better next node, which shortens the time needed to find a path. Selecting deeper nodes for optimization is also effective: Q-RRT* [12] uses the triangle inequality to expand the parent node set in the ChooseParent and Rewire procedures, which reduces the path cost and improves the convergence speed. PQ-RRT* [13] combines the features of P-RRT* and Q-RRT* by integrating the sampling idea of P-RRT* into Q-RRT*, which further accelerates the convergence of Q-RRT*.
LQ-RRT* [14] combines virtual light with the Q-RRT* algorithm, guiding the sampling process through the virtual light, which further reduces the time needed to find a path and improves path quality. Building on the above research, the MQB-RRT* algorithm is proposed by combining multi-point sampling and a back-end optimization process with Q-RRT*. Multi-point sampling speeds up the search for the initial path, and back-end optimization generates nodes near obstacles to optimize the path. The remainder of the paper is organized as follows. The problem definition is given in Sect. 2. MQB-RRT* is introduced in Sect. 3. Simulation results are analyzed in Sect. 4. A summary and outlook are provided in Sect. 5.
2 Problem Definition
This section defines and formalizes two motion planning problems. Let $X$ be the configuration space, and let $X_{obs}, X_{free} \subset X$ be the obstacle region and the collision-free region, respectively. $X_{start}, X_{goal} \in X_{free}$ represent the start state and the goal state, respectively. The triple $(X_{start}, X_{goal}, X_{free})$ defines a path planning problem.
A continuous function $\sigma: [0, 1] \to X$ of bounded variation is called a path. If $\sigma(\tau) \in X_{free}$ for all $\tau \in (0, 1)$, the path is collision-free.
Definition 1 (Feasible Path Planning) Given a path planning problem $(X_{start}, X_{goal}, X_{free})$, find a collision-free path $\sigma$ such that $\sigma(0) = X_{start}$ and $\sigma(1) = X_{goal}$.
Definition 2 (Optimal Path Planning) Given a path planning problem $(X_{start}, X_{goal}, X_{free})$, find a feasible path $\sigma^*$ with the lowest cost, that is, $c(\sigma^*) = \min\{c(\sigma): \sigma \text{ is feasible}\}$.
3 MQB-RRT*
This section describes the implementation of MQB-RRT* and introduces its innovative points.
3.1 Proposed Algorithm
MQB-RRT* first establishes the root state $X_{start}$ in the configuration space. In each iteration, a new state $X_{rand}$ is sampled from $X_{free}$, and the state $X_{nearest}$ is chosen from the already generated states by a distance criterion. The iterations continue until the endpoint $X_{goal}$ is found, at which point the search terminates, yielding an expanded tree that contains a possible path from the start state to the goal state. The path generation process also integrates the ChooseParent and Rewire steps. ChooseParent optimizes the selection of a parent node within a specific radius centered on the new node $X_{new}$. It identifies the parent node with the lowest cost, thereby minimizing the overall cost of the path leading to $X_{new}$. Rewire further optimizes the nodes within the circular region used by ChooseParent. It replaces higher-cost paths with lower-cost alternatives, reconnecting the nodes in a more cost-efficient manner (Fig. 1).
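The ChooseParent and Rewire steps described above can be sketched as follows. This is a generic RRT*-style illustration under simplifying assumptions, not the authors' MATLAB implementation: the `Tree` class, the `collision_free` callback, and the 2D tuple representation of states are all hypothetical, and the costs of descendants are not propagated after a rewire.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2D points given as tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

class Tree:
    def __init__(self, root):
        self.parent = {root: None}  # node -> parent node
        self.cost = {root: 0.0}     # node -> path cost from the root

    def near(self, x, radius):
        """All tree nodes inside the circle of the given radius around x."""
        return [v for v in self.parent if dist(v, x) <= radius]

    def choose_parent(self, x_new, radius, collision_free):
        """Connect x_new to the neighbor that gives the lowest path cost."""
        best, best_cost = None, math.inf
        for v in self.near(x_new, radius):
            c = self.cost[v] + dist(v, x_new)
            if c < best_cost and collision_free(v, x_new):
                best, best_cost = v, c
        if best is not None:
            self.parent[x_new] = best
            self.cost[x_new] = best_cost
        return best

    def rewire(self, x_new, radius, collision_free):
        """Re-parent neighbors to x_new when that lowers their cost."""
        for v in self.near(x_new, radius):
            c = self.cost[x_new] + dist(x_new, v)
            if c < self.cost[v] and collision_free(x_new, v):
                self.parent[v] = x_new
                self.cost[v] = c

if __name__ == "__main__":
    free = lambda a, b: True  # obstacle-free map, for illustration only
    tree = Tree((0.0, 0.0))
    tree.parent[(1.0, 1.0)] = (0.0, 0.0)
    tree.cost[(1.0, 1.0)] = math.sqrt(2.0)
    tree.parent[(2.0, 0.0)] = (1.0, 1.0)          # detour through (1, 1)
    tree.cost[(2.0, 0.0)] = 2.0 * math.sqrt(2.0)
    tree.choose_parent((1.0, 0.0), 1.5, free)
    tree.rewire((1.0, 0.0), 1.5, free)
    print(tree.parent[(2.0, 0.0)], tree.cost[(2.0, 0.0)])
```

In the usage lines, the node at (2, 0) initially reaches the root through a detour; after the new node at (1, 0) is inserted, Rewire reconnects it through the straight segment.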
Fig. 1 Expansion tree principle diagram of MQB-RRT*

3.2 Depth Parent
The depth-parent node is the concept proposed in the Q-RRT* algorithm, and it improves the ChooseParent and Rewire procedures. In the ChooseParent phase, the node $X_{new}$ can select not only an immediate candidate parent but also an ancestor of that candidate up to a depth of $d_{th}$. This expanded selection makes it possible to discover paths that are better than the previous ones. As illustrated in Fig. 2, with the depth set to 2, the selection of the parent of $X_{new}$ considers not only $X_{nearest}$ within the circular region centered at $X_{new}$, but also the parent $X_1$ of $X_{nearest}$ and the parent $X_2$ of $X_1$. The costs are then evaluated: by the triangle inequality, the path through $X_2$ to $X_{new}$ has the lowest cost, so $X_2$ becomes the parent of $X_{new}$. The same approach is applied in the Rewire process, where $X_{new}$ can reconnect to a parent node at a depth of up to $d_{th}$ along the path. This strategy extends the search to deeper parent nodes and then optimizes based on the triangle inequality. Broadening the set of candidate parents can substantially improve the path. However, an excessively large depth for the parent node search may significantly increase the runtime of the algorithm.
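The depth-parent selection can be illustrated with a short sketch that collects the ancestors of each nearby node up to a given depth and picks the cheapest collision-free candidate. The helper names, the dictionary-based tree, and the `collision_free` callback are assumptions made for illustration; they are not taken from the paper or from the Q-RRT* reference implementation.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2D points given as tuples."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def ancestors(parent, node, depth):
    """Return up to `depth` ancestors of `node`, nearest ancestor first."""
    out = []
    cur = parent.get(node)
    while cur is not None and len(out) < depth:
        out.append(cur)
        cur = parent.get(cur)
    return out

def choose_parent_with_depth(parent, cost, x_new, near_nodes, depth, collision_free):
    """Pick the cheapest parent among nearby nodes and their ancestors."""
    candidates = []
    for v in near_nodes:
        for c in [v] + ancestors(parent, v, depth):
            if c not in candidates:
                candidates.append(c)
    feasible = [c for c in candidates if collision_free(c, x_new)]
    return min(feasible, key=lambda c: cost[c] + dist(c, x_new), default=None)

if __name__ == "__main__":
    x2, x1, x_nearest = (0.0, 0.0), (1.0, 1.0), (2.0, 0.0)
    parent = {x2: None, x1: x2, x_nearest: x1}
    cost = {x2: 0.0, x1: math.sqrt(2.0), x_nearest: 2.0 * math.sqrt(2.0)}
    free = lambda a, b: True  # obstacle-free, for illustration only
    print(choose_parent_with_depth(parent, cost, (3.0, 0.0), [x_nearest], 2, free))
```

With depth 2 the straight connection from the grandparent is preferred over the zigzag through the nearest node, which is the triangle-inequality argument made in the text; with depth 0 the function reduces to the ordinary ChooseParent over the nearby nodes.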
Fig. 2 Schematic diagram of depth-parent nodes

3.3 Multipoint Sampling
Inspired by the roulette wheel selection method in ant colony algorithms, a new sampling method is proposed. Multi-point sampling randomly selects n sampling points during the sampling process and then calculates the cost from each sampling point to the target using the Euclidean distance, as shown below:
\[
d = \sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}
\tag{1}
\]
where $d$ represents the cost between two points, and $(X_1, Y_1)$ and $(X_2, Y_2)$ denote the coordinates of the two nodes currently being evaluated. Afterwards, the next node is selected from the sampling points using the roulette wheel selection method, as shown in the following formulas:
\[
P_{ij}(t) = \frac{\eta_{ij}(t)}{\sum_{s \in n} \eta_{is}(t)}
\tag{2}
\]
\[
\eta_{ij} = \frac{1}{d_{ij}}
\tag{3}
\]
where $P_{ij}(t)$ represents the probability of selecting node $j$ as the next node at time $t$, $\eta_{ij}(t)$ denotes the heuristic function, $d_{ij}$ represents the Euclidean distance between node $i$ and node $j$, and $n$ denotes the set of all nodes selected during sampling. In addition, to preserve the randomness of the algorithm, multi-point sampling is used only while the initial path has not yet been found. Once a path is found, the algorithm switches to traditional sampling to expand the search space.
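Equations (1) to (3) amount to drawing one of the n candidate points with probability proportional to the inverse of its distance to the target. A minimal sketch follows; the function name, the tuple representation of points, and the small floor on the distance (to avoid division by zero) are assumptions added for illustration.

```python
import math
import random

def roulette_select(candidates, goal, rng=random):
    """Pick one candidate with probability proportional to 1 / distance to goal."""
    weights = []
    for x, y in candidates:
        d = math.hypot(goal[0] - x, goal[1] - y)   # Eq. (1)
        weights.append(1.0 / max(d, 1e-9))         # Eq. (3), with a floor on d
    total = sum(weights)
    probs = [w / total for w in weights]           # Eq. (2)
    r = rng.random()
    acc = 0.0
    for c, p in zip(candidates, probs):
        acc += p
        if r <= acc:
            return c, probs
    return candidates[-1], probs  # guards against rounding in the running sum

if __name__ == "__main__":
    rng = random.Random(0)
    samples = [(rng.uniform(0, 1184), rng.uniform(0, 842)) for _ in range(5)]
    chosen, probs = roulette_select(samples, goal=(1100.0, 800.0), rng=rng)
    print(chosen, probs)
```

Candidates closer to the target receive a larger share of the wheel, yet every candidate keeps a non-zero probability, so the sampling stays random while being biased toward the goal.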
3.4 Back-End Optimization
After the path search, existing algorithms usually return the path directly, but such paths can often be optimized further. To address this, we propose a back-end optimization. First, we use the triangle inequality to delete some nodes and reconnect the path. Next, we generate new nodes to replace old ones through moving-node optimization; as shown in Fig. 3, $X_{new}$ ultimately becomes $x_{new1}$ through optimization on both sides. In addition, after some nodes have been optimized, a node may become directly connectable, without collision, to the node two positions ahead of it. The intermediate node then interferes with the optimization process, so such cases must be detected and the redundant node deleted. Finally, because a different number of sampling points leads to a different optimization effect for obstacles of different shapes, we optimize every new path output during the iterations of the algorithm and keep the best solution.

Fig. 3 Schematic diagram of back-end optimization
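The first back-end step, deleting nodes with the triangle inequality and reconnecting the path, can be sketched as a greedy shortcut pass. This is a common generic variant shown under assumptions, not the authors' exact procedure: `prune_path` and the `collision_free` callback are hypothetical names, and the moving-node step is not covered.

```python
def prune_path(path, collision_free):
    """Greedy shortcutting of a waypoint list.

    From each kept node, jump to the farthest later node that can be reached
    without collision. By the triangle inequality, replacing a chain of
    segments with a single straight segment never lengthens the path.
    """
    if len(path) < 3:
        return list(path)
    pruned = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        # Walk back from the end until a collision-free connection is found;
        # the direct successor path[i + 1] is assumed to be connectable.
        while j > i + 1 and not collision_free(path[i], path[j]):
            j -= 1
        pruned.append(path[j])
        i = j
    return pruned

if __name__ == "__main__":
    zigzag = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
    # Hypothetical obstacle that blocks only the direct start-to-goal segment.
    blocked = lambda a, b: not (a == (0, 0) and b == (4, 0))
    print(prune_path(zigzag, blocked))
```

Running such a pass again after nodes have been moved also covers the case mentioned in the text, where a node becomes directly connectable to the node two positions ahead and the intermediate node should be removed.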
4 Simulation
In this section, MQB-RRT* is compared with RRT* and Q-RRT* on two-dimensional maps of three different environments, with a map size of 1184 × 842. Because sampling-based algorithms are random, each algorithm was run 100 times. Four indexes were used to evaluate the performance of the algorithms. Because it is difficult to record all data in a single run, two groups of simulations were used. In the first group, we record the cost of the optimal path $C_{optimal}$, the cost of the initial solution $C_{init}$, and the time to find the initial solution $T_{init}$. In the second group, we record the time $T_{5\%}$ to find a sub-optimal solution whose cost is within $1.05\,C_{optimal}$. The simulations were run on an Intel i7-10700 CPU with 16 GB of RAM, and the simulation platform is MATLAB.
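The index $T_{5\%}$ can be read off a log of the best path cost over time. The sketch below assumes such a log is available as a list of (elapsed time, best cost so far) pairs; the function name and the log values in the usage lines are hypothetical and are not measurements from the paper.

```python
def time_to_within(history, c_optimal, tolerance=0.05):
    """Return the first time at which the best cost is within
    (1 + tolerance) * c_optimal, or None if that never happens.

    history: list of (elapsed_time, best_cost_so_far) pairs in time order.
    """
    threshold = (1.0 + tolerance) * c_optimal
    for t, cost in history:
        if cost <= threshold:
            return t
    return None

if __name__ == "__main__":
    # Hypothetical log of one run, for illustration only.
    log = [(0.2, 700.0), (0.3, 686.0), (0.5, 660.0)]
    print(time_to_within(log, c_optimal=654.38))
```

Recording $C_{optimal}$ in a first group of runs and $T_{5\%}$ in a second group, as described above, corresponds to calling this helper with the previously measured optimal cost.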
4.1 Cluttered Environment
Figure 4 shows the paths generated by RRT*, Q-RRT* and MQB-RRT* in the cluttered environment, and all performance data are listed in Table 1. MQB-RRT* performs best. In terms of the initial path length, the optimal path length and the time to find a sub-optimal path, MQB-RRT* ($C_{init}$ = 669.81, $C_{optimal}$ = 654.38, $T_{5\%}$ = 0.31) improves on Q-RRT* ($C_{init}$ = 730.75, $C_{optimal}$ = 666.63, $T_{5\%}$ = 1.76) and RRT* ($C_{init}$ = 804.41, $C_{optimal}$ = 673.53, $T_{5\%}$ = 2.91). In terms of the time to find the initial path, Q-RRT* ($T_{init}$ = 0.57) is slightly slower than RRT* ($T_{init}$ = 0.38), mainly because Q-RRT* performs additional ChooseParent and Rewire work while searching, which increases the time to find the initial path. MQB-RRT* ($T_{init}$ = 0.21) overcomes this through the multipoint sampling method and is the fastest of the three algorithms.
Fig. 4 Cluttered environment simulation diagram

Table 1 Simulation results in the cluttered environment
Algorithm    C_init    C_optimal    T_init    T_5%
RRT*         804.41    673.53       0.38      2.91
Q-RRT*       730.75    666.63       0.57      1.76
MQB-RRT*     669.81    654.38       0.21      0.31

4.2 U-Shaped Environment
Figure 5 shows the paths generated by RRT*, Q-RRT* and MQB-RRT* in the U-shaped environment, and all performance data are listed in Table 2. MQB-RRT* again performs best. As in the cluttered environment, in terms of the initial path length, the optimal path length and the time to find a sub-optimal path, MQB-RRT* ($C_{init}$ = 1403.16, $C_{optimal}$ = 1385.90, $T_{5\%}$ = 2.17) improves on Q-RRT* ($C_{init}$ = 1516.29, $C_{optimal}$ = 1417.78, $T_{5\%}$ = 4.31) and RRT* ($C_{init}$ = 1700.29, $C_{optimal}$ = 1444.36, $T_{5\%}$ = 6.45). Here Q-RRT* improves considerably on RRT*, but a gap to MQB-RRT* remains. In terms of the time to find the initial path, RRT* ($T_{init}$ = 2.31) is still faster than Q-RRT* ($T_{init}$ = 3.24), and MQB-RRT* ($T_{init}$ = 1.80) is the fastest.

Fig. 5 U-shaped environment simulation diagram

Table 2 Simulation results in the U-shaped environment
Algorithm    C_init     C_optimal    T_init    T_5%
RRT*         1700.29    1444.36      2.31      6.45
Q-RRT*       1516.29    1417.78      3.24      4.31
MQB-RRT*     1403.16    1385.90      1.80      2.17
4.3 Maze Environment
Figure 6 shows the paths generated by RRT*, Q-RRT* and MQB-RRT* in the maze environment, and all performance data are listed in Table 3. MQB-RRT* again performs best. As in the cluttered and U-shaped environments, in terms of the initial path length, the optimal path length and the time to find a sub-optimal path, MQB-RRT* ($C_{init}$ = 1845.76, $C_{optimal}$ = 1833.66, $T_{5\%}$ = 2.63) improves on Q-RRT* ($C_{init}$ = 1942.87, $C_{optimal}$ = 1868.35, $T_{5\%}$ = 5.83) and RRT* ($C_{init}$ = 2188.27, $C_{optimal}$ = 1913.05, $T_{5\%}$ = 7.97), while Q-RRT* in turn improves on RRT*. In terms of the time to find the initial path, RRT* ($T_{init}$ = 4.01) is still faster than Q-RRT* ($T_{init}$ = 4.82), and MQB-RRT* ($T_{init}$ = 2.46) is the fastest.

Fig. 6 Maze environment simulation diagram

Table 3 Simulation results in the maze environment
Algorithm    C_init     C_optimal    T_init    T_5%
RRT*         2188.27    1913.05      4.01      7.97
Q-RRT*       1942.87    1868.35      4.82      5.83
MQB-RRT*     1845.76    1833.66      2.46      2.63
5 Conclusion
The key innovations can be summarized as follows:
1. This paper integrates the depth-parent-node idea into the RRT* algorithm.
2. This paper proposes a sampling method that effectively improves search efficiency.
3. This paper proposes a back-end optimization procedure, based on the characteristics of the algorithm, that effectively optimizes the path.
To address the slow convergence of RRT* to the optimal solution, MQB-RRT* is proposed; it accelerates convergence and optimizes the generated path by using multipoint sampling and back-end optimization. Moreover, since MQB-RRT* is a tree-expanding algorithm, its optimization steps can be combined with any sampling strategy to further improve efficiency. Although MQB-RRT* is a promising algorithm, applying it to physical objects requires taking kinematic constraints into account. In addition, the performance of the algorithm in dynamic environments should also be verified. Both are left for future work.
References
1. Zhao, Y., Zheng, Z., Liu, Y.: Survey on computational-intelligence-based UAV path planning. Knowl.-Based Syst. 158, 54–64 (2018)
2. Hu, B., Cao, Z., Zhou, M.: An efficient RRT-based framework for planning short and smooth wheeled robot motion under kinodynamic constraints. IEEE Trans. Ind. Electron. 99, 1–1 (2020)
3. Wang, H.: An improved timed elastic band (TEB) algorithm of autonomous ground vehicle (AGV) in complex environment. Sensors 21 (2021)
4. Siciliano, B., Khatib, O.: Springer Handbook of Robotics. Springer (2016)
5. Kavraki, L., Svestka, P., Overmars, M.H.: Probabilistic roadmaps for path planning in high-dimensional configuration spaces. IEEE Trans. Robot. Autom. 12(4), 566–580 (1996)
6. LaValle, S.M.: Rapidly-exploring random trees: a new tool for path planning. Comput. Sci. Dept. Oct. 98 (1998)
7. Karaman, S., Frazzoli, E.: Sampling-based algorithms for optimal motion planning. Int. J. Rob. Res. (2011)
8. Klemm, S., et al.: RRT*-Connect: faster, asymptotically optimal motion planning. In: 2015 IEEE International Conference on Robotics and Biomimetics (ROBIO 2015). IEEE (2015)
9. Gammell, J.D., Srinivasa, S.S., Barfoot, T.D.: Informed RRT*: optimal sampling-based path planning focused via direct sampling of an admissible ellipsoidal heuristic. IEEE (2014)
10. Nasir, J., et al.: RRT*-Smart: a rapid convergence implementation of RRT*. Int. J. Adv. Rob. Syst. 10(7), 299 (2013)
11. Qureshi, A.H., Ayaz, Y.: Potential functions based sampling heuristic for optimal path planning. Autonomous Rob. 40(6), 1079–1093 (2016)
12. Jeong, I.B., Lee, S.J., Kim, J.H.: Quick-RRT*: triangular inequality-based implementation of RRT* with improved initial solution and convergence rate. Expert Syst. Appl. 123, 82–90 (2019)
13. Li, Y., et al.: PQ-RRT*: an improved path planning algorithm for mobile robots. Expert Syst. Appl. 113425 (2020)
14. Zhuge, C., et al.: An improved Q-RRT* algorithm based on virtual light. Comput. Syst. Sci. Eng. 39(1), 107–119 (2021)
Research on Attention Mechanism Based Assisted Diagnosis of Pulmonary Embolism
HuaTao Li, ZhongYi Hu, and MingZhe Hu
Abstract The traditional diagnosis of pulmonary embolism (PE) requires doctors to read computed tomography (CT) images, which is very time-consuming and may result in patients not receiving timely and effective treatment. Therefore, computer-assisted intelligent diagnosis is crucial. In this paper, we propose a Coordination and Spatial Attention ConvNeXt (CSACNet) based on the CSA attention mechanism to realize early intelligent auxiliary diagnosis of pulmonary embolism. First, the convolutional neural network uses ConvNeXt as the backbone for feature extraction, which mitigates the gradient problems caused by network deepening. Second, the CSA attention module is a fusion of CoordAttention and Spatial Attention; it extracts channel, position, and spatial information between features to obtain more discriminative features and improve the classification accuracy of the network on pulmonary embolism images. The proposed method was tested on the largest public PE competition dataset (RSNA-STR), and the experimental results show that it outperforms the current best method and improves the performance of pulmonary embolism assisted diagnosis.
Keywords Diagnosis of pulmonary embolism · Deep learning · Attention mechanism
H. Li · Z. Hu (B)
College of Computer and Artificial Intelligence, WenZhou University, Wenzhou 325035, China
WenZhou Intelligent Information Processing and Key Analysis Laboratory, Wenzhou 325035, China
M. Hu
Radiology and Imaging Department of WenZhou People's Hospital, WenZhou, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_3
1 Introduction
Pulmonary embolism (PE) refers to the clinical and pathophysiological syndrome of pulmonary circulation obstruction caused by endogenous or exogenous emboli blocking the pulmonary artery or its branches. Clinical manifestations range from asymptomatic cases to severe conditions that require intensive care (such as shock or cardiac arrest). Treatment depends on the severity of the clot, and because patients often exhibit non-specific symptoms, diagnosing PE can be challenging. Typical manifestations include acute onset of breathing difficulty, pleuritic chest pain, tachycardia, and right heart strain [1]. A European study [2] shows that approximately 34 percent of patients with pulmonary embolism die suddenly before treatment starts or without showing any other abnormalities. Therefore, computer-assisted intelligent diagnosis is needed for the early detection of pulmonary embolism, so as to help doctors make decisions that benefit patients. In addition, compared with the whole image, the region that carries information about a local vascular occlusion is very small. To extract more discriminative features, many researchers have used deep learning models to extract high-level features. Lin et al. [3] proposed an end-to-end trainable convolutional neural network (CNN) for pulmonary embolism classification, which spatially transforms the extracted candidate blocks containing suspected pulmonary embolism into images aligned along the direction of the blood vessels and then classifies them. Sudhir Suman et al. [4] proposed a two-stage attention-based CNN-LSTM network to predict early pulmonary embolism. Tajbakhsh et al. [5] proposed a vessel-aligned multi-planar imaging method that uses a CNN as the feature extraction network. This method uses two image channels to provide details of candidate image regions, thereby improving the diagnosis of pulmonary embolism. Nahid Ul Islam et al. [6] introduced Squeeze-and-Excitation (SE) [7] modules into the Xception [8] network structure, which not only obtains feature information but also mines the relations between feature channels to optimize model performance. However, these networks focus mainly on local information and on information between feature channels, while neglecting other important feature information, such as spatial and positional information, which is crucial for extracting discriminative features. To address these issues and improve the detection performance and efficiency for pulmonary embolism, this paper proposes a deep convolutional network model based on CSA attention to extract feature information. The model consists of the following modules: a ConvNeXt backbone, which is an improved network built on ResNet50, and an integrated CoordSpatialAttention (CSA) module. Specifically, the ConvNeXt model was chosen as the backbone for feature extraction because its residual connections enable the model to learn global and local feature information, making the extracted features more refined than those extracted by networks without residual connections. The CSA module extracts position and spatial information while attending to feature channel information, which helps improve medical image classification performance.
Research on Attention Mechanism Based Assisted …
1.1 RSNA Dataset
The dataset used in this paper was released on Kaggle by the Radiological Society of North America (RSNA), a regional academic radiology society jointly established by the United States and Canada [1]. It consists of CT pulmonary angiography images organized into a training set and a test set. The training set contains 7279 files with a total of 1790594 labeled images in DCM format, while the test set consists of unlabeled images. In our experiments, we used only the training set data and split it into 6279 files for training and the remaining 1000 files for testing.
1.2 Data Preprocessing
The PE lung dataset [1] published by RSNA was preprocessed as follows. First, a windowing operation was performed, in which the HU values of the pixels in each CT image were clipped from the original range of −1000 to 1000 to the range of −250 to 450 to enhance the contrast of the lung tissue. Because the lungs do not occupy a large part of the image, a lung localization operation was then used to remove image content unrelated to the lungs and save computation time. Finally, the processed files were input into the network for training. Figure 1 shows the effect of this processing: from top to bottom, the original single-channel image from the dataset, the image after windowing, and the image after lung localization. After preprocessing, the single-channel images are fused into three-channel images as input to the network.
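The windowing and channel-fusion steps described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names `window_hu` and `fuse_three_channels` are hypothetical, the rescaling to [0, 1] is an assumption, and fusing three neighbouring slices into one three-channel image is only one plausible reading of "fused into three channel images".

```python
import numpy as np


def window_hu(hu_image, lo=-250.0, hi=450.0):
    """Clip HU values to [lo, hi] and rescale to [0, 1] (rescaling is an assumption)."""
    clipped = np.clip(np.asarray(hu_image, dtype=float), lo, hi)
    return (clipped - lo) / (hi - lo)


def fuse_three_channels(slice_a, slice_b, slice_c):
    """Stack three single-channel images into one H x W x 3 array (assumed fusion rule)."""
    return np.stack([slice_a, slice_b, slice_c], axis=-1)


# Toy 2 x 2 "CT slice" in Hounsfield units.
toy = np.array([[-1000.0, -250.0], [100.0, 1000.0]])
windowed = window_hu(toy)
fused = fuse_three_channels(windowed, windowed, windowed)
```

Values below −250 HU map to 0 and values above 450 HU map to 1, so the available intensity range is spent on the tissue of interest.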
2 Method
2.1 Image Classification Based on Deep Learning
Convolutional neural networks are among the most commonly used network models in deep learning and are widely applied in fields such as image recognition, image segmentation, object detection, and speech analysis. LeNet [9] used convolutional neural networks for image classification, and convolutional neural networks have been prominent in image processing ever since. AlexNet [10] achieved state-of-the-art results on ImageNet, demonstrating the superiority of convolutional neural networks in image classification and reigniting research interest in computer vision tasks. VGGNet [11] showed that 3 × 3 filters are effective and allow for a deeper network structure. ResNet [12] introduced a residual structure to address vanishing and exploding gradients, making deep convolutional networks easier to optimize. Xception
H. Li et al.
Fig. 1 Sample diagram of PE image preprocessing
[8] uses depthwise separable convolution to reduce the number of parameters in the convolutional layers. The ConvNext network used in this paper starts from the ResNet50 model and adopts, step by step, the design choices of the Swin Transformer. Without introducing any additional attention modules, it gradually improves the accuracy of ResNet-50 from 76.1% to 82.0%. This network is therefore used as the feature extraction network in this paper and is combined with an attention mechanism for pulmonary embolism image classification.
2.2 Attention Mechanism
Since Kelvin Xu et al. [13] first studied attention in vision, attention modules have proven effective at improving performance in popular tasks such as classification, detection, and segmentation by focusing on important features and suppressing unnecessary ones. The recent Transformer [14] architecture is also built on attention. Hu et al. [7] explicitly modeled the interdependencies between channels and, by combining the channel information of the local receptive field, proposed a novel architectural unit, the “Squeeze-and-Excitation” block, whose channel attention improves the representation ability of the network and the generalization ability of the model. Wang et al. [15] observed that the dimensionality reduction adopted by SENet has a negative effect on attention prediction and proposed a local cross-channel interaction strategy that avoids dimensionality reduction. To add spatial information between features, Woo et al. [16] built on the existing channel attention mechanism and fused spatial feature maps generated from the spatial relationships between features, which broadly improved the representation ability of convolutional neural networks.
2.3 A Convolutional Neural Network Based on CSA Attention Mechanism for Early Diagnosis of Pulmonary Embolism
Attention mechanisms have attracted researchers' interest because they enable the network to focus on the target area and extract distinguishable features [17]. Vaswani et al. [14] demonstrated the effectiveness of attention mechanisms in computer vision tasks. In deep learning based diagnosis of pulmonary embolism images, the SE module [7] and the ECA module [15] are generally fused with traditional convolutional neural networks. These modules usually attend only to the relationships between feature channels and ignore position and spatial information, which is very important for generating spatially selective attention maps. To overcome this limitation, this paper uses the CSA attention mechanism to obtain the position and spatial information of the feature space and to explore the connections between feature maps, with the aim of reducing diagnostic errors while controlling computational cost. Motivated by the above, this paper proposes a convolutional neural network model based on the CSA module. Fig. 2 shows the proposed method.
Fig. 2 CSACNet network structure diagram
In the figure, the preprocessed image is input into the Backbone network ConvNext, and the generated feature map is used as input to the CSA module to extract the correlation information between the corresponding channels, and finally perform classification prediction. The dimensions of the corresponding feature map and the activation function used are marked in the figure.
2.4 Coordinate Attention Module
The coordinate attention module (shown in Fig. 2) takes a feature map from the backbone network as input and outputs a feature map that fuses position and channel information. Global pooling is commonly used to encode spatial information globally for channel attention, but because it compresses the global spatial information into a single channel descriptor, positional information is difficult to preserve. To let the attention module capture positional information in the feature space, two one-dimensional global pooling operations aggregate the input features along the vertical and horizontal directions into two separate direction-aware feature maps. Long-range dependencies are captured along one spatial direction while precise positional information is retained along the other, which helps the network locate targets of interest more accurately. The global pooling in formula (1) is therefore decomposed into a pair of one-dimensional feature encoding operations.
$$O_c = \frac{1}{H \times W} \sum_{i=1}^{H} \sum_{j=1}^{W} x_c(i, j) \tag{1}$$
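The difference between the global pooling of formula (1) and the two one-dimensional pooling operations can be made concrete with a small NumPy sketch. The function names and the (C, H, W) array layout are assumptions chosen for illustration; only the averaging itself comes from the text.

```python
import numpy as np


def global_average_pool(x):
    """Formula (1): average each channel over all H x W positions -> shape (C,)."""
    return x.mean(axis=(1, 2))


def directional_pools(x):
    """Two 1D poolings on a (C, H, W) feature map.

    pooled_h[c, h] averages row h over the width  -> keeps vertical position.
    pooled_w[c, w] averages column w over the height -> keeps horizontal position.
    """
    pooled_h = x.mean(axis=2)
    pooled_w = x.mean(axis=1)
    return pooled_h, pooled_w


features = np.arange(24, dtype=float).reshape(2, 3, 4)  # C=2, H=3, W=4
pooled_h, pooled_w = directional_pools(features)
```

Averaging either directional map along its remaining axis recovers the global descriptor, which is why the pair can be read as a decomposition of global pooling that additionally keeps one coordinate.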
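Specifically, for a given input feature x, each channel is first encoded along the horizontal and vertical directions using pooling kernels of size (H, 1) or (1, W). The outputs of channel c at height h and at width w are given in formulas (2) and (3).
$$O_c^h(h) = \frac{1}{W} \sum_{0 \le i < W} x_c(h, i) \tag{2}$$
$$q(u) = \begin{cases}
u_i \,\mathrm{sgn}(u), & \dfrac{u_i}{1+\delta} < |u| \le u_i,\ \dot{u} < 0,\ \text{or}\ u_i < |u| \le \dfrac{u_i}{1-\delta},\ \dot{u} > 0 \\
u_i (1+\delta)\,\mathrm{sgn}(u), & u_i < |u| \le \dfrac{u_i}{1-\delta},\ \dot{u} < 0,\ \text{or}\ \dfrac{u_i}{1-\delta} < |u| \le \dfrac{u_i(1+\delta)}{1-\delta},\ \dot{u} > 0 \\
0, & 0 \le |u| < \dfrac{u_{\min}}{1+\delta},\ \dot{u} < 0,\ \text{or}\ \dfrac{u_{\min}}{1+\delta} \le |u| \le u_{\min},\ \dot{u} > 0 \\
q(u(t^-)), & \dot{u} = 0
\end{cases} \tag{4}$$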
where $u_i = \rho^{(1-i)} u_{\min}$ with integer $i = 1, 2, \ldots$, $u_{\min} > 0$ determines the size of the dead-zone of $q(u)$, and $0 < \rho < 1$ is a measure of quantization density [1]. The parameter $\delta = (1 - \rho)/(1 + \rho)$, so the smaller $\rho$ is, the coarser the quantizer is. As $\rho$ approaches zero, $\delta$ approaches 1, and $q(u)$ has fewer quantization levels as $u$ ranges over a given interval. The hysteretic quantizer $q(u)$ can be divided into a linear part and a nonlinear part, so $q(u)$ can be written as
$$q(u) = u + d \tag{5}$$
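To make the decomposition $q(u) = u + d$ tangible, the sketch below implements a simplified, memoryless logarithmic quantizer with the same levels $u_i = \rho^{(1-i)} u_{\min}$. It is deliberately not the hysteretic quantizer (4): the dependence on the sign of $\dot{u}$ and the intermediate levels $u_i(1+\delta)$ are omitted, and the function name `log_quantize` and the default parameter values are assumptions. It is meant only to show how the quantization error $d$ scales with $u$.

```python
import math


def log_quantize(u, delta=0.2, u_min=0.1):
    """Memoryless logarithmic quantizer (simplified stand-in for the hysteretic one).

    Returns 0 inside the dead-zone |u| <= u_min / (1 + delta); otherwise returns the
    level u_min * rho**(-k) whose sector (level/(1+delta), level/(1-delta)] contains |u|.
    """
    rho = (1.0 - delta) / (1.0 + delta)  # inverse of delta = (1 - rho) / (1 + rho)
    mag = abs(u)
    if mag <= u_min / (1.0 + delta):
        return 0.0
    # Smallest k >= 0 with mag <= u_min * rho**(-k) / (1 - delta).
    k = max(0, math.ceil(math.log(mag * (1.0 - delta) / u_min) / math.log(1.0 / rho)))
    return math.copysign(u_min * rho ** (-k), u)


u = 1.0
d = log_quantize(u) - u  # nonlinear part d in q(u) = u + d
```

For this simplified quantizer the error satisfies $|d| \le \delta |u|$ outside the dead-zone and $|d| \le u_{\min}$ inside it, which is the same kind of sector bound that the analysis relies on.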
Considering the nonlinearity d, the following lemma holds.
Lemma 1 The nonlinearity d satisfies the following inequalities:
$$d^2 \le \delta^2 u^2, \quad \forall |u| \ge u_{\min} \tag{6}$$
Z. Jiang et al.
$$d^2 \le u_{\min}^2, \quad \forall |u| \le u_{\min} \tag{7}$$
Proof See Lemma 1 of [10] for a proof.
According to (5), we have
$$q(w^T u) = \sum_{i=1}^{m} w_i (u_i + d_i) \tag{8}$$
2.3 Actuator Faults
Similar to [25], the actuator failure can be modeled as
$$u_i = \eta_i v_i + u_{i,0} \tag{9}$$
As in [25], the actuator faults considered in this paper are lock-in-place and loss of effectiveness, which can be described as follows.
Lock-in-place model: $u_i(t) = u_{i,0}$, $t \ge t_i$, $i \in \{j_1, j_2, \ldots, j_p\} \subset \{1, 2, \ldots, m\}$.
Loss-of-effectiveness model: $u_i(t) = \eta_i v_i$, $t \ge t_i$, $i \in \{j_1, j_2, \ldots, j_p\} \subset \{1, 2, \ldots, m\}$.
Here $v_i$ is the input control signal, $\eta_i$ and $u_{i,0}$ are unknown constants with $0 \le \eta_i \le 1$, and $t_i$ are the unknown time instants at which the actuator faults occur. Combining (1) and (9), we have
$$q(w^T u) = \sum_{i=1}^{m} w_i (\eta_i v_i + u_{i,0} + d_i) \tag{10}$$
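The two fault types of model (9) can be illustrated with a short sketch. The function name `actuator_output` and its argument names are hypothetical; the numerical values in the usage lines mirror the fault scenario that the simulation section applies to the two actuators (a lock at 0.01 after 5 s and a drop to 10% effectiveness after 10 s).

```python
def actuator_output(v, t, fault_time, eta=1.0, stuck_value=None):
    """Actuator output u_i for commanded input v at time t under model (9).

    Before fault_time the actuator is healthy (u = v). Afterwards it is either
    locked in place at stuck_value (eta = 0, u = u_{i,0}) or loses effectiveness
    (u = eta * v with 0 < eta <= 1). Both argument names are illustrative.
    """
    if t < fault_time:
        return v
    if stuck_value is not None:
        return stuck_value  # lock-in-place: output no longer depends on v
    return eta * v          # loss of effectiveness


# Actuator 1 locks at 0.01 from t = 5 s; actuator 2 keeps 10% effectiveness from t = 10 s.
u1_before = actuator_output(2.0, 4.0, fault_time=5.0, stuck_value=0.01)
u1_after = actuator_output(2.0, 6.0, fault_time=5.0, stuck_value=0.01)
u2_after = actuator_output(2.0, 12.0, fault_time=10.0, eta=0.1)
```

After a lock-in-place fault the commanded signal has no influence on that actuator, which is why the analysis requires at least one actuator to remain healthy or only partially degraded.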
We define $T_r$, $r \le m$, as the finite time instant of the last actuator failure. Suppose that there are $p_k$ failed actuators $j_1, j_2, \ldots, j_{p_k}$. Of these, $q_k$ suffer from lock-in-place faults, and this subset is defined as $\Omega_{k,1} = \{j_{1,1}, j_{1,2}, \ldots, j_{1,q_k}\}$. The others suffer from loss-of-effectiveness failures, and this subset is defined as $\Omega_{k,2} = \overline{\{j_{1,1}, j_{1,2}, \ldots, j_{1,q_k}\}} \cap \{1, 2, \ldots, m\}$. In this article, the control design and stability analysis are carried out on one time interval $[T_{k-1}, T_k)$, where $k = 1, 2, \ldots, r+1$, $T_0 = 0$, $T_{r+1} = \infty$.
Assumption 4 Each actuator can suffer from an actuator fault only once. For the control input of system (1), up to $m - 1$ actuators may experience lock-in-place faults; that is, at least one actuator is fault-free or suffers only a loss of effectiveness.
Adaptive Fault-Tolerant Control for a Class …
3 Controller Design
In this section, we design adaptive backstepping control laws for a class of nonlinear systems in Brunovsky form (1) with quantized input and quantized states. Actuator faults are also considered.
3.1 States Are Not Quantized
In this case, we assume that the states are not quantized, and the change of coordinates is introduced as follows:
$$z_1 = x_1 \tag{11}$$
$$z_i = x_i - \alpha_{i-1}, \quad i = 2, 3, \ldots, n \tag{12}$$
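where $\alpha_{i-1}$ denotes the virtual control signal to be designed at the $(i-1)$th step.
Step i ($i = 1, 2, \ldots, n-1$): The design of the first $(n-1)$ subsystems follows the backstepping design procedure in [6].
$$\alpha_i = -(c_i + 1) z_i + \sum_{j=1}^{i-1} \frac{\partial \alpha_{i-1}}{\partial x_j} x_{j+1} \tag{13}$$
where $c_i$ are positive parameters to be designed. Similar to [22], the partial derivatives $\partial \alpha_{i-1} / \partial x_j$ are constants which depend on $c_1, c_2, \ldots, c_{i-1}$. For instance,
$$\frac{\partial \alpha_1}{\partial x_1} = \frac{\partial \alpha_1}{\partial z_1} \frac{\partial z_1}{\partial x_1} = -(c_1 + 1) \tag{14}$$
$$\frac{\partial \alpha_2}{\partial x_1} = -(c_2 + 1) \frac{\partial z_2}{\partial x_1} = -(c_2 + 1)(c_1 + 1) \tag{15}$$
$$\frac{\partial \alpha_2}{\partial x_2} = -(c_2 + 1) \frac{\partial z_2}{\partial x_2} + \frac{\partial \alpha_1}{\partial x_1} = -(c_2 + 1) - (c_1 + 1) \tag{16}$$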
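Because every virtual control in (13) is linear in the states, the constant partial derivatives can be generated mechanically. The sketch below stores each $\alpha_i$ as a coefficient vector over $(x_1, \ldots, x_n)$ and applies (12) and (13) recursively; the function name `virtual_control_gradients` and the list-based representation are illustrative assumptions, not part of the paper.

```python
def virtual_control_gradients(c):
    """Return grads with grads[i-1][j-1] = d(alpha_i)/d(x_j) for recursion (13).

    c is the list of gains [c_1, ..., c_k]; each alpha_i is linear in the states,
    so it is stored as a list of constant coefficients.
    """
    n = len(c)
    grads = []
    prev = [0.0] * n  # alpha_0 = 0, so z_1 = x_1
    for i in range(1, n + 1):
        gain = c[i - 1] + 1.0
        cur = [0.0] * n
        # -(c_i + 1) * z_i with z_i = x_i - alpha_{i-1}
        cur[i - 1] -= gain
        for j in range(n):
            cur[j] += gain * prev[j]
        # + sum_{j=1}^{i-1} d(alpha_{i-1})/d(x_j) * x_{j+1}
        for j in range(i - 1):
            cur[j + 1] += prev[j]
        grads.append(cur)
        prev = cur
    return grads


grads = virtual_control_gradients([1.0, 2.0])  # c_1 = 1, c_2 = 2
```

With $c_1 = 1$ and $c_2 = 2$ the sketch gives $\partial\alpha_1/\partial x_1 = -2$, $\partial\alpha_2/\partial x_1 = -6$ and $\partial\alpha_2/\partial x_2 = -5$, in agreement with (14), (15) and (16).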
Define the Lyapunov function
$$V_{n-1} = \sum_{i=1}^{n-1} \frac{1}{2} z_i^2 \tag{17}$$
Therefore, the derivative of $V_{n-1}$ is given by
$$\dot{V}_{n-1} \le -\sum_{i=1}^{n-1} c_i z_i^2 - \frac{1}{2} z_{n-1}^2 + z_{n-1} z_n \tag{18}$$
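Step n: The virtual control $\alpha_n(x_1, x_2, \ldots, x_n)$ is designed as
$$\alpha_n = \left(c_n + \frac{3}{2}\right) z_n - \sum_{j=1}^{n-1} \frac{\partial \alpha_{n-1}}{\partial x_j} x_{j+1} \tag{19}$$
where $c_n$ is a positive parameter. We define a function
$$\varpi_n = \alpha_n + \varphi^T \hat{\theta} + \psi \tag{20}$$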
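Consider the Lyapunov function
$$V_n = \sum_{i=1}^{n} \frac{1}{2} z_i^2 + \frac{1}{2\gamma_\theta} \tilde{\theta}^T \tilde{\theta} + \sum_{i \in \Omega_{k,2}} \frac{\eta_i |w_i|}{2} \tilde{k}^T \Gamma_k^{-1} \tilde{k} \tag{21}$$
where $\gamma_\theta$ is a positive parameter to be designed and $\Gamma_k^T = \Gamma_k > 0$ is a positive definite matrix. Let $\hat{\theta}$ be the estimate of $\theta$, $\hat{k}$ be the estimate of $k$, and
$$\hat{\theta} = \theta - \tilde{\theta}, \quad \hat{k} = k - \tilde{k} \tag{22}$$
We design $k = [k_0, k_1, \ldots, k_m]^T$, with
$$k_0 = \frac{z_n \varpi_n}{\sum_{i \in \Omega_{k,2}} \eta_i |w_i| (1-\delta) \sqrt{z_n^2 \varpi_n^2 + \varepsilon^2}}, \qquad k_j = \frac{-w_j u_{j,0}}{\sum_{i \in \Omega_{k,2}} \eta_i |w_i|} \ \text{for } j \in \Omega_{k,1},$$
and $k_j = 0$ for $j \in \Omega_{k,2}$, where $\varepsilon$ is a positive parameter to be designed. In addition, similar to [25], $k$ satisfies the following equation
$$\sum_{i \in \Omega_{k,2}} \eta_i |w_i| k^T \beta = \frac{z_n \varpi_n^2}{(1-\delta) \sqrt{z_n^2 \varpi_n^2 + \varepsilon^2}} - \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \tag{23}$$
where $\beta = [\varpi_n, 1, 1, \ldots, 1]^T \in \mathbb{R}^{(m+1) \times 1}$.
The actual control input signal and the adaptive laws for $\hat{\theta}$ and $\hat{k}$ are designed as
$$v_i = -\mathrm{sgn}(w_i) \hat{k}^T \beta \tag{24}$$
$$\dot{\hat{\theta}} = \gamma_\theta \varphi z_n - \gamma_\theta \sigma_\theta \hat{\theta} \tag{25}$$
$$\dot{\hat{k}} = \Gamma_k \left( (1-\delta) z_n \beta - \sigma_k \hat{k} \right) \tag{26}$$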
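Since
$$z_n w_i d_i \le \delta |z_n w_i \eta_i v_i| + u_{i,\min} |z_n w_i| \le -\delta z_n w_i \eta_i v_i + \frac{1}{2} z_n^2 + \frac{1}{2} (u_{i,\min} w_i)^2 \tag{27}$$
we have
$$\begin{aligned}
\sum_{i \in \Omega_{k,2}} \left( z_n w_i \eta_i v_i + z_n w_i d_i \right) &\le \sum_{i \in \Omega_{k,2}} (1-\delta) z_n w_i \eta_i v_i + \frac{1}{2} z_n^2 + \frac{1}{2} (u_{i,\min} w_i)^2 \\
&= \frac{-z_n^2 \varpi_n^2}{\sqrt{z_n^2 \varpi_n^2 + \varepsilon^2}} + \frac{1}{2} z_n^2 + \frac{1}{2} (u_{i,\min} w_i)^2 + \sum_{i \in \Omega_{k,2}} (1-\delta) |w_i| \eta_i z_n \tilde{k}^T \beta \\
&\le \varepsilon - z_n \varpi_n + \frac{1}{2} z_n^2 + \frac{1}{2} (u_{i,\min} w_i)^2 + \sum_{i \in \Omega_{k,2}} (1-\delta) |w_i| \eta_i z_n \tilde{k}^T \beta
\end{aligned} \tag{28}$$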
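By combining (10), (21), (24)-(26) and (28), the derivative of $V_n$ is given by
$$\begin{aligned}
\dot{V}_n \le{}& -\sum_{i=1}^{n-1} c_i z_i^2 - \frac{1}{2} z_{n-1}^2 + z_{n-1} z_n + z_n \sum_{i=1}^{m} w_i (\eta_i v_i + u_{i,0} + d_i) \\
& + z_n \Big( \varphi^T \theta + \psi - \sum_{j=1}^{n-1} \frac{\partial \alpha_{n-1}}{\partial x_j} x_{j+1} \Big) - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} \\
\le{}& -\sum_{i=1}^{n-1} c_i z_i^2 - \frac{1}{2} z_{n-1}^2 + z_{n-1} z_n + z_n \sum_{i \in \Omega_{k,2}} w_i (\eta_i v_i + d_i) + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 \\
& + z_n \Big( \varphi^T \theta + \psi - \sum_{j=1}^{n-1} \frac{\partial \alpha_{n-1}}{\partial x_j} x_{j+1} \Big) + \frac{1}{2} z_n^2 - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 + \varepsilon + \frac{1}{2} (u_{i,\min} w_i)^2 + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 + \frac{1}{\gamma_\theta} \tilde{\theta}^T \big( \gamma_\theta \varphi z_n - \dot{\hat{\theta}} \big) \\
& - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} + \sum_{i \in \Omega_{k,2}} (1-\delta) |w_i| \eta_i z_n \tilde{k}^T \beta \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 - \frac{\sigma_\theta}{2} \tilde{\theta}^T \tilde{\theta} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} \tilde{k}^T \tilde{k} + \varepsilon + \frac{1}{2} (u_{i,\min} w_i)^2 \\
& + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 + \frac{\sigma_\theta}{2} \theta^T \theta + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k \\
\le{}& -C V_n + D
\end{aligned} \tag{29}$$
where $C = \min \left\{ c_i, \gamma_\theta \sigma_\theta, \dfrac{\sigma_k}{\lambda_{\max}(\Gamma_k^{-1})} \right\}$ and $D = \varepsilon + \frac{1}{2} (u_{i,\min} w_i)^2 + \frac{1}{2} \big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \big)^2 + \frac{\sigma_\theta}{2} \theta^T \theta + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k$. This shows that $V_n$ is uniformly bounded. Therefore, the signals $z_i(t)$, $\hat{\theta}$ and $\hat{k}$ are bounded. From (9), (12), (13), (19) and (24), it follows that $x_i(t)$ and $u(t)$ are bounded. Therefore all the closed-loop signals are globally uniformly bounded.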
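3.2 States Are Quantized
In this case, the states $x_i$, $i = 1, 2, \ldots, n$, are quantized. For the change of coordinates, we choose
$$\bar{z}_1 = q(x_1) \tag{30}$$
$$\bar{z}_i = q(x_i) - \bar{\alpha}_{i-1} \tag{31}$$
For the virtual control laws, we choose
$$\bar{\alpha}_1 = -(c_1 + 1) \bar{z}_1 \tag{32}$$
$$\bar{\alpha}_i = -(c_i + 1) \bar{z}_i + \sum_{j=1}^{i-1} \frac{\partial \alpha_{i-1}}{\partial x_j} q(x_{j+1}) \tag{33}$$
$$\bar{\alpha}_n = \left(c_n + \frac{3}{2}\right) \bar{z}_n - \sum_{j=1}^{n-1} \frac{\partial \alpha_{n-1}}{\partial x_j} q(x_{j+1}) \tag{34}$$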
The input signals and parameter adaptive laws are chosen as
$$v_i = -\mathrm{sgn}(w_i) \hat{k}^T \beta \tag{35}$$
$$\dot{\hat{\theta}} = \gamma_\theta \bar{\varphi} \bar{z}_n - \gamma_\theta \sigma_\theta \hat{\theta} \tag{36}$$
$$\dot{\hat{k}} = \Gamma_k \left( (1-\delta) \bar{z}_n \beta - \sigma_k \hat{k} \right) \tag{37}$$
where $\gamma_\theta$, $\sigma_\theta$ and $\sigma_k$ are positive parameters and $\Gamma_k$ is a positive definite matrix. The design of the parameters $k$ and $\beta$ will be shown later.
Theorem 1 Consider the closed-loop adaptive system consisting of plant (1), the hysteretic quantizer (4), the adaptive laws (36), (37) and the control law (35). Then all the closed-loop signals are globally uniformly bounded.
Proof We first establish some preliminary results, stated in the following lemmas, that are used to ensure the boundedness of all signals.
Lemma 2
$$\| \varphi(q(x_1), \ldots, q(x_n)) - \varphi(x_1, \ldots, x_n) \| \le \Delta_\varphi \tag{38}$$
$$| \psi(q(x_1), \ldots, q(x_n)) - \psi(x_1, \ldots, x_n) | \le \Delta_\psi \tag{39}$$
$$| z_i(q(x_1), \ldots, q(x_n)) - z_i(x_1, \ldots, x_n) | \le \Delta_{z_i} \tag{40}$$
$$| \alpha_i(q(x_1), \ldots, q(x_n)) - \alpha_i(x_1, \ldots, x_n) | \le \Delta_{\alpha_i} \tag{41}$$
where $i = 1, \ldots, n$, and $\Delta_\psi$ and $\Delta_\varphi$ are positive constants which depend on the quantization bound $\delta$ and the Lipschitz constants $L_\psi$ and $L_\varphi$, respectively. $\Delta_{z_i}$ and $\Delta_{\alpha_i}$ are positive constants which depend on the quantization bound and the control design parameters $(c_1, \ldots, c_i)$.
Proof See Lemma 1 of [22] for a proof.
Lemma 3
$$\| (x_1, \ldots, x_n) \| \le L_x \| (z_1, \ldots, z_n) \| \tag{42}$$
where $L_x$ is a positive constant which depends on the control design parameters $(c_1, \ldots, c_{n-1})$.
Proof See Lemma 2 of [22] for a proof.
Now we give the proof of Theorem 1. Consider the Lyapunov function
$$V_n = \sum_{i=1}^{n} \frac{1}{2} z_i^2 + \frac{1}{2\gamma_\theta} \tilde{\theta}^T \tilde{\theta} + \sum_{i \in \Omega_{k,2}} \frac{\eta_i |w_i|}{2} \tilde{k}^T \Gamma_k^{-1} \tilde{k} \tag{43}$$
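where $\gamma_\theta$ is a positive parameter to be designed and $\Gamma_k^T = \Gamma_k > 0$ is a positive definite matrix. $k$ is a vector related to $\bar{\varpi}_n$, which satisfies the following equation
$$\bar{\varpi}_n = \bar{\alpha}_n + \bar{\varphi}^T \hat{\theta} + \bar{\psi} \tag{44}$$
and we change the parameter $k$ to $k = [k_0, k_1, \ldots, k_m]^T$ with
$$k_0 = \frac{\bar{z}_n \bar{\varpi}_n}{\sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{(1-\delta)^2}{(1+\delta)} \sqrt{\bar{z}_n^2 \bar{\varpi}_n^2 + ((1+\delta)\varepsilon)^2}}, \qquad k_j = \frac{-w_j u_{j,0}}{\sum_{i \in \Omega_{k,2}} \eta_i |w_i|} \ \text{for } j \in \Omega_{k,1},$$
and $k_j = 0$ for $j \in \Omega_{k,2}$, where $\varepsilon$ is a positive parameter to be designed. $k$ satisfies
$$\sum_{i \in \Omega_{k,2}} \eta_i |w_i| k^T \beta = \frac{\bar{z}_n \bar{\varpi}_n^2}{\frac{(1-\delta)^2}{(1+\delta)} \sqrt{\bar{z}_n^2 \bar{\varpi}_n^2 + ((1+\delta)\varepsilon)^2}} - \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \tag{45}$$
with $\beta = [\bar{\varpi}_n, 1, 1, \ldots, 1]^T \in \mathbb{R}^{(m+1) \times 1}$.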
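Therefore, the derivative of $V_n$ is given as
$$\begin{aligned}
\dot{V}_n \le{}& -\sum_{i=1}^{n-1} c_i z_i^2 - \frac{1}{2} z_{n-1}^2 + z_{n-1} z_n + z_n \sum_{i=1}^{m} w_i (\eta_i v_i + u_{i,0} + d_i) \\
& + z_n \Big( \varphi^T \theta + \psi - \sum_{j=1}^{n-1} \frac{\partial \alpha_{n-1}}{\partial x_j} x_{j+1} \Big) - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} \\
\le{}& -\sum_{i=1}^{n-1} c_i z_i^2 - \frac{1}{2} z_{n-1}^2 + z_{n-1} z_n + z_n \sum_{i \in \Omega_{k,2}} w_i (\eta_i v_i + d_i) + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 \\
& + \frac{1}{2} z_n^2 + z_n \Big( \varphi^T \theta + \psi - \sum_{j=1}^{n-1} \frac{\partial \alpha_{n-1}}{\partial x_j} x_{j+1} \Big) - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 + z_n \sum_{i \in \Omega_{k,2}} (1-\delta) w_i \eta_i v_i + z_n \big( \alpha_n + \varphi^T \theta + \psi \big) - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} \\
& - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} + \frac{1}{2} (u_{i,\min} w_i)^2 + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 + \frac{-z_n \bar{z}_n \bar{\varpi}_n^2}{\frac{(1-\delta)}{(1+\delta)} \sqrt{\bar{z}_n^2 \bar{\varpi}_n^2 + ((1+\delta)\varepsilon)^2}} + z_n \big( \alpha_n + \varphi^T \theta + \psi \big) - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} \\
& + \frac{1}{2} (u_{i,\min} w_i)^2 - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} + \sum_{i \in \Omega_{k,2}} (1-\delta) |w_i| \eta_i z_n \tilde{k}^T \beta + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2
\end{aligned} \tag{46}$$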
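Consider the following inequality
$$\frac{-z_n \bar{z}_n \bar{\varpi}_n^2}{\frac{(1-\delta)}{(1+\delta)} \sqrt{\bar{z}_n^2 \bar{\varpi}_n^2 + ((1+\delta)\varepsilon)^2}} \le \frac{-z_n^2 \bar{\varpi}_n^2}{\sqrt{z_n^2 \bar{\varpi}_n^2 + \varepsilon^2}} \le \varepsilon - z_n \bar{\varpi}_n \tag{47}$$
In (47), the following inequality is used, which follows from the property of quantizer (4) and holds for $z_n > 0$:
$$(1-\delta) z_n \le \bar{z}_n \le (1+\delta) z_n \tag{48}$$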
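We can obtain
$$\begin{aligned}
\dot{V}_n \le{}& -\sum_{i=1}^{n} c_i z_i^2 + \varepsilon + z_n \big( -\bar{\varpi}_n + \alpha_n + \varphi^T \theta + \psi \big) + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 \\
& + \sum_{i \in \Omega_{k,2}} (1-\delta) |w_i| \eta_i z_n \tilde{k}^T \beta - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} + \frac{1}{2} (u_{i,\min} w_i)^2 \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 + z_n \big( \alpha_n + \varphi^T \theta + \psi - \bar{\alpha}_n - \bar{\varphi}^T \hat{\theta} - \bar{\psi} \big) + \sum_{i \in \Omega_{k,2}} (1-\delta) |w_i| \eta_i z_n \tilde{k}^T \beta \\
& - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \tilde{k}^T \Gamma_k^{-1} \dot{\hat{k}} - \frac{1}{\gamma_\theta} \tilde{\theta}^T \dot{\hat{\theta}} + \varepsilon + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 + \frac{1}{2} (u_{i,\min} w_i)^2 \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 + \theta^T \varphi z_n - \hat{\theta}^T \bar{\varphi} z_n - \tilde{\theta}^T \bar{\varphi} \bar{z}_n + z_n (\alpha_n - \bar{\alpha}_n) + z_n (\psi - \bar{\psi}) \\
& - \frac{\sigma_\theta}{2} \tilde{\theta}^T \tilde{\theta} + \frac{\sigma_\theta}{2} \theta^T \theta - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} \tilde{k}^T \tilde{k} + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k + \varepsilon \\
& + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 + \frac{1}{2} (u_{i,\min} w_i)^2 \\
\le{}& -\sum_{i=1}^{n} c_i z_i^2 + \theta^T \varphi z_n - \hat{\theta}^T \bar{\varphi} z_n - \tilde{\theta}^T \bar{\varphi} \bar{z}_n + |z_n| \Delta_{\alpha_n} + |z_n| \Delta_\psi - \frac{\sigma_\theta}{2} \tilde{\theta}^T \tilde{\theta} \\
& - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} \tilde{k}^T \tilde{k} + \frac{\sigma_\theta}{2} \theta^T \theta + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k + \varepsilon \\
& + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 + \frac{1}{2} (u_{i,\min} w_i)^2
\end{aligned} \tag{49}$$
Using the properties of the quantizer, (2), (38), (40) and (42), the following inequality is satisfied:
$$\begin{aligned}
\theta^T \varphi z_n - \hat{\theta}^T \bar{\varphi} z_n - \tilde{\theta}^T \bar{\varphi} \bar{z}_n &= \theta^T \varphi z_n - \theta^T \bar{\varphi} z_n + \tilde{\theta}^T \bar{\varphi} z_n - \tilde{\theta}^T \bar{\varphi} \bar{z}_n \\
&\le \|\theta\| |z_n| \Delta_\varphi + \|\tilde{\theta}\| \|\bar{\varphi}\| \Delta_{z_n} \\
&\le |z_n| \|\theta\| \Delta_\varphi + \|\tilde{\theta}\| L_\varphi \| (q(x_1), q(x_2), \ldots, q(x_n)) \| \Delta_{z_n} \\
&\le |z_n| \|\theta\| \Delta_\varphi + B \|\tilde{\theta}\| \|z\|
\end{aligned} \tag{50}$$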
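where $B = L_\varphi (L_x + \sqrt{n}\,\delta) \Delta_{z_n}$. Then, by using Young's inequality, we have
$$|z_n| \Delta_{\alpha_n} + |z_n| \Delta_\psi + |z_n| \|\theta\| \Delta_\varphi \le \frac{3}{4} c_n z_n^2 + \frac{1}{c_n} \Delta_{\alpha_n}^2 + \frac{1}{c_n} \Delta_\psi^2 + \frac{1}{c_n} \|\theta\|^2 \Delta_\varphi^2 \tag{51}$$
$$B \|\tilde{\theta}\| \|z\| \le \frac{c}{2} \|z\|^2 + \frac{B^2}{2c} \|\tilde{\theta}\|^2 \tag{52}$$
where $c = \min \{ c_1, c_2, \ldots, \frac{1}{4} c_n \}$. Considering (49), (50), (51) and (52), we have
$$\begin{aligned}
\dot{V}_n \le{}& -\sum_{i=1}^{n} c_i z_i^2 + \frac{3}{4} c_n z_n^2 + \frac{1}{c_n} \Delta_{\alpha_n}^2 + \frac{1}{c_n} \Delta_\psi^2 + \frac{1}{c_n} \|\theta\|^2 \Delta_\varphi^2 + \frac{c}{2} \|z\|^2 \\
& + \frac{B^2}{2c} \|\tilde{\theta}\|^2 - \frac{\sigma_\theta}{2} \tilde{\theta}^T \tilde{\theta} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} \tilde{k}^T \tilde{k} + \frac{\sigma_\theta}{2} \theta^T \theta \\
& + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k + \varepsilon + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2 + \frac{1}{2} (u_{i,\min} w_i)^2 \\
\le{}& -\frac{c}{2} \|z\|^2 - \Big( \frac{\sigma_\theta}{2} - \frac{B^2}{2c} \Big) \tilde{\theta}^T \tilde{\theta} - \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} \tilde{k}^T \tilde{k} + \frac{1}{c_n} \Delta_{\alpha_n}^2 \\
& + \frac{1}{c_n} \Delta_\psi^2 + \frac{1}{c_n} \|\theta\|^2 \Delta_\varphi^2 + \varepsilon + \frac{\sigma_\theta}{2} \theta^T \theta + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k \\
& + \frac{1}{2} (u_{i,\min} w_i)^2 + \frac{1}{2} \Big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \Big)^2
\end{aligned} \tag{53}$$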
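Choosing $\sigma_\theta > \dfrac{B^2}{c}$, and denoting $C = \min \left\{ c, \gamma_\theta \Big( \sigma_\theta - \dfrac{B^2}{c} \Big), \dfrac{\sigma_k}{\lambda_{\max}(\Gamma_k^{-1})} \right\}$ and $D = \frac{1}{c_n} \Delta_{\alpha_n}^2 + \frac{1}{c_n} \Delta_\psi^2 + \frac{1}{c_n} \|\theta\|^2 \Delta_\varphi^2 + \frac{\sigma_\theta}{2} \theta^T \theta + \sum_{i \in \Omega_{k,2}} \eta_i |w_i| \frac{\sigma_k}{2} k^T k + \varepsilon + \frac{1}{2} (u_{i,\min} w_i)^2 + \frac{1}{2} \big( \sum_{i \in \Omega_{k,1}} w_i u_{i,0} \big)^2$, the derivative of $V_n$ can be rewritten as
$$\dot{V}_n \le -C V_n + D \tag{54}$$
Solving the above inequality yields
$$V(t) \le V(0) e^{-Ct} + \frac{D}{C} \left( 1 - e^{-Ct} \right) \le V(0) + \frac{D}{C} \tag{55}$$
which shows that $V_n$ is uniformly bounded. Therefore, the signals $z_i(t)$, $\hat{\theta}$ and $\hat{k}$ are bounded. From (9), (31), (33), (34) and (35), it follows that $x_i(t)$ and $u(t)$ are bounded. Therefore all the closed-loop signals are globally uniformly bounded.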
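The step from the differential inequality $\dot{V}_n \le -C V_n + D$ to the explicit bound can be checked numerically. The sketch below integrates the worst case $\dot{V} = -CV + D$ with a forward Euler scheme and compares it with the closed-form expression; the function names, the step size and the values of $V(0)$, $C$ and $D$ are arbitrary illustrative choices, not quantities from the paper.

```python
import math


def lyapunov_bound(v0, c, d, t):
    """Closed-form bound V(0) e^{-Ct} + (D/C)(1 - e^{-Ct})."""
    return v0 * math.exp(-c * t) + d / c * (1.0 - math.exp(-c * t))


def simulate_worst_case(v0, c, d, t_end, dt=1e-3):
    """Forward-Euler integration of the equality case dV/dt = -C V + D."""
    v = v0
    for _ in range(int(round(t_end / dt))):
        v += dt * (-c * v + d)
    return v


v0, c, d = 5.0, 2.0, 1.0
bound_at_3 = lyapunov_bound(v0, c, d, 3.0)
simulated_at_3 = simulate_worst_case(v0, c, d, 3.0)
```

As $t$ grows the bound converges to $D/C$, so the size of the residual set is governed by the ratio of the constant $D$ to the decay rate $C$.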
4 Simulation Results
In this section, we consider a robot manipulator system from [26]. The equation of motion for the manipulator system is
$$J \ddot{q}_0 + B \dot{q}_0 + M g l \sin(q_0) = q(w^T u) \tag{56}$$
where $q_0$ and $\dot{q}_0$ are the angle and angular velocity, respectively, $q(w^T u)$ is the quantized input, $B$ is the overall damping coefficient, $J$ is the total rotational inertia of the motor, $M$ is the link's total mass, $l$ is the distance from the joint axis to the center of mass, and $g$ is the gravitational acceleration. We choose $x_1 = q_0$ and $x_2 = \dot{q}_0$. Therefore, the system (56) can be rewritten as
$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = q(w^T u) + \varphi(x)^T \theta + \psi(x) \tag{57}$$
where $\varphi(x)^T = -10 \sin(x_1)$ and $\psi(x) = -2 x_2$, and it is assumed that the parameter $\theta$ is unknown. Note that, of the two actuators $u_1$ and $u_2$, at least one must remain fault-free or experience only a loss-of-effectiveness fault with a positive effective proportion; otherwise, the closed-loop system may be unstable. We choose
$$u_1 = \begin{cases} v_1, & t < 5\ \mathrm{s} \\ 0.01, & t \ge 5\ \mathrm{s} \end{cases} \qquad u_2 = \begin{cases} v_2, & t < 10\ \mathrm{s} \\ 0.1 v_2, & t \ge 10\ \mathrm{s} \end{cases} \tag{58}$$
In the simulation, the quantization parameter $\delta$ of the hysteretic quantizer (4) is chosen as $\delta = 0.2$. The initial states are chosen as $x_1(0) = 0.2$, $x_2(0) = 0.5$. The parameter $w$ of system (57) is selected as $w = [1, 1]^T$. The design parameters are chosen as $c_1 = c_2 = 1$, $\varepsilon = 0.01$, $\gamma_\theta = 4$, $\sigma_\theta = \sigma_k = 0.001$. Figure 1 shows $x_1$ and $q(x_1)$, and $x_2$ and $q(x_2)$, respectively. Figure 2 shows $\hat{\theta}$ and $\hat{k}$, respectively. Figure 3 shows $v_1$ and $q(u_1)$, and $v_2$ and $q(u_2)$, respectively. The simulation results verify the effectiveness of the control strategy.
Fig. 1 a $x_1$ and $q(x_1)$; b $x_2$ and $q(x_2)$ (horizontal axis: time in seconds)
Fig. 2 a $\hat{\theta}$; b $\hat{k}$ (horizontal axis: time in seconds)
Fig. 3 a $v_1$ and $q(u_1)$; b $v_2$ and $q(u_2)$ (horizontal axis: time in seconds)
5 Conclusion
In this paper, an adaptive backstepping feedback stabilization scheme has been designed for a class of nonlinear systems with input and state quantization in which actuator faults are also considered. A parameter is designed to ensure system stability under state quantization. Future research will focus on reducing the heavy computational burden caused by the backstepping design technique and on saving communication channel resources.
Acknowledgements This work was supported by the National Natural Science Foundation of China under Grant (51975002).
References
1. Elia, N., Mitter, S.K.: Stabilization of linear systems with limited information. IEEE Trans. Autom. Control 46(9), 1384–1400 (2001)
2. Fu, M., Xie, L.: The sector bound approach to quantized feedback control. IEEE Trans. Autom. Control 50(11), 1698–1711 (2005)
3. Liu, T., Jiang, Z., Hill, D.J.: A sector bound approach to feedback control of nonlinear systems with state quantization. Automatica 48(1), 145–152 (2012)
4. Brockett, R.W., Liberzon, D.: Quantized feedback stabilization of linear systems. IEEE Trans. Autom. Control 45(7), 1279–1289 (2000)
5. Liberzon, D., Nesic, D.: Input-to-state stabilization of linear systems with quantized state measurements. IEEE Trans. Autom. Control 52(5), 767–781 (2007)
6. Krstic, M., Kokotovic, P.V., Kanellakopoulos, I.: Nonlinear and Adaptive Control Design. Wiley (1995)
7. Krishnamurthy, P., Khorrami, F.: Dynamic high-gain scaling: state and output feedback with application to systems with ISS appended dynamics driven by all states. IEEE Trans. Autom. Control 49(12), 2219–2239 (2004)
8. Spooner, J.T., Maggiore, M., Ordonez, R., Passino, K.M.: Stable Adaptive Control and Estimation for Nonlinear Systems: Neural and Fuzzy Approximator Techniques. Wiley (2004)
9. Hayakawa, T., Ishii, H., Tsumura, K.: Adaptive quantized control for nonlinear uncertain systems. Syst. Control Lett. 58(9), 625–632 (2009)
10. Zhou, J., Wen, C., Yang, G.: Adaptive backstepping stabilization of nonlinear uncertain systems with quantized input signal. IEEE Trans. Autom. Control 59(2), 460–464 (2013)
11. Xiaowei, Y., Lin, Y.: Adaptive backstepping quantized control for a class of nonlinear systems. IEEE Trans. Autom. Control 62(2), 981–985 (2016)
12. Zhaoxu, Y., Yan, H., Li, S., Dong, Y.: Approximation-based adaptive tracking control for switched stochastic strict-feedback nonlinear time-delay systems with sector-bounded quantization input. IEEE Trans. Syst. Man Cybern. Syst. 48(12), 2145–2157 (2017)
13. Jian, W., Zhengguang, W., Li, J., Wang, G., Zhao, H., Chen, W.: Practical adaptive fuzzy control of nonlinear pure-feedback systems with quantized nonlinearity input. IEEE Trans. Syst. Man Cybern. Syst. 49(3), 638–648 (2018)
14. Coutinho, D.F., Minyue, F., de Souza, C.E.: Input and output quantized feedback linear systems. IEEE Trans. Autom. Control 55(3), 761–766 (2010)
15. Picasso, B., Bicchi, A.: On the stabilization of linear systems under assigned I/O quantization. IEEE Trans. Autom. Control 52(10), 1994–2000 (2007)
16. Ishii, H., Basar, T.: Quantization in H∞ parameter identification. IEEE Trans. Autom. Control 53(9), 2186–2192 (2008)
17. Chang, X., Li, Z., Park, J.H.: Fuzzy generalized H2 filtering for nonlinear discrete-time systems with measurement quantization. IEEE Trans. Syst. Man Cybern. Syst. 48(12), 2419–2430 (2017)
18. Liu, T., Jiang, Z.: Event-triggered control of nonlinear systems with state quantization. IEEE Trans. Autom. Control 64(2), 797–803 (2018)
19. Li, G., Lin, Y., Zhang, X.: Global output feedback stabilization for a class of nonlinear systems with quantized input and output. Int. J. Robust Nonlinear Control 27(2), 187–203 (2017)
20. Xing, L., Wen, C., Wang, L., Liu, Z., Hongye, S.: Adaptive output feedback regulation for a class of nonlinear systems subject to input and output quantization. J. Franklin Inst. 354(15), 6536–6549 (2017)
21. Zhaoxu, Y., Yang, Y., Li, S., Sun, J.: Observer-based adaptive finite-time quantized tracking control of nonstrict-feedback nonlinear systems with asymmetric actuator saturation. IEEE Trans. Syst. Man Cybern. Syst. 50(11), 4545–4556 (2018)
22. Zhou, J., Wen, C., Wang, W., Yang, F.: Adaptive backstepping control of nonlinear uncertain systems with quantized states. IEEE Trans. Autom. Control 64(11), 4756–4763 (2019)
23. Jing, Y., Yang, G.: Fuzzy adaptive quantized fault-tolerant control of strict-feedback nonlinear systems with mismatched external disturbances. IEEE Trans. Syst. Man Cybern. Syst. 50(9), 3424–3434 (2018)
24. Zhao, K., Chen, J.: Adaptive neural quantized control of MIMO nonlinear systems under actuation faults and time-varying output constraints. IEEE Trans. Neural Netw. Learn. Syst. 31(9), 3471–3481 (2019)
25. Yu, X., Wang, T., Qiu, J., Gao, H.: Barrier Lyapunov function-based adaptive fault-tolerant control for a class of strict-feedback stochastic nonlinear systems. IEEE Trans. Cybern. 51(2), 938–946 (2019)
26. Yu, X., Lin, Y.: Adaptive quantized tracking control for a class of nonlinear systems. IEEE Trans. Syst. Man Cybern. Syst. (2022)
Improved GWO-WOA and Fuzzy NN DWA Based Path Planning Algorithm for the UAV in Dynamic Environment Bo Li, Siqi Wang, Wenwei Luo, Hang Xiong, and Chaolu Temuer
Abstract An improved path planning algorithm for unmanned aerial vehicles (UAVs) is developed, based on the combination of a grey wolf optimizer-whale optimization algorithm (GWO-WOA) and a fuzzy neural network dynamic window approach (FNN-DWA). It aims to realize dynamic path planning in a spherical obstacle environment. Firstly, an improved GWO and WOA based global planning algorithm is developed, in which an adaptive position updating equation is designed to improve GWO's convergence speed. Meanwhile, inspired by the spiral bubble-net strategy in WOA and the Levy flight strategy in the cuckoo search algorithm (CS), a random wandering strategy based on the information of a random individual and the best individual is developed to improve GWO's global exploration ability. Then, an improved FNN-DWA based local planning algorithm is proposed, and the path points of GWO-WOA are set as goals for local planning. Based on the proposed fuzzy logic based DWA (FDWA), FNN is utilized to train the parameters of the input membership functions and the connection weights, with the aim of improving the algorithm's adaptability to complex dynamic environments. Finally, simulation results are obtained to verify the effectiveness of the designed scheme for dynamic UAV path planning.
Keywords Grey wolf optimizer · Whale optimizer algorithm · Dynamic window approach · Fuzzy neural network · Unmanned aerial vehicle
This work was supported in part by National Natural Science Foundation of China (62073212), Natural Science Foundation of Shanghai (23ZR1426600), Innovation Fund of Chinese Universities Industry-University-Research (2021ZYB05004). B. Li (B) · S. Wang · W. Luo · C. Temuer Institute of Logistics Science and Engineering, Shanghai Maritime University, Shanghai 201306, China e-mail: [email protected] H. Xiong Logistics Engineering College, Shanghai Maritime University, Shanghai 201306, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_8
B. Li et al.
1 Introduction
Owing to their maneuverability, low cost and high flexibility, UAVs have been widely used in diverse domains, including intelligent agriculture, wildfire monitoring and target tracking. Path planning in three-dimensional (3D) complex environments is a key technology for the UAV to perform missions effectively [1]. The path planning problem can be categorized into local path planning and global path planning according to the level of knowledge about the environment. Global path planning refers to planning in an environment whose information is fully known, and its methods mainly include node-based methods, sampling-based methods and swarm intelligence optimization algorithms (SIOAs) [2]. The performance of node-based methods depends on the grid size used to divide the environment: the smaller the grid, the more accurately the environment is modeled, but the algorithm's complexity grows exponentially. Sampling-based methods are probabilistically complete but do not guarantee path optimality. SIOAs can effectively take the UAV's performance and related constraints into account. They are widely used to solve optimization problems including path planning and achieve superior performance [2]. Local path planning aims to avoid the threats of sudden obstacles according to the environmental information detected in real time by sensors. Its methods include the artificial potential field, the vector field histogram, DWA and so on [3]. DWA is broadly used because it fully considers the physical limitations, environmental constraints and current speed of the agent. In complex mission environments, there exist both known and sudden obstacles. Hence, it is imperative to consider both local obstacle avoidance and global optimality constraints when planning a path for the UAV.
For the dynamic path planning problem of wheeled mobile robots in environments with sudden obstacles, a hybrid path planning algorithm based on an improved A* under safety constraints and a DWA with adaptive detection range is proposed in [4]. In [5], an algorithm based on swarm intelligence and DWA is proposed to realize path planning for multiple NAO humanoid robots in static and dynamic environments. However, the constant weights of the DWA evaluation function used in the above literature cannot always guarantee path planning performance and security. Thus, Chang et al. proposed an improved DWA with Q-learning for mobile robot path planning in unknown terrain, in which Q-learning is used to adjust the weight coefficients of the evaluation function dynamically [6]. In addition, a term for the distance from the robot to the goal and an oscillation term are added to the evaluation function to improve the obstacle avoidance performance of DWA. In [7], a fuzzy logic DWA with simulated detection windows is proposed for space-based agents, in which fuzzy logic is used to tune the weights of the evaluation function. Further considering global optimality constraints, a hybrid approach that integrates an enhanced A* with a fuzzy logic-based 3D DWA was devised in our prior study [8]. In [9], Yang et al. pointed out that the path planning performance of FNN is superior to that of fuzzy logic when applied to adjusting the weight coefficients of DWA. However, that research is limited to two-dimensional environments and lacks global optimization constraints.
Improved GWO-WOA and Fuzzy NN DWA Based Path Planning …
Inspired by the results above, a hybrid path planning algorithm based on an improved GWO-WOA and FNN-DWA is designed. The objective is to address the dynamic path planning challenge for UAVs in an environment with spherical obstacles. (1) An improved GWO-WOA algorithm is developed for global path planning, in which an adaptive position updating equation is designed to increase the weight of the optimal individual and thereby speed up the convergence of GWO. Then, to prevent the algorithm from being trapped in local optima because of the optimal individual, a random wandering strategy based on the information of a random individual and the best individual is developed to enhance the algorithm’s global search ability. (2) FNN-DWA is designed for local path planning, with the path points generated by the proposed GWO-WOA set as local goals. In our previous research [8], fuzzy logic was adopted to tune the weights of the evaluation function, giving a fuzzy logic based DWA (FDWA). In this work, FNN is adopted to automatically extract rules from the training dataset obtained through FDWA, which overcomes the subjectivity of manually set fuzzy rules and thus improves the adaptability of FDWA to complex dynamic environments.
2 Mathematical Models

2.1 External Environment Constraint

Inspired by [7], the obstacles in the environment are modeled as spheres with different radii. To guarantee the UAV’s safety, the distance between each obstacle and each path segment must be greater than the obstacle’s radius. Define the path point at the current time as $P_j = (x_j, y_j, z_j)$, $j = 1, 2, \ldots, J$, where $J$ is the number of path points. The next path point is $P_{j+1} = (x_{j+1}, y_{j+1}, z_{j+1})$. The distance between the obstacle and $P_j P_{j+1}$ is

$$d = \frac{\left| \overrightarrow{P_j P_{j+1}} \times \overrightarrow{O P_j} \right|}{\left| \overrightarrow{P_j P_{j+1}} \right|}$$

where $O = (O_x, O_y, O_z)$ is the center of the sphere. Define $E_j$ to indicate whether the path segment vector $\overrightarrow{P_j P_{j+1}}$ collides with the obstacle. $E_j = 1$ means that a collision exists; otherwise, there is no collision. $E_j$ is defined as

$$E_j = \begin{cases} 1, & 0 < d < (R_i + d_{safe}),\ \cos P_j > 0,\ \cos P_{j+1} > 0 \\ 1, & d < (R_i + d_{safe}) \\ 0, & \text{otherwise} \end{cases}$$

where $\cos P_j = \frac{a^2 + c^2 - b^2}{2ac}$; $\cos P_{j+1} = \frac{a^2 + b^2 - c^2}{2ab}$; $R_i$ $(i = 1, 2, \ldots, I)$ is the radius of the ith obstacle; $I$ is the number of obstacles; $d_{safe}$ is the safe distance between the obstacle’s edge and the UAV. By the law of cosines, $a$, $b$ and $c$ are the side lengths of the triangle $O P_j P_{j+1}$, with $a = |P_j P_{j+1}|$, $b = |O P_{j+1}|$ and $c = |O P_j|$.
B. Li et al.
Fig. 1 Three different relative position situations between an obstacle and the path segment
As shown in Fig. 1, there are three different relative positions between an obstacle and a path segment. If $d > R_i$, there is no collision. When $0 < d < R_i$, there is also no collision if $a$ is not the longest side of the triangle.
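The collision test above can be illustrated with a minimal Python sketch. It is only one reading of the rule, under stated assumptions: the function names are hypothetical, the obstacle is inflated by the safe distance, a segment is flagged when one of its end points lies inside the inflated sphere (an added assumption), and otherwise it is flagged when the perpendicular distance is below the limit and both interior angles at the end points are acute.

```python
import math


def _sub(p, q):
    """Component-wise difference p - q of two 3D points."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def _norm(u):
    """Euclidean length of a 3D vector."""
    return math.sqrt(u[0] ** 2 + u[1] ** 2 + u[2] ** 2)


def line_distance(p_j, p_next, centre):
    """d = |P_j P_{j+1} x O P_j| / |P_j P_{j+1}| (distance to the line)."""
    seg = _sub(p_next, p_j)
    return _norm(_cross(seg, _sub(p_j, centre))) / _norm(seg)


def segment_collides(p_j, p_next, centre, radius, d_safe):
    """Hypothetical E_j check for one spherical obstacle (distinct end points)."""
    limit = radius + d_safe
    a = _norm(_sub(p_next, p_j))      # length of the path segment
    b = _norm(_sub(p_next, centre))   # distance from O to P_{j+1}
    c = _norm(_sub(p_j, centre))      # distance from O to P_j
    # Assumption: an end point inside the inflated sphere is a collision.
    if b < limit or c < limit:
        return True
    if line_distance(p_j, p_next, centre) >= limit:
        return False
    # Law of cosines: both angles acute means the foot of the
    # perpendicular from O falls between the two end points.
    cos_pj = (a * a + c * c - b * b) / (2 * a * c)
    cos_pn = (a * a + b * b - c * c) / (2 * a * b)
    return cos_pj > 0 and cos_pn > 0
```

For example, with a segment from (0, 0, 0) to (10, 0, 0), a sphere centred at (5, 1, 0) with radius 0.8 and safe distance 0.5 is reported as a collision, while the same sphere centred at (15, 1, 0) is not, because the closest point of the line lies beyond the end of the segment.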
2.2 UAV’s Performance Constraints

To obtain a smoother path, the constraints on the UAV’s attitude angles must be considered [10]. They are formulated as

$$\psi^{\min} \le \psi_{j-1,j,j+1} = \psi_j - \psi_{j-1} \le \psi^{\max}, \quad \psi_j = \arctan\left(\frac{y_{j+1} - y_j}{x_{j+1} - x_j}\right)$$

$$\phi^{\min} \le \phi_{j-1,j,j+1} = \phi_j - \phi_{j-1} \le \phi^{\max}, \quad \phi_j = \arctan\left(\frac{z_{j+1} - z_j}{\sqrt{(x_{j+1} - x_j)^2 + (y_{j+1} - y_j)^2}}\right)$$
where $j = 2, 3, \ldots, J - 1$; $\psi^{\max}$ and $\phi^{\max}$ are the maximum turning and climbing angles. When the UAV encounters obstacles, it needs to change its flight direction to avoid them. However, it cannot change direction immediately because of inertia, so it must maintain a brief straight flight segment. The minimum length of this straight segment is imposed as $L_k \ge L_{\min}$ for all $k = 1, 2, \ldots, J - 1$, where $L_k$ is the length of the kth path segment.
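The attitude and segment-length constraints can be checked along a candidate path as in the following sketch. The function names and the tuple-based limits are hypothetical. As a stated assumption, `atan2` is used in place of the plain arctangent so that vertical or backward-pointing segments do not cause a division by zero, and the heading difference is wrapped to (-π, π].

```python
import math


def heading_angle(p, q):
    """psi for the segment p -> q (atan2 variant of arctan(dy/dx))."""
    return math.atan2(q[1] - p[1], q[0] - p[0])


def climb_angle(p, q):
    """phi for the segment p -> q: arctan(dz / horizontal length)."""
    return math.atan2(q[2] - p[2], math.hypot(q[0] - p[0], q[1] - p[1]))


def _wrap(angle):
    """Wrap an angle difference to (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def path_is_feasible(points, psi_lim, phi_lim, l_min):
    """Check L_k >= l_min and the turning/climbing limits at interior points.

    psi_lim and phi_lim are (min, max) tuples in radians.
    """
    segments = list(zip(points[:-1], points[1:]))
    if any(math.dist(p, q) < l_min for p, q in segments):
        return False
    for (p0, p1), (_, p2) in zip(segments[:-1], segments[1:]):
        d_psi = _wrap(heading_angle(p1, p2) - heading_angle(p0, p1))
        d_phi = climb_angle(p1, p2) - climb_angle(p0, p1)
        if not psi_lim[0] <= d_psi <= psi_lim[1]:
            return False
        if not phi_lim[0] <= d_phi <= phi_lim[1]:
            return False
    return True
```

With limits of ±60° for turning and ±45° for climbing, a path that turns by 45° at its interior point passes the check, whereas a 90° turn or a segment shorter than the minimum length is rejected.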
3 Improved GWO-WOA Based Global Path Planning

3.1 Standard GWO

Taking inspiration from the hunting behavior of grey wolves, GWO was proposed in [11]. There is a strict hierarchy in grey wolf populations, in which the leadership level α, β and σ leads the hunting level ω to approach the prey. ω updates its position according to those of α, β and σ.

A. Encircling Prey

In the encircling step, grey wolves use the following formulation to update their locations:

$$D = |C \cdot X_p(t) - X(t)|$$

$$X(t+1) = X_p(t) - A \cdot D$$

where $t$ is the iteration count; $X_p(t)$ and $X(t)$ denote the positions of the prey and the grey wolf, respectively; $D$ is the distance vector between the grey wolf and the prey. $C$ and $A$ are two weight vectors, $A = 2 \cdot a \cdot r_1 - a$ and $C = 2 \cdot r_2$. $A$ simulates the attack behavior of grey wolves on the prey; $a$ is a convergence factor vector whose elements decrease linearly from two to zero as the iteration number increases; the elements of $r_1$ and $r_2$ are randomly selected in [0, 1].

B. Hunting

Through the above encircling behavior, the grey wolves of the population confine the prey within an encircling circle. In the iterative process, $X_p(t)$ is unknown, while the positions of α, β and σ are known. Thus, they are used to guide ω to the prey. The equations are

$$D_\alpha = |C_1 \cdot X_\alpha(t) - X(t)|, \quad D_\beta = |C_2 \cdot X_\beta(t) - X(t)|, \quad D_\sigma = |C_3 \cdot X_\sigma(t) - X(t)| \tag{1}$$

$$X_1(t+1) = |X_\alpha(t) - A_1 \cdot D_\alpha|, \quad X_2(t+1) = |X_\beta(t) - A_2 \cdot D_\beta|, \quad X_3(t+1) = |X_\sigma(t) - A_3 \cdot D_\sigma| \tag{2}$$

$$X(t+1) = \frac{X_1(t+1) + X_2(t+1) + X_3(t+1)}{3} \tag{3}$$

where $D_\alpha$, $D_\beta$ and $D_\sigma$ are the distances between α, β, σ and the other grey wolves; $X_\alpha$, $X_\beta$ and $X_\sigma$ are the locations of α, β and σ. $C_g$ $(g = 1, 2, 3)$ and $A_h$ $(h = 1, 2, 3)$ are calculated in the same way as $C$ and $A$. Equations (1) and (2) define the forward step size and direction from ω to α, β and σ, and (3) is the position updating equation of ω.
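A single hunting iteration can be sketched in Python as follows. This is an illustrative sketch rather than the authors’ implementation: the function names are hypothetical, the coefficients A and C are redrawn for every leader and every dimension, and the candidate positions follow the original GWO formulation of [11], in which the distance D uses an element-wise absolute value and the three candidates are plain differences that are then averaged.

```python
import random


def convergence_factor(t, max_iter):
    """Linearly decreasing factor a: 2 at t = 0, 0 at t = max_iter."""
    return 2.0 * (1.0 - t / max_iter)


def gwo_step(wolves, leaders, a, rng=random):
    """Move every wolf towards the three leaders (alpha, beta, sigma).

    wolves  : list of positions (lists of floats)
    leaders : tuple of three positions (x_alpha, x_beta, x_sigma)
    a       : current value of the convergence factor
    """
    new_positions = []
    for x in wolves:
        updated = []
        for dim in range(len(x)):
            candidates = 0.0
            for leader in leaders:
                coeff_a = 2.0 * a * rng.random() - a   # A = 2 a r1 - a
                coeff_c = 2.0 * rng.random()           # C = 2 r2
                dist = abs(coeff_c * leader[dim] - x[dim])
                candidates += leader[dim] - coeff_a * dist
            updated.append(candidates / 3.0)           # average of X1, X2, X3
        new_positions.append(updated)
    return new_positions
```

When the convergence factor reaches zero, A vanishes and every wolf is placed at the mean of the three leaders, which is a quick sanity check for the update.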
3.2 Improved GWO-WOA

A. An Adaptive Position Updating Equation

In standard GWO, the locations of individuals are updated based on the locations of α, β, and σ. However, giving the three wolves equal weight does not adequately reflect the optimality of α and leads to slow convergence of the algorithm. Thus, adaptive position updating equations are developed to increase α’s weight in the iterative process and thereby speed up GWO’s convergence:

$$w_1 = \frac{|X_1(t+1)|}{|X_1(t+1)| + |X_2(t+1)| + |X_3(t+1)|}, \quad w_2 = \frac{|X_2(t+1)|}{|X_1(t+1)| + |X_2(t+1)| + |X_3(t+1)|}, \quad w_3 = \frac{|X_3(t+1)|}{|X_1(t+1)| + |X_2(t+1)| + |X_3(t+1)|}$$

$$X(t+1) = \left[ w_1 \cdot X_1(t+1) + w_2 \cdot X_2(t+1) + w_3 \cdot X_3(t+1) \right] + c_1 \cdot (X_1(t+1) - X_\alpha(t)) + c_2 \cdot (X_2(t+1) - X_\beta(t)) + c_3 \cdot (X_3(t+1) - X_\sigma(t))$$

where $w_1$, $w_2$ and $w_3$ are the weight coefficients; $c_1$, $c_2$ and $c_3$ are random numbers in [0, 1].

B. A Random Wandering Strategy with Adaptive Spiral Bubble Nets

After the above step, the role of α has been enhanced. When α falls into a local optimum, the global exploration capability of GWO is diminished. Therefore, to mitigate this issue and prevent GWO from getting trapped in local optima, a random wandering method is devised based on the information of random individuals and α. In the meantime, inspired by the spiral bubble nets in WOA and the Levy flight in CS [12, 13], an adaptive spiral bubble net is introduced as

$$\begin{cases} X_o(t+1) = X_\alpha(t) + Levy \cdot (X_\alpha(t) - X_o(t)) \cdot e^{bl} \cdot \cos(2\pi l), & r_3 < \frac{t}{T} \\ X_p(t+1) = X_p(t) + Levy \cdot (X_q(t) - X_p(t)) \cdot e^{bl} \cdot \cos(2\pi l), & r_3 > \frac{t}{T} \end{cases} \tag{4}$$

where $o$, $p$ and $q$ are the random indexes of different individuals in the population. The two branches of (4) represent the random wandering strategies based on the information of α and of random individuals, respectively. After each iteration, the above location updating equation is applied to a randomly chosen quarter of the population. The random parameter $Levy$ is generated by the Monte Carlo method:

$$Levy = \frac{\mu}{|\nu|^{1/\lambda}}$$

where $\mu$ and $\nu$ follow normal distributions, $\mu \sim N(0, \sigma_\mu^2)$, $\nu \sim N(0, \sigma_\nu^2)$, with

$$\sigma_\mu = \left[ \frac{\Gamma(1+\lambda) \cdot \sin\left(\frac{\pi \cdot \lambda}{2}\right)}{\Gamma\left(\frac{1+\lambda}{2}\right) \cdot \lambda \cdot 2^{\frac{\lambda - 1}{2}}} \right]^{\frac{1}{\lambda}}, \quad \sigma_\nu = 1;$$

$\lambda = 1.5$; $l$ is a number randomly selected in [0, 1]. In the standard WOA, $b$ is a constant. To enable dynamic updates of individuals’ positions based on the external circumstances during iteration, a dynamic value $b$ is introduced, which depends on the iteration number. This adjustment further enhances the
search capability of the algorithm. Its equation is $b = 3l \cdot \sin\left(\pi \cdot \left(\frac{t}{T} + \frac{1}{2}\right)\right) - 1$. In the initial stage, the spiral shape is large, allowing individuals to explore a wider range and enhance the algorithm’s global exploration capability. In the later stage, the spiral shape becomes smaller, enabling a finer search near the optimal individual to improve convergence speed and accuracy of the algorithm.
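The Levy step and the random wandering update can be sketched as below. The sketch rests on several assumptions that the text does not fix: the function names are hypothetical, the Levy factor is drawn once per update as a scalar (Mantegna-style sampling with λ = 1.5), the threshold variable r3 is taken to be uniform on [0, 1], and the spiral parameter b is passed in by the caller rather than computed inside the function.

```python
import math
import random


def levy_sigma(lam=1.5):
    """Standard deviation sigma_mu of the numerator in the Levy sample."""
    num = math.gamma(1.0 + lam) * math.sin(math.pi * lam / 2.0)
    den = math.gamma((1.0 + lam) / 2.0) * lam * 2.0 ** ((lam - 1.0) / 2.0)
    return (num / den) ** (1.0 / lam)


def levy_step(lam=1.5, rng=random):
    """Levy = mu / |nu|^(1/lambda), mu ~ N(0, sigma_mu^2), nu ~ N(0, 1)."""
    mu = rng.gauss(0.0, levy_sigma(lam))
    nu = rng.gauss(0.0, 1.0)
    return mu / abs(nu) ** (1.0 / lam)


def random_wander(x_o, x_alpha, x_q, t, max_iter, b, rng=random):
    """One wandering update of individual x_o.

    With probability t / max_iter the move is built around the best
    individual x_alpha, otherwise around a random individual x_q.
    """
    l = rng.random()
    step = levy_step(rng=rng) * math.exp(b * l) * math.cos(2.0 * math.pi * l)
    if rng.random() < t / max_iter:          # assumed r3 ~ U[0, 1]
        return [xa + step * (xa - xo) for xa, xo in zip(x_alpha, x_o)]
    return [xo + step * (xq - xo) for xo, xq in zip(x_o, x_q)]
```

Early in the run the random-individual branch dominates, which favours exploration, while late in the run the update concentrates around the best individual.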
3.3 Global Path Planning Problem Modeling

The global path planning problem is formulated as a D-dimensional global optimization problem and solved using the proposed GWO-WOA algorithm. First, a population of D-dimensional individuals is randomly generated within the search space for initialization. Then, the candidate solution that minimizes the value of the cost function is continuously searched for during the iterative process; this solution is the optimal path. The cost function for the global path planning problem is

$$f_1 = \sum_{k=1}^{J-1} (\eta_1 \cdot f_{Fk}) + \sum_{e=1}^{J} \left( \eta_2 \cdot \sum_{i=1}^{I} f_{Oei} + \eta_3 \cdot f_{He} \right)$$

where $\eta_1$, $\eta_2$ and $\eta_3$ are weight coefficients; $f_{Fk}$ is the fuel consumption cost of the kth path segment of the UAV; $f_{Oei}$ and $f_{He}$ are the obstacle influence cost and the height cost of the eth path point, respectively. $f_{Fk} = \frac{L_k}{L_{\max}}$, where $L_k$ is the length of the kth path segment and $L_{\max}$ is the upper bound of the path segment’s length. $f_{Oei} = \frac{1}{d_{ei}^4}$, where $d_{ei}$ is the distance between the eth path point and the ith obstacle. $f_{He}$ is defined as

$$f_{He} = \begin{cases} 0, & h_{\min} \le h_e \le h_{\max} \\ h_{\min} - h_e, & h_e \le h_{\min} \\ h_e - h_{\max}, & h_e \ge h_{\max} \end{cases}$$

where $h_{\min}$ and $h_{\max}$ are the minimum and maximum flight heights of the UAV; $h_e$ is the height of the eth path point in the global path. It should be noted that the constraints in Sect. 2 must be satisfied.
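The cost function can be evaluated for a candidate path as in the following sketch. The names are hypothetical, the obstacle distance is assumed to be measured to the sphere centre, and the constraint checks of Sect. 2 are left to the caller.

```python
import math


def height_cost(h, h_min, h_max):
    """Piecewise height penalty: zero inside [h_min, h_max]."""
    if h < h_min:
        return h_min - h
    if h > h_max:
        return h - h_max
    return 0.0


def path_cost(points, obstacle_centres, l_max, h_min, h_max,
              eta=(1.0, 1.0, 1.2)):
    """f1 = sum_k eta1 * L_k / L_max + sum_e (eta2 * sum_i 1/d^4 + eta3 * f_H)."""
    eta1, eta2, eta3 = eta
    fuel = sum(eta1 * math.dist(p, q) / l_max
               for p, q in zip(points[:-1], points[1:]))
    point_terms = 0.0
    for p in points:
        # Assumption: d_ei is the distance to the obstacle centre.
        obstacle = sum(1.0 / math.dist(p, c) ** 4 for c in obstacle_centres)
        point_terms += eta2 * obstacle + eta3 * height_cost(p[2], h_min, h_max)
    return fuel + point_terms
```

Because the obstacle term decays with the fourth power of the distance, only path points close to an obstacle contribute noticeably, while the fuel term grows linearly with the total path length.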
4 3D DWA and Fuzzy Neural Network Based Local Path Planning

In our previous study [8], a fuzzy logic based 3D DWA (FDWA) was proposed. This approach incorporates fuzzy logic into the evaluation function to tune the weights dynamically, which improves the adaptability of DWA to dynamic environments. The advantage of fuzzy control lies in logical reasoning and human experience, but the establishment of its fuzzy rules is highly subjective. In recent years, neural networks (NNs) have become increasingly prevalent due to their remarkable self-learning capabilities. Therefore, based on [8] and [9], an FNN-DWA is proposed in this section to bring the output values closer to the desired values.

A. Fuzzy Logic Based 3D DWA (FDWA)

The path points obtained by the proposed improved GWO-WOA are sent to FDWA as local goals for local path planning. The algorithm framework of FDWA mainly contains the movement equations of the UAV, the velocity sampling constraints, and a fuzzy logic based evaluation function. Their details can be found in our previous work [8]. α, β, γ, ζ and δ are the coefficients of FDWA’s evaluation function, which are generated by fuzzy logic control in real time. The membership functions are Gaussian; the Mamdani fuzzy reasoning method is used; the design principles of the fuzzy logic rules, the input and output parameters, and their domains are the same as in [8]. Accordingly, the training dataset for the FNN-DWA proposed below is obtained with the 3D FDWA.

B. Fuzzy Neural Network Based 3D DWA

In this section, FNN is adopted to adjust the weight coefficients of DWA’s evaluation function. The design and training of the FNN are completed with the adaptive-network-based fuzzy inference system (ANFIS) toolbox. The learning mechanism of the NN is used to automatically extract rules from the training data obtained by the above FDWA to form the FNN controller. The Takagi-Sugeno fuzzy reasoning method is adopted. The purpose of applying FNN is to overcome the poor adaptability and the unstable obstacle avoidance effect of fuzzy control, and thus to improve the algorithm’s adaptability to the environment. A standard FNN is shown in Fig. 2. Assume that the inputs of the model are $x_1$ and $x_2$, and the output is $y$.

Two if-then rules are formulated as follows:

If $x_1 = A_1$, $x_2 = B_1$, then $y_1 = p_1 x_1 + q_1 x_2 + r_1$.
If $x_1 = A_2$, $x_2 = B_2$, then $y_2 = p_2 x_1 + q_2 x_2 + r_2$.
Fig. 2 A standard FNN
where $y_m$ is the output of the mth rule; $A_m$ and $B_m$ are non-linear parameters; $p_m$, $q_m$ and $r_m$ are conclusion parameters. The network consists of the front-piece network (Layers 1-3) and the back-piece network (Layers 4-5). The main function of the front-piece network is to transfer the input and complete the fuzzy logic inference. The back-piece network processes the fuzzy output and finally outputs the accurate value.

(1) Layer 1: The input variables are fuzzified and converted into membership degrees of different fuzzy sets. The calculation is defined as

$$Q^1_m = \mu_{A_m}(x_1), \quad m = 1, 2$$

$$Q^1_n = \mu_{B_{n-2}}(x_2), \quad n = 3, 4$$

where $m$ and $n$ index the nodes; $x_1$ and $x_2$ are the input values of the nodes; $\mu_{A_m}(x_1)$ and $\mu_{B_{n-2}}(x_2)$ are the membership functions of $x_1$ and $x_2$; $Q^1_m$ and $Q^1_n$ are the corresponding membership values of the nodes in Layer 1.

(2) Layer 2: This layer is known as the fuzzy rules layer, and its number of nodes is typically equal to the number of fuzzy rules. The purpose of this layer is to compute the degree of applicability of each rule to the input information, which is obtained by multiplying the membership degrees of the different fuzzy sets:

$$Q^2_m = a_m = \mu_{A_m}(x_1) \times \mu_{B_m}(x_2)$$

(3) Layer 3: In this layer, the applicability obtained from Layer 2 is normalized to obtain the trigger ratio of the mth rule among all rules, i.e., the extent to which the mth rule is used in the whole inference process:

$$Q^3_m = \bar{a}_m = \frac{a_m}{a_1 + a_2}$$

(4) Layer 4: This layer calculates the output of each rule. The number of nodes in this layer is the same as the number of rules, and each rule corresponds to a node. The output value is calculated from the input values with different weights:

$$Q^4_m = p_m x_1 + q_m x_2 + r_m$$

(5) Layer 5: This layer performs defuzzification to obtain the exact output, which is a weighted average of the results of the rules:

$$Q^5_m = y = \bar{a}_1 y_1 + \bar{a}_2 y_2 = \frac{a_1}{a_1 + a_2} y_1 + \frac{a_2}{a_1 + a_2} y_2$$
In the front-piece network, the center $c$ and width $\sigma$ of the input Gaussian function $e^{-\frac{(x_m - c_{m,n_m})^2}{(\sigma_{m,n_m})^2}}$ are the parameters to be optimized; $n_m$ is the number of fuzzy subsets of the input $x_m$. In the back-piece network, the connection weights $p_m$, $q_m$ and $r_m$ of each node are the parameters to be optimized. Since the FNN controller can only be trained to obtain a single output, the multiple-input multiple-output (MIMO) fuzzy rules are decomposed into multiple multiple-input single-output (MISO) fuzzy rules, which are trained four times to obtain α/β, γ, ζ and δ.
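The five layers can be traced with a small forward-pass sketch for the two-input, two-rule network. The function names and the parameter layout are hypothetical, the Gaussian membership function uses the form given above (squared width in the denominator, no factor of two), and training of the parameters is not shown.

```python
import math


def gaussian_mf(x, centre, width):
    """Gaussian membership value exp(-(x - c)^2 / sigma^2)."""
    return math.exp(-((x - centre) ** 2) / width ** 2)


def fnn_forward(x1, x2, mf_a, mf_b, consequents):
    """Forward pass of a two-rule Takagi-Sugeno FNN.

    mf_a, mf_b  : [(centre, width), ...] for the fuzzy sets A_m and B_m
    consequents : [(p_m, q_m, r_m), ...] for the rule outputs
    """
    # Layer 1: fuzzification of both inputs.
    mu_a = [gaussian_mf(x1, c, s) for c, s in mf_a]
    mu_b = [gaussian_mf(x2, c, s) for c, s in mf_b]
    # Layer 2: firing strength a_m of each rule.
    strengths = [ma * mb for ma, mb in zip(mu_a, mu_b)]
    # Layer 3: normalised firing strengths.
    total = sum(strengths)
    normalised = [a / total for a in strengths]
    # Layer 4: linear rule outputs y_m = p_m x1 + q_m x2 + r_m.
    rule_outputs = [p * x1 + q * x2 + r for p, q, r in consequents]
    # Layer 5: weighted average of the rule outputs.
    return sum(w * y for w, y in zip(normalised, rule_outputs))
```

When both rules fire equally, for instance with inputs midway between the two centres, the output reduces to the plain average of the two rule outputs.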
5 Simulation Results

5.1 Global Planning Results

Simulations are conducted on an Intel Core i7-8700 3.20 GHz processor with 64-bit MATLAB R2020 to verify the superiority of GWO-WOA and its effectiveness for global path planning.

A. Benchmark Function Tests

To validate the superiority of GWO-WOA, it is employed to solve the standard test functions described in [14]. The dimensions of $f_1$–$f_{13}$ are set to 30. GWO [11], the improved flower pollination algorithm (IFPA) [15], selective opposition based grey wolf optimization (SOGWO) [16] and WOAGWO [17] are compared with the proposed GWO-WOA. Each benchmark function is run 25 times; the number of iterations and the population size of all algorithms are 500 and 30, respectively. The mean value (Mean) and the standard deviation (Std) are employed to assess the ability of each algorithm to find the optimum and its stability. The mean value reflects the algorithm’s capability to locate the optimal value, while a smaller standard deviation indicates better stability. The simulation results of GWO-WOA, GWO, IFPA, SOGWO and WOAGWO on benchmark functions $f_2$, $f_8$, $f_{20}$ and $f_{21}$ are given in Table 1. Table 1 shows that the proposed GWO-WOA algorithm outperforms the other four algorithms in terms of convergence accuracy and stability.

B. Global Path Planning Results

A 20 km × 20 km × 20 km environment with static obstacles is established. The start and goal points are (0, 3, 4.5) km and (18, 20, 15) km; the number of intermediate path points between the start and goal points is 26; the number of iterations is 5000; the population size is 60. $L_{\min} = 2$ km; $L_{\max} = 1.5 \times D_{sg}$, where $D_{sg}$ is the distance between the start point and the goal point. $\psi^{\min} = -60^\circ$; $\psi^{\max} = 60^\circ$; $\phi^{\min} = -45^\circ$; $\phi^{\max} = 45^\circ$; $h_{\min} = 0.02$ km; $h_{\max} = 20$ km. $\eta_1 = 1$, $\eta_2 = 1$, $\eta_3 = 1.2$; $d_{safe} = 0.5$ km. The radius of the obstacle is 0.8 km.
GWO, SOGWO and the proposed improved GWO-WOA are utilized to address the UAV’s global path planning problem. The simulation results are depicted in Fig. 3.
Table 1 Simulation results of GWO-WOA, GWO, IFPA, SOGWO and WOAGWO on benchmark functions

| Function | GWO-WOA Mean | GWO-WOA Std. | GWO Mean | GWO Std. | IFPA Mean | IFPA Std. | SOGWO Mean | SOGWO Std. | WOAGWO Mean | WOAGWO Std. |
|---|---|---|---|---|---|---|---|---|---|---|
| f2 | 0.0000E+00 | 0.0000E+00 | 1.8356E-92 | 1.9975E-92 | 7.5659E-44 | 1.7947E-43 | 2.2854E-90 | 1.7649E-91 | 4.8530E-77 | 1.6758E-78 |
| f8 | −4.9730E+03 | 5.6850E-12 | −9.1474E+03 | 2.6849E+03 | −3.0829E+04 | 5.6870E+01 | −9.7483E+03 | 5.4436E+03 | −8.4567E+03 | 7.9785E+03 |
| f20 | −3.3220E+00 | 1.2436E-10 | −3.2754E+00 | 1.2876E-01 | −3.1967E+00 | 5.2244E-02 | −3.3220E+00 | 4.4350E-02 | −3.2968E+00 | 1.3089E-01 |
| f21 | −1.01532E+01 | 1.4594E-01 | −9.8674E+00 | 1.0245E+00 | −1.01391E+01 | 1.1435E-01 | −9.9345E+00 | 2.9890E+00 | −1.08905E+01 | 2.8659E-03 |
Fig. 3 The global path planning results
The final convergence values of GWO, SOGWO and GWO-WOA are 1.92, 1.82 and 1.72, respectively. As illustrated in Fig. 3, GWO-WOA consistently exhibits the smallest fitness value, indicating that its path is the best of the three. GWO-WOA also needs fewer iterations than GWO and SOGWO to reach the same fitness value. This indicates that GWO-WOA is more efficient and has better global exploration ability.
5.2 Local Planning Results

In this subsection, FNN-DWA is applied to local path planning, with the FNN built using the ANFIS toolbox. First, multiple dynamic spherical obstacles with random movement are added to the environment, and FDWA is used for the UAV’s local planning to obtain the input and output parameters that form the training dataset. Second, the input values and membership functions of FDWA are set as the initial parameters. Then, these parameters are loaded into the ANFIS toolbox for training. The FIS is generated by grid partition. The output membership function type is constant. The optimization method for training the FIS is set to hybrid, the error tolerance is 0.01, and the number of epochs is 500. Since the FNN controller can only be trained to obtain a single output, the outputs α/β, γ, ζ and δ are obtained through four training sessions. Taking the training process of α/β as an example, the input membership functions before and after training are compared in Fig. 4. As observed in Fig. 4, the center $c$ and width $\sigma$ of the input Gaussian membership functions have changed significantly. Then, the trained FNN is tested in a new dynamic environment to verify its effectiveness for local path planning. The simulation results
Fig. 4 Comparison of the input membership functions before and after the training process of α/β. (Z is zero; ZS is a positive number close to zero; S is a small positive number; M is a medium positive number; B is a big positive number; H is a huge positive number)
of DWA, FDWA and the proposed FNN-DWA for the UAV’s local path planning are shown in Fig. 5 and Table 2, and the curves of the weight coefficients of FDWA and FNN-DWA are shown in Fig. 6. Based on the results presented in Fig. 5 and Table 2, the linear and angular velocity variances of the path generated by FNN-DWA are smaller than those of DWA and FDWA; therefore, its path is smoother than those of DWA and FDWA. The planning time of FNN-DWA is also shorter than that of FDWA. This is because, as shown in Fig. 6, the values of α/β and γ are larger than those of FDWA, so the UAV moves more rapidly towards its local goal. In summary, the performance of FNN-DWA for the UAV’s local planning is superior to that of DWA and FDWA.
Fig. 5 Local path planning results of DWA, FDWA and FNN-DWA

Table 2 Simulation results comparison of DWA, FDWA and FNN-DWA for local path planning

| Algorithm | Trajectory length | Iteration time | $\sigma_v^2$ | $\sigma_{\omega_\psi}^2$ | $\sigma_{\omega_\phi}^2$ |
|---|---|---|---|---|---|
| DWA | 28.4921 m | 58.8 s | 0.2709 (km·s⁻¹)² | 0.0057 (°·s⁻¹)² | 0.0156 (°·s⁻¹)² |
| FDWA | 27.9735 m | 49.6 s | 0.1594 (km·s⁻¹)² | 0.0061 (°·s⁻¹)² | 0.0034 (°·s⁻¹)² |
| FNN-DWA | 27.6264 m | 37.4 s | 0.1519 (km·s⁻¹)² | 0.0030 (°·s⁻¹)² | 0.0034 (°·s⁻¹)² |
6 Conclusion

In this work, a dynamic path planning algorithm based on an improved GWO-WOA and FNN-DWA is designed for a UAV in a spherical obstacle environment. First, an improved GWO-WOA based global path planning algorithm is proposed. In GWO-WOA, an adaptive position updating equation is proposed to increase the best individual’s weight and thus enhance GWO’s convergence speed. Simultaneously, drawing inspiration from the spiral bubble nets in WOA and the Levy flight in CS, a random
Fig. 6 The weight coefficients’ changing curves of FDWA and FNN-DWA
and best individuals information-based random wandering strategy is devised. This strategy is intended to enhance the global search capability of GWO. Then, an FNN-DWA based local planning algorithm is proposed, in which the path points generated by the proposed GWO-WOA are set as local goals. Building upon FDWA, an FNN is incorporated into it to enhance the algorithm’s adaptability and performance in complex dynamic spherical obstacle environments. Finally, simulations are performed to validate the effectiveness of the proposed algorithm in addressing the dynamic path planning problem of the UAV in complex spherical obstacle environments. The simulation results demonstrate that GWO-WOA’s performance on the benchmark functions is superior to that of GWO, IFPA, SOGWO and WOAGWO. The global path generated by the proposed GWO-WOA is shorter and smoother than those of the compared algorithms, and its fitness value is always the smallest. Meanwhile, the generated global path points are sent to the UAV to guide it in the local planning process. The simulation results show that the local path generated by the proposed FNN-DWA is shorter and smoother than those of DWA and FDWA, and that it has better adaptability and performance in a complex dynamic environment.
References

1. Jones, M., Djahel, S., Welsh, K.: Path-planning for unmanned aerial vehicles with environment complexity considerations: A survey. ACM Comput. Surv. 55(11), 1–39 (2023). https://doi.org/10.1145/3570723
2. Zhao, Y., Zheng, Z., Liu, Y.: Survey on computational-intelligence-based UAV path planning. Knowl Based Syst. 158, 54–64 (2018). https://doi.org/10.1016/j.knosys.2018.05.033
3. Wang, Y.X., Tian, Y.Y., Li, X., Li, L.H.: Self-adaptive dynamic window approach in dense obstacles. Control Decision. 34(5), 927–936 (2019). https://doi.org/10.13195/j.kzyjc.2017.1497
4. Zhong, X., Tian, J., Hu, H., Peng, X.: Hybrid path planning based on safe A* algorithm and adaptive window approach for mobile robot in large-scale dynamic environment. J. Intell. Robot. Syst. 99, 65–77 (2020). https://doi.org/10.1007/s10846-019-01112-z
5. Kashyap, A.K., Parhi, D.R., Muni, M.K., Pandey, K.K.: A hybrid technique for path planning of humanoid robot NAO in static and dynamic terrains. Appl. Soft Comput. 96, 106581 (2020). https://doi.org/10.1016/j.asoc.2020.106581
6. Chang, L., Shan, L., Jiang, C., Dai, Y.: Reinforcement based mobile robot path planning with improved dynamic window approach in unknown environment. Auton Robot. 45, 51–76 (2021). https://doi.org/10.1007/s10514-020-09947-4
7. Xu, C., Xu, Z., Xia, M.: Obstacle avoidance in a three-dimensional dynamic environment based on fuzzy dynamic windows. Appl. Sci. 11(2), 504 (2021). https://doi.org/10.3390/app11020504
8. Wang, S., Li, B., Temuer, C.: Improved A* and fuzzy dynamic window based dynamic trajectory planning for an UAV. In: 2022 ICGNC, Aug. 5–7, 2022, Harbin, China. 1964–1974 (2023). https://doi.org/10.1007/978-981-19-6613-2_192
9. Yang, D., Su, C., Wu, H., Xu, X., Zhao, X.: Construction of novel self-adaptive dynamic window approach combined with fuzzy neural network in complex dynamic environments. IEEE Access. 10, 4375–4383 (2022). https://doi.org/10.1109/ACCESS.2022.3210251
10. Shao, S., Peng, Y., He, C., Du, Y.: Efficient path planning for UAV formation via comprehensively improved particle swarm optimization. ISA Trans. 97, 415–430 (2020). https://doi.org/10.1016/j.isatra.2019.08.018
11. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69, 46–61 (2014). https://doi.org/10.1016/j.advengsoft.2013.12.007
12. Mirjalili, S., Lewis, A.: The whale optimization algorithm. Adv. Eng. Softw. 95, 51–67 (2016). https://doi.org/10.1016/j.advengsoft.2016.01.008
13. Song, P.C., Pan, J.S., Chu, S.C.: A parallel compact cuckoo search algorithm for three-dimensional path planning. Appl. Soft Comput. 94, 106443 (2020). https://doi.org/10.1016/j.asoc.2020.106443
14. Rashedi, E., Nezamabadi-Pour, H., Saryazdi, S.: GSA: a gravitational search algorithm. Inf. Sci. 179(13), 2232–2248 (2009). https://doi.org/10.1016/j.ins.2009.03.004
15. Zhuang, J., Luo, H., Pan, T.S., Pan, J.S.: Improved flower pollination algorithm for the capacitated vehicle routing problem. J. Netw. Intell. 5(3), 141–156 (2020)
16. Dhargupta, S., Ghosh, M., Mirjalili, S., Sarkar, R.: Selective opposition based grey wolf optimization. Expert Syst. Appl. 151, 113389 (2020). https://doi.org/10.1016/j.eswa.2020.113389
17. Mohammed, H., Rashid, T.: A novel hybrid GWO with WOA for global numerical optimization and solving pressure vessel design. Neural. Comput. Appl. 32(18), 14701–14718 (2020). https://doi.org/10.1007/s00521-020-04823-9
Circumnavigation Control of Fixed-Wing UAVs Using Distance Measurements Jie Wang and Baoli Ma
Abstract This paper concerns the target circumnavigation of fixed-wing unmanned aerial vehicles (UAVs) in three-dimensional space. A sliding-mode control law for the flight path angle and a dynamic feedback control law for the roll angle are derived. The proposed control scheme enables the UAV, under any initial condition, to reach a desired height in finite time and maintain a prescribed horizontal standoff distance from the target. A modified controller is then developed to overcome the discontinuity of the control inputs. The stability of the closed-loop system is rigorously justified. The control algorithms have low computational complexity and are easy to implement, as they use only the distance from the UAV to the target and from the UAV to the ground. Numerical simulations verify the effectiveness of the control method.

Keywords Circumnavigation · Distance measurement · Fixed-wing UAV
1 Introduction

Unmanned aerial vehicles (UAVs) have many practical applications in military and civilian fields. Fixed-wing UAVs stand out due to their fast flight speed, relatively economical performance, and long endurance. In area surveillance and search and rescue, a primary demand is to let fixed-wing UAVs track a surface (land or sea) target. Since the speed of the fixed-wing UAV is higher than that of the target, the fixed-wing UAV usually moves around over the target and maintains a desired horizontal standoff distance. This motion pattern is called circumnavigation or standoff tracking [1, 2], which has become a hotspot in control and robotics.

J. Wang · B. Ma (B) School of Automation Science and Electrical Engineering, Beihang University, Beijing 100190, China e-mail: [email protected] J. Wang e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_9
J. Wang and B. Ma
Circumnavigation is essentially a path-following control problem [3, 4], whose objective is to guide autonomous vehicles to follow a circular path with the target as the center and the standoff distance as the radius. Early works assume that the positions and velocities of the robot and the target are known; see, e.g., [5, 6] and the literature therein. Many path-following control methods can be directly applied to this circumnavigation control scenario. For instance, in [3], a control law is proposed to steer a dynamic wheeled robot to the desired path by tracking a virtual target. A line-of-sight guidance law is presented in [7] to compensate for constant and slowly varying drift forces due to waves, wind, and ocean currents. Reference [8] used the vector field approach to drive miniature air vehicles to follow straight-line and circular paths in the presence of constant wind disturbances. The circumnavigation problem of fixed-wing UAVs has been considered in [1, 9–12]. In [1], the authors constructed a Lyapunov guidance vector field to derive control laws for the airspeed and turn rate such that multiple UAVs orbit around a moving target. Reference [9] addressed the timescale separation issue involving heading and standoff distance convergence. Coordinated standoff tracking of multiple moving target groups by a group of UAVs is investigated in [10]. In [11], a vector field guidance law is presented to solve the obstacle avoidance and circumnavigation problem. The particle filter is applied in [12] to handle the situation where the target is out of view for some time. It should be pointed out that these papers on fixed-wing UAV circumnavigation assume that the positions of the target and the UAV are known. However, it is hard to acquire the location of a noncooperative or adversarial target. In addition, global positioning system (GPS) signals are unavailable in harsh environments such as underwater and urban areas.
Some papers have presented control laws using range-only or bearing-only measurements for first-order integrators and unicycle-type vehicles to orbit around noncooperative targets; see, e.g., [13–16] and the references therein. However, circumnavigation of fixed-wing UAVs with partial position information has rarely been investigated. Therefore, to improve practicality, it is necessary to develop control algorithms to guide fixed-wing UAVs to surround unknown targets in GPS-denied environments. Motivated by these considerations, the paper aims to address the fixed-wing UAV circumnavigation problem using distance measurements. The contribution of the work is a control scheme such that the fixed-wing UAV, under any initial condition, reaches a desired height and keeps a predefined horizontal standoff distance from the unknown target. The global position of the UAV is not needed in the proposed controllers, which require only the UAV’s vertical height and the distance to the target. Thus the algorithm is computationally efficient and suitable for low-cost UAVs. In addition, the developed controller explicitly satisfies the control input constraints, which is necessary for small UAVs. Some simulation examples are provided to test the effectiveness of the control method. The rest of the paper is organized as follows. Section 2 describes the fixed-wing UAV model and formulates the circumnavigation problem. The proposed control scheme and related stability analysis are presented in Sect. 3. Simulation results are provided in Sect. 4. Finally, Sect. 5 gives some concluding remarks.
Circumnavigation Control of Fixed-Wing UAVs Using Distance …
Fig. 1 UAV, target, and the graphical view of notations
2 Problem Statement

The fixed-wing UAV is a very complex system in practical applications. It is necessary to find a physical model that is relatively simple in mathematics but still retains the basic characteristics of UAVs. In most literature [17], the kinematic equations of fixed-wing UAVs (see Fig. 1) are usually written as

$$\dot{x} = V\cos\gamma\cos\psi, \quad \dot{y} = V\cos\gamma\sin\psi, \quad \dot{z} = V\sin\gamma \tag{1}$$
where $(x, y)$ and $z$ are the horizontal position and vertical height of the UAV in an inertial frame, $V$ represents the airspeed, and $\gamma$ and $\psi$ are the flight path angle and the heading angle, respectively. The paper assumes that low-level autopilots keep the UAV flying at a constant airspeed. During a coordinated turn, the heading angle satisfies the equation

$$\dot{\psi} = \frac{V}{g}\tan\phi \tag{2}$$
where $g$ is the acceleration of gravity, and $\phi$ represents the roll angle. This paper assumes that the UAV's roll and pitch dynamics are much faster than the heading and height dynamics. Therefore, the roll angle $\phi$ and flight path angle $\gamma$ are regarded as the control inputs. Suppose there is a stationary target at $(x_t, y_t, z_t)$ in three-dimensional (3D) space. The distance between the UAV and the target is denoted as

$$\rho = \sqrt{(x - x_t)^2 + (y - y_t)^2 + (z - z_t)^2} \tag{3}$$

and let $r = \sqrt{(x - x_t)^2 + (y - y_t)^2}$ be the horizontal distance, i.e., the projection of $\rho$ on the $XY$ plane. In the following, we will explore how to use the ranging information
J. Wang and B. Ma
to find the control commands $(\phi^c, \gamma^c)$ such that the UAV reaches a desired height and makes a circular motion with a desired radius $r_d$ over the target. That is,

$$\lim_{t\to\infty}(z - z_d) = 0, \quad \lim_{t\to\infty}(r - r_d) = 0 \tag{4}$$

where $z_d$ and $r_d$ are predefined positive constants. The control commands must satisfy the actual limitations $|\phi^c| \le \phi_{\max} < \pi/2$ and $|\gamma^c| \le \gamma_{\max} < \pi/2$. In the problem setting, the specified standoff distance should be greater than the minimum turning radius, i.e., $r_d > g/\tan\phi_{\max}$.
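As an illustration of model (1)–(2) and the distances defined around (3), the following minimal sketch advances the kinematics by one forward-Euler step. The function names, the step size `dt`, and the value `G = 9.81` are assumptions made for this example only; the heading equation is used exactly as written in (2).

```python
import math

G = 9.81  # assumed value of g in Eq. (2); not specified in the text


def uav_step(state, phi, gamma, V=1.0, dt=0.01):
    """One forward-Euler step of the kinematic model (1)-(2).

    state = (x, y, z, psi); phi and gamma are the roll and flight path angles.
    """
    x, y, z, psi = state
    x += V * math.cos(gamma) * math.cos(psi) * dt
    y += V * math.cos(gamma) * math.sin(psi) * dt
    z += V * math.sin(gamma) * dt
    psi += (V / G) * math.tan(phi) * dt  # heading rate as written in Eq. (2)
    return (x, y, z, psi)


def distances(state, target):
    """Return (rho, r): the 3D distance of Eq. (3) and its horizontal projection."""
    x, y, z, _ = state
    xt, yt, zt = target
    r = math.hypot(x - xt, y - yt)
    rho = math.sqrt(r * r + (z - zt) ** 2)
    return rho, r
```

Only `rho` and the height `z` are assumed to be measured by the UAV; `r` is computed here purely to monitor objective (4).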
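3 Main Results

3.1 Controller Design

To steer the UAV to the desired height, we design the flight path angle command

$$\gamma^c = -\arcsin\big(\epsilon\,\mathrm{sgn}(z - z_d)\big) \tag{5}$$

where $\epsilon$ is a constant satisfying $0 < \epsilon \le \sin\gamma_{\max}$, and the sign function $\mathrm{sgn}(\cdot)$ is defined as

$$\mathrm{sgn}(x) = \begin{cases} 1, & x > 0 \\ 0, & x = 0 \\ -1, & x < 0. \end{cases} \tag{6}$$

To ensure that the UAV surrounds the target with a given radius, a dynamic feedback control law is introduced as

$$\begin{aligned} \phi^c &= \arctan\left(\frac{g}{r_d} + k_1 g\tanh\left(-k_2\eta + \tfrac{1}{2}\rho^2\right)\right) \\ \dot{\eta} &= -k_2\eta + \tfrac{1}{2}\rho^2 \end{aligned} \tag{7}$$

where $k_1$ and $k_2$ are constants satisfying

$$0 < k_1 \le \frac{\tan\phi_{\max}}{g} - \frac{1}{r_d}, \quad k_2 > 0. \tag{8}$$

Before providing the relevant theoretical analysis, we define the following position error variables

$$\begin{pmatrix} e_x \\ e_y \\ e_z \end{pmatrix} = \begin{pmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x - x_t - r_d\sin\psi \\ y - y_t + r_d\cos\psi \\ z - z_d \end{pmatrix} \tag{9}$$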
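The control laws (5)–(8) use only the height $z$ and the distance $\rho$, which the following minimal sketch makes explicit. The function names and the default `g = 9.81` are assumptions for illustration; the caller is expected to integrate the returned value of $\dot\eta$.

```python
import math


def flight_path_command(z, z_d, eps):
    """Eq. (5): gamma_c = -arcsin(eps * sgn(z - z_d))."""
    sgn = (z > z_d) - (z < z_d)  # sign function of Eq. (6)
    return -math.asin(eps * sgn)


def roll_command(rho, eta, r_d, k1, k2, g=9.81):
    """Eq. (7): return (phi_c, eta_dot) from the measured distance rho."""
    e_eta = -k2 * eta + 0.5 * rho ** 2
    phi_c = math.atan(g / r_d + k1 * g * math.tanh(e_eta))
    return phi_c, e_eta  # eta_dot equals e_eta by the second line of Eq. (7)


def max_k1(phi_max, r_d, g=9.81):
    """Upper bound on k1 from Eq. (8)."""
    return math.tan(phi_max) / g - 1.0 / r_d
```

With `k1` chosen no larger than `max_k1(phi_max, r_d)`, the argument of the arctangent never exceeds $\tan\phi_{\max}$, so the roll command stays within its limit.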
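Taking the derivatives of these error variables with respect to time gives

$$\begin{aligned} \dot{e}_x &= V\cos\gamma - \frac{V}{g} r_d\tan\phi + \frac{V}{g} e_y\tan\phi \\ \dot{e}_y &= -\frac{V}{g} e_x\tan\phi \\ \dot{e}_z &= V\sin\gamma. \end{aligned} \tag{10}$$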
From definitions (3) and (9), it follows that

$$\begin{aligned} \rho^2 &= (x - x_t)^2 + (y - y_t)^2 + (z - z_t)^2 \\ &= (e_x\cos\psi - e_y\sin\psi + r_d\sin\psi)^2 + (e_x\sin\psi + e_y\cos\psi - r_d\cos\psi)^2 + (e_z + z_d - z_t)^2 \\ &= e_x^2 + (e_y - r_d)^2 + (e_z + h_z)^2 \end{aligned} \tag{11}$$

where $h_z = z_d - z_t$, and the time derivative of half the squared distance is given by

$$\rho\dot{\rho} = \frac{d}{dt}\left(\frac{1}{2}\rho^2\right) = V e_x\cos\gamma + V(e_z + h_z)\sin\gamma. \tag{12}$$
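Identity (11) can be checked numerically. The sketch below, with hypothetical function names and randomly drawn states, evaluates the error variables of (9) and compares the right-hand side of (11) with the squared distance from (3).

```python
import math
import random


def error_variables(x, y, z, psi, target, r_d, z_d):
    """Error coordinates (e_x, e_y, e_z) of Eq. (9)."""
    xt, yt, _ = target
    dx = x - xt - r_d * math.sin(psi)
    dy = y - yt + r_d * math.cos(psi)
    e_x = math.cos(psi) * dx + math.sin(psi) * dy
    e_y = -math.sin(psi) * dx + math.cos(psi) * dy
    return e_x, e_y, z - z_d


def rho_squared_from_errors(e_x, e_y, e_z, r_d, h_z):
    """Right-hand side of Eq. (11), with h_z = z_d - z_t."""
    return e_x ** 2 + (e_y - r_d) ** 2 + (e_z + h_z) ** 2


def max_identity_gap(trials=1000, seed=0):
    """Largest gap between Eq. (3) squared and Eq. (11) over random states."""
    rng = random.Random(seed)
    gap = 0.0
    for _ in range(trials):
        x, y, z, xt, yt, zt = (rng.uniform(-10, 10) for _ in range(6))
        psi = rng.uniform(-math.pi, math.pi)
        r_d, z_d = rng.uniform(0.5, 5), rng.uniform(0, 10)
        e = error_variables(x, y, z, psi, (xt, yt, zt), r_d, z_d)
        rho2 = (x - xt) ** 2 + (y - yt) ** 2 + (z - zt) ** 2
        gap = max(gap, abs(rho2 - rho_squared_from_errors(*e, r_d, z_d - zt)))
    return gap
```

The gap should stay at the level of floating-point round-off, which reflects the fact that the rotation in (9) preserves the horizontal distance.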
3.2 Stability Analysis

Theorem 1 Consider a fixed-wing UAV system (1)–(2) driven by control laws (5)–(7). Suppose that the control gains satisfy $0 < \epsilon \le \sin\gamma_{\max}$ and (8). Then the UAV, under any initial condition, reaches the desired height in finite time and eventually makes a circular motion with the prescribed radius. In addition, the designed control commands meet the practical constraints $|\phi^c| \le \phi_{\max} < \pi/2$ and $|\gamma^c| \le \gamma_{\max} < \pi/2$.

Proof In the following, we take the initial time to be zero and define the error variable $e_\eta = -k_2\eta + \rho^2/2$. Then, the dynamic equations for the closed-loop system $(e_x, e_y, e_\eta, e_z)$ are

$$\begin{aligned} \dot{e}_x &= V\cos\arcsin(\epsilon\,\mathrm{sgn}\,e_z) - V - k_1 V r_d\tanh e_\eta + \frac{V}{r_d} e_y + k_1 V e_y\tanh e_\eta \\ \dot{e}_y &= -\frac{V}{r_d} e_x - k_1 V e_x\tanh e_\eta \\ \dot{e}_\eta &= -k_2 e_\eta + V e_x\cos\arcsin(\epsilon\,\mathrm{sgn}\,e_z) - \epsilon V(e_z + h_z)\,\mathrm{sgn}\,e_z \\ \dot{e}_z &= -\epsilon V\,\mathrm{sgn}\,e_z \end{aligned} \tag{13}$$

For the subsystem $e_z$, it is clear that $e_z$ converges to zero after time $t_1 = |z(0) - z_d|/(\epsilon V)$. Now we analyze the boundedness of the subsystem $(e_x, e_y, e_\eta)$ on the time
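interval $[0, t_1]$. Take the positive definite function $V_1(e_x, e_y, e_\eta) = (e_x^2 + e_y^2 + e_\eta^2)/2$, whose time derivative along the subsystem trajectory is

$$\begin{aligned} \dot{V}_1 &= -k_2 e_\eta^2 + V e_x\big(\cos\arcsin(\epsilon\,\mathrm{sgn}\,e_z) - 1\big) - k_1 V r_d e_x\tanh e_\eta + V e_x e_\eta\cos\arcsin(\epsilon\,\mathrm{sgn}\,e_z) - \epsilon V(e_z + h_z)e_\eta\,\mathrm{sgn}\,e_z \\ &\le (2 + k_1 r_d)V|e_x| + V|e_x||e_\eta| + \epsilon V(|e_z(0)| + |h_z|)|e_\eta| \\ &\le \tfrac{1}{2}(3 + k_1 r_d)V e_x^2 + \tfrac{1}{2}\big(1 + \epsilon|e_z(0)| + \epsilon|h_z|\big)V e_\eta^2 + \tfrac{1}{2}\big(2 + k_1 r_d + \epsilon|e_z(0)| + \epsilon|h_z|\big)V \\ &\le a_0 + a_1 V_1 \end{aligned}$$

where $a_0 = (2 + k_1 r_d + \epsilon|e_z(0)| + \epsilon|h_z|)V/2$ and $a_1 = V\max\{3 + k_1 r_d,\ 1 + \epsilon|e_z(0)| + \epsilon|h_z|\}$. By the comparison principle, we have $V_1(t) \le (V_1(0) + a_0/a_1)\exp(a_1 t) - a_0/a_1$, $t \in [0, t_1]$. Hence, the states $(e_x, e_y, e_\eta)$ are bounded on $[0, t_1]$. Since $e_z = 0$ when $t > t_1$, the dynamic equations of $(e_x, e_y, e_\eta)$ can be written as

$$\begin{aligned} \dot{e}_x &= -k_1 V r_d\tanh e_\eta + \frac{V}{r_d} e_y + k_1 V e_y\tanh e_\eta \\ \dot{e}_y &= -\frac{V}{r_d} e_x - k_1 V e_x\tanh e_\eta \\ \dot{e}_\eta &= -k_2 e_\eta + V e_x \end{aligned} \tag{14}$$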
after time $t > t_1$. In order to analyze the stability of the subsystem, a Lyapunov candidate function is selected as

$$V_2 = \frac{1}{2k_1 r_d}\left(e_x^2 + e_y^2\right) + \ln\cosh e_\eta \tag{15}$$

with time derivative along the trajectory of system (14)

$$\dot{V}_2 = -k_2 e_\eta\tanh e_\eta \le 0. \tag{16}$$

It can be confirmed that

$$\dot{V}_2 \equiv 0 \Rightarrow e_\eta \equiv 0,\ \dot{e}_\eta \equiv 0 \Rightarrow e_x \equiv 0,\ \dot{e}_x \equiv 0 \Rightarrow e_y \equiv 0. \tag{17}$$
That is, except for the trivial solution $(e_x, e_y, e_\eta) = (0, 0, 0)$, no other solution keeps $\dot{V}_2$ identically zero. According to the invariance principle [18], the states $(e_x, e_y, e_\eta)$ of system (14) globally converge to zero. Based on the above analysis, we conclude that $e_z$ equals zero at time $t_1$ and $(e_x, e_y, e_\eta)$ converges to zero under any initial condition. $e_z = 0$ means that the UAV has reached the desired height, and $\lim_{t\to\infty} e_x = 0$, $\lim_{t\to\infty} e_y = 0$ imply that

$$\lim_{t\to\infty} r = \lim_{t\to\infty}\sqrt{e_x^2 + (e_y - r_d)^2} = r_d.$$

Hence the UAV makes a circular motion with radius $r_d$. In view of (5) and (7), we obtain

$$|\gamma^c| \le |\arcsin(\epsilon)| \le \gamma_{\max}, \quad |\phi^c| \le \arctan\left(\frac{g}{r_d} + k_1 g\right) \le \phi_{\max}. \tag{18}$$

The proof is completed.
Remark 1 The control laws (5)–(7) require only the UAV's height above the ground and the distance to the target. Therefore, the proposed control algorithms have low computational complexity and are well suited to small UAVs with limited payload. The height and distance information can be measured by laser or ultrasonic ranging sensors; thus, the algorithms also have advantages in GPS-denied environments. If the UAV is required to maintain a predefined distance $\rho_d$ from the target, the desired radius can be set to $r_d = \sqrt{\rho_d^2 - h_z^2}$, in which case the height of the target is assumed to be known.
Remark 2 It should be pointed out that the controller (5) applies the sliding mode control technique, which inevitably causes chattering of the control input. In order to eliminate the discontinuity of the control input, we modify the control law as

$$\begin{aligned} \gamma^c &= -\arcsin\big(\epsilon\tanh(z - z_d)\big) \\ \phi^c &= \arctan\left(\frac{g}{r_d} + k_1 g\tanh\left(-k_2\eta + \tfrac{1}{2}\rho^2\right)\right) \\ \dot{\eta} &= -k_2\eta + \tfrac{1}{2}\rho^2 \end{aligned} \tag{19}$$

where the control parameters $\epsilon$, $k_1$, and $k_2$ are constants satisfying $0 < \epsilon \le \sin\gamma_{\max}$, $0 < k_1 \le \tan\phi_{\max}/g - 1/r_d$, $k_2 > 0$. Then, the state equations of the system $(e_x, e_y, e_\eta, e_z)$ become

$$\begin{aligned} \dot{e}_x &= V\cos\arcsin(\epsilon\tanh e_z) - V - k_1 V r_d\tanh e_\eta + \frac{V}{r_d} e_y + k_1 V e_y\tanh e_\eta \\ \dot{e}_y &= -\frac{V}{r_d} e_x - k_1 V e_x\tanh e_\eta \\ \dot{e}_\eta &= -k_2 e_\eta + V e_x\cos\arcsin(\epsilon\tanh e_z) - \epsilon V(e_z + h_z)\tanh e_z \\ \dot{e}_z &= -\epsilon V\tanh e_z \end{aligned} \tag{20}$$

Theorem 2 The origin of the closed-loop error system (20) is exponentially asymptotically stable.
Proof Letting $\dot{e}_x = \dot{e}_y = \dot{e}_\eta = \dot{e}_z = 0$, we obtain

$$\begin{aligned} V\cos\arcsin(\epsilon\tanh e_z) - V - k_1 V r_d\tanh e_\eta + \frac{V}{r_d} e_y + k_1 V e_y\tanh e_\eta &= 0 \\ \frac{V}{r_d} e_x + k_1 V e_x\tanh e_\eta &= 0 \\ k_2 e_\eta - V e_x\cos\arcsin(\epsilon\tanh e_z) + \epsilon V(e_z + h_z)\tanh e_z &= 0 \\ \epsilon V\tanh e_z &= 0 \end{aligned} \tag{21}$$

Through calculation, it can be found that $(e_x, e_y, e_\eta, e_z) = (0, 0, 0, 0)$ is the only solution of equation (21). Thus, the origin is the only equilibrium point of the closed-loop system. Linearizing the nonlinear system (20) at the origin gives

$$\begin{bmatrix} \dot{e}_x \\ \dot{e}_y \\ \dot{e}_\eta \\ \dot{e}_z \end{bmatrix} = \begin{bmatrix} 0 & \frac{V}{r_d} & -k_1 V r_d & 0 \\ -\frac{V}{r_d} & 0 & 0 & 0 \\ V & 0 & -k_2 & -\epsilon V h_z \\ 0 & 0 & 0 & -\epsilon V \end{bmatrix} \begin{bmatrix} e_x \\ e_y \\ e_\eta \\ e_z \end{bmatrix} \triangleq A \begin{bmatrix} e_x \\ e_y \\ e_\eta \\ e_z \end{bmatrix} \tag{22}$$

The characteristic equation of the linear system (22) is calculated as

$$\begin{aligned} |\lambda I - A| &= (\lambda + \epsilon V)\left[(\lambda + k_2)\left(\lambda^2 + \frac{V^2}{r_d^2}\right) + k_1 V^2 r_d\lambda\right] \\ &= (\lambda + \epsilon V)\left[\lambda^3 + k_2\lambda^2 + \left(k_1 V^2 r_d + \frac{V^2}{r_d^2}\right)\lambda + \frac{k_2 V^2}{r_d^2}\right] \end{aligned} \tag{23}$$

By Routh's criterion, the real parts of the characteristic roots of the system (22) are all less than zero, so the origin of the linear system is exponentially asymptotically stable. Therefore, the origin of the nonlinear system (20) is exponentially asymptotically stable. The proof is complete.

Remark 3 Exponential convergence enables the closed-loop system to be robust to external disturbances and deviations from ideal assumptions in practical applications. Since the linearization method is used, we can only conclude that the origin of the system (20) is locally exponentially asymptotically stable. Nevertheless, numerical simulations show that the region of attraction is very large, and the UAV can achieve the circumnavigation objective even if it is initially far away from the target.
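The Routh test applied to the cubic factor in (23) reduces to three inequalities, as the following minimal sketch shows. The function names are hypothetical; the remaining factor $\lambda + \epsilon V$ contributes the root $-\epsilon V < 0$ and needs no test.

```python
def cubic_coefficients(V, r_d, k1, k2):
    """Coefficients (a2, a1, a0) of the cubic factor in Eq. (23)."""
    a2 = k2
    a1 = k1 * V ** 2 * r_d + V ** 2 / r_d ** 2
    a0 = k2 * V ** 2 / r_d ** 2
    return a2, a1, a0


def is_hurwitz_cubic(a2, a1, a0):
    """Routh-Hurwitz conditions for s^3 + a2 s^2 + a1 s + a0."""
    return a2 > 0 and a0 > 0 and a2 * a1 > a0
```

Since $a_2 a_1 - a_0 = k_1 k_2 V^2 r_d$, the last condition holds for every $k_1 > 0$, $k_2 > 0$, which is consistent with the gain conditions in (8).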
4 Numerical Simulation

This section provides two simulation examples to test the performance of the proposed control scheme. The first case verifies that the control law (19) drives the UAV to surround a stationary target, and the second case makes the UAV track a slowly moving target.
Fig. 2 Simulation results of surrounding a stationary target: a UAV 3D trajectory; b trajectory projected on the X Y plane; c horizontal distance and height; d commanded angles
In the first example, the target is fixed at (0, 0, 0), and the UAV is initially at (−5, −5, 0) with $\psi(0) = 0$ and a constant airspeed $V = 1$. The UAV is required to reach a desired height $z_d = 3$ and make a circular motion with a desired radius $r_d = 2$ over the target. The angle commands should satisfy $|\phi^c| < \pi/2$ and $|\gamma^c| \le \pi/3$. The control parameters are selected as $\epsilon = 0.5$, $k_1 = 0.6$, $k_2 = 5$, and $\eta(0) = 0$. To be realistic, we add white Gaussian noise with an expectation of 0 and a variance of 0.05 to the height and distance measurements. As shown in Fig. 2, the UAV satisfactorily fulfills the scheduled mission.

In the second example, the target is initially at (0, 0, 0) and then moves with $x_t = 0.01t$, $y_t = 0.02\sin(0.1t)$, $z_t = 0$. The UAV is initially at (−5, −5, 0) with a constant airspeed $V = 1$, and the initial value of the heading angle is $\psi(0) = 0$. The UAV is required to reach a desired height $z_d = 3$ and make a circular motion with a desired radius $r_d = 2$ over the target. The angle commands should satisfy $|\phi^c| < \pi/2$ and $|\gamma^c| \le \pi/3$. The control parameters are selected as $\epsilon = 0.5$, $k_1 = 0.3$, $k_2 = 25$, and $\eta(0) = 0$. White Gaussian noise with an expectation of 0 and a variance of 0.05 is also added to the height and distance measurements. From Fig. 3, one notes
Fig. 3 Simulation results of surrounding a slowly moving target: a 3D trajectory; b trajectory projected on the X Y plane; c horizontal distance and height; d commanded angles
that the UAV climbs to the desired height and circles over the target. The horizontal distance between the UAV and the target is ultimately bounded, implying that the proposed method applies to the circumnavigation of a slowly moving target.
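The first example can be reproduced in outline with a few lines of code. The sketch below is a noise-free forward-Euler integration of model (1)–(2) under control law (19) with the parameters of the first example; the integration scheme, the step size, and the 200 s horizon are assumptions, and measurement noise is omitted. Because $\tan\phi^c$ enters (2) multiplied by $V/g$, the constant $g$ cancels and the heading rate can be written directly.

```python
import math


def simulate(T=200.0, dt=0.01, V=1.0, r_d=2.0, z_d=3.0,
             eps=0.5, k1=0.6, k2=5.0, target=(0.0, 0.0, 0.0)):
    """Noise-free Euler simulation of model (1)-(2) under control law (19).

    Returns the final horizontal distance r and the final height z.
    """
    x, y, z, psi, eta = -5.0, -5.0, 0.0, 0.0, 0.0
    xt, yt, zt = target
    for _ in range(int(T / dt)):
        rho2 = (x - xt) ** 2 + (y - yt) ** 2 + (z - zt) ** 2
        e_eta = -k2 * eta + 0.5 * rho2
        sin_gamma = -eps * math.tanh(z - z_d)      # from gamma_c in (19)
        cos_gamma = math.sqrt(1.0 - sin_gamma ** 2)
        # (V/g) * tan(phi_c) with phi_c from (19); g cancels out
        psi_dot = V / r_d + k1 * V * math.tanh(e_eta)
        x += V * cos_gamma * math.cos(psi) * dt
        y += V * cos_gamma * math.sin(psi) * dt
        z += V * sin_gamma * dt
        psi += psi_dot * dt
        eta += e_eta * dt                          # eta_dot = e_eta by (19)
    return math.hypot(x - xt, y - yt), z
```

Calling `simulate()` returns the horizontal distance and height at the end of the run, which can be compared with $r_d = 2$ and $z_d = 3$.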
5 Conclusions

This paper has presented a control scheme for fixed-wing UAVs to reach a predefined height and maintain a specified standoff distance from an unknown target. The control algorithm is advantageous in GPS-denied environments. Numerical simulations are performed to verify the effectiveness of the control method. It should be pointed out that the proposed control scheme is suitable for a single fixed-wing UAV surrounding a stationary target. Future work includes moving-target circumnavigation by multiple fixed-wing UAVs.
Acknowledgements This work was supported in part by the National Natural Science Foundation of China (62133001, 62227810) and the National Basic Research Program of China (973 Program: 2012CB821200, 2012CB821201).
References
1. Frew, E.W., Lawrence, D.A., Morris, S.: Coordinated standoff tracking of moving targets using Lyapunov guidance vector fields. J. Guidance Control Dyn. 31(2), 290–306 (2008)
2. Wang, J., Ma, B., Yan, K.: Mobile robot circumnavigating an unknown target using only range rate measurement. IEEE Trans. Circ. Syst. II: Express Briefs 69(2), 509–513 (2022)
3. Soetanto, D., Lapierre, L., Pascoal, A.: Adaptive, non-singular path-following control of dynamic wheeled robots. In: 42nd IEEE International Conference on Decision and Control, vol. 2, pp. 1765–1770 (2003)
4. Wang, J., Ma, B.: Global path following control for the planar vertical takeoff and landing aircraft. Int. J. Control Autom. Syst. 19(12), 4046–4055 (2021)
5. Yamaguchi, H.: A cooperative hunting behavior by mobile-robot troops. Int. J. Robot. Res. 18(9), 931–940 (1999)
6. Kim, T.H., Sugie, T.: Cooperative control for target-capturing task based on a cyclic pursuit strategy. Automatica 43(8), 1426–1431 (2007)
7. Fossen, T.I., Pettersen, K.Y., Galeazzi, R.: Line-of-sight path following for Dubins paths with adaptive sideslip compensation of drift forces. IEEE Trans. Control Syst. Technol. 23(2), 820–827 (2015)
8. Nelson, D.R., Barber, D.B., McLain, T.W., Beard, R.W.: Vector field path following for miniature air vehicles. IEEE Trans. Robot. 23(3), 519–529 (2007)
9. Summers, T.H., Akella, M.R., Mears, M.J.: Coordinated standoff tracking of moving targets: Control laws and information architectures. J. Guidance Control Dyn. 32(1), 56–69 (2009)
10. Oh, H., Kim, S., Shin, H.S., Tsourdos, A.: Coordinated standoff tracking of moving target groups using multiple UAVs. IEEE Trans. Aerosp. Electron. Syst. 51(2), 1501–1514 (2015)
11. Wilhelm, J., Clem, G., Casbeer, D., Gerlach, A.: Circumnavigation and obstacle avoidance guidance for UAVs using gradient vector fields. In: AIAA Scitech 2019 Forum, p. 1791 (2019)
12. Oh, H., Kim, S.: Persistent standoff tracking guidance using constrained particle filter for multiple UAVs. Aerosp. Sci. Technol. 84, 257–264 (2019)
13. Matveev, A.S., Teimoori, H., Savkin, A.V.: Range-only measurements based target following for wheeled mobile robots. Automatica 47(1), 177–184 (2011)
14. Shames, I., Dasgupta, S., Fidan, B., Anderson, B.D.: Circumnavigation using distance measurements under slow drift. IEEE Trans. Autom. Control 57(4), 889–903 (2011)
15. Deghat, M., Shames, I., Anderson, B.D., Yu, C.: Localization and circumnavigation of a slowly moving target using bearing measurements. IEEE Trans. Autom. Control 59(8), 2182–2188 (2014)
16. Zheng, R., Liu, Y., Sun, D.: Enclosing a target by nonholonomic mobile robots with bearing-only measurements. Automatica 53, 400–407 (2015)
17. Beard, R.W., Ferrin, J., Humpherys, J.: Fixed wing UAV path following in wind with input constraints. IEEE Trans. Control Syst. Technol. 22(6), 2103–2117 (2014)
18. Khalil, H.K.: Nonlinear Systems, 3rd edn. Prentice Hall, Upper Saddle River, NJ, USA (2002)
Cooperative Path Planning for Multi-vehicle Systems by an Integrated Intelligent Algorithm of IABC and DWA
Ying Tan, Jian Zhang, Zhonghua Miao, and Jin Zhou
Abstract This brief studies cooperative path planning (CPP) for multi-vehicle systems consisting of car-like models, based on the integration of the improved artificial bee colony (IABC) algorithm and the dynamic window approach (DWA). In the proposed integrated intelligent algorithm, IABC minimizes an objective function built from three criteria, namely the shortest path to the goal, collision avoidance between the vehicles and the obstacles, and the desired path smoothness, and DWA then optimizes a multi-objective evaluation function, so that an optimal collision-free smooth path is generated. The effectiveness and advantages of the developed CPP algorithm are illustrated by simulation experiments, together with an efficiency comparison between the standard artificial bee colony (ABC) algorithm and the IABC algorithm.

Keywords Cooperative path planning · Multi-vehicle systems · Improved artificial bee colony algorithm · Dynamic window approach
Y. Tan · J. Zhang · J. Zhou (B)
Shanghai Institute of Applied Mathematics and Mechanics, and School of Mechanics and Engineering Science, Shanghai University, Shanghai 200072, China
e-mail: [email protected]
Z. Miao
School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200072, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_10

1 Introduction

With the rapid development of artificial intelligence and robotic techniques, the cooperation of multi-vehicle systems of car-like robots, an increasingly widely adopted class of mobile autonomous platforms, has attracted considerable attention in complex task scenarios such as disaster rescue [1], environmental monitoring [2], terrestrial surveillance [3], fire-fighting [4], and agricultural production [5], owing to its merits in flexibility, functionality, and robustness. In particular, cooperative path planning (CPP) for multi-vehicle systems is the key prerequisite for accurate, reliable, and safe collaborative operation in unstructured open environments.

CPP of multi-vehicle systems is a key aspect of achieving safe and coordinated navigation of a system consisting of different intelligent vehicles. It refers to finding an optimal, safe, and effective path for each vehicle in the same workspace while ensuring that there is no collision between vehicles and obstacles at any moment, so that minimum energy and resources are used. Different optimization algorithms have therefore been developed to solve the CPP problem of multi-vehicle systems, and they fall mainly into two categories. The first comprises traditional algorithms, e.g., the probabilistic roadmap (PRM) [6], the rapidly-exploring random tree (RRT) algorithm [7], and the artificial potential field (APF) [8]. The second comprises swarm intelligence algorithms based on evolutionary computation developed over the past decade, such as particle swarm optimization (PSO) [9], the genetic algorithm (GA) [10], and ant colony optimization (ACO) [11]. In recent years, some algorithms have been implemented with both centralized and distributed architectures, and their CPP performance has been evaluated against the multi-objective requirements of specific applications. More recently, swarm intelligence algorithms have been proposed to deal with the CPP of multi-vehicle systems using bilevel coordination architectures with local and global planning layers [12]; however, it remains a challenging issue that the objectives of overall efficiency, safety, and smoothness often conflict in multi-objective path optimization.
Against this background, the main objective of this brief is to propose an integrated intelligent algorithm for CPP of multi-vehicle systems consisting of car-like models by integrating the improved artificial bee colony (IABC) algorithm and the dynamic window approach (DWA). In the proposed path planning algorithm, IABC minimizes an objective function built from three criteria, namely the shortest path to the desired goals, collision avoidance between the vehicles and the obstacles, and the smoothness of the path, and DWA optimizes a multi-objective evaluation function, so that an optimal collision-free smooth path to the desired goal is generated. Finally, the effectiveness and advantages of the developed CPP algorithm are verified by simulation, and a comparison between ABC and IABC in terms of average fitness, best fitness, path length, and obstacle avoidance success rate is provided. The rest of this brief is structured as follows. The global CPP based on IABC is introduced in Sect. 2. The local CPP based on DWA is introduced in Sect. 3. The pseudo-code of the algorithm is presented in Sect. 4. The results of the simulation experiment are provided in Sect. 5. Finally, conclusions are drawn in Sect. 6.
2 Global CPP on IABC

2.1 Problem Formulation

Consider the working scenario $W$ of multi-vehicle systems as a 2D map that contains a set of static circular obstacles $O_j$ ($j = 1, 2, ..., m$) with radius $O_{rj}$, and a set of circular vehicles $Vehicle_i$ ($i = 1, 2, ..., n$) with radius $r$. The initial position $(x_i, y_i)$ and the target position $(x_{ig}, y_{ig})$ of each vehicle $Vehicle_i$ are known, as well as the position $O_{pj} = (x_j, y_j)$ of obstacle $O_j$. The path $X_i$ ($i = 1, 2, ..., n$) of vehicle $Vehicle_i$ is represented by the set of path points $(P_{i1}, P_{i2}, ..., P_{il})$, where the path point of vehicle $Vehicle_i$ is $P_{ik} = (x_{ik}, y_{ik})$ ($k = 1, 2, ..., l$). The objective function $F$ consists of the path length function $F_l$, the path safety function $F_{sa}$, and the path smoothness function $F_{sm}$. The three optimization objectives are combined into a single optimization problem by a weighted sum, and the minimum of the objective function, $\min F$, is then found by IABC. The path length function $F_l$ is defined as Eq. (1).

$$F_l = \sum_{i=1}^{n}\sum_{k=1}^{l-1}\sqrt{(x_{i,k+1} - x_{i,k})^2 + (y_{i,k+1} - y_{i,k})^2}. \tag{1}$$
The role of the path safety function is to ensure that a reasonable safe distance is maintained between the vehicles and the obstacles. The definition of the path safety function $F_{sa}$ is shown in Eq. (2).

$$F_{sa} = \sum_{i=1}^{n}\sum_{k=1}^{l-1}\sum_{j=1}^{m}\left(\frac{r + O_{rj}}{\sqrt{(x_j - x_{i,k})^2 + (y_j - y_{i,k})^2}}\right)^2. \tag{2}$$
In order to avoid excessive curvature of the global path, the path smoothness function $F_{sm}$ is designed, which is defined as Eq. (3).

$$F_{sm} = \sum_{i=1}^{n}\sum_{k=1}^{l-1}\left(\sqrt{(x_{i,k+1} - x_{i,k})^2 + (y_{i,k+1} - y_{i,k})^2} - f_{ad}\right), \tag{3}$$

$$f_{ad} = \frac{\sqrt{(x_{i,l} - x_{i,1})^2 + (y_{i,l} - y_{i,1})^2}}{l - 1}, \tag{4}$$

where $f_{ad}$ is the average distance of the path. Therefore, the objective function $F$ is defined as follows:

$$F = \lambda_1\cdot F_l + \lambda_2\cdot F_{sa} + \lambda_3\cdot F_{sm}, \tag{5}$$

where $\lambda_1$, $\lambda_2$ and $\lambda_3$ are weight coefficients.
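The three criteria and their weighted sum can be sketched as follows. The data layout (each path as a list of `(x, y)` tuples, each obstacle as `(x_j, y_j, O_rj)`) and the function names are assumptions; the smoothness term is taken without an absolute value because the extracted formula shows none, which is itself an assumption.

```python
import math


def path_length(paths):
    """F_l of Eq. (1): total length of all vehicle paths."""
    return sum(math.dist(p[k], p[k + 1])
               for p in paths for k in range(len(p) - 1))


def path_safety(paths, obstacles, r):
    """F_sa of Eq. (2); obstacles is a list of (x_j, y_j, O_rj) tuples."""
    total = 0.0
    for p in paths:
        for point in p[:-1]:                     # k = 1, ..., l-1 as in Eq. (2)
            for ox, oy, o_r in obstacles:
                total += ((r + o_r) / math.dist(point, (ox, oy))) ** 2
    return total


def path_smoothness(paths):
    """F_sm of Eqs. (3)-(4): summed deviation of segment lengths from f_ad."""
    total = 0.0
    for p in paths:
        f_ad = math.dist(p[0], p[-1]) / (len(p) - 1)   # Eq. (4)
        total += sum(math.dist(p[k], p[k + 1]) - f_ad
                     for k in range(len(p) - 1))
    return total


def objective(paths, obstacles, r, lam=(0.9, 0.05, 0.4)):
    """F of Eq. (5); the default weights are those reported in Sect. 5."""
    return (lam[0] * path_length(paths)
            + lam[1] * path_safety(paths, obstacles, r)
            + lam[2] * path_smoothness(paths))
```

A path point that coincides with an obstacle center would cause a division by zero in `path_safety`; a practical implementation would need to guard against that case.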
2.2 Improved Artificial Bee Colony (IABC) Algorithm

The details of IABC are as follows:

(1) The initialization phase. A set of random solutions $(x_1, x_2, ..., x_i, ..., x_{SN})$ is generated in the $D$-dimensional space, where the number of solutions $SN$ is equal to the number of employed bees. The initialization method is:

$$x_i^d = x_{\min}^d + rand(0, 1)\cdot\left(x_{\max}^d - x_{\min}^d\right), \tag{6}$$

where $i \in (1, 2, ..., SN)$; $d \in (1, 2, ..., D)$; $x_{\max}^d$ is the maximum of $x_i^d$; $x_{\min}^d$ is the minimum of $x_i^d$; $rand(0, 1)$ is a random number between 0 and 1.

(2) The employed bees phase. Employed bees search for new food sources $v_i^d$ based on information about the available food sources $x_i^d$. The new food sources $v_i^d$ are generated as follows:

$$v_i^d = x_i^d + rand(-1, 1)\cdot\left(x_i^d - x_k^d\right), \tag{7}$$

where $rand(-1, 1)$ is a random number between −1 and 1; $k$ must be different from $i$. In this work, if the value of a randomly generated parameter exceeds the predefined boundary, it is set to a random integer between the boundaries: if $v_i^d > x_{\max}^d$, then $v_i^d = rand(x_{\min}^d, x_{\max}^d)$; if $v_i^d < x_{\min}^d$, then $v_i^d = rand(x_{\min}^d, x_{\max}^d)$.

(3) After generating the food sources, the fitness of each $v_i^d$ is evaluated and compared with the fitness value of $x_i^d$ according to the greedy selection mechanism. The fitness function is defined as follows:

$$fit_i = \begin{cases} \dfrac{1}{1 + F_i}, & F_i \ge 0, \\ 1 + |F_i|, & F_i < 0, \end{cases} \tag{8}$$

where $F_i$ is the objective function value of solution $i$; $fit_i$ is the corresponding fitness value after transformation.

(4) The onlooker bees phase. The probability values $p_i$ of the solutions $x_i^d$ are calculated from their fitness values, and the onlooker bees choose a solution based on the probability value $p_i$:

$$p_i = \frac{fit_i}{\sum_{k=1}^{SN} fit_k}, \tag{9}$$

where $SN$ denotes the swarm size. After the food sources are generated, the judgment in step (3) is repeated. The nectar update formula is improved to change the range of perturbation and improve the location of the best local nectar:

$$v_i^d = x_i^d + \omega\cdot rand(-1, 1)\cdot\left(x_i^d - x_k^d\right), \tag{10}$$

$$\omega = \frac{2}{\left|2 - \delta - \sqrt{\delta^2 - 4\delta}\right|}, \tag{11}$$

where $\omega$ is the weight coefficient and $4 \le \delta \le \dfrac{x_{\min}^d + x_{\max}^d}{2}$.

(5) After all onlooker bees have completed the search process, if there exists a solution that has not been further updated after exploring the limit number of cycles, $Limit = \dfrac{SN\times D}{10}$, then this food source is replaced by a randomly generated food source $x_i^d$ using Eq. (6).

(6) Record the position of the food source with the best objective function value $F(X_i)$ and fit the global path $X_i$ using a Bézier curve.
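Steps (1)–(5) above can be condensed into a short sketch. Several details are assumptions rather than statements from the text: one randomly chosen coordinate is perturbed per update (as in the standard ABC), out-of-bound values are re-sampled uniformly, the denominator of Eq. (11) is taken in absolute value, and the search box is the same in every dimension. Bézier fitting in step (6) is omitted.

```python
import math
import random


def constriction_weight(delta):
    """omega of Eq. (11); the absolute value in the denominator is assumed."""
    return 2.0 / abs(2.0 - delta - math.sqrt(delta * delta - 4.0 * delta))


def fitness(F):
    """Fitness transform of Eq. (8)."""
    return 1.0 / (1.0 + F) if F >= 0 else 1.0 + abs(F)


def iabc(objective, dim, lo, hi, sn=20, max_iter=100, delta=4.29, seed=0):
    """Minimise `objective` on the box [lo, hi]^dim following steps (1)-(5)."""
    rng = random.Random(seed)
    limit = max(1, sn * dim // 10)
    omega = constriction_weight(delta)

    def random_food():                          # Eq. (6)
        return [lo + rng.random() * (hi - lo) for _ in range(dim)]

    foods = [random_food() for _ in range(sn)]
    vals = [objective(x) for x in foods]
    trials = [0] * sn
    i_best = min(range(sn), key=vals.__getitem__)
    best, best_val = foods[i_best][:], vals[i_best]

    def try_update(i, weight):
        """Perturb one coordinate (Eq. (7) or (10)) and select greedily."""
        k = rng.choice([j for j in range(sn) if j != i])
        d = rng.randrange(dim)
        v = foods[i][:]
        v[d] += weight * rng.uniform(-1.0, 1.0) * (foods[i][d] - foods[k][d])
        if not lo <= v[d] <= hi:
            v[d] = rng.uniform(lo, hi)          # re-sample inside the bounds
        fv = objective(v)
        if fitness(fv) > fitness(vals[i]):
            foods[i], vals[i], trials[i] = v, fv, 0
        else:
            trials[i] += 1

    for _ in range(max_iter):
        for i in range(sn):                     # employed bees
            try_update(i, 1.0)
        fits = [fitness(f) for f in vals]       # weights of Eq. (9)
        for _ in range(sn):                     # onlooker bees
            try_update(rng.choices(range(sn), weights=fits)[0], omega)
        for i in range(sn):                     # scout bees
            if trials[i] > limit:
                foods[i], trials[i] = random_food(), 0
                vals[i] = objective(foods[i])
        i_best = min(range(sn), key=vals.__getitem__)
        if vals[i_best] < best_val:
            best, best_val = foods[i_best][:], vals[i_best]
    return best, best_val
```

In the path planning setting, `objective` would evaluate $F$ of Eq. (5) on the path encoded by a food source; here it can be any function of a list of `dim` numbers.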
3 Local CPP on DWA

3.1 Kinematics Model of Car-Like Robot

This brief considers the car-like robot as an agent, which possesses typical nonholonomic and underactuated dynamic characteristics, and its kinematic model and typical application scenario are shown in Figs. 1 and 2, respectively [13]. The corresponding kinematics model of the car-like robot is given in the following difference equation:

$$\begin{cases} x(t + \Delta t) = x(t) + H_x(t), \\ y(t + \Delta t) = y(t) + H_y(t), \\ \theta(t + \Delta t) = \theta(t) + \omega\cdot\Delta t, \end{cases} \tag{12}$$

$$H_x(t) = \begin{cases} -\dfrac{v}{\omega}\big(\sin(\theta(t)) - \sin(\theta(t) + \omega\cdot\Delta t)\big), & \omega \ne 0, \\ v\cos(\theta(t))\cdot\Delta t, & \omega = 0, \end{cases} \tag{13}$$

$$H_y(t) = \begin{cases} \dfrac{v}{\omega}\big(\cos(\theta(t)) - \cos(\theta(t) + \omega\cdot\Delta t)\big), & \omega \ne 0, \\ v\sin(\theta(t))\cdot\Delta t, & \omega = 0, \end{cases} \tag{14}$$

where $x$, $y$ are the position coordinates of the vehicles in the $x$ and $y$ directions, respectively; $\theta$ is the heading of the vehicle.
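The difference equations (12)–(14) translate directly into a pose update. The following minimal sketch uses a hypothetical function name and treats the case $\omega = 0$ separately, as the model does.

```python
import math


def car_step(x, y, theta, v, omega, dt):
    """Advance the pose (x, y, theta) by dt using Eqs. (12)-(14)."""
    if omega != 0.0:
        hx = -(v / omega) * (math.sin(theta) - math.sin(theta + omega * dt))
        hy = (v / omega) * (math.cos(theta) - math.cos(theta + omega * dt))
    else:
        hx = v * math.cos(theta) * dt
        hy = v * math.sin(theta) * dt
    return x + hx, y + hy, theta + omega * dt
```

For constant $v$ and $\omega$ over the interval, the update is the exact integral of the unicycle motion, so a quarter turn with $v = 1$ and $\omega = \pi/2$ over one second ends at $(2/\pi, 2/\pi)$ with heading $\pi/2$.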
Fig. 1 The car-like robot model
Fig. 2 Autonomous agricultural work

3.2 Dynamic Window Approach (DWA)

Before local planning, the vehicle needs to track the global path at the tracking speed $v_q$ in order to predict the local collision point $C_i = (x_i, y_i)$ and determine whether a collision between vehicles will occur. If a collision is going to occur, a priority-based avoidance strategy is adopted. Taking the example of vehicle $Vehicle_2$ avoiding $Vehicle_1$, this means that the high-priority vehicle $Vehicle_1$ keeps following its original path, while the low-priority vehicle $Vehicle_2$ coordinates its local path near the conflict point through DWA. The steps for local coordination by DWA are as follows:

(1) Perform velocity sampling. Take multiple sets of velocities in the velocity space $(v, \omega)$ according to $V_r$. According to Eq. (12), simulate the trajectory of the vehicle at these velocities within $\Delta t$.

$$V_1 = \{(v, \omega)\,|\,v \in [v_{\min}, v_{\max}],\ \omega \in [\omega_{\min}, \omega_{\max}]\}, \tag{15}$$

$$V_2 = \{(v, \omega)\,|\,v \in [v_c - \dot{v}\Delta t,\ v_c + \dot{v}\Delta t],\ \omega \in [\omega_c - \dot{\omega}\Delta t,\ \omega_c + \dot{\omega}\Delta t]\}, \tag{16}$$

$$V_3 = \left\{(v, \omega)\,\Big|\,v \le \sqrt{2\cdot dist(v, \omega)\cdot\dot{v}},\ \omega \le \sqrt{2\cdot dist(v, \omega)\cdot\dot{\omega}}\right\}, \tag{17}$$

$$V_r = V_1 \cap V_2 \cap V_3, \tag{18}$$

where $v_c$, $\omega_c$ are the current linear and angular velocities, respectively; $\dot{v}$, $\dot{\omega}$ are the maximum linear acceleration and the maximum angular acceleration reached by the vehicle within $\Delta t$, respectively; $dist(v, \omega)$ represents the shortest distance between the trajectory and the obstacles.

(2) After obtaining multiple sets of trajectories, the trajectories are evaluated according to the evaluation function $G(v, \omega)$ in Eq. (19), and the optimal trajectory is selected according to the evaluation score.

$$G(v, \omega) = \alpha\cdot goal(v, \omega) + \beta\cdot vel(v, \omega) + \gamma\cdot dist\_o(v, \omega) + \eta\cdot dist\_r(v, \omega), \tag{19}$$

where $goal(v, \omega)$ is the evaluation function of the distance between the end of the trajectory and the target position; $vel(v, \omega)$ is the evaluation function of the current velocity; $dist\_o(v, \omega)$ is a function of the distance between the evaluated trajectory and the obstacles; $dist\_r(v, \omega)$ is a function of the distance between the evaluated trajectory and the vehicle $Vehicle_1$; and $\alpha$, $\beta$, $\gamma$, $\eta$ are the weighting factors of the evaluation functions.
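The velocity sampling and selection in steps (1)–(2) can be sketched as below. The function names, the grid sampling with `n` points per axis, and the convention that a higher score is better are assumptions; the square-root form of the bound in (17) is also assumed, and `score` stands in for the weighted sum $G(v, \omega)$ of Eq. (19).

```python
import math


def dynamic_window(v_c, w_c, limits, acc, dt):
    """V1 intersected with V2, Eqs. (15)-(16), as (v_lo, v_hi, w_lo, w_hi)."""
    v_min, v_max, w_min, w_max = limits
    a_v, a_w = acc  # maximum linear and angular accelerations
    return (max(v_min, v_c - a_v * dt), min(v_max, v_c + a_v * dt),
            max(w_min, w_c - a_w * dt), min(w_max, w_c + a_w * dt))


def admissible(v, w, dist, acc):
    """Membership in V3 of Eq. (17), assuming the square-root braking bound."""
    a_v, a_w = acc
    return v <= math.sqrt(2 * dist * a_v) and w <= math.sqrt(2 * dist * a_w)


def best_velocity(window, score, dist_fn, acc, n=11):
    """Grid-sample the window; return the admissible (v, w) with highest score."""
    v_lo, v_hi, w_lo, w_hi = window
    best, best_score = None, -math.inf
    for i in range(n):
        v = v_lo + (v_hi - v_lo) * i / (n - 1)
        for j in range(n):
            w = w_lo + (w_hi - w_lo) * j / (n - 1)
            if not admissible(v, w, dist_fn(v, w), acc):
                continue                      # outside V3, skip
            s = score(v, w)
            if s > best_score:
                best, best_score = (v, w), s
    return best
```

`best_velocity` returns `None` when no sampled velocity is admissible, which a caller could treat as a request to stop or re-plan.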
4 An Integrated CPP Algorithm

The main aim of this research is to propose an integrated intelligent algorithm for CPP of multi-vehicle systems consisting of car-like models by integrating the IABC and the DWA. To this end, Algorithm 1 gives the pseudo-code of the integrated algorithm for CPP of multi-vehicle systems, in which the IABC is used to optimize the fitness function to obtain the global CPP, and the DWA is then used to obtain the local CPP of multi-vehicle systems.

Algorithm 1 The integrated algorithm based on IABC and DWA
Input: The population size of bees $SN$; the maximum number of iterations $Maxiter$; the maximum number of limits $Limit$; the initial position $(x_i, y_i)$ and target position $(x_{ig}, y_{ig})$ of each vehicle $Vehicle_i$; the position and radius of each obstacle $O_{pj}$, $O_{rj}$; time $\Delta t$; tracking velocity $v_q$
Output: The path $P_i$ of each vehicle $Vehicle_i$ from $(x_i, y_i)$ to $(x_{ig}, y_{ig})$
INITIALIZATION PHASE:
for i = 1 to SN do
    Randomly select food source $x_i^d$ by Eq. (6)
end for
for Iter = 1 to Maxiter do
    EMPLOYED BEES PHASE:
    for i = 1 to SN do
        Produce new solution $v_i^d$ by Eq. (7)
        if $fit(v_i^d) > fit(x_i^d)$ then
            Set $x_i^d$ to $v_i^d$
            Set $trial_i$ to 0
        else
            Increment $trial_i$
        end if
    end for
    ONLOOKER BEES PHASE:
    for i = 1 to SN do
        Calculate selection probability $p_i$ by Eq. (9)
        Produce new solution $v_i^d$ by Eq. (10)
        if $fit(v_i^d) > fit(x_i^d)$ then
            Set $x_i^d$ to $v_i^d$
            Set $trial_i$ to 0
        else
            Increment $trial_i$
        end if
    end for
    SCOUT BEES PHASE:
    for i = 1 to SN do
        if $trial_i > Limit$ then
            Set $x_i^d$ to a new randomly produced solution by Eq. (6)
        end if
    end for
    SAVE RESULT:
    Fit the global path $X_i$ with a Bézier curve
    Vehicle $Vehicle_i$ tracks the path given by the position of the food source with the best objective function value $F(X_i)$
end for
Predict the conflict points $C_i$
if a collision trend is present then
    Low-priority vehicles plan local paths by DWA
    High-priority vehicles continue to track the global path
end if
return The path $P_i$ of each vehicle $Vehicle_i$ from $(x_i, y_i)$ to $(x_{ig}, y_{ig})$
5 Simulation Experiment

The CPP problem for multi-vehicle systems is studied in a simulated environment. First, a 10 m × 10 m 2D environment map is generated, which contains 3 cars with a radius of 0.1 m and 6 obstacles. The coordinates of the vehicle initial points are set as (9, 9), (9, 1), (1, 6), and the coordinates of the target points are set as (1, 1), (1, 9), (9, 4). The highest priority is assigned to $Vehicle_1$, the second to $Vehicle_2$, and the last to $Vehicle_3$. The positions and radii of the obstacles are set as (2, 5), (4, 8), (5, 5), (6, 2), (8, 6.5), (8.5, 3) and 0.5, 1.0, 0.8, 0.8, 0.6, and 0.2 m, respectively. The initial parameters are set as $SN = 50$, $Maxiter = 500$, and $Limit = 90$. The value of $\delta$ in Eq. (11) is $\delta = 4.29$. The weight coefficients are as follows: $\lambda_1 = 0.9$, $\lambda_2 = 0.05$, $\lambda_3 = 0.4$, $\alpha = 0.3$, $\beta = 0.5$, $\gamma = 0.1$, $\eta = 0.1$. The tracking speed of the vehicle is $v_q = 0.2$ m/s and the DWA simulation time is $\Delta t = 2$ s. For global path planning, the standard ABC and IABC are compared under the same parameter settings. Each algorithm was run in 10 independent experiments, and the simulation results of one run are shown in Fig. 3 and Fig. 4. The average global path fitness curves of the two algorithms are presented in Fig. 5. The comparison of the best global path fitness of the two algorithms is shown in Fig. 6. The comparison efficiency index of average fitness, best fitness, path length, and obstacle avoidance success rate between the standard ABC and
Fig. 3 Global paths based on ABC
Fig. 4 Global paths based on IABC
Fig. 5 Global path average fitness
Fig. 6 Global path best fitness
IABC are also provided in Table 1. The results show that the IABC algorithm achieves lower energy consumption than the standard ABC algorithm and has a higher obstacle avoidance success rate (1 versus 0.7). Therefore, it can be concluded that the global path based on the IABC algorithm is better.
The result of local coordination is illustrated in Fig. 7. The figure shows that Vehicle_1 and Vehicle_2 will collide at the conflict point C. After DWA coordination,
Table 1 Comparison between ABC and IABC
Algorithm   Average fitness   Best fitness   Path length   Obstacle avoidance success rate
ABC         41.674            38.752         32.192        0.7
IABC        37.968            37.628         31.695        1
Fig. 7 Local path planning
Fig. 8 Path planning result
Vehicle_2 is able to avoid Vehicle_1. The final path planning result for the multi-vehicle system is illustrated in Fig. 8. It can be seen that an optimal collision-free smooth path is successfully generated.
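The size of the improvement reported in Table 1 can be made explicit with a short calculation. The sketch below only re-uses the numbers printed in the table; it assumes that lower values are better for all three metrics, and the helper name `relative_reduction` is hypothetical.

```python
# Values copied from Table 1 as (ABC, IABC); lower is assumed to be better.
table1 = {
    "average fitness": (41.674, 37.968),
    "best fitness": (38.752, 37.628),
    "path length": (32.192, 31.695),
}

def relative_reduction(abc, iabc):
    """Percentage by which IABC lowers a metric relative to ABC."""
    return 100.0 * (abc - iabc) / abc

reductions = {name: relative_reduction(*vals) for name, vals in table1.items()}
for name, pct in reductions.items():
    print(f"{name}: {pct:.2f} % lower with IABC")
```

Under that reading, the average fitness drops by roughly 8.9 %, while the best fitness and the path length change by about 2.9 % and 1.5 %, so most of the gain lies in the consistency across the 10 runs.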
6 Conclusion
In this brief, the issue of CPP for multi-vehicle systems of car-like models has been addressed by an integrated intelligent algorithm that combines IABC and DWA. A key feature of the proposed algorithm is the coordination architecture of local and global CPP for multi-vehicle systems: the IABC optimizes the fitness function to obtain the global path, and the DWA is then used to avoid collisions between the vehicles. This yields an optimal, collision-free, smooth path to the desired target with the advantages of low cost, high efficiency, and robustness. The effectiveness and advantages of the developed CPP algorithm have been verified by the simulation experiment.
Funding. This work is supported by the National Science Foundation of China (Nos. 12072180 and 51875331) and the Innovation Program of Shanghai Municipal Education Commission (No. 2023ZKZD47).
Event-Triggered Adaptive Dynamic Surface Control for Wheeled Mobile Robots with Unknown Skidding Wenlong Yue, Yu Wan, and Xuehui Gao
Abstract An adaptive controller is designed for the trajectory tracking control problem of wheeled mobile robots on complex roads with skidding and other unknown disturbances. Firstly, a dynamic model of wheeled mobile robots with skidding is established, the skidding is estimated online by a radial basis function neural network, and the control law is designed by backstepping. Secondly, since the derivative of the virtual velocity in backstepping increases the complexity, dynamic surface control is used to avoid this derivative. Moreover, in order to save resources and reduce the number of controller actions, an event-triggering mechanism with a fixed threshold is proposed. Finally, a Lyapunov function is constructed to prove the stability of the system, and simulation results show that the radial basis function neural network can accurately estimate the unknown skidding, and that the proposed control method has a fast convergence speed and relatively accurate trajectory tracking.
Keywords Wheeled mobile robots · RBF neural network · Event-triggering · Adaptive control
W. Yue
College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China
Y. Wan · X. Gao (B)
School of Intelligent Equipment, Shandong University of Science and Technology, Taian 271001, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_11

1 Introduction
Wheeled mobile robots (WMRs) are widely used in aerospace, service, rescue and other fields, and have broad application prospects [1]. A wheeled mobile robot system is a typical multi-input multi-output nonlinear system [2], subject to nonholonomic constraints [3]. Trajectory tracking control of wheeled mobile robots requires the controller to make the wheeled mobile robots drive along the desired trajectory
according to the given speed from the given initial state, which is an important control mode of wheeled mobile robots. In order to ensure that wheeled mobile robots can complete tasks efficiently and accurately, the trajectory tracking control of wheeled robots is crucial [4]. Control of wheeled mobile robots has been widely studied. In [5], a new control strategy for nonholonomic constrained wheeled mobile robots was designed by combining feedback and feed-forward. In [6], adaptive dynamic surface control was designed based on the dynamic model of the driving motor of wheeled mobile robots. In [7], an improved compound cosine function neural network controller was designed for unknown disturbances in wheeled mobile robots. Wheeled mobile robots will inevitably be subjected to skidding and other disturbances when driving on complex roads [8], and skidding destroys the nonholonomic constraints of wheeled mobile robots. It is therefore necessary to study the tracking control of wheeled mobile robots including skidding. In [9], kinematic and dynamic models of wheeled mobile robots including skidding and external disturbance were established, and kinematic and dynamic control laws were designed. In [10], skidding was regarded as three time-varying unknown parameters, which were estimated online by an unscented Kalman filter, and the controller was designed by backstepping. In [11], a disturbance observer was designed based on the virtual control law, considering input disturbances such as skidding of wheeled mobile robots, and robust trajectory control was designed through a prescribed performance function and the disturbance observer. In [12, 13], for the unknown skidding disturbance of wheeled mobile robots, trajectory tracking control was designed using backstepping and sliding mode control, respectively, based on an extended state observer.
In [14], an adaptive fault-tolerant controller was designed for wheeled mobile robots with unknown centroid and unknown disturbance including skidding. At present, most control schemes for wheeled mobile robots are continuous control laws, but in many cases the actuator does not need to act continuously. It is therefore necessary to study event-triggered control of wheeled mobile robots. The event-triggering problem of nonlinear systems has been extensively studied [15, 16]. In [17], event-triggered control for uncertain nonlinear systems was designed using a fixed threshold, a relative threshold and a switching threshold, respectively. In [18], a periodic event-triggered control was designed for a type (3,0) mobile robot, which can effectively ensure the boundedness and convergence of the trajectory and greatly reduce the number of controller updates compared with continuous control. Considering that skidding breaks the nonholonomic constraints of wheeled mobile robots and that continuous control wastes resources, this paper studies the trajectory tracking control problem of wheeled mobile robots under unknown skidding. An adaptive dynamic surface control method based on event triggering is designed and a skidding compensation scheme is proposed. Firstly, an RBF neural network is used to estimate and compensate the skidding in real time. Secondly, dynamic surface control is used to design the control law, and a command filter is used to replace the first-order low-pass filter. In addition, a fixed-threshold event
triggering mechanism is proposed. Finally, by using Lyapunov stability theory, it is proved that all signals of the system are semi-globally uniformly ultimately bounded (SGUUB) and that the “Zeno” phenomenon does not occur.
2 WMR Model with Unknown Sliding
The research object of this paper is a differential-drive wheeled mobile robot whose center of mass coincides with its geometric center. The lateral and longitudinal sliding of the wheeled mobile robot in the global coordinate system $XOY$ and the local coordinate system $X_r P_r Y_r$ are shown in Fig. 1. Here $r$ is the radius of the wheels, $b$ is the distance by which the center of gravity lies forward of the wheel axles, $m$ is the mass of the WMR, $P_r$ is the center of mass of the WMR, $I$ is the total inertia, $\xi_r, \xi_l$ are the longitudinal skidding of the driving wheels, and $\mu$ is the lateral skidding of the WMR.

Fig. 1 Model of the wheeled mobile robot

The state of the wheeled robot is $q = [x\ \ y\ \ \theta]^T$, and the equation of motion of the wheeled mobile robot including the unknown skidding is

$$\dot q = S(q)(z - \xi) + \varphi \tag{1}$$

where

$$S(q) = \begin{bmatrix} \cos\theta & 0 \\ \sin\theta & 0 \\ 0 & 1 \end{bmatrix}, \quad z = [v\ \ \omega]^T, \quad \xi = [\xi_1\ \ \xi_2]^T = \left[\frac{r(\xi_r + \xi_l)}{2}\ \ \frac{r(\xi_r - \xi_l)}{2b}\right]^T, \quad \varphi = [\mu\sin\theta\ \ \mu\cos\theta\ \ 0]^T,$$

and $v$ and $\omega$ represent the linear and angular velocity of the wheeled mobile robot. It is assumed that $\xi_r$, $\xi_l$, $\mu$ are continuous and bounded, and that their first derivatives are also continuous and bounded. The dynamics of the wheeled mobile robot are described by

$$M(q)\ddot q + V(q, \dot q) + F(\dot q) + G(q) + \tau_d = E(q)\tau - A^T(q)\lambda \tag{2}$$
where $M(q) \in R^{3\times 3}$ is the inertia matrix of the system, $V(q, \dot q) \in R^{3\times 3}$ is the matrix related to speed and position, $G(q) \in R^{3\times 3}$ is the gravity matrix, $\tau_d \in R^{3\times 1}$ is the external disturbance of the system, $E(q) \in R^{3\times 3}$ is the input transformation matrix, $\tau \in R^{2\times 1}$ is the input control torque vector, $A^T(q) \in R^{2\times 3}$ is the constraint matrix, and $\lambda \in R^{3\times 1}$ is the Lagrange multiplier vector. Assuming that the wheeled mobile robot runs on horizontal ground and that the center of gravity coincides with the geometric center, then

$$M(q) = \begin{bmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & I \end{bmatrix}, \quad E(q) = \frac{1}{r}\begin{bmatrix} \cos\theta & \sin\theta & b \\ \cos\theta & \sin\theta & -b \end{bmatrix}^T, \quad V(q, \dot q) = G(q) = 0.$$

The derivative of (1) is given by

$$\ddot q = \dot S(q)(z - \xi) + S(q)(\dot z - \dot\xi) + \dot\varphi \tag{3}$$

Substituting $\ddot q$ into the dynamics equation and multiplying both sides by $S(q)^T$ yields

$$\dot z = \bar E\tau - \bar V z - \bar F \tag{4}$$

where $\bar E = (S^T M S)^{-1} S^T E$, $\bar F = -[(S^T M S)^{-1}(S^T M S\dot\xi + S^T M \dot S\xi - S^T M\dot\varphi - S^T F - S^T \Delta M)]$, $\bar F$ is the lumped disturbance of the system and $\Delta M$ is the modeling error. The calculation shows that

$$\bar V = (S^T M S)^{-1} S^T M \dot S = 0, \quad S^T M S = \begin{bmatrix} m & 0 \\ 0 & I \end{bmatrix}, \quad S^T E = \frac{1}{r}\begin{bmatrix} 1 & 1 \\ b & -b \end{bmatrix}.$$

By simplifying, we finally get

$$\dot z = \bar E u - \bar F \tag{5}$$

where $\bar E = \mathrm{diag}\left(\frac{1}{rm}, \frac{b}{rI}\right)$, $u = [u_1\ \ u_2]^T$, $u_1 = \tau_1 + \tau_2$, $u_2 = \tau_1 - \tau_2$, $\bar F = [f_1\ \ f_2]^T$.
3 Unknown Skidding Compensation
A radial basis function (RBF) neural network is able to approximate any continuous function, so the unknown lumped disturbance of wheeled mobile robots can be compensated by an RBF neural network. Any given continuous nonlinear function $f(x)$ can be approximated by an RBF neural network in the following form

$$f(x) = W^{*T} h(x) + \varepsilon \tag{6}$$

$$\hat F = \hat W^T h(x) \tag{7}$$

$$\tilde F = \tilde W^T h(x) + \varepsilon \tag{8}$$

where $x$ is the input vector of the neural network, $W^*$ is the optimal constant weight matrix, $h(x)$ is the radial basis function vector with elements $h_j = \exp\left(-\frac{\|x - c_j\|^2}{2 b_j^2}\right)$, $j$ is the index of the neurons in the hidden layer, $c_j$ is the center vector of the $j$-th hidden neuron, $b_j$ is the base width of the Gaussian function, $\varepsilon$ is the approximation error, $\hat F$ is the network output, and $\tilde W = W^* - \hat W$.
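The approximation in Eqs. (6) and (7) can be illustrated numerically. The following is a minimal sketch, assuming a scalar input, 13 Gaussian centres spread over (−3, 3) and width 3; the weights are fitted offline by least squares to a toy target purely to show that $\hat W^T h(x)$ can represent it, which is not the online adaptive law used in this paper. The function names are hypothetical.

```python
import numpy as np

def rbf_features(x, centers, widths):
    """Gaussian basis h_j = exp(-||x - c_j||^2 / (2 b_j^2))."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    sq_dist = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * widths ** 2))

def rbf_output(w_hat, x, centers, widths):
    """Network output F_hat = W_hat^T h(x), as in Eq. (7)."""
    return w_hat @ rbf_features(x, centers, widths)

# Illustrative setup: 13 centres over (-3, 3), width 3, scalar input.
centers = np.linspace(-3.0, 3.0, 13).reshape(-1, 1)
widths = np.full(13, 3.0)

# Offline least-squares fit to a toy target (assumption, for illustration only).
xs = np.linspace(-3.0, 3.0, 200)
target = 2.0 + 5.0 * np.sin(xs)
H = np.stack([rbf_features(x, centers, widths) for x in xs])
w_hat, *_ = np.linalg.lstsq(H, target, rcond=None)
max_err = np.max(np.abs(H @ w_hat - target))   # residual plays the role of epsilon
```

Here `max_err` corresponds to the approximation error $\varepsilon$ in Eq. (6) on the sampled interval.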
4 Controller Design
In this section, the kinematic control law and the dynamic control law are designed through dynamic surface control, so that the trajectory tracking error satisfies $\lim_{t\to\infty} e = 0$. Define the reference trajectory $q_r = [x_r\ \ y_r\ \ \theta_r]^T$. Then the equation of motion under the reference trajectory is

$$\begin{cases} \dot x_r = v_r\cos\theta_r \\ \dot y_r = v_r\sin\theta_r \\ \dot\theta_r = \omega_r \end{cases} \tag{9}$$

where $v_r$ and $\omega_r$ are the reference velocity and angular velocity of the wheeled mobile robot. The actual state of the wheeled mobile robot is $q = [x\ \ y\ \ \theta]^T$, and the trajectory error is obtained as

$$e = [e_x\ \ e_y\ \ e_\theta]^T = \begin{bmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x - x_r \\ y - y_r \\ \theta - \theta_r \end{bmatrix} \tag{10}$$

where $e_x$ is the error in the x direction, $e_y$ is the error in the y direction, and $e_\theta$ is the angle error. Further, the differential equation of the error can be written as

$$\begin{bmatrix} \dot e_x \\ \dot e_y \\ \dot e_\theta \end{bmatrix} = \begin{bmatrix} \omega e_y - v + v_r\cos e_\theta \\ v_r\sin e_\theta - \omega e_x \\ \omega_r - \omega \end{bmatrix} \tag{11}$$

The control objective of the kinematic controller is to select an appropriate auxiliary control law so that $e$ converges to 0. According to the Lyapunov function $V_1$, the kinematic control law of the wheeled robot is designed as

$$\begin{bmatrix} v_c \\ \omega_c \end{bmatrix} = \begin{bmatrix} v_r\cos e_\theta + k_1 e_x \\ \omega_r + k_2 v_r e_y + k_3\sin e_\theta \end{bmatrix} \tag{12}$$

where the parameters $k_1$, $k_2$, $k_3$ are positive constants.
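The kinematic law (12) and its effect on the error dynamics (11) can be checked numerically. The sketch below is only illustrative: the gains are hypothetical values, the errors are passed in directly, and the Lyapunov derivative is evaluated with the weight on the angle term taken as $1/k_2$, which is an assumption made here so that the cross terms cancel.

```python
import math

def kinematic_control(e_x, e_y, e_theta, v_r, w_r, k1, k2, k3):
    """Virtual velocities (v_c, w_c) of Eq. (12)."""
    v_c = v_r * math.cos(e_theta) + k1 * e_x
    w_c = w_r + k2 * v_r * e_y + k3 * math.sin(e_theta)
    return v_c, w_c

def v1_dot(e_x, e_y, e_theta, v_r, w_r, k1, k2, k3):
    """Derivative of V1 along Eq. (11) with v = v_c, w = w_c and angle weight 1/k2 (assumed)."""
    v, w = kinematic_control(e_x, e_y, e_theta, v_r, w_r, k1, k2, k3)
    ex_dot = w * e_y - v + v_r * math.cos(e_theta)
    ey_dot = v_r * math.sin(e_theta) - w * e_x
    eth_dot = w_r - w
    return e_x * ex_dot + e_y * ey_dot + eth_dot * math.sin(e_theta) / k2

# Hypothetical gains and error values, for illustration only.
k1, k2, k3 = 2.0, 3.0, 1.5
rate = v1_dot(0.3, -0.2, 0.5, v_r=2.0, w_r=1.0, k1=k1, k2=k2, k3=k3)
# Under the stated assumption this equals -k1*e_x^2 - (k3/k2)*sin(e_theta)^2.
closed_form = -k1 * 0.3 ** 2 - (k3 / k2) * math.sin(0.5) ** 2
```

With zero tracking error the law returns the reference velocities unchanged, and for nonzero errors the computed rate is non-positive.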
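Considering that the virtual velocity is not equal to the actual velocity, it is necessary to design a dynamic controller through dynamic surface control. The errors between the actual velocity and the virtual velocity are

$$\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} v \\ \omega \end{bmatrix} - \begin{bmatrix} v_c \\ \omega_c \end{bmatrix} \tag{13}$$

Define the command filter

$$\dot\varphi_i = \omega_n\varphi_{id}, \quad \dot\varphi_{id} = -2\zeta\omega_n\varphi_{id} - \omega_n(\varphi_i - \alpha_i) \tag{14}$$

where $\alpha_i$ are the inputs of the command filter (in this paper $\alpha_1 = v_c$, $\alpha_2 = \omega_c$), and $\varphi_1, \varphi_{1d}$ and $\varphi_2, \varphi_{2d}$ are the outputs of the filters, respectively. $\zeta$ is the damping ratio of the filter with $0 < \zeta \le 1$, $\omega_n > 0$ is the filter bandwidth, and $\varphi_1(0) = v_c(0)$, $\varphi_2(0) = \omega_c(0)$. After passing through the filter, the velocity errors are defined as

$$\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} v \\ \omega \end{bmatrix} - \begin{bmatrix} \varphi_1 \\ \varphi_2 \end{bmatrix} \tag{15}$$

The filter errors are

$$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} \varphi_1 \\ \varphi_2 \end{bmatrix} - \begin{bmatrix} v_c \\ \omega_c \end{bmatrix} \tag{16}$$

According to the Lyapunov function $V_2$, the dynamic controller is designed as

$$u_1 = rm(-c_1 z_1 + \hat f_1 + \varphi_{1d}), \quad u_2 = \frac{rI}{b}(-c_2 z_2 + \hat f_2 + \varphi_{2d}) \tag{17}$$

where $c_1 > 0$, $c_2 > 0$ are positive design parameters, and $\hat f_1$, $\hat f_2$ are the estimates of the lumped disturbance given by the RBF neural network. Considering that the skidding of wheeled mobile robots is random and intermittent in the actual operating environment, the following fixed-threshold event-triggering mechanism is designed to reduce the number of actuator actions and save resources:

$$\delta(t) = u(t_k), \ \forall t \in [t_k, t_{k+1}), \quad t_{k+1} = \inf\{t \in R \mid |e(t)| \ge \Delta\}, \quad t_1 = 0 \tag{18}$$

where $e(t) = \delta(t) - u(t)$ is the measurement error, $t_k$ is the controller update time, $\Delta$ is a positive parameter, $\delta(t)$ is the actual output signal of the controller, and $u(t_k)$ is the control signal sampled at $t_k$, which is held until the next sampling time. When the measurement error reaches $\Delta$, $\delta(t)$ is updated to $u(t_{k+1})$.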
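The command filter (14) and the fixed-threshold trigger (18) can be sketched in discrete time as follows. This is a minimal sketch under stated assumptions: explicit Euler integration with a hypothetical step size, a constant filter input, and a toy control signal $u(t) = \sin t$; the names `command_filter_step` and `FixedThresholdTrigger` are hypothetical.

```python
import math

def command_filter_step(phi, phi_d, alpha, zeta, omega_n, dt):
    """One explicit-Euler step of Eq. (14); returns the updated (phi, phi_d)."""
    phi_dot = omega_n * phi_d
    phi_d_dot = -2.0 * zeta * omega_n * phi_d - omega_n * (phi - alpha)
    return phi + dt * phi_dot, phi_d + dt * phi_d_dot

class FixedThresholdTrigger:
    """Holds delta(t) = u(t_k) until |delta - u| >= threshold, as in Eq. (18)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.held = None            # currently applied signal delta(t)
        self.trigger_times = []     # update instants t_k

    def update(self, t, u):
        if self.held is None or abs(self.held - u) >= self.threshold:
            self.held = u
            self.trigger_times.append(t)
        return self.held

# Filter step response with zeta = 0.6 and omega_n = 160 (alpha = 1 is illustrative).
phi, phi_d = 0.0, 0.0
for _ in range(2000):               # 0.2 s with dt = 1e-4
    phi, phi_d = command_filter_step(phi, phi_d, alpha=1.0, zeta=0.6,
                                     omega_n=160.0, dt=1e-4)

# Event-triggered hold of the toy signal u(t) = sin(t) with threshold 0.1.
trigger = FixedThresholdTrigger(threshold=0.1)
worst_gap = 0.0
n_samples = 10001                   # 10 s sampled every 1 ms
for n in range(n_samples):
    t = n * 1e-3
    u = math.sin(t)
    delta = trigger.update(t, u)
    worst_gap = max(worst_gap, abs(delta - u))
```

In this sketch the applied signal never deviates from the continuous one by more than the threshold, while the number of updates in `trigger.trigger_times` is far smaller than the number of samples, which is the resource saving the mechanism aims at.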
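5 Stability and “Zeno” Phenomenon Analysis
Choose the Lyapunov function $V_1$ as

$$V_1 = \frac{1}{2}e_x^2 + \frac{1}{2}e_y^2 + \frac{1}{k}(1 - \cos e_\theta) \tag{19}$$

Taking the derivative of (19) and substituting (11) and (12) gives

$$\begin{aligned} \dot V_1 &= e_x\dot e_x + e_y\dot e_y + \frac{1}{k}\dot e_\theta\sin e_\theta \\ &= e_x(\omega e_y - v + v_r\cos e_\theta) + e_y(v_r\sin e_\theta - \omega e_x) + \frac{1}{k}(\omega_r - \omega)\sin e_\theta \\ &= -e_x v + e_x v_r\cos e_\theta + e_y v_r\sin e_\theta + \frac{1}{k}\omega_r\sin e_\theta - \frac{1}{k}\omega\sin e_\theta \\ &= -k_1 e_x^2 - k_3\sin^2 e_\theta \end{aligned} \tag{20}$$

where $k$, $k_1$, $k_2$ and $k_3$ are positive parameters. Since $\dot V_1 \le 0$, the trajectory tracking error can be guaranteed to converge to zero exponentially. Considering that the actual velocity is not equal to the virtual velocity, the Lyapunov function $V_2$ is selected as

$$V_2 = \frac{1}{2}z_1^2 + \frac{1}{2}z_2^2 + \frac{1}{2}y_1^2 + \frac{1}{2}y_2^2 + \frac{1}{2\gamma_1}\tilde W_1^T\tilde W_1 + \frac{1}{2\gamma_2}\tilde W_2^T\tilde W_2 \tag{21}$$

Differentiating (21), we obtain

$$\dot V_2 = z_1\dot z_1 + z_2\dot z_2 + y_1\dot y_1 + y_2\dot y_2 - \frac{1}{\gamma_1}\tilde W_1^T\dot{\hat W}_1 - \frac{1}{\gamma_2}\tilde W_2^T\dot{\hat W}_2 \tag{22}$$

Substitute Eqs. (5), (7), (13) and (17) into (22). According to the event-trigger condition, between trigger events ($t_k < t < t_{k+1}$) there exists $|\lambda_t| \le 1$ such that $\delta(t) = u(t) + \lambda_t\Delta$, and we obtain

$$\begin{aligned} \dot V_2 = {} & -c_1 z_1^2 + \frac{1}{rm}\lambda_t\Delta z_1 - c_2 z_2^2 + \frac{b}{rI}\lambda_t\Delta z_2 - \frac{\omega_n}{2\zeta}y_1^2 - \frac{1}{2\zeta}y_1\dot\varphi_{1d} - y_1\dot v_c \\ & - \frac{\omega_n}{2\zeta}y_2^2 - \frac{1}{2\zeta}y_2\dot\varphi_{2d} - y_2\dot\omega_c - \frac{1}{\gamma_1}\tilde W_1^T[h(x)z_1 + \dot{\hat W}_1] \\ & - \frac{1}{\gamma_2}\tilde W_2^T[h(x)z_2 + \dot{\hat W}_2] - \varepsilon_1 z_1 - \varepsilon_2 z_2 \end{aligned} \tag{23}$$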
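The adaptive law of the neural network is expressed as follows

$$\dot{\hat W}_1 = -\gamma_1[h(x)z_1 + \sigma_1\hat W_1], \quad \dot{\hat W}_2 = -\gamma_2[h(x)z_2 + \sigma_2\hat W_2] \tag{24}$$

so that we obtain

$$\begin{aligned} \dot V_2 = {} & -c_1 z_1^2 + \frac{1}{rm}\lambda_t\Delta z_1 - c_2 z_2^2 + \frac{b}{rI}\lambda_t\Delta z_2 - \frac{\omega_n}{2\zeta}y_1^2 - \frac{1}{2\zeta}y_1\dot\varphi_{1d} - y_1\dot v_c \\ & - \frac{\omega_n}{2\zeta}y_2^2 - \frac{1}{2\zeta}y_2\dot\varphi_{2d} - y_2\dot\omega_c - \sigma_1\tilde W_1^T\hat W_1 - \sigma_2\tilde W_2^T\hat W_2 - \varepsilon_1 z_1 - \varepsilon_2 z_2 \end{aligned} \tag{25}$$

Applying basic inequalities to $\frac{1}{rm}\lambda_t\Delta z_1$ and $\frac{1}{2\zeta}y_1\dot\varphi_{1d}$ gives

$$\frac{1}{rm}\lambda_t\Delta z_1 \le \frac{1}{2}\left(\frac{1}{rm}\lambda_t\Delta\right)^2 + \frac{1}{2}z_1^2, \quad \frac{1}{2\zeta}y_1\dot\varphi_{1d} \le \frac{1}{2}\left(\frac{1}{2\zeta}y_1\dot\varphi_{1d}\right)^2 + \frac{1}{2} \tag{26}$$

Meanwhile, applying the same inequalities to $\frac{b}{rI}\lambda_t\Delta z_2$, $y_1\dot v_c$, $\frac{1}{2\zeta}y_2\dot\varphi_{2d}$ and $y_2\dot\omega_c$, (25) can be simplified as follows

$$\begin{aligned} \dot V_2 \le {} & -\left(c_1 - \frac{1}{2}\right)z_1^2 - \left(c_2 - \frac{1}{2}\right)z_2^2 - \left(\frac{\omega_n}{2\zeta} - \frac{1}{8\zeta^2}\dot\varphi_{1d}^2 - \frac{1}{2}\dot v_c^2\right)y_1^2 \\ & - \left(\frac{\omega_n}{2\zeta} - \frac{1}{8\zeta^2}\dot\varphi_{2d}^2 - \frac{1}{2}\dot\omega_c^2\right)y_2^2 + 2 - \sigma_1\tilde W_1^T\hat W_1 - \sigma_2\tilde W_2^T\hat W_2 \\ & + \frac{1}{2}\left(\frac{1}{rm}\lambda_t\Delta\right)^2 + \frac{1}{2}\left(\frac{b}{rI}\lambda_t\Delta\right)^2 - \varepsilon_1 z_1 - \varepsilon_2 z_2 \end{aligned} \tag{27}$$

Then, using Young's inequality, we obtain

$$2\tilde W_i^T\hat W_i \le \|W_i^*\|^2 - \|\tilde W_i\|^2, \quad 2\varepsilon_i z_i \le z_i^2 + \varepsilon_i^2 \tag{28}$$

Assume that the filter output is bounded and that the virtual velocity derivative is bounded, so that $\frac{1}{8\zeta^2}\dot\varphi_{1d}^2 + \frac{1}{2}\dot v_c^2 \le M$ and $\frac{1}{8\zeta^2}\dot\varphi_{2d}^2 + \frac{1}{2}\dot\omega_c^2 \le N$. Finally, we get

$$\begin{aligned} \dot V_2 \le {} & -(c_1 - 1)z_1^2 - (c_2 - 1)z_2^2 - \left(\frac{\omega_n}{2\zeta} - M\right)y_1^2 - \left(\frac{\omega_n}{2\zeta} - N\right)y_2^2 - \frac{\sigma_1}{2}\|\tilde W_1\|^2 - \frac{\sigma_2}{2}\|\tilde W_2\|^2 \\ & + \frac{\sigma_1}{2}\|W_1^*\|^2 + \frac{\sigma_2}{2}\|W_2^*\|^2 + \frac{1}{2}\varepsilon_1^2 + \frac{1}{2}\varepsilon_2^2 + \frac{1}{2}\left(\frac{1}{rm}\lambda_t\Delta\right)^2 + \frac{1}{2}\left(\frac{b}{rI}\lambda_t\Delta\right)^2 + 2 \end{aligned} \tag{29}$$

where $c_1 > 1$, $c_2 > 1$, $\frac{\omega_n}{2\zeta} > \max(M, N)$.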
That is,

$$\dot V_2 \le -C V_2 + B \tag{30}$$

with

$$\begin{aligned} C &= 2\min\left[(c_1 - 1),\ (c_2 - 1),\ \left(\frac{\omega_n}{2\zeta} - M\right),\ \left(\frac{\omega_n}{2\zeta} - N\right),\ \frac{\gamma_1\sigma_1}{2},\ \frac{\gamma_2\sigma_2}{2}\right] \\ B &= \frac{\sigma_1}{2}\|W_1^*\|^2 + \frac{\sigma_2}{2}\|W_2^*\|^2 + \frac{1}{2}\varepsilon_1^2 + \frac{1}{2}\varepsilon_2^2 + \frac{1}{2}\left(\frac{1}{rm}\lambda_t\Delta\right)^2 + \frac{1}{2}\left(\frac{b}{rI}\lambda_t\Delta\right)^2 + 2 \end{aligned} \tag{31}$$

Solving the inequality $\dot V_2 \le -C V_2 + B$, we get

$$V_2 \le \frac{B}{C} + \left(V_2(0) - \frac{B}{C}\right)e^{-Ct} \tag{32}$$

It can be seen that all error signals in the closed-loop system are semi-globally uniformly bounded in the following compact set

$$\Omega = \left\{(z, y, \tilde W) : V_2 \le \frac{B}{C}\right\} \tag{33}$$

Barbalat's lemma: if $\lim_{t\to\infty} V$ converges and $\dot V$ is uniformly continuous, then $\lim_{t\to\infty}\dot V = 0$. It is known that the compact set can be made arbitrarily small by adjusting the parameters. According to Barbalat's lemma, when $t \to \infty$, $V_2 \to 0$ and $z_1, z_2 \to 0$; because $\dot V_1 \le 0$, the pose errors $e_x$, $e_y$, $e_\theta$ are uniformly asymptotically stable. In order to avoid the “Zeno” phenomenon, it is necessary to prove that there is a minimum trigger interval. This means that $\exists t^* > 0$, $\forall k \in N$, $\{t_{k+1} - t_k\} \ge t^*$. According to $e(t) = \delta(t) - u(t)$,

$$\frac{d}{dt}|e| = \frac{d}{dt}(e \times e)^{1/2} = \mathrm{sign}(e) \times \dot e \le |\dot\delta| \tag{34}$$

Since $\dot\delta$ is a smooth and continuously differentiable function and all variables in $\dot\delta$ are globally bounded, there is a constant $k > 0$ such that $|\dot\delta| < k$. With $e(t_k) = 0$ and $\lim_{t\to t_{k+1}} e(t) = m$, there must be a constant $t^* \ge \frac{m}{k}$, so the “Zeno” phenomenon will not happen.
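The step from the differential inequality (30) to the bound (32) can be illustrated numerically by integrating the limiting case $\dot V_2 = -C V_2 + B$ and comparing it with the closed-form right-hand side of (32). This is a small sketch with hypothetical values of $C$, $B$ and $V_2(0)$; the function names are hypothetical as well.

```python
import math

def v2_bound(t, v0, C, B):
    """Right-hand side of Eq. (32)."""
    return B / C + (v0 - B / C) * math.exp(-C * t)

def simulate_worst_case(v0, C, B, t_end, dt=1e-4):
    """Integrate the limiting case dV/dt = -C V + B with explicit Euler."""
    v = v0
    for _ in range(int(round(t_end / dt))):
        v += dt * (-C * v + B)
    return v

# Hypothetical constants, for illustration only.
C, B, v0 = 2.0, 0.5, 3.0
v_sim = simulate_worst_case(v0, C, B, t_end=2.0)
v_closed = v2_bound(2.0, v0, C, B)
v_limit = v2_bound(50.0, v0, C, B)      # approaches the ultimate bound B / C
```

The ultimate bound $B/C$ shrinks when $C$ is increased through the design parameters or when $B$ is reduced, for example through a smaller threshold $\Delta$.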
6 Simulation Result
Matlab/Simulink was used to carry out the simulation test to verify the control method. The physical parameters of the wheeled mobile robot are m = 2.5 kg, I = 2.5 kg m², b = 0.15 m, r = 0.05 m, and the disturbances are f1 = 2 + 5 sin(t) + 2v, f2 = 2 + 5 cos(t) + 2ω. The kinematic control parameters are k1, k2, k3, and the dynamic control parameters are set as c1 = c2 = 50. The command filter parameters are ζ = 0.6, ωn = 160. For the neural network, the width is b = 3, the centers c are uniformly distributed in (−3, 3), the number of hidden-layer neurons is 13, γ1 = γ2 = 500, and σ1 = σ2 = 0.00001. The fixed thresholds of the event trigger are Δ1 = 0.1, Δ2 = 0.3. The reference track is
selected as a circular track to verify the tracking effect. The radius of the reference track is 2 m and its center is (3, 3). The reference linear and angular velocities of the wheeled mobile robot are vr = 2 m s⁻¹, ωr = 1 rad s⁻¹, and the initial pose is (3, 0, 0). Simulation results are as follows. Figure 2a shows the overall trajectory tracking effect of the wheeled mobile robot. The robot quickly reaches the reference trajectory and then follows it stably. Figure 2b shows the changes of the x direction error, the y direction error and the angle error in the trajectory tracking process. It can be seen that the errors quickly approach 0 and remain at very small values. Figure 2c shows the estimation of the unknown lumped disturbance by the RBF network. It can be seen from the figure that the estimated value is very close to the real value, and the estimation accuracy is relatively high. Figure 2d shows the output torque. Figure 3 shows the event-triggering intervals of the dynamic control law. It can be seen that the designed event-trigger mechanism triggers stably, the minimum trigger interval is 0.01 s, and there is no “Zeno” phenomenon.
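The circular reference track can be reproduced by integrating the reference kinematics (9) with the stated velocities. The sketch below assumes that the reference starts at (3, 1) with heading 0, which places the circle centre at (3, 3); this starting point is an assumption, since only the radius, the centre and the robot's initial pose are given. The function name is hypothetical.

```python
import math

def reference_trajectory(x0, y0, theta0, v_r, w_r, t_end, dt=1e-3):
    """Integrate Eq. (9) with explicit Euler; returns a list of (x_r, y_r, theta_r)."""
    x, y, theta = x0, y0, theta0
    traj = [(x, y, theta)]
    for _ in range(int(round(t_end / dt))):
        x += dt * v_r * math.cos(theta)
        y += dt * v_r * math.sin(theta)
        theta += dt * w_r
        traj.append((x, y, theta))
    return traj

# One revolution at v_r = 2 m/s and w_r = 1 rad/s, i.e. radius v_r / w_r = 2 m.
traj = reference_trajectory(3.0, 1.0, 0.0, v_r=2.0, w_r=1.0, t_end=2 * math.pi)
radii = [math.hypot(x - 3.0, y - 3.0) for x, y, _ in traj]
```

The distances in `radii` stay close to 2 m, up to the discretisation error of the Euler scheme.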
Fig. 2 Result of trajectory tracking: (a) actual and reference trajectories, y (m) versus x (m); (b) tracking errors e_x, e_y, e_θ versus t (s); (c) actual and estimated values of the disturbance versus t (s); (d) control torques u1, u2 (Nm) versus t (s)
Fig. 3 Interval of event triggering: (a), (b) time interval versus t (s)
7 Conclusions
In this paper, an event-triggered neural network adaptive dynamic surface control is designed for wheeled mobile robots subject to unknown disturbances such as skidding on complex pavement. The unknown disturbance is estimated online in real time by an RBF neural network. To avoid the increase in complexity caused by differentiation in backstepping, dynamic surface control is adopted. Meanwhile, a command filter is used to replace the first-order low-pass filter to improve filtering accuracy. The event-triggering mechanism with a fixed threshold is designed to avoid frequent updating of actuator signals and effectively reduce resource consumption. The SGUUB stability of the system is proved by Lyapunov stability theory. Finally, the simulation results show that the proposed control strategy can effectively approximate the unknown disturbance, achieves accurate trajectory tracking, avoids the “Zeno” phenomenon, and has good robustness.
References
1. Cen, H., Singh, B.K., Shanmuganathan, V.: Nonholonomic wheeled mobile robot trajectory tracking control based on improved sliding mode variable structure. Wireless Commun. Mob. Comput. 2021, 1–9 (2021)
2. Ding, L., Li, S., Gao, H., Chen, C., Deng, Z.: Adaptive partial reinforcement learning neural network-based tracking control for wheeled mobile robotic systems. IEEE Trans. Syst., Man, Cybern.: Syst. 50(7), 2512–2523 (2020)
3. Fukao, T., Nakagawa, H., Adachi, N.: Adaptive tracking control of a nonholonomic mobile robot. IEEE Trans. Robot. Autom. 16, 609–615 (2000)
4. Chen, C., Gao, H., Ding, L., Li, W., Yu, H., Deng, Z.: Trajectory tracking control of WMRs with lateral and longitudinal slippage based on active disturbance rejection control. Robot. Auton. Syst. 107, 236–245 (2018)
5. Wang, X., Zhang, G., Neri, F., Jiang, T., Zhao, J., Gheorghe, M., Ipate, F., Lefticaru, R.: Design and implementation of membrane controllers for trajectory tracking of nonholonomic wheeled mobile robots. Integr. Comput.-Aided Eng. 23(1), 15–30 (2015)
6. Park, B.S., Yoo, S.J., Park, J.B., Choi, Y.H.: A simple adaptive control approach for trajectory tracking of electrically driven nonholonomic mobile robots. IEEE Trans. Control Syst. Technol. 18(5), 1199–1206 (2010)
7. Ye, J.: Tracking control of a non-holonomic wheeled mobile robot using improved compound cosine function neural networks. Int. J. Control 88(2), 364–373 (2015)
8. Li, L., Cao, W., Yang, H., Geng, Q.: Trajectory tracking control for a wheel mobile robot on rough and uneven ground. Mechatronics 83 (2022)
9. Kang, H., Park, C.-W., Hyun, C.-H.: Alternative identification of wheeled mobile robots with skidding and slipping. Int. J. Control, Autom. Syst. 14(4), 1055–1062 (2016)
10. Cui, M., Liu, H., Liu, W., Huang, R., Qin, Y.: An adaptive unscented Kalman filter-based adaptive tracking control for wheeled mobile robots with control constrains in the presence of wheel slipping. Int. J. Adv. Robot. Syst. 13(5) (2016)
11. Chen, M.: Disturbance attenuation tracking control for wheeled mobile robots with skidding and slipping. IEEE Trans. Ind. Electron. 64(4), 3359–3368 (2017)
12. Kang, H.-S., Kim, Y.-T., Hyun, C.-H., Park, M.: Generalized extended state observer approach to robust tracking control for wheeled mobile robot with skidding and slipping. Int. J. Adv. Robot. Syst. 10(3) (2013)
13. Wang, G., Zhou, C., Yu, Y., Liu, X.: Adaptive sliding mode trajectory tracking control for WMR considering skidding and slipping via extended state observer. Energies 12(17) (2019)
14. Shen, Z., Ma, Y., Song, Y.: Robust adaptive fault-tolerant control of mobile robots with varying center of mass. IEEE Trans. Ind. Electron. 65(3), 2419–2428 (2018)
15. Liu, X., Xu, B., Shou, Y., Fan, Q.Y., Chen, Y.: Event-triggered adaptive control of uncertain nonlinear systems with composite condition. IEEE Trans. Neural Netw. Learn. Syst. 33(10), 6030–6037 (2022)
16. Xing, L., Wen, C., Liu, Z., Su, H., Cai, J.: Event-triggered output feedback control for a class of uncertain nonlinear systems. IEEE Trans. Autom. Control 64(1), 290–297 (2019)
17. Xing, L., Wen, C., Liu, Z., Su, H., Cai, J.: Event-triggered adaptive control for a class of uncertain nonlinear systems. IEEE Trans. Autom. Control 62(4), 2071–2076 (2017)
18. Villarreal-Cervantes, M.G., Sanchez-Santana, J.P., Guerrero-Castellanos, J.F.: Periodic event-triggered control strategy for a (3,0) mobile robot network. ISA Trans. 96, 490–500 (2020)
A K-Means and GMM-Based Fusion and Detection Algorithm Against FDI Attacks on Remote Estimator Jinxing Hua and Fei Hao
Abstract This paper presents a K-means and Gaussian mixture model (GMM) based detection and fusion algorithm for a multisensor cyber physical system (CPS). In the considered system, part of the measurement channels may suffer from false data injection (FDI) attacks, which would deteriorate the estimation performance of the CPS. To handle this, a novel detection and fusion algorithm is proposed to eliminate compromised sensors and fuse safe sensors. Firstly, the K-means algorithm is utilized to get rid of severely biased sensors, and the GMM algorithm is subsequently adopted to further detect the sensors screened by the K-means algorithm. Moreover, a more computationally efficient sequential Kalman filter is used at the remote estimator side, and the detection and fusion algorithm based on the K-means and GMM algorithms is derived in the framework of the sequential Kalman filter. In addition, the recursion of the estimation error covariance is recalculated in the presence of attacks. Finally, the effectiveness of the detection and fusion algorithm is verified by a simulation example of an unmanned ground vehicle (UGV).
Keywords Cyber physical systems · False data injection attacks · K-means algorithm · Gaussian mixture model · Sequential Kalman filter
J. Hua · F. Hao (B)
The Seventh Research Division, School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China
e-mail: [email protected]
J. Hua
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_12

1 Introduction
A CPS is a complex dynamic system that integrates communication, computing and control technologies and extensively connects information space, physical space and human society. As a result, CPSs have been widely used in industrial production, smart grids, intelligent transportation and other critical national infrastructures [1]. Due
to the importance and open interconnectivity of the CPS, it has become one of the prime targets of malicious attacks, and any successful attack could cause damage to the economy and even to the lives of people. Thus, the security of CPS is of critical importance. FDI attacks are among the most important cyber attacks. They break the integrity and authenticity of the information by injecting malicious attack signals, and have attracted much attention in recent years. Many scholars have studied the problem of optimal FDI attacks from the attacker's point of view under constraints such as stealthiness and energy consumption, as in [2–4]. Investigating the security of CPS from the defender's perspective is also very meaningful. In [5], an estimation scheme was proposed to filter out malicious sensors while retaining the other sensors, and a novel detector was then developed to detect FDI attacks on an unknown sensor subset. In [6], a GMM-based detection algorithm was proposed to detect and fuse the measurements, some of which might be subject to attacks. In [7], an encryption-based defense strategy was proposed to counter FDI attacks, where measurement information was encrypted before being transmitted over the wireless channel to the remote state estimator. Our target is to detect the attacked sensors as accurately as possible and fuse the measurements from sensors which are not corrupted. Note that, in [6], all sensors are involved in the fusion process at the remote estimator side according to the beliefs generated by the GMM, regardless of whether they are attacked or not. This affects the estimation accuracy of the estimator. Moreover, the fusion process in [6] follows the parallel Kalman filter as in [8]. However, [8] pointed out that the sequential Kalman filter is computationally more efficient than the parallel Kalman filter while remaining optimal.
Since computational efficiency is crucial for systems measured by a large number of sensors, performing the fusion under the sequential Kalman filter framework will enhance system performance.

Based on the above discussion, an improved K-means and GMM-based detection and fusion algorithm under the sequential Kalman filter is proposed in this paper. Different from the GMM-based detection algorithm in [6], the K-means algorithm is first adopted to discard the severely biased sensors, which reduces the impact of the compromised sensors on the estimator. A GMM-based algorithm is then utilized to further detect the remaining sensors, and a fusion process under the sequential Kalman filter framework is subsequently proposed. The main contributions of this paper are summarized as follows.

1. An improved K-means and GMM-based detection and fusion algorithm is proposed for the case in which some of the sensors are attacked. Different from [6], a K-means algorithm is first adopted to discard severely biased sensors.
2. Based on the K-means algorithm, the GMM-based algorithm is subsequently utilized to detect the remaining sensors for further fusion, and the fusion process under the sequential Kalman filter framework is derived.
3. A more general attack policy is proposed to analyze the effectiveness of the detection and fusion algorithm in this paper, and the recursion of the remote estimation error covariance under the improved K-means and GMM-based detection and fusion algorithm is recalculated for the case in which some of the sensors are attacked.
A K-Means and GMM-Based Fusion and Detection Algorithm …
2 Problem Formulation

In this section, the problem formulation is presented. The process, shown in Fig. 1, is described by the following linear system:

\[ x_k = A x_{k-1} + \omega_{k-1}, \qquad y_{k,i} = C_i x_k + v_{k,i}, \quad i = 1, \dots, N, \tag{1} \]

where $x_k \in \mathbb{R}^n$ is the system state and $y_{k,i} \in \mathbb{R}^m$ is the measurement of sensor $i$. $\omega_{k-1}$ is the system noise with $\omega_{k-1} \sim \mathcal{N}(0, Q)$, and $v_{k,i}$ is the measurement noise with $v_{k,i} \sim \mathcal{N}(0, R_i)$, where $Q \ge 0$ and $R_i > 0$; they are mutually independent i.i.d. Gaussian random variables. The initial state $x_0 \sim \mathcal{N}(0, \Pi_0)$ is independent of $\omega_k$ and $v_{k,i}$. The pairs $(A, C_i)$ are detectable, and $(A, \sqrt{Q})$ is stabilizable. At every time step $k$, each sensor transmits its measurement to the remote estimator through wireless channels, and a sequential Kalman filter is adopted to estimate the system state according to

\[ \begin{aligned}
&\hat{x}_k^- = A \hat{x}_{k-1}, \qquad P_k^- = A P_{k-1} A^T + Q, \\
&\hat{x}_k^0 = \hat{x}_k^-, \qquad P_{k,0} = P_k^-, \\
&\hat{x}_k^i = \hat{x}_k^{i-1} + K_{k,i}\left(y_{k,i} - C_i \hat{x}_k^{i-1}\right), \\
&K_{k,i} = P_{k,i-1} C_i^T \left(C_i P_{k,i-1} C_i^T + R_i\right)^{-1}, \ \text{or} \ K_{k,i} = P_{k,i} C_i^T R_i^{-1}, \\
&P_{k,i} = (I - K_{k,i} C_i) P_{k,i-1}, \ \text{or} \ P_{k,i}^{-1} = P_{k,i-1}^{-1} + C_i^T R_i^{-1} C_i, \quad i = 1, \dots, N, \\
&\hat{x}_k = \hat{x}_k^N, \qquad P_k = P_{k,N},
\end{aligned} \tag{2} \]

where $\hat{x}_k^-$ and $\hat{x}_k$ are the a priori and a posteriori minimum mean squared error (MMSE) estimates of the state $x_k$, and $P_k^-$ and $P_k$ are the corresponding covariances, respectively. Additionally, each sensor $i$ adopts a standard Kalman filter to obtain the local estimates as in [6], in which $\hat{x}_{k,i}^-$, $\hat{x}_{k,i}$, $P_{k,i}^-$ and $P_{k,i}$ are the a priori and a posteriori MMSE estimates and the corresponding covariances, respectively. A K-L divergence detector is adopted to detect the existence of potential malicious attacks, and the distributed K-L divergence detectors for each sensor and the centralized K-L divergence detector at the remote estimator side are:
Fig. 1 System architecture
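\[ \begin{aligned}
D_i(z_{k,i} \,\|\, \tilde{z}_{k,i}) &= \int_{\{\xi_k \mid f_{z_{k,i}}(\xi_k) > 0\}} \log \frac{f_{z_{k,i}}(\xi_k)}{f_{\tilde{z}_{k,i}}(\xi_k)} \, f_{z_{k,i}}(\xi_k) \, d\xi_k \le \eta_i, \\
D(z_k \,\|\, \tilde{z}_k) &= \int_{\{\xi_k \mid f_{z_k}(\xi_k) > 0\}} \log \frac{f_{z_k}(\xi_k)}{f_{\tilde{z}_k}(\xi_k)} \, f_{z_k}(\xi_k) \, d\xi_k \le \eta,
\end{aligned} \tag{3} \]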
where $z_{k,i} = y_{k,i} - C_i \hat{x}_{k,i}^-$ and $\tilde{z}_{k,i}$ are the uncorrupted and corrupted innovations of sensor $i$, respectively, and $z_k = y_{k,N} - C_N \hat{x}_k^{N-1}$ and $\tilde{z}_k$ are the innovation and the corrupted innovation at the remote estimator side. The distributed detector triggers an alarm when $D_i(z_{k,i} \,\|\, \tilde{z}_{k,i})$ exceeds the threshold $\eta_i$, and the measurement of sensor $i$ is then dropped. When $D(z_k \,\|\, \tilde{z}_k) > \eta$, the centralized detector gives an alarm, and all measurements are dropped. As shown in [9, 10], a carefully designed FDI attack can bypass the distributed K-L divergence detectors and deteriorate the performance of the system. However, stealthy FDI attacks that bypass the distributed detectors fail to remain stealthy to the centralized detector, so all measurements are discarded, which results in a large estimation error. Thus, the goal of this paper is to design an improved GMM-based detection algorithm for systems suffering from stealthy FDI attacks.
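As an illustration of how a detector of the form (3) might be evaluated, the sketch below uses the closed-form K-L divergence between two zero-mean Gaussian densities. Treating both the nominal and the corrupted innovations as zero-mean Gaussian is an assumption made here for illustration, and the names `gaussian_kl` and `detector_alarm` as well as the threshold value are hypothetical rather than taken from the paper.

```python
import numpy as np

def gaussian_kl(cov_p, cov_q):
    """K-L divergence D(p || q) for p = N(0, cov_p) and q = N(0, cov_q)."""
    cov_p = np.atleast_2d(cov_p)
    cov_q = np.atleast_2d(cov_q)
    m = cov_p.shape[0]
    # tr(cov_q^{-1} cov_p), computed without forming an explicit inverse
    trace_term = np.trace(np.linalg.solve(cov_q, cov_p))
    _, logdet_p = np.linalg.slogdet(cov_p)
    _, logdet_q = np.linalg.slogdet(cov_q)
    return 0.5 * (trace_term - m + logdet_q - logdet_p)

def detector_alarm(cov_nominal, cov_corrupted, threshold):
    """Return True when the divergence exceeds the (assumed) threshold."""
    return gaussian_kl(cov_nominal, cov_corrupted) > threshold

# Hypothetical example: the corrupted innovation has twice the nominal covariance.
nominal = np.eye(2)
corrupted = 2.0 * np.eye(2)
divergence = gaussian_kl(nominal, corrupted)
alarm = detector_alarm(nominal, corrupted, threshold=0.1)
```

With a per-sensor threshold this plays the role of a distributed detector, and with the innovation covariance at the remote estimator it plays the role of the centralized one.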
3 Detection Algorithm

In this section, an improved GMM-based detection algorithm is proposed to detect malicious FDI attacks. Firstly, the K-means algorithm is used to remove outlying local estimates. Then, the GMM is adopted to detect the screened local estimates. Finally, an improved GMM-based sequential Kalman filter algorithm is formulated to fuse the measurements from the sensors.
3.1 K-Means Algorithm

The main function of the K-means algorithm is clustering. When some of the sensors are attacked, the corresponding local estimates deviate from the system state $x_k$. Thus, the K-means algorithm is adopted to eliminate the severely biased local estimates, which improves the detection accuracy of the subsequent GMM. Define the set of $N$ local estimates as $X = \{\hat{x}_{k,1}, \dots, \hat{x}_{k,N}\}$. The K-means algorithm first screens the $N$ local estimates in $X$ down to $M$ ($0 < M < N$), and proceeds as follows.

(1) Select $M$ local estimates randomly from $X$ as class centres, denoted as $\mathcal{C} = \{c_1, \dots, c_M\}$. Additionally, the set of members of class $j$ is denoted as $\mathcal{C}_j$.
(2) For each local estimate $\hat{x}_{k,i}$, calculate the distance to each of the $M$ class centres, $\|\hat{x}_{k,i} - c_j\|$, and assign $\hat{x}_{k,i}$ to the class whose centre is closest.
(3) For each class $j$, recalculate the class centre as $\bar{c}_j = \frac{1}{|\mathcal{C}_j|} \sum_{i \in \mathcal{C}_j} \hat{x}_{k,i}$. Then, calculate the distance from each $\hat{x}_{k,i} \in \mathcal{C}_j$ to $\bar{c}_j$; the nearest local estimate becomes the new class centre.
(4) Iterate steps (2) and (3) for a pre-designed number of iterations.

Denote the final set of class centres as $\bar{\mathcal{C}}$, and output $\bar{\mathcal{C}}$ as the screened local estimates, which are utilized in the subsequent detection and fusion. Sensors that are not in $\bar{\mathcal{C}}$ are identified as severely deviated, i.e. as attacked by the FDI attacker. Define $\delta_{k,i}$ as the indicator of whether sensor $i$ is in $\bar{\mathcal{C}}$: $\delta_{k,i} = 1$ means that the sensor is selected to participate in the subsequent GMM-based detection and fusion, and $\delta_{k,i} = 0$ means that it is not.
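The screening steps above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function name `kmeans_screen`, the seed handling and the early stop when the centres no longer change are assumptions. As with any K-means variant, the outcome depends on the random initial centres.

```python
import numpy as np

def kmeans_screen(estimates, M, iters=10, seed=0):
    """Screen N local estimates down to M class centres.

    Returns delta, an array with delta[i] = 1 if estimate i is a final
    class centre (kept for the GMM stage) and 0 otherwise.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(estimates, dtype=float)        # shape (N, n)
    N = X.shape[0]
    # Step (1): pick M distinct local estimates as initial class centres.
    centre_idx = rng.choice(N, size=M, replace=False)
    for _ in range(iters):
        # Step (2): assign every estimate to its closest centre.
        dist = np.linalg.norm(X[:, None, :] - X[centre_idx][None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        # Step (3): class mean, then snap to the nearest member estimate.
        new_idx = centre_idx.copy()
        for j in range(M):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue                           # keep the previous centre
            mean_j = X[members].mean(axis=0)
            offsets = np.linalg.norm(X[members] - mean_j, axis=1)
            new_idx[j] = members[np.argmin(offsets)]
        if np.array_equal(new_idx, centre_idx):
            break                                  # centres are stable
        centre_idx = new_idx
    delta = np.zeros(N, dtype=int)
    delta[centre_idx] = 1
    return delta
```

For instance, `kmeans_screen(local_estimates, M=3)` on four local estimates returns a 0/1 vector with exactly three ones, which plays the role of $\delta_{k,i}$.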
3.2 GMM Algorithm

The GMM algorithm is utilized to further detect and fuse the measurements of the sensors screened by the K-means algorithm, so as to improve the estimation performance of the system. The Gaussian mixture model is formulated similarly to [6], and the mixture density of the GMM can be represented as

\[ p(x) = \sum_{q=1}^{2} p(x \mid Q^q) \Pr(Q^q) = \sum_{q=1}^{2} \pi^q f(x; \mu^q, \Sigma^q), \]

where $p(x \mid Q^q)$ and $\Pr(Q^q)$ are the Gaussian density and the weight of the $q$-th component, respectively, and $q = 1$ or $2$ indicates that the sensor is uncorrupted or compromised. Here

\[ f(x; \mu^q, \Sigma^q) = \frac{1}{\sqrt{(2\pi)^n |\Sigma^q|}} \exp\left( -\frac{1}{2} (x - \mu^q)^T (\Sigma^q)^{-1} (x - \mu^q) \right) \]

is the probability density function (pdf) of a Gaussian random variable. Additionally, $\pi^q$ are the mixture component weights of $Q^q$ for $q = 1$ or $2$, and $\sum_{q=1}^{2} \pi^q = 1$.

The local estimate $\hat{x}_{k,i}$ follows two different distributions depending on whether sensor $i$ is attacked or not. When sensor $i$ is uncorrupted, $p(\hat{x}_{k,i} \mid Q^1) \sim \mathcal{N}(\mu_k^1, P_i)$, where $P_i$ is the steady-state estimation error covariance of sensor $i$, i.e. the unique positive definite solution of

\[ X = A X A^T + Q - (A X A^T + Q) C_i^T \left( C_i (A X A^T + Q) C_i^T + R_i \right)^{-1} C_i (A X A^T + Q). \]

Since the information of the attacker, including the attack start time and the specific attack form, is unknown, $p(\hat{x}_{k,i} \mid Q^2) \sim \mathcal{N}(\mu_k^2, \Sigma_k^2)$. Therefore, the density of $\hat{x}_{k,i}$ can be given as

\[ p(\hat{x}_{k,i}) = \sum_{q=1}^{2} p(\hat{x}_{k,i} \mid Q^q) \Pr(Q^q) = \pi_k^1 f(\hat{x}_{k,i}; \mu_k^1, P_i) + \pi_k^2 f(\hat{x}_{k,i}; \mu_k^2, \Sigma_k^2). \]

Then, the expectation-maximization (EM) algorithm in [11] is adopted to find the maximum likelihood estimates of the parameters $\Phi_k = \{\pi_k^q, \mu_k^q, \Sigma_k^2\}_{q=1}^{2}$, and the log-likelihood can be described as

\[ L_k = \sum_{i=1,\, i \in \bar{\mathcal{C}}}^{N} \log\left( \pi_k^1 f(\hat{x}_{k,i}; \mu_k^1, P_i) + \pi_k^2 f(\hat{x}_{k,i}; \mu_k^2, \Sigma_k^2) \right). \]

Generally, the EM algorithm is divided into two steps:

(1) The expectation step: for each sensor $i$, obtain the probability that $\hat{x}_{k,i}$ belongs to component $q$:

\[ \gamma_{k,i}^1 = \frac{\pi_k^1 f(\hat{x}_{k,i}; \mu_k^1, P_i)}{\pi_k^1 f(\hat{x}_{k,i}; \mu_k^1, P_i) + \pi_k^2 f(\hat{x}_{k,i}; \mu_k^2, \Sigma_k^2)}, \qquad \gamma_{k,i}^2 = 1 - \gamma_{k,i}^1. \]

(2) The maximization step: re-estimate $\{\pi_k^q, \mu_k^q, \Sigma_k^2\}$ according to

\[ \pi_k^q = \frac{\sum_{i=1,\, i \in \bar{\mathcal{C}}}^{N} \gamma_{k,i}^q}{M}, \qquad
\mu_k^q = \frac{\sum_{i=1,\, i \in \bar{\mathcal{C}}}^{N} \gamma_{k,i}^q \hat{x}_{k,i}}{\sum_{i=1,\, i \in \bar{\mathcal{C}}}^{N} \gamma_{k,i}^q}, \qquad
\Sigma_k^2 = \frac{\sum_{i=1,\, i \in \bar{\mathcal{C}}}^{N} \gamma_{k,i}^2 (\hat{x}_{k,i} - \mu_k^2)(\hat{x}_{k,i} - \mu_k^2)^T}{\sum_{i=1,\, i \in \bar{\mathcal{C}}}^{N} \gamma_{k,i}^2}. \tag{4} \]
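The two EM steps can be sketched numerically as below. This is a minimal sketch under stated assumptions, not the authors' code: the initialization (median and mean of the screened estimates), the small regularization `reg` added to the estimated covariance, the tiny constants guarding divisions, and the function names are all choices made here for illustration. The first component uses the known covariance of each sensor, and only the covariance of the second component is re-estimated, as in (4).

```python
import numpy as np

def gauss_pdf(x, mu, cov):
    """Gaussian pdf f(x; mu, cov) for a single vector x."""
    n = mu.size
    diff = x - mu
    quad = diff @ np.linalg.solve(cov, diff)
    return np.exp(-0.5 * quad) / np.sqrt((2.0 * np.pi) ** n * np.linalg.det(cov))

def em_two_component(X, P_list, iters=100, tol=1e-9, reg=1e-6):
    """EM for a two-component mixture over the M screened local estimates.

    X      : (M, n) screened local estimates
    P_list : per-sensor covariance used by component 1 (kept fixed)
    Returns gamma^1, (pi^1, pi^2), (mu^1, mu^2), Sigma^2.
    """
    X = np.asarray(X, dtype=float)
    M, n = X.shape
    pi1, pi2 = 0.5, 0.5                               # assumed initial weights
    mu1, mu2 = np.median(X, axis=0), X.mean(axis=0)    # assumed initial means
    sigma2 = np.cov(X, rowvar=False) + reg * np.eye(n)
    g1 = np.full(M, 0.5)
    for _ in range(iters):
        # Expectation step: responsibilities of the "uncorrupted" component.
        f1 = np.array([gauss_pdf(X[i], mu1, P_list[i]) for i in range(M)])
        f2 = np.array([gauss_pdf(X[i], mu2, sigma2) for i in range(M)])
        g1_new = pi1 * f1 / (pi1 * f1 + pi2 * f2 + 1e-300)
        g2 = 1.0 - g1_new
        # Maximization step, following (4).
        pi1, pi2 = g1_new.sum() / M, g2.sum() / M
        mu1 = (g1_new[:, None] * X).sum(axis=0) / (g1_new.sum() + 1e-300)
        mu2 = (g2[:, None] * X).sum(axis=0) / (g2.sum() + 1e-300)
        d2 = X - mu2
        outer = d2[:, :, None] * d2[:, None, :]
        sigma2 = (g2[:, None, None] * outer).sum(axis=0) / (g2.sum() + 1e-300)
        sigma2 = sigma2 + reg * np.eye(n)             # keep Sigma^2 invertible
        converged = np.max(np.abs(g1_new - g1)) < tol
        g1 = g1_new
        if converged:
            break
    return g1, (pi1, pi2), (mu1, mu2), sigma2
```

The returned `g1` corresponds to $\gamma_{k,i}^1$ for the screened sensors; values close to one indicate estimates that are consistent with the uncorrupted component.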
Repeat steps (1) and (2) until the change between two successive iterations falls below the expected tolerance. For the sensors screened out by the K-means algorithm, the values of $\gamma_{k,i}^1$ are all set to zero and $\gamma_{k,i}^2 = 1$. Then, based on $\gamma_{k,i}^1$ and $\gamma_{k,i}^2$, the K-means-based GMM algorithm can be described in the following theorem.

Theorem 1 When the system (1) adopts the sequential Kalman filter at the remote estimator side, the K-means and GMM-based detection and fusion algorithm for the sequential Kalman filter can be given as

\[ \hat{x}_k = A \hat{x}_{k-1} + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} \left( y_{k,i} - C_i A \hat{x}_{k-1} \right), \qquad
P_k^{-1} = (A P_{k-1} A^T + Q)^{-1} + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} C_i^T R_i^{-1} C_i, \]

where $M_{k,i} = \prod_{j=i+1}^{N} (I - K_{k,j} C_j)$, $M_{k,N} = I$, and $\gamma_{k,i}^1$ and $\delta_{k,i}$ are obtained by the GMM and K-means algorithms, respectively.

Proof Firstly, based on (2), $\hat{x}_k^i$ can be rewritten in the following form:

\[ \hat{x}_k^i = \prod_{j=1}^{i} (I - K_{k,j} C_j) \hat{x}_k^- + K_{k,i} y_{k,i} + (I - K_{k,i} C_i) K_{k,i-1} y_{k,i-1} + \cdots + \prod_{l=2}^{i} (I - K_{k,l} C_l) K_{k,1} y_{k,1}, \tag{5} \]

where the factors of each product are ordered with the largest index on the left. Substituting (5) into (2), $\hat{x}_k$ can be rewritten as $\hat{x}_k = \hat{x}_k^- + \sum_{i=1}^{N-1} M_{k,i} K_{k,i} (y_{k,i} - C_i \hat{x}_k^-) + K_{k,N} (y_{k,N} - C_N \hat{x}_k^-)$. When the K-means algorithm screens the $N$ local estimates down to $M$, which are then handled by the GMM algorithm, each term is weighted by $\gamma_{k,i}^1 \delta_{k,i}$, and based on (5), $\hat{x}_k$ takes the form given in Theorem 1. Moreover, the iteration of the estimation error covariance in (2) can be simplified as

\[ \begin{aligned}
P_{k,1}^{-1} &= P_{k,0}^{-1} + C_1^T R_1^{-1} C_1 = (A P_{k-1} A^T + Q)^{-1} + C_1^T R_1^{-1} C_1, \\
&\ \ \vdots \\
P_{k,N}^{-1} &= P_k^{-1} = (A P_{k-1} A^T + Q)^{-1} + \sum_{i=1}^{N} C_i^T R_i^{-1} C_i.
\end{aligned} \tag{6} \]

Thus, the dynamic of $P_k$ for the K-means and GMM-based detection and fusion algorithm under the sequential Kalman filter framework is $P_k^{-1} = (A P_{k-1} A^T + Q)^{-1} + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} C_i^T R_i^{-1} C_i$, and the proof is completed.

The detailed process of the improved GMM-based detection algorithm is summarized as Algorithm 1.
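The fused update of Theorem 1 can be sketched as follows. The sketch assumes that the gains $K_{k,i}$ are those of the unweighted sequential filter (2) computed from the a priori covariance, which the text does not state explicitly; the function names and the single `weights` argument (standing for the products $\gamma_{k,i}^1 \delta_{k,i}$) are illustrative choices. When all weights equal one, the result coincides with the sequential Kalman filter (2).

```python
import numpy as np

def sequential_gains(P_prior, C_list, R_list):
    """Gains K_{k,i} and final covariance of the sequential update (2)."""
    n = P_prior.shape[0]
    P = P_prior.copy()
    gains = []
    for C, R in zip(C_list, R_list):
        K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
        P = (np.eye(n) - K @ C) @ P
        gains.append(K)
    return gains, P

def fused_update(x_prev, P_prev, A, Q, ys, C_list, R_list, weights):
    """Weighted fusion in the form of Theorem 1; weights[i] = gamma1_i * delta_i."""
    n = A.shape[0]
    N = len(ys)
    x_prior = A @ x_prev
    P_prior = A @ P_prev @ A.T + Q
    gains, _ = sequential_gains(P_prior, C_list, R_list)
    # M_{k,i} = (I - K_N C_N) ... (I - K_{i+1} C_{i+1}), with M_{k,N} = I.
    M = [np.eye(n) for _ in range(N)]
    for i in range(N - 2, -1, -1):
        M[i] = M[i + 1] @ (np.eye(n) - gains[i + 1] @ C_list[i + 1])
    x = x_prior.copy()
    info = np.linalg.inv(P_prior)                 # information form of P_k
    for i in range(N):
        innovation = ys[i] - C_list[i] @ x_prior
        x = x + weights[i] * (M[i] @ gains[i] @ innovation)
        info = info + weights[i] * (C_list[i].T @ np.linalg.inv(R_list[i]) @ C_list[i])
    return x, np.linalg.inv(info)
```

Setting a weight to zero removes the corresponding sensor from both the state update and the information matrix, which is how a sensor screened out by the K-means step or rejected by the GMM is excluded.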
Algorithm 1 The K-means and GMM-based detection and fusion algorithm
Input: $\hat{x}_{k,1}, \dots, \hat{x}_{k,N}$, $M$;
for $k = -\infty : 0$ do
    for $i = 1 : N$ do
        $\hat{x}_{k,i} = A \hat{x}_{k-1,i} + K_{k,i} (y_{k,i} - C_i A \hat{x}_{k-1,i})$,
        $P_{k,i} = [(A P_{k-1,i} A^T + Q)^{-1} + C_i^T R_i^{-1} C_i]^{-1}$,
    end for
    The remote estimator reaches the steady state according to (2),
end for
Randomly select $M$ local estimates as the initial class centres,
Set the iteration number $T$ of the K-means algorithm,
for $d = 1 : T$ do
    for $i = 1 : N$ do
        Calculate the distance from $\hat{x}_{k,i}$ to each of the $M$ class centres and determine the class to which it belongs,
    end for
    for $j = 1 : M$ do
        Recalculate the class centre according to step (3) in Sect. 3.1,
    end for
end for
All class centres enter the subsequent GMM process,
$\bar{P}_i = P_{0,i}$, $\Sigma_1^{(1)} = P_{0,i}$,
for $k = 1 : +\infty$ do
    for $i = 1 : N$ do
        $\hat{x}_{k,i} = A \hat{x}_{k-1,i} + K_{k,i} (y_{k,i} - C_i A \hat{x}_{k-1,i})$,
    end for
    Initialize $\pi_k^1, \pi_k^2, \mu_k^1, \mu_k^2, \Sigma_k^2$;
    while the termination condition is not reached do
        The expectation step: calculate $\gamma_{k,i}^1$ and $\gamma_{k,i}^2$;
        The maximization step: calculate $\pi_k^1, \pi_k^2, \mu_k^1, \mu_k^2, \Sigma_k^2$;
    end while
    $\hat{x}_k = A \hat{x}_{k-1} + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} (y_{k,i} - C_i A \hat{x}_{k-1})$;
    $P_k = [(A P_{k-1} A^T + Q)^{-1} + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} C_i^T R_i^{-1} C_i]^{-1}$;
end for

4 Performance Analysis for Algorithm 1

In this section, the scenario in which some of the sensors are attacked by FDI attackers is considered, and the performance of the K-means and GMM-based detection and fusion algorithm is analyzed. The optimal attack is taken as the attack policy in [6]. Different from that, a more general attack policy is investigated in this paper, which is defined as $\tilde{y}_{k,i} = y_{k,i} + \epsilon_{k,i} \Delta y_{k,i}$, where $\epsilon_{k,i} = 1$ or $0$ means that sensor $i$ is attacked or not, respectively, and $\sum_{i=1}^{N} \epsilon_{k,i} \le N$ stands for the case in which the FDI attacker can compromise only part of the sensors. Additionally, $\Delta y_{k,i} \sim \mathcal{N}(0, Y_{k,i})$ is the attack signal injected into sensor $i$, and each attack signal can bypass the distributed K-L divergence detector as in [10].

When the system (1) is subject to FDI attacks at instant $k$, the state estimate equation in Theorem 1 should be rewritten as

\[ \hat{\tilde{x}}_k = A \hat{\tilde{x}}_{k-1} + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} \left( y_{k,i} - C_i A \hat{\tilde{x}}_{k-1} + \epsilon_{k,i} \Delta y_{k,i} \right), \]

where $\hat{\tilde{x}}_k$ is the corrupted a posteriori state estimate at the remote estimator. Therefore, the state estimate deviates from its real value, and the estimation error covariance should be recalculated. The iteration of the corrupted estimation error covariance is summarized in the following theorem.
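Theorem 2 When the system (1) with the K-means and GMM-based detection algorithm is under attack, the estimation error covariance of the sequential Kalman filter at the remote estimator side follows the recursion

\[ \begin{aligned}
\tilde{P}_k ={}& M A \tilde{P}_{k-1} A^T M^T + M Q M^T + \sum_{i=1}^{N-1} (\gamma_{k,i}^1)^2 \delta_{k,i} M_{k,i} K_{k,i} R_i K_{k,i}^T M_{k,i}^T \\
&+ (\gamma_{k,N}^1)^2 \delta_{k,N} K_{k,N} R_N K_{k,N}^T + \sum_{i=1}^{N} (\gamma_{k,i}^1)^2 \delta_{k,i} \epsilon_{k,i} M_{k,i} K_{k,i} Y_{k,i} K_{k,i}^T M_{k,i}^T,
\end{aligned} \tag{7} \]

where

\[ M = I - \sum_{i=1}^{N-1} \gamma_{k,i}^1 \delta_{k,i} \left[ \prod_{j=i+1}^{N} (I - K_{k,j} C_j) \right] K_{k,i} C_i - \gamma_{k,N}^1 \delta_{k,N} K_{k,N} C_N. \]

Proof When some of the sensors are attacked by an FDI attacker, based on (1), the a priori estimation error at the remote estimator side is given by $x_k - \hat{\tilde{x}}_k^- = A (x_{k-1} - \hat{\tilde{x}}_{k-1}) + \omega_{k-1}$. Based on the attack policy, the corrupted state estimate $\hat{\tilde{x}}_k$ can be expanded as

\[ \begin{aligned}
\hat{\tilde{x}}_k ={}& \hat{\tilde{x}}_k^- + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} C_i x_k + \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} v_{k,i} \\
&+ \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} \epsilon_{k,i} M_{k,i} K_{k,i} \Delta y_{k,i} - \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} C_i \hat{\tilde{x}}_k^-.
\end{aligned} \tag{8} \]

Then, based on $x_k - \hat{\tilde{x}}_k^-$, the a posteriori estimation error can be obtained as

\[ x_k - \hat{\tilde{x}}_k = M A (x_{k-1} - \hat{\tilde{x}}_{k-1}) + M \omega_{k-1} - \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} M_{k,i} K_{k,i} v_{k,i} - \sum_{i=1}^{N} \gamma_{k,i}^1 \delta_{k,i} \epsilon_{k,i} M_{k,i} K_{k,i} \Delta y_{k,i}. \]

Thus, based on $\tilde{P}_k = E[(x_k - \hat{\tilde{x}}_k)(x_k - \hat{\tilde{x}}_k)^T]$, and noting that $\delta_{k,i}^2 = \delta_{k,i}$ and $\epsilon_{k,i}^2 = \epsilon_{k,i}$, the estimation error covariance at the remote estimator can be represented as Eq. (7).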
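One step of the recursion (7) can be sketched as follows. The sketch is an illustration under assumptions: the gains are passed in explicitly (here they are obtained from the sequential filter (2)), the products $M_{k,i}$ are ordered with the largest index on the left, and the function names are hypothetical. As a sanity check, with $\gamma_{k,i}^1 = \delta_{k,i} = 1$ and no attack, the recursion reduces to the Joseph-form covariance of the sequential Kalman filter.

```python
import numpy as np

def sequential_gains(P_prior, C_list, R_list):
    """Gains K_{k,i} and final covariance of the sequential update (2)."""
    n = P_prior.shape[0]
    P = P_prior.copy()
    gains = []
    for C, R in zip(C_list, R_list):
        K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
        P = (np.eye(n) - K @ C) @ P
        gains.append(K)
    return gains, P

def corrupted_cov_step(P_prev, A, Q, gains, C_list, R_list, Y_list, gamma, delta, eps):
    """One step of the corrupted error covariance recursion (7)."""
    n = A.shape[0]
    N = len(gains)
    eye = np.eye(n)
    # M_{k,i} = (I - K_N C_N) ... (I - K_{i+1} C_{i+1}), with M_{k,N} = I.
    M_i = [eye.copy() for _ in range(N)]
    for i in range(N - 2, -1, -1):
        M_i[i] = M_i[i + 1] @ (eye - gains[i + 1] @ C_list[i + 1])
    M = eye.copy()
    noise = np.zeros((n, n))
    for i in range(N):
        G = M_i[i] @ gains[i]                        # M_{k,i} K_{k,i}
        M = M - gamma[i] * delta[i] * (G @ C_list[i])
        weight = gamma[i] ** 2 * delta[i]
        # Measurement noise term plus attack term (present only if eps[i] = 1).
        noise = noise + weight * (G @ (R_list[i] + eps[i] * Y_list[i]) @ G.T)
    return M @ (A @ P_prev @ A.T + Q) @ M.T + noise
```

Iterating this function from an initial covariance gives the evolution of $\tilde{P}_k$ for a chosen attack pattern `eps` and attack covariances `Y_list`.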
5 Simulation

In this section, the example of an unmanned ground vehicle (UGV) in [12] is utilized to illustrate the effectiveness of the proposed K-means and GMM-based detection and fusion algorithm. The UGV is described as

\[ \begin{bmatrix} \dot{p} \\ \dot{v} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & -\frac{\mu}{H} \end{bmatrix} \begin{bmatrix} p \\ v \end{bmatrix}, \]

where $p$ and $v$ are the UGV's position and velocity, respectively. $H = 0.8$ and $\mu = 1$ represent the mechanical mass and the translational friction coefficient of the UGV, respectively. The system is discretized with a sampling period of 0.1 s in the standard zero-order-hold manner, and the corresponding discrete system is given by

\[ x_{k+1} = \begin{bmatrix} p_{k+1} \\ v_{k+1} \end{bmatrix} = \begin{bmatrix} 1 & 0.094 \\ 0 & 0.8825 \end{bmatrix} \begin{bmatrix} p_k \\ v_k \end{bmatrix} + \begin{bmatrix} 0.012 \\ 0.009 \end{bmatrix}. \]

The UGV is measured by four sensors with dynamics $y_{k+1,i} = C_i x_{k+1} + v_{k+1,i}$, where the specific values of $C_i$ and of the covariances $R_i$ of $v_{k+1,i}$ are

\[ C_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad C_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad C_3 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad C_4 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \]

and $R_1 = 0.05 I_2$, $R_2 = 0.02 I_2$, $R_3 = 0.1 I_2$ and $R_4 = 0.059 I_2$. The attacker chooses to corrupt sensors 2 and 3 with $\Delta y_{k,2} \sim \mathcal{N}(0, 3.5 I_2)$ and $\Delta y_{k,3} \sim \mathcal{N}(0, 1.27 I_2)$. The value of $M$ in the K-means algorithm is 3, which means that one sensor is screened out first. Figure 2 shows the traces of the estimation error covariance under different detectors. The K-means algorithm is first used to remove outliers, so that the number of sensors involved in the subsequent GMM-based process becomes 3, and a GMM algorithm is then utilized to further detect and fuse the measurements of the 3 sensors in the sequential Kalman filter framework.

Fig. 2 Estimation error covariances under different detectors

Figure 2 demonstrates that the detection and fusion algorithm in this paper obtains better detection performance than the others, which include the GMM-based detection algorithm in [6], the distributed K-L divergence detector and the centralized K-L divergence detector, with the attack-free case shown for reference. Furthermore, traces of the steady-state estimation error covariance under different numbers of sensors and attacked sensors are shown in Table 1 to further demonstrate the effectiveness of Algorithm 1, where NS is the number of sensors, NA is the number of attacked sensors and TSE is the abbreviation for the trace of the steady-state estimation error covariance.

Table 1 Traces of the steady-state estimation error covariance under different cases
NS    NA    TSE in this paper    TSE in [6]
16    4     0.0374               0.109
16    8     0.0597               0.19832
32    16    0.1531               0.3862
64    32    0.2912               0.8057

It is clear that as the number of attacked sensors increases, the fusion algorithm proposed in this paper shows better detection performance than the algorithm proposed in [6].
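The discretized state matrix quoted in this section can be checked with a short calculation. The sketch assumes the continuous-time state matrix $\begin{bmatrix} 0 & 1 \\ 0 & -\mu/H \end{bmatrix}$, for which the matrix exponential over one sampling period has a simple closed form; the function name `zoh_state_matrix` is hypothetical.

```python
import math

def zoh_state_matrix(mass, friction, period):
    """Zero-order-hold state matrix exp(A_c * period) for A_c = [[0, 1], [0, -friction/mass]]."""
    a = friction / mass
    decay = math.exp(-a * period)
    return [[1.0, (1.0 - decay) / a],
            [0.0, decay]]

# Parameters quoted in the text: H = 0.8, mu = 1, sampling period 0.1 s.
A_d = zoh_state_matrix(mass=0.8, friction=1.0, period=0.1)
```

Rounded to the precision used in the text, the entries of `A_d` agree with the discrete-time matrix given above.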
6 Conclusion

This article proposed a novel detection and fusion algorithm for the case in which some of the sensors are attacked and the FDI attacks can bypass the distributed K-L divergence detectors but are detected by the centralized K-L divergence detector. The multi-sensor system adopted a sequential Kalman filter at the remote estimator side. Under this system architecture, the K-means algorithm was first utilized to discard the severely biased sensors, and the GMM algorithm was subsequently adopted to further detect the remaining sensors. Based on these, the fusion process was derived in the sequential Kalman filter framework. Then, a more general attack model was formulated, and the evolution of the estimation error covariance of the compromised system under the detection and fusion algorithm was investigated. Finally, the feasibility of the theoretical results was validated by a numerical example of a UGV.
References

1. Park, G., Lee, C., Shim, H., Eun, Y., Johansson, K.H.: Stealthy adversaries against uncertain cyber-physical systems: threat of robust zero-dynamics attack. IEEE Trans. Autom. Control 64(12), 4907–4919 (2019)
2. Li, Y.G., Yang, G.H.: Optimal stealthy innovation-based attack with historical data in cyber physical systems. IEEE Trans. Syst., Man, Cybern. 51(6), 3401–3411 (2021)
3. Shang, J., Chen, T.W.: Optimal stealthy integrity attacks on remote state estimation: the maximum utilization of historical data. Automatica 128, 1–15 (2020)
4. Ren, X.X., Yang, G.H.: Kullback-Leibler divergence-based optimal stealthy sensor attack against networked linear quadratic Gaussian systems. IEEE Trans. Cybern. 52(11), 11539–11548 (2022)
5. Chattopadhyay, A., Mitra, U.: Security against false data injection attack in cyber-physical systems. IEEE Trans. Control Netw. Syst. 7(2), 1015–1027 (2020)
6. Guo, Z.Y., Shi, D.W., Quevedo, D.E., Shi, L.: Secure state estimation against integrity attacks: a Gaussian mixture model approach. IEEE Trans. Signal Process. 67(1), 194–207 (2019)
7. Shang, J., Chen, M., Chen, T.W.: Optimal linear encryption against stealthy attacks on remote state estimation. IEEE Trans. Autom. Control 66(8), 3592–3607 (2020)
8. Willner, D., Chang, C.B., Dunn, K.P.: Kalman filter configurations for multiple radar systems. MIT Lincoln Laboratory (1976)
9. Guo, Z.Y., Shi, D.W., Johansson, K.H., Shi, L.: Worst-case stealthy innovation-based linear attack on remote state estimation. Automatica 89, 117–124 (2018)
10. Li, Y.G., Yang, G.H.: Worst-case ε-stealthy false data injection attacks in cyber-physical systems. Inf. Sci. 515, 352–364 (2020)
11. Bishop, C.M.: Pattern Recognition and Machine Learning. Springer, Berlin, Germany (2006)
12. Shoukry, Y., Tabuada, P.: Event-triggered state observers for sparse sensor noise/attacks. IEEE Trans. Autom. Control 61(8), 2079–2091 (2016)
Research on Surface Quality and Wheel Wear of Internal Thread in High-Speed Grinding

Zhang Zhaojing, Dou Yongqiang, Zhang Rong, Shi Wei, Zheng Jigui, and Zhao Yongsheng
Abstract The planetary roller screw is a new type of precision transmission mechanism whose unique structure makes machining very difficult. In order to ensure the processing quality of the nut in the planetary roller screw and to improve grinding efficiency, the influence of the grinding wheel linear speed, the headstock speed and the grinding depth on the surface quality of the screw thread is studied through single-factor experiments in this paper. The results show that increasing the linear speed of the grinding wheel and reducing the grinding depth are conducive to improving the surface quality. With increasing headstock speed, the surface roughness of the thread first decreases and then increases. In addition, the wear laws of the root fillet radius of the CBN grinding wheel and of the K value under different linear speeds are studied, and the dressing cycle of the grinding wheel for high-speed internal thread grinding is determined.

Keywords Planetary roller screw · High-speed thread grinding · Surface roughness · Grinding wheel wear
Z. Zhaojing · D. Yongqiang · Z. Rong · S. Wei · Z. Jigui
Beijing Research Institute Precision Mechatronics and Controls, Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China

Z. Zhaojing (B) · Z. Yongsheng
Beijing University of Technology, Institute of Advanced Manufacturing and Intelligent Technology, Beijing 100124, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_13

1 Introduction

The planetary roller screw is a high-performance precision transmission mechanism that converts rotary motion into linear motion. It is mainly composed of the lead screw, rollers, nut, holder, inner gear ring and other components. Compared with the ball screw, the planetary roller screw has obvious advantages in precision, load-bearing capacity, volume, weight, etc., owing to the larger contact area between components during transmission [1]. Research shows that the process level of thread grinding directly affects the transmission performance of the planetary roller screw pair [2, 3]. However, there are many difficulties in the precision manufacturing of high-precision, small-pitch, multi-start internal threads. How to realize high-efficiency precision machining of internal threads has always been the main objective of research on the internal thread grinding process.

In grinding, a process is generally called high-speed grinding when the linear speed of the grinding wheel is 45–150 m/s. Compared with ordinary grinding, high-speed grinding shows great advantages, such as improving processing efficiency, reducing grinding heat and prolonging the service life of the grinding wheel. At the same time, high-speed grinding can greatly reduce the material volume removed and the grinding force borne by a single abrasive grain, making the ground surface more uniform and thus improving the processing quality of the workpiece [4].

At present, research related to high-speed grinding focuses on plane grinding and on internal and external cylindrical grinding. Lin et al. analyzed the influence of the speed effect on the critical chip thickness by carrying out single-grain grinding tests on a plane made of superalloy GH4169 [5]. Xueying et al. studied the grinding process parameters of the carburized and quenched gear steel 18CrNiMo7-6 based on ultra-high-speed plane grinding, and analyzed the influence of grinding wheel linear speed, workbench speed, grinding depth and grinding wheel grain size on the three-dimensional roughness of the workpiece surface. Their results show that the grinding wheel grain size has the largest influence on the surface roughness, while the influence of the workbench speed can be ignored [6].
Based on experimental research on the grinding force and ground surface quality in internal grinding of ceramic bearing outer rings on a CNC large external compound grinding machine, Hao Huiling et al. studied the influence of grinding wheel speed, workpiece speed and feed on the grinding force with a three-way rotation dynamometer using the single-factor experiment method, and the results show that the feed has the largest influence on the grinding force [7]. Songhua et al. studied the influence of grinding depth, grinding wheel linear speed and workpiece feed speed on the channel surface roughness by single-factor experiments on channel grinding of oxide ceramics. The results show that the feed has the greatest influence on the surface quality. Meanwhile, a group of optimized process parameter combinations is given based on orthogonal experimental results [8, 9]. Yuewei from Donghua University carried out high-speed continuous grinding experiments on carburized steel 20CrMnTi on a high-speed external grinding machine, characterized the degree of grinding wheel wear by the accumulated grinding volume, and studied its influence on the surface roughness, residual stress, metallographic structure and surface hardness of the workpiece [10].

There are relatively few experimental studies on the grinding force model and grinding process for internal thread grinding [11]. Based on the known unit grinding force, Cui et al. calculated the effective grinding area on the basis of the abrasive grain overlap effect, and established an internal thread grinding force model considering material deformation and the friction coefficient through integration [12]. Chao et al. studied different dressing parameters of a CBN grinding wheel for small-pitch threads through orthogonal tests, and the results show that the dressing speed ratio has the greatest impact on the thread profile [13].
This article presents an adaptive renovation of the matrix-6900 internal thread grinding machine, which significantly increases the grinding wheel speed during internal thread grinding in order to reduce the grinding force and improve the surface quality of the thread. The influence of grinding wheel linear speed, headstock speed and grinding depth on thread surface quality is studied.
2 Adaptability Renovation of Internal Thread Grinding Machine

After adaptive renovation, the UK matrix-6900 CNC internal thread grinding machine shown in Fig. 1a can reach a grinding wheel linear speed of 80 m/s. Factors that affect the machining accuracy of the machine, such as resonance and abnormal thermal deformation, are significantly reduced, so that the machine meets the accuracy requirements of high-precision planetary roller screw internal threads. A comparison before and after the renovation is shown in Table 1.
Fig. 1 Internal thread grinding machine and motor spindle: (a) matrix-6900; (b) high-speed electric spindle

Table 1 Comparison before and after reconstruction
Renovation project                     Before renovation    After renovation
Thread outer diameter and length       φ160 × 120 mm        φ160 × 120 mm
Minimum pitch (mm)                     1                    0.25
Allowable grinding wheel width (mm)    8–12                 8–15
Maximum spindle speed (rpm)            13,600               60,000
Total spindle power (kW)               7.1                  8
Feed axis accuracy (μm)                3                    0.5
Thread grinding accuracy               G5                   G2
Surface roughness (μm)                 Ra0.8                Ra0.4
The specific measures are as follows:

(1) The original manufacturing accuracy of the machine tool is restored, in particular by improving the accuracy of the master screw and the workbench guide rail.
(2) Calibration devices are used, or the number of teeth in the exchange gear is changed appropriately, to compensate for the thermal elongation of the workpiece caused by grinding heat, the thermal elongation of the master screw caused by friction heat, and the wear of the grinding wheel.
(3) A coolant pressure pump system with a pressure of 5–10 kg is added for sufficient cooling. At the same time, the coolant type, flow rate, temperature and nozzle arrangement are selected appropriately to limit the temperature rise of the workpiece.
(4) A diamond roller is used for grinding wheel dressing. It is driven by a separate motor through a flexible shaft. The motor power is 0.4 kW and the rotating speed is 1400 rpm, and a special sealed protective cover is provided to prevent splashing of cooling oil. During dressing, the grinding wheel runs at high speed and the diamond roller is fed evenly by a small amount in each pass.
(5) The original mechanical spindle is replaced with a high-speed electric spindle, as shown in Fig. 1b. The frequency of the electric spindle is 60–2000 Hz and the maximum speed is 60,000 rpm, so that the linear speed of the grinding wheel reaches 80 m/s.
(6) On the premise that the accuracy of the machine tool has been restored, a grating ruler with a resolution of 0.5 μm is used as the reference and feedback for the headstock and cross feed to realize high-precision motion control.
(7) A carbide extension rod is used to improve the rigidity of the extension rod. The threads connecting the extension rod and the main shaft are matched to ensure the coaxiality of the extension rod and the main shaft. The outer circle run-out of the extension rod is controlled within 10 μm, and the dynamic balance of the extension rod is improved.
(8) A six-jaw chuck is used to clamp the workpiece, and the end-face run-out of the workpiece is controlled within 2–4 μm by fine adjustment before processing.
3 Grinding Experiment

In order to compare the surface quality of the workpiece after the grinding wheel rotating speed is increased, internal thread grinding is performed on sample pieces through orthogonal tests, and the optimized process parameters of high-speed internal thread grinding are determined by inspecting the surface quality.
Fig. 2 Grinding wheel dressing
3.1 Test Conditions

(a) Test environment: temperature of 22 ± 1 °C; relative humidity of 56%; local atmospheric pressure.
(b) Grinding wheel: a CBN grinding wheel with ceramic binder is selected, with a diameter of 40 mm and a grain size of 60 mesh.
(c) Test workpiece: GCr15 bearing steel, with an inner diameter of 50 mm, a length of 50 mm, an outer circle cylindricity of 0.001 mm and an inner hole coaxiality of 0.002 mm.
(d) Thread inspection equipment: Mahr XR20 surface roughness tester from Mahr, Germany.
(e) Grinding wheel dressing method: a diamond roller with CNC online dressing is selected, operating in the forward direction. The dressing speed ratio is 0.5 (the grinding wheel linear speed is twice the roller linear speed), and the grinding wheel is finally dressed twice without feed. The dressing process is shown in Fig. 2.
3.2 Test Content

The grinding process parameters that affect the surface quality of the internal thread mainly include the grinding wheel linear speed, the headstock rotating speed (workpiece rotating speed) and the grinding depth. The single-factor analysis method is adopted to study the influence of the grinding process parameters on thread surface quality, and surface roughness is selected as the evaluation index of surface quality. Firstly, grinding tests were carried out at a headstock rotating speed of 3 rpm and a grinding depth of 0.03 mm with grinding wheel linear speeds of 35, 60 and 80 m/s, respectively, to study the influence of the grinding wheel linear speed on thread surface quality.

Then, the influence of headstock rotating speed and grinding depth on thread surface quality was studied under the high-speed grinding condition with a grinding wheel linear speed of 80 m/s. The test factors and levels are shown in Table 2. The grinding depth range is 0.01–0.05 mm, and the headstock speed range is 1–5 rpm. A total of 25 groups of tests are conducted.

Table 2 Test factor level of grinding surface quality
Factor                   Parameter
Grinding depth /mm       0.01, 0.02, 0.03, 0.04, 0.05
Headstock speed /rpm     1, 2, 3, 4, 5

Under the condition that the headstock speed is 3 rpm and the total grinding depth is 0.05 mm, 16 internal thread lines are machined at linear speeds of 40 m/s and 80 m/s, respectively, and the grinding test is carried out to study the wear law of the CBN grinding wheel tip arc radius and of the K value.
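The spindle speed needed for each tested linear speed follows from the relation v = π d n / 60 between wheel diameter d, spindle speed n and linear speed v. The sketch below applies it to the 40 mm wheel of Sect. 3.1; the function name `spindle_rpm` is hypothetical, and taking the nominal wheel diameter as the effective diameter is an assumption.

```python
import math

def spindle_rpm(linear_speed_m_s, wheel_diameter_mm):
    """Spindle speed (rpm) that gives the requested wheel linear speed."""
    circumference_m = math.pi * wheel_diameter_mm / 1000.0
    return linear_speed_m_s * 60.0 / circumference_m

# Linear speeds used in the tests, for the 40 mm CBN wheel of Sect. 3.1.
rpm_by_speed = {v: spindle_rpm(v, 40.0) for v in (35.0, 60.0, 80.0)}
```

Under this assumption, 80 m/s corresponds to roughly 38,200 rpm, which lies within the 60,000 rpm maximum spindle speed listed in Table 1.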
4 Analysis of Test Results
Table 3 gives the test results for different grinding wheel linear speeds at a headstock speed of 3 rpm and a grinding depth of 0.03 mm. As the wheel linear speed increases, the thread surface roughness drops markedly: a higher wheel speed reduces the maximum undeformed chip thickness of a single abrasive grain, so each grain cuts less deeply into the workpiece surface and the roughness falls. Therefore, within the tested range, the higher the wheel linear speed, the better the ground surface quality. At 80 m/s, with a headstock speed of 3 rpm and a cutting depth of 0.03 mm, the thread surface roughness reaches 0.39 μm, nearly half the value obtained at 35 m/s, which indicates that raising the wheel linear speed is an effective way to improve thread surface quality.

The thread surface roughness results at a wheel linear speed of 80 m/s are shown in Table 4. The grinding pattern of the thread surface at different grinding depths, for a headstock speed of 3 rpm, is shown in Fig. 3.

Table 3 Test results of thread surface roughness under different linear velocities
Linear speed of grinding wheel (m/s)    Surface roughness (μm)
35                                      0.76
60                                      0.51
80                                      0.39
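The size of the improvement in Table 3 can be checked with a few lines of arithmetic. This is only an illustrative calculation on the three tabulated values; the function name is hypothetical.

```python
# Values copied from Table 3 (headstock speed 3 rpm, grinding depth 0.03 mm).
ROUGHNESS_UM = {35: 0.76, 60: 0.51, 80: 0.39}


def relative_reduction(baseline_speed, target_speed, table=ROUGHNESS_UM):
    """Fractional drop in roughness when moving from baseline to target speed."""
    baseline = table[baseline_speed]
    return (baseline - table[target_speed]) / baseline


for speed in (60, 80):
    print(f"35 -> {speed} m/s: {relative_reduction(35, speed):.1%} lower roughness")
```

Going from 35 m/s to 80 m/s lowers the measured roughness by a little under half.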
Table 4 Test results of thread surface roughness (μm)
Headstock speed /rpm    Grinding depth /mm
                        0.01   0.02   0.03   0.04   0.05
1                       0.52   0.54   0.55   0.58   0.62
2                       0.47   0.50   0.51   0.57   0.58
3                       0.35   0.37   0.39   0.42   0.48
4                       0.44   0.48   0.48   0.51   0.56
5                       0.66   0.63   0.73   0.76   0.78
Fig. 3 Grinding grain of thread surface
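As a rough aid for reading Table 4, the wheel-to-workpiece linear velocity ratio at each headstock speed can be estimated as below. This is a sketch, not the authors' calculation: it assumes the workpiece linear speed is taken at the 50 mm inner diameter listed in Sect. 3.1 (the actual ground thread diameter is not stated in this section), and the function names are illustrative.

```python
import math

WHEEL_SPEED_M_S = 80.0        # wheel linear speed used for Table 4
WORKPIECE_DIAMETER_MM = 50.0  # ASSUMPTION: ground diameter taken as the 50 mm inner diameter


def workpiece_speed_m_s(headstock_rpm, diameter_mm=WORKPIECE_DIAMETER_MM):
    """Surface speed of the workpiece: v_w = pi * d * n / 60."""
    return math.pi * (diameter_mm / 1000.0) * headstock_rpm / 60.0


def velocity_ratio(headstock_rpm, wheel_speed_m_s=WHEEL_SPEED_M_S):
    """Ratio of wheel linear speed to workpiece linear speed."""
    return wheel_speed_m_s / workpiece_speed_m_s(headstock_rpm)


for rpm in (1, 2, 3, 4, 5):
    print(f"{rpm} rpm -> ratio {velocity_ratio(rpm):,.0f}")
```

Under this assumption the ratio is roughly 10,200 at 3 rpm and 7,600 at 4 rpm, so a ratio in the 8300–8500 range would fall between those two headstock speeds, which is where Table 4 shows the lowest roughness.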
From the thread surface roughness test results:
(a) At the same headstock speed, the surface roughness increases with the finishing grinding depth. With the wheel linear speed fixed, a larger cutting depth increases the chip thickness per abrasive grain, so the grinding marks on the workpiece surface become deeper. At the same time, the grinding force rises with the cutting depth, and the surface roughness deteriorates. Overall, however, the increase is relatively small, basically no more than 0.15 μm.
(b) At the same grinding depth, the thread surface roughness first decreases and then increases as the headstock speed rises. The reason is that there is an optimal match between the grinding wheel linear speed and the workpiece linear speed. The tests indicate that the optimum ratio of wheel linear velocity to workpiece linear velocity in high-speed grinding of internal thread grooves is around 8300–8500. Above 8500, the larger the ratio, the more the workpiece surface is affected by grinding vibration and the worse the roughness becomes; below 8300, the roughness also deteriorates.
Fig. 4 The radius of rounded root and K value of grinding wheel
Fig. 5 The profile of grinding wheel
(c) At headstock speeds of 3–4 rpm the thread roughness is clearly lower than at other headstock speeds; at 3 rpm with a finishing depth of 0.01 mm it reaches the minimum value of 0.35 μm, achieving high-precision machining.

The planetary roller screw is characterized by small lead, high precision and strong bearing capacity, which places extremely high demands on the thread root arc and K value. In planetary roller screws with small pitch, the root arc radius is generally less than 0.03 mm and the K value less than 0.01 mm. The thread root arc radius R determines the contact height of the whole thread and strongly affects the bearing capacity of small-pitch threads. It is therefore necessary to establish the wear law of the grinding wheel tip arc and K value in high-speed grinding, so as to lay a foundation for process optimization and parameter selection. The radius and K value of the grinding wheel tooth tip arc are defined as shown in Fig. 4.

At a grinding wheel linear speed of 80 m/s, a headstock speed of 3 rpm and a total feed of 0.05 mm (applied in two passes), 16 thread lines were machined. The contour of the grinding wheel during grinding is shown in Fig. 5. The wear of the grinding wheel was measured after each thread line; see Table 5 for the test results. The test was then repeated with the wheel linear speed changed to 40 m/s and the other conditions unchanged. The results at 40 m/s and 80 m/s are summarized in Fig. 6.
(a) At the same grinding wheel linear speed, the wheel tip arc radius and the wheel tip K value increase with the total grinding volume.
Table 5 Test results of the radius of rounded root and K value of grinding wheel
No.   Total grinding amount /mm³   Tooth tip radius /mm   K /mm
1     0.00                         0.011                  0.0050
2     6.55                         0.012                  0.0055
3     13.10                        0.026                  0.0085
4     19.65                        0.028                  0.0090
5     26.20                        0.031                  0.0100
6     32.75                        0.042                  0.0140
7     39.30                        0.042                  0.0140
8     45.85                        0.042                  0.0140
9     52.40                        0.042                  0.0140
10    58.95                        0.042                  0.0140
11    65.50                        0.042                  0.0140
12    72.05                        0.045                  0.0150
13    78.60                        0.045                  0.0150
14    85.15                        0.046                  0.0160
15    91.70                        0.048                  0.0160
16    98.25                        0.049                  0.0160
17    104.8                        0.050                  0.0160

Fig. 6 The profile of grinding wheel
(b) Wheel wear at a linear speed of 80 m/s is clearly lower than at 40 m/s. At 80 m/s, the tooth tip radius of the grinding wheel changes from 0.011 to 0.050 mm and the K value from 0.005 to 0.016 mm; at 40 m/s, the tip radius changes from 0.016 to 0.073 mm and the K value from 0.003 to 0.028 mm.
(c) At 80 m/s, once the total grinding amount reaches 32.75 mm³ the wheel tip enters a fine-wear state (basically no macroscopic dimensional change), which lasts until the total grinding amount reaches 65.50 mm³, after which the wheel wears again. At 40 and 80 m/s the wheel wear is basically the same up to a total grinding amount of 32.75 mm³. Beyond 65.50 mm³ the advantage of grinding at 80 m/s is obvious: the grinding wheel is basically not worn, and its durability is significantly improved.
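The fine-wear stage described in (c) can be read directly from the Table 5 data. The sketch below, with an illustrative helper name, finds the longest run of grinding amounts over which the measured tip radius does not change.

```python
# Values copied from Table 5 (wheel linear speed 80 m/s).
GRINDING_AMOUNT_MM3 = [0.00, 6.55, 13.10, 19.65, 26.20, 32.75, 39.30, 45.85, 52.40,
                       58.95, 65.50, 72.05, 78.60, 85.15, 91.70, 98.25, 104.8]
TIP_RADIUS_MM = [0.011, 0.012, 0.026, 0.028, 0.031, 0.042, 0.042, 0.042, 0.042,
                 0.042, 0.042, 0.045, 0.045, 0.046, 0.048, 0.049, 0.050]


def longest_plateau(amounts, values, tol=1e-9):
    """Return (start_amount, end_amount) of the longest run of unchanged values."""
    best = (0, 0)
    start = 0
    for i in range(1, len(values) + 1):
        # A run ends at the last sample or when the value differs from the run's start.
        if i == len(values) or abs(values[i] - values[start]) > tol:
            if (i - 1) - start > best[1] - best[0]:
                best = (start, i - 1)
            start = i
    return amounts[best[0]], amounts[best[1]]


print(longest_plateau(GRINDING_AMOUNT_MM3, TIP_RADIUS_MM))  # (32.75, 65.5)
```

The plateau runs from 32.75 mm³ to 65.50 mm³, matching the fine-wear interval stated in the text.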
5 Conclusion
This paper studies the influence of different grinding parameters on the surface quality of internal threads and the wear law of the grinding wheel. Through internal thread grinding tests, the influence of grinding wheel linear speed, headstock rotating speed and grinding depth on the surface quality of the internal thread is compared and analyzed, and the wear law of the CBN grinding wheel tip arc radius and K value at different linear speeds is established. The main conclusions are as follows:
(1) Increasing the grinding wheel linear speed and reducing the grinding depth both improve the surface quality. At a fixed wheel linear speed, the thread surface roughness first decreases and then increases as the headstock speed rises.
(2) At a fixed wheel linear speed, the wheel tip arc radius and K value increase with the total amount of grinding. Increasing the wheel linear speed reduces wheel wear, and this advantage becomes more obvious as the total amount of grinding increases.
References
1. Huang, Y., Li, J., Zhu, C.: Aerospace Electromechanical Servo System. China Power Press (2013)
2. Zheng, W., Zu, L., Wang, K.: Experiment research on influence factors of travel error of planetary roller screw pair. Chinese J. Scient. Instrum. 42(09), 214–224 (2021)
3. Wu, H., Wei, P., Cai, L., et al.: Optimization method of planetary roller screw tolerance matching based on machining error sensitivity analysis and fuzzy analytic hierarchy process. Appl. Ultrahigh Speed Grinding Technol. Field of the Mech. Machining 33(22), 2693–2703 (2022)
4. Zhao, H., Feng, B., Gao, G., et al.: Application of ultra-high speed grinding technologies in the field of the mechanical machining. J. Northeastern Univer. (Natural Science) 06, 564–568 (2013)
5. Tian, L., Fu, Y., Yang, L., et al.: Investigations of the "speed effect" on critical thickness of chip formation and grinding force in high speed and ultra-high speed grinding of superalloy. J. Mech. Eng. 49(09), 169–177 (2013)
6. Sha, X., Wang, D., Chen, G.: Experimental study on three dimensional surface roughness of 18CrNiMo7-6 by high speed precision grinding. Mach. Design and Manuf. 01, 92–95 (2020)
7. Hao, H.: Research on grinding force and surface quality of ceramic bearing outer ring. Silicate Bulletin 39(12), 3985–3990 (2020)
8. Li, S., Han, G., Sun, J., et al.: Research on surface quality of ZrO2 ceramic used in grinding bearing with diamond wheel. Diamond and Abrasive Eng. 39(06), 75–81 (2019)
9. Li, S., Han, G., Sun, J.: Study on the surface quality of zirconia ceramic raceway grinding. Silicate Bulletin 39(04), 1260–1265 (2020)
10. Zhu, Y., Zhang, J., Zheng, X., et al.: Study on the influence of high speed grinding wheel wear on grinding surface quality. Modul. Mach. Tool Autom. Manuf. Technique. 03, 138–141 (2015)
11. Li, S., Wu, G., An, Y., et al.: Review on precision grinding of internal threads. Aeronaut. Manuf. Technol. 64(7), 72–80 (2021)
12. Fang, C., Yang, C., Cai, L., et al.: Predictive modeling of grinding force in the inner thread grinding considering the effect of grains overlapping. The Int. J. Adv. Manuf. Technol. 104(1–4), 943–956 (2019)
13. Dong, C., Wang, S., Ding, Z., et al.: Shaping technology of vitrified bond CBN wheels for small pitch thread grinding. Missiles and Space Vehicles 5, 93–97 (2018)
Fast Video Object Segmentation Network Based on Multi-scale Attention Feature Fusion
Fan Zhou, Chaoli Wang, and Zhanquan Sun
Abstract The semi-supervised video object segmentation task is to segment the object of interest in the subsequent frames of a video sequence when only the information of the first frame is known. Very little information is available, targets move fast, and similar objects may exist in the background, which makes video object segmentation with both speed and accuracy a challenging problem. Current video object segmentation algorithms mostly use fixed-size convolutional kernels, which can lose some global information and make it difficult to capture long-range information. In addition, the features from the encoder and decoder subnets differ greatly in semantics at the skip connections, resulting in poor segmentation accuracy and speed. In this paper, we propose a video object segmentation network based on multi-scale attentional feature fusion. Firstly, a multi-scale feature information extraction module is constructed to retain multi-scale global context information in the encoder structure. Secondly, to enhance the robustness of the network, the feature maps generated by the feature extractor are fed into a two-layer convolution trained online to obtain the coarse mask. Then, a feature attention fusion module is constructed to fully fuse the information of two adjacent feature layers in the decoder structure. Finally, to reach the best segmentation accuracy, an output optimization module is proposed, which takes the multi-scale features, the coarse mask and the decoder output as input and progressively performs channel information enhancement followed by position information enhancement, eventually outputting more accurate segmentation results. The algorithm has strong contour capture and foreground recognition capability while maintaining relatively fast segmentation speed and superior segmentation accuracy.
Experiments against a variety of existing advanced video object segmentation algorithms on the single-object dataset DAVIS2016 and the multi-object dataset DAVIS2017 show that, at comparable segmentation speed, the segmentation accuracy of the proposed algorithm improves by 2.5 and 1.3 percentage points, respectively, in the comprehensive evaluation index J&F. Compared with other algorithms, the proposed algorithm still achieves better results in terms of segmentation accuracy and segmentation speed.

Keywords Semi-supervised video object segmentation · Multi-scale · Attention

F. Zhou · C. Wang (B) · Z. Sun, University of Shanghai for Science and Technology, Shanghai 200093, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_14
1 Introduction
Semi-supervised video object segmentation is the task of segmenting the object of interest frame by frame in the subsequent frames of a video sequence, given the sequence and the annotated segmentation mask of its first frame. The task is widely used in video editing, video analysis and related fields. However, accurate and fast automatic segmentation of targets of interest remains extremely difficult, because very little information is available and the target may disappear, change greatly in appearance, or move quickly.

Recently, several deep learning-based works have achieved good results on this task. They can be classified as online learning-based methods, propagation-based methods, matching-based methods and hybrid methods.

Online learning-based methods pre-train the network on a large dataset and then fine-tune the network parameters online using the first frame and its annotation. The segmentation results for the object of interest are then obtained by feeding the subsequent frames to the network. OSVOS [1] fine-tunes a pre-trained network online with the first frame of the video and its annotation to segment the target in subsequent frames. OnAVOS [26] adds online adaptation to address the problem that OSVOS, which fine-tunes the test network on the first frame only, cannot adapt to large changes in target shape. OSVOS-S [7] adds instance-level semantic information to OSVOS, greatly improving segmentation accuracy.
Because online learning-based methods ignore temporal consistency and segment each subsequent frame individually, they can to some extent avoid the case where subsequent frames are hard to segment accurately after the target disappears in the current frame. However, the relatively high time cost of online fine-tuning makes fast segmentation of targets in video sequences difficult.

Propagation-based methods resemble video compression coding, in which the video information is considered redundant and smooth, so a specific frame is taken and only the moving part is encoded to produce the next frame. These methods take temporal consistency into account and assume that adjacent frames are highly similar, so the prediction for the previous frame, i.e. its mask, is used as an important basis for predicting the current frame. MaskTrack [14] segments the current-frame mask from the previous-frame mask and the current-frame image with a convolutional network on a per-frame basis. RGMP [9] obtains the current-frame mask by concatenating the previous-frame mask with the first-frame information and passing the result through convolution. OSNM [25] uses spatio-temporal information to guide the middle layers of the segmentation network, which allows it to segment arbitrary targets. Because the mask is propagated frame by frame, propagation-based methods adapt to target deformation; but if the target disappears from the previous frame, it is difficult to segment it in the current frame, and errors accumulate during propagation, reducing network robustness.

Matching-based methods match the first frame of the video sequence with the current frame and use the first-frame annotation, i.e. the object of interest, to select the parts with high matching similarity; these are regarded as the region where the object of interest lies in the current frame, which yields the segmentation of the current frame. VideoMatch [17] performs soft matching between a template extracted by convolution from the first frame and features extracted by convolution from the current frame to obtain the current-frame mask. FEELVOS [16] uses global and local pixel matching to achieve more stable pixel-level matching. Although matching-based methods do not suffer from the reduced robustness caused by propagation, they lose temporal information, resulting in poor segmentation accuracy.

Hybrid methods usually combine two or more of the above approaches, making video segmentation more accurate and faster by combining their advantages.
RANET [21] builds a video object segmentation network that combines matching- and propagation-based methods: it computes the correlation between the first-frame feature map and the current-frame feature map, and then fuses the previous-frame mask to generate the current-frame mask. STM [27] matches the information of all frames before the current frame with the current-frame information. To reduce the large computational effort introduced by STM, TAB [19] MRP [5] proposed to use the historical motion trajectory of the target region as prior knowledge, i.e. to predict the current frame using the previous-frame mask as the reference frame, and to perform a similarity calculation to decide whether the prediction result can serve as the reference frame when predicting the next frame. However, hybrid methods usually struggle to balance segmentation accuracy and speed in practical applications.

FTRM [8] is a two-stage segmentation network. It first constructs a coarse segmentation network and then a refined segmentation network, and builds a Memory into which the masks output by the segmentation network are injected to store the information of a number of frames, which in turn guides the update of the coarse segmentation network. The algorithm performs well in both segmentation accuracy and segmentation speed. However, its fine segmentation network has a simple structure that does not fully exploit the coarse segmentation mask information, so there is considerable room for optimisation. Moreover, because the fine segmentation network uses offline-trained parameters during inference, without parameter updates, experiments show that it takes up very little of the inference time. With such a network structure, a higher segmentation accuracy can therefore be pursued at comparable segmentation speed.

For these reasons, a multi-scale attentional video object segmentation network is proposed in this paper. The network uses FTRM as the baseline. It first mines the effective information in the coarse segmentation mask by combining a multi-scale feature information extraction module with a coarse mask segmentation module, then re-mines the features with a Feature attention fusion module, and finally builds an output optimization module to achieve accurate target segmentation of the current frame.
2 Methodology
To address the poor robustness of the asymptotic target segmentation network and its inherently high demand for training data caused by the large number of training parameters, Andreas et al. propose the FTRM. It consists of two main components: a target network with two convolutional layers for coarse mask extraction, and a fine segmentation network for coarse mask refinement. The target network is trained online with only two convolutional layers to adapt to changes in the video target information, which greatly improves the robustness of the model. The fine segmentation network is designed to re-extract coarse mask features, so its parameters can be obtained by offline training, which greatly reduces the complexity of inference while ensuring segmentation accuracy and improves the speed of video object segmentation. However, the fine segmentation network in this model is too simple: the coarse mask is merely concatenated with the features from the feature extractor, so the segmentation accuracy of the model still has room for optimization.

This paper uses the FTRM as a baseline and retains its simple and robust two-layer convolutional target network for generating the coarse mask. On top of this, it adds the Multi-scale feature extraction module, the Feature attention fusion module and the Output optimisation module. In this network, the feature extractor is a ResNet101 [3] network pre-trained on the ImageNet [13] dataset. The feature extractor processes the current frame image layer by layer to obtain progressively deeper semantic information, and the coarse mask is then obtained by a two-layer convolutional network trained online to adapt to changes in the video target information, i.e. the coarse mask segmentation module.
The output optimization module then produces the current frame mask and passes the current frame information into the Memory to guide the coarse mask segmentation module. The overall block diagram of the network is shown in Fig. 1.

Fig. 1 Overall network framework diagram

Compared with the FTRM, this paper pays more attention to the level-by-level features extracted by the feature extractor and to the coarse masks generated by the coarse mask segmentation module. It therefore uses the Multi-scale feature extraction module to reuse the features extracted at each level of the feature extractor, and then uses the Feature attention fusion module to fuse the features processed by the decoder at each level, improving the network's utilization of effective feature information. Finally, output optimization is carried out to effectively improve the video object segmentation accuracy.
2.1 Multi-scale Feature Extraction Module
In recent years, convolutional neural networks and other components built around an encoder-decoder structure have become widely popular in medical image segmentation and have achieved excellent results. Convolutional examples include UNet [12] and its variants, such as Unext [15] and MultiResUNet [4]; designs originating in the NLP field have also been built on the U-shaped encoder-decoder framework, e.g. TransFuse [20] and U-Net Transformer [28]. These networks have achieved good performance, and FTRM therefore uses the encoder-decoder structure for the video object segmentation task. However, the convolution operation is inherently limited by the size of the convolution kernel: although it extracts local features effectively, it inevitably loses global information and has difficulty capturing long-range information. In addition, the features from the encoder and decoder subnets differ significantly in semantics at the skip connections [22], and the chained encoder-decoder structure underutilises global information at different scales.

For these reasons, this paper establishes a Multi-scale feature extraction module, which concatenates the features from all levels of the feature extractor at multiple scales and fuses them deeply with the feature map output by the encoder and the coarse mask information, preserving multi-scale global information to a certain extent. Because contextual information at different scales is used, the inability of the convolution operation to capture global information is mitigated and small targets in the video are captured effectively, as verified in the subsequent ablation experiments. The multi-scale feature extraction module is shown in Fig. 2.

Fig. 2 Multi-scale feature extraction module

The module resizes the feature maps from the feature extractor as needed and then concatenates them. The first layer of features output by the feature extractor is not used: its large size would generate a large number of parameters in subsequent operations and slow down inference, and it contains only low-level features and a large amount of noise, so it would harm both segmentation accuracy and segmentation speed.
2.2 Feature Attention Fusion Module
Shallow features have higher resolution and contain more location and detail information, which enables target contour capture. Deeper features have lower resolution but contain stronger semantic information for target recognition. Effective fusion of multi-scale features therefore benefits segmentation, and multi-scale fusion is often used in the decoder structure, connecting encoder features to the decoder level by level through skip connections. The FTRM likewise introduces features into the decoder stage by stage. However, it simply concatenates the decoder-generated features at each level with the corresponding encoder features passed through the skip connection, which makes effective fusion difficult.
Fig. 3 Feature attention fusion module
In this paper, we propose a Feature attention fusion module. Firstly, to reduce the number of training parameters, the feature map Fi (obtained from the encoder-generated features of each level through the coarse segmentation module) and the upper-level feature map Fj generated by the decoder are each passed through global average pooling, giving two feature maps of size 1 × 1 × C, where C is the number of channels. Secondly, the two feature maps are concatenated into a feature map of size 1 × 1 × 2C, which is passed through a multi-layer perceptron to obtain a feature map of size 1 × 1 × C. This step can be regarded as discriminating channel importance and assigning a weight to each channel. Finally, this weight map is multiplied with the feature map Fi to obtain the new feature map Fk, which has the same size as Fi. The module thus fuses the feature information through channel attention and filters out the valid information, similar to assigning a weight to each channel, which strengthens the extraction of feature information. The Feature attention fusion module is shown in Fig. 3, in which Hi, Wi, Hj, Wj and Hk, Wk are the heights and widths of the feature maps Fi, Fj and Fk, respectively. Since the module is a component of the decoder structure, it is applied four times in the network.
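The pooling, concatenation, weighting and multiplication steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: feature maps are nested lists of shape C × H × W, a single linear layer stands in for the multi-layer perceptron, a sigmoid is assumed as the weight activation, and all names are hypothetical.

```python
import math
import random


def global_avg_pool(fmap):
    """fmap is a list of C channels, each an H x W list of lists; returns C means."""
    return [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in fmap]


def channel_weights(f_i, f_j, mlp_weight, mlp_bias):
    """Pool both maps, concatenate to length 2C, map to C weights in (0, 1)."""
    pooled = global_avg_pool(f_i) + global_avg_pool(f_j)  # length 2C
    logits = [sum(w * x for w, x in zip(row, pooled)) + b
              for row, b in zip(mlp_weight, mlp_bias)]     # length C
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]    # sigmoid (ASSUMED)


def fuse(f_i, f_j, mlp_weight, mlp_bias):
    """Scale each channel of F_i by its weight; the output F_k has F_i's shape."""
    weights = channel_weights(f_i, f_j, mlp_weight, mlp_bias)
    return [[[w * v for v in row] for row in ch] for w, ch in zip(weights, f_i)]


random.seed(0)
C = 3
f_i = [[[random.random() for _ in range(4)] for _ in range(4)] for _ in range(C)]
f_j = [[[random.random() for _ in range(2)] for _ in range(2)] for _ in range(C)]
weight = [[random.uniform(-1, 1) for _ in range(2 * C)] for _ in range(C)]  # C x 2C
bias = [0.0] * C
f_k = fuse(f_i, f_j, weight, bias)
print(len(f_k), len(f_k[0]), len(f_k[0][0]))  # 3 4 4
```

Because both inputs are reduced to one value per channel before the weights are computed, Fi and Fj may have different spatial sizes, and the fused map keeps the height and width of Fi.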
2.3 Output Optimisation Module
In semi-supervised video segmentation tasks, only the initial frame, i.e. the first frame of the video sequence, and its annotation are provided in the test phase. If the segmentation target is similar to the image background or its appearance changes greatly, it is difficult to obtain a high-precision mask. The segmentation network therefore needs to make full use of the semantic information contained in the deep layers and the detailed information, such as appearance and position, contained in the shallow layers to discriminate accurately between target and background, and it needs to memorise past frames to adapt to changes in the appearance of the target. FTRM retains the information of past frames and guides the target network in coarse mask segmentation by building a Memory. However, its use of the deep and shallow information of video images is limited, because the segmentation result is obtained only by convolving the decoder output. This paper therefore keeps the Memory proposed by FTRM for retaining past-frame information, and constructs an Output optimisation module to fully exploit the deep and shallow image information.

The Output optimisation module proposed in this paper is designed to maximise the use of the deep and shallow information of the image and thereby ensure segmentation accuracy. Its inputs are the coarse mask Fn generated by the coarse mask segmentation module, the deep and shallow features Fm extracted by the multi-scale feature extraction module, and the result Ft generated by the decoder. The coarse mask is resized to the appropriate size and then concatenated with Fm and Ft. After this multi-scale feature fusion, channel feature enhancement is achieved with the ECA [18] channel attention module, which has been used to excellent effect in several computer vision tasks.
Afterwards, the feature maps obtained after channel information enhancement by the ECA channel attention module are position-weighted: global average pooling and global maximum pooling are used to obtain two feature maps, each of size Hm × Wm × 1, which are summed element by element and then passed through a two-layer convolution to enhance the fit. As a result, the network pays more attention to the target position information in the feature map. The size is then changed after convolution mapping to obtain the final output. The output optimisation module is shown in Fig. 4. In summary, the module fuses the coarse mask, the multi-scale features and the decoder output, enhances the channel feature information with the channel attention module, enhances the target position information with the spatial attention module, and finally applies feature mapping and dimensional transformation to obtain the segmentation result.
Fig. 4 Output optimisation module
3 Experimental Setup and Analysis
In this paper, the proposed network is compared with several advanced video object segmentation algorithms on the two most important datasets in the field: the DAVIS2016 single-object segmentation dataset [10] and the DAVIS2017 multi-object dataset [11]. The validity of the network is tested on these datasets, the validity of each module is verified by ablation experiments, and all results are evaluated strictly with the J&F metrics given by DAVIS.
3.1 Experimental Software and Hardware Environment
See Table 1.

Table 1 Experimental software and hardware environment

| Environment | Configuration |
|---|---|
| Operating system | Ubuntu 18.04 |
| Processor | Intel(R) Xeon(R) CPU E5-2678 v3 @ 2.50GHz |
| Memory | 24G |
| Video card | NVIDIA RTX3090 |
| Programming language | Python 3.8.8 |
| Deep learning framework | PyTorch 1.11.0 |

3.2 Evaluation Indicators
The evaluation indicators used in this paper are the officially given J&F, where the region similarity J measures how closely the predicted mask approximates the true annotation, and the contour accuracy F measures how well the contour of the predicted mask captures the contour of the target in the image.

$$J = \frac{|M \cap G|}{|M \cup G|} \tag{1}$$

where M is the mask predicted by the network for the current video frame, and G is the true annotation of the current video frame.

$$F = \frac{2PR}{P + R} \tag{2}$$

where P is the precision, i.e. the ratio of the number of correct pixels in the predicted mask contour to the number of pixels in the predicted mask contour, and R is the recall, i.e. the ratio of the number of correct pixels in the predicted mask contour to the number of pixels in the true contour.

$$P = \frac{TP}{TP + FP} \tag{3}$$

where TP is the number of correct pixels in the mask contour and FP is the number of incorrect pixels in the mask contour.

$$R = \frac{TP}{TP + FN} \tag{4}$$

where FN is the number of true contour pixels that were mis-segmented.
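Equations (1) to (4) can be checked on a toy example. The sketch below is illustrative only: the masks and the contour counts are made-up values, the contour counts TP, FP and FN are taken as given rather than obtained by contour matching, and the handling of empty masks is an assumption.

```python
def region_similarity(pred, gt):
    """J = |M ∩ G| / |M ∪ G| for binary masks given as nested lists of 0/1."""
    inter = union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumption: two empty masks are treated as a perfect match.
    return inter / union if union else 1.0

def contour_f_measure(tp, fp, fn):
    """F = 2PR / (P + R) with P = TP / (TP + FP) and R = TP / (TP + FN)."""
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    return 2 * p * r / (p + r) if (p + r) else 0.0

pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
j = region_similarity(pred, gt)            # intersection 2 pixels, union 4 pixels
f = contour_f_measure(tp=8, fp=2, fn=2)    # P = R = 0.8
j_and_f = (j + f) / 2                      # J&F is the mean of the two measures
```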
3.3 Hyperparameter Setting
To ensure the objectivity of the subsequent experimental results, the number of iterations and the optimizer settings in the training phase are kept the same as those of FTRM: 260 iterations, the Adam optimizer, an initial learning rate of 1e-3, and decay rates β1 = 0.9 and β2 […]. As the training process metrics plot shows, the learning rate is reduced to one-tenth of its current value after every 80 iterations.
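The step decay described above can be written as a small function. This is a sketch under the assumption that iterations are counted from zero and that the decay is applied exactly at iterations 80, 160 and 240; the function and parameter names are illustrative.

```python
def learning_rate(iteration, base_lr=1e-3, step=80, gamma=0.1):
    """Learning rate after `iteration` iterations of step decay.

    The rate is multiplied by `gamma` once for every `step` completed
    iterations (assumed zero-based counting).
    """
    return base_lr * gamma ** (iteration // step)

# Learning rate for each of the 260 training iterations.
schedule = [learning_rate(i) for i in range(260)]
```

In PyTorch the same behaviour is normally obtained with a step scheduler attached to the optimizer; the plain function is used here only to make the schedule explicit.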
3.4 Algorithm Comparison
This section further verifies the performance and generalisation capability of the proposed algorithm. To ensure the objectivity of the experiments, two of the most important publicly available datasets for video object segmentation are used: the DAVIS 2016 dataset for single-object segmentation and the DAVIS 2017 dataset for multi-object segmentation. To ensure the authenticity of the comparison, the results of the other algorithms are taken from the original papers or from third-party results.
3.4.1 DAVIS2017 Multi-object Dataset
The DAVIS2017 dataset is a multi-object VOS dataset for instance-level segmentation with 150 video sequences. The training set contains 60 sequences and the validation set 30 sequences; the other 60 sequences are reserved for competition purposes and are not publicly available. This paper therefore uses the training and validation sets to demonstrate the effectiveness of the algorithm. Unlike the DAVIS 2016 dataset, this dataset allows for multiple targets in each video sequence, and each target is labelled as a separate individual. Table 2 shows the results of our algorithm and several advanced semi-supervised video object segmentation algorithms on the DAVIS2017 dataset. Because the inference stage of semi-supervised video object segmentation can only use the information in the first frame of a sequence to segment the target of interest in the subsequent frames, most algorithms enlarge the training data in the training stage so that the model adapts better to video sequence information. Under training data, yv refers to the YouTube-VOS [24] dataset, meaning that the DAVIS2017 and YouTube-VOS training sets are used together; seg means that a pre-trained model is loaded, which is common in semi-supervised segmentation algorithms based on online learning; and synth refers to a synthetic VOS dataset generated from image segmentation datasets.

The experimental results show that the proposed algorithm achieves a segmentation accuracy of 78.2% at 13.54 frames per second, which is 1.3 percentage points more accurate than FTRM and only 1.25 frames per second slower when both run on the RTX3090. Among the remaining algorithms only STM is more accurate: it reaches 81.8%, 3.6 percentage points above the proposed algorithm, but it is limited by a low frame rate of only 6.25 frames per second. In terms of segmentation speed, the proposed algorithm ranks fifth at 13.54 frames per second. Of the four faster algorithms, the two fastest have much lower accuracy, at 65.7% and 50.3%, and the remaining two are faster than the proposed algorithm but less accurate. The networks based on online learning are very slow because they fine-tune on the first frame online during inference, and they struggle to reach even one frame per second. The experimental results therefore show that the proposed algorithm achieves high segmentation accuracy while maintaining a reasonable segmentation speed in multi-object segmentation, which demonstrates its advantage.

Table 2 Segmentation results of different algorithms for the DAVIS2017 dataset

| Algorithm | J&F (%) | FPS | FPS (RTX3090) |
|---|---|---|---|
| Algorithm in this paper | 78.2 | 13.54 | 13.54 |
| FTRM | 76.9 | 21.9 | 14.79 |
| RVOS [2] | 50.3 | 22.7 | – |
| RANet [21] | 65.7 | 30.3 | – |
| AGAME [23] | 70.0 | 14.3 | – |
| STM [27] | 81.8 | 6.25 | – |
| RGMP [9] | 66.7 | 7.7 | – |
| FEELVOS [16] | 71.5 | 2.22 | – |
| OSMN [25] | 54.8 | 7.14 | – |
| PReMVOS [6] | 77.8 | 0.03 | – |
| OSVOS-S [7] | 68.0 | 0.22 | – |
| OnAVOS [26] | 67.9 | 0.08 | – |
| TAB [19] | 71.7 | 0.09 | – |
| MPR [5] | 63.4 | 7.69 | – |

The training-data columns of the original table (yv, seg, synth) are omitted because their per-algorithm marks are not recoverable.
3.4.2 DAVIS 2016 Single Target Dataset
The DAVIS2016 dataset is a single-object VOS dataset for target-level segmentation. It consists of 50 high-quality, full-HD video sequences, 30 for training and 20 for validation, and covers multiple challenges of video object segmentation such as occlusion, motion blur and appearance changes. Each video is densely annotated with a binary mask that distinguishes foreground from background. The experimental results in Table 3 show that the proposed algorithm improves segmentation accuracy by 2.5 percentage points over FTRM and strikes a better balance between accuracy and speed. Compared with the other algorithms, it outperforms RGMP, FEELVOS and MPR in both segmentation accuracy and segmentation speed; its accuracy is slightly lower than that of the online-learning-based methods, but it holds a significant lead in segmentation speed.

Table 3 Segmentation results of different algorithms for the DAVIS2016 dataset

| Algorithm | DAVIS 2016 J&F (%) | FPS | FPS (RTX3090) |
|---|---|---|---|
| Algorithm in this paper | 78.2 | 13.54 | 13.54 |
| FTRM | 76.9 | 21.9 | 14.79 |
| RVOS [2] | 50.3 | 22.7 | – |
| RANet [21] | 65.7 | 30.3 | – |
| AGAME [23] | 70.0 | 14.3 | – |
| STM [27] | 81.8 | 6.25 | – |
| RGMP [9] | 66.7 | 7.7 | – |
| FEELVOS [16] | 71.5 | 2.22 | – |
| OSMN [25] | 54.8 | 7.14 | – |
| PReMVOS [6] | 77.8 | 0.03 | – |
| OSVOS-S [7] | 68.0 | 0.22 | – |
| OnAVOS [26] | 67.9 | 0.08 | – |
| TAB [19] | 71.7 | 0.09 | – |
| MPR [5] | 63.4 | 7.69 | – |

The training-data columns of the original table (yv, seg, synth) are omitted because their per-algorithm marks are not recoverable.
3.5 Visualisation of Segmentation Results
The quantitative analysis above shows the advantage of the proposed algorithm in video object segmentation; this section examines the segmentation results visually. Figure 5 shows the results of the proposed algorithm and FTRM on the horse-jump video sequence of the DAVIS2017 multi-object segmentation dataset. One of the inherent difficulties of video object segmentation is fast target motion: as the segmentation results show, segmentation becomes harder as the target moves faster. Figure 6 shows a camel video sequence from the DAVIS2016 single-object segmentation dataset, which illustrates another difficulty in the field, namely distinguishing the target of interest from similar objects. The figure clearly shows that FTRM has difficulty distinguishing similar objects, whereas the proposed algorithm suppresses the interference from the similar object and segments the target accurately, which demonstrates its robustness in distinguishing foreground from background.
Fig. 5 Visual segmentation results for the DAVIS2017 dataset
Fig. 6 Visual segmentation results for the DAVIS 2016 dataset

3.6 Ablation Experiments
In this section, ablation experiments demonstrate the effectiveness of the modules proposed in this paper. The model parameters are trained on the YouTube-VOS and DAVIS2017 training sets, and the model is validated on the DAVIS2017 validation set. Table 4 shows the results obtained by adding the three proposed modules to the baseline network step by step and testing on the DAVIS2017 dataset. Base is the baseline network, MFE (multi-scale feature extraction) is the multi-scale feature extraction module, FAF (feature attention fusion) is the feature attention fusion module, and OO (output optimisation) is the output optimisation module. Adding only the MFE module increases the region similarity J by 0.7 percentage points over the baseline network and the contour accuracy F by 0.3 percentage points, while the segmentation speed drops by 0.55 frames per second. Adding the FAF module yields a further 0.45-point increase in accuracy and a 0.02 frames-per-second decrease in speed, which shows that this module improves accuracy at only a slight expense of segmentation speed. With all modules added, J reaches 75.5% and F reaches 80.8%; their average J&F reaches 78.15%, and the speed is 13.54 frames per second. Compared with the model without output optimisation, region similarity and contour accuracy both improve by 0.3 percentage points, and the speed falls by 0.68 frames per second. The experimental results demonstrate that the three proposed modules achieve a better balance between accuracy and speed in video object segmentation than the baseline network, at the cost of only a modest loss in segmentation speed.

Table 4 Ablation experiments

| Models | J | F | J&F | FPS |
|---|---|---|---|---|
| Base | 0.740 | 0.798 | 0.769 | 14.79 |
| Base+MFE | 0.747 | 0.801 | 0.774 | 14.24 |
| Base+MFE+FAF | 0.752 | 0.805 | 0.779 | 14.22 |
| Base+MFE+FAF+OO | 0.755 | 0.808 | 0.782 | 13.54 |
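The step-by-step changes quoted in the ablation discussion can be recomputed from the J, F and FPS values of Table 4. The sketch below only performs that arithmetic; the data layout and the helper name `step_changes` are illustrative.

```python
# (model, J, F, FPS) rows copied from Table 4.
rows = [
    ("Base", 0.740, 0.798, 14.79),
    ("Base+MFE", 0.747, 0.801, 14.24),
    ("Base+MFE+FAF", 0.752, 0.805, 14.22),
    ("Base+MFE+FAF+OO", 0.755, 0.808, 13.54),
]

def step_changes(rows):
    """Change of each row relative to the previous one.

    dJ, dF and dJF are in percentage points; dJF is the change of the
    mean (J + F) / 2; dFPS is in frames per second.
    """
    changes = []
    for (_, j0, f0, fps0), (name, j1, f1, fps1) in zip(rows, rows[1:]):
        changes.append({
            "model": name,
            "dJ": round((j1 - j0) * 100, 2),
            "dF": round((f1 - f0) * 100, 2),
            "dJF": round(((j1 + f1) - (j0 + f0)) / 2 * 100, 2),
            "dFPS": round(fps1 - fps0, 2),
        })
    return changes

changes = step_changes(rows)
total_fps_drop = round(rows[0][3] - rows[-1][3], 2)  # baseline vs full model
```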
4 Conclusion
Video object segmentation has inherent difficulties, such as fast-moving targets and background objects that resemble the foreground target. In the face of these difficulties, FTRM does not make full use of the multi-scale features extracted by the feature extractor, its decoder fuses deep and shallow features too coarsely, and it uses the decoder output directly as the segmentation result. This paper takes FTRM as the baseline network and makes targeted improvements to these defects: a multi-scale feature extraction module is constructed to retain multi-scale global information; a feature attention fusion module is built to enhance the feature channel information and then fuse deep and shallow features deeply in the decoder; and an output optimisation module is created that takes the multi-scale features, the coarse mask and the decoder output as input and, after enhancing the feature information, optimises the output to secure segmentation accuracy. The effectiveness of the proposed modules is verified through module-by-module ablation, and the algorithm is compared with FTRM and various advanced algorithms on the single-object segmentation dataset DAVIS2016 and the multi-object segmentation dataset DAVIS2017. However, the performance of the algorithm on the DAVIS2016 dataset is slightly worse than that of the methods based on online learning, which is not ideal. Future work will explore how to maintain the accuracy of multi-object segmentation while further improving the accuracy of single-object segmentation.

Acknowledgements This paper was partially supported by Natural Science Foundation of China (62173232, 62003214) and Basic Research of National Defense Science and Industry Bureau (JCKY2019413D001).
References
1. Caelles, S., Pont-Tuset, J., Leal-Taixé, L., Cremers, D., Bruhn, A.: One-shot video object segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 221–230. IEEE (2017)
2. Fan, L., Huang, W., Gan, C., Ermon, S.: Rvos: End-to-end recurrent network for video object segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 154–170 (2018)
3. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
4. Ibtehaz, N., Sohel Rahman, M.: Multiresunet: rethinking the u-net architecture for multimodal biomedical image segmentation. Neural Netw. 121, 74–87 (2020)
5. Lei, M., Yang, T., Sun, J., Wang, L., Li, J.: Video object segmentation based on motion-aware roi prediction and adaptive reference updating. Signal Process.: Image Commun. 100, 116357 (2021)
6. Luiten, J., Voigtlaender, P., Leibe, B.: Premvos: proposal-generation, refinement and merging for video object segmentation. In: European Conference on Computer Vision, pp. 718–733. Springer (2018)
7. Maninis, K.K., Caelles, S., Chen, Y., Pont-Tuset, J., Leal-Taixé, L., Cremers, D., Van Gool, L.: Video object segmentation without temporal information. IEEE Trans. Pattern Anal. Mach. Intell. 41(6), 1515–1530 (2019)
8. Marki, W., Torr, P.: Learning fast and robust target models for video object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2908–2917. IEEE (2016)
9. Oh, S.W., Lee, J.-Y., Sunkavalli, K., Kim, S., Kweon, I.S.: Fast video object segmentation by reference-guided mask propagation. In: European Conference on Computer Vision, pp. 386–402. Springer (2018)
10. Perazzi, F., Khoreva, A., Benenson, R., Schiele, B., Sorkine-Hornung, A.: A benchmark dataset and evaluation methodology for video object segmentation. In: European Conference on Computer Vision, pp. 430–445. Springer (2016)
11. Pont-Tuset, J., Perazzi, F., Caelles, S., Arbeláez, P., Sorkine-Hornung, A., Van Gool, L.: The 2017 davis challenge on video object segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 870–891. IEEE (2018)
12. Ronneberger, O., Fischer, P., Brox, T.: U-net: convolutional networks for biomedical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241. Springer (2015)
13. Russakovsky, O., Deng, J., Hao, S., Krause, J., Satheesh, S., Ma, S., Huang, Z., Karpathy, A., Khosla, A., Bernstein, M., et al.: Imagenet large scale visual recognition challenge. Int. J. Comput. Vision 115(3), 211–252 (2015)
14. Tokmakov, P., Alahari, K., Schmid, C.: Learning video object segmentation from static images. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2663–2672. IEEE (2017)
15. Valanarasu, J.M.J., Patel, V.M.: Unext: Mlp-based rapid medical image segmentation network. In: Wang, L., Dou, Q., Thomas Fletcher, P., Speidel, S., Li, S. (eds.) Medical Image Computing and Computer Assisted Intervention – MICCAI 2022, pp. 23–33. Cham, Springer Nature Switzerland (2022)
16. Voigtlaender, P., Leibe, B.: Feelvos: fast end-to-end embedding learning for video object segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1205–1214. IEEE (2019)
17. Voigtlaender, P., Leibe, B.: Videomatch: matching based video object segmentation. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13975–13984 (2020)
18. Wang, H., Zhang, Y., Liu, Z., Shi, J., Loy, C.C., Lin, D.: Eca-net: efficient channel attention for deep convolutional neural networks. In: European Conference on Computer Vision, pp. 1–17 (2020)
19. Wang, H., Liu, W., Xing, W.: A temporal attention based appearance model for video object segmentation. Appl. Intell. 1–11 (2022)
20. Wang, T., Zhang, H., Song, K., Xuanang, X., Cheng, K., Liu, X.: Transfuse: fusing transformers and cnns for medical image segmentation. Med. Image Anal. 67, 101832 (2021)
21. Wang, Z., Lin, J., Liu, P.: Ranking attention network for fast video object segmentation. IEEE Trans. Image Process. 28(6), 3044–3054 (2019)
22. Huan, X., Liu, H., Tian, J., Sun, Z., Yang, X., Wang, C.: Ma-unet: an improved version of u-net based on multi-scale and attention mechanism for medical image segmentation. Comput. Med. Imag. Graph. 89, 101868 (2021)
23. Xu, N., Yang, L., Fan, Y., Zhang, J., Lau, R.W.H.: A generative appearance model for end-to-end video object segmentation. In: Proceedings of the European Conference on Computer Vision (ECCV)
24. Xu, N., Yang, L., Fan, Y., Yue, J., Liang, X., Yang, J., Huang, T.S.: Youtube-vos: sequence-to-sequence video object segmentation. In: European Conference on Computer Vision, pp. 0–0. Springer (2018)
25. Yang, C., Xu, D., Zhou, J., Wang, T.: Efficient video object segmentation via network modulation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 3173–3182. IEEE (2020)
26. Yang, W., Lu, P., Lin, S., Li, X., Zhang, J.: Online adaptation of convolutional neural networks for video object segmentation. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2114–2123. IEEE (2018)
27. Yang, X., Zhang, Y., Xu, C., Shen, J., Jia, J.: Video object segmentation using space-time memory networks. In: European Conference on Computer Vision, pp. 475–491. Springer (2018)
28. Zhao, Z.-Q., Liu, F., Zhang, S.-T., Cai, W.-S., Nie, D., Wang, L.-M., Gao, X.-B.: U-net transformer: self and cross attention for medical image segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 126–135. Springer (2021)
Design of Terminal Guidance Law for Air Defense Missile Based on Variable Structure Control Theory Shujun Yang, Jirong Ma, Juanzhi Lu, and Duansong Li
Abstract A terminal guidance law based on variable structure control theory is designed for air defense missiles. The guidance law accounts for target maneuverability and the relative velocity of the missile through variable structure control. With the target acceleration treated as a bounded disturbance, the guidance law is designed so that the line-of-sight angular rate is driven to zero. Simulation results show that the miss distance of the guidance law is smaller than that of the traditional PN guidance law, which verifies the effectiveness of the guidance law. Keywords Anti-aircraft missile · Terminal guidance law · Variable structure control theory
1 Introduction
As air attack targets develop towards small size, high speed and strong maneuverability, higher requirements are placed on the guidance and control technology of air defense missiles. The traditional proportional guidance law performs satisfactorily in many cases, but its performance degrades greatly when the target maneuvers or the observations are noisy, and it struggles to meet the requirements of fast, large-maneuver interception and high hit accuracy. Studying guidance laws suited to attacking high-speed, highly maneuverable targets is therefore one of the key technologies for air defense missiles. When the target maneuvers, the capture region of the anti-aircraft missile becomes very limited, and the traditional pure proportional navigation (PPN) guidance law intercepts maneuvering targets far less effectively than non-maneuvering targets [1]. The predictive guidance law with optimal intercept time in [2] gives the system the ability to track time-varying maneuvering targets and robustness to unknown future target maneuvers. It overcomes the drawbacks of optimal guidance laws designed by time optimization and differential games, whose application conditions are demanding and which are not practical for arbitrary maneuvering targets, and it is therefore convenient for engineering realization. However, it requires the target maneuver acceleration to be predicted from the current relative motion state of the target. Based on an analysis of the limitations of proportional guidance, [3] proposes a predictive guidance law capable of attacking highly maneuvering targets. When this guidance law guides a missile against a target maneuvering with constant overload, a straight-line trajectory is obtained and the law has the property of zero-control interception. This paper uses the disturbance invariance and strong robustness of variable structure control theory to derive a terminal guidance law for air defense missiles that can deal with high-speed, highly maneuvering targets [7]. Simulation results show that this guidance law has high guidance accuracy and engineering realizability.

S. Yang (B) · J. Ma · J. Lu · D. Li, Xian Institution of Modern Control Technology, 201848 Xian, China, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_15
2 Variable Structure Guidance Law Design
The relative motion relationship between missile and target is shown in Fig. 1. The relative motion of missile and target in the plane is described by

$$\dot R = -V_m \cos(\psi_{vm} - q) + V_t \cos(\psi_{vt} - q) \tag{1}$$

$$R\dot q = -V_m \sin(\psi_{vm} - q) + V_t \sin(\psi_{vt} - q) \tag{2}$$

Considering the planar motion of missile and target, it is assumed that:
(1) The missile and target are regarded as particles;
(2) The missile autopilot and seeker dynamics are fast enough to be ignored compared with the guidance loop;
(3) The target maneuver is regarded as an external bounded disturbance whose upper bound is known, that is, $|a_T| < a$, where $a$ is a positive number;
(4) The upper bound of the maximum approach velocity between missile and target is known, $|\dot R| \le \dot R_{\max}$;
(5) The upper bound of the maximum lead angle of the missile is known, $\eta_m < \eta_{m\max}$.
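The relative-motion relations (1) and (2) can be evaluated numerically as a quick sanity check. The sketch below assumes that the missile term in Eq. (1) uses the same angle difference ψvm − q as Eq. (2); the speeds and range in the example are hypothetical values, not the simulation conditions of this paper.

```python
import math

def relative_motion(R, q, Vm, psi_vm, Vt, psi_vt):
    """Range rate and line-of-sight rate from Eqs. (1) and (2).

    Angles are in radians, R in metres, speeds in m/s.
    """
    R_dot = -Vm * math.cos(psi_vm - q) + Vt * math.cos(psi_vt - q)
    q_dot = (-Vm * math.sin(psi_vm - q) + Vt * math.sin(psi_vt - q)) / R
    return R_dot, q_dot

# Head-on geometry: the missile flies along the line of sight and the
# target flies back along it, so the range closes at Vm + Vt and the
# line of sight does not rotate.
R_dot, q_dot = relative_motion(R=9000.0, q=0.0, Vm=600.0, psi_vm=0.0,
                               Vt=200.0, psi_vt=math.pi)
```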
Fig. 1 The relative motion relationship between missile and plane
The variable structure control theory is applied to the guidance law design [4]. Firstly, the switching surface is determined:

$$S = \dot q \tag{3}$$

The switching surface is chosen in this way because the design objective of the guidance law is to drive the line-of-sight angular rate $\dot q$ to zero while the target performs bounded maneuvers. Variable structure control ensures that the system state reaches the switching surface within a finite time, so $\dot q$ can be driven to zero within a finite time [5]. Construct the following Lyapunov function:

$$V = S^2/2 \tag{4}$$

Obviously, the function is positive definite. A sufficient condition for $S = 0$ to be a sliding surface is

$$\dot V = S\dot S < 0 \tag{5}$$

where

$$\dot V = \frac{\dot q}{R}\left(-2\dot R\dot q - a_m\cos\eta_m + a_T\cos\eta_T - \dot V_m\sin\eta_m + \dot V_t\sin\eta_T\right) \tag{6}$$

The exponential reaching law is constructed as

$$\dot S = -\frac{K}{R}S - \frac{W}{R}\,\mathrm{sign}(S) \tag{7}$$

where $K > 0$ and $W > 0$. With this reaching law, the rate at which the line-of-sight angular rate $\dot q$ approaches zero adapts to the distance between missile and target. When $R$ is large, the reaching speed is slow; when $R$ approaches zero, the reaching speed increases rapidly, which ensures that $\dot q$ does not diverge and effectively improves the hit accuracy of the missile. The missile acceleration can be obtained as

$$a_m = \frac{1}{\cos\eta_m}\left[-2\dot R\dot q + a_T\cos\eta_T - \dot V_m\sin\eta_m + \dot V_t\sin\eta_T + K\dot q + W\,\mathrm{sign}(\dot q)\right] \tag{8}$$

In the above equation, the longitudinal acceleration of the missile $\dot V_m$ can be measured by the missile navigation device, while the target maneuvering acceleration $a_T$ and the target longitudinal acceleration $\dot V_t$ are difficult for the missile to obtain. These two terms are therefore regarded as bounded uncertainties or disturbances, and the variable structure guidance law is simplified as follows:

$$a_m = \frac{(K - 2\dot R)\,\dot q - \dot V_m\sin\eta_m + W\,\mathrm{sign}(\dot q)}{\cos\eta_m} \tag{9}$$
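The behaviour of the reaching law (7) can be illustrated by integrating it numerically. This is a minimal sketch under stated assumptions: forward-Euler integration, a range that closes at a constant rate, and hypothetical values for K, W, the initial range and the initial line-of-sight rate, none of which are taken from this paper.

```python
def simulate_reaching(S0=0.02, R0=5000.0, R_dot=-500.0, K=1000.0, W=20.0,
                      dt=1e-3, t_end=9.0):
    """Integrate S' = -(K/R) S - (W/R) sign(S) while R shrinks linearly.

    Returns the final value of S and the first time at which S reaches zero.
    """
    S, R, t = S0, R0, 0.0
    t_hit = None
    for _ in range(int(round(t_end / dt))):
        sign = (S > 0) - (S < 0)
        S += dt * (-(K / R) * S - (W / R) * sign)
        R += dt * R_dot          # assumed constant closing speed
        t += dt
        if t_hit is None and S <= 0.0:
            t_hit = t
    return S, t_hit

S_final, t_hit = simulate_reaching()
```

With these numbers the switching variable reaches zero in finite time and then stays in a small band around it; the residual oscillation is the discrete-time form of the chattering caused by the sign function.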
Fig. 2 Trajectory curve of variable structure guidance law (axes: X (m), Y (m))
Fig. 3 Attack angle curve of variable structure guidance law (horizontal axis: t (s))
Substituting (9) into (6) gives

$$\dot V = \frac{1}{R}\left[a_{Tf}\,\dot q\cos\eta_T + \dot V_{tf}\,\dot q\sin\eta_T - K\dot q^2 - W\dot q\,\mathrm{sign}(\dot q)\right] \tag{10}$$

where $a_{Tf}$ and $\dot V_{tf}$ are the estimated values of $a_T$ and $\dot V_t$, respectively. Equation (9) can be modified into a guidance law without chattering:
Fig. 4 Overload curve of generalized proportional guidance law (horizontal axis: t (s))
Fig. 5 Variable structure guidance law inertial line of sight angle estimation curve
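$$a_m = \frac{(K - 2\dot R)\,\dot q - \dot V_m\sin\eta_m + W\dot q/(|\dot q| + \delta)}{\cos\eta_m} \tag{11}$$

According to Equation (11),

$$\dot V = \frac{1}{R}\left[-\left(K - \frac{a_{Tf}}{\dot q}\cos\eta_T - \frac{\dot V_{tf}}{\dot q}\sin\eta_T\right)\dot q^2 - W|\dot q|\right] \tag{12}$$

To make $\dot V < 0$, $K$ and $W$ are chosen in the following forms.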
Fig. 6 Variable structure guidance law inertial line-of-sight angular velocity estimation curve
Fig. 7 Generalized proportional guidance law trajectory curve (axes: X (m), Y (m))
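$$K > \frac{1}{|\dot q|}\left(a_{Tf} + \dot V_{tf}\right) \tag{13}$$

$$K > 0,\quad W > a_{Tf} + \dot V_{tf} \tag{14}$$

Choosing $K$ and $W$ according to (13) or (14) makes $\dot V < 0$ hold.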
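The simplified guidance commands (9) and (11) and the gain condition (14) can be sketched as plain functions. This is an illustration under stated assumptions, not the authors' implementation: the smoothed switching term is taken to be q̇/(|q̇| + δ), and the function names and the numerical values in the example are hypothetical.

```python
import math

def guidance_accel(q_dot, R_dot, Vm_dot, eta_m, K, W, delta=0.0):
    """Commanded missile acceleration.

    delta == 0 gives the switching law of Eq. (9); delta > 0 replaces
    sign(q_dot) by q_dot / (|q_dot| + delta) as in Eq. (11) (assumed form).
    Angles in radians, rates in rad/s, speeds in m/s.
    """
    if delta > 0.0:
        switch = q_dot / (abs(q_dot) + delta)
    else:
        switch = (q_dot > 0) - (q_dot < 0)
    numerator = (K - 2.0 * R_dot) * q_dot - Vm_dot * math.sin(eta_m) + W * switch
    return numerator / math.cos(eta_m)

def gains_satisfy_14(K, W, a_Tf, Vt_dot_f):
    """Check the gain condition of Eq. (14): K > 0 and W > a_Tf + Vt_dot_f."""
    return K > 0 and W > a_Tf + Vt_dot_f

# Hypothetical operating point: closing speed 500 m/s, small LOS rate.
a_switch = guidance_accel(q_dot=0.01, R_dot=-500.0, Vm_dot=0.0, eta_m=0.0,
                          K=1000.0, W=20.0)
a_smooth = guidance_accel(q_dot=0.01, R_dot=-500.0, Vm_dot=0.0, eta_m=0.0,
                          K=1000.0, W=20.0, delta=0.01)
```

The smoothed command is smaller near q̇ = 0 because the switching term shrinks continuously instead of jumping between −W and +W, which is what removes the chattering.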
Fig. 8 Attack angle curve of generalized proportional guidance law (horizontal axis: t (s))
Fig. 9 Overload curve of generalized proportional guidance law (horizontal axis: t (s))
3 Simulation Study
After the terminal guidance law has been designed, the terminal guidance law based on variable structure control theory proposed in this paper and the generalized proportional guidance law commonly used in engineering are compared on a nonlinear six-degree-of-freedom simulation model to verify the rationality of the design [6]. The simulation conditions are as follows: the target moves uniformly in a straight line with a velocity of 200 m/s in the x direction and initial coordinates (8775, 2000, 0) m; the initial coordinates of the missile are (0, 0, 0) m and its initial velocity is 40 m/s; the initial missile-target line-of-sight azimuth is 180°. The semi-active seeker has an internal step of 0.06° and a noise mean square error of 0.2. The results of one random simulation are given below: the miss distance is 3.27 m with the generalized proportional guidance law and 1.6 m with the variable structure guidance law (Figs. 2, 3, 4, 5, 6, 7, 8, 9 and Table 1).

Table 1 Statistics of 100 random simulations

| Guidance law | Head-on attack (m) | Tail-chase attack (m) |
|---|---|---|
| Generalized proportional guidance law | 3.36 | 8.22 |
| Variable structure guidance law | 1.41 | 3.22 |
4 Conclusion
To meet the special requirements of air defense missiles attacking high-speed, highly maneuvering targets, a terminal guidance law based on variable structure control theory is proposed in this paper. Simulation verifies that, compared with the generalized proportional guidance law, the proposed terminal guidance law effectively improves terminal guidance accuracy and has higher engineering realizability.
References
1. Wang, Y., Fang, Y., Zhou, X.: Research status and development of proportional guidance law. Fire Control and Command Control 32(10), 5 (2007)
2. Talole, S.E., Banavar, R.N.: Proportional navigation through predictive control. J. Guidance Control Dynam. 21(6), 1004–1006 (1998)
3. Changqi, L.V.: Prediction guidance law and target maneuver estimation. Aviation Weapons Z1, 9 (1996)
4. Ma, H., Fang, Q., Yuan, J.: Research on space interception modified proportional guidance law. Flight Mech. 24(1), 52–54 (2006)
5. Cheng, F., Chen, S.: Modified proportional guidance law of interceptor warhead. J. Air Force Eng. Univer. (Natural Science Edition) 4(4), 15–18 (2003)
6. Jin, Z., Zhang, J., Zhang, G.: Guidance law and target position prediction. Tactical Missile Technol. 2, 3 (2006)
7. Guo, X., Jia, X.: Research on variable structure guidance law of target terminal escape. Aviation Weapons 1, 8–11 (2006)
A Trajectory Prediction Method via Affine Objective Fuzzy Clustering Na Wang, Liang Luo, Yue Lei Cui, and Xin Hai Zhang
Abstract For the single-step trajectory prediction of moving objects, a trajectory prediction method via affine objective fuzzy clustering is proposed. Firstly, the affinity propagation clustering method and objective clustering analysis are combined to obtain an appropriate number of clusters and the cluster centers. Secondly, the clustering results are integrated with the Fuzzy C-Means algorithm, which determines the number of rules and the premise parameters of the T-S fuzzy model. Finally, the Stable Kalman Filter approach is used to estimate the consequent parameters of the T-S model. The presented method is compared with traditional deep learning models through the simulation of a three-degree-of-freedom flight dynamics trajectory, and its validity is verified. Keywords Trajectory prediction · Affinity propagation clustering · Objective fuzzy clustering · T-S model
1 Introduction Trajectory prediction is to estimate the position, velocity, acceleration and other information of the future time using the trajectory data of the historical time [1]. At present, data-driven deep learning [2] is the main method. Because the temporal features can be fully mined by this approach. However, the subjectivity still exists in establishing network structure and parameters by the deep learning methods. Moreover, large data are used in it and easily lead to low computational efficiency. Compared with deep learning, fuzzy model, e.g. T-S model is more objective because the parameters of antecedents and consequents are identified from data. N. Wang · L. Luo · Y. L. Cui · X. H. Zhang Tiangong University, Tianjin 300387, China N. Wang (B) Tianjin Key Laboratory of Intelligent Control of Electrical Equipment Universit, Tianjin 300387, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_16
N. Wang et al.
Furthermore, the simplicity and accuracy of the model can be guaranteed without requiring large amounts of data, so it has become a main means of modelling. In the identification of a T-S model, premise identification, which covers the number of rules and the premise parameters, is the most important step, and clustering [3] is the main tool for it. In traditional clustering methods, however, the number of clusters is determined by trial and error, which easily degrades the accuracy of the T-S model. In this paper, a trajectory prediction method via Affine Objective Fuzzy Clustering (AOFC) is proposed. First, the Affinity Propagation (AP) clustering method [4] is used to obtain the initial number of clusters and the initial cluster centers. To keep the model simple, the Objective Fuzzy Clustering Analysis (OFCA) approach is then applied to re-cluster these initial clusters, so that redundant clusters are merged and an appropriate number of clusters and their centers are obtained, which improves the efficiency of the subsequent computation [5]. Second, this clustering result is passed to the Fuzzy C-Means (FCM) algorithm, and the fuzzy membership matrix produced by FCM gives the premise parameters directly, which speeds up T-S fuzzy identification. Next, the steady-state Kalman filter algorithm is applied to estimate the consequent parameters of the T-S model quickly. The validity of the proposed method is verified on a simulated three-degree-of-freedom flight path by comparison with the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) deep learning models.
2 AOFC
2.1 AP Clustering
In the AP algorithm, all $N$ sample points in the dataset are regarded as candidate cluster centers. Information about the attractiveness of each sample point to the other sample points is established, i.e., the similarity $s_{ij} = -\|z_i - z_j\|^2$ between any two sample points $z_i$ and $z_j$ is stored in the $N \times N$ matrix $S_0$, $i, j \in [1, N]$, $i \neq j$. Before clustering, the maximum number of iterations of the clustering algorithm is set to $M$, and each point is given a prior value $p(i) = s(i, i)$ that indicates the tendency of data point $z_i$ to be selected as a cluster center; this value is called the bias parameter. At the beginning of the algorithm, $p(i)$ is set to the median of the similarities. To select appropriate cluster centers, the AP algorithm updates two quantities iteratively: $r_{ij}$ describes the degree to which data point $z_j$ is suitable as the class representative of data point $z_i$, and $a_{ij}$ describes the suitability of data point $z_i$ selecting data point $z_j$ as its class representative. The greater $r_{ij} + a_{ij}$, the greater the possibility that $z_j$ becomes a final cluster center. The iterations are repeated, and if the number of iterations $k > M$
A Trajectory Prediction Method via Affine Objective Fuzzy …
Fig. 1 Schematic diagram of AP clustering algorithm
or the cluster centers do not change over several consecutive iterations, the algorithm ends and outputs the cluster centers $V^k$ and the cluster subsets $C^k$. The schematic diagram of the AP clustering algorithm is given in Fig. 1.
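The AP iteration described in Sect. 2.1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `affinity_propagation`, the damping factor, the `stable_iter` stopping count and the fallback for an empty exemplar set are assumptions added here, and the update rules are the standard responsibility/availability messages of AP, which the text describes only qualitatively.

```python
import numpy as np

def affinity_propagation(Z, max_iter=200, stable_iter=15, damping=0.5):
    """Minimal AP sketch. Returns (exemplar indices, exemplar index of each point)."""
    Z = np.asarray(Z, dtype=float)
    N = Z.shape[0]
    # Similarity s_ij = -||z_i - z_j||^2
    S = -np.sum((Z[:, None, :] - Z[None, :, :]) ** 2, axis=2)
    # Bias parameter p(i) = s(i, i): median of the off-diagonal similarities
    S[np.diag_indices(N)] = np.median(S[~np.eye(N, dtype=bool)])

    R = np.zeros((N, N))  # responsibilities r_ij
    A = np.zeros((N, N))  # availabilities a_ij
    rows = np.arange(N)
    previous, unchanged = None, 0
    for _ in range(max_iter):
        # r_ij <- s_ij - max_{k != j} (a_ik + s_ik)
        AS = A + S
        best = np.argmax(AS, axis=1)
        first = AS[rows, best]
        AS[rows, best] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, best] = S[rows, best] - second
        R = damping * R + (1 - damping) * R_new

        # a_ij <- min(0, r_jj + sum_{k not in {i, j}} max(0, r_kj))
        # a_jj <- sum_{k != j} max(0, r_kj)
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

        # Points with r_jj + a_jj > 0 are the current cluster centers
        exemplars = np.flatnonzero(np.diag(R + A) > 0)
        if previous is not None and exemplars.size > 0 and np.array_equal(exemplars, previous):
            unchanged += 1
            if unchanged >= stable_iter:  # centers stable for several iterations
                break
        else:
            unchanged = 0
        previous = exemplars

    if exemplars.size == 0:  # assumed fallback so the sketch always returns a center
        exemplars = np.array([np.argmax(np.diag(R + A))])
    labels = exemplars[np.argmax(S[:, exemplars], axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels
```

In the AOFC pipeline, the samples `Z[exemplars]` would play the role of the high-quality centers that are passed on to OFCA.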
2.2 OFCA
OFCA performs cluster initialization based on enhanced objective clustering analysis (EOCA), and the initialization result is passed to FCM to obtain the fuzzy partition and the final cluster centers. First, training subsets are formed from dipoles and hierarchical clustering is performed. Second, using the relative dissimilarity measure and the enhanced consistency criterion, the final number of clusters and the cluster centers are determined from two candidate cluster sets, and FCM clustering is performed to obtain the cluster center set and the fuzzy partition matrix. OFCA is described as follows (Fig. 2).
Definition 1 Dipole
Given the training dataset $Z = \{z_1, z_2, \cdots, z_N\}$, symmetric sample pairs $(z_i, z_j)$ are named dipoles and recorded as $O_{ij}^{k}$, where $i \neq j$, $i, j = 1, 2, \cdots, N$, $k \in \{1, \cdots, C_N^2\}$. Let $dis_{ij} = \|z_i - z_j\|$ represent the value of dipole $O_{ij}^{k}$, where $dis_{ij}$ is the Euclidean distance between $z_i$ and $z_j$, and the similarity is defined as $S_{ij} = -dis_{ij}$.
Fig. 2 OFCA schematic diagram
Definition 2 Dissimilarity measure
The relative dissimilarity measure represents the degree of similarity between different clusters of the same training subset in each cluster-merging step of hierarchical clustering. It is defined as

$$D_{ij} = \frac{dis_{ij}}{\min(\bar{D}_{ij}, \bar{D}_{ji})}, \qquad \bar{D}_{ij} = \sum_{k \in c,\, k \neq j} \frac{d_{ik}}{c - 2} \tag{1}$$

where $D_{ij}$ denotes the relative dissimilarity between clusters $i$ and $j$, $i \neq j$, $i, j \in \{1, \cdots, c\}$; $c$ denotes the number of clusters at the current merging step; $d_{ik}$ represents the Euclidean distance between clusters $i$ and $k$; and $\bar{D}_{ij}$ represents the average distance from cluster $i$ to the clusters other than $j$.
Definition 3 Enhanced consistency criterion
The enhanced consistency criterion represents the degree of similarity between the clustering results of the different training subsets formed from the dipole sample data in AHC. It is expressed as

$$BL_{new} = \frac{1}{mc} \sum_{t=1}^{m} \sum_{i,j=1}^{c} \left(V_{Ai}^{t} - V_{Bi}^{t}\right)^2 \tag{2}$$

where $m$ represents the dimension of the data samples; $V_{Ai}^{t}$ represents component $t$ of the center of cluster $i$ in training subset $Z_A$; and $V_{Bi}^{t}$ represents component $t$ of the center of the cluster $j$ in training subset $Z_B$ that is closest to cluster center $i$ of $Z_A$.
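As a small numerical illustration of Definitions 2 and 3, the sketch below evaluates the relative dissimilarity (1) for a set of cluster centers and the consistency criterion (2) for two sets of centers. The function names are hypothetical. Two assumptions are made: the distance between clusters is taken as the Euclidean distance between their centers, and in (2) each center of $Z_A$ is paired with its nearest center of $Z_B$, so the double sum reduces to one term per cluster $i$.

```python
import numpy as np

def relative_dissimilarity(V):
    """D_ij of Eq. (1) for cluster centers V of shape (c, m); requires c >= 3."""
    V = np.asarray(V, dtype=float)
    c = V.shape[0]
    d = np.linalg.norm(V[:, None, :] - V[None, :, :], axis=2)  # d_ik
    row_sum = d.sum(axis=1)                                    # sum_k d_ik (d_ii = 0)
    D_bar = (row_sum[:, None] - d) / (c - 2)                   # sum_{k != j} d_ik / (c - 2)
    D = d / np.minimum(D_bar, D_bar.T)
    np.fill_diagonal(D, np.inf)                                # i = j is not defined
    return D

def consistency_criterion(V_A, V_B):
    """BL_new of Eq. (2), pairing each center of Z_A with its nearest center of Z_B."""
    V_A = np.asarray(V_A, dtype=float)
    V_B = np.asarray(V_B, dtype=float)
    c, m = V_A.shape
    d = np.linalg.norm(V_A[:, None, :] - V_B[None, :, :], axis=2)
    nearest = np.argmin(d, axis=1)                             # closest cluster j for each i
    return np.sum((V_A - V_B[nearest]) ** 2) / (m * c)

# Three centers on a line: the two close ones have the smallest relative dissimilarity,
# so they would be the next pair to merge.
D = relative_dissimilarity([[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]])
merge_pair = np.unravel_index(np.argmin(D), D.shape)
```

A smaller $BL_{new}$ means that the two training subsets produce more consistent cluster centers, which is how the candidate cluster sets are ranked in the AOFC steps.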
2.3 AOFC Algorithm
In summary, the steps of the AOFC algorithm are as follows:
Step 1: Perform preliminary clustering of the dataset $Z^0$ with the AP clustering algorithm, and form a new dataset $Z$ from the resulting high-quality cluster centers $V^0$.
Step 2: Use dipole partitioning to generate the training subsets $Z_A$, $Z_B$ and $Z_C$, $Z_D$ from the dataset.
Step 3: Perform agglomerative hierarchical clustering on $Z_A$, $Z_B$ based on the relative dissimilarity measure and the enhanced consistency criterion, and obtain the minimum value $BL_{new1}$ of the enhanced consistency criterion together with the candidate cluster set $\{C^{AB}, V^{AB}\}$. Similarly, for $Z_C$, $Z_D$, obtain the corresponding minimum value $BL_{new2}$ and the candidate cluster set $\{C^{CD}, V^{CD}\}$.
Step 4: Compare $BL_{new1}$ with $BL_{new2}$ and, based on the smaller of the two, determine the optimal number of clusters $C^{opt}$ and the corresponding cluster centers $V^{opt}$ from the candidate sets $\{C^{AB}, V^{AB}\}$ and $\{C^{CD}, V^{CD}\}$.
Step 5: Combine the number of clusters $C^{opt}$ and the cluster centers $V^{opt}$ with FCM to obtain the cluster center vector $V$ and the fuzzy membership matrix $F$.
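Step 5 can be illustrated with a minimal Fuzzy C-Means loop that starts from given centers instead of a random initialization. The sketch uses the standard FCM membership and center updates; the function names, the fuzzifier value 2, the tolerance and the toy data are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def memberships(Z, V, fuzzifier=2.0):
    """Standard FCM membership matrix F of shape (c, N) for centers V."""
    d = np.linalg.norm(Z[None, :, :] - V[:, None, :], axis=2)
    d = np.maximum(d, 1e-12)                      # guard against a zero distance
    inv = d ** (-2.0 / (fuzzifier - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)   # each column sums to 1

def fcm_from_centers(Z, V_opt, fuzzifier=2.0, max_iter=100, tol=1e-6):
    """Run FCM initialized with centers V_opt; returns (V, F)."""
    Z = np.asarray(Z, dtype=float)
    V = np.asarray(V_opt, dtype=float).copy()
    for _ in range(max_iter):
        W = memberships(Z, V, fuzzifier) ** fuzzifier
        V_new = (W @ Z) / W.sum(axis=1, keepdims=True)  # weighted mean of the samples
        shift = np.linalg.norm(V_new - V)
        V = V_new
        if shift < tol:
            break
    return V, memberships(Z, V, fuzzifier)

# Toy data: two groups, with initial centers as they might come from Step 4.
Z_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
V_demo, F_demo = fcm_from_centers(Z_demo, [[0.0, 0.0], [10.0, 10.0]])
```

Because the number of clusters and good initial centers are already fixed by Steps 1 to 4, FCM here only refines the centers and produces the membership matrix $F$ used for the premise parameters.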
3 Construction of T-S Fuzzy Prediction Model Based on AOFC
3.1 Premise Parameter Identification Based on AOFC
In this paper, AOFC is used to determine the number of rules and the premise parameters of the T-S model: the number of rules is given by the number of clusters obtained by AOFC, and the premise parameters are determined by the AOFC cluster centers. For the multi-input single-output (MISO) case, the T-S model is expressed as

$$R_i: \text{IF } x_1 \text{ is } A_{i1} \text{ and } x_2 \text{ is } A_{i2} \cdots \text{ and } x_m \text{ is } A_{im} \text{ THEN } y_i = \eta_{i0} + \eta_{i1} x_1 + \cdots + \eta_{im} x_m \tag{3}$$

where $R_i$ represents the $i$th fuzzy rule, $i = 1, 2, \cdots, n$; $x_1, x_2, \cdots, x_m$ are the $m$ input variables; $A_{ij}$ represents the fuzzy subset of variable $x_j$ in the $i$th rule, which can be expressed in functional form as $A_{ij}(\cdot)$; and $\eta_{ij}$ represents the conclusion parameter, $j = 0, 1, \cdots, m$. $A_{ik}(x_k)$ is the premise parameter of input $x_k$; it is determined by the following definition, from which the premise parameter of each rule is obtained.
Definition: $\exists x_j^0 \in X$; after AOFC is used to cluster in the input-output product space, the premise parameters are determined by projecting the clusters onto the coordinate axis of each feature vector:

$$A_{ij}(x_j^0) = \mu_{i0} \tag{4}$$

where $\mu_{i0}$ is the premise parameter for input $x_k$. The premise parameter of each rule can be expressed as

$$\mu_i = \mu_{i0} \wedge \mu_{i1} \wedge \cdots \wedge \mu_{im} \tag{5}$$

where the fuzzy operator $\wedge$ is the product operation.
3.2 Identification of Consequent Parameters
Definition:

$$\bar{\mu}_i = \mu_i \Big/ \sum_{i=1}^{c} \mu_i \tag{6}$$

Then the output of the fuzzy model can be written as

$$y = \sum_{i=1}^{c} \bar{\mu}_i y_i = \sum_{i=1}^{c} \bar{\mu}_i (\eta_{0i} + \eta_{1i} x_1 + \eta_{2i} x_2 + \cdots + \eta_{mi} x_m) \tag{7}$$

where $P = [\eta_{01}, \cdots, \eta_{0n}, \eta_{11}, \cdots, \eta_{1n}, \cdots, \eta_{m1}, \cdots, \eta_{mn}]$ collects the conclusion parameters, $\eta_{ji}$ is the coefficient of $x_j$ in rule $i$ (written $\eta_{ij}$ in (3)), and $n = c$ is the number of rules.
The consequent parameters are estimated recursively using steady-state Kalman filtering. This avoids the numerical failure that an ill-conditioned matrix causes in the traditional least squares method, and it further improves computational efficiency. The conclusion parameter vector $P$ is obtained by the recursion

$$P_{i+1} = P_i + S_{i+1} \bar{X}_{i+1}^{T} \left(y_{i+1} - \bar{X}_{i+1} P_i\right), \qquad S_{i+1} = S_i - \frac{S_i \bar{X}_{i+1}^{T} \bar{X}_{i+1} S_i}{1 + \bar{X}_{i+1} S_i \bar{X}_{i+1}^{T}}, \qquad i = 0, 1, \ldots, N - 1 \tag{8}$$

where $S_i$ is the covariance matrix, the initial conditions are $P_0 = 0$ and $S_0 = \lambda I$, $\lambda$ is a preset large real number (generally $> 10000$), and $I$ is the identity matrix.
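The recursion in (8) can be tried out with the short sketch below. It is written in the standard recursive least-squares form with $P_0 = 0$ and $S_0 = \lambda I$, which is an assumption about the intended update. The regressor layout $\bar{X} = [\bar{\mu}_1, \ldots, \bar{\mu}_c, \bar{\mu}_1 x_1, \ldots, \bar{\mu}_c x_1, \ldots, \bar{\mu}_c x_m]$ is also an assumption, chosen so that it matches the ordering of $P$ in the text; the names `regressor` and `rls_fit` are hypothetical.

```python
import numpy as np

def regressor(x, mu_bar):
    """Row X_bar for one sample: x has m inputs, mu_bar has c normalized firing strengths."""
    x = np.asarray(x, dtype=float)
    mu_bar = np.asarray(mu_bar, dtype=float)
    # Ordering matches P = [eta_01..eta_0c, eta_11..eta_1c, ..., eta_m1..eta_mc]
    return np.concatenate([mu_bar] + [mu_bar * xj for xj in x])

def rls_fit(X_bar, y, lam=1e5):
    """Recursive estimate of P following Eq. (8), with P_0 = 0 and S_0 = lam * I."""
    X_bar = np.asarray(X_bar, dtype=float)
    P = np.zeros(X_bar.shape[1])
    S = lam * np.eye(X_bar.shape[1])
    for phi, target in zip(X_bar, y):
        S_phi = S @ phi
        # S is symmetric, so S phi phi^T S equals the outer product of S_phi with itself
        S = S - np.outer(S_phi, S_phi) / (1.0 + phi @ S_phi)
        P = P + (S @ phi) * (target - phi @ P)
    return P

# Single-rule check (c = 1, mu_bar = 1): the model reduces to y = eta_0 + eta_1 * x.
xs = np.linspace(0.0, 1.0, 20)
X_demo = np.stack([regressor([x], [1.0]) for x in xs])
P_demo = rls_fit(X_demo, 2.0 + 3.0 * xs)
```

Each update costs only a few matrix-vector products and no matrix inversion, which is the efficiency argument made in the text; a large $\lambda$ makes the influence of the zero initial guess negligible.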
3.3 Description of Single-Step Trajectory Prediction Algorithm Based on AOFC
The detailed algorithm steps are as follows:
Step 1: Given a one-dimensional trajectory dataset $X^0 = \{x_1^0, x_2^0, \ldots, x_n^0\}$, construct a feature matrix $X$ with $m + 1$ columns and $n - m$ rows by the sliding window method:

$$X = \begin{bmatrix} x_1^0 & x_2^0 & \cdots & x_m^0 & x_{m+1}^0 \\ x_2^0 & x_3^0 & \cdots & x_{m+1}^0 & x_{m+2}^0 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{n-m}^0 & x_{n-m+1}^0 & \cdots & x_{n-1}^0 & x_n^0 \end{bmatrix} \tag{9}$$

Step 2: The first $m$ columns of the matrix $X$ are used as input and the $(m + 1)$th column is used as output, forming the input feature matrix $X^t$ and the output feature vector $Y^t$:

$$X^t = \begin{bmatrix} x_1^0 & x_2^0 & \cdots & x_m^0 \\ x_2^0 & x_3^0 & \cdots & x_{m+1}^0 \\ \vdots & \vdots & \ddots & \vdots \\ x_{n-m}^0 & x_{n-m+1}^0 & \cdots & x_{n-1}^0 \end{bmatrix}, \qquad Y^t = \begin{bmatrix} x_{m+1}^0 \\ x_{m+2}^0 \\ \vdots \\ x_n^0 \end{bmatrix} \tag{10}$$

Step 3: Take $X^t$ and $Y^t$ as the inputs and outputs for training the fuzzy prediction model, and obtain the antecedent parameter $\mu_i$ of the $i$th rule and the conclusion parameter $P$.
Step 4: Input $m$ test trajectory points $X_i$ to the model; its output is the predicted value $\hat{X}_{i+1}$.
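Steps 1 and 2 amount to a sliding window over the trajectory. A minimal sketch follows; the function name `sliding_window` is hypothetical and the toy series is illustrative only.

```python
import numpy as np

def sliding_window(x0, m):
    """Build X^t (n - m rows, m columns) and Y^t (n - m values) from a 1-D series."""
    x0 = np.asarray(x0, dtype=float)
    n = x0.size
    # Row i of the feature matrix X in Eq. (9) holds x0[i], ..., x0[i + m]
    X = np.stack([x0[i:i + m + 1] for i in range(n - m)])
    return X[:, :m], X[:, m]  # first m columns as input, (m + 1)th column as output

# Toy series 1..6 with m = 3 gives three samples.
X_t, Y_t = sliding_window(np.arange(1, 7), m=3)
```

With $m = 3$, a training set of 399 track points yields $399 - 3 = 396$ samples and a test set of 151 points yields 148, which matches the sample counts reported in the simulation study.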
4 Simulation Study
To verify the effectiveness of the proposed method, the dataset is taken from a track randomly generated by a 3-DOF flight dynamics model [2]. The sampling time is 0.1 s and the number of track points is 550. The first 399 track points are used as the training set and the last 151 track points as the test set. After single-step prediction on the X, Y and Z axes, the three-dimensional predicted track is formed. The Root Mean Square Error (RMSE) is used to measure the effectiveness of the proposed method:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left[ (x_i - \hat{x}_i)^2 + (y_i - \hat{y}_i)^2 + (z_i - \hat{z}_i)^2 \right]} \tag{11}$$
Fig. 3 Clustering result of AOFC, OFCA
where $n$ is the number of track points; $x_i$, $y_i$ and $z_i$ are the true values of the $i$th track point on the X, Y and Z axes, $i = 1, \ldots, n$; and $\hat{x}_i$, $\hat{y}_i$ and $\hat{z}_i$ are the corresponding predicted values. In the following, X-axis trajectory prediction is taken as the example. Take $m = 3$ and construct a three-input, one-output feature matrix for the trajectory dataset $X^r = \{x_1, x_2, \cdots, x_n\}$. The input $X_{in}$ and output $Y$ are

$$X_{in} = \begin{bmatrix} x_1 & x_2 & x_3 \\ \vdots & \vdots & \vdots \\ x_{n-3} & x_{n-2} & x_{n-1} \end{bmatrix}, \qquad Y = \begin{bmatrix} x_4 \\ \vdots \\ x_n \end{bmatrix} \tag{12}$$
The first 396 trajectory samples are used as the training set and the last 148 trajectory samples as the test set. Clustering is performed on the training set. To compare the clustering effect of the proposed AOFC method with that of OFCA, AOFC and OFCA clustering are each performed after normalization and Principal Component Analysis (PCA) dimensionality reduction. Figure 3 shows that both AOFC and OFCA yield c = 2 clusters and that the clustering results are essentially the same. Taking the number of rules as c = 2, the premise and conclusion parameters are identified with FCM and the steady-state Kalman filter respectively; the resulting training and testing errors between the predictions of the AOFC-based and OFCA-based models and the true values are shown in Table 1.

Table 1 Comparison of AOFC, OFCA clustering effect
Clustering method   Training (RMSE)   Testing (RMSE)   Cluster time (s)
OFCA                0.069             0.3303           28.6
AOFC                0.067             0.3271           1412.1

Table 1 shows that the training and test errors of the models based on AOFC and OFCA are both low, which indicates that the prediction model has good prediction performance. In the clustering process, AOFC greatly improves the efficiency of the clustering algorithm relative to the original OFCA, which is reflected directly in the clustering time. To verify the effectiveness of the proposed method, it is compared with OFCA, LSTM and GRU, as shown in Fig. 4 and Table 2.

Fig. 4 Comparison among LSTM, GRU and AOFC predicted trajectories and real trajectories in 3D trajectories

Table 2 Comparison of track prediction errors among OFCA, LSTM, GRU and AOFC in X, Y, Z axes and 3-D coordinates
Method   X-axis (RMSE)   Y-axis (RMSE)   Z-axis (RMSE)   3D (RMSE)
LSTM     2.9068          2.2040          1.2945          3.8708
GRU      2.1918          1.5603          0.9584          2.8560
OFCA     0.3303          0.0723          0.0416          0.3407
AOFC     0.3271          0.0430          0.0334          0.3316

Figure 4 and Table 2 show that the proposed method gives the best prediction both on each single axis and in three-dimensional coordinates.
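The 3D error measure of Eq. (11) can be computed as in the sketch below; the function name `rmse_3d` is hypothetical. Because the squared errors of the three axes are summed under one root, the 3D RMSE equals the square root of the sum of the squared per-axis RMSEs, which gives a quick consistency check on Table 2.

```python
import numpy as np

def rmse_3d(true_xyz, pred_xyz):
    """RMSE of Eq. (11) for arrays of shape (n, 3) holding X, Y, Z coordinates."""
    err = np.asarray(true_xyz, dtype=float) - np.asarray(pred_xyz, dtype=float)
    return np.sqrt(np.sum(err ** 2) / err.shape[0])

# Consistency check against the AOFC row of Table 2 (per-axis RMSE values).
aofc_axes = np.array([0.3271, 0.0430, 0.0334])
aofc_3d = np.sqrt(np.sum(aofc_axes ** 2))
```

The value obtained from the per-axis AOFC errors agrees with the 3D entry of 0.3316 in Table 2 to the printed precision, and the same relation holds for the LSTM, GRU and OFCA rows.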
5 Conclusion
In this paper, the OFCA clustering algorithm is improved with the AP clustering algorithm to form the AOFC clustering algorithm, which enhances both the efficiency and the accuracy of clustering. The number of fuzzy rules and the antecedent parameters for structure identification of the T-S model are determined effectively. In addition, the parameter identification of the T-S model is improved by the recursive steady-state Kalman filter. Finally, the effectiveness of the proposed method is verified by comparison with several other methods.
References
1. Zhang, Z., Ni, G., Xu, Y.: Review of the status and development of trajectory prediction technology. Electron. Measurem. Technol. 43(13), 111–116 (2020). https://doi.org/10.19651/j.cnki.emt.2004194
2. Zhang, H., Huang, C., Xuan, Y., Tang, S.: Real-time prediction of air combat flight trajectory using GRU. Syst. Eng. Electron. 42(11), 2546–2552 (2020). https://doi.org/10.3969/j.issn.1001506X.2020.11.17
3. Li, H., Zhang, L.: Summary of clustering research in time series data mining. J. Univer. Electron. Sci. Technol. China 51(03), 416–424 (2022). https://doi.org/10.12178/1001-0548.2022055
4. Jiang, J., Wang, Z., Chen, T., Zhu, C., Chen, B.: Adaptive AP clustering algorithm and its application on intrusion detection. Complex Syst. Complex Sci. 36(11), 118–126 (2015). https://doi.org/10.11959/j.issn.1000-436x.2015242
5. Wang, N., Hu, F.: A handwriting digital recognition method based on enhanced objective cluster analysis. Complex Syst. Complex Sci. 16(02), 77–84+94 (2019). https://doi.org/10.13306/j.1672-3813.2019.02.009
Analysis on the Key Technology of Guidance and Control of Missile and Gun Combined Air Defense Missile Shujun Yang, Jirong Ma, Juanzhi Lu, and Duansong Li
Abstract Starting from the challenges that the battlefield environment and combat missions of the missile-gun integrated air defense weapon system pose to the guidance and control system, the key technologies and development trends of guidance and control for missile-gun combined air defense missiles are analyzed. Finally, some ideas on the development of air defense missiles are put forward. Keywords Missile and gun combined air defense missile · Key technology of guidance and control · Development trend
1 Introduction
The missile-gun combined air defense weapon system is a terminal integrated defense weapon system developed to cope with rapid, continuous saturation attacks by various air attack weapons. It combines small-caliber artillery with air defense missiles so that each compensates for the other's shortcomings and both can exploit their operational advantages. It provides rapid response and multiple interception opportunities against low-altitude, short-range targets, with strong firepower and high kill probability, and thereby achieves the best short-range defense effect [1]. The rapid development of air targets places higher requirements on the guidance and control system of missile-gun combined air defense missiles. In terms of type, in addition to high-performance fighter jets, bombers and helicopters, unmanned aerial vehicles, ballistic missiles, air-to-ground missiles, cruise missiles, guided bombs and patrol missiles have joined the ranks of air attack weapons. In terms of performance, modern air attack targets are generally fast, highly maneuverable and stealthy, and their cost keeps falling [2]. In terms of tactical application, the multidimensional integration of the sea, land, air, space and electromagnetic domains makes the air defense missile guidance and control system
S. Yang (B) · J. Ma · J. Lu · D. Li Xian Institution of Modern Control Technology, Xian 201848, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_17
S. Yang et al.
face an increasingly complex combat environment. A large body of research shows that without a well-performing guidance and control system, an air defense missile weapon system cannot complete its prescribed tasks [3].
2 Status of Guidance and Control Technology of Missile-Gun Combined Air Defense Missile
Because of the advantages of the missile-gun combined air defense weapon system, many countries, including Russia and the United States, have carried out research on it. At present, more than 20 kinds of missile-gun combined air defense weapon systems are under development or in service. The main ones are “Tunguska”, “Panzer” and “Kashtan” of Russia, “Avenger” and “Linebacker” of the United States, “Sinai” and “Nile” of Egypt, “Szopol” and “Lara” of Poland, “Sidam” of Italy, the new “Adatz” jointly developed by the United States and Switzerland, and the PGZ04 and Type 04(A) 25 mm missile-gun combined anti-aircraft weapons of China. The “Tunguska” missile-gun integrated air defense system, developed by the former Soviet Union, consists of two 2A38M 30 mm twin-barrel automatic guns, four or eight 9M311 air defense missiles, a search radar, an optical sight, a fire control computer, etc. The 9M311 missile has a two-stage body, and its booster is a solid rocket engine. After the booster burns out, the separation mechanism operates and the booster is jettisoned; the main stage then coasts by inertia. The main stage has 4 fixed wings and 4 control surfaces, and it can engage air targets at altitudes of 15–3500 m and ranges of 2500–8000 m. “Panzer” was developed by Russia after “Tunguska”; its armament consists of two 2A72 30 mm automatic cannons and 12 5736YE air defense missiles. The 5736YE missile has a range of 1200–12,000 m, an engagement altitude of 5–6000 m, a maximum speed of 1100 m/s and a maneuvering overload of 32 g. The missile uses electro-optical and radar measurement with radio command guidance [4]. Figure 1 shows typical Russian missile-gun combined anti-aircraft weapon systems: Tunguska and Panzer.
Although the United States adopts the air defense strategy of “air to air”, it still fields large numbers of missile-gun combined air defense weapon systems. Because its national conditions and operational requirements differ from Russia's, the United States integrates the missiles, guns and fire control equipment of such a system on a single vehicle with strict weight restrictions. The system usually combines small-caliber anti-aircraft guns with man-portable air defense missiles, protects a small airspace and is often deployed at the front of the battle area. Models such as “Avenger” and “Linebacker” all use the “Stinger” man-portable missile [5].
Analysis on the Key Technology of Guidance …
Fig. 1 Russian “Tunguska” and “Panzer” missile-gun combined air defense systems
The U.S. Army is mainly equipped with the Avenger and Linebacker anti-aircraft weapon systems. The Avenger system consists of two four-round Stinger missile launch modules, infrared search and tracking equipment, and communication and navigation equipment. The Linebacker missile-gun combined system consists of a four-round Stinger missile launch module, infrared search and tracking equipment, communication and navigation equipment, etc. The Stinger is a man-portable infrared homing missile with an effective range of about 5 km. The basic version of the Stinger dates from the late 1970s and early 1980s; improved versions followed, including “Stinger-POST”, “Stinger-RMP”, “Stinger-Block1” and “Stinger-Block2”. The most important improvement is in the seeker detector, which developed from the single-element infrared detector of the basic version to four-element infrared, infrared/ultraviolet dual-color and, most recently, focal plane imaging detectors, improving the missile's resistance to infrared decoys. The intended mode of use determined the design of the Avenger and Linebacker missile-gun combined systems. Neither system is equipped with radar, which improves operational concealment. The 25 mm automatic gun is light and flexible, but it has limited firepower as an anti-aircraft weapon and is mainly used for self-defense in emergencies. Figure 2 shows typical American missile-gun combined anti-aircraft weapon systems: the Avenger and the Linebacker. The above analysis shows that in the Russian systems the air defense missiles and guns share ground guidance equipment, which maximizes the utilization of that equipment and integrates missile and gun in a real sense. This is possible because the guidance system uses remote command guidance and the guidance distance is long.
The U.S. missile-gun combined air defense missiles use an independent guidance mode and are equipped with infrared seekers; their guidance distance is relatively short.
Fig. 2 U.S. Army Avenger and Linebacker missile-gun combined systems

Table 1 Performance and characteristics of the guidance and control system of main missile-gun combined air defense missiles in Russia
Item                                      Tunguska             Panzer
Fitted missile                            9M311                5736YE
Range (m)                                 2500–8000            1200–20,000
Detection distance                        18 km                36–38 km
Tracking distance                         13 km                24–30 km
Guidance equipment                        Search radar         Search radar
Guidance mode                             Radio command        Radio command
Control mode                              Aerodynamic rudder   Aerodynamic rudder
Ability to engage multiple targets        No                   Yes

Table 2 Performance and characteristics of the main missile guidance and control system of the United States
Item                                      Avenger              Linebacker
Fitted missile                            Stinger              Stinger
Range (m)                                 5000                 5000
Guidance equipment                        Infrared             Infrared
Guidance mode                             Infrared homing      Infrared homing
Control mode                              Aerodynamic rudder   Aerodynamic rudder
Ability to engage multiple targets        Yes                  Yes
Tables 1 and 2 respectively summarize the performance and characteristics of the anti-aircraft missile guidance and control system in the above typical missile-gun combined anti-aircraft weapon systems of Russia and the United States (see [1, 2, 6]).
3 Analysis and Development Trend of Guidance and Control Technology of Missile-Gun Combined Air Defense Missile
3.1 Further Improve the Guidance and Control Accuracy with Anti-missile Capability
In the future, more and more combat aircraft will be capable of stand-off attack from outside the defended area, so aircraft will gradually fall outside the scope of terminal defense, while missile-type targets such as air-to-ground missiles, guided bombs and patrol missiles will be encountered more and more often. Such targets are generally small. For missile-gun combined air defense missiles to play their role on the future battlefield, their guidance and control accuracy must be further improved so that they gain anti-missile capability. Higher guidance and control accuracy is also decisive for miniaturizing the weapon and thus increasing the payload capacity. In addition, future wars will often require not only destroying enemy targets but also avoiding unnecessary collateral damage. It is therefore imperative to further improve the guidance and control accuracy of missile-gun combined air defense missiles. Composite guidance, multi-mode homing guidance and composite attitude control can improve this accuracy step by step.
3.2 Reduce the Cost of Guidance and Control System
Guided bombs, patrol missiles, small UAVs and similar targets are cheap. To deal with them, missile-gun combined air defense missiles of sufficiently low cost must be developed; otherwise, the defender will bear huge economic pressure in future wars. The development cost of the guidance and control system accounts for 60–70% of the development cost of a precision guided weapon [6]. Reducing the cost of the guidance and control system of missile-gun combined air defense missiles can therefore greatly reduce the cost of the entire weapon system. The U.S. military plans to reduce the total cost of precision guided weapons by reducing the cost of their detection devices and guidance systems. Scientific and technological progress can reduce the cost of materials, components, subsystems and entire systems. The development of detector materials has reduced the cost of focal plane arrays by a factor of 20, and the cost of inertial measurement equipment has fallen by an order of magnitude. When developing the guidance and control system, lower-cost detection materials, components and parts should be selected as far as possible, provided that the guidance and control performance is satisfied, to achieve a low-cost guidance and control system.
3.3 Track Multiple Targets and Guide Multiple Missiles
At the beginning of a war, large-scale air strikes are often carried out: a large number of attacking forces are concentrated on one or a few targets within a short period, overwhelming ground air defense fire. New missile-gun combined air defense weapon systems are therefore generally required to have stronger resistance to saturation attack. For the guidance and control system this means tracking multiple targets and guiding multiple missiles at the same time. The main measures include equipping the system with phased array radar to identify and track multiple targets and guide multiple missiles, and choosing missiles that can attack targets independently, so that missiles do not interfere with one another or with the guns; such missiles are generally “fire and forget” missiles. A missile-gun combined air defense weapon system of this kind can engage multiple targets simultaneously, which improves its ability to resist saturation attacks.
3.4 Enhance the Ability to Resist Low-Level Penetration
Helicopters, unmanned aerial vehicles, cruise missiles, patrol missiles and other targets can fly at low altitude, down to a few meters above the ground. The target signal is then submerged in clutter and is difficult for radar to find. In addition, the curvature of the earth greatly reduces the low-altitude detection range of radar, so the warning time is very short and such targets pose a serious threat. In the future, search radars and tracking and guidance radars can improve their low-altitude detection performance by suppressing clutter and multipath effects, and electro-optical systems with good low-altitude detection performance can be added to enhance the anti-penetration ability of missile-gun combined air defense missiles.
3.5 Anti-jamming and Anti-stealth Capabilities
Infrared jamming and electromagnetic jamming will play an increasingly important role in future air attack operations. At the same time, air attack weapons use various means to reduce their radar cross section (RCS) and suppress their infrared radiation, which lowers the probability of detection and enhances their attack and penetration ability. New missile-gun combined air defense weapons must therefore resist infrared and electromagnetic jamming and counter stealth; only then can they have a place in future air defense operations. Continued development of composite guidance and multi-mode homing guidance, together with major breakthroughs in information fusion processing and automatic target recognition, can improve the anti-jamming ability of missile-gun combined air defense missiles and give them anti-stealth capability.
3.6 Fast Response Control Technology
High-speed targets such as ballistic missiles and supersonic cruise missiles are also important targets for missile-gun combined air defense missiles. Such targets are extremely fast and highly maneuverable, with speeds of several Mach or even more than Mach 10. Traditional control technologies are powerless against them. A rapid-response control system using direct force can give the missile high maneuverability within a short time, which ensures that missile-gun combined air defense missiles can complete combat tasks requiring rapid maneuver and successfully intercept ballistic missiles, supersonic cruise missiles and other high-speed targets.
3.7 Improve the Target Tracking Ability on the Motion Carrier
Firing on the move has always attracted much attention and is an important development direction for missile-gun combined anti-aircraft weapon systems. As the system moves, the attitude of the carrier changes constantly with the terrain. Eliminating the influence of carrier attitude changes on target tracking, including on the line of sight, tracking speed and tracking accuracy, is a key step toward firing on the move.
3.8 Compound Guidance
A single-mode guidance system will have difficulty adapting to the new requirements of warfare, so developing and adopting composite guidance is the only option. Compound guidance combines the performance advantages of two or more spectral bands; it exploits the strengths of each mode and lets the modes compensate for one another's weaknesses. In tactical use, it greatly improves the anti-jamming performance, all-weather performance, anti-stealth capability and target identification ability of the guidance system, improves guidance accuracy and expands the operating range. Compound guidance takes many forms. Classified by guidance system, it can combine radio-frequency and optical modes; classified by basic guidance method, it can combine remote control and homing. Among the various composite forms, infrared/millimeter-wave composite guidance performs best: the optical and electronic channels are complementary, each overcoming the other's shortcomings, and it remains the focus of research worldwide now and for a considerable time to come.
3.9 Multi-mode Homing Guidance
Multi-mode homing guidance can both improve guidance accuracy and enhance anti-jamming ability. With the emergence of many new high technologies and their wide application in precision guidance, the information content and intelligence level of guided weapons will keep improving, which in turn drives the development of multi-mode homing guidance. At present, multi-mode seekers in service or under development mainly adopt a dual-mode composite form, including ultraviolet/infrared, TV/infrared, laser/infrared, radio-frequency dual-frequency (including active/passive composite), radio frequency/infrared, millimeter wave/infrared and millimeter wave/infrared imaging. In addition, information fusion, automatic target identification, robust control, network-centric guidance, integrated guidance and control design, and integrated fuze-guidance design are all development directions and trends for the guidance and control of combined missile-gun air defense missiles.
4 Conclusion
This type of weapon is a rising star in the field of air defense missiles and already has a certain foundation. We must press ahead and produce excellent products to gain a firm footing in this field. Research on guidance and control technology runs through the whole development process of a missile weapon system, and its performance is of great significance to the evaluation of the system's overall performance. It is therefore necessary to conduct in-depth research on guidance and control technology, to give full play to the weapon's technical advantages in guidance accuracy, low-altitude defense capability, rapid response, convenience and flexibility, to find the right entry point, and to develop distinctive anti-aircraft missiles.
Evaluating RNN and Its Improved Models for Lithium Battery SoH and BRL Prediction
Feifan Yu, Jiqiang Wang, and Xinmin Chen
Abstract Drones require high-performance lithium batteries, and conventional battery replacement standards do not apply to them. To address this, the paper introduces the State of Health (SoH) as an indicator of battery condition and proposes the concept of Battery Replacement Life (BRL) for Unmanned Aerial Vehicles (UAVs). To predict the SoH and BRL of UAV lithium batteries, the study employs recurrent neural networks (RNN) and their improved variants designed for time-series problems, namely long short-term memory (LSTM) and gated recurrent units (GRU). The results show that 3–4 layer LSTM and GRU models give promising predictions of the SoH and BRL of UAV lithium batteries.
Keywords SoH · BRL · UAVs · RNN
1 Research Background
Predicting UAV lifespan is a multifaceted problem influenced by many factors, such as the design, manufacture, and usage of the UAV. Relying solely on battery lifespan to calculate UAV lifespan is therefore inadequate and imprecise; other factors must be incorporated, together with sufficient experimentation and testing. In fact, the lifespan of a battery [1] is significantly shorter than that of the device (drone) it powers, and it is not cost-effective to replace the entire device solely because the battery has reached the end of its life. In light of this, this paper introduces the concept of "BRL for UAVs".
F. Yu
University of Chinese Academy of Sciences, Beijing 101408, China
F. Yu · J. Wang (B) · X. Chen
Ningbo Institute of Materials Technology & Engineering, CAS, Ningbo 315201, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_18
BRL [2] refers to the number of charge/discharge cycles over which the actual capacity of a UAV battery declines and its internal resistance increases until the battery can no longer meet the requirements of normal UAV flight. Within the lithium battery industry, a decline of a battery's capacity to 80% [3] is regarded as its End of Life (EOL) [4]. For UAVs, however, mission complexity may raise the capacity required of the battery. In this paper, the EOL threshold of the UAV battery [5] is therefore set above 80%, based on the UAV parameters, to accommodate complex flight requirements.
Relying on the remaining capacity alone to determine the replacement life of a battery is inadequate. Because each lithium battery has a different standard capacity, absolute capacity is not a uniform criterion, and it is unsound to use residual capacity directly to estimate and predict BRL. In 2016, Berecibar [6] proposed a method of defining SoH that addresses this issue, and SoH is therefore employed in this paper.
Lithium-ion batteries have become a common energy source for UAVs, but the batteries used in UAVs are model-specific. Due to dataset limitations, this study is restricted to publicly available lithium-ion battery test datasets from research institutions. The NASA Ames Prognostics Center of Excellence (PCoE) and the Center for Advanced Life Cycle Engineering (CALCE) at the University of Maryland both provide widely used Li-ion battery datasets. For this study, the CALCE Li-ion dataset was selected for training and testing because of its better data continuity. The batteries used are CS2 cells cycled at a constant current of 1C [7], corresponding to datasets CS2-35, CS2-36, CS2-37, and CS2-38.
Although previous studies have used neural networks to predict lithium battery lifetime [8–12], this paper uses neural networks to predict the SoH and BRL of lithium batteries specifically in the context of UAVs. The rest of the paper is divided into three parts. Section 2 introduces the definitions of SoH and BRL, reviews other methods of calculating SoH, shows the relationship between SoH as calculated here and other lithium battery parameters, and presents the dataset used. Section 3 presents the evaluation index system and calculation methods for SoH and BRL prediction accuracy. The evaluation system has two objectives: first, the predicted values at key points must be close to the actual values, and second, the whole predicted SoH curve must be close to the actual curve. The prediction results of each neural network are then shown, starting with the standard RNN and using MLP for comparison. The improved models GRU [13] and LSTM [14], which filter out valid parameters during training, are evaluated next. Finally, the accuracy scores of each network under the evaluation system are given. Section 4 concludes with an analysis of the results.
2 Model Preparation
2.1 Calculating SoH
There are many ways to calculate SoH. The most widely used definition, based on battery capacity decay, is given in [6]:
$$\mathrm{SoH} = \frac{C_{\mathrm{aged}}}{C_{\mathrm{rated}}} \times 100\% \tag{1}$$
where $C_{\mathrm{aged}}$ is the current capacity of the battery and $C_{\mathrm{rated}}$ is its rated capacity. This capacity decay model is used in this paper to calculate SoH. To illustrate the validity of the chosen dataset, Fig. 1 shows the relationship between SoH and the number of cycles for the four lithium battery datasets. The four curves display a comparable pattern: SoH decreases gradually as the cycle count increases, with fluctuations within an acceptable range. There are several other methods of determining SoH, based on quantities such as internal resistance, the number of charge and remaining cycles, constant current charge time (CCCT) and constant voltage charge time (CVCT).
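As a minimal sketch of Eq. (1), the capacity-decay definition can be written as a one-line function. The function name and the capacity values below are illustrative assumptions, not taken from the CALCE files.

```python
def state_of_health(aged_capacity_mah: float, rated_capacity_mah: float) -> float:
    """Return SoH in percent: current capacity over rated capacity, times 100."""
    if rated_capacity_mah <= 0:
        raise ValueError("rated capacity must be positive")
    return aged_capacity_mah / rated_capacity_mah * 100.0


# Hypothetical per-cycle capacity measurements (mAh) for one cell.
measured = [1100.0, 1045.0, 990.0]
soh_curve = [state_of_health(c, rated_capacity_mah=1100.0) for c in measured]
print(soh_curve)
```

Because SoH is a ratio, cells with different rated capacities can be compared on the same scale, which is the reason the paper gives for preferring it to raw capacity.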
Fig. 1 Li-ion battery SoH variation with cycle time
Fig. 2 Correlation of SoH with other cell parameters in the CS2-38 dataset
However, these quantities are usually roughly proportional or inversely related to the capacity-decay SoH used in this paper as the number of charge/discharge cycles increases, as shown in Fig. 2. This suggests that the choice of SoH calculation method will not have much impact on the results of this paper.
2.2 Definition of BRL
This section discusses the BRL threshold applicable to UAVs. The cycle life of a battery is the number of times the battery can be charged and discharged before its capacity and performance start to degrade. Degradation occurs because of the chemical changes and losses that the electrolyte and electrode materials undergo during cyclic charging and discharging. The cycle life is therefore closely related to usage conditions such as charging method, operating temperature, and depth of discharge. In unmanned aerial vehicles (UAVs), the batteries undergo frequent charging and discharging cycles, resulting in a relatively
short battery cycle life. Environmental factors such as high or low temperature and humidity can shorten the battery's lifespan further. Consequently, to maintain the performance and safety of UAVs, users are generally advised to replace the battery after approximately 200 cycles. Based on the definition above, the Battery Replacement Life (BRL) of a UAV is essentially a specific End-of-Life (EOL) threshold, expressed in terms of State of Health (SoH), chosen to meet the operational requirements of the UAV mission. For the CS2 Li-ion battery investigated in this study, which has a rated capacity of 1100 mAh, the remaining capacity after 200 charge/discharge cycles was between 990 and 1020 mAh, that is, approximately 90.0–92.7% of the rated capacity. Accordingly, using Eq. (1), this paper defines the BRL of a UAV as the number of charge/discharge cycles required for the battery's SoH to decline to 0.9.
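The BRL definition above can be sketched as a search for the first cycle at which SoH reaches the 0.9 threshold. This is only one possible reading: the first-crossing rule, the function name and the sample curve are assumptions, and a noisy measured curve might need smoothing before such a rule is applied.

```python
from typing import Optional, Sequence


def battery_replacement_life(soh: Sequence[float], threshold: float = 0.9) -> Optional[int]:
    """Return the 1-based cycle number at which SoH (as a fraction) first
    reaches or drops below the threshold, or None if it never does."""
    for cycle, value in enumerate(soh, start=1):
        if value <= threshold:
            return cycle
    return None


# Hypothetical SoH curve, one value per charge/discharge cycle.
soh_fraction = [1.00, 0.97, 0.94, 0.91, 0.90, 0.88]
print(battery_replacement_life(soh_fraction))  # 5
```

The same function applied to a predicted SoH curve gives the predicted BRL, which can then be compared with the BRL of the measured curve.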
3 SoH and BRL Predictions
3.1 Neural Network Model Prediction
When predicting the BRL, it is important to consider not only the number of charge/discharge cycles at which the predicted SoH drops to the critical value, but also the error of the overall life prediction curve. Otherwise the model may overfit: the BRL prediction is accurate, but the overall deviation is larger. To measure the accuracy of single-point predictions, we use the relative error (RE). To measure the error of the entire prediction curve, we use the root mean square error (RMSE) and the mean absolute error (MAE).
We selected RNN as the primary focus of this study because it is well suited to time-series data, and included MLP as a comparison method. We then investigated the improved versions of RNN, namely GRU and LSTM, by comparing and evaluating their performance. First, each model was tuned to make its predictions as accurate as possible. As Table 1 shows, MLP and RNN work best at three layers and start to overfit at four layers, while the more advanced models, LSTM and GRU, already achieve better results at two layers. Each model was then set to its optimal number of layers and its performance on the dataset was examined. Still using CS2-38 as an example, the prediction results are shown in Fig. 2. We then examined how accurately the models predict BRL. The results are shown in Table 2.
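The three error measures named above can be sketched as follows. The formulas are the standard definitions, which the paper does not spell out, so their exact form here is an assumption; the sample values are hypothetical.

```python
import math
from typing import Sequence


def relative_error(predicted: float, actual: float) -> float:
    """Single-point relative error, e.g. for a predicted BRL cycle count."""
    return abs(predicted - actual) / abs(actual)


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute deviation between two curves of equal length."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("curves must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def root_mean_square_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Square root of the mean squared deviation between two curves."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("curves must be non-empty and of equal length")
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Hypothetical SoH curves and BRL values.
actual_soh = [1.0, 0.9, 0.8]
predicted_soh = [0.9, 0.9, 1.0]
print(mean_absolute_error(predicted_soh, actual_soh))
print(root_mean_square_error(predicted_soh, actual_soh))
print(relative_error(predicted=210, actual=200))
```

RMSE penalises large deviations more heavily than MAE, so for the same pair of curves RMSE is never smaller than MAE.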
Table 1 Error in SoH curve predicted by each model
Models            RE      MAE     RMSE
MLP layers = 2    0.2021  0.0949  0.1262
MLP layers = 3    0.1575  0.0775  0.1017
MLP layers = 4    0.2141  0.1030  0.1334
RNN layers = 2    0.1614  0.0914  0.1110
RNN layers = 3    0.1201  0.0797  0.0975
RNN layers = 4    0.1308  0.1271  0.1606
LSTM layers = 2   0.0908  0.0544  0.0743
LSTM layers = 3   0.1172  0.0625  0.0858
LSTM layers = 4   0.1458  0.0681  0.0976
GRU layers = 2    0.1367  0.1367  0.0975
GRU layers = 3    0.2303  0.0888  0.1233
GRU layers = 4    0.1624  0.0737  0.1035
Table 2 RE error in BRL values predicted by each model
Models        MLP     RNN     LSTM    GRU
Layers = 2    0.3864  0.3636  0.3068  0.1061
Layers = 3    0.1856  0.3826  0.1098  0.0492
Layers = 4    0.2273  0.375   0.0379  0.1212
Table 2 shows that GRU gives good BRL accuracy when the models are only two or three layers deep, and that LSTM performs even better as the model depth increases. At this point, we can expect the LSTM model to perform well in both overall SoH prediction and individual BRL point prediction. To demonstrate the advantages of LSTM in SoH and BRL prediction more rigorously, a simple evaluation model is constructed in this paper (Fig. 3).
3.2 Model Effects and Evaluation
The evaluation model proposed in this paper aims first to minimise the prediction error of the SoH curve and second to maximise the accuracy of the BRL prediction. The evaluation of SoH error involves three metrics, RE, MAE and RMSE, each with its own strengths and applicability. This paper therefore uses the entropy weighting method to determine their respective weights. Note that error is a cost-oriented metric, with lower values indicating better performance. In order to compare the differences between indicator values more intuitively, we
Fig. 3 Correlation of SoH with other cell parameters in the CS2-38 dataset
convert all indicators to benefit-oriented indicators by subtracting each cost-oriented indicator from 1. Secondly, we normalise the data for the three indicators and use the entropy weighting method to obtain the weight of each indicator. The normalised data are then weighted and summed to give each network model a score $D_1$ for the accuracy of the SoH curve:
$$D_1 = \alpha I_{RE} + \beta I_{MAE} + \gamma I_{RMSE} \tag{2}$$
where $I_{RE}$, $I_{MAE}$ and $I_{RMSE}$ are the benefit-oriented indicators for RE, MAE and RMSE respectively, and $\alpha$, $\beta$, $\gamma$ are the weights obtained by the entropy weighting method for the three indicators. The evaluation of BRL error involves a single metric, RE, which after conversion to a benefit-oriented indicator is used directly as the BRL prediction accuracy score $D_2$. Finally, the total score $D$ is calculated as
$$D = \lambda D_1 + (1 - \lambda) D_2 \tag{3}$$
Table 3 Prediction accuracy scores for each model
Models            D1      BRL RE (D2 = 1 − RE)  D
MLP layers = 2    0.8549  0.3864                0.7342
MLP layers = 3    0.8848  0.1856                0.8496
MLP layers = 4    0.8458  0.2273                0.8091
RNN layers = 2    0.8760  0.3636                0.7562
RNN layers = 3    0.8997  0.3826                0.7586
RNN layers = 4    0.8624  0.375                 0.7437
LSTM layers = 2   0.9260  0.3068                0.8096
LSTM layers = 3   0.9099  0.1098                0.9000
LSTM layers = 4   0.8936  0.0379                0.9279
GRU layers = 2    0.8739  0.1061                0.8839
GRU layers = 3    0.8467  0.0492                0.8988
GRU layers = 4    0.8837  0.1212                0.8812
where λ is the weight balancing the two scores. In this paper, we consider the prediction accuracy of the SoH curve to be as important as that of the BRL point, so λ = 0.5. Using this evaluation system and the forecast error data, we obtained the SoH, BRL and total prediction accuracy scores shown in Table 3, where the weights used to calculate $D_1$ are α = 0.3896, β = 0.3393 and γ = 0.2711. Table 3 shows the strengths and weaknesses of each model on this problem. MLP produces good results at three layers, but under-fits at two layers and over-fits at four. RNN does not perform as well as MLP and also over-fits from four layers onwards. LSTM and GRU perform much better than the previous two models, which is expected, as both are improved versions of the standard RNN. LSTM achieved its best SoH prediction at two layers and gradually started to over-fit beyond three layers, but when predicting BRL, more layers led to better results. GRU's performance was average when predicting SoH, but it achieved its best BRL prediction at three layers.
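The scoring procedure of Eqs. (2) and (3) can be sketched as below. The entropy weighting function follows the common textbook form (min-max normalisation, column entropy, weights proportional to one minus entropy), which is an assumption since the paper does not list its exact steps. All function names are hypothetical. The worked example applies the reported weights directly to 1 − error for the two-layer MLP, which gives roughly 0.855 and 0.734, close to the values listed for that model in Table 3.

```python
import math
from typing import List, Sequence


def entropy_weights(benefit: Sequence[Sequence[float]]) -> List[float]:
    """Entropy weighting over a matrix with one row per model and one
    column per benefit-oriented indicator (assumed textbook variant)."""
    n, m = len(benefit), len(benefit[0])
    if n < 2:
        raise ValueError("need at least two models to compute entropy")
    divergence = []
    for j in range(m):
        column = [row[j] for row in benefit]
        lo, hi = min(column), max(column)
        # Min-max normalise; a constant column carries no information.
        scaled = [(v - lo) / (hi - lo) if hi > lo else 1.0 for v in column]
        total = sum(scaled)
        probs = [v / total for v in scaled]
        entropy = -sum(p * math.log(p) for p in probs if p > 0) / math.log(n)
        divergence.append(max(0.0, 1.0 - entropy))
    norm = sum(divergence)
    if norm == 0:
        return [1.0 / m] * m  # no indicator discriminates between models
    return [d / norm for d in divergence]


def soh_score(re: float, mae: float, rmse: float,
              alpha: float, beta: float, gamma: float) -> float:
    """Eq. (2): weighted sum of benefit-oriented indicators (1 - error)."""
    return alpha * (1 - re) + beta * (1 - mae) + gamma * (1 - rmse)


def total_score(d1: float, brl_re: float, lam: float = 0.5) -> float:
    """Eq. (3), with D2 taken as 1 - RE of the BRL prediction."""
    return lam * d1 + (1 - lam) * (1 - brl_re)


# Two-layer MLP errors from Tables 1 and 2, with the weights reported in the text.
d1 = soh_score(0.2021, 0.0949, 0.1262, alpha=0.3896, beta=0.3393, gamma=0.2711)
d = total_score(d1, brl_re=0.3864, lam=0.5)
print(round(d1, 3), round(d, 3))
```

With λ = 0.5 the total score is simply the mean of the curve score and the BRL score; a different λ would shift the ranking toward whichever objective matters more for a given application.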
4 Conclusion
The innovations of this paper are as follows.
1. This paper redefines the EOL of lithium batteries used in UAVs based on the characteristics of UAVs.
2. This paper introduces the concept of "UAV battery replacement life (BRL)" and uses SoH as the evaluation index for BRL.
3. Four neural network models, MLP, RNN, LSTM, and GRU, are used to predict SoH, and their effects are compared and evaluated.
4. This paper establishes a dual-objective evaluation system to ensure that the deviation in curve prediction is small while the accuracy of BRL prediction remains high.
In the setting of this paper, predicting SoH and BRL, the comparison model MLP outperforms RNN, possibly because the dataset still contains some interference even after data cleaning. LSTM and GRU, being improved versions of RNN, handle such disturbances very well. LSTM incorporates forget gates, input gates, and output gates to filter out valid parameters from previous training, which significantly improves accuracy; GRU's update gate and reset gate have a similar effect. The results show that the problem of SoH and BRL prediction is effectively solved by a 3–4 layer LSTM network, which is of great practical importance for the maintenance of UAVs.
References
1. Han, X., Lu, L., Zheng, Y., Feng, X., Li, Z., Li, J., Ouyang, M.: A review on the key issues of the lithium ion battery degradation among the whole life cycle. ETransportation 1, 100005 (2019)
2. Tran, M.-K., Cunanan, C., Panchal, S., Fraser, R., Fowler, M.: Investigation of individual cells replacement concept in lithium-ion battery packs with analysis on economic feasibility and pack design requirements. Processes 9(12), 2263 (2021)
3. Shahjalal, M., Roy, P.K., Shams, T., Fly, A., Chowdhury, J.I., Rishad Ahmed, Md., Liu, K.: A review on second-life of Li-ion batteries: prospects, challenges, and issues. Energy 241, 122881 (2022)
4. Lee, J., Kwon, D., Pecht, M.G.: Reduction of Li-ion battery qualification time based on prognostics and health management. IEEE Trans. Indus. Electron. 66(9), 7310–7315 (2018)
5. Abeywickrama, H.V., Jayawickrama, B.A., He, Y., Dutkiewicz, E.: Comprehensive energy consumption model for unmanned aerial vehicles, based on empirical studies of battery performance. IEEE Access 6, 58383–58394 (2018)
6. Berecibar, M., Gandiaga, I., Villarreal, I., Omar, N., Van Mierlo, J., Van den Bossche, P.: Critical review of state of health estimation methods of Li-ion batteries for real applications. Renew. Sustain. Energy Rev. 56, 572–587 (2016)
7. Sun, H., Wen, X., Liu, W., Wang, Z., Liao, Q.: State-of-health estimation of retired lithium-ion battery module aged at 1C-rate. J. Energy Storage 50, 104618 (2022)
8. Wang, H.-K., Zhang, Y., Huang, M.: A conditional random field based feature learning framework for battery capacity prediction. Sci. Rep. 12(1), 13221 (2022)
9. Chemali, E., Kollmeyer, P.J., Preindl, M., Ahmed, R., Emadi, A.: Long short-term memory networks for accurate state-of-charge estimation of Li-ion batteries. IEEE Trans. Indus. Electron. 65(8), 6730–6739 (2017)
10. Hoque, Md.A., Hassan, M.K., Hajjo, A., Tokhi, M.O.: Neural network-based Li-ion battery aging model at accelerated C-rate. Batteries 9(2), 93 (2023)
11. Ardeshiri, R.R., Ma, C.: Multivariate gated recurrent unit for battery remaining useful life prediction: a deep learning approach. Int. J. Energy Res. 45(11), 16633–16648 (2021)
12. Singh, M., Bansal, S., Panigrahi, B.K., Garg, A.: A genetic algorithm and RNN-LSTM model for remaining battery capacity prediction. J. Comput. Inform. Sci. Eng. 22(4), 041009 (2022)
13. Che, Z., Purushotham, S., Cho, K., Sontag, D., Liu, Y.: Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 8(1), 6085 (2018)
14. Gers, F.A., Schmidhuber, J., Cummins, F.: Learning to forget: continual prediction with LSTM. Neural Comput. 12(10), 2451–2471 (2000)
MDIoT: IoT Device Identification Method Based on Traffic Characteristics
Hanxi Zheng, Ruijun Liu, Huanpu Yin, and Haisheng Li
Abstract In recent years, the rapid growth of Internet of Things (IoT) devices has brought security risks with it, highlighting the need for IoT device management. Identification and classification are key components of device management. However, existing work on IoT device identification extracts features manually based on prior knowledge, which produces many redundant features and reduces identification accuracy. In this paper, we propose a model called MDIoT. Our model uses a multi-voting method for feature selection, combined with a Multi-objective Genetic Algorithm to reduce redundant features and noise. Experimental results show that our feature selection method is more efficient and that the classification accuracy exceeds 95%.
Keywords MDIoT · Internet of Things · Device feature · Device identification
H. Zheng · R. Liu · H. Yin (B) · H. Li
School of Computer and Engineering, Beijing Technology and Business University, Beijing 100048, China
e-mail: [email protected]
Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing 100048, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_19
1 Introduction
With the rapid development and popularity of IoT, its security issues are becoming increasingly serious [1]. For example, Mirai [2] is a large-scale botnet composed of IoT devices that has infected over 600,000 devices worldwide. It caused massive distributed denial-of-service attacks and resulted in significant harm. IoT device management has become an important task for ensuring the security and stability of the IoT, and it supports IoT applications and services. One of the key aspects of IoT device management is device identification [3].
The identification of IoT devices refers to the recognition and classification of the various smart devices connected to a network by monitoring network traffic and device fingerprints [4]. It helps administrators track and manage the devices connected to the network for security and stability. Traditional methods identify IoT devices using techniques such as protocol parsing [5], port scanning [6], feature extraction [7] and traffic classification [8]. The basic idea of these methods is to identify and classify devices using the identification information (MAC address, IP address, port number, device type, protocol type) generated when the device interacts with the network. By analysing these characteristics, the device type can be identified.
Existing work on IoT device identification extracts features manually based on prior knowledge, which produces many redundant features and reduces identification accuracy. In this paper, we propose a method for identifying IoT devices (MDIoT), which investigates how more effective features can be extracted from packets for device identification. In detail, we extract features from the raw data and reduce redundant features with a multi-voting method. This is combined with a Multi-objective Genetic Algorithm (MGA) for feature selection to reduce redundant features and noise. Finally, we use a decision tree classifier to build the MDIoT model for the identification and classification of IoT devices. The contributions of our work are as follows:
(1) We design a device identification method based on network traffic feature selection for IoT device classification.
(2) We propose a method called MDIoT for identifying IoT devices, in which the multi-voting method reduces redundant features and MGA reduces noise in feature selection.
(3) The experimental results validate that a device identification model using a decision tree (DT) on the extracted features outperforms other machine learning (ML) classification models, with an accuracy of over 95%.
This paper is structured as follows. Section 2 reviews existing work on device identification. Section 3 introduces our proposed methodology. Section 4 discusses the experimental setting and presents the results. Finally, Sect. 5 concludes the paper by summarizing the findings and outlining future work.
2 Related Work
In this section, we discuss existing device identification work, which can be roughly grouped into five categories:
Protocol Stack. To address the challenge of identifying operating system types in large-scale data, Siby [9] proposed a solution that focuses on protocol stack features by observing the network traffic architecture at the link layer. However, a limitation of this approach arose from traffic variations within the capture time window, which can cause two devices of the same type to be misclassified as two different types.
Ensemble Learning. Cviti [10] presented a framework for classifying IoT devices in a smart home environment using an ensemble learning approach. They analyzed the traffic of 41 IoT devices using 13 flow-level network traffic characteristics generated by the devices. Their method achieved 97.79% accuracy in identifying IoT devices based on flow-based characteristics. However, their study did not consider IoT devices or real-time applications with malicious activity.
Semi-Supervised ML. As collecting labelled data is difficult and time-consuming, Li [11] proposed a semi-supervised ML algorithm for IoT device detection and identification. The model distinguishes between IoT and non-IoT devices and can classify specific IoT devices based on timeslot, traffic, protocol and TLS-related characteristics. It identified IoT and non-IoT devices with 99% accuracy using 5% of the labelled data. However, it did not target unlabelled devices.
Deep Learning. Lopez-Martin [12] used features based on traffic time series to predict IoT traffic types. They used Convolutional Neural Networks and Long Short-Term Memory networks, boosting the learning blocks in each fully connected layer. The analysis achieved an overall accuracy of 94.31% on the temporal values closest to the known historical values. This approach suffered from insufficient training time and efficiency issues.
Fingerprinting. IoT Sentinel, proposed by Miettinen [13], is a passive fingerprinting technique that analyzes the packet headers of connected devices. The authors evaluated their solution on 27 different off-the-shelf IoT device types and showed that the method achieved an average accuracy of 81% when identifying IoT device types only.
In this paper, we focus on network traffic features for device identification and combine them with an MGA algorithm.
3 Methodology
In our study, we propose a method for identifying IoT devices (MDIoT) that includes Data Pre-processing, Feature Extraction, Feature Selection and Device Identification. The first phase of the model pre-processes the initial dataset: the Pcap files are converted to CSV files. Next, feature selection is performed by the multi-voting method, which fuses four feature selectors. The MGA algorithm is then applied for additional feature selection. Finally, the device recognition model provides the classification results. The overall workflow of the system is illustrated in Fig. 1.
Fig. 1 Illustration of the overall framework
3.1 Data Pre-processing and Feature Extraction
The data packets of IoT devices can usually be obtained through network monitoring tools or APIs provided by the devices. After obtaining the packets, we clean and pre-process the data for further analysis and classification. The data cleaning process usually involves the following steps: (1) invalid, abnormal and duplicate data packets are filtered out; (2) split data packets are reassembled into complete packets to facilitate subsequent analysis; (3) the data is parsed to extract fields from the packets, such as source IP address, destination IP address, port number and protocol type. Furthermore, to reduce the dimensionality of the feature space and alleviate data imbalance, we group the wide range of port numbers into a smaller number of categories. We extract 110 raw features from the packets using the Wireshark tool.
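The port-grouping step above can be sketched as a simple bucketing function. The paper does not state which categories it uses, so the three IANA port ranges below (system, user, dynamic) and the category names are an assumption chosen for illustration.

```python
def port_category(port: int) -> str:
    """Map a TCP/UDP port number to one of three coarse categories,
    following the IANA port ranges (an assumed grouping)."""
    if not 0 <= port <= 65535:
        raise ValueError(f"invalid port number: {port}")
    if port <= 1023:
        return "well_known"   # system ports, e.g. 53 (DNS), 443 (HTTPS)
    if port <= 49151:
        return "registered"   # user ports, e.g. 1883 (MQTT)
    return "dynamic"          # ephemeral or private ports


# Hypothetical destination ports seen in a capture.
for p in (53, 443, 1883, 51234):
    print(p, port_category(p))
```

Replacing up to 65,536 distinct port values by a handful of categories keeps the feature usable by tree-based classifiers without creating one sparse column per port.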
3.2 Feature Selection
(1) Multi-voting Method. After pre-processing the data, we use the multi-voting method to select the most relevant features, which reduces the interference of irrelevant or redundant features with the model. Traditional voting techniques, such as Random Forest (RF), can handle high-dimensional data and a large number of features. However, they may be prone to overfitting, especially when the data contains many redundant or noisy features. To overcome these limitations, we leverage the multi-voting method. By combining the results of several methods, we can draw on the strengths of each and mitigate the limitations of any single feature selection approach. This fusion provides a more comprehensive and stable feature selection outcome and improves accuracy. RF is used to combine the predictions of multiple decision trees. Considering both feature importance and efficiency, Recursive Feature Elimination (RFE) is used to gradually eliminate less important features. To determine the relationship between each feature and the target variable more accurately, Chi_square (CS) is used. Finally, to limit the complexity of the model and allow fewer features to be selected, L_One (LO) is chosen. The multi-voting method fuses the results of RF, RFE, CS and LO into the final voting result. We keep features with scores greater than five and exclude features with scores less than five. The results of the multi-voting method are illustrated in Fig. 2.
(2) Multi-objective Genetic Algorithm. According to our experiments, the device identification accuracy is about 75% using the above features. To select more relevant features automatically, we propose the MGA algorithm for further feature selection. It can reduce noise and thus improve the performance and prediction accuracy of the model.
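The fusion step of the multi-voting method in part (1) can be sketched as a vote count over the feature sets returned by the individual selectors. This is a simplified reading: the paper does not define how its scores are computed, so giving one vote per selector, the `min_votes` parameter, and the feature names below are all assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def fuse_votes(selections: Dict[str, Iterable[str]], min_votes: int) -> List[str]:
    """Keep the features chosen by at least `min_votes` selectors.
    Each selector contributes at most one vote per feature."""
    votes: Counter = Counter()
    for selected in selections.values():
        votes.update(set(selected))
    return sorted(name for name, count in votes.items() if count >= min_votes)


# Hypothetical outputs of the four selectors named in the text.
selections = {
    "RF": ["pkt_len", "ttl", "dst_port"],
    "RFE": ["pkt_len", "ttl"],
    "CS": ["pkt_len", "proto"],
    "LO": ["pkt_len", "ttl", "proto"],
}
print(fuse_votes(selections, min_votes=3))  # ['pkt_len', 'ttl']
```

Requiring agreement between several selectors is what makes the fused result more stable than any single ranking, at the cost of possibly dropping a feature that only one method finds useful.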
MDIoT: IoT Device Identification Method …
235
Fig. 2 The multi-voting methods
Our initial intention was to use a genetic algorithm (GA) for feature selection. However, GA cannot handle data imbalance and may lead to overfitting. We therefore formulate feature selection as a multi-objective optimisation problem and propose the MGA algorithm, which adds non-dominance sorting and congestion distance to the original GA.
Non-dominance Sorting. It sorts the features in a population according to non-dominance, assigning each feature to a rank. For $i \neq j$, if $f(x_i) \prec f(x_j)$, then $x_i$ dominates $x_j$. The set of features dominated by $x_i$ is denoted by $S_i$, and the number of features that dominate $x_i$ by $n_i$. If $x_i$ and $x_j$ do not dominate each other, they are said to be incomparable. The set of features in the $k$th level is denoted by $F_k$ and the number of features in that level by $N_k$. The mutual information between two random variables $x_i$ and $y_i$ is denoted by $MI(x_i, y_i)$, and $p$ denotes the probability density function:

$$MI(x_i, y_i) = \sum_{x} \sum_{y} p(x_i, y_i) \log_b \frac{p(x_i, y_i)}{p(x_i)\, p(y_i)} \tag{1}$$

$$SU(x_i, y_i) = \frac{2 \cdot MI(x_i, y_i)}{-\left( \sum_{x} p(x_i) \log p(x_i) + \sum_{y} p(y_i) \log p(y_i) \right)} \tag{2}$$

Congestion Distance. We calculate the distance of each feature in each objective function dimension and rank these distances. The distances of each feature over the objective function dimensions then form a distance vector, and the distances in all dimensions are summed to obtain the congestion distance of the feature. Specifically, for each objective $f(x)$, we sort the features of the rank set $F_k$ along that dimension and compute the distance $D_{i,f}$ of $x_i$. Summing the distances over all dimensions gives

$$D_i = \sum_{f=1}^{m} D_{i,f}, \qquad CD_i = \frac{1}{D_i},$$

where $CD_i$ denotes the congestion of feature $x_i$, $m$ denotes the number of objective functions, and $D_{i,f}$ denotes the distance of feature $x_i$ in the dimension of the $f$th objective function.
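The quantities above can be sketched in a few lines. The first part evaluates Eqs. (1) and (2) empirically for discrete samples with $b = 2$. The second part is a swapped-in technique: it uses the sorting and crowding distance of the standard NSGA-II algorithm, which is an assumption about what MGA does and differs from the $CD_i = 1/D_i$ form given in the text. All function names are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (bits) of a discrete sample."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """Empirical MI (bits) of two discrete samples, Eq. (1) with b = 2."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items())


def su(xs, ys):
    """Eq. (2): twice the MI divided by the sum of the two entropies."""
    denom = entropy(xs) + entropy(ys)
    return 2 * mutual_information(xs, ys) / denom if denom else 0.0


def dominates(a, b):
    """True if a is no worse than b in every objective and better in one (minimisation)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_fronts(points):
    """Split the indices of `points` into levels F1, F2, ... by repeated peeling."""
    remaining = set(range(len(points)))
    fronts = []
    while remaining:
        front = sorted(
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


def crowding_distance(points, front):
    """Standard NSGA-II crowding distance for the indices in one front."""
    dist = {i: 0.0 for i in front}
    for f in range(len(points[0])):
        order = sorted(front, key=lambda i: points[i][f])
        lo, hi = points[order[0]][f], points[order[-1]][f]
        dist[order[0]] = dist[order[-1]] = float("inf")  # boundary points are always kept
        if hi == lo:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (points[nxt][f] - points[prev][f]) / (hi - lo)
    return dist


# Hypothetical objective vectors (both objectives minimised).
points = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_fronts(points)
print(fronts)                                 # [[0, 1, 2], [3], [4]]
print(crowding_distance(points, fronts[0]))   # index 1 gets 2.0, the ends get inf
```

In NSGA-II, selection prefers lower levels first and, within a level, larger crowding distance, which keeps the retained feature subsets spread out along the trade-off curve.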
3.3 Device Identification
In the above steps, we obtain the most important features through feature selection. Next, we use a decision tree (DT) to classify the device features and determine the device type. The result of the classification is a label for the type of device.
4 Evaluation
4.1 Experimental Setting
Datasets. We use two publicly available device datasets (the Aalto [13] and UNSW [14] datasets) to identify IoT devices. By examining the packets in the Pcap files, network traffic features are extracted for each outgoing packet. The Pcap files of both datasets are divided into two subsets: 80% for training and 20% for testing.
Parameter Setting. For the MGA method, the initial population contains 38 randomly generated chromosomes, each representing a subset of features. The mutation rate is set to 0.04, which means that each gene has a 4% probability of mutating in each iteration. Since the fitness function depends on the classification performance of the candidate feature vectors, the training dataset is used to evaluate the classification performance of each feature subset. The population undergoes 1000 iterations, and every 100 iterations the mean fitness value of all chromosomes is computed. We finally selected 12 features using the MGA algorithm. The features we extract include IP_DF, IP_ttl, dport23, IP_flags, payload_bytes, IP_len, pck_size, TCP_sport, EAPOL_type, ICMP_code, UDP_len, MAC, Label.
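The population and mutation settings above can be sketched as follows. The population size of 38 and the mutation rate of 0.04 come from the text; the bit-string encoding (1 means the feature is kept), the per-gene bit-flip operator, and `n_features=20` are assumptions made for illustration.

```python
import random


def init_population(n_features, size=38, rng=None):
    """Create `size` random bit-string chromosomes; bit d = 1 keeps feature d."""
    rng = rng or random.Random()
    return [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(size)]


def mutate(chromosome, rate=0.04, rng=None):
    """Flip each gene independently with probability `rate`."""
    rng = rng or random.Random()
    return [1 - g if rng.random() < rate else g for g in chromosome]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = init_population(n_features=20, rng=rng)  # 20 is a hypothetical count
child = mutate(population[0], rng=rng)
print(len(population), sum(a != b for a, b in zip(population[0], child)))
```

With a 4% per-gene rate and 20 genes, a chromosome changes in fewer than one position on average per iteration, so mutation only perturbs subsets slightly.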
4.2 Results
The Performance of Device Classification. We use five ML algorithms, namely Naive Bayes (NB), Gradient Boosting (GB), Random Forest (RF), K-Nearest Neighbors (KNN), and DT, to classify 25 different devices in the Aalto dataset. The ML algorithms are trained using the extracted features, and we use nested cross-validation to identify suitable hyperparameters for each algorithm. Each model is trained 100 times to measure stability. We use accuracy, precision, recall, and f1_score as evaluation metrics in the experiments. Figure 3 presents the classification results achieved by the different ML algorithms. Based on the results in Fig. 3, DT performs better than the other four algorithms on the evaluated metrics. Consequently, we selected DT for device classification.
Fig. 3 ML evaluation results under different methods
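The four evaluation metrics can be computed as in the sketch below. The paper does not say how per-class scores are averaged over the devices, so macro averaging is an assumption, and the labels are made up.

```python
def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 over all labels."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in pairs)
        fp = sum(t != lab and p == lab for t, p in pairs)
        fn = sum(t == lab and p != lab for t, p in pairs)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Hypothetical device labels for four packets.
y_true = ["cam", "cam", "plug", "plug"]
y_pred = ["cam", "plug", "plug", "plug"]
scores = macro_scores(y_true, y_pred)
print(scores)
```

Macro averaging weights every device type equally, which matters when some devices contribute far fewer packets than others.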
Fig. 4 Comparison with other methods
Comparison with Other Methods. We compare MDIoT with IoTSentinel [13] and IoTSense [5] on the Aalto and UNSW datasets. Figure 4 displays the results of device identification on both datasets. Our method outperforms IoTSentinel and IoTSense on both datasets in terms of accuracy, precision, recall, and f1_score.
5 Conclusion
In this paper, we propose a model called MDIoT to improve the identification of IoT devices. To reduce redundant features and noise, we combine the multi-voting method with MGA for feature selection. We use a decision tree classifier to build the MDIoT model for the identification and classification of IoT devices. Our model achieves results of over 95% on two public datasets and is able to identify device types accurately. In the future, we will consider combining active and passive IoT device identification techniques to identify IoT devices in real time.
Acknowledgements This work was supported by the National Natural Science Foundation of China No. 62277001 and the scientific research program of Beijing Municipal Education Commission KZ202110011017.
References
1. Kumari, P., Jain, A.K.: A comprehensive study of DDoS attacks over IoT network and their countermeasures. Comput. Secur. 103096 (2023)
2. Kolias, C., Kambourakis, G., Stavrou, A., Voas, J.: DDoS in the IoT: Mirai and other botnets. Computer 50(7), 80–84 (2017). https://doi.org/10.1109/MC.2017.201
3. Ahmed, S.T., Kumar, V., Kim, J.: AITel: eHealth augmented intelligence based telemedicine resource recommendation framework for IoT devices in smart cities. IEEE Internet Things J. (2023)
4. Mishra, D., Naik, B., Nayak, J., Souri, A., Dash, P.B., Vimal, S.: Light gradient boosting machine with optimized hyperparameters for identification of malicious access in IoT network. Digit. Commun. Netw. 9(1), 125–137 (2023)
5. Bezawada, B., Bachani, M., Peterson, J., Shirazi, H., Ray, I., Ray, I.: Behavioral fingerprinting of IoT devices. In: Proceedings of the 2018 Workshop on Attacks and Solutions in Hardware Security, pp. 41–50 (2018)
6. Yang, K., Li, Q., Sun, L.: Towards automatic fingerprinting of IoT devices in the cyberspace. Comput. Netw. 148, 318–327 (2019)
7. Noguchi, H., Kataoka, M., Yamato, Y.: Device identification based on communication analysis for the internet of things. IEEE Access 7, 52903–52912 (2019)
8. Kouicem, D.E., Bouabdallah, A., Lakhlef, H.: Internet of things security: a top-down survey. Comput. Netw. 141, 199–221 (2018). https://doi.org/10.1016/j.comnet.2018.03.012
9. Siby, S., Maiti, R.R., Tippenhauer, N.: IoTScanner: detecting and classifying privacy threats in IoT neighborhoods (2017). arXiv preprint arXiv:1701.05007
10. Cvitić, I., Peraković, D., Periša, M., Gupta, B.: Ensemble machine learning approach for classification of IoT devices in smart home. Int. J. Mach. Learn. Cybern. 12(11), 3179–3202 (2021)
11. Fan, L., Zhang, S., Wu, Y., Wang, Z., Duan, C., Li, J., Yang, J.: An IoT device identification method based on semi-supervised learning. In: 2020 16th International Conference on Network and Service Management (CNSM), pp. 1–7. IEEE (2020)
12. Lopez-Martin, M., Carro, B., Sanchez-Esguevillas, A.: IoT type-of-traffic forecasting method based on gradient boosting neural networks. Fut. Gener. Comput. Syst. 105, 331–345 (2020)
13. Miettinen, M., Marchal, S., Hafeez, I., Asokan, N., Sadeghi, A.R., Tarkoma, S.: IoT Sentinel: automated device-type identification for security enforcement in IoT. In: 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS), pp. 2177–2184. IEEE (2017)
14. Sivanathan, A., Gharakheili, H.H., Loi, F., Radford, A., Wijenayake, C., Vishwanath, A., Sivaraman, V.: Classifying IoT devices in smart environments using network traffic characteristics. IEEE Trans. Mob. Comput. 18(8), 1745–1759 (2018)
Motor Fault Diagnosis Based on Improved Support Vector Machine Caixiang Guo, Jin Li, and Chenxi Yang
Abstract Statistical data show that rotor bar breakage and stator inter-turn short circuit are the two most common faults in asynchronous motors. To achieve autonomous diagnosis of motor faults, this paper studies a motor fault diagnosis method based on the least squares wavelet support vector machine (LS-WSVM). Wavelet packets are used to extract fault feature components from stator current signals. An improved particle swarm optimization algorithm is proposed: an inertia weight and a convergence factor are introduced to improve the particle swarm iteration formula, and the algorithm iteratively searches for the LS-WSVM hyperparameters that optimize the performance of the support vector machine. Compared with the unoptimized LS-WSVM, the diagnosis results show that LS-WSVM based on particle swarm optimization has shorter training and classification times and higher classification accuracy.
Keywords Motor fault diagnosis · Particle swarm optimization · Least squares support vector machine
C. Guo (B) · J. Li · C. Yang, State Grid Taiyuan Electrical Power Supply Company, Taiyuan 030000, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_20

1 Introduction
When a motor malfunctions, the sample data that can be collected are very limited. On the one hand, the motor runs under fault for only a very short time: to avoid damage to the unit from long-term faulty operation and the resulting economic losses to the enterprise, the fault must be removed quickly once it occurs. On the other hand, faults occur rarely and suddenly, making them difficult to capture. However, traditional machine learning based on large sample data has significant limitations in mining knowledge from limited samples. This is because the final solution of traditional machine learning is usually a local extremum, and there may be more than one local extremum in a high-dimensional space. At the same time, there may be significant differences between the local extrema of different dimensions, ultimately resulting in different solutions for each neural network [1, 2]. In addition, describing a high-dimensional sample space requires a sample size that small samples cannot provide. Meanwhile, when trained and tested with small samples, the neural network converges very slowly. Therefore, traditional machine learning has shortcomings in diagnosing motor faults.
On the basis of statistical learning theory, the support vector machine replaces the empirical risk minimization principle with the structural risk minimization principle. Its working mechanism is to classify samples by constructing the optimal hyperplane in space, where the hyperplane with the largest classification margin is defined as the optimal hyperplane [3]. The classification process of a support vector machine is equivalent to a quadratic optimization problem, which can be understood as an improvement over traditional machine learning in that the final solution changes from a local extremum to a global extremum. Another advantage of support vector machines is that nonlinear problems in a low-dimensional space are transformed into linear problems in a high-dimensional space through a nonlinear mapping, without increasing the computation in the optimization process [4]. This is because support vector machines perform the nonlinear mapping by introducing kernel functions, avoiding inner product operations in the high-dimensional space. Compared with traditional machine learning, support vector machines have better generalization ability and can adjust themselves to the actual engineering situation. Support vector machines have great advantages in decision-making with limited samples, as they can make the most of the feature information contained in small samples.
These advantages make support vector machines well suited to motor fault diagnosis. However, the regularization factor and the kernel function parameters have a great impact on the classification results of the support vector machine. Choosing an optimal combination gives the support vector machine its best performance and the highest classification accuracy [5, 6]. To optimize the performance of support vector machines, this paper introduces a particle swarm optimization algorithm to tune these two adjustable parameters. In Sect. 2, a fault diagnosis method based on the least squares wavelet support vector machine (LS-WSVM) is proposed, which combines the advantages of wavelet local analysis and enhances the generalization ability of the support vector machine. In Sect. 3, an improved particle swarm optimization algorithm is proposed to optimize the hyperparameters of the LS-WSVM motor fault diagnosis system. To balance the global and local search capabilities of particle swarm optimization, inertia weights are introduced into the traditional particle swarm algorithm; to prevent particle swarm optimization from falling into local optima, a maximum speed is introduced. Section 4 verifies the effectiveness of the proposed method through motor fault diagnosis experiments. The conclusion is given in Sect. 5.
2 Least Squares Wavelet Support Vector Machine
2.1 Wavelet Kernel Function
Research has shown that the classification ability of support vector machines is affected when the selected kernel function and its parameters change. However, there is currently no unified framework or method for selecting a kernel function that suits a practical problem, which remains an open problem [7, 8]. On the basis of statistical learning theory, SVM uses the structural risk minimization (SRM) principle to find the global optimal solution. However, when the frequency characteristics of the signal are relatively complex, as with the fault current signal during motor faults, a plain support vector machine may misjudge and its results are not ideal. Combining a wavelet basis with SVM yields a wavelet support vector machine that combines the advantages of both and performs better than the traditional support vector machine [9–11].
Let $\psi(x)$ be a one-dimensional mother wavelet function. Using tensor product theory, the $n$-dimensional wavelet function is

$$\psi_n(x) = \psi_n(x_1, x_2, \ldots, x_n) = \prod_{i=1}^{n} \psi(x_i) \tag{1}$$

The wavelet kernel function is constructed as

$$K(x, x') = \prod_{i=1}^{n} \psi\!\left(\frac{x_i - m_i}{a_i}\right) \psi\!\left(\frac{x'_i - m'_i}{a_i}\right) \tag{2}$$

where $a$ is the scaling factor and $m$ is the translation factor. The commonly used Morlet wavelet function is selected, and its expression is

$$\psi(x) = \exp(-j\omega_0 x)\, e^{-x^2/2} \tag{3}$$

Equation (3) is the complex form of the Morlet wavelet function, which suffers from phase distortion in application. Removing the imaginary part gives the real Morlet wavelet function

$$\psi(x) = \cos(\omega_0 x)\, e^{-x^2/2} \tag{4}$$

The Morlet wavelet kernel function is then constructed as

$$K(x, x') = \prod_{i=1}^{n} \psi\!\left(\frac{x_i - x'_i}{a_i}\right) = \prod_{i=1}^{n} \cos\!\left(\omega_0 \frac{x_i - x'_i}{a_i}\right) \exp\!\left(-\frac{(x_i - x'_i)^2}{2 a_i^2}\right) \tag{5}$$
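As a quick check of Eq. (5), the Morlet wavelet kernel can be evaluated directly. This is a minimal sketch: the paper does not give a value for $\omega_0$, so the default `omega0=1.75` is an assumption, and the sample vectors and scale factors are made up.

```python
import math


def morlet_kernel(x, y, a, omega0=1.75):
    """Morlet wavelet kernel of Eq. (5) for vectors x, y and per-dimension scales a.

    omega0 = 1.75 is an assumed value, not one stated in the paper.
    """
    k = 1.0
    for xi, yi, ai in zip(x, y, a):
        u = (xi - yi) / ai
        k *= math.cos(omega0 * u) * math.exp(-u * u / 2.0)
    return k


x = [0.2, 0.5, 0.1]   # hypothetical feature vectors
y = [0.1, 0.7, 0.4]
a = [1.0, 1.0, 1.0]   # hypothetical scale factors
print(morlet_kernel(x, x, a))   # 1.0, since every factor is cos(0) * exp(0)
print(morlet_kernel(x, y, a))
```

The kernel depends only on the differences $x_i - x'_i$, so it is symmetric in its two arguments and equals 1 when they coincide.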
Table 1 Comparison of two kinds of kernel functions
Kernel function        γ1     σ1²    γ2    σ2²    γ3      σ3²     Accuracy
RBF kernel function    0.8    0.5    3     6      0.08    0.23    0
                       10     1      3     6      0.08    0.23    0
                       0.8    0.5    8     9      0.07    0.17    0
RBF kernel function    0.8    0.5    3     6      0.08    0.23    0
                       10     1      3     6      0.08    0.23    0
                       0.8    0.5    8     9      0.07    0.17    0
2.2 Motor Fault Diagnosis Based on LS-WSVM
The least squares wavelet support vector machine (LS-WSVM) replaces the kernel function of LS-SVM with a wavelet kernel function, combining the advantages of wavelets and support vector machines. Compared with LS-SVM, LS-WSVM has stronger noise resistance and adapts better to environmental changes. To verify the superiority of the LS-WSVM classification results, RBF kernel functions and wavelet kernel functions were both used for classification, and the accuracy of the classification results is shown in Table 1. The data in the table show that the classification results of wavelet kernel functions are generally superior to those of RBF kernel functions, especially when the nonlinearity is more complex. The classification results of support vector machines are therefore influenced by the kernel function. In addition, the choice of kernel function and its parameters also affects the performance of support vector machines, and different regularization factors affect the generalization ability of SVM. To optimize the performance of LS-WSVM, it is necessary to find the optimal kernel parameters and regularization factors.
3 Support Vector Machine Based on Particle Swarm Optimization
In view of the shortcomings of standard particle swarm optimization, this paper improves the iterative formula of particle swarm optimization by introducing an inertia weight and a convergence factor.
3.1 Inertia Weight ω
When optimizing practical problems, we want the particle swarm algorithm to have strong global search ability at first and, once the search has converged to a certain region, to perform a finer local search within that region to find the global optimal solution. To improve the search ability of particle swarm optimization, an inertia weight is usually introduced into the traditional particle swarm iteration formula; its magnitude determines how much of the previous velocity is retained in the update. The iteration formula becomes

$$v_{id}^{k} = \omega v_{id}^{k-1} + c_1\,\mathrm{rand}(1)\,(p_{id} - x_{id}^{k-1}) + c_2\,\mathrm{rand}(1)\,(p_{gd} - x_{id}^{k-1}) \tag{6}$$
Obviously, the performance of particle swarm optimization algorithms is influenced by inertia weights. By adjusting the inertia weight, the balance between global and local search capabilities can be achieved, allowing particles to have different search abilities at different stages of iterative updates.
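The velocity update of Eq. (6) can be sketched as follows. The learning factors $c_1 = c_2 = 2$ match the experimental setting reported later in the paper. The default `w=0.7` and the linearly decreasing schedule `linear_inertia` (0.9 down to 0.4) are assumptions: they are a common way to obtain different search behaviour at different stages, not values taken from the paper.

```python
import random


def update_velocity(v, x, p_best, g_best, w=0.7, c1=2.0, c2=2.0, rng=random):
    """Eq. (6): inertia term plus attraction to the personal and global best positions."""
    new_v = []
    for vd, xd, pd, gd in zip(v, x, p_best, g_best):
        r1, r2 = rng.random(), rng.random()  # fresh rand(1) draws per dimension
        new_v.append(w * vd + c1 * r1 * (pd - xd) + c2 * r2 * (gd - xd))
    return new_v


def linear_inertia(k, k_max, w_start=0.9, w_end=0.4):
    """Assumed schedule: large w early (global search), small w late (local search)."""
    return w_start - (w_start - w_end) * k / k_max


v = update_velocity([1.0, -2.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], w=0.5)
print(v)  # [0.5, -1.0]: with x at both best positions only the inertia term remains
print(linear_inertia(0, 50), linear_inertia(50, 50))
```

A larger weight keeps more of the previous velocity and favours exploration, while a smaller one lets the two attraction terms dominate and favours refinement.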
3.2 Convergence Factor
To accelerate the convergence of the PSO algorithm, a convergence factor χ is introduced, and the iterative formula becomes

$$v_{id}^{k} = \chi\left(v_{id}^{k-1} + c_1\,\mathrm{rand}(1)\,(p_{id} - x_{id}^{k-1}) + c_2\,\mathrm{rand}(1)\,(p_{gd} - x_{id}^{k-1})\right) \tag{7}$$

$$\chi = \frac{2}{\left|\,2 - \varphi - \sqrt{\varphi^2 - 4\varphi}\,\right|}, \qquad \varphi = c_1 + c_2 > 4 \tag{8}$$

Usually φ is taken as 4.1, in which case χ equals 0.729. The difference between the inertia weight and the convergence factor is that the inertia weight only adjusts how much of the historical velocity is inherited, while the convergence factor scales not only the historical velocity but also the position increments toward the individual and global extreme values.
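Equation (8) is easy to verify numerically. In this sketch the split $c_1 = c_2 = 2.05$ is an assumption; the text only fixes the sum $\varphi = 4.1$.

```python
import math


def constriction_factor(c1, c2):
    """Convergence factor chi of Eq. (8); requires phi = c1 + c2 > 4."""
    phi = c1 + c2
    if phi <= 4:
        raise ValueError("phi = c1 + c2 must exceed 4")
    return 2.0 / abs(2.0 - phi - math.sqrt(phi * phi - 4.0 * phi))


chi = constriction_factor(2.05, 2.05)  # assumed split of phi = 4.1
print(round(chi, 4))  # 0.7298
```

The result agrees with the value 0.729 quoted in the text, and the `phi <= 4` guard reflects the condition under which the square root in Eq. (8) is real and nonzero.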
3.3 LS-WSVM Model Based on Improved Particle Swarm
PSO-LS-WSVM is applied to fault diagnosis as follows:
(1) Preprocess the collected stator current signal.
(2) Extract feature components using wavelet packets, selecting a high-performance method as the fault component extraction method for fault diagnosis. The feature components that can distinguish the types of motor faults are defined as fault feature components.
(3) Divide the fault components into a training set and a test set, and train PSO-LS-WSVM with the training set to obtain the hyperparameters that optimize the performance of the support vector machine.
(4) Train the support vector machines using the obtained optimal parameters and the training set.
(5) Verify the classification performance of the classifier using the test set.
The flowchart of the motor fault diagnosis model based on the improved PSO-LS-WSVM is shown in Fig. 1.
Fig. 1 Fault diagnosis model of asynchronous motor
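Step (3) of the procedure above, the hyperparameter search, can be sketched as a small PSO loop using the constricted update of Eq. (7). Several parts are assumptions made for illustration: `toy_error` is a hypothetical stand-in for the validation error of LS-WSVM as a function of $(\gamma, \sigma)$, the search bounds and the clamping of positions to them are made up, and $c_1 = c_2 = 2.05$ with $\chi = 0.7298$ is one choice consistent with $\varphi = 4.1$. The swarm size of 10 and the 50 iterations follow the experimental setting in Sect. 4.2.

```python
import random


def pso(objective, bounds, n_particles=10, n_iter=50,
        c1=2.05, c2=2.05, chi=0.7298, seed=0):
    """Minimise `objective` over the box `bounds` with the constricted update of Eq. (7)."""
    rng = random.Random(seed)
    dim = len(bounds)
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in xs]
    p_val = [objective(x) for x in xs]
    g = min(range(n_particles), key=p_val.__getitem__)
    g_best, g_val = p_best[g][:], p_val[g]
    history = [g_val]  # best value found so far, recorded once per iteration
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vs[i][d] = chi * (vs[i][d]
                                  + c1 * r1 * (p_best[i][d] - xs[i][d])
                                  + c2 * r2 * (g_best[d] - xs[i][d]))
                # Assumed handling of the search box: clamp positions to the bounds.
                xs[i][d] = min(max(xs[i][d] + vs[i][d], lo), hi)
            val = objective(xs[i])
            if val < p_val[i]:
                p_best[i], p_val[i] = xs[i][:], val
                if val < g_val:
                    g_best, g_val = xs[i][:], val
        history.append(g_val)
    return g_best, g_val, history


def toy_error(params):
    """Hypothetical stand-in for the validation error at (gamma, sigma)."""
    gamma, sigma = params
    return (gamma - 3.0) ** 2 + (sigma - 1.0) ** 2


best, best_val, history = pso(toy_error, bounds=[(0.01, 10.0), (0.01, 10.0)])
print(best, best_val)
```

In an actual run, `toy_error` would be replaced by a routine that trains LS-WSVM on the training set with the candidate hyperparameters and returns its validation error.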
4 Experiment
4.1 Platform
The comprehensive diagnosis platform for motor faults is shown in Fig. 2. Multiple fault simulations of asynchronous motors can be carried out on this platform. The rated current of the asynchronous motor is 15.4 A, the rated speed is 1440 r/min, and the rated power is 7.5 kW. The platform is divided into three large blocks by dashed lines in the figure. The top block is the signal acquisition module, the bottom left block is the platform control module, and the bottom right block is the light bulb load module. The signal acquisition module is composed of a current sensor, a voltage sensor, a protection circuit, a data acquisition card, and a data bus (Fig. 3).
Fig. 2 Fault diagnosis model of asynchronous motor
Fig. 3 Classification results of SVM1, SVM2, SVM3
Table 2 Comparison of two kinds of kernel functions
Sample               Vector: E2/E    Vector: E4/E
Normal motor         0.000968        0.00507
Short circuit        0.011447        0.000363
One-bar broken       0.000996        0.001442
Three-bar broken     0.000822        0.000248
4.2 Experimental Results
The stator current signals of a normal motor, a motor with one broken rotor bar, a motor with three broken rotor bars, and a motor with an inter-turn short circuit are collected at a sampling frequency of 5000 Hz. This paper adopts two methods to improve the classification performance of SVM: the first is to improve the generalization ability of the SVM itself by optimizing its hyperparameters with particle swarm optimization; the second is to change the training strategy of PSO-LS-WSVM. Twenty sets of data, each containing the four types of sample data, were decomposed using three-layer wavelet packets. The fault vector of one set is shown in Table 2. The PSO-LS-WSVM fault diagnosis process is used to classify these four sample categories. During training, the samples are divided into training and testing sets at a ratio of 3:1. The method constructs three binary classifiers SVM1, SVM2, and SVM3: SVM1 separates inter-turn short circuit faults from the other categories; SVM2 separates broken bar faults from the normal motor; SVM3 separates one-bar and three-bar broken faults. Based on the training set, the parameters of LS-WSVM are optimized with a particle swarm population size of 10, 50 iterations, and learning factors c1 = 2 and c2 = 2. After optimization, the optimal parameters of SVM1 are γ1−best = 8.67 and σ1−best = 0.42; the optimal parameters of SVM2 are γ2−best = 2.81 and σ2−best = 4.52; the optimal parameters of SVM3 are γ3−best = 0.08 and σ3−best = 0.32.
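The way the three binary classifiers are chained can be sketched as a small decision cascade. The threshold rules below are hypothetical stand-ins for the trained SVM1, SVM2, and SVM3 (they are not the paper's classifiers), chosen only so that the four rows of Table 2 can be pushed through the cascade.

```python
def diagnose(sample, svm1, svm2, svm3):
    """Cascade of three binary decisions, in the order described in the text."""
    if svm1(sample):            # SVM1: inter-turn short circuit vs. everything else
        return "short circuit"
    if not svm2(sample):        # SVM2: broken bar vs. normal motor
        return "normal motor"
    return "three-bar broken" if svm3(sample) else "one-bar broken"  # SVM3


# Hypothetical threshold rules on (E2/E, E4/E); real classifiers would be trained.
svm1 = lambda s: s[0] > 0.005
svm2 = lambda s: s[1] < 0.003
svm3 = lambda s: s[1] < 0.0008

table2 = {
    "normal motor": (0.000968, 0.00507),
    "short circuit": (0.011447, 0.000363),
    "one-bar broken": (0.000996, 0.001442),
    "three-bar broken": (0.000822, 0.000248),
}
results = {name: diagnose(vec, svm1, svm2, svm3) for name, vec in table2.items()}
print(results)
```

With this arrangement a sample passes through at most three binary decisions, and each classifier only has to separate the classes that reach it.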
5 Conclusions
Asynchronous motors are the most widely used power equipment, so ensuring their safe and reliable operation is crucial. It is therefore of great practical significance to adopt effective fault diagnosis techniques that detect early motor faults in a timely manner. To improve the convergence speed and generalization ability of support vector machines, this paper proposes a least squares wavelet support vector machine. Compared with a plain support vector machine, LS-WSVM on the one hand replaces inequality constraints with equality constraints, simplifying calculations; on the other hand, the wavelet kernel function enhances the generalization ability of the support vector machine by incorporating the advantages of wavelet local analysis. To address the influence of hyperparameters on the performance of LS-WSVM, an LS-WSVM motor fault diagnosis system based on particle swarm optimization is proposed. To balance the global and local search capabilities of particle swarm optimization, inertia weights are introduced into the traditional particle swarm algorithm; to prevent particle swarm optimization from falling into local optima, a maximum speed is introduced. In motor fault diagnosis experiments, the method was compared with the unoptimized LS-WSVM. The experimental results show that the algorithm has shorter training and classification times and higher classification accuracy.
References
1. Thomson, W.T., Fenger, M.: Current signature analysis to detect induction motor faults. IEEE Ind. Appl. Mag. 7(4), 26–34 (2001)
2. Li, B., Chow, M.Y., Tipsuwan, Y., et al.: Neural-network-based motor rolling bearing fault diagnosis. Ind. Electron. IEEE Trans. 47(5), 1060–1069 (2000)
3. Nandi, S., Toliyat, H.A.: Condition monitoring and fault diagnosis of electrical machines-a review. In: Industry Applications Conference, 1999. Conference Record of the Thirty-Fourth IAS Meeting, pp. 197–204, vol. 1. IEEE (1999)
4. Ding, S., Qi, B., Tan, H.: Overview of support vector machine theory and algorithm research. J. Univ. Electron. Sci. Technol. 40(1), 2–10 (2011)
5. Wang, H., Zhang, X., Yu, J.: Fault diagnosis method based on support vector machine. J. East Chin. Univ. Sci. Technol. (Nat. Sci. Edn.) 30(2), 179–182 (2004)
6. Ma, X., Huang, X., Chai, Y.: SVM based binary tree multi class classification algorithm and its application in fault diagnosis. Control Dec. Mak. 18(3), 272–276 (2003)
7. Rong, H., Zhang, G., Jin, W.: Research on support vector machine kernel function and its parameters in system identification. J. Syst. Simul. 18(11), 3204–3208 (2006)
8. Das, S.R., Panigrahi, P.K., Mishra, K.D.D.: Improving RBF kernel function of support vector machine using particle swarm optimization. Int. J. Adv. Comput. Res. 2(7) (2012)
9. Lei, G., Jin, C., Yi, Z., et al.: Application of wavelet support vector machine in fault diagnosis of rolling bearings. J. Shanghai Jiao Tong Univ. 4, 678–682 (2009)
10. Lu, Z., Sun, J., Butts, K.: Multiscale support vector learning with projection operator wavelet kernel for nonlinear dynamical system identification. IEEE Trans. Neural Netw. Learn. Syst. 1–13 (2016)
11. Chang, P., Li, S., Ge, Y., et al.: Computation of reservoir relative permeability curve based on multi-scale wavelet kernel extreme learning machine. In: Chinese Control Conference, pp. 7179–7184 (2016)
Research on Automatic Detection and Sorting System of Spoiled Fruit Based on Deep Learning Bingbing Hou, Lei Cheng, Tiedan Hua, Wenle Wang, and Fengyun Li
Abstract Fruit is one of the important sources of nutrition for humans, but its shelf life is very short. The deterioration of one fruit directly affects adjacent fruits, and early detection of spoilage and timely removal can prevent it from spreading. Because fruits deteriorate easily and can cause safety hazards, this paper proposes a real-time detection and sorting system for spoiled fruit based on deep learning. First, the lightweight ShuffleNetV2 is used as the backbone network of the YOLOv5s model to make the model lighter. Then, the BiFPN feature map fusion network structure is introduced into the new model to enhance its detection performance, so as to meet the real-time and accuracy requirements of target detection in the robotic arm sorting process. Finally, the 3D coordinates of the target object in the coordinate system are obtained by image registration and coordinate transformation and sent to the controller of the robotic arm so that it can complete the sorting task. Experiments show that the system has strong anti-interference ability, a fast detection frame rate, high precision, and high positioning accuracy. It can guide the robot arm to work quickly and accurately, which has important application significance and value.
Keywords YOLOv5 · BiFPN · Target detection · Sorting
B. Hou (B) · L. Cheng · T. Hua · W. Wang · F. Li, College of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_21

1 Introduction
Fruits are an essential component of a balanced diet, but the longer they remain stored, the higher the probability of spoilage, which contributes to various health hazards. Accordingly, regulators and food processing companies in various countries investigate and control fruit quality to guarantee safety and minimize wastage.
An essential aspect of ensuring food safety is detecting fruit spoilage. Traditional fruit detection methods, which require manual visual inspection or chemical analysis [1] by experts, are inefficient and costly. Consequently, domestic and international researchers have proposed several solutions to address these issues. Wang Zhi et al. [2] proposed an ultra-sensitive flexible olfactory receptor peptide sensor based on the curved connection method, which achieves ultra-sensitive detection of trimethylamine and thereby detects spoiled fruit, but this method has difficulty obtaining the location of the target accurately, and the manufacturing cost of the sensor is relatively high. Jianrong Cai et al. [3] introduced a method for the characterization and identification of citrus fruit spoilage fungi based on Raman scattering spectroscopy imaging. The speed and efficiency of fungal detection are improved, but imaging and data analysis still take a long time, which cannot meet the needs of real-time detection. V Hemamalini et al. [4] proposed a method that uses computer vision and machine learning to automatically process and analyze food images for quality inspection and grading. Studies have shown that this method can effectively inspect and grade different types of food, but image segmentation and feature extraction may be difficult for fruits. David Ireli et al. [5] presented an automatic method to detect, classify, and grade tomato defects via machine learning and image processing techniques. Its limitation is that the processing speed can hardly meet the needs of real-time sorting. Tian Yunong et al. [6] utilized a neural network architecture known as the Multi-Scale Dense Classification Network to extract features and classify images at varied scales. Olga A. Snatkina et al. [7] applied Inception-v3, a convolutional neural network, to classify fruit images and assess the fruit's quality condition. In addition, in practical applications it is necessary not only to detect spoiled fruit but also to sort and process it automatically and efficiently while ensuring accuracy, and to work stably in various complex environments. This places higher demands on both algorithm optimization and system architecture.
To address the aforementioned issues, this paper proposes a deep learning-based system for detecting and automatically sorting spoiled fruit. Specifically, the YOLOv5 algorithm is employed as the target identification and positioning tool to enhance detection accuracy and efficiency. To expand the original data set and mitigate the issue of limited sample data, the detection system includes a variety of fruit types and uses data augmentation techniques. Ultimately, the detection and sorting systems are integrated to enable real-time and comprehensive inspection and sorting of spoiled fruit.
2 System Specification In the laboratory, this study developed an autonomous system for sorting spoiled fruit based on deep learning. The system used the Robot Operating System (ROS) development platform and the ABB robotic arm as the operation platform. The system
Research on Automatic Detection and Sorting System …
Fig. 1 Overall scheme of automatic detection and sorting
employed the Intel RealSense D435 depth camera for object detection and positioning, followed by 3D coordinate acquisition using image registration and conversion technologies. The robotic arm reached the target location and grabbed the object for sorting. The research delved into in-depth experimentation and analysis of key technologies, including the calibration of the camera internal parameters and visual system composed of the robot and camera, as well as target feature acquisition, ROS and robotic arm communication, robot arm motion control simulation, and target object position acquisition, among others. The study focused on three main aspects: the grip position of the robotic arm, the simulation of the robotic arm’s motion control, and the development of the robotic arm platform for visual sorting. See Fig. 1 for an overview of the research scheme.
3 Target Detection
During the process of sorting spoiled food, the robot must first identify the type and location of the food. A high-performance detection algorithm provides accurate two-dimensional coordinates of each target. Then, after a coordinate system transformation, the actual three-dimensional position is determined, enabling automatic sorting of the spoiled food. Deep learning is widely applied to target detection and other visual tasks because of its high accuracy and robustness, which makes it a natural choice for the target detection algorithm.
B. Hou et al.
Fig. 2 The framework of the target detection algorithm in this paper
3.1 YOLOv5
The YOLO algorithm is widely used in multi-class tasks and can accurately perform tasks such as detection, semantic segmentation, and small target detection. Among the YOLO series algorithms, version 5 boasts faster inference speed, smaller model volume, and higher detection accuracy. These improvements result from the algorithm's more efficient calculation method and enhanced network structure. The YOLOv5 family consists of five versions, namely YOLOv5n, YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x, in order of increasing complexity. YOLOv5x has the highest detection accuracy, but it is also the largest and slowest model. YOLOv5n is the smallest variant, intended for resource-constrained devices. Among the remaining four models, YOLOv5s has the smallest volume and the fastest speed, at the cost of lower detection accuracy. To ensure fast and accurate target identification and sorting, and given the need for real-time processing of large amounts of data, we selected the YOLOv5s model. This choice prioritizes system efficiency while retaining adequate accuracy. Additionally, we made improvements to the backbone network (Fig. 2).
Fig. 3 ShuffleNetV2 structure
3.2 Backbone Network Improvements
When the YOLOv5 algorithm is used for small target detection, the large number of training parameters and the high memory consumption of the model lead to unsatisfactory detection results. Moreover, the inference speed of the YOLOv5 algorithm struggles to meet real-time detection requirements. To solve this problem, this paper uses the lightweight ShuffleNetV2 [8] as the backbone network of the YOLOv5s model, thereby reducing the number of training parameters and making the model lighter (Fig. 3). ShuffleNetV2 (SNV2) is a lightweight convolutional neural network comprising a basic unit and a subsampling unit. In the basic unit, the input feature channels are divided into two groups. The right branch performs convolution and batch normalization operations, and the output features of the left branch are then mixed with those of the right branch to strengthen the fusion of sub-channel information [9]. The subsampling unit does not use channel splitting; instead, it enhances the network's feature extraction capability by increasing the number of channels and the width [10].
The neck of YOLOv5 uses an FPN+PAN structure to fuse multi-layer feature maps for better detection of small targets. However, this architecture only processes feature maps of the same size, omitting information from maps of different sizes. To overcome this limitation, this paper introduces the BiFPN architecture for feature map fusion. BiFPN removes nodes that carry insufficient information for fusion and merges shallower feature maps at the same scale to enrich the fused features. BiFPN supports both top-down and bottom-up paths, and the feature map that would have entered a removed node is fed into the next level to create a new fusion structure. Finally, the feature maps P1-P4 extracted by the backbone network are input into BiFPN, yielding the processed feature maps F1-F4 (Fig. 4).
Fig. 4 The structure of BiFPN
The Bi-directional Feature Pyramid Network (BiFPN) uses learnable weights to control how much each input feature contributes to the fusion. The formula is shown in (1):
$$\mathrm{Out} = \sum_{i} \frac{\omega_i}{\varepsilon + \sum_{j} \omega_j}\, I_i \tag{1}$$
In the formula, $\omega_i$ is a weight trained by the network, which the activation function guarantees to be greater than or equal to zero; $\varepsilon$ is a constant that keeps the overall result numerically stable; $I_i$ represents an input image feature; and $\mathrm{Out}$ represents the fusion result. After this improvement, BiFPN takes feature maps of different scales from the backbone network and performs weighted feature fusion.
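The weighted fusion in (1) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the use of ReLU as the activation that keeps the weights non-negative, the value of `eps`, and the function name `weighted_fusion` are assumptions made for illustration only.

```python
import numpy as np

def weighted_fusion(features, raw_weights, eps=1e-4):
    """Fuse same-shaped feature maps as Out = sum_i w_i * I_i / (eps + sum_j w_j)."""
    # Assumption: ReLU is the activation that keeps every weight >= 0.
    w = np.maximum(np.asarray(raw_weights, dtype=float), 0.0)
    stacked = np.stack(features, axis=0)
    # Weighted sum over the input axis, normalised by eps + sum of the weights.
    return np.tensordot(w, stacked, axes=1) / (eps + w.sum())

# Two toy 2x2 "feature maps" with equal weights.
i1 = np.ones((2, 2))
i2 = 3.0 * np.ones((2, 2))
fused = weighted_fusion([i1, i2], raw_weights=[1.0, 1.0])
```

With equal weights the result is close to the plain average of the inputs; a negative raw weight is clipped to zero, so the corresponding input is ignored.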
4 Grasp Position Acquisition
The visual system is designed to obtain the label and three-dimensional coordinates of the object of interest in the coordinate system of the robot base. The system works in the following steps: the camera captures an image of the work area containing the objects, the target recognition and positioning algorithm identifies all target objects in the frame, and the features of each object are extracted from the image. The features are then converted into 3D coordinates in the robot base coordinate system through coordinate conversion.
4.1 Zhang Zhengyou Calibration Method
This paper uses the Zhang Zhengyou calibration method to calibrate the camera. This method is based on the computer vision camera model and uses the homography matrix between the camera and the calibration plate as a control field to solve for the camera's internal and external parameter matrices. The process yields the distortion coefficients and the internal parameter information of the camera. Let the optical camera model be:
$$s\,\tilde{m} = A\,[R \;\; t]\,\tilde{M} \tag{2}$$
In the formula, $s$ is the scale factor; $\tilde{m}$ is the image point coordinates; $A$ is the internal reference matrix; $[R \;\; t]$ is the external reference matrix; and $\tilde{M}$ is the object point coordinates. Then there are:
$$[h_1 \;\; h_2 \;\; h_3] = \lambda A\,[r_1 \;\; r_2 \;\; t] \tag{3}$$
where $h_1, h_2, h_3$ are the columns of the homography matrix and $r_1, r_2$ are the first two columns of $R$. According to the orthogonality of the rotation matrix:
$$h_1^{T} A^{-T} A^{-1} h_2 = 0, \qquad h_1^{T} A^{-T} A^{-1} h_1 = h_2^{T} A^{-T} A^{-1} h_2 \tag{4}$$
According to the homography matrix and then using Eq. (3), the initial values of the internal and external parameters can be obtained, and then the distortion coefficients can be calculated. After calibration, the RGB camera internal reference matrix and distortion coefficients are obtained as Formulae (5) and (6).
$$M_R = \begin{pmatrix} 1376.09 & 0 & 947.958 \\ 0 & 1376.4 & 531.068 \\ 0 & 0 & 1 \end{pmatrix} \tag{5}$$
$$[k_1 \;\; k_2 \;\; k_3 \;\; p_1 \;\; p_2]^{T} = \begin{pmatrix} 0.109862 \\ -0.122685 \\ -0.000453445 \\ -8.614 \times 10^{-5} \\ -0.396171 \end{pmatrix} \tag{6}$$
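To make the role of the internal reference matrix concrete, the following sketch back-projects a pixel with known depth into camera coordinates and then maps it into the robot base frame. It is a simplified illustration under stated assumptions: lens distortion is ignored, depth is assumed to be in metres, and `T_base_cam` is a hypothetical 4x4 camera-to-base transform (for example from a hand-eye calibration) that is not given in the paper.

```python
import numpy as np

# Internal reference matrix M_R as reported in Formula (5).
M_R = np.array([[1376.09, 0.0, 947.958],
                [0.0, 1376.4, 531.068],
                [0.0, 0.0, 1.0]])

def deproject(u, v, depth, K=M_R):
    """Back-project pixel (u, v) with depth into camera coordinates (distortion ignored)."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    return np.array([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])

def to_base(p_cam, T_base_cam):
    """Map a camera-frame point into the robot base frame with a 4x4 homogeneous transform."""
    return (T_base_cam @ np.append(p_cam, 1.0))[:3]

# Hypothetical transform: camera axes aligned with the base, shifted by a fixed offset.
T_base_cam = np.eye(4)
T_base_cam[:3, 3] = [0.1, 0.2, 0.3]
p_base = to_base(deproject(947.958, 531.068, 0.5), T_base_cam)
```

A pixel at the principal point lies on the optical axis, so its camera-frame coordinates are (0, 0, depth) before the base transform is applied.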
4.2 Camera Data Acquisition
After completing camera calibration, the Intel RealSense D435i camera is used to acquire both depth and color data. The depth sensor captures the distance of objects
Fig. 5 Color map
Fig. 6 Depth map
within the camera's view, as shown in Fig. 6, while the color camera captures a color image, as shown in Fig. 5. In this paper, the Intel RealSense SDK is used to capture depth maps and color images in real time.
4.3 Camera Data Preprocessing
The left and right depth sensors of the Intel RealSense D435i and the color camera use separate coordinate systems. To associate depth information with each pixel in the color image, the depth and color images must therefore be registered (aligned) as a preprocessing step.
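The registration step can be sketched for a single pixel as back-projection, rigid transformation, and re-projection. This is an illustrative sketch only: the intrinsic matrices `K_depth` and `K_color` and the depth-to-color extrinsics `R`, `t` used below are hypothetical values, not the calibration of the camera used in the paper, and lens distortion is ignored.

```python
import numpy as np

def register_depth_pixel(u, v, z, K_depth, K_color, R, t):
    """Return the color-image pixel corresponding to depth pixel (u, v) with depth z."""
    # 1. Back-project the depth pixel into the depth camera frame.
    p_depth = z * (np.linalg.inv(K_depth) @ np.array([u, v, 1.0]))
    # 2. Move the point into the color camera frame.
    p_color = R @ p_depth + t
    # 3. Project the point with the color intrinsics.
    uvw = K_color @ p_color
    return uvw[:2] / uvw[2]

# Hypothetical intrinsics shared by both sensors and a 15 mm horizontal baseline.
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
pixel = register_depth_pixel(320, 240, 1.0, K, K, np.eye(3), np.array([0.015, 0.0, 0.0]))
```

Applying this mapping to every valid depth pixel yields a depth value for each color pixel, which is what the alignment step requires.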
5 Experimental Results and Analysis
5.1 Object Detection
5.1.1 Dataset and Preprocessing
In the present study, images of spoiled apples, mangoes, oranges, and bananas were collected to train the model. In total, 3600 images were collected, including self-captured images and images obtained from kaggle.com/fruit 360. The LabelImg software was used to label the dataset and generate .txt files containing the category and bounding-box coordinates of each object. To improve the generalization of the model and to compensate for the large amount of data that deep learning models require, several augmentation methods were applied, including flipping and brightness, color, and contrast enhancement, together with a dedicated data augmentation method known as Mosaic. The Mosaic technique splices four different, randomly chosen images together to create more diverse training samples and improve the model's ability to detect objects accurately. The procedure and functionality of Mosaic data enhancement are shown in Fig. 7.
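The splicing idea behind Mosaic can be sketched as follows. This is a simplified, dependency-light illustration rather than the YOLOv5 implementation: each image is resized to its quadrant by nearest-neighbour sampling instead of being cropped, bounding-box labels are not transformed, and the function name `mosaic` and the output size are assumptions.

```python
import numpy as np

def mosaic(images, out_size=640, rng=None):
    """Splice four HxWx3 uint8 images around a random centre into one square image."""
    assert len(images) == 4
    rng = np.random.default_rng() if rng is None else rng
    # Random split point, kept away from the borders so every quadrant is non-empty.
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    regions = [(0, cy, 0, cx), (0, cy, cx, out_size),
               (cy, out_size, 0, cx), (cy, out_size, cx, out_size)]
    for img, (y0, y1, x0, x1) in zip(images, regions):
        h, w = y1 - y0, x1 - x0
        # Nearest-neighbour resize by index sampling.
        rows = np.arange(h) * img.shape[0] // h
        cols = np.arange(w) * img.shape[1] // w
        canvas[y0:y1, x0:x1] = img[rows][:, cols]
    return canvas

# Four constant-colour toy images make the quadrants easy to check.
toy = [np.full((32, 48, 3), value, dtype=np.uint8) for value in (10, 20, 30, 40)]
spliced = mosaic(toy, out_size=64, rng=np.random.default_rng(0))
```

In a real training pipeline the bounding boxes of each source image would have to be scaled and shifted in the same way as the pixels.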
5.1.2 Experimental Environment and Evaluation Index
The model presented in this research was tested in the laboratory environment, and the test results were assessed both quantitatively and qualitatively. The deep learning development environment used in this study is Anaconda3. Table 1 lists the hardware and software environment used to develop the deep learning model. A widely used evaluation criterion for detection tasks is the PASCAL VOC metric, namely the mean average precision (mAP) at an Intersection over Union (IoU) threshold of 0.5. Calculating mAP requires the precision and recall rates, which are defined by the formulas below.
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{7}$$
Fig. 7 Mosaic data enhancement principle and implementation process
Table 1 Development hardware and environment of deep network learning model
CPU                 AMD Ryzen 7 5800H with Radeon Graphics
GPU                 NVIDIA GeForce RTX 3060 Laptop GPU 6 GB
System              ubuntu20.04, CUDA10.2
Training framework  pytorch1.70
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{8}$$
$$AP = \int_{0}^{1} P(r)\,dr \tag{9}$$
$$mAP = \frac{1}{C}\sum_{i=1}^{C} AP_i \tag{10}$$
$TP$ represents the spoiled fruit targets correctly identified by the network model, $FP$ represents the non-spoiled fruit targets that the network model identified as spoiled, and $FN$ corresponds to the spoiled fruit targets missed by the network model. The $P$-$r$ curve is used to evaluate performance, with the recall rate plotted on the horizontal axis and the precision on the vertical axis; the area under the curve is $AP$. The average of the $AP$ values over all $C$ classes, known as $mAP$, is used to evaluate the network model in detecting multi-class targets.
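The metrics in (7) to (10) can be sketched in a few lines. This is a minimal illustration, not the evaluation code used in the paper: the area under the P-r curve is approximated with the trapezoidal rule, whereas PASCAL VOC tooling uses an interpolated precision curve, and all function names are assumptions.

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """Precision and recall from counts of true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def average_precision(recalls, precisions):
    """Area under the P(r) curve by the trapezoidal rule; recalls must be sorted ascending."""
    r = np.asarray(recalls, dtype=float)
    p = np.asarray(precisions, dtype=float)
    return float(np.sum((r[1:] - r[:-1]) * (p[1:] + p[:-1]) / 2.0))

def mean_average_precision(ap_per_class):
    """Mean of the per-class AP values."""
    return float(np.mean(ap_per_class))

precision, recall = precision_recall(tp=8, fp=2, fn=2)
ap = average_precision([0.0, 0.5, 1.0], [1.0, 1.0, 0.5])
```

In the toy curve above, precision stays at 1.0 up to a recall of 0.5 and then falls linearly to 0.5, which gives an area of 0.875.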
Fig. 8 Training and validation loss curves
Fig. 9 Mean curve of average precision
5.1.3 Analysis of Results
As shown in Fig. 8, the training and validation loss curves initially decrease and stabilize after about 200 iterations, converging to around 0.20. This suggests that the selected hyperparameters are appropriate and that the network model fits the data well. Figure 9 shows the mean average precision curve of the spoiled fruit detection model. After 80 iterations, the mean average precision on the training set and the validation set stabilizes, and both finally exceed 97%. This shows that
Table 2 Parameters of monocular camera
Method           mAP (%)  F1     Detection speed (FPS)  Weight size (M)
Improved YOLOv5  97.4     0.928  75                     19.5
YOLOv5           96.5     0.883  102                    17
YOLOv4           96.2     0.922  47                     401
YOLOv3           95.0     0.926  52                     117
Fast R-CNN       87.6     0.797  /                      315
the model structure is reasonable, the performance is stable, and there are no problems such as overfitting or underfitting. To prove the superiority of this algorithm, we conducted a comparison by using identical datasets and training parameters to evaluate the accuracy and reliability of the improved YOLOv5 model against Fast R-CNN, YOLOv3, YOLOv4, and the original YOLOv5 model with default parameters. The following three indicators, namely mAP, F1 score, and detection speed, have been selected as measurement standards to satisfy the actual scenario requirements (Table 2). These models are trained and validated on the same dataset, and the experimental results are shown in Table 3. It can be seen that the improved YOLOv5-BiFPN algorithm has improved accuracy and detection speed compared with the original algorithm.
5.1.4 Visual Display in Different Environments
To illustrate the effectiveness of the detection algorithm, the model's generalization ability and robustness were evaluated in a variety of detection environments. The results show that the model maintains high detection accuracy for fruit surface images taken under harsh conditions, such as strong light and low light. The model's localization was accurate and its classification was correct. Hence, the model established in this study is robust and generalizes well. It can be used to automatically sort spoiled fruit in complex environments, providing strong algorithmic support for fruit sorting. Figures 10 and 11 compare YOLOv5 and the improved YOLOv5 in identifying spoiled apples; the results show that the improved YOLOv5 model gives better detection.
5.2 Experimental Platform Test
To evaluate the practicality of the system, we tested the system in our laboratory using a 6-DOF industrial robot arm. The arm had suction cups attached to its sixth
Fig. 10 Extracted frames of video prediction on YOLOV5
Fig. 11 Extracted frames of video prediction on improved YOLOV5
flange, and an Intel RealSense depth camera was installed at an angle above the arm to capture images of objects on the operation platform. Additionally, we employed a fixed-camera ("eye-to-hand") configuration so that the camera does not move with the arm, which prevents motion blur and keeps the captured images clear. Using the three-dimensional coordinates of the target, the robot arm determined the grasping position, picked up the spoiled fruit, and placed it at a designated spot, repeating until all targets were sorted. For efficient sorting, we set a suitable moving speed for the robot arm's end effector, such that the sorting time for a single target is no more than 6 s and the sorting time for 14 targets is no more than 2 min. This time is acceptable for practical industrial applications.
6 Conclusion
In this study, a real-time detection and sorting system for spoiled fruits based on deep learning technology was designed and implemented. The system can detect and sort fruit quickly and accurately, addressing the problem that fruit deteriorates easily and poses safety risks. Using the improved YOLOv5s algorithm, the system achieves fast and accurate fruit recognition, while image registration and coordinate conversion provide the 3D coordinates of the target object in the robot base coordinate system, enabling accurate fruit sorting. Experiments show that the system offers strong anti-interference ability, a high detection frame rate, high detection precision, and high positioning precision, and can guide the robot arm quickly and accurately. It
has important application significance and value for improving the efficiency and accuracy of fruit sorting, and also provides a new solution for related fields.
References
1. Lee, Y.-N., Lee, S., Kim, J.-S., Patra, J.K., Shin, H.-S.: Chemical analysis techniques and investigation of polycyclic aromatic hydrocarbons in fruit, vegetables and meats and their products. Food Chem. 277, 156–161 (2019)
2. Wang, Z., Ma, W., Wei, J., Lan, K., Yan, S., Chen, R., Qin, G.: Ultrasensitive flexible olfactory receptor-derived peptide sensor for trimethylamine detection by the bending connection method. ACS Sens. 11, 3513–3520 (2022)
3. Cai, J., Zou, C., Yin, L., Jiang, S., El-Seedi, H.R., Guo, Z.: Characterization and recognition of citrus fruit spoilage fungi using Raman scattering spectroscopic imaging. Vib. Spectro. 124, 103474 (2023)
4. Hemamalini, V., Rajarajeswari, S., Nachiyappan, S., Sambath, M., Devi, T., Singh, B.K., Raghuvanshi, A.: Food quality inspection and grading using efficient image segmentation and machine learning-based system. J. Food Quality 2022, 1–6 (2022)
5. Singh, S., Singh, N.P.: Machine Learning-Based Classification of Good and Rotten Apple, pp. 377–386. Springer (2019)
6. Ireri, D., Belal, E., Okinda, C., Makange, N., Ji, C.: A computer vision system for defect discrimination and grading in tomatoes using machine learning and image processing. Artif. Intell. Agric. 2, 28–37 (2019)
7. Roy, K., Chaudhuri, S.S., Pramanik, S.: Deep learning based real-time Industrial framework for rotten and fresh fruit detection using semantic segmentation. Microsyst. Technol. 27, 3365–3375 (2021)
8. Ma, N., Zhang, X., Zheng, H.-T., Sun, J.: Shufflenet v2: practical guidelines for efficient CNN architecture design. Proc. Eur. Conf. Comput. Vis. 116–131 (2018)
9. Zhang, X., Zhou, X., Lin, M., Sun, J.: Shufflenet: an extremely efficient convolutional neural network for mobile devices. Proc. IEEE Conf. Comput. Vis. Pattern Recogn. 6848–6856 (2018)
10. Zhang, X., Zhou, X., Lin, M., Sun, J.: Apple leaf disease recognition method base on improved ShuffleNet V2. In: 2021 3rd International Conference on Advances in Computer Technology, Information Science and Communication (CTISC), pp. 276–282. IEEE (2021)
Based on Graph Model: A Method for Locating and Reconstructing Entanglement Ship Trajectory
Yongze Zhu, Shuangxin Wang, and Jingyi Liu
Abstract Accurate positioning and restoration of ship entanglement trajectory form the foundation of research on ship motion. Traditional methods for positioning entanglement trajectory suffer from high false alarm rate and low recall ratio, resulting in reduced accuracy of subsequent trajectory restoration, and potential loss or error of ship motion information, which adversely affects related research such as ship motion. Therefore, based on the characteristics of large inertia in ship motion and sudden change in folding degree and energy of entanglement trajectory in graph domain, this study proposes a novel approach for the localization and restoration of entanglement trajectory. The proposed method employs the entanglement index of trajectory path graph for entanglement region diagnosis, and restores the entanglement trajectory through depth-first search of the trajectory hypergraph, thus achieving the localization and restoration of entanglement trajectory. The experimental results show that, compared with the traditional entanglement trajectory localization and restoration algorithms, the proposed method can accurately locate the entanglement region and effectively reduce the information loss during the trajectory restoration process, which provides high-quality data inputs for subsequent research in the field of ship motion.
Keywords Entanglement trajectory · AIS data · Graph theory · Laplace matrix
Y. Zhu · S. Wang
School of Mechanical, Electronic and Control Engineering, Beijing Jiaotong University, Beijing, China
J. Liu (B)
The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang, Hebei, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_22
Y. Zhu et al.
1 Introduction
The rapid expansion of the marine economy and the widespread use of automatic identification systems (AIS) have led to a significant increase in the volume of ship trajectory data [1, 2], thereby providing ample data support for research in the domain of ships based on AIS data [3]. Accurate and reliable ship trajectory data is a prerequisite for such research. Thus, the identification and management of abnormal ship AIS data is crucial for various ship-related domains, such as ship trajectory prediction, ship behavior pattern recognition, and AIS data-based ship collision avoidance [4]. One of the main difficulties in processing trajectory data is ship entanglement trajectory, which is a typical abnormal trajectory problem found in AIS data [5]. The precise localization and restoration of entanglement regions are essential for effective management of entanglement trajectory. The current approaches to entanglement trajectory localization can be categorized into two groups: entanglement localization based on threshold filtering and entanglement localization based on neural networks. Threshold-based entanglement localization approaches locate entanglement regions by defining varying abnormal thresholds. Scholars such as Wu et al. [6], Lei et al. [7], and Nie et al. [8] have proposed multiple thresholds for different abnormal types in AIS data to localize abnormal trajectories. These methods are easy to implement and perform well in identifying obvious abnormal trajectories, but achieving high precision and recall in complex entanglement problems remains challenging. Wang [5] has proposed a BP neural network-based method for identifying abnormal AIS data. This method can automatically identify entanglement regions with high accuracy by utilizing entanglement trajectory samples as training data. However, training the neural network requires a substantial number of samples.
There are two main categories of existing methods for restoring entanglement trajectory: based on trajectory simplification for removing entanglements, and based on motion patterns for restoring entangled trajectories. The entanglement removal method based on trajectory simplification aims to eliminate entanglement by either reducing or directly deleting the entangled regions. Scholars such as Jiaqi [9], Pan [10], Zhang [11], and Sang [12] have employed this method for processing entanglement trajectories, owing to its simplicity. However, this approach is often challenged by the potential loss of valuable ship motion information that may be inadvertently deleted. The entanglement restoration method proposed by Liu et al. [13] is based on motion attribute constraints, which involve constraining instantaneous velocity and time intervals to accurately restore entangled trajectories. However, the thresholds for these constraints need to be set manually. In summary, the aforementioned methods can achieve the localization and restoration of entanglement regions to some extent, but striking a balance between effectiveness and universality remains a challenge. Therefore, leveraging the large inertia of ship movements, this paper proposes entanglement localization based on trajectory graph complexity and entanglement restoration based on dynamic hypergraph embedding. The proposed methods focus on significant jumps and energy changes in
Based on Graph Model: A Method for Locating …
entanglement trajectory in the graph domain. By minimizing the loss of motion information, these methods can efficiently and accurately achieve entanglement localization and restoration, filling some gaps in existing methods for localizing and restoring entanglement in ship trajectory.
2 Ship Entanglement Trajectory and AIS Data Preprocessing
2.1 Ship Entanglement Trajectory
Trajectory entanglement is a type of abnormal trajectory. It differs from outliers and missing values, which affect individual data points: entanglement randomly affects the data throughout the entangled neighborhood, as shown in Fig. 1. In Fig. 1, trajectory points 3, 4, 5, 6, and 7 are entangled. The normal trajectory sequence is [1, 2, 3, 4, 5, 6, 7, 8, 9], while the entangled trajectory sequence is [1, 2, 6, 7, 5, 3, 4, 8, 9]. In other words, trajectory entanglement is an abnormal trajectory whose spatial information is captured correctly but whose points are misplaced in time. Trajectory entanglement has multiple causes, most of which stem from the unique broadcast mechanism of AIS data. Each AIS frame contains multiple message segments, and the sending intervals of different fields within the same frame can vary. In addition, the sending intervals of the same field in different frames can also differ. When the AIS data is not updated in time or the base station interrupts the request to send a message, trajectory entanglement occurs. Because ship motion has a large inertia, normal trajectories are relatively smooth, whereas entangled trajectories show significant variations in trajectory complexity in the entangled region due to the time misalignment, as shown in Fig. 1. Therefore, this paper takes the trajectory
Fig. 1 Illustration of trajectory entanglement (panels: normal trajectory; entangled trajectory)
complexity jump in the entangled region as its starting point and exploits the advantages of graphs in preserving local trajectory structure to study graph structures for entanglement trajectory identification and restoration.
2.2 Data Cleansing
Compared to ordinary time series data, AIS data suffers from missing values, noise, and a large number of abnormal records due to its unique dynamic message transmission mechanism. To improve the accuracy and efficiency of subsequent identification and management of anomalous trajectories, it is necessary to enhance the quality of data through appropriate preprocessing methods. AIS data consists of static and dynamic data. The former records specific attributes of ships, such as ship name, MMSI, ship length, ship width, and ship type. The latter mainly records real-time updates of ship information during navigation, such as ship latitude and longitude, timestamp, course, speed, and heading. This study focuses on the position information and speed information data related to ship motion attributes.
2.2.1 Imputing Missing Values
There are numerous missing values in the ship AIS data, due to the large data volume and unequal time intervals of message transmissions in AIS datasets. To address this issue, linear interpolation is applied to fill the missing values based on the principle of the large inertia of ship motion.
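A minimal sketch of linear interpolation over unevenly spaced timestamps is given below. It assumes missing values are encoded as NaN and that timestamps are in seconds; the function name `fill_missing` is hypothetical. Note that `numpy.interp` holds the nearest known value for gaps before the first or after the last valid sample.

```python
import numpy as np

def fill_missing(t, values):
    """Linearly interpolate NaN entries of `values` over the timestamps `t`."""
    t = np.asarray(t, dtype=float)
    v = np.asarray(values, dtype=float).copy()
    missing = np.isnan(v)
    # Interpolate only at the missing timestamps, using the valid samples as knots.
    v[missing] = np.interp(t[missing], t[~missing], v[~missing])
    return v

# A latitude-like series sampled at unequal intervals with one missing value.
filled = fill_missing([0.0, 10.0, 30.0], [1.0, np.nan, 4.0])
```

Because the interpolation is performed over time rather than over the sample index, the unequal message intervals of AIS data are taken into account.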
2.2.2 Duplicate Value Processing
Considering the large volume of AIS data, it is sufficient to retain the key points that describe the motion characteristics of trajectories. Therefore, directly deleting duplicated non-key information does not significantly affect the description of trajectories.
2.2.3 Outlier Handling
Obvious outliers that deviate from the trajectory are first deleted, and the resulting gaps are then filled by linear interpolation.
2.2.4 Data Normalization
Row-wise normalization is performed using the following formula to standardize each row of data to a specific range $[y_{\min}, y_{\max}]$:
$$y = \frac{(y_{\max} - y_{\min})(x - x_{\min})}{x_{\max} - x_{\min}} + y_{\min} \tag{1}$$
where x is the input data and y is the standardized data.
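The normalization formula can be sketched as below. The handling of constant rows (mapped to the lower bound to avoid division by zero) is an assumption added for robustness and is not specified in the paper; the function name `normalize_rows` is also hypothetical.

```python
import numpy as np

def normalize_rows(x, y_min=0.0, y_max=1.0):
    """Scale each row of x to the range [y_min, y_max] using min-max normalization."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=1, keepdims=True)
    x_max = x.max(axis=1, keepdims=True)
    # Assumption: a constant row has zero span and is mapped to y_min.
    span = np.where(x_max > x_min, x_max - x_min, 1.0)
    return (y_max - y_min) * (x - x_min) / span + y_min

scaled = normalize_rows([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]], y_min=-1.0, y_max=1.0)
```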
2.3 Trajectory Segmentation
In the field of ship trajectory research, it is common practice to connect dynamic information in a time sequence and synchronously record static information to generate ship trajectories. In this paper, ship trajectories are segmented and extracted based on the unique identifier of the ship (MMSI), speed, navigation status, and timestamp. The specific steps are as follows:
Fig. 2 The flowchart for trajectory segmentation
(1) The AIS data of the target ship is extracted using its MMSI and sorted by timestamp;
(2) The AIS data is traversed until a point with a navigation status of "on sail" is found, and this point is marked as the trajectory starting point $T_i$;
(3) The time interval $\Delta t$ between trajectory points $T_{i+1}$ and $T_i$ is computed. If $\Delta t$ is greater than 300 s, $T_i$ is marked as the trajectory end point. Otherwise, if the speed is close to zero, $T_{i+1}$ is marked as the trajectory end point;
(4) The above steps are repeated until the entire AIS data of the target ship has been traversed;
(5) The trajectory data set $X$ of the original ship is generated by traversing the AIS data of the target ship in time order and extracting the trajectory points between adjacent trajectory start and end points.
The above steps involve two thresholds. The time interval threshold of 300 s follows from the dynamic information transmission mechanism of AIS data: the maximum time interval for sending AIS data is 300 s, and the minimum is 2 s. The maximum time interval is therefore used as one of the thresholds for determining trajectory end points, which minimizes the amount of simulated data generated by linear interpolation and improves the accuracy of the trajectories (Fig. 2).
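The segmentation steps above can be sketched as a single pass over the time-sorted points of one ship. This is an illustrative sketch under assumptions: the record type `AisPoint` and its field names are hypothetical, and the speed threshold `stop_speed` is an assumed value, since the text only states that the speed should be close to zero.

```python
from dataclasses import dataclass

@dataclass
class AisPoint:
    t: float          # timestamp in seconds
    speed: float      # speed over ground
    under_way: bool   # True when the navigation status is "on sail"

def segment(points, max_gap=300.0, stop_speed=0.1):
    """Split the AIS points of one ship into trajectories (lists of points)."""
    points = sorted(points, key=lambda p: p.t)
    trajectories, current = [], []
    for p in points:
        if not current:
            # Step (2): wait for a point whose status is "on sail".
            if p.under_way:
                current = [p]
            continue
        if p.t - current[-1].t > max_gap:
            # Step (3): a gap longer than 300 s ends the trajectory at the previous point.
            trajectories.append(current)
            current = [p] if p.under_way else []
        elif p.speed <= stop_speed:
            # Step (3): a near-zero speed ends the trajectory at the current point.
            current.append(p)
            trajectories.append(current)
            current = []
        else:
            current.append(p)
    if current:
        trajectories.append(current)
    return trajectories
```

For example, a stream with a 380 s gap followed by a stop yields one trajectory before the gap, one ending at the stop, and a new one starting at the next "on sail" point.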
3 Tangled Trajectory Localization Method Based on Graph Complexity
3.1 Model Establishment
3.1.1 Trajectory Path Graph
Compared with time series data, graphs preserve the geometric structure of samples, which improves computational performance and efficiency, and they are therefore widely used in trajectory research and related fields [14]. Generally speaking, a graph is a pair consisting of a vertex set and an edge set. The path graph representing a ship trajectory differs from social network graphs and document citation graphs: there are no isolated edges or vertices in the path graph of the trajectory, and once determined, the structure of the ship trajectory path graph cannot change. Different trajectory points on the ship trajectory have rich physical features, such as distance, speed, and acceleration. Therefore, to keep the graph definition consistent, this paper uses vertices to represent the positions of trajectory points in the trajectory, and edges to represent the trajectory segments connecting these trajectory points. The ship trajectory graph is represented as an undirected path graph $G = (V, E)$, where $V = V(G) = [v_1, v_2, \ldots, v_n]$ is the vertex set of $G$ and $E = E(G)$ is the edge set of $G$. The number of vertices $|V| = n$ and the number of edges $|E| = m$ are called the order and size of $G$, respectively. Specifically, for the trajectory path graph, $m = n - 1$. If the edge $e_{ij} \in E(G)$ joins the two vertices $v_i$ and $v_j$, then $e_{ij} = (v_i, v_j) = v_i v_j$, and the vertices $v_i$ and $v_j$ are said to be adjacent.
3.1.2 Adjacency Matrix and Laplacian Matrix
The connectivity of nodes in the trajectory path graph is represented by an $n$-dimensional adjacency matrix $A = A(G) = (a_{ij})_{n \times n} \in \mathbb{R}^{n \times n}$, where
$$a_{ij} = \begin{cases} \lVert v_j - v_i \rVert, & v_i v_j \in E(G); \\ 0, & \text{otherwise.} \end{cases} \tag{2}$$
Clearly, $A(G)$ is a symmetric matrix. The degree matrix $D = D(G) = (d_{ij})_{n \times n}$ of the trajectory path graph is a real diagonal matrix obtained from the adjacency matrix $A$: the diagonal element $d_{ii}$ equals the sum of all the elements in the corresponding column of the adjacency matrix and describes the edges emanating from the corresponding vertex $v_i$ in the graph,
$$d_{ij} = \begin{cases} \sum_{j=1}^{n} a_{ij}, & i = j; \\ 0, & i \neq j. \end{cases} \tag{3}$$
Based on the adjacency matrix $A$ and the degree matrix $D$ of the trajectory path graph, the Laplacian matrix $L$ of the trajectory path graph is computed as
$$L = D - A \tag{4}$$
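The construction of $A$, $D$ and $L$ can be illustrated with a minimal sketch. It assumes that each vertex is a 2-D position and that the weight of an edge is the Euclidean distance between adjacent trajectory points; the function name `path_graph_matrices` is hypothetical and not taken from the paper.

```python
import math

def path_graph_matrices(points):
    """Build A, D and L = D - A for a trajectory path graph.

    points: list of (x, y) vertex positions in trajectory order.
    Assumption: edge weights are Euclidean distances between neighbours.
    """
    n = len(points)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):                    # a path graph has m = n - 1 edges
        w = math.dist(points[i], points[i + 1])
        A[i][i + 1] = w
        A[i + 1][i] = w                       # undirected graph: A is symmetric
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        D[i][i] = sum(A[i])                   # degree = row sum of A
    L = [[D[i][j] - A[i][j] for j in range(n)] for i in range(n)]
    return A, D, L

A, D, L = path_graph_matrices([(0, 0), (3, 4), (3, 10)])
```

Every row of $L$ sums to zero by construction, which is a quick sanity check on any implementation.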
From this formula, it can be seen that the Laplacian matrix of the trajectory path graph is a real symmetric matrix. Therefore, it can be diagonalized by an orthogonal similarity transformation,
$$L = U \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} U^{-1} = U \begin{pmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{pmatrix} U^{T} \tag{5}$$
where $U$ is the matrix of eigenvectors of the Laplacian matrix, and $\lambda_1, \ldots, \lambda_n$ are the eigenvalues of the Laplacian matrix.
3.1.3 Entanglement Entropy of Trajectory Path Graph
In the graph domain, entanglement has the characteristics of sudden change in the folding degree and local energy of the graph. Therefore, mapping the trajectory data of
the time series to the graph domain to obtain the trajectory path graph, and calculating the entanglement index of that graph, makes it possible to determine entanglement regions. The entanglement index of the trajectory path graph consists of two parts: the local folding degree and the local energy index. The local folding degree characterizes the folding of edges in the graph and is proportional to the sum of the folding degrees of all edges in the graph. It is calculated as follows:
$$T_1 = \exp\left(-\frac{\alpha \lVert v_{i+l} - v_i \rVert + \lvert t_{i+l} - t_i \rvert}{\sum_{j=1}^{l} a_{j+i} + \lvert t_{i+l} - t_i \rvert}\right) \tag{6}$$
where $\alpha$ is the space-time coefficient ratio, with a value of 1; $t_i$ is the time of vertex $v_i$; and $l$ is the size of the sliding window. The local energy index reflects the energy change of the trajectory path graph and is proportional to the energy of the graph. It is calculated as follows:
$$T_2 = \frac{l\, e^{\lambda}}{\sum_{j=1}^{l} e^{\lambda_{j+i-1}}} \tag{7}$$
where $\lambda$ is the eigenvalue of the Laplacian matrix of the trajectory path graph, and $l$ is the size of the sliding window. Combining the two formulas above gives the entanglement index of the trajectory path graph:
$$T = T_1 + \beta T_2 \tag{8}$$
where $\beta$ is the balance coefficient, with a value of 0.1. The entanglement index of the trajectory path graph combines the folding degree index and the energy index, so it reflects both the folding degree of the edges and the distribution of energy in the trajectory path graph. From the index formula, the entanglement index of the trajectory path graph ranges from 0 to 1. The angle between two adjacent edges in the trajectory path graph is taken as the integration variable; because ship movement has large inertia, the limit values of this angle, 0 and π, are used as the lower and upper limits of the integral. Integrating the entanglement index of the trajectory path graph over this range yields a threshold value of 0.6. When the entanglement index of the trajectory path graph in a region is greater than 0.6, an entangled trajectory is considered to appear in that region.
3.2 Entangled Trajectory Localization Process Based on Graph Complexity
Compared with working on time series data directly, using the Laplacian matrix of a graph to study abnormal trajectories is easy to operate and efficient, and entangled trajectories are characterized by sudden folding and a dramatic energy increase in the graph domain. Therefore, by mapping time-series trajectories to the graph domain and analyzing the entanglement index of the trajectory path graphs, entangled trajectories can be located. The specific algorithmic process is as follows:
(1) Map the time-series trajectory $X = [x_1, x_2, \ldots, x_n]$ to the graph domain using the method in Sect. 3.1 to obtain the trajectory path graph $G$.
(2) Select a subgraph of order 5 as the sliding window, with a step size of 1.
(3) Calculate the adjacency matrix $A$, degree matrix $D$, and Laplacian matrix $L$ of the subgraph of the trajectory path graph inside the sliding window.
(4) Calculate the folding degree $T_1$ and energy index $T_2$ of the trajectory in the sliding window.
(5) Calculate the entanglement index $T$ of the trajectory in the sliding window.
(6) Compare the entanglement index with the threshold of 0.6, and extract the trajectory graph vertex region $[p, q]$ in which the result is greater than 0.6.
(7) Use the order 5 of the subgraph to extend the right end of the trajectory graph vertex region, obtaining the entangled trajectory graph vertex region $[p, q + 4]$.
By inputting preprocessed time-series trajectory data into the above algorithm, the trajectory path graph data can be obtained and the existence of trajectory entanglement can be determined. If entanglement exists, the vertex numbers of the entangled region are output, providing accurate data input for subsequent entangled trajectory reconstruction (Fig. 3).
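Steps (2) to (7) of the localization process can be sketched as follows. This is only an illustration under stated assumptions: `index_fn` is a hypothetical callable that returns the entanglement index $T$ of the window starting at a given vertex (its computation from $T_1$ and $T_2$ is not reproduced here), and vertices are numbered from 0.

```python
def locate_entangled_regions(n_vertices, index_fn, window=5, threshold=0.6):
    """Return vertex intervals (p, q + window - 1) of suspected entanglement.

    index_fn(start, window) -> float is a hypothetical callable giving the
    entanglement index T of the window that starts at vertex `start`.
    """
    # Slide the window with step size 1 and keep the starts whose T > threshold.
    flagged = [s for s in range(n_vertices - window + 1)
               if index_fn(s, window) > threshold]
    runs = []
    for s in flagged:
        if runs and s == runs[-1][1] + 1:     # consecutive start: grow region [p, q]
            runs[-1][1] = s
        else:
            runs.append([s, s])
    # Extend the right end of each region by window - 1 (4 for a window of order 5).
    return [(p, q + window - 1) for p, q in runs]
```

For example, if the windows starting at vertices 2, 3 and 6 exceed the threshold, the sketch returns the regions (2, 7) and (6, 10). Overlapping regions are reported separately here; merging them is a design choice left open.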
Fig. 3 The flowchart for the localization of the entanglement region
3.3 Evaluation Metrics
To quantitatively verify the feasibility and reliability of the algorithm, and considering the data imbalance problem of entangled trajectories, the following indicators are used as evaluation criteria for the experiments.
3.3.1 The Symbols in the Evaluation Criteria Are Defined as Follows
TP represents the accuracy of entanglement localization, FNR represents the missed detection rate of entanglement localization, and FAR represents the false alarm rate of entanglement localization. ET represents the number of trajectory points in the test samples that are actually entangled points and are also identified as entangled points by the test results; EF represents the number of trajectory points in the test samples that are actually non-entangled points but are identified as entangled points by the test results; EW represents the total number of actual entangled points in the test samples; PT represents the number of trajectory points in the test samples that are actually non-entangled points and are also identified as non-entangled points by the test results; P represents the total number of trajectory points in the test samples. The accuracy of entanglement localization (TP) is calculated as follows:
$$\mathrm{TP} = \frac{\mathrm{ET} + \mathrm{PT}}{\mathrm{P}} \tag{9}$$
The missed detection rate of entanglement localization (FNR) is calculated as follows:
$$\mathrm{FNR} = \frac{\mathrm{EW} - \mathrm{ET}}{\mathrm{EW}} \tag{10}$$
The false alarm rate of entanglement localization (FAR) is calculated as follows:
$$\mathrm{FAR} = \frac{\mathrm{EF}}{\mathrm{P} - \mathrm{EW}} \tag{11}$$
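The three localization metrics can be computed from per-point labels as in the sketch below. The label encoding (1 for an entangled point, 0 for a normal point) and the function name `localization_metrics` are assumptions made for illustration.

```python
def localization_metrics(actual, predicted):
    """Compute TP, FNR and FAR from equal-length sequences of 0/1 labels.

    Assumption: 1 marks an entangled point and 0 a non-entangled point.
    """
    P = len(actual)                                                   # all points
    ET = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    EF = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    PT = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    EW = sum(actual)                                                  # actual entangled points
    return {
        "TP": (ET + PT) / P,        # accuracy
        "FNR": (EW - ET) / EW,      # missed detection rate
        "FAR": EF / (P - EW),       # false alarm rate
    }
```

The sketch assumes that the sample contains at least one entangled and one non-entangled point; otherwise FNR or FAR is undefined and the division fails.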
4 Method for Reconstructing Entangled Trajectories Based on Dynamic Hypergraph Embedding Optimization
4.1 Model Establishment
4.1.1 Hypergraph
In order to fully utilize the relative positional relationships between track points in trajectory and maintain the local structure of trajectories, this study maps the temporal trajectory data of entangled regions onto a graph domain using hypergraphs
[15] for representation. The hypergraph is a triple, mathematically expressed as $G = (V, E, W)$, where $V = [v_1, \ldots, v_n]$ and $E = [e_1, \ldots, e_m]$ are the sets of vertices and hyperedges, respectively, and $W = [w_1, \ldots, w_m]$ are the weights of the hyperedges. The number of vertices $|V| = n$ and the number of hyperedges $|E| = m$ are referred to as the order and size of $G$, respectively. The hypergraph Laplacian matrix is an important tool for studying hypergraphs, and it is given by:
$$L = D_v - H W D_e^{-1} H^T \tag{12}$$
where $D_v$, $D_e$, and $W$ are the diagonal matrices of the vertex degrees $d(v_j)$, the hyperedge degrees $\delta(e_i)$, and the hyperedge weights $w(e_i)$, respectively, and $H$ is the incidence matrix between the vertices and hyperedges of the hypergraph. The hyperedge degrees $\delta = [\delta(e_i)]$ and the vertex degrees $d = [d(v_j)]$ are calculated as follows:
$$\delta(e_i) = \sum_{v_j \in V} h(v_j, e_i) \tag{13}$$
$$d(v_j) = \sum_{v_j \in e_i,\, e_i \in E} w(e_i)\, h(v_j, e_i) \tag{14}$$
From the above formulas, the two key ingredients for constructing the hypergraph Laplacian matrix are the incidence matrix $H$ and the weight matrix $W$. The weight matrix $W$ is learned from the trajectory data $X$ through the hypergraph, using the incidence matrix $H$. The incidence matrix $H$ is calculated as follows:
$$H(v_i, e_j) = \begin{cases} 1, & v_i \in e_j; \\ 0, & \text{otherwise}. \end{cases} \tag{15}$$
The symmetric normalized hypergraph Laplacian matrix for the trajectory hypergraph is obtained by normalizing the Laplacian matrix as follows:
$$L_{\mathrm{sym}} = I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2} \tag{16}$$
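As an illustration, the degrees in Eqs. (13) and (14) and the normalized Laplacian in Eq. (16) can be computed entry by entry from an incidence matrix and a weight vector. The sketch below uses plain Python lists; the function name `hypergraph_laplacian_sym` is hypothetical, and it assumes that every vertex and every hyperedge has a positive degree.

```python
import math

def hypergraph_laplacian_sym(H, w):
    """Symmetric normalized hypergraph Laplacian from incidence and weights.

    H: n x m incidence matrix as a list of rows with 0/1 entries.
    w: list of m hyperedge weights.
    Assumption: all vertex and hyperedge degrees are positive.
    """
    n, m = len(H), len(w)
    # Hyperedge degree: number of vertices in each hyperedge, Eq. (13).
    delta = [sum(H[i][e] for i in range(n)) for e in range(m)]
    # Vertex degree: weighted number of hyperedges containing the vertex, Eq. (14).
    d = [sum(w[e] * H[i][e] for e in range(m)) for i in range(n)]
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Entry (i, j) of H W De^{-1} H^T.
            theta = sum(w[e] * H[i][e] * H[j][e] / delta[e] for e in range(m))
            identity = 1.0 if i == j else 0.0
            L[i][j] = identity - theta / math.sqrt(d[i] * d[j])
    return L

# Three vertices, two hyperedges {v1, v2} and {v2, v3}, unit weights.
L_sym = hypergraph_laplacian_sym([[1, 0], [1, 1], [0, 1]], [1.0, 1.0])
```

A useful check is that the vector with entries $\sqrt{d(v_j)}$ lies in the null space of the result, as it does for the symmetric normalized Laplacian of an ordinary graph.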
Using the general graph generation formula of LPP, $\min_S \operatorname{tr}(S^T X L_{\mathrm{sym}} X^T S)$, the objective function for constructing the trajectory hypergraph is as follows:
$$\min_S \sum_{e \in E,\; x_i, x_j \in V} \frac{w(e)\, h(x_i, e)\, h(x_j, e)}{\delta(e)} \left\lVert \frac{S^T x_i}{\sqrt{d(x_i)}} - \frac{S^T x_j}{\sqrt{d(x_j)}} \right\rVert_2^2 \tag{17}$$
where $S$ is the transformation matrix and $x$ is the trajectory data. In order to ensure a unique solution for the above problem, an orthogonal constraint $S^T X L X^T S = I$ is imposed, that is,
$$\min_{S^T X L X^T S = I} \sum_{e \in E,\; x_i, x_j \in V} \frac{w(e)\, h(x_i, e)\, h(x_j, e)}{\delta(e)} \left\lVert \frac{S^T x_i}{\sqrt{d(x_i)}} - \frac{S^T x_j}{\sqrt{d(x_j)}} \right\rVert_2^2 \tag{18}$$
By solving the above equation, the construction of the trajectory hypergraph can be realized.
4.2 Dynamic Hypergraph Embedding Optimization-Based Entangled Trajectory Reconstruction Process
In order to improve the accuracy of entangled trajectory reconstruction, the entangled trajectory area data obtained in Sect. 3 is mapped to the graph domain to establish a trajectory hypergraph, which captures the relative relationships between data points and preserves the local structure. This process includes four important steps: (1) randomly generate and optimize the trajectory hypergraph; (2) calculate the Laplacian matrix of the trajectory hypergraph; (3) use the entanglement index $T$ as the objective function and find the reachable path that minimizes it with a depth-first search; (4) take the reachable path with the minimum objective function as the reconstruction result of the entangled area, and fuse it with the trajectories in the non-entangled area to obtain the final reconstruction result. The specific process is as follows:
(1) Map the entangled trajectory data $X = [x_1, \ldots, x_n]$ to the graph domain and generate a random hypergraph $G$.
(2) Based on the formulas in Sect. 4.1, optimize the trajectory hypergraph $G$ using a stepwise optimization approach:
(a) Update the transformation matrix $S$. With the other variables fixed, the objective function for generating the trajectory hypergraph becomes:
$$\min_{S^T X X^T S = I} \operatorname{tr}\left(S^T X L_{\mathrm{sym}} X^T S\right) \tag{19}$$
The optimal solution for $S$ is given by the eigenvectors of $\left(X X^T + \varepsilon I\right)^{-1} X L_{\mathrm{sym}} X^T$, where $\varepsilon$ is a very small positive value.
(b) Update the incidence matrix $H$. Using the optimal transformation matrix $S$ obtained in the previous step, transform the original trajectory data $X$ into a low-dimensional space, and construct a new set of hyperedges:
$$e_i = \left\{ v_j \mid \theta\left(S^T x_i, S^T x_j\right) \le 0.1\, \tilde{\sigma}_i \right\}, \quad i, j = 1, \ldots, n \tag{20}$$
where $\theta$ is the similarity function, usually the Euclidean distance, and $\tilde{\sigma}_i$ is the average distance between $S^T x_i$ and the other low-dimensional trajectory data. Using the above formula, the incidence matrix $H$ of the trajectory hypergraph can be obtained, and the degree matrix $D_e$ of the hyperedges in the trajectory hypergraph can be calculated from $H$:
$$\delta(e_i) = \sum_{v_j \in V} h(v_j, e_i), \quad i, j = 1, \ldots, n; \qquad D_e = \operatorname{diag}(\delta) \tag{21}$$
(c) Update the weight matrix $W$. With the other variables fixed, the objective function for generating the trajectory hypergraph becomes:
$$\min_{w^T \mathbf{1} = 1} \operatorname{tr}\left(S^T X \left(I - D_v^{-1/2} H W D_e^{-1} H^T D_v^{-1/2}\right) X^T S\right) \tag{22}$$
Exploiting the property that $W$ is a diagonal matrix, the above formula can be transformed into:
$$\min_{w^T \mathbf{1} = 1} -\operatorname{tr}\left(D_e^{-1} H^T D_v^{-1/2} X^T S S^T X D_v^{-1/2} H W\right) \tag{23}$$
Let $Q = D_e^{-1} H^T D_v^{-1/2} X^T S S^T X D_v^{-1/2} H$, $q = \operatorname{diag}(Q)$ and $w = \operatorname{diag}(W)$. The above problem becomes:
$$\min_{w^T \mathbf{1} = 1} \left\lVert w - \tfrac{1}{2} q \right\rVert_2^2$$
Using the Lagrange multiplier method, the Lagrange function is obtained:
$$\Gamma(w, \eta, \gamma) = \left\lVert w - \tfrac{1}{2} q \right\rVert_2^2 - \eta \left(w^T \mathbf{1} - 1\right) - \gamma^T w \tag{24}$$
where $\gamma \ge 0$ and $\eta \ge 0$ are Lagrange multipliers. Based on the KKT conditions, the closed-form solution for $w_i$, $i = 1, \ldots, n$, is:
$$w_i = \left(\tfrac{1}{2} q_i + \eta\right)_+, \quad i = 1, \ldots, n \tag{25}$$
where $\eta = \frac{1}{k} - \frac{1}{2k} \sum_{i=1}^{k} q_i$ and $k$ is the number of non-zero elements in $q$. After obtaining $w$, the weight matrix $W = \operatorname{diag}(w)$ and the vertex degree matrix $D_v$ for the trajectory hypergraph can be obtained using the following formula:
$$d(v_i) = \sum_{v_i \in e_j,\, e_j \in E} w(e_j)\, h(v_i, e_j), \quad i, j = 1, \ldots, n; \qquad D_v = \operatorname{diag}(d) \tag{26}$$
Fig. 4 The flowchart of the entangled trajectory restoration process
(3) Compute the Laplacian matrix $L$ of the trajectory hypergraph based on the results from step (2).
(4) Use the depth-first search algorithm to find the reachable path that minimizes the trajectory entanglement index $T$, which serves as the objective function for entanglement restoration.
(5) Merge the restored trajectories from the entanglement region with the original trajectories from the non-entanglement region to obtain the final trajectory restoration result (Fig. 4).
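The weight update in step (2)(c) can be read as a Euclidean projection of $q/2$ onto the probability simplex $\{w : w \ge 0,\ w^T \mathbf{1} = 1\}$. The sketch below uses the standard sort-based projection rather than the paper's closed form; this is an assumption about how the update could be implemented, and the function names are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = sorted(v, reverse=True)
    cumulative = 0.0
    shift = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (1.0 - cumulative) / k
        if u_k + candidate > 0:
            shift = candidate                 # keep the shift of the largest valid k
    return [max(x + shift, 0.0) for x in v]

def update_hyperedge_weights(q):
    """Assumed form of the update: project q / 2 onto the simplex."""
    return project_to_simplex([0.5 * x for x in q])

w = update_hyperedge_weights([1.0, 0.6, -2.0])
```

When $k$ entries of the result are positive, the shift equals $\frac{1}{k} - \frac{1}{2k}\sum q_i$ taken over those entries, which matches the form of $\eta$ in the closed-form solution.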
4.3 Evaluation Metrics
In order to quantitatively verify the feasibility and reliability of the algorithm, and considering the issue of data imbalance in entangled trajectories, the following indicators are adopted as evaluation criteria for the experiments.

Table 1 Grouping of experimental trajectory data
Group  Characteristics             Real count  Simulated count  Total count
L1     Straight                    9           21               30
L2     One direction change        4           26               30
L3     Multiple direction changes  5           25               30

4.3.1 The Symbols in the Evaluation Criteria Are Defined as Follows
ERA represents the entangled trajectory restoration accuracy, and ERL represents the entangled trajectory restoration loss rate; P represents the total number of trajectory points in the test sample; EP represents the total number of trajectory points after entangled trajectory restoration in the test sample; EPT represents the number of correct trajectory points after entangled trajectory restoration in the test sample. The entangled trajectory restoration accuracy ERA is calculated as follows:
$$\mathrm{ERA} = \frac{\mathrm{EPT}}{\mathrm{P}} \tag{27}$$
The entangled trajectory restoration loss rate ERL is calculated as follows:
$$\mathrm{ERL} = \frac{\mathrm{P} - \mathrm{EP}}{\mathrm{P}} \tag{28}$$
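Both restoration metrics follow directly from three counts, as the short sketch below shows. The function name `restoration_metrics` and the example counts are illustrative assumptions, not values from the experiments.

```python
def restoration_metrics(P, EP, EPT):
    """Restoration accuracy (ERA) and loss rate (ERL) from point counts.

    P: total points in the test sample; EP: points remaining after
    restoration; EPT: correct points after restoration.
    """
    return {"ERA": EPT / P, "ERL": (P - EP) / P}

# Illustrative counts only: 200 points, 198 kept, 184 of them correct.
metrics = restoration_metrics(P=200, EP=198, EPT=184)
```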
5 Experimental Verification
5.1 Data Source
The data used in the experiment were selected from the historical data of various types of ships in the Chinese coastal area (107°E–123°E, 5°N–26°N), archived by the Chinese AIS shore-based network and covering the period from December 2017 to December 2020. In addition, in order to verify the reliability of the proposed method, 72 entangled trajectories were generated by simulation. The trajectories were divided into three groups according to their overall complexity. The data for each group are shown in Table 1.
Fig. 5 Sample instances of 3 groups: a L1 group; b L2 group; c L3 group
5.2 Comparison Results and Analysis of Entangled Trajectory Positioning Methods
In order to verify the reliability of the proposed method on trajectory data with different motion characteristics, experiments were conducted on test samples according to the entangled trajectory localization process described in Sect. 3.2. To further validate its advantages, the proposed method was compared with an entanglement localization method based on a BP neural network and with a trajectory thinning method. Some experimental samples are shown in Fig. 5, where the red trajectory points indicate the entangled areas. The experimental results are shown in Figs. 6, 7, and 8, where the values 0 and 1 on the vertical axis represent the normal area and the entangled area, respectively. If the actual output for an area is the same as the expected output, the method's diagnostic result for that area is correct. If the actual output is 0 and the expected output is 1, the method misjudged an entangled area as a normal area. If the actual output is 1 and the expected output is 0, the method misjudged a normal area as an entangled area. Table 2 shows that the proposed method achieves the highest accuracy (96.13%) and the lowest missed detection rate (1.70%) for entanglement localization. However, its false alarm rate (5.34%) is higher than that of the BP method (4.84%), because the algorithm automatically extends the right end of each detected interval by the sliding window size when judging entangled areas. Overall, considering the three evaluation criteria, the results indicate that the algorithm is effective in recognizing entangled areas.
Fig. 6 Diagnostic results of L1 group samples
Fig. 7 Diagnostic results of L2 group samples
Fig. 8 Diagnostic results of L3 group samples

Table 2 Diagnostic results of entangled trajectory
Method           FAR%   TP%    FNR%
Proposed method  5.34   96.13  1.70
BP               4.84   91.56  13.29
Thinning         15.90  77.02  11.74

5.3 Comparison Results and Analysis of Entangled Trajectory Reconstruction Methods
In order to verify the reliability of the proposed method on trajectory data with different motion characteristics, experiments were conducted on the test samples according to the trajectory reconstruction process described in Sect. 4.2. In addition, to further verify the superiority of the proposed method, comparisons were made with the MDL and trajectory thinning methods. The experimental results are shown in Fig. 9. Compared with the MDL method and the thinning method, the proposed method better preserves the spatial information and important motion features of the original trajectory. Table 3 shows that the method reconstructs entangled trajectories with high accuracy (ERA of 92.16%) and effectively reduces the loss rate of trajectory points (ERL of 0.95%, against 25.08% for MDL and 31.47% for thinning). In addition, there are no entangled trajectories in the results reconstructed by the proposed method, while the results of the MDL method and the thinning method still contain entangled trajectories, which further demonstrates the effectiveness of the proposed method.
Fig. 9 The entanglement restoration results of 3 methods: a thinning method; b MDL method; c proposed method

Table 3 The entanglement restoration results of each method
Method           ERA%   ERL%
Proposed method  92.16  0.95
MDL              70.56  25.08
Thinning         64.37  31.47

6 Discussion
This paper proposes a graph-based method for locating and reconstructing entangled ship trajectories in AIS data. Experiments and comparisons were conducted on AIS data with different motion patterns. The results show that: 1) compared with traditional methods, the proposed method has higher precision; 2) under similar precision, the proposed method has a lower missed detection rate; 3) the proposed method preserves more trajectory motion information and reduces the possibility of losing important motion patterns, providing better data inputs for subsequent ship research based on AIS data.
Acknowledgements This work was supported by the Key R & D Plan Projects in Hebei Province [grant number 22340301D].
References
1. Jingkui, M.: AIS Based Remote Navigation Dynamic Monitoring System for Ships. Shanghai Maritime University, Shanghai (2006)
2. Tong, X.P., Xu, C., Lingzhi, S., Zhe, M., Qing, W.: Vessel trajectory prediction in curving channel of inland river. In: 2015 International Conference on Transportation Information and Safety (ICTIS), pp. 706–714. IEEE, Wuhan (2015). https://doi.org/10.1109/ICTIS.2015.7232156
3. Wen, T.Y., Lai, C.H., Lei, P.R., Peng, W.C.: RouteMiner: mining ship routes from a massive maritime trajectories. In: 2014 IEEE 15th International Conference on Mobile Data Management, vol. 1, pp. 353–356. IEEE, Brisbane (2014). https://doi.org/10.1109/MDM.2014.52
4. Baldauf, M., Benedict, K., Motz, F.: Aspects of technical reliability of navigation systems and human element in case of collision avoidance. In: Proceedings of the Navigation Conference & Exhibition, vol. 28. London (2008)
5. Yongming, W.: Ship Abnormal Behavior Detection and Early Warning Based on Large-Scale AIS Data. Dalian Maritime University, Dalian (2020). https://doi.org/10.26989/d.cnki.gdlhu.2020.001980
6. Jianhua, W., Chen, W., Wen, L.G.: Automatic detection and repair algorithm of ship AIS trajectory anomaly. J. Navig. China 40, 8–12 (2017). https://doi.org/10.3969/j.issn.1000-4653.2017.01.003
7. Lei, L., Zhonglian, J., Xiumin, C., Cheng, Z., Daiyong, Z.: Research on AIS base station coverage characteristics based on historical data. In: Proceedings of the 12th China Intelligent Transportation Annual Conference, pp. 42–52. Changshu (2017)
8. Yang, N., Xiumin, C., Xinglong, L.: Comparison of effectiveness analysis methods for inland river AIS data. J. Navig. China 39, 59–62 (2016). https://doi.org/10.3969/j.issn.1000-4653.2016.02.014
9. Jiaqi, A.: AIS Track Clustering Analysis and Abnormal Trajectory Detection. Dalian Maritime University, Dalian (2020). https://doi.org/10.26989/d.cnki.gdlhu.2020.001128
10. Pan, S.: Ship Route Information Mining and Application Based on Shore-Based AIS Data. Shanghai Jiaotong University, Shanghai (2019). https://doi.org/10.27307/d.cnki.gsjtu.2019.001740
11. Zhang, W., Wu, Q., Sang, L., Mao, Z.: Denoising method of inland AIS information based on vessel track. In: 2012 11th International Symposium on Distributed Computing and Applications to Business, Engineering & Science, pp. 358–361. IEEE, Guilin (2012). https://doi.org/10.1109/DCABES.2012.99
12. Sang, L., Wall, A., Mao, Z., Yan, X., Wang, J.: A novel method for restoring the trajectory of the inland waterway ship by using AIS data. Ocean Eng. 110, 183–194 (2015). https://doi.org/10.1016/j.oceaneng.2015.10.021
13. Jingyi, L., Xiaoqian, G., Qi, G.: Ship AIS track de-entanglement method based on motion attribute constraint. Radio Eng. 53, 678–685 (2023)
14. Cui, Y., Henrickson, K., Ke, R., Wang, Y.: Traffic graph convolutional recurrent neural network: a deep learning framework for network-scale traffic learning and forecasting. IEEE Trans. Intell. Transp. Syst. 21, 4883–4894 (2019). https://doi.org/10.1109/TITS.2019.2950416
15. Huanxia, W.: Unsupervised dynamic hypergraph learning Laplacian matrix feature selection. Comput. Eng. Des. 43, 2078–2087 (2022). https://doi.org/10.16208/j.issn1000-7024.2022.07.035
A Weighted Degree Maximum-Based Base Station Frequency Allocation Algorithm Xi Zhong, Min Liang, and Yaping Ji
Abstract In wireless cellular communication networks with multiple base stations, limited frequency resources often conflict with the requirement for high real-time responsiveness, which demands that as many base stations as possible communicate simultaneously without interference. Therefore, improving the work parallelism of base stations and reducing the time consumed by communication tasks has become a research challenge. To address this issue, this paper first models the real-world scenario and applies the genetic algorithm, simulated annealing algorithm, greedy algorithm, and maxcut algorithm to find the optimal allocation scheme for the frequencies of base stations. Then, based on the definition of degree in graph theory, a frequency allocation algorithm based on maximum weighted degree is proposed. According to simulation results for various base station distribution scenarios, the proposed algorithm achieves higher base station parallelism and completes a global communication task in the shortest possible time.
Keywords Frequency allocation · Graph theory · Genetic algorithm · Simulated annealing algorithm · Greedy algorithm · Maxcut · Parallelism
X. Zhong (B) · M. Liang · Y. Ji, Hanshow Technology Co., Ltd., Beijing 100012, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_23
1 Introduction
With the continuous development of digitalization in various fields, wireless communication networks, as an important technological component of digital scenarios, continuously face new challenges. Wireless cellular networks are favored especially in scenarios such as large supermarkets, warehouses, factories, and logistics centers for their full-scene coverage. However, in wireless cellular networks, frequency resources are always limited, and wireless interference always exists because the communication distance is generally 10 m to 30 m and multiple wireless systems may coexist in the same crowded space. How to dynamically and scientifically allocate frequency resources to multiple base stations, avoid interference inside and outside the system, and improve system throughput has always been one of the hottest research topics.
The purpose of base station frequency allocation [1] is to assign the limited available frequencies to different base stations, allowing them to work without interference and thus improving communication quality. The allocation algorithm plays a crucial role in determining the capacity and efficiency of the communication network. The frequency allocation problem is NP-complete, given the increasing demand for communication and the limited number of frequencies [2]. Traditional frequency allocation algorithms are mainly based on heuristic algorithms, including the genetic algorithm [3], simulated annealing algorithm [4], particle swarm optimization algorithm [5], greedy algorithm [6], and others. In [7], a meta-heuristic genetic algorithm is applied to construct the frequency allocation model. After applying the genetic operators, reproduction, and selection, the model obtains the optimal solution, which represents the final frequency allocation scheme. In [8], the simulated annealing algorithm and a local search algorithm were applied to obtain the optimal frequency allocation scheme. Particle swarm optimization was used in [9] to solve the frequency allocation problem: discrete particle swarm optimization is incorporated into the traditional particle swarm algorithm, with the aim of minimizing the interference cost function and obtaining the optimal frequency allocation scheme. A multiple greedy-based frequency allocation algorithm was proposed in [10], which used three greedy methods to obtain the optimal frequency allocation strategy under a time-permitting condition. In addition to heuristic algorithms and the optimization algorithms derived from them, there are also frequency allocation methods based on graph theory.
In [11], the interference relationship between base stations was abstracted into an undirected graph, and the frequency allocation was solved using vertex coloring theory from graph theory. In [12], the frequency allocation problem is solved based on graph theory to avoid severe interference, enhance the overall capacity of the system, and improve network coverage.
2 System Model and Problem Description
In wireless cellular communication networks, the degree of interference between base stations varies with the distance and obstructions between them. In practical engineering, the extent to which one base station is interfered with by another can be evaluated by measuring the strength of the signal received from that base station. The measured signal strength is denoted $rssi_{AB}$, and $rssi_{AB}$ gets higher as the distance between base stations A and B gets larger. When $0 \le rssi_{AB} \le R_1$, the frequency offset between the two base stations needs to be $f_1$ MHz to guarantee that they can work simultaneously without interference. When $R_1 < rssi_{AB} \le R_2$, the frequency offset between the two base stations needs to be $f_2$ MHz. When $R_2 < rssi_{AB} \le R_3$, it needs to be $f_3$ MHz. When $rssi_{AB} > R_3$, the two base stations can work at the same frequency. At present, the available frequency range is 2473 MHz to 2483 MHz, and all odd frequency points in this range can be allocated to base stations. The relationship between the frequency $F$ (in MHz) and the frequency point $f$ is $F = 2400 + f \times 0.5$; therefore, 10 odd frequency points are available for allocation. Figure 1 shows three distribution scenarios of base stations: grid distribution, cellular distribution, and random distribution. This paper aims to find an allocation scheme under which base stations work with higher parallelism without signal interference, so that a large-scale data communication task is completed in the shortest possible time.
Fig. 1 Distribution scenarios of base stations
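The two rules in this section, the mapping between frequency points and frequencies and the offset required for a given measured signal strength, can be sketched as follows. The function names are hypothetical, and the thresholds $R_1$, $R_2$, $R_3$ and offsets $f_1$, $f_2$, $f_3$ are passed in as parameters because their values are not given here.

```python
import math

def frequency_points(f_low_mhz=2473.0, f_high_mhz=2483.0):
    """Odd frequency points f with F = 2400 + 0.5 * f inside the band."""
    lowest = math.ceil((f_low_mhz - 2400.0) / 0.5)
    highest = math.floor((f_high_mhz - 2400.0) / 0.5)
    return [f for f in range(lowest, highest + 1) if f % 2 == 1]

def required_offset(rssi, thresholds, offsets):
    """Frequency offset in MHz needed between two base stations.

    thresholds = (R1, R2, R3) and offsets = (f1, f2, f3); assumes rssi >= 0.
    """
    for limit, offset in zip(thresholds, offsets):
        if rssi <= limit:
            return offset
    return 0.0                    # rssi > R3: the same frequency may be reused

points = frequency_points()
```

With the default band, the sketch yields the ten odd frequency points 147, 149, ..., 165, which agrees with the count stated above.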
3 Traditional Algorithms for Solving Frequency Allocation Problems
3.1 Frequency Allocation Based on Genetic Algorithm
The genetic algorithm obtains a population that meets the required conditions through mutation, crossover, and selection. At the end of the evolution, the best individual is chosen as the frequency allocation solution. First, an initial population containing pop_num individuals is constructed. The initialization of each individual's gene sequence is shown in Fig. 2, and the number of genes contained in each gene segment equals the number of available frequencies. As shown in Fig. 3, mutation occurs on a random single gene, and crossover occurs between gene segments. The probabilities of mutation and crossover are determined by the roulette wheel method [10]. During evolution, only individuals that satisfy interference-free operation of the base stations are selected. The number of unallocated base stations is defined as the loss value; the loss value gradually decreases during evolution, which indicates that the number of allocated base stations gradually increases. Finally, the solution with the lowest loss value is selected from the optimal individuals of all generations as the final solution. Because this algorithm iterates on a probabilistic mechanism, the evolution process is random and prone to falling into local optima. As a result, the global optimum may not be reached in frequency allocation.
Fig. 2 Initial population gene sequence
Fig. 3 Mutation and crossover
3.2 Frequency Allocation Based on Simulated Annealing Algorithm
The simulated annealing algorithm searches for the optimal solution over the entire solution space. It updates the current solution while progressively lowering a temperature so that the objective function converges. A solution vector $X = [x_1, x_2, \ldots, x_k]$ is initialized, where $k$ is the number of base stations that need to be allocated a frequency; $X$ also represents a frequency allocation scheme. The initial annealing temperature is defined as $t_{initial}$, the final annealing temperature as $t_{final}$, and the cooling parameter as $\alpha$. The temperature after $k$ rounds of cooling satisfies the formula $T(k) = \alpha \times T(k-1)$, where $T(k-1)$ is the temperature after $k-1$ rounds of cooling. In each cooling round, a Markov chain perturbation is applied to the solution vector to obtain a new solution $X_{new}$. The new solution must satisfy the interference-free working conditions of the base stations; otherwise it is discarded. The energy value of the new solution is calculated in each cooling round to track whether the annealing system has settled. High-quality solutions with lower energy are unconditionally preserved, while poor-quality solutions are conditionally preserved. This process continues until the temperature falls below the final temperature $t_{final}$; the optimal solution vector is then obtained and decoded as the final frequency allocation scheme. Since the speed of the simulated annealing algorithm depends on the cooling rate and the length of the Markov chain, the search for the optimal solution often takes a long time, especially when the number of base stations is large.
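The annealing loop described above can be sketched generically. The sketch assumes the usual Metropolis rule, accepting a worse solution with probability $\exp(-\Delta E / T)$, for the "conditionally preserved" case, which the text does not spell out; `energy`, `neighbor` and `feasible` are hypothetical callables supplied by the caller, and the default parameter values are placeholders.

```python
import math
import random

def anneal(initial, energy, neighbor, feasible,
           t_initial=1.0, t_final=1e-3, alpha=0.9, chain_length=20, seed=0):
    """Generic simulated annealing with geometric cooling T(k) = alpha * T(k-1)."""
    rng = random.Random(seed)
    current = best = initial
    t = t_initial
    while t > t_final:
        for _ in range(chain_length):         # Markov chain at a fixed temperature
            candidate = neighbor(current, rng)
            if not feasible(candidate):
                continue                       # interference-violating solutions are discarded
            delta = energy(candidate) - energy(current)
            # Better solutions are always kept; worse ones with probability exp(-delta / t)
            # (Metropolis rule, an assumption of this sketch).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                current = candidate
                if energy(current) < energy(best):
                    best = current
        t *= alpha                             # cooling step
    return best
```

For frequency allocation, the state would be the vector $X$, `feasible` would check the offset constraints between interfering base stations, and `energy` could be the number of unallocated base stations; those choices are left to the caller here.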
A Weighted Degree Maximum-Based Base Station …
3.3 Frequency Allocation Based on Greedy Algorithm
The greedy algorithm selects the locally optimal option at each step in order to approach overall optimization. In the frequency allocation of base stations, the goal is therefore to minimize the total number of assigned frequencies in each round of allocation. For example, if the set of frequencies assigned to previously allocated base stations is set_used, then frequencies in set_used are given priority when assigning a frequency to an unallocated base station. If no suitable frequency is found in set_used, a frequency outside set_used is sought. However, this algorithm does not consider the characteristics of each base station; it treats all frequency allocation tasks as equal. In reality, the interference environment and the location of each base station are different. Therefore, using only the greedy algorithm for frequency allocation cannot obtain the optimal strategy.
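The preference for already used frequencies can be sketched as below. This is an illustrative sketch only: the interference model (two interfering stations may not share a frequency) and the data layout (`neighbors` as a dictionary of sets) are assumptions, and `greedy_allocate` is a hypothetical name.

```python
def greedy_allocate(neighbors, frequencies):
    """Assign frequencies greedily, reusing frequencies from set_used first.

    neighbors:   dict mapping each station to the set of stations it interferes with
    frequencies: list of available frequency points
    Returns a dict station -> frequency; stations with no feasible frequency are omitted.
    """
    assignment = {}
    set_used = []  # frequencies already handed out, in order of first use
    for station in neighbors:
        # Frequencies taken by interfering stations are not allowed (assumed model).
        blocked = {assignment[n] for n in neighbors[station] if n in assignment}
        reusable = [f for f in set_used if f not in blocked]
        fresh = [f for f in frequencies if f not in set_used and f not in blocked]
        candidates = reusable or fresh  # look outside set_used only if necessary
        if candidates:
            chosen = candidates[0]
            assignment[station] = chosen
            if chosen not in set_used:
                set_used.append(chosen)
    return assignment
```

On a chain of three stations where only neighbours interfere, the third station reuses the frequency of the first, so only two frequencies are consumed.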
3.4 Frequency Allocation Based on Maxcut Algorithm
It is impossible to allocate frequencies to all base stations without interference in a single round. The frequency allocation algorithm based on maximum cut cuts the graph, dividing the original graph G(E, V) into a set of subgraphs {G1, G2, ..., GN} such that the base stations within each subgraph can work without interference. The Maxcut algorithm decomposes a complex frequency allocation task into several simple and achievable tasks, but it relies on the assumption that the original graph is connected. In practical base station communication scenarios, however, the interference relationship graph may not be fully connected, which limits the maximum cut of the graph. The tasks may become too fragmented, which in turn reduces communication efficiency.
4 WDMaximum-Based Algorithm
With the goal of achieving higher work parallelism of the base stations and spending less time on a communication task, this paper proposes a WDMaximum-Based (Weighted Degree Maximum-Based) frequency allocation algorithm. The number of base stations is N, and the interference relationship between base stations is abstracted into an undirected graph G = (V, E), where V is the set of base stations and E is the set of edges representing interference between base stations. The frequency offset between base stations is abstracted as the edge weight, which is related to the strength of the measured signal. The weight matrix W is given in formula (1), where w(e_ij) is defined in formula (2) and rssi_ij is the strength of the measured signal between the base stations represented by nodes v_i and v_j.
Fig. 4 Undirected graph of base station interference relationship
$$W = \bigl(w(e_{ij})\bigr), \quad 0 \le i < N,\ 0 \le j < N \tag{1}$$

$$w(e_{ij}) = \begin{cases} 1, & (i \ne j,\ R_2 < rssi_{ij} \le R_3) \\ 3, & (i \ne j,\ R_1 < rssi_{ij} \le R_2) \\ 5, & (i \ne j,\ R_0 < rssi_{ij} \le R_1) \\ 0, & (i = j,\ rssi_{ij} > R_3) \end{cases} \tag{2}$$
The undirected graph structure of the base station interference relationship is shown in Fig. 4. The adjacency matrix of the undirected graph is defined as M = (m_ij), 0 ≤ i < N, 0 ≤ j < N, where m_ij indicates whether base station i and base station j interfere: m_ij is 1 if there is interference and 0 otherwise. We define deg_i as the weighted degree of node i, based on the interference relationship between base stations:
$$deg_i = \sum_{j=0}^{N-1} w(e_{ij}) \times m_{ij} \tag{3}$$
The weighted degree quantifies the interference experienced by a base station within the entire network. In frequency allocation tasks, it manifests as the "difficulty" of assigning a frequency to that base station. Assigning frequencies to base stations in decreasing order of weighted degree is an effective way to balance the difficulty of allocation across rounds. Priority is given to base stations that are difficult to allocate, so that base stations of both high and low weighted degree can be allocated frequencies in a single round. Taking the grid distribution environment as an example, the results of each allocation round are shown in Fig. 5. Base stations near the center have a higher weighted degree, while those near the edge have a lower weighted degree. Within one allocation round, the proportion of these two types is relatively balanced. The histogram of base station weighted degree in each round of allocation is shown in Fig. 6. If base stations are instead allocated in ascending order of weighted degree, base stations that are easier to allocate are given priority in each round. At the beginning of the allocation process, more base stations with lower weighted degree are allocated, as shown in Fig. 7. Base stations with lower weighted degree are completely allocated after the first two rounds, and the remaining rounds allocate frequencies only to base stations with higher weighted degree. These base stations become increasingly difficult to allocate in subsequent rounds, which leads to more rounds of frequency allocation.
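The weighted degree of formula (3) and the resulting allocation order can be sketched as follows. This is a sketch under stated assumptions: the thresholds R0 to R3 are passed in as hypothetical numeric arguments, signal strengths outside the ranges of formula (2) are mapped to weight 0, and interference (m_ij = 1) is assumed to exist exactly where the weight is non-zero.

```python
def edge_weight(rssi, r0, r1, r2, r3):
    """Edge weight following formula (2); any other value maps to 0 (assumption)."""
    if r2 < rssi <= r3:
        return 1
    if r1 < rssi <= r2:
        return 3
    if r0 < rssi <= r1:
        return 5
    return 0


def weighted_degrees(rssi, thresholds):
    """Weighted degree deg_i = sum_j w(e_ij) * m_ij for a square rssi matrix."""
    n = len(rssi)
    degrees = []
    for i in range(n):
        total = 0
        for j in range(n):
            if i == j:
                continue  # no self-interference
            w = edge_weight(rssi[i][j], *thresholds)
            m = 1 if w > 0 else 0  # assumed adjacency: interference iff weight > 0
            total += w * m
        degrees.append(total)
    return degrees


def allocation_order(degrees):
    """Station indices in decreasing order of weighted degree (ties keep index order)."""
    return sorted(range(len(degrees)), key=lambda i: -degrees[i])
```

Sorting with the negated degree keeps the sort stable, so stations with equal weighted degree are visited in index order; reversing the key gives the ascending (WDMinimum) order discussed above.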
Fig. 5 Result of WDMaximum-based frequency allocation algorithm
Fig. 6 Histogram of base station weighted degree in each round
Fig. 7 Histogram of base station weighted degree in each round
5 Result of Experiments
There are 10 available odd frequency points (147∼165) that can be assigned to base stations. The communication objects under each base station are fixed, so the total number of communication objects is also fixed. The communication data to be transmitted to these objects are packed and sent in sequence. Each base station that has been assigned a frequency sends data with an upper limit on the number of objects per second. Unallocated base stations wait until frequencies become available. The communication time for a single object is randomly 1 s, 2 s, 3 s, or 5 s. The above algorithms were applied in three base station distribution scenarios. The required allocation rounds, maximum parallelism of base station work, and total communication time of each algorithm are compared in Tables 1, 2, and 3, which show the frequency allocation results for 100 base stations with grid distribution, 99 base stations with cellular distribution, and 40 base stations with random distribution, respectively. These tables indicate that the proposed frequency allocation algorithm requires the fewest allocation rounds and achieves higher work parallelism than the traditional algorithms. It also achieves interference-free communication and completes the task in the shortest total time among the compared algorithms. The number of base stations assigned in each round is shown in Fig. 8, which indicates that, compared with the other algorithms, this algorithm assigns frequencies to most of the base stations in the early rounds.
Table 1 Comparison of algorithms in grid distribution
Algorithm           Maximum parallelism   Total rounds   Times (s)
GA                  33                    4              135
SA                  36                    4              129
Greedy algorithm    42                    4              125
MaxCut              35                    4              135
WDMaximum-based     45                    3              120
WDMinimum-based     46                    4              124

Table 2 Comparison of algorithms in cellular distribution
Algorithm           Maximum parallelism   Total rounds   Times (s)
GA                  26                    5              158
SA                  31                    5              157
Greedy algorithm    32                    5              135
MaxCut              30                    5              165
WDMaximum-based     35                    4              130
WDMinimum-based     37                    6              135

Table 3 Comparison of algorithms in random distribution
Algorithm           Maximum parallelism   Total rounds   Times (s)
GA                  22                    3              80
SA                  23                    3              75
Greedy algorithm    25                    3              73
MaxCut              19                    3              94
WDMaximum-based     26                    2              64
WDMinimum-based     26                    3              74
Fig. 8 Result of WDMaximum-based frequency allocation algorithm
6 Conclusion
This paper investigated various mainstream frequency allocation algorithms, which have different characteristics and objective functions under different backgrounds and scenarios. First, the popular algorithms were applied to the three scenarios considered in this paper and simulated according to the task. Then, based on the interference relationship between base stations, a weighted degree maximum-based base station frequency allocation algorithm was proposed. Finally, the algorithms were compared in three scenarios: cellular distribution, grid distribution, and random distribution. The proposed algorithm completed all communication tasks of the base stations in the least time, with higher work parallelism of the base stations and the fewest rounds of frequency allocation.
References 1. Luo, J.: Research on the performance of frequency allocation algorithm based on fractional frequency reuse. In: IEEE 5th International Conference on Information Systems and Computer Aided Education (ICISCAE), pp. 518–522. Dalian, China (2022). https://doi.org/10.1109/ ICISCAE55891.2022.9927539 2. Uykan, Z.: Spectral based solutions for (near) optimum channel/frequency allocation. In: 18th International Conference on Systems, Signals and Image Processing, pp. 1–4 (2011)
3. Ohatkar, S.N., Bormane, D.S.: Channel allocation technique with genetic algorithm for interference reduction in cellular network. In: Annual IEEE India Conference (INDICON), pp. 1–6. Mumbai, India (2013). https://doi.org/10.1109/INDCON.2013.6726084 4. Novillo, F., Valdivieso, C., Velasquez, F.: Centralized channel assignment algorithm for OSAenabled WLANs based on simulated annealing. In: 7th IEEE Latin-American Conference on Communications (LATINCOM), pp. 1–6. Arequipa, Peru (2015). https://doi.org/10.1109/ LATINCOM.2015.7430136 5. Zhang, X., Zhang, X. and Wu, Z.: In utility- and fairness-based spectrum allocation of cellular networks by an adaptive particle swarm optimization algorithm. IEEE Trans. Emerg. Top. Comput. Intell. 4(1), 42–50 (2020). https://doi.org/10.1109/TETCI.2018.2881490 6. Li, R., Zhu, P., Jin, L.: Channel allocation scheme based on greedy algorithm in cognitive vehicular networks. In: IEEE 3rd Information Technology, Networking, Electronic and Automation Control Conference (ITNEC), pp. 803–807. Chengdu, China (2019). https://doi.org/10.1109/ ITNEC.2019.8729351 7. Singh, S.K., Kaushik, A. and Vidyarthi, D.P.: A cognitive channel allocation model in cellular network using genetic algorithm. Wireless Pers. Commun. 96, 6085–6110 (2017). https://doi. org/10.1007/s11277-017-4465-z 8. Lu, L., Fan, R.: Simulated annealing algorithm in solving frequency assignment problem. In: International Conference on Advanced Computer Theory and Engineering, V1-361–V1-364 (2010) 9. Benameur, L., Alami, J., Imrani, A.E.: A hybrid discrete particle swarm algorithm for solving the fixed-spectrum frequency assignment problem. Int. J. Comput. Sci. Eng. 5(1), 68–73 (2010). https://doi.org/10.1504/IJCSE.2010.030231 10. Cao, J.: Analysis and implementation of mobile-base-station frequency allocation based on multiple-greedy algorithm. J. Nantong Vocat. Univ. (2014) 11. Yang., L., Yu, Y.H.: A mobile frequency allocation algorithm based on the graph theory. 
In: International Conference on Computer Information Systems and Industrial Applications (2015) 12. Hao, J., Zhang, H., Song, L. and Han, Z.: Graph-based resource allocation for device-todevice communications aided cellular network. In: IEEE/CIC International Conference on Communications in China (ICCC), pp. 256–260. Shanghai, China (2014). https://doi.org/10. 1109/ICCChina.2014.7008282 13. Kumar, R., Jyotishree, Blending roulette wheel selection & rank selection in genetic algorithms. Int. J. Mach. Learn. Comput. 2(4), 365–370 (2012)
Adaptive Time-Varying Parameter Estimation of Nonlinearly Parameterized Systems Fujin Luan, Xinkai Chen, Jing Na, Yashan Xing, and Guanbin Gao
Abstract Although adaptive estimation of constant parameters has been studied for decades, most existing methods cannot achieve satisfactory performance for time-varying parameters, in particular for nonlinearly parameterized systems. In this paper, a novel adaptive parameter estimation framework based on parameter estimation errors is proposed to estimate time-varying parameters for general nonlinearly parameterized systems. The main idea is to restructure the system into a linearly parameterized form through Taylor expansion. Following the introduction of auxiliary filtered variables, the estimation errors of the unknown parameters are derived and used to design an adaptive law that achieves uniform ultimate boundedness under the persistent excitation condition. Furthermore, it is verified that the suggested methods are robust against bounded disturbances. Numerical simulation results validate the effectiveness of the adaptive time-varying parameter estimation methods.
Keywords Adaptive estimation · Time-varying parameters · Nonlinear parameterization · System modeling
F. Luan · J. Na (B) · Y. Xing · G. Gao
Kunming University of Science and Technology, Kunming 650500, China
e-mail: [email protected]
Yunnan International Joint Laboratory of Intelligent Control and Application of Advanced Equipment, Kunming 650500, China
X. Chen
Shibaura Institute of Technology, Saitama 337-8570, Japan
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_24

1 Introduction
The application of control theory is closely tied to the development of system identification and state estimation; together they constitute three crucial components of modern control theory. In control engineering, adaptive control [1, 2] and adaptive estimation have made essential contributions to the area. Conventional adaptive control algorithms achieve online parameter estimation to address the uncertainty of unknown parameters through adaptive laws. However, instead of ensuring that the parameter estimates converge to the true values, they primarily concentrate on tracking a given reference trajectory [2]. It is widely accepted that the convergence of adaptive parameter estimation (APE) is advantageous for the stability and robustness of the closed-loop system. Least squares [1, 2], gradient descent, projection algorithms [3], and neural networks [4–6] are among the well-known successes in constructing APE algorithms with guaranteed convergence. The adaptive laws in the new APE structure, which are updated by the parameter estimation error, have been applied in various practical applications [7–10]. However, they cannot be employed directly for time-varying parameters in nonlinearly parameterized structures. Constructive methods for designing APE frameworks in nonlinearly parameterized systems have been identified in [11, 12]. There, the convexity and concavity with respect to the parameters were first explored, and a hierarchical min-max method was then suggested to achieve estimation convergence. For the Monod growth kinetics [13], an APE algorithm based on the state prediction error was also developed, where the necessary persistent excitation (PE) requirement was additionally considered. To ensure the convergence of the estimated parameters to their true values, an online updating method for uncertainty sets [3] was developed. However, APE for time-varying parameters in nonlinearly parameterized systems remains challenging [3]. To deal with the time-varying behavior of the estimated parameters, several nonrecursive algorithms have been implemented to address slowly time-varying parameters [14].
The mean-square estimation error can be reduced by using a local regression of a time-related polynomial in recursive least-squares algorithms with a forgetting factor [15]. However, these estimation algorithms are suitable only for slowly time-varying parameters. In [16], time-varying parameters are approximated by polynomials with unknown coefficients by dividing time into small intervals. However, this approach is applicable only to linearly parameterized systems. In this paper, we study APE for systems with nonlinearly parameterized time-varying parameters. The unknown parameters are first separated from the dynamics by using the Taylor expansion, which transforms the system into a linearly parameterized one. To achieve accurate parameter estimation, intermediate variables containing the estimation errors of the unknown parameters are defined and used to design a new adaptive law that guarantees convergence under the PE condition. Moreover, the proposed method is independent of any observer or predictor design. Two simulations are provided to illustrate the satisfactory convergence and acceptable robustness of the adaptive estimation against bounded disturbances. The structure of this paper is as follows. Section 2 presents the problem formulation. The adaptive estimation for time-varying parameters is shown in Sect. 3. Two simulations are illustrated in Sect. 4, and Sect. 5 concludes.
Adaptive Time-Varying Parameter Estimation …
2 Problem Formulation
Consider the nonlinearly parameterized system

$$\dot{x} = f(x, u, \theta(t)) + d \tag{1}$$
where $x \in \mathbb{R}^n$ is the system state, $d \in \mathbb{R}^n$ is uncorrelated stochastic noise with zero mean, $u \in \mathbb{R}^n$ is the control input, and $\theta(t) \in \mathbb{R}^r$ is the vector of unknown time-varying parameters. The problem to be solved is to estimate the unknown nonlinear parameters $\theta(t)$ from the measurements of $x$ and $u$. The following assumptions are required to facilitate the APE design.
Assumption 1 The system state $x$ and the control input $u$ are bounded and measurable. Moreover, the unknown parameters $\theta$, their derivative $\dot{\theta}$, and the disturbance $d$ are bounded, e.g., $\|\theta\| \le a$, $\|\dot{\theta}\| \le b$, $\|d\| \le \varsigma$.
Assumption 2 The function $f(x, u, \theta)$ is continuously differentiable and its Jacobian matrix satisfies $\|\nabla f(w)\| \le c$ for all $w \in \mathbb{R}^r$ [17]. Besides, the unknown parameters $\theta$ belong to a bounded convex set $D \subseteq \mathbb{R}^n$ [18].
Remark 1 Assumption 1 is widely used in APE design [19] and can be satisfied, if necessary, by applying an adequate input $u = \tau(\cdot)$ to system (1). Assumption 2 implies that $f$ is Lipschitz continuous with Lipschitz constant $c > 0$. Moreover, the nonlinearly parameterized system (1) must be reformulated into an equivalent linear form, with global convergence of the APE ensured in the convex set, as in [3]; this form is used in the subsequent APE design.
Note that the nonlinear function $f(x, u, \theta)$ contains the unknown system parameter $\theta$, which makes the system nonlinearly parameterized. We therefore first reformulate system (1) into a linearly parameterized structure

$$\dot{x} = f(x, u, \hat{\theta}) + \Phi(x, u, \hat{\theta})\theta - \Phi(x, u, \hat{\theta})\hat{\theta} + \nu + d \tag{2}$$
where $\hat{\theta}$ is the estimate of the unknown $\theta$; $\Phi(x, u, \hat{\theta}) \in \mathbb{R}^{n \times r}$ is the Jacobian matrix of $f(\cdot)$; and $\nu$ is the bounded remainder of the Taylor series expansion [20].
Definition 1 [1] A vector or matrix function $\Phi$ is persistently excited if there exist $\tau > 0$, $\varepsilon > 0$ such that $\int_t^{t+\tau} \Phi^T(\iota)\Phi(\iota)\,d\iota \ge \varepsilon I$ for all $t > 0$.
Since only the variables $x$ and $u$ are accessible, a filtering operation is adopted to handle the unmeasurable $\dot{x}$. To simplify the notation in (2), the variable $\varphi = \Phi(x, u, \hat{\theta})\hat{\theta}$ is defined. The filtered variables $x_f, f_f, \Phi_f, \varphi_f$ are then obtained from

$$\begin{cases} k\dot{x}_f + x_f = x, & x_f(0) = 0, \\ k\dot{f}_f + f_f = f, & f_f(0) = 0, \\ k\dot{\Phi}_f + \Phi_f = \Phi, & \Phi_f(0) = 0, \\ k\dot{\varphi}_f + \varphi_f = \varphi, & \varphi_f(0) = 0, \end{cases}$$

where $k > 0$ is the constant of the low-pass filter $\frac{1}{ks+1}$. Since this filter is a proper stable rational transfer function [21], the swapping lemma [22] can be applied to Eq. (2), such that

$$\frac{s}{ks+1}[x] = \frac{1}{ks+1}[f] + \frac{1}{ks+1}[\Phi]\theta - \frac{1}{ks+1}[\varphi] - \frac{k}{ks+1}\left[\frac{1}{ks+1}[\Phi]\dot{\theta}\right] + \frac{1}{ks+1}[\nu + d].$$

Using the filter operation, $\dot{x}$ can be avoided by

$$\dot{x}_f = \frac{x - x_f}{k} = f_f + \Phi_f\theta - \varphi_f - \frac{k}{ks+1}[\Phi_f\dot{\theta}] + \upsilon_f \tag{3}$$

where $\upsilon_f$ is the filtered version of $\upsilon = \nu + d$; by the linear filtering operation, $\Phi_f$ and $\upsilon_f$ are bounded, i.e., $\|\Phi_f\| \le \eta_0$, $\|\upsilon_f\| \le \varsigma$.
For simplicity, define the variables $y, \hbar$ as $y = \frac{x - x_f}{k} - f_f + \varphi_f$ and $\hbar = \frac{k}{ks+1}[\Phi_f\dot{\theta}]$; the filtered system (3) can then be represented as

$$y + \hbar - \upsilon_f = \Phi_f\theta \tag{4}$$

where $\hbar$ is bounded because $\Phi_f$ and $\dot{\theta}$ are bounded.
As shown in (4), using the accessible variables $x, u, \hat{\theta}$, the time-varying parameters $\theta$ can be separated from the nonlinearly parameterized system (1). Although the gradient descent and stochastic gradient descent algorithms [17] have achieved good results for constant parameters, they may fail for time-varying parameters. To maintain these benefits for time-varying parameters, a new APE driven by extracted parameter estimation errors is needed, which further develops the idea provided in [16, 23].
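The role of the first-order filter can be illustrated numerically. The sketch below discretizes the filter k ż_f + z_f = z with a forward-Euler step (the discretization, step size, and function names are assumptions made for illustration) and shows that (z − z_f)/k, which equals ż_f, approaches the derivative of a ramp signal without differentiating the signal itself.

```python
def filter_step(z_f, z, k, dt):
    """One forward-Euler step of the low-pass filter k * dz_f/dt + z_f = z."""
    return z_f + dt * (z - z_f) / k


def filtered_derivative(z_f, z, k):
    """(z - z_f) / k equals dz_f/dt, a filtered surrogate for dz/dt."""
    return (z - z_f) / k


def run_ramp(k=0.1, dt=1e-3, t_end=2.0, slope=3.0):
    """Filter the ramp z(t) = slope * t and return the final derivative surrogate."""
    z_f = 0.0  # zero initial condition, as in the filter definitions
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        z_f = filter_step(z_f, slope * t, k, dt)
        t += dt
    return filtered_derivative(z_f, slope * t, k)
```

After a transient of a few filter time constants, the value returned by `run_ramp` is close to the ramp slope, which is why the term (x − x_f)/k can stand in for the unmeasurable state derivative in (3).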
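3 Adaptive Estimation for Time-Varying Parameters
Proposition 1 Consider the nonlinearly parameterized system (1). With constant gains $\Gamma, \alpha > 0$, the adaptive law for $\hat{\theta}$ is given by

$$\dot{\hat{\theta}} = \Gamma(E_1 - \alpha E_2) \tag{5}$$

where the estimation errors $E_1, E_2 \in \mathbb{R}^r$ are driven by

$$\begin{aligned} \dot{E}_1 &= -E_1 + \kappa\Phi_f^T(y + \hat{\hbar} - \Phi_f\hat{\theta}) - P\dot{\hat{\theta}}, \\ E_2 &= \kappa\Phi_f^T(\Phi_f\hat{\theta} - y - \hat{\hbar}), \end{aligned} \tag{6}$$

with the computational variables $P, \hat{\hbar}$

$$\begin{aligned} \dot{P} &= -P + \kappa\Phi_f^T\Phi_f, & P(0) &= 0, \\ \dot{\hat{\hbar}} &= -\frac{1}{k}\hat{\hbar} + \Phi_f\dot{\hat{\theta}}, & \hat{\hbar}(0) &= 0. \end{aligned} \tag{7}$$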
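The estimation error $\tilde{\theta} = \theta - \hat{\theta}$ is then uniformly ultimately bounded under the PE condition.
Proof A nominal vector $\psi \in \mathbb{R}^r$ can be defined as

$$\dot{\psi} = -\psi + \kappa\Phi_f^T\upsilon_f - P\dot{\theta}, \quad \psi(0) = 0. \tag{8}$$

Owing to $\dot{\tilde{\theta}} = \dot{\theta} - \dot{\hat{\theta}}$, we use Eq. (4) to obtain another form of $\dot{E}_1$:

$$\frac{d}{dt}(E_1 - P\tilde{\theta}) = -(E_1 - P\tilde{\theta}) - \kappa\Phi_f^T\tilde{\hbar} + \kappa\Phi_f^T\upsilon_f - P\dot{\theta} \tag{9}$$

where $\tilde{\hbar} = \hbar - \hat{\hbar} = \frac{k}{ks+1}[\Phi_f\dot{\theta}] - \frac{k}{ks+1}[\Phi_f\dot{\hat{\theta}}] = \frac{k}{ks+1}[\Phi_f\dot{\tilde{\theta}}]$. Then, substituting (8) into (9), the variables $E_1, E_2$ fulfill

$$E_1 = P\tilde{\theta} + \psi + R, \qquad E_2 = -\kappa\Phi_f^T(\Phi_f\tilde{\theta} - \tilde{\hbar} + \upsilon_f) \tag{10}$$

where the nominal vector $R \in \mathbb{R}^r$ satisfies

$$\dot{R} = -R - \kappa\Phi_f^T\tilde{\hbar}, \qquad \dot{\tilde{\hbar}} = -\frac{1}{k}\tilde{\hbar} + \Phi_f\dot{\tilde{\theta}}.$$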
It is evident that the matrix $P$ defined in (7) is positive definite ($\lambda_{\min}(P(t)) \ge \sigma > 0$ for all $t > 0$) if the matrix $\Phi$ is persistently excited [24] in the sense of Definition 1. Based on the boundedness of $\Phi_f, \upsilon_f, P, \dot{\theta}$, the vector $\psi = \int_0^t e^{-(t-\iota)}(\kappa\Phi_f^T\upsilon_f - P\dot{\theta})\,d\iota$ is bounded by $\|\psi\| \le \delta$, and

$$\tilde{\hbar} = \int_0^t e^{-\frac{1}{k}(t-\iota)}\Phi_f\dot{\tilde{\theta}}\,d\iota = \Phi_f\tilde{\theta} - e^{-\frac{t}{k}}\int_0^t \bigl(e^{\frac{\iota}{k}}\Phi_f\bigr)'\tilde{\theta}\,d\iota, \qquad R = -\kappa\int_0^t e^{-(t-\iota)}\Phi_f^T\tilde{\hbar}\,d\iota$$

hold with integration by parts. Provided that the derivative of $\Phi_f$ is bounded, i.e., $\|\dot{\Phi}_f\| \le \eta_1$, the norm of the second term of $\tilde{\hbar}$ satisfies $\bigl\|e^{-\frac{t}{k}}\int_0^t (e^{\frac{\iota}{k}}\Phi_f)'\tilde{\theta}\,d\iota\bigr\| \le (\eta_0 + \frac{\eta_1}{k})\|\tilde{\theta}\|$. Hence $E_1, E_2$ contain the error information of the parameter estimation. Note that the effect of $d(t)$ on $E_1$ is attenuated by the low-pass filtering in (6), so $E_1$ can be treated as an averaged term that improves the robustness against $d(t)$; however, this reduces the tracking capability for fast time-varying parameters. $E_2$ is therefore necessary to enhance the instantaneous estimation performance.
Most valid parameter estimation methods have regarded $\dot{\theta}$ as a new unknown parameter. To extract the new parameter $\dot{\theta}$ in the same way as $\theta$ in (4), dimension expansion operations, such as a second-order Taylor expansion [25], have to be applied to system (1). Unfortunately, dimension expansion suffers from methodological limitations. First, the additional high-order system state introduced by the expansion is hard to acquire because of the accompanying high-intensity noise. More importantly, the added computation cannot avoid the same problem: the new variable $\dot{\theta}$ introduced in the derivative of the designed Lyapunov function will inevitably encounter the high-order term $\ddot{\theta}$.
These limitations lead us to recall the purpose of parameter estimation: the adaptive law $\dot{\hat{\theta}}$ converges to the derivative $\dot{\theta}$ as the estimate $\hat{\theta}$ converges. Motivated by this property, we use the adaptive law $\dot{\hat{\theta}}$ to address the unknown $\dot{\theta}$ of $\hbar$ in (4) through the form $\hat{\hbar} = \frac{k}{ks+1}[\Phi_f\dot{\hat{\theta}}]$, and the differential form of $\hat{\hbar}$ in (7) accomplishes this filtering operation. Consequently, the estimator of $\dot{\theta}$, which equals the adaptive law, is independent of any observer design, and the above restrictions are removed.
For system (1) with the adaptive law (5), a Lyapunov function is chosen as $V = \frac{1}{2}\tilde{\theta}^T\Gamma^{-1}\tilde{\theta}$, and its derivative $\dot{V}$ is obtained as

$$\begin{aligned} \dot{V} &= \tilde{\theta}^T\Gamma^{-1}\dot{\theta} + \tilde{\theta}^T\bigl[-P\tilde{\theta} - \psi - R - \alpha\kappa\Phi_f^T(\Phi_f\tilde{\theta} - \tilde{\hbar} + \upsilon_f)\bigr] \\ &= \tilde{\theta}^T\Gamma^{-1}\dot{\theta} + \tilde{\theta}^T\Bigl[-P\tilde{\theta} - \psi + \kappa\int_0^t e^{-(t-\iota)}\Phi_f^T\Phi_f\tilde{\theta}\,d\iota - \kappa\int_0^t e^{-(t-\iota)}\Phi_f^T e^{-\frac{\iota}{k}}\Bigl(\int_0^{\iota}\bigl(e^{\frac{\xi}{k}}\Phi_f\bigr)'\tilde{\theta}\,d\xi\Bigr)d\iota \\ &\qquad - \alpha\kappa\Phi_f^T\Bigl(e^{-\frac{t}{k}}\int_0^t\bigl(e^{\frac{\iota}{k}}\Phi_f\bigr)'\tilde{\theta}\,d\iota + \upsilon_f\Bigr)\Bigr] \\ &\le -\sigma\|\tilde{\theta}\|^2 + \frac{1}{2m_1}\|\tilde{\theta}\|^2 + \frac{m_1\lambda_{\max}^2(\Gamma^{-1})b^2}{2} + \frac{1}{2m_2}\|\tilde{\theta}\|^2 + \frac{m_2\delta^2}{2} + \kappa\eta_0^2\|\tilde{\theta}\|^2 + \kappa\eta_0\Bigl(\eta_0 + \frac{\eta_1}{k}\Bigr)\|\tilde{\theta}\|^2 \\ &\qquad + \alpha\kappa\eta_0\Bigl(\eta_0 + \frac{\eta_1}{k}\Bigr)\|\tilde{\theta}\|^2 + \frac{1}{2m_3}\|\tilde{\theta}\|^2 + \frac{m_3\alpha^2\kappa^2\eta_0^2\varsigma^2}{2} \\ &\le -\mu V + \rho \end{aligned} \tag{11}$$

where $\mu = \frac{2}{\lambda_{\max}(\Gamma^{-1})}\Bigl[\sigma - \frac{1}{2m_1} - \frac{1}{2m_2} - \Bigl(2\kappa\eta_0^2 + \frac{\kappa\eta_0\eta_1}{k}\Bigr) - \alpha\kappa\eta_0\Bigl(\eta_0 + \frac{\eta_1}{k}\Bigr) - \frac{1}{2m_3}\Bigr]$ is positive and $\rho = \frac{m_1\lambda_{\max}^2(\Gamma^{-1})b^2 + m_2\delta^2 + m_3\alpha^2\kappa^2\eta_0^2\varsigma^2}{2}$ for adequate constants $m_1, m_2, m_3$.
Subsequently, it is evident from the above Lyapunov-based stability analysis that the estimation error $\tilde{\theta}$ is uniformly ultimately bounded. Moreover, the estimation error $\tilde{\theta}$ converges to the compact set $\|\tilde{\theta}\| \le \sqrt{\bigl(2V(0)e^{-\mu t} + 2\rho/\mu\bigr)/\lambda_{\min}(\Gamma^{-1})}$ around zero provided the PE condition holds.
Remark 2 As mentioned above, the parameter estimation error variables $E_1, E_2$ incorporated into the adaptive law (5) can improve the parameter estimation performance. The proposed adaptive learning algorithm is thus driven by the estimation errors. As a result, the observer and predictor used in classical learning algorithms can be avoided. Compared with the previous APE methods [26, 27], a contribution of the proposed APE method is that it is driven by the error information $\tilde{\theta}, \dot{\tilde{\theta}}$. Specifically, the adaptive law (5) is used to address $\dot{\theta}$, which allows fast time-varying parameters to be handled. As detailed in Eq. (7), the adaptive law $\dot{\hat{\theta}}$ needs to be reconstructed through the differential form of $\hat{\hbar}$, and the boundedness of $\theta, \dot{\theta}$ in Assumption 1 cannot be avoided to achieve uniform ultimate convergence.
Adaptive Time-Varying Parameter Estimation …
301
4 Simulations 4.1 Academic Example [11] The following system as considered in [11] can be presented as
u 2 u 2 + 12 exp −5 θ − 2 + x˙ = −2x + θ − 8 4
(12)
where the true value of θ is a triangle wave θ(t) = sawtooth(20πt, 0.5) + 5, and the initial values of the simulation are specified as $\hat{\theta}(0) = 6$, x(0) = 0.1. The control input is u = 1.1 sin(2t) and the external disturbance is set as d = sin(100t). Other parameters are set as = 15, κ = 1, Γ = 1, α = 40. As shown in Fig. 1, the proposed APE algorithm guarantees that the estimated parameter converges to the true value even when it exhibits fast time-varying behavior. The algorithm also works well in the case of discontinuous derivatives. Moreover, satisfactory robustness is retained; the estimation error shows a small overshoot initially because of the quick estimation behavior, but it vanishes afterwards. Consequently, the proposed algorithm can contribute to estimating fast time-varying parameters.
Fig. 1 Parameter estimation θ̂ with the law (5) in the system (12) (panels: without disturbance and with disturbance; parameter estimation and estimation error versus time (s))

4.2 Monod Kinetics [3]
Fig. 2 Parameter estimation θ̂ with the law (5) in the system (13) (panels: without disturbance and with disturbance; parameter estimation and estimation error versus time (s))

To demonstrate the effectiveness of the proposed algorithms in multidimensional systems, we consider the Monod kinetics model [3] as

$$\begin{aligned} \dot{x}_1 &= \frac{\theta_1 x_1 x_2}{\theta_2 + x_2} - Dx_1, \\ \dot{x}_2 &= -\frac{\theta_3 x_1 x_2}{\theta_2 + x_2} + S_0 - Dx_2. \end{aligned} \tag{13}$$
where θ1, θ2 are the unknown time-varying parameters to be estimated, with θ1 = 0.1 sin(2t) + 0.3 and θ2 = 0.1 sin(2t) + 0.5; θ3 = 1 is a known constant in this case, and the constant S0 = 5. The control input is D = 0.05 sin t + 0.1, and the initial state is x(0) = [1, 1]^T. Other parameters are set as = 0.1, κ = 0.01, Γ = 0.1I, α = 500I. The initial value is $\hat{\theta}(0) = [1, 1]^T$, and the noise is d = 0.1 sin(20t). From Fig. 2, it is found that the suggested APE algorithm (5) guarantees that the estimated time-varying parameters converge to their true values. The convergence rate of the parameter θ2 is slower than that of θ1 because of the properties of system (13), i.e., this parameter appears in the denominator. Moreover, the robustness is also retained. Overall, the proposed APE achieves satisfactory convergence of time-varying parameter estimation for general nonlinearly parameterized systems.
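For readers who wish to reproduce this example, the model and the regressor needed by the estimator can be sketched as follows. The sketch assumes that Eq. (13) reads ẋ1 = θ1 x1 x2/(θ2 + x2) − D x1 and ẋ2 = −θ3 x1 x2/(θ2 + x2) + S0 − D x2, as the fractions in the text suggest; the function names are hypothetical, and the regressor is the Jacobian of the right-hand side with respect to (θ1, θ2), worked out by hand under that assumption.

```python
import math


def monod_rhs(x, theta, t, theta3=1.0, s0=5.0):
    """Right-hand side of the assumed form of Eq. (13) with D = 0.05*sin(t) + 0.1."""
    x1, x2 = x
    th1, th2 = theta
    d = 0.05 * math.sin(t) + 0.1
    growth = x1 * x2 / (th2 + x2)
    return [th1 * growth - d * x1, -theta3 * growth + s0 - d * x2]


def monod_regressor(x, theta_hat, theta3=1.0):
    """Jacobian of the right-hand side with respect to (theta1, theta2) at theta_hat."""
    x1, x2 = x
    th1, th2 = theta_hat
    denom = th2 + x2
    return [
        [x1 * x2 / denom, -th1 * x1 * x2 / denom ** 2],
        [0.0, theta3 * x1 * x2 / denom ** 2],
    ]
```

The first column of the regressor does not depend on θ2 in the second row, and θ2 enters only through the denominator, which is consistent with the slower convergence of θ2 noted above.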
5 Conclusions
This paper proposes a new APE structure for time-varying parameters in nonlinearly parameterized systems. With the help of Taylor expansion and filtered variables, the estimation errors of the unknown parameters are extracted and then used to design a constructive adaptive law. In this way, the widely used observer and predictor designs can be avoided. Additionally, the influence of the regressor dynamics on the convergence performance can be properly compensated. Based on Lyapunov stability theory, the estimation error under the PE condition is guaranteed to converge to a compact set around zero even in the presence of bounded disturbances. As illustrated by two simulations, this APE framework can estimate fast time-varying parameters. Future work will investigate the case with a relaxed PE condition.
Acknowledgements This work was funded by the National Natural Science Foundation of China under grants (62203194, 62273169), and partially supported by Yunnan Major Scientific and Technological Projects under grants (202102AA310042, 202203AP140005, 202202AG050002, 202001AV070001, 202001AS070028) and Kunming University of Science and Technology & the First People's Hospital of Yunnan Province Joint Special Project on Medical Research (KUSTKH2022003Y). (Corresponding author: Yashan Xing.)
Static Characteristics Simulation Analysis of a Plate Type Flow Control Type Counterbalance Valve Kan Li, Yingqi Shen, Guolei Si, Junhui Chen, and Teng Li
Abstract This paper describes the structural principle and operating mechanism of a plate-type flow-control counterbalance valve [1] and builds a simulation model of its structure using the AMESim Hydraulic Component Design (HCD) library. Simulation is used to examine the effect of load pressure on the opening of the counterbalance valve, the effect of the compensation damping hole diameter on the flow through the counterbalance valve, the relationship between pilot spool and main spool displacement, and the effect of different load pressures on the displacement of the main spool and pilot spool. Keywords Flow control · Counterbalance valve · Pilot pressure · Load pressure · Spool displacement
1 Introduction
The flow-control counterbalance valve has developed into a multi-functional integrated motion control valve that combines a one-way (check) function, hydraulically controlled throttling, load locking or holding, protection against pressure shock, and controlled lowering of the load. Different pilot control methods and load pressure compensation structures give different opening pressures, on/off times and flow supply characteristics to suit different working conditions. These valves are compact, seal well, and work stably and reliably, and they are widely used in variable-load applications such as military missile launch equipment, rocket launch equipment, civil engineering machinery, port machinery and lifting tables [2]. The counterbalance valve is one of the important components of a hydraulic system working under overrunning load conditions, and its performance strongly affects the performance of the whole machine as well as energy saving and emission reduction. With the rapid development of hydraulic, manufacturing and control technology, system designers are required to use a counterbalance valve that suits the working conditions and the controlled object [3]. However, designers are not yet able to select the right counterbalance valve according to the working conditions, so it is necessary to analyze the characteristics of counterbalance valves to guide them [4–8].

K. Li (B) · Y. Shen · G. Si · J. Chen · T. Li, SiChuan Aerospace FengHuo Servo Control Technology Corporation, Chengdu, Sichuan 610000, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_25
2 Counterbalance Valve Working Principle
A hydraulic circuit with counterbalance valves generally has three operating conditions: neutral, rising and falling.

Neutral condition (load holding circuit): the load pressure and the compression spring act on the main spool in the closing direction, and the valve is in a zero-leakage closed state, as shown in Fig. 1.

Falling condition (load falling circuit): the pilot pressure acts on the pilot piston and overcomes the control spring force, so the pilot spool opens and the load pressure at the B port is released to the A port through the groove on the pilot spool. The opening of the pilot spool becomes larger as the pilot pressure at the X port increases. The pressure in the spring cavity of the pilot spool is then gradually released. When the force of the load port pressure on the main spool exceeds the force of the pilot valve spring cavity pressure on the main spool, the main spool starts to move to the right, and from then on it follows the movement of the pilot spool, as shown in Fig. 2.

Rising condition (load rising circuit): the pressure at the A port overcomes the very small compression spring force and the load force, the pilot spool and the main spool move together in the opening direction, and the oil flows from A to B. The valve is then equivalent to a one-way valve, as shown in Fig. 3.
3 Counterbalance Valve Structure
The plate counterbalance valve consists of four main blocks: pilot end cap, pilot control piston, relief valve (thermal surge valve), and main valve, as shown in Fig. 4. Figure 5 shows the composition of the Bucher Cindy series flow control counterbalance valve with D-shaped end caps and the names of the parts. The correspondence between the detailed structure of the counterbalance valve and the hydraulic schematic is shown in Fig. 6. For the counterbalance valve to reclose fully without leakage, the control piston spring preload pressure Pspring must be set high enough to shut off the flow completely at the maximum load pressure; Pspring should be at least 30% higher than the maximum load port pressure Pmax.
Fig. 1 Load holding condition (labeled components: variable amplitude system, hydraulic cylinder, pilot valve, load control valve main valve, secondary relief valves, multi-way directional valve, check valve, pump, relief valve)
Fig. 2 Load falling working condition (same labeled components as Fig. 1)
Fig. 3 Load rising condition (same labeled components as Fig. 1)
Fig. 4 Typical structure of plate type counterbalance valve (callouts 1 to 4)
Fig. 5 Bucher counterbalance valve structure diagram
4 Counterbalance Valve Pilot Ratio
The structure that determines the pilot ratio of the counterbalance valve is shown in Fig. 7. Assuming the control piston area of the counterbalance valve is A1 and the pilot area is A2, the pilot ratio is
$$\text{pilot ratio} = \frac{\text{control piston area}}{\text{pilot valve area}} = \frac{A_1}{A_2} = \frac{\pi (d_1/2)^2}{\pi (d_2/2)^2}$$
Fig. 6 Structure and principle correspondence diagram

Note: The pilot area is based on d2 for the following reason. As shown in Fig. 8, the action areas on the two sides of the blue part of the pilot spool are equal, and under static conditions (no oil flow) the pressures in cavity C1 and cavity C2 are equal in magnitude and opposite in direction. The pressure on both sides of the blue part of the pilot spool is therefore balanced, and only the d2 part acts. The pilot ratio defined above is thus the pilot ratio under static conditions.
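The pilot ratio formula can be checked numerically with a minimal sketch. The function name and the two diameters below are hypothetical and chosen only for illustration; they are not taken from the valve described in the paper.

```python
import math

def pilot_ratio(d1_mm: float, d2_mm: float) -> float:
    """Static pilot ratio A1/A2 for circular areas of diameters d1 and d2."""
    a1 = math.pi * (d1_mm / 2) ** 2  # control piston area A1
    a2 = math.pi * (d2_mm / 2) ** 2  # pilot valve area A2
    return a1 / a2

# Hypothetical diameters (mm), for illustration only.
ratio = pilot_ratio(12.0, 3.0)
print(f"pilot ratio = {ratio:.2f}")
```

Because both areas are circular, the ratio reduces to (d1/d2)^2, so only the diameter ratio matters.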
Fig. 7 Counterbalance valve pilot ratio structure
Fig. 8 Pilot-operated counterbalance valve force area

5 Counterbalance Valve Damping
In a hydraulic circuit, flow through a restriction occurs only when there is a differential pressure across it, and a differential pressure across a restriction arises only when there is flow through it.
For two throttle orifices in parallel, the flow distribution is shown in Fig. 9.
$$\begin{cases}
\dfrac{q_{V1}}{q_{V2}} = \dfrac{d_1^2}{d_2^2} \\[4pt]
q_V = q_{V1} + q_{V2} \\[4pt]
\dfrac{1}{R_{1+2}} = \dfrac{1}{R_1} + \dfrac{1}{R_2} \\[4pt]
\Delta p = \Delta p_1 = \Delta p_2 \\[4pt]
q_{V1} = \dfrac{1}{1 + (d_2/d_1)^2}\, q_V \\[4pt]
q_{V2} = \dfrac{1}{1 + (d_1/d_2)^2}\, q_V
\end{cases}$$
The equivalent throttle orifice of two throttle orifices in parallel is
$$d = \sqrt{d_1^2 + d_2^2}$$

Fig. 9 Parallel damped shunt

For two throttle orifices in series, the differential pressure distribution is shown in Fig. 10.
$$\begin{cases}
\Delta P = \Delta p_1 + \Delta p_2 \\[4pt]
\dfrac{\Delta p_1}{\Delta p_2} = \dfrac{d_2^4}{d_1^4} = \dfrac{R_1}{R_2} \\[4pt]
R_{1+2} = R_1 + R_2 \\[4pt]
q_V = q_{V1} = q_{V2} \\[4pt]
\Delta p_1 = \dfrac{1}{1 + (d_1/d_2)^4}\, \Delta P \\[4pt]
\Delta p_2 = \dfrac{1}{1 + (d_2/d_1)^4}\, \Delta P
\end{cases}$$
The equivalent throttle orifice of two throttle orifices in series is
$$d_{1+2} = \frac{d_1 d_2}{\sqrt[4]{d_1^4 + d_2^4}} \quad \text{or} \quad \frac{1}{d_{1+2}^4} = \frac{1}{d_1^4} + \frac{1}{d_2^4}$$

Fig. 10 Series damping pressure divider
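The parallel and series relations can be exercised with a short sketch. It assumes, as the formulas themselves do, that both orifices share the same discharge coefficient and fluid, so that flow scales with d^2 at equal pressure drop and pressure drop scales with 1/d^4 at equal flow. The function names and the sample diameters are hypothetical.

```python
def parallel_equivalent(d1: float, d2: float) -> float:
    """Equivalent diameter of two orifices in parallel: d = sqrt(d1^2 + d2^2)."""
    return (d1 ** 2 + d2 ** 2) ** 0.5

def series_equivalent(d1: float, d2: float) -> float:
    """Equivalent diameter of two orifices in series: 1/d^4 = 1/d1^4 + 1/d2^4."""
    return (1.0 / (1.0 / d1 ** 4 + 1.0 / d2 ** 4)) ** 0.25

def parallel_flow_split(q: float, d1: float, d2: float) -> tuple:
    """Share of the total flow q taken by each orifice in parallel."""
    q1 = q / (1.0 + (d2 / d1) ** 2)
    q2 = q / (1.0 + (d1 / d2) ** 2)
    return q1, q2

def series_pressure_split(dp: float, d1: float, d2: float) -> tuple:
    """Share of the total pressure drop dp across each orifice in series."""
    dp1 = dp / (1.0 + (d1 / d2) ** 4)
    dp2 = dp / (1.0 + (d2 / d1) ** 4)
    return dp1, dp2

# Hypothetical diameters in mm, total flow in L/min, total drop in MPa.
print(parallel_equivalent(3.0, 4.0))
print(series_equivalent(1.0, 1.0))
print(parallel_flow_split(10.0, 1.0, 2.0))
print(series_pressure_split(10.0, 1.0, 2.0))
```

In the series case the smaller orifice takes almost all of the pressure drop, because of the fourth-power dependence on diameter.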
6 Counterbalance Valve Modeling For analyzing the static characteristics of the counterbalance valve, a simulation model of the counterbalance valve was built based on the Hydraulic Component Design Library (HCD) in the multidisciplinary system simulation software AMESim (Fig. 11).
Fig. 11 Counterbalance valve AMESim simulation model
7 Simulation Analysis
The simulation parameters of the counterbalance valve model are listed in Table 1. These parameters are entered into the corresponding model blocks before the simulation is run.

Table 1 Component parameters
No.  Component parameter                                                  Value
1    Control piston large end diameter (mm)                               32
2    Diameter of small end of control piston (mm)                         4
3    Inlet damping diameter (mm)                                          0.3
4    Initial position of control piston (mm)                              0
5    Maximum distance the control piston moves (mm)                       10
6    Control piston mass (kg)                                             0.086
7    Distance between control piston and pilot spool (mm)                 0.5
8    Control spring preload (N)                                           464
9    Regulating spring stiffness (N/m)                                    123,770
10   Pilot spool diameter (mm)                                            3
11   Outside diameter of the second force surface of the pilot spool (mm) 11
12   Inner diameter of the second force surface of the pilot spool (mm)   3
13   Diameter of the third force surface of the pilot spool (mm)          11
14   Pilot spool mass (kg)                                                0.0124
15   Main spool mass (kg)                                                 0.071
16   One-way spring preload force (N)                                     30
17   One-way spring stiffness (N/m)                                       3180
18   Main spool large end outside diameter (mm)                           22
19   Main spool large end internal diameter (mm)                          11
20   Main spool fixed throttle hole diameter (mm)                         1.4
21   Main spool wedge throttle length (mm)                                7.6
22   Main spool wedge throttle width (mm)                                 0.8

7.1 Effect of Load on the Opening Pressure of Counterbalance Valve
The load pressure settings are shown in Table 2. The pilot spool is considered to be open when its displacement reaches 0.02 mm. The effect of the load pressure on the pilot opening pressure is shown in Fig. 12.

Table 2 Load pressure
No.  Load pressure (MPa)  Pilot opening pressure (MPa)  Pilot spool displacement (mm)
1    10                   0.78                          0.02
2    20                   0.82                          0.02
3    30                   0.89                          0.02
4    40                   0.94                          0.02

Fig. 12 Load versus opening pressure (pilot start of opening in MPa versus load pressure in MPa; the annotated difference between the end points is 0.16 MPa)

The difference between the opening pressure at a load pressure of 40 MPa and the opening pressure at a load pressure of 10 MPa is only 0.16 MPa. The load of this type of counterbalance valve therefore does affect the pilot opening pressure, but the effect is very small and can almost be ignored.
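The size of this effect can be made concrete with a short calculation on the Table 2 values. This is only an illustrative sketch; the variable names are hypothetical, and the average sensitivity is a derived quantity that the paper does not report.

```python
# Values copied from Table 2.
load_mpa = [10, 20, 30, 40]
opening_mpa = [0.78, 0.82, 0.89, 0.94]

# Change in pilot opening pressure over the full load range.
spread = opening_mpa[-1] - opening_mpa[0]

# Average change in opening pressure per MPa of load pressure.
sensitivity = spread / (load_mpa[-1] - load_mpa[0])

print(f"spread = {spread:.2f} MPa")                      # spread = 0.16 MPa
print(f"sensitivity = {sensitivity:.4f} MPa per MPa")    # sensitivity = 0.0053 MPa per MPa
```

A 30 MPa change in load pressure thus shifts the opening pressure by roughly half a percent of that change.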
7.2 Effect of Compensation Damping Hole Diameter on the Flow Rate of Counterbalance Valve
The compensation damping hole diameter, pilot pressure and load pressure settings are shown in Table 3. The simulation results are shown in Fig. 13. At a pilot pressure of 10 bar, the main valve is not yet open. For pilot pressures between 13 bar and 28 bar, the main valve flow first increases and then decreases as the load pressure increases. Only at a pilot pressure of 31 bar does the main valve flow rate follow the square-root relationship of the flow equation; at each of the other pilot pressures that open the valve, the main flow rate first increases and then decreases with increasing load pressure.
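The square-root relationship mentioned in Sect. 7.2 is presumably the standard sharp-edged orifice flow equation. The paper does not write it out, so the symbols below (discharge coefficient $C_d$, opening area $A$, oil density $\rho$) are assumptions introduced here for illustration:

$$q_V = C_d\, A \sqrt{\frac{2\,\Delta p}{\rho}}$$

With a fixed opening area $A$, the flow grows with the square root of the pressure drop $\Delta p$. Under this reading, a flow that falls while the load pressure keeps rising indicates that the opening area is being reduced as the load pressure increases.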
Table 3 Load pressure
No.  Diameter of compensation damping hole (mm)  Pilot pressure (bar)  Load pressure (MPa)
1    1.2                                         10                    Linear change from 1 MPa to 42 MPa in 20 s
2    1.2                                         13                    The same as above
3    1.2                                         16                    The same as above
4    1.2                                         19                    The same as above
5    1.2                                         22                    The same as above
6    1.2                                         25                    The same as above
7    1.2                                         28                    The same as above
8    1.2                                         31                    The same as above

Fig. 13 Simulation results of load pressure compensation characteristics (flow of main valve in L/min versus load pressure in bar; curves for px = 10, 13, 16, 19, 22, 25, 28 and 31 bar)
7.3 Pilot Spool and Main Spool Displacement Relationship
The pilot pressure is applied as a step signal: 8 bar for the first 5 s, 30 bar for the middle 20 s, and 0 bar for the last 15 s. The load pressure is set to 100 bar. The relationship between the main spool and pilot spool displacement is shown in Fig. 14. In the first 5 s, the pilot pressure is too low to open the pilot spool, so the main spool and pilot spool displacements are both 0 mm. In the middle 20 s, after the pilot pressure steps up, the pilot spool begins to move and the main spool starts to move after a delay. In the last 15 s, after the pilot pressure steps down, the main spool follows the movement of the pilot spool, and both finally reach the same position.
Fig. 14 Main spool and pilot spool displacement relationship (displacement in mm versus time in s; curves for pilot valve and main valve)
7.4 Effect of Different Load Pressure on the Displacement of Main Spool and Pilot Spool
The pilot pressure is set to 12 bar, and the load pressure is set to 10, 20, 30, and 40 MPa. The simulation results are shown in Fig. 15. For the pilot spool, at a given pilot pressure the spool displacement gradually decreases as the load increases. The main spool starts to open only when the pilot spool has moved about 2.2 mm, i.e., after the pilot spool closes the quick-closing valve port on the main spool. The pressure in the control chamber then starts to drain, the force balance on the main spool is broken, and the main spool starts to move. As the load increases, the main spool displacement also gradually decreases.

Fig. 15 Spool displacement versus load pressure (displacement in mm versus time in s; pilot valve and main valve curves for load pressures of 10, 20, 30 and 40 MPa)

When the combined effect of load pressure and pilot pressure on spool displacement is considered, the positions of the pilot spool and main spool are as shown in Figs. 16 and 17, respectively. The two figures show that the spool displacement first increases and then decreases as the load pressure increases. Compared with the pilot spool displacement, the main spool displacement reaches its extreme value earlier and starts to decrease earlier.
Fig. 16 Pilot spool location in relation to pilot and load pressures (displacement in mm versus load pressure in bar; curves for px = 10, 13, 16, 19, 22, 25, 28 and 31 bar)
Fig. 17 Main spool location in relation to pilot pressure and load pressure (displacement in mm versus load pressure in bar; curves for px = 10, 13, 16, 19, 22, 25, 28 and 31 bar)
8 Conclusion
The working principle and basic structure of the plate-type counterbalance valve are analyzed. A simulation model of the counterbalance valve is built and used to analyze the relationships between the main spool, pilot spool, load pressure and pilot pressure. The results show that:
(1) The load pressure of this plate counterbalance valve has little effect on the opening pressure;
(2) Compensation damping has a large impact on the flow of the main valve, and the diameter of the compensation damping hole can be chosen to design counterbalance valves that meet different needs;
(3) The pilot spool always moves first and the main spool follows the pilot spool, with a certain lag (hysteresis);
(4) With compensation damping, the spool displacement first increases and then decreases as the load pressure increases at different pilot pressures, i.e., after the maximum flow rate is reached, the flow rate gradually decreases as the load pressure continues to increase, which prevents the load from dropping too fast.
References
1. Duan, H., Chang, Q.: Design study of counterbalance valve in hydraulic system of hydraulic opening and closing machine. Hydraul. Pneumatic Seals 5 (2021)
2. Xu, H., Li, G.: Research on the performance analysis of an energy-saving hydraulic counterbalance valve. Hydraul. Pneumatic Seals 8 (2022)
3. Zhang, H.: Hydraulic Speed Control Technology. China Machine Press, Beijing (2014)
4. Liu, Z., Hu, Y., Lu, C., Zhu, W.: Numerical simulation study on cavitation and noise of internal reduction type counterbalance valve. Hydraul. Pneumatic 08, 125–132 (2020)
5. Bo, C.: Study on Static and Dynamic Characteristics of Balanced Valve with Asymmetric Damping Network Structure. Lanzhou University of Technology (2018)
6. Liu, J.: Research on the New Principle of Inter-Stage Hydraulic-Mechanical Dual Feedback and its Application in High Flow Control Valves. Zhejiang University (2017)
7. Lu, Y.: Performance analysis of a new type of counterbalance valve for truck cranes. Hydraul. Pneumatic 06, 114–117 (2014)
8. Lu, L.: Cavitation Flow and Noise in Hydraulic Throttle Valves. Zhejiang University (2012)
9. Lu, L., Li, Y., Si, G., Li, K., Li, M.: Study on dynamic performance of two-stage damped pilot-operated counterbalance valve. Hydraul. Pneumatic 47(06), 1–10 (2023)
10. Ye, W., Ge, M.: Research on the performance of the pressure stabilization circuit of the control chamber of counterbalance valve. Coal Min. Mach. 44(01), 50–51 (2023). https://doi.org/10.13436/j.mkjx.202301015
Faster Convergence Rate of Sampled-Data Systems with Artificially Designed Optimal Time Delay Parameter Wenwen Li, Jianqiang Liang, and Mingxing Li
Abstract In this paper, a new scheme to improve the performance of linear sampled-data systems is investigated by treating the time delay as a new control parameter, and a corresponding algorithm to determine the optimal parameter is proposed based on the discrete-time approach. To illustrate the proposed conclusions clearly, the sampled-data system with time delay is transformed into an augmented discrete-time system, which is proven to be equivalent in terms of stability. Furthermore, a natural stability criterion based on the discrete-time system is derived, enabling accurate computation of the maximum allowable sampling period for the systems under consideration. Additionally, an algorithm for computing the optimal time delay parameter is proposed. Finally, two numerical examples are presented to verify the effectiveness and superiority of the theoretical results. Keywords Discrete-time approach · Optimal time delay · Sampled-data system · Stability
1 Introduction
In the last decade, owing to the ubiquity of embedded controllers in relevant application domains, sampled-data systems have been widely applied in areas such as satellite control systems [1], vehicle suspension systems [2], power systems [3], and some physiological systems [4–6]. Meanwhile, control systems based on sampled data have been extensively studied, and many important conclusions have been obtained in fields such as system stability analysis [7, 8] and control law design [9, 10]. Unlike continuous-time systems, sampled-data systems are affected not only by the control law but also by the sampling period and the time delay. Regarding the sampling period, self-triggered control [11] techniques let sampled-data systems achieve better control performance by designing time-varying sampling periods, and this field is popular and mature. In contrast, the impact of time delay on sampled-data system performance is still insufficiently studied, although research in this field is of great significance for correctly recognizing the role of time delay and for improving the theory of sampled-data systems.

Some existing studies that take the time delay into consideration reach the similar result that time delay decreases the maximum allowable sampling period (MASP) of sampled-data systems, which means that time delay deteriorates the system performance, as in [12–16]. In fact, time delays do not always have bad effects. The existence of a positive impact is mentioned in [17], but only as a simple description of the phenomenon without theoretical analysis. It is therefore vital and necessary to investigate the impact of time delay on system performance at the theoretical level.

Several approaches to sampled-data systems have been developed so far, such as the discrete-time approach [8, 18, 19], the input delay approach [20, 21], and the impulsive system approach [22, 23]. In these approaches, the sampled-data system is modeled as a discrete-time system, a continuous-time system with a delayed input, and an impulsive system, respectively. To illustrate the physical implication of the theorems proposed in this paper more clearly, the discrete-time approach is chosen here; the other approaches do not yield any clearer conclusions. In this paper, sufficient conditions for the existence of an optimal time delay and an algorithm to determine the optimal time delay are provided for sampled-data systems.

W. Li · J. Liang · M. Li (B), The Seventh Research Division and the Center for Information and Control, School of Automation Science and Electrical Engineering, Beihang University (BUAA), Beijing 100191, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_26
The contributions of this paper are summarized as follows: (1) The first contribution is the proposed augmented discrete-time model. Based on this model, less conservative results can be obtained when calculating the MASP of the systems under consideration. (2) This paper proposes an existence condition for the optimal time delay and a corresponding algorithm for calculating it. These conclusions demonstrate at the theoretical level that time delay can be used to improve system performance, which gives a more complete picture of the impact of time delay in sampled-data systems. This paper is organized as follows: Sect. 2 formulates the problem to be addressed. Section 3 gives the main results. Section 4 verifies the main results via two numerical examples. Conclusions are given in Sect. 5.
2 Problem Formulation and Preliminaries
Consider the following linear sampled-data system
$$\dot{x}(t) = A x(t) + B u(t), \quad t_k \le t < t_{k+1} \tag{1}$$
where A and B are constant matrices with compatible dimensions, $x(t) \in \mathbb{R}^n$ and $u(t) \in \mathbb{R}^m$ denote the state vector and the control input, and $t_k$, $t_{k+1}$ denote the sampling instants, which satisfy
$$0 \le t_0 < t_1 < \cdots < t_k < t_{k+1} < \cdots, \quad \lim_{k \to \infty} t_k = \infty. \tag{2}$$
The state-feedback control signal u is generated by a zero-order-hold function with constant time delay, i.e.,
$$u(t) = \begin{cases} K x(t_{k-1}), & t \in [t_k, t_k + \tau) \\ K x(t_k), & t \in [t_k + \tau, t_{k+1}) \end{cases} \tag{3}$$
where K is the state feedback gain and $\tau \ge 0$ is a newly introduced parameter. This paper studies two questions: whether there is a parameter $\tau > 0$ such that the closed-loop system of plant (1) under controller (3) admits a larger sampling interval for a given controller K, and how to calculate this optimal parameter $\tau$. To derive the main results, the following assumptions and lemmas are needed.

Assumption 1 System (1) is a periodic sampled-data system and h is the sampling period.

Assumption 2 Parameter $\tau$ is less than the sampling period, i.e., $0 \le \tau < h$.

Assumption 3 $A + BK$ is Hurwitz.

Lemma 1 Consider the discrete-time system
$$x(t_{k+1}) = \Theta(h_k)\, x(t_k) \tag{4}$$
with $h_k = t_{k+1} - t_k \in [\underline{h}, \overline{h}]$, $k \in \mathbb{N}$. The origin of system (4) is Globally Uniformly Exponentially Stable for all sampling sequences $\sigma = \{t_k\}_{k \in \mathbb{N}}$ if there exists $P > 0$ such that
$$\Theta^T(\theta) P\, \Theta(\theta) - P < 0, \quad \forall \theta \in [\underline{h}, \overline{h}]. \tag{5}$$
For a controllable sampled-data system (1)–(3), Assumptions 1, 2 and 3 are all easily satisfied in practice. Lemma 1 is a commonly used result for discrete-time systems, and its proof is omitted.
3 Main Results
Define $h_\tau = h - \tau$. The state vector of the closed-loop system formed by system (1) and controller (3) can then be expressed as
$$x(t_k + \tau) = e^{A\tau} x(t_k) + \int_0^{\tau} e^{As} BK \, ds \; x(t_{k-1}) \tag{6}$$
$$x(t_{k+1}) = e^{A h_\tau} x(t_k + \tau) + \int_0^{h_\tau} e^{As} BK \, ds \; x(t_k) \tag{7}$$
According to (6) and (7), $x(t_{k+1})$ can be written as
$$x(t_{k+1}) = \Phi_1(h, \tau)\, x(t_k) + \Phi_2(h, \tau)\, x(t_{k-1}) \tag{8}$$
where $\Phi_1(h, \tau)$ and $\Phi_2(h, \tau)$ are given by
$$\Phi_1(h, \tau) = e^{Ah} + \int_0^{h-\tau} e^{As} ds \, BK, \qquad \Phi_2(h, \tau) = \int_{h-\tau}^{h} e^{As} ds \, BK.$$
Therefore, the closed-loop system of system (1) and controller (3) is equivalently expressed as
$$\begin{bmatrix} x(t_k) \\ x(t_{k+1}) \end{bmatrix} = \Theta(h, \tau) \begin{bmatrix} x(t_{k-1}) \\ x(t_k) \end{bmatrix} \tag{9}$$
where $\Theta(h, \tau)$ is given by
$$\Theta(h, \tau) = \begin{bmatrix} 0 & I_n \\ \Phi_2(h, \tau) & \Phi_1(h, \tau) \end{bmatrix}. \tag{10}$$
System (9), with $\Theta(h, \tau)$ given by Eq. (10), is a discrete form of the sampled-data system (1) under control (3) with time delay parameter $\tau$.
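The construction of $\Phi_1$, $\Phi_2$ and $\Theta$ can be illustrated with a minimal sketch for the scalar case n = m = 1, where the matrix exponential and its integral have closed forms. The scalar restriction, the function names and the numerical values of a, b, k, h and τ are all assumptions made here for illustration; they do not come from the paper's examples. The sketch also checks that Eq. (8) reproduces the two-stage propagation of Eqs. (6) and (7).

```python
import math

def int_exp(a: float, lo: float, hi: float) -> float:
    """Integral of exp(a*s) ds over [lo, hi] for a scalar a."""
    if abs(a) < 1e-12:  # limiting case a -> 0
        return hi - lo
    return (math.exp(a * hi) - math.exp(a * lo)) / a

def phi_terms(a, b, k, h, tau):
    """Scalar versions of Phi_1(h, tau) and Phi_2(h, tau) in Eq. (8)."""
    phi1 = math.exp(a * h) + int_exp(a, 0.0, h - tau) * b * k
    phi2 = int_exp(a, h - tau, h) * b * k
    return phi1, phi2

def theta(a, b, k, h, tau):
    """Augmented matrix Theta(h, tau) of Eq. (10) for the scalar plant."""
    phi1, phi2 = phi_terms(a, b, k, h, tau)
    return [[0.0, 1.0], [phi2, phi1]]

def step_via_6_and_7(a, b, k, h, tau, x_prev, x_now):
    """Propagate one sampling period using Eq. (6) followed by Eq. (7)."""
    x_mid = math.exp(a * tau) * x_now + int_exp(a, 0.0, tau) * b * k * x_prev
    return math.exp(a * (h - tau)) * x_mid + int_exp(a, 0.0, h - tau) * b * k * x_now

# Hypothetical plant: a + b*k = -1 is Hurwitz, and 0 <= tau < h.
a, b, k, h, tau = 1.0, 1.0, -2.0, 0.5, 0.1
x_prev, x_now = 1.0, 0.8

phi1, phi2 = phi_terms(a, b, k, h, tau)
x_next_8 = phi1 * x_now + phi2 * x_prev
x_next_67 = step_via_6_and_7(a, b, k, h, tau, x_prev, x_now)
print(x_next_8, x_next_67)
```

For matrix-valued A, B and K the same structure applies, with the scalar exponential replaced by a matrix exponential and its integral.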
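3.1 Stability Analysis with a Given τ
Lemma 2 For the sampled-data system (1)–(3) and its discrete form (9) and (10) with constant sampling period h, time delay $\tau$ and initial state $x(t_0)$, the following two results are equivalent:
1. $\lim_{t \to \infty} x(t) = 0$.
2. $\lim_{k \to \infty} \left[ x^T(t_k), x^T(t_{k+1}) \right]^T = 0$.

Proof 1) ⇒ 2) is obvious. If 2) is satisfied, then the following equations all hold simultaneously:
$$\lim_{k \to \infty} x(t_{k-1}) = 0, \quad \lim_{k \to \infty} x(t_k) = 0, \quad \lim_{k \to \infty} x(t_{k+1}) = 0. \tag{11}$$
For system (1)–(3) with $t_k + \tau \le t < t_{k+1}$, we have
$$x(t) = \left[ e^{A(t - t_k)} + \int_{t_k + \tau}^{t} e^{A(t-s)} ds \, BK \right] x(t_k) + \int_{t_k}^{t_k + \tau} e^{A(t-s)} ds \, BK \, x(t_{k-1}).$$
Taking norms on both sides of the above equation gives
$$\|x(t)\| \le \left\| e^{A(t - t_k)} + \int_{t_k + \tau}^{t} e^{A(t-s)} ds \, BK \right\| \|x(t_k)\| + \left\| \int_{t_k}^{t_k + \tau} e^{A(t-s)} ds \, BK \right\| \|x(t_{k-1})\|. \tag{12}$$
Furthermore, for $t_k \le t < t_k + \tau$ the following equation is obtained:
$$x(t) = e^{A(t - t_k)} x(t_k) + \int_{t_k}^{t} e^{A(t-s)} ds \, BK \, x(t_{k-1}). \tag{13}$$
Taking norms on both sides gives
$$\|x(t)\| \le \left\| e^{A(t - t_k)} \right\| \|x(t_k)\| + \left\| \int_{t_k}^{t} e^{A(t-s)} ds \, BK \right\| \|x(t_{k-1})\|. \tag{14}$$
In inequalities (12) and (14), since $t \in [t_k + \tau, t_{k+1})$ or $t \in [t_k, t_k + \tau)$, so that $0 \le t - t_k < h$, the following bounds all hold:
$$\left\| \int_{t_k}^{t_k + \tau} e^{A(t-s)} ds \, BK \right\| < \infty, \quad \left\| \int_{t_k}^{t} e^{A(t-s)} ds \, BK \right\| < \infty, \quad \left\| e^{A(t - t_k)} \right\| < \infty, \quad \left\| e^{A(t - t_k)} + \int_{t_k + \tau}^{t} e^{A(t-s)} ds \, BK \right\| < \infty.$$
Combining these bounds with (11), inequalities (12) and (14) give $\lim_{t \to \infty} x(t) = 0$, i.e., 2) ⇒ 1).

Theorem 1 For the sampled-data system (1) with control input (3), let $\Theta(h, \tau)$ be calculated by Eq. (10) for the given sampling period h and time delay $\tau$. If there is a matrix $P \in \mathbb{R}^{2n \times 2n}$, $P > 0$, such that the following inequality is established,
$$\Theta^T(h, \tau) P\, \Theta(h, \tau) - P < 0 \tag{17}$$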
then system (1) under the control input (3) is stable.

Theorem 1 can be obtained directly from Lemma 1 and Lemma 2. For the discrete form system (9), if inequality (17) is satisfied, then according to Lemma 1 we can infer that
$$\lim_{k \to \infty} \begin{bmatrix} x(t_{k-1}) \\ x(t_k) \end{bmatrix} = 0. \tag{18}$$
Furthermore, from Lemma 2 we have
$$\lim_{t \to \infty} x(t) = 0, \tag{19}$$
thus system (1) under the control input (3) is stable, i.e., Theorem 1 is established.

The case $\tau = 0$ can be regarded as a special case of Theorem 1. In this case, the following equations hold:
$$x(t) = \left[ e^{A(t - t_k)} + \int_{t_k}^{t} e^{A(t-s)} ds \, BK \right] x(t_k) \tag{20}$$
$$x(t_{k+1}) = \left[ e^{Ah} + \int_0^{h} e^{As} ds \, BK \right] x(t_k) \tag{21}$$
and if we let
$$\Theta(h) = e^{Ah} + \int_0^{h} e^{As} ds \, BK \tag{22}$$
then the following corollary can be obtained.

Corollary 1 For the sampled-data system (1) with control input (3) and $\tau = 0$, let $\Theta(h)$ be calculated by Eq. (22) for the given sampling period h. If there is a matrix $\bar{P} \in \mathbb{R}^{n \times n}$, $\bar{P} > 0$, such that the following inequality is established,
$$\Theta^T(h) \bar{P}\, \Theta(h) - \bar{P} < 0 \tag{23}$$
then system (1) under the control input (3) is stable for the given h.

Remark 2 Theorem 1 and Corollary 1 give a new method to evaluate the stability of the sampled-data system (1)–(3). This new method may be less conservative; for more details of the simulation tests, see Example 1.
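The stability test of Theorem 1 and Corollary 1 can be sketched for a scalar plant without solving an LMI. The sketch relies on the standard discrete Lyapunov result that a matrix P > 0 satisfying an inequality of the form (17) or (23) exists exactly when the spectral radius of Θ is below one; using the spectral radius in place of the LMI is a substitution made here, not the paper's procedure. The plant values, function names and the bisection bracket are hypothetical, and the bisection assumes a single stability boundary inside the bracket.

```python
import cmath
import math

def int_exp(a, lo, hi):
    """Integral of exp(a*s) ds over [lo, hi] for a scalar a."""
    if abs(a) < 1e-12:
        return hi - lo
    return (math.exp(a * hi) - math.exp(a * lo)) / a

def phi_terms(a, b, k, h, tau):
    """Scalar Phi_1(h, tau) and Phi_2(h, tau) from Eq. (8)."""
    phi1 = math.exp(a * h) + int_exp(a, 0.0, h - tau) * b * k
    phi2 = int_exp(a, h - tau, h) * b * k
    return phi1, phi2

def spectral_radius(phi1, phi2):
    """Spectral radius of [[0, 1], [phi2, phi1]]: roots of z^2 - phi1*z - phi2."""
    root = cmath.sqrt(phi1 * phi1 + 4.0 * phi2)
    return max(abs((phi1 + root) / 2.0), abs((phi1 - root) / 2.0))

def is_stable(a, b, k, h, tau):
    """True when Theta(h, tau) is Schur stable, i.e. (17) is feasible."""
    phi1, phi2 = phi_terms(a, b, k, h, tau)
    return spectral_radius(phi1, phi2) < 1.0

def masp_by_bisection(a, b, k, tau, h_lo, h_hi, tol=1e-9):
    """Largest stable h in [h_lo, h_hi]; needs h_lo stable and h_hi unstable."""
    while h_hi - h_lo > tol:
        h_mid = 0.5 * (h_lo + h_hi)
        if is_stable(a, b, k, h_mid, tau):
            h_lo = h_mid
        else:
            h_hi = h_mid
    return h_lo

# Hypothetical plant with a + b*k = -1 (Hurwitz), tau = 0 as in Corollary 1.
masp0 = masp_by_bisection(1.0, 1.0, -2.0, 0.0, 0.01, 2.0)
print(masp0, math.log(3.0))
```

For this particular plant and τ = 0, Eq. (22) reduces to Θ(h) = 2 − e^h, so the stability boundary is h = ln 3, which the bisection should approach.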
3.2 Stability Analysis with Different τ

For the sampled-data system (1) with control input (3), inequality (17) is revised into the following form:

$$\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau) - Q + 2\alpha Q < 0 \tag{24}$$

where $Q \in \mathbb{R}^{2n \times 2n}$, $\alpha \in \mathbb{R}$, $Q > 0$ and $\alpha > 0$. It is easy to see that if inequality (24) holds, then inequality (17) also holds. Conversely, if inequality (17) holds, there exists $\hat{P} \in \mathbb{R}^{2n \times 2n}$, $\hat{P} > 0$, such that

$$\Theta^{T}(h,\tau)\,P\,\Theta(h,\tau) - P + \hat{P} \le 0$$

and there exists α > 0 such that $2\alpha P < \hat{P}$, which means that there are Q > 0 and α > 0 such that inequality (24) also holds. As a short summary, the following lemma is established.

Lemma 3 For the sampled-data system (1) with control input (3), if there exist a matrix $Q \in \mathbb{R}^{2n \times 2n}$, $Q > 0$, and a scalar α > 0 such that

$$\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau) - Q + 2\alpha Q < 0 \tag{25}$$

then system (1) under the control input (3) is stable.

If inequality (25) holds, then for any $x \in \mathbb{R}^{2n}$, $x \neq 0$, we have

$$x^{T}\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau)\,x - (1 - 2\alpha)\,x^{T}Qx < 0$$

which is equivalent to the following inequality:

$$\frac{x^{T}\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau)\,x}{x^{T}Qx} < 1 - 2\alpha$$
Moreover, let $\hat{\Theta}(h,\tau) \triangleq Q^{\frac{1}{2}}\,\Theta(h,\tau)\,Q^{-\frac{1}{2}}$ and $y \triangleq Q^{\frac{1}{2}}x$. Then $\|\Theta(h,\tau)\|_2 = \|\hat{\Theta}(h,\tau)\|_2$, while

$$\|\hat{\Theta}(h,\tau)\|_2^2 = \max_{y \neq 0} \frac{y^{T}Q^{-\frac{1}{2}}\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau)\,Q^{-\frac{1}{2}}y}{y^{T}y} = \max_{x \neq 0} \frac{x^{T}\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau)\,x}{x^{T}Qx} \tag{26}$$

Hence, for any $x \in \mathbb{R}^{2n}$, $x \neq 0$, we have

$$\|\hat{\Theta}(h,\tau)\|_2^2 = \|\Theta(h,\tau)\|_2^2 = \max_{x \neq 0} \frac{x^{T}\Theta^{T}(h,\tau)\,Q\,\Theta(h,\tau)\,x}{x^{T}Qx}$$

which means that the maximum value of α in inequality (24) is determined only by $\|\Theta(h,\tau)\|_2$ for the given h and τ of the sampled-data system (1) with control input (3), and this maximum value $\alpha_{\max}$ can be used to qualitatively describe the stability of the closed-loop system. For different h and τ, the larger $\alpha_{\max}$ is, the faster x(t) converges to a neighborhood of zero. The maximum value of α can be obtained by solving the following optimization problem

$$\max_{Q > 0}\,\{\alpha \mid \Theta^{T}Q\Theta - Q < -2\alpha Q\} \tag{27}$$

for the given h and τ. For different values of τ, such as τ1 and τ2, the corresponding optimal values $\alpha_{\max,1}$ and $\alpha_{\max,2}$ can be obtained by solving the above optimization problem. If $\alpha_{\max,1} < \alpha_{\max,2}$, then the convergence rate and the stability of the closed-loop system are both better for τ = τ2 than for τ = τ1. Thus, as a short summary, Algorithm 1 is established to calculate an optimal τ.

Remark 3 In theory, the parameter τ can be treated as a new control parameter that helps to accelerate the rate of convergence or to reduce the communication time of sampled-data systems.
4 Numerical Example

In this section, two numerical examples are provided to illustrate the less conservative results obtained by using Theorem 1 and Corollary 1, and the effectiveness of Algorithm 1 in obtaining the optimal τ.
Algorithm 1 Optimal Parameter τ Algorithm
Require: System matrices A, B and K, sampling period h, initial time delay τ = 0, step size δ, τ* = 0, α_max = 0
Ensure: Optimal time delay τ*
1: for 0 ≤ τ < h do
2:   $\Phi_1 \leftarrow e^{Ah} + \int_{0}^{h-\tau} e^{As}\,ds\; BK$
3:   $\Phi_2 \leftarrow \int_{h-\tau}^{h} e^{As}\,ds\; BK$
4:   $\Theta \leftarrow \begin{bmatrix} 0 & I \\ \Phi_2 & \Phi_1 \end{bmatrix}$
5:   $\alpha \leftarrow \max_{Q>0}\{\alpha \mid \Theta^{T}Q\Theta - Q < -2\alpha Q\}$
6:   if α_max ≤ α then
7:     τ* ← τ
8:     α_max ← α
9:   end if
10:  τ ← τ + δ
11: end for
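The grid search of Algorithm 1 can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the inner optimization (27) is replaced by the closed form α = (1 − ρ(Θ)²)/2, where ρ is the spectral radius, using the standard fact that $\Theta^{T}Q\Theta < (1-2\alpha)Q$ has a solution Q > 0 exactly when ρ(Θ)² < 1 − 2α. The function names and the step size are hypothetical, and the sketch is not claimed to reproduce the value of τ reported in Example 1.

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    M = np.asarray(M, dtype=float)
    norm = np.linalg.norm(M, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    X = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ X / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def gamma(A, t):
    """int_0^t e^{As} ds, read from the exponential of an augmented matrix."""
    n = A.shape[0]
    aug = np.zeros((2 * n, 2 * n))
    aug[:n, :n] = A
    aug[:n, n:] = np.eye(n)
    return expm(aug * t)[:n, n:]

def theta(A, B, K, h, tau):
    """Steps 2-4 of Algorithm 1: build Theta(h, tau) from Phi_1 and Phi_2."""
    n = A.shape[0]
    BK = B @ K
    phi1 = expm(A * h) + gamma(A, h - tau) @ BK
    phi2 = (gamma(A, h) - gamma(A, h - tau)) @ BK
    return np.block([[np.zeros((n, n)), np.eye(n)], [phi2, phi1]])

def alpha_sup(Theta):
    """Supremum of alpha in (27), assuming the spectral-radius closed form."""
    rho = max(abs(np.linalg.eigvals(Theta)))
    return 0.5 * (1.0 - rho ** 2)

def optimal_tau(A, B, K, h, delta):
    """Grid search over 0 <= tau < h with step delta (steps 1-11)."""
    tau_star, alpha_max = 0.0, 0.0
    for i in range(int(np.ceil(h / delta))):
        tau = i * delta  # integer index avoids accumulating rounding error
        if tau >= h:
            break
        alpha = alpha_sup(theta(A, B, K, h, tau))
        if alpha_max <= alpha:
            tau_star, alpha_max = tau, alpha
    return tau_star, alpha_max

# Plant data of Example 1 (Eq. (28)); the step size 0.01 is an arbitrary choice.
A = np.array([[0.0, 1.0], [0.0, -0.1]])
B = np.array([[0.0], [0.1]])
K = -np.array([[3.75, 11.5]])
tau_star, alpha_max = optimal_tau(A, B, K, h=1.7, delta=0.01)
print(f"tau* = {tau_star:.3f} s, alpha_max = {alpha_max:.4f}")
```

A negative value returned by `alpha_sup` means that no α > 0 satisfies (24) for that pair (h, τ), i.e. the sampled closed loop is not stable there.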
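Example 1 Consider the LTI sampled-data system with

$$A = \begin{bmatrix} 0 & 1 \\ 0 & -0.1 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 0.1 \end{bmatrix}, \quad K^{T} = -\begin{bmatrix} 3.75 \\ 11.5 \end{bmatrix} \tag{28}$$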
This simple example is commonly used in the literature [12–17]. The MASP calculated by Theorem 1 and Corollary 1 is listed in Table 1 along with the values obtained in other works. In each column of Table 1, the largest MASP is attained by our results (tied with [16, 17] for τ = 0). It is observed that, for various τ, our results are much less conservative than those obtained in [12–17], especially when τ ≠ 0. Meanwhile, Table 1 shows that the MASP for τ = 0.1, 0.2 and 0.4 is larger than that for τ = 0, which demonstrates the benefit of adjusting the parameter τ from the perspective of the sampling period.

Table 1 MASP for given τ

| Methods | τ = 0 | τ = 0.1 | τ = 0.2 | τ = 0.4 | τ = 0.6 |
|---|---|---|---|---|---|
| [12] | 1.68 | 1.23 | 1.06 | 0.78 | 0.54 |
| [13] | 1.3659 | 1.2232 | 1.0883 | 0.8286 | 0.5770 |
| [15] | 1.7208 | 1.2955 | 1.1328 | 0.8682 | 0.6202 |
| [14] | 1.7208 | 1.4688 | 1.3122 | 1.0459 | 0.7877 |
| [16] | 1.7294 | 1.6266 | 1.4515 | 1.1537 | 0.8617 |
| [17] | 1.7294 | 1.9223 | 2.0850 | 1.7843 | 1.0266 |
| Our results | 1.7294 | 1.9670 | 2.2103 | 2.7139 | 1.6163 |

To elaborate the benefit of adjusting the parameter τ in more detail, the sampling period is set to h = 1.7 s. Using Algorithm 1, we obtain the optimal parameter τopt = 0.1115 s. The dynamics of system (28) with τ = 0 and τ = 0.1115 s are presented in Fig. 1. From this figure, it can be concluded that the closed-loop system has better stability and a faster convergence rate when τ = τopt = 0.1115 s. Moreover, the comparison between state responses under different time delays is presented in Fig. 2, which shows that system (28) with the optimal parameter τ = τopt has the fastest convergence rate. From Figs. 1 and 2, we can preliminarily conclude that our theoretical results for calculating the MASP and the optimal parameter τ are effective and superior.

Fig. 1 Dynamic of system (28) with h = 1.7 s and different τ

Fig. 2 Comparison of system (28) states with different τ
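For the delay-free column of Table 1, the MASP can be cross-checked with a short bisection on the spectral radius of Θ(h) from Eq. (22). This is a sketch under assumptions: the function names are hypothetical, Schur stability of Θ(h) is used in place of the LMI test of Corollary 1, and the bisection presumes a single stability boundary inside the bracket.

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential via scaling-and-squaring with a truncated Taylor series."""
    M = np.asarray(M, dtype=float)
    norm = np.linalg.norm(M, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    X = M / (2 ** squarings)
    result = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms + 1):
        term = term @ X / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def theta(A, B, K, h):
    """Theta(h) = e^{Ah} + int_0^h e^{As} ds B K (tau = 0)."""
    n = A.shape[0]
    aug = np.zeros((2 * n, 2 * n))
    aug[:n, :n] = A
    aug[:n, n:] = np.eye(n)
    E = expm(aug * h)
    return E[:n, :n] + E[:n, n:] @ B @ K

def is_schur(M):
    """True when every eigenvalue lies strictly inside the unit circle."""
    return max(abs(np.linalg.eigvals(M))) < 1.0

def masp_bisection(A, B, K, h_stable, h_unstable, tol=1e-6):
    """Largest stable sampling period inside the bracket, found by bisection."""
    if not is_schur(theta(A, B, K, h_stable)) or is_schur(theta(A, B, K, h_unstable)):
        raise ValueError("bracket must start stable and end unstable")
    lo, hi = h_stable, h_unstable
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if is_schur(theta(A, B, K, mid)):
            lo = mid
        else:
            hi = mid
    return lo

A = np.array([[0.0, 1.0], [0.0, -0.1]])
B = np.array([[0.0], [0.1]])
K = -np.array([[3.75, 11.5]])
print(f"MASP for tau = 0: {masp_bisection(A, B, K, 0.1, 3.0):.4f} s")
```

The bracket [0.1, 3.0] is an arbitrary choice that happens to start stable and end unstable for this plant.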
Fig. 3 Comparison of system (29) states with different τ (three panels: states x1, x2 and x3 versus time in seconds, each for τ = 0 s, 0.0033 s, 0.0066 s and 0.0132 s)
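Example 2 Consider the LTI sampled-data system

$$A = \begin{bmatrix} 0 & 2 & 1 \\ 0 & 1 & 1 \\ -1 & 3 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \tag{29}$$

with feedback matrix

$$K = \begin{bmatrix} -1 & -2 & -1 \\ 2 & 2 & 1 \\ -1 & -5 & -4 \end{bmatrix}$$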
According to Corollary 1, the MASP is 0.66359 s. With the sampling period of system (29) set to 0.65 s, Algorithm 1 gives the optimal parameter τopt = 0.0066 s. Similar to Example 1, the comparison results with different values of τ are presented in Fig. 3. It is observed that the sampled-data system with the optimal parameter τ = τopt has the fastest convergence rate and better stability. Thus, as a summary, we can conclude that our theoretical results for calculating the MASP and the optimal parameter τ are effective and superior.
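As a sanity check on the MASP quoted for Example 2, the delay-free case can be worked by hand. The derivation below is an illustrative sketch, assuming A, B and K of Example 2 are the 3 × 3 matrices with rows (0, 2, 1), (0, 1, 1), (−1, 3, 2) for A, rows (1, 0, 0), (2, 1, 0), (0, 1, 1) for B, and rows (−1, −2, −1), (2, 2, 1), (−1, −5, −4) for K. It uses Schur stability of Θ(h) from Eq. (22) rather than the LMI of Corollary 1.

```latex
\begin{aligned}
BK &= \begin{bmatrix} -1 & -2 & -1 \\ 0 & -2 & -1 \\ 1 & -3 & -3 \end{bmatrix},
\qquad A + BK = -I_3, \\[4pt]
\Theta(h) &= e^{Ah} - \int_0^h e^{As}\,ds\,(I + A)
           = I - \int_0^h e^{As}\,ds
\quad\text{since}\quad \int_0^h e^{As}A\,ds = e^{Ah} - I, \\[4pt]
\lambda_i\big(\Theta(h)\big) &= 1 - \frac{e^{\lambda_i h} - 1}{\lambda_i},
\qquad \lambda_i \in \operatorname{spec}(A):\ \lambda^3 - 3\lambda^2 + 1 = 0
\ \Rightarrow\ \lambda \approx 2.8794,\ 0.6527,\ -0.5321, \\[4pt]
\lambda_{\max}\ \text{reaches } -1 \text{ first:}\quad
h^{*} &= \frac{\ln(1 + 2\lambda_{\max})}{\lambda_{\max}}
       \approx \frac{\ln 6.7588}{2.8794} \approx 0.6636 .
\end{aligned}
```

The value h* ≈ 0.6636 s is close to the reported MASP of 0.66359 s; the small gap is plausibly a solver tolerance, although the source does not say so.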
5 Conclusion

This paper investigates the impact of the parameter τ on sampled-data systems under controller (3). To obtain a solution method for the optimal parameter, we have established a new stability criterion theorem and an optimization algorithm. Based on these developments, we present a less conservative stability criterion. To illustrate the effectiveness and superiority of our theoretical results, we provide two numerical examples. Both theoretical analysis and numerical simulation demonstrate that the convergence rate of sampled-data systems can be accelerated through the deliberate adjustment of the parameter τ.
References

1. Zhang, X.-M., Han, Q.-L.: Event-triggered dynamic output feedback control for networked control systems. IET Control Theor. Appl. 8(4), 226–234 (2014)
2. Li, H., Jing, X., Lam, H.-K., Shi, P.: Fuzzy sampled-data control for uncertain vehicle suspension systems. IEEE Trans. Cybern. 44(7), 1111–1126 (2013)
3. Zhang, C.-K., Jiang, L., Wu, Q., He, Y., Wu, M.: Delay-dependent robust load frequency control for time delay power systems. IEEE Trans. Power Syst. 28(3), 2192–2201 (2013)
4. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114, 1791–1803 (2018)
5. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016)
6. Teng, X., Hwang, W.: Chain registry and load-dependent conformational dynamics of collagen. Biomacromolecules 15(8), 3019–3029 (2014)
7. Kao, C.Y., Bo, L.: Simple stability criteria for systems with time-varying delays. Automatica 40(8), 1429–1434 (2004)
8. Fujioka, H.: A discrete-time approach to stability analysis of systems with aperiodic sample-and-hold devices. IEEE Trans. Autom. Control 54(10), 2440–2445 (2009)
9. Nesic, D., Teel, A.R.: Backstepping on the Euler approximate model for stabilization of sampled-data nonlinear systems. In: Proceedings of the 40th IEEE Conference on Decision and Control, pp. 1737–1742 (2001)
10. Katayama, H.: Design of one sampling period delay stabilizing controllers for nonlinear sampled-data strict-feedback systems. IFAC-PapersOnLine 50(1), 11541–11546 (2017)
11. Chen, J., Fan, Y., Zhang, C., Song, C.: Sampling-based event-triggered and self-triggered control for linear systems. Int. J. Control Autom. Syst. 18(9) (2019)
12. Liu, K., Fridman, E.: Networked-based stabilization via discontinuous Lyapunov functionals. Int. J. Robust Nonl. Control 22(4), 420–436 (2012)
13. Liu, K., Fridman, E.: Wirtinger's inequality and Lyapunov-based sampled-data stabilization. Automatica 48(1), 102–108 (2012)
14. Zhang, C.-K., He, Y., Jiang, L., Wu, M., Wu, Q.: Stability analysis of sampled-data systems considering time delays and its application to electric power markets. J. Franklin Inst. 351(9), 4457–4478 (2014)
15. Zhang, C.-K., Jiang, L., He, Y., Wu, H., Wu, M.: Stability analysis for control systems with aperiodically sampled data using an augmented Lyapunov functional method. IET Control Theor. Appl. 7(9), 1219–1226 (2013)
16. Zeng, H.-B., Zhai, Z.-L., Xiao, H.-Q., Wang, W.: Stability analysis of sampled-data control systems with constant communication delays. IEEE Access 7, 111–116 (2018)
17. Zeng, H.-B., Zhai, Z.-L., He, Y., Teo, K.-L., Wang, W.: New insights on stability of sampled-data systems with time-delay. Appl. Math. Comput. 374, 1–11 (2020)
18. Fujioka, H.: Stability analysis of systems with aperiodic sample-and-hold devices. Automatica 45(3), 771–775 (2009)
19. Oishi, Y., Fujioka, H.: Stability and stabilization of aperiodic sampled-data control systems using robust linear matrix inequalities. Automatica 46(8), 1327–1333 (2010)
20. Fridman, E., Seuret, A., Richard, J.-P.: Robust sampled-data stabilization of linear systems: an input delay approach. Automatica 40(8), 1441–1446 (2004)
21. Fridman, E.: A refined input delay approach to sampled-data control. Automatica 46(2), 421–427 (2010)
22. Naghshtabrizi, P., Hespanha, J.P., Teel, A.R.: Exponential stability of impulsive systems with application to uncertain sampled-data systems. Syst. Control Lett. 57(5), 378–385 (2008)
23. Dong, S., Zhu, H., Zhong, S., Shi, K., Zeng, Y.: Hybrid control strategy of delayed neural networks and its application to sampled-data systems: an impulsive-based bilateral looped-functional approach. Nonl. Dyn. 105(4), 3211–3223 (2021)
Multi-robot Formation Control Based on Improved Virtual Spring Path Planning Method

Yimei Chen, Minghao Zhang, and Baoquan Li

Abstract To solve the problems of unreasonable motion trajectories, slow formation recovery, and low obstacle avoidance safety in multi-robot systems, a method based on the improved virtual spring (IVS) path planning method and the second-order consensus protocol is proposed. Based on the virtual spring algorithm, the attractive and repulsive functions are optimized, and an adaptive velocity adjustment module for obstacle avoidance is designed to remedy unreasonable motion trajectories and weak obstacle avoidance ability. The second-order consensus protocol is introduced for multi-robot formation, which increases the formation recovery speed and formation stability. On this basis, a priority model among robots is established to address mutual collisions and the low safety of obstacle avoidance among robots. Finally, the effectiveness and safety of the proposed method are verified by simulation experiments.

Keywords Multi-robot · Virtual spring · Formation control · Priority model · Collision avoidance
1 Introduction

With the development of technology, various types of robots have gradually entered our daily life and greatly improved our quality of life. Nowadays, research on single robots mostly concerns path planning, motion control, and related fields [1]. However, a single robot has limitations in many scenarios, so multi-robot collaboration is used to accomplish tasks that are difficult for a single robot. Meanwhile, a robot formation has strong anti-jamming capability and robustness at a relatively low cost. Multi-robot formation control plays an indispensable role in military, industrial, medical, and exploration fields and is highly relevant [2].
Y. Chen · M. Zhang (B) · B. Li School of Control Science and Engineering, Tiangong University, Tianjin 300387, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_27
Robot formation requires multiple robots to maintain a given geometric formation while moving to the target, and to form a stable rigid-body formation again after avoiding obstacles [3]. It is therefore important to design an effective and reasonable control method for multi-robot formation. Current formation control methods are broadly classified into the leader-follower method [4, 5], the virtual structure method [6, 7], the behavior-based method [8, 9], and the consensus protocol method [10, 11]. Gao et al. [12] proposed a local path planning algorithm for mobile robots based on a virtual spring model, which effectively solved the problem of local minima in traditional artificial potential fields. However, this method was only applied to a single robot and was not implemented for robot formations. Pan et al. [13] proposed an effective virtual spring method for multi-robot formation control, established an interaction dynamics model describing the logical and physical topology of the network, and designed the virtual spring path planning and formation control law based on the network model. However, this method was slow in formation recovery and error convergence. Pan et al. [14] proposed an improved artificial potential field and PID adaptive tracking control method for the multi-robot collision avoidance problem, which planned the optimal path and reduced local minima. However, this method treated the robots as particles and did not consider actual conditions, so it was not applicable to real robot formations. Li et al. [15] proposed a multi-robot motion planning method based on consensus formation. This method addressed conflicting motion trajectories caused by an increasing number of robots in a limited environment, proposed a static obstacle formation model based on spring force mapping, and established a dynamic velocity limitation function to ensure the efficiency of formation avoidance. However, the paths planned by this method were tortuous and too close to each other, which may cause collisions when applied to real robots.

In conclusion, to solve the problems of the above algorithms, this paper proposes a method based on the improved virtual spring path planning method and the second-order consensus protocol. The attractive and repulsive functions are optimized, an adaptive velocity adjustment module for obstacle avoidance is designed, the second-order consensus protocol is introduced and modified, and a priority model among robots in the obstacle avoidance process is established. Compared with some existing algorithms, this method ensures not only the safety of multi-robot obstacle avoidance but also fast formation recovery, a stable formation, and smooth trajectories.
2 Mathematical Model

2.1 Multi-robot Model

For the multi-robot model in this paper, we apply the idea of the leader-follower model: one robot is selected as the leader with the highest rank, the interaction information flows from the leader to the followers, and the multi-robot model fails if the information flow from the leader to the followers is interrupted. The leader-follower model is shown in Fig. 1.

Fig. 1 Leader-follower formation model

The discrete-time mathematical model of robot $R_i$ can be abstracted as

$$\begin{cases} x_{k+1} = x_k + T_c v_k \cos\theta_k \\ y_{k+1} = y_k + T_c v_k \sin\theta_k \\ \theta_{k+1} = \theta_k + T_c \omega_k \end{cases} \tag{1}$$

where $[x_k, y_k, \theta_k]$ represents the posture at time step k, $v_k$ and $\omega_k$ represent the linear and angular velocities at time step k, and $T_c$ denotes the sampling period. In the formation model, L denotes the distance from the center of the leader to the follower, and $\alpha_n$ is the relative direction of the leader and the n-th follower, $\alpha_n \equiv \theta_{fn} - \theta_l$, n = 1, 2, 3, ….
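The discrete-time model (1) can be sketched as a single update function. This is a minimal illustration; the function names and the sample control values are hypothetical and not taken from the paper.

```python
import math

def step(pose, v, omega, tc):
    """One update of model (1): pose = (x, y, theta), tc = sampling period T_c."""
    x, y, theta = pose
    return (x + tc * v * math.cos(theta),
            y + tc * v * math.sin(theta),
            theta + tc * omega)

def rollout(pose, controls, tc):
    """Apply a sequence of (v, omega) pairs and return every visited pose."""
    path = [pose]
    for v, omega in controls:
        pose = step(pose, v, omega, tc)
        path.append(pose)
    return path

# Hypothetical usage: constant speed and turn rate trace an arc.
path = rollout((0.0, 0.0, 0.0), [(0.5, 0.5)] * 20, tc=0.05)
print(path[-1])
```

Because the heading is held constant within each step, the arc is only approximated; a smaller `tc` tightens the approximation.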
3 Multi-robot Formation Obstacle Avoidance Control Strategy

3.1 Control Objectives

In this paper, we consider the following task: N two-wheel differential drive robots move in formation following their leader while avoiding any collisions among the robots; during obstacle avoidance, the formation is kept unbroken as far as possible; and the formation is quickly resumed after obstacle avoidance is complete.
3.2 The Proposed Method

3.2.1 Improved Virtual Spring Path Planning Method for Obstacle Avoidance
The improved virtual spring path planning method is a local path planning method based on sensor information; it is referred to as the IVS algorithm in the following. Compared with the artificial potential field method, this algorithm has low computational requirements and takes the shape of obstacles into account more fully, so it can handle complex and diverse obstacles in unknown environments with better obstacle avoidance and higher safety. The virtual spring method employs a virtual spring model. The algorithm assumes that the robot, the obstacle and the target are connected by virtual tension springs whose natural length is small enough to be negligible. The algorithm redefines the repulsive and attractive forces. When the robot moves, the target generates an attractive force $F_{att}$, which is proportional to the distance between the robot and the target and points toward the target. The improved attractive elastic force of the target is defined by

$$F_{att}(q) = \begin{cases} k_g \cdot f(q - q_t)/S, & f(q - q_t) > 0 \\ 0, & f(q - q_t) = 0 \end{cases} \tag{2}$$

where $k_g$ is the positive scale factor of the target virtual spring, $f(q - q_t)$ is the Euclidean distance between the robot and the target, $q = (x, y)^T$ is the robot coordinate, and S is the scaling ratio. If an obstacle is irregularly shaped, it is treated as the smallest circle that contains it, and we assume that many virtual springs of natural length are uniformly distributed around the obstacles. When the robot is far from an obstacle, the obstacle should not affect the robot; when the robot is close enough, the obstacle generates a repulsive force $F_{rep}$ that depends on the distance between the robot and the obstacle and points away from the obstacle. The improved repulsive elastic force of the obstacles is defined by

$$F_{rep}(q) = \begin{cases} k_r \cdot |d_r - f(q - q_o)|, & 0 < f(q - q_o) \le d_r \\ 0, & f(q - q_o) > d_r \end{cases} \tag{3}$$

$$k_r = \frac{f(q - q_o)}{S} \tag{4}$$

where $d_r$ is the maximum obstacle detection distance of the robot, $k_r$ is the spring damping factor, and $f(q - q_o)$ is the Euclidean distance between the robot and the obstacle. F is the combined force of $F_{att}$ and $F_{rep}$, and the robot moves in the direction of the combined force until it reaches the target. The force analysis of the robot under the IVS algorithm is shown in Fig. 2.
Fig. 2 Force analysis diagram of robot
The combined force on the robot can be expressed as

$$F(q) = \begin{cases} F_{att}(q) + \sum_{i=1}^{n} F_{rep}(q), & f(q - q_t) > 0 \\ 0, & f(q - q_t) > d_r \end{cases} \tag{5}$$

where n is the number of obstacles that influence the robot.
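The force model of Eqs. (2)–(5) can be sketched as follows. This is an illustration under assumptions rather than the authors' code: forces are treated as 2-D vectors, the attractive force points from the robot to the target, each repulsive force points from the obstacle center to the robot, distances are measured to obstacle centers, and the combined force is taken as zero once the robot is at the target. All function names and the sample values are hypothetical.

```python
import math

def attractive_force(q, q_t, k_g, s):
    """Eq. (2): magnitude k_g * distance / S, directed toward the target."""
    dx, dy = q_t[0] - q[0], q_t[1] - q[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    mag = k_g * dist / s
    return (mag * dx / dist, mag * dy / dist)

def repulsive_force(q, q_o, d_r, s):
    """Eqs. (3)-(4): magnitude (distance / S) * |d_r - distance| inside range d_r."""
    dx, dy = q[0] - q_o[0], q[1] - q_o[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist > d_r:
        return (0.0, 0.0)
    k_r = dist / s
    mag = k_r * abs(d_r - dist)
    return (mag * dx / dist, mag * dy / dist)

def combined_force(q, q_t, obstacles, k_g, d_r, s):
    """Eq. (5): attractive force plus the repulsive forces of all obstacles."""
    if math.hypot(q_t[0] - q[0], q_t[1] - q[1]) == 0.0:
        return (0.0, 0.0)  # assumed reading: no force once the target is reached
    fx, fy = attractive_force(q, q_t, k_g, s)
    for q_o in obstacles:
        rx, ry = repulsive_force(q, q_o, d_r, s)
        fx, fy = fx + rx, fy + ry
    return (fx, fy)

# Hypothetical values: one obstacle inside the detection range, one outside.
print(combined_force((0.0, 0.0), (10.0, 0.0), [(0.5, 0.0), (3.0, 0.0)],
                     k_g=1.0, d_r=1.0, s=1.0))
```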
3.2.2 Obstacle Avoidance Mode Switching
However, in some collision conflicts, the robot fails to avoid an obstacle because its velocity is too high. To solve this problem, this paper designs an adaptive speed adjustment module. The threshold interval is set to [0.4, 1] m, the obstacle radius to 0.4 m, and the maximum obstacle recognition radius to 1 m. When the robot enters the obstacle recognition area, velocity reduction and obstacle avoidance are achieved by limiting the maximum velocity of the robot, which avoids the above problem. The limit is given by

$$f(V_{\max}) = \begin{cases} 0.27, & 0.4 < f(q - q_o) < 0.7 \\ 2.5a \cdot f(q - q_o) + b, & 0.7 \le f(q - q_o) < 1 \\ 0.57, & f(q - q_o) \ge 1 \end{cases} \tag{6}$$
According to a large number of simulations and experimental data analysis, when a = 0.4 and b = −0.43, the robot can safely avoid obstacles and has a smooth trajectory. With these values the limit in (6) is continuous: it equals 0.27 m/s at a distance of 0.7 m and reaches 0.57 m/s at 1 m. The IVS algorithm works well in single-robot path planning. However, the IVS formation algorithm suffers from slow formation recovery and only moderate formation stability, so in this paper we fuse the consensus protocol with the IVS algorithm to compensate for these defects.
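The speed limit of Eq. (6) with a = 0.4 and b = −0.43 can be sketched as below. Two points are assumptions of this sketch and not stated in the source: distances at or below the obstacle radius of 0.4 m are given the lowest cap, and the cap is applied by rescaling the velocity vector. The function names are hypothetical.

```python
import math

def max_velocity(dist, a=0.4, b=-0.43):
    """Eq. (6): maximum speed in m/s as a function of obstacle distance in m."""
    if dist < 0.7:
        return 0.27  # also used for dist <= 0.4, which Eq. (6) leaves undefined
    if dist < 1.0:
        return 2.5 * a * dist + b
    return 0.57

def limit_speed(vx, vy, dist):
    """Rescale (vx, vy) so that its norm does not exceed max_velocity(dist)."""
    speed = math.hypot(vx, vy)
    cap = max_velocity(dist)
    if speed <= cap:
        return (vx, vy)
    scale = cap / speed
    return (vx * scale, vy * scale)

for d in (0.5, 0.7, 0.85, 1.0, 2.0):
    print(d, round(max_velocity(d), 3))
```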
3.3 Multi-robot Formation Control

The consensus protocol is an excellent formation strategy. Compared with traditional formation methods, it offers high stability, fast convergence, and quick formation recovery, so it is widely used. Multi-robot systems are modeled as a set of dynamic systems, and directed graphs are used to model the information interaction among robots. To describe this network mathematically, we use graph theory. For the set of n robots $R = \{R_1, R_2, \ldots, R_n\}$, we use the digraph $G = (N, E)$ to model the information interaction among the n robots, where $N = \{1, 2, \ldots, n\}$ is a finite non-empty set of nodes and $E \subseteq N \times N$ is a set of ordered pairs of nodes, called edges. An edge $(i, j) \in E$ means that node $N_j$ can obtain and use information from node $N_i$; $N_i$ is defined as the parent node and $N_j$ as the child node. To describe the connection between nodes and edges, the adjacency matrix A is introduced. The adjacency matrix $A = A(G) = [a_{ij}]$ is an N × N matrix (with N = n the number of robots) representing the information connectivity between nodes (among robots and their neighbors), which is defined by

$$a_{ij} = \begin{cases} 1, & (i, j) \in E \\ 0, & \text{otherwise} \end{cases} \tag{7}$$

The Laplacian matrix $L = [L_{ij}]_{n \times n}$ is defined by

$$L_{ii} = -\sum_{j=1,\, j \neq i}^{N} L_{ij}, \qquad L_{ij} = -a_{ij}, \quad i \neq j \tag{8}$$

which ensures the diffusion property $\sum_{j=1}^{N} L_{ij} = 0$. The second-order continuous dynamical system model is considered as follows:

$$\begin{cases} \dot{x}_i(t) = v_i(t) \\ \dot{v}_i(t) = u_i(t) \end{cases} \qquad i \in N \tag{9}$$

where $x_i(t)$ and $v_i(t)$ denote the position and velocity of robot i at time t, $u_i(t)$ denotes the control input, and n is the number of robots. The system is said to achieve second-order consensus if, for any initial condition,

$$\lim_{t \to \infty} \|x_i(t) - x_j(t)\| = 0, \qquad \lim_{t \to \infty} \|v_i(t) - v_j(t)\| = 0 \tag{10}$$
For the second-order dynamical system, consider the second-order consensus protocol

$$\begin{cases} \dot{x}_i(t) = v_i(t) \\ \dot{v}_i(t) = \alpha \sum_{j=1,\, j \neq i}^{N} a_{ij}\,[x_j(t) - x_i(t)] + \beta \sum_{j=1,\, j \neq i}^{N} a_{ij}\,[v_j(t) - v_i(t)] \end{cases} \tag{11}$$

where $x_i(t)$ and $v_i(t)$ are the position and velocity states of the i-th robot, respectively, and the control gains α and β are both positive. Combining with Eq. (8) gives

$$\begin{cases} \dot{x}_i(t) = v_i(t) \\ \dot{v}_i(t) = -\sum_{j=1,\, j \neq i}^{N} L_{ij}\,\{\alpha[x_j(t) - x_i(t)] + \beta[v_j(t) - v_i(t)]\} \end{cases} \tag{12}$$

Let $x = (x_1^T, x_2^T, \ldots, x_n^T)^T$, $v = (v_1^T, v_2^T, \ldots, v_n^T)^T$ and $y = (x^T, v^T)^T$. Equation (12) can be written in matrix form as

$$\begin{bmatrix} \dot{x}(t) \\ \dot{v}(t) \end{bmatrix} = \begin{bmatrix} 0_{n \times n} & I_n \\ -\alpha L & -\beta L \end{bmatrix} \begin{bmatrix} x(t) \\ v(t) \end{bmatrix} \tag{13}$$

where L is the Laplacian matrix and $I_n$ is the n-th order identity matrix. The conditions under which second-order consensus is achieved in the multi-robot system (13) are fully elaborated and proved in [16]; this paper only gives a brief explanation.
Lemma 1 The multi-robot system (13) achieves second-order consensus if and only if the system matrix in (13) has a zero eigenvalue of algebraic multiplicity two and all other eigenvalues have negative real parts.

Theorem 1 Second-order consensus in the multi-robot system (13) can be achieved if and only if the network contains a directed spanning tree and

$$\frac{\beta^2}{\alpha} > \max_{2 \le i \le N} \frac{\Im^2(\mu_i)}{\Re(\mu_i)\,[\Re^2(\mu_i) + \Im^2(\mu_i)]} \tag{14}$$

where $\mu_i$, i = 2, 3, …, N, are the nonzero eigenvalues of the Laplacian matrix L, and $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts.
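Condition (14) can be checked numerically from the Laplacian, and cross-checked against the eigenvalues of the system matrix in (13). The sketch below is illustrative only: the directed three-node cycle is a hypothetical topology chosen because its Laplacian has complex eigenvalues, the function names are not from the paper, and the ratio in (14) is read as Im²(μ) / (Re(μ)·|μ|²).

```python
import numpy as np

def laplacian(adj):
    """Eq. (8): L = D - A with D the diagonal matrix of row sums of A."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj

def gain_condition(L, alpha, beta, tol=1e-9):
    """Single zero eigenvalue of L plus inequality (14) on the remaining ones."""
    mu = np.linalg.eigvals(L)
    nonzero = mu[abs(mu) > tol]
    if len(mu) - len(nonzero) != 1:
        return False  # more than one zero eigenvalue: no spanning tree
    bound = max(m.imag ** 2 / (m.real * abs(m) ** 2) for m in nonzero)
    return bool(beta ** 2 / alpha > bound)

def closed_loop_check(L, alpha, beta, tol=1e-6):
    """True when every nonzero eigenvalue of the matrix in (13) has Re < 0."""
    n = L.shape[0]
    M = np.block([[np.zeros((n, n)), np.eye(n)], [-alpha * L, -beta * L]])
    eig = np.linalg.eigvals(M)
    nonzero = eig[abs(eig) > tol]
    return bool(np.all(nonzero.real < 0))

# Hypothetical directed cycle 1 -> 2 -> 3 -> 1; mu = 0 and 1.5 +/- 0.866j.
L = laplacian([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
for beta in (0.3, 0.5):
    print(beta, gain_condition(L, 1.0, beta), closed_loop_check(L, 1.0, beta))
```

For a topology whose Laplacian has only real eigenvalues, for example an undirected connected graph, the right-hand side of (14) is zero and any positive gains satisfy the condition.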
In this paper, a variant of algorithm (12) is used to guarantee $d_i - d_j \to \delta_{ij}$ and $v_i \to v_j$, where $\delta_{ij}$ denotes the desired distance between robots i and j. Let $\delta_i$ be a constant; the control input is modified as

$$\begin{cases} \dot{x}_i(t) = v_i(t) \\ \dot{v}_i(t) = -\sum_{j=1,\, j \neq i}^{N} L_{ij}\,\{\alpha[(d_j(t) - d_i(t)) - (\delta_j - \delta_i)] + \beta[v_j(t) - v_i(t)]\} \end{cases} \tag{15}$$

where $d_i - \delta_i$ and $v_i$ satisfy Eq. (12), with $d_i - \delta_i$ playing the role of $x_i$ in algorithm (12). If Theorem 1 is satisfied, then $d_i - \delta_i \to d_j - \delta_j$ and $v_i \to v_j$. Therefore, to avoid collisions among robots, an appropriate $\delta_i$ can be set to ensure the ideal robot spacing. When the multi-robot system satisfies Theorem 1 and tends to be asymptotically stable, the velocity model of the robot can be expressed as

$$\begin{cases} v_x(t+1) = \gamma \cdot d(t) \cdot \cos\varphi \\ v_y(t+1) = \gamma \cdot d(t) \cdot \sin\varphi \end{cases} \tag{16}$$

where d(t) is given by

$$d(t) = \sqrt{u_x(t)^2 + u_y(t)^2} \tag{17}$$

Here $u_x(t)$ and $u_y(t)$ are the control inputs in the x- and y-directions, respectively, φ is the angle between the robot and the x-axis of the world coordinate system, and γ is the adjustment factor.
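The formation protocol can be illustrated with a short forward-Euler simulation. This is a sketch under assumptions and not the authors' simulation: the undirected ring topology, the offsets and the step size are hypothetical, the coupling is written in the sign convention of the matrix form (13), i.e. u = −αL(x − δ) − βLv, and the gains α = 1 and β = 0.3 and the initial positions are the values quoted in the simulation section.

```python
import numpy as np

def laplacian(adj):
    """L = D - A with D the diagonal matrix of row sums of A."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj

def simulate(adj, x0, v0, offsets, alpha=1.0, beta=0.3, dt=0.01, steps=20000):
    """Forward-Euler integration of the double integrators under the protocol."""
    L = laplacian(adj)
    x = np.array(x0, dtype=float)
    v = np.array(v0, dtype=float)
    delta = np.array(offsets, dtype=float)
    for _ in range(steps):
        # Consensus on (x - delta) and on v, as in the matrix form (13)
        u = -alpha * (L @ (x - delta)) - beta * (L @ v)
        x, v = x + dt * v, v + dt * u
    return x, v

# Hypothetical undirected ring of five robots with a wedge-shaped formation.
ring = [[0, 1, 0, 0, 1],
        [1, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 0, 1, 0, 1],
        [1, 0, 0, 1, 0]]
x0 = [[2, 2], [1, 3], [0, 4], [1, 1], [1, 1]]
offsets = [[0, 0], [-1, 1], [-2, 2], [-1, -1], [-2, -2]]
x, v = simulate(ring, x0, np.zeros((5, 2)), offsets)
print(np.round(x - np.array(offsets, dtype=float), 4))
```

After the transient, every row of `x - offsets` is the same point, which means the robots hold the prescribed relative offsets; with zero initial velocities and a symmetric topology the common point is the average of the initial values of `x - offsets`.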
3.4 Priority Model among Robots

In the process of multi-robot obstacle avoidance, it is necessary to avoid not only obstacles in the environment but also collisions among robots. Usually, two cases need to be considered: robot obstacle avoidance within the same communication layer, and robot obstacle avoidance across different communication layers. To cope with collisions among robots, we regard the other robots as moving obstacles, and to prevent conflicts among robots, this paper designs a priority model for robot obstacle avoidance in which the robot with the higher priority passes first. In the system, we set a fixed name $R_i$ (i = 1, 2, …, n) for each robot, and the leader has the highest pass priority $C_1$. For the followers, the pass priority is $C_i$ (i = 2, 3, …, n). If there is no formation control among the robots, the priority model is

$$C_i = R_i, \quad (i = 1, 2, \ldots, n) \tag{18}$$

If the system is under formation control, the priority model is composed of $C_i$, $C_i^d$ and $C_i^p$, and is given by

$$C_i = \begin{cases} C_i^d, & (C_i^p = C_j^p) \\ C_i^p, & (C_i^p \neq C_j^p) \\ R_i, & (C_i^p = C_j^p,\ C_i^d = C_j^d) \end{cases} \tag{19}$$

where $C_i^d$ represents the distance between follower i and the leader, and $C_i^p$ represents the number of communication layers between follower i and the leader. When $C_i^p = C_j^p$, the comparison of $C_i$ and $C_j$ depends on the comparison of $C_i^d$ and $C_j^d$; when $C_i^p \neq C_j^p$, the comparison of $C_i$ and $C_j$ depends on the comparison of $C_i^p$ and $C_j^p$; in the remaining case, the comparison of $C_i$ and $C_j$ depends on the comparison of $R_i$ and $R_j$.

In the obstacle avoidance process, combining with Eq. (16), the velocity models of the leader and the follower can be expressed, respectively, as

$$\begin{cases} V_{x\text{-}l} = \eta \cdot F_{att} \cdot \cos\phi + \omega \cdot F_{rep} \\ V_{y\text{-}l} = \eta \cdot F_{att} \cdot \sin\phi + \omega \cdot F_{rep} \end{cases} \tag{20}$$

where η and ω are the consistency adjustment factors and ϕ is the angle between the robot's motion direction and the global coordinate system, and

$$\begin{cases} V_{x\text{-}f} = V_x + \delta \cdot F_{rep} \\ V_{y\text{-}f} = V_y + \delta \cdot F_{rep} \end{cases} \tag{21}$$

where δ is the system gain coefficient.
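The comparison rule of the priority model (19) can be sketched as a lexicographic ordering. The source does not state which direction of each comparison wins, so this sketch assumes that fewer communication layers, then a shorter distance to the leader, then a smaller robot index mean higher priority. The `Robot` class, its field names and the sample values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Robot:
    index: int       # fixed name R_i (1 is the leader)
    layers: int      # C_i^p: communication layers between robot and leader
    distance: float  # C_i^d: distance between robot and leader

def priority_key(robot):
    """Compare layers first, then distance, then the fixed name, as in (19)."""
    return (robot.layers, robot.distance, robot.index)

def has_priority(a, b):
    """True when robot a passes before robot b (assumed: smaller key wins)."""
    return priority_key(a) < priority_key(b)

def passing_order(robots):
    """Robots sorted from highest to lowest pass priority."""
    return sorted(robots, key=priority_key)

team = [
    Robot(index=4, layers=2, distance=0.5),
    Robot(index=3, layers=1, distance=1.0),
    Robot(index=1, layers=0, distance=0.0),  # leader
    Robot(index=5, layers=1, distance=0.8),
    Robot(index=2, layers=1, distance=1.0),
]
print([r.index for r in passing_order(team)])
```

Encoding the three cases as one tuple keeps the ordering total, so any two robots can always be ranked.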
4 Simulation

4.1 Algorithm Implementation

The flow chart of multi-robot formation obstacle avoidance is shown in Fig. 3. When there is no obstacle, the robots advance in the predefined formation using the second-order consensus protocol. When an obstacle is encountered, the leader and the followers avoid the obstacle according to the IVS algorithm and the robot priority model, and then use the second-order consensus protocol to quickly resume the formation and continue to move until all robots reach the target.
Fig. 3 Flow chart of algorithm
4.2 Simulation Experiments The algorithm in this paper has proved its efficiency by comparing with the virtual spring formation algorithm and the consensus protocol based on the artificial potential field. The experimental physical environment is a limited two-dimensional region, the obstacle types are set as random circular and square obstacles,the radius of circular obstacle is 0.4m, and the side length of square obstacle is 0.8 m. In the simulation experiment, the defined particle robots are of the same type with basic navigation, obstacle avoidance, communication and other functions, and the experiments choose four followers and one leader for multi-robot formation obstacle avoidance experiments. Influenced by the robot motion model, the linear velocity of the robot in x and y directions is restricted to 0.57 m/s, the control input gain α = 1 and β = 0.3, and set the ideal spacing among robots δi = 1 m. The initial position of the robot is indicated by a gray circle and the end position is indicated by a blue-green circle. Simulation 1: No dynamic obstacle environment. Random circular obstacles of the same size are set, and the target is set to (25, 2), the initial positions of the leader and follower1–4 are successively (2, 2, 0), (1, 3, 0), (0, 4, 0), (1, 1, 0), (1, 1, 0). The formation is a triangular queue, and the comparison of the trajectory simulation effect is shown in Fig. 4. From Fig. 4a, it can be seen that the formation recovered slowly after the multirobot finished obstacle avoidance. After the robots move to 7 m, the trajectories of the follower3 and the leader overlap, when the robots move to 15 m, the trajectories among robots are too close to each other, and there is a potential danger of collision. In contrast, the trajectories dispersion among robots in Fig. 4b is smooth, and fast formation recovery, which can traverse the dense obstacle environment more efficiently and safely. Simulation 2: Mixed obstacle environment. 
Five static obstacles are set, four of which are circular obstacles of the same size and one is a square obstacle. In addition, set a dynamic obstacle that moves at a uniform speed, the target is 10
Follower1 Follower2 Follower3 Follower4 Leader
8
y(m)
6
4
2
0
-2
-4
0
5
10
15
20
25
x(m)
(a) IVS formation algorithm Fig. 4 Trajectory simulation diagram
(b) Algorithm of this paper
30
346
Y. Chen et al.
20
20
15
15
10
10
y(m)
y(m)
set to (14,14),the initial positions of the leader and follower1–4 are successively (−2.5, −3, 0), (−4.5, −1.5, 0), (−6, −1, 0), (−2, −3.5, 0), (−6, −4, 0). The formation is pentagonal, simulation effect comparison is shown in Figs. 5, 6 and 7. From Fig. 5a, the trajectory of the follower1 is unsmooth after avoiding the first circular obstacle and before avoiding the red obstacle, the trajectory of the follower3 is unsmooth before avoiding the circular obstacle (2, −0.5), and the follower 4 has a radical change in trajectory while avoiding the red obstacle, with collision potential. From Fig. 5b, the algorithm proposed in this paper optimizes the trajectory problems in Fig. 5a, and the improved trajectory is smoother and more secure. From Fig. 6, three fluctuations in the error appear, corresponding to the three stages of obstacle avoidance in the multi-robot system. From Fig. 6a, it can be seen that the peaks of the three fluctuations are around 1.05, 1.3 and 1.95 m, with large fluctuations. As seen in Fig. 6b, the peaks of the three fluctuations are significantly reduced, with peaks around 1, 1.1, and 1.45 m, with smaller fluctuations, and the average peak is lower than the original algorithm. As the formation advances, the position error converges to 0, which proves the stability of the second-order consensus protocol. Both algorithms does not exceed the velocity limit during the robot’s advancement, and the velocity curve showed three fluctuations, corresponding to the robot’s three
Fig. 5 Trajectory simulation diagram: (a) trajectory diagram of the consensus protocol based on the APF; (b) trajectory diagram of the algorithm in this paper. Horizontal axis: x (m). Legend: Follower1, Follower2, Follower3, Follower4, Leader, Attacker.
Fig. 6 Position error simulation diagram: (a) position error of the consensus protocol based on the APF; (b) position error of the algorithm in this paper. Axes: Position error (m) versus Time (s).
Fig. 7 Velocity simulation diagram: (a) velocity of the consensus protocol based on the APF; (b) velocity of the algorithm in this paper. Axes: Velocity (m/s) versus Time (s).
Table 1 Performance comparison between the algorithm in this paper and the original algorithm
Method | Original algorithm | This paper
Trajectory smoothness | Unsmooth | Smooth
Trajectory overlap | High | Low
Peak of position error (m) | 1.95 | 1.45
Velocity smoothness | High volatility | Low volatility
stages of obstacle avoidance. However, Fig. 7a shows that the robot velocity fluctuates widely and jitters severely, which poses a safety hazard. In Fig. 7b, both the velocity fluctuation and the degree of jitter are significantly reduced. The simulation was repeated several times, and the results indicate that the algorithm in this paper outperforms the original algorithm in obstacle avoidance safety, formation recovery and velocity fluctuation. The evaluation indicators are shown in Table 1.
5 Conclusion
To solve the formation and obstacle avoidance problems of multi-robot systems, a formation obstacle avoidance method is proposed that combines an improved virtual spring path planning method and a second-order consensus protocol with the leader-follower model. To improve the safety of multi-robot obstacle avoidance, an obstacle avoidance mode switching model and a priority model among robots are established. Simulation experiments verify the effectiveness of the algorithm: the proposed method completes the formation obstacle avoidance task and improves trajectory smoothness, formation recovery speed and obstacle avoidance safety.
Acknowledgements This research is partly funded by the State Scholarship Fund of China (No. 202109347006) and the National Natural Science Foundation of China (No. 61973234).
References
1. Liang, D., Yin, X., Wang, M.: Research status and development trend of mobile robots. Sci. Technol. Inf. 9, 33–37 (2014). https://doi.org/10.3969/j.issn.1001-9960.2014.09.025
2. Yang, T., Liu, Z., Chen, H., Pei, R.: Current situation and problems of mobile robot formation control. J. Intell. Syst. 2(4), 21–27 (2007)
3. Fu, L., Qin, Y., He, D., Liu, Z.: Obstacle avoidance of multi robot formation based on improved artificial potential field method. Control Eng. 29(3), 388–396 (2022)
4. Latif, M.: Leader-follower formation tracking of multiple mobile robots with constant leader velocity. J. Phys. Conf. Ser. 1569(3), 032084 (2020)
5. Liang, X., Wang, H., Liu, Y.H., Chen, W., Liu, T.: Formation control of nonholonomic mobile robots without position and velocity measurements. IEEE Trans. Robot. 34(2), 434–446 (2017)
6. Zhen, Q., Wan, L., Li, Y., Jiang, D.: Formation control of a multi-AUVs system based on virtual structure and artificial potential field on SE(3). Ocean Eng. 253, 111148 (2022)
7. Dong, L., Chen, Y., Qu, X.: Formation control strategy for nonholonomic intelligent vehicles based on virtual structure and consensus approach. Procedia Eng. 137, 415–424 (2016). https://doi.org/10.1016/j.proeng.2016.01.276
8. Balch, T., Arkin, R.C.: Behavior-based formation control for multirobot teams. IEEE Trans. Robot. Autom. 14(6), 926–939 (1998)
9. Mohamadi, Y., Konukseven, E.İ., Koku, A.B.: Behavior-based approach for cooperative control of a haptic-driven mobile robot. In: 23rd International Conference on Mechatronics Technology. IEEE Press, Salerno, Italy (2019). https://doi.org/10.1109/ICMECT.2019.8932156
10. Ren, W.: Consensus based formation control strategies for multi-vehicle systems. In: American Control Conference. IEEE Press, Minnesota, USA (2006). https://doi.org/10.1109/ACC.2006.1657384
11. Zhang, Y., Jiang, Y., Dai, J.: Dynamic obstacle avoidance control of three-order multi-robot cooperative formation. J. Syst. Simul. 34(8), 1762–1774 (2022)
12. Gao, S., Xu, F., Guo, H.: Research on mobile robots’ path planning based on a spring model. Chin. J. Sci. Instrum. 37(4), 796–803 (2016)
13. Pan, Z., Wang, D., Deng, H., Li, K.: A virtual spring method for the multi-robot path planning and formation control. Int. J. Control Autom. Syst. 17(5), 1272–1282 (2019)
14. Pan, Z., Li, D., Yang, K., Deng, H.: Multi-robot obstacle avoidance based on the improved artificial potential field and PID adaptive tracking control algorithm. Robotica 37(11), 1883–1903 (2019)
15. Li, X., He, J., Zhao, Z., Yan, D., Wang, X.: Multi-robot formation motion planning method based on improved consistency model. J. Xi’an Eng. Univ. 35(3), 44–53 (2021)
16. Yu, W., Chen, G., Cao, M.: Some necessary and sufficient conditions for second-order consensus in multi-agent dynamical systems. Automatica 46(6), 1089–1095 (2010)
LQR-Based Adaptive Optimal Control for Aircraft Engine Jinsong Zhao, Yan Lin, and Lin Li
Abstract A linear quadratic regulator (LQR) based adaptive optimal control method for aircraft engines is presented. First, an LQR controller is designed for the nominal system; based on it, an adaptive controller is designed using the back-stepping technique. With the proposed method, system uncertainties and external disturbances can be compensated, and the actual controller converges as closely as possible to the LQR controller. Finally, a turbofan engine simulation illustrates the effectiveness of the proposed method.
Keywords Aircraft engine · Adaptive control · LQR control
J. Zhao · L. Li: School of Energy and Power Engineering, Beihang University, Beijing 100191, China
Y. Lin (corresponding author): College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_28

1 Introduction
Aircraft engines are complex, nonlinear and strongly coupled systems [1]. Their aerodynamic and thermodynamic processes change significantly with working conditions and states. With growing demands on aircraft engine safety, reliability and performance, research on advanced control technology for aircraft engines is of great significance. Aircraft engine control has long been actively studied, and many techniques have been applied to it, such as robust control [2, 3], intelligent control [4], sliding mode control [5–7] and optimal control [8–10]. Among these methods, optimal control is well suited to aircraft engines because of its mature theory and successful applications. Optimal control is a crucial component of modern control theory, with which the designer can seek appropriate transient and steady-state performance. The fundamental problem of optimal control is to find a control strategy for a system described by differential equations, under certain initial and terminal conditions, that minimizes (or maximizes) a specific performance index, thereby enabling the controlled system to achieve the expected performance [11, 12]. For a linear system whose performance index is the integral of a quadratic function of the control and state variables, the optimal control problem becomes a linear quadratic regulator (LQR) problem. The LQR was applied to design a controller for the F401 turbofan engine in 1973 [8] and for the F100 engine in 1976 [10]. However, unlike some other control methods, classic LQR control cannot cope with system uncertainties and external disturbances [13]. In practice, the linear model of the engine is obtained by linearizing the nonlinear model at certain working points within the flight envelope. Because of the complexity and nonlinearity of the engine, the resulting linear model inevitably contains many uncertainties. Meanwhile, external factors, such as different flight environments and the aging or damage of components, may disturb the system. Fortunately, adaptive control has proven to be a good way to deal with uncertainties and disturbances [14–16]. With the adaptive technique, the uncertainties and disturbances can be estimated and compensated, and the stability of the system can be ensured.
In this paper, to improve the control performance and reliability of the aero-engine, we combine the LQR method with the adaptive method. First, an LQR controller is designed for the nominal system. The actual controller is then designed using the back-stepping method, and on this basis an adaptive law is designed to compensate for uncertainties and disturbances, so that the actual controller converges as closely as possible to the LQR controller.
This paper is organized as follows. In Sect. 2, the control problem and the basic assumption are given. In Sect. 3, the controller is designed and the main result is given. In Sect. 4, a turbofan engine simulation illustrates the effectiveness of the proposed method. Finally, Sect. 5 concludes this paper.
2 Problem Statement
Consider a linearized model of a turbofan engine in the following form

$$\dot{x} = (A + \Delta A)x + (B + \Delta B)u + w, \qquad \dot{u} = u_r, \tag{1}$$

where $x = [\Delta N_f, \Delta N_c]^T$ is the state, $u = [\Delta W_f, \Delta VSV, \Delta VBV]^T$ is the control input, and $w$ represents normalized health parameters; $A$ and $B$ are known matrices with appropriate dimensions, and $\Delta A$ and $\Delta B$ are unknown perturbation matrices. We use $d = \Delta A x + \Delta B u + w$ to represent the generalized uncertainty. The meanings of the symbols in $x$ and $u$ are given in Table 1. The objective of this paper is to design an adaptive optimal controller for the aircraft engine such that all closed-loop signals are bounded and the output $x(t) \to 0$ as $t \to +\infty$.
Table 1 Nomenclature in the model of turbofan engine
Symbol | Nomenclature
$\Delta N_f$ | Fan speed increment
$\Delta N_c$ | Core speed increment
$\Delta W_f$ | Fuel flow increment
$\Delta VSV$ | Variable stator vane increment
$\Delta VBV$ | Variable bleed valve increment
To achieve the control objective, the following assumption is made.
Assumption 1 The generalized uncertainty $d$ and its derivative are bounded, that is, there exists a positive constant $d_M$ such that $\max\{\|d\|, \|\dot{d}\|\} \le d_M$.
3 Controller Design
Consider the following performance criterion

$$J = \frac{1}{2}\int_0^{\infty}\left(x^T Q x + u^T R u\right)dt, \tag{2}$$

where $Q$ and $R$ are weighting matrices. By choosing appropriate $Q$ and $R$, a compromise can be made between the control effort and the dynamic performance of the system. In the absence of disturbances, according to LQR theory, the feedback control law is designed as

$$u_{LQR} = -R^{-1}B^T K x, \tag{3}$$

in which $K$ is the solution of the following Riccati equation

$$A^T K + K A + Q - K B R^{-1} B^T K = 0. \tag{4}$$
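As a hedged illustration of how (3) and (4) fit together, the sketch below solves the scalar case, where the Riccati equation reduces to a quadratic in $k$ and the stabilizing root can be written in closed form. The function name `scalar_lqr` and all numerical values are assumptions chosen for illustration; they are not taken from the paper, whose engine model is multivariable.

```python
import math


def scalar_lqr(a, b, q, r):
    """Scalar counterpart of Eqs. (3)-(4) for x' = a*x + b*u.

    Solves 2*a*k + q - (b*k)**2 / r = 0 for the positive root k and
    returns (k, gain) with the feedback law u = -gain * x.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0 and r > 0")
    k = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    gain = b * k / r  # scalar version of R^{-1} B^T K
    return k, gain


# Hypothetical numbers, used only to exercise the formulas.
a, b, q, r = -3.0, 2.0, 1.0, 10.0
k, gain = scalar_lqr(a, b, q, r)
residual = 2 * a * k + q - (b * k) ** 2 / r  # should vanish
closed_loop_pole = a - b * gain              # should be negative
```

In the matrix case the same equation is usually handed to a numerical Riccati solver rather than solved by hand; the scalar form is shown only because every step can be checked directly.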
Note that the system has unknown disturbances. To ensure the control performance in this case, the following adaptive controller is designed. In what follows, the positive scalars $c_1$, $c_2$, $\kappa$ and $\sigma$ are design parameters and are used without further restatement. Define the error variables as

$$z_1 = x - v, \qquad z_2 = u - \alpha_1, \tag{5}$$
where $v$ satisfies

$$\dot{v} = A v + B u_{LQR}, \qquad v(0) = x(0), \tag{6}$$

and $\alpha_1$ is the virtual control signal to be designed. It can be seen that once $z_1 = \dot{z}_1 = 0$, then $\dot{x} = Ax + Bu_{LQR}$.
Step 1: The time derivative of $z_1$ can be expressed as

$$\dot{z}_1 = \dot{x} - A v - B u_{LQR} = B z_2 + B\alpha_1 + A z_1 - B u_{LQR} + d. \tag{7}$$
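Define the quadratic function

$$V_1 = \frac{1}{2} z_1^T z_1 + \frac{1}{2\kappa}\tilde{d}^T\tilde{d}, \tag{8}$$

where $\tilde{d} = d - \hat{d}$ and $\hat{d}$ is the estimate of $d$. Choose the virtual control signal

$$\alpha_1 = -c_1 B^{\dagger} z_1 - B^{\dagger} A z_1 + u_{LQR} - B^{\dagger}\hat{d}, \tag{9}$$

in which $B$ is assumed to have full row rank and $B^{\dagger} = B^T (B B^T)^{-1}$. Taking (9) into account, the time derivative of $V_1$ is

$$\begin{aligned}
\dot{V}_1 &= z_1^T\left(Bz_2 + B\alpha_1 + Az_1 - Bu_{LQR} + d\right) - \frac{1}{\kappa}\tilde{d}^T\dot{\hat{d}} + \frac{1}{\kappa}\tilde{d}^T\dot{d} \\
&= -c_1 z_1^T z_1 + z_1^T B z_2 - \frac{1}{\kappa}\tilde{d}^T\left(\dot{\hat{d}} - \kappa z_1\right) + \frac{1}{\kappa}\tilde{d}^T\dot{d}.
\end{aligned} \tag{10}$$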
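The right inverse $B^{\dagger} = B^T(BB^T)^{-1}$ used in (9) can be checked numerically. The following sketch, which assumes the $2 \times 3$ input matrix $B$ reported later in the simulation section (Eq. (27)), forms $B^{\dagger}$ with plain Python lists and confirms that $BB^{\dagger}$ is the identity, which is the property that cancels the $B\alpha_1$ term in (10). The helper names are hypothetical.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def inv2(m):
    """Invert a 2x2 matrix; raises if it is numerically singular."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular, B does not have full row rank")
    return [[d / det, -b / det], [-c / det, a / det]]


def right_inverse(b_mat):
    """B_dagger = B^T (B B^T)^{-1} for a 2 x m matrix with full row rank."""
    bt = transpose(b_mat)
    return matmul(bt, inv2(matmul(b_mat, bt)))


# Input matrix B as listed in Eq. (27) of the simulation section.
B = [[667.8408, -39.2134, -14.2485],
     [1333.9594, 117.2730, -26.8107]]
B_dagger = right_inverse(B)        # 3 x 2
identity = matmul(B, B_dagger)     # expected to be close to the 2 x 2 identity
```

Note that $B^{\dagger}B$ is not the identity in general, since $B$ has more columns than rows; only the product $BB^{\dagger}$ is needed in the derivation.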
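Step 2: The time derivative of $z_2$ is

$$\dot{z}_2 = u_r - \dot{\alpha}_1 = u_r - \beta_1 - \beta_2 d, \tag{11}$$

where

$$\beta_1 = -\left(c_1 B^{\dagger} + B^{\dagger}A\right)\left(Bz_2 + B\alpha_1 + Az_1 - Bu_{LQR}\right) - R^{-1}B^T K (Ax + Bu) - B^{\dagger}\dot{\hat{d}}, \tag{12}$$

$$\beta_2 = -\left(c_1 B^{\dagger} + B^{\dagger}A\right) - R^{-1}B^T K. \tag{13}$$

Define the quadratic function

$$V_2 = V_1 + \frac{1}{2} z_2^T z_2. \tag{14}$$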
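Let the adaptive control law $u_r$ and the update law for $\hat{d}$ be designed as

$$u_r = -c_2 z_2 - B^T z_1 + \beta_1 + \beta_2\hat{d}, \tag{15}$$

$$\dot{\hat{d}} = \kappa\left(z_1 - \beta_2^T z_2\right) - \kappa\sigma\hat{d}. \tag{16}$$

Then, by considering (15) and (16), the time derivative of $V_2$ satisfies

$$\begin{aligned}
\dot{V}_2 &= -c_1 z_1^T z_1 - c_2 z_2^T z_2 - \frac{1}{\kappa}\tilde{d}^T\left[\dot{\hat{d}} - \kappa\left(z_1 - \beta_2^T z_2\right)\right] + \frac{1}{\kappa}\tilde{d}^T\dot{d} \\
&\le -c_1 z_1^T z_1 - c_2 z_2^T z_2 - \left(\frac{\sigma}{2} - \frac{1}{2\kappa}\right)\tilde{d}^T\tilde{d} + \left(\frac{\sigma}{2} + \frac{1}{2\kappa}\right)d_M^2.
\end{aligned} \tag{17}$$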
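The inequality in (17) is stated without its intermediate step. One way to obtain it, assuming the standard Young's inequality $a^T b \le \frac{1}{2}\|a\|^2 + \frac{1}{2}\|b\|^2$ together with the bound of Assumption 1, is sketched below; this is a reconstruction of the reasoning and not a derivation quoted from the paper.

```latex
\begin{align*}
-\frac{1}{\kappa}\tilde{d}^T\left[\dot{\hat{d}} - \kappa\left(z_1 - \beta_2^T z_2\right)\right]
  &= \sigma\,\tilde{d}^T\hat{d}
   = \sigma\,\tilde{d}^T\left(d - \tilde{d}\right)
   && \text{by (16) and } \hat{d} = d - \tilde{d} \\
  &\le -\sigma\,\tilde{d}^T\tilde{d} + \frac{\sigma}{2}\tilde{d}^T\tilde{d} + \frac{\sigma}{2}\|d\|^2
   \le -\frac{\sigma}{2}\tilde{d}^T\tilde{d} + \frac{\sigma}{2}d_M^2, \\
\frac{1}{\kappa}\tilde{d}^T\dot{d}
  &\le \frac{1}{2\kappa}\tilde{d}^T\tilde{d} + \frac{1}{2\kappa}\|\dot{d}\|^2
   \le \frac{1}{2\kappa}\tilde{d}^T\tilde{d} + \frac{1}{2\kappa}d_M^2 .
\end{align*}
```

Adding the two bounds gives the coefficient $-\left(\frac{\sigma}{2} - \frac{1}{2\kappa}\right)$ on $\tilde{d}^T\tilde{d}$ and the constant $\left(\frac{\sigma}{2} + \frac{1}{2\kappa}\right)d_M^2$ that appear in (17).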
The main results are summarized in the following theorem.
Theorem 1 Let the linearized turbofan engine model be given by (1) under Assumption 1, and consider the closed-loop system consisting of the engine model (1), the adaptive controller (15) and the update law (16). Then, with appropriately chosen control parameters, all signals of the closed-loop system are bounded, and the errors converge to an arbitrarily small neighborhood of the origin.
Proof Define the overall Lyapunov function as

$$V = V_2 + v^T K v, \tag{18}$$

whose derivative is

$$\dot{V} = \dot{V}_2 + v^T K\left(Av + Bu_{LQR}\right) + \left(Av + Bu_{LQR}\right)^T K v. \tag{19}$$

Since $z_1 = x - v$, it follows that

$$u_{LQR} = -R^{-1}B^T K v - R^{-1}B^T K z_1. \tag{20}$$

Substituting (20) into (19), and in view of (4) and (17), we have

$$\begin{aligned}
\dot{V} &= \dot{V}_2 + v^T\left(A^T K + KA - KBR^{-1}B^T K\right)v - 2z_1^T KBR^{-1}B^T K z_1 \\
&\le -c_1 z_1^T z_1 - c_2 z_2^T z_2 - \left(\frac{\sigma}{2} - \frac{1}{2\kappa}\right)\tilde{d}^T\tilde{d} - v^T Q v + D,
\end{aligned} \tag{21}$$

in which

$$D = \left(\frac{\sigma}{2} + \frac{1}{2\kappa}\right)d_M^2. \tag{22}$$
For a given positive constant $\sigma$, choose the design parameters $c_1$, $c_2$ and $\kappa$ to satisfy

$$c_1 \ge \frac{C}{2}, \quad c_2 \ge \frac{C}{2}, \quad \kappa \ge \frac{C+1}{\sigma}, \quad \frac{\lambda_{\min}(Q)}{\lambda_{\max}(K)} \ge C, \tag{23}$$
where $C > 0$ is a constant. Then, it can be deduced that

$$\dot{V} \le -CV + D. \tag{24}$$

Solving (24) yields

$$0 \le V(t) \le \left(V(0) - \frac{D}{C}\right)e^{-Ct} + \frac{D}{C}. \tag{25}$$

Inequality (25) implies that $V$ is bounded on $[0, +\infty)$. Hence $z_1$, $z_2$, $v$ and $\tilde{d}$, as well as $\hat{d}$, are bounded, and therefore $x$ and $u$ are bounded. From (15), $u_r$ is also bounded. Therefore, all signals of the closed-loop system are bounded. According to (25),

$$\lim_{t\to\infty}|z_i| \le \sqrt{\frac{2D}{C}}, \quad i = 1, 2. \tag{26}$$

Then, the errors $z_1$ and $z_2$ can be made arbitrarily small by properly choosing the design parameters listed in (23).
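To make the dependence of the residual bound on the design parameters concrete, the sketch below evaluates $D$ from (22), the largest $C$ admitted by the conditions in (23), and the bound in (26). The function name `ultimate_bound` is hypothetical. The values $c_1 = c_2 = 5$, $\kappa = 10$ and $\sigma = 0.2$ are those reported in the simulation section, whereas the disturbance bound `d_max = 1.0` and the eigenvalue ratio `eig_ratio = 0.5` (the left-hand side of the last condition in (23)) are assumed values, since the paper does not report them.

```python
import math


def ultimate_bound(c1, c2, kappa, sigma, d_max, eig_ratio):
    """Return (C, D, bound) following Eqs. (22), (23) and (26).

    C is the largest constant compatible with all four conditions in (23);
    eig_ratio stands for the eigenvalue ratio in the last condition.
    """
    big_c = min(2 * c1, 2 * c2, kappa * sigma - 1, eig_ratio)
    if big_c <= 0:
        raise ValueError("conditions (23) cannot be met with C > 0")
    big_d = (sigma / 2 + 1 / (2 * kappa)) * d_max ** 2
    return big_c, big_d, math.sqrt(2 * big_d / big_c)


# d_max and eig_ratio are assumptions; the other values follow the simulation setup.
C, D, bound = ultimate_bound(c1=5, c2=5, kappa=10, sigma=0.2,
                             d_max=1.0, eig_ratio=0.5)
```

With these inputs the binding constraint is the assumed eigenvalue ratio, so increasing $c_1$ or $c_2$ alone would not tighten the bound; a larger $\kappa$ reduces $D$.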
4 Simulation Results
To show the effectiveness of the proposed design method, consider the following linearized model of a two-spool turbofan engine from the NASA Glenn Research Center. Table 2 summarizes the equilibrium values for the example [17].

$$A = \begin{bmatrix} -3.3808 & 1.2954 \\ 0.4444 & -3.0501 \end{bmatrix}, \quad
B = \begin{bmatrix} 667.8408 & -39.2134 & -14.2485 \\ 1333.9594 & 117.2730 & -26.8107 \end{bmatrix}. \tag{27}$$
Table 2 Engine equilibrium values at 1000 ft, Mach 0.10 and $W_f$ = 0.33 lb/s
Variable | Eq. value | Units
Fan speed, $N_f$ | 1375.77 | rpm
Core speed, $N_c$ | 8624.00 | rpm
Fuel flow, $W_f$ | 0.33 | lb/s
Variable stator vane, $VSV$ | −51.40 | °
Variable bleed valve, $VBV$ | 1.00 | %

Fig. 1 Trajectories of $\Delta N_f$ and $\Delta N_c$. Axes: $\Delta N_f$ (rpm) and $\Delta N_c$ (rpm) versus Time (s). Legend: Traditional adaptive control, Adaptive optimal SDRE control.

In the simulation, the initial states are set to $x(0) = [-200, -500]^T$ and $u(0) = [0, 0, 0]^T$. For the convenience of controller design, $x$ and $u$ are normalized by $T_x = \mathrm{diag}\{0.002, 0.001\}$ and $T_u = \mathrm{diag}\{0.2, 0.01, 1\}$, respectively. The weighting matrices are chosen as $Q = \mathrm{diag}\{1, 1\}$ and $R = \mathrm{diag}\{10, 10, 10\}$. The adaptive control design parameters are chosen as $c_1 = c_2 = 5$, $\kappa = 10$ and $\sigma = 0.2$. To verify the robustness of the proposed scheme, a 10% bias in $A$ and $B$ is considered in the simulation. A traditional adaptive controller without LQR, with the same controller parameters as the proposed method, is used for comparison. The simulation results are shown in Figs. 1, 2 and 3. Figure 1 shows the state response trajectories. With the proposed method, the state response curves are smoother and show no overshoot. Figure 2 shows the control inputs, and Fig. 3 shows the histogram of the performance criteria. Figure 3 shows that the performance criteria of the proposed method are smaller than those of the traditional method.
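The per-interval values of the performance criterion plotted in Fig. 3 can be approximated from sampled trajectories. The sketch below applies the trapezoidal rule to the integrand of (2) over a chosen time window, assuming diagonal $Q$ and $R$ as in the simulation setup. The function name `quadratic_cost`, the sampling grid and the constant test signals are assumptions for illustration; the paper does not describe how its histogram was computed.

```python
def quadratic_cost(times, xs, us, q_diag, r_diag, t_start, t_end):
    """Trapezoidal estimate of 0.5 * integral of (x'Qx + u'Ru) on [t_start, t_end].

    times: increasing sample instants; xs, us: state and input samples;
    q_diag, r_diag: diagonal entries of Q and R.
    """
    eps = 1e-12  # tolerance for floating-point window boundaries

    def integrand(x, u):
        return (sum(q * xi * xi for q, xi in zip(q_diag, x))
                + sum(r * ui * ui for r, ui in zip(r_diag, u)))

    total = 0.0
    for k in range(len(times) - 1):
        t0, t1 = times[k], times[k + 1]
        if t0 < t_start - eps or t1 > t_end + eps:
            continue  # keep only sample intervals fully inside the window
        f0 = integrand(xs[k], us[k])
        f1 = integrand(xs[k + 1], us[k + 1])
        total += 0.5 * (f0 + f1) * (t1 - t0)
    return 0.5 * total


# Hypothetical constant signals, chosen so the result can be checked by hand.
sample_times = [0.0, 0.5, 1.0]
sample_xs = [[1.0, 0.0]] * 3
sample_us = [[0.0, 0.0, 0.0]] * 3
j_total = quadratic_cost(sample_times, sample_xs, sample_us,
                         [1, 1], [10, 10, 10], 0.0, 1.0)
j_first_half = quadratic_cost(sample_times, sample_xs, sample_us,
                              [1, 1], [10, 10, 10], 0.0, 0.5)
```

Calling the function once per window (for example 0 to 0.1 s, 0.1 to 0.2 s, and so on) on logged simulation data would give one bar per interval for each controller.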
5 Conclusions
This paper proposed an LQR-based adaptive optimal control method for aircraft engines. The LQR method and the adaptive method are combined to improve the control performance and reliability of the aero-engine. First, an LQR controller is designed for the nominal system; based on it, an adaptive controller is designed using the back-stepping technique. It has been proven that all signals of the closed-loop system are bounded and that system uncertainties and external disturbances can be compensated. Finally, a turbofan engine simulation has shown that the proposed method achieves good transient and steady-state performance.
Fig. 2 Trajectories of $\Delta W_f$, $\Delta VSV$ and $\Delta VBV$. Axes: $\Delta W_f$ (lb/s), $\Delta VSV$ (°) and $\Delta VBV$ (%) versus Time (s). Legend: Traditional adaptive control, Adaptive optimal SDRE control.
Fig. 3 Histogram of the performance criteria $J$, shown for the total and for successive 0.1 s intervals from 0 to 0.5 s. Legend: Traditional method, Proposed method.
Acknowledgements This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 62073197, Grant 61933006, and the Special Funding for Top Talents of Shandong Province.
References
1. Pan, M., Cao, L., Zhou, W., et al.: Robust decentralized control design for aircraft engines: a fractional type. Chin. J. Aeronaut. 32(2), 347–360 (2019)
2. Frederick, D.K., Garg, S., Adibhatla, S.: Turbofan engine control design using robust multivariable control technologies. IEEE Trans. Control Syst. Technol. 8(6), 961–970 (2000). https://doi.org/10.1109/87.880600
3. He, A., Tan, D., Wang, X., et al.: An LMI-based anti-windup design for acceleration and deceleration control of jet engines using the H2/H∞ optimization. In: Proceedings of 47th AIAA/ASME/SAE/ASEE Joint Propulsion Conference and Exhibit (2011)
4. Kulkarni, N.V., Krishnakumar, K.: Intelligent engine control using an adaptive critic. IEEE Trans. Control Syst. Technol. 11(2), 164–173 (2003). https://doi.org/10.1109/tcst.2003.809254
5. Yang, S.B., Wang, X., Wang, H.N., Li, Y.G.: Sliding mode control with system constraints for aircraft engines. ISA Trans. 98(2), 1–10 (2019). https://doi.org/10.1016/j.isatra.2019.08.020
6. Richter, H.: A multi-regulator sliding mode control strategy for output-constrained systems. Automatica 47(10), 2251–2259 (2011). https://doi.org/10.1016/j.automatica.2011.08.003
7. Du, X., Richter, H., Guo, Y.: Multivariable sliding-mode strategy with output constraints for aeroengine propulsion control. J. Guidance Control Dyn. 39(7), 1631–1642 (2016). https://doi.org/10.2514/1.g001802
8. Michael, G.J., Farrar, F.: An analytical method for the synthesis of nonlinear multivariable feedback control. In: Technical Report M941338-2. United Aircraft Research Laboratory (1973)
9. Michael, G.J., Farrar, F.: Development of optimal control modes for advanced technology propulsion systems. In: Technical Report UARL-N911620-2. United Aircraft Research Laboratories, East Hartford, CT, USA (1974)
10. Weinberg, M.S.: A multi-variable control for a turbofan engine operating at sea level static. In: Proceedings of ASME International Gas Turbine Fluids Engineering Conference, pp. 1–10 (1976)
11. Ogata, K.: Modern Control Engineering. Prentice-Hall, Englewood Cliffs, NJ (1998)
12. Levine, W.S.: The Control Handbook. CRC Press, Boca Raton, FL (1996)
13. Olalla, C., Leyva, R., Aroudi, A.E., Queinnec, I.: Robust LQR control for PWM converters: an LMI approach. IEEE Trans. Ind. Electron. 56(7), 2548–2558 (2009). https://doi.org/10.1109/TIE.2009.2017556
14. Ge, S.S., Wang, C.: Adaptive neural control of uncertain MIMO nonlinear systems. IEEE Trans. Neural Netw. 15(3), 674–692 (2004). https://doi.org/10.1109/TNN.2004.826130
15. Li, J., Du, J.L., Sun, Y.Q., Lewis, F.L.: Robust adaptive trajectory tracking control of underactuated autonomous underwater vehicles with prescribed performance. Int. J. Robust Nonlinear Control 29(14), 4629–4643 (2019). https://doi.org/10.1002/rnc.4659
16. Pan, H.H., Sun, W.C., Gao, H.J., Jing, X.J.: Disturbance observer-based adaptive tracking control with actuator saturation and its application. IEEE Trans. Autom. Sci. Eng. 13(2), 868–875 (2016). https://doi.org/10.1109/TASE.2015.2414652
17. Richter, H.: Advanced Control of Turbofan Engines. Springer (2012)
A Terrain Aided Navigation Method Based on Point Cloud Digital Elevation Map Construction Xiaolong Wang, Junzhi Zhu, Rui Chen, and Long Zhao
Abstract Terrain aided navigation systems use barometric altimeters and radio altimeters to measure terrain profile elevation, and their accuracy tends to degrade because of the limitations of terrain distribution characteristics and data quantity. This paper introduces airborne LiDAR as a measurement sensor for terrain aided navigation systems and builds point cloud digital elevation maps, which are compared with a prior map to obtain the optimal matching position and correct the position errors. The proposed method is verified using an actual digital elevation map and real flight data. The results show that the proposed method effectively improves the terrain elevation detection ability and map matching accuracy of the terrain aided navigation system, and ensures the availability of the system under complex terrain conditions.
Keywords Terrain aided navigation · Airborne LiDAR · DEM construction · Map matching
X. Wang · J. Zhu · R. Chen · L. Zhao (corresponding author): School of Automation Science and Electrical Engineering, Digital Navigation Center, Beihang University, Beijing 100191, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_29

1 Introduction
As one of the integrated navigation systems, terrain aided navigation (TAN) has been widely used in aircraft, missiles and submarines because of its strong anti-interference ability, wide versatility and easy operation [1]. The system continuously compares the measured terrain elevation data with a prior digital elevation map (DEM) to find the position with the highest similarity and correct the position errors. Terrain aided navigation methods are mainly divided into two categories, namely terrain contour matching (TERCOM) [2] and Sandia inertial terrain aided navigation (SITAN) [3, 4]. Many improved methods have been proposed on the basis of these two types of algorithms [5–7]. However, most of them rely on a combination of barometric altimeters and radio altimeters, which can only collect the terrain profile elevation along the flight path. The data quantity is limited and susceptible to noise interference, which makes these algorithms perform poorly over terrain with gradual changes in elevation, such as plains.
As an active measurement system, airborne LiDAR offers a larger measurement range and more accurate results than altimeters. With efficient and reliable point cloud data processing algorithms, it is possible to construct large-scale and high-precision DEMs, which can be used for map matching in TAN. For instance, Ref. [8] converts LiDAR points into an elevation model for map matching in order to evaluate the horizontal positioning accuracy of a lunar lander. By combining GPS and radio altimeters, Ref. [9] uses LiDAR to scan the ground and convert the points into elevation estimates. The system aligns them with a prior DEM based on the minimum sum of squared error criterion, ultimately achieving an aircraft landing procedure with minimal drift error. For the application of LiDAR in long-distance terrain aided navigation, Ref. [10] achieved two flight tests with low errors by directly selecting the lowest points within the point cloud grid map to fit the DEM.
The LiDAR-based methods mentioned above employ only simple point cloud data processing, which is susceptible to noise points and can lead to matching failures and divergence of position errors. To address this issue, this paper proposes a terrain aided navigation system based on a point cloud DEM construction method. By performing outlier removal and filtering on the point cloud data while maintaining system efficiency, higher-precision point cloud DEMs are constructed. Position errors are then corrected using the results of a rapid map matching method, which enhances the accuracy and usability of terrain aided navigation under different terrain conditions.
2 Construction Method of Point Cloud DEM
The accuracy of terrain aided navigation largely depends on the quality of the data collected by the system. For point cloud DEM construction, it is crucial to adequately remove non-ground points and accurately fit the remaining points so that the obtained surface closely approximates the actual bare ground. Only then is the established point cloud DEM sufficiently reliable and the map matching result sufficiently accurate.
The framework for constructing a point cloud DEM primarily consists of coordinate transformation, outlier removal, non-ground point filtering and point cloud interpolation. First, each point $p_i^l$ in the original LiDAR point cloud data $p^l$ needs to be transformed from the LiDAR local frame $l$ to the global frame $g$ using the transform chain

$$p_i^g = R_b^g(t)\cdot R_l^b\cdot p_i^l, \tag{1}$$

where $b$ represents the vehicle frame, $R_b^g$ is the transformation matrix from the vehicle frame to the global frame, which is obtained through INS propagation, and $R_l^b$ is the constant transformation matrix from the LiDAR frame to the vehicle frame. To ensure an adequate amount of data, it is necessary to accumulate point cloud data within a certain range based on velocity or time. If the resolution of the point data differs from that of the prior DEM, downsampling needs to be applied to match their resolutions.
Before non-ground point filtering, outliers must be removed to ensure the accuracy of the filtering results and to save computing resources. The flow chart of outlier removal is shown in Fig. 1. In this paper, a KD-tree structure is used to manage the data. Each point in the point cloud is traversed in turn, and neighboring points within a radius $r$ centered on the point are searched. If the outlier condition is met, the point is removed and the KD-tree is updated. The outlier condition is defined as

$$|h_{current} - \mu| > 3\sigma, \tag{2}$$

where $h_{current}$ is the elevation of the currently traversed point, $\mu$ is the mean elevation of the neighboring points, and $\sigma$ is the standard deviation of these elevations.
Fig. 1 Flow chart of outliers removing
For non-ground point filtering, this paper adopts the Cloth Simulation Filter (CSF) method proposed in Ref. [11], which has been widely used in the offline construction of precise DEMs from point cloud data. Compared with the Adaptive TIN algorithm in Ref. [12], the CSF algorithm requires fewer preset parameters, has higher computational efficiency, and does not require interpolation afterwards [13]. The main idea of this method is to simulate a piece of cloth with rigidity and weight falling from above onto the flipped point cloud under the influence of internal and external driving factors. The final position of the cloth after stabilization represents the surface obtained by removing non-ground points and applying interpolation, which corresponds to the point cloud DEM we want to construct.
The point cloud data before and after outlier removal and filtering are compared in Fig. 2. After the outliers are removed from the original point cloud, the overall elevation distribution aligns better with that of ground objects. Furthermore, non-ground points such as trees and buildings in the original point cloud are effectively removed by the CSF algorithm, resulting in an approximate fit to the distribution of bare ground.
Fig. 2 Schematic diagram of intermediate results of the point cloud DEM construction: (a) raw data; (b) outliers removed; (c) CSF result
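The outlier rule in Eq. (2) of this section can be sketched as follows. This is a minimal illustration under several assumptions that the paper does not specify: neighbors are found by a brute-force search over horizontal distance instead of the KD-tree used by the authors, the tested point is excluded from its own neighborhood statistics, and points with fewer than two neighbors are kept. The function name `remove_elevation_outliers` and the synthetic data are hypothetical.

```python
import math
import statistics


def remove_elevation_outliers(points, radius):
    """Drop points whose elevation deviates more than 3 sigma from neighbors.

    points: list of (x, y, h) tuples; radius: horizontal search radius.
    A removed point no longer contributes to later neighborhood statistics,
    mirroring the 'remove and update' step described in the text.
    """
    kept = list(points)
    i = 0
    while i < len(kept):
        x, y, h = kept[i]
        neighbors = [p[2] for j, p in enumerate(kept)
                     if j != i and math.hypot(p[0] - x, p[1] - y) <= radius]
        if len(neighbors) >= 2:
            mu = statistics.fmean(neighbors)
            sigma = statistics.pstdev(neighbors)
            if abs(h - mu) > 3 * sigma:
                del kept[i]  # outlier: remove and re-test the new i-th point
                continue
        i += 1
    return kept


# Synthetic 5 x 5 patch of gently varying ground plus one spurious return.
cloud = [(float(i), float(j), 0.1 * ((i + j) % 3))
         for i in range(5) for j in range(5)]
cloud.append((2.5, 2.5, 50.0))
cleaned = remove_elevation_outliers(cloud, radius=10.0)
```

The brute-force search costs quadratic time in the number of points, which is why a spatial index such as a KD-tree is preferable for real LiDAR scans.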
3 Terrain Aided Navigation Algorithm A terrain aided navigation system is typically a combined system that integrates with an inertial navigation system. Within the framework of integrated navigation, the terrain aided navigation system utilizes the results obtained from map matching to optimize and compensate for position errors of the inertial navigation system, ensuring its accuracy.
3.1 Map Matching Strategy
During the map matching process, terrain aided navigation typically employs a sliding window search to traverse the prior DEM and calculate the similarity between the prior DEM and the point cloud DEM. To save computational resources, as shown in Fig. 3, this paper uses the Kalman filtering process described in Sect. 3.2 to pre-screen the sliding window search range. Based on the position estimate $p_{pred}$ and the position error estimate $\delta p$ from the integrated navigation system at the corresponding time, the prior DEM data within three times the error range is extracted as the map search area for sliding window traversal.
Fig. 3 Schematic diagram of the extraction of the prior DEM searching area (labels: $p_{pred}$, the flight route, map extraction area, sliding window)
In this paper, the similarity between the point cloud DEM and the prior DEM is evaluated with the normalized cross-correlation (NCC) criterion [14]. To ensure the accuracy and reliability of the map matching results, we introduce an NCC threshold to filter them. Only when a result meets the threshold condition does the system obtain sufficiently accurate position correction information for integrated navigation. The NCC calculation model is

$$NCC(i, j) = \frac{1}{N}\sum\frac{\left(I_D(i, j) - \mu_D\right)\left(I_L - \mu_L\right)}{\sigma_D\sigma_L}, \tag{3}$$

where $N$ is the number of DEM grids, $I_D$ and $I_L$ are the elevation values of corresponding points in the prior DEM and the point cloud DEM, respectively, $(i, j)$ is the offset from the sliding window to $p_{pred}$, $\mu_D$ and $\mu_L$ are the mean elevations of the two DEMs, and $\sigma_D$ and $\sigma_L$ are the corresponding standard deviations of the elevations.
Figure 4 illustrates four sets of matching results under the condition $NCC > 0.85$. All DEMs have a resolution of 1.0 m × 1.0 m and a map size of approximately 150 m × 150 m. The top row shows the point cloud DEMs, and the bottom row shows the regions of the prior DEM at the best matching positions. The first three columns depict successful matching results, while the fourth column shows a failed match caused by excessively flat terrain.
Fig. 4 Schematic diagram of map matching results (top row: LiDAR DEM; bottom row: prior DEM)
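The sliding window search with the NCC criterion of Eq. (3) can be sketched as below. Several details are assumptions rather than the authors' implementation: the mean and standard deviation of the prior DEM are computed per window, offsets are counted from the top-left corner of the search area instead of from $p_{pred}$, and a window with zero elevation variance is given a score of 0 so that flat terrain never passes the threshold. The function names and the small integer grids are hypothetical.

```python
import math


def ncc(patch_a, patch_b):
    """Normalized cross-correlation of two equally sized elevation grids."""
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    sd_a = math.sqrt(sum((v - mu_a) ** 2 for v in a) / n)
    sd_b = math.sqrt(sum((v - mu_b) ** 2 for v in b) / n)
    if sd_a == 0 or sd_b == 0:
        return 0.0  # flat terrain carries no matching information
    return sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (n * sd_a * sd_b)


def best_match(prior, lidar_dem, threshold=0.85):
    """Slide lidar_dem over prior; return ((row, col), score) or (None, score)."""
    rows, cols = len(lidar_dem), len(lidar_dem[0])
    best_offset, best_score = None, -1.0
    for i in range(len(prior) - rows + 1):
        for j in range(len(prior[0]) - cols + 1):
            window = [row[j:j + cols] for row in prior[i:i + rows]]
            score = ncc(window, lidar_dem)
            if score > best_score:
                best_offset, best_score = (i, j), score
    if best_score > threshold:
        return best_offset, best_score
    return None, best_score  # match rejected by the NCC threshold


# Hypothetical prior DEM and a LiDAR patch cut from it with an elevation bias.
prior = [[3, 8, 1, 9, 4],
         [7, 2, 6, 0, 5],
         [1, 9, 4, 8, 3],
         [6, 0, 7, 2, 9],
         [4, 5, 3, 6, 1]]
lidar = [[v + 100.0 for v in row[2:5]] for row in prior[1:4]]
offset, score = best_match(prior, lidar)
```

Because NCC removes the mean and scales by the standard deviation, a constant elevation bias between the LiDAR DEM and the prior DEM does not affect the match, which is the property the example exercises.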
3.2 Kalman Filtering Framework
Under the assumption that IMU noise is zero-mean Gaussian noise, terrain aided navigation typically uses the Extended Kalman Filter (EKF) framework for information integration. However, since the EKF requires linearization of terrain features that exhibit non-linear distributions, it inevitably introduces linearization errors into the system. To avoid this issue, this paper adopts the Error State Kalman Filter (ESKF) framework to model system errors and achieves system state correction through the optimization of these errors. Here, we select a 15-dimensional parameter set in the East-North-Up (ENU) coordinate system as the nominal state variables:

$$x = \begin{bmatrix} p^T & v^{nT} & \theta_b^{nT} & b_g^T & b_a^T \end{bmatrix}^T \tag{4}$$
where $p$ is the position of the vehicle, including latitude, longitude and altitude; $v^n$ is the velocity of the vehicle, including the eastward, northward and upward velocities; $\theta_b^n$ represents the attitude angles corresponding to the transformation matrix from the vehicle frame to the global frame, including roll, pitch and yaw angles; and $b_g$ and $b_a$ represent the biases of the gyroscope and accelerometer in each direction, respectively. These state variables satisfy the mechanical propagation equations of the inertial navigation system. Correspondingly, we select the 15-dimensional error state variables:

$$\delta x = \begin{bmatrix} \delta p^T & \delta v^{nT} & \delta\theta_b^{nT} & \delta b_g^T & \delta b_a^T \end{bmatrix}^T \tag{5}$$

where $\delta p$ is the position error of the vehicle, $\delta v^n$ is the velocity error, $\delta\theta_b^n$ is the attitude angle error, $\delta b_g$ is the gyroscope bias error, and $\delta b_a$ is the accelerometer bias error. With the map matching results $p_{tan} = \begin{bmatrix} L_{tan} & \lambda_{tan} \end{bmatrix}^T$ as the observation information, we further obtain the mathematical model equations for the system:

$$\delta\dot{x}(t) = F(t)\,\delta x(t) + G(t)\,w(t) \tag{6}$$

$$Z(t) = \begin{bmatrix} (L_{ins} - L_{tan}) R_M \\ (\lambda_{ins} - \lambda_{tan}) R_N \cos L \end{bmatrix} = \begin{bmatrix} R_M\,\delta L + N_x \\ R_N \cos L\,\delta\lambda + N_y \end{bmatrix} = H(t)\,\delta x(t) + V(t) \tag{7}$$
where $F(t)$ is the error propagation matrix of the inertial navigation system; $G(t)$ is the covariance matrix of the process noise; $H(t)$ is the observation matrix of the system; $L_{ins}$ and $\lambda_{ins}$ are respectively the latitude and longitude obtained through the propagation of the inertial navigation system; $R_M$ and $R_N$ are the meridional and prime vertical radii of curvature, respectively; and $V(t)$ is the covariance matrix of the observation noise, including $N_x$ and $N_y$. When the map matching result is unavailable, the system only performs the mechanical propagation of the inertial navigation system and the prediction process of the ESKF. When the matching result is valid, the measurement equation is established from the position information it provides in order to correct the system errors. At time $k$, the correction result $\delta x_k$ for the error state variables is obtained, and the nominal state variables $x_k$ are updated accordingly:

$$\begin{cases} p_{k+1} = \hat{p}_k - \delta p_k \\ v_{k+1} = \hat{v}_k - \delta v_k \\ R_{k+1} = \hat{R}_k \exp(\delta\theta_k) \\ b_{g,k+1} = \hat{b}_{g,k} - \delta b_{g,k} \\ b_{a,k+1} = \hat{b}_{a,k} - \delta b_{a,k} \end{cases} \tag{8}$$

where $R$ is the rotation matrix obtained from the corresponding attitude angles $\theta_b^n$.
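The measurement construction in (7) and the nominal-state correction in (8) can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, angles are assumed to be in radians, the latitude used in $\cos L$ is assumed to be the INS latitude, and $\exp(\delta\theta_k)$ is interpreted as the matrix exponential of the rotation vector (Rodrigues formula).

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])

def so3_exp(phi):
    """Rotation matrix for the rotation vector phi (Rodrigues formula)."""
    angle = np.linalg.norm(phi)
    K = skew(phi)
    if angle < 1e-12:
        return np.eye(3) + K  # first-order approximation for tiny angles
    return (np.eye(3) + np.sin(angle) / angle * K
            + (1.0 - np.cos(angle)) / angle**2 * K @ K)

def position_measurement(L_ins, lam_ins, L_tan, lam_tan, R_M, R_N):
    """Left-hand side of Eq. (7): INS-minus-match position difference in metres."""
    return np.array([(L_ins - L_tan) * R_M,
                     (lam_ins - lam_tan) * R_N * np.cos(L_ins)])

def inject_error_state(p, v, R, b_g, b_a, dx):
    """Eq. (8): fold the 15-dimensional error state dx back into the nominal state."""
    dp, dv, dth, dbg, dba = np.split(dx, 5)
    return p - dp, v - dv, R @ so3_exp(dth), b_g - dbg, b_a - dba
```

Keeping the attitude as a rotation matrix and applying the correction multiplicatively, as in the third line of (8), avoids subtracting Euler angles directly.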
4 System Test and Analysis We utilized DEMs which had a resolution of 1.0 m × 1.0 m as prior data, and a segment of real flight data to test and validate the proposed algorithm. The inertial data during the flight was captured by a NovAtel SPAN STIM 300 IMU at a frequency of 125 Hz. Simultaneously, point cloud data was acquired by an ALS60 LiDAR with a 45-degree field of view (FOV) at a frequency of 20 Hz. The red dashed box in Fig. 5 represents the flight path, which is approximately 2.2 km long with a point cloud strip width of about 210 m and a time span of approximately 92 s. The proposed algorithm was tested on a platform with a quad-core 3.10 GHz i5 processor and 16 GB of RAM. We accumulated point cloud data every 2 seconds and used it to construct a point cloud DEM with a size of 150 m × 150 m for map matching. When the matching results meet the condition (2), the position error will be corrected using the ESKF framework described in Sect. 3.2. The success ratio of matching along the entire flight path is shown in Table 1.
Fig. 5 Schematic diagram of the flight path and point cloud strip data used in this test
Table 1 Test results of the algorithm for the flight data

| Path length | Number of matches | Percentage of successful matches | NCC threshold | RMSE of position | Maximum matching error |
|---|---|---|---|---|---|
| 2.2 km | 39 | 54% | 0.85 | 18.31 m | 48.09 m |

Fig. 6 Comparison of position errors in east, north and horizontal directions (panels: East Error (m), North Error (m), Horizontal Error (m) versus time (s); legend: Only INS propagation, TAN with LiDAR)
With the tightly-coupled INS and GNSS integration results as the benchmark, Fig. 6 compares the position errors in different directions for the flight data under two conditions: using the proposed algorithm and using only inertial navigation propagation. The red line represents the results of the algorithm proposed in this paper, while the blue line represents the results based solely on inertial navigation propagation. Compared with pure inertial navigation propagation, the proposed algorithm maintains positioning at a certain accuracy (a position RMSE of 18.31 m, see Table 1). Although a large position error arises when the aircraft flies over water at the end of the flight, the algorithm still maintains the stability of the system.
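For reference, the horizontal error and the RMSE reported above can be computed from east and north error series as in the short sketch below. The paper does not spell out its exact RMSE definition, so the usual root-mean-square form is assumed here, and the function names are hypothetical.

```python
import numpy as np

def horizontal_error(east_err, north_err):
    """Horizontal error as the Euclidean norm of the east and north errors (m)."""
    return np.hypot(east_err, north_err)

def rmse(err):
    """Root-mean-square of an error series (assumed definition)."""
    err = np.asarray(err, dtype=float)
    return float(np.sqrt(np.mean(err**2)))
```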
5 Conclusion
This paper proposes a terrain aided navigation method based on point cloud DEM construction. Compared to conventional terrain aided navigation measurement sensor schemes, it processes the point cloud collected by an airborne LiDAR using methods such as outlier removal and CSF to establish a large-scale and accurate point cloud DEM that matches the prior DEM. This approach enables reliable matching results in conditions that are challenging for terrain aided navigation, such as flat terrain. The obtained matching results can also be used to correct the position errors of the inertial navigation system, effectively ensuring the accuracy of the system.
Acknowledgements The project is supported by the Aeronautical Science Foundation of China (Grant No. 2022Z022051001), the National Science Foundation of China (Grant No. 42274037), and the National Key Research and Development Program of China (Grant No. 2020YFB0505804).
References
1. Cheng, C.Q., Hao, X.Y., Zhang, Z.J., Ma, Z.G.: Robust integrated navigation algorithm of terrain aided navigation. INS. J. Chin. Inert. Technol 24, 202–207 (2016)
2. Golden, J.P.: Terrain contour matching (TERCOM): a cruise missile guidance aid. In: Image Processing for Missile Guidance, vol. 238, pp. 10–18. SPIE (1980)
3. Hollowell, J.: Heli/SITAN: a terrain referenced navigation algorithm for helicopters. Technical Report. Sandia National Laboratory (SNL-NM), Albuquerque, NM (United States) (1990)
4. Boozer, D.D., Fellerhoff, J.R.: Terrain-aided navigation test results in the AFTI/F-16 aircraft. Navigation 35(2), 161–175 (1988)
5. Chen, Z., Yu, P.J., Yang, H.: BUAA inertial terrain-aided navigation (BITAN) algorithm. In: ICAS, Congress, 18th, Beijing, China, pp. 647–654 (1992)
6. Pei, Y., Chen, Z., Hung, J.C.: BITAN-II: an improved terrain aided navigation algorithm. In: Proceedings of the 1996 IEEE IECON. 22nd International Conference on Industrial Electronics, Control, and Instrumentation, vol. 3, pp. 1675–1680. IEEE (1996)
7. Zhao, L., Gao, N., Huang, B., Wang, Q., Zhou, J.: A novel terrain-aided navigation algorithm combined with the TERCOM algorithm and particle filter. IEEE Sens. J. 15(2), 1124–1131 (2014)
8. Johnson, A., Ivanov, T.: Analysis and testing of a lidar-based approach to terrain relative navigation for precise lunar landing. In: AIAA Guidance, Navigation, and Control Conference, p. 6578 (2011)
9. de Haag, M.U., Vadlamani, A., Campbell, J.L., Dickman, J.: Application of laser range scanner based terrain referenced navigation systems for aircraft guidance. In: Third IEEE International Workshop on Electronic Design, Test and Applications (DELTA'06), p. 6. IEEE (2006)
10. Hemann, G., Singh, S., Kaess, M.: Long-range GPS-denied aerial inertial navigation with lidar localization. In: 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1659–1666. IEEE (2016)
11. Zhang, W., Qi, J., Wan, P., Wang, H., Xie, D., Wang, X., Yan, G.: An easy-to-use airborne lidar data filtering method based on cloth simulation. Remote Sens. 8(6), 501 (2016)
12. Axelsson, P.: DEM generation from laser scanner data using adaptive TIN models. Int. Arch. Photogr. Remote Sens. 33(4), 110–117 (2000)
13. Polat, N., Uysal, M.: Investigating performance of airborne lidar data filtering algorithms for DTM generation. Measurement 63, 61–68 (2015)
14. Zhao, F., Huang, Q., Gao, W.: Image matching by normalized cross-correlation. In: 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings, vol. 2, pp. II–II. IEEE (2006)
Adaptive Fault-Tolerant Control for a Class of Nonlinear Systems Zhichao Wang, Yan Lin, and Lin Li
Abstract This paper considers a fault-tolerant control (FTC) method for a class of nonlinear systems. Based on neural network approximation theory, the unknown nonlinear functions are approximated by a neural network, on the basis of which an adaptive FTC method is introduced. With the proposed method, actuator faults are compensated and the tracking error is maintained within a banded region at all times. Finally, an attitude control simulation illustrates the effectiveness of the proposed method. It is shown that, in the case of possible actuator failures, this method can effectively achieve the control objectives, ensuring system stability and high-precision control performance.
Keywords FTC control · Adaptive control · Neural network
Z. Wang · L. Li
School of Energy and Power Engineering, Beihang University, Beijing 100191, China
Y. Lin (✉)
College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_30

1 Introduction
In recent years, in order to improve the stability and security of control systems, FTC has attracted extensive attention in many fields, such as aircraft [1] and aerospace systems [2], in which actuators are important components that directly affect the final control effect. However, due to incorrect operation or unavoidable environmental factors, the actuators may suffer from unknown failures, which may lead to performance degradation, instability, and other disastrous consequences. Therefore, research on FTC is very meaningful.
A lot of research on actuator FTC has been carried out, which can be roughly divided into two categories: passive FTC and active FTC. In passive FTC, a robust fixed controller is designed to deal with the actuator failures [3]. In contrast, in active FTC, the controller is reconfigured online to accommodate actuator failures [4]. Active FTC mainly includes multiple models switching and tuning (MMST) [5, 6], the information-based diagnostic approach [7], and direct adaptive actuator failure compensation [4, 8]. One of the main characteristics of the MMST-based and information-based diagnostic methods is that they construct a bank of observers, with each observer matching a failure type, and then adjust the controller to handle possible failures based on the constructed performance indicators or the estimated fault information. However, if the fault is outside the bank, these methods may fail. Instead of using the failure information, the direct adaptive actuator failure compensation approach directly reconfigures the control law using system performance errors [4]. Much research has since been built on this method, covering a large class of systems [9–13]. This paper focuses on direct adaptive FTC.
Although significant progress has been achieved in the research of direct adaptive FTC, some problems remain. One is how to ensure control performance in the event of a sudden actuator failure. In [13], the prescribed performance technique is introduced to guarantee the control performance. However, a nonlinear transformation of the tracking error is required in the controller design, which increases the computational complexity. At the same time, the horn-shaped performance bounds are distributed on different sides of the error, which cannot effectively guarantee the transient performance. Another problem is that, to keep the error within the prescribed bounds, a large control value is often required at the initial time, which may lead to actuator saturation. In view of these problems, an adaptive fault-tolerant control method for a class of nonlinear systems is proposed in this paper. New error variables are introduced such that the initial tracking error is zero, which avoids large initial control inputs. Meanwhile, without any nonlinear transformation, the tracking error can be maintained within a banded region, ensuring better transient performance.
This paper is organized as follows. In Sect. 2, the control problem and the basic assumption are given. In Sect. 3, the controller is designed and the main results are given. In Sect. 4, a simulation based on an attitude control problem is given. Finally, we conclude this paper.
2 Problem Statement
Consider the following nonlinear system

$$\dot{x} = A(x)x + B(x)u + G(x) + d, \tag{1}$$

where $x = [x_1, \cdots, x_n]^T \in \mathbb{R}^n$ is the system state vector, $u \in \mathbb{R}^q$ is the control input, $G(x) \in \mathbb{R}^n$ is an unknown function vector that represents the system uncertainties, $d \in \mathbb{R}^n$ is the unknown disturbance, $A(x)$ and $B(x)$ are known function matrices with appropriate dimensions, and $B(x)$ is invertible.
In this paper, a typical actuator failure model [13–16] is considered as

$$u = \rho u_c + \xi^T(t)\theta, \tag{2}$$

where $u_c$ is the control signal vector to be designed, $\rho = \mathrm{diag}\{\rho_1, \cdots, \rho_n\}$ with $\rho_i \in (\bar{\rho}, 1]$, $\xi(t)$ is a known function matrix, and $\theta$ is an unknown constant positive vector.
The objective of this paper is to design an adaptive controller for system (1) under possible actuator faults, so that all the closed-loop signals are bounded and the state vector $x$ tracks the reference signal $x_r$ as closely as possible. To achieve the control objective, the following assumption is made.
Assumption 1 The unknown disturbance vector $d$ is bounded.
3 Controller Design
In this section, the adaptive FTC controller is designed. In what follows, the positive constants $k, c, \kappa_\Lambda, \kappa_\theta, \kappa_w, \kappa_d, \kappa$ are used as design parameters without restating. Since $G(x)$ is an unknown function vector, radial basis function neural networks (RBFNNs) are used to approximate it. According to the well-known universal approximation property [17, 18], $G(x)$ on an arbitrary compact set $\Omega$ can be written as

$$G(x) = \varphi^T(x)w + \varepsilon, \tag{3}$$

where $\varepsilon$ is the approximation error satisfying $\|\varepsilon\| \le \varepsilon_M$, with $\varepsilon_M$ an unknown positive constant; $\varphi(x) = [\varphi_1(x), \cdots, \varphi_N(x)]^T$, with $\varphi_i(x)$ the commonly used Gaussian function; and $w \in \mathbb{R}^N$ is the ideal weight vector, which satisfies

$$w = \arg\min_{w \in \mathbb{R}^N} \left\{ \sup_{x \in \Omega} \left\| G(x) - \varphi^T(x)w \right\| \right\}, \tag{4}$$

in which $N$ is the number of neural network nodes. Taking (1)–(3) into consideration, $\dot{x}$ can be written as

$$\dot{x} = A(x)x + B(x)\rho u_c + B(x)\xi^T(t)\theta + \varphi^T(x)w + \bar{d}, \tag{5}$$

where $\bar{d} = \varepsilon + d$. From Assumption 1, $d$ is bounded. Then there exists an unknown positive constant $\bar{d}_M$ such that $\|\bar{d}\| \le \bar{d}_M$. Define the error variable vector as

$$e = x - x_r - \chi \exp(-kt), \tag{6}$$
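where $\chi = x(0) - x_r(0)$, and $k$ is a positive design parameter. The adaptive FTC strategy is proposed as follows:

$$u_c = \hat{\Lambda} v, \qquad v = -B^{-1}(x)\left[ ce + A(x)x + B(x)\xi^T(t)\hat{\theta} + \varphi^T(x)\hat{w} + \hat{\bar{d}}_M\,\mathrm{sgn}(e) - \dot{x}_r + k\chi\exp(-kt) \right], \tag{7}$$

where $\hat{\Lambda}$, $\hat{\theta}$, $\hat{w}$ and $\hat{\bar{d}}_M$ are the estimates of $\Lambda\ (= \rho^{-1})$, $\theta$, $w$ and $\bar{d}_M$, respectively. The adaptive update laws are given as

$$\begin{aligned} \dot{\hat{\Lambda}} &= -\frac{\kappa_\Lambda}{\kappa^2 - e^T e}\, B^T(x)\, e\, v^T, \\ \dot{\hat{\theta}} &= \frac{\kappa_\theta}{\kappa^2 - e^T e}\, \xi(t) B^T(x)\, e, \\ \dot{\hat{w}} &= \frac{\kappa_w}{\kappa^2 - e^T e}\, \varphi(x)\, e, \\ \dot{\hat{\bar{d}}}_M &= \frac{\kappa_d}{\kappa^2 - e^T e} \sum_{i=1}^{n} |e_i|. \end{aligned} \tag{8}$$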
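One evaluation of the control law (7) and the update laws (8) can be sketched as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function name, the dictionary layout and the argument order are hypothetical, and the basis values are assumed to be supplied as an $N \times n$ array `Phi` so that $\varphi^T(x)\hat{w}$ and $\varphi(x)e$ have matching dimensions.

```python
import numpy as np

def control_and_update(x, x_r, xdot_r, chi, t, A, B, xi, Phi, est, gains):
    """Evaluate controller (7) and the right-hand sides of update laws (8).

    A, B: n x n arrays; xi: m x n array; Phi: N x n array of basis values (assumed shape).
    est: dict with 'Lam' (n x n), 'theta' (m,), 'w' (N,), 'dM' (float).
    gains: dict with k, c, kLam, kth, kw, kd, kappa.
    Returns (u_c, dict of estimate derivatives).
    """
    k, c, kappa = gains["k"], gains["c"], gains["kappa"]
    decay = chi * np.exp(-k * t)
    e = x - x_r - decay                      # error variable, Eq. (6)
    barrier = kappa**2 - e @ e               # common denominator in Eq. (8)
    if barrier <= 0.0:
        raise ValueError("error left the band ||e|| < kappa")
    v = -np.linalg.solve(
        B,
        c * e + A @ x + B @ (xi.T @ est["theta"]) + Phi.T @ est["w"]
        + est["dM"] * np.sign(e) - xdot_r + k * decay,
    )
    u_c = est["Lam"] @ v                     # Eq. (7)
    g = 1.0 / barrier
    d_est = {
        "Lam": -gains["kLam"] * g * np.outer(B.T @ e, v),
        "theta": gains["kth"] * g * (xi @ (B.T @ e)),
        "w": gains["kw"] * g * (Phi @ e),
        "dM": gains["kd"] * g * np.abs(e).sum(),
    }
    return u_c, d_est
```

Because $e(0) = 0$ by construction, the first call returns zero derivatives for all estimates, and the initial control input does not depend on the feedback gain $c$.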
The main results of this paper are summarized as follows.
Theorem 1 Let the plant (1) satisfy Assumption 1. Consider the closed-loop system consisting of the plant (1), the adaptive controller (7), and the update laws (8). Then, under possible actuator failures (2), it is still guaranteed that:
(i) All signals of the closed-loop system are bounded;
(ii) $\lim_{t \to +\infty} \|x - x_r\| = 0$;
(iii) $\|\chi\| \exp(-kt) - \kappa \le \|x - x_r\| \le \|\chi\| \exp(-kt) + \kappa$.
Proof (i). From (5)–(7), the error $e$ satisfies the dynamics

$$\dot{e} = -ce - B(x)\rho\tilde{\Lambda}v + B(x)\xi^T(t)\tilde{\theta} + \varphi^T(x)\tilde{w} + \bar{d} - \hat{\bar{d}}_M\,\mathrm{sgn}(e), \tag{9}$$

where $\tilde{\Lambda} = \Lambda - \hat{\Lambda}$, $\tilde{\theta} = \theta - \hat{\theta}$, $\tilde{w} = w - \hat{w}$. Define a Lyapunov function candidate as

$$V = \frac{1}{2}\log\frac{\kappa^2}{\kappa^2 - e^T e} + \frac{1}{2\kappa_\Lambda}\mathrm{tr}\!\left(\tilde{\Lambda}^T \rho \tilde{\Lambda}\right) + \frac{1}{2\kappa_\theta}\tilde{\theta}^T\tilde{\theta} + \frac{1}{2\kappa_w}\tilde{w}^T\tilde{w} + \frac{1}{2\kappa_d}\tilde{\bar{d}}_M^{\,2}, \tag{10}$$

where $\mathrm{tr}(\cdot)$ represents the trace of the matrix, and $\tilde{\bar{d}}_M = \bar{d}_M - \hat{\bar{d}}_M$.
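Taking the derivative of (10), one can obtain

$$\begin{aligned} \dot{V} ={}& \frac{1}{\kappa^2 - e^T e}\, e^T\!\left[ -ce - B(x)\rho\tilde{\Lambda}v + B(x)\xi^T(t)\tilde{\theta} + \varphi^T(x)\tilde{w} + \bar{d} - \hat{\bar{d}}_M\,\mathrm{sgn}(e) \right] \\ & - \frac{1}{\kappa_\Lambda}\mathrm{tr}\!\left(\dot{\hat{\Lambda}}^T \rho \tilde{\Lambda}\right) - \frac{1}{\kappa_\theta}\dot{\hat{\theta}}^T\tilde{\theta} - \frac{1}{\kappa_w}\dot{\hat{w}}^T\tilde{w} - \frac{1}{\kappa_d}\tilde{\bar{d}}_M \dot{\hat{\bar{d}}}_M \\ \le{}& \frac{-c\,e^T e}{\kappa^2 - e^T e} - \frac{1}{\kappa_\Lambda}\mathrm{tr}\!\left[ \left( \frac{\kappa_\Lambda}{\kappa^2 - e^T e}\, v e^T B(x) + \dot{\hat{\Lambda}}^T \right) \rho \tilde{\Lambda} \right] \\ & + \frac{1}{\kappa_\theta}\left[ \frac{\kappa_\theta}{\kappa^2 - e^T e}\, e^T B(x)\xi^T(t) - \dot{\hat{\theta}}^T \right]\tilde{\theta} + \frac{1}{\kappa_w}\left[ \frac{\kappa_w}{\kappa^2 - e^T e}\, e^T\varphi^T(x) - \dot{\hat{w}}^T \right]\tilde{w} \\ & + \frac{1}{\kappa_d}\tilde{\bar{d}}_M \left[ \frac{\kappa_d}{\kappa^2 - e^T e}\sum_{i=1}^{n}|e_i| - \dot{\hat{\bar{d}}}_M \right]. \end{aligned} \tag{11}$$

Substituting (8) into (11), we have

$$\dot{V} \le \frac{-c\,e^T e}{\kappa^2 - e^T e} \le 0, \tag{12}$$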
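hence, the error $e$ is bounded and satisfies $\lim_{t \to +\infty} e = 0$. Therefore, the boundedness of $\hat{\Lambda}$, $\hat{\theta}$, $\hat{w}$ and $\hat{\bar{d}}_M$ can be obtained.
(ii). By considering (6) and $\lim_{t \to +\infty} e = 0$, the tracking error satisfies

$$\lim_{t \to +\infty} \|x - x_r\| = \lim_{t \to +\infty} \|e + \chi\exp(-kt)\| = 0. \tag{13}$$

(iii). Since $V$ is bounded (note that $e(0) = 0$ by (6), so $V(0)$ is finite), then

$$\|e\| = \|x - x_r - \chi\exp(-kt)\| \le \kappa \;\Rightarrow\; \|\chi\|\exp(-kt) - \kappa \le \|x - x_r\| \le \|\chi\|\exp(-kt) + \kappa. \tag{14}$$

This completes the proof.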
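The banded region in (14) is easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, and the numbers in the usage note are the attitude-loop values from the simulation in Sect. 4 ($\chi$ as the difference between the initial and desired attitude, $k = 3$, $\kappa = 0.5°$).

```python
import numpy as np

def tracking_band(chi, k, kappa, t):
    """Lower and upper bounds on ||x - x_r|| at time t, from Eq. (14)."""
    centre = np.linalg.norm(chi) * np.exp(-k * np.asarray(t, dtype=float))
    return centre - kappa, centre + kappa

def in_band(x, x_r, chi, k, kappa, t):
    """Check whether the tracking error norm lies inside the band at time t."""
    lo, hi = tracking_band(chi, k, kappa, t)
    err = np.linalg.norm(np.asarray(x) - np.asarray(x_r))
    return bool(lo <= err <= hi)
```

With $\chi = [2, -1, -3]^T$ the band is centred at $\sqrt{14}\exp(-3t)$, so its width stays at $2\kappa$ while its centre decays exponentially; once the centre drops below $\kappa$ the lower bound becomes negative and is no longer informative.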
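4 Simulation Results
To show the effectiveness of the proposed design method, the following nonlinear attitude dynamic model for a reusable launch vehicle (RLV) is considered [19]:

$$\begin{aligned} \dot{\alpha} &= q - \tan\beta\,(p\cos\alpha + r\sin\alpha) + d_\alpha, \\ \dot{\beta} &= p\sin\alpha - r\cos\alpha + d_\beta, \\ \dot{\mu} &= \sec\beta\,(p\cos\alpha + r\sin\alpha) + d_\mu, \\ \dot{p} &= -\frac{I_{zz} - I_{yy}}{I_{xx}}\,qr + \frac{m_x^{\delta_a} q_0 S_r L_r}{I_{xx}}\,\delta_a + d_p, \\ \dot{q} &= -\frac{I_{xx} - I_{zz}}{I_{yy}}\,pr + \frac{m_y^{\delta_e} q_0 S_r L_r}{I_{yy}}\,\delta_e + d_q, \\ \dot{r} &= -\frac{I_{yy} - I_{xx}}{I_{zz}}\,qp + \frac{m_z^{\delta_r} q_0 S_r L_r}{I_{zz}}\,\delta_r + d_r, \end{aligned} \tag{15}$$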
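where $\alpha, \beta, \mu, p, q, r$ are the angle of attack, the sideslip angle, the bank angle, and the roll, pitch and yaw rates, respectively; $I_{xx}, I_{yy}, I_{zz}$ are the moments of inertia; and $\delta_a, \delta_e, \delta_r$ are the equivalent three-channel grid fin angles. Let $X_1 = [\alpha, \beta, \mu]^T$, $X_2 = [p, q, r]^T$. Then, (15) can be written as

$$\dot{X}_1 = \bar{B}_1 X_2 + d_1, \tag{16}$$

$$\dot{X}_2 = \bar{B}_2 U + \bar{G}_2(X_2) + d_2, \tag{17}$$

in which

$$\begin{aligned} \bar{B}_1 &= \begin{bmatrix} -\tan\beta\cos\alpha & 1 & -\tan\beta\sin\alpha \\ \sin\alpha & 0 & -\cos\alpha \\ \sec\beta\cos\alpha & 0 & \sec\beta\sin\alpha \end{bmatrix}, \\ \bar{B}_2 &= \mathrm{diag}\left\{ \frac{m_x^{\delta_a} q_0 S_r L_r}{I_{xx}},\ \frac{m_y^{\delta_e} q_0 S_r L_r}{I_{yy}},\ \frac{m_z^{\delta_r} q_0 S_r L_r}{I_{zz}} \right\}, \\ \bar{G}_2(X_2) &= -\left[ \frac{I_{zz} - I_{yy}}{I_{xx}}\,qr,\ \frac{I_{xx} - I_{zz}}{I_{yy}}\,pr,\ \frac{I_{yy} - I_{xx}}{I_{zz}}\,qp \right]^T, \\ d_1 &= \left[ d_\alpha, d_\beta, d_\mu \right]^T, \quad d_2 = \left[ d_p, d_q, d_r \right]^T, \end{aligned} \tag{18}$$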
and $U = [\delta_a, \delta_e, \delta_r]^T$. Because the angle loop and the rate loop have different bandwidths, the proposed method can be applied to (16) and (17) separately: for a given reference attitude, $X_{2c}$ is designed by using the proposed method; then, taking $X_{2c}$ as the reference signal, the control torque $M$ can be obtained. In the simulation, the RLV dynamics model parameters and aerodynamic coefficients are the same as those in [19]. The initial states are set to $X_1(0) = [3, -1, 0]^T$ (°) and $X_2(0) = [0, 0, 0]^T$ (°/s). The desired attitude is selected as $[1, 0, 3]^T$ (°). The adaptive control design parameters are chosen as $k = 3$, $c = 5$, $\kappa_\Lambda = \kappa_\theta = 0.01$, $\kappa_w = \kappa_d = 1$, $\kappa = 0.5°$. Meanwhile, in order to verify the robustness of the proposed scheme, the following actuator failures are considered:

$$\delta_a = \begin{cases} \delta_{ac}, & \text{if } t < 5, \\ 0.8\,\delta_{ac}, & \text{otherwise,} \end{cases} \tag{19}$$

$$\delta_e = \begin{cases} \delta_{ec}, & \text{if } t < 5, \\ 0.6\,\delta_{ec} + 0.3°, & \text{otherwise,} \end{cases} \tag{20}$$

$$\delta_r = \begin{cases} \delta_{rc}, & \text{if } t < 5, \\ 0.7\,\delta_{rc} + 0.5°, & \text{otherwise.} \end{cases} \tag{21}$$
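The building blocks of this simulation setup, namely $\bar{B}_1$ and $\bar{G}_2$ from (18) and the fault scenario (19)–(21), can be sketched as follows. This is an illustrative sketch with hypothetical function names; the inertia values and aerodynamic coefficients of [19] are not reproduced here, and angles passed to `B1` are assumed to be in radians.

```python
import numpy as np

def B1(alpha, beta):
    """Kinematic matrix of Eq. (18) mapping body rates [p, q, r] to attitude rates."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    tb, secb = np.tan(beta), 1.0 / np.cos(beta)
    return np.array([[-tb * ca, 1.0, -tb * sa],
                     [sa, 0.0, -ca],
                     [secb * ca, 0.0, secb * sa]])

def G2(rates, Ixx, Iyy, Izz):
    """Gyroscopic coupling term of Eq. (18) for body rates [p, q, r]."""
    p, q, r = rates
    return -np.array([(Izz - Iyy) / Ixx * q * r,
                      (Ixx - Izz) / Iyy * p * r,
                      (Iyy - Ixx) / Izz * q * p])

def faulty_deflection(t, u_c, t_fault=5.0):
    """Actual fin angles (deg) for commanded [d_ac, d_ec, d_rc], per Eqs. (19)-(21)."""
    u_c = np.asarray(u_c, dtype=float)
    if t < t_fault:
        return u_c.copy()
    rho = np.array([0.8, 0.6, 0.7])    # loss of effectiveness per channel
    bias = np.array([0.0, 0.3, 0.5])   # additive bias fault (deg)
    return rho * u_c + bias
```

The fault function has the structure of model (2), with `rho` playing the role of $\rho$ and `bias` the role of $\xi^T(t)\theta$.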
The simulation results are shown in Figs. 1, 2, 3 and 4. Figure 1 shows the attitude response trajectories. It can be seen that, with the proposed method, the attitude angles track the desired trajectories well despite actuator faults. Figure 2 shows that the norm of the tracking error is constrained within the banded range determined by (14). Figure 3 shows the attitude angular rates, and Fig. 4 shows the control inputs.
Fig. 1 Trajectories of attitude angles (deg) (panels: α, β, μ versus Time (s); legend: Desired attitude, Actual attitude)
Fig. 2 Norm of the tracking error vector (deg) (‖x − x_r‖ versus Time (s))
Fig. 3 Attitude angular rates (deg/s) (panels: p, q, r versus Time (s))
Fig. 4 Control inputs (deg) (three panels versus Time (s))
5 Conclusions
An adaptive FTC method has been proposed for a class of nonlinear systems. With neural network approximation theory, the unknown nonlinear functions are approximated. By introducing new error variables, the tracking error is maintained within a banded region. Finally, an attitude control simulation has shown that the proposed method can effectively achieve the control objectives, ensuring system stability and high-precision control performance.
Acknowledgements This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 62073197, Grant 61933006, and the Special Funding for Top Talents of Shandong Province.
References
1. Peni, T., Vanek, B., Szabo, Z., Bokor, J.: Supervisory fault tolerant control of the NASA AirStar aircraft. In: Proceedings of American Control Conference, pp. 666–671 (2014)
2. Jiang, B.Y., Hu, Q.L., Friswell, M.I.: Fixed-time attitude control for rigid spacecraft with actuator saturation and faults. IEEE Trans. Control Syst. Technol. 24(5), 1892–1898 (2016)
3. Zhang, Y.M., Jiang, J.: Bibliographical review on reconfigurable fault tolerant control systems. Annu. Rev. Control 32(2), 229–252 (2008)
4. Tao, G.: Direct adaptive actuator failure compensation control: a tutorial. J. Control Decis. 1(1), 75–101 (2014)
5. Boskovic, J.D., Jackson, J.A., Mehra, R.K., Nguyen, N.T.: Multiple-model adaptive fault-tolerant control of a planetary lander. AIAA J. Guid. Control Dyn. 32(6), 1812–1826 (2009)
6. Boskovic, J.D., Mehra, R.K.: A decentralized fault-tolerant control system for accommodation of failures in higher-order flight control actuators. IEEE Trans. Control Syst. Technol. 18(5), 1103–1115 (2010)
7. Zhang, X.D., Parisini, T., Polycarpou, M.M.: Adaptive fault-tolerant control of nonlinear uncertain systems: an information-based diagnostic approach. IEEE Trans. Autom. Control 49(8), 1259–1274 (2004)
8. Tao, G., Joshi, S.M., Ma, X.L.: Adaptive state feedback control and tracking control of systems with actuator failure. IEEE Trans. Autom. Control 46(1), 78–95 (2001)
9. Tao, G., Chen, S., Joshi, S.M.: An adaptive actuator failure compensation controller using output feedback. IEEE Trans. Autom. Control 47(3), 506–511 (2002)
10. Tang, X.D., Tao, G., Joshi, S.M.: Adaptive actuator failure compensation for parametric strict feedback systems and an aircraft application. Automatica 39(11), 1975–1982 (2003)
11. Tang, X.D., Tao, G., Joshi, S.M.: Adaptive output feedback actuator failure compensation for a class of non-linear systems. Int. J. Adapt. Control Signal Process. 19(6), 419–444 (2005)
12. Yao, X.L., Tao, G., Jiang, B.: Adaptive actuator failure compensation for multivariable feedback linearizable systems. Int. J. Robust Nonlinear Control 26(2), 252–285 (2016)
13. Wang, W., Wen, C.Y.: Adaptive actuator failure compensation control of uncertain nonlinear systems with guaranteed transient performance. Automatica 46(12), 2082–2091 (2010)
14. Wang, J., Pan, H., Sun, W.: Event-triggered adaptive fault-tolerant control for unknown nonlinear systems with applications to linear motor. IEEE/ASME Trans. Mechatron. 27(2), 940–949 (2022)
15. Li, X.J., Yang, G.H.: Robust adaptive fault-tolerant control for uncertain linear systems with actuator failures. IET Control Theory Appl. 6(10), 1544–1551 (2012)
16. Wang, H., Bai, W., Zhao, X., et al.: Finite-time-prescribed performance-based adaptive fuzzy control for strict-feedback nonlinear systems with dynamic uncertainty and actuator faults. IEEE Trans. Cybern. 99, 1–13 (2021)
17. Zhao, K., Song, Y., Ma, T., He, L.: Prescribed performance control of uncertain Euler-Lagrange systems subject to full-state constraints. IEEE Trans. Neural Netw. Learn. Syst. 29(8), 3478–3489 (2018)
18. Dong, H., Lin, X., Gao, S., et al.: Neural networks-based sliding mode fault-tolerant control for high-speed trains with bounded parameters and actuator faults. IEEE Trans. Veh. Technol. 69(2), 1353–1362 (2020)
19. Zhang, L., Wei, C., Wu, R., Cui, N.: Fixed-time extended state observer based non-singular fast terminal sliding mode control for a VTVL reusable launch vehicle. Aerosp. Sci. Technol. 82–83, 70–79 (2018)
Traffic Police Dynamic Gesture Recognition Based on Spatiotemporal Attention ST-GCN Xiru Wu, Yu Zhao, and Qi Chen
Abstract To address the low accuracy and poor robustness of traffic police dynamic gesture recognition, an ST-GCN based traffic police dynamic gesture recognition method is designed in this paper. The skeletal features of traffic police are extracted in the spatial and temporal dimensions. By updating the graph attention matrix, the skeletal connectivity structure of the traffic police is optimized to highlight the effective spatial features. A temporal attention mechanism is added to strengthen the learning of the features of the core movements in the gestures. The designed model is evaluated on 8 types of traffic police gestures in different scenarios. The results show that the ST-GCN traffic police gesture recognition network with the integrated spatio-temporal attention mechanism achieves a good recognition effect, with an average recognition accuracy of 88.82%, which is 5.79% higher than that of the classical ST-GCN. This verifies the good performance of the proposed algorithm in traffic police dynamic gesture recognition.
Keywords Spatiotemporal attention mechanism · ST-GCN · Traffic police gesture recognition
X. Wu (✉) · Y. Zhao · Q. Chen
College of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_31

1 Introduction
With the rapid development of urbanization in China, there is a growing demand for assisted driving and intelligent driving systems. In real life, a complete traffic scene includes not only a variety of moving vehicles, but also actions that must be recognized and predicted within a limited time range, for example, the gestures of traffic police commanding traffic [1]. Accurate detection and recognition of traffic police gestures on the road is crucial to the safety of assisted driving and intelligent driving. The new version of the traffic police gesture signals in China has been in force since 2007 and comprises 8 kinds of instructions: “Stop”, “Go Straight”, “Turn Left”, “Turn Left to Stay”, “Turn Right”, “Change Lanes”, “Slow Down” and “Pull Over”. Each of the 8 commands is composed of coordinated movements of the head, arms and other limbs, and the dynamic nature of these gestures increases the difficulty of recognition. In addition, in real traffic scenes, the location of the traffic police in the picture, differences in size and scale, occlusion by pedestrians, vehicles and buildings, and background interference such as light intensity and bad weather (rain and snow) are the main factors that degrade the accuracy and speed of detection and recognition algorithms. Solving the problem of dynamic gesture recognition of traffic police in the traffic environment plays an extremely important role in the safety performance of intelligent driving vehicles. In view of the actual needs of intelligent driving vehicles for traffic police dynamic gesture recognition, this paper carries out research on pose estimation and gesture recognition based on deep learning algorithms, and improves the visual semantic understanding of traffic police gestures by intelligent driving vehicles and advanced driver assistance systems, which has important theoretical and application value.
Pose estimation algorithms based on deep learning have developed rapidly in recent years, and a large number of excellent algorithms have emerged, breaking the limitations of classical methods. In 2016, Wei et al. [2] proposed a convolutional pose machine that learns long-distance spatial relationships by using a larger receptive field, and realized end-to-end single-person pose estimation for the first time.
Literature [3] proposed that multiple Hourglass networks were used to obtain image features through continuous down-sampling and up-sampling processes, and key point information was output by Gaussian heat map at each stage, which improved the detection accuracy of key points. Multi-person pose estimation methods are usually divided into top-down method and bottom-up method. Among them, the top-down method first detects the position of the human body, outputs the bounding box coordinates, and then regenerates the key points of the human skeleton in each bounding box. Representative algorithms include Mask R-CNN [4], CPN [5], Simple Baselines [6], HRNet [7], etc. However, depending on the result of object detection, the processing time increases with the number of detectors needed. The bottom-up method first detects all bone key points globally, and then aggregates these key points into complete bones of different individuals. Representative algorithms include DeepCut [8], OpenPose [9], HigherHRNet [10], etc., which can greatly improve the detection real-time performance. In recent years, many research teams continue to innovate pose estimation algorithms for practical applications. Microsoft Research Asia proposed a multi-stage structured keypoint regression method in 2021 [11], which uses adaptive convolution to activate pixels around the keypoint area to learn new features. This method of direct regression coordinates exceeds many methods represented by heat map detection in detection accuracy, and achieves the best results of bottom-up pose estimation methods on COCO and CrowdPose datasets.
Deep-learning-based action recognition usually relies on optical flow features, skeletal features, or spatio-temporal features. The core idea of the optical flow feature method is to analyze how the trajectories of pixels in a video sequence change over time. For example, Simonyan and Zisserman [12] proposed the two-stream structure, which uses two-dimensional convolutional neural networks to extract action features from the video stream and from the optical flow separately, and showed that convolutional neural networks trained on optical flow features can still achieve considerable recognition results. In Ref. [13], the video was fed into PWCNet to extract optical flow features: the model performed bilinear interpolation warping on adjacent video frames through the feature pyramid network, and the optical flow estimator then generated the optical flow features used for action recognition. The skeleton feature method extracts joint features for action recognition by analyzing the positions of the human skeleton keypoints and the relationships between them. Si et al. [14] combined a graph attention convolutional network with a Long Short-Term Memory (LSTM) network and proposed the AGC-LSTM model to strengthen the attention paid to dynamic changes of human skeletal spatial features in video images. In Ref. [15], a graph convolutional network based on spatio-temporal attention, PGCN-TCA, was proposed; it replaces the fixed adjacency matrix with a matrix of learnable connection strengths to help the network extract more hierarchical features in the spatial domain. The spatio-temporal feature method requires the network to learn temporal and spatial features simultaneously, so that the action features of adjacent frames can be obtained and aggregated to achieve a good recognition effect. The C3D model proposed by Tran et al. [16] has been widely adopted; it applies 3D convolution directly to adjacent video frames and models spatio-temporal features simultaneously. Ref. [17] proposed a long- and short-term spatial feature extraction network, LT-net, in which the convolution kernel is extended from two dimensions to three dimensions and C3D is used as the backbone network. Experiments show that the superimposed RGB images can capture long-term spatio-temporal features, which improves the stability and convergence of the network.

In this paper, OpenPose is used to predict the skeleton keypoints of the traffic police in a bottom-up manner, and the spatio-temporal graph convolutional network ST-GCN describes the spatial structure and time series of the extracted skeletons as a topological graph of the skeleton sequence. A spatial attention mechanism is integrated, and the connectivity structure of the traffic police skeleton is optimized by updating the graph attention matrix to strengthen the extraction of effective spatial features in traffic police dynamic gestures. A temporal attention mechanism is added to improve the model’s attention to the channel information of traffic police gestures. This paper studies the dynamic gesture recognition algorithm of traffic police based on ST-GCN with spatio-temporal attention and designs experiments to verify its effect.
X. Wu et al.
2 Preliminary Theory

2.1 OpenPose Model

The OpenPose model is a bottom-up, high-precision, real-time pose estimation model proposed by a research team at Carnegie Mellon University in the United States in 2017 to estimate 2D human poses in images [9]. The model architecture is shown in Fig. 1. The specific steps are summarized as follows:

(a) The input is one or more color images of size $w \times h$, from which features are extracted by a convolutional neural network;

(b) A feed-forward network predicts the skeletal keypoints and the joint connection vectors of the person, yielding the keypoint confidence maps $S$ and the set of joint connection vectors $L$:

$$ S = (S_1, S_2, \ldots, S_N), \quad S_n \in \mathbb{R}^{w \times h}, \quad n \in \{1, \ldots, N\} \tag{1} $$

$$ L = (L_1, L_2, \ldots, L_M), \quad L_m \in \mathbb{R}^{w \times h \times 2}, \quad m \in \{1, \ldots, M\} \tag{2} $$

(c) A matching algorithm assembles the overall skeleton structure of each person from $S$ and $L$;

(d) The final visualization output is the detected single-person or multi-person skeletal keypoints and connected joints.

$S$ contains $N$ confidence maps, one per keypoint type, and $L$ contains $M$ vector fields, one per joint connection, each storing a 2D vector at every image location. The internal structure of the OpenPose network is shown in Fig. 2 and the external framework is shown in Fig. 3. For the input traffic police images, the VGG19 network is used for feature extraction, and the resulting feature map $F$ is passed to the multi-stage OpenPose network for human pose estimation. The network is divided into six stages, and each stage has two channels. One channel outputs the Part Confidence Map (PCM), which predicts human keypoints through a convolutional neural network,
Fig. 1 OpenPose algorithm architecture
Fig. 2 OpenPose internal structure
Fig. 3 OpenPose external framework
and the other channel outputs the Part Affinity Fields (PAF) for joint connection. The keypoint confidence maps and joint affinity fields are estimated in parallel, and their outputs are fused with the original feature map $F$ to form the input of each channel in the next stage. Multi-stage prediction progressively refines the results and thereby improves accuracy. In each prediction channel of the first stage, the feature map $F$ extracted by the VGG19 backbone is first passed through a 3×3 convolutional block with three residual connections to fuse deep and shallow features. Two 1×1 convolutions then raise or reduce the channel dimension, which facilitates feature fusion after each stage and adds nonlinearity. From the second stage to the sixth stage, the input is the fusion of the previous stage's predictions with the feature map $F$, and five 7×7 convolution kernels form a small residual network. In each subsequent stage $t$, the keypoint prediction $S^{t-1}$ and the connection field prediction $L^{t-1}$ of the previous stage are fused with the original feature map $F$ for a more refined prediction. The network can be expressed as follows:

$$ S^t = \rho^t(F, S^{t-1}, L^{t-1}), \quad \forall t \ge 2 \tag{3} $$

$$ L^t = \phi^t(F, S^{t-1}, L^{t-1}), \quad \forall t \ge 2 \tag{4} $$
where $\rho^t$ and $\phi^t$ denote the PCM and PAF prediction networks at stage $t$, respectively. In OpenPose, an L2 loss is applied to the prediction of every stage so that the keypoint heat maps and joint affinity fields are refined iteratively and no skeleton keypoints are lost, and the loss is spatially weighted. The losses of the keypoint heat map and of the joint affinity field at stage $t$ are expressed as follows:

$$ f_S^t = \sum_{n=1}^{N} \sum_{P} W(P) \cdot \left\| S_n^t(P) - S_n^*(P) \right\|_2^2 \tag{5} $$

$$ f_L^t = \sum_{m=1}^{M} \sum_{P} W(P) \cdot \left\| L_m^t(P) - L_m^*(P) \right\|_2^2 \tag{6} $$
where $S_n^*$ is the ground-truth keypoint confidence map and $L_m^*$ is the ground-truth affinity field. $W$ is the spatial weighting applied during training to prevent unlabeled human joints in the dataset from affecting what the model learns; it is essentially a binary mask matrix, with $W(P) = 0$ if the annotation at image location $P$ is missing and $W(P) = 1$ otherwise. The global loss function of OpenPose is expressed as follows:

$$ f = \sum_{t=1}^{T} \left( f_S^t + f_L^t \right) \tag{7} $$
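The masked, stage-wise loss of Eqs. (5)–(7) can be illustrated with a minimal sketch. It assumes, purely for illustration, that every map is flattened into a list of scalars (a real PAF stores a 2D vector per location) and that all names and toy values below are hypothetical rather than taken from the OpenPose code base.

```python
def masked_l2(pred, target, mask):
    """Sum over locations P of W(P) * squared error for one flattened map."""
    return sum(w * (p - t) ** 2 for p, t, w in zip(pred, target, mask))


def stage_loss(pred_maps, true_maps, mask):
    """Eq. (5) or Eq. (6): add the masked L2 error of every map in one stage."""
    return sum(masked_l2(p, t, mask) for p, t in zip(pred_maps, true_maps))


def total_loss(stages_S, stages_L, true_S, true_L, mask):
    """Eq. (7): add the PCM and PAF losses of all stages t = 1..T."""
    return sum(
        stage_loss(S, true_S, mask) + stage_loss(L, true_L, mask)
        for S, L in zip(stages_S, stages_L)
    )


# Toy data (hypothetical): 2 stages, 1 keypoint map, 1 affinity map, 3 locations.
mask = [1, 1, 0]                 # the third location is unlabeled, so it is ignored
true_S = [[1.0, 0.0, 0.0]]
true_L = [[0.5, 0.5, 0.0]]
stages_S = [[[0.5, 0.0, 9.0]], [[1.0, 0.0, 9.0]]]
stages_L = [[[0.5, 0.0, 9.0]], [[0.5, 0.5, 9.0]]]

loss = total_loss(stages_S, stages_L, true_S, true_L, mask)
print(loss)
```

The large error at the masked location does not contribute, which is the purpose of $W$: only labeled locations drive the gradient.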
After OpenPose predicts the keypoint confidence maps and joint affinity fields, a simplified bipartite matching algorithm connects the keypoints of the traffic police skeleton. Taking the left hand joint as an example, we first obtain from the keypoint confidence maps the candidate sets $J_7$ of the left elbow (keypoint 7) and $J_9$ of the left wrist (keypoint 9), and then calculate the connection confidence $E$ of each candidate pair located at $d_{j_7}$ and $d_{j_9}$ ($j_7 \in J_7$, $j_9 \in J_9$) by integrating the joint affinity field along the segment between them:

$$ E = \int_{u=0}^{u=1} L_c(p(u)) \cdot \frac{d_{j_9} - d_{j_7}}{\left\| d_{j_9} - d_{j_7} \right\|_2} \, du \tag{8} $$

$$ p(u) = (1 - u)\, d_{j_7} + u\, d_{j_9} \tag{9} $$

where $p(u)$ is the position interpolated between the two keypoints from elbow to wrist, $\frac{d_{j_9} - d_{j_7}}{\| d_{j_9} - d_{j_7} \|_2}$ is the unit vector from keypoint $j_7$ to keypoint $j_9$, and $L_c(p(u))$ is the value of the joint affinity field at position $p(u)$.

The Hungarian algorithm is then used to match all left elbow keypoints 7 with all left wrist keypoints 9 based on their connection confidence, as follows:

(1) Sort all connections between keypoint 7 and keypoint 9 by connection confidence $E$;

(2) Traverse the sorted connections, with keypoint 7 as the starting point and keypoint 9 as the ending point;

(3) If neither the start point nor the end point of a connection is marked, add the connection to the final matching and mark both points; if either point is already marked, discard the connection.

Keypoint feature matching is shown in Fig. 4, where $m$ denotes the three candidates for keypoint 7 on the left elbow, $n$ denotes the four candidates for keypoint 9 on the left wrist, and the matrix values in Fig. 4a are the connection confidences of $m_i$ and $n_j$. Figure 4b shows the connections sorted by confidence: a connection whose keypoints are both unoccupied is added to the final result set, and a connection is discarded if one of its keypoints is already in use.

Fig. 4 Keypoint feature matching: (a) Keypoint confidence; (b) Confidence ranking
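As a hedged illustration of Eqs. (8)–(9) and of steps (1)–(3), the sketch below approximates the line integral by uniform sampling and then applies the sort-and-mark procedure. Note that steps (1)–(3) describe a greedy assignment on sorted confidences, which is what the code implements; it is not a full Hungarian solver. The `paf` callable, the sample count, and the confidence values are hypothetical.

```python
import math


def connection_confidence(paf, d_start, d_end, samples=10):
    """Approximate Eq. (8): average of PAF(p(u)) . unit vector over sampled u.

    `paf(x, y)` is a hypothetical callable returning the 2D affinity vector
    at a position; `samples` must be at least 2.
    """
    dx, dy = d_end[0] - d_start[0], d_end[1] - d_start[1]
    norm = math.hypot(dx, dy)
    if norm == 0:
        return 0.0
    ux, uy = dx / norm, dy / norm
    total = 0.0
    for k in range(samples):
        u = k / (samples - 1)
        px = (1 - u) * d_start[0] + u * d_end[0]   # Eq. (9)
        py = (1 - u) * d_start[1] + u * d_end[1]
        vx, vy = paf(px, py)
        total += vx * ux + vy * uy
    return total / samples


def greedy_match(conf):
    """Steps (1)-(3): sort pairs by confidence, keep a pair only if both
    endpoints are still unmarked."""
    pairs = sorted(
        ((c, i, j) for i, row in enumerate(conf) for j, c in enumerate(row)),
        reverse=True,
    )
    used_i, used_j, result = set(), set(), []
    for c, i, j in pairs:
        if i not in used_i and j not in used_j:
            result.append((i, j, c))
            used_i.add(i)
            used_j.add(j)
    return result


# A constant field pointing along +x supports horizontal limbs only.
field = lambda x, y: (1.0, 0.0)
e_aligned = connection_confidence(field, (0, 0), (4, 0))
e_orthogonal = connection_confidence(field, (0, 0), (0, 4))

# Hypothetical 3 x 4 confidence matrix (3 elbow and 4 wrist candidates).
conf = [
    [0.9, 0.1, 0.3, 0.2],
    [0.2, 0.8, 0.1, 0.4],
    [0.1, 0.3, 0.2, 0.7],
]
matches = greedy_match(conf)
print(matches)
```

With three elbow candidates and four wrist candidates, at most three connections survive, so one wrist candidate is left unmatched.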
2.2 Graph Convolutional Networks

CNNs and RNNs have shown strong advantages in dealing with stable and regular two-dimensional data structures. However, traditional CNNs and RNNs are ill-suited to data with irregular spatial structures, such as those found in gesture recognition, brain neural signals, recommendation systems, traffic prediction, and molecular structures. For feature extraction from graph-structured data, Kipf and Welling proposed the graph convolutional network (GCN) to extract and classify spatial features of graph-structured data. It is assumed that the graph to be processed includes $N$ nodes and that each node has its own independent features. These node features are combined into an $N \times D$ matrix $H$, and the connections between nodes are represented by an adjacency matrix $A$. The inter-layer propagation of the graph convolutional network is shown in Eq. (10):

$$ H^{(l+1)} = \sigma\left( \tilde{D}^{-\frac{1}{2}} \tilde{A} \tilde{D}^{-\frac{1}{2}} H^{(l)} W^{(l)} \right) \tag{10} $$

where $\tilde{A} = A + I$ is the adjacency matrix with self-connections, $I$ is the identity matrix, $W^{(l)}$ is the trainable weight matrix of layer $l$, and $\sigma(\cdot)$ denotes the nonlinear activation function. $\tilde{D}$ is the degree matrix of $\tilde{A}$, with $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$. The propagation process between layers of the graph convolutional network is shown in Fig. 5. For the node features $H$ at the input layer, the hidden layers compute the corresponding output $Z$, which carries the updated node features, while the connections between nodes, i.e., the structural features, remain unchanged. The core of the graph convolutional network is to learn not only the node features but also the structural features between nodes, so as to extract the spatial features of the topological graph.

Fig. 5 Inter-layer propagation in graph convolutional networks
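A single propagation step of Eq. (10) can be sketched in plain Python. ReLU is assumed here as the activation $\sigma$, and the three-node path graph, features, and weights are hypothetical toy values.

```python
import math


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def gcn_layer(A, H, W):
    """One layer of Eq. (10): ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    n = len(A)
    # Add self-connections: A_tilde = A + I.
    A_tilde = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    # Degree of each node in A_tilde.
    deg = [sum(row) for row in A_tilde]
    # Symmetric normalization: entry (i, j) is divided by sqrt(d_i * d_j).
    A_hat = [
        [A_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]
    Z = matmul(matmul(A_hat, H), W)
    return [[max(0.0, z) for z in row] for row in Z]


# Path graph 0 - 1 - 2 with one feature per node and a 1 x 1 weight matrix.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
H = [[1.0], [2.0], [3.0]]
W = [[1.0]]
out = gcn_layer(A, H, W)
print(out)
```

Each output row mixes a node's own feature with those of its neighbors, while the graph structure in `A` itself is never modified.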
2.3 ST-GCN Model

Yan et al. [18] of the Chinese University of Hong Kong proposed a human action recognition method based on ST-GCN, which introduces the time dimension so that the graph convolutional network can extract both the spatial and the temporal characteristics of human skeleton keypoints. The traffic police gesture recognition algorithm based on ST-GCN regards the skeleton sequence of the traffic police as a spatio-temporal graph structure, so the convolution is carried out in both the spatial and the temporal dimension. Therefore, before the traffic police gesture is analyzed, the obtained traffic police skeleton graph needs to be constructed hierarchically.
Fig. 6 Spatio-temporal map of traffic police skeleton
In this paper, the OpenPose pose estimation method described in Sect. 2.1 is used to obtain the 18 skeletal keypoints of the traffic police, and the spatio-temporal graph is constructed as shown in Fig. 6. In the spatial convolution of the graph convolutional network, given the undirected skeleton spatio-temporal graph $G$ and $C$ input channels, the convolution result at skeleton keypoint $x$ is shown in Eq. (11):

$$ f_{out}(x) = \sum_{h=1}^{K} \sum_{w=1}^{K} f_{in}(p(x, h, w)) \cdot w(h, w) \tag{11} $$

where $f_{in}$ is the input feature map, $f_{out}$ is the output feature, $p(\cdot)$ is the sampling function, given by $p(x, h, w) = x + p'(h, w)$, and $w(h, w)$ is the weight. In the undirected graph $G$, the neighbor set of the traffic police skeleton point $v_{ti}$ is defined as $B(v_{ti}) = \{ v_{tj} \mid d(v_{tj}, v_{ti}) \le D \}$, where $d(v_{tj}, v_{ti})$ is the shortest distance between skeleton points $v_{tj}$ and $v_{ti}$. When the neighborhood distance $D$ is set to 1, the sampling function can be expressed as follows:

$$ p(v_{ti}, v_{tj}) = v_{tj} \tag{12} $$
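The neighbor set $B(v_{ti})$ can be computed with a breadth-first search that stops at distance $D$. The sketch below uses a hypothetical four-joint chain (neck, shoulder, elbow, wrist) rather than the full 18-keypoint skeleton.

```python
from collections import deque


def neighbor_set(edges, num_nodes, center, D=1):
    """Return all nodes whose shortest-path distance to `center` is <= D.

    The center itself has distance 0 and is therefore part of the set.
    """
    adj = {v: set() for v in range(num_nodes)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    dist = {center: 0}
    queue = deque([center])
    while queue:
        v = queue.popleft()
        if dist[v] == D:          # do not expand beyond the distance limit
            continue
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return sorted(dist)


# Hypothetical chain: 0 neck - 1 shoulder - 2 elbow - 3 wrist.
edges = [(0, 1), (1, 2), (2, 3)]
one_hop = neighbor_set(edges, 4, center=2, D=1)
two_hop = neighbor_set(edges, 4, center=2, D=2)
print(one_hop, two_hop)
```

With $D = 1$ the elbow only sees itself, the shoulder, and the wrist, which is the 1-neighborhood over which Eq. (12) samples.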
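For the partitioning strategy used to construct subsets, ST-GCN adopts a partitioning method based on spatial structure. With $l_{ti}(v_{tj})$ denoting the label of the subset to which neighbor $v_{tj}$ of $v_{ti}$ is assigned, the weighting function can be expressed as follows:

$$ w(v_{ti}, v_{tj}) = w'(l_{ti}(v_{tj})) \tag{13} $$

Applying the optimized sampling function and weighting function to the graph convolution in Eq. (11), the following can be obtained:

$$ f_{out}(v_{ti}) = \sum_{v_{tj} \in B(v_{ti})} \frac{1}{Z_{ti}(v_{tj})} f_{in}(v_{tj}) \cdot w(l_{ti}(v_{tj})) \tag{14} $$

where the normalizing term $Z_{ti}(v_{tj})$ is the number of nodes in the subset that contains $v_{tj}$.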
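In gesture recognition based on ST-GCN, the adjacency matrix $A$ and the identity matrix $I$ of the graph represent the links between human skeleton keypoints within a single frame, with the degree matrix $\Lambda^{ii} = \sum_j (A^{ij} + I^{ij})$. With the spatial structure partitioning strategy, ST-GCN is written as:

$$ f_{out} = \sum_{j} \Lambda_j^{-\frac{1}{2}} A_j \Lambda_j^{-\frac{1}{2}} f_{in} w_j \tag{15} $$

where $\sum_j A_j = A + I$, $\Lambda_j^{ii} = \sum_k A_j^{ik} + \alpha$, and $\alpha = 0.01$. To add attention, the normalized adjacency matrix of subset $k$, written $A_k$ for short, is multiplied element-wise by a learnable attention matrix $M_k$. The overall expression is shown in Eq. (16):

$$ f_{out} = \sum_{k} w_k \left( f_{in} (A_k \otimes M_k) \right) \tag{16} $$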
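A minimal sketch of the partitioned graph convolution in Eq. (15) follows. It assumes a node-major feature layout (one row per joint, so the adjacency matrix multiplies from the left), a two-subset partition (the joint itself and its neighbors), and hypothetical toy weights; the layout used by actual ST-GCN implementations may differ.

```python
import math


def matmul(X, Y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def normalize(A_j, alpha=0.01):
    """Lambda_j^-1/2 A_j Lambda_j^-1/2 with Lambda_j^ii = sum_k A_j^ik + alpha."""
    lam = [sum(row) + alpha for row in A_j]
    n = len(A_j)
    return [[A_j[i][k] / math.sqrt(lam[i] * lam[k]) for k in range(n)] for i in range(n)]


def partitioned_graph_conv(subsets, f_in, weights, alpha=0.01):
    """Eq. (15): sum over subsets j of normalized A_j times f_in times w_j."""
    n, c_out = len(f_in), len(weights[0][0])
    out = [[0.0] * c_out for _ in range(n)]
    for A_j, w_j in zip(subsets, weights):
        term = matmul(matmul(normalize(A_j, alpha), f_in), w_j)
        for i in range(n):
            for c in range(c_out):
                out[i][c] += term[i][c]
    return out


# Two joints connected by one bone; the subsets sum to A + I.
A_root = [[1, 0], [0, 1]]          # subset 0: the joint itself
A_neigh = [[0, 1], [1, 0]]         # subset 1: its direct neighbor
f_in = [[1.0], [3.0]]              # one input channel per joint
weights = [[[1.0]], [[1.0]]]       # one 1 x 1 weight matrix per subset
f_out = partitioned_graph_conv([A_root, A_neigh], f_in, weights)
print(f_out)
```

The small constant `alpha` keeps the normalization finite when a subset has an all-zero row, for example a joint with no neighbor in that subset.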
The overall structure of the ST-GCN network is shown in Fig. 7 and is mainly divided into three parts: (1) The input traffic police skeleton data are batch normalized; (2) The processed data pass through nine ST-GCN spatio-temporal units, each composed of a graph convolution and a temporal convolution, which extract the spatial and temporal feature information of traffic police dynamic gestures respectively, with ResNet-style residual connections between the ST-GCN units; (3) The output spatio-temporal features pass through the average pooling layer, the fully connected layer, and the Softmax function in turn to classify the dynamic gestures of the traffic police.
Fig. 7 ST-GCN network structure
3 Method

The attention matrix in the classical ST-GCN model uses the connection relationships between keypoints in the human skeleton topology to update the associated weight parameters through training, so as to strengthen the attention paid to important information in the action recognition task. However, as shown in Eq. (16), the adjacency matrix $A_k$ represents the natural connections of human joints: connected joints are represented by 1 in the matrix and unconnected joints by 0. When the attention matrix $M_k$ is multiplied element-wise with the adjacency matrix $A_k$, the output matrix only expresses the strength of the natural joint connections, and $M_k$ cannot assign weights to joints that are not originally connected, which causes the network to ignore some features. In traffic police gesture recognition, the semantic information of instructions is mostly reflected in the arm and head joints. Although there is no direct node connection between the two hands, their movements are closely related. In addition, a video includes not only the iconic gestures commanded by the traffic police but also the transition actions between gestures, so the temporal convolution also requires different levels of attention. To address these problems, this section integrates a spatial attention mechanism and a temporal attention mechanism into the ST-GCN model to improve its performance in the traffic police gesture recognition task.
3.1 Spatial Attention Mechanism

In the spatial dimension, the characteristics of the traffic police are represented by the coordinates of the skeletal keypoints in a single frame. To enable the attention mechanism to focus on keypoint connections outside the skeletal topology of the traffic police, Eq. (16) is modified as follows:

$$ F = \sum_{k}^{K_v} w_k \left( f (A_k + B_k) \otimes M_k \right) \tag{17} $$
Fig. 8 Spatial attention mechanism
where $F$ is the output, $f$ is the input feature, and $B_k$ is the $N \times N$ graph correlation matrix with learnable weight parameters. Unlike the adjacency matrix $A_k$, whose entries are fixed, the entries of $B_k$ are learned entirely from the traffic police gesture training data and express the correlation between keypoint $x_{ti}$ and keypoint $x_{tj}$ in the same frame $t$. For the input keypoint feature $f(x_{ti})$, the feature vectors $P_{ti}$ and $Q_{ti}$ are calculated by a convolutional network, as shown in Eq. (18):

$$ P_{ti} = W_P f(x_{ti}), \qquad Q_{ti} = W_Q f(x_{ti}) \tag{18} $$

where $W_P$ and $W_Q$ are the weights of the two embeddings. The inner product of the two embedded features gives the corresponding element of the graph correlation matrix. Summing the adjacency matrix and the graph correlation matrix can produce joint connections that do not exist in the natural skeleton. The summed matrix is normalized by the Softmax function and multiplied by the weight matrix $M_k$ to dynamically adjust the connections between keypoints. The spatial attention mechanism architecture is shown in Fig. 8.
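The construction of the attention-adjusted adjacency matrix can be sketched as follows. Several details are assumptions made only for this illustration: the Softmax is applied row-wise, the embeddings are plain linear maps instead of convolutions, and the three-joint graph, features, and weights are hypothetical.

```python
import math


def softmax_rows(X):
    """Row-wise softmax with the usual max-subtraction for stability."""
    out = []
    for row in X:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        out.append([v / s for v in e])
    return out


def linear(W, x):
    """Apply a weight matrix W (list of rows) to a feature vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def attention_adjacency(A, feats, W_P, W_Q, M):
    """Softmax(A + B) multiplied element-wise by M, with B[i][j] = <P_i, Q_j>."""
    P = [linear(W_P, f) for f in feats]     # Eq. (18), first embedding
    Q = [linear(W_Q, f) for f in feats]     # Eq. (18), second embedding
    n = len(A)
    B = [[sum(p * q for p, q in zip(P[i], Q[j])) for j in range(n)] for i in range(n)]
    S = softmax_rows([[A[i][j] + B[i][j] for j in range(n)] for i in range(n)])
    return [[S[i][j] * M[i][j] for j in range(n)] for i in range(n)]


# Joints 0 and 2 (e.g. the two hands) share no bone but have similar features.
A = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
M = [[1.0] * 3 for _ in range(3)]
adj = attention_adjacency(A, feats, identity, identity, M)
print(adj[0])
```

Although `A[0][2]` is 0, the learned correlation gives the pair a nonzero weight, which is the effect the spatial attention mechanism is meant to achieve.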
3.2 Temporal Attention Mechanism

In the temporal dimension, the keypoint connections described by different time frames also require different levels of attention. The temporal attention mechanism assigns different importance to the temporal channels of the video. The model is shown in Fig. 9. The input $F$ of size $W \times H \times C$ is transformed into a $1 \times 1 \times C$ matrix $Z$ by global average pooling $F_{GAP}$. The elements $Z_c$ of this matrix can be expressed as:

$$ Z_c = \frac{1}{W \times H} \sum_{i=1}^{H} \sum_{j=1}^{W} F_c(i, j) \tag{19} $$

where $F_c(i, j)$ denotes an element of the two-dimensional matrix of channel $c$ in the input $F$. The attention operation is then performed to generate the attention matrix $S$, as shown in Eq. (20):

$$ S = \sigma(W_2 \, \delta(W_1 Z)) \tag{20} $$

where $W_1$ and $W_2$ are the weight matrices of fully connected operations, $\delta(\cdot)$ is the ReLU function, and $\sigma(\cdot)$ is the Sigmoid function. The matrix $S$ is multiplied with the input matrix $F$ to obtain the weighted information, and the residual of the matrix $F$ is added to obtain the output matrix $\tilde{F}$, which carries the temporal attention weight information.

Fig. 9 Temporal attention mechanism
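Eqs. (19)–(20) and the residual reweighting can be traced on a toy input. The sketch below stores $F$ as a list of $C$ channels, each an $H \times W$ grid, and uses hypothetical weight matrices; it illustrates the computation only and is not the trained module.

```python
import math


def temporal_attention(F, W1, W2):
    """Global average pooling, two fully connected maps, then residual reweighting."""
    # Eq. (19): one average per channel.
    Z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in F]
    # Eq. (20): S = sigmoid(W2 * relu(W1 * Z)).
    hidden = [max(0.0, sum(w * z for w, z in zip(row, Z))) for row in W1]
    S = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in W2]
    # Channel-wise scaling plus the residual of F.
    out = [[[s * v + v for v in row] for row in ch] for s, ch in zip(S, F)]
    return S, out


# Two channels of size 2 x 2 and a bottleneck of width 1 (hypothetical weights).
F = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[2.0, 2.0], [2.0, 2.0]],
]
W1 = [[1.0, 0.0]]          # C = 2 -> 1 hidden unit
W2 = [[0.0], [1.0]]        # 1 hidden unit -> C = 2
S, F_tilde = temporal_attention(F, W1, W2)
print(S)
```

Because of the residual term, a channel with attention weight $s$ is scaled by $1 + s$, so no channel is suppressed entirely.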
3.3 ST-GCN Dynamic Gesture Recognition of Traffic Police Based on Spatio-temporal Attention Mechanism

The spatial and temporal attention mechanisms make the model pay more attention to valuable information. In this section, the attention mechanisms are trained separately, and the training results are integrated into the ST-GCN model after adjustment. The graph attention matrix obtained from spatial attention training participates, together with the spatial graph convolution module, in the convolution operation that extracts the spatial features of the keypoint sequence of the traffic police skeleton. The temporal graph convolution is multiplied by the temporal attention weights to obtain the change characteristics of the traffic police keypoints in the time dimension. Nine ST-GCN spatio-temporal units are joined by residual connections to improve the robustness of the model. ST-GCN with the spatio-temporal attention mechanism represents the input and output data of traffic police gesture recognition as a tensor [B, C, T, V, M], where B is the training batch size, C is the number of keypoint features, T is the number of video key frames, V is the number of traffic police skeleton keypoints, and M is the number of traffic police. Table 1 lists the parameter configuration of each layer of the model.

Table 1 Spatio-temporal attention ST-GCN parameter configuration

Network layer         Number of joints   Input parameters        Output parameters
Batch normalization   18                 (32, 3, 150, 18, 1)     (32, 3, 150, 18)
ST-GCN1               18                 (32, 3, 150, 18)        (32, 64, 150, 18)
ST-GCN2, 3            18                 (32, 64, 150, 18)       (32, 128, 150, 18)
ST-GCN4               18                 (32, 128, 150, 18)      (32, 128, 75, 18)
ST-GCN5, 6            18                 (32, 128, 75, 18)       (32, 128, 75, 18)
ST-GCN7               18                 (32, 128, 75, 18)       (32, 256, 38, 18)
ST-GCN8, 9            18                 (32, 256, 38, 18)       (32, 256, 1, 1)
Softmax               –                  (32, 256, 1, 1)         (32, 8, 1, 1)
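The temporal lengths in Table 1 (150, 75, 38) are consistent with the standard convolution output-length formula. The sketch below traces the shapes through the nine units under assumed settings that the text does not state: a temporal kernel of 9, padding of 4, and a stride of 2 in units 4 and 7. The per-unit output channels are chosen to match the channel counts that appear in Table 1.

```python
def conv_out_len(t, kernel=9, stride=1, padding=4):
    """Standard output length of a 1D convolution along the time axis."""
    return (t + 2 * padding - kernel) // stride + 1


# (output channels, temporal stride) for the nine units; values are assumptions.
UNITS = [
    (64, 1), (128, 1), (128, 1),
    (128, 2), (128, 1), (128, 1),
    (256, 2), (256, 1), (256, 1),
]


def trace_shapes(batch=32, frames=150, joints=18):
    """Return the (B, C, T, V) output shape after each ST-GCN unit."""
    shapes = []
    t = frames
    for out_channels, stride in UNITS:
        t = conv_out_len(t, stride=stride)
        shapes.append((batch, out_channels, t, joints))
    return shapes


shapes = trace_shapes()
for index, shape in enumerate(shapes, start=1):
    print(f"ST-GCN{index}: {shape}")
```

Halving 75 frames gives 38 rather than 37 because the padded convolution rounds the odd length up.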
4 Experiment and Analysis

4.1 Dataset Preprocessing

Before training, the traffic police dynamic gesture instructions need to be preprocessed. Videos are cut according to the gesture classification, and similar gestures are spliced into videos of about 5 minutes, with 25 videos for each type of gesture. The frame rate is fixed at 15, and the video perspective does not change with the position of the traffic police. The resolution of each captured video frame is set to 640 × 640. The numbers 0–7 correspond to the 8 types of traffic instructions, and gestures other than traffic instructions are represented by the number 8. Every frame in the video is labeled manually, and a .csv file containing the frame number, the coordinates of the 18 skeletal keypoints, and the category is finally generated. Part of the dataset is shown in Table 2, where each row represents one frame. A total of 45,398 frames are collected, and the ratio of training to test data is 7:3.
Table 2 Partial head keypoint dataset

Frame   Nose x   Nose y   REye x   REye y   LEye x   LEye y   REar x   REar y   LEar x   LEar y   Class
0       0.38     0.26     0.37     0.23     0.39     0.23     0.35     0.26     0.41     0.26     0
0       0.38     0.26     0.37     0.23     0.39     0.23     0.35     0.26     0.41     0.26     0
2       0.37     0.26     0.36     0.23     0.39     0.23     0.35     0.26     0.41     0.26     0
4.2 Experimental Parameters and Evaluation Indicators

In the traffic police dynamic gesture recognition experiment, the number of training epochs was set to 100, the batch size to 32, the number of traffic police gesture classes to 8, and the initial learning rate to 0.0001, and the Adam optimizer was used with its default settings. The experiment uses the categorical cross-entropy loss function and the Softmax activation function in Keras to calculate the loss, which is expressed as Eq. (21):

$$ Loss = -\frac{1}{N} \sum_{i} \sum_{n=1}^{N} y_{in} \cdot \log(p_{in}) \tag{21} $$

where $y_{in}$ indicates whether sample $i$ belongs to class $n$ and $p_{in}$ is the predicted probability of that class.
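For illustration, the categorical cross-entropy can be computed by hand on one-hot labels. This sketch averages over the number of samples, which is the usual convention, and uses hypothetical probabilities; it does not call Keras.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)
    e = [math.exp(v - m) for v in logits]
    s = sum(e)
    return [v / s for v in e]


def categorical_cross_entropy(y_true, probs):
    """Mean over samples of -sum_n y_in * log(p_in) for one-hot labels."""
    total = 0.0
    for labels, p in zip(y_true, probs):
        total -= sum(y * math.log(q) for y, q in zip(labels, p) if y > 0)
    return total / len(y_true)


# Two samples, two classes (hypothetical predictions).
y_true = [[1, 0], [0, 1]]
probs = [[0.5, 0.5], [0.25, 0.75]]
ce = categorical_cross_entropy(y_true, probs)
uniform = softmax([0.0, 0.0])
print(ce, uniform)
```

Only the probability assigned to the true class enters the loss, so a confident correct prediction (0.75) contributes less than an undecided one (0.5).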
The model uses Precision (P), Recall (R), the F-Measure score, and other indicators to evaluate the dynamic gesture recognition effect for traffic police, calculated as shown in Eq. (22):

$$ P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F = \frac{(1 + \alpha^2)\, P \times R}{\alpha^2 P + R} \tag{22} $$

With $\alpha = 1$, $F$ reduces to the F1 score $2PR / (P + R)$.
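The evaluation indicators can be checked numerically with the standard F-measure definition. The example below reuses the precision and recall reported for the Stop gesture in Table 4 (85.00% and 83.33%); the true-positive and false-positive counts in the first call are hypothetical.

```python
def precision(tp, fp):
    """Fraction of predicted positives that are correct."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual positives that are found."""
    return tp / (tp + fn)


def f_measure(p, r, alpha=1.0):
    """Standard F-measure; alpha = 1 gives the F1 score 2PR / (P + R)."""
    return (1 + alpha ** 2) * p * r / (alpha ** 2 * p + r)


p_example = precision(tp=85, fp=15)            # hypothetical counts -> 0.85
f1_stop = f_measure(0.85, 0.8333)              # Stop row of Table 4
print(round(100 * f1_stop, 2))                 # 84.16
```

The result matches the F1 score of 84.16% listed for the Stop gesture, which confirms that the reported scores use $\alpha = 1$.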
The loss and accuracy curves of the trained model are shown in Fig. 10. When the epoch is below 20, the loss curves of the training set and the test set fall rapidly and the accuracy curve rises rapidly, indicating that the model is effectively learning the dynamic gesture features of the traffic police. Between epochs 20 and 80, the changes in the loss and accuracy curves gradually level off, indicating that the model's ability to learn traffic police gesture features is approaching saturation. At epoch 100, the loss tends to 0.3 and the classification accuracy is close to 0.87; the network has reached saturation for traffic police gesture recognition, and training ends. The recognition and classification of each traffic police gesture by the trained model can be presented visually by the confusion matrix shown in Fig. 11. The rows of the confusion matrix represent the true label of the gesture category, the columns represent the predicted label, and the percentage in each cell is the proportion of gesture n that is predicted as gesture m.
Fig. 10 Model training loss accuracy

Fig. 11 Confusion matrix for traffic police gesture classification

4.3 Experimental Result

To further verify the effectiveness of the trained attention model, the attention confidence of the model is visualized as a heat map; Fig. 12 visualizes the effect of the spatial attention mechanism. Figure 12a shows the initialized adjacency matrix of the skeletal keypoints, which represents the natural connection graph of the traffic police skeleton. The diagonal elements represent the keypoints themselves, and the off-diagonal elements encode the connections between keypoints in grayscale: white indicates connected joints and black indicates no connection. Figure 12b shows the trained attention matrix, in which the connection strength of each joint is updated according to the movement of the traffic police gesture and is represented by the gray level. The confidence of spatial attention is reflected in the joint areas of the traffic police, and the output results are shown in Fig. 12c, d. Figure 12c is the attention heat map of the stop instruction, and Fig. 12d is the attention heat map of the lane change instruction. The confidence is represented by the color depth, and the blue area represents higher confidence. It can be seen from Fig. 12 that, among the 18 skeletal keypoints of the traffic police, the confidence of the keypoints and joints of both arms is higher than that of the head and legs. This indicates that the model pays more attention to learning the motion characteristics of both arms, which is consistent with the importance of the semantic information expressed by each joint when the traffic police give commands, and verifies the effectiveness of the spatial attention mechanism.

Fig. 12 Schematic of the spatial attention: (a) Initialize the adjacency matrix; (b) Train the attention map matrix; (c) Attention heat map of Stop; (d) Attention heat map of Change Lanes
Fig. 13 Temporal attention prediction confidence: (a) Single instruction (pull over); (b) Continuous instruction (stop + go straight)

Table 3 Comparison of different model results

Model                                   Average F1 score (%)   FPS
ST-GCN                                  83.03                  21
AS-GCN                                  85.43                  17
ST-GCN with spatio-temporal attention   88.82                  17
The confidence of the temporal attention mechanism on the video stream is shown in Fig. 13. Figure 13a shows the sampled video frames of the pull over instruction, and Fig. 13b shows the continuously sampled video frames of the stop instruction followed by the go straight instruction. The confidence axis at the bottom shows, through the color depth, the importance that the attention mechanism assigns to the current video frame; dark red marks the key frames that need to be learned most. The red areas of the confidence axis are concentrated in the video frames with large changes in the traffic police skeleton keypoints, which contain rich semantic information and contribute most to the traffic police gesture recognition task.

The experiment compares the average F1 score and FPS of three network models, ST-GCN, AS-GCN, and ST-GCN with the spatio-temporal attention mechanism, on the traffic police gesture dataset; the results are shown in Table 3. The ST-GCN model integrating the spatio-temporal attention mechanism achieves the highest average F1 score, 5.79 percentage points higher than that of the original ST-GCN. The performance on the individual traffic police gesture categories is shown in Table 4. The average precision is 88.75%, the average recall is 89.38%, the average accuracy is 96.95%, and the average F1 score is 88.80%. The right turn instruction has the highest F1 score, 95.57%. The recognition accuracy of the deceleration command is low. Combined with the confusion matrix, it can be seen that the FN of the deceleration instruction is 0.32, which leads to a low recall rate; that is, the model easily misjudges the deceleration instruction as the stop instruction or the lane change instruction.

Table 4 Gesture recognition accuracy of various types of traffic police

Gesture categories   Gesture frames   Precision (%)   Recall (%)   F1 score (%)
Stop                 4948             85.00           83.33        84.16
Go straight          5901             94.00           96.91        95.43
Turn left            5795             89.00           90.82        89.90
Turn left to stay    5575             91.00           89.22        90.10
Turn right           5701             97.00           94.17        95.57
Change lanes         5815             84.00           92.31        87.96
Slow down            5823             93.00           74.40        82.67
Pull over            5848             77.00           93.90        84.62

In addition, the experiment evaluated the model on traffic police gesture recognition in different scenarios, and the results are shown in Table 5. Because the indoor environment is free of external interference such as a cluttered background, the model obtains an F1 score of 94.31% there, and the recognition results are shown in Fig. 14a–c. In the outdoor bright light environment, shown in Fig. 14g–i, the model achieves an F1 score of 88.66%, a considerable recognition effect. The road environment, by contrast, yields the lowest F1 score, 82.61%, because of interference factors such as passing vehicles, pedestrians, and the complex background behind the traffic police, as shown in Fig. 14d–f; the model is prone to misjudgments and missed detections in this scene. In other scenes such as parks and streets, the F1 score reaches 84.91%. The average over the four scenes is 87.62%, which verifies the effectiveness of the traffic police gesture recognition model.

Table 5 Gesture recognition accuracy of traffic police in different scenarios

Scene                  Gesture frames   Precision (%)   Recall (%)   F1 score (%)
Indoors                11,848           95.18           93.45        94.31
Road                   10,364           76.00           90.48        82.61
Outdoor bright light   9875             86.00           91.49        88.66
Park, street           13,307           89.76           80.55        84.91
Fig. 14 Traffic police hand gesture recognition visualization results
5 Conclusion

In this paper, we propose an ST-GCN network model with a spatio-temporal attention mechanism for the dynamic gesture recognition of traffic police. The attention mechanism updates the weight parameters in the time dimension and the space dimension separately, so as to strengthen the attention paid to the spatio-temporal characteristics of traffic police dynamic gestures. We preprocess the traffic police dataset, verify the effectiveness of the attention mechanism through the spatio-temporal attention confidence, and evaluate the model on different gestures and in different scenarios. Compared with the original ST-GCN, the model obtains higher recognition accuracy, which verifies the good performance of the ST-GCN algorithm with the spatio-temporal attention mechanism on traffic police dynamic gesture recognition.
Acknowledgements This work was supported by National Natural Science Foundation of China under Grant 62263005, Guangxi Natural Science Foundation under Grant 2020GXNSFDA238029, Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region under Grant 2022GXZDSY004, Innovation Project of Guangxi Graduate Education YCSW2023298, Innovation Project of GUET Graduate Education under Grant 2023YCXS124.
References 1. He, J., Zhang, C., He, X., Dong, R.: Visual recognition of traffic police gestures with convolutional pose machine and handcrafted features. Neurocomputing 390, 248–259 (2020). https:// doi.org/10.1016/j.neucom.2019.07.103 2. Wei, S.-E., Ramakrishna, V., Kanade, T., Sheikh, Y.: Convolutional pose machines. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4724–4732 (2016) 3. Newell, A., Yang, K., Deng, J.: Stacked hourglass networks for human pose estimation. In: Computer Vision-ECCV: 14th European Conference, Amsterdam, The Netherlands, 11–14 Oct 2016, Proceedings, Part VIII 14. Springer, pp. 483–499 (2016) 4. He, K., Gkioxari, G., Dollár, P., Girshick, R.: Mask R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2961–2969 5. Chen, Y., Wang, Z., Peng, Y., Zhang, Z., Yu, G., Sun, J.: Cascaded pyramid network for multiperson pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7103–7112 (2018) 6. Xiao, B., Wu, H., Wei, Y.: Simple baselines for human pose estimation and tracking. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 466–481 (2018) 7. Sun, K., Xiao, B., Liu, D., Wang, J.: Deep high-resolution representation learning for human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5693–5703 (2019) 8. Pishchulin, L., Insafutdinov, E., Tang, S., Andres, B., Andriluka, M., Gehler, P.V., Schiele, B.: Deepcut: Joint subset partition and labeling for multi person pose estimation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4929–4937 (2016) 9. Cao, Z., Simon, T., Wei, S.-E., Sheikh, Y.: Realtime multi-person 2d pose estimation using part affinity fields. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7291–7299 (2017) 10. 
Cheng, B., Xiao, B., Wang, J., Shi, H., Huang, T.S., Zhang, L.: Higherhrnet: scale-aware representation learning for bottom-up human pose estimation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5386–5395 (2020) 11. Geng, Z., Sun, K., Xiao, B., Zhang, Z., Wang, J.: Bottom-up human pose estimation via disentangled keypoint regression. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognitionpp. 14676–14686 (2021) 12. Simonyan, K., Zisserman, A.: Two-stream convolutional networks for action recognition in videos. In: Advances in Neural Information Processing Systems, vol. 27 (2014) 13. Berlin, S.J., John, M.: Spiking neural network based on joint entropy of optical flow features for human action recognition. Vis. Comput. 38(1), 223–237 (2022) 14. Si, C., Chen, W., Wang, W., Wang, L., Tan, T.: An attention enhanced graph convolutional LSTM network for skeleton-based action recognition. In: proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1227–1236 (2019) 15. Yang, H., Gu, Y., Zhu, J., Hu, K., Zhang, X.: PGCN-TCA: pseudo graph convolutional network with temporal and channel-wise attention for skeleton-based action recognition. IEEE Access 8, 10040–10047 (2020). https://doi.org/10.1109/ACCESS.2020.2964115
16. Tran, D., Bourdev, L., Fergus, R., Torresani, L., Paluri, M.: Learning spatiotemporal features with 3d convolutional networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4489–4497 (2015) 17. Wan, Y., Yu, Z., Wang, Y., Li, X.: Action recognition based on two-stream convolutional networks with long-short-term spatiotemporal features. IEEE Access 8, 85284–85293 (2020). https://doi.org/10.1109/ACCESS.2020.2993227 18. Yan, S., Xiong, Y., Lin, D.: Spatial temporal graph convolutional networks for skeleton-based action recognition. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1 (2018). https://doi.org/10.1609/aaai.v32i1.12328
Radar-Based 3D Skeleton Estimation Enhanced with Joint Temporal-Spatial Constraints Guangyu Mei, Zhongping Cao, Guoli Wang, and Xuemei Guo
Abstract Radar-based human pose estimation suffers from anatomically incorrect poses within frames and unstable estimates across multiple frames, primarily because radar signals are sensitive only to the radial component of motion. To overcome this, we propose a two-stream model that incorporates spatio-temporal constraints into the learning network. The spatial stream network captures the physical connections among joints, ensuring anatomical consistency within frames. Meanwhile, the temporal stream network learns action-specific synergies in motion, capturing the temporal dependencies among joints. Fusing the two streams enables accurate and stable 3D skeleton estimation. Experiments demonstrate the effectiveness of our method. Keywords Millimeter wave radars · Human pose estimation · Deep learning method · Kinematic constraints
1 Introduction In recent years, 3D human skeleton estimation has gained significant attention because it provides detailed structural information about the human body, enabling a wide range of applications in human behavior perception [1]. This paper focuses on radar-based human pose estimation. Researchers typically preprocess radar reflection signals into spectrograms and employ neural networks to extract features and estimate the coordinates of human joints [2]. However, radar signals are sensitive only to the radial component of motion, which introduces inherent ambiguity into the learning process for pose estimation. This ambiguity arises when different movements produce similar reflection signals, such as punches of the same magnitude thrown at different heights, resulting in radar spectrograms that are difficult to distinguish. Neural networks then struggle to differentiate between similar spectrogram sequences during feature extraction, which leads to inaccurate predictions. The network may assign different outputs to nearly indistinguishable inputs, producing anatomically incorrect poses within a single frame and unstable pose estimates across multiple frames [3]. It is therefore indispensable to embed prior knowledge into the neural network.
Fig. 1 The temporal-spatial constraints represented in the physical topology and logical topology of the skeleton tree: the joint dependence in the logical topology is temporally activity-specific, e.g. the joints of the hands follow the movements of the joints of the feet while walking
To mitigate the problem of anatomically unrealistic results in radar-based pose estimation, we consider the physical connections among joints within the human skeleton tree, so that the estimated joints adhere to the spatial constraints of the skeleton. The position of each joint is inferred from the position of its parent joint, which keeps the reconstructed skeleton consistent with the corresponding spatial constraints and embeds the knowledge of spatial relationships into the network.
To address the challenge of unstable poses across multiple frames, we model the interdependencies among key movements in human motion. As depicted in Fig. 1, each joint in the skeleton tree not only has physical connections but also exhibits a temporal logical dependence associated with movement. These logical dependencies among the joints are not as constant as the physical connections; they are highly contingent on specific activities and contain more abstract temporal information such as body balance and movement co-occurrence [4, 5]. By embedding the knowledge of temporal relationships into the learning model, the network amplifies the differences between motion patterns, making originally indistinguishable features discernible and enhancing the stability of the pose estimator.
Specifically, we propose a two-stream network model for radar-based 3D human pose estimation. The spatial stream network captures the physical connections among joints, ensuring anatomical consistency within frames. Meanwhile, the temporal stream network learns action-specific synergies in motion, capturing the temporal dependencies among joints. We also present a fusion module that dynamically learns a more robust skeleton representation by reducing the redundant kinematic constraints of the two streams. We test the proposed learning paradigm on datasets collected by a commercial millimeter wave radar. Experimental results validate the efficacy of our proposed method.
G. Mei · Z. Cao · G. Wang (B) · X. Guo School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_32
2 Methods 2.1 Overview As discussed earlier, radar-based 3D human pose estimation using spectrograms encounters challenges related to ambiguity and instability. To address these issues, we propose a two-stream network model, illustrated in Fig. 2. The model consists of a spatial stream, which focuses on learning the spatial constraints of joints, and a temporal stream, responsible for modeling the temporal constraints of joints through a joint update module. Furthermore, we introduce a fusion module to dynamically learn a more robust skeleton representation by eliminating redundant kinematic constraints from both streams. We adopt 2D range-Doppler spectrum data as input for the skeleton estimation model, which is obtained by preprocessing the signals received by the millimeter-wave radar [6, 7].
Fig. 2 The overview of the joint temporal-spatial kinematic constrained model: two-stream architecture for integrating the spatial and temporal kinematic constraints in a complementary fashion
2.2 Spatial Stream Network To learn the spatial constraints among joints, the spatial stream network incorporates forward kinematics into a deep neural network, exploiting the physical connections of the human skeleton to obtain the human pose under spatial constraints in an iterative fashion. Specifically, in the forward kinematics (FK) model, the 3D position of a joint is generated from the position of its parent joint together with the rotation angle and length of the limb connecting the two joints [8]. For example, in Fig. 1, when we only consider the movement of the right arm, the right shoulder (i.e., joint 14) is the parent joint of the right elbow (i.e., joint 15), which is in turn the parent of the right hand (i.e., joint 16). Thus, the arm movement can be decomposed by first rotating the elbow with respect to the shoulder, and then rotating the hand with respect to the elbow [9]. In other words, the position of the hand can be calculated from the length of the arm and the 3D rotation angle. Therefore, given the rotation of the joints and an initial skeleton, each joint can be localized recursively, and the reconstructed human posture naturally satisfies the spatial constraints of the skeleton. Mathematically, the FK model can be formulated as follows:
$$y_i = y_{parent(i)} + R_i \left( \bar{y}_i - \bar{y}_{parent(i)} \right), \tag{1}$$
where $y_i \in \mathbb{R}^3$ is the 3D coordinate of joint $i$, $y_{parent(i)} \in \mathbb{R}^3$ is the parent joint of $y_i$ on the skeleton tree, and $\bar{y}_i$, $\bar{y}_{parent(i)}$ are the initial positions of $y_i$ and $y_{parent(i)}$, respectively. $R_i$ represents the rotation of the joint $y_i$ with respect to its parent. To this end, as shown in Fig. 1, in our spatial stream network, both the initial skeleton and the joint rotations need to be learned. For the initial skeleton, we use the range-Doppler spectrum data to learn the skeleton information with spatial information through a CNN and one fully connected layer. For joint rotation, we employ a CNN-LSTM, i.e., a CNN-based spatial module followed by an LSTM-based temporal module, to extract the spatio-temporal features of the range-Doppler spectrum and learn the 3D rotation matrix of each joint. After obtaining the initial skeleton and each joint's rotation, starting from the root joint (i.e., the hip center, joint 0 in the skeleton tree), the location of each joint can be recursively calculated through the FK model according to the parent-child relationships defined by the skeleton tree:
$$Y_{sp} = FK\left( F_{rot}(X), F_{init}(X) \right), \tag{2}$$
where $X$ is the input range-Doppler map (RDM) sequence, $F_{rot}(\cdot)$ is the learning process that obtains the 3D joint rotation matrices through the CNN-LSTM, $F_{init}(\cdot)$ is the process that obtains the initial skeleton through the CNN, and $Y_{sp} \in \mathbb{R}^{J \times 3}$ contains the joint positions, where $J$ is the number of joints. Through the above process, we obtain the joint positions $Y_{sp}$ output by the spatial stream network. More details can be found in [10].
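The recursion in Eq. (1) can be illustrated with a short, dependency-free Python sketch. It takes Eq. (1) literally, so each rotation $R_i$ is applied directly to the initial bone vector rather than being composed along the chain. The names `forward_kinematics`, `mat_vec` and `parents` are hypothetical, and the sketch assumes that joints are ordered so that every parent precedes its children and that the root stays at its initial position.

```python
from typing import List, Sequence

Vec3 = List[float]
Mat3 = List[List[float]]


def mat_vec(R: Mat3, v: Sequence[float]) -> Vec3:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]


def forward_kinematics(parents: Sequence[int],
                       init_pose: Sequence[Vec3],
                       rotations: Sequence[Mat3]) -> List[Vec3]:
    """Apply Eq. (1) joint by joint, starting from the root.

    parents[i] is the index of the parent of joint i (-1 for the root).
    Assumption of this sketch: a parent always precedes its children.
    """
    pose: List[Vec3] = []
    for i, p in enumerate(parents):
        if p < 0:
            # Root joint: kept at its initial position (assumption).
            pose.append(list(init_pose[i]))
            continue
        # Initial bone vector from the parent to joint i.
        bone = [init_pose[i][k] - init_pose[p][k] for k in range(3)]
        rotated = mat_vec(rotations[i], bone)
        # y_i = y_parent(i) + R_i (ybar_i - ybar_parent(i))
        pose.append([pose[p][k] + rotated[k] for k in range(3)])
    return pose


# Toy chain shoulder -> elbow -> hand, with the upper arm rotated 90 degrees about z.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
ROT_Z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
arm = forward_kinematics(
    parents=[-1, 0, 1],
    init_pose=[[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]],
    rotations=[IDENTITY, ROT_Z_90, IDENTITY],
)
```

Because every joint is placed relative to its already computed parent, bone lengths from the initial skeleton are preserved by construction, which is the spatial constraint the text refers to.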
2.3 Temporal Stream Network Although the FK model makes full use of the spatial constraints among skeletal joints, it ignores the temporal constraints that may exist between the joints during movement. Therefore, our temporal stream network learns the temporal constraints among skeletal joints through the joint update module to supplement the action-specific dynamic information, thereby reducing the joint estimation error for motions with strong temporal correlation. In this module, we first apply the CNN-LSTM to learn the features of each joint from the range-Doppler spectrogram, which can be formulated as:
$$H = G(X), \tag{3}$$
where $G(\cdot)$ represents the process of feature extraction, using a CNN module and an LSTM layer followed by a fully connected layer, and $H \in \mathbb{R}^{J \times F}$ is the feature matrix of all joints, where $F$ represents the feature dimension. Then, we use the relationship between the joints to update the features of each joint. By forcing all the related joints to learn the correlation between each other, we can extract useful information from related joints to enhance the feature learning of the current joint. In this way, the updated skeletal joint features can learn the temporal constraints among the joints in the skeleton tree. The updating process is defined as:
$$\hat{F}_j = gate(H_j) + \sum_{k=0,\, k \neq j}^{J-1} \left( \lambda_k * g([H_j, C_k]) \right), \tag{4}$$
where
$$C_k = \begin{cases} \hat{F}_k, & \text{if } k \text{ updated}, \\ H_k, & \text{otherwise}, \end{cases} \tag{5}$$
where $H_j$ and $\hat{F}_j$ represent the original feature and the updated feature of the $j$-th joint, respectively, $g(\cdot)$ is the fully connected layer that learns the temporal constraints between two joint features, $\lambda_k$ is a trainable parameter, and $gate(\cdot)$ is a function designed to retain useful information of the original joint features before updating, implemented by two fully connected layers. In addition, the joint update module is carried out iteratively: the feature of the next joint in the sequence is learned with the previously updated joint features. If a joint has not yet been updated, its original feature is used to learn the association between the two joints. That is, similar to information transmission, the latest information is transmitted to each node. Finally, a fully connected layer is used to project the updated joint features to the coordinate space of the human skeleton:
$$Y_{tem} = FC(\hat{F}), \tag{6}$$
where $Y_{tem}$ represents the reconstructed human skeleton output by the temporal stream network.
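The sequential update of Eqs. (4) and (5) can be sketched in plain Python as follows. This is only an illustration of the control flow: in the paper, $g(\cdot)$ and $gate(\cdot)$ are learned fully connected layers, whereas here they are passed in as arbitrary callables. The function name `joint_update` and the choice to visit joints in index order are assumptions of the sketch.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def joint_update(H: Sequence[Vector],
                 lambdas: Sequence[float],
                 g: Callable[[Vector], Vector],
                 gate: Callable[[Vector], Vector]) -> List[Vector]:
    """Sequentially update joint features following Eqs. (4) and (5)."""
    J = len(H)
    F_hat: List[Vector] = [list(h) for h in H]  # placeholders until updated
    updated = [False] * J
    for j in range(J):  # assumed update order: joint index
        new_feat = gate(list(H[j]))
        for k in range(J):
            if k == j:
                continue
            # Eq. (5): use the updated feature of joint k if it exists.
            C_k = F_hat[k] if updated[k] else H[k]
            # g acts on the concatenation [H_j, C_k].
            pair = g(list(H[j]) + list(C_k))
            new_feat = [a + lambdas[k] * b for a, b in zip(new_feat, pair)]
        F_hat[j] = new_feat
        updated[j] = True
    return F_hat


# Toy example with F = 1: gate is the identity, g sums the two concatenated features.
toy = joint_update(
    H=[[1.0], [2.0]],
    lambdas=[0.5, 0.5],
    g=lambda pair: [pair[0] + pair[1]],
    gate=lambda h: h,
)
```

In the toy run, joint 1 is updated with the already updated feature of joint 0, which is the "latest information is transmitted to each node" behaviour described above.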
2.4 Fusion Model Because the spatial and temporal constraints among skeletal joints are partly redundant, simply adding the outputs of the two streams does not resolve this redundancy. We therefore use trainable weights to combine the two streams adaptively, so that the pose estimates are consistent with the joint temporal-spatial kinematic constraints. The final 3D skeleton estimation is defined as:
$$Y_{pos} = \lambda_l * Y_{sp} + \lambda_f * Y_{tem}, \tag{7}$$
where $\lambda_l$ and $\lambda_f$ are the trainable fusion weights.
2.5 Loss Function To supervise the training of our model, we use four loss functions that make the estimated 3D human skeleton more natural and realistic and keep it as consistent as possible with the ground truth: joint position loss, smooth loss, distance loss and angle loss. The joint position loss $L_{JP}$ calculates the Euclidean distance for each joint. The smooth loss $L_S$ maintains the temporal smoothness of the reconstructed skeleton between consecutive frames. The distance loss $L_{DL}$ and angle loss $L_A$ keep the relative distance and angle of the connected joints consistent with the ground truth, respectively. Therefore, the total loss function is defined as follows:
$$L_{total} = L_{JP} + \alpha L_S + \beta L_{DL} + \gamma L_A, \tag{8}$$
where
$$L_{JP} = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{J} \sum_{k=1}^{J} \left\| \hat{y}_k^t - y_k^t \right\|_2, \tag{9}$$
$$L_S = \frac{1}{T-1} \sum_{t=2}^{T} \frac{1}{J} \sum_{k=1}^{J} \left\| (\hat{y}_k^t - \hat{y}_k^{t-1}) - (y_k^t - y_k^{t-1}) \right\|_H, \tag{10}$$
$$L_{DL} = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{J-1} \sum_{i=2}^{J} \left\| \, \| \hat{y}_i^t - \hat{y}_{p(i)}^t \| - \| y_i^t - y_{p(i)}^t \| \, \right\|_H, \tag{11}$$
$$L_A = \frac{1}{T} \sum_{t=1}^{T} \frac{1}{J-1} \sum_{i=2}^{J} \left( 1 - coSim(\hat{y}_i^t - \hat{y}_{p(i)}^t,\; y_i^t - y_{p(i)}^t) \right), \tag{12}$$
where $\alpha$, $\beta$ and $\gamma$ are hyper-parameters that balance the four loss functions, $\| \cdot \|_H$ is the Huber function, $T$ is the length of the input range-Doppler spectrum sequence, and $J$ represents the number of joints. $y_k^t$ is the true position of the $k$-th joint in the $t$-th frame, $\hat{y}_k^t$ is the corresponding estimated position, and $p(i)$ denotes the parent of joint $i$. $coSim$ is the cosine similarity function.
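As a rough illustration of how the four terms in Eqs. (8) to (12) fit together, the following pure-Python sketch evaluates them on small pose sequences. Several details are assumptions rather than the authors' settings: the Huber threshold `delta = 1.0`, the vector form of $\| \cdot \|_H$ (Huber applied per coordinate and summed), Euclidean bone lengths inside the distance loss, the small `eps` in the cosine similarity, and the placeholder weights `alpha = beta = gamma = 1.0`.

```python
import math
from typing import List, Sequence

Pose = List[List[float]]  # J joints, each [x, y, z]


def huber(x: float, delta: float = 1.0) -> float:
    """Scalar Huber function; delta = 1.0 is an assumed threshold."""
    a = abs(x)
    return 0.5 * a * a if a <= delta else delta * (a - 0.5 * delta)


def sub(u: Sequence[float], v: Sequence[float]) -> List[float]:
    return [a - b for a, b in zip(u, v)]


def norm(u: Sequence[float]) -> float:
    return math.sqrt(sum(a * a for a in u))


def huber_norm(u: Sequence[float]) -> float:
    """Assumed vector form of ||.||_H: Huber per coordinate, then summed."""
    return sum(huber(a) for a in u)


def cosine_similarity(u, v, eps: float = 1e-8) -> float:
    return sum(a * b for a, b in zip(u, v)) / (norm(u) * norm(v) + eps)


def total_loss(pred: Sequence[Pose], gt: Sequence[Pose], parents: Sequence[int],
               alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """Eq. (8) for sequences of T >= 2 frames; parents[i] = -1 marks the root."""
    T, J = len(pred), len(pred[0])
    bones = [(i, p) for i, p in enumerate(parents) if p >= 0]  # J - 1 pairs

    # Eq. (9): mean Euclidean distance per joint.
    l_jp = sum(norm(sub(pred[t][k], gt[t][k]))
               for t in range(T) for k in range(J)) / (T * J)
    # Eq. (10): frame-to-frame displacement should match the ground truth.
    l_s = sum(huber_norm(sub(sub(pred[t][k], pred[t - 1][k]),
                             sub(gt[t][k], gt[t - 1][k])))
              for t in range(1, T) for k in range(J)) / ((T - 1) * J)
    # Eq. (11): bone lengths should match the ground truth.
    l_dl = sum(huber(norm(sub(pred[t][i], pred[t][p])) - norm(sub(gt[t][i], gt[t][p])))
               for t in range(T) for i, p in bones) / (T * len(bones))
    # Eq. (12): bone directions should match the ground truth.
    l_a = sum(1.0 - cosine_similarity(sub(pred[t][i], pred[t][p]),
                                      sub(gt[t][i], gt[t][p]))
              for t in range(T) for i, p in bones) / (T * len(bones))
    return l_jp + alpha * l_s + beta * l_dl + gamma * l_a


# Two frames of a two-joint skeleton; the prediction is shifted by 1 along x.
gt_seq = [[[0.0, 0.0, 0.0], [0.0, 1.0, 0.0]]] * 2
pred_seq = [[[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]] * 2
shift_loss = total_loss(pred_seq, gt_seq, parents=[-1, 0])
```

A rigid shift changes only the position term: motion, bone length and bone direction all agree with the ground truth, so the smooth, distance and angle terms contribute essentially nothing in this example.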
3 Experiments In this section, we perform experimental studies to demonstrate the effectiveness of the proposed model. Firstly, we introduce our collected radar dataset. Secondly, we compare with several advanced methods through experiments on this dataset to highlight the superiority of our proposed method. Finally, ablation experiments are designed to evaluate the effectiveness of each module as well as the robustness of the proposed method.
3.1 Dataset Description We collect radar data for human skeleton estimation with a commercial millimeter wave radar (AWR1843 by Texas Instruments Corp, Dallas, USA), which has 3 Tx antennas and 4 Rx antennas. The beat signals are obtained using the supporting toolkit mmWave Studio. We preprocess the beat signals through a range fast Fourier transform (FFT) and a Doppler FFT to obtain range-Doppler spectrogram data. Correspondingly, we obtain visual data through the Intel RealSense Depth Camera D415. The visual data is used to obtain the ground truth labels corresponding to the radar data with the network proposed in [11]. We use the Network Time Protocol (NTP) to synchronize the radar and the camera. Our dataset currently contains radar data for human skeleton estimation from three environments and four subjects, covering five activities, namely boxing, raising hands, stepping, arm swinging and lifting legs. The three scenes and their layouts are shown in Fig. 3, where the target positions for data collection are marked as areas of about 0.5 m × 0.5 m. The radar is set in front of the human with no occlusion. The four subjects are males, age 24–29 years, body mass 65.2–72.1 kg, and height 170.1–181.9 cm. The training set and test set are divided with a ratio of 8:2 in the experiments. Since radar signal reflection varies across environments and produces different signal interference, comparing performance in various scenes reveals how well the model resists the data incompleteness caused by environmental interference reflections.
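The range FFT and Doppler FFT preprocessing mentioned above can be sketched in a few lines of dependency-free Python. This is a simplified illustration and not the authors' pipeline: it handles a single antenna, applies no windowing or zero-Doppler shift, and uses a small recursive radix-2 FFT so that the example runs without external libraries. The names `fft` and `range_doppler_map` and the `frame[chirp][sample]` layout are assumptions.

```python
import cmath
from typing import List


def fft(x: List[complex]) -> List[complex]:
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def range_doppler_map(frame: List[List[complex]]) -> List[List[float]]:
    """frame[chirp][sample] -> magnitude map indexed [doppler_bin][range_bin]."""
    n_chirps = len(frame)
    # Range FFT: along the fast-time samples of every chirp.
    range_fft = [fft(chirp) for chirp in frame]
    n_range = len(range_fft[0])
    rd = [[0.0] * n_range for _ in range(n_chirps)]
    # Doppler FFT: along slow time (across chirps) for every range bin.
    for r in range(n_range):
        column = fft([range_fft[c][r] for c in range(n_chirps)])
        for d in range(n_chirps):
            rd[d][r] = abs(column[d])
    return rd


# Synthetic beat signal of a single target at range bin 3 and Doppler bin 2.
N_SAMPLES, N_CHIRPS, R0, D0 = 16, 8, 3, 2
beat = [[cmath.exp(2j * cmath.pi * (R0 * s / N_SAMPLES + D0 * c / N_CHIRPS))
         for s in range(N_SAMPLES)] for c in range(N_CHIRPS)]
rd_map = range_doppler_map(beat)
peak = max((rd_map[d][r], d, r) for d in range(N_CHIRPS) for r in range(N_SAMPLES))
```

For the synthetic target, the energy concentrates in a single cell of the map, located at the target's Doppler and range bins; a sequence of such maps is the kind of input the model consumes.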
Fig. 3 The three scenes and layouts for experimental studies. a and d are the photo and layout of Scene 1, respectively; b and e are the photo and layout of Scene 2, respectively; c and f are the photo and layout of Scene 3, respectively. The radar is set in front of the human with no occlusion
3.2 Implementation Details The proposed model is implemented in PyTorch and runs on a computer with an Intel Xeon(R) E5-2696 CPU, four NVIDIA GeForce Titan X GPUs, and 2 × 32 GB of RAM. The input to our model is a 5-frame range-Doppler spectrogram sequence. We employ a CNN module with four 2D convolutional layers, each followed by a BatchNorm layer and a ReLU activation layer. The numbers of channels in the convolutional layers are 16, 32, 64, and 16, respectively, and the kernel size is fixed at 3 with a stride of 1. The first three convolutional layers are also followed by max pooling layers. The model consists of two main streams: the spatial stream and the temporal stream. The spatial stream employs the CNN module to extract spatial features. It is followed by a bidirectional Long Short-Term Memory (Bi-LSTM) network and fully connected layers, which yield the joint rotation matrices. Additionally, a separate CNN module followed by two fully connected layers is used for the initial skeleton. We consider a total of 17 joints. In the temporal stream, after updating the features of all joints, we employ two fully connected layers to project the joint features into the 3D skeleton coordinate space. Finally, in the fusion stage, the joint positions obtained from the spatial and temporal streams are combined to produce the final 3D skeleton estimate.
3.3 Experimental Results We conduct experiments on the collected radar datasets of three environments, and compare the advanced methods with our proposed method. The evaluation metric of the experiment is the mean per joint position error (MPJPE), which is defined as the average Euclidean distance between the estimated joint locations and the ground truths for all the subjects and activities. As shown in Table 1, we compare the average error results of the skeleton estimated by different methods and the ground truth in the test set of Scene 1, Scene 2, and Scene 3. We include several previous mainstream methods, such as mmPose [12] and mPose [13], for comparison with the proposed method. Figure 4 presents the estimation error compared between each human joint, where the joint index corresponds to the key point sequence number marked in Fig. 1.
Table 1 Comparison of skeletal joint estimation errors (unit: mm) in different scenes
Method            Scene 1   Scene 2   Scene 3
CNN-LSTM          31.45     35.23     37.63
Self-attention    29.67     33.67     35.24
Joint-update      28.74     34.52     36.31
mmPose [12]       48.36     41.94     50.11
mPose [13]        39.68     40.80     40.20
Kinematic [10]    29.60     32.15     35.99
Ours              25.97     29.63     32.63
Fig. 4 Comparison of skeletal joint estimation errors of each joint (unit: mm), where the joint index corresponds to the key point number marked in Fig. 1
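The MPJPE metric used in Table 1 and Fig. 4 can be written as a short helper. The sketch below assumes that predictions and ground truths are nested sequences indexed as `[frame][joint][coordinate]` and that both share the same unit (millimetres in the tables); the function name `mpjpe` is chosen for illustration.

```python
import math
from typing import Sequence


def mpjpe(pred: Sequence[Sequence[Sequence[float]]],
          gt: Sequence[Sequence[Sequence[float]]]) -> float:
    """Mean Euclidean distance over all frames and joints (same unit as the input)."""
    total, count = 0.0, 0
    for pose_p, pose_g in zip(pred, gt):
        for joint_p, joint_g in zip(pose_p, pose_g):
            total += math.dist(joint_p, joint_g)  # requires Python 3.8+
            count += 1
    return total / count


# One frame, two joints: the first is exact, the second is off by 5 units.
example_error = mpjpe(
    pred=[[[0.0, 0.0, 0.0], [3.0, 4.0, 0.0]]],
    gt=[[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]],
)
```

Restricting the inner loop to a single joint index gives a per-joint error of the kind plotted in Fig. 4.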
The results across the three scenes show that, among the CNN-LSTM-based variants, the baseline CNN-LSTM model has the highest estimation error. The Self-attention method, which adds a self-attention module to learn spatio-temporal associations between joints and map them to the 3D skeleton space, improves accuracy over this baseline by leveraging prior knowledge. However, it is not specifically tailored to this application and is therefore not the optimal solution. In contrast, the proposed two-stream kinematic constrained model, which combines the Joint-Update module and the Kinematic module, achieves the lowest error in every scene (25.97, 29.63 and 32.63 mm in Scenes 1, 2 and 3, respectively) and demonstrates better generalization ability. The model can reconstruct reasonable human poses under the influence of random reflections in different radar environments, and the results across scenes support the robustness of the proposed method against ambient interference caused by such reflections.
3.4 Ablation Experiment To study the effectiveness of each proposed module and the robustness of the proposed method, we design ablation experiments and adopt MPJPE as the evaluation metric.
Impact of Two-Stream Network To examine whether combining spatial and temporal constraints is necessary, Fig. 5 compares the skeleton estimation errors of the single-stream and two-stream models. The pose estimation error of the joint temporal-spatial two-stream kinematic constrained model is significantly lower than that of the spatial-stream and temporal-stream models alone. This is because a single-stream network is not sufficient to reconstruct an accurate 3D human skeleton from radar data across various activities, as it lacks a complete model of human kinematics. The model with only the spatial-stream network performs poorly in activities with strong temporal constraints among skeletal joints, while the model with only the temporal-stream network weakens the spatial constraints among joints. The results in the different scenes show that environmental interference can affect the radar sensing data: the spatial-stream model performs slightly worse than the temporal-stream model in Scene 1 and Scene 3, and slightly better in Scene 2. In contrast, the two-stream model combines the spatial constraints among joints on the human skeleton tree with temporal constraints that follow the characteristics of kinematic synergy. It can reasonably model human motion patterns to resist data incompleteness and can reconstruct accurate human poses under the random reflection interference of different scenes, which demonstrates the necessity of the two-stream network.
Fig. 5 Comparison results of estimation errors of two-stream and single-stream networks (bar chart; axis: Estimation Error (mm), 0–40; groups: Scene 1, Scene 2, Scene 3; series: Two-Stream, Temporal-Stream, Spatial-Stream)
Impact of Fusion Strategy We test different fusion strategies, as shown in Table 2. The concatenation operation completely couples the spatial and temporal constraints, resulting in a significant increase in error. Direct addition performs slightly worse than fusion with trainable weights, so the trainable-weight strategy is better suited to fusing the two kinds of kinematic constraints among skeletal joints across activities.
Table 2 The comparison of fusion strategies
Strategy             Scene 1   Scene 2   Scene 3
Concatenation        48.64     50.37     53.27
Add                  26.77     31.34     33.83
Trainable weights    25.97     29.63     32.63
Impact of Different Joint Update Module Methods In the temporal stream network, we update the learned joint features by different methods.
One uses the self-attention mechanism to learn the constraint relationships between joints for the joint update. The other concatenates the two joint features and uses a fully connected layer to learn the temporal constraints between the two joints, which are then updated in the update step. Furthermore, we compare two variants of the latter method: updating with the original joint features $H$ and updating with the already updated joint features $\hat{F}$. As shown in Table 3, the self-attention module does not outperform our proposed update method, and updating with $\hat{F}$ gives the lowest error in all three scenes.
Table 3 The comparison of joint update module methods
Method                      Scene 1   Scene 2   Scene 3
Self-attention              27.25     31.37     34.25
Updated with $H$            26.88     31.56     33.12
Updated with $\hat{F}$      25.97     29.63     32.63
4 Conclusion In this work, we address the challenges of anatomically incorrect pose estimation within frames and unstable pose estimation results across multiple frames in radar-based 3D human skeleton estimation. To this end, we propose a two-stream kinematic constrained model that further exploits the information provided by the radar spectrogram itself. By incorporating spatio-temporal constraints, the proposed model leverages the inherent relationships between joint positions over time. This enables the network to learn the temporal dependencies and dynamics of human poses, leading to more accurate and stable pose estimates. In experiments across three sensing scenes, our method localizes the joints of the human skeleton with an average error between 25.97 mm and 32.63 mm.
References
1. Li, S., Yi, J., Farha, Y.A., Gall, J.: Pose refinement graph convolutional network for skeleton-based action recognition. IEEE Rob. Autom. Lett. 6(2), 1028–1035 (2021)
2. Liu, R., Shen, J., Wang, H., Chen, C., Cheung, S.C., Asari, V.: Attention mechanism exploits temporal contexts: real-time 3D human pose reconstruction. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5063–5072 (2020)
3. Zhai, K., Nie, Q., Ouyang, B., Li, X., Yang, S.: Hopfir: Hop-wise graphformer with intragroup joint refinement for 3d human pose estimation (2023). arXiv preprint arXiv:2302.14581
4. Yu, X.: Exploiting the joint motion synergy with fusion network based on transformer for 3d human pose estimation (2022). arXiv preprint arXiv:2210.04006
5. Zhao, L., Peng, X., Tian, Y., Kapadia, M., Metaxas, D.N.: Semantic graph convolutional networks for 3D human pose regression. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3420–3430 (2019)
6. Patole, S.M., Torlak, M., Wang, D., Ali, M.: Automotive radars: a review of signal processing techniques. IEEE Sig. Process. Mag. 34(2), 22–35 (2017)
7. Schöffmann, C., Ubezio, B., Böhm, C., Mühlbacher-Karrer, S., Zangl, H.: Virtual radar: real-time millimeter-wave radar sensor simulation for perception-driven robotics. IEEE Rob. Autom. Lett. 6(3), 4704–4711 (2021)
8. Yang, C., Wang, X., Mao, S.: RFID-pose: vision-aided three-dimensional human pose estimation with radio-frequency identification. IEEE Trans. Reliab. 70(3), 1218–1231 (2021)
9. Jiang, W., Xue, H., Miao, C., Wang, S., Lin, S., Tian, C., Murali, S., Hu, H., Sun, Z., Su, L.: Towards 3D Human Pose Construction Using Wifi (2020)
10. Ding, W., Cao, Z., Zhang, J., Chen, R., Guo, X., Wang, G.: Radar-based 3D human skeleton estimation by kinematic constrained learning. IEEE Sens. J. 21(20), 23174–23184 (2021)
11. Pavllo, D., Feichtenhofer, C., Grangier, D., Auli, M.: 3D human pose estimation in video with temporal convolutions and semi-supervised training. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7745–7754 (2019)
12. Sengupta, A., Jin, F., Zhang, R., Cao, S.: mm-pose: Real-time human skeletal posture estimation using mmwave radars and cnns. IEEE Sens. J. 20(17), 10032–10044 (2020)
13. Shi, C., Lu, L., Liu, J., Wang, Y., Chen, Y., Yu, J.: mpose: Environment- and subject-agnostic 3d skeleton posture reconstruction leveraging a single mmwave device. Smart Health 23, 100228 (2022). https://www.sciencedirect.com/science/article/pii/S2352648321000489
Heuristic-Based Bi-RRT* Path Planning Algorithm for Unmanned Systems in Complex Channel Environment Xiru Wu, Rili Wu, and Junce Jiang
Abstract To address the low success rate and slow search speed of the Bi-RRT* algorithm in multi-obstacle environments with complex channels, a local guided-based heuristic bidirectional RRT* (LGHB-RRT*) is proposed, which uses a hybrid search strategy to carry out heuristic planning for regions of the map with different characteristics. The algorithm uses the channel and recognition point information obtained from environmental map preprocessing to guide heuristic growth. Non-uniform sampling search is carried out in the channel regions so that the random tree passes through them. Path optimization is added to remove unnecessary path points and improve optimality. The proposed algorithm is evaluated in simulation against comparison algorithms. The results show that it achieves a higher success rate and search speed and adapts well to different types of complex channel obstacle environments. Keywords Complex channel · Path planning · Bi-RRT* · Map preprocessing · Heuristic
1 Introduction Path planning is one of the main subjects in unmanned systems research, widely applied in fields such as transportation, agriculture, and industrial production. Its core objective is to find a high-quality, low-cost, obstacle-free path from a starting point to a destination in a given environment map [1]. Currently, the mainstream planning algorithms consist of graph search methods that prioritize path optimality and sampling-based methods that prioritize probabilistic completeness [2, 3]. The RRT (Rapidly-exploring Random Tree) algorithm is a sampling-based single-query planning method that has attracted researchers' attention due to its strong performance in large-scale and multi-dimensional environments [4]. However, it encounters challenges in real-world planning scenarios, such as a low search success rate and slow search when confronted with corridor-like environments and doorways [5].
To address these issues, researchers have proposed various generic algorithms based on RRT, such as RRT* with asymptotic optimality, Bi-RRT* with bidirectional search, and Informed-RRT*, which utilizes prior information for search [6–8]. These algorithms significantly improve the search speed in typical maps. However, because they sample uniformly, a considerable amount of time is still required for iteration in complex channel environments.
Currently, there are three main approaches to planning problems in complex channel environments. The first approach improves the search process by setting non-uniform sampling functions and restricting the sampling range. It aims to pass through channels rapidly by sampling based on obstacle and initial path characteristics [9–12]. In Ref. [13], multiple-tree search and sub-goal points are used to guide random trees through obstacle regions. The second approach learns from the environment map. This includes pre-planning on simplified maps or heuristic search based on path prior information learned by neural networks [14–16]. References [17, 18] preprocess the map for channel environments by employing bridge detection and clustering algorithms to accurately locate channel positions, and achieve global path planning using prior information about the channels. The third approach combines different types of search algorithms.
X. Wu · R. Wu (B) · J. Jiang School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, China e-mail: [email protected] X. Wu e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_33
For example, exploiting the respective advantages of the Bi-RRT* and JPS algorithms in search scale and search accuracy, Ref. [19] divides the environment map, according to the characteristics of the obstacles, into areas suited to each of the two algorithms and then plans each area with the corresponding algorithm. Among these approaches, the first is structurally simple but provides limited improvement in channel regions. The second can effectively handle simple channels such as "doors," but its performance is less satisfactory in complex channels such as "corridors" because the preprocessed information is underutilized. The third can handle various channel environments effectively, but it combines graph search and sampling-based algorithms, which raises the overall algorithm complexity.
The present article introduces the Local Guided-based Heuristic Bidirectional RRT* algorithm (LGHB-RRT*), which combines the advantages of the three approaches above. In the preprocessing stage, it obtains two types of prior information: channel regions and channel recognition points. In the search phase, based on Bi-RRT*, it employs non-uniform sampling to achieve channel search performance similar to that of graph search algorithms, which enables swift traversal of complex channel areas. Furthermore, it uses the channel recognition points to guide the heuristic growth of Bi-RRT* towards the channels, thereby integrating channel search with Bi-RRT* search. Finally, the obtained path is optimized and pruned to improve its optimality.
2 Preliminaries and Problem Formulation
2.1 Problem Formulation
In this paper, $X \subseteq \mathbb{R}^n$ denotes the global configuration space, where $n \in \mathbb{N}$ and $n \ge 2$. $X_{obs} \subseteq X$ denotes the obstacle space, and $X_{free} = X \setminus X_{obs}$ denotes the free space. $T = (V, E)$ is the tree generated randomly by the algorithm, where $V \subseteq X_{free}$ denotes the set of nodes in the tree and $E$ represents the set of branches connecting the nodes, all of which lie in $X_{free}$. Feasible paths are represented as $\sigma : [0, 1] \to X_{free}$, and the cost of a path is denoted as $c$.
For the path planning problem in complex channel environments, the algorithm mainly considers two indicators: search feasibility and path optimality. Feasibility is represented by the success rate of the search within a fixed number of iterations and by the time needed to complete the search, where the search time does not include the preprocessing stage. Path optimality is represented by the length of the path.
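As a small illustration of the path-length indicator, the following sketch computes the cost of a piecewise-linear path as the sum of its Euclidean segment lengths. The function name `path_cost` and the waypoints are hypothetical and are not taken from the paper.

```python
import math

def path_cost(path):
    """Sum of Euclidean segment lengths along a list of (x, y) waypoints."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))

# Hypothetical waypoints, for illustration only.
sigma = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(path_cost(sigma))  # segments of length 5 and 6
```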
2.2 Bi-RRT* Algorithm
RRT* is an asymptotically optimal global search algorithm based on random uniform sampling. Bi-RRT* is a bidirectional search version of RRT* that improves the search speed. The pseudocode of the Bi-RRT* algorithm is shown in Algorithm 1.

Algorithm 1 Bi-RRT* algorithm
1: T ← (x_init, x_goal), T_a ← (x_init), T_b ← (x_goal), c_best = ∞
2: for i = 1 to N do
3:   x_rand ← SampleFree(i)
4:   T ← Extend(T_a, x_rand)
5:   x_connect ← Nearest(T_b, x_new)
6:   (c_new, σ_new) ← ConnectTree(T_b, x_connect, x_new)
7:   if c_new ≠ ∅ and c_new < c_best then
8:     (c_best, σ_best) ← (c_new, σ_new)
9:   end if
10:  SwapLower(T_a, T_b)
11: end for
Algorithm 1 begins with initialization: it defines the start and goal points and creates two trees that grow from the start and goal points, respectively (line 1). The algorithm then performs random uniform sampling in the obstacle-free space using SampleFree and extends the current tree towards the sampled point using Extend (lines 2–4). Next, it uses Nearest to find the node $x_{connect}$ in the other tree that is nearest to the current node $x_{new}$. It attempts to connect $x_{new}$ and $x_{connect}$ using ConnectTree and iteratively updates the best cost of connecting the two trees (lines 5–9). At the end of each iteration, SwapLower switches to the tree with fewer nodes (line 10).

Algorithm 2 Extend algorithm
1: T ← (T_a, T_b), T ← (V, E)
2: x_nearest ← Nearest(T_a, x_rand)
3: x_new ← Steer(x_nearest, x_rand)
4: if CollisionFree(x_nearest, x_new) then
5:   x_near ← Near(T_a, x_new)
6:   x_parent ← SelectParent(T_a, x_near, L_sp)
7:   V ← V ∪ {x_new}, E ← E ∪ {(x_parent, x_new)}
8:   E ← Rewire(E, x_near, x_new)
9: end if
Algorithm 2 extends the random tree and corresponds to the Extend call in Algorithm 1. Taking the new node $x_{new}$ added to the current random tree as the center, it reselects the parent node and rewires the tree within a range of diameter $L_{sp}$. The function CollisionFree($x_i$, $x_j$) checks whether the connection between the two points $x_i$ and $x_j$ is free of collisions.
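The steps of Algorithm 2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the tree is stored as two dictionaries (`parent`, `cost`), `l_sp` is treated as a neighbourhood radius, the collision check is passed in as a callable, and cost changes from rewiring are not propagated to descendants. All function and parameter names are hypothetical.

```python
import math

def steer(x_from, x_to, rho):
    """Move from x_from toward x_to by at most the step size rho."""
    d = math.dist(x_from, x_to)
    if d <= rho:
        return x_to
    t = rho / d
    return (x_from[0] + t * (x_to[0] - x_from[0]),
            x_from[1] + t * (x_to[1] - x_from[1]))

def extend(parent, cost, x_rand, rho, l_sp, collision_free):
    """One RRT*-style extension; returns the new node or None."""
    x_nearest = min(parent, key=lambda v: math.dist(v, x_rand))
    x_new = steer(x_nearest, x_rand, rho)
    if x_new in parent or not collision_free(x_nearest, x_new):
        return None
    near = [v for v in parent if math.dist(v, x_new) <= l_sp]
    # SelectParent: cheapest collision-free connection; x_nearest is
    # always a valid candidate because it was checked above.
    candidates = [x_nearest] + [v for v in near
                                if v != x_nearest and collision_free(v, x_new)]
    best = min(candidates, key=lambda v: cost[v] + math.dist(v, x_new))
    parent[x_new] = best
    cost[x_new] = cost[best] + math.dist(best, x_new)
    # Rewire: reroute a neighbour through x_new when that is cheaper.
    for v in near:
        c = cost[x_new] + math.dist(x_new, v)
        if c < cost[v] and collision_free(x_new, v):
            parent[v], cost[v] = x_new, c
    return x_new
```

A tree is started with `parent = {x_init: None}` and `cost = {x_init: 0.0}`; each call to `extend` then adds at most one node.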
3 LGHB-RRT* Algorithm
The workflow of the LGHB-RRT* algorithm is shown in Fig. 1. It extracts the channel regions from the gridded global map and defines channel recognition points based on these regions. The algorithm performs a bidirectional random tree search within the original map. If a newly expanded node of a random tree falls within the neighborhood of a channel recognition point, the algorithm enters a heuristic growth phase, guiding the random tree towards the recognition point and searching the corresponding channel region. It then attempts to connect the most recently added node of one random tree with the nearest node in the other random tree. If the two nodes can be connected and the path cost is lower than the current best cost, the best path and cost are updated. The process is repeated, swapping the random trees, to generate a collision-free and low-cost path from the start point to the end point. The global path is then optimized to reduce the cost and improve the smoothness of the path within the channel.
Fig. 1 LGHB-RRT* algorithm flowchart
3.1 Map Preprocessing
Before planning the global path, the global configuration space is gridded, and the obstacles are mapped onto the grid map. The obstacle map is preprocessed to extract channel regions and to define channel recognition points based on these regions. The channel regions and recognition points are stored as prior information.
Extraction of Channel Regions. Channel extraction uses the morphological closing operation on the image (called the closed operation here), which applies dilation followed by erosion to the obstacles. With a fixed convolution operator, the closed operation does not affect isolated obstacles; it only fills the regions between adjacent obstacles. It follows that the channel region can be obtained by performing an XOR operation between the closed map and the original obstacle map. The formula for the closed operation is:
$$A \bullet B = (A \oplus B) \ominus B \qquad (1)$$
where $\bullet$ represents the closed operation, $\oplus$ represents the dilation operation, and $\ominus$ represents the erosion operation. $A$ represents the mapping of the obstacle space $X_{obs}$ on the grid map, and $B$ represents the convolution operator for the dilation and erosion operations. An example of channel extraction using an $N \times N$ square convolution operator for the closed operation is shown in Fig. 2. Figure 2a shows the original map with two adjacent obstacles. Figure 2b shows the map after the dilation operation, where the brown area is the part newly added by dilation. Figure 2c shows the map after performing the erosion operation on Fig. 2b. Performing an XOR operation between the eroded map and the original map yields the brown channel region in Fig. 2d. This channel region is defined as $X_{cha}$ and is a subset of the obstacle-free area, i.e., $X_{cha} \subset X_{free}$.
Fig. 2 Channel extraction based on closed operation: (a) original map, (b) dilation, (c) erosion, (d) channel regions
Definition of Channel Identification Points. For each extracted channel, it is necessary to define the channel boundary and to identify channel recognition points based on this boundary. Dilation and erosion are dual to each other: performing the dilation operation on obstacle $A$ is equivalent to performing the erosion operation on the complement of $A$ in the global space, and vice versa. The duality formula of the dilation operation is:
$$A \oplus B = A^{-1} \ominus B \qquad (2)$$
$$A \ominus B = A^{-1} \oplus B \qquad (3)$$
where $A^{-1}$ is the mapping of the free space $X_{free}$ on the grid map. According to this property, the area added by the dilation operation on the obstacle map is equal to the area removed by the erosion operation on the obstacle-free map. Since the $N \times N$ square convolution operator used in the closing operation has the same basic shape as the grid cells of the map, the boundaries of the channels are all parts of the square convolution operator. All boundary points $(x, y)$ of the channels belong to the brown area $X_{cha}$ in Fig. 2d, and a boundary point has neighbors with different attributes in the left-right or up-down direction. The formula for defining the boundary points is:
$$\mathrm{Border} = \left\{ (x, y) \in X_{cha} \;\middle|\; (x-1, y) + (x+1, y) = 1 \ \text{or}\ (x, y-1) + (x, y+1) = 1 \right\} \qquad (4)$$
In the equation, a point has the value 1 when its mapping lies within the channel area $X_{cha}$. The channel boundaries obtained from the closed operation in Fig. 2c are shown as yellow lines in Fig. 3a. Based on the channel boundaries, channel recognition points $ID_i$ are defined as follows:
(a) Channel recognition points are either the midpoint of a single boundary or the intersection point of two adjacent boundaries.
(b) Within each connected channel region, if there are fewer than 2 channel recognition points, those points are deleted.
(c) If the number of channel recognition points within a channel area is 0, that channel area is deleted.
When there are fewer than 2 recognition points in a channel area, the planning algorithm cannot find a valuable path within that area. Therefore, the recognition point and the channel area are removed to reduce resource consumption and improve algorithm efficiency. The channel recognition points obtained from the channel boundaries in Fig. 3a are shown in Fig. 3b, where the red dots represent the channel recognition points. The rectangular channel in the lower right corner and its recognition points are removed.
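The channel extraction of Eq. (1) and the boundary test of Eq. (4) can be sketched on a small binary grid as follows. This is a minimal sketch under stated assumptions: it uses a square operator of odd side $2r + 1$ (the paper's experiments use an even 10×10 kernel), pads the map by $r$ so that dilation near the map edge is not clipped, and stops at the boundary cells without deriving the recognition points. All function names are hypothetical.

```python
def dilate(grid, r):
    """Binary dilation with a (2r+1)x(2r+1) square; outside the grid is 0."""
    h, w = len(grid), len(grid[0])
    return [[int(any(grid[i][j]
                     for i in range(max(0, y - r), min(h, y + r + 1))
                     for j in range(max(0, x - r), min(w, x + r + 1))))
             for x in range(w)] for y in range(h)]

def erode(grid, r):
    """Binary erosion with the same square; outside the grid is 0."""
    h, w = len(grid), len(grid[0])
    def full(y, x):
        return all(0 <= i < h and 0 <= j < w and grid[i][j]
                   for i in range(y - r, y + r + 1)
                   for j in range(x - r, x + r + 1))
    return [[int(full(y, x)) for x in range(w)] for y in range(h)]

def channel_regions(obstacles, r):
    """Closed operation (dilate, then erode) XOR the original obstacle map."""
    h, w = len(obstacles), len(obstacles[0])
    padded = [[0] * (w + 2 * r) for _ in range(r)]
    padded += [[0] * r + list(row) + [0] * r for row in obstacles]
    padded += [[0] * (w + 2 * r) for _ in range(r)]
    closed = erode(dilate(padded, r), r)
    return [[closed[y + r][x + r] ^ obstacles[y][x] for x in range(w)]
            for y in range(h)]

def border_points(channel):
    """Eq. (4): channel cells whose left/right or up/down neighbours sum to 1."""
    h, w = len(channel), len(channel[0])
    def val(y, x):
        return channel[y][x] if 0 <= y < h and 0 <= x < w else 0
    return {(x, y) for y in range(h) for x in range(w)
            if channel[y][x] and (val(y, x - 1) + val(y, x + 1) == 1
                                  or val(y - 1, x) + val(y + 1, x) == 1)}
```

With two obstacles separated by a gap no wider than $2r$ cells, `channel_regions` marks exactly the gap, while an isolated obstacle is left unchanged by the closed operation and contributes no channel cells.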
3.2 Heuristic Growth of Random Trees
Random uniform sampling in the global space provides good obstacle avoidance, but it is inefficient in spaces with complex channels. This paper introduces a heuristic growth mechanism that guides the random tree to grow towards channel recognition points. After a recognition point is reached, it is deleted, and the channel is searched efficiently using variable-probability non-uniform sampling, which explores the remaining recognition points within the channel region. When all the recognition points within the channel have been explored, the channel area is deleted, and the dual-tree random search continues.
Fig. 3 Channel boundaries and identification points: (a) channel boundaries, (b) channel identification points
Guided Growth of Random Trees. A possible process of guiding the random tree to reach channel recognition points is shown in Fig. 4. In Fig. 4a, the random tree $T_a$ grows heuristically towards the recognition point, starting from the initial node. In Fig. 4b, the random tree $T_b$ grows in the same way, starting from the goal node. In Fig. 4c, the two trees meet and connect, and the algorithm successfully finds the initial path. In this type of map, the random tree can effectively utilize the channel recognition points to improve the quality of the path. However, this method has the following limitations:
(a) There are redundant recognition points that cannot improve the optimality of the path. Searching these recognition points may reduce the efficiency of the algorithm.
(b) Prior knowledge is required to determine the search priority of recognition points. The random tree obtains paths of different optimality depending on the order in which it searches the recognition points.
The method proposed in this paper for guiding the growth of the random tree sets a triggering distance threshold $D_t$ for each recognition point. When the distance between a new node $x_{new}$ in the random tree and a recognition point is smaller than this threshold, the growth of the random tree is guided by that recognition point. If multiple recognition points lie within the threshold range simultaneously, the nearest recognition point is triggered.
The triggering distance threshold is related to the scale of the environment map and the extractable channel width.
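The trigger rule can be sketched as a small helper. This is an illustration only, assuming 2D points stored as tuples; the function name `triggered_point` and the list name `id_list` are hypothetical.

```python
import math

def triggered_point(x_new, id_list, d_t):
    """Return the nearest recognition point closer than d_t, or None."""
    in_range = [p for p in id_list if math.dist(x_new, p) < d_t]
    return min(in_range, key=lambda p: math.dist(x_new, p), default=None)

# Hypothetical recognition points on a 500 x 500 map, with D_t = 20.
ids = [(100.0, 100.0), (110.0, 100.0), (300.0, 300.0)]
print(triggered_point((104.0, 100.0), ids, 20.0))
```

When no recognition point lies within the threshold, the helper returns `None` and the ordinary Bi-RRT* extension continues.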
Fig. 4 LGHB-RRT* algorithm flowchart
Fig. 5 Two growth processes of random trees: (a) traditional growth, (b) guided growth
The growth formula for the basic RRT* algorithm is as follows:
$$x_{new} = x_{nearest} + \rho \, \frac{x_{rand} - x_{nearest}}{\lVert x_{rand} - x_{nearest} \rVert} \qquad (5)$$
where $x_{rand}$ represents the randomly sampled point, $x_{nearest}$ represents the node in the random tree that is closest to $x_{rand}$, and $\rho$ represents the step size for the growth of the random tree. The process of random tree growth in the classical RRT* algorithm is shown in Fig. 5a. Drawing on the way objects are attracted to target points in the artificial potential field method, this paper improves the process of random tree growth. After the improvement, the random tree grows not only towards the sampled point but also with an additional attractive component in the direction of the channel recognition point. The formula for random tree growth guided by channel recognition points is as follows:
$$x_{new} = x_{nearest} + \rho \, \frac{x_{rand} - x_{nearest}}{\lVert x_{rand} - x_{nearest} \rVert} + k \, \frac{x_{ID_i} - x_{nearest}}{\lVert x_{ID_i} - x_{nearest} \rVert} \qquad (6)$$
where $\rho$ represents the step length, and $k$ represents the guidance coefficient. The process of random tree growth guided by recognition points is shown in Fig. 5b, where $x_s$ is the start node, $x_g$ is the target node, and $x_{ID_i}$ is the triggered channel recognition point. By incorporating the guidance of channel recognition points in random tree growth, the random tree can be guided towards these points, enabling rapid traversal through channel areas.
Channel Search Mode of Random Trees. After the random tree reaches the recognition point in the channel area, it enters the channel search mode. The channel search mode differs from the traditional RRT* algorithm in that the target changes from a single global target point to multiple local target points formed by the other recognition points within the channel area. Additionally, the sampling function changes from global random uniform sampling to variable-density sampling within the channel area. The sampling function in the channel search mode is a Cauchy distribution function, in which the density of sampling points decreases with the distance from the newly added node of the random tree, so sampling points are more likely to be generated near the new node. The probability density function $p(x, x_0, \gamma)$ is given as follows:
$$p(x, x_0, \gamma) = \frac{1}{\pi} \, \frac{\gamma}{(x - x_0)^2 + \gamma^2} \qquad (7)$$
where $x_0$ represents the location parameter of the probability density peak, and $\gamma$ is the scale parameter, equal to half the width of the peak at half its height. The function reaches its peak at the location parameter $x_0$ and is symmetric about it. The probability density function for variable-density sampling is illustrated in Fig. 6.
Fig. 6 Variable density sampling probability density map
Let $l = |x - x_0|$ represent the distance between the sampling point and the newly added node of the random tree; equation (7) then becomes:
$$p(l, \gamma) = \frac{1}{\pi} \, \frac{\gamma}{l^2 + \gamma^2} \qquad (8)$$
When multiple recognition points within the same channel area serve as endpoints, the recognition point that the algorithm reaches becomes the latest node added to the random tree. Sampling points are then more likely to be generated near that point, which makes it difficult to search effectively for the remaining recognition points. Therefore, a variable-density sampling coefficient $\beta_0 \in (0, 1)$ is introduced. Before each sampling, the system generates a random number $\beta \in (0, 1)$. The sampling point generation process is as follows:
$$\mathrm{SampleHeur} = \begin{cases} \mathrm{SampleRand}, & x_{new} = x_{ID_i} \ \text{or}\ \beta > \beta_0 \\ \mathrm{SampleVari}, & \text{otherwise} \end{cases} \qquad (9)$$
where SampleRand represents random uniform sampling within the channel region, and SampleVari represents variable-density sampling within the channel region. When a recognition point has just been reached during the search, or when the randomly generated number exceeds the variable-density sampling coefficient, the sampling method switches to random uniform sampling. This improves the adaptability and search efficiency of the algorithm in complex channels.
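Equations (7)–(9) can be sketched as follows. This is a minimal sketch under stated assumptions: the paper gives the density only as a function of the distance $l$, so the extension to 2D used here (a Cauchy-distributed signed offset along a uniformly random axis) is an assumption, uniform sampling is approximated by choosing a cell from a list of channel cells, and rejection of samples that fall outside the channel is left to the caller. All function and parameter names are hypothetical.

```python
import math
import random

def cauchy_pdf(l, gamma):
    """Eq. (8): density at distance l from the newest node."""
    return gamma / (math.pi * (l * l + gamma * gamma))

def sample_vari(x_new, gamma, rng):
    """Variable-density sample centred on x_new (assumed 2D extension)."""
    # Inverse-CDF draw from a Cauchy distribution centred at 0.
    l = gamma * math.tan(math.pi * (rng.random() - 0.5))
    theta = rng.uniform(0.0, math.pi)  # axis direction; the sign of l picks the side
    return (x_new[0] + l * math.cos(theta), x_new[1] + l * math.sin(theta))

def sample_heur(x_new, at_recognition_point, beta0, gamma, channel_cells, rng):
    """Eq. (9): choose between uniform and variable-density sampling."""
    beta = rng.random()
    if at_recognition_point or beta > beta0:
        return rng.choice(channel_cells)       # SampleRand
    return sample_vari(x_new, gamma, rng)      # SampleVari
```

At $l = \gamma$ the density is half of its peak value, which matches the description of $\gamma$ as the half width at half height.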
3.3 LGHB-RRT* Algorithm
The pseudocode for the LGHB-RRT* algorithm, a heuristic Bi-RRT* algorithm based on local guidance, is presented in Algorithm 3. The differences between Algorithm 3 and Algorithm 1 are only in lines (1) and (7, 8). MapPreprocess() represents the map preprocessing, and the obtained recognition points are stored in the recognition point list idList (line 1). If the current new node satisfies the triggering condition with a recognition point, heuristic growth HeuristicGrowth can be performed, where d($x_{new}$, idList) represents the distance between the new node and the nearest recognition point (lines 7, 8).

Algorithm 3 LGHB-RRT* algorithm
1: idList ← MapPreprocess()
2: T ← (x_init, x_goal), T_a ← (x_init), T_b ← (x_goal), c_best = ∞
3: for i = 1 to N do
4:   x_rand ← SampleFree(i)
5:   T ← Extend(T_a, x_rand)
6:   x_connect ← Nearest(T_b, x_new)
7:   (c_new, σ_new) ← ConnectTree(T_b, x_connect, x_new)
8:   if c_new ≠ ∅ and c_new < c_best then
9:     (c_best, σ_best) ← (c_new, σ_new)
10:  end if
11:  Swap(T_a, T_b)
12: end for
The pseudocode for heuristic growth HeuristicGrowth in Algorithm 3 is shown in Algorithm 4. The random tree generated by local heuristic growth is denoted as $T_h$. This random tree is not independent but is connected to the original random tree. Since it is uncertain which random tree meets the triggering condition at this point, $T_h$ is used to represent the random tree generated in this search mode. The starting point of random tree $T_h$ is the latest added node, and the endpoint is the recognition point within the channel region (line 1). Heuristic growth is entered when the new node is within the triggering range of the recognition point. The random tree then grows under the guidance of the recognition point. When the new node reaches the neighborhood of the recognition point, the recognition point is removed from the recognition point list (lines 2–8). When the random tree reaches the channel, it enters the channel search mode. In this part of the pseudocode, the sampling method of lines 2–8 is replaced, and the search continues until all recognition points have been traversed (lines 9–18). Finally, the random tree $T_h$ obtained from heuristic growth is connected to the original random tree $T$ (line 19).

Algorithm 4 HeuristicGrowth algorithm
1: T ← (T_a, T_b), T_h ← (V_h, E_h), V_h ← (x_new, x_IDi)
2: for i = 1 to N do
3:   x_rand ← (SampleFree, x_IDi)
4:   T ← Extend(T_a, x_rand)
5:   if d(x_new, x_IDi) < ζ then
6:     idList ← Remove(idList(x_IDi)); break
7:   end if
8: end for
9: for i = 1 to (N − i) do
10:  if idList = ∅ then
11:    break
12:  end if
13:  x_rand ← SampleHeur
14:  T ← Extend(T_a, x_rand)
15:  if d(x_new, x_IDi) < ζ then
16:    idList ← Remove(idList(x_IDi))
17:  end if
18: end for
19: T ← Merge(T, T_h)
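The guided extension used in the first loop of Algorithm 4 can be sketched as a single step of Eq. (6). This is a minimal sketch, assuming the reading of Eq. (6) in which $\rho$ scales the unit vector towards the random sample and $k$ scales the unit vector towards the recognition point; collision checking and tree bookkeeping are omitted. The names `unit` and `guided_step` are hypothetical.

```python
import math

def unit(v):
    """Normalise a 2D vector; the zero vector is returned unchanged."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n) if n > 0 else (0.0, 0.0)

def guided_step(x_nearest, x_rand, x_id, rho, k):
    """Eq. (6): step towards x_rand plus an attraction of weight k towards x_id."""
    u = unit((x_rand[0] - x_nearest[0], x_rand[1] - x_nearest[1]))
    g = unit((x_id[0] - x_nearest[0], x_id[1] - x_nearest[1]))
    return (x_nearest[0] + rho * u[0] + k * g[0],
            x_nearest[1] + rho * u[1] + k * g[1])

# Hypothetical values: sample to the right, recognition point above.
print(guided_step((0.0, 0.0), (10.0, 0.0), (0.0, 10.0), 15.0, 5.0))
```

Setting `k = 0` recovers the basic growth rule of Eq. (5), so the guidance coefficient controls how strongly the recognition point bends the extension.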
3.4 Path Optimization
RRT* and its variants are asymptotically optimal owing to the randomness of their sampling. However, within a finite number of iterations, the optimality of the path cannot be guaranteed. The path tends to be tortuous, and there may be excessive growth in certain areas. Therefore, path optimization is necessary to obtain a higher level of optimality within a fixed number of iterations. First, the planned path is numbered starting from the initial point, where the initial point is labeled $L_0$, the final point $L_n$, and the intermediate path nodes $L_i$, where $i$ is an integer ranging from 1 to $n - 1$. If the line segment between $L_{i-1}$ and $L_{i+1}$ does not intersect any obstacle, the intermediate node $L_i$ is removed. The process of path optimization is illustrated in Fig. 7, where the optimized path and nodes are depicted as gray dashed lines and nodes in Fig. 7b.
Fig. 7 Example of path optimization procedure: (a) before path optimization, (b) after path optimization
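The pruning rule can be sketched as follows. This is a minimal sketch, assuming a binary occupancy grid indexed as `grid[y][x]` and a collision test that samples points along the segment at a fixed spacing; the paper does not specify how the segment test is implemented, and the names `segment_free` and `prune` are hypothetical.

```python
import math

def segment_free(grid, p, q, step=0.5):
    """True if no sampled point on segment p-q falls in an obstacle cell."""
    n = max(1, int(math.dist(p, q) / step))
    for s in range(n + 1):
        t = s / n
        x = p[0] + t * (q[0] - p[0])
        y = p[1] + t * (q[1] - p[1])
        if grid[round(y)][round(x)]:
            return False
    return True

def prune(path, is_free):
    """Remove L_i whenever the segment L_{i-1} to L_{i+1} is obstacle-free."""
    out = list(path)
    i = 1
    while i < len(out) - 1:
        if is_free(out[i - 1], out[i + 1]):
            del out[i]   # stay at the same index to test the next node
        else:
            i += 1
    return out
```

For example, `prune(path, lambda a, b: segment_free(grid, a, b))` keeps the start and end points and drops every intermediate node that can be bypassed.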
4 Simulation and Analysis
4.1 Maps and Their Pre-processing
The channel environment maps used in the simulation experiments are shown in Fig. 8. The environment maps have a size of 500×500, where the black regions represent obstacle areas. Within the maps, there are channel areas formed by obstacles. To validate the performance of the algorithm in complex channel environments, different types of channel maps were designed. Figure 8a depicts an environment map with three short channels, Fig. 8b shows a multi-channel environment map, Fig. 8c represents a multi-exit channel environment map, and Fig. 8d illustrates an S-shaped long channel environment map. Figure 9 shows the results of preprocessing the channel environment maps. In the preprocessing step, a closing operation with a square convolution kernel of size 10×10 was applied, where the kernel size is slightly larger than the designated channel width. The yellow regions in the image represent the extracted channel areas, and the red dots within the yellow channel areas represent the recognition points of the channel.
Fig. 8 Channel environment map: (a) Map 1, (b) Map 2, (c) Map 3, (d) Map 4
Fig. 9 Preprocessing of channel environment map: (a) Map 1, (b) Map 2, (c) Map 3, (d) Map 4
Fig. 10 Simulation results of the comparison algorithms: (a) RRT* map 1, (b) Bi-RRT* map 1, (c) RRT* map 2, (d) Bi-RRT* map 2
4.2 LGHB-RRT* Comparison Simulation
The simulations were conducted on a device with an Intel Core i7-10875H CPU to verify the feasibility and effectiveness of the proposed algorithm. The improved algorithm LGHB-RRT* was compared with two other algorithms, RRT* and Bi-RRT*, and each algorithm was tested in 20 experiments. The step size for all three algorithms was set to 15, and the maximum iteration count was set to 5000. The target point sampling rate for the two comparative algorithms was 0.1. In the LGHB-RRT* algorithm, the triggering distance threshold $D_t$ was set to 20 to balance search randomness and speed. The starting point was represented by a blue dot at coordinates (15,15), and the end point by a red dot at coordinates (485,485). During the simulation experiments, it was observed that both RRT* and Bi-RRT* had low success rates within the specified iteration count in maps 3 and 4, making the data less informative. Therefore, only the comparative simulation results for maps 1 and 2 are presented in Fig. 10, while the simulation results for the proposed algorithm are shown in Fig. 11. From Figs. 10 and 11, it can be observed that the LGHB-RRT* algorithm outperforms the Bi-RRT* and RRT* algorithms by finding feasible paths without exploring the entire map, resulting in a significant improvement in search efficiency. The LGHB-RRT* algorithm also demonstrates good adaptability in various complex channel maps, indicating the feasibility of the proposed improvements. To further validate the optimality of the algorithm, the simulation data for the algorithms in the different maps were analyzed. The simulation data are presented in Table 1.
Fig. 11 Simulation results of LGHB-RRT* algorithm: (a) LGHB-RRT* map 1, (b) LGHB-RRT* map 2, (c) LGHB-RRT* map 3, (d) LGHB-RRT* map 4
Table 1 Simulation data
Map   Algorithm    Success rate (%)   Time (s)   Path length
1     RRT*         55                 28.71      1293
1     Bi-RRT*      80                 19.55      1279
1     LGHB-RRT*    100                9.06       1159
2     RRT*         70                 24.02      1262
2     Bi-RRT*      65                 18.34      1160
2     LGHB-RRT*    95                 13.07      1092
3     RRT*         5                  37.10      1212
3     Bi-RRT*      20                 22.17      1193
3     LGHB-RRT*    100                9.68       972
4     RRT*         0                  –          41
4     Bi-RRT*      0                  –          8.8
4     LGHB-RRT*    100                15.39      1504
4.3 Simulation Analysis
In Fig. 10, the RRT* algorithm exhibits high randomness in sampling and searching for Maps 1 and 2, and it has to traverse the entire map to complete the search. The Bi-RRT* algorithm, with its partial heuristic growth, improves efficiency compared to RRT* but may reduce the probability of the random tree passing through the channel, thus affecting the success rate of the algorithm. In Fig. 11, the proposed algorithm successfully finds feasible paths in all four maps, with significantly smaller search areas than the two comparative methods, which reduces blind exploration. The simulation data indicate that the improved algorithm shows the most significant improvement in success rate in Map 4. From Table 1, it can be observed that, compared to RRT*, the proposed algorithm reduces search time by 68.44% in Map 1 (from 28.71 s to 9.06 s). In Map 1, when the random tree reaches the trigger threshold of a channel recognition point, it can enter channel search mode and skip subsequent blind searches. Compared to Map 1, Map 2 contains redundant channel regions, and the proposed algorithm may spend effort exploring these redundant channels, which can lower the success rate and search speed. In Map 4, where there is only one channel region and the optimal path must pass through it, increasing the trigger distance threshold $D_t$ to 30 or 40 results in a search time of 14.63 s and 14.27 s, respectively. This indicates that in maps with such characteristics, algorithm performance can be improved by increasing the trigger distance threshold. It is important to note that the improvement of the proposed algorithm relies on the data obtained from map preprocessing. If there are no extractable channel regions in the environment map, the proposed algorithm degrades to the Bi-RRT* algorithm.
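The 68.44 figure quoted above can be checked against the Map 1 search times in Table 1. The short sketch below does the arithmetic; the helper name `reduction` is hypothetical, and the Bi-RRT* comparison is computed from the same table row rather than quoted from the text.

```python
def reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline

# Search times in seconds for Map 1, taken from Table 1.
rrt_star, bi_rrt_star, lghb = 28.71, 19.55, 9.06
print(round(reduction(rrt_star, lghb), 2))     # relative to RRT*
print(round(reduction(bi_rrt_star, lghb), 2))  # relative to Bi-RRT*
```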
5 Conclusion
In response to the difficulty of planning in complex channel environments with the Bi-RRT* algorithm, this paper proposes a planning algorithm that combines map preprocessing and heuristic search techniques. By identifying and extracting the channel regions that hinder the planning algorithm, the proposed algorithm guides Bi-RRT* to reach and traverse the channel regions rapidly, thereby improving the success rate and efficiency of planning in complex channel maps. Simulation results demonstrate that the proposed algorithm exhibits significant improvements in success rate, search speed, and path length, and that it performs well in various complex channel maps. Currently, the improved algorithm achieves good results for non-redundant channels and single-exit channel environments. However, its improvement is relatively weak in multi-channel and multi-exit environments, and the trigger distance threshold needs to be determined through trial and error for different environments. To enhance the adaptability of the planning algorithm to different environmental maps, future research will focus on the following two aspects:
(a) The exploration weight relationship between multiple exits and multiple channel environments.
(b) The relationship between the environmental map and the trigger distance threshold, which couples global random search with channel search.
Acknowledgements This work was supported by National Natural Science Foundation of China under Grant 62263005, Guangxi Natural Science Foundation under Grant 2020GXNSFDA238029, Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region under Grant 2022GXZDSY004, Innovation Project of Guangxi Graduate Education YCSW2023298, Innovation Project of GUET Graduate Education under Grant 2023YCXS124.
References
1. Zhou Jun, H.: Research progress on navigation path planning of agricultural machinery. Nongye Jixie Xuebao/Trans. Chin. Soc. Agric. Mach. 52(9) (2021)
2. Chi, W., Ding, Z., Wang, J., Chen, G., Sun, L.: A generalized voronoi diagram-based efficient heuristic path planning method for rrts in mobile robots. IEEE Trans. Indus. Electron. 69(5), 4926–4937 (2021)
3. Li, D., Yin, W., Wong, W.E., Jian, M., Chau, M.: Quality-oriented hybrid path planning based on a* and q-learning for unmanned aerial vehicle. IEEE Access 10, 7664–7674 (2021)
4. LaValle, S., Kuffner, J.: Randomized kinodynamic planning. In: Proceedings 1999 IEEE International Conference on Robotics and Automation (Cat. No.99CH36288C), vol. 1, pp. 473–479 (1999)
5. Wang, L.-L., Sui, Z.-Z., Pu, Z.-Q., Liu, Z., Yi, J.-Q.: An improved rrt algorithm for multi-robot formation path planning. Acta Electronica Sinica 48(11), 2138 (2020)
6. Karaman, S., Frazzoli, E.: Sampling-based algorithms for optimal motion planning. Int. J. Rob. Res. 30(7), 846–894 (2011)
7. Wang, J., Li, B., Meng, M.Q.-H.: Kinematic constrained bi-directional rrt with efficient branch pruning for robot path planning. Exp. Syst. Appl. 170, 114541 (2021)
8. Mashayekhi, R., Idris, M.Y.I., Anisi, M.H., Ahmedy, I., Ali, I.: Informed rrt*-connect: an asymptotically optimal single-query path planning method. IEEE Access 8, 19842–19852 (2020)
9. Wang, H.-F., Cui, Y.-Y., Li, M.-F., Li, G.-Y.: Mobile robot path planning algorithm based on improved rrt* fn. J. Northeastern Univ. (Nat. Sci.) 43(9), 1217 (2022)
10. Qin, Z., Xiaoliang, Y., Bin, L., Xianping, J., Zheng, X., Can, X.: Motion planning of picking manipulator based on ctb-rrt* algorithm. Nongye Jixie Xuebao/Trans. Chin. Soc. Agric. Mach. 52(10) (2021)
11. Zhong, J., Su, J.: Robot path planning in narrow passages based on probabilistic roadmaps. Int. J. Rob. Autom. 28(3) (2013)
12. Meng, J., Pawar, V.M., Kay, S., Li, A.: Uav path planning system based on 3d informed rrt for dynamic obstacle avoidance. In: 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 1653–1658. IEEE (2018)
13. Ruan, X., Zhou, J., Zhang, J., Zhu, X.: Robot goal guide rrt path planning based on sub-target search. Control Decis. 35(10), 2543–2548 (2020)
14. Wan, X., Ye, Y., Leitao, Y., Zhu, L.: A global path planning algorithm based on improved rrt. Control Decis. 37(4), 829–838 (2022)
15. Zhang, T., Wang, J., Meng, M.Q.-H.: Generative adversarial network based heuristics for sampling-based path planning. IEEE/CAA J. Autom. Sinica 9(1), 64–74 (2021)
16. Wang, J., Chi, W., Li, C., Wang, C., Meng, M.Q.-H.: Neural rrt*: learning-based optimal path planning. IEEE Trans. Autom. Sci. Eng. 17(4), 1748–1758 (2020)
17. Fu, J., Zeng, G., Huang, B., Fang, Z.: Narrow channel path planning based on bidirectional rapidly-exploring random tree. J. Comput. Appl. 39(10), 2865 (2019)
18. Shu, X., Ni, F., Zhou, Z., Liu, Y., Liu, H., Zou, T.: Locally guided multiple bi-rrt for fast path planning in narrow passages. In: 2019 IEEE International Conference on Robotics and Biomimetics (ROBIO), pp. 2085–2091. IEEE (2019)
19. Zhang, Q., Zhou, L., Zhao, Y., Cao, R., Liu, J.: A parallel algorithm combining improved-connect-rrt and jps with closed-operation. In: 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE), pp. 359–364. IEEE (2020)
Improved Safety Helmet-Wearing Detection Algorithm Based on YOLOv5 Dongmei Chen, Huanyu Zhao, Wei Liu, and Dongsheng Du
Abstract This paper presents an improved algorithm, TNS-YOLOv5, based on YOLOv5s, to address the weaknesses of existing safety helmet-wearing detection algorithms on small and occluded targets. TNS-YOLOv5 incorporates the NAM module, the Transformer Encoder structure, and SIoU loss as the regression loss function. NAM focuses on important channels and spatial features, improving recognition accuracy. The Transformer Encoder's multi-head attention layer enhances target feature extraction and performs well in detecting occluded objects. SIoU loss accelerates network convergence and improves regression accuracy by redefining the distance loss through vector angles. Experimental results demonstrate that TNS-YOLOv5 significantly improves accuracy and detection speed, achieving excellent results in safety helmet-wearing detection tasks.
Keywords Object detection · YOLOv5 · Attention mechanism · SIoU loss
1 Introduction
In recent years, the frequency of safety accidents has increased, making effective safety management systems more important than ever. As a result, safety helmet recognition technology has become a key aspect of workplace safety management. Intelligent safety helmet recognition systems based on machine vision offer significant advantages, such as high accuracy, rapid detection, and personalized customization. Currently, deep learning-based safety helmet wearing detection algorithms can be classified into two categories. The first category is the two-stage detection network, which generates preselected boxes before performing object detection. Although this type achieves higher detection accuracy, it exhibits poor real-time performance.
D. Chen · H. Zhao (B) · W. Liu · D. Du Faculty of Automation, Huaiyin Institute of Technology, Huai’an 223003, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_34
Notable examples of this type include R-CNN [1], Fast R-CNN [2], Faster R-CNN [3], HyperNet [4], and SPP-Net [5]. The second category is the one-stage detection network, which performs object detection with a single feature extraction pass. Notable examples include SSD [6], the YOLO series [7–10], and EfficientDet [11]. In recent studies, researchers have proposed various modifications to the YOLOv3 detection algorithm to improve its performance in safety helmet detection. Actual production environments, such as factories and construction sites, are complex, and detection targets are easily occluded, which places higher demands on detection accuracy. Research on helmet detection algorithms is therefore necessary. This paper proposes modifications to the YOLOv5 detection algorithm that aim to maintain safety helmet detection accuracy while reducing network complexity, making the network lighter and more cost-effective for practical applications.
2 Related Work and Improved Algorithm
2.1 YOLOv5
The YOLOv5 detection algorithm was released by Ultralytics in 2020 and consists of four parts: the input end, the backbone (main network), the neck (multi-scale feature fusion network), and the prediction part (detection head). YOLOv5 has demonstrated excellent detection accuracy among one-stage detection algorithms, particularly in small object detection. Therefore, this paper selects YOLOv5 as the foundation for safety helmet detection.
2.2 The NAM Attention
The primary purpose of an attention mechanism is to help deep neural networks suppress less significant pixels or channels while focusing on the most crucial parts. Classical attention mechanisms used in object detection models, such as CA and CBAM, typically employ operations like fully connected layers and convolution to generate higher-dimensional feature maps. However, operations like convolution and pooling can lead to the loss of important feature information. To address this problem and strengthen the network's concentration on significant features, this paper introduces an efficient and lightweight attention mechanism, the Normalization-based Attention Module (NAM).
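As a rough illustration of the idea, the following minimal sketch applies a NAM-style channel attention to a tiny feature map using only the Python standard library. This paper does not give the NAM equations, so the sketch assumes the commonly described design in which channel weights are the normalized absolute batch-normalization scale factors. The function name `nam_channel_attention` and the parameters `gamma` and `beta` are illustrative, not taken from the paper.

```python
import math

def nam_channel_attention(feature_map, gamma, beta, eps=1e-5):
    """Apply a NAM-style channel attention (assumed form) to a list of channels.

    feature_map: list of channels, each a flat list of activations.
    gamma, beta: per-channel batch-norm scale and shift (assumed to be learned).
    """
    total = sum(abs(g) for g in gamma)
    weights = [abs(g) / total for g in gamma]  # channel importance from BN scale

    output = []
    for channel, g, b, w in zip(feature_map, gamma, beta, weights):
        mean = sum(channel) / len(channel)
        var = sum((v - mean) ** 2 for v in channel) / len(channel)
        std = math.sqrt(var + eps)
        gated = []
        for v in channel:
            normed = g * (v - mean) / std + b            # batch normalization
            gate = 1.0 / (1.0 + math.exp(-w * normed))   # sigmoid of the weighted response
            gated.append(v * gate)                       # rescale the original activation
        output.append(gated)
    return weights, output

# Two toy channels; the first has the larger BN scale factor.
features = [[0.2, 1.0, -0.5, 0.8], [0.1, 0.1, 0.3, -0.2]]
weights, attended = nam_channel_attention(features, gamma=[2.0, 0.5], beta=[0.0, 0.0])
print(weights)
```

No convolution or fully connected layer is needed in this form, which is why the module is described as lightweight: the weights reuse parameters that batch normalization already holds.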
2.3 The Transformer Architecture
The Transformer architecture has achieved state-of-the-art results in many natural language processing (NLP) tasks. Its core structure consists of two parts: the Encoder and the Decoder. In computer vision, the Transformer Encoder is used to extract features from an image that has been split into blocks and arranged as a sequence. Through a self-attention mechanism, the Encoder weighs the importance of each image block and extracts features accordingly. This paper introduces the Transformer encoding module into the C3 structure, giving the C3TR structure shown in Fig. 1. The C3TR structure is applied only to the 32x downsampling stage of YOLOv5. The reason is that the feature map at this stage has the lowest resolution, and the computational cost of self-attention in the Transformer encoding module grows quadratically with the number of feature-map positions. Applying the C3TR structure only to the low-resolution feature map therefore keeps memory usage low. The improved structure is shown in Fig. 2.
Fig. 1 C3TR structure diagram
Fig. 2 YOLOv5 structure diagram
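To make the choice of the 32x downsampling stage concrete, the short sketch below counts the feature-map positions (tokens) and the size of the pairwise self-attention matrix at the three usual YOLOv5 strides. The 640 × 640 input size and the helper name `attention_cost` are assumptions for illustration; the paper does not state the input resolution.

```python
def attention_cost(image_size, stride):
    """Return (tokens, pairwise attention entries) for one square feature map."""
    side = image_size // stride      # feature-map width and height
    tokens = side * side             # one token per feature-map position
    return tokens, tokens * tokens   # self-attention compares every token pair

for stride in (8, 16, 32):
    tokens, pairs = attention_cost(640, stride)
    print(f"stride {stride:2d}: {tokens:5d} tokens, {pairs:,} attention entries")
```

Under these assumptions the stride-32 map has 400 tokens and 160,000 attention entries, while the stride-8 map has 6,400 tokens and 40,960,000 entries, a factor of 256 more.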
2.4 The SIoU Loss Function
The loss function is a commonly used measure of model prediction error in machine learning; it evaluates the difference between the predicted and actual results. The CIoU loss function used for bounding-box regression in the original YOLOv5 does not account for the direction mismatch between the predicted and ground-truth bounding boxes, which can lead to slow convergence of the network. To address this issue, this paper replaces the CIoU loss function with the Scylla-IoU (SIoU) loss function. Combining these components, we propose an improved object detection algorithm, TNS-YOLOv5, which incorporates NAM, the Transformer Encoder, and SIoU loss to achieve higher accuracy and faster convergence in object detection tasks.
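The following is a minimal sketch of an SIoU-style loss for a single pair of boxes, written with the Python standard library. The paper does not list the SIoU equations, so the angle, distance, and shape costs below follow the formulation as it is commonly described and should be read as assumptions; the function name `siou_loss`, the `(cx, cy, w, h)` box format, and the exponent `theta=4.0` are illustrative choices.

```python
import math

def siou_loss(pred, target, theta=4.0, eps=1e-9):
    """SIoU-style regression loss for two boxes given as (cx, cy, w, h)."""
    px, py, pw, ph = pred
    tx, ty, tw, th = target

    # Box edges.
    p_left, p_right = px - pw / 2, px + pw / 2
    p_top, p_bottom = py - ph / 2, py + ph / 2
    t_left, t_right = tx - tw / 2, tx + tw / 2
    t_top, t_bottom = ty - th / 2, ty + th / 2

    # Intersection over union.
    inter_w = max(0.0, min(p_right, t_right) - max(p_left, t_left))
    inter_h = max(0.0, min(p_bottom, t_bottom) - max(p_top, t_top))
    inter = inter_w * inter_h
    iou = inter / (pw * ph + tw * th - inter + eps)

    # Width and height of the smallest box enclosing both boxes.
    enclose_w = max(p_right, t_right) - min(p_left, t_left)
    enclose_h = max(p_bottom, t_bottom) - min(p_top, t_top)

    # Angle cost: 0 when the centres are aligned with an axis, 1 at 45 degrees.
    dx, dy = tx - px, ty - py
    sin_alpha = min(1.0, abs(dy) / (math.hypot(dx, dy) + eps))
    angle_cost = math.sin(2.0 * math.asin(sin_alpha))

    # Distance cost, modulated by the angle cost.
    gamma = 2.0 - angle_cost
    rho_x = (dx / enclose_w) ** 2
    rho_y = (dy / enclose_h) ** 2
    distance_cost = (1 - math.exp(-gamma * rho_x)) + (1 - math.exp(-gamma * rho_y))

    # Shape cost: relative mismatch of widths and heights.
    omega_w = abs(pw - tw) / max(pw, tw)
    omega_h = abs(ph - th) / max(ph, th)
    shape_cost = (1 - math.exp(-omega_w)) ** theta + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + (distance_cost + shape_cost) / 2.0

loss_same = siou_loss((5.0, 5.0, 2.0, 2.0), (5.0, 5.0, 2.0, 2.0))
loss_shift = siou_loss((5.0, 5.0, 2.0, 2.0), (6.0, 5.5, 2.0, 3.0))
print(loss_same, loss_shift)
```

The angle term is what distinguishes this form from CIoU: it depends on the direction of the line joining the two box centres, not only on its length.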
3 Experimental Analysis
3.1 Experimental Environment
The experimental platform used in this study includes a GPU (NVIDIA GeForce RTX 3050), a CPU (AMD Ryzen 7 5800H), 8 GB of graphics memory, and 16 GB of RAM. The operating system is Windows 11, and the software environment includes Python 3.11 and CUDA 11.7, with cuDNN used for accelerated training.
Table 1 Number of annotations in dataset
Datasets   Number of images   Number of hat   Number of person   Total number of markings
Train      18,148             49,878          102,398            152,276
Test       4,641              12,404          25,868             38,272
3.2 Dataset
We use a self-made safety helmet dataset, which includes some samples from the publicly available SHWD dataset. After removing unsuitable images and adding images of actual factory scenes, the dataset consists of 22,789 images in total. The dataset is split into training and testing sets at a ratio of 8:2. We use LabelImg to label the images with two categories, “hat” and “person”, where “hat” marks a positive object (a safety helmet is worn) and “person” marks a negative object (no safety helmet is worn); the labels are saved in YOLO format. The annotation details are shown in Table 1: there are 190,548 annotated objects in total, including 62,282 objects with safety helmets and 128,266 objects without safety helmets.
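The counts quoted above can be cross-checked against Table 1 with a few lines of arithmetic. The sketch below only re-adds the published numbers; the dictionary layout and variable names are illustrative.

```python
# Per-split counts as listed in Table 1.
table1 = {
    "train": {"images": 18_148, "hat": 49_878, "person": 102_398},
    "test": {"images": 4_641, "hat": 12_404, "person": 25_868},
}

# Sum the two splits for each column.
totals = {
    key: table1["train"][key] + table1["test"][key]
    for key in ("images", "hat", "person")
}
annotations = totals["hat"] + totals["person"]
train_share = table1["train"]["images"] / totals["images"]

print(totals)
print(annotations)
print(round(train_share, 3))
```

The sums reproduce the 22,789 images, 62,282 “hat” objects, 128,266 “person” objects, and 190,548 annotations stated in the text, and the training share of about 0.796 matches the nominal 8:2 split.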
3.3 Experimental Results
In this section, we conduct experiments on a self-made dataset under the same experimental conditions. Here and throughout the paper, YOLOv5 refers to the YOLOv5s model.
3.3.1 Attention Module Comparison Experiments
The evaluation metrics used include computational complexity (GFLOPs), mean average precision (mAP), and frames per second (FPS), among others. The experimental results are shown in Table 2. As seen from Table 2, NAM achieves the highest ARs (85.0%) and mAP (91.0%) among the CA, SE, CBAM, and NAM attention mechanisms, and it is the only one that retains the baseline detection speed of 123 FPS. Its APs (91.6%) is 0.1 percentage points below that of CBAM (91.7%).
3.3.2 Improved Transformer Experiment Results Analysis
To evaluate the effectiveness of the proposed Transformer Encoder structure, we introduce it into the C3 structure of YOLOv5 and compare its performance with the original algorithm. The experimental results are shown in Table 3.
Table 2 Comparison of different attention modules
Network model    GFLOPs   APs/%   ARs/%   mAP/%   FPS
YOLOv5           15.8     90.1    82.8    89.4    123
YOLOv5 + CA      15.8     90.5    82.7    89.8    118
YOLOv5 + SE      15.8     90.9    83.3    90.1    109
YOLOv5 + CBAM    15.8     91.7    84.8    90.7    115
YOLOv5 + NAM     15.8     91.6    85.0    91.0    123
Table 3 Transformer function verification experiment
Network model           GFLOPs   APs/%   ARs/%   mAP/%   FPS
YOLOv5 5.0              15.8     90.1    82.8    89.4    123
YOLOv5 + Transformer    15.6     90.1    82.1    88.9    127
From Table 3, it can be observed that the model incorporating the Transformer module did not improve detection accuracy: APs was unchanged at 90.1%, while ARs and mAP fell by 0.7 and 0.5 percentage points, respectively. However, the computational cost of the model fell from 15.8 to 15.6 GFLOPs, and the detection speed rose slightly from 123 to 127 FPS. This result shows that the proposed modification makes the model lighter and faster at a small cost in accuracy.
3.3.3 Comparison Experiment of Loss Functions
To evaluate the effectiveness of the SIoU loss function, we conduct a comparative experiment between the SIoU loss function and the original CIoU loss function used in YOLOv5. The results are presented in Table 4. They indicate that, compared to CIoU loss, SIoU loss improves the model’s APs by 0.9 percentage points (from 90.1% to 91.0%) while maintaining the same detection speed of 123 FPS.
Table 4 Loss function verification experiment
Network model   APs/%   ARs/%   mAP/%   FPS
CIoU loss       90.1    82.8    89.4    123
SIoU loss       91.0    84.0    90.0    123
3.3.4 Ablation Experiment
To validate the effectiveness of the proposed improvements on safety helmet detection, a set of ablation experiments was designed. The results are summarized in Table 5. Based on the experimental data, we conclude that the proposed TNS-YOLOv5 algorithm achieves better detection performance than the original YOLOv5 model. The improvements made to the algorithm, including the NAM module, Transformer structure, and SIoU loss function, all contribute to the improved performance of helmet detection.
3.3.5 Mainstream Algorithm Comparison Experiment
To evaluate the detection performance of the proposed algorithm, we compare it with several mainstream object detection algorithms, including Faster R-CNN, SSD, YOLOv3, YOLOv4, YOLOv5, and YOLOX. The comparison results are presented in Table 6. As shown in Table 6, among the existing algorithms YOLOv5 has the lowest computational cost (15.8 GFLOPs) and the highest detection speed (123 FPS), meeting the requirement of real-time detection. Compared to these mainstream object detection algorithms, the algorithm proposed in this paper improves both detection accuracy and detection speed, and it is also the lightest of all compared models (15.6 GFLOPs).
Table 5 Ablation experiment
No.   Transformer   NAM   SIoU   GFLOPs   APs/%   ARs/%   mAP/%   FPS
1     –             –     –      15.8     90.1    82.8    89.4    123
2     ✓             –     –      15.6     90.1    82.1    88.9    128
3     ✓             ✓     –      15.6     91.6    85.0    91.0    127
4     ✓             ✓     ✓      15.6     92.2    85.5    91.5    127
Table 6 Comparative experiment of mainstream algorithm
Network model   GFLOPs   APs/%   ARs/%   mAP/%   FPS
Faster R-CNN    23.8     90.2    80.8    85.5    67
SSD             16.3     86.8    71.5    79.9    87
YOLOv3          20.5     86.0    70.3    80.3    88
YOLOv4          20.6     87.5    74.2    83.1    90
YOLOv5          15.8     90.1    82.8    89.4    123
YOLOX           24.9     89.5    75.3    82.8    72
Ours            15.6     92.2    85.5    91.5    127
In summary, the TNS-YOLOv5 algorithm proposed in this paper demonstrates significant advantages over other mainstream object detection algorithms in terms of detection accuracy, detection speed, and model complexity. The proposed algorithm is a promising solution for real-time helmet detection applications.
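The size of these gains can be read directly from the YOLOv5 and "Ours" rows of Table 6. The short sketch below computes the differences; the dictionary keys simply mirror the table headers, and the variable names are illustrative.

```python
# Rows "YOLOv5" and "Ours" from Table 6.
baseline = {"GFLOPs": 15.8, "APs": 90.1, "ARs": 82.8, "mAP": 89.4, "FPS": 123}
ours = {"GFLOPs": 15.6, "APs": 92.2, "ARs": 85.5, "mAP": 91.5, "FPS": 127}

# Absolute change per metric (percentage points for APs, ARs, and mAP).
delta = {name: round(ours[name] - baseline[name], 1) for name in baseline}
fps_gain_percent = round(100 * (ours["FPS"] - baseline["FPS"]) / baseline["FPS"], 1)

print(delta)
print(fps_gain_percent)
```

In these terms the accuracy metrics rise by 2.1 to 2.7 percentage points, while the speed gain is 4 FPS (about 3.3%) and the cost reduction is 0.2 GFLOPs.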
Fig. 3 Comparison of TNS-YOLOv5 algorithm detection results (scenes (a), (b), and (c); YOLOv5s and TNS-YOLOv5 results)
3.3.6 Image Detection Results
We select several representative images to compare the detection results with the original YOLOv5 algorithm. The detection results are shown in Fig. 3. From Fig. 3, we can observe that for scene (a) with movement, the TNS-YOLOv5 algorithm accurately detected the white safety helmet in motion without any false detection, while the YOLOv5 algorithm failed to detect it and mistakenly detected a small object in the distance. For scene (b) with occlusion, the TNS-YOLOv5 algorithm accurately detected the target, while the YOLOv5 algorithm had serious missed and false detections. For scene (c), the TNS-YOLOv5 algorithm could better distinguish safety helmets from helmets being worn than the YOLOv5 algorithm. These results demonstrate the superiority of the proposed TNS-YOLOv5 algorithm over the original YOLOv5 algorithm in terms of detection accuracy and robustness.
4 Conclusion
In this paper, we have proposed an improved algorithm for safety helmet detection based on YOLOv5s. By introducing the NAM module, adding the Transformer Encoder structure, and replacing the regression loss function with SIoU loss, we developed the TNS-YOLOv5 detection algorithm. Experimental results show that TNS-YOLOv5 achieves an average precision of 92.2%, which is 2.1 percentage points higher than the original YOLOv5 algorithm, while also improving mAP and detection speed. Moreover, the network is lighter, demonstrating the feasibility and superiority of the proposed algorithm in safety helmet detection.
Acknowledgements This work was supported in part by the National Natural Science Foundation of China under Grant 62173159 and in part by the Natural Science Foundation of Huai’an under Grant HAB202151.
References
1. Girshick, R., Donahue, J., Darrell, T., Malik, J.: Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 580–587. IEEE Press, New York (2014). https://doi.org/10.1109/CVPR.2014.81
2. Girshick, R.: Fast R-CNN. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 1440–1448. IEEE Press, New York (2015). https://doi.org/10.1109/ICCV.2015.169
3. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems. MIT Press, Cambridge (2015)
4. Kong, T., Yao, A., Chen, Y., Sun, F.: Hypernet: towards accurate region proposal generation and joint object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 845–853. IEEE Press, New York (2016). https://doi.org/10.1109/CVPR.2016.98
5. He, K., Zhang, X., Ren, S., Sun, J.: Spatial pyramid pooling in deep convolutional networks for visual recognition. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1904–1916 (2015). https://doi.org/10.1109/TPAMI.2015.2389824
6. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision–ECCV 2016, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
7. Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: You only look once: Unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788. IEEE Press, New York (2016). https://doi.org/10.1109/CVPR.2016.91
8. Redmon, J., Farhadi, A.: Yolo9000: better, faster, stronger. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7263–7271. IEEE Press, New York (2017)
9. Redmon, J., Farhadi, A.: Yolov3: an incremental improvement. arXiv. 1804.02767 (2018)
10. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: Yolov4: optimal speed and accuracy of object detection. arXiv. 2004.10934 (2020)
11. Tan, M., Pang, R., Le, Q.V.: Efficientdet: scalable and efficient object detection. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 10781–10790. IEEE Press, New York (2020). https://doi.org/10.1109/CVPR42600.2020.01079
Research on Low-Cost Missile Borne Landing Point Positioning Device Based on RDSS/SMS Shiyao Gao, Shan Su, A. Liangmushage, Kai Wu, and Lu Chen
Abstract The guidance transformation of rockets is one of the best ways to utilize the inventory of unguided rockets, and it is also an important aspect of low-cost ammunition development. However, in the early stages of guidance transformation, platform matching, guidance accuracy, and the operation of individual units can all cause significant deviations in the landing point of the rocket, which hampers the recovery of onboard data recorders and seriously delays the research and development process. To address low-cost landing point positioning, and building on the Beidou communication already available on guided ammunition, this paper studies a missile-borne landing point positioning device based on the RDSS short message service and SMS. Firstly, a low-cost landing point positioning scheme is proposed, with an FPGA as the main control chip, satellite positioning as the information source, and Beidou short messages and SMS as the communication means. Secondly, the software logic scheme is designed, the communication modules are constructed, and a hardware platform is built. Finally, individual tests and program validation are designed for the important modules, and overall performance testing and environmental testing are conducted on the ground. The results indicate that the landing point positioning device is low-cost compared with telemetry and other data acquisition devices and has continuous communication capability. It supports landing point positioning and assists in the rapid recovery of the missile body and data recorder. Keywords Low cost · RDSS · SMS · Location
1 Introduction
Data acquisition is an important part of effectively driving research and development progress. For aircraft systems, the main means of data acquisition include telemetry communication, satellite communication, shortwave wireless communication, and data recorder recovery. Among these methods, data recorder recovery plays an important role in low-cost ammunition research and development because of its low cost and large data storage capacity. However, during the development phase the control system is not yet mature, which results in a significant deviation in the rocket’s landing point. Moreover, when sand-filled inert rounds are tested, especially in range areas such as grasslands and wetlands, the landing point is not obvious, and the range radar is prone to losing track near the ground. For controlled missiles, the landing point calculation error is significant. Therefore, solving the problem of locating the rocket landing point is an important condition for the use of data recorders in low-cost ammunition development.
At present, guidance transformation of rockets mainly focuses on ranges of 7–45 km [1, 2]. Within this range the flight path can be covered by mobile networks, and satellite positioning is commonly used for navigation. Therefore, combining mobile communication with satellite communication can help to recover data recorders on low-cost ammunition. The landing point positioning method commonly used in flight tests is range radar tracking, but the radar predicts the landing point position with low accuracy. Telemetry can transmit test data directly without the need to excavate the wreckage, but its cost is too high for the low-cost modification of rockets. Therefore, building on the satellite positioning technology available on the vast majority of guided ammunition, transmitting positioning data to ground equipment by Beidou short message and SMS can achieve landing point positioning at a lower cost.
S. Gao (B) · S. Su · A. Liangmushage · K. Wu · L. Chen SiChuan Aerospace Fenghuo Servo Control Technology Corporation, Chengdu 611130, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_35
2 Overall Scheme Design
2.1 Beidou Short Message Communication
The Beidou short message is an RDSS communication service that can perform positioning broadcasting and bidirectional communication. Its communication frequency can reach 1 Hz, and its working principle and process are shown in Fig. 1.
2.2 SMS Short Message Service
SMS is a low-cost service that uses existing operator base stations to communicate with other mobile devices such as mobile phones. When this module is used for communication, the main control module first converts the positioning data from the satellite positioning module into the data format defined by the SMS communication protocol. After the data is sent through the SMS service, the position can be monitored on a ground receiving device equipped with a SIM card. The basic process is shown in Fig. 2.
Fig. 1 GNSS principle
Fig. 2 SMS communication basic process
2.3 Overall Scheme Design
For the application scenario of low-cost ammunition development, the overall scheme mainly comprises the main control module, the communication module, and the power management module, as shown in Fig. 3. In this design, the main control module uses a field-programmable gate array (FPGA) as the main control chip. Compared with software-programmed ARM and DSP devices, it offers greater stability and higher data processing efficiency.
Fig. 3 Overall device plan
The communication module uses the Beidou short message and SMS services as backups for each other, achieving full communication coverage at high and low altitudes. According to our survey, one satellite receiver manufacturer can directly provide a Beidou short message communication interval of 1 s; Beidou-3 also inherits the Beidou short message service, so positioning and communication can be completed through the same channel. For SMS, the sending module is first placed in the positioning device; the main control module program converts the positioning data from satellite positioning into the data format defined by the SMS communication protocol, and the data is then sent out through the SMS service. The ground receiving device equipped with a SIM card can then obtain the position of the positioning device.
3 Software Logic Analysis and Design
3.1 Overall Software Logic Scheme Design
1. Data reception and processing: Data between the satellite positioning module and the landing point positioning device is transmitted as differential digital signals, while the thermal battery voltage is transmitted to the main control module as an analog signal, which allows the component power supply status to be monitored in real time. The control module of the landing point positioning device therefore has to handle mixed digital and analog signals and must capture, judge, filter, intercept, and cache the signals separately, extracting the useful information and sending it to the communication module (Fig. 4).
2. Related protocol analysis: To interface correctly with each module, their transmission protocols must be interpreted in detail, including the positioning protocol, the Beidou communication protocol, and the SMS AT command set.
Fig. 4 Software logic scheme
3. Framing and sending of positioning data: After internal processing by the FPGA, the positioning data is framed according to the Beidou short message communication protocol and the SMS communication protocol and, together with the corresponding commands, is transmitted at predetermined frequencies to the Beidou short message sending module and the SMS sending module, which then send it to the ground receiving equipment.
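For the SMS branch of step 3, the framing can be sketched in a few lines of Python. The sketch assumes the standard text-mode SMS commands of the AT command set (`AT+CMGF=1` and `AT+CMGS`), which the paper does not list explicitly; the helper name `sms_text_mode_commands`, the placeholder phone number, and the shortened payload are hypothetical.

```python
CTRL_Z = b"\x1a"  # terminates the message body in SMS text mode

def sms_text_mode_commands(number, text):
    """Return the byte strings to write to the modem UART, in order."""
    return [
        b"AT+CMGF=1\r",                                  # select SMS text mode
        b'AT+CMGS="' + number.encode("ascii") + b'"\r',  # start a message to `number`
        text.encode("ascii") + CTRL_Z,                   # message body, then Ctrl-Z to send
    ]

# Placeholder destination number and a shortened positioning payload.
frames = sms_text_mode_commands("+8600000000000", "GGA,3030.000,N,10400.000,E")
for frame in frames:
    print(frame)
```

A real driver, whether in FPGA logic or on a host, would wait for the modem's `OK` response and the `>` prompt between these writes; the sketch only shows the bytes that make up each frame.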
3.2 Baud Rate Adaptive Serial Port Design
In this design, serial port communication is used for data transmission between the satellite positioning module, the FPGA main control module, and the communication module. To adapt to the different transmission baud rates of other models while minimizing module changes, an FPGA-based baud rate adaptive serial port is added as a separate module. Once the baud rate has been determined, only the adaptive module needs to be removed, without affecting subsequent serial port communication. Each packet of positioning data has at most 80 bytes and always ends with a carriage return and a line feed character, whose ASCII codes in binary are 0000 1101 and 0000 1010, which ensures that every packet contains both high and low level intervals. Therefore, the first 80 bytes received can be used to measure and calculate the baud rate and so achieve baud rate adaptation.
1. In commonly used serial communication, one byte is transmitted as 10 bits: when the data line is idle, the level is high (binary “1”); when data is sent, the first bit is the low-level start bit (binary “0”), followed by 8 data bits (either most significant or least significant bit first), and finally a stop bit, which is high.
2. Assuming that the baud rate does not exceed the maximum of 460,800, the minimum frequency division parameter can be obtained (i.e. all frequency division parameters in the design are greater than this value). A divide-by-43 counter A can then be generated, which outputs a “1” once every 43 clock cycles;
3. Establish counters B and D with an initial value of 1;
4. First, scan the level of the data stream. When the first falling edge appears and is followed by a low level lasting 4 consecutive clocks, count a further 20 clocks and then sample the current level. If it is still low, the data is considered to have started, i.e. the start bit of the first byte has arrived, and counter A starts working;
5. Compare the new C value with the existing C value in the register, take the smaller of the two as the new C value, and add 1 to counter D. When D equals 820 (a line of data has at most 82 bytes; assuming each byte is 0xAA, each byte can produce 10 level changes, so the detection loop runs 82 × 10 times), the baud rate scan has covered at least one packet of positioning data, including the carriage return and line feed characters. At this point, the value in register C is the ratio of 460,800 to the actual baud rate.
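The principle behind these steps, finding the shortest interval between two level changes and dividing the reference rate by it, can be simulated in software. The sketch below is a simplified model, not the FPGA logic itself: it assumes the line is sampled once per tick at 460,800 ticks per second (the rate after the divide-by-43 stage), least-significant-bit-first data, and illustrative function names.

```python
TICK_RATE = 460_800  # ticks per second after the divide-by-43 stage (assumed)

def uart_frame(byte):
    """Line levels for one byte: start bit, 8 data bits (LSB first), stop bit."""
    data_bits = [(byte >> i) & 1 for i in range(8)]
    return [0] + data_bits + [1]

def sample_line(payload, baud):
    """Sample an idle-high serial line once per tick while `payload` is sent."""
    ticks_per_bit = TICK_RATE // baud
    levels = [1] * ticks_per_bit  # idle level before the first start bit
    for byte in payload:
        for level in uart_frame(byte):
            levels.extend([level] * ticks_per_bit)
    return levels

def shortest_level_width(levels):
    """Width, in ticks, of the shortest run between two level changes."""
    widths, run = [], 1
    for previous, current in zip(levels, levels[1:]):
        if current == previous:
            run += 1
        else:
            widths.append(run)
            run = 1
    return min(widths)

def detect_baud(levels):
    """The shortest run is one bit time, so the ratio gives the baud rate."""
    return TICK_RATE // shortest_level_width(levels)

packet = b"$GPGGA,EXAMPLE\r\n"  # placeholder packet ending in CR LF
print(detect_baud(sample_line(packet, 115_200)))
print(detect_baud(sample_line(packet, 9_600)))
```

The method relies on the packet containing at least one isolated bit; the carriage return (0000 1101) guarantees this, which is why the terminating characters matter in the description above.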
4 System Platform Construction
The system platform construction mainly covers the circuit design of the FPGA main control module and of the Beidou short message and SMS sending modules. Power is supplied by the component system, so no additional power management module needs to be designed.
4.1 FPGA Main Control Module
The main control module controls the communication of the entire landing point positioning device, including data acquisition, reception, processing, and communication. An FPGA is used because of its parallel working mode, stable operation, and reliable performance. The device is chosen to provide rich logic resources in a small package that fits the internal space of the projectile, with low energy consumption so that it does not adversely affect the system power supply.
4.2 Beidou Short Message Sending Module
The Beidou short message sending module uses the short message function unique to Beidou to send the positioning information output by the FPGA to the satellite forwarding station through its radio frequency section. To ensure the communication success rate, and taking into account the working environment and low-cost application, a certain company’s RD0538B1 chip in chip packaging is selected. It integrates the radio frequency transceiver chip, baseband chip, 5 W power amplifier chip, and other components of the Beidou satellite navigation system (Fig. 5).
Fig. 5 RDSS peripheral interface schematic diagram
4.3 Short Message Sending Module
The short message sending module sends the short message content from the FPGA to ground receiving devices through China Mobile, China Unicom, and China Telecom base stations. Considering module size, electrical parameters, RF characteristics, and low-cost application, a certain company’s SIM7600CE integrated chip is selected. It supports all three major operators and covers 2G/3G/4G networks, making it particularly suitable for flight tests in small shooting ranges (Figs. 6 and 7). The circuit of the short message sending module mainly consists of the USIM card circuit and the level conversion circuit. The USIM card internally integrates a 32-bit CPU, RAM/ROM/EEPROM memory, and a serial communication unit; an electrostatic protection chip is used to protect it. The level conversion circuit uses UART serial transmission, and a 22 ohm resistor is connected in series at both ends of the conversion chip to damp signal edges and suppress high-frequency noise.
Fig. 6 Design of USIM card interface circuit
Fig. 7 Design of level conversion circuit
5 Experimental Verification and Analysis
5.1 Beidou Short Message Communication Test
The Beidou short message communication test mainly consists of sending Beidou short messages and receiving them at the ground receiving end. Under good satellite reception conditions, messages are sent at an interval of 60 s to test the performance of the ground receiving link. Testing shows that the delay in receiving each packet of data is about 5 s. For guided rockets, an interval of 1 s per message can meet the positioning requirements (Fig. 8).
Fig. 8 Data test
Data is sent to the Beidou short message sending module from the supporting host computer to simulate operation, and the ground receiving system receives the data and converts it into readable Chinese characters for display. After multiple tests, the success rate of sending Beidou short messages can reach over 85%.
5.2 Short Message Communication Test
So far, the communication modules have passed a large number of tests; neither the Beidou short message nor the SMS link has produced errors or garbled data, and the error rate meets the requirements (Fig. 9).
5.3 Master Control Testing The FPGA program in the recycling system was burned to the development board and tested for complete data reception, caching, interception, framing, and transmission with the positioning module connected. On the left is a data graph directly output by satellite positioning, with different types of data packets, varying in length and length. On the right is the GGA data filtered out by FPGA after determining the frame head. The data frame heads are aligned and of the same length, with only
452
S. Gao et al.
Fig. 9 Receiving positioning information
Fig. 10 Comparison of positioning data before and after FPGA filtering
GGA data. Experiments have shown that through FPGA programs, GGA data can be independently judged and cached (Fig. 10). As shown in the following figure, the left image shows the final Beidou short message data packet sent by FPGA, and the right image shows the instructions and data packets that comply with the short message sending process. The experimental
Fig. 11 Instructions and data sent by FPGA framing
results show that the FPGA program can automatically extract effective positioning data (HDOP ≤ 6.00), and perform correct framing and serial port transmission (Fig. 11).
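The filtering step described above can be sketched in software. The following is only an illustrative Python model, not the FPGA implementation: it keeps a sentence when it is a GGA frame with a valid NMEA checksum and an HDOP no greater than 6.00. The function names, the talker IDs in the sample sentences, and the sample field values are assumptions made for illustration.

```python
from functools import reduce

HDOP_LIMIT = 6.00  # threshold quoted in the text


def nmea_checksum(body: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def extract_valid_gga(sentence: str, hdop_limit: float = HDOP_LIMIT):
    """Return the GGA fields if the sentence is a GGA frame with a valid
    checksum and HDOP <= hdop_limit, otherwise None."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    body, _, checksum = sentence[1:].partition("*")
    if nmea_checksum(body) != checksum.upper():
        return None
    fields = body.split(",")
    # The talker ID varies (GP, BD, GN, ...); only the sentence type is checked.
    if not fields[0].endswith("GGA") or len(fields) < 9:
        return None
    try:
        hdop = float(fields[8])  # field 8 of a GGA sentence is the HDOP
    except ValueError:
        return None  # empty HDOP field, e.g. no fix yet
    return fields if hdop <= hdop_limit else None


def make_sentence(body: str) -> str:
    """Demo helper (hypothetical): wrap a body with '$' and its checksum."""
    return f"${body}*{nmea_checksum(body)}"


# Synthetic sentences for illustration only.
good = make_sentence("GNGGA,123519.00,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,")
bad_hdop = make_sentence("GNGGA,123519.00,4807.038,N,01131.000,E,1,04,7.5,545.4,M,46.9,M,,")
rmc = make_sentence("GNRMC,123519.00,A,4807.038,N,01131.000,E,022.4,084.4,230394,,")

for s in (good, bad_hdop, rmc):
    print("keep" if extract_valid_gga(s) else "drop", s)
```

Checking the checksum before the HDOP mirrors the order "determine the frame head, then judge validity"; whether the actual FPGA logic also verifies the checksum is not stated in the text.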
6 Conclusion Based on the modular design concept of RDSS/SMS, this paper analyzes the principles of Beidou short messages and short message communication for low-cost landing point positioning. The system's work content is examined, and programs are designed for data reception and processing, framing, etc. For low-cost applications, suitable main control and communication chips are selected and the relevant hardware platforms are built. Finally, the main control program and the communication links are verified through ground experiments. The results indicate that the communication delay of the Beidou short message is around 5 seconds, and different transmission frequencies can be selected for missile types with different flight times. The short message communication test shows a high latency of about 20 seconds, but owing to its extremely low cost, it can serve as a redundant communication method for landing point positioning devices, ensuring that ground equipment can accurately receive landing point information, and it can also provide a reference for other low-cost positioning devices.
References
1. Guo-Hu, F.: Monocular Vision/Inertial Integrated Navigation Observability Analysis and Dynamic Filtering Algorithm Research. National University of Defense Technology (2012)
2. Tao, L.: Integrated Navigation System Based on Quasi-3D Vision Model. Northwestern Polytechnical University (2007)
3. Dan-Qi, C., Guo-Dong, J., Li-Bin, T.: A review of target localization methods for unmanned aerial vehicle-borne optoelectronic platforms. Winged Missiles J. 08 (2019). https://doi.org/10.16338/j.issn.1009-1319.20190063
4. Sun, L., Pack, D.: Guidance law design for tracking mobile ground targets using an unmanned aerial vehicle with a fixed camera. In: International Conference on Unmanned Aircraft Systems, pp. 235–241. IEEE (2016)
5. Li-Juan, Q.: Research on Solving Geometric Space of Monocular Vision Measurement Method. Trans. Shenyang Ligong Univ. 32(02) (2013)
6. Li, J.: Research on image blur measurement method based on multi-scale space analysis. Xihua University (2014)
7. Li-Ying, D.: Research on Inertial/Visual Integrated Navigation System. Harbin Engineering University (2017)
8. Hai-Peng, Z.: Research on Strapdown Attitude Measurement System of Marine Fiber Optic Gyroscope. North University of China (2019)
9. Hong-Lei, Q., Zi-Zhong, T., Li, C.: Positioning technology based on ORBCOMM satellite signal of opportunity. J. Beijing Univ. Aeronaut. Astronaut. (2020). https://doi.org/10.13700/j.bh.1001-5965.2019.0565
A Prediction Algorithm for Lower Limb Movement Intention Based on Plantar Pressure
Hao Li, Junyu Quan, Longfei Jia, Jing Chen, Shitong Zhou, and Zhiyuan Yu
Abstract This paper proposes a lower limb motion intention prediction algorithm based on plantar pressure. First, pressure sensors are installed on the soles of the feet and IMUs are installed on the lower limbs, and plantar pressure information and motion information, such as the angle and angular velocity of the lower limbs, are collected simultaneously during walking. Then, the sliding window method is used to divide the data into prediction samples, and a prediction model based on CNN-LSTM is trained, in which the plantar pressure is used to predict the motion intention of the lower limbs at the current moment (the motion information collected by the IMUs). Finally, the performance of the model is verified. The test results show that, with the proposed algorithm, the plantar pressure sensors can replace the IMUs installed on the lower limbs in predicting the current motion intention of the lower limbs, thereby reducing the number of installed sensors and the complexity of the exoskeleton robot's electrical system and mechanical structure.
Keywords Deep learning · Exoskeleton robot · Motion intention prediction
H. Li · J. Quan · L. Jia · J. Chen · S. Zhou (B) · Z. Yu
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_36
1 Introduction An exoskeleton robot is a typical device that is worn on and controlled by the human body and that features intelligent human-machine integration. It can improve the body's long-term load-bearing endurance and reduce bodily injury [1]. However, statistical data show that few exoskeleton robot systems have been truly applied, and exoskeleton robots still face enormous challenges [2]. As one of the key research technologies for exoskeleton robots, prediction of human lower limb motion intention requires real-time, intelligent, and accurate perception of the wearer's current motion intention, such as the current gait phase and the lower limb motion trend [3]. Usually, the gait phase is determined through plantar pressure sensors, and lower limb movement trends, such as the current angle and angular velocity of the lower limb, are collected through IMUs installed on the lower limb [4]. The more sensors there are, the more complete the collected information, but the complexity of the electrical system and mechanical structure of the exoskeleton robot also increases. To solve this problem, this paper designs a prediction algorithm based on deep learning that uses pressure sensor data to predict the lower limb movement information at the current moment, so that the pressure sensors installed on the soles take over the function of the IMUs installed on the lower limbs. The aim is to reduce the number of sensors and the complexity of the electrical system and mechanical structure of the exoskeleton robot, and to improve its performance, while preserving the performance of human motion intention prediction. The remainder of this paper is organized as follows: Sect. 2 introduces the installation locations of the sensors, the types of data collected, and the data preprocessing methods; Sect. 3 introduces a lower limb motion intention prediction algorithm based on CNN-LSTM; and Sect. 4 discusses the experiments and results.
2 Data Collection and Preprocessing To collect plantar pressure information during walking, we first developed a plantar pressure insole with four pressure collection points arranged on the sole of the foot, as shown in Fig. 1. At the same time, to collect lower limb movement intentions, we selected four IMUs and deployed them in sequence on the left and right legs, as shown in Fig. 2. The experimental personnel walked on a treadmill while plantar pressure and IMU information were collected in real time through Speedgoat, as shown in Fig. 3. Because of sensor accuracy limits and dimensional differences in the collected signals, the plantar pressure and IMU information cannot be applied directly, and filtering and normalization are required. This paper first uses the sliding window method to partition the prediction samples, and then uses the Savitzky-Golay method to filter the sensor data in each sample. The core idea of Savitzky-Golay filtering is to perform a least squares polynomial fit on a data segment of a certain length. Finally, the filtered data are normalized.
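The preprocessing chain described above (window partition, Savitzky-Golay filtering, normalization) can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the normalization is assumed to be per-window min-max scaling (the text does not specify the scheme), edge samples reuse the polynomial of the nearest full window, and the input is synthetic. The window length 100, step 10, filter length 7, and order 1 are the values reported in Sect. 4.

```python
import numpy as np


def sliding_windows(data: np.ndarray, width: int, step: int) -> np.ndarray:
    """Split (T, C) data into overlapping windows of shape (N, width, C)."""
    starts = range(0, data.shape[0] - width + 1, step)
    return np.stack([data[s:s + width] for s in starts])


def savgol_smooth(x: np.ndarray, window: int, order: int) -> np.ndarray:
    """Savitzky-Golay smoothing of a 1-D signal by explicit least-squares
    polynomial fits (odd window, len(x) >= window). Edge samples are
    evaluated on the polynomial fitted to the nearest full window."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    A = np.vander(offsets, order + 1, increasing=True)  # columns 1, t, t^2, ...
    pinv = np.linalg.pinv(A)  # maps window samples to polynomial coefficients
    out = np.empty(len(x), dtype=float)
    for n in range(len(x)):
        c = min(max(n, half), len(x) - 1 - half)  # centre of the fitted window
        coeffs = pinv @ x[c - half:c + half + 1]
        out[n] = np.polyval(coeffs[::-1], n - c)  # evaluate at offset n - c
    return out


def min_max_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each channel of a (T, C) array to [0, 1] (assumed scheme)."""
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / np.where(hi > lo, hi - lo, 1.0)


# Synthetic stand-in for 5 plantar-pressure channels.
rng = np.random.default_rng(0)
raw = rng.random((300, 5))

samples = sliding_windows(raw, width=100, step=10)
filtered = np.stack([
    np.column_stack([savgol_smooth(s[:, ch], window=7, order=1)
                     for ch in range(s.shape[1])])
    for s in samples
])
normalized = np.stack([min_max_normalize(s) for s in filtered])
print(samples.shape, normalized.shape)
```

With a first-order polynomial and a symmetric window, the interior output reduces to a 7-point moving average, which is why the least-squares form is written out explicitly here. A recording of T samples yields floor((T - 100) / 10) + 1 windows.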
3 CNN-LSTM Algorithm A convolutional neural network (CNN) is a feedforward neural network characterized by local connections and weight sharing, and it is widely used in image recognition and other fields [5]. The structure of a CNN includes
Fig. 1 Sole pressure insole
Fig. 2 IMUs installation location
convolutional layer, pooling layer, and fully connected layer, as shown in Fig. 4. The convolutional layer performs convolution operations on its input using convolution kernels of different sizes, and kernels of different sizes can extract different features. The calculation formula for the convolutional layer is as follows:

$$y_n = f\left(\sum_{i=1}^{M} W_n^{i} * x_i + b_n\right) \tag{1}$$

where $x_i$ is the input of the current convolutional layer, $b_n$ is the offset term, $W_n^{i}$ is the weight matrix, $f$ is the activation function, and $y_n$ is the nth feature extracted by the current convolutional layer. The number of parameters output by the convolutional layer is often large, which is not convenient for training. The pooling layer is mainly used for data compression to reduce the amount of computation and reduce the risk
Fig. 3 Data collection process
Fig. 4 CNN schematic diagram
Fig. 5 LSTM schematic diagram
of overfitting, usually by maximum pooling or average pooling. Finally, the feature vectors generated from the original data through convolution and pooling operations are fed into the fully connected layer, which produces the model output.

Long short-term memory (LSTM) introduces "memory unit" and "gate" mechanisms to solve the gradient explosion and gradient vanishing problems that prevent a traditional recurrent neural network (RNN) from modeling long time series, and it can fully mine the forward and backward dependencies of long time series data [6]. Its output depends not only on the input at the current time but also on the results of previous moments, and it has been widely used in natural language processing and other fields. The internal structure of LSTM is shown in Fig. 5. The "gate" mechanism in LSTM includes the input gate $i_t$, the forgetting gate $f_t$, and the output gate $o_t$. $x_t$ represents the input of the current node, $h_{t-1}$ represents the output of the previous node, and $c_{t-1}$ represents the state of the previous node. Through the input gate $i_t$, LSTM selectively remembers the input $x_t$ of the current node and the output $h_{t-1}$ of the previous node, determining how much information is stored in the current node to generate the new state $c_t$. Then, through the forgetting gate $f_t$, LSTM selectively forgets the state $c_{t-1}$ of the previous node, retaining only useful information. Finally, through the output gate $o_t$, LSTM converts $c_t$ into the output $h_t$ of the current node. The specific update process of LSTM is as follows:

$$i_t = \sigma(W_{xi} x_t + W_{hi} h_{t-1} + b_i) \tag{2}$$

$$f_t = \sigma(W_{xf} x_t + W_{hf} h_{t-1} + b_f) \tag{3}$$

$$o_t = \sigma(W_{xo} x_t + W_{ho} h_{t-1} + b_o) \tag{4}$$

$$\tilde{c}_t = \tanh(W_{xc} x_t + W_{hc} h_{t-1} + b_c) \tag{5}$$

$$h_t = o_t * \tanh(c_t) \tag{6}$$

$$c_t = f_t * c_{t-1} + i_t * \tilde{c}_t \tag{7}$$

Fig. 6 CNN-LSTM model
$W$ is the weight coefficient matrix of each gate, $b$ is the offset term, and $\sigma$ is the sigmoid activation function, which maps the output of each gate into (0, 1). From the above equations, it can be seen that the current node's state $c_t$ is generated by weighting the state of the previous node and the internal information of the current node. As long as the forgetting gate $f_t$ is not 0, LSTM can remember the information of the previous node. Building on CNN and LSTM, this paper designs a lower limb motion intention prediction algorithm based on the CNN-LSTM model, which exploits the advantages of both to fully extract the spatial and temporal features of the data. The CNN-LSTM model first mines the local spatial features of the perception samples through convolutional layers to obtain feature vectors. Unlike a single CNN model, in which the feature vectors are fed directly into the fully connected layer, the CNN-LSTM model takes the feature vectors as input to the LSTM layer, which fully mines the forward and backward dependencies of the local spatial features [7]. The specific structure of the model is shown in Fig. 6.
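The LSTM update rules in Eqs. (2) to (7) can be written out as a single time step in NumPy. This is a minimal sketch for illustration only: the parameter dictionary layout, the random initialization, the hidden size of 8, and the synthetic input are assumptions, and the trained model in the paper is not reproduced here.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM update following Eqs. (2)-(7); p holds W_x*, W_h*, b_*."""
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["b_i"])      # Eq. (2)
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["b_f"])      # Eq. (3)
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["b_o"])      # Eq. (4)
    c_tilde = np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])  # Eq. (5)
    c_t = f_t * c_prev + i_t * c_tilde                                  # Eq. (7)
    h_t = o_t * np.tanh(c_t)                                            # Eq. (6)
    return h_t, c_t


def init_params(n_in, n_hidden, rng):
    """Small random weights and zero offsets (illustrative initialization)."""
    p = {}
    for gate in "ifoc":
        p[f"W_x{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_in))
        p[f"W_h{gate}"] = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
        p[f"b_{gate}"] = np.zeros(n_hidden)
    return p


rng = np.random.default_rng(0)
params = init_params(n_in=5, n_hidden=8, rng=rng)
h, c = np.zeros(8), np.zeros(8)
for x_t in rng.random((100, 5)):  # one synthetic window of 100 time steps
    h, c = lstm_step(x_t, h, c, params)
print(h.shape, c.shape)
```

Note that Eq. (7) must be evaluated before Eq. (6), because the output $h_t$ depends on the new state $c_t$. Setting the forgetting gate close to 1 and the input gate close to 0 leaves the state unchanged, which is the memory property described in the text.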
4 Experiments and Results To verify the performance of the proposed algorithm, we recruited volunteers for experiments. After the sensors were installed at the positions described in Sect. 2, the volunteers walked on the treadmill while data were collected, as shown in Fig. 7. This paper selects the 4-axis plantar pressure and the sum of the plantar pressures of all axes as inputs for the deep learning model, totaling 5 axes. The collected data were divided into samples using the sliding window method, with a sliding window length of W = 100 and a step length of L = 10, yielding a total of 981 data samples. The divided data are then preprocessed. The Savitzky-Golay filtering effect is influenced by the data length M and the polynomial order k; through comparative experiments, M = 7 and k = 1 are selected. 70% of the samples are used as the training set, and the remaining 30% are used for testing. The parameters of the CNN-LSTM model are optimized through experiments. In deep learning, more layers, nodes, and other hyperparameters are not always better. More layers and nodes can help the model learn more complex features, but the resulting large number of model parameters requires a large training set as support, so these values need to be selected carefully. After several preliminary experiments, this section finally selects the number of CNN layers as 3, the number of LSTM nodes as 128 with 2 layers, the number of training epochs as 50, and the batch size as 128. The training set is randomly shuffled before each training epoch. Taking the angle and sagittal plane angular velocity from the IMU at the left thigh and the angle and sagittal plane angular velocity from the IMU at the left calf as examples, the test results are shown in Fig. 8. This paper uses the interpretable variance value (explained variance score), which is commonly used in the field of prediction, as the evaluation standard for prediction results. The interpretable variance
Fig. 7 Data collection for testing personnel
Fig. 8 Comparison diagram of predicted results: (a) left thigh angular velocity, (b) left thigh angle, (c) left calf angular velocity, (d) left calf angle

Table 1 Evaluation of prediction results
Data type                     Thigh angular velocity  Thigh angle  Calf angular velocity  Calf angle
Interpretable variance value  0.9573                  0.9881       0.9810                 0.9877
value measures how the dispersion of the differences between the predicted values and the samples compares with the dispersion of the samples themselves. The best model has an interpretable variance value of 1, and the worse the model, the smaller the value. Taking the IMUs at the left thigh and calf as an example, the interpretable variance values are shown in Table 1. The table shows that using plantar pressure sensor data combined with a deep learning prediction algorithm to predict the motion information of human lower limbs gives good results. This indicates that there is a mapping relationship between plantar pressure information and lower limb motion information. Plantar pressure sensors can therefore be used to predict lower limb motion information, which can reduce the complexity of the electrical system and mechanical structure.
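The evaluation metric can be sketched as follows, assuming that the "interpretable variance value" corresponds to the commonly used explained variance score, 1 - Var(y - y_pred) / Var(y). This reading matches the verbal definition above but is an assumption; the signal below is synthetic and unrelated to the values in Table 1.

```python
import numpy as np


def explained_variance(y_true, y_pred):
    """1 - Var(y_true - y_pred) / Var(y_true); equals 1 for a perfect fit."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)


# Synthetic joint-angle-like signal and a noisy prediction of it.
t = np.linspace(0.0, 4.0 * np.pi, 500)
angle = 30.0 * np.sin(t)
rng = np.random.default_rng(0)
predicted = angle + rng.normal(scale=2.0, size=t.size)

print(round(explained_variance(angle, predicted), 4))
```

Because only the variance of the residual enters the score, a prediction with a constant offset still scores 1, so this metric is usually read together with an error measure such as the RMSE.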
5 Conclusion This paper proposes a lower limb motion intention prediction algorithm based on plantar pressure using the CNN-LSTM model, which can use plantar pressure sensor data to predict lower limb motion information. While ensuring the performance of human motion intention prediction, it can reduce the number of exoskeleton robot sensors, reduce the complexity of electrical systems and mechanical structures, and improve the performance of exoskeleton robots, which is of great significance.
References
1. Sawicki, G.S., Beck, O.N., Kang, I., Young, A.J.: The exoskeleton expansion: improving walking and running economy. J. NeuroEng. Rehabil. 17(1) (2020)
2. Hamaya, M., Matsubara, T., Teramae, T., Noda, T., Morimoto, J.: Design of physical user-robot interactions for model identification of soft actuators on exoskeleton robots. Int. J. Robot. Res. 40(1), 397–410 (2021)
3. Lee, H.D., Kim, W.S., Lim, D.H., Han, C.S.: Control algorithm of the lower-limb powered exoskeleton robot using an intention of the human motion from muscle. J. Korea Robot. Soc. 12(2), 124–131 (2017)
4. Li, H., Yu, Z., Yin, Y., Yan, G., Quan, J.: Human activity recognition of exoskeleton robot based on adaptive DTW classifier. In: Proceedings of 2021 Chinese Intelligent Systems Conference, vol. II, pp. 213–221. Springer (2022)
5. Su, B.Y., Wang, J., Liu, S.Q., Sheng, M., Xiang, K.: A CNN-based method for intent recognition using inertial measurement units and intelligent lower limb prosthesis. IEEE Trans. Neural Syst. Rehabil. Eng. 27(5), 1032–1042 (2019)
6. He, Y., Zhou, C., Hu, Y.: Application of LSTM method combined with feature optimization in chiller failure detection. J. Phys.: Conf. Ser. 2442, 012026 (2023) (IOP Publishing)
Visual Localization and Map Construction Based on Ground Texture
Xin Chen and Lei Yu
Abstract This paper describes a visual localization and map construction method based on ground texture. The ground texture is scanned twice in a zigzag pattern and divided into a grid to avoid distortion of the stitched image. A global optimization method is used to calculate the image pose by reprojection constraint and key frame selection to stitch the global map. For positioning, the global positioning is transformed into a local positioning process through global search and local optimization, which improves the speed and accuracy of positioning. For map updating, local optimization is performed by constructing loopbacks for possible stain occlusion or breakage problems to solve tracking loss and repositioning problems. Experimental results show that the ground texture stitching and localization results after global optimization match well with the true values, and the global maps constructed by stitching are aligned with the true physical locations. Keywords Ground texture · Visual localization · Map construction · Map update · Global optimization
X. Chen · L. Yu (B)
School of Mechanical and Electric Engineering, Soochow University, Suzhou, China
e-mail: [email protected]
L. Yu
Key Laboratory of Opto-Technology and Intelligent Control, Ministry of Education, Lanzhou Jiaotong University, Lanzhou, China
Jiangsu Key Laboratory of Advanced Manufacturing Technology, Huaiyin Institute of Technology, Huai’an, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_37
1 Introduction Ground texture localization is a technique that uses ground texture information for accurate position estimation. It achieves accurate localization of agents (e.g., vehicles and robots) by analyzing and matching texture patterns on the ground, texture features, and relative relationships between textures. In high-precision localization tasks, such as autonomous navigation of indoor robots and precise guidance of driverless cars in parking lots, ground texture-based visual localization methods show great potential. The basic principle of high-precision ground texture localization relies on the stability and uniqueness of ground textures and usually draws on computer vision techniques, including image processing, feature extraction, matching, and localization algorithms. Ground texture localization offers many advantages. Compared with other environmental features, ground texture remains stable and has high information density. It is not affected by occlusion and can work in dynamic environments that lack static landmarks. It is also independent of lighting conditions. In addition, it does not rely on personal information, avoiding privacy concerns [1]. For task-relevant ground coverings, such as asphalt and concrete, ground texture localization provides reliable centimeter-level positioning [2, 3] and can support efficient and stable autonomous agents in most domains [3, 4]. Currently, ground texture-based localization is one of the research hotspots in precise positioning, especially for autonomous vehicles. In [5], a ground texture-based map matching method is proposed to solve the intelligent vehicle localization problem. A camera and a controllable light source are installed on the bottom of the vehicle, RTK-GPS is used for map construction, and a global ground texture map is created by combining sensor data such as camera and odometer data. Ranger [6] is a high-precision localization system for ground vehicles that uses a downward-facing camera for map-based localization, with accuracy at the centimeter level.
Streetmap [2] uses a downward-facing camera for mapping and localization on various ground planes with different textures, using feature-based and line-based methods to achieve accurate, real-time localization. In this paper, we explore how to select the best front-end combination by pairing different feature point detectors with different descriptors. The detectors include CenSurE [8], MSD, Harris, SURF [9], AKAZE, FAST, GFTT, MSER, AGAST, ORB [10], and BRISK [11], and the binary descriptors include BEBLID [12], BRIEF [13], LATCH, BRISK, FREAK [14], and TEBLID. Because of the specificity of ground texture information, the optimal visual front end may differ between ground textures, so the front end combines mainstream open-source feature point and descriptor algorithms; in the final system, SURF and BRIEF are used as the visual front end. This paper investigates the image stitching and localization problems in ground texture localization and improves the accuracy of stitching and localization through global optimization and global localization methods. The camera horizontal motion model and the map update method are also introduced. In the comparison experiments before and after global optimization, the degree of agreement between the image acquisition trajectory and the true value after global optimization is demonstrated, with supporting quantitative data. Finally, through global optimization comparison experiments and localization comparison experiments, the method is found to
perform well in terms of accuracy and robustness, providing valuable references for research and applications in the field of visual localization.
2 Off-Line Map Building and Online Positioning Our system is divided into two parts, offline construction and online localization, and the flow chart of the algorithm is shown in Fig. 1. We store this information and filter the key frames according to the overlap rate, then stitch these key frames to form a visual map and build an offline database through the bag-of-words model. Online localization extracts image features to build a visual vector and searches the database for connected key frames; after this global localization, local optimization is performed to output the pose in real time.
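The database search mentioned above can be illustrated with a toy bag-of-words retrieval. This is a simplified sketch, not the system's implementation: each image is assumed to be already quantized into visual-word indices, the vectors are plain L2-normalized histograms scored by cosine similarity (practical vocabularies typically add TF-IDF weighting), and the keyframe names and word lists are invented for the example.

```python
import numpy as np


def bow_vector(word_ids, vocab_size):
    """L2-normalised histogram of visual-word indices for one image."""
    hist = np.bincount(word_ids, minlength=vocab_size).astype(float)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist


def rank_keyframes(query_vec, database, top_k=3):
    """Return (keyframe_id, score) pairs sorted by cosine similarity."""
    scores = {kf_id: float(query_vec @ vec) for kf_id, vec in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


VOCAB_SIZE = 6  # toy vocabulary (assumption)
database = {
    "kf0": bow_vector([0, 0, 1, 2], VOCAB_SIZE),
    "kf1": bow_vector([3, 4, 4, 5], VOCAB_SIZE),
    "kf2": bow_vector([0, 1, 1, 2], VOCAB_SIZE),
}
query = bow_vector([0, 0, 1, 2, 2], VOCAB_SIZE)

ranked = rank_keyframes(query, database)
for kf_id, score in ranked:
    print(kf_id, round(score, 3))
```

The top-ranked keyframes would then serve as candidates for the local optimization step, which is what turns a global search into a local pose refinement.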
2.1 Image Scanning As shown in Fig. 2, since we assume that the ground is approximately horizontal, we need to correct the physical position of the camera so that the camera plane is parallel to the physical plane before the system starts. We use a homography matrix to describe this relationship. After extracting the feature points, we use the RANSAC [2] algorithm to minimize the reprojection error of the feature points and calculate the homography matrix. The homography matrix can be expressed as
Fig. 1 Algorithm flowchart
468
X. Chen and L. Yu
$$H = R + \frac{t\, n^{T}}{d} \tag{1}$$
where $R$ denotes the 3D rotation between two frames, $t$ denotes the translation vector, $n$ denotes the normal vector of the horizontal plane, and $d$ denotes the depth value in the camera coordinate system, which is the distance from the camera plane to the horizontal plane. We determine whether the camera is parallel to the ground by decomposing the homography matrix $H$ to obtain the plane normal vector. Decomposing this matrix yields multiple sets of solutions [7], but the other solutions can be eliminated using a priori information. We only need to determine whether the absolute value of the z-component of the normal vector is close enough to 1. Thus, we can obtain:

$$H = [h_1, h_2, h_3] \tag{2}$$

$$n_z = h_2^{T} h_3 \pm \sqrt{(h_2^{T} h_3)^2 - (\|h_2\|^2 - 1)(\|h_3\|^2 - 1)} \tag{3}$$

$$\epsilon = \operatorname{sign}\left(-h_1^{T}\left(I + [h_2]_{\times}^{2}\right) h_3\right) \tag{4}$$

where $n_z$ is the z-axis component of the ground direction vector $n$ and $\epsilon$ denotes the positive or negative sign. As shown in Fig. 3, we scan the ground texture twice in a zigzag pattern, the first time from point A to D, and the second time from D to C. The two scans cross over to divide the ground into a grid. If we scan only once, the final stitched image is clearly distorted in one direction. Each scanned row is treated as a sequence of images. A tight loop is created between adjacent image sequences, which lays the groundwork for the later global optimization.
Fig. 2 Camera horizontal motion model
Fig. 3 Data collection roadmap (scan points A, B, C, D; legend: first scan, second scan, map grid, agent)
2.2 Global Optimizations As shown in Fig. 4, each row of captured images is treated as a sequence, and the sequences form loopbacks with reprojection constraints between adjacent keyframes. Global localization uses a bag-of-words vector search to find a set of connected keyframe regions, after which further filtering identifies two keyframes as candidates for accurate localization. After the texture images are captured, good loopbacks exist between adjacent image sequences, so we construct reprojection constraints to perform global optimization. To speed up the computation and also control the size of the offline database, our
Fig. 4 Global optimization and global positioning schematic
Fig. 5 Keyframe splicing
Fig. 6 Global visual map obtained from stitching
optimization is based on key frames, and we do not need to take out all image frames for optimization. We only need sufficiently robust matching features between key frames. Therefore, we select keyframes based on the overlap area, which yields a grid-like distribution of keyframes similar to the one in Fig. 5. We use the G2O optimization framework with the Levenberg-Marquardt (LM) optimization algorithm. We take the pose of each image as a vertex and the reprojection error constraints of matched feature points in the overlapping part between images as edges to construct the least squares problem. The optimization objective function is:

$$\min_{T \in O} \sum_{p_k \in B(i,j)} \left\| p_k^{i} - T_i^{-1} T_j\, p_k^{j} \right\|^2 \tag{5}$$

where $p_k$ denotes the k-th feature point that has a matching relationship between image i and image j, $T_i$ denotes the pose of image i in the global map, $B(i, j)$ denotes the set of feature points that have a matching relationship between images i and j, and $O$ denotes the set of poses of all images involved in the optimization. With global optimization, we can calculate the poses of the acquired images and stitch a visual global map by placing the images at the correct positions in the global map (as in Fig. 6). However, we do not want to use all the captured images, because it is only necessary to cover the map texture. We therefore filter out some key frames to stitch the global image and build the offline database. We determine the keyframes by calculating the overlap ratio $f = s_o / S_I$ of adjacent images, where $s_o$ is the area of the overlap and $S_I$ is the area of the acquired image. The overlap ratio should be neither too high nor too low; 40–50% is suggested.
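The cost of one edge in Eq. (5) and the overlap ratio used for keyframe selection can be sketched as follows. This is an illustrative sketch only: poses are assumed to be planar rigid transforms (x, y, theta) written as 3x3 homogeneous matrices, the overlap is computed for axis-aligned images with pure translation, and all numeric values are invented. The actual optimization in the text is carried out with G2O, which is not used here.

```python
import numpy as np


def se2(x, y, theta):
    """Homogeneous 3x3 matrix of a planar pose (assumed parameterisation)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, x], [s, c, y], [0.0, 0.0, 1.0]])


def edge_cost(T_i, T_j, pts_i, pts_j):
    """Sum over matches of ||p_k^i - T_i^{-1} T_j p_k^j||^2 for one image pair."""
    T_ij = np.linalg.inv(T_i) @ T_j
    homog_j = np.column_stack([pts_j, np.ones(len(pts_j))])
    predicted_i = (T_ij @ homog_j.T).T[:, :2]
    return float(np.sum((pts_i - predicted_i) ** 2))


def overlap_ratio(dx, dy, width, height):
    """f = s_o / S_I for two equally sized, axis-aligned images offset by (dx, dy)."""
    s_o = max(0.0, width - abs(dx)) * max(0.0, height - abs(dy))
    return s_o / (width * height)


# Two synthetic image poses and matched feature points (pixel coordinates).
T_i, T_j = se2(0.0, 0.0, 0.0), se2(100.0, 20.0, 0.1)
rng = np.random.default_rng(0)
pts_j = rng.uniform(0.0, 640.0, size=(20, 2))
homog = np.column_stack([pts_j, np.ones(20)])
pts_i = (np.linalg.inv(T_i) @ T_j @ homog.T).T[:, :2]  # exact matches

print(edge_cost(T_i, T_j, pts_i, pts_j))                # ~0 at the true poses
print(edge_cost(T_i, se2(105.0, 20.0, 0.1), pts_i, pts_j))  # grows when a pose is off
print(overlap_ratio(320.0, 0.0, 640, 480))
```

Summing this edge cost over all overlapping keyframe pairs gives the objective that the optimizer minimizes over the poses. A new keyframe could be selected, for instance, when the overlap ratio with the previous keyframe falls into the suggested 40–50% band.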
2.3 Localization Global Localization: During initialization and re-localization, we conduct a global search in the offline database. Instead of relying on feature points, we use image-based
search, significantly improving localization speed. Using the bag-of-words vector, we search for the image in the offline database, which yields a ranked list of candidate frames. Owing to the zigzag acquisition process, frames with similar scores tend to form a local grid of keyframes, while the highest scoring frames are typically consecutive and the query lies between two keyframes. This approach transforms global localization into a local process by narrowing the search down to a small area. Local Localization: Local localization is performed after global localization or during normal tracking. We compare the two adjacent candidate keyframes, take the one with the higher score together with the other keyframes connected to it, and then perform a local optimization over the frames shared between these two keyframes and the other keyframes, together with the current localization frame, to calculate the pose of the current localization frame. In the normal tracking process, we have a priori information, so we can perform local optimization directly. Since the feature points and descriptors are already computed when building the database, this step only requires extracting the feature point descriptors of the current frame. Similar to the global optimization during offline map building, we have:

$$\min_{T} \sum_{i} \sum_{k} \left\| p_k - T^{-1} T_i\, p_k^{i} \right\|^2 \tag{6}$$

where $p_k$ denotes the pixel coordinate of the k-th feature point that has a matching relationship between the current frame and frame i participating in the local optimization, and $T_i$ denotes the pose in the global map of database image frame i participating in the local optimization; the sum runs over the current localization frame and all frames participating in the optimization. We again invoke the G2O framework to perform iterative optimization using the LM algorithm.
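For intuition about Eq. (6), the special case of a single planar rigid pose with known matches has a closed-form least-squares solution. The sketch below uses that SVD-based alignment as a stand-in; it is not the G2O/LM procedure described in the text. It assumes the matched database points have already been mapped into global coordinates (g_k = T_i p_k^i) and that the pose is a 2D rotation plus translation without scale; because a rigid transform preserves distances, minimizing ||p_k - T^{-1} g_k|| is equivalent to minimizing ||T p_k - g_k||. All data are synthetic.

```python
import numpy as np


def estimate_pose_2d(pts_current, pts_global):
    """Least-squares rigid transform (R, t) with R @ p + t ~= g, via SVD."""
    mu_p, mu_g = pts_current.mean(axis=0), pts_global.mean(axis=0)
    cov = (pts_global - mu_g).T @ (pts_current - mu_p)
    U, _, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])  # forbid reflections
    R = U @ D @ Vt
    t = mu_g - R @ mu_p
    return R, t


# Synthetic ground truth: current frame rotated by 0.3 rad and shifted.
theta_true, t_true = 0.3, np.array([1.5, -0.7])
R_true = np.array([[np.cos(theta_true), -np.sin(theta_true)],
                   [np.sin(theta_true), np.cos(theta_true)]])

rng = np.random.default_rng(0)
pts_current = rng.uniform(0.0, 640.0, size=(30, 2))  # pixels in current frame
pts_global = (R_true @ pts_current.T).T + t_true     # matched points in the map
noisy_global = pts_global + rng.normal(scale=0.5, size=pts_global.shape)

R_est, t_est = estimate_pose_2d(pts_current, noisy_global)
theta_est = np.arctan2(R_est[1, 0], R_est[0, 0])
print(round(theta_est, 4), np.round(t_est, 2))
```

An iterative solver such as LM is still preferable in practice because it handles robust kernels and constraints from several keyframes at once; the closed form is useful mainly as an initial guess or a sanity check.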
2.4 Map Update During operation, parts of the ground may be partially obscured or damaged due to human factors, which may cause tracking loss or even make relocalization fail. We have noticed that such cases usually affect only small areas, such as ground obscured by stains. To solve this problem, we propose the following solution. As shown in Fig. 7, the camera-equipped AGV passes through a stained and blurred area, and we choose three images A, B, and C to illustrate the process. Image A, taken before entering the stained area, can be tracked and localized normally, so its position is known. For B, since the offline database contains no information about it, we perform normal image tracking but do not optimize its pose; we just record its image data and set a range for its position in the global map. Image C can be globally localized to determine its pose. Since we trust the poses of A and C, which is equivalent to constructing a loopback, we construct reprojection constraints on image frames A, B, C and all
Fig. 7 Map masking and updating
frames taken in the stained area for local optimization. Note that we also save the results of this optimization in the database, and we do not delete the previous data in the staining region because usually the staining in this part may revert to its original state.
3 Algorithm Effect Comparison Experiments 3.1 Introduction to the Experimental Platform In order to verify the rationality of the aforementioned algorithm, we constructed an experimental platform as depicted in Fig. 8. The experimental platform encompasses various components including a single-line LiDAR, an IMU, two hub motors, a motor driver, a buck module, a hood, a host computer, a wireless relay, and a microcontroller. The ground camera is positioned perpendicular to the ground and placed inside the hood, which is equipped with a light source to maintain stable lighting conditions during image capture. This meticulous design ensures reliable data acquisition and facilitates accurate evaluation of the algorithm’s performance.
3.2 Global Optimization Before and After Comparison Experiment
After determining the best combination of feature detector and descriptor, we acquire ground texture images and then select key frames for image stitching to build a visual global map. We use the marble ground in Zhang's dataset [3] to
Fig. 8 Hardware diagram of the experimental platform
conduct the experiments, and we compare the global map trajectory obtained from inter-frame pose transformations alone with the trajectory after global optimization, using the true values as reference. Figure 9a compares the keyframe poses with the true values before and after optimization, and Fig. 9b shows the error color map of the optimized keyframe poses against the true values. Figure 9c compares the individual error components of the keyframe poses before and after optimization, and Fig. 9d shows the absolute error between the optimized poses and the true values. From Fig. 9a we can see that the image acquisition trajectory fits the true values after global optimization. Figure 9b further analyzes the absolute error between the globally optimized trajectory and the true values; the average error is 1.9 cm, and the detailed figures are given in Table 1, where Rms denotes the root mean square error, Sse the sum of squared errors, and Std the standard deviation. From Fig. 9c we find a large error in the x-direction for poses obtained only from inter-frame image transformations, which may be due to the camera resolution and the image acquisition direction. Figure 9d plots the absolute pose error; the error is smaller in the middle part because of the global optimization constraint, and the overall curve shows a parabolic trend. After obtaining the poses, we stitch the key frames to obtain the visual global map shown in Fig. 10. Figure 10a shows the marble floor stitched before pose optimization, and Fig. 10b the marble floor stitched after pose optimization. Figure 10c shows a granite floor stitched before pose optimization, and Fig. 10d the granite floor stitched after pose optimization. We
Fig. 9 Image acquisition route and global optimization trajectory map
Table 1 Comparison of global optimization and original result error (m)
            Max        Mean       Median     Min        Rms        Sse         Std
Optimized   0.033124   0.014570   0.013934   0.000870   0.016186   0.042966    0.007050
Original    0.752925   0.434271   0.403854   0.132115   0.460481   34.775066   0.153139
use the dataset provided by Zhang [3], where (a) and (b) show the marble floor and (c) and (d) the granite floor. Panels (a) and (c) are stitched directly from the unoptimized poses, while (b) and (d) are stitched from the optimized poses. The images stitched without optimization are visibly misaligned, whereas the optimized images are coherent and essentially agree with the real-world physical positions.
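The statistics reported in Table 1 can be computed from the per-keyframe absolute position errors as in the following sketch. It assumes the errors are available as a plain array in metres and uses the population standard deviation; the function name `error_statistics` and the three sample values are illustrative only. Under these definitions Rms² = Mean² + Std² and Sse = N · Rms², and the values in Table 1 are consistent with these identities for N of about 164 keyframes.

```python
import numpy as np


def error_statistics(errors):
    """Summary statistics of absolute position errors (same unit as input)."""
    e = np.asarray(errors, dtype=float)
    return {
        "max": float(e.max()),
        "mean": float(e.mean()),
        "median": float(np.median(e)),
        "min": float(e.min()),
        "rms": float(np.sqrt(np.mean(e ** 2))),  # root mean square error
        "sse": float(np.sum(e ** 2)),            # sum of squared errors
        "std": float(e.std()),                   # population standard deviation
    }


# Hypothetical errors in metres, only to show the call.
stats = error_statistics([0.01, 0.02, 0.03])

# Sample count implied by the "Optimized" row of Table 1 (Sse / Rms^2).
implied_samples = 0.042966 / 0.016186 ** 2
```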
Fig. 10 Global visual image stitching
3.3 Localization Comparison Experiments
To further validate our approach against other visual localization algorithms, we selected several well-established open-source visual and visual-inertial SLAM algorithms from recent years. We installed a forward-facing camera and an IMU on our cart to run them. We chose ORB_SLAM2 [15], ORB_SLAM3 [16] and VINS_Mono [17] for comparison with our approach, where ORB_SLAM2 is a purely visual SLAM system and ORB_SLAM3 and VINS_Mono are visual-inertial SLAM systems. Since monocular ORB_SLAM2 lacks scale information, we use the evo evaluation tool to align the trajectories for comparative analysis. The experimental results are shown in Fig. 11. Figure 11a plots the trajectories of the open-source algorithms and of our texture localization algorithm. Figure 11b shows the error color map of texture localization against ORB_SLAM3. Figure 11c compares the absolute pose error of our method with respect to ORB_SLAM3. Figure 11d is a bar chart comparing each error metric between the
different methods and ORB_SLAM3. Figure 11a shows that the trajectory of our algorithm is basically the same as those of the open-source SLAM algorithms. The length of the trajectory is about 47.5 m. From Fig. 11b we can see that our algorithm and ORB_SLAM3 give similar results, with an average error of about 0.6%. Figure 11c gives the curves of the other related metrics. From the comparison in Fig. 11d, our algorithm is closer to ORB_SLAM3, which we take as the reference because it is regarded as the best open-source algorithm at present, than VINS_Mono and ORB_SLAM2 are. Table 2 gives the detailed figures.
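Because a monocular trajectory is only defined up to scale, it has to be aligned to the reference with a similarity transform before errors are computed. The sketch below shows the closed-form least-squares (Umeyama) alignment that trajectory evaluation tools commonly apply for this purpose. It is an illustration on synthetic 2D points, not the exact procedure or settings used in the experiments, and the function name `umeyama_alignment` is hypothetical.

```python
import numpy as np


def umeyama_alignment(src, dst, with_scale=True):
    """Find s, R, t minimising sum ||dst_i - (s * R @ src_i + t)||^2."""
    src = np.asarray(src, dtype=float)   # shape (N, d)
    dst = np.asarray(dst, dtype=float)   # shape (N, d)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    sc, dc = src - mu_s, dst - mu_d
    cov = dc.T @ sc / len(src)           # cross-covariance of the two point sets
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(src.shape[1])
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[-1, -1] = -1.0                 # avoid returning a reflection
    R = U @ S @ Vt
    var_s = (sc ** 2).sum() / len(src)
    s = float(np.trace(np.diag(D) @ S) / var_s) if with_scale else 1.0
    t = mu_d - s * R @ mu_s
    return s, R, t


# Synthetic check: the reference is a scaled, rotated, shifted copy of the estimate.
rng = np.random.default_rng(0)
est = rng.normal(size=(50, 2))
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle), np.cos(angle)]])
ref = 2.0 * est @ R_true.T + np.array([1.0, -2.0])
s, R, t = umeyama_alignment(est, ref)
aligned = s * est @ R.T + t
```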
Fig. 11 Comparison plots of the effect of different algorithms (panels a–d; the bar chart compares Std, Rms, Min, Median, Mean and Max for ORB_SLAM2, VINS_Mono and Ours)
Table 2 Different algorithms and ORB_SLAM3 errors (m)
            Max        Mean       Median     Min        Rms        Sse         Std
Ours        0.410635   0.155881   0.159022   0.041419   0.170758   10.817728   0.069710
VINS_Mono   0.486825   0.233362   0.220878   0.025640   0.256475   45.321855   0.106403
ORB_SLAM2   0.545633   0.288430   0.283961   0.074245   0.308171   47.864517   0.108525
4 Conclusion
The paper introduces a method for visual localization and map building based on ground texture. It captures the ground texture along a zigzag scanning pattern and divides it into grids instead of stitching the images together, which avoids distortion. The proposed approach computes image poses by global optimization with reprojection constraints and keyframe selection, which enables the creation of a global map. To improve speed and accuracy, global localization is transformed into a local process through global search followed by local optimization. Loop closures are used to handle the tracking loss and relocalization problems caused by occlusion or damage. Experimental results show that the ground texture stitching and the localization results agree well with the ground truth, and that the constructed global map corresponds to the physical locations. Comparative experiments with open-source visual localization algorithms show that the trajectory of the proposed method is closer to that of ORB_SLAM3 than those of VINS_Mono and ORB_SLAM2. The method is promising for precise localization in applications such as autonomous navigation and self-driving vehicles.
Acknowledgements The work is supported by National Natural Science Foundation of China (61873176); Opening Foundation of Key Laboratory of Opto-technology and Intelligent Control, Ministry of Education; The open fund for Jiangsu Key Laboratory of Advanced Manufacturing Technology (HGAMTL-2202); Tang Scholar of Soochow University and Jiangsu Province’s “333 High Level Talent Training Project”. The authors would like to thank the referees for their constructive comments.
References
1. Schmid, J.F., Simon, S.F., Mester, R.: Ground texture based localization using compact binary descriptors. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 1315–1321 (2020)
2. Chen, X., Vempati, A.S., Beardsley, P.: StreetMap - mapping and localization on ground planes using a downward facing camera. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1672–1679 (2018)
3. Zhang, L., Finkelstein, A., Rusinkiewicz, S.: High-precision localization using ground texture. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 6381–6387 (2019)
4. Kelly, A., Nagy, B., Stager, D., et al.: Field and service applications - an infrastructure-free automated guided vehicle based on computer vision - an effort to make an industrial robot vehicle that can operate without supporting infrastructure. IEEE Robot. Autom. Mag. 14(3), 24–34 (2007)
5. Fang, H., Yang, M., Yang, R., et al.: Ground-texture-based localization for intelligent vehicles. IEEE Trans. Intell. Transp. Syst. 10(3), 463–468 (2009)
6. Kozak, K., Alban, M.: Ranger: a ground-facing camera-based localization system for ground vehicles. In: 2016 IEEE/ION Position, Location and Navigation Symposium (PLANS), pp. 170–178 (2016)
7. Malis, E., Vargas, M.: Deeper understanding of the homography decomposition for vision-based control. 90 (2007)
8. Agrawal, M., Konolige, K., Blas, M.R.: Censure: center surround extremas for realtime feature detection and matching. In: Computer Vision-ECCV 2008: 10th European Conference on Computer Vision, pp. 102–115. Springer (2008)
9. Bay, H., Ess, A., Tuytelaars, T., Van Gool, L.: Speeded-up robust features. Comput. Vis. Image Underst. 110(3), 346–359 (2008)
10. Rublee, E., Rabaud, V., Konolige, K., Bradski, G.: Orb: an efficient alternative to sift or surf. In: 2011 IEEE International Conference on Computer Vision (ICCV), pp. 2564–2571. IEEE (2011)
11. Leutenegger, S., Chli, M., Siegwart, R.Y.: Brisk: Binary robust invariant scalable keypoints. In: 2011 International Conference on Computer Vision (ICCV), pp. 2548–2555. IEEE (2011)
12. Suárez, I., Sfeir, G., Buenaposada, J.M., et al.: BEBLID: boosted efficient binary local image descriptor. Pattern Recogn. Lett. 133, 366–372 (2020)
13. Calonder, M., Lepetit, V., Strecha, C., Fua, P.: Brief: binary robust independent elementary features. In: Computer Vision-ECCV 2010: 11th European Conference on Computer Vision, pp. 778–792. Springer (2010)
14. Alahi, A., Ortiz, R., Vandergheynst, P.: Freak: fast retina keypoint. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 510–517. IEEE (2012)
15. Mur-Artal, R., Tardós, J.D.: Orb-slam2: an open-source slam system for monocular, stereo, and rgb-d cameras. IEEE Trans. Rob. 33(5), 1255–1262 (2017)
16. Campos, C., Elvira, R., Rodríguez, J.J.G., et al.: Orb-slam3: an accurate open-source library for visual, visual-inertial, and multimap slam. IEEE Trans. Rob. 37(6), 1874–1890 (2021)
17. Qin, T., Li, P., Shen, S.: Vins-mono: a robust and versatile monocular visual-inertial state estimator. IEEE Trans. Rob. 34(4), 1004–1020 (2018)
Parameter Optimization of Tracked Vehicle Steering Control Strategy Based on Particle Swarm Optimization Algorithm Yunfeng Wang, Hongcai Li, Yue Ma, and Xuzhao Hou
Abstract The electric drive system relies on high-power generators to supply the electric energy required for vehicle driving and combat, and it has become an important prerequisite for the development of future all-electric tanks. This paper studies control parameter optimization to improve the control accuracy and stability of the steering control strategy of electric tracked vehicles. The steering control strategy previously designed by the authors for a series hybrid tracked vehicle with dual-motor coupling drive, based on active disturbance rejection control (ADRC), is partially improved, and a control parameter optimization algorithm based on particle swarm optimization (PSO) is designed. The integral of time multiplied by the absolute error (ITAE) criterion is used as the evaluation function of the PSO algorithm to optimize the key control parameters of the steering control strategy and thereby optimize the output of the tracked vehicle steering control system. MATLAB/Simulink and a Speedgoat semi-physical simulation platform are used to test the steering control strategy before and after parameter optimization. The comparative test results verify the effectiveness of the parameter optimization.
Keywords Tracked vehicles · Steering control strategy · Parameter optimization · Particle swarm optimization · Semi-physical simulation
Y. Wang · H. Li · Y. Ma (B) · X. Hou
School of Mechanical Engineering, Beijing Institute of Technology, Haidian, Beijing 085500, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_38

1 Introduction
The steering control strategy of electric drive tracked vehicles generally adopts torque control: during steering, the drive motors on the left and right sides are controlled to produce different driving forces, so that the drive wheels on the two sides run at different speeds and the vehicle turns. An electric drive tracked vehicle can flexibly control the driving force of its drive motors, which is
conducive to the realization of tracked vehicle dynamics control [1]; the torque control characteristics of the drive motors determine the steering performance of the vehicle [2]. Researchers at home and abroad have studied the steering control of electric drive tracked vehicles. Classic steering control strategies [3–6] generally use PID control to achieve closed-loop control of a control target. However, the factors that influence the steering performance of tracked vehicles are complex, including nonlinear steering forces, dual-motor coupling drive, strongly varying road surface conditions, and harsh driving conditions, and the PID control method has defects such as poor adaptability to different vehicle speeds, large overshoot caused by integral saturation, and a serious “noise amplification” effect [7]. Related theoretical studies [7–13] found that active disturbance rejection control (ADRC) can effectively reduce the steering control error caused by variable road conditions and random factors. The authors previously designed a steering control strategy for tracked vehicles based on ADRC, which observes the disturbance acting on the vehicle in real time and reduces the steering control error caused by variable road conditions and random factors through disturbance compensation. The extended state observer (ESO) is a key part of the ADRC structure, and its stability and working performance directly affect the stability and control effect of ADRC. The ESO control parameters in the previously designed steering control strategy were obtained from experience or lookup tables, so there is no guarantee that they are ideal values.
To improve the control accuracy and stability of the steering control strategy of electric drive tracked vehicles, this paper optimizes the parameters of the ESO. The particle swarm optimization algorithm has few parameters to set, converges quickly, and is simple to implement [14–16], and it has been widely used in optimization problems in various engineering fields [17, 18]. Therefore, this paper optimizes the parameters of the previously designed steering control strategy with the particle swarm optimization algorithm.
2 Steering Control Strategy
2.1 Controller Structure
The structure of the steering controller of the tracked vehicle is shown in Fig. 1. The driver model outputs the accelerator pedal signal, the brake pedal signal and the steering wheel angle signal. Signal analysis converts these into the corresponding driving torque and reference steering angular speed, and the steering controller then computes the steering torque from the reference steering angular speed, the actual steering angular speed, and the actual vehicle speed. Next, the coupled steering and driving torques are applied to the drive motors on both sides, outputting
Fig. 1 Controller structure diagram
power, and the speed of the driving wheels on both sides is obtained through the power transmission. The current actual vehicle speed and actual steering angular speed are obtained through signal analysis and fed back to the driver model and the steering controller, which closes the control loop. The signal analysis mentioned above is as follows. The driving torque is obtained from the accelerator and brake pedal signals. With $\alpha_{\max} = 15^\circ$, $\beta_{\max} = 15^\circ$, $T_{\alpha\max} = 1150$ N·m and $T_{\beta\max} = -1150$ N·m, the accelerator and brake pedal torques are respectively
$$T_\alpha = \frac{\alpha}{\alpha_{\max}} \cdot T_{\alpha\max}, \qquad T_\beta = \frac{\beta}{\beta_{\max}} \cdot T_{\beta\max} \tag{1}$$
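The pedal signal analysis follows the principle of brake priority, so the driving torque is
$$T_D = \begin{cases} T_\alpha, & T_\beta = 0 \\ T_\beta, & T_\beta \neq 0 \end{cases} \tag{2}$$
The reference steering angular speed is obtained from the steering wheel angle signal. With $\theta_{\max} = 90^\circ$, $\Delta\theta = 5^\circ$ and $\omega_{\max} = 1.2$ rad/s, the reference steering angular speed is
$$\omega_0 = \begin{cases} \dfrac{\theta + \Delta\theta}{\theta_{\max} - \Delta\theta} \cdot \omega_{\max}, & \theta \in [-\theta_{\max}, -\Delta\theta) \\ 0, & \theta \in [-\Delta\theta, \Delta\theta] \\ \dfrac{\theta - \Delta\theta}{\theta_{\max} - \Delta\theta} \cdot \omega_{\max}, & \theta \in (\Delta\theta, \theta_{\max}] \end{cases} \tag{3}$$
so steering wheel angles within $\pm\Delta\theta$ are treated as a dead zone.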
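Equations (1) to (3) can be written down directly, as in the following sketch. It uses the limit values stated above; the function names are illustrative, and clamping the steering wheel angle to ±θmax is an added assumption for inputs outside the range covered by Eq. (3).

```python
ALPHA_MAX = 15.0       # accelerator pedal travel (deg)
BETA_MAX = 15.0        # brake pedal travel (deg)
T_ALPHA_MAX = 1150.0   # maximum accelerator torque (N*m)
T_BETA_MAX = -1150.0   # maximum brake torque (N*m)
THETA_MAX = 90.0       # steering wheel limit (deg)
DELTA_THETA = 5.0      # steering wheel dead zone (deg)
OMEGA_MAX = 1.2        # maximum reference steering angular speed (rad/s)


def driving_torque(alpha, beta):
    """Eqs. (1) and (2): pedal torques with brake priority."""
    t_alpha = alpha / ALPHA_MAX * T_ALPHA_MAX
    t_beta = beta / BETA_MAX * T_BETA_MAX
    return t_beta if t_beta != 0.0 else t_alpha


def reference_steering_speed(theta):
    """Eq. (3): dead zone followed by a linear map of the wheel angle."""
    # Assumption: angles beyond the mechanical limit are clamped.
    theta = max(-THETA_MAX, min(THETA_MAX, theta))
    span = THETA_MAX - DELTA_THETA
    if theta < -DELTA_THETA:
        return (theta + DELTA_THETA) / span * OMEGA_MAX
    if theta > DELTA_THETA:
        return (theta - DELTA_THETA) / span * OMEGA_MAX
    return 0.0
```

For example, half accelerator travel with the brake released gives half of the maximum accelerator torque, while any brake input overrides the accelerator.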
From the speeds of the drive motors on both sides, the speeds of the drive wheels on both sides can be derived, and signal analysis then gives the actual vehicle speed and the actual steering angular speed as
$$v_a = \frac{0.377\, r_z (n_1 + n_2)}{2 i_c i_b i_q} \tag{4}$$
$$\omega_a = \frac{v_2 - v_1}{B} = \frac{0.377\, r_z (n_2 - n_1)}{B\, i_c i_b i_q (2k_0 + 1)} \tag{5}$$
where $r_z$ is the radius of the drive wheel (m), $n_1$ and $n_2$ are the speeds of the drive motors on the two sides (rpm), $v_1$ and $v_2$ are the speeds of the two tracks, $B$ is the track center distance (m), $k_0$ is the parameter of the power coupling planetary row, $i_b$ is the transmission ratio of the transmission mechanism, $i_q$ is the front transmission ratio, and $i_c$ is the side transmission ratio.
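As a quick numerical illustration of Eqs. (4) and (5), the sketch below evaluates both formulas exactly as written. The factor 0.377 is approximately 2π · 60 / 1000, which converts a wheel radius in metres times a rotational speed in rpm into km/h. All parameter values in the example calls are hypothetical and chosen only to make the arithmetic easy to follow; they are not the vehicle data of the paper.

```python
def actual_vehicle_speed(n1, n2, r_z, i_c, i_b, i_q):
    """Eq. (4): vehicle speed from the two motor speeds (rpm)."""
    return 0.377 * r_z * (n1 + n2) / (2.0 * i_c * i_b * i_q)


def actual_steering_speed(n1, n2, r_z, B, i_c, i_b, i_q, k0):
    """Eq. (5): steering angular speed from the motor speed difference."""
    return 0.377 * r_z * (n2 - n1) / (B * i_c * i_b * i_q * (2.0 * k0 + 1.0))


# Hypothetical values: equal motor speeds give straight-line driving.
v_straight = actual_vehicle_speed(1000.0, 1000.0, r_z=0.3, i_c=4.0, i_b=2.0, i_q=1.0)
w_straight = actual_steering_speed(1000.0, 1000.0, r_z=0.3, B=2.5,
                                   i_c=4.0, i_b=2.0, i_q=1.0, k0=2.0)
# A speed difference between the two motors produces a steering angular speed.
w_turn = actual_steering_speed(900.0, 1100.0, r_z=0.3, B=2.5,
                               i_c=4.0, i_b=2.0, i_q=1.0, k0=2.0)
```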
2.2 Active Disturbance Rejection Control
Following existing research on the steering theory of tracked vehicles and taking the electric drive tracked vehicle as the research object, the steering dynamics equation is
$$J\dot{\omega} = \frac{B}{2 r_z} T_\omega - \frac{\mu_0 G L}{4}\left(1 - \frac{v_x^2 \omega^2}{\mu_0^2 g^2}\right) \tag{6}$$
where $\frac{B}{2 r_z} T_\omega$ is the steering driving torque, $\frac{\mu_0 G L}{4}\left(1 - \frac{v_x^2 \omega^2}{\mu_0^2 g^2}\right)$ is the steering resistance torque, $J$ is the moment of inertia of the vehicle (kg·m²), $G$ is the vehicle weight (kg), $L$ is the track ground length (m), $v_x$ is the longitudinal speed of the tracked vehicle (km/h), $\omega$ is the steering angular speed of the tracked vehicle (rad/s), and $g$ is the acceleration due to gravity (m/s²). Based on this vehicle dynamics model, the active disturbance rejection controller is designed for the system
$$\begin{cases} \dot{x}_1 = h(x_1) + bu + W(t) \\ y = x_1 \end{cases} \tag{7}$$
where $x_1$ is the actual steering angular speed (rad/s), $u$ is the torque required for steering (N·m), $b = \frac{B i_s}{2 r_z J}$, $i_s$ is the transmission ratio, $h(x_1) = -\frac{\mu_0 G L}{4J}\left(1 - \frac{v_x^2 \omega^2}{\mu_0^2 g^2}\right)$, and $W(t)$ is the unknown disturbance. Next, each part of the steering controller is briefly introduced following [19]; the detailed derivation is not repeated here.
2.2.1 Tracking Differentiators
The transfer function of the tracking differentiator can be approximated as
$$\omega(s) = \frac{r^2 s}{s^2 + 2rs + r^2} = \frac{r^2 s}{(s + r)^2} \tag{8}$$
The equivalent state-variable realization is
$$\begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = -r^2 (x_1 - v(t)) - 2 r x_2 \\ y = x_2 \end{cases} \tag{9}$$
Therefore, the discrete tracking differentiator is designed as
$$\begin{cases} fh = fhan(x_1(k) - v(k), x_2(k), r, h) \\ x_1(k+1) = x_1(k) + T x_2(k) \\ x_2(k+1) = x_2(k) + T \cdot fh \end{cases} \tag{10}$$
where $fhan(x_1, x_2, r, h)$ is the control synthesis function, $r$ is the speed factor, $h$ is the filter factor, and $T$ is the integration step.
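The discrete tracking differentiator of Eq. (10) can be sketched as below. The text does not spell out `fhan`, so the block uses one widely cited form of Han's time-optimal control synthesis function; this choice, the step input and the values r = 100 and h = T = 0.01 are assumptions for illustration and should be checked against [19] before reuse.

```python
import math


def fhan(x1, x2, r, h):
    """One commonly cited form of Han's control synthesis function."""
    d = r * h
    d0 = h * d
    y = x1 + h * x2
    a0 = math.sqrt(d * d + 8.0 * r * abs(y))
    if abs(y) > d0:
        a = x2 + 0.5 * (a0 - d) * math.copysign(1.0, y)
    else:
        a = x2 + y / h
    if abs(a) > d:
        return -r * math.copysign(1.0, a)
    return -r * a / d


def td_step(x1, x2, v, r, h, T):
    """Eq. (10): one update of the discrete tracking differentiator."""
    fh = fhan(x1 - v, x2, r, h)
    return x1 + T * x2, x2 + T * fh


# Track a unit step: x1 follows the input, x2 is its filtered derivative.
x1, x2 = 0.0, 0.0
for _ in range(300):
    x1, x2 = td_step(x1, x2, v=1.0, r=100.0, h=0.01, T=0.01)
```

In this arrangement x1 settles on the step value without overshoot and x2 returns to zero, which is the behaviour that makes the tracking differentiator useful for smoothing the reference steering angular speed.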
2.2.2 Feedback Steering Control
Based on the vehicle steering dynamics model (7), the control law is designed with the feedback linearization method. The tracking error of the controller and its derivative are
$$\begin{cases} e_{x1} = x_{1d} - x_1 \\ \dot{e}_{x1} = \dot{x}_{1d} - h(x_1) - bu - W(t) \end{cases} \tag{11}$$
where $x_1$ is required to follow $x_{1d} = \omega_0$ (the reference steering angular speed). The control law is therefore designed as
$$u = \frac{\dot{x}_{1d} - h(x_1) + K(x_{1d} - x_1) - W(t)}{b} \tag{12}$$
where $x_{1d}$, $x_1$, $b$ and $W(t)$ can be measured or estimated. Substituting (12) into (11) gives $\dot{e}_{x1} = -K e_{x1}$, so by Lyapunov's stability method the tracking error converges when $K > 0$.
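2.2.3 Nonlinear Extended State Observer
The nonlinear system model is as follows:
$$\begin{cases} \dot{x}_1 = x_2 + bu \\ \dot{x}_2 = \dot{h}(x_1) + \dot{W}(t) \\ y = x_1 \end{cases} \tag{13}$$
The ESO is partially improved, and the second-order nonlinear extended state observer is designed as
$$\begin{cases} e = z_1 - y \\ \dot{z}_1 = z_2 - \beta_1 e + bu \\ \dot{z}_2 = -\beta_2\, fal(e, \tfrac{1}{2}, \delta) \end{cases} \tag{14}$$
$$fal(e, \alpha, \delta) = \begin{cases} |e|^{1-\alpha}\, \mathrm{sign}(e), & |e| > \delta \\ e / \delta^{1-\alpha}, & |e| \le \delta \end{cases} \tag{15}$$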
where $\alpha$ is the reciprocal of the order of the ESO, and $\beta_1$, $\beta_2$ and $\delta$ are the relevant control parameters, whose value ranges depend on the control system.
2.3 Steering Control Relevant Parameters
The parameters of this steering control strategy are listed in Table 1.
Table 1 Relevant parameters
Parameter   Meaning
r           TD speed factor
h           TD filter factor
T           TD integration step
K           ADRC state tracking error gain
β1          ESO steering angle speed tracking error gain
β2          ESO perturbation observation gain
δ           ESO filter factor
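The observer of Eqs. (14) and (15) can be sketched with a forward-Euler update as follows. The `fal` function is coded as written in Eq. (15); with α = 1/2 it coincides with the more common form that uses the exponent α. The first-order test plant with b = 1, zero input and a constant disturbance of 1 is a hypothetical stand-in for the vehicle model, and the gains are the optimized values reported later in the paper (β1 = 126, β2 = 9, δ = 0.002).

```python
import math


def fal(e, alpha, delta):
    """Eq. (15): nonlinear gain, linear inside the band |e| <= delta."""
    if abs(e) > delta:
        return abs(e) ** (1.0 - alpha) * math.copysign(1.0, e)
    return e / delta ** (1.0 - alpha)


def eso_step(z1, z2, y, u, b, beta1, beta2, delta, dt):
    """Eq. (14): one forward-Euler update of the second-order ESO."""
    e = z1 - y
    dz1 = z2 - beta1 * e + b * u
    dz2 = -beta2 * fal(e, 0.5, delta)
    return z1 + dt * dz1, z2 + dt * dz2


# Hypothetical plant: x1' = w + b*u with constant, unknown disturbance w.
dt, b, w = 0.001, 1.0, 1.0
x1, z1, z2 = 0.0, 0.0, 0.0
for _ in range(6000):                 # 6 s of simulated time
    u = 0.0
    z1, z2 = eso_step(z1, z2, x1, u, b, 126.0, 9.0, 0.002, dt)
    x1 += dt * (w + b * u)
```

After the transient, z1 follows the measured output and z2 approaches the total disturbance, which is the quantity the controller uses for compensation.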
3 Optimization of Steering Control Parameters
The extended state observer is a key part of the steering control strategy. Its extended state observes the uncertain disturbance, and this estimate is fed back as real-time disturbance compensation, which removes the uncertainty of the controlled object and gives the closed loop a better control effect. The stability and working performance of the ESO directly affect the stability and control effect of ADRC. The ESO has several adjustable parameters, and its performance depends closely on how they are chosen. Therefore, to obtain a better control effect from the steering control strategy and to reduce the error introduced by manual parameter tuning, the particle swarm optimization algorithm is used to optimize the adjustable parameters of the ESO and obtain the ideal steering control parameters.
3.1 Principles of Particle Swarm Optimization Algorithms
Particle swarm optimization (PSO) is a branch of evolutionary computing and a random search algorithm that simulates biological activity in nature [20–22]. It simulates the foraging behavior of bird flocks and fish schools. Finding the
global optimal solution to a problem through group collaboration, it has been widely used in optimization problems in various engineering fields. The specific steps of the particle swarm optimization algorithm are as follows:
1. Initialize all particles, that is, assign values to their velocities and positions, set the historical best position pBest of each individual to its current position, and set gBest to the best individual position in the swarm;
2. In each iteration, calculate the fitness value of each particle (from the evaluation function defined by the chosen criterion);
3. If the current fitness value is less than the individual's historical best value, update pBest;
4. If the current fitness value is less than the global historical best value, update gBest;
5. Update the velocity and position of each particle i in each dimension d according to formulas (16) and (17), respectively;
6. Check whether the end condition is met (the number of iterations is reached or the accuracy requirement is satisfied); if not, go to step 2 and continue.
$$v_{i+1}^d = \omega \times v_i^d + c_1 \times rand_1^d \left(pBest_i^d - x_i^d\right) + c_2 \times rand_2^d \left(gBest^d - x_i^d\right) \tag{16}$$
$$x_{i+1}^d = x_i^d + v_i^d \tag{17}$$
where $\omega$ is the inertia weight, generally initialized to 1 and decreased linearly during the iterations; $c_1$ and $c_2$ are the acceleration coefficients (also known as learning factors); and $rand_1^d$ and $rand_2^d$ are two random numbers in the interval [0, 1]. The velocity of each particle is limited to the range $[v_{\min}, v_{\max}]$. In addition, the position update in formula (17) must remain legal: after each update, check whether the new position is still inside the problem space. If it is not, it must be corrected, typically by resetting it randomly or by clamping it to the boundary.
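The steps and update rules above can be sketched in a few lines of NumPy. This is a generic minimal PSO, not the authors' implementation: it applies the freshly updated velocity in the position step (the common convention), clamps velocities and positions to their limits, and decreases the inertia weight linearly. The function name `pso`, the sphere test function and the demo settings (30 particles, 100 iterations, c1 = c2 = 1.5, inertia from 0.9 to 0.4) are illustrative assumptions rather than the settings used in the paper.

```python
import numpy as np


def pso(objective, lower, upper, n_particles=30, n_iter=100,
        w_start=0.9, w_end=0.4, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` inside the box [lower, upper]."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    v_max = 0.2 * (upper - lower)                  # assumed velocity limit
    x = rng.uniform(lower, upper, size=(n_particles, dim))
    v = rng.uniform(-v_max, v_max, size=(n_particles, dim))
    p_best = x.copy()
    p_val = np.array([objective(p) for p in x])
    g_idx = int(p_val.argmin())
    g_best, g_val = p_best[g_idx].copy(), float(p_val[g_idx])
    history = [g_val]
    for it in range(n_iter):
        # Inertia weight decreases linearly over the iterations.
        w = w_start - (w_start - w_end) * it / max(n_iter - 1, 1)
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
        v = np.clip(v, -v_max, v_max)              # velocity limit
        x = np.clip(x + v, lower, upper)           # keep positions legal
        val = np.array([objective(p) for p in x])
        improved = val < p_val                     # update pBest
        p_best[improved] = x[improved]
        p_val[improved] = val[improved]
        g_idx = int(p_val.argmin())
        if p_val[g_idx] < g_val:                   # update gBest
            g_best, g_val = p_best[g_idx].copy(), float(p_val[g_idx])
        history.append(g_val)
    return g_best, g_val, history


# Demo on a three-dimensional sphere function (minimum 0 at the origin).
best_x, best_f, history = pso(lambda p: float(np.sum(p ** 2)),
                              [-5.0] * 3, [5.0] * 3)
```

In a parameter-tuning setting, `objective` would wrap a simulation run that returns the chosen performance criterion for a candidate parameter vector.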
3.2 Optimization of Steering Control Parameters
The steering control parameter optimization targets the three adjustable parameters of the ESO. The problem boundary is set according to their approximate ranges, and the parameters are optimized with the particle swarm optimization algorithm: according to the evaluation criterion, the swarm iterates until convergence and finally yields three parameter values that optimize the control effect of the steering control strategy of the tracked vehicle. The specific process is shown in Fig. 2. First, a certain number of particles are initialized. The position and velocity of each particle are three-dimensional, with the position coordinates corresponding to the three adjustable parameter values, and then all particles after initialization are
Fig. 2 Flow chart
compared to obtain the historical best position of each individual and the best position of the swarm. Each iteration updates the velocity and position of each particle according to formulas (16) and (17), after which each particle is evaluated to obtain its fitness value. The ITAE criterion (18) is used for the evaluation; it keeps the oscillation of the transient response small and has good selectivity for parameter optimization. By comparing the fitness values of the particles, the historical best position of each particle and the global best position of the swarm are updated. The algorithm then checks whether the iteration limit or the error accuracy requirement has been met. If so, it exits the iteration and returns the global best position of the swarm; otherwise it continues iterating.
$$f(t) = \int_0^t t \cdot |e(t)|\, dt \tag{18}$$
The parameters of this optimization are set as shown in Table 2.
Table 2 Relevant parameters
Parameter             Value or range
Iterations            50
Number of particles   20
Dimension             3
β1                    [30, 150]
β2                    [0, 50]
δ                     [0.001, 0.1]
ω                     [0.4, 1]
C1                    0.8
C2                    0.5
Particle velocity     [−10, 10]
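The ITAE criterion of Eq. (18) can be evaluated numerically from a sampled error signal, for example with the trapezoidal rule as in this sketch. The function name `itae` and the exponentially decaying test error are illustrative; in the optimization loop the error would be the difference between the reference and the simulated steering angular speed.

```python
import numpy as np


def itae(t, e):
    """Integral of time multiplied by absolute error (trapezoidal rule)."""
    t = np.asarray(t, dtype=float)
    e = np.asarray(e, dtype=float)
    integrand = t * np.abs(e)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t)))


# Check against a case with a closed form: e(t) = exp(-t) on [0, 10],
# for which the integral equals 1 - 11 * exp(-10).
t = np.linspace(0.0, 10.0, 10001)
value = itae(t, np.exp(-t))
exact = 1.0 - 11.0 * np.exp(-10.0)
```

Because the error is weighted by time, errors that persist late in the response are penalised more heavily than the unavoidable initial error, which is why the criterion favours fast settling with little oscillation.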
Fig. 3 Initial location map
4 Trial Testing
The parameter optimization is implemented in MATLAB/Simulink. The iterative process is written as MATLAB m-files. In each iteration, the updated parameter values are passed to the Simulink model of the tracked vehicle, a preset steering-following condition is simulated, and the resulting fitness value is returned to the MATLAB workspace, where the particle data are updated accordingly. The next iteration then starts, and this continues until the end condition is met, finally yielding the three parameter values that minimize the control error. Figure 3 shows the positions of all particles after initialization as a two-dimensional plot with β1 and β2 on the x and y axes. The initialized particles are randomly distributed and all lie within the allowed range, which ensures the randomness of this parameter optimization. Figures 4, 5, 6 and 7 show the results of the iterative optimization: Fig. 4 shows the best individual fitness value obtained after all particles are evaluated by the ITAE criterion in each iteration, and Figs. 5, 6 and 7 show the best parameter values updated after each iteration. The trends in these figures show that after 20 iterations, the values
Fig. 4 Optimal adaptation value iteration plot
Fig. 5 β1 Iteration diagram
Fig. 6 β2 Iteration diagram
of the three control parameters and the optimal individual adaptation values have converged, and a set of optimized ESO control parameters (β1 = 126, β2 = 9, δ = 0.002) is obtained.
Fig. 7 δ Iteration diagram
Fig. 8 Semi physical simulation platform
As shown in Fig. 8, the semi-physical simulation test is carried out on the Speedgoat platform. The host computer runs MATLAB R2018b, and the real-time machine is a Speedgoat Performance real-time target machine. The real-time machine takes the place of the controlled object: the tracked vehicle model and the driver model run on it in real time, while the steering angular velocity control strategy based on active disturbance rejection control is flashed into the controller. The results with the previous manually tuned parameters are compared with the results after parameter optimization to verify the effectiveness of the optimization. A multi-segment steering wheel angle signal is used to create a following condition for the given steering angular speed; the specific working condition is shown in Fig. 9. Comparing Figs. 10 and 11 shows that steering angular velocity following improves significantly after parameter optimization. The maximum rise time is shortened by 0.5 s and the maximum overshoot is reduced by 8%, indicating that the working performance of the ESO is improved by the parameter optimization and that the dynamic response of the steering control strategy is improved as well.
Fig. 9 Specific condition
Fig. 10 Steering angular velocity result plot (before optimization)
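Rise time and overshoot of a recorded following response can be extracted from sampled data as in the sketch below. The definitions used here (10% to 90% rise time, overshoot as a percentage of a positive target value) are common conventions and are assumptions, since the text does not state which definitions were applied; the function name `step_metrics` and both test signals are illustrative.

```python
import numpy as np


def step_metrics(t, y, target):
    """Return (rise_time, overshoot_percent) for a positive step of size target."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    reached_10 = np.nonzero(y >= 0.1 * target)[0]
    reached_90 = np.nonzero(y >= 0.9 * target)[0]
    if reached_10.size == 0 or reached_90.size == 0:
        raise ValueError("response never reaches 90% of the target")
    rise_time = float(t[reached_90[0]] - t[reached_10[0]])
    overshoot = max(0.0, float((y.max() - target) / target * 100.0))
    return rise_time, overshoot


t = np.linspace(0.0, 10.0, 10001)
# First-order response: no overshoot, rise time close to ln(9) time constants.
rise_first, over_first = step_metrics(t, 1.0 - np.exp(-t), 1.0)
# Lightly damped response: clear overshoot above the target.
rise_osc, over_osc = step_metrics(t, 1.0 - np.exp(-t) * np.cos(3.0 * t), 1.0)
```

Applying the same function to the responses before and after optimization gives directly comparable numbers for each segment of the steering condition.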
Fig. 11 Steering angular velocity result plot (after optimization)
Fig. 12 Diagram of the torque result of the drive motor (before optimization)
Comparing Figs. 12 and 13 shows that, after parameter optimization, the power output of the drive motors on both sides is transmitted to the drive wheels faster and more stably. The controller rejects disturbances from changing road conditions better and the fluctuation error range is smaller, so the tracked vehicle follows the target steering condition more closely.
Fig. 13 Diagram of the torque result of the drive motor (after optimization)
5 Conclusion
In this paper, the parameters of the steering control strategy of tracked vehicles are optimized with the particle swarm optimization algorithm, and the steering angular velocity following condition is verified by semi-physical simulation using MATLAB/Simulink and the Speedgoat semi-physical simulation platform. The parameter optimization algorithm improves the working performance of the ESO in ADRC, improves the adaptability of the steering angular velocity control of the tracked vehicle, lets the drive motors on both sides output the required power more accurately, and achieves better steering dynamic performance, which demonstrates the feasibility and benefit of the steering control parameter optimization algorithm.
References 1. Chen, Z.Y., Zhao G.Y., Zhai, L.: Pivot steering control and D2P real-time simulation for electric tracked vehicle. J. Northeast. Univ. (Nat. Sci.) 34(1), 114–115 (2013) 2. Gai, J.T., Huang, S.D., Zhou, G.M.: Adaptive sliding mode steering control of double motor coupling drive transmission for tracked vehicle. Acta Armamentarii 36(3), 405–411 (2015) 3. Zeng, Q.H., Ma, X.J., Liao, Z.L.: Stable steer control of electric drive tracked vehicle based on equivalent sliding mode technique with conditional integrator. Acta Armamentarii 37(8), 1351–1358 (2016) 4. Ma, X.J., Zeng, Q.H., Wei, S.H.G.: Simulation research of steering control strategy for electric drive tracked vehicles. J. Acad. Armored Force Eng. 16(6), 41–45 (2012) 5. Zhang, L.H., Han, K.: Research on coordinated control simulation of a tracked vehicle with low speed steering. Veh. Power Tech. 2, 6–11 (2015) 6. Hou, X.Z.: Research on control strategy for dynamic steering process of tracked vehicles driven by dual motor coupling. Beijing Inst, Tech (2018) 7. Han, J.Q.: From PID to active disturbance rejection control. IEEE Trans. Ind. Electron. 56(3), 900–906 (2009) 8. Han, J.Q.: Active disturbances rejection control technology. Front. Sci. 01, 24–31 (2007) 9. Han, J.Q.: Nonlinear functions derived from symbolic functions and absolute value functions. Cont. Eng. S2, 4–6 (2008) 10. Qiu X,B., Dou, L.H., Han, J.Q., Zhou Q.H.: Application of ADRC in state estimation of tank mobile targets. Acta Armamentarii 30(7), 989–993 (2009) 11. Yang, X., Huang, Q., Jing, S., Zhang, M., Wang, S.X.: Servo system control of satcom on the move based on improved ADRC controller. Energy Rep. 8(S5) (2022) 12. Zhang, M.Y., Li, Q.D.: A compound scheme based on improved ADRC and nonlinear compensation for electromechanical actuator. Actuators 11(3) (2022)
13. Wu, Z.L., Shi, G.J., Li, D.H., Liu, Y.H., Chen, Y.Q.: Active disturbance rejection control design for high-order integral systems. ISA Trans. (2021) 14. Li, Y.G., Jiao, P.P., Qiao, W.D.: Prediction of steering behaviors on curves based on BP neural network optimized by modified PSO. J. High. Transp. Res. Dev. 36(10), 128–136 (2019) 15. Liu, Y.W., Tang, L.P., Wang, Y.T.: Particle swarm optimized fuzzy PID for quadrotor control. Autom. Instr. 37(8), 57–61 (2022) 16. Zhang, W., Xie, Y.H., Wang, Y.G.: A deviation particle swarm optimization algorithm based on convergence analysis and its application on PID tuning. Cont. Eng. China 28(7), 1466–1473 (2021) 17. Hu, W.Q., Yang, X.J., Dong Y.Q., Li, Y.H.: Research on parameter optimization method of electrohydraulic executive mechanism controller based on particle swarm optimization algorithm. J. Henan Univ. Tech. (Nat. Sci. Edn.) 42(2), 114–119 (2023) 18. Li, H.H., Liu, H., Gai, J.T., Li, X.M.: Research on steering control of tracked vehicledriven by dual motor coupling based on particle swarm optimization PID parameter optimization. Acta Armamentarii, pp. 1–8 (2023) 19. Wang, Y.F.: Steering control strategy of tracked vehicle based on active disturbances rejection control. China Soc. Automot, Eng (2022) 20. Van Der Merwe, D.W., Engelbrecht, A.P.: Data clustering using particle swarm optimization. In: Proceedings of the 2003 Congress on Evolutionary Computation, Canberra, Australia: ACT, no. 1, pp. 215–220 (2003) 21. Ratnaweer, A., Halgamuge, S.K., Watson, H.C.: Self-Organizing Hierarchical Particle Swarm Optimizer with Time-Varying Acceleration Coefficients. IEEE Press, New York, NY, US (2004) 22. Moharam, A, EL-Hosseini, M.A., Ali, H.A.: Design of optimal PID controller using hybrid differential evolution and particle swarm optimization with an aging leader and challengers. Appl. Soft Comput. 38, 727–737 (2016)
Design of Planar Torsion Spring with High Linearity for Series Elastic Actuator Yuqiao Cheng, Xiubo Xia, Yongling Fu, Jian Sun, and Pu Zhang
Abstract The active compliance control of a series elastic actuator relies on a torque sensor composed of a torsion spring and a magnetic encoder, and its accuracy depends on the linearity of the torsion spring. In this paper, a planar torsion spring with high linearity and small volume is designed for the series elastic actuator of a robot. The design takes high linearity as the optimization objective while meeting the stiffness and bearing capacity requirements. The finite element analysis results show that the structural nonlinear error of the planar torsion spring designed in this study is only 0.04%, which is more than 90% lower than that of a similar torsion spring. The mass is about 67 g, the stiffness is 1039 N m/rad, and the rated carrying capacity is 55 N m.
Keywords Series elastic actuator · SEA · Planar torsion spring · Topology optimization · Torque sensor
Y. Cheng · X. Xia · Y. Fu · J. Sun (B) · P. Zhang
Beihang University, 37 Xueyuan Road, Haidian District, Beijing, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_39

1 Introduction

1.1 Background

Traditional robot systems, mostly composed of rigid actuators, are widely used in tasks requiring high position accuracy, stability and high torque bandwidth. However, rigid actuators are not compliant and are therefore unsafe and uncomfortable for tasks that require physical human-robot interaction (PHRI), such as medical rehabilitation, service and teleoperated robots. Robotic systems for PHRI usually use flexible actuators for adaptability and safety. The compliance of actuators is divided into active compliance and passive compliance. Passive compliance is achieved by means of an elastic element between the motor and the end. Active compliance is based on torque control, which is generally achieved by impedance control and requires torque feedback. There are three main ways of obtaining torque feedback: measuring the motor current, measuring the strain gauge resistance and measuring the deformation of an elastic element.

In recent years, series elastic actuators (SEA) have been widely used in robotics. In an SEA, an elastic element is connected in series with the source of mechanical energy [1]. The deformation of the elastic element is measured to obtain the torque, so that the SEA has both passive and active compliance. SEA has been shown to be successful in multiple robotics applications, including wearable, bipedal, quadruped and two-armed robots [2]. The main challenge facing SEA technology is that its performance relies on the torsion spring. The stiffness, bearing capacity, linearity and weight of the torsion spring all affect the performance of the SEA. Therefore, it is very important for the development of SEA technology to study lightweight planar torsion springs with designable stiffness, strong bearing capacity and high linearity.
1.2 Research Status of Planar Torsion Spring

The planar torsion spring is a special torsion spring. Unlike the traditional torsion spring, its input and output structures lie in one plane. Because of its smaller axial size, the planar torsion spring is suitable as the elastic element of a series elastic actuator. There are two main design methods for planar torsion springs. The first is to assemble linear springs with a special inner and outer ring structure. The stiffness of a planar torsion spring obtained by this method is small and the linearity is poor. Tsagarakis et al. [3] designed a planar torsion spring of the first type by using linear springs, as shown in Fig. 1a. When the deflection angle of this planar torsion spring increases from 0 to 0.17 rad, the stiffness decreases by about 6%. Such a nonlinear planar torsion spring is not suitable for high-precision torque measurement. The second method is to obtain a single main structure through a special topological design. The performance of this kind of planar torsion spring is related to its shape, and it usually has better linearity and greater torsional stiffness than the first kind. Considerable effort has been made to design second-type planar torsion spring modules for specific application requirements. Yoo et al. [4] designed a planar torsion spring similar to a coiled spring structure, as shown in Fig. 1b. In that study, a cubic polynomial was selected to fit the relationship between the torque and the angle of the planar torsion spring, and the correlation coefficient $R^2$ of the fitted curve was 0.9994. Chen et al. [5] designed a corrugated planar torsion spring, as shown in Fig. 1c.
That study compared the calculated results with the experimental results and found that, owing to geometrical nonlinearity and the heat treatment of the metal material, the torsional stiffness obtained by numerical calculation differs from the experimental result, and the linearity of the sample is poor. Al-Dahiree et al. [6] designed a low-cost planar torsion spring by using a manufacturability-oriented design method, as shown in Fig. 1d; its structure refers to the planar torsion spring adopted by the NASA-JSC Valkyrie humanoid robot [7]. In that study, the relationship between the torque and the angle of the planar torsion spring was obtained by simulation and experiment. The $R^2$ values obtained by linear regression for the simulation and the unidirectional loading experiment were 0.9997 and 0.9987, respectively, indicating that this structure has good linearity. Dos Santos et al. [8] designed a planar torsion spring for an exoskeleton, as shown in Fig. 1e. The spring stiffness determined experimentally in that study is about 50% lower than that obtained by finite element analysis, but the torque and angle show good linearity, apart from the clearance caused by the worm gear pair. Leal-Junior et al. [9] designed a torque sensor based on the above planar torsion spring, with a measurement error of less than 5%. Irmscher et al. [10] used an optimization method based on finite element analysis to optimize the planar torsion spring designed by dos Santos et al., and obtained simulation results with a stiffness error of +1.9% relative to the target and approximate linearity. Two samples were produced by laser cutting. However, the experiments on the two samples show that the stiffness of the manufactured planar torsion springs deviates considerably from the target value, by −11.8% and −17.8% respectively, and that the stiffness changes noticeably during loading.

Fig. 1 Torsion spring for series elastic actuator [3–10]

To sum up, the design results for planar torsion springs, especially those aiming at high linearity, are not ideal and cannot meet the high-precision force control requirements of SEA. The design method of the planar torsion spring is therefore still immature and needs further exploration and practice.
1.3 Design Objectives

The purpose of this study is to design a planar torsion spring with high linearity, strong bearing capacity and light weight for SEAs with high precision requirements.

2 Design Requirements and Indicators

2.1 Integrated Design

The planar torsion spring in this study will be integrated in the series elastic actuator used for a two-sided force feedback telemanipulator, so it is necessary to consider how it integrates with the other SEA components. The overall structure of the SEA involved in this study is shown in Fig. 2.
Fig. 2 Abstract physical model of SEA
Table 1 Design requirements for torsion springs

Design parameter            Demand value
Torsional stiffness         1000 N m/rad
Allowable torque            55 N m
Maximum thickness           10 mm
Maximum outside diameter    65 mm
Minimal lumen diameter      12 mm
Material                    Titanium alloy TC4
2.2 Selection of Spring Stiffness

The selection of spring stiffness should take into account the requirements of the SEA for buffering performance, force measurement accuracy and control bandwidth. A lower spring stiffness improves both the resolution of the force measurement and the buffering performance. However, too low a stiffness reduces the force control bandwidth of the SEA. Therefore, the selected spring stiffness must balance these three indexes. This trade-off was evaluated using the multibody dynamics simulation software ADAMS and the mechanical and control system simulation software Amesim. The resulting design indexes are listed in Table 1.
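The link between stiffness and force measurement resolution can be illustrated with a short calculation. The sketch below assumes that the spring deflection is read by a 19-bit encoder (the resolution mentioned in Sect. 3.1) and that the spring is ideally linear; the function name and the stiffness values other than 1000 N m/rad are illustrative only.

```python
import math


def torque_resolution(stiffness_nm_per_rad: float, encoder_bits: int) -> float:
    """Smallest torque step (N m) resolvable from the spring deflection.

    Assumes an ideal linear spring, so one encoder count of deflection
    corresponds to stiffness * angle_step.
    """
    angle_step_rad = 2.0 * math.pi / (2 ** encoder_bits)  # one encoder count
    return stiffness_nm_per_rad * angle_step_rad


# Illustrative stiffness values; 1000 N m/rad is the target in Table 1.
for k in (500.0, 1000.0, 2000.0):
    step = torque_resolution(k, 19)
    print(f"K = {k:6.0f} N m/rad -> torque step = {step:.5f} N m")
```

A softer spring gives a smaller torque step per encoder count, which is the resolution benefit described above; the bandwidth penalty is not modelled here.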
3 Design Method of Planar Torsion Spring

The design of the planar torsion spring mainly includes two stages. The first stage is to determine a topology with specific parameters that meets the design and design-for-manufacturability (DFM) requirements. The second stage is to optimize the spring design parameters for the selected topology by finite-element-based geometric optimization, so as to minimize the equivalent stress and achieve the desired spring stiffness while ensuring a lightweight design. In this research, building on the existing design methods, a symmetric design idea is proposed and topology optimization is applied in the design process, which further improves the linearity and torque density of the planar torsion spring. The design flow chart of the improved planar torsion spring is shown in Fig. 3.

3.1 Selection and Improvement of the Suitable Topology

Starting from the planar torsion spring topologies designed by Al-Dahiree et al. [6] and dos Santos et al. [8], the topology with better linearity was selected according to the simulation results, and a symmetry-improved design was then carried out.
Fig. 3 Design method of planar torsion spring for SEA
Mathematical derivation is used to verify that the symmetric structure has better linearity than the original topology in the case of small deformation. Based on structural nonlinear theory, assume that the stiffness as a function of the angle is $K(\theta)$, which is twice differentiable. When the angle is very small, according to Taylor's formula, the stiffness of the asymmetric structure at angle $\theta$ can be expressed as:

$$K(\theta) = K(0) + K'(0)\,\theta + \frac{1}{2} K''(0)\,\theta^2 \tag{1}$$

In contrast, the stiffness of the symmetric structure at angle $\theta$ can be expressed as:

$$\frac{K(\theta) + K(-\theta)}{2} = K(0) + \frac{1}{2} K''(0)\,\theta^2 \tag{2}$$

Compared with $K(\theta)$, the first-order term $K'(0)\,\theta$ is eliminated in $\frac{K(\theta) + K(-\theta)}{2}$, because the odd-order terms of $K(\theta)$ and $K(-\theta)$ cancel. Therefore, the stiffness change of the new structure with improved symmetry is obviously less than that of the asymmetric structure under small deformation, which means that the structure with improved symmetry has better linearity.

Fig. 4 Topologies to be selected. a Topology from Ref. [6]. b Topology from Ref. [8]. c Topology redesigned based on symmetry

Finite element analysis was used to verify the linearity of the three structures shown in Fig. 4. The topologies of Fig. 4a and b refer to the research results of Al-Dahiree et al. [6] and dos Santos et al. [8], respectively, and Fig. 4c shows the topology obtained by the symmetry improvement of Fig. 4b. A multi-linear isotropic hardening material model of TC4 is adopted for the finite element analysis, which simulates the nonlinear characteristics of the material more accurately. The load changes uniformly from 20 N m counterclockwise to 20 N m clockwise. The torque-angle data of the three topologies in the simulation results and the angle errors after linear fitting are shown in Table 2.

Table 2 Torque-angle data and angle error value after linear fitting

Torque/N m   −20       −15       −10       −5        0         5         10        15        20
Angle a/°    −2.6593   −1.9856   −1.3180   −0.6561   3E−5      0.6509    1.2966    1.9375    2.5738
Angle b/°    −1.9993   −1.4988   −0.9988   −0.4992   2E−5      0.4987    0.9968    1.4944    1.9914
Angle c/°    −1.8171   −1.3630   −0.9088   −0.4544   2E−5      0.4544    0.9088    1.3631    1.8172
Error a/°    0.0257    0.0059    −0.0078   −0.0156   −0.0178   −0.0147   −0.0065   0.0066    0.0243
Error b/°    0.0022    0.0006    −0.0006   −0.0013   −0.0017   −0.0015   −0.0008   0.0006    0.0024
Error c/°    −0.0002   0.0001    0.0001    0.0001    2E−05     −0.0001   −0.0001   −0.0001   0.0001

The maximum nonlinear angle errors are 0.0257°, 0.0024° and 0.0002°, and the linearity is 0.485%, 0.06% and 0.005%, respectively. The nonlinear error of the optimized topology c is more than 90% lower than that of topology b before optimization, and 99% lower than that of topology a. The maximum angle error after optimization is only 0.0002°, which is smaller than the resolution of a commonly used 19-bit magnetic encoder. Therefore, the nonlinear error is no longer the bottleneck of torque measurement accuracy after optimization.
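The error rows of Table 2 can be approximately reproduced with an ordinary least-squares line. The sketch below is a minimal check, assuming that the tabulated error is the fitted angle minus the simulated angle; this convention and the function names are assumptions for illustration, not the authors' stated procedure. It also compares the result with the angular step of a 19-bit encoder.

```python
def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def max_fit_error(torque, angle):
    """Largest absolute deviation of the data from the fitted line (degrees)."""
    slope, intercept = linear_fit(torque, angle)
    return max(abs(slope * t + intercept - a) for t, a in zip(torque, angle))


# Torque (N m) and simulated angles (degrees) copied from Table 2.
torque = [-20, -15, -10, -5, 0, 5, 10, 15, 20]
angle_a = [-2.6593, -1.9856, -1.3180, -0.6561, 3e-5, 0.6509, 1.2966, 1.9375, 2.5738]
angle_c = [-1.8171, -1.3630, -0.9088, -0.4544, 2e-5, 0.4544, 0.9088, 1.3631, 1.8172]

err_a = max_fit_error(torque, angle_a)  # topology a, asymmetric
err_c = max_fit_error(torque, angle_c)  # topology c, symmetric
encoder_step_deg = 360.0 / 2 ** 19      # angular step of a 19-bit encoder

print(f"max error a: {err_a:.4f} deg, max error c: {err_c:.4f} deg")
print(f"19-bit encoder step: {encoder_step_deg:.5f} deg")
```

With the rounded values from Table 2, the maximum error of topology c stays below one count of a 19-bit encoder, in line with the statement above.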
Fig. 5 The topology optimization flow chart
3.2 Topology Optimization and Parametric Design

Unlike previous studies, this study uses topology optimization to search for shapes with higher load capacity before parameterization. The topology optimization flow chart is shown in Fig. 5. Following this flow chart, the structure of Fig. 4c was strengthened, and the torque load was then set to 55 N m. The optimization objectives are minimum mass and minimum stress, with optimization weights of 0.1 and 0.9, respectively. According to the target stiffness, the angle of the planar torsion spring is constrained between 3° and 3.3°. The stiffness penalty coefficient is gradually increased from 3 to 5 to facilitate convergence. In addition, since topology optimization can currently be performed only for linear analysis, the material properties need to be set to the linear elastic model and the large deflection option needs to be turned off. To improve the calculation speed, the optimization is carried out in a two-dimensional plane. After the optimization converges, the stress contour plot (nephogram) is shown in Fig. 6.
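The angle constraint can be related to the target stiffness by a short calculation. The sketch below assumes an ideal linear spring, so that stiffness equals torque divided by deflection; the function name is illustrative.

```python
import math


def stiffness_from_angle(torque_nm: float, angle_deg: float) -> float:
    """Secant stiffness (N m/rad) of a spring deflected by angle_deg under torque_nm."""
    return torque_nm / math.radians(angle_deg)


rated_torque = 55.0        # N m, from Table 1
target_stiffness = 1000.0  # N m/rad, from Table 1

# Deflection of an ideal 1000 N m/rad spring at the rated torque.
target_angle_deg = math.degrees(rated_torque / target_stiffness)

# Stiffness range implied by the 3 deg to 3.3 deg angle constraint.
k_upper = stiffness_from_angle(rated_torque, 3.0)
k_lower = stiffness_from_angle(rated_torque, 3.3)

print(f"target deflection: {target_angle_deg:.3f} deg")
print(f"stiffness window: {k_lower:.1f} to {k_upper:.1f} N m/rad")
```

Under this assumption the target deflection of about 3.15° lies inside the constraint, and the constraint corresponds to a stiffness window of roughly 955 to 1050 N m/rad around the 1000 N m/rad target.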
Fig. 6 Stress nephogram after topology optimization
3.3 Design Method Based on Finite Element Method

Parametric modeling was carried out according to the topology optimization results. Because of the symmetry, the structure of the whole planar torsion spring can be represented by a parametric model of one sixth of the spring, as shown in Fig. 7. The outer diameter is 60 mm and the inner diameter is 15 mm. The 13 key parameters from R1 to L7 determine the shape and performance of the planar torsion spring. In the design of the torsion spring, the stress distribution and deformation of the selected topology are determined by the finite element method. ANSYS Workbench 2023R1 was used for nonlinear static structural analysis to ensure that the highest equivalent stress at 55 N m torque was lower than the yield strength of the material (750 MPa with a safety factor of 1.1). The finite element simulation is part of the iterative design optimization process and aims to find the optimal values of the parameters in the search space, reducing weight and size while maintaining the desired spring characteristics. In the meshing phase, a finer mesh is used around the lamellae and the holes, while a coarser mesh is used for the outer ring, where lower stresses are expected. The mesh has 355,009 nodes and 223,473 elements, as shown in Fig. 8. The finite element method was used to verify the design, and the design parameters of the selected topology were verified and optimized in the search space shown in Table 3. In ANSYS Workbench, the optimized structure was simulated several times by finite element analysis to evaluate its performance at 55 N m torque. The results show that the stress concentration is located on the surface of the beam whose sector is close to the axis, and that it decreases as parameter L7 increases. Through iterative design, a uniform stress distribution and the optimized structure are obtained. The optimal geometric parameters are shown in Table 3.
Fig. 7 Parameter model of one-sixth model
Fig. 8 Meshed model
The thickness of the planar torsion spring is set to 8 mm to reduce the size as much as the allowable stress permits. The finite element simulation results for the initial and optimal parameters, including the Von Mises stress and the calculated torsional stiffness, are shown in Figs. 9 and 10. After optimization, the error of the stiffness of the planar torsion spring relative to the target stiffness of 1000 N m/rad is reduced from 54.9% to 3.9%. The maximum stress is reduced from 833 MPa to 729 MPa. Based on the simulation data, the linearity obtained by linear fitting is 0.039%. The performance of the final optimization result is compared with that of the latest reference [6] in Table 4.
Table 3 Parameter optimization result

Parameters   Initial   Minimum   Maximum   Optimal
R1           1.0       0.5       1.5       1.0
R2           3.0       1.5       3.0       2.7
R3           1.5       1.0       2.0       2.0
R4           3.5       2.0       4.0       3.7
R5           19.0      16.0      22.0      20.0
R6           19.0      16.0      22.0      19.0
L1           16.0      14.0      18.0      15.7
L2           8.0       6.0       9.0       8.8
L3           3.0       5.0       7.0       3.5
L4           5.0       8.0       12.0      6.4
L5           1.0       0.7       1.5       0.7
L6           0         0         1.0       0.2
L7           0         0         2.0       0.5

Fig. 9 Von Mises stress

Fig. 10 Torsional stiffness
Table 4 Performance comparison table with the latest research results

Design result                                 Ref. [6]   This research
Thickness /mm                                 8          8
Outer diameter /mm                            85         65
Weight /g                                     140        67
Linearity obtained by nonlinear simulation    0.48%      0.04%
Designed maximum load /N m                    45.7       55
Torque density /(N m/g)                       0.33       0.82
Maximum strain energy density /(J/kg)         33.14      43.43
Stiffness /(N m/rad)                          450        1039
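The derived rows of Table 4 can be cross-checked from the load, mass and stiffness rows. In the sketch below, torque density is taken as maximum load divided by mass. The energy density expression T²/(K·m) is an assumption inferred from the tabulated numbers, not a formula stated in the text; the elastic energy stored in an ideal linear spring at load T would be T²/(2K), i.e. half of this value.

```python
def torque_density(max_torque_nm: float, mass_g: float) -> float:
    """Maximum load per unit mass, in N m/g."""
    return max_torque_nm / mass_g


def energy_density(max_torque_nm: float, stiffness_nm_per_rad: float, mass_g: float) -> float:
    """T^2 / (K * m) in J/kg; assumed form that matches the Table 4 values."""
    return max_torque_nm ** 2 / stiffness_nm_per_rad / (mass_g / 1000.0)


# Values copied from Table 4.
ref = {"torque": 45.7, "mass": 140.0, "stiffness": 450.0}    # Ref. [6]
this = {"torque": 55.0, "mass": 67.0, "stiffness": 1039.0}   # this research

td_ref = torque_density(ref["torque"], ref["mass"])
td_this = torque_density(this["torque"], this["mass"])
ed_ref = energy_density(ref["torque"], ref["stiffness"], ref["mass"])
ed_this = energy_density(this["torque"], this["stiffness"], this["mass"])
td_increase = td_this / td_ref - 1.0

print(f"torque density: {td_ref:.2f} -> {td_this:.2f} N m/g (+{td_increase:.0%})")
print(f"energy density: {ed_ref:.2f} -> {ed_this:.2f} J/kg")
```

The computed torque densities round to the tabulated 0.33 and 0.82 N m/g, and their ratio corresponds to an increase of roughly 150%.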
4 Manufacture of Flat Torsion Spring

The wire cutting process can cut irregular shapes, curved shapes and internal holes with good cut-edge quality and a machining accuracy of up to 0.03 mm, so it was chosen to machine the spring prototype. A TC4 plate was used to produce the optimized planar torsion spring, which has a mass of 67 g, as shown in Fig. 11.
Fig. 11 Prototype of flat torsion spring
5 Conclusion and Prospect

In this study, a symmetric topology is first proposed and shown to have better linearity than the asymmetric topology. Starting from the proposed symmetric structure, the stress and mass are optimized by topology optimization, and the shape and geometric parameters of the planar torsion spring are determined. Finally, the geometric parameters are optimized by the finite element method, and a prototype of the optimized planar torsion spring is made by wire cutting. According to the nonlinear finite element simulation, the stiffness of the proposed planar torsion spring is 1039 N m/rad, the stress is 729 MPa at a torque of 55 N m, and the linearity error is only 0.04%. Compared with the latest research results, the nonlinear error is reduced by more than 90% and the torque density is increased by 152%. It is worth noting that the topology optimization method adopted in this study yields a planar torsion spring structure with a very high energy density. Unfortunately, the results of topology optimization are not directly manufacturable, so parameterized design is also needed. Therefore, topology optimization for manufacturable high-energy-density planar torsion springs is a potential research direction.
References

1. Pratt, G.A., Williamson, M.M.: Series elastic actuators. In: Proceedings 1995 IEEE/RSJ International Conference on Intelligent Robots and Systems. Human Robot Interaction and Cooperative Robots, vol. 1, pp. 399–406 (1995)
2. Stienen, A.H.A., et al.: Design of a rotational hydroelastic actuator for a powered exoskeleton for upper limb rehabilitation. IEEE Trans. Biomed. Eng. 57(3), 728–735 (2010). https://doi.org/10.1109/TBME.2009.2018628
3. Tsagarakis, N., Laffranchi, M., Vanderborght, B., Caldwell, D.: A compact soft actuator unit for small scale human friendly robots. IEEE Int. Conf. Robot. Automat. 4356–4362 (2009). https://doi.org/10.1109/ROBOT.2009.5152496
4. Yoo, S., et al.: Development of rotary hydro-elastic actuator with robust internal-loop-compensator-based torque control and cross-parallel connection spring. Mechatronics 43, 112–123 (2017). https://doi.org/10.1016/j.mechatronics.2017.03.003
5. Chen, Y., et al.: Novel torsional spring with corrugated flexible units for series elastic actuators for cooperative robots. J. Mech. Sci. Tech. 36(6), 3131–3142 (2022). https://doi.org/10.1007/s12206-022-0544-5
6. Al-Dahiree, O.S., et al.: Design and characterization of a low-cost and efficient torsional spring for ES-RSEA. Sensors 23(7), 3705 (2023). https://doi.org/10.3390/s23073705
7. Paine, N., et al.: Actuator control for the NASA-JSC valkyrie humanoid robot: a decoupled dynamics approach for torque control of series elastic robots. J. Field Robot. 32(3), 378–396 (2015). https://doi.org/10.1002/rob.21556
8. dos Santos, W.M., Caurin, G.A.P., Siqueira, A.A.G.: Design and control of an active knee orthosis driven by a rotary series elastic actuator. Control Eng. Pract. 58, 307–318 (2017). https://doi.org/10.1016/j.conengprac.2015.09.008
9. Leal-Junior, A.G., et al.: Polymer optical fiber for angle and torque measurements of a series elastic actuator's spring. J. Lightwave Tech. 36(9), 1698–1705 (2018). https://doi.org/10.1109/JLT.2017.2789192
10. Irmscher, C., et al.: Design, optimisation and testing of a compact, inexpensive elastic element for series elastic actuators. Med. Eng. Phys. 52, 84–89 (2018). https://doi.org/10.1016/j.medengphy.2017.12.004
Fault Diagnosis Method of Rolling Bearings Via Wavelet-Stacked Feature Extraction Na Wang, Yue Lei Cui, Liang Luo, and Zi Cong Wang
Abstract A fault diagnosis method for rolling bearings based on Wavelet-Stacked feature extraction is proposed. Firstly, the time domain features are obtained from statistical formulas, and the time-frequency domain features are obtained by the Wavelet Packet Transform. These two groups of features are combined to form the initial high-dimensional feature set. Secondly, the Stacked Autoencoder is adopted to reduce the initial feature set, so that simplified fault features are obtained. Thirdly, the Fuzzy c-Means is used to cluster the training data, which yields the fault class labels. Finally, the k-Nearest Neighbor is applied to diagnose the testing data. Simulation on rolling bearing data shows that the method is simple and achieves higher diagnostic accuracy than the traditional time-frequency domain method and the Stacked Autoencoder alone.
Keywords Fault diagnosis · Rolling bearing · Feature extraction · Stacked autoencoder
N. Wang (B) · Y. L. Cui · L. Luo · Z. C. Wang
Tiangong University, Tianjin 300387, China
e-mail: [email protected]

N. Wang
Tianjin Key Laboratory of Intelligent Control of Electrical Equipment, Tianjin 300387, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_40

1 Introduction

Rolling bearings are essential components of rotating machinery, and hidden faults in them can cause irreparable economic losses and safety problems. Fault diagnosis of rolling bearings is therefore important [1]. The vibration signal of the rotor, the core component of rotating machinery, contains important information. However, when the extracted features are not adequate, some important fault features may be missed and the diagnostic accuracy decreases. To address this problem, the initial features are constructed first, and dimensionality reduction is then implemented by methods such as Principal Component Analysis (PCA), Manifold Learning (ML) and Deep Learning (DL). For PCA, the final dimension depends on a preset threshold, which is subjective, and for nonlinear high-dimensional data some useful information may be lost [2]. Compared with PCA, ML extracts local features, but it is less effective for global features [3]. For DL, no manually set threshold is necessary, and the diagnosis accuracy can be improved by automatic end-to-end feature learning [4]. As a type of unsupervised learning, the Autoencoder (AE) preserves as much of the original information as possible while encoding the input variables to reduce the dimension. Because a single hidden layer limits the capability of the AE, the Stacked Autoencoder (SAE) was developed, in which multiple autoencoders are stacked in series so that simpler features are obtained.

In this paper, a fault diagnosis method for rolling bearings based on Wavelet-Stacked feature extraction is proposed. Firstly, in order to extract the data features fully, time domain and time-frequency domain features are extracted from the vibration signal to form the high-dimensional features. Then SAE is introduced to reduce the redundancy. Next, the Fuzzy c-Means (FCM) is used to cluster the training data, and the corresponding labels are added. Finally, the k-Nearest Neighbor (KNN) algorithm is used to diagnose the testing data and determine the final fault types.
2 Construction of Joint Features via Time and Time-Frequency Domain

2.1 Construction of Time Domain Features

In this paper, the time domain features are first extracted from the original vibration signal. For convenience, fifteen time domain features, denoted $P_i^t$ $(i = 1, 2, \ldots, 15)$, are adopted, as shown in Table 1.

Table 1 Time domain features of rolling bearings

Num   Feature                  Num   Feature
1     Mean value               9     Skewness
2     Absolute mean value      10    Kurtosis
3     Variance                 11    Kurtosis indicator
4     Standard deviation       12    Peak indicator
5     Maximum value            13    Pulse indicator
6     Minimum value            14    Margin indicator
7     Peak-to-peak value       15    Waveform indicator
8     Mean square amplitude

Three main features in Table 1 are described as follows.

Maximum value $X_{max}$:

$$X_{max} = \max\{|x_i|\} \tag{1}$$

Root mean square $X_{rms}$:

$$X_{rms} = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2} \tag{2}$$

Peak indicator $C$:

$$C = \frac{X_{max}}{X_{rms}} \tag{3}$$

where $x_i$ $(i = 1, 2, \ldots, N)$ are the discrete data points. As formulas (1) to (3) show, the above time domain features are simple and intuitive. However, they reflect only the general periodic variation of the bearing signal, so it is difficult to diagnose faults precisely from them alone.
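Formulas (1) to (3) translate directly into code. The following is a minimal sketch; the function name and the synthetic sine test signal are illustrative and are not taken from the paper's data set. The window length of 1200 samples follows Step 1 of Sect. 3.3.

```python
import math


def time_features(x):
    """Maximum value, root mean square and peak indicator of one window."""
    n = len(x)
    x_max = max(abs(v) for v in x)              # formula (1)
    x_rms = math.sqrt(sum(v * v for v in x) / n)  # formula (2)
    return {"max": x_max, "rms": x_rms, "peak_indicator": x_max / x_rms}  # formula (3)


# Synthetic window: 12 full periods of a unit sine, 1200 samples.
signal = [math.sin(2.0 * math.pi * k / 100.0) for k in range(1200)]
feats = time_features(signal)
print(feats)
```

For a pure sine the peak indicator equals the square root of 2; impulsive fault signatures raise the maximum much more than the root mean square and therefore increase this ratio.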
2.2 Construction of Time-Frequency Domain Features via Wavelet Packet Transform

In this paper, the Wavelet Packet Transform (WPT) is used to extract the time-frequency domain features [5]. The signal $x_1^0$ can be decomposed into 16 frequency bands by the four-layer WPT as follows:

$$x_1^0 = x_4^1 + x_4^2 + \cdots + x_4^{15} + x_4^{16} \tag{4}$$

In Fig. 1, the input signal is $x_1^0$. The wavelet packet coefficients of the $i$th frequency band in the $j$th layer of the WPT are represented as $x_j^i$. The low-pass filter is represented as L and the high-pass filter as H.

Fig. 1 Diagram of four-layer WPT

Therefore, the total energy of the signal can be expressed as:

$$E = \sum_{i=1}^{2^j} E_j^i, \qquad E_j^i = \int_{-\infty}^{\infty} \left| x_j^i(t) \right|^2 \mathrm{d}t \tag{5}$$

Finally, the ratio of the signal energy of each frequency band to the total energy is calculated as:

$$P_i^w = \frac{E_j^i}{E} \tag{6}$$

In this paper, the db20 wavelet is used and 16 frequency bands $x_4^i(t)$ $(i = 1, 2, \ldots, 16)$ are obtained. Finally, the energy proportions $P_i^w$ $(i = 1, 2, \ldots, 16)$ are calculated by formula (6).
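The energy proportions of formula (6) can be sketched with a small wavelet packet decomposition. To stay self-contained, the example below uses the Haar filter pair as a stand-in for the db20 wavelet used in the paper, so the numbers differ from what db20 would give; the function names and the synthetic signal are illustrative. The discrete sum of squared coefficients replaces the integral in formula (5).

```python
import math


def haar_split(x):
    """One wavelet packet step with Haar filters: (low-pass, high-pass) halves."""
    s = math.sqrt(2.0)
    low = [(x[k] + x[k + 1]) / s for k in range(0, len(x) - 1, 2)]
    high = [(x[k] - x[k + 1]) / s for k in range(0, len(x) - 1, 2)]
    return low, high


def wavelet_packet_energy_ratios(x, levels=4):
    """Energy share of each of the 2**levels terminal nodes, as in formula (6).

    Bands are returned in filter-bank (natural) order, not sorted by frequency.
    The signal length should be divisible by 2**levels.
    """
    nodes = [list(x)]
    for _ in range(levels):
        nodes = [child for node in nodes for child in haar_split(node)]
    energies = [sum(c * c for c in node) for node in nodes]
    total = sum(energies)
    return [e / total for e in energies]


# Synthetic 1200-sample window: slow dominant sine plus a weak fast component.
signal = [math.sin(2.0 * math.pi * k / 100.0) + 0.2 * math.sin(2.0 * math.pi * k / 3.0)
          for k in range(1200)]
ratios = wavelet_packet_energy_ratios(signal, levels=4)
print([round(r, 4) for r in ratios])
```

Because the Haar transform is orthonormal, the 16 proportions sum to one, and for this slowly varying signal most of the energy falls in the first (all low-pass) band.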
2.3 Procedures of Fault Joint Feature Construction via Time Domain and Time-Frequency Domain

In summary, the procedures of fault joint feature construction via time domain and time-frequency domain are described as follows:

Step 1: The 15 time domain features $P^t$ are obtained by the statistical formulas in Table 1.
Step 2: The 16 time-frequency domain features $P^w$ are obtained by the four-layer WPT and formula (6).
Step 3: $P^t$ and $P^w$ are combined into $P^l = [P^t, P^w]$.
3 Fault Diagnosis Method of Wavelet-Stacked Feature Extraction for Rolling Bearings

3.1 Principle of SAE

In this paper, SAE is used to reduce the dimensionality of the features $P^l$. The principle of the AE is shown in Fig. 2. It is divided into an input layer, a hidden layer and an output layer [4].

Fig. 2 The structure of AE

As seen in Fig. 2, the input dataset $X = \{x^1, \ldots, x^i, \ldots, x^N\}$ and its output dataset $Y = \{y^1, \ldots, y^i, \ldots, y^N\}$ are given. Data dimensionality reduction is implemented as follows:

$$h = f(W_1 x + b_1) \tag{7}$$

where $f$ is the activation function of the encoder, for which the ReLU function

$$f(x) = \begin{cases} 0, & x \le 0 \\ x, & x > 0 \end{cases}$$

is used in this paper; $W_1$ is the weight matrix from the input layer to the hidden layer; and $b_1$ is the encoding bias. In the decoding stage, the decoding function $g$ maps the feature vector $h$ to the output layer. Thus the $n$-dimensional output vector $y = (y_1, \ldots, y_n)$ is obtained so as to fit the input vector $x$:

$$y = g(W_2 h + b_2) \tag{8}$$

In this paper, the ReLU function is also used for $g$. Finally, the loss function $Loss$ is constructed to evaluate how well $y$ fits $x$:

$$Loss = \frac{1}{N}\sum_{i=1}^{N} \left( y^i - x^i \right)^2 \tag{9}$$
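Formulas (7) to (9) amount to a single forward pass through one autoencoder followed by a reconstruction loss. The sketch below uses hand-picked toy weights (a 3-dimensional input compressed to 2 dimensions) purely for illustration; the weights, data and function names are hypothetical, training is omitted, and the squared term in formula (9) is interpreted as the squared Euclidean norm of the vector difference.

```python
def relu(v):
    """Element-wise ReLU, the activation used for both f and g."""
    return [max(0.0, a) for a in v]


def affine(W, x, b):
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def autoencoder_forward(x, W1, b1, W2, b2):
    h = relu(affine(W1, x, b1))  # formula (7): encode
    y = relu(affine(W2, h, b2))  # formula (8): decode
    return h, y


def reconstruction_loss(X, Y):
    """Formula (9) with the square taken as the squared Euclidean norm."""
    return sum(sum((yi - xi) ** 2 for yi, xi in zip(y, x)) for x, y in zip(X, Y)) / len(X)


# Toy weights: keep the first two coordinates, drop the third.
W1, b1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.0, 0.0]
W2, b2 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [0.0, 0.0, 0.0]

X = [[0.5, 0.2, 0.0], [0.1, 0.4, 0.3]]
codes, Y = zip(*(autoencoder_forward(x, W1, b1, W2, b2) for x in X))
loss = reconstruction_loss(X, Y)
print("codes:", list(codes), "loss:", loss)
```

In an SAE, the code `h` of one trained autoencoder becomes the input of the next one, so the dimension shrinks layer by layer.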
3.2 Principle of FCM Clustering

In this paper, FCM is used to cluster the feature set $P$ obtained by Wavelet-Stacked feature extraction, which provides the labels of the training data. The diagnostic model for the testing data is then established by KNN. The clustering results of FCM are obtained by minimizing a given objective function [6]. The objective function is as follows:

$$\min J(X^0, U, V^0) = \sum_{i=1}^{c}\sum_{j=1}^{N} (u_{ij})^m (d_{ij})^2, \qquad \text{s.t.} \ \sum_{i=1}^{c} u_{ij} = 1 \tag{10}$$

where $U = [u_{ij}]_{c \times N}$ is the membership matrix; $u_{ij}$ is the membership degree of the $j$th sample to the $i$th clustering center; $d_{ij}$ is the Euclidean distance between the $j$th sample and the $i$th clustering center; $V^0 = \{v_1, v_2, \ldots, v_c\}$ is the set of clustering centers of the current data; and the weighting exponent $m$ is equal to 2.
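The objective in formula (10) is usually minimized by alternating between a membership update and a center update. The sketch below implements these standard FCM updates, which are not spelled out in the text above, on a toy two-dimensional data set; the data, the initial centers and the function names are illustrative assumptions.

```python
import math


def memberships(points, centers, m=2.0, eps=1e-12):
    """u[i][j]: membership of sample j in cluster i; each column sums to 1."""
    power = 2.0 / (m - 1.0)
    dist = [[math.dist(v, p) + eps for p in points] for v in centers]
    c, n = len(centers), len(points)
    return [[1.0 / sum((dist[i][j] / dist[k][j]) ** power for k in range(c))
             for j in range(n)] for i in range(c)]


def update_centers(points, U, m=2.0):
    """Each center is the mean of the samples weighted by u**m."""
    dim = len(points[0])
    centers = []
    for row in U:
        w = [u ** m for u in row]
        centers.append([sum(wj * p[d] for wj, p in zip(w, points)) / sum(w)
                        for d in range(dim)])
    return centers


def fcm(points, centers, m=2.0, iters=50):
    for _ in range(iters):
        U = memberships(points, centers, m)
        centers = update_centers(points, U, m)
    return memberships(points, centers, m), centers


# Toy 2-D features standing in for the reduced SAE features.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]
U, centers = fcm(points, [(0.2, 0.2), (0.8, 0.8)])
labels = [max(range(len(U)), key=lambda i: U[i][j]) for j in range(len(points))]
print("centers:", centers)
print("labels:", labels)
```

Taking the cluster with the largest membership for each sample gives the hard labels that are later passed to KNN.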
3.3 Algorithm Description of Fault Diagnosis for Rolling Bearings via Wavelet-Stacked Feature Extraction

In summary, the fault diagnosis algorithm based on Wavelet-Stacked feature extraction proceeds as follows:

Step 1: From the original bearing data X, a new data set $X_n = [X_1; \ldots; X_N]$ is obtained with a sliding window of length 1200 × 1. From $X_n$, the 15 time-domain features $P_t$ listed in Table 1 are computed.
Step 2: The signal $X_n$ is decomposed by a four-layer WPT, and the energy proportion of each frequency band, $P_w$, is calculated. The time-domain and time-frequency-domain features are combined as $P_l = [P_t, P_w]$ and normalized.
Step 3: X is divided into a training set $X_{tr}$ and a testing set $X_{tst}$. The initial parameters of the SAE network are set, and the training data $X_{tr}$ are fed to the SAE, which outputs the reduced feature set P.
Step 4: P is clustered by FCM, and each cluster is labeled with its type.
Step 5: Finally, the training data P and the corresponding labels m are input into KNN to build the diagnostic model.
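Step 1 and the normalization in Step 2 can be sketched as follows. This is only an illustration: the window step (non-overlapping windows), the three features shown (they are examples, not the 15 features of Table 1), the min-max normalization and the random stand-in signal are all assumptions.

```python
import numpy as np

def sliding_windows(signal, length=1200, step=1200):
    """Cut a 1-D signal into windows of `length` samples.
    `step` is an assumption; step == length gives non-overlapping windows."""
    n = (len(signal) - length) // step + 1
    return np.stack([signal[k * step:k * step + length] for k in range(n)])

def time_domain_features(window):
    """Three illustrative time-domain features of one window."""
    rms = np.sqrt(np.mean(window ** 2))
    peak_to_peak = window.max() - window.min()
    centered = window - window.mean()
    kurtosis = np.mean(centered ** 4) / (np.mean(centered ** 2) ** 2)
    return np.array([rms, peak_to_peak, kurtosis])

def min_max_normalize(F):
    """Scale every feature column to [0, 1] (assumed normalization)."""
    span = F.max(axis=0) - F.min(axis=0)
    return (F - F.min(axis=0)) / np.where(span == 0, 1.0, span)

rng = np.random.default_rng(0)
raw = rng.normal(size=12_000)              # stand-in for one vibration record
X_n = sliding_windows(raw)                 # one row per 1200-sample window
P_t = np.array([time_domain_features(w) for w in X_n])
P_t_norm = min_max_normalize(P_t)
```

The wavelet packet energy proportions of Step 2 would be appended as further columns before normalization.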
4 Simulation Studies

4.1 Single Operating Condition Diagnosis Result Analysis

The vibration signal of a healthy rolling bearing shows a certain periodicity, whereas a fault makes the signal unstable. The signal also contains noise and redundant information, which seriously degrade classification. The constructed data set is shown in Table 2; the operating condition is load0 and the fault damage is 0.21. There are four states: normal (N), rolling element (R) fault, outer ring (OR) fault and inner ring (IR) fault. The diagnosis model is obtained by executing the steps in Sect. 3.3, and the validity of the method is confirmed on the testing data. The results of feature extraction in the single operating condition are shown in Fig. 3: the 31-dimensional features (15 time-domain features plus the 16 frequency-band energy proportions of the four-layer WPT) are reduced to 2 dimensions by SAE, named S1 and S2. As seen in Fig. 3, after Wavelet-Stacked feature extraction the reduced features are clearly clustered into four classes by FCM. Compared with the original features, the reduced features make the clustering well defined, simpler and more effective. The training accuracy is 100%. The testing data are then classified by KNN, and the testing accuracy is also 100%.
Table 2 Dataset composition of rolling bearings in single operating condition

Num  Type  Training data  Testing data
1    N     203            20
2    R     101            20
3    OR    101            20
4    IR    101            20

Fig. 3 Feature extraction of SAE in single operating condition
Table 3 Dataset composition of rolling bearings in multi-operating condition

Num  Type  Training data  Testing data
1    N     1414           80
2    R     404            80
3    OR    404            80
4    IR    404            80
Table 4 Clustering and classification results of five methods in multi-operating condition

Method     Training accuracy (%)  Testing accuracy (%)
TF-PCA     99.96                  100
TF-KPCA    99.89                  65.94
TF-t-SNE   100                    59.06
TF-AE      99.20                  100
TF-SAE     100                    100
4.2 Multi-operating Condition Diagnostic Result Analysis

Fault diagnosis under multiple operating conditions is discussed in this section. The constructed data set is shown in Table 3; it covers the operating conditions load0 to load3 with a fault damage of 0.21. The results of the entire algorithm in Sect. 3.3 are shown in Table 4, where TF denotes joint feature extraction in the time domain and the time-frequency domain. For example, TF-SAE means that joint feature extraction is performed first and the feature dimensionality is then reduced via SAE. Among the traditional dimensionality reduction methods, TF-PCA achieves high training and testing accuracy, but it requires a manually set threshold. TF-KPCA also requires a manually set threshold, and its testing accuracy is poor (65.94%). Among the manifold learning methods, TF-t-SNE reaches 100% training accuracy but only 59.06% testing accuracy. Compared with the shallow network AE, TF-SAE has better feature extraction ability: its training accuracy is higher (100% versus 99.20%), and both methods reach 100% testing accuracy.
5 Conclusion

In this paper, a fault diagnosis method for rolling bearings based on Wavelet-Stacked feature extraction is proposed. First, features are extracted in the time domain and the time-frequency domain. Then the feature dimensionality is reduced via SAE. Finally, the reduced features are clustered by FCM, and KNN is used to build the diagnostic model. FCM has difficulty handling high-dimensional data, and the proposed method effectively alleviates this problem. The structure of the proposed model is simple. Its validity is confirmed by simulation on the standard bearing dataset from Case Western Reserve University.

Acknowledgements This work is supported by the Open Project of the Key Laboratory of Micro Optical Electronic Mechanical System Technology of the Ministry of Education (MOMST2016-4) and the Tianjin Key Research and Development Plan Project (19YFHBQY00040).
References

1. Wang, X., Tang, G., Yan, X., He, Y., Zhang, X., Zhang, C.: Fault diagnosis of wind turbine bearing based on optimized adaptive chirp mode decomposition. IEEE Sens. J. 21(12), 13649–13666 (2021). https://doi.org/10.1109/JSEN.2021.3071164
2. Zhai, R., Zeng, J., Ge, Z.: Structured principal component analysis model with variable correlation constraint. IEEE Trans. Cont. Syst. Tech. 30(2), 558–569 (2022). https://doi.org/10.1109/TCST.2021.3069539
3. Li, Q., Ding, X., He, Q., Huang, W., Shao, Y.: Manifold sensing-based convolution sparse self-learning for defective bearing morphological feature extraction. IEEE Trans. Ind. Inf. 17(5), 3069–3078 (2021). https://doi.org/10.1109/TII.2020.3030186
4. Cui, M., Wang, Y., Lin, X., Zhong, M.: Fault diagnosis of rolling bearings based on an improved stack autoencoder and support vector machine. IEEE Sens. J. 21(4), 4927–4937 (2021). https://doi.org/10.1109/JSEN.2020.3030910
5. Liu, C., Zhuo, F., Wang, F.: Fault diagnosis of commutation failure using wavelet transform and wavelet neural network in HVDC transmission system. IEEE Trans. Inst. Measur. 70, 1–8 (2021). https://doi.org/10.1109/TIM.2021.3115574
6. Shi, Z., Wu, D., Guo, C., Zhao, C., Cui, Y., Wang, F.Y.: FCM-RDpA: TSK fuzzy regression model construction using fuzzy C-means clustering, regularization, droprule, and powerball adabelief. Inf. Sci. 574, 490–504 (2021). https://doi.org/10.1016/j.ins.2021.05.084
A Visual-Based Aircraft Pose Estimation Method During Take-Off

Feng Liu, Jong Zhang, Hao Guo, and Xue Chen

Abstract In this paper, we propose a visual-based aircraft pose estimation method that can provide positioning and navigation for the aircraft during the autonomous take-off phase. Using images obtained by a camera installed in the aircraft cockpit, the pose estimation model estimates the horizontal position deviation and heading deviation between the aircraft and the runway centerline. The flight control system then uses these two deviations to control the aircraft for autonomous take-off. The pose estimation model is built with a deep learning algorithm and has a simple and efficient structure. We performed 1000 experiments using a Boeing 737 model in a flight simulation environment, and the results show that the maximum horizontal position error and maximum heading error output by the pose estimation model during take-off were 1.89 m and 0.95°, respectively. In all experiments, the aircraft was able to complete the take-off successfully when navigated by the pose estimation model.

Keywords Aircraft pose estimation · Deep learning · Machine vision
F. Liu (B) · J. Zhang · H. Guo · X. Chen
COMAC Artificial Intelligence Innovation Center, 102209 Beijing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_41

1 Introduction

In recent years, autonomous take-off has been a hot research topic for fixed-wing unmanned aerial vehicles (UAVs) and commercial aircraft [1–3]. To achieve autonomous take-off for fixed-wing aircraft, the control system must be able to perceive the horizontal position deviation and heading deviation between the aircraft and the runway centerline during take-off [4]. Some UAVs have achieved precise positioning during take-off with radio devices and gyroscopes, and are thus able to perceive the deviation of the aircraft relative to the runway [5, 6]. Commercial aircraft currently do not have autonomous take-off capability, although the field is moving in that direction. Using radio positioning to achieve autonomous take-off for commercial aircraft is a feasible technical solution, but it requires radio equipment to be installed on both the runway and the aircraft, and not all airports meet the conditions for installing radio positioning equipment [7–9].

When pilots take off, they keep the aircraft taxiing along the middle of the runway by visually observing its position relative to the runway centerline. Inspired by this visual positioning method, this study uses machine vision to recognize the deviation between the aircraft and the runway centerline. The method requires no modification to the runway; only image acquisition equipment and the corresponding computing units need to be installed on the aircraft, so it is easier to implement and promote.

This study constructs a deep learning model for aircraft pose estimation from convolutional neural networks [10] and fully connected networks [11]. The input of the model is images of the runway obtained from the camera in the cockpit, and the output is the horizontal position deviation and heading deviation between the aircraft and the runway centerline. We collect the image data set for training the model by piloting an aircraft through multiple take-offs. Finally, we test and verify the aircraft pose estimation model in a flight simulation environment. The results show that the vision-based aircraft pose estimation model proposed in this study can accurately recognize the horizontal position deviation and heading deviation between the aircraft and the runway centerline, and that it can provide accurate navigation for autonomous take-off of the aircraft.
2 Related Works

Many research institutions and aircraft manufacturers have conducted research on aircraft pose estimation during take-off. In 2020, Airbus achieved a fully automatic take-off of an A350 aircraft using visual data, inertial navigation data, and aircraft front wheel rotation data [12]. This method detects the position of the aircraft on the runway and its precise heading from the data of multiple sensors. However, the need to install multiple sensors makes it difficult to retrofit existing aircraft models. Moreover, the fusion algorithm for multi-sensor data is complicated, which increases the complexity of the software and can easily make the system unstable. Tang et al. proposed a Chan-Vese model-based approach for ground stereo vision detection, which can help a fixed-wing UAV take off and land autonomously in Global Navigation Satellite System (GNSS)-denied environments [13]. An extended Kalman filter is fused into the state estimation to reduce the localization inaccuracy caused by measurement errors of object detection and pan-tilt unit attitudes, and a region of interest is set up to improve real-time capability. This method is a traditional machine vision algorithm; compared with the latest deep-learning-based machine vision models, it performs poorly in high-dimensional feature extraction [14, 15].
Fig. 1 Aircraft pose estimation model
3 Aircraft Pose Estimation Model

As shown in Fig. 1, the aircraft pose estimation model is built from 3 convolution layers, 3 pooling layers and 3 fully connected layers. The input is the image in front of the aircraft, resized to 640 × 360 pixels. Each convolution layer has 128 convolution kernels of size 3 × 3, and each pooling layer uses a 2 × 2 max pooling kernel. After the last pooling layer, the feature data are flattened and passed to a fully connected layer of size 64. The second fully connected layer also has size 64, and the last fully connected layer has 2 elements. The outputs of the last fully connected layer are the horizontal position deviation and heading deviation between the aircraft and the runway centerline. To train the aircraft pose estimation model, we use the mean-square error as the loss function:

$$L = \frac{1}{2}\left[(d_l - d_t)^2 + (\theta_l - \theta_t)^2\right] \tag{1}$$

where $d_l$ and $\theta_l$ are the labels, and $d_t$ and $\theta_t$ are the values predicted by the model during training.
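To make the layer sizes concrete, the following sketch traces the feature-map shape through the network and evaluates the loss of Eq. (1). The text does not state the padding, the stride, or the ordering of the layers, so the sketch assumes unpadded stride-1 convolutions, each followed by one pooling layer; the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output length of a convolution along one axis."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel=2):
    """Output length of non-overlapping max pooling along one axis."""
    return size // kernel

def feature_map_shape(height=360, width=640, blocks=3, channels=128):
    """Shape after `blocks` repetitions of [3x3 conv, 2x2 max pool]."""
    for _ in range(blocks):
        height, width = conv_out(height), conv_out(width)   # assumed: no padding
        height, width = pool_out(height), pool_out(width)
    return height, width, channels

def pose_loss(d_label, theta_label, d_pred, theta_pred):
    """Eq. (1): half the sum of the two squared errors for one sample."""
    return 0.5 * ((d_label - d_pred) ** 2 + (theta_label - theta_pred) ** 2)

h, w, ch = feature_map_shape()
flattened = h * w * ch             # inputs to the first fully connected layer
example_loss = pose_loss(1.0, 0.5, 0.0, 0.0)
```

Under these assumptions the flattened vector has 43 × 78 × 128 elements, so almost all trainable parameters sit in the first fully connected layer; "same" padding would give a different count.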
4 Producing Training Data Set

Training the aircraft pose estimation model requires thousands of labeled data samples. In this study, the experiments are performed in a flight simulation environment, and the training data set is produced in the following steps:
Fig. 2 The diagram of an aircraft take-off
Step 1: As shown in Fig. 2, we select two points (S and E) on the centerline of the runway and record their coordinates $(x_s, y_s)$ and $(x_e, y_e)$.
Step 2: The aircraft is placed at the start of the runway and aligned with the centerline, and its heading $h_i$ is recorded.
Step 3: The aircraft is manually piloted through multiple take-offs in the flight simulation environment, and during each take-off images of the runway in front of the aircraft are captured at a fixed time interval.
Step 4: After each image acquisition, the current heading of the aircraft $h_t$ is obtained from the flight simulation system, and the heading deviation is calculated:

$$\theta = h_i - h_t \tag{2}$$
Step 5: The horizontal position deviation between the aircraft and the runway centerline is calculated:

$$d = \frac{(y - y_s)(x_e - x_s) - (x - x_s)(y_e - y_s)}{\sqrt{(x_e - x_s)^2 + (y_e - y_s)^2}} \tag{3}$$

where $(x, y)$ is the coordinate of the aircraft.
Step 6: As shown in Fig. 3, each image is named by its two deviations and stored as a learning sample. About 200 images can be collected from each take-off, and a total of 6000 samples are produced as the data set for training the aircraft pose estimation model.

Fig. 3 The training data set of the model
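The two labels of Steps 4 and 5 can be computed as in the sketch below. Wrapping the heading difference to [-180°, 180°) is an added assumption that is not stated in the text, the function names are hypothetical, and the sign of d depends on the orientation of the coordinate frame.

```python
import math

def heading_deviation(h_i, h_t):
    """Eq. (2): theta = h_i - h_t in degrees.
    The wrap to [-180, 180) is an assumption for headings that cross north."""
    return (h_i - h_t + 180.0) % 360.0 - 180.0

def horizontal_deviation(x, y, xs, ys, xe, ye):
    """Eq. (3): signed perpendicular distance from (x, y) to the line S-E."""
    numerator = (y - ys) * (xe - xs) - (x - xs) * (ye - ys)
    return numerator / math.hypot(xe - xs, ye - ys)

# Centerline along the x axis from S = (0, 0) to E = (10, 0); aircraft at (4, 3).
d = horizontal_deviation(4.0, 3.0, 0.0, 0.0, 10.0, 0.0)
theta = heading_deviation(90.0, 88.0)
theta_across_north = heading_deviation(10.0, 350.0)
```

A point on the centerline gives d = 0, and swapping S and E flips the sign of d, so the same convention has to be used for every sample.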
5 Experiments

Environment: We perform the experiments of this study in X-Plane, a professional flight simulation software that has been used by many organizations in the industry, such as NASA and Boeing [16]. X-Plane provides advanced flight dynamics simulation, instrument simulation, flight environment simulation and flight operation simulation, and it offers a qualified test environment for flight algorithm verification [17–19].

Experiment details: We perform the experiments with the Boeing 737 model in X-Plane and the research tool X-Plane Connect. The aircraft pose estimation model is implemented in Python with TensorFlow. The visual data are collected continuously by taking screenshots of the X-Plane window. After completing the model training, we conducted 1000 autonomous take-off tests navigated by visual positioning. At the start of each take-off, the aircraft is reset and placed at the starting point of the runway. Table 1 shows more details of this experiment.
Table 1 Detailed configurations for aircraft pose estimation experiment during take-off

Environment                  X-Plane
Aircraft model               Boeing-737
Airport runway               Beijing Daxing International Airport 29R
Data set size                6000
Learning rate                10^-3
Batch size                   32
Optimization method          Adam
Sample collection frequency  6 Hz
Fig. 4 Comparison of the actual horizontal deviation and the predicted horizontal deviation
Results and discussion: As shown in Figs. 4 and 5, the maximum error of the predicted horizontal deviation is less than 2 m, and the maximum error of the predicted heading deviation is less than 1°. The predicted values follow the trend of the actual values during take-off. When the predicted heading deviation is close to the actual value, the predicted horizontal deviation is also close to the actual horizontal deviation, which suggests that the accuracies of the two predicted values are positively correlated. Over the 1000 autonomous take-offs, the maximum horizontal position error and maximum heading error output by the pose estimation model during take-off were 1.89 m and 0.95°, respectively. In each test, the control program was able to complete the take-off navigated by the pose estimation model.
Fig. 5 Comparison of the actual heading deviation and the predicted heading deviation
6 Conclusion

In this work, we proposed a visual-based aircraft pose estimation method that uses a deep learning model consisting of convolutional neural networks and fully connected neural networks. From the image of the runway in front of the aircraft during take-off, it estimates the horizontal position deviation and heading deviation between the aircraft and the runway centerline. These estimates provide navigation information for autonomous take-off and can also give early warning when the aircraft is about to run off the runway. The method uses the latest artificial intelligence technology and does not require complex modifications to the aircraft or the runway. Finally, we verified the method by comparing the values predicted by the model with the actual values in a flight simulation environment. The results show that the proposed visual-based aircraft pose estimation method can accurately identify the horizontal position deviation and the heading deviation between the aircraft and the runway centerline during take-off.
References

1. Tang, D., Hu, T., Shen, L., et al.: Ground stereo vision-based navigation for autonomous take-off and landing of UAVs: a Chan-Vese model approach. Int. J. Adv. Robot. Syst. 13(2), 67 (2016)
2. Carnes, T.W., Bakker, T.M., Klenke, R.H.: A fully parameterizable implementation of autonomous take-off and landing for a fixed wing UAV. AIAA Guidance, Navigation, and Control Conference (2015)
3. Daibing, Z., Xun, W., Weiwei, K.: Autonomous control of running take-off and landing for a fixed-wing unmanned aerial vehicle. In: 12th International Conference on Control Automation Robotics and Vision (ICARCV), pp. 990–994. IEEE (2012)
4. Roos, J.C.J.C.: Autonomous take-off and landing of a fixed wing unmanned aerial vehicle. Stellenbosch University, Stellenbosch (2007)
5. Kim, H.J., Kim, M., Lim, H., et al.: Fully autonomous vision-based net-recovery landing system for a fixed-wing UAV. IEEE/ASME Trans. Mechatron. 18(4), 1320–1333 (2013)
6. Seymour, A.C., Ridge, J.T., Rodriguez, A.B., et al.: Deploying fixed wing Unoccupied Aerial Systems (UAS) for coastal morphology assessment and management. J. Coast. Res. 34(3), 704–717 (2018)
7. Novak, A., Pitor, J.: Flight inspection of instrument landing system. In: 2011 IEEE Forum on Integrated and Sustainable Transportation Systems, pp. 329–332. IEEE (2011)
8. Rosin, A., Hecht, M., Handal, J.: Analysis of airport-runway availability. In: Annual Reliability and Maintainability Symposium. 1999 Proceedings (Cat. No. 99CH36283), pp. 432–440. IEEE (1999)
9. Jeong, M.S., Bae, J., Jun, H.S., et al.: Flight test evaluation of ILS and GBAS performance at Gimpo International Airport. GPS Solutions 20, 473–483 (2016)
10. Li, Z., Liu, F., Yang, W., et al.: A survey of convolutional neural networks: analysis, applications, and prospects. IEEE Trans. Neural Netw. Learn. Syst. (2021)
11. Suk, H.I.: An introduction to neural networks and deep learning. Deep learning for medical image analysis, pp. 3–24. Academic Press (2017)
12. Coppinger, R.: Mission Autonomous, pp. 66–72 (2020)
13. Tang, D., Hu, T., Shen, L., Zhang, D., Kong, W., Low, K.H.: Ground stereo vision-based navigation for autonomous take-off and landing of UAVs: A Chan-Vese model approach. Int. J. Adv. Robot. Syst. 13(2) (2016). https://doi.org/10.5772/62027
14. Mahony, N., Campbell, S., Carvalho, A., et al.: Deep learning versus traditional computer vision. In: Advances in Computer Vision: Proceedings of the 2019 Computer Vision Conference (CVC), vol. 1, pp. 128–144. Springer International Publishing (2020)
15. Manzoor, S., Joo, S.H., Kuc, T.Y.: Comparison of object recognition approaches using traditional machine vision and modern deep learning techniques for mobile robot. In: 19th International Conference on Control, Automation and Systems (ICCAS), pp. 1316–1321. IEEE (2019)
16. Teubert, C., Watkins, J.: The X-Plane connect toolbox. Available Online: https://github.com/nasa/XPlaneConnect. Accessed 2 Mar 2023
17. Nanduri, A., Sherry, L.: Generating flight operations quality assurance (FOQA) data from the X-Plane simulation. Integrated Communications Navigation and Surveillance (ICNS), pp. 5C1-1–5C1-9. IEEE (2016)
18. Bittar, A., et al.: Guidance software-in-the-loop simulation using X-Plane and Simulink for UAVs. Int. Conf. Unmanned Aircraft Syst. (ICUAS). IEEE (2014)
19. Junior, J.M. Magalhaes, et al.: Test platform for autopilot system embedded in a model of multi-core architecture using X-Plane flight simulator. In: IEEE/AIAA 38th Digital Avionics Systems Conference (DASC). IEEE (2019)
Edge Detection of Carbon Electrode Image Based on Improved Image Restoration and Improved Canny Operator Fusion

Feng Jia, Xiaobin Li, and Yanling Xu

Abstract In complex industrial environments, edge detection of carbon electrode images is hindered by the low contrast of the carbon electrode, by reflections caused by strong light on the metal strip, and by heavy background noise. To address these problems, an edge detection algorithm for carbon electrode images is proposed that fuses improved image inpainting with an improved Canny operator. First, an improved color enhancement algorithm is used to enhance the contrast of the image. Then, the image inpainting algorithm is used to locally repair the metal strip of the carbon electrode, which mitigates the uneven brightness caused by strong light interference. At the same time, a guided filter is used for noise reduction to reduce the edge blurring caused by Gaussian filtering. Next, the Sobel operator with 3 × 3 gradient templates is used to calculate the gradient magnitude and direction, which improves the accuracy of edge location. Finally, discontinuous edges are repaired by a morphological closing operation to obtain the final detected edges. The experimental results show that, compared with other algorithms, the proposed algorithm better preserves edge details and greatly improves the edge connectivity index, and that it effectively improves edge detection of carbon electrode images in complex environments.

Keywords Carbon electrode · Guided filtering · Canny operator · Edge detection
F. Jia · X. Li (B)
Department of Control Science and Engineering, Shanghai Institute of Technology, 100 Haiquan Road, Fengxian District, Shanghai 201418, P. R. China
Y. Xu
Beijing Academy of Science and Technology, Beijing 100089, P. R. China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_42
1 Introduction

Compared with graphite electrodes, carbon electrodes save energy, cost less and offer a better performance-to-price ratio, and using them in place of electrode paste reduces pollutant emissions and energy consumption, so carbon electrodes are widely used in industrial silicon smelting in China. Carbon electrodes often need to be measured manually before leaving the factory. However, the industrial production environment is complex, and manual measurement easily creates safety risks. Therefore, visual detection technology is needed to replace manual work in identifying the size specifications of carbon electrodes. Because the production environment is complex and the carbon electrode has a single color, the gray value of carbon electrodes is too high, and research on carbon electrode identification is still relatively limited. Lu et al. [1] improved the traditional Canny operator by adding 45° and 135° gradient templates, but this algorithm has difficulty meeting the requirements of electrode defect detection under low contrast. Huang Mengtao et al. [2] proposed an improved Canny operator that achieves high accuracy for lithium battery defect recognition but has low recognition accuracy for some defects. Sun Haoran et al. [3] proposed a lithium-ion battery electrode defect detection method, but its detection accuracy is low for some defects. To address the shortcomings of the existing visual detection methods, a fusion algorithm based on image inpainting and edge detection is proposed. First, to address the low image contrast, the color contrast is increased. Then, a new image inpainting algorithm is proposed to solve the problem of metal halo interference. Finally, the improved edge detection method is used to extract the edge of the carbon electrode.
2 Problem Statement

2.1 System Description

The image acquisition system for the carbon electrode is shown in Fig. 1. In the experiment, the carbon electrode image was captured by a fixed network video recorder, and the collected image was sent to a PC for further image processing. To account for the actual industrial environment, the experiments were carried out in two different scenes with textured background noise, and random salt-and-pepper noise and Gaussian noise were added to the captured images to approximate the actual environment as closely as possible. Examples of the captured electrode images are shown in Fig. 2, and the gray-level histogram of an electrode image is shown in Fig. 3.
Fig. 1 The image acquisition system for the carbon electrode
Fig. 2 Similar electrode images captured
2.2 Algorithm Design

Given the images acquired by the system, this experiment needs to solve the following problems:
(1) The overall color of the carbon electrode is uniform and close to the color of the surrounding conveyor belt and environment, so the contrast is low;
(2) Strong light reflected by the metal strip of the carbon electrode causes a halo and uneven brightness in the image;
(3) Because of the complex production environment, dust and noise blur the collected image, which makes the image edges difficult to detect.

Fig. 3 The gray level histogram of the electrode image

Based on the above analysis of the three problems in edge detection of carbon electrode images, an improved algorithm framework is proposed, as shown in Fig. 4, where "a" is the input, that is, the collected carbon electrode image; "b" represents the image processing block, in which the input is optimized; and "c" is the output, that is, the edge of the carbon electrode after image processing.

Fig. 4 The model diagram of the proposed algorithm
3 Image Pre-processing

3.1 Improved Color Enhancement and Filtering

In the image captured by the camera, the overall color of the carbon electrode is a uniform black. In addition, the image is blurred and has low contrast because of the added salt-and-pepper noise and Gaussian noise. The processing steps are as follows:

1. Convert the image to CMY space, and subtract from the three channel values (C, M, Y) of each pixel their minimum:

$$\begin{cases} C = C - \min(C, M, Y) \\ M = M - \min(C, M, Y) \\ Y = Y - \min(C, M, Y) \end{cases} \tag{1}$$

2. Convert the image obtained in the previous step to HSV space.
3. Calculate the maximum and minimum values of V, and re-quantize the V value as follows:

$$newPixel = oldPixel \times \frac{v_{max} - v_{min}}{255 - v_{min}} \tag{2}$$

4. Filtering methods commonly used in image processing include bilateral filtering, median filtering and mean filtering. These methods denoise common noise types in color images well, but they are not effective for electrode images with a single surface color. Therefore, the guided filtering algorithm is adopted in place of Gaussian filtering, which tends to blur image edges. Compared with the widely used bilateral filter, the guided filter has much lower computational complexity, and it does not produce gradient reversal near edges. It is expressed as

$$q_i = \sum_j W_{ij}(I)\, p_j \tag{3}$$

where $I$ is the guidance image, $p$ is the input image, $q$ is the output image, and $W_{ij}$ is the filter kernel determined by $I$.
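The guided filter of Eq. (3) is normally computed without forming the kernel explicitly, using window means and a local linear model. The sketch below follows that standard box-filter formulation for a grayscale image; the window radius `r`, the regularization `eps`, the edge-replicating padding and the function names are assumptions, not values taken from the paper.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window, with edge replication at the border."""
    padded = np.pad(img, r, mode="edge")
    out = np.zeros(img.shape, dtype=float)
    k = 2 * r + 1
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def guided_filter(I, p, r=2, eps=1e-3):
    """Guided filter: in each window the output is modeled as q = a * I + b."""
    I = I.astype(float)
    p = p.astype(float)
    mean_I = box_mean(I, r)
    mean_p = box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)        # close to 1 at strong edges, 0 in flat areas
    b = mean_p - a * mean_I
    return box_mean(a, r) * I + box_mean(b, r)

# A vertical step edge with values in [0, 1], filtered with itself as guidance.
step = np.zeros((8, 12))
step[:, 6:] = 1.0
q = guided_filter(step, step)
blurred = box_mean(step, 2)           # plain averaging, for comparison
```

On this step image the guided output stays close to the input at the edge, whereas the plain window mean smears the step over several columns, which is the edge-preserving behavior the text relies on.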
3.2 Improved Image Repair Algorithm

To address metal halo interference, an improved image inpainting algorithm is proposed. The algorithm is as follows:

STEP1 The binary image is smoothed with a Gaussian blur, which reduces burrs.
STEP2 An improved median filtering algorithm is applied, in which mainly the sorting method is improved: a flag is set to true if a data exchange occurs in the for loop and to false otherwise.
STEP3 The binary image is traversed, Hough line detection is applied, and the detected line segments are marked with red lines.
STEP4 The detected line segments are clustered: segments whose angular offset is less than 5° are classified as belonging to the same straight line, and the pixels on one of these line segments are turned black to complete the line segment. The binary images and the results of the algorithm are shown in Fig. 5, where a and c are the binarized images and b and d are the corresponding results.
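STEP2 and the grouping rule of STEP4 can be sketched in plain Python as follows. The flag is interpreted here as the usual early-exit test of bubble sort, which is an assumption about the intended improvement; grouping segments by angle alone (ignoring their position) is a simplification, and all function names are hypothetical.

```python
import math

def bubble_sort_with_flag(values):
    """Bubble sort that stops as soon as a full pass makes no exchange."""
    data = list(values)
    for end in range(len(data) - 1, 0, -1):
        swapped = False                       # the flag described in STEP2
        for k in range(end):
            if data[k] > data[k + 1]:
                data[k], data[k + 1] = data[k + 1], data[k]
                swapped = True
        if not swapped:                       # already ordered: stop early
            break
    return data

def median_filter_3x3(image):
    """3 x 3 median filter on a list-of-lists image; border pixels are kept."""
    rows, cols = len(image), len(image[0])
    out = [row[:] for row in image]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [image[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            out[r][c] = bubble_sort_with_flag(window)[4]   # middle of 9 values
    return out

def segment_angle(x1, y1, x2, y2):
    """Orientation of a segment in [0, 180) degrees."""
    return math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0

def group_by_angle(segments, max_offset=5.0):
    """Group segments whose orientation differs by less than max_offset degrees."""
    groups = []
    for seg in segments:
        angle = segment_angle(*seg)
        for group in groups:
            diff = abs(angle - group["angle"])
            if min(diff, 180.0 - diff) < max_offset:
                group["segments"].append(seg)
                break
        else:
            groups.append({"angle": angle, "segments": [seg]})
    return groups

noisy = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]           # one isolated bright pixel
cleaned = median_filter_3x3(noisy)
groups = group_by_angle([(0, 0, 10, 0), (0, 5, 10, 5.5), (0, 0, 0, 10)])
```

In a full pipeline the segments would come from a Hough line detector, and each group would then be redrawn as one continuous line.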
Fig. 5 The binary image and the effect of the algorithm
4 Edge Detection Improvement

4.1 Improved Canny Operator

For edge detection of carbon electrode images, it is difficult to achieve the ideal effect with the traditional Canny operator. Therefore, the traditional Canny operator is improved in this paper in the following steps:

1. Replace the Gaussian filter in the traditional Canny operator with the guided filter.
2. Add multi-scale detail enhancement of the carbon electrode image.
3. The gradient calculation of the traditional Canny operator is carried out in a 2 × 2 neighborhood; here the Sobel operator is used instead, and the gradient is calculated in a 3 × 3 neighborhood.

With these improvements to the traditional Canny operator, edge detection of the carbon electrode image is expected to be achieved. The flow chart is shown in Fig. 6.
Fig. 6 Improved Canny operator flowchart
4.2 Multi-scale Detail Enhancement

Compared with other image enhancement algorithms, such as gamma correction [4] and fuzzy-entropy-based methods [5], the multi-scale detail enhancement algorithm is more suitable for carbon electrode images captured in complex environments, and it mitigates the problem of uneven illumination.
4.3 Improved Gradient Calculation

The traditional Canny operator calculates the gradient magnitude in a 2 × 2 neighborhood, but its accuracy is low and its anti-interference ability is poor. Therefore, a Sobel operator based on the city-block distance in a 3 × 3 neighborhood is proposed. To obtain more edge information, the weighted sum of the gradients in the x, 45°, y and 135° directions is considered. The gradient templates of the Sobel operator are as follows:

$$
G_0 = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix}, \quad
G_{45} = \begin{bmatrix} -2 & -1 & 0 \\ -1 & 0 & 1 \\ 0 & 1 & 2 \end{bmatrix}, \quad
G_{90} = \begin{bmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{bmatrix}, \quad
G_{135} = \begin{bmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \\ 2 & 1 & 0 \end{bmatrix} \tag{4}
$$
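A minimal sketch of the four-direction gradient computation with the templates of Eq. (4) is given below. The text does not specify the weights of the sum, so equal weights are assumed; the city-block magnitude is taken as the weighted sum of absolute responses, the direction is taken from the x and y responses only, and the helper names are hypothetical.

```python
import numpy as np

G0 = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
G45 = np.array([[-2, -1, 0], [-1, 0, 1], [0, 1, 2]], dtype=float)
G90 = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
G135 = np.array([[0, -1, -2], [1, 0, -1], [2, 1, 0]], dtype=float)

def correlate3x3(image, kernel):
    """Apply a 3 x 3 template to every pixel, with edge replication at the border."""
    padded = np.pad(image.astype(float), 1, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out

def four_direction_gradient(image, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted city-block magnitude over the four templates, plus direction."""
    responses = [correlate3x3(image, k) for k in (G0, G45, G90, G135)]
    magnitude = sum(w * np.abs(r) for w, r in zip(weights, responses))
    direction = np.degrees(np.arctan2(responses[2], responses[0]))
    return magnitude, direction

# A vertical step edge: only columns next to the step should respond.
img = np.zeros((5, 6))
img[:, 3:] = 1.0
magnitude, direction = four_direction_gradient(img)
```

The magnitude and direction maps would then feed the non-maximum suppression and thresholding stages of the Canny pipeline.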
5 Experimental Results and Analysis

The experiments were run on Windows and implemented with the OpenCV vision library. Similar carbon electrode objects in different scenes were selected for verification. The improved Canny operator is compared with other common edge detection operators, and the results are shown in Fig. 7.

Fig. 7 Edge detection effect graph

To evaluate the performance of the edge detection algorithms more intuitively, the results are analyzed statistically according to the evaluation indicators in the literature, and the statistics are shown in Table 1. A denotes the number of 4-connected components and B the number of 8-connected components. In general, the smaller the ratio B/A, the better the connectivity of the image edges [6] and the better the edge detection result. The ratio B/A can therefore be used as an index of the edge connectivity of an image [7].

Table 1 Quality (PSNR) evaluation comparison results of image filtering

Filtering   A (scene a)   B (scene a)   B/A (scene a)   A (scene b)   B (scene b)   B/A (scene b)
Ours        2             1             0.5             12            5             0.41
Canny       236           168           0.71            1125          904           0.80
Sobel       663           602           0.91            2021          1071          0.52
Robert      39            36            0.92            485           460           0.94

The data in Table 1 show that, in scene a, the proposed algorithm has the best edge connectivity, followed by the traditional Canny algorithm, while the Sobel and Robert algorithms have the worst edge connectivity. In scene b, the proposed algorithm again has the best connectivity, followed by the Sobel algorithm, while the traditional Canny and Robert algorithms perform worse. The improved algorithm in this paper therefore gives better edge connectivity and a better edge detection result.
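The connectivity index can be sketched in Python as follows, assuming that A and B are the numbers of 4-connected and 8-connected components of a binary edge map. The helper names are hypothetical, and the breadth-first labeling is only one possible way to count components.

```python
from collections import deque

# Neighbor offsets for 4-connectivity and 8-connectivity.
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(edges, neighbours):
    """Count connected components of non-zero pixels in a 2-D list."""
    h, w = len(edges), len(edges[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for i in range(h):
        for j in range(w):
            if not edges[i][j] or seen[i][j]:
                continue
            # New component found: flood it with a breadth-first search.
            count += 1
            seen[i][j] = True
            queue = deque([(i, j)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in neighbours:
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < h and 0 <= cc < w
                            and edges[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
    return count


def connectivity_ratio(edges):
    """Return (A, B, B/A) for a binary edge map."""
    a = count_components(edges, N4)
    b = count_components(edges, N8)
    return a, b, (b / a if a else float("nan"))


# Usage: two diagonally adjacent pixels are two 4-connected components
# but a single 8-connected component.
diagonal = [[1, 0, 0],
            [0, 1, 0],
            [0, 0, 0]]
a, b, ratio = connectivity_ratio(diagonal)
```

Fragmented edges whose pieces touch only diagonally lower B relative to A, so a small ratio indicates that many 4-connected fragments join into few 8-connected curves.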
6 Conclusion

Existing algorithms have difficulty detecting the edges of carbon electrode images in complex environments and produce too much noise. To address this problem, this paper proposes a carbon electrode image edge detection method based on an improved Canny operator. The experiments lead to the following conclusions. First, the improved image inpainting algorithm used in image preprocessing effectively eliminates the metal aperture of the carbon electrode, which facilitates subsequent image detection. Second, compared with the classic Canny algorithm, the Sobel operator and the Robert operator, the proposed method based on the improved Canny algorithm achieves better edge detection results and improves the effectiveness of the algorithm to a certain extent.

Acknowledgements This work was supported by Shanghai Collaborative Innovation Technology Fund (Grant No. XTCX2022-29).
References
1. Lu, H.L., Yan, J.: Window frame obstacle edge detection based on improved Canny operator. In: 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), pp. 493–496. IEEE, Xiamen, China (2019)
2. Huang, M.T., Lian, Y.X.: Lithium battery electrode plate surface defect detection based on improved Canny operator. Chin. J. Sci. Instrum. 42(10), 199–209 (2021)
3. Ho-ran, S., Ya-wen, L., You-jun, H., Yue-ming, H.: Defect detection of lithium-ion battery electrode based on topology filtering and improved Canny operator. Energy Storage Sci. Technol. 11(10), 3297–3305 (2022)
Wang, B., Fan, S.S.: An improved CANNY edge detection algorithm. In: Second International Workshop on Computer Science and Engineering, pp. 497–500. IEEE, Qingdao, China (2009)
4. Li, C., Yang, Y., Xiao, L., et al.: A novel image enhancement method using fuzzy Sure entropy. Neurocomputing 215(26), 196–211 (2016)
5. Gupta, B., Tiwari, M.: Minimum mean brightness error contrast enhancement of color images using adaptive gamma correction with color preserving framework. Optik 127(4), 1671–1676 (2016)
6. Lin, H., Shunin, Z., Changsheng.: A new edge evaluation method based on connected component. Remote Sens Land Resour (03), 37–40 (2003)
7. Song, R.J., Liu, C., Wang, B.J.: An adaptive canny edge detection algorithm. J. Nanjing Univ. Posts Telecommun. (Natural Science Edition) 38(03), 72–76 (2018)
Landing Point Control Technology of Parafoil System Based on Sliding Mode Control in a Complex Environment
Weitao Lu, Hao Sun, Qinglin Sun, and Zengqiang Chen
Abstract The parafoil system plays an important role in spacecraft recovery and material delivery owing to its unique advantages. However, because of its complex nonlinear dynamics, it is easily affected by complex environments such as wind fields, and its landing point is difficult to control accurately. To solve this problem, an eight-degree-of-freedom parafoil model is established based on the dynamic constraints of the parafoil and the Lagrange equation, and a sliding mode control (SMC) method for parafoil landing point control based on the extended state observer (ESO) is designed. The ESO estimates the impact of complex environments on the parafoil system, and the estimated disturbance is compensated in the SMC. The results show that the proposed control method effectively overcomes environmental interference: it significantly shortens the convergence time and offers good control accuracy and robustness, so that precise landing point control of the parafoil system can be realized. The work provides a theoretical reference for parafoil-based landing point recovery.
Keywords Parafoil system · Homing control · SMC · ESO · Perturbation compensation
1 Introduction

The parafoil system has received widespread attention because its flight direction is controllable, and it can be used for battlefield material delivery and spacecraft recovery. However, the parafoil is easily affected by complex environments such as gusts and rain, and the relationship between the control input and the yaw angular velocity makes its dynamic characteristics strongly nonlinear, so accurate homing is difficult to achieve. Therefore, it is of practical significance and
W. Lu · H. Sun · Q. Sun (B) · Z. Chen College of Artificial Intelligence, Nankai University, Tianjin 300350, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_43
wide application value to study the fixed-point homing problem of the parafoil system in complex environments. Slegers and Costello [1] identified the aerodynamic parameters of an unpowered parafoil by the least squares method and used model predictive control for trajectory tracking; the trajectory tracking error in the flight test was in the range of 36–62 m. Carter et al. [2] proposed a band-limited guidance and control method with a mass model; in flight experiments the maximum landing error was 600 m and the minimum landing error was 36 m. Leon et al. [3] devised a feedback control method and conducted a flight experiment with a landing error of 142 m. Tanaka et al. [4] tracked a standard circle with the rational polynomial tracking control method, with a maximum tracking error of about 25 m. To improve modeling accuracy, Moriyoshi et al. [5] tested the influence of the rigging angle in a wind tunnel. Xing et al. [6] proposed an optimal trajectory planning algorithm for the parafoil and obtained, through simulation experiments, a return trajectory that accounts for energy and landing point accuracy. The landing error of a parafoil in actual flight is always much larger than the error in ideal simulation, mainly because of the limited accuracy of the parafoil model and external interference such as wind fields. To deal with this problem, this paper establishes an accurate eight-degree-of-freedom dynamic model of the parafoil system so that the model reproduces the real flight state of the actual parafoil. In addition, an SMC based on the ESO is designed, in which the ESO observes system disturbances and environmental disturbances. Finally, the effectiveness of the control algorithm is verified by multiple simulated parafoil airdrop experiments.
2 Parafoil Eight-degree-of-freedom Model

As shown in Fig. 1, the parafoil system consists of two parts: a flexible parafoil canopy and a load. To facilitate analysis, three main coordinate systems are established: the inertial coordinate system $O_d x_d y_d z_d$, the canopy coordinate system $O_s x_s y_s z_s$, and the payload coordinate system $O_w x_w y_w z_w$.
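2.1 Forces on the Canopy and the Load

First, the momentum and angular momentum equations of the load can be expressed as:

$$
\begin{cases}
\dfrac{\partial P_w}{\partial t} + W_w \times P_w = F_w^{aero} + F_w^{G} + F_w^{t} \\[2ex]
\dfrac{\partial H_w}{\partial t} + W_w \times H_w = M_w^{aero} + M_w^{G} + M_w^{t}
\end{cases}
\tag{1}
$$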
Fig. 1 Structure diagram of brushless DC motor
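The corresponding equations for the canopy are:

$$
\begin{cases}
\dfrac{\partial P_s}{\partial t} + W_s \times P_s = F_s^{aero} + F_s^{G} + F_s^{t} \\[2ex]
\dfrac{\partial H_s}{\partial t} + W_s \times H_s + V_s \times P_s = M_s^{aero} + M_s^{G} + M_s^{t} + M_s^{f}
\end{cases}
\tag{2}
$$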
where $P$ and $H$ denote momentum and angular momentum; $F$ and $M$ denote force and moment; and the superscripts $aero$, $G$, $t$ and $f$ denote aerodynamics, gravity, suspension-line tension and friction, respectively. The momentum and angular momentum of the load and the parafoil can be written as:
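$$
\begin{cases}
P_w = m_w V_w \\
H_w = J_w W_w
\end{cases}
\tag{3}
$$

$$
\begin{bmatrix} P_s \\ H_s \end{bmatrix}
= [A_a + A_r] \begin{bmatrix} V_s \\ W_s \end{bmatrix}
= \begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix} \begin{bmatrix} V_s \\ W_s \end{bmatrix}
\tag{4}
$$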
Here $[V_w, W_w]$ and $[V_s, W_s]$ are the velocity and angular velocity of the system in the payload and canopy coordinate systems; $A_a$ and $A_r$ are the apparent (added) mass matrix and the real mass matrix of the canopy; $m_w$ is the mass of the load; and $J_w$ is the moment of inertia of the load. In Eqs. (1) and (2), the tension in the suspension lines and the control lines is a pair of action and reaction forces acting on the load and the parafoil, which can be written as:
$$
F_s^{t} = -T_{w-s} F_w^{t}
\tag{5}
$$
where $T_{w-s}$ is the transformation matrix from the payload coordinate system to the canopy coordinate system.
2.2 Velocity and Angular Velocity Constraints

Since the recovered load is connected to the parafoil by suspension lines, the velocities and angular velocities of the two bodies are constrained by:

$$
V_w + W_w \times L_{w-c} = V_s + W_s \times L_{s-c}
\tag{6}
$$
where $L_{w-c}$ is the position vector from the center of gravity of the load to the point $C_m$, and $L_{s-c}$ is the position vector from the center of gravity of the canopy to the point $C_m$. Differentiating Eq. (6) with respect to time gives the constraint on the accelerations:

$$
\dot V_s - L_{s-c} \times \dot W_s - T_{w-s} \dot V_w + T_{w-s}\left(L_{w-c} \times \dot W_w\right)
= T_{w-s}\left[W_w \times (V_w + W_w \times L_{w-c})\right] - W_s \times (V_s + W_s \times L_{s-c})
\tag{7}
$$

The Euler-angle constraint between the load and the parafoil can be expressed as:

$$
W_w = W_s + \tau_s + \kappa_w
\tag{8}
$$
where $\tau_s = [\,0\ \ 0\ \ \dot\psi_r\,]^T$ and $\kappa_w = [\,0\ \ \dot\theta_r\ \ 0\,]^T$ are the relative yaw and relative pitch angular velocities of the recovered load with respect to the parafoil. Similarly, differentiating Eq. (8) gives:

$$
T_{w-s} \dot W_w - \dot W_s - T_{w-s} \dot\kappa_w - \dot\tau_s
= (W_s - T_{w-s} W_w) \times W_s - (T_{w-s} \kappa_w) \times \tau_s
\tag{9}
$$
Finally, based on Eqs. (5) and (9), the relative motion between the load and the canopy can be expressed as:

$$
A_1 \dot V_s + A_2 \dot W_s + T_{w-s} m_w \dot V_w
= F_s^{aero} + F_s^{G} + T_{w-s}\left(F_w^{G} + F_w^{aero} + F_w^{th}\right)
- W_s \times (A_1 V_s + A_2 W_s) - T_{w-s}\left(W_w \times m_w V_w\right)
\tag{10}
$$
Combining the above equations, the dynamic model of the parafoil system can be obtained.
Fig. 2 Structure diagram of SMC based on ESO
3 Design of SMC Based on ESO The homing controller of the parafoil system designed in this paper mainly uses the ESO to observe the heading angle, and then compensates for it in the sliding mode controller. The controller designed in this article is shown in Fig. 2.
3.1 Extended State Observer Design

Homing control of the parafoil requires real-time position information. Figure 3 shows the real-time position of the parafoil, the heading angle of the parafoil, the heading angle of the target, and the heading deviation. The heading-angle dynamics of the parafoil are simplified to a second-order differential form:

$$
\ddot\psi = f + b_0 u
\tag{11}
$$

where $f$ is the total disturbance of the system, $b_0$ is the control gain, and $u$ is the downward deflection of the parafoil flap. Define $\eta = \dot f(\psi, \dot\psi)$. Equation (11) can then be written in state-space form:

$$
\begin{cases}
\dot x_1 = x_2 \\
\dot x_2 = x_3 + b_0 u \\
\dot x_3 = \eta \\
y = x_1
\end{cases}
\tag{12}
$$
Fig. 3 Schematic diagram
where $x_1$ is the heading angle of the parafoil, $x_2$ is the first derivative of the heading angle, and $x_3 = f$ is the extended state. A third-order linear extended state observer can then be designed:

$$
\begin{cases}
\dot z_1 = z_2 + L_1 (y - z_1) \\
\dot z_2 = z_3 + b_0 u + L_2 (y - z_1) \\
\dot z_3 = L_3 (y - z_1)
\end{cases}
\tag{13}
$$

where $[\,L_1\ \ L_2\ \ L_3\,]^T = [\,3\omega_0\ \ 3\omega_0^2\ \ \omega_0^3\,]^T$ is the observer gain and $\omega_0$ is the bandwidth of the observer; adjusting $\omega_0$ changes the estimation ability of the observer. With this observer, $z_i \to x_i$ $(i = 1, 2, 3)$ within a limited time.
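The observer of Eq. (13) can be exercised on a toy plant to see the extended state $z_3$ track the disturbance. The sketch below is only illustrative: the plant is the simplified heading model of Eq. (11) with a constant disturbance, integration is forward Euler, and the numerical values (bandwidth, step size, test input) are assumptions chosen for fast convergence rather than the settings used in the paper.

```python
import math


def simulate_leso(omega0=5.0, b0=1.3, f=0.5, dt=0.001, t_end=5.0):
    """Run the linear ESO of Eq. (13) against psi'' = f + b0*u.

    Returns the final plant state (x1, x2, f) and observer state (z1, z2, z3).
    """
    # Observer gains from the bandwidth parameterization [3w0, 3w0^2, w0^3].
    l1, l2, l3 = 3.0 * omega0, 3.0 * omega0 ** 2, omega0 ** 3
    x1 = x2 = 0.0            # true heading angle and heading rate
    z1 = z2 = z3 = 0.0       # observer estimates
    for k in range(int(round(t_end / dt))):
        u = 0.1 * math.sin(k * dt)   # arbitrary test input (assumption)
        err = x1 - z1                # output estimation error y - z1
        # Evaluate all derivatives from the current state before updating.
        dx1, dx2 = x2, f + b0 * u
        dz1 = z2 + l1 * err
        dz2 = z3 + b0 * u + l2 * err
        dz3 = l3 * err
        x1, x2 = x1 + dt * dx1, x2 + dt * dx2
        z1, z2, z3 = z1 + dt * dz1, z2 + dt * dz2, z3 + dt * dz3
    return (x1, x2, f), (z1, z2, z3)


plant, estimate = simulate_leso()
```

With these gains all three observer poles sit at $-\omega_0$, so a larger bandwidth gives faster estimation at the cost of higher sensitivity to measurement noise.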
3.2 Design of Sliding Mode Controller

The sliding surface for parafoil heading control is designed as:

$$
s = c e + \dot e
\tag{14}
$$

where $c > 0$ and $e = e_\psi = \psi - \psi_d$. The reaching law is taken as $\dot s = -k_1 s - k_2\,\mathrm{sgn}(s)$. The SMC based on the ESO can then be designed as:

$$
u = \frac{1}{b_0}\left(-k_1 \hat s - k_2\,\mathrm{sgn}(s) - \hat v - \hat f\right)
\tag{15}
$$
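where $\hat v = c\,\hat{\dot e} - \ddot\psi_d$, $\hat e = z_1 - \psi_d$, $\hat{\dot e} = z_2 - \dot\psi_d$ and $\hat s = c\,\hat e + \hat{\dot e}$; $\hat f$ is the ESO estimate of the total disturbance $f$ (the state $z_3$ in Eq. (13)); $\mathrm{sgn}(\cdot)$ is the sign (switching) function; and $k_1$ and $k_2$ are the controller parameters.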
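3.3 Convergence of the Controller

Select the Lyapunov function $V_s = \frac{1}{2} s^2$. Differentiating $V_s$ yields:

$$
\begin{aligned}
\dot V_s = s \dot s &= s\left(c \dot e + b_0 u + f - \ddot\psi_d\right) \\
&= s\left(-k_1 (s - \tilde s) - k_2\,\mathrm{sgn}(s) + \tilde v + \tilde f\right) \\
&= s\left(-k_1 s - k_2\,\mathrm{sgn}(s) + E\right) \\
&= -k_1 s^2 - k_2 |s| + s E
\end{aligned}
\tag{16}
$$

where $\tilde s = s - \hat s$, $\tilde v = v - \hat v$, $\tilde f = f - \hat f$ and $E = k_1 \tilde s + \tilde v + \tilde f$. Since $sE \le |s|\,|E|$, choosing the switching gain such that $k_2 \ge |E|$ gives $\dot V_s \le -k_1 s^2 - (k_2 - |E|)\,|s|$, and therefore:

$$
\dot V_s(t) \le 0
\tag{17}
$$

The closed-loop system therefore converges asymptotically, which realizes homing control of the parafoil.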
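A minimal closed-loop sketch of the controller in Eq. (15) combined with the observer in Eq. (13) is given below. It is an illustration under stated assumptions, not the authors' simulation: the plant is the simplified heading model of Eq. (11) rather than the eight-degree-of-freedom model, the disturbance and the desired heading are constant, the switching term uses $\mathrm{sgn}(\hat s)$ because only estimates are available to the controller, and all numerical values are chosen for this example.

```python
def sign(x):
    """Sign function returning -1, 0 or 1."""
    return (x > 0) - (x < 0)


def simulate_smc_eso(c=1.0, k1=2.0, k2=0.05, b0=1.3, omega0=10.0,
                     f=0.2, psi_d=1.0, dt=0.001, t_end=20.0):
    """Heading control of psi'' = f + b0*u with SMC plus ESO compensation.

    Returns the final heading angle and the final disturbance estimate.
    """
    l1, l2, l3 = 3.0 * omega0, 3.0 * omega0 ** 2, omega0 ** 3
    psi = dpsi = 0.0          # true heading angle and rate
    z1 = z2 = z3 = 0.0        # ESO estimates of psi, dpsi and f
    for _ in range(int(round(t_end / dt))):
        # psi_d is constant, so its first and second derivatives are zero.
        e_hat = z1 - psi_d
        de_hat = z2
        s_hat = c * e_hat + de_hat
        v_hat = c * de_hat
        # Control law of Eq. (15) with f_hat = z3 (sgn applied to s_hat).
        u = (-k1 * s_hat - k2 * sign(s_hat) - v_hat - z3) / b0
        err = psi - z1
        # Evaluate all derivatives from the current state before updating.
        d_psi, d_dpsi = dpsi, f + b0 * u
        dz1 = z2 + l1 * err
        dz2 = z3 + b0 * u + l2 * err
        dz3 = l3 * err
        psi, dpsi = psi + dt * d_psi, dpsi + dt * d_dpsi
        z1, z2, z3 = z1 + dt * dz1, z2 + dt * dz2, z3 + dt * dz3
    return psi, z3


final_heading, f_hat = simulate_smc_eso()
```

Because the disturbance estimate is subtracted inside the control law, the switching gain $k_2$ only has to dominate the residual estimation error $E$ rather than the full disturbance, which keeps chattering small.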
4 Simulation Experiments

4.1 Simulation Environment

To verify the effectiveness of the above method, a homing control simulation of an airdropped parafoil is carried out. The main structural parameters of the parafoil are shown in Table 1. The origin of the coordinate system is taken as the target landing point, and a real-time wind of 2 m/s is added in the positive direction of the y axis. In addition, the error of the positioning system during actual flight is taken into account by adding white noise with an average value of 1 m. PID, SMC and SMC+ESO are used in the control experiments. The PID controller parameters are set to $k_p = 0.5$, $k_i = 0.1$, $k_d = 1$. The parameters of the SMC are set to $c = 0.278$, $k_1 = 0.395$, $k_2 = 0.905$. The control frequency of the sliding mode controller is set to 0.5 Hz. The observation frequency of the ESO is 5 Hz, and its parameters are set to $b_0 = 1.3$, $\omega_0 = 0.4$.

Table 1 Structural parameters of the airdropped parafoil

Parameter name   Numeric value
Parafoil mass    0.3 kg
Load mass        12 kg
Span             2.3 m
Chord length     0.7 m
Parafoil area    1.6 m²
4.2 Simulation Results

With the initial position (−500, 650, 600) and the initial speed (−6, 0, 3), the parafoil system was simulated with the PID, SMC and SMC+ESO algorithms. The horizontal trajectories and control quantities are shown in Figs. 4 and 5. The dotted line in Fig. 4 is the initial reference heading trajectory, and the two circles centered on the target point have radii of 30 m and 60 m. In the initial stage, the SMC+ESO controller achieves homing control more quickly than PID and SMC, with a faster convergence speed and a smoother homing trajectory. After the parafoil system crosses the target point, the SMC+ESO controller responds quickly and performs height adjustment and circling near the target point, giving a landing point error of 19.4 m. The PID and SMC controllers react only 10 s after crossing the target point, and their landing point errors are 42.19 m and 48.25 m, respectively. SMC+ESO thus converges quickly, overcomes wind disturbances and has better control performance. Based on the SMC+ESO controller, several simulation experiments of parafoil landing point control from random initial states were carried out to verify the effectiveness of the SMC based on the ESO. The initial states of the five experiments are given in Table 2.
Fig. 4 Horizontal trajectory
Fig. 5 Control quantity
Table 2 The initial state of the airdropped parafoil

Description   Initial position       Initial speed
Test1         [−450, −850, 670]      [0, 6, 3]
Test2         [750, −450, 620]       [−2, 3, 3]
Test3         [−400, 650, 640]       [2, −3, 3]
Test4         [650, 650, 600]        [6, 0, 3]
Test5         [−300, −850, 700]      [6, 0, 3]
Figures 6 and 7 show the 3D and horizontal trajectories. The multiple flight trajectories show that the parafoil system estimates the impact of the wind field and other environmental factors through the observer and compensates for it in the controller. The heading angle is therefore corrected quickly, so that the parafoil flies towards the target landing point in real time; even if it overshoots the target point, it quickly corrects the heading angle and finally lands near the desired target area. The smallest landing point error was only 6.09 m, the largest was 16.64 m, and the average was 10.71 m. In complex environments, the designed controller thus has not only a fast convergence speed but also good control accuracy and robustness. As can be seen from the control plot in Fig. 8, the control quantity oscillates slightly in the initial phase, but then converges and maintains stable flight. In the first, second, third and fifth experiments, the parafoil eventually flies over the target point, so the controller drives it into a spiral descent to reduce altitude, and the control line on one side of the parafoil is fully deflected. In the fourth simulation experiment, the parafoil did not fly over the target point and simply landed near the target area.
Fig. 6 3D trajectory of parafoil system Fig. 7 Control quantity
5 Conclusion

In this paper, an ESO-based sliding mode landing point control method is designed for the eight-degree-of-freedom parafoil model under complex conditions such as wind fields, position noise and control delay. The ESO is designed to observe the internal disturbance of the system and the external environmental disturbance. On this basis, the sliding surface and the exponential reaching law are constructed, the observed disturbance is compensated in the SMC, and the convergence of the controller is proved. Simulation comparisons with PID and traditional SMC verify that the designed ESO-based SMC has excellent control performance. In addition, the ideal landing point
Fig. 8 Horizontal trajectory of parafoil system
position is finally reached under the controller from different initial states of the parafoil, and the average landing point error is only 10.71 m, which verifies the accuracy of the parafoil model and the effectiveness of the control algorithm. This paper can provide a theoretical reference for further research on the homing control of airdropped parafoils.
References
1. Slegers, N., Costello, M.: Model predictive control of a parafoil and payload system. J. Guidance Control Dyn. 28(4), 816–821 (2005). https://doi.org/10.2514/1.12251
2. Carter, D., Singh, L., Wholey, L.: Band-limited guidance and control of large parafoils. In: 20th AIAA Aerodynamic Decelerator Systems Technology Conference and Seminar, p. 2981 (2009). https://doi.org/10.2514/6.2009-2981
3. Leon, B.L., Wachlin, J., Ward, M.B.: A complete in-canopy system for autonomous aerial delivery. In: AIAA Aviation 2019 Forum, p. 3282 (2019). https://doi.org/10.2514/6.2019-3282
4. Tanaka, K., Tanaka, M., Iwase, A., Wang, H.O.: A rational polynomial tracking control approach to a common system representation for unmanned aerial vehicles. IEEE/ASME Trans. Mechatron. 25(2), 919–930 (2020). https://doi.org/10.1109/TMECH.2020.2965576
5. Moriyoshi, T., Yamada, K., Nishida, H.: The effect of rigging angle on longitudinal direction motion of parafoil-type vehicle: basic stability analysis and wind tunnel test. Int. J. Aerosp. Eng. 2020, 1–16 (2020). https://doi.org/10.1109/TMECH.2020.2965576
6. Xiaojun, X., Yichen, H., Guozheng, F.: Research on optimal trajectory planning algorithm for accurate recovery of rocket substages. J. Northwest. Polytechnical Univ. 40(1) (2022). https://doi.org/10.3969/j.issn.1000-2758.2022.01.008
Nail Piece Detection Based on Lightweight Deep Learning Network
Chen Zhao, Chunbo Xiu, and Xin Ma
Abstract To reduce the parameters and simplify the structure of the nail piece detection model used in industrial production, a lightweight learning network based on the YOLOv5 model is designed. MobileNetv3 replaces the backbone network of YOLOv5, which reduces the model parameters and the computation. To fuse multi-scale feature information, the FPN+PAN structure is replaced by a simplified BiFPN structure. The expressiveness of the feature map is enriched by introducing the spatial pyramid pooling structure. The CBAM lightweight attention mechanism is used to strengthen useful information and suppress irrelevant information. Experimental results show that the average precision of the proposed model is 99.3%, the number of model parameters is 6.5M, and the model computation is 12.6G FLOPs. Compared with YOLOv5, the number of model parameters is decreased by 86.4%, the amount of computation is decreased by 89% and the detection accuracy is improved by 1.1%. The improved nail piece detection model can be applied to the automatic sorting system of nail piece production and meets the performance requirements of industrial production.
Keywords YOLOv5 · MobileNetv3 · BiFPN · Attention mechanism · Nail piece
C. Zhao (B) · X. Ma
Tiangong University, School of Control Science and Engineering, No. 399 Binshui West Road, Xiqing District, Tianjin 300387, People’s Republic of China
e-mail: [email protected]
C. Xiu
Tiangong University, Tianjin Key Laboratory of Intelligent Control of Electrical Equipment, No. 399 Binshui West Road, Xiqing District, Tianjin 300387, People’s Republic of China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_44

1 Introduction

With the development of computer vision, target detection algorithms based on deep learning have been widely used in industrial production. However, target detection is still a challenging problem in machine vision due to some complex factors such as
illumination changes, occlusion, pose variation, and so on [1]. At present, the identification of nail pieces in industrial production relies mainly on manual operation, which leads to low production efficiency and high cost. Target detection based on deep learning can locate nail pieces in batches and improve the accuracy of nail piece recognition, which is of great significance for improving the production efficiency of nail pieces. In industrial production, the detection accuracy of nail pieces is required to be no less than 99%, so many detection models fail to meet the requirements of industrial applications. Generally, target detection methods can be divided into two categories: one-stage and two-stage detectors. One-stage detectors extract feature information directly through neural networks to perform target detection and classification. Common one-stage detectors include the YOLO series [2–6], EfficientDet [7] and SSD [8]. Two-stage detectors generate a series of candidate boxes from the feature map and then classify the samples through neural networks. Common two-stage detectors include Fast R-CNN [9], Faster R-CNN [10], etc. Compared with the two-stage detector, the one-stage detector has a faster detection speed but lower detection accuracy. At present, there are two main ways to design lightweight neural networks. One is the manual design of lightweight network models. For example, in ShuffleNet [11], pointwise group convolution and channel shuffle operations are designed manually to preserve feature extraction while reducing the amount of computation. The other is to design the network automatically with neural architecture search. For example, NASNet [12] proposed the Scheduled Drop Path regularization technique and a new search space, and the automatically searched structure was better than manually designed network structures with the same number of parameters.
However, searching for the best model with neural architecture search requires a lot of GPU resources, with hundreds of GPUs running at the same time. In the actual industrial production of nail pieces, the detection accuracy needs to be no less than 99%, but current target detection models cannot reach the required accuracy. Deploying a lightweight nail piece detection model with an accuracy higher than 99% in actual industrial production can therefore improve production efficiency and reduce costs. At present, few mature lightweight networks for nail piece detection are ready for industrial production. Therefore, an improved target detection network based on YOLOv5 is constructed for nail piece detection. The improved network uses MobileNetv3 to replace the backbone network of YOLOv5, uses the improved BiFPN structure to replace the feature fusion structure, and introduces the CBAM and SPP structures. Thus, the parameters of the detection network are reduced and the detection speed is increased.
2 Improved Nail Piece Detection Method

2.1 Improvement of Backbone Network

In actual nail piece production systems, the detection and positioning of nail pieces must take less than 45 ms, so a detection network with a complex structure cannot meet the requirements of real-time detection. The backbone network, as the main component of the model, plays an important role in feature extraction. Although YOLOv5L has high accuracy, its complex backbone network leads to slow detection. Therefore, a lightweight network with few parameters and good performance is introduced into YOLOv5L to form a new network structure. Compared with other networks, MobileNetv3 has fewer parameters, a faster detection speed and better performance in target detection tasks, so MobileNetv3 is used to replace the backbone network of YOLOv5L for nail piece detection. MobileNetv3 [13] retains the inverted residual structure of MobileNetv2 [14] and the depthwise separable convolution structure of MobileNet [15], adopts the h-swish activation function, and reduces the number of convolutional kernels in the first layer to 16. MobileNetv3 uses Neural Architecture Search (NAS) [16] to optimize the network structure and introduces a lightweight channel attention mechanism based on the Squeeze-and-Excitation (SE) structure [17] to strengthen attention to valuable information and suppress irrelevant information.
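Two of the ingredients named above can be made concrete with a short Python sketch: the h-swish activation, defined as $x \cdot \mathrm{ReLU6}(x + 3) / 6$, and the parameter saving of a depthwise separable convolution compared with a standard convolution. The function names and the example layer sizes are chosen for illustration only, and bias terms are ignored.

```python
def relu6(x):
    """ReLU clipped to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def h_swish(x):
    """Hard-swish activation: x * ReLU6(x + 3) / 6."""
    return x * relu6(x + 3.0) / 6.0


def conv_params(k, c_in, c_out):
    """Weights of a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights of a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


# Usage: a hypothetical 3 x 3 layer mapping 16 channels to 32 channels.
standard = conv_params(3, 16, 32)
separable = depthwise_separable_params(3, 16, 32)
saving = 1.0 - separable / standard
```

For this hypothetical layer the separable form needs 656 weights instead of 4608, which illustrates why replacing the backbone reduces the parameter count so sharply.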
2.2 Feature Fusion Structure

2.2.1 Improved BiFPN Structure

The FPN+PAN structure is used in YOLOv5 to fuse feature information. FPN [18] transfers strong semantic features from top to bottom, and PAN [19] transfers strong localization features from bottom to top. Because FPN and PAN are connected in cascade, feature information at different scales cannot be fully fused, which limits the detection accuracy. To fuse multi-scale feature information without adding too much complexity, the BiFPN structure in EfficientDet is improved and used to replace the FPN+PAN structure. The improved BiFPN structure is shown in Fig. 1: Fig. 1a is the BiFPN structure, and Fig. 1b is the improved BiFPN structure. In Fig. 1b, P3, P4 and P5 represent the feature maps output by the backbone at layers 6, 13 and 15.
(a) The structure of BiFPN
(b) The structure of the improved BiFPN
Fig. 1 The BiFPN improvement process
Fig. 2 The structure of SPP
In the BiFPN structure, the input layer P7 does not contribute to multi-scale feature fusion through cross-node connections, so this layer can be deleted to simplify the bidirectional feature fusion structure. In addition, the prediction side of the YOLOv5 network has only three layers, so the BiFPN structure is further simplified by removing the input layer P6 to match the prediction side of YOLOv5.

Spatial Pyramidal Pooling Structure

The SPP module in the last layer of the YOLOv5 backbone network is implemented with kernels of different sizes and a uniform stride. The SPP structure is shown in Fig. 2.
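The idea behind SPP can be sketched in plain Python: the same feature map is max-pooled with several kernel sizes at stride 1 and with "same" padding, and the results are stacked with the input as extra channels. The kernel sizes 5, 9 and 13 are the values commonly used in YOLOv5 and are an assumption here, as are the function names; the surrounding convolutions of the real module are omitted.

```python
def max_pool_same(fmap, k):
    """Stride-1 max pooling of a 2-D map; the output has the same size.

    Windows are clipped at the border, which is equivalent to padding
    with minus infinity.
    """
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    return [[max(fmap[a][b]
                 for a in range(max(i - r, 0), min(i + r + 1, h))
                 for b in range(max(j - r, 0), min(j + r + 1, w)))
             for j in range(w)]
            for i in range(h)]


def spp(channels, kernel_sizes=(5, 9, 13)):
    """Concatenate the input channels with their pooled versions."""
    out = list(channels)
    for k in kernel_sizes:
        out.extend(max_pool_same(ch, k) for ch in channels)
    return out


# Usage: one 6 x 6 channel with values 0..35.
channel = [[i * 6 + j for j in range(6)] for i in range(6)]
pooled = spp([channel])
```

Small kernels keep local detail while the largest kernel covers nearly the whole map, so the concatenated output mixes local and global context at the same spatial resolution.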
Fig. 3 The structure of CBAM
To further enhance the fusion of feature information at different scales, the SPP structure is retained. The SPP module processes the feature maps by max pooling with different kernel sizes to fuse local and global features. SPP modules are inserted into the last three layers of the improved BiFPN structure to enlarge the receptive field and fuse feature information of different scales.

CBAM Lightweight Attention Mechanism

The network contains redundant, irrelevant feature information that does not contribute to nail piece recognition and is computationally expensive during training. The attention mechanism is introduced to weight the feature information, which reduces the attention paid to irrelevant information and improves the efficiency and accuracy of image processing. CBAM [20] can be inserted into any CNN architecture as a lightweight attention module and applies attention in both the channel and spatial dimensions, which effectively enhances important information and suppresses irrelevant information. Therefore, the CBAM structure is inserted between the BiFPN and the prediction side to improve the feature extraction capability of the network. The CBAM structure is shown in Fig. 3. The CBAM lightweight attention mechanism consists of two modules: the Channel Attention Module (CAM) and the Spatial Attention Module (SAM). The two modules are connected in series and process feature information in the channel and spatial dimensions, respectively.
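The serial channel-then-spatial arrangement can be illustrated with a small NumPy sketch of the forward pass. It follows the general CBAM formulation (a shared two-layer MLP over average-pooled and max-pooled channel descriptors, then a k × k convolution over channel-wise average and max maps); the weight shapes, the reduction ratio, the random weights and the function names are assumptions of this sketch, not the trained module used in the paper.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Channel weights in (0, 1) for a (C, H, W) feature map.

    w1 has shape (C // r, C) and w2 has shape (C, C // r).
    """
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))

    def mlp(v):
        # Shared two-layer perceptron with a ReLU in between.
        return w2 @ np.maximum(w1 @ v, 0.0)

    return sigmoid(mlp(avg) + mlp(mx))


def spatial_attention(feat, kernel):
    """Spatial weights in (0, 1); kernel has shape (2, k, k)."""
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    resp = np.zeros((h, w))
    for di in range(k):
        for dj in range(k):
            window = padded[:, di:di + h, dj:dj + w]
            resp += np.tensordot(kernel[:, di, dj], window, axes=1)
    return sigmoid(resp)


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, in series."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]


# Usage with random weights (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(8, 5, 5))
refined = cbam(features,
               rng.normal(size=(2, 8)),     # reduction ratio r = 4
               rng.normal(size=(8, 2)),
               rng.normal(size=(2, 7, 7)))
```

Both attention maps only rescale the features by factors between 0 and 1, so the module changes the emphasis of the feature map without changing its shape, which is what allows it to be inserted between the BiFPN and the prediction layers.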
2.3 Improved YOLOv5 Model

The improved nail piece detection network is mainly composed of the backbone network, the feature fusion structure and the prediction head. The improved model is shown in Fig. 4. In Fig. 4, MobileNetv3 replaces the backbone network of YOLOv5, which reduces the model parameters and the computation. The improved BiFPN structure replaces the FPN+PAN structure to fuse multi-scale feature information. The SPP module is added to the last three layers of the BiFPN to enrich the expressiveness of the feature map, and the CBAM lightweight attention mechanism is added between the BiFPN structure and the prediction layer to enhance important information and suppress irrelevant information.
Fig. 4 The structure of improved YOLOv5 model
3 Experiment Results

3.1 Comparative Experiments

To show the advantages of the improved model over other models, experiments were conducted on the nail piece dataset on the same hardware platform to compare the computation, the number of parameters and the size of the generated weights. The comparison results are shown in Table 1.

Table 1 Comparison of calculation, model parameters and weights

Model                 Calculation (G FLOPs)   Parameter (M)   Weights (M)
YOLOv5L               114.6                   47.8            89.3
YOLOX [21]            155.6                   54.2            413
Swin-T-MASK R-CNN     267                     48              512
PP-Picodet            16.81                   5.8             22.1
YOLOv6L6              673.4                   140.4           269
DINO-5scale           860                     47              539
The improved model    12.6                    6.5             12.7

In terms of floating-point computation, the improved model requires 89% less than YOLOv5L, 25% less than the PP-Picodet model and 91.9% less than the YOLOX model, and far less than the YOLOv6L6, DINO-5scale and Swin-T-MASK R-CNN models. In terms of model parameters, the improved model has 86.4% fewer than YOLOv5L, 10.8% more than the PP-Picodet model, 88% fewer than YOLOX, 92.4% fewer than the Swin-T-MASK R-CNN model, 95.4% fewer than YOLOv6L6 and 86.2% fewer than DINO-5scale. In terms of the generated weights, the improved model is 85.8% smaller than YOLOv5L and 42.5% smaller than PP-Picodet, and far smaller than YOLOX, Swin-T-MASK R-CNN, YOLOv6L6 and DINO-5scale. Compared with YOLOv5L, the FLOPs, the number of parameters and the size of the generated weights are thus significantly reduced, and the improved model also has significant advantages over the other models.

The improved model is then compared with the other models in terms of detection speed and detection accuracy. The experiments are conducted on the GTX1650 server, all input images are 448 × 448 in size, and all models are trained for 200 epochs. The comparison results of AP and detection speed are shown in Table 2, where the detection time is the average detection time of a single image.

Table 2 Comparison of AP and detection speed

Model                 Backbone                Detection time (ms)   AP (%)
YOLOv5L               Modified CSPDarknet53   191                   98.2
YOLOX                 Modified CSP V5         49                    87.8
Swin-T-MASK R-CNN     Swin-T                  330                   98.6
PP-Picodet            Picodet-L               147                   85.6
YOLOv6L6              CSPRepBackbone          48                    98.7
DINO-5scale           ResNet50                100                   96.6
The improved model    MobileNetv3             42                    99.3

From Table 2, the detection time of the improved model is 78% less than that of YOLOv5L, 71.4% less than that of PP-Picodet, 14.3% less than that of YOLOX, 12.5% less than that of YOLOv6L6 and 58% less than that of DINO-5scale, and far less than that of Swin-T-MASK R-CNN. In terms of detection accuracy, the improved model is 1.1% higher than YOLOv5L, 13.7% higher than the light model PP-Picodet, 11.5% higher than YOLOX, 0.7% higher than the more complex Swin-T-MASK R-CNN model, 0.6% higher than the YOLOv6L6 model, and 2.7% higher than the DINO-5scale model. Compared with other models, the improved model not only improves the detection accuracy, but also the
556
C. Zhao et al.
(a) Original Image
(b) YOLOv5L
(c) Swin-T-MASK R-CNN
(d) YOLOX
(e) PP-Picodet
(f) YOLOv6L6
(g) DINO-5scale
(h) The improved model
Fig. 5 Comparison of nail piece recognition results between the improved model and other models Table 3 Comparison of calculation, model parameters and weights MobileNetv3 BiFPN SPP CBAM Parameter (M)
×
× ×
× × ×
5.2 5.3 6.4 6.5
Detection time(/ms)
AP (%)
25 27 31 42
98.1 98.2 98.3 99.3
detection speed is significantly faster than other models. The comparison results are shown in Fig. 5. Figure 5 is a scene diagram of overlapping nail pieces in the sorting process. In or-der to achieve automatic sorting, it is necessary to accurately locate the position of the top nail pieces. From the detection results, YOLOv5L and Swin-T-MASK RCNN have repeated detection, YOLOX cannot accurately detect the position of nail pieces. DINO-5scale, YOLOv6L6 and PP-Picodet has missed detection. In contrast, the improved model can accurately detect and locate all positions of the top nail pieces, and avoid the occurrence of missed detection or repeated detection.
3.2 Ablation Experiments Ablation experiments are done to verify the effects of various improved strategies. The experiment results are shown in Table 3. From Table 3, after replacing the backbone network, the model parameters are simplified and the detection time is significantly shortened, but the detection accuracy is reduced due to the weakening of the feature extraction ability. After the introducing BiFPN module, the feature fusion structure can fuse multi-scale feature information, so the detection accuracy can be effectively improved. On this basis, when SPP module and CBAM module are further introduced, although the
Nail Piece Detection Based on Lightweight Deep Learning Network
557
complexity of the network in-creases slightly, the detection accuracy can be further improved. Above all, compared with YOLOv5L, the complexity of the improved network is significantly reduced, and the detection accuracy is improved.
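The relative reductions quoted in Sect. 3.1 follow directly from the values in Tables 1 and 2. The short sketch below reproduces a few of them; the helper name `relative_reduction` is introduced here for illustration only, and the numbers are transcribed from the tables.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# (baseline value, improved-model value), transcribed from Tables 1 and 2
comparisons = {
    "FLOPs vs YOLOv5L": (114.6, 12.6),
    "FLOPs vs PP-Picodet": (16.81, 12.6),
    "FLOPs vs YOLOX": (155.6, 12.6),
    "Parameters vs YOLOv5L": (47.8, 6.5),
    "Weights vs YOLOv5L": (89.3, 12.7),
    "Detection time vs YOLOv5L": (191, 42),
}

reductions = {name: round(relative_reduction(base, new), 1)
              for name, (base, new) in comparisons.items()}

for name, pct in reductions.items():
    print(f"{name}: {pct}% lower")
```

The AP comparison is different in kind: the accuracy gains are plain differences between AP values (for example 99.3 − 98.2 = 1.1), not relative changes.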
4 Conclusions
An improved nail piece detection model based on YOLOv5L is proposed to address the large number of parameters, long detection time and large amount of calculation of the target detection model. Based on the YOLOv5L model, the lightweight network MobileNetv3 is used as the backbone. The SPP structure, the CBAM structure and the improved BiFPN structure are introduced into the network to improve the recognition accuracy. The experimental results show that, compared with YOLOv5L, the accuracy of the improved model is increased by 1.1 percentage points, the number of model parameters is reduced by 86.4%, and the detection time is reduced by 78%. The improved model shows strong robustness and high detection accuracy in nail piece detection. The average accuracy of the improved model in nail piece detection is 99.3%, which meets the industry requirements.
References
1. Jun, W., et al.: Deep learning for object detection: a survey. Comput. Syst. Sci. Eng. 38(2), 165–182 (2021). https://doi.org/10.32604/csse.2021.017016
2. Qiu, Z., et al.: Automatic visual defects inspection of wind turbine blades via YOLO-based small object detection approach. J. Electron. Imaging 28(4), 043023 (2019). https://doi.org/10.1117/1.JEI.28.4.043023
3. Degui, X., et al.: Robust license plate detection and recognition with automatic rectification. J. Electron. Imaging 30(1), 013002 (2021). https://doi.org/10.1117/1.JEI.30.1.013002
4. Shi, T., et al.: Underwater targets detection and classification in complex scenes based on an improved YOLOv3 algorithm. J. Electron. Imaging 29(4), 043013 (2020). https://doi.org/10.1117/1.JEI.29.4.043013
5. Li, X., et al.: Improved YOLOv4 network using infrared images for personnel detection in coal mines. J. Electron. Imaging 31(1), 013017 (2022). https://doi.org/10.1117/1.JEI.31.1.013017
6. Kim, J.H., et al.: Object detection and classification based on YOLO-V5 with improved maritime dataset. J. Mar. Sci. Eng. 10(3), 377 (2022). https://doi.org/10.3390/jmse10030377
7. Mouna, A., et al.: An evaluation of EfficientDet for object detection used for indoor robots assistance navigation. J. Real-Time Image Process. 19(3), 651–661 (2022). https://doi.org/10.1007/S11554-022-01212-4
8. Gao, X., et al.: Detection of lower body for AGV based on SSD algorithm with ResNet. Sensors 22(5), 2008 (2022). https://doi.org/10.3390/s22052008
9. Shin, L.Y., Park, W.H.: Diagnosis of depressive disorder model on facial expression based on fast R-CNN. Diagnostics 12(2), 317 (2022). https://doi.org/10.3390/diagnostics12020317
10. Liu, S., et al.: Method for detecting Chinese texts in natural scenes based on improved faster R-CNN. Int. J. Pattern Recogn. Artif. Intell. 34(2), 2053002 (2020). https://doi.org/10.1142/S021800142053002X
11. Wang, Y., et al.: ShuffleNet-based comprehensive diagnosis for insulation and mechanical faults of power equipment. High Voltage 6(5), 861–872 (2021). https://doi.org/10.1049/hve2.12035
12. Tariq, S., et al.: Brain tumor detection and multi-classification using advanced deep learning techniques. Microsc. Res. Tech. 84(6), 1296–1308 (2021). https://doi.org/10.1002/jemt.23688
13. Mohamed, A.E., et al.: Boosting COVID-19 image classification using MobileNetV3 and aquila optimizer algorithm. Entropy 23(11), 1383 (2021). https://doi.org/10.3390/e23111383
14. Ganesan, G.G., Arun, C.A.: Anti-raider ATM system using Mobilenetv2. Int. J. Smart Secur. Technol. 9(1), 1–9 (2022). https://doi.org/10.4018/IJSST.287871
15. Tangudu, et al.: COVID-19 detection from chest x-ray using MobileNet and residual separable convolution block. Soft Computing 26(5), 2197–2208 (2022). https://doi.org/10.1007/s00500-021-06579-3
16. Kohei, N., Matsubara, T., Uehara, K.: Neural architecture search for convolutional neural networks with attention. IEICE Trans. Inf. Syst. 104(2), 312–321 (2021). https://doi.org/10.1587/transinf.2020EDP7111
17. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. IEEE Trans. Pattern Anal. Mach. Intell. 42(8), 2011–2023 (2020). https://doi.org/10.1109/tpami.2019.2913372
18. Ngoc, Q.T., Lee, S., Song, B.C.: Object detection using improved bi-directional feature pyramid network. Electronics 10(6), 746 (2021). https://doi.org/10.3390/ELECTRONICS10060746
19. Yu, J., Zhang, W.: Face mask wearing detection algorithm based on improved YOLO-v4. Sensors 21(9), 3263 (2021). https://doi.org/10.3390/s21093263
20. Ullah, K.R., et al.: Evaluating the efficiency of CBAM-Resnet using Malaysian sign language. CMC-Comput. Mater. Continua 71(2), 2755–2772 (2022). https://doi.org/10.32604/cmc.2022.022471
21. Song, J., et al.: Fisheye image detection of trees using improved YOLOX for tree height estimation. Sensors 22(10), 3636 (2022). https://doi.org/10.3390/s22103636
Cosolute Interactions with the Tryptophan Peptide
Bailang Liu, Xiaojing Teng, and Toshiko Ichiye
Abstract Hydrophobic interaction between a model peptide and small molecules is studied and discussed in this paper. Two adsorption models were tested, and the Everett model described the interaction accurately. By incorporating the hydrophobic effect into our previously proposed dynamical model, the interaction between solutions of specific small molecules and proteins can be predicted. Furthermore, with more data, machine learning can be used to make general predictions of the interaction between any small molecule and proteins.
Keywords Molecular dynamics · Protein stability · Machine learning
B. Liu · X. Teng (B) · T. Ichiye
Department of Chemistry, Georgetown University, Washington, DC 20057, USA
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_45
1 Introduction
Small organic cosolutes can stabilize or destabilize proteins in aqueous solution by a variety of mechanisms. For example, trimethylamine N-oxide (TMAO) is a protein stabilizer that likely works by interacting strongly with water, while urea is a strong denaturant that has been shown to affect protein stability directly by binding to the protein [1, 2] with little effect on the water structure [3]. Although the hydrogen bonding interaction between urea nitrogens and protein backbone carbonyl oxygens is believed to play an important role in protein denaturation [4, 5], methyl-substituted ureas such as 1,3-dimethylurea (DMU) and 1,1,3,3-tetramethylurea (TMU) are, interestingly, stronger denaturants than urea [6, 7] despite having fewer or even no hydrogen bond donors. The mechanisms of denaturation by methylureas have not been investigated thoroughly; studies have ascribed the strong denaturation by methylureas to changes in the properties of water in the solutions as well as to effects on the interactions between water and protein. For instance, experiments indicate that urea fits ideally into the water network [8, 9], while both DMU and TMU slow down the orientational relaxation of water [10, 11]. With its much higher spatial and temporal resolution, molecular dynamics (MD) simulation is now a powerful tool for studying molecular mechanisms [12–16]. MD studies of TMU and polypeptides indicate that TMU strengthens hydrogen bonds between water molecules while also distorting the tetrahedral structure of water [17]. Experiments on methylureas and poly(N-isopropylacrylamide) in solution suggest that methylureas lower the number of water molecules that can coordinate with the carbonyl oxygens of the polymer [18]. Other experiments on methylureas and short peptides in solution indicate that methylureas can effectively disrupt hydrophobic interactions in proteins [19]. Our recent studies of urea, methylated ureas, and TMAO in aqueous solution indicate that the methylureas and TMAO differ in their hydrophobic behavior. The methyl groups attached to the amide of DMU or TMU lead to little electrostatic potential at the molecular surface, making it hydrophobic. In contrast, the methyl groups attached to the amine of TMAO lead to a strongly negative potential at the molecular surface, making it hydrophilic. This finding is also consistent with the fact that DMU and TMU interact with hydrophobic sidechains of proteins and urea binds to carbonyl oxygens, while TMAO does not bind to proteins. In the current study, the interactions of urea, DMU, TMU, and TMAO at different concentrations with the tripeptide glycine-tryptophan-glycine (GWG, structural formula shown in Fig. 1) at 298 K are investigated using MD simulations. Previous studies have revealed a stronger affinity between TMU and the indole ring in GWG [19]. Two adsorption models are tested here to describe the affinity of different cosolutes for the model peptide. The good fit to the Everett adsorption model not only provides simulation evidence for previous studies regarding methylated ureas, but also strengthens the idea that the affinity between methylated ureas and the indole ring originates from the disturbance of hydrophobic water. The current study not only provides qualitative evidence that methylated ureas denature proteins via hydrophobic effects, but also provides insight into how to quantify this hydrophobic effect, which will be discussed later. In our previous work [20–26], we thoroughly studied the properties of aqueous solutions of many small molecules and proposed a quantitative dynamical model to explain the denaturation mechanisms of cosolutes by calculating diffusion coefficients, hydrogen bond lifetimes, and hydrogen bond occupancies from MD simulations of cosolutes in aqueous solution. This model successfully explained the direct denaturation mechanism of urea, the indirect stabilization mechanism of TMAO, the counteracting effects of urea and TMAO on each other, and the protecting effects of TMAO against pressure. In this work, we try to expand the dynamical model to include hydrophobic interaction.
Fig. 1 Structural formula of tripeptide GWG
2 Methods
The MD simulations were carried out using the molecular mechanics packages CHARMM version 41a2 [27] and OpenMM version 7.5.0 [28] compiled with CUDA version 10.0. The CHARMM36 all-atom protein force field [29] was used for the tripeptide GWG. The CHARMM*/REDS force field [30], which was optimized from the CHARMM force field, was used for urea and the methyl-substituted urea solutes, the Hölzl force field [31] was used for TMAO, and the TIP4P-FB force field [32] was used for water. A four-point water model is computationally expensive, but it captures the dynamic properties of water accurately [33]. The initial structures of urea, DMU, TMU, TMAO and GWG were generated from their internal coordinates with the CHARMM program. A single GWG tripeptide was placed at the center of a water box, and different numbers of urea, DMU, TMU or TMAO molecules were added to build systems with different cosolute concentrations. The simulations of the ternary solutions were performed using OpenMM with default settings except as noted here. The particle-mesh Ewald (PME) summation method [34] was used to calculate the long-range electrostatic interactions with an Ewald error tolerance of $1 \times 10^{-5}$. The Lennard-Jones potentials were gradually switched off using the OpenMM switching function from 10 to 12 Å without long-range corrections. The SHAKE algorithm [35] was used to fix the lengths of covalent bonds involving hydrogen atoms. Each system was minimized for 500 steps via the L-BFGS algorithm [36]. Initially, the simulations were performed using a leapfrog Verlet integrator with a time step of 1 fs. The systems were maintained in the NPT ensemble, with temperature and pressure controlled by an Andersen thermostat [38] and a Monte Carlo barostat [37], respectively. Each system was started at 1 bar and heated from 0 K to the final temperature of 298 K in 5-K increments every 5 ps. Next, the system was equilibrated for 5 ns in the NPT ensemble, and equilibration was continued until the volume of the system differed by less than 0.05% from the target volume, which was determined by CHARMM in the NPT ensemble to make use of its pressure and temperature control modules, as described in our previous work [30]. Finally, a 50-ns production run without any perturbation was carried out in OpenMM using a velocity Verlet integrator with a 1-fs time step for each system studied. The temperature was held constant in the NVT ensemble by a Nosé-Hoover chain thermostat [39] with a chain length of 5 and a collision frequency of 50 ps$^{-1}$. The first 5 ns were treated as further equilibration to the NVT ensemble, and the following 45 ns of trajectory were used for analysis. The coordinates were saved every 1 ps. A molecule is counted as a neighbor if a heavy atom of the small molecule lies within a distance cut-off of 5.0 Å of any heavy atom of the indole ring in the tripeptide GWG. The average number of neighboring molecules is the number of counted neighbors averaged over the 45 ns of the production run used for analysis. The simulation details of each system are summarized in Tables 1 and 2.
To find a quantitative description of the hydrophobic effect, our first attempt started from the Langmuir adsorption model, which describes the adsorption of a single component onto a surface:
$$A^l + [S] \rightleftharpoons A^s \tag{1}$$
Denoting the molarity of a component as c, the equilibrium constant K of the adsorption process is
$$K = \frac{c_A^s}{c_A^l\,[S]} \tag{2}$$
The total binding site concentration $[S_0]$ is the sum of the free binding site concentration $[S]$ and the concentration of bound component A, $c_A^s$:
$$[S_0] = [S] + c_A^s = c_A^s \left(1 + \frac{1}{K c_A^l}\right) \tag{3}$$
Therefore, the binding fraction of component A is
$$\frac{N_A^s}{N^s} = \frac{c_A^s}{[S_0]} = \frac{K c_A^l}{1 + K c_A^l} \tag{4}$$
The number of neighboring molecules (both cosolute and water) around the indole ring of the tripeptide GWG, $N_u^s$ or $N_w^s$, as a function of solute concentration c can therefore be fitted to Eq. 4, with the equilibrium constant K and the total number of binding sites $N^s$ as parameters. The Everett adsorption model is an analog of the Langmuir adsorption model that describes the composition of the surface phase for a binary liquid of components A and B, where the components are interchangeable, in contact with a surface:
$$A^l + B^s \rightleftharpoons A^s + B^l \tag{5}$$
Table 1 Details of simulated systems
Solute type   Number of solutes in box   Number of waters in box   c_u (M)   m_u (mol/kg)   x_u
N/A           0                          1956                      -         -              0
Urea          40                         2030                      1.04      1.09           0.02
Urea          70                         1680                      2.10      2.31           0.04
Urea          100                        1560                      3.09      3.56           0.06
Urea          140                        1550                      4.14      5.01           0.08
Urea          160                        1440                      4.91      6.17           0.10
Urea          230                        1608                      5.99      7.94           0.13
Urea          230                        1206                      7.39      10.59          0.16
Urea          230                        617                       11.20     20.69          0.27
DMU           40                         1934                      1.04      1.15           0.02
DMU           70                         1680                      1.93      2.31           0.04
DMU           100                        1560                      2.73      3.56           0.06
DMU           140                        1550                      3.52      5.01           0.08
DMU           160                        1440                      4.06      6.17           0.10
DMU           230                        1608                      4.77      7.94           0.13
DMU           230                        1206                      5.61      10.59          0.16
DMU           230                        617                       7.58      20.69          0.27
TMU           40                         1873                      1.03      1.19           0.02
TMU           70                         1680                      1.81      2.31           0.04
TMU           100                        1560                      2.50      3.56           0.06
TMU           140                        1550                      3.14      5.01           0.08
TMU           160                        1440                      3.57      6.17           0.10
TMU           230                        1608                      4.10      7.94           0.13
TMU           230                        1206                      4.71      10.59          0.16
TMU           230                        617                       6.01      20.69          0.27
TMAO          40                         1961                      1.04      1.13           0.02
TMAO          70                         1680                      1.96      2.31           0.04
TMAO          100                        1560                      2.81      3.56           0.06
TMAO          140                        1550                      3.67      5.01           0.08
TMAO          160                        1440                      4.26      6.17           0.10
TMAO          230                        1608                      5.06      7.94           0.13
TMAO          230                        1206                      5.99      10.59          0.16
TMAO          230                        617                       8.35      20.69          0.27

Table 2 Number of molecules N within 5 Å of non-hydrogen atoms of the indole ring, or of the C5/C6 atoms, in different systems
                        Within 5 Å of indole ring   Within 5 Å of C5/C6
Solute type   c_u (M)   N_w      N_u                N_w      N_u
N/A           0         26.8     -                  15.6     -
Urea          1.04      24.7     1.3                14.2     0.8
Urea          4.14      20.3     3.9                11.8     2.6
Urea          5.99      17.5     5.8                10.2     3.8
Urea          7.39      16.3     6.5                9.5      4.3
Urea          11.20     13.1     8.5                7.7      5.5
DMU           1.04      20.5     2.3                11.9     1.6
DMU           3.52      11.5     6.3                6.2      4.4
DMU           5.61      9.4      7.3                5.0      5.0
DMU           7.58      8.6      7.6                4.6      5.2
TMU           1.03      16.0     3.6                8.4      2.7
TMU           3.14      10.1     5.8                5.0      4.2
TMU           4.71      8.1      6.9                3.3      5.0
TMU           6.01      6.0      7.7                2.4      5.5
TMAO          1.04      24.6     1.0                14.3     0.7
TMAO          3.67      19.3     3.6                10.9     2.5
TMAO          5.99      15.1     5.2                8.4      3.5
TMAO          8.35      9.5      8.1                5.0      5.3
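The neighbor counts in Table 2 rest on the 5.0 Å heavy-atom distance criterion given in the Methods. A minimal sketch of that counting rule is shown below. The data layout (lists of coordinate tuples per molecule and per frame) and the function names are assumptions made for illustration, and periodic boundary conditions (minimum-image distances) are deliberately ignored to keep the example short; the authors' analysis code is not described in the text.

```python
import math

CUTOFF = 5.0  # Å, the distance criterion stated in the Methods


def is_neighbor(molecule_atoms, ring_atoms, cutoff=CUTOFF):
    """True if any heavy atom of the molecule lies within `cutoff`
    of any heavy atom of the indole ring (no periodic images)."""
    return any(math.dist(a, b) <= cutoff
               for a in molecule_atoms for b in ring_atoms)


def average_neighbor_count(frames, cutoff=CUTOFF):
    """frames: list of (molecules, ring_atoms) pairs, one per saved frame.
    `molecules` is a list of per-molecule heavy-atom coordinate lists.
    Returns the neighbor count averaged over all frames."""
    counts = [sum(is_neighbor(mol, ring, cutoff) for mol in molecules)
              for molecules, ring in frames]
    return sum(counts) / len(counts)


# Toy trajectory with two frames and a two-atom "ring" (hypothetical coordinates)
ring = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame_1 = ([[(3.0, 0.0, 0.0)], [(10.0, 0.0, 0.0)]], ring)  # one neighbor
frame_2 = ([[(2.0, 0.0, 0.0)], [(0.0, 4.0, 0.0)]], ring)   # two neighbors
mean_count = average_neighbor_count([frame_1, frame_2])
```

For a real trajectory the same rule would be applied to every saved frame of the analysis window, with distances computed under the periodic boundary conditions of the simulation box.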
In Eq. 5, the superscript 'l' denotes the liquid phase and the superscript 's' denotes the surface phase. Denoting the mole fraction of component i in phase α as $x_i^\alpha$, we have
$$x_A^s + x_B^s = 1; \quad x_A^l + x_B^l = 1 \tag{6}$$
and the equilibrium constant K of reaction 5 is
$$K = \frac{x_A^s\, x_B^l}{x_A^l\, x_B^s} \tag{7}$$
The fractional saturation of binding of A can be derived from Eqs. 6 and 7:
$$x_A^s = \frac{K x_A^l / \left(1 - x_A^l\right)}{1 + K x_A^l / \left(1 - x_A^l\right)} = \frac{K x_A^l}{1 + (K - 1)\, x_A^l} \tag{8}$$
If $N^s$ is the total number of sites on the surface for binding of either A or B, we have
$$N_A^s = \frac{N^s K x_A^l}{1 + (K - 1)\, x_A^l} \tag{9}$$
in which $N_A^s$ is the number of neighboring molecules A bound to the indole ring, and $x_A^l$ is the mole fraction of cosolute A in the solution (we can approximately assume that the cosolute or water bound to the indole ring does not change the mole fraction in the solution). From Eq. 9 we can fit the total number of cosolute binding sites $N^s$ and the equilibrium constant K. The water binding to the indole ring can be derived similarly as
$$N_B^s = \frac{N^s K^{-1} x_B^l}{1 + \left(K^{-1} - 1\right) x_B^l} \tag{10}$$
in which $N_B^s$ is the number of waters bound to the indole ring, and $x_B^l$ is the mole fraction of water in the solution. The inverse of the fitted $K^{-1}$ from Eq. 10 should be close to K from Eq. 9 if the Everett adsorption model successfully describes the hydrophobic association among cosolute, water and the indole ring.
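Equations 9 and 10 can be evaluated directly, which makes the consistency condition between them easy to see. The sketch below is illustrative only: the function names are introduced here, the parameters are the urea values reported in Table 5, and the mole fraction is computed from the most concentrated urea box in Table 1. Using a single shared $N^s$ and K for both equations is a simplification for this check; the paper fits the cosolute and water data separately.

```python
def everett_cosolute(x_u, n_sites, k):
    """Eq. 9: number of bound cosolute molecules at cosolute mole fraction x_u."""
    return n_sites * k * x_u / (1.0 + (k - 1.0) * x_u)


def everett_water(x_w, n_sites, k):
    """Eq. 10: number of bound waters at water mole fraction x_w,
    written with the inverse of the cosolute constant k."""
    k_inv = 1.0 / k
    return n_sites * k_inv * x_w / (1.0 + (k_inv - 1.0) * x_w)


# Urea parameters from Table 5 and the 230 urea / 617 water box from Table 1
N_SITES, K_U = 13.79, 4.509
x_u = 230 / (230 + 617)

n_u = everett_cosolute(x_u, N_SITES, K_U)
n_w = everett_water(1.0 - x_u, N_SITES, K_U)

print(f"x_u = {x_u:.3f}")
print(f"predicted N_u = {n_u:.2f} (Table 2 reports 8.5 at this concentration)")
print(f"N_u + N_w = {n_u + n_w:.2f} with shared parameters")
```

With shared parameters the two occupancies always add up to $N^s$, which is just Eq. 6 for the surface phase multiplied by $N^s$.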
3 Results
The average numbers of neighboring molecules were fitted to the Langmuir adsorption model (Eq. 4). The fitting results are summarized in Figs. 2 and 3 and Tables 3 and 4. However, Eq. 4 can only represent a concave function. If the data points follow a convex trend, the fitted K approaches zero and the fit degenerates to a linear function, as for the neighboring cosolute numbers $N_u^s$ for TMAO in Fig. 2 and the neighboring water numbers $N_w^s$ for all systems in Fig. 3. The very small equilibrium constant K for TMAO in Table 3 is not suitable for further quantitative analysis, which means that TMAO cannot be discussed together with the methylated ureas within the framework of the Langmuir adsorption model. The failure of the Langmuir adsorption model for TMAO and urea is not a surprise. In previous research, we argued qualitatively that the methylated ureas have an affinity for the protein backbone while urea and TMAO do not. This affinity is analogous to adsorption, so the Langmuir isotherm may be appropriate for describing it. However, although TMAO has no hydrophobic affinity for the indole ring, it can still influence the solvation environment of the indole ring by affecting bulk water. The competition between cosolute and water for binding is therefore examined via the Everett adsorption model.
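The concavity argument above can be made explicit with a short worked derivation. It is added here as an illustration and is not part of the original analysis; it uses only Eqs. 4 and 9 with $N^s, K > 0$.

```latex
% Langmuir form (Eq. 4): always concave in c
N(c) = \frac{N^s K c}{1 + K c}, \qquad
\frac{d^2 N}{dc^2} = -\frac{2 N^s K^2}{(1 + K c)^3} < 0 .

% Small-K limit: the fit degenerates to a straight line of slope N^s K
N(c) \approx N^s K\, c \qquad (K c \ll 1).

% Everett form (Eq. 9): curvature changes sign at K = 1
N(x) = \frac{N^s K x}{1 + (K - 1)x}, \qquad
\frac{d^2 N}{dx^2} = -\frac{2 N^s K (K - 1)}{\bigl(1 + (K - 1)x\bigr)^3},
\qquad 1 + (K - 1)x > 0 \ \text{for}\ 0 \le x \le 1 .
```

For the TMAO entry in Table 3, the product $N^s K = 1916 \times 4.999 \times 10^{-4} \approx 0.96$ is the only well-determined quantity, which is why the individual values of $N^s$ and K carry little meaning there. The Everett form is concave for $K > 1$ and convex for $K < 1$, so it can follow the convex water curves: in Eq. 10 the constant multiplying $x_B^l$ is $K^{-1}$, which is below 1 whenever the cosolute constant K exceeds 1.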
Fig. 2 The average number of neighboring cosolute molecules $N_u^s$ around the indole ring as a function of cosolute mole fraction $x_u$ for urea (blue circle), DMU (green triangle), TMU (orange square) and TMAO (red diamond) aqueous solutions. The dashed lines are the fits to the Langmuir isotherm
Table 3 Fitting parameters for cosolute in cosolute-water solutions for the Langmuir isotherm, with the inverse of variance as the fitting weights
Solute   $N^s$   $K_u$                     $R^2$
Urea     27.27   0.04134                   0.9991
DMU      13.04   0.2267                    0.9980
TMU      11.91   0.3008                    0.9992
TMAO     1916    $4.999 \times 10^{-4}$    0.9988
Fig. 3 The average number of water molecules $N_w^s$ around the indole ring as a function of water mole fraction $x_w$ for urea (blue circle), DMU (green triangle), TMU (orange square) and TMAO (red diamond) aqueous solutions. The dashed lines are the fits to the Langmuir isotherm
Table 4 Fitting parameters for water in cosolute-water solutions for the Langmuir isotherm, with the inverse of variance as the fitting weights
Solute   $N^s$   $K_w$                     $R^2$
Urea     4204    $1.149 \times 10^{-4}$    0.9038
DMU      3823    $1.282 \times 10^{-4}$    0.8800
TMU      2368    $2.082 \times 10^{-4}$    0.9146
TMAO     2715    $1.816 \times 10^{-4}$    0.9908
The numbers of neighboring cosolute molecules $N_u^s$ and neighboring water molecules $N_w^s$ are plotted in Figs. 4 and 5. The data are fitted to the Everett adsorption model with the inverse of the variance as fitting weights, and the fitting parameters are summarized in Tables 5 and 6. Overall, $K_u$ and $K_w^{-1}$ overlap within their 95% confidence intervals, indicating that the Everett adsorption model is appropriate for describing the competition between cosolute and water molecules around the indole ring. The cosolute binding constants $K_u$ follow the order TMAO ≈ urea < DMU < TMU, showing that DMU and TMU tend to aggregate around the indole ring. This effect becomes more pronounced as more methyl groups are substituted, and it can also be observed directly in Fig. 4 at lower concentrations. The maximum numbers of cosolute binding sites follow the order urea ≈ TMAO > DMU > TMU, which is dominated by the size of the cosolute molecules.
Fig. 4 The average number of neighboring cosolute molecules $N_u^s$ around the indole ring as a function of cosolute mole fraction $x_u$ for urea (blue circle), DMU (green triangle), TMU (orange square) and TMAO (red diamond) aqueous solutions. The dashed lines are the fits to the Everett model
Fig. 5 The average number of water molecules $N_w^s$ around the indole ring as a function of water mole fraction $x_w$ for urea (blue circle), DMU (green triangle), TMU (orange square) and TMAO (red diamond) aqueous solutions. The dashed lines are the fits to the Everett model
Table 5 Fitting parameters for cosolute in cosolute-water solutions for the Everett model, with the inverse of variance as the fitting weights
Solute   $N^s$   $K_u$   $R^2$
Urea     13.79   4.509   0.9991
DMU      9.522   17.21   0.9980
TMU      8.562   23.12   0.9992
TMAO     13.63   3.813   0.9987
Table 6 Fitting parameters for water in cosolute-water solutions for the Everett model, with the inverse of variance as the fitting weights
Solute   $N^s$   $K_w^{-1}$   $R^2$
Urea     26.77   3.495        0.9944
DMU      26.80   11.02        0.9937
TMU      26.80   15.80        0.9911
TMAO     26.84   4.188        0.9987
4 Conclusions
The simulations of the model tripeptide GWG in aqueous cosolute solutions provide direct evidence regarding the denaturation mechanism of methylated ureas. The more methyl groups are substituted, the stronger the hydrophobic effect on the protein backbone, which shifts the equilibrium of the protein structure toward the denatured state. In future work, the hydrophobic effect should be incorporated into our dynamical model to better describe the interaction between solutions and proteins. With more molecules being studied, machine learning can also be used to predict the effect of any small molecule on proteins.
Acknowledgements B.L. and T.I. acknowledge support from the National Science Foundation through Grant No. CHE-1464766 and from the McGowan Charitable Fund, and X.T. acknowledges support from the National Institutes of Health through Grant No. R01-GM122441. This work used time on the Extreme Science and Engineering Discovery Environment (XSEDE) granted via MCB990010, which is supported by National Science Foundation Grant No. OCI-1053575, and on the Medusa cluster maintained by University Information Services at Georgetown University.
References 1. Robinson, D.R., Jencks, W.P.: The effect of compounds of the urea-guanidinium class on the activity coefficient of acetyltetraglycine ethyl ester and related compounds1. J. Am. Chem. Soc. 87(11), 2462–2470 (1965) 2. Simpson, R.B., Kauzmann, W.: The kinetics of protein denaturation. i. the behavior of the optical rotation of ovalbumin in urea solutions1. J. Am. Chem. Soc. 75(21), 5139–5152 (1953) 3. Kokubo, H., Rösgen, J., Bolen, D.W., Pettitt, B.M.: Molecular basis of the apparent near ideality of urea solutions. Biophys. J. 93(10), 3392–3407 (2007) 4. O’Brien, E.P., Dima, R.I., Brooks, B., Thirumalai, D.: Interactions between hydrophobic and ionic solutes in aqueous guanidinium chloride and urea solutions: lessons for protein denaturation mechanism. J. Am. Chem. Soc. 129(23), 7346–7353 (2007) 5. Kumaran, R., Ramamurthy, P.: Denaturation mechanism of BSA by urea derivatives: evidence for hydrogen-bonding mode from fluorescence tools. J. fluoresc. 21, 1499–1508 (2011) 6. Pace, C.N., Marshall, H.F., Jr.: A comparison of the effectiveness of protein denaturants for β-lactoglobulin and ribonuclease. Arch. Biochem. Biophys. 199(1), 270–276 (1980)
Cosolute Interactions with the Tryptophan Peptide
569
7. Shimizu, A., Fumino, K., Yukiyasu, K., Taniguchi, Y.: NMR studies on dynamic behavior of water molecule in aqueous denaturant solutions at 25 c: Effects of guanidine hydrochloride, urea and alkylated ureas. J. Mol. Liq. 85(3), 269–278 (2000) 8. Funkner, S., Havenith, M., Schwaab, G.: Urea, a structure breaker? answers from THz absorption spectroscopy. J. Phys. Chem. B 116(45), 13374–13380 (2012) 9. Rezus, Y., Bakker, H.: Effect of urea on the structural dynamics of water. Proc. Natl. Acad. Sci. USA 103(49), 18417–18420 (2006) 10. Agieienko, V., Horinek, D., Buchner, R.: Hydration and self-aggregation of a neutral cosolute from dielectric relaxation spectroscopy and md simulations: the case of 1, 3-dimethylurea. Phys. Chem. Chem. Phys. 19(1), 219–230 (2017) 11. Tielrooij, K.-J., Hunger, J., Buchner, R., Bonn, M., Bakker, H.J.: Influence of concentration and temperature on the dynamics of water in the hydrophobic hydration shell of tetramethylurea. J. Am. Chem. Soc. 132(44), 15671–15678 (2010) 12. Karplus, M., McCammon, J.A.: Molecular dynamics simulations of biomolecules. Nat. Struct. Mol. Biol. 9(9), 646–652 (2002) 13. Brooks, C. L., Case, D.A., Plimpton, S., Roux, B., Van Der Spoel, D., Tajkhorshid, E.: Classical molecular dynamics. J. Chem. Phys. 154(10) (2021) 14. Teng, X., Hwang, W.: Chain registry and load-dependent conformational dynamics of collagen. Biomacromolecules 15(8), 3019–3029 (2014) 15. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016) 16. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018) 17. Wei, H., Fan, Y., Gao, Y.Q.: Effects of urea, tetramethyl urea, and trimethylamine n-oxide on aqueous solution structure and solvation of protein backbones: a molecular dynamics simulation study. J. Phys. Chem. B 114(1), 557–568 (2010) 18. 
Sagle, L.B., Zhang, Y., Litosh, V.A., Chen, X., Cho, Y., Cremer, P.S.: Investigating the hydrogenbonding model of urea denaturation. J. Am. Chem. Soc. 131(26), 9304–9310 (2009) 19. Ding, B., Yang, L., Mukherjee, D., Chen, J., Gao, Y., Gai, F.: Microscopic insight into the protein denaturation action of urea and its methyl derivatives. J. Phys. Chem. Lett. 9(11), 2933–2940 (2018) 20. Teng, X., Huang, Q., Dharmawardhana, C.C., Ichiye, T.: Diffusion of aqueous solutions of ionic, zwitterionic, and polar solutes. J. Chem. Phys. 148(22), 222827 (2018) 21. Teng, X., Ichiye, T.: Molecular dynamics study of tmao and urea aqueous solutions. In: Abstracts of Papers of the American Chemical Society, vol. 256. Amer Chemical Soc 1155 16th St, NW, Washington, DC 20036 USA (2018) 22. Liu, B., Ichiye, T.: Hydrophobic effect in molecular dynamics simulations of water/ethanol mixtures using single-site multipole water. In: Abstracts of Papers of the American Chemical Society, vol. 256. Amer Chemical Soc 1155 16th St, NW, Washington, DC 20036 USA (2018) 23. Teng, X., Ichiye, T.: Dynamical effects of trimethylamine n-oxide on aqueous solutions of urea. J. Phys. Chem. B 123(5), 1108–1115 (2019) 24. Teng, X., Ichiye, T.: Aqueous solutions of TMAO and urea under pressure: molecular dynamics simulation study. In: Abstracts of Papers of the American Chemical Society, vol. 256. Amer Chemical Soc 1155 16th St, NW, Washington, DC 20036 USA (2019) 25. Liu, B., Ichiye, T.: Molecular dynamics study of the hydrophobic effect in ethanol-water mixtures. In: Abstracts of Papers of the American Chemical Society, vol. 256. Amer Chemical Soc 1155 16th St, NW, Washington, DC 20036 USA (2019) 26. Teng, X., Ichiye, T.: Dynamical model for the counteracting effects of trimethylamine n-oxide on urea in aqueous solutions under pressure. J. Phys. Chem. B 124(10), 1978–1986 (2020) 27. 
Brooks, B.R., Brooks, C.L., III., Mackerell, A.D., Jr., Nilsson, L., Petrella, R.J., Roux, B., Won, Y., Archontis, G., Bartels, C., Boresch, S., et al.: CHARMM: the biomolecular simulation program. J. Comput. Chem. 30(10), 1545–1614 (2009)
28. Eastman, P., Swails, J., Chodera, J.D., McGibbon, R.T., Zhao, Y., Beauchamp, K.A., Wang, L.-P., Simmonett, A.C., Harrigan, M.P., Stern, C.D., et al.: OpenMM 7: Rapid development of high performance algorithms for molecular dynamics. PLoS Comput. Biol. 13(7), e1005659 (2017)
29. Best, R.B., Zhu, X., Shim, J., Lopes, P.E., Mittal, J., Feig, M., MacKerell, A.D., Jr.: Optimization of the additive CHARMM all-atom protein force field targeting improved sampling of the backbone φ, ψ and side-chain χ1 and χ2 dihedral angles. J. Chem. Theory Comp. 8(9), 3257–3273 (2012)
30. Liu, B., Ichiye, T.: Concentration dependence of dynamics and hydrogen bonding in aqueous solutions of urea, methyl-substituted ureas, and trimethylamine n-oxide. J. Mol. Liq. 358, 119120 (2022)
31. Hölzl, C., Kibies, P., Imoto, S., Frach, R., Suladze, S., Winter, R., Marx, D., Horinek, D., Kast, S.M.: Design principles for high-pressure force fields: aqueous TMAO solutions from ambient to kilobar pressures. J. Chem. Phys. 144(14), 144104 (2016)
32. Wang, L.-P., Martinez, T.J., Pande, V.S.: Building force fields: an automatic, systematic, and reproducible approach. J. Phys. Chem. Lett. 5(11), 1885–1891 (2014)
33. Teng, X., Liu, B., Ichiye, T.: Understanding how water models affect the anomalous pressure dependence of their diffusion coefficients. J. Chem. Phys. 153(10), 104510 (2020)
34. Feller, S.E., Pastor, R.W., Rojnuckarin, A., Bogusz, S., Brooks, B.R.: Effect of electrostatic force truncation on interfacial and transport properties of water. J. Phys. Chem. 100(42), 17011–17020 (1996)
35. Ryckaert, J.-P., Ciccotti, G., Berendsen, H.J.: Numerical integration of the Cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23(3), 327–341 (1977)
36. Liu, D.C., Nocedal, J.: On the limited memory BFGS method for large scale optimization. Math. Prog. 45(1–3), 503–528 (1989)
37. Åqvist, J., Wennerström, P., Nervall, M., Bjelic, S., Brandsdal, B.O.: Molecular dynamics simulations of water and biomolecules with a Monte Carlo constant pressure algorithm. Chem. Phys. Lett. 384(4–6), 288–294 (2004)
38. Andersen, H.C.: Molecular dynamics simulations at constant pressure and/or temperature. J. Chem. Phys. 72(4), 2384–2393 (1980)
39. Martyna, G.J., Klein, M.L., Tuckerman, M.: Nosé–Hoover chains: the canonical ensemble via continuous dynamics. J. Chem. Phys. 97(4), 2635–2643 (1992)
Research on High-Voltage Pulse Ignition Power Supply Technology Based on µC_OS_II Xu Zhao, Changlu Yue, Xiuhua Xu, Cong Hu, and Lei Yang
Abstract µC_OS_II is a complete, portable, ROMable, and customizable preemptive multitasking kernel that provides basic functions such as task scheduling, task management, time management, memory management, and inter-task communication and synchronization. This paper introduces a high-voltage pulse ignition power supply design based on µC_OS_II and describes the system structure and the multitasking implementation in detail. The results show that using an embedded real-time operating system improves the reliability and maintainability of the system. The power supply has been successfully used in a pulse plasma space electric propulsion power system.
Keywords Operating system · Multitasking kernel · Pulse ignition power supply · Embedded real-time
1 Introduction
The high-voltage pulse ignition power supply is an important component of the pulse plasma space electric propulsion power supply. It detects the voltage across the thruster's energy storage element and performs ignition and discharge at the required frequency, thereby adjusting the thrust of the thruster. The STM32 series of microcontrollers based on the Cortex-M3 core has become increasingly widely used owing to its broad product lineup, rich peripheral resources, high cost-effectiveness, compact packaging, and good software and peripheral compatibility. This article proposes a design scheme for a digitally controlled high-voltage pulse ignition power supply that uses a real-time kernel based on µC_OS_II for task management and scheduling. The design offers stable system operation, convenient addition and removal of tasks, and flexible functional configuration [1, 2].
X. Zhao (B) · C. Yue · X. Xu · C. Hu · L. Yang
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_46
2 System Scheme Design
This article focuses on the development of a compact microsecond pulse ignition power supply for the semiconductor spark plug ignition system, one of the key components of high-voltage pulse plasma thrusters (PPT). The output voltage amplitude adapts over 1200–2500 V, and the pulse width is adjustable over 5–20 µs. The supply can serve as the ignition power source for pulse plasma electric propulsion and is used to control the startup characteristics of thrusters in a vacuum environment. The schematic diagram of the power supply hardware circuit is shown in Fig. 1. From Fig. 1, it can be seen that the power controller needs to complete the following tasks:
(1) Charging voltage collection;
(2) Ignition pulse signal output;
(3) Manual discharge;
(4) User interface display;
(5) Touch screen driver;
(6) Ignition status indication.
The control tasks of the high-voltage pulse ignition power supply are shown in Fig. 2. The charging voltage collection task collects the voltage of the energy storage capacitor of the electric thruster. The ignition pulse signal output task sends the ignition drive signals. The manual discharge task performs a single ignition when a button is pressed. The user interface and touch screen driver tasks display and refresh the LCD screen. The ignition status task uses LED lights to indicate whether ignition is normal.
Fig. 1 The schematic diagram of the power supply hardware circuit
Fig. 2 High voltage pulse ignition power control task
3 System Control Strategy
A task created by the µC_OS_II real-time kernel, also known as a thread, is a simple program that assumes the CPU belongs entirely to itself. Each task is a part of the entire application and is assigned a priority, with its own set of CPU registers and its own stack space. Each task is an infinite loop and is always in one of five states: sleep state, ready state, running state, suspended state (waiting state), or interrupted state. The task states are shown in Fig. 3 [3].
Fig. 3 µC_OS_II task status
Fig. 4 µC_OS_II system program operation flowchart
The pulse ignition power control system creates a total of six user tasks: Task_ADC(), Task_Ign(), Task_Key(), Task_User_IF(), Task_Kbd(), and Task_Led(). The running process of the power control system program is shown in Fig. 4.
Multitasking is achieved by the CPU switching and scheduling among many tasks. This is similar to a foreground/background system, except that there are multiple background tasks. Multitasking maximizes CPU utilization and modularizes the application. In practice, developers can organize complex applications into layers, which makes them easier to design and maintain [4]. The relationships between the six tasks established in this article are as follows.
Task_ADC() collects the voltage on the energy storage capacitor of the space electric thruster. Once the voltage reaches the discharge threshold, ignition can be triggered manually by a button or automatically at a fixed frequency. If the collected voltage is below the threshold, ignition cannot be triggered. Dual voltage sampling is used for redundancy. To improve the accuracy and interference immunity of the sampled signal, a large number of samples are usually collected continuously, averaged, and filtered. In a general controller this would be handled with interrupts, but the efficiency of interrupt-driven sampling is still not high enough. In the STM32 program of this article, the ADC is therefore used together with DMA (Direct Memory Access). DMA transfers the data converted by the ADC peripheral to SRAM, where it is processed in the DMA interrupt handler. After the two sampled data streams are summed and averaged at intervals, a sampling semaphore is generated. On receiving this semaphore, Task_ADC() sends the value to the host computer through the serial port for display.
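The sampling path described for Task_ADC() (two redundant channels, block averaging, then a threshold check before ignition is permitted) can be modeled with a short Python sketch. This is only an illustrative model and not the firmware: the function names, the block length, and the sample values are assumptions, and in the real system this step runs in the DMA interrupt handler on the STM32.

```python
from statistics import fmean

def average_dual_channel(samples_a, samples_b):
    """Average each redundant channel over one block, then average the two channels."""
    if not samples_a or not samples_b:
        raise ValueError("each channel needs at least one sample")
    return (fmean(samples_a) + fmean(samples_b)) / 2.0

def ignition_allowed(voltage, threshold):
    """Ignition may be triggered only once the capacitor voltage reaches the threshold."""
    return voltage >= threshold

# Hypothetical block of readings (volts) from the two sampling channels.
channel_a = [1498.0, 1502.0, 1501.0, 1499.0]
channel_b = [1500.0, 1503.0, 1497.0, 1500.0]
capacitor_voltage = average_dual_channel(channel_a, channel_b)
```

Averaging over a block suppresses uncorrelated noise on each channel, and combining the two channels gives a single value that can be compared with the discharge threshold.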
Task_Ign() outputs the ignition pulse signal. This function waits on a pulse semaphore, generated in the timer interrupt, that sets the ignition frequency. Task_Key() handles manual button ignition. It uses both a key semaphore and the pulse semaphore: the former indicates whether a button has been pressed, while the latter limits the ignition interval so that, when the ignition button is pressed frequently, the electric propulsion system does not repeatedly perform ineffective ignitions because the energy storage time is insufficient. The key semaphore is generated in the EXTI9_5_IRQHandler() interrupt function and sent to Task_Key() through OSSemPost(KEY_SEM).
Task_User_IF() and Task_Kbd() are the task functions for the LCD touch screen window display and touch screen adjustment. The Task_User_IF() window displays the two ADC sampling values, and Task_Kbd() reads the touch coordinates every 10 ms to keep the touch response sensitive. This display function is used during the debugging phase and can be removed after the power supply design is finalized.
Task_Led() indicates the ignition status on the circuit board. During ignition, the indicator light flashes so that the test observer can see the ignition signal output through the observation window of the vacuum chamber. This function replaces the LCD screen display after the product is finalized.
Task priorities in the µC_OS_II system are allocated according to the RMS (Rate Monotonic Scheduling) method, which is based on how often each task executes: the task that executes most frequently has the highest priority. According to the RMS rule, for all tasks to meet their hard real-time requirements, the tasks with time requirements together should occupy less than about seventy percent of the CPU time.
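The seventy percent figure corresponds to the classical rate monotonic utilization bound n(2^(1/n) - 1), which tends to ln 2 ≈ 0.693 as the number of tasks n grows. The Python sketch below evaluates this bound and checks a task set against it. The execution times and periods in the example are hypothetical values chosen only for illustration; the paper does not report them.

```python
import math

def rms_utilization_bound(n_tasks):
    """Sufficient RMS bound n * (2**(1/n) - 1) for n periodic tasks."""
    if n_tasks < 1:
        raise ValueError("need at least one task")
    return n_tasks * (2.0 ** (1.0 / n_tasks) - 1.0)

def total_utilization(tasks):
    """tasks: iterable of (execution_time, period) pairs in the same time unit."""
    return sum(c / t for c, t in tasks)

def passes_rms_test(tasks):
    """True if the task set satisfies the sufficient (not necessary) RMS bound."""
    tasks = list(tasks)
    return total_utilization(tasks) <= rms_utilization_bound(len(tasks))

# Hypothetical (execution time, period) pairs in milliseconds.
example_tasks = [(1.0, 10.0), (20.0, 1000.0), (5.0, 2500.0)]
bound_for_six = rms_utilization_bound(6)  # bound for six user tasks
```

The bound is only a sufficient condition: a task set that exceeds it may still be schedulable, but one that stays below it is guaranteed to meet its deadlines under rate monotonic priorities.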
The task priorities are Task_ADC_PRIO = 4, Task_Led_PRIO = 7, Task_Ign_PRIO = 10, Task_Key_PRIO = 11, Task_Kbd_PRIO = 12, and Task_User_IFPRIO = 14.
A semaphore in the µC_OS_II system consists of two parts: the count value of the semaphore and the list of tasks waiting for it. The count value can be binary or another integer. Semaphores have three main uses: controlling access to shared resources, signaling the occurrence of events, and synchronizing the behavior of two tasks [5]. This paper uses semaphores posted from the tick clock handler SysTickHandler() to switch between and synchronize the tasks; the clock interrupt handler executes once every 10 ms. This function generates the ADC_SEM sampling data update semaphore, the PLUS_SEM pulse ignition semaphore, and the PLUS_FAULT_SEM pulse ignition semaphore for fault mode. ADC_SEM is posted every 1 s, PLUS_SEM every 2.5 s, and PLUS_FAULT_SEM every 5 s. In the app.c task handlers, each task executes according to these semaphores. The principle of SysTickHandler() task switching is shown in Fig. 5.
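The timing relationship between the 10 ms tick and the three semaphores can be modeled in a few lines of Python. This is a sketch of the scheduling arithmetic only, assuming a simple tick counter; the function names are hypothetical, and in the firmware the posting would be done from SysTickHandler() rather than by a simulation loop.

```python
TICK_MS = 10  # tick period stated in the text

# Semaphore name -> posting period in milliseconds (values taken from the text).
PERIODS_MS = {"ADC_SEM": 1000, "PLUS_SEM": 2500, "PLUS_FAULT_SEM": 5000}

def semaphores_due(tick_count):
    """Return the semaphores a tick handler would post on the given tick."""
    elapsed_ms = tick_count * TICK_MS
    return [name for name, period in PERIODS_MS.items()
            if elapsed_ms > 0 and elapsed_ms % period == 0]

def simulate(n_ticks):
    """Count how many times each semaphore is posted over n_ticks ticks."""
    counts = {name: 0 for name in PERIODS_MS}
    for tick in range(1, n_ticks + 1):
        for name in semaphores_due(tick):
            counts[name] += 1
    return counts
```

With these periods, ADC_SEM is due every 100 ticks, PLUS_SEM every 250 ticks, and PLUS_FAULT_SEM every 500 ticks, so all three coincide once every 5 s.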
Fig. 5 µC_OS_II system program operation flowchart
4 Implementation of Power Supply Prototype
Considering the low clock frequency and limited resources of the microcontroller, the following precautions should be taken:
(1) Tasks should be divided reasonably to minimize the overhead of task switching;
(2) Tasks should be prioritized reasonably to ensure that each one can obtain CPU time;
(3) The stack of each task is independent, so memory must be allocated according to the needs of the task (local variables, function calls, interrupt nesting);
(4) Variables shared between multiple tasks are declared as global variables for ease of operation. To prevent shared data from being corrupted, interrupts should be disabled before accessing shared data and re-enabled afterwards;
(5) Because system resources are limited, µC_OS_II should be trimmed to remove unnecessary functions. Some modifications can also be made to µC_OS_II according to system needs. For example, if the system has only 6 tasks, the task priority table can be changed to a single-byte variable, and unnecessary content can be removed from the OS_TCB;
(6) Calculation and control should be implemented separately as far as possible; they can communicate in real time through semaphores or message queues [6].
The power supply hardware platform is shown in Fig. 6a. The selected microcontroller is the STM32F103RCT6, with 128 KB of built-in FLASH and 20 KB of RAM. With the program written according to the above method, the final compiled code size is 25 KB and it occupies less than 18 KB of memory, which suits this microcontroller.
The pulse ignition power supply and the electric thruster with the spark plug installed are placed in the vacuum chamber, the wire between the pulse ignition power supply and the spark plug is connected, and the power supply is connected. If the spark plug ignites normally, the vacuum chamber door is closed and the air extraction pump is started. Once the chamber has been pumped down, the energy storage capacitor of the electric thruster is charged to the rated voltage. At this point, when the spark plug is fired, plasma can be observed shooting from the nozzle of the electric thruster and emitting a dazzling white light.
To prevent the spark plug discharge from interfering with the ignition power supply, an isolation drive circuit separates the microcontroller control circuit from the ignition drive circuit, and a pulse transformer isolates the ignition drive circuit from the spark plug load. Finally, a thyristor is used in place of the metal-oxide field-effect transistor to improve the interference immunity of the power device. The pulse ignition discharge of the pulse plasma electric thruster is shown in Fig. 6b.
(a) Pulse ignition power supply prototype
(b) Pulse plasma electric thruster ignition discharge
Fig. 6 Hardware platform and the pulse plasma electric thruster
5 Conclusion
The digitally controlled high-voltage pulse ignition power supply based on the µC_OS_II real-time kernel combines the flexibility of a digitally controlled power supply with the stability and reliability of an embedded real-time operating system. Combined with the STM32F103 general-purpose high-performance microcontroller, this operating system offers good real-time performance, simple porting, easy maintenance, and flexible function expansion. It can significantly shorten the software development cycle and improve the efficiency of ignition power supply development. After tens of thousands of ignition tests and assessments, no attenuation of ignition energy has been observed. This design can be extended as a technical platform, and the relevant technology has been successfully applied to a small satellite electric thruster.
References
1. Xisan, L.: High Power Pulse Technology, pp. 95–174. National Defense Industry Press, Beijing (2005)
2. Falun, S., Fei, L., Haitao, G., et al.: Research progress on miniaturization of high-power Marx pulse power source. High Power Laser Part. Beams (2) (2018)
3. Labrosse, J.J.: Embedded Real-Time Operating System µC_OS_II, 2nd edn. (translated by Shao Beibei), pp. 73–113. Beijing University of Aeronautics and Astronautics Press (2003)
4. Liu, H., Yang, S.: STM32 Library Development Practical Guide, pp. 448–452. China Machine Press (2013)
5. Liu, B., Sun, Y.: Classic Example of Embedded Real Time Operating System µC_OS_II: Based on STM32 Processor, 2nd edn, pp. 22–37. Beijing University of Aeronautics and Astronautics Press (2014)
6. Meng, B.: STM32 Self Study Notes, 3rd edn, pp. 73–113. Beijing University of Aeronautics and Astronautics Press (2019)
Path Planning of Mobile Robot Based on DBSCAN Clustering and Improved BA*-APF Hybrid Algorithm Yicheng Li, Bingshan Liu, and Linyuan Hou
Abstract To address the problems of the traditional A* algorithm in mobile robot path planning, such as insufficient accuracy, long search time, slow convergence, and unsafe paths, a hybrid algorithm based on DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering and an improved BA*-APF (Bidirectional A*-Artificial Potential Field) method is proposed. First, the DBSCAN algorithm is adopted to reduce the environmental complexity by clustering the obstacles. Second, a four-eight hybrid switching neighborhood search method is proposed to improve the search efficiency while ensuring path safety. Third, a bidirectional search method is introduced to further increase the convergence speed; because the bidirectional search may fail to converge as the map expands, the artificial potential field method is introduced to improve the heuristic function. Finally, a third-order Bézier curve is used to improve the smoothness of the generated optimal path. Simulations show that the proposed algorithm works effectively and yields significant improvement.
Keywords Path planning · DBSCAN · A* algorithm · Artificial potential field · Bézier curve
Y. Li (B) · L. Hou
College of Science, Hohai University, Nanjing 210098, China
e-mail: [email protected]
B. Liu
School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_47
1 Introduction
Recent years have seen significant advances in the field of robotics, leading to the increased use of mobile robots in applications such as manufacturing, logistics, and rescue. A critical aspect of autonomous navigation for mobile robots is path planning, which aims to find a collision-free path from a start point to a goal point according to a set of evaluation criteria (such as distance, time, and cost) in an environment with obstacles [1–3]. According to the complexity of the environment and the application scenario in which the mobile robot operates, path planning algorithms can be divided into global path planning and local path planning [4]. Global path planning algorithms generate an optimal path that connects the start and goal points while avoiding all obstacles in the entire environment. Local path planning algorithms generate a path only a short distance ahead of the robot's current position when environmental information is unknown, and they run online in real time [5].
Because the traditional A* algorithm still has several problems in mobile robot path planning, a number of strategies have been proposed to improve it and obtain less computation, shorter search time, and better search results. Harabor et al. [6] proposed a jump point search algorithm that improves the A* algorithm by skipping the computation of unnecessary nodes, thus improving pathfinding efficiency; however, because some nodes are skipped, the path turns become sharper and the path smoothing problem becomes more pronounced. Huang et al. [7] proposed an APSO algorithm combining A* and PSO, which addressed the problems of low path accuracy and long running time. Li et al. [8] proposed an improved A* algorithm by comparing the Dijkstra algorithm with A* algorithms using different heuristic functions in a known environment, and obtained higher efficiency and fewer inflection points than the traditional A* algorithm. Goldberg et al. [9] improved the accuracy of the heuristic function to increase pathfinding efficiency, but the strong focus on speedup increased the space complexity.
Therefore, this paper proposes an obstacle clustering algorithm based on DBSCAN, which effectively solves the diagonal obstacle and U-shaped obstacle problems in pathfinding by clustering the obstacles that the mobile robot cannot normally pass. Then, obstacles in the same cluster are merged using the convex polygon envelope algorithm applied to the set of obstacle points. Meanwhile, the bidirectional A* algorithm with an improved search neighborhood is combined with the artificial potential field method to form the improved BA*-APF hybrid algorithm. On the one hand, bidirectional A* search for global path planning addresses the low efficiency of the traditional A* algorithm. On the other hand, the artificial potential field method is used to improve the heuristic function of the A* algorithm for local dynamic path planning. Finally, to address the lack of smoothness in the generated path, a third-order Bézier curve is used for smoothing.
2 Environmental Modelling Based on the Grid Method
Mobile robot path planning uses a path planning algorithm to find a short, continuous path through the free cells of a grid environment. The grid method is a commonly used approach for environment modeling in mobile robot path planning. It is mainly used to describe environment map information in a two-dimensional plane and was first proposed by Howeden. It divides the environment into square cells of equal size, and each cell takes one of two states: free or occupied.
In this paper, we adopt the grid method for environment modeling to describe the workspace of the mobile robot. The workspace is divided into a grid of equally sized square cells, and the entire map is represented by a two-dimensional array. Each grid cell is classified as either free (passable) or occupied by an obstacle (impassable). Marking the positions of obstacles on the map captures the constraints present in the environment and provides precise constraints for path planning.
To make the description of the grid map simpler and more intuitive, this paper takes S = {1, 2, 3, ..., 144} as the set of grid serial numbers and the lower left corner point (0, 0) as the origin of the map, and numbers the cells sequentially from left to right and from bottom to top, as shown in Fig. 1.
Fig. 1 The numbered grid map
The coordinates of the i-th grid are determined by the following equation:
$$x_i = ((i - 1) \bmod M) + 1, \qquad y_i = \left\lfloor \frac{i - 1}{N} \right\rfloor + 1$$
where mod is the remainder operation, M and N are the numbers of rows and columns of the grid map respectively, and ⌊·⌋ denotes rounding down.
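The index-to-coordinate mapping can be checked with a short Python sketch. It assumes a 12 × 12 map (144 cells arranged as a square, which the text does not state explicitly), so the row length used for the remainder and for the division is the same value; the function names are hypothetical.

```python
def index_to_xy(i, n_cols):
    """Map a 1-based cell index (numbered left to right, bottom to top) to (x, y)."""
    if i < 1:
        raise ValueError("cell indices start at 1")
    x = (i - 1) % n_cols + 1     # position within the row
    y = (i - 1) // n_cols + 1    # row number, using floor division
    return x, y

def xy_to_index(x, y, n_cols):
    """Inverse mapping from 1-based (x, y) back to the cell index."""
    return (y - 1) * n_cols + x

N_COLS = 12  # assumed row length for the 144-cell map
```

For example, cell 13 is the first cell of the second row, so it maps to (1, 2), and applying the inverse mapping to (1, 2) returns 13.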
3 DBSCAN-Based Obstacle Clustering
3.1 Algorithm Introduction
The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is used to cluster adjacent obstacles before paths are searched in the grid map with the improved A* algorithm, thereby simplifying unstructured scene maps that contain many obstacles. DBSCAN is a density-based clustering algorithm widely used in data mining, and its core idea is to form clusters based on the density of obstacle points.
3.2 Algorithm Preparation
The neighborhood radius ε and the threshold MP that defines the core points are two important parameters in the DBSCAN algorithm.
Definition 1 Neighborhood ε: For any object $x_i \in D$, the region within radius ε of the object is called the ε-neighborhood of $x_i$.
Definition 2 Density MP: For an object $x_i$, $MP = r(x_i) = |X_\varepsilon(x_i)|$ is the density of $x_i$, that is, the number of points in the ε-neighborhood of $x_i$; its value depends on the radius.
Definition 3 Core points: For any object $x_i \in D$, if the number of samples in the ε-neighborhood of $x_i$ is greater than or equal to MP, then the object is a core object. MP denotes a number of sample points, and the parameter can be set according to the actual application scenario of the algorithm.
Definition 4 Boundary points: A non-core point that belongs to a cluster; the number of points in its ε-neighborhood is less than MP.
Definition 5 Noise points: Points that do not belong to any cluster and are not density reachable from any core point.
Definition 6 Density directly reachable: If $x_i$ is a core object and $x_j$ lies in the ε-neighborhood of $x_i$, then $x_j$ is density directly reachable from $x_i$.
Definition 7 Density reachable: For $x_i, x_j \in D$, suppose there exists a sample sequence $\{p_1, p_2, \cdots, p_n\}$, where $p_1 = x_i$, $p_n = x_j$, and $p_i$ is density directly reachable from $p_{i-1}$; then $x_j$ is density reachable from $x_i$.
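Definitions 1 to 7 can be made concrete with a minimal pure-Python DBSCAN sketch. It is a textbook formulation written for illustration, not the authors' implementation: the function names are hypothetical, a point is counted as a member of its own ε-neighborhood, and the brute-force neighborhood query is quadratic in the number of points.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (the point itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one cluster label per point; NOISE marks noise points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE            # may be relabeled later as a boundary point
            continue
        cluster += 1                     # points[i] is a core point: start a cluster
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster      # boundary point: reachable but not core
                continue
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:   # core point: keep growing the cluster
                queue.extend(j_neighbors)
    return labels
```

Growing a cluster through the neighborhoods of successive core points is exactly the density-reachable chain of Definition 7, and any point that is never reached in this way keeps the noise label of Definition 5.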
Fig. 2 Density directly reachable
Fig. 3 Density reachable
This paper focuses on path planning in a known environment, where information such as the start point, goal point, obstacles, and drivable areas is already available. Based on the geometric information of the obstacles, we discretize each boundary of the obstacles into several key points using a distance parameter l that satisfies the condition (Figs. 2 and 3):
$$l < f + 2d$$
where f represents the width of the mobile robot and d denotes the single-sided safety distance of the robot. The discretization starts from the vertices of the obstacles. If the distance from a discretized key point to the other end of the boundary is less than d, that end vertex is taken as a key point, and the next boundary is then discretized starting from this key point.
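The boundary discretization can be sketched in Python as follows. This is a simplified variant under stated assumptions: each edge is divided into equal steps no longer than the chosen spacing (rather than stepping by exactly l and snapping to the end vertex), every vertex is a key point, and the function names are hypothetical.

```python
import math

def max_spacing(robot_width, safety_distance):
    """Upper limit f + 2d that the key point spacing l must stay below."""
    return robot_width + 2.0 * safety_distance

def discretize_edge(p, q, spacing):
    """Key points from p towards q; q is excluded because it starts the next edge."""
    n_steps = max(1, math.ceil(math.dist(p, q) / spacing))
    return [(p[0] + (q[0] - p[0]) * k / n_steps,
             p[1] + (q[1] - p[1]) * k / n_steps) for k in range(n_steps)]

def discretize_polygon(vertices, spacing, robot_width, safety_distance):
    """Discretize every boundary of a closed polygonal obstacle into key points."""
    if not spacing < max_spacing(robot_width, safety_distance):
        raise ValueError("spacing must satisfy l < f + 2d")
    points = []
    n = len(vertices)
    for k in range(n):
        points.extend(discretize_edge(vertices[k], vertices[(k + 1) % n], spacing))
    return points
```

Keeping the spacing below f + 2d means that two neighboring key points are always closer together than the gap the robot would need to pass between them safely.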
3.3 DBSCAN-Based Clustering of Obstacles
The DBSCAN algorithm clusters objects with similar relationships by continuously searching for core objects and growing clusters based on the relationships between objects. We set the parameters of the DBSCAN algorithm based on the key point discretization of obstacle boundaries by distance d. Specifically, we set the minimum number of samples MP to 2 and require that the sum of the width of the mobile robot and twice the single-sided safety distance be greater than d. The discretization of the key points is shown in Fig. 4. When any discrete point of obstacle $O_1$ falls into the neighborhood of obstacle $O_2$, the discrete points of both obstacles become density reachable from each other because all discrete points are considered core objects, so they are merged into the same cluster. Clusters are merged in this way until the DBSCAN algorithm completes the clustering, so that the discrete points belonging to the same cluster are grouped into one obstacle. In the clustering result, the discrete points of different clusters are shown in different colors, as in Fig. 5. After the obstacle clustering is complete, we extract the vertex coordinates of the obstacles in the same cluster and process these point sets with the convex polygon envelope algorithm to generate a new merged obstacle. If the goal point is inside the clustered obstacle group, the clustering of that obstacle group is canceled.
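The envelope step and the goal-inside check can be illustrated with a small Python sketch. The text does not say which convex hull method is used, so Andrew's monotone chain algorithm is assumed here as one standard choice; the function names are hypothetical.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def inside_hull(hull, p):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    return n >= 3 and all(cross(hull[k], hull[(k + 1) % n], p) >= 0 for k in range(n))

# Vertices of a hypothetical U-shaped obstacle group.
u_shape = [(0, 0), (0, 3), (1, 3), (1, 1), (2, 1), (2, 3), (3, 3), (3, 0)]
envelope = convex_hull(u_shape)
```

For the U-shaped example the envelope is the enclosing square, which closes off the cavity. A goal located in that cavity would be reported as inside the envelope, which is the case where the text cancels the clustering of that group.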
Fig. 4 The discrete effect of the key points
Fig. 5 Obstacles clustering schematic
With the above steps, we cluster the discrete points and merge the obstacles, which completes the DBSCAN-based clustering part of mobile robot path planning. Through DBSCAN clustering, all obstacles that the mobile robot cannot pass through normally, such as U-shaped obstacles and diagonal obstacles, are clustered, and replacement obstacles bounded by minimum enclosing convex polygons are generated. This fundamentally solves the problem of the mobile robot being misdirected and the resulting long search time.
4 Improved BA*-APF Hybrid Algorithm
4.1 The Traditional A* (A-Star) Algorithm
The A* (A-Star) algorithm is an efficient heuristic search algorithm for optimal paths, with fast convergence and high robustness. It uses an evaluation function to calculate the estimated value of the current point, uses this value to select the best node, and finally finds an optimal path. The A* algorithm combines the advantages of the Dijkstra algorithm and the best-first search algorithm: it retains, to some extent, the ability of the Dijkstra algorithm to find the shortest path, and introduces the greedy strategy of best-first search to prevent blind searching. The combined evaluation function is:
$$f(n) = g(n) + h(n) \tag{1}$$
where n is the current point, f(n) is the estimated value from the start point to the goal point through n, g(n) is the actual value from the start point to the current point, and h(n) is the estimated value from the current point to the goal point, also known as the heuristic function. We use the Euclidean distance to calculate the heuristic function.
Considering that the traditional A* algorithm is computationally expensive in relatively complex environments, which leads to long search times and low efficiency, this paper improves three aspects of the A* algorithm (search neighborhood, search direction, and heuristic function) to improve the path planning efficiency and real-time planning capability of the mobile robot, and proposes an improved BA*-APF (Bidirectional A*-Artificial Potential Field) hybrid algorithm.
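The evaluation function f(n) = g(n) + h(n) with a Euclidean heuristic can be illustrated by a compact grid A* sketch in Python. This is the traditional algorithm in a simplified form, assuming unit step cost and four-neighborhood moves; the grid encoding (0 for free, 1 for obstacle) and the function name are assumptions made for the example.

```python
import heapq
import math

def astar(grid, start, goal):
    """A* on a 2D grid (0 = free, 1 = obstacle) with four-neighborhood moves.
    Returns the list of cells from start to goal, or None if no path exists."""
    rows, cols = len(grid), len(grid[0])
    g = {start: 0.0}                                  # actual cost g(n)
    parent = {start: None}
    open_heap = [(math.dist(start, goal), start)]     # ordered by f(n) = g(n) + h(n)
    closed = set()
    while open_heap:
        _, node = heapq.heappop(open_heap)
        if node == goal:
            path = []
            while node is not None:                   # backtrack through parents
                path.append(node)
                node = parent[node]
            return path[::-1]
        if node in closed:
            continue
        closed.add(node)
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols) or grid[nxt[0]][nxt[1]]:
                continue
            new_g = g[node] + 1.0
            if new_g < g.get(nxt, math.inf):
                g[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(open_heap, (new_g + math.dist(nxt, goal), nxt))
    return None
```

Because the straight-line distance never overestimates the remaining cost on a unit-cost grid, the first time the goal is popped from the open list its path is a shortest one.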
4.2 Improved Search Neighborhoods
The traditional A* algorithm generally uses a four-neighborhood or eight-neighborhood approach when traversing nodes. Four-neighborhood traversal searches in the four directions (up, down, left, and right) around the current point. It is efficient but tends to produce too many turning points, which makes the planned path unsmooth. Eight-neighborhood traversal adds the four diagonal directions, for a total of eight directions. It widens the range of traversed nodes, but it can overlook obstacles when traversing nodes, resulting in unsafe planned paths (Fig. 6).
Fig. 6 Nodes traversal method
To solve the problem that the neighborhood search of the traditional A* algorithm leads to unsmooth and unsafe paths, this paper proposes an improved search method: four-eight hybrid switching neighborhood search. By evaluating the child nodes, the appropriate neighborhood search method is selected according to the distance from the obstacles. Specifically, when the distance between the current node and an obstacle falls to the safe distance (for example, the circle centered on the obstacle whose diameter is the diagonal length of a unit grid cell), the search switches to four-neighborhood search; when the distance exceeds the safe distance, the search switches to eight-neighborhood search. This hybrid switching method combines the advantages of the four-neighborhood and eight-neighborhood searches, which makes the robot's motion safer and the path smoother, and effectively improves the path planning result (Fig. 7).
Fig. 7 Different search mechanisms
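The switching rule can be sketched as a neighbor-generation function in Python. The exact safe distance is not fully specified in the text, so this sketch assumes that a node is "near" an obstacle when an obstacle cell lies within one diagonal cell length (√2, measured between cell centers); the grid encoding and all names are hypothetical.

```python
import math

FOUR = [(1, 0), (-1, 0), (0, 1), (0, -1)]
DIAG = [(1, 1), (1, -1), (-1, 1), (-1, -1)]

def near_obstacle(grid, node, safe_distance):
    """True if any obstacle cell (value 1) lies within safe_distance of node."""
    rows, cols = len(grid), len(grid[0])
    r, c = node
    reach = math.ceil(safe_distance)
    for i in range(max(0, r - reach), min(rows, r + reach + 1)):
        for j in range(max(0, c - reach), min(cols, c + reach + 1)):
            if grid[i][j] and math.hypot(i - r, j - c) <= safe_distance + 1e-9:
                return True
    return False

def neighbors(grid, node, safe_distance=math.sqrt(2)):
    """Four-neighborhood near obstacles, eight-neighborhood elsewhere."""
    rows, cols = len(grid), len(grid[0])
    moves = FOUR if near_obstacle(grid, node, safe_distance) else FOUR + DIAG
    result = []
    for dr, dc in moves:
        r, c = node[0] + dr, node[1] + dc
        if 0 <= r < rows and 0 <= c < cols and not grid[r][c]:
            result.append((r, c))
    return result
```

A function of this kind could replace the fixed neighbor loop of a grid search, so that diagonal moves are only generated where no obstacle is close enough to be clipped.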
4.3 Improved Bidirectional Search
The bidirectional A* algorithm is a two-way simultaneous search based on the traditional A* algorithm: it searches from the start point to the goal point and from the goal point to the start point at the same time, and the search is completed when the two searches intersect. The evaluation functions for the forward and backward searches are set up as follows:
\[
\begin{aligned}
f_1(n) &= g_1(n) + h_1(n) \\
f_2(n) &= g_2(n) + h_2(n)
\end{aligned}
\tag{2}
\]
The advantage of the bidirectional A* algorithm is that it can start from both the start point and the goal point in order to find a solution faster. Compared to the traditional one-way A* algorithm, the bidirectional A* algorithm can usually reduce the number of nodes searched, especially when the search space is large and the start point is far away from the goal point. So the bidirectional A* algorithm has higher search efficiency and can greatly reduce the search time. The pseudo-code of the algorithm is as follows:
Algorithm 1 Improved Bidirectional A* Algorithm
Input: the start point S and the goal point G
Output: global optimal path
Initialization: create the forward list open1, the backward list open2, and the close lists
put S/G into the forward/backward open list
search from S and G at the same time
while open1 ≠ ∅ and open2 ≠ ∅ do
    take out the adjacent nodes a and b with the smallest f value in open1 and open2
    consider a and b as the respective current nodes and add them to the close list
    if the adjacent node is an obstacle or in the close list then
        skip this node
    else
        if a ∉ open1 and b ∉ open2 then
            add a to open1 and add b to open2
            take the current node as its parent node
        else
            compare its current f value with the f value in the open list
            if current f value < the f value in the open list then
                replace the f value in the open list with the current f value
                replace its parent node with the current node
            end if
        end if
    end if
end while
use the backtracking principle to derive the path
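A minimal runnable sketch of the bidirectional idea in Algorithm 1 is given below. It is not the authors' implementation: it assumes a four-connected grid (0 = free, 1 = obstacle), unit step cost, a Euclidean heuristic, strictly alternating forward and backward expansions, and termination at the first node closed by both searches, which finds a path but does not by itself guarantee the shortest one. All function and variable names are hypothetical.

```python
import heapq
import math


def neighbors(cell, grid):
    """Yield the free four-connected neighbors of a (row, col) cell."""
    rows, cols = len(grid), len(grid[0])
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
            yield (nr, nc)


def join_paths(sides, meet):
    """Backtrack from the meeting node through both parent maps."""
    path, node = [], meet
    while node is not None:          # meet -> start
        path.append(node)
        node = sides[0]["parent"][node]
    path.reverse()                   # start -> meet
    node = sides[1]["parent"][meet]
    while node is not None:          # node after meet -> goal
        path.append(node)
        node = sides[1]["parent"][node]
    return path


def bidirectional_a_star(grid, start, goal):
    """Alternate forward and backward A* expansions until the searches meet."""
    if start == goal:
        return [start]
    sides = []
    for origin, target in ((start, goal), (goal, start)):
        sides.append({
            "open": [(math.dist(origin, target), origin)],  # heap of (f, node)
            "g": {origin: 0.0},
            "parent": {origin: None},
            "closed": set(),
            "target": target,
        })
    turn = 0
    while sides[0]["open"] and sides[1]["open"]:
        this, other = sides[turn], sides[1 - turn]
        _, current = heapq.heappop(this["open"])
        if current in this["closed"]:
            continue                 # stale heap entry
        this["closed"].add(current)
        if current in other["closed"]:
            return join_paths(sides, current)
        for nxt in neighbors(current, grid):
            new_g = this["g"][current] + 1.0
            if nxt in this["closed"] or new_g >= this["g"].get(nxt, math.inf):
                continue
            this["g"][nxt] = new_g
            this["parent"][nxt] = current
            f = new_g + math.dist(nxt, this["target"])
            heapq.heappush(this["open"], (f, nxt))
        turn = 1 - turn
    return None                      # one frontier ran out: no path
```

Calling `bidirectional_a_star(grid, (0, 0), (4, 4))` returns a list of cells from start to goal, or `None` when the two points are not connected.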
4.4 Improved Heuristic Function
One issue to consider is that the two search directions of the bidirectional A* algorithm may fail to meet during pathfinding, so that the algorithm gets stuck in a loop and ultimately fails to find a path. To solve this problem, the artificial potential field method is introduced. By incorporating the gravitational and repulsive effects of the artificial potential field, a new evaluation function is provided for the bidirectional search. This approach reduces the search in ineffective areas and improves the search efficiency of the algorithm. The main idea of the artificial potential field method is to simulate the action of a force field: each obstacle is treated as a repulsion point that generates a repulsive force preventing the robot from moving toward it, and the target point is treated as an attraction point that generates an attractive force pulling the robot toward it. The robot then moves in the direction of the resultant force under the combined potential field (Fig. 8). In the evaluation function, the original heuristic function h(n) of the A* algorithm is replaced by h_1(n) + h_2(n), where h_1(n) still represents the estimated value from the current point to the target point, and h_2(n) represents the potential field value at that point. Introducing the artificial potential field brings the focus of the search closer to the target point, but if only the artificial potential field is used, the
Fig. 8 The force field action of artificial potential field
search will converge to the local optimal solution too early, and it will be difficult to find the global optimal solution. To solve this problem, a weighted approach can be used to combine the artificial potential field method and the A* algorithm, allowing the pathfinding process to speed up the search in the early stage and expand the search area in the later stage. The specific formulas are as follows:
\[ f(n) = g(n) + \omega_1 h_1(n) + \omega_2 h_2(n) \tag{3} \]
\[ h_2(n) = P_{gra}(x) + P_{rep}(x) \tag{4} \]
\[ P_{gra}(x) = \frac{1}{2} k_{gra} (x_n - x_g)^2 \tag{5} \]
\[
P_{rep}(x) =
\begin{cases}
\dfrac{1}{2} k_{rep} \left( \dfrac{1}{x_n - x_{obs}} - \dfrac{1}{\rho_0} \right)^2, & x_n - x_{obs} \le \rho_0 \\
0, & x_n - x_{obs} > \rho_0
\end{cases}
\tag{6}
\]
where P_gra(x) denotes the gravitational field, P_rep(x) denotes the repulsive field, x_n is the robot position, x_g is the target point position, x_obs is the obstacle position, k_gra is the gravitational force coefficient, k_rep is the repulsion coefficient, and ρ_0 is the action distance of a single obstacle.
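To make Eqs. (3) to (6) concrete, the following sketch evaluates the weighted function for a single node. Several points are assumptions rather than statements from the paper: the position differences are interpreted as Euclidean distances, the repulsive terms of several obstacles are summed, and the default coefficients, weights, and function names are hypothetical.

```python
import math


def attractive_potential(pos, goal, k_gra=1.0):
    """Eq. (5): quadratic well centered on the goal (distance assumed Euclidean)."""
    return 0.5 * k_gra * math.dist(pos, goal) ** 2


def repulsive_potential(pos, obstacle, k_rep=1.0, rho0=2.0):
    """Eq. (6): nonzero only within the action distance rho0 of the obstacle."""
    d = math.dist(pos, obstacle)
    if d > rho0:
        return 0.0
    if d == 0.0:
        return math.inf  # standing on the obstacle itself
    return 0.5 * k_rep * (1.0 / d - 1.0 / rho0) ** 2


def evaluation(g, pos, goal, obstacles, w1=1.0, w2=0.1):
    """Eq. (3): f = g + w1*h1 + w2*h2, with h2 from Eq. (4).

    Summing the repulsive term over all obstacles is an assumption of this sketch.
    """
    h1 = math.dist(pos, goal)
    h2 = attractive_potential(pos, goal) + sum(
        repulsive_potential(pos, o) for o in obstacles
    )
    return g + w1 * h1 + w2 * h2
```

With these assumed defaults, a node at (0, 0) with goal (3, 4) and one obstacle at (1, 0) has h_1 = 5, an attractive potential of 12.5, and a repulsive potential of 0.125.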
Specifically, in the early search period, the weight factor ω_1 can be increased to bias the search toward the target point, thus speeding up the search. In the later search period, ω_1 can be reduced and ω_2 can be increased to widen the search range, so that the global optimal solution can be explored better.
5 Third-Order Bézier Curve to Improve Path Smoothness
The above improvement strategies make the mobile robot safer and greatly improve the search efficiency, but the planned path is still not smooth enough. Excessive turning angles can easily occur while the mobile robot is driving, so the driving path needs to be smoothed. The third-order Bézier curve has the advantages of convexity preservation and smoothness and is widely used in the field of path planning. The specific equations are as follows:
\[ B(t) = \sum_{i=0}^{n} \binom{n}{i} p_i (1 - t)^{n-i} t^i, \quad 0 < t < 1 \tag{7} \]
\[ b(t) = p_0 (1 - t)^3 + 3 p_1 t (1 - t)^2 + 3 p_2 t^2 (1 - t) + p_3 t^3, \quad 0 < t < 1 \tag{8} \]
where Eq. (7) is the general Bézier curve expression, Eq. (8) is the third-order Bézier curve, and p_i (i = 0, 1, ..., n) are the given trajectory points. As t changes continuously, a smooth and complete driving path of the mobile robot is traced out. In practice, a third-order Bézier curve is introduced to improve the smoothness of the generated robot path. The path between the start and goal points is sampled at a fixed grid interval (1/2 grid length here), and a third-order Bézier curve transformation is applied to each detected fold line to convert it into a curve with a certain curvature. Collisions with the obstacle grid are checked using a circle centered on the robot's position with a radius of 1/3 of the grid length.
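As an illustration of Eq. (8), the sketch below evaluates a third-order Bézier curve for points of any dimension. The function names and the uniform sampling scheme are hypothetical; the sampling includes the endpoints t = 0 and t = 1 so that the curve starts and ends exactly on p_0 and p_3.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate Eq. (8) at parameter t for control points given as tuples."""
    u = 1.0 - t
    return tuple(
        u**3 * a + 3 * u**2 * t * b + 3 * u * t**2 * c + t**3 * d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def sample_curve(p0, p1, p2, p3, samples=20):
    """Return `samples` points along the curve, endpoints included."""
    return [
        cubic_bezier(p0, p1, p2, p3, i / (samples - 1))
        for i in range(samples)
    ]
```

For the control points (0, 0), (0, 1), (1, 1), (1, 0), the midpoint of the curve at t = 0.5 is (0.5, 0.75): the sharp corners of the fold line are replaced by a rounded arc that stays inside the convex hull of the control points.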
6 Simulation Results
To verify the validity and effectiveness of this study, a simulation is carried out. For the global path planning of the mobile robot, the traditional A* algorithm and the improved A* algorithm are simulated and compared in a relatively complex grid environment with a map of 30 × 15 unit grids, a robot movement radius of 1 unit grid, a start position of [1, 1], and a goal position of [29, 14]. To verify the improved A* algorithm, each step of the improvement process is compared and analyzed in turn. The corresponding simulation diagrams are shown in Fig. 9, where the green cell is the start point, the red cell is the goal point, and the yellow line is the best path planning solution. The simulation results are listed in Table 1.

Fig. 9 Algorithm simulations

Table 1 Simulation experiment data
Algorithm                               Path length   Search time   Traversal numbers
Traditional A* algorithm                42            1.73          643
Bidirectional A* algorithm              42            1.06          379
A* with improved search neighborhood    35.56         1.16          419
Improved BA*-APF hybrid algorithm       35.56         0.82          262

According to the simulation results, compared with the traditional A* algorithm, the improved bidirectional A* algorithm reduced the search time by 37.6% and the search area by 41%; the A* algorithm with the improved search neighborhood reduced the path length by 15.3%, the search time by 8.3%, and the search area by 34.8%; and the DBSCAN-based clustering and improved BA*-APF hybrid algorithm reduced the path length by 15.3%, the search time by 51.8%, and the search area by 59.3%. The effectiveness of the algorithm has been verified.
7 Conclusion
To solve the problems of long search time, slow convergence, and unsafe paths in the traditional A* algorithm, this paper proposes a DBSCAN-based clustering and improved BA*-APF hybrid algorithm. The algorithm first clusters the obstacles with the DBSCAN algorithm to reduce the environmental complexity, then improves the search neighborhoods and introduces simultaneous bidirectional search to improve the search efficiency and ensure path safety. Because the two directions of the simultaneous bidirectional search may fail to meet for a long time as the map expands, the artificial potential field method is introduced to improve the heuristic function and further improve the convergence speed. Finally, a third-order Bézier curve is used to improve the smoothness of the path. The simulation results show that the proposed algorithm effectively improves the search efficiency of the mobile robot and yields safer and smoother paths that meet the requirements of path planning, which has practical significance. However, this paper only considers path planning in a two-dimensional static environment and cannot effectively handle problems such as obstacle avoidance in dynamic environments, which will be the focus of future work.
Method and Testing of Shaft Angle Digital Conversion Based on Improved CORDIC Algorithm
Sixian Sun, Qijia Zheng, Liping Ren, Linxue An, Yufeng He, and Shuyue Han
Abstract To address the large number of iterations of the classic CORDIC algorithm and the resulting poor real-time performance of shaft angle conversion, a shaft angle digital conversion method based on an improved CORDIC algorithm is proposed. The method combines the improved CORDIC algorithm with a closed-loop tracking algorithm to achieve high-precision shaft angle digital conversion. On this basis, an FPGA based shaft angle conversion testing system was built, and the method was verified in application. The experimental results show that the shaft angle digital conversion method is correct and reliable and has better dynamic characteristics.
Keywords CORDIC algorithm · Motor control · Axis angle digital conversion
S. Sun (B) · Q. Zheng · L. Ren · L. An · Y. He · S. Han
Beijing Precision Electromechanical Control Equipment Research Institute, Aerospace Servo Drive and Transmission Technology Laboratory, Beijing 100076, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_48

1 Introduction
In servo systems, a resolver is often used for position measurement. In addition to its small error and high accuracy, the resolver has advantages that ordinary sensors cannot match, such as small size, simple structure, good reliability, and being essentially maintenance free. As a precision angle sensing element, the resolver (rotary transformer) also offers good impact resistance, strong anti-interference ability, and adaptability to high-speed operation. It is often used where vibration resistance requirements are high, such as military equipment for aviation, aerospace, radar, tanks, and ground gun fire control, and it can also be used in civil equipment such as digital control machine tools and robots [1]. Traditional shaft angle measurement usually uses peak sampling, coherent demodulation, or dedicated shaft angle digital conversion chips. These methods either have insufficient measurement accuracy or are expensive to implement, making it difficult to meet the requirements of low-cost and integrated applications. The CORDIC axis angle conversion scheme based on FPGA can fully utilize the rich hardware resources of the FPGA: it not only eliminates the expensive and bulky resolver decoding chip, but also implements the iteration of the CORDIC algorithm using only shift, addition, and subtraction operations [2]. The classic CORDIC algorithm has as many as 16 iterations [3], and the real-time performance of axis angle conversion is poor. In response to this issue, this article proposes a shaft angle digital conversion method based on an improved CORDIC algorithm, which reduces the number of iterations through “region transformation” and “shaft angle jump”, improves the real-time performance of shaft angle conversion, and achieves high-precision shaft angle digital conversion through a closed-loop tracking algorithm. The improved CORDIC algorithm is simulated and analyzed to verify its correctness, and a comparative experiment with the classic CORDIC algorithm is conducted on an FPGA based hardware platform for shaft angle conversion. The experimental results show that the improved CORDIC algorithm has better dynamic characteristics while ensuring conversion accuracy.
2 Principle of Shaft Angle Digital Conversion Method Based on Improved CORDIC Algorithm
The shaft angle digital conversion method based on the improved CORDIC algorithm is implemented through the improved CORDIC algorithm and a closed-loop tracking algorithm. This section explains the principle of the method in four parts: the rotating transformer principle, the CORDIC algorithm, the improved CORDIC algorithm, and the closed-loop tracking algorithm.
2.1 Principle of Rotating Transformer
The resolver consists of two parts: the excitation winding and the output winding. When an external excitation signal is applied to the excitation winding, the transformer rotor modulates and outputs a voltage signal on the output winding. Due to the special design of the rotor, the voltage amplitude of the output winding is generally in a sine or cosine relationship with the rotor angle. The basic principle of a rotary transformer is shown in Fig. 1. Regardless of the configuration, the calculation formula for the output voltages U_sin and U_cos of the resolver is the same, as shown in Eq. (1).
\[
\begin{aligned}
U_R &= E \sin\omega t \\
U_{\sin} &= E k \sin\omega t \, \sin\theta \\
U_{\cos} &= E k \sin\omega t \, \cos\theta
\end{aligned}
\tag{1}
\]
Fig. 1 Composition block diagram of rotating transformer (terminal labels R1, R2, S1, S2, S3, S4; outputs Usin and Ucos)
In the formula, θ is the shaft angle, ω is the rotor excitation frequency, E is the rotor excitation amplitude, and k is the transformation ratio of the resolver [4].
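The signal model of Eq. (1) can be checked numerically with a short sketch. The 10 kHz excitation frequency, the transformation ratio of 0.5, and the function names are assumptions chosen for illustration; the recovery step simply uses the sign of the excitation signal to remove the carrier before taking the arctangent, which is a simplified stand-in for the demodulation methods mentioned in the introduction.

```python
import math


def resolver_outputs(theta, t, E=1.0, k=0.5, omega=2 * math.pi * 10e3):
    """Return (U_R, U_sin, U_cos) from Eq. (1) at time t for shaft angle theta."""
    carrier = math.sin(omega * t)
    u_r = E * carrier
    u_sin = E * k * carrier * math.sin(theta)
    u_cos = E * k * carrier * math.cos(theta)
    return u_r, u_sin, u_cos


def angle_from_samples(u_r, u_sin, u_cos):
    """Recover the shaft angle in [0, 2*pi) from one sample taken near a carrier peak."""
    sign = 1.0 if u_r >= 0.0 else -1.0   # undo the carrier polarity
    return math.atan2(sign * u_sin, sign * u_cos) % (2.0 * math.pi)
```

Sampling at either the positive or the negative peak of the excitation returns the same angle, because the common factor E·k·sin(ωt) cancels in the ratio of the two outputs.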
2.2 CORDIC Algorithm
CORDIC is an iterative approximation method that approaches the target angle by successively rotating through a set of predetermined angles. In 1957, Jack Volder first proposed the CORDIC algorithm and applied it to digital processing for real-time navigation. John Walther improved and popularized Volder's algorithm and put forward the unified CORDIC algorithm in 1971, which has a wider application range covering trigonometric, inverse trigonometric, hyperbolic, and other transcendental functions. As shown in Table 1, the CORDIC algorithm includes a rotation mode and a vectoring mode, each of which can operate in the circular, linear, and hyperbolic coordinate systems, giving six different results [5]. This article uses the rotation mode in the circular coordinate system for the derivation. The system performs a plane coordinate rotation, as shown in Fig. 2. A new vector (X_j, Y_j) is obtained by rotating the vector (X_i, Y_i) counterclockwise by θ_n:
\[
\begin{aligned}
X_j &= X_i \cos\theta_n - Y_i \sin\theta_n \\
Y_j &= X_i \sin\theta_n + Y_i \cos\theta_n
\end{aligned}
\tag{2}
\]
Here R is the radius of the circle and θ_n is the rotation angle. The iterative formula can then be obtained:
\[
\begin{bmatrix} x_j \\ y_j \end{bmatrix}
=
\begin{bmatrix} \cos\theta_n & -\sin\theta_n \\ \sin\theta_n & \cos\theta_n \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \end{bmatrix}
= \cos\theta_n
\begin{bmatrix} 1 & -S_n \tan\theta_n \\ S_n \tan\theta_n & 1 \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \end{bmatrix}
\tag{3}
\]
Table 1 Operator values for 16 iterations
Iterations   tan θ       θ (°)        cos θ
1            1           45           0.7071068
2            0.5         26.565051    0.8944271
3            0.25        14.036244    0.9701425
4            0.125       7.1250163    0.9922779
5            0.0625      3.5763344    0.9980526
6            0.03125     1.7899106    0.9995121
7            0.015625    0.8951737    0.999878
8            0.007813    0.4476141    0.9999695
9            0.003906    0.2238105    0.9999924
10           0.001953    0.1119057    0.9999981
11           0.000977    0.0559529    0.9999995
12           0.000488    0.0279765    0.9999999
13           0.0002441   0.0139882    0.9999999
14           0.0001221   0.0069941    0.9999999
15           0.000061    0.0034971    0.9999999
16           0.0000305   0.0017485    0.9999999

Fig. 2 Circular system plane coordinate rotation
S_n is the direction operator, positive when rotating counterclockwise and negative when rotating clockwise. Considering θ_n = arctan(1/2^n), the matrix can be rewritten as
\[
\begin{bmatrix} x_j \\ y_j \end{bmatrix}
= \cos\theta_n
\begin{bmatrix} 1 & -S_n 2^{-n} \\ S_n 2^{-n} & 1 \end{bmatrix}
\begin{bmatrix} x_i \\ y_i \end{bmatrix}
\tag{4}
\]
After 16 rotations, the values of tan θ, θ, and cos θ are shown in Table 1. From Table 1, it can be seen that the accuracy of the classic CORDIC algorithm can reach 0.0017°, and after 16 iterations the product of the cos θ factors converges to the constant k = 0.60725. The rotation angle is
\[ \theta = \sum_{i=0}^{15} S_i \theta_i \]
Making X_{16} = k U_R cos θ and Y_{16} = k U_R sin θ, the angle to be calculated is θ. The direction operator S_n has the opposite sign to y_n. Making y approach 0 and iterating according to Eq. (4), the result
\[ \theta = -\sum_{i=0}^{15} S_i \theta_i \]
is obtained after 16 iterations. From Table 1, it can be seen that the maximum angle range that can be calculated with 16 iterations is −99.88° to 99.88°.
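The iteration that drives y toward 0 can be sketched in floating point as follows. This illustrates the classic 16-iteration algorithm only, not the improved one; a hardware implementation would use fixed-point shifts instead of multiplications by 2^-n, and the names `ANGLES`, `GAIN`, and `cordic_arctan` are hypothetical.

```python
import math

# Elementary rotation angles theta_n = arctan(2^-n) in degrees (the theta column of Table 1).
ANGLES = [math.degrees(math.atan(2.0 ** -n)) for n in range(16)]

# Product of the cos(theta_n) factors, approximately 0.60725.
GAIN = math.prod(math.cos(math.radians(a)) for a in ANGLES)


def cordic_arctan(x, y, iterations=16):
    """Return the angle of (x, y) in degrees by rotating until y is close to 0.

    Valid when the true angle lies within about +/-99.88 degrees.
    """
    angle = 0.0
    for n in range(iterations):
        s = -1.0 if y > 0.0 else 1.0          # S_n has the opposite sign to y_n
        shift = 2.0 ** -n
        x, y = x - s * y * shift, y + s * x * shift   # Eq. (4) without the cos factor
        angle -= s * ANGLES[n]                # theta = -sum(S_i * theta_i)
    return angle
```

Because only the angle is needed, the cos θ_n scaling of Eq. (4) is omitted here: it changes the length of the vector but not its direction.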
2.3 Improved CORDIC Algorithm
This article adopts the methods of “region transformation” and “angle jump” to reduce the number of iterations of the classic CORDIC algorithm. The “region transformation” method preprocesses the input coordinates to limit the input to a smaller range, and “angle jump” is then used to reduce the number of iterations, improving the real-time performance of the algorithm.
The classic CORDIC algorithm first preprocesses the input coordinates (X, Y), transforming them into the first and fourth quadrants, and then iterates. The “region transformation” divides the circumference into 8 regions, as shown in Fig. 3. After preprocessing, the input coordinates are transformed into [0, π/4], and the transformation relationship is shown in Table 2. The first iteration is cancelled through “region transformation”. Applying the iterative process of Fig. 2 after the “region transformation” extends the arctangent operation of the CORDIC algorithm to all quadrants (Fig. 4).
The classic CORDIC algorithm needs to go through 16 iterations. The “angle jump” compares the y-coordinate of the input pseudo coordinate with the y-coordinate corresponding to each rotation angle to obtain the rotation angle that brings the input coordinate closest to the target coordinate after iteration. It skips unnecessary coordinate iterations and directional rotations, allowing the coordinate to rotate along the optimal path and greatly reducing the number of iterations in the algorithm. The vertical coordinates corresponding to the rotation angles of the remaining 15 iterations are shown in Table 3. After the iteration is completed, θ can be calculated from θ′ using the relationship given in Table 2 (Fig. 5).

Fig. 3 Dividing the circumference region (regions A to H, each spanning π/4)

Table 2 “Region transformation”
Before transformation                    After transformation
Region   Coordinate   θ                  Region   Coordinate    θ
B        (X, Y)       π/2 − θ′           A        (Y, X)        θ′
C        (X, Y)       π/2 + θ′           A        (Y, −X)       θ′
D        (X, Y)       π − θ′             A        (−X, Y)       θ′
E        (X, Y)       π + θ′             A        (−X, −Y)      θ′
F        (X, Y)       3π/2 − θ′          A        (−Y, −X)      θ′
G        (X, Y)       3π/2 + θ′          A        (−Y, X)       θ′
H        (X, Y)       2π − θ′            A        (X, −Y)       θ′

Fig. 4 The algorithm flowchart of “region transformation”

Table 3 Rotation angle and coordinate correspondence table
Iterations i   tan θ_i     θ_i (°)      y_i = sin θ_i
1              0.5         26.565051    0.4472136
2              0.25        14.036244    0.2425356
3              0.125       7.1250163    0.1240347
4              0.0625      3.5763344    0.0623783
5              0.03125     1.7899106    0.0312348
6              0.015625    0.8951737    0.0156231
7              0.007813    0.4476141    0.0078123
8              0.003906    0.2238105    0.0039062
9              0.001953    0.1119057    0.0019531
10             0.000977    0.0559529    0.0009766
11             0.000488    0.0279765    0.0004883
12             0.0002441   0.0139882    0.0002441
13             0.0001221   0.0069941    0.0001221
14             0.000061    0.0034971    0.000061
15             0.0000305   0.0017485    0.0000305
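The “region transformation” step can be sketched as an octant fold followed by an unfold of the result. In this sketch the fold for each region is chosen so that the folded point lies in region A, i.e. it equals R(cos θ′, sin θ′) with θ′ in [0, π/4]; for region G this gives (−Y, X). The names `REGIONS`, `region_of`, and `arctan_full_circle` are hypothetical, and `math.atan2` merely stands in for the CORDIC iterations that would run on the folded coordinates.

```python
import math

# region -> (fold to region A, recover theta from theta')
REGIONS = {
    "A": (lambda x, y: (x, y),   lambda a: a),
    "B": (lambda x, y: (y, x),   lambda a: math.pi / 2 - a),
    "C": (lambda x, y: (y, -x),  lambda a: math.pi / 2 + a),
    "D": (lambda x, y: (-x, y),  lambda a: math.pi - a),
    "E": (lambda x, y: (-x, -y), lambda a: math.pi + a),
    "F": (lambda x, y: (-y, -x), lambda a: 3 * math.pi / 2 - a),
    "G": (lambda x, y: (-y, x),  lambda a: 3 * math.pi / 2 + a),
    "H": (lambda x, y: (x, -y),  lambda a: 2 * math.pi - a),
}


def region_of(x, y):
    """Classify (x, y) into one of the eight pi/4-wide regions A to H."""
    ax, ay = abs(x), abs(y)
    if y >= 0:
        if x >= 0:
            return "A" if ax >= ay else "B"
        return "C" if ay >= ax else "D"
    if x < 0:
        return "E" if ax >= ay else "F"
    return "G" if ay >= ax else "H"


def arctan_full_circle(x, y):
    """Angle of (x, y) in [0, 2*pi) computed from a first-octant arctangent."""
    fold, unfold = REGIONS[region_of(x, y)]
    fx, fy = fold(x, y)
    theta_prime = math.atan2(fy, fx)   # placeholder for the CORDIC iterations
    return unfold(theta_prime) % (2.0 * math.pi)
```

Because the folded angle never exceeds π/4 (45°), the first CORDIC rotation of 45° is no longer needed, which is consistent with Table 3 starting at 26.565051°.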
2.4 Closed-Loop Tracking Algorithm
Calculating the shaft angle with the open-loop CORDIC algorithm is prone to noise interference, which produces incorrect output results. For the improved CORDIC algorithm described above, the structure of the closed-loop tracking algorithm is shown in Fig. 6. First, the improved CORDIC algorithm performs the arctangent operation on the input signals U_sin and U_cos, and the calculation results are fed to a closed-loop tracking loop composed of a low-pass filter, a PI controller, and an integrator. The closed-loop tracking algorithm can quickly track the rotor angle with a small steady-state error.
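A discrete-time sketch of such a tracking loop is shown below. The structure (low-pass filter, PI controller, integrator) follows the description above, but the first-order filter form, the sample period, and the gains are assumptions chosen for illustration (K_p = 2ζω_n and K_i = ω_n² with ω_n = 200 rad/s and ζ = 1), and the class and method names are hypothetical.

```python
import math


def wrap(angle):
    """Wrap an angle difference into [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi


class TrackingLoop:
    """Low-pass filter + PI controller + integrator tracking a measured angle."""

    def __init__(self, ts, kp, ki, alpha):
        self.ts = ts            # sample period in seconds
        self.kp = kp            # proportional gain
        self.ki = ki            # integral gain
        self.alpha = alpha      # first-order low-pass coefficient, 0 < alpha <= 1
        self.err_f = 0.0        # filtered angle error
        self.integral = 0.0     # integral part of the PI output
        self.speed = 0.0        # estimated angular speed in rad/s
        self.theta = 0.0        # estimated angle in rad

    def update(self, theta_meas):
        err = wrap(theta_meas - self.theta)
        self.err_f += self.alpha * (err - self.err_f)          # low-pass filter
        self.integral += self.ki * self.err_f * self.ts        # PI: integral term
        self.speed = self.kp * self.err_f + self.integral      # PI: output = speed
        self.theta = (self.theta + self.speed * self.ts) % (2.0 * math.pi)  # integrator
        return self.theta
```

With a PI controller followed by an integrator the loop contains two integrations, so for a rotor turning at constant speed the angle error decays to zero and `speed` settles at the rotor speed.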
Fig. 5 Iterative algorithm process using “angle jump”
Fig. 6 Block diagram of improved closed-loop tracking algorithm
3 Simulation Analysis
3.1 Method Validation
According to the principle and algorithm flow of the improved CORDIC closed-loop algorithm in Sect. 2, a simulation model is designed, as shown in Figs. 7 and 8. Figure 9 shows the simulation results of the shaft angle digital conversion, where the input angle θ is a sawtooth wave with an amplitude of 360° and a period of 10 ms, and \hat{θ} is the output of the shaft angle conversion. It can be seen that this algorithm can accurately calculate the position of the motor rotor.
3.2 Performance Comparison
The anti-interference performance of the improved CORDIC closed-loop algorithm is basically consistent with that of the classic CORDIC closed-loop algorithm. In the three-loop control system of a permanent magnet synchronous motor based on SVPWM control, the improved CORDIC closed-loop algorithm and the classic CORDIC closed-loop algorithm are used in turn to replace the resolver decoding. The dynamic characteristics of the output results of the two algorithms are evaluated using a step input. The two algorithms use the same PI regulator parameters, and the response curves are shown in Fig. 10. From the results in Fig. 10, it can be seen that the improved CORDIC closed-loop algorithm has better dynamic characteristics than the classic CORDIC closed-loop algorithm.
Fig. 7 Simulation model for shaft angle conversion based on the improved
Fig. 8 Improved CORDIC algorithm model
Fig. 9 Simulation results
Fig. 10 Response curves of two algorithms to step input
Based on the above analysis, the angle output of the improved CORDIC closed-loop algorithm is consistent with the theoretical calculation results, and its dynamic performance is better than that of the classic CORDIC closed-loop algorithm.
4 Shaft Angle Conversion Test Based on FPGA
4.1 FPGA Based Shaft Angle Conversion Testing System
The shaft angle conversion testing system based on FPGA includes the DSP and its peripheral circuits, the FPGA and its peripheral circuits, signal conditioning circuits, driving circuits, an external A/D converter, and the upper computer. The DSP performs phase current and line displacement sampling and closed-loop control; the external A/D converter acquires the magnetic signals; the FPGA handles 1553B communication with the upper computer, shaft angle conversion based on the improved CORDIC algorithm, and multi-channel servo control PWM signal expansion, as shown in Fig. 11.
4.2 Application Verification Based on Improved CORDIC Closed-Loop Algorithm
In the FPGA based shaft angle conversion testing system, the upper computer sends periodic position commands to the system through the 1553B bus. The testing system feeds back the shaft angle conversion results of the improved CORDIC algorithm to the upper computer, and outputs the testing curve on the upper computer. The curve of the displacement command and the position feedback curve calculated from the rotor angle are shown in Fig. 12, and the corresponding relationship between the command and feedback is correct. The shaft angle conversion based on the improved CORDIC algorithm is correct.
Fig. 11 Block diagram of FPGA based shaft angle conversion testing system
Fig. 12 Curves of the displacement command and the position feedback
In the FPGA based shaft angle conversion testing system, the improved CORDIC algorithm and the classic CORDIC closed-loop algorithm are respectively used. The response curves under the same position step input are shown in Fig. 13. From the comparison results in Fig. 13, it can be seen that in the closed-loop control system of permanent magnet synchronous motor, the improved CORDIC closed-loop algorithm has better dynamic characteristics compared to the classic CORDIC closed-loop algorithm.
5 Conclusion
This article proposes a shaft angle digital conversion method based on the improved CORDIC algorithm combined with a closed-loop tracking algorithm. Through modeling and simulation, the correctness of the improved CORDIC closed-loop algorithm is verified, and the anti-interference and dynamic characteristics of the two algorithms are compared. A shaft angle conversion testing system based on FPGA has been built and used for application verification. The results show that the shaft angle conversion method is correct and reliable, and that it can effectively improve the dynamic characteristics of the servo system.
Fig. 13 Comparison of transient characteristics: (a) the classic CORDIC closed-loop algorithm; (b) the improved CORDIC closed-loop algorithm
References
1. Liu, B., Liao, Y., He, Z.: Research and system design of FPGA based rotary transformer decoding algorithm. Micromotors 40(12), 48–51 (2007)
2. Song, X., Zhu, H., Wang, W.: Research on the decoding algorithm of rotating transformer based on CORDIC. Electron. Measur. Technol. 33(6), 39–43 (2010)
3. Yang, H.: Implementation of CORDIC algorithm based on FPGA. J. Xi’an Univ. Posts Telecommun. 13(1), 75–77 (2008)
4. Ma, L., Jia, X., Chen, S.: Optimization design of decoding algorithm for rotating transformer of permanent magnet synchronous motor. Motor Control Appl. 48(2), 31–35, 44 (2021)
5. Wang, Q., Ying, H.: The CORDIC algorithm for arctangent function and its FPGA implementation. Military Autom. 39(6), 45–48 (2020)
6. Weipeng, Zhang, Ruifeng, Yang, Chenxia, Guo, Shuangchao, Ge: A high-precision rotary transformer decoding algorithm. Sci. Technol. Eng. 19(24), 157–163 (2019)
7. Hao, Shuai, Ruifeng, Yang, Chenxia, Guo: Research on digital decoding algorithms for rotating transformers. Chin. Sci. Technol. Paper 9(10), 1192–1196 (2014)
8. Sahu, N., Londhe, N.D., Kshirsagar, G.B.: FPGA applications in inverter and converter circuits: a review on technology, benefits and challenges. In: 2017 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), Coimbatore (2017)
9. Zhang, J., Liang, F., Liu, N.: FPGA implementation of an improved CORDIC algorithm. Microelectron. Comput. 27(11), 181–184 (2010)
10. Wang, X.: Design and implementation of CORDIC algorithm based on FPGA. In: 2018 International Conference on Robots and Intelligent System (ICRIS), Changsha (2018)
11. Wu, Y., Guo, L., Li, G.: Research on FPGA based CORDIC decoding algorithm for rotary transformers. Meas. Control Technol. 37(2), 38–41, 46 (2018)
12. Yang, R., Zhang, W., Guo, C., et al.: Research on error suppression and decoding techniques for rotating transformers. Micromotors 53(2), 56–60, 66 (2020)
13. Pan, M., Wang, H.: Design of a high-precision and high-speed tracking shaft angle converter. Power Electron. Technol. 55(8), 58–59 (2021)
14. Cunsheng, Z., Dexue, Z., Chao, W., Xuesen, H., Ji, Z., Feifei, D.: Improvement of CORDIC algorithm without scaling factor and FPGA implementation. China Integr. Circuit J. 26(03), 62–66 (2017)
Multinomial Regression with Group Structure for Screening Biomarkers of Breast Cancer Chenxi Xi, Fugen Gao, and Juntao Li
Abstract This paper is devoted to diagnosing breast cancer subtypes and screening miRNA biomarkers by combining multinomial regression with group structure and Cox regression. First, the Leiden algorithm was employed to obtain the group structure of the miRNAs. Then, a new feature importance evaluation criterion was built based on the ranking of the miRNAs' joint mutual information and its changing trend. The experimental results demonstrated that the proposed method outperformed six other methods in terms of diagnosis accuracy, and 16 miRNAs were identified as biomarkers for breast cancer.
Keywords Breast cancer · Group structure · Multinomial regression
C. Xi · F. Gao · J. Li (B), College of Mathematics and Information Science, Henan Normal University, Xinxiang 453007, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_49
1 Introduction
Breast cancer is a prevalent cancer worldwide with numerous subtypes. According to a report by the American Cancer Society, its incidence has increased steadily over the past forty years [1]. Growing evidence shows that these subtypes differ from one another and that the differences can be used to predict the clinical survival of patients [2]. MiRNAs are small non-coding RNAs that regulate target genes post-transcriptionally and have immense potential in cancer diagnosis, treatment, and prognosis [3]. MiRNA biomarkers can offer a basis for early disease diagnosis and prognosis assessment, promoting precision medicine and individualized disease management [4]. Various machine learning methods have been widely applied to the diagnosis and feature screening of breast cancer subtypes. Sarkar et al. employed a machine learning integrated ensemble of feature selection methods followed by survival analysis to predict miRNA biomarkers for breast cancer subtypes [5]. Li et al. identified 124
miRNAs by employing an ensemble regularized multinomial logistic regression and screened out 22 biomarkers using Cox regression [6]. However, previous methods often neglected the group effects among miRNAs. From a biological perspective, miRNAs exhibit group effects: they exert not only individual effects but also synergistic or inhibitory effects through interactions, thereby influencing the regulation of biological processes [7]. While weighted gene co-expression network analysis (WGCNA) and local maximal quasi-clique merger (lmQCM) are widely used grouping algorithms, they are better suited to large datasets with explicit network structures and are not applicable to miRNA grouping [8]. Therefore, it is necessary to explore algorithms suitable for miRNA grouping in order to fully leverage the group effects among miRNAs. Motivated by community connectivity and micro complex modeling [9–11], we used the Leiden algorithm to group miRNAs and obtain group structures [12]. We then proposed the multinomial regression with group structure (MRWGS) and further screened miRNA biomarkers through Cox regression.
2 Problem Statement
We consider two miRNA expression datasets, mR-1 and mR-2, taken from reference [6]. The two datasets contain 124 and 296 miRNAs, respectively, and both include 86 Luminal A (LA), 39 Luminal B (LB), 24 HER2-Enriched (H2), 41 Basal-Like (BL), and 41 control samples, for a total of 231 samples. This study established a breast cancer subtype diagnosis model and performed biomarker screening on mR-1. The effectiveness of the model was validated on mR-2. Let $\{(x_1, y_1), (x_2, y_2), \ldots, (x_{231}, y_{231})\}$ represent the miRNA expression profiling data of mR-1, where $x_i = (x_{i1}, x_{i2}, \cdots, x_{i124})^T$ represents the expression levels of the miRNAs for the ith sample, and $y_i$ represents the subtype label corresponding to $x_i$. If the ith sample belongs to the LA, LB, H2, BL, or control subtype, $y_i$ takes the value 1, 2, 3, 4, or 5, respectively. From a machine learning perspective, the diagnosis of breast cancer subtypes can be cast as a five-class classification problem. The decision function
$$D(x) = \arg\max_{k=1,2,3,4,5} d_k(x) \tag{1}$$
is adopted to predict the label of a new sample, where $d_k(x) = \beta_{0k} + x^T \beta_k$ represents the kth linear discriminant function. Here, $\beta_{0k}$ denotes the threshold, and $\beta_k$ is the coefficient vector. This paper determines the decision function in (1) by developing a sparse regression model and then screens miRNA biomarkers based on survival data.
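The decision rule in (1) can be illustrated with a minimal Python sketch. The coefficient values and the two-feature input below are made-up toy numbers, not fitted parameters from the paper, and the helper names `discriminant` and `decide` are hypothetical.

```python
def discriminant(x, beta0_k, beta_k):
    """Linear discriminant d_k(x) = beta_0k + x^T beta_k for one class k."""
    return beta0_k + sum(x_j * b_j for x_j, b_j in zip(x, beta_k))


def decide(x, beta0, beta):
    """Decision function D(x): the label (1..K) with the largest d_k(x)."""
    scores = [discriminant(x, b0, bk) for b0, bk in zip(beta0, beta)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best + 1  # labels are numbered from 1, as in the text


# Toy parameters for five classes and two features (illustrative only).
toy_beta0 = [0.0, 0.0, 0.0, 0.0, 0.0]
toy_beta = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [0.5, 0.5]]
label = decide([2.0, 0.5], toy_beta0, toy_beta)
```

With real data, `toy_beta0` and `toy_beta` would be replaced by the fitted thresholds and coefficient vectors of the five classes.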
3 Methods
3.1 Feature Grouping via Leiden Algorithm
The Leiden algorithm is a graph-based community detection algorithm that can identify group structures in small datasets [12]. It takes the similarity between nodes into account and assigns them to communities so as to maximize intra-community similarity and minimize inter-community similarity. The Leiden algorithm was used to obtain miRNA communities, which were treated as groups of miRNAs. Let $X = (x_1, x_2, \ldots, x_{231})^T$ be the feature matrix of mR-1, constructed from the expression values of 124 miRNAs in 231 samples. To obtain the miRNA groups, we performed principal component analysis (PCA) on highly variable miRNAs and constructed a shared nearest neighbor graph based on the top ten principal components. The Leiden algorithm was then applied to divide the miRNAs into communities by maximizing the modularity score
$$Q = \frac{1}{2m} \sum_{i,j} \left( A_{ij} - \gamma \frac{k_i k_j}{2m} \right) \delta(c_i, c_j), \tag{2}$$
where m denotes the total number of edges in the shared nearest neighbor graph, $A_{ij}$ represents the weight of the edge between miRNAs i and j, $\gamma > 0$ is the resolution parameter, and $k_i$ and $k_j$ are the degrees of the ith and jth miRNAs, respectively. In addition, $c_i$ represents the community assigned to the ith miRNA, and the function $\delta$ equals 1 if $c_i = c_j$ and 0 otherwise. The Leiden algorithm iteratively refines an initial partition to maximize the modularity score Q in (2). The algorithm was implemented with the R package leidenAlg.
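To make the modularity score in (2) concrete, the following Python sketch evaluates Q for a given partition of a small undirected graph. It only scores a partition; it does not implement the Leiden optimization itself, and the toy graph is an assumption chosen for illustration.

```python
def modularity(adj, communities, gamma=1.0):
    """Modularity Q of a partition, following Eq. (2).

    adj         -- symmetric adjacency (weight) matrix as a list of lists
    communities -- communities[i] is the community label of node i
    gamma       -- resolution parameter
    """
    n = len(adj)
    degree = [sum(row) for row in adj]
    two_m = sum(degree)  # equals 2m for an undirected graph
    total = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:  # delta(c_i, c_j) = 1
                total += adj[i][j] - gamma * degree[i] * degree[j] / two_m
    return total / two_m


# Toy graph: two disconnected edges, 0-1 and 2-3.
toy_adj = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
q_split = modularity(toy_adj, [0, 0, 1, 1])   # each edge is its own community
q_merged = modularity(toy_adj, [0, 0, 0, 0])  # everything in one community
```

Placing each connected pair in its own community scores higher than merging all nodes, which is the behaviour the maximization relies on.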
3.2 Feature Importance Evaluation
Four groups were obtained by applying the Leiden algorithm. Correspondingly, the feature matrix X is represented as $\bar{X} = (\bar{X}^1, \bar{X}^2, \bar{X}^3, \bar{X}^4)$, where $\bar{X}^l$ is the submatrix corresponding to the lth group. Let $p_l$ denote the number of miRNAs in the lth group. The column vectors $\bar{X}_i^l$, $\bar{X}_j^l$, and $\bar{X}_k^l$ represent the expression values of the ith, jth, and kth miRNAs within $\bar{X}^l$, respectively. The importance of the miRNAs within the groups was evaluated based on mutual information and conditional mutual information. Let $I(\bar{X}_i^l; \bar{X}_j^l)$ and $I(\bar{X}_i^l; \bar{X}_j^l \mid \bar{X}_k^l)$ represent the mutual information and the conditional mutual information between $\bar{X}_i^l$ and $\bar{X}_j^l$, respectively. Similar to reference [13], we adopted the joint mutual information $(w3)_k^l$ of the kth miRNA in the lth group:
Fig. 1 (a) Trend chart of the joint mutual information of all miRNAs against sort number. (b) Local trend chart of the miRNA joint mutual information against sort number
$$(w3)_k^l = \frac{(w1)_k^l \times \max_k (w2)_k^l}{\max_k (w1)_k^l + \max_k (w2)_k^l} + \frac{(w2)_k^l \times \max_k (w1)_k^l}{\max_k (w1)_k^l + \max_k (w2)_k^l}, \tag{3}$$
where
$$(w1)_k^l = \frac{1}{(p_l - 1)^2} \sum_{i=1, i \neq k}^{p_l} \sum_{j=1, j \neq k}^{p_l} I(\bar{X}_i^l; \bar{X}_j^l \mid \bar{X}_k^l), \qquad (w2)_k^l = \frac{1}{(p_l - 1)^2} \sum_{i=1, i \neq k}^{p_l} \sum_{j=1, j \neq k}^{p_l} \left[ I(\bar{X}_i^l; \bar{X}_j^l \mid \bar{X}_k^l) - I(\bar{X}_i^l; \bar{X}_j^l) \right]_+ .$$
Let $w^l = ((w3)_1^l, (w3)_2^l, \ldots, (w3)_{p_l}^l)$ denote the joint mutual information of the lth group. The joint mutual information of all 124 miRNAs is represented as $w = (w^1, w^2, w^3, w^4)$. The elements of w were sorted in ascending order and numbered from 1 to 124. Figure 1(a) shows how the joint mutual information of all miRNAs changes with the sort number. The sorted values of w show an increasing trend with two obvious turning points, at numbers 50 and 53. To amplify the differences in importance among individual miRNAs, the importance $w_k^l$ of the kth miRNA within the lth group was built from the variation trend of the joint mutual information:
$$w_k^l = \begin{cases} t^2, & 1 \le t \le 50 \\ \dfrac{146377}{3} t - \dfrac{7311350}{3}, & 50 < t < 53 \\ t^3, & 53 \le t \le 124 \end{cases} \tag{4}$$
where t is an integer representing the sequential number of $(w3)_k^l$ in w; t = 50 marks the starting point of a sudden rapid growth, while t = 53 marks the endpoint of this rapid growth. Based on Fig. 1(b), the joint mutual information exhibits a rapid and
nearly linear increase between t = 50 and t = 53. We defined the importance of the miRNAs at sorting positions 51 and 52 by the linear equation that passes through the points $(50, 50^2)$ and $(53, 53^3)$.
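The two formulas of this subsection can be sketched in Python as follows. The sketch assumes that the per-miRNA quantities (w1) and (w2) of one group are already available as lists (estimating the mutual information itself is not shown), and the function names are hypothetical. Exact fractions are used so that the linear segment can be checked against its two end points.

```python
from fractions import Fraction


def joint_mutual_information(w1, w2):
    """Combine (w1)_k and (w2)_k of one group into (w3)_k, following Eq. (3)."""
    max_w1, max_w2 = max(w1), max(w2)
    return [(a * max_w2 + b * max_w1) / (max_w1 + max_w2) for a, b in zip(w1, w2)]


def importance(t):
    """Importance of the miRNA with sort number t (1..124), following Eq. (4)."""
    if not 1 <= t <= 124:
        raise ValueError("t must lie between 1 and 124")
    if t <= 50:
        return Fraction(t ** 2)
    if t >= 53:
        return Fraction(t ** 3)
    # Line through (50, 50^2) and (53, 53^3); its slope is 146377/3.
    slope = Fraction(53 ** 3 - 50 ** 2, 53 - 50)
    return Fraction(50 ** 2) + slope * (t - 50)


# The linear piece meets the quadratic and cubic pieces at its end points.
left_end = Fraction(146377 * 50 - 7311350, 3)
right_end = Fraction(146377 * 53 - 7311350, 3)
```

Evaluating the closed form of the middle branch at t = 50 and t = 53 returns 50² and 53³, which confirms the coefficients 146377/3 and 7311350/3 in (4).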
3.3 Model Building
To prevent information from the testing samples from being used during model training and to keep the class distribution balanced, we employed stratified sampling to randomly select four-fifths (185) of the samples as the training set; the remaining samples were designated as the testing set. According to criterion (4), we constructed the weight matrix $W^l$ of the miRNAs in the lth group:
$$W^l = \mathrm{diag}\left( \frac{1}{w_1^l}, \frac{1}{w_2^l}, \ldots, \frac{1}{w_{p_l}^l} \right). \tag{5}$$
By incorporating criterion (4) into the group lasso penalty, and combining the adaptive group lasso with the negative log-likelihood loss function, we proposed the multinomial regression with group structure (MRWGS):
$$\min_{\beta, \beta_0} \; L(\{\beta_{0k}, \beta_k\}_1^5) + (1 - \alpha)\lambda \sum_{k=1}^{5} \sum_{l=1}^{4} \|\beta_k^l\|_2 + \alpha\lambda \sum_{k=1}^{5} \sum_{l=1}^{4} \|W^l \beta_k^l\|_1, \tag{6}$$
where the negative log-likelihood loss function is defined as
$$L(\{\beta_{0k}, \beta_k\}_1^5) = -\frac{1}{185} \sum_{i=1}^{185} \left[ \sum_{k=1}^{5} y_{ik} \left( \beta_{0k} + x_i^T \beta_k \right) - \ln \sum_{k=1}^{5} e^{\beta_{0k} + x_i^T \beta_k} \right], \tag{7}$$
$y_{ik} = I(y_i = k)$ denotes the indicator function, $\alpha \in [0, 1]$ and $\lambda \ge 0$ are regularization parameters, $\beta = (\beta^1, \ldots, \beta^l, \ldots, \beta^4)$ represents the coefficient matrix, $\beta^l = (\beta_1^l, \ldots, \beta_k^l, \ldots, \beta_5^l)$, and $\beta_0 = (\beta_{01}, \ldots, \beta_{0k}, \ldots, \beta_{05})^T$ is the threshold vector. Simultaneously optimizing the model parameters $\alpha$ and $\lambda$ would lead to a significant computational burden. Therefore, we fixed $\alpha$ at each of the values 0.001, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9, and determined the optimal $\lambda$ through 10-fold cross-validation. To further identify biomarkers, this study employed the same strategy as Li et al., which involved Cox regression for biomarker screening. The Cox regression formula is
$$\frac{h(t)}{h_0(t)} = e^{\hat{\beta}_1 x^1 + \cdots + \hat{\beta}_{110} x^{110}}, \tag{8}$$
where $h(t)$ and $h_0(t)$ represent the hazard function and the baseline hazard function, respectively, and $(\hat{\beta}_1, \ldots, \hat{\beta}_{110})$ is the coefficient vector.
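The objective in (6) and (7) can be evaluated, though not minimized, with the short Python sketch below. It assumes class labels coded 0 to K-1 instead of 1 to 5, per-class coefficient lists, and groups given as lists of feature indices; these conventions and the function names are choices made for this illustration, not those of the msgl package.

```python
import math


def neg_log_likelihood(X, y, beta0, beta):
    """Multinomial negative log-likelihood of Eq. (7), averaged over samples.

    X: list of feature lists; y: labels in 0..K-1;
    beta0: K thresholds; beta[k]: coefficient list of class k.
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        scores = [b0 + sum(a * b for a, b in zip(x_i, bk))
                  for b0, bk in zip(beta0, beta)]
        top = max(scores)  # subtract the maximum for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[y_i] - log_sum_exp
    return -total / len(X)


def mrwgs_penalty(beta, groups, w, alpha, lam):
    """Penalty of Eq. (6): group l2 norms plus l1 norms weighted by 1/w_j."""
    group_term = 0.0
    l1_term = 0.0
    for beta_k in beta:          # loop over classes k
        for idx in groups:       # loop over groups l (lists of feature indices)
            group_term += math.sqrt(sum(beta_k[j] ** 2 for j in idx))
            l1_term += sum(abs(beta_k[j]) / w[j] for j in idx)
    return (1 - alpha) * lam * group_term + alpha * lam * l1_term


# With all coefficients at zero, every class is equally likely: loss = ln 5.
toy_X = [[1.0, 2.0], [0.5, -1.0]]
toy_y = [0, 3]
zero_loss = neg_log_likelihood(toy_X, toy_y, [0.0] * 5, [[0.0, 0.0]] * 5)
```

A larger importance `w[j]` shrinks the l1 weight 1/w_j, so miRNAs ranked as more important are penalized less.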
4 Experiment
To validate the effectiveness of MRWGS, we compared it with multinomial logistic regression with ridge penalty (MLR-R) [14], multinomial logistic regression with lasso penalty (MLR-L) [15], multinomial logistic regression (MLR) [16], random forest (RF) [17], support vector machine (SVM) [18], and Naive Bayes (NB) [19] on the two datasets mR-1 and mR-2. MRWGS was solved via the R package msgl. MLR, MLR-L, and MLR-R were executed with the R package glmnet. SVM and NB relied on the R package e1071. RF was implemented using the R package randomForest. Analysis of the two datasets showed that zero expression values accounted for 0.1% and 0.08% of mR-1 and mR-2, respectively. Additionally, the expression values of each miRNA were concentrated across samples. To further process the two datasets, non-expressed values were replaced with the average expression value of the corresponding miRNA across all samples. Applying the Leiden algorithm to mR-1 and mR-2 generated 4 and 5 feature groups, respectively. To evaluate the performance of each method reliably and avoid the randomness of a single experiment, we performed 50 repeated data partitions and calculated the average diagnosis accuracy (ADA) and variance (Var) of each method across these 50 partitions. Table 1 presents the ADA and Var of the seven methods in the 50 random data partition experiments on mR-1 and mR-2. The results indicate that MRWGS achieved higher diagnosis accuracy than the other six methods on both datasets. On mR-1, it outperformed MLR-R, MLR-L, MLR, RF, SVM, and NB by 0.30, 6.43, 14.78, 5.13, 3.43, and 4.87 percentage points, respectively. On mR-2, it surpassed MLR-R, MLR-L, MLR, RF, SVM, and NB by 3.04, 4.57, 14.26, 1.87, 3.04, and 5.30 percentage points, respectively.
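The repeated-partition protocol can be sketched in Python as below. The per-class rounding rule and the use of the population variance are assumptions of this sketch (the paper does not state either), and the classifier itself is left out; only the stratified four-fifths split and the ADA and Var summary are shown.

```python
import random
from statistics import mean, pvariance


def stratified_split(labels, train_frac=0.8, seed=0):
    """Split sample indices into train/test sets, class by class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    train, test = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        cut = round(train_frac * len(idxs))  # assumed rounding rule
        train += idxs[:cut]
        test += idxs[cut:]
    return sorted(train), sorted(test)


def summarize(accuracies):
    """Average diagnosis accuracy (ADA) and variance (Var) over partitions."""
    return mean(accuracies), pvariance(accuracies)


# Class sizes of mR-1: 86 LA, 39 LB, 24 H2, 41 BL, 41 control.
mr1_labels = [1] * 86 + [2] * 39 + [3] * 24 + [4] * 41 + [5] * 41
train_idx, test_idx = stratified_split(mr1_labels, seed=1)
```

With these class sizes, per-class rounding yields 185 training samples, matching the training-set size stated in Sect. 3.3. Repeating the split with 50 different seeds and passing the 50 test accuracies to `summarize` gives one (ADA, Var) pair per method.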
The margin over the best competing method was larger on mR-2 (1.87 percentage points over RF) than on mR-1 (0.30 percentage points over MLR-R). On mR-1, the highest ADA achieved by MRWGS in the 50 random data partitions was 0.9130. Using the parameters corresponding to this highest ADA, we performed
Table 1 ADA and Var of seven methods on different datasets; each entry is ADA (Var)
Method              mR-1               mR-2
MRWGS [Proposed]    0.8009 (0.0025)    0.7683 (0.0026)
MLR-R [14]          0.7978 (0.0032)    0.7378 (0.0020)
MLR-L [15]          0.7365 (0.0023)    0.7226 (0.0021)
MLR [16]            0.6530 (0.0032)    0.6257 (0.0036)
RF [17]             0.7496 (0.0021)    0.7496 (0.0019)
SVM [18]            0.7665 (0.0022)    0.7378 (0.0027)
NB [19]             0.7522 (0.0029)    0.7152 (0.0046)
feature screening and obtained a feature subset containing 110 miRNAs. Survival analysis was then conducted on this subset using the Cox regression model and the survival data, with the R package survival. The 16 miRNAs whose Cox regression coefficients had absolute values greater than 0.2 and p-values less than 0.05 were identified as biomarkers for breast cancer. Among these 16 miRNAs, 15 were also identified as biomarkers in the study by Li et al. What sets our study apart is the identification of hsa-miR-454-3p as a biomarker. This miRNA has previously been recognized as a biomarker for breast cancer and has been shown to be functionally active in breast cancer cell lines, exhibiting increased proliferation signaling and tumorigenic properties [20]. Among the remaining 15 miRNAs, hsa-let-7e-5p, hsa-miR-127-3p, hsa-miR-340-5p, hsa-miR-193b-5p, hsa-miR-744-5p, and hsa-let-7g-3p have previously been identified as biomarkers for breast cancer according to the literature. Additionally, hsa-miR-107, hsa-miR-27b-5p, hsa-miR-99b-3p, hsa-miR-130b-3p, hsa-miR-331-3p, hsa-miR-452-5p, hsa-miR-629-5p, hsa-miR-889-3p, and hsa-miR-1266-5p have been shown to be highly associated with breast cancer [6].
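The screening rule applied to the Cox regression output, together with the hazard ratio of (8), can be sketched in Python as follows. The miRNA names, coefficients, and p-values in `toy_results` are invented placeholders for illustration only; they are not results from the paper, and fitting the Cox model itself is not shown.

```python
import math


def hazard_ratio(coefs, x):
    """h(t) / h0(t) = exp(sum_j beta_j * x_j), as in Eq. (8)."""
    return math.exp(sum(b * v for b, v in zip(coefs, x)))


def screen_biomarkers(cox_results, min_abs_coef=0.2, max_p=0.05):
    """Keep features with |coefficient| > 0.2 and p-value < 0.05."""
    return [name for name, coef, p in cox_results
            if abs(coef) > min_abs_coef and p < max_p]


# Hypothetical (name, coefficient, p-value) triples, not real results.
toy_results = [
    ("miR-A", 0.35, 0.010),   # passes both thresholds
    ("miR-B", -0.42, 0.030),  # passes: the absolute value is used
    ("miR-C", 0.15, 0.001),   # fails: coefficient too small
    ("miR-D", 0.60, 0.200),   # fails: not significant
]
selected = screen_biomarkers(toy_results)
```

Both thresholds are strict inequalities, following the wording "greater than 0.2" and "less than 0.05".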
5 Conclusion
This paper proposed the multinomial regression with group structure for diagnosing breast cancer subtypes. Four and five feature groups were obtained by applying the Leiden algorithm to the two datasets mR-1 and mR-2, respectively. A criterion for evaluating feature importance was constructed from the joint mutual information and its variation trend. Experimental results demonstrated that MRWGS exhibited the highest diagnosis accuracy among the seven methods. Furthermore, a feature set containing 110 miRNAs was screened, and 16 miRNA biomarkers related to breast cancer were further identified through Cox regression.
Acknowledgements This work was supported by the Scientific and Technological Project of Henan Province (232102210066).
References
1. Giaquinto, A.N., Sung, H., Miller, K.D., Kramer, J.L., Newman, L.A., Minihan, A., Jemal, A., Siegel, R.L.: Breast cancer statistics, 2022. CA Cancer J. Clin. 72(6), 524–541 (2022). https://doi.org/10.3322/caac.21754
2. Yeo, S.K., Guan, J.L.: Breast cancer: multiple subtypes within a tumor? Trends Cancer 3(11), 753–760 (2017). https://doi.org/10.1016/j.trecan.2017.09.001
3. Nassar, F.J., Nasr, R., Talhouk, R.: MicroRNAs as biomarkers for early breast cancer diagnosis, prognosis and therapy prediction. Pharmacol. Ther. 172, 34–49 (2017). https://doi.org/10.1016/j.pharmthera.2016.11.012
4. Jordan-Alejandre, E., Campos-Parra, A.D., Castro-Lopez, D.L., Silva-Cazares, M.B.: Potential miRNA use as a biomarker: from breast cancer diagnosis to metastasis. Cells 12(4), 525 (2023). https://doi.org/10.3390/cells12040525
5. Sarkar, J.P., Saha, I., Sarkar, A., Maulik, U.: Machine learning integrated ensemble of feature selection methods, followed by survival analysis for predicting breast cancer subtype specific miRNA biomarkers. Comput. Biol. Med. 131, 104244 (2021). https://doi.org/10.1016/j.compbiomed.2021.104244
6. Li, J.T., Zhang, H.M., Gao, F.G.: Identification of miRNA biomarkers for breast cancer by combining ensemble regularized multinomial logistic regression and Cox regression. BMC Bioinform. 23(1), 434 (2022). https://doi.org/10.1186/s12859-022-04982-7
7. Hill, M., Tran, N.: MicroRNAs regulating MicroRNAs in cancer. Trends Cancer 4(7), 465–468 (2018). https://doi.org/10.1016/j.trecan.2018.05.002
8. Hu, J.X., Zhou, S., Guo, W.Y.: Construction of the coexpression network involved in the pathogenesis of thyroid eye disease via bioinformatics analysis. Hum. Genomics 16(1), 38 (2022). https://doi.org/10.1186/s40246-022-00412-0
9. Teng, X., Liu, B., Ichiye, T.: Understanding how water models affect the anomalous pressure dependence of their diffusion coefficients. J. Chem. Phys. 153(10), 104510 (2020). https://doi.org/10.1063/5.0021472
10. Teng, X., Ichiye, T.: Dynamical model for the counteracting effects of trimethylamine N-oxide on urea in aqueous solutions under pressure. J. Phys. Chem. B 124(10), 1978–1986 (2020). https://doi.org/10.1021/acs.jpcb.9b10844
11. Teng, X., Ichiye, T.: Dynamical effects of trimethylamine N-oxide on aqueous solutions of urea. J. Phys. Chem. B 123(5), 1108–1115 (2019). https://doi.org/10.1021/acs.jpcb.8b09874
12. Traag, V.A., Waltman, L., van Eck, N.J.: From Louvain to Leiden: guaranteeing well-connected communities. Sci. Rep. 9(1), 1–12 (2019). https://doi.org/10.1038/s41598-019-41695-z
13. Li, J.T., Cao, F.Z., Gao, Q.H., Liang, K., Tang, Y.: Improving diagnosis accuracy of non-small cell lung carcinoma on noisy data by adaptive group lasso regularized multinomial regression. Biomed. Signal Process. Control 23(1), 434 (2023). https://doi.org/10.1016/j.bspc.2022.104148
14. Ghosh, D.: Penalized discriminant methods for the classification of tumors from gene expression data. Biometrics 59(4), 992–1000 (2003). https://doi.org/10.1111/j.0006-341X.2003.00114.x
15. Zheng, S.F., Liu, W.X.: An experimental comparison of gene selection by Lasso and Dantzig selector for cancer classification. Comput. Biol. Med. 41(11), 1033–1040 (2011). https://doi.org/10.1016/j.compbiomed.2011.08.011
16. Yin, M., Zeng, D.Y., Gao, J.B., Wu, Z.Z., Xie, S.L.: Robust multinomial logistic regression based on RPCA. IEEE J.-STSP 12(6), 1144–1154 (2018). https://doi.org/10.1109/JSTSP.2018.2872460
17. Sherafatian, M.: Tree-based machine learning algorithms identified minimal set of miRNA biomarkers for breast cancer diagnosis and molecular subtyping. Gene 677, 111–118 (2018). https://doi.org/10.1016/j.gene.2018.07.057
18. Pochet, N., De Smet, F., Suykens, J.A., De Moor, B.L.: Systematic benchmarking of microarray data classification: assessing the role of non-linearity and dimensionality reduction. Bioinformatics 20(17), 3185–3195 (2004). https://doi.org/10.1093/bioinformatics/bth383
19. Zhang, H., Jiang, L.X., Yu, L.J.: Attribute and instance weighted Naive Bayes. Pattern Recogn. 111, 107674 (2021). https://doi.org/10.1016/j.patcog.2020.107674
20. Ren, L.L., Chen, H., Song, J.W., Chen, X.H., Lin, C., Zhang, X.L., Hou, N., Pan, J.Y., Zhou, Z.Q., Wang, L., Huang, D.P., Yang, J.N., Liang, Y.Y., Li, J., Huang, H.B., Jiang, L.L.: MiR-454-3p-mediated Wnt/β-catenin signaling antagonists suppression promotes breast cancer metastasis. Theranostics 9(2), 449–465 (2019). https://doi.org/10.7150/thno.29055
Path Planning of Mobile Robot Based on Improved A* Algorithm Ziyang Zhou, Liming Wang, Yuquan Xue, Xiang Ao, Liang Liu, and Yuxuan Yang
Abstract This paper studies the application of the A* algorithm to path planning for a substation inspection robot. When planning paths on maps of large environments, the traditional A* algorithm traverses too many redundant nodes, which leads to large memory consumption and slow computation. To solve these problems, this paper proposes an improved A* algorithm strategy. While keeping the shortest path, the cost function of the A* algorithm is adjusted and the pruning strategy of the Jump Point Search (JPS) algorithm is integrated, so as to shorten the search time, reduce the number of nodes and inflection points, and make the path smoother. The simulation results show that the new algorithm outperforms the traditional A* algorithm in larger search areas, with higher efficiency and better path optimization performance. Finally, the effectiveness and practicability of the improved algorithm are verified on the experimental platform.
Keywords A* algorithm · JPS algorithm · Path planning · Mobile robot
Z. Zhou · L. Wang · Y. Xue (B) · X. Ao · L. Liu · Y. Yang, School of Electrical Engineering, Naval University of Engineering, Wuhan 430030, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_50
1 Introduction
With the rapid development of science and technology, modern technologies of all kinds are used in many fields. Among them, mobile robots [11] are widely used. Path planning is a core topic of mobile robot research. The path planning problem is divided into two directions, namely global path planning and local path planning. Global path planning [9, 10] mostly addresses static, known environments, while local path planning [3, 8] mostly addresses dynamic, unknown environments. Global path planning algorithms mainly include A* [4], RRT [1, 15], and the ant colony algorithm [2]. Many scholars have done a lot of research on these algorithms. Li Yan et al. [13] improved the evaluation function of the A* algorithm with a neural network to plan an optimal and smooth path. Lan Xing et al. [5] combined the ant colony algorithm with the A* algorithm to improve the convergence speed of the algorithm. Liu et al. [7] improved the ant colony algorithm by combining pheromone diffusion and geometric local optimization to improve its convergence rate. Wang et al. [12] combined the dynamic window method with the A* algorithm and enlarged the search scope from the 8-neighborhood to the 24-neighborhood, which avoids obstacles better and yields a smoother trajectory. Li et al. [6] introduced the bidirectional alternate search (BAS) strategy into the A* algorithm to improve the search efficiency, and introduced a filtering function for path nodes to reduce redundant nodes in the path and effectively reduce the turning angle. Yang Zheng et al. [14] combined the A* algorithm with the jump-point search strategy to reduce redundant nodes, and removed redundant jump points and adopted an adaptive arc optimization strategy to shorten the path length and increase the path smoothness.
Based on the above research, this paper mainly studies the A* shortest-path search algorithm among the global path planning algorithms, optimizes it to address the large memory consumption caused by traversing too many nodes, and verifies the improved A* algorithm on a mobile robot.
2 Traditional A* Algorithm
The A* algorithm is a classic heuristic search algorithm proposed in 1968. It can find the shortest path in a static road network whose global environment information is known. An evaluation function that combines node information guides the path search. Starting from the starting point and guided by the evaluation function, the node with the minimum cost value is taken as the current node and its neighboring nodes are traversed. This process is repeated until the target node is reached, at which point the search ends and the shortest path is obtained.
The cost function of the A* algorithm is
$$f(n) = g(n) + h(n) \tag{1}$$
where f(n) is the evaluation function; g(n) is the cost function, representing the actual cost from the starting point to the current point; and h(n) is the heuristic function, representing the expected cost from the current point to the target point. In the A* algorithm, the choice of heuristic function is very important. The map environment established in this paper is based on a Cartesian coordinate system and contains obstacles, so the Manhattan distance is used as the heuristic function. The Manhattan distance is
$$d = |x_1 - x_2| + |y_1 - y_2| \tag{2}$$
where $x_1$ and $y_1$ are the horizontal and vertical coordinates of the current point, and $x_2$ and $y_2$ are the horizontal and vertical coordinates of the target point. The evaluation function therefore becomes
$$f(n) = g(n) + |x_1 - x_2| + |y_1 - y_2| \tag{3}$$
Computing h(n) with the Manhattan distance avoids, to a certain extent, searching every node of the map, and thus improves the efficiency of the algorithm.
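A minimal Python sketch of the traditional A* search with the evaluation function (3) is given below. It is restricted to 4-connected moves of unit cost, an assumption made here so that the Manhattan heuristic never overestimates the remaining cost; the grid encoding (1 marks an obstacle) and the function names are also choices of this sketch rather than the authors' implementation.

```python
import heapq


def manhattan(a, b):
    """Heuristic h(n): Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def a_star(grid, start, goal):
    """Return a shortest path from start to goal as a list of cells, or None."""
    rows, cols = len(grid), len(grid[0])
    open_heap = [(manhattan(start, goal), 0, start)]  # entries are (f, g, cell)
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale entry left over from an earlier, worse path
        if node == goal:
            path = []
            while node is not None:  # walk the parent links back to the start
                path.append(node)
                node = parent[node]
            return path[::-1]
        closed.add(node)
        r, c = node
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (r + dr, c + dc)
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            if grid[nxt[0]][nxt[1]] == 1 or nxt in closed:
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(open_heap, (new_g + manhattan(nxt, goal), new_g, nxt))
    return None


toy_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
toy_path = a_star(toy_grid, (0, 0), (2, 0))
```

Every free cell that is pushed onto `open_heap` corresponds to a node added to the Open table, which is the quantity the improved algorithm aims to reduce.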
3 Improved A* Algorithm
3.1 Exploration Rule
Following the basic idea of the pruning rules of the JPS algorithm, when the search expands node x, every node n among the eight adjacent nodes that does not need to be added to the Open table for evaluation is identified and pruned, so that the target point is reached as fast as possible. Nodes are filtered by comparing the cost of two paths: path one starts from the parent node p(x), passes through node x, and reaches node n; path two starts from the parent node p(x) and reaches node n without passing through node x. In addition, every node on both paths belongs to the neighborhood of x. The pruning rules are discussed in two cases, according to whether there are obstacles among the adjacent nodes of the current node. As shown in Fig. 1a, the current node x is reached from the blue parent node p(x) by moving to the right in a straight line. In this case the gray nodes are meaningless, because the cost of reaching them from p(x) through x is never lower than the cost of reaching them from p(x) directly. In Fig. 1b, the current node x is reached from the blue parent node p(x) by a diagonal move. Similarly, the grey nodes will lead to an
Fig. 1 No obstacle trimming rules: (a) traditional A* algorithm; (b) improved A* algorithm
increase in computation and are meaningless. To screen out the shortest path, the grey nodes are pruned in advance, and the remaining nodes are defined as natural nodes. Therefore, when there are no obstacles around the node, the following pruning conditions are defined:
(1) When moving in a straight line, the pruning condition is
$$\operatorname{len}(\langle p(x), \ldots, n \rangle \setminus x) \le \operatorname{len}(\langle p(x), x, n \rangle) \tag{4}$$
(2) When moving diagonally, the pruning condition is
$$\operatorname{len}(\langle p(x), \ldots, n \rangle \setminus x) < \operatorname{len}(\langle p(x), x, n \rangle) \tag{5}$$
When the improved A* algorithm expands node x, unnecessary adjacent nodes are deleted according to the above constraints. In Fig. 2, the black node is an obstacle. When the search moves from the blue parent node p(x) to node x and there are obstacles among the adjacent nodes of x, not all adjacent nodes covered by the above constraints can be deleted. For the green node in Fig. 2, no minimum-cost path from p(x) reaches it without going through node x. To reach this node at minimum cost, the path must pass through node x. For each such forced node n, the following conditions are given:
(1) n is a neighbor node but not a natural node;
(2) When the node moves in a straight line, the condition is
$$\operatorname{len}(\langle p(x), x, n \rangle) \le \operatorname{len}(\langle p(x), \ldots, n \rangle \setminus x) \tag{6}$$
Fig. 2 Obstacle trimming rules: (a) traditional A* algorithm; (b) improved A* algorithm
When the node moves diagonally, the condition is
$$\operatorname{len}(\langle p(x), x, n \rangle) < \operatorname{len}(\langle p(x), \ldots, n \rangle \setminus x) \tag{7}$$
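The natural and forced nodes described in this subsection can be enumerated with the following Python sketch. It follows the neighbour rules commonly used for JPS on an 8-connected grid where diagonal moves may pass obstacle corners; that convention, the grid encoding (1 marks an obstacle), and the function names are assumptions of this sketch and may differ from the authors' implementation.

```python
def is_free(grid, cell):
    """True if the cell lies inside the grid and is not an obstacle."""
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0


def pruned_neighbors(grid, x, d):
    """Natural and forced neighbours of x when x was entered with direction d.

    x -- current cell (row, col); d -- step (dr, dc) taken from p(x) to x.
    """
    r, c = x
    dr, dc = d
    natural, forced = [], []
    if dr != 0 and dc != 0:  # diagonal move
        for cand in ((r + dr, c), (r, c + dc), (r + dr, c + dc)):
            if is_free(grid, cand):
                natural.append(cand)
        # An obstacle beside the parent forces the cell just behind it.
        if not is_free(grid, (r - dr, c)) and is_free(grid, (r - dr, c + dc)):
            forced.append((r - dr, c + dc))
        if not is_free(grid, (r, c - dc)) and is_free(grid, (r + dr, c - dc)):
            forced.append((r + dr, c - dc))
    else:  # straight move
        ahead = (r + dr, c + dc)
        if is_free(grid, ahead):
            natural.append(ahead)
        for sr, sc in ((dc, dr), (-dc, -dr)):  # the two sideways directions
            side = (r + sr, c + sc)
            beyond = (r + sr + dr, c + sc + dc)
            if not is_free(grid, side) and is_free(grid, beyond):
                forced.append(beyond)
    return natural, forced


open_grid = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
blocked_grid = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]  # obstacle above the centre
```

Only the cells returned by `pruned_neighbors` would be considered for the Open table; all other adjacent cells are pruned.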
3.2 Selection Rules
In this section, nodes are screened according to the pruning constraints described in Sect. 3.1, and all natural nodes and forced nodes are added to the Open table for cost assessment. The shortest path is found among the screened nodes, and the selected nodes are defined as key points. The path formed by the key points in sequence is the shortest path of the search. This rule reduces the evaluation of intermediate nodes during A* path finding. Only a short preprocessing time is needed to discard a large number of unnecessary nodes, which reduces the amount of computation and greatly improves the efficiency of path finding.
3.3 Compensation Rules
In this section, to address the redundancy of path points in A* path finding, the heuristic function h(n) in the cost function f(n) of the A* algorithm is compensated. The compensation function is denoted k(n), and the cost function is expressed as follows:
Fig. 3 Improved A* algorithm path finding: (a) traditional A* algorithm; (b) improved A* algorithm simulation
$$f(n) = g(n) + k(n)h(n) \tag{8}$$
$$k(n) = w - w\frac{s}{m} \tag{9}$$
where w is the compensation weight, s is the number of explored nodes, and m is the node amplitude. The path finding mode of the improved A* algorithm is shown in Fig. 3a. The blue node is the starting point, the black nodes are obstacles, the green node is a forced node, and the red node is the end point. After exploring from the starting point S to node x, the search continues according to the recursive rules and stops when the target node G is found. Figure 3b shows the simulation experiment of the improved A* algorithm. From the blue starting point S, the search first expands recursively in the horizontal and vertical directions. If obstacles are encountered during the recursion, the exploration rule is evaluated. If the nodes do not meet the pruning conditions, the search expands diagonally; if the nodes are forced nodes, they are added to the Open table. When the recursion reaches a new node along the diagonal, this node is again expanded recursively along the horizontal and vertical directions towards the end point, until the search stops at the end point. Finally, the key points are selected to form the path of minimum cost.
4 Simulation Comparison
To verify the effect of the improved A* algorithm, this paper compares the A* algorithm and the improved A* algorithm in simulation under different environment maps, using two map layouts with different sizes and obstacles. The simulation computer has an i7-12700 CPU, the operating system is Windows 11, and the development environment is PyCharm 2022.
Fig. 4 30 m × 30 m map algorithm simulation: (a) traditional A* algorithm; (b) improved A* algorithm
Fig. 5 50 m × 50 m map algorithm simulation: (a) traditional A* algorithm; (b) improved A* algorithm
The simulation and comparison results are shown in Figs. 4 and 5. In Fig. 4, the simulation experiment is carried out on a 30 m × 30 m coordinate map. The green dot in the lower left corner is the starting point, the blue x in the upper right corner is the target node, and the black rectangles are obstacles. Figure 4a shows the traditional A* algorithm: the green five-pointed stars are the nodes added to the Open table of the A* algorithm, and the red line is the shortest path finally output. Figure 4b shows the improved A* algorithm, where the red five-pointed stars mark the final output path. Figure 5 shows the simulation experiment on the 50 m × 50 m coordinate map, with parameter settings consistent with Fig. 4. Under the same environment, although the paths generated by the two algorithms do not differ greatly, the improved A* algorithm removes a large number of redundant nodes, and the larger the search area, the more obvious the effect. Moreover, the improved A* algorithm has almost no inflection points in the second half of the path, which makes the path smoother.
Z. Zhou et al.
Table 1 Algorithm simulation data
Map            Parameter           Traditional A* algorithm    Improved A* algorithm
30 m × 30 m    Extension nodes     267                         37
30 m × 30 m    Turning points      14                          10
30 m × 30 m    Path length/m       42.21                       40.08
30 m × 30 m    Time consumed/s     2.084                       1.863
50 m × 50 m    Extension nodes     609                         50
50 m × 50 m    Turning points      18                          12
50 m × 50 m    Path length/m       61.01                       59.60
50 m × 50 m    Time consumed/s     3.562                       3.177
Table 1 lists the number of extension nodes, the number of turning points, the path length, and the search time of the two algorithms on the two maps. The improved A* algorithm expands far fewer nodes, and the saving grows with the map size: from 267 to 37 nodes (about 86%) on the 30 m × 30 m map, and from 609 to 50 nodes (about 92%) on the 50 m × 50 m map. The number of turning points drops from 14 to 10 and from 18 to 12, and the computation time falls by about 11% on both maps, so the route is also smoother.
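The relative savings implied by Table 1 can be checked with a few lines of Python. The values are copied from the table as (traditional, improved) pairs; the dictionary layout and the function name `reduction_percent` are illustrative choices, not part of the paper.

```python
# Values from Table 1 as (traditional A*, improved A*) pairs.
TABLE_1 = {
    "30 m x 30 m": {
        "extension_nodes": (267, 37),
        "turning_points": (14, 10),
        "path_length_m": (42.21, 40.08),
        "time_s": (2.084, 1.863),
    },
    "50 m x 50 m": {
        "extension_nodes": (609, 50),
        "turning_points": (18, 12),
        "path_length_m": (61.01, 59.60),
        "time_s": (3.562, 3.177),
    },
}


def reduction_percent(traditional, improved):
    """Relative reduction of the improved value with respect to the traditional one."""
    return 100.0 * (traditional - improved) / traditional


if __name__ == "__main__":
    for map_name, rows in TABLE_1.items():
        for parameter, (traditional, improved) in rows.items():
            saving = reduction_percent(traditional, improved)
            print(f"{map_name:12s} {parameter:16s} {saving:5.1f} % lower")
```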
5 Physical Experiment
To verify the effectiveness of the improved A* algorithm in actual operation, it is applied to the mobile robot shown in Fig. 6. The robot carries a Radium God M10P laser radar (lidar), which has a sampling frequency of 20,000 Hz and a scanning frequency of 12 Hz. It uses the time-of-flight (TOF) ranging principle to scan a two-dimensional plane within a radius of 30 m and generates planar point cloud map information. In a ROS environment, the mobile robot first completes the lidar mapping task at the experimental site. The algorithm verification is shown in Fig. 7, where the white moving square represents the mobile robot, the black straight lines represent obstacles, the red lines represent the real-time obstacle point cloud detected by the lidar, and the green line represents the trajectory of the mobile robot. Figure 7b shows the position of the mobile robot after it stops moving in Fig. 7a.
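The two lidar frequencies quoted above fix the nominal angular resolution of the scan. The short calculation below is a sketch under the assumption that samples are spread evenly over a full 360° revolution; the function names are illustrative and the result is not a datasheet value.

```python
import math


def points_per_revolution(sample_rate_hz, scan_rate_hz):
    """Number of range samples collected during one full revolution."""
    return sample_rate_hz / scan_rate_hz


def angular_resolution_deg(sample_rate_hz, scan_rate_hz):
    """Angle between neighbouring samples, assuming uniform 360 degree coverage."""
    return 360.0 / points_per_revolution(sample_rate_hz, scan_rate_hz)


def point_spacing_m(range_m, resolution_deg):
    """Arc length between neighbouring samples at a given range."""
    return range_m * math.radians(resolution_deg)


if __name__ == "__main__":
    resolution = angular_resolution_deg(20_000, 12)
    print(f"samples per revolution: {points_per_revolution(20_000, 12):.0f}")
    print(f"angular resolution: {resolution:.3f} deg")
    print(f"point spacing at 30 m: {point_spacing_m(30.0, resolution):.3f} m")
```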
Fig. 6 Mobile robot
Fig. 7 Improved A* algorithm validation: (a) Traditional A* algorithm; (b) Improved A* algorithm
6 Summary
This paper addresses the redundancy of path points produced by the traditional A* algorithm in mobile robot path planning and proposes an improved A* algorithm. The improved algorithm combines A* with the pruning method of the JPS algorithm so that path nodes are selected in a targeted way and redundant nodes are not evaluated; in addition, a compensation function is added to improve the smoothness of the A* path. The simulation and experimental results show that the improved A* algorithm removes a large number of redundant nodes and reduces the computation, running time, and number of inflection points of the path search. The reduction in evaluated nodes and inflection points becomes more pronounced as the path length increases, and the route becomes smoother. Future work will consider path planning in dynamic environments and improving the robustness of the algorithm.
Research on the Technology of Using Turning Instead of Grinding for Aerospace Titanium Alloy Thin Wall Parts
Kong Guizhen, Zhang Huajin, Guo Yaxing, Yang Qiang, Li Dongwei, and Zhang Zhe
Abstract Titanium alloy TA15 has poor thermal conductivity and a small elastic modulus. In aerospace products, titanium alloy thin-walled parts require a surface roughness of Ra 0.8 µm, have a ratio of excircle diameter to wall thickness of D/t > 16 (D is the excircle diameter and t is the wall thickness), and demand high dimensional accuracy and tight form and position tolerances. This paper analyzes the structural characteristics and machining elements of such parts. By replacing grinding with turning, and through repeated experiments, process innovation and corresponding technological measures, stable machining dimensions and surface quality are achieved in batch production, and the feasibility and rationality of the machining method are verified.
Keywords Titanium alloy · Thin-walled parts · Replace grinding with turning · Processing technology
1 Introduction
Titanium alloys are widely used in aerospace because of their high specific strength, wide operating temperature range and excellent corrosion resistance. The titanium alloy thin-walled parts considered here have a ratio of excircle diameter to wall thickness of D/t > 16 (D is the excircle diameter and t is the wall thickness). The dimensional tolerance of the excircles ranges from −0.009 to −0.02 mm, that is, the wall thickness error must stay within −0.0045 to −0.01 mm. The roundness and coaxiality of the excircles must be less than 0.005 mm. Parts with such tight size, form and position tolerances are usually ground, but for titanium alloy thin-walled parts the efficiency, precision and stability of grinding cannot meet the requirements of batch production. The main problems are as follows. First, the surface roughness is unstable: the material is soft, so chips quickly clog the pores of the grinding wheel, the wheel loses its cutting ability, and the surface roughness of the parts deteriorates. Second, to keep grinding normally the wheel must be dressed frequently, which makes the process very inefficient and unable to meet the needs of batch production. Third, the parts are thin-walled and the elastic modulus of the material is small, so the clamping force deforms the parts elastically. The size meets the requirements while the part is still clamped after cutting, but once the workpiece is removed the elastic deformation recovers and the size goes out of tolerance. As a result, the contact area between the conical surface of the inner hole and the outer conical surface of the positioning mandrel fails to reach the required 75% or more.
The innovations of the new process are therefore: turning instead of grinding, which removes the unstable surface roughness and low machining efficiency caused by clogging of the grinding wheel; special tooling, which solves the clamping deformation of the parts; and optimized cutting parameters, in particular a higher cutting speed, to meet the surface roughness requirement. In trial production, parts meeting the design requirements were machined, machining efficiency and quality improved markedly, and the expected results were achieved. The process is simple to operate and effective, and it can be extended as a general method for machining thin-walled structural parts made of other materials.
K. Guizhen (B) · Z. Huajin · G. Yaxing · Y. Qiang · L. Dongwei · Z. Zhe
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_51
2 Processability Analysis of Titanium Alloy Thin-Walled Parts
Titanium alloy TA15 [1, 2] is characterized by high strength, low thermal conductivity and a small elastic modulus. The dimensions of the thin-walled part are shown in Fig. 1. The main requirements are:
(1) The outer circles are $\phi 60^{-0.009}_{-0.013}$ mm and $\phi 50^{0}_{-0.02}$ mm, with coaxiality 0.005 mm and roundness 0.005 mm (a further tolerance of 0.02 mm is also specified); both outer circles are matched with the inner hole of a standard bearing.
(2) The angle of the inner conical hole is 4°46′20″ ± 10″, its coaxiality is 0.01 mm, and its surface roughness is Ra 0.8 µm. When the inner conical surface fits the conical surface of the connecting shaft, the contact area must exceed 75% so that the locking fit positions the part accurately.
(3) The ratio of excircle diameter to wall thickness is D/t > 16 (D is the excircle diameter and t is the wall thickness). The dimensional tolerance of the excircles ranges from −0.009 to −0.02 mm, that is, the wall thickness error must stay within −0.0045 to −0.01 mm; the roundness and coaxiality of the excircles must be less than 0.005 mm.
Fig. 1 Titanium alloy thin-walled parts
(4) The parts are thin-walled, with a minimum wall thickness of 3 mm, and deform easily during machining, so dimensional accuracy and form and position accuracy must be controlled more tightly.
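The numbers in the requirements above can be cross-checked with a short Python sketch. It converts the taper angle to decimal degrees, derives the corresponding taper ratio from C = 2 tan(α/2), and checks the D/t and wall-thickness figures. Two points are assumptions made only for this illustration: that 4°46′20″ is the full cone angle, and that the 3 mm minimum wall applies under both outer circles.

```python
import math


def dms_to_degrees(degrees, minutes, seconds):
    """Convert degrees, arc-minutes and arc-seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


# Inner taper angle from the text: 4 deg 46 min 20 s, tolerance +/- 10 s.
cone_angle_deg = dms_to_degrees(4, 46, 20)
tolerance_deg = dms_to_degrees(0, 0, 10)

# Taper ratio C = 2 * tan(alpha / 2), assuming alpha is the full cone angle.
taper = 2.0 * math.tan(math.radians(cone_angle_deg) / 2.0)

# D/t check, assuming the 3 mm minimum wall applies under both outer circles.
min_wall_mm = 3.0
ratios = {diameter: diameter / min_wall_mm for diameter in (60.0, 50.0)}

# A deviation on the diameter corresponds to half that deviation on the wall.
wall_band_mm = tuple(deviation / 2.0 for deviation in (-0.009, -0.02))

if __name__ == "__main__":
    print(f"cone angle: {cone_angle_deg:.4f} deg +/- {tolerance_deg:.4f} deg")
    print(f"taper ratio: 1:{1.0 / taper:.2f}")
    print(f"D/t ratios: {ratios}")
    print(f"wall thickness band: {wall_band_mm} mm")
```

Under these assumptions the angle corresponds closely to a 1:12 taper, and both outer circles satisfy D/t > 16.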
3 Processing Technology Analysis of Titanium Alloy Thin-Walled Parts
According to the assembly requirements and the characteristics of the parts, the machining difficulties are: titanium alloy is a difficult-to-machine material, and the deformation of the thin wall, the dimensional accuracy, the form and position tolerances and the surface roughness must all be controlled. Replacing the traditional grinding of titanium alloy thin-walled parts with turning is an innovation, and it rests mainly on the following three points:
(1) Optimize the process flow: Separate rough and finish machining, add a stress-relief heat treatment, and thereby reduce deformation during machining.
(2) Design special tooling: Solve the clamping problem of thin-walled parts and limit the influence of thin-wall deformation on dimensional accuracy.
(3) Turning instead of grinding: Select a high-precision CNC lathe, reasonable cutting parameters and CNC inserts suitable for finishing titanium alloy, and turn instead of grind, so as to eliminate the unstable surface roughness of grinding and improve machining efficiency.
3.1 Optimize the Process Flow
During finish turning, clamping, cutting force and other factors disturb the balance of internal stress in thin-walled parts [3]. As the internal stress redistributes towards a new balance, the part deforms, often irregularly (for example into an ellipse), which makes it difficult to meet requirements such as size and form and position tolerances. Reducing the deformation of thin-walled parts and ensuring their dimensional accuracy and form and position tolerances are the key points in machining. Because the required machining quality is very high, the requirements cannot be met in a single clamping. To reduce deformation, the working procedures are arranged as follows: forging blank → rough turning (removing the large allowance on the outer shape and inner hole) → stress-relief aging heat treatment → semi-finish turning (4°46′20″ ± 10″ inner taper hole, excircles φ60 mm and φ50 mm, unilateral allowance 0.2 mm) → finish turning (4°46′20″ ± 10″ inner taper hole, excircles φ60 mm and φ50 mm). CNC turning of the inner hole and excircles is the key finishing procedure, and it must ensure the dimensional tolerances, the form and position tolerances and the circular runout.
3.2 Turning Instead of Grinding
A high-precision CNC lathe is selected, because small spindle radial runout and end-face runout are necessary to ensure the roundness and perpendicularity of the parts. A high cutting speed and a small feed rate are the preconditions for a surface roughness below Ra 0.8 µm. For high-speed cutting of this difficult-to-machine material, insert grades and chip-breaker geometries suited to finish turning of titanium alloy are selected to limit tool wear. The special tooling ensures firm clamping while also controlling clamping deformation. With these measures, turning instead of grinding achieves the precision of grinding together with the machining efficiency of NC turning.
3.3 Blade and Special Tooling Design
3.3.1 Requirements and Selection of CNC Blades
The machining requirements can be met on a HARTING ELITE 51ULTRA CNC lathe with a spindle speed of n = 1000 r/min and a feed rate of f = 0.08 to 0.2 mm/r. An appropriate CNC lathe insert should have a narrow chip groove so that chips are removed smoothly, a larger clearance angle to reduce flank wear, and a larger rake angle to reduce the cutting force; it should not develop crater wear and should have little affinity with titanium alloy. Kyocera inserts are selected according to these requirements. Semi-finish machining: the insert grade is PR1305, the model is CCMT09T304, and the chip-breaker type is MQ; this grade is mainly intended for cutting stainless steel and heat-resistant steel. The MQ chip breaker has a groove back that is raised forward at the tip, giving a groove that is narrow at the front and wide at the back. It is a composite chip-breaking groove that can be used for finishing and is suitable for semi-finish machining. Finish machining: the insert grade is SW05, the model is CCMT09T304, and the chip-breaker type is MQ; this grade is mainly intended for cutting titanium alloy.
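To relate the quoted spindle speed to the two outer diameters, the usual turning relations v_c = πDn/1000 (cutting speed in m/min) and v_f = f·n (feed velocity in mm/min) can be evaluated as below. This is an illustrative calculation under the assumption that the same n = 1000 r/min is used on both the φ60 mm and φ50 mm surfaces; the function names are hypothetical.

```python
import math


def cutting_speed_m_per_min(diameter_mm, spindle_speed_rpm):
    """Surface cutting speed v_c = pi * D * n / 1000."""
    return math.pi * diameter_mm * spindle_speed_rpm / 1000.0


def feed_velocity_mm_per_min(feed_mm_per_rev, spindle_speed_rpm):
    """Feed velocity v_f = f * n."""
    return feed_mm_per_rev * spindle_speed_rpm


if __name__ == "__main__":
    n = 1000.0  # spindle speed from the text, r/min
    for diameter in (60.0, 50.0):
        v_c = cutting_speed_m_per_min(diameter, n)
        print(f"D = {diameter:.0f} mm: v_c = {v_c:.1f} m/min")
    for feed in (0.08, 0.2):
        v_f = feed_velocity_mm_per_min(feed, n)
        print(f"f = {feed} mm/r: v_f = {v_f:.0f} mm/min")
```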
3.3.2 Turning Instead of Grinding Inner Holes
To meet the geometric requirements of the parts, prevent deformation and improve machining quality, an appropriate fixture must be designed for clamping. The tooling, shown in Fig. 2, consists of an open (split) sleeve that holds the thin-walled part. The part is easy to load, the sleeve grips the thin-walled excircle around its whole circumference, and the excircle is stressed uniformly, which avoids clamping deformation. The specific steps are as follows:
(1) The open sleeve holds the thin-walled part while the inner hole is rough machined to remove the bulk of the allowance. Deformation is controlled by changing the clamping position, direction and point of force application and by increasing the clamping contact area. The large contact area between the sleeve and the outer circle of the part also balances the clamping force and the cutting force effectively.
(2) Heat treatment: the rough-turned thin-walled parts undergo stress-relief aging to remove the internal stress generated during rough machining.
Fig. 2 Inner hole tooling of thin-walled parts
(3) The inner taper hole of the stress-relieved thin-walled part is semi-finish machined, and the 4°46′20″ ± 10″ inner taper hole is then finish bored with a special boring cutter.
3.3.3 Turning Instead of Grinding the Excircle
The tooling shown in Fig. 3 consists mainly of a positioning mandrel, the thin-walled part and a process plug. At the front end, the outer conical surface of the positioning mandrel mates with the inner conical surface of the thin-walled part with a contact area of at least 75%, which positions and supports the part. Torque is transmitted between the two parts by the friction generated on the conical surface, an arrangement suited to high-speed, light-load cutting with high transmission accuracy. At the back end, the threaded process plug presses the workpiece so that it is seated stably on the positioning mandrel and the coaxiality of the connected parts is well ensured. The tooling design for the excircle is shown in Fig. 3, and the actual machining setup is shown in Fig. 4. The specific steps are as follows:
(1) At the front end, a three-jaw self-centering chuck with copper jaws clamps the mandrel. The bored 4°46′20″ ± 10″ inner taper hole of the thin-walled part serves as the datum and mates with the outer taper surface of the positioning mandrel, so that the generatrices of the two parts fit closely and the contact area of the taper surfaces at both ends is not less than 75%. The back end is pressed by a self-made threaded plug. The design idea is to convert the radial force on the part into an axial force [4]: the part is clamped by axial compression, which improves machining stability and provides the basis for the subsequent improvement of the size and form tolerances of each outer circle.
Fig. 3 Cylindrical tooling for thin-walled parts
Fig. 4 Actual machining diagram of thin-walled part excircle
(2) Cutting parameters influence not only cutting heat but, more importantly, cutting force. For thin-walled parts, the main concern is how to reduce deformation [5]. The deformation of thin-walled parts is almost entirely radial, so the cutting parameters that govern the radial cutting force determine roundness. Increasing the cutting depth and cutting amount increases the radial cutting force, whereas increasing the rotating speed reduces it. In finishing, therefore, a higher rotating speed combined with a small cutting depth and cutting amount reduces workpiece deformation to a certain extent and ensures machining quality. Based on the material and structure of the thin-walled parts, the selected parameters are: spindle speed n = 1000 r/min, cutting depth ap ≤ 0.3 mm, final cutting depth 0.1 mm to 0.2 mm, and feed rate 0.08 mm/r to 0.2 mm/r. If the cutting amount is too small, the cutting force is reduced and the surface quality improves [6, 7], but tool wear during machining produces a taper on the machined part and puts it out of tolerance.
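The link between feed rate and surface roughness can be illustrated with the idealised textbook estimate Ra ≈ f²/(32 r_ε), where r_ε is the insert nose radius. The sketch below is not the authors' method: it assumes a nose radius of 0.4 mm (the value encoded by the "04" in the CCMT09T304 designation) and ignores vibration, tool wear and material effects, so real roughness values will differ.

```python
import math


def theoretical_ra_um(feed_mm_per_rev, nose_radius_mm):
    """Idealised roughness Ra ~ f^2 / (32 * r_eps), returned in micrometres."""
    return 1000.0 * feed_mm_per_rev ** 2 / (32.0 * nose_radius_mm)


def max_feed_for_ra(ra_um, nose_radius_mm):
    """Largest feed (mm/r) whose idealised Ra stays at or below ra_um."""
    return math.sqrt(32.0 * nose_radius_mm * ra_um / 1000.0)


if __name__ == "__main__":
    nose_radius = 0.4  # mm, assumed from the insert designation
    for feed in (0.08, 0.2):
        ra = theoretical_ra_um(feed, nose_radius)
        print(f"f = {feed} mm/r: idealised Ra = {ra:.2f} um")
    limit = max_feed_for_ra(0.8, nose_radius)
    print(f"idealised feed limit for Ra 0.8 um: {limit:.3f} mm/r")
```

On this estimate, the lower end of the quoted feed range is the part that is compatible with Ra 0.8 µm in the finishing pass, which is consistent with the text's emphasis on small feed in finishing.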
4 Conclusion
The final measurement results show that, with specially designed tooling and a reasonable process, replacing grinding with CNC turning is feasible, and the machining deformation of the inner and outer surfaces of the thin-walled part is solved. The quality of the thin-walled parts is stable: the roundness and coaxiality are less than 0.005 mm, the surface roughness of the inner hole and outer circle is Ra 0.8 µm, and the qualified rate reaches 100%. The machining efficiency is greatly improved and substantial production costs are saved. This method can provide strong technical support for machining similar thin-walled parts.
References
1. Li, X., Liu, L., Angel, Zhao, P.: Discussion on economic and low-cost methods of titanium and titanium alloy materials. Mater. Prog. China 2015(5), 401–406 (2015)
2. He, D., Shi, H.: Discussion on the application of titanium alloy in aerospace field. China High-tech Enterprises 27, 50–51 (2016)
3. Huang, C., Zou, C.: Cutting technology of titanium alloy thin-walled high cylinder parts. Technol. Equipment 60(9), 82–84 (2022)
4. Ran, G.: Application of turning instead of grinding technology in high precision thin-walled parts. Metalworking 11, 27–28 (2014)
5. Wan, P., Tan, X., Fan, W., Zhao, J., et al.: On the quality control of turning deformation of large thin-walled parts. China New Technol. New Prod. 13, 162–163 (2012)
6. Zhao, C., Zhou, J., Wang, P.: Effect of cutting parameters on surface roughness in titanium alloy TA15 cutting process. Mech. Eng. Autom. 3, 129–131 (2018)
7. Hu, Z.: Application exploration of turning instead of grinding technology in machining thin-walled ring parts. Innov. Appl. Sci. Technol. 5, 145–146 (2018)
Machine Learning in Molecular Dynamics Simulation Xiaojing Teng
Abstract Molecular dynamics simulation is a powerful tool for studying biological problems such as protein-ligand binding, protein folding and unfolding, the flexibility of biomolecules, and free energy calculations. It also plays an important role in drug design by identifying potential small molecules that bind to a target. With the rapid development of machine learning in recent years, many new possibilities have opened up for molecular dynamics simulation.
Keywords Molecular dynamics · Protein structure prediction · Machine learning
1 Introduction
The first molecular dynamics (MD) simulation dates back to the 1950s [1]. Because of its high resolution in both space and time, MD soon became an important supplement to biological experiments by elucidating the potential mechanisms underlying biological processes. With advances in computing technology and the optimization of simulation algorithms [2–5], longer simulations with much higher precision became possible. Nowadays, many MD packages are readily available, such as CHARMM, Amber, GROMACS, NAMD, and OpenMM. MD simulation can be applied to protein-ligand binding [6–13], protein folding and unfolding [14–20], fluctuation and conformational change of biomolecules [21–27], drug design [28–33], and many other areas [34–37]. The 2013 Nobel Prize in Chemistry was awarded to Martin Karplus, Michael Levitt, and Arieh Warshel, pioneers in the field of computational chemistry, which underlines the importance of MD in research.
Protein structure prediction is a long-standing challenge. Much effort had been invested in the field, but progress was limited until 2018, when AlphaFold, an artificial intelligence and machine learning based program for protein structure prediction developed by DeepMind, outperformed all other groups in the 13th Critical Assessment of Structure Prediction (CASP) [38]. Two years later, AlphaFold 2 reached an accuracy comparable to experimental results. The advent of machine learning has brought many possibilities to the field of MD simulation; in fact, researchers had adopted it in force field optimization and drug design before AlphaFold. This short review summarizes recent progress in the application of machine learning to MD simulations.
X. Teng (B)
Department of Chemistry, Georgetown University, Washington, DC 20057, USA
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_52
2 Force Field Development
The force field is one of the key components of MD simulations [39, 40]. It describes the bonded (bond, angle, and dihedral) and non-bonded (electrostatic and van der Waals) interactions between atoms, and the accuracy of a simulation relies heavily on it. In general, force field optimization is a tedious process that starts from quantum calculations, followed by parameterization with empirical modification and adjustment to experimental observables such as density and diffusion. Pande and Wang developed ForceBalance, a machine learning based automatic procedure for optimizing force field parameters [41]. Using a set of reference data (ab initio calculations and experiments) as the gold standard, the initial parameters of the force field are updated and optimized towards that standard through MD simulations. This method is computationally efficient and strikingly accurate. Currently the best 4-point water model was developed with ForceBalance [42, 43]. Machine learning is now a routine way to parameterize force fields [44–46]; it has made this labor-intensive process automatic and efficient.
Challenges remain. As the system becomes more complex, which is the case in biologically relevant studies, the parameter space becomes exponentially harder to explore, so the efficiency of the process will be a challenge. The weighting of different observables is another problem: sometimes several gold standards cannot be reached simultaneously, and how to assign weights to the different observables is still an open question. Moreover, as the optimization proceeds, the generality of the force field might be undermined [47], so balancing accuracy and generality is a further challenge.
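The idea of fitting force field parameters to weighted reference data can be illustrated with a toy example. The sketch below fits the two Lennard-Jones parameters of a van der Waals term to reference energies by weighted linear least squares. It is an assumption-laden illustration, not the ForceBalance procedure or its API: the reference energies are synthetic, the parameter values (ε = 0.65, σ = 3.15) and weights are hypothetical, and a real workflow would compare simulated observables with ab initio and experimental data.

```python
def lj_energy(r, epsilon, sigma):
    """Lennard-Jones energy 4*eps*((sigma/r)^12 - (sigma/r)^6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def fit_lj(distances, energies, weights):
    """Weighted least-squares fit of E(r) = A / r^12 - B / r^6.

    The model is linear in A = 4*eps*sigma^12 and B = 4*eps*sigma^6, so the
    2x2 normal equations are solved in closed form (Cramer's rule).
    """
    s11 = s12 = s22 = t1 = t2 = 0.0
    for r, energy, w in zip(distances, energies, weights):
        x1 = r ** -12     # basis function multiplying A
        x2 = -(r ** -6)   # basis function multiplying B
        s11 += w * x1 * x1
        s12 += w * x1 * x2
        s22 += w * x2 * x2
        t1 += w * x1 * energy
        t2 += w * x2 * energy
    det = s11 * s22 - s12 * s12
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    sigma = (a / b) ** (1.0 / 6.0)
    epsilon = b * b / (4.0 * a)
    return epsilon, sigma


# Synthetic "gold standard": energies generated from hypothetical parameters.
distances = [3.2, 3.5, 4.0, 4.5, 5.0, 6.0]
reference = [lj_energy(r, 0.65, 3.15) for r in distances]
weights = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0]  # hypothetical observable weights

epsilon_fit, sigma_fit = fit_lj(distances, reference, weights)

if __name__ == "__main__":
    print(f"fitted epsilon = {epsilon_fit:.4f}, sigma = {sigma_fit:.4f}")
```

With noise-free, mutually consistent reference data the fit recovers the generating parameters for any positive weights. When the references conflict, the weights decide the compromise, which is the open question raised above.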
3 Structure and Dynamics Prediction
As mentioned above, the prediction of protein structure was a long-standing challenge of great significance [48, 49]. Without knowing the structure of a protein, it is impossible to understand its molecular mechanism in a biological process, which hinders subsequent research such as understanding its function in the larger picture, drug design, and above all protein design. For a long time, protein structures could only be obtained from X-ray crystallography and NMR spectroscopy. Not only is this process inefficient, but many proteins are difficult to prepare in vitro in a form whose structure can be measured. Thanks to the tremendous progress in computing power and machine learning algorithms, as well as the accumulation of experimental data, accurate prediction of protein structure from sequence alone is now possible [38]. The precision is comparable to that of experiments, which has transformed structural biology [50]. Many further developments based on AlphaFold have been proposed [51–53].
The achievement of machine learning in protein structure prediction should not be underestimated. It facilitates the study of protein function and drug design. However, the prediction is a static geometric structure corresponding to particular crystallographic conditions, whereas the biological environment is much more complex, and proteins can undergo significant structural fluctuations and conformational changes that are more relevant to the biological process. For example, enzymatic catalysis requires protein flexibility and is often accompanied by large motions of the enzyme [54, 55]. Much summarizing and modeling work has been done. For instance, the interactions between small molecules and proteins and their effect on protein stability have been studied extensively [56–59], and a dynamical model that predicts the effect based only on MD simulations of small molecules and water has been proposed [60–62]. Protein-protein interactions and conformational changes are other significant questions to be explored with machine learning [63, 64]. The amount of data is currently quite limited, but with more simulations and the help of machine learning, predicting protein dynamics and behavior in different environments is possible, though challenging [19, 65].
4 Conclusions
MD simulation is a powerful tool for studying what happens at the molecular level and an ideal complement to experimental studies. It can uncover the molecular mechanisms underlying biological processes. With the help of machine learning, MD simulation can produce more accurate results (through force field optimization) and has far more structures available to study. Beyond what is covered in this review, machine learning can also help MD simulation in drug design and in the analysis of simulation data. Many challenges remain, such as predicting dynamic properties in complex environments, protein-protein interactions, and conformational changes.
Acknowledgements X.T. acknowledges support from the National Institutes of Health through Grant No. R01-GM122441, and the mentoring of Toshiko Ichiye.
References 1. Alder, B.J., Wainwright, T.E.: Phase transition for a hard sphere system. J. Chem. Phys. 27(5), 1208–1209 (1957) 2. Ryckaert, J.-P., Ciccotti, G., Berendsen, H.J.: Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Chem. Phys. 23(3), 327–341 (1977) 3. Feller, S.E., Pastor, R.W., Rojnuckarin, A., Bogusz, S., Brooks, B.R.: Effect of electrostatic force truncation on interfacial and transport properties of water. J. Phys. Chem. 100(42), 17011– 17020 (1996) 4. Nosé, S., Klein, M.: Constant pressure molecular dynamics for molecular systems. Mol. Phys. 50(5), 1055–1076 (1983) 5. Hoover, W.G.: Canonical dynamics: equilibrium phase-space distributions. Phys. Rev. A 31(3), 1695 (1985) 6. Swegat, W., Schlitter, J., Krüger, P., Wollmer, A.: Md simulation of protein-ligand interaction: formation and dissociation of an insulin-phenol complex. Biophys. J. 84(3), 1493–1506 (2003) 7. Woo, H.-J., Roux, B.: Calculation of absolute protein-ligand binding free energy from computer simulations. Proc. Natl. Acad. Sci. USA 102(19), 6825–6830 (2005) 8. Sousa, S.F., Fernandes, P.A., Ramos, M.J.: Protein-ligand docking: current status and future challenges. Proteins 65(1), 15–26 (2006) 9. Cui, Q., Karplus, M.: Allostery and cooperativity revisited. Protein Sci. 17(8), 1295–1307 (2008) 10. Deng, Y., Roux, B.: Computations of standard binding free energies with molecular dynamics simulations. J. Phys. Chem. B 113(8), 2234–2246 (2009) 11. Teng, X., Hwang, W.: Structural and dynamical hierarchy of fibrillar collagen. In: Kaunas, R.R., Zemel, A. (eds.) Cell and Matrix Mechanics, Chap. 4, pp. 101–118. CRC Press (2014) 12. Guterres, H., Im, W.: Improving protein-ligand docking results with high-throughput molecular dynamics simulations. J. Chem. Inf. Model. 60(4), 2189–2198 (2020) 13. Lahey, S.-L.J., Rowley, C.N.: Simulating protein-ligand binding with neural network potentials. Chem. Sci. 
11(9), 2362–2368 (2020) 14. Durell, S.R., Brooks, B.R., Ben-Naim, A.: Solvent-induced forces between two hydrophilic groups, vol. 98, no. 8, pp. 2198–2202 (1994) 15. Duan, Y., Kollman, P.A.: Pathways to a protein folding intermediate observed in a 1microsecond simulation in aqueous solution. Science 282(5389), 740–744 (1998) 16. Mayor, U., Johnson, C.M., Daggett, V., Fersht, A.R.: Protein folding and unfolding in microseconds to nanoseconds by experiment and simulation. Proc. Natl. Acad. Sci. U.S.A. 97(25), 13518–13522 (2000) 17. Piana, S., Lindorff-Larsen, K., Shaw, D.E.: Protein folding kinetics and thermodynamics from atomistic simulation. Proc. Natl. Acad. Sci. U.S.A. 109(44), 17845–17850 (2012) 18. Teng, X., Hwang, W.: Chain registry and load-dependent conformational dynamics of collagen. Biomacromolecules 15(8), 3019–3029 (2014) 19. Noé, F., De Fabritiis, G., Clementi, C.: Machine learning for protein folding and dynamics. Curr. Opin. Struct. Biol. 60, 77–84 (2020) 20. Strodel, B.: Energy landscapes of protein aggregation and conformation switching in intrinsically disordered proteins. J. Mol. Biol. 433(20), 167182 (2021) 21. Brooks, B., Karplus, M.: Harmonic dynamics of proteins: normal modes and fluctuations in bovine pancreatic trypsin inhibitor. Proc. Natl. Acad. Sci. U.S.A. 80(21), 6571–6575 (1983) 22. Im, W., Roux, B.: Ions and counterions in a biological channel: a molecular dynamics simulation of OmpF porin from Escherichia coli in an explicit membrane with 1 m kcl aqueous salt solution. J. Mol. Biol. 319(5), 1177–1197 (2002) 23. Formaneck, M.S., Ma, L., Cui, Q.: Reconciling the “old” and “new” views of protein allostery: a molecular simulation study of chemotaxis y protein (chey). Proteins. 63(4), 846–867 (2006) 24. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016)
25. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018) 26. Jiang, Z., You, L., Dou, W., Sun, T., Xu, P.: Effects of an electric field on the conformational transition of the protein: a molecular dynamics simulation study. Polymers 11(2), 282 (2019) 27. Stevens, J.A., Grünewald, F., van Tilburg, P.M., König, M., Gilbert, B.R., Brier, T.A., Thornburg, Z.R., Luthey-Schulten, Z., Marrink, S.J.: Molecular dynamics simulation of an entire cell. Front. Chem. 11, 1106495 (2023) 28. Zhong, S., Chen, X., Zhu, X., Dziegielewska, B., Bachman, K.E., Ellenberger, T., Ballin, J.D., Wilson, G.M., Tomkinson, A.E., MacKerell, A.D., Jr.: Identification and validation of human DNA ligase inhibitors using computer-aided drug design. J. Med. Chem. 51(15), 4553–4562 (2008) 29. Lill, M.A., Danielson, M.L.: Computer-aided drug design platform using PyMOL. J. Comput.Aid. Mol. Des. 25, 13–19 (2011) 30. Zhao, H., Caflisch, A.: Molecular dynamics in drug design. Eur. J. Med. Chem. 91, 4–14 (2015) 31. Rajasekhar, S., Karuppasamy, R., Chanda, K.: Exploration of potential inhibitors for tuberculosis via structure-based drug design, molecular docking, and molecular dynamics simulation studies. J. Comput. Chem. 42(24), 1736–1749 (2021) 32. Sabe, V.T., Ntombela, T., Jhamba, L.A., Maguire, G.E., Govender, T., Naicker, T., Kruger, H.G.: Current trends in computer aided drug design and a highlight of drugs discovered via computational techniques: A review. Eur. J. Med. Chem. 224, 113705 (2021) 33. Bai, Q., Liu, S., Tian, Y., Xu, T., Banegas-Luna, A.J., Pérez-Sánchez, H., Huang, J., Liu, H., Yao, X.: Application advances of deep learning methods for de novo drug design and molecular dynamics simulation. Wiley Interdiscip. Rev. Comput. Mol. Sci. 12(3), e1581 (2022) 34. Karplus, M., Petsko, G.A.: Molecular dynamics simulations in biology. Nature 347, 631–639 (1990) 35. 
Hansson, T., Oostenbrink, C., van Gunsteren, W.: Molecular dynamics simulations. Curr. Opin. Struct. Biol. 12(2), 190–196 (2002) 36. Teng, X.: Mechanical analysis of collagen and DNA. Ph.D. thesis (2016) 37. Brooks, C.L., Case, D.A., Plimpton, S., Roux, B., Van Der Spoel, D., Tajkhorshid, E.: Classical molecular dynamics. J. Chem. Phys. 154(10) (2021) 38. Senior, A.W., Evans, R., Jumper, J., Kirkpatrick, J., Sifre, L., Green, T., Qin, C., Žídek, A., Nelson, A.W., Bridgland, A., et al.: Improved protein structure prediction using potentials from deep learning. Nature 577(7792), 706–710 (2020) 39. Ponder, J.W., Case, D.A.: Force fields for protein simulations. Adv. Protein Chem. 66, 27–85 (2003) 40. Huang, J., Rauscher, S., Nawrocki, G., Ran, T., Feig, M., de Groot, B.L., Grubmüller, H., MacKerell, A.D.: Charmm36: An improved force field for folded and intrinsically disordered proteins. Biophys. J. 112(3), 175a–176a (2017) 41. Wang, L.-P., Chen, J., Van Voorhis, T.: Systematic parametrization of polarizable force fields from quantum chemistry data. J. Chem. Theory Comp. 9(1), 452–460 (2013) 42. Wang, L.-P., Martinez, T.J., Pande, V.S.: Building force fields: an automatic, systematic, and reproducible approach. J. Phys. Chem. Lett. 5(11), 1885–1891 (2014) 43. Teng, X., Liu, B., Ichiye, T.: Understanding how water models affect the anomalous pressure dependence of their diffusion coefficients. J. Chem. Phys. 153(10), 104510 (2020) 44. Li, Y., Li, H., Pickard, F.C., IV., Narayanan, B., Sen, F.G., Chan, M.K., Sankaranarayanan, S.K., Brooks, B.R., Roux, B.: Machine learning force field parameters from ab initio data. J. Chem. Theory Comp. 13(9), 4492–4503 (2017) 45. Botu, V., Batra, R., Chapman, J., Ramprasad, R.: Machine learning force fields: construction, validation, and outlook. J. Phys. Chem. C 121(1), 511–522 (2017) 46. Unke, O.T., Chmiela, S., Sauceda, H.E., Gastegger, M., Poltavsky, I., Schütt, K.T., Tkatchenko, A., Müller, K.-R.: Machine learning force fields. 
Chem. Rev. 121(16), 10142–10186 (2021) 47. Poltavsky, I., Tkatchenko, A.: Machine learning force fields: recent advances and remaining challenges. J. Phys. Chem. Lett. 12(28), 6551–6564 (2021)
48. Al-Lazikani, B., Jung, J., Xiang, Z., Honig, B.: Protein structure prediction. Curr. Opin. Chem. Biol. 5(1), 51–56 (2001) 49. Zhang, Y.: Progress and challenges in protein structure prediction. Curr. Opin. Chem. Biol. 18(3), 342–348 (2008) 50. Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., et al.: Highly accurate protein structure prediction with AlphaFold. Nature 596(7873), 583–589 (2021) 51. Du, Z., Su, H., Wang, W., Ye, L., Wei, H., Peng, Z., Anishchenko, I., Baker, D., Yang, J.: The trRosetta server for fast and accurate protein structure prediction. Nat. Protoc. 16(12), 5634–5651 (2021) 52. Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., Steinegger, M.: ColabFold: making protein folding accessible to all. Nat. Methods 19(6), 679–682 (2022) 53. Callaway, E.: Protein-folding contest seeks next big breakthrough. Nature 13–14 (2023) 54. Kokkinidis, M., Glykos, N., Fadouloglou, V.: Protein flexibility and enzymatic catalysis. Adv. Protein Chem. Struct. Biol. 87, 181–218 (2012) 55. Secundo, F.: Conformational changes of enzymes upon immobilisation. Chem. Soc. Rev. 42(15), 6250–6261 (2013) 56. Timasheff, S.N.: Protein-solvent preferential interactions, protein hydration, and the modulation of biochemical reactions by solvent components. Proc. Natl. Acad. Sci. U.S.A. 99(15), 9721–9726 (2002) 57. Bennion, B.J., Daggett, V.: The molecular basis for the chemical denaturation of proteins by urea. Proc. Natl. Acad. Sci. U.S.A. 100(9), 5142–5147 (2003) 58. Teng, X., Huang, Q., Dharmawardhana, C.C., Ichiye, T.: Diffusion of aqueous solutions of ionic, zwitterionic, and polar solutes. J. Chem. Phys. 148(22), 222827 (2018) 59. Teng, X., Ichiye, T.: Dynamical effects of trimethylamine n-oxide on aqueous solutions of urea. J. Phys. Chem. B 123(5), 1108–1115 (2019) 60. 
Teng, X., Ichiye, T.: Dynamical model for the counteracting effects of trimethylamine n-oxide on urea in aqueous solutions under pressure. J. Phys. Chem. B 124(10), 1978–1986 (2020) 61. Liu, B., Ichiye, T.: Concentration dependence of dynamics and hydrogen bonding in aqueous solutions of urea, methyl-substituted ureas, and trimethylamine n-oxide. J. Mol. Liq. 358, 119120 (2022) 62. Teng, X., Ichiye, T.: Aqueous solutions of tmao and urea under pressure: molecular dynamics simulation study. In: Abstracts of Papers of the American Chemical Society, vol. 258, American Chemical Society 1155 16th street, NW, Washington, DC 20036 USA (2019) 63. Liu, S., Liu, C., Deng, L.: Machine learning approaches for protein-protein interaction hot spot prediction: progress and comparative assessment. Mol. 23(10), 2535 (2018) 64. Jin, Y., Johannissen, L.O., Hay, S.: Predicting new protein conformations from molecular dynamics simulation conformational landscapes and machine learning. Proteins: Struct. Funct. Bioinf. 89(8), 915–921 (2021) 65. Orlando, G., Raimondi, D., Codice, F., Tabaro, F., Vranken, W.: Prediction of disordered regions in proteins with recurrent neural networks and protein dynamics. J. Mol. Biol. 434(12), 167579 (2022)
Quality Evaluation of Airfoil Hybrid Mesh Based on Graph Neural Network Huaiqing Wang, Yufei Pang, Sumei Xiao, and Zhichao Wang
Abstract In airfoil numerical simulation, the mesh quality has an important influence on the accuracy and error of numerical simulation. The existing mesh quality evaluation requires a lot of manual interaction, which greatly reduces the efficiency of mesh generation and necessitates the implementation of intelligent mesh evaluation methods. Graph neural networks can extract features from graph data, possess self-adaptability and generalization ability, and have been successfully applied in many industries. In this paper, we propose a deep graph neural network, SDeepNet, to evaluate mesh quality and construct a large-scale mixed mesh dataset, MixSet, for training and validating the model. We test and compare the performance of the mesh quality evaluation models GridNet, GMeshNet, and SDeepNet on the mesh dataset MixSet. The experimental results show that the SDeepNet model can achieve high accuracy and recall in the mixed mesh quality evaluation task. Keywords Graph Neural Network · Airfoil hybrid mesh · Mesh quality evaluation
H. Wang · S. Xiao (B)
School of Manufacturing Science and Engineering, Southwest University of Science and Technology, Mianyang 621010, China
e-mail: [email protected]
Y. Pang
Computational Aerodynamics Institute, China Aerodynamics Research and Development Center, Mianyang 621010, China
Z. Wang
Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_53

1 Introduction
With the rapid development of computational fluid dynamics (CFD), numerical simulation has been widely used in the design and experimentation of spacecraft. When performing numerical simulations, mesh quality is used to describe the accuracy of
the discretized mesh in terms of geometric shape and topology [1]. Good mesh quality can improve calculation accuracy, reduce simulation error, and decrease calculation time and resource overhead. However, since different mesh generation algorithms may produce meshes of varying quality, manual evaluation of mesh quality can be very time-consuming [2].

Traditional mesh quality evaluation follows two main approaches. The first judges the quality of the mesh based on human experience, using geometric angles and shape indices [3]. Li et al. [4] summarized various commonly used mesh quality indicators, such as the minimum included angle of triangles, which can be used to evaluate 2D and 3D meshes with different topological structures. The second relies on mathematical formulations. Shephard and Seol [5] used algebraic methods to study mesh quality indicators and provided a method to construct, classify, and evaluate mesh quality metrics. Although these methods are widely used in industry, they are usually only used to detect defective mesh cells. There is no single objective evaluation index, and different evaluation indices may produce different evaluation results.

With the development of artificial intelligence, deep learning has been widely used in many fields [6, 7]. Chen et al. [8] built GridNet, a convolutional neural network (CNN) model for two-dimensional mesh quality evaluation. The length, width, and included angle of each mesh element on the two-dimensional structured mesh are used as the input features of the model, and two-dimensional convolutional layers are employed to learn the features of the mesh. Wang et al. [9] converted mesh data into graph data based on a mesh element representation and used the graph neural network model GMeshNet to extract mesh features. Both methods can achieve high-precision mesh quality evaluation.
However, GridNet can only be applied to quality evaluation on two-dimensional structured grids because it is difficult to convert a three-dimensional unstructured mesh into two-dimensional structured data. GMeshNet can be applied to three-dimensional unstructured meshes, but the efficiency of its graph representation scheme and its assessment accuracy on mixed meshes are somewhat insufficient.

In this paper, we propose a mesh quality evaluation method based on graph neural networks. The main contributions are as follows:
(1) We extract meshes and their corresponding labels, covering eight quality types, from four different airfoil datasets to construct a large-scale mesh dataset, MixSet, for the mesh quality evaluation task.
(2) We propose a graph representation method based on the mesh cross structure, and design and implement an efficient algorithm to construct graph structures from raw mesh files.
(3) We build a deep graph convolutional network, SDeepNet, to capture the local and global features of the mesh. The experimental results show that the SDeepNet model achieves high assessment precision on mixed meshes with various topologies.
2 Data Preparation
Large benchmark datasets are important for promoting deep learning research and applications. With accurately labeled datasets, one can evaluate the performance of various algorithms using indicators such as classification accuracy, recall, and precision, and determine which algorithms are best suited for specific tasks [10]. To develop efficient mesh generation and quality evaluation algorithms for studying and optimizing aircraft design, a high-quality airfoil mesh dataset is needed. However, due to limited computing resources, currently available public mesh datasets usually cover a single airfoil geometry [11], such as NACA-Market, NACA6510, and NACA0009 [8].

In this section, to train a mesh quality discrimination model that covers more airfoils, we use our automatic mesh collector to extract and classify meshes from four different models (NACA-Market, NACA6510, NACA0009, NACA0012Alter), and propose a benchmark dataset, MixSet, for deep learning-based mesh quality assessment. First, each of the four airfoil datasets has 10,240 grids of different sizes. According to the smoothness, orthogonality, and distribution of the grids, the 10,240 grids are divided into eight types. For each of the eight types, the collector extracts 640 grids and their corresponding labels from each of the four airfoil datasets and combines them into our benchmark dataset MixSet. Second, using this collector, we generated a total of 20,480 airfoil meshes across the four models. Figure 1 shows two examples of airfoil mesh geometry models; each category contains 2560 mesh samples (640 for each airfoil).
The eight labels are Good Mesh (W), Bad Mesh Orthogonality (O), Bad Mesh Smoothness (S), Bad Mesh Density (D), Bad Mesh Orthogonality and Smoothness (OS), Bad Mesh Orthogonality and Density (OD), Bad Mesh Smoothness and Density (SD), and Bad Mesh Orthogonality, Smoothness, and Density (OSD). Each label contains 4 × 640 samples.
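The composition of MixSet stated above can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constant and function names are assumptions introduced here, and the airfoil names are those listed in the text.

```python
# Illustrative check of the MixSet composition described in the text.
# All names below are hypothetical; only the counts come from the paper.
AIRFOILS = ["NACA-Market", "NACA6510", "NACA0009", "NACA0012Alter"]
LABELS = ["W", "O", "S", "D", "OS", "OD", "SD", "OSD"]
SAMPLES_PER_AIRFOIL_AND_LABEL = 640


def mixset_counts():
    """Return (samples per label, total samples) for the stated composition."""
    per_label = len(AIRFOILS) * SAMPLES_PER_AIRFOIL_AND_LABEL
    total = per_label * len(LABELS)
    return per_label, total


per_label, total = mixset_counts()
print(per_label, total)  # 2560 20480
```

The result agrees with the 2560 samples per category and the 20,480 meshes in total reported in this section.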
Fig. 1 Examples of meshes in the NACA0012 and NACA6510 datasets
Fig. 2 Graph representation scheme based on the mesh cross structure. Mesh cross structures are treated as vertices of the graph. Two cross structures are adjacent only when their end points touch. For example, cross structure $x_3$ is adjacent to $x_1$, $x_2$, $x_4$ and $x_5$
3 Grid Preprocessing
A graph neural network classifier requires graph data as input, whereas mesh data is stored as a set of point coordinates. A preprocessing method is therefore needed to convert the mesh data into graph data that the graph neural network can process. We propose a graph representation scheme based on the mesh cross structure to transform mesh data into graph data. The representation scheme is shown in Fig. 2, where the mesh cross structures $x_1, x_2, \ldots, x_n$ are regarded as vertices, and the adjacencies between mesh cross structures are regarded as edges of the graph. The mesh can therefore be represented by its input feature matrix X and adjacency matrix A; the former represents the characteristics of the vertices (mesh cross structures), while the latter represents the topological relationship of the mesh.

The conversion algorithm proceeds as follows. First, it calculates the number of nodes on each mesh face as well as the total number of nodes in the entire mesh. Next, it constructs the adjacency matrix A by traversing each mesh face and determining the connection relationship between nodes in the x and y directions. Then, it generates the characteristics of each mesh cross structure, including edge lengths, edge length vectors, and twist angles. Finally, it returns the feature matrix X and the adjacency matrix A.
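The conversion steps can be sketched for a single structured mesh face as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each grid node, together with its up to four incident edges, is assumed to play the role of one mesh cross structure, and only the edge lengths are used as features (edge vectors and twist angles are omitted). The function name `grid_to_graph` is hypothetical.

```python
import math


def grid_to_graph(points):
    """Build (X, A) from one structured mesh face.

    points[j][i] is the (x, y) coordinate of the grid node in row j, column i.
    Assumption: every grid node with its up-to-four incident edges is treated
    as one "mesh cross structure", i.e. one graph vertex.
    """
    ny, nx = len(points), len(points[0])
    n = ny * nx

    def index(j, i):
        return j * nx + i

    A = [[0] * n for _ in range(n)]
    X = []
    # Neighbor offsets in the x and y directions: left, right, down, up.
    offsets = [(0, -1), (0, 1), (-1, 0), (1, 0)]
    for j in range(ny):
        for i in range(nx):
            arm_lengths = []
            for dj, di in offsets:
                jj, ii = j + dj, i + di
                if 0 <= jj < ny and 0 <= ii < nx:
                    A[index(j, i)][index(jj, ii)] = 1
                    arm_lengths.append(math.dist(points[j][i], points[jj][ii]))
                else:
                    arm_lengths.append(0.0)  # missing arm on the boundary
            X.append(arm_lengths)
    return X, A


# Usage: a 3 x 3 uniform grid with unit spacing.
pts = [[(float(i), float(j)) for i in range(3)] for j in range(3)]
X, A = grid_to_graph(pts)
print(sum(A[4]))  # degree of the central node
```

On this small grid the central vertex has four neighbors, matching the adjacency pattern illustrated for $x_3$ in Fig. 2.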
4 Network Structure
To effectively capture the local and global features of the mesh, we designed a deep graph convolutional network structure (SDeepNet), as shown in Fig. 3. The feature matrix X and adjacency matrix A obtained by preprocessing MixSet pass through 15 layers of graph convolution operations, graph pooling operations, JK-Net [12], and graph readout operations. The resulting low-dimensional representation of the mesh is fed to a fully connected multilayer perceptron (MLP), which performs classification and outputs the quality assessment label of the mesh.

Fig. 3 Graph convolutional network structure (SDeepNet)

This structure introduces residual connections to make representation learning more efficient. Using TopKPooling [13] as the pooling method preserves the key structural features of the graph and improves efficiency. Combining graph features from different levels through JK-Net enables the model to capture local and global graph structure features simultaneously [14]. We introduce Dropout [15] to prevent overfitting and to improve the robustness and generalization performance of the model on different datasets.
4.1 Graph Convolutional Layer
The DeepGCN layer [16] is our proposed graph convolution layer and the core component of the SDeepNet model. It performs convolution operations on the graph to capture the relationships between nodes, and comprises a graph convolution operation, normalization, an activation function, and dropout to enhance the performance and robustness of the model. First, we use GCNConv [17] as the graph convolution operation, which enables node features to be propagated and aggregated in local neighborhoods. To strengthen feature transfer between layers, we introduce residual connections, which mitigate the vanishing gradient problem and allow the model to better learn deep features. Second, we normalize the feature matrix $H^{(l)}$ of each graph convolutional layer l using LayerNorm [18] and apply the ReLU activation function to the normalized feature matrix as a nonlinear transformation. Finally, we add a Dropout layer after each graph convolutional layer. Dropout randomly sets a portion of the node features to zero, which helps prevent overfitting and improves the generalization performance of the model on the MixSet dataset.
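A single layer of this kind can be sketched in plain Python as follows. The sketch makes several assumptions that the text does not fix: the ordering convolution, LayerNorm, ReLU, then residual addition; the standard GCN propagation matrix $D^{-1/2}(A + I)D^{-1/2}$; a square weight matrix W so that the residual sum is well defined; and dropout omitted (inference mode). All function names are hypothetical.

```python
import math


def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def normalized_adjacency(A):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    n = len(A)
    A_hat = [[A[r][c] + (1.0 if r == c else 0.0) for c in range(n)] for r in range(n)]
    deg = [sum(row) for row in A_hat]
    return [[A_hat[r][c] / math.sqrt(deg[r] * deg[c]) for c in range(n)] for r in range(n)]


def layer_norm(H, eps=1e-5):
    """Normalize each node's feature vector to zero mean and unit variance."""
    out = []
    for row in H:
        mean = sum(row) / len(row)
        var = sum((v - mean) ** 2 for v in row) / len(row)
        out.append([(v - mean) / math.sqrt(var + eps) for v in row])
    return out


def residual_gcn_layer(H, A, W):
    """H_out = H + ReLU(LayerNorm(A_norm @ H @ W)); dropout is omitted."""
    Z = matmul(matmul(normalized_adjacency(A), H), W)
    Z = layer_norm(Z)
    Z = [[max(0.0, v) for v in row] for row in Z]
    return [[h + z for h, z in zip(h_row, z_row)] for h_row, z_row in zip(H, Z)]


# Usage: two connected nodes with two features each and an identity weight.
A = [[0, 1], [1, 0]]
H = [[1.0, 2.0], [3.0, 4.0]]
W = [[1.0, 0.0], [0.0, 1.0]]
print(residual_gcn_layer(H, A, W))
```

Because the ReLU output is non-negative, the residual form can only add to the incoming features in this sketch, which makes the identity path through the layer explicit.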
4.2 Graph Pooling
To capture the global features of the graph and reduce computational complexity, we downsample nodes using graph pooling layers. In the model, we adopt TopKPooling as the pooling method. TopKPooling ranks the nodes according to their importance and keeps the top K most important nodes, which helps preserve the key structural features of the graph. Specifically, TopKPooling is computed as follows: (1) a parametric linear transformation is applied to the node features to compute an importance score for each node; (2) the nodes are sorted according to their importance scores, and the top K nodes are selected; (3) the feature matrix and adjacency matrix of the retained nodes are updated to form a new subgraph.
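The three steps can be sketched as follows. This is a minimal illustration, assuming a common formulation of top-K pooling in which the score is the projection of the node features onto a learnable vector p, squashed with tanh and also used to gate the retained features; the paper does not state these details. The function name `topk_pool` and the parameter `ratio` (the pooling rate) are introduced here for illustration.

```python
import math


def topk_pool(X, A, p, ratio):
    """Keep the ceil(ratio * N) highest-scoring nodes and their induced subgraph.

    X: node feature matrix (list of rows), A: adjacency matrix,
    p: projection vector (learned in a real model), ratio: pooling rate in (0, 1].
    """
    norm = math.sqrt(sum(v * v for v in p))
    # Step 1: importance score of each node from a linear projection.
    scores = [math.tanh(sum(x * w for x, w in zip(row, p)) / norm) for row in X]
    # Step 2: sort by score and keep the top K nodes.
    k = max(1, math.ceil(ratio * len(X)))
    keep = sorted(range(len(X)), key=lambda i: scores[i], reverse=True)[:k]
    # Step 3: gate the kept features and restrict A to the kept nodes.
    X_new = [[v * scores[i] for v in X[i]] for i in keep]
    A_new = [[A[r][c] for c in keep] for r in keep]
    return X_new, A_new, keep


# Usage: a path graph 0-1-2-3 pooled with rate 0.5.
X = [[1.0, 0.0], [3.0, 0.0], [2.0, 0.0], [0.5, 0.0]]
A = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
X_new, A_new, keep = topk_pool(X, A, p=[1.0, 0.0], ratio=0.5)
print(keep, A_new)
```

In this example the two highest-scoring nodes are neighbors in the original graph, so the pooled subgraph keeps the edge between them.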
4.3 Jumping Knowledge Layer and Multilayer Perceptron
To efficiently combine graph features at different levels, we use the Jumping Knowledge layer, referred to as JK-Net. JK-Net fuses the outputs of the individual graph convolutional layers to generate a comprehensive graph representation. In the SDeepNet architecture, we apply the jumping knowledge network to the outputs of the GCN layers. Given a series of feature matrices $H^{(1)}, H^{(2)}, \ldots, H^{(L)}$, where L is the number of GCN layers, the jumping knowledge network calculates the final feature matrix $H^{(out)}$ as follows:

$$H^{(out)} = \mathrm{CONCAT}\left(H^{(1)}, H^{(2)}, \ldots, H^{(L)}\right) \tag{1}$$
Here, CONCAT represents the concatenation of the feature matrices. We choose the "cat" mode to realize the jumping knowledge network, that is, we concatenate the feature matrices of the different layers along the feature dimension. This preserves the features of each layer, allowing the model to adaptively select local or global features as needed. To predict the graph category from the features extracted by the JK layer, we use a multilayer perceptron (MLP) to process the output of the jumping knowledge layer and generate the final graph classification result. The MLP consists of linear layers, batch normalization layers [19], and ReLU activation functions, which gradually map the high-dimensional graph representation to the target category space.
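The concatenation in Eq. (1) and a subsequent readout can be illustrated with a short sketch. It assumes that all layer outputs share the same number of nodes and that the readout is a simple mean over nodes; neither detail is specified in the text, and the function names are hypothetical.

```python
def jk_concat(layer_outputs):
    """Concatenate per-layer node features along the feature dimension ("cat" mode).

    Assumption: every matrix in layer_outputs has one row per node, in the
    same node order.
    """
    n_nodes = len(layer_outputs[0])
    return [
        [v for H in layer_outputs for v in H[node]]
        for node in range(n_nodes)
    ]


def mean_readout(H):
    """Average the node features into a single graph-level vector."""
    n = len(H)
    return [sum(col) / n for col in zip(*H)]


# Usage: two layers with 1 and 2 features per node, on a graph with 2 nodes.
H1 = [[1.0], [3.0]]
H2 = [[2.0, 4.0], [6.0, 8.0]]
H_out = jk_concat([H1, H2])
print(H_out)                # [[1.0, 2.0, 4.0], [3.0, 6.0, 8.0]]
print(mean_readout(H_out))  # [2.0, 4.0, 6.0]
```

The graph-level vector produced by the readout is what an MLP classifier would receive as input in this simplified setting.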
5 Experimental Results This section details the experimental design and analysis of results. First, we construct and preprocess the experimental dataset. Then, we introduce the experimental environment and parameter settings. Next, we analyze the performance of the mesh quality discrimination method based on graph neural networks through comparative experiments with baseline methods. Finally, we analyze and discuss the experimental results.
5.1 Experimental Environment and Parameter Settings
The experiments are carried out on a cloud server with two NVIDIA A100-PCIE-40GB GPUs. We tuned the hyperparameters using grid search on a subset of the data (2048 meshes of different categories). As seen from Table 1, as the network deepens, its performance generally first increases and then decreases. In this set of experiments, the 10-layer network with a pooling rate of 0.7 achieves the best single result, but the average accuracy of the 15-layer network is 1.06% higher than that of the 10-layer network. With a pooling rate of around 0.7 and a network depth of 15 layers in mind, Table 2 presents a set of optimal parameter settings selected by grid search. These experiments on the subset provide the basis for the subsequent analysis and discussion of the experimental results. In the following experiments, we use this set of optimal parameter settings to train and validate the graph neural network-based mesh quality discrimination model and evaluate its performance in practical tasks.
Table 1 Effect of network depth and pooling rate on accuracy
Table 2 Parameter settings and value range

| Parameter | Value range | Optimal value |
| Learning rate | 0.001–0.1 | 0.05 |
| Batch size | 16–64 | 32 |
| Network depth | 5–30 | 15 |
| Pool rate | 0–1 | 0.76 |
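The parameter selection procedure can be sketched as an exhaustive search over candidate values. The code below is only an illustration: `toy_accuracy` is a hypothetical stand-in for training the model on the subset and returning its validation accuracy, and the candidate values are assumptions chosen from within the ranges of Table 2.

```python
from itertools import product


def grid_search(param_grid, evaluate):
    """Return (best_params, best_score) over the Cartesian product of param_grid."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_accuracy(params):
    """Hypothetical score that peaks at depth 15 and pooling rate 0.7."""
    return -abs(params["depth"] - 15) - abs(params["pool_rate"] - 0.7)


grid = {"depth": [5, 10, 15, 20, 30], "pool_rate": [0.5, 0.6, 0.7, 0.8]}
best, score = grid_search(grid, toy_accuracy)
print(best)
```

In a real experiment the evaluation function would be far more expensive than the search loop, which is one reason to tune on a subset as described above.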
5.2 SDeepNet Network Evaluation Results
To verify the performance of the SDeepNet model, we selected two neural network-based mesh quality discrimination models (GridNet, GMeshNet) for comparative experiments on three benchmark datasets (NACA0012, NACA6510, MixSet). The recall and accuracy of the three network models on the different datasets are shown in Table 3. On the NACA0012 dataset, GMeshNet and GridNet perform best, with accuracies of 92.88% and 91.56%, respectively, while their accuracy on the NACA6510 and MixSet datasets decreases. This is because the GridNet and GMeshNet models were trained on the NACA0012 dataset, resulting in slightly weaker generalization ability on other datasets. The SDeepNet model is trained on the MixSet dataset (including the NACA0012 and NACA6510 datasets); it fuses the features of vertices and edges and adaptively learns local and global features, which gives SDeepNet stronger generalization performance. It is worth noting that both GMeshNet and SDeepNet are based on the graph neural network structure. In terms of preprocessing, GMeshNet uses a representation scheme based on mesh elements, with a processing time of about 0.79 s per mesh, while our mesh cross structure-based representation
Table 3 Recall and accuracy of three network models on different data sets

| Model/set | NACA0012 (%) | NACA6510 (%) | MixSet (%) |
| GridNet | 91.56 | – | 85.43 |
| GMeshNet | 92.88 | 87.37 | 88.25 |
| SDeepNet | 90.26 | 89.60 | 91.37 |

Table 4 Confusion matrix of SDeepNet

| Class | W (%) | O (%) | S (%) | D (%) | OS (%) | OD (%) | SD (%) | OSD (%) |
| W | 90.10 | 0.00 | 0.80 | 0.00 | 0.00 | 0.00 | 0.00 | 9.10 |
| O | 0.00 | 94.50 | 0.00 | 0.00 | 5.50 | 0.00 | 0.00 | 0.00 |
| S | 8.20 | 0.00 | 86.80 | 0.00 | 0.00 | 0.00 | 5.00 | 0.00 |
| D | 0.00 | 0.00 | 0.00 | 98.50 | 0.00 | 0.00 | 1.50 | 0.00 |
| OS | 0.00 | 12.80 | 0.00 | 0.00 | 86.50 | 0.00 | 0.00 | 0.70 |
| OD | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 93.70 | 0.00 | 6.30 |
| SD | 0.00 | 0.00 | 1.20 | 7.00 | 0.00 | 0.00 | 91.80 | 0.00 |
| OSD | 0.00 | 0.00 | 0.00 | 0.00 | 0.60 | 9.00 | 0.00 | 90.40 |
scheme takes about 0.56 s to process each mesh, which reduces the preprocessing time by about 30%. The accuracy of the final mesh classification is also improved, which shows that our mesh representation scheme can effectively express the characteristics of the mesh while making preprocessing considerably more efficient.

To better analyze the accuracy across different mesh classes, we construct the confusion matrix of SDeepNet. From Table 4, it can be seen that D (meshes with poor density) has the highest classification accuracy, with a correct classification rate of 98.50%. Other categories such as W, O, OD and SD perform reasonably well, with correct classification rates of around 92%. S and OS have the lowest accuracy: 8.20% and 5.00% of S (poor smoothness) meshes are misclassified as W and SD meshes, respectively, and 12.80% of OS meshes are misclassified as O meshes. It is worth noting that although some meshes are incorrectly classified as other types, the major mesh quality issues are still correctly assessed. Although the neural network still makes some errors in mesh quality identification, the evaluation results can still provide guidance for the subsequent mesh optimization process.
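The figures discussed above can be reproduced from Table 4 and the reported preprocessing times. The sketch below assumes that each row of the confusion matrix corresponds to a true class and each column to a predicted class, which is consistent with every row summing to 100%; the function names are hypothetical.

```python
LABELS = ["W", "O", "S", "D", "OS", "OD", "SD", "OSD"]

# Values in percent, read from Table 4 (assumed: rows = true class,
# columns = predicted class, in the order given by LABELS).
CONFUSION = [
    [90.10, 0.00, 0.80, 0.00, 0.00, 0.00, 0.00, 9.10],
    [0.00, 94.50, 0.00, 0.00, 5.50, 0.00, 0.00, 0.00],
    [8.20, 0.00, 86.80, 0.00, 0.00, 0.00, 5.00, 0.00],
    [0.00, 0.00, 0.00, 98.50, 0.00, 0.00, 1.50, 0.00],
    [0.00, 12.80, 0.00, 0.00, 86.50, 0.00, 0.00, 0.70],
    [0.00, 0.00, 0.00, 0.00, 0.00, 93.70, 0.00, 6.30],
    [0.00, 0.00, 1.20, 7.00, 0.00, 0.00, 91.80, 0.00],
    [0.00, 0.00, 0.00, 0.00, 0.60, 9.00, 0.00, 90.40],
]


def per_class_rate(matrix, labels):
    """Correct classification rate of each class (the diagonal entries)."""
    return {label: matrix[i][i] for i, label in enumerate(labels)}


def top_confusion(matrix, labels, true_label):
    """Most frequent wrong prediction for a given true class."""
    i = labels.index(true_label)
    j = max((c for c in range(len(labels)) if c != i), key=lambda c: matrix[i][c])
    return labels[j], matrix[i][j]


def relative_time_saving(t_old, t_new):
    """Fraction of preprocessing time saved when going from t_old to t_new."""
    return (t_old - t_new) / t_old


rates = per_class_rate(CONFUSION, LABELS)
print(max(rates, key=rates.get), min(rates, key=rates.get))
print(top_confusion(CONFUSION, LABELS, "OS"))
print(round(relative_time_saving(0.79, 0.56), 3))
```

With the reported times of 0.79 s and 0.56 s per mesh, the relative saving is roughly 29%, in line with the figure of about 30% quoted above.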
6 Conclusion In computational fluid dynamics, mesh quality plays a vital role in obtaining accurate results. To improve the efficiency of mesh quality evaluation, we introduce graph neural networks into mesh quality evaluation and utilize the efficient learning ability
of neural networks to extract high-quality mesh attributes. We extracted and produced a mesh dataset, MixSet, based on multiple original airfoil mesh benchmark datasets and proposed a graph representation scheme based on mesh cross structures to convert mesh data into graph data. We train a graph neural network model, SDeepNet, to automatically evaluate the overall quality of the input mesh and distinguish its quality category. Experimental results show that SDeepNet achieves good accuracy on the 2D airfoil mesh dataset. Future work includes: (1) addressing insufficient feature extraction for complex geometric and topological features of the mesh; (2) addressing the labeling inconsistency caused by the blurred boundaries between some mesh quality categories. In addition, existing GNN methods mainly target homogeneous graphs, but in practical applications, mesh data may contain multiple types of nodes and edges. Future research can focus on developing GNN methods for heterogeneous graphs to better handle complex mesh data.
References
1. Zhang, L., Fengshun, et al.: Computational Fluid Dynamics for Grid Generation Techniques. Science Press 2017, 1–14, 219–225 (2017)
2. Gammon, M.: A review of common geometry issues affecting mesh generation. In: 2018 AIAA Aerospace Sciences Meeting (2018)
3. Lavoué, G.: A multiscale metric for 3D mesh visual quality assessment. Comput. Graphics Forum 30(5), 1427–1437 (2011)
4. Li, H., Wu, J., et al.: Finite element grid section and grid quality determination index. China Mech. Eng. 23(3), 368–377 (2012)
5. Shephard, M.S., Seol, S.: Algebraic mesh quality metrics for unstructured meshes. International Meshing Roundtable (2003)
6. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. Comput. Sci. (2014)
7. Chollet, F.: Deep learning with depthwise separable convolutions. In: Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2017)
8. Chen, X., Liu, J., Pang, Y., Chen, J.: Developing a new mesh quality evaluation method based on convolutional neural network. Eng. Appl. Comput. Fluid Mech. 14(1), 391–400 (2020)
9. Wang, Z., Chen, X., Liu, J.: Evaluating mesh quality with graph neural networks. Springer (2021)
10. Chen, X., Chen, R., Wan, Q., Xu, R., Liu, J.: An improved data-free surrogate model for solving partial differential equations using deep neural networks. Sci. Rep. 11 (2021)
11. Pang, Y., Lu, F., Liu, Y., Chen, B., Jiang, X., Qi, L., Chen, J., Xie, D., Zhang, H.: A general structured grid generation software of national numerical windtunnel. Acta Aerodyn. Sin. 38(4) (2020)
12. Xu, K., Li, C., Tian, Y., et al.: Representation learning on graphs with jumping knowledge networks. In: International Conference on Machine Learning, vol. 2018, pp. 5453–5462 (2018)
13. Diehl, F.: Edge contraction pooling for graph neural networks. arXiv:1905.10990 (2019)
14. Ranjan, E., Sanyal, S.: ASAP: adaptive structure aware pooling for learning hierarchical graph representations. In: AAAI (2020)
15. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)
16. Ying, Z., Bourgeois, D., You, J., Zitnik, M.: GNN explainer: a tool for post-hoc explanation of graph neural networks. Adv. Neural Inf. Process. Syst. 32, 9240–9251 (2019)
17. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv:1609.02907 (2016)
18. Ba, J.L., Kiros, J.R., Hinton, G.E.: Layer normalization (2016)
19. Ioffe, S., Szegedy, C.: Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International Conference on Machine Learning, vol. 2015, pp. 448–456 (2015)
20. Ying, R., You, J., Morris, C., et al.: Hierarchical graph representation learning with differentiable pooling. In: Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 4805–4815 (2018)
Fixed-Time Consensus Control for Heterogeneous Multi-agent Systems Without Velocity Measurements of Neighbors Yichao Ao and Qifeng Zhang
Abstract This paper investigates fixed-time consensus control problems for heterogeneous multi-agent systems, which consist of single and double integrators, without using the measurements of neighbors’ velocities. Distributed control protocols are designed for single and double integrators, respectively, based on the ideas of backstepping approach and adding a power integrator method. The convergence performance of the closed-loop systems is analyzed by using Lyapunov theory and fixed-time stability theory. Moreover, numerical examples are provided to show the effectiveness of our results. Keywords Heterogeneous multi-agent systems · Fixed-time consensus · Without velocity measurements
Y. Ao (B) · Q. Zhang
Guangdong Institute of Intelligent Unmanned System (Nansha), Guangzhou 511458, China
e-mail: [email protected]
Q. Zhang
State Key Laboratory of Robotics, Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_54

1 Introduction
Consensus problems for networked multi-agent systems have attracted a lot of attention in the last two decades and have been widely studied by researchers from multiple disciplines [1–5]. This is mainly due to their broad applications in unmanned aerial vehicles, power grids, sensor networks, sociology and so on.

Convergence rate is an essential design consideration for the consensus of multi-agent systems. In some circumstances, it is required that all agents reach agreement quickly, or even within a certain settling time. To this end, several finite-time or fixed-time consensus protocols have been proposed for multi-agent systems. In [6, 7], finite-time consensus protocols were designed for multi-agent systems of single and double integrators, respectively. Reference [8] extended finite-time consensus to multi-agent systems with switching topologies. In [9], control actuators with limited power were taken into consideration for finite-time consensus of double integrators. Besides, finite-time observers were designed to cope with situations where the velocity information cannot be precisely measured. Going a step further, a fixed-time control design method was proposed in [10], such that the settling time does not depend on the initial conditions. References [11, 12] investigated fixed-time consensus control for multi-agent systems of single and double integrators, respectively. Unfortunately, all multi-agent systems considered in the above results are homogeneous.

In practical systems, cooperation between agents with different dynamics is often needed. In [13], velocity observers were designed for a heterogeneous multi-agent system composed of single and double integrators to achieve consensus with an exponential convergence rate. In [14], two classes of consensus protocols were proposed by combining the homogeneous domination method with the adding-a-power-integrator method. However, to the best of the authors' knowledge, few studies have investigated fixed-time consensus control for heterogeneous multi-agent systems.

The contribution of this paper is twofold. Firstly, we propose a distributed consensus protocol under which heterogeneous multi-agent systems consisting of single and double integrators can achieve consensus in a prescribed fixed time. The design of the protocol is based on the ideas of the backstepping approach and the adding-a-power-integrator method, and the structures of the protocol for these two kinds of integrators are entirely different. Secondly, each agent achieves fixed-time consensus without using its neighbors' velocity measurements.
2 Preliminaries

2.1 Graph Theory

Let $\mathcal{G} = (\mathcal{V}, \mathcal{E}, \mathcal{A})$ be an undirected graph with a finite nonempty set of nodes $\mathcal{V} = \{1, \ldots, n\}$, an edge set $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$ and an adjacency matrix $\mathcal{A} = [a_{ij}] \in \mathbb{R}^{n \times n}$. Agent $i$ in the system is represented by node $i$. The information flow between agents is reflected by $\mathcal{E}$ and $\mathcal{A}$: $(j, i) \in \mathcal{E} \Leftrightarrow (i, j) \in \mathcal{E} \Leftrightarrow a_{ij} = a_{ji} > 0 \Leftrightarrow$ agents $i$ and $j$ can directly get information from each other. Moreover, $a_{ii} = 0$ for each $i \in \mathcal{V}$. $\mathcal{N}_i = \{j : (j, i) \in \mathcal{E}\}$ denotes the set of neighbors of node $i$, and $d_i = \sum_{j=1}^{n} a_{ij}$ is the degree of node $i$. An undirected path connecting nodes $i$ and $j$ is a sequence of distinct nodes $k_0, k_1, \ldots, k_m$, where $k_0 = i$, $k_m = j$ and $(k_r, k_{r+1}) \in \mathcal{E}$, $0 \le r \le m - 1$. An undirected graph is said to be connected if there exists a path between every two distinct nodes of the graph. The Laplacian matrix of the graph is defined as $L = D - \mathcal{A}$, where $D = \mathrm{diag}\{d_1, \ldots, d_n\}$ is a diagonal matrix whose $i$th diagonal element is $d_i$.
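As an illustration of the definitions above, the following minimal sketch builds the Laplacian $L = D - \mathcal{A}$ from an adjacency matrix and checks connectivity through the standard fact that a connected undirected graph has exactly one zero Laplacian eigenvalue. The function names and the 4-node path graph are hypothetical choices made for this example and do not come from the paper.

```python
import numpy as np

def laplacian(adjacency):
    """Return L = D - A for a symmetric, nonnegative adjacency matrix."""
    A = np.asarray(adjacency, dtype=float)
    degrees = A.sum(axis=1)              # d_i = sum_j a_ij
    return np.diag(degrees) - A

def is_connected(adjacency, tol=1e-9):
    """Connected iff the second smallest Laplacian eigenvalue is positive."""
    eigenvalues = np.linalg.eigvalsh(laplacian(adjacency))  # ascending order
    return bool(eigenvalues[1] > tol)

# Hypothetical example: path graph 1-2-3-4 with unit weights.
A_path = np.array([[0, 1, 0, 0],
                   [1, 0, 1, 0],
                   [0, 1, 0, 1],
                   [0, 0, 1, 0]])
L_path = laplacian(A_path)
print(L_path)
print("connected:", is_connected(A_path))
```

Every row of the resulting Laplacian sums to zero, which is why the all-ones vector is always an eigenvector with eigenvalue zero.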
Fixed-Time Consensus Control for Heterogeneous Multi-agent …
2.2 Some Useful Lemmas

Lemma 1 The Laplacian matrix $L$ of an undirected graph $\mathcal{G}$ has a single zero eigenvalue if and only if $\mathcal{G}$ is connected. Moreover, the other eigenvalues of $L$ are strictly positive real numbers.

Lemma 2 Let $q \ge 1$ be a ratio of odd integers and let $x, y \in \mathbb{R}$ be any scalars. Then the following inequality holds:

$$|x - y|^{q} \le 2^{q-1} |x^{q} - y^{q}|.$$

Lemma 3 Let $x_1, x_2, \ldots, x_N \ge 0$. Then

$$\sum_{i=1}^{N} x_i^{p} \ge \Big(\sum_{i=1}^{N} x_i\Big)^{p}, \quad \text{for } 0 < p \le 1,$$

and

$$\sum_{i=1}^{N} x_i^{p} \ge N^{1-p} \Big(\sum_{i=1}^{N} x_i\Big)^{p}, \quad \text{for } 1 < p < \infty.$$

Lemma 4 Let $x, y \in \mathbb{R}$ and $p, q > 0$. Then

$$|x|^{p} |y|^{q} \le \frac{p}{p+q} |x|^{p+q} + \frac{q}{p+q} |y|^{p+q}.$$

Lemma 5 Consider the autonomous system

$$\dot{x} = f(x), \quad x(0) = x_0, \tag{1}$$

where $x \in \mathbb{R}^{n}$ and $f : \mathbb{R}^{n} \to \mathbb{R}^{n}$ is a continuous function. Suppose that there exists a continuous radially unbounded function $V(x) : \mathbb{R}^{n} \to \mathbb{R}_{+} \cup \{0\}$ such that 1) $V(x) = 0 \Rightarrow x \in M$; 2) any solution $x(t)$ of (1) satisfies $\dot{V}(x) \le -\alpha V^{p}(x) - \beta V^{q}(x)$ for some $\alpha, \beta > 0$, $0 < p < 1$, $q > 1$. Then the set $M \subset \mathbb{R}^{n}$ is globally fixed-time attractive for (1), and the settling time $T$ satisfies

$$T \le \frac{1}{\alpha(1-p)} + \frac{1}{\beta(q-1)}.$$
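To make the bound in Lemma 5 concrete, the sketch below evaluates $1/(\alpha(1-p)) + 1/(\beta(q-1))$ and compares it with a crude forward-Euler integration of the scalar comparison equation $\dot{V} = -\alpha V^{p} - \beta V^{q}$. The helper names, the step size and the tolerance are assumptions made only for this illustration. The values $p = 0.8$, $q = 1.2$, $\alpha = \beta = 1$ are sample numbers, for which the bound is approximately 10.

```python
def settling_time_bound(alpha, beta, p, q):
    """Upper bound on the settling time stated in Lemma 5."""
    if not (alpha > 0 and beta > 0 and 0 < p < 1 and q > 1):
        raise ValueError("need alpha, beta > 0, 0 < p < 1 and q > 1")
    return 1.0 / (alpha * (1.0 - p)) + 1.0 / (beta * (q - 1.0))

def time_to_reach(v0, alpha, beta, p, q, tol=1e-6, dt=1e-3):
    """Forward-Euler integration of V' = -alpha*V**p - beta*V**q until V <= tol."""
    v, t = float(v0), 0.0
    while v > tol:
        v -= dt * (alpha * v ** p + beta * v ** q)
        t += dt
    return t

bound = settling_time_bound(alpha=1.0, beta=1.0, p=0.8, q=1.2)
for v0 in (1.0, 1e3, 1e6):
    # The reaching time grows with v0 but stays below the same bound.
    print(v0, time_to_reach(v0, 1.0, 1.0, 0.8, 1.2), bound)
```

The point of the comparison is that the bound does not depend on the initial value: increasing `v0` by orders of magnitude lengthens the reaching time only slightly, and never beyond the bound.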
3 Problem Formulations

Consider heterogeneous multi-agent systems consisting of single and double integrators. The agents are labelled 1 through $n$, and $m$ of them ($m < n$) are double integrators. The communication topology among the agents is undirected. The dynamics of each double integrator are described by

$$\dot{x}_i(t) = v_i(t), \quad \dot{v}_i(t) = u_i(t), \quad i \in \mathcal{I}_m, \tag{2}$$

where $x_i \in \mathbb{R}$, $v_i \in \mathbb{R}$ and $u_i \in \mathbb{R}$ are the position, velocity and control input of double integrator $i$, respectively, and $\mathcal{I}_m$ is the set of all double integrators. The initial conditions are $x_i(0) = x_{i0}$, $v_i(0) = v_{i0}$. The remaining $n - m$ agents are single integrators, whose dynamics are described by

$$\dot{x}_i(t) = u_i(t), \quad i \in \mathcal{V} \setminus \mathcal{I}_m, \tag{3}$$

with $x_i(0) = x_{i0}$, where $x_i \in \mathbb{R}$ and $u_i \in \mathbb{R}$ are the position and control input of single integrator $i$, respectively. The aim of this paper is to design distributed control protocols such that the heterogeneous multi-agent system achieves fixed-time consensus.

Definition 1 The heterogeneous multi-agent system (2)–(3) is said to achieve fixed-time consensus if there exists a constant settling time $T$, independent of the initial conditions $x_{i0}$, $i \in \mathcal{V}$, and $v_{i0}$, $i \in \mathcal{I}_m$, such that $\lim_{t \to T^{-}} |x_i(t) - x_j(t)| = 0$ and $x_i(t) = x_j(t)$ for all $t \ge T$, $i, j \in \mathcal{V}$; and $\lim_{t \to T^{-}} |v_i(t) - v_j(t)| = 0$ and $v_i(t) = v_j(t)$ for all $t \ge T$, $i, j \in \mathcal{I}_m$.
4 Main Results

In this section, we propose the protocol for the heterogeneous multi-agent system (2)–(3) to achieve fixed-time consensus. For each single integrator, the control input is designed as

$$u_i = k_2 y_i^{r} + k_4 y_i^{q}, \quad i \in \mathcal{V} \setminus \mathcal{I}_m, \tag{4}$$

where $y_i = \sum_{j=1}^{n} a_{ij}(x_j - x_i)$, $0 < r < 1$ and $q > 1$ are ratios of positive odd integers, and $k_2$ and $k_4$ are feedback gains. For the double integrators, a virtual control input is constructed as $v_i^{*} = k_2 y_i^{r} + k_4 y_i^{q}$, $i \in \mathcal{I}_m$, based on the idea of backstepping design, and we denote $\xi_i = v_i^{1/r} - v_i^{*1/r}$. Then, for each double integrator, the control input is designed as

$$u_i = -\Big[2^{1-r} D \rho(y_i) + \frac{2k_2^{r+1} D + 2^{1-r} m a}{1+r}\, \rho^{r+1}(y_i) + k_1\Big] \xi_i^{2r-1} - \Big[\frac{2k_4^{q+1} D}{1+q}\, \rho^{q+1}(y_i) + k_3\Big] \xi_i^{r+q-1}, \quad i \in \mathcal{I}_m, \tag{5}$$

where $k_1$ and $k_3$ are feedback gains, $D = \max_{i \in \mathcal{V}}\{d_i\}$, $a = \max_{i,j \in \mathcal{V}}\{a_{ij}\}$, and

$$\rho(y_i) = 2^{1-r}\, \frac{2-r}{r}\, \big(k_2 + k_4 y_i^{q-r}\big)^{\frac{1}{r}-1} \big(r k_2 + q k_4 y_i^{q-r}\big).$$

It is not difficult to verify that $\rho(y_i) > 0$ is proper, since $2 - r > 0$ and $q - r > 0$ is a ratio of a positive even integer and a positive odd integer. The following result gives sufficient conditions under which the heterogeneous multi-agent system (2)–(3) with the distributed control protocol (4)–(5) achieves consensus in fixed time.

Theorem 1 Suppose the graph $\mathcal{G}$ is undirected and connected. Then the heterogeneous multi-agent system (2)–(3) achieves fixed-time consensus under protocol (4)–(5) if the feedback gains satisfy

$$\begin{aligned}
k_1 &> \frac{2^{1-r}(ma+1)r}{1+r} + 2^{\frac{1-r^{2}}{2}} \alpha, \\
k_2 &> \frac{2^{1-r}}{1+r} + \frac{(D+ma)r}{1+r} + (2\lambda)^{-\frac{1+r}{2}} \alpha, \\
k_3 &> 2^{\frac{(1-r)(1+q)}{2}} (m+n)^{\frac{q-1}{2}} \beta, \\
k_4 &> \frac{(D+ma)q}{1+q} + (2\lambda)^{-\frac{1+q}{2}} (m+n)^{\frac{q-1}{2}} \beta,
\end{aligned} \tag{6}$$

where $\lambda$ is the second smallest eigenvalue of the Laplacian matrix $L$ of graph $\mathcal{G}$, and $\alpha, \beta > 0$ are arbitrary positive constants. Moreover, the constant settling time $T$ satisfies

$$T \le \frac{2}{\alpha(1-r)} + \frac{2}{\beta(q-1)}.$$

Proof By Lemma 1, we have $\lambda > 0$ since the undirected graph $\mathcal{G}$ is connected. Without loss of generality, assume that $\mathcal{I}_m = \{1, \ldots, m\}$, and consider the Lyapunov function candidate

$$V_1(t) = V_0(t) + \sum_{i=1}^{m} \int_{v_i^{*}}^{v_i} \big(s^{1/r} - v_i^{*1/r}\big)^{2-r}\, ds,$$

where

$$V_0(t) = \frac{1}{4} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} (x_j - x_i)^{2}.$$
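We can see that $\int_{v_i^{*}}^{v_i} (s^{1/r} - v_i^{*1/r})^{2-r}\, ds$ is proper and positive definite, and that $V_0$ is positive definite with respect to $x_i(t) - x_j(t)$ for all $i, j \in \mathcal{V}$. Differentiating $V_0$ along the trajectories of (2)–(3) results in

$$\begin{aligned}
\dot{V}_0 &= \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} (x_j - x_i)(\dot{x}_j - \dot{x}_i) = -\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} (x_j - x_i)\, \dot{x}_i = -\sum_{i=1}^{n} y_i \dot{x}_i \\
&= -\sum_{i=1}^{m} y_i (v_i - v_i^{*}) - \sum_{i=1}^{m} y_i v_i^{*} - \sum_{i=m+1}^{n} y_i u_i.
\end{aligned} \tag{7}$$

By Lemma 2, we have $|v_i - v_i^{*}| \le 2^{1-r} \big|v_i^{1/r} - v_i^{*1/r}\big|^{r}$. By Lemma 4, we obtain

$$|y_i| \big|v_i^{1/r} - v_i^{*1/r}\big|^{r} \le \frac{y_i^{r+1}}{1+r} + \frac{r\, \xi_i^{r+1}}{1+r}.$$

Substituting the above inequalities, the control input (4) and the virtual input $v_i^{*}$ into (7) yields

$$\begin{aligned}
\dot{V}_0 &\le \sum_{i=1}^{m} |y_i| |v_i - v_i^{*}| - \sum_{i=1}^{n} \big(k_2 y_i^{r+1} + k_4 y_i^{q+1}\big) \\
&\le \sum_{i=1}^{m} \frac{2^{1-r}}{1+r}\, y_i^{r+1} + \sum_{i=1}^{m} \frac{2^{1-r} r}{1+r}\, \xi_i^{r+1} - \sum_{i=1}^{n} k_2 y_i^{r+1} - \sum_{i=1}^{n} k_4 y_i^{q+1},
\end{aligned} \tag{8}$$

where $r + 1$ and $q + 1$ are ratios of a positive even integer and a positive odd integer, since $r$ and $q$ are ratios of positive odd integers. Next, along the trajectories of (2), we obtain

$$\frac{d}{dt} \int_{v_i^{*}}^{v_i} \big(s^{1/r} - v_i^{*1/r}\big)^{2-r}\, ds = (2-r)\, \frac{d(-v_i^{*1/r})}{dt} \int_{v_i^{*}}^{v_i} \big(s^{1/r} - v_i^{*1/r}\big)^{1-r}\, ds + \big(v_i^{1/r} - v_i^{*1/r}\big)^{2-r} u_i, \quad i \in \mathcal{I}_m. \tag{9}$$

Again using Lemma 2, we have

$$\Big|\int_{v_i^{*}}^{v_i} \big(s^{1/r} - v_i^{*1/r}\big)^{1-r}\, ds\Big| \le |v_i - v_i^{*}| \cdot |\xi_i|^{1-r} \le 2^{1-r} |\xi_i|.$$

On the other hand, it is easy to show that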
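$$\Big|(2-r)\, \frac{d(-v_i^{*1/r})}{dt}\Big| = \Big|\frac{r-2}{r}\, v_i^{*\left(\frac{1}{r}-1\right)} \big(r k_2 y_i^{r-1} + q k_4 y_i^{q-1}\big)\, \dot{y}_i\Big| \le 2^{r-1} \rho(y_i)\, \Big|\sum_{j=1}^{n} a_{ij} (\dot{x}_j - \dot{x}_i)\Big|.$$

Thus, substituting the above inequalities into (9) results in

$$\frac{d}{dt} \int_{v_i^{*}}^{v_i} \big(s^{1/r} - v_i^{*1/r}\big)^{2-r}\, ds \le \big(v_i^{1/r} - v_i^{*1/r}\big)^{2-r} u_i + \rho(y_i) |\xi_i|\, \Big|\sum_{j=1}^{n} a_{ij} (\dot{x}_j - \dot{x}_i)\Big|.$$

To proceed, we need to calculate

$$\begin{aligned}
\Big|\sum_{j=1}^{n} a_{ij} (\dot{x}_j - \dot{x}_i)\Big| &\le d_i |\dot{x}_i| + \sum_{j=1}^{m} a_{ij} |\dot{x}_j| + \sum_{j=m+1}^{n} a_{ij} |\dot{x}_j| \\
&\le d_i \big(|v_i - v_i^{*}| + |v_i^{*}|\big) + \sum_{j=1}^{m} a_{ij} \big(|v_j - v_j^{*}| + |v_j^{*}|\big) + \sum_{j=m+1}^{n} a_{ij} \big(k_2 |y_j|^{r} + k_4 |y_j|^{q}\big) \\
&\le d_i \big(2^{1-r} |\xi_i|^{r} + k_2 |y_i|^{r} + k_4 |y_i|^{q}\big) + \sum_{j=1}^{m} 2^{1-r} a_{ij} |\xi_j|^{r} + \sum_{j=1}^{n} a_{ij} \big(k_2 |y_j|^{r} + k_4 |y_j|^{q}\big).
\end{aligned}$$

By Lemma 4, it can be obtained that

$$k_2 d_i \rho(y_i) |\xi_i| |y_i|^{r} \le \frac{k_2^{r+1} D}{1+r}\, \rho^{r+1}(y_i)\, \xi_i^{r+1} + \frac{D r}{1+r}\, y_i^{r+1},$$

$$k_4 d_i \rho(y_i) |\xi_i| |y_i|^{q} \le \frac{k_4^{q+1} D}{1+q}\, \rho^{q+1}(y_i)\, \xi_i^{q+1} + \frac{D q}{1+q}\, y_i^{q+1},$$

$$\begin{aligned}
\sum_{i=1}^{m} \sum_{j=1}^{m} 2^{1-r} a_{ij} \rho(y_i) |\xi_i| |\xi_j|^{r} &\le \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{2^{1-r} a_{ij}}{1+r}\, \rho^{r+1}(y_i)\, \xi_i^{r+1} + \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{2^{1-r} a_{ij} r}{1+r}\, \xi_j^{r+1} \\
&\le \sum_{i=1}^{m} \frac{2^{1-r} m a}{1+r}\, \rho^{r+1}(y_i)\, \xi_i^{r+1} + \sum_{i=1}^{m} \sum_{j=1}^{m} \frac{2^{1-r} a_{ij} r}{1+r}\, \xi_i^{r+1} \\
&\le \sum_{i=1}^{m} \Big[\frac{2^{1-r} m a}{1+r}\, \rho^{r+1}(y_i) + \frac{2^{1-r} m a r}{1+r}\Big] \xi_i^{r+1}.
\end{aligned}$$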
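By similar manipulation, we have

$$\begin{aligned}
\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} k_2 \rho(y_i) |\xi_i| |y_j|^{r} &\le \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{a_{ij} k_2^{r+1}}{1+r}\, \rho^{r+1}(y_i)\, \xi_i^{r+1} + \sum_{i=1}^{m} \sum_{j=1}^{n} \frac{a_{ij} r}{1+r}\, y_j^{r+1} \\
&\le \sum_{i=1}^{m} \frac{k_2^{r+1} D}{1+r}\, \rho^{r+1}(y_i)\, \xi_i^{r+1} + \sum_{i=1}^{n} \frac{m a r}{1+r}\, y_i^{r+1},
\end{aligned}$$

and

$$\sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij} k_4 \rho(y_i) |\xi_i| |y_j|^{q} \le \sum_{i=1}^{m} \frac{k_4^{q+1} D}{1+q}\, \rho^{q+1}(y_i)\, \xi_i^{q+1} + \sum_{i=1}^{n} \frac{m a q}{1+q}\, y_i^{q+1}.$$

Combining (8)–(9) and all the aforementioned inequalities, we can conclude that

$$\begin{aligned}
\dot{V}_1 \le{} & \sum_{i=1}^{n} \Big[\frac{2^{1-r}}{1+r} + \frac{(D+ma)r}{1+r} - k_2\Big] y_i^{r+1} \\
& + \sum_{i=1}^{m} \Big[2^{1-r} D \rho(y_i) + \frac{2^{1-r}(ma+1)r}{1+r} + \frac{2k_2^{r+1} D + 2^{1-r} m a}{1+r}\, \rho^{r+1}(y_i)\Big] \xi_i^{r+1} + \sum_{i=1}^{m} \xi_i^{2-r} u_i \\
& + \sum_{i=1}^{m} \frac{2k_4^{q+1} D}{1+q}\, \rho^{q+1}(y_i)\, \xi_i^{q+1} + \sum_{i=1}^{n} \Big[\frac{(D+ma)q}{1+q} - k_4\Big] y_i^{q+1}.
\end{aligned} \tag{10}$$

Owing to (6), substituting the control input (5) into inequality (10) yields

$$\begin{aligned}
\dot{V}_1 \le{} & -\sum_{i=1}^{n} \alpha \Big(\frac{1}{2\lambda}\Big)^{\frac{1+r}{2}} y_i^{r+1} - \sum_{i=1}^{n} \beta (m+n)^{\frac{q-1}{2}} \Big(\frac{1}{2\lambda}\Big)^{\frac{q+1}{2}} y_i^{q+1} \\
& - \sum_{i=1}^{m} \alpha\, 2^{\frac{1-r^{2}}{2}}\, \xi_i^{r+1} - \sum_{i=1}^{m} \beta (m+n)^{\frac{q-1}{2}}\, 2^{\frac{(1-r)(q+1)}{2}}\, \xi_i^{q+1} \\
\le{} & -\alpha \Big[\frac{1}{2\lambda} \sum_{i=1}^{n} y_i^{2} + \sum_{i=1}^{m} 2^{1-r} \xi_i^{2}\Big]^{\frac{r+1}{2}} - \beta \Big[\frac{1}{2\lambda} \sum_{i=1}^{n} y_i^{2} + \sum_{i=1}^{m} 2^{1-r} \xi_i^{2}\Big]^{\frac{q+1}{2}} \\
\le{} & -\alpha V_1^{\frac{r+1}{2}} - \beta V_1^{\frac{q+1}{2}},
\end{aligned} \tag{11}$$

where the second inequality is obtained according to Lemma 3, and the third inequality is based on the facts that $\frac{1}{2} x^{T} L x \le \frac{1}{2\lambda} x^{T} L L x$ and $\int_{v_i^{*}}^{v_i} (s^{1/r} - v_i^{*1/r})^{2-r}\, ds \le |v_i - v_i^{*}|\, |\xi_i|^{2-r} \le 2^{1-r} \xi_i^{2}$, where $x = [x_1^{T}, \ldots, x_n^{T}]^{T}$. Finally, it follows from (11) and Lemma 5 that the heterogeneous multi-agent system (2)–(3) achieves fixed-time consensus under protocol (4)–(5), and the constant settling time $T$ satisfies

$$T \le \frac{2}{\alpha(1-r)} + \frac{2}{\beta(q-1)}.$$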
Remark 1 From control protocol (4), each single integrator updates its state based on the relative position information $y_i$ from its neighbors. From control protocol (5), each double integrator updates its state based on $\xi_i$, which measures the difference between its real and virtual velocities, and on the nonlinear gain function of $y_i$ appearing in (5); both depend only on the agent's own velocity and the relative position information $y_i$. Thus, the designed control protocol is distributed, and with it the heterogeneous multi-agent system can achieve fixed-time consensus without using the velocity measurements of the neighbors.
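The structure described in Remark 1 can be sketched in code. The following is a minimal illustration of how the inputs (4) and (5) might be evaluated from positions and local velocities only, not the authors' implementation. It assumes that the nonlinear gain in (5) has the form $2^{1-r}\frac{2-r}{r}(k_2 + k_4 y^{q-r})^{1/r-1}(rk_2 + qk_4 y^{q-r})$, that powers with odd-ratio exponents preserve sign, and that the double integrators occupy the first $m$ indices as in the proof. The 5-node ring topology is hypothetical, since the topology of Fig. 1 is not reproduced here.

```python
import numpy as np

def odd_pow(x, exponent):
    """x**exponent for an exponent that is a ratio of odd integers (sign-preserving)."""
    return np.sign(x) * np.abs(x) ** exponent

def gain(y, r, q, k2, k4):
    """Assumed form of the positive gain function of y_i used in protocol (5)."""
    z = np.abs(y) ** (q - r)   # q - r has an even numerator, so the power is even in y
    return (2 ** (1 - r) * (2 - r) / r
            * (k2 + k4 * z) ** (1 / r - 1) * (r * k2 + q * k4 * z))

def control_inputs(x, v, A, m, r, q, gains):
    """Inputs for all agents; agents 0..m-1 are double integrators (assumption)."""
    k1, k2, k3, k4 = gains
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)
    y = A @ x - A.sum(axis=1) * x                      # y_i = sum_j a_ij (x_j - x_i)
    v_star = k2 * odd_pow(y, r) + k4 * odd_pow(y, q)   # virtual input; also protocol (4)
    u = v_star.copy()
    D, a = A.sum(axis=1).max(), A.max()
    xi = odd_pow(v, 1 / r) - odd_pow(v_star[:m], 1 / r)
    g = gain(y[:m], r, q, k2, k4)
    gain_r = (2 ** (1 - r) * D * g
              + (2 * k2 ** (r + 1) * D + 2 ** (1 - r) * m * a) / (1 + r) * g ** (r + 1)
              + k1)
    gain_q = 2 * k4 ** (q + 1) * D / (1 + q) * g ** (q + 1) + k3
    u[:m] = -gain_r * odd_pow(xi, 2 * r - 1) - gain_q * odd_pow(xi, r + q - 1)
    return u

# Hypothetical ring of 5 agents; gains and exponents taken from Section 5.
ring = np.array([[0, 1, 0, 0, 1],
                 [1, 0, 1, 0, 0],
                 [0, 1, 0, 1, 0],
                 [0, 0, 1, 0, 1],
                 [1, 0, 0, 1, 0]])
paper_gains = (2.9307, 3.6657, 2.114, 4.687)
u = control_inputs([6, 2, -2, -6, 4], [-120, -60, 40], ring, 3, 3 / 5, 7 / 5, paper_gains)
print(u)
```

Note that no neighbor velocity enters `control_inputs`: each double integrator uses only its own entry of `v`, which mirrors the claim of the remark.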
5 Numerical Examples

Numerical examples are presented to show the effectiveness of our fixed-time consensus control approach. Consider a multi-agent system with 2 single integrators (labelled 1 and 2) and 3 double integrators (labelled 3 through 5), whose topology is described by Fig. 1. Let $r = 3/5$, $q = 7/5$, $k_1 = 2.9307$, $k_2 = 3.6657$, $k_3 = 2.114$, $k_4 = 4.687$ and $\alpha = \beta = 1$. The agents' initial positions are chosen as $[-6, 4, 6, 2, -2]^{T}$. The double integrators' initial velocities are chosen as $[-120, -60, 40]^{T}$. Figures 2 and 3 depict the position trajectories of all agents and the velocity trajectories of the double integrators, respectively. We can see that both the positions and the velocities reach agreement within the prescribed settling time.

Fig. 1 Topology structure of the heterogeneous multi-agent system

Fig. 2 Position trajectories of the agents (positions of agents 1 to 5 versus time in seconds)

Fig. 3 Velocity trajectories of the double integrators (velocities of agents 3 to 5 versus time in seconds)

Acknowledgements This work was supported by the Project for high quality development of 6 marine industries of Department of Natural Resources of Guangdong Provincial (GDNRC[2023]32), and the International Science and Technology Cooperation Project of Guangdong Province under Grant 2022A0505050027.
References
1. Ren, W., Beard, R.W.: Consensus seeking in multiagent systems under dynamically changing interaction topologies. IEEE Trans. Autom. Control 50(5), 655–661 (2005). https://doi.org/10.1109/TAC.2005.846556
2. Ao, Y., Jia, Y.: Output feedback control of mixed H2/H∞ multi-agent consensus via an inner auxiliary system approach. Syst. Control Lett. 158, 105064 (2021). https://doi.org/10.1016/j.sysconle.2021.105064
3. Ao, Y., Jia, Y.: Distance-targeted competitive follower-attraction containment control for multiagent systems with weighted directed graphs. Int. J. Robust Nonlinear Control 33(8), 4577–4601 (2023). https://doi.org/10.1002/rnc.6629
4. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016). https://doi.org/10.1021/acsnano.5b06863
5. Teng, X., Hwang, W.: Ch.4 structural and dynamical hierarchy of fibrillar collagen. In: Kaunas, R., Zemel, A. (eds) Cell and Matrix Mechanics, pp. 101–118. Taylor and Francis (2014)
6. Bhat, S.P., Bernstein, D.S.: Finite-time stability of continuous autonomous systems. SIAM J. Control Optim. 38(3), 751–766 (2000). https://doi.org/10.1137/S0363012997321358
7. Li, S., Du, H., Lin, X.: Finite-time consensus algorithm for multi-agent systems with double-integrator dynamics. Automatica 47(8), 1706–1712 (2011). https://doi.org/10.1016/j.automatica.2011.02.045
8. Jiang, F., Wang, L.: Finite-time information consensus for multi-agent systems with fixed and switching topologies. Phys. D 238(16), 1550–1560 (2009). https://doi.org/10.1016/j.physd.2009.04.011
9. Zhang, B., Jia, Y., Matsuno, F.: Finite-time observers for multi-agent systems without velocity measurements and with input saturations. Syst. Control Lett. 68, 86–94 (2014). https://doi.org/10.1016/j.sysconle.2014.03.010
10. Andrieu, V., Praly, L., Astolfi, A.: Homogeneous approximation, recursive observer design, and output feedback. SIAM J. Control Optim. 47(4), 1814–1850 (2008). https://doi.org/10.1137/060675861
11. Zuo, Z., Tie, L.: A new class of finite-time nonlinear consensus protocols for multi-agent systems. Int. J. Control 87(2), 363–370 (2014). https://doi.org/10.1080/00207179.2013.834484
12. Huang, Y., Jia, Y.: Fixed-time consensus tracking control of second-order multi-agent systems with inherent nonlinear dynamics via output feedback. Nonlinear Dyn. 91, 1289–1306 (2018). https://doi.org/10.1007/s11071-017-3945-8
13. Zheng, Y., Wang, L.: Consensus of heterogeneous multi-agent systems without velocity measurements. Int. J. Control 85(7), 906–914 (2012). https://doi.org/10.1080/00207179.2012.669048
14. Zheng, Y., Wang, L.: Finite-time consensus of heterogeneous multi-agent systems with and without velocity measurements. Syst. Control Lett. 61, 871–878 (2012). https://doi.org/10.1016/j.sysconle.2012.05.009
Correlation Filter Feature Selection Strategy Based on Inland Ship Tracking Lei Xiao, Feiyan Nie, Hanjie Ma, and Zhongyi Hu
Abstract Early correlation filter algorithms have been widely used in many scenes because of their good real-time performance. With the pursuit of high-precision tracking and the introduction of various image features, such as CN features and depth features, the computing speed of correlation filtering algorithms has been decreasing. In recent years, the running speed of correlation filtering algorithms applied to inland waterway scenes has generally been lower than 3 FPS, which cannot meet the demand of real-time inland waterway ship tracking. In order to maintain the tracking accuracy while improving the computing speed, this paper proposes CFFS (Correlation Filter Feature Selection strategy), which chooses a suitable feature selection method for each feature type according to its application characteristics. First, we obtain the HOG features, depth features and corresponding labels of the current frame from the basic correlation filter framework. Second, we select the HOG features and depth features using a swarm intelligence optimization algorithm and principal component analysis, respectively. Finally, the selected features are put into the correlation filtering framework to obtain the tracking results. The performance of the proposed algorithm is evaluated on the inland waterway ship dataset of our group. The results show that, compared with the basic correlation filtering algorithm, CFFS improves the accuracy by 1.3%, the success rate by 2.8%, and the speed by 12.946 FPS. Keywords Ship tracking · Correlation filtering · Feature selection
1 Introduction

In recent years, with the rapid development of inland navigation, the demand for intelligent monitoring of ships has been increasing. Along with the rapid development of computer vision and artificial intelligence, many researchers have studied ship tracking in inland river scenes. Although there are comparatively mature general target tracking algorithms, applying them directly to inland waterway scenes still faces considerable challenges. The main challenges are as follows:

(1) The inland river navigation scene is changeable. Weather conditions are complicated in the open environment of inland rivers. Ship video collected under strong light often shows illumination changes, and water fog easily appears on the water surface on cloudy and rainy days, causing low resolution of ship data and difficulty in distinguishing foreground from background. Especially at night, the captured ship images are also easily affected by noise interference and weak-texture blur.

(2) Continuous dynamic movement of the ship. The collected data are usually videos of moving ships, post-processed into image sequences. The camera is fixed and its field of view is limited. The ship undergoes scale changes and moves out of view while moving from far to near and from near to far.

(3) Occlusion by nearby ships. At present, data are mainly collected for ships near the port, where the number of ships is large, most of them are relatively similar, and ships often occlude each other. An occluded target ship is susceptible to sampling errors and template update failures.

All the above challenges hinder the direct application of generic tracking algorithms to inland waterway scenarios, where they cannot achieve accurate real-time ship tracking. Based on the above background, CFFS is proposed in this paper to improve the accuracy and computing speed of inland river ship tracking. In this paper, HOG features and depth features are extracted from the correlation filtering framework.

L. Xiao · F. Nie · H. Ma · Z. Hu (B) Wenzhou University, Wenzhou 325035, Zhejiang, China e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_55
The above two features are selected using BGWO (Binary Grey Wolf Optimization), a swarm intelligence optimization algorithm, and Principal Component Analysis, respectively. The selected features can be ported directly to any correlation filter algorithm to improve its computational speed, so the strategy is portable. Finally, a sparse update strategy is used to further reduce the model training time. The rest of the paper is organized as follows: Sect. 2 provides a literature review focusing on existing research results. In Sect. 3, we introduce the algorithm architecture and the theoretical basis of this paper. In Sect. 4, the implementation of the proposed algorithm, the performance evaluation and the result analysis are explained. Finally, this study is concluded in Sect. 5.
2 Related Work 2.1 Correlation Filter Tracking Methods Previous research has focused on generative model tracking algorithms, including algorithms based on different modeling forms such as kernels, subspaces, and sparse representations. In recent years, generative tracking methods are unable to handle the complex tracking variations of kernel adaptation, and discriminative tracking
methods have gradually taken the mainstream. People have started using online learning methods to train trackers, such as a TLD (Tracking Learning Detection) method for long-term tracking proposed in the literature [1]. TLD achieves target tracking by combining learning and detection, with the learning module for learning the appearance of the target. The learning module improves the tracking accuracy by continuously collecting samples and using them to update the target model. The detector is used to determine whether the measured target has drifted by comparing the appearance of the target in the current frame with the appearance of the target in the previous frame. The later proposed algorithm Struck [2] used a structured SVM approach to learn a classifier online and achieved the best tracking results at that time. Then, people introduced correlation filter in the communication field into target tracking. The basic idea of the tracking method based on correlation filtering is to search for candidate regions in the current frame that are similar to the target position in the previous frame. Then the feature vectors of these candidate regions are convolved with the correlation filter to obtain a set of response maps, and the maximum value in the response map is the target position in the current frame. Following this idea, many methods based on correlation filters have emerged sequentially. MOSSE [3] is the first method in the field of target tracking related to the correlation filter with a speed of up to 615 FPS. The correlation filter is also proved to be superior in that paper with fast computational speed. Many improvements based on MOSSE have been made subsequently, such as the literature [4] extended the dense sampling and kernel methods based on MOSSE, CN [5] extended multichannel color based on CSK, and KCF [6] calculated and extended the HOG features of multichannel gradients using a circular matrix based on CSK. 
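The basic idea of correlation-filter tracking described above (train a filter on the target, correlate it with a candidate region, take the peak of the response map) can be sketched with a single-frame, single-channel filter in the Fourier domain. This is a minimal illustration in the spirit of the MOSSE closed form, not the implementation of any cited tracker. The helper names, patch size, Gaussian width and regularization value are assumptions made for the example.

```python
import numpy as np

def gaussian_peak(shape, center, sigma=2.0):
    """Desired response: a Gaussian centred on the target position."""
    rows, cols = np.indices(shape)
    return np.exp(-((rows - center[0]) ** 2 + (cols - center[1]) ** 2) / (2 * sigma ** 2))

def train_filter(patch, desired_response, reg=1e-3):
    """Closed-form filter in the Fourier domain (conjugate filter H*)."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(desired_response)
    return G * np.conj(F) / (F * np.conj(F) + reg)

def locate(filter_conj, patch):
    """Correlate the filter with a new patch and return the response peak."""
    response = np.real(np.fft.ifft2(filter_conj * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(response), response.shape)

rng = np.random.default_rng(0)
template = rng.standard_normal((32, 32))           # stand-in for a target patch
H_conj = train_filter(template, gaussian_peak((32, 32), (16, 16)))

# A circularly shifted copy of the patch moves the response peak by the same shift.
shifted = np.roll(template, shift=(3, -5), axis=(0, 1))
print(locate(H_conj, template), locate(H_conj, shifted))
```

Because training and detection reduce to element-wise operations on FFTs, the cost per frame is dominated by a handful of transforms, which is the reason for the high frame rates reported for early correlation filter trackers.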
A series of approaches have been developed based on KCF to address these challenges. For example, DSST [7] used image pyramids to handle scale changes of the target based on scale space theory, while using correlation filters for accurate position estimation of the target. Compared with other traditional target tracking algorithms, the DSST algorithm is widely used in real-time video tracking because it achieves a faster running speed while maintaining high accuracy. SRDCF [8] adopted a spatial regularization technique to improve the robustness of the filter by introducing spatial relationships between image blocks. In addition, the SRDCF algorithm employed a feature compression technique to map the original feature vector into a low-dimensional space, thus improving the computational efficiency of the algorithm. Li et al. [9] proposed the STRCF algorithm, which employed a structured regression method to model the filter response values with the target state, thus directly predicting the change of the target state during tracking. The STRCF algorithm also introduced a nonlinear kernel function to improve the robustness of the filter. With the wide application of deep learning in computer vision, researchers have started to explore the application of deep learning to target tracking tasks. The DeepSRDCF method [10] replaced the HOG features in SRDCF with the depth features of a single convolutional layer in a CNN, and HCF [11] combined multilayer convolutional features to enhance the effect. C-COT [12] integrated the spatial regularization of SRDCF and the adaptive sample weights of [13], and used a deep learning model to extract the visual features of the target. A continuous convolution operation is then used to match the target template with the candidate region to achieve target tracking.
2.2 Feature Selection Methods

At present, the amount of data involved in various learning tasks is increasing, leading to an exponential increase in computational complexity and excessive memory usage. The study of feature selection is therefore receiving more and more attention, and scholars are using different methods to analyze and evaluate features to improve model performance. The Relief algorithm was first proposed by Kira et al. [14] to determine which features are useful for classification tasks by estimating the correlation between features and categories. The Relief algorithm is mainly applicable to cases where the number of features in the dataset is large but most of them are redundant or irrelevant. To address the problem of diverse and complex multi-label datasets and the low prediction accuracy of some feature selection methods, Sun et al. [15] proposed a multi-label feature selection method using multiple labels and neighborhood mutual information in a multi-label neighborhood decision system. However, the output of this algorithm is random, which causes the weight values to fluctuate. The literature [16] similarly addressed the large dimensionality of multi-label data by proposing a multi-label online stream feature selection algorithm. Because the classical Relief and F-statistic feature selection methods cannot be directly applied to the multi-label problem, Kong et al. [17] proposed a combination of the Relief and F-statistic algorithms for feature selection. In recent years, swarm intelligence optimization algorithms have also started to be widely used in feature selection problems, achieving excellent performance. A swarm intelligence optimization algorithm is a group-intelligence-based optimization algorithm that uses swarm intelligence systems to solve complex optimization problems.
The basic idea of swarm intelligence optimization algorithm is to solve optimization problems by simulating many different group behaviors in a collective intelligence system. Swarm intelligence optimization algorithm typically consist of three main components: group search, social learning, and social behavior. The use of swarm intelligence optimization algorithms enables feature selection, such as nondominated sorting algorithms, ant colony algorithms, particle swarm algorithms, simulated annealing algorithms, genetic algorithms, etc. These algorithms are able to adapt to different types of problems and solve problems of different complexity. The literature [18] introduced an optimization algorithm based on honey foraging behavior, the basic idea of this is to consider the solution in the search space as the location of the honey bee looking for a honey source. Yang et al. [19] proposed FA, an optimization algorithm based on firefly brightness and attractiveness. Literature [20] combined the binary hybrid gray wolf algorithm and Harris Hawk optimization algorithm to convert the continuous search space into a binary search space to meet the feature selection requirements. Ewees et al. [21] proposed a grasshopper
optimization algorithm based on the OBL (Opposition Based Learning) strategy, which consists of two stages. The first stage uses the OBL strategy to generate the initial population, and the second stage uses OBL as an additional step to update the grasshopper population. Mirjalili et al. [22] proposed the GWO (Grey Wolf Optimization) algorithm in 2014. The basic idea of GWO is to treat each solution in the solution space as an individual grey wolf and to solve the optimization problem by simulating the process by which a grey wolf pack finds prey. With the development of swarm intelligence optimization, many other new algorithms and improvements have been proposed, for example the ant colony algorithm, the artificial immunity algorithm, and improved fish swarm algorithms. In particular, the ant colony algorithm [23] simulates the behavior of ants searching for food. Ants release pheromones on their paths, and other ants choose a path according to the pheromone concentration. By simulating this behavior, the ant colony algorithm can find the optimal solution in the search space.
3 Method

To address the above issues, this section combines the binary gray wolf algorithm and principal component analysis for feature selection, so that the features containing significant information about the target are selected. A sparse update is then used to further reduce the computing time and achieve real-time tracking of inland river ships. The model framework of this paper is shown in Fig. 1. First, the correlation operation is performed on the samples using correlation filtering, and the feature labels are obtained according to the sample response values. According to the positive and negative sample labels, the binary gray wolf algorithm is introduced to select the HOG features, giving the BGWO-HOG features. Meanwhile, principal component analysis is introduced to select the depth features, giving the PCA-CNN features. Finally, the dimensionality-reduced features are input to the original C-COT framework to obtain the ship tracking results of the current frame, and the correlation filters are then sparsely updated every N frames.

Fig. 1 Algorithm framework diagram
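The PCA step mentioned above can be illustrated with a short sketch that projects the channels of a deep feature map onto their leading principal components. This is an assumption-laden example rather than the paper's implementation: the function names, the map size (8 × 8 × 16) and the number of retained components (4) are hypothetical, and the projection is computed with a plain singular value decomposition.

```python
import numpy as np

def pca_projection(samples, n_components):
    """Return (projection matrix of shape D x d, channel mean) for samples of shape (S, D)."""
    mean = samples.mean(axis=0)
    centered = samples - mean
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components].T, mean

def reduce_channels(feature_map, projection, mean):
    """Project an (M, N, D) feature map to (M, N, d) along the channel axis."""
    M, N, D = feature_map.shape
    flat = feature_map.reshape(M * N, D)
    return ((flat - mean) @ projection).reshape(M, N, -1)

rng = np.random.default_rng(0)
deep_features = rng.standard_normal((8, 8, 16))     # stand-in for CNN features
P, mu = pca_projection(deep_features.reshape(-1, 16), n_components=4)
compact = reduce_channels(deep_features, P, mu)
print(compact.shape)
```

Fewer channels mean fewer per-channel FFTs and filter updates in the correlation filter, which is where the speed gain of such a reduction would come from.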
3.1 Binary Gray Wolf Algorithm to Select Features After inputting the ship images, the HOG features are selected using the swarm intelligence algorithm. By experimentally comparing the performance of each swarm intelligence algorithm, the binary gray wolf algorithm is finally selected to select the HOG features. Firstly, we process the extracted HOG features, and the original HOG feature map size is M×N×D. Each dimensional feature is tiled to obtain the MN×D feature matrix, which is input to the binary gray wolf algorithm. After optimization by the binary gray wolf algorithm, the features containing significant information about the target are selected, and the optimization process is specified as follows. The gray wolf optimization algorithm simulates the swarm intelligence exhibited by gray wolves in group predation behavior. This algorithm is widely used to solve various complex optimization problems. The gray wolf algorithm [22], originally proposed by Mirjalili et al. in 2014, modeled the collaborative search behavior of small packs of gray wolves for prey, and the swarm intelligence of gray wolves enabled them to find the best location for prey faster, just like wolf packs hunt, and the algorithm has been widely used in the past few years. According to the gray wolf population leadership hierarchy, four types of gray wolves are classified and three main steps of hunting are realized: finding, encircling and attacking prey. α is the head wolf, responsible for designating decisions such as predation, rest and advance. In a wolf pack, β wolves can take on subordinate roles, assisting the head wolf α in making decisions and organizing other activities. Meanwhile, ordinary wolf δ needs to obey the command of α and β, while it can command ω to complete the task. ω is the bottom wolf, following the commands of the other three levels of gray wolves. 
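The tiling step described in this subsection, which turns an M×N×D HOG map into an MN×D matrix, and the effect of a binary selection vector can be sketched as follows. The helper names, the map size and the particular mask are hypothetical. In a binary gray wolf search, each wolf position would be such a 0/1 vector over the D channels, but how the paper scores a mask is not shown here.

```python
import numpy as np

def tile_features(hog_map):
    """Flatten an (M, N, D) feature map into an (M*N, D) matrix, one column per channel."""
    M, N, D = hog_map.shape
    return hog_map.reshape(M * N, D)

def apply_channel_mask(feature_matrix, mask):
    """Keep only the channels whose mask entry is 1."""
    mask = np.asarray(mask, dtype=bool)
    return feature_matrix[:, mask]

rng = np.random.default_rng(0)
hog = rng.random((6, 5, 31))                 # stand-in HOG map with 31 channels
matrix = tile_features(hog)                  # shape (30, 31)

mask = np.zeros(31, dtype=int)
mask[[0, 3, 7, 12]] = 1                      # hypothetical selection of 4 channels
selected = apply_channel_mask(matrix, mask)  # shape (30, 4)
print(matrix.shape, selected.shape)
```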
When wolves hunt, α, β and δ locate the prey and jointly command ω to approach, encircle and attack it until the hunt succeeds. The gray wolf algorithm is modeled mathematically on these four social levels: α is the optimal solution, β and δ are the second and third best solutions, respectively, and ω denotes the remaining candidate solutions. The gray wolf population approaches and surrounds the hunting target through Eqs. (1) and (2):

$$X(t+1) = X_p(t) - A \cdot D \tag{1}$$

$$D = \left| C \cdot X_p(t) - X(t) \right| \tag{2}$$

Equation (1) updates the position of the gray wolf, while Eq. (2) gives the distance between an individual wolf and the prey. Here $X$ is the position vector of the gray wolf, $t$ is the iteration number, $A$ and $C$ are coefficient vectors, and $X_p$ is the position vector of the prey. $A$ and $C$ are calculated as follows:

$$A = 2a \cdot r_1 - a, \qquad C = 2 r_2 \tag{3}$$

where $r_1$ and $r_2$ are random vectors with components in [0, 1], and $a$ is the convergence factor, which decreases linearly from 2 to 0 with the number of iterations ($T_{\max}$ is the maximum number of iterations):

$$a = 2 - \frac{2t}{T_{\max}} \tag{4}$$
After the head wolf has identified the prey position, the subordinate wolves and ordinary wolves guide the whole pack to encircle the prey under its leadership. The prey position is tracked as follows:

$$D_\alpha = \left| C_1 \cdot X_\alpha - X \right|, \qquad D_\beta = \left| C_2 \cdot X_\beta - X \right|, \qquad D_\delta = \left| C_3 \cdot X_\delta - X \right| \tag{5}$$

where $D_\alpha$, $D_\beta$ and $D_\delta$ denote the distance vectors between α, β, δ and the other individuals; $X_\alpha$, $X_\beta$ and $X_\delta$ denote the current positions of α, β and δ, respectively; $C_1$, $C_2$ and $C_3$ are coefficient vectors; and $X$ is the position vector of the current gray wolf individual.

$$X_1 = X_\alpha - A_1 \cdot D_\alpha, \qquad X_2 = X_\beta - A_2 \cdot D_\beta, \qquad X_3 = X_\delta - A_3 \cdot D_\delta \tag{6}$$

Equation (6) defines the step length and direction with which an individual wolf ω advances towards α, β and δ, respectively.

$$X(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{7}$$
Equation (7) defines the final position of ω, so that the gray wolf population moves as close as possible to the prey.

Feature selection is a binary optimization problem, so each feature label takes a binary value, 0 or 1, and the gray wolf algorithm is converted to a binary version to apply it to feature selection. Our study found that the effectiveness of the binary gray wolf optimization algorithm decreases as the feature dimension grows. The algorithm is therefore applied to the dimensionality reduction of the HOG features, whose dimension is comparatively low. The flow of the binary gray wolf algorithm is:

(1) Initialize the position of each wolf in the population and generate a random population;
(2) Calculate the fitness to obtain the α, β and δ positions;
(3) Update the position of each wolf according to the transformation function to further improve the search results;
(4) Repeat the above steps until the termination condition is satisfied, which yields the optimal or a suboptimal solution.

In the binary discrete space, each position component switches between 0 and 1. The positions of α, β and δ are initialized as follows:

$$X_i = \{x_j\}, \qquad x_j = \begin{cases} 1, & \text{rand} \ge 0.5 \\ 0, & \text{rand} < 0.5 \end{cases}, \qquad i = \alpha, \beta, \delta, \cdots, \quad j = 1, 2, \ldots, n \tag{8}$$

where $j$ indexes the dimensions of the problem being solved. The binary discrete space requires the wolf pack positions to be converted to 0 and 1. The conversion function is chosen as follows:

$$S(X_\alpha(t)) = \frac{1}{1 + \exp(X_\alpha(t))}, \qquad S(X_\beta(t)) = \frac{1}{1 + \exp(X_\beta(t))}, \qquad S(X_\delta(t)) = \frac{1}{1 + \exp(X_\delta(t))} \tag{9}$$

Equation (9) converts the continuous problem into a discrete one. The wolf pack positions are then updated as follows:

$$X(t+1) = \begin{cases} 1, & \text{sigmoid}\left( \dfrac{X_\alpha(t) + X_\beta(t) + X_\delta(t)}{3} \right) \ge \text{rand} \\ 0, & \text{sigmoid}\left( \dfrac{X_\alpha(t) + X_\beta(t) + X_\delta(t)}{3} \right) < \text{rand} \end{cases} \tag{10}$$
Multiple iterations of the above wolf position update are performed to obtain the global optimal solution. According to the global optimal solution, the feature subset of size $M \times N \times \tilde{D}$ is obtained by selection. The optimal feature subset is input to the correlation filter framework for training, the correlation filter is obtained, and the current frame tracking result is obtained.
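The flow above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function names, the toy fitness function and the parameter defaults are hypothetical; the transfer function is the standard logistic function; and, as an interpretation of Eq. (10), it is applied to the averaged step of Eqs. (6) and (7). The three leaders are kept across iterations so that the best mask found so far is never lost.

```python
import math
import random


def sigmoid(x):
    # Standard logistic transfer function (an assumption of this sketch).
    return 1.0 / (1.0 + math.exp(-x))


def bgwo(fitness, n_features, n_wolves=8, max_iter=30, seed=0):
    """Minimise `fitness` over binary masks of length `n_features`."""
    rng = random.Random(seed)
    # Eq. (8): random binary initialisation of every wolf.
    wolves = [[1 if rng.random() >= 0.5 else 0 for _ in range(n_features)]
              for _ in range(n_wolves)]
    # The three fittest wolves play the roles of alpha, beta and delta.
    leaders = [list(w) for w in sorted(wolves, key=fitness)[:3]]
    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter               # Eq. (4)
        for w in wolves:
            for j in range(n_features):
                steps = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a  # Eq. (3)
                    C = 2.0 * rng.random()
                    D = abs(C * leader[j] - w[j])   # Eq. (5)
                    steps.append(leader[j] - A * D)  # Eq. (6)
                mean_step = sum(steps) / 3.0        # Eq. (7)
                # Eq. (10): threshold the transfer function against rand.
                w[j] = 1 if sigmoid(mean_step) >= rng.random() else 0
        # Re-rank; old leaders stay in the pool, so the best is preserved.
        leaders = [list(w) for w in sorted(wolves + leaders, key=fitness)[:3]]
    return leaders[0]


def toy_fitness(mask, target=(1, 0, 1, 0, 1, 0, 1, 0)):
    # Hypothetical fitness: number of positions that differ from a target mask.
    return sum(m != t for m, t in zip(mask, target))


best_mask = bgwo(toy_fitness, n_features=8, seed=1)
```

In a tracker, the fitness would instead score a candidate subset of HOG channels, for example by classification error on the positive and negative samples.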
3.2 Principal Component Analysis to Select Features

Principal component analysis (PCA) converts the feature variables of the original data into new feature variables for data analysis. In this section, PCA is applied to reduce the high-dimensional depth features to a low-dimensional representation and remove redundant features, which reduces the number of feature dimensions, mitigates the curse of dimensionality, and further increases the computing speed. PCA consists of three main steps: centering the data, calculating the covariance matrix, and obtaining the eigenvalues and eigenvectors.

In PCA dimensionality reduction, the original data is first centered in order to eliminate the bias caused by differences in magnitude between attributes. That is, the mean of each feature dimension is subtracted from the feature values of that dimension, so that the data has zero mean. The mean is computed as follows:

$$\bar{x}^t = \frac{1}{MN} \sum_{m,n} x^t(m, n) \tag{11}$$

where $M$ and $N$ are the numbers of rows and columns of the feature matrix, $(m, n)$ denotes the pixel coordinate position, and $x^t$ denotes the feature vector of the t-th frame. The covariance matrix is then derived to measure the relationship between the attributes. It is a symmetric n × n matrix, computed as follows:

$$C^t = \frac{1}{MN} \sum_{m,n} \left( x^t(m, n) - \bar{x}^t \right) \left( x^t(m, n) - \bar{x}^t \right)^T \tag{12}$$

The eigenvalues of the covariance matrix are then computed. They are real numbers that indicate the variability of the data: the larger the eigenvalue, the larger the variability along the corresponding eigenvector. Finally, the eigenvalues are sorted from largest to smallest, and the eigenvectors corresponding to the $\tilde{d}$ largest eigenvalues are taken to form the projection matrix $P$. The new feature matrix is obtained by multiplying the original feature matrix by the projection matrix, which achieves the dimensionality reduction:

$$F = X \cdot P \tag{13}$$
where $X$ is the original feature matrix with dimension $d$, $P$ is the projection matrix with size $d \times \tilde{d}$, and $F$ is the new feature matrix with dimension $\tilde{d}$. Through the above formula, the dimension of the feature matrix changes from $d$ to $\tilde{d}$, and these feature vectors form a new feature space, thus achieving the purpose of dimensionality reduction.

The original objective function of the correlation filter is:

$$\arg\min_{f} \frac{1}{2} \left\| \sum_{d=1}^{D} x_t^d * f^d - y \right\|^2 \tag{14}$$

where the sample $x_t$ consists of a feature map of size M×N×D, $y$ is a predefined Gaussian label, and $f$ denotes the correlation filter. The objective function after adding PCA dimensionality reduction is:

$$\arg\min_{f} \frac{1}{2} \left\| \sum_{d=1}^{D} x_t^d \cdot P * f^d - y \right\|^2 \tag{15}$$

After the PCA dimensionality reduction process, the sample feature map size is $M \times N \times \tilde{D}$.
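The three PCA steps of Eqs. (11) to (13) can be illustrated with a small dependency-free Python sketch. It is an illustration only: the function name is hypothetical, and power iteration with deflation stands in for a full eigendecomposition (a numerical library would normally be used). Rows of `X` play the role of the MN pixel positions and columns the feature channels.

```python
import math


def pca_project(X, d_out, iters=200):
    """Return (F, P): projected features and the D x d_out projection matrix."""
    n, D = len(X), len(X[0])
    # Eq. (11): per-channel mean over all samples.
    mean = [sum(row[k] for row in X) / n for k in range(D)]
    Xc = [[row[k] - mean[k] for k in range(D)] for row in X]
    # Eq. (12): covariance matrix of the centred data.
    C = [[sum(r[a] * r[b] for r in Xc) / n for b in range(D)] for a in range(D)]
    columns = []
    for _ in range(d_out):
        # Deterministic start vector; it must not be orthogonal to the
        # dominant eigenvector, otherwise power iteration cannot find it.
        v = [1.0 / (k + 1) for k in range(D)]
        for _ in range(iters):
            w = [sum(C[a][b] * v[b] for b in range(D)) for a in range(D)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue belonging to v.
        lam = sum(v[a] * sum(C[a][b] * v[b] for b in range(D)) for a in range(D))
        columns.append(v)
        # Deflation: remove the component just found before the next pass.
        C = [[C[a][b] - lam * v[a] * v[b] for b in range(D)] for a in range(D)]
    P = [[columns[j][k] for j in range(d_out)] for k in range(D)]
    # Eq. (13): F = X . P, applied to the original feature matrix.
    F = [[sum(row[k] * P[k][j] for k in range(D)) for j in range(d_out)] for row in X]
    return F, P


# Toy data lying on the line y = 2x, so one component captures all variance.
features = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
F, P = pca_project(features, d_out=1)
```

For the toy data the single retained direction is proportional to (1, 2), so each two-channel sample is represented by one number without loss.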
3.3 Sparse Update Strategy

Most tracking algorithms update the filter every frame, which can mitigate the model degradation problem. However, ships in inland river scenes move slowly and the target appearance changes little between consecutive frames, so we use a sparse update strategy. The model is updated every N frames to further reduce the time spent training the model.
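A sparse update schedule of this kind can be sketched as follows. The callbacks `locate` and `train` are hypothetical placeholders for the tracker's localisation and filter-training steps, and training an initial model on the first frame is an assumption of the sketch.

```python
def track_sequence(frames, update_interval, locate, train):
    """Track through `frames`, retraining the model only every N-th frame."""
    model = train(None, frames[0])      # assumed initial training on frame 0
    results, n_updates = [], 1
    for idx, frame in enumerate(frames[1:], start=1):
        results.append(locate(model, frame))   # localisation runs every frame
        if idx % update_interval == 0:         # training runs every N frames
            model = train(model, frame)
            n_updates += 1
    return results, n_updates


# Dummy callbacks: the "model" is just the last frame it was trained on.
demo_results, demo_updates = track_sequence(
    frames=list(range(13)),
    update_interval=5,
    locate=lambda model, frame: frame,
    train=lambda model, frame: frame,
)
```

With 13 frames and N = 5, the filter is trained on frames 0, 5 and 10 only, while localisation still runs on all 12 subsequent frames.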
4 Experiment

In this section, the proposed ship tracking algorithm is validated on the inland river ship dataset collected by our group. Comparison experiments with several advanced algorithms are conducted using the Visual Tracker Benchmark [24]. The collected inland river ship dataset has 227 video sequences from CCTV videos of Wenzhou Maritime Bureau, Wuhu Maritime Bureau, Wuhan Yangtze River Maritime Branch, Tianjin Port, Wenzhou Port, and Zhenneng Power Plant Terminal, covering tracking scenarios such as occlusion, lighting change and scale change. The proposed method is implemented on the MATLAB 2016a platform with the MatConvNet toolbox, and all experiments are run on a PC with an Intel i5-10600K CPU and 16 GB of RAM.
Table 1 Performance comparisons of multiple swarm intelligence optimization algorithms

Algorithm   Accuracy (%)   Iteration time (s)   Matthews   Optimal adaptation
BGWO        84.81          2874.9136            0.70324    0.16134
BGSA        84.49          5815.1429            0.69376    0.15999
BPSO        84.61          7662.5646            0.70055    0.16430
BALO        84.74          7739.0209            0.70003    0.16314
BBA         83.56          9038.4475            0.67937    0.16928
BSSA        84.21          15895.6674           0.69360    0.16842
BWOA        84.29          15433.1964           0.69477    0.16761
Fig. 2 Comparison of the success rate and accuracy rate of the algorithm before and after feature selection
Six intelligent optimization algorithms, BGSA, BPSO, BALO, BBA, BSSA and BWOA, are selected for comparison with BGWO. Each algorithm is run for 50 iterations, and the comparison results are shown in Table 1. The tracking results are shown in Fig. 2, where (a) shows the success rate comparison and (b) shows the accuracy rate comparison. According to the experimental results, introducing BGWO to reduce the dimensionality of the HOG features does not greatly improve the computational speed, but greatly improves the accuracy rate. Introducing PCA to reduce the dimensionality of the depth features greatly improves the operation speed while slightly decreasing the accuracy rate. Overall, feature selection improves both accuracy and operation speed; the specific experimental indexes are shown in Table 2.
Table 2 Comparison of tracking success rate, accuracy rate and computing speed before and after feature selection

Algorithm         Success rate (%)   Accuracy rate (%)   Operation speed (FPS)
C-COT             77.2               85.3                0.516
C-COT+BGWO        77.4               87.7                2.975
C-COT+BGWO+PCA    80.0               86.6                13.462
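To make the trade-off in Table 2 concrete, the differences relative to the C-COT baseline can be computed directly from the tabulated values. The snippet below is only arithmetic on the numbers in the table; the helper name is hypothetical.

```python
# Values copied from Table 2: (success rate %, accuracy rate %, speed in FPS).
table2 = {
    "C-COT": (77.2, 85.3, 0.516),
    "C-COT+BGWO": (77.4, 87.7, 2.975),
    "C-COT+BGWO+PCA": (80.0, 86.6, 13.462),
}


def relative_to_baseline(name, baseline="C-COT"):
    """Return (success gain, accuracy gain, speed-up factor) over the baseline."""
    s, a, fps = table2[name]
    s0, a0, fps0 = table2[baseline]
    return round(s - s0, 1), round(a - a0, 1), fps / fps0


bgwo_gain = relative_to_baseline("C-COT+BGWO")
full_gain = relative_to_baseline("C-COT+BGWO+PCA")
```

By this calculation, the full method gains 2.8 points in success rate and 1.3 points in accuracy rate over C-COT while running roughly 26 times faster; BGWO alone gives 0.2 and 2.4 points with a speed-up of about 5.8 times.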
5 Summary

The analysis in this chapter finds that correlation filtering algorithms that introduce multiple features run slowly, so a feature selection-based correlation filter tracking algorithm for inland river ships is adopted. First, the CN features, HOG features and depth features of ship images are extracted. Then, the HOG features are reduced from 31 to 10 dimensions by the binary gray wolf algorithm, and the depth features are reduced from 512 to 64 dimensions by the improved PCA method. Finally, the reduced-dimensional features are put into the model for training to obtain the correlation filter of the current frame.
References

1. Kalal, Z., Mikolajczyk, K., Matas, J.: Tracking-learning-detection. IEEE TPAMI 34(7), 1409–1422 (2012)
2. Hare, S., Golodetz, S., Saffari, A., et al.: Struck: Structured output tracking with kernels. IEEE TPAMI 38(10), 2096–2109 (2016)
3. Bolme, D.S., Beveridge, J.R., Draper, B.A., et al.: Visual object tracking using adaptive correlation filters. CVPR 2544–2550 (2010)
4. Henriques, J.F., Caseiro, R., Martins, P., et al.: Exploiting the circulant structure of tracking-by-detection with kernels. ECCV 702–715 (2012)
5. Danelljan, M., Shahbaz Khan, F., Felsberg, M., et al.: Adaptive color attributes for real-time visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1090–1097 (2014)
6. Henriques, J.F., Caseiro, R., Martins, P., et al.: High-speed tracking with kernelized correlation filters. IEEE TPAMI 37(3), 583–596 (2015)
7. Danelljan, M., Hager, G., et al.: Discriminative scale space tracking. IEEE Trans. Pattern Anal. Mach. Intell. 39(8), 1561–1575 (2017)
8. Danelljan, M., Hager, G., Shahbaz Khan, F., et al.: Learning spatially regularized correlation filters for visual tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4310–4318 (2015)
9. Li, F., Tian, C., Zuo, W., et al.: Learning spatial-temporal regularized correlation filters for visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4904–4913 (2018)
10. Danelljan, M., Hager, G., Khan, F.S., et al.: Convolutional features for correlation filter based visual tracking. In: 2015 IEEE International Conference on Computer Vision Workshop (ICCVW), pp. 621–629. IEEE (2016)
11. Ma, C., Huang, J.B., Yang, X., et al.: Hierarchical convolutional features for visual tracking. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3074–3082 (2015)
12. Danelljan, M., Robinson, A., Shahbaz Khan, F., et al.: Beyond correlation filters: learning continuous convolution operators for visual tracking. In: Computer Vision-ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part V, pp. 472–488. Springer International Publishing (2016)
13. Danelljan, M., Hager, G., Shahbaz Khan, F., et al.: Adaptive decontamination of the training set: a unified formulation for discriminative visual tracking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1430–1438 (2016)
14. Kira, K.: The feature selection problem: traditional methods and a new algorithm. Proc. AAAI 92 (1992)
15. Sun, L., Yin, T., Ding, W., et al.: Multilabel feature selection using ML-ReliefF and neighborhood mutual information for multilabel neighborhood decision systems. Inf. Sci. 537, 401–424 (2020)
16. Wang, H., Yu, D., Li, Y.: Multi-label online streaming feature selection based on spectral granulation and mutual information. In: Rough Sets: International Joint Conference, IJCRS 2018, Quy Nhon, Vietnam, Aug 20–24, 2018, Proceedings, pp. 215–228. Springer International Publishing (2018)
17. Kong, D., Ding, C., Huang, H., et al.: Multi-label ReliefF and F-statistic feature selections for image annotation. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2352–2359. IEEE (2012)
18. Baykasoglu, A., Ozbakir, L., Tapkan, P.: Artificial bee colony algorithm and its application to generalized assignment problem. In: Swarm Intelligence: Focus on Ant and Particle Swarm Optimization, p. 1 (2007)
19. Yang, X.S.: Firefly algorithms for multimodal optimization. In: Stochastic Algorithms: Foundations and Applications: 5th International Symposium, SAGA 2009, Sapporo, Japan, Oct 26–28, 2009, Proceedings, pp. 169–178. Springer, Berlin Heidelberg (2009)
20. Al-Wajih, R., Abdulkadir, S.J., Aziz, N., et al.: Hybrid binary grey wolf with harris hawks optimizer for feature selection. IEEE Access 9(1), 31662–31677 (2021)
21. Ewees, A.A., Elaziz, M.A., Houssein, E.H.: Improved grasshopper optimization algorithm using opposition-based learning. Expert Syst. Appl. 112(Dec), 156–172 (2018)
22. Mirjalili, S., Mirjalili, S.M., Lewis, A.: Grey wolf optimizer. Adv. Eng. Softw. 69(3), 46–61 (2014)
23. Dorigo, M., Birattari, M., Stützle, T.: Ant colony optimization. IEEE Comput. Intell. Mag. 1(4), 28–39 (2006)
24. Wu, Y., Lim, J., Yang, M.H.: Online object tracking: a benchmark. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2411–2418 (2013)
Heterogeneous Vehicle Platoon Control Based on Predictive Constant Time Headway Strategy

Yangzhou Chen and Bingzhuang Yan
Abstract This paper proposes a predictive constant time headway spacing strategy (PCTHS) for the heterogeneous vehicle platoon control problem. This strategy uses the vehicle's own speed information and the speed information of the leader vehicle associated with the time headway to improve the driving safety and stability of the vehicles in the platoon. Based on the proposed PCTHS, a corresponding vehicle-following control protocol is designed. The proposed control protocol has a flexible general frame structure, which not only realizes the consensus of the position, velocity and acceleration of heterogeneous vehicles in the platoon, but also effectively reduces unreasonable acceleration and deceleration of vehicles, so as to ensure comfort during driving. Then, the stability conditions are obtained by applying a state linear transformation and partial stability theory to the vehicle platoon system. Finally, the effectiveness of the proposed method is verified by the simulation of a third-order heterogeneous vehicle platoon.

Keywords Heterogeneous vehicle platoon · Predictive constant time headway strategy · State linear transformation · Partial stability
Y. Chen · B. Yan (corresponding author), Beijing University of Technology, Beijing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_56

1 Introduction

In recent years, vehicle networking and autonomous driving technology have developed rapidly. Traditional adaptive cruise control (ACC) has gradually developed into another promising application: cooperative adaptive cruise control (CACC). ACC measures the relative distance and speed of adjacent vehicles through sensors such as on-board radar, but it has difficulty obtaining their acceleration. CACC uses vehicle-to-vehicle (V2V) communication, vehicle-to-infrastructure (V2I) communication and other technologies to share state information (including position, velocity and acceleration) between vehicles (including but not limited to adjacent vehicles), and thereby achieves better platoon control [1–3]. It is therefore of great practical value and research significance to study vehicle platoon control methods that improve driving safety, comfort and fuel utilization by improving the way vehicles are driven.

The core of vehicle platoon control is the vehicle-following control protocol. A reasonable and effective control protocol ensures that the vehicles in a platoon maintain the desired safe spacing and drive stably. Bernardo et al. [4] treated the platoon problem as a consensus problem in multi-agent system networks and designed a distributed control protocol that relies on the state information of each vehicle and its neighboring vehicles. Jia et al. [5] proposed a consensus-based vehicle platoon control protocol and showed that globally accessible leader vehicle information benefits the stability of the platoon. Inspired by this research, this paper regards vehicle platoon control as a consensus problem in the dynamic network of third-order multi-agent systems and designs a consensus-based vehicle control protocol.

The communication topology between vehicles in a CACC system plays an important role in platoon control, and researchers have studied in depth its influence on platoon stability. Zheng et al. [6] analyzed the internal stability of vehicle platoons for some typical communication topologies and obtained the stability condition by the Routh-Hurwitz stability criterion. Pirani et al.
[7] investigated the effect of the communication topology on the resilience of distributed control algorithms for vehicle platoons, and showed that relying only on the traditional predecessor-follower (PF) topology is not conducive to the performance and safety of the platoon. Considering the diversity of topologies between connected vehicles, it is necessary to explore a more general inter-vehicle communication topology. Therefore, this paper assumes a general communication topology containing a directed spanning tree whose root is the leader.

In terms of vehicle spacing during platoon movement, existing spacing strategies fall into two main categories. One is the constant spacing strategy (CSS), in which each vehicle follows the preceding vehicle at a fixed distance [8, 9]. The other is the speed-dependent constant time headway strategy (CTHS), which lets the distance between vehicles vary with vehicle speed by keeping a constant time headway [10, 11]. The time headway strategies can in turn be divided into two categories: those that depend on the leader vehicle's speed and those that depend on the vehicle's own current speed. However, CSS cannot provide a reasonable safety distance at both low and high speeds, and in complex real traffic scenarios CTHS cannot effectively avoid unreasonable acceleration and deceleration of vehicles, so it cannot guarantee the stability and comfort of driving. To address these problems, a novel spacing strategy is proposed in this paper.

This paper performs a stability analysis of a third-order heterogeneous vehicle platoon system in the CACC case with a general communication topology. The main contributions include:

(1) A predictive constant time headway strategy (PCTHS) related to the leader speed is proposed. Comparative simulation analysis shows that PCTHS can effectively avoid unreasonable acceleration and deceleration of vehicles, and thus improve the safety, stability and comfort of driving;
(2) The third-order heterogeneous platoon control problem is transformed into a leader-follower consensus problem, and a distributed consensus control protocol with a general structure is designed to improve the freedom and flexibility of vehicle control;
(3) The consensus problem is further transformed into a stability problem with respect to part of the variables by means of a state linear transformation. Using partial stability theory, the stability conditions of the system are given. In particular, the linear transformation method and partial stability theory simplify the stability analysis of the system and provide a reference for the stability analysis of actual vehicle platoon systems.

The remainder of this paper is organized as follows. Section 2 presents the problem statement. Section 3 gives the control protocol design and the state linear transformation of the system. Section 4 performs the stability analysis. Finally, Sect. 5 presents simulations to check the effectiveness of the PCTHS and the control protocol.
2 Problem Description

Consider a platoon of N + 1 heterogeneous vehicles, where the leader vehicle is numbered i = 0 and the following vehicles are numbered i = 1, 2, ···, N in turn. It is assumed that the vehicles in the platoon can exchange state information (position, velocity and acceleration) with each other through V2V communication, and that the distance to the vehicle in front is obtained through distance sensors such as LIDAR. The communication topology among the following vehicles is abstracted as a weighted directed graph $G = (V, E, W)$, where $V = \{1, 2, \cdots, N\}$ is the set of nodes of the digraph $G$ and $E \subseteq V \times V$ is its set of directed edges. We use $\mathcal{N}_i$ to denote the index set of vehicles that can send their information to vehicle i. $W = [W_{ij}]_{i,j=1}^{N}$ is the block connection matrix of communication weights, where $W_{ij} \in \mathbb{R}^{3 \times 3}_{\ge 0}$. $W_{ij} \ne 0$ indicates that vehicle j can transmit its position, velocity and acceleration information to vehicle i, that is, $(j, i) \in E$; otherwise $W_{ij} = 0$, that is, $(j, i) \notin E$. Based on the matrix $W$, we define the block Laplacian matrix $L_W = [L_{ij}]_{i,j=1}^{N}$, where

$$L_{ij} = \begin{cases} \sum_{k \in \mathcal{N}_i} W_{ik}, & j = i \\ -W_{ij}, & j \in \mathcal{N}_i \\ 0, & j \notin \mathcal{N}_i,\ j \ne i \end{cases} \tag{1}$$
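As an illustration of Eq. (1), the following sketch assembles the block Laplacian for a small follower graph. It assumes, for simplicity, that every nonzero block is the same diagonal matrix diag(w_p, w_v, w_a); the function name and the example topology (a predecessor-following chain among three followers) are hypothetical.

```python
def block_laplacian(n, neighbors, weight):
    """Build the 3N x 3N block Laplacian L_W of Eq. (1).

    neighbors[i] lists the followers j (1-based) that send information to i;
    weight = (w_p, w_v, w_a) is the diagonal of every nonzero block W_ij.
    """
    size = 3 * n
    L = [[0.0] * size for _ in range(size)]
    for i in range(1, n + 1):
        for j in neighbors.get(i, []):
            for k in range(3):
                row = 3 * (i - 1) + k
                L[row][3 * (j - 1) + k] -= weight[k]  # off-diagonal block: -W_ij
                L[row][row] += weight[k]              # diagonal block: sum of W_ik
    return L


# Followers 2 and 3 each listen to their predecessor; follower 1 has no
# follower neighbour (it would listen to the leader, which is not part of L_W).
L_W = block_laplacian(3, {2: [1], 3: [2]}, weight=(1.0, 2.0, 3.0))
```

Every row of the resulting matrix sums to zero, which is the usual Laplacian property carried over to the block form.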
Fig. 1 Typical communication topology for vehicle platoon: (a) PF, (b) PLF, (c) BD, (d) TPF
Further, consider the communication between the following vehicles and the leader vehicle. Let $\bar{G} = \{\bar{V}, \bar{E}, W, \delta\}$ be the weighted directed graph of the platoon including the leader vehicle, where $\bar{V} = V \cup \{0\}$ and $\bar{E}$ is the edge set of the graph $\bar{G}$ with leader 0. Let $\delta = [D_1^T, \ldots, D_N^T]^T \in \mathbb{R}^{3N \times 3}$ be the communication connection matrix between the following vehicles and the leader vehicle. $D_i \ne 0$ indicates that vehicle 0 can transmit its position, velocity and acceleration information to vehicle i, that is, $(0, i) \in \bar{E}$; otherwise $D_i = 0$, that is, $(0, i) \notin \bar{E}$. Note that the communication matrices are generally diagonal, that is, $W_{ij} = D_i = \mathrm{diag}(w_p, w_v, w_a)$, where $w_p$, $w_v$ and $w_a$ are the communication weights of the position, velocity and acceleration information transmitted between vehicles, respectively.

Suppose that the general digraph $\bar{G}$ contains a directed spanning tree with the leader as the root node. This general digraph covers the following typical communication topologies: predecessor-following (PF), predecessor-leader following (PLF), two-predecessor following (TPF) and bidirectional (BD), shown in Fig. 1.

According to reference [12], the heterogeneous third-order vehicle dynamics model can be established as follows:

$$\begin{aligned} \dot{p}_i &= v_i \\ \dot{v}_i &= a_i \\ \dot{a}_i &= -\tau_i^{-1} a_i + \tau_i^{-1} u_i \end{aligned} \tag{2}$$
where $p_i$, $v_i$ and $a_i$ represent the position, velocity and acceleration of vehicle i, respectively; $u_i$ denotes the control input of vehicle i; and $\tau_i$ is the inertial time constant of the vehicle power system. In this paper we consider heterogeneous vehicle platoons, that is, the inertial time constants $\tau_i$ of the vehicles may differ from each other.

Next, we define a new vehicle spacing strategy, the predictive constant time headway strategy (PCTHS), as shown in Fig. 2. This strategy uses each vehicle's own speed to predict its position a fixed time ahead of the current position, and requires the predicted position difference between two adjacent vehicles to equal a safety value r related to the current speed of the leader, that is,

$$(p_{i-1} + h v_{i-1}) - (p_i + h v_i) \to r = s + l + h_0 v_0 \tag{3}$$

Fig. 2 Predictive constant time headway strategy (PCTHS)
where $l$ denotes the largest length of the vehicle body, $s$ denotes the safe parking distance between two adjacent vehicles, and $h$ and $h_0$ are the constant time headways. Assuming that the leader vehicle travels at a constant speed, the corresponding vehicle platoon control objective is

$$\begin{cases} \lim_{t \to \infty} \left\| p_i - p_0 + h (v_i - v_0) + i r \right\| = 0 \\ \lim_{t \to \infty} \left\| v_i - v_0 \right\| = 0 \\ \lim_{t \to \infty} \left\| a_i - a_0 \right\| = 0 \end{cases} \tag{4}$$

where $\| \cdot \|$ denotes the Euclidean norm. Let $\bar{p}_i = p_i + h v_i + i r$; then (2) becomes

$$\begin{aligned} \dot{\bar{p}}_i &= v_i + h a_i + i h_0 a_0 \\ \dot{v}_i &= a_i \\ \dot{a}_i &= -\tau_i^{-1} a_i + \tau_i^{-1} u_i \end{aligned} \tag{5}$$
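The first row of (5) follows by differentiating the new position variable. A short check, assuming that $s$, $l$, $h$ and $h_0$ are constants and using $r = s + l + h_0 v_0$ from (3) together with the dynamics (2):

```latex
\begin{aligned}
\dot{\bar{p}}_i
  &= \dot{p}_i + h\,\dot{v}_i + i\,\dot{r} \\
  &= v_i + h\,a_i + i\,\frac{\mathrm{d}}{\mathrm{d}t}\left(s + l + h_0 v_0\right) \\
  &= v_i + h\,a_i + i\,h_0\,a_0 .
\end{aligned}
```

The remaining two rows of (5) are unchanged from (2), since the substitution only affects the position variable.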
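Let $x_i = [\bar{p}_i, v_i, a_i]^T$; then (5) can be further rewritten as

$$\dot{x}_i = A_i x_i + A_{i0} x_0 + B_i u_i \tag{6}$$

where

$$A_i = \begin{bmatrix} 0 & 1 & h \\ 0 & 0 & 1 \\ 0 & 0 & -\tau_i^{-1} \end{bmatrix}, \quad B_i = \begin{bmatrix} 0 \\ 0 \\ \tau_i^{-1} \end{bmatrix}, \quad A_{i0} = \begin{bmatrix} 0 & 0 & i h_0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad i \in \bar{V} \tag{7}$$

Thus, the vehicle platoon control objective (4) is transformed into a leader-following consensus problem for the system (6), that is,

$$\lim_{t \to +\infty} \left\| x_i(t) - x_0(t) \right\| = 0 \tag{8}$$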
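As a quick consistency check between the matrix form (6), (7) and the scalar form (5), the sketch below builds the three matrices for one follower and evaluates the state derivative. The function names and the numerical values are illustrative assumptions, not parameters from the paper.

```python
def system_matrices(i, h, h0, tau):
    """Matrices A_i, B_i and A_i0 of Eq. (7) for vehicle i."""
    A_i = [[0.0, 1.0, h], [0.0, 0.0, 1.0], [0.0, 0.0, -1.0 / tau]]
    B_i = [0.0, 0.0, 1.0 / tau]
    A_i0 = [[0.0, 0.0, i * h0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
    return A_i, B_i, A_i0


def matvec(M, v):
    return [sum(M[r][c] * v[c] for c in range(3)) for r in range(3)]


def state_derivative(i, x_i, x_0, u_i, h, h0, tau):
    """Right-hand side of Eq. (6): A_i x_i + A_i0 x_0 + B_i u_i."""
    A_i, B_i, A_i0 = system_matrices(i, h, h0, tau)
    own, lead = matvec(A_i, x_i), matvec(A_i0, x_0)
    return [own[k] + lead[k] + B_i[k] * u_i for k in range(3)]


# Illustrative numbers: follower i = 2 with state [p_bar, v, a].
dx = state_derivative(i=2, x_i=[10.0, 20.0, 1.5], x_0=[50.0, 22.0, 0.8],
                      u_i=2.0, h=0.5, h0=1.0, tau=0.4)
```

For these numbers, (5) gives v + h a + i h_0 a_0 = 20 + 0.75 + 1.6 = 22.35 for the first component, a = 1.5 for the second, and (−1.5 + 2.0)/0.4 = 1.25 for the third, which is what the matrix form returns.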
3 Distributed Control Protocol Architecture Design and Linear Transformation

For the above vehicle platoon control problem, the following distributed control protocol is designed, taking into account the heterogeneity of the vehicles and the consensus of vehicle position, velocity and acceleration:

$$u_i = K_i \sum_{j \in \mathcal{N}_i} W_{ij} \left( x_j - x_i \right) + K_i D_i \left( x_0 - x_i \right), \quad i \in V \tag{9}$$
where $K_i = [k_{ip}, k_{iv}, k_{ia}] \in \mathbb{R}^{1 \times 3}$, and $k_{ip}$, $k_{iv}$, $k_{ia}$ are the position, velocity and acceleration control gains of the distributed controller of vehicle i, respectively. Let $x = [x_1^T, x_2^T, \cdots, x_N^T]^T$ and $\tilde{A} = [A_{10}^T, A_{20}^T, \cdots, A_{N0}^T]^T$. Substituting the control protocol (9) into the system (6) gives the following centralized form:

$$\begin{bmatrix} \dot{x}_0 \\ \dot{x} \end{bmatrix} = \begin{bmatrix} A_0 & 0 \\ \tilde{A} + B_D K_D \delta & A \end{bmatrix} \begin{bmatrix} x_0 \\ x \end{bmatrix} + \begin{bmatrix} B_0 \\ 0 \end{bmatrix} u_0 \tag{10}$$

where $A = A_D - B_D K_D (L_W + \Delta)$, $A_D = \mathrm{diag}(A_1, \cdots, A_N)$, $B_D = \mathrm{diag}(B_1, \cdots, B_N)$, $K_D = \mathrm{diag}(K_1, \cdots, K_N)$, $\Delta = \mathrm{diag}(D_1, \cdots, D_N)$ and $\delta = [D_1^T, \cdots, D_N^T]^T$.

Next, we perform a linear transformation on the system (10). Based on the general directed graph $\bar{G}$ with the above assumptions, let the edges of the spanning tree be $(g_i, i) \in \bar{E}$, $i \in V$; that is, for each non-root node $i \in V$, $g_i$ is its parent node, and $g_1 = 0$ is the root node. Let $e_{g_i, i}$ be the incidence vector of the edge $(g_i, i)$, that is, its $g_i$-th component is 1, its i-th component is −1, and the remaining components are 0. Let $P_0 = [e_{g_1,1}, e_{g_2,2}, \cdots, e_{g_N,N}]$ be the incidence matrix of a directed spanning tree. Now consider a virtual directed spanning tree whose edges are $(i-1, i)$, $i \in V$, so that the incidence matrix is

$$P_0 = \begin{bmatrix} p^T \\ \tilde{P}_0^T \end{bmatrix} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -1 & 1 & \cdots & \vdots \\ 0 & -1 & \ddots & 0 \\ \vdots & \vdots & \ddots & 1 \\ 0 & 0 & \cdots & -1 \end{bmatrix} \tag{11}$$

The following transformation matrix is constructed:

$$P = \begin{bmatrix} P_0^T \\ e_1^T \end{bmatrix} \otimes I_3 = \begin{bmatrix} p & \tilde{P}_0 \\ 1 & 0 \end{bmatrix} \otimes I_3 \tag{12}$$
where $e_1 = [1, 0, \cdots, 0]^T \in \mathbb{R}^{N+1}$, $\otimes$ denotes the Kronecker product and $I_3$ is the third-order identity matrix. Note that $\tilde{P}_0^T$ is the basic incidence matrix corresponding to the root node of the incidence matrix $P_0$. Therefore, $\tilde{P}_0^T$ and $\tilde{P}_0$ are invertible matrices. Let $1_N$ denote the N-dimensional vector whose components are all 1. From $P_0^T 1_{N+1} = 0$ we obtain $p + \tilde{P}_0 1_N = 0$, so that the inverse matrix of $P$ is

$$P^{-1} = \begin{bmatrix} \bar{P}_0 & 1_{N+1} \end{bmatrix} \otimes I_3 = \begin{bmatrix} 0 & 1 \\ \tilde{P}_0^{-1} & 1_N \end{bmatrix} \otimes I_3 \tag{13}$$
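Therefore, by constructing the state linear transformation

$$z = P \begin{bmatrix} x_0 \\ x \end{bmatrix} \tag{14}$$

the system (10) can be equivalently transformed into the following system:

$$\begin{bmatrix} \dot{y} \\ \dot{x}_0 \end{bmatrix} = \begin{bmatrix} \bar{A} & \hat{A} \\ 0 & A_0 \end{bmatrix} \begin{bmatrix} y \\ x_0 \end{bmatrix} + \begin{bmatrix} \hat{B} \\ B_0 \end{bmatrix} u_0 \tag{15}$$

where

$$\begin{aligned} y &= \left[ (x_0 - x_1)^T, (x_1 - x_2)^T, \cdots, (x_{N-1} - x_N)^T \right]^T, \\ \bar{A} &= \left( \tilde{P}_0 \otimes I_3 \right) A \left( \tilde{P}_0^{-1} \otimes I_3 \right), \\ \hat{A} &= \left[ (\hat{A}_0 - \hat{A}_1)^T, (\hat{A}_1 - \hat{A}_2)^T, \cdots, (\hat{A}_{N-1} - \hat{A}_N)^T \right]^T, \\ \hat{B} &= \left[ B_0^T, 0, \cdots, 0 \right]^T, \quad \hat{A}_0 = A_0, \quad \hat{A}_i = A_i + A_{i0}, \ i \in V \end{aligned}$$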
The detailed derivation of these expressions is omitted here and can be found in [13]. Through the linear transformation (14), the leader-following consensus problem of the platoon is transformed into a stability problem with respect to the partial variable y (y-stability) of the system (15).
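The effect of the transformation (14) on the states can be sketched as follows for the virtual chain with edges (i − 1, i): the transformed vector stacks the differences between consecutive vehicles, and the inverse map recovers the original states by cumulative subtraction, which is what the block form of $P^{-1}$ in (13) expresses. The function names are hypothetical, and the Kronecker product with $I_3$ is applied implicitly by acting on each 3-vector component-wise.

```python
def chain_incidence_T(n):
    """Rows of P_0^T for edges (i-1, i): +1 in column i-1, -1 in column i."""
    return [[1 if c == i - 1 else -1 if c == i else 0 for c in range(n + 1)]
            for i in range(1, n + 1)]


def transform(states):
    """states = [x_0, x_1, ..., x_N]; returns (y, x_0) as in z = P [x_0; x]."""
    n = len(states) - 1
    rows = chain_incidence_T(n)
    y = [[sum(rows[r][c] * states[c][k] for c in range(n + 1)) for k in range(3)]
         for r in range(n)]
    return y, list(states[0])


def inverse_transform(y, x0):
    """Recover [x_0, ..., x_N] from (y, x_0): x_i = x_{i-1} - y_i."""
    states = [list(x0)]
    for y_i in y:
        states.append([states[-1][k] - y_i[k] for k in range(3)])
    return states


demo_states = [[0.0, 0.0, 0.0], [1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
y_demo, x0_demo = transform(demo_states)
```

When all vehicles share the same state, every block of y is zero, which is why consensus of (6) corresponds to convergence of y alone.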
4 Stability Analysis

Next, we discuss the y-stability of the system (15) in the following two cases: (1) $u_0 \equiv 0$; (2) $u_0 \ne 0$.
4.1 $u_0 \equiv 0$

First, consider the leader vehicle traveling at a constant speed, that is, $u_0 \equiv 0$. In this case, (15) can be written as

$$\begin{aligned} \dot{y} &= \bar{A} y + \hat{A} x_0 \\ \dot{x}_0 &= A_0 x_0 \end{aligned} \tag{16}$$

Lemma 1 Suppose $u_0 \equiv 0$. The system (6) can asymptotically achieve leader-following state consensus via the designed vehicle control protocol (9) if and only if the zero equilibrium state of the system (16) is globally asymptotically y-stable.

According to partial stability theory, the zero equilibrium state of the system (16) is globally asymptotically y-stable if and only if the following auxiliary system is asymptotically stable [13, 14]:

$$\dot{\zeta} = M \zeta, \qquad M = \begin{bmatrix} \bar{A} & \hat{A} L_3 \\ 0 & L_1 A_0 L_3 \end{bmatrix} \tag{17}$$
where $L_1 \in \mathbb{R}^{h \times 3}$ and $L_3 \in \mathbb{R}^{3 \times h}$ are auxiliary matrices; the algorithm for constructing them can be found in [13].

Theorem 1 Suppose $u_0 \equiv 0$. The vehicle platoon system (2) with a given directed topology $\bar{G}$ can accurately achieve the control objective (4) via the designed PCTHS (3) and vehicle control protocol (9) if and only if the matrices $A$ and $L_1 A_0 L_3$ are Hurwitz stable.

Proof From the above analysis, the necessary and sufficient condition for the vehicle platoon system (2) to achieve the control objective (4) under the designed PCTHS (3) and vehicle control protocol (9) is that the matrix $M$ in the auxiliary system (17) is Hurwitz stable. Furthermore, since $M$ is an upper triangular block matrix, its Hurwitz stability is equivalent to the Hurwitz stability of its diagonal blocks $\bar{A}$ and $L_1 A_0 L_3$. Finally, since $\bar{A} = (\tilde{P}_0 \otimes I_3) A (\tilde{P}_0^{-1} \otimes I_3)$ is similar to $A$, the Hurwitz stability of $\bar{A}$ is equivalent to the Hurwitz stability of $A$. This completes the proof.

For $u_0 \equiv 0$, consider further the case in which the vehicle dynamics model (2) is homogeneous, that is, $\tau_i = \tau_0$, $i = 1, \cdots, N$; then $\hat{A} = 0$. Combined with the above analysis, the following corollary is obtained.

Corollary 1 Suppose $u_0 \equiv 0$, and the vehicle platoon is homogeneous. The vehicle platoon system (2) with a given directed topology $\bar{G}$ can accurately achieve the control objective (4) via the designed PCTHS (3) and vehicle control protocol (9) if and only if the matrix $A$ is Hurwitz stable.
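To give a feel for the Hurwitz condition on $A$, consider the simplest special case, worked out here as an illustration rather than taken from the paper: a single follower (N = 1) that listens only to the leader, so that $L_W = 0$ and $A = A_1 - B_1 K_1 D_1$ with $D_1 = \mathrm{diag}(w_p, w_v, w_a)$. Expanding $\det(sI - A)$ gives the cubic $s^3 + \gamma s^2 + (\beta + h\alpha) s + \alpha$ with $\alpha = k_p w_p / \tau$, $\beta = k_v w_v / \tau$ and $\gamma = (1 + k_a w_a)/\tau$, and the Routh-Hurwitz criterion for a cubic then decides stability. The sketch covers only the matrix $A$; the condition on $L_1 A_0 L_3$ is not examined. Function names and numerical values are assumptions.

```python
def closed_loop_char_poly(tau, h, gains, weights):
    """Coefficients (c2, c1, c0) of s^3 + c2 s^2 + c1 s + c0 for N = 1."""
    kp, kv, ka = gains
    wp, wv, wa = weights
    alpha = kp * wp / tau
    beta = kv * wv / tau
    gamma = (1.0 + ka * wa) / tau
    return gamma, beta + h * alpha, alpha


def is_hurwitz_cubic(c2, c1, c0):
    """Routh-Hurwitz test for a monic cubic: positive coefficients, c2*c1 > c0."""
    return c2 > 0 and c1 > 0 and c0 > 0 and c2 * c1 > c0


# Illustrative gains and parameters (not taken from the paper).
stable_coeffs = closed_loop_char_poly(tau=0.5, h=0.5, gains=(1.0, 2.0, 1.0),
                                      weights=(1.0, 1.0, 1.0))
unstable_coeffs = closed_loop_char_poly(tau=1.0, h=0.0, gains=(10.0, 0.0, 0.0),
                                        weights=(1.0, 1.0, 1.0))
```

In this special case the time headway h enters the middle coefficient with a positive sign, so a larger h makes the inequality $c_2 c_1 > c_0$ easier to satisfy for fixed positive gains.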
4.2 $u_0 \neq 0$
Secondly, consider the case where the leader vehicle accelerates or decelerates, that is, $u_0 \neq 0$. In this case, the input $u_0$ of the leader can be regarded as an unknown and bounded perturbation.
Definition 1 Suppose $u_0 \neq 0$. The leader-following multi-vehicle system (6) is said to achieve global state consensus in the sense of input-to-state stability via the vehicle control protocol (9) if there exist a KL-class function $\beta: \mathbb{R}^+ \times \mathbb{R}^+ \to \mathbb{R}$ and a K-class function $\gamma: \mathbb{R}^+ \to \mathbb{R}$ such that, for any $x(0) \in \mathbb{R}^{3N}$ and $x_0(0) \in \mathbb{R}^3$,
$$\|x(t) - 1_N \otimes x_0(t)\| \le \beta\left(\|x(0) - 1_N \otimes x_0(0)\|, t\right) + \gamma\left(\|u_0\|_\infty\right) \tag{18}$$
where $u_0$ is the unknown bounded input signal of the leader. According to Definition 1, if the input of the leader vehicle is not zero, the consensus of the system (6) cannot be reached exactly, and a bounded error related to the leader input remains. Since the leader input $u_0$ is unknown, the stability of (15) is considered in the input-to-state sense. The first equation of (15) can then be isolated and written as
$$\dot{y} = \bar{A}y + \bar{B}\eta \tag{19}$$
where $\eta = \begin{bmatrix} x_0 \\ u_0 \end{bmatrix}$ and $\bar{B} = \begin{bmatrix} \hat{A} & \hat{B} \end{bmatrix}$.
Therefore, based on input-to-state stability and combined with the conclusion for the case $u_0 \equiv 0$, the following results are obtained.
Theorem 2 Suppose $u_0 \neq 0$. The vehicle platoon system (2) with a given directed topology $\bar{G}$ can achieve the control objective (4) in the sense of input-to-state stability via the designed PCTHS (3) and vehicle control protocol (9) if and only if the matrices $A$ and $L_1 A_0 L_3$ are Hurwitz stable. Further, the vehicle tracking error can be estimated as
$$\|y(t)\| \le c e^{-\theta t}\|y(0)\| + \|\bar{B}\|\left(\int_0^\infty \left\|e^{\bar{A}\tau}\right\| d\tau\right)\|\eta\|_\infty \tag{20}$$
where $c$ and $\theta$ are positive constants.
On the basis of $u_0 \neq 0$, suppose the vehicle dynamics model (2) is homogeneous; then the following corollary is obtained.
Corollary 2 Suppose $u_0 \neq 0$ and the vehicle platoon is homogeneous. The vehicle platoon system (2) with a given directed topology $\bar{G}$ can achieve the control objective (4) in the sense of input-to-state stability via the designed PCTHS (3) and vehicle control protocol (9) if and only if the matrix $A$ is Hurwitz stable. Further, the vehicle tracking error can be estimated as
$$\|y(t)\| \le c e^{-\theta t}\|y(0)\| + \|\hat{B}\|\left(\int_0^\infty \left\|e^{\bar{A}\tau}\right\| d\tau\right)\|u_0\|_\infty \tag{21}$$
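To make the form of the input-to-state bound concrete, the following minimal sketch checks a scalar analogue of the estimate: for y' = -a*y + b*eta(t) with a > 0, the bound reads |y(t)| <= exp(-a*t)|y(0)| + (|b|/a)*sup|eta|, since the integral of exp(-a*tau) over [0, inf) equals 1/a. The scalar system, the parameter values and the function names are illustrative assumptions, not the platoon model of the paper.

```python
import math


def simulate_scalar_iss(a, b, y0, eta, t_end, dt=1e-3):
    """Forward-Euler simulation of y' = -a*y + b*eta(t); returns [(t, y), ...]."""
    steps = int(round(t_end / dt))
    y = y0
    trajectory = [(0.0, y0)]
    for k in range(steps):
        y += dt * (-a * y + b * eta(k * dt))
        trajectory.append(((k + 1) * dt, y))
    return trajectory


def iss_bound(a, b, y0, eta_sup, t):
    """Scalar analogue of the ISS estimate with c = 1 and theta = a."""
    return math.exp(-a * t) * abs(y0) + (abs(b) / a) * eta_sup


if __name__ == "__main__":
    # Hypothetical numbers: decay rate 2, unit input gain, bounded sinusoidal input.
    traj = simulate_scalar_iss(2.0, 1.0, 5.0, lambda t: math.sin(3.0 * t), 5.0)
    worst_margin = min(iss_bound(2.0, 1.0, 5.0, 1.0, t) - abs(y) for t, y in traj)
    print("smallest margin between bound and |y|:", worst_margin)
```

The transient term decays with the initial error, while the second term stays proportional to the size of the disturbance, which is why a nonzero leader input leaves a bounded rather than vanishing tracking error.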
5 Numerical Simulation
In this section, the proposed PCTHS and the CTHS in [9] are simulated and compared under the TPF communication topology to verify the effectiveness of the proposed controller and the superiority of PCTHS. Six vehicles (one leader vehicle and five following vehicles) are selected as the experimental objects. When the communication weight matrices between vehicles satisfy $W_{ij} \neq 0$ and $D_i \neq 0$, they are set uniformly to $W_{ij} = D_i = \mathrm{diag}(1, 0.9, 0.8)$. The controller parameters and vehicle structure parameters are shown in Table 1, and it can be verified that the designed parameters satisfy the theorems given above. To save space, the design method of the control gains is not repeated here and can be found in [13]. The initial position and speed of the leader vehicle are set to $p_0 = 0$ m and $v_0 = 10$ m/s, the initial positions and speeds of the following vehicles are set to $p(0) = [-12, -20, -33, -41, -53]$ m and $v(0) = [12, 9, 13, 11, 14]$ m/s, and the initial acceleration of all vehicles is set to $a_i = 0$ m/s$^2$, $i \in \bar{V}$.

Table 1 Controller parameters
Parameter | Value | Unit
$\tau_i$ | $\tau_0 = 0.5$, $\tau_1 = 0.45$, $\tau_2 = 0.41$, $\tau_3 = 0.35$, $\tau_4 = 0.32$, $\tau_5 = 0.29$ | s
$k_{ip}$ | $k_{1p} = 3.6$, $k_{2p} = 3.0$, $k_{3p} = 2.4$, $k_{4p} = 2.0$, $k_{5p} = 1.5$ | −
$k_{iv}$ | 0.9 | −
$k_{ia}$ | 2.3 | −
$l_i$ | 4 | m
$s$ | 1 | m
$h_0$ | 0.5 | s
$h$ | 1 | s

Fig. 3 State trajectory curve of vehicle platoon under TPF topology: (a) CTHS, (b) PCTHS
Fig. 4 State tracking error trajectory curve of vehicle platoon under TPF topology: (a) Spacing error, (b) Velocity error, (c) Acceleration error

Let the control input of the leader be
$$u_0(t) = \begin{cases} 1.5, & 10\,\mathrm{s} \le t \le 20\,\mathrm{s}, \\ -2, & 35\,\mathrm{s} \le t \le 45\,\mathrm{s}, \\ 0, & \text{otherwise}. \end{cases} \tag{22}$$
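As a rough illustration of the leader input in (22), the sketch below encodes the piecewise profile and integrates a leader trajectory with forward Euler. The third-order model p' = v, v' = a, tau0*a' = -a + u0 is an assumption made here for illustration (a common platoon vehicle model) and may differ in detail from model (2) of the paper; the function names are likewise hypothetical.

```python
def leader_input(t):
    """Piecewise-constant leader input u0(t) of (22), with t in seconds."""
    if 10.0 <= t <= 20.0:
        return 1.5
    if 35.0 <= t <= 45.0:
        return -2.0
    return 0.0


def simulate_leader(tau0=0.5, p0=0.0, v0=10.0, t_end=60.0, dt=0.01):
    """Forward-Euler integration of the assumed leader model.

    Assumed dynamics (not taken from the source): p' = v, v' = a,
    tau0 * a' = -a + u0(t). Returns a list of (t, p, v, a) samples.
    """
    p, v, a = p0, v0, 0.0
    samples = []
    steps = int(round(t_end / dt))
    for k in range(steps):
        t = k * dt
        samples.append((t, p, v, a))
        u = leader_input(t)
        # Simultaneous update so that each state uses the values at time t.
        p, v, a = p + dt * v, v + dt * a, a + dt * (u - a) / tau0
    samples.append((steps * dt, p, v, a))
    return samples


if __name__ == "__main__":
    t, p, v, a = simulate_leader()[-1]
    print(f"t = {t:.1f} s, position = {p:.1f} m, speed = {v:.2f} m/s")
```

Under this assumed model the leader first speeds up for ten seconds and later brakes for ten seconds, so the followers are exercised by both an acceleration and a deceleration disturbance.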
Based on these two spacing strategies, the position, velocity, acceleration and spacing trajectories of the vehicle platoon system under the action of the controller (9) are shown in Fig. 3. Furthermore, the state tracking error trajectory of the PCTHS-based vehicle platoon system is shown in Fig. 4. The strategy comparison in Fig. 3 shows that the platoon under PCTHS better maintains steady spacing, keeps vehicle acceleration and deceleration smooth, and effectively suppresses the speed fluctuation of vehicles, thereby improving ride comfort and vehicle fuel efficiency. Figure 4 shows that while the leader accelerates and decelerates, the tracking errors of the platoon are kept in a small and stable range, which ensures the driving safety of the vehicles in the platoon.
6 Conclusion
In this paper, an improved predictive constant time headway strategy has first been proposed for the problem of vehicle platoon control based on heterogeneous models. Secondly, combined with leader-following consensus, a general structured vehicle platoon control protocol has been proposed. Then, the conditions for the stability of the vehicle platoon have been analyzed and derived by using linear transformation and partial stability theory. In the future, the problem of vehicle platoon control in a communication delay environment will be further considered.
Acknowledgements This work was supported in part by the Natural Science Foundation of Beijing Municipality under Grant 4232041 and the National Natural Science Foundation of China under Grant 62273014.
References
1. Rajamani, R., Shladover, S.E.: Experimental comparative study of autonomous and cooperative vehicle-follower control systems. Transp. Res. Part C: Emerg. Technol. 9, 15–31 (2001). https://doi.org/10.1016/S0968-090X(00)00021-8
2. Dey, K.C., Yan, L., Wang, X., et al.: A review of communication, driver characteristics, and controls aspects of cooperative adaptive cruise control (CACC). IEEE Trans. Intell. Transp. Syst. 17, 491–509 (2016). https://doi.org/10.1109/TITS.2015.2483063
3. Brunner, J.S., Makridis, M.A., Kouvelas, A.: Comparing the observable response times of ACC and CACC systems. IEEE Trans. Intell. Transp. Syst. 23, 19299–19308 (2022). https://doi.org/10.1109/TITS.2022.3165648
4. Bernardo, M.D., Salvi, A., Santini, S.: Distributed consensus strategy for platooning of vehicles in the presence of time-varying heterogeneous communication delays. IEEE Trans. Intell. Transp. Syst. 16, 102–112 (2014). https://doi.org/10.1109/TITS.2014.2328439
5. Jia, D., Ngoduy, D.: Platoon based cooperative driving model with consideration of realistic inter-vehicle communication. Transp. Res. Part C: Emerg. Technol. 68, 245–264 (2016). https://doi.org/10.1016/j.trc.2016.04.008
6. Zheng, Y., Li, S.E., Wang, J., Cao, D., Li, K.: Stability and scalability of homogeneous vehicular platoon: study on the influence of information flow topologies. IEEE Trans. Intell. Transp. Syst. 17, 14–26 (2015). https://doi.org/10.1109/TITS.2015.2402153
7. Pirani, M., Baldi, S., Johansson, K.H.: Impact of network topology on the resilience of vehicle platoons. IEEE Trans. Intell. Transp. Syst. 23, 15166–15177 (2022). https://doi.org/10.1109/TITS.2021.3137826
8. Chen, Y., Zhang, G., Ge, Y.: Formation control of vehicles using leader-following consensus. In: 2013 16th International IEEE Conference on Intelligent Transportation Systems (ITSC 2013), pp. 2071–2075 (2013). https://doi.org/10.1109/ITSC.2013.6728534
9. Zheng, Y., Bian, Y., Li, S., Li, S.E.: Cooperative control of heterogeneous connected vehicles with directed acyclic interactions. IEEE Intell. Transp. Syst. Mag. 13, 127–141 (2021). https://doi.org/10.1109/MITS.2018.2889654
10. Swaroop, D., Rajagopal, K.R.: A review of constant time headway policy for automatic vehicle following. In: 2001 IEEE Intelligent Transportation Systems, Proceedings (Cat. No. 01TH8585), pp. 65–69 (2001). https://doi.org/10.1109/ITSC.2001.948631
11. Li, Y., He, C., Zhu, H., Zheng, T.: Nonlinear longitudinal control for heterogeneous connected vehicle platoon in the presence of communication delays. Acta Automatica Sinica 47, 2841–2856 (2021). https://doi.org/10.16383/j.aas.c190442
12. Yu, G., Wong, P.K., Huang, W., et al.: Distributed adaptive consensus protocol for connected vehicle platoon with heterogeneous time-varying delays and switching topologies. IEEE Trans. Intell. Transp. Syst. 23, 17620–17631 (2022). https://doi.org/10.1109/TITS.2022.3170437
13. Chen, Y., Xu, G., Zhan, J.: Leader-following consensus of heterogeneous linear multi-agent systems: new results based on linear transformation method. Trans. Inst. Measur. Control 44, 1473–1483 (2022). https://doi.org/10.1177/01423312211058281
14. Vorotnikov, V.I.: Partial Stability and Control (vol. 45, p. 2119). Springer, Berlin (1998). https://doi.org/10.1109/TAC.2000.887711
Improved Northern Goshawk Optimization Method for Intercepting Maneuvering Targets with Pulse Correction Projectiles
Yuming Zhang, Jian Fu, Xin Lei, Yifan Yang, and Hongyu Gao
Abstract To intercept a maneuvering target with a pulse correction projectile, an intelligent algorithm is used to optimize the pulse control parameters according to the discrete nature of the pulse action. Under the condition that the predicted trajectory of the target is known, the minimum miss distance is taken as the main objective and the minimum number of pulse engines as the secondary objective. The optimization model is established under certain constraints with pulse start time, ignition time interval, ignition angle and number of pulse engine ignitions as the pulse control parameters. The pulse control parameters are optimized with the improved Northern Goshawk optimization algorithm.
Keywords Pulse correction projectile · Discretization · Optimization design · Northern goshawk optimization
1 Introduction
Controlled projectiles are usually more accurate than conventional projectiles, because uncontrolled projectiles can hardly avoid various errors and perturbations. As a controlled projectile with high precision, the pulse correction projectile has been widely used. It has the advantages of simple structure, low cost and rapid response. Micro pulse engines are arranged around the projectile body, and the attitude and velocity of the projectile are changed by the thrust generated when a pulse engine ignites, so as to correct the deviation. Scholars at home and abroad have done a lot of research in related fields [1–5].
Y. Zhang · J. Fu (B) · X. Lei · Y. Yang · H. Gao School of Energy and Power Engineering, Nanjing University of Science and Technology, 210094 Nanjing, People’s Republic of China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_57
Reference [6] proposes a correction strategy for pulse ignition based on angle deviation. Reference [7] studies the motion characteristics of a projectile under pulse action, providing a reference for the stability analysis of pulse correction projectiles. Reference [8] proposes a guidance method for terminal correction based on miss distance and azimuth deviation. In [9], aiming at the minimum number of pulse engines used and the minimum miss distance, an improved particle swarm optimization algorithm is used to optimize the pulse interval, and better results are obtained under wind interference. In [10], pulse frequency modulation control is used to make a rocket track the programmed trajectory.
In previous studies, the targets of pulse correction projectiles were mostly fixed. On this basis, this paper proposes an improved Northern Goshawk Optimization algorithm to optimize the design of the pulse control parameters. The goal is to minimize the miss distance and the number of pulse engines used, and the predicted or detected trajectory of the target is introduced into the objective function to obtain pulse control parameters that meet the conditions for hitting maneuvering targets. This addresses the problem of intercepting maneuvering targets with a pulse correction projectile.
The organization of this paper is as follows. Section 2 establishes the optimization model of the pulse correction projectile. An improved Northern Goshawk optimization algorithm is proposed in Sect. 3. In Sect. 4, simulation results are presented. Finally, the work is concluded in Sect. 5.
2 Optimization Model of Pulse Correction Projectile The pulse correction projectile studied in this paper changes the attitude and velocity of the projectile through the control force and control moment generated by the side pulse engine ignition around the projectile body, so as to realize the trajectory correction.
2.1 Kinematics and Dynamics Equations
The pulse control force and control moment must be introduced when establishing the trajectory model of the pulse correction projectile. The pulse control force is expressed in the projectile body coordinate system as
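$$F_B = \begin{bmatrix} F_{xB} \\ F_{yB} \\ F_{zB} \end{bmatrix} = \begin{bmatrix} 0 \\ -F_p \cos(\gamma_{pk} + \gamma) \\ -F_p \sin(\gamma_{pk} + \gamma) \end{bmatrix} \tag{1}$$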
where $\gamma_{pk}$ is the installation angle of the k-th pulse engine and $\gamma$ is the current roll angle of the projectile. The projection of the pulse force in the ballistic coordinate system is expressed as
$$F_2^{ctr} = A_{VN} A_{NA} A_{AB} F_B = \begin{bmatrix} F_{x2}^{ctr} \\ F_{y2}^{ctr} \\ F_{z2}^{ctr} \end{bmatrix} \tag{2}$$
where $A_{VN}$, $A_{NA}$ and $A_{AB}$ are, respectively, the conversion matrices from the reference coordinate system to the velocity coordinate system, from the projectile axis coordinate system to the reference coordinate system, and from the projectile body coordinate system to the projectile axis coordinate system; $\theta_a$ is the velocity elevation angle, $\psi_2$ is the velocity direction angle, $\phi_a$ is the projectile axis elevation angle, and $\phi_2$ is the projectile axis direction angle. The projection of the pulse control force in the projectile axis coordinate system is
$$F_A = A_{AB} F_B = \begin{bmatrix} F_{xA} \\ F_{yA} \\ F_{zA} \end{bmatrix} = \begin{bmatrix} F_{xB} \\ F_{yB} \cos\gamma - F_{zB} \sin\gamma \\ F_{yB} \sin\gamma + F_{zB} \cos\gamma \end{bmatrix} \tag{3}$$
The projection of the pulse control torque in the projectile axis coordinate system is
$$M_A^{ctr} = \begin{bmatrix} M_\xi^{ctr} \\ M_\eta^{ctr} \\ M_\zeta^{ctr} \end{bmatrix} = \begin{bmatrix} 0 \\ -L F_{zA} \\ L F_{yA} \end{bmatrix} \tag{4}$$
where $L$ is the pulse moment arm. Combining the pulse force components (2) in the ballistic coordinate system and the pulse moment (4) in the projectile axis coordinate system with the conventional 6-degree-of-freedom exterior ballistic equations, the 6-degree-of-freedom equations of the pulse correction projectile can be written as
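$$\begin{cases}
\dfrac{dv}{dt} = \dfrac{1}{m}\left(F_{x2} + F_{x2}^{ctr}\right) \\[4pt]
\dfrac{d\theta_a}{dt} = \dfrac{1}{mv\cos\psi_2}\left(F_{y2} + F_{y2}^{ctr}\right) \\[4pt]
\dfrac{d\psi_2}{dt} = \dfrac{F_{z2} + F_{z2}^{ctr}}{mv} \\[4pt]
\dfrac{d\omega_\xi}{dt} = \dfrac{1}{C}\left(M_\xi + M_\xi^{ctr}\right) \\[4pt]
\dfrac{d\omega_\eta}{dt} = \dfrac{1}{A}\left(M_\eta + M_\eta^{ctr}\right) - \dfrac{C}{A}\omega_\xi\omega_\zeta + \omega_\zeta^2\tan\phi_2 \\[4pt]
\dfrac{d\omega_\zeta}{dt} = \dfrac{1}{A}\left(M_\zeta + M_\zeta^{ctr}\right) + \dfrac{C}{A}\omega_\xi\omega_\eta - \omega_\eta\omega_\zeta\tan\phi_2 \\[4pt]
\dfrac{d\phi_a}{dt} = \dfrac{\omega_\zeta}{\cos\phi_2} \\[4pt]
\dfrac{d\phi_2}{dt} = -\omega_\eta \\[4pt]
\dfrac{d\gamma}{dt} = \omega_\xi - \omega_\zeta\tan\phi_2 \\[4pt]
\dfrac{dx}{dt} = v\cos\psi_2\cos\theta_a \\[4pt]
\dfrac{dy}{dt} = v\cos\psi_2\sin\theta_a \\[4pt]
\dfrac{dz}{dt} = v\sin\psi_2
\end{cases} \tag{5}$$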
where $F_{x2}$, $F_{y2}$, $F_{z2}$ are the three components, in the ballistic coordinate system, of the resultant force on the projectile without the pulse action; $\omega_\xi$, $\omega_\eta$, $\omega_\zeta$ are the angular velocities about the three axes of the projectile axis coordinate system; and $M_\xi$, $M_\eta$, $M_\zeta$ are the components, on the three axes of the projectile axis coordinate system, of the resultant moment on the projectile without the pulse action.
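The force and moment projections in (1), (3) and (4) can be sketched as three small functions. This is a minimal illustration, assuming the trigonometric arguments in (1) are the sum of the installation angle and the roll angle; the function names are hypothetical, and the numerical values in the usage example (467 N pulse force, 0.035 m moment arm) are the ones quoted later in Sect. 4.1.

```python
import math


def pulse_force_body(f_p, gamma_pk, gamma):
    """Pulse force in the projectile body frame, following (1).

    f_p: pulse force magnitude [N]; gamma_pk: installation angle of the
    k-th engine [rad]; gamma: current roll angle [rad].
    """
    angle = gamma_pk + gamma  # assumed reading of the argument in (1)
    return (0.0, -f_p * math.cos(angle), -f_p * math.sin(angle))


def body_to_axis(force_body, gamma):
    """Rotate a body-frame force into the projectile axis frame, following (3)."""
    fx, fy, fz = force_body
    return (
        fx,
        fy * math.cos(gamma) - fz * math.sin(gamma),
        fy * math.sin(gamma) + fz * math.cos(gamma),
    )


def pulse_moment_axis(force_axis, arm):
    """Pulse control moment in the projectile axis frame, following (4)."""
    _, fy, fz = force_axis
    return (0.0, -arm * fz, arm * fy)


if __name__ == "__main__":
    f_b = pulse_force_body(467.0, math.radians(45.0), math.radians(30.0))
    f_a = body_to_axis(f_b, math.radians(30.0))
    print("F_A =", f_a, "M_A =", pulse_moment_axis(f_a, 0.035))
```

Because (3) is a pure rotation about the projectile axis, the lateral force magnitude is unchanged by the frame change; only its split between the two lateral axes depends on the roll angle.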
2.2 Design of Pulse Control Parameters and Control Strategies
If the pulse parameters include the ignition time and ignition angle of every engine, the velocity increments generated by some engines may partly cancel each other and reduce the correction amount. In addition, setting the ignition time of each engine separately leads to too many design dimensions, a large amount of optimization calculation, and the possibility of a very short interval between two ignitions, which causes violent oscillation of the angle of attack and is not conducive to flight stability. Therefore, pulse start time, pulse ignition interval, pulse ignition angle and pulse ignition number are selected as the pulse control parameters, and a serial ignition strategy is adopted. According to the designed control strategy, a group of control parameters can be expressed as $(t_b, t_q, \gamma_p, n_p)$, which denote, in order, the pulse start time, the time interval, the pulse angle and the pulse ignition number. This paper optimizes these pulse control parameters.
2.3 Parameter Constraints
Because of practical limitations and requirements, constraints are imposed on the design variables and the optimization process, so that the design is optimal while meeting the actual requirements.
(1) Due to the structural space and cost constraints of the projectile body, the number of pulse engines that can be arranged on the projectile body is limited. The number of pulse engines fired $n$ should not exceed the number of pulse engines $n_{\max}$:
$$n \le n_{\max} \tag{6}$$
(2) Since the most direct effect of a pulse ignition is a change in the velocity vector of the projectile, the pulse start time determines how long this velocity change acts, which directly affects the correction amount. Therefore, the pulse start time $t_b$ is selected as one of the optimization variables and must satisfy
$$t_{b\min} \le t_b \le t_{b\max} \tag{7}$$
where $t_{b\min}$ and $t_{b\max}$ are the minimum and maximum pulse start times, respectively.
(3) The ignition interval $t_q$ between two pulse engines should be within a certain range to ensure the stability of the projectile:
$$t_{q\min} \le t_q \le t_{q\max} \tag{8}$$
(4) The interception of maneuvering targets requires that the miss distance at the same instant satisfy
$$R(t) \le R_{\max}(t) \tag{9}$$
(5) To keep the projectile stable during the whole flight, the angle of attack should not be too large, and should satisfy
$$\delta \le \delta_{\max} \tag{10}$$
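The parameter group of Sect. 2.2 and the box constraints (6)-(8) can be sketched as follows. This is a minimal illustration under stated assumptions: the serial ignition strategy is read here as firing engine k at t_b + k*t_q, the class and function names are hypothetical, and the default limits are the values quoted in Sect. 4.1. Constraints (9) and (10) need a trajectory simulation and are therefore not checked here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PulseParams:
    t_b: float      # pulse start time [s]
    t_q: float      # interval between consecutive ignitions [s]
    gamma_p: float  # pulse ignition angle [rad]
    n_p: int        # number of engines fired


@dataclass(frozen=True)
class Limits:
    n_max: int = 8
    t_b_min: float = 0.01
    t_b_max: float = 1.3
    t_q_min: float = 0.01
    t_q_max: float = 0.2


def ignition_times(params):
    """Ignition instants under the assumed serial strategy t_b + k * t_q."""
    return [params.t_b + k * params.t_q for k in range(params.n_p)]


def satisfies_box_constraints(params, limits=Limits()):
    """Check constraints (6)-(8) only; (9) and (10) require a flight simulation."""
    return (
        0 <= params.n_p <= limits.n_max
        and limits.t_b_min <= params.t_b <= limits.t_b_max
        and limits.t_q_min <= params.t_q <= limits.t_q_max
    )


if __name__ == "__main__":
    candidate = PulseParams(t_b=0.5, t_q=0.1, gamma_p=0.0, n_p=3)
    print(ignition_times(candidate), satisfies_box_constraints(candidate))
```

Describing a whole firing sequence with four numbers keeps the search space small, which is the motivation given in Sect. 2.2 for not optimizing each engine separately.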
2.4 Objective Function
Since the number of pulse engines on the pulse correction projectile is limited, some engines should be reserved for the terminal correction that the projectile may need. The miss distance should therefore be as small as possible while as few pulse engines as possible are used, so both the miss distance and the number of pulse engine ignitions must be optimized. The two objectives are combined in the objective function by weighted summation, which can be written as
$$J = \min\left\{ k_1 \frac{n}{n_{\max}} + k_2 \sqrt{\left(x(t) - x_t(t)\right)^2 + \left(y(t) - y_t(t)\right)^2 + \left(z(t) - z_t(t)\right)^2} \right\} \tag{11}$$
where $k_1$ and $k_2$ are the weight coefficients with $k_1, k_2 \in (0, 1)$ and $k_1 + k_2 = 1$, $(x(t), y(t), z(t))$ is the coordinate of the projectile at time $t$ in the reference coordinate system, and $(x_t(t), y_t(t), z_t(t))$ is the coordinate of the target at time $t$ in the reference coordinate system.
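The weighted objective in (11) can be evaluated with a few lines of code. This is a minimal sketch, assuming the second term is the Euclidean miss distance between projectile and target at the same instant; the function names and the default weights are hypothetical, since the source does not state the values of k1 and k2.

```python
import math


def miss_distance(projectile_xyz, target_xyz):
    """Euclidean distance between projectile and target at the same instant [m]."""
    return math.dist(projectile_xyz, target_xyz)


def objective(n_fired, n_max, projectile_xyz, target_xyz, k1=0.3, k2=0.7):
    """Weighted sum of engine usage and miss distance, following (11).

    The default weights are placeholders chosen for illustration only.
    """
    if not (0.0 < k1 < 1.0 and 0.0 < k2 < 1.0 and abs(k1 + k2 - 1.0) < 1e-12):
        raise ValueError("weights must lie in (0, 1) and sum to 1")
    return k1 * n_fired / n_max + k2 * miss_distance(projectile_xyz, target_xyz)


if __name__ == "__main__":
    # Hypothetical end-of-flight positions in the reference frame [m].
    print(objective(4, 8, (1500.0, 320.0, 2.0), (1500.3, 320.2, 2.1)))
```

Dividing the engine count by n_max keeps the first term in [0, 1], so the weights mainly trade engine usage against miss distance measured in metres.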
3 Improved Northern Goshawk Algorithm The traditional control mode is the continuous control force generated by the actuator, while the pulse control has discrete control forces, so traditional control methods are not applicable. Therefore, the improved Northern Goshawk algorithm is used to optimize the design of pulse control parameters to achieve the interception of maneuvering targets.
3.1 Northern Goshawk Algorithm
The Northern Goshawk optimization algorithm simulates the hunting behavior of the goshawk, which can be divided into two stages. The first is the global search stage, in which prey is identified and attacked. Individuals randomly select prey in the global search phase. At this stage, the prey selection and attack of an individual are described as
$$P_k = X_i, \quad k = 1, 2, \ldots, N, \quad i = 1, 2, \ldots, N \tag{12}$$
$$X_{i,j}^{new1} = \begin{cases} X_{i,j} + r\left(P_{i,j} - I X_{i,j}\right), & F_{P_i} < F_i \\ X_{i,j} + r\left(X_{i,j} - P_{i,j}\right), & F_{P_i} \ge F_i \end{cases} \tag{13}$$
$$X_i = \begin{cases} X_i^{new1}, & F_i^{new1} < F_i \\ X_i, & F_i^{new1} \ge F_i \end{cases} \tag{14}$$
where $N$ is the population size, $P_k$ is the prey chosen by the k-th goshawk, $X_i$ is the position of the i-th goshawk, $X_{i,j}^{new1}$ is the new position of the j-th dimension of the i-th goshawk in the first stage, $X_i^{new1}$ is the new position of the i-th goshawk in the first stage, $F_i$ is the objective function value of the original position of the i-th goshawk in the first stage, $F_i^{new1}$ is the objective function value of the new position of the i-th goshawk in the first stage, $r$ is a random number in the range [0, 1], and $I$ is a random integer equal to 1 or 2. The first phase updates each goshawk position before moving on to the next phase.
The second stage is the local search phase, in which the prey tries to escape and is chased. The behavior at this stage is described as
$$X_{i,j}^{new2} = X_{i,j} + R\,(2r - 1)\,X_{i,j} \tag{15}$$
$$R = 0.02\left(1 - \frac{t}{T}\right) \tag{16}$$
$$X_i = \begin{cases} X_i^{new2}, & F_i^{new2} < F_i \\ X_i, & F_i^{new2} \ge F_i \end{cases} \tag{17}$$
where $T$ is the maximum number of iterations, $t$ is the current iteration number, $X_i^{new2}$ is the new position of the i-th goshawk in the second stage, $X_{i,j}^{new2}$ is the new position of the j-th dimension of the i-th goshawk in the second stage, $F_i$ is the objective function value of the original position of the i-th goshawk in the second stage, and $F_i^{new2}$ is the objective function value of the new position of the i-th goshawk in the second stage. Completing the first and second stages constitutes one full iteration, and the two stages are repeated until the maximum number of iterations is reached.
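The two-stage update in (12)-(17) can be sketched as a short minimizer. This is an illustrative implementation under stated assumptions, not the authors' code: the prey index is drawn uniformly from the whole population, a fresh r and I are drawn for every dimension, and candidates are clipped to the search box (a safeguard added here). The function name and the test function are hypothetical; the default population size and iteration count follow the values quoted in Sect. 4.1.

```python
import random


def ngo_minimize(f, lower, upper, pop_size=23, max_iter=40, seed=0):
    """Minimize f over a box using the two-phase NGO update of (12)-(17)."""
    rng = random.Random(seed)
    dim = len(lower)

    def clip(x):
        # Keep candidates inside the box (added safeguard, not part of (12)-(17)).
        return [min(max(v, lo), hi) for v, lo, hi in zip(x, lower, upper)]

    pop = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
           for _ in range(pop_size)]
    fit = [f(x) for x in pop]
    history = [min(fit)]  # best objective value after each iteration

    def greedy(i, candidate):
        # Greedy acceptance of (14) and (17): keep the candidate only if better.
        candidate = clip(candidate)
        f_candidate = f(candidate)
        if f_candidate < fit[i]:
            pop[i], fit[i] = candidate, f_candidate

    for t in range(1, max_iter + 1):
        # Phase 1: prey selection and attack, (12)-(14).
        for i in range(pop_size):
            k = rng.randrange(pop_size)  # assumed uniform prey choice
            prey, f_prey = pop[k], fit[k]
            candidate = []
            for j in range(dim):
                r = rng.random()
                big_i = rng.choice((1, 2))
                if f_prey < fit[i]:
                    candidate.append(pop[i][j] + r * (prey[j] - big_i * pop[i][j]))
                else:
                    candidate.append(pop[i][j] + r * (pop[i][j] - prey[j]))
            greedy(i, candidate)
        # Phase 2: chase within a shrinking radius, (15)-(17).
        radius = 0.02 * (1 - t / max_iter)
        for i in range(pop_size):
            greedy(i, [x + radius * (2 * rng.random() - 1) * x for x in pop[i]])
        history.append(min(fit))

    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best], history


if __name__ == "__main__":
    sphere = lambda x: sum(v * v for v in x)
    x_best, f_best, _ = ngo_minimize(sphere, [-5.0] * 3, [5.0] * 3)
    print("best value found:", f_best)
```

Because both phases accept a candidate only when it improves the objective, the best value in the population can never get worse from one iteration to the next.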
3.2 Improved Northern Goshawk Algorithm Based on Chaotic Mapping and Levy Flight
Random selection of the initial population may result in an uneven population distribution. Chaotic mapping has the characteristics of ergodicity and randomness, and using chaotic mapping to initialize the population often achieves better results than pseudorandom numbers. Therefore, this paper introduces the "Tent" chaotic map to initialize the population and improve the quality of the initial population. The "Tent" chaotic map is defined as follows
$$X_{n+1} = \begin{cases} X_n / \alpha, & X_n \in [0, \alpha) \\ (1 - X_n) / (1 - \alpha), & X_n \in [\alpha, 1] \end{cases} \tag{18}$$
where $\alpha \in (0, 1)$.
The Levy flight is a random walk named after the French mathematician Paul Levy. Its step size probability distribution is heavy tailed, so large steps occur with relatively high probability during the random walk. This characteristic of the Levy flight can help expand the scope of the local search and keep the algorithm from falling into a local optimum.
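The integral form of the symmetric Levy stable distribution proposed by Paul Levy is
$$\mathrm{Levy}(s) = \frac{1}{\pi}\int_0^{+\infty} e^{-\beta |k|^{\lambda}} \cos(ks)\, dk \tag{19}$$
This integral cannot be solved analytically in general. Here $s$ is the move step and $s_0$ is the starting step. When $s \gg s_0 > 0$,
$$\mathrm{Levy}(s) \approx \frac{\lambda \beta\, \Gamma(\lambda) \sin\left(\frac{\pi\lambda}{2}\right)}{\pi} \cdot \frac{1}{s^{1+\lambda}} \tag{20}$$
When $\beta = 1.5$, the distribution is heavy-tailed and the variance of the Levy flight grows as a power of time, $\sigma^2(t) \sim t^{3-\beta}$, $1 \le \beta \le 3$, so the Levy flight explores more widely than a random step size. In practice, the "Mantegna" method is often used to generate random step sizes obeying the Levy distribution
$$s = \frac{u}{|v|^{1/\beta}} \tag{21}$$
where $u \sim N(0, \sigma^2)$, $v \sim N(0, 1)$, and
$$\sigma = \left[\frac{\Gamma(1+\beta)\sin\left(\frac{\pi\beta}{2}\right)}{\Gamma\left(\frac{1+\beta}{2}\right)\beta\, 2^{\frac{\beta-1}{2}}}\right]^{\frac{1}{\beta}}.$$
When Eq. (21) is introduced into the local search process of the second stage, Eq. (15) becomes
$$X_{i,j}^{new2} = X_{i,j} + R \cdot 2\,\mathrm{Levy}(s) \cdot X_{i,j} \tag{22}$$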
The range of the algorithm in random search is increased to avoid falling into the local optimal situation.
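The two modifications described above, Tent-map initialization (18) and Mantegna-style Levy steps (21)-(22), can be sketched as follows. This is a minimal illustration under stated assumptions: the seed value x0 and the parameter alpha = 0.7 are arbitrary choices, one chaotic sequence is shared across all individuals and dimensions, and the function names are hypothetical.

```python
import math
import random


def tent_sequence(x0, alpha, n):
    """Iterate the Tent map of (18) n times starting from x0 in [0, 1]."""
    values, x = [], x0
    for _ in range(n):
        x = x / alpha if x < alpha else (1.0 - x) / (1.0 - alpha)
        values.append(x)
    return values


def tent_population(pop_size, lower, upper, alpha=0.7, x0=0.37):
    """Scale one chaotic sequence into an initial population inside the box."""
    dim = len(lower)
    chaos = tent_sequence(x0, alpha, pop_size * dim)
    return [[lower[j] + chaos[i * dim + j] * (upper[j] - lower[j])
             for j in range(dim)] for i in range(pop_size)]


def mantegna_sigma(beta):
    """Standard deviation of u in the Mantegna method, as given below (21)."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta, rng):
    """Draw one step s = u / |v|^(1/beta), following (21)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) case v = 0
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)


def levy_local_move(x, radius, beta, rng):
    """Second-stage candidate of (22): X + R * 2 * Levy(s) * X, per dimension."""
    return [xj + radius * 2.0 * levy_step(beta, rng) * xj for xj in x]


if __name__ == "__main__":
    rng = random.Random(1)
    print(tent_population(3, [0.01, 0.01], [1.3, 0.2]))
    print(levy_local_move([0.5, 0.1], 0.02, 1.5, rng))
```

Most Levy steps are small, but the heavy tail occasionally produces a large jump, which is what lets the local search leave a poor neighbourhood.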
4 Numerical Simulation and Result Analysis
This section simulates the interception of maneuvering targets in two modes of motion: mode 1 is a target flying in a straight line at a constant speed of 250 m/s, and mode 2 is an airborne bomb released horizontally at a speed of 250 m/s.
4.1 Simulation Parameters Setting
In this paper, the object of study is a 35 mm tail pulse correction projectile with a weight of 600 g and a length of 0.304 m; the length of the pulse moment arm is 0.035 m. For a tail-stabilized projectile, a pulse engine placed ahead of the center of mass corrects the velocity deflection angle more effectively than one placed behind the center of mass, which gives greater correction capability [7]. The initial conditions of the simulation are shown in Table 1.

Table 1 Initial launch conditions
State name | Initial value (case 1) | Initial value (case 2) | Unit
Velocity | 930 | 930 | m/s
Velocity elevation angle | 13 | 16 | deg
Velocity direction angle | 0 | 0 | deg
Projectile axis elevation angle | 13 | 16 | deg
Projectile axis direction angle | 0 | 0 | deg
Projectile rotational speed | 62.8 | 62.8 | rad/s
Wind speed | 5 | 5 | m/s

The values of the constraints are as follows. The maximum number of engines is $n_{\max} = 8$; the upper and lower limits of the pulse start time are $t_{b\max} = 1.3$ s and $t_{b\min} = 0.01$ s; the upper and lower limits of the ignition interval are $t_{q\max} = 0.2$ s and $t_{q\min} = 0.01$ s; the maximum angle of attack satisfies $\delta_{\max} \le 15^\circ$; and the miss distance satisfies $R_{\max} \le 1$ m. The ignition duration of each engine is 0.003 s and the pulse force is 467 N. The population size of the Northern Goshawk algorithm is 23, and the maximum number of iterations is 40.
4.2 Simulation Results and Analysis
According to the initial conditions in Sect. 4.1, the trajectory for intercepting maneuvering targets is optimized and simulated. The following figures compare the ballistic parameters under different conditions. 'Corrected trajectory' is the pulse-corrected trajectory, 'Target trajectory' is the trajectory of the intercepted target, and 'Natural trajectory' is the original trajectory without pulse correction. As shown in Figs. 1, 2 and 3, the angle of attack increases under each pulse and the resulting oscillation then decays, so the attitude of the projectile remains stable. At the same time, some pulse engines remain unused, ensuring that the pulse correction projectile still has correction capability at the end of the trajectory. As can be seen from Figs. 4, 5, 6 and 7, when the 'natural trajectory' cannot hit the target, the pulse correction effectively corrects it, and the correction remains effective under wind disturbance. The miss distance for Mode 1 is $R_{mode1} = 0.4606$ m, and the miss distance for Mode 2 is $R_{mode2} = 0.4803$ m. In summary, the Northern Goshawk Algorithm can ensure hit accuracy while saving engines.
Fig. 1 Longitudinal angle of attack
Fig. 2 Lateral angle of attack
Fig. 3 Ignition time distribution
Fig. 4 Interception of the target: (a) Mode 1, (b) Mode 2
Fig. 5 Height-range trajectory
Fig. 6 Height-lateral deviation trajectory

The dispersion of the hit points is measured as follows. For the x direction, the sample mean, the sample standard deviation and the intermediate error over $m$ shots are
$$u_x = \sum_{i=1}^{m} x_i / m \tag{23}$$
$$\sigma_x = \sqrt{\sum_{i=1}^{m} (x_i - u_x)^2 / (m - 1)} \tag{24}$$
$$E_{50x} = 0.6745\,\sigma_x \tag{25}$$
$E_{50x}$ is the intermediate error in the x direction. It means that there is a 50% probability that the random variable $x_i$ falls within $E_{50x}$ on each side of the mean $u_x$. $E_{50y}$ and $E_{50z}$ are the intermediate errors in the y and z directions, respectively. For three-dimensional independent normal random variables, such as the hit point of an air shot, the shooting accuracy can be described by the target-centered spherical error probable (SEP) [11]. When the hit point falls within a ball centered on the target with a probability of 50%, the radius of the ball $E_{DR50}$ is the SEP:
$$E_{DR50} = 2.2805\,\sqrt[3]{E_{50x} E_{50y} E_{50z}} \tag{26}$$
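The dispersion statistics in (23)-(26) can be computed directly from a list of simulated hit points. This is a minimal sketch using the sample standard deviation with the m - 1 denominator; the function names and the sample coordinates in the usage example are hypothetical.

```python
import statistics


def intermediate_error(samples):
    """E50 along one axis: 0.6745 times the sample standard deviation, (24)-(25)."""
    return 0.6745 * statistics.stdev(samples)


def spherical_error_probable(xs, ys, zs):
    """SEP radius E_DR50 of (26) from per-axis hit coordinates."""
    product = (intermediate_error(xs)
               * intermediate_error(ys)
               * intermediate_error(zs))
    return 2.2805 * product ** (1.0 / 3.0)


if __name__ == "__main__":
    # Hypothetical hit-point offsets from the target [m].
    xs = [0.21, -0.15, 0.05, 0.32, -0.28, 0.10]
    ys = [-0.12, 0.18, 0.02, -0.25, 0.30, -0.05]
    zs = [0.08, -0.20, 0.15, 0.01, -0.11, 0.22]
    print("SEP [m]:", spherical_error_probable(xs, ys, zs))
```

The cube root in (26) is the geometric mean of the three per-axis intermediate errors, so a single very tight axis cannot hide a large spread along the others as strongly as a plain product would.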
Fig. 7 Range-lateral deviation trajectory
Fig. 8 The distribution of hit points: (a) Mode 1, (b) Mode 2

The optimization effect of the improved Northern Goshawk optimization algorithm on the pulse parameters is verified and analyzed by the Monte Carlo method. The errors of initial velocity, projectile weight, lift coincidence coefficient, drag coincidence coefficient and wind speed are −10 m/s, ±5 g, ±1%, ±1% and ±5 m/s, respectively. Figure 8(a) and (b) show the dispersion of the ballistic hit points after 100 simulated shots. The spread of the hit points in the three axis directions is very small and the hit points are densely clustered. The calculated spherical error probable of the two modes is $SEP_{mode1} = 0.5768$ m and $SEP_{mode2} = 0.5112$ m, respectively. This shows that the pulse control parameters optimized by the improved Northern Goshawk algorithm have a good control effect, realize the correction of the original trajectory and the interception of maneuvering targets, and are robust to disturbances.
5 Conclusion
In this paper, pulse control parameters are designed according to the discrete nature of the pulse action of the pulse correction projectile. Aiming at the minimum number of pulse engines and the minimum miss distance to the maneuvering target, the improved Northern Goshawk optimization algorithm is used to optimize the control parameters. The simulation results show that the design of the pulse control parameters is reasonable and effective, and that designing the pulse control parameters under certain constraints is a workable approach. The improved Northern Goshawk algorithm is used to optimize the pulse control parameters for intercepting maneuvering targets with two motion modes. It converges quickly, gives reliable accuracy and a high density of hit points, meets the accuracy requirements, and is robust to disturbances. The research results can provide a reference for pulse control methods that allow a pulse correction projectile to intercept a maneuvering target.
Acknowledgements This work was supported by the National Science Foundation of China [61603191, 61603189] and the Jiangsu Province Natural Science Research Project of Colleges and Universities [20KJD510005]. The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
References
1. Yang, Z., Fu, J., et al.: Intelligent optimal pulse-jet control for dual-spin projectiles. In: 2021 40th Chinese Control Conference (CCC), pp. 7639–7644, IEEE (2021)
2. Geswender, C.E., et al.: 2-D projectile trajectory correction system and method. U.S. Patent No. 7,163,176. 16 (2007)
3. Corriveau, D., Wey, P., et al.: Thrusters pairing guidelines for trajectory corrections of projectiles. J. Guidance, Control, Dyn. 34(4), 1120–1128 (2011)
4. Corriveau, D., Berner, C., Fleck, V.: Trajectory correction using impulse thrusters for conventional artillery projectiles. In: Proceedings of the 23th International Symposium on Ballistics, pp. 639–646 (2007)
5. Wey, P., Corriveau, D.: Trajectory deflection of fin-and spin-stabilized projectiles using paired lateral impulses. In: Bless, S., Walker, J. (eds.) Proceedings of the 24th International Symposium on Ballistics, pp. 188–197 (2008)
6. Shengzheng, C., Bo, Y., Shuanhu, Y., et al.: Research on impulse correction munitions technology. J. Ordnance Equipment Eng. 38(12), 60–64 (2017)
7. Zhiwei, Y., Liangming, W.A., Jianwei, C., et al.: Motion characteristics of the small caliber anti-aircraft gun correction projectile subjected to a pulse action. Acta Armamentarii 43(6), 1337 (2022)
8. Cao, X., Xu, Y., et al.: Research on control method for terminal correction mortar projectiles subjected to lateral impulses based on miss distance prediction. J. Projectiles, Rockets, Missiles Guidance 37(2), 23–26 (2017)
9. Ruisheng, Sun, Qiao, Hong, Jinzhang, Chen, et al.: Particle swarm optimization method for impulse-correction projectiles. J. Nat. Univ. Defense Technol. 38(4), 159–163 (2016)
10. Pavkovic, B., Pavic, M., Cuk, D.: Frequency-modulated pulse-jet control of an artillery rocket. J. Spacecraft Rockets 49(2), 286–294 (2012)
11. Guosheng, Lu.: Research on relationship between multidimensional dispersion index of projectile and hitting probability. Journal Ballistics 27(04), 59–63 (2015)
Analysis of Probabilistic Energy Flow for Integrated Electricity and Heat Systems Considering Source-Load Uncertainty Taihao Liu, Yunzhong Song, Huimin Xiao, and Fuzhong Wang
Abstract Building integrated energy systems has become a major national strategy for achieving energy transformation and sustainable development. To address the source-load uncertainty in the actual operation of integrated electricity and heat systems (IEHS), a probabilistic energy flow calculation method for IEHS based on a discrete solution is proposed. First, models of the power system, the thermal system, the coupling elements and the source-load uncertainty are established. Then, on the basis of the steady-state energy flow, the uncertainty of the electric load, the heating load, and the photovoltaic and wind power output is considered, and the Monte Carlo simulation (MCS) method is used to calculate the probabilistic energy flow of the IEHS. The effectiveness of the proposed method is verified by a case study, which also shows that the IEHS can be operated more safely by adjusting the capacity of the heat pump (HP). Keywords Source-load uncertainty · Integrated electricity and heat systems · Probabilistic energy flow · Monte Carlo simulation
WWW home page: http://www.researchgate.net/profile/Song-Yunzhong. T. Liu · Y. Song (B) · F. Wang School of Electrical Engineering and Automation, Henan Polytechnic University, 454003 Jiaozuo, China e-mail: [email protected] Y. Song · F. Wang Henan International Joint Laboratory of Direct Drive and General of Intelligent Equipment, Zhengzhou, China Henan Key Laboratory of Intelligent Detection and Control of Coal Mine Equipment, Zhengzhou, China H. Xiao School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_58
1 Introduction In recent years, the integrated energy system has become an important route for national power system construction and energy transformation [1], and the underlying laws of physics and chemistry also play an important role in this field [2–4]. Among integrated energy systems, the IEHS is the most widely used. In reference [5], the heat network pipe cooling operator is simplified, and a fast iterative method is used to solve the IEHS energy flow. In reference [6], a fast probabilistic power flow method suitable for radial heating networks is proposed, based on the analytical method and the approximate method. In reference [7], a method based on linear-optimization interval power flow is proposed, considering the uncertainty of electric and heat load fluctuations. In reference [8], a hydraulic model considering refined resistance and a thermal model considering the dynamic characteristics of the thermal network are established, and an IEHS time-series energy flow calculation method is proposed. At present, IEHS probabilistic energy flow calculations often consider only the uncertainty of the load side or of the energy side. With the growth of new energy power generation, however, both the uncertainty of the load and the uncertainty of photovoltaic (PV) and wind power generation must be considered, since together they are bound to bring hidden dangers to the operation of the IEHS [9–12]. Therefore, studying the IEHS probabilistic energy flow considering source-load uncertainty makes it possible to simulate potential problems of the system and find ways to solve them. The main contributions of this paper are threefold: (1) the impact of several uncertainties on the IEHS is considered comprehensively; (2) a probabilistic energy flow calculation method based on a discrete solution is proposed; (3) the influence of the HP on the probabilistic energy flow of the system is analyzed.
2 IEHS Model IEHS consists of power system, thermal system, combined heat and power (CHP), HP and other coupling units. The power system is mainly composed of generator, electric load and transmission line. The thermal system is mainly composed of heat source, heat load, supply and return pipeline and circulation pump. Each system node is classified and its variables are shown in Table 1.
Table 1 Node category and variables of system

| System category | Node type | Known variables | Unknown variables |
|---|---|---|---|
| Power system | PQ node | $P$, $Q$ | $U$, $\theta$ |
| Power system | PV node | $P$, $U$ | $Q$, $\theta$ |
| Power system | Slack node | $U$, $\theta$ | $P$, $Q$ |
| Thermal system | $\Phi T_s$ node | $\Phi$, $T_s$ | $T_r$, $m$ |
| Thermal system | $\Phi T_r$ node | $\Phi$, $T_r$ | $T_s$, $m$ |
| Thermal system | Slack node | $T_s$ | $\Phi$, $T_r$, $m$ |

2.1 Power System Model The power system adopts the AC power flow model, and the nodal power equations are

$$P_i = U_i \sum_{j \in J} U_j \left[ G_{ij}\cos(\theta_i - \theta_j) + B_{ij}\sin(\theta_i - \theta_j) \right] \tag{1}$$

$$Q_i = U_i \sum_{j \in J} U_j \left[ G_{ij}\sin(\theta_i - \theta_j) - B_{ij}\cos(\theta_i - \theta_j) \right] \tag{2}$$

where $P_i$ and $Q_i$ are the active and reactive power injected at grid node $i$, respectively; $\theta_i$ and $\theta_j$ are the voltage phase angles at nodes $i$ and $j$, respectively; $U_i$ and $U_j$ are the voltage magnitudes at nodes $i$ and $j$, respectively; $G_{ij}$ and $B_{ij}$ are the real and imaginary parts of the element in row $i$ and column $j$ of the grid node admittance matrix, respectively.
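Equations (1) and (2) can be evaluated directly once an admittance matrix and a voltage profile are given. The Python sketch below does this for a two-node network; the function name `nodal_power` and all numerical values are illustrative assumptions and are not taken from the case study.

```python
import math

def nodal_power(U, theta, G, B):
    """Evaluate Eqs. (1)-(2): injected active and reactive power at every node."""
    n = len(U)
    P, Q = [0.0] * n, [0.0] * n
    for i in range(n):
        for j in range(n):  # entries of G and B are zero for unconnected node pairs
            d = theta[i] - theta[j]
            P[i] += U[i] * U[j] * (G[i][j] * math.cos(d) + B[i][j] * math.sin(d))
            Q[i] += U[i] * U[j] * (G[i][j] * math.sin(d) - B[i][j] * math.cos(d))
    return P, Q

# Hypothetical two-node network: one line with series admittance 1 - 5j p.u.
G = [[1.0, -1.0], [-1.0, 1.0]]
B = [[-5.0, 5.0], [5.0, -5.0]]

# Flat voltage profile: no power is exchanged
P_flat, Q_flat = nodal_power([1.0, 1.0], [0.0, 0.0], G, B)

# Node 2 lags node 1 by 0.1 rad: power flows from node 1 to node 2
P_ang, Q_ang = nodal_power([1.0, 1.0], [0.0, -0.1], G, B)
```

In a power flow solver these expressions form the mismatch functions that are driven to zero; here they are only evaluated for a fixed state.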
2.2 Thermal System Model The hydraulic model constraint equations are as follows:

$$A m = m_q \tag{3}$$

$$B h_f = 0 \tag{4}$$

$$h_f = K m |m| \tag{5}$$

where Eq. (3) is the nodal mass flow rate balance equation; $A$ is the node-branch incidence matrix of the thermal network; $m$ is the pipe mass flow rate; $m_q$ is the load mass flow rate into the node. Equation (4) is the loop pressure equation; $B$ is the loop-branch incidence matrix; $h_f$ is the pipe pressure drop caused by friction loss. Equation (5) is the head loss equation; $K$ is the pipe resistance coefficient, which is obtained from

$$K = \frac{8 L f}{D^5 \rho^2 \pi^2 g} \tag{6}$$

$$\frac{1}{\sqrt{f}} = -2 \lg\left( \frac{\varepsilon / D}{3.7} + \frac{2.51}{Re \sqrt{f}} \right) \tag{7}$$

$$Re = \frac{4m}{\rho \pi \mu D} \tag{8}$$
where $L$ is the pipe length; $f$ is the friction factor; $D$ is the pipe diameter; $\rho$ is the water density; $\pi$ is the circle constant; $g$ is the gravitational acceleration; $\varepsilon$ is the pipe roughness; $\mu$ is the kinematic viscosity of water; $Re$ is the Reynolds number. The thermal model constraint equations are as follows:

$$\Phi = C_p m_q (T_s - T_r) \tag{9}$$

$$T_{end} = (T_{start} - T_a) e^{-\lambda L / (C_p m)} + T_a \tag{10}$$

$$\left( \sum m_{out} \right) T_{out} = \sum \left( m_{in} T_{in} \right) \tag{11}$$

where Eq. (9) is the heat power equation; $\Phi$ is the heat power injected into the node; $C_p$ is the specific heat capacity of water; $T_s$ is the node supply temperature; $T_r$ is the node return temperature; $T_{start}$ is the temperature at the head of the pipe; $T_{end}$ is the temperature at the end of the pipe; $T_a$ is the ambient temperature; $\lambda$ is the heat transfer coefficient; $m_{in}$ is the pipe mass flow rate into the node; $m_{out}$ is the pipe mass flow rate out of the node; $T_{in}$ is the temperature at the end of a pipe entering the node; $T_{out}$ is the mixing temperature at the node.
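The pipe-level relations in Eqs. (5)–(8), (10) and (11) can be sketched in Python as follows. Equation (7) is implicit in the friction factor, so the sketch solves it by fixed-point iteration, which is one common choice and an assumption here (the paper does not state its solver); the iteration is intended for turbulent flow. All function names and the pipe data are hypothetical.

```python
import math

def reynolds(m, rho, mu, D):
    """Eq. (8): Reynolds number; mu is the kinematic viscosity (m^2/s)."""
    return 4.0 * abs(m) / (rho * math.pi * mu * D)

def friction_factor(Re, eps, D, tol=1e-12, max_iter=200):
    """Eq. (7): fixed-point iteration on x = 1/sqrt(f) (turbulent flow assumed)."""
    x = 7.0  # initial guess, corresponds to f of about 0.02
    for _ in range(max_iter):
        x_new = -2.0 * math.log10(eps / D / 3.7 + 2.51 * x / Re)
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return 1.0 / (x * x)

def resistance_coefficient(L, f, D, rho, g=9.81):
    """Eq. (6): pipe resistance coefficient K."""
    return 8.0 * L * f / (D ** 5 * rho ** 2 * math.pi ** 2 * g)

def head_loss(K, m):
    """Eq. (5): h_f = K m |m|, so the sign follows the flow direction."""
    return K * m * abs(m)

def pipe_end_temperature(T_start, T_a, lam, L, c_p, m):
    """Eq. (10): temperature at the pipe end after heat loss to the surroundings."""
    return (T_start - T_a) * math.exp(-lam * L / (c_p * m)) + T_a

def mixing_temperature(flows_in, temps_in):
    """Eq. (11): flow-weighted mixing temperature (total outflow equals total inflow)."""
    return sum(m * T for m, T in zip(flows_in, temps_in)) / sum(flows_in)

# Hypothetical pipe data, chosen only to make the sketch runnable
rho, mu, D, L, eps = 960.0, 0.294e-6, 0.15, 400.0, 0.4e-3
m = 2.0  # kg/s
Re = reynolds(m, rho, mu, D)
f = friction_factor(Re, eps, D)
K = resistance_coefficient(L, f, D, rho)
h_f = head_loss(K, m)
T_end = pipe_end_temperature(100.0, 10.0, 0.2, L, 4182.0, m)
T_mix = mixing_temperature([1.0, 3.0], [80.0, 100.0])
```

Keeping `m * abs(m)` rather than `m ** 2` preserves the flow direction, which is what lets the loop pressure equation (4) sum signed head losses around a loop.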
2.3 Coupling Element Model 1. CHP Unit Model Depending on whether the heat-to-electricity ratio of the CHP unit changes, CHP units can be divided into two types: fixed heat-to-electricity ratio (e.g. gas turbine, reciprocating internal combustion engine) and variable heat-to-electricity ratio (e.g. extraction steam turbine). The electricity and heat production of units with fixed and variable heat-to-electricity ratios can be expressed by Eqs. (12) and (13), respectively.

$$C_m = \Phi_{CHP} / P_{CHP} \tag{12}$$

$$C_z = \Delta\Phi / \Delta P = \Phi_{CHP} / (\eta_e F_{in} - P_{CHP}) \tag{13}$$

where $\Phi_{CHP}$ and $P_{CHP}$ are the heat and electric power output of the CHP unit; $\eta_e$ is the condensation efficiency of the CHP unit; $F_{in}$ is the constant fuel input rate. 2. HP unit model A heat pump is a device that uses a small amount of electricity to raise heat from the surrounding environment to a higher temperature, and is mathematically modelled as

$$\Phi_{HP} = \alpha_{HP} P_{HP} \tag{14}$$

where $\alpha_{HP}$ is the power-to-heat efficiency of the HP.
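As a small worked illustration of Eqs. (12)–(14), the following Python sketch converts between the heat and electric output of the coupling elements. The function names and the numerical values are assumptions made for the example; only the HP efficiency of 3 matches the value used later in the case study.

```python
def chp_power_fixed_ratio(phi_chp, c_m):
    """Eq. (12): electric output of a fixed-ratio CHP unit from its heat output."""
    return phi_chp / c_m

def chp_power_variable_ratio(phi_chp, c_z, eta_e, f_in):
    """Eq. (13) solved for P_CHP: P_CHP = eta_e * F_in - Phi_CHP / C_z."""
    return eta_e * f_in - phi_chp / c_z

def hp_heat(p_hp, alpha_hp):
    """Eq. (14): heat delivered by a heat pump that consumes electric power p_hp."""
    return alpha_hp * p_hp

# Hypothetical operating points (per-unit style values)
p_fixed = chp_power_fixed_ratio(phi_chp=1.3, c_m=1.3)
p_var = chp_power_variable_ratio(phi_chp=0.8, c_z=8.0, eta_e=0.4, f_in=1.5)
phi_hp = hp_heat(p_hp=0.5, alpha_hp=3.0)
```

Because each relation ties one heat quantity to one electric quantity, fixing the heat side of a coupling unit is enough to determine its injection into the power grid.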
2.4 Source-Load Uncertainty Model In this paper, the IEHS uncertainty factors considered are the uncertainty of the load and of the PV and wind power output; parameter uncertainty is not considered. 1. PV output probabilistic model PV output is characterised by strong randomness and volatility. The probability model of a PV plant is usually built from an empirical probability distribution, mainly the Beta distribution, and the probability density function of the PV output is

$$f(P_{pv}) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\Gamma(\beta)} \left( \frac{P_{pv}}{P_{pv\_max}} \right)^{\alpha - 1} \left( 1 - \frac{P_{pv}}{P_{pv\_max}} \right)^{\beta - 1} \tag{15}$$

where $\alpha$ and $\beta$ are the shape parameters of the Beta distribution. The PV power station is controlled at a constant power factor. Following reference [9], the PV power station is assumed to have a power factor of 1, so its reactive power output is 0. 2. Probabilistic model of wind power output The wind farm output depends mainly on the wind turbine model and the local wind speed conditions. When the wind speed obeys a two-parameter Weibull distribution, the probability density function is

$$f(v_w) = \frac{k}{c} \left( \frac{v_w}{c} \right)^{k - 1} e^{-(v_w / c)^k} \tag{16}$$

where $c$ and $k$ are the scale and shape parameters of the Weibull distribution, respectively; $v_w$ is the wind speed. The relationship between wind power output and wind speed can be described as

$$P_w = \begin{cases} 0, & v_w \le v_{ci} \\ c_1 + c_2 v_w, & v_{ci} < v_w \le v_r \\ P_{w\_N}, & v_r < v_w \le v_{co} \\ 0, & v_{co} < v_w \end{cases} \tag{17}$$

where $P_{w\_N}$ is the rated output of wind power; $v_{ci}$, $v_r$ and $v_{co}$ are the cut-in wind speed, rated wind speed and cut-out wind speed, respectively; the coefficients $c_1$ and $c_2$ are

$$c_1 = \frac{P_{w\_N} v_{ci}}{v_{ci} - v_r}, \qquad c_2 = \frac{P_{w\_N}}{v_r - v_{ci}} \tag{18}$$
3. Load probabilistic model In general, the Gaussian distribution can better describe the prediction errors of electric and heating loads [9], and the probability density functions of electric and heating loads are

$$f(P) = \frac{1}{\sqrt{2\pi}\,\sigma_p} e^{-\frac{(P - \mu_p)^2}{2\sigma_p^2}} \tag{19}$$

$$f(Q) = \frac{1}{\sqrt{2\pi}\,\sigma_p} e^{-\frac{(P - \mu_p)^2}{2\sigma_p^2}} \tan\varphi \tag{20}$$

$$f(\Phi) = \frac{1}{\sqrt{2\pi}\,\sigma_\Phi} e^{-\frac{(\Phi - \mu_\Phi)^2}{2\sigma_\Phi^2}} \tag{21}$$

where $\mu$ and $\sigma$ are the expectation and standard deviation, respectively; $\varphi$ is the power factor angle.
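The three source-load models can be sampled with Python's standard `random` module, as in the sketch below. It draws PV output from the Beta model of Eq. (15), wind speed from the Weibull model of Eq. (16), maps wind speed to power with Eqs. (17)–(18), and draws loads from Eqs. (19) and (21). The configuration values are hypothetical, and the reactive load is obtained as Q = P tan φ, which is an assumed reading of Eq. (20) as a constant power factor.

```python
import random

def wind_power(v, p_rated, v_ci, v_r, v_co):
    """Eqs. (17)-(18): piecewise power curve of a wind turbine."""
    if v <= v_ci or v > v_co:
        return 0.0
    if v <= v_r:
        c1 = p_rated * v_ci / (v_ci - v_r)
        c2 = p_rated / (v_r - v_ci)
        return c1 + c2 * v
    return p_rated

def sample_sources_and_loads(rng, cfg):
    """Draw one Monte Carlo sample of PV, wind, electric load and heat load."""
    p_pv = cfg["pv_max"] * rng.betavariate(cfg["alpha"], cfg["beta"])  # Eq. (15)
    v_w = rng.weibullvariate(cfg["c"], cfg["k"])  # Eq. (16); arguments are (scale, shape)
    p_w = wind_power(v_w, cfg["p_rated"], cfg["v_ci"], cfg["v_r"], cfg["v_co"])
    p_load = rng.gauss(cfg["mu_p"], cfg["sigma_p"])  # Eq. (19)
    q_load = p_load * cfg["tan_phi"]  # assumption: constant power factor
    phi_load = rng.gauss(cfg["mu_phi"], cfg["sigma_phi"])  # Eq. (21)
    return {"p_pv": p_pv, "p_w": p_w, "p_load": p_load,
            "q_load": q_load, "phi_load": phi_load}

# Hypothetical parameters; standard deviations are 5 % of the expectations
cfg = {"pv_max": 1.0, "alpha": 2.0, "beta": 2.0, "c": 8.0, "k": 2.0,
       "p_rated": 2.0, "v_ci": 3.0, "v_r": 12.0, "v_co": 25.0,
       "mu_p": 10.0, "sigma_p": 0.5, "tan_phi": 0.3,
       "mu_phi": 5.0, "sigma_phi": 0.25}
rng = random.Random(42)
samples = [sample_sources_and_loads(rng, cfg) for _ in range(1000)]
```

Seeding the generator makes a run reproducible, which is convenient when scenarios with different uncertainty sources are to be compared on the same random draws.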
3 Probabilistic Energy Flow Calculation Method for IEHS Considering Uncertainty In this paper, the power system data are taken from the MATPOWER toolbox and the thermal system data from reference [7]. PV output, wind power output, electric load and heating load are sampled by the Monte Carlo method. Both the power grid and the thermal network use the exhaust steam turbine CHP unit as the slack node; the other CHP units are PV nodes in the power grid and ΦTs nodes in the heat network. The electric power output of a CHP unit is obtained from Eq. (12) based on its known heat power, and the CHP unit output is then assigned to the power system and the thermal system to achieve decoupling. After decoupling, the energy flows of the power system and the thermal system are calculated separately (the discrete solution) to obtain the IEHS state variables, and the probability distribution of each state variable over N energy flow calculations is then analyzed. Based on references [11, 12], the overall flow of the proposed method is shown in Fig. 1.
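The overall procedure (sample the uncertain inputs, decouple through Eq. (12), run a deterministic energy flow, repeat N times, then examine the distribution) can be outlined in Python as below. This is a structural sketch only: `toy_solve` is a made-up placeholder standing in for the real power and heat network solvers, and `C_M`, the voltage sensitivity and the 1.01 p.u. limit are hypothetical values.

```python
import random

def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    return sum(s <= x for s in samples) / len(samples)

def monte_carlo_energy_flow(solve, draw_inputs, n_runs, seed=0):
    """Run n_runs deterministic energy-flow calculations on random inputs."""
    rng = random.Random(seed)
    return [solve(draw_inputs(rng)) for _ in range(n_runs)]

# --- Toy stand-ins (assumptions, NOT the IEEE 14-node / heat network model) ---
C_M = 1.3  # hypothetical fixed heat-to-electricity ratio of a CHP unit

def draw_inputs(rng):
    load = rng.gauss(1.0, 0.05)    # electric load, 5 % standard deviation
    heat = rng.gauss(1.3, 0.065)   # CHP heat output, 5 % standard deviation
    return {"load": load, "p_chp": heat / C_M}  # Eq. (12) decoupling step

def toy_solve(x):
    # Placeholder "energy flow": voltage rises with net injection (illustrative only)
    return 1.0 + 0.1 * (x["p_chp"] - x["load"])

voltages = monte_carlo_energy_flow(toy_solve, draw_inputs, n_runs=10000)
p_over = 1.0 - empirical_cdf(voltages, 1.01)  # estimated P(U > 1.01 p.u.)
```

Passing the solver in as a function keeps the sampling loop independent of the network model, so the same driver could wrap a full power flow and heat network calculation.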
4 Case Study 4.1 Case Condition In this paper, the IEHS couples the IEEE 14-node power grid with the Bali Island thermal network, and the topology is shown in Fig. 2. CHP 1 is a gas turbine unit; CHP 2 is an extraction steam turbine unit; CHP 3 is a reciprocating internal combustion engine unit. Node 9 of the power grid is connected to one distributed photovoltaic generator unit, and node 6 is connected to two wind turbines. The expectations of the PV output, wind power output, and electric and heat loads are assumed to be their forecast values, the standard deviation is 5% of the expectation, the HP efficiency is 3, and the number of Monte Carlo samples is 10,000.
Fig. 1 Flow chart of probabilistic energy flow for IEHS considering uncertainty
4.2 Analysis of Example Result Four scenarios are set for simulation and comparative analysis: electric load uncertainty, heating load uncertainty, wind power and PV output uncertainty, and source-load uncertainty. From Fig. 3, the value of each variable fluctuates above and below the result of the deterministic energy flow calculation. Except for some special nodes (ΦTs, PV and slack nodes), all nodes are affected. The source-load uncertainty gives the IEHS energy flow results interval characteristics. From Fig. 4, uncertainty in the thermal system affects the operating state of the power system. When analyzing the probabilistic energy flow of the power system, it is therefore necessary to consider the uncertainty of the energy flow of the thermal system. Similarly, when analyzing the probabilistic energy flow of the thermal system, it is necessary to take into account the uncertainty of the power flow of the power system. From Fig. 5, as the capacity of the HP increases, the voltage magnitude of node 10 and the mass flow rate of pipe 4 gradually decrease. Therefore, configuring a heat pump with a certain capacity can effectively reduce the probability of the grid voltage exceeding its limit. When the load of the power system is in a trough and the load of the thermal system reaches a peak, it is sometimes difficult to meet the system
Fig. 2 Topology of IEHS with PV and wind power
Fig. 3 Voltage magnitude of each node and supply water temperature of loop node
operation requirements by only adjusting the power and thermal output of the CHP unit, and the heat pump can be used to balance the heat network load nearby, which can effectively relieve the pressure of the heat network.
Fig. 4 In four scenarios, CDF at node 10 and CDF in pipe 4
Fig. 5 By adjusting HP capacity, CDF at node 10 and CDF in pipe 4
5 Conclusion In this paper, a probabilistic energy flow calculation method for IEHS considering source-load uncertainty based on a discrete solution is proposed and applied to a case study, and the influence of source-load uncertainty and the HP on the IEHS is analyzed. 1. Uncertainty propagates to the other energy system through the coupling elements, and the interaction of energy flows means that a single system cannot reveal the operating characteristics of the IEHS. 2. Adjusting the capacity of the HP can better match the peak-valley characteristics of the uncertain factors and enhance the security of the system. Acknowledgements This work was supported in part by the National Natural Science Foundation of China (No. 61340041, 61374079 and 61903126) and the Natural Science Foundation of Henan Province (182300410112).
References 1. Ming, Z., Shuo, Z.: Comprehensive energy development analysis of the “14th Five-Year Plan” power planning. China Power Enterpr. Manage. 598(13), 26–28 (2020) 2. Teng, X., Liu, B., Ichiye, T.: Understanding how water models affect the anomalous pressure dependence of their diffusion coefficients. J. Chem. Phys. 153(10), 104510 (2020) 3. Teng, X., Ichiye, T.: Dynamical model for the counteracting effects of trimethylamine N-Oxide on urea in aqueous solutions under pressure. J. Phys. Chem. B. 124(10), 1978–1986 (2020) 4. Teng, X., Ichiye, T.: Dynamical effects of trimethylamine N-Oxide on aqueous solutions of urea. J. Phys. Chem. B 123(05), 1108–1115 (2019) 5. Dan, J., Chengwei, F., Yang, L., et al.: Calculation methodology for electric-thermal hybrid power flow based on improved Sukhov cooling operator. Southern Power Syst. Tech. 14(10), 18–26 (2020) 6. Hong, L., Wenxue, W., Fu, X., et al.: Probability power flow calculation for electric-thermal interconnected integrated energy system based on analytical method. Elect. Power Eng. Tech. 40(05), 151–157 (2021) 7. Wenxue, W., Hu, W., Guoqiang, S., et al.: Interval energy flow calculation method of integrated electro-thermal system. Power System Tech. 43(01), 83–95 (2019) 8. Hong, L., Chenxiao, Z., Shaoyun, G., et al.: Sequential power flow calculation of power-heat integrated energy system based on refined heat network model. Automat. Electr. Power Syst. 45(04), 63–72 (2021) 9. Juan, S., Zhinong, W., Guoqiang, S., et al.: Analysis of probabilistic energy flow for integrated electricity-heat energy system with P2H. Electric Power Automat. Equipm. 37(06), 62–68 (2017) 10. Huicheng, W., Chun, W., Kuan, L., et al.: An interval energy flow calculation method for integrated electro-thermal energy system. Power Syst. Tech. 43(01), 91–99 (2019) 11. Jing, X., Xu, K., Shiju, W., et al.: Multi-energy flow calculation method for integrated electricity-gas system based on discrete solution. Proc. 
CSU-EPSA 34(01), 114–120 (2022) 12. Liu, X., Jianzhong, W., Jenkins, N., et al.: Combined analysis of electricity and heat networks. Appl. Energy 162, 1238–1250 (2016)
A Fire Alarm System for Agricultural Sheds Designed with Zigbee Yongsheng Xie, Xiaokai Du, and Linbing Wei
Abstract This paper designs a fire alarm system for agricultural greenhouses based on a ZigBee sensor network and GSM communication technology. The system monitors the environmental conditions in the greenhouse in real time through the sensor network and can promptly alert the relevant personnel in two ways: mobile phone text messages and emergency fire alarms. At the same time, the irrigation system of the greenhouse is used to extinguish the fire physically, which makes the response faster and more efficient and reduces the damage caused by the fire. To improve the probability of flame detection, dual-band infrared detection signals are used, and a BP neural network fusion algorithm fuses the two bands to compensate for measurement errors caused by factors such as illumination. Keywords Fire alarm · GSM · STM32 · Internet of Things · BP neural network
1 Introduction Automated agricultural greenhouses play an important role in the development of smart agriculture. The fire in the greenhouse has also become a hidden safety problem that cannot be ignored. Especially with the development of modern agricultural technology, agricultural greenhouses are mostly controlled by automation and are left
Y. Xie School of Mathematics and Computer Science, Guangxi Science and Technology Normal University, Laibin 546199, China e-mail: [email protected] X. Du Jiangnan University, Wuxi 214122, China L. Wei (B) Guangxi Science and Technology Normal University, Laibin 546199, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_59
unattended. Once a fire breaks out, it causes great harm to agricultural production. Therefore, it is particularly important to set up an Internet of Things fire alarm system in agricultural greenhouses to raise an alarm in time. Related designs already exist in China, but they have not been widely used, because their monitoring range is limited and they are only suitable for small domestic spaces, not for agricultural greenhouses with large spaces and a high degree of automation. In addition, they rely on traditional smoke and temperature detection, which has a high false alarm rate and wastes fire-fighting resources. In view of this situation, this paper designs an Internet of Things fire alarm system based on a ZigBee sensor network and GSM communication technology [1]. The sensor network monitors two frequency bands of flame infrared radiation in the greenhouse environment [2], and a BP neural network fuses the multi-sensor measurements with the environmental influence factors to improve the detection accuracy [3]. The coordinator module sends a short message to the relevant personnel's mobile phone through the GSM module. At the same time, the sensor network drives the irrigation system in the high-concentration, high-temperature area of the greenhouse to extinguish the fire physically.
2 Overall System Design The overall module structure of the system is shown in Fig. 1. It mainly includes the ZigBee module, the sensor module, the GSM module and so on. The system first collects infrared detection signals in two frequency bands through terminal nodes distributed across the agricultural greenhouse. The signals are then forwarded through the routing nodes, and the information is finally aggregated at the coordinator node. The coordinator node judges whether there is a fire in the greenhouse by running the detection software. In the event of a fire, it not only drives the alarm to warn of the danger but also sends an SMS notification to the relevant personnel through the GSM module. At the same time, the irrigation system in the greenhouse is used to extinguish the fire physically in the affected area and prevent it from spreading.
3 System Hardware Components 3.1 ZigBee Sensor Network ZigBee networks can be organized in many ways, and in theory up to 65,536 nodes can be connected. There are three main networking methods (as shown in Fig. 2): star topology, mesh topology and tree topology. The choice of networking mode depends on the specific application scenario, after taking
Fig. 1 System overall module structure diagram (blocks: 4.3 μm flame sensor, 3.8 μm flame sensor, light sensor, ZigBee module, alarm module, relay module, agricultural irrigation system, user receiving information)
Fig. 2 Schematic diagram of ZigBee network in spinach planting area
into account factors such as the wide planting area of agricultural greenhouses, the scattered areas of fire hazards, and the high degree of automation. This design uses a mesh topology network, whose topology is shown in Fig. 2. This networking method routes information flexibly, which makes communication more efficient: if a problem occurs on one path, the information is automatically transmitted along other paths. It therefore copes better with the long transmission distances and scattered detection areas in agricultural greenhouses.
3.2 ZigBee Node Hardware Design Different ZigBee nodes have different hardware platform requirements according to their functions. For example, as the core of the fire alarm system, the coordinator node needs to send and receive a large amount of data and run the flame identification algorithm, so it needs excellent data processing capabilities. As the key to
data collection of fire alarm system, the terminal node must be low-cost, low-power and small. Therefore, for the coordinator, this design uses TI's CC2538 chip as the main control chip. The CC2538xFnn is a wireless microcontroller system-on-chip (SoC) suitable for high-performance ZigBee applications. The device contains a powerful MCU system based on the ARM Cortex-M3, which can meet the requirements of large data volumes and frequent processing in sensor networks and provides a platform for implementing the flame recognition algorithm.
3.3 Infrared Flame Sensor Module Flame detectors differ considerably in performance and model according to their working principles, and different types of detectors have different sensitivities, detection ranges and applications. Because agricultural greenhouses contain more interference sources, such as fluorescent lamps and thermal radiation, than other application scenarios, this design adopts a dual-band infrared flame detection device. With the ZigBee sensor network, the greenhouse can be divided into multiple detection areas whose environmental conditions are monitored separately. The flame detection algorithm for a single area is discussed below. Figure 3 compares the radiation spectra of 3 different combustibles. Their peaks overlap at wavelengths of 4.3 µm and 2.6 µm [4]. Of these, 4.3 µm is the most pronounced flame signal and is an important wavelength band for infrared flame detection. To make the energy peak of the CO2 absorption band released by the flame stand out, a contrast band must also be selected [5]. The 3.8 µm band lies near the peak, and the spectral centers of most common gases are not in this band. Therefore, the 4.3 µm band is used as the detection band and the 3.8 µm band as a reference band to eliminate environmental interference [6]. Using the response characteristics of the two bands, the flame can be judged accurately.
3.4 GSM Module The GSM module used in this system is the SIM900A. The SIM900A is a dual-band GSM/GPRS module that works in the EGSM 900 MHz and DCS 1800 MHz frequency bands, supports transmission rates from 1200 bps to 115,200 bps, and supports standard AT commands. After it is connected to the coordinator, it is only responsible for receiving the AT commands transmitted by the coordinator and sending out the prepared message. A photograph of the GSM module is shown in Fig. 4.
Fig. 3 Radiation spectra of different fuels
Fig. 4 Physical map of GSM module
Sending AT commands to the GSM module automatically is a key difficulty in writing the code for this system. If AT commands are sent too frequently, the GSM module reports an error, so a certain delay is required after each command before the next one is sent. There are three modes of short message encoding: Block mode, Text mode based on AT commands, and PDU mode based on AT commands. Block mode is rarely used; Text mode is relatively simple but only supports English text; PDU mode is a general encoding method in which the text of the short message is transmitted after hexadecimal encoding. Because the message sent in this design, "Warning!!! Fire" (shown in Fig. 5), is relatively simple, Text mode is selected (Table 1).
Fig. 5 GSM information transmission display
Table 1 AT commands

| AT instruction | Function |
|---|---|
| AT+CSCS | Coding settings |
| AT+CMGF | Select the supported format of message |
| AT+CMGS | Set the mobile phone number to receive SMS |
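A Text-mode SMS exchange with the three commands in Table 1 can be sketched in Python as follows. The sketch only builds the byte strings and paces them with a delay, as the text above requires; it does not open a serial port. The helper names, the placeholder phone number and the 0.5 s default delay are assumptions, and a real driver would normally also wait for the modem's replies before continuing.

```python
import time

def build_sms_commands(number, text):
    """Return the byte strings to write to the modem, in order, for a Text-mode SMS."""
    return [
        b'AT+CSCS="GSM"\r',                               # select the GSM default character set
        b'AT+CMGF=1\r',                                   # select Text mode
        b'AT+CMGS="' + number.encode("ascii") + b'"\r',   # set the recipient number
        text.encode("ascii") + b"\x1a",                   # message body, ended with Ctrl-Z
    ]

def send_all(write, commands, delay_s=0.5, sleep=time.sleep):
    """Write each command, then pause so the module is not flooded."""
    for cmd in commands:
        write(cmd)
        sleep(delay_s)

# Capture the output in a list instead of a serial port (placeholder number)
sent = []
send_all(sent.append, build_sms_commands("00000000000", "Warning!!! Fire"), delay_s=0.0)
```

Passing `write` as a parameter keeps the sequencing logic testable without hardware; on a real system it would be the write method of whatever serial interface connects the coordinator to the SIM900A.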
4 System Software Design The system software processes the sensor information received by the coordinator. The BP neural network algorithm is then used as the flame detection algorithm, taking the environmental interference in the agricultural greenhouse into account to judge whether a fire has occurred, which effectively improves the adaptability and operating efficiency of the instrument. The control flow chart of the coordinator is shown in Fig. 6; it mainly consists of modules for initialization, light measurement, temperature measurement and information sending.
4.1 Design and Implementation of BP Neural Network Algorithm The BP (back propagation) neural network algorithm is used to fuse the measured values of infrared radiation in the two frequency bands effectively. The specific network structure is shown in Fig. 7. The algorithm includes forward propagation and error back propagation.
Fig. 6 Coordinator control flow chart
Fig. 7 Network structure diagram of three-layer BP algorithm (input layer: 4.3 μm signal, 3.8 μm signal, illumination; hidden layer; output layer: 0 or 1)
For forward propagation, data are collected and the model is trained under different fire sources (soldering iron, alcohol lamp, straw); the two infrared band signals and the illuminance measured by the sensors in the network serve as the three inputs of the fusion algorithm, and their values are passed forward layer by layer to the output layer. In error back propagation, the error signal propagates backward from the output layer toward the input layer until the error is within the allowable design range. The above three-layer network has 3 input layer nodes and 1 output layer node, and the number of hidden layer nodes is determined through experience or by comparing multiple training runs. Let the number
of input neurons be $in$, the number of output neurons be $out$, and $a$ be a constant in the range [1, 10]. The number of hidden layer neurons $h$ is usually estimated with the empirical formula in Eq. (1):

$$h = \sqrt{out + in} + a \tag{1}$$
In this design, the number of hidden layer nodes is varied from 3 to 12 according to Eq. (1). Since detecting the presence or absence of flame is a binary classification problem, 1 indicates fire and 0 indicates no fire. The network is trained repeatedly with the other parameters unchanged; when the number of hidden layer neurons reaches 12, the recognition accuracy of the BP neural network is higher. Using MATLAB R2010b, the sigmoid activation function is used for both the hidden layer and the output layer, and the training function is trainrp. The trainrp function is run in MATLAB on the sample data until the training requirements and objectives are met, which determines the weights and thresholds of the neural network.
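The hidden-layer range and the forward/backward passes described above can be illustrated with a small pure-Python network. This is a sketch under stated assumptions: it trains with plain gradient descent rather than MATLAB's trainrp, and the four training samples are synthetic values invented for the example, not measurements from the experiment. The class and function names are hypothetical.

```python
import math
import random

def hidden_size_candidates(n_in, n_out, a_range=range(1, 11)):
    """Eq. (1): h = sqrt(in + out) + a for a = 1..10, rounded to integers."""
    return [round(math.sqrt(n_in + n_out) + a) for a in a_range]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

class BPNet:
    """Three-layer network: inputs -> sigmoid hidden layer -> one sigmoid output."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr=0.5):
        """One forward pass followed by error back propagation for one sample."""
        h, y = self.forward(x)
        delta_out = (y - target) * y * (1.0 - y)  # output-layer error term
        for j, hj in enumerate(h):
            delta_h = delta_out * self.w2[j] * hj * (1.0 - hj)  # hidden-layer error term
            self.w2[j] -= lr * delta_out * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
            self.b1[j] -= lr * delta_h
        self.b2 -= lr * delta_out
        return 0.5 * (y - target) ** 2

# Synthetic, hypothetical samples: [4.3 um signal, 3.8 um signal, illumination] -> fire?
data = [([0.9, 0.2, 0.1], 1), ([0.8, 0.3, 0.9], 1),
        ([0.2, 0.2, 0.1], 0), ([0.6, 0.6, 0.9], 0)]
candidates = hidden_size_candidates(3, 1)
net = BPNet(n_in=3, n_hidden=candidates[-1])
first = sum(net.train_step(x, t) for x, t in data)
for _ in range(3000):
    last = sum(net.train_step(x, t) for x, t in data)
```

With 3 inputs and 1 output, Eq. (1) yields the candidate sizes 3 to 12, matching the range explored in the text; the sketch uses the largest one.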
5 Experimental Test Results and Analysis This experiment is mainly carried out from two aspects of system reliability and stability [7]. The experimental environment is a laboratory rooftop spinach plantation. The experiment is divided into two parts. One is the error analysis of the flame detection algorithm. The second is the effective communication distance test of sensor network networking in agricultural environment.
5.1 Error Analysis of Flame Detection Algorithm This test examines the detection probability of the flame detection sensor network under different flames and the influence of light in the agricultural environment. Experimental platform: 1 ZigBee coordinator and 4 terminal test nodes. Networking mode: the coordinator broadcasts data, and the four terminal detection nodes all send and receive data by unicast, which is convenient for testing sensor data. To analyze the test accuracy, different fire sources are ignited repeatedly at high frequency and for short durations (flame duration and time interval are both 5 s) at 4 randomly chosen positions in the spinach planting field (1 means fire, 0 means no fire), and the results are sent to the host computer through the USB serial port of the coordinator for viewing. The collected data are shown in Fig. 8, which are the extreme data of flame detection in
the afternoon and night with or without light. It can be seen that, under lit conditions, light has a small probability of affecting flame detection, but the error is within the acceptable range. In addition, the same fire source was tested multiple times at different positions without changing the measurement conditions, and the results are shown in Table 2. The detection accuracy for the same fire source varies somewhat between positions, but the variation is less than 2%, which is within the acceptable range. For different fire sources at the same position, the detection accuracy error is also kept within 2%.
Fig. 8 Test data
Table 2 Detection probability of different fire sources at different locations (number of tests: 200)

| Fire source type | Location 1 (%) | Location 2 (%) | Location 3 (%) | Location 4 (%) |
|---|---|---|---|---|
| Alcohol lamp | 94.31 | 95.47 | 95.56 | 93.92 |
| Soldering iron | 5.29 | 5.75 | 6.15 | 5.58 |
| Incandescent lamp | 6.38 | 6.11 | 6.56 | 5.32 |
| Straw burning | 97.82 | 96.73 | 98.27 | 98.65 |
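The position-to-position variation for each fire source can be checked directly from the values in Table 2. The short Python sketch below computes, for every source, the difference between the highest and lowest detection probability over the four locations; using the max-min spread as the measure of "error" is an assumption about how the 2% figure is meant.

```python
# Detection probabilities (%) from Table 2, locations 1 to 4
detection = {
    "Alcohol lamp": [94.31, 95.47, 95.56, 93.92],
    "Soldering iron": [5.29, 5.75, 6.15, 5.58],
    "Incandescent lamp": [6.38, 6.11, 6.56, 5.32],
    "Straw burning": [97.82, 96.73, 98.27, 98.65],
}

# Spread between the best and worst location, in percentage points
spread = {src: round(max(v) - min(v), 2) for src, v in detection.items()}
all_within_2 = all(s < 2.0 for s in spread.values())
```

Under this reading, every fire source stays below a 2 percentage point spread across the four locations.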
Fig. 9 Node networking test diagram
5.2 Network Transmission Test of Sensor Network Nodes This test examines the relationship between the data packet loss rate and the communication distance of ZigBee in the agricultural environment, in order to determine the optimal communication distance between the ZigBee sensor network nodes in this design. Experimental platform: 1 ZigBee coordinator, 1 routing node and 2 terminal test nodes, configured as shown in Fig. 9. Networking mode: the coordinator broadcasts data, and the 2 terminal detection nodes A and B and the 1 routing node send and receive data by broadcast, forming a minimal ZigBee sensor network. Data collection method: building on the flame detection experiment above, node A and node B capture environmental data with flame sensors and light sensors, respectively. An alcohol lamp near node A is ignited repeatedly at high frequency and for short durations (flame duration and time interval are both 5 s; 1 means fire, 0 means no fire). Data display mode: the coordinator displays the detection data on the serial interface of the PC and saves them to a text file; a statistical analysis of some data at a communication distance of 20 m is shown in Fig. 10. The experimental test results are shown in Table 3. The test results show that there is a certain gap between the measured ZigBee communication distance and the theoretical value. However, since the application scenario of this design is an agricultural greenhouse, the network transmission can be optimized with multiple nodes to increase communication efficiency. After optimization of the flame identification algorithm and the network transmission, this design can be applied to an agricultural greenhouse fire warning system.
Fig. 10 Some data statistics under the communication distance of 20 m
Table 3 ZigBee network test results in betting greenhouse
Communication distance (m)    Sent    Received    Packet loss rate (%)
20                            200     200         0
40                            200     176         12
60                            200     18          91
80                            200     0           100
100                           200     0           100
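The packet loss rates in Table 3 follow directly from the sent and received counts. The short sketch below recomputes them and reads off the longest tested distance that stays under a chosen loss threshold. The function names and the thresholds in the usage are illustrative assumptions and are not part of the original design.

```python
# Sketch: recompute the packet loss rates of Table 3 from the sent/received counts.
# Function names and the thresholds used in the demo are illustrative assumptions.

def packet_loss_rate(sent: int, received: int) -> float:
    """Packet loss rate in percent: 100 * (sent - received) / sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must lie between 0 and sent")
    return 100.0 * (sent - received) / sent


# (communication distance in m, packets sent, packets received), copied from Table 3
TABLE_3 = [(20, 200, 200), (40, 200, 176), (60, 200, 18), (80, 200, 0), (100, 200, 0)]


def loss_by_distance(rows):
    """Map each tested distance to its packet loss rate in percent."""
    return {distance: packet_loss_rate(sent, received)
            for distance, sent, received in rows}


def longest_usable_distance(rows, max_loss_percent):
    """Walk the rows by increasing distance and stop at the first one above the threshold."""
    usable = None
    for distance, sent, received in sorted(rows):
        if packet_loss_rate(sent, received) > max_loss_percent:
            break
        usable = distance
    return usable


if __name__ == "__main__":
    for distance, loss in loss_by_distance(TABLE_3).items():
        print(f"{distance:>3} m: {loss:.0f} % loss")
    print("usable up to", longest_usable_distance(TABLE_3, max_loss_percent=1.0), "m")
```

With these counts the loss rate jumps from 12% at 40 m to 91% at 60 m, which is consistent with the suggestion in the text to shorten hops by adding nodes.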
6 Conclusion
Based on a ZigBee sensor network and a BP neural network algorithm, this paper uses two infrared bands to realize fire alarm in agricultural greenhouses and sends fire information in time through a GSM module. It solves the problems of traditional alarm equipment, such as limited detection range and low precision. The test results show that this design has good prospects for adoption in a wide range of application scenarios such as smart agriculture.
Acknowledgements This work was supported by the Youth scientific research and innovation team of Guangxi Normal University of science and Technology (No. GXKS2020QNTD02), Research on the Extraction Algorithm of Field Navigation Line in Visual Navigation of Mountain agricultural robot (No. 2020KY23023), Laibin City Scientific Research and Technology Development Plan Project “Research and application demonstration and promotion of sugarcane breeding based on artificial intelligence” (No. 20211216121424482) and Guangxi Science and Technology Plan Project “Key Technologies and Industrial Cluster Applications for Manufacturing Human Machine Intelligent Interactive Touch Terminals” (No. Guike AA21077018).
References
1. Zhao, Q.: Design of remote fire alarm system based on Internet of things. Softw. Develop. Appl. 14 (2020)
2. Jia, X.: Application of double fire detection device in Changjiang West Road River Crossing Tunnel. Electrical Autom. 41, 19 (2019)
3. Feng, H.T.: Application of S-type RBF neural network in infrared flame detection system. Laser Infrared (02), 2, 50 (2020)
4. Yuan, J.: Research and development of three band infrared flame detector. Zhejiang University (03) (2012)
5. Cheng, W., Zheng, H.: Development of a dual band infrared flame detection device. Equipment Manufact. Technol. (07) (2019)
6. Guo, X.: Research and scheme design of three band infrared flame detector. Fire technology and product information 8, 77 (2013)
7. Cai, Z.: Research on communication and coverage of ZigBee and Lora wireless sensor networks for forest fire early warning. Harbin Normal University (06) (2020)
8. Liu, M., Pan, X., Liu, F., Zhou, Y., Jiang, K.: Flame target detection based on stepwise discriminant method and BP neural network. J. Inner Mongolia Agric. Univ. (Natural Science Edition)
9. Feng, H., Xie, L.: Design and implementation of recognition algorithm for four band infrared flame detector. Laser Infrared (05), 40, 5 (2018)
10. Zhou, Y.: Research on feature extraction of infrared flame detection signal. Sens. Microsyst. 36, 2 (2017)
11. Li, Q.: Design and implementation of high precision, high reliability and wide spectrum flame detector. Anhui University (04) (2020)
Cooperative Control and Management for UAS in Distributed Dynamic Kill Web
Sun Zhangjun, Tang Qiang, and Li Hao
Abstract Driven by new military requirements and technological innovation, modern warfare is developing towards system-of-systems combat based on distributed operations. Correspondingly, traditional kill chains are being extended into kill webs, or effects webs. In this process, unmanned aircraft systems (UAS) play an increasingly important role, especially networked, small, low-cost ones, which are integrated as collaborating, goal-driven systems in the new distributed combat style. In this paper, we first introduce the shift from kill chain to kill web, which places new requirements on the application of UAS. Then, typical distributed dynamic cooperative combat styles of UAS are discussed. Against this background, several key issues of cooperative control and management for UAS that future research needs to address are analyzed and summarized, covering cooperation architecture, communication, navigation, decision-making, planning, guidance, and control. Finally, a brief conclusion summarizes future development trends and challenges.
Keywords Distributed combat · Dynamic kill web · Unmanned aircraft systems · Cooperation · Control and management
S. Zhangjun · T. Qiang (B) · L. Hao
AVIC Xi’an Flight Automatic Control Research Institute, Xi’an 710076, People’s Republic of China
e-mail: [email protected]
National Key Laboratory of Science and Technology on Aircraft Control, Xi’an 710076, People’s Republic of China
S. Zhangjun
School of Management, Xi’an Jiaotong University, Xi’an 710049, People’s Republic of China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_60

1 Introduction
In the twenty-first century, the mode of modern warfare has developed from mechanization to informatization, and it is expected to move gradually towards intelligence in the near future. At the same time, following the proposal of the “Third Offset Strategy”, the U.S. military has put forward a series of new operational concepts, which have been continuously practiced, improved, and developed, including Net-Centric Warfare, Distributed Maritime Operations (DMO), Multi-Domain Operations (MDO), Mosaic Warfare, and Decision-Centric Warfare. Many programs have also been carried out to research and verify the related key techniques, such as CODE, DBM, SoSITE, and OFFSET. It is clear that the U.S. military is pursuing a distributed combat capability in the dynamic battlefield environment. With the ability to distribute forces and to compose and recompose them rapidly, the traditional kill chain has been extended to the kill web, which could offer several operational advantages compared with today’s U.S. military.
As the most promising new combat force, unmanned aircraft systems (UAS), especially small ones, could enable more distributed formations by disaggregating the capabilities of traditional multi-mission platforms and units into a larger number of less multi-functional and less expensive systems. In recent years, several cooperation styles of UAS, such as teaming, swarming, and loyal wingman, have attracted wide attention because of their potential to augment or even redefine the employment of airpower. This trend undoubtedly has a profound and long-term impact on the development and application of unmanned aerial vehicles.
From the perspective of control, both the plant and the objective have changed significantly. On the one hand, the controlled plant expands from the motion of a traditional single platform to task-oriented cooperation among multiple platforms. On the other hand, the objective changes from achieving flight quality, represented by the stability and maneuverability of the platform, to achieving cooperative combat effectiveness with multiple vehicles. Moreover, the control structure needs to be more open, flexible, and scalable on a distributed basis, and the control strategy should be more autonomous and intelligent.
The rest of this paper is organized as follows. Section 2 briefly introduces the shift from kill chain to kill web. Section 3 presents typical new ways of employing UAS to accomplish mission objectives from the tactical to the strategic level. The control and management of UAS for cooperation are then discussed in detail in Sect. 4. Finally, Sect. 5 gives a short conclusion that summarizes the existing research and outlines several open problems.
2 From Kill Chain to Kill Web
Since the U.S. military put forward the kill chain in the 1990s, its concept and connotation have been constantly enriched and improved. It has developed from the earliest strike chain to the distributed kill chain, then to the kill web, and finally to the Adapting Cross-Domain Kill-Webs (ACK) proposed in recent years. From the point of view of Mosaic Warfare, the evolution of kill chains is shown in Fig. 1 [1].
Fig. 1 Evolution of kill chains [1]
Generally speaking, the kill chain can be understood as a six-step process, abbreviated as “F2T2EA”: (1) Find, (2) Fix, (3) Track, (4) Target, (5) Engage (implying that a decision was made), and (6) Assess [2]. By optimizing the kill chain process, the OODA cycle can be accelerated, the closed loop of the strike can be shortened, and the operational response ability and efficiency can be improved.
The kill web, or effects web, extends the kill chain concept to distributed operations in a networked environment. The kill web combines different configurations of independent sensors, countermeasures, weapons, and decision elements through an open architecture. The force elements in the web are deployed and managed in a distributed manner, and they can be dynamically composed and recomposed to meet actual combat needs. Building on kill web research, DARPA STO supported the ACK program in 2018, which aims to assist users in selecting sensors, effectors, and support elements across military domains, so as to form and adapt kill webs that deliver desired effects on targets.
At present, the most representative example of the kill web is Mosaic Warfare, promoted by the U.S. military, in which the Department of Defense (DoD) pursues decision and information superiority by decomposing some of today’s monolithic multi-mission units into a larger number of smaller, more composable elements with fewer functions. The advantage is that any or all available resources (large or small, manned or unmanned) can be rapidly tailored to the requirement, adapt to dynamic threats, and remain resilient to losses and attrition. As shown in Fig. 2 [1], even the limited-function UAS depicted in the disaggregated force package would be capable of acting as sensors, decoys, or communications nodes, and they could change their roles during an operation. The development and practice of the kill web therefore mean that more and more UAS will be used cooperatively on the future battlefield.
Fig. 2 From kill chain to kill web [1]
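The idea that a kill web can be composed and recomposed from disaggregated elements can be illustrated with a toy sketch. Each node advertises which F2T2EA stages it can perform, and a chain is rebuilt from whatever nodes remain available. The `Node` class, the node names, the role sets, and the first-fit selection rule are all hypothetical simplifications introduced here for illustration; they are not taken from ACK or from Ref. [1].

```python
# Toy sketch of composing and recomposing a kill chain from a pool of nodes.
# Node names, role sets, and the first-fit selection rule are hypothetical.
from dataclasses import dataclass

# The six F2T2EA stages named in the text, in order.
STAGES = ("find", "fix", "track", "target", "engage", "assess")


@dataclass
class Node:
    name: str
    roles: frozenset  # stages this node is able to perform
    alive: bool = True


def compose_kill_chain(nodes):
    """Assign each stage to the first available node that can perform it.

    Returns a dict stage -> node name, or None if some stage cannot be covered.
    """
    chain = {}
    for stage in STAGES:
        candidates = [n for n in nodes if n.alive and stage in n.roles]
        if not candidates:
            return None
        chain[stage] = candidates[0].name
    return chain


def demo_nodes():
    """A small pool with overlapping roles, so that losses can be absorbed."""
    return [
        Node("uas_a", frozenset({"find", "fix", "track"})),
        Node("uas_b", frozenset({"find", "track", "assess"})),
        Node("uas_c", frozenset({"fix", "target", "engage"})),
        Node("uas_d", frozenset({"target", "engage", "assess"})),
    ]


if __name__ == "__main__":
    pool = demo_nodes()
    print(compose_kill_chain(pool))
    pool[0].alive = False          # lose one node, then recompose
    print(compose_kill_chain(pool))
```

Because roles overlap, losing `uas_a` still leaves every stage covered. A realistic selector would weigh geometry, timing, and link availability rather than taking the first match.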
3 UAS in Distributed Dynamic Cooperation
Collaboration or cooperation is a high-level, task-oriented intelligent activity. When tasks are complex and arduous and the ability of an individual is limited, they can be completed effectively through collaboration and cooperation among individuals. As important combat platforms in modern warfare, various unmanned systems have great potential in future distributed operations. Under the organization and coordination of the kill web, they can collaborate to perform nearly all kinds of tasks, such as reconnaissance, surveillance, strike, communication, decoy, jamming, and effectiveness evaluation. In addition, they can be combined with manned systems or other cross-domain heterogeneous unmanned systems to accomplish complex missions with better adaptation to the dynamic battlefield.
As early as the beginning of this century, the U.S. military put forward the concept of Manned-Unmanned Teaming (MUM-T). In 2003, Boeing and General Atomics Aeronautical Systems planned to integrate the command and control architecture for the Predator unmanned aerial vehicle into the E-3 Airborne Warning and Control System and the AH-64 Apache attack helicopter [4]. In 2016, the U.S. Air Force (USAF) released the “Small Unmanned Aircraft Systems (SUAS) Flight Plan: 2016–2036” [5]. This document clearly proposes and summarizes several typical concepts of operations, which describe how a force might employ the capabilities necessary to meet current and future military challenges, including Swarming, Teaming, and Loyal Wingman (Fig. 3).
Fig. 3 Swarming, Teaming, and Loyal Wingman comparison [5]
In the past two or three years, research on Loyal Wingman and Swarming Warfare has deepened further, and some concepts and ideas are gradually turning into reality. For example, great progress has been made in U.S. military programs such as SKYBORG and OFFSET.
In this paper, according to the different collaborators, the application of UAS in distributed dynamic cooperation is divided into three main styles, which are introduced and discussed briefly as follows.
3.1 Cooperation Between Unmanned and Manned Aircraft Systems
The United States Department of Defense (DoD) has emphasized its vision for unmanned systems in nearly all versions of the “Unmanned Systems Integrated Roadmap” released over the years. In these documents, DoD envisions unmanned systems seamlessly operating with manned systems to compress the warfighters’ decision-making process while reducing the risk to human life [6]. Unmanned aircraft systems have achieved a high degree of maturity, with significant levels of automation. However, they are still not autonomous and intelligent enough, and they can only be used as assistants and supplements to manned platforms. In this sense, the cooperation between unmanned and manned aircraft systems is one of the most promising applications in practice.
In this cooperation style, Manned-Unmanned Teaming, the operation of manned and unmanned assets in concert towards a shared mission objective, is becoming one of the key innovations that will pave the way to future airpower. As team members, diverse, smart, connected, and modular UAS integrated by a distributed network of intelligence will act as force multipliers for the manned aircraft, enhancing the team capabilities and decreasing the risk to humans in uncertain or hostile environments, while the human remains in control. Currently, multiple programs around the world are focusing on this objective, such as the USAF’s MUM-T experimentation initiative and the related SKYBORG AI program, Boeing’s Airpower Teaming System program and the Royal Australian Air Force’s parallel Trusted Wingman effort, the Royal Air Force’s TEMPEST/Lightweight Affordable Novel Combat Aircraft (LANCA) program, and Hindustan Aeronautics’ Combat Air Teaming System [7].
As far as control is concerned, the primary problem is how to match the capabilities of the manned and unmanned systems. It is necessary to dynamically switch or transfer control authority among the pilots, the auxiliary systems, and the autopilot systems through the variable autonomy of UAS. At the same time, the problem of mission-driven cooperative behavior control for manned and unmanned systems must be solved, which mainly includes modeling and reasoning to grasp and judge human intervention and tactical intention, the structural design of the cooperative behavior controller, and the stability analysis of multi-loop control for this hybrid large-scale system.
To realize information exchange and collaborative operation between heterogeneous systems, interoperability also requires great attention. Relevant studies and standards are available for reference. For example, STANAG 4586 [8] discusses the interoperability of UAS in detail: it not only identifies the interoperability architecture in Fig. 4, but also defines the well-known five Levels of Interoperability (LOI).
Fig. 4 UAS Interoperability Architecture [8]
3.2 Cooperation Among Multiple Unmanned Aircraft Systems
Cooperation among multiple UAS requires them to have a higher level of autonomy, as shown in Fig. 5 from Ref. [9] in 2017. Through unmanned teaming or collaboration, which is characterized by distributed functions, swarm intelligence, and decentralized communication, the advantages of group operations can be fully demonstrated in many mission scenarios, including cooperative ISR, saturation attack, SEAD/DEAD, and other tasks. For example, as early as ten years ago, a simulation carried out by the U.S. Naval Postgraduate School verified the effectiveness of a saturation suicide attack by a swarm of UAS against a Burke-class destroyer equipped with the Aegis system [10].
Fig. 5 Unmanned systems planned in air force science and technology program [9]
In this paper, according to their external form in cooperation, multiple UAS in flight are divided into formation and self-organized ones. The former has a regular shape, while the shape of the latter can change dynamically.
Formation flight is known as two or more aircraft traveling and maneuvering together in a disciplined, synchronized, predetermined manner. In a tight formation, such as is typically seen at an airshow, aircraft may fly less than several feet apart and must move in complete harmony. Formation flight control has become a hot research direction in recent years because of its wide applications in many near-future civil and military scenarios. Its main research topics include formation design and maintenance, dynamic formation adjustment and reconfiguration, and formation collision and obstacle avoidance. At present, several control methods are available to realize formation flight, especially Leader-Follower methods based on a hierarchical architecture [11], which have been verified and validated in many real flight tests.
Research on self-organized cooperation originates from social behaviors in biological groups, such as shoals of fish, swarms of bees, flocks of birds, and herds of sheep. It has shown that, through some simple local-interaction rules, globally coordinated self-organized behaviors can emerge in decentralized groups. Compared with formation flight control, most self-organized cooperative control is still at the theoretical stage [12–16], and the main research focuses on consensus control problems. From this perspective, formation flight can also be regarded as a special case or sub-problem of consensus control, as can flocking/swarming, rendezvous, containment, and coverage problems.
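The local-interaction rules behind consensus control can be made concrete with a minimal sketch. It uses the common discrete-time averaging update x_i ← x_i + ε Σ_j (x_j − x_i), where the sum runs over the neighbors of agent i. The four-agent line topology, the initial values, and the step size ε are assumptions chosen for illustration, and the code is not taken from Refs. [12–16].

```python
# Minimal sketch of discrete-time consensus on an undirected communication graph.
# Topology, initial values, and step size are illustrative assumptions.

def consensus_step(x, neighbors, eps):
    """One synchronous update: each agent moves towards its neighbors' values."""
    return [xi + eps * sum(x[j] - xi for j in neighbors[i])
            for i, xi in enumerate(x)]


def run_consensus(x0, neighbors, eps, steps):
    """Iterate the update; eps must stay below 1 / (maximum node degree)."""
    max_degree = max(len(adjacent) for adjacent in neighbors.values())
    if not 0.0 < eps < 1.0 / max_degree:
        raise ValueError("eps must satisfy 0 < eps < 1 / max_degree")
    x = list(x0)
    for _ in range(steps):
        x = consensus_step(x, neighbors, eps)
    return x


if __name__ == "__main__":
    # Four agents on a line: 0 - 1 - 2 - 3. Each agent only talks to adjacent ones.
    line_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    altitudes = [0.0, 10.0, 20.0, 50.0]   # e.g. initial altitude commands in m
    print(run_consensus(altitudes, line_graph, eps=0.3, steps=200))
```

On a connected undirected graph this update preserves the sum of the states, so all agents approach the average of the initial values (20 in the example). A formation can be obtained from the same rule by running consensus on x_i − d_i, where d_i is the desired offset of agent i, which is one way to read the remark that formation flight is a special case of consensus.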
Fig. 6 OFFSET vision of DARPA [19]
3.3 Cross-Domain Cooperation of Heterogeneous Platforms
In the cross-domain cooperation style, UAS and various other heterogeneous platforms can be seamlessly integrated to provide complementary capabilities, address shortcomings and weaknesses, improve efficiency and effectiveness, and establish advantages as a whole.
In 2016, the U.S. Army first proposed the concept of Multi-Domain Operations (MDO), and on 27 November 2018 it published the most recent version of the Army Operating Concept (AOC), titled “The US Army in Multi-Domain Operations, 2028”. MDO intends to achieve “rapid and continuous integration of all domains of warfare” to provide land forces with an advantage over adversaries during both competition and armed conflict [17]. In other words, “The ability to employ cross-domain fires provides options to commanders and builds resilience within the Joint Force to overcome temporary functional separation imposed by enemy anti-access and area denial systems” [18]. Currently, the concept of MDO has been widely accepted, with greater emphasis on cross-domain cooperation, in which UAS are certainly included.
Taking urban combat as an example, the DARPA project OFFSET, shown in Fig. 6, seeks innovative technologies that enable large-scale teams of air and ground robots to support small-unit forces operating in complex urban environments [19]. In OFFSET, the representative mission of Vignette 3 is to seize key urban terrain of approximately 8 square city blocks in 4–6 hours with more than 250 unmanned heterogeneous platforms and payloads in multiple domains.
4 Key Issues for Cooperative Control and Management
To build a distributed dynamic kill web based on UAS, cooperative control and management functions are essential. Several key issues that deserve more attention are introduced and discussed below.
4.1 Cooperation Architecture
Architecture is defined as “fundamental concepts or properties of a system in its environment embodied in its elements, relationships, and in the principles of its design and evolution” [20]. The architecture of a system reflects how it interacts with other systems and the outside world. It describes the interconnection of all the system’s components and the data links between them. Conventionally, the basic architectures for control and management systems are divided into centralized, distributed, and decentralized ones [21], as shown in Fig. 7.
Fig. 7 A comparison of centralized, distributed, and decentralized control [21]
For the cooperation of multiple UAS, the current tendency is to adopt a layered hierarchical architecture to research and design the complex control and management system [22, 23]. In the layered hierarchical design, the system is divided into several relatively decoupled function layers, such as the mission layer, decision-making layer, planning layer, and executive layer. Once the relationships between these layers are determined, each layer can be designed more flexibly, which reduces the design difficulty and improves the openness and configurability of this complex cooperation system.
This paper holds that, at present and in the near future, the cooperative control and management of UAS should be implemented in a hybrid architecture. For example, centralized control can be used for cooperation with high real-time requirements and strong interaction, while distributed cooperation can be adopted for individuals with low real-time requirements and less information interaction. On this basis, a certain degree of hierarchical control can also be used for overall coordination, to balance control costs and improve control effectiveness.
4.2 Information Interaction
Sufficient and effective information is the premise and primary condition for system control and management. For UAS in distributed operations, it is necessary to exchange and share multi-source information according to task or mission requirements. This task/mission-driven information interaction mainly involves humans, machines (including the control station and UAS), and the environment, as shown in Fig. 8. The triggered or periodic interaction among them can be roughly divided into three major classes.
Fig. 8 Information interaction in operation
First, as the operator and monitor of the whole system operation, the human has the highest control and management authority. Through the human-machine interface (HMI) provided by the control station, the flight and mission process of UAS can be controlled and monitored from a global view, so as to realize the control and management functions in a “human in-the-loop” or “human on-the-loop” manner.
Second, a data link, such as the widely used ad hoc network equipment, realizes remote control and telemetry between the control station and UAS. At the same time, multiple UAS in operation can be connected to exchange and share information.
Third, interaction with the environment and situation relies on subjective or objective human judgment and on various sensors. It realizes environmental perception and situation assessment and transmits the results to each agent in the operation, including humans and UAS.
To realize the information interaction described above, the key is to design a set of effective communication protocols that comprehensively consider the constraints of actual applications, such as communication data volume, frequency, bandwidth, delay, and range.
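The protocol constraints listed above (data volume, frequency, bandwidth, delay, range) can be checked against each other with simple arithmetic before any protocol detail is fixed. The sketch below is a back-of-the-envelope budget under stated assumptions: the message types, payload sizes, update rates, link rates, and the 50% utilization margin are hypothetical, and queuing, MAC contention, and header overhead are ignored.

```python
# Back-of-the-envelope link budget for a shared data link.
# Message sizes, rates, link speeds, and the utilization margin are hypothetical.
from dataclasses import dataclass

SPEED_OF_LIGHT = 299_792_458.0  # m/s


@dataclass(frozen=True)
class MessageType:
    name: str
    payload_bytes: int
    rate_hz: float


def offered_load_bps(messages, n_senders):
    """Total bit rate offered to the link when every sender emits every message type."""
    per_sender = sum(8 * m.payload_bytes * m.rate_hz for m in messages)
    return n_senders * per_sender


def one_hop_delay_s(payload_bytes, link_bps, range_m):
    """Serialization delay plus propagation delay for one hop (no queuing, no MAC)."""
    return 8 * payload_bytes / link_bps + range_m / SPEED_OF_LIGHT


def link_is_feasible(messages, n_senders, link_bps, max_utilization=0.5):
    """True if the offered load fits within the chosen fraction of the link rate."""
    return offered_load_bps(messages, n_senders) <= max_utilization * link_bps


if __name__ == "__main__":
    traffic = [MessageType("state", 64, 10.0), MessageType("task", 256, 1.0)]
    for link_bps in (250_000, 1_000_000):
        ok = link_is_feasible(traffic, n_senders=20, link_bps=link_bps)
        print(link_bps, "bit/s feasible for 20 UAS:", ok)
    print("one-hop delay at 30 km:", one_hop_delay_s(64, 1_000_000, 30_000.0), "s")
```

Even this crude estimate shows how the number of cooperating UAS and the update frequency trade off against bandwidth, which is one reason why the choice between periodic and triggered interaction matters.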
4.3 Relative Navigation and Positioning
In UAS cooperative flight, the accuracy and reliability of relative navigation and positioning information are of great significance. Typical application scenarios include autonomous landing, aerial refueling, formation flight, target location, and other task-driven cooperation, some of which even require centimeter-level accuracy.
Generally, there are two main ways to obtain relative navigation and positioning information: deriving it from absolute position information, and using relative measurement sensors. The former is represented by the inertial navigation system (INS), the global navigation satellite system (GNSS), and the INS/GNSS integrated system, while the latter includes data link, radar, etc. In some cases, the information obtained by the two methods can be fused to improve the accuracy of navigation and positioning.
Some navigation and positioning technologies and products have been developed for many years and are quite mature. For example, carrier-phase differential GPS is well known to achieve very high positioning accuracy. However, some problems remain to be solved in military applications. Take GNSS, the most widely used technology, as an example. GNSS is very likely to be masked, jammed, or spoofed under denied battlefield conditions, and thus to become ineffective or lose accuracy. In addition, there are many non-cooperative targets or objects on the battlefield, which cannot be localized by GNSS alone. Therefore, relative navigation and positioning in GNSS- and communication-denied environments has become a hot issue for UAS in cooperative operations.
In recent years, some new approaches have attracted a lot of research interest. One of them is computer vision based navigation, which is a very promising direction and has been applied to automated driving in the automobile industry. It is commonly held that more than 90% of the external information obtained by human beings comes from vision, and partly for this reason, great enthusiasm has been invested in computer vision for UAS [24–28]. Ultra-wideband (UWB) localization is another available technology that has gained attention recently [29–31]. A UWB signal is defined as an RF signal that occupies a portion of the frequency spectrum greater than 20% of the center carrier frequency, or that has a bandwidth greater than 500 MHz. UWB can be used for positioning by utilizing the time difference of arrival (TDOA) or the time of flight (TOF) to obtain the distance between the reference point and the target. Although these new technologies are not yet mature and still have various shortcomings, for example, computer vision is greatly affected by light conditions and the range of UWB is limited, they have the potential to be combined with other existing technologies so that the advantages of each can be fully exploited.
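The UWB definition and the TOF ranging principle described above reduce to a few lines of arithmetic. The sketch below checks the bandwidth criterion, converts a time of flight into a distance, and recovers a 2D position from three ranges by subtracting the circle equations. The function names, the band edges, the anchor layout, and the noise-free measurements are illustrative assumptions. TDOA positioning, which the text also mentions, is not covered here.

```python
# Sketch of the UWB bandwidth criterion and TOF-based 2D positioning.
# Function names, anchor layout, and noise-free ranges are illustrative assumptions.
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def is_uwb(f_low_hz, f_high_hz):
    """Criterion from the text: bandwidth > 500 MHz, or > 20 % of the center frequency."""
    bandwidth = f_high_hz - f_low_hz
    center = 0.5 * (f_low_hz + f_high_hz)
    return bandwidth > 500e6 or bandwidth / center > 0.20


def tof_distance(time_of_flight_s):
    """One-way time of flight to distance."""
    return SPEED_OF_LIGHT * time_of_flight_s


def two_way_ranging_distance(t_round_s, t_reply_s):
    """Two-way ranging: remove the responder's reply delay, then halve the round trip."""
    return SPEED_OF_LIGHT * (t_round_s - t_reply_s) / 2.0


def trilaterate_2d(anchors, ranges):
    """Position from exactly three anchors and ranges.

    Subtracting the first circle equation from the other two gives a 2x2 linear
    system in (x, y), solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = ranges
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


if __name__ == "__main__":
    print(is_uwb(3.1e9, 3.7e9))                      # 600 MHz wide band
    anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
    target = (3.0, 4.0)
    ranges = [math.dist(target, a) for a in anchors]
    print(trilaterate_2d(anchors, ranges))
```

With measurement noise or more than three anchors, the same linearized equations would normally be solved in a least-squares sense. A timing error of 1 ns already corresponds to about 0.3 m of range, which indicates why clock handling dominates practical UWB accuracy.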
4.4 Decision-Making and Planning
The main features of autonomous control systems are determined by the need to solve complex optimization problems in the face of uncertainty, in near real time, and without human intervention [32]. Decision-making and planning are exactly such complex optimization problems, and they best represent the autonomous capability level of UAS.
Decision-making is a process of selecting among feasible alternatives, which requires effective reasoning, evaluation, and prediction based on the acquired information and existing knowledge in order to obtain the final result. Decision-making in cooperative UAS mainly refers to target and behavior selection, such as continuous tactical maneuver decision-making in autonomous cooperative air combat.
Planning is essentially an optimization process that trades off a system’s own capabilities against costs. The planning problem of UAS mainly refers to mission planning, which generally includes task prioritization, task allocation/assignment, cooperative route/trajectory planning, payload planning, communication topology planning, and planning for support and emergencies. Planning can be divided into offline and online planning, as well as global and local planning. At present, much research focuses on task allocation and route planning [33–37], and a series of effective results have been achieved and are gradually being applied in engineering.
For UAS in cooperation, the challenges of decision-making and planning mainly come from the complexity, dynamics, and uncertainty of the environment and tasks. At the same time, the growing scale of the cooperative system also makes the problem harder to solve. For example, in the distributed decision-making of UAS cooperative control, each agent needs to consider not only the environment, the task, and the agent itself, but also the strategies that other agents may take, so the problem has a highly complex solution space. Moreover, as the number of cooperating individuals increases, the complexity of this optimization problem is likely to increase exponentially, making it difficult to solve.
Fig. 9 A mission system architecture for UAS [33]
The methods proposed to solve this kind of complex problem are generally based on the principle of “divide and conquer”, which involves formulating several smaller sub-problems, each of which is simpler and therefore easier to solve. The hierarchical architecture for intelligent control and loop-by-loop control based on singular perturbation theory both follow this idea. One simple solution architecture of this kind, from Ref. [33], is shown in Fig. 9.
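The growth in complexity and the appeal of decomposition can both be seen in a small task allocation example. The sketch below assigns one task to each UAS with a cost matrix, first by exhaustive search over all assignments and then by a greedy rule that decides one vehicle at a time. The cost values are made up for illustration, and neither routine is the method of Ref. [33].

```python
# Sketch: one-to-one task allocation by exhaustive search versus a greedy heuristic.
# The cost matrix is a made-up example (cost[u][t] = cost for UAS u to do task t).
from itertools import permutations
import math


def total_cost(cost, assignment):
    """Sum of costs when UAS u performs task assignment[u]."""
    return sum(cost[u][t] for u, t in enumerate(assignment))


def optimal_assignment(cost):
    """Exhaustive search over all n! assignments; only practical for small n."""
    n = len(cost)
    return min(permutations(range(n)), key=lambda p: total_cost(cost, p))


def greedy_assignment(cost):
    """Each UAS in turn takes the cheapest task that is still free."""
    free_tasks = set(range(len(cost)))
    assignment = []
    for u in range(len(cost)):
        task = min(sorted(free_tasks), key=lambda t: cost[u][t])
        assignment.append(task)
        free_tasks.remove(task)
    return tuple(assignment)


if __name__ == "__main__":
    cost = [[1, 2, 9],
            [1, 8, 9],
            [5, 6, 3]]
    best = optimal_assignment(cost)
    quick = greedy_assignment(cost)
    print("optimal:", best, total_cost(cost, best))
    print("greedy: ", quick, total_cost(cost, quick))
    print("assignments to search for 10 UAS:", math.factorial(10))
```

The greedy rule is cheap but can be far from optimal, as in this example, while the exhaustive search grows factorially with the number of vehicles. Practical planners sit between the two, for example by splitting the team into groups and solving each group separately.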
4.5 Guidance and Control
In the classical hierarchical architecture of the autonomous control system, guidance and control belong to the traditional “flight control” in the executive layer, which stabilizes and controls the vehicle platforms so that they perform specific tasks automatically. The most common tasks are trajectory tracking and path following, which usually serve as the basis of more complicated flight missions, such as formation flight and other cooperative operations.
There are many commonly used flight control methods for UAS, including classical control methods and advanced control designs based on exact or uncertain models [38]. Under distributed networked conditions, these control methods can be applied flexibly according to requirements to support UAS in completing cooperative flight and combat tasks in practice. From the perspective of functionality, this paper mainly considers high-precision tracking control, robust control, adaptive control, and fault-tolerant control, which are briefly introduced as follows.
High-precision command tracking control is a necessary guarantee for UAS to complete flights and missions. Especially in specific scenarios such as landing and formation flight, control accuracy plays a decisive role in the execution of the mission.
Robust control mainly addresses various uncertainties of the controlled plant, such as structural and parameter uncertainties, to ensure the robust stability and robust performance of the system.
Fig. 10 Conventional aircraft development process versus Learn-to-Fly concept [39]
Adaptive control can cope with large-scale dynamic changes of the controlled plant and provide satisfactory flight quality. The main future research and development directions include nonlinear adaptive control for large-envelope flight, mission adaptive control for multi-mission scenarios, and variable-configuration adaptive control for morphing or cross-domain aerial vehicles.
Fault-tolerant control is the basis for the system to work under fault conditions. The fault tolerance ability is mainly reflected in the self-diagnosis ability and the reconfiguration control ability during online operation of the system. Therefore, the main research topics of fault-tolerant control include fault detection, fault diagnosis, and fault isolation at the front end, and self-repairing control and reconfiguration control at the back end.
In addition, with the rapid development of computer, control, and communication science, several new methods have been proposed and studied, such as NASA’s “Learn-to-Fly (L2F)” approach [39–43], which combines real-time nonlinear aerodynamic modeling with autonomous control law design, as illustrated in Fig. 10. The L2F philosophy has three pillars that have been the focus of recent studies: (1) real-time nonlinear aerodynamic modeling; (2) “learning” control law design methods; and (3) guidance algorithms.
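The path following task named at the start of this section can be illustrated with a minimal guidance sketch. A vehicle is steered onto a straight path by commanding a heading that turns towards the path in proportion to the cross-track error. The vehicle model is an idealized kinematic point that reaches the commanded heading instantly, with no wind; the gain, speed, and geometry are assumptions chosen for illustration, and the law is a generic textbook form rather than a method from Ref. [38].

```python
# Sketch of straight-line path following for an idealized kinematic point vehicle.
# Gain, speed, time step, and geometry are illustrative assumptions.
import math


def cross_track_error(pos, a, b):
    """Signed distance from pos to the line through a and b (positive left of the path)."""
    path_angle = math.atan2(b[1] - a[1], b[0] - a[0])
    dx, dy = pos[0] - a[0], pos[1] - a[1]
    return -math.sin(path_angle) * dx + math.cos(path_angle) * dy


def commanded_heading(pos, a, b, gain):
    """Path direction, rotated towards the path by atan(gain * cross-track error)."""
    path_angle = math.atan2(b[1] - a[1], b[0] - a[0])
    return path_angle - math.atan(gain * cross_track_error(pos, a, b))


def simulate(pos, a, b, speed, gain, dt, steps):
    """Euler integration; the vehicle is assumed to track the commanded heading exactly."""
    x, y = pos
    for _ in range(steps):
        heading = commanded_heading((x, y), a, b, gain)
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
    return x, y


if __name__ == "__main__":
    start, path_a, path_b = (0.0, 50.0), (0.0, 0.0), (1000.0, 0.0)
    end = simulate(start, path_a, path_b, speed=20.0, gain=0.05, dt=0.1, steps=600)
    print("final position:", end)
    print("final cross-track error:", cross_track_error(end, path_a, path_b))
```

Far from the path the commanded heading approaches a perpendicular intercept, and close to it the error decays roughly exponentially. A real flight control system would add inner attitude loops, turn-rate limits, and wind compensation around such a guidance law.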
5 Conclusion
The development and application of UAS are changing rapidly, and related new concepts, theories and algorithms are emerging. Especially in the past decade, approaches for the distributed dynamic cooperation of multiple UAS have been widely researched and have gradually matured. This paper has presented a brief review, mainly from the perspective of control and management for UAS cooperative operations. The purpose is to sort out the relevant research and understanding, so as to clarify the main work of the next step, which will help future system design and engineering implementation.
Undoubtedly, UAS will play an important role in the operation of the future kill web, and the following trends need to be focused on to address the related control and management problems.
• More intelligent and autonomous unmanned platforms.
• More widely distributed deployment and application.
• More abundant and flexible cooperation.
• More emphasis on cognition and decision-making superiority.
• More emphasis on composable capability.
• More emphasis on dynamic change.
• More emphasis on attrition.
• Others.
The trends listed above pose new challenges to the key issues of cooperative control and management discussed in Sect. 4. These challenges lead to a series of related problems and research directions, which are of great importance and still very open. Further detailed and systematic research should be carried out in the future.
References 1. Clark, B., Patt, D., Schramm, H., Warfare, M.: Exploiting Artificial Intelligence and Autonomous Systems to Implement Decision-centric operations. Center for Strategy and Budgetary Assessments, Washington (2020) 2. Cheater, J.C.: Accelerating the Kill Chain via Future Unmanned Aircraft. USAF (2007) 3. Haystead, J.: DARPA’s mosaic warfare: moving to address the ever-more-rapidly-paced advances/changes in fielded threat capabilities. J. Electron. Def 43(2), 20–25 (2020) 4. BRIEFS - Manned/unmanned teaming. Jane’s Int Defense Rev 2003, 06 5. Small Unmanned Aircraft Systems (SUAS) Flight Plan: 2016–2036. USAF, 2016 6. Unmanned Systems Integrated Roadmap 2017-2042. DoD, 2018 7. Donaldson, P.: Manned-unmanned teaming–The true challenge. Military Tech. 45(2), 26–29 (2021) 8. Standard Interfaces of UAV Control System (UCS) for NATO UAV Interoperability: STANAG 4586 [S]. NATO, 2017 9. Blackhurst, J.L.: Integrity-service-excellence–air force science and technology program. Headquarters U.S, Air Force (2017) 10. Pham, L.: UAV Swarm Attack: Protection System Alternatives for Destroyers. Naval Postgraduate School, Monterey (2012) 11. Chen, H., Wang, X., Shen, L., et al.: Formation flight of fixed-wing UAV swarms: a group-based hierarchical approach. Chinese J. Aeronaut. 34(2), 504–515 (2021) 12. Jadbabaie, A., Lin, J., Stephen Morse, A.: Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans. Automat. Cont. 48(6), 988–1001 (2003) 13. Olfati-Saber, R., Murray, R.M.: Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans. Automat. Cont. 49(9), 1520–1533 (2004) 14. Ren, W., Beard, R.W.: Consensus seeking in multiagent systems under dynamically changing interaction topologies. IEEE Trans. Automat. Cont. 50(5), 655–661 (2005) 15. Ren, W.: Consensus strategies for cooperative control of vehicle formations. IET Cont. The. Appl. 1(2), 505–512 (2007) 16. 
Chen, Z., Mendes, A., et al.: Intelligent Robotics and Applications. Springer (2018)
17. Fawcett, G.S.: History of US army operating concepts and implications for multi-domain operations. U.S. Army Command and General Staff College, Fort Leavenworth (2019) 18. US Department of the Army: The US Army in Multi-Domain Operations, 2028. US Army Training and Doctrine Command, Fort Eustis (2018) 19. Chung, T.H.: Offensive Swarm-Enabled Tactics (OFFSET). DARPA (2021) 20. Systems and Software Engineering -Architecture Description. ISO/IEC/IEEE 42010-2011 (2011) 21. Fowler, J.M., D’Andrea, R.: A formation flight experiment. IEEE Cont. Syst. Magaz. 23(5), 35–43 (2003) 22. Boskovic, J.D., et al.: A Multi-layer control architecture for unmanned aerial vehicles. Am. Cont, Conf (2002) 23. Emel’yanov, S., et al.: Multilayer cognitive architecture for UAV control. Cognit. Syst. Res. 39, 58–72 (2016) 24. Courbon, J., et al.: Vision-based navigation of unmanned aerial vehicles. Cont. Eng. Pract. 18, 789–799 (2010) 25. RodiVetrella, A., Fasano, G.: Attitude estimation for cooperating UAVs based on tight integration of GNSS and vision measurements. Aeros. Sci. Tech. 84, 966–979 (2019) 26. Kakaletsis, E., et al.: Computer vision for autonomous UAV flight safety: an overview and a vision-based safe landing pipeline example. ACM Comput. Surv. 54(9), 1–37 (2021) 27. Couturier, A., Akhloufi, M.A.: A review on absolute visual localization for UAV. Robot. Autonom Syst 135, 1–17 (2021) 28. Qin, H., Meng, Z., et al.: Autonomous exploration and mapping system using heterogeneous UAVs and UGVs in GPS-denied environments. IEEE Trans. Veh. Tech. 68(2), 1339–1350 (2019) 29. Lazzari, F., Buffi, A., et al.: Numerical investigation of an UWB localization technique for unmanned aerial vehicles in outdoor scenarios. IEEE Sens. J. 17(9), 2896–2903 (2017) 30. Guo, K., Li, X., et al.: Ultra-wideband and odometry-based cooperative relative localization with application to multi-UAV formation control. IEEE Trans. Cybernet. 50(6), 2590–2603 (2020) 31. 
Xianjia, Y., Qingqing, L., Queralta, J.P., et al.: Cooperative UWB-based localization for outdoors positioning and navigation of UAVs aided by ground robots. In: IEEE International Conference on Autonomous Systems (ICAS). Montreal, QC, Canada (2021) 32. Pachter, M., Chandler, P.R.: Challenges of autonomous control. IEEE Cont. Syst. Magaz. 18(4), 92–97 (1998) 33. Bethke, B., Valenti, M., How, J.P.: UAV task assignment. IEEE Robot. Automat. Magaz. 15(1), 39–44 (2008) 34. Pilloni, V., Ning, H., Atzori, L.: Task allocation among connected devices: requirements, approaches, and challenges. IEEE Internet of Things J. 9(2), 1009–1023 (2022) 35. Peng, Q., Husheng, W., Xue, R.: Review of dynamic task allocation methods for UAV swarms oriented to ground targets. Complex Syst. Model. Simul. 1(3), 163–175 (2021) 36. Radmanesh, M., Kumar, M., et al.: Overview of path-planning and obstacle avoidance algorithms for UAVs: a comparative study. 6(2):1–44 (2018) 37. Tordesillas, J., Lopez, B.T., Everett, M., How, J.P.: FASTER: fast and safe trajectory planner for navigation in unknown environments. IEEE Trans. Robot. 38(2), 922–938 (2022) 38. Zuo, Z., Liu, C., Han, Q.-L., Song, J.: Unmanned aerial vehicles: control methods and future challenges. IEEE/CAA J. Automat. Sinica 9(4), 601–614 (2022) 39. Heim, E.H.D., Viken, E.M., Brandon, J.M., Croom, M.A.: NASA’s learn-to-fly project overview. NASA (2018) 40. Frost, S., Teubert, C., et al.: Online control design for learn-to-fly. NASA (2018) 41. Foster, J.V.: Autonomous guidance algorithms for NASA learn-to-fly technology development. NASA (2018) 42. Snyder, S.: Autopilot design with learn-to-fly. In: AIAA Scitech 2020 Forum. Orlando, FL (2020) 43. Riddick, S.E.: An overview of NASA’s learn-to-fly technology development. In: AIAA Scitech 2020 Forum. Orlando, FL (2020)
Research on Fractional Order Unidirectional Sliding Mode Control for Fixed-Rudder Two-Dimensional Correction Projectile Xin Lei, Jian Fu, Liangming Wang, Yuming Zhang, and Shouyi Guo
Abstract Owing to the complex and variable working environment of the projectile, the roll control strategy of the high-spin, tail-controlled correction projectile faces severe challenges, and it is difficult to design an efficient controller with conventional control theory. Based on the guidance and control principle, this paper proposes a new control model. A unidirectional sliding mode control (USMC) algorithm with a fractional order non-singular terminal sliding mode surface is designed using fractional order theory and terminal sliding mode. The simulation results demonstrate the effectiveness of the proposed roll control algorithm, presenting it as a promising approach to roll control. Keywords Fractional order · Terminal sliding mode · Precision-guided munitions · Unidirectional sliding mode
1 Introduction
The fixed rudder trajectory correction projectile has received significant attention in the field of long-range precision strike. This kind of projectile replaces the traditional fuze with a trajectory correction module, which offers a high efficiency-cost ratio and a precise attack capability. The Precision Guidance Kit (PGK) [1], proposed by the U.S. military in 2003 as a two-dimensional (2D) trajectory correction fuze, achieves 2D trajectory correction by controlling the roll angle of the fixed canards. There has been a lot of research on 2D trajectory correction projectiles with fixed canards. However, research on spin reduction and roll channel control algorithms for 2D trajectory correction projectiles with a fixed tail correction component is still relatively limited.
X. Lei · J. Fu (B) · L. Wang · Y. Zhang · S. Guo School of Energy and Power Engineering, Nanjing University of Science and Technology, 210094 Nanjing, People’s Republic of China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_61
For the fixed rudder trajectory correction projectile, accurate roll position control is essential for an effective strike. The choice of control method is crucial for enhancing the signal tracking performance of nonlinear time-varying control systems. Sliding mode variable structure control, as a special type of nonlinear control, offers advantages such as fast response and strong disturbance rejection. However, its practical application is limited by the chattering problem. To address this issue, researchers have explored various methods to suppress chattering, such as higher-order sliding mode methods [2], reaching law methods [3], and unidirectional sliding mode methods [4]. Fractional calculus operators, known for their memory and hereditary properties, have also played a significant role in sliding mode control. In [5], a fractional order integral sliding mode surface is designed, which combines the integral sliding mode surface with an observer to suppress the chaos phenomenon. However, the weakening of chattering is limited, because it can only be achieved by adjusting the parameters of the exponential reaching law. In [6], fractional PID is combined with sliding mode control and an exponential reaching law is proposed for the speed regulation system of a permanent magnet synchronous motor, which improves the response speed of the system and suppresses chattering. Building upon the literature mentioned above, this paper establishes a mathematical model for the tail roll control system of a high-spin guided projectile. A sliding mode surface combining terminal sliding mode with fractional order is designed. The control accuracy is improved while the strong robustness and stability of the sliding mode are preserved. The chattering is eliminated by the unidirectional sliding mode control algorithm. The organization of this paper is as follows. Section 2 establishes the trajectory control model. A new fractional order unidirectional sliding mode control (FOTUSMC) algorithm is proposed in Sect. 3. The simulation results are given in Sect. 4. Finally, the work is concluded in Sect. 5.
2 Model and Control Principles of High-Spin Tail-Controlled Correction Projectile
2.1 Layout of Fixed Control Surfaces
As shown in Fig. 1, the projectile is axially symmetrical and is composed of a projectile forebody and a guided tail aftbody. These two components are connected by a bearing, allowing for relative rotation. The guided tail features two pairs of fixed control surfaces: one pair of deflectors is responsible for reducing tail spin, while the other pair provides control forces. When the fixed control surfaces reach their assigned positions under the influence of bearing friction torque, control torque, turning torque, and damping torque, the co-directional control surfaces on the projectile generate transverse control forces and moments, enabling 2D correction. To generate control torque, an electric servo motor is mounted in the tail. The guided tail of the guided projectile is subject to a variety of moments during flight. In this paper, only the moments affecting the roll channel are considered. The structure of the guided tail, shown in Fig. 2, consists of the fixed rudder and the portion that rotates relative to the projectile forebody. The main moments acting on the roll channel are the tail deflection torque $M_{xw}$, the viscous damping torque $M_{xz}$, the bearing friction torque $M_f$, and the electromagnetic torque $M_e$. For a right-spinning projectile, the tail deflection torque is left-spinning, the viscous damping torque is opposite to the absolute rotational speed of the guided tail, and, since the rotational speed of the projectile forebody is higher than that of the guidance components, the bearing friction torque is right-spinning.
Fig. 1 Model of High-Spin Tail-Controlled Correction Missile
Fig. 2 Rear view of the correction components
2.2 Fixed Control Surface Roll Angle Control Model
The ballistic model for the high-spin tail-controlled correction projectile is similar to the traditional six degrees of freedom model for a projectile. The projectile consists of a forebody and an aftbody, which are separate from each other and can rotate with respect to each other. With an additional degree of freedom for the roll channel, the differential equations describing the rotation of the seven degrees of freedom projectile around its center of mass are as follows [7]:
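$$
\begin{cases}
\dfrac{d\omega_{f\xi}}{dt} = \dfrac{1}{C_f}\left(M_{f\xi} - M_{s\xi} - M_e\right) \\
\dfrac{d\omega_{a\xi}}{dt} = \dfrac{1}{C_a}\left(M_{xw\xi} + M_{axz\xi} + M_{s\xi} + M_e\right) \\
\dfrac{d\omega_{\eta}}{dt} = \dfrac{1}{A}M_{\eta} - \dfrac{1}{A}\left(C_a\omega_{a\xi} + C_f\omega_{f\xi}\right)\omega_{\zeta} + \omega_{\zeta}^{2}\tan\varphi_2 \\
\dfrac{d\omega_{\zeta}}{dt} = \dfrac{1}{A}M_{\zeta} + \dfrac{1}{A}\left(C_a\omega_{a\xi} + C_f\omega_{f\xi}\right)\omega_{\eta} - \omega_{\eta}\omega_{\zeta}\tan\varphi_2 \\
\dfrac{d\gamma_f}{dt} = \omega_{f\xi} - \omega_{\zeta}\tan\varphi_2 \\
\dfrac{d\gamma_a}{dt} = \omega_{a\xi} - \omega_{\zeta}\tan\varphi_2 \\
\dfrac{d\varphi_a}{dt} = \dfrac{\omega_{\zeta}}{\cos\varphi_2} \\
\dfrac{d\varphi_2}{dt} = -\omega_{\eta}
\end{cases}
\tag{1}
$$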
In Eq. (1), $M_{f\xi}$ is the projection of the moment on the $\xi$ axis of the projectile body, $M_{s\xi}$ is the friction moment, $M_{xw\xi}$ is the rolling moment, $M_{axz\xi}$ is the spin-damping moment, and $M_e$ is the electromagnetic moment exerted on the guided tail; $\gamma$ is the roll angle and $\omega$ is the spin rate. According to the seven degrees of freedom ballistic model, the control model for the guided tail is as follows:
$$
\begin{cases}
\dfrac{d\gamma_a}{dt} = \omega_{a\xi} - \omega_{\zeta}\tan\varphi_2 \\
\dfrac{d\omega_{a\xi}}{dt} = \dfrac{1}{C_a}\left(M_{xw\xi} + M_{axz\xi} + M_{s\xi} + M_e\right)
\end{cases}
\tag{2}
$$
where $\gamma_a$ is the roll angle; $\omega_{a\xi}$ is the roll angular velocity; $\omega_{\zeta}$ is the projection of the rotational speed on the projectile’s axis in the projectile coordinate system; and $\varphi_2$ is the azimuth angle of the projectile axis. To facilitate controller design, according to Eq. (2), define $x_1 = \gamma_a$, $x_2 = \omega_{a\xi}$, $f_1(x_1) = -\omega_{\zeta}\tan\varphi_2$, $f_2(x_2) = \left(M_{xw\xi} + M_{axz\xi} + M_{s\xi}\right)/C_a$, $u = M_e$, $g_1(x_1) = 1$, and $g_2(x_2) = 1/C_a$. The nonlinear equations of the fixed control surface roll angle loop and roll angular rate loop can then be written in the unified affine nonlinear form:
$$
\begin{cases}
\dot{x}_1 = f_1(x_1) + g_1(x_1)\,x_2 \\
\dot{x}_2 = f_2(x_2) + g_2(x_2)\,u
\end{cases}
\tag{3}
$$
After receiving the target command, the system generates a control command based on the 7-DOF ballistic model and the control algorithm established using the fixed control surface roll channel model. This control command aims to ensure that the system can quickly and stably track the target command even in the presence of time-varying nonlinearities and saturation nonlinearities.
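The affine form of Eq. (3) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name, the argument names and the choice to pass the summed moment `m_sum` in from outside are assumptions made here, not part of the original model code.

```python
import math


def roll_channel_derivatives(x1, x2, u, omega_zeta, phi2, m_sum, c_a):
    """Right-hand side of the affine roll model, Eq. (3).

    x1: tail roll angle gamma_a; x2: tail roll rate omega_a_xi.
    u: control torque applied to the guided tail.
    m_sum: M_xw + M_axz + M_s, assumed to be supplied by the caller.
    c_a: axial moment of inertia of the aftbody (hypothetical argument name).
    """
    f1 = -omega_zeta * math.tan(phi2)  # f1(x1) in Eq. (3)
    f2 = m_sum / c_a                   # f2(x2) in Eq. (3)
    g1, g2 = 1.0, 1.0 / c_a            # input gains g1(x1), g2(x2)
    return f1 + g1 * x2, f2 + g2 * u
```

With zero coupling and zero aerodynamic moments the function reduces to a double integrator scaled by `1 / c_a`, which is a convenient sanity check before the full ballistic terms are plugged in.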
3 Fractional Order Unidirectional Sliding Mode Controller
3.1 Controller Design
In this paper, the following assumptions are made to facilitate controller design. Additionally, the definitions, theorems, and lemmas related to fractional-order systems and their stability proofs are introduced.
Assumption 3.1 The target value to be tracked is assumed to be bounded. The actuator, under normal operating conditions, can reliably provide the control torque calculated by the controller, ensuring stable operation.
Assumption 3.2 The system parameters are assumed to be known or can be accurately estimated.
Lemma 3.1 [8] If a positive definite, continuously differentiable function $V(x)$ satisfies the following conditions:
$$
\begin{cases}
\dot{V}(x) + \eta\left(V(x)\right)^{\gamma} \le 0 \\
\dot{V}(x) < 0, \quad x \in R^{n}
\end{cases}
\tag{4}
$$
where $\eta > 0$ and $0 < \gamma < 1$, then the system globally converges in finite time, and the convergence time satisfies
$$
t(x_0) \le \frac{V(x_0)^{1-\gamma}}{\eta\,(1-\gamma)}
\tag{5}
$$
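As a numerical illustration of Lemma 3.1, the sketch below integrates the limiting case $\dot{V} = -\eta V^{\gamma}$ with a forward Euler scheme and compares the time at which $V$ reaches zero with the bound of Eq. (5). The function names, the step size and the tolerance are assumptions chosen for this illustration.

```python
def settling_time_bound(v0, eta, gamma):
    """Upper bound on the convergence time from Eq. (5)."""
    return v0 ** (1.0 - gamma) / (eta * (1.0 - gamma))


def simulate_reach_time(v0, eta, gamma, dt=1e-4, tol=1e-8):
    """Forward Euler integration of dV/dt = -eta * V**gamma until V <= tol."""
    v, t = v0, 0.0
    while v > tol:
        # Clamp at zero so that the fractional power stays well defined.
        v = max(v - dt * eta * v ** gamma, 0.0)
        t += dt
    return t
```

For the equality case the simulated reaching time should sit close to the bound; any $V$ that decays faster than $-\eta V^{\gamma}$ reaches zero earlier.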
Lemma 3.2 [9] Because of the nonzero initial conditions, this paper adopts the Caputo definition of the fractional order derivative:
$$
{}_{t_0}D_t^{\alpha} f(t) = \frac{1}{\Gamma(m-\alpha)}\int_{t_0}^{t}\frac{f^{(m)}(\tau)}{(t-\tau)^{\alpha-m+1}}\,d\tau
\tag{6}
$$
where $f(t) \in C^{n}\left([t_0, +\infty], R\right)$, $0 < \alpha < 1$, and $\Gamma(\cdot)$ is the gamma function.
Lemma 3.3 [10] If a fractional order system described by
$$
D^{\alpha}x(t) = f(x,t), \quad \alpha \in (0,1),\ x(t) \in R
\tag{7}
$$
satisfies the condition
$$
x(t)\,f(x,t) < 0, \quad \forall x \ne 0
\tag{8}
$$
then $x = 0$ is the equilibrium point of Eq. (7), and it is asymptotically stable.
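The Caputo derivative of Lemma 3.2 has to be discretized before it can be used in a simulation. The following sketch uses the standard L1 scheme on a uniform grid for $0 < \alpha < 1$; the choice of this scheme and the function name are assumptions made here, since the paper does not state which discretization it uses.

```python
import math


def caputo_l1(samples, h, alpha):
    """L1 approximation of the Caputo derivative of order alpha (0 < alpha < 1).

    samples: values f(t_0), f(t_0 + h), ..., f(t_0 + n*h) on a uniform grid.
    Returns the approximation at the last grid point.
    """
    n = len(samples) - 1
    total = 0.0
    for k in range(n):
        # L1 weights b_k = (k + 1)^(1 - alpha) - k^(1 - alpha)
        weight = (k + 1) ** (1.0 - alpha) - k ** (1.0 - alpha)
        total += weight * (samples[n - k] - samples[n - k - 1])
    return total * h ** (-alpha) / math.gamma(2.0 - alpha)
```

A quick check is $f(t) = t$ with $t_0 = 0$, for which the Caputo derivative is $t^{1-\alpha}/\Gamma(2-\alpha)$ and the L1 scheme is exact up to rounding.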
Lemma 3.4 [11] Assume that $x = 0$ is the equilibrium point of the following system
$$
D^{\alpha}x(t) = f(x,t), \quad \alpha \in (0,1]
\tag{9}
$$
If there exists a continuously differentiable Lyapunov function $V(t, x(t))$ that satisfies the Lipschitz condition, and
$$
a_1\left\|x(t)\right\| \le V(t, x(t)) \le a_2\left\|x(t)\right\|
\tag{10}
$$
$$
\dot{V}(t, x(t)) \le -a_3\left\|x(t)\right\|
\tag{11}
$$
where $t > 0$ and $a_1, a_2, a_3$ are given arbitrary constants, then the system (9) is Mittag-Leffler stable.
To ensure that the tracking error converges to zero in finite time, to avoid the singularity problem, and to improve the speed and stability of the system, this paper designs a terminal sliding mode surface in integral form and employs fractional-order techniques to mitigate chattering. First, define the roll error of the fixed rudder loop as $e = x_1 - x_d$; then
$$
\dot{e} = f_1(x_1) + g_1(x_1)\,x_2 - \dot{x}_d
\tag{12}
$$
Since the system (3) has a relative order of 2, the introduction of a fractional-order sliding surface can attenuate the system chattering [12]. Inspired by [13], the following fractional-order integral terminal sliding surfaces are designed:
$$
\begin{cases}
s_1 = D^{1-\alpha}\dot{e} + c_1 D^{-\alpha}\left(\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})\right) \\
s_2 = D^{1-\alpha}\dot{e} + c_2 D^{-\alpha}\left(\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})\right)
\end{cases}
\tag{13}
$$
where $c_1, c_2$ are the sliding surface gains to be designed, $0 < \alpha < 1$, and the other parameters satisfy the following conditions:
$$
\begin{cases}
\lambda \in (0,1) \\
\gamma = \dfrac{2\lambda}{1+\lambda} \\
\xi_1, \xi_2 > 0
\end{cases}
\tag{14}
$$
Analysis of the fractional order terminal sliding surface in Eq. (13) shows that the two nonlinear terms $|e|^{\lambda}\,\mathrm{sgn}(e)$ and $|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})$ ensure that the system converges to the equilibrium point in finite time. To demonstrate the stability of the sliding mode surface in this paper, according to the analysis of Lemma 3.1, when the system state reaches the sliding mode surface,
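$$
\begin{cases}
D^{1-\alpha}\dot{e} + c_1 D^{-\alpha}\left(\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})\right) = 0 \\
D^{1-\alpha}\dot{e} + c_2 D^{-\alpha}\left(\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})\right) = 0
\end{cases}
\tag{15}
$$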
Theorem 3.1 When $\xi_1, \xi_2$ satisfy $\xi_2 > \xi_1$, the following holds:
$$
-\dot{e}\,c_i D^{-\alpha}\left(\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})\right) < 0, \quad \forall e \ne 0,\ i = 1, 2
\tag{16}
$$
Proof First, because $\xi_2 > \xi_1$, $\gamma = \frac{2\lambda}{1+\lambda}$ and $\lambda \in (0,1)$, we have $\xi_2|e|^{\gamma} > \xi_1|e|^{\lambda}$. Assuming that $\dot{e} < 0$, then $\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e}) < 0$, and therefore Eq. (16) holds. Next, assuming that $\dot{e} > 0$, then $\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e}) > 0$, and therefore Eq. (16) holds. Thus, according to Lemma 3.1, the sliding surface corresponding to Eq. (13) is asymptotically stable and the system state converges to the equilibrium point.
In order to map the coordinates of the fractional order non-singular terminal sliding surface, the variables $z_1$ and $z_2$ are defined as follows:
$$
\begin{cases}
z_1 = D^{1-\alpha}\dot{e} \\
z_2 = D^{-\alpha}\left(\xi_1|e|^{\lambda}\,\mathrm{sgn}(e) + \xi_2|\dot{e}|^{\gamma}\,\mathrm{sgn}(\dot{e})\right)
\end{cases}
\tag{17}
$$
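The nonlinear term that sits inside the fractional integral $D^{-\alpha}$ in Eqs. (13) and (17) can be evaluated as in the sketch below, with $\gamma$ derived from $\lambda$ as required by Eq. (14). The function name and the parameter check are illustrative assumptions; the fractional integral itself is not included.

```python
import math


def terminal_term(e, e_dot, lam, xi1, xi2):
    """xi1*|e|^lam*sgn(e) + xi2*|e_dot|^gamma*sgn(e_dot), gamma = 2*lam/(1 + lam)."""
    if not 0.0 < lam < 1.0 or xi1 <= 0.0 or xi2 <= 0.0:
        raise ValueError("Eq. (14) requires 0 < lam < 1 and xi1, xi2 > 0")
    gamma = 2.0 * lam / (1.0 + lam)
    # copysign applies the sign of the error to the power of its magnitude.
    return (xi1 * math.copysign(abs(e) ** lam, e)
            + xi2 * math.copysign(abs(e_dot) ** gamma, e_dot))
```

Because both exponents are below one, the term stays continuous at $e = \dot{e} = 0$, although its slope there is unbounded, which is what drives the finite-time behaviour.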
The FOTUSMC phase plane of the roll angle control system is shown in Fig. 3. The four points $P_{s1+}$, $P_{s1-}$, $P_{s2+}$, $P_{s2-}$ on the sliding mode surfaces define the four corresponding fractional-order terminal unidirectional auxiliary surfaces $h_0$, $h_1$, $h_2$, $h_3$. Once these points are selected, the auxiliary surfaces can be written in the following simplified form:
$$
h_i = \omega_{1i}z_1 + \omega_{2i}z_2 + m_i, \quad i = 0, 1, 2, 3
\tag{18}
$$
where $m_i > 0$, and $\omega_{1i}$, $\omega_{2i}$ are the coefficients of the unidirectional auxiliary surfaces located in the different subspaces. They are selected as follows:
$$
\omega_{1i} =
\begin{cases}
\omega_{10}, & s_1 < 0,\ s_2 \ldots \\
\omega_{11}, & s_1 < 0,\ s_2 \ldots \\
\omega_{12}, & s_1 \ge 0,\ s_2 \ldots \\
\omega_{13}, & s_1 \ge 0,\ s_2 \ldots
\end{cases}
\qquad
\omega_{2i} =
\begin{cases}
\omega_{20}, & s_1 < 0,\ s_2 \ldots \\
\omega_{21}, & s_1 < 0,\ s_2 \ldots \\
\omega_{22}, & s_1 \ge 0,\ s_2 \ldots \\
\omega_{23}, & s_1 \ge 0,\ s_2 \ldots
\end{cases}
$$
Among them, $d_a$ is the influence range of the target point gravity, and $\beta$ is the gravity compensation coefficient. The expression for the improved repulsion potential field function is:
$$
U_{rep}(X) =
\begin{cases}
\dfrac{1}{2}K_{rep}\left(\dfrac{1}{d(X, X_{obs})} - \dfrac{1}{d_0}\right)^{2} d^{m}(X, X_{goal})\,d^{n}(X, X_l), & d(X, X_{obs}) \le d_0 \\
0, & d(X, X_{obs}) > d_0
\end{cases}
\tag{6}
$$
When there are multiple obstacles near the target point, the potential field at the target point is no longer the global minimum under the traditional repulsive potential field function, so this paper improves that function. The improvement introduces a distance factor $d(X, X_{goal})$ into the repulsion potential field function to increase the impact of the distance between the UAV and the target point on the repulsion. After this modification, the potential at the target point is always the minimum of the entire spatial potential field, which solves the problem of the UAV being unable to reach the target point when multiple obstacles lie near it. At the same time, another distance factor $d(X, X_l)$ is introduced, where $X_l$ is a temporary target point chosen as a random point perpendicular to the line between the current position and the obstacle and located on the side of the target point. It helps the UAV leave the high potential field near obstacles faster, and it also handles the special situations in which the UAV falls into a local minimum.
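The improved repulsive potential can be evaluated directly from the three distances. The sketch below is an illustration under stated assumptions: the exponent `m` is attached to the goal distance and `n` to the temporary-target distance, following the convention of Eqs. (9)–(11), and the function and argument names are chosen here for readability.

```python
import math


def repulsive_potential(x, x_obs, x_goal, x_l, k_rep, d0, m, n):
    """Improved repulsive potential with goal and temporary-target distance factors."""
    d_obs = math.dist(x, x_obs)
    if d_obs > d0:
        return 0.0  # outside the influence range of the obstacle
    if d_obs == 0.0:
        raise ValueError("UAV position coincides with the obstacle")
    d_goal = math.dist(x, x_goal)
    d_l = math.dist(x, x_l)
    return 0.5 * k_rep * (1.0 / d_obs - 1.0 / d0) ** 2 * d_goal ** m * d_l ** n
```

With `m > 0` the potential vanishes at the goal even when an obstacle lies within `d0` of it, which is the property the distance factor is meant to provide.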
UAV Path Planning Based on …
The gravitational function $F_{att}(X)$ is defined as the negative gradient of the gravitational potential field function:
$$
F_{att}(X) = -\nabla U_{att}(X) =
\begin{cases}
K_{att}\,d(X, X_{goal}), & d(X, X_{goal}) \le d_a \\
\left[K_{att}\,d_a + \beta\left(d(X, X_{goal}) - d_a\right)\right]\dfrac{X_{goal} - X}{d(X, X_{goal})}, & d(X, X_{goal}) > d_a
\end{cases}
\tag{7}
$$
Applying the chain rule to the repulsion potential field function in Eq. (6) gives the repulsion function shown in Eqs. (8)–(11):
$$
F_{rep}(X) = -\nabla U_{rep}(X) =
\begin{cases}
F_1(X) + F_2(X) + F_3(X), & d(X, X_{obs}) \le d_0 \\
0, & d(X, X_{obs}) > d_0
\end{cases}
\tag{8}
$$
$$
F_1(X) = \frac{m}{2}K_{rep}\left(\frac{1}{d(X, X_{obs})} - \frac{1}{d_0}\right)^{2} d^{n}(X, X_l)\,d^{m-1}(X, X_{goal})\,\frac{\partial d(X, X_{goal})}{\partial X}
\tag{9}
$$
$$
F_2(X) = K_{rep}\left(\frac{1}{d(X, X_{obs})} - \frac{1}{d_0}\right)\frac{d^{n}(X, X_l)\,d^{m}(X, X_{goal})}{d^{2}(X, X_{obs})}\,\frac{\partial d(X, X_{obs})}{\partial X}
\tag{10}
$$
$$
F_3(X) = \frac{n}{2}K_{rep}\left(\frac{1}{d(X, X_{obs})} - \frac{1}{d_0}\right)^{2} d^{n-1}(X, X_l)\,d^{m}(X, X_{goal})\,\frac{\partial d(X, X_l)}{\partial X}
\tag{11}
$$
The F1 (X ), F2 (X ), and F3 (X ) in Eq. (8) are shown in Eqs. (9–11), respectively. The partial derivatives in F1 (X ), F2 (X ), and F3 (X ) are unit vectors that point in the direction of the connecting line of the corresponding distance. The improved repulsive force increases the impact of the distance between the UAV and the target point, ensuring that the repulsive force received by the UAV when approaching the target point is less than the gravitational force, enabling the UAV to reach the target point; and when the UAV is in the local minimum state when collinear with the obstacle and the target point, F3 (X ) can help the UAV get rid of the current local minimum state and finally reach the target point.
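For comparison with the repulsive terms, the piecewise attractive force of Eq. (7) can be sketched as a vector function. Two points are assumptions of this sketch rather than statements from the paper: the force is taken to point from the UAV towards the goal in both branches, and the bracketed magnitude in the outer branch is read as $K_{att}d_a + \beta(d - d_a)$.

```python
import math


def attractive_force(x, x_goal, k_att, d_a, beta):
    """Attractive force vector: linear in distance inside d_a, compensated outside."""
    d = math.dist(x, x_goal)
    if d == 0.0:
        return tuple(0.0 for _ in x)  # already at the goal
    if d <= d_a:
        magnitude = k_att * d
    else:
        magnitude = k_att * d_a + beta * (d - d_a)
    # Unit vector from the current position towards the goal.
    return tuple(magnitude * (g - p) / d for p, g in zip(x, x_goal))
```

Both branches give the same magnitude at $d = d_a$, so the force stays continuous across the boundary of the influence range.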
3.2 Optimization of Artificial Potential Field by Inertial Force and Smooth Force
Inertial Force. Although this paper improves the traditional potential field function so that the UAV avoids falling into a local minimum in most cases, in some complex multi-obstacle environments the forces acting on the UAV are complicated enough that it can still get stuck in a local minimum. Therefore, this
Y. Ma and S. Li
study introduces an inertial term to keep the UAV in its previous motion state, thereby helping it jump out of the local minimum to a certain extent. At the same time, the inertial force is more in line with the way a UAV moves in practice, and it can also reduce unnecessary oscillations in path planning, thereby shortening the planned path length to a certain extent. The inertial force is defined as follows:
$$
F_{ine} = K_{ine}\left(V_{cur} - V_{pre}\right)
\tag{12}
$$
where $K_{ine}$ is the inertial force gain constant, and $V_{cur}$ and $V_{pre}$ are the velocity vectors at the current moment and the previous moment, respectively. By introducing the inertia term into the motion equation of the UAV, the UAV acquires a certain motion inertia, so that it can jump out of a local optimum and find a better path. In addition, this improvement is more in line with the motion laws of an actual UAV.
Smooth Force. To improve the trajectory of the UAV and reduce energy loss, this paper introduces a smoothing force, which makes the flight path of the UAV smoother, the movement more stable, and closer to the real flight situation. The smoothing force is defined in terms of the curvature between adjacent points on the UAV’s path. When the curvature of the UAV path increases, the smoothing force also increases, making the planned path smoother. The smoothing force is defined as follows:
$$
F_{smo} = K_{smo}\frac{d^{2}p}{ds^{2}}
\tag{13}
$$
where $K_{smo}$ is the smoothing factor and $\frac{d^{2}p}{ds^{2}}$ is the curvature between path points. Introducing the smoothing force in path planning makes the UAV path smoother, reduces the impact force and moderately shortens the total length of the planned path.
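Both auxiliary forces are simple to evaluate on a discrete path. The sketch below is an illustration under an assumption made here: the second derivative in Eq. (13) is approximated by a central second difference over three consecutive waypoints, with the mean segment length used as the arc-length step. The paper does not specify its discretization, and the function names are hypothetical.

```python
import math


def inertial_force(v_cur, v_pre, k_ine):
    """Inertial force of Eq. (12): gain times the change in velocity."""
    return tuple(k_ine * (c - p) for c, p in zip(v_cur, v_pre))


def smoothing_force(p_prev, p_cur, p_next, k_smo):
    """Smoothing force of Eq. (13) using a central second difference of waypoints."""
    ds = 0.5 * (math.dist(p_prev, p_cur) + math.dist(p_cur, p_next))
    if ds == 0.0:
        return tuple(0.0 for _ in p_cur)  # degenerate case: repeated waypoint
    return tuple(k_smo * (a - 2.0 * b + c) / ds ** 2
                 for a, b, c in zip(p_prev, p_cur, p_next))
```

On a straight, evenly spaced segment the second difference vanishes, so the smoothing force only acts where the path bends.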
3.3 Position Update Strategy for UAVs
The position update strategy in the traditional artificial potential field method directly calculates the next position of the UAV from the magnitude and direction of the resultant force. In this study, the UAV velocity is introduced so that the position is updated indirectly: the position update depends on the current speed of the UAV, and the change of the speed depends on the resultant force acting on the UAV. This makes the position updates more realistic, which improves the reliability and accuracy of path planning. The UAV position update strategy is divided into three steps:
Fig. 5 Force situation of the UAV under the improved APF
1. Calculate the resultant force at the current moment. In the improved potential field method, the UAV is no longer affected only by the gravitational and repulsive forces, but by the joint action of four forces: gravitational force, repulsive force, smooth force and inertial force. These four force vectors combine to form a resultant force, which makes the motion of the UAV in the potential field more realistic than the simple motion under the traditional potential field method. The resultant force acting on the UAV under the improved potential field method is illustrated in Fig. 5, and its expression is:
$$
F_{total} = F_{att} + \sum_{i=1}^{n} F_{rep} + F_{smo} + F_{ine}
\tag{14}
$$
where $n$ is the number of obstacles.
2. Calculate the speed at the next moment. To update the velocity vector of the UAV, the resultant force vector at the current moment is multiplied by a normalization parameter so that the resultant force and the velocity vector are of the same order of magnitude. The result is added to the velocity vector at the current moment to obtain the velocity vector at the next moment. The speed update formula of the UAV is therefore:
$$
V_{next} = V_{cur} + \eta F_{total}
\tag{15}
$$
where $V_{cur}$ and $V_{next}$ are the velocity vectors at the current moment and the next moment, respectively, the parameter $\eta$ determines the step size of the UAV and thereby regulates its velocity, and $F_{total}$ is the resultant force vector.
3. Calculate the position at the next moment. The position at the next moment is obtained by adding the velocity vector to the position vector at the current moment:
$$
P_{next} = P_{cur} + V_{cur}
\tag{16}
$$
where $P_{cur}$ and $P_{next}$ are the current and next position vectors, respectively.
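The three steps above can be sketched as a single update function. This is a minimal illustration, assuming that the individual force vectors have already been computed and are passed in as a non-empty list; the function name is hypothetical, and the speed limit $V_{max}$ is deliberately left out because its handling is not described in this section.

```python
def update_state(p_cur, v_cur, forces, eta):
    """One planning step following Eqs. (14)-(16).

    forces: non-empty list of force vectors (attractive, repulsive,
    smoothing, inertial) acting on the UAV at the current moment.
    """
    # Eq. (14): component-wise sum of all force vectors.
    f_total = tuple(sum(components) for components in zip(*forces))
    # Eq. (15): velocity update scaled by the normalization parameter eta.
    v_next = tuple(v + eta * f for v, f in zip(v_cur, f_total))
    # Eq. (16): position update using the current velocity.
    p_next = tuple(p + v for p, v in zip(p_cur, v_cur))
    return p_next, v_next
```

Calling the function in a loop, and feeding `v_next` back in as the current velocity, reproduces the indirect position update: a force changes the velocity first and only affects the position one step later.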
3.4 Parameter Optimization Based on NSGA-II
The artificial potential field algorithm has many parameters that need to be initialized, and many of them have a great influence on the result of path planning. In traditional algorithms, these parameters are usually set manually based on experience, which may cause the algorithm to miss the global optimal solution in the current environment. To this end, we improve the performance and stability of the algorithm by optimizing the initialization parameters, so as to obtain the best combination of initialization parameters for the current environment. In this study, the genetic-algorithm-based multi-objective optimization algorithm NSGA-II is used to generate new candidate solutions, and the optimal solution set is selected through non-dominated sorting and crowding distance. The NSGA-II algorithm optimizes five initialization parameters of the artificial potential field algorithm: the gravitational gain coefficient, the repulsive force gain coefficient, the maximum influence range of obstacles, the inertial force gain coefficient and the smoothing force gain coefficient. These parameters are treated as variables with defined value ranges, and the planned path length and the UAV flight time are chosen as evaluation functions. The output of the evaluation function is a vector in which each dimension corresponds to an optimization objective. Optimizing these parameters with NSGA-II yields the best combination of initialization parameters and hence a better path, which improves the performance and stability of the algorithm.
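The non-dominated filtering at the heart of NSGA-II can be sketched as follows for the two objectives named above, path length and flight time, both to be minimized. This is only the dominance test and a first-front filter, written here for illustration; it is not a full NSGA-II implementation (no crowding distance, crossover or mutation), and the function names are assumptions.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the objective vectors that are not dominated by any other vector."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

Each returned vector corresponds to one candidate parameter set; choosing among them is a trade-off between the objectives rather than a single best score.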
To find the best combination of parameters, the algorithm executes using an evaluation function to create and evaluate different combinations of parameters, with a higher score indicating a better solution. When choosing the best combination of parameters, the points on the Pareto front are usually chosen from the resulting solutions. The Pareto front is a solution that can no longer be achieved by improving one goal without losing the others. In the final vector obtained from the Pareto front, the value of each dimension represents our optimized parameter value. The initialization parameter settings of the improved artificial potential field method after being optimized by the NSGA-II algorithm are shown in Table 1. In order to verify the effectiveness of the NSGA-II algorithm in optimizing the initialization parameters of the improved artificial potential field method, a comparative experiment was carried out. In the experiment, 10 obstacle coordinate positions were randomly generated first, and then the path planning results obtained by the improved method using unoptimized parameters and the improved method using
Table 1 Comparison of parameters before and after optimization
Parameter    Before optimization    After optimization
$K_{att}$    30                     33.173
$K_{rep}$    20                     23.546
$K_{ine}$    20                     48.846
$K_{smo}$    0.01                   0.005
$d_0$        1                      0.861
Table 2 Comparison of path length before and after optimization
Experiment number    Before parameter optimization    After parameter optimization
1                    21.875                           19.348
2                    18.547                           17.216
3                    19.564                           17.771
4                    26.648                           23.467
5                    25.136                           21.456
6                    16.198                           16.379
7                    28.975                           22.183
8                    20.275                           19.154
9                    23.772                           22.377
10                   32.478                           27.474
NSGA-II optimized parameters were compared. Table 2 shows that the path planned with the optimized parameters is shorter in nine of the ten experiments (experiment 6 is the exception, with a slightly longer path), and the average path length is reduced by 11.4%. This shows that the NSGA-II algorithm effectively improves the performance of the artificial potential field method.
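The 11.4% figure can be reproduced from the ten path lengths in Table 2. The short check below assumes the figure is the relative reduction of the mean path length (rather than the mean of the per-experiment reductions); the variable names are illustrative.

```python
# Path lengths from Table 2, experiments 1 to 10.
before = [21.875, 18.547, 19.564, 26.648, 25.136,
          16.198, 28.975, 20.275, 23.772, 32.478]
after = [19.348, 17.216, 17.771, 23.467, 21.456,
         16.379, 22.183, 19.154, 22.377, 27.474]

mean_before = sum(before) / len(before)
mean_after = sum(after) / len(after)

# Relative reduction of the mean path length, in percent.
reduction_pct = 100.0 * (mean_before - mean_after) / mean_before

# Number of experiments in which the optimized parameters gave a shorter path.
shorter = sum(a < b for a, b in zip(after, before))

print(f"mean before: {mean_before:.3f}, mean after: {mean_after:.3f}")
print(f"reduction: {reduction_pct:.1f}%, shorter in {shorter} of {len(before)} runs")
```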
4 Path Planning Results and Result Analysis To verify the effectiveness of the improved algorithm, this section sets up comparative simulation experiments in three scenarios, which share the following parameters: the starting point of the UAV at (0, 0, 0), the target point at (10, 10, 10), the gravitational compensation coefficient $\beta = 0.1$, the gravitational influence range $d_a = 8$, and a maximum UAV speed of $V_{max} = 2$. The number of obstacles and their coordinates are set according to the needs of the simulation experiment in each scenario. Four common obstacle shapes are selected: cube, triangle, octahedron and sphere. Although the local minimum problem is solved by improving the repulsive
potential field function, drones may still be trapped when encountering certain special obstacles, such as concave obstacles. Therefore, this article limits obstacles to the four shapes mentioned above. The NSGA-II algorithm is used to optimize the values of five parameters: the gravitational gain coefficient $K_{att}$, the repulsive force gain coefficient $K_{rep}$, the maximum influence range of obstacles $d_0$, the inertial force gain coefficient $K_{ine}$ and the smoothing force gain coefficient $K_{smo}$. Before each experiment, these five parameters are optimized by the NSGA-II algorithm to obtain the optimal parameter combination for the environment at hand.
4.1 Scenario 1 The simulation experiment in this scenario verifies whether the improved potential field algorithm proposed in this paper can help the UAV escape the local minimum state. The traditional artificial potential field algorithm and the improved algorithm are simulated and compared. Only one obstacle is set, at coordinates (3, 3, 3). Before the experiment, the initialization parameters are optimized with the NSGA-II algorithm, resulting in the following values: $K_{att} = 14.651$, $K_{rep} = 10.710$, $d_0 = 0.532$, $K_{ine} = 34.794$, and $K_{smo} = 0.004$. As Fig. 6 shows, when the UAV, obstacle, and target point lie on a straight line in space, the UAV under the traditional artificial potential field method comes to rest in front of the obstacle, falls into a local minimum state, and cannot reach the target point. With the improved artificial potential field method proposed in this paper, the force balance is broken, and the UAV escapes the local minimum state, bypasses the obstacle and successfully reaches the target point, as shown in Fig. 7.
Fig. 6 The path planned by the traditional APF falls into a local minimum state
Fig. 7 The path planned by the improved APF gets rid of the local minimum state
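Why the collinear layout of Scenario 1 traps the traditional method can be seen from the direction of the net force. The sketch below is only an illustration: it assumes the textbook potential field (linear attraction, Khatib-style repulsion inside the range $d_0$), not the improved functions of this paper, and it borrows the Scenario 1 gains purely as example numbers. The function name `net_force` is hypothetical.

```python
import math

# Illustrative values: gains and d0 borrowed from the Scenario 1 parameter set,
# used here in the textbook formulas, not in the paper's improved functions.
K_ATT, K_REP, D0 = 14.651, 10.710, 0.532
GOAL = (10.0, 10.0, 10.0)
OBSTACLE = (3.0, 3.0, 3.0)

def net_force(pos):
    """Classical APF force: linear attraction plus repulsion within D0."""
    force = [K_ATT * (g - p) for p, g in zip(pos, GOAL)]
    d = math.dist(pos, OBSTACLE)
    if 0.0 < d <= D0:
        # Magnitude of the repulsive force, directed from the obstacle to the UAV.
        mag = K_REP * (1.0 / d - 1.0 / D0) / d**2
        force = [f + mag * (p - o) / d for f, p, o in zip(force, pos, OBSTACLE)]
    return force

on_line = net_force((2.8, 2.8, 2.8))   # UAV on the start-obstacle-goal line
off_line = net_force((2.8, 2.8, 3.0))  # UAV nudged off that line
print(on_line)
print(off_line)
```

On the line, the three force components are equal, so the net force has no lateral component and the UAV can only move along the line until attraction and repulsion cancel. Off the line the components differ, which is the kind of asymmetry an additional force term needs to supply for the UAV to pass around the obstacle.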
4.2 Scenario 2 This simulation comparison evaluates the effectiveness of the improved artificial potential field method proposed in this paper in solving the problem of target inaccessibility, which is caused by dense obstacle clusters around the target point. As in the previous scenario, the traditional artificial potential field algorithm is compared with the proposed improved algorithm. The experiment features three obstacles positioned at (8.3, 7.5, 8), (7, 6, 6.5), and (6, 8, 8.2). The initialization parameters optimized by the NSGA-II algorithm are: $K_{att} = 27.866$, $K_{rep} = 7.416$, $d_0 = 1.057$, $K_{ine} = 19.640$, $K_{smo} = 0.003$. The defects of the traditional artificial potential field method are clearly visible in Fig. 8: the UAV encounters multiple obstacles near the target point, which leads to local oscillation and excessive repulsion, and it ultimately fails to reach the target. In contrast, the proposed improved method overcomes these problems with a combination of the modified repulsion potential field, inertial force, and smoothing force, resulting in a safer and smoother path to the target point (Fig. 9).
4.3 Scenario 3 The comparative experiment in this scenario aims to show that the improved artificial potential field algorithm proposed in this paper is superior to the traditional artificial potential field algorithm and an existing improved artificial potential field method in terms of computational efficiency, planned path length and UAV flight time. To this end, the improved algorithm proposed in this paper is compared with the traditional artificial potential field algorithm and the
Fig. 8 Path planning results of traditional APF
Fig. 9 Path planning results of improved APF
artificial potential field algorithm improved in other articles through simulation experiments. In this experiment, six obstacles were set up at the coordinates (1.2, 1.8, 2), (8.2, 8.2, 9.6), (2.5, 7.5, 7.5), (8.5, 4.5, 5.4), (7, 5, 2) and (4.7, 6, 5). The initialization parameters of the improved artificial potential field algorithm proposed in this paper are obtained by NSGA-II optimization: $K_{att} = 38.467$, $K_{rep} = 37.612$, $d_0 = 0.671$, $K_{ine} = 79.481$, $K_{smo} = 0.003$. Figure 10 shows the path planning result of the traditional artificial potential field method. As can be seen from the figure, when the UAV reaches the vicinity of the target point, it falls into a local minimum state and cannot reach the target point. The improved algorithms shown in Figs. 11 and 12 did not fall into a local minimum state, and both reached the target point. Figure 11 shows the artificial potential field algorithm improved in other articles: when the UAV approaches the target point, the planned path backtracks noticeably. In Fig. 12, the path planned using the improved algorithm in this article is significantly
Fig. 10 The path planning results of traditional APF
Fig. 11 The path planning results of the improved APF proposed in another paper
Fig. 12 The path planning results of the improved APF proposed by this paper (from mid-air)
Fig. 13 The path planning results of the improved APF proposed by this paper (from the ground)
Table 3 Algorithm effect comparison
Index                        Traditional APF    Other improved APF    The improved APF proposed by this paper
Voyage                       ∞                  15.975                14.523
UAV flight time              ∞                  7.81                  6.86
Algorithmic planning time    ∞                  0.207                 0.087
smoother, without significant backtracking. Figures 12 and 13 differ only in the departure point: Fig. 12 shows the planned path for a UAV departing from mid-air, and Fig. 13 shows the planned path for a UAV departing from the ground. Table 3 reports the experimental comparison of the traditional artificial potential field algorithm, the improved algorithm proposed in other articles, and the improved algorithm proposed in this article. With the traditional artificial potential field method, the UAV cannot reach the target point and hovers around the local minimum, so no valid solution exists; the corresponding indices are therefore recorded as infinite. The results show that the improved artificial potential field method proposed in this paper is superior to the traditional artificial potential field method and the other improved algorithm in terms of algorithm planning time, voyage (the distance flown by the UAV to the destination), and UAV flight time.
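To put the Table 3 comparison in relative terms, the following sketch computes the percentage improvement of the proposed method over the other improved APF for each index. The percentages are derived here from the tabulated values and are not reported in the paper; the dictionary keys are illustrative names.

```python
# Values from Table 3 (units as in the table, which does not state them).
other = {"voyage": 15.975, "flight_time": 7.81, "planning_time": 0.207}
proposed = {"voyage": 14.523, "flight_time": 6.86, "planning_time": 0.087}

# Relative improvement of the proposed method, in percent, per index.
improvement = {
    key: 100.0 * (other[key] - proposed[key]) / other[key] for key in other
}

for key, pct in improvement.items():
    print(f"{key}: {pct:.1f}% lower than the other improved APF")
```

Under this reading the largest gain is in planning time, while the voyage and flight time improve by roughly a tenth.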
5 Conclusion This paper analyzes the principle of the traditional artificial potential field method and its existing problems, and proposes solutions. The local minimum problem is solved by modifying the repulsion potential field function. The problem of excessive gravity is solved by setting the gravitational compensation coefficient and the maximum influence range of the gravitational force. The motion jitter of the UAV is reduced by introducing a smoothing force, and an inertial force adjustment is introduced to prevent the UAV's movement direction from falling into a local optimal state. Through parameter optimization with the multi-objective optimization algorithm NSGA-II, a better combination of initialization parameters is obtained, which addresses the problem that the optimal planning result cannot be obtained when the initialization parameters are set manually from experience. Finally, a series of simulation experiments confirmed the feasibility and effectiveness of the improved artificial potential field method proposed in this paper and showed that it performs better than the other improved artificial potential field algorithm, indicating that the proposed method can be applied to UAV path planning in complex 3D environments.
Research on Stiffness Design Basis and Dynamic Response of Series Elastic Actuator Xiubo Xia, Yuqiao Cheng, Yongling Fu, and Jian Sun
Abstract Unlike a traditional arm joint, a series elastic actuator (SEA) incorporates an elastomer, whose stiffness has a great impact on system performance. To improve the design process of SEA and provide a design basis for elastomer stiffness, this paper proposes a design method that determines the upper and lower limits of elastomer stiffness based on dynamic and buffering properties. The experimental results show that the method can effectively guide the selection and confirmation of elastomer stiffness in SEA design. Keywords Series elastic actuator · Elastomer buffer · Dynamic response
1 Introduction Cooperative robots have been widely developed. The joint force sensors currently used in cooperative robots are mostly high-stiffness strain gauge force sensors, which suffer from problems such as temperature drift and zero drift. Elastic force sensors are therefore gradually being introduced into the joints of manipulator arms to form a series elastic actuator (SEA), which is characterized by an elastic device connected in series with the driving source. The actuator achieves accurate force control and has the advantages of low impedance, small volume, high energy density, stable force output and good buffering against external impact loads [1]. Series elastic actuators are now widely used in various cooperative robots. Because SEA offers advantages in robot applications that traditional driving devices lack, many scholars have turned their attention to SEA research in recent years, and as this research deepens, the applications of SEA are becoming increasingly extensive. At present, SEAs are used in walking robots [2], robot arms that contact complex external environments [3], rehabilitation robots [4] and
X. Xia (B) · Y. Cheng · Y. Fu · J. Sun Beihang University, 37 Xueyuan Road, Haidian District, Beijing, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_63
robotic exoskeletons [5]; typical robot examples are Istituto Italiano di Tecnologia's COMAN [6, 7] and Centauro [8]. Zhang et al. established the mechanical model and control strategy of SEA [9], and verified its control performance and stability through experiments. The control strategy can reduce the shock when the legs of the robot are subjected to an external impact, improving the safety and motion control performance of the robot. Moltedo et al. proposed a SEA with adjustable stiffness [10], which can be used to achieve safety in human-robot interaction. By incorporating an adjustable stiffness device into the SEA, the robot can be made more flexible and safer when interacting with humans. Park et al. established a SEA dynamics model based on the Lagrange method [11], analyzed in detail the dynamic characteristics of each component in the SEA and the influence of the coupling between the spring and the motor on the dynamic response of the system, and verified this influence through simulation and experiment. Lin et al. analyzed the influence of motors with a high torque-to-inertia ratio on the dynamic performance of series elastic actuators [12], established the dynamic model of linear drivers, analyzed the influence of the system's undamped resonant frequency on the dynamic performance, and proposed an impedance control method for stability control. The results of that study provide a useful reference for the dynamic response analysis and control of SEA. This paper analyzes the influence of elastomer stiffness on the dynamic and buffering properties of SEA, and proposes a design method that determines the upper and lower limits of elastomer stiffness based on these properties, which provides theoretical support for SEA design. A dynamic performance test is also carried out to confirm the reliability of the dynamic performance estimation method.
2 Design Basis of Elastic Body Stiffness To give full play to the buffering characteristics of the elastic joint of the manipulator, it is necessary to analyze the buffering effect of the elastomer on the impact torque of the joint when the joint is impacted. In a traditional mechanical arm, an impact is transmitted directly to the harmonic reducer in the joint and loads it severely. The elastomer introduced into the joint in this paper can effectively reduce the impact torque on the harmonic reducer. The upper limit of the stiffness of the joint elastomer can therefore be obtained from the requirement that the impact torque of the manipulator does not exceed the maximum allowable torque of the harmonic reducer. The joint flexibility of the manipulator also affects the overall dynamic response of the manipulator [13]. When the joint stiffness decreases, the dynamic response performance of the system decreases accordingly. Therefore, the joint stiffness of the manipulator should not be too small, which gives the lower limit.
2.1 Analysis of Flexible Buffer First, the upper limit of the stiffness of the elastic body is obtained by analyzing its flexible buffering effect. The elastic body can be approximately regarded as a torsion spring. According to Hooke's law $T_s = k_s\theta_s$, the energy $E$ that the spring can absorb during the impact is
$$E = \int_{\theta_1}^{\theta_2} dE = k_s \int_{\theta_1}^{\theta_2} \theta_s \, d\theta_s = -\frac{1}{2} k_s \theta_1^2 + \frac{1}{2} k_s \theta_2^2$$
where $T_s$ is the torque received, $k_s$ is the stiffness of the torsion spring, and $\theta_1$ and $\theta_2$ are the angles of the torsion spring before and after it is stressed. In particular, when $\theta_1$ is 0, $E = \frac{1}{2} k_s \theta_2^2 = \frac{T_2^2}{2k_s}$, where $T_2 = k_s\theta_2$ is the spring torque at $\theta_2$. It can be concluded that, for a given torque, the smaller the stiffness of the spring, the more energy it can absorb and the better its buffering effect on the system. Taking a humanoid 7-DOF manipulator as an example, a model of the manipulator was established in the dynamic analysis software Adams to analyze the impact torque of the elbow joint when elastomers with different stiffness were used. The impact model is as follows: the mechanical arm is in its normal working posture, each joint is in the standby state, the end load is 5 kg, and the end of the mechanical arm falls freely by 10 cm under the influence of gravity (Fig. 1). Without an elastomer installed in the elbow joint, the terminal impact induces an impulse torque of approximately 793 Nm, which vastly exceeds the instantaneous maximum permissible torque of the CSG20 harmonic reducer selected by the project team, which is only 182 Nm. Such excessive torque would inevitably damage the harmonic reducer. The elastomer stiffness was set to 150 Nm/deg, 100 Nm/deg, 75 Nm/deg, 67.5 Nm/deg, and 50 Nm/deg in turn. The simulated elbow torque curves during the fall and impact are shown in Fig. 2. When the elastomer stiffness was 67.5 Nm/deg, the maximum impact torque was reduced to 178.78 Nm,
Fig. 1 Impact process of free fall of mechanical arm
Fig. 2 Cushioning torque curve of elbow joint
which is less than the instantaneous maximum allowable torque of the harmonic reducer, so the harmonic reducer is effectively protected. Using the same method, the impact torques of the shoulder joint and wrist joint are analyzed. With maximum peak torques of the harmonic reducers of 388 Nm and 67 Nm, respectively, the upper limit of the elastomer stiffness is 263 Nm/deg for the shoulder joint and 28 Nm/deg for the wrist joint.
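The relation $E = T^2 / (2k_s)$ from this section can be evaluated for the five stiffness values simulated for the elbow joint. The sketch below is a simple illustration under the ideal torsion-spring assumption: it computes the energy stored when the spring torque reaches the 182 Nm limit of the harmonic reducer. The function name is hypothetical, and the resulting energies are not values reported in the paper.

```python
import math

T_LIMIT = 182.0  # Nm, instantaneous maximum permissible torque of the reducer

def stored_energy(torque_nm: float, stiffness_nm_per_deg: float) -> float:
    """Energy (J) in an ideal torsion spring loaded to torque_nm: E = T^2 / (2 k)."""
    k_rad = stiffness_nm_per_deg * 180.0 / math.pi  # convert Nm/deg to Nm/rad
    return torque_nm**2 / (2.0 * k_rad)

# Stiffness values (Nm/deg) used in the elbow-joint simulation.
energies = {k: stored_energy(T_LIMIT, k) for k in (150.0, 100.0, 75.0, 67.5, 50.0)}

for k, e in energies.items():
    print(f"k = {k:5.1f} Nm/deg -> E = {e:.2f} J at {T_LIMIT:.0f} Nm")
```

The output decreases monotonically with stiffness, which matches the conclusion that a softer elastomer absorbs more impact energy before the torque limit is reached.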
2.2 Dynamic Response Analysis To analyze the dynamic response of the series elastic actuator, its mathematical model must first be established. The abstract physical model of the SEA is as follows (Figs. 3 and 4). The torque on both sides of the spring is the same, $\tau_r = \tau_s$, and the end angle of the spring is the same as the load angle, $\theta_s = \theta_l$.
Fig. 3 Abstract physical model of SEA
Fig. 4 SEA mathematical model
System input is motor torque $\tau_m$ and load torque $\tau_l$, and system output is $\theta_l$. Firstly, according to Newton's second law, the relationship between the motor and the reducer is established as follows:
$$N(\tau_m - c_m\dot{\theta}_m - J_m\ddot{\theta}_m) = \tau_r \quad (1)$$
Then the force balance equation between the spring and the load is established:
$$J_l\ddot{\theta}_l = k_s(\theta_r - \theta_l) + c_s(\dot{\theta}_r - \dot{\theta}_l) - \tau_l \quad (2)$$
Applying the Laplace transform to the above formulas and collating gives
$$\theta_m = \frac{T_m - T_r/N}{J_m s^2 + c_m s} \quad (3)$$
$$J_l s^2\theta_l = (k_s + c_s s)(\theta_r - \theta_l) - T_l \quad (4)$$
With these equations, the mathematical model of the system is established (Fig. 4). The dynamic performance of the system was simulated in Amesim, and the Amesim simulation model is shown in Fig. 5. The end response of the system was controlled in closed loop using a PID controller. The command displacement curve was a sine wave with a frequency of 0.5 Hz and an amplitude of 1 rad. The motor was modeled as an ideal force source with a maximum output torque of 0.4 Nm, a reduction ratio of 100, and an end rotational inertia of 0.4 kg m². Taking into account the parameters of the designed wrist joint components, the elastic stiffness was varied between 200 Nm/rad and 450 Nm/rad, and the end was loaded with 5 Nm. The resulting dynamic response of the system is shown in Fig. 6. When the elastic stiffness was less than 350 Nm/rad, the system jittered noticeably and the dynamic performance declined. Therefore, the lower limit of the elastic stiffness of the wrist joint can be determined as 350 Nm/rad, i.e. 6.1 Nm/deg.
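Because the upper limit in Sect. 2.1 is stated in Nm/deg and the simulation here uses Nm/rad, a small conversion helper makes the stiffness window for the wrist joint explicit. This is a bookkeeping sketch only; the function names are hypothetical, and the 14 Nm/deg candidate is the design value quoted in Sect. 3.

```python
import math

def rad_to_deg_stiffness(k_nm_per_rad: float) -> float:
    """Convert a torsional stiffness from Nm/rad to Nm/deg."""
    return k_nm_per_rad * math.pi / 180.0

def deg_to_rad_stiffness(k_nm_per_deg: float) -> float:
    """Convert a torsional stiffness from Nm/deg to Nm/rad."""
    return k_nm_per_deg * 180.0 / math.pi

LOWER = rad_to_deg_stiffness(350.0)  # lower limit from the dynamic response
UPPER = 28.0                         # upper limit from the buffering analysis

def in_window(k_nm_per_deg: float) -> bool:
    """True if the stiffness lies between the lower and upper design limits."""
    return LOWER <= k_nm_per_deg <= UPPER

print(f"lower limit: {LOWER:.2f} Nm/deg, upper limit: {UPPER:.1f} Nm/deg")
print(f"14 Nm/deg = {deg_to_rad_stiffness(14.0):.1f} Nm/rad, "
      f"inside window: {in_window(14.0)}")
```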
Fig. 5 Amesim simulation model of SEA
Fig. 6 SEA dynamic corresponding curve
3 Experimental System Design and Test According to Sect. 2, the stiffness of the wrist elastomer should be between 6.1 and 28 Nm/deg, so the designed elastomer stiffness is 14 Nm/deg, i.e. approximately 800 Nm/rad. Figure 7 shows the elastomer, machined from titanium alloy TC-4, which has a high yield strength and can withstand a maximum torque of 50 Nm. The elastic body was tested on a static loading test bench: the measured stiffness is about 820 Nm/rad, the linear regression of the stiffness gives R = 0.9987, and the linearity of the stiffness is good. Figure 8 shows the elastomer dynamic response test platform, which is used to test the dynamic response of the system and to verify the accuracy and effectiveness of the simulation model used in the stiffness design basis. The experimental platform includes a driver, an elastomer, encoders and a load. The driver is an eRob80I joint with a sustainable output torque of 34 Nm, and the encoder is an ecoder35 19-bit absolute encoder. Two encoders are installed: one between the inner and outer rings of the elastomer and one at the end of the system. The encoder between the inner and outer rings measures the relative angle of the elastomer, from which the output torque can be calculated. The encoder at the end measures the absolute position of the end for closed-loop control. In the experiment, the end effector was controlled in closed loop to follow a sine wave position command with a frequency of 0.5 Hz and an amplitude of 1 rad. The end effector was loaded with two 0.5 kg loads, and in static conditions the joint can withstand a gravity load of 6 Nm. The end rotational inertia was approximately 0.25 kg m². Figure 9 shows the sine motion process of the system, during which the end encoder and command signal were sampled to obtain the dynamic response curve and elastomer torque curve of the system. The experimental curve shows that the dynamic response of the system is similar to that of the simulation, and the results of the dynamic response evaluation are satisfactory (Fig. 10).
Fig. 7 Elastomer used for testing
Fig. 8 Elastomer dynamic response test platform
4 Summary and Outlook This article analyzes the influence of elastomers on the system in a SEA, and proposes a stiffness design basis for the elastomer in SEA. The stiffness design upper limit is based on the protective effect of the elastomer in the joint on the harmonic drive reducer. The stiffness design lower limit is based on the dynamic response of the joint after introducing the elastic element. This method can effectively guide the
Fig. 9 Dynamic response experiment process
Fig. 10 Dynamic joint response
design process of SEA. In addition, the dynamic response of a system designed with the proposed method was tested; the results showed that the dynamic response met the requirements and that the design basis for the elastomer stiffness was practical. However, there were still deviations between the dynamic response curve in simulation and that in the experiment. In the future, a more detailed mathematical and physical model of SEA can therefore be established for dynamic response simulation, for example by considering the rotational inertia of the elastomer itself and the dynamic characteristics of the motor. The coupling between elastomers in different joints and its influence on the overall dynamic response of the robotic arm can also be considered to further constrain and optimize the design basis for the elastomer stiffness.
References
1. Negrello, F., et al.: A modular compliant actuator for emerging high performance and fall-resilient humanoids. In: 2015 IEEE-RAS 15th International Conference on Humanoid Robots (Humanoids), pp. 414–420 (2015). https://doi.org/10.1109/HUMANOIDS.2015.7363567
2. Werner, A., Lampariello, R., Ott, C.: Trajectory optimization for walking robots with series elastic actuators. In: 53rd IEEE Conference on Decision and Control, Dec 2014, pp. 2964–2970. IEEE, Los Angeles, CA, USA. https://doi.org/10.1109/CDC.2014.7039845
3. Gu, X., Wang, K., Cheng, T., Zhang, X.: Mechanical design of a 3-DOF humanoid soft arm based on modularized series elastic actuator. In: 2015 IEEE International Conference on Mechatronics and Automation (ICMA), Aug 2015, pp. 1127–1131. IEEE, Beijing. https://doi.org/10.1109/ICMA.2015.7237644
4. Lagoda, C., Schouten, A.C., Stienen, A.H.A., et al.: Design of an Electric Series Elastic Actuated Joint for Robotic Gait Rehabilitation Training. IEEE Computer Society, Tokyo, Japan (2010)
5. Ning, C., Li, Y., Feng, K., Gong, Z., Zhang, T.: SoochowExo: a lightweight hip exoskeleton driven by series elastic actuator with active-type continuously variable transmission. In: 2022 IEEE International Conference on Mechatronics and Automation (ICMA), Aug 2022, pp. 1432–1437. https://doi.org/10.1109/ICMA54519.2022.9855957
6. Tsagarakis, N.G., Li, Z., Saglia, J., Caldwell, D.G.: The design of the lower body of the compliant humanoid robot "cCub". In: Robotics and Automation (ICRA), 2011 IEEE International Conference, pp. 2035–2040. IEEE (2011)
7. Baccelliere, L., et al.: Development of a human size and strength compliant bi-manual platform for realistic heavy manipulation tasks. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Sep 2017, pp. 5594–5601. https://doi.org/10.1109/IROS.2017.8206447
8. Klamt, T., et al.: Remote mobile manipulation with the Centauro robot: full-body telepresence and autonomous operator assistance. J. Field Robot. 37(5), 889–919 (2020). https://doi.org/10.1002/rob.21895
9. Zhang, T., Tran, M., Huang, H.: Admittance shaping-based assistive control of SEA-driven robotic hip exoskeleton. IEEE/ASME Trans. Mechatron. 24(4), 1508–1519 (2019). https://doi.org/10.1109/TMECH.2019.2916546
10. Moltedo, M., et al.: Variable stiffness ankle actuator for use in robotic-assisted walking: control strategy and experimental characterization. Mech. Mach. Theory 134, 604–624 (2019). https://doi.org/10.1016/j.mechmachtheory.2019.01.017
11. Park, Y., Paine, N., Oh, S.: Development of force observer in series elastic actuator for dynamic control. IEEE Trans. Ind. Electron. 65(3), 2398–2407 (2018). https://doi.org/10.1109/TIE.2017.2745457
12. Lin, K.-Y., Chung, C.-C., Lan, C.-C.: Improving the dynamic force control of series elastic actuation using motors of high torque-to-inertia ratios. IEEE Access 8, 6968–6977 (2020). https://doi.org/10.1109/ACCESS.2020.2963885
13. Yu, H., Sun, Q., Wang, C., Zhao, Y.: Frequency response analysis of heavy-load palletizing robot considering elastic deformation. Sci. Prog. 103(1) (2020). https://doi.org/10.1177/0036850419893856
Dynamic Estimation of Loads on Wind Turbine Blades Based on Sensor Optimization and Kalman Filter Hang Chen, Shanbi Wei, Yu Wang, and Yi Chai
Abstract Monitoring the moments at all nodes on blades in real time enables the safe operation of wind turbines and improves the efficiency of wind farms. As the size of blades grows, existing methods for estimating the moments on blades fall short in real-time capability, accuracy or cost. Accordingly, this paper proposes a new method to estimate the moments on blades. First, a finite element model of the blade is constructed based on beam structure theory. Subsequently, a framework for sensor optimization based on the adaptive Lévy black hole-gravitational search algorithm (ALBH-GSA) is established. On this basis, the state-space equations for the blade are derived, and the Kalman filter is applied to map the measured moments to the nodes where no sensors are installed. Experimental results show that the proposed method is able to estimate the moments at any node on the blade in real time, with the advantages of high computational accuracy, low computational complexity, and manageable cost. Keywords Blade moment estimation · ALBH-GSA · Sensor optimisation · Kalman filter
1 Introduction Modern turbines have been built with blades up to 110 m in length, such as the 12 MW Haliade-X. In addition, cost-effective blades are often designed as light and slender structures [1]. However, as the size of the blades grows, the moments on the blades become more complex, and the safety of the turbine during operation becomes impossible to guarantee. Monitoring and analyzing the moment distribution on blades in real time will improve the stability of the wind turbine, and ensure the safety of the turbine and the wind farm staff.
H. Chen · S. Wei (B) · Y. Wang · Y. Chai College of Automation, Chongqing University, 400044 Chongqing, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_64
To calculate the moments on blades, researchers have proposed various effective methods, which can be broadly classified into three groups. The first group consists of engineering methods, such as the blade element momentum (BEM) method and the generalized dynamic wake (GDW) method. The second group consists of computational fluid dynamics (CFD) methods. The last group calculates moments from measured data. In [2], BEM was applied to explore the mechanical properties and structural integrity of the blades of offshore wind turbines under critical loads. In [3], an analytical framework for the rotor of an offshore floating wind turbine was established based on GDW. To develop a blade model that includes non-constant lift and drag coefficients, [4] combined 2D CFD with different relative onset flows. In [5], the forces and moments on blades were calculated from strain measurements on the blades and data collected by the supervisory control and data acquisition (SCADA) system. Among the existing methods, the engineering methods are simple, but their accuracy is insufficient. The CFD methods are accurate, but their long calculation time makes them unsuitable for real-time analysis of blades. Calculating the moments on the blades from sensor data is useful, but existing studies usually produce results only at the discrete nodes where sensors are installed. In view of the size of the blades, a large number of sensors would be required to obtain the moments at all nodes, which increases costs for wind farms and may alter the aerodynamic performance of the blades. To address these problems, this paper proposes a new method for calculating the moments on blades in real time from sensor data, with the advantages of low computational complexity and controllable cost.
The innovations are summarized as follows: (1) A sensor optimization framework is proposed that optimizes both the number and the positions of the sensors. Sensors are installed at the optimal nodes to capture the overall motion characteristics of the blade. (2) To estimate the moments at all nodes, the moments measured by the sensors are mapped to the nodes without sensors using the modal space and a Kalman filter.
2 Estimation Methodology

2.1 Structural Model of the Blade

The blade is modeled as a finite element model with N cells and N + 1 nodes, as shown in Fig. 1, where R is the impeller radius and l is the length of a cell. A wind turbine blade is usually treated as a cantilever with one end fixed to the hub, and its equation of motion is:

$$M\ddot{x}(t) + C\dot{x}(t) + Kx(t) = f(t) \tag{1}$$

Fig. 1 Finite element model

where M and K are the mass and stiffness matrices of the blade system, C is the damping matrix, and f(t) is the static load acting on the blade. $\ddot{x}(t)$, $\dot{x}(t)$, and $x(t)$ are the acceleration, velocity, and displacement of the blade, respectively. The modal matrix and the natural frequencies of the blade are calculated from:

$$(K - \omega^2 M)\Phi = 0 \tag{2}$$

where the positive solutions of ω are the natural frequencies $\omega_i$ (i = 1, 2, ..., n) of the blade, $\Phi = (\varphi_1, \varphi_2, ..., \varphi_n)$ is the blade modal matrix, and $\varphi_i$ is the i-th modal shape. According to the modal superposition principle [6] and Betti's theorem, Eq. 3 is obtained:

$$\bar{M}\ddot{\varepsilon}(t) + \bar{C}\dot{\varepsilon}(t) + \bar{K}\varepsilon(t) = \bar{f}(t) \tag{3}$$

where $\varepsilon = [\varepsilon_1, \varepsilon_2, ..., \varepsilon_n]^T$ is the modal amplitude vector, $\bar{M}$, $\bar{K}$ and $\bar{C}$ are the modal mass, stiffness, and damping matrices, respectively, and $\bar{f}(t)$ is the modal load. $\bar{M}$ and $\bar{K}$ are both diagonal matrices. $\bar{C}$ is also a diagonal matrix whose diagonal elements are $\bar{c}_i = 2\bar{m}_i \xi_i \omega_i$, where $\xi_i$ is the i-th modal damping ratio. Equation 3 is the decoupled equation of motion. In [7], the structural dynamic response of a 1.25 MW wind turbine was accurately calculated from the first and second modal shapes. This suggests that the accuracy requirement can be met by considering only the response of the first m (m < n) modal shapes (the effective modal shapes) when calculating the deformation of the blade.
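The step from Eq. 1 to Eq. 3 can be sketched as follows. This is a reading of the modal superposition principle rather than a derivation given in the paper, and it assumes proportional damping, so that the transformed damping matrix is diagonal (an assumption not stated explicitly above).

```latex
\begin{aligned}
x(t) &= \Phi\,\varepsilon(t) \\
\Phi^{T} M \Phi\,\ddot{\varepsilon}(t) + \Phi^{T} C \Phi\,\dot{\varepsilon}(t)
  + \Phi^{T} K \Phi\,\varepsilon(t) &= \Phi^{T} f(t) \\
\bar{M} = \Phi^{T} M \Phi,\quad \bar{C} = \Phi^{T} C \Phi,\quad
\bar{K} = \Phi^{T} K \Phi,\quad \bar{f}(t) &= \Phi^{T} f(t)
\end{aligned}
```

Because the modal shapes are orthogonal with respect to M and K, the transformed mass and stiffness matrices are diagonal and each modal amplitude obeys an independent scalar equation. Truncating Φ to its first m columns gives the reduced model used in the rest of the method.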
2.2 Optimization of Sensor Mounting Solutions Based on the ALBH-GSA

According to sensor optimization theory, the number of sensors cannot be smaller than the number of effective modal shapes. To minimize the cost to wind farms, the number of sensors is set equal to the number of effective modal shapes.

Fitness function

The minimum modal assurance criterion (MinMAC) [8] is a typical sensor optimization method. Due to its excessive computational cost, it is usually used as an evaluation criterion instead. The MAC is given by:

$$MAC_{ij} = \frac{(\varphi_i^T \varphi_j)^2}{(\varphi_i^T \varphi_i)(\varphi_j^T \varphi_j)} \tag{4}$$

where $MAC_{ij}$ is an element of the MAC matrix, and $\varphi_i$ and $\varphi_j$ are the i-th and j-th order modal vectors of the blade, respectively. The value of $MAC_{ij}$ lies in the interval [0, 1]. The smaller $MAC_{ij}$ (i ≠ j) is, the closer the corresponding modal vectors are to orthogonal. Therefore, the maximum off-diagonal element of the MAC matrix is minimized, which gives the fitness function for optimizing the sensor positions on the blades:

$$\mathit{fitness} = \max_{i \neq j} |MAC_{ij}| \tag{5}$$
where a smaller fitness value indicates a more effective sensor layout.

ALBH-GSA

GSA is a swarm intelligence optimization algorithm that simulates the motion of particles under gravity. Compared with optimization algorithms such as particle swarm optimization and genetic algorithms, GSA has better search accuracy and convergence speed. The calculation steps of GSA are described in [9]. According to GSA, the velocity $v_i^p$ and position $z_i^p$ of a particle are updated as:

$$v_i^p(t+1) = r \times v_i^p(t) + a_i^p(t) \tag{6}$$

$$z_i^p(t+1) = z_i^p(t) + v_i^p(t+1) \tag{7}$$

where r is a random number in [0, 1] and $a_i^p$ is the acceleration of the i-th particle in direction p. The coordinates of a particle are the mounting positions of the sensors on the blade. The black hole mechanism [10] assumes that a black hole exists in the solution space, centered on the global optimum gbest with radius $R_2$. If a particle is attracted to the black hole, a new position update formula is used. Otherwise, the original position update is used. Adding a black hole mechanism to the GSA enhances its global search capability. However, the traditional black hole mechanism may cause the GSA to converge to a local optimum. To solve this problem, this paper modifies the black hole mechanism applied in GSA in two respects, the judgment indicator and the position update strategy, to achieve adaptive evolution of the population. In addition, because of its memory capability and random walk strategy, Lévy flight is applied in a variety of optimization algorithms to improve local search capability. On this basis, the ALBH is applied to the position update of the GSA particles. It is described as follows:

$$\sigma_t^2 = \sum_{i=1}^{N_2} \frac{\mathit{fitness}_i(t) - \overline{\mathit{fitness}}(t)}{N_2 \times \max\{\mathit{fitness}_j(t) - \overline{\mathit{fitness}}(t),\ 1\}}, \quad j \in \{1, 2, ..., N_2\} \tag{8}$$
$$z_i^p(t+1) = \begin{cases} \text{Lévy flight}, & \sigma_t^2 < P \\ gbest + (2 \times r - 1) \times (gbest - z_i^p(t)), & \sigma_t^2 \geq P \end{cases} \tag{9}$$

where $\overline{\mathit{fitness}}(t)$ is the average fitness of all particles at the t-th iteration and P is the threshold value. The Lévy flight is defined as follows:

$$z_i'(t+1) = z_i(t+1) + L(\beta) \otimes (z_i(t+1) - gbest) \tag{10}$$

where $z_i(t+1)$ is the position of the i-th particle at moment t + 1 calculated according to Eq. 7 and $L(\beta)$ is the random number of the Lévy flight. When the position of a particle is updated by Lévy flight, $z_i'(t+1)$ is adopted if it has a better fitness than $z_i(t+1)$; otherwise $z_i(t+1)$ is adopted.
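The fitness evaluation of Eqs. 4 and 5 can be illustrated with a minimal sketch. It assumes that each modal shape is available as a list of values at all blade nodes and that a candidate particle is a list of sensor node indices; the function names (`restrict_to_sensors`, `mac_matrix`, `fitness`) are illustrative and do not come from the paper.

```python
def restrict_to_sensors(modes, sensor_nodes):
    """Keep only the entries of each modal shape at the candidate sensor nodes."""
    return [[mode[n] for n in sensor_nodes] for mode in modes]


def _dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mac_matrix(modes):
    """MAC matrix of Eq. 4 for a list of modal vectors."""
    n = len(modes)
    return [
        [
            _dot(modes[i], modes[j]) ** 2
            / (_dot(modes[i], modes[i]) * _dot(modes[j], modes[j]))
            for j in range(n)
        ]
        for i in range(n)
    ]


def fitness(modes):
    """Largest off-diagonal MAC element (Eq. 5); smaller is better."""
    mac = mac_matrix(modes)
    n = len(mac)
    return max(abs(mac[i][j]) for i in range(n) for j in range(n) if i != j)
```

An optimizer such as the ALBH-GSA would call `fitness(restrict_to_sensors(modes, candidate))` for every particle and keep the candidate with the smallest value.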
2.3 Method for Estimating Blade Moments

Based on the modal superposition principle, the modal amplitude vector $\varepsilon(t)$ and its derivative $\dot{\varepsilon}(t)$ are chosen to construct the system state vector $X(t) = [\dot{\varepsilon}(t)\ \ \varepsilon(t)]^T$. The modal amplitude $\varepsilon(t)$ is estimated with the Kalman filter, and the moment on the blade is then calculated. Based on Eq. 3, the state equation of the blade is:

$$\dot{X}(t) = AX(t) + B\bar{f}(t) \tag{11}$$

where the system matrix A and the input matrix B are defined as:

$$A = \begin{bmatrix} -\bar{M}^{-1}\bar{C} & -\bar{M}^{-1}\bar{K} \\ I & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} \bar{M}^{-1} \\ 0 \end{bmatrix} \tag{12}$$

In [11], the moment at a node is defined as a linear combination of the displacements of adjacent nodes:

$$MF_i(t) = \frac{k_i}{l_i} x_{i-1}(t) - \left(\frac{k_i}{l_i} + \frac{k_i}{l_{i+1}}\right) x_i(t) + \frac{k_i}{l_{i+1}} x_{i+1}(t) \tag{13}$$

where $MF_i(t)$ is the measured moment at the i-th node, $x_i(t)$ is the deflection of the i-th node, and $k_i$ is the stiffness of the i-th node. Introducing the state vector of the system into Eq. 13, with $x(t) = \Phi\varepsilon(t)$, the equation is rewritten as the observation equation at node i:

$$MF_i(t) = H_i X(t) = \begin{bmatrix} 0 & G_i\Phi \end{bmatrix} X(t) \tag{14}$$
where $G_i = \left[0, ..., \frac{k_i}{l_i}, -\left(\frac{k_i}{l_i} + \frac{k_i}{l_{i+1}}\right), \frac{k_i}{l_{i+1}}, ..., 0\right]$. The static load on the blade is calculated based on BEM. There is a deviation $\Delta\bar{f}(t)$ between the calculated load $\bar{f}(t)$ and the true load $\bar{f}_r(t)$:

$$\bar{f}_r(t) = \bar{f}(t) + \Delta\bar{f}(t) \tag{15}$$

The inaccurate calculation of the static load is compensated by including $\Delta\bar{f}(t)$ in the state vector. On this basis, a new state-space model of the blade system is obtained:

$$\begin{cases} \dot{\hat{X}}(t) = \hat{A}\hat{X}(t) + \hat{B}\bar{f}(t) \\ MF_i(t) = \hat{H}_i \hat{X}(t) \end{cases} \tag{16}$$

where $\hat{X}(t) = [X^T(t)\ \ \Delta\bar{f}^T]^T$ is the new state vector, $\hat{A} = \begin{bmatrix} A & B \\ 0 & I \end{bmatrix}$ is the new system matrix, $\hat{B} = [B^T\ \ 0]^T$ is the new input matrix, and $\hat{H}_i = [H_i\ \ 0]$ is the observation matrix of the i-th node. Since m sensors are installed on the blades, the system's observation equation is the set of observation equations at the m sensor nodes, i.e., $MF(t) = [MF_1(t)\ ...\ MF_m(t)]^T$ and $\hat{H} = [\hat{H}_1^T\ ...\ \hat{H}_m^T]^T$. In engineering practice, the continuous equations are discretized. The discretized state-space model is:

$$\begin{cases} \hat{X}(k+1) = \hat{A}\hat{X}(k) + \hat{B}\bar{f}(k) \\ MF(k) = \hat{H}\hat{X}(k) \end{cases} \tag{17}$$

The Kalman filter consists of five steps, given by the following equations:

$$\begin{cases} \hat{X}(k+1|k) = \hat{A}\hat{X}(k|k) + \hat{B}\bar{f}(k) \\ \hat{Y}(k+1|k) = \hat{A}\hat{Y}(k|k)\hat{A}^T + Q \\ \hat{G}(k+1) = \hat{Y}(k+1|k)\hat{H}^T[\hat{H}\hat{Y}(k+1|k)\hat{H}^T + O]^{-1} \\ \hat{X}(k+1|k+1) = \hat{X}(k+1|k) + \hat{G}(k+1)(MF - \hat{H}\hat{X}(k+1|k)) \\ \hat{Y}(k+1|k+1) = [I - \hat{G}(k+1)\hat{H}]\hat{Y}(k+1|k) \end{cases} \tag{18}$$

where $\hat{Y}(k)$ is the system covariance matrix, Q is the variance of the system process noise, O is the variance of the system observation noise, and I is the unit matrix. Once $\hat{X}(k+1|k+1)$ is calculated, the moment at any node on the blade is estimated from Eq. 13.
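The five steps of Eq. 18 can be illustrated with a minimal sketch. To stay free of any matrix library it is simplified to a scalar state and a scalar measurement, which is an assumption made here for illustration; in the blade model the same steps use matrix products and a matrix inverse. The names `kalman_step`, `a`, `b`, `h`, `q`, and `o` are illustrative stand-ins for the system, input, and observation matrices and the two noise variances.

```python
def kalman_step(x, y, u, z, a, b, h, q, o):
    """One scalar Kalman filter iteration following the five steps of Eq. 18.

    x, y : previous state estimate and its variance
    u, z : known input (calculated load) and new measurement (measured moment)
    a, b, h : scalar system, input, and observation coefficients
    q, o : process and observation noise variances
    """
    x_pred = a * x + b * u                      # step 1: state prediction
    y_pred = a * y * a + q                      # step 2: covariance prediction
    g = y_pred * h / (h * y_pred * h + o)       # step 3: Kalman gain
    x_new = x_pred + g * (z - h * x_pred)       # step 4: state correction
    y_new = (1.0 - g * h) * y_pred              # step 5: covariance correction
    return x_new, y_new


# Example: with equal prior and measurement variances the estimate moves
# halfway from the prediction (0.0) to the measurement (2.0).
x_est, y_est = kalman_step(x=0.0, y=1.0, u=0.0, z=2.0, a=1.0, b=0.0, h=1.0, q=0.0, o=1.0)
```

In the blade application, the corrected state would then be passed through the observation relation to obtain the moment at nodes without sensors.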
Table 1 Wind conditions

Turbulence   Wind speed (m/s)   TI for X (%)   TI for Y (%)   TI for Z (%)
1            5                  24.35          19.29          14.02
2            16                 25.81          20.16          14.25
3 Experimental Data and Experimental Platform

A 5 MW wind turbine with a blade length of 57 m is used in this paper. In the experiments, the blade is divided into 18 finite elements. To verify that the proposed method works under different wind conditions, the wind conditions in Table 1 are defined. To judge the accuracy of the estimated results, the RMSE and the MAPE, both commonly applied in the wind industry, are used:

$$RMSE = \sqrt{\frac{1}{N_3}\sum_{i=1}^{N_3}\left(MF_o - MF_p\right)^2} \tag{19}$$

$$MAPE = \frac{1}{N_3}\sum_{i=1}^{N_3}\left|\frac{MF_o - MF_p}{MF_o}\right| \times 100\% \tag{20}$$

where $N_3$ is the number of data points, $MF_o$ is the true moment, and $MF_p$ is the estimated moment on the blades. In engineering, a result is considered acceptable when the MAPE is less than 10%. Therefore, if the MAPE at every node is less than 10%, the experiment is finished. If the MAPE at any node is greater than 10%, the order of the effective modal matrix and the number of sensors are increased, the sensor mounting scheme and the state-space model are updated, and a new experiment is carried out. This procedure is repeated until the MAPE at all nodes meets the engineering requirement.
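The two error metrics and the 10% acceptance rule can be written as a short sketch. The function names and the `acceptable` helper are illustrative, and the sketch assumes that no true moment value is exactly zero, since Eq. 20 divides by it.

```python
import math


def rmse(true_values, estimates):
    """Root mean square error of Eq. 19."""
    n = len(true_values)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(true_values, estimates)) / n)


def mape(true_values, estimates):
    """Mean absolute percentage error of Eq. 20, in percent."""
    n = len(true_values)
    return sum(abs((o - p) / o) for o, p in zip(true_values, estimates)) / n * 100.0


def acceptable(mape_per_node, limit=10.0):
    """True when the MAPE at every node is below the engineering limit."""
    return all(value < limit for value in mape_per_node)
```

If `acceptable` returns False, the procedure described above would increase the number of effective modal shapes and sensors and repeat the experiment.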
4 Experimental Results and Analysis

Figure 2 shows the estimated flap moments at all nodes, and Fig. 3 shows the MAPE and RMSE curves. Three sensors are installed on the blade, at mounting positions (R4, R12, R17). As shown in Fig. 2, the proposed method provides a good estimate of the flap moment at all nodes on the blade. The flap moment is mainly influenced by the aerodynamic load, so it fluctuates as the wind speed changes, and the estimates represent these fluctuations well. In addition, due to wind shear and the blade structure, the dynamic response at a node becomes more complex as its distance from the blade root increases. Even in this case, the proposed method accurately captures the fluctuations caused by aerodynamic loads and accurately estimates the flap moment at the node.

Fig. 2 The results of estimating the flap moment: (a) v0 is 5 m/s; (b) v0 is 16 m/s. Each panel plots My [Nm] against Time [s], true versus estimation, at R1 = 1.61 m, R4 = 6.45 m, R7 = 18.82 m, R10 = 33.88 m, R13 = 46.78 m, and R16 = 54.31 m.

Fig. 3 Estimation error of the flap moment for different wind conditions (MAPE [%] and RMSE [Nm] against distance from blade root [m], for v0 = 5 m/s and v0 = 16 m/s)

The estimated edge moments at all nodes are shown in Fig. 4, and Fig. 5 shows the MAPE and RMSE curves. Three sensors are again installed on the blade, at locations (R3, R10, R15). As shown in Fig. 4, the proposed method accurately estimates the edge moment at all nodes. The edge moment is mainly influenced by the gravity load. Therefore, the periodic pattern of the edge moment is more evident than that of the flap moment, and its values alternate between positive and negative. Aerodynamic loads also affect the edge moment. The experimental results show that the method captures these patterns accurately. As Figs. 3 and 5 show, the MAPEs of the estimated edge and flap moments are all less than 10% for the two wind conditions selected for the simulation experiments, which meets the engineering requirement. This suggests that the method is robust to the operating conditions tested. Overall, it is feasible to apply the method to real-time monitoring of the moments on the blades.
Fig. 4 The results of estimating the edge moment: (a) v0 is 5 m/s; (b) v0 is 16 m/s. Each panel plots Mx [Nm] against Time [s], true versus estimation, at R1 = 1.61 m, R4 = 6.45 m, R7 = 18.82 m, R10 = 33.88 m, R13 = 46.78 m, and R16 = 54.31 m.

Fig. 5 Estimation error of the edge moment for different wind conditions (MAPE [%] and RMSE [Nm] against distance from blade root [m], for v0 = 5 m/s and v0 = 16 m/s)

5 Conclusion

Wind turbines work in harsh environments for long periods, and it is beneficial to monitor the moments on their blades in real time. To improve the accuracy and reduce the cost of estimating the moments on blades, this paper proposes a dynamic estimation method based on sensor optimization techniques and a Kalman filter. For the 5 MW wind turbine used in the experiments, only three sensors need to be installed on the blades to accurately estimate the moment at each node, even when the turbine operates under different wind conditions. Overall, compared with existing methods, the proposed method has the advantages of low computational complexity and manageable cost, and provides a more accurate real-time estimation of the moment at any node on the blade.
References

1. Rosemeier, M., Berring, P., Branner, K.: Non-linear ultimate strength and stability limit state analysis of a wind turbine blade. Wind Energy 19(5), 825–846 (2016)
2. Hicham, B., et al.: Structural analysis of offshore wind turbine blades using finite element method. Wind Energy 44(2), 168–180 (2020)
3. Steven, N.R., Justin, W.J.: Strongly-coupled aeroelastic free-vortex wake framework for floating offshore wind turbine rotors. Part 1: numerical framework. Renew. Energy 149, 1018–1031 (2020)
4. Mullings, H., Stallard, T.: Analysis of tidal turbine blade loading due to blade scale flow. J. Fluids Struct. 114, 103698 (2022)
5. Bridget, M., Babak, M., Sauro, L.: Estimation of blade forces in wind turbines using blade root strain measurements with OpenFAST verification. Renew. Energy 184, 662–676 (2022)
6. Horas, C.S., et al.: Efficient multiscale methodology for local stress analysis of metallic railway bridges based on modal superposition principles. Eng. Fail. Anal. 138, 106391 (2022)
7. Liu, X., et al.: Structure dynamic response analysis of horizontal axis wind turbines. Acta Energiae Solaris Sinica 30(6), 804–809 (2009)
8. Carne, T.G., Dohrmann, C.R.: A modal test design strategy for model correlation. In: Proceedings of the 13th International Modal Analysis Conference, 1995
9. Rashedi, E., Nezamabadi-Pour, H., Saryazdi, S.: GSA: a gravitational search algorithm. Inf. Sci. 179(13), 2232–2248 (2009)
10. Hatamlou, A.: Solving travelling salesman problem using black hole algorithm. Soft Comput. 22(24), 8167–8175 (2018)
11. Muto, K., et al.: Model-based load estimation for wind turbine blade with Kalman filter. In: 8th International Conference on Renewable Energy Research and Applications, 2019
Improved Deadbeat Predictive Current Control for Open-Winding Permanent Magnet Synchronous Generators Wenfeng Wang and Yue Ma
Abstract To solve the problem of parameter mismatch in the deadbeat predictive current control (DPCC) of open-winding permanent magnet synchronous generators, this paper proposes an improved deadbeat predictive current control (IDPCC) algorithm. A Luenberger observer is designed to estimate the disturbance error and compensate the reference voltages of the d, q, and 0 axes, which eliminates the impact of mismatched inductance parameters on these axes. Based on the Luenberger observer, an incremental model is applied to the d- and q-axis current predictive control to eliminate the impact of permanent magnet flux mismatch. A control system model of an open-winding generator is established in the Matlab/Simulink environment to verify the effectiveness of the proposed algorithm.

Keywords Open-winding generator · IDPCC · Luenberger observer · Incremental model
W. Wang · Y. Ma Beijing Institute of Technology, Beijing, China Y. Ma (B) Beijing Institute of Technology Chongqing Innovation Center, Chongqing, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_65

1 Introduction

In recent years, permanent magnet synchronous generators have been widely used in power generation because of their superior performance, high efficiency, and high power density. Opening the neutral point of the stator winding and connecting a converter to each end of the winding (the open-winding topology) reduces the required converter capacity and voltage level in high-power generation systems. It also gives a three-level modulation effect on the generator, which suppresses current harmonics and torque pulsation and improves the operating performance of the generator [1, 2]. However, a zero sequence circuit exists under this structure, and any zero sequence voltage generates a zero sequence current in the circuit [3]. This causes additional losses and increased torque ripple, which reduces system efficiency and stable operation ability. Therefore, it is of great significance to study current control that includes zero axis current suppression.

Common current loop control algorithms for permanent magnet synchronous motors include PI control [4], hysteresis control [5], and predictive control [6]. DPCC makes the actual current track the reference value as quickly as possible and offers fast dynamic response and small steady-state tracking error, so it is widely used in the current loop control of motor systems. However, the method relies on an accurate mathematical model of the motor. Structural aging, ambient temperature, and other factors change the generator parameters, which produces a mismatch between the model parameters and the actual parameters. The mismatch seriously degrades control performance and leads to problems such as steady-state error and system divergence, which in turn affect the zero sequence current suppression and current control performance of the open-winding generator. Solving the model parameter mismatch problem is therefore the focus of research on DPCC.

Current research follows two main approaches. The first uses parameter identification algorithms to identify the motor parameters online. Common online identification methods include the recursive least squares method [7], the Kalman filtering method [8], and the model reference adaptive method [9]. However, existing identification methods are slow and computationally expensive, and their accuracy decreases when multiple parameters are identified simultaneously.

The second approach establishes a disturbance observer to compensate for the disturbance caused by parameter mismatch [10–12]. However, the observer design is complicated, and the observer equation usually contains motor parameters, so the dependence on generator parameters is not completely removed. Existing research on DPCC parameter uncertainty mainly addresses the control of the d- and q-axis currents, and less attention has been paid to the robust suppression of zero axis current. Therefore, this paper takes the common-DC-bus open-winding permanent magnet synchronous generator as the research object and proposes the IDPCC algorithm. A Luenberger observer is designed to estimate the disturbance error, compensate the reference voltages of the d, q and 0 axes, and eliminate the influence of mismatched d-, q- and 0-axis inductance parameters. This achieves robust control of the d- and q-axis currents and robust suppression of the zero sequence current. In addition, an incremental model is applied to the predictive control of the d- and q-axis currents to eliminate the influence of permanent magnet flux mismatch. A simulation model built in the Matlab/Simulink environment verifies the effectiveness of the proposed algorithm.
2 Mathematical Model

The topology of the fully controlled common-DC-bus open-winding generator based on two-level converters is shown in Fig. 1. The voltage vectors acting on converter 1 and converter 2 are defined as $U_s$ and $U_s'$, respectively. Under the open-winding topology, the voltage vector produced on the stator winding is the difference between the voltage vectors acting on the two sides. There are therefore 64 composite basic voltage vectors on the stator winding, synthesized from the basic voltage vectors of the two converters, as shown in Fig. 2.

Fig. 1 Topological schematic diagram of open winding generator

Fig. 2 Composite basic voltage vector diagram

Because the common-bus topology forms a zero sequence circuit, zero sequence voltage arises mainly from two sources: the common mode voltage and the third harmonic back electromotive force. The zero sequence inductance is usually very small, so even a small zero sequence voltage can generate a large zero sequence current. The mathematical model of the generator, including the zero axis, can therefore be expressed as:

$$\begin{cases} u_d = u_{d1} - u_{d2} = L_d \dot{i}_d + R_s i_d - \omega_e L_q i_q \\ u_q = u_{q1} - u_{q2} = L_q \dot{i}_q + R_s i_q + \omega_e L_d i_d + \omega_e \psi_f \\ u_0 = u_{01} - u_{02} = L_0 \dot{i}_0 + R_s i_0 - 3\omega_e \psi_{f3} \sin 3\theta \end{cases} \tag{1}$$

$$\begin{cases} \psi_d = L_d i_d + \psi_f \\ \psi_q = L_q i_q \end{cases} \tag{2}$$

where $u_d$, $u_q$ and $u_0$ are the d-, q- and 0-axis voltages, which are jointly synthesized by converters 1 and 2; $L_d$, $L_q$ and $L_0$ are the inductances of the d, q and 0 axes, respectively; $R_s$ is the stator resistance; $\psi_f$ is the fundamental component of the rotor flux linkage; and $\psi_{f3}$ is the third harmonic component of the rotor flux linkage. $\psi_d$ and $\psi_q$ are the d- and q-axis components of the flux linkage, $\omega_e$ is the electrical angular velocity of the motor, and $\theta$ is the electrical angle.
3 Control Algorithm

3.1 Zero Vector Redistribution SVPWM

To reduce the DC bus voltage level as far as possible and improve the modulation level of the dual converters, the reference voltage vector is divided evenly between the two converters, so that they output voltage vectors $U_{ref1}$ and $U_{ref2}$ of equal amplitude and opposite phase (see Fig. 3). Assume that the basic voltage vectors (100), (110), and (111) are applied for $t_1$, $t_2$ and $t_7$, respectively, to synthesize $U_{ref1}$ in converter 1. Synthesizing $U_{ref2}$ in converter 2 then requires the basic voltage vectors (011), (001), and (111) to be applied for $t_4$, $t_5$ and $t_7'$, respectively. Within one cycle, the common mode voltages generated by converter 1 and converter 2 are $u_{01}$ and $u_{02}$, respectively:

Fig. 3 (a) Common mode voltage state under 180° decoupling; (b) schematic diagram of 180° decoupling

$$u_{01} = \frac{1}{T_s}\left(\frac{U_{dc}}{3} t_1 + \frac{2U_{dc}}{3} t_2 + U_{dc} t_7\right) \tag{3}$$

$$u_{02} = \frac{1}{T_s}\left(\frac{2U_{dc}}{3} t_4 + \frac{U_{dc}}{3} t_5 + U_{dc} t_7'\right) \tag{4}$$

The zero axis voltage reference obtained from the preceding control algorithm should equal the common mode voltage generated by the dual converters. With 180° decoupling, $t_4 = t_1$ and $t_5 = t_2$, so that:

$$u_{ref0} = u_{01} - u_{02} = \frac{1}{T_s}\left(\frac{U_{dc}}{3} t_2 - \frac{U_{dc}}{3} t_1 + U_{dc}(t_7 - t_7')\right) \tag{5}$$

The difference in zero vector operating time between converter 1 and converter 2 is therefore:

$$\Delta T = t_7 - t_7' = \frac{u_{ref0}}{U_{dc}} T_s - \frac{1}{3} t_2 + \frac{1}{3} t_1 \tag{6}$$

Using this time difference, the zero vectors can be redistributed between the two converters.
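The relation between Eqs. 5 and 6 can be checked numerically with a minimal sketch. The function names and the sample values (a 300 V bus and a 50 µs period) are illustrative assumptions, not parameters taken from the paper.

```python
def zero_vector_time_difference(u_ref0, u_dc, t_s, t1, t2):
    """Delta T = t7 - t7' from Eq. 6."""
    return u_ref0 / u_dc * t_s - t2 / 3.0 + t1 / 3.0


def zero_axis_voltage(u_dc, t_s, t1, t2, delta_t):
    """Zero axis voltage u01 - u02 produced by a given Delta T (Eq. 5)."""
    return (u_dc / 3.0 * t2 - u_dc / 3.0 * t1 + u_dc * delta_t) / t_s


# Illustrative values only: 300 V DC bus, 50 us period, 10 V zero axis reference.
delta_t = zero_vector_time_difference(10.0, 300.0, 50e-6, 10e-6, 20e-6)
u_check = zero_axis_voltage(300.0, 50e-6, 10e-6, 20e-6, delta_t)
```

Substituting the computed time difference back into Eq. 5 reproduces the requested zero axis reference, which is the consistency the redistribution step relies on.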
3.2 Improved Deadbeat Predictive Current Control

DPCC should account for the one-beat delay of the actual system, and its control goal is to track the reference values of the d-, q-, and 0-axis currents as quickly as possible. DPCC with one-beat delay consists of two steps: predicting the current of the next cycle and calculating the reference voltage. First, the generator model above is discretized, with $L_d = L_q = L_s$, to predict the current of the next cycle:

$$\begin{cases} i_d(k+1) = \left(1 - \frac{T_s R_s}{L_s}\right) i_d(k) + T_s \omega_e i_q(k) + \frac{T_s}{L_s} u_d(k) \\ i_q(k+1) = -T_s \omega_e i_d(k) + \left(1 - \frac{T_s R_s}{L_s}\right) i_q(k) + \frac{T_s}{L_s} u_q(k) - \frac{T_s \omega_e \psi_f}{L_s} \\ i_0(k+1) = \left(1 - \frac{T_s R_s}{L_0}\right) i_0(k) + 3\frac{T_s}{L_0} \omega_e \psi_{f3} \sin 3\theta + \frac{T_s}{L_0} u_0(k) \end{cases} \tag{7}$$

For DPCC with one-beat delay, the actual current should reach the reference value in the (k + 2)-th cycle:

$$\begin{cases} i_d(k+2) = i_d^{ref}(k) \\ i_q(k+2) = i_q^{ref}(k) \\ i_0(k+2) = i_0^{ref}(k) \end{cases} \tag{8}$$

Based on the above equations, the reference voltages of the d, q, and 0 axes for the next cycle are calculated separately:

$$\begin{cases} u_d^{ref}(k+1) = \frac{L_s}{T_s}\left[i_d^{ref}(k) - \left(1 - \frac{R_s}{L_s} T_s\right) i_d(k+1) - T_s \omega_e i_q(k+1)\right] \\ u_q^{ref}(k+1) = \frac{L_s}{T_s}\left[i_q^{ref}(k) - \left(1 - \frac{R_s}{L_s} T_s\right) i_q(k+1) + T_s \omega_e i_d(k+1) + \frac{T_s}{L_s} \omega_e \psi_f\right] \\ u_0^{ref}(k+1) = \frac{L_0}{T_s}\left[i_0^{ref}(k) - \left(1 - \frac{R_s}{L_0} T_s\right) i_0(k+1) - 3\frac{T_s}{L_0} \omega_e \psi_{f3} \sin 3\theta\right] \end{cases} \tag{9}$$
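The two DPCC steps for the d and q axes can be sketched as follows. The sketch omits the zero axis, assumes matched parameters and a constant electrical speed over the two steps, and uses illustrative parameter values and names (`predict_dq`, `reference_voltage_dq`, `PARAMS`) that do not come from the paper.

```python
# Illustrative parameters only (not the values of the paper's Table 1).
PARAMS = {"Rs": 0.5, "Ls": 5e-3, "psi_f": 0.2, "we": 300.0, "Ts": 50e-6}


def predict_dq(i_d, i_q, u_d, u_q, p):
    """One-step current prediction for the d and q axes (first two rows of Eq. 7)."""
    a = 1.0 - p["Ts"] * p["Rs"] / p["Ls"]
    g = p["Ts"] / p["Ls"]
    i_d_next = a * i_d + p["Ts"] * p["we"] * i_q + g * u_d
    i_q_next = -p["Ts"] * p["we"] * i_d + a * i_q + g * u_q - g * p["we"] * p["psi_f"]
    return i_d_next, i_q_next


def reference_voltage_dq(i_d_ref, i_q_ref, i_d1, i_q1, p):
    """Deadbeat reference voltages for step k+1 (first two rows of Eq. 9).

    i_d1 and i_q1 are the currents predicted for step k+1.
    """
    a = 1.0 - p["Ts"] * p["Rs"] / p["Ls"]
    k = p["Ls"] / p["Ts"]
    u_d = k * (i_d_ref - a * i_d1 - p["Ts"] * p["we"] * i_q1)
    u_q = k * (i_q_ref - a * i_q1 + p["Ts"] * p["we"] * i_d1
               + p["Ts"] / p["Ls"] * p["we"] * p["psi_f"])
    return u_d, u_q


# Predict step k+1, compute the reference voltage, then predict step k+2.
i_d1, i_q1 = predict_dq(0.0, 1.0, 2.0, 60.0, PARAMS)
u_d_ref, u_q_ref = reference_voltage_dq(0.0, 5.0, i_d1, i_q1, PARAMS)
i_d2, i_q2 = predict_dq(i_d1, i_q1, u_d_ref, u_q_ref, PARAMS)
```

With matched parameters, the currents at step k + 2 equal the references, which is the deadbeat property of Eq. 8. With mismatched inductance or flux values in `reference_voltage_dq`, this equality no longer holds, which is the problem the observer-based compensation addresses.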
After the one-beat delay and the inverse Park transformation, the reference voltages of Eq. (9) serve as the input reference of the SVPWM. The predictive control model contains the generator inductance, resistance, and flux parameters. However, changes in the surrounding environment, such as temperature, or long operation times change the motor parameters, which causes errors in the predicted voltage and degrades generator control performance. A mismatch in the resistance has a negligible impact on control performance, whereas mismatches in the inductance and the permanent magnet flux have a significant impact, so they are the focus of the following analysis. To enhance the robustness of generator control, this paper introduces a Luenberger observer into the control process. When parameters such as inductance and resistance change, the voltage and current model can be expressed as:

$$\begin{cases} u_d = L_d \dot{i}_d + R_s i_d - \omega_e L_s i_q + F_d \\ u_q = L_q \dot{i}_q + R_s i_q + \omega_e L_s i_d + \omega_e \psi_f + F_q \\ u_0 = L_0 \dot{i}_0 + R_s i_0 - 3\omega_e \psi_{f3} \sin 3\theta + F_0 \end{cases} \tag{10}$$

where $F_d$, $F_q$ and $F_0$ are the disturbances generated when the inductance parameters are mismatched. For the d, q and 0 axes of the generator, the Luenberger observer can therefore be designed as Formula (11), where $k_1$ and $k_2$ are the gain coefficients of the observer, and $\hat{i}_d$, $\hat{i}_q$, $\hat{i}_0$, $\hat{F}_d$, $\hat{F}_q$, and $\hat{F}_0$ are the observed values. Discretization then gives the current prediction equations of the d, q and 0 axes as Formula (12).

$$\begin{cases} \dot{\hat{i}}_d = -\frac{R_s}{L_s}\hat{i}_d + \omega_e \hat{i}_q - \frac{\hat{F}_d}{L_s} + \frac{u_d}{L_s} + k_1(i_d - \hat{i}_d) \\ \dot{\hat{F}}_d = k_2(i_d - \hat{i}_d) \\ \dot{\hat{i}}_q = -\frac{R_s}{L_s}\hat{i}_q - \omega_e \hat{i}_d - \frac{\hat{F}_q}{L_s} + \frac{u_q - \psi_f \omega_e}{L_s} + k_1(i_q - \hat{i}_q) \\ \dot{\hat{F}}_q = k_2(i_q - \hat{i}_q) \\ \dot{\hat{i}}_0 = -\frac{R_s}{L_0}\hat{i}_0 + \frac{3\omega_e \psi_{f3} \sin 3\theta}{L_0} - \frac{\hat{F}_0}{L_0} + \frac{u_0}{L_0} + k_1(i_0 - \hat{i}_0) \\ \dot{\hat{F}}_0 = k_2(i_0 - \hat{i}_0) \end{cases} \tag{11}$$

$$\begin{cases} \hat{i}_d(k+1) = \hat{i}_d(k) + \left[-\frac{R_s}{L_s}\hat{i}_d(k) + \omega_e \hat{i}_q(k) - \frac{\hat{F}_d(k)}{L_s} + \frac{u_d(k)}{L_s} + k_1\left(i_d(k) - \hat{i}_d(k)\right)\right] T_s \\ \hat{F}_d(k+1) = \hat{F}_d(k) + k_2 T_s\left(i_d(k) - \hat{i}_d(k)\right) \\ \hat{i}_q(k+1) = \hat{i}_q(k) + \left[-\frac{R_s}{L_s}\hat{i}_q(k) - \omega_e \hat{i}_d(k) - \frac{\hat{F}_q(k)}{L_s} + \frac{u_q(k) - \psi_f \omega_e}{L_s} + k_1\left(i_q(k) - \hat{i}_q(k)\right)\right] T_s \\ \hat{F}_q(k+1) = \hat{F}_q(k) + k_2 T_s\left(i_q(k) - \hat{i}_q(k)\right) \\ \hat{i}_0(k+1) = \hat{i}_0(k) + \left[-\frac{R_s}{L_0}\hat{i}_0(k) + \frac{3\omega_e \psi_{f3} \sin 3\theta}{L_0} - \frac{\hat{F}_0(k)}{L_0} + \frac{u_0(k)}{L_0} + k_1\left(i_0(k) - \hat{i}_0(k)\right)\right] T_s \\ \hat{F}_0(k+1) = \hat{F}_0(k) + k_2 T_s\left(i_0(k) - \hat{i}_0(k)\right) \end{cases} \tag{12}$$
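The d-axis rows of Formula (12) can be sketched as a single update function. The parameter values, including the gains `k1` and `k2`, are placeholders chosen only for illustration; in practice the gains must be selected so that the observer error dynamics are stable, which this sketch does not address.

```python
# Placeholder parameters and gains for illustration only.
OBS = {"Rs": 0.5, "Ls": 5e-3, "we": 300.0, "Ts": 50e-6, "k1": 1000.0, "k2": -50.0}


def observer_step_d(i_hat_d, i_hat_q, f_hat_d, i_d, u_d, p):
    """One discrete observer update for the d axis (first two rows of Formula (12)).

    i_hat_d, i_hat_q : observed d- and q-axis currents at step k
    f_hat_d          : observed d-axis disturbance at step k
    i_d, u_d         : measured d-axis current and applied d-axis voltage at step k
    """
    err = i_d - i_hat_d                       # current observation error
    di = (-p["Rs"] / p["Ls"] * i_hat_d
          + p["we"] * i_hat_q
          - f_hat_d / p["Ls"]
          + u_d / p["Ls"]
          + p["k1"] * err)
    i_next = i_hat_d + di * p["Ts"]           # forward-Euler current update
    f_next = f_hat_d + p["k2"] * p["Ts"] * err  # disturbance update
    return i_next, f_next
```

When the observed current already equals the measured current, the error term vanishes, the disturbance estimate stays unchanged, and the update reduces to a forward-Euler step of the d-axis model in Formula (10).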
The Luenberger observer only addresses the impact of inductance mismatch on control, but a mismatch of the permanent-magnet flux linkage can also degrade the control performance of the generator. Therefore, on the basis of the Luenberger observer, an incremental model is established for the d and q axes of the generator. Writing the observer prediction model (12) for instant k + 1 and for instant k and subtracting one from the other gives the d-axis and q-axis current increment prediction models based on the Luenberger observer, Formulas (13) and (14), where $\Delta x(k) = x(k) - x(k-1)$. With the electrical angular speed $\omega_e$ treated as constant over one sampling period, $\Delta\omega_e(k) \approx 0$, so the flux-linkage term drops out and the model is no longer affected by flux-linkage parameter mismatch.

\[
\begin{cases}
\hat{i}_d(k+1) - \hat{i}_d(k) = \Delta\hat{i}_d(k) + \left[-\dfrac{R_s}{L_s}\Delta\hat{i}_d(k) + \omega_e \Delta\hat{i}_q(k) - \dfrac{\Delta\hat{F}_d(k)}{L_s} + \dfrac{\Delta u_d(k)}{L_s} + k_1\left(\Delta i_d(k) - \Delta\hat{i}_d(k)\right)\right] T_s \\
\hat{F}_d(k+1) - \hat{F}_d(k) = \Delta\hat{F}_d(k) + k_2 T_s \left(\Delta i_d(k) - \Delta\hat{i}_d(k)\right)
\end{cases}
\tag{13}
\]

\[
\begin{cases}
\hat{i}_q(k+1) - \hat{i}_q(k) = \Delta\hat{i}_q(k) + \left[-\dfrac{R_s}{L_s}\Delta\hat{i}_q(k) - \omega_e \Delta\hat{i}_d(k) - \dfrac{\Delta\hat{F}_q(k)}{L_s} + \dfrac{\Delta u_q(k) - \psi_f \Delta\omega_e(k)}{L_s} + k_1\left(\Delta i_q(k) - \Delta\hat{i}_q(k)\right)\right] T_s \\
\hat{F}_q(k+1) - \hat{F}_q(k) = \Delta\hat{F}_q(k) + k_2 T_s \left(\Delta i_q(k) - \Delta\hat{i}_q(k)\right)
\end{cases}
\tag{14}
\]
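The cancellation of the flux-linkage term can be checked numerically. The sketch below evaluates the q-axis prediction of Formula (12) at two consecutive instants and subtracts them, once with the nominal $\psi_f$ and once with three times that value. The function names, the operating point and the constant electrical speed are illustrative assumptions, not values from the paper.

```python
def iq_hat_next(iq_hat, id_hat, fq_hat, uq, iq_meas, psi_f, p):
    """q-axis current prediction of Formula (12)."""
    di = (-p["Rs"] / p["Ls"] * iq_hat - p["we"] * id_hat - fq_hat / p["Ls"]
          + (uq - psi_f * p["we"]) / p["Ls"] + p["k1"] * (iq_meas - iq_hat))
    return iq_hat + di * p["Ts"]


def iq_hat_increment(prev, curr, psi_f, p):
    """Predicted increment: (12) evaluated at instant k minus (12) at k-1."""
    return iq_hat_next(*curr, psi_f, p) - iq_hat_next(*prev, psi_f, p)


# Illustrative values: Rs and Ls from Table 1, constant speed "we" assumed.
params = {"Rs": 0.0018, "Ls": 83e-6, "Ts": 5e-5, "k1": 6000.0, "we": 1000.0}
prev = (19.0, 0.1, 0.02, 1.0, 19.1)  # (iq_hat, id_hat, fq_hat, uq, iq_meas) at k-1
curr = (19.2, 0.0, 0.03, 1.1, 19.3)  # same quantities at k

nominal = iq_hat_increment(prev, curr, 0.0249, params)
mismatched = iq_hat_increment(prev, curr, 3 * 0.0249, params)
```

Up to floating-point rounding, `nominal` and `mismatched` agree, because the term $\psi_f\omega_e/L_s$ appears identically in both predictions and disappears in the difference as long as $\omega_e$ does not change between the two instants.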
Similarly, from Formula (15), which predicts the d-axis, q-axis and 0-axis voltages at instant k + 1, the voltage equation at instant k can be derived. Taking the difference between the voltage at instant k + 1 and the voltage at instant k gives the d-axis and q-axis voltage increment prediction model with the Luenberger observer, shown in Formula (16), which also overcomes the influence of flux-linkage parameter mismatch. In the incremental models, $\Delta\hat{i}_d$, $\Delta\hat{i}_q$, $\Delta i_d$, $\Delta i_q$, $\Delta\hat{F}_d$ and $\Delta\hat{F}_q$ denote the differences between the values at the k-th and (k − 1)-th instants.
W. Wang and Y. Ma
Fig. 4 Block diagram of deadbeat predictive current control based on the Luenberger observer incremental model
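\[
\begin{cases}
u_d(k+1) = \dfrac{L_s}{T_s}\left[i_d^{ref}(k) - \left(1 - \dfrac{R_s}{L_s}T_s\right)\hat{i}_d(k+1) - T_s\omega_e\hat{i}_q(k+1)\right] + \hat{F}_d(k+1) \\
u_q(k+1) = \dfrac{L_s}{T_s}\left[i_q^{ref}(k) - \left(1 - \dfrac{R_s}{L_s}T_s\right)\hat{i}_q(k+1) + T_s\omega_e\hat{i}_d(k+1)\right] + \hat{F}_q(k+1) + \omega_e\psi_f \\
u_0(k+1) = \dfrac{L_0}{T_s}\left[i_0^{ref}(k) - \left(1 - \dfrac{R_s}{L_0}T_s\right)\hat{i}_0(k+1) - \dfrac{3T_s}{L_0}\omega\psi_{f3}\sin 3\theta\right] + \hat{F}_0(k+1)
\end{cases}
\tag{15}
\]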
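The deadbeat property behind Formula (15) can be verified on the d axis: applying the computed voltage to a forward-Euler step of the d-axis model with disturbance should land exactly on the reference. The sketch below is only an algebraic check under that assumption; the function names and the operating point are hypothetical.

```python
def deadbeat_ud(id_ref, id_hat, iq_hat, fd_hat, we, Ls, Rs, Ts):
    """d-axis voltage command in the form of Formula (15)."""
    return Ls / Ts * (id_ref - (1 - Rs / Ls * Ts) * id_hat - Ts * we * iq_hat) + fd_hat


def d_axis_step(id_now, iq_now, fd, ud, we, Ls, Rs, Ts):
    """Forward-Euler step of the d-axis current model with disturbance fd."""
    return id_now + Ts * (-Rs / Ls * id_now + we * iq_now - fd / Ls + ud / Ls)


# Illustrative operating point; Ls and Rs are the Table 1 values.
Ls, Rs, Ts, we = 83e-6, 0.0018, 5e-5, 1000.0
id_now, iq_now, fd, id_ref = 0.5, 19.0, 0.01, 0.0

ud = deadbeat_ud(id_ref, id_now, iq_now, fd, we, Ls, Rs, Ts)
id_next = d_axis_step(id_now, iq_now, fd, ud, we, Ls, Rs, Ts)
```

Here `id_next` equals `id_ref` up to rounding, which is what "deadbeat" means: the current error is removed in one sampling period, provided the parameters used by the controller match the plant. The observer terms $\hat{F}$ are what absorb the error when they do not.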
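\[
\begin{cases}
u_d(k+1) - u_d(k) = \dfrac{L_s}{T_s}\left[i_d^{ref}(k) - \left(2 - \dfrac{R_s}{L_s}T_s\right)\hat{i}_d(k+1) + \left(1 - \dfrac{R_s}{L_s}T_s\right)\hat{i}_d(k) - T_s\omega_e\left(\hat{i}_q(k+1) - \hat{i}_q(k)\right)\right] + \hat{F}_d(k+1) - \hat{F}_d(k) \\
u_q(k+1) - u_q(k) = \dfrac{L_s}{T_s}\left[i_q^{ref}(k) - \left(2 - \dfrac{R_s}{L_s}T_s\right)\hat{i}_q(k+1) + \left(1 - \dfrac{R_s}{L_s}T_s\right)\hat{i}_q(k) + T_s\omega_e\left(\hat{i}_d(k+1) - \hat{i}_d(k)\right)\right] + \hat{F}_q(k+1) - \hat{F}_q(k)
\end{cases}
\tag{16}
\]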
Based on the derivation above, the control block diagram of the algorithm is shown in Fig. 4. In the subsequent simulations, the algorithm is referred to as IDPCC.
4 Simulation Results

To verify the robustness of IDPCC, a control system model of an open-winding permanent magnet synchronous generator based on IDPCC was built in MATLAB/Simulink. The generator parameters are listed in Table 1. The switching frequency is 20 kHz and the generator starts under no-load conditions. At 0.17 s, a load with a resistance of 40 Ω was connected. At 0.38 s, the load
Table 1 Parameters of OW-PMSG

Parameter                                  Value
d- and q-axis inductances (H)              L_s = 83e−6
Zero-sequence inductance (H)               L_0 = 80e−6
Stator phase resistance (Ω)                R_s = 0.0018
Number of pole pairs                       p_n = 5
Load resistance (Ω)                        R_L = 40
Flux linkage of permanent magnets (Wb)     ψ_f = 0.0249
Third-harmonic flux linkage (Wb)           ψ_f3 = 0.002
Fig. 5 a shows the loading and unloading of the generator, and b shows the voltage stabilization on the DC side. Without parameter mismatch, c shows the simulation results of the zero-sequence current using IDPCC+SVPWM, and d shows the simulation results of the zero-sequence current using IDPCC+ZVR SVPWM
was removed, returning the generator to the no-load condition. After extensive simulation, $k_1$ was set to 6000 and $k_2$ to −6000. IDPCC and the DPCC algorithm were then simulated and compared to demonstrate the effectiveness of IDPCC. Figure 5a shows the loading and unloading of the generator. As Fig. 5b shows, when the load is suddenly applied or removed, the DC voltage stays between 149.5 and 150.5 V, which meets the voltage stabilization requirement. From Fig. 5c, d, it can be seen that
Fig. 6 a, c show the simulation results of the d-axis and q-axis currents using DPCC+ZVR SVPWM, while b, d show the simulation results of the d-axis and q-axis currents using IDPCC+ZVR SVPWM
the traditional SVPWM algorithm cannot suppress the zero-sequence current, whose peak value reaches 40 A. With the ZVR SVPWM algorithm, the zero-sequence current stays at around ±1 A, so the suppression effect is evident. Figure 6 shows that, without parameter mismatch, the d-axis and q-axis currents under DPCC fluctuate within ±1.4 A and from −18.0 A to 20.7 A, respectively, while under IDPCC they fluctuate within ±1.2 A and from 17.9 A to 20.4 A, respectively. The two algorithms therefore perform essentially the same when the parameters are matched.

When the inductance parameter is mismatched by a factor of three ($L_s = 3L_s$), Fig. 7 shows that $i_d$ under DPCC oscillates violently within ±2 A, whereas under the proposed IDPCC it stays within ±1 A, which is not significantly different from the matched case. For the q-axis current, $i_q$ under DPCC fluctuates between −16.2 A and 20.2 A, a more severe oscillation, and the variance of the current error is 0.5544. Under IDPCC, $i_q$ stays within 17.8 A to 20.6 A and the variance of the current error is 0.1757. The fluctuation of $i_q$ is smaller, and the current tracking performance, measured by the error variance, improves by 68.3%.

A variation of the permanent-magnet flux linkage mainly affects the q-axis current, so only $i_q$ is analyzed here. When the flux-linkage parameter is mismatched by a factor of three ($\psi_f = 3\psi_f$), it can be seen from Fig. 8 that the
Fig. 7 When $L_s = 3L_s$, a, c show the simulation results of the d-axis and q-axis currents using traditional DPCC+ZVR SVPWM, and b, d show the simulation results of the d-axis and q-axis currents using IDPCC+ZVR SVPWM
Fig. 8 When $\psi_f = 3\psi_f$, a shows the q-axis current simulation results using traditional DPCC+ZVR SVPWM, and b shows the q-axis current simulation results using IDPCC+ZVR SVPWM
Fig. 9 a, b show the zero-sequence current suppression results of DPCC+ZVR SVPWM and IDPCC+ZVR SVPWM, respectively, without parameter mismatch. When $L_0 = 2L_0$, c shows the simulation results of the zero-sequence current using DPCC+ZVR SVPWM, and d shows the simulation results of the zero-sequence current using IDPCC+ZVR SVPWM
current $i_q$ under DPCC is smaller than the reference current, with a significant steady-state error of 0.75 A. Under IDPCC, because the incremental model eliminates the influence of flux-linkage variations, the steady-state error is 0.07 A, which is essentially zero, and the agreement between the actual $i_q$ and its reference is significantly improved.

From Fig. 9, it can be seen that when the zero-sequence inductance is mismatched by a factor of two ($L_0 = 2L_0$), the zero-sequence current under DPCC fluctuates within ±1.8 A. Compared with ±1.2 A without zero-sequence inductance mismatch, the suppression of the zero-sequence current becomes worse. Under IDPCC, the zero-sequence current changes little compared with the matched case and remains at about ±1 A, which verifies the robustness of IDPCC in suppressing the zero-sequence current under zero-sequence inductance mismatch.
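The 68.3% improvement quoted for the q-axis current under inductance mismatch follows from the two reported error variances. The short check below reproduces that arithmetic. The helper name `relative_reduction` is illustrative, and reading the improvement as a relative reduction in variance is an interpretation that happens to match the reported figure.

```python
def relative_reduction(baseline, improved):
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return (baseline - improved) / baseline * 100.0


var_dpcc = 0.5544   # variance of the i_q error with DPCC (reported)
var_idpcc = 0.1757  # variance of the i_q error with IDPCC (reported)

print(f"{relative_reduction(var_dpcc, var_idpcc):.1f}%")  # prints 68.3%
```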
5 Conclusion

This paper proposes an IDPCC algorithm. First, the DPCC algorithm is improved with a Luenberger observer, which effectively addresses the degradation of d-axis and q-axis current tracking and of zero-sequence current suppression under inductance parameter mismatch. Then, the incremental model is combined with the Luenberger observer and applied to the d and q axes of the generator, which effectively suppresses the adverse effect of flux-linkage parameter mismatch on the steady-state current performance.
An Effective Method for Fault Localization Based on Combination of Convolution and LSTM

Jinfeng Li and Haihao Yu
Abstract Software testing accounts for a sizeable portion of the software development process, and fault localization is one of its core activities. In this paper we propose CNN-LSTM-FL, an effective method for fault localization whose architecture combines a convolutional neural network with a recurrent neural network. The method is applied to the Siemens programs from the Software Infrastructure Repository (SIR) and compared with fault localization based on convolutional neural networks (CNN-FL) and fault localization based on recurrent neural networks (LSTM-FL). The proposed method is found to be superior to the other two in both effectiveness and stability.

Keywords Software testing · Fault localization · Deep learning
J. Li (B)
College of Computer and Information Technology, Mudanjiang Normal University, Mudanjiang 157012, China
e-mail: [email protected]

H. Yu
College of Computer Science and Technology, Heilongjiang Institute of Engineering, Harbin 150050, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_66

1 Introduction

To guarantee the dependability of software, it must be tested repeatedly throughout development. Because of continual upgrades and growing complexity, testing accounts for a significant share of the total development cost, up to and even exceeding 50%. Testing begins with detecting faults. Once a fault has been detected, the next step is to locate it, which is known as fault localization, and then to fix it. Software testing is a cyclical process of defect detection, fault localization and fixing, because further defect detection is required to make sure that fixing existing faults has not introduced new ones.

Fault localization is an essential part of software testing, and many researchers have studied it and proposed solutions. Approaches include spectrum-based, model-based, mutation-based, slice-based and machine-learning-based fault localization, among others.

In recent years, deep learning has expanded rapidly and changed many fields. It has been used in automatic speech recognition (ASR), memory networks, natural language processing (NLP), image recognition, computer vision and other areas. Deep learning lets computers learn pattern features automatically and integrates feature learning into the modeling process. One of its most important advantages is that it reduces the incompleteness that hand-designed features can introduce. Because of this learning ability, a growing number of researchers apply deep learning to fault localization, where it has shown good localization results and attracted positive attention. However, each deep learning structure performs differently on different programs, and no single structure is effective on all programs. In this paper, we combine two well-performing deep learning structures and apply them to fault localization.
These structures are the Convolutional Neural Network (CNN) and the Long Short-Term Memory (LSTM) network, a type of Recurrent Neural Network (RNN). We refer to the combined structure as CNN-LSTM-FL. The main contributions of this paper are as follows: (1) we propose a fault localization technique that combines two distinct types of deep learning structures; (2) we run experiments on a selection of representative programs to determine whether the proposed method is effective.

The rest of this paper is organized as follows. Section 2 presents related work. Section 3 describes the method for combining the two deep learning structures. Section 4 covers the experimental study and the threats to its validity. Section 5 concludes the paper.
2 Related Work

Spectrum-based fault localization estimates the likelihood that particular program elements contain faults from the code coverage of the test cases [1] and the results of their execution. It assumes that elements executed more frequently by failed test cases and less frequently by passed test cases are more likely to contain faults. The program elements are ranked by suspiciousness and examined in turn until the fault is located. Spectrum-based methods include Tarantula [2], Barinel [3], Ochiai [4], Ample [5], Crosstab [6] and DStar [7]. One drawback is that several program elements may receive the same suspiciousness score, which reduces the effectiveness of localization.

Static slicing, dynamic slicing, hybrid slicing and thin slicing were proposed for fault localization by Weiser [8], Korel et al. [9], Gupta et al. [10] and Sridharan et al. [11], respectively. A dynamic slice contains the executable statements involved for one particular input, whereas a static slice contains all executable statements involved for any possible input. Dynamic slices are therefore typically smaller than static slices, which is one of the primary distinctions between the two methods. One potential drawback of slice-based localization is that the statements in the slice may all be free of faults. Even when the slice does contain the fault, localization efficiency is low because the many statements in a slice are not ranked.

Mutation-based testing is most commonly applied to test case generation and defect detection and is used far less often for fault localization. MUSE [12] and Metallaxis-FL [13] are two of the most widely used mutation-based fault localization techniques. Dutta et al. [14] proposed a mutation-spectra-based fault localization model.
First, as many mutants as possible are generated for the program, and a spectrum is generated for each mutant and for the faulty program by running the test cases. The spectrum information is then fed into the fault localization method, which produces a ranked statement sequence for each mutant and for the faulty program. Because the statement sequences of the faulty program and of its mutants are very similar, the fault is attributed to the mutated line of the mutant whose sequence is most similar to that of the original faulty program. The main drawback of mutation-based fault localization is the prohibitively high cost of creating many mutants.

In recent years, a number of researchers have begun to apply deep learning to fault localization. Wei Zheng et al. [15] proposed a fault localization method based on a deep neural network, and Zhu Zhang et al. [16–18] combined a deep neural network with dynamic backward slicing to address fault localization. They also proposed resampling failed test cases to address the imbalance between failed and passed test cases, and evaluated the proposed method on real faulty programs. Deep learning has proven very successful in fault localization; however, no model works well on all programs, because each has its own weaknesses.
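To make the spectrum-based idea concrete, the sketch below computes the Ochiai suspiciousness score from a coverage matrix and test outcomes, using the standard formula $e_f / \sqrt{(e_f + n_f)(e_f + e_p)}$ from the fault localization literature. It is an illustration of the general technique, not code from any of the cited tools, and the toy data is invented.

```python
import math


def ochiai(coverage, results):
    """Ochiai suspiciousness for each statement.

    coverage[i][j] is 1 if test i executes statement j, else 0.
    results[i] is 1 if test i failed and 0 if it passed.
    """
    total_failed = sum(results)  # e_f + n_f, identical for every statement
    scores = []
    for j in range(len(coverage[0])):
        ef = sum(1 for row, r in zip(coverage, results) if row[j] and r)
        ep = sum(1 for row, r in zip(coverage, results) if row[j] and not r)
        denom = math.sqrt(total_failed * (ef + ep))
        scores.append(ef / denom if denom else 0.0)
    return scores


# Toy example: 3 tests, 3 statements; tests 0 and 2 fail.
coverage = [[1, 1, 0],
            [1, 0, 1],
            [1, 1, 1]]
results = [1, 0, 1]
scores = ochiai(coverage, results)
```

In this toy example the second statement is executed only by failing tests and receives the highest score. Ties between statements with identical coverage are exactly the weakness of spectrum-based methods mentioned above.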
3 Our Technique

3.1 The General Idea

We build an evaluation model that exploits the learning ability of a deep learning structure to estimate how likely each statement in a program is to contain a fault. The coverage of program statements by the test cases serves as training data, and the execution results of the test cases serve as labels. Feeding both into the deep learning structure trains a nonlinear prediction model of the relationship between statement coverage and execution results.

Let P be a program with M statements and N test cases, at least one of which fails. Figure 1 shows the coverage and execution results of the N test cases on the M statements. Each $c_{ij}$ is either 1 or 0: it is 1 if the i-th test case covers the j-th statement and 0 otherwise. $r_i$ is the execution result of the i-th test case, where 1 indicates that the execution failed and 0 that it passed.

Once the deep learning structure has learned the prediction model between failed test cases and faulty statements, a virtual test case set is fed into it. Figure 2 shows the statement coverage of the virtual test case set. If program P contains M statements, the virtual set also contains M test cases, each covering exactly one statement: test case $t_i$ covers only the i-th statement. Such a test case set does not exist in practice. Because each virtual test case covers a single statement, the output of the prediction model for it indicates the likelihood that this statement is faulty. The value lies between 0 and 1, and the higher it is, the more suspicious the statement. The statements are then checked in descending order of this value until the fault is found. Figure 3 shows a flow chart of the whole process.
Fig. 1 Coverage and execution results of N test cases on M statements
Fig. 2 Coverage of statements by virtual test case set
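The virtual test case set described above is simply an M × M identity matrix, and ranking amounts to sorting the model outputs. The sketch below illustrates this with a stand-in predictor: `toy_predict` and its weights are hypothetical placeholders for the trained network, which the paper does not specify at the code level.

```python
def virtual_test_set(m):
    """M virtual test cases; test i covers only statement i (identity matrix)."""
    return [[1 if i == j else 0 for j in range(m)] for i in range(m)]


def rank_statements(predict, m):
    """Return statement indices from most to least suspicious, plus the scores.

    predict maps one coverage row (list of 0/1) to a value in [0, 1].
    """
    scores = [predict(row) for row in virtual_test_set(m)]
    order = sorted(range(m), key=lambda j: scores[j], reverse=True)
    return order, scores


def toy_predict(row):
    """Stand-in for the trained model (hypothetical weights, illustration only)."""
    weights = [0.1, 0.9, 0.4]
    return sum(w * c for w, c in zip(weights, row))


order, scores = rank_statements(toy_predict, 3)
```

With these placeholder weights, `order` lists the second statement first, then the third, then the first, which is the order in which a developer would inspect them.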
Fig. 3 The flow chart
3.2 CNN-LSTM-FL

In this section, we develop a hybrid fault localization model based on CNN-LSTM. Figure 4 shows the structure of the model. It consists of three parts: a convolutional part, a recurrent part and a fully connected layer. The model first extracts features with the convolutional part, then passes them to the recurrent part to capture the correlations between features, and finally hands the result to the fully connected layer for classification.

The first part consists of two convolutional layers, each followed by a ReLU activation function. The input of the model is the statement coverage matrix of the test cases, of size B × M × 1, where B is the batch dimension and M is the total number of statements in the program. From the statement coverage matrix, the convolutional part produces a feature map F of size B × M × 100. The feature map is then fed into the recurrent part, which learns the dependencies between statements.

The second part is the recurrent part, which consists of two recurrent layers using the well-known LSTM structure. The LSTM is a recurrent neural network
Fig. 4 CNN-LSTM structure
Fig. 5 LSTM model structure
that retains information acquired in earlier steps. Figure 5 shows the structure of an LSTM cell, which is built from three gates: a forget gate, an input gate and an output gate. The forget gate uses the sigmoid function to decide which information in the cell state can be forgotten. From the previous hidden state $h_{t-1}$ and the current input $x_t$, the sigmoid function produces a number in the range [0, 1], where 0 means that the information is discarded and 1 means that it is kept:

\[
f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)
\tag{1}
\]
The input gate selects which information from the input is stored in the memory cell. It has two components. The sigmoid function produces the first part $i_t$, which determines which values are kept and which are dropped. The tanh function produces the second part $\tilde{C}_t$, the new candidate information contributed by the current input:

\[
i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right), \qquad
\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right)
\tag{2}
\]
The current cell state is then obtained from the quantities above:

\[
C_t = f_t * C_{t-1} + i_t * \tilde{C}_t
\tag{3}
\]
The output gate determines the information that is output. The sigmoid function decides which information is output. To produce the final result, the tanh function maps the cell state to a value between −1 and 1, which is then multiplied by the gate output $o_t$:

\[
o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right), \qquad
h_t = o_t * \tanh(C_t)
\tag{4}
\]
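Equations (1) to (4) together define one time step of the LSTM cell. The following pure-Python sketch walks through them in order for small list-based vectors. The `params` layout and the helper names are assumptions made for illustration, and a real model would use a deep learning framework instead.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, v):
    """Return W v + b, with W given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step following Eqs. (1)-(4).

    params maps "f", "i", "C", "o" to a (W, b) pair; each W acts on [h_prev, x_t].
    """
    v = h_prev + x_t                                            # concatenation [h_{t-1}, x_t]
    f = [sigmoid(z) for z in affine(*params["f"], v)]           # Eq. (1): forget gate
    i = [sigmoid(z) for z in affine(*params["i"], v)]           # Eq. (2): input gate
    c_tilde = [math.tanh(z) for z in affine(*params["C"], v)]   # Eq. (2): candidate state
    c = [ft * cp + it * ct
         for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]      # Eq. (3): new cell state
    o = [sigmoid(z) for z in affine(*params["o"], v)]           # Eq. (4): output gate
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]            # Eq. (4): hidden state
    return h, c
```

As a sanity check on the gate logic: with all weights and biases set to zero, every gate outputs 0.5 and the candidate state is 0, so the new cell state is exactly half of the previous one.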
The third part of the model is the fully connected layer. Like a Multi-Layer Perceptron (MLP), it acts as the classifier of the whole network and is a key component of the model. Each neuron in the fully connected layer is connected to all of the features in the feature map, so that the extracted features are mapped into the sample space and the classification can be completed.
The model uses the cross-entropy loss function. It predicts only two classes, with probabilities p and 1 − p, respectively. The loss function is expressed as follows:

\[
L = \frac{1}{N}\sum_{i} -\left[y_i \log(p_i) + (1 - y_i)\log(1 - p_i)\right]
\tag{5}
\]

In the formula, $y_i$ is the label of sample i (1 denotes the positive class and 0 the negative class), $p_i$ is the predicted probability that sample i belongs to the positive class, and N is the number of samples.
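Equation (5) translates directly into a few lines of code. The sketch below is a plain-Python version. The clamping constant `eps` is an addition for numerical safety and is not part of the formula in the paper.

```python
import math


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean binary cross-entropy of Eq. (5).

    labels: 0/1 ground-truth values y_i
    probs:  predicted probabilities p_i of the positive class
    eps:    clamp to keep log() finite (an added safeguard)
    """
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)
```

For example, a model that always outputs 0.5 has a loss of ln 2 regardless of the labels, which is a useful baseline when reading training curves.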
4 Experimental Study

4.1 Data Set

We carried out a series of experiments to provide empirical evidence for the proposed approach. To evaluate the effectiveness of fault localization, we chose five subject programs from the Siemens suite: Replace, Schedule, Schedule2, Tcas and Tot_info. These are commonly used benchmark programs. The faulty versions of all five programs and their test cases are available from the Software Infrastructure Repository (SIR). Table 1 lists the characteristics of the subject programs: the functional description, the number of faulty versions, the program size and the number of test cases. Schedule2 has ten faulty versions; we used only nine of them because no test case fails on version 9.
Table 1 The overview of subject programs

No.  Program    Description          No. of faulty versions  LOC (lines of code)  No. of test cases
1    Replace    Pattern recognition  32                      563                  5542
2    Schedule   Priority scheduler   9                       422                  2650
3    Schedule2  Priority scheduler   9                       307                  2710
4    Tcas       Altitude separation  41                      173                  1608
5    Tot_info   Information measure  23                      406                  1052
4.2 Evaluation Metric

We use the Exam Score [19] to assess the effectiveness of fault localization techniques. It is the percentage of statements that must be examined before the faulty statement is found:

\[
\text{Exam Score} = \frac{S_{exam}}{S_{total}} \times 100
\tag{6}
\]

where $S_{exam}$ is the number of statements examined to locate the fault and $S_{total}$ is the total number of statements in the program. A lower Exam Score indicates a more effective method.
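Given a suspiciousness ranking and the known faulty statement, Eq. (6) can be evaluated as below. This is a minimal sketch: the function name is illustrative, and it assumes the ranking has no ties and that the faulty statement itself counts as examined.

```python
def exam_score(ranking, faulty_stmt):
    """Exam Score of Eq. (6): percentage of statements examined.

    ranking lists statement ids from most to least suspicious.
    """
    examined = ranking.index(faulty_stmt) + 1  # statements checked, fault included
    return examined / len(ranking) * 100.0
```

For a five-statement program whose faulty statement appears third in the ranking, three of five statements are examined, so the score is 60.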
4.3 Experimental Results

This section presents our experimental results and compares CNN-LSTM-FL with CNN-FL and LSTM-FL. Table 2 compares three statistics of the Exam Score on the five subject programs. The best is the lowest Exam Score over all faulty versions of a subject program and reflects the best result a method achieves on that program. The average is the mean Exam Score over all faulty versions of a subject program and reflects the overall effectiveness of a method. The variance is the variance of the Exam Score over all faulty versions of a subject program and reflects the stability of the method. As Table 2 shows, CNN-LSTM-FL performs well on all three statistics across the five subject programs: it ranks first (including ties) on four programs for the best Exam Score, on all five for the average and on four for the variance. This indicates that the proposed method is superior to the other two deep learning methods in both effectiveness and stability.

Figure 6 shows the distribution of Exam Scores of the three approaches on the five subject programs. The x-axis is the percentage of statements examined in each program and the y-axis is the percentage of faulty versions located. The curves of the proposed method lie, for the most part, above those of the other two methods.
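The three statistics reported in Table 2 can be computed per subject program from the list of Exam Scores of its faulty versions. The sketch below uses Python's standard `statistics` module. The paper does not say whether the population or the sample variance is used, so the choice of `pvariance` here is an assumption.

```python
import statistics


def summarize(exam_scores):
    """Best, average and variance of the Exam Scores of one subject program."""
    return {
        "best": min(exam_scores),
        "average": statistics.mean(exam_scores),
        "variance": statistics.pvariance(exam_scores),  # population variance (assumed)
    }
```

A method with a low average but a high variance localizes some faults very quickly and others very slowly, which is why the paper reports stability separately from effectiveness.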
For Replace (Fig. 6a), when only 20% of the statements have been examined, LSTM-FL has located more than 66% of the
Table 2 Three types of Exam Score for CNN-FL, LSTM-FL and CNN-LSTM-FL

Method       Exam score  Replace   Schedule  Schedule2  Tcas      Tot_info
CNN-FL       Best        0.01776   0.009479  0.065147   0.011561  0.019704
             Average     0.204596  0.104792  0.231596   0.199422  0.122593
             Variance    0.012673  0.01204   0.014193   0.012895  0.011736
LSTM-FL      Best        0.01776   0.009479  0.029316   0.00578   0.014778
             Average     0.147147  0.129015  0.205212   0.193931  0.108934
             Variance    0.023467  0.022289  0.021098   0.011204  0.008561
CNN-LSTM-FL  Best        0.01776   0.007109  0.013029   0.017341  0.002463
             Average     0.08337   0.097946  0.203909   0.145665  0.106135
             Variance    0.01614   0.006407  0.013299   0.010802  0.006324
faulty versions, CNN-FL has located more than 53% and CNN-LSTM-FL has located 78%. After examining 40% of the statements, CNN-FL and CNN-LSTM-FL identify all of the faulty versions.

For Schedule (Fig. 6b), when 15% of the statements are examined, LSTM-FL has located more than 67% of the faulty versions, while CNN-LSTM-FL and CNN-FL have located more than 89%. CNN-LSTM-FL identifies all of the faulty versions after examining 25% of the program statements, whereas CNN-FL needs to examine 30% and LSTM-FL needs to examine 40%.

For Schedule2 (Fig. 6c), when only 20% of the statements are examined, LSTM-FL identifies more than 40% of the faulty versions, CNN-FL more than 20% and CNN-LSTM-FL 60%.

For Tcas (Fig. 6d), when only 20% of the program statements are examined, LSTM-FL identifies more than 50% of the faulty versions, CNN-FL more than 40% and CNN-LSTM-FL 70%. CNN-LSTM-FL identifies all faulty versions after examining only 30% of the program statements, whereas CNN-FL and LSTM-FL both need to examine 40%.

For Tot_info (Fig. 6e), when 15% of the program statements are examined, CNN-LSTM-FL and CNN-FL have located more than 89% of the faulty versions, whereas LSTM-FL has located more than 67%. CNN-LSTM-FL identifies all of the faulty versions after examining only 25% of the statements, whereas CNN-FL needs to examine 30% and LSTM-FL needs to examine 40%.

The evaluation results indicate that the proposed approach is superior to CNN-FL and LSTM-FL in both effectiveness and stability.
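Each point on the curves discussed above is the share of faulty versions whose Exam Score does not exceed a given examination budget. A minimal sketch of that computation follows. The function name and the sample scores are hypothetical and are not data from the paper.

```python
def located_within(exam_scores, budget_pct):
    """Percentage of faulty versions located when budget_pct% of statements are examined.

    exam_scores holds one Exam Score (in percent) per faulty version.
    """
    hits = sum(1 for s in exam_scores if s <= budget_pct)
    return hits / len(exam_scores) * 100.0


# Hypothetical Exam Scores (percent) for five faulty versions of one program.
sample_scores = [5.0, 18.0, 25.0, 40.0, 12.0]
curve = [(b, located_within(sample_scores, b)) for b in (10, 20, 30, 40)]
```

Evaluating the function over a range of budgets gives one curve per method. A curve that lies above another means that more faults are found for the same inspection effort.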
Fig. 6 Effective comparison
4.4 Validity Threat
The subject programs used in the experiment are from the Siemens suite of SIR. Each program has its own characteristics, and these may influence the experimental results. Additional testing is therefore required to determine whether the approach proposed in this paper is applicable to other programs. The purpose of the experiment in
this paper is to identify artificially faulty versions. Extensive further testing is necessary to determine whether the approach is also effective on real-world faulty versions.
5 Conclusions
Fault localization is a time-consuming process, and many approaches have been proposed to give testers useful assistance while locating faults. In this paper, a convolutional neural network (CNN) and a long short-term memory (LSTM) recurrent network are combined to form a new method referred to as CNN-LSTM-FL. The proposed method is tested on five programs from the Siemens suite and evaluated alongside CNN-FL and LSTM-FL. Measured by the Exam Score metric, the proposed method is found to be more efficient and reliable. As a next step, we will apply the proposed method to other programs as well as to real faulty versions.
Acknowledgements This work was supported by the Research Project of Mudanjiang Normal University (QN2021004), Science Research Project of Heilongjiang Provincial Education Department (1451MSYYB001, GJB1214028), the Natural Science Foundation of Heilongjiang Province (LH2023F037), Discipline Construction of Mudanjiang Normal University (MSYSYL2022010, MSYSYL2022008).
Bearing-Only Formation Control for Nonlinear Multi-agent Systems with Unknown Dead-Zone Inputs Haoruo Geng, Qin Wang, Zitao Chen, and Yang Yi
Abstract In this paper, bearing rigid formation control with nonlinear dead-zone inputs is studied. First, an adaptive estimation method is used to estimate the influence of the nonlinear dead-zone inputs. Then, a formation control algorithm is designed to coordinate nonlinear multi-agent systems with unknown dead-zone inputs. Finally, the global stability of the bearing-only formation system is proved by Lyapunov stability theory. Simulation results verify the effectiveness of the proposed control algorithm.
Keywords Adaptive control · Formation control · Unknown disturbances · Nonholonomic systems · Relative bearing
H. Geng · Q. Wang (B) · Z. Chen · Y. Yi
Department of Information Engineering, Yangzhou University, Yangzhou 225100, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_67

1 Introduction
In recent years, with the continuous development of modern technology, multi-agent systems have attracted the attention of more and more scholars in emerging disciplines including artificial intelligence, computer science, information theory and control theory. For example, in the field of computer science, Mahatthanajatuphat [1] implemented an obstacle avoidance function using multi-agent reinforcement learning (MARL), and Liu [2] studied distributed adaptive formation control for leader-following multi-agent systems affected by noise and proposed a control scheme integrating distributed estimation with a formation control algorithm. Multi-agent systems are distributed, collaborative and adaptive, and therefore also have a broad range of potential applications in control-related fields, such as robot formation, intelligent transportation and wireless sensor networks. Lv [3] designed a multi-agent-based distributed controller and provided a control algorithm satisfying the system constraints to achieve consistency. Yan [4] proposed an event-triggered formation control method for multi-UAV systems modeled as time-delayed discrete-time multi-agent systems (MAS), which is suitable for practical applications.
Multi-agent formation control is a complex research topic, and its difficulty lies in how to determine the relative relationship among formation members. Previous studies show that relative-position-based formation depends heavily on the control algorithm and requires complex calculations and controllers, so it is not flexible enough, while relative-distance-based formation is susceptible to errors and other factors and needs to be adjusted in real time. Compared with the traditional distance-based and position-based formation control methods, the bearing-only formation control method requires only a small amount of measurement information and can still provide effective observation information to meet the control requirements and exchange information. It does not cause target aliasing, so the relative relationship between targets is easy to judge. Bearing-only formation control methods are therefore receiving more and more attention. However, few studies address this type of formation control for nonlinear systems. In practical applications, linear systems are just idealized models and most systems are nonlinear, so it is especially important to study formation control of nonlinear systems [5–8].
The control of nonlinear multi-agent systems with unknown dead-zone inputs has also been a hot research topic in recent years [9–12]. Unlike in a linear system, a crucial problem is that the nonlinear factors can greatly affect the performance of the system. To address this challenge, researchers have proposed various approaches, such as adaptive control, neural network control, and model predictive control. Some researchers have made progress in effectively eliminating or reducing the influence of nonlinear factors on the control system.
For example, Li et al. [13] studied uncertain strict-feedback nonlinear systems and proposed an adaptive saturation control strategy to handle dead-zone inputs, and Liu [14] studied the adaptive control problem for nonlinear strict-feedback systems. Wei [15] considered adaptive fuzzy output-feedback control design and proposed a fuzzy state controller to compensate for the unknown functions of nonlinear systems. The development of more robust and adaptable control strategies for nonlinear systems is likely to remain a critical area of research in the years ahead.
To solve the bearing rigid formation control problem with nonlinear dead-zone inputs, we studied related bearing-only formation schemes. Li et al. [16] studied the angle formation tracking control problem of non-holonomic multi-agent systems using an adaptive estimation algorithm, but the input there is linear. The previous studies [17–20] considered only linear inputs, and bearing-only formation control with nonlinear inputs has received less attention. In this paper, a bearing-only formation control scheme for unknown nonlinear systems with dead-zone inputs is proposed, and how to reduce the influence of the unknown nonlinear dead-zone inputs is discussed. An adaptive estimation method is used to estimate the effect of the nonlinear dead-zone inputs, and a control law is then constructed that enables the agents to reach the desired formation from any initial position. Finally, the global stability of the bearing-only formation system is proved by Lyapunov stability theory, and the effectiveness of the method is verified by simulation.
Fig. 1 Non-linear input function $\varphi_i(u_i)$
2 Problem Description
Consider a multi-agent system with unknown nonlinear dead-zone inputs moving in the space $\mathbb{R}^d$, where each agent is subject to an external disturbance:

$$\dot{r}_i = \varphi_i(u_i) + d_{ui}$$

The dead-zone model with input $u_i$ and output $\varphi_i(u_i)$ is described as follows:

$$\begin{cases} (u_i - u_{ri})\,\varphi_i^{+}(u_i) \ge \alpha_{ri}(u_i - u_{ri})^2, & u_i > u_{ri} \\ \varphi_i(u_i) = 0, & -u_{li} \le u_i \le u_{ri} \\ (u_i + u_{li})\,\varphi_i^{-}(u_i) \ge \alpha_{li}(u_i + u_{li})^2, & u_i < -u_{li} \end{cases} \tag{1}$$

All the dead-zone parameters $\alpha_{ri}$, $\alpha_{li}$, $u_{ri}$, $u_{li}$ in the above equation are unknown bounded non-zero constants. The physical meaning of the model is shown in Fig. 1. For the convenience of analysis, let 0 … 0, we have $u_i(t) < -u_{\max}$, so $\varphi_i(u_i) = \varphi_i^{-}(u_i) \le \alpha_i u_i + \delta_i$, and we get:

$$(\nabla_{r_i}V_i)^T \varphi_i(u_i) \le (\nabla_{r_i}V_i)^T(\alpha_i u_i + \delta_i) = (\nabla_{r_i}V_i)^T \alpha_i u_i + (\nabla_{r_i}V_i)^T \delta_i \tag{13}$$

When $\nabla_{r_i}V_i < 0$, we have $u_i(t) > u_{\max}$, so $\varphi_i(u_i) = \varphi_i^{+}(u_i) \ge \alpha_i u_i - \delta_i$, and we get:

$$(\nabla_{r_i}V_i)^T \varphi_i(u_i) \le (\nabla_{r_i}V_i)^T(\alpha_i u_i - \delta_i) = (\nabla_{r_i}V_i)^T \alpha_i u_i - (\nabla_{r_i}V_i)^T \delta_i \tag{14}$$

Combining Eqs. (13) and (14), we get:

$$(\nabla_{r_i}V_i)^T \varphi_i(u_i) \le \alpha_i (\nabla_{r_i}V_i)^T u_i + |\nabla_{r_i}V_i| \cdot \delta_i \tag{15}$$
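The step that merges the two sign cases into (15) can be spelled out as follows. This is a sketch written for a single scalar component of $\nabla_{r_i}V_i$, which is an assumption made here for readability; the vector case applies the same argument componentwise.

```latex
\begin{aligned}
\nabla_{r_i}V_i > 0:&\quad
  \nabla_{r_i}V_i\,\varphi_i(u_i)
  \le \alpha_i \nabla_{r_i}V_i\,u_i + \nabla_{r_i}V_i\,\delta_i
  = \alpha_i \nabla_{r_i}V_i\,u_i + |\nabla_{r_i}V_i|\,\delta_i, \\
\nabla_{r_i}V_i < 0:&\quad
  \nabla_{r_i}V_i\,\varphi_i(u_i)
  \le \alpha_i \nabla_{r_i}V_i\,u_i - \nabla_{r_i}V_i\,\delta_i
  = \alpha_i \nabla_{r_i}V_i\,u_i + |\nabla_{r_i}V_i|\,\delta_i .
\end{aligned}
```

In both cases the term involving $\delta_i$ equals $|\nabla_{r_i}V_i|\,\delta_i$, so a single bound covers either sign of the gradient.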
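The derivative of $W_i$ with respect to $t$ satisfies:

$$\dot{W}_i(r_{ij}(t)) \le 2\sum_{i=1}^{n}\left(-\bar{k}_i\|\nabla_{r_i}V_i\|^2 - |\nabla_{r_i}V_i|^T\hat{\delta}_i + (\nabla_{r_i}V_i)^T D_i + (\nabla_{r_i}V_i)^T \delta_i\right) + \sum_{i=1}^{n}\frac{1}{\eta_{i1}}(\hat{\delta}_i - \delta_i)\cdot\dot{\hat{\delta}}_i \tag{16}$$

Let $\delta_i + D_i = \delta_i$ (that is, $D_i$ is absorbed into $\delta_i$), we have:

$$\dot{W}_i \le \sum_{i=1}^{n} 2\left(-\bar{k}_i\|\nabla_{r_i}V_i\|^2 + (\hat{\delta}_i - \delta_i)\left(\frac{1}{2\eta_{i1}}\dot{\hat{\delta}}_i - |\nabla_{r_i}V_i|^T\right)\right) \tag{17}$$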
It can be seen from (17) that $W_i$ is non-increasing and bounded below by zero, so the limit $W_i(\infty)$ exists. Let

$$T(g_{ij}(t)) = -2\sum_{i=1}^{n}\bar{k}_i\|\nabla_{r_i}V_i\|^2 = -2\sum_{i=1}^{n}\bar{k}_i\Big\|\sum_{j\in N_i}(g_{ij} - g_{ij}^*)\Big\|^2 .$$

Integrating both sides gives:

$$\lim_{t\to\infty}\int_0^t T(g_{ij}(\tau))\,d\tau \le -\lim_{t\to\infty}\int_0^t \dot{W}_i(r_{ij}(\tau))\,d\tau = W_i(r_{ij}(0)) - W_i(r_{ij}(\infty)) \tag{18}$$
Namely, $\int_0^t T(g_{ij}(\tau))\,d\tau$ exists and is bounded, and we next show that $T(g_{ij}(t))$ is uniformly continuous. For any initial state, $T(g_{ij}(t))$ and $u_i$ are bounded, and therefore $\dot{r}_{ij}$ is bounded. Because $g_{ij} = r_{ij}/\|r_{ij}\|$, both $g_{ij}$ and $\dot{g}_{ij}$ are bounded for $i = 1, 2, \ldots, n$, $j \in N_i$, which means that $g_{ij}$ is uniformly continuous for $i = 1, 2, \ldots, n$, $j \in N_i$. Since $T$ depends continuously on $g_{ij}(t)$, it follows that $T(g_{ij}(t))$ is uniformly continuous for all $t \in [0, +\infty)$. Therefore, the conditions of Barbalat's lemma are met and $\lim_{t\to\infty} T(g_{ij}(t)) = 0$ can be proved, which gives:

$$\sum_{i=1}^{n}\Big\|\sum_{j\in N_i}(g_{ij} - g_{ij}^*)\Big\|^2 = 0 \;\Rightarrow\; \sum_{j\in N_i}(g_{ij} - g_{ij}^*) = 0$$

In matrix-vector form, the equation is:

$$\bar{H}^T(g - g^*) = 0$$

So, we have:

$$r^T\bar{H}^T(g - g^*) = \frac{1}{2}\sum_{k=1}^{m}\|e_k\|\,\|g_k - g_k^*\|^2 = 0$$

It follows that $\lim_{t\to\infty}(g_{ij}(t) - g_{ij}^*(t)) = 0$, that is, the desired bearing rigid formation is globally asymptotically stable, and all agents reach the desired relative bearings.
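The quantity driven to zero in this proof is the gap between actual and desired bearings. A minimal Python sketch of how a bearing and the bearing error can be computed is given below. It assumes the convention that the relative position is taken from agent i to agent j, and the hexagon, the ring of edges and the helper names are hypothetical choices for illustration rather than the configuration used by the authors.

```python
import math


def bearing(r_i, r_j):
    """Unit vector pointing from agent i to agent j (assumed convention)."""
    diff = [b - a for a, b in zip(r_i, r_j)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("agents coincide; bearing is undefined")
    return [d / norm for d in diff]


def bearing_error(positions, desired, edges):
    """Sum of ||g_ij - g*_ij|| over the given edges."""
    total = 0.0
    for i, j in edges:
        g = bearing(positions[i], positions[j])
        g_star = bearing(desired[i], desired[j])
        total += math.sqrt(sum((a - b) ** 2 for a, b in zip(g, g_star)))
    return total


# Hypothetical desired formation: a regular hexagon of unit radius.
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
edges = [(k, (k + 1) % 6) for k in range(6)]

# A scaled and shifted copy of the hexagon keeps every bearing unchanged.
moved = [(2.0 * x + 5.0, 2.0 * y - 1.0) for x, y in hexagon]
```

Because bearings are invariant to translation and uniform scaling, `bearing_error(moved, hexagon, edges)` is zero up to rounding, which illustrates why a bearing-only law fixes the shape of the formation but not its position or size.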
5 Simulation Results
In this part, we verify the effectiveness of the proposed algorithm through a simulation example. First, the dynamic equation of the system is:

$$\dot{r}_i = \varphi_i(u_i) + d_{ui}$$

where

$$\varphi_i(u_i) = \begin{cases} (1.5 + 0.5\cos u_i)(u_i - 0.5), & u_i > 0.5 \\ 0, & -0.6 \le u_i \le 0.5 \\ (2.5 + 0.5\cos u_i)(u_i + 0.6), & u_i < -0.6 \end{cases}$$
Fig. 2 The trajectory diagram of agents
Fig. 3 The error between the actual angle and the desired angle (vertical axis: $\|g_{ij} - g_{ij}^*\|$; horizontal axis: t (sec); curves: $\|g_{12} - g_{12}^*\|$, $\|g_{13} - g_{13}^*\|$, $\|g_{14} - g_{14}^*\|$, $\|g_{23} - g_{23}^*\|$, $\|g_{24} - g_{24}^*\|$, $\|g_{34} - g_{34}^*\|$, $\|g_{16} - g_{16}^*\|$)
Fig. 4 The trajectory diagram of agents
and $d_{ui} = 1.2\cos r_i$. In the plane $\mathbb{R}^2$, six robot agents start from non-collinear initial positions, adjust their positions and eventually form a regular hexagon formation. In the simulation, the controller coefficients are $u_{\max} = [0.5, 0.5]^T$, $\alpha_{\min} = 5$, $\bar{k}_i = 4.1$, $\eta_i = 3.8$, $q = 1.2$. It can be seen from Figs. 2 and 3 that the bearing error between agents gradually converges to 0, the actual relative angle reaches the desired angle, and the agents eventually converge to the desired formation. In three-dimensional space $\mathbb{R}^3$ (Figs. 4 and 5), the robot agents start from their initial positions and eventually form the pre-designed formation. In the simulation, $u_{\max} = [0.5, 0.5, 0.5]^T$, $\alpha_{\min} = 5$, $\bar{k}_i = 2$, $\eta_i = 3.1$. As can be seen from the 3D figures, the bearing error between agents gradually tends to 0, and the agents eventually converge to the desired formation. Good results are obtained in both the 2D and the 3D simulations, and the designed controller performs the hexagon formation task well. The simulation results therefore verify the effectiveness of the control algorithm.
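The dead-zone nonlinearity used in this simulation can be written as a small function. The sketch below transcribes the piecewise definition given for the example and adds a check of the sector-type inequality from the dead-zone model. The slope bounds `alpha_r = 1.0` and `alpha_l = 2.0` are assumptions chosen here as the smallest values of $1.5 + 0.5\cos u$ and $2.5 + 0.5\cos u$; they are not parameters stated by the authors.

```python
import math


def dead_zone(u: float) -> float:
    """Dead-zone input nonlinearity of the simulation example."""
    if u > 0.5:
        return (1.5 + 0.5 * math.cos(u)) * (u - 0.5)
    if u < -0.6:
        return (2.5 + 0.5 * math.cos(u)) * (u + 0.6)
    return 0.0  # inside the dead zone the actuator produces no output


def satisfies_sector_bound(u: float, alpha_r: float = 1.0, alpha_l: float = 2.0) -> bool:
    """Check the dead-zone model inequality at input u.

    alpha_r and alpha_l are illustrative lower bounds on the slopes
    (assumed here, not taken from the paper).
    """
    phi = dead_zone(u)
    if u > 0.5:
        return (u - 0.5) * phi >= alpha_r * (u - 0.5) ** 2
    if u < -0.6:
        return (u + 0.6) * phi >= alpha_l * (u + 0.6) ** 2
    return phi == 0.0
```

Any control command between -0.6 and 0.5 produces no actuation at all, which is why the controller has to push the command past the break points before the agents move.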
Fig. 5 The error between the actual angle and the desired angle (vertical axis: $\|g_{ij} - g_{ij}^*\|$; horizontal axis: t (sec); curves: $\|g_{12} - g_{12}^*\|$, $\|g_{13} - g_{13}^*\|$, $\|g_{14} - g_{14}^*\|$, $\|g_{23} - g_{23}^*\|$, $\|g_{34} - g_{34}^*\|$, $\|g_{15} - g_{15}^*\|$, $\|g_{45} - g_{45}^*\|$)
References
1. Mahatthanajatuphat, C., Srisomboon, K., Lee, W., Samothai, P., Kheaksong, A.: Investigation of multi-agent reinforcement learning on merge ramp for avoiding car crash on highway. In: 2022 37th International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC), Phuket, Thailand, pp. 1050–1053 (2022). https://doi.org/10.1109/ITCCSCC55581.2022.9895011
2. Liu, Y.J., Liu, Z.X.: Distributed adaptive formation control of multi-agent systems with measurement noises. Automatica 150, 4625–4634 (2023)
3. Lv, D.X., Zhang, Z.C., Zhu, L.H.: Multi-agent based energy control for solar unmanned aerial vehicles. Control Decision 38(2), 372–378 (2023)
4. Yan, Z.W., Liang, H.Y., Li, X.D., et al.: Event-triggered formation control for time-delayed discrete-time multi-agent system applied to multi-UAV formation flying. J. Franklin Inst. 360(5), 3677–3699 (2023). https://doi.org/10.1016/j.jfranklin.2023.01.036
5. Ma, S.X., Xi, J.Z.: Multi-UAV cooperative formation flight control system for cascade active disturbance rejection control. Int. Conf. Electr. Eng. Control Technol. (CEECT), Macau, Macao, 2021, pp. 93–97 (2021). https://doi.org/10.1109/CEECT53198.2021.9672641
6. Chen, T., Sun, Y., Niu, X., Lan, Y., Fang, W., Liu, P.: Formation control for second-order nonlinear multi-agent systems with external disturbances via adaptive method. In: China Automation Congress (CAC), Xiamen, China, 2022, pp. 5616–5620 (2022). https://doi.org/10.1109/CAC57257.2022.10055393
7. Fang, Z., Jiang, D., Huang, J., Cheng, C.X., et al.: Autonomous underwater vehicle formation control and obstacle avoidance using multi-agent generative adversarial imitation learning. Ocean Eng. 262 (2022). https://doi.org/10.1016/j.oceaneng.2022.112182
8. Xu, B., Wang, Z.Y., Shen, H.: Distributed predictive formation control for autonomous underwater vehicles under dynamic switching topology. Ocean Eng. 262, 549–559, 0029–8018 (2023). https://doi.org/10.1016/j.oceaneng.2022.112240
9. Zhang, J.Y., Meng, D.Y.: Iterative learning-based formation control for multi-agent systems with locally lipschitz nonlinear dynamics. In: 2019 IEEE 15th International Conference on Control and Automation (ICCA), Edinburgh, UK, pp. 1614–1619 (2019). https://doi.org/10.1109/ICCA.2019.8900010
10. Xu, S., Deng, H., Zhang, L.: Leader-follower formation control for multiple mobile robots with velocity constraints. In: 2022 China Automation Congress (CAC), Xiamen, China, pp. 5466–5471 (2022). https://doi.org/10.1109/CAC57257.2022.10055196
11. Chan, N.P.K., Jayawardhana, B., Marinade, H.G.: Angle-constrained formation control for circular mobile robots. IEEE Control Syst. Lett. 5(1), 109–114 (2021). https://doi.org/10.1109/LCSYS.2020.3000061
12. Zhang, X., Su, W., Chen, L.: A multi-agent formation control method based on bearing measurement. In: 2019 4th International Conference on Measurement, Information and Control (ICMIC), pp. 66–72 (2019). https://doi.org/10.1109/ICMIC48233.2019.9068562
13. Li, Y., Tang, L., Wang, D.: Adaptive saturated control for linearized uncertain strict-feedback nonlinear systems with dead-zone input. IEEE/CAA J. Automatica Sinica 8(2), 476–485 (2021)
14. Liu, Z., Wang, F., Zhang, Y., et al.: Adaptive tracking control for a class of nonlinear systems with a fuzzy dead-zone. IEEE Trans. Fuzzy Syst. 23(1), 193–204 (2015). https://doi.org/10.1109/TFUZZ.2014.2310491
15. Wei, Y., Wang, Y., et al.: IBLF-based finite-time adaptive fuzzy output-feedback control for uncertain MIMO nonlinear state-constrained systems. IEEE Trans. Fuzzy Syst. (2020). https://doi.org/10.1109/TFUZZ.2020.3021733
16. Li, X., Wen, C., Fang, X., Wang, J.: Adaptive bearing-only formation tracking control for nonholonomic multiagent systems. IEEE Trans. Cybern. (2021). https://doi.org/10.1109/TCYB.2020.3042491
17. Wu, K., Hu, J., Lennox, B., Arvin, F.: Finite-time bearing-only formation tracking of heterogeneous mobile robots with collision avoidance. IEEE Trans. Circ. Syst. II: Express Briefs 68(10), 3316–3320 (2021). https://doi.org/10.1109/TCSII.2021.3066555
18. Liang, H., Tan, Q., Dong, X., Li, Q., Ren, Z.: Formation tracking of multi-agent systems with bearing-only measurement. In: 2015 34th Chinese Control Conference (CCC), pp. 7124–7129 (2015). https://doi.org/10.1109/ChiCC.2015.7260767
19. Lin, Q., Miao, Z., Wang, Y., Lin, J., Zhong, H.: Differentiator-based formation control of quadrotors with bearing-only measurements. In: 2022 37th Youth Academic Annual Conference of Chinese Association of Automation (YAC), Beijing, China, pp. 121–126 (2022). https://doi.org/10.1109/YAC57282.2022.10023884
20. Huang, Y., Zhao, J., Lan, W., Yu, X.: Bearing-only formation control of nonholonomic mobile agents with local reference frames. In: IEEE International Conference on Unmanned Systems (ICUS), Guangzhou, China, pp. 861–866 (2022). https://doi.org/10.1109/ICUS55513.2022.9986967
Adaptive Event-Triggered Output Feedback Tracking Control for Uncertain Nonlinear Systems with Sensor Failures Chen Sun, Yan Lin, and Lin Li
Abstract In this paper, an adaptive event-triggered output feedback tracking scheme is proposed for a class of nonlinear systems with system nonlinearities and sensor failures. Particularly, the system nonlinearities satisfy polynomial growth condition, and both sensor gain variation and additive error are involved. While a state observer based on a dynamic gain is designed to estimate the system states, two triggering conditions are constructed to obtain the triggering values of observer states and dynamic gain, which further leads to the event-triggered control signal. It is shown that by using our scheme, the event-triggered fault-tolerant tracking can be achieved, and the real tracking error can converge to a small residual set. A hydraulic actuation system is used to demonstrate the effectiveness of the proposed scheme. Keywords Adaptive event-triggered control · Output feedback tracking · Nonlinear systems · Sensor failures · Dynamic gain
1 Introduction
Event-triggered control has drawn considerable attention due to the popularity of networks in many practical systems, as pointed out in [1–3]. In particular, a set of triggering conditions is monitored, and information transmission and control updates occur only when at least one of the conditions is violated.
C. Sun · L. Li School of Energy and Power Engineering, Beihang University, Beijing 100191, China Y. Lin (B) College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266590, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_68
In this paper, we consider a class of nonlinear uncertain systems

$$\dot{x}_i = x_{i+1} + \phi_i(x, \theta(t)),\quad i = 1, \ldots, n-1, \qquad \dot{x}_n = u + \phi_n(x, \theta(t)), \qquad y = x_1, \tag{1}$$
where x = [x1 , . . . , xn ]T ∈ Rn , u ∈ R and y ∈ R are the system state vector, input and output, respectively, θ (t) ∈ Rm is an unknown continuous time-varying vector within an unknown bounded set, and the unknown continuous functions φi (x, θ ) : Rn × Rm → R, i = 1, . . . , n, are locally Lipschitz with respect to all the system states. Note that only the system output y is measured by one sensor, whose output is ys . The sensor failure model is: ys (t) = s (t)y(t) + δs (t), t ≥ ts ,
(2)
where ts is an unknown time instant, 0 < s (t) ≤ 1 denotes the sensor gain variation and δs (t) denotes the sensor additive error. Both s (t) and δs (t) are uniformly differentiable and bounded functions. For the nonlinear terms φi (·), i = 1, . . . , n, the following assumption is made. Assumption 1 There is an unknown constant ϑ ≥ 0 such that |φi (x, θ (t))| ≤ ϑ(1 + |y| p )(|x1 | + · · · + |xi |), i = 1, . . . , n,
(3)
where $p$ is a known positive integer. The control objective is to design an adaptive event-triggered output feedback tracking controller that compensates for system nonlinearities, sensor failures, and execution error, such that the system output $y$ tracks a reference signal $y_r$ as closely as possible, where it is assumed that $y_r$ and its derivative $\dot{y}_r$ are continuous and bounded by an unknown positive constant. Adaptive event-triggered control has been a popular technique. Its main merit is that both the system uncertainties and the execution error can be accommodated, so that the input-to-state stability (ISS) assumption with respect to the execution error in [3] is no longer needed. Despite the progress in research on state feedback adaptive schemes [4, 5], adaptive event-triggered output feedback control still needs further study. Specifically, some adaptive event-triggered stabilizing schemes [6–8] based on the adaptive estimation technique were proposed for a class of uncertain nonlinear systems whose uncertainties depend only on the system output. In addition, some adaptive event-triggered stabilizing schemes [9–11] based on the gain scaling technique were proposed for a class of uncertain nonlinear systems whose nonlinearities depend on the unmeasurable states and satisfy different growth conditions. Note that the above results all assume that the output sensor is failure free, an assumption that is unreliable in practice. However, few results are available for adaptive
event-triggered controllers subject to sensor failures. In [12], an event-triggered adaptive control scheme was proposed for a class of uncertain nonlinear systems in the output feedback canonical form with only sensor additive error. For a class of p-normal nonlinear systems with only sensor gain variation, Li et al. [13] proposed an adaptive scheme based on a dynamic gain for event-triggered output-feedback stabilization. Therefore, it is essential to pursue an event-triggered control scheme that tolerates more general sensor failures. In this paper, an adaptive event-triggered output feedback tracking scheme is proposed for a class of uncertain nonlinear systems. By using the gain scaling technique, a state observer is constructed, and two triggering conditions are proposed for the observer states and the dynamic gain, which lead to the event-triggered input. It is proved that the system nonlinearities, the execution error, and the sensor failures can be counteracted, and that the measurable tracking error converges to an arbitrarily small residual set. It is worth mentioning that two sensor failure patterns, rather than the single sensor failure pattern in [12, 13], are considered; the polynomial growth condition allowed on the system is less conservative than those in [6, 9, 12]; and a more general tracking problem is considered.
2 Adaptive Event-Triggered Fault-Tolerant Controller
2.1 Observer with Dynamic Gain and Event-Triggered Controller
Lemma 1 is first introduced to determine some important design parameters.
Lemma 1 ([14]) For any positive constant $\kappa$, there exist positive constants $\gamma_1$ and $\gamma_2$, positive definite matrices $P = P^T > 0$ and $Q = Q^T > 0$, and vectors $a = [a_1, \ldots, a_n]^T$ and $k = [k_1, \ldots, k_n]^T$, such that

$$\begin{aligned} (A - ac^T)^T P + P(A - ac^T) &\le -\gamma_1 I, & \kappa P \le DP + PD + 2\kappa P &\le \gamma_2 P, \\ (A - bk^T)^T Q + Q(A - bk^T) &\le -4\gamma_1 I, & \kappa Q \le DQ + QD + 2\kappa Q &\le \gamma_2 Q, \end{aligned} \tag{4}$$

where $D = \mathrm{diag}\{0, 1, \ldots, n-1\}$, and $A$, $b$, $c$ are defined as

$$A = \begin{bmatrix} 0 & I_{n-1} \\ 0 & 0 \end{bmatrix},\quad b = \begin{bmatrix} 0_{(n-1)\times 1} \\ 1 \end{bmatrix},\quad c = \begin{bmatrix} 1 \\ 0_{(n-1)\times 1} \end{bmatrix}. \tag{5}$$
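To make the structure of these matrices concrete, the sketch below builds $A$, $b$, $c$ and $A - ac^T$ in plain Python and tests the Hurwitz property for a third-order case. It relies on the fact that, for this chain-of-integrators structure, the characteristic polynomial of $A - ac^T$ is $s^n + a_1 s^{n-1} + \cdots + a_n$. The function names and the sample gain vectors are hypothetical, and the check covers only the Hurwitz property, not the matrix inequalities in (4).

```python
def shift_matrix(n):
    """A = [[0, I_{n-1}], [0, 0]]: ones on the superdiagonal."""
    return [[1.0 if col == row + 1 else 0.0 for col in range(n)] for row in range(n)]


def unit_vector(n, index):
    """Column vector with a single 1 at the given position."""
    return [1.0 if k == index else 0.0 for k in range(n)]


def observer_matrix(a):
    """A - a c^T with c = e_1, i.e. subtract a from the first column of A."""
    n = len(a)
    m = shift_matrix(n)
    for row in range(n):
        m[row][0] -= a[row]
    return m


def is_hurwitz_order3(a):
    """Routh-Hurwitz test for s^3 + a1 s^2 + a2 s + a3."""
    a1, a2, a3 = a
    return a1 > 0 and a3 > 0 and a1 * a2 > a3


n = 3
b = unit_vector(n, n - 1)   # b = [0, 0, 1]^T
c = unit_vector(n, 0)       # c = [1, 0, 0]^T
a_good = [3.0, 3.0, 1.0]    # gives (s + 1)^3, all roots at -1
a_bad = [1.0, 1.0, 2.0]     # violates a1 * a2 > a3
```

The same construction applies to $A - bk^T$, where the gains enter the last row instead of the first column.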
To proceed, suppose the event-triggered instants are {t j }∞ j=0 , where any two adjacent instants satisfying t j+1 > t j with t0 = 0 being the initial sampling instant. Then, we define the following two variables = ys − yr , r = x1 − yr ,
(6)
with which the state observer with a dynamic gain is given as follows: x˙ˆi = xˆi+1 + i ai ( − xˆ1 ), i = 1, . . . , n − 1, xˆ˙n = u + n an ( − xˆ1 ),
(7)
where xˆi , i = 1, . . . , n, are the observer states, u = w(t j ), ∀t∈[t j , t j+1 ) is an observer input, the gain vector a has been selected from Lemma 1 with 2κ p < 1, and the dynamic gain with the initial condition (0) = 1 is updated by
˙ = max −β1 2 + β2 (1 + |ys | p )2 , ( − xˆ1 )2 + xˆ12 − ϒ , 0 ,
(8)
where β1 , β2 and ϒ are positive design constants that will be designed later. Since
˙ is locally Lipschitz in ( , ys , xˆ1 , yr ), the following properties of hold:
≥ 1, ˙ ≥ 0, ˙ ≥ −β1 2 + β2 (1 + |ys | p )2 ,
˙ ≥ ( − xˆ1 )2 + xˆ12 − ϒ , ∀t ≥ 0.
(9)
Then, the event-triggered FTC controller is designed as u(t) = w(t j ) = w(xˆ1 (t j ), . . . , xˆn (t j ), (t j )) = − n (t j )k1 xˆ1 (t j ) − n−1 (t j )k2 xˆ2 (t j ) − · · · − (t j )kn xˆn (t j ).∀t∈[t j , t j+1 ). (10)
2.2 Event-Triggered Mechanism Now, suppose the triggering instant t j is already known, and the next triggering instant t j+1 is determined by monitoring the following triggering conditions n |xˆk (t j ) − xˆk | ≥ t j+1,x = inf t > t j : k=1
σx
1+κ (t
j)
,
t j+1, = inf{t > t j : − (t j ) ≥ σ },
(11) (12)
where σx > 0 and 0 < σ < 1 are constants. Then the next triggering instant is t j+1 = min{t j+1,x , t j+1, }.
(13)
ˆ j ), x¯i = xˆi (t j ), i = For convenience in the subsequent analysis, we define ¯ = (t 1, . . . , n, and the event-triggered control signal (10) can be rewritten as ¯ n x¯n , ∀t∈[t j , t j+1 ). u(t) = w(t j ) = − ¯n k1 x¯1 − ¯n−1 k2 x¯2 − · · · − k
(14)
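The two triggering conditions can be read as a simple check performed between updates. The sketch below is one reading of (11) and (12), not the authors' implementation. The symbol of the dynamic gain is not legible in this copy, so it is called `gain` here, and the conditions are interpreted as "the summed deviation of the observer states from their held values reaches $\sigma_x$ divided by the held gain to the power $1 + \kappa$" and "the gain has grown past its held value by the fraction `sigma_gain` of its current value". All names are hypothetical.

```python
def should_trigger(x_hat, x_held, gain, gain_held, sigma_x, sigma_gain, kappa):
    """Return True when either triggering condition is met.

    x_hat:      current observer states
    x_held:     observer states sampled at the last triggering instant
    gain:       current value of the dynamic gain
    gain_held:  dynamic gain sampled at the last triggering instant
    """
    # Condition on the observer states, cf. (11).
    state_gap = sum(abs(held - now) for held, now in zip(x_held, x_hat))
    state_event = state_gap >= sigma_x / gain_held ** (1.0 + kappa)

    # Condition on the dynamic gain, cf. (12).
    gain_event = gain - gain_held >= sigma_gain * gain

    # The next instant is the earlier of the two events, cf. (13).
    return state_event or gain_event
```

In a simulation loop this check would run at every integration step; when it returns True, the held states and held gain are refreshed and the control signal is recomputed from the new samples.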
3 Main Results 3.1 Stability Analysis First of all, from (11), it yields that for any interval [t j , t j+1 ), j ∈ {0, 1, 2, . . .}, x , i = 1, . . . , n. Thus, there exist a series of continuous time-varying |x¯i − xˆi | ≤ ¯σ1+κ parameters λi (t), i = 1, . . . , n, satisfying |λi (t)| ≤ 1 with λi (t j ) = 0, such that x¯i = xˆi + λi (t)σx / ¯1+κ , ∀t ∈ [0, +∞).
(15)
From (12), it can be deduced that (1 − σ ) ≤ ¯ ≤ . Moreover, noting the fact that ≥ 1, there also exists a continuous time-varying parameter λ (t), satisfying 1 − σ ≤ λ (t) ≤ 1 with λ (t j ) = 1, such that
¯ = λ (t) , ∀t ∈ [0, +∞).
(16)
To proceed, let the estimate error be defined as ˆ η˜ = [η˜ 1 , . . . , η˜ n ]T = x − Yr − x,
(17)
where xˆ = [xˆ1 , . . . , xˆn ]T and Yr = [yr 0 , 0, . . . , 0]T with yr 0 = (yr − δs )/s . Then, considering the following transformations ei = η˜ i / i−1+κ , z i = xˆi / i−1+κ , i = 1, . . . , n,
(18)
the dynamics of the error vectors e = [e1 , . . . , en ]T and z = [z 1 , . . . , z n ]T , satisfy
˙ e˙ = (A − acT )e + H −1 (x, θ, y˙r 0 , ) + a(1 − s )(e1 + z 1 ) − (κ I + D)e,
(19)
˙ z˙ = (A − bk T )z − a(1 − s )z 1 + as e1 − (κ I + D)z
n n σx n+1− j n+1− j 2− j−κ + b
(1 − λ
)k j z j − b λ
kjλj ,
¯1+κ j=1
(20)
j=1
where H = diag{ κ , . . . , n−1+κ } and (x, θ, y˙r 0 , ) = [ψ1 (·), ψ2 (·), . . . , ψn (·)]T = [φ1 (·) − y˙r 0 , φ2 (·), . . . , φn (·)]T . (21) Now, we are in a position to present our main results.
Theorem 1 Let the closed-loop system satisfy Assumption 1, where the plant and sensor failures are modeled by (1) and (2), the state observer and dynamic gain are given by (7) and (8), and the triggering conditions and the triggering control signal are designed as (11)–(13) and (14), respectively. Then, there exists a constant $\bar{\varrho}_m$ such that if the sensor gain variation satisfies $\varrho_s \in [1 - \bar{\varrho}_m, 1]$, $\forall t \ge 0$, all the signals of the closed-loop system are bounded, the measurable tracking error $\epsilon$ converges to an arbitrarily small residual set, and the real tracking error $\epsilon_r$ converges to a small residual set only related to the sensor failures $\varrho_s$ and $\delta_s$.
Proof For the closed-loop system, the locally Lipschitz condition ensures the existence and uniqueness of the solutions $(x, \hat{x}, \ell)$ on the right maximum time interval $[0, T_f)$ for some $T_f \in (0, +\infty]$. From Lemma 1, since $A - ac^T$ and $A - bk^T$ are Hurwitz matrices, the Lyapunov function is chosen as
$$V = \mu e^T P e + z^T Q z, \tag{22}$$
where $\mu = \frac{\|Pa\|^2 + 3\|Qa\|^2}{\gamma_1^2} + 2$. Then, the derivative of $V$ satisfies
$$\begin{aligned}
\dot{V} \le{} & -\mu\gamma_1\ell\|e\|^2 - \mu\frac{\dot{\ell}}{\ell}e^T(DP + PD + 2\kappa P)e - 4\gamma_1\ell\|z\|^2 - \frac{\dot{\ell}}{\ell}z^T(DQ + QD + 2\kappa Q)z \\
& + 2\mu e^T P H^{-1}\Psi + 2\mu\ell e^T Pa(1 - \varrho_s)(e_1 + z_1) - 2\ell z^T Qa(1 - \varrho_s)z_1 + 2\ell z^T Qa\varrho_s e_1 \\
& + 2\ell z^T Qb\sum_{j=1}^{n}(1 - \lambda_\ell^{\,n+1-j})k_j z_j - 2z^T Qb\sum_{j=1}^{n}\lambda_\ell^{\,n+1-j}\ell^{\,2-j-\kappa}k_j\lambda_j\frac{\sigma_x}{\bar{\ell}^{1+\kappa}}.
\end{aligned} \tag{23}$$
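Then, we first estimate the influences of the terms caused by the event-triggered mechanism. Let $\sigma_\ell \le 1 - \sqrt[n]{1 - 1/(2\|Qb\|\|k\|)}$, and noting the facts that $1 - \sigma_\ell \le \lambda_\ell(t) \le 1$ and $|\lambda_j(t)| \le 1$, it yields that
$$2\ell z^T Qb\sum_{j=1}^{n}(1 - \lambda_\ell^{\,n+1-j})k_j z_j \le \gamma_1\ell\|z\|^2, \tag{24}$$
$$-2z^T Qb\sum_{j=1}^{n}\lambda_\ell^{\,n+1-j}\ell^{\,2-j-\kappa}k_j\lambda_j\frac{\sigma_x}{\bar{\ell}^{1+\kappa}} \le \frac{1}{2}\gamma_1\ell\|z\|^2 + \frac{2n^2k_m^2\|Qb\|^2\sigma_x^2}{\gamma_1(1 - \sigma_\ell)^{2(1+\kappa)}\ell^{2\kappa}}, \tag{25}$$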
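where $k_m = \max\{k_1, \ldots, k_n\}$. Further, we estimate the influences of the other terms. Let the error of sensor gain variation be defined as $\bar{\varrho} = 1 - \varrho_s > 0$, and satisfy
$$\bar{\varrho} \le \bar{\varrho}_m := \min\Big\{\frac{\sqrt{3}}{3\mu}, \frac{\gamma_1}{4\mu\|Pa\|}, \frac{\gamma_1}{6\|Qa\|}, \frac{\sqrt{3}}{3}\Big\}, \tag{26}$$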
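using (17) and Young’s inequality, the following three inequalities can be deduced
$$2\mu\ell e^T Pa\bar{\varrho}(e_1 + z_1) \le \ell\|Pa\|^2\|e\|^2/\gamma_1 + \ell\gamma_1\|z\|^2/3 + \ell\gamma_1\|e\|^2/2, \tag{27}$$
$$-2\ell z^T Qa\bar{\varrho}z_1 \le 2\ell\bar{\varrho}\|Qa\|\|z\|^2 \le \ell\gamma_1\|z\|^2/3, \tag{28}$$
$$2\ell z^T Qa\varrho_s e_1 \le 5\ell\gamma_1\|z\|^2/6 + 3\ell\|Qa\|^2\|e\|^2/\gamma_1. \tag{29}$$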
In view of Lemma 2 in Appendix and the change of coordinates (18), it follows
$$|2\mu e^T P H^{-1}\Psi| \le (1 + |y_s|^p)^2(2\|e\|^2 + \|z\|^2) + (2\theta_1^2\mu^2n^4\|P\|^2 + 1)\|e\|^2 + \mu^2n^2\|P\|^2(\theta_1^2 + 1)Y_{r0}^2/\ell^{2\kappa}. \tag{30}$$
Then, choosing the design parameters β1 and β2 β1 ≤ min
γ1 1 γ1 2 , , β2 ≥ max , , μγ2 P 2γ2 Q μκλmin (P) κλmin (Q)
(31)
and substituting (24), (25), and (27)–(30) into (23) results in
$$\dot{V} \le -\Big[\frac{1}{2}\gamma_1\ell - (2\theta_1^2\mu^2n^4\|P\|^2 + 1)\Big](\|e\|^2 + \|z\|^2) + \frac{\Xi}{\ell^{2\kappa}}, \tag{32}$$
where $\Xi = \frac{2n^2k_m^2\|Qb\|^2\sigma_x^2}{\gamma_1(1 - \sigma_\ell)^{2(1+\kappa)}} + \mu^2n^2\|P\|^2(\theta_1^2 + 1)Y_{r0}^2$ is an unknown positive constant.
With (19), (20), and (32), it can be proved that the variables $e$, $z$, $\ell$ are bounded on the maximum time interval $[0, T_f)$, whose proof involves tedious contradiction arguments and is similar to the proofs of Propositions 1–3 in [15]. Since $e$, $z$, $\ell$ are bounded on the maximum time interval $[0, T_f)$, then $T_f = +\infty$. Hence, from (14) and (18), $x$, $\hat{x}$, $\ell$, $\bar{x}$, $\bar{\ell}$, and $u$ are bounded on $t \in [0, +\infty)$. Then, $\epsilon$ and $\epsilon_r$ are bounded. That is, all the closed-loop signals are bounded.
Note that the tracking performance of the measurable tracking error $\epsilon$ can be obtained based on the property of $\dot{\ell}$, which is the same as the analysis in [15] and is therefore omitted. Consequently, it can be deduced that $\lim_{t\to+\infty}|\epsilon| \le \sqrt{2\Upsilon}$, that is, $\epsilon$ can be made arbitrarily small by decreasing the value of $\Upsilon$. For the real tracking error $\epsilon_r$, from (2) and (6), it yields that $\lim_{t\to+\infty}|\epsilon_r| \le \frac{\sqrt{2\Upsilon}}{\varrho_s} + \sup_{t\ge 0}\Big|\frac{(1 - \varrho_s)y_r - \delta_s}{\varrho_s}\Big|$, that is, $\epsilon_r$ is mainly related to $\varrho_s$ and $\delta_s$.
3.2 Exclusion of Zeno Phenomenon Theorem 2 By applying the scheme in Theorem 1, there exists a constant T0 > 0 such that the inter-event time satisfies t j+1 − t j ≥ T0 , ∀ j ∈ {0, 1, 2, . . .}.
Proof First, to seek the lower bound of $t_{j+1,x}$, let $E_k = \bar{x}_k - \hat{x}_k$, $\forall t \in [t_j, t_{j+1})$; one has $\frac{d}{dt}\sum_{k=1}^{n}|E_k| \le \sum_{k=1}^{n}|\dot{\hat{x}}_k|$. From (7) and noting that all the closed-loop signals are bounded, there exists a positive constant $M_1$ such that $\sum_{k=1}^{n}|\dot{\hat{x}}_k| \le M_1$. Moreover, (11) implies that $\sum_{k=1}^{n}|E_k(t_j)| = 0$ and $\lim_{t\to t_{j+1,x}}\sum_{k=1}^{n}|E_k(t)| = \sigma_x/\bar{\ell}^{1+\kappa}$. Note that $\ell(t)$ is bounded on $t \in [0, +\infty)$ and suppose that $1 \le \ell(t) \le \ell_m$ with $\ell_m$ being an unknown constant; it can be deduced that $t_{j+1,x} - t_j \ge \frac{\sigma_x}{\ell_m^2 M_1} > 0$. Similarly, the lower bound of $t_{j+1,\ell}$ satisfies $t_{j+1,\ell} - t_j \ge \frac{\sigma_\ell}{M_2} > 0$.
Based on the above analysis, $t_{j+1} - t_j \ge T_0 > 0$ with $T_0 = \min\{\frac{\sigma_x}{\ell_m^2 M_1}, \frac{\sigma_\ell}{M_2}\}$. Thus the Zeno phenomenon [16] is avoided.
4 Simulation Example
We demonstrate the proposed scheme with a hydraulic actuation system (HAS), which is widely used in different aircraft subsystems, such as surface control [17], fuel metering unit [18], etc. The dynamics of HAS are described by
$$\begin{aligned}
\dot{x}_1 &= x_2, \\
\dot{x}_2 &= (A_h x_3 - B_h x_2 - A_f\arctan(x_2))/m_h, \\
\dot{x}_3 &= (4\beta_h K_q K_v u - 4\beta_h A_h x_2 - 4\beta_h C_h x_3)/V_h, \\
y &= x_1,
\end{aligned} \tag{33}$$
where $x_1$ and $x_2$ are the displacement and velocity of the piston rod, $x_3$ is the pressure difference between two chambers of the cylinder, and $u$ is the control current. The physical parameters are given as $A_h = 2.526 \times 10^3$ mm$^2$, $B_h = 10$ Ns/mm, $A_f = 70$ Nmm, $m_h = 2.1$ kgm, $\beta_h = 800$ MPa, $K_q = 2.7 \times 10^4$ mm$^2$/s, $K_v = 0.304$ mm/A, $C_h = 1.8 \times 10^{-8}$ mm$^3$/s/MPa, and $V_h = 1.473 \times 10^5$ mm$^3$. In the simulation, the control goal is to make the displacement $x_1$ track a signal $y_r = 10\sin(t)$ mm. The initial states of the plant and the observer are set as $x_1(0) = 6$, $x_2(0) = x_3(0) = 0$, $\hat{x}_1(0) = 4$, and $\hat{x}_2(0) = \hat{x}_3(0) = 0$. The sampling time is $10^{-4}$ s. The design parameters are chosen as $k_1 = 0.6$, $k_2 = 4.7$, $k_3 = 2$, $a_1 = 4$, $a_2 = 6$, $a_3 = 3.55$, $\kappa = 0.2$, $\beta_1 = 4.535$, $\beta_2 = 1$, $\sigma_x = 100$, $\sigma_\ell = 0.3$, and $\Upsilon = 0.1$. Then, it is assumed that the sensor loses 6% of its effectiveness for $t \ge 5$ s and suffers from a periodic bias error $\delta_s = 0.1\sin((t - 3) + \pi/2) - 0.1$ for $t \ge 3$ s, i.e.,
$$y_s = \begin{cases}
y, & t \in [0\ \mathrm{s}, 3\ \mathrm{s}), \\
(0.03\sin(\pi t/2 - \pi) + 0.97)y + \delta_s, & t \in [3\ \mathrm{s}, 5\ \mathrm{s}), \\
0.94y + \delta_s, & t \ge 5\ \mathrm{s}.
\end{cases} \tag{34}$$
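The plant model (33) and the sensor failure profile (34) can be written down directly. The sketch below is only an illustration of these two formulas: the constant and function names are hypothetical, the parameter values are copied numerically from the text without any unit conversion, and no observer or controller is included.

```python
import math

# Physical parameters of the HAS, copied from the text (units as stated there).
A_H, B_H, A_F, M_H = 2.526e3, 10.0, 70.0, 2.1
BETA_H, K_Q, K_V, C_H, V_H = 800.0, 2.7e4, 0.304, 1.8e-8, 1.473e5


def has_rhs(x, u):
    """Right-hand side of (33) for state x = (x1, x2, x3) and control current u."""
    x1, x2, x3 = x
    dx1 = x2
    dx2 = (A_H * x3 - B_H * x2 - A_F * math.atan(x2)) / M_H
    dx3 = (4 * BETA_H * K_Q * K_V * u
           - 4 * BETA_H * A_H * x2
           - 4 * BETA_H * C_H * x3) / V_H
    return (dx1, dx2, dx3)


def sensor_bias(t):
    """Additive bias: zero before 3 s, periodic afterwards."""
    if t < 3.0:
        return 0.0
    return 0.1 * math.sin((t - 3.0) + math.pi / 2.0) - 0.1


def sensor_gain(t):
    """Sensor gain in (34): 1 before 3 s, smooth decay on [3, 5), 0.94 from 5 s on."""
    if t < 3.0:
        return 1.0
    if t < 5.0:
        return 0.03 * math.sin(math.pi * t / 2.0 - math.pi) + 0.97
    return 0.94


def measured_output(y, t):
    """Faulty sensor reading y_s for the true output y at time t."""
    return sensor_gain(t) * y + sensor_bias(t)
```

Evaluating `sensor_gain` at 3 s and 5 s shows that the gain profile in (34) is continuous, moving from 1 to 0.94 over the two-second transition.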
Figures 1, 2, 3, 4, 5, and 6 show that, in this case, the reference signal is tracked very well in terms of both the measurable tracking error and the real tracking error. More specifically, Figs. 1 and 2 show that although the real tracking error is influenced by the sensor failures for t ≥ 3 s, the tracking performance remains good. The dynamic gain, the observer states xˆ1, xˆ2, xˆ3, and their triggering values are shown in Figs. 3 and 4. Figure 5 shows the event-triggered control input u. Figure 6 shows all the triggering instants and triggering types; the total number of triggering events is 1673, and the Zeno phenomenon is avoided since the minimum inter-event interval is 0.0007 s.
Fig. 1 Tracking performances (vertical axis in mm, horizontal axis: Time (s))
Fig. 2 Tracking errors (horizontal axis: Time (s))
Fig. 3 Dynamic gain (horizontal axis: Time (s), with a zoomed inset)
Fig. 4 Observer states (horizontal axis: Time (s), with a zoomed inset)
Fig. 5 Control input (vertical axis in mA, horizontal axis: Time (s), with a zoomed inset)
Fig. 6 Triggering instants and types (horizontal axis: Time (s), with a zoomed inset)
5 Conclusion In this paper, an adaptive event-triggered output feedback fault tolerant tracking scheme has been introduced. It has been proved that the state observer, two triggering conditions, and the event-triggered controller can ensure that the real tracking error converges to a small residual set that is mainly related to sensor gain variation and additive error. Acknowledgements This work was supported by the National Natural Science Foundation of China (NSFC) under Grant 62073197, Grant 61933006, and the Special Funding for Top Talents of Shandong Province.
6 Appendix
Lemma 2 establishes the relationship of $\psi_i(\cdot)$ between $y_s$ and $y$.
Lemma 2 Let the inequalities (3) hold and the signal $y_{r0}$ be defined by (17). Then, there exists a positive parameter $\theta_1$, such that the functions $\psi_i(\cdot)$ satisfy
$$|\psi_i(\cdot)| \le \theta_1(1 + |y_s|^p)(|x_1| + \cdots + |x_i|) + Y_{r0}, \quad i = 1, \ldots, n, \tag{35}$$
where $Y_{r0} = \sup_{t\ge 0}(|y_{r0}| + |\dot{y}_{r0}|)$ is an unknown positive constant.
Proof By Assumption 1 and (21), $|\psi_i(\cdot)| \le \theta(1 + |y|^p)(|x_1| + \cdots + |x_i|) + |\dot{y}_{r0}|$, $i = 1, \ldots, n$. From (2), $y = \frac{y_s - \delta_s}{\varrho_s}$, with which one has
$$|\psi_i(\cdot)| \le \theta\Big(1 + \frac{1}{\varrho_s^p}(|y_s| + |\delta_s|)^p\Big)(|x_1| + \cdots + |x_i|) + Y_{r0}, \quad i = 1, \ldots, n. \tag{36}$$
By using the inequality $(s_1 + s_2)^p \le 2^{p-1}(s_1^p + s_2^p)$, with $s_1 = |y_s|$, $s_2 = |\delta_s|$, and noting that $\varrho_s$, $\delta_s$ are bounded, a positive constant $\theta_1$ exists such that (35) holds.
References
1. Gupta, R.A., Chow, M.Y.: Overview of networked control systems. In: Networked Control Systems. Springer, London (2008)
2. Heemels, W., Johansson, K.H., Tabuada, P.: An introduction to event-triggered and self-triggered control. In: 2012 IEEE 51st IEEE Conference on Decision and Control (CDC). IEEE (2013)
3. Postoyan, R., Tabuada, P., Nesic, D., et al.: A framework for the event-triggered stabilization of nonlinear systems. IEEE Trans. Autom. Control 60(4), 982–996 (2015)
4. Xing, L.T., Wen, C.Y., Liu, Z., Su, H.Y., Cai, J.P.: Event-triggered adaptive control for a class of uncertain nonlinear systems. IEEE Trans. Autom. Control 62(4), 2071–2076 (2017)
5. Huang, Y.X., Liu, G.Y.: Practical tracking via adaptive event-triggered feedback for uncertain nonlinear systems. IEEE Trans. Autom. Control 64(9), 3920–3927 (2019)
6. Xing, L.T., Wen, C.Y., Liu, Z., et al.: Event-triggered output feedback control for a class of uncertain nonlinear systems. IEEE Trans. Autom. Control 64(1), 290–297 (2018)
7. Zhang, C.H., Yang, G.H.: Event-triggered adaptive output feedback control for a class of uncertain nonlinear systems with actuator failures. IEEE Trans. Cybern. 50(1), 201–210 (2020)
8. Zhang, Z.R., Wen, C.Y., Xing, L.T., Song, Y.D.: Adaptive event-triggered control of uncertain nonlinear systems using intermittent output only. IEEE Trans. Autom. Control 67(8), 4218–4225 (2022)
9. Li, F.Z., Liu, Y.G.: Adaptive event-triggered output-feedback controller for uncertain nonlinear systems. Automatica 117, 109006 (2020)
10. Li, F.Z., Liu, Y.G.: Global adaptive stabilization via asynchronous event-triggered output-feedback. Automatica 139, 110181 (2022)
11. Guo, T.T., Liu, Y.G.: Adaptive event-triggered output-feedback control against unknown control directions and unknown intrinsic growth. IEEE Trans. Cybern. (2022). https://doi.org/10.1109/TCYB.2022.3182137
12. Wang, C.L., Wen, C.Y., Hu, Q.L.: Event-triggered adaptive control for a class of nonlinear systems with unknown control direction and sensor faults. IEEE Trans. Autom. Control 65(2), 763–770 (2020)
13. Li, M., Li, S., Shu, F., Xiang, Z.: Adaptive event-triggered output feedback control for a class of p-normal nonlinear systems with sensor failure. Int. J. Robust Nonlinear Control 30, 6627–6644 (2021)
14. Praly, L., Jiang, Z.P.: Linear output feedback with dynamic high gain for nonlinear systems. Syst. Control Lett. 53(2), 107–116 (2004)
15. Zhang, X., Lin, Y.: Robust adaptive tracking of uncertain nonlinear systems by output feedback. Int. J. Robust Nonlinear Control 26(10), 2187–2200 (2016)
16. Johansson, K.H., Egerstedt, M., Lygeros, J., Sastry, S.: On the regularization of Zeno hybrid automata. Syst. Control Lett. 38(3), 141–150 (1999)
17. Goupil, P.: AIRBUS state of the art and practices on FDI and FTC in flight control system. Control Eng. Pract. 19(6), 524–539 (2011)
18. Zhang, Y., Wang, X.J., Wang, S.P., Puig, V.: Evaluation of thermal effects on temperature-sensitive operating force of flow servo valve for fuel metering unit. Chin. J. Aeronaut. 33(6), 1812–1823 (2020)
CARDNet: A Denoiser Based on Contrast-Aware and Residual-Dense Block Qiang Cai, Ying Cao, Chen Wang, Haisheng Li, and Mengxu Ma
Abstract In recent years, deep convolutional neural networks have shown good performance on images with spatially invariant noise, but their performance is limited on real-world noisy images. To improve the practicality of denoising algorithms, this paper proposes a simple one-stage blind real-image denoising network with a modular structure. It combines the local modeling capability of residual dense convolutional layers with the global modeling capability of spatial and channel attention blocks, and inserts them as the main building blocks into the widely used encoder–decoder architecture to achieve end-to-end denoising. To preserve image structure information as much as possible during denoising, residual connections of different lengths are used between modules to ease the flow of low-frequency information, and contrast-aware channel attention is applied to strengthen the dependencies among channel activations. Furthermore, the proposed algorithm is compared with state-of-the-art algorithms on different noise datasets using quantitative metrics and visual quality, and the experimental results demonstrate its superiority. Keywords Image denoising · Encoder–decoder · Residual dense convolution · Contrast-aware attention
Supported by the National Natural Science Foundation of China (62277001), National Natural Science Foundation of China (62201018), R&D Program of Beijing Municipal Education Commission (KM202310011013) and Beijing Natural Science Foundation (4222003). Q. Cai · Y. Cao (B) · C. Wang · H. Li · M. Ma School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China e-mail: [email protected] Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing 100048, China National Engineering Laboratory for Agri-Product Quality Traceability, Beijing 100048, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_69
1 Introduction Noise is inevitably introduced during image acquisition and transmission, which degrades the image signal and in turn affects subsequent image processing tasks. Restoring clean images from noisy ones is therefore an important research topic in computer vision. With the rise of deep learning, deep blind denoising networks are usually trained to learn the data distribution of paired noisy-clean images and perform denoising through maximum a posteriori inference. Although deep convolutional neural networks have achieved great success in image denoising, the following shortcomings remain: (1) Some networks cannot fuse shallow and deep features, leading to problems such as vanishing gradients or feature loss during denoising. (2) Many networks treat all channels and spatial positions of the feature maps equally, which limits their practical application. (3) The attention mechanisms used by many networks rely only on maximum or average activations, ignoring the position sensitivity of image texture details in denoising tasks. To address these problems without increasing the complexity between model blocks, this paper combines the global modeling ability of the double attention block with the local modeling ability of the residual dense block, and inserts the combination as the main building block into the encoder–decoder architecture. While encoding contextual information step by step, the network adaptively separates degraded high-frequency noise from high-frequency texture details, and uses residual dense blocks to gradually extract hierarchical features, thereby maximizing the reuse of shallow features and minimizing the loss of precise spatial details. In addition, the expressive ability of the model is evaluated in terms of PSNR on datasets of real noisy images and synthetic Gaussian noisy images.
2 Related Works Because the encoder–decoder framework supports end-to-end learning, it is well suited to image denoising. Mao et al. first proposed REDNet [1], a symmetric convolution-deconvolution structure with skip connections. To better balance denoising quality and speed, Zhang et al. introduced IRCNN [2], a denoising network based on image priors that enlarges the receptive field. Tai et al. proposed MemNet [3], a very deep persistent memory network with dense connections. Meanwhile, MWCNN [4], an encoder–decoder network guided by a multi-level wavelet transform, was proposed. Encoder–decoder models are still widely studied in image denoising [5–8]. In most denoising tasks, researchers use residual blocks [9] or dense blocks [10] to capture local image features. To capture more hierarchical detail, RDN [11] uses both residual blocks and connections between adjacent channel features to fuse shallow features with deep features.
For mathematical convenience, many denoising methods assume that the noise is uniform white Gaussian noise. In natural image denoising, however, the correlation among feature channels has a strong influence on the denoising process. Anwar et al. [12] therefore applied the channel attention mechanism [13] to image denoising for the first time and improved the denoising performance on real noisy images, which verified the effectiveness of attention mechanisms in this task. Other attention-based methods were subsequently proposed for image denoising [14], and attention-based image processing methods are now widely used in denoising tasks. To maximize the expressive power of the model, the denoising network proposed in this paper integrates the channel attention mechanism, the spatial attention mechanism, and residual dense blocks into an encoder–decoder network, combining the advantages of these modules for image denoising.
3 Method In this section, we propose an encoder–decoder denoising network based on contrast-aware attention and residual dense blocks, namely CARDNet. We first introduce the overall network architecture, and then describe the details of each module and the choice of loss function.
3.1 Overall Network Architecture This paper proposes an attention-based denoising network for real images. The overall architecture, shown in Fig. 1, is a single-level UNet with skip connections. At each scale, a residual connection links the downsampling operation (a 2 × 2 convolution with stride 2) and the pixel-shuffle upsampling operation (upsampling factor 2). Each scale uses Attention-RDN blocks. As shown in the dashed box in Fig. 1, the Attention-RDN block fuses the multiple attention block and the residual dense block through two 1×1 convolutions, split and concatenation operations, and a residual connection. At the smallest scale of the encoder–decoder, to help the decoder reconstruct a more complete image without further reducing the spatial resolution, features are extracted by dilated convolutions [8, 15] with different dilation rates, which maximizes the receptive field of feature extraction and enhances the backward propagation of global information.
Fig. 1 The network architecture proposed in this paper. The dashed boxes show the composition of the attention-dense connectivity blocks and the dilated block. Each scale of the encoder–decoder contains two attention-dense blocks, and the dilated block is composed of dilated convolutions with multiple dilation rates
3.2 Multiple Attention Blocks In low-level vision tasks, capturing global information with average pooling does improve PSNR, but it loses detail, texture, and edge information. For this reason, IMDN [16] designs a contrast-aware channel attention module specifically for extracting the correlation of spatial features. Inspired by CBAM [17] and IMDN [16], this paper combines contrast-aware channel attention with a spatial attention mechanism to better preserve image details, textures, and edges during denoising. Contrast-aware attention replaces global average pooling with the sum of the standard deviation and the mean, which measures the contrast of a feature map, i.e., the dispersion of its values; the greater the dispersion, the more pronounced the texture features. For an input of spatial size H×W with C channels, let $\mu_c$ and $\sigma_c$ be the mean and standard deviation of the c-th feature map. The corresponding contrast information is
$$CF_c = \mu_c + \sigma_c \tag{1}$$
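As a small illustration of (1), the sketch below computes the contrast information of a single channel given as a nested list of values. It assumes the population standard deviation (division by H×W), which the text does not state explicitly, and the function name `contrast_information` is hypothetical.

```python
import math


def contrast_information(feature_map):
    """Return mean + standard deviation of one H x W feature map (list of rows)."""
    values = [v for row in feature_map for v in row]
    count = len(values)
    mean = sum(values) / count
    # Population variance over all H*W positions (an assumption of this sketch).
    variance = sum((v - mean) ** 2 for v in values) / count
    return mean + math.sqrt(variance)
```

A constant feature map has zero standard deviation, so its contrast information reduces to the mean, which is what global average pooling alone would return; a textured map with the same mean scores higher.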
Here, $CF_c$ is the contrast information of the c-th channel. After the contrast information of each channel feature map is computed, the channels are squeezed and excited by two 1×1 convolutions. A sigmoid activation then yields the weight of each channel, which is multiplied with the original features to obtain the filtered channel information. If the input feature map is X, the contrast-aware channel attention block is
$$CAttention(X) = \sigma(Conv\_1(\delta(Conv\_1(X)))) \tag{2}$$
where $\delta$ and $\sigma$ denote the ReLU and sigmoid activations, respectively.
Fig. 2 The composition of the multi-attention blocks includes contrast-aware channel attention and spatial pixel attention
The spatial attention mechanism adopts a structure similar to SAM [18]. For an input feature map X, channel-wise global max pooling and global average pooling first produce two H×W×1 feature maps, which are concatenated along the channel dimension and reduced to one channel by a 5×5 convolution. The spatial attention map is then generated by a sigmoid activation and multiplied with the input feature map of the module to obtain the final features. The process is described as follows:
$$SAttention(X) = \sigma(Conv\_5([Avg(X); Max(X)])) \tag{3}$$
Fig. 3 From left to right are the configurations of residual block (a), dense block (b), and residual dense block (c), where the residual dense block consists of both residual and dense connections
If the feature map fed into the multiple attention block is Y, the feature map fed into the channel and spatial attention blocks is
$$X = Conv\_3(\delta(Conv\_3(Y))) \tag{4}$$
As shown in Fig. 2, the final process of the multiple attention block is described as follows:
$$MAttention(X) = Conv\_1(Concat(CAttention(X), SAttention(X))) + Y \tag{5}$$
3.3 Residual Dense Block In most denoising tasks, researchers use residual blocks [9] to capture local image features, as shown in Fig. 3a. DenseNet [10] instead uses dense connection blocks, which reuse shallow features by connecting adjacent feature channels, as shown in Fig. 3b. To combine the advantages of both, RDN [11] uses Residual Dense Blocks (RDB) for image denoising, as shown in Fig. 3c. However, using the RDB at a single scale cannot capture richer semantic information. This paper therefore uses the RDB as the local feature extraction part of the encoder–decoder backbone. Note that the RDB in this paper uses the LeakyReLU activation function. Real noise is random and spatially varying, so many negative values are generated during denoising; LeakyReLU avoids the problem that neurons stop learning once the input of ReLU falls in the negative interval.
3.4 Loss Function In image denoising, the L1 or L2 loss is usually used to supervise training so that the model generates denoised images of the highest possible quality. Because both the L1 loss and the L2 loss can produce over-smoothed images, this paper uses the Charbonnier loss [19]:
$$\text{Charbonnier loss} = \frac{1}{HWC}\sum_{i=1}^{H}\sum_{j=1}^{W}\sum_{k=1}^{C}\sqrt{(y_{i,j,k} - x_{i,j,k})^2 + \varepsilon^2} \tag{6}$$
The constant ε keeps the gradient of the Charbonnier loss well defined for errors close to zero, and the square root keeps the gradient from growing too large for errors far from zero, which avoids exploding gradients.
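The Charbonnier loss in (6) can be sketched in a few lines. This version works on flattened lists of pixel values rather than H×W×C tensors, and the default `eps=1e-3` is an assumption of the sketch, since the paper does not report the value of ε it uses.

```python
import math


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean of sqrt((y - x)^2 + eps^2) over all elements of two flat sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and of equal length")
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)
```

For a perfect prediction the loss equals `eps` rather than zero, and for large errors it behaves like the L1 loss, which is the trade-off the paragraph above describes.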
4 Experiment In this section, we present some empirical performances of CARDNet on real and synthesized image denoising tasks and module visualization results.
4.1 Metrics To verify the experimental results, this paper mainly adopts objective quality evaluation, using PSNR [20] and SSIM [21]. The results are analyzed quantitatively by comparing each denoised image with the original clean image.
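For reference, PSNR is computed from the mean squared error between the denoised and clean images as 10·log10(peak²/MSE). The sketch below uses flattened pixel lists and assumes 8-bit images with a peak value of 255; the function name `psnr` is hypothetical and this is not the evaluation code used in the paper.

```python
import math


def psnr(denoised, clean, peak=255.0):
    """Peak signal-to-noise ratio in dB between two flat pixel sequences."""
    if len(denoised) != len(clean) or not clean:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((d - c) ** 2 for d, c in zip(denoised, clean)) / len(clean)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse)
```

Higher values indicate a denoised image closer to the clean reference; SSIM additionally compares local structure and is not covered by this sketch.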
4.2 Experimental Platform and Dataset For synthetic noise denoising, 800 images from the DIV2K [22] dataset are used, and paired training images are formed by adding white Gaussian noise. For real noise denoising, 320 image pairs from the SIDD [23] dataset are used for training. Before training, each image in the training and test sets is cropped into 128×128 patches, and data augmentation such as rotation and flipping is applied. The Adam optimizer [24] is used to optimize the model parameters. The experiments are implemented in the PyTorch framework on an NVIDIA RTX 3080Ti graphics card and an Intel i5 processor. To test the model, the Kodak24 (http://r0k.us/graphics/kodak/) and CBSD68 [22] datasets are used for synthetic noisy images, and the SIDD [23] and DND [25] datasets are used for real noisy images.
Table 1 Real image denoising on SIDD and DND datasets
Methods         Blind/non-blind   Multi/single scale   SIDD PSNR   SIDD SSIM   DND PSNR   DND SSIM
CBM3D [26]      –                 Single scale         25.65       0.685       34.51      0.582
CDnCNN [27]     Blind             Single scale         23.66       0.583       32.43      0.790
TWSC [28]       –                 Single scale         26.16       0.483       37.94      0.940
FFDNet [29]     Non-blind         Single scale         29.20       0.594       37.61      0.941
CBDNet* [5]     Blind             Multi-scale          30.78       0.754       38.06      0.942
RIDNet* [12]    Blind             Single scale         38.71       0.914       39.25      0.952
AINDNet* [6]    Blind             Multi-scale          38.95       0.952       39.45      0.950
DeamNet* [30]   Blind             Multi-scale          38.19       0.908       39.63      0.950
DUBD [31]       Blind             Single scale         39.27       –           39.44      0.953
InvDN [32]      Blind             Single scale         39.28       0.955       39.56      0.952
VDN [33]        Blind             Multi-scale          39.28       0.909       39.29      0.949
SADNet* [8]     Blind             Multi-scale          39.46       –           39.59      0.952
Ours            Blind             Multi-scale          39.55       0.956       39.62      0.952
* denotes methods using additional training data. The method proposed in this paper is trained only on the SIDD images and directly tested on DND
4.3 Experimental Comparison of Real Image Denoising Performance For real image denoising, the proposed algorithm is evaluated from both subjective and objective perspectives on the SIDD and DND datasets. Table 1 compares the results with non-blind and blind denoising methods. On SIDD, the proposed method achieves the highest PSNR and SSIM among the compared traditional and deep learning denoising methods; on DND, its PSNR and SSIM are within 0.01 dB and 0.001 of the best reported values. Among the compared methods, CBDNet [5], AINDNet [6], RIDNet [12], and SADNet [8] use combinations of different datasets to enlarge the training data and improve generalization, whereas the proposed model is trained only on the SIDD dataset, which illustrates its stronger expressiveness. A visual comparison on the SIDD test set is shown in Fig. 4. The images denoised by CBDNet [5] and RIDNet [12] still contain noise and exhibit many angular artifacts or speckled textures that do not belong to the structure of the image itself, which greatly degrades the visual effect and the perceptual quality. The image denoised by VDN [33] has no residual noise, but its structure is over-smoothed, which leads to inconspicuous edges and a lack of clear texture. In contrast, the proposed method removes real noise effectively, and the resulting image is visually more pleasing. On the DND dataset, as shown in the enlarged red boxes in Fig. 5, the images denoised by other methods contain residual noise and corrupted edges, while the proposed method removes noise from smooth areas and keeps texture edges clear, which indicates that it is effective in practice on the DND dataset.
Fig. 4 A real noisy example from SIDD dataset for comparison of the method proposed in this paper against the state-of-the-art algorithms
Fig. 5 A real noisy example from DND dataset for comparison of the method proposed in this paper against the state-of-the-art algorithms
4.4 Experimental Comparison of Synthesized Image Denoising Performance For synthetic image denoising, the model is evaluated on the Kodak24 and CBSD68 datasets. Table 2 shows that the proposed method achieves the best result among the compared traditional and deep learning denoising methods at every noise level on both datasets. The visual results on the CBSD68 and Kodak24 datasets at Gaussian noise level 50 are shown in Figs. 6 and 7. As the feathers of the bird and the little girl’s dress show, the other denoising algorithms have difficulty separating fine texture details from heavy noise. The compared methods tend to focus on removing the heavy noise and neglect visually important features such as texture details, which leads to over-smoothed denoised images. In contrast, the proposed algorithm recovers rich texture details and edges from noisy images without introducing extra artifacts.
Table 2 Synthetic image denoising performance comparison on CBSD68 and Kodak24 datasets
                                  CBSD68                     Kodak24
Methods        Blind/non-blind    σ = 30   σ = 50   σ = 70   σ = 30   σ = 50   σ = 70
CBM3D [26]     Blind              29.73    27.38    26.00    30.89    28.63    27.27
TNRD [34]      –                  27.64    25.96    23.83    28.83    27.17    24.94
REDNet [1]     Blind              28.46    26.35    25.09    29.71    27.62    26.36
CDnCNN [27]    Blind              30.04    28.01    26.56    31.39    29.16    27.64
MemNet [3]     Blind              28.39    26.33    25.08    29.67    27.65    26.40
IRCNN [2]      Blind              30.22    27.86    –        31.24    28.93    –
FFDNet [29]    Non-blind          30.31    27.96    26.53    31.39    29.10    27.68
ADNet [14]     Blind              –        28.04    –        –        29.10    –
RIDNet [12]    Blind              30.47    28.12    26.69    31.64    29.25    27.94
Ours           Blind              30.62    28.25    26.81    31.76    29.44    27.96
Fig. 6 A noisy example named ‘bird’ with noise level σ = 50 from the CBSD68 dataset, comparing the method proposed in this paper with the state-of-the-art algorithms
Fig. 7 A noisy example named ‘girl’ with noise level σ = 50 from the Kodak24 dataset, comparing the method proposed in this paper with the state-of-the-art algorithms
Fig. 8 Visualization of feature heat maps: a the output of the first convolutional layer; b the output of the first-scale Attention-RDN block
4.5 Visualization of Features and Performance This paper visualizes the feature heat maps of the trained model. Figure 8a shows the feature map obtained by feeding the noisy image into a 3×3 convolution; it can be seen that the shallow convolution layer extracts rich texture detail features. The convolved features are then passed through the attention dense block. As Fig. 8b shows, different channels and spatial positions are activated to different degrees, which captures more global semantic information. In addition, half of the channels focus on extracting local features with rich texture detail. This indicates that aggregating multiple attention blocks and residual dense blocks is effective.
5 Conclusion This paper combines the local modeling capability of residual dense blocks with the global modeling capability of contrast-aware attention blocks and inserts them as the main building blocks into an encoder–decoder architecture to achieve end-to-end denoising. Meanwhile, residual connections of different lengths between modules are used to ease the flow of low-frequency information. At the smallest scale, multiple dilated convolutions are used to enhance the contextual relationships of the image. The proposed algorithm maintains the multi-level feature information of the image and improves the noise reduction effect. Comparisons with state-of-the-art algorithms on different noise datasets demonstrate the effectiveness of the proposed structural design for Gaussian and real image denoising.
References
1. Mao, X.-J., Shen, C., Yang, Y.-B.: Image restoration using convolutional auto-encoders with symmetric skip connections. arXiv preprint arXiv:1606.08921 (2016)
2. Zhang, K., Zuo, W., Gu, S., Zhang, L.: Learning deep CNN denoiser prior for image restoration. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3929–3938 (2017)
3. Tai, Y., Yang, J., Liu, X., Xu, C.: Memnet: a persistent memory network for image restoration. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 4539–4547 (2017)
4. Liu, P., Zhang, H., Zhang, K., Lin, L., Zuo, W.: Multi-level wavelet-CNN for image restoration. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2018)
5. Shi, G., Yan, Z., Kai, Z., Zuo, W., Lei, Z.: Toward convolutional blind denoising of real photographs (2018)
6. Kim, Y., Soh, J.W., Gu, Y.P., Cho, N.I.: Transfer learning from synthetic to real-noise denoising with adaptive instance normalization. In: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)
7. Zhang, K., Li, Y., Zuo, W., Zhang, L., Gool, L.V., Timofte, R.: Plug-and-play image restoration with deep denoiser prior. IEEE Trans. Pattern Anal. Mach. Intell. 1 (2021)
8. Chang, M., Li, Q., Feng, H., Xu, Z.-h.: Spatial-adaptive network for single image denoising. arXiv preprint arXiv:2001.10291 (2020)
9. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition (2016)
10. Huang, G., Liu, Z., van der Maaten, L., Weinberger, K.Q.: Densely connected convolutional networks. IEEE Computer Society (2016)
11. Zhang, Y., Tian, Y., Kong, Y., Zhong, B., Fu, Y.: Residual dense network for image restoration (2020)
12. Anwar, S., Barnes, N.: Real image denoising with feature attention. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
13. Jie, H., Li, S., Gang, S., Albanie, S.: Squeeze-and-Excitation Networks. IEEE (2017)
14. Tian, C., Xu, Y., Li, Z., Zuo, W., Liu, H.: Attention-guided CNN for image denoising. Neural Networks 124, 117–129 (2020)
15. Chen, L.C., Papandreou, G., Schroff, F., Adam, H.: Rethinking Atrous convolution for semantic image segmentation (2017)
16. Hui, Z., Gao, X., Yang, Y., Wang, X.: Lightweight image super-resolution with information multi-distillation network. ACM (2019)
17. Woo, S., Park, J., Lee, J.-Y., Kweon, I.S.: Cbam: convolutional block attention module. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 3–19 (2018)
18. Zhu, X., Cheng, D., Zhang, Z., Lin, S., Dai, J.: An empirical study of spatial attention mechanisms in deep networks. In: 2019 IEEE/CVF International Conference on Computer Vision (ICCV) (2020)
19. Zamir, S.W., Arora, A., Khan, S., Hayat, M., Khan, F.S., Yang, M.-H., Shao, L.: Multi-stage progressive image restoration. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 14821–14831 (2021)
20. Moorthy, A.K., Bovik, A.C.: Blind image quality assessment: from natural scene statistics to perceptual quality 20(12), 3350–3364 (2011)
21. Zhou, W., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P.: Image quality assessment: from error visibility to structural similarity. IEEE Trans. Image Process. 13(4) (2004)
22. Martin, D., Fowlkes, C., Tal, D., Malik, J.: A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In: IEEE International Conference on Computer Vision (2002)
23. Abdelhamed, A., Lin, S., Brown, M.S.: A high-quality denoising dataset for smartphone cameras. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (2018)
24. Kingma, D., Ba, J.: Adam: a method for stochastic optimization. In: Computer Science (2014)
25. Plotz, T., Roth, S.: Benchmarking denoising algorithms with real photographs. In: Computer Vision Pattern Recognition (2017)
26. Yang, D., Sun, J.: BM3D-Net: a convolutional neural network for transform-domain collaborative filtering. IEEE Signal Process. Lett. 25(1), 55–59 (2017)
27. Zhang, K., Zuo, W., Chen, Y., Meng, D., Zhang, L.: Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 26(7), 3142–3155 (2017)
28. Xu, J., Zhang, L., Zhang, D.: A trilateral weighted sparse coding scheme for real-world image denoising. In: Proceedings of the European Conference on Computer Vision (ECCV), pp. 20–36 (2018)
29. Zhang, K., Zuo, W., Zhang, L.: FFDNet: toward a fast and flexible solution for CNN based image denoising. IEEE Trans. Image Process. 1–1 (2017)
30. Ren, C., He, X., Wang, C., Zhao, Z.: Adaptive consistency prior based deep network for image denoising. In: Computer Vision and Pattern Recognition (2021)
31. Soh, J.W., Cho, N.I.: Deep universal blind image denoising (2021)
32. Liu, Y., Qin, Z., Anwar, S., Ji, P., Kim, D., Caldwell, S., Gedeon, T.: Invertible denoising network: a light solution for real noise removal (2021)
33. Yue, Z., Yong, H., Zhao, Q., Zhang, L., Meng, D.: Variational denoising network: toward blind noise modeling and removal. arXiv preprint arXiv:1908.11314 (2019)
34. Chen, Y., Pock, T.: Trainable nonlinear reaction diffusion: a flexible framework for fast and effective image restoration. IEEE Trans. Pattern Anal. Mach. Intell. (2017)
An Automatic and Efficient Calibration Method for LiDAR-Camera in Targetless Environments Fengli Yang, Juzhi Zhu, and Long Zhao
Abstract Accurate calibration of LiDAR and camera in targetless environments is a crucial task in various applications. This paper proposes an automatic and efficient calibration method for LiDAR and camera in such environments. First, the collected LiDAR point cloud and image data are preprocessed. Considering the richness of edges in natural environments, we use edge features to establish the 2D-3D correspondence between LiDAR and camera data, and employ RANSAC to obtain a rough estimate of the LiDAR-camera transformation. Given this initial guess, we further optimize the transformation estimate based on the normalized information distance (a cross-modal distance measure based on mutual information). Experimental evaluations confirm the accuracy and efficiency of the proposed method in targetless environments. Keywords LiDAR-camera calibration · Edge features · Extrinsic parameters
F. Yang · J. Zhu · L. Zhao (B) Digital Navigation Center, School of Automatic Science and Electrical Engineering, Beihang University, Beijing 100191, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_70

1 Introduction Multi-sensor calibration is a basic and key step of multi-sensor fusion navigation. In LiDAR/vision fusion navigation, the data acquired by the camera and the LiDAR must be associated and matched to establish the correspondence between the sensors [1]. Accurate extrinsic calibration provides accurate correspondence between sensors, helps achieve consistent data matching and association, and improves the performance of fusion navigation [2]. Targetless LiDAR-camera calibration methods [3–5] can adapt to different environments and scenes without specific calibration objects prepared in advance. This is especially important for mobile robots or systems that operate in different environments, because they can be calibrated in real time, whether indoors or outdoors, in structured or unstructured environments [6]. According to their information sources, targetless LiDAR-camera calibration methods are divided into three categories: information-theoretic methods [7], feature-based methods [8], and ego-motion-based methods [9]. Information-theoretic methods estimate the LiDAR-camera transformation by maximizing the similarity between the LiDAR and camera data, where the similarity is quantified by various information measures. The Normalised Information Distance (NID) is a similarity metric that can be used to match the modalities of different sensors. Koide et al. [10] leveraged the NID metric to estimate optimal extrinsic parameters between a 3D LiDAR and a camera. However, these methods are sensitive to noise. Unlike information-theoretic methods, feature-based targetless LiDAR-camera calibration methods directly extract features from point clouds and camera images and match them without comparing their statistical similarity. For example, Zhang et al. [11] proposed a calibration method using the edge information in the image and the depth edge information in the LiDAR data. That work argues that depth edges often correspond to image edges. However, these methods require a sufficient number of properly distributed edge features, which places certain requirements on the calibration scene. Ego-motion-based methods use the motion of the sensors mounted on the carrier to estimate the extrinsic parameters [12]. However, errors in the motion estimation propagate to the calibration and lead to inaccurate results [13]. In this paper, we combine the advantages of information-based and feature-based LiDAR-camera calibration methods. We study the geometric information in 3D space and the multi-modal information shared by point cloud data and image data, and propose a high-precision LiDAR-camera calibration algorithm for targetless environments. Specifically, we extract the edge features of the image and the LiDAR data, respectively, and then use the Point to Line Distance (PLD) [14] to estimate the initial values of the extrinsic parameters. Furthermore, we use the NID metric to refine the coarse guess, which can deal with the problem of sparse edge features. The remainder of this paper is organized as follows. Section 2 briefly introduces the problem of LiDAR-camera calibration. Section 3 gives a detailed description of our proposed algorithm. Section 4 presents the results obtained from real experiments. Finally, Sect. 5 summarizes the conclusions of the study.
2 Problem Statement As shown in Fig. 1, in LiDAR-camera calibration the transformation between the LiDAR and camera coordinate systems is determined by the extrinsic parameters $R_L^C$ and $t_L^C$, which represent the rotation matrix and translation vector from the LiDAR coordinate system $O_L - X_L Y_L Z_L$ to the camera coordinate system $O_C - X_C Y_C Z_C$, respectively. Given a cluster of image points $I = \{x_i^1, \cdots, x_i^K\}$ and a 3D LiDAR point cloud $P = \{p_L^1, \cdots, p_L^N\}$, the purpose of joint calibration is to calculate the extrinsic parameters $R_L^C$ and $t_L^C$.

Fig. 1 Schematic diagram of LiDAR-camera calibration

Generally, the camera is described by the classic pinhole imaging model. Thus, the 3D LiDAR point $p_L^j = [X_L^j \; Y_L^j \; Z_L^j]^T$ and the image point $x_i^j = [u_j \; v_j]^T$ have the following relationship [15]:

$$Z_L^j \begin{bmatrix} u_j \\ v_j \\ 1 \end{bmatrix} = K_C^i T_L^C \begin{bmatrix} X_L^j \\ Y_L^j \\ Z_L^j \\ 1 \end{bmatrix} \qquad (1)$$

where $T_L^C = \begin{bmatrix} R_L^C & t_L^C \\ 0_{3\times 1}^T & 1 \end{bmatrix}$, and $K_C^i$ is the intrinsic matrix, which can be calculated in advance.
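The projection in Eq. (1) can be illustrated with a minimal sketch in plain Python. The function names, the identity extrinsics, and the intrinsic values below are hypothetical and chosen only for illustration. The sketch also assumes the usual pinhole convention of dividing by the depth of the point in the camera frame.

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]


def project_lidar_point(p_lidar, rotation, translation, intrinsics):
    """Project one LiDAR point to pixel coordinates (u, v) and camera depth.

    Returns None when the point lies behind the camera.
    """
    # Rigid transform into the camera frame: p_cam = R * p_lidar + t.
    rotated = mat_vec(rotation, p_lidar)
    p_cam = [rotated[r] + translation[r] for r in range(3)]
    depth = p_cam[2]
    if depth <= 0.0:
        return None
    # Pinhole projection: apply K, then divide by the camera-frame depth.
    homogeneous = mat_vec(intrinsics, p_cam)
    return homogeneous[0] / depth, homogeneous[1] / depth, depth


# Hypothetical values chosen only for illustration.
R_identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_zero = [0.0, 0.0, 0.0]
K_example = [[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]]
pixel = project_lidar_point([1.0, 0.5, 5.0], R_identity, t_zero, K_example)
```

Returning the depth alongside the pixel coordinates is a design choice that makes later visibility checks (keeping the closest point per pixel) straightforward.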
3 Methodology 3.1 Overview The proposed method calibrates a LiDAR-camera pair in targetless scenes. The schematic diagram of the system is shown in Fig. 2. The system takes a 3D LiDAR point cloud and image data as input and outputs the extrinsic parameters $R_L^C$ and $t_L^C$. The overall system consists of three modules. First, the image and LiDAR data are preprocessed. Then, using the edge features of the images and point clouds, an initial estimate of the extrinsic parameters is obtained through RANSAC and minimization of the PLD. Finally, the projected LiDAR intensity map is compared with the image, and the estimate is refined using the NID.
Fig. 2 The process for targetless LiDAR-camera calibration
3.2 Data Preprocessing Generally, it is challenging to extract useful texture and geometric information from sparse point clouds. However, in the case of non-repetitive scanning solid-state LiDARs (such as Livox Avia), the number of points in the point cloud accumulates over time. Therefore, by leaving the LiDAR stationary for a period of time and accumulating multiple frames of sparse point clouds, a dense point cloud dataset can be obtained. Furthermore, to better compare the information of different types of data, histogram equalization is applied to filter the dense point cloud and the image.
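The histogram equalization step can be sketched as follows, assuming the intensities have already been quantized to integer levels (for example 8-bit values). The function name and the flat-list input are illustrative assumptions, not the authors' implementation.

```python
def equalize_histogram(values, levels=256):
    """Histogram-equalize a flat list of integer intensities in [0, levels)."""
    if not values:
        return []
    # Build the intensity histogram.
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    # Cumulative distribution of the histogram.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(values)
    if total == cdf_min:
        return list(values)  # constant input: nothing to stretch
    # Map each intensity so that the output CDF is approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    return [round((cdf[v] - cdf_min) * scale) for v in values]


# Hypothetical intensities, e.g. from a projected LiDAR intensity map.
equalized = equalize_histogram([10, 10, 20, 30])
```

Applying the same mapping to both the LiDAR intensities and the image grey levels spreads both over the full range, which makes their histograms easier to compare.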
3.3 Initial Guess Estimation Typically, geometric, semantic, or motion features in the environment exhibit stability and uniqueness, which can improve the accuracy of LiDAR-camera calibration. To optimize the extrinsic parameters between the LiDAR and the camera, an initial estimate is required. Here, the edge feature [16] correspondence between the 3D point cloud and the image is used to obtain the initial estimate. For the image, the Canny operator is used to detect edge features directly. For the point cloud, depth-continuous edge lines are obtained using a voxel-based map [17]. As shown in Fig. 3, the dense point cloud is first divided into voxels (e.g., 1 m), and the planarity of each voxel is then evaluated (e.g., by examining the eigenvalues of the point covariance matrix). If all the feature points within the current voxel lie on a plane, the current voxel is stored in memory together with the feature points it contains. Otherwise, the current voxel is subdivided into multiple voxels, and the process is repeated for each sub-voxel until the minimum size is reached. This adaptive partitioning yields multiple plane features; planes that are connected within a certain range and form an angle with each other are paired, and the intersection lines of these plane pairs are computed.

Fig. 3 Adaptive point cloud voxel segmentation

Next, the extracted LiDAR edges need to be matched with the corresponding edges in the image. For each extracted LiDAR edge, multiple points on the edge are sampled. Each point is then projected onto the image (see Eq. 1). In the K-dimensional (KD) tree constructed from the image edge points, we search for the k nearest neighbors, denoted as $Q = \{x_i^j ; j = 1, \cdots, k\}$. Furthermore, we calculate the corresponding mean $\bar{x}_i$ and covariance matrix $S_i$:

$$\bar{x}_i = \frac{1}{k} \sum_{j=1}^{k} x_i^j \qquad (2)$$

$$S_i = \sum_{j=1}^{k} \left( x_i^j - \bar{x}_i \right) \left( x_i^j - \bar{x}_i \right)^T \qquad (3)$$

Therefore, the image edge line can be determined by the mean $\bar{x}_i$ and the line direction $n_i$, where $n_i$ is the eigenvector corresponding to the largest eigenvalue of $S_i$ [18]. Figure 4 shows examples of the extracted LiDAR edges, image edges, and their correspondences.

Fig. 4 LiDAR edges (red), image edge lines (blue) and their correspondences (green)
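The neighbor statistics of Eqs. (2) and (3) and the resulting line direction can be sketched in plain Python. This is a minimal illustration under stated assumptions: a brute-force search stands in for the KD-tree, the function names are hypothetical, and the principal direction of the 2×2 scatter matrix is obtained in closed form rather than with a general eigen-solver.

```python
import math


def k_nearest(query, edge_pixels, k):
    """Brute-force k nearest neighbors (stand-in for the KD-tree search)."""
    def squared_distance(p):
        return (p[0] - query[0]) ** 2 + (p[1] - query[1]) ** 2
    return sorted(edge_pixels, key=squared_distance)[:k]


def fit_edge_line(neighbors):
    """Return (mean, direction) of the line through 2D neighbor points.

    The direction is the unit eigenvector of the largest eigenvalue of
    the 2x2 scatter matrix, computed via the principal-axis angle.
    """
    k = len(neighbors)
    mean_u = sum(p[0] for p in neighbors) / k
    mean_v = sum(p[1] for p in neighbors) / k
    # Entries of the scatter matrix S = sum (x - mean)(x - mean)^T.
    s_uu = sum((p[0] - mean_u) ** 2 for p in neighbors)
    s_vv = sum((p[1] - mean_v) ** 2 for p in neighbors)
    s_uv = sum((p[0] - mean_u) * (p[1] - mean_v) for p in neighbors)
    # Principal-axis angle of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * s_uv, s_uu - s_vv)
    return (mean_u, mean_v), (math.cos(theta), math.sin(theta))


# Hypothetical edge pixels: four on the line v = 2u plus one far outlier.
edge_pixels = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (100.0, 0.0)]
neighbors = k_nearest((1.4, 3.1), edge_pixels, k=4)
line_mean, line_dir = fit_edge_line(neighbors)
```

In practice a KD-tree would replace the sorted scan, since the search is repeated for every sampled LiDAR edge point.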
Finally, for each projected sample point, we optimize the initial estimate of the extrinsic parameters by minimizing the PLD using the Levenberg-Marquardt algorithm [10]. The cost function is defined as follows:

$$\tilde{T}_L^C = \arg\min_{T_L^C} \sum_{j=1}^{M} \rho\left( \left\| n_i^T \left( K_C^i T_L^C p_L^j - x_i^j \right) \right\|^2 \right) \qquad (4)$$
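To make the point-to-line cost concrete, the following sketch evaluates it for a set of already projected points. Several choices here are assumptions rather than details taken from the paper: the residual is measured along the normal of a unit line direction, the robust kernel ρ is taken to be a Huber function, and all names are hypothetical. The Levenberg-Marquardt iteration itself is omitted.

```python
def point_to_line_residual(point, line_point, line_dir):
    """Signed distance from a 2D point to a line given by point and unit direction."""
    # The normal is the direction rotated by 90 degrees.
    normal = (-line_dir[1], line_dir[0])
    return (normal[0] * (point[0] - line_point[0])
            + normal[1] * (point[1] - line_point[1]))


def huber(residual, delta=1.0):
    """Huber kernel (assumed choice of rho): quadratic near zero, linear in the tails."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual * residual
    return delta * (a - 0.5 * delta)


def pld_cost(projected_points, lines, delta=1.0):
    """Sum of robustified point-to-line distances over matched pairs."""
    return sum(
        huber(point_to_line_residual(q, mean, direction), delta)
        for q, (mean, direction) in zip(projected_points, lines)
    )


# Hypothetical matches: two projected points against the same horizontal line.
horizontal = ((0.0, 0.0), (1.0, 0.0))
cost = pld_cost([(3.0, 2.0), (5.0, 0.5)], [horizontal, horizontal])
```

An optimizer would recompute the projected points from the current extrinsic estimate at every iteration and pass them to `pld_cost`.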
3.4 Refinement Due to viewpoint differences, some points in the LiDAR point cloud may be occluded and not visible from the camera. If all LiDAR points are simply projected, these occluded points can result in incorrect correspondences and degrade the calibration results [11]. To address this issue, a view-based hidden point removal method is employed to filter out LiDAR points that are not visible from the camera viewpoint. Using the initially estimated LiDAR-camera extrinsic transformation, the LiDAR points are projected onto a 2D image, and only the intensity value with the minimum depth in each pixel is retained. The resulting LiDAR intensity map is then compared with the image to refine the calibration. To use the NID as a measure, we first compute the marginal histograms $P(L_i)$ and $P(I_i)$ and the joint histogram $P(L_i, I_i)$ of the LiDAR and pixel intensities, and then calculate their entropies as follows:

$$H(X) = -\sum_{x \in X} p(x) \log p(x) \qquad (5)$$

where $x$ is each bin in the histogram. The NID between $L_i$ and $I_i$ is defined as follows:

$$\mathrm{NID}(L_i, I_i) = \frac{H(L_i, I_i) - \mathrm{MI}(L_i; I_i)}{H(L_i, I_i)} \qquad (6)$$

$$\mathrm{MI}(L_i; I_i) = H(L_i) + H(I_i) - H(L_i, I_i) \qquad (7)$$

where $\mathrm{MI}(L_i; I_i)$ is the mutual information between $L_i$ and $I_i$. By using the Nelder-Mead optimizer to minimize Eq. (6), the optimal solution of the LiDAR-camera transformation can be found.
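The entropy, mutual information, and NID defined in Eqs. (5)–(7) can be computed from paired, already binned intensities as in the sketch below. The function names are illustrative, and returning 0 when the joint entropy vanishes is a convention assumed here, since the ratio is undefined in that case.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a histogram given as bin counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def nid(lidar_bins, image_bins):
    """Normalised information distance between two paired bin sequences."""
    n = len(lidar_bins)
    h_l = entropy(Counter(lidar_bins).values(), n)
    h_i = entropy(Counter(image_bins).values(), n)
    # The joint histogram counts co-occurring (LiDAR bin, image bin) pairs.
    h_li = entropy(Counter(zip(lidar_bins, image_bins)).values(), n)
    if h_li == 0.0:
        return 0.0  # assumed convention: both signals are constant
    mutual_information = h_l + h_i - h_li
    return (h_li - mutual_information) / h_li


# Identical signals give NID = 0; statistically independent ones give NID = 1.
nid_identical = nid([0, 1, 0, 1], [0, 1, 0, 1])
nid_independent = nid([0, 0, 1, 1], [0, 1, 0, 1])
```

Because the NID lies between 0 and 1, lower values indicate better alignment, so it can be passed directly to a derivative-free minimizer such as Nelder-Mead.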
4 Experiments and Results The proposed calibration method for LiDAR-camera systems in targetless environments was evaluated experimentally. The experiments were run on a Lenovo Legion Y7000 2020 notebook computer with an i5-10200H processor. The software ran on the Robot Operating System (ROS) [19] under Ubuntu 18.04 and was developed with OpenCV and the Point Cloud Library (PCL). We first tested the method on the public datasets [11], which were collected by a Livox AVIA solid-state LiDAR and an Intel Realsense-D435i camera (as shown in Fig. 5a). To verify the robustness of the full pipeline, we tested it on each of the 3 scenes individually (see Fig. 6). The results are shown in Table 1. It can be seen that the extrinsic parameters converge to almost the same values after fine calibration. A visual example illustrating the difference before and after optimization is shown in Fig. 7: the initial guess estimation provided a poor initial LiDAR-camera transformation, whereas the fine registration algorithm achieved an almost ideal result. Furthermore, we compared our method with ACSC [20] on our own datasets, which were collected with a 1440 × 1080 pixel FLIR camera and the AVIA LiDAR (see Fig. 5b). The image of the indoor scene is shown in Fig. 8a. For comparison, we computed the residuals in Eq. (4) using the two calibrated extrinsic parameter sets; the quantitative result is shown in Fig. 8b. In addition, Fig. 9 shows the 3D point cloud coloring results of the two methods. These results show that the calibration results of our algorithm are close to those of ACSC, which confirms the feasibility and effectiveness of the proposed calibration algorithm.
Fig. 5 Sensor configurations for a public datasets and b our datasets
Fig. 6 Calibration scenes for public datasets
Table 1 Calibration results of the different scenarios
Scenes         tx (m)    ty (m)    tz (m)    ψ (rad)   θ (rad)   φ (rad)
1      init.   0.052     0.036     −0.069    0.340     −1.576    1.214
       opt.    −0.015    0.085     −0.091    0.770     −1.545    0.766
2      init.   0.037     0.008     0.016     0.311     −1.565    1.552
       opt.    −0.019    0.081     −0.101    0.768     −1.539    0.768
3      init.   0.050     0.039     −0.067    0.373     1.576     1.182
       opt.    −0.018    0.083     −0.098    0.772     −1.542    0.771
Fig. 7 LiDAR projection image overlaid on the camera image using a initial and b optimized extrinsic parameters
Fig. 8 a Our datasets from indoor scene and b comparison of residual distribution
Fig. 9 3D point cloud coloring results based on a our method and b ACSC
5 Conclusion In conclusion, this study has presented a calibration method for LiDAR-camera systems in targetless environments. By leveraging the edge features extracted from both the LiDAR data and the image, an initial estimate of the extrinsic parameters was obtained using RANSAC and PLD minimization. Subsequently, a refinement step was performed using the NID between the projected LiDAR intensity map and the camera image, which further improved the accuracy of the calibration. Experiments, including a comparison with a state-of-the-art method, validate its effectiveness. Future work can focus on further improving the robustness and accuracy of the calibration method, exploring the integration of additional sensor modalities, and investigating the applicability of the proposed approach in real-time scenarios.

Acknowledgements This work is supported by the National Science Foundation of China (Grant No. 42274037), the Aeronautical Science Foundation of China (Grant No. 2022Z022051001), and the National Key Research and Development Program of China (Grant No. 2020YFB0505804).
References
1. Shan, T., Englot, B.J., Ratti, C., Rus, D.: LVI-SAM: tightly-coupled lidar-visual-inertial odometry via smoothing and mapping. In: IEEE International Conference on Robotics and Automation, ICRA 2021, Xi’an, China, pp. 5692–5698. IEEE (2021)
2. Tóth, T., Pusztai, Z., Hajder, L.: Automatic lidar-camera calibration of extrinsic parameters using a spherical target. In: 2020 IEEE International Conference on Robotics and Automation, ICRA 2020, Paris, France, May 31–August 31, 2020, pp. 8580–8586. IEEE (2020)
3. Zhao, Y., Wang, Y., Tsai, Y.: 2d-image to 3d-range registration in urban environments via scene categorization and combination of similarity measurements. In: Kragic, D., Bicchi, A., De Luca, A. (eds.) 2016 IEEE International Conference on Robotics and Automation, ICRA 2016, Stockholm, Sweden, May 16–21, 2016, pp. 1866–1872. IEEE (2016)
4. Zhang, X., Zhu, S., Guo, S., Li, J., Liu, H.: Line-based automatic extrinsic calibration of lidar and camera. In: IEEE International Conference on Robotics and Automation, ICRA 2021, Xi’an, China, May 30–June 5, 2021, pp. 9347–9353. IEEE (2021)
5. Park, C., Moghadam, P., Kim, S., Sridharan, S., Fookes, C.: Spatiotemporal camera-lidar calibration: a targetless and structureless approach. IEEE Robot. Autom. Lett. 5(2), 1556–1563 (2020)
6. Wang, Z., Wu, Y., Niu, Q.: Multi-sensor fusion in automated driving: a survey. IEEE Access 8, 2847–2868 (2020)
7. Pandey, G., McBride, J.R., Savarese, S., Eustice, R.M.: Automatic targetless extrinsic calibration of a 3d lidar and camera by maximizing mutual information. In: Hoffmann, J., Selman, B. (eds.) Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, July 22–26, 2012, Toronto, Ontario, Canada. AAAI Press (2012)
8. Yu, H., Zhen, W., Yang, W., Scherer, S.: Line-based 2-d-3-d registration and camera localization in structured environments. IEEE Trans. Instrum. Meas. 69(11), 8962–8972 (2020)
9. Xu, H., Lan, G., Wu, S., Hao, Q.: Online intelligent calibration of cameras and lidars for autonomous driving systems. In: 2019 IEEE Intelligent Transportation Systems Conference, ITSC 2019, Auckland, New Zealand, October 27–30, 2019, pp. 3913–3920. IEEE (2019)
10. Koide, K., Oishi, S., Yokozuka, M., Banno, A.: General, single-shot, target-less, and automatic lidar-camera extrinsic calibration toolbox. CoRR, abs/2302.05094 (2023)
11. Yuan, C., Liu, X., Hong, X., Zhang, F.: Pixel-level extrinsic self calibration of high resolution lidar and camera in targetless environments. IEEE Robot. Autom. Lett. 6(4), 7517–7524 (2021)
12. Liao, Q., Liu, M.: Extrinsic calibration of 3d range finder and camera without auxiliary object or human intervention. In: 2019 IEEE International Conference on Real-time Computing and Robotics, RCAR 2019, Irkutsk, Russia, August 4–9, 2019, pp. 42–47. IEEE (2019)
13. Taylor, Z., Nieto, J.I.: Motion-based calibration of multimodal sensor arrays. In: IEEE International Conference on Robotics and Automation, ICRA 2015, Seattle, WA, USA, 26–30 May, 2015, pp. 4843–4850. IEEE (2015)
14. Censi, A.: An ICP variant using a point-to-line metric. In: 2008 IEEE International Conference on Robotics and Automation, ICRA 2008, May 19–23, 2008, Pasadena, California, USA, pp. 19–25. IEEE (2008)
15. Yan, G., He, F., Shi, C., Cai, X., Li, Y.: Joint camera intrinsic and lidar-camera extrinsic calibration. CoRR, abs/2202.13708 (2022)
16. Zhang, J., Singh, S.: LOAM: lidar odometry and mapping in real-time. In: Fox, D., Kavraki, L.E., Kurniawati, H. (eds.) Robotics: Science and Systems X, University of California, Berkeley, USA, July 12–16 (2014)
17. Yuan, C., Wei, X., Liu, X., Hong, X., Zhang, F.: Efficient and probabilistic adaptive voxel mapping for accurate online lidar odometry. IEEE Robot. Autom. Lett. 7(3), 8518–8525 (2022)
18. Liu, Z., Zhang, F.: BALM: bundle adjustment for lidar mapping. IEEE Robot. Autom. Lett. 6(2), 3184–3191 (2021)
19. Quigley, M., Gerkey, B.P., Conley, K., Faust, J., Ng, A.Y.: Ros: an open-source robot operating system (2009)
20. Cui, J., Niu, J., Ouyang, Z., He, Y., Liu, D.: ACSC: automatic calibration for non-repetitive scanning solid-state lidar and camera systems. CoRR, abs/2011.08516 (2020)
Modeling Analysis of Force-Thermal Coupling for High-Speed Planetary Roller Screw Zheng Jigui, Yang Bin, Tian Qing, Guo Yaxing, Shi Wei, and Cui Zhenglei
Abstract Taking the high-speed planetary roller screw required by electromechanical servo systems as the research object, this paper establishes a force-thermal coupling model of the high-speed planetary roller screw under preload conditions. Based on the model, the parameters of the planetary roller screw are designed and a simulation analysis under preload is carried out, from which the variation of load force and temperature at high speed is obtained. Finally, an experiment on the force-thermal model with preload is carried out, which verifies the correctness of the theoretical model and its consistency with the experiment. Keywords High-speed · Planetary roller screw · Force-thermal coupling · Modeling · Experimental analysis
Z. Jigui · Y. Bin · T. Qing · G. Yaxing · S. Wei (B) · C. Zhenglei Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_71

1 Introduction The planetary roller screw mechanism (PRSM) has been more and more widely used in electromechanical actuators with high stiffness, high load capacity and long life [1]. With the development of electromechanical servo technology, higher requirements are placed on the speed of the PRSM, the core component of electromechanical actuators. Compared with traditional application scenarios, the operating speed is increased from 4000 to 10,000 rpm, and even to 20,000 rpm, which poses a great challenge to the design of the PRSM. Yang took the PRSM as the research object, analyzed the heat source, and calculated the friction moment and heat generation, which were used as boundary conditions for analyzing the thermal characteristics of the PRSM. A thermal analysis module was then used to analyze the thermal characteristics at different speeds and coolant flows, the temperature rise curve of the hollow planetary roller screw was obtained, and the influence of related factors was analyzed [2]. Ma revealed the influence of coupled error, wear and temperature change on the load distribution of the PRSM, and established a load distribution model of the PRSM involving error, wear and temperature change based on the deformation coordination relationship [3]. Cao [4], Yang [5] and others analyzed the thermal characteristics of the PRSM and discussed the suppression of thermal deformation, which provided inspiration for the thermal analysis of the PRSM. In this paper, a force-thermal coupling mechanism model for the high-speed PRSM is established. Based on the theoretical analysis model of force-thermal coupling, an equivalent simulation analysis and an equivalent experimental test analysis of the high-speed PRSM force-thermal coupling are carried out. These lay a theoretical foundation for the high-speed and high-reliability performance of high-power electromechanical servo systems for aerospace, and are of great significance for meeting the urgent demand for high-speed and high-reliability PRSMs.
2 Force-Thermal Coupling Model for High-Speed PRSM To establish the force-thermal coupling model of the planetary roller screw, the frictional moment must be calculated first [6–9]. The analysis of friction and frictional moment for the high-speed PRSM mainly includes: differential sliding friction analysis, spin sliding friction analysis of the roller [10, 11], friction analysis between roller and cage, lubricant viscous friction analysis, preload friction analysis, and analysis of the friction caused by high-speed operation. Depending on whether the PRSM operates at high speed and whether a preload is applied, frictional moment models for different target requirements can be obtained as follows. For a PRSM with preload and non-high-speed operation, the total frictional moment can be expressed as:

$$M_{pnv} = M_C + M_S + M_{rc} + M_E + M_{pr} \qquad (1)$$

For a PRSM with preload and high-speed operation, the total frictional moment can be expressed as:

$$M_{pv} = M_C + M_S + M_{rc} + M_E + M_{pr} + M_{EH} \qquad (2)$$

Assuming that all mechanical energy is converted into thermal energy, the following relationship follows from the frictional resistance:

$$H_{PRSM} = \frac{M \cdot n_s}{9550} \qquad (3)$$
where $H_{PRSM}$ is the frictional heat source of the PRSM (W), $M$ is the total frictional moment of the PRSM (N·mm), and $n_s$ is the speed of the screw (rpm). Depending on whether the PRSM operates at high speed and whether a preload is applied, force-thermal coupling models for different target requirements can be obtained as follows. For a PRSM with preload and non-high-speed operation, the force-thermal coupling model can be expressed as:

$$H_{pnv} = \frac{1}{9550} \left( M_C + M_S + M_{rc} + M_E + M_{pr} + M_{FN} \right) \cdot n_s \qquad (4)$$

For a PRSM with preload and high-speed operation, the force-thermal coupling model can be expressed as:

$$H_{pv} = \frac{1}{9550} \left( M_C + M_S + M_{rc} + M_E + M_{pr} + M_{FN} + M_{EH} \right) \cdot n_s \qquad (5)$$
Therefore, through the above force-thermal coupling models, the force-thermal coupling characteristics of high-speed PRSM can be analyzed according to whether the PRSM is in the high-speed operating condition and whether the preload force is applied.
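The heat-source relations (3)–(5) can be evaluated numerically once the individual frictional moments are known. The following is a minimal sketch only: the function name and all moment values are hypothetical and chosen for illustration, not taken from the paper.

```python
def frictional_heat_w(moments_n_mm, screw_speed_rpm):
    """Frictional heat source in W from moments in N*mm and speed in rpm, as in Eq. (3)."""
    total_moment = sum(moments_n_mm)
    return total_moment * screw_speed_rpm / 9550.0

# Hypothetical component moments in N*mm (illustrative values only),
# in the order M_C, M_S, M_rc, M_E, M_pr, M_FN.
base_moments = [12.0, 8.0, 3.0, 2.0, 5.0, 1.5]
m_eh = 4.0  # hypothetical additional high-speed term M_EH

h_pnv = frictional_heat_w(base_moments, 1000.0)            # form of Eq. (4)
h_pv = frictional_heat_w(base_moments + [m_eh], 20000.0)   # form of Eq. (5)
print(f"H_pnv = {h_pnv:.3f} W, H_pv = {h_pv:.3f} W")
```

Because the heat source is linear in both the total moment and the screw speed, doubling either quantity doubles the computed heat under this model.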
3 Force-Heat Coupling Analysis for High-Speed PRSM

3.1 Structural Parameters

The core components of the PRSM are the screw, the nut and the roller, whose structural parameters are listed in Tables 1, 2 and 3.
3.2 Force-Thermal Coupling Analysis Under Preload and Conventional Speed Conditions

For the conventional speed condition with preload, the friction torque model (1) and the force-heat coupling model (4) are combined to analyze the force-thermal coupling mechanism of the PRSM. As shown in Fig. 1, the equivalent speed between the roller and the screw is set to 1000 rpm, and the preload is set to 30 N. The influence of axial load on the force-thermal coupling behavior of the PRSM is studied as the axial load changes from 40 to 80 N. The figure shows that the temperature of the screw rises with time, and the greater the load, the faster the temperature rises. As shown in Fig. 2, to study the influence of the screw speed on the force-thermal coupling behavior of the PRSM, the preload is set to 30 N and the axial load is set to 30 N. The force-thermal coupling behavior is analyzed as the speed changes from 400 to 2000 rpm; the higher the speed, the faster the temperature rises.

Table 1 Structural parameters of screw thread

| Parameter | Value |
| --- | --- |
| Rotation direction | Dextrorotation |
| Tooth shape | Triangle |
| Lead | 1.25 mm |
| Number of heads | 5 |
| Pitch | 0.25 mm |
| Helix angle | 1.085° |
| Theoretical mean diameter | 20.997 mm |
| Tooth angle | 90° ± 15 |
| Radius of the arc at the base of the tooth | ≤0.02 mm |
| Adjacent pitch error | ≤0.003 mm |
| Full-length pitch accumulation error | ≤0.012 mm |
| Mean diameter fluctuations | ≤0.005 mm |
| Tooth surface roughness | 0.8 µm |

Table 2 Structural parameters of nut thread

| Parameter | Value |
| --- | --- |
| Rotation direction | Dextrorotation |
| Tooth shape | Triangle |
| Lead | 1.25 mm |
| Number of heads | 5 |
| Pitch | 0.25 mm |
| Helix angle | 0.651° |
| Theoretical mean diameter | 35 mm |
| Tooth angle | 90° ± 15 |
| Radius of the arc at the base of the tooth | ≤0.02 mm |
| Adjacent pitch error | ≤0.003 mm |
| Full-length pitch accumulation error | ≤0.005 mm |
| Mean diameter fluctuations | ≤0.003 mm |
| Tooth surface roughness | 0.8 µm |
Table 3 Structure parameters of roller thread

| Parameter | Value |
| --- | --- |
| Rotation direction | Dextrorotation |
| Lead | 1.25 mm |
| Number of heads | 1 |
| Pitch | 0.25 mm |
| Helix angle | 0.701° |
| Theoretical mean diameter | 6.5 mm |
| Radius of the arc at the base of the tooth | ≤0.02 mm |
| Adjacent pitch error | ≤0.003 mm |
| Full-length pitch accumulation error | ≤0.004 mm |
| Tooth surface roughness | 0.8 µm |
Fig. 1 Analysis of theoretical model of force-thermal coupling of PRSM under load condition
3.3 Force-Thermal Coupling Analysis Under High Speed and Preload Condition

For the high speed condition with preload, the friction torque model (2) and the force-heat coupling model (5) of the PRSM are combined to analyze the force-thermal coupling mechanism of the PRSM. As shown in Fig. 3, the equivalent speed between the screw and the roller is set to 20,000 rpm, and the influence of axial load on the force-thermal coupling behavior of the PRSM is studied as the axial load increases from 40 to 80 N. At high speed, the same tendency holds: the greater the load, the faster the temperature rises.
Fig. 2 Analysis of theoretical model of force-heat coupling of PRSM under speed condition
Fig. 3 Analysis of theoretical model of force-thermal coupling of PRSM under load conditions
Fig. 4 Analysis of theoretical model of force-thermal coupling of roller screw under speed working condition
Similarly, as shown in Fig. 4, the equivalent preload between the roller and the screw is set to 30 N, and the equivalent load between a single roller and the screw is set to 30 N. The force-heat coupling behavior of the PRSM is analyzed as the relative speed between the screw and the roller increases from 4000 to 20,000 rpm; the higher the speed, the faster the temperature of the screw rises.
4 Experimental Analysis of Force-Heat Coupling for PRSM

4.1 Experimental Analysis of Force-Thermal Coupling of PRSM Under the Condition of Conventional Speed and Preload Force

Based on the equivalent test platform for the high-speed PRSM, the force-thermal coupling of the PRSM under the conventional speed condition with preload is studied experimentally. The equivalent speed between the screw and the roller is set to 1000 rpm, and the axial load is increased from 40 to 80 N. The influence of axial load on the force-thermal coupling behavior of the PRSM is shown in Fig. 5.
Fig. 5 Experimental test of force-thermal coupling of PRSM under load conditions
Fig. 6 Experimental results of temperature rise of force-heat coupling for PRSM under load condition
Based on the test results of Fig. 5, combined with the initial temperature of the PRSM, the experimental results of temperature rise of the force-heat coupling under the conventional speed condition and preload force can be obtained, as shown in Fig. 6.
Fig. 7 Theoretical modeling analysis and experimental test of force-thermal coupling for PRSM
The theoretical model of force-thermal coupling of the PRSM under the conventional speed condition with preload is compared with the experimental temperature rise results in Fig. 6 for the different axial loads, as shown in Fig. 7. Similarly, the influence of the screw speed on the force-thermal coupling behavior of the PRSM is studied experimentally: the experimental results are shown in Fig. 8, and the resulting temperature rise is shown in Fig. 9. The theoretical model under the conventional speed condition with preload is then compared with the experimental temperature rise results in Fig. 9 for the different speeds, as shown in Fig. 10.
Fig. 8 Experimental test of force-thermal coupling of PRSM under speed conditions
Fig. 9 Experimental results of temperature rise of force-heat coupling under speed condition
Fig. 10 Comparison analysis of theoretical modeling and experimental test of force-heat coupling
4.2 Experimental Analysis of Force-Thermal Coupling Under High Speed and Preload Condition

Based on the equivalent test platform, the force-heat coupling of the PRSM under high speed and preload is studied experimentally. First, the equivalent speed between the screw and the roller is set to 20,000 rpm and the axial load is increased from 40 to 80 N to analyze the influence of axial load on the force-thermal coupling behavior of the PRSM; the experimental results are shown in Fig. 11. Based on the results of Fig. 11 and the initial temperature of the PRSM, the temperature rise under high speed with preload is obtained, as shown in Fig. 12. The theoretical model of force-thermal coupling of the PRSM under high speed and preload is then compared with the experimental temperature rise results in Fig. 12 for the different axial loads, as shown in Fig. 13. Similarly, to study the influence of the screw speed, the equivalent preload between a single roller and the screw is set to 30 N, and the equivalent load between a single roller and the screw is set to 30 N. The speed is varied from 4000 to 20,000 rpm; the experimental results of the force-thermal coupling characteristics are shown in Fig. 14, and the corresponding temperature rise results are shown in Fig. 15.

Fig. 11 Experimental test of force-thermal coupling under load conditions
Fig. 12 Analysis results of force-heat coupling temperature rise under load condition
Fig. 13 Comparative analysis of theoretical modeling and experimental test of force-heat coupling
Fig. 14 Experimental test of force-thermal coupling under speed condition
Fig. 15 Analysis results of force-heat coupling temperature rise under speed conditions
Fig. 16 Comparative analysis of theoretical modeling and experimental test of force-heat coupling
The theoretical model of force-thermal coupling of the PRSM under high speed and preload is compared with the experimental temperature rise results in Fig. 15 for the different speeds, as shown in Fig. 16. Under high-speed operation with preload, when the equivalent speed is 20,000 rpm and the axial load increases from 40 to 80 N, the temperature rise range changes from [21.2, 29] °C to [22, 31.9] °C, and the temperature rise rate is about [13.37, 16.97] °C/min. With an equivalent preload of 30 N and an equivalent load of 30 N, when the relative speed between the screw and the roller increases from 4000 to 20,000 rpm, the temperature rise range changes from [22.1, 24.8] °C to [21.1, 30.3] °C, and the temperature rise rate is about [4.63, 15.77] °C/min.
5 Conclusion

The force-heat coupling behavior of the PRSM under different speed and load conditions is studied and analyzed. The higher the speed and the greater the load, the faster the screw temperature rises. Under preload and high-speed operation, when the equivalent speed is 20,000 rpm and the axial load increases from 40 to 80 N, the temperature rise range changes from [21.2, 29] °C to [22, 31.9] °C, and the temperature rise rate is about [13.37, 16.97] °C/min.
References

1. Linping, W., Shangjun, M., Xiaojun, F., et al.: A review of planetary roller screw mechanism for development and new trends. Proc. Inst. Mech. Eng., Part C: J. Mech. Eng. Sci. 236(21) (2022)
2. Yang, J., Shi, J., Zhu, J., et al.: Study on secondary thermal characteristics and thermal deformation suppression of planetary roller screw. J. Hubei Univ. Technol. 29(04), 1–4 (2014)
3. Ma, S., Li, X., Liu, G., et al.: Secondary load distribution of planetary roller screw coupling with error-wear-temperature change. J. Northwest. Polytech. Univ. 35(04), 655–660 (2017)
4. Cao, J., Li, L., Liu, Y., et al.: Study on thermal characteristics of high-speed hollow ball screw under different working conditions. Combined Mach. Tool Autom. Process. Technol. (03), 30–32+44 (2011)
5. Yang, J., Yang, W., Huang, G., et al.: Suppression countermeasures for thermal displacement of ball screw pair. Manuf. Technol. Mach. Tool (08), 109–111 (2006)
6. Velinsky, S.A., Chu, B., Lasky, T.A.: Kinematics and efficiency analysis of the planetary roller screw mechanism. J. Mech. Des. 131, 011016 (2009)
7. Jones, M.H., Velinsky, S.A.: Kinematics of roller migration in the planetary roller screw mechanism. J. Mech. Des. 134, 061006 (2012)
8. Jones, M.H., Velinsky, S.A.: Contact kinematics in the planetary roller screw mechanism. J. Mech. Des. 135, 051003 (2013)
9. Ma, S.J., Zhang, T., Liu, G., et al.: Kinematics of planetary roller screw mechanism considering helical directions of screw and roller threads. Math. Prob. Eng. 2015, 459462 (2015)
10. Auregan, G., Fridrici, V., Kapsa, P., et al.: Experimental simulation of rolling-sliding contact for application to planetary roller screw mechanism. Wear 332–333, 1176–1184 (2015)
11. Ma, S.J., Liu, G., Fu, X.J., et al.: Rolling-sliding characteristics of planetary roller screw considering elastic deformation. J. Southeast Univ. (Nat. Sci. Ed.) 45, 461–468 (2015)
Optimal Output Tracking for Unknown Linear Discrete-Time Systems Based on Adaptive Dynamic Programming and Output Feedback

Kexin Fan, Xuan Cai, Jinghan Wu, Xin Wu, and Qiaoshen Xiao
Abstract In this study, we propose a model-free adaptive dynamic programming (ADP) algorithm for output-feedback optimal tracking control. The algorithm aims to obtain a solution based solely on input and output data gathered along the system trajectories. To realize optimal output tracking control, this article solves the model-free optimal output regulation problem, which amounts to handling a linear quadratic regulation (LQR) problem and a static optimization problem. First, we reconstruct the system state from the input and output sequences. Then, based on the input and output data, a data-driven policy iteration (PI) is applied to the discrete-time algebraic Riccati equation (ARE) of the associated LQR problem. Finally, based on the iterative solutions of the ARE, a model-free solution is proposed for the static optimization problem.

Keywords Adaptive dynamic programming · Optimal tracking control · Discrete-time systems · Output feedback
1 Introduction

In the field of control, output tracking control requires the system output to follow a target trajectory. In previous studies, this problem is often transformed into an output regulation problem. In practice, however, the system dynamics are often unknown. With the help of Bellman's principle of optimality, dynamic programming (DP) can effectively deal with tracking control problems, but, as indicated in [3, 4, 21], traditional DP suffers from the "curse of dimensionality" and the "curse of modeling". To solve this problem, adaptive dynamic programming (ADP) was developed, which iteratively learns the control strategy from online data. ADP has been widely studied in the literature [2, 5, 6, 13, 14, 16, 17, 20, 22, 23].

K. Fan · X. Cai (B) · J. Wu · X. Wu · Q. Xiao
School of Electrical Engineering, Shanghai Dianji University, Shanghai 201306, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_72
There are two iterative schemes in the ADP framework: policy iteration (PI) and value iteration (VI). VI is a one-step process that iteratively estimates the value function, whereas PI is a two-step process consisting of policy evaluation and policy improvement. This paper mainly discusses policy iteration.

This paper focuses on the problem of optimal output tracking control via output feedback for unknown linear discrete-time systems without any restrictive assumptions. Previously, an optimal output regulation method via output feedback was proposed in [10], and an event-triggered output feedback control algorithm based on ADP was designed in [24]. However, the control gain matrix in [10, 24] is assumed to have full column rank. This paper does not rely on that assumption. The main contributions of this article are as follows.

1. Based solely on input and output data, a new output form of the Sylvester map is designed to implement the PI algorithm.
2. Compared with the works in [10, 24], this article does not require restrictive assumptions, such as assuming that the control gain matrix has full column rank.

Notations: $G^{-1}$, $G^T$ and $G^{-T}$ are the inverse, transpose, and inverse of the transpose of a matrix $G \in \mathbb{R}^{N \times N}$. $\|\cdot\|$ denotes the Euclidean norm of a vector or the spectral norm of a matrix, according to context. $\otimes$ denotes the Kronecker product. $I_n$ and $0_{n \times m}$ denote the $n \times n$ identity matrix and the $n \times m$ matrix with all entries zero, respectively. For a matrix $A \in \mathbb{R}^{n \times m}$, $\ker(A)$ is its kernel, $\mathrm{Tr}(A)$ is its trace, and $\mathrm{vec}(A) = [a_1^T, a_2^T, \ldots, a_m^T]^T$, where $a_i \in \mathbb{R}^n$ is the $i$th column of $A$. For a symmetric matrix $B \in \mathbb{R}^{m \times m}$, $B > 0$ ($B \ge 0$) means that $B$ is positive definite (positive semidefinite), and $\mathrm{vecs}(B) = [b_{11}, 2b_{12}, \ldots, 2b_{1m}, b_{22}, 2b_{23}, \ldots, 2b_{m-1,m}, b_{mm}]^T$, where $b_{ij} \in \mathbb{R}$ is the $(i, j)$th element of $B$. For a column vector $x \in \mathbb{R}^n$, $\mathrm{vecv}(x) = [x_1^2, x_1 x_2, \ldots, x_1 x_n, x_2^2, x_2 x_3, \ldots, x_{n-1} x_n, x_n^2]^T$, where $x_i \in \mathbb{R}$ is the $i$th entry of $x$.
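The operators vec, vecs and vecv defined above are chosen so that a quadratic form becomes an inner product, $x^T B x = \mathrm{vecv}(x)^T \mathrm{vecs}(B)$ for symmetric $B$. The following is a minimal NumPy sketch of the three operators; the function names mirror the notation, and the numerical values are illustrative assumptions.

```python
import numpy as np

def vec(a):
    """Stack the columns of a into a single vector."""
    return np.asarray(a).reshape(-1, order="F")

def vecs(b):
    """Upper-triangular entries of symmetric b, row by row, off-diagonals doubled."""
    b = np.asarray(b, dtype=float)
    m = b.shape[0]
    return np.array([b[i, j] if i == j else 2.0 * b[i, j]
                     for i in range(m) for j in range(i, m)])

def vecv(x):
    """Products x_i * x_j for i <= j, in the same ordering as vecs."""
    x = np.asarray(x, dtype=float).ravel()
    n = x.size
    return np.array([x[i] * x[j] for i in range(n) for j in range(i, n)])

# Illustrative symmetric matrix and vector
B = np.array([[2.0, 0.5, -1.0],
              [0.5, 3.0, 0.25],
              [-1.0, 0.25, 1.0]])
x = np.array([1.0, -2.0, 0.5])

quad_direct = x @ B @ x               # x^T B x
quad_stacked = vecv(x) @ vecs(B)      # vecv(x)^T vecs(B)
print(quad_direct, quad_stacked)
```

This identity is what later allows unknown symmetric matrices to appear linearly in data equations.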
2 Problem Description

2.1 Output Tracking Problem

This paper considers the following linear discrete-time system:

$$x_{k+1} = A x_k + B u_k \tag{1}$$

$$v_{k+1} = E v_k \tag{2}$$

$$y_k = C x_k \tag{3}$$

$$y_k^d = F v_k \tag{4}$$

$$e_k = y_k - y_k^d \tag{5}$$
where $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$, $y \in \mathbb{R}^p$, $y^d \in \mathbb{R}^p$ and $e \in \mathbb{R}^p$ are the state, control input, output, reference output and tracking error, respectively; $v \in \mathbb{R}^q$ is the state of the reference system (2), which produces the reference signal $F v_k$; and $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$, $E \in \mathbb{R}^{q \times q}$, $C \in \mathbb{R}^{p \times n}$ and $F \in \mathbb{R}^{p \times q}$ are constant matrices. System (1)–(5) requires certain standard assumptions, which are also made in [8, 10, 11, 24].

Assumption 1 The pair $(A, B)$ is controllable, and the pair $(A, C)$ is observable.

Assumption 2 $\mathrm{rank}\begin{bmatrix} A - \lambda_E I & B \\ C & 0_{p \times m} \end{bmatrix} = n + p, \ \forall \lambda_E \in \sigma(E)$.

The objective of this study is to develop an output feedback control strategy that does not rely on any pre-defined mathematical model and that optimizes the transient performance, while achieving global stability of the closed-loop system (1) and asymptotic convergence of the tracking error $e_k$ to zero, i.e., $\lim_{k \to +\infty} e_k = 0$.
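When a model is available for validation purposes, Assumptions 1 and 2 can be checked numerically with rank tests. The sketch below is an illustration under assumed example matrices (a second-order plant and a sinusoidal reference generator); none of these values come from the paper.

```python
import numpy as np

def is_controllable(A, B):
    """Rank test on the controllability matrix [B, AB, ..., A^(n-1)B]."""
    n = A.shape[0]
    ctrb = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
    return bool(np.linalg.matrix_rank(ctrb) == n)

def is_observable(A, C):
    """Observability of (A, C) is controllability of (A^T, C^T)."""
    return is_controllable(A.T, C.T)

def satisfies_assumption_2(A, B, C, E):
    """Check rank [[A - lam*I, B], [C, 0]] = n + p at every eigenvalue lam of E."""
    n, m = B.shape
    p = C.shape[0]
    for lam in np.linalg.eigvals(E):
        pencil = np.block([[A - lam * np.eye(n), B],
                           [C, np.zeros((p, m))]])
        if np.linalg.matrix_rank(pencil) != n + p:
            return False
    return True

# Assumed example system (illustrative values only)
A = np.array([[0.0, 1.0], [-0.5, 1.2]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
E = np.array([[np.cos(0.3), np.sin(0.3)],
              [-np.sin(0.3), np.cos(0.3)]])

print(is_controllable(A, B), is_observable(A, C), satisfies_assumption_2(A, B, C, E))
```

Numerical rank tests depend on a tolerance, so a borderline result should be treated with caution.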
2.2 Optimal Output Tracking Formula

It is important to note that this paper aims to address both the asymptotic tracking and the transient performance of discrete-time linear optimal output tracking control systems. To achieve optimal output tracking control, we solve the optimal output regulation problem proposed in [15]. The optimal output tracking problem is therefore divided into two sub-problems.

Problem 1

$$\begin{aligned}
(X^*, U^*) = \arg\min_{(X, U)} \ & \mathrm{Tr}\left(X^T \bar{Q} X + U^T \bar{R} U\right) \\
\text{s.t.} \ & X E = A X + B U \\
& 0 = C X - F
\end{aligned} \tag{6}$$

where $\bar{Q} = \bar{Q}^T > 0$ and $\bar{R} = \bar{R}^T > 0$.
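Problem 2

$$\begin{aligned}
\bar{u}^* = \arg\min_{\bar{u}} \ & \sum_{t=k}^{\infty} \left(e_t^T Q e_t + \bar{u}_t^T R \bar{u}_t\right) \\
\text{s.t.} \ & \bar{x}_{k+1} = A \bar{x}_k + B \bar{u}_k \\
& e_k = C \bar{x}_k
\end{aligned} \tag{7}$$

where $\bar{x}_k = x_k - X v_k$, $\bar{u}_k = u_k - U v_k$, $Q = Q^T > 0$ and $R = R^T > 0$, with $(A, Q^{1/2} C)$ observable.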
3 Optimal Output Tracking Design Under Known Dynamics Conditions

This section provides the solutions to Problems 1 and 2 when the system dynamics are known.
3.1 State Reconstitution

Here, we reconstruct the state of the system from input and output data. To this end, we employ a useful lemma introduced by Lewis and Vamvoudakis [17].

Lemma 1 If the pair $(A, C)$ is observable, the system state $x_k$ can be expressed as

$$x_k = M_y \bar{y}_{k-1,k-n} + M_u \bar{u}_{k-1,k-n} := M z_k \tag{8}$$

where

$$\begin{aligned}
\bar{y}_{k-1,k-n} &= [y_{k-1}^T, y_{k-2}^T, \ldots, y_{k-n}^T]^T \in \mathbb{R}^{pn}, \\
\bar{u}_{k-1,k-n} &= [u_{k-1}^T, u_{k-2}^T, \ldots, u_{k-n}^T]^T \in \mathbb{R}^{mn}, \\
M_y &= A^n [Y_n^T Y_n]^{-1} Y_n^T, \quad M_u = U_n - M_y T_{n1}, \quad M = [M_u, M_y], \\
z_k &= [\bar{u}_{k-1,k-n}^T, \bar{y}_{k-1,k-n}^T]^T
\end{aligned}$$

with

$$\begin{aligned}
U_n &= [B, AB, \ldots, A^{n-1} B] \in \mathbb{R}^{n \times mn}, \\
Y_n &= [(C A^{n-1})^T, \ldots, (C A)^T, C^T]^T \in \mathbb{R}^{pn \times n}, \\
T_{n1} &= \begin{bmatrix}
0_{p \times m} & CB & CAB & \cdots & C A^{n-2} B \\
0_{p \times m} & 0_{p \times m} & CB & \cdots & C A^{n-3} B \\
\vdots & \vdots & \ddots & \ddots & \vdots \\
0_{p \times m} & \cdots & \cdots & 0_{p \times m} & CB \\
0_{p \times m} & 0_{p \times m} & 0_{p \times m} & 0_{p \times m} & 0_{p \times m}
\end{bmatrix}.
\end{aligned}$$
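Lemma 1 can be checked on simulated data: build $M$ from known $(A, B, C)$, drive the system with arbitrary inputs, and compare $M z_k$ with the true state. The sketch below does this for an assumed second-order single-input single-output system; the helper name `reconstruction_matrix` and all numerical values are illustrative assumptions.

```python
import numpy as np

def reconstruction_matrix(A, B, C):
    """Build M = [M_u, M_y] of Lemma 1 from known (A, B, C)."""
    n, m = B.shape
    p = C.shape[0]
    mp = np.linalg.matrix_power
    U_n = np.hstack([mp(A, i) @ B for i in range(n)])          # [B, AB, ..., A^(n-1)B]
    Y_n = np.vstack([C @ mp(A, n - 1 - i) for i in range(n)])  # [CA^(n-1); ...; CA; C]
    T = np.zeros((p * n, m * n))                               # block upper-triangular Toeplitz
    for row in range(n):
        for col in range(row + 1, n):
            T[row * p:(row + 1) * p, col * m:(col + 1) * m] = C @ mp(A, col - row - 1) @ B
    M_y = mp(A, n) @ np.linalg.solve(Y_n.T @ Y_n, Y_n.T)
    M_u = U_n - M_y @ T
    return np.hstack([M_u, M_y])

# Assumed example system (illustrative values only)
A = np.array([[0.0, 1.0], [-0.5, 1.2]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
n = A.shape[0]

rng = np.random.default_rng(0)
xs, us, ys = [rng.standard_normal(n)], [], []
for _ in range(6):
    u = rng.standard_normal(1)
    us.append(u)
    ys.append(C @ xs[-1])
    xs.append(A @ xs[-1] + B @ u)

k = 5
z_k = np.concatenate([us[k - i] for i in range(1, n + 1)] +
                     [ys[k - i] for i in range(1, n + 1)])
M = reconstruction_matrix(A, B, C)
print(np.allclose(M @ z_k, xs[k]))
```

In the model-free setting of the paper $M$ is not available; this check only illustrates why the stacked input-output vector $z_k$ carries the same information as $x_k$.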
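3.2 Solution to Problem 1

Drawing inspiration from the work of [9, 15], we develop a novel output version of the Sylvester map to address this problem:

$$S(\bar{X}) = M \bar{X} E - A M \bar{X}, \quad \bar{X} \in \mathbb{R}^{r \times q} \tag{9}$$

where $E$ and $A$ are specified in (1) and (2). The solutions of (6) can now be characterized as follows. Choose a constant matrix $X_1 \in \mathbb{R}^{r \times q}$ satisfying $C M X_1 = F$. Then select $h$ matrices $X_i \in \mathbb{R}^{r \times q}$, $i = 2, 3, \ldots, h + 1$, such that the vectors $\mathrm{vec}(X_i)$, $i = 2, 3, \ldots, h + 1$, form a basis for $\ker(I_q \otimes C M)$, where $h$ is the dimension of the kernel of $I_q \otimes C M$. As a result, the general solution of $C M \bar{X} = F$ is

$$\bar{X} = X_1 + \sum_{i=2}^{h+1} \alpha_i X_i \tag{10}$$

where $\alpha_i \in \mathbb{R}$, $i = 2, \ldots, h + 1$, are constants. If the matrix $M$ has linearly independent rows, any solution of (6) is

$$X = M \bar{X} = M X_1 + \sum_{i=2}^{h+1} \alpha_i M X_i. \tag{11}$$

From the linearity of $S(\cdot)$, we obtain

$$S(\bar{X}) = S(X_1) + \sum_{i=2}^{h+1} \alpha_i S(X_i). \tag{12}$$

Thus, the constraints in (6) are equivalent to

$$\Omega \chi = \theta \tag{13}$$

where

$$\begin{aligned}
\Omega &= \begin{bmatrix}
\mathrm{vec}(S(X_2)) & \cdots & \mathrm{vec}(S(X_{h+1})) & 0_{nq \times rq} & -I_q \otimes B \\
\mathrm{vec}(X_2) & \cdots & \mathrm{vec}(X_{h+1}) & -I_{rq} & 0_{rq \times qm}
\end{bmatrix}, \\
\chi &= [\alpha_2, \ldots, \alpha_{h+1}, \mathrm{vec}(\bar{X})^T, \mathrm{vec}(U)^T]^T, \\
\theta &= \begin{bmatrix} \mathrm{vec}(-S(X_1)) \\ -\mathrm{vec}(X_1) \end{bmatrix}.
\end{aligned}$$

Therefore, if the coefficient matrices $\Omega$, $\theta$ and $M$ are known exactly, the optimal matrices $X^*$ and $U^*$ in Problem 1 can be found from $X^* = M \bar{X}^*$ and

$$\begin{aligned}
(\alpha^*, \bar{X}^*, U^*) = \arg\min_{(\alpha, \bar{X}, U)} \ &
\begin{bmatrix} \alpha \\ \mathrm{vec}(\bar{X}) \\ \mathrm{vec}(U) \end{bmatrix}^T
\begin{bmatrix}
0_{h \times h} & 0_{h \times rq} & 0_{h \times qm} \\
0_{qr \times h} & I_q \otimes M^T \bar{Q} M & 0_{qr \times qm} \\
0_{qm \times h} & 0_{qm \times rq} & I_q \otimes \bar{R}
\end{bmatrix}
\begin{bmatrix} \alpha \\ \mathrm{vec}(\bar{X}) \\ \mathrm{vec}(U) \end{bmatrix} \\
\text{s.t.} \ & \Omega \begin{bmatrix} \alpha \\ \mathrm{vec}(\bar{X}) \\ \mathrm{vec}(U) \end{bmatrix} = \theta,
\end{aligned} \tag{14}$$

where $\alpha = [\alpha_2, \ldots, \alpha_{h+1}]^T$. Problem (14) is a quadratic program, which has been discussed in [7].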
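An equality-constrained quadratic program of the form (14) can be solved through its Karush-Kuhn-Tucker (KKT) linear system. The sketch below is a generic solver under that approach, not the method of [7]; the function name `solve_equality_qp` and the small test problem are illustrative assumptions, with `H` standing in for the block-diagonal weight matrix in (14).

```python
import numpy as np

def solve_equality_qp(H, Omega, theta):
    """Minimize chi^T H chi subject to Omega chi = theta via the KKT system.

    Stationarity gives 2 H chi + Omega^T lam = 0, which is stacked with the constraint.
    """
    n_var, n_con = H.shape[0], Omega.shape[0]
    kkt = np.block([[2.0 * H, Omega.T],
                    [Omega, np.zeros((n_con, n_con))]])
    rhs = np.concatenate([np.zeros(n_var), theta])
    # Least squares is used because H is only positive semidefinite in (14).
    sol, *_ = np.linalg.lstsq(kkt, rhs, rcond=None)
    return sol[:n_var]

# Small illustrative problem: the first variable is unweighted, like alpha in (14).
H = np.diag([0.0, 1.0, 1.0])
Omega = np.array([[1.0, -1.0, 0.0],
                  [0.0, 1.0, 1.0]])
theta = np.array([0.0, 2.0])

chi = solve_equality_qp(H, Omega, theta)
print(chi)
```

The zero block on $\alpha$ in (14) makes the weight matrix singular, which is why the sketch avoids inverting `H` directly.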
3.3 Solution to Problem 2

Note that Problem 2 is a conventional discrete-time linear quadratic regulation problem. In accordance with optimal control theory [18, 19], the linear optimal controller for (1) subject to (7) can be designed as

$$\begin{aligned}
u_k &= \bar{u}^*(k) + U^* v_k \\
&= -K_x^* \bar{x}_k + U^* v_k \\
&= -K_x^* x_k + (U^* + K_x^* X^*) v_k \\
&= -K_x^* M z_k + (U^* + K_x^* X^*) v_k
\end{aligned} \tag{15}$$

where $X^*$ and $U^*$ are obtained from Problem 1, and the optimal control gain $K_x^*$ is designed as

$$K_x^* = (R + B^T P B)^{-1} B^T P A \tag{16}$$

Here $P \in \mathbb{R}^{n \times n}$ is the unique positive definite solution to the algebraic Riccati equation (ARE)

$$C^T Q C - P + A^T P A - A^T P B (R + B^T P B)^{-1} B^T P A = 0 \tag{17}$$
Equation (17) is nonlinear in $P$ and can be challenging to solve for systems of large dimension, even when the system matrices are accurately known. To address this issue, we introduce a model-based policy iteration algorithm (Algorithm 1) that iteratively approximates $P$.

Lemma 2 (see [12]) Let $K_x^{(0)}$ be a control gain matrix such that $A - B K_x^{(0)}$ is Schur. Repeatedly solve Eqs. (18) and (19) for $P^{(j)}$ and $K_x^{(j+1)}$, $j = 0, 1, 2, \ldots$. Then the following statements hold:

(1) For all $j \in \mathbb{Z}_+$, $A - B K_x^{(j)}$ is Schur.
(2) $P \le P^{(j+1)} \le P^{(j)}$.
(3) $\lim_{j \to +\infty} K_x^{(j)} = K_x^*$, $\lim_{j \to +\infty} P^{(j)} = P$.

Algorithm 1: Model-based policy iteration for optimal feedback solution

Find $K_x^{(0)}$ such that $A - B K_x^{(0)}$ is Schur;
Set $j = 0$;
while True do
    Solve (18) and (19) to obtain $P^{(j)} > 0$ and $K_x^{(j+1)}$;

$$(A - B K_x^{(j)})^T P^{(j)} (A - B K_x^{(j)}) - P^{(j)} + C^T Q C + (K_x^{(j)})^T R (K_x^{(j)}) = 0 \tag{18}$$

$$K_x^{(j+1)} = (R + B^T P^{(j)} B)^{-1} B^T P^{(j)} A \tag{19}$$

    $j \leftarrow j + 1$;
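Algorithm 1 alternates a Lyapunov solve (18) with a gain update (19). The following sketch implements it with NumPy, solving (18) by vectorization. The fixed iteration count replaces the open-ended loop, and the example system, weights and initial gain are illustrative assumptions; the zero initial gain is admissible here only because the assumed $A$ is already Schur.

```python
import numpy as np

def solve_lyapunov(A_cl, W):
    """Solve A_cl^T P A_cl - P + W = 0 using vec(A^T P A) = (A^T kron A^T) vec(P)."""
    n = A_cl.shape[0]
    lhs = np.eye(n * n) - np.kron(A_cl.T, A_cl.T)
    P = np.linalg.solve(lhs, W.reshape(-1, order="F")).reshape(n, n, order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def policy_iteration(A, B, C, Q, R, K0, iters=30):
    """Model-based policy iteration: policy evaluation (18), then improvement (19)."""
    K = K0
    for _ in range(iters):
        A_cl = A - B @ K
        P = solve_lyapunov(A_cl, C.T @ Q @ C + K.T @ R @ K)   # Eq. (18)
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)     # Eq. (19)
    return P, K

# Assumed example system (illustrative values only); A is Schur, so K0 = 0 is stabilizing.
A = np.array([[0.0, 1.0], [-0.5, 1.2]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])

P, K = policy_iteration(A, B, C, Q, R, K0=np.zeros((1, 2)))
are_residual = (C.T @ Q @ C - P + A.T @ P @ A
                - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A))
print(np.linalg.norm(are_residual))
```

In practice the loop would stop once successive $P^{(j)}$ differ by less than a chosen tolerance rather than after a fixed number of iterations.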
4 Optimal Output Tracking Design Under Unknown Dynamics Conditions

In this section, we develop a model-free output-feedback ADP algorithm that obtains a solution using only input and output data. First, we use input and output data to generate the matrix $\bar{C} := C M$. By (3) and (8), we have

$$z_k^T \bar{C}_i^T = y_{i,k}, \quad i = 1, 2, \ldots, p \tag{20}$$

where $\bar{C}_i$ is the $i$th row of $\bar{C}$ and $y_{i,k}$ is the $i$th entry of $y_k$. We define

$$\begin{aligned}
\mathbf{z}_{k_0} &= [z_{k_0}, z_{k_0+1}, \ldots, z_{k_0+s_o-1}]^T, \\
\mathbf{y}_{i,k_0} &= [y_{i,k_0}, y_{i,k_0+1}, \ldots, y_{i,k_0+s_o-1}]^T, \quad i = 1, 2, \ldots, p.
\end{aligned} \tag{21}$$

Then $\bar{C}_i$, $i = 1, 2, \ldots, p$, can be computed by

$$\bar{C}_i^T = \left(\mathbf{z}_{k_0}^T \mathbf{z}_{k_0}\right)^{-1} \mathbf{z}_{k_0}^T \mathbf{y}_{i,k_0}, \quad i = 1, 2, \ldots, p, \tag{22}$$

if

$$\mathrm{rank}(\mathbf{z}_{k_0}) = r, \tag{23}$$

where $s_o \in \mathbb{Z}_+$ is the number of data samples used to solve for $\bar{C}_i$, and the samples are taken from step $k_0$ to step $k_0 + s_o - 1$. The validity of these statements can be established through straightforward manipulations.

Define new state variables $\bar{x}_{i,k}$, $i = 0, \ldots, h + 1$, as

$$\bar{x}_{i,k} = x_k - M X_i v_k, \quad i = 0, \ldots, h + 1 \tag{24}$$

where $X_0 = 0$ and $X_i$, $i = 1, \ldots, h + 1$, are defined in Sect. 3.2. Then $\bar{x}_{i,k}$, $i = 0, \ldots, h + 1$, satisfy

$$\begin{aligned}
\bar{x}_{i,k+1} &= x_{k+1} - M X_i v_{k+1} \\
&= A x_k + B u_k + (-M X_i E) v_k \\
&= A \bar{x}_{i,k} + B u_k - S(X_i) v_k.
\end{aligned} \tag{25}$$

Subtracting and adding $B K_x^{(j)} \bar{x}_{i,k}$ on the right-hand side of (25), we derive

$$\bar{x}_{i,k+1} = A_j \bar{x}_{i,k} + B (K_x^{(j)} \bar{x}_{i,k} + u_k) - S(X_i) v_k \tag{26}$$

where $A_j = A - B K_x^{(j)}$. By (26) and (18), we obtain

$$\begin{aligned}
& \bar{x}_{i,k+1}^T P^{(j)} \bar{x}_{i,k+1} - \bar{x}_{i,k}^T P^{(j)} \bar{x}_{i,k} \\
&= \left[\bar{x}_{i,k}^T A_j^T + (K_x^{(j)} \bar{x}_{i,k} + u_k)^T B^T - v_k^T S^T(X_i)\right] P^{(j)} \bar{x}_{i,k+1} - \bar{x}_{i,k}^T P^{(j)} \bar{x}_{i,k} \\
&= -\bar{x}_{i,k}^T \left(C^T Q C + (K_x^{(j)})^T R (K_x^{(j)})\right) \bar{x}_{i,k} + \bar{x}_{i,k}^T A_j^T P^{(j)} B (K_x^{(j)} \bar{x}_{i,k} + u_k) \\
&\quad - \bar{x}_{i,k}^T A_j^T P^{(j)} S(X_i) v_k + (K_x^{(j)} \bar{x}_{i,k} + u_k)^T B^T P^{(j)} \bar{x}_{i,k+1} \\
&\quad - v_k^T S^T(X_i) P^{(j)} \bar{x}_{i,k+1}.
\end{aligned} \tag{27}$$

Substituting (25) into (27), we have
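$$\begin{aligned}
& \bar{x}_{i,k+1}^T P^{(j)} \bar{x}_{i,k+1} - \bar{x}_{i,k}^T P^{(j)} \bar{x}_{i,k} \\
&= -\bar{x}_{i,k}^T \left(C^T Q C + (K_x^{(j)})^T R (K_x^{(j)})\right) \bar{x}_{i,k} \\
&\quad - 2 u_k^T B^T P^{(j)} S(X_i) v_k \\
&\quad + 2 \bar{x}_{i,k}^T A^T P^{(j)} B (K_x^{(j)} \bar{x}_{i,k} + u_k) \\
&\quad + v_k^T S^T(X_i) P^{(j)} S(X_i) v_k \\
&\quad - 2 \bar{x}_{i,k}^T A^T P^{(j)} S(X_i) v_k \\
&\quad + (-K_x^{(j)} \bar{x}_{i,k} + u_k)^T B^T P^{(j)} B (K_x^{(j)} \bar{x}_{i,k} + u_k).
\end{aligned} \tag{28}$$

Define $\bar{K}_x^{(j)} = K_x^{(j)} M$ and $\bar{P}^{(j)} = M^T P^{(j)} M$. From (8) and (28) with $i = 0$, we obtain

$$\begin{aligned}
z_{k+1}^T \bar{P}^{(j)} z_{k+1} - z_k^T \bar{P}^{(j)} z_k &= -y_k^T Q y_k - z_k^T (\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)} z_k \\
&\quad + 2 z_k^T M^T A^T P^{(j)} B (\bar{K}_x^{(j)} z_k + u_k) \\
&\quad + (-\bar{K}_x^{(j)} z_k + u_k)^T B^T P^{(j)} B (\bar{K}_x^{(j)} z_k + u_k).
\end{aligned} \tag{29}$$

Using the Kronecker product, the terms of (29) can be written as

$$\begin{aligned}
z_{k+1}^T \bar{P}^{(j)} z_{k+1} &= \mathrm{vecv}(z_{k+1})^T \mathrm{vecs}(\bar{P}^{(j)}), \\
z_k^T \bar{P}^{(j)} z_k &= \mathrm{vecv}(z_k)^T \mathrm{vecs}(\bar{P}^{(j)}), \\
y_k^T Q y_k &= (y_k^T \otimes y_k^T) \mathrm{vec}(Q), \\
z_k^T (\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)} z_k &= (z_k^T \otimes z_k^T) \mathrm{vec}\left((\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)}\right), \\
z_k^T M^T A^T P^{(j)} B (\bar{K}_x^{(j)} z_k + u_k) &= \left[(z_k^T \otimes z_k^T)\left(I_r \otimes (\bar{K}_x^{(j)})^T\right) + z_k^T \otimes u_k^T\right] \mathrm{vec}(B^T P^{(j)} A M), \\
(-\bar{K}_x^{(j)} z_k + u_k)^T B^T P^{(j)} B (\bar{K}_x^{(j)} z_k + u_k) &= \left(\mathrm{vecv}(u_k)^T - \mathrm{vecv}(\bar{K}_x^{(j)} z_k)^T\right) \mathrm{vecs}(B^T P^{(j)} B).
\end{aligned} \tag{30}$$

Let $s_o \in \mathbb{Z}_+$ denote the number of data samples, taken from step $k_0$ to step $k_0 + s_o - 1$. Then, according to (29) and (30), we can formulate the following linear equation:

$$\Psi_{0,k_0}^{(j)} \Xi_0^{(j)} = \Phi_{0,k_0}^{(j)} \tag{31}$$

where

$$\begin{aligned}
\Xi_0^{(j)} &= \begin{bmatrix} \mathrm{vecs}(\bar{P}^{(j)}) \\ \mathrm{vec}(B^T P^{(j)} A M) \\ \mathrm{vecs}(B^T P^{(j)} B) \end{bmatrix}, \\
\Phi_{0,k_0}^{(j)} &= \Gamma_{yy,k_0} \mathrm{vec}(Q) + \Gamma_{zz,k_0} \mathrm{vec}\left((\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)}\right), \\
\Psi_{0,k_0}^{(j)} &= \left[-\bar{\delta}_{zz,k_0}, \ 2\Gamma_{zu,k_0} + 2\Gamma_{zz,k_0}\left(I_r \otimes (\bar{K}_x^{(j)})^T\right), \ \delta_{uu,k_0} - \delta_{K^{(j)} K^{(j)},k_0}\right],
\end{aligned}$$

with

$$\begin{aligned}
\Gamma_{yy,k_0} &= [y_{k_0} \otimes y_{k_0}, y_{k_0+1} \otimes y_{k_0+1}, \ldots, y_{k_0+s_o-1} \otimes y_{k_0+s_o-1}]^T, \\
\Gamma_{zz,k_0} &= [z_{k_0} \otimes z_{k_0}, z_{k_0+1} \otimes z_{k_0+1}, \ldots, z_{k_0+s_o-1} \otimes z_{k_0+s_o-1}]^T, \\
\Gamma_{zu,k_0} &= [z_{k_0} \otimes u_{k_0}, z_{k_0+1} \otimes u_{k_0+1}, \ldots, z_{k_0+s_o-1} \otimes u_{k_0+s_o-1}]^T, \\
\bar{\delta}_{zz,k_0} &= [\mathrm{vecv}(z_{k_0+1}) - \mathrm{vecv}(z_{k_0}), \mathrm{vecv}(z_{k_0+2}) - \mathrm{vecv}(z_{k_0+1}), \ldots, \mathrm{vecv}(z_{k_0+s_o}) - \mathrm{vecv}(z_{k_0+s_o-1})]^T, \\
\delta_{uu,k_0} &= [\mathrm{vecv}(u_{k_0}), \mathrm{vecv}(u_{k_0+1}), \ldots, \mathrm{vecv}(u_{k_0+s_o-1})]^T, \\
\delta_{K^{(j)} K^{(j)},k_0} &= [\mathrm{vecv}(\bar{K}_x^{(j)} z_{k_0}), \mathrm{vecv}(\bar{K}_x^{(j)} z_{k_0+1}), \ldots, \mathrm{vecv}(\bar{K}_x^{(j)} z_{k_0+s_o-1})]^T.
\end{aligned}$$

Equation (31) has the unique solution $\Xi_0^{(j)}$, which can be obtained by least-squares estimation when the rank condition (33) holds (see Lemma 3):

$$\Xi_0^{(j)} = \left((\Psi_{0,k_0}^{(j)})^T \Psi_{0,k_0}^{(j)}\right)^{-1} (\Psi_{0,k_0}^{(j)})^T \Phi_{0,k_0}^{(j)}. \tag{32}$$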
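The Kronecker-product rewriting in (30) is what turns the quadratic relation (29) into an equation that is linear in the unknown matrices. The two less obvious identities can be checked numerically, as in the sketch below. The random matrices `G_cross` and `G_sym` are stand-ins for $B^T P^{(j)} A M$ and $B^T P^{(j)} B$, and all dimensions and values are illustrative assumptions.

```python
import numpy as np

def vecs(b):
    """Upper-triangular entries of symmetric b, row by row, off-diagonals doubled."""
    m = b.shape[0]
    return np.array([b[i, j] if i == j else 2.0 * b[i, j]
                     for i in range(m) for j in range(i, m)])

def vecv(x):
    """Products x_i * x_j for i <= j, in the same ordering as vecs."""
    n = x.size
    return np.array([x[i] * x[j] for i in range(n) for j in range(i, n)])

rng = np.random.default_rng(1)
r, m = 4, 2
z = rng.standard_normal(r)
u = rng.standard_normal(m)
K_bar = rng.standard_normal((m, r))
G_cross = rng.standard_normal((m, r))   # stand-in for B^T P A M
G_sym = rng.standard_normal((m, m))
G_sym = G_sym + G_sym.T                 # stand-in for the symmetric B^T P B

# Cross term: z^T (B^T P A M)^T (K_bar z + u), written with column-stacked vec
lhs_cross = (K_bar @ z + u) @ G_cross @ z
row_cross = np.kron(z, z) @ np.kron(np.eye(r), K_bar.T) + np.kron(z, u)
rhs_cross = row_cross @ G_cross.reshape(-1, order="F")

# Quadratic term: (u - K_bar z)^T G (u + K_bar z) with symmetric G
lhs_quad = (u - K_bar @ z) @ G_sym @ (u + K_bar @ z)
rhs_quad = (vecv(u) - vecv(K_bar @ z)) @ vecs(G_sym)

print(np.isclose(lhs_cross, rhs_cross), np.isclose(lhs_quad, rhs_quad))
```

Stacking one such row per time step produces the data matrices of (31), which are then passed to a least-squares solver as in (32).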
Lemma 3 Suppose that there exists $w_o \in \mathbb{Z}_+$ such that, for all $s_o > w_o$,

$$\mathrm{rank}([\Gamma_{zu,k_0}, \delta_{uu,k_0}, \delta_{zz,k_0}]) = rm + \frac{(1 + r) r}{2} + \frac{(1 + m) m}{2} \tag{33}$$

$$\mathrm{rank}(\Gamma_{x,k_0}) = n, \tag{34}$$

where $\delta_{zz,k_0} = [\mathrm{vecv}(z_{k_0}), \mathrm{vecv}(z_{k_0+1}), \ldots, \mathrm{vecv}(z_{k_0+s_o-1})]^T$ and $\Gamma_{x,k_0} = [x_{k_0}, x_{k_0+1}, \ldots, x_{k_0+s_o-1}]^T$. Then $\Psi_{0,k_0}^{(j)}$ has full column rank for all $j \in \mathbb{Z}_+$ provided that $s_o > w_o$.

Proof It can be deduced from Eq. (34) that the components of $x_k$ are mutually independent for $k \ge k_0$. The relation $\mathrm{rank}(\bar{\delta}_{zz,k_0}) = \frac{(1 + r) r}{2}$ holds if and only if $\mathrm{rank}(\bar{\delta}_{xx}(k_0)) = \frac{(1 + n) n}{2}$, where

$$\bar{\delta}_{xx}(k_0) = [\mathrm{vecv}(x_{k_0+1}) - \mathrm{vecv}(x_{k_0}), \mathrm{vecv}(x_{k_0+2}) - \mathrm{vecv}(x_{k_0+1}), \ldots, \mathrm{vecv}(x_{k_0+s_o}) - \mathrm{vecv}(x_{k_0+s_o-1})]^T.$$

Therefore, the lemma is equivalent to the assertion that

$$\bar{\Psi}_{0,k_0}^{(j)} = \left[-\bar{\delta}_{xx}(k_0), \ 2\Gamma_{zu,k_0} + 2\Gamma_{zz,k_0}\left(I_r \otimes (\bar{K}_x^{(j)})^T\right), \ \delta_{uu,k_0} - \delta_{K^{(j)} K^{(j)},k_0}\right] \tag{35}$$
has full column rank. The remainder of the proof follows the same approach as the corresponding part of Lemma 3 in [9], so the details are omitted.

Using the components of $\Xi_0^{(j)}$, $\bar{K}_x^{(j+1)}$ can be calculated by

$$\begin{aligned}
\bar{K}_x^{(j+1)} &= (R + B^T P^{(j)} B)^{-1} B^T P^{(j)} A M \\
&= (R + \Xi_{0,3}^{(j)})^{-1} \Xi_{0,2}^{(j)}
\end{aligned} \tag{36}$$

where $\bar{K}_x^{(j+1)} = K_x^{(j+1)} M$, $\Xi_{0,2}^{(j)} = B^T P^{(j)} A M$ and $\Xi_{0,3}^{(j)} = B^T P^{(j)} B$.

In the following, we introduce a novel data-driven approach for solving Problem 1 without prior knowledge of the system model. Note that Eq. (28) can be represented as

$$\begin{aligned}
& \bar{z}_{i,k+1}^T \bar{P}^{(j)} \bar{z}_{i,k+1} - \bar{z}_{i,k}^T \bar{P}^{(j)} \bar{z}_{i,k} \\
&= -(y_k + \bar{X}_i v_k)^T Q (y_k + \bar{X}_i v_k) - \bar{z}_{i,k}^T (\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)} \bar{z}_{i,k} \\
&\quad - 2 u_k^T B^T P^{(j)} S(X_i) v_k \\
&\quad + 2 \bar{z}_{i,k}^T M^T A^T P^{(j)} B (\bar{K}_x^{(j)} \bar{z}_{i,k} + u_k) \\
&\quad + v_k^T S^T(X_i) P^{(j)} S(X_i) v_k \\
&\quad - 2 \bar{z}_{i,k}^T M^T A^T P^{(j)} S(X_i) v_k \\
&\quad + (-\bar{K}_x^{(j)} \bar{z}_{i,k} + u_k)^T B^T P^{(j)} B (\bar{K}_x^{(j)} \bar{z}_{i,k} + u_k),
\end{aligned} \tag{37}$$

where $\bar{z}_{i,k} = z_k - X_i v_k$, and $\bar{X}_i = F$ when $i = 1$ and $\bar{X}_i = 0$ otherwise. Substituting (8) into (27) yields

$$\begin{aligned}
& \bar{z}_{i,k+1}^T \bar{P}^{(j)} \bar{z}_{i,k+1} - \bar{z}_{i,k}^T \bar{P}^{(j)} \bar{z}_{i,k} \\
&= -(y_k + \bar{X}_i v_k)^T Q (y_k + \bar{X}_i v_k) - \bar{z}_{i,k}^T (\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)} \bar{z}_{i,k} \\
&\quad - \bar{z}_{i,k}^T (\bar{K}_x^{(j)})^T B^T P^{(j)} B \bar{K}_x^{(j)} \bar{z}_{i,k} - \bar{z}_{i,k}^T M^T A^T P^{(j)} S(X_i) v_k \\
&\quad + u_k^T B^T P^{(j)} M \bar{z}_{i,k+1} - v_k^T S^T(X_i) P^{(j)} M \bar{z}_{i,k+1} \\
&\quad + \bar{z}_{i,k}^T M^T A^T P^{(j)} B u_k + 2 \bar{z}_{i,k}^T (\bar{K}_x^{(j)})^T B^T P^{(j)} A M \bar{z}_{i,k}.
\end{aligned} \tag{38}$$

By the Kronecker product representation, the terms of (37) and (38) are rewritten as

$$\begin{aligned}
\bar{z}_{i,k+1}^T \bar{P}^{(j)} \bar{z}_{i,k+1} &= \mathrm{vecv}(\bar{z}_{i,k+1})^T \mathrm{vecs}(\bar{P}^{(j)}), \\
\bar{z}_{i,k}^T \bar{P}^{(j)} \bar{z}_{i,k} &= \mathrm{vecv}(\bar{z}_{i,k})^T \mathrm{vecs}(\bar{P}^{(j)}), \\
(y_k + \bar{X}_i v_k)^T Q (y_k + \bar{X}_i v_k) &= \left[(y_k + \bar{X}_i v_k)^T \otimes (y_k + \bar{X}_i v_k)^T\right] \mathrm{vec}(Q), \\
\bar{z}_{i,k}^T (\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)} \bar{z}_{i,k} &= (\bar{z}_{i,k}^T \otimes \bar{z}_{i,k}^T) \mathrm{vec}\left((\bar{K}_x^{(j)})^T R \bar{K}_x^{(j)}\right), \\
\bar{z}_{i,k}^T (\bar{K}_x^{(j)})^T B^T P^{(j)} B \bar{K}_x^{(j)} \bar{z}_{i,k} &= (\bar{z}_{i,k}^T \otimes \bar{z}_{i,k}^T)\left[(\bar{K}_x^{(j)})^T \otimes (\bar{K}_x^{(j)})^T\right] \mathrm{vec}(B^T P^{(j)} B), \\
\bar{z}_{i,k}^T M^T A^T P^{(j)} S(X_i) v_k &= (v_k^T \otimes \bar{z}_{i,k}^T) \mathrm{vec}(M^T A^T P^{(j)} S(X_i)),
\end{aligned}$$
and u kT B T P ( j) M z¯ i,k+1 T T T = (¯z i,k ⊗ z¯ i,k ) Ir ⊗ ( K¯ x( j) )T + z¯ i,k ⊗ u kT vec(B T P ( j) AH ) vkT S T (X i )P ( j) S(X i )vk = (vecv(u k )T − vecv( K¯ x( j) z¯ i,k )T )vecs(B T P ( j) B). Then, (37) and (38) become linear equations ˆ i( j),k0 , ˆ i,k0 ˆ i( j) =
(39)
¯ i( j) ¯ i,k0
(40)
=
¯ i( j),k0 ,
where ˆ i (k0 ) = −2vu,k0 , δvv,k0 , −2v z¯i ,k0 ¯ i (k0 ) = ¯ u z¯ ,k0 , u z¯ ,k0 , −v z¯ ,k0 , i, f i,l i ⎤ ⎡ T ( j) vecs(B P B) ¯ i( j) = ⎣ vec(H1T P ( j) B) ⎦ , vec(H T P ( j) S(X i )) ⎤ ⎡ vec(B T P ( j) S(X i )) ( j) ˆ i = ⎣ vecs(S T (X i )P ( j) S(X i )) ⎦ , vec(H T A T P ( j) S(X i )) ( j) ¯ i,k
= δ¯z¯i z¯i ,k0 vecs( P¯ ( j) ) + y¯i y¯i ,k0 vec(Q) + z¯i z¯i ,k0 vec(( K¯ x( j) )T R K¯ x( j) ) 0 + z¯i z¯i ,k0 ( K¯ x( j) )T ⊗ ( K¯ x( j) )T vec(B T P ( j) B) − z¯i u,k0 + 2z¯i z¯i ,k0 ( K¯ x( j) )T ⊗ Ir vec(B T P ( j) AH )
+ v z¯i ,k0 vec(H T A T P ( j) S(X i )), ( j) ˆ i,k
= δ¯z¯i z¯i ,k0 vecs( P¯ ( j) ) + y¯i y¯i ,k0 vec(Q) + z¯i z¯i ,k0 vec(( K¯ x( j) )T R K¯ x( j) ) 0 − 2 z¯i z¯i ,k0 Ir ⊗ ( K¯ x( j) )T + z¯i u,k0 vec(B T P ( j) AM)
− (δuu,k0 − δ K ( j) K ( j) ,k0 )vecs(B T P ( j) B) i
i
v z¯i ,k0 = [vk0 ⊗ z¯ i,k0 , vk0 +1 ⊗ z¯ i,k0 +1 , . . . , vk0 +so −1 ⊗ z¯ i,k0 +so −1 ]T , u z¯i,l ,k0 = [u k0 ⊗ z¯ i,k0 +1,l , u k0 +1 ⊗ z¯ i,k0 +2,l , . . . , u k0 +so −1 ⊗ z¯ i,k0 +so ,l ]T , ¯ u z¯i, f ,k0 = [veca(u k0 , z¯ i,k0 +1, f ), veca(u k0 +1 , z¯ i,k0 +2, f ) , . . . , vecau k0 +so −1 , z¯ i,k0 +so , f )]T , v z¯i ,k0 = [vk0 ⊗ z¯ i,k0 +1 , vk0 +1 ⊗ z¯ i,k0 +2 , . . . , vk0 +s0 −1 ⊗ z¯ i,k0 +so ]T , δ¯z¯i z¯i ,k0 = [vecv(¯z i,k0 +1 ) − vecv(¯z i,k0 ), vecv(¯z i,k0 +2 ) − vecv(¯z i,k0 +1 ) , . . . , vecv(¯z i,k0 +so ) − vecv(¯z i,k0 +so −1 )]T ,
with y¯i,k0 = yk + X¯ i vk y¯i y¯i ,k0 = [ y¯i,k0 ⊗ y¯i,k0 , y¯i,k0 +1 ⊗ y¯i,k0 +1 , . . . , y¯i,k0 +so −1 ⊗ y¯i,k0 +so −1 ]T , z¯i z¯i ,k0 = [¯z i,k0 ⊗ z¯ i,k0 , z¯ i,k0 +1 ⊗ z¯ i,k0 +1 , . . . , z¯ i,k0 +so −1 ⊗ z¯ i,k0 +so −1 ]T , z¯i u,k0 = [¯z i,k0 ⊗ u k0 , z¯ i,k0 +1 ⊗ u k0 +1 , . . . , z¯ i,k0 +so −1 ⊗ u k0 +so −1 ]T , T δvv,k0 = vecv(vk0 ), vecv(vk0 +1 ), . . . , vecv(vk0 +so −1 , vu,k0 = vk0 ⊗ u k0 , vk0 +1 ⊗ u k0 +1 T , . . . , vk0 +so −1 ⊗ u k0 +s0 −1 , T δuu,k0 = vecv(u k0 ), vecv(u k0 +1 ), . . . , vecv(u k0 +so −1 , T δ K ( j) K ( j) ,k0 = vecv( K¯ x( j) z¯ i,k0 ), vecv( K¯ x( j) z¯ i,k0 +1 ), . . . , vecv( K¯ x( j) z¯ i,k0 +so −1 ) . i
i
ˆ i( j) and ¯ i( j) can be uniquely obtained by Like the equality (32), ˆ i( j) = ¯ i( j) =
ˆ i,k0
T
ˆ i,k0
−1 T ( j) ˆ i,k ˆ i,k0 0
−1 ( j) ¯ i,k0 ¯ i,k ¯ i,k0 T ¯ i,k0 T 0
(41) (42)
under the rank condition (43) and (44). Lemma 4 On basic of Lemma 3, suppose there exists a w1 ∈ Z + such that for all so > w1 and i = 0, 1, . . . , h + 1, rank([vu,k0 , δvv,k0 , v z¯i ,k0 ]) = qm + q 2 + qr
(43)
rank([u z¯i ,k0 , v z¯i ,k0 ]) = qr + mr.
(44)
¯ i( j) and i( j) have the only solution. Then, Define S¯ ( j) (X i ) = M T P ( j) S(X i ), i = 1, 2, . . . , h + 1, B¯ ( j) = M T P ( j) B. With the solution of (42), vec(S¯ ( j) (X i )), i = 1, . . . , h + 1 and B¯ ( j) can be computed as
¯ i,3 vec(S¯ ( j) (X i )) = ⎡ ⎤ −1 ¯ ( j) vecs a ⎦ B¯ ( j) = ⎣ ¯ (bj) vec−1
(45)
( j) ¯ i,3 ¯ a( j) = vecs(B T P ( j) B) and where = vec(H T P ( j) S(X i )), i = 1, . . . , h + 1, ( j) ¯ b = vec(M1T P ( j) B). Because P ( j) is invertible and M has full column rank under the rank condition (34), (13) can be converted into
¯ ( j) χ = θ¯ ( j)
(46)
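where the coefficient matrix on the left-hand side of (46), the unknown vector $\chi$ and the right-hand side $\bar{\theta}^{(j)}$ are
\[
\begin{bmatrix}
\mathrm{vec}(\bar{S}^{(j)}(X_2)) & \cdots & \mathrm{vec}(\bar{S}^{(j)}(X_{h+1})) & 0_{rq\times rq} & -I_q\otimes\bar{B}^{(j)}\\
\mathrm{vec}(X_2) & \cdots & \mathrm{vec}(X_{h+1}) & -I_{rq} & 0_{rq\times mq}
\end{bmatrix},
\]
\[
\chi = \left[\alpha_2,\ldots,\alpha_{h+1},\ \mathrm{vec}(\bar{X})^{T},\ \mathrm{vec}(U)^{T}\right]^{T},
\qquad
\bar{\theta}^{(j)} = \begin{bmatrix}\mathrm{vec}(-\bar{S}^{(j)}(X_1))\\ -\mathrm{vec}(X_1)\end{bmatrix}.
\]
In fact, Eqs. (13) and (46) have the same solution. Correspondingly, an equivalent reformulation of Problem 1 can be derived:
\[
(\alpha^{*},\bar{X}^{*},U^{*}) = \arg\min_{(\alpha,\bar{X},U)}
\begin{bmatrix}\alpha\\ \mathrm{vec}(\bar{X})\\ \mathrm{vec}(U)\end{bmatrix}^{T}
\begin{bmatrix}
0_{h\times h} & 0_{h\times rq} & 0_{h\times qm}\\
0_{qr\times h} & I_q\otimes\tilde{Q} & 0_{qr\times qm}\\
0_{qm\times h} & 0_{qm\times rq} & I_q\otimes\bar{R}
\end{bmatrix}
\begin{bmatrix}\alpha\\ \mathrm{vec}(\bar{X})\\ \mathrm{vec}(U)\end{bmatrix}
\quad\text{s.t. } \chi = \begin{bmatrix}\alpha\\ \mathrm{vec}(\bar{X})\\ \mathrm{vec}(U)\end{bmatrix}\ \text{satisfies (46)},
\tag{47}
\]
where $\alpha = [\alpha_2,\ldots,\alpha_{h+1}]^{T}$ and $\tilde{Q} = M^{T}\bar{Q}M$. Note that (47) is a model-free quadratic programming problem that can be solved directly. The data-driven PI-based algorithm under output feedback is given in Algorithm 2.

Theorem 1 If the rank conditions (33), (34), (43) and (44) are satisfied, then given any initial stabilizing gain $\bar{K}_x^{(0)}$, the sequences $\{\bar{P}^{(j)}\}_{0}^{+\infty}$ and $\{\bar{K}_x^{(j)}\}_{0}^{+\infty}$ converge to $\bar{P}^{*}$ and $\bar{K}_x^{*}$, respectively, where $\bar{P}^{*} = M^{T}P^{*}M$ and $\bar{K}_x^{*} = K_x^{*}M$. Moreover, with $\bar{K}_x^{(p)}$ and $K_v^{(p)}$ obtained from Algorithm 2, imposing the control input $u_k = -\bar{K}_x^{(p)} z_k + K_v^{(p)} v_k$ on the discrete system (1), we conclude the following:
(1) The closed-loop system is stable, i.e., $(A - BK_x^{(p)})$ is Schur;
(2) The tracking error $e_k$ eventually converges to zero, i.e., $\lim_{k\to+\infty} e_k = 0$.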
Algorithm 2: Data-driven PI-based optimal output tracking algorithm by output feedback
  Given $\tilde{Q} > 0$, $\bar{R} > 0$, $Q > 0$, $k_0 \in \mathbb{N}$ and $R > 0$;
  Given $\hat{K}_x^{(0)} = K_x^{(0)} M$ such that $(A - BK_x^{(0)})$ is Schur;
  Set $j \leftarrow 0$;
  Apply $u_k = -\hat{K}_x^{(0)} \bar{z}_k + \iota(k)$ with exploration noise $\iota(k)$;
  Find the maximal linearly independent subset of the entries of $z_k$ for $k \ge k_0$ to form the vector $z_k$ from the collected data;
  Calculate the matrix $M$ from the collected data;
  Set $\bar{K}_x^{(0)} \leftarrow \hat{K}_x^{(0)} M$;
  Solve (22) to obtain $\bar{C}$;
  Solve the linear equation $\bar{C}\bar{X} = F$ to obtain $X_1, X_2, \ldots, X_{h+1}$, such that all the vectors $\mathrm{vec}(X_i)$, $i = 2, 3, \ldots, h+1$, form a basis for $\ker(I_q \otimes \bar{C})$, and $X_1$ is a special solution to the equation $\bar{C}\bar{X} = F$;
  Select a prescribed small positive constant as the stopping threshold;
  Solve $\bar{P}^{(0)}$, $\bar{K}_x^{(1)}$, $B^{T}P^{(0)}AM$ and $B^{T}P^{(0)}B$ from (32) and (36);
  repeat
    $j \leftarrow j + 1$;
    Solve $\bar{P}^{(j)}$, $\bar{K}_x^{(j+1)}$, $B^{T}P^{(j)}AM$ and $B^{T}P^{(j)}B$ from (32) and (36);
  until $|\bar{P}^{(j)} - \bar{P}^{(j-1)}|$ does not exceed the prescribed threshold;
  for $i \leftarrow 1$ to $h+1$ do
    Solve $M^{T}P^{(j)}S(X_i)$ from (41);
    Solve $M^{T}P^{(j)}B$ and $M^{T}P^{(j)}S(X_i)$ from (42);
  Obtain the coefficient matrix and $\bar{\theta}^{(j)}$ of (46) by (45);
  Find $(\bar{X}^{*}, U^{*})$ by solving (47);
  Set $p \leftarrow j$;
  Let $K_v^{(p)} \leftarrow U^{*} + \bar{K}_x^{(p)}\bar{X}^{*}$, thus $u_k = -\bar{K}_x^{(p)} z_k + K_v^{(p)} v_k$.
( j)
Proof Let K x be a stabilizing gain. Suppose that P ( j) is a solution of (18); thus, ( j+1) ( j) is obtained from (19). According to (29), we have P¯ ( j) = M T P ( j) M, K¯ x = Kx ( j) ( j+1) ( j) ( = K x M satisfy (32) and (36). Conversely, if one has P˜ j) = K M and K¯ x x T P˜ ( j) ∈ Rr ×r and K˜ ( j+1) ∈ Rm×r , such that ⎤ −1 vecs( P˜ ( j) ) T T ( j) ( j) ( j) ( j) ( j) ⎣ vec( Pˆ ) ⎦ = 0,k
0,k0 0,k 0,k 1 0 0 0 ( j) vecs( Pˆ2 )
(48)
( j) ( j) K˜ ( j+1) = (R + Pˆ2 )−1 Pˆ1
(49)
⎡
and
hold, where $\hat{P}_1^{(j)}$ and $\hat{P}_2^{(j)}$ are matrices with appropriate dimensions. By Lemma 3, it is deduced that $\bar{P}^{(j)} = \tilde{P}^{(j)}$ and $\bar{K}_x^{(j+1)} = \tilde{K}^{(j+1)}$, since equation (31) has a unique solution. Then, by Algorithm 1, we have $\lim_{j\to+\infty}\bar{P}^{(j)} = M^{T}P^{*}M$ and $\lim_{j\to+\infty}\bar{K}_x^{(j)} = K_x^{*}M$.

The assertion of part (1) follows immediately from Lemma 2 and Algorithm 1. To prove part (2), we have
\[
\bar{x}_{k+1} = (A - BK_x^{(p)})\bar{x}_k, \tag{50}
\]
\[
e_k = C\bar{x}_k, \tag{51}
\]
since $\bar{K}_x^{(p)} = K_x^{(p)}M$ and $X^{*} = M\bar{X}^{*}$. Because $(A - BK_x^{(p)})$ is Schur, $\lim_{k\to+\infty}\bar{x}_k = 0$ and hence $\lim_{k\to+\infty}e_k = 0$.
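The policy iteration that the data-driven updates emulate can be illustrated on a scalar system. The following is only a minimal sketch under stated assumptions: the values of `a`, `b`, `q`, `r` and the initial gain `k0` are hypothetical, the state weight `q` stands in for the output weight mapped to the state, and the model is used directly instead of measured data. Each pass evaluates the current gain through the scalar Lyapunov equation and then improves it with the scalar form of the gain update $K = (R + B^{T}PB)^{-1}B^{T}PA$.

```python
def policy_iteration(a, b, q, r, k0, tol=1e-12, max_iter=100):
    """Scalar policy iteration for x_{k+1} = a*x_k + b*u_k with u_k = -k*x_k.

    All numeric arguments are hypothetical illustration values,
    not data taken from the paper.
    """
    k = k0
    if abs(a - b * k) >= 1.0:
        raise ValueError("the initial gain must be stabilizing")
    p_prev = None
    for _ in range(max_iter):
        a_cl = a - b * k
        # Policy evaluation: p = a_cl^2 * p + q + r * k^2
        p = (q + r * k * k) / (1.0 - a_cl * a_cl)
        # Policy improvement: k = (r + b*p*b)^{-1} * b*p*a
        k = b * p * a / (r + b * b * p)
        if p_prev is not None and abs(p - p_prev) <= tol:
            break
        p_prev = p
    return p, k


# Hypothetical unstable open-loop plant with a stabilizing initial gain.
p_star, k_star = policy_iteration(a=1.1, b=1.0, q=1.0, r=1.0, k0=0.5)
print(p_star, k_star)
```

In this sketch the returned `p_star` can be checked against the scalar discrete-time Riccati equation, and the closed-loop pole `a - b*k_star` lies inside the unit circle, mirroring part (1) of Theorem 1.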
5 Simulation Results
This section presents a simulation that validates the proposed algorithm. We consider a single-phase grid-connected inverter with an LCL filter [1]. Its dynamics are described by
\[
\begin{bmatrix}\dfrac{di_l(t)}{dt}\\[4pt] \dfrac{dv_c(t)}{dt}\\[4pt] \dfrac{di_g(t)}{dt}\end{bmatrix}
=
\begin{bmatrix}
-\dfrac{R_1}{L_1} & -\dfrac{1}{L_1} & 0\\[4pt]
\dfrac{1}{C} & 0 & -\dfrac{1}{C}\\[4pt]
0 & \dfrac{1}{L_2} & -\dfrac{R_2}{L_2}
\end{bmatrix}
\begin{bmatrix} i_l(t)\\ v_c(t)\\ i_g(t)\end{bmatrix}
+
\begin{bmatrix}\dfrac{1}{L_1}\\ 0\\ 0\end{bmatrix} v_i(t),
\tag{52}
\]
where $i_g(t)$, $v_i(t)$, $i_l(t)$ and $v_c(t)$ are the grid current, inverter voltage, inverter current and capacitor voltage, respectively; $v_i(t)$ and $i_g(t)$ are the control input and output of the system; $L_1$ is the filter inductor; $C$ is the capacitor; $R_1$ is the filter resistor; and $L_2$ is the transformer inductor. Here, $R_1 = 0.02\ \Omega$, $L_1 = 725\ \mu\mathrm{H}$, $C = 25\ \mu\mathrm{F}$, $R_2 = 0.02\ \Omega$, and $L_2 = 2$ mH. After discretizing (52) with a sampling frequency of 4.8 kHz and setting the desired grid current $i_{gd}$ as
\[
v(k+1) = \begin{bmatrix} 0.954 & -0.3\\ 0.3 & 0.954 \end{bmatrix} v(k), \qquad
i_{gd}(k) = \begin{bmatrix} -7 & -6 \end{bmatrix} v(k),
\tag{53}
\]
Algorithm 2 can be executed.

The simulation results of Algorithm 2 are presented in Figs. 1, 2, 3 and 4. Figure 1 shows the trajectory of the grid current $i_g$. From the figure, we can see that $i_g$ tracks the desired grid current $i_{gd}$ asymptotically. Figure 2 shows the evolution of the control input $v_i$. Figures 3 and 4 show the convergence of $\bar{K}^{(j)}$ and $\bar{P}^{(j)}$, respectively.
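The reference generator (53) can be simulated in a few lines. The sketch below is illustrative only: the initial exosystem state `v0` and the number of steps are assumptions, since the text does not state them. It also computes the modulus of the eigenvalues of the exosystem matrix, which for a matrix of the form [[c, -s], [s, c]] equals sqrt(c^2 + s^2).

```python
import math

# Exosystem matrix and output row taken from (53).
E = [[0.954, -0.3],
     [0.3, 0.954]]
F = [-7.0, -6.0]


def simulate_reference(v0, steps):
    """Return the desired grid current i_gd(k) for k = 0, ..., steps - 1."""
    v = list(v0)
    reference = []
    for _ in range(steps):
        reference.append(F[0] * v[0] + F[1] * v[1])
        # v(k+1) = E v(k)
        v = [E[0][0] * v[0] + E[0][1] * v[1],
             E[1][0] * v[0] + E[1][1] * v[1]]
    return reference


# Eigenvalues of E are 0.954 +/- 0.3j; their modulus is close to one,
# so the reference is a nearly sustained sinusoid.
modulus = math.hypot(0.954, 0.3)

# v0 = (1, 0) is a hypothetical initial condition chosen for illustration.
reference = simulate_reference(v0=(1.0, 0.0), steps=200)
print(modulus, reference[:3])
```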
Fig. 1 Profiles of the output $i_g(k)$ and the reference output $i_{gd}(k)$
Fig. 2 Evolution of the control input $v_i(k)$
Fig. 3 Difference between $\bar{K}^{(j)}$ and the optimal value $\bar{K}^{*}$
Fig. 4 Difference between $\bar{P}^{(j)}$ and the optimal value $\bar{P}^{*}$
6 Conclusions
This paper addresses the output-feedback-based model-free optimal output tracking control problem. The proposed method uses adaptive dynamic programming and output regulation theory to design an optimal output tracking controller that iteratively approximates the optimal control gain without prior knowledge of the system model. The system output and the reference output are the only information available for control design.
References
1. Ahmed, K.H., Massoud, A.M., Finney, S.J., Williams, B.W.: A modified stationary reference frame-based predictive current control with zero steady-state error for LCL coupled inverter-based distributed generation systems. IEEE Trans. Ind. Electron. 58(4), 1359–1370 (2011). https://doi.org/10.1109/TIE.2010.2050414
2. Al-Tamimi, A., Lewis, F.L., Abu-Khalaf, M.: Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control. Automatica 43(3), 473–481 (2007)
3. Bellman, R.E.: Adaptive Control Processes: A Guided Tour. Princeton University Press, Princeton, NJ (1961)
4. Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-dynamic Programming. Athena Scientific, Belmont, MA (1996)
5. Bian, T., Jiang, Z.P.: Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design. Automatica 71, 348–360 (2016)
6. Bian, T., Jiang, Z.P.: Reinforcement learning and adaptive optimal control for continuous-time nonlinear systems: a value iteration approach. IEEE Trans. Neural Netw. Learn. Syst. 1–10 (2021)
7. Boyd, S., Vandenberghe, L.: Convex Optimization. Cambridge University Press, New York, NY, USA (2004)
8. Chen, C., Xie, L., Jiang, Y., Xie, K., Xie, S.: Robust output regulation and reinforcement learning-based output tracking design for unknown linear discrete-time systems (2021). arXiv:abs/2101.08706. Accessed 21 Jan 2021
9. Gao, W., Jiang, Z.P.: Adaptive dynamic programming and adaptive optimal output regulation of linear systems. IEEE Trans. Autom. Control 61(12), 4164–4169 (2016)
10. Gao, W., Jiang, Z.P.: Adaptive optimal output regulation via output-feedback: an adaptive dynamic programing approach. In: 2016 IEEE 55th Conference on Decision and Control (CDC), pp. 5845–5850 (2016)
11. Gao, W., Jiang, Z.P.: Adaptive optimal output regulation of time-delay systems via measurement feedback. IEEE Trans. Neural Netw. Learn. Syst. 30(3), 938–945 (2019)
12. Hewer, G.: An iterative technique for the computation of the steady state gains for the discrete optimal regulator. IEEE Trans. Autom. Control 16(4), 382–384 (1971)
13. Jiang, Y., Jiang, Z.P.: Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics. Automatica 48(10), 2699–2704 (2012)
14. Jiang, Y., Jiang, Z.P.: Robust adaptive dynamic programming and feedback stabilization of nonlinear systems. IEEE Trans. Neural Netw. Learn. Syst. 25(5), 882–893 (2014)
15. Jiang, Y., Kiumarsi, B., Fan, J., Chai, T., Li, J., Lewis, F.L.: Optimal output regulation of linear discrete-time systems with unknown dynamics using reinforcement learning. IEEE Trans. Cybern. 50(7), 3147–3156 (2020)
16. Kiumarsi, B., Lewis, F.L., Jiang, Z.P.: H∞ control of linear discrete-time systems: off-policy reinforcement learning. Automatica 78, 144–152 (2017)
17. Lewis, F.L., Vamvoudakis, K.G.: Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 41(1), 14–25 (2011)
18. Lewis, F.L., Vrabie, D.L., Syrmos, V.L.: Optimal Control, 3rd edn. Wiley, Hoboken, NJ, USA (2012)
19. Liberzon, D.: Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, Princeton, NJ, USA (2012)
20. Pang, B., Jiang, Z.P., Mareels, I.: Reinforcement learning for adaptive optimal control of continuous-time linear periodic systems. Automatica 118, 109035 (2020)
21. Powell, W.B.: Approximate Dynamic Programming: Solving the Curses of Dimensionality, 2nd edn. Wiley, Hoboken, NJ, USA (2011)
22. Wang, D., Ha, M., Qiao, J.: Self-learning optimal regulation for discrete-time nonlinear systems under event-driven formulation. IEEE Trans. Autom. Control 65(3), 1272–1279 (2020)
23. Wang, D., Liu, D., Wei, Q., Zhao, D., Jin, N.: Optimal control of unknown nonaffine nonlinear discrete-time systems based on adaptive dynamic programming. Automatica 48(8), 1825–1832 (2012)
24. Zhao, F., Gao, W., Liu, T., Jiang, Z.P.: Learning-based event-triggered adaptive optimal output regulation of linear discrete-time systems. In: 2021 IEEE 10th Data Driven Control and Learning Systems Conference (DDCLS), pp. 1516–1521 (2021)
LSTM-TD3-Based Control for Delayed Drone Combat Strategies
Bingyu Ji, Jun Wang, Hailin Zhang, and Ya Zhang
Abstract Reinforcement learning has made great achievements in the field of game confrontation, and intelligent military wargaming is also imminent. In this paper, we propose a game confrontation model based on LSTM and deep reinforcement learning for the red and blue drone sides in a delayed environment. An action buffer queue is added to the original battlefield environment to create a large-latency environment, the TD3 algorithm is combined with an LSTM that predicts the state trend, and a historical data pool is built using PER. The training results on the simulation platform show that the model is significantly more effective in the large-latency environment, with a 0.22 improvement in battle damage ratio and a 34.76% improvement in win rate over the original model.
Keywords LSTM · Reinforcement learning · Drones · Control delay
1 Introduction
Deep reinforcement learning has made rapid progress in gaming [1] and robot control [2]. However, most algorithms have been evaluated in simulators such as Gym and MuJoCo, where action selection and action execution happen instantaneously. In practice, many real-world problems suffer from action or state delays, including robotic systems, communication networks [3] and unmanned aircraft systems [4, 5], so the results of simulator-trained models may not be directly applicable to real systems. The complexity of modern warfare [6] has made computer-based strategic derivation an important tool for quantifying the battlefield. A typical continuous decision problem in a high-latency environment is annihilator air warfare.

Previous research has shown that latency not only degrades agent performance, but also leads to dynamic system instability, which is extremely dangerous for the control of real systems [7]. Several approaches have been proposed by the control community to address this problem, such as Smith prediction [8] and Artstein reduction [9]. Most of these methods rely on accurate models, which are usually not available in practice. Recently proposed DRL methods have great potential to address this problem. Ramstedt and Pal [10] proposed an off-policy model-free algorithm called Real-Time Actor-Critic, which performs well in low-latency systems but still encounters problems such as low learning efficiency when used in high-latency systems. Another direction for addressing the latency problem is adversarial learning for sim-to-real adaptation and robust learning [11]. However, most work in this direction is concerned with issues such as noise [11] from unstable forces rather than the latency relationship between action or state and environment. This suggests a lack of effective algorithmic frameworks that exploit DRL for realistic drone control tasks in high-latency systems. We believe this is due to the underutilisation of the dynamics of time-delayed systems. In other words, model-based DRL mechanisms in delayed systems have not been fully identified and formulated.

In this paper, we propose a DRL framework that can be applied in a delayed environment. The main approach is to use action buffer sequences to model the delay [12] between the decision actions of an intelligent agent in a real system and its actual actions in the environment, and to solve the delay problem with ALTD4, a delay control method based on Long Short Term Memory networks (LSTM) [13, 14] and the Twin Delayed Deep Deterministic policy gradient algorithm (TD3) [15, 16]. The method uses historical sequences of states, actions and rewards of intelligent agents as inputs in a traditional reinforcement learning framework [17]. It uses LSTM networks to correct the historical state information for new state inputs, thereby enhancing the intelligent agent's ability to predict future states.

B. Ji · J. Wang · H. Zhang · Y. Zhang (B)
School of Automation, Southeast University, Nanjing 210096, China
Key Laboratory Measurement and Control of Complex Systems of Engineering, Ministry of Education, Nanjing 210096, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_73
2 Simulation Environments and Algorithms
The training and testing environment is based on a battlefield environment with multiple unmanned aerial vehicles (Fig. 1). The game is played between eight annihilators and one jammer on each side, and ends when all eight annihilators of one side are destroyed or the game runs out of time. Blue uses expert knowledge to write response strategies for each situation, while Red uses deep reinforcement learning algorithms to build a commander model. The state of the environment consists of the locations of the red and blue drones and their amounts of ammunition. The side controlled by the agent can observe the locations of both sides' drones, but only the amount of ammunition of its own drones.
Fig. 1 Environment
Fig. 2 Algorithm framework
2.1 Algorithm Framework
Based on the TD3 algorithm and the LSTM neural network, this paper proposes ALTD4 (Action Delay by Long Short Term Memory with Twin Delayed Deep Deterministic Policy Gradient algorithm). ALTD4 builds on TD3, introduces an LSTM layer as the prediction layer, and uses a buffered queue to delay the output action. The algorithm framework is shown in Fig. 2.
2.2 Model Design
In the modern battlefield, combat units execute operational orders issued by their command and provide feedback on the battlefield. Rapid changes in battlefield status, irrational behaviour of soldiers, and interference, disruptions or delays in communication equipment can cause combat units to fail to execute orders in a timely manner. Commanders must therefore anticipate battlefield states before issuing orders in order to command and control combat units effectively. The red commander's model should therefore be able to predict the battlefield, and its orders to its own drones are subject to a delay.

The model is divided into three main parts: A-Delay, LSTM, and TD3. A-Delay is used to create a large-delay environment with unknown variables for the agent. LSTM predicts the future state of the battlefield environment based on historical data. TD3 selects appropriate actions to act on the environment based on the prediction results of the LSTM and the pool of historical data [18]. In the TD3 and ALTD4 algorithms, A-Delay is implemented by setting up a buffered sequence of actions of a certain length, with the oldest action in the buffer acting on the environment.
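The A-Delay mechanism described above can be sketched as a fixed-length first-in, first-out queue. This is a minimal illustration rather than the authors' implementation: the class name `ActionDelayBuffer`, the method `step` and the choice to pre-fill the queue with a `noop_action` are assumptions made for the example.

```python
from collections import deque


class ActionDelayBuffer:
    """Delay each action by a fixed number of steps (hypothetical helper).

    The queue is pre-filled with `noop_action` (an assumption), so the
    first `delay` calls return that placeholder instead of a real action.
    """

    def __init__(self, delay, noop_action):
        if delay < 0:
            raise ValueError("delay must be non-negative")
        self.queue = deque([noop_action] * delay)

    def step(self, action):
        # Append the newest decision a_t and release the oldest one, a_{t-d}.
        self.queue.append(action)
        return self.queue.popleft()


# With a 2-step delay, the action chosen at step t reaches the
# environment at step t + 2.
buffer = ActionDelayBuffer(delay=2, noop_action=0)
executed = [buffer.step(a) for a in [1, 2, 3, 4]]
print(executed)
```

A queue of length d + 1 after each append corresponds to a buffer of the form [a_{t-d}, ..., a_t], of which only the oldest entry is applied to the environment.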
2.3 ALTD4 Training Process
The commander model is trained through multiple rounds of battlefield matches, with one command issued every 10 s in each round. As ALTD4 differs significantly from TD3 in terms of input and neural network structure, this section gives a brief overview of the algorithm. In the following, $t$ denotes the current time of the model, $\tau$ the prediction step, $d$ the action delay, and $x$ a random moment. The decision-making process for each step is as follows:

(1) The commander obtains the current environmental state of the battlefield, denoted $s_t$, from the environment. Then, the commander updates the sequence of state-action pairs in its memory, denoted $SA = [[s_{t-\tau}, a_{t-\tau}], \ldots, [s_{t-1}, a_{t-1}]]$;
(2) The LSTM network uses the sequence $SA$ to predict the battlefield environment at the moment $t+\tau$, denoted $s'_{t+\tau}$, which is used as the input to the Actor network. The Actor generates the motion $a_t$ of the red-side drones. The whole process is summarized in Fig. 3;
(3) Add $a_t$ to the action buffer queue $A = [a_{t-d}, \ldots, a_t]$ and use $a_{t-d}$ to act on the battlefield environment;
(4) Store $[[s_{t-2\tau}, a_{t-2\tau}, r_{t-2\tau}, s_{t-2\tau+1}], \ldots, [s_t, a_t, r_t, s_{t+1}]]$ as one piece of historical data in the historical data pool, which is stored and used according to the rules in Fig. 4. If the size of the historical data pool is larger than the amount of historical data required for commander learning, proceed to the next learning step; otherwise the round ends;
(5) The commander draws historical data from the pool according to the Prioritized Experience Replay (PER) algorithm [19, 20], $Batch = [[s_{x-\tau}, a_{x-\tau}, r_{x-\tau}, s_{x-\tau+1}], \ldots, [s_{x+\tau}, a_{x+\tau}, r_{x+\tau}, s_{x+\tau+1}]]$, as learning data;
(6) The state-action sequence $[[s_{x-\tau}, a_{x-\tau}], \ldots, [s_{x-1}, a_{x-1}]]$ is input to the LSTM network to obtain the prediction $s'_{x+\tau}$ of the state at the moment $x+\tau$. The mean squared error between $s'_{x+\tau}$ and $s_{x+\tau}$ is passed back to the Predict network as the loss, i.e. the loss of the LSTM network is

$loss_p = MSE(s'_{x+\tau}, s_{x+\tau})$  (1)

(7) $s'_{x+\tau}$ is input to the Actor network to obtain the new annihilator motion orientation $a_x$ at the current moment, and $s_{x+\tau+1}$ is input to the Actor target network to obtain the new annihilator motion orientation $a_{x+\tau+1}$. The learning process of the algorithm is shown in Fig. 5;
(8) $s_{x+\tau+1}$, $s'_{x+\tau}$, $a_x$ and $r_x$ are used as the future state, current state, action and reward, respectively, in the update of the Critic target network parameters of TD3 to calculate $Q_{target}$, and thus to update the Critic neural network parameters. Specifically,

\[
\begin{cases}
Q'_1, Q'_2 \leftarrow (s_{x+\tau+1}, a_x)\\
Q_{target} = r_x + \gamma \cdot \min(Q'_1, Q'_2)\\
Q_1, Q_2 \leftarrow (s_x, a_x)\\
loss_{C1} = MSE(Q_1, Q_{target})\\
loss_{C2} = MSE(Q_2, Q_{target})
\end{cases}
\tag{2}
\]

(9) $a_x$ and $s'_{x+\tau}$ are used as the action and the current state, respectively, in the update of the Actor network parameters of TD3 (Fig. 6).

The subsequent process is basically the same as in TD3 and is not repeated here.
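The loss computations in Eqs. (1) and (2) can be written out in a few lines. The sketch below is a plain-Python illustration under assumptions: the function names are hypothetical, states are treated as flat lists of numbers, the Q-values are passed in as already-evaluated scalars instead of being produced by neural networks, and details of standard TD3 such as target policy smoothing and terminal-state masking are omitted.

```python
def mse(prediction, target):
    """Mean squared error between two equally long sequences of numbers."""
    if len(prediction) != len(target):
        raise ValueError("sequences must have the same length")
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(prediction)


def prediction_loss(predicted_state, true_state):
    """Eq. (1): loss of the LSTM prediction layer."""
    return mse(predicted_state, true_state)


def critic_losses(reward, gamma, q1_target, q2_target, q1, q2):
    """Eq. (2): clipped double-Q target and the two critic losses.

    q1_target and q2_target come from the target critics (future state),
    q1 and q2 from the evaluation critics (current state).
    """
    q_target = reward + gamma * min(q1_target, q2_target)
    return q_target, mse([q1], [q_target]), mse([q2], [q_target])


# Hypothetical numbers chosen only to make the arithmetic easy to follow.
loss_p = prediction_loss([1.0, 2.0], [1.0, 4.0])
q_target, loss_c1, loss_c2 = critic_losses(
    reward=1.0, gamma=0.9, q1_target=2.0, q2_target=3.0, q1=2.5, q2=3.0
)
print(loss_p, q_target, loss_c1, loss_c2)
```

Taking the minimum of the two target critics is the TD3 device for limiting overestimation of the value function; the LSTM loss is trained separately from the critic losses.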
Fig. 3 LSTM and Critic network parameters update
Fig. 4 Delivery process of data
Fig. 5 Learning process
Fig. 6 Actor network parameters update
3 Training Effect Validation
The training environment is based on TensorFlow [21] 2.0 and Python 3.7. The TD3 and ALTD4 algorithms are used to train commander agents in the battlefield space with no delay, a 5-step delay, and a 10-step delay. Finally, the performance of each type of agent is tested in a 10-step delay environment. Models are named in the format Algorithm-xL-yD (written as, e.g., ALTD4-5L10D in Tables 1 and 2), denoting the commander model implemented with Algorithm, an LSTM network with $\tau = x$, and a y-step action delay buffer sequence.
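The naming convention can be made concrete with a small parser. This is an illustrative sketch only: the function name `parse_model_name` is hypothetical, and treating a missing `L` or `D` part as zero (no LSTM prediction step, no action delay) is an assumption inferred from names such as TD3 and ALTD4-5L in the result tables.

```python
import re

# Optional "<x>L" part followed by an optional "<y>D" part, e.g. "5L10D".
_SUFFIX = re.compile(r"(?:(\d+)L)?(?:(\d+)D)?")


def parse_model_name(name):
    """Split a model name into (algorithm, prediction step, action delay)."""
    algorithm, _, suffix = name.partition("-")
    match = _SUFFIX.fullmatch(suffix)
    if match is None:
        raise ValueError(f"unrecognised model name: {name!r}")
    tau = int(match.group(1) or 0)    # assumed 0 when the L part is absent
    delay = int(match.group(2) or 0)  # assumed 0 when the D part is absent
    return algorithm, tau, delay


for model in ["TD3", "TD3-5D", "ALTD4-10L", "ALTD4-5L10D"]:
    print(model, parse_model_name(model))
```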
3.1 Loss Curve
Figure 7 shows the loss versus training step in the training environment for the new LSTM layer added to ALTD4. The loss of the LSTM is the mean squared error between the predicted and true states. The gradual convergence of the loss to zero indicates that ALTD4 can predict the battlefield environment $\tau$ steps ahead consistently and accurately, which is consistent with experimental expectations.

The loss calculation for the Critic network is shown in Eq. (2) in Sect. 2.3. Eventually, the neural network parameters of the target Critic and the evaluation Critic become progressively equal. The variation of the Critic loss with the training step is shown in Fig. 8. In terms of the convergence speed of the Critic network, the ALTD4 algorithm is not as fast as the TD3 algorithm. This is because the ALTD4 algorithm adds an independently trained LSTM layer, and the Critic target network and the evaluation network are not tightly coupled, so there is a large loss in the Critic calculation. However, the Critic networks of both the ALTD4 and TD3 algorithms converge gradually, and the ALTD4 algorithm converges faster when the environmental delay becomes large, so the trained agent is valid and reliable.

Fig. 7 Predict loss
Fig. 8 Critic loss
3.2 Result Analysis
During training, the algorithm aims to maximise the cumulative reward, and the reward curve reflects the effectiveness of the reinforcement learning algorithm in solving the task. Figure 9 shows that, after several iterations, the reward curves of the TD3 and ALTD4 algorithms gradually stabilise in the later stages of training. The main reason for the negative reward of the ALTD4-10L model is that the model finds the strategy "hold the enemy drones with the least number of own drones" and, since it predicts far into the future, sends the remaining drones to hold the enemy immediately before its own drones are killed, which takes longer. Because the reward function is time-limited, its reward is lower than that of the other agents.

For the commander model in the environment, an analysis of the difference between the drones destroyed by Red and Blue at the end of the game is shown in Table 1. As the time delay increases, the same algorithm shows a downward trend in win rate and reward. ALTD4 performs better than TD3 at the same latency.

Fig. 9 Training environment rewards
3.3 Testing Agent Performance
A battlefield environment with a fixed 10-step action delay was used for 100 test rounds to evaluate the performance of the trained commander agents. The trained TD3 and ALTD4 commander agents were run multiple times in the test environment and the reward curves were plotted (Fig. 10).
Table 1 Results of the agent in the training environment
Algorithm       Reward mean   Standard deviation   Win rate (%)   Casualty rate
TD3             1.820         0.446                72.36          1.19
TD3-5D          1.830         0.489                64.32          1.16
TD3-10D         1.298         0.757                61.3           1.15
ALTD4-5L        1.886         0.404                74.87          1.21
ALTD4-5L5D      1.288         0.830                74.31          1.18
ALTD4-5L10D     1.380         0.639                68.31          1.2
ALTD4-10L       −1.835        1.283                87.93          1.39
ALTD4-10L5D     1.742         0.430                83.41          1.26
ALTD4-10L10D    1.312         0.477                68.84          1.15
Fig. 10 Testing environment reward
The results of the experiments indicate that these models have approached or reached their respective optimal strategies. The average reward and battle loss ratios of the agents in each experiment were recorded, as shown in Table 2.
Table 2 Results of the agent in the testing environment
Algorithm       Reward mean   Standard deviation   Win rate (%)   Casualty rate
TD3             0.851         0.546                60.89          1.17
TD3-5D          1.673         0.528                69.56          1.13
TD3-10D         1.814         0.526                80.43          1.23
ALTD4-5L        1.626         0.456                69.56          1.16
ALTD4-5L5D      1.643         0.393                71.73          1.15
ALTD4-5L10D     1.867         0.441                76.12          1.26
ALTD4-10L       −0.91         0.415                95.65          1.39
ALTD4-10L5D     1.812         0.480                78.26          1.13
ALTD4-10L10D    1.866         0.448                73.91          1.15
For agents trained in a delay-free environment, the ALTD4 algorithm performed better than TD3 in the test environment, with a 34.76 percentage-point increase in the maximum win rate, a 0.22 increase in the maximum battle loss ratio, and a smaller standard deviation in the final reward. This means that the ALTD4 algorithm is able to improve combat efficiency and reduce losses significantly. Based on the experimental results, the ALTD4 algorithm is more suitable for settings where there is no delay in training and a large delay in testing.
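The two headline figures can be traced back to Table 2. The short calculation below assumes that they compare the best ALTD4 row with the TD3 row trained without delay; under that assumption the differences reproduce the reported 34.76 and 0.22. The dictionary simply copies the win rate and casualty rate columns of Table 2.

```python
# (win rate in %, casualty rate) copied from Table 2.
table2 = {
    "TD3": (60.89, 1.17),
    "TD3-5D": (69.56, 1.13),
    "TD3-10D": (80.43, 1.23),
    "ALTD4-5L": (69.56, 1.16),
    "ALTD4-5L5D": (71.73, 1.15),
    "ALTD4-5L10D": (76.12, 1.26),
    "ALTD4-10L": (95.65, 1.39),
    "ALTD4-10L5D": (78.26, 1.13),
    "ALTD4-10L10D": (73.91, 1.15),
}

baseline_win, baseline_casualty = table2["TD3"]
altd4_rows = [v for name, v in table2.items() if name.startswith("ALTD4")]

best_win = max(win for win, _ in altd4_rows)
best_casualty = max(casualty for _, casualty in altd4_rows)

# Differences in percentage points (win rate) and in ratio (casualty rate).
win_gain = round(best_win - baseline_win, 2)
casualty_gain = round(best_casualty - baseline_casualty, 2)
print(win_gain, casualty_gain)
```

Both maxima come from the ALTD4-10L row, which is also the row with the negative mean reward discussed in Sect. 3.2.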
4 Conclusion
In this paper, an ALTD4 algorithm based on deep reinforcement learning and long short-term memory networks was investigated, which can be used to solve the decision making problem of single agents under large environmental delays. The algorithm was trained and tested in an annihilator air combat gaming environment, and compared with the TD3 algorithm. The experimental results show that the ALTD4 algorithm has slightly larger loss values in the Critic network compared to the TD3 algorithm, but ALTD4 performs very well in a testing environment with greater delay compared to the training environment. The algorithm was also evaluated and validated in various aspects, including training curves, reward curves, win rates and battle loss ratios, demonstrating its potential for use in decision making problems in high delay environments.

Acknowledgements This work is supported by National Key R&D Program of China under Grant 2021ZD0112700 and National Natural Science Foundation (NNSF) of China under Grant 61973082 and 62233003.
References
1. Joo, H.-T., Kim, K.-J.: Visualization of deep reinforcement learning using Grad-CAM: how AI plays Atari games? In: 2019 IEEE Conference on Games (CoG), pp. 1–2. London, UK (2019). https://doi.org/10.1109/CIG.2019.8847950
2. Silver, D., et al.: Mastering the game of Go with deep neural networks and tree search. Nature 529, 484–489 (2016). https://doi.org/10.1038/nature16961
3. Chen, B., et al.: Delay-aware model-based reinforcement learning for continuous control. Neurocomputing 450, 119–128 (2020). https://doi.org/10.1016/J.NEUCOM.2021.04.015
4. Chen, X., Wang, Y.: Air combat game method based on multi-UAV consensus information. In: The 26th Chinese Control and Decision Conference (CCDC), pp. 4361–4364. Changsha, China (2014). https://doi.org/10.1109/CCDC.2014.6852947
5. Zhang, G., Li, Y., Xu, X., Dai, H.: Efficient training techniques for multi-agent reinforcement learning in combat tasks. IEEE Access 7, 109301–109310 (2019). https://doi.org/10.1109/ACCESS.2019.2933454
6. Boron, J., Darken, C.: Developing combat behavior through reinforcement learning in wargames and simulations. In: 2020 IEEE Conference on Games (CoG), pp. 728–731. Osaka, Japan (2020). https://doi.org/10.1109/CoG47356.2020.9231609
7. Chung, L.-L., et al.: Time-delay control of structures. Earthq. Eng. Struct. Dyn. 24, 687–701 (1995). https://doi.org/10.1002/EQE.4290240506
8. Astrom, K.J., Hang, C.C., Lim, B.C.: A new Smith predictor for controlling a process with an integrator and long dead-time. IEEE Trans. Autom. Control 39(2), 343–345 (1994). https://doi.org/10.1109/9.272329
9. Artstein, Z.: Linear systems with delayed controls: a reduction. IEEE Trans. Autom. Control 27(4), 869–879 (1982). https://doi.org/10.1109/TAC.1982.1103023
10. Ramstedt, S., Pal, C.: Real-time reinforcement learning. In: Neural Information Processing Systems (2019)
11. Pinto, L., et al.: Robust adversarial reinforcement learning. In: International Conference on Machine Learning (2017)
12. Firoiu, V., et al.: At human speed: deep reinforcement learning with action delay. ArXiv abs/1810.07286 (2018). https://doi.org/10.48550/arXiv.1810.07286
13. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997). https://doi.org/10.1162/neco.1997.9.8.1735
14. Zhang, S., Cao, R.: Multi-objective optimization for UAV-enabled wireless powered IoT networks: an LSTM-based deep reinforcement learning approach. IEEE Commun. Lett. 26, 3019–3023 (2022). https://doi.org/10.1109/LCOMM.2022.3210660
15. Fengjiao, Z., Jie, L., Zhi, L.: A TD3-based multi-agent deep reinforcement learning method in mixed cooperation-competition environment. Neurocomputing 411, 206–215 (2020). https://doi.org/10.1016/J.NEUCOM.2020.05.097
16. Fujimoto, S., et al.: Addressing function approximation error in actor-critic methods. ArXiv abs/1802.09477 (2018). https://doi.org/10.48550/arXiv.1802.09477
17. Sutton, R.S., Barto, A.G.: Reinforcement learning. A Bradford Book 15(7), 665–685 (1998). https://doi.org/10.1007/978-3-642-27645-3
18. Hou, Y., et al.: A novel DDPG method with prioritized experience replay. In: 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 316–321 (2017). https://doi.org/10.1109/SMC.2017.8122622
19. Schaul, T., et al.: Prioritized experience replay. CoRR abs/1511.05952 (2015). https://doi.org/10.48550/arXiv.1511.05952
20. Liu, X., et al.: Prioritized experience replay based on multi-armed bandit. Expert Syst. Appl. 189 (2022). https://doi.org/10.1016/j.eswa.2021.116023
21. Abadi, M., et al.: TensorFlow: a system for large-scale machine learning. ArXiv abs/1605.08695 (2016). https://doi.org/10.48550/arXiv.1605.08695
Design of Adaptive Sliding Mode Controller Based on Neural Network Compensation for Stewart Platform Weixiang Zeng, Wenlin Yang, Yunting Wang, and Weilun Situ
Abstract The Stewart platform is widely used in motion simulators, parallel machine tools, vibration isolation, stabilization platforms, and similar applications. Because of disturbances and other factors, nonlinear sliding mode control is often used in Stewart platform controller design. However, under large disturbances and model errors, the traditional sliding mode controller switches severely, which leads to chattering. This paper introduces an RBF neural network to compensate for the uncertainties and designs the corresponding control law and adaptive law. The stability of the system is proved based on Lyapunov theory. The simulation results show that the designed controller is robust and tracks the desired trajectory effectively.
Keywords Stewart platform · Inverse dynamic model · Sliding mode control · RBF neural network
W. Zeng · W. Yang (B) · Y. Wang · W. Situ
Guangdong Institute of Intelligent Unmanned System, Guangzhou 511458, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_74

1 Introduction
The Stewart platform has the advantages of a stable structure, high precision, and strong bearing capacity, and it is widely used in aerospace, defense, and other fields. Because the platform is a multi-input multi-output, strongly coupled nonlinear system with time-varying parameters [1], it is difficult for the traditional PID algorithm to achieve optimal control performance. Many scholars have studied advanced control strategies for the Stewart platform, such as sliding mode control [2, 3], neural network control [1, 4], fuzzy control [5, 6], active disturbance rejection control [7], and some fusion control methods. Sliding mode control is a variable structure control in which the sliding mode is designed in advance and is not affected by parameter variations of the controlled object or by external disturbances. It therefore has strong robustness and is widely used in Stewart platform controller design.
The control framework of the Stewart platform can be divided into joint space control and workspace control. Workspace-based motion control often needs to measure the pose of the platform, which in many scenarios is expensive and time-consuming. Therefore, the joint space control method is more suitable in practice. However, the joint-space dynamic model is difficult to express explicitly in terms of joint information. In actual calculation, the desired trajectory is usually used instead of the actual trajectory to compute the dynamic model [8], which introduces model errors. Large model errors and external disturbances cause severe switching of the sliding mode controller, resulting in chattering. To avoid chattering, a neural network is used to compensate for the uncertainty of the sliding mode controller. Neural networks have strong learning ability and can approximate continuous nonlinear functions, and they are widely used to control nonlinear systems [9]. References [10–12] studied neural network adaptive sliding mode control of parallel platforms and achieved good results.
Inspired by the above discussion, this paper uses a new neural network adaptive sliding mode controller to control the Stewart platform. First, the platform's inverse kinematics and inverse dynamics models are analyzed, and the dynamic model is transformed into joint space using the Jacobian matrix. Second, the sliding mode controller is designed, the neural network is used to compensate for the uncertainty error, and the corresponding control law and adaptive law are designed. The stability of the system is proved based on Lyapunov theory. Finally, a simulation experiment is designed to verify the control performance under disturbance.
The rest of this article is organized as follows. Section 2 derives the inverse kinematics and dynamics models of the Stewart platform. In Sect. 3, a sliding mode controller based on neural network compensation is designed. The control performance of the controller is shown in the simulation experiment in Sect. 4. Section 5 summarizes this article.
2 Kinematics and Dynamics of Stewart Platform
2.1 Inverse Kinematics
The model of the Stewart platform is shown in Fig. 1; the lower platform is assumed to be fixed. The static coordinate system $O_B(X_B, Y_B, Z_B)$ and the dynamic coordinate system $O_A(X_A, Y_A, Z_A)$ are established at the centers of the lower and upper platforms, respectively. As shown in Fig. 2, the hinge points connecting the legs to the lower and upper platforms are denoted $B_i$ and $A_i$ ($i = 1, \ldots, 6$). The roll, pitch and yaw of the upper platform about $X_B$, $Y_B$ and $Z_B$ are defined as $\alpha$, $\beta$ and $\gamma$, respectively. According to the RPY convention, the rotation matrix $R$ can be used to represent the rotation of the dynamic coordinate system.
Fig. 1 The SolidWorks model of the Stewart platform
Fig. 2 Schematic diagram of Stewart platform structure parameters: (a) top view of the Stewart platform; (b) single-leg vector diagram
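$$R = \begin{bmatrix} c\beta\, c\gamma & s\alpha\, s\beta\, c\gamma - c\alpha\, s\gamma & c\alpha\, s\beta\, c\gamma + s\alpha\, s\gamma \\ c\beta\, s\gamma & s\alpha\, s\beta\, s\gamma + c\alpha\, c\gamma & c\alpha\, s\beta\, s\gamma - s\alpha\, c\gamma \\ -s\beta & s\alpha\, c\beta & c\alpha\, c\beta \end{bmatrix} \tag{1}$$

where $s(\cdot) = \sin(\cdot)$ and $c(\cdot) = \cos(\cdot)$. The position of the origin of the dynamic coordinate system in {B} is defined as $p = [p_x, p_y, p_z]^T$. The pose vector of the moving platform can be expressed as:

$$q = [p_x, p_y, p_z, \alpha, \beta, \gamma]^T \tag{2}$$

According to Fig. 2, the position of $B_i$ in the coordinate system {B} is

$${}^{B}b_i = [\, r_B \cos(\phi_i) \quad r_B \sin(\phi_i) \quad 0 \,]^T \tag{3}$$

The position of $A_i$ in the coordinate system {A} is

$${}^{A}a_i = [\, r_A \cos(\psi_i) \quad r_A \sin(\psi_i) \quad 0 \,]^T \tag{4}$$

The expressions of $\phi_i$ and $\psi_i$ are as follows:

$$\phi_i(\psi_i) = \begin{cases} \dfrac{2\pi (j-1)}{3} + \phi(\psi), & i = 1, 3, 5 \\[2mm] \dfrac{2\pi j}{3} - \phi(\psi), & i = 2, 4, 6 \end{cases}, \quad j = 1, 2, 3 \tag{5}$$

In addition, the expression of the leg length vector is:

$$L_i = \overrightarrow{O_B O_A} + \overrightarrow{O_A A_i} - \overrightarrow{O_B B_i} = R\,{}^{A}a_i + p - {}^{B}b_i \quad (i = 1, \cdots, 6) \tag{6}$$

Therefore, the length of the leg can be obtained:

$$l_i = \|L_i\| \tag{7}$$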
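The inverse kinematics of Eqs. 1–7 can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, angles are assumed to be in radians, and the pairing $j = \lceil i/2 \rceil$ between leg index and hinge pair is an assumption, since the text only states $j = 1, 2, 3$.

```python
import math

def rotation_rpy(alpha, beta, gamma):
    """Rotation matrix of Eq. 1 (angles in radians)."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)
    sg, cg = math.sin(gamma), math.cos(gamma)
    return [
        [cb * cg, sa * sb * cg - ca * sg, ca * sb * cg + sa * sg],
        [cb * sg, sa * sb * sg + ca * cg, ca * sb * sg - sa * cg],
        [-sb, sa * cb, ca * cb],
    ]

def hinge_points(radius, offset):
    """Hinge positions from Eqs. 3-5; j = ceil(i / 2) is an assumption."""
    points = []
    for i in range(1, 7):
        j = (i + 1) // 2
        if i % 2 == 1:
            angle = 2.0 * math.pi * (j - 1) / 3.0 + offset
        else:
            angle = 2.0 * math.pi * j / 3.0 - offset
        points.append([radius * math.cos(angle), radius * math.sin(angle), 0.0])
    return points

def leg_lengths(q, a_pts, b_pts):
    """Leg lengths l_i = ||R a_i + p - b_i|| (Eqs. 6-7)."""
    p = q[:3]
    R = rotation_rpy(q[3], q[4], q[5])
    lengths = []
    for a, b in zip(a_pts, b_pts):
        # Leg vector L_i expressed in the static frame {B}
        L = [sum(R[r][c] * a[c] for c in range(3)) + p[r] - b[r] for r in range(3)]
        lengths.append(math.sqrt(sum(v * v for v in L)))
    return lengths
```

With the radii and offset angles of Table 1 and a hypothetical platform height, `leg_lengths([0, 0, 2.0, 0, 0, 0], hinge_points(2.50, math.radians(46.11)), hinge_points(2.75, math.radians(15.83)))` returns six equal lengths, as expected for a symmetric neutral pose.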
2.2 Inverse Dynamics
Before solving the dynamic model, we derive the angular velocity conversion matrix $T$ and the Jacobian matrix $J$ of the moving platform. First, the velocity vector of the moving platform is defined as

$$\dot{q} = [\dot{p};\ \omega] \tag{8}$$

where $\omega$ is the angular velocity vector of the moving platform. According to the rule for differentiating a rotation matrix, $\omega$ and $R$ are related by

$$\omega \times R = \dot{R} \tag{9}$$

According to Eq. 9, we finally obtain the relationship between angular velocity and generalized velocity:

$$\omega = T \begin{bmatrix} \dot{\alpha} \\ \dot{\beta} \\ \dot{\gamma} \end{bmatrix} = \begin{bmatrix} c\beta\, c\gamma & -s\gamma & 0 \\ c\beta\, s\gamma & c\gamma & 0 \\ -s\beta & 0 & 1 \end{bmatrix} \begin{bmatrix} \dot{\alpha} \\ \dot{\beta} \\ \dot{\gamma} \end{bmatrix} \tag{10}$$

According to the velocity relationship between the joint space and the workspace, the Jacobian matrix $J_i$ of the $i$-th leg is defined as

$$J_i = \begin{bmatrix} u_i^T & (R\,{}^{A}a_i \times u_i)^T \end{bmatrix} \tag{11}$$

where $u_i = L_i / l_i$ is the unit vector along the $i$-th leg. Therefore, we can conclude

$$\dot{l} = J \dot{q} \tag{12}$$
In this paper, the dynamic equation of the Stewart platform is established with the Euler-Lagrange method. The dynamic equation in the workspace can be expressed as follows [13]:

$$M(q)\ddot{q} + C(q, \dot{q})\dot{q} + G(q) + F_d = F = J^T \tau \tag{13}$$

where $M(q) \in R^{6\times 6}$ is the inertia matrix, $C(q, \dot{q}) \in R^{6\times 6}$ is the Coriolis and centripetal force matrix, $G(q) \in R^{6}$ is the gravity vector, $F_d \in R^{6}$ is the disturbance, $F \in R^{6}$ is the output generalized force, and $\tau \in R^{6}$ is the driving torque. Since the moving platform has the greatest influence on the overall dynamic model, this paper only considers the moving platform model. The kinetic energy of the moving platform can be expressed as [14]

$$K_A = \frac{1}{2}\left[ m_A\left(\dot{x}_p^2 + \dot{y}_p^2 + \dot{z}_p^2\right) + \omega^T R I_A R^T \omega \right] \tag{14}$$

where $I_A$ is the moment of inertia of the moving platform about the coordinate system {A}. After rearranging Eq. 14, we get

$$K_A = \frac{1}{2}\dot{q}^T M(q)\dot{q} = \frac{1}{2}\dot{q}^T \begin{bmatrix} m_A I_{3\times 3} & O_{3\times 3} \\ O_{3\times 3} & R I_A R^T \end{bmatrix} \dot{q} \tag{15}$$

According to the potential energy of the moving platform, we can derive the gravity vector as follows:

$$G(q) = \frac{\partial (m_A g z_p)}{\partial q} = [\,0 \ \ 0 \ \ m_A g \ \ 0 \ \ 0 \ \ 0\,]^T \tag{16}$$

The derivation of $C(q, \dot{q})$ follows Ref. [8] and can be expressed by the following formula

$$C(q, \dot{q}) = \begin{bmatrix} O_{3\times 3} & O_{3\times 3} \\ O_{3\times 3} & \omega \times R I_A R^T \end{bmatrix} \tag{17}$$

According to Eqs. 12 and 13, we convert the dynamic equations to obtain

$$M(l)\ddot{l} + C(l, \dot{l})\dot{l} + G(l) + \tau_d = \tau \tag{18}$$

$$M(l) = J^{-T} M(q) J^{-1} \tag{19}$$

$$C(l, \dot{l}) = J^{-T}\left(C(q, \dot{q}) - M J^{-1}\dot{J}\right) J^{-1} \tag{20}$$

$$G(l) = J^{-T} G(q) \tag{21}$$

Since the derived dynamic model is incomplete, we call the above model the nominal model and denote it by $M_0(l)$, $C_0(l, \dot{l})$, $G_0(l)$. Let $\Delta M = M_0 - M$, $\Delta C = C_0 - C$, $\Delta G = G_0 - G$. Combining with Eq. 18, we get

$$M_0(l)\ddot{l} + C_0(l, \dot{l})\dot{l} + G_0(l) - D(l, \dot{l}, \ddot{l}) = \tau \tag{22}$$

where $D(l, \dot{l}, \ddot{l}) = \Delta M \ddot{l} + \Delta C \dot{l} + \Delta G - \tau_d$.
3 Adaptive Sliding Control Based on Neural Network Compensation
3.1 Sliding Mode Controller
The sliding mode controller designed in this paper adopts a linear sliding surface and an exponential reaching law. First, the system error is defined as

$$e = l_d - l \tag{23}$$

$$\dot{e} = \dot{l}_d - \dot{l} \tag{24}$$

The switching surface is designed as

$$s = e + \dot{e} \tag{25}$$

Differentiating both sides of Eq. 25 and substituting Eq. 22 gives

$$\dot{s} = \ddot{l}_d + \dot{e} - M_0^{-1}(\tau - C_0\dot{l} - G_0) - M_0^{-1}D \tag{26}$$

Define the uncertainty error term $f = M_0^{-1}D$, and introduce the exponential reaching law:

$$\dot{s} = -\varepsilon\,\mathrm{sgn}(s) - \lambda s, \quad \varepsilon > 0,\ \lambda > 0 \tag{27}$$

Combining Eqs. 26 and 27, the control law of the sliding mode controller can be obtained as follows:

$$\tau = M_0\left(\ddot{l}_d + \dot{e} + \varepsilon\,\mathrm{sgn}(s) + \lambda s - \hat{f}\right) + C_0\dot{l} + G_0 \tag{28}$$

where $\hat{f}$ is the estimated value of $f$. Combining Eqs. 26 and 28, we obtain

$$\dot{s} = -\varepsilon\,\mathrm{sgn}(s) - \lambda s - \tilde{f} \tag{29}$$

where $\tilde{f} = f - \hat{f}$ is the estimation error. Multiplying Eq. 29 on the left by $s^T$, we get:

$$s^T\dot{s} = -\varepsilon\|s\| - \lambda s^T s - s^T\tilde{f} \tag{30}$$

It can be seen from Eq. 30 that the stability of the sliding mode controller depends on the estimation error of the uncertain term, so stability cannot be guaranteed at this point. We therefore introduce a neural network to compensate for the uncertain term.
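To make the structure of Eq. 28 concrete, the sketch below evaluates the control law for a single leg treated as a decoupled scalar axis. That decoupling, the function names, and the numerical values in the usage note are all illustrative assumptions; the paper's controller uses the full matrices $M_0$, $C_0$ and $G_0$.

```python
def sgn(x):
    """Sign function used in the reaching law: returns -1, 0 or 1."""
    return (x > 0) - (x < 0)

def smc_torque(m0, c0, g0, ld_ddot, e, e_dot, l_dot, f_hat, eps, lam):
    """Scalar form of the control law in Eq. 28 for one leg."""
    s = e + e_dot                   # sliding variable, Eq. 25
    reach = eps * sgn(s) + lam * s  # exponential reaching law terms, Eq. 27
    return m0 * (ld_ddot + e_dot + reach - f_hat) + c0 * l_dot + g0

def s_dot(m0, c0, g0, tau, ld_ddot, e_dot, l_dot, f):
    """Right-hand side of Eq. 26 with f = D / m0."""
    return ld_ddot + e_dot - (tau - c0 * l_dot - g0) / m0 - f
```

When the estimate is exact (`f_hat == f`), substituting the torque from `smc_torque` into `s_dot` reproduces the reaching law $-\varepsilon\,\mathrm{sgn}(s) - \lambda s$, which is the content of Eq. 29 with $\tilde{f} = 0$.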
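3.2 RBF Neural Network
In this paper, the RBF neural network is introduced to approximate the uncertainty error of the controller, and the output of the neural network is

$$f = W^T h \tag{31}$$

$$h_j = \exp\left(-\|x - c_j\|^2 / 2b_j^2\right), \quad j = 1, 2, \cdots, m \tag{32}$$

where $W$ represents the weights of the neural network, $h$ represents the Gaussian basis function, and $x$ is the input of the network. For the uncertainty error $f$, there is an ideal weight vector $W^*$ and a small positive number $\eta_M$ such that the approximation error $\eta$ satisfies

$$\max\|\eta\| = \max\|f - W^{*T}h\| \le \eta_M \tag{33}$$

From Eq. 33, the approximation error $\eta$ is bounded, so the following formula holds

$$f = W^{*T}h + \eta \tag{34}$$

The adaptive compensation term of the neural network controller can be set as [15]

$$\tau_N = M_0\left(\hat{W}^T h - \eta_M\,\mathrm{sgn}(s)\right) \tag{35}$$

where $\hat{W}$ is the estimated value of $W^*$, and the adaptive law is taken as

$$\dot{\hat{W}} = -\Gamma h s^T \tag{36}$$

where $\Gamma > 0$. According to Eqs. 28 and 35, the control law of the adaptive sliding mode controller based on neural network compensation is

$$\tau = M_0\left(\ddot{l}_d + \dot{e} + \varepsilon\,\mathrm{sgn}(s) + \lambda s\right) + C_0\dot{l} + G_0 - M_0\left(\hat{W}^T h - \eta_M\,\mathrm{sgn}(s)\right) \tag{37}$$

The block diagram of the control system of the Stewart platform is shown in Fig. 3.

Fig. 3 Control system block diagram

Define the Lyapunov function:

$$V = \frac{1}{2}s^T s + \frac{1}{2}\mathrm{tr}\left(\tilde{W}^T\Gamma^{-1}\tilde{W}\right) \tag{38}$$

where $\tilde{W} = W^* - \hat{W}$, so that $\dot{\tilde{W}} = -\dot{\hat{W}}$. Differentiating both sides of Eq. 38 gives

$$\dot{V} = s^T\dot{s} + \mathrm{tr}\left(\tilde{W}^T\Gamma^{-1}\dot{\tilde{W}}\right) \tag{39}$$

Substituting Eqs. 30, 34 and 35, and using the identity $s^T\tilde{W}^T h = \mathrm{tr}(\tilde{W}^T h s^T)$, we get

$$\begin{aligned} \dot{V} &= -\varepsilon\|s\| - \lambda s^T s - s^T\left(\tilde{W}^T h + \eta + \eta_M\,\mathrm{sgn}(s)\right) + \mathrm{tr}\left(\tilde{W}^T\Gamma^{-1}\dot{\tilde{W}}\right) \\ &\le \mathrm{tr}\,\tilde{W}^T\left(\Gamma^{-1}\dot{\tilde{W}} - h s^T\right) - \left(s^T\eta + \eta_M\|s\|\right) = -\left(s^T\eta + \eta_M\|s\|\right) \le 0 \end{aligned} \tag{40}$$

The trace term vanishes because the adaptive law (Eq. 36) gives $\Gamma^{-1}\dot{\tilde{W}} = h s^T$, and the last inequality follows from the bound on $\eta$ in Eq. 33. It can be seen from Eq. 40 that the closed-loop system tends to be stable.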
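The RBF compensation of Eqs. 31, 32 and 36 can be sketched as follows. This is an illustrative discretization, not the authors' Simulink model: the adaptation gain is taken as a scalar `gamma` (that is, $\Gamma = \gamma I$), the adaptive law is integrated with an explicit Euler step of size `dt`, and all function names are hypothetical.

```python
import math

def rbf_features(x, centers, widths):
    """Gaussian basis h_j = exp(-||x - c_j||^2 / (2 b_j^2)) from Eq. 32."""
    h = []
    for c, b in zip(centers, widths):
        dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        h.append(math.exp(-dist2 / (2.0 * b * b)))
    return h

def rbf_output(W_hat, h):
    """Network output W_hat^T h (Eq. 31); W_hat is an m x n list of lists."""
    n = len(W_hat[0])
    return [sum(W_hat[j][k] * h[j] for j in range(len(h))) for k in range(n)]

def adapt_weights(W_hat, h, s, gamma, dt):
    """One explicit-Euler step of W_hat_dot = -gamma * h s^T (Eq. 36)."""
    return [[W_hat[j][k] - dt * gamma * h[j] * s[k] for k in range(len(s))]
            for j in range(len(h))]
```

In a control loop one would call `rbf_features` on the current network input, form the compensation term of Eq. 35 from `rbf_output`, and then call `adapt_weights` with the current sliding variable `s`. The choice of network input, centers and widths is left open here because the text does not specify them.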
4 Simulation
In this section, simulation experiments are designed to verify the effectiveness of the controller. The parameters of the simulation experiment are shown in Table 1. The control system of the Stewart platform is built in Simulink, as shown in Fig. 4.

Table 1 Simulation parameters
Parameters | Values
$r_B$ ($r_A$), $\phi$ ($\psi$) | 2.75 m (2.50 m), 15.83° (46.11°)
$I_A$ | diag(100177.7, 100177.7, 200044.3) kg·m²
$m_A$ | 44356.654 kg
…, $\varepsilon$, $\lambda$, $\eta_M$, … | 20·$I_6$, 0.05, 100, 0.2, 15

Fig. 4 Control system in Simulink

The desired trajectory is set to:

$$\begin{aligned} x(t) &= 0.2\sin(1.5\pi t) + 0.05, & y(t) &= 0.3\sin(\pi t) + 0.1 \\ z(t) &= 0.4\sin\left(2\pi\left(t + \tfrac{3\pi}{4}\right)\right) + 0.1, & \alpha(t) &= 4\sin\left(0.5\pi\left(t + \tfrac{\pi}{4}\right)\right) \\ \beta(t) &= 3\sin\left(1.5\pi\left(t + \tfrac{\pi}{4}\right)\right), & \gamma(t) &= 3\sin(0.5\pi t) + 1 \end{aligned}$$

To verify the robustness of the controller, Gaussian white noise with a standard deviation of 100 is introduced as interference. The tracking performance of the controller is shown in Fig. 5, and Fig. 6 shows the tracking error. It can be seen from Figs. 5 and 6 that the Stewart platform achieves trajectory tracking quickly even when the initial error is large, that the tracking error is very small after the response settles, and that the system remains stable under large interference. This shows that the adaptive sliding mode controller based on neural network compensation provides high control accuracy, fast response, and strong robustness for the Stewart platform, which indicates that the controller is reliable.
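For readers who want to reproduce the reference signal, the desired trajectory can be written as a small function. This sketch assumes the phase offsets are $3\pi/4$ for $z$ and $\pi/4$ for $\alpha$ and $\beta$, and that positions are in metres and angles in degrees, as the axis labels of Fig. 5 suggest; the function name is hypothetical.

```python
import math

def desired_pose(t):
    """Desired pose [x, y, z, alpha, beta, gamma] at time t (seconds).

    Positions are assumed to be in metres and angles in degrees.
    """
    x = 0.2 * math.sin(1.5 * math.pi * t) + 0.05
    y = 0.3 * math.sin(math.pi * t) + 0.1
    z = 0.4 * math.sin(2.0 * math.pi * (t + 3.0 * math.pi / 4.0)) + 0.1
    alpha = 4.0 * math.sin(0.5 * math.pi * (t + math.pi / 4.0))
    beta = 3.0 * math.sin(1.5 * math.pi * (t + math.pi / 4.0))
    gamma = 3.0 * math.sin(0.5 * math.pi * t) + 1.0
    return [x, y, z, alpha, beta, gamma]
```

Sampling `desired_pose` on a time grid over 0 to 2 s gives the reference curves that the controller is asked to track in the simulation.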
Fig. 5 Comparison between the expected trajectory and the actual trajectory of the Stewart platform (actual path and desired path plotted against time over 0–2 s): (a) X position (m); (b) Y position (m); (c) Z position (m); (d) X orientation (deg); (e) Y orientation (deg); (f) Z orientation (deg)

Fig. 6 6-DOF tracking error of the controller (plotted against time over 0–2 s): (a) X position error (m); (b) Y position error (m); (c) Z position error (m); (d) X orientation error (deg); (e) Y orientation error (deg); (f) Z orientation error (deg)

5 Conclusions
In this paper, an adaptive sliding mode controller based on neural network compensation is designed to control the Stewart platform. First, inverse kinematics is used to convert workspace control to joint space control, which facilitates the measurement of real-time parameters. Second, the inverse dynamics model of the joint space is derived, and a traditional sliding mode controller is designed. The neural network is introduced to compensate for the uncertain terms in the sliding mode controller, and the stability of the closed-loop system is proved based on Lyapunov theory. Finally, the control system is built in Simulink. The simulation results show that the adaptive sliding mode controller based on neural network compensation makes the Stewart platform robust to external disturbances and meets the design requirements.

Acknowledgements This work was supported by the International Science and Technology Cooperation Project of Guangdong Province under Grant 2022A0505050027, and the Project for high quality development of 6 marine industries of Department of Natural Resources of Guangdong Provincial (GDNRC[2023]32).
References
1. Ma, J., Yang, T., Hou, Z.G., Tan, M.: Neural network adaptive control of a Stewart mechanism-based active vibration isolation platform. Control Decis. 24(8), 1150–1155 (2009)
2. Cai, Y., Zheng, S., Liu, W., Qu, Z., Zhu, J., Han, J.: Sliding-mode control of ship-mounted Stewart platforms for wave compensation using velocity feedforward. Ocean Eng. 236, 109477 (2021)
3. Lafmejani, A.S., Masouleh, M.T., Kalhor, A.: Trajectory tracking control of a pneumatically actuated 6-DOF Gough-Stewart parallel robot using Backstepping-Sliding Mode controller and geometry-based quasi forward kinematic method. Robot. Comput.-Integr. Manuf. 54, 96–114 (2018)
4. Dai, X., Song, S., Xu, W., Huang, Z., Gong, D.: Modal space neural network compensation control for Gough-Stewart robot with uncertain load. Neurocomputing 449, 245–257 (2021)
5. Vu, M.T., Alattas, K.A., Bouteraa, Y., Rahmani, R., Fekih, A., Mobayen, S., Assawinchaichote, W.: Optimized fuzzy enhanced robust control design for a Stewart parallel robot. Mathematics 10(11), 1917 (2022)
6. Serrano, F., Caballero, A., Yen, K., Brezina, T.: Control of a Stewart platform with fuzzy logic and artificial neural network compensation. In: Recent Advances in Mechatronics, pp. 156–160 (2007)
7. Chen, W., Wang, S., Li, J., et al.: An ADRC-based triple-loop control strategy of ship-mounted Stewart platform for six-DOF wave compensation. Mech. Mach. Theory 184, 105289 (2023)
8. Hamid, D.T.: Parallel Robots: Mechanics and Control (2020). https://doi.org/10.1201/b16096
9. Liu, J.: RBF Neural Network Control For Mechanical Systems: Design, Analysis and MATLAB Simulation, 2nd edn (2018)
10. Van Nguyen, T., Ha, C.: RBF neural network adaptive sliding mode control of rotary Stewart platform. In: Intelligent Computing Methodologies: 14th International Conference, ICIC 2018, Wuhan, China, August 15–18, 2018, Proceedings, Part III 14, pp. 149–162 (2018)
11. Wang, Y., Qi, L., Wang, X.G., Liu, J.: Simulation experiment of flexible parallel robot control by RBF neural network based on sliding mode robust term. In: Proceedings of the 2017 2nd Asia-Pacific Conference on Intelligent Robot Systems (ACIRS), Wuhan, China, pp. 16–18 (2017)
12. Zhu, N., Xie, W., Shen, H.: Adaptive sliding mode control with RBF neural network-based tuning method for parallel robot. In: IECON 2022-48th Annual Conference of the IEEE Industrial Electronics Society, pp. 1–6 (2022)
13. Huang, C.I., Chang, C.F., Yu, M.Y., Fu, L.C.: Sliding-mode tracking control of the Stewart platform. In: 2004 5th Asian Control Conference (IEEE Cat. No. 04EX904), Vol. 1, pp. 562–569 (2004)
14. Chen, L.: Wholly dynamic modeling of 6-DOF Stewart platform parallel robot. J. Yanshan Univ. 28, 228–232 (2004)
15. Tian, H., Yu, Y.: Dynamics and trajectory tracking of a compliant parallel robot. J. Mech. Eng. 52(13), 38–46 (2016)
Research on Adversarial Robustness Properties of Image Classification Networks Based on Deep Vision Qiaoyi Li, Zhengjie Wang, Xiaoning Zhang, Hongbao Du, Bai Xu, and Yang Li
Abstract To address the significant performance decline of existing deep learning-based intelligent recognition algorithms under adversarial sample attacks, this research investigates the intrinsic mechanisms and description methods of adversarial samples. A quantitative linear characteristic analysis is conducted on the sub-operations of convolutional neural networks, a model is established to compute the output increment corresponding to perturbed inputs of each sub-operation, and the internal mechanism of adversarial sample generation is explored. Using the fast gradient sign method, sensitivity coefficients and offset coefficients are introduced in ResNet networks to establish a relationship model between input perturbations and outputs. The linear characteristics in high-dimensional space are shown to be the cause of adversarial sample generation. Finally, using the projected gradient descent method, a relationship model is established between the number of iterations and the outputs to solve the mapping between sensitivity coefficients and the number of attack iterations. This provides guidance for the design of deep learning attack-defense algorithms.
Keywords Deep learning · Adversarial examples · Fast gradient sign method · Projected gradient descent method
Q. Li · Z. Wang · X. Zhang · H. Du · B. Xu (B)
School of Mechatronical Engineering, Beijing Institute of Technology, Beijing 100081, China
e-mail: [email protected]
Y. Li
School of Mechanical and Electrical Engineering, North University of China, Taiyuan 030051, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4_75

1 Introduction
When we seek to deploy machine learning systems not only in virtual domains but also in real systems, it becomes essential to examine not only whether these systems work "most of the time," but also whether they are truly robust and reliable. Although deep learning can perform various computer vision tasks with high accuracy, Szegedy et al. [1] first discovered an interesting weakness of deep neural networks in image classification: some machine learning models, including state-of-the-art neural networks, are vulnerable to adversarial attacks that apply small perturbations to the images, perturbations that are imperceptible to the human visual system. The emergence of adversarial samples reveals the poor robustness of deep learning models. Gu [2] and Chalupka [3] took the first steps in designing models that resist adversarial perturbations, but no model has yet been able to improve robustness while maintaining accuracy on the original data. The explanations of adversarial samples given by these researchers mainly focus on the extreme nonlinearity and overfitting of deep neural networks.
Goodfellow et al. [4] proposed that linear behavior in high-dimensional space makes models vulnerable to adversarial attacks. Based on this idea, a method for rapidly generating adversarial samples was designed, and it was demonstrated that adversarial training with adversarial samples provides an additional regularization effect compared with using dropout alone. That work sparked widespread attention to adversarial attacks among researchers. Since then, many attack algorithms for generating adversarial samples have been developed, such as the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm [1], the fast gradient sign method (FGSM) [4], basic iterative attack/projected gradient descent (BIA/PGD) [5], distributionally adversarial attack (DAA) [6], the Carlini and Wagner (CW) attack [7], the Jacobian-based saliency map attack (JSMA) [8], and DeepFool [9].
Although many adversarial attack methods have been proposed, they mainly involve modifications to two strategies: (1) setting constraints on the perturbation norm, and (2) optimizing the perturbation under that norm. The causal mechanism of adversarial sample generation is therefore still not clear. This paper, for the first time, explores the intrinsic mechanisms and description methods of adversarial samples starting from the sub-operations of convolutional neural network models. It establishes a relationship model between the number of iterations and the outputs, solves the mapping between sensitivity coefficients and the number of attack iterations, and provides guidance for the design of deep learning attack-defense algorithms.
2 Adversarial Sample Internal Mechanism and Description Methods
Convolutional neural network structures include convolutional layers, pooling layers, and fully connected layers. Among them, the fully connected layer contains sub-operations such as normalization, linear transformation, and nonlinear transformation. To identify the internal mechanism of adversarial sample generation, an independent linear characteristic analysis is carried out on each sub-operation.
2.1 Linear Characteristic Analysis of Normalization and Standardization Operations
Consider an adversarial example:

$$\tilde{x} = x + \eta \tag{1}$$

where $x$ is the original image matrix and $\eta$ is the perturbation matrix with $\|\eta\|_\infty < \varepsilon$. Here $\varepsilon$ is the perturbation threshold and $\|\eta\|_\infty$ is the infinity norm of the perturbation matrix $\eta$.

Normalization Operation Let $\alpha = x_{max} - x_{min}$ and $\beta = x_{min}$ (for a given image matrix, both are constants). The normalized image matrix is then $x_{norm} = (x_i - \beta)/\alpha$. Normalization is therefore a linear transformation that scales by $1/\alpha$ and then shifts by $\beta/\alpha$ units. Here, $x_{max}$ is the maximum value in the image matrix, $x_{min}$ is the minimum value in the image matrix, and $x_i$ is the $i$-th value after the image matrix is flattened.

Standardization Operation Before being fed into the network for training, the image is first standardized. The standardized image matrix after feature standardization is:

$$x_{standard} = \frac{x_{norm} - \mu}{\sigma} \tag{2}$$

Here, $\mu$ is the mean of the normalized image matrix and $\sigma$ is its standard deviation. When the training dataset is given, $\mu$ and $\sigma$ are fixed values. Standardization is therefore a linear transformation that scales by $1/\sigma$ and then shifts by $\mu/\sigma$ units.
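The affine nature of the two preprocessing steps can be checked numerically. The sketch below treats $\alpha$, $\beta$, $\mu$ and $\sigma$ as fixed constants, as the text does (in particular, it assumes the perturbation does not change $x_{max}$ or $x_{min}$); the function names and sample values are illustrative only.

```python
def normalize(x, alpha, beta):
    """Min-max normalization x_norm = (x_i - beta) / alpha."""
    return [(v - beta) / alpha for v in x]

def standardize(x_norm, mu, sigma):
    """Feature standardization of Eq. 2."""
    return [(v - mu) / sigma for v in x_norm]

def preprocess(x, alpha, beta, mu, sigma):
    """Normalization followed by standardization, an affine map of x."""
    return standardize(normalize(x, alpha, beta), mu, sigma)

def output_increment(x, eta, alpha, beta, mu, sigma):
    """Change in the preprocessed output caused by the perturbation eta."""
    clean = preprocess(x, alpha, beta, mu, sigma)
    perturbed = preprocess([v + e for v, e in zip(x, eta)], alpha, beta, mu, sigma)
    return [p - c for p, c in zip(perturbed, clean)]
```

Because the composite map is affine, each element of the increment equals $\eta_i / (\alpha\sigma)$ regardless of the clean pixel value, so the perturbation passes through preprocessing as a pure rescaling.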
2.2 Linear Characteristic Analysis of Convolutional Layer Operations
Convolutional Operation of Convolutional Layer The input shape of the convolutional operation is $n_h \times n_k$ and the shape of the convolutional window is $k_h \times k_w$. A total of $p_h$ rows of padding are added in the height dimension and a total of $p_w$ columns in the width dimension. If the stride in the height dimension is $s_h$ and the stride in the width dimension is $s_w$, the output shape will be

$$\left\lfloor \frac{n_h - k_h + p_h + s_h}{s_h} \right\rfloor \times \left\lfloor \frac{n_k - k_w + p_w + s_w}{s_w} \right\rfloor \tag{3}$$

For this analysis, we take the input shape as $3 \times n_h \times n_k$ and use a $1 \times 1$ convolutional window with one output channel, padding $p_h = 0$, $p_w = 0$, and stride $s_h = 1$, $s_w = 1$. The final output shape will be $n_h \times n_k$ (Fig. 1).
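The output-shape formula of Eq. 3 is easy to evaluate with integer floor division. The helper below is a small sketch with a hypothetical name; the padding arguments are the total padding per dimension, following the formula.

```python
def conv_output_shape(n_h, n_k, k_h, k_w, p_h=0, p_w=0, s_h=1, s_w=1):
    """Output height and width of a convolution or pooling window (Eq. 3).

    p_h and p_w are the total number of padded rows and columns.
    """
    out_h = (n_h - k_h + p_h + s_h) // s_h
    out_w = (n_k - k_w + p_w + s_w) // s_w
    return out_h, out_w
```

For the $1 \times 1$ window with unit stride and no padding used in this analysis, the function returns the input shape unchanged, for example `conv_output_shape(32, 32, 1, 1)` gives `(32, 32)`.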
Fig. 1 Cross-correlation calculation using a 1 × 1 convolutional kernel with input channels of 3 and output channels of 1
The convolutional operation can be expressed as follows:

$$z = \tilde{x}_{standard} \otimes \omega + b = x_{standard} \otimes \omega + \eta \otimes \omega + b \tag{4}$$

Here, $\tilde{x}_{standard}$ is the standardized adversarial sample matrix, $\omega$ is the weight matrix of the convolutional layer, $b$ is the bias matrix of the convolutional layer, and $\otimes$ represents the convolutional operation. The increment of the output value of the convolutional operation is $\eta \otimes \omega$. If the average magnitude of the elements in the convolutional kernel is $m$, the maximal increment of the convolutional operation is $\varepsilon m \cdot 3 n_h n_k$. Here, $\varepsilon$ is a fixed value, and the parameter $m$ is determined by the training process. Therefore, the convolutional operation is an affine transformation, and its increment grows linearly with the input dimension.

Batch Normalization Operation of Convolutional Layer After the convolution calculation and before applying the activation function, the data is normalized in batches. The linear analysis is the same as the analysis of the standardization operation in Sect. 2.1.

Linear Characteristic Analysis of Nonlinear Operations in Convolutional Layers

$$a = \phi(z_{norm}) \tag{5}$$

The function $\phi$ is a nonlinear activation function, usually ReLU, sigmoid, tanh, etc., and $z_{norm}$ is the matrix after the batch normalization operation in the convolutional layer. Taking sigmoid as the nonlinear activation function:

$$\mathrm{sigmoid}(z_{norm} + \varepsilon) = \frac{1}{1 + \exp(-z_{norm} - \varepsilon)} \tag{6}$$

The increment is $\mathrm{sigmoid}(z_{norm} + \varepsilon) - \mathrm{sigmoid}(z_{norm})$.
Fig. 2 Linear error analysis of the sigmoid function: (a) analysis of error values; (b) analysis of error percentage

The series representation of the sigmoid function can be simplified to

$$\mathrm{sigmoid}(z_{norm}) \approx \frac{1}{2} + \frac{1}{4} z_{norm} - \frac{1}{48} z_{norm}^{3} + \frac{1}{480} z_{norm}^{5} - \frac{17}{80640} z_{norm}^{7} + \frac{31}{1451520} z_{norm}^{9} - \frac{691}{319334400} z_{norm}^{11} + o(z_{norm}^{12}) \tag{7}$$
To avoid the problem of vanishing gradients and facilitate optimization, the input distribution of each hidden layer neuron, which gradually drifts toward the saturation region of the nonlinear function, is forced back to a standard normal distribution with mean 0 and variance 1, so that the input values of the nonlinear transformation fall into the region where it is more sensitive to its inputs. Therefore, the input range $[-1, 1]$, which has a probability of 0.683, is taken for the analysis. Taking $s_1 = \frac{1}{2} + \frac{1}{4} z_{norm}$ and referring to Fig. 2, the error between the function $\mathrm{sigmoid}(z_{norm})$ and the linear function $s_1$ lies in the range $[0, 0.019]$. The maximum error of 0.019 between the sigmoid function and the linear function $s_1$ occurs at $z_{norm} = -1$ and $z_{norm} = +1$, and the average error is 0.005. Furthermore, the maximum error percentage under this condition is 7%, while the average error percentage is 1%.
To sum up: the sigmoid function is approximately equivalent to a linear function when the input range is $[-1, 1]$, and its increment increases linearly with the input dimension.
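The error figures quoted above can be reproduced with a short numerical check. The sketch below samples the interval $[-1, 1]$ on a uniform grid (the grid size is an arbitrary choice) and compares the sigmoid with the linear function $s_1$; the function names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic sigmoid function."""
    return 1.0 / (1.0 + math.exp(-z))

def s1(z):
    """First-order truncation of the series in Eq. 7."""
    return 0.5 + 0.25 * z

def linear_error_stats(n=2001, lo=-1.0, hi=1.0):
    """Return (max abs error, mean abs error, max relative error) on [lo, hi]."""
    zs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    errs = [abs(sigmoid(z) - s1(z)) for z in zs]
    rel = [e / sigmoid(z) for e, z in zip(errs, zs)]
    return max(errs), sum(errs) / n, max(rel)
```

Calling `linear_error_stats()` gives a maximum absolute error of about 0.019, a mean absolute error of about 0.005, and a maximum relative error of about 7%, in line with the values read from Fig. 2. The largest relative error occurs at $z_{norm} = -1$, where the sigmoid value itself is smallest.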
2.3 Analysis of Linear Characteristics of Pooling Operation The input shape of the pooling layer operation is n h × n k , and the shape of the pooling window is generally k p × kq = 2 × 2. A total of 2 rows are filled on both sides of the height and 2 columns are filled on both sides of the width. When the step length of the height is s p = 1 and the step length of the width is sq = 1, the output shape will be (Fig. 3).
Q. Li et al.
Fig. 3 Schematic diagram of pooling operation
$$\left\lfloor \frac{n_h - k_p + p_p + s_p}{s_p} \right\rfloor \times \left\lfloor \frac{n_k - k_q + p_q + s_q}{s_q} \right\rfloor = (n_h - 1) \times (n_k - 1) \qquad (8)$$
By feature invariance, the pooling operation reduces the feature dimension from $n_h \times n_k$ to $(n_h - 1) \times (n_k - 1)$. Therefore, relative to the convolutional output increment, the reduction of the maximum perturbation increment under average pooling and maximum pooling is

$$n_h n_k - (n_h - 1)(n_k - 1) = n_h + n_k - 1 \qquad (9)$$
As you can see, the increment increases linearly with the input dimension.
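The output-shape formula and the resulting dimension reduction can be sketched in a few lines. This is an illustration only: the padding terms are set to zero here, an assumption chosen because it reproduces the $(n_h - 1) \times (n_k - 1)$ output shape for a $2 \times 2$ window with stride 1, and the function names are hypothetical. Expanding the product gives a reduction of $n_h + n_k - 1$, which is linear in the input side lengths.

```python
def pool_output_shape(n_h, n_k, k_p=2, k_q=2, p_p=0, p_q=0, s_p=1, s_q=1):
    """Output shape of a pooling layer.

    n_h, n_k: input height and width; k_p, k_q: window size;
    p_p, p_q: total padding (assumed zero here); s_p, s_q: strides.
    """
    out_h = (n_h - k_p + p_p + s_p) // s_p
    out_k = (n_k - k_q + p_q + s_q) // s_q
    return out_h, out_k

def dimension_reduction(n_h, n_k):
    """Number of feature positions removed by 2x2, stride-1 pooling."""
    out_h, out_k = pool_output_shape(n_h, n_k)
    return n_h * n_k - out_h * out_k

for n in (8, 28, 224):
    # The reduction equals n_h + n_k - 1 for a square n x n input.
    print(n, pool_output_shape(n, n), dimension_reduction(n, n), 2 * n - 1)
```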
2.4 Linear Characteristics Analysis of the Linear Transformation Operation of the Fully Connected Layer
Consider the dot product between the weight vector $w_p$ of the fully connected layer and the pooled adversarial example $\tilde{x}_p$:

$$z_l = w_p^{T} \tilde{x}_p + b_p \qquad (10)$$
Here $b_p$ represents the bias term of the fully connected layer. The increment of the output of the linear transformation function is $w_p^{T} \eta$. We can maximize this increment, subject to the max-norm constraint $\|\eta\|_\infty \le \epsilon$, by assigning $\eta = \epsilon\,\mathrm{sign}(w_p)$. Suppose the dimension of $w_p$ is $n_p$ and the average magnitude of the elements of the weight vector of the fully connected layer is $m_p$; then the maximum increment of the linear transformation function is $\epsilon m_p n_p$. The increment of the linear transformation function therefore increases linearly with the input dimension: for high-dimensional problems, a small change in each input component can add up to a very large change in the output.
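The linear growth of the maximum increment with the dimension can be illustrated numerically. The sketch below assumes a max-norm budget written as `eps` (the symbol, its value, and the random Gaussian weights are illustrative assumptions, not values from the experiments); it checks that the increment produced by the sign perturbation equals the budget times the mean absolute weight times the dimension.

```python
import numpy as np

def max_linear_increment(w, eps):
    """Largest value of w^T eta subject to ||eta||_inf <= eps."""
    eta = eps * np.sign(w)  # every component pushes in the direction of its weight
    return float(w @ eta)

rng = np.random.default_rng(0)
eps = 0.007  # illustrative max-norm budget
for n_p in (10, 100, 1000, 10000):
    w = rng.normal(size=n_p)
    m_p = np.abs(w).mean()  # average magnitude of the weights
    inc = max_linear_increment(w, eps)
    # inc coincides with eps * m_p * n_p, so it scales with the dimension n_p
    print(n_p, round(inc, 4), round(eps * m_p * n_p, 4))
```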
2.5 Linear Characteristics Analysis of Nonlinear Activation Function
The analysis is the same as for the convolution layer. Conclusion: $\|\eta\|_\infty$ does not change with the dimension, but the perturbation introduced by the activation function (whether linear or nonlinear) increases linearly with the dimension. For high-dimensional problems, many infinitesimal changes in the input therefore add up to a large change in the output. This explains why a convolutional neural network can admit adversarial examples whenever its input has sufficient dimension: adversarial examples arise from linear behavior in high-dimensional spaces.
3 Linear Perturbation of Neural Network Model
The convolutional neural network is too linear to resist linear adversarial perturbations, which leads to a significant decline in prediction confidence.
3.1 Introduction to Symbols
The model $h_\theta : \mathcal{X} \to \mathbb{R}^{k}$ is a mapping from the input space (three-dimensional tensors) to the output space, a $k$-dimensional vector, where $k$ is the number of classes to be predicted. The vector $\theta$ represents all parameters that define the model (i.e., all convolution filters, weight matrices of fully connected layers, bias values, etc.). We define the loss function $\ell : \mathbb{R}^{k} \times \mathbb{Z}_{+} \to \mathbb{R}_{+}$ as a mapping from the model prediction and the true label to a non-negative number:

$$\ell(h_\theta(x), y) \qquad (11)$$
For the input $x \in \mathcal{X}$ and the true class $y \in \mathbb{Z}$, $\ell(h_\theta(x), y)$ represents the loss the classifier incurs in its prediction on $x$, assuming the true class is $y$. By far the most common form of loss used in deep learning is the cross-entropy loss, defined as
$$\ell(h_\theta(x), y) = \log\left(\sum_{j=1}^{k} \exp\left(h_\theta(x)_j\right)\right) - h_\theta(x)_y \qquad (12)$$

where $h_\theta(x)_j$ represents the $j$th element of the vector $h_\theta(x)$.
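The cross-entropy loss in this log-sum-exp form can be evaluated directly from the output vector. The following is a minimal NumPy sketch; the function name and the max-shift used for numerical stability are illustrative additions and do not change the value of the expression.

```python
import numpy as np

def cross_entropy(logits, y):
    """log(sum_j exp(logits_j)) - logits_y for a single example.

    logits: the k-dimensional model output; y: index of the true class.
    """
    logits = np.asarray(logits, dtype=float)
    shift = logits.max()  # subtracting the maximum avoids overflow in exp
    log_sum_exp = shift + np.log(np.exp(logits - shift).sum())
    return float(log_sum_exp - logits[y])

# A confident, correct prediction gives a small loss; a wrong one a large loss.
print(cross_entropy([8.0, 1.0, 0.5], 0))
print(cross_entropy([8.0, 1.0, 0.5], 1))
```

The loss equals the negative log of the softmax probability assigned to the true class, so it is zero only when that probability is one.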
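3.2 Fast Gradient Descent Method
We use the fast gradient descent method for adversarial attacks. First calculate the gradient of the loss with respect to the perturbation $\eta$:

$$g := \nabla_\eta\, \ell(h_\theta(x + \eta), y) \qquad (13)$$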
In order to maximize the loss, we want to adjust $\eta$ in the direction of this gradient, i.e. take a step

$$\eta := \eta + \alpha g \qquad (14)$$

for some step size $\alpha$; the perturbation is then projected back into the region defined by the norm constraint $\|\eta\| \le \epsilon$. Let us consider the special case of the infinity norm, $\|\eta\|_\infty \le \epsilon$, such that each component of $\eta$ lies in the range $[-\epsilon, \epsilon]$. If the initial value is zero and the step size is large enough, this gives the update (Fig. 4)

$$\eta := \epsilon \cdot \mathrm{sign}(g) \qquad (15)$$

Based on the fast gradient descent method, the step length is adjusted and the relationship between the single-step disturbance value and the output probability is obtained. As can be seen from Fig. 5, for the ResNet classification networks, when the disturbance value is within the range [0, 0.2], the single-step attack does not succeed, and the output label is always correct. The output confidence of the correct label tends to go down and then up. The inflection point of ResNet18 appears in [0.066, 0.068], that of ResNet34 in [0.036, 0.038], that of ResNet50 in [0.044, 0.046], and that of ResNet101 in [0.050, 0.060]. The meaning of
Fig. 4 Demonstration of a network attack based on the fast gradient descent method on the ResNet34 network, using an image from the ImageNet dataset. An image identified as a hog with 99.998% confidence is identified with only 70.614% confidence after adding a perturbation vector too small to detect.
Fig. 5 Relationship between single-step perturbation value and output probability: (a) ResNet18; (b) ResNet34; (c) ResNet50; (d) ResNet101
the inflection point in the figure is that the gradient direction changes at this point. In this analysis, the reversed gradient is temporarily ignored, and for each network only the data before the lower bound of its inflection-point interval are used to analyze the linear characteristics. The fit to the sampled data points is shown in Fig. 6. It can be seen that, when the disturbance is smaller than the disturbance corresponding to the inflection point, the relationship between the disturbance and the output value of the ResNet networks can be approximated by a linear function $y_p = k_y x_p + b_y$, where $k_y$ is the sensitivity coefficient and $b_y$ is the offset coefficient. The goodness of fit $R^2$ was used to evaluate the fitting effect, where (Figs. 5 and 6)

$$R^2 = 1 - \frac{\sum_{j=1}^{n} (y_j - \hat{y}_j)^2}{\sum_{j=1}^{n} (y_j - \bar{y})^2} \qquad (16)$$

where $y_j$ denotes the sampled true value, $\bar{y}$ denotes the average of the samples, and $\hat{y}_j$ denotes the fitted value. The minimum goodness of fit for the ResNet networks is 86%, and the maximum is 98.9% (Table 1). By establishing the relationship model between the input disturbance and the output of these typical networks, it can be seen that within a certain disturbance range,
Fig. 6 Linear analysis relationship between single-step perturbation value and output probability: (a) ResNet18; (b) ResNet34; (c) ResNet50; (d) ResNet101
Table 1 Fitting parameters
Network      Goodness of fit R^2    Sensitivity coefficient k_y    Migration coefficient b_y
ResNet18     0.880                  −9.085                         0.968
ResNet34     0.893                  −18.087                        0.916
ResNet50     0.861                  −15.590                        0.932
ResNet101    0.989                  −6.277                         1.013
the network output changes linearly with the input disturbance, and the absolute value of the sensitivity coefficient measures the sensitivity of the network to the disturbance. The order of sensitivity is ResNet101 > ResNet34 > ResNet18 > ResNet50, which can be used as a basis for the linear interpretation of adversarial examples.
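The procedure of this subsection (a single signed-gradient step of varying size, followed by a linear fit of output probability against disturbance and an $R^2$ check) can be sketched on a toy model. Everything below is an assumption made for illustration: a small linear softmax classifier with random weights stands in for the ResNet networks, the budget grid is arbitrary, and the helper names are hypothetical. It does not reproduce the values in Table 1.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_and_input_grad(W, b, x, y):
    """Cross-entropy loss of the toy model h(x) = W x + b and its gradient w.r.t. x."""
    p = softmax(W @ x + b)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    return float(-np.log(p[y])), W.T @ (p - onehot)

def fgsm(W, b, x, y, eps):
    """Single-step perturbation eps * sign(gradient of the loss w.r.t. the input)."""
    _, g = loss_and_input_grad(W, b, x, y)
    return eps * np.sign(g)

def fit_line_r2(xs, ys):
    """Least-squares line y = k x + b0 and its goodness of fit R^2."""
    xs, ys = np.asarray(xs, dtype=float), np.asarray(ys, dtype=float)
    k, b0 = np.polyfit(xs, ys, 1)
    pred = k * xs + b0
    ss_res = np.sum((ys - pred) ** 2)
    ss_tot = np.sum((ys - ys.mean()) ** 2)
    return float(k), float(b0), float(1.0 - ss_res / ss_tot)

rng = np.random.default_rng(0)
W, b = 0.1 * rng.normal(size=(5, 50)), np.zeros(5)  # toy stand-in for a network
x = rng.normal(size=50)
y = int(np.argmax(W @ x + b))  # treat the clean prediction as the true label

eps_grid = np.linspace(0.0, 0.05, 11)  # single-step disturbance values
probs = np.array([softmax(W @ (x + fgsm(W, b, x, y, e)) + b)[y] for e in eps_grid])
k_y, b_y, r2 = fit_line_r2(eps_grid, probs)
print(k_y, b_y, r2)
```

In this toy setting the slope plays the role of the sensitivity coefficient and the intercept that of the offset coefficient; for a real network the same sweep would be run on the model's softmax output.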
3.3 Relationship Model Between Iteration Times and Output Based on Stochastic Gradient Descent Method
Training a classifier means optimizing the parameters $\theta$ to minimize the average loss over a training set of $m$ examples. We express this as the optimization problem

$$\arg\min_{\theta} \frac{1}{m} \sum_{i=1}^{m} \ell(h_\theta(x_i), y_i) \qquad (17)$$
We use the stochastic gradient descent method [3] to solve this problem. That is, for some minibatch $B \subseteq \{1, 2, \ldots, m\}$, we calculate the gradient of the loss with respect to the parameters $\theta$ and make a small adjustment in the negative gradient direction:

$$\theta := \theta - \frac{\alpha}{|B|} \sum_{i \in B} \nabla_\theta\, \ell(h_\theta(x_i), y_i) \qquad (18)$$

where $\alpha$ is the step size. We repeat this process over the entire training set until the parameters converge. The gradient $\nabla_\theta\, \ell(h_\theta(x_i), y_i)$ measures how a small adjustment of each parameter in $\theta$ affects the loss function. Using backpropagation, we can equally calculate the gradient of the loss function with respect to the input $x_i$. This quantity tells us how small changes in the image itself affect the loss function, and it is exactly what we use to construct an adversarial example. But instead of adjusting the image to minimize the loss, as we do when optimizing the network parameters, we adjust the image to maximize the loss. In other words, we need to solve the optimization problem

$$\arg\max_{\tilde{x}} \ell(h_\theta(\tilde{x}), y) \qquad (19)$$
Here $\tilde{x}$ represents our adversarial example, which tries to maximize the loss function. We need to ensure that $\tilde{x}$ is close to the original input $x$, so we optimize over the perturbation $\eta$:

$$\arg\max_{\eta \in \varphi} \ell(h_\theta(x + \eta), y) \qquad (20)$$
where $\varphi$ represents the allowable set of perturbations. A common perturbation set is the one defined by the $\ell_\infty$ norm:

$$\varphi = \{\eta : \|\eta\|_\infty \le \epsilon\} \qquad (21)$$
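The constrained maximization over the perturbation set can be approximated by repeated gradient steps on the perturbation, each followed by clipping back into the max-norm ball. The sketch below is an illustration under stated assumptions: a toy linear softmax model with random weights replaces the ResNet networks, and `iterative_attack`, `lr`, `eps` and `steps` are hypothetical names and defaults chosen for the example, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def loss_and_input_grad(W, b, x, y):
    """Cross-entropy loss of the toy model h(x) = W x + b and its gradient w.r.t. x."""
    p = softmax(W @ x + b)
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    return float(-np.log(p[y])), W.T @ (p - onehot)

def iterative_attack(W, b, x, y, eps=0.007, lr=0.1, steps=100):
    """Gradient ascent on the loss over eta, projected onto ||eta||_inf <= eps."""
    eta = np.zeros_like(x)
    history = []
    for _ in range(steps):
        loss, g = loss_and_input_grad(W, b, x + eta, y)
        history.append(loss)
        # ascent step, then projection onto the max-norm ball by clipping
        eta = np.clip(eta + lr * g, -eps, eps)
    final_loss, _ = loss_and_input_grad(W, b, x + eta, y)
    history.append(final_loss)
    return eta, history

rng = np.random.default_rng(0)
W, b = 0.1 * rng.normal(size=(5, 50)), np.zeros(5)  # toy stand-in for a network
x = rng.normal(size=50)
y = int(np.argmax(W @ x + b))
eta, history = iterative_attack(W, b, x, y)
print(history[0], history[-1], np.abs(eta).max())
```

For this toy model the loss is convex in the input, so each clipped ascent step cannot decrease it; for a deep network no such guarantee holds, and the loss history has to be inspected empirically.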
The learning rate of this test was set to 1e-1, and the disturbance limit was set to 0.007. The test results are shown below (Fig. 7). For ResNet18, after 125 attack iterations the probability of hog is 0.576 and that of armadillo is 0.198; after 126 attack iterations the probability of hog is 0.004 and that of armadillo is 0.831, so the output category of the neural network model is armadillo (Fig. 8).
Fig. 7 ResNet18 network iteration times and output relationship: (a) the number of iterations and output probability; (b) the number of iterations and the output value
Fig. 8 ResNet34 network iteration times and output relationship: (a) the number of iterations and output probability; (b) the number of iterations and the output value
Fig. 9 ResNet50 network iteration times and output relationship: (a) the number of iterations and output probability; (b) the number of iterations and the output value
For ResNet34, after 14 attack iterations the probability of hog is 0.004 and the probability of wombat is 0.050, and the output category of the neural network model is wombat. After 15 attack iterations the probabilities of wombat and corn are 0.004 and 0.140, and the output category is corn. After 26 attack iterations the probability of ear is 0.537 and the probability of corn is 0.462, and the output category is ear (Fig. 9). For ResNet50, after 17 attack iterations the probability of hog is 0.481; after 18 attack iterations the probability of hog is 0.018 and the probability of wombat is 0.530, so the output category is wombat (Fig. 10). For ResNet101, after 70 attack iterations the probability of hog is 0.484 and that of warthog is 0.196; after 71 attack iterations the probability of hog is 0.040 and the probability of wombat is 0.820, so the output category is wombat. As can be seen from Figs. 7 to 10, the number of iterations required for a successful attack is ordered ResNet18 > ResNet101 > ResNet50 > ResNet34. In the preceding section, the
Fig. 10 ResNet101 network iteration times and output relationship: (a) the number of iterations and output probability; (b) the number of iterations and the output value
sensitivity of the networks to disturbance was ordered ResNet101 > ResNet34 > ResNet18 > ResNet50. There is no obvious mapping between the degree of sensitivity and the number of iterations required for a successful attack. Analyzing the reasons, the sensitivity of the network output to the disturbance depends not only on the output for the correct class, but also on the sensitivity of the outputs for the other classes. In addition, the disturbance limit set in the relationship model between the number of iterations and the output based on the stochastic gradient descent algorithm is 0.07. In the initial iterations, the output probability of the correct label remains basically unchanged. After a certain number of iterations, the output probability changes abruptly and the output label finally changes. The reason is that, in the early stage of the iteration, the gradient of the loss function is not sensitive to the input, so more iterations are needed to complete the attack. Finally, by observing the trends of the output value and the output probability, it can be found that the output value is more unstable and fluctuates more than the output probability. This shows that an increase or decrease of the output value does not imply an increase or decrease of the output probability, because the output values of the whole network are changing as well.
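The last observation follows from the softmax normalization: the probability of a class depends on its output value relative to all the others. A minimal sketch with made-up output values (the numbers are illustrative assumptions, not measurements from the experiments) shows the correct-class output value rising while its probability falls.

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical output values for three classes; class 0 is the correct one.
before = np.array([5.0, 2.0, 1.0])
after = np.array([6.0, 7.0, 1.0])  # class 0 rises, but class 1 rises faster

p_before = softmax(before)[0]
p_after = softmax(after)[0]
print(p_before, p_after)
```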
4 Summary
To summarize the contributions of this work:
(1) The linear characteristics of the sub-operations of the convolutional neural network are quantitatively analyzed, and an incremental model of the perturbation input-output of the sub-operations is established, which supports the view that the linear behavior of the model in high-dimensional space makes it vulnerable to adversarial examples.
(2) Based on the fast gradient descent method, a relationship model between input disturbance and output is established, which provides a theoretical basis for the subsequent design of attack algorithms according to the relationship between disturbance and output.
(3) Based on the stochastic gradient descent algorithm, a relationship model between the number of iterations and the output is established, which provides another way of thinking for optimizing adversarial attack algorithms. By optimizing the loss function, the iterative attack coefficient can be reduced and the attack efficiency can be improved.
References
1. Szegedy, C., Zaremba, W., Sutskever, I., et al.: Intriguing properties of neural networks. ICLR (2014). arXiv:1312.6199v4
2. Gu, S., Rigazio, L.: Towards deep neural network architectures robust to adversarial examples. In: NIPS Workshop on Deep Learning and Representation Learning (2014). arXiv:1412.5068v4
3. Chalupka, K., Perona, P., Eberhardt, F.: Visual causal feature learning. Comput. Sci. (2014). arXiv:1412.2309v2
4. Goodfellow, I.J., Shlens, J., Szegedy, C.: Explaining and harnessing adversarial examples. ICLR (2015). arXiv:1412.6572v3
5. Madry, A., Makelov, A., Schmidt, L., et al.: Towards deep learning models resistant to adversarial attacks. ICLR (2018). arXiv:1706.06083v4
6. Zheng, T., Chen, C., Ren, K.: Distributionally adversarial attack (2018). arXiv:1808.05537v3
7. Carlini, N., Wagner, D.: Towards evaluating the robustness of neural networks. In: IEEE Symposium on Security and Privacy, pp. 39–57 (2017). https://doi.org/10.1109/SP.2017.49
8. Papernot, N., McDaniel, P., Jha, S., et al.: The limitations of deep learning in adversarial settings. In: Proceedings of IEEE European Symposium on Security and Privacy, pp. 372–387 (2016). https://doi.org/10.1109/EuroSP.2016.36
9. Moosavi-Dezfooli, S., Fawzi, A., Frossard, P.: DeepFool: a simple and accurate method to fool deep neural networks. In: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 2574–2582 (2016). https://doi.org/10.1109/CVPR.2016.282
Author Index
A An, Linxue, 595 Ao, Xiang, 617 Ao, Yichao, 653
B Bin, Yang, 877
C Cai, Qiang, 853 Cai, Xuan, 893 Cao, Ying, 853 Cao, Zhongping, 401 Chai, Yi, 59, 789 Chao, He, 39 Chen, Dongmei, 433 Cheng, Lei, 251 Cheng, Yuqiao, 495, 779 Chen, Hang, 789 Chen, Jing, 455 Chen, Junhui, 305 Chen, Lu, 443 Chen, Qi, 379 Chen, Rui, 359 Chen, Tao, 17 Chen, Wenjie, 81 Chen, Xin, 465 Chen, Xinkai, 295 Chen, Xinmin, 17, 221 Chen, Xue, 519 Chen, Yangzhou, 679 Chen, Yimei, 335 Chen, Zengqiang, 537
Chen, Zitao, 829 Cui, Yue Lei, 201, 509
D Dongwei, Li, 627 Du, Dongsheng, 433 Du, Hongbao, 937 Du, Xiaokai, 717
F Fan, Kexin, 893 Fu, Jian, 693, 747 Fu, Yongling, 495, 779
G Gao, Fugen, 609 Gao, Guanbin, 295 Gao, Hongyu, 693 Gao, Shiyao, 443 Gao, Xuehui, 139 Geng, Haoruo, 829 Guizhen, Kong, 627 Guo, Caixiang, 241 Guo, Hao, 519 Guo, Shouyi, 747 Guo, Xuemei, 401
H Han, Shuyue, 595 Hao, Fei, 151 Hao, Li, 729
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1090, https://doi.org/10.1007/978-981-99-6882-4
He, Yufeng, 595 Hou, Bingbing, 251 Hou, Linyuan, 579 Hou, Xuzhao, 479 Hua, Jinxing, 151 Huajin, Zhang, 627 Hua, Tiedan, 251 Hu, Cong, 571 Hu, MingZhe, 27 Hu, Zhongyi, 27, 665
I Ichiye, Toshiko, 559
J Ji, Bingyu, 913 Ji, Yaping, 285 Jia, Feng, 527 Jia-jia, Feng, 1 Jia, Longfei, 455 Jiang, Junce, 415 Jiang, Zhihao, 81 Jigui, Zheng, 163, 877 Jun, Xie, 1
L Lei, Xin, 693, 747 Liang, Jianqiang, 321 Liang, Min, 285 Liangmushage, A., 443 Li, Baoquan, 335 Li, Bo, 97 Li, Duansong, 193, 211 Li, Fengyun, 251 Li, Haisheng, 231, 853 Li, Hao, 455 Li, Hongcai, 479 Li, HuaTao, 27 Li, Jin, 241 Li, Jinfeng, 815 Li, Juntao, 609 Li, Kan, 305 Li, Lin, 349, 369, 841 Li, Mingxing, 321 Lin, Yan, 81, 349, 369, 841 Lin, Yue, 17 Li, Qiaoyi, 937 Li, ShuRong, 761 Li, Teng, 305 Li, Wenwen, 321 Liu, Bailang, 559
Liu, Bingshan, 579 Liu, Feng, 519 Liu, Jingyi, 265 Liu, Liang, 617 Liu, Ruijun, 231 Liu, Taihao, 707 Liu, Wei, 433 Li, Xiaobin, 527 Li, Yang, 937 Li, Yicheng, 579 Luan, Fujin, 295 Lu, Juanzhi, 193, 211 Lu, Meichen, 59 Luo, Liang, 201, 509 Luo, Wenwei, 97 Lu, Weitao, 537
M Ma, Baoli, 113 Ma, Hanjie, 665 Ma, Jirong, 193, 211 Ma, Mengxu, 853 Ma, Xin, 549 Ma, YingKai, 761 Ma, Yue, 479, 801 Mei, Guangyu, 401 Miao, Zhonghua, 125 Mu, Chengmei, 69
N Na, Jing, 295 Nie, Feiyan, 665
P Pang, Yufei, 641
Q Qiang, Chen, 1 Qiang, Tang, 729 Qiang, Yang, 627 Qing, Tian, 877 Quan, Junyu, 455
R Ren, Liping, 595 Rong, Zhang, 163
S Shen, Yingqi, 305
Shuai, Liu, 39 Si, Guolei, 305 Situ, Weilun, 925 Song, Yunzhong, 707 Sun, Chen, 841 Sun, Hao, 537 Sun, Jian, 495, 779 Sun, Qinglin, 537 Sun, Sixian, 595 Sun, Xiantao, 81 Sun, Zhanquan, 175 Su, Shan, 443
T Tan, Ying, 125 Tao, Jiang, 39 Temuer, Chaolu, 97 Teng, Xiaojing, 559, 635
W Wang, Chaoli, 175 Wang, Chen, 853 Wang, Fuzhong, 707 Wang, Guoli, 401 Wang, Huaiqing, 641 Wang, Jie, 113 Wang, Jiqiang, 221 Wang, Jun, 913 Wang, Liangming, 747 Wang, Liming, 617 Wang, Na, 201, 509 Wang, Qin, 829 Wang, Shuangxin, 265 Wang, Siqi, 97 Wang, Wenfeng, 801 Wang, Wenle, 251 Wang, Xiaolong, 359 Wang, Yu, 789 Wang, Yunfeng, 479 Wang, Yunting, 925 Wang, Zhengjie, 937 Wang, Zhichao, 369, 641 Wang, Zi Cong, 509 Wan, Yu, 139 Wei, Linbing, 717 Wei, Shanbi, 789 Wei, Shi, 163, 877 Wei, Xinjiang, 69 Wu, Jinghan, 893 Wu, Kai, 443 Wu, Rili, 415
Wu, Xin, 893 Wu, Xiru, 379, 415
X Xiao, Huimin, 707 Xi, Chenxi, 609 Xiao, Lei, 665 Xiao, Qiaoshen, 893 Xiao, Sumei, 641 Xia, Xiubo, 495, 779 Xie, Yongsheng, 717 Xing, Yashan, 295 Xiong, Hang, 97 Xiu, Chunbo, 549 Xu, Bai, 937 Xue, Yuquan, 617 Xu, Xiuhua, 571 Xu, Yanling, 527
Y Yan, Bingzhuang, 679 Yang, Chenxi, 241 Yang, Fengli, 867 Yang, Lei, 571 Yang, Shujun, 193, 211 Yang, Wenlin, 925 Yang, Yifan, 693 Yang, Yuxuan, 617 Yaxing, Guo, 627, 877 Yin, Huanpu, 231 Yi, Yang, 829 Yongqiang, Dou, 163 Yongsheng, Zhao, 163 Yue, Changlu, 571 Yue, Wenlong, 139 Yu, Feifan, 17, 221 Yu, Haihao, 815 Yu-jie, Zhang, 1 Yu, Lei, 465 YuLong, Huang, 39 Yu, Zhiyuan, 455
Z Zeng, Weixiang, 925 Zhang, Hailin, 913 Zhang, Jian, 125 Zhang, Jong, 519 Zhangjun, Sun, 729 Zhang, Minghao, 335 Zhang, Pu, 495 Zhang, Qifeng, 653
Zhang, Xiaoning, 937 Zhang, Xin Hai, 201 Zhang, Ya, 913 Zhang, Yuming, 693, 747 Zhao, Chen, 549 Zhao, Huanyu, 433 Zhaojing, Zhang, 163 Zhao, Jinsong, 349 Zhao, Long, 359, 867 Zhao, Xu, 571 Zhao, Yu, 379 Zheng, Hanxi, 231
Zhenglei, Cui, 877 Zheng, Qijia, 595 Zhe, Zhang, 627 Zhong, Xi, 285 Zhou, Fan, 175 Zhou, Jin, 125 Zhou, Shitong, 455 Zhou, Ziyang, 617 Zhu, Junzhi, 359 Zhu, Juzhi, 867 Zhu, Yongze, 265