English Pages 903 [870] Year 2023
Lecture Notes in Electrical Engineering 1089
Yingmin Jia Weicun Zhang Yongling Fu Jiqiang Wang Editors
Proceedings of 2023 Chinese Intelligent Systems Conference Volume I
Series Editors
Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli Federico II, Napoli, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán, Mexico
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, München, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, School of Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
Rüdiger Dillmann, University of Karlsruhe (TH) IAIM, Karlsruhe, Baden-Württemberg, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Dipartimento di Ingegneria dell’Informazione, Sede Scientifica Università degli Studi di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid, Madrid, Spain
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA, USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Intelligent Systems Laboratory, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, Department of Mechatronics Engineering, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Intrinsic Innovation, Mountain View, CA, USA
Yong Li, College of Electrical and Information Engineering, Hunan University, Changsha, Hunan, China
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Subhas Mukhopadhyay, School of Engineering, Macquarie University, NSW, Australia
Cun-Zheng Ning, Department of Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Department of Intelligence Science and Technology, Kyoto University, Kyoto, Japan
Luca Oneto, Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genova, Genova, Genova, Italy
Bijaya Ketan Panigrahi, Department of Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Federica Pascucci, Department di Ingegneria, Università degli Studi Roma Tre, Roma, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, University of Stuttgart, Stuttgart, Germany
Germano Veiga, FEUP Campus, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Haidian District Beijing, China
Walter Zamboni, Department of Computer Engineering, Electrical Engineering and Applied Mathematics, DIEM - Università degli studi di Salerno, Fisciano, Salerno, Italy
Junjie James Zhang, Charlotte, NC, USA
Kay Chen Tan, Department of Computing, Hong Kong Polytechnic University, Kowloon Tong, Hong Kong
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments in Electrical Engineering: quickly, informally and in high quality. While original research reported in proceedings and monographs has traditionally formed the core of LNEE, we also encourage authors to submit books devoted to supporting student education and professional training in the various fields and application areas of electrical engineering. The series covers classical and emerging topics concerning:
• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please contact [email protected].
To submit a proposal or request further information, please contact the Publishing Editor in your country:
China: Jasmine Dou, Editor ([email protected])
India, Japan, Rest of Asia: Swati Meherishi, Editorial Director ([email protected])
Southeast Asia, Australia, New Zealand: Ramesh Nath Premnath, Editor ([email protected])
USA, Canada: Michael Luby, Senior Editor ([email protected])
All other Countries: Leontina Di Cecco, Senior Editor ([email protected])
** This series is indexed by EI Compendex and Scopus databases. **
Editors
Yingmin Jia, School of Automation Science and Electrical Engineering, Beihang University, Beijing, China
Weicun Zhang, School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China
Yongling Fu, School of Mechanical Engineering and Automation, Beihang University, Beijing, China
Jiqiang Wang, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, Zhejiang, China
ISSN 1876-1100 ISSN 1876-1119 (electronic) Lecture Notes in Electrical Engineering ISBN 978-981-99-6846-6 ISBN 978-981-99-6847-3 (eBook) https://doi.org/10.1007/978-981-99-6847-3 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore Paper in this product is recyclable.
Contents
Efficient PBFT Algorithm Research Based on Credit Value Model
  Xudong Liu, Rongguang Li, and Le Xin
MK-DCCA Based Fault Diagnosis for Incipient Fault in Nonlinear Dynamic Processes
  Junzhou Wu, Mei Zhang, Chihan Gao, Lingxiao Chen, and Chen Ling
An Improved Siamese Capsule Networks Classification Method for Few Sample Remote Sensing Images
  Meng Liu, Jian Cao, Hai-sheng Li, and Qiang Cai
Robotic Arm Grasping Based on Deep Reinforcement Learning
  Ziyan Li and Junyong Zhai
Multidimensional Non-linear Ship Trajectory Prediction Based on LSTM Network Corrected by GA-BP
  Xinyu Wang, Wenyu Zhao, Shuangxin Wang, and Jingyi Liu
Design and Simulation of the Hydraulic System for Height Adjustment of Carriages
  Zhiming Yan, Baoyan Hu, Lining Yang, Jian Fu, and Yongling Fu
Virtual Model Control Algorithm Simulation for Lower Limb Assist Exoskeleton
  Xiaorong Zhu, Jing Chen, Zhiyuan Yu, Zheqing Zuo, Zhe Zhao, and Zichong Zhang
Neural Network Based Singularity-Free Adaptive Prescribed Performance Control of Two-Mass Systems
  Dongdong Zheng, Zeyuan Sun, and Weixing Li
Intelligent Path Planning for Home Service Robots Based on Improved RRT Algorithm
  Yu Liu, Haikuan Wang, and Shuo Zhang
Adaptive Neural Networks Backstepping Control of Uncertain Second-Order Systems with Input and State Time Delays
  Renjian Jiang, Lin Tian, Peng Li, and Liang Sun
Satellite-Terrestrial Integrated Network Slicing Resource Management Based on Reinforcement Learning
  Guoliang Hua, Guangrong Lin, Yuman Zhang, and Yafei Zhao
Event-Triggered Adaptive Neural Network Trajectory Tracking Control of MSVs Under Deception Attacks
  Chen Wu, Guibing Zhu, and Jinshu Lu
Planetary Flight Obstacle Avoidance Guidance Method Based on ES and DQN
  Jie Jiao, Wenbo Wu, Binfeng Pan, and Shuaibin Yang
Improving Unmanned Panoramic Perception Algorithm of DAFPN-YOLO
  Xiru Wu, Yurui Lin, and Chao Liu
Development of Electromagnetic Model for Linear Force Motor in Direct-Drive Hydraulic Servo Actuators
  Zhongrui Zhao, Bing Chu, Zhenyu Liu, and Zhiming Yan
Analysis on Structure and Kinematics for a Novel Decoupled Rope-Driven Manipulator
  Yaxing Guo, Kui Huang, Jinjun Zhang, Jigui Zheng, Guizhen Kong, and Longfei Jia
Fractional Order LMS Algorithms: A Review and Application in Signal Denoising
  Haozhe Zhang, Hanliang Huo, Ruoxun Ma, and Lipo Mo
Research on Low-Cost IMU Testing Methods
  Yajing Guo, Fan Yang, Jing Chen, Binyan Liang, and Shuxuan Liu
Androgynous Tool Quick-Change Mechanism and Its Misalignment Tolerance of Space Manipulator
  Man Huang, Yanbo Wang, Ke Li, Songbo Deng, He Cai, Jiankang Zhi, and Jiaqi Duan
Fast Parameter Estimation Algorithm for the Signal Modeling Based on Equation Solution
  Ling Xu, Weihong Xu, and Feng Ding
Second Harmonic-Compensated Phase-Locked Loop for Resolver-to-Digital Conversion
  Caixiang Guo, Jin Li, and Chenxi Yang
Adaptive Tracking of Nonlinear Switched Systems with Sensor Uncertainties Based on a Weighted Average Voting Algorithm
  Zhiyi Cheng and Yan Lin
Load Distribution of Planetary Roller Screw Mechanism with Roller Threads Modification
  Wei Liu, Zhong Chen, Sheng Xie, Yongqiang Dou, Jigui Zheng, and Chao Geng
Finite-Time H∞ Synchronization Control of Piecewise Homogeneous Markov Jumping T-S Fuzzy Discrete Complex Networks Subject to Hybrid Attacks and Uncertainty
  Xiru Wu, Binlei Zhang, Yuchong Zhang, and Yuqiu Zhang
Modeling and Control Algorithm of the Multi-duct-rotor Mode Transformable Aircraft
  Mucheng Tang, Yue Ma, and Zhiheng Bu
VR-Based Virtual Surgical Training via Parametric Human Body Modeling
  Yilin Zhao, Hongjian Huang, and Qiang Fu
Key-Agent Based Dynamic Prioritized Planning for Multi-agent Systems
  Kaixiang Zhang, Niya Wang, Shufan Zhang, Ning Wang, and Jianlin Mao
Fixed-Time Tracking Control for Robotic Manipulators Based on Adding a Power Integrator Method
  Shiming Wang and Yingmin Jia
Model-Free Optimal Control for Linear Systems with State and Control Inequality Constraints
  Bin Zhang, Chenyang Xu, Lutao Yan, and Haiyuan Li
Binary Consensus of Multi-agent Systems With Privacy Preserving and Random Communication Noises
  Shuochen Wang, Jian Wang, Hongyong Yang, Chuangchuang Zhang, and Li Liu
Highway Abandoned Object Detection Based on Foreground Extraction
  Yubin Wang and Junyong Zhai
YOLO-Based Semantic Segmentation for Dynamic Removal in Visual-Inertial SLAM
  Xingke Xia, Pu Zhang, and Jian Sun
Design and Implementation of Integrated Operations Control Center Based on Cloud Architecture
  Guangyuan Ma, Huiwen Yang, Chen Cheng, Nian Shao, and You Ma
Observability of Edge Dynamics in Complex Networks
  Zhiliang An, Shaopeng Pang, Mingjun Du, and Peng Ji
A Modified Orthogonal Experimental Method for Configuration Data Acquisition Planning of Industrial Robots
  Xinyang Guo, Guanbin Gao, Fei Liu, and Yashan Xing
Event-Triggered Adaptive Finite-Time Tracking Control for Robot Manipulators with Full-State Constraints
  Cong Li, Zhiguo Xu, and Lin Zhao
Research on Satellite Routing Method Based on Q-Learning in Failure Scenarios
  Zhenrui Chen, Guangrong Lin, Jiaen Zhou, and Yafei Zhao
Ensemble Regularized Polynomial Regression for Diagnosing Breast Cancer Subtypes
  Shan Xiang, Fugen Gao, and Juntao Li
Smart Laboratory: A New Smart-Manufacturing-Technologies-enabled Chemical Experiment Paradigm
  Yaxin Wang, Chun Zhao, Wenzheng Liu, and Xiaotong Liu
Design and Implementation of Humanoid Robot Arm Based on Human Arm Mechanism
  Shuxuan Liu, Fan Yang, Chang Li, Junning Zhang, Yajing Guo, and Pengfei Li
A Hybrid Variable Impedance Force Control Method for Industrial Robots
  Jian-jun Zhang, Hou-sheng Li, and Han Li
Flexo-Coupled Drive Dexterous Finger Differential Motion Control
  Junning Zhang, Shuxuan Liu, Yajing Guo, Zhiwen Luo, and Pengfei Li
The Hardware in Loop Simulation System Design Based on PLC for the Process Control
  KaiMing Yang and ZhiBin Xue
Research on Gas Source Location of Quadruped Robot Based on DDQN
  Fengyun Li, Lei Cheng, Wenle Wang, and Bingbing Hou
A Tracking Loop Method for Parallel Receiver with Low Interaction Overhead
  Yimin Fan, Yi Zhang, Liu Liu, Jing Sun, Ting Li, and Tian Liu
Three-Channel Decoupling Control for Fighter Aircraft Based on Prescribed Performance Control and Linear Active Disturbance Rejection Control
  Pengfei Li, Zhaotao Ke, Yuehui Ji, and Junjie Liu
Characterization of Lissajous Scanning Trajectory Based on 2D MEMS Mirror
  Xiulei Zhang and Yongxuan Han
SINR Communication Based Fast Predictive Control for CPSs Under DoS Attacks
  Enci Wang, Jianlin Hou, Yang Yi, and Qingcheng Shen
Experimental Research on High Strength Steel for Side Milling Processes
  Zhiwen Luo, Chao He, Kui Huang, He Cai, and Qu Wang
Feature Matching Method Based on Improved SIFT and KLT
  Peng Zhao, Yajie Wang, Xin Su, Niu Shan, and Jun Xiang
Violation Detection Method Based on Improved YOLOv5s
  Shuo Liu, Yu-chen Liang, Xiao-cheng Ma, and Yun-qi Guo
A Visual-LiDAR Object Tracking Method Using Correlation Filter and Potential Matching
  Junzhi Zhu, Xiaolong Wang, Fengli Yang, and Long Zhao
A Model Based on Trend-Seasonal Decomposition and GCN for Traffic Flow Prediction
  Jiajun Wang, Yong Li, and Jiahao Zhang
Formation Control of Multiple Nonholonomic Wheeled Robots with Disturbances
  Xingyu Gao, Xiaonan Liu, Chen Chen, Xingxing Qiu, and Zhengrong Xiang
Compute a Class of Refinable Function by a Matlab Code and Its Figures
  Xiaohui Zhou and Yujia Liu
Analysis of Vibration Characteristics and Structural Improvement of Rolling Ball Joint Swinging Nozzle
  Haitao Qi, Xu Liu, Dongao Zhao, Duo Liu, Haoyang Meng, and Hang Su
Adaptive Tracking Control for Manipulators with Prescribed Performance
  Qingrui Meng and Yan Lin
Nighttime Vehicle Object Detection Based on Improved YOLOv7
  Haichao Sun, Hui Ye, and Junyong Zhai
A Demagnetization Fault Diagnosis Strategy for Vehicle Permanent Magnet Synchronous Motor
  Xiangyu Ma, Dong Guo, Yueling Zhao, and Lei Huang
Design and Implementation of UAV Semi-physical Simulation System Based on VxWorks
  Wenxiao Hu, Wenyuan Cong, Xinmin Chen, Mengqiao Chen, Yue Lin, and Fengrui Xu
Nonsingular Fast Terminal Sliding Mode Control of Stewart Parallel Robot
  Xiaoyue Wang and Chenglin Liu
Distributed Optimization Algorithm for Multi-agent System with Time-Varying Communication Delay Based on the Game Theory
  Chen Wang, Rui Zhu, Fuyong Wang, and Zhongxin Liu
Distributed Formation Control Based on Linear Model for Power-Line Inspection Robots
  LinYuan Hou and Yicheng Li
A Dynamic Trust-Based Access Control for Multi-domain Cloud Systems
  Mei Fan and Zhongguo Yang
On the Digital Intelligence for Online Retail Decision Support
  Lei Wang, Bin Zhao, and Yong Yang
Modeling and High-Order Differential Feedback Control of Unmanned Helicopter Under Disturbances
  Guoyuan Qi, Shishen Wang, and Xu Zhao
Three-Phase Single-Switch Active Power Factor Corrector Based on UC1854
  Changlu Yue, Xu Zhao, Cong Hu, Chunyu Li, and Tao Jiang
Study of Digital Test Method for Cable-Driven Dexterous Hand
  Yajing Guo, Fan Yang, Junning Zhang, Bohan Lv, and Si Zeng
DETR with Recursive Gated Convolution Encoder
  Zijian Lin and Junyong Zhai
Discrete Input-to-State Stability with Respect to Boundary and Distributed Disturbances for Balance Laws Systems
  Fatima Zahra Benyoub and Yan Lin
A Remote Mobile Image Acquisition System and Experimental Simulation of Indoor Scenes Based on an RGB-D Camera
  Xiaohui Shi and Lei Yu
Flexible Load Market Bargaining Scheduling Strategy for Active Distribution Network Based on Demand Response
  Bowei Shao, Hui Wang, Shuo Zhang, Pan Yin, and Wenliang Li
Research on Intelligent Collaboration and Obstacle Avoidance Control of Multiple Parafoils System
  Jinshan Yang, Qinglin Sun, Hao Sun, Yuemin Zheng, and Zengqiang Chen
Construction of Knowledge Graphs Related to Industrial Key Production Processes for Query and Visualization
  Hongyu Han, Dongmei Fu, and Haocong Jia
Multi-objective Optimization Design of the Structural Parameters of Swing Arm Crawler Rescue Robot
  Pu Zhang, Bo Cheng, Xingke Xia, and Jian Sun
An Anomaly Detection Algorithm for Logs Based on Self-attention Mechanism and BiGRU Model
  Han Yang, Fuliang Lin, Yi Chai, Kaiming Qie, Wenyi Lin, Yuanyuan Wang, Cheng Zhang, and Maoyun Guo
Author Index
Efficient PBFT Algorithm Research Based on Credit Value Model
Xudong Liu, Rongguang Li, and Le Xin

Abstract With the widespread attention to blockchain technology, the consensus algorithms that affect its performance have also come into focus. To address the complicated communication and inappropriate master node selection in PBFT consensus, an improved PBFT consensus algorithm (RC-PBFT) based on a credit value model is proposed. First, a credit value model is constructed from the historical behavior of the nodes; the nodes are classified, and an appropriate master node is selected. Second, by incorporating the UNL concept of the Ripple algorithm, the consistency protocol is optimized so that only reliable nodes carry out consensus, which improves the efficiency of the algorithm. Finally, node types are adjusted dynamically and malicious nodes are deleted, which improves the security and scalability of the algorithm. Experimental analysis shows that RC-PBFT selects safer master nodes and reaches consensus more efficiently, and that it outperforms the PBFT algorithm in throughput, consensus delay, and communication overhead.

Keywords Blockchain · PBFT · Credit value · Consistency protocol
1 Introduction

With the rapid development of internet technology, digital currencies are on the rise. The popularity of the digital currency Bitcoin [1] has drawn global attention to its underlying technology, blockchain [2]. Thanks to its distributed data storage and immutable data, blockchain is widely used in finance, healthcare, the internet of things, and other industries.

X. Liu (B) · R. Li · L. Xin
Beijing University of Technology, Beijing, China
e-mail: [email protected]
R. Li e-mail: [email protected]
L. Xin e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_1
The consensus algorithm is the core of blockchain technology. It is intended to solve the consistency problem of distributed systems and to improve blockchain's decentralized characteristics. Early research on consensus algorithms focused primarily on non-Byzantine distributed consistency problems [3], such as the classic Paxos algorithm [4] and the later Raft algorithm, which manages log replication for a multi-replica state machine [5]. Subsequently, consensus algorithms such as PoS [6], PoW [7] and DPoS [8] were proposed for public chains such as digital currencies. However, these algorithms did not provide an effective solution to the Byzantine fault tolerance problem; that came with the introduction of the PBFT algorithm [9]. PBFT, which stands for Practical Byzantine Fault Tolerance, is a practical Byzantine fault-tolerant consensus algorithm proposed by Miguel Castro and Barbara Liskov. It overcomes the low efficiency of earlier BFT algorithms by reducing the algorithm complexity from the exponential level to the polynomial level, which makes Byzantine fault-tolerant algorithms applicable to practical systems. Nevertheless, each slave node in PBFT must carry out P2P consensus synchronization with the other nodes, resulting in high communication complexity. Hence, as the number of nodes increases, the algorithm's performance deteriorates rapidly.

As research on PBFT has progressed, the algorithm has been optimized continually. The extensible multi-layer PBFT consensus mechanism proposed in [10] is an optimal two-layer PBFT that distributes nodes into two layers, thereby reducing the traffic within a single layer. It addresses the high communication complexity and poor extensibility of the original PBFT algorithm, but disregards the selection of the master node. Reference [11] proposed a PBFT-based optimization of the consensus mechanism that targets the master node: by using an external server as the master node, the N-P PBFT algorithm ensures the master node's trustworthiness and optimizes the consensus process. However, its communication complexity is still at the $O(n^2)$ level.

The primary contributions of this paper are as follows:
(1) To enhance the security and efficiency of the algorithm, the credit value model is used to evaluate each node's historical behavior before consensus. The node with the highest credit value is selected as the primary node.
(2) By incorporating the UNL concept from Ripple's consensus algorithm, nodes with high credit value are chosen to form the UNL node set for consensus. This optimization of the consistency protocol reduces communication complexity and overhead.
(3) The consensus process is also optimized, and a dynamic adjustment protocol for nodes is implemented.

Based on analysis and experimentation, the algorithm's communication complexity is significantly lower than that of the original PBFT algorithm, and its throughput and transaction delay are also better.
Fig. 1 PBFT consensus algorithm
2 Traditional PBFT Algorithm

PBFT is a state machine replication algorithm: the service is modeled as a state machine that is replicated across the nodes of a distributed system, and each replica maintains the service state and executes the requested operations [12]. The algorithm can be applied to asynchronous distributed systems and tolerates a limited number of Byzantine nodes, which makes it a typical algorithm for consortium (alliance) chains [13]. The nodes' roles are divided into client, master node, and slave node [14]. The primary (master) node is the node that receives client requests and assigns them sequence numbers. After the primary node has been elected, the client sends its request to it. The primary node numbers the request, sends a pre-prepare message to the secondary (slave) nodes, and starts executing the consistency protocol. During this protocol, a node replies to the client once it has received $2f + 1$ confirmation messages, and consensus is achieved when the client has received $f + 1$ reply messages [15]. The specific process is illustrated in Fig. 1.

The PBFT algorithm comprises three protocols: consistency, checkpoint, and view-switching. While the network is operating stably, only the consistency and checkpoint protocols run. If something goes wrong, or the primary node responds slowly, the other nodes start the view-switching protocol to re-elect the primary node and keep the system functioning normally.
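The quorum sizes mentioned above can be made concrete with a small sketch. It assumes the standard PBFT requirement of $n \ge 3f + 1$ nodes to tolerate $f$ Byzantine nodes (taken from the original PBFT design, not stated in this section) and counts only the pre-prepare, prepare, and commit messages of one normal-case round, ignoring client traffic. The function names are illustrative and do not come from the paper.

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine nodes f tolerated by n nodes (assumes n >= 3f + 1)."""
    if n < 4:
        raise ValueError("PBFT needs at least 4 nodes to tolerate one fault")
    return (n - 1) // 3


def quorums(n: int) -> dict:
    """Quorum sizes used in the normal case for a network of n nodes."""
    f = max_faulty(n)
    return {
        "f": f,
        "node_quorum": 2 * f + 1,  # confirmations a node needs before replying
        "client_quorum": f + 1,    # matching replies the client waits for
    }


def normal_case_messages(n: int) -> int:
    """Messages exchanged among nodes in one round (client traffic excluded)."""
    pre_prepare = n - 1          # primary -> every other node
    prepare = (n - 1) * (n - 1)  # each non-primary node -> every other node
    commit = n * (n - 1)         # every node -> every other node
    return pre_prepare + prepare + commit


if __name__ == "__main__":
    for n in (4, 7, 10):
        print(n, quorums(n), normal_case_messages(n))
```

Under these assumptions the total is $2n(n-1)$ messages per round, which is the quadratic growth that motivates restricting consensus to a smaller node set.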
3 RC-PBFT Algorithm

The RC-PBFT algorithm is derived from the PBFT algorithm, with the addition of the credit value model to improve security and rationalize the master node selection. The consistency protocol is refined by integrating the Unique Node List (UNL) concept, which yields a UNL consensus node set that reduces communication complexity. Finally, the algorithm's extensibility is improved by introducing a dynamic adjustment protocol that allows node types to be modified.
Table 1 Node behavior classification
Behavior | Content | Value
Normal | Complete the consensus | Rise
Failure | Network delay | Reduce
Spite | Tamper with information | Clear

Table 2 Node classification
Nodal layer | Range | Task
UNL | $C_{unl} < C_i \le 100$, $C_{unl} = 80$ | Complete the consensus
Alternate layer | $C_a < C_i \le C_{unl}$, $C_a = 40$ | Complete submission
Fault layer | $C_f < C_i \le C_a$, $C_f = 20$ | Check
Malicious layer | $0 < C_i \le C_f$ | Delete
3.1 Credit Value Model

The RC-PBFT algorithm's credit value model evaluates node credit values based on their behavioral history and dynamically updates node status accordingly. Nodes in different states are responsible for different tasks, which improves the algorithm's performance and efficiency.
3.1.1 Node Behavior Classification

Nodes that complete the consensus process normally receive a credit value reward. Network delays and similar failures result in a deduction of credit value. A node that maliciously interferes with the consensus has its credit value cleared to zero. The specific categories and their effects are listed in Table 1.
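The three behavior classes can be illustrated with a minimal sketch. The step sizes `REWARD` and `PENALTY` are hypothetical placeholders, since this section only states the direction of each change; the paper computes the actual credit value with its own evaluation model, so this is an illustration of the rise/reduce/clear rule rather than the authors' update formula.

```python
REWARD = 5.0    # hypothetical increase for a normally completed consensus
PENALTY = 10.0  # hypothetical deduction for a failure such as network delay

MIN_CREDIT, MAX_CREDIT = 0.0, 100.0  # credit values lie in [0, 100]


def update_credit(credit: float, behavior: str) -> float:
    """Apply the rise / reduce / clear rule for one observed behavior."""
    if behavior == "normal":
        return min(MAX_CREDIT, credit + REWARD)
    if behavior == "failure":
        return max(MIN_CREDIT, credit - PENALTY)
    if behavior == "spite":
        return MIN_CREDIT  # malicious interference clears the credit value
    raise ValueError(f"unknown behavior: {behavior!r}")


if __name__ == "__main__":
    credit = 60.0
    for behavior in ("normal", "failure", "spite"):
        credit = update_credit(credit, behavior)
        print(behavior, credit)
```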
3.1.2 Node Layering
As described in Table 1, a node's behavior changes its credit value, which in turn changes its layer and the tasks it is responsible for. The credit value of node i is denoted Ci and lies in the range [0, 100], as shown in Table 2. In Table 2, Cunl, Ca and Cf are the credit thresholds of the UNL layer, the alternate layer and the fault layer, respectively. Nodes are assigned to layers according to their credit values.
UNL layer: Responsible for executing the complete consensus process. It accounts for 5–30% of all nodes and consists of the nodes with the highest credit values. The primary node is selected from the UNL nodes: the UNL node with the highest credit value becomes the primary node.
Alternate layer: The alternate layer backs up the UNL layer and participates in the final commit phase. If the number of UNL nodes is insufficient, new UNL nodes are selected from the alternate layer.
Fault layer: If the credit value is not above Ca, the node becomes a faulty node. If a faulty node continues to show malicious or time-out behavior, its credit value is reduced further and it becomes a malicious node.
Malicious layer: If the credit value is not above Cf, the node is suspected of wrongdoing or of suffering extremely high network latency, and it needs to be replaced promptly. The network requests the primary node to delete the node, and the UNL layer confirms the deletion.
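The threshold rule of Table 2 can be sketched as a small Python function. This is only an illustration: the function name is hypothetical, a cleared credit value of 0 is assumed to fall into the malicious layer, and the additional 5–30% cap on the size of the UNL layer is not modeled.

```python
# Thresholds taken from Table 2.
C_UNL = 80
C_A = 40
C_F = 20


def node_layer(credit: float) -> str:
    """Map a credit value in [0, 100] to its layer according to Table 2."""
    if not 0 <= credit <= 100:
        raise ValueError("credit value must lie in [0, 100]")
    if credit > C_UNL:
        return "UNL"
    if credit > C_A:
        return "alternate"
    if credit > C_F:
        return "fault"
    # Assumption: a credit of exactly 0 (cleared after tampering) is malicious too.
    return "malicious"


if __name__ == "__main__":
    for c in (95, 80, 41, 20, 0):
        print(c, node_layer(c))
```

Because the ranges in Table 2 are open on the left and closed on the right, a node with a credit value of exactly 80 belongs to the alternate layer rather than the UNL layer.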
3.1.3 Credit Evaluation Method
The RC-PBFT algorithm uses the historical behavior of nodes during consensus as the impact factor, which mainly comprises the degree of consensus participation and the degree of consensus positivity.
Definition 1 Node credit value. The credit value of node i in consensus round j is modeled as:
$$C_i^j = \alpha P_i + \beta Q_i^j \tag{1}$$
where $C_i^j$ represents the credit (reputation) value of node i in round j. It is composed of the consensus participation degree of node i and its consensus positivity. The consensus participation degree is determined by the type of node i and the degree to which it completed its work. The consensus positivity is determined by the number of consensus rounds completed in the past and the credit value from the previous round. $\alpha$ and $\beta$ are the weights of the consensus participation degree and the consensus positivity, respectively.
Definition 2 Degree of consensus participation. It is calculated from the type of node and the degree of work completion, and can be expressed as:
$$P_i = \gamma E_i + \theta F_i + \delta G_i \tag{2}$$
where $P_i$ represents the participation degree of node i, $E_i$ indicates whether the consensus work has been completed, $F_i$ indicates whether the submission work has been completed, and $G_i$ indicates whether a timeout has occurred. Completing the work and avoiding a timeout are rewarded. $\gamma$, $\theta$ and $\delta$ are the weights of the respective terms.
Definition 3 Consensus positivity. It is determined jointly by the node's consensus completion record and its historical reputation value, and can be expressed as:
$$Q_i^j = \mu M_i + \varphi C_i^{j-1} \tag{3}$$
where $Q_i^j$ represents the positivity of node i in consensus round j, $M_i$ represents the cumulative number of consensus rounds completed, and $C_i^{j-1}$ represents the credit value of node i in consensus round j − 1. $\mu$ and $\varphi$ are the corresponding weights.
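Definitions 1–3 can be combined into a short Python sketch. The paper does not report numerical weights, so every default in the hypothetical `Weights` class below is a placeholder, including the sign of `delta`; the result is also not clamped to [0, 100].

```python
from dataclasses import dataclass


@dataclass
class Weights:
    """Illustrative weights only; the source gives no numerical values."""
    alpha: float = 0.5   # weight of participation P_i in Eq. (1)
    beta: float = 0.5    # weight of positivity Q_i^j in Eq. (1)
    gamma: float = 1.0   # weight of E_i in Eq. (2)
    theta: float = 1.0   # weight of F_i in Eq. (2)
    delta: float = -1.0  # weight of G_i in Eq. (2); sign is an assumption
    mu: float = 1.0      # weight of M_i in Eq. (3)
    phi: float = 0.5     # weight of C_i^{j-1} in Eq. (3)


def participation(e: int, f: int, g: int, w: Weights) -> float:
    """Eq. (2): P_i = gamma*E_i + theta*F_i + delta*G_i."""
    return w.gamma * e + w.theta * f + w.delta * g


def positivity(m: int, prev_credit: float, w: Weights) -> float:
    """Eq. (3): Q_i^j = mu*M_i + phi*C_i^{j-1}."""
    return w.mu * m + w.phi * prev_credit


def credit_value(e: int, f: int, g: int, m: int, prev_credit: float, w: Weights) -> float:
    """Eq. (1): C_i^j = alpha*P_i + beta*Q_i^j."""
    return w.alpha * participation(e, f, g, w) + w.beta * positivity(m, prev_credit, w)


if __name__ == "__main__":
    w = Weights(gamma=2.0)
    # A UNL node that completed consensus (E=1) without a timeout (G=0).
    print(credit_value(e=1, f=0, g=0, m=3, prev_credit=60.0, w=w))
```

In practice the weights would have to be chosen so that the resulting credit value stays within the range used by Table 2.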
3.1.4 Credit Update
In the process of consensus, the credit value of nodes is updated in real time, and the status of nodes is dynamically adjusted when the level of nodes changes. The process is shown in Algorithm 1.
Algorithm 1: Credit update
Input: <REQUEST>, C_i^{j−1}
Output: C_i^j
Select master node
Select UNL
if node = UNL then
    Start consistency protocol
    if complete then
        E_i = 1, G_i = 0, M_i++
        P_i ← E_i + G_i
    end if
else
    Reply client
    if complete then
        F_i = 1, G_i = 0, M_i++
        P_i ← F_i + G_i
    end if
end if
Q_i ← M_i + C_i^{j−1}
C_i^j ← P_i + Q_i
return C_i^j
3.2 Consensus Process
The consensus process of the RC-PBFT algorithm optimizes the consistency protocol of the PBFT algorithm. The PBFT consistency protocol performs two rounds of O(N^2) communication while running, so its communication cost becomes prohibitive when the number of nodes is large.
Efficient PBFT Algorithm Research Based on Credit Value Model
Fig. 2 RC-PBFT consensus algorithm
3.2.1 Consistency Protocol
In this paper, an optimized consistency protocol is proposed based on the UNL concept of the Ripple algorithm. The resulting consensus process of the RC-PBFT algorithm is shown in Fig. 2.
Request: Client C signs the request and sends it to the primary node.
Pre-prepare: The primary node signs the request and sends it to the UNL nodes to start the consistency protocol.
Prepare: Each UNL node signs the message and replies to the primary node.
Commit: After receiving the prepare messages, the primary node sends a commit message to the UNL nodes, which execute the request and reply to the primary node. After receiving the commit messages from the UNL nodes, the primary node sends the message to the alternate nodes for commit.
Reply: After receiving the commit message, each alternate node replies to the client. If the client receives f + 1 reply messages, the whole network has reached consensus.
3.2.2 Communication Complexity
Following the consistency protocol, let N be the total number of nodes, T the total traffic, and $N_{unl}$ the number of UNL nodes. The traffic of RC-PBFT in the pre-prepare and prepare phases is:
$$T_{pp} = T_p = N_{unl} - 1 \tag{4}$$
The traffic of the commit phase can be expressed as:
$$T_{com} = 2(N_{unl} - 1) + N - N_{unl} \tag{5}$$
Combining formulas (4) and (5) with the traffic of the other stages, the traffic of the whole RC-PBFT algorithm is:
$$T_{RC\text{-}PBFT} = 2N + 3N_{unl} - 3 \tag{6}$$
In the consistency protocol of RC-PBFT, a UNL node only needs to communicate directly with the primary node, so the communication complexity of the algorithm is reduced to O(N).
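The phase-by-phase message counts can be tallied with a short Python sketch. The pre-prepare, prepare and commit counts come from Eqs. (4) and (5). The request count (1) and reply count (N) are assumptions: the paper only refers to "other stages", and these two values are the ones that make the sum agree with Eq. (6). Function names are hypothetical.

```python
def rc_pbft_phase_traffic(n: int, n_unl: int) -> dict:
    """Messages per phase for N total nodes and N_unl UNL nodes."""
    if not 1 <= n_unl <= n:
        raise ValueError("need 1 <= n_unl <= n")
    return {
        "request": 1,                               # assumption, see lead-in
        "pre_prepare": n_unl - 1,                   # Eq. (4)
        "prepare": n_unl - 1,                       # Eq. (4)
        "commit": 2 * (n_unl - 1) + (n - n_unl),    # Eq. (5)
        "reply": n,                                 # assumption, see lead-in
    }


def rc_pbft_total(n: int, n_unl: int) -> int:
    """Sum over all phases; matches Eq. (6): 2N + 3*N_unl - 3."""
    return sum(rc_pbft_phase_traffic(n, n_unl).values())


if __name__ == "__main__":
    n, n_unl = 100, 20
    print(rc_pbft_phase_traffic(n, n_unl))
    print(rc_pbft_total(n, n_unl), 2 * n + 3 * n_unl - 3)
```

Every term is linear in N or in the number of UNL nodes, which is where the O(N) complexity comes from.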
3.3 Node Dynamic Adjustment Protocol In the credit value model of RC-PBFT, nodes are processed hierarchically based on their credit value. Whenever a node’s credit value changes and its level is altered, the dynamic adjustment protocol can be activated to update the node network accordingly.
3.3.1 Node Promotion and Degradation
If a candidate node's credit value meets the reputation threshold of the UNL layer and the UNL layer has too few nodes, the node may be promoted to a UNL node. Conversely, if the credit value of an alternate node falls below the reputation threshold of the alternate layer, the node is moved to a lower layer. Faulty nodes are likewise moved to the layer that matches their credit value.
3.3.2 Node Deletion
If the primary node detects a malicious node in the network's fault layer, it sends a deletion request to the UNL layer nodes for immediate action. Upon receiving the request, the UNL nodes verify whether the node should be deleted. When the primary node has received confirmation messages from more than fifty percent of the UNL nodes, it deletes the malicious node and updates the network, which preserves the security and extensibility of the RC-PBFT algorithm. The process of node deletion is illustrated in Fig. 3. As shown in Fig. 3, when the primary node initiates the deletion of a malicious node, $(N_{unl} - 1)^2$ messages are exchanged during the commit stage. Thus, the total communication volume of deleting a node can be expressed as:
$$T_{del} = N_{unl}^2 + N_{unl} - 2 \tag{7}$$
Fig. 3 Dynamic node deletion
Fig. 4 Comparison of success rates
If the UNL nodes delete a malicious node in a consensus round, the traffic of that consensus round is:
$$T_{RC\text{-}PBFT} = N_{unl}^2 + 2N + 4N_{unl} - 5 \tag{8}$$
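As a consistency check, the traffic of a round with deletion equals the normal-round traffic of Eq. (6) plus the deletion traffic of Eq. (7). The symbol $T_{round}$ and the numerical example (N = 100 and 20 UNL nodes) are illustrative and not taken from the paper.

```latex
\begin{aligned}
T_{round} &= \underbrace{(2N + 3N_{unl} - 3)}_{\text{Eq. (6)}}
           + \underbrace{(N_{unl}^2 + N_{unl} - 2)}_{\text{Eq. (7)}} \\
          &= N_{unl}^2 + 2N + 4N_{unl} - 5 \\[4pt]
N = 100,\ N_{unl} = 20:\quad
T_{round} &= 257 + 418 = 675 = 20^2 + 2\cdot 100 + 4\cdot 20 - 5
\end{aligned}
```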
4 Experimental Analysis To evaluate the performance of the RC-PBFT algorithm, we simulated multiple node startups in the network using Java’s multithreading approach. We also introduced random network delays and constructed a node network. The simulation was conducted on a PC equipped with an Intel Core i7-1065G7 CPU and 16 GB memory, running on a 64-bit Windows 10 operating system. Our analysis and simulations of the PBFT and RC-PBFT algorithms considered various aspects, such as security, communication overhead, consensus delay, and throughput.
4.1 Security Analysis
The PBFT algorithm is highly secure because it tolerates Byzantine nodes and verifies messages between nodes during the prepare and commit phases. Although the RC-PBFT algorithm removes the mutual communication between nodes in the consistency protocol, its security is strengthened by the credit value model, by reducing the number of nodes that participate in consensus, and by the dynamic adjustment protocol for nodes. Similarly, the client only needs to receive feedback from more than half of the nodes to ensure network consistency. In this experiment, we set the same number of Byzantine nodes in the PBFT and RC-PBFT node networks and measured the success rate of 300 transactions for varying numbers of nodes. The results of the comparison are shown in Fig. 4. The figure shows that the success rate of the RC-PBFT algorithm is more stable than that of PBFT and does not decrease as the number of nodes grows. In summary, RC-PBFT also offers a high level of security.
4.2 Communication Overhead
Communication overhead refers to the traffic generated by the information exchange between nodes while they run the consensus algorithm. Assuming that the total number of nodes is N, the communication complexity of the PBFT algorithm is O(N^2), and its performance declines sharply as the number of nodes increases. The communication overhead of the PBFT algorithm can be expressed as:
$$T_{PBFT} = 2N^2 - N + 1 \tag{9}$$
According to formulas (6) and (8), the communication complexity of RC-PBFT decreases to O(N) in consensus rounds without malicious nodes. When a malicious node exists, the traffic is $O(N_{unl}^2)$. Let $x = N_{unl}/N$, where x represents the proportion of UNL nodes among all nodes. The communication overhead of the RC-PBFT algorithm without node deletion can then be expressed as:
$$T_{RC\text{-}PBFT} = (3x + 2)N - 3 \tag{10}$$
The communication overhead of the consensus algorithm with node deletion can be expressed as:
$$T_{RC\text{-}PBFT} = x^2N^2 + (2 + 4x)N - 5 \tag{11}$$
Based on the above formulas, the communication overhead of the PBFT and RC-PBFT consensus algorithms is shown in Fig. 5. As the number of nodes increases, the communication overhead of the PBFT algorithm rises sharply, while that of RC-PBFT is much smaller and grows slowly.
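The growth behavior described by Eqs. (9)–(11) can be tabulated with a few lines of Python. The UNL proportion x = 0.2 is an arbitrary illustrative value inside the 5–30% range given in Sect. 3.1.2, and the number of UNL nodes is rounded to an integer, which is an assumption of this sketch; function names are hypothetical.

```python
def pbft_traffic(n: int) -> int:
    """Eq. (9): T_PBFT = 2N^2 - N + 1."""
    return 2 * n * n - n + 1


def rc_pbft_traffic(n: int, n_unl: int, deleting: bool = False) -> int:
    """Eq. (6) without deletion, Eq. (8) when a malicious node is deleted."""
    if deleting:
        return n_unl ** 2 + 2 * n + 4 * n_unl - 5
    return 2 * n + 3 * n_unl - 3


def compare(node_counts, x: float = 0.2):
    """Rows of (N, PBFT, RC-PBFT, RC-PBFT with deletion) for UNL share x."""
    rows = []
    for n in node_counts:
        n_unl = max(1, round(x * n))  # integer UNL count (assumption)
        rows.append((n, pbft_traffic(n), rc_pbft_traffic(n, n_unl),
                     rc_pbft_traffic(n, n_unl, deleting=True)))
    return rows


if __name__ == "__main__":
    for row in compare([10, 50, 100, 200]):
        print(row)
```

Even in rounds with deletion, the quadratic term is scaled by x^2, so for small UNL proportions the traffic stays well below that of PBFT.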
Fig. 5 Communication traffic
Fig. 6 Consensus delay
4.3 Consensus Delay
Consensus delay is a crucial metric for assessing the efficacy of consensus algorithms. It refers to the duration from the start of a transaction until its completion. A low-delay consensus algorithm can significantly enhance transaction speed and system safety. In this study, we compared the consensus delay of PBFT and RC-PBFT by measuring the average consensus delay of 500 transactions for various numbers of nodes in the same experimental environment. According to the results presented in Fig. 6, the PBFT and RC-PBFT algorithms exhibit similar delays for up to 50 nodes. However, as the number of nodes in the network increases, the delay of PBFT rises rapidly, while that of RC-PBFT increases slowly. Thus, the RC-PBFT algorithm can maintain a low consensus delay even with a large number of network nodes.
(a) Number of different nodes
(b) Different transaction request
Fig. 7 Comparison of TPS
4.4 TPS
Throughput refers to the number of transactions processed by the system per unit time and is usually expressed in transactions per second (TPS). It is an important indicator of the system's ability to process transactions concurrently. To evaluate the throughput of the RC-PBFT algorithm, the two experiments shown in Fig. 7 were conducted. Experiment 1: Vary the number of nodes and compare the average throughput of 600 transactions (Fig. 7a). Experiment 2: Fix the number of nodes and compare the average throughput for different numbers of transactions (Fig. 7b). The figure shows that the throughput of the PBFT algorithm decreases significantly as the number of nodes increases, whereas the throughput of the RC-PBFT algorithm decreases slowly, so the system retains a high throughput when the number of nodes is large. With the same number of nodes, the throughput of the RC-PBFT algorithm is higher than that of the PBFT algorithm. To sum up, RC-PBFT is superior to the PBFT algorithm in terms of throughput and is better able to process transactions concurrently.
5 Conclusion
This paper presents a practical Byzantine fault-tolerant algorithm called RC-PBFT. The proposed algorithm includes a credit value model, a reasonable mechanism for selecting the winner nodes, and a protocol that optimizes the communication between nodes while maintaining high performance as the number of nodes in the network grows. The concept of UNL is used to select reliable nodes to achieve consensus. In addition, the algorithm has a dynamic adjustment mechanism, making it extensible.
Simulation results demonstrate that RC-PBFT outperforms the PBFT algorithm in communication overhead, consensus delay, and throughput. To implement the algorithm, we used Java multithreading simulation. However, there is still room for improvement in the actual node interaction. Therefore, we plan to further refine the algorithm and implement it in the actual environment as soon as possible.
References
1. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system. Decentralized Bus. Rev. 21260 (2008)
2. Lu, Y.: The blockchain: state-of-the-art and research challenges. J. Ind. Inform. Integr. 15, 80–90 (2019). https://doi.org/10.1016/j.jii.2019.04.002
3. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. 4, 382–401 (1982). https://doi.org/10.1145/357172.357176
4. Lamport, L.: Paxos made simple. ACM SIGACT News 32, 51–58 (2016). https://doi.org/10.1145/568425.568433
5. Yun, J., Goh, Y., Chung, J.M.: Analysis of mining performance based on mathmatical approach of PoW. In: 2019 International Conference on Electronics, Information, and Communication (ICEIC), pp. 1–2 (2019). https://doi.org/10.23919/ELINFOCOM.2019.8706374
6. Wang, B., Ye, W., Liu, Y.: Application of transformed cubature quadrature information filtering in distributed POS. IEEE Sens. J. 21, 21913–21920 (2021). https://doi.org/10.1109/JSEN.2021.3105403
7. Chen, Y., Liu, F.: Improvement of DPoS consensus mechanism in collaborative governance of network public opinion. In: 2021 4th International Conference on Advanced Electronic Materials, Computers and Software Engineering (AEMCSE), pp. 483–488 (2021). https://doi.org/10.1109/AEMCSE51986.2021.00105
8. Castro, M., Liskov, B.: Practical byzantine fault tolerance. ACM Trans. Comput. Syst. (TOCS) 99, 173–186 (1999). https://doi.org/10.1145/571637.571640
9. Li, W., Feng, C., Zhang, L., et al.: A scalable multi-layer PBFT consensus for blockchain. IEEE Trans. Parallel Distrib. Syst. 32, 1146–1160 (2020). https://doi.org/10.1109/TPDS.2020.3042392
10. Hai-Bo, T., Tong, Z., He, Z., et al.: Archival data protection and sharing method based on blockchain. J. Softw. 30, 2620–2635 (2019). https://doi.org/10.13328/j.cnki.jos.005770
11. Morkunas, V.J., Paschen, J., Boon, E.: How blockchain technologies impact your business model. Bus. Horizons 62, 295–306 (2019). https://doi.org/10.1016/j.bushor.2019.01.009
12. Castro, M., Liskov, B.: Practical Byzantine fault tolerance and proactive recovery. ACM Trans. Comput. Syst. (TOCS) 20, 398–461 (2002). https://doi.org/10.1145/571637.571640
13. Zavolokina, L., Ziolkowski, R., Bauer, I., et al.: Management, governance and value creation in a blockchain consortium. MIS Q. Executive 19, 1–17 (2020). https://doi.org/10.17705/2MSQE.00022
14. Tang, S., Wang, Z., Jiang, J., et al.: Improved PBFT algorithm for high-frequency trading scenarios of alliance blockchain. Sci. Rep. 12, 4426 (2022). https://doi.org/10.1155/2021.8455180
15. Yang, J., Jia, Z., Su, R., et al.: Improved fault-tolerant consensus based on the PBFT algorithm. IEEE Access 10, 30274–30283 (2022). https://doi.org/10.1109/ACCESS.2022.3153701
MK-DCCA Based Fault Diagnosis for Incipient Fault in Nonlinear Dynamic Processes Junzhou Wu, Mei Zhang, Chihan Gao, Lingxiao Chen, and Chen Ling
Abstract Incipient fault detection is particularly important in process industrial systems, as its early detection helps to prevent major accidents. Against this background, this study proposes a combined method of Mixed Kernel Principal Component Analysis and Dynamic Canonical Correlation Analysis (MK-DCCA). Comparative experiments were conducted on a CSTR Simulink model, comparing the MK-DCCA method with DCCA and DCVA methods, demonstrating its excellent monitoring performance in detecting incipient faults in nonlinear dynamic systems. Furthermore, fault identification experiments were conducted, validating the high accuracy of the accompanying contribution graph method. Keywords Dynamic system · Incipient fault · Process monitoring · Fault detection
J. Wu · M. Zhang (B) · C. Gao · L. Chen · C. Ling
Guizhou University, Guiyang, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_2

1 Introduction
Data-driven methods are increasingly being applied to diagnose incipient faults in process industrial systems. Methods rooted in multivariate statistical analysis have received considerable attention, and Principal Component Analysis (PCA), Canonical Variate Analysis (CVA), Canonical Variate Dissimilarity Analysis (CVDA), and Canonical Correlation Analysis (CCA) have become common approaches in this field [12]. These methods have also been applied successfully in industry. For instance, in [14], a PCA-based method was applied to fault detection in the semiconductor manufacturing industry with promising results. Ruiz-Cárcel et al. conducted an in-depth study on the CVA method [13], while Pilario et al. proposed the CVDA method and its extended versions based on CVA [9–11]. As for CCA-based methods, in [2], data-driven CCA was first used to generate residuals based on canonical correlation, achieving good results in both static and dynamic fault detection scenarios. Subsequently, CCA-based methods have been widely researched and improved by numerous scholars in the field of process monitoring [1, 3–5].
However, diagnosing incipient faults in industrial processes is complicated by system characteristics such as non-linearity and dynamics. Moreover, most existing methods focus on individual characteristics, and few studies address composite characteristics. For instance, in [2, 13], dynamic versions of the CCA and CVA methods were proposed to address process dynamics, but other characteristics were not investigated. In [11], an extended version of CVA called CVDA was introduced, which uses kernel density estimation to compute statistical thresholds; it addresses dynamics and non-Gaussianity, but not nonlinearity. Although [10] proposed a method combining the kernel method with CVDA to address all of these characteristics, it does not examine the relationship between the inputs and outputs of a nonlinear dynamic system.
Building on this research, this paper presents a novel MK-DCCA fault diagnosis method for detecting incipient faults in industrial processes characterized by high dimensionality, non-linearity, and dynamics. The method uses MKPCA to handle the non-linear aspects of the system and DCCA to handle the dynamic aspects. As a result, it achieves good diagnostic performance for incipient faults under these system characteristics. Additionally, the contribution degree method is employed for fault identification.
2 Methodological Theory
2.1 Mixed Kernel Component Analysis
In practical applications of Kernel Principal Component Analysis (KPCA), the kernel should both interpolate and extrapolate well, that is, it should generalize well. Pilario et al. showed that the Radial Basis Function (RBF) kernel interpolates well, whereas the polynomial kernel extrapolates well [10]. Therefore, following the approach of Jordaan et al. of combining local and global kernels, this paper combines the RBF kernel and the polynomial kernel into a mixed kernel using the following formula [6]:
$$K_{mix} = \omega K_{poly} + (1 - \omega) K_{RBF} \tag{1}$$
where $\omega \in [0, 1]$ is the mixing weight. When $\omega = 1$ the mixed kernel reduces to the polynomial kernel, and when $\omega = 0$ it reduces to the RBF kernel. A weighted sum of linear and RBF kernels is suggested in [6] to achieve both good interpolation and extrapolation capabilities. Therefore, in this study, we combine the polynomial kernel with d = 1 and the RBF kernel.
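A minimal Python sketch of the mixed kernel in Eq. (1) follows. The paper does not give the kernel parameterizations, so the forms used here, a polynomial kernel (xᵀy + offset)^d and an RBF kernel exp(−‖x − y‖²/width), together with the default parameter values and function names, are assumptions.

```python
import math


def poly_kernel(x, y, degree: int = 1, offset: float = 1.0) -> float:
    """Polynomial kernel (x.y + offset)^degree; this form is an assumption."""
    return (sum(a * b for a, b in zip(x, y)) + offset) ** degree


def rbf_kernel(x, y, width: float = 1.0) -> float:
    """RBF kernel exp(-||x - y||^2 / width); this form is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / width)


def mixed_kernel(x, y, omega: float, degree: int = 1, width: float = 1.0) -> float:
    """Eq. (1): K_mix = omega * K_poly + (1 - omega) * K_RBF."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    return omega * poly_kernel(x, y, degree) + (1.0 - omega) * rbf_kernel(x, y, width)


if __name__ == "__main__":
    x, y = [1.0, 2.0], [3.0, 4.0]
    for omega in (0.0, 0.5, 1.0):
        print(omega, mixed_kernel(x, y, omega))
```

Setting `omega` to 0 or 1 recovers the pure RBF or pure polynomial kernel, which mirrors the limiting cases described above.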
2.2 Dynamic Canonical Correlation Analysis
CCA was selected as the basis for process monitoring in this study, with additional extensions, because methods such as CVA and CVDA fall short in investigating the connection between system input variables and output variables. Following the study by Chen et al., the residual signal can be characterized by the following equation [2]:
$$r(k) = L^T y(k) - M^T u(k) \tag{2}$$
where u and y denote the input and output vectors, respectively, L and M are unknown constant matrices, and r(k) is the residual signal at moment k. The vectors u and y are then standardized to obtain $\bar{u}$ and $\bar{y}$. With p and f as the lag and lead parameters, respectively, the past and future observation vectors of u and y are defined as
$$u_p(k) = \begin{bmatrix} u(k-p) \\ \vdots \\ u(k-1) \end{bmatrix} \tag{3}$$
$$u_f(k) = \begin{bmatrix} u(k) \\ \vdots \\ u(k+f) \end{bmatrix} \tag{4}$$
$$y_p(k) = \begin{bmatrix} y(k-p) \\ \vdots \\ y(k-1) \end{bmatrix} \tag{5}$$
$$y_f(k) = \begin{bmatrix} y(k) \\ \vdots \\ y(k+f) \end{bmatrix} \tag{6}$$
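The stacking in Eqs. (3)–(6) can be illustrated with plain Python lists. The sketch assumes a series stored as a list of samples (each sample a list of variable values) with 0-based time indices; function names are hypothetical.

```python
def past_vector(series, k: int, p: int):
    """Stack samples k-p, ..., k-1 into one flat vector (Eqs. 3 and 5)."""
    if p < 1 or k - p < 0 or k > len(series):
        raise IndexError("not enough past samples for this k and p")
    return [value for sample in series[k - p:k] for value in sample]


def future_vector(series, k: int, f: int):
    """Stack samples k, ..., k+f into one flat vector (Eqs. 4 and 6)."""
    if f < 0 or k < 0 or k + f >= len(series):
        raise IndexError("not enough future samples for this k and f")
    return [value for sample in series[k:k + f + 1] for value in sample]


if __name__ == "__main__":
    # Two output variables observed at four time instants.
    y = [[0, 10], [1, 11], [2, 12], [3, 13]]
    print(past_vector(y, k=2, p=2))    # samples 0 and 1
    print(future_vector(y, k=2, f=1))  # samples 2 and 3
```

Note that the future vector contains f + 1 samples, which matches the row dimension (f + 1)l of the future output matrix in this section.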
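The past and future observation matrices are then defined as
$$z_p(k) = \begin{bmatrix} y_p(k) \\ u_p(k) \end{bmatrix} \tag{7}$$
$$Z_p = [z_p(1), \ldots, z_p(N)] \in \mathbb{R}^{p(m+l) \times N} \tag{8}$$
$$Z = \begin{bmatrix} Z_p \\ U_f \end{bmatrix} \tag{9}$$
$$Y_f = [y_f(1), \ldots, y_f(N)] \in \mathbb{R}^{(f+1)l \times N} \tag{10}$$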
Subsequently, using Z as the input matrix and $Y_f$ as the output matrix, the self-covariance and cross-covariance matrices $\Sigma_{z,z}$, $\Sigma_{y_f,y_f}$ and $\Sigma_{z,y_f}$ of Z and $Y_f$ are calculated according to the following formulas.
$$\Sigma_{z,z} = \frac{1}{N-1} \sum_{i=1}^{N} \bar{z}(i)\,\bar{z}^T(i) \tag{11}$$
$$\Sigma_{y_f,y_f} = \frac{1}{N-1} \sum_{i=1}^{N} \bar{y}_f(i)\,\bar{y}_f^T(i) \tag{12}$$
$$\Sigma_{z,y_f} = \frac{1}{N-1} \sum_{i=1}^{N} \bar{z}(i)\,\bar{y}_f^T(i) \tag{13}$$
Then, the Hankel matrix H is constructed based on the following equation.
$$H = \Sigma_{z,z}^{-1/2}\, \Sigma_{z,y_f}\, \Sigma_{y_f,y_f}^{-1/2} \tag{14}$$
Through singular value decomposition (SVD), the matrix H can be decomposed into
$$H = \Gamma \Lambda \Delta^T \tag{15}$$
where $\Gamma = (\gamma_1, \ldots, \gamma_l)$, $\Delta = (\delta_1, \ldots, \delta_m)$, $\Lambda = \begin{bmatrix} \Lambda_n & 0 \\ 0 & 0 \end{bmatrix}$, and $\gamma_i$ and $\delta_j$ are the corresponding singular vectors. $\Lambda_n = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ with $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0$ contains the singular values. The Cumulative Percentage Value (CPV) method is a useful tool for choosing the value of n [8]. The unknown constant matrices L and M, and with them the residual signal, are obtained from (16)–(18).
$$L_n = \Sigma_{y_f,y_f}^{-1/2}\, \Delta(:, 1:n) \tag{16}$$
$$J_n = \Sigma_{z,z}^{-1/2}\, \Gamma(:, 1:n) \tag{17}$$
$$M_n^T = \Lambda_n J_n^T \tag{18}$$
Furthermore, the covariance matrix of r(k) is approximated as $\Sigma_{rr} = I - \Lambda_n^2$, where I denotes the identity matrix. Finally, the statistical metric $T^2$, used for monitoring dynamic system processes, is calculated as:
$$T^2(k) = (N - 1)\, r^T(k)\, \Sigma_{rr}^{-1}\, r(k) \tag{19}$$
The corresponding threshold is calculated as follows:
$$T_{th}^2(k) = \frac{n(N^2 - n)}{N(N - n)}\, F_{1-\alpha}(n, N - n) \tag{20}$$
where n and N denote the number of selected singular values and the number of samples, respectively. After the threshold has been obtained, process monitoring follows this logic: if $T^2 \ge T_{th}^2$, a fault is indicated; otherwise, there is no fault.
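The monitoring logic of Eqs. (19) and (20) can be sketched in Python without a linear algebra library, because the residual covariance I − Λn² is diagonal. The F-distribution quantile is passed in by the caller (for example from a statistics table) rather than computed, and all function names are hypothetical.

```python
def t2_statistic(residual, singular_values, n_samples: int) -> float:
    """Eq. (19) with the diagonal covariance Sigma_rr = I - Lambda_n^2."""
    if len(residual) != len(singular_values):
        raise ValueError("residual and singular values must have equal length")
    total = 0.0
    for r, lam in zip(residual, singular_values):
        variance = 1.0 - lam * lam
        if variance <= 0.0:
            raise ValueError("singular values must be smaller than 1")
        total += r * r / variance
    return (n_samples - 1) * total


def t2_threshold(n: int, n_samples: int, f_quantile: float) -> float:
    """Eq. (20); f_quantile is F_{1-alpha}(n, N - n), supplied by the caller."""
    return n * (n_samples ** 2 - n) / (n_samples * (n_samples - n)) * f_quantile


def is_fault(t2: float, threshold: float) -> bool:
    """Decision rule: a fault is indicated when T^2 >= threshold."""
    return t2 >= threshold


if __name__ == "__main__":
    t2 = t2_statistic([0.6, 0.0], [0.8, 0.5], n_samples=11)
    th = t2_threshold(n=2, n_samples=10, f_quantile=3.0)  # illustrative quantile
    print(t2, th, is_fault(t2, th))
```

The numbers in the usage example are made up for illustration and are not taken from the CSTR experiments.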
2.3 Contribution Based Fault Identification
Building on the research of Li et al., this study uses the $T^2$ contribution based on the CVA method to identify faulty variables [7]. The canonical state variable $Z_k$, which is used in the contribution calculation, is given by:
$$Z_k = K \bar{y}_{p,k} = V_q^T\, \Sigma_{pp}^{-1/2}\, \bar{y}_{p,k} \tag{21}$$
where $V_q^T$ represents the right singular vectors obtained from the SVD of matrix H, and $\Sigma_{pp}$ is the self-covariance of the past observation matrix. Furthermore, according to [7], the contribution of variables based on the $T^2$ statistical metric can be calculated as:
$$C_{T^2} = T^2 = Z^T Z = Z^T K \bar{y}_{p,k} = \sum_{i=1}^{n} \sum_{j=1}^{q} z_j K_{j,i}\, \bar{y}_{p,i} = \sum_{i=1}^{n} C_{i,T^2} \tag{22}$$
where $C_{i,T^2}$ is the contribution of variable $\bar{y}_i$ to the metric $T^2$, and $z_j K_{j,i} \bar{y}_{p,i}$ represents the contribution of variable $\bar{y}_i$ to the jth canonical state variable $z_j$. Finally, the variable contribution percentage is computed using Eq. (23) to identify the variables that are correlated with faults.
$$P_{i,T^2} = \frac{C_{i,T^2}}{C_{T^2}} \tag{23}$$
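Equations (21)–(23) can be illustrated with a small Python sketch that takes the projection matrix K as given. The matrix and observation in the usage example are made-up numbers, and the function name is hypothetical.

```python
def t2_contributions(K, y):
    """Per-variable contributions C_i (Eq. 22) and their shares P_i (Eq. 23).

    K is a q x n projection matrix given as a list of rows; y is the
    standardized past observation vector of length n.
    """
    # Canonical state z = K y, as in Eq. (21).
    z = [sum(k_ji * y_i for k_ji, y_i in zip(row, y)) for row in K]
    # C_i = sum_j z_j * K[j][i] * y_i, as in Eq. (22).
    contributions = [
        sum(z[j] * K[j][i] for j in range(len(K))) * y[i]
        for i in range(len(y))
    ]
    total = sum(contributions)  # equals z^T z, the T^2 value
    if total == 0:
        raise ValueError("T^2 is zero, so contribution shares are undefined")
    shares = [c / total for c in contributions]
    return contributions, shares


if __name__ == "__main__":
    K = [[1, 0], [0, 2]]
    y = [3, 1]
    contributions, shares = t2_contributions(K, y)
    print(contributions, shares)
```

The contributions always sum to zᵀz, so the shares add up to one and can be read directly as percentages in a contribution plot.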
3 Analysis of a Case Study on CSTR Simulink Model
3.1 Model Introduction
The CSTR Simulink model was used to generate the dataset for this case study, which is geared towards simulating incipient faults. A comprehensive description of the model can be found in [11]. Figure 1 shows the schematic diagram of the CSTR model, with the system inputs $C_i$, $T_i$ and $T_{ci}$ and the system outputs $C$, $T$, $T_c$, and $Q_c$. The dynamic model of the CSTR process is as follows:
$$\frac{dC}{dt} = \frac{Q}{V}(C_i - C) - a_1 k C + v_1 \tag{24}$$
$$\frac{dT}{dt} = \frac{Q}{V}(T_i - T) - a_1 \frac{(\Delta H_r)\, k C}{\rho C_p} - b_1 \frac{UA}{\rho C_p V}(T - T_c) + v_2 \tag{25}$$
$$\frac{dT_c}{dt} = \frac{Q_c}{V_c}(T_{ci} - T_c) + b_1 \frac{UA}{\rho_c C_{pc} V_c}(T - T_c) + v_3 \tag{26}$$
where $Q$, $\Delta H_r$, and $UA$ denote the inlet flow rate, the heat of reaction, and the heat transfer coefficient, respectively; $\rho$ and $\rho_c$, $C_p$ and $C_{pc}$, $V$ and $V_c$ represent the fluid densities, the heat capacities of the fluids, and the volumes of the tank and jacket, respectively. The training and testing datasets were generated by the CSTR simulation model during a 1200 s run at one sample per second. Each test set starts from a fault-free state and introduces a fault after 200 s of operation. Three fault scenarios are used to assess the efficacy of the proposed method; their details are given in Table 1.
Fig. 1 Schematic diagram of the CSTR model
Table 1 Fault detailed information
Fault number   Fault scenario           Fault variables   Fault type        Introduce time (s)
Fault 1        Free valve malfunction   Ci                Incipient fault   200
Fault 2        Coolant leakage          Qc                Incipient fault   200
Fault 3        Both fault 1 and 2       Ci, Qc            Incipient fault   200

Fig. 2 Monitoring results of different methods for fault 1 and fault 2
[Panels: (a) Ci fault, detection result of MK-DCCA; (b) Qc fault, detection result of MK-DCCA; (c) Ci fault, detection result of DCCA; (d) Qc fault, detection result of DCCA; (e), (f) panel titles not recoverable. Vertical axis: T2 (logarithmic scale); horizontal axis: 0–1200; legend: Test statistic, Threshold.]
Table 2 Performance indicators of fault 1 and fault 2
Fault type   Method    FDR (%)   FAR (%)   MDR (%)   FDT (s)
Fault 1      MK-DCCA   96.20     0.52      3.80      223
Fault 1      DCCA      96.00     0         4.00      231
Fault 1      DCVA      93.51     3.14      6.49      236
Fault 2      MK-DCCA   93.01     0         6.99      246
Fault 2      DCCA      92.91     0         7.09      238
Fault 2      DCVA      91.01     2.09      8.99      271
3.2 Process Monitoring
In this section, comparative experiments with different methods are conducted for fault 1 and fault 2, followed by an experimental analysis of fault 3. The methods compared are MK-DCCA, DCCA, and DCVA. The monitoring performance in the incipient fault scenarios of fault 1 and fault 2 is shown in Fig. 2. The figure shows that MK-DCCA, DCCA, and DCVA can all provide early warning and follow the trend of the faults after they occur. However, MK-DCCA outperforms DCCA and DCVA in overall performance, as the performance indicators in Table 2 show. MK-DCCA achieves fault detection rates of 96.20% and 93.01% for fault 1 and fault 2, respectively, which are higher than those of DCCA and DCVA. In terms of false alarm rates for fault 1 and fault 2, the CCA-based methods achieve lower values than the CVA-based method. MK-DCCA also maintains the lowest missed detection rates for both faults, 3.80% and 6.99%, respectively. As for Fault Detection Time (FDT), the CCA-based methods detect the faults considerably earlier than the CVA-based method. Although DCCA has a shorter detection time than MK-DCCA for fault 2, this comes at the cost of a slightly higher missed detection rate. Overall, the performance of MK-DCCA is more satisfactory.
In addition, to make the study more relevant to complex real-world applications, faults 1 and 2 were introduced into the CSTR system simultaneously, which is referred to as fault 3. The monitoring results for fault 3 are shown in Fig. 3, and the associated performance metrics are presented in Table 3. The CCA-based methods have a considerable advantage over the CVA-based method, with a detection rate approximately 2% higher, a missed detection rate approximately 2% lower, and a detection time about 20 s shorter. Moreover, in the comparison between MK-DCCA and DCCA, the former has a slight advantage in detection rate and missed detection rate. This again demonstrates the superiority of the proposed MK-DCCA method and its feasibility in complex application environments.
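The indicators reported in Tables 2 and 3 can be computed from a sequence of alarm flags, as in the following Python sketch. The paper does not define the indicators, so the commonly used definitions are assumed here: FDR and FAR are the alarm percentages among faulty and fault-free samples, MDR = 100 − FDR (which is consistent with the tabulated values), and FDT is the time of the first alarm after the fault is introduced. The function name is hypothetical.

```python
def monitoring_metrics(alarms, fault_start: int) -> dict:
    """FDR, FAR, MDR (in %) and FDT (sample index) from per-sample alarm flags."""
    normal = alarms[:fault_start]   # samples before the fault is introduced
    faulty = alarms[fault_start:]   # samples from fault introduction onward
    if not normal or not faulty:
        raise ValueError("need both fault-free and faulty samples")
    fdr = 100.0 * sum(faulty) / len(faulty)
    far = 100.0 * sum(normal) / len(normal)
    # First alarm at or after the fault introduction; None if never detected.
    fdt = next((i for i in range(fault_start, len(alarms)) if alarms[i]), None)
    return {"FDR": fdr, "FAR": far, "MDR": 100.0 - fdr, "FDT": fdt}


if __name__ == "__main__":
    # Toy sequence: one false alarm, fault introduced at index 4.
    alarms = [False, True, False, False, False, True, True, True]
    print(monitoring_metrics(alarms, fault_start=4))
```

With one sample per second, as in the CSTR experiments, the returned FDT index corresponds directly to a time in seconds.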
Fig. 3 Monitoring results of different methods for fault 3
[Panels: (a) Ci and Qc fault, detection result of MK-DCCA; (b) Ci and Qc fault, detection result of DCCA; (c) panel title not recoverable. Vertical axis: T2 (logarithmic scale); horizontal axis: 0–1200; legend: Test statistic, Threshold.]
Table 3 Performance indicators of fault 3
Method    FDR (%)   FAR (%)   MDR (%)   FDT (s)
MK-DCCA   97.80     1.04      2.20      212
DCCA      97.10     1.05      2.90      215
DCVA      95.11     0         4.90      237
3.3 Fault Identification In the previous section, the monitoring performance of the MK-DCCA method has been validated. In this section, the focus is on analyzing the fault identification capability of this method. Fault identification is performed using the contribution of the T 2 index, and the contribution plots for fault 1–3 are in Fig. 4. It is possible to ascertain that in faults 1 and 2, the variables Ci and Q c , which act as fault variables, have contributions to the statistical indicators of 98.66% and 86.32%, respectively, significantly higher than other variables. In fault 3, which involves both Ci and Q c as fault variables, their contributions are 64.3% and 34.84%, respectively, again surpassing other variables. The above results demonstrate that the fault identification method employed in this research produces good identification accuracy, enabling precise identification and localization of faults.
4 Conclusion This paper emphasizes the importance of detecting incipient faults in process industrial systems and extends the widely recognized process monitoring method, CCA, to make it more suitable for early detection of incipient faults. By incorporating time parameters, the method gains the ability to handle system dynamics. The inclusion of kernel methods endows the method with the capability to handle nonlinear data. In the selection of kernel functions, a weighted combination of RBF and polynomial kernels is chosen, allowing the kernel function to possess both good interpolation and extrapolation capabilities. Based on the aforementioned work, this paper proposes an MK-DCCA fault diagnosis method and conducts comparative experiments on the CSTR Simulink model, demonstrating the superiority of the proposed method over DCCA and DCVA methods. Nevertheless, this study has certain limitations. The method requires a considerable number of parameters, and its performance is somewhat dependent on the selection of these parameters. The calculation of thresholds is relatively inflexible, leading to limited adaptability to different monitoring objects. Future research will aim to address these issues by exploring adaptive parameter selection and threshold computation.
Fig. 4 Contribution plot of process variables under fault scenarios. Panels (a)–(c), each titled "Contribution Plot of Variables to T2"; horizontal axis: Variable (C, Ci, Qc, T, Tc, Tci, Ti); vertical axis: Contribution.
Acknowledgements This research was funded by the National Natural Science Foundation of China, Grant No. 62003106, and the Provincial Natural Science Foundation of Guizhou, Grant Nos. ZK (2021) 321 and [2017]5788.
References
1. Chen, Z., Deng, Q., Zhao, Z., Tang, P., Luo, W., Liu, Q.: Application of just-in-time-learning CCA to the health monitoring of a real cold source system. IFAC-PapersOnLine 55(6), 23–30 (2022)
2. Chen, Z., Ding, S.X., Zhang, K., Li, Z., Hu, Z.: Canonical correlation analysis-based fault detection methods with application to alumina evaporation process. Control Eng. Prac. 46, 51–58 (2016)
3. Chen, Z., Liang, K.: Canonical correlation analysis-based fault diagnosis method for dynamic processes. In: Fault Diagnosis and Prognosis Techniques for Complex Engineering Systems, pp. 51–88. Elsevier (2021)
4. Chen, Z., Zhang, K., Ding, S.X., Shardt, Y.A., Hu, Z.: Improved canonical correlation analysis-based fault detection methods for industrial processes. J. Process Control 41, 26–34 (2016)
5. Gao, L., Li, D., Yao, L., Gao, Y.: Sensor drift fault diagnosis for chiller system using deep recurrent canonical correlation analysis and k-nearest neighbor classifier. ISA Trans. 122, 232–246 (2022)
6. Jordaan, E.M.: Development of robust inferential sensors: industrial application of support vector machines for regression (2004)
7. Li, X., Mba, D., Diallo, D., Delpha, C.: Canonical variate residuals-based fault diagnosis for slowly evolving faults. Energies 12(4), 726 (2019)
8. Negiz, A., Çinar, A.: Statistical monitoring of multivariable dynamic processes with state-space models. AIChE J. 43(8), 2002–2020 (1997)
9. Pilario, K.E.S., Cao, Y., Shafiee, M.: Incipient fault detection, diagnosis, and prognosis using canonical variate dissimilarity analysis. In: Computer Aided Chemical Engineering, vol. 46, pp. 1195–1200. Elsevier (2019)
10. Pilario, K.E.S., Cao, Y., Shafiee, M.: Mixed kernel canonical variate dissimilarity analysis for incipient fault monitoring in nonlinear dynamic processes. Comput. Chem. Eng. 123, 143–154 (2019)
11. Pilario, K.E.S., Cao, Y.: Canonical variate dissimilarity analysis for process incipient fault detection. IEEE Trans. Ind. Inform. 14(12), 5308–5315 (2018)
12. Qin, S.J.: Survey on data-driven industrial process monitoring and diagnosis. Ann. Rev. Control 36(2), 220–234 (2012)
13. Ruiz-Cárcel, C., Cao, Y., Mba, D., Lao, L., Samuel, R.: Statistical process monitoring of a multiphase flow facility. Control Eng. Prac. 42, 74–88 (2015)
14. Wise, B.M., Gallagher, N.B.: The process chemometrics approach to process monitoring and fault detection. J. Process Control 6(6), 329–348 (1996)
An Improved Siamese Capsule Networks Classification Method for Few Sample Remote Sensing Images Meng Liu, Jian Cao, Hai-sheng Li, and Qiang Cai
Abstract To address the problems of insufficient data and blurred details in remote sensing images under different scenarios, we propose a method to improve Siamese Capsule Networks for classification of remote sensing images with few samples. First, a pre-trained convolutional neural network is used for feature extraction and display of the input data. Second, a residual network block is added to the convolutional neural network to improve the depth and robustness of the block. Then, an attention mechanism is added to each residual block to improve the feature extraction capability of the model on the input data. Next, the output of each residual block is fed into the capsule neural network to learn the spatial structure and static information of the input data. The capsule neural network can be used for feature extraction and representation of the input data through capsule layers and dynamic routing algorithms to improve the representation capability of the model. Finally, the outputs of the two capsule neural networks are input into the twin neural network, and the similarity measure of the two capsule vectors is performed by the loss function to determine whether the two remote sensing images belong to the same kind. The experimental results show that the model can achieve 66 and 83% detection rates at 1 shot and 5 shot under the condition of few samples, and the comprehensive performance of the method is better than the existing methods. Keywords Remote sensing images · Siamese capsule network · CBAM
M. Liu (B) · J. Cao · H. Li · Q. Cai, School of Computer Science and Engineering, Beijing Technology and Business University, Beijing 100048, China; Beijing Key Laboratory of Big Data Technology for Food Safety, Beijing 100048, China. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_3
1 Introduction
With the rapid development of aerial photography technology, remote sensing images have become more convenient and inexpensive to acquire, which gives us more diverse means to observe our world. The study of remote sensing images is of great significance in the field of remote sensing. Remote sensing images can not only reflect the current situation, but also provide dynamic information when images from different periods are compared, such as changes in cities [1], expansion of roads [2], and destruction of nature [3]. Remote sensing scene classification with precise semantic features can provide a decision basis for urban planning and disaster monitoring. With the rapid development of deep learning, convolutional neural networks (CNNs), which have good classification and feature learning capabilities, have started to be applied in the field of remote sensing image classification. Cheng et al. [7] studied the application of CNNs in remote sensing classification. They used a pre-trained CNN model as a generic feature extractor. Simple classifiers such as a linear support vector machine were then used for scene classification, with AlexNet [4] (Krizhevsky et al. 2012), VGGNet [5] (Simonyan and Zisserman 2014), and GoogleNet [6] (Szegedy et al. 2015) used for comparison. Chaib et al. [7] classify VHR image scenes using a pre-trained deep CNN model, where VGGNet is used as a feature extractor and the outputs of different useful fully connected layers are then combined. In [8], a multi-branch neural network is proposed to solve the knowledge adaptation problem for remote sensing scene datasets. Its goal is to learn invariant feature representations from multiple source domains of labeled images and target domains of unlabeled images. Although the features learned by CNNs are effective to some extent, there are still some limitations in the field of remote sensing image research. First, because remote sensing image datasets are small, deep learning methods cannot be used to full effect to extract features from remote sensing images. Second, remote sensing images have multiple views, i.e., multiple perspectives of the scene.
In each category, different images have different angles and positions. Last but not least, CNNs usually work very well when the training and test sets are close to each other. If the distribution of the test images changes, the performance of the CNN drops dramatically. To address the above problems, this paper proposes an improved twin network with remote sensing images as data samples, and solves the problem of overfitting of traditional neural networks in remote sensing image datasets with few samples by introducing capsule networks as sub-networks. Meanwhile, the residual module integrates the CBAM attention mechanism, which enhances the model’s attention to the important feature information of remote sensing images and effectively improves the classification accuracy of remote sensing images with few samples.
2 Improved Siamese Capsule Networks
2.1 Siamese Capsule Networks
The capsule concept was first proposed by Hinton et al. in 2011, and Siamese Capsule Networks [9] build on it for detection and classification problems. With the development of capsule networks, Siamese Capsule Networks have gradually become a research hotspot with expanding applications. Siamese Capsule Networks consist of two capsule networks with the same structure that share the same weights and parameters. Capsule networks are a type of neural network structure designed to overcome limitations of traditional convolutional neural networks. They represent features as vectors, which captures the spatial relationships between features better and thus improves the generalization ability of the model. In Siamese Capsule Networks, each capsule is a vector that represents a certain feature of the input data. The main idea of Siamese Capsule Networks is to determine the similarity of two inputs by comparing their distance in the capsule space. The training process of Siamese Capsule Networks usually employs the dynamic routing algorithm for capsule networks, which enables the network to learn the spatial relationships between features while learning the feature representations. Specifically, in the dynamic routing algorithm, the network adjusts the spatial relationships between capsules by iteratively updating the matrix of transfer probabilities (coupling coefficients) between capsules, thus improving the performance of the model.
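The routing step described above can be sketched in a few lines. The code below follows the commonly used routing-by-agreement formulation (softmax over routing logits, a squashing non-linearity, and logit updates by dot-product agreement); the paper does not list its routing details, so the number of iterations, the function names, and the absence of learned transformation matrices are assumptions made for illustration.

```python
import math


def squash(v):
    """Shrink vector v so its length lies in [0, 1) while keeping its direction."""
    sq = sum(x * x for x in v)
    if sq == 0.0:
        return [0.0] * len(v)
    scale = sq / (1.0 + sq) / math.sqrt(sq)
    return [scale * x for x in v]


def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [x / s for x in e]


def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the prediction vector of lower capsule i for upper capsule j.

    Returns the upper capsule outputs and the coupling coefficients
    (the "transfer probabilities") used to compute them.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # each row sums to 1
        v = []
        for j in range(n_out):
            s = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in)) for d in range(dim)]
            v.append(squash(s))
        # Raise the logit where a prediction agrees with the resulting output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c
```

In this sketch, lower capsules whose predictions agree on an upper capsule end up with larger coupling coefficients toward it, which is the behavior the paragraph above refers to.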
2.2 Residual Module
The residual module [10] is a module for improving deep convolutional neural networks, first proposed by Kaiming He et al. in 2015 to solve the deep network degradation problem. In traditional deep convolutional neural networks, the performance of the model gradually saturates and can even degrade as the depth of the network increases. The residual module enables the network to learn residual information by introducing cross-layer (shortcut) connections, thus improving the performance and stability of the model. Specifically, in the residual module, the input is passed through two or more convolutional layers with a nonlinear activation function, and the result is added to the original input to obtain the output. That is, the output equals the input plus the residual information computed by the convolutions. This cross-layer connection alleviates the degradation problem of deep networks and allows the network to learn the residual information. In this paper, we use ResNet50 as the residual network.
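The shortcut connection can be illustrated with a toy example. This is a minimal sketch on plain Python lists: `transform` is a hypothetical stand-in for the stacked convolution and activation layers, not an actual ResNet50 block.

```python
def residual_block(x, transform):
    """Return x + F(x), where `transform` plays the role of the stacked layers F."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("F(x) must have the same shape as x for the identity shortcut")
    return [a + b for a, b in zip(x, fx)]


if __name__ == "__main__":
    # If the layers learn F(x) = 0, the block reduces to the identity mapping.
    print(residual_block([1.0, -2.0], lambda x: [0.0 for _ in x]))
    print(residual_block([1.0, -2.0], lambda x: [0.5 * v for v in x]))
```

The first call shows why extra residual blocks need not hurt a deep network: when the learned residual is zero, the input passes through unchanged.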
2.3 CBAM CBAM (Convolutional Block Attention Module) is proposed by literature [6], which consists of channel attention and spatial attention connected, and the CBAM network structure diagram is shown in Fig. 1.
Fig. 1 CBAM attention mechanism
Fig. 2 Channel attention module
The Channel Attention Module compresses the feature map in the spatial dimension to obtain a one-dimensional vector and then operates on it. The compression in the spatial dimension uses not only average pooling but also max pooling. Average pooling and max pooling aggregate the spatial information of the feature map; the two pooled descriptors are sent through a shared network, and the outputs are summed element by element to produce the channel attention map. Channel attention focuses on which channels of a given feature map are important. The channel attention mechanism is shown in Fig. 2. The Spatial Attention Module is a channel compression mechanism that performs mean pooling and max pooling along the channel dimension. Max pooling extracts the maximum value across the channels at each spatial position, so the number of extracted values is the height multiplied by the width; average pooling extracts the mean across the channels in the same way, again giving height times width values. The spatial attention mechanism is shown in Fig. 3. Finally, the outputs of the spatial attention module and the channel attention module are summed to form the CBAM-adjusted feature map.
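For reference, the two attention maps can be written compactly. The expressions below follow the formulation in the original CBAM publication rather than anything stated in this paper: $F$ is the input feature map, $\sigma$ the sigmoid function, MLP the shared network, $f^{7\times 7}$ a convolution with a 7 × 7 kernel, and $\otimes$ element-wise multiplication. In that formulation the two maps are applied one after the other by multiplication, so the summation described above may correspond to a variant used by the authors.

```latex
\begin{aligned}
M_c(F)  &= \sigma\big(\mathrm{MLP}(\mathrm{AvgPool}(F)) + \mathrm{MLP}(\mathrm{MaxPool}(F))\big), \\
M_s(F') &= \sigma\big(f^{7\times 7}([\mathrm{AvgPool}(F');\ \mathrm{MaxPool}(F')])\big), \\
F'      &= M_c(F) \otimes F, \qquad F'' = M_s(F') \otimes F'.
\end{aligned}
```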
Fig. 3 Spatial attention module
Fig. 4 Improved Siamese Capsule Networks
2.4 Complete Model
The general structure of the improved network in this paper is shown in Fig. 4. We first preprocess the input remote sensing images by resizing them to the input size required by the model, and then perform convolutional feature extraction using ResNet to obtain a feature tensor. The twin capsule network is next used to process the two input images. For each input image, we use a capsule network in which each capsule represents a specific feature vector. In this step, we use the dynamic routing algorithm to train the capsule network to improve the performance and accuracy of the model, and we obtain the capsule vectors of the two images. The CBAM attention mechanism is used to extract the important features: a CBAM module applies attention to the output of ResNet. The CBAM module consists of two parts, channel attention and spatial attention. Channel attention is used to enhance the correlation between different channels, and spatial attention is used to enhance the correlation in specific spatial regions. The result of this step is a feature tensor. Finally, we calculate the similarity between each image pair using the Euclidean distance, and we use the contrastive loss function to calculate the loss of the model and guide its training and optimization.
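The final similarity step can be sketched as follows. The paper does not give the exact form of its contrastive loss, so the code assumes the widely used margin-based version: matching pairs are penalized by their squared distance and non-matching pairs only when they are closer than a margin. The `margin` value and the factor 0.5 are assumptions.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two capsule (feature) vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(u, v, same_class, margin=1.0):
    """Margin-based contrastive loss for one image pair (assumed form)."""
    d = euclidean_distance(u, v)
    if same_class:
        return 0.5 * d ** 2                    # pull matching pairs together
    return 0.5 * max(0.0, margin - d) ** 2     # push non-matching pairs apart


if __name__ == "__main__":
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=True))
    print(contrastive_loss([0.0, 0.0], [3.0, 4.0], same_class=False))
```

A pair from different classes that is already farther apart than the margin contributes zero loss, so training effort is spent on pairs that are still confusable.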
Fig. 5 Remote sensing image
3 Experimental Analysis
3.1 Image Database
The UC Merced Land Use Dataset is an image dataset for remote sensing image classification, created by researchers at the University of California, Merced (UC Merced). The dataset contains 21 categories of high-resolution aerial imagery, each containing 100 color images of 256 × 256 pixels. The dataset provides a benchmark for computer vision tasks such as land-feature classification and target detection. Figure 5 shows some images from the dataset. The image dataset was augmented by rotation, random cropping, and adding noise. The augmented dataset contains a total of 5234 images. From the augmented dataset, one image (1-shot) or five images (5-shot) of each category are sampled as the training set, and the rest are used as the test set.
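The 1-shot and 5-shot splits described above can be sketched as follows. This is an illustrative helper, not the authors' data pipeline: the function name, the fixed seed, and the use of integer sample indices are assumptions.

```python
import random


def k_shot_split(labels, k, seed=0):
    """Split sample indices into a k-per-class training set and a test set."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    train, test = [], []
    for label, indices in by_class.items():
        if len(indices) < k:
            raise ValueError(f"class {label!r} has fewer than {k} samples")
        chosen = set(rng.sample(indices, k))  # k training samples for this class
        train.extend(i for i in indices if i in chosen)
        test.extend(i for i in indices if i not in chosen)
    return train, test


if __name__ == "__main__":
    labels = ["airplane"] * 4 + ["beach"] * 6
    train_idx, test_idx = k_shot_split(labels, k=1)
    print(len(train_idx), len(test_idx))
```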
3.2 Training Parameter Settings
The experiments in this paper were run on a Windows 10 computer with an AMD R7-4800H CPU and an Nvidia GeForce GTX 1650 (4 GB) graphics card, using the deep learning framework PyTorch 1.7.1. The input images are uniformly resized to 256 × 256. During training, optimization was performed using the SGD algorithm for 100 epochs; the base learning rate was set to 0.001, the momentum to 0.9, and the batch size to 2, and BCEWithLogitsLoss was used as the loss function.
Table 1 Ablation results of the improved Siamese Capsule Networks on the UC Merced Land Use Dataset
Model: Siamese Capsule Networks (columns: Resnet50, CBAM, 1-shot, 5-shot)
√ √ √
√
1-shot            5-shot
58.02% ± 0.43%    71.02% ± 0.73%
62.02% ± 0.67%    74.02% ± 0.39%
66.02% ± 0.53%    78.02% ± 0.63%
68.02% ± 1.03%    83.31% ± 0.69%
Table 2 Comparison experiments with different models (classification accuracy)
Method                              1-shot            5-shot
ProtoNet                            40.35% ± 1.02%    69.55% ± 0.55%
Meta-SGD                            60.58% ± 0.94%    76.04% ± 0.49%
DLA-MatchNet                        67.85% ± 0.68%    79.97% ± 0.75%
Improved Siamese Capsule Network    68.02% ± 1.03%    83.31% ± 0.69%
3.3 Experimental Results and Analysis
3.3.1 Ablation Experiment
To verify the effectiveness of each module of the improved Siamese Capsule Networks model proposed in this article, we conducted ablation experiments. From Table 1, it can be seen that the classification accuracy of the model improves to different degrees after adding the two modules, Resnet and CBAM, to the original model.
3.3.2 Comparison Experiments
In this experiment, three models with high accuracy in current small-sample image classification are selected for comparison with the proposed model. As can be seen from Table 2, with one training sample per class the improved twin neural network outperforms ProtoNet, Meta-SGD, and DLA-MatchNet by about 27.7, 7.4, and 0.2 percentage points, respectively, and with five training samples per class it outperforms them by about 13.8, 7.3, and 3.3 percentage points, respectively.
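The margins over the compared methods can be recomputed directly from the mean accuracies in Table 2. The short script below only does that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Mean accuracies (%) from Table 2: (1-shot, 5-shot).
table2 = {
    "ProtoNet": (40.35, 69.55),
    "Meta-SGD": (60.58, 76.04),
    "DLA-MatchNet": (67.85, 79.97),
}
proposed = (68.02, 83.31)  # Improved Siamese Capsule Network

# Difference in percentage points between the proposed model and each baseline.
margins = {
    name: (round(proposed[0] - one, 2), round(proposed[1] - five, 2))
    for name, (one, five) in table2.items()
}

if __name__ == "__main__":
    for name, (m1, m5) in margins.items():
        print(f"{name}: 1-shot +{m1:.2f}, 5-shot +{m5:.2f}")
```

These differences are in percentage points of mean accuracy and do not account for the reported ± intervals; for example, the 1-shot margin over DLA-MatchNet is smaller than either method's interval.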
4 Conclusion
For the problem of highly similar remote sensing images with small sample sizes, an improved Siamese Capsule Network is proposed in this paper. A residual module is added after the capsule layer in the original network to improve the training efficiency and generalization ability of the model. Meanwhile, an attention mechanism is embedded in the residual module so that the model can better focus on the important regions in the image. Experiments show that the improved network model proposed in this study achieves better recognition accuracy than the original network and some improved networks. The main conclusions are as follows: (1) Adding the ResNet50 residual layer after the capsule layer allows the network to be deeper, thus improving the performance of the network and avoiding the problems of vanishing and exploding gradients during training. (2) Combining the CBAM attention mechanism module with the model makes full use of the feature map information, improves the model's attention to the informative features of remote sensing images, and improves the recognition accuracy of the model for remote sensing images with a small number of samples.
Acknowledgements This work was supported by the National Natural Science Foundation of China No. 62277001 and Scientific Research Program of Beijing Municipal Education Commission KZ202110011017.
References
1. Zhang, L., Zhang, L., Bo, D.: Deep learning for remote sensing data: a technical tutorial on the state of the art. IEEE Geosci. Rem. Sens. Mag. 4(1), 22–40 (2016)
2. Chen, Y., Lin, Z., Zhao, X., Wang, G., Gu, Y.: Deep learning-based classification of hyperspectral data. IEEE J. Sel. Topics Appl. Earth Observ. Rem. Sens. 7(2), 2094–2107 (2014)
3. Mohan, A., Singh, A.K., Kumar, B., Dwivedi, R.: Review on remote sensing methods for landslide detection using machine and deep learning. Trans. Emerg. Telecommun. Technol. 32(3), e3998 (2021)
4. Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. Commun. ACM 60(4), 84–90 (2017)
5. Simonyan, K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition (5) (2014). arXiv preprint arXiv:1409.1556
6. Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A.: Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 6, pp. 1–9 (2015)
7. Cheng, G., Ma, C., Zhou, P., Yao, X., Han, J.: Scene classification of high resolution remote sensing images using convolutional neural networks. In: 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) (7), pp. 767–770. IEEE (2016)
8. Chen, S.-B., Wei, Q.-S., Wang, W.-Z., Tang, J., Luo, B., Wang, Z.-Y.: Remote sensing scene classification via multi-branch local attention network. IEEE Trans. Image Process. 31(8), 99–109 (2021)
9. O'Neill, J.: Siamese capsule networks (9) (2018). arXiv preprint arXiv:1805.07242
10. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, vol. 10, pp. 770–778 (2016)
Robotic Arm Grasping Based on Deep Reinforcement Learning Ziyan Li and Junyong Zhai
Abstract The paper proposes a model-free robotic grasping method based on deep reinforcement learning. Robot grasping unknown objects is a challenging task that usually requires prior knowledge of target objects, manual labeling of training data, or large amount of experience data obtained from training to achieve generalization. The grasp strategy proposed in this paper is based on an on-policy deep reinforcement learning (DRL) algorithm, namely the proximal policy optimization (PPO), combined with the self-attention mechanism, which does not require any prior knowledge of specific objects or training data of specific scenes. Keywords Grasping · Deep reinforcement learning · Self-attention
Z. Li · J. Zhai (B), School of Automation, Southeast University, Nanjing 210096, China. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_4
1 Introduction
Industrial robots are widely used in industrial production, and grasping is one of the most basic operational tasks. However, current mature vision-based robotic grasping methods require much prior knowledge, such as the geometric characteristics of the object (points, lines), the parameters of the camera, and so on. The control algorithm can generally be divided into sensing, pose and position estimation, planning, and action stages: the robot first perceives the environment, identifies objects and determines suitable grasping poses, and then plans the path. This process is clearly different from human grasping behavior, where grasping is a dynamic process with sensing and action tightly interleaved at each stage. Traditional grasping methods, which require the redesign of object features and grasping schemes when the environment or task changes, do not meet the real-world generalization needs of robotic grasping. With the development of AI, deep reinforcement learning (DRL) provides a model-free learning method that enables end-to-end robot grasping. Over time, the RL agent learns the desired behavior through continuous trial and error in its interactions with the environment, enabling self-supervised learning of grasping. The main challenge for learning-based robot grasping is generalization: whether the robot can extend the operational knowledge learned from self-supervision to new environments and grasp unknown objects. At present, studies of the generalization performance of DRL focus on video game applications [1] and simple simulated robots [2], and DRL is rarely applied to complex robotic grasping. The work in [3] grasped new objects successfully based on 580k grasp attempts, which were collected by 7 real robots over several weeks. The series of off-policy DRL algorithms proposed by Quillen et al. [4] also requires 1M grasp attempts. Achieving generalization in this way is extremely expensive and difficult to reproduce. In this paper, a grasping strategy based on the self-attention mechanism and a DRL algorithm is proposed. After training for 5–7 h, this method achieves a 70% grasp success rate on the test set. Inspired by other fields of deep learning (DL), this paper adds self-attention to the learning framework for robotic grasping. In DL, attention models have been successfully applied to complex visual tasks such as video and scene understanding. The work [1] improved performance on video games by incorporating temporal and spatial attention mechanisms into the underlying network architecture of DRL. It is therefore reasonable to expect that incorporating self-attention into the vision-based robotic grasping DRL task is effective. In addition, this paper adopts an on-policy DRL algorithm, proximal policy optimization (PPO) [5]. The actor network of PPO is updated with a new idea: its objective function can be iteratively updated with a small number of samples over multiple rounds of training, which addresses the difficulty of choosing a step size and the excessive update variance of traditional policy gradient algorithms. PPO therefore has the advantages of being highly adaptable and stable in training.
In this paper, the main contributions are as follows:
• The paper proposes a model-free robotic grasping method based on an on-policy DRL algorithm. It requires neither prior knowledge of the target object nor large amounts of experience data that are time-consuming to collect, and it achieves grasping generalization with less training time.
• The Markov property of the RL input state, i.e., that the next state depends only on the current state, is exploited to improve learning performance by incorporating temporal and spatial attention mechanisms into the feature extraction architecture.
2 Related Work
Research on robotic grasping is divided into two main approaches: geometry-driven and data-driven. Geometry-driven strategies require the design and extraction of geometric features of the target object, combined with camera parameters, to calculate a suitable grasping position, and they are usually combined with traditional visual servo grasping. Visual servoing uses visual feedback to control the robot's motion. However, such methods rely on hand-designed features and camera parameters and have very limited generalization performance.
In contrast, data-driven grasping does not require robot kinematics, object geometry, or physical properties. At present, data-driven grasping solutions focus on the estimation of grasping positions and poses. The system senses the environment, selects the best grasping pose, and then plans paths to these positions. One type performs accurate pose estimation and grasp planning on known target objects in a database [6], and it grasps more robustly than geometry-driven approaches in some cluttered and occluded scenarios; the second type generates grasping poses directly end-to-end from the input image or point cloud information [7], which enables the grasping of unknown objects. However, all these grasping schemes are open-loop. The strategy proposed in this paper allows for both generalization performance and dynamic closed-loop grasping. The latest progress in DRL can effectively solve complex control problems for robotic arms, such as manipulation tasks [8], grasping tasks, and movement tasks [9]. The work in [3] achieves generalization to some extent, but is very time-consuming and labor-intensive. The main challenge of learning-based robotic arm grasping is generalization.
3 Main Results
3.1 Reinforcement Learning
The RL agent learns by interacting with the environment to complete a specific task and maximize the reward. The RL process can be represented by a Markov decision process (MDP), described by a quadruplet $(s_t, a_t, s_{t+1}, r_t)$. During interaction, the agent uses its policy function $\pi_\theta(a|s)$ to generate an action $a_t$ that acts on the environment in the current state $s_t$, and then obtains the next state $s_{t+1}$ and the reward $r(a_t, s_t) = r_t$. This paper considers a finite horizon: episodes have a length of $T$ steps. The ultimate goal of the agent is to learn an optimal policy $\pi_\theta(a|s)$ that maximizes the discounted cumulative return $G_t$:
$$G_t = \sum_{k=t}^{T} \gamma^{k-t} r_k(a_k, s_k) \tag{1}$$
where $t \in [1, T]$ and $\gamma \in [0, 1]$ is the discount factor.
In this paper, we adopt an actor-critic algorithm that combines two different networks. The actor network generates the action $a_t$ by approximating $\pi_\theta(a|s)$ at the current time $t$, and the critic network estimates the value function $V(s)$ or the action-value function $Q(s, a)$ to evaluate the actor network. The critic network is updated by minimizing the temporal-difference error (TD-error), given as:
$$Q(s_t, a_t) = \mathbb{E}_{s_{t+1}, a_{t+1}, \ldots}(G_t \mid s_t, a_t) \tag{2}$$
$$V(s_t) = \mathbb{E}_{a_t, s_{t+1}, \ldots}(G_t \mid s_t) \tag{3}$$
$$L(\omega_t) = r_t + \gamma \max_{a_{t+1}} Q(s_{t+1}, a_{t+1}; \omega_{t-1}) - Q(s_t, a_t; \omega_t) \tag{4}$$
where $a_t \sim \pi_\theta(a_t|s_t)$, $s_{t+1} \sim P(s_{t+1}|(s_t, a_t))$, and $P$ is the transition probability distribution. The actor network, on the other hand, is trained by the policy gradient method, in which $\theta$ is updated along the ascending direction of the gradient to maximize the critic output:
$$\theta \leftarrow \theta + \nabla_\theta \log \pi_\theta(a_t|s_t) V(s_t). \tag{5}$$
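The discounted return in Eq. (1) and the TD-error in Eq. (4) can be evaluated numerically as follows. This is a minimal sketch on plain Python lists: the function names are illustrative, and the Q-values passed to `td_error` are placeholder numbers rather than outputs of a trained critic.

```python
def discounted_returns(rewards, gamma):
    """Return [G_1, ..., G_T] with G_t = sum_{k=t}^{T} gamma^(k-t) * r_k (Eq. 1)."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so that G_t = r_t + gamma * G_{t+1}.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def td_error(reward, gamma, next_q_values, q_value):
    """TD-error of Eq. (4): r_t + gamma * max_a Q(s_{t+1}, a) - Q(s_t, a_t)."""
    return reward + gamma * max(next_q_values) - q_value


if __name__ == "__main__":
    # Sparse reward: 1 only at the final step of a successful episode.
    print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))
    print(td_error(1.0, 0.5, next_q_values=[0.2, 0.4], q_value=0.7))
```

With a sparse reward, the backward recursion shows how a single success signal at the end of an episode is propagated, with discounting, to every earlier step.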
3.2 Model Structure
The actor-critic structure adopted in this paper is shown in Fig. 1. The actor and critic networks share a common feature network that extracts features from the input $s_t$, where $s_t$ is an RGB image of size (48, 48, 3). The model adopts a sparse reward: when an object is grasped in an episode, the reward is 1; otherwise it is 0. The actor network generates continuous actions represented by Cartesian displacements (dx, dy, dθ), where dx, dy, and dθ are the x-axis offset, the y-axis offset, and the offset of the wrist's vertical angle, respectively. The z-axis offset is taken as a fixed value of 0.06.
Fig. 1 Actor-critic architecture
3.3 Proximal Policy Optimization

PPO adopts a policy gradient approach that is much easier to implement than trust region policy optimization (TRPO) [10]. The traditional policy gradient algorithm is very sensitive to the step size, which makes a suitable step size difficult to choose. During training, large differences can easily arise between the new and old policies, which hinders convergence to an optimal policy. The actor objective proposed by PPO can be optimized over multiple epochs on a small batch of samples, which addresses both the difficulty of choosing the step size and the problem of overly large updates. PPO tries to maximize the policy improvement step while keeping the difference between the new and old policies small. The policy objective is given by:

$$L^{CLIP}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\hat{A}_t,\ \mathrm{clip}\left(r_t(\theta), 1-\epsilon, 1+\epsilon\right)\hat{A}_t\right)\right] \tag{6}$$

$$r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{old}}(a_t \mid s_t)} \tag{7}$$
where $r_t(\theta)$ is the probability ratio between the new and old policies and is used to constrain the new policy. The term $\mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\hat{A}_t$, called the clipped advantage function, serves the same purpose. $\epsilon$ is a hyper-parameter, usually set to 0.2. The expectation $\hat{\mathbb{E}}_t$ denotes the empirical average over a finite batch of samples, in an algorithm that alternates between sampling and optimization. $\hat{A}_t$ is the estimate, obtained by generalized advantage estimation, of the advantage function $A^\pi(s_t, a_t)$:

$$A^\pi(s_t, a_t) = Q(s_t, a_t) - V(s_t) \tag{8}$$
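The objective in Eqs. (6)–(8) can be illustrated with a small standard-library sketch. It is not the authors' TensorFlow implementation: the function names are hypothetical, the default values of `gamma`, `lam` and `eps` are taken from Table 1, and the advantage estimator assumes that no episode boundary falls inside the trajectory segment.

```python
import math


def gae_advantages(rewards, values, last_value, gamma=0.993, lam=0.9):
    """Generalized advantage estimation over one trajectory segment.

    Assumption: no episode ends inside the segment (a simplification).
    """
    values = list(values) + [last_value]
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD residual, then its exponentially weighted sum.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def clipped_surrogate(logp_new, logp_old, advantages, eps=0.2):
    """Batch mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A), as in Eq. (6)."""
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        ratio = math.exp(lp_new - lp_old)  # r_t(theta), Eq. (7)
        clipped = min(max(ratio, 1.0 - eps), 1.0 + eps)
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, a ratio above 1 + ε contributes no more than (1 + ε) times the advantage, so the objective gives no incentive to move the new policy further from the old one on that sample.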
3.4 Attention Mechanism

Self-attention attends to the global information of the feature map. A CNN, by contrast, has a receptive field limited by the size of its convolution kernel, so multiple layers must be stacked before the network can cover the whole feature map. The self-attention types adopted in this paper are Luong’s dot-product type [11] and Bahdanau’s additive type [12], which are applied in the 5 architectures shown in Fig. 2. An attention layer is added after each CNN layer except the last one. The input state is changed from a single 48 × 48 RGB image to a stack of four 48 × 48 gray-scale images, which contains enough spatial and temporal information to satisfy the Markov assumption.
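The two score functions can be contrasted with a minimal sketch. It assumes that the flattened feature-map positions act as queries, keys and values at the same time; how the paper wires the attention layers into the CNN is not specified here, and all names (`w_q`, `w_k`, `v`, `self_attention`) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(matrix, vector):
    return [dot(row, vector) for row in matrix]


def luong_dot_score(query, key):
    """Dot-product score: q . k."""
    return dot(query, key)


def bahdanau_additive_score(query, key, w_q, w_k, v):
    """Additive score: v . tanh(W_q q + W_k k)."""
    hidden = [math.tanh(a + b)
              for a, b in zip(matvec(w_q, query), matvec(w_k, key))]
    return dot(v, hidden)


def self_attention(features, score_fn):
    """Each position attends to every position of the flattened feature map."""
    outputs = []
    for q in features:
        weights = softmax([score_fn(q, k) for k in features])
        outputs.append([sum(w * k[i] for w, k in zip(weights, features))
                        for i in range(len(q))])
    return outputs
```

The dot-product score has no parameters of its own, whereas the additive score introduces the learnable `w_q`, `w_k` and `v`; in both cases the softmax weights of one query sum to 1 over all positions.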
Z. Li and J. Zhai
Fig. 2 Feature network architecture with attention layers
4 Experiments

4.1 Software and Hardware Configurations

The algorithm is implemented in Python 3.6 with TensorFlow 2.6 on Ubuntu 18.04 (GNU/Linux). The Kuka-Diverse-Object-Env, built with PyBullet [13], contains a total of 1000 different objects, divided 9:1 into a training set and a test set. In each grasping scene, 5 different objects are randomly placed in a box for a 7-axis Kuka robot arm to grasp. Each algorithm is run 5 times with random seed initialization. Table 1 shows the hyper-parameters used by PPO. Each season contains 1024 steps (the trajectory length), which corresponds to about 140 episodes of 7–8 steps each. The score in the following result graphs is the average of all episode scores in a season.
Table 1 Hyper-parameters used for PPO

Parameters                  Value
Discount factor γ           0.993
Learning rate α             0.0002
Clip factor in PPO ε        0.2
Discount factor in GAE λ    0.9
Trajectory length N         1024
Training epochs             20
Batch size                  128
Season                      35

(a) attention of Luong’s type
(b) attention of Bahdanau’s type
Fig. 3 Effect of self-attention and architectures on grasping score
4.2 The Impact of Attention Mechanism

Figure 3a, b shows the training results of the two attention mechanisms, Luong’s type and Bahdanau’s type, respectively, in the 5 architectures shown in Fig. 2. To emphasize the features extracted by the attention layer, arch(e) modifies arch(b) by multiplying the output of the attention layer by 2 and adding the layer’s input; the sum is the input to the next network layer. Within each of Fig. 3a and Fig. 3b, Luong-4 and Bahdanau-4 perform best. Comparing the two panels, Luong-4 performs better than Bahdanau-4. These results indicate that the attention mechanism improves the performance of DRL algorithms.

To see how the attention layers work, this paper uses Grad-CAM to visualize different attention layers, as shown in Fig. 4. The upper and lower rows show the visualization results for the outputs of the last two attention layers. Figure 4 shows that the highlighted areas, namely the target objects and the end-effector, are given more importance. The objects in the lower row are clearer than those in the upper row, indicating that the attention layer assigns more importance to them and separates them clearly from other regions.
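The combination rule described for arch(e) amounts to a scaled skip connection. A minimal sketch on flat feature lists follows; the function name and the flat-list representation are assumptions made for illustration only.

```python
def arch_e_combine(attention_input, attention_output, scale=2.0):
    """Return scale * attention_output + attention_input, element by element."""
    if len(attention_input) != len(attention_output):
        raise ValueError("input and output must have the same length")
    return [scale * out + inp
            for inp, out in zip(attention_input, attention_output)]
```

Because the layer's input is added back, the next layer still receives the original features even where the attention output is close to zero.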
Fig. 4 Visualization of attention layers
Fig. 5 Four adjacent gray-scale images
Fig. 6 Effect of self-attention and stacking on grasping score
4.3 The Impact of Stacked Frames

Observation of the training process shows that occlusion occurs and greatly degrades the performance of the algorithm. To address this problem, as described in Sect. 3.4, a stack of four adjacent gray-scale images is used as input, as shown in Fig. 5. When stacked frames are combined with the attention mechanism, the model can attend to several regions simultaneously and adjust its attention over time to handle partially observable situations. Figure 6 shows the training results. The grasp success rate reaches 71%, which is 15% higher than with a single RGB image as input.
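Frame stacking of this kind is commonly implemented with a fixed-length buffer. The sketch below is one possible version under stated assumptions: the luma weights used for the gray-scale conversion and the `FrameStack` class are illustrative choices, not details taken from the paper.

```python
from collections import deque


def rgb_to_gray(image):
    """Convert a nested list of (r, g, b) pixels to gray-scale.

    The 0.299 / 0.587 / 0.114 luma weights are an assumed, commonly used choice.
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in image]


class FrameStack:
    """Keep the most recent `num_frames` frames as the agent's state."""

    def __init__(self, num_frames=4):
        self.frames = deque(maxlen=num_frames)

    def reset(self, frame):
        # At the start of an episode, fill the stack with the first frame.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(frame)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)
```

A typical loop would call `reset` with the first gray-scale frame of an episode and `step` with each subsequent frame, feeding the returned list of four frames to the feature network.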
Fig. 7 Grasp success rate (%) of different models on the test set
Several of the trained grasping models above are applied to the test set, whose objects are completely different from those in the training set. Each grasping model is run five times, and the results are averaged. As shown in Fig. 7, the Luongstack-4 model achieves a success rate of 70%, indicating that the proposed model-free grasping strategy can grasp new objects.
5 Conclusion

In this paper, a model-free robotic grasping strategy based on PPO is proposed. Combined with the self-attention mechanism, this strategy effectively improves the learning ability. At the same time, to satisfy the Markov property required by reinforcement learning, a stack of four adjacent gray-scale images is used as the input state, which alleviates the occlusion problem to a certain extent and improves the grasping success rate. The method does not require any prior knowledge of the target object and is not limited to specific scenes. It is a fully self-supervised learning method that can grasp new objects. However, the method uses sparse rewards, which results in a low sample utilization rate and a low grasp success rate. Future work will adopt methods that reduce the impact of sparse rewards and improve the grasp success rate.

Acknowledgements This work was supported in part by Natural Science Foundation of Jiangsu Province (BK20211162).
References

1. Manchin, A., Abbasnejad, E., van den Hengel, A.: Reinforcement learning with attention that works: a self-supervised approach. In: 26th International Conference on Neural Information Processing, pp. 223–230 (2019)
2. Lillicrap, T.P., Hunt, J.J., Pritzel, A., et al.: Continuous control with deep reinforcement learning (2015). arXiv preprint arXiv:1509.02971
3. Kalashnikov, D., Irpan, A., Pastor, P., et al.: QT-OPT: scalable deep reinforcement learning for vision-based robotic manipulation (2018). arXiv preprint arXiv:1806.10293
4. Quillen, D., Jang, E., Nachum, O.: Deep reinforcement learning for vision-based robotic grasping: a simulated comparative evaluation of off-policy methods. In: IEEE International Conference on Robotics and Automation, pp. 6284–6291 (2018)
5. Schulman, J., Wolski, F., Dhariwal, P., et al.: Proximal policy optimization algorithms (2017). arXiv preprint arXiv:1707.06347
6. Zeng, A., Yu, K.T., Song, S., et al.: Multiview self-supervised deep learning for 6d pose estimation in the amazon picking challenge. In: IEEE International Conference on Robotics and Automation, pp. 1386–1383 (2017)
7. Pinto, L., Gupta, A.: Supersizing self-supervision: learning to grasp from 50k tries and 700 robot hours. In: IEEE International Conference on Robotics and Automation, pp. 3406–3413 (2016)
8. Gu, S., Holly, E., Lillicrap, T., et al.: Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In: IEEE International Conference on Robotics and Automation, pp. 3389–3396 (2017)
9. Yue, P., Xin, J., Zhao, H., et al.: Experimental research on deep reinforcement learning in autonomous navigation of mobile robot. In: 14th IEEE Conference on Industrial Electronics and Applications, pp. 1612–1616 (2019)
10. Schulman, J., Levine, S., Moritz, P., et al.: Trust region policy optimization. In: International Conference on Machine Learning, pp. 1889–1897 (2015)
11. Luong, M.T., Pham, H., Manning, C.D.: Effective approaches to attention-based neural machine translation (2015). arXiv preprint arXiv:1508.04025
12. Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate (2014). arXiv preprint arXiv:1409.0473
13. PyBullet. Bullet real-time physics simulation. https://pybullet.org/wordpress/
Multidimensional Non-linear Ship Trajectory Prediction Based on LSTM Network Corrected by GA-BP

Xinyu Wang, Wenyu Zhao, Shuangxin Wang, and Jingyi Liu
Abstract To address the poor stability and accuracy of ship trajectory prediction by a single model, an LSTM ship trajectory prediction model is established to predict trajectories from ship AIS data, and the LSTM prediction error is corrected by a GA-BP model. Experimental validation with three prediction models shows that the error correction model has a smaller prediction error and better stability and can accurately predict the ship trajectory, which is important for avoiding and controlling maritime traffic accidents.

Keywords Trajectory prediction · LSTM · GA-BP
X. Wang · W. Zhao · S. Wang (B)
Beijing Jiaotong University, School of Mechanical, Electronic and Control Engineering, No. 3, Shangyuancun, Beijing 100044, China
J. Liu
CETC Key Laboratory of Aerospace Information Applications, Shijiazhuang, Hebei, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_5

1 Introduction

With the growth of trade in the South China Sea, maritime traffic faces a high risk of collisions. To monitor the navigation dynamics of ships, vessels are equipped with an automatic identification system (AIS). Using AIS data to predict ship trajectories enables effective collision warning and can assist the relevant authorities in regulating marine traffic.

Scholars have proposed a variety of ship track prediction models, including dynamic models, statistical models and neural network models. The dynamic model is based on physical modeling [11], mainly using the Kalman filter [6], vector analysis [3], etc. The statistical model uses a large amount of historical trajectory data to find navigation rules. Qiao et al. [5] proposed a trajectory prediction method based on a hidden Markov model, which can predict the continuous trajectory of moving targets. Neural network models have many advantages in nonlinear fitting and multi-feature fusion. Zhen et al. [10] and Xu et al. [9] used a BP neural network to achieve multidimensional ship trajectory prediction. Considering the slow convergence of BP neural networks [4], Li et al. [2] adopted an adaptive particle swarm optimization algorithm to optimize the traditional BP neural network for ship course prediction. Wang et al. [7] proposed a ship trajectory prediction method based on an LSTM model. Wu [8] corrected LSTM network errors with BP neural networks, further improving the prediction accuracy of ship trajectories.
2 AIS Data Preprocessing

In this paper, the extracted AIS trajectory point information includes MMSI, longitude, latitude, course, speed, and timestamp. The difference between the timestamps of the $i$-th and $(i+1)$-th trajectory points of the same MMSI is calculated to obtain the time interval $\Delta T_i$. Each data point is then represented as $M_i = (Lon_i, Lat_i, \theta_i, V_i, \Delta T_i)$, where $Lon_i$, $Lat_i$, $\theta_i$, $V_i$ and $\Delta T_i$ are the longitude, latitude, course, speed and time interval of the $i$-th data point, respectively. In the data cleaning process, data rows containing NaN are deleted, and a linear interpolation method [1] incorporating course and speed is used to fill in the missing points. A ship trajectory feature database is eventually established for trajectory prediction in the following sections.
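The cleaning steps can be sketched as follows. This is a simplified illustration, not the method of [1]: it interpolates longitude and latitude linearly in time only and does not use course or speed, and the record layout (dictionaries with keys `t`, `lon`, `lat`, timestamps in seconds) is an assumption.

```python
import math


def drop_nan_rows(rows):
    """Remove records that contain NaN in any floating-point field."""
    return [r for r in rows
            if not any(isinstance(v, float) and math.isnan(v)
                       for v in r.values())]


def time_intervals(timestamps):
    """Differences between consecutive timestamps of one MMSI."""
    return [later - earlier
            for earlier, later in zip(timestamps, timestamps[1:])]


def interpolate_position(p0, p1, t):
    """Linearly interpolate longitude and latitude at time t between p0 and p1."""
    span = p1["t"] - p0["t"]
    if span <= 0:
        raise ValueError("records must be strictly increasing in time")
    w = (t - p0["t"]) / span
    return {"t": t,
            "lon": p0["lon"] + w * (p1["lon"] - p0["lon"]),
            "lat": p0["lat"] + w * (p1["lat"] - p0["lat"])}
```

Course is deliberately left out of the interpolation here, since a plain linear blend of angles is incorrect across the 0°/360° wrap-around.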
3 LSTM Network and Its Error Correction Model Based on GA-BP Neural Network

3.1 LSTM Prediction Model

LSTM consists of three gates: input gate $i_t$, forget gate $f_t$, and output gate $o_t$. The formulas are as follows:

$$\begin{cases}
i_t = \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t = \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t = \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t = o_t \odot \tanh(c_t)
\end{cases} \tag{1}$$
where $\sigma$ is the sigmoid activation function and $\odot$ denotes element-wise vector multiplication. $W_i$, $W_f$ and $W_o$ are the weight matrices applied to the input $x_t$ in the input gate, forget gate and output gate, respectively, and $U_i$, $U_f$ and $U_o$ are the weight matrices applied to the hidden state $h_{t-1}$; $b_i$, $b_f$ and $b_o$ are bias vectors. $W_c$, $U_c$ and $b_c$ play the same roles for the candidate state $\tilde{c}_t$. The LSTM prediction model is a supervised learning problem, so a label must be set for each sample. The feature vector of a sample at time $t$ can be represented as $X_t = \{Lon_t, Lat_t, \theta_t, V_t, \Delta T_t\}$, and the output label is represented as $Y_t = \{Lon_t, Lat_t\}$.
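A single time step of Eq. (1) can be written out with plain lists to make the data flow explicit. This is a didactic sketch, not the network used in the experiments: the parameter container `params` and the helper names are hypothetical, and a real model would rely on a deep learning framework.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(w, u, b, x, h):
    """Compute W x + U h + b for list-of-lists matrices W and U."""
    return [sum(wi * xi for wi, xi in zip(w_row, x))
            + sum(ui * hi for ui, hi in zip(u_row, h)) + bi
            for w_row, u_row, bi in zip(w, u, b)]


def lstm_step(x, h_prev, c_prev, params):
    """One LSTM step; params maps 'i', 'f', 'o', 'c' to (W, U, b) tuples."""
    i = [sigmoid(z) for z in affine(*params["i"], x, h_prev)]          # input gate
    f = [sigmoid(z) for z in affine(*params["f"], x, h_prev)]          # forget gate
    o = [sigmoid(z) for z in affine(*params["o"], x, h_prev)]          # output gate
    c_tilde = [math.tanh(z) for z in affine(*params["c"], x, h_prev)]  # candidate
    # Element-wise products implement the cell-state and hidden-state updates.
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]
    return h, c
```

For the five-dimensional input $X_t$, each `W` has five columns, while each `U` is square with the hidden size as its dimension.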
3.2 Error Correction Model Based on LSTM Network Corrected by GA-BP

The BP neural network has certain limitations in practical applications, so a genetic algorithm (GA) is used to optimize its initial weights and thresholds. In this paper, the GA-BP model is used to correct the prediction error of the LSTM model. The framework of this error correction model is shown in Fig. 1, and the calculation steps are as follows:

(1) The AIS dataset is divided into a training set and a test set. The LSTM model is trained and then makes a preliminary prediction on the multidimensional feature sequence $X_t$ of the test set samples to obtain the predicted latitude and longitude $\hat{X}_t = \{\hat{Lon}_t, \hat{Lat}_t\}$. The absolute error of the predicted results is calculated as

$$\Delta Lon_t = \hat{Lon}_t - Lon_t \tag{2}$$

$$\Delta Lat_t = \hat{Lat}_t - Lat_t \tag{3}$$

where $Lon_t$ and $Lat_t$ are the true values of longitude and latitude.

(2) The error feature vector $E_t = \{\Delta Lon_t, \Delta Lat_t, \theta_t, V_t\}$ is constructed. $E_t$ is fed into the GA-BP model for training, and the correction value $\hat{E}_t$ of the latitude and longitude error is calculated. The final predicted latitude and longitude $\hat{Y}_t$ of the ship trajectory can be expressed as

$$\hat{Y}_t = \hat{X}_t + \hat{E}_t \tag{4}$$
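The two calculation steps can be sketched as plain functions. The GA-BP network itself is left out; the sketch only shows how the error features of Eqs. (2) and (3) are built and how Eq. (4) is applied literally. The function names are hypothetical, and the sketch assumes that the corrector output is the term to be added to the LSTM prediction, since the text does not state its sign convention.

```python
def error_features(pred_lon, pred_lat, true_lon, true_lat, course, speed):
    """Build E_t = {dLon_t, dLat_t, theta_t, V_t} from Eqs. (2) and (3)."""
    d_lon = pred_lon - true_lon  # Eq. (2)
    d_lat = pred_lat - true_lat  # Eq. (3)
    return [d_lon, d_lat, course, speed]


def corrected_position(pred_lon, pred_lat, correction):
    """Eq. (4): add the corrector output to the LSTM prediction.

    Assumption: `correction` = (E_hat_lon, E_hat_lat) is the term to be added.
    """
    corr_lon, corr_lat = correction
    return pred_lon + corr_lon, pred_lat + corr_lat
```

In a full pipeline, a trained corrector would map each error feature vector to a `correction` pair before `corrected_position` is called.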
Fig. 1 Error correction model frame diagram
(a) LSTM predicted results
(b) LSTM predicted results corrected by GA-BP
Fig. 2 Longitude and latitude predicted results
4 Experimental Results

To verify the validity of the model, a total of 1054 AIS records from cargo ships underway in the South China Sea are selected as the original data. The LSTM network constructed above is used for the experiments, and its predicted latitude and longitude are shown in Fig. 2a. The error sequence is then input to the GA-BP network for correction, and the predicted latitude and longitude after error correction are shown in Fig. 2b. Three error evaluation indicators are used in this paper, namely mean square error (MSE), root mean square error (RMSE) and mean absolute percentage error (MAPE).
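The three indicators follow their standard definitions, sketched below for one-dimensional sequences. How the paper aggregates longitude and latitude into a single figure is not stated, so the sketch treats one coordinate at a time; MAPE is returned as a fraction and requires non-zero true values.

```python
import math


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction (multiply by 100 for %)."""
    return sum(abs((p - t) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because longitude and latitude values are large compared with typical prediction errors, MAPE for trajectory data tends to be numerically small even when MSE differences between models are clear.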
Table 1 Comparison of error of three prediction models

Model                     MSE/(×10⁻⁷)    RMSE/(×10⁻⁴)    MAPE/(×10⁻⁵)
LSTM                      4.074          6.302           1.250
GA-BP                     3.218          8.818           1.312
Error correction model    0.936          2.954           0.693
Fig. 3 Comparison of LSTM network prediction error before and after correction
Table 1 lists the three error evaluation indicators for the three models, and Fig. 3 compares the LSTM prediction errors before and after the GA-BP correction. Table 1 shows that the MSE, RMSE and MAPE of the GA-BP-corrected LSTM model are significantly lower than those of the two single prediction models. Figure 3 shows that the prediction errors corrected by GA-BP are smaller and more stable. The experiments provide an initial verification of the feasibility and reasonableness of this error correction model.
5 Conclusions

To address the prediction instability of ship trajectory prediction models that rely on a single algorithm, this paper has proposed a multidimensional non-linear ship trajectory prediction method based on LSTM corrected by GA-BP. The experimental results show that the predictions of the error correction model are significantly better than those of the single LSTM network and the single GA-BP network.
References

1. Lei, H.: Research on port vessel trajectory prediction method based on AIS data. Master’s thesis, Lanzhou University (2021)
2. Li, X.Y., Bu, R.X., Qin, K., et al.: Ship Course Prediction Based on Self-adapting PSO-BP Neural Network Model, pp. 21–24. Singapore, Singapore (2019). http://dx.doi.org/10.1109/ICITE.2019.8880242
3. Luo, W.D., Zhang, G.J.: Ship motion trajectory and prediction based on vector analysis. J. Coast. Res. 95, 1183–1188 (2020). https://doi.org/10.2112/SI95-230.1
4. Ma, S.X., Liu, S.S., Meng, X.: Optimized BP Neural Network Algorithm for Predicting Ship Trajectory, pp. 525–532. Chongqing, China (2020). http://dx.doi.org/10.1109/ITNEC48623.2020.9085154
5. Qiao, S.J., Shen, D.Y., Wang, X.T., et al.: A self-adaptive parameter selection trajectory prediction approach via hidden Markov models. IEEE Trans. Intell. Transp. Syst. 16(1), 284–296 (2015). http://dx.doi.org/10.1109/TITS.2014.2331758
6. Ristic, B., La Scala, B., Morelande, M., Gordon, N.: Statistical analysis of motion patterns in AIS data: anomaly detection and motion prediction. Cologne, Germany (2008). http://dx.doi.org/10.1109/ICIF.2008.4632190
7. Wang, Y.K., Xie, X.L., Ma, H., et al.: Ship trajectory prediction based on sliding window LSTM network. J. Shanghai Maritime Univ. 43, 14–22 (2022). https://doi.org/10.13340/j.jsmu.2022.01.003
8. Wu, C.P.: Research on trajectory clustering and prediction based on AIS data. Master’s thesis, Nanjing University of Information Science and Technology (2022)
9. Xu, T.T., Liu, X.M., Yang, X.: BP neural network-based ship track real time prediction. J. Dalian Maritime Univ. 38, 9–11 (2012). https://doi.org/10.16411/j.cnki.issn1006-7736.2012.01.028
10. Zhen, R., Jing, Y.X., Hu, Q.Y., et al.: Vessel behavior prediction based on AIS data and BP neural network. Navig. China 40, 6–10 (2017)
11. Zhen, R., Shao, Z.P., Pan, J.C.: Advance in character mining and prediction of ship behavior based on AIS data. J. Geo-inform. Sci. 23, 2111–2127 (2021)
Design and Simulation of the Hydraulic System for Height Adjustment of Carriages

Zhiming Yan, Baoyan Hu, Lining Yang, Jian Fu, and Yongling Fu
Abstract The railroad train is an important means of transportation for modern travel. To provide convenience for passengers, in particular those who are disabled or carry large luggage, a hydraulic height adjustment system is designed. The system is installed in the secondary suspension to raise or lower carriages to match platforms of different heights. Based on the required changes to the vertical secondary suspension, four operating stages of the height adjustment system are defined. To verify the dynamic performance of the system, a model is developed in AMESim. Simulation results demonstrate its practicability, and the action time is less than 7 s. The effect on damping of adding the height adjustment system to the initial secondary suspension system is also discussed. On the premise that passengers have a comfortable experience on the train, suggestions on the selection of pipe length, pipe diameter and oil temperature are given to ensure an appropriate damping ratio. The research provides a theoretical basis for the subsequent application and optimization of the hydraulic height adjustment system.

Keywords Modern railroad train · Secondary suspension · Hydraulic height adjustment system · Damping coefficient · AMESim
Z. Yan · B. Hu · L. Yang · J. Fu (B) · Y. Fu
School of Mechanical Engineering and Automation, Beihang University, Beijing 100191, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_6

1 Introduction

The railroad train plays a significant role in long-distance transport, in particular because it can carry a large number of passengers and their luggage. Modern railroad trains not only run between big cities but also connect urban and rural areas. The height of platforms varies greatly, whereas the distance between the bottom of the carriage and the ground remains constant, which leads to a height difference between platforms and carriages. This difference inconveniences passengers who are disabled or carry large luggage when they board and leave the train [1]. An effective way to solve this problem is to equip carriages with a height adjustment system that matches the height of various platforms.

There has been some research on height adjustment systems. The systems studied in [2, 3] were pneumatic height-keeping systems with a relatively small load capacity and low accuracy. In contrast, the hydraulic system is more suitable for large loads and has long been applied to the suspension of railroad trains [4–6]. The damping coefficient of the suspension has a significant impact on the vibration of the carriage, and the vibration directly affects passengers’ comfort on the train [7]. The authors of [8, 9] focused on simulating the influence of damping on the vibration of the carriage. An experiment on suspension damping was carefully conducted in [10]. In summary, almost all height adjustment systems were designed to maintain a steady height while the vehicle is running [11]. Such systems cannot resolve the height difference between platforms and carriages.

In this paper, a hydraulic actuation system for adjusting the height of carriages is designed, and its function is verified in AMESim. The system can adjust the height of the carriage quickly and accurately. To ensure a good passenger experience on the train, it is also necessary to study the damping of the new vertical secondary suspension system with the hydraulic actuation system. Results show that the pipe length, pipe diameter and oil temperature are the main factors that affect the damping of the new vertical secondary system.
2 Design of Hydraulic Height Adjustment System

Figure 1 shows the schematic of the suspension system of a train with the designed height adjustment system. There are two bogies, one at the front and one at the rear of the carriage bottom. Each bogie is connected to the wheels by spring-damper systems, which together compose the primary suspension. The secondary suspension system between the carriage and the bogie consists of a lateral suspension and a vertical suspension. The four hydraulic actuators of the height adjustment system are arranged at the four corners of the carriage bottom, parallel to the vertical suspensions.

Fig. 1 Schematic of carriage suspension

The hydraulic height adjustment system is depicted in Fig. 2. A hydraulic cylinder is integrated with the spring of the vertical suspension. The rod of the cylinder is fixed to the bogie, and the bottom of the cylinder is attached to the carriage. When a train pulls into a station, the sensor attached to the bottom of a carriage detects the distance between the bottom of the carriage and the surface of the railway platform, which is the height that the height adjustment system needs to compensate.

Fig. 2 Schematic of hydraulic height adjustment system
1. motor; 2. constant displacement pump; 3. safety valve; 4. reversing valve; 5. proportional relief valve; 6. check valve; 7. electromagnetic unloading valve; 8. safety valve of accumulator; 9. isolating valve; 10. one-way speed-limiting valve; 11. accumulator; 12. shuttle valve; 13. hydraulic actuator; 14. damper

The detailed operation is divided into 4 stages as follows:

1. Draw-down stage. When a train pulls into a station, the reversing valve 4 works in the left position, and the isolating valve 9 is open. The high-pressure oil enters the rod end chamber through the speed-limiting valve 10 and the shuttle valve 12. The accumulator 11 stores part of the oil from the blind end chamber, and the excess oil returns to the tank through the unloading valve 7. The height of the carriage decreases while valve 10 is fully open. The lowering of the carriage can be controlled through the opening of the proportional relief valve 5. The whole system adopts open-loop control.
2. Locking stage. When the carriage reaches the predetermined height, the isolating valve 9 is closed to keep the carriage stationary. Passengers can board and leave the train in this stage.
3. Releasing stage. This stage starts before the train is ready to depart. The reversing valve 4 works in the right position, and the unloading valve 7 is closed. The pump and the accumulator supply oil to the blind end chamber. The oil in the rod end chamber flows into the tank through valves 12, 10, and 9 successively. Valve 10 is not fully open, which forms a buffer to ensure passengers’ comfort. As a result, the carriage is lifted by the spring and the cylinder.
4. Free stage. This stage occurs while the train is running. The motor 1 and the solenoid valves 4, 5, 7 and 9 do not operate. Oil flows freely between the chambers of the cylinder through the pipeline and the shuttle valve 12.
3 Verification of Function

3.1 Mathematical Model

The mathematical model in this section describes the draw-down and releasing stages. In the draw-down stage, the force $F$ required to draw down the carriage is defined as

$$F = 4 h_1 k \tag{1}$$

where $h_1$ is the stroke and $k$ is the stiffness of the secondary suspension. $F$ is provided entirely by the four hydraulic actuators, each of which outputs $F_1$:

$$F_1 = \frac{1}{4}F = p_2 A_2 - p_1 A_1 \tag{2}$$

where $p_1$ and $A_1$ are the pressure and effective area of the blind end chamber, and $p_2$ and $A_2$ are the pressure and effective area of the rod end chamber. The theoretical velocity $v_1$ of the actuator is

$$v_1 = \frac{nq}{4A_2} \tag{3}$$

where $n$ is the speed of the motor and $q$ is the displacement of the pump. All leakages are ignored.

In the releasing stage, the pressures in the blind end chamber and the tank are approximately zero, and the pressure in the rod end chamber $p_2$ is determined by

$$p_2 = \frac{k h_2}{A_2} \tag{4}$$

where $h_2$ is the actual stroke that the height adjustment system should compensate. After passengers have boarded and alighted, the height of the carriage is changed by the combined action of the spring and the hydraulic cylinder. When $h_2$ is small, the speed-limiting valve 10 does not act, and the releasing speed $v_{2s}$ mainly depends on the resistance along the return pipe:

$$v_{2s} = \frac{q_2}{2A_2} = \frac{\pi d^4}{256 \mu l A_2}\, p_2 \tag{5}$$

where $d$ is the diameter of the pipe.
Fig. 3 A submodel including the cylinder, shuttle valve, springs and damper
When $h_2$ is large, the releasing speed is limited because the return flow from the rod end chamber is restricted by the speed-limiting valve 10. The limited releasing speed $v_{2l}$ is

$$v_{2l} = \frac{q_l}{2A_2} \tag{6}$$

where $q_l$ is the rated maximum flow rate. The flow rate declines once the carriage has been raised to a certain height; from then on, the releasing speed follows Eq. (5).
3.2 The Results of Simulation

The model is built in AMESim to verify the feasibility of the designed system and analyze its performance. The first step is to model the new vertical secondary suspension system with a hydraulic cylinder. Figure 3 presents a submodel created in AMESim, which consists of a hydraulic cylinder, a shuttle valve, springs and a damper. The shuttle valve isolates or connects the two chambers in different stages. The springs and damper are the components of the initial secondary suspension system. The displacement of the bogie is introduced in Sect. 4.

Figure 4 shows the model of the hydraulic height adjustment system, which is mainly composed of a controller, hydraulic components and the submodel shown in Fig. 3. The combination of the motor and the pump is the power unit. The hydraulic valves are those presented in Fig. 2. The controller controls the working sequence of the power unit and the valves in the different stages. A no-load running cycle with 4 stages is simulated with the key parameters shown in Table 1. Figure 5 depicts the displacement of the carriage. The pressure in the two chambers is shown in Fig. 6.
Fig. 4 Model of hydraulic height adjustment system in AMESim

Table 1 Parameters in the model

Parameter name                          Value
Spring stiffness (N/m)                  600,000
Damping coefficient (N/(m/s))           17,800
Piston diameter of cylinder (mm)        90
Piston rod diameter of cylinder (mm)    40
Effective stroke of cylinder (mm)       90
Mass of carriage (kg)                   15,689
Displacement of pump (mL/rev)           6
Rated speed of motor (rev/min)          3000
Maximum load (kg)                       10,500
Preset pressure of safety valve (bar)   160
Fig. 5 Displacement of carriage
Before the 2nd second, the height adjustment system works in the free stage, and the height of the carriage and the pressures in the two chambers are constant. The period from the 2nd to the 10th second is the draw-down stage. At the 2nd second, the unloading valve 7 operates to reduce the initial pressure of the system, and the height of the carriage is essentially unchanged, so the pressure in the rod end chamber changes only slightly. The carriage starts to descend at the 3rd second. Figure 5 suggests that the stroke can reach 89 mm in no more than 7 s. The draw-down speed is essentially stable, which helps guarantee passengers’ safety. The pressures in the two chambers increase correspondingly, as shown in Fig. 6: the pressure in the rod end chamber gradually rises to 12.5 MPa, and the pressure in the blind end chamber rises to 2 MPa.

Fig. 6 Pressure of cylinder chambers

From the 10th to the 20th second, the system works in the locking stage. The height of the carriage and the pressures in the chambers remain constant. After the 20th second, the system enters the releasing stage. Figure 5 shows that the carriage rises rapidly and returns to its original height in 3 s. The pressure depicted in Fig. 6 drops sharply within 3 s, after which the system returns to the free stage.

In conclusion, the system takes no more than 7 s to draw down the carriage and no more than 3 s to release it. During this process, the pressure in the rod end chamber never exceeds 12.5 MPa. The full stroke can cover almost all height differences. The designed system also passes through all 4 stages successfully.
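As a rough cross-check of the simulated draw-down time, Eqs. (1) and (3) can be evaluated with the values of Table 1. The sketch below assumes that the rod-side area $A_2$ is the annulus between the 90 mm piston and the 40 mm rod, that the listed spring stiffness is the $k$ of Eq. (1), and that there is no leakage; the function names are hypothetical.

```python
import math


def annulus_area(piston_d, rod_d):
    """Rod-side effective area A2 in m^2 from piston and rod diameters in m."""
    return math.pi / 4.0 * (piston_d ** 2 - rod_d ** 2)


def draw_down_force(stroke, stiffness):
    """Eq. (1): total force F = 4 * h1 * k, in N."""
    return 4.0 * stroke * stiffness


def actuator_velocity(motor_speed_rev_min, pump_disp_ml_rev, rod_area):
    """Eq. (3): v1 = n q / (4 A2), with n and q converted to SI units."""
    flow = motor_speed_rev_min / 60.0 * pump_disp_ml_rev * 1e-6  # m^3/s
    return flow / (4.0 * rod_area)


a2 = annulus_area(0.090, 0.040)
force = draw_down_force(0.089, 600_000.0)
v1 = actuator_velocity(3000.0, 6.0, a2)
draw_down_time = 0.089 / v1  # time for an 89 mm stroke at constant speed
```

Under these assumptions, the ideal draw-down time for an 89 mm stroke comes out at about 6 s, which is in line with the simulated value of no more than 7 s.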
4 Analysis of Damping Effect 4.1 Mathematical Model Since the height adjustment system is added to the initial vertical secondary suspension system, it is necessary to analyze the damping of the new vertical secondary suspension system. In free stage, the oil could flow between the blind end chamber and the rod end chamber through pipes and the shuttle valve. Meanwhile, the damping hinders the flow of oil, indirectly affecting the vibration of the carriage. Given the metal pipe, the damping force caused by the pipe FB1 is FB1 =
150μl A22 v π d4
(7)
where v is the relative velocity between the cylinder’s barrel and piston, l is the total length of the connecting pipe between the blind end and rod end chambers, and μ is the kinematic viscosity of the hydraulic oil.
60
Z. Yan et al.
The oil also flows through the shuttle valve, which can be regarded as a short hole. Equation (8) gives the damping force caused by the shuttle valve:
$$F_{B2} = \frac{\rho A_2^3}{2 C_d A_3}\, v \tag{8}$$
where $C_d$ is the flow coefficient and $A_3$ is the equivalent area of the shuttle valve. The total damping force that a single actuator provides in the free stage is then
$$F_B = \frac{150\,\mu l A_2^2}{\pi d^4}\, v + \frac{\rho A_2^3}{2 C_d A_3}\, v \tag{9}$$
Therefore, the total damping coefficient $\zeta$ of the new vertical secondary suspension system is
$$\zeta = \frac{F_B}{v} = \frac{150\,\mu l A_2^2}{\pi d^4} + \frac{\rho A_2^3}{2 C_d A_3} \tag{10}$$
Equation (10) shows that, once the structure of the shuttle valve is fixed, the damping coefficient depends mainly on the length and diameter of the pipeline. The pipeline length is usually dictated by the cylinder, so adjusting the pipe diameter is the most effective way to control the damping force of the system.
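The dependence of Eq. (10) on the pipe diameter can be explored with a short Python sketch. All numerical values below (viscosity, areas, density, flow coefficient) are hypothetical placeholders chosen only for illustration; they are not the parameters used in the paper.

```python
import math

def pipe_term(mu, l, d, a2):
    """Pipe contribution to the damping coefficient, first term of Eq. (10)."""
    return 150.0 * mu * l * a2**2 / (math.pi * d**4)

def valve_term(rho, a2, cd, a3):
    """Shuttle-valve contribution, second term of Eq. (10)."""
    return rho * a2**3 / (2.0 * cd * a3)

def damping_coefficient(mu, l, d, a2, rho, cd, a3):
    """Total damping coefficient zeta of Eq. (10)."""
    return pipe_term(mu, l, d, a2) + valve_term(rho, a2, cd, a3)

# Hypothetical parameter set (assumed values, not taken from the paper).
params = dict(mu=0.04, l=0.3, a2=1.2e-3, rho=870.0, cd=0.62, a3=2.0e-5)

for d_mm in (6, 8, 10, 12):
    zeta = damping_coefficient(d=d_mm * 1e-3, **params)
    print(f"d = {d_mm:2d} mm -> zeta = {zeta:.3e}")
```

Because the pipe term scales with $d^{-4}$, doubling the diameter reduces that term by a factor of 16, which is consistent with the observation that the diameter is the most effective tuning parameter.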
4.2 The Results of Simulation
In the free stage, a sine wave is used as the excitation to simulate the vibration that the bogie withstands in Fig. 3. Figure 7a shows how the damping coefficient changes with pipe length and diameter at 20 °C. At the same diameter, the longer the pipe, the higher the damping coefficient; at the same length, the damping decreases as the diameter increases. In the range of 6–10 mm, increasing the pipe diameter reduces the damping coefficient markedly, regardless of the pipe length. However, once the diameter exceeds 10 mm, a further increase has only a slight effect on the damping coefficient. In addition to the pipe geometry, temperature also affects the damping. Figure 7b shows the results for different temperatures and diameters, with the pipeline length set to 0.3 m. When the temperature is below 0 °C and the pipe diameter is smaller than 10 mm, the damping coefficient increases sharply as the temperature decreases. In the range of 0–40 °C, the damping coefficient decreases slowly with increasing temperature, except for the 6 mm pipe. Many studies indicate that the best damping coefficient differs from case to case. To obtain a relatively stable damping coefficient, the pipe diameter should be at least 10 mm and the temperature should be higher than 0 °C.
Fig. 7 Comparison of damping coefficients with different factors: (a) pipe length and diameter; (b) pipe diameter and temperature
To sum up, selecting the diameter and length of the pipe helps engineers to set an appropriate damping coefficient. In the structural design, engineers can shorten the pipe and increase its diameter. The influence of the ambient temperature also needs to be considered: a heater could be fitted inside the system so that a reasonable damping coefficient is maintained at low temperatures.
5 Conclusion
1. A scheme for a hydraulic height adjustment system installed in the secondary suspension is proposed. The system can eliminate the possible height difference between the carriage and the platform when a train arrives at a station, which is convenient for passengers who are disabled or carrying large luggage.
2. A model of the system is established in AMESim, and its practicability is verified. The stroke of the actuator can reach 89 mm, and the pressure of the system is always kept below 12.5 MPa. The draw-down stage ends within 7 s, and the releasing stage lasts no longer than 3 s. The hydraulic height adjustment system completes the four stages smoothly and quickly.
3. The damping of the new vertical secondary suspension system is calculated, and its relation to the pipe structure and the ambient temperature is discussed. When the pipe diameter is at least 10 mm and the temperature is higher than 0 °C, a relatively stable damping coefficient can be obtained. The results provide a theoretical basis for the future application of this system on trains.
References
1. Teng, X., Huang, Q., et al.: Diffusion of aqueous solutions of ionic, zwitterionic, and polar solutes. J. Chem. Phys. 148(22), 222827 (2018)
2. Ma, X., et al.: Design and testing of a nonlinear model predictive controller for ride height control of automotive semi-active air suspension systems. IEEE Access 6, 63777–63793 (2018)
3. Zhao, J., et al.: Integrated variable speed-fuzzy PWM control for ride height adjustment of active air suspension systems. In: 2015 American Control Conference (ACC). IEEE (2015)
4. Zhuang, D.: Study on vertical and lateral dynamic performances of vehicle with active hydropneumatic suspension. PhD dissertation, Shanghai Jiao Tong University (2007)
5. Feng, M., Liu, G., Ge, Y.: Modeling and simulation analysis of height adjustment system of hydro pneumatic suspension vehicle. In: 2022 14th International Conference on Measuring Technology and Mechatronics Automation (ICMTMA). IEEE (2022)
6. Zhang, X.: Design and research of giant magnetostrictive actuator in train active suspension system. MA thesis, Lanzhou Jiaotong University (2020)
7. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018)
8. Shen, Y., et al.: Research on test and simulation of hydro-pneumatic suspension. In: 2011 International Conference on Consumer Electronics, Communications and Networks (CECNet). IEEE (2011)
9. Jin, T.H., et al.: Adaptability of variable stiffness and damping shock absorber for semi-active suspension of high speed train. J. Vibr. Eng. 33, 772–783 (2020)
10. Peng, L., Lu, D., Wang, J.: The acquisition of characteristics of hydro-pneumatic suspension cylinder based on parameter correction. In: 2018 10th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), vol. 2. IEEE (2018)
11. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016)
Virtual Model Control Algorithm Simulation for Lower Limb Assist Exoskeleton Xiaorong Zhu, Jing Chen, Zhiyuan Yu, Zheqing Zuo, Zhe Zhao, and Zichong Zhang
Abstract In response to the problem of poor motion control models for lower limb heavy-duty exoskeleton robots, this paper proposes a lower limb assist exoskeleton control method based on the virtual model control (VMC) algorithm. The dynamic model, human motion model, and human-exoskeleton interaction model of the exoskeleton robot are established. The virtual model imposed by the overall control target is decomposed onto each support leg according to certain principles, and the corresponding joint torques are calculated using the Jacobian matrices of both legs. A virtual model of the swing phase of both legs is also constructed for swing-phase VMC control, and the model is switched based on gait rules and touchdown signals. This avoids the derivation of complex global Jacobian matrices, reduces the disturbance introduced by model switching, and improves how well the exoskeleton follows human motion. Simulation shows that the VMC-based lower limb assist exoskeleton control method yields smaller human-exoskeleton interaction forces than sensitivity amplification control, which improves the control performance of lower limb heavy-duty assist exoskeleton robots.
Keywords Lower limb assisted exoskeleton robot · Virtual model control algorithm · Human exoskeleton interaction model · Dynamics simulation
The paper supported by the Defense Industrial Technology Development Program (JCKY2021602B029). X. Zhu · J. Chen · Z. Yu · Z. Zuo · Z. Zhao · Z. Zhang Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China X. Zhu · J. Chen (B) · Z. Yu · Z. Zuo · Z. Zhao Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_7
1 Introduction
Raibert proposed the Virtual Model Control (VMC) method, which is an intuitive control method. Its core idea is to use hypothetical virtual components to connect the robot's internal action points, or to connect the action points with the external environment, so as to generate corresponding virtual forces that drive the robot to achieve the desired motion. The VMC method was first applied to bipedal robots: Pratt et al. successfully implemented dynamic walking control for the bipedal robots Spring Turkey and Spring Flamingo using VMC [1–3]. On this basis, Chew et al. added learning algorithms to VMC to achieve self-tuning of control parameters [4, 5]. VMC has also been extended to quadruped robots to a certain extent. For example, Chen Jiapin et al. from Shanghai Jiao Tong University [6] extended the idea of VMC to quadruped robot models, but the work was limited to the derivation of Jacobian matrices for different support phases, and no further application research was conducted. Ajallooeian et al. [7] combined VMC with a CPG for simulation experiments on quadruped robots, and Havoutis et al. [8] used virtual model ideas to achieve active leg compliance in the quadruped robot HyQ, but neither fully extended VMC to the overall motion control of quadruped robots. Most of the above VMC applications involve the derivation of the global Jacobian matrix for multi-legged support phases. Its purpose is to fully map the workspace velocity to the joint space velocity under multi-legged support, that is, to map all virtual forces to joint torques, in order to control all degrees of freedom of the robot body simultaneously. However, this approach requires multiple single-leg Jacobian matrices to be solved simultaneously, which can be very complex when there are many joint degrees of freedom or support legs [9].
If the control of the multiple degrees of freedom of the robot body is decomposed into the control of single legs, only the Jacobian matrix of a single leg is needed to calculate the joint torques of the corresponding leg. In this way, regardless of how many supporting legs the robot has, simple Jacobian matrices are used for the joint torque calculation, which simplifies the solving process and improves computational efficiency. Gehring et al. used the Jacobian transpose control method (essentially the same as VMC) to control the support legs of the quadruped robot StarlETH, allocating all virtual forces and moments acting on the body to the three-dimensional forces generated by each support leg. These expected forces were converted into joint torques using a single-leg Jacobian matrix [10]. However, this control method ignores the flipping motion of the quadruped robot around the diagonal of the body during diagonal trotting, resulting in ineffective control of the roll angle and lateral balance of the robot body. This article establishes a dynamic model, a human motion model, and a human-exoskeleton interaction model of an exoskeleton robot, decomposes the virtual model imposed by the overall control objective onto the support legs according to certain principles, calculates the corresponding joint torques using a simple single-leg Jacobian matrix, and constructs a virtual model of the single-leg swing phase for swing-phase VMC control.
2 Distributed VMC Control Model
Research on dynamic modeling and control simulation of electromechanical lower limb exoskeleton robots has two main parts: the construction of a human-exoskeleton collaborative motion simulation platform, and the simulation and validation of control algorithms. The construction of the simulation platform covers the following research topics: a human motion data acquisition system, human kinematics modeling and control, human-exoskeleton interface design and modeling, multi-mode human-exoskeleton coupling modeling, dynamics modeling of the human-exoskeleton coupled lower limb exoskeleton robot, and research on human-machine coupling control algorithms. This article proposes a decomposition-based VMC method. The main idea is to decompose the virtual model imposed by the overall control objective onto each support leg according to certain principles, calculate the corresponding joint torques using a simple single-leg Jacobian matrix, and construct a virtual model of the single-leg swing phase for swing-phase VMC control. The model is switched based on gait rules and ground contact signals. The supporting legs cooperate with each other to achieve the control objectives of the robot's body. Figure 1 shows the schematic diagram of VMC. In the vertical direction, there is a virtual parallel spring-damping element between the hip joint and the ground to generate a
Fig. 1 Schematic diagram of VMC
vertical force $F_z$ to maintain height. In the horizontal direction, a virtual damper is connected in series between the hip joint and the velocity source to generate a horizontal force $F_x$ that keeps the horizontal velocity within the error range. At the same time, a spring-damper pair is installed between the hip joint and the body, generating a virtual torque $M_\alpha$ to maintain the body posture. Although the robot shown in the figure has flat feet, the ankle joint torque is actually set to 0 during the control process, which is consistent with the designed five-link robot. The foot end exists only for the convenience of the mathematical derivation. As mentioned earlier, in the control process of the supporting leg, the input variables are the expected horizontal velocity $v_{xd}$, the expected hip joint height $z_d$, and the expected body deflection angle $\alpha_d$. By specifying that the virtual components are linear, the control law can be obtained:
$$F_x = b_x (v_{xd} - v_x) \tag{1}$$
$$F_z = k_z (z_d - z) + b_z (\dot{z}_d - \dot{z}) \tag{2}$$
$$M_\alpha = k_\alpha (\alpha_d - \alpha) + b_\alpha (\dot{\alpha}_d - \dot{\alpha}) \tag{3}$$
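As an illustration, the linear virtual-component laws of Eqs. (1)–(3) can be written as a small Python function. This is a minimal sketch: the class name, function name, and gain values are assumptions introduced here and do not come from the paper.

```python
from dataclasses import dataclass

@dataclass
class VirtualGains:
    """Gains of the virtual components (illustrative container, not from the paper)."""
    b_x: float      # horizontal virtual damping
    k_z: float      # vertical virtual spring stiffness
    b_z: float      # vertical virtual damping
    k_alpha: float  # posture virtual spring stiffness
    b_alpha: float  # posture virtual damping

def virtual_forces(g, v_xd, v_x, z_d, z, dz_d, dz,
                   alpha_d, alpha, dalpha_d, dalpha):
    """Return (F_x, F_z, M_alpha) according to Eqs. (1)-(3)."""
    f_x = g.b_x * (v_xd - v_x)
    f_z = g.k_z * (z_d - z) + g.b_z * (dz_d - dz)
    m_alpha = g.k_alpha * (alpha_d - alpha) + g.b_alpha * (dalpha_d - dalpha)
    return f_x, f_z, m_alpha

# Hypothetical gains and states, for illustration only.
gains = VirtualGains(b_x=10.0, k_z=100.0, b_z=5.0, k_alpha=50.0, b_alpha=2.0)
print(virtual_forces(gains, v_xd=1.0, v_x=0.5, z_d=0.9, z=0.88,
                     dz_d=0.0, dz=0.1, alpha_d=0.0, alpha=0.1,
                     dalpha_d=0.0, dalpha=0.0))
```

The three outputs are the virtual forces that are subsequently mapped to joint torques through the leg Jacobians.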
In these equations, $b_i$ and $k_i$ represent the damping coefficient and spring stiffness, respectively. The schematic diagram of the double-leg support phase is shown in Fig. 2. The signs and positive directions of the joint angles are the same as for single-leg support, with the subscript $l$ denoting the left leg and $r$ the right leg. The mapping between the virtual force at the body and the torque of each joint when both legs are in support can be derived from the kinematic relationship, that is, the virtual force is transformed into the joint space through the Jacobian matrix, as shown in Eq. (4).
$$\begin{bmatrix} \tau_l \\ \tau_r \end{bmatrix} = \begin{bmatrix} \left({}^{A_l}_{B}J\right)^{T} & 0 \\ 0 & \left({}^{A_r}_{B}J\right)^{T} \end{bmatrix} \cdot \begin{bmatrix} F_l \\ F_r \end{bmatrix} \tag{4}$$
Expanding Eq. (4) gives
$$\begin{bmatrix} \tau_{la} \\ \tau_{lk} \\ \tau_{lh} \\ \tau_{ra} \\ \tau_{rk} \\ \tau_{rh} \end{bmatrix} = \begin{bmatrix} A & B & -1 & 0 & 0 & 0 \\ Q & R & -1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & C & D & -1 \\ 0 & 0 & 0 & S & T & -1 \\ 0 & 0 & 0 & 0 & 0 & -1 \end{bmatrix} \cdot \begin{bmatrix} f_{la} \\ f_{lk} \\ f_{lh} \\ f_{ra} \\ f_{rk} \\ f_{rh} \end{bmatrix} \tag{5}$$
where
$A = -L_1\cos(\theta_{la}) - L_2\cos(\theta_{la}+\theta_{lk})$, $B = -L_1\sin(\theta_{la}) - L_2\sin(\theta_{la}+\theta_{lk})$,
$C = -L_1\cos(\theta_{ra}) - L_2\cos(\theta_{ra}+\theta_{rk})$, $D = -L_1\sin(\theta_{ra}) - L_2\sin(\theta_{ra}+\theta_{rk})$,
$Q = -L_2\cos(\theta_{la}+\theta_{lk})$, $R = -L_2\sin(\theta_{la}+\theta_{lk})$,
$S = -L_2\cos(\theta_{ra}+\theta_{rk})$, $T = -L_2\sin(\theta_{ra}+\theta_{rk})$.
Fig. 2 Schematic diagram of virtual model legs
When both legs are in contact with the ground, the topology of the robot becomes a parallel mechanism, and the virtual force at coordinate system B is coupled from the forces generated by the two support chains. The coupling relationship between the virtual forces can be chosen according to actual needs. This article assumes that the forces generated by the left and right legs account for the same proportion of the total force, and the relationship is given in Eq. (6).
$$\begin{bmatrix} f_x \\ f_z \\ f_\theta \end{bmatrix} = \begin{bmatrix} f_{xl} \\ f_{zl} \\ f_{\theta l} \end{bmatrix} + \begin{bmatrix} f_{xr} \\ f_{zr} \\ f_{\theta r} \end{bmatrix} \tag{6}$$
Because there are six joints but only three virtual forces are required, three additional constraints need to be added. In addition to the constraints at the two ankles, for the convenience of calculation, it is assumed that the moments at the left and right hip joints are equal, namely:
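$$\tau_{la} = \tau_{ra} = 0, \quad \tau_{lh} = \tau_{rh} \tag{7}$$
Jointly, the following six equations can be obtained:
$$\begin{bmatrix} f_x \\ f_z \\ f_\theta \\ 0 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 & 0 & 1 \\ A & B & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & C & D & -1 \\ 0 & 0 & 1 & 0 & 0 & -1 \end{bmatrix} \cdot \begin{bmatrix} f_{xl} \\ f_{zl} \\ f_{\theta l} \\ f_{xr} \\ f_{zr} \\ f_{\theta r} \end{bmatrix} \tag{8}$$
Here the fourth and fifth rows express the ankle constraints $\tau_{la} = 0$ and $\tau_{ra} = 0$ with the signs of Eq. (5), and the last row expresses $\tau_{lh} = \tau_{rh}$. Combining Eq. (5) with Eq. (8) yields Eq. (9), which is the torque expression for the hip and knee joints of the left and right legs during the bipedal support phase.
$$\begin{bmatrix} \tau_{lk} \\ \tau_{lh} \\ \tau_{rk} \\ \tau_{rh} \end{bmatrix} = \begin{bmatrix} \dfrac{CV}{E} & \dfrac{DV}{E} & \dfrac{-V - QD + RC}{2E} - \dfrac{1}{2} \\ 0 & 0 & -\dfrac{1}{2} \\ -\dfrac{AW}{E} & -\dfrac{BW}{E} & \dfrac{W + SB - TA}{2E} - \dfrac{1}{2} \\ 0 & 0 & -\dfrac{1}{2} \end{bmatrix} \cdot \begin{bmatrix} f_x \\ f_z \\ f_\theta \end{bmatrix} \tag{9}$$
where $E = CB - AD$, $V = QB - RA = -L_1 L_2 \sin(\theta_{lk})$, and $W = SD - TC = -L_1 L_2 \sin(\theta_{rk})$. It is worth noting that the matrix in Eq. (9) has full rank for all joint angles, except when the feet are collinear.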
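The closed-form mapping of Eq. (9) can be cross-checked numerically by solving the six constraint equations of Eq. (8) for the leg forces and then applying the knee and hip rows of Eq. (5). The sketch below does this with NumPy. The link lengths, joint angles, and virtual forces are hypothetical values, and the ankle rows of Eq. (8) are taken with the $-1$ entries implied by the signs of Eq. (5), which is an assumption of this sketch.

```python
import numpy as np

def leg_terms(L1, L2, th_a, th_k):
    """Jacobian entries (A, B, Q, R) for the left leg or (C, D, S, T) for the right leg."""
    s = th_a + th_k
    return (-L1 * np.cos(th_a) - L2 * np.cos(s),
            -L1 * np.sin(th_a) - L2 * np.sin(s),
            -L2 * np.cos(s),
            -L2 * np.sin(s))

def torques_by_solving(f, left, right):
    """Solve Eq. (8) for the leg forces, then apply the knee/hip rows of Eq. (5)."""
    A, B, Q, R = left
    C, D, S, T = right
    M = np.array([[1, 0, 0, 1, 0, 0],
                  [0, 1, 0, 0, 1, 0],
                  [0, 0, 1, 0, 0, 1],
                  [A, B, -1, 0, 0, 0],   # left ankle torque = 0
                  [0, 0, 0, C, D, -1],   # right ankle torque = 0
                  [0, 0, 1, 0, 0, -1]],  # equal hip torques
                 dtype=float)
    rhs = np.concatenate([f, np.zeros(3)])
    fxl, fzl, ftl, fxr, fzr, ftr = np.linalg.solve(M, rhs)
    return np.array([Q * fxl + R * fzl - ftl, -ftl,
                     S * fxr + T * fzr - ftr, -ftr])

def torques_closed_form(f, left, right):
    """Evaluate Eq. (9) directly; returns [tau_lk, tau_lh, tau_rk, tau_rh]."""
    A, B, Q, R = left
    C, D, S, T = right
    E = C * B - A * D
    V = Q * B - R * A
    W = S * D - T * C
    G = np.array([[C * V / E, D * V / E, (-V - Q * D + R * C) / (2 * E) - 0.5],
                  [0.0, 0.0, -0.5],
                  [-A * W / E, -B * W / E, (W + S * B - T * A) / (2 * E) - 0.5],
                  [0.0, 0.0, -0.5]])
    return G @ f

# Hypothetical geometry and virtual forces, for illustration only.
left = leg_terms(0.4, 0.4, 1.2, 0.5)
right = leg_terms(0.4, 0.4, 1.6, 0.4)
f = np.array([10.0, 300.0, 5.0])
print(torques_by_solving(f, left, right))
print(torques_closed_form(f, left, right))
```

Both functions return the same torques whenever $E \neq 0$, i.e. away from the collinear-feet configuration mentioned above.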
3 Two Legged VMC Model and Simulation
In order to use VMC control methods safely and efficiently, this article creates a multi-body simulation environment for executing training processes in virtual scenes. As shown in Fig. 3, the multi-body simulation environment includes a human body model, an exoskeleton model, a structured terrain, and an interaction model (Fig. 4). This paper proposes using a reference motion to drive the human body model, which in turn guides the movement of the exoskeleton model through the human-exoskeleton interaction forces generated by the human-exoskeleton interaction model. Unlike previous methods that rely only on trajectories to solve for joint torques, this method captures the dynamic human-exoskeleton interaction process during walking and reveals the essence of collaborative motion in human-exoskeleton systems. The reference motion at different walking speeds is obtained by stretching the cycle length of the original reference motion and repeating it. The original reference motion consists of hip, knee, and ankle joint trajectories collected by a motion capture system (MTw Awinda, Xsens) at a frequency of 240 Hz during one complete gait cycle of a human walking on a treadmill at a speed of 2.8 km/h, with 334 time steps (approximately 1.39 s). It is worth noting that when the motion speed is changed in this way, the resulting reference motion is still physically feasible, and the feet remain fixed on the ground without sliding during the support phase (Fig. 5).
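One possible way to stretch and repeat a recorded gait cycle is sketched below in Python. The paper does not specify its resampling rule; the sketch assumes that the cycle duration scales with the ratio of recorded speed to target speed and uses linear interpolation over the gait phase. The function names and the synthetic placeholder trajectory are illustrative only.

```python
import numpy as np

FS = 240.0   # sampling frequency stated in the text (Hz)
N_REF = 334  # samples in the recorded gait cycle (about 1.39 s)
V_REF = 2.8  # treadmill speed of the recording (km/h)

def stretch_cycle(ref_cycle, v_target, v_ref=V_REF):
    """Resample one gait cycle so that its duration scales by v_ref / v_target.

    The inverse-proportional scaling is an assumption of this sketch.
    ref_cycle has shape (samples, joints).
    """
    n_ref = ref_cycle.shape[0]
    n_new = int(round(n_ref * v_ref / v_target))
    phase_ref = np.linspace(0.0, 1.0, n_ref)
    phase_new = np.linspace(0.0, 1.0, n_new)
    return np.column_stack([np.interp(phase_new, phase_ref, ref_cycle[:, j])
                            for j in range(ref_cycle.shape[1])])

def repeat_cycles(cycle, n_cycles):
    """Concatenate the same cycle n_cycles times."""
    return np.tile(cycle, (n_cycles, 1))

# Synthetic placeholder for hip, knee, and ankle angles (not measured data).
phase = np.linspace(0.0, 2.0 * np.pi, N_REF)
ref = np.column_stack([np.sin(phase), np.sin(2.0 * phase), np.cos(phase)])

slow = stretch_cycle(ref, v_target=1.4)  # half speed -> twice as many samples
motion = repeat_cycles(slow, 3)
print(slow.shape, motion.shape, slow.shape[0] / FS)
```

Interpolating over the normalized phase keeps the first and last samples of the cycle unchanged, so consecutive repeated cycles join at the same joint angles as in the recording.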
Fig. 3 Multibody simulation environment and model
Fig. 4 System model of lower limb exoskeleton robot
A Matlab/SimMechanics simulation control system was built, and the VMC controller module is shown in Fig. 6. The effectiveness and performance of the simulation model directly determine the accuracy of human-exoskeleton collaborative control. By simulating how the exoskeleton follows two variables, the height of the human center and the travel speed, exoskeleton follow-up control can be achieved, which can guide the design and implementation of the control system of the actual prototype. At a speed of 2.8 km/h, the human-exoskeleton interaction forces in the horizontal, vertical, and pitch directions at the human back are shown in Fig. 7, and the interaction forces on the back are compared in Table 1. Table 1 shows that the VMC algorithm has an average force of 2.33 N in the pitch direction, while the sensitivity method (ARC algorithm) has an average force of 4.22 N in the pitch direction, a decrease of approximately 44.79%. The VMC algorithm has an average force of 167.09 N in the horizontal direction,
Fig. 5 Human and exoskeleton model of lower limb exoskeleton robot
Fig. 6 VMC controller module
while the sensitivity method (ARC algorithm) has an average force of 229.04 N in the horizontal direction, a decrease of approximately 27.05%. The VMC algorithm has an average force of 119.59 N in the vertical direction, while the sensitivity method (ARC algorithm) has an average force of 272.03 N in the vertical direction, a decrease of approximately 56.04%. It can be concluded that the VMC algorithm produces smaller human-exoskeleton interaction forces in all directions at the lower back and therefore performs better.
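The reduction ratios quoted above follow from the average forces as (ARC − VMC) / ARC. The short Python check below reproduces them from the reported averages; the dictionary and function names are illustrative.

```python
# Average interaction forces reported in the text: (VMC, ARC) in newtons.
results = {
    "pitch": (2.33, 4.22),
    "horizontal": (167.09, 229.04),
    "vertical": (119.59, 272.03),
}

def reduction_percent(vmc, arc):
    """Relative decrease of the VMC result with respect to the ARC result, in percent."""
    return 100.0 * (arc - vmc) / arc

for direction, (vmc, arc) in results.items():
    print(f"{direction}: {reduction_percent(vmc, arc):.2f} %")
```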
Fig. 7 Back human exoskeleton interaction force value

Table 1 Comparison of results of human exoskeleton interaction on the back
Parameters                           VMC results (N)   ARC results (N)   Reduce ratio (%)
Pitch direction average force        2.33              4.22              44.79
Horizontal direction average force   167.09            229.04            27.05
Vertical direction average force     119.59            272.03            56.04
4 Conclusions
In response to the problem of poor motion control models for lower limb heavy-duty exoskeleton robots, this paper proposes a lower limb assist exoskeleton control method based on the virtual model control (VMC) algorithm and builds a Matlab/SimMechanics simulation control system that includes the exoskeleton robot dynamics model, the human motion model, and the human-exoskeleton interaction model. A two-leg simulation interaction model is constructed for VMC control and compared with the sensitivity method (ARC algorithm). The interaction force decreases by about 44.79% in the pitch direction, by about 27.05% in the horizontal direction, and by about 56.04% in the vertical direction, which indicates that the VMC algorithm can improve the control performance of lower limb heavy-duty assist exoskeleton robots.
References
1. Pratt, J.E.: Virtual model control of a biped walking robot. Massachusetts Institute of Technology (1995)
2. Pratt, J., Pratt, G.: Intuitive control of a planar bipedal walking robot. In: Proceedings of the IEEE International Conference on Robotics and Automation, pp. 2014–2021 (1998)
3. Pratt, J., Dilworth, P., Pratt, G.: Virtual model control of a bipedal walking robot. In: Proceedings of the 1997 IEEE International Conference on Robotics and Automation, Albuquerque, New Mexico, pp. 193–198 (1997)
4. Chew, C.-M., Pratt, J., Pratt, G.: Blind walking of a planar bipedal robot on sloped terrain. In: Proceedings of IEEE International Conference on Robotics and Automation, Detroit, Michigan, pp. 381–386 (1999)
5. Chew, C.-M., Pratt, G.A.: Dynamic bipedal walking assisted by learning. Robotica 20(05), 477–491 (2002)
6. Chen, J., Cheng, J., Yu, G.: A virtual model of quadruped robot for diagonal jotting and straight walking. J. Shanghai Jiao Tong Univ. 35(12), 1771–1775 (2001)
7. Ajallooeian, M., Pouya, S., Sproewitz, A., et al.: Central pattern generators augmented with virtual model control for quadruped rough terrain locomotion. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2013, pp. 3321–3328 (2013)
8. Havoutis, I., Semini, C., Buchli, J., et al.: Quadrupedal trotting with active compliance. In: IEEE International Conference on Mechatronics (ICM), vol. 2013, pp. 610–616 (2013)
9. Pratt, J.E.: Virtual model control of a biped walking robot. Massachusetts Institute of Technology (1995)
10. Gehring, C., Coros, S., Hutter, M., et al.: Control of dynamic gaits for a quadrupedal robot. In: IEEE International Conference on Robotics and Automation (ICRA), vol. 2013, pp. 3287–3292 (2013)
Neural Network Based Singularity-Free Adaptive Prescribed Performance Control of Two-Mass Systems Dongdong Zheng, Zeyuan Sun, and Weixing Li
Abstract This paper focuses on the trajectory tracking control problem of two-mass systems, addressing the challenges posed by unknown system dynamics and unknown control gain. To handle these challenges, we first reformulate the system model into a singularity-free form and employ neural networks to approximate the unknown nonlinear functions. To ensure that the tracking errors are bounded by predefined performance boundaries and avoid the potential singularity problem inherent in other indirect adaptive control methods, we develop a singularity-free prescribed performance controller. Additionally, to simplify the controller design procedure, we adopt a high-order command filter and abandon the commonly used backstepping control approach. We employ the Lyapunov approach to analyze the stability of the identification and control algorithms, while simulation results demonstrate the efficacy of the proposed algorithms.
D. Zheng (B) · W. Li Beijing Institute of Technology, Beijing 100081, China e-mail: [email protected] Z. Sun China North Artificial Intelligence and Innovation Research Institute, Beijing 100072, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_8
1 Introduction
In industrial applications, a variety of equipment comprises motors and load machines connected by low-stiffness shafts or flexible couplings. Examples of such systems include flexible joint robotic manipulators [1], wind turbines [2], cable car systems [3], and elastic drives [4]. These systems can be modelled as two-mass systems due to their inherent elasticity, and various control methods have been proposed in the literature to manage them [5–8]. In a real system, the inertia of the load $J_L$ is usually unknown or may change during the control process. Since the inertia is uncertain, the control gain $b(x)$, which is often proportional to the inverse of the inertia, is also unknown. When the inverse of the estimate, $1/\hat{b}(x)$, is used to build the controller, the singularity problem may occur. Several methods have been proposed in the literature to resolve the singularity problem. The projection algorithm was utilized by some authors to ensure that $|\hat{b}(x)|$, estimated by a neural network with a special structure, was greater than a predetermined lower bound $\underline{b}$ [9, 10]. However, this approach led to impaired tracking performance, as the neural network was not accurate in approximating $b(x)$. Instead of the inverse of $\hat{b}(x)$, the term $\hat{b}(x)/(\hat{b}^2(x) + \delta)$ was utilized by the authors in [11, 12] to avoid the singularity problem, where $\delta$ is a small positive parameter. However, the unknown nonlinearity of the system could not be entirely compensated, and consequently the overall response of the closed-loop system deteriorated. Rovithakis [13] reset the sign or value of the neural network weights when $\hat{b}(x) < \underline{b}$, but this prevented the neural network from accurately estimating the unknown control gain and caused oscillations at the resetting point. Singularity-free adaptive control methods were proposed by Zheng et al. in [14]. However, it remains to be investigated how this approach can be applied to a practical high-order two-mass system.
Meeting specific requirements for steady-state and transient performance is essential for accomplishing industrial tasks, and tight constraints are often placed on tracking errors. To address such constraints, researchers have extensively investigated prescribed performance control (PPC), a useful tool discussed in the literature [15]. Li et al. studied the PPC problem of a flexible joint robotic manipulator and designed a controller using the funnel control technique via the backstepping approach. Liu et al. adopted a backstepping-based prescribed performance controller for a free-flying flexible-joint space robot. Rovithakis et al. investigated a backstepping-like prescribed performance controller for a variable stiffness flexible joint robot. The backstepping-based prescribed performance control of a flexible joint robot was studied by Ma et al. in [16], where an event-trigger mechanism was developed to reduce the communication load. Wang et al. introduced a novel approach by incorporating both the prescribed performance function and a positive integrable time-varying function in the adaptive backstepping control of flexible joint robotic manipulators (FJRM). However, all of these control algorithms use either backstepping or dynamic surface control techniques, which can lead to a tedious controller design process that involves designing multiple filters and virtual controllers step by step.
This paper proposes a novel NN-based adaptive PPC method for a two-mass system with unknown parameters, aiming to address the issues mentioned above. The singularity-free technique is adopted in the control design to prevent the singularity problem, and a high-order command filter is utilized to simplify the controller design process. The rest of the paper is organized as follows: Sect. 2 formulates the control problem, Sect. 3 outlines the NN identification algorithm, Sect. 4 presents the development of the singularity-free adaptive PPC, Sect. 5 presents simulation results, and Sect. 6 draws the conclusions.
2 Problem Formulation
A typical two-mass system can be modelled as
$$J_L \ddot{\theta}_L + f_1(\theta_L, \dot{\theta}_L) + \Delta_1(\theta_L, \dot{\theta}_L) = k(\theta_m - \theta_L) \tag{1a}$$
$$J_m \ddot{\theta}_m + f_2(\theta_m, \dot{\theta}_m) + \Delta_2(\theta_m, \dot{\theta}_m) = u - k(\theta_m - \theta_L) \tag{1b}$$
where $J_L$, $J_m$ are the moments of inertia of the load and the motor, respectively, $k$ is the stiffness coefficient, $\theta_L$, $\theta_m$ are the angular positions of the load and the motor, respectively, $u$ is the input torque, $f_1(\theta_L, \dot{\theta}_L)$, $f_2(\theta_m, \dot{\theta}_m)$ denote the friction nonlinearities of the load side and the motor side, respectively, and $\Delta_1(\theta_L, \dot{\theta}_L)$, $\Delta_2(\theta_m, \dot{\theta}_m)$ are the unmodeled dynamics of the load side and the motor side, respectively. Let $x_1 := \theta_L$, $x_2 := \dot{\theta}_L$, $y_1 := \theta_m - \theta_L$, $y_2 := \dot{\theta}_m - \dot{\theta}_L$; then the original system (1) can be reformulated into a singularity-free form as
$$\dot{x}_1 = x_2 \tag{2a}$$
$$\bar{J}_L \dot{x}_2 = -\bar{f} - \bar{\Delta}_1 + y_1 \tag{2b}$$
$$\dot{y}_1 = y_2 \tag{2c}$$
$$J_m \dot{y}_2 = \bar{J}_m f_1 - f_2 + \bar{J}_m \Delta_1 - \Delta_2 - (k + \bar{J}_m) y_1 + u \tag{2d}$$
where $\bar{J}_L = J_L/k$, $\bar{\Delta}_1 = \Delta_1/k$, $\bar{f} = f_1/k$, $\bar{J}_m = J_m/J_L$. Because $\bar{f}$, $\bar{\Delta}_1$, $f_2$, $\Delta_2$ are unknown, neural networks can be used to approximate the lumped unknown dynamics as
$$-\bar{f} - \bar{\Delta}_1 = w_1^{\top} \phi_1(x_1, x_2) + \zeta_1 \tag{3a}$$
$$\bar{J}_m f_1 - f_2 + \bar{J}_m \Delta_1 - \Delta_2 - (k + \bar{J}_m) y_1 = w_2^{\top} \phi_2(x_1, x_2, y_1, y_2) + \zeta_2 \tag{3b}$$
where $w_1$, $w_2$ are the NN weights, $\phi_1$, $\phi_2$ are the NN regressor vectors, and $\zeta_1$, $\zeta_2$ are the NN approximation errors. Using (3), one can rewrite (2) as
$$\dot{x}_1 = x_2 \tag{4a}$$
$$\bar{J}_L \dot{x}_2 = w_1^{\top} \phi_1 + y_1 + \zeta_1 \tag{4b}$$
$$\dot{y}_1 = y_2 \tag{4c}$$
$$J_m \dot{y}_2 = w_2^{\top} \phi_2 + u + \zeta_2 \tag{4d}$$
Remark 1 The system dynamics have been transformed into a singularity-free form expressed as (2), which ensures that the control gain is consistently equal to 1. Consequently, the use of a known control gain during the design of an adaptive controller is enabled, which circumvents the potential singularity problem.
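3 Online Identification of NN Weights
Because $\bar{J}_L$, $J_m$, $w_1$, $w_2$ are unknown, an online identification method is developed first. Applying a first-order filter $G(s) = \frac{1}{ks+1}$ to (4b) and (4d), one has
$$\bar{J}_L \dot{x}_{2f} = w_1^{\top} \phi_{1f} + y_{1f} + \zeta_{1f} \tag{5a}$$
$$J_m \dot{y}_{2f} = w_2^{\top} \phi_{2f} + u_f + \zeta_{2f} \tag{5b}$$
where $y_{1f}$, $x_{2f}$, $y_{2f}$, $\phi_{1f}$, $\phi_{2f}$, $u_f$, $\zeta_{1f}$, $\zeta_{2f}$ can be obtained as follows:
$$\dot{y}_{1f} = \frac{y_1 - y_{1f}}{k},\ y_{1f}(0) = 0, \qquad \dot{x}_{2f} = \frac{x_2 - x_{2f}}{k},\ x_{2f}(0) = 0$$
$$\dot{y}_{2f} = \frac{y_2 - y_{2f}}{k},\ y_{2f}(0) = 0, \qquad \dot{u}_f = \frac{u - u_f}{k},\ u_f(0) = 0$$
$$\dot{\phi}_{1f} = \frac{\phi_1 - \phi_{1f}}{k},\ \phi_{1f}(0) = 0, \qquad \dot{\phi}_{2f} = \frac{\phi_2 - \phi_{2f}}{k},\ \phi_{2f}(0) = 0$$
$$\dot{\zeta}_{1f} = \frac{\zeta_1 - \zeta_{1f}}{k},\ \zeta_{1f}(0) = 0, \qquad \dot{\zeta}_{2f} = \frac{\zeta_2 - \zeta_{2f}}{k},\ \zeta_{2f}(0) = 0$$
Rearranging (5), one has
$$W_1^{\top} \Phi_1 = y_{1f} + \zeta_{1f}, \qquad W_2^{\top} \Phi_2 = u_f + \zeta_{2f} \tag{6}$$
where $W_1 = [\bar{J}_L, w_1^{\top}]^{\top}$, $W_2 = [J_m, w_2^{\top}]^{\top}$, $\Phi_1 = [\dot{x}_{2f}, \phi_{1f}^{\top}]^{\top}$, $\Phi_2 = [\dot{y}_{2f}, \phi_{2f}^{\top}]^{\top}$. Define auxiliary matrices $P_x$, $P_y$ and auxiliary vectors $Q_x$, $Q_y$ as
$$\dot{P}_x = -l_x P_x + l_x \Phi_1 \Phi_1^{\top}, \quad P_x(0) = 0_{(n+1)\times(n+1)} \tag{7a}$$
$$\dot{Q}_x = -l_x Q_x + l_x \Phi_1 y_{1f}, \quad Q_x(0) = 0_{(n+1)\times 1} \tag{7b}$$
$$\dot{P}_y = -l_y P_y + l_y \Phi_2 \Phi_2^{\top}, \quad P_y(0) = 0_{(n+1)\times(n+1)} \tag{7c}$$
$$\dot{Q}_y = -l_y Q_y + l_y \Phi_2 u_f, \quad Q_y(0) = 0_{(n+1)\times 1} \tag{7d}$$
Integrating (7), one has
$$P_x(t) = \int_0^t e^{-l_x(t-r)} \Phi_1(r) \Phi_1^{\top}(r)\, dr, \qquad Q_x(t) = \int_0^t e^{-l_x(t-r)} \Phi_1(r)\, y_{1f}\, dr \tag{8a}$$
$$P_y(t) = \int_0^t e^{-l_y(t-r)} \Phi_2(r) \Phi_2^{\top}(r)\, dr, \qquad Q_y(t) = \int_0^t e^{-l_y(t-r)} \Phi_2(r)\, u_f\, dr \tag{8b}$$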
Define another two auxiliary vectors as

$$H_x = Q_x - P_x\hat{W}_1, \qquad H_y = Q_y - P_y\hat{W}_2 \tag{9}$$

where $\hat{W}_1$, $\hat{W}_2$ are the estimates of $W_1$ and $W_2$, respectively. Using (8), it is easy to verify that

$$H_x = P_x W_1 - \mathring{\zeta}_1 - P_x\hat{W}_1 = P_x\tilde{W}_1 - \mathring{\zeta}_1 \tag{10a}$$

$$H_y = P_y W_2 - \mathring{\zeta}_2 - P_y\hat{W}_2 = P_y\tilde{W}_2 - \mathring{\zeta}_2 \tag{10b}$$

where $\tilde{W}_1 = W_1 - \hat{W}_1$, $\tilde{W}_2 = W_2 - \hat{W}_2$, $\mathring{\zeta}_1 = \int_0^t e^{-l_x(t-r)}\Phi_1(r)\zeta_{1f}\,dr$, and $\mathring{\zeta}_2 = \int_0^t e^{-l_y(t-r)}\Phi_2(r)\zeta_{2f}\,dr$. Then the following learning laws can be designed to update $\hat{W}_1$ and $\hat{W}_2$ online:

$$\dot{\hat{W}}_1 = \Gamma_1 H_x, \qquad \dot{\hat{W}}_2 = \Gamma_2 H_y \tag{11}$$

where $\Gamma_1 \in \mathbb{R}^+$, $\Gamma_2 \in \mathbb{R}^+$ are learning gains chosen by the user.

Lemma 1 Consider the system (5) with the weight updating laws given in (11). Then the weight estimation errors $\tilde{W}_1$, $\tilde{W}_2$ are guaranteed to be uniformly ultimately bounded (UUB) [17].
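The identification scheme in (7), (9) and (11) can be illustrated with a minimal numerical sketch. The following Python code is not the authors' implementation: it assumes a noise-free regression $z = W^{\top}\Phi$ (that is, $\zeta = 0$), a hypothetical two-dimensional regressor `Phi` chosen only to be persistently exciting, and simple forward-Euler integration. The function name `identify` and all default values other than `l = 1` and `gamma = 200` (taken from the simulation section) are assumptions made for illustration.

```python
import numpy as np

def identify(W_true, l=1.0, gamma=200.0, dt=1e-3, t_end=20.0):
    """Euler-integrate the auxiliary filters (7) and the learning law (11)."""
    n = W_true.size
    P = np.zeros((n, n))   # P(0) = 0, as in (7)
    Q = np.zeros(n)        # Q(0) = 0, as in (7)
    W_hat = np.zeros(n)    # initial weight estimate
    for step in range(int(t_end / dt)):
        t = step * dt
        # Hypothetical persistently exciting regressor (assumption, not from the paper)
        Phi = np.array([np.sin(t), np.cos(2.0 * t)])
        z = W_true @ Phi                   # measured signal with zeta = 0
        H = Q - P @ W_hat                  # auxiliary vector, as in (9)
        P += dt * (-l * P + l * np.outer(Phi, Phi))
        Q += dt * (-l * Q + l * Phi * z)
        W_hat += dt * gamma * H            # learning law, as in (11)
    return W_hat
```

With $\zeta = 0$ the auxiliary vector reduces to $H = P\tilde{W}$, so the estimate is driven by the estimation error itself rather than by a tracking error, which is the property used in Lemma 1.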
4 Singularity-Free Prescribed Performance Control

The control objective is to design a controller such that $x_1$ tracks a given reference signal $x_r \in \mathbb{R}^+$, $x_r \in L_2$. Define $e_1 = x_1 - x_r$, $e_2 = x_2 - \dot{x}_r$; then the error dynamics of the load side can be obtained as

$$\dot{e}_1 = e_2 \tag{12a}$$

$$\bar{J}_L\dot{e}_2 = w_1^{\top}\varphi_1 + y_1 + \zeta_1 - \bar{J}_L\ddot{x}_r \tag{12b}$$

To achieve superior transient and steady-state tracking performance of the two-mass system, the tracking error is expected to be bounded by predefined performance boundaries as

$$-\delta\phi(t) < e_{i1} < \bar{\delta}\phi(t), \quad i = 1, \ldots, n \tag{13}$$

where the prescribed performance function $\phi(t)$ is typically selected as

$$\phi(t) = (\phi_0 - \phi_\infty)e^{-\kappa t} + \phi_\infty, \quad \forall t \ge 0 \tag{14}$$
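with $0 < \phi_\infty < \phi_0$ and $\kappa > 0$ being design constants, and $\delta$, $\bar{\delta}$ being design parameters. Define the error transformation as follows:

$$e_1(t) = \phi\Lambda(\eta_1), \qquad \Lambda(\eta_1) = \frac{\bar{\delta}e^{\eta_1(t)} - \delta e^{-\eta_1(t)}}{e^{\eta_1(t)} + e^{-\eta_1(t)}} \tag{15}$$

Then the transformed error can be obtained as

$$\eta_1(t) = \frac{1}{2}\ln\frac{\mu_1(t) + \delta}{\bar{\delta} - \mu_1(t)} \tag{16}$$

where $\mu_1(t) = e_1(t)/\phi(t)$ represents the intermediate variable. Define $\eta_2 = \dot{\eta}_1$; using (12), one has

$$\eta_2 = \dot{\eta}_1 = \frac{1}{2}r_1\dot{\mu}_1 \tag{17}$$

where

$$r_1 = \frac{1}{\mu_1 + \delta} - \frac{1}{\mu_1 - \bar{\delta}}, \qquad \dot{\mu}_1 = \frac{e_2}{\phi} - \frac{e_1\dot{\phi}}{\phi^2}$$

Using (12), the derivative of $\eta_2$ can be calculated as

$$\dot{\eta}_2 = \frac{1}{2}\dot{r}_1\dot{\mu}_1 + \frac{1}{2}r_1\ddot{\mu}_1 = \frac{1}{2}\dot{r}_1\dot{\mu}_1 + \frac{1}{2}r_1\left(\frac{\dot{e}_2}{\phi} + \nu_1\right) \tag{18}$$

where

$$\dot{r}_1 = \left[\frac{1}{(\mu_1 - \bar{\delta})^2} - \frac{1}{(\mu_1 + \delta)^2}\right]\dot{\mu}_1, \qquad \nu_1 = \frac{2e_1\dot{\phi}^2}{\phi^3} - \frac{2e_2\dot{\phi}}{\phi^2} - \frac{e_1\ddot{\phi}}{\phi^2}$$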
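The performance function (14) and the error transformation (15), (16) can be checked numerically. The sketch below is a minimal illustration, assuming the simulation values $\phi_0 = 2$, $\phi_\infty = 0.02$, $\kappa = 10$ and $\delta = \bar{\delta} = 1$ as defaults; the function names `phi`, `transform` and `inverse_transform` are hypothetical and introduced only here.

```python
import math

def phi(t, phi0=2.0, phi_inf=0.02, kappa=10.0):
    """Prescribed performance function (14)."""
    return (phi0 - phi_inf) * math.exp(-kappa * t) + phi_inf

def transform(e1, t, delta_lo=1.0, delta_hi=1.0):
    """Transformed error eta_1 from (16); only defined inside the bounds (13)."""
    mu = e1 / phi(t)
    if not (-delta_lo < mu < delta_hi):
        raise ValueError("tracking error is outside the prescribed bounds")
    return 0.5 * math.log((mu + delta_lo) / (delta_hi - mu))

def inverse_transform(eta, t, delta_lo=1.0, delta_hi=1.0):
    """Recover e_1 = phi * Lambda(eta_1) from (15)."""
    lam = (delta_hi * math.exp(eta) - delta_lo * math.exp(-eta)) / (
        math.exp(eta) + math.exp(-eta))
    return phi(t) * lam
```

Because $\eta_1$ grows without bound as $\mu_1$ approaches $-\delta$ or $\bar{\delta}$, keeping $\eta_1$ bounded is what keeps $e_1$ strictly inside the envelope $(-\delta\phi, \bar{\delta}\phi)$.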
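Define a sliding manifold $s_x$ as

$$s_x = \lambda_1\eta_1 + \eta_2 \tag{19}$$

where $\lambda_1 \in \mathbb{R}^+$ is chosen by the user. Using (12), (17) and (18), one has

$$\dot{s}_x = \lambda_1\eta_2 + \dot{\eta}_2 = \frac{r_1}{2\bar{J}_L\phi}\left(w_1^{\top}\varphi_1 + y_1 + \zeta_1 - \bar{J}_L\ddot{x}_r + \bar{J}_L\nabla\right) \tag{20}$$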
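where

$$\nabla = \frac{2\phi}{r_1}\left(\frac{1}{2}r_1\nu_1 + \lambda_1\eta_2 + \frac{1}{2}\dot{r}_1\dot{\mu}_1\right)$$

To guarantee the convergence of $s_x$, a virtual control signal $\alpha_1$ is first designed as

$$\alpha_1 = -\hat{w}_1^{\top}\varphi_1 + \hat{J}_L\ddot{x}_r - \hat{J}_L\nabla - k_1 s_x \tag{21}$$

We define the filtered virtual control signal $\alpha_{1f}$ as

$$\dot{\alpha}_{1f} = \alpha_{2f} \tag{22a}$$

$$\dot{\alpha}_{2f} = -\omega_n^2\alpha_{1f} - 2\omega_n\alpha_{2f} + \omega_n^2\alpha_1 \tag{22b}$$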
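The command filter (22) is a critically damped second-order low-pass filter whose states provide both the filtered signal $\alpha_{1f}$ and its derivative $\alpha_{2f}$ without differentiating $\alpha_1$ analytically. A minimal discrete-time sketch follows, assuming forward-Euler integration; the function name `command_filter_step` is hypothetical, and the step is only numerically stable when `dt * omega_n` is well below 2 (it is 0.2 for the simulation values $\omega_n = 1000$ and a 0.0002 s sampling interval).

```python
def command_filter_step(a1f, a2f, alpha1, omega_n, dt):
    """One forward-Euler step of the second-order command filter (22)."""
    d_a1f = a2f
    d_a2f = -omega_n ** 2 * a1f - 2.0 * omega_n * a2f + omega_n ** 2 * alpha1
    # Both states are updated from the old values (simultaneous update).
    return a1f + dt * d_a1f, a2f + dt * d_a2f
```

For a constant input the filtered signal settles to the input and the derivative state settles to zero, so `a2f` can stand in for $\dot{\alpha}_1$ once the filter has converged.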
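Define the auxiliary tracking errors $\varepsilon_1 = y_1 - \alpha_1$, $\varepsilon_2 = y_2 - \alpha_{2f}$. Substituting (21) into (20), it is easy to show that

$$\dot{s}_x = \frac{r_1}{2\bar{J}_L\phi}\left(-k_1 s_x + \tilde{w}_1^{\top}\varphi_1 + \varepsilon_1 + \zeta_1 - \tilde{J}_L\ddot{x}_r + \tilde{J}_L\phi\nu_1 + \tilde{J}_L\nabla\right) \tag{23}$$

Define $\delta = \alpha_1 - \alpha_{1f}$. The dynamics of the auxiliary tracking errors $\varepsilon_1$, $\varepsilon_2$ can be obtained as

$$\dot{\varepsilon}_1 = \dot{y}_1 - \dot{\alpha}_{1f} + \dot{\alpha}_{1f} - \dot{\alpha}_1 = \varepsilon_2 - \dot{\delta} \tag{24a}$$

$$J_m\dot{\varepsilon}_2 = w_2^{\top}\varphi_2 + u + \zeta_2 - J_m\dot{\alpha}_{2f} \tag{24b}$$

Define a sliding manifold $s_y$ as

$$s_y = \lambda_2\varepsilon_1 + \varepsilon_2 \tag{25}$$

The derivative of $s_y$ is

$$\dot{s}_y = \frac{1}{J_m}\left(J_m\lambda_2\varepsilon_2 + w_2^{\top}\varphi_2 + u - J_m\dot{\alpha}_{2f} + \zeta_2 - J_m\dot{\delta}\right) \tag{26}$$

Therefore, we can design the control signal as

$$u = -\hat{w}_2^{\top}\varphi_2 - \hat{J}_m\lambda_2\varepsilon_2 + \hat{J}_m\dot{\alpha}_{2f} - k_2 s_y \tag{27}$$

Substituting (27) into (26), one has

$$\dot{s}_y = \frac{1}{J_m}\left(-k_2 s_y + \tilde{J}_m\lambda_2\varepsilon_2 + \tilde{w}_2^{\top}\varphi_2 - \tilde{J}_m\dot{\alpha}_{2f} + \zeta_2 - J_m\dot{\delta}\right) \tag{28}$$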
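Theorem 1 Consider the nonlinear two-mass system (1). With the controller (27) and $\alpha_1$ given in (21), all signals of the closed-loop system are semi-globally uniformly ultimately bounded (SGUUB), and the tracking error $e_1$ remains within the predefined performance boundaries.

Proof Consider the following Lyapunov function candidate

$$V_c = \frac{\bar{J}_L}{2}s_x^2 + \frac{J_m}{2}s_y^2 \tag{29}$$

Using (23), (28) and Young's inequality, the derivative of $V_c$ can be obtained as

$$\begin{aligned}
\dot{V}_c &= \frac{r_1}{\phi}s_x\left(-k_1 s_x + \varepsilon_1 + \vartheta_x\right) + s_y\left(-k_2 s_y + \tilde{J}_m\lambda_2\varepsilon_2 + \vartheta_y\right) \\
&\le -\frac{k_1 r_1}{\phi}s_x^2 + \frac{r_1}{\phi}\left(\frac{k_1 s_x^2}{2} + \frac{\varepsilon_1^2}{k_1} + \frac{\vartheta_x^2}{k_1}\right) - k_2 s_y^2 + \frac{k_2 s_y^2}{2} + \frac{\tilde{J}_m^2\lambda_2^2\varepsilon_2^2}{k_2} + \frac{\vartheta_y^2}{k_2} \\
&= -\frac{k_1 r_1}{2\phi}s_x^2 + \frac{r_1\varepsilon_1^2}{k_1\phi} + \frac{r_1\vartheta_x^2}{\phi k_1} - \frac{k_2}{2}s_y^2 + \frac{\tilde{J}_m^2\lambda_2^2\varepsilon_2^2}{k_2} + \frac{\vartheta_y^2}{k_2} \\
&\le -\frac{k_1 r_1}{2\phi}s_x^2 - \left(\frac{k_2}{2} - \gamma_2\right)s_y^2 + \mathring{\vartheta}
\end{aligned} \tag{30}$$

where $\vartheta_x = \tilde{w}_1^{\top}\varphi_1 + \zeta_1 - \tilde{J}_L\ddot{x}_r + \tilde{J}_L\phi\nu_1 + \tilde{J}_L\nabla$, $\vartheta_y = \tilde{w}_2^{\top}\varphi_2 - \tilde{J}_m\dot{\alpha}_{2f} + \zeta_2 - J_m\dot{\delta}$, $\gamma_2 = \max\left\{\frac{r_1}{k_1\phi\lambda_2}, \frac{\tilde{J}_m^2\lambda_2^2}{k_2}\right\}$, and $\mathring{\vartheta} = \frac{r_1}{\phi}\frac{\vartheta_x^2}{k_1} + \frac{\vartheta_y^2}{k_2}$. Therefore, if $k_1$, $k_2$ are selected such that $\frac{k_2}{2} > \gamma_2$, then $\dot{V}_c < 0$ holds whenever $|s_x| > \sqrt{\frac{2\phi\mathring{\vartheta}}{k_1 r_1}}$ or $|s_y| > \sqrt{\frac{2\mathring{\vartheta}}{k_2 - 2\gamma_2}}$. Hence, the boundedness of $V_c$, and therefore of $s_x$ and $s_y$, is guaranteed. According to (19), $\eta_1$ is also bounded, which means that $e_1$ stays within the prescribed performance boundaries. Theorem 1 is thus proved.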
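The second line of (30) follows from Young's inequality applied to each cross term. As a brief worked step (the free constant $c > 0$ is introduced here only for illustration and does not appear in the original derivation):

```latex
% Young's inequality with a free constant c > 0:
ab \le \frac{c}{2}a^2 + \frac{1}{2c}b^2 .
% Choosing c = k_1/2 for the two cross terms in s_x:
s_x\varepsilon_1 \le \frac{k_1}{4}s_x^2 + \frac{1}{k_1}\varepsilon_1^2, \qquad
s_x\vartheta_x \le \frac{k_1}{4}s_x^2 + \frac{1}{k_1}\vartheta_x^2 .
% Adding the two bounds gives the bracketed term in (30):
s_x(\varepsilon_1 + \vartheta_x) \le \frac{k_1}{2}s_x^2
  + \frac{\varepsilon_1^2}{k_1} + \frac{\vartheta_x^2}{k_1} .
```

The same choice with $c = k_2/2$ bounds the cross terms in $s_y$, which is why half of each damping term $-k_1 s_x^2$ and $-k_2 s_y^2$ survives in the third line of (30).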
5 Simulation

To evaluate the efficacy of the proposed control algorithm, we use the two-mass system model (1) in simulation. We set $f_1 = b_L\dot{\theta}_L$, $f_2 = b_m\dot{\theta}_m$, and $\Delta_1 = \Delta_2 = 0$, with the system parameters $J_L = 7.4 \times 10^{-3}$ kg m$^2$ s$^{-2}$, $J_m = 5.5 \times 10^{-3}$ kg m$^2$ s$^{-2}$, $b_L = 0.1$ Nm s$^{-1}$, $b_m = 0.1$ Nm s$^{-1}$, and $k = 2.9$ Nm rad$^{-1}$. The reference signal is $x_r = \sin(0.4\pi t)$. The initial value of $x_1$ is $x_1(0) = 1$, and all other states start at 0. The controller parameters are $k_f = 0.005$, $l_x = l_y = 1$, $\Gamma_1 = \Gamma_2 = 200$, $\bar{\delta} = \delta = 1$, $\phi_0 = 2$, $\phi_\infty = 0.02$, $\kappa = 10$, $\lambda_1 = 2$, $\lambda_2 = 20$, $\omega_n = 1000$, $k_1 = 0.05$, and $k_2 = 50$. We used two neural networks with 30 hidden neurons each and a simulation sampling interval of 0.0002 s. For comparison purposes, we also tested an adaptive DSC controller [18], and we present the simulation results
Fig. 1 Angular position
Fig. 2 Position tracking error
in Figs. 1, 2, 3 and 4. In these figures, “-PPC” denotes the results obtained from the proposed method in this paper, whereas “-DSC” denotes the results obtained from the DSC method proposed in [18]. Figures 1 and 2 indicate that the proposed PPC method accurately tracks the given reference signal, with a small tracking error that conforms to the prescribed performance boundaries. In contrast, the DSC method yields a significant gap between the system state and the reference trajectory, resulting in a much larger tracking error, as demonstrated in Fig. 2. The superiority of the proposed PPC method is evident in Figs. 3 and 4. Specifically, in Fig. 3, we observe that the proposed method closely tracks the given reference $\dot{x}_r$, as opposed to the DSC method, which shows severe oscillations at the start of
Fig. 3 Angular velocities
Fig. 4 Velocity tracking error
the control process, as evident in the deviation between the load velocity and its reference $\alpha_{2f}$. In addition, Fig. 4 demonstrates that the PPC method results in a rapid, stable convergence of the tracking error to 0, whereas the DSC method produces large oscillations and persistent tracking errors throughout the simulation.
6 Conclusion

In this paper, we present a novel singularity-free adaptive prescribed performance control method for a class of two-mass systems. The method is based on neural networks and a high-order command filter. Our approach differs from traditional dynamic models, where the control gain changes depending on system inertia. Instead, we adopt a singularity-free model that maintains a unit control gain. To ensure both transient and steady-state performance, we design a prescribed performance controller that circumvents the potential singularity problem that may arise even when the original control gain is unknown. To minimize the need for repetitive design of virtual control signals and command filters, we develop a high-order command filter, and design two sliding mode controllers that make use of the system’s integral structure. Furthermore, we employ neural networks to approximate the unknown dynamics of the system, and the online learning laws are designed to update the estimated system parameters and neural network weights concurrently. Simulation results demonstrate that the proposed identification and control laws successfully control a typical two-mass system.
References

1. Sariyildiz, E., Chen, G., Yu, H.: A unified robust motion controller design for series elastic actuators. IEEE/ASME Trans. Mechatron. 22(5), 2229–2240 (2017)
2. Boukhezzar, B., Siguerdidjane, H.: Nonlinear control of a variable-speed wind turbine using a two-mass model. IEEE Trans. Energ. Convers. 26(1), 149–162 (2010)
3. Hu, X., Li, L.: A new nonlinear active disturbance rejection control for the cable car system to restrain the vibration. Complexity 2020 (2020)
4. Khan, I.U., Dhaouadi, R.: Robust control of elastic drives through immersion and invariance. IEEE Trans. Ind. Electron. 62(3), 1572–1580 (2015)
5. Erenturk, K.: Gray-fuzzy control of a nonlinear two-mass system. J. Franklin Inst. 347(7), 1171–1185 (2010)
6. Fuentes, E., Kalise, D., Kennel, R.M.: Smoothened quasi-time-optimal control for the torsional torque in a two-mass system. IEEE Trans. Ind. Electron. 63(6), 3954–3963 (2016)
7. Erenturk, K.: Fractional-order PIλDμ and active disturbance rejection control of nonlinear two-mass drive system. IEEE Trans. Ind. Electron. 60(9), 3806–3813 (2012)
8. Miao, Z., Wang, Y.: Robust dynamic surface control of flexible joint robots using recurrent neural networks. J. Contr. Theor. Ap. 11(2), 222–229 (2013)
9. Zheng, D.-D., Xie, W.-F., Ren, X., Na, J.: Identification and control for singularly perturbed systems using multitime-scale neural networks. IEEE Trans. Neural Netw. Learn. Syst. 28(2), 321–333 (2017)
10. Ren, X., Rad, A.B., Chan, P., Lo, W.L.: Identification and control of continuous-time nonlinear systems via dynamic neural networks. IEEE Trans. Ind. Electron. 50(3), 478–486 (2003)
11. Xu, H., Ioannou, P.A.: Robust adaptive control for a class of MIMO nonlinear systems with guaranteed error bounds. IEEE Trans. Autom. Control 48(5), 728–742 (2003)
12. Chu, Z., Zhu, D., Yang, S.X.: Observer-based adaptive neural network trajectory tracking control for remotely operated vehicle. IEEE Trans. Neural Netw. Learn. Syst. 28(7), 1633–1645 (2017)
13. Rovithakis, G.A.: Stable adaptive neuro-control design via Lyapunov function derivative estimation. Automatica 37(8), 1213–1221 (2001)
14. Zheng, D.-D., Pan, Y., Guo, K., Yu, H.: Identification and control of nonlinear systems using neural networks: a singularity-free approach. IEEE Trans. Neural Netw. Learn. Syst. 30(9), 2696–2706 (2019)
15. Bu, X., Qi, Q., Jiang, B.: A simplified finite-time fuzzy neural controller with prescribed performance applied to Waverider aircraft. IEEE Trans. Fuzzy Syst. 30(7), 2529–2537 (2022)
16. Ma, H., Zhou, Q., Li, H., Lu, R.: Adaptive prescribed performance control of a flexible-joint robotic manipulator with dynamic uncertainties. IEEE Trans. Cybern. 52(12), 12905–12915 (2022)
17. Na, J., Mahyuddin, M.N., Herrmann, G., Ren, X., Barber, P.: Robust adaptive finite-time parameter estimation and control for robotic systems. Int. J. Robust Nonlin. 25(16), 3045–3071 (2015)
18. Wang, M., Wang, C.: Learning from adaptive neural dynamic surface control of strict-feedback systems. IEEE Trans. Neural Netw. Learn. Syst. 26(6), 1247–1259 (2015)
Intelligent Path Planning for Home Service Robots Based on Improved RRT Algorithm

Yu Liu, Haikuan Wang, and Shuo Zhang
Abstract To improve the safety and efficiency of home service robots operating in complex environments, effective motion planning of the robot’s path is required. To address the random growth direction, the large number of redundant nodes produced by random search, and the poor path smoothness of the RRT algorithm, this paper proposes an improved RRT path planning algorithm for home service robots. First, the idea of the artificial potential field (APF) method is introduced to constrain the growth direction of new nodes, and the original fixed step size is replaced by a dynamic step size, yielding an APF-RRT algorithm based on dynamic step size that improves search efficiency. Second, the path is simplified and smoothed using the idea of Floyd’s algorithm, and a safety threshold is added when removing redundant points and inserting new nodes to keep the robot away from obstacles. As a result, the path length is shortened, the number of redundant nodes is reduced, and the smoothness of the path is improved. MATLAB simulation and physical verification results show that the improved algorithm reduces the path length, decreases the number of nodes, and improves the smoothness, which verifies the effectiveness and superiority of the algorithm.

Keywords Path planning · RRT algorithm · Artificial potential field method · Floyd’s algorithm
Y. Liu (✉) · H. Wang · S. Zhang
School of Mechatronic Engineering and Automation, Shanghai University, Shanghai 200444, China

H. Wang
Shanghai Key Laboratory of Power Station Automation Technology, Shanghai 200444, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_9
1 Introduction

With the arrival of the digital and intelligent era, people’s quality of life has improved significantly and technology continues to advance, so home service robots are increasingly applied in medical care, agriculture, and daily life [14]. Home service robots serve people by taking over household tasks, freeing human hands and improving work quality and efficiency. Their work efficiency depends mainly on the trajectory generated by motion planning. Path planning, one of the main research topics in motion planning, is the strategy for constructing a curve that connects the start position and the end position. It is widely used in planning problems that can be abstracted as point-line networks, so research on robot path planning methods has significant economic and practical value. Path planning technology [13], a core element of home service robot research, aims to enable autonomous path planning decisions for home service robots in unknown environments [7]. Path planning is mainly divided into global path planning and local path planning. Global path planning algorithms are static: they plan according to existing map information and seek an optimal path from the start point to the target point. Local path planning algorithms are dynamic: they only need to collect environmental information in real time, determine the distribution of obstacles, and select the optimal path from the current node to a given target node.
In this paper, we study global path planning algorithms, which are widely used at present and fall into two main types: search-based path planning algorithms, represented by the A* algorithm [4], and sampling-based path planning algorithms, represented by the Rapidly-exploring Random Tree (RRT) algorithm [6]. The RRT algorithm, which has been widely developed and applied over the last decade or so, is a sampling-based motion planning algorithm proposed in 1998 by LaValle [9] of Iowa State University, who has since worked on its improvement and application; his work laid the foundation of the RRT algorithm. Gammell et al. [5] proposed improved algorithms for the problems of RRT, such as strong randomness, rough paths, and poor dynamic adaptability. For example, RRT* recomputes the connections of new nodes to achieve asymptotic optimality, Informed RRT* [11] uses an elliptical contraction strategy based on the initial path to reduce jagged paths, and RRTX uses a replanning strategy to adapt to dynamic environments and improve the practicality of the algorithm. Kuffner et al. [12] improve search efficiency by growing two (or more) trees simultaneously from the start and end points, as in RRT-connect. Optimal BRRT* [15] adds a heuristic function on this basis to further improve the convergence speed, and HARRT* combines homotopy classes of the topological space with a bidirectional tree and proves the effectiveness of the method theoretically. In addition to the bidirectional tree strategy, Jaillet, Rodriguez, et al. guide the generation of sampling points using environmental features. For example, ADD-RRT [1] restricts the sampling area of boundary points to a sphere and relates the radius of the sphere to the success rate of boundary point expansion in order to improve the expansion success rate, and OB-RRT similarly collects obstacle data to determine the node growth direction.

As a sampling-based path planning algorithm, RRT searches quickly and remains effective in complex situations. It also has many shortcomings [10]: the random growth direction leads to repeated planning and poor stability, and the random search produces a large number of invalid and redundant nodes. To address the random growth direction of the tree and the blindness of node expansion, the idea of the artificial potential field (APF) method [17] is introduced to constrain the growth direction of new nodes, and the original fixed step size is replaced by a dynamic step size. The resulting APF-RRT algorithm based on dynamic step size improves search efficiency and reduces complexity. To address the many turning points and redundant points in the path, the path is smoothed using the idea of Floyd’s algorithm [16], which removes a large number of invalid nodes from the random search, shortens the length of the whole path, and reduces the time spent at turning points so that the robot runs more smoothly.
2 Basic RRT Algorithm Principle

The RRT algorithm first initializes a random tree. The start point X_start in the environment is the root of the random tree, and the end point X_goal is the target of the search. A state point X_rand is sampled at random from the environment. If the point is not within an obstacle, the Euclidean distance from every node in the random tree to X_rand is calculated, and the node X_near closest to the sampling point X_rand is selected from the tree. The tree growth process then starts. First, X_near and X_rand are connected; the direction of this connection is the growth direction of the tree. A step size (Stepsize) defines how far the tree grows in one extension. Growing one step in this direction generates a new node X_new, and the segment from X_near to X_new is checked against the obstacles. If the segment passes through an obstacle, the new node is abandoned. Otherwise, X_new is added to the tree, and the growth process is repeated until the distance from the newly generated node to the target point is less than one step size. The growth of the tree then terminates, and X_goal is added to the random tree as the last path node. The algorithm ends, and the planned path is obtained. The pseudocode is shown in Fig. 1. Figure 2 shows an initialized environment, including a map, a start point, and an end point. The black object is the obstacle, blue is the start position, red is the end position, X_rand is the random sampling point, X_near is the nearest node of
Fig. 1 RRT algorithm pseudocode
Fig. 2 Algorithm node expansion
the random sampling point, and X_new is the generated new node. The new node is computed as

$$X_{new} = X_{near} + s\,\frac{X_{rand} - X_{near}}{\|X_{rand} - X_{near}\|}$$

where $s$ is the step size and $\|X_{rand} - X_{near}\|$ is the Euclidean distance between the two points.
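The growth procedure described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the pseudocode of Fig. 1: it assumes circular obstacles given as `(cx, cy, radius)` tuples, a square map, and a sampled-point collision test, and the names `collides` and `rrt` are hypothetical. The extra collision check on the final segment to the goal is a precaution added here, not a step stated in the text.

```python
import math
import random

def collides(p, q, obstacles, samples=20):
    """True if the segment p->q touches any circular obstacle (cx, cy, radius)."""
    for i in range(samples + 1):
        t = i / samples
        x, y = p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])
        if any(math.hypot(x - cx, y - cy) <= r for cx, cy, r in obstacles):
            return True
    return False

def rrt(start, goal, obstacles, step=50.0, size=500.0, max_iter=10000, seed=0):
    """Basic RRT; returns a list of nodes from start to goal, or None on failure."""
    rng = random.Random(seed)
    parent = {start: None}                      # tree stored as child -> parent
    for _ in range(max_iter):
        x_rand = (rng.uniform(0.0, size), rng.uniform(0.0, size))
        if any(math.dist(x_rand, (cx, cy)) <= r for cx, cy, r in obstacles):
            continue                            # sample lies inside an obstacle
        x_near = min(parent, key=lambda n: math.dist(n, x_rand))
        d = math.dist(x_near, x_rand)
        if d == 0.0:
            continue
        # X_new = X_near + s * (X_rand - X_near) / ||X_rand - X_near||
        x_new = (x_near[0] + step * (x_rand[0] - x_near[0]) / d,
                 x_near[1] + step * (x_rand[1] - x_near[1]) / d)
        if not (0.0 <= x_new[0] <= size and 0.0 <= x_new[1] <= size):
            continue
        if x_new in parent or collides(x_near, x_new, obstacles):
            continue
        parent[x_new] = x_near
        if math.dist(x_new, goal) < step and not collides(x_new, goal, obstacles):
            if x_new != goal:
                parent[goal] = x_new
            path, node = [], goal
            while node is not None:             # walk back to the root
                path.append(node)
                node = parent[node]
            return path[::-1]
    return None
```

The linear nearest-neighbor search keeps the sketch short; for large trees a spatial index such as a k-d tree is the usual replacement.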
3 Improved RRT Algorithm

3.1 APF-RRT Algorithm

Artificial potential field (APF) path planning is a virtual force method proposed by Khatib. Its basic idea is to treat the movement of a mobile robot in its environment as movement in an abstract artificial force field made up of an attractive (gravitational) field and a repulsive field. The target point exerts an attractive force on the mobile robot, while obstacles and areas that the robot is not allowed to enter exert a repulsive force, so the mobile robot in the potential field is affected by the gravitational field of its target pose and the repulsive field around the obstacles [3]. The movement of the mobile robot is then controlled by the resultant force so that it moves toward the target. The gravitational potential field depends mainly on the distance between the robot and the target point: the smaller the distance, the smaller the potential energy of the robot, and the larger the distance, the larger the potential energy. The gravitational potential function is

$$U_{att}(q) = \frac{1}{2}\xi\rho^2(q, q_{goal})$$

where $\xi$ is the scale factor and $\rho(q, q_{goal})$ is the Euclidean distance between the current position $q$ of the robot and the target position $q_{goal}$; the corresponding force points from the current robot position to the target position.

The repulsive potential field is determined by the distance between the robot and the obstacle. When the robot is outside the influence range of the obstacle, the potential energy is zero. Once the robot enters the influence range, the larger the distance between the two, the smaller the potential energy of the robot, and the smaller the distance, the greater the potential energy. The repulsive potential function is

$$U_{rep}(q) = \begin{cases} \dfrac{1}{2}\eta\left(\dfrac{1}{\rho(q, q_{obs})} - \dfrac{1}{\rho_0}\right)^2, & \rho(q, q_{obs}) \le \rho_0 \\ 0, & \rho(q, q_{obs}) > \rho_0 \end{cases}$$

where $\eta$ is a proportional coefficient, $\rho_0$ is the influence range of the obstacle, and $\rho(q, q_{obs})$ is the Euclidean distance between the current position $q$ of the robot and the obstacle position $q_{obs}$; the corresponding force points from the obstacle position to the current position of the robot. Superimposing the gravitational field and the repulsive field gives the composite field of the entire operating space. The resultant potential field of the robot is the sum of the repulsive potential field and the gravitational potential field:

$$U(q) = U_{att}(q) + U_{rep}(q)$$

To solve the problems of blind node expansion and insufficient smoothness of the generated path in the RRT algorithm, this paper combines the APF algorithm with the RRT algorithm, changes the original fixed step size to a dynamic step size, and proposes an APF-RRT algorithm based on dynamic step size. As shown in Fig. 3, the obstacle exerts a repulsive force on the robot, and the target point exerts an attractive force on the robot. The sum of the two is the resultant force on the robot due to the target point and the obstacle [2]. Under the action of the resultant force, the growth direction of the tree
Fig. 3 APF-RRT algorithm based on dynamic step size
will gradually turn toward the target point, which improves the search efficiency and reduces the complexity. The new node is then computed as

$$X_{new} = X_{near} + sQ, \qquad Q = \frac{X_{rand} - X_{near}}{\|X_{rand} - X_{near}\|} + k\,\frac{X_{goal} - X_{near}}{\|X_{goal} - X_{near}\|}$$

where $s$ is the step size in the direction of the random point, $\|X_{rand} - X_{near}\|$ is the Euclidean distance between X_rand and X_near, $\|X_{goal} - X_{near}\|$ is the Euclidean distance between X_goal and X_near, and $k$ is the gravitational coefficient.

The pseudocode of the improved algorithm is shown in Fig. 4, and the optimization procedure is as follows. The start node is added to the random tree T as the root node (step 1) and is taken as the most recently generated node. While the number of iterations is less than the maximum and the distance from the newest node of the tree to the target point is greater than one step, the newest node is taken as the parent node of the next iteration; a gravitational potential field is established at the target point and a repulsive potential field at the obstacles, and the resultant force on the node is calculated (step 6). The parent node generates a new node under the action of the resultant force, the iteration counter is incremented by one, and the above steps are repeated until the distance from the newest node of the tree to the target point is less than one step, at which point the algorithm ends.

The RRT algorithm with gravity is still extended with a fixed step size. In a complex environment, a fixed step size is not conducive to avoiding obstacles, and in an open environment it makes expansion relatively inefficient. Based on this, this paper proposes a variable step size based on the gravitational coefficient k, as follows.
Fig. 4 Improved RRT algorithm pseudo code
• First, initialize a large k value.
• When encountering obstacles, the k value is reduced to reduce the step size in the gravitational direction so that the new node is extended in the direction of random points to avoid obstacles.
• When there is no obstacle, increase the value of k to speed up the algorithm.

However, the value of k cannot be too large or too small. If k is too large, the search may be trapped in a local optimum, and if k is too small, the effect of gravity is lost, so the value of k must be set reasonably according to the actual environment.
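The goal-biased extension and the variable gravitational coefficient described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `apf_extend` and `adapt_k`, the bounds `k_min` and `k_max`, and the halving and doubling rule are hypothetical choices made here, since the text only says that k is decreased near obstacles and increased in free space.

```python
import math

def unit(v):
    """Unit vector of a 2-D vector; the zero vector maps to itself."""
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n) if n > 0.0 else (0.0, 0.0)

def apf_extend(x_near, x_rand, x_goal, s, k):
    """X_new = X_near + s * Q, with Q the random direction plus k times the goal direction."""
    u_rand = unit((x_rand[0] - x_near[0], x_rand[1] - x_near[1]))
    u_goal = unit((x_goal[0] - x_near[0], x_goal[1] - x_near[1]))
    q = (u_rand[0] + k * u_goal[0], u_rand[1] + k * u_goal[1])
    return (x_near[0] + s * q[0], x_near[1] + s * q[1])

def adapt_k(k, blocked, k_min=0.2, k_max=2.0, factor=2.0):
    """Shrink k after a blocked extension, grow it otherwise (hypothetical rule)."""
    if blocked:
        return max(k_min, k / factor)
    return min(k_max, k * factor)
```

Note that the length of the extension is $s\|Q\|$ rather than $s$, so changing k changes both the direction of growth and the effective step length, which is how the coefficient produces a dynamic step size.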
3.2 Path Smoothing

When the traditional RRT algorithm generates a complete path from the start to the end, the path contains many turning points and redundant points because the dynamics constraints and the minimum turning radius are not taken into account during the search. To address this problem, the path is smoothed using the idea of Floyd’s algorithm, which removes a large number of invalid nodes from the random search, shortens the length of the entire path, and reduces the time spent at turning points so that the robot runs more smoothly.
Fig. 5 Deleting redundant nodes process
The Floyd algorithm, also known as the interpolation point method, finds the shortest paths between multiple source points in a given weighted graph by drawing on the idea of dynamic programming [8]. In this paper, we combine the RRT algorithm with the Floyd algorithm for secondary planning of paths: the Floyd algorithm is applied over multiple iterations from the start point to the target point to further shorten the path planned by the RRT algorithm while greatly improving its smoothness. The path planned by the RRT algorithm is a series of line segments between neighboring nodes, which produces many inflection points. If there is no obstacle between the preceding node and the following node of the current node, the current node is redundant and is deleted. If there is an obstacle, multiple iterations between the two nodes are performed to shorten the path length. As shown in Fig. 5, L(A, B) denotes the Euclidean distance from A to B. There is an obstacle between A and B, so L(A, B) = +∞, and R(A, B) is A → B. At this point, insert point C between points A and B. If there is no obstacle between A, C and between B, C, then C is a valid point; if L(A, C) + L(C, B) < L(A, B), then L(A, B) = L(A, C) + L(C, B), and R(A, B) is A → C → B. Then insert point D between the two points A and B. There is no obstacle between A, D and between B, D, so D is a valid point; if L(A, D) + L(D, B) < L(A, C) + L(C, B), then L(A, B) = L(A, D) + L(D, B), and R(A, B) is A → D → B. Continuing this iteration with points E, F, and G reduces the number of turning points and shortens the path, but it also reduces safety because the path moves closer to the obstacle. To improve the safety of path planning and keep the path away from obstacles, a safety threshold is added when deleting redundant points and inserting new nodes, as shown in Fig. 6.

If the distance from the newly generated path L(A, D) to the center of the obstacle is less than the set safety value r (the robot radius), the distance is additionally increased by 2r/Dis, where Dis is the distance from the robot to the center of the obstacle; otherwise, it is not increased.
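The redundant-node removal step can be sketched as follows. This minimal Python illustration covers only the deletion rule (a node is dropped when its predecessor and successor can be connected directly), not the full Floyd iteration with inserted points. It also swaps in a hard clearance check, `radius + margin`, in place of the 2r/Dis cost penalty described above, and it assumes circular obstacles given as `(cx, cy, radius)`; the function names are hypothetical.

```python
import math

def seg_point_dist(a, b, c):
    """Shortest distance from point c to the segment a->b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        t = 0.0
    else:
        t = ((c[0] - a[0]) * dx + (c[1] - a[1]) * dy) / length_sq
        t = max(0.0, min(1.0, t))               # clamp to the segment
    return math.hypot(a[0] + t * dx - c[0], a[1] + t * dy - c[1])

def is_safe(a, b, obstacles, margin):
    """True if a->b keeps at least `margin` clearance from every obstacle."""
    return all(seg_point_dist(a, b, (cx, cy)) > r + margin
               for cx, cy, r in obstacles)

def prune(path, obstacles, margin):
    """Delete every node whose neighbors can be joined by a safe straight segment."""
    pruned = list(path)
    i = 1
    while i < len(pruned) - 1:
        if is_safe(pruned[i - 1], pruned[i + 1], obstacles, margin):
            del pruned[i]                       # current node is redundant
        else:
            i += 1
    return pruned
```

Setting `margin` to the robot radius corresponds to the safety value r in the text; a larger margin keeps more inflection points but leaves more clearance.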
Fig. 6 Increase the security threshold
4 Experiment and Simulation

To verify the performance of the improved algorithm, experiments are conducted on the MATLAB simulation platform. To keep the improved algorithm controllable, a limit on the number of cycles is added: if the randomly generated tree cannot reach the target point x_goal within the specified number of cycles, the algorithm returns a failure, and the simulation is repeated after resetting the parameters.
4.1 Simulation Experiments

The original RRT algorithm and the two improved algorithms are simulated and analyzed in MATLAB. The experimental map is a two-dimensional scene map, and the simulation is divided into two scenes, one with simple obstacles and one with complex obstacles, which makes the simulation results more comparable and better verifies the effectiveness of the algorithm. The simulation performs path planning in a maze environment of size 500 × 500. The initial coordinates of the studied object are (100, 100), the coordinates of the target location are (400, 300), the specified expansion step size is 50, and the maximum number of iterations is set to 10,000. Path planning is performed with the original RRT algorithm and the two improved algorithms, respectively. The obtained simulation results are shown in Figs. 7 and 8. The blue line indicates the path planned by the original RRT algorithm, the red line indicates the path planned by the first improved algorithm (APF-RRT), and the green line indicates the path planned by the second improved algorithm (FAPF-RRT). In the case of simple obstacles, the corresponding results of the three algorithms are shown in Fig. 7. In the complex environment case, with many obstacles, the
Fig. 7 Simple obstacle path planning diagram: (a) original RRT algorithm, (b) improved APF-RRT algorithm, (c) improved interpolation point method
Fig. 8 Complex obstacle path planning map: (a) original RRT algorithm, (b) improved APF-RRT algorithm, (c) improved interpolation point method
corresponding results of the three algorithms are shown in Fig. 8. The figures show that the original RRT algorithm can plan a path quickly, but the path has many nodes, poor smoothness, and a non-ideal length. The RRT algorithm improved with the artificial potential field idea greatly reduces the number of nodes, shortens the path length, and improves the path quality significantly. On this basis, the path optimized by Floyd’s algorithm has a similar length, but the number of nodes is further reduced and the smoothness is better.
4.2 Parameter Analysis To verify the performance of the improved algorithms, the original RRT algorithm, the first improved algorithm (APF-RRT), and the second improved algorithm (FAPF-RRT) were each run 50 times. The number of inflection points (turns), path length, planning time, and number of iterations were recorded for each run, and their averages were calculated for the three algorithms, as shown in
Table 1 Performance comparison of optimized algorithms in simple environments
Algorithm      Average path length/m   Average number of iterations   Average planning time/s   Average number of inflection points
Original RRT   844.59                  91                             12.52                     81
APF-RRT        687.11                  32                             5.23                      30
FAPF-RRT       680.93                  30                             5.52                      5

Table 2 Performance comparison of optimized algorithms in complex environments
Algorithm      Average path length/m   Average number of iterations   Average planning time/s   Average number of inflection points
Original RRT   1067.36                 183                            22.17                     67
APF-RRT        885.58                  70                             10.46                     28
FAPF-RRT       881.24                  74                             10.68                     5
Tables 1 and 2. In terms of average path length, both the APF-RRT and FAPF-RRT algorithms find shorter paths than the original RRT algorithm. Under the same environment and parameters, the average numbers of iterations of the two improved algorithms are almost the same. In the complex environment (Table 2), the average number of iterations of the APF-RRT and FAPF-RRT algorithms is reduced by 61.7% and 59.6%, respectively, compared with the original RRT algorithm. The FAPF-RRT algorithm adds a path smoothing step to the APF-RRT algorithm, so its planning time is slightly longer. Even so, compared with the original RRT algorithm, the average planning time of the APF-RRT and FAPF-RRT algorithms is reduced by 52.8% and 51.8%, respectively, which is an obvious advantage and reflects faster convergence. In the maze environment, the average number of inflection points of the FAPF-RRT algorithm is reduced by 92.5% and 82.1% compared with the original RRT algorithm and the APF-RRT algorithm, respectively.
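The percentages quoted above can be reproduced from the Table 2 averages with a few lines of arithmetic. The sketch below is only a check of that arithmetic; the helper name `reduction` and the dictionary layout are illustrative choices.

```python
def reduction(baseline, improved):
    """Percentage reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Table 2 (complex environment): (Original RRT, APF-RRT, FAPF-RRT)
table2 = {
    "length_m": (1067.36, 885.58, 881.24),
    "iterations": (183, 70, 74),
    "time_s": (22.17, 10.46, 10.68),
    "inflections": (67, 28, 5),
}

orig_it, apf_it, fapf_it = table2["iterations"]
orig_t, apf_t, fapf_t = table2["time_s"]
orig_n, apf_n, fapf_n = table2["inflections"]
orig_len, _, fapf_len = table2["length_m"]

iteration_cut = (round(reduction(orig_it, apf_it), 1), round(reduction(orig_it, fapf_it), 1))
time_cut = (round(reduction(orig_t, apf_t), 1), round(reduction(orig_t, fapf_t), 1))
# FAPF-RRT inflection points relative to the original RRT and to APF-RRT.
inflection_cut = (round(reduction(orig_n, fapf_n), 1), round(reduction(apf_n, fapf_n), 1))
length_cut = round(reduction(orig_len, fapf_len), 1)
```

The inflection-point figures only come out as 92.5% and 82.1% when FAPF-RRT is compared against the original RRT and APF-RRT, in that order.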
4.3 Actual Scene Testing To verify the performance of the algorithm in an actual scene, a computer running Ubuntu 18.04 LTS with an Intel i5-6500 processor (main frequency 3.2 GHz) and 16 GB of memory is used. The EAIBOT N1 two-wheel differential drive robot is selected as the experimental platform, as shown in Fig. 9. Using the open
Fig. 9 Drive robot
source robot software system ROS, a 10 m × 5 m area serves as the experimental environment, as shown in Fig. 10. The tables represent the obstacles in the environment, and the planned path is shown in Fig. 11. Fig. 11a shows the state of the robot at the initial point. Fig. 11b–d show the robot avoiding obstacles along the path planned by the algorithm; the robot travels along the planned path without colliding with obstacles. Fig. 11e shows the robot successfully reaching the target point. The experiment verifies the feasibility of the improved algorithm.
Fig. 10 Experimental environment
5 Conclusion To address the random growth direction, large number of redundant nodes, and poor smoothness of the RRT algorithm, this paper proposes an improved RRT path planning algorithm for service robots. The idea of the artificial potential field method is introduced to constrain the growth direction of new nodes, and the path is simplified and smoothed using the idea of the Floyd algorithm. A safety threshold is applied when deleting redundant points and inserting new nodes, which keeps the path away from obstacles and improves both the safety of path planning and the stability of the robot. According to the MATLAB simulation results, the effectiveness and superiority of the algorithm compared with the basic RRT algorithm are mainly reflected in the following aspects.
(1) The time required for path planning is greatly reduced. In the simulation environment, the path planning time is reduced by 51.8%.
(2) The planned path is shorter. In the simulation environment, the path length is reduced by 17.4%.
(3) After smoothing, the path initially planned by the improved RRT algorithm is smoother and better meets the motion requirements of service robots.
The physical verification experiment also demonstrates the feasibility of the algorithm. The simulation in this paper is carried out only in a two-dimensional environment and does not consider three-dimensional environments. Future work will consider path planning in a three-dimensional environment and will further improve the adaptability of the algorithm according to the constraints imposed by obstacles.
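The idea of deleting redundant points subject to a safety threshold can be sketched as a greedy shortcutting pass. This is an illustration under stated assumptions, not the authors' Floyd-based procedure: obstacles are modelled as hypothetical circles (cx, cy, r), the names `prune` and `seg_point_dist` are invented for the example, and consecutive nodes of the input path are assumed to be already collision-free.

```python
import math


def seg_point_dist(a, b, p):
    """Shortest distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    len2 = dx * dx + dy * dy
    t = 0.0 if len2 == 0 else max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def prune(path, circles, safety):
    """Drop intermediate nodes whenever a direct segment keeps the safety margin."""

    def clear(a, b):
        return all(seg_point_dist(a, b, (cx, cy)) - r >= safety for cx, cy, r in circles)

    out, i = [path[0]], 0
    while i < len(path) - 1:
        # Try the farthest node first; fall back to the next node of the
        # original path, which is assumed valid and is not re-checked.
        j = len(path) - 1
        while j > i + 1 and not clear(path[i], path[j]):
            j -= 1
        out.append(path[j])
        i = j
    return out


raw_path = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (4.0, 4.0)]
pruned = prune(raw_path, circles=[(2.0, 2.0, 1.0)], safety=0.2)
```

In this example the five-node path collapses to three nodes, and the direct diagonal is rejected because it would pass through the obstacle.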
Fig. 11 Drive robot path diagram, panels (a)–(e)
Adaptive Neural Networks Backstepping Control of Uncertain Second-Order Systems with Input and State Time Delays Renjian Jiang, Lin Tian, Peng Li, and Liang Sun
Abstract This paper investigates the adaptive neural networks backstepping control for uncertain second-order systems with input and state time delays. A new Lyapunov-Krasovskii function is used to compensate time delay and transform the time-delay system into a delay-free system. The neural network is used to approximate the unknown function and deal with the uncertainty in the system. The stability of the closed-loop system is proved based on Lyapunov theory, and the tracking error can converge to a small neighborhood of zero. In addition, two simulation examples are used to verify the performance of the controller. Keywords Adaptive neural control · Backstepping control · Time delays
This work was supported by the National Natural Science Foundation of China (61903025) and Fundamental Research Funds for the Central Universities (No. FRF-IDRY-GD22-002).
R. Jiang · L. Sun (corresponding author): School of Intelligence Science and Technology, University of Science and Technology Beijing, Beijing 100083, China
R. Jiang · L. Tian: 101 Research Institute, Ministry of Civil Affairs, Beijing 100070, China
P. Li: Beijing Key Laboratory of Urban Underground Space Engineering, University of Science and Technology Beijing, Beijing 100083, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_10
1 Introduction In recent years, nonlinear control systems have attracted much attention in practical engineering. Among nonlinear control methods, the backstepping method is effective for high-order nonlinear problems. In particular, the combination of adaptive control and backstepping technology provides an effective framework for adjusting uncertain parameters and has been widely used in nonlinear systems [1]. At the same time, practical engineering systems are subject to time delays, disturbances, and many uncertainties. These factors may degrade the performance of the control system and can even make it unstable. In the past decades, many scholars have studied the control of uncertain nonlinear systems. In [2], model predictive control of nonlinear systems based on machine learning was studied, and the performance and robustness of the method were verified on a chemical process example. In [3], finite-time command filtered tracking control was studied for time-varying full-state-constrained nonlinear systems with unknown input delay, and the backstepping method was combined with finite-time command filtering to ensure the stability of the system. To solve the time delay problem, the barrier Lyapunov function and the Padé approximation method were adopted under a unified framework in [1]. A finite-time adaptive neural network tracking controller was designed in [4] to handle system uncertainty and actuator saturation. The prescribed-time control problem for a class of nonlinear uncertain time-delay systems was studied in [5], where an adaptive controller was designed by using the fixed-time adjustment function and the Lyapunov-Krasovskii function method. In [6], a fast transient tracking differentiator was proposed under the framework of the tracking differentiator design method and was used to improve the trajectory tracking of robots with time delay, parameter uncertainties, and bounded perturbations. In this paper, we consider the tracking control problem for uncertain second-order systems with input delay and state delay. The main contributions of this paper are as follows.
(1) This paper studies adaptive control strategies under input and state delays. A novel L-K function is constructed to compensate for the time delay, and the time-delay system is transformed into a delay-free system by using a predictor term to offset the time delay.
(2) In the controller design, the neural network is used to approximate unknown continuous functions to deal with the uncertainty of the system. Estimating the norm of the unknown constant weight vector, rather than the vector itself, reduces the number of parameters to be learned.
(3) According to Lyapunov stability theory, it is proved that the signals and estimated parameters of the closed-loop system are bounded and that the tracking errors converge to a small neighborhood of zero. The performance and robustness of the controller are verified by two simulation cases.
2 Preliminary Knowledge and Problem Description Lemma 1 For all $(x, y) \in \mathbb{R}^2$, the following inequality holds:
$$xy \le \frac{\mu^{a}}{a}|x|^{a} + \frac{1}{b\mu^{b}}|y|^{b} \tag{1}$$
where $a > 1$, $b > 1$, $\mu > 0$, and $(a - 1)(b - 1) = 1$.
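Lemma 1 is Young's inequality with a scaling factor $\mu$. As a quick sanity check, the sketch below evaluates both sides at randomly drawn points, choosing $b = a/(a-1)$ so that $(a-1)(b-1) = 1$. The sampling ranges and the helper name `young_rhs` are arbitrary choices made for this illustration.

```python
import random


def young_rhs(x, y, a, mu):
    """Right-hand side of Lemma 1 with b chosen so that (a - 1)(b - 1) = 1."""
    b = a / (a - 1.0)
    return mu ** a / a * abs(x) ** a + abs(y) ** b / (b * mu ** b)


rng = random.Random(1)
violations = 0
for _ in range(10_000):
    x, y = rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0)
    a, mu = rng.uniform(1.1, 4.0), rng.uniform(0.1, 3.0)
    if x * y > young_rhs(x, y, a, mu) + 1e-9:
        violations += 1
```

With $a = b = 2$ and $\mu = 1$ the bound reduces to the familiar $xy \le x^2/2 + y^2/2$, which is the form used repeatedly in the controller design.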
Lemma 2 Given a continuous function $\Lambda(\kappa_1, \ldots, \kappa_n): \mathbb{R}^{m_1} \times \cdots \times \mathbb{R}^{m_n} \to \mathbb{R}$ with $\Lambda(0, \ldots, 0) = 0$, where $\kappa_i \in \mathbb{R}^{m_i}$ ($i = 1, 2, \ldots, n$, $m_i > 0$), there exist positive smooth functions $\varphi_i(\kappa_i): \mathbb{R}^{m_i} \to \mathbb{R}$ satisfying $\varphi_i(0) = 0$ such that $|\Lambda(\kappa_1, \ldots, \kappa_n)| \le \sum_{i=1}^{n} \varphi_i(\kappa_i)$.
Lemma 3 For $\eta_i > 0$ ($i = 1, 2$), define the set $\Omega_i = \{z_i \mid |z_i| \le 0.2554\,\eta_i\}$. Then, for $z_i \notin \Omega_i$, the inequality $1 - 16\tanh^2(z_i/\eta_i) < 0$ is satisfied.
In the process of controller design, we introduce RBF neural networks to deal with the unknown continuous function $f(Z): \mathbb{R}^q \to \mathbb{R}$,
$$f(Z) = W^{*T} S(Z) + \delta(Z),$$
where $W^* = [w_1^*, w_2^*, \ldots, w_q^*]^T \in \mathbb{R}^q$ is the ideal constant weight vector, $\delta(Z)$ is the approximation error, assumed to satisfy $|\delta(Z)| \le \delta^*$, $Z$ is the NN input with $q$ dimensions, and $S(Z) = [s_1(Z), s_2(Z), \ldots, s_q(Z)]^T$ is the basis function vector with
$$s_i(Z) = \exp\left[-\frac{(Z - \gamma_i)^T (Z - \gamma_i)}{\sigma^2}\right], \quad i = 1, \ldots, q,$$
where $\sigma$ is the width of the Gaussian function and $\gamma_i = [\gamma_{i1}, \ldots, \gamma_{iq}]^T$ is the centre of the receptive field. The ideal constant weight vector is described as
$$W^* = \arg\min_{\hat{W} \in \mathbb{R}^q} \left\{ \sup_{Z \in \Omega_Z} \left| f(Z) - \hat{W}^T S(Z) \right| \right\} \tag{2}$$
where $\hat{W}$ is the estimate of $W^*$. In this paper, the optimal constant weight vector $W^*$ is updated through the estimate of its squared norm $\|\hat{W}\|^2$. $\|W_i^*\|^2$ is a constant that can be represented as $\|W_i^*\|^2 = b_i \theta_i$ with $b_i$ being a positive parameter. The estimation error of $\theta_i$ is expressed as $\tilde{\theta}_i = \hat{\theta}_i - \theta_i$, where $\hat{\theta}_i$ is the estimate of $\theta_i$.
Consider the following second-order system model with input and state time delays
$$\begin{cases} \dot{x}_1 = f_1(x_1) + x_2 + h_1(x(t - \tau_1)) \\ \dot{x}_2 = f_2(x) + u(t - \tau_u) + h_2(x(t - \tau_2)) \\ y = x_1 \end{cases} \tag{3}$$
where $x = [x_1, x_2]^T$ with $x_1(t)$, $x_2(t)$ being the system states, $y$ is the system output, $f_i(\cdot): \mathbb{R}^i \to \mathbb{R}$ ($i = 1, 2$) are unknown functions with $f_i(0) = 0$, $h_j(\cdot): \mathbb{R}^2 \to \mathbb{R}$ ($j = 1, 2$) are unknown nonlinear time-delay functions satisfying $h_j(0) = 0$, and $\tau_i$ ($i = 1, 2$) and $\tau_u$ are the unknown state delays and the constant input delay, respectively.
Assumption 1 The desired trajectory signal $y_d(t)$ and its derivatives $y_d^{(n)}(t)$ are continuous and bounded.
The control objective of this paper is to design an adaptive neural network controller so that the output $y$ of the system tracks the given desired trajectory $y_d$ under Assumption 1.
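The constant in Lemma 3 can be checked numerically. The expression $1 - 16\tanh^2(s)$ changes sign at $s = \operatorname{artanh}(1/4) \approx 0.25541$, which corresponds to the rounded value 0.2554 in the lemma. The sketch below evaluates the expression on both sides of that threshold; the value of `eta` and the test points are arbitrary choices for illustration.

```python
import math


def tanh_gap(z, eta):
    """Left-hand side of Lemma 3: 1 - 16 tanh^2(z / eta)."""
    return 1.0 - 16.0 * math.tanh(z / eta) ** 2


eta = 2.0
threshold = math.atanh(0.25)           # root of 1 - 16 tanh^2(s) = 0
inside = tanh_gap(0.20 * eta, eta)     # |z| well inside the set: positive
outside = tanh_gap(0.30 * eta, eta)    # |z| well outside the set: negative
```

Since 0.2554 is a rounded value, the sign change occurs marginally above it, so points within about $10^{-5}\eta_i$ of the boundary are better treated as belonging to the set.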
3 Controller Design and Stability Analysis Since the control object is a second-order nonlinear system, the recursive design procedure has two steps. The following coordinate transformation is adopted
$$z_1 = x_1 - y_d, \quad z_2 = x_2 - \alpha_1 + z_u, \tag{4}$$
where $z_1$ and $z_2$ are the backstepping variables of Step 1 and Step 2 respectively, $y_d$ is the desired tracking trajectory of the system output, $\alpha_1$ is the virtual control input of Step 1, and $z_u = \int_{t-\tau_u}^{t} u(\xi)\,d\xi$ is the equivalent state caused by the input delay.
Step 1: From (4), the time derivative of $z_1$ is
$$\dot{z}_1 = x_2 + f_1(x_1) + h_1(x(t - \tau_1)) - \dot{y}_d.$$
Then, the Lyapunov function $V_1$ is constructed as $V_1 = \frac{1}{2} z_1^2 + \frac{1}{2\lambda_1}\tilde{\theta}_1^2 + V_{L1}$, where $\lambda_1$ is a positive parameter and the Lyapunov-Krasovskii function $V_{L1}$ is given as
$$V_{L1} = \sum_{k=1}^{2} \frac{e^{-\pi_1 (t - \tau_1)}}{2} \int_{t-\tau_1}^{t} e^{\pi_1 s} \varphi_{1k}^2(x_k(s))\,ds,$$
where $\pi_1$ is a positive parameter and $\varphi_{1k}$ is a positive function. Based on Lemma 1 and Lemma 2, the following inequality is obtained
$$z_1 h_1(x(t - \tau_1)) \le z_1^2 + \frac{1}{2}\sum_{k=1}^{2} \varphi_{1k}^2(x_k(t - \tau_1)). \tag{5}$$
Considering $z_2 = x_2 - \alpha_1$ and inequality (5), we can get
$$\begin{aligned} \dot{V}_1 \le{}& z_1\left(z_2 + \alpha_1 + f_1(x_1) + |z_1| - \dot{y}_d + \frac{16}{z_1}\tanh^2\left(\frac{z_1}{\eta_1}\right)\bar{L}_1\right) + \frac{1}{\lambda_1}\tilde{\theta}_1\dot{\hat{\theta}}_1 \\ & - \pi_1 V_{L1} - \bar{L}_1 + \left(1 - 16\tanh^2\left(\frac{z_1}{\eta_1}\right)\right)\bar{L}_1 + L_1 \end{aligned} \tag{6}$$
where $L_1 = \sum_{k=1}^{2} \frac{e^{\pi_1 \tau_1}}{2}\varphi_{1k}^2(x_k(t))$, and $\bar{L}_1 = \sum_{k=1}^{2} \frac{e^{\pi_1 \tau_1}}{2}\varphi_{1k}^2(x_1(t))$ is the substitution for $L_1$ in Step 1. To avoid calculation errors, we introduce the term $(16/z_1)\tanh^2(z_1/\eta_1)\bar{L}_1$, which can be approximated by RBF NNs. To deal with the unknown nonlinear function, $F_1(Z_1)$ is given as
$$F_1(Z_1) = f_1(x_1) + |z_1| - \dot{y}_d + \frac{16}{z_1}\tanh^2\left(\frac{z_1}{\eta_1}\right)\bar{L}_1 + \frac{z_1}{2},$$
and $W_1^{*T} S_1$ is employed to identify $F_1$ for any given $\delta_1^* > 0$:
$$F_1(Z_1) = W_1^{*T} S_1(Z_1) + \delta_1(Z_1), \quad |\delta_1(Z_1)| \le \delta_1^* \tag{7}$$
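where $Z_1 = [x_1, \hat{\theta}_1, y_d, \dot{y}_d]^T$ with $\delta_1(Z_1)$ being the approximation error. Based on Lemma 1, it has
$$z_1 F_1 \le \frac{b_1\theta_1}{2a_1^2} z_1^2 S_1^T S_1 + \frac{1}{2}a_1^2 + \frac{z_1^2}{2} + \frac{1}{2}\delta_1^{*2} \tag{8}$$
where $\|W_1^*\|^2 = b_1\theta_1$. The virtual control can be designed as
$$\alpha_1 = -k_1 z_1 - \frac{b_1\hat{\theta}_1}{2a_1^2} z_1 S_1^T(Z_1) S_1(Z_1) \tag{9}$$
and the adaptive law is designed as
$$\dot{\hat{\theta}}_1 = \frac{\lambda_1 b_1}{2a_1^2} z_1^2 S_1^T(Z_1) S_1(Z_1) - \sigma_1\hat{\theta}_1 \tag{10}$$
where $k_1$, $a_1$, $b_1$, $\sigma_1$ are design parameters. Then, substituting (7)–(10) into (6), we can get
$$\begin{aligned} \dot{V}_1 \le{}& -k_1 z_1^2 + z_1 z_2 + \frac{1}{2}a_1^2 + \frac{1}{2}\delta_1^{*2} - \pi_1 V_{L1} - \frac{\sigma_1\tilde{\theta}_1^2}{2\lambda_1} + \frac{\sigma_1\theta_1^2}{2\lambda_1} \\ & - \bar{L}_1 + \left(1 - 16\tanh^2\left(\frac{z_1}{\eta_1}\right)\right)\bar{L}_1 + L_1 \end{aligned} \tag{11}$$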
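The Step 1 design can be made concrete with a small sketch that evaluates the Gaussian basis vector $S_1(Z_1)$, the virtual control (9), and the right-hand side of the adaptive law (10). Several choices here are assumptions made for illustration only: the centres lie on a grid of three points per input dimension, the adaptive gain $\lambda_1$ is set to 0.7, and the function names are invented for the example. The remaining numerical values follow the simulation settings reported in Sect. 4.

```python
import itertools
import math


def rbf_basis(Z, centres, width):
    """Gaussian basis vector S(Z) with s_i = exp(-||Z - gamma_i||^2 / width^2)."""
    return [math.exp(-sum((z - c) ** 2 for z, c in zip(Z, centre)) / width ** 2)
            for centre in centres]


def step1_laws(z1, theta_hat, S, k1, a1, b1, lam1, sigma1):
    """Virtual control (9) and right-hand side of the adaptive law (10)."""
    sts = sum(s * s for s in S)                  # S^T S
    gain = b1 * sts / (2.0 * a1 ** 2)
    alpha1 = -k1 * z1 - gain * theta_hat * z1
    theta_hat_dot = lam1 * gain * z1 ** 2 - sigma1 * theta_hat
    return alpha1, theta_hat_dot


# Assumed grid: three centres per dimension over the intervals used for Z_1.
half_widths = (1.1, 0.7, 1.1, 1.7)
centres = list(itertools.product(*[(-h, 0.0, h) for h in half_widths]))

# Z_1 = [x_1, theta_hat_1, y_d, dy_d/dt], evaluated here at the origin.
S = rbf_basis((0.0, 0.0, 0.0, 0.0), centres, width=0.82)
alpha1, theta_hat_dot = step1_laws(z1=0.1, theta_hat=0.0, S=S,
                                   k1=23.0, a1=2.5, b1=1.2, lam1=0.7, sigma1=0.88)
```

Only the scalar $\hat{\theta}_1$ is adapted, so the cost of the update is independent of the number of basis functions apart from the evaluation of $S_1^T S_1$.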
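Step 2: Consider $z_2 = x_2 - \alpha_1 + z_u$, whose derivative is computed as
$$\dot{z}_2 = u + f_2(x) + h_2(x(t - \tau_2)) - \dot{\alpha}_1. \tag{12}$$
Define a Lyapunov function as $V_2 = \frac{1}{2} z_2^2 + \frac{1}{2\lambda_2}\tilde{\theta}_2^2 + V_{L2}$, where $\lambda_2$ is a positive parameter and $V_{L2}$ is designed as
$$V_{L2} = \sum_{j=1}^{2}\sum_{k=1}^{2} \frac{e^{-\pi_2 (t - \tau_j)}}{2} \int_{t-\tau_j}^{t} e^{\pi_2 s} \varphi_{jk}^2(x_k(s))\,ds + \frac{\tau_u e^{-\pi_{2u}(t - \tau_u)}}{2} \int_{t-\tau_u}^{t} e^{\pi_{2u} s} \left(\int_{s}^{t} u^2(\xi)\,d\xi\right) ds, \tag{13}$$
where $\pi_2$ and $\pi_{2u}$ are positive parameters.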
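Similar to (6) in Step 1, we can get
$$\begin{aligned} \dot{V}_2 \le{}& z_2\left(u + f_2(x) + \varsigma\,(x_2 + f_1(x)) + \frac{16}{z_2}\tanh^2\left(\frac{z_2}{\eta_2}\right)\bar{L}_2 - \varsigma^2 z_2 + \frac{b_1 z_1}{2a_1^2} S_1^T(Z_1) S_1(Z_1)\dot{\hat{\theta}}_1 - \varsigma\dot{y}_d + \frac{z_2}{2}\right) \\ & + \frac{1}{\lambda_2}\tilde{\theta}_2\dot{\hat{\theta}}_2 - \frac{\tau_u}{2}\int_{t-\tau_u}^{t} u^2(\xi)\,d\xi - \bar{L}_2 + \left(1 - 16\tanh^2\left(\frac{z_2}{\eta_2}\right)\right)\bar{L}_2 + L_2 \end{aligned} \tag{14}$$
where $\varsigma = k_1 + \frac{b_1\hat{\theta}_1}{2a_1^2} S_1^T(Z_1) S_1(Z_1)$. Define the unknown function
$$\begin{aligned} F_2(Z_2) ={}& f_2(x) + \varsigma\,(x_2 + f_1(x)) + \frac{b_1 z_1}{2a_1^2} S_1^T(Z_1) S_1(Z_1)\dot{\hat{\theta}}_1 - \varsigma\dot{y}_d \\ & - z_1 + \frac{z_2}{2} - \varsigma^2 z_2 + \frac{16}{z_2}\tanh^2\left(\frac{z_2}{\eta_2}\right)\bar{L}_2 \end{aligned} \tag{15}$$
where $Z_2 = [x_1, x_2, \hat{\theta}_1, \hat{\theta}_2, y_d, \dot{y}_d, \ddot{y}_d]^T \in \mathbb{R}^7$. The controller and the adaptive law are designed similarly to Step 1 as
$$u = -k_2 z_2 - \frac{b_2\hat{\theta}_2}{2a_2^2} z_2 S_2^T(Z_2) S_2(Z_2) \tag{16}$$
$$\dot{\hat{\theta}}_2 = \frac{\lambda_2 b_2}{2a_2^2} z_2^2 S_2^T(Z_2) S_2(Z_2) - \sigma_2\hat{\theta}_2 \tag{17}$$
where $k_2$, $a_2$, $b_2$, $\sigma_2$ are design parameters. Then, substituting (15)–(17) into (14), we can get
$$\begin{aligned} \dot{V}_2 \le{}& -k_2 z_2^2 - z_1 z_2 + \frac{1}{2}a_2^2 + \frac{1}{2}\delta_2^{*2} - \pi_2 V_{L2} - \frac{\sigma_2\tilde{\theta}_2^2}{2\lambda_2} + \frac{\sigma_2\theta_2^2}{2\lambda_2} - \frac{\tau_u}{2}\int_{t-\tau_u}^{t} u^2(\xi)\,d\xi \\ & - \bar{L}_2 + \left(1 - 16\tanh^2\left(\frac{z_2}{\eta_2}\right)\right)\bar{L}_2 + L_2. \end{aligned} \tag{18}$$
Theorem 1 Consider the system model (3). If Assumption 1 is satisfied, the control law is designed as (16), and the adaptive laws are (10) and (17), then all signals of the closed-loop system are bounded, and the output tracking errors converge to a small neighborhood of the origin.
Proof The total Lyapunov function candidate is $V = V_1 + V_2$. Based on the Cauchy-Schwarz inequality, it has the following result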
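$$\frac{1}{2} z_u^2 = \frac{1}{2}\left(\int_{t-\tau_u}^{t} u(\xi)\,d\xi\right)^2 \le \frac{\tau_u}{2}\int_{t-\tau_u}^{t} u^2(\xi)\,d\xi$$
and $\sum_{i=1}^{2} L_i = \sum_{i=1}^{2} \bar{L}_i$. Then, the derivative of the total Lyapunov function can be rewritten as
$$\dot{V} \le -\sum_{i=1}^{2} k_i z_i^2 - \sum_{i=1}^{2} \pi_i V_{Li} - \sum_{i=1}^{2} \frac{\sigma_i\tilde{\theta}_i^2}{2\lambda_i} + \sum_{i=1}^{2}\left(\frac{1}{2}a_i^2 + \frac{1}{2}\delta_i^{*2} + \frac{\sigma_i\theta_i^2}{2\lambda_i}\right) + \sum_{i=1}^{2}\left(1 - 16\tanh^2\left(\frac{z_i}{\eta_i}\right)\right)\bar{L}_i \tag{19}$$
The last term on the right-hand side of the above inequality can be split into two parts:
$$\sum_{i=1}^{2}\left(1 - 16\tanh^2\left(\frac{z_i}{\eta_i}\right)\right)\bar{L}_i = \sum_{z_i \in \Omega_i}\left(1 - 16\tanh^2\left(\frac{z_i}{\eta_i}\right)\right)\bar{L}_i + \sum_{z_i \notin \Omega_i}\left(1 - 16\tanh^2\left(\frac{z_i}{\eta_i}\right)\right)\bar{L}_i.$$
For $z_i \notin \Omega_i$, based on Lemma 3 and $\bar{L}_i \ge 0$, the second term is negative. When $z_i \in \Omega_i$, $|z_i| \le 0.2554\,\eta_i$ with $\eta_i$ being a positive parameter, so $z_i$ and the first term are bounded. Let $\alpha_0 = \min_{1 \le i \le 2}\{2k_i, \pi_i, \sigma_i\}$ and
$$\beta_0 = \sum_{i=1}^{2}\left(\frac{1}{2}a_i^2 + \frac{1}{2}\delta_i^{*2} + \frac{\sigma_i\theta_i^2}{2\lambda_i}\right) + \sum_{z_i \in \Omega_i}\left(1 - 16\tanh^2\left(\frac{z_i}{\eta_i}\right)\right)\bar{L}_i,$$
where $\beta_0$ is bounded. Then, since $V > 0$ and inequality (19) can be rewritten as $\dot{V} = \sum_{i=1}^{2}\dot{V}_i \le -\alpha_0 V + \beta_0$, it follows that $V$ is bounded. Meanwhile, the estimation errors $\tilde{\theta}_i$ of the adaptive parameters and the L-K functions $V_{Li}$ are uniformly bounded. This means that the tracking errors converge to a small neighborhood of the origin, namely
$$\sum_{i=1}^{2}|z_i|^2 \le \frac{2\beta_0}{\alpha_0} - \sum_{i=1}^{2}\frac{1}{\lambda_i}\tilde{\theta}_i^2 - 2\sum_{i=1}^{2} V_{Li}.$$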
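The step from the differential inequality to boundedness of $V$ is the standard comparison argument. A brief worked version is given below, under the assumption that $\beta_0$ is treated as a constant upper bound:

```latex
\begin{aligned}
\dot V \le -\alpha_0 V + \beta_0
  &\;\Longrightarrow\; \frac{d}{dt}\left(e^{\alpha_0 t} V(t)\right) \le \beta_0 e^{\alpha_0 t} \\
  &\;\Longrightarrow\; V(t) \le \left(V(0) - \frac{\beta_0}{\alpha_0}\right) e^{-\alpha_0 t} + \frac{\beta_0}{\alpha_0}, \\
\tfrac{1}{2} z_i^2 \le V(t)
  &\;\Longrightarrow\; \limsup_{t \to \infty} |z_i(t)| \le \sqrt{2\beta_0/\alpha_0}.
\end{aligned}
```

The size of the residual set therefore shrinks as $\alpha_0$ is increased, for example through larger gains $k_i$, or as $\beta_0$ is reduced.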
4 Simulation Results In this section, two examples are simulated to evaluate the effectiveness of the proposed controller. In the simulation examples, the temperature and pressure models respectively. The temperature example is given as x˙1 = x2 + are verified x3 1 )x 2 (t−τ1 ) , x˙2 = u (t − τu ) + x2 cos (x1 x2 ) + x12 (t − τ2 ) log 3x1 + 1+x1 2 + 1+xx12(t−τ 2 1 1 (t−τ1 )+x 2 (t−τ1 ) sin (x2 (t − τ2 )), y = x1 . The desired reference signal yd of system output is given as 5t 3 + 2t 2 (0 ≤ t < 5), 800(5 ≤ t ≤ 25), −16t 2 + 720t − 7200(25 < t ≤ 30). The initial value of the system states are x1 (0) = 0.08, x2 (0) = 0.1, τ1 = 0.1, τ2 = p2 0.05, τu = 0.2. The pressure example is given as p˙ 1 = p2 + log p1 + 1+1p3 + 2 p1 (t−τ1 ) , p1 (t−τ1 )+ p2 (t−τ1 )
1
p˙ 2 = u (t − τu ) + p1 sin ( p2 ) + 2 p1 (t − τ2 ) p2 (t − τ2 ), q = p1 . The desired reference signal qd of system output is given as −0.05t 3 − 0.02t 2 (0 ≤ t < 5), −8(5 ≤ t ≤ 25), 0.16t 2 − 7.2t + 72(25 < t ≤ 30). The initial value of the system states are p1 (0) = 0.05, p2 (0) = 0.12, τ1 = 0.03, τ2 = 0.05, τu = 0.1.
Fig. 1 The trajectory tracking of temperature and pressure
Fig. 2 Control inputs of the temperature system and pressure system
The controller parameters are set as $k_1 = 23$, $k_2 = 31$, $a_1 = a_2 = 2.5$, $b_1 = b_2 = 1.2$, $\lambda_1 = \lambda_2 = 0.7$, $\hat{\theta}_1(0) = \hat{\theta}_2(0) = 0$, $\sigma_1 = 0.88$, $\sigma_2 = 0.15$. $\hat{W}_1^T S(Z_1)$ contains $3^4$ nodes with centres spaced evenly in the interval [−1.1, 1.1] × [−0.7, 0.7] × [−1.1, 1.1] × [−1.7, 1.7] and widths of 0.82; $\hat{W}_2^T S(Z_2)$ contains $3^7$ nodes with centres spaced evenly in the interval [−1.1, 1.1] × [−1.8, 1.8] × [−0.6, 0.6] × [−1.7, 1.7] × [−1.2, 1.2] × [−1.4, 1.4] × [−1.7, 1.7] and widths of 1.31. The simulation results are given in Figs. 1, 2 and 3. As can be seen in Fig. 1, the system outputs track the reference signals of temperature and pressure, respectively. It is worth noting that the 5th and 25th seconds are the junctions between the dynamic segments and the stationary segment. In the first five seconds and the last five seconds, the dynamic tracking is accurate and the tracking error is almost zero. The tracking error in the stationary segment converges to a small neighborhood of zero, in accordance with the theoretical analysis. The results in Fig. 1 demonstrate good tracking performance. The control inputs of the temperature system and the pressure system are shown in Fig. 2. Overshoots of different degrees occur at the junctions between the dynamic segments and the stationary segment, which are caused by the excessive oscillation of the controller when controlling the large system. The controller chattering
Fig. 3 Adaptive parameters of the temperature system and pressure system
in the dynamic segments of the tracking process arises from the need to maintain the accuracy of dynamic tracking. Because the stationary segment signal must be tracked quickly, the controller chattering does not settle immediately at the fifth second, and the controller needs to continue adjusting the input signal. The estimated values of the adaptive parameters are shown in Fig. 3. The estimated unknown parameters are all bounded, which accords with the conclusion of the stability analysis. Nevertheless, the estimated parameters do not converge to their true values, mainly because the persistent excitation condition is not satisfied.
5 Conclusions In this study, an adaptive neural network controller is proposed for uncertain second-order systems with input and state time delays. A predictor term is constructed to solve the time delay problem. Meanwhile, the RBF neural network is used to approximate the unknown function and deal with the uncertainty in the system. In addition, the stability of the closed-loop system is proved based on Lyapunov theory, and the performance of the controller is demonstrated by simulation examples.
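In a sampled-data simulation, the predictor term $z_u = \int_{t-\tau_u}^{t} u(\xi)\,d\xi$ from the coordinate transformation (4) could be approximated by a sliding sum over the most recent control samples. The following is a minimal sketch under stated assumptions: a fixed sample time `dt`, a rectangle-rule approximation of the integral, zero input before $t = 0$, and an invented class name `DelayIntegral`. It is not taken from the paper.

```python
from collections import deque


class DelayIntegral:
    """Rectangle-rule approximation of the integral of u over [t - tau, t]."""

    def __init__(self, tau, dt):
        self.dt = dt
        # One slot per sample inside the delay window; u = 0 is assumed for t < 0.
        self.buf = deque([0.0] * max(1, round(tau / dt)))
        self.total = 0.0

    def update(self, u):
        # Slide the window: drop the oldest sample and add the newest one.
        self.total += u - self.buf.popleft()
        self.buf.append(u)
        return self.total * self.dt


z_u = DelayIntegral(tau=0.2, dt=0.01)
history = [z_u.update(1.0) for _ in range(30)]   # constant input u = 1
```

For a constant unit input the value ramps up over the first 0.2 s and then stays at $\tau_u$. In long runs the running sum may accumulate rounding error, so recomputing it from the buffer periodically is a reasonable safeguard.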
References
1. Min, H., Shi, S., Fei, S., Yu, X.: Adaptive control for output-constrained nonlinear systems with input delay. In: Chinese Control and Decision Conference (CCDC), vol. 2020, pp. 1463–1468 (2020)
2. Aisha, A., Atharva, S., Mohammed, S., Fahim, A., Panagiotis, D.: Machine learning-based predictive control of nonlinear time-delay systems: closed-loop stability and input delay compensation. Dig. Chem. Eng. 7, 100084 (2023)
3. Lu, Y., Liu, W., Ma, B.: Finite-time command filtered tracking control for time-varying full state-constrained nonlinear systems with unknown input delay. IEEE Trans. Circ. Syst. 69(12), 4954–4958 (2022)
4. Wen, S., Lin, Q., Ning, H., Xun, D.: Finite-time adaptive neural control for uncertain nonlinear time-delay systems with actuator delay and full-state constraints. Int. J. Syst. Sci. 50(4), 726–738 (2019)
5. Ju, P., Chun, C., Kuo, L., Hao, L.: A novel theorem for prescribed-time control of nonlinear uncertain time-delay systems. Automatica 152, 111009 (2023)
6. Huan, W., Yu, S.: Differentiator-based time delay control for uncertain robot manipulators. Asian J. Control 25, 485–496 (2023)
Satellite-Terrestrial Integrated Network Slicing Resource Management Based on Reinforcement Learning Guoliang Hua, Guangrong Lin, Yuman Zhang, and Yafei Zhao
Abstract The satellite-terrestrial integrated network is an effective approach to address the problem of limited coverage despite abundant ground network resources. Network slicing technology provides an efficient resource-sharing method for building sub-networks with different requirements on the physical network infrastructure. The resource allocation manager plays a crucial role in the architecture, and intelligent algorithms are commonly employed to solve resource allocation problems. In this paper, focusing on the electric power communication application scenario with multiple network slices for a single gateway, we model the spectrum resource utilization and slice satisfaction, and investigate the reinforcement learning intelligent strategy. We formulate reinforcement learning as a Markov process and attempt to solve the bandwidth resource allocation rationalization problem for three types of slices using the A2C algorithm. Based on simulation analysis, the A2C algorithm demonstrates applicability in this scenario, showing a clear upward trend in environment reward value after 2000 training steps and stabilizing after 4000 steps, with the satisfaction of individual high-rate slices approaching. Keywords Satellite-terrestrial integrated network · Network slicing · A2C · Reinforcement learning
G. Hua (corresponding author): Yinhe Hangtian (Xi’an) Technology Co., Ltd., Xi’an 700100, China
G. Lin: Yinhe Hangtian (Beijing) Internet Technology Co., Ltd., Beijing 100141, China
Y. Zhang · Y. Zhao: Beijing University of Posts and Telecommunications, Beijing 100876, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_11
1 Introduction Satellite-terrestrial integrated network slicing technology provides an efficient means of resource sharing for constructing subnetworks with different requirements on top of a physical network. In network slicing resource allocation, bandwidth, storage, and computing resources within the network slices must be allocated and managed according to diverse application scenarios and Quality of Service (QoS) requirements. To effectively leverage the advantages of the satellite-terrestrial cooperative network [1], which integrates the extended space network, the Internet, and mobile wireless networks to provide comprehensive services and global network access anytime and anywhere [2], it is important to consider the specific characteristics of satellite systems as well as the challenges of joint resource allocation between ground and satellite networks. Given the limited resources in satellite networks, resources must be allocated continuously and reasonably, which requires integrating satellite and ground network resources through a well-designed mechanism. In recent years, there has been extensive research on satellite-terrestrial integrated network slicing technology. In [3], satellite and ground networks are considered jointly, and prediction is incorporated into machine learning to accurately estimate satellite communications; resources for both satellite and ground networks are allocated dynamically and intelligently, allowing the system to respond to environmental changes. In [4], the resource allocation problem for railway communications relying on satellite and 5G connections is addressed, and several exemplary candidate solutions are presented. In [5], a 5G low Earth orbit satellite integrated network architecture based on Software-Defined Networking (SDN) and Network Function Virtualization (NFV) is proposed; a mixed integer linear programming approach is employed to address the reliable mapping problem of 5G low Earth orbit satellite network slicing, and resource orchestration for slice requests is studied.
While satellite networks serve as transparent relays within the entire communication network, the resources onboard satellites are relatively limited compared to those on the ground. Hence, the rational design of scheduling algorithms for onboard and ground resources and their intelligent integration remain urgent challenges [6]. To achieve more intelligent and optimized slice scheduling algorithms, it is necessary to incorporate technologies such as machine learning, deep learning, and reinforcement learning for network traffic prediction and optimization. Among these techniques, reinforcement learning has shown excellent performance in resource allocation [7]. An in-depth analysis of the application of reinforcement learning to resource optimization, with thorough explanations of the fundamental principles underlying reinforcement learning modeling, is provided in [8]. Effective machine-learning models are commonly used to train wireless resource allocation schemes. Federated learning is considered an inevitable game-changer in future wireless network prediction, intelligent resource management, and intelligent policy control. These functionalities can be achieved using traditional machine-learning algorithms [9]. However, privacy preservation remains a concern in traditional machine learning, and federated learning can address this issue [10]. Federated learning enables learning without moving data from devices to a central server, thereby preserving user privacy [11]. This paper proposes a reinforcement learning-based intelligent network slicing resource allocation strategy for satellite-terrestrial integrated network scenarios. The objective is to optimize user satisfaction and spectrum resource utilization, thereby maximizing the overall system utility function. Among various reinforcement learning algorithms, the Advantage Actor-Critic (A2C) algorithm is selected and implemented using the TensorFlow framework. The designed algorithm is then evaluated through simulation tests. The paper is organized as follows. Section 2 describes the research scenario and modeling approach. Section 3 introduces the reinforcement learning-based network slicing resource allocation algorithm. Section 4 presents the simulation results and analysis. Section 5 provides the main conclusions of this paper.
2 Scenario Description
For the resource allocation problem of satellite-terrestrial collaborative network slicing in multi-service scenarios, reinforcement learning is used to optimize the allocation of bandwidth resources at the gateway stations. The scenario is illustrated in Fig. 1, where the gateway stations connect the local network data center with the low Earth orbit satellite network, enabling information exchange between the wireless access network and the core network and providing complete communication links for various application scenarios. This paper primarily considers three types of user service requirements: high connectivity, low latency, and high data rate, as shown in Fig. 1. The network infrastructure includes satellite gateways, low Earth orbit satellite networks, drones with dedicated networks, as well as optical fiber and data centers. The focus is on the frequency domain resource allocation of low Earth orbit satellite gateways, which play a crucial role in forwarding signals from ground users to satellites. Ground users' signals may be obstructed by terrain, buildings, and other obstacles, but after being relayed to the low Earth orbit satellite gateways, the signals can be transmitted directly to satellites, improving communication stability and efficiency.
Fig. 1 Satellite-terrestrial integration scenarios
G. Hua et al.
In each scheduling cycle, resources on the three network slices are allocated based on the current network status (number of user data packet requests, spectrum resource utilization, slice satisfaction). If the initial allocation divides the network resources evenly among the three slices, the calculated system utility values after one cycle may be 100%, 50%, and 30% respectively. However, if the user data packet requests on Slice 1 suddenly increase in the next cycle and the resource allocation strategy is not adjusted accordingly, the resources allocated to Slice 1 may be insufficient to meet user demand, while the resources allocated to Slices 2 and 3 may be underutilized, leading to resource waste. Therefore, this paper aims to improve the overall system utility by incorporating differentiated satisfaction models for the three network slices based on their distinct user requirements.
2.1 Modeling of Spectrum Resource Utilization
Within a scheduling period, let the available spectrum resources of the satellite gateway be denoted as $W$. The set of power application scenario slices is denoted as $S_{EP}$, and each network slice as $s_n \in S_{EP}$ $(n = 1, 2, 3)$. The number of data packets requested by users on each slice is represented by $DP_n$ $(n = 1, 2, 3)$. The satellite gateway allocates bandwidth $w_n$ $(n = 1, 2, 3)$ to each slice based on the magnitude of $DP_n$. According to Shannon's theorem, if the allocated bandwidth of each slice is fully utilized, the data transmission rate on that slice is given by Eq. 1, subject to the total bandwidth constraint of Eq. 2.
$$ v_n = w_n \log_2 (1 + SNR_n) \tag{1} $$
$$ \sum_{n=1,2,3} w_n = W \tag{2} $$
$SNR_n$ represents the signal-to-noise ratio between the users and the satellite gateway on the slice, as given by Eq. 3. $g_n$ is the average channel gain between the satellite gateway and the users, accounting for loss and fading; $P_n$ is the transmit power of the satellite gateway, and $N_0$ is the one-sided noise power spectral density.
$$ SNR_n = \frac{g_n P_n}{w_n N_0} \tag{3} $$
After performing the aforementioned calculations, the overall spectrum resource utilization can be obtained using the following equation:
$$ RU = \frac{\sum_{s_n \in S_{EP}} v_n}{W} \tag{4} $$
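The rate and utilization model of Eqs. 1 to 4 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names and the numeric inputs are hypothetical, and all quantities are assumed to be in consistent linear (non-dB) units.

```python
import math


def slice_rates(bandwidths, gains, tx_powers, noise_psd):
    """Per-slice rate v_n = w_n * log2(1 + SNR_n), with SNR_n = g_n * P_n / (w_n * N_0)."""
    rates = []
    for w, g, p in zip(bandwidths, gains, tx_powers):
        if w <= 0:
            rates.append(0.0)  # assumption: a slice without bandwidth carries no traffic
            continue
        snr = g * p / (w * noise_psd)
        rates.append(w * math.log2(1.0 + snr))
    return rates


def spectrum_utilization(rates, total_bandwidth):
    """RU as in Eq. 4: aggregate rate divided by the gateway bandwidth W."""
    return sum(rates) / total_bandwidth


# Illustrative numbers only (not the paper's settings); bandwidths sum to W = 4.
rates = slice_rates([2.0, 1.0, 1.0], [1.0, 1.0, 1.0], [6.0, 3.0, 1.0], 1.0)
ru = spectrum_utilization(rates, 4.0)
```

Because the SNR in Eq. 3 decreases as $w_n$ grows, the rate is concave in the allocated bandwidth, which is why giving a slice more bandwidth than it needs yields diminishing returns.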
2.2 User Satisfaction Model
According to the QoS requirements of each slice, we set the maximum end-to-end delay $l_n^{\max}$ and the minimum data transfer rate $r_n^{\min}$ for an individual data packet $d$ on a specific slice $s_n$. Assume that, in a given period, slice $s_n$ incurs a total delay $t_n$ for delivering a single data packet to the user, which includes both transmission delay and queuing delay. The data packet is considered successfully transmitted if the following conditions are simultaneously satisfied:
$$ t_n \le l_n^{\max} \tag{5} $$
$$ v_n \ge r_n^{\min} \tag{6} $$
If a single data packet transmission is successful, we denote it as $d = 1$, and if it is unsuccessful, we denote it as $d = 0$. The set of data packets transmitted by the satellite gateway on slice $s_n$ is denoted as $D_{s_n}$, with $d \in D_{s_n}$. Based on the aforementioned conditions, the number of successfully transmitted data packets in slice $s_n$ is $P_{s_n}$:
$$ P_{s_n} = \sum_{d \in D_{s_n}} d \tag{7} $$
The user satisfaction $Sat_{s_n}$ of slice $s_n$ is defined as the ratio of successfully transmitted data packets to the total number of data packets transmitted to users:
$$ Sat_{s_n} = \frac{P_{s_n}}{|D_{s_n}|} \tag{8} $$
where $|D_{s_n}|$ is the number of packets in $D_{s_n}$. The overall user satisfaction of the system is defined as the weighted sum of the satisfaction levels of the three network slices, where the weights $\alpha_n$ for each slice can be set according to the requirements:
$$ Sat = \sum_{s_n \in S_{EP}} \alpha_n Sat_{s_n} \quad (n = 1, 2, 3) \tag{9} $$
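As a small illustration of Eqs. 5 to 9, the sketch below counts the packets that meet both QoS conditions and forms the weighted satisfaction. The function names are hypothetical, per-packet delays are assumed to be available as a list, and an empty slice is assumed to have zero satisfaction; none of these choices is stated in the paper.

```python
def slice_satisfaction(packet_delays, rate, max_delay, min_rate):
    """Fraction of packets on a slice that satisfy both Eq. 5 (delay) and Eq. 6 (rate)."""
    if not packet_delays:
        return 0.0  # assumption: no packets means no satisfied packets
    successes = sum(1 for t in packet_delays if t <= max_delay and rate >= min_rate)
    return successes / len(packet_delays)


def overall_satisfaction(per_slice_satisfaction, weights):
    """Weighted sum of per-slice satisfaction values, as in Eq. 9."""
    return sum(a * s for a, s in zip(weights, per_slice_satisfaction))


# Illustrative values: three of the four packets meet the delay bound of 1.
sat_low_latency = slice_satisfaction([0.5, 2.0, 0.9, 1.0], rate=20.0, max_delay=1.0, min_rate=10.0)
sat_total = overall_satisfaction([sat_low_latency, 1.0, 0.5], [1.0, 1.0, 1.0])
```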
2.3 System Utility Model
By combining the aforementioned spectrum resource utilization $RU$ and the user satisfaction model, the overall utility function $E$ of the system is defined in this paper as the weighted sum of the two models.
$$ E = \beta RU + Sat \tag{10} $$
where $\beta$ is the weighting coefficient for spectrum resource utilization. Since user demands are variable, the environmental state, such as the number of user data packet requests, changes in each resource allocation cycle. This paper expects the gateway station to allocate bandwidth resources to each slice appropriately based on the state changes of the network environment within each cycle. The optimization goal is to maximize the overall utility function by improving the spectrum resource utilization while satisfying user satisfaction requirements.
$$ \max E = \beta RU + Sat \tag{11} $$
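To make the optimization problem concrete, the following sketch enumerates every split of a discretized bandwidth budget across three slices and picks the one with the highest utility. It is a brute-force baseline for intuition only, not the reinforcement learning method of this paper; `enumerate_allocations`, `best_allocation` and the toy utility used in the usage lines are hypothetical.

```python
from itertools import product


def system_utility(ru, sat, beta=0.01):
    """Eq. 10: E = beta * RU + Sat."""
    return beta * ru + sat


def enumerate_allocations(total_units, n_slices=3):
    """Yield every split of `total_units` bandwidth units across the slices (sum equals W)."""
    for head in product(range(total_units + 1), repeat=n_slices - 1):
        rest = total_units - sum(head)
        if rest >= 0:
            yield head + (rest,)


def best_allocation(total_units, utility):
    """Exhaustive search for the allocation that maximizes a user-supplied utility."""
    return max(enumerate_allocations(total_units), key=utility)


# Toy utility peaked at (2, 1, 1); a real one would evaluate Eq. 10 on the environment.
toy = lambda a: -((a[0] - 2) ** 2 + (a[1] - 1) ** 2 + (a[2] - 1) ** 2)
best = best_allocation(4, toy)
```

The number of candidate allocations grows quickly with the number of bandwidth units, and the true utility depends on traffic that changes every cycle, which is one motivation for learning an allocation policy instead of searching.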
3 Network Slicing Resources Allocation Algorithm
The intelligent agent in reinforcement learning interacts with the environment and receives feedback from changes in the environment. It explores the "optimal solution" for the next action based on this feedback, given as rewards or penalties. Here, the "optimal solution" refers to a relatively better solution, as the agent's actions during learning may not reach the standard definition of optimality until a certain number of learning iterations. Only after a sufficient number of learning iterations may we observe better behavior from the agent. Therefore, this paper adopts a suitable reinforcement learning algorithm to solve the intelligent network slicing resource allocation problem in the context of power applications. The goal is to optimize the user satisfaction and spectrum resource utilization mentioned earlier and to maximize the overall system utility function. The model constructed using the A2C algorithm includes an Actor network and a Critic network. The following is an introduction to these two components and their related functions (Fig. 2). First, the state-value function $V_\pi(s)$ is defined, which represents the weighted sum of possible action values and can also be referred to as the baseline value.
$$ V_\pi(s) = \sum_{a_n} \pi(a_n | s_n) \cdot Q_\pi(s_n, a_n) \tag{12} $$
The distinct feature of the A2C network is its use of an advantage function to compute gradients, which represents the advantage gained by taking an action in the current state.
$$ A(s, a) = Q(s, a) - V(s) = \mathbb{E}\left[r + \gamma V(s_{n+1})\right] - V(s) \approx r + \gamma V(s_{n+1}) - V(s) = \delta \tag{13} $$
In practice, we often tolerate a certain level of variance and use the temporal difference (TD) error as a replacement for the advantage function in calculations. The reason is that the TD error can be considered an unbiased estimate of the advantage function.
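The TD error used in place of the advantage can be computed directly from one observed transition. The sketch below is illustrative; the `done` flag for terminal transitions is an added assumption that the paper does not discuss, and the default discount factor simply mirrors the value 0.9 reported in the parameter settings.

```python
def td_error(reward, value_s, value_next, gamma=0.9, done=False):
    """delta = r + gamma * V(s') - V(s); the bootstrap term is dropped at a terminal step."""
    bootstrap = 0.0 if done else gamma * value_next
    return reward + bootstrap - value_s


# Worked example: r = 1, V(s) = 2, V(s') = 3, gamma = 0.9 gives 1 + 2.7 - 2.
delta = td_error(1.0, 2.0, 3.0, gamma=0.9)
```

A positive value indicates that the action turned out better than the critic expected, so its probability is increased by the actor update; a negative value decreases it.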
Fig. 2 A2C algorithm
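With the actor parameterized by $\theta$ and the critic by $w$, the state-value function is approximated as
$$ V_\pi(s; \theta, w) = \sum_{a_n} \pi(a_n | s_n; \theta) \cdot q_\pi(s_n, a_n; w) \tag{14} $$
where $\sum_{a_n \in A} \pi(a_n | s_n; \theta) = 1$. The actor loss function is
$$ Loss_A = -\delta \log \pi(a|s) = -\left(r + \gamma V(s_{n+1}) - V(s)\right) \log \pi(a|s) \tag{15} $$
The update rule for the parameters of the actor network, with learning rate $\alpha$, is as follows:
$$ \theta = \theta + \alpha \frac{\partial \log \pi(a|s, \theta)}{\partial \theta} A(s, a) \tag{16} $$
thus, replacing the advantage by the TD error,
$$ \theta = \theta + \alpha \frac{\partial \log \pi(a|s, \theta)}{\partial \theta} \delta \tag{17} $$
The value function loss refers to the difference between the value output by the critic network and the actual long-term reward, estimated here by the one-step target $r_t + \gamma q_\pi(s_{n+1}, a_{n+1}; w)$. Typically, mean squared error loss is used to measure this difference.
$$ Loss_C = \frac{1}{2}\left[q_\pi(s_n, a_n; w) - \left(r_t + \gamma q_\pi(s_{n+1}, a_{n+1}; w)\right)\right]^2 \tag{18} $$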
In practice, gradient descent is used to minimize the total loss function and update the parameters of the actor and critic networks. During the subsequent simulation, the trends of the two loss functions are observed to analyze the effectiveness of the model design. The critic network updates its parameters by gradient descent:
$$ w = w - \beta \cdot \frac{\partial Loss_C}{\partial w} \tag{19} $$
where $\beta$ here denotes the critic learning rate, not the utility weight of Eq. 10.
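The actor and critic updates can be illustrated with a tabular stand-in for the two networks. This is a simplified sketch under stated assumptions, not the TensorFlow model used in the paper: the policy is a softmax over per-state action preferences `theta[s]`, the critic stores a state value `values[s]` rather than an action value, and the target in the critic loss is treated as a constant when differentiating.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def a2c_step(theta, values, s, a, r, s_next, lr_actor=0.001, lr_critic=0.01, gamma=0.9):
    """One tabular actor-critic update from the transition (s, a, r, s_next)."""
    delta = r + gamma * values[s_next] - values[s]  # TD error, Eq. 13
    probs = softmax(theta[s])
    for b in range(len(probs)):
        # For a softmax policy, d log pi(a|s) / d theta[s][b] = 1{a == b} - pi(b|s).
        grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
        theta[s][b] += lr_actor * grad_log_pi * delta  # ascent step of Eq. 17
    # Descent on 0.5 * delta**2 with respect to values[s] (target held fixed).
    values[s] += lr_critic * delta
    return delta


theta = [[0.0, 0.0], [0.0, 0.0]]
values = [0.0, 0.0]
delta = a2c_step(theta, values, s=0, a=1, r=1.0, s_next=1, lr_actor=0.1, lr_critic=0.5)
```

After this single step the preference of the rewarded action rises and that of the other action falls by the same amount, while the value estimate of the visited state moves halfway toward the target.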
4 Simulations
4.1 Parameter Settings
First, we describe the experimental parameters and initialization for the network slicing application scenario. Rayleigh fading is employed for channel transmission, and the transmit power of the satellite gateway is set to 46. The power spectral density of the noise is −174. The total communication bandwidth is set to 10. We consider three types of network slices: high-rate, low-latency, and massive connectivity. The high-rate and massive connectivity slices have similar user requirements, but the massive connectivity slice serves a larger number of users with a higher volume of data packet transmission. Based on the QoS requirements of the three slices, the maximum delay for the low-latency slice is set to 1 and the minimum data transmission rate is set to 10. For the high-rate slice, the maximum delay is set to 10 and the minimum data transmission rate is set to 100. For the massive connectivity slice, the maximum delay is set to 10 and the minimum data transmission rate is set to 100. Additionally, the bandwidth allocation granularity is set to 200.
Next, we construct the actor network and the critic network. The learning rate for the actor network is set to 0.001. It consists of two fully connected layers. The first layer takes the state as input to extract action-related features and uses the ReLU activation function. The second layer uses the Softmax activation function to map the output to probabilities of the different actions. The learning rate for the critic network is set to 0.01, and the discount factor is set to 0.9. Its first layer takes the state as input to extract value-related features using the ReLU activation function. These features are then sent to a second layer with only one neuron to obtain the state value. This layer has no activation function. The Adam optimizer (AdamOptimizer) is used to minimize the loss function.
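The two-layer structure of the actor and critic described above can be sketched as a plain forward pass. This is a dependency-free illustration, not the paper's TensorFlow code: the hidden width of 8, the 5 output actions, the 3-dimensional state and the uniform weight initialization are all assumptions chosen for the example.

```python
import math
import random


def dense(x, weights, biases):
    """Fully connected layer; weights[j] holds the input weights of output neuron j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def relu(v):
    return [max(0.0, z) for z in v]


def softmax(v):
    m = max(v)
    exps = [math.exp(z - m) for z in v]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (hypothetical initialization)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def actor_forward(state, params):
    """ReLU feature layer followed by a Softmax layer over the actions."""
    (w1, b1), (w2, b2) = params
    return softmax(dense(relu(dense(state, w1, b1)), w2, b2))


def critic_forward(state, params):
    """ReLU feature layer followed by one linear neuron that outputs the state value."""
    (w1, b1), (w2, b2) = params
    return dense(relu(dense(state, w1, b1)), w2, b2)[0]


rng = random.Random(0)
actor = [init_layer(3, 8, rng), init_layer(8, 5, rng)]
critic = [init_layer(3, 8, rng), init_layer(8, 1, rng)]
state = [1.0, 0.5, 0.2]
action_probs = actor_forward(state, actor)
state_value = critic_forward(state, critic)
```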
4.2 Simulation Results
The intelligent agent is deployed at the bandwidth allocation node and uses the A2C algorithm to gradually learn the bandwidth allocation scheme that maximizes the total utility value of the network slices. The loss function values of the actor and critic networks, as well as the environment reward values, are monitored to determine the suitability of this algorithm for bandwidth resource allocation in a single base station, multi-slice network application scenario.
Fig. 3 The trend of the overall system utility
The training objective of this study is to maximize the total utility value of the system, which serves as the reward function value. In the parameter settings, the weight of spectrum resource utilization is set to 0.01, and the weight of each slice's satisfaction is set to 1. The trend of the reward function is shown in Fig. 3. The gray part represents the real instantaneous reward value. To observe the trend of the reward value, each point is averaged over its adjacent 500 steps to obtain a smoothed curve. It can be observed that, as the number of iterations increases, the total utility value of the system starts to rise around 2000 steps and stabilizes as it approaches 4000 steps. The average reward over 500 steps can be maintained above 5.7, indicating that the algorithm is applicable to bandwidth resource allocation in a single base station, multi-slice network application scenario. The environment reward value can be maintained above 5.0 on average, with a maximum real reward value reaching 15.82. However, noticeable spikes can be observed in the graph, particularly around 4247 steps and 7946 steps, and the reward value around 4247 steps shows a significant drop. To analyze the cause of these spikes, we output the system's spectrum utilization and the satisfaction of each slice to observe which part of the resource allocation simulation is causing interference. First, consider the trend of spectrum utilization, shown in Fig. 4. The gray part represents the real instantaneous utilization rate, and the smoothed curve is again obtained by averaging each point over its adjacent 500 steps. Similarly, the spectrum utilization shows a significant turning point around 2000 steps, and the areas marked with red dots exhibit a noticeable upward trend, reaching a stable state around 4000 steps.
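The smoothing applied to the reward and utilization curves can be sketched as a centered moving average. The exact windowing used for the figures is not specified, so the symmetric window and the truncation at the ends of the series are assumptions of this sketch.

```python
def moving_average(values, window=500):
    """Average each point with its neighbours inside a centered window.

    The window covers `window // 2` steps on each side and is truncated
    at the start and end of the series.
    """
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


# Small example with a window of 3 so the result can be checked by hand.
example = moving_average([1.0, 2.0, 3.0, 4.0, 5.0], window=3)
```

A wide window hides single-step spikes such as the one reported near 4247 steps, which is why the raw (gray) curve is plotted alongside the smoothed one.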
This indicates that the spectrum resource utilization can be optimized as the number of learning steps increases, and that the resource allocation made by the actor gradually adapts to the environment. However, unlike the reward value graph, there are no noticeable spikes in the spectrum utilization graph, and the overall fluctuation is small. The spectrum utilization does not exhibit steep changes as seen in the reward value graph. Considering that the weight of spectrum resource utilization is set to 0.01, the results also show that changes in spectrum utilization do not have a significant impact on the overall trend of total utility.
Fig. 4 The trend of spectrum utilization
Figure 5 shows the individual slice satisfaction for the high-rate slice and the low-latency slice. Since the massive connectivity slice is configured like the high-rate slice, differing only in the number of data packets, it is not displayed here. The left curve represents the satisfaction of the high-rate slice. According to Shannon's formula, the data transmission rate is directly related to the allocated bandwidth. Therefore, during the learning process, it is only necessary to allocate a sufficient amount of bandwidth to the slice, without excess, to achieve high satisfaction. It can be seen that the satisfaction value in the graph is very close to 1 after the overall stabilization.
Fig. 5 Slice-level user service satisfaction
The right curve represents the satisfaction of the low-latency slice. Packet latency is influenced by factors other than bandwidth allocation, so satisfaction on this slice does not depend on bandwidth allocation alone. Therefore, although the curve also shows a fluctuation-increase-stabilization trend, the stabilized slice satisfaction struggles to approach 1, reflecting the impact of complex environmental factors on slice satisfaction. These two graphs also reveal the source of the spikes in the system's total utility. Around 4000 steps, all three slices simultaneously experience a fluctuation in satisfaction followed by recovery. This may be due to occasional errors during the training process or issues with the environmental model. However, the stable trend after the fluctuations suggests that, in the new state after 10,000 steps, the model will make better resource allocation decisions, thereby maximizing the system's total utility.
References
1. Lin, Z., Lin, M., Wang, J.B., et al.: Joint beamforming and power allocation for satellite-terrestrial integrated networks with non-orthogonal multiple access. IEEE J. Sel. Top. Signal Process. 13(3), 657–670 (2019)
2. Yao, H., Wang, L., Wang, X., et al.: The space-terrestrial integrated network: an overview. IEEE Commun. Mag. 56(9), 178–185 (2018)
3. Rodrigues, T.K., Kato, N.: Network slicing with centralized and distributed reinforcement learning for combined satellite/ground networks in a 6G environment. IEEE Wirel. Commun. 29(1), 104–110 (2022)
4. Bisio, I., Lavagetto, F., Verardo, G., et al.: Network slicing optimization for integrated 5G-satellite networks. In: 2019 IEEE Global Communications Conference (GLOBECOM). IEEE 20(1), 1–6 (2019)
5. Baohua, X., Jun, L., Nan, X., et al.: Reliability-based 5G-LEO constellation network slice mapping algorithm. Appl. Res. Comput. 38(11), 3407–3410 (2021)
6. Skondras, E., Michalas, A., Vergados, D.J., et al.: Network slicing on 5G vehicular cloud computing systems. Electron. MDPI 10(12), 1474–1480 (2021)
7. Lei, L., Yuan, Y., Vu, T.X., et al.: Dynamic-adaptive AI solutions for network slicing management in satellite-integrated B5G systems. IEEE Netw. 35(6), 91–97 (2021)
8. Jinyu, W., Xinran, W., Wenlei, S., et al.: Applications of reinforcement learning in the field of resource optimization. Big Data Res. 7(5), 1–19 (2021)
9. Luu, Q.T., Kerboeuf, S., Kieffer, M.: Uncertainty-aware resource provisioning for network slicing. IEEE Trans. Netw. Serv. Manage. 18(1), 79–83 (2021)
10. Drif, Y., Chaput, E., Lavinal, E., et al.: An extensible network slicing framework for satellite integration into 5G. Int. J. Satellite Commun. Network. 39(4), 1–10 (2021)
11. Khan, L.U., Yaqoob, I., Tran, N.H., et al.: Network slicing: recent advances, taxonomy, requirements, and open research challenges. IEEE Access 8(3), 36009–36028 (2020)
Event-Triggered Adaptive Neural Network Trajectory Tracking Control of MSVs Under Deception Attacks
Chen Wu, Guibing Zhu, and Jinshu Lu
Abstract This paper investigates the event-triggered adaptive neural trajectory tracking control of marine surface vessels (MSVs) under internal/external uncertainties and deception attacks, and proposes a novel adaptive neural trajectory tracking control solution. Under the backstepping framework, the internal and external uncertainties and the deception attacks are treated as compound uncertainties, which are reconstructed using neural network (NN) techniques and the single-parameter learning idea. In addition, to reduce the mechanical wear of the actuator, an event-triggered adaptive neural tracking control is developed.
Keywords Event-triggered · Adaptive neural · Marine surface vessels · Trajectory tracking · Deception attacks · Compound uncertainties
C. Wu (B) · G. Zhu · J. Lu
School of Naval Architecture and Maritime, Zhejiang Ocean University, Zhoushan 316022, Zhejiang, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_12

1 Introduction
In the past several decades, ocean-related commercial, scientific, and military activities have become more and more extensive, and the demand for MSVs is growing [1]. For MSVs to navigate safely at sea and execute tasks smoothly along the desired trajectory, the control of MSVs is a critical issue. It is worth noting that, in unmanned mode, MSVs are operated in a network environment, where the open network links between sensors, controllers, and actuators make the exchanged data vulnerable to attacks and cause serious security problems. Therefore, network attacks have become the biggest threat to the navigation control of MSVs [2]; that is, handling network attacks is crucial for achieving the safe navigation of MSVs and the execution of specified tasks.
In the existing reports, there are already many ways to handle the uncertainties of MSVs. For external disturbances, the extended state observer [3], the passivity-based controller with integral action [4], the disturbance observer [5], etc., have been reported. When the vessel model parameters cannot satisfy the parameter decomposition, neural network (NN) [6] and adaptive fuzzy logic technology (FLT) [7] approaches have been proposed. However, in the network environment, MSVs have the typical characteristics of cyber-physical systems (CPSs). Cyber-attacks are of three types: denial-of-service (DoS) attacks, replay attacks, and deception attacks [8]. Relatively speaking, injection attacks are the most common attacks and the most damaging to the system [9]. Sargolzaei et al. [10] use an observer to estimate the attack signal, which requires the signal to be differentiable and bounded. Janaideh et al. [11] use an attack mitigation approach. Therefore, it is essential to design the controller for MSVs under injection attacks. Motivated by the aforementioned observations, in this paper the control law is designed under the backstepping design framework using RBF NNs and Young's inequality. To reduce the use of network transmission resources, an event-triggered mechanism is employed. We thereby propose a novel event-triggered adaptive neural network trajectory tracking control of MSVs under deception attacks.
2 Problem Formulation
The nonlinear mathematical model of MSVs can be described as [12]
$$ \dot{\eta} = J(\psi)\upsilon \tag{1} $$
$$ M\dot{\upsilon} + C(\upsilon)\upsilon + D(\upsilon)\upsilon = \tau + d \tag{2} $$
where $\eta = [x, y, \psi]^T$ denotes the position vector of the MSVs, $\upsilon = [u, v, r]^T$ is the velocity vector, and $\tau = [\tau_1, \tau_2, \tau_3]^T$ is the control input, in which $\tau_1$ is the surge force, $\tau_2$ is the sway force and $\tau_3$ is the yaw moment. $d = [d_1, d_2, d_3]^T$ is the time-varying disturbance vector. The rotation matrix $J(\psi)$ is
$$ J(\psi) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$
M is the positive-definite symmetric inertia matrix including added mass; f (υ) = C(υ)υ + D(υ)υ, where C(υ) is the Coriolis term and D(υ) is the damping matrix. In particular, the impact of the deception attack on the output signal is taken into account, which is formulated as follows τ = τ c
(3)
where denotes the injection and deception attacks. τ c is the actual control law.
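The kinematic relation (1) and the rotation matrix can be exercised numerically with a short sketch. The forward-Euler integration and the function names are illustrative assumptions; the paper does not state which integration scheme its simulation uses.

```python
import math


def rotation_matrix(psi):
    """J(psi): rotation about the vertical axis by the yaw angle psi."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def kinematics_step(eta, upsilon, dt):
    """One forward-Euler step of eta_dot = J(psi) * upsilon, with eta = [x, y, psi]."""
    J = rotation_matrix(eta[2])
    eta_dot = [sum(J[i][k] * upsilon[k] for k in range(3)) for i in range(3)]
    return [e + dt * de for e, de in zip(eta, eta_dot)]


# A vessel heading along +y (psi = pi/2) with pure surge speed moves in y only.
eta_next = kinematics_step([0.0, 0.0, math.pi / 2], [1.0, 0.0, 0.5], dt=0.1)
```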
When MSVs suffer from deception attacks, the attacked measurements can be described as follows
$$ \breve{\eta} = \eta + \chi_1 \eta = \omega_1 \eta \tag{4} $$
$$ \breve{\upsilon} = \upsilon + \chi_2 \upsilon = \omega_2 \upsilon \tag{5} $$
where $\omega_1 = \mathrm{diag}[\omega_{1,1}, \omega_{1,2}, \omega_{1,3}] = I_3 + \chi_1$, $\omega_2 = \mathrm{diag}[\omega_{2,1}, \omega_{2,2}, \omega_{2,3}] = I_3 + \chi_2$, and $\chi_1$ and $\chi_2$ are the attack amplitudes.
Assumption 1 The external disturbance $d$ is bounded, and there exists an unknown constant $\bar{d}$ satisfying $\|d\| \le \bar{d}$. $M$ and $f(\upsilon)$ are unknown.
Assumption 2 The attack signals $\chi_{1,i}$, $\chi_{2,i}$ and the actuator attack signals in (3) are bounded. $\eta_r$, $\dot{\eta}_r$ and $\ddot{\eta}_r$ are bounded.
Control objective: Design an adaptive control law $\tau$ for the MSVs described by (1)-(2), subject to internal and external uncertainties as well as injection and deception attacks, under Assumptions 1–2, such that the actual trajectory of the MSVs tracks the reference trajectory $\eta_r$ and all signals in the closed-loop system are bounded.
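The multiplicative sensor attack of (4) and (5) scales each measured channel by one plus the attack amplitude. The sketch below is illustrative only; the function name and the attack values are hypothetical, and the attack matrix is assumed to be diagonal as in the text.

```python
def attacked_measurement(signal, chi):
    """Return the corrupted reading: channel i is scaled by omega_i = 1 + chi_i."""
    return [(1.0 + c) * x for x, c in zip(signal, chi)]


# Hypothetical example: a 2% inflation of x, no attack on y, a 10% deflation of psi.
eta = [1.0, -2.0, 0.5]
eta_attacked = attacked_measurement(eta, [0.02, 0.0, -0.1])
```

The controller only has access to the corrupted values, which is why the tracking errors in the design are defined in terms of the attacked signals rather than the true states.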
3 Control Law Design
In this section, under the backstepping design framework, we use RBF NNs [13] and Young's inequality [14] to design the control law. We define the position tracking error $S_1 \in R^3$ and the velocity tracking error $S_2 \in R^3$ as
$$ S_1 = \breve{\eta} - \eta_r \tag{6} $$
$$ S_2 = \breve{\upsilon} - \alpha_f \tag{7} $$
where $\alpha_f \in R^3$ is a filtered version of the virtual control law $\alpha$, obtained from the filter
$$ \mu_f \dot{\alpha}_f + \alpha_f = \alpha; \quad \alpha_f(0) = \alpha(0) \tag{8} $$
where $\mu_f$ is the filter time constant. Defining the filtering error as $e_f = \alpha_f - \alpha$, one gets $\dot{\alpha}_f = -e_f/\mu_f$.
Step 1: Taking the derivative of $S_1$, we obtain
$$ \dot{S}_1 = \dot{\omega}_1 \omega_1^{-1}(S_1 + \eta_r) + \omega_1 J(\psi)\omega_2^{-1} S_2 + \omega_1 J(\psi)\omega_2^{-1}(\alpha_f - \alpha + \alpha) - \dot{\eta}_r \tag{9} $$
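The first-order filter (8) that produces the filtered virtual control can be discretized as below. The forward-Euler step is an assumption of this sketch; with that scheme the step size must stay well below the filter time constant (the update is unstable for dt larger than twice the time constant).

```python
def filter_step(alpha_f, alpha, mu_f, dt):
    """Forward-Euler step of mu_f * d(alpha_f)/dt + alpha_f = alpha, applied per channel."""
    return [af + dt * (a - af) / mu_f for af, a in zip(alpha_f, alpha)]


# With mu_f = 0.02 (Table 1) and an assumed dt = 0.002, each step closes 10% of the gap.
first = filter_step([0.0], [1.0], mu_f=0.02, dt=0.002)

state = [0.0]
for _ in range(200):
    state = filter_step(state, [1.0], mu_f=0.02, dt=0.002)
```

Filtering avoids differentiating the virtual control law analytically, at the cost of the filtering error that appears later in the Lyapunov function.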
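The virtual control law $\alpha$ is
$$ \alpha = -J^T(\breve{\psi})\left[k_1 + c_1 \hat{\vartheta}_1 \Phi^2(Z_1)\right] S_1 \tag{10} $$
with the adaptive law
$$ \dot{\hat{\vartheta}}_1 = c_1 \Phi^2(Z_1)\|S_1\|^2 - \sigma_1 \hat{\vartheta}_1, \quad \hat{\vartheta}_1(0) \ge 0 \tag{11} $$
where $k_1$ is a diagonal design matrix, $\sigma_1$ and $c_1$ are positive design parameters, $\vartheta_1 = \max\{1, \|\dot{\omega}_1 \omega_1^{-1}\|\}$ and $\Phi(Z_1) = \|\eta_r\| + \|S_1\| + \|\dot{\eta}_r\|$ with $Z_1 = [S_1^T, \eta_r^T, \dot{\eta}_r^T]^T$.
Step 2: The time derivative of $S_2$ is
$$ M\dot{S}_2 = M\dot{\breve{\upsilon}} - M\dot{\alpha}_f = M\dot{\omega}_2\omega_2^{-1}\breve{\upsilon} - M\omega_2 M^{-1} f(\omega_2^{-1}\breve{\upsilon}) - M\dot{\alpha}_f + M\omega_2 M^{-1}\tau_c + M\omega_2 M^{-1} d \tag{12} $$
Let
$$ G(Z_2) = M\dot{\omega}_2\omega_2^{-1}\breve{\upsilon} - M\omega_2 M^{-1} f(\omega_2^{-1}\breve{\upsilon}) - M\dot{\alpha}_f = W^T \xi(Z_2) + \varepsilon \tag{13} $$
where $Z_2 = [\breve{\upsilon}^T, \dot{\alpha}_f^T]^T \in R^6$, $W$ is the weight vector, $\xi(Z_2) = [\xi_1(Z_2), \xi_2(Z_2), \xi_3(Z_2)]^T$ is the basis function vector with $Z_2 \in \Omega_z$, and $\varepsilon$ is the reconstruction error of the NN. Then $S_2^T M \dot{S}_2$ can be obtained as
$$ S_2^T M \dot{S}_2 = S_2^T\left[W^T \xi(Z_2) + \varepsilon + \bar{\wp} d + \omega_1 J(\psi)\omega_2^{-1} S_1\right] + S_2^T \breve{\wp}\tau_c - S_2^T \omega_1 J(\psi)\omega_2^{-1} S_1 \tag{14} $$
where $\breve{\wp} = M\omega_2 M^{-1}$, $\bar{\wp} = M\omega_2 M^{-1}$, and there exist positive constants $\wp_2$ and $\wp_3$ satisfying $\gamma_{\min}(\breve{\wp}) \le \wp_2 \le \gamma_{\max}(\breve{\wp})$ and $\gamma_{\min}(\bar{\wp}) \le \wp_3 \le \gamma_{\max}(\bar{\wp})$. Let $L_2 = W^T \xi(Z_2) + \varepsilon + \bar{\wp} d + \omega_1 J(\psi)\omega_2^{-1} S_1$. We have
$$ S_2^T L_2 \le \vartheta_2 \|S_2\| \beta(\varsigma) \tag{15} $$
where $\vartheta_2 = \max\{\|W\|, \|\varepsilon + \bar{\wp} d\|, \|\omega_1 J(\psi)\omega_2^{-1}\|\}$ and $\beta(\varsigma) = \|\xi(Z_2)\| + \|S_1\| + 1$ is a kernel function. According to (14)–(15), the control law $\tau_c$ is designed as follows
$$ \tau_c = -k_2 S_2 - c_2 \hat{\vartheta}_2 \beta^2(\varsigma) S_2 \tag{16} $$
with the adaptive law
$$ \dot{\hat{\vartheta}}_2 = c_2 \beta^2(\varsigma)\|S_2\|^2 - \sigma_2 \hat{\vartheta}_2, \quad \hat{\vartheta}_2(0) \ge 0 \tag{17} $$
where $k_2$ is a diagonal design matrix, $\sigma_2$ is a design parameter, and $\hat{\vartheta}_2$ is the estimate of $\vartheta_2$. The event-triggering mechanism is
$$ \tau_i(t) = \tau_{ci}(t_{i,f}), \ \forall t \in [t_{i,f}, t_{i,f+1}), \ f \in N; \quad t_{i,f+1} = \inf\{t \in R \mid |\tau_{ci}(t) - \tau_i(t)| \ge a_i\} \quad (i = 1, 2, 3) \tag{18} $$
where $\epsilon_i = \tau_{ci}(t) - \tau_i(t)$ is the measurement error and $a_i$ are positive design constants.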
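The event-triggering rule (18) amounts to a per-channel zero-order hold that is refreshed only when the difference between the freshly computed command and the held command reaches the threshold. The class below is a minimal sketch with hypothetical names and threshold values; it samples the trigger condition at discrete controller steps, which only approximates the continuous-time infimum in (18).

```python
class EventTriggeredHold:
    """Holds each actuator command until |tau_ci - tau_i| >= a_i for that channel."""

    def __init__(self, thresholds, initial_command):
        self.thresholds = list(thresholds)
        self.held = list(initial_command)
        self.trigger_counts = [0] * len(self.held)

    def update(self, tau_c):
        """Compare the new command with the held one and transmit only on a trigger."""
        for i, (cmd, a) in enumerate(zip(tau_c, self.thresholds)):
            if abs(cmd - self.held[i]) >= a:
                self.held[i] = cmd
                self.trigger_counts[i] += 1
        return list(self.held)


hold = EventTriggeredHold(thresholds=[0.5, 0.5, 0.5], initial_command=[0.0, 0.0, 0.0])
applied_1 = hold.update([0.2, 0.6, -0.5])   # channels 2 and 3 trigger
applied_2 = hold.update([0.49, 0.9, -0.5])  # no channel reaches its threshold
```

Counting triggers per channel, as `trigger_counts` does, gives the kind of transmission statistics that are usually compared against continuous control.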
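4 Stability Analysis
Consider the Lyapunov function as follows
$$ V = \frac{1}{2} S_1^T S_1 + \frac{1}{2\wp_1}(\vartheta_1 - \wp_1 \hat{\vartheta}_1)^2 + \frac{1}{2} S_2^T M S_2 + \frac{1}{2} e_f^2 + \frac{1}{2\wp_2}(\vartheta_2 - \wp_2 \hat{\vartheta}_2)^2 \tag{19} $$
The time derivative of $V$ yields
$$ \dot{V} = S_1^T \dot{S}_1 - (\vartheta_1 - \wp_1 \hat{\vartheta}_1)\dot{\hat{\vartheta}}_1 + S_2^T M \dot{S}_2 + e_f \dot{e}_f - (\vartheta_2 - \wp_2 \hat{\vartheta}_2)\dot{\hat{\vartheta}}_2 \tag{20} $$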
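The term $S_1^T \dot{S}_1$ can be written as
$$ S_1^T \dot{S}_1 \le c_1 \vartheta_1 \Phi^2(Z_1)\|S_1\|^2 + \frac{\vartheta_1}{4c_1} + S_1^T \omega_1 J(\psi)\omega_2^{-1} S_2 + \frac{1}{2}\|S_1\|^2 - \wp_1 S_1^T k_1 S_1 - c_1 \wp_1 \hat{\vartheta}_1 \Phi^2(Z_1)\|S_1\|^2 + \frac{\wp_1^2}{2}\mu_f^{-1} e_f^2 \tag{21} $$
where $\wp = \omega_1 \omega_2^{-1} f(\psi)$ and $J(\psi)J^T(\breve{\psi}) = J(\psi - \breve{\psi})$. Since $f(\psi) > 0$, there exists a constant $\wp_1$ satisfying $\wp_1 \le \gamma_{\min}(\wp)$. According to (11), the term $-(\vartheta_1 - \wp_1 \hat{\vartheta}_1)\dot{\hat{\vartheta}}_1$ can be written as
$$ -(\vartheta_1 - \wp_1 \hat{\vartheta}_1)\dot{\hat{\vartheta}}_1 \le -c_1 \Phi^2(Z_1)\|S_1\|^2 (\vartheta_1 - \wp_1 \hat{\vartheta}_1) + \frac{\sigma_1}{2\wp_1}\vartheta_1^2 - \frac{\sigma_1}{2\wp_1}(\vartheta_1 - \wp_1 \hat{\vartheta}_1)^2 \tag{22} $$
The term $S_2^T M \dot{S}_2$ can be written as
$$ S_2^T M \dot{S}_2 \le S_2^T \breve{\wp}\tau_c + c_2 \vartheta_2 \|S_2\|^2 \beta^2(\varsigma) + \frac{\vartheta_2}{4c_2} - S_2^T \omega_1 J(\psi)\omega_2^{-1} S_1 \tag{23} $$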
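Using (8), the term $e_f \dot{e}_f$ in (20) can be written as
$$ e_f \dot{e}_f = e_f\left(-\frac{e_f}{\mu_f} + A_f\right) \le -\left(\frac{1}{\mu_f} - \frac{1}{2}A_m^2\right) e_f^2 + \frac{1}{2} \tag{24} $$
where $A_f = -\dot{\alpha}$ satisfies $\|A_f\| \le A_m$, and Young's inequality gives $e_f A_f \le \frac{1}{2}A_m^2 e_f^2 + \frac{1}{2}$. The term $-(\vartheta_2 - \wp_2 \hat{\vartheta}_2)\dot{\hat{\vartheta}}_2$ can be written as
$$ -(\vartheta_2 - \wp_2 \hat{\vartheta}_2)\dot{\hat{\vartheta}}_2 \le -c_2 \beta^2(\varsigma)\|S_2\|^2 (\vartheta_2 - \wp_2 \hat{\vartheta}_2) - \frac{\sigma_2}{2\wp_2}(\vartheta_2 - \wp_2 \hat{\vartheta}_2)^2 + \frac{\sigma_2}{2\wp_2}\vartheta_2^2 \tag{25} $$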
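Synthesizing (21), (22), (23), (24), (25) and (20) yields
$$ \dot{V} \le -\gamma_{\min}(k_1)\wp_1 S_1^T S_1 - \frac{\sigma_1}{2\wp_1}(\vartheta_1 - \wp_1 \hat{\vartheta}_1)^2 + \frac{\vartheta_1}{4c_1} + \frac{\sigma_1}{2\wp_1}\vartheta_1^2 - \gamma_{\min}(k_2)\wp_2 S_2^T S_2 - \frac{\sigma_2}{2\wp_2}(\vartheta_2 - \wp_2 \hat{\vartheta}_2)^2 + \frac{\vartheta_2}{4c_2} + \frac{\sigma_2}{2\wp_2}\vartheta_2^2 - \left(\frac{1}{\mu_f} - \frac{1}{2}A_m^2 - \frac{\wp_1^2}{2}\mu_f^{-1}\right) e_f^2 + \frac{1}{2} \le -\mu V + \vartheta \tag{26} $$
where
$$ \mu = \min\left\{2\gamma_{\min}(k_1)\wp_1, \ \sigma_1, \ 2\gamma_{\min}(k_2)\wp_2, \ \sigma_2, \ \frac{1}{\mu_f} - \frac{1}{2}A_m^2 - \frac{\wp_1^2}{2\mu_f}\right\} $$
$$ \vartheta = \frac{\vartheta_1}{4c_1} + \frac{\sigma_1}{2\wp_1}\vartheta_1^2 + \frac{\vartheta_2}{4c_2} + \frac{\sigma_2}{2\wp_2}\vartheta_2^2 + \frac{1}{2} $$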
According to the above stability analysis, the following theorem holds.
Theorem 1 Consider the closed-loop control system for MSVs described by (1) and (2) under Assumptions 1–2, subject to internal uncertainties, external uncertainties, and deception attacks. By appropriately selecting the design parameters $k_1$, $k_2$, $c_1$, $c_2$, $\sigma_1$, $\sigma_2$ and $\mu_f$, the tracking error $S_1 = \breve{\eta} - \eta_r$ settles within $\Omega = \{S_1 \mid \|S_1\| \le \varrho, \ \varrho > \sqrt{2\theta/\lambda}\}$, and the virtual control law (10), the control law (16), and the adaptive laws (11) and (17) ensure that all signals in the vessel closed-loop control system are uniformly ultimately bounded.
Proof (1) Solving (26), with $\lambda$ and $\theta$ denoting the constants $\mu$ and $\vartheta$ defined there, it follows that
$$ V \le \left(V(0) - \frac{\theta}{\lambda}\right)\exp(-\lambda t) + \frac{\theta}{\lambda} \tag{27} $$
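The step from the differential inequality (26) to the bound (27) is the standard integrating-factor argument. Writing the constants of (26) as $\lambda$ and $\theta$, as in (27), a short derivation reads:

```latex
\begin{aligned}
\dot{V} \le -\lambda V + \theta
  &\;\Longrightarrow\; \frac{d}{dt}\left(V e^{\lambda t}\right)
     = \left(\dot{V} + \lambda V\right) e^{\lambda t} \le \theta e^{\lambda t}, \\
\text{integrating over } [0, t]:\quad
  V(t) e^{\lambda t} - V(0) &\le \frac{\theta}{\lambda}\left(e^{\lambda t} - 1\right), \\
\text{hence}\quad
  V(t) &\le \left(V(0) - \frac{\theta}{\lambda}\right) e^{-\lambda t} + \frac{\theta}{\lambda}.
\end{aligned}
```

The bound requires $\lambda > 0$, which is why the design parameters must be chosen so that every entry of the minimum defining the decay rate in (26) is positive.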
Here $V(0)$ is the initial value of $V$; hence $V$ is bounded. From (19), $S_1$, $S_2$, $e_f$, $\hat{\vartheta}_1$ and $\hat{\vartheta}_2$ are bounded. $\breve{\eta}$, $\breve{\upsilon}$, $\dot{\alpha}_f$ and $\alpha$ are bounded owing to Assumptions 1–2. Since $S_2$ and the filter (8) are bounded, $\upsilon$ and $\alpha_f$ are bounded. Because of the boundedness of $S_1$, $\upsilon$ and $\alpha_f$, $\Phi(Z_1)$ and $G(Z_2)$ are bounded. Since $\hat{\vartheta}_2$ and $S_2$ are bounded, $\tau_c$ is bounded. Therefore, all signals in the closed-loop system are bounded. From (27) and $\frac{1}{2}\|S_1\|^2 \le V$, one gets
$$ \|S_1\| \le \sqrt{2\left(V(0) - \frac{\theta}{\lambda}\right)e^{-\lambda t} + \frac{2\theta}{\lambda}} \tag{28} $$
Hence, for any positive constant $\varrho > \sqrt{2\theta/\lambda}$ there exists $T > 0$ such that $\|S_1\| \le \varrho$ for all $t > T$. Therefore the tracking error of the MSVs converges to the set $\Omega = \{S_1 \in R^3 \mid \|S_1\| \le \varrho\}$.
(2) Avoidance of Zeno behavior. According to the event-triggering mechanism (18), $\tau_1$, $\tau_2$ and $\tau_3$ are constant for all $t \in [t_{i,f}, t_{i,f+1})$, which implies that $\dot{\tau}_1 = 0$, $\dot{\tau}_2 = 0$ and $\dot{\tau}_3 = 0$. Differentiating the measurement error $\epsilon_i = \tau_{ci} - \tau_i$ gives, for $i = 1, 2, 3$,
$$ \frac{d|\epsilon_i|}{dt} \le |\dot{\tau}_{ci} - \dot{\tau}_i| \le |\dot{\tau}_{ci}| \tag{29--31} $$
Furthermore, for $i = 1, 2, 3$, we can get
$$ \dot{\tau}_{ci} = \frac{\partial \tau_{ci}}{\partial S_{1i}}\dot{S}_{1i} + \frac{\partial \tau_{ci}}{\partial \hat{\vartheta}_{2i}}\dot{\hat{\vartheta}}_{2i} + \frac{\partial \tau_{ci}}{\partial Z_{2i}}\dot{Z}_{2i} + \frac{\partial \tau_{ci}}{\partial S_{2i}}\dot{S}_{2i} \tag{32--34} $$
Since $S_1$, $S_2$, $\hat{\vartheta}_2$ and $Z_2$ are bounded and continuous, there exist constants $\bar{\tau}_i$ such that $|\dot{\tau}_{ci}| \le \bar{\tau}_i$. Because $\epsilon_i(t_{i,f}) = 0$ at each triggering instant and $|\epsilon_i|$ must reach $a_i$ before the next trigger, the inter-execution intervals satisfy $t_{i,f+1} - t_{i,f} \ge a_i/\bar{\tau}_i > 0$. Thus, Zeno behavior is effectively avoided.
5 Simulations The simulation research is conducted on a scale model named CyberShip II. The dynamic parameters of the motion model described by (1) and (2) are given in [15]. In simulation, the reference trajectory is governed by η˙ r = J(ψr )υr , M υ˙r + πt πt T 3 , 10 sin2 100 ] . The external disturbances are f (υr ) = τ r , where τ r = [1, 15 cos2 100 T taken as d = J(ψ) ℵ, Here, ℵ is generated by the first order Markov process ˙ = −Ξ −1 ℵ + Γ j with j being the zero-mean Gaussian white noises processes. ℵ The NNs for G(Z 2 ) contain 20 nodes, the centers evenly space in the range [−2, 2] × [−2, 2] × [−2, 2] and the width is taken ωi = 2(i = 1, ...20). The deception signals 1 t t T [sin 10t cos 20t , cos 10t sin 20t , 21 sin 100 cos 100 ] of sensor are set as χ1 = 50 1 t t t t t t T and χ2 = 100 [sin 10 cos 20 , cos 10 sin 20 , 2 sin 100 cos 100 ] . The injection and 1 t deception signals of actuator are set as = 50 [sin 10t cos 20t , cos 10t sin 20t , 21 sin 100 t T cos 100 ] t t t t T and κ = [0.02 + 0.5 sin 100 , 0.02 + 0.5 cos 100 , 21 sin 100 cos 100 ] . The simulation process takes into account the influence of the injection and deception signals on the signal transmission of the sensor and actuator. The design parameters and initial states are in Table 1. The simulation results are shown in Figs. 1, 2, 3, 4, 5 and 6. Figure 1 shows that the proposed scheme can force the MSVs to follow the reference trajectory ηr and satisfactory control performance. The curves of position error and heading error are described in Fig. 2, which indicates that S1 is bounded. Figure 3 illustrates the velocity errors S2 are bounded. Figure 4 shows the curves of the 2-norm of ϑˆ 1 and ϑˆ 2 , which implies that ϑˆ 1 and ϑˆ 2 are bounded. Figure 5 plots the control input τ c ,
Table 1 Design parameters and initial states
Index              Item      Value
Initial value      ϑ̂1(0)    0.5
                   ϑ̂2(0)    0
                   η(0)      [1, −1, 0.1]^T
                   υ(0)      0
                   ℵ(0)      [0.5, 0.5, 0.5]^T
Design parameters  c1        0.4
                   c2        0.6
                   k1        diag[2.5, 3.8, 9]
                   k2        diag[15, 5, 20]
                   σ1        0.06
                   σ2        0.06
                   μf        0.02
                   Ξ         diag[2, 3, 2]
                   Γ         [3, 3, 3]^T
Fig. 1 Actual and reference position in x, y and yaw angle ϕ (legend: the proposed control scheme, reference)
Fig. 2 Tracking error S1
Fig. 3 Velocity tracking errors S2 (panels: s2,1, s2,2, s2,3)
Fig. 4 Evolution of ϑ̂i
Fig. 5 Control inputs τc (legend: ETC scheme, continuous control scheme)
Fig. 6 Event-triggered time interval
from which it is clear that the actuator control input τc is reasonable. Figure 6 shows the triggering intervals and triggering instants; even over a short period of time, it can be clearly seen that the control commands τc are not transmitted infinitely often. The above analysis confirms Theorem 1: all signals in the closed-loop trajectory tracking control system are bounded.
6 Conclusion
In this work, an event-triggered adaptive neural tracking control scheme has been developed for MSVs under deception attacks. The scheme takes into account the influence of the deception signals on the signal transmission of the sensor and actuator. The design challenges caused by deception attacks are solved by using the properties of NNs together with the single-parameter learning idea. In particular, only one parameter is used to bring the problem into a parameterized form. According to the simulation results, the proposed control strategy allows the MSVs to keep sailing along the target trajectory.
References
1. Shenoi, R.A., et al.: Global Marine Technology Trends 2030. Univ. Southampton, Southampton, U.K., Tech. Rep. GMTT 2030 (2015)
2. Kessler, G.C.: Cybersecurity in the maritime domain. USCG Proc. Marine Safety Secur. Council 76(1), 34 (2019)
3. Sun, T., Zhang, J., Pan, Y.: Active disturbance rejection control of surface vessels using composite error updated extended state observer. Asian J. Control 19(5), 1802–1811 (2017)
4. Donaire, A., Romero, J.G., Perez, T.: Trajectory tracking passivity-based control for marine vehicles subject to disturbances. J. Franklin Inst. 354(5), 2167–2182 (2017)
5. Do, K.D.: Practical control of underactuated ships. Appl. Soft Comput. 37(13), 1111–1119 (2010)
6. Yin, S., Xiao, B.: Tracking control of surface ships with disturbance and uncertainties rejection capability. IEEE/ASME Trans. Mechatron. 22(3), 1154–1162 (2016)
7. Zhu, G., Ma, Y., Li, Z., Malekian, R., Sotelo, M.: Event-triggered adaptive neural fault-tolerant control of underactuated MSVs with input saturation. IEEE Trans. Intell. Transp. Syst. 23(7), 7045–7057 (2022)
8. Ding, D., Han, Q.L., Xiang, Y., Ge, X., Zhang, X.: A survey on security control and attack detection for industrial cyber-physical systems. Neurocomputing 275, 1674–1683 (2018)
9. Ye, Z., Zhang, D., Wu, Z.G.: Adaptive event-based tracking control of unmanned marine vehicle systems with DoS attack. J. Franklin Inst. B 358(3), 1915–1939 (2021)
10. Sargolzaei, A., Allen, B.C., Crane, C.D., Dixon, W.E.: Lyapunov-based control of a nonlinear multiagent system with a time-varying input delay under false-data-injection attacks. IEEE Trans. Industr. Inform. 18(4), 2693–2703 (2021)
11. Janaideh, M.A., Hammad, E., Farraj, A., Kundur, D.: Mitigating attacks with nonlinear dynamics on actuators in cyber-physical mechatronic systems. IEEE Trans. Industr. Inform. 15(9), 4845–4856 (2019)
12. Skjetne, R., Smogeli, Ø.N., Fossen, T.I.: Modeling, identification, and adaptive maneuvering of CyberShip II: a complete design with experiments. IFAC Proc. Vol. 37(10), 203–208 (2004)
13. Sanner, R., Slotine, J.: Gaussian networks for direct adaptive control. In: American Control Conference, pp. 2153–2159. IEEE (1991)
14. Wang, Y., Zhang, J., Zhang, H., Xie, X.: Finite-time adaptive neural control for nonstrict-feedback stochastic nonlinear systems with input delay and output constraints. Appl. Math. Comput. 392, 125756 (2021)
15. Skjetne, R., Fossen, T.I., Kokotović, P.V.: Adaptive maneuvering, with experiments, for a model ship in a marine control laboratory. Automatica 41(2), 289–298 (2005)
Planetary Flight Obstacle Avoidance Guidance Method Based on ES and DQN Jie Jiao, Wenbo Wu, Binfeng Pan, and Shuaibin Yang
Abstract Obstacle avoidance constitutes a critical challenge in planetary exploration flights. Traditional approaches struggle to meet safety requirements due to the vast distance from Earth and the substantial measurement and control delay errors involved. To address these issues, this study proposes a novel hybrid obstacle avoidance guidance method that combines Evolution-Strategies (ES) and Deep Q-Network (DQN) algorithms. This approach enables autonomous obstacle avoidance flights in complex and unpredictable environments. By leveraging a data-driven approach, the method utilizes DQN for intelligent decision-making during obstacle avoidance flights, while ES is employed for obstacle avoidance guidance. Extensive simulations demonstrate that the proposed method achieves autonomous obstacle avoidance flights in complex and unfamiliar environments, exhibiting remarkable robustness and adaptability. Keywords Reinforcement learning · Autonomous decision-making · Planetary guidance · Obstacle avoidance
J. Jiao · W. Wu · B. Pan (B) · S. Yang
School of Astronautics, Northwestern Polytechnical University, Xi’an 710072, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_13

1 Introduction
Deep space exploration holds significant importance in uncovering the origins and evolution of life, harnessing space resources, and advancing space science and technology. Consequently, it has garnered increasing attention. Among the most effective methods for detection is direct planetary surface exploration using rovers or exploration vehicles. Rovers have been successfully utilized in lunar and Martian landings but are limited in their detection range and mobility, impeding the exploration of specialized terrains like canyons and mountains. In contrast, exploration spacecraft offer greater freedom and a wider detection range, enabling the exploration of hard-to-reach areas. Moreover, their high flight speeds substantially enhance detection efficiency. By synergizing the capabilities of exploration vehicles and spacecraft, tasks
such as multi-site sampling can be efficiently accomplished [1, 2]. As the number and complexity of exploration missions increase, the technology of exploration spacecraft becomes crucial for planetary exploration endeavors. Autonomous obstacle avoidance stands as a fundamental challenge in realizing safe exploration flights for exploration vehicles. However, the long distance from Earth, significant measurement and control delay errors, complex flight environments, and unpredictability pose substantial threats to planetary flight safety. In the past few years, various methods have been proposed to address the obstacle avoidance problem for exploration vehicles, including geometric methods, path planning methods, numerical solution methods, and intelligent algorithms. Geometric algorithms typically treat the spacecraft as a particle with a given speed and establish geometric relationships based on relative speed, position, and heading information. These relationships are then utilized to calculate online obstacle avoidance trajectories. Based on the collision cone concept [3], Han et al. [4] employed a collision cone for conflict detection and guided the relative velocity vector within the collision cone as an obstacle avoidance vector for solving obstacle avoidance trajectories. However, the algorithm’s imposition of constant-speed flight constraints on the spacecraft can hinder convergence. Jenie [5] proposed the three-dimensional velocity obstacle method and introduced the concept of a buffer velocity set to provide additional maneuvering space for the spacecraft to avoid obstacles. Nevertheless, the solution process involves iterating through each collision avoidance plane, limiting real-time performance. Path planning entails selecting an optimal or suboptimal collision-free path between a start and end point based on predefined criteria. Its essence lies in obtaining optimal or feasible solutions under constrained conditions. Liu et al. 
[6] employed the artificial potential field method and Lyapunov stability theorem to avoid falling into local minima. While effective in dynamic or static environments, this algorithm does not guarantee globally optimal obstacle avoidance paths. Chao et al. [7] proposed an improved artificial potential field algorithm combined with the collision cone, utilizing the collision detection mechanism of the collision cone to eliminate adverse effects from irrelevant obstacles’ potential fields. Additionally, they designed a fuzzy repulsion gain adjustment coefficient to avoid local optima. However, this algorithm requires pre-designed adjustment coefficients for the environment, which may adversely impact flight movements in uncertain environments.

Numerical solution methods abstract spacecraft collision processes as mathematical models and derive optimal solutions under specified motion constraints via numerical operations. Sunberg et al. [8] introduced an algorithm based on an interpolation grid that employs a linear value function approximation strategy. Nevertheless, the algorithm demands significant computational space and struggles to attain optimal solutions in high-dimensional spaces.

With the recent advancements in artificial intelligence, several intelligent algorithms have emerged. Julian et al. [9] proposed compressing the discrete numerical tables of obstacle avoidance strategies with deep neural networks. The tables are approximated using asymmetric loss functions and gradient descent algorithms. However, this algorithm relies on a centralized approach
to control spacecraft for obstacle avoidance and depends on the original numerical table’s performance. As a result, it can only be applied to cooperative obstacle avoidance scenarios, rendering the spacecraft unable to independently avoid obstacles when faced with foreign invading spacecraft or non-cooperative dynamic obstacles. Han et al. [10] modeled three-dimensional obstacle avoidance problems in dynamic obstacle environments as continuous-state Markov decision processes. This algorithm eliminates the need for state space discretization, enabling continuous-speed obstacle avoidance for spacecraft. Nonetheless, the algorithm necessitates offline learning and is prone to overfitting under external noise influences. The UAV team at the University of Zurich employed machine learning methods to achieve autonomous obstacle avoidance flights of micro UAVs in forest environments based on deep neural networks [11]. However, these methods heavily rely on hardware performance and require extensive data collection for algorithm training and learning, so their generalization capabilities require further validation.

While these methods have made notable advancements in obstacle avoidance, each approach possesses inherent limitations. In this paper, we propose a hybrid guidance method based on the deep reinforcement learning algorithms ES and DQN. The ES algorithm facilitates parallel operation, while the DQN algorithm can analyze states to make decisions. Combining the two algorithms yields a versatile guidance method capable of achieving autonomous obstacle avoidance flights for detection spacecraft. The first part of the paper establishes the spacecraft’s dynamic model, followed by the introduction of the guidance method based on the reinforcement learning algorithms ES and DQN. Finally, numerical simulations are provided as demonstrations.
2 Problem Formulation
To simplify the problem, the mathematical model is based on several key assumptions:
(1) The planet under consideration possesses an atmosphere;
(2) The spacecraft’s flight altitude remains within 2000 m of the planetary surface, and its relative velocity does not exceed 100 m/s. Thus, the gravity acceleration is assumed to be constant;
(3) The mass of fuel consumed by the spacecraft is significantly smaller than the mass of the spacecraft itself;
(4) The detector onboard the spacecraft perceives information accurately, without any errors.
Figure 1 illustrates a schematic diagram of the reference coordinate system between the detection spacecraft and the target point. The coordinate system’s origin is positioned at the target point. Within this system, V represents the spacecraft’s speed, θ represents the velocity inclination angle, and σ represents the track deflection angle. This coordinate system primarily serves the purpose of determining the spacecraft’s centroid position and spatial orientation.
Fig. 1 Schematic diagram of reference coordinate system on planetary surface
Based on the above assumptions, the three-degree-of-freedom dynamic equation is established as follows:
$$
\begin{cases}
\dot{v} = R_x/m \\
\dot{\theta} = R_y/(mv) + g/v \\
\dot{\sigma} = -R_z/(mv\cos\theta) \\
\dot{x} = v\cos\theta\cos\sigma \\
\dot{y} = v\sin\theta \\
\dot{z} = -v\cos\theta\sin\sigma \\
r = \sqrt{x^2 + (y + R_0)^2 + z^2}
\end{cases} \tag{1}
$$
where $x$ is the flight distance, $y$ is the flight altitude, $m$ is the mass of the spacecraft, $R_0$ is the planetary radius, $R_x$, $R_y$ and $R_z$ are the projections of the aerodynamic force on the spacecraft in the coordinate system, and $g$ is the local gravity acceleration. The aerodynamic force is calculated with the following expressions:
$$
\begin{pmatrix} R_x \\ R_y \\ R_z \end{pmatrix}
= \begin{pmatrix} -X \\ Y \\ Z \end{pmatrix}
= \begin{pmatrix} -C_x \\ C_y \\ C_z \end{pmatrix} q S_{ref} \tag{2}
$$
$$
q = \frac{1}{2}\rho v^2 \tag{3}
$$
Here, $C_x$, $C_y$ and $C_z$ are the drag coefficient, lift coefficient and lateral force coefficient, respectively; $q$ is the dynamic pressure; $\rho$ is the air density; and $S_{ref}$ is the reference area. In this paper, the spacecraft is controlled by normal overload command guidance, so that $R_y = n \cdot mg$. Here $(x_c, y_c)$ is the position of the obstacle and $r_i$ is the safe distance between the spacecraft and the obstacle. All of the following constraints should be met during obstacle avoidance:
$$ |n| < n_{max} \tag{4} $$
$$ x(t_0) = x_0, \quad y(t_0) = y_0, \quad \theta(t_0) = \theta_0 \tag{5} $$
$$ x(t_f) = x_f, \quad y(t_f) = y_f \tag{6} $$
$$ (x - x_c)^2 + (y - y_c)^2 \ge r_i^2 \tag{7} $$
where Eq. (4) is the control constraint, Eq. (5) is the initial constraint, Eq. (6) is the terminal constraint, and Eq. (7) is the obstacle avoidance constraint.
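As a minimal sketch of how Eqs. (1)-(4) and (7) fit together, the following Python functions transcribe the force model, the right-hand side of the dynamics exactly as printed in Eq. (1), and the feasibility check. The function names and any numerical values used when calling them are illustrative assumptions and are not taken from the paper.

```python
import math


def aero_forces(cx, cy, cz, rho, v, s_ref):
    """Eqs. (2)-(3): force projections (Rx, Ry, Rz) from the aerodynamic coefficients."""
    q = 0.5 * rho * v ** 2  # dynamic pressure, Eq. (3)
    return (-cx * q * s_ref, cy * q * s_ref, cz * q * s_ref)


def dynamics(state, forces, m, g):
    """Right-hand side of Eq. (1), transcribed as printed.

    state = (v, theta, sigma, x, y, z); angles are in radians.
    Returns the time derivatives in the same order.
    """
    v, theta, sigma, _x, _y, _z = state
    rx, ry, rz = forces
    return (
        rx / m,
        ry / (m * v) + g / v,
        -rz / (m * v * math.cos(theta)),
        v * math.cos(theta) * math.cos(sigma),
        v * math.sin(theta),
        -v * math.cos(theta) * math.sin(sigma),
    )


def constraints_ok(n, n_max, x, y, xc, yc, ri):
    """Eq. (4) overload bound and Eq. (7) obstacle clearance."""
    return abs(n) < n_max and (x - xc) ** 2 + (y - yc) ** 2 >= ri ** 2
```

With the overload command guidance described above, the caller would set the second force component to `n * m * g` before calling `dynamics`; a simple explicit Euler step then advances the state by `dt` times the returned derivatives.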
3 Guidance Method Based on ES and DQN
3.1 Deep Reinforcement Learning
Reinforcement learning is a powerful approach that utilizes exploratory interaction with the environment to reinforce actions that lead to performance improvement while penalizing actions that hinder performance [12, 13]. It has proven to be effective in optimizing dynamic systems. Reinforcement learning can be categorized into model-based and model-free methods.

Model-based reinforcement learning involves using data acquired from environment interaction to optimize learning and model the environment. However, these models inherently contain errors. Over time, as the algorithm iteratively interacts with the environment, these errors accumulate, making it challenging for the algorithm to converge to the optimal solution. Moreover, the models lack generality and need to be reconfigured as the problem changes. Therefore, model-based methods are not suitable for tasks such as obstacle avoidance in complex and unknown environments.

On the other hand, model-free reinforcement learning is simpler and more versatile as it does not require explicit modeling of the environment. Given the complex and harsh environment, measurement and control delays, and limited computational capabilities of planetary exploration vehicles, model-free methods are better suited for obstacle avoidance during planetary exploration flights. The schematic diagram of reinforcement learning is shown in Fig. 2.

With the rapid advancement of artificial intelligence, the integration of deep learning and reinforcement learning has given rise to deep reinforcement learning. Reinforcement learning defines the task objective and optimization direction, while deep learning provides problem representation and solution methods, enabling effective resolution of high-dimensional continuous space problems [14].
Fig. 2 Schematic diagram of reinforcement-learning
When incorporating complex deep neural network representations into reinforcement learning, optimization methods can be broadly classified into gradient-based and gradient-free methods. Gradient-based optimization methods, such as gradient ascent, determine the next solution based on the gradient direction of the objective function. However, in complex optimization problems, the objective function often exhibits non-convexity and non-smoothness. This poses several challenges for gradient-based reinforcement learning, including issues with stable points (such as an abundance of saddle points or local optima), ill-conditioned problems, and the flatness of activation functions. These challenges can result in problems such as the vanishing gradient issue during optimization or convergence to suboptimal solutions, leading to poor performance. To address such problems, gradient-free optimization [15, 16], also referred to as “zero-order optimization” or “black box optimization”, encompasses a range of optimization algorithms that do not rely on gradients. In gradient-free optimization, the next solution is determined solely based on the evaluation value of the objective function. Most gradient-free optimization algorithms follow a similar structure. Initially, a set of solutions is randomly initialized in the search space, and then an explicit or implicit underlying objective function model is established based on the currently available solutions. This model suggests regions with potentially better solutions, and new solutions are sampled from the model while updating the model itself. The gradient-free optimization algorithm iteratively improves the solution quality by repeating the sampling and updating process. By leveraging gradient-free optimization, the quality of solutions can be enhanced iteratively without relying on gradients, providing an alternative approach to address the limitations of gradient-based optimization methods.
3.2 ES Method
Evolution-Strategies (ES) is a prominent algorithm within the realm of model-free and gradient-free reinforcement learning. It was introduced by Salimans et al. from the OpenAI team in 2017 [17]. This algorithm departs from the traditional approach of reinforcement learning in terms of behavioral domain and resurrects the evolutionary strategy algorithm from the 1980s. The ES algorithm can be conceptualized as generating multiple candidate models by perturbing the internal parameters of the model. Subsequently, the internal parameters are updated based on the obtained return values. The ES algorithm primarily encompasses two main processes within its loop:
(1) Disturbances $\varepsilon_1, \ldots, \varepsilon_n$ are randomly sampled from the Gaussian distribution $N(0, \sigma^2 I)$, and each disturbed strategy interacts with the environment to obtain its evaluation $F_i = F(\theta + \sigma\varepsilon_i)$, where $i = 1, \ldots, n$;
(2) According to the disturbances and their evaluation results, the gradient is estimated, and then the current strategy parameters $\theta_t$ are updated using the estimated gradient, which can be described as:
$$ \theta_{t+1} \leftarrow \theta_t + \alpha \frac{1}{n\sigma} \sum_{i=1}^{n} F_i \varepsilon_i \tag{8} $$
The ES algorithm pseudocode is shown in Algorithm 1.

Algorithm 1: ES (Evolution Strategies)
Set hyperparameters: learning rate α, strategy neural network parameters θ0, disturbance standard deviation σ, number of rounds episode
Function ES:
  randomly sample disturbances ε1, ..., εn from the Gaussian distribution N(0, σ²I) and disturb the neural network: θ0 + σεi;
  initialize state s;
  while not the end of the cycle do
    for each episode do
      for each step in the round do
        use the disturbed neural network strategy to select actions and interact with the environment according to the state;
        obtain the evaluation value F(θ + σεi) and update the state s ← s′;
    update parameters θ_{t+1} ← θ_t + α Σ_{i=1}^{n} F_i εi /(nσ);

ES offers several advantages over algorithms like Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and others:
(1) Enhanced Exploration: ES possesses superior exploration capabilities, allowing it to effectively navigate and explore complex environments. This attribute is particularly beneficial in scenarios with long episodes, delayed rewards, sparse rewards, and other challenging factors. It enables ES to discover optimal or near-optimal solutions more efficiently;
(2) Parallel Operation and Scalability: ES is well-suited for parallel operation, making it easier to leverage computational resources efficiently. By performing multiple parallel trials or simulations, ES can explore various solutions simultaneously, accelerating the learning process. This characteristic enhances its scalability and enables it to handle complex problems effectively;
(3) Gradient-Free Optimization: Unlike algorithms that rely on gradient calculations, ES does not need to compute gradients. This removes concerns related to gradient explosion or vanishing gradients, which can be problematic in gradient-based methods. By circumventing these gradient-related issues, ES offers a more robust and stable optimization process;
(4) Flexibility with Non-differentiable Elements: ES can incorporate non-differentiable elements into its strategy. This flexibility allows it to handle scenarios where the environment or the problem itself involves non-differentiable components, making it applicable to a wider range of problems;
(5) Lower Computation Requirements: ES typically requires fewer computations per iteration compared to other algorithms. Consequently, the online computing capacity needed to implement ES is not as demanding. This characteristic makes it more suitable for scenarios with limited computational resources or real-time applications.
In summary, ES combines efficient exploration, parallel operation capabilities, gradient-free optimization, flexibility with non-differentiable elements, and lower computational requirements, making it a powerful and versatile algorithm for addressing complex reinforcement learning problems.
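The update of Eq. (8) can be sketched in a few lines of dependency-free Python. This is an illustration under stated assumptions, not the authors' implementation: the function name `es_step`, the default hyperparameters and the toy objective in the usage note are hypothetical. The sketch draws each ε_i from a standard normal distribution and scales it by σ when perturbing, which is the convention of [17]; this is an assumption about how the N(0, σ²I) statement above is meant to be read.

```python
import random


def es_step(theta, fitness, alpha=0.05, sigma=0.5, n=100, rng=random):
    """One ES update, Eq. (8): theta <- theta + alpha/(n*sigma) * sum_i F_i * eps_i.

    theta   : list of floats, current strategy parameters
    fitness : callable returning the evaluation F of a parameter vector
    rng     : source of Gaussian samples (the random module or a random.Random)
    """
    dim = len(theta)
    weighted_sum = [0.0] * dim
    for _ in range(n):
        # Assumption: eps_i ~ N(0, I), perturbation is theta + sigma * eps_i.
        eps = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        f_i = fitness([t + sigma * e for t, e in zip(theta, eps)])
        for k in range(dim):
            weighted_sum[k] += f_i * eps[k]
    return [t + alpha * w / (n * sigma) for t, w in zip(theta, weighted_sum)]
```

For example, with the hypothetical objective `fitness = lambda p: -sum((a - b) ** 2 for a, b in zip(p, [1.0, -2.0]))`, repeatedly calling `theta = es_step(theta, fitness)` from `theta = [0.0, 0.0]` moves the parameters toward `[1.0, -2.0]`. In the guidance setting, `fitness` would instead run an episode with the perturbed strategy network and return its accumulated reward; the n evaluations are independent, which is what makes the parallel operation mentioned above straightforward.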
3.3 DQN Learning Method
When dealing with large state and action spaces, traditional Q-learning can suffer from scalability issues as the Q-table becomes prohibitively large, leading to reduced solving efficiency. Deep Q-Network (DQN) addresses this challenge by incorporating deep learning techniques [18, 19]. By leveraging neural networks, DQN transforms the problem of updating the Q-table into a nonlinear function approximation problem. In this case, the neural network serves as the nonlinear function, and its parameters (represented by the weight coefficients of each layer, denoted as θ) are updated to make the Q-function approximate the optimal Q-values.

Instead of storing and updating a large Q-table, DQN utilizes a deep neural network to estimate the Q-values directly from the raw state inputs. The neural network takes the state as input and outputs Q-values for each possible action. During training, the network’s weights (θ) are updated using gradient descent or other optimization algorithms to minimize the difference between the predicted Q-values and the target Q-values.

The introduction of deep learning in DQN provides several advantages. Firstly, it allows for a more compact and efficient representation of the Q-function compared to the explicit Q-table. Secondly, the neural network can generalize across similar states, enabling better performance even for unseen state-action pairs. Thirdly, deep learning techniques provide powerful function approximation capabilities, making it possible to handle high-dimensional state spaces.

In summary, DQN tackles the scalability issue of Q-learning by employing a deep neural network to approximate the Q-values. This approach enables more efficient and effective learning in large state and action spaces, making it suitable for complex reinforcement learning problems, that is:
$$ Q(s, a; \theta) \approx Q^{*}(s, a) \tag{9} $$
The loss function of DQN is defined by
$$ L(\theta) = E[(Q_{target} - Q(s, a; \theta))^2] \tag{10} $$
Furthermore, the target Q value is:
$$ Q_{target} = r + \gamma \max_{a'} Q(s', a'; \theta) \tag{11} $$
In DQN, the experience replay mechanism is also used to store the transition samples $(s_t, a_t, r_t, s_{t+1})$ obtained from interaction in the replay memory. During training, a part of the memory is drawn by uniform random sampling as training data. Experience replay breaks the temporal correlation between samples, which improves the robustness and stability of the training results. In addition, another feature of DQN is that a second network with exactly the same structure as the original neural network is built, which is called the target network. The original network used for action value function approximation is called the action network. In the process of weight updating, the weights of the action network are updated at every step, while the target network is updated every fixed number of steps by directly assigning the parameters of the action network to the target network. Therefore, the update process of the value function is:
$$ \theta_{t+1} = \theta_t + \alpha\,[\,r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\,]\,\nabla Q(s, a; \theta) \tag{12} $$
where $\theta^{-}$ denotes the parameters of the target network.
The pseudocode of the DQN algorithm is shown in Algorithm 2.

Algorithm 2: DQN (Deep Q-Network)
Algorithm parameters: target network update period C (in steps), exploration parameter ε > 0 that determines whether an action is selected randomly, and the number of learning rounds episode
Initialize the action network parameters θ and the target network parameters θ⁻
for each episode do
  initialize state s;
  for each step in the episode do
    select action a according to the ε-greedy strategy using the action network Q(θ);
    use action a to obtain the reward r and the next state s′;
    store data (s, a, r, s′) in dataset D;
    select a batch of data (s_i, a_i, r_i, s′_i) (i ∈ B) from D;
    compute the target: Q_target = r + γ max_{a′} Q(s′, a′; θ⁻);
    Δθ = α[r + γ max_{a′} Q(s′, a′; θ⁻) − Q(s, a; θ)]∇Q(s, a; θ);
    update parameters: θ = θ + Δθ;
    update parameters θ⁻: θ⁻ = θ every C steps;
    update the state: s ← s′;
  until s reaches the terminal state;
end
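The three ingredients described above (experience replay, a separate target network, and the update of Eq. (12)) can be illustrated with a small, dependency-free sketch. To keep it short, a linear Q-function Q(s, a; θ) = θ_a · s stands in for the deep network, so that ∇Q with respect to θ_a is simply s; this substitution, the class name `LinearDQN` and all default values are assumptions made for illustration and do not come from the paper.

```python
import random
from collections import deque


class LinearDQN:
    """Toy DQN with a linear Q-function standing in for the deep network."""

    def __init__(self, n_features, n_actions, alpha=0.1, gamma=0.9,
                 sync_every=10, capacity=1000, seed=0):
        self.theta = [[0.0] * n_features for _ in range(n_actions)]  # action network
        self.theta_target = [row[:] for row in self.theta]           # target network
        self.alpha, self.gamma, self.sync_every = alpha, gamma, sync_every
        self.memory = deque(maxlen=capacity)                         # replay memory
        self.rng = random.Random(seed)
        self.steps = 0

    @staticmethod
    def q_values(theta, s):
        return [sum(w * x for w, x in zip(row, s)) for row in theta]

    def act(self, s, epsilon):
        """Epsilon-greedy action selection with the action network."""
        if self.rng.random() < epsilon:
            return self.rng.randrange(len(self.theta))
        qs = self.q_values(self.theta, s)
        return qs.index(max(qs))

    def store(self, s, a, r, s_next, done):
        self.memory.append((s, a, r, s_next, done))

    def learn(self, batch_size=8):
        """Sample uniformly from memory and apply Eq. (12) to each sample."""
        k = min(batch_size, len(self.memory))
        for s, a, r, s_next, done in self.rng.sample(list(self.memory), k):
            bootstrap = 0.0 if done else self.gamma * max(
                self.q_values(self.theta_target, s_next))
            td_error = r + bootstrap - self.q_values(self.theta, s)[a]
            # For a linear Q-function the gradient w.r.t. theta[a] is s itself.
            self.theta[a] = [w + self.alpha * td_error * x
                             for w, x in zip(self.theta[a], s)]
        self.steps += 1
        if self.steps % self.sync_every == 0:  # copy theta into theta^- every C steps
            self.theta_target = [row[:] for row in self.theta]
```

The `done` flag, which drops the bootstrap term at terminal states, is a common implementation detail that the pseudocode above leaves implicit. Replacing the linear function with a neural network only changes `q_values` and the gradient step; the replay and target-network logic stay the same.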
Fig. 3 Flow chart of obstacle avoidance reinforcement-learning algorithm
3.4 Hybrid Obstacle Avoidance Guidance Method
The overall framework consists of two layers. The inner layer utilizes the ES algorithm, which employs a trained strategy neural network to provide obstacle avoidance guidance by issuing overload commands. The outer layer incorporates the DQN algorithm, which is suitable for discrete behaviors. After detecting obstacles, the DQN algorithm analyzes the situation and makes a decision, selecting the appropriate ES strategy within the discrete behavior space to enable autonomous obstacle avoidance or normal detection flight.

The specific process for obstacle avoidance is as follows. During the execution of the detection mission, the spacecraft scans its surroundings for potential obstacles within its detection range. Upon detecting obstacles, the spacecraft enters the danger judgment stage and measures the distance to the obstacles. At this point, the DQN algorithm is employed to assess whether the obstacles pose a threat to the flight. If a threat is detected, the ES guidance for obstacle avoidance is activated, and the spacecraft performs the necessary maneuvers to avoid the obstacles. If no threat is detected, the spacecraft maintains its initial flight path and continues flying. The algorithm flow is depicted in Fig. 3.

The reinforcement learning algorithms DQN and ES serve different purposes, so the state space, action space and reward function must be designed for each of them. The basic design principles of this paper are as follows:
(1) State space S: the state space needs to characterize both the state of the spacecraft and its relationship relative to the obstacles, so as to facilitate the handling of terminal conditions and constraints;
(2) Action space A: the overload is selected as the action, n = action, subject to the maximum constraint |n| < n_max;
(3) Reward function R: the reward function needs to comprehensively represent the flight state of the spacecraft. If the spacecraft collides with an obstacle, the reward should be as small as possible; otherwise, the reward should be as large as possible.
4 Numerical Demonstrations
The method is verified by simulation in a planetary environment with an atmosphere. Firstly, the ES algorithm is trained to verify whether it can realize obstacle avoidance for the detection spacecraft; it is then combined with the DQN algorithm to verify whether autonomous obstacle avoidance flight can be achieved. The obstacle is simplified as a circle with a radius of 200, that is, ri = 200. The spacecraft is simplified as a particle with a detection range of 100 m. In order to verify the effectiveness of the algorithm, the initial position and pitch angle of the simulated spacecraft and the position of the obstacle are given a random variation of 30% in each run. The circle obtained by extending the obstacle boundary outward by the detection range of the spacecraft is taken as the detection boundary. The region outside the detection boundary is the safety zone, and the region inside it is the early warning zone. In the hybrid guidance method of ES and DQN, when the spacecraft enters the early warning zone, the DQN algorithm is used to determine whether the obstacle is threatening. If so, the spacecraft switches to the ES obstacle avoidance algorithm for further guidance; otherwise it flies normally. An episode ends in any of the following three situations:
(1) If the accuracy requirements are not met within the given length of the time domain, the cycle ends;
(2) If the spacecraft collides with an obstacle during flight (the distance from the center of the obstacle is less than the radius), the cycle ends;
(3) If the 2-norm of the spacecraft position vector is less than 10, the specified position has been successfully reached, and the cycle ends.
The parameter settings of the dynamic and environmental models are shown in Table 1.
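The zone classification and the three termination conditions above can be written down compactly. The sketch below is a plain-Python illustration; the numerical constants are the ones stated in the text, while the function names, the string labels and the `max_steps` argument (standing in for "the given length of the time domain") are hypothetical.

```python
import math

OBSTACLE_RADIUS = 200.0   # r_i, from the text
DETECTION_RANGE = 100.0   # detection range in m, from the text
GOAL_TOLERANCE = 10.0     # threshold on the 2-norm of the position, from the text


def zone(x, y, xc, yc):
    """Classify the spacecraft position relative to an obstacle centered at (xc, yc)."""
    d = math.hypot(x - xc, y - yc)
    if d < OBSTACLE_RADIUS:
        return "collision"
    if d < OBSTACLE_RADIUS + DETECTION_RANGE:
        return "warning"   # inside the detection boundary
    return "safe"          # outside the detection boundary


def episode_status(x, y, xc, yc, step, max_steps):
    """Return 'collided', 'reached', 'timeout' or 'running' for the current step."""
    if zone(x, y, xc, yc) == "collision":
        return "collided"
    if math.hypot(x, y) < GOAL_TOLERANCE:  # the target point is the coordinate origin
        return "reached"
    if step >= max_steps:
        return "timeout"
    return "running"
```

In the hybrid scheme, the outer DQN layer would only be queried while `zone(...)` returns `"warning"`; in the safe zone the spacecraft keeps flying normally.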
Table 1 Parameter settings of the obstacle avoidance problem
Parameter                                          Value
Initial position (x0, y0) (m)                      [−1200, 1200]
Initial position of obstacle (xc, yc) (m)          [−500, 600]
Initial speed v0 (m/s)                             120
Initial mass m0 (kg)                               113
Terminal position (xf, yf) (m)                     [0, 0]
Overload constraint g (m/s2)                       [−5, 5]
Initial pitch angle θ (°)                          −10

4.1 Simulation Results by ES
The purpose of the simulation is to use the ES algorithm and a deep neural network to learn the obstacle avoidance guidance method, so that the spacecraft can decide autonomously how to avoid obstacles when its initial position and inclination are disturbed during flight. A four-layer neural network is used, with one input layer, two hidden layers and one output layer. It takes the four-dimensional state as input and outputs a one-dimensional action. The details are as follows:
(1) State space S: the state is a four-dimensional vector containing the position (x, y), the distance from the obstacle (r), and the angle between directions (q), that is, state = [x, y, q, r]. When no obstacle is detected, no obstacle avoidance maneuver is required. If an obstacle is detected, r and q are the real-time distance and direction angle between the spacecraft and the obstacle;
(2) Action space A: the ES neural network outputs the overload command A ∈ [−5, 5];
(3) Reward function R: a reward r = 0.01 is given for each interaction as long as the flight remains safe, so a longer safe flight accumulates a higher reward. A reward r = 10 is given when the destination is reached, and a reward r = −10 is given if obstacle avoidance fails.
A total of 100 simulation tests were conducted, and a selection of the results is shown in Figs. 4, 5 and 6.
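The policy network and reward described above can be sketched in NumPy as follows. The hidden width, the tanh activations, the weight initialization, and all function names are assumptions made for illustration; the source only fixes the layer count, the four-dimensional input, the one-dimensional output bounded to [−5, 5], and the three reward values.

```python
import numpy as np


def init_policy(rng, hidden=16):
    """Weights for a 4 -> hidden -> hidden -> 1 network (hidden width is an assumption)."""
    sizes = [4, hidden, hidden, 1]
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]


def overload_command(params, state, limit=5.0):
    """Map state = [x, y, q, r] to an overload command bounded by +/- limit."""
    h = np.asarray(state, dtype=float)
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)           # hidden layers (activation is an assumption)
    W, b = params[-1]
    return float(limit * np.tanh(h @ W + b)[0])  # tanh keeps the output within the bound


def es_step_reward(collided, arrived):
    """Per-interaction reward using the values quoted in the text."""
    if collided:
        return -10.0
    if arrived:
        return 10.0
    return 0.01
```

In an ES setting, the entries of `params` would be perturbed and scored by the accumulated `es_step_reward` over an episode; that outer optimization loop is omitted here.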
Fig. 4 Obstacle avoidance trajectory (left: scenario 1, right: scenario 2)
Fig. 5 Diagram of the decision-making process (left: scenario 1, right: scenario 2)
Fig. 6 Diagram of relative distance variation (left: scenario 1, right: scenario 2)
In scenarios where the initial state of the spacecraft and the location of the obstacle vary randomly, the spacecraft flies normally while no obstacle is detected. Once an obstacle is detected and the spacecraft enters the early warning zone, ES guides the flight according to the input state. The simulation results show that ES performs obstacle avoidance flight in time according to the trained obstacle avoidance strategy and is robust to these variations. However, although the pure ES algorithm avoids obstacles successfully, it is overly sensitive to the state after entering the early warning zone and cannot distinguish whether a threat actually exists. It may perform an avoidance maneuver merely because it has entered the early warning zone, even when there is no threat, which consumes energy unnecessarily. It therefore needs to be combined with the DQN algorithm to improve energy utilization while still ensuring the safety of obstacle avoidance flight.
4.2 Simulation Results by Hybrid DQN and ES
To use the DQN algorithm, the original state space and action space are redesigned as follows:
(1) State space S: the original state space is determined by the spacecraft parameters relative to the destination. Since the ES algorithm is already able to avoid obstacles, what DQN has to learn is whether to select cruise flight or obstacle avoidance at any given time. The state is therefore chosen as state = [x, y, r];
(2) Action space A: the action value decides whether to select cruise flight or ES obstacle avoidance guidance, so the action space is A ∈ {avoidance, cruising}. Each decision is held for a period of 0.1 s;
(3) Reward function R: the reward function is defined from the distance between the current position and the target position, using

$$ z(x, p) = \sqrt{x^2 + p^2} - p \tag{13} $$

where $p_x = p_y = 0.01$ and $r(x, p) = 0.01 \cdot (z(x, p_x) + z(y, p_y))$.
The simulation results are shown in Figs. 7, 8 and 9. In scenario 1 of Figs. 7, 8 and 9, the hybrid ES and DQN guidance method uses DQN to assess the threat posed by the obstacle after the spacecraft enters the early warning zone, and then decides to perform obstacle avoidance flight. In scenario 2, as shown in Fig. 8, when the spacecraft guided by pure ES detects an obstacle and enters the warning zone, it cannot determine whether there is an actual threat. Consequently, it generates maneuver commands, leading to redundant obstacle avoidance that wastes energy. Integrating DQN addresses this issue: DQN evaluates the flight status and decides on normal flight. The resulting overload command is depicted in Fig. 9. With the assistance of DQN, the spacecraft flies safely without invoking the obstacle avoidance strategy. The simulation results show that the hybrid planetary flight obstacle avoidance guidance method based on DQN and ES achieves obstacle avoidance flight, makes decisions through threat analysis, avoids unnecessary maneuvers, ensures flight safety, saves energy and improves the utilization of resources.
Fig. 7 Diagram of the obstacle avoidance trajectory (left: scenario 1, right: scenario 2)
Fig. 8 DQN and ES decision process action (left: scenario 1, right: scenario 2)
Fig. 9 Diagram of the decision-making process (left: scenario 1, right: scenario 2)
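The DQN reward term and the switching rule of the hybrid scheme can be sketched as below, taking z(x, p) = sqrt(x^2 + p^2) − p as the reading of Eq. (13). The function names are hypothetical, and the `cruise_command` default of zero overload is an assumption, since the source does not state what command is issued during normal flight.

```python
import math


def z(x, p):
    """Smooth distance-like term of Eq. (13); it is zero at x = 0."""
    return math.sqrt(x * x + p * p) - p


def dqn_reward(x, y, px=0.01, py=0.01):
    """r = 0.01 * (z(x, px) + z(y, py)) with px = py = 0.01 as quoted in the text."""
    return 0.01 * (z(x, px) + z(y, py))


def hybrid_command(in_warning_zone, dqn_action, es_command, cruise_command=0.0):
    """Use the ES overload command only when DQN selects avoidance inside the warning zone.

    cruise_command is a hypothetical placeholder for the normal-flight command.
    """
    if in_warning_zone and dqn_action == "avoidance":
        return es_command
    return cruise_command
```

The term `z` behaves like |x| away from the origin but is smooth at zero, which is presumably why the small constant p appears in the formula.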
5 Conclusion This paper presents a novel approach for achieving autonomous intelligent obstacle avoidance in spacecraft flights by employing a reinforcement learning method based on the ES and DQN algorithms. The proposed method addresses the challenge
of navigating complex and uncertain environments while considering initial state disturbances and the uncertainty of obstacle positions. By leveraging the integration of ES and DQN, our approach enables the spacecraft to obtain optimal obstacle avoidance strategies without explicit planning. This leads to the successful realization of obstacle avoidance in planetary flights, ensuring the safety and efficiency of the spacecraft. Through extensive simulations of planetary flight obstacle avoidance scenarios, our study draws the following two significant conclusions: (1) The obstacle avoidance guidance method combining ES and DQN exhibits remarkable robustness and adaptability in the presence of disturbances. It enhances the overall autonomy and intelligence of the guidance process, enabling effective obstacle avoidance; (2) Model-free methods, like the one employed in this study, offer valuable advantages in handling uncertain and complex environments where establishing a model is challenging. The gradient-free nature of our approach is particularly beneficial in optimizing non-convex, non-smooth optimization problems and addressing extensive behavior state spaces. This holds great significance for planetary exploration flights characterized by dynamic and unpredictable environments, as well as the high costs associated with environmental interactions. In summary, our proposed reinforcement learning method presents a promising solution for achieving autonomous and intelligent obstacle avoidance in spacecraft flights. It surpasses the limitations of traditional approaches, providing an adaptive and intelligent solution for ensuring safe and efficient planetary exploration missions. Acknowledgements The authors gratefully acknowledge the support to this work by the National Program on Basic Research Project of China (Grant No. JCKY2020903B002).
Improving Unmanned Panoramic Perception Algorithm of DAFPN-YOLO Xiru Wu, Yurui Lin, and Chao Liu
Abstract An unmanned panoramic perception algorithm based on DAFPN-YOLO is proposed. It builds on YOLO and uses multi-task learning to carry out traffic object detection, driveable area segmentation and lane line segmentation for unmanned driving. Dynamic attention is used to achieve scale perception, space perception and task perception, which improves the model's performance on these three tasks. The multi-task loss function is adjusted, and Focal Loss is introduced to improve the model's performance on category-unbalanced data. As a result, the accuracy of the vehicle perception algorithm for driverless cars is improved.
Keywords Multi-task learning · Panoramic perception · Dynamic attention
1 Introduction
Multi-task learning (MTL) is a machine learning method that aims to improve the generalization ability of machine learning systems by learning multiple related tasks jointly [1]. By learning the tasks jointly, usually within a single network, MTL shares information between them and can thus improve both the performance of the individual tasks [2] and the overall inference time. MTL has been successfully applied to many computer vision tasks [3]. Nevertheless, it usually considers a single dataset labeled for all tasks [2, 4], whereas in real-world applications we are often faced with a scarcity of labeled data for the tasks at hand: each dataset is only labeled for one of the tasks. Moreover, labeling all the datasets with the missing annotations would be time-consuming and impractical.

X. Wu · Y. Lin (B) · C. Liu
College of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China
e-mail: [email protected]
X. Wu
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_14
Therefore, there is a need to tackle this issue and leverage partially labeled datasets in an MTL framework. The MTL framework has been successfully applied to multiple computer vision domains, especially scene understanding [5–7]. MTL is also used in face analysis [8] and person identification [9, 10]. Most MTL techniques assume a single dataset labeled for all the relevant tasks, or use knowledge transfer across tasks (for more on MTL, see [11]). However, they do not tackle the simultaneous learning of multiple tasks from different datasets with partial labels. A few works tackle the problem of learning multiple tasks from disjoint datasets [12–16]. The general approach is to use an MTL framework and counter the resulting forgetting effect with knowledge distillation [12, 13, 15]. However, these methods do not leverage the unlabeled data from one task to improve the performance of the other task. Thus, other works [14, 16] combine adversarial learning with SSL to learn multiple tasks from disjoint datasets while using all the data at hand for each task. We evaluate our approach on the BDD100K dataset, where our method outperforms existing frameworks and achieves state-of-the-art instance prediction performance. Our contributions are the following:
• Dynamic attention is introduced to realize scale perception, space perception and task perception, and to improve the performance of the model in three tasks: traffic object detection, driveable area segmentation and lane line segmentation.
• Focal Loss is introduced to improve the performance of the model on class-imbalanced data and to improve the accuracy of the vehicle perception algorithm of unmanned vehicles.
2 Method In this Section, we will present our method.
2.1 Unmanned Panoramic Perception Algorithm Based on YOLO
The framework of the YOLO-based unmanned panoramic perception algorithm is shown in Fig. 1. It consists of a backbone network, a neck network, and three decoder heads, one for each task.
Backbone: The backbone extracts the features of the input image. Typically, a classic image classification network serves as the backbone. Given the excellent object detection performance of YOLOv4, CSPDarknet is selected as the backbone of this method. It solves the gradient duplication problem in the optimization process, supports feature propagation and feature reuse, and reduces the number of parameters and the amount of computation, which helps ensure the real-time performance of the network.
Neck: The neck processes the feature maps output at different stages of the backbone and performs operations such as concatenation and fusion on them. The algorithm uses the spatial pyramid pooling (SPP) module to fuse multi-scale features and the feature pyramid network (FPN) to fuse features at different semantic levels.
Decoder heads: The algorithm designs a separate decoder head for each of the three tasks. For detection, an anchor-based multi-scale detection scheme similar to that of YOLOv7 is adopted. First, a path aggregation network (PAN) is used to better extract localization features. Features from PAN and FPN are combined to fuse semantic information with local features, and detection runs directly on the multi-scale fused feature maps in PAN. Each grid cell of each multi-scale feature map is assigned three anchors with different aspect ratios, and the detection head predicts the position offset, the scaled height and width, and the probability and corresponding confidence of each category. Driveable area segmentation and lane line segmentation are implemented in separate task heads with different network structures and use different semantic features. Unlike the other two tasks, driveable area segmentation does not need features extracted from deeper network layers: these deeper features do not improve the prediction performance and make convergence harder during training. Therefore, the branch of the driveable area segmentation head is connected before the FPN module. To compensate for the potential loss this causes, an additional upsampling layer is applied, so that nearest-neighbor interpolation upsampling is performed four times in total in the decoder stage.
Fig. 1 Unmanned panoramic perception algorithm framework based on YOLO
The lane line segmentation branch is attached to the end of the FPN layer to extract features at a deeper level, because lane lines are often slender and difficult to detect in the input image. In addition, deconvolution is used in the decoder of the lane line segmentation branch to further improve performance.
2.2 Design of Loss Function
The loss function of the detection branch is the weighted sum of the classification loss, the object loss and the bounding box loss, as shown in formula 1:

$$ L_{det} = \alpha_1 L_{class} + \alpha_2 L_{obj} + \alpha_3 L_{box} \tag{1} $$

In the formula, Focal Loss is used for $L_{class}$ and $L_{obj}$ to reduce the contribution of easily classified samples to the loss and to improve learning on difficult samples. $L_{class}$ penalizes classification errors, and $L_{obj}$ measures the confidence of a prediction. $L_{box}$ uses the CIoU loss function, which takes into account the distance, overlap rate, scale similarity and aspect ratio between the predicted box and the ground-truth box. The loss function of the driveable area segmentation branch is the cross-entropy loss, as shown in formula 2:

$$ L_{da\text{-}seg} = L_{ce} = -\left( y \log \hat{y} + (1 - y) \log (1 - \hat{y}) \right) \tag{2} $$

The loss function of the lane line segmentation branch combines the cross-entropy loss and the IoU loss. The IoU loss is particularly effective for predicting the sparse lane line category. The loss function of the lane line segmentation branch is shown in formula 3:

$$ L_{ll\text{-}seg} = L_{ce} + L_{IoU} = L_{ce} + 1 - \frac{TP}{TP + FP + FN} \tag{3} $$

Finally, the loss function of the entire multi-task network is the weighted sum of the losses of the three branches, as shown in formula 4:

$$ L_{all} = \gamma_1 L_{det} + \gamma_2 L_{da\text{-}seg} + \gamma_3 L_{ll\text{-}seg} \tag{4} $$

In the formula, $\alpha_1$, $\alpha_2$, $\alpha_3$, $\gamma_1$, $\gamma_2$ and $\gamma_3$ are hyperparameters used to balance the individual loss terms.
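The segmentation losses in formulas 2 and 3 and the weighted sum in formula 4 can be illustrated numerically as follows. This is a sketch under stated assumptions: the function names are hypothetical, the IoU term is computed from hard-thresholded predictions only to mirror the TP/FP/FN form of formula 3 (a trainable version would need a differentiable soft IoU), and the default weights of 1.0 are placeholders rather than values from the paper.

```python
import numpy as np


def cross_entropy(y, y_hat, eps=1e-7):
    """Mean binary cross-entropy, as in formula 2."""
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))))


def iou_loss(y, y_hat, threshold=0.5):
    """1 - TP / (TP + FP + FN), as in formula 3, using thresholded predictions."""
    pred = y_hat >= threshold
    true = y >= 0.5
    tp = np.sum(pred & true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)
    denom = tp + fp + fn
    return 0.0 if denom == 0 else float(1.0 - tp / denom)


def total_loss(l_det, l_da_seg, l_ll_seg, gammas=(1.0, 1.0, 1.0)):
    """Weighted sum of the three branch losses, as in formula 4."""
    g1, g2, g3 = gammas
    return g1 * l_det + g2 * l_da_seg + g3 * l_ll_seg
```

With labels `[1, 1, 0, 0]` and predictions `[0.9, 0.2, 0.8, 0.1]` there is one true positive, one false positive and one false negative, so the IoU loss is 1 − 1/3.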
2.3 Unmanned Panoramic Perception Algorithm Based on DAFPN-YOLO
The network structure of the DAFPN-YOLO algorithm proposed in this paper is shown in Fig. 2. It consists of a shared encoder that extracts features from the input image and three decoder heads, one per task. This section describes the network configuration of the model.
Backbone: In the literature on designing efficient neural network architectures, the main considerations are no longer limited to the number of parameters, the amount of computation, and the computational density. Ma et al., starting from the characteristics of memory access cost, analyzed the influence of the input/output channel ratio, the number of branches of the architecture, and element-wise operations on the inference speed of the network. Dollar et al. additionally considered activations when scaling the model, that is, they gave more consideration to the number of elements in the output tensors of the convolutional layers. The CSPVoVNet design in Fig. 3b is a variant of VoVNet. In addition to these basic design concerns, the CSPVoVNet architecture also analyzes the gradient path so that the weights of different layers learn more diverse features, which makes inference faster and more accurate. ELAN, shown in Fig. 3c, considers the question of how to design an efficient network, and its authors concluded that by controlling the shortest longest gradient path, a deep network can learn and converge efficiently. The extended E-ELAN, which is based on ELAN, is introduced into this algorithm, and its main architecture is shown in Fig. 3d. In large-scale ELAN, the gradient path length and the number of stacked computational blocks have reached a stable state; if more computational blocks are stacked without limit, this stable state may be broken and the parameter utilization will decrease. E-ELAN uses expand, shuffle and merge cardinality to continuously enhance the learning ability of the network without breaking the original gradient path. In terms of architecture, E-ELAN changes only the computational block, while the transition layer is left completely unchanged. It uses group convolution to expand the channels and cardinality of the computational blocks and applies the same group parameter and channel multiplier to all computational blocks of a computational layer. Then, according to the set group parameter g, the feature maps calculated by each computational block are shuffled into g groups and concatenated. At this point, the number of channels in each group of feature maps is the same as the number of channels in the original architecture. Finally, the g groups of feature maps are added to perform merge cardinality. In addition to maintaining the original ELAN design architecture, E-ELAN can also guide different groups of computational blocks to learn more diverse features.
Neck: Based on the FPN feature pyramid network, this algorithm integrates a dynamic attention head to form the neck network. The dynamic attention head consists of three parts: scale-aware attention, spatially aware attention and task-aware attention. It is described in detail in the next section.
Task head: The task head consists of an object detection head, a lane line segmentation head and a driveable area segmentation head.
Fig. 2 DAFPN-YOLO structure chart
Fig. 3 Structure comparison map
2.4 Dynamic Attention
The dynamic attention head is defined as follows. Given a feature tensor $\tau \in R^{L \times S \times C}$, its self-attention is calculated by formula 5:

$$ W(\tau) = \pi(\tau) \cdot \tau \tag{5} $$

In the formula, $\pi(\cdot)$ is an attention function; a common implementation uses a fully connected layer. Learning the attention function directly over all dimensions is computationally costly and cannot be realized in practice because of the high dimensionality of the tensor. Therefore, the algorithm introduces the dynamic attention head, which decomposes the attention function into three sequential attention functions, each focusing on only one perspective, as shown in formula 6:

$$ W(\tau) = \pi_C(\pi_S(\pi_L(\tau) \cdot \tau) \cdot \tau) \cdot \tau \tag{6} $$

In the formula, $\pi_L$, $\pi_S$ and $\pi_C$ are three different attention functions, applied to scale perception, space perception and task perception, respectively.
Scale-aware attention: Scale-aware attention dynamically fuses features of different scales according to their semantic importance. It is calculated by formula 7:

$$ \pi_L(\tau) \cdot \tau = \sigma\left( f\left( \frac{1}{SC} \sum_{S,C} \tau \right) \right) \cdot \tau \tag{7} $$

In the formula, $f(\cdot)$ is a linear function approximated by a 1 × 1 convolution, and $\sigma(x)$ is the Hard-Sigmoid function.
Spatially aware attention: The spatially aware attention module fuses features so as to focus on the discriminative regions that exist across both spatial locations and feature levels. Considering the high dimensionality of the space, the algorithm splits this module into two steps: first, deformable convolution is used to make the attention learning sparse; then, features at the same spatial location are aggregated across levels. Spatially aware attention is calculated by formula 8:

$$ \pi_S(\tau) \cdot \tau = \frac{1}{L} \sum_{l=1}^{L} \sum_{k=1}^{K} w_{l,k} \cdot \tau(l; p_k + \Delta p_k; c) \cdot \Delta m_k \tag{8} $$

In the formula, K is the number of sparsely sampled locations, $p_k + \Delta p_k$ is the location $p_k$ shifted to a discriminative region by the self-learned spatial offset $\Delta p_k$, and $\Delta m_k$ is the self-learned importance scalar at location $p_k$. Both quantities are learned from the input features at the intermediate level of $\tau$.
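The scale-aware step in formula 7 can be illustrated with a small NumPy sketch. It is not the authors' implementation: the 1 × 1 convolution f is stood in for by a scalar affine map with hypothetical parameters `w` and `b`, and the Hard-Sigmoid is taken as clip((x + 1) / 2, 0, 1), which is one common definition and an assumption here.

```python
import numpy as np


def hard_sigmoid(x):
    """Piecewise-linear gate in [0, 1]; this particular form is an assumption."""
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)


def scale_aware_attention(feat, w=1.0, b=0.0):
    """Apply formula 7 to a feature tensor of shape (L, S, C).

    Each level l is pooled over its S x C entries, mapped by the stand-in
    linear function f(x) = w * x + b, gated, and used to rescale that level.
    """
    pooled = feat.mean(axis=(1, 2), keepdims=True)   # shape (L, 1, 1)
    gate = hard_sigmoid(w * pooled + b)              # one weight per level
    return gate * feat                               # broadcast over S and C
```

Each pyramid level receives a single gate value, so levels whose pooled response is low are suppressed as a whole while the spatial and channel layout inside a level is left untouched.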
Fig. 4 Dynamic attention structure chart
Task-aware attention: Task-aware attention enables joint learning and generalizes across different representations of objects. It dynamically switches feature channels on or off to support different tasks when learning feature representations. Task-aware attention is calculated by formula 9:

$$ \pi_C(\tau) \cdot \tau = \max\left( \alpha^1(\tau) \cdot \tau_c + \beta^1(\tau),\ \alpha^2(\tau) \cdot \tau_c + \beta^2(\tau) \right) \tag{9} $$

In the formula, $\tau_c$ is the feature slice of the c-th channel, and $[\alpha^1, \alpha^2, \beta^1, \beta^2]^T = \theta(\cdot)$ is a hyper function that learns to control the activation thresholds. The implementation of $\theta(\cdot)$ is similar to Dynamic ReLU: task-aware attention first performs global average pooling over the L × S dimensions to reduce the dimensionality, and then uses two fully connected layers and one normalization layer. Finally, a shifted sigmoid function is applied to normalize the output to the shifted range. The complete dynamic attention module is shown in Fig. 4. It relearns the features output by FPN to obtain different feature representations adapted to the different downstream tasks.
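The channel-wise maximum in formula 9 can be sketched as follows. In the described module the coefficients come from the learned hyper function θ(·); here they are passed in as fixed per-channel arrays, which is a simplifying assumption, and the function name is hypothetical.

```python
import numpy as np


def task_aware_attention(feat, alpha1, beta1, alpha2, beta2):
    """Formula 9 on a tensor of shape (L, S, C) with per-channel coefficients of shape (C,).

    The coefficients are fixed inputs here; in the described module they would be
    predicted from the features by the hyper function theta.
    """
    branch1 = alpha1 * feat + beta1   # broadcasts over the channel axis
    branch2 = alpha2 * feat + beta2
    return np.maximum(branch1, branch2)
```

Setting `alpha1 = 1` and `beta1 = alpha2 = beta2 = 0` reduces the expression to an ordinary ReLU, which shows how the learned coefficients generalize a fixed activation and can switch individual channels on or off.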
2.5 Multi-task Loss Function
In DAFPN-YOLO, the loss function of the detection branch is retained and remains the weighted sum of the classification loss, the object loss and the bounding box loss. Focal Loss is again used for $L_{class}$ and $L_{obj}$ to deal with sample imbalance. $L_{class}$ penalizes classification errors, while $L_{obj}$ measures prediction confidence. $L_{box}$ reflects the overlap rate, aspect ratio and scale similarity distance between the predicted result and the ground truth. Setting the loss weights properly is important for the quality of multi-task detection. For driveable area segmentation, cross-entropy loss is used, which aims to minimize the classification error between the network output and the ground-truth label. For lane line segmentation, Focal Loss is used instead of cross-entropy loss. For difficult classification tasks such as lane line detection, Focal Loss focuses the model on difficult samples and thus improves the detection accuracy. The final loss is as follows:

$$ L_{all} = \gamma_1 (\alpha_1 L_{class} + \alpha_2 L_{obj} + \alpha_3 L_{box}) + \gamma_2 L_{ce} + \gamma_3 L_{FocalLoss} \tag{10} $$
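The effect of Focal Loss and the combination in formula 10 can be illustrated with the sketch below. It uses the standard binary focal loss form; the default focusing parameter of 2.0 and class-balance weight of 0.25 are the commonly used values and are assumptions, since this paper does not state its settings. The focusing parameter is named `focusing` to avoid confusion with the loss weights γ1 to γ3, and the default weights of 1.0 are placeholders.

```python
import numpy as np


def focal_loss(y, p, focusing=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss: -alpha_t * (1 - p_t)**focusing * log(p_t)."""
    p = np.clip(p, eps, 1.0 - eps)
    p_t = np.where(y == 1, p, 1.0 - p)              # probability of the true class
    alpha_t = np.where(y == 1, alpha, 1.0 - alpha)  # class-balance weight
    return float(np.mean(-alpha_t * (1.0 - p_t) ** focusing * np.log(p_t)))


def combined_loss(l_class, l_obj, l_box, l_ce, l_focal,
                  alphas=(1.0, 1.0, 1.0), gammas=(1.0, 1.0, 1.0)):
    """Formula 10: weighted detection loss plus the two segmentation losses."""
    a1, a2, a3 = alphas
    g1, g2, g3 = gammas
    return g1 * (a1 * l_class + a2 * l_obj + a3 * l_box) + g2 * l_ce + g3 * l_focal
```

Because of the factor (1 − p_t) raised to the focusing power, a confidently correct prediction contributes far less than a confidently wrong one, which is the behavior the text relies on for sparse lane line pixels.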
3 Experimental Results and Analysis of Perceptual Algorithms
3.1 Traffic Object Detection Experiment Results
This section compares the detection results of four models on the BDD100K dataset. Recall and mAP50 are used to evaluate the detection accuracy, and Table 1 lists the results. The detection accuracy of the proposed model is higher than that of Faster R-CNN and YOLOv5s and slightly lower than that of Swin Transformer, but the proposed model can run in real time. As can be seen from Fig. 5, the proposed model achieves excellent detection performance in both daytime and nighttime conditions, with almost no missed detections. The improved network completes the traffic object detection task better and maintains its detection accuracy for small distant objects and for scenes with many traffic objects, indicating that it can cope with traffic sign detection in complex scenarios. The algorithm effectively handles small objects and large numbers of objects, which shows that the improvement is reliable, effective and feasible.

Table 1 Comparison of object detection results
Algorithms          mAP50 (%)   Recall (%)   Speed (fps)
Faster RCNN         68.4        89.4         9.3
YOLOv5s             64.9        333          444
Swin transformer    77.2        86.8         8.8
DAFPN-YOLO          76.5        89.2         41
Fig. 5 Comparison of detection results, from left to right: a Faster RCNN, b YOLOv5s, c Swin transformer, d DAFPN-YOLO
3.2 Driveable Area Segmentation Experiment Results
The visual results of driveable area segmentation are shown in Fig. 6. Both the "area/drivable" and "area/alternative" classes in the BDD100K dataset are treated as "driveable area" without distinction [17]. The model only needs to distinguish between the driveable area and the background in the image. mIoU is used to evaluate the segmentation performance of the different models, and the results are shown in Table 2. The proposed model outperforms FCN and PSPNet by 14.9 and 3.6 percentage points, respectively [18, 19].
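For reference, the mIoU metric used above can be computed as in the following sketch for the two-class case of driveable area versus background. The function name and the convention of skipping classes absent from both prediction and label are assumptions; evaluation toolkits differ on that detail.

```python
import numpy as np


def mean_iou(pred, target, num_classes=2):
    """Mean intersection-over-union across classes for integer label maps."""
    ious = []
    for c in range(num_classes):
        p = pred == c
        t = target == c
        union = np.sum(p | t)
        if union == 0:
            continue  # class absent from both maps; skipped by assumption
        ious.append(np.sum(p & t) / union)
    return float(np.mean(ious))
```

For a 2 × 2 prediction `[[1, 1], [0, 0]]` against the label `[[1, 0], [0, 0]]`, the background IoU is 2/3 and the driveable-area IoU is 1/2, giving an mIoU of 7/12.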
Fig. 6 Comparison map of driveable area segmentation, from left to right: a FCN, b PSPNet, c DAFPN-YOLO

Table 2 Comparison of driveable area segmentation results
Algorithms     mIoU
FCN            78.3
PSPNet         89.6
DAFPN-YOLO     93.2

3.3 Panoramic Perception Visualization Results
Figures 7 and 8 show a visual comparison of YOLOP, Hybridnet [20], and the proposed DAFPN-YOLO on the BDD100K dataset. Figure 7 shows the daytime results. The left column shows three scenarios for YOLOP. In the first scenario, some driveable areas are segmented incorrectly and some are missed; in the second scenario, there are redundant detection boxes for small objects and missed driveable area segmentation; in the third scenario, lane lines are missed. The middle column shows three scenarios for Hybridnet. In the first scenario, the lane line predictions are discontinuous; in the second scenario, small vehicles are detected repeatedly and lane lines are missed; in the third scenario, there are some false vehicle and lane line detections. The right column shows the results of the proposed DAFPN-YOLO, which performs better in all of these scenarios. Figure 8 shows the nighttime results. The left column shows the results of YOLOP in three scenarios: the first has false detections and missed driveable area segmentation, the second has deviations in lane line detection, and the third has missed lane line detection and missed driveable area segmentation. The middle column shows the results of Hybridnet. In the first scenario, the detection boxes of some vehicles are missing and there are some false detection boxes. In the second scenario, there are some false and redundant detection boxes. In the third scenario, driveable area segmentation is missed and there are redundant vehicle detection boxes. The right column shows the results of the proposed DAFPN-YOLO, which overcomes these issues and performs better.
Fig. 7 Daytime panoramic perception results, from left to right: a YOLOP, b Hybridnet, c DAFPN-YOLO
Fig. 8 Nighttime panoramic perception results, from left to right: a YOLOP, b Hybridnet, c DAFPN-YOLO
4 Conclusion
Building on YOLO, this paper uses multi-task learning to perform traffic object detection, driveable area segmentation and lane line segmentation for unmanned driving. Dynamic attention is then used to achieve scale awareness, spatial awareness and task awareness, improving the performance of the model on all three tasks. After
adjusting the multi-task loss function, FocalLoss is introduced to improve the performance of the model on class-imbalanced data. Experiments verify the good performance of DAFPN-YOLO on the unmanned panoramic perception task and show that it improves the accuracy of the perception algorithm for unmanned vehicles.
Acknowledgements This work was supported by the National Natural Science Foundation of China under Grant 62263005, the Guangxi Natural Science Foundation under Grant 2020GXNSFDA238029, the Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region under Grant 2022GXZDSY004, the Innovation Project of Guangxi Graduate Education under Grant YCSW2023298, and the Innovation Project of GUET Graduate Education under Grant 2023YCXS124.
References
1. Kim, D.S., Arsalan, M., Owais, M., Park, K.R.: ESSN: enhanced semantic segmentation network by residual concatenation of feature maps. IEEE Access 8, 21363–21379 (2020)
2. Zhang, Y., Yang, Q.: A survey on multi-task learning. IEEE Trans. Knowl. Data Eng. 34(12), 5586–5609 (2021)
3. Kokkinos, I.: Ubernet: training a universal convolutional neural network for low-, mid-, and high-level vision using diverse datasets and limited memory. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 6129–6138
4. Liu, Z., Luo, P., Wang, X., Tang, X.: Deep learning face attributes in the wild. In: Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 3730–3738
5. Kendall, A., Gal, Y., Cipolla, R.: Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 7482–7491
6. Vandenhende, S., Georgoulis, S., Van Gansbeke, W., Proesmans, M., Dai, D., Van Gool, L.: Multi-task learning for dense prediction tasks: a survey. IEEE Trans. Pattern Anal. Mach. Intell. 44(7), 3614–3633 (2021)
7. Zhen, M., Wang, J., Zhou, L., Li, S., Shen, T., Shang, J., Fang, T., Quan, L.: Joint semantic segmentation and boundary detection using iterative pyramid contexts. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 13666–13675
8. Huang, Z., Zhang, J., Shan, H.: When age-invariant face recognition meets face age synthesis: a multi-task learning framework. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 7282–7291
9. Lin, Y., Zheng, L., Zheng, Z., Wu, Y., Hu, Z., Yan, C., Yang, Y.: Improving person re-identification by attribute and identity learning. Pattern Recogn. 95, 151–161 (2019)
10. Schumann, A., Stiefelhagen, R.: Person re-identification by deep learning attribute-complementary information. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017, pp. 20–28
11. Ruder, S.: An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 (2017)
12. Dhar, P., Kumar, A., Kaplan, K., Gupta, K., Ranjan, R., Chellappa, R.: Eyepad++: a distillation-based approach for joint eye authentication and presentation attack detection using periocular images. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, pp. 20218–20227
13. Hong, Y., Niu, L., Zhang, J., Zhang, L.: Beyond without forgetting: multi-task learning for classification with disjoint datasets. In: 2020 IEEE International Conference on Multimedia and Expo (ICME), pp. 1–6. IEEE (2020)
14. Huang, C., Tang, H., Fan, W., Xiao, Y., Hao, D., Qian, Z., Terzopoulos, D., et al.: Partly supervised multi-task learning. In: 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 769–774. IEEE (2020)
15. Kim, D.-J., Choi, J., Oh, T.-H., Yoon, Y., Kweon, I.S.: Disjoint multi-task learning between heterogeneous human-centric tasks. In: 2018 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 1699–1708. IEEE (2018)
16. Wang, Y., Tsai, Y.-H., Hung, W.-C., Ding, W., Liu, S., Yang, M.-H.: Semi-supervised multi-task learning for semantics and depth. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2022, pp. 2505–2514
17. Yu, F., Chen, H., Wang, X., Xian, W., Chen, Y., Liu, F., Madhavan, V., Darrell, T.: Bdd100k: a diverse driving dataset for heterogeneous multitask learning. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, pp. 2636–2645
18. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015, pp. 3431–3440
19. Zhao, H., Shi, J., Qi, X., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017, pp. 2881–2890
20. Vu, D., Ngo, B., Phan, H.: Hybridnets: end-to-end perception network. arXiv preprint arXiv:2203.09035 (2022)
Development of Electromagnetic Model for Linear Force Motor in Direct-Drive Hydraulic Servo Actuators
Zhongrui Zhao, Bing Chu, Zhenyu Liu, and Zhiming Yan
Abstract The direct drive hydraulic servo actuation system is widely used in hydraulic systems, particularly in the aerospace industry. However, the modeling and simulation of linear force motors, which are the core electromechanical components of direct drive hydraulic servo actuators, still need improvement. In this paper, the principle of the linear force motor is analyzed, and its magnetic circuit and mathematical model are established. A simulation model of the linear force motor is then built in AMESim, and the basis for setting its parameters is explained. Finally, the model is applied and verified within a model of the direct drive hydraulic servo actuator system. The results show that the developed model rests on sound principles, balances complexity and accuracy, and reflects the dynamic and load characteristics of the motor. It is suitable for simulation and performance evaluation of the whole direct drive hydraulic servo actuator system.
Keywords Hydraulic servo actuators · Direct drive valve (DDV) · Linear force motor · Electromagnetism · AMESim
Z. Zhao · B. Chu · Z. Liu
Shenyang Aircraft Design and Research Institute, AVIC, Shenyang 110035, China
Z. Yan (corresponding author)
School of Mechanical Engineering and Automation, Beihang University, Beijing 100191, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_15
1 Introduction
The hydraulic servo actuation system features fast response, high power density and strong load capacity, and has been widely used in military and industrial fields such as aerospace. The direct drive valve (DDV) is a new type of electro-hydraulic servo valve without a hydraulic pilot stage. The principle of the direct-drive valve-controlled electro-hydraulic actuation system is shown in Fig. 1. The linear force motor directly drives the movement of the valve core, and the position of the hydraulic cylinder
Fig. 1 Schematic of direct-drive hydraulic servo actuator
is controlled through the liquid resistance of each valve port. The DDV has the advantages of low leakage, small size and strong contamination resistance, and its scope of application is gradually expanding [1]. The actuator involves multiple energy domains, such as mechanical, electromagnetic and hydraulic, which interact with each other and realize its function collectively. Given the precise structure and high price of the actuator, simulation is an efficient and cost-effective means of evaluating its performance during design and analysis [2]. For direct drive actuation systems, the research in [3] treated the servo valve only as a second-order system, without considering the specific structure and principle of the valve. References [4, 5] directly used the model EMLTR01 from the electromagnetic library of AMESim, but that model is actually a proportional electromagnet, which can only provide force in one direction. Other researchers determined the displacement of the valve core directly from the instruction [6], which ignores the movement and load characteristics of the force motor, especially under the influence of the hydraulic flow force on the valve core. Some researchers implemented the model in Simulink based on theoretical analysis and co-simulated it with the valve and cylinder model in AMESim [7], but this greatly increases complexity, which is unnecessary when complex control algorithms or exact structural designs are not involved. The research on direct drive hydraulic servo actuation systems mentioned above mainly uses AMESim, a simulation software based on power bond graph theory that is suitable for multi-energy coupled modeling of complex systems [8]. The linear force motor is the key electromechanical component of the direct drive servo actuator system. Its dynamic and load characteristics directly determine the overall performance of the system.
At present, the design of new structures and the performance verification of linear force motors mostly rely on the finite element method, whose results must be imported into AMESim through co-simulation or data tables [9]. The former requires a large amount of computation, and the latter can hardly reflect the correlation between the characteristics and the mechanical structure of the force motor. Based on the above discussion, the current models of the linear force motor in the simulation of direct drive electro-hydraulic servo systems
still need to be optimized. Therefore, an electromagnetic model of the force motor is developed in AMESim and applied in the direct drive actuation system to verify its function. The developed model is based on the actual structure and working principle of the linear force motor, realizes multi-energy coupling, balances computational cost and precision, and is suitable for detailed performance analysis of the direct drive electro-hydraulic servo system.
2 Structure and Operating Principle of Linear Force Motor
The structures and magnetic circuits of linear force motors are complex and diverse, and new forms continue to be proposed, but their mechanisms are basically the same. Figure 2 shows a typical structure of a linear force motor by MOOG, which consists of permanent magnets, a yoke, an armature component, a centering spring and a control coil. Two symmetrical air gaps are formed between the armature and the yoke. The motor works by changing the magnetic flux in the air gaps through the coil current, which in turn controls the electromagnetic attraction that drives the armature. The armature component is connected to the centering spring, which can adjust the zero position of the linear force motor. In the force motor, the reluctance of the permanent magnet is much greater than that of the air gaps, and the reluctance of the air gaps is in turn significantly larger than that of the soft magnetic material. Therefore, only the reluctances of the two air gaps and the magnetomotive forces of the permanent magnet and the control coil are considered. Based on these principles and the mechanical structure of the force motor in Fig. 2, the corresponding magnetic circuit is established as shown in Fig. 3, where $N_c i_c$ is the magnetomotive force generated by the current of the control coil, $M_0$ is the magnetomotive force of
Fig. 2 Typical mechanical structure of linear force motor [6]
Fig. 3 Magnetic circuit schematic of force motor
the permanent magnet, $R_1$ and $R_2$ are the reluctances of the two air gaps, and $\Phi_1$ and $\Phi_2$ are the magnetic fluxes of the air gaps. The signs of the parameters follow the arrows in the figure. Without excitation current, the armature is at the zero position, and the lengths and permanent magnetic fluxes of the two working air gaps are equal. The armature is then balanced and the output force is zero. When the control coil is energized, it generates a control magnetic flux in the working air gaps. The combined action of the permanent and control magnetic fluxes generates a force that pushes the armature. The output force is proportional to the input current, and the movement of the linear force motor is bidirectional.
3 Mathematical Model of Linear Force Motor
The working principle of the linear force motor is discussed in Sect. 2. However, that qualitative description is insufficient to express all the characteristics of the linear force motor. In this section, the mathematical model of the linear force motor is established, divided into the electromagnetic process and the mechanical process.
3.1 Magnetic Flux of Air Gaps
Referring to Fig. 3, most of the permanent flux generated by the permanent magnet and of the control flux generated by the coil current passes through the air gaps. Ideally, all the flux is assumed to pass through the air gaps, so the magnetic flux of air gap 1 is:
$$\Phi_1 = \frac{M_0}{R_1} + \frac{N_c i_c}{2R_g} \tag{1}$$
where $N_c$ is the number of turns of the control coil, $i_c$ is the control current, and $R_g$ is the reluctance of each working air gap when the armature is in the middle position.
Similarly, the flux of air gap 2 is:
$$\Phi_2 = \frac{M_0}{R_2} - \frac{N_c i_c}{2R_g} \tag{2}$$
The reluctances of the two air gaps are:
$$R_1 = \frac{g - x}{\mu_0 A_g} = R_g\left(1 - \frac{x}{g}\right) \tag{3}$$
$$R_2 = \frac{g + x}{\mu_0 A_g} = R_g\left(1 + \frac{x}{g}\right) \tag{4}$$
where $\mu_0$ is the permeability of vacuum, $A_g$ is the effective area of the magnetic pole, $g$ is the initial length of the air gap, and $x$ is the offset of the armature. Denoting the permanent magnetic flux as $\Phi_g = \frac{M_0}{R_g}$ and the control flux as $\Phi_c = \frac{N_c i_c}{2R_g}$, the magnetic fluxes of the two air gaps are:
$$\Phi_1 = \frac{\Phi_g}{1 - \frac{x}{g}} + \Phi_c \tag{5}$$
$$\Phi_2 = \frac{\Phi_g}{1 + \frac{x}{g}} - \Phi_c \tag{6}$$
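As a numerical check of Eqs. (1) to (6), the following minimal Python sketch evaluates the air-gap fluxes once from the reluctances and once from the normalized form. The function names and all parameter values (magnetomotive force of the magnet, number of turns, pole area, gap length) are illustrative assumptions, not values taken from this paper.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of vacuum in H/m


def fluxes_from_reluctance(m0, nc, ic, area, g, x):
    """Air-gap fluxes (phi1, phi2) from Eqs. (1)-(4)."""
    rg = g / (MU0 * area)          # reluctance with the armature centred
    r1 = (g - x) / (MU0 * area)    # Eq. (3)
    r2 = (g + x) / (MU0 * area)    # Eq. (4)
    phi_c = nc * ic / (2.0 * rg)   # control flux
    return m0 / r1 + phi_c, m0 / r2 - phi_c  # Eqs. (1) and (2)


def fluxes_normalized(m0, nc, ic, area, g, x):
    """Same fluxes written with phi_g and phi_c, Eqs. (5)-(6)."""
    rg = g / (MU0 * area)
    phi_g = m0 / rg                # permanent-magnet flux
    phi_c = nc * ic / (2.0 * rg)   # control flux
    u = x / g                      # relative armature offset
    return phi_g / (1.0 - u) + phi_c, phi_g / (1.0 + u) - phi_c


if __name__ == "__main__":
    # Hypothetical parameters: 500 A-turn magnet, 400 turns, 0.5 A,
    # 1 cm^2 pole area, 1 mm gap, 0.1 mm armature offset.
    print(fluxes_from_reluctance(500.0, 400, 0.5, 1e-4, 1e-3, 1e-4))
    print(fluxes_normalized(500.0, 400, 0.5, 1e-4, 1e-3, 1e-4))
```

Both functions should return the same pair of values, and with zero current and zero offset the two fluxes coincide, which matches the balanced zero position described in Sect. 2.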
3.2 Output Force
The electromagnetic attraction is determined by:
$$F = \frac{\Phi^2}{2\mu_0 A_g} \tag{7}$$
where $\Phi$ is the magnetic flux through the surface of the magnetic pole. The output force of the linear force motor is the difference between the electromagnetic attractions of the two air gaps:
$$F_o = \frac{1}{2\mu_0 A_g}\left(\Phi_1^2 - \Phi_2^2\right) = K_t i_c + K_m x \tag{8}$$
where $K_t$ is the force coefficient and $K_m$ is the magnetic stiffness:
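$$K_t = \frac{\Phi_g N_c}{g\left[1 - \left(\frac{x}{g}\right)^2\right]} \tag{9}$$
$$K_m = \frac{2}{\mu_0 A_g g}\left[\frac{\Phi_g}{1 - \left(\frac{x}{g}\right)^2}\right]^2 \tag{10}$$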
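The step from the flux expressions to the force coefficient and magnetic stiffness can be sketched as follows. The sketch assumes $\Phi_1 = \Phi_g/(1 - x/g) + \Phi_c$ and $\Phi_2 = \Phi_g/(1 + x/g) - \Phi_c$, as implied by Eqs. (1) to (4), and uses the shorthand $u = x/g$, which is introduced here only for brevity.

```latex
\begin{aligned}
\Phi_1 + \Phi_2 &= \frac{2\Phi_g}{1-u^2}, \qquad
\Phi_1 - \Phi_2 = \frac{2\Phi_g u}{1-u^2} + 2\Phi_c, \\
F_o &= \frac{(\Phi_1+\Phi_2)(\Phi_1-\Phi_2)}{2\mu_0 A_g}
     = \frac{2\Phi_g\Phi_c}{\mu_0 A_g\,(1-u^2)}
     + \frac{2\Phi_g^2\,u}{\mu_0 A_g\,(1-u^2)^2}, \\
\Phi_c &= \frac{N_c i_c}{2R_g} = \frac{\mu_0 A_g N_c i_c}{2g}
\;\Longrightarrow\;
F_o = \underbrace{\frac{\Phi_g N_c}{g\,(1-u^2)}}_{K_t}\, i_c
    + \underbrace{\frac{2}{\mu_0 A_g g}\left[\frac{\Phi_g}{1-u^2}\right]^2}_{K_m}\, x .
\end{aligned}
```

The last line uses $R_g = g/(\mu_0 A_g)$, which follows from Eq. (3) at $x = 0$.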
When $x \ll g$, $1 - \left(\frac{x}{g}\right)^2 \approx 1$. In this case, the changes in the force coefficient and magnetic stiffness caused by the movement of the armature can be ignored, and both parameters can be treated as constants:
$$K_{t0} = \frac{\Phi_g N_c}{g} \tag{11}$$
$$K_{m0} = \frac{2\Phi_g^2}{\mu_0 A_g g} \tag{12}$$
Fundamentally, the linear force motor is an electromechanical conversion element that provides rectilinear motion driven by the difference in electromagnetic attraction between the two air gaps. Its characteristic is approximately linear in the current, but the displacement of the armature also affects the output force.
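The identity in Eq. (8) and the effect of the small-displacement approximation can be checked numerically. The sketch below assumes the flux expressions $\Phi_1 = \Phi_g/(1 - x/g) + \Phi_c$ and $\Phi_2 = \Phi_g/(1 + x/g) - \Phi_c$; the function names and the parameter values are hypothetical and serve only as an illustration.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of vacuum in H/m


def force_from_fluxes(phi_g, nc, ic, area, g, x):
    """Output force from the flux difference, left-hand form of Eq. (8)."""
    rg = g / (MU0 * area)
    phi_c = nc * ic / (2.0 * rg)
    u = x / g
    phi1 = phi_g / (1.0 - u) + phi_c
    phi2 = phi_g / (1.0 + u) - phi_c
    return (phi1 ** 2 - phi2 ** 2) / (2.0 * MU0 * area)


def coefficients(phi_g, nc, area, g, x=0.0):
    """Force coefficient Kt and magnetic stiffness Km, Eqs. (9)-(10).

    With x = 0 these reduce to Kt0 and Km0 of Eqs. (11)-(12).
    """
    s = 1.0 - (x / g) ** 2
    kt = phi_g * nc / (g * s)
    km = 2.0 / (MU0 * area * g) * (phi_g / s) ** 2
    return kt, km


if __name__ == "__main__":
    # Hypothetical values: 60 uWb magnet flux, 400 turns, 1 cm^2 pole, 1 mm gap.
    phi_g, nc, area, g = 6e-5, 400, 1e-4, 1e-3
    ic, x = 0.5, 1e-4
    kt, km = coefficients(phi_g, nc, area, g, x)
    print(force_from_fluxes(phi_g, nc, ic, area, g, x), kt * ic + km * x)
```

For $x = g/10$, the factor $1 - (x/g)^2$ equals 0.99, so $K_t$ exceeds $K_{t0}$ by about 1% and $K_m$ exceeds $K_{m0}$ by about 2%, which is why treating both as constants is reasonable for small armature offsets.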
4 Modeling of Force Motor in AMESim
Based on the magnetic circuit shown in Fig. 3, the model of the linear force motor is established as shown in Fig. 4. The magnetic circuit of the model is composed of the permanent magnet, the air gaps and the coil. For the mechanical part, the inertia of the armature, the centering spring and the displacement limit are considered. To improve linearity, a torque motor is generally designed with $\frac{x}{g} < \frac{1}{3}$ [10]. Similarly, Eqs. (9) and (10) show that, apart from $\left(\frac{x}{g}\right)^2$, the other parts of the expressions are physical constants and structural parameters of the system. Therefore, the force motor can be kept in an approximately linear interval by guaranteeing $\left(\frac{x}{g}\right)^2 \ll \frac{1}{10}$. In this model, it is set as $x \le \frac{1}{10} g$. In addition, Eq. (8) shows that the output of the force motor depends not only on the current but also on the displacement of the armature. According to Eq. (10), the magnetic stiffness is positive. This means that as the displacement of the force motor increases, the driving force tends to increase correspondingly, which is equivalent to introducing positive feedback and inherent instability into the system. To study this effect, further simulation is carried out with the above model. The excitation current of the coil is set to 0.5 A. The output of the linear force motor without the spring then varies with the armature position as shown in Fig. 5. It can
Fig. 4 Model of linear force motor in AMESim
Fig. 5 Relation between output force and armature displacement
be seen that the displacement of the armature has a significant impact on the output of the linear force motor: the driving force increases as the armature displacement increases. As a result, once the armature leaves the zero position, it tends to move towards the maximum stroke. Therefore, a spring is needed to counteract the influence of the armature's displacement on the output force, so that the output of the force motor depends only linearly on the current. Figure 6 shows the relationship between the output of the linear force motor and the excitation current when the armature displacement is fixed at zero. When the excitation current varies near zero, the output force has a good linear relationship with the current. As the current increases, the saturation of the magnetic flux gradually becomes apparent. Under normal conditions, the maximum coil current is about 1 A, so the force motor operates within its linear range.
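The role of the centering spring can be illustrated with a small static-balance sketch. It assumes a linear spring of stiffness `ks` (a symbol not used in the original), the constant coefficients of Eqs. (11) and (12), and no external load, so that the net force on the armature is $K_{t0} i_c + (K_{m0} - k_s)x$. The numerical values are hypothetical.

```python
def net_stiffness(ks, km0):
    """Effective stiffness seen by the armature: spring minus magnetic stiffness."""
    return ks - km0


def static_displacement(kt0, km0, ks, ic):
    """Armature offset at which Kt0*ic + Km0*x - ks*x = 0.

    Raises ValueError when the spring does not outweigh the magnetic
    stiffness, because the zero position is then not a stable equilibrium.
    """
    k_net = net_stiffness(ks, km0)
    if k_net <= 0.0:
        raise ValueError("spring stiffness must exceed magnetic stiffness")
    return kt0 * ic / k_net


if __name__ == "__main__":
    # Hypothetical coefficients: Kt0 = 10 N/A, Km0 = 2e4 N/m, ks = 6e4 N/m.
    print(static_displacement(10.0, 2.0e4, 6.0e4, 0.5))
```

Under these assumptions the displacement is proportional to the current only while `ks` is larger than the magnetic stiffness, which is the behaviour the spring is meant to secure.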
Fig. 6 Relation between output force and coil current
Fig. 7 Model of direct-drive hydraulic servo actuator in AMESim
5 Application and Verification
Through the above research, a model of the linear force motor was established and debugged. The model was then applied to the simulation of the direct-drive hydraulic servo actuator, as shown in Fig. 7. The slide valve is driven directly by the linear force motor, and the displacement of the actuator is in turn controlled by the DDV. Figure 8 shows the response of the model to a sinusoidal instruction with an amplitude of 0.0035 m and a frequency of 1 Hz. The result shows that the direct drive valve actuator model responds quickly and accurately. Figure 9 shows the output of the linear force motor and the hydraulic jet force acting on the valve core. The results show that the external load of the force motor is mainly the steady-state fluid force acting on the valve core. The load is transferred to the force motor and affects its output characteristics, so the load characteristics of the linear force motor can be simulated.
Fig. 8 Step instruction shift response under sinusoidal signal
Fig. 9 Contrast of output force of linear motor and jet force of slide valve
6 Conclusion
1. The mechanical structure of the linear force motor is analyzed theoretically, and the magnetic circuit and mathematical model of the system are established. The working principle, characteristics and ideal working conditions of the force motor system are described.
2. Based on the theoretical analysis, the electromagnetic model of the force motor is established and debugged in AMESim, and the basis of parameter selection is explained. The established model reflects the principle and characteristics of the linear force motor, such as dynamic response and load characteristics, and is suitable for simulation and performance evaluation of the whole direct drive hydraulic servo actuation system.
References
1. Teng, X., Huang, Q., et al.: Diffusion of aqueous solutions of ionic, zwitterionic, and polar solutes. J. Chem. Phys. 148(22), 222827 (2018)
2. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018)
3. Mao, X., Nie, S.W., Li, G.Q.: Research on AMESim simulation and control for direct drive electro-hydraulic servo die forging hammer control system. Forg. Stamp. Technol. 08, 141–149 (2022). (in Chinese)
4. Zheng, Y.F.: Trouble-Shooting and Performance Simulation on Electro-hydraulic Servo Valve. Master Thesis, Jiangsu University (2019). (in Chinese)
5. Ye, X.H., Zheng, Y., Hou, L.Q.: Simulation of the characteristics of rolling mill force motor valve and failure simulation research. Metall. Power 10, 7–11 (2019). (in Chinese)
6. Pu, X.Q., Lei, J.: Static performance simulation of force motor valve based on AMESim. Mech. Eng. 01, 157–158 (2017). (in Chinese)
7. Xia, L.Q., Niu, S.Y.: Research of adaptive friction compensation for direct drive valve. J. Syst. Simul. 12, 3327–3329, 3335 (2017). (in Chinese)
8. Meng, B., Xu, H., Liu, B., et al.: Novel magnetic circuit topology of linear force motor for high energy utilization of permanent magnet: analytical modelling and experiment. Actuators 10(2), 32 (2021)
9. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016)
10. Wang, C.X.: Hydraulic Control System. China Machine Press, Beijing (1999). (in Chinese)
Analysis on Structure and Kinematics for a Novel Decoupled Rope-Driven Manipulator
Yaxing Guo, Kui Huang, Jinjun Zhang, Jigui Zheng, Guizhen Kong, and Longfei Jia
Abstract The rope-driven hyper-redundant manipulator moves flexibly, can avoid obstacles and optimize multiple objectives, and is widely used. However, traditional rope-driven manipulators have two main problems: the structural size is fixed, so the overall position of the manipulator cannot be adjusted; and the lengths of the ropes driving different joints are coupled rather than completely decoupled. To address these problems, this paper proposes a novel rope-driven hyper-redundant manipulator with links of unequal length and an adjustable overall position. In the proposed structure, the driving ropes that do not control a given joint pass through its interior, so the coupling is removed structurally, which simplifies the model and reduces the amount of calculation. The effectiveness of the scheme is proved theoretically through the derivation of formulas. Finally, simulation shows that the scheme is superior to the traditional scheme.
Keywords Structure · Kinematics · Decoupled · Rope-driven · Manipulator
Y. Guo · K. Huang · J. Zhang · J. Zheng · G. Kong · L. Jia (corresponding author)
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
K. Huang · L. Jia
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_16
1 Introduction
With the development of science and technology, countries around the world are investing huge manpower and material resources to promote the steady development of high-tech industries such as robotics, artificial intelligence, aerospace, aviation and nuclear power. Regular monitoring and maintenance are important for ensuring the safe operation of equipment such as spacecraft, large aircraft and nuclear facilities, whose complex structures leave little space for monitoring and maintenance operations. To complete monitoring and maintenance in narrow spaces, the manipulator needs several abilities, including traversing
narrow environments, avoiding obstacles, avoiding joint singularities and joint overruns, and providing a sufficiently large dexterous workspace. Traditional manipulators have poor multi-objective optimization ability and poor adaptability to narrow, obstacle-filled spaces, which makes it difficult to achieve these goals at the same time, so it is necessary to study hyper-redundant manipulators that can complete such tasks [1–3]. In 2002, OC Robotics developed a class of rope-driven hyper-redundant manipulators [4–6], which are used in industry, have a lightweight operating arm and are mainly used for inspection in small spaces. The manipulator is driven remotely by ropes, and the bending of each joint segment can be controlled to achieve three-dimensional movement by coordinating the movement of three ropes. In 2014, a snake manipulator for aircraft assembly was designed [7, 8], with a mass of 5 kg, a diameter of 90 mm and a total length of 1500 mm. The whole manipulator is composed of 5 joint segments, each consisting of four small joints. The small joints are connected by ball hinges, with a rubber sleeve added in the middle for auxiliary support, giving a total of 10 degrees of freedom. The manipulator is driven by ropes that are retracted by 15 motors, and the maximum angle is greater than 180°. The advantage of using ropes is that the driving motors can be placed at the rear, which greatly reduces the weight of the manipulator. In addition, with a camera mounted on the front end, the snake manipulator can detect and avoid obstacles with the help of a vision system. In 2017, Tang and Wang developed a hyper-redundant manipulator with 23 DOFs [9, 10] to perform kinematic modeling and study related technologies for motion planning.
In this manipulator, each section is driven by three ropes that control two degrees of freedom. Because the manipulator has many degrees of freedom, it bends well and can work in small spaces. At present, rope-driven manipulators have two main problems: (1) The structure size is fixed, and the overall position of the manipulator cannot be adjusted. (2) The lengths of the ropes driving each joint are coupled, and complete decoupling is not achieved. This paper proposes a novel rope-driven hyper-redundant manipulator with links of unequal length and an adjustable overall position. The three ropes driving a joint pass out from the inside of the sleeve, and their lengths depend on the deflection angle of that joint. The remaining driving ropes always pass through the middle of the sleeve, so a change in the deflection angle of the joint does not affect their lengths. Decoupling is thus realized structurally. The effectiveness of the scheme is proved theoretically through the derivation of formulas. Finally, simulation results show that the scheme is superior to the traditional scheme. The layout of this paper is as follows: Sect. 2 introduces the differences between the model of the novel rope-driven hyper-redundant manipulator and the traditional manipulator. Section 3 demonstrates theoretically how the coupling is removed in the scheme presented in this paper. Section 4 shows through simulation that the proposed scheme outperforms the traditional scheme. Section 5 summarizes the work of this paper.
Fig. 1 Model of the manipulator
2 Modeling
This paper proposes a novel rope-driven hyper-redundant manipulator, shown in Fig. 1, which comprises three parts: the mechanical structure, the driving mechanism and the sliding table. The mechanical structure, with a total length of more than three meters, is composed of eight rigid hollow links of unequal length connected in series. The end link carries the camera, light source and manipulator, and its length is adjustable. Adjacent links are connected by universal joints, and each link is driven by three driving ropes. The driving mechanism is composed of a motor, coupling, ball screw, slider and driving ropes: the motor rotates the lead screw, which moves the slider back and forth, thereby pulling the ropes and adjusting the attitude of each link. The sliding table adds an overall translational degree of freedom, so the whole manipulator can be moved by the slider. The proposed structure removes the coupling and offers many degrees of freedom, strong load capacity, strong adaptability to narrow, obstacle-filled spaces, and the ability to carry out multiple tasks at the same time. The structure of the hollow link is shown in Fig. 2; the elastic sleeve passes through its interior. At the right end of the link there are 24 oblique holes evenly distributed around the axis. The middle of the elastic sleeve contains the driving ropes that drive the rear links, and there are 3 openings spaced apart on the elastic sleeve. The 3 ropes driving the joint are threaded through these openings and fixed to the link being driven through the corresponding oblique holes and joints. The layout of the driving ropes in the traditional manipulator is shown in Fig. 3, from which it can be seen that the lengths of driving rope 1 and driving rope 2, which drive link 3, depend not only on the angle of joint 3 but also on the
Fig. 2 Structure of hollow link
Fig. 3 The layout of the driving ropes in the traditional manipulator
angle of joint 2. That is, in a traditional manipulator, the length of each driving rope is coupled to every joint angle it passes on the way. The layout of the driving ropes in the proposed manipulator is shown in Fig. 4, from which it can be seen that the lengths of driving rope 1 and driving rope 2, which drive link 3, depend only on the angle of joint 3 and not on the angle of joint 2 (because these two ropes pass through the hollow universal joint in the middle of joint 2 inside the elastic sleeve). That is, the lengths of the driving ropes in the proposed manipulator depend only on the angle of the joint directly in front of the driven link, with no coupling to the angles of the other joints. In this paper, the lengths of the eight links are set to 486, 486, 486, 486, 486, 206, 206 and 286 mm, which sum to the total length of 3128 mm. As shown in Fig. 5, assuming that the traditional manipulator has the same total length of 3128 mm, its minimum turning diameter is 1142 mm, whereas the minimum turning diameter of the proposed manipulator is only 602 mm, so the flexibility of its last three joints is greatly improved. When the manipulator enters a narrow space, obstacles make a certain range unreachable for its end point. The inaccessible range of the manipulator in this paper is
Fig. 4 The layout of the driving ropes in the manipulator in this paper
Fig. 5 Comparison chart of obstacle avoidance
significantly less than that of the traditional manipulator, which also confirms the flexibility of the end joints of the manipulator in this paper. In order to accurately control the position and posture of the manipulator, it is necessary to study the mapping relationship between the task space and the joint space of the manipulator and to establish a kinematic model. The links of the hyper-redundant manipulator proposed in this paper are connected in series by universal joints. In order to carry out the kinematic analysis, the corresponding D-H coordinate system is established, and then the D-H table and the forward kinematic expression of the manipulator are obtained. The coordinate system constructed is shown in Fig. 6. Link i has two degrees of freedom relative to link i − 1, the yaw angle and the pitch angle, which in the conventional coordinate system are the angle between the axes Yi and Yi−1 and the angle between the axes Zi and Zi−1, respectively. In the mathematical model established in this paper, the yaw angle αi and the pitch angle θi are used to describe the motion of link i relative to link i − 1. The coordinate system {Oi−1} can be transferred to the coordinate system {Oi} through the following procedure: translating by ai along the axis Xi−1, then rotating by αi around the axis Zi−1, and finally rotating by θi around the axis Yi. Here ai is the distance measured along Xi−1 from the origin of the coordinate system {Oi−1} to the origin of the coordinate system
Fig. 6 D-H coordinate system

Table 1 D-H parameters
Coordinate system {Oi}    ai (mm)    θi (°)    αi    di (mm)
0                         0          0         0     0
1                         0          θ1        α1    0
2                         L1         θ2        α2    0
3                         L2         θ3        α3    0
4                         L3         θ4        α4    0
5                         L4         θ5        α5    0
6                         L5         θ6        α6    0
7                         L6         θ7        α7    0
8                         L7         θ8        α8    0
{Oi}, and αi is the angle by which the coordinate system rotates from Yi−1 to Yi about Zi−1. θi is the angle by which the coordinate system rotates from Xi−1 to Xi about Yi, and di is the distance measured along Zi from Xi−1 to Xi. The Denavit-Hartenberg (D-H) parameters established according to the coordinate system of the hyper-redundant manipulator are shown in Table 1. For any configuration of the manipulator, whose links each have two degrees of freedom, the value of ai is fixed, while αi and θi change whenever the configuration changes.
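The frame-to-frame procedure above (translate along X, rotate about Z by the yaw angle, rotate about Y by the pitch angle) can be chained to sketch the forward kinematics. The following is a minimal Python illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the three link lengths in the demo are placeholders rather than the paper's eight-link geometry, and the tip is assumed to lie at the end of the last link along its X axis.

```python
import math


def link_transform(a, alpha, theta):
    """Homogeneous transform Trans(a, 0, 0) * Rot(Z, alpha) * Rot(Y, theta)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [ca * ct, -sa, ca * st, a],
        [sa * ct, ca, sa * st, 0.0],
        [-st, 0.0, ct, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]


def mat_mul(m1, m2):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(m1[i][k] * m2[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(lengths, alphas, thetas):
    """Tip position in the base frame for the given link lengths and joint angles.

    Following Table 1, frame 1 has a = 0 and frame i (i > 1) has a = L_(i-1).
    Assumption: the tip sits at distance L_last along the X axis of the last frame.
    """
    t = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for i in range(len(lengths)):
        a = 0.0 if i == 0 else lengths[i - 1]
        t = mat_mul(t, link_transform(a, alphas[i], thetas[i]))
    tip_local = [lengths[-1], 0.0, 0.0, 1.0]
    return [sum(t[r][c] * tip_local[c] for c in range(4)) for r in range(3)]


if __name__ == "__main__":
    # Placeholder three-link chain (mm); yaw the first joint by 90 degrees.
    print(forward_kinematics([486.0, 206.0, 286.0],
                             [math.pi / 2, 0.0, 0.0],
                             [0.0, 0.0, 0.0]))
```

With all angles at zero the chain lies along the base X axis, which gives a quick sanity check on the frame convention.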
3 Mapping Between Parameters
Forces and moments are not considered in the kinematic analysis; only the position and attitude of the end of the manipulator in the base coordinate system need to be solved as functions of time. The mapping relationships between the parameters under the traditional scheme and under the new scheme proposed in this paper are analyzed below.
Fig. 7 Simplified model of single joint
3.1 Mapping Relationships Under the Traditional Scheme
The kinematic analysis of the traditional rope-driven hyper-redundant manipulator is relatively complex because of coupling, which is particularly pronounced in the mapping between joint space and drive space. By analyzing the forward and inverse kinematics between the driving space and the joint space, the relationship between the two joint variables (αi, θi) and the corresponding lengths of the three driving ropes (l_ja, l_jb, l_jc) is obtained. Each joint of the traditional rope-driven manipulator is driven by 3 parallel driving ropes. One end of each driving rope is fixed to the end of the joint, and the other end is fixed to the driving mechanism in the base. The joint is rotated by the driving ropes pulling the link within the limits of the universal joint. In total, 27 − 3i (i = 1, 2, …, 7, 8) driving ropes pass through link i. Only 3 driving ropes in each link are used to drive that link; the remaining driving ropes pass through the current link to drive the links behind it. The simplified model of a single joint is shown in Fig. 7. First, the mapping relationship between the lengths of the three driving ropes (l_ja, l_jb, l_jc) and the variables (αi, θi) at one joint is analyzed. li denotes the length of the ropes between the holes Aw, Bw and Cw at joint i. The total lengths of the three driving ropes controlling link j, accumulated over the first joint and the joints after it, are l_ja, l_jb and l_jc respectively, and are calculated as follows.

\[
\begin{cases}
l_{ja} = \sum\limits_{i=1}^{j} l_{ija} + \sum\limits_{i=1}^{j-1} (L_i - 2d) = l_{1ja} + \cdots + l_{ija} + \cdots + l_{jja} + \sum\limits_{i=1}^{j-1} (L_i - 2d) \\
l_{jb} = \sum\limits_{i=1}^{j} l_{ijb} + \sum\limits_{i=1}^{j-1} (L_i - 2d) = l_{1jb} + \cdots + l_{ijb} + \cdots + l_{jjb} + \sum\limits_{i=1}^{j-1} (L_i - 2d) \\
l_{jc} = \sum\limits_{i=1}^{j} l_{ijc} + \sum\limits_{i=1}^{j-1} (L_i - 2d) = l_{1jc} + \cdots + l_{ijc} + \cdots + l_{jjc} + \sum\limits_{i=1}^{j-1} (L_i - 2d)
\end{cases}
\tag{1}
\]
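The coordinate system {Oi−1} can be converted to the coordinate system {Oi} through the following procedure: translating by Li−1 along the axis Xi−1, then rotating by αi around the axis Zi−1, and finally rotating by θi around the axis Yi. The homogeneous transformation matrix of this conversion is then obtained as

\[
\begin{aligned}
{}^{i-1}_{i}T &= \mathrm{Trans}(L_{i-1}, 0, 0)\,\mathrm{Rot}(Z, \alpha_i)\,\mathrm{Rot}(Y, \theta_i) \\
&= \begin{bmatrix} 1 & 0 & 0 & L_{i-1} \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} c\alpha_i & -s\alpha_i & 0 & 0 \\ s\alpha_i & c\alpha_i & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} c\theta_i & 0 & s\theta_i & 0 \\ 0 & 1 & 0 & 0 \\ -s\theta_i & 0 & c\theta_i & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} c\alpha_i c\theta_i & -s\alpha_i & c\alpha_i s\theta_i & L_{i-1} \\ s\alpha_i c\theta_i & c\alpha_i & s\alpha_i s\theta_i & 0 \\ -s\theta_i & 0 & c\theta_i & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\end{aligned}
\tag{2}
\]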
where cαi = cos(αi), sαi = sin(αi), cθi = cos(θi), sθi = sin(θi). The distance between the point Aw on the plane AwBwCw and the point Ow (the center of the circle) is r, and the angle between the line AwOw and the axis Zw is the position angle φA of the point Aw, where φA = φB − 120° = φC − 240°. The position of the point Aw on the plane AwBwCw in the coordinate system {Oi−1} is (xAw, yAw, zAw), which is given in terms of the position angle φA by the following formula.

\[
\begin{cases}
x_{A_w} = -d \\
y_{A_w} = r \sin\varphi_A \\
z_{A_w} = r \cos\varphi_A
\end{cases}
\tag{3}
\]

where d is the distance from the center point of the joint to the end face of the link, and r is the distance between the center of the hole through which the driving rope passes and the center of the link. The position of the point Am on the plane AmBmCm in the coordinate system {Oi} is (xAm, yAm, zAm), which is given in terms of the position angle φA by the following formula.

\[
\begin{cases}
x_{A_m} = d \\
y_{A_m} = r \sin\varphi_A \\
z_{A_m} = r \cos\varphi_A
\end{cases}
\tag{4}
\]
The position coordinates of point Am can be converted into the coordinate system {Oi−1 } through the transformation matrix, and then the difference between point Am and point Aw in three directions can be obtained by the following formula.
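\[
\begin{bmatrix} \Delta x_A \\ \Delta y_A \\ \Delta z_A \\ 0 \end{bmatrix}
= {}^{i-1}_{i}T(4 \times 4) \begin{bmatrix} x_{A_m} \\ y_{A_m} \\ z_{A_m} \\ 1 \end{bmatrix}
- \begin{bmatrix} x_{A_w} \\ y_{A_w} \\ z_{A_w} \\ 1 \end{bmatrix}
\tag{5}
\]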
The distance between point Am and point Aw is then obtained:

\[
|A_m A_w| = \sqrt{(\Delta x_A)^2 + (\Delta y_A)^2 + (\Delta z_A)^2}
\tag{6}
\]
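After simplification,

\[
\begin{aligned}
l^2 ={}& (-r\, s\alpha_i\, s\varphi + r\, c\alpha_i\, s\theta_i\, c\varphi + d + d\, c\alpha_i\, c\theta_i)^2 \\
&+ (r\, c\alpha_i\, s\varphi + r\, s\alpha_i\, s\theta_i\, c\varphi - r\, s\varphi + d\, s\alpha_i\, c\theta_i)^2 \\
&+ (r\, c\theta_i\, c\varphi - r\, c\varphi - d\, s\theta_i)^2
\end{aligned}
\tag{7}
\]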
where l = l_ija, φ = φA, sφ = sin φ and cφ = cos φ. In the same way, the lengths of BwBm and CwCm are obtained, and the length of each rope can be abbreviated as follows.

\[
l = f(\theta, \alpha, \varphi)
\tag{8}
\]

According to the above formula, the values of l_ija, l_ijb and l_ijc can be derived when αi, θi and the position angles of the driving ropes (φja, φjb, φjc) are known. The total lengths l_ja, l_jb and l_jc of the three driving ropes controlling link j can then be obtained through Formula (1). Next, how to obtain αi and θi from known driving rope lengths is analyzed. Taking a single joint as an example, Formula (9) is obtained by substituting the arguments into Formula (8).

\[
\begin{cases}
f(\theta_i, \alpha_i, \varphi_{ja}) - l_{ija} = 0 \\
f(\theta_i, \alpha_i, \varphi_{jb}) - l_{ijb} = 0 \\
f(\theta_i, \alpha_i, \varphi_{jc}) - l_{ijc} = 0
\end{cases}
\tag{9}
\]
Among the three equations in Formula (9), the values of l_ija, l_ijb, l_ijc, φja, φjb and φjc are known, and there are two unknown quantities, αi and θi. The system is overdetermined, and αi and θi can be solved numerically with the help of Matlab software. When the position angles of the 24 ropes are determined, α1 and θ1 can be found from l_11a, l_11b and l_11c. The rope lengths l_12a, l_12b, l_12c, l_13a, l_13b, … at the first joint can then be computed from α1 and θ1, and, by removing these contributions from the total lengths in Formula (1), the values of l_22a, l_22b and l_22c can be found. Next, α2 and θ2 can be found from l_22a, l_22b and l_22c, and so on, until the values of all αi and θi are obtained. In summary, for the traditional rope-driven manipulator, the values of l_ija, l_ijb and l_ijc can be derived from αi and θi, and αi and θi can also be derived from l_ija, l_ijb and l_ijc. However, the forward and inverse kinematic calculations are coupled, and the amount of calculation is large.
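The single-joint mapping can be sketched in Python as follows. The function `rope_length` evaluates Formula (7) directly; `solve_joint_angles` solves the overdetermined system of Formula (9) with a small Gauss-Newton least-squares iteration, which is a swapped-in technique chosen for illustration (the paper only states that Matlab is used). The function names, the geometry values r and d, and the position angles in the demo are all hypothetical.

```python
import math


def rope_length(theta, alpha, phi, r, d):
    """Rope segment length across one joint, following Formula (7)."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    dx = -r * sa * sp + r * ca * st * cp + d + d * ca * ct
    dy = r * ca * sp + r * sa * st * cp - r * sp + d * sa * ct
    dz = r * ct * cp - r * cp - d * st
    return math.sqrt(dx * dx + dy * dy + dz * dz)


def solve_joint_angles(lengths, phis, r, d, iters=50, h=1e-6):
    """Least-squares solve of Formula (9) for (theta, alpha).

    Gauss-Newton with a central-difference Jacobian, started from the
    straight configuration; an illustrative choice, not the paper's solver.
    """
    theta, alpha = 0.0, 0.0
    for _ in range(iters):
        res = [rope_length(theta, alpha, p, r, d) - l for p, l in zip(phis, lengths)]
        jt = [(rope_length(theta + h, alpha, p, r, d)
               - rope_length(theta - h, alpha, p, r, d)) / (2 * h) for p in phis]
        ja = [(rope_length(theta, alpha + h, p, r, d)
               - rope_length(theta, alpha - h, p, r, d)) / (2 * h) for p in phis]
        # Normal equations of the 3x2 linearized system.
        a11 = sum(x * x for x in jt)
        a12 = sum(x * y for x, y in zip(jt, ja))
        a22 = sum(y * y for y in ja)
        b1 = -sum(x * e for x, e in zip(jt, res))
        b2 = -sum(y * e for y, e in zip(ja, res))
        det = a11 * a22 - a12 * a12
        if abs(det) < 1e-18:
            break
        d_theta = (b1 * a22 - a12 * b2) / det
        d_alpha = (a11 * b2 - a12 * b1) / det
        theta += d_theta
        alpha += d_alpha
        if abs(d_theta) + abs(d_alpha) < 1e-12:
            break
    return theta, alpha


if __name__ == "__main__":
    r, d = 30.0, 20.0  # hypothetical geometry in mm
    phis = [0.0, 2 * math.pi / 3, 4 * math.pi / 3]  # ropes 120 degrees apart
    measured = [rope_length(0.2, -0.15, p, r, d) for p in phis]
    print(solve_joint_angles(measured, phis, r, d))
```

In the straight configuration every rope segment has length 2d, which is a convenient check of the sign conventions in Formula (7).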
3.2 Mapping Relationships Under the Improved Scheme
By analyzing the kinematics of the improved hyper-redundant manipulator in this paper, the relationship between the deflection angle and the length of the driving rope is obtained. In analyzing the mapping relationship, the D-H coordinate system and transformation matrix are the same as in the traditional scheme, and some of the other calculation formulas are similar. The total lengths of the three driving ropes controlling the first joint and the joints after it are (l_ja, l_jb, l_jc), calculated as follows.

\[
\begin{cases}
l_{ja} = l_{jja} - 2d + \sum\limits_{i=1}^{j-1} L_i \\
l_{jb} = l_{jjb} - 2d + \sum\limits_{i=1}^{j-1} L_i \\
l_{jc} = l_{jjc} - 2d + \sum\limits_{i=1}^{j-1} L_i
\end{cases}
\tag{10}
\]
The mapping relationship between the lengths of the driving ropes (l_ija, l_ijb and l_ijc) and the joint angles (αi, θi) at one joint is the same as in the traditional scheme: the forward kinematic formula is the same as Formula (7), and the inverse kinematic formula is the same as Formula (9). α1 and θ1 can be found from (l_11a, l_11b, l_11c), and α2 and θ2 from (l_22a, l_22b, l_22c). In short, αj and θj can be found from (l_jja, l_jjb, l_jjc) without the influence of other parameters, that is, there is no coupling phenomenon. Adopting the improved rope-driven hyper-redundant manipulator therefore not only eliminates the coupling but also reduces the amount of calculation.
4 Comparative Analysis
The manipulator analyzed in this paper contains n = 8 links, and the mapping relationship between the deflection angles of the joints and the lengths of the driving ropes in the traditional scheme is shown in Fig. 8. In the traditional scheme, multiple sets of data need to be calculated at each joint and then added together to obtain real-time data of the total length of the driving ropes. The number of data items that need to be calculated is as follows.

\[
N_0 = \sum_{i=1}^{n} [3(n + 1) - 3i] = 108
\tag{11}
\]
Fig. 8 Traditional mapping
Fig. 9 Improved mapping
In the improved scheme, the mapping relationship between the deflection angle of the joint and the length of the driving ropes is shown in Fig. 9. Under the improved scheme, only 3 sets of data need to be calculated at each joint and then added together to obtain real-time data of the total length of the driving ropes. The number of data items that need to be calculated is as follows.

\[
N_1 = \sum_{i=1}^{n} 3 = 24
\tag{12}
\]
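The two counts can be checked with a few lines of Python. This is only an arithmetic sketch of Formulas (11) and (12); the function names are hypothetical.

```python
def traditional_count(n):
    """Formula (11): 3(n + 1) - 3i rope segments are evaluated at joint i."""
    return sum(3 * (n + 1) - 3 * i for i in range(1, n + 1))


def improved_count(n):
    """Formula (12): only 3 rope segments are evaluated at each joint."""
    return sum(3 for _ in range(1, n + 1))


if __name__ == "__main__":
    n = 8
    print(traditional_count(n), improved_count(n))
```

For n = 8 this gives 108 and 24, matching the values stated above; the traditional count grows quadratically in n while the improved count grows linearly.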
The two mapping schemes are simulated in MATLAB. The time required for one calculation under each mapping relationship is recorded, the simulation is executed 100 times, and the results are shown in Figs. 10 and 11. Comparing the results in the two figures shows that the average time required under the traditional scheme is 44 ms, and the average
Fig. 10 Simulation time corresponding to the traditional mapping relationship
Fig. 11 Simulation time corresponding to the improved mapping relationship
simulation time corresponding to this method is 13 ms, so the mapping relationship proposed in this paper can reduce the simulation time and improve the control efficiency.
5 Conclusion
In this paper, a novel hyper-redundant manipulator driven by ropes is proposed that decouples the joint movements through its structure, and the effectiveness of the scheme is demonstrated by establishing the kinematic model of the manipulator. The analysis of the mapping relationship shows that the improved scheme reduces the amount of kinematic calculation. The simulation shows that the scheme is superior to the traditional scheme: over 100 simulation runs, the average simulation time is reduced from 44 to 13 ms, which improves the control efficiency of the manipulator and lays a good foundation for the optimal design and control of rope-driven hyper-redundant manipulators.
References
1. Huang, Y., Jia, L., Chen, J., et al.: A novel path planning algorithm considering the maximum deflection angle of joint. IEEE Access 9, 115777–115787 (2021)
2. Jia, L., Huang, Y., Chen, T., et al.: MDA+ RRT: a general approach for resolving the problem of angle constraint for hyper-redundant manipulator. Expert Syst. Appl. 193, 116379 (2022)
3. Jia, L., Yu, Z., Zhou, H., et al.: Variable dimensional scaling method: a novel method for path planning and inverse kinematics. Machines 10(11), 1030 (2022)
4. Buckingham, R.: Snake arm robots. Ind. Robot: Int. J. 29(3), 242–245 (2002)
5. Buckingham, R., Graham, A.: Snaking around in a nuclear jungle. Ind. Robot: Int. J. 32(2), 120–127 (2013)
6. Sargeant, B., Robson, S., Szigeti, E., et al.: A method to achieve large volume, high accuracy photogrammetric measurements through the use of an actively deformable sensor mounting platform. Int. Arch. Photogr. Remote Sens. Spat. Inf. Sci. XLI-B5, 123–129 (2016)
7. Zhang, Q., Zhou, L., Wang, Z.: Design and implementation of worm like creeping mobile robot for east remote maintenance system. Fusion Eng. Des. 118, 81–97 (2017)
8. Yao, Y., Du, Z., Wei, Z.: Research on assembly system of snake-arm robot. Aeronaut. Manuf. Technol. 491(21), 26–30 (2015)
9. Yuan, W., Wei, Z., Zou, F., et al.: A kinematic analysis method for underdriven snake-arm robot. Mach. Electron. 11, 65–67 (2014)
10. Tang, L., Wang, J., Li, L., et al.: Design of modular drive device for rope-driven manipulator. Mech. Des. Res. 2, 15–18 (2016)
Fractional Order LMS Algorithms: A Review and Application in Signal Denoising Haozhe Zhang, Hanliang Huo, Ruoxun Ma, and Lipo Mo
Abstract Fractional calculus is a powerful mathematical tool for describing memory properties and intermediate processes. Integrating it with the LMS algorithm to fully exploit the historical information in the process of weight adjustment can effectively improve the convergence performance and steady-state performance of the LMS algorithm. In this paper, we review in detail the development of fractional order LMS algorithms (FOLMSs) in the past decade and discuss their properties, advantages and disadvantages. We then apply these algorithms to sinusoidal signal denoising and measure the denoising effect of the various FOLMSs from the perspective of output signal variance. Finally, we provide some suggestions for future research directions of FOLMSs.
Keywords Fractional calculus · LMS algorithm · Signal denoising
H. Zhang · H. Huo · R. Ma · L. Mo (B)
School of Mathematics and Statistics, Beijing Technology and Business University, Beijing 100048, P.R. China
L. Mo
Research Center for Consumption Big Data and Intelligent Decision-Making, Beijing Technology and Business University, Beijing 100048, P.R. China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_17

1 Introduction
Adaptive filtering algorithms have been one of the hot research topics in modern signal processing and machine learning. Adaptive filters can automatically update their structural parameters to complete corresponding tasks by sensing changes
in the external environment and conditions, and they have been widely used in fields such as system identification, channel equalization, active noise control, echo cancellation, and adaptive beamforming [1, 2]. The basic concept of the adaptive filter can be traced back to 1938, when switching theory [3] was first established. In 1960, Widrow and Hoff proposed an adaptive switching circuit based on the least mean square (LMS) algorithm [4, 5]. Since then, the LMS algorithm has become the most popular adaptive filtering algorithm due to its simplicity and efficiency. The convergence performance of the standard LMS algorithm is highly dependent on environmental changes. First, a large step size tends to make the LMS algorithm unstable [6], which severely limits its convergence speed. Second, in many real-world scenarios, various factors, such as lightning and hydroacoustic noise in industrial machines, low-frequency atmospheric noise, etc., turn the adjustment of the weight coefficients of the LMS adaptive filter into a “noisy” process, i.e., the weight coefficients retain a steady-state error once they have converged. It has been reported that the standard LMS algorithm has a larger steady-state error than other adaptive algorithms [7]. In summary, the standard LMS algorithm is hard to accept in many scenarios requiring fast and accurate convergence. Therefore, many engineers and scholars have endeavored to improve the standard LMS algorithm, for example by introducing variable step size schemes [8–10], modifying the iteration time index [11], and incorporating information from the input signal autocorrelation matrix [12]. Although there are numerous related results, an enhanced LMS algorithm with faster convergence and smaller steady-state error is still anticipated. The concept of fractional order calculus first appeared more than 300 years ago in a letter between Leibniz and L’Hospital [13].
With its inherent nonlocal properties, fractional calculus has been widely used for accurate modeling, especially for objects and processes with global correlation and history-dependent properties, such as image processing [14, 15], signal analysis [16, 17], viscous materials [18, 19], biomedicine [20, 21], etc. In the field of system control [22, 23], fractional controllers have wider design freedom, so the control effect is more accurate and efficient. In addition, some classical theories and methods such as the Fourier transform [24, 25], numerical approximation [26, 27], and the BP neural network [28, 29] have been extended to the fractional order case, which outperforms the traditional integer order case due to better physical intuition and mathematical generality. Related studies [30] have shown that algorithms combined with fractional calculus can obtain wider design freedom while gaining faster convergence speed and higher convergence accuracy. Utilizing the long memory property of fractional calculus to fully exploit the historical information in the weight adjustment of the LMS algorithm can improve its convergence and steady-state performance and promote the development of adaptive filtering theory. This paper gives a detailed review of the development of FOLMSs in the past decade, summarizes their origin, properties, advantages and disadvantages, and also presents some opinions on future research contents and directions of FOLMSs. Finally, various FOLMSs are applied to the denoising of a noisy sinusoidal signal, and their denoising effect is measured from the perspective of output signal variance.
2 A Review of FOLMSs in Past Decade
The LMS algorithm can essentially be considered as an iterative algorithm coupling the classical gradient descent method with a first order difference, which can be expressed as follows

\[
w(n + 1) = w(n) - \mu \frac{\partial [e^2(n)]}{\partial w(n)} = w(n) + 2\mu e(n) u(n),
\tag{1}
\]

where w(n) = [w1(n), w2(n), …, wN(n)]^T ∈ R^N, u(n) = [u(n), u(n − 1), …, u(n − N + 1)]^T ∈ R^N and e(n) are the weight vector, input sequence and output error of the filter, respectively, and μ is the learning rate, also called the algorithm step size. In constructing FOLMSs, there are two basic directions: one is to replace the integer order difference of the modified LMS algorithm by the fractional order difference, i.e.

\[
\nabla^{\alpha} w(n) = w(n) + 2\mu e(n) u(n) \quad \text{FOMLMS},
\tag{2}
\]

the other is to combine the fractional order gradient descent method with the standard LMS algorithm, i.e.

\[
w(n + 1) = w(n) - \mu \frac{\partial^{\alpha} [e^2(n)]}{\partial w^{\alpha}(n)} \quad \text{FOGLMS}.
\tag{3}
\]
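As a point of reference for the fractional variants, the standard LMS update of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the helper names `lms_step` and `identify_fir`, the two-tap test system and the step size are hypothetical choices, and the desired signal is generated without noise.

```python
import random


def lms_step(w, u, d, mu):
    """One update of Eq. (1): w(n+1) = w(n) + 2*mu*e(n)*u(n)."""
    y = sum(wi * ui for wi, ui in zip(w, u))  # filter output
    e = d - y                                 # output error
    w_next = [wi + 2.0 * mu * e * ui for wi, ui in zip(w, u)]
    return w_next, y, e


def identify_fir(h_true, n_iter=2000, mu=0.05, seed=0):
    """Identify a short FIR system from noiseless data (illustrative setup)."""
    rng = random.Random(seed)
    w = [0.0] * len(h_true)
    u = [0.0] * len(h_true)
    for _ in range(n_iter):
        u = [rng.uniform(-1.0, 1.0)] + u[:-1]  # shift in the newest sample
        d = sum(h * x for h, x in zip(h_true, u))
        w, _, _ = lms_step(w, u, d, mu)
    return w


if __name__ == "__main__":
    print(identify_fir([0.5, -0.3]))
```

The fractional order variants reviewed below modify either the difference on the left-hand side or the gradient term on the right-hand side of this update.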
In 1978, Kretschmer et al. delayed the iteration time index of the LMS algorithm by one order to obtain a modified LMS algorithm (MLMS) [11], which can be considered as an unconditionally stable version of the standard LMS algorithm. Then, in 2015, Tan et al. innovatively introduced the fractional order difference into MLMS to propose FOMLMS(a) [31], where the difference order α ∈ (0, 1]. A detailed theoretical analysis of the convergence performance of FOMLMS(a) showed that a larger μ and a larger α lead to faster convergence; meanwhile, the effects of μ and α on the steady-state performance of FOMLMS(a) were illustrated from the power spectrum perspective by employing fractional Z-transform theory [32]. Since the difference order of FOMLMS(a) is restricted to α ∈ (0, 1], its convergence speed is weaker than that of traditional MLMS. To break this barrier, Cheng et al. extended the difference order to α ∈ (0, 2) (FOMLMS(b)) [33], which effectively improves on the convergence speed and design freedom of FOMLMS(a). By analyzing the properties of the solutions of the difference equation, it was found that FOMLMS(b) has different convergence characteristics in different difference intervals: w(n) tends to converge monotonically when α ∈ (0, 1], and w(n) tends to converge with overshoot when α ∈ (1, 2). In addition, it was proved that a larger μ and α bring faster convergence for FOMLMS(b), but are accompanied by a larger steady-state error. To balance the contradiction between convergence speed and convergence accuracy, a difference order hybrid switching strategy was proposed: an error threshold is set, α ∈ (1, 2) is used to obtain faster convergence when the error term is larger than this threshold, and using α ∈
(0, 1] to obtain higher convergence accuracy when the error term is smaller than this threshold during iteration. From the perspective of fractional order gradient descent, replacing the first order gradient of the standard LMS algorithm with a fractional order one yields a class of LMS algorithms based on fractional order gradient information (FOGLMS). Raja et al. proposed a simplified fractional order gradient LMS algorithm (FOGLMS(a)) [34, 35], which has been successfully applied in the field of active noise control.

\[
w(n + 1) = w(n) + 2\mu \frac{|w(n)|^{1-\alpha}}{\Gamma(2 - \alpha)} e(n) u(n) \quad \text{FOGLMS(a)}.
\tag{4}
\]
Since the gradient order of FOGLMS(a) is restricted to α ∈ (0, 1] only, its convergence speed is not satisfactory. A composite algorithm (FOGLMS(b)) combining the integer order gradient and the fractional order gradient was proposed in [36] to further improve the convergence speed of FOGLMS(a).

\[
w(n + 1) = w(n) + 2\mu_1 e(n) u(n) + 2\mu_2 \frac{|w(n)|^{1-\alpha}}{\Gamma(2 - \alpha)} e(n) u(n) \quad \text{FOGLMS(b)}.
\tag{5}
\]
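The updates of Eqs. (4) and (5) can be sketched in Python as a single step function, using `math.gamma` for Γ. This is an illustrative sketch only: the function name `foglms_b_step` is hypothetical, and the element-wise reading of |w(n)|^(1−α) (applied per weight) is an assumption about how the vector expression is evaluated.

```python
import math


def foglms_b_step(w, u, d, mu1, mu2, alpha):
    """One update of Eq. (5); with mu1 = 0 it reduces to Eq. (4), FOGLMS(a).

    Assumption: |w(n)|**(1 - alpha) is taken element-wise over the weights.
    """
    y = sum(wi * ui for wi, ui in zip(w, u))  # filter output
    e = d - y                                 # output error
    scale = 1.0 / math.gamma(2.0 - alpha)
    w_next = [
        wi + 2.0 * mu1 * e * ui
        + 2.0 * mu2 * (abs(wi) ** (1.0 - alpha)) * scale * e * ui
        for wi, ui in zip(w, u)
    ]
    return w_next, e


if __name__ == "__main__":
    print(foglms_b_step([0.2, -0.1], [1.0, 0.5], 0.7, 0.01, 0.02, 0.5))
```

Two properties follow directly from the formulas. For α = 1 the fractional factor equals 1, so the step coincides with a standard LMS step of size μ1 + μ2. For α < 1 the fractional term vanishes wherever a weight is exactly zero, so under this element-wise reading FOGLMS(a) started from all-zero weights would not move; a nonzero initialization or the integer order term of FOGLMS(b) avoids this.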
Based on FOGLMS(b), a gradient convex combination adaptive filtering method (FOGLMS(c)) was introduced in [37] to enhance speech signal quality in the framework of a speech enhancement system. With this idea of combining gradients, FOGLMS(a) has been successfully applied to system identification [38], parameter identification [39], channel equalization [40] and other engineering practices. However, the FOGLMS algorithms mentioned above intentionally ignore the integral initial value condition when using the chain rule of the fractional derivative, which overly weakens the nonlocal property and leaves their steady-state performance far from satisfactory. In addition, most of the studies on the above algorithms are confined to practical applications and algorithm simulations; only the convergence step condition was analyzed in [41], and the theoretical analysis of the effect of α on the convergence and steady-state performance is still an open problem. In order to further improve the steady-state performance of FOGLMS by weakening the nonlocality of the fractional order derivative to a lesser extent, a variable initial value strategy was adopted in [42] to correct the fractional order gradient by replacing the fixed initial value with a variable initial value (FOGLMS(d)), where the gradient order α ∈ (0, 2).

\[
w(n + 1) = w(n) - \mu \frac{\partial^{\alpha} [e^2(n)]}{{}_{w(n-1)}\partial_{w(n)} w^{\alpha}(n)} \quad \text{FOGLMS(d)}.
\tag{6}
\]
Through rigorous theoretical analysis, it was demonstrated that a larger α ∈ (0, 1.5) brings faster convergence speed and larger steady-state error for FOGLMS(d). In order to balance the contradiction between convergence speed and convergence accuracy, a variable fractional order strategy was designed: a larger α is used at the initial
moment of the iteration to ensure faster convergence speed; a smaller gradient order is used when FOGLMS(d) tends to converge to ensure a higher convergence accuracy. Moreover, FOGLMS(d) has been successfully applied to the identification of nonlinear Hammerstein systems [43] to test its theoretical correctness and practicality. Whether FOMLMS or FOGLMS [31, 33, 42], almost all of these algorithms exhibit a common property: a larger difference order or gradient order brings a faster convergence speed along with a larger steady-state error. In order to obtain better convergence performance while enhancing design freedom as much as possible for more practical application scenarios, an intuitive idea is to use both the fractional order difference and the fractional order gradient in the standard LMS algorithm. Based on this idea, a double fractional order LMS algorithm (DFOLMS) was proposed in [44], i.e.

\[
\nabla^{\alpha} w(n + 1) = w(n) - \mu \frac{\partial^{\beta} [e^2(n)]}{{}_{w(n-1)}\partial_{w(n)} w^{\beta}(n)} \quad \text{DFOLMS}.
\tag{7}
\]
Through a model approximation technique, DFOLMS was transformed into two fractional order difference models so that its convergence and steady-state properties could be analyzed indirectly. Using the solution of the fractional order difference equation and fractional Z-transform theory, it was proved that DFOLMS has different convergence characteristics in different difference intervals, and that a larger α and β lead to a faster convergence speed accompanied by a larger steady-state error. In simulation, DFOLMS was compared with FOMLMS(b) and FOGLMS(d). Owing to its wider design freedom, DFOLMS can always be made to converge faster than FOMLMS(b) and FOGLMS(d) by adjusting its parameters, which demonstrates its superior performance and designability.
3 Prospects
This paper mainly provides a detailed review of the development of FOLMSs in the last decade. On this basis, several research prospects remain, as summarized below.
A. The gradient order of FOGLMS(a), FOGLMS(b) and FOGLMS(c) is restricted to α ∈ (0, 1] only; if this limit can be broken, their convergence speed will be greatly improved.
B. In deriving the convergence step conditions for FOGLMS(d) and DFOLMS, matrix multiplication swapping is utilized. This swapping presupposes that the algorithm step size μ is properly set (small), so the final convergence step condition contains this restriction. It is still worthwhile to investigate how to obtain the step size condition for convergence directly, without relying on matrix multiplication swapping.
C. Since DFOLMS contains fractional order difference operators and fractional order gradient expansion terms, its iteration has a complicated form, and the authors only analyzed its convergence and steady-state properties indirectly
through model approximation. How to analyze its convergence and steady-state properties directly, so as to refine the related theoretical analysis, is still a valuable research direction.
D. Fractional order gradient descent methods constructed by different strategies exhibit different performance in the LMS algorithm. For example, the variable initial value strategy can effectively reduce the steady-state noise, while the strategy of truncating higher order terms can effectively accelerate convergence. Based on this, the performance of FOLMSs can be further improved by combining different fractional order gradient descent methods.
E. DFOLMS allows richer practical applications of adaptive filtering theory, for example designing a nonlinear fractional order adaptive filtering scheme based on DFOLMS for parameter estimation of nonlinear Hammerstein systems [43], or combining DFOLMS with pseudo or proportionate affine projection algorithms [45, 46] for echo cancellation.
4 Application of FOLMSs in Signal Denoising
In this section, we use various types of FOLMSs for sinusoidal signal denoising, and measure their denoising effect by the magnitude of the variance of the output signal. The original signal is s = sin(0.025πt). By adding Gaussian white noise to it, we obtain the noise signal s′, where the SNR (signal-to-noise ratio) is 1; their time-domain waveforms are shown in Fig. 1. The time-domain waveforms of the output signal after denoising by the various FOLMSs are shown in Fig. 2. We also calculate the steady-state variance of the output signal; the results are presented in Table 1. From Fig. 2 and Table 1, we can conclude that taking a smaller difference order and gradient order in FOLMSs will result in a smaller steady-state variance for the output
Fig. 1 The time-domain waveform of s and s′ (upper panel: original sinusoidal signal s; lower panel: noise signal s′; horizontal axis: times t)
Fig. 2 Time-domain waveforms of the output signal after denoising by various FOLMSs (panels, top to bottom: standard LMS algorithm; FOGLMS(a) with α = 0.5; FOGLMS(b) with α = 0.5; FOMLMS(b) with α = 0.8; FOGLMS(d) with α = 0.5; DFOLMS with α = 0.8 and β = 0.5)
signal, i.e., a better denoising effect. It is particularly noteworthy that, owing to the wider design freedom of DFOLMS, its difference order and gradient order can be adjusted freely and made smaller than those in FOMLMS or FOGLMS, so its denoising effect is the best among the FOLMSs.
Note 1: Since we preset the noise signal as the first K outputs of the filter, where K is the filter order, the denoising curve suddenly converges at t = K = 128, i.e. the real denoising phase starts at t = 128; see Algorithm 1 for details.
Note 2: The steady-state variance of the output signal in Fig. 2 is calculated over the data segment 500 < t < 600.
Table 1 Steady-state variance of the output signal after denoising by various FOLMSs
Algorithm      Steady-state variance
LMS            0.490813
FOGLMS(a)      0.482688
FOGLMS(b)      0.498544
FOMLMS(b)      0.481271
FOGLMS(d)      0.476327
DFOLMS         0.452971
Appendix
Algorithm 1 FOLMSs for signal denoising
Set the filter order K, the algorithm step size μ, the original signal s and the iteration number N;
s̃ = awgn(s, x);                      % add Gaussian white noise to signal s; x is the SNR
y(1 : K) = s̃(1 : K);                 % preset the first K outputs
For n = 1 : K
    w(n, :) = zeros(1, K);           % weight vector initialization
end
For n = K : N
    u = s̃(n − K + 1 : n);            % use the noisy signal s̃ as input
    y(n) = w(n, :) ∗ u^T;            % output signal after denoising
    e(n) = s(n) − y(n);              % output error of the filter
    % Update w(n, :) with an FOLMS; DFOLMS is taken as the example, α ∈ (0, 1)
    For i = 1 : K
        g(i) = |w(n, i) − w(n − 1, i)|^(1−β) / Γ(2 − β);    % calculate fractional-order gradients
    end
    G = diag(g);                     % save gradient information in diagonal matrix G
    wdf = zeros(1, K);               % differential operator initialization
    For j = 0 : n − 2
        wdf = wdf + (−1)^(j+1) ∗ Γ(α) ∗ [w(n − j, :) − w(n − j − 1, :)] / (Γ(j + 2) ∗ Γ(α − j − 1));
    end                              % calculate differential operator
    w(n + 1, :) = w(n, :) − wdf + 2 ∗ μ ∗ e(n) ∗ u ∗ G;    % weight vector iteration
end
plot(y);                             % plot the time-domain waveform of output signal y
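The denoising loop of Algorithm 1 can be tried out with the integer-order LMS baseline before any fractional-order term is added. The following Python sketch is only an illustration under stated assumptions: it assumes that the noisy signal feeds the filter and the clean signal serves as the reference in e(n), it replaces the DFOLMS update with the plain LMS update w(n + 1) = w(n) + 2μe(n)u, and the test signal, noise level, K = 16 and μ = 0.002 are hypothetical values, not those of the experiment above.

```python
import math
import random


def lms_denoise(clean, noisy, order, step):
    """Plain LMS: the noisy signal feeds the filter, the clean signal is the reference."""
    n_samples = len(noisy)
    weights = [0.0] * order
    # As in Algorithm 1, the first `order` outputs are preset to the noisy signal.
    output = list(noisy[:order]) + [0.0] * (n_samples - order)
    for n in range(order - 1, n_samples):
        window = noisy[n - order + 1 : n + 1]           # u = noisy(n-K+1 : n)
        y = sum(w * u for w, u in zip(weights, window))
        err = clean[n] - y                               # e(n) = s(n) - y(n)
        output[n] = y
        # Integer-order update; a fractional-order rule would replace this line.
        weights = [w + 2.0 * step * err * u for w, u in zip(weights, window)]
    return output


def mean_square_error(a, b, start, stop):
    """Mean square difference of two sequences over the index range [start, stop)."""
    return sum((a[i] - b[i]) ** 2 for i in range(start, stop)) / (stop - start)


random.seed(0)
N, K, MU = 5000, 16, 0.002                               # hypothetical settings
clean = [math.sin(0.05 * n) for n in range(N)]
noisy = [s + random.gauss(0.0, 0.5) for s in clean]
denoised = lms_denoise(clean, noisy, K, MU)

# Steady-state comparison over the last 1000 samples.
mse_in = mean_square_error(noisy, clean, N - 1000, N)
mse_out = mean_square_error(denoised, clean, N - 1000, N)
print(f"steady-state MSE before: {mse_in:.4f}, after: {mse_out:.4f}")
```

Evaluating the error only over a late data segment mirrors the way the steady-state variance is taken from a fixed window in Note 2.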
Research on Low-Cost IMU Testing Methods Yajing Guo, Fan Yang, Jing Chen, Binyan Liang, and Shuxuan Liu
Abstract This paper studies test methods for low-cost inertial measurement units (IMUs). An inertial device test and verification platform is established to estimate the accuracy of the gyroscope, accelerometer and posture information of a modular self-developed chip, and to judge whether the chip meets the accuracy requirements of human posture acquisition. Such a chip is an effective way to reduce the cost of batch production.
Keywords Inertial devices · Human posture acquisition · Modularity
Y. Guo (B) · F. Yang · J. Chen · B. Liang · S. Liu
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_18

1 Introduction
An IMU offers autonomy and high accuracy over short time spans [1, 2]. However, IMU errors accumulate rapidly with time [3], so measurement accuracy is critical. In addition, chip cost increases with measurement accuracy. A self-developed modular IMU chip is an effective solution for batch production and cost reduction, and IMUs can meet the demands of human posture acquisition (see [4–6]).
To determine whether the accuracy of the low-cost modular IMU chip satisfies the requirements, an inertial device test and verification platform is built. A high-precision inertial navigation chip (Xsens, as an example) and the low-precision, low-cost modular chips under test are combined into an inertial navigation network, and modular peripheral circuits are designed for signal acquisition, processing and transmission. The network is mounted on a three-axis rotating platform, which provides a dynamic environment with prescribed movements. In this environment, the test data of the high-precision chip and the low-cost modular chips are captured and pre-processed as required, and the average error and the variance are then calculated.
2 Design of Inertial Device Test and Verification Platform
2.1 Test Platform Structure Design
A three-axis rotating platform is used so that rotation about all three coordinate axes can be reproduced at the speeds encountered in the application environment. Because the IMU follows the human body during actual field tests, where the rotation speed reaches up to 60 r/min, the high-speed axis of the platform has a maximum speed of 60 r/min; the other two axes are low-speed axes used for alignment. Sensor layout requirements: the coordinate systems of all inertial navigation chips are parallel to one another with high accuracy, and the coordinate system orientation is set as required (Fig. 1).
2.2 Sensor Installation System Design
In this scheme, a multi-dimensional test platform is designed, consisting of the IMU modules, the IMU mounting plate and the base of the rotating platform (Figs. 2 and 3).
Fig. 1 Rotating platform
Fig. 2 IMU units and connection boards
Fig. 3 12 groups of IMU units and mounting plates
Fig. 4 Electrical collection systems
2.3 Electrical Design
With reference to the current modular self-developed chip and the Xsens communication controller, data from multiple sensors must be collected online. A Speedgoat controller uploads the data of all sensors simultaneously to an upper computer for offline calculation in MATLAB (Fig. 4).
3 Validation Result
3.1 Test Environment Setup
One high-precision Xsens inertial navigation chip and four low-precision modular self-developed chips under test are installed according to the networking strategy, and the resulting inertial navigation network is mounted on the three-axis rotating platform with the test tooling to form the inertial device test and verification platform. The posture information of the installed inertial sensors is collected and uploaded to the upper computer. The accuracy of the modular self-developed chips under test is then calculated dynamically through information pre-processing and mathematical statistics, to judge whether the chips meet the accuracy requirements of exoskeleton human posture measurement (Fig. 5).
Fig. 5 Test environment
Fig. 6 Data collection interface of Speedgoat
3.2 Data Collection and Processing
The data collection interface of the Speedgoat is shown in Fig. 6. The posture information of the installed inertial sensors is collected and uploaded to the upper computer, as shown in Figs. 7, 8 and 9, with the Xsens inertial navigation chip data in red and the posture information of the four modular self-developed chips under test in blue.
3.3 Test Results
In this dynamic environment, the test data of the high-precision inertial navigation chip and the low-cost modular chips are captured and pre-processed as required, and the average error and the variance of the error are calculated for each chip. The results for the three directions are listed in Tables 1, 2 and 3. Table 2 shows that modular self-developed chip No. 3 has a large Y-direction error (average error 2.1028), whereas the other chips meet the requirements in all three directions of posture information. The platform can therefore be used for accuracy testing and chip screening of low-cost IMU modular chips.
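The two statistics reported per chip can be computed as in the following Python sketch. It is a minimal illustration, assuming that the angle series of the reference chip and of the chip under test are already time-aligned and that the variance is the population variance of the error; the function name `angle_error_stats` and the sample values are hypothetical and are not data from the experiment.

```python
from statistics import mean, pvariance


def angle_error_stats(reference, measured):
    """Return (average error, variance of error) of `measured` against `reference`."""
    if len(reference) != len(measured):
        raise ValueError("series must have the same length")
    errors = [m - r for r, m in zip(reference, measured)]
    # Population variance is an assumption; the text does not state which form is used.
    return mean(errors), pvariance(errors)


# Hypothetical, time-aligned angle samples; not data from the experiment.
reference_angles = [0.0, 10.0, 20.0, 30.0, 40.0]
chip_angles = [0.5, 10.7, 20.4, 30.6, 40.3]

avg_err, var_err = angle_error_stats(reference_angles, chip_angles)
print(f"average error = {avg_err:.4f}, variance of error = {var_err:.4f}")
```

A chip with a large average error but a small variance, as in the screening case above, shows a mostly constant offset rather than noisy behaviour.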
Fig. 7 Theta data
Fig. 8 Phi data
Fig. 9 Gamma data

Table 1 Angle error in X direction
No.   Average error   Variance of error
1     −0.5891         0.0174
2     −0.5744         0.0316
3     0.9946          0.0435
4     0.1690          0.0366

Table 2 Angle error in Y direction
No.   Average error   Variance of error
1     −0.8900         0.0191
2     −0.0922         0.0486
3     2.1028          0.0634
4     −0.1206         0.0212

Table 3 Angle error in Z direction
No.   Average error   Variance of error
1     −0.4318         0.0645
2     0.1281          0.5001
3     0.0896          0.1453
4     0.2141          0.6150
4 Conclusion
This paper studies test methods for low-cost IMUs. A test and verification platform for inertial devices has been established to estimate the accuracy of the gyroscope, accelerometer and posture information of modular self-developed chips, and to judge whether they meet the accuracy requirements of human posture acquisition. The platform can also be applied to accuracy testing and chip screening of low-cost IMU modular chips.
References
1. Rogne, R.H., Bryne, T.H., Johansen, T.A., Fossen, T.I.: Fault detection in lever-arm-compensated position reference systems based on nonlinear attitude observers and inertial measurements in dynamic positioning. In: American Control Conference (2016)
2. Funato, K., Tasaki, R., Sakurai, H., Terashima, K.: Development and experimental verification of a person tracking system of mobile robots using sensor fusion of inertial measurement unit and laser range finder for occlusion avoidance. J. Robot. Mechatron. 33(1 TN.191), 33 (2021)
3. Vodicheva, L.V., Parysheva, Y.V.: Estimation of sensor accuracy parameters in a strapdown inertial measurement unit using a relatively rough turntable. Giroskopiya i Navigatsiya 27(2), 162–178 (2019)
4. Lin, Z., Zecca, M., Sessa, S., Bartolomeo, L., Takanishi, A.: Development of an ultraminiaturized inertial measurement unit wb-3 for human body motion tracking. In: IEEE/SICE International Symposium on System Integration (2010)
5. DeShaw, J.: New methodologies for evaluating human biodynamic response and discomfort during seated whole-body vibration considering multiple postures. PhD thesis (2013)
6. Wei, L., Jing, Z., Huang, A.: Design of human motion acquisition system based on MIMU. Comput. Measur. Control (2009)
Androgynous Tool Quick-Change Mechanism and Its Misalignment Tolerance of Space Manipulator Man Huang, Yanbo Wang, Ke Li, Songbo Deng, He Cai, Jiankang Zhi, and Jiaqi Duan
Abstract To expand the applications of space manipulators and meet the requirements of increasingly diverse on-orbit missions, an androgynous tool quick-change mechanism for switching the end-effectors of a space manipulator was developed. For the same external size, it has a greater lateral distance tolerance and a greater rotation tolerance around the three axes than other tool-change mechanisms. The front interface of the mechanism is taken as the research object: its lateral tolerance is analyzed theoretically through the geometric projection of the key points of the front interface, and a virtual dynamics simulation model of the capturing process is established in Adams. The results of the theoretical analysis and the simulation experiment are consistent, which verifies that the androgynous tool quick-change mechanism has good misalignment tolerance.
Keywords Space manipulator · Tool quick-change mechanism · Misalignment tolerance · Androgynous design
M. Huang · Y. Wang (B) · K. Li (B) · S. Deng · H. Cai · J. Zhi · J. Duan
Beijing Precision Electromechanical Control Equipment Institute, Beijing 10076, China
e-mail: [email protected]
K. Li
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_19

1 Introduction
As on-orbit tasks become increasingly complex and diverse, a single task may require a space arm with multiple functions. The traditional scheme of fixing a single end-effector to one space arm cannot meet this need. Multi-functional end-effector technology is therefore of great significance for completing diverse space tasks, and one solution is to construct a tool quick-change mechanism.
A tool quick-change mechanism must provide reliable mechanical and electrical connections and a certain misalignment tolerance, so as to ensure smooth docking during tool switching. With its high adaptability, such a mechanism can significantly improve the utilization efficiency of a space manipulator and greatly reduce the cost of space launches, so that existing equipment can be reused efficiently. At present, Canada, Japan, Europe and the United States have equipped space manipulators with tool quick-change mechanisms. The China Space Station was successfully constructed in 2022, and in the future more and more space missions, such as on-orbit maintenance and assembly, will rely on space arms. The study of tool quick-change mechanisms is therefore of great scientific, political and economic significance for the development of China's space industry.
Existing tool quick-change mechanisms at home and abroad are mainly fitted to large space manipulators. In Ref. [1], the OTCM mechanism is installed at the end of the SPDM (Special Purpose Dexterous Manipulator) of the International Space Station; it has a small position tolerance and a large size and weight. In Ref. [2], the tool quick-change mechanism CTED uses steel balls to lock the passive interface and has little position tolerance, so it requires high-precision position and posture control of the manipulator. In Ref. [3], the tool quick-change mechanism developed by Japan in the ETS-VII project has a complex structure and a large size. In Refs. [4–11], the SIROM device developed by ESA adopts the androgynous design. In Refs. [5, 6], the tool quick-change mechanism adopts an ellipsoidal form-fit geometry, but this structure easily gets stuck during the capturing process. In Ref. [7], the electrical connector of the tool quick-change mechanism is not axial, which imposes an extra torque on the end joint, and its misalignment tolerance is quite small. The tool quick-change mechanism in Ref. [8] also has a small misalignment tolerance.
Because of their complex structure, large size, small misalignment tolerance and other limitations, the tool quick-change mechanisms in the above literature are suited neither to docking small tools nor to small space manipulators mounted on satellites, spacecraft and similar bases. In this paper, an androgynous tool quick-change mechanism is developed for on-orbit maintenance by future small space manipulators. It can be used to dock and change operating tools and thus gives the space manipulator multiple functions.
2 Design of Tool Quick-Change Mechanism
2.1 Input of Mechanism Design
Functionally, the tool quick-change mechanism is divided into two parts: the space arm interface (active interface) and the tool interface (passive interface). The active interface is mounted on the end joint of the arm, and the tool interface is connected to the end-effectors, the ORU (on-orbit replaceable unit) and various cooperative target loads. The mechanism developed in this paper is intended for small space arms, which are usually installed on satellites, spacecraft and reusable space vehicles and are used for on-orbit maintenance, on-orbit technology verification, on-orbit assembly and other tasks. This type of space manipulator must be small and light, and must have a multi-functional end to serve a variety of on-orbit missions. The tool quick-change mechanism is therefore required to have the following functions:
(1) Large distance and rotation tolerance: during tool switching, the passive and active interfaces cannot be aligned perfectly, and a certain position and posture deviation remains between them. The mechanism needs a large misalignment tolerance to ensure successful capture.
(2) Reliable locking and quick release: once the position and posture deviation between the active and passive interfaces has been eliminated, the locking mechanism responds quickly to lock the passive interface.
2.2 Overall Design Scheme
The tool quick-change mechanism adopts a segmented design: the front section contains the guiding and positioning mechanism and the small-interface locking module, the middle section contains the large-interface locking module, and the end section contains the drive and control module of the locking mechanism. During docking, the guiding surfaces of the active and passive interfaces bring their position and posture into alignment, and the locking mechanism then locks the tool quick-change mechanism mechanically. The electrical connector module uses floating pins with a certain floating ability in 6 DOF and provides a 30 V, 1 A electrical connection for the end-effector. The overall design scheme of the tool quick-change mechanism is shown in Fig. 1.
Fig. 1 Structure of the tool quick-change mechanism (labels: androgynous design, passive interface, active interface)
3 Performance Analysis of Misalignment Tolerance
3.1 Analysis of the Capturing Process
During tool switching, there is a certain initial position and attitude deviation between the active and passive interfaces. The misalignment tolerance of the tool quick-change mechanism is therefore a key indicator that determines whether tool switching succeeds: the larger the misalignment tolerance, the simpler the control of the manipulator arm and the higher the probability of successful tool switching. On orbit, the active interface is connected to the end joint of the space manipulator, while the passive interface carries the operating tool and is fixed on the tool rack.
The tool switching process of the androgynous tool quick-change mechanism can be divided into three stages. In the first stage, the active and passive interfaces are not in contact, and there is an initial position and attitude deviation. In the second stage, the active interface approaches the passive interface, and the deviation is eliminated continuously under the action of the guiding interface. In the third stage, the passive interface triggers the locking switch and the locking motor; the locking wedge then locks the passive interface and the electrical connector is mated, which completes the mechanical and electrical connection.
3.2 Conditions for Successful Capture and Connection
Mechanical Friction Conditions
The guiding surface of the tool quick-change mechanism is selected as the analysis object. As shown in Fig. 2, F is the axial force on the active interface during docking, and Fr is the total constraint force of the docking surface, that is, the resultant of the reaction force of the conical side and the sliding friction force. The condition under which no deadlock occurs during docking is given by Eq. (1).
… (15)

where $a = h_1 - h_0$ and $\Delta = (x - x_0)/(x_1 - x_0)$.
According to Hertz contact theory, the stiffness coefficient in the collision process can be estimated by the following formulas based on the contact ellipse and the basic deformation relation:

$$K = \frac{4}{3\pi (h_1 + h_2)} \left( \frac{R_1 R_2}{R_1 + R_2} \right)^{1/2} \tag{16}$$

$$h_i = \frac{1 - \mu_i^2}{\pi E_i}, \quad i = 1, 2 \tag{17}$$
where $R_1$ and $R_2$ are the equivalent radii of the contact area, $\mu_i$ is the Poisson's ratio of the material and $E_i$ is its Young's modulus. The androgynous guiding interface is made of aluminum alloy, with Young's modulus $E_1 = E_2 = 72$ GPa and Poisson's ratio $\mu_1 = \mu_2 = 0.33$. Because the active interface approaches the passive interface under the control of the manipulator, the equivalent radius of the curve at the contact point of the active interface is taken as $R_1 = 45$ mm, and the passive interface is treated as a plane, so $R_2 = +\infty$. From Eqs. (16) and (17), the stiffness coefficient is $K = 3.61 \times 10^5$ N/mm. The collision force exponent for metal materials is taken as $e = 1.5$, the maximum damping coefficient as $C = 50$ N s/mm, and the maximum penetration depth as 0.1 mm.
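The stiffness value quoted above can be cross-checked numerically from Eqs. (16) and (17). The Python sketch below is only an illustrative check, assuming the modulus is expressed in N/mm² (72 GPa = 72 000 N/mm²) and lengths in mm; the function names are hypothetical. For a plane contact partner, the term $R_1 R_2/(R_1 + R_2)$ is replaced by its limit $R_1$.

```python
import math


def hertz_compliance(poisson_ratio, youngs_modulus):
    """h_i = (1 - mu_i^2) / (pi * E_i), as in Eq. (17)."""
    return (1.0 - poisson_ratio ** 2) / (math.pi * youngs_modulus)


def hertz_stiffness(h1, h2, r1, r2=math.inf):
    """K = 4 / (3*pi*(h1 + h2)) * sqrt(R1*R2 / (R1 + R2)), as in Eq. (16)."""
    # For a plane (infinite radius), R1*R2/(R1+R2) tends to the finite radius.
    if math.isinf(r2):
        equivalent_radius = r1
    elif math.isinf(r1):
        equivalent_radius = r2
    else:
        equivalent_radius = r1 * r2 / (r1 + r2)
    return 4.0 / (3.0 * math.pi * (h1 + h2)) * math.sqrt(equivalent_radius)


E_ALUMINUM = 72_000.0   # Young's modulus: 72 GPa expressed in N/mm^2
MU_ALUMINUM = 0.33      # Poisson's ratio
R1_MM = 45.0            # equivalent radius at the contact point of the active interface, mm

h = hertz_compliance(MU_ALUMINUM, E_ALUMINUM)
K = hertz_stiffness(h, h, R1_MM)    # both bodies are aluminum alloy, R2 is a plane
print(f"K = {K:.3e}")
```

With these inputs the result is close to the quoted $3.61 \times 10^5$, which confirms that the quoted stiffness follows from the stated material data and radius.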
4.2 Simulation Result
The Z-axial distance between the active and passive interfaces is extracted as the verification index: when this distance reaches 0 mm, the capture is successful. The displacements and rotation angles of the active and passive interfaces are extracted as well, to observe the distance and rotation misalignment tolerance of the tool quick-change mechanism.
As shown in Fig. 6, when the active interface is set with a rotation of 0° around the Z-axis and offsets of 6 mm in the OX direction and 17 mm in the OY direction, the passive interface moves 6 mm in the OX direction (the red line) and 17 mm in the OY direction (the blue line) on the test platform during capture, and the distance z between the active and passive interfaces decreases to 0 mm (the black line). This shows that the tool quick-change mechanism achieves the horizontal displacement tolerance obtained by theoretical calculation, which is listed in Table 1.
Fig. 6 Horizontal displacement tolerance (ψ = 0°, x = 6 mm, y = 17 mm)
As shown in Fig. 7, when the active interface is set with a rotation of 22.5° around the Z-axis and offsets of 6 mm in the OX direction and 17 mm in the OY direction, the passive interface moves 6 mm in the OX direction and 17 mm in the OY direction on the test platform during capture and at the same time rotates +22.5° around the Z-axis, and the distance z between the active and passive interfaces decreases to 0 mm, indicating that docking is successful. This shows that the tool quick-change mechanism has a larger rotation tolerance around the Z-axis than the theoretical calculation listed in Table 1.
Fig. 7 Horizontal displacement tolerance (ψ = 22.5°, x = 6 mm, y = 17 mm)
To verify the rotation tolerance around the X-axis and the Y-axis, the model settings need to be changed. The passive interface remains on the same test platform, while the active interface is set without a motion pair and moves only along the axial direction under gravity, without its freedom being limited.
As shown in Fig. 8, when the active interface is set with a rotation of +15° around the X-axis and offsets of 0 mm in the X direction and 0 mm in the Y direction from the origin, the active interface rotates −15° around the X-axis during capture and the distance Z decreases to 0 mm, indicating that the capture is successful. This verifies that the tool quick-change mechanism has a 15° rotation tolerance around the X-axis.
Fig. 8 Rotation tolerance around X-axis (α = 15°, x = 6 mm, y = 17 mm)
As shown in Fig. 9, when the active interface is set with a rotation of +15° around the Y-axis and offsets of 0 mm in the X direction and 0 mm in the Y direction from the origin, the active interface rotates −15° around the Y-axis during capture and the distance Z decreases to 0 mm, indicating that the capture is successful. This verifies that the tool quick-change mechanism has a 15° rotation tolerance around the Y-axis.
Fig. 9 Rotation tolerance around Y-axis (β = 15°, x = 6 mm, y = 17 mm)
5 Conclusion
To provide a space manipulator with multi-functional end-effectors, a tool quick-change mechanism was developed based on the androgynous design. The front interface combines conical surfaces with V-shaped planes, which gives it good misalignment tolerance. Based on the geometric projection method, the relationship between the design parameters and the misalignment tolerance was established. Theoretical calculation and Adams virtual dynamics simulation verify that the androgynous interface has a large misalignment tolerance: ±6 mm (OX direction) and ±17 mm (OY direction) in translation, and ±15° around the three axes in rotation.
References
1. Hwang, C.: Design of a robotic end-effector to emulate the orbit replaceable unit/tool changeout mechanism (OTCM) for space station robotic system. Technical Report, Johnson Space Center, NASA (2013)
2. Gelmi, R., Rusconi, A., Lodoso, J., Campo, P., Chomicz, R., Schiele, A.: Design of a compact tool exchange device for space robotics applications. In: Proceedings of the 9th ESA Workshop on Advanced Space Technologies for Robotics and Automation, ASTRA, pp. 28–30 (2006)
3. Oda, M., Nishida, M., Nishida, S.: Development of an EVA end-effector, grapple fixtures and tools for the satellite mounted robot system. In: Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS'96, vol. 3, pp. 1536–1543 (1996)
4. Diaz-Carrasco Diaz, M., Guerra, G., Gala, J., Viñals, J.: SIROM electronics design: current state and future developments. Acta Astronaut. 202, 742–750 (2023)
5. Nishida, S., Yoshikawa, T.: A new end-effector for on-orbit assembly of a large reflector. In: 2006 9th International Conference on Control, Automation, Robotics and Vision, pp. 1–6 (2006)
6. Sun, K., Jin, M., Cui, P., Liu, H., Ren, L.: A new fast exchange device and its capture tolerance. Robot 36, 92–99 (2014)
7. Lu, J.: Design and Research on Tool Exchange Device and Operation Tools of Space Robotic Arm. Harbin Institute of Technology (2017)
8. Liu, Q.: Research on End-Tools Exchange Device of Space Robot. Harbin Institute of Technology (2011)
9. Guan, Q., Zhao, X., Wen, Z., Jin, X.: Calculation method of hertz normal contact stiffness. J. Southwest Jiaotong Univ. (2021)
10. Gao, Y.: Research of mechanism of ADAMS contact and contact friction. Autom. Appl. Technol. 64–66 (2017)
11. Vinals, J., Urgoiti, E., Guerra, G., Valiente, I., Esnoz-Larraya, J., Ilzkovitz, M., Franéski, M., Letier, P., Yan, X., Henry, G., Quaranta, A., Brinkmann, W., Jankovic, M., Bartsch, S., Fumagalli, A., Doermer, M.: Multi-functional interface for flexibility and reconfigurability of future European space robotic systems. Adv. Astronaut. Sci. Technol. 1, 119–133 (2018). https://doi.org/10.1007/s42423-018-0009-1
Fast Parameter Estimation Algorithm for the Signal Modeling Based on Equation Solution Ling Xu, Weihong Xu, and Feng Ding
Abstract Sine signals are widely used in many applications. This paper presents a fast signal parameter identification algorithm for the feature parameters of a sine wave with an initial phase. To avoid complex calculations and achieve fast parameter identification, a multiple three-point identification technique is developed by constructing a group of algebraic equations from three discrete observations and solving these equations. Moreover, to overcome the difficulty of solving transcendental equations in the signal parameters, the original transcendental equation group is transformed into a simpler form through an equation transformation. Finally, an example is provided to test the proposed signal identification method, and the simulation results show good performance.
Keywords Signal modeling · Algebraic equation · Parameter estimation · Sinusoidal signal
L. Xu (✉) · W. Xu · F. Ding, School of Internet of Things and Artificial Intelligence, Wuxi Vocational Institute of Commerce, Wuxi 214153, China
L. Xu, School of Internet of Things Engineering, Jiangnan University, Wuxi 214122, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_20

1 Introduction
The sinusoidal signal, as a typical signal or test signal, has the simplest frequency content and is widely used in practice [1–3]. Sinusoidal signal parameter identification has long been an active topic in the field of signal modeling, and many identification methods have been presented. Dastres et al. proposed an adaptive parameter identification algorithm for a multi-sinusoidal signal [4]. Liu et al.
presented a global exponential estimation technique for multi-tone sinusoidal signals with unknown frequencies [5]. Xu et al. put forward a separable synchronous signal estimation algorithm that separates the original parameters into two different parameter sets to reduce the computational complexity [6]. Jiang et al. studied parameter estimation for discrete-time sinusoidal signals in the presence of measurement noise by using a nonlinear control principle [7]. Pin et al. presented an adaptive observer approach to identify multi-sinusoidal signals [8]. Pin et al. designed a finite-time convergent estimation algorithm to estimate the amplitude, frequency and phase of a biased sinusoidal signal [9]. Although the above-mentioned methods can estimate the parameters of the signals to be identified, they are complicated and not easy to realize in engineering. In this study, we present a fast and convenient identification approach to estimate the parameters of sinusoidal signals. Parameter estimation obtains the model parameters by means of parameter estimation algorithms [10–12]. Generally, parameter estimation algorithms are derived by constructing a cost function with respect to the parameters to be estimated [13, 14] and then optimizing this cost function, where the optimization amounts to solving equations to find the optimal parameter estimates [15]. However, it is difficult to find the optimal solutions of the parameters by a direct algebraic method. Therefore, recursive or iterative methods are employed to find the optimal solution, which is a numerical solution [13].
Because the recursive or iterative processes need numerous calculations, these parameter estimation algorithms are too complicated to employ in actual industrial processes. Therefore, it is of higher application value to study parameter estimation methods that have a simple mechanism and are easy to realize in engineering. The rest of this paper is organized as follows. Section 2 illustrates the signal modeling problem for the sinusoidal signal with an initial phase. Section 3 presents the multiple three point method. Section 4 provides an example to test the performance of the proposed method and analyzes the test results. Section 5 gives the conclusions of this work.
2 Problem Illustration
This work considers a sine signal with an initial phase, whose mathematical description is

$$y(t) = H \sin(\omega t + \phi), \tag{1}$$

where $H > 0$ is the amplitude, $\omega > 0$ is the angular frequency and $-\pi < \phi \le \pi$ is the initial phase. Since Eq. (1) contains three unknowns $H$, $\omega$ and $\phi$, the information of at least three points is needed to determine the parameters of the sine signal in (1) according to the principle of the algebraic solution. To estimate the three parameters $H$, $\omega$ and $\phi$, we can use three observations $y(t_1)$, $y(t_2)$ and $y(t_3)$ at three different sampling instants $t = t_1$, $t = t_2$ and $t = t_3$ to build three algebraic equations by substituting the measurements into (1):

$$\begin{aligned}
y(t_1) &= H \sin(\omega t_1 + \phi),\\
y(t_2) &= H \sin(\omega t_2 + \phi),\\
y(t_3) &= H \sin(\omega t_3 + \phi).
\end{aligned}$$

This equation group is transcendental and difficult to solve directly. Therefore, it is expanded through the trigonometric addition formula:

$$\begin{aligned}
y(t_1) &= H \cos\phi \sin(\omega t_1) + H \sin\phi \cos(\omega t_1),\\
y(t_2) &= H \cos\phi \sin(\omega t_2) + H \sin\phi \cos(\omega t_2),\\
y(t_3) &= H \cos\phi \sin(\omega t_3) + H \sin\phi \cos(\omega t_3).
\end{aligned}$$

Define two intermediate variables $C := H \cos\phi$ and $D := H \sin\phi$. If estimates of $C$, $D$ and $\omega$ are acquired, the estimates of $H$ and $\phi$ can be obtained. As a result, we have

$$\begin{aligned}
y(t_1) &= C \sin(\omega t_1) + D \cos(\omega t_1),\\
y(t_2) &= C \sin(\omega t_2) + D \cos(\omega t_2),\\
y(t_3) &= C \sin(\omega t_3) + D \cos(\omega t_3).
\end{aligned}$$

These equations are simpler in form, but they are still transcendental and cannot be solved algebraically. Other approaches are therefore needed, such as imposing constraints on the sampling instants of the observation data. In particular, algebraic solution methods for estimating the signal model parameters can be designed by using the observation data at multiple time points.
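The remark that $H$ and $\phi$ follow from the intermediate variables $C$ and $D$ can be illustrated with a short sketch. It assumes the standard polar conversion $H = \sqrt{C^2 + D^2}$ and $\phi = \operatorname{atan2}(D, C)$, which the text does not spell out; the function name and the numerical values are hypothetical and chosen only for illustration.

```python
import math

def amplitude_phase_from_cd(c, d):
    """Recover (H, phi) from C = H*cos(phi) and D = H*sin(phi)."""
    h = math.hypot(c, d)      # H = sqrt(C^2 + D^2), non-negative by construction
    phi = math.atan2(d, c)    # four-quadrant angle in (-pi, pi]
    return h, phi

# Illustrative values only (hypothetical, not taken from the paper).
H_true, phi_true = 1.5, -2.0
C = H_true * math.cos(phi_true)
D = H_true * math.sin(phi_true)
H_est, phi_est = amplitude_phase_from_cd(C, D)
```

Using `atan2` rather than a single inverse sine or cosine keeps the recovered phase in the admissible range $-\pi < \phi \le \pi$ without a separate quadrant check.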
3 Multiple Three Point Method
To derive the signal modeling method, the observations $y(t_1)$, $y(t_2)$ and $y(t_3)$ at the three multiple time points $t_1 = \xi$, $t_2 = 2\xi$ and $t_3 = 3\xi$ are employed to construct the equation group with respect to the unknown parameters of the sine signal in (1):

$$y(t_1) = y(\xi) = H \sin(\omega\xi + \phi), \tag{2}$$

$$y(t_2) = y(2\xi) = H \sin(2\omega\xi + \phi), \tag{3}$$

$$y(t_3) = y(3\xi) = H \sin(3\omega\xi + \phi). \tag{4}$$

Expanding (4) gives

$$\begin{aligned}
y(t_3) &= H \sin(3\omega\xi + \phi) = H \sin(2\omega\xi + \omega\xi + \phi)\\
&= H \sin(2\omega\xi + \phi)\cos(\omega\xi) + H \cos(2\omega\xi + \phi)\sin(\omega\xi)\\
&= H \sin(2\omega\xi + \phi)\cos(\omega\xi) + \frac{H}{2}\left[\sin(3\omega\xi + \phi) + \sin(-\omega\xi - \phi)\right]\\
&= H \sin(2\omega\xi + \phi)\cos(\omega\xi) + \frac{H}{2}\sin(3\omega\xi + \phi) - \frac{H}{2}\sin(\omega\xi + \phi)\\
&= y(t_2)\cos(\omega\xi) + \frac{1}{2}y(t_3) - \frac{1}{2}y(t_1).
\end{aligned} \tag{5}$$

When $y(t_2) \ne 0$, the above equation gives

$$\cos(\omega\xi) = \frac{y(t_1) + y(t_3)}{2y(t_2)}. \tag{6}$$

As a result, the general solution of the angular frequency $\omega$ is

$$\omega = \frac{1}{\xi}\arccos\frac{y(t_1) + y(t_3)}{2y(t_2)} + \frac{2m\pi}{\xi} = \frac{1}{\xi}\arccos\frac{y(\xi) + y(3\xi)}{2y(2\xi)} + \frac{2m\pi}{\xi}, \quad m = 0, 1, 2, \ldots \tag{7}$$

Choosing $\xi = \xi_1$ and $\xi = \xi_2$ such that $\xi_2/\xi_1$ is an irrational number, the unique solution of the angular frequency $\omega$ is

$$\omega = \left\{\begin{aligned}
&\frac{1}{\xi_1}\arccos\frac{y(\xi_1) + y(3\xi_1)}{2y(2\xi_1)} + \frac{2m\pi}{\xi_1}, \quad m \in \mathbb{N}_0,\\
&\frac{1}{\xi_2}\arccos\frac{y(\xi_2) + y(3\xi_2)}{2y(2\xi_2)} + \frac{2n\pi}{\xi_2}, \quad n \in \mathbb{N}_0.
\end{aligned}\right. \tag{8}$$
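The idea behind (8), that two sampling intervals with an irrational ratio single out one frequency, can be sketched as a search over the integers $m$ and $n$. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the search bound `m_max` and the test signal are hypothetical, and both signs of the arccosine are tried because the cosine is even, whereas (7) lists only the positive branch.

```python
import math

def frequency_candidates(y1, y2, y3, xi, m_max):
    """Candidate angular frequencies from samples at t = xi, 2*xi, 3*xi."""
    c = (y1 + y3) / (2.0 * y2)           # Eq. (6); requires y2 != 0
    if abs(c) > 1.0:
        raise ValueError("cos(omega*xi) estimate lies outside [-1, 1]")
    a = math.acos(c)
    cands = []
    for m in range(m_max + 1):
        for sign in (1.0, -1.0):         # both arccos branches (an assumption)
            w = (sign * a + 2.0 * math.pi * m) / xi
            if w > 0.0:
                cands.append(w)
    return cands

def resolve_frequency(samples1, xi1, samples2, xi2, m_max=5):
    """Pick the frequency on which the two candidate families agree best."""
    c1 = frequency_candidates(*samples1, xi1, m_max)
    c2 = frequency_candidates(*samples2, xi2, m_max)
    w1, w2 = min(((u, v) for u in c1 for v in c2),
                 key=lambda p: abs(p[0] - p[1]))
    return 0.5 * (w1 + w2)

def three_samples(h, omega, phi, xi):
    """Noise-free samples y(xi), y(2*xi), y(3*xi) of h*sin(omega*t + phi)."""
    return tuple(h * math.sin(omega * k * xi + phi) for k in (1, 2, 3))

# Hypothetical test signal whose frequency is too high for m = 0 alone.
xi1, xi2 = 1.0, math.sqrt(2.0)
omega_est = resolve_frequency(three_samples(2.0, 5.0, 0.3, xi1), xi1,
                              three_samples(2.0, 5.0, 0.3, xi2), xi2)
```

With noisy data the two families no longer coincide exactly, so the closest pair is only a heuristic and the bound `m_max` has to be chosen from prior knowledge of the frequency range.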
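After the angular frequency $\omega$ is determined, the next task is to determine the amplitude $H$ and the initial phase $\phi$. Expanding (2)–(3) gives

$$\begin{aligned}
y(t_1) &= H \sin(\omega\xi + \phi) = H \sin(\omega\xi)\cos\phi + H \cos(\omega\xi)\sin\phi,\\
y(t_2) &= H \sin(2\omega\xi + \phi) = H \sin(2\omega\xi)\cos\phi + H \cos(2\omega\xi)\sin\phi.
\end{aligned}$$

The above equations can be rewritten in matrix form as

$$\begin{bmatrix} \sin(\omega\xi) & \cos(\omega\xi)\\ \sin(2\omega\xi) & \cos(2\omega\xi) \end{bmatrix}
\begin{bmatrix} H\cos\phi\\ H\sin\phi \end{bmatrix}
= \begin{bmatrix} y(t_1)\\ y(t_2) \end{bmatrix}.$$

Then, we can get

$$\begin{aligned}
\begin{bmatrix} H\cos\phi\\ H\sin\phi \end{bmatrix}
&= \begin{bmatrix} \sin(\omega\xi) & \cos(\omega\xi)\\ \sin(2\omega\xi) & \cos(2\omega\xi) \end{bmatrix}^{-1}
\begin{bmatrix} y(t_1)\\ y(t_2) \end{bmatrix}\\
&= \frac{1}{\sin(\omega\xi - 2\omega\xi)}
\begin{bmatrix} \cos(2\omega\xi) & -\cos(\omega\xi)\\ -\sin(2\omega\xi) & \sin(\omega\xi) \end{bmatrix}
\begin{bmatrix} y(t_1)\\ y(t_2) \end{bmatrix}\\
&= \begin{bmatrix} \dfrac{y(t_1)\cos(2\omega\xi) - y(t_2)\cos(\omega\xi)}{-\sin(\omega\xi)}\\[2ex] \dfrac{y(t_2)\sin(\omega\xi) - y(t_1)\sin(2\omega\xi)}{-\sin(\omega\xi)} \end{bmatrix}.
\end{aligned}$$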
Substituting the above equation into $H^2\sin^2\phi + H^2\cos^2\phi = H^2$ yields

$$\begin{aligned}
H^2 &= \frac{[y(t_1)\cos(2\omega\xi) - y(t_2)\cos(\omega\xi)]^2}{\sin^2(\omega\xi)} + \frac{[y(t_2)\sin(\omega\xi) - y(t_1)\sin(2\omega\xi)]^2}{\sin^2(\omega\xi)}\\
&= \frac{y^2(t_2)\sin^2(\omega\xi) + y^2(t_1)\sin^2(2\omega\xi) - 2y(t_1)y(t_2)\sin(2\omega\xi)\sin(\omega\xi)}{\sin^2(\omega\xi)}\\
&\quad + \frac{y^2(t_1)\cos^2(2\omega\xi) + y^2(t_2)\cos^2(\omega\xi) - 2y(t_1)y(t_2)\cos(2\omega\xi)\cos(\omega\xi)}{\sin^2(\omega\xi)}\\
&= \frac{y^2(t_1) + y^2(t_2) - 2y(t_1)y(t_2)\cos(2\omega\xi - \omega\xi)}{\sin^2(\omega\xi)}\\
&= \frac{y^2(t_1) + y^2(t_2) - 2y(t_1)y(t_2)\cos(\omega\xi)}{1 - \cos^2(\omega\xi)}\\
&= \frac{y^2(t_1) + y^2(t_2) - 2y(t_1)y(t_2)\dfrac{y(t_3) + y(t_1)}{2y(t_2)}}{1 - \left[\dfrac{y(t_1) + y(t_3)}{2y(t_2)}\right]^2}\\
&= \frac{4y^4(t_2) - 4y^2(t_2)y(t_1)y(t_3)}{4y^2(t_2) - [y(t_1) + y(t_3)]^2}.
\end{aligned}$$

Taking the square root of both sides of the above equation gives the algebraic solution of the amplitude

$$H = 2|y(t_2)|\sqrt{\frac{y^2(t_2) - y(t_1)y(t_3)}{4y^2(t_2) - [y(t_1) + y(t_3)]^2}}. \tag{9}$$

Substituting the amplitude $H$ and the angular frequency $\omega$ into (2) yields

$$\phi = \arcsin\frac{y(t_1)}{H} - \omega\xi + 2n\pi, \quad n = 0, 1, 2, \ldots \tag{10}$$

Since the initial phase $\phi$ satisfies $-\pi < \phi \le \pi$, the integer value of $n$ in (10) can be determined. As a result, the value of the initial phase $\phi$ can be obtained. Equations (8)–(10) constitute the multiple three point (MTP) method for identifying the feature parameters.
The computing steps of the MTP method are listed as follows.
• Step 1: Set $\xi = \xi_1$ and $\xi = \xi_2$ ($\xi_2/\xi_1$ is an irrational number).
• Step 2: Collect the observations $y(\xi_1)$, $y(2\xi_1)$, $y(3\xi_1)$ and $y(\xi_2)$, $y(2\xi_2)$, $y(3\xi_2)$, with $y(2\xi_1) \ne 0$ and $y(2\xi_2) \ne 0$.
• Step 3: Determine the angular frequency $\omega$ through (8).
• Step 4: Determine the amplitude $H$ through (9).
• Step 5: Determine the initial phase $\phi$ through (10).
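The steps above can be sketched for a single sampling interval as follows. This is a minimal illustration under stated assumptions rather than the authors' code: it takes the principal branch $m = 0$ of (7), which presumes $0 < \omega\xi < \pi$, and it recovers the phase from the intermediate variables $H\cos\phi$ and $H\sin\phi$ of the matrix solution with `atan2` instead of the arcsine in (10), a substitution made here to avoid the quadrant ambiguity of the arcsine. The function name and the test values are hypothetical.

```python
import math

def mtp_estimate(y1, y2, y3, xi):
    """Estimate (H, omega, phi) of y(t) = H*sin(omega*t + phi) from
    y1 = y(xi), y2 = y(2*xi), y3 = y(3*xi), assuming 0 < omega*xi < pi."""
    if y2 == 0.0:
        raise ValueError("y(2*xi) must be nonzero")
    c = (y1 + y3) / (2.0 * y2)                   # Eq. (6): cos(omega*xi)
    if abs(c) >= 1.0:
        raise ValueError("samples are not consistent with a sine wave")
    omega = math.acos(c) / xi                    # Eq. (7) with m = 0
    ratio = (y2 * y2 - y1 * y3) / (4.0 * y2 * y2 - (y1 + y3) ** 2)
    if ratio < 0.0:
        raise ValueError("amplitude formula has a negative radicand")
    h = 2.0 * abs(y2) * math.sqrt(ratio)         # Eq. (9)
    # Intermediate variables from the 2x2 matrix solution.
    s1, c1 = math.sin(omega * xi), math.cos(omega * xi)
    s2, c2 = math.sin(2.0 * omega * xi), math.cos(2.0 * omega * xi)
    h_cos = (y1 * c2 - y2 * c1) / (-s1)          # H*cos(phi)
    h_sin = (y2 * s1 - y1 * s2) / (-s1)          # H*sin(phi)
    phi = math.atan2(h_sin, h_cos)               # phase in (-pi, pi]
    return h, omega, phi

# Noise-free check with hypothetical parameter values.
H0, W0, P0, XI = 2.0, 0.5, 1.2, 1.0
samples = [H0 * math.sin(W0 * k * XI + P0) for k in (1, 2, 3)]
H_est, W_est, P_est = mtp_estimate(*samples, XI)
```

With noisy samples the ratio in (6) can leave $[-1, 1]$, which is why the sketch raises an error instead of clipping; averaging over repeated trials, as in the example of Sect. 4, is one way to reduce this sensitivity.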
4 Example
In this section, a performance test is carried out by an example. Consider a sine wave with an initial phase

$$y(t) = 2\sin\left(0.5t + \frac{5\pi}{11}\right),$$

where the true values are $H = 2$, $\omega = 0.5$ and $\phi = \frac{5\pi}{11}$. To collect measurements similar to those in actual circumstances, white noise with variance $\sigma^2 = 0.10^2$ is added to the signal to be modeled. The proposed MTP method is employed to compute the characteristic parameters $H$, $\omega$ and $\phi$. To acquire highly accurate parameter estimates, the simulation experiment is repeated many times, with a total of $k = 30$ tests. The computed parameters after 30 tests are listed in Table 1 and shown in Fig. 1. The parameter computing errors versus the test index $k$ are shown in Fig. 2, where the error is computed by

$$\delta(k) = \sqrt{\frac{|\hat{H} - H|^2 + |\hat{\omega} - \omega|^2 + |\hat{\phi} - \phi|^2}{|H|^2 + |\omega|^2 + |\phi|^2}}.$$

Table 1 The computed signal parameters

k                 H          ω          φ          δ(%)
1                 1.93452    12.56637   1.57080    4.81329
2                 1.98542    6.29795    1.53606    2.31235
5                 1.97283    2.53517    1.31059    0.81295
10                2.00045    1.26721    1.29367    0.31058
15                2.00143    0.84653    1.25689    0.15411
20                1.97504    0.62832    1.57080    0.14758
25                1.96618    0.50265    1.57080    0.10181
30                2.01178    0.42297    1.33467    0.04848
Average values    2.05234    0.50866    1.46242
True values       2.00000    0.50000    1.42800
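As a check on how the error column relates to the parameter columns, the sketch below evaluates the relative error for one table row. It assumes that $\delta$ is the square root of the ratio of squared parameter errors to squared true values; the radical is an assumption here, adopted because it reproduces the tabulated value for $k = 30$. The function name is hypothetical.

```python
import math

def relative_error(est, true):
    """delta = ||est - true|| / ||true|| for parameter tuples (H, omega, phi)."""
    num = sum((e - t) ** 2 for e, t in zip(est, true))
    den = sum(t ** 2 for t in true)
    return math.sqrt(num / den)

# Row k = 30 of Table 1 against the listed true values.
delta_30 = relative_error((2.01178, 0.42297, 1.33467),
                          (2.00000, 0.50000, 1.42800))
```

Under this assumption the result agrees with the tabulated 0.04848 to the printed precision, which suggests that the column is a plain ratio rather than a percentage.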
Fig. 1 The computed signal parameters after 30 times (legend: computed value, true value; horizontal axis: k)
Fig. 2 The parameter computing errors versus k
Moreover, the predicted signal and the true signal are compared in Fig. 3. From the test results, we can draw the following conclusions.
• Table 1 and Fig. 1 show that the computed parameters approach the true parameter values, or that the average values of the computed parameters are close to the true values, after several computations.
• Figure 2 shows that the errors between the estimated and the true signal parameters become smaller as the number of computations increases, which means that the computed parameter values are satisfactory.
• Figure 3 compares the predicted signal with the true signal. The waveforms show that the predicted signal captures the dynamics of the true signal, which means that the proposed MTP method is effective.
• Numerous computations are necessary to obtain highly accurate parameter estimates. The average of the computation results can be taken as the final prediction.

Fig. 3 The parameter computing errors versus k (legend: true signal, predicted signal; axes: y(t) versus t)
5 Conclusions
In this paper, a highly efficient identification algorithm for the sine signal with an initial phase is presented based on the algebraic equation principle. Three discrete observations are used to construct an equation group with respect to the signal parameters, and the parameters are estimated by solving the resulting algebraic equations. Because the sampling instants are integer multiples of one another, a trigonometric transformation avoids having to solve complicated transcendental equations. The advantage of the proposed method is that it obtains the estimates of the signal parameters conveniently and quickly.
Acknowledgements This work was supported by the Qing Lan Project of Jiangsu Province and by the "333" Project of Jiangsu Province (No. BRA2018328). The authors are grateful to Professor Feng Ding at Jiangnan University for his helpful suggestions.
References
1. Lin, Y., Zhang, Y., Fu, S., Zhang, H., Wang, P.: A configurable detection chip with 0.6% inaccuracy for liquid conductivity using dual-frequency sinusoidal signal technique in 65 nm CMOS. Microelectron. J. 124, 105434 (2022)
2. Tehrani, O.S., Sabahi, M.F.: Eigen analysis of flipped Toeplitz covariance matrix for very low SNR sinusoidal signals detection and estimation. Digit. Sig. Proc. 129, 103677 (2022)
3. Ding, F., Xu, L., Liu, X.M.: Signal modeling—Part A: Single-frequency signals. J. Univ. Sci. Technol. (Nat. Sci. Ed.) 38(1), 1–13 (2017)
4. Dastres, H., Ebrahimi, S.M., Malekzadeh, M., Gordillo, F.: Robust adaptive parameter estimator design for a multi-sinusoidal signal with fixed-time stability and guaranteed prescribed performance boundary of estimation error. J. Franklin Inst. 360(1), 223–250 (2023)
5. Liu, T., Huang, J.: Global exponential estimation of the unknown frequencies of discrete-time multi-tone sinusoidal signals. Automatica 142, 110377 (2022)
6. Xu, L., Ding, F., Zhu, Q.M.: Separable synchronous multi-innovation gradient based iterative signal modeling from online measurements. IEEE Trans. Instrum. Meas. 71, 6501313 (2022)
7. Jiang, T., Xu, D., Chen, T., Sheng, A.: Parameter estimation of discrete-time sinusoidal signals: a nonlinear control approach. Automatica 109, 108510 (2019)
8. Pin, G., Wang, Y., Chen, B., Parisini, T.: Identification of multi-sinusoidal signals with direct frequency estimation: an adaptive observer approach. Automatica 99, 338–345 (2019)
9. Pin, G., Chen, B., Parisini, T.: Robust finite-time estimation of biased sinusoidal signals: a Volterra operators approach. Automatica 77, 120–132 (2017)
10. Ding, F.: System Identification—New Theory and Methods. Science Press, Beijing (2013)
11. Ding, F.: System Identification—Performance Analysis for Identification Methods. Science Press, Beijing (2014)
12. Ding, F.: System Identification—Auxiliary Model Identification Idea and Methods. Science Press, Beijing (2017)
13. Ding, F., Yang, J.B., Xu, Y.M.: Convergence of hierarchical stochastic gradient identification for transfer function matrix models. Control Theory Appl. 18(6), 949–953 (2001)
14. Ding, F., Yang, J.B.: Hierarchical identification of large scale systems. Acta Automatica Sin. 25(5), 647–654 (1999)
15. Ljung, L.: System Identification Theory for the User. Prentice Hall (1999)
Second Harmonic-Compensated Phase-Locked Loop for Resolver-to-Digital Conversion Caixiang Guo, Jin Li, and Chenxi Yang
Abstract In software-based resolver-to-digital conversion (RDC), a phase-locked loop (PLL) is often used to estimate high-accuracy angular position and velocity from resolver signals. However, in the conventional PLL, the second harmonics in the resolver signals introduce sinusoidal fluctuations in the angular position and velocity estimates. To solve this problem, a second harmonic-compensated PLL is proposed in this paper. The phase detector structure is improved by adding a second harmonic term, which compensates the theoretical error caused by the second harmonic. Compared with the conventional PLL, this method improves the estimation accuracy and is easy to implement. Simulation and experimental results demonstrate the effectiveness of the proposed method.
Keywords Resolver · Phase-locked loop · Harmonic
C. Guo (✉) · J. Li · C. Yang, State Grid Taiyuan Electrical Power Supply Company, Taiyuan 030000, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_21

1 Introduction
Resolvers, one kind of shaft angle transducer, transform the rotor position information into a pair of orthogonal amplitude-modulated signals. To obtain high-accuracy rotor position and velocity from the analog signals, resolver-to-digital conversion (RDC) is necessary in servo control systems. Some commercial integrated circuits (ICs) are designed to realize RDC. These ICs have high reliability but limited bandwidth because of the chip resolution [1]. In contrast, software-based RDC can set parameters and adjust bandwidth more flexibly. Software-based RDC mainly consists of two steps: first, the analog signals are detected to obtain a pair of sinusoidal envelopes that contain the angular position information; then, the position and velocity are estimated from the envelopes by using the trigonometric method, the pseudo-linear method, the phase-locked loop (PLL) method, etc. The trigonometric method [2] performs arctangent and differential operations on the ratio of the pair of signals to obtain position and velocity. This method is easy to
implement but amplifies the noise because of the differential operation. The pseudo-linear method [3] transforms the envelopes into a linear function of the angular position, which is simple but needs a large number of logic modules to locate the angular position. The PLL method can continuously correct errors and track the position through the closed loop. Generally, the PLL has stronger robustness and higher estimation accuracy than the above two open-loop methods. The conventional PLL is a type II tracking loop, which consists of three parts: a phase detector (PD), a loop filter and a voltage-controlled oscillator (VCO). At present, various forms of PLL have been proposed by improving the loop filter. In [4], a feedforward channel based on the arctangent function was added to the loop filter to improve the accuracy. A third-order PLL was proposed by designing an acceleration-compensated loop filter, which could reduce the steady-state error under variable speed [5]. In addition, Nguyen et al. [6] employed the dominant-pole approximation algorithm to approximate a third-order PLL as a second-order one, which could operate with wider bandwidth and simpler parameter adjustment. However, the above methods ignored the harmonic components in the envelopes, which limits further improvement of the accuracy. Ideally, the output of the nonlinear PD, $\sin(\theta - \hat{\theta})$, is used to approximate the error between the angular position $\theta$ and the estimated value $\hat{\theta}$. Unfortunately, when the actual envelopes contain harmonics, the output of the PD inevitably has a theoretical error. To solve this problem, some methods based on the PLL have been proposed, such as mathematical modeling and adaptive notch filters. The former method [7] corrected the distorted envelopes by constructing a high-order approximate polynomial of the harmonics, but the calculation process is complicated. In the latter, notch filters were designed to suppress specific harmonics before the envelopes enter the PLL [8, 9].
However, it is difficult to suppress the low-order harmonics near the fundamental wave. As the proportion of the second harmonic is higher than the other harmonics, it has a more serious impact on the output of PD. Therefore, the key problem is to suppress the second harmonic and reduce the theoretical error for improving the accuracy of PLL. In this paper, a second harmonic-compensated PLL is designed to solve the above problem. The structure of PD can compensate the fluctuation of estimated position and velocity caused by second harmonics, so as to improve the accuracy of PLL. This paper is organized as follows. Section 2 presents the problem of PD in conventional PLL and the influence of second harmonics. Section 3 describes the design method. Section 4 analyzes the simulation and experimental results. Finally, the conclusions are given in Sect. 5.
2 Problem Formulation of Conventional PLL
The schematic diagrams of the resolver and the PLL-based RDC are shown in Figs. 1 and 2. When the rotor winding of the resolver is excited by a high-frequency sinusoidal voltage $u_{exc}$ and rotates with the measured angular position $\theta$, the two orthogonal stator windings output amplitude-modulated signals $u_{\sin}$ and $u_{\cos}$. Then, the pair of sinusoidal envelopes $u_s$ and $u_c$ are obtained by detecting the signals. As shown in Fig. 2, a PLL can be used to estimate the angular position $\hat{\theta}$ and velocity $\hat{\omega}$ from the envelopes.

Fig. 1 The designed reward function
Fig. 2 The designed reward function

Ideally, the envelopes can be described as $u_s = \sin\theta$, $u_c = \cos\theta$. Then the output of the PD in the conventional PLL is

$$\varepsilon_1 = \sin\theta\cos\hat{\theta}_1 - \cos\theta\sin\hat{\theta}_1 = \sin\tilde{\theta}_1, \tag{1}$$

where $\tilde{\theta}_1 = \theta - \hat{\theta}_1$ is the estimation error. Assuming that $\tilde{\theta}_1$ is small enough, $\varepsilon_1 = \sin\tilde{\theta}_1 \approx \tilde{\theta}_1$. Under this small-signal assumption, the output of the PD $\varepsilon_1$ approximately represents the estimation error $\tilde{\theta}_1$. Therefore, the estimated values $\hat{\theta}_1$ and $\hat{\omega}_1$ can be obtained when the closed loop of the PLL tends to be stable. In practice, however, the harmonic components in the envelopes, which are caused by excitation signal distortion and nonlinear winding distribution [10], must be taken into account. This paper focuses on the influence of the 2nd harmonic on the conventional PLL, as its amplitude is higher than that of the other harmonics. Let the amplitude of the 2nd harmonics be $a_2$; then the envelopes should be rewritten as

$$\begin{aligned}
u_s &= \sin\theta + a_2\sin 2\theta,\\
u_c &= \cos\theta + a_2\cos 2\theta.
\end{aligned} \tag{2}$$

Therefore, the relation $\varepsilon_1 \approx \tilde{\theta}_1$ is no longer valid; instead,

$$\varepsilon_1 = (1 + a_2\cos\theta)\sin\tilde{\theta}_1 + a_2\sin\theta. \tag{3}$$

When $\tilde{\theta}_1$ is small enough, the approximations $\sin\tilde{\theta}_1 \approx \tilde{\theta}_1$ and $\cos\tilde{\theta}_1 \approx 1$ hold. Assume also that the amplitude of the 2nd harmonic is much smaller than that of the fundamental component, i.e. $a_2 \ll 1$. Therefore, when the closed loop of the PLL drives $\varepsilon_1$ to 0, the actual error $\tilde{\theta}_1$ is no longer 0, but

$$\tilde{\theta}_1 \approx -\frac{a_2\sin\theta}{1 + a_2\cos\theta} \approx -a_2\sin\theta. \tag{4}$$

According to Eq. (4), the influence of the 2nd harmonic on the conventional PLL can be summarized as follows: (1) it causes the angular position error $\tilde{\theta}_1$ to include a sinusoidal fluctuation with frequency $\omega$ and amplitude $a_2$; (2) similarly, it causes the velocity error $\tilde{\omega}_1 = \omega - \hat{\omega}_1$ to include a sinusoidal fluctuation with frequency $\omega$ and amplitude $a_2\omega$.
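The bias predicted by Eq. (4) can be checked numerically without simulating the whole loop. The sketch below builds the envelopes of Eq. (2), forms the conventional phase detector output, and uses bisection to find the estimate at which that output vanishes. The function names, the bracket width and the grid of angles are hypothetical choices, and $a_2 = 0.0033$ is borrowed from the simulation section.

```python
import math

def envelopes(theta, a2):
    """Resolver envelopes with a 2nd-harmonic term, as in Eq. (2)."""
    return (math.sin(theta) + a2 * math.sin(2.0 * theta),
            math.cos(theta) + a2 * math.cos(2.0 * theta))

def pd_conventional(theta, theta_hat, a2):
    """Conventional phase detector: u_s*cos(theta_hat) - u_c*sin(theta_hat)."""
    u_s, u_c = envelopes(theta, a2)
    return u_s * math.cos(theta_hat) - u_c * math.sin(theta_hat)

def locked_error(theta, a2, half_width=0.1, iters=60):
    """Bisection for the error theta - theta_hat at which the PD output is zero."""
    lo, hi = theta - half_width, theta + half_width   # bracket for theta_hat
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if pd_conventional(theta, mid, a2) > 0.0:
            lo = mid      # positive output: the estimate is still too small
        else:
            hi = mid
    return theta - 0.5 * (lo + hi)

a2 = 0.0033
thetas = [2.0 * math.pi * k / 360 for k in range(360)]
errors = [locked_error(th, a2) for th in thetas]
max_err_deg = math.degrees(max(abs(e) for e in errors))
worst_gap = max(abs(e + a2 * math.sin(th)) for e, th in zip(errors, thetas))
```

Here `max_err_deg` is the peak position error over one revolution and `worst_gap` measures how far the exact zero of the detector is from the approximation $-a_2\sin\theta$; the gap is expected to be of second order in $a_2$.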
3 Second Harmonic-Compensated PLL
According to Sect. 2, when the 2nd harmonic exists in the envelopes, the output of the PD $\varepsilon_1$ no longer approximately represents the estimation error $\tilde{\theta}_1$. As a result, sinusoidal fluctuations are introduced in the estimated values $\hat{\theta}_1$ and $\hat{\omega}_1$, which seriously affect the accuracy of the conventional PLL. To solve this problem, a simple second harmonic-compensated PLL is designed in this paper. Its PD structure is improved in the feedback channel, which compensates the fluctuations by correcting the output of the PD. The method is introduced as follows. The schematic diagram of the designed method is shown in Fig. 3.

Fig. 3 The designed reward function

The amplitude of the 2nd harmonic $a_2$ is required as a parameter in this PD structure. Since the harmonics in the envelopes depend on the winding structure and the excitation voltage, $a_2$ can be assumed to be a known constant, whose value is obtained by an offline Fourier transform. Then, the output of the PD is

$$\varepsilon_2 = \sin(\theta - \hat{\theta}_2) + a_2\sin(\theta - 2\hat{\theta}_2) + a_2\sin(2\theta - \hat{\theta}_2) + a_2^2\sin 2(\theta - \hat{\theta}_2). \tag{5}$$

Here the estimation errors of the angular position and velocity are defined as $\tilde{\theta}_2 = \theta - \hat{\theta}_2$ and $\tilde{\omega}_2 = \omega - \hat{\omega}_2$, respectively. Assume that $\tilde{\theta}_2$ is small enough when the system is near the equilibrium point. Then the approximations $\sin\tilde{\theta}_2 \approx \tilde{\theta}_2$, $\sin(3\tilde{\theta}_2/2) \approx 3\tilde{\theta}_2/2$, $\sin 2\tilde{\theta}_2 \approx 2\tilde{\theta}_2$ and $\cos[(\theta + \hat{\theta}_2)/2] \approx \cos\theta$ can be made. Applying the sum-to-product identity to the two middle terms of (5) gives

$$\varepsilon_2 = \sin\tilde{\theta}_2 + 2a_2\cos\frac{\theta + \hat{\theta}_2}{2}\sin\frac{3}{2}\tilde{\theta}_2 + a_2^2\sin 2\tilde{\theta}_2, \tag{6}$$

and an approximation of (5) is expressed as

$$\varepsilon_2 \approx (1 + 2a_2^2 + 3a_2\cos\theta)\tilde{\theta}_2 \approx \tilde{\theta}_2. \tag{7}$$

Therefore, when the closed loop tends to the equilibrium point, the output of the PD $\varepsilon_2 \to 0$, which means the estimation errors $\tilde{\theta}_2 \to 0$ and $\tilde{\omega}_2 \to 0$. According to the above analysis, the PD structure of this method differs from that of the conventional PLL. When the envelopes contain the 2nd harmonic, the closed-loop error signal $\varepsilon_2$ of the method is still approximately equal to $\tilde{\theta}_2$. Hence, the sinusoidal fluctuations introduced by the harmonic are compensated. Compared with mathematical modeling and adaptive notch filters, this method is easy to implement and is able to suppress the low-order harmonic near the fundamental wave.
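A small numerical sketch can make the properties of the compensated detector concrete. It assumes that the detector multiplies the envelopes by feedback terms $\cos\hat{\theta} + a_2\cos 2\hat{\theta}$ and $\sin\hat{\theta} + a_2\sin 2\hat{\theta}$; this product form is an assumption about how the structure is realized, chosen because it expands term by term into Eq. (5). The function names are hypothetical.

```python
import math

def envelopes(theta, a2):
    """Resolver envelopes with a 2nd-harmonic term, as in Eq. (2)."""
    return (math.sin(theta) + a2 * math.sin(2.0 * theta),
            math.cos(theta) + a2 * math.cos(2.0 * theta))

def pd_compensated(theta, theta_hat, a2):
    """Phase detector whose feedback terms also carry the 2nd harmonic."""
    u_s, u_c = envelopes(theta, a2)
    ref_c = math.cos(theta_hat) + a2 * math.cos(2.0 * theta_hat)
    ref_s = math.sin(theta_hat) + a2 * math.sin(2.0 * theta_hat)
    return u_s * ref_c - u_c * ref_s

def eq5_rhs(theta, theta_hat, a2):
    """Right-hand side of Eq. (5), written out term by term."""
    return (math.sin(theta - theta_hat)
            + a2 * math.sin(theta - 2.0 * theta_hat)
            + a2 * math.sin(2.0 * theta - theta_hat)
            + a2 * a2 * math.sin(2.0 * (theta - theta_hat)))

def small_signal_gain(theta, a2):
    """Gain predicted by Eq. (7): 1 + 2*a2^2 + 3*a2*cos(theta)."""
    return 1.0 + 2.0 * a2 * a2 + 3.0 * a2 * math.cos(theta)

a2 = 0.0033
thetas = [2.0 * math.pi * k / 36 for k in range(36)]
# Output at zero estimation error, for every rotor angle on the grid.
residual = max(abs(pd_compensated(th, th, a2)) for th in thetas)
```

The quantity `residual` is the detector output when the estimate equals the true angle, so it indicates whether the detector still has a harmonic-induced offset at lock.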
4 Simulation
To simulate the envelopes of the resolver, a pair of orthogonal unit sinusoidal signals is generated. The 2nd harmonics with an amplitude of $a_2 = 0.0033$ V and Gaussian white noise are added to the pair of signals. As shown in Figs. 2 and 3, the parameters of the loop filter in the designed method are the same as those in the conventional PLL, namely $\zeta = 0.707$ and $\omega_n = 628$ rad/s. The two cases of constant speed and constant acceleration are carried out and analyzed as follows.

Case 1: constant speed ($\omega = 6\pi$ rad/s)
The estimation errors of the two methods are compared in Figs. 4 and 5. From the blue curves, the angular position estimation error of the conventional PLL presents a sinusoidal fluctuation with an amplitude of 0.19° (0.0033 rad) and a frequency of $6\pi$ rad/s. The velocity estimation error also presents a fluctuation with the same frequency. These results verify the theoretical error analysis of the conventional PLL in Eq. (4). From the red curves, the fluctuations caused by the 2nd harmonic are compensated by the proposed method.

Fig. 4 The designed reward function
Fig. 5 The designed reward function

Case 2: constant acceleration ($\omega = (4\pi t + \pi)$ rad/s)
As shown in Figs. 6 and 7, the proposed method has smaller errors than the conventional PLL. The blue curves verify the problem formulation of the conventional PLL in Eq. (4). When the envelopes contain 2nd harmonics, the position estimation error presents a sinusoidal fluctuation with an amplitude of 0.19° (0.0033 rad) and a frequency of $(4\pi t + \pi)$ rad/s. The amplitude of the velocity estimation error increases with the velocity. In contrast, the designed method compensates the fluctuations introduced by the 2nd harmonics.

Fig. 6 The designed reward function
Fig. 7 The designed reward function
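The constant-speed case can be reproduced qualitatively with a short discrete-time sketch. Several choices below are assumptions rather than details given in the text: a proportional-integral loop filter with gains $2\zeta\omega_n$ and $\omega_n^2$, forward-Euler integration with a step of $10^{-4}$ s, no measurement noise, and the product-form phase detector. All function and variable names are hypothetical.

```python
import math

def envelopes(theta, a2):
    """Resolver envelopes with a 2nd-harmonic term, as in Eq. (2)."""
    return (math.sin(theta) + a2 * math.sin(2.0 * theta),
            math.cos(theta) + a2 * math.cos(2.0 * theta))

def simulate_pll(compensated, a2=0.0033, omega=6.0 * math.pi,
                 zeta=0.707, wn=628.0, ts=1e-4, t_end=1.0):
    """Forward-Euler type-II PLL; returns the position-error history."""
    kp, ki = 2.0 * zeta * wn, wn * wn     # PI loop-filter gains (assumed form)
    b2 = a2 if compensated else 0.0       # harmonic weight in the feedback path
    theta_hat, omega_int = 0.0, 0.0
    errors = []
    for k in range(int(round(t_end / ts))):
        theta = omega * k * ts            # constant-speed rotor angle
        u_s, u_c = envelopes(theta, a2)
        ref_c = math.cos(theta_hat) + b2 * math.cos(2.0 * theta_hat)
        ref_s = math.sin(theta_hat) + b2 * math.sin(2.0 * theta_hat)
        eps = u_s * ref_c - u_c * ref_s   # phase-detector output
        errors.append(theta - theta_hat)
        omega_int += ki * eps * ts        # integral branch of the loop filter
        theta_hat += (omega_int + kp * eps) * ts   # VCO modeled as an integrator
    return errors

def steady_amplitude(errors):
    """Peak absolute error over the second half of the run."""
    tail = errors[len(errors) // 2:]      # discard the start-up transient
    return max(abs(e) for e in tail)

amp_conv = steady_amplitude(simulate_pll(compensated=False))
amp_comp = steady_amplitude(simulate_pll(compensated=True))
```

Under these assumptions the conventional loop should settle to a position ripple close to $a_2$ in radians, while the compensated loop should settle to an error that is negligible by comparison. Adding noise or a speed ramp would require extending the sketch.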
5 Conclusions
Because of nonlinear windings and excitation voltage distortion, the envelope signals of the resolver include 2nd harmonics, which seriously affect the accuracy of the conventional PLL. In this paper, the output of the PD in the PLL is shown to be affected by the 2nd harmonics, which leads to sinusoidal fluctuations of the angular position and velocity estimates. To solve this problem, a second harmonic-compensated PLL is proposed. The PD structure in this method is improved and is able to compensate the fluctuations. The simulation and experimental results verify that this method can improve the estimation accuracy. In the future, the influence of high-order harmonics will be discussed and analyzed to further improve the accuracy of the PLL.
References
1. Zhou, D., Lai, L., Li, G.: An improved dual-axis mirror control scheme for imaging LIDAR application. IFAC-Pap. Online 49(17), 124–129 (2016)
2. Hou, C.C., Chiang, Y.H., Lo, C.P.: DSP-based resolver-to-digital conversion system designed in time domain power electronics. IET Power Electron. 7(9), 2227–2232 (2014)
3. Benammar, M., Khattab, A., Saleh, S., et al.: A sinusoidal encoder-to-digital converter based on an improved tangent method. IEEE Sens. J. 17(16), 5169–5179 (2017)
4. Wang, F., Shi, T., Yan, Y., et al.: Resolver-to-digital conversion based on acceleration-compensated angle tracking observer. IEEE Trans. Instrum. Measur. 68(10), 3494–3502 (2019)
5. Nguyen, H.X., Tran, T.N., Park, J.W., et al.: An adaptive linear-neuron-based third-order PLL to improve the accuracy of absolute magnetic encoders. IEEE Trans. Ind. Electron. 66(6), 4639–4649 (2019)
6. Hoang, H.V., Jeon, J.W.: Signal compensation and extraction of high resolution position for sinusoidal magnetic encoders. In: International Conference on Control, Automation and Systems, Seoul, pp. 1368–1373. IEEE (2007)
7. Jung, S.Y., Nam, K.: PMSM control based on edge-field hall sensor signals through ANF-PLL processing. IEEE Trans. Ind. Electron. 58(11), 5121–5129 (2011)
8. Carugati, I., Donato, P., Maestri, S., et al.: Frequency adaptive PLL for polluted single-phase grids. IEEE Trans. Power Electron. 27(5), 2396–2404 (2012)
9. Hanselman, D.C.: Resolver signal requirements for high accuracy resolver-to-digital conversion. IEEE Trans. Ind. Electron. 37(6), 556–561 (1990)
Adaptive Tracking of Nonlinear Switched Systems with Sensor Uncertainties Based on a Weighted Average Voting Algorithm Zhiyi Cheng and Yan Lin
Abstract In this paper, we propose an adaptive output-feedback tracking control scheme based on a weighted average voting algorithm for a class of switched systems with external disturbances and sensor uncertainties. A modified weighted average voting algorithm is employed to regulate the proportion of information transmitted by the sensors in different cases. By incorporating the high gain scaling technique, a dynamic gain state observer and a linear-like controller are constructed to form the closed-loop system, which ensures that all signals are bounded and that the tracking errors converge to a residual set that can be made arbitrarily small in the presence of sensor failures. Finally, a numerical simulation is given to illustrate the effectiveness of our scheme.
Keywords Nonlinear switched systems · Voting algorithm · Dynamic gain observer · Sensor uncertainties
Z. Cheng · Y. Lin (✉), College of Electrical Engineering and Automation, Shandong University of Science and Technology, Qingdao 266510, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_22

1 Introduction
The safety and reliability of switched systems have received much research attention, since switched systems have found abundant applications in daily life and production, such as power electronics, aircraft systems and power systems [1–4]. It is well known that instability caused by switching between different subsystems is an essential issue. Therefore, many investigations have focused on this issue: switched systems under arbitrary switching signals have been studied based on the common Lyapunov function (CLF) in [5, 6] and multiple Lyapunov functions (MLF) in [7], and average dwell time (ADT) methods for switched systems under restricted switching signals are adopted in [8]. In practice, system components, especially actuators and sensors, inevitably suffer from uncertain internal and external factors, which lead to performance decay and system instability. To avoid or compensate for the occurrence of the above
situations, tremendous effort has been devoted to fault tolerant control (FTC). FTC can be roughly divided into passive and active approaches. Passive methods rely on robust control theory and use a fixed controller that is insensitive to a set of failure cases fixed in advance [9]. In contrast, active approaches adjust the controller parameters or reconfigure the system structure to make up for the loss of component efficiency. Numerous active FTC schemes have been proposed, such as multiple models [10], sliding mode control [11], fault diagnosis and control reconfiguration [12] and direct adaptive control [13, 14], to name just a few. Although plenty of positive results have been obtained for fault system architectures with one sensor, FTC based on redundant sensors has rarely been investigated. In [15], neural networks are adopted to identify the faulty sensor and recover the irregular values of the states. Other earlier works, such as analytical redundancy [16, 17], have made attempts and progress; however, none of the above can be readily applied to adaptive FTC. It should be noted that, in general, operating redundant sensors requires a monitoring mechanism based on a voting algorithm. The basic voting algorithms can be classified into the following categories: unanimity voting [18], plurality voting [19], median voting [20] and weighted average voting [21, 22]. In [23], a novel history-based weighted average voter is investigated; nevertheless, an excessive focus on history data inevitably affects the accuracy of the real-time signal information. Therefore, how to design an appropriate voting algorithm based on history data for adaptive FTC still needs to be explored in depth. In this paper, motivated by the above discussions, a modified weighted average voting algorithm based on history data is proposed in order to combine redundant sensors with adaptive FTC for switched systems.
By incorporating a high-gain scaling technique, a state observer and an adaptive controller are constructed. The major contributions of this paper are summarized as follows:

1. To the best of our knowledge, this is the first study of adaptive FTC for switched nonlinear systems based on a voting algorithm. We modify the history-based weighted average voting algorithm so that it assigns a proper weight to each sensor in real time while the design parameters are updated simultaneously. Thanks to a forgetting factor, the undesirable effect of historical fault data is also removed in the failure-free case.
2. Both multiplicative and additive sensor uncertainties are considered, which is more general than [23–25]. Furthermore, our control scheme removes the strict-feedback restriction for all subsystems under a polynomial growth condition. Note that our controller is linear-like, which makes it simpler to design than a backstepping controller.

The rest of this paper is organized as follows. Section 2 describes the switched system model, the basic assumptions for controller design and the voter mechanism. Section 3 introduces the mechanism and design procedure of the modified weighted average voting algorithm based on history data. In Sect. 4, the observer and the adaptive output feedback controller are constructed and the stability analysis of the switched system is presented. Numerical simulation results in Sect. 5 illustrate the effectiveness of the proposed scheme. Finally, conclusions are summarized in Sect. 6.
2 Problem Statement

In this paper, a class of nonlinear switched systems is described by the following differential equations

$$\begin{aligned}
\dot{x}_i &= x_{i+1} + \psi_{i,\kappa}(x,\theta(t)) + \xi_i(t), \quad i = 1,\dots,n-1,\\
\dot{x}_n &= u + \psi_{n,\kappa}(x,\theta(t)) + \xi_n(t),\\
y &= x_1,
\end{aligned} \tag{1}$$

where $x = [x_1,\dots,x_n]^T \in R^n$, $u \in R$ and $y \in R$ are the states, input and real output, respectively; $\theta(t) \in R^m$ is an unknown continuous time-varying vector within an unknown bounded set; the continuous functions $\psi_{i,\kappa}(x,\theta): R^n \times R^m \to R$, $i = 1,\dots,n$, $\kappa = 1,\dots,k$, need not be known; and $\xi_i(t)$, $i = 1,\dots,n$, represent external disturbances satisfying

$$|\xi_i(t)| \le \bar{\xi}, \quad \forall t \ge 0, \tag{2}$$

where $\bar{\xi}$ is an unknown positive constant.

Since the sensors may suffer outage, variation, drift and other failures during system operation, the measured output of the $j$th sensor is modeled as

$$y_j(t) = \rho_j(t)y(t) + \eta_j(t), \quad \forall t \in [0,\infty),\ j = 1,\dots,m_0, \tag{3}$$

where $\rho_j(t)$ denotes the sensor gain and $\eta_j(t)$ denotes the sensor measurement error. The measured system output is then described by

$$y_\varpi(t) = \sum_{j=1}^{m_0} \varpi_j(t)\,y_j(t), \tag{4}$$

where $m_0 = m_1 + (m_1+1)$ is the (odd) number of sensors and the bounded function $\varpi_j(t)$ is the weighting function for the output of the $j$th sensor. We make the following assumptions for system (1).

Assumption 1 For the functions $\psi_{i,\kappa}$, there exist a known positive constant $p \ge 1$ and an unknown positive constant $\vartheta$ such that

$$|\psi_{i,\kappa}(x,\theta)| \le \vartheta(1+|y|^p)(|x_1| + \cdots + |x_i|), \quad i = 1,\dots,n. \tag{5}$$
Assumption 2 The reference signal $y_r(t)$ and its derivative $\dot{y}_r(t)$ are continuous and bounded.

Remark 1 In Assumption 1, the bound (5) involves unmeasured states, so an observer must be constructed to estimate them. This setting is more general than [24].

Remark 2 The sensor failure model (3) is more general than those in [25, 26] and covers more categories of sensor failure. In particular, $\rho_j(t) = 0$ implies the loss of all valid information from the $j$th sensor. In Assumption 2, the bounds of $y_r$ and $\dot{y}_r$ need not be known, as will be explained later.
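The sensor model (3) and the fused output (4) can be illustrated with a minimal Python sketch. The function names and all numerical values below are hypothetical and chosen only for illustration; the weights are simply supplied by hand here rather than produced by the voter of Sect. 3.

```python
import numpy as np

def sensor_outputs(y, rho, eta):
    """Model (3): y_j = rho_j * y + eta_j for each redundant sensor."""
    rho = np.asarray(rho, dtype=float)
    eta = np.asarray(eta, dtype=float)
    return rho * y + eta

def fused_output(y_sensors, weights):
    """Model (4): weighted sum of the individual sensor outputs."""
    return float(np.dot(weights, y_sensors))

# Hypothetical example: sensor 2 has lost all information (rho = 0)
# and reports a constant offset; sensor 3 has a small additive error.
y_true = 1.0
y_s = sensor_outputs(y_true, rho=[1.0, 0.0, 1.0], eta=[0.0, 0.2, 0.01])
y_meas = fused_output(y_s, weights=[0.5, 0.0, 0.5])
print(y_s, y_meas)
```

With the faulty sensor given zero weight, the fused output stays close to the true output even though one measurement is useless, which is the behavior the voter is meant to produce automatically.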
3 A Weighted Average Voting Algorithm

In this section, a weighted average voting algorithm based on history data, applied to the operation of redundant sensors, is proposed.

3.1 Sensor Performance

To quantify sensor accuracy, the following inequality is used to assess the output of the $j$th sensor:

$$|y_j(t) - y(t)| \le \Delta_s, \tag{6}$$

where $\Delta_s$ is an arbitrary positive constant. Similar to [23], the distances $d_{ij}(t)$, $j = 1,\dots,m_0$, $j \ne i$, are defined as

$$d_{ij}(t) = |y_i(t) - y_j(t)|, \tag{7}$$

and it is easily shown that $d_{ij} \le 2\nu\Delta_s$ when both sensors are in healthy operation, where the positive constant $\nu \le 1$.
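As a small illustration of (7), the following Python sketch computes all pairwise distances between sensor outputs and checks them against the healthy-pair bound $2\nu\Delta_s$. The function names and the sample readings are hypothetical; $\nu = 1$ and $\Delta_s = 0.02$ are the values used later in Sect. 5.

```python
import numpy as np

def pairwise_distances(y_sensors):
    """Distances (7): d[i, j] = |y_i - y_j| for all sensor pairs."""
    y = np.asarray(y_sensors, dtype=float)
    return np.abs(y[:, None] - y[None, :])

def within_healthy_bound(d, nu, delta_s):
    """True where a distance respects the healthy-pair bound 2*nu*delta_s."""
    return d <= 2.0 * nu * delta_s

# Hypothetical readings: sensors 1 and 2 agree, sensor 3 is far off.
d = pairwise_distances([1.00, 1.01, 0.20])
ok = within_healthy_bound(d, nu=1.0, delta_s=0.02)
print(d)
print(ok)
```

A sensor whose row of `ok` is mostly `False` disagrees with the majority, which is the information the voter mechanism of Sect. 3.2 builds on.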
3.2 Voter Mechanism

In (4), the number of sensors is restricted to be odd. We then make the following assumption.

Assumption 3 For all $t \ge 0$, the measurement errors of no more than $m_1$ sensors exceed $\Delta_s$.

For simplicity, we distinguish three cases for the distances $d_{ij}(t)$ associated with the $i$th sensor:

Case 1: All distances of the $i$th sensor satisfy $d_{ij}(t) \le 2\nu\Delta_s$;
Case 2: At most $m_1$ distances of the $i$th sensor satisfy $d_{ij}(t) \ge 2\nu\Delta_s$;
Case 3: At least $m_1 + 1$ distances of the $i$th sensor satisfy $d_{ij}(t) \ge 2\nu\Delta_s$.

With the above three cases, define the auxiliary functions as

$$\varsigma_{ij} = \begin{cases} 1, & \text{if Case 3 holds},\\ 0, & \text{if Case 1 or 2 holds}. \end{cases} \tag{8}$$
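With the application of (8), the weighting functions are constructed in two steps. First, an unnormalized weight $w_i(t)$ is formed from the positive constant $\ell_i$ and the two history terms

$$\int_{t-T}^{t}\sum_{j=1,\,j\ne i}^{m_0}\varsigma_{ij}(\tau)\,d\tau \quad\text{and}\quad \int_{0}^{t}\sum_{j=1,\,j\ne i}^{m_0}\varsigma_{ij}(t)\,d_{ij}(t)\,dt. \tag{9}$$

Second, the weights are normalized:

$$\varpi_i(t) = \frac{w_i(t)}{\sum_{j=1}^{m_0} w_j(t)}, \tag{10}$$

where the positive constant $\ell_i$ determines the convergence rate of the $i$th weighting function, $\int_{t-T}^{t}\sum_{j=1,\,j\ne i}^{m_0}\varsigma_{ij}(\tau)\,d\tau$ represents the effect of the sensor operation status during the time interval $T$, which is chosen by the designer, and $\int_{0}^{t}\sum_{j=1,\,j\ne i}^{m_0}\varsigma_{ij}(t)\,d_{ij}(t)\,dt$ denotes the comprehensive impact of sensor operation status and sensor performance over all time. Furthermore, $\varpi_j(t) = \frac{1}{m_0}$, $j = 1,\dots,m_0$, when $t = 0$. Obviously, the following relations hold:

$$\varpi_j(t) \in (0,1), \quad \sum_{j=1}^{m_0}\varpi_j(t) = 1, \quad \dot{\varpi}_j \in L_\infty, \quad \forall t \ge 0. \tag{11}$$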
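The voter described above can be sketched in discrete time as follows. This is only a sketch under stated assumptions, not the authors' exact construction: the unnormalized weight is assumed to have the exponential form `exp(-rate * window_term * total_term)`, which is one form consistent with the stated properties (equal weights at $t = 0$, vanishing weight for a persistently faulty sensor, recovery once the fault leaves the window $T$); the expression (9) in the paper may differ. The class name, the parameter names and the Euler-type integration with step `dt` are likewise hypothetical.

```python
import math
from collections import deque

class HistoryVoter:
    """Discrete-time sketch of a history-based weighted average voter."""

    def __init__(self, m0, nu, delta_s, window, dt, rate):
        assert m0 % 2 == 1, "the number of sensors is assumed odd"
        self.m0, self.m1 = m0, (m0 - 1) // 2
        self.threshold = 2.0 * nu * delta_s      # distance bound 2*nu*Delta_s
        self.dt, self.rate = dt, rate
        n_win = max(1, round(window / dt))       # samples in the window T
        self.recent = [deque(maxlen=n_win) for _ in range(m0)]
        self.total = [0.0] * m0                  # integral over [0, t]

    def step(self, y):
        """Update the history with readings y and return normalized weights."""
        w = []
        for i in range(self.m0):
            d = [abs(y[i] - y[j]) for j in range(self.m0) if j != i]
            # Case 3: at least m1 + 1 distances exceed the threshold.
            flagged = sum(dij >= self.threshold for dij in d) >= self.m1 + 1
            s = 1.0 if flagged else 0.0
            self.recent[i].append(s * (self.m0 - 1) * self.dt)
            self.total[i] += s * sum(d) * self.dt
            window_term = sum(self.recent[i])
            # Assumed exponential form of the unnormalized weight.
            w.append(math.exp(-self.rate * window_term * self.total[i]))
        norm = sum(w)
        return [wi / norm for wi in w]

voter = HistoryVoter(m0=3, nu=1.0, delta_s=0.02, window=4.0, dt=0.01, rate=40.0)
print(voter.step([1.0, 1.0, 1.0]))   # equal weights in the failure-free case
```

The sliding window plays the role of the forgetting factor: once a sensor has behaved well for a full window, its window term returns to zero and its weight recovers regardless of the accumulated fault history.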
According to (8)–(10), the system output can be regulated to appropriate values by the weighting functions. We now introduce the following lemma, which describes the behavior of $\varpi_j(t)$ in the different cases.

Lemma 1 For the $j$th sensor measurement output $y_j$, suppose there exists a constant $t_f$ such that the sensor operation status does not change for $t \ge t_f$.

(1) If all the sensors are failure free for all $t \ge 0$, then

$$\varpi_j(t) = \frac{1}{m_0}, \quad j = 1,\dots,m_0,\ \forall t \ge 0. \tag{12}$$

(2) If no more than $m_1 + 1$ sensors fail and the $j$th sensor lies in Case 2 or 3, then

$$\lim_{t\to\infty}\varpi_j(t) = 0. \tag{13}$$

(3) If no more than $m_1 + 1$ sensors fail and the $j$th sensor lies in Case 1, then

$$\lim_{t\to\infty}\varpi_j(t) = \frac{1}{1 + \lim_{t\to\infty}\sum_{i=1,\,i\ne j}^{m_0} w_i(t)}. \tag{14}$$

Continuous changes between failure patterns are not considered in this paper. In addition, the measured system output $y_\varpi(t)$ satisfies

$$\lim_{t\to\infty}|y_\varpi(t) - y(t)| \le \frac{4m_1+1}{m_1+1}\,\nu\Delta_s. \tag{15}$$

The proof is given in the Appendix.
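As a quick numerical reading of the bound (15), assume the three-sensor configuration used in the simulation of Sect. 5, so that $m_0 = 3$ and hence $m_1 = 1$, together with the values $\nu = 1$ and $\Delta_s = 0.02$ reported there:

```latex
\[
\frac{4m_1+1}{m_1+1}\,\nu\Delta_s
  = \frac{4\cdot 1+1}{1+1}\cdot 1\cdot 0.02
  = 0.05 .
\]
```

Under these values the fused output is guaranteed to deviate from the real output by at most 0.05 in the limit.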
4 Adaptive Output Feedback Controller Design

4.1 Observer and Adaptive Output Feedback Controller

Similar to [27], Lemma 2 transforms the nonlinearity of the switched system into a form in which the measured output is coupled with the measured states, and Lemma 3 is used to determine some parameters, matrices and vectors of the adaptive output feedback controller.

Lemma 2 Let the functions $\psi_{i,\kappa}$, $i = 1,\dots,n$, $\kappa = 1,\dots,k$, satisfy (5). Then there exists a positive parameter $\vartheta^*$ such that

$$|\Phi_{i,\kappa}| \le \vartheta^*(1+|y_\varpi|^p)(|\hat{x}_1| + \cdots + |\hat{x}_2|) + Y_{r0} + \bar{\xi}, \quad i = 1,\dots,n, \tag{16}$$

where $Y_{r0} = \sup(|y_{r0}| + |\dot{y}_{r0}|)$ is an unknown constant. To account for the effect of the weighted average voting algorithm on the sensors, Lemma 2 lets the measured output $y_\varpi$ replace the unmeasurable system output $y$; the functions $\Phi_i$ and the parameter $y_{r0}$ will be introduced later.

Proof From Assumptions 1 and 2 and (3),

$$|\Phi_{i,\kappa}| \le \vartheta\left(1 + \left|\frac{y_\varpi}{\sum_{j=1}^{m_0}\varpi_j\rho_j} - \frac{\sum_{j=1}^{m_0}\varpi_j\eta_j}{\sum_{j=1}^{m_0}\varpi_j\rho_j}\right|^p\right)(|x_1| + \cdots + |x_i|) + |\dot{y}_{r0}| + \bar{\xi}, \quad i = 1,\dots,n. \tag{17}$$

By applying the inequality

$$(z_1 + z_2)^p \le 2^{p-1}(z_1^p + z_2^p), \quad z_1, z_2 \ge 0, \tag{18}$$

and (11), there exists a positive constant $\vartheta^*$ satisfying

$$\begin{aligned}
\vartheta\left(1 + \frac{1}{\bigl(\sum_{j=1}^{m_0}\varpi_j\rho_j\bigr)^p}\Bigl(|y_\varpi| + \Bigl|\sum_{j=1}^{m_0}\varpi_j\eta_j\Bigr|\Bigr)^p\right)
&\le \vartheta\left(1 + \frac{2^{p-1}}{\bigl(\sum_{j=1}^{m_0}\varpi_j\rho_j\bigr)^p}\Bigl(|y_\varpi|^p + \Bigl|\sum_{j=1}^{m_0}\varpi_j\eta_j\Bigr|^p\Bigr)\right)\\
&\le \vartheta^*(1+|y_\varpi|^p).
\end{aligned} \tag{19}$$

This completes the proof of Lemma 2.

Lemma 3 For any positive constant $\sigma$, there exist positive constants $\mu_1$ and $\mu_2$, positive definite symmetric matrices $P$ and $Q$, and vectors $a = [a_1,\dots,a_n]^T$ and $k = [k_1,\dots,k_n]^T$, such that the following inequalities hold:

$$\begin{aligned}
(A - ac^T)^TP + P(A - ac^T) &\le -\mu_1 I,\\
(A - bk^T)^TQ + Q(A - bk^T) &\le -4\mu_1 I,\\
\sigma P \le DP + PD + 2\sigma P &\le \mu_2 P,\\
\sigma Q \le DQ + QD + 2\sigma Q &\le \mu_2 Q,
\end{aligned} \tag{20}$$

where $D = \mathrm{diag}\{0, 1, \dots, n-1\}$, and $A$, $b$, $c$ are defined as

$$A = \begin{bmatrix} 0 & I_{n-1}\\ 0 & 0 \end{bmatrix}, \quad b = \begin{bmatrix} 0_{(n-1)\times 1}\\ 1 \end{bmatrix}, \quad c = \begin{bmatrix} 1\\ 0_{(n-1)\times 1} \end{bmatrix}. \tag{21}$$

Before designing the observer, we define the following variables:

$$e_\varpi = y_\varpi - y_r, \quad e_r = x_1 - y_r. \tag{22}$$
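The matrices in (21) and the first two Lyapunov inequalities of (20) can be checked numerically with a short NumPy sketch. The gain vectors `a` and `k` below are hypothetical values chosen so that both closed-loop matrices have the characteristic polynomial $(s+1)^3$; they are not the gains used in the paper, and the sketch does not verify the remaining two inequalities of (20), which also involve $D$, $\sigma$ and $\mu_2$.

```python
import numpy as np

def chain_matrices(n):
    """Build A, b, c from (21) and D = diag{0, 1, ..., n-1}."""
    A = np.diag(np.ones(n - 1), k=1)          # ones on the superdiagonal
    b = np.zeros((n, 1)); b[-1, 0] = 1.0
    c = np.zeros((n, 1)); c[0, 0] = 1.0
    D = np.diag(np.arange(n, dtype=float))
    return A, b, c, D

def solve_lyapunov(M):
    """Solve M^T P + P M = -I using the Kronecker (column-major vec) form."""
    n = M.shape[0]
    eye = np.eye(n)
    K = np.kron(eye, M.T) + np.kron(M.T, eye)
    p = np.linalg.solve(K, -eye.flatten(order="F"))
    P = p.reshape((n, n), order="F")
    return 0.5 * (P + P.T)                    # remove rounding asymmetry

n = 3
A, b, c, D = chain_matrices(n)
a = np.array([[3.0], [3.0], [1.0]])           # hypothetical observer gains
k = np.array([[1.0], [3.0], [3.0]])           # hypothetical controller gains
A_obs = A - a @ c.T
A_ctl = A - b @ k.T
P = solve_lyapunov(A_obs)
Q = solve_lyapunov(A_ctl)
print(np.linalg.eigvals(A_obs).real, np.linalg.eigvalsh(P))
```

Because both matrices are Hurwitz, the computed `P` and `Q` are positive definite, which is the starting point for choosing $\mu_1$ in Lemma 3.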
In addition, an observer with dynamic gain is indispensable for estimating the unknown states. With the help of (1) and (22), define the observer with dynamic gain $l$ as

$$\begin{aligned}
\dot{\hat{x}}_i &= \hat{x}_{i+1} + l^i a_i(e_\varpi - \hat{x}_1), \quad i = 1,\dots,n-1,\\
\dot{\hat{x}}_n &= u + l^n a_n(e_\varpi - \hat{x}_1),
\end{aligned} \tag{23}$$

where $\hat{x}_i$, $i = 1,\dots,n$, are the estimated states, and the gain vector $a$ is chosen by Lemma 3 with $\sigma$ satisfying

$$2\sigma p < 1. \tag{24}$$

The dynamic gain $l$ is then updated by

$$l(0) = 1, \quad \dot{l} = \max\bigl\{-\chi_1 l^2 + \chi_2 l(1+|y_\varpi|^p)^2,\ (e_\varpi - \hat{x}_1)^2 + \hat{x}_1^2 - \delta_\varpi,\ 0\bigr\}, \tag{25}$$

where $\delta_\varpi$ is a positive constant, and the positive constants $\chi_1$, $\chi_2$ will be given later. Since $\dot{l}$ is locally Lipschitz in $(l, y_\varpi, \hat{x}_1, y_r)$, the dynamic gain $l$ possesses the following properties:

$$l \ge 1, \quad \dot{l} \ge 0, \quad \dot{l} \ge -\chi_1 l^2 + \chi_2 l(1+|y_\varpi|^p)^2, \quad \dot{l} \ge (e_\varpi - \hat{x}_1)^2 + \hat{x}_1^2 - \delta_\varpi, \quad \forall t \ge 0. \tag{26}$$

With the dynamic gain in hand, we design the controller as

$$u = -l^n k_1\hat{x}_1 - l^{n-1}k_2\hat{x}_2 - \cdots - l k_n\hat{x}_n, \tag{27}$$

where the gain vector $k$ is selected from (20) in the same way as the vector $a$.
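The observer (23), the gain update (25) and the controller (27) translate into three small right-hand-side functions. The sketch below is a hedged Python illustration: the function and argument names are hypothetical, `e_meas` stands for the measurable tracking error and `y_meas` for the fused measured output, and the values $\chi_1 = 2.1$, $\chi_2 = 2.9$, $p = 2$ and a dead-zone constant of 0.1 in the usage lines are those reported in the simulation section.

```python
def observer_rhs(x_hat, u, l, a, e_meas):
    """Right-hand side of the high-gain observer (23)."""
    n = len(x_hat)
    innovation = e_meas - x_hat[0]
    dx = [x_hat[i + 1] + l ** (i + 1) * a[i] * innovation for i in range(n - 1)]
    dx.append(u + l ** n * a[n - 1] * innovation)
    return dx

def gain_rate(l, y_meas, e_meas, x1_hat, chi1, chi2, p, delta):
    """Dynamic gain update (25): the largest of the three candidates."""
    growth = -chi1 * l ** 2 + chi2 * l * (1.0 + abs(y_meas) ** p) ** 2
    dead_zone = (e_meas - x1_hat) ** 2 + x1_hat ** 2 - delta
    return max(growth, dead_zone, 0.0)

def control(l, k, x_hat):
    """Linear-like controller (27): u = -sum_i l^(n-i+1) k_i x_hat_i."""
    n = len(k)
    return -sum(l ** (n - i) * k[i] * x_hat[i] for i in range(n))

# Usage with illustrative inputs.
print(gain_rate(1.0, 0.0, 0.0, 0.0, chi1=2.1, chi2=2.9, p=2, delta=0.1))
print(control(2.0, [1.0, 1.0, 1.0], [1.0, 0.0, 0.0]))
print(observer_rhs([0.0, 0.0, 0.0], 1.0, 2.0, [1.0, 1.0, 1.0], 1.0))
```

Since `gain_rate` never returns a negative value, integrating it from `l = 1` reproduces the monotonicity properties listed in (26).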
4.2 Main Results

In this subsection, we prove the stability of the closed-loop system with the weighted average voting algorithm (4), the state observer (23), the dynamic high gain (25) and the adaptive output feedback controller (27), in the presence of sensor uncertainties and disturbances. The main results are summarized in the following theorem.

Theorem 1 If the closed-loop system consisting of (1), (9), (10), (23), (27) and the reference signal satisfies Assumptions 1–3, then there exists a positive constant $\wp_m < 1$ such that, whenever

$$\sum_{j=1}^{m_0}\varpi_j(t)\rho_j(t) \in [1 - \wp_m, 1], \quad \forall t \ge 0, \tag{28}$$

the closed-loop signal vector $(x, \hat{x}, l)$ is bounded and the real tracking error $e_r$ converges to a residual set; moreover, the closer the sensor operation status is to the failure-free case, the smaller the tracking error.

Proof The locally Lipschitz condition ensures the existence and uniqueness of the solution $(x, \hat{x}, l)$ on a right maximal interval $[0, T_f)$ for some $T_f \in (0, +\infty]$. For brevity, define the estimation error as

$$\epsilon = [\epsilon_1,\dots,\epsilon_n]^T = x - Y_r - \hat{x}, \tag{29}$$

where $\hat{x} = [\hat{x}_1,\dots,\hat{x}_n]^T$ and $Y_r = [y_{r0}, 0, \dots, 0]^T$ with

$$y_{r0} = \frac{y_r - \sum_{j=1}^{m_0}\varpi_j\eta_j}{\sum_{j=1}^{m_0}\varpi_j\rho_j}. \tag{30}$$

To proceed, under the transformations

$$\bar{\epsilon}_i = \frac{\epsilon_i}{l^{\,i-1+\sigma}}, \quad \varepsilon_i = \frac{\hat{x}_i}{l^{\,i-1+\sigma}}, \quad i = 1,\dots,n, \tag{31}$$

the time derivatives of $\bar{\epsilon}$ and $\varepsilon$ satisfy

$$\begin{aligned}
\dot{\bar{\epsilon}} &= l(A - ac^T)\bar{\epsilon} + L_1^{-1}\Phi(\cdot,\theta,\dot{y}_{r0},\xi) + la\Bigl(1 - \sum_{j=1}^{m_0}\varpi_j\rho_j\Bigr)(\bar{\epsilon}_1 + \varepsilon_1) - \frac{\dot{l}}{l}(\sigma I + D)\bar{\epsilon},\\
\dot{\varepsilon} &= l(A - bk^T)\varepsilon + la\sum_{j=1}^{m_0}\varpi_j\rho_j\,\bar{\epsilon}_1 - la\Bigl(1 - \sum_{j=1}^{m_0}\varpi_j\rho_j\Bigr)\varepsilon_1 - \frac{\dot{l}}{l}(\sigma I + D)\varepsilon,
\end{aligned} \tag{32}$$

where $\bar{\epsilon} = [\bar{\epsilon}_1,\dots,\bar{\epsilon}_n]^T$, $\varepsilon = [\varepsilon_1,\dots,\varepsilon_n]^T$, $\xi = [\xi_1,\dots,\xi_n]^T$, $L_1 = \mathrm{diag}\{l^{\sigma},\dots,l^{\,n-1+\sigma}\}$, and

$$\Phi(\cdot,\theta,\dot{y}_{r0},\xi) = [\psi_{1,\kappa} + \xi_1 - \dot{y}_{r0},\ \psi_{2,\kappa} + \xi_2,\ \dots,\ \psi_{n,\kappa} + \xi_n]^T. \tag{33}$$
Once the positive parameter $\sigma$ is determined, the set $(P, Q, \mu_1, \mu_2, a, k)$ can be selected by Lemma 3. Then, construct a common Lyapunov function

$$V = \gamma\,\bar{\epsilon}^TP\bar{\epsilon} + \varepsilon^TQ\varepsilon, \tag{34}$$

and denote the sensor uncertainty under the weighted average voting algorithm by

$$\wp = 1 - \sum_{j=1}^{m_0}\varpi_j\rho_j > 0. \tag{35}$$
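Table 1 Inequalities used in the derivation of $\dot{V}$

(a) $2\gamma l\,\bar{\epsilon}^TPa\,\wp(\bar{\epsilon}_1 + \varepsilon_1) \le \frac{1}{2}\mu_1 l\|\bar{\epsilon}\|^2 + \frac{\|Pa\|^2}{\mu_1}l\|\bar{\epsilon}\|^2 + \frac{1}{2}l\mu_1\|\varepsilon\|^2$
(b) $-2l\,\varepsilon^TQa\,\wp\,\varepsilon_1 \le 2l\wp\|Qa\|\|\varepsilon\|^2 \le \frac{1}{2}\mu_1 l\|\varepsilon\|^2$
(c) $2l\,\varepsilon^TQa(1-\wp)\bar{\epsilon}_1 \le \mu_1 l\|\varepsilon\|^2 + \frac{3l\|Qa\|^2}{\mu_1}\|\bar{\epsilon}\|^2$
(d) $|2\gamma\,\bar{\epsilon}^TPL_1^{-1}\Phi| \le 2(1+|y_\varpi|^p)^2\|\bar{\epsilon}\|^2 + (1+|y_\varpi|^p)^2\|\varepsilon\|^2 + (2\vartheta^{*2}\gamma^2n^4\|P\|^2 + 1)\|\bar{\epsilon}\|^2 + \gamma^2n^2\|P\|^2\,\frac{\vartheta^{*2}Y_{r0}^2 + (Y_{r0}+\bar{\xi})^2}{l^{2\sigma}}$
(e) $-\gamma\frac{\dot{l}}{l}\,\bar{\epsilon}^T(DP + PD + 2\sigma P)\bar{\epsilon} \le \gamma\chi_1\mu_2 l\|P\|\|\bar{\epsilon}\|^2 - \gamma\chi_2\sigma(1+|y_\varpi|^p)^2\|P\|\|\bar{\epsilon}\|^2$
(f) $-\frac{\dot{l}}{l}\,\varepsilon^T(DQ + QD + 2\sigma Q)\varepsilon \le \chi_1\mu_2 l\|Q\|\|\varepsilon\|^2 - \chi_2\sigma(1+|y_\varpi|^p)^2\|Q\|\|\varepsilon\|^2$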
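Then, by using the decoupling inequalities in Table 1, $\dot{V}$ can be further computed as

$$\begin{aligned}
\dot{V} \le{}& -\gamma l\mu_1\|\bar{\epsilon}\|^2 - 4l\mu_1\|\varepsilon\|^2 + \gamma l\chi_1\mu_2\|P\|\|\bar{\epsilon}\|^2 - \gamma\chi_2\sigma(1+|y_\varpi|^p)^2\|P\|\|\bar{\epsilon}\|^2\\
&+ l\chi_1\mu_2\|Q\|\|\varepsilon\|^2 - \chi_2\sigma(1+|y_\varpi|^p)^2\|Q\|\|\varepsilon\|^2 + 2(1+|y_\varpi|^p)^2\|\bar{\epsilon}\|^2 + (1+|y_\varpi|^p)^2\|\varepsilon\|^2\\
&+ (2\vartheta^{*2}\gamma^2n^4\|P\|^2 + 1)\|\bar{\epsilon}\|^2 + \gamma^2n^2\|P\|^2\frac{\delta_\sigma}{l^{2\sigma}} + \frac{1}{2}\mu_1 l\|\bar{\epsilon}\|^2 + 2\mu_1 l\|\varepsilon\|^2 + \frac{\|Pa\|^2 + 3\|Qa\|^2}{\mu_1}\,l\|\bar{\epsilon}\|^2,
\end{aligned} \tag{36}$$

where $\delta_\sigma = \vartheta^{*2}Y_{r0}^2 + (Y_{r0} + \bar{\xi})^2$ is an unknown positive constant. Let the parameters $\gamma$, $\wp$, $\chi_1$ and $\chi_2$ satisfy

$$\gamma = \frac{\|Pa\|^2 + 3\|Qa\|^2}{\mu_1^2} + 3, \tag{37}$$

together with an upper bound $\wp \le \wp_m$ (38), an upper bound on $\chi_1$ (39) and a lower bound on $\chi_2$ (40), each expressed in terms of $\mu_1$, $\mu_2$, $\gamma$, $\sigma$, $\|P\|$, $\|Q\|$, $\|Pa\|$ and $\|Qa\|$. Then the derivative of $V$ reduces to

$$\dot{V} \le -\bigl(\mu_1 l - 2\vartheta^{*2}\gamma^2n^4\|P\|^2 - 1\bigr)\bigl(\|\bar{\epsilon}\|^2 + \|\varepsilon\|^2\bigr) + \frac{\delta_\sigma}{l^{2\sigma}}. \tag{41}$$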
Next, the boundedness of the state variables $\bar{\epsilon}$, $\varepsilon$ and $l$ on the maximal time interval $[0, T_f)$ is established; because the arguments are lengthy, they are given in the Appendix.

In addition, to prove that the measurable tracking error $e_\varpi$ converges to an arbitrarily small residual set of the origin, we define

$$h(t) = (e_\varpi - \hat{x}_1)^2 + \hat{x}_1^2 - \delta_\varpi, \tag{42}$$

and note that

$$|\dot{l}(t_a) - \dot{l}(t_b)| = |\max\{h(t_a), 0\} - \max\{h(t_b), 0\}| \le |h(t_a) - h(t_b)|. \tag{43}$$

From (32), the boundedness of $x$, $\hat{x}$, $l$ on $t \in [0, +\infty)$ can be obtained, which implies that $h(t)$ is differentiable and $\dot{h}(t)$ is bounded; thus, $h(t)$ is uniformly continuous. That is, for any positive $\iota$ there exists a $t_\iota$ such that $|h(t_a) - h(t_b)| < \iota$ for all $t_a$ and $t_b$ satisfying $|t_a - t_b| < t_\iota$; hence $\dot{l}(t)$ is uniformly continuous. According to the boundedness of $l(t)$ and Barbalat's lemma, $\lim_{t\to+\infty}\dot{l}(t) = 0$. Meanwhile, in view of (26) and the inequality $e_\varpi^2 \le 2(e_\varpi - \hat{x}_1)^2 + 2\hat{x}_1^2 \le 2(\dot{l} + \delta_\varpi)$, we obtain

$$\lim_{t\to+\infty}|e_\varpi| \le \sqrt{2\delta_\varpi}, \tag{44}$$

which implies that the measurable tracking error $|e_\varpi|$ can be made arbitrarily small by reducing the value of $\delta_\varpi$. Furthermore, considering the relationship between the real tracking error $e_r$ and the measurable tracking error $e_\varpi$, $e_r$ is rewritten as

$$e_r = y - y_\varpi + e_\varpi. \tag{45}$$

It then follows that $\lim_{t\to+\infty}|e_r| \le \lim_{t\to+\infty}|y - y_\varpi| + \lim_{t\to+\infty}|e_\varpi|$, which, with the help of (15), simplifies to

$$\lim_{t\to+\infty}|e_r| \le \frac{4m_1+1}{m_1+1}\,\nu\Delta_s + \sqrt{2\delta_\varpi}. \tag{46}$$

That is, besides the measurable tracking error, the real tracking error $e_r$ also depends on the performance of the weighted average voting algorithm.

Fig. 1 Switching signal of switched systems
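The ultimate bound on the real tracking error combines the voter term from (15) with the bound on the measurable error. Since the squared measurable error is bounded by twice the sum of the gain derivative and the dead-zone constant, and the gain derivative vanishes in the limit, the second term is taken below as the square root of twice the dead-zone constant. The following Python sketch evaluates this bound; the function name is hypothetical, and the sample values ($m_1 = 1$, $\nu = 1$, $\Delta_s = 0.02$, dead-zone constant 0.1) are those reported in the simulation section.

```python
import math

def tracking_error_bound(m1, nu, delta_s, delta):
    """Ultimate bound: voter term (4*m1+1)/(m1+1)*nu*delta_s plus sqrt(2*delta)."""
    voter_term = (4 * m1 + 1) / (m1 + 1) * nu * delta_s
    return voter_term + math.sqrt(2.0 * delta)

bound = tracking_error_bound(m1=1, nu=1.0, delta_s=0.02, delta=0.1)
print(round(bound, 4))
```

For these values the dead-zone term dominates, so reducing the dead-zone constant is the more effective way to tighten the guaranteed tracking accuracy, at the cost of a larger dynamic gain.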
5 Simulation Example

In this section, two third-order plants, which can be switched into each other at any time, are used to illustrate the validity of the proposed scheme:

$$\begin{aligned}
\text{Plant 1:}\quad &\dot{x}_1(t) = x_2 + x_1^2(t)\cos(x_3(t)) + \xi_1(t),\\
&\dot{x}_2(t) = x_3(t) + \frac{x_2^2(t)}{1 + \theta_1x_1^2(t)} + \xi_2(t),\\
&\dot{x}_3(t) = u(t) + \theta_2(x_1(t) + x_2(t)) + \xi_3(t),\\
\text{Plant 2:}\quad &\dot{x}_1(t) = x_2 + x_3(t)\bigl(1 + \theta_3x_1^2(t)\bigr) + \xi_1(t),\\
&\dot{x}_2(t) = x_3(t) + \theta_4x_1(t) + x_3^2(t) + \xi_2(t),\\
&\dot{x}_3(t) = u(t) + \theta_5\cos(x_1(t) + x_3(t)) + \xi_3(t),\\
\text{Sensor failure pattern:}\quad &y(t) = \rho(t)x_1(t) + \eta(t),
\end{aligned} \tag{47}$$

where the parameter vector $\theta = [\theta_1, \theta_2, \theta_3, \theta_4, \theta_5]^T$ is completely unknown, and the external disturbance vector $\xi(t) = [\xi_1(t), \xi_2(t), \xi_3(t)]^T$ and the sensor uncertainties $\rho(t)$ and $\eta(t)$ are also unknown but bounded. For the numerical simulation, we set the system parameter vector as $\theta = [1, 0.2, 1.5, 0.4, 0.4]^T$ and the disturbance vector as $\xi(t) = [0.3\sin(t), 0.8\cos(t), -0.1\sin(t)]^T$, and choose $\vartheta = 3$ to satisfy Assumption 1. For the sake of comparison, all simulations employ the same switching signal, which is shown in Fig. 1. Plant 1 is the controlled object when $\kappa = 0$; otherwise, the controlled object is switched to Plant 2. The control objective is to design an adaptive output feedback controller for the switched plant (47) that guarantees the stability of the switched system and makes the output $y(t)$ track the reference signal $y_r(t) = \sin(t)$. Following the dynamic high-gain observer design procedure, the observer and controller are constructed as

$$\begin{aligned}
\dot{\hat{x}}_1 &= \hat{x}_2 + 93\,l\,(e_\varpi - \hat{x}_1),\\
\dot{\hat{x}}_2 &= \hat{x}_3 + l^2(e_\varpi - \hat{x}_1),\\
\dot{\hat{x}}_3 &= u + 9.7\,l^3(e_\varpi - \hat{x}_1),\\
u &= -4.8\,l^3\hat{x}_1 - 0.05\,l^2k_2\hat{x}_2 - 1.1\,l\,\hat{x}_3,
\end{aligned} \tag{48}$$

in which the dynamic gain $l$ is updated by

$$l(0) = 1, \quad \dot{l} = \max\bigl\{-2.1\,l^2 + 2.9\,l(1+|y_\varpi|^2)^2,\ (e_\varpi - \hat{x}_1)^2 + \hat{x}_1^2 - 0.1,\ 0\bigr\}. \tag{49}$$
Fig. 2 Simulation results of case 1

Fig. 3 Simulation results of case 2

The design parameters for the voter mechanism are $\Delta_s = 0.02$, $\nu = 1$, $T = 4$, $\ell_1 = 40$, $\ell_2 = 40$ and $\ell_3 = 40$. The initial conditions of the plant states and estimated states are set as $x_1(0) = 0.4$, $x_2(0) = 1$, $x_3(0) = -1$, $\hat{x}_1(0) = -2$, $\hat{x}_2(0) = -2.5$ and $\hat{x}_3(0) = -3$. With the above conditions, we consider two cases of sensor operating status with distinct uncertainties.

Case 1: All sensors are failure free:

$$y_1 = y, \quad y_2 = y, \quad y_3 = y, \quad t \ge 0\ \text{s}. \tag{50}$$

Figure 2a–e show that in case 1, when no sensor fails, all plant and observer states, the real and measurable errors, the dynamic gain and the control signal remain bounded as the state equations of the plant switch. Figure 2f shows that the output proportions of all sensors are equal under the no-failure condition.

Case 2: Sensor 2 suffers from a calibration error, then loses its accuracy, and eventually recovers to the normal operating status:

$$y_1 = y, \quad y_3 = y, \quad t \ge 0\ \text{s}, \qquad
y_2 = \begin{cases}
y, & t \in [0, 2)\ \text{s},\\
(0.2|\sin(t)| + 0.5)\,y, & t \in [2, 8)\ \text{s},\\
0.2, & t \in [8, 15)\ \text{s},\\
y, & t \in [15, 20]\ \text{s}.
\end{cases} \tag{51}$$

Figure 3a–e show that in case 2, even with sensor uncertainties and system switching, our scheme still keeps the real and measurable errors within the permissible error range, and the other signals also remain bounded. Figure 3f shows that the voter mechanism successfully identifies the faulty sensor and reassigns its output proportion to the others. Notice that, after the failure of sensor 2 disappears, all weighting functions recover to their case 1 values.
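The fault profile of sensor 2 in case 2 is a piecewise function of time and can be written down directly. The sketch below is a minimal Python transcription of that profile; the function name is hypothetical.

```python
import math

def sensor2_output(t, y):
    """Case 2 profile for sensor 2: healthy, calibration error, stuck, recovered."""
    if t < 2.0:
        return y                                   # healthy on [0, 2) s
    if t < 8.0:
        return (0.2 * abs(math.sin(t)) + 0.5) * y  # calibration error on [2, 8) s
    if t < 15.0:
        return 0.2                                 # stuck at 0.2 on [8, 15) s
    return y                                       # recovered on [15, 20] s

for t in (1.0, 5.0, 10.0, 16.0):
    print(t, sensor2_output(t, 1.0))
```

Feeding such a profile, together with two healthy readings, into a voter makes it straightforward to reproduce the qualitative behavior of the weights described for Fig. 3f.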
6 Conclusion

In this paper, we proposed an adaptive output feedback tracking control scheme based on a weighted average voting algorithm for a class of switched systems. A state observer for the states that are unavailable due to sensor uncertainties is constructed using the same high-gain scaling technique as the controller design. Furthermore, the unknown nonlinearity of the subsystems is handled by a polynomial growth condition, so that the continuity of the Lyapunov function is guaranteed. It has been shown that the stability of the switched system can be established with adaptive dynamic gain control laws. A numerical simulation shows that the scheme effectively uses redundant sensors to reduce the proportion of information transmitted by faulty sensors. In addition, the stability and tracking performance of all closed-loop subsystems are secured in the presence of system uncertainties and sensor uncertainties. The use of redundant sensors to detect failure patterns deserves further research.
7 Appendix

Proof of Lemma 1: (1) When no sensor violates the prescribed performance $\Delta_s$ for $t > t_f$, in view of (8)–(9), one has $\lim_{t\to+\infty}w_j(t) = 1$, $j = 1,\dots,m_0$, which proves (12). (2) When the $j$th sensor violates the prescribed performance $\Delta_s$ for $t > t_f$, in view of (8)–(9), one has $\lim_{t\to+\infty}w_j(t) = 0$, which proves (13). (3) When no more than $m_1 + 1$ sensors, excluding the $j$th sensor, violate the prescribed performance $\Delta_s$, combining the above cases for the functions $w(t)$ proves (14).

Furthermore, $m_1 + 1$ sensors lie in Case 1, and we divide the remaining $m_1$ sensors into $m_2$ sensors in Case 1 and $m_1 - m_2$ sensors in Case 2 or Case 3. For the $m_1 - m_2$ failed sensors, it follows from (6) and (7) that

$$|y_j(t) - y(t)| \le |y_j(t) - y_i(t)| + |y_i(t) - y(t)| \le 3\nu\Delta_s \tag{52}$$

in Case 2 or Case 3. Hence,

$$\begin{aligned}
\lim_{t\to+\infty}|y_\varpi(t) - y(t)|
&= \lim_{t\to+\infty}\left|\sum_{i=1}^{m_1+1}\varpi_i(t)y_i(t) + \sum_{j=1}^{m_2}\varpi_j(t)y_j(t) + \sum_{s=1}^{m_1-m_2}\varpi_s(t)y_s(t) - y(t)\right|\\
&= \lim_{t\to+\infty}\left|\sum_{i=1}^{m_1+1}\frac{y_i(t) - y(t)}{m_1 + m_2 + 1} + \sum_{j=1}^{m_2}\frac{y_j(t) - y(t)}{m_1 + m_2 + 1}\right|.
\end{aligned} \tag{53}$$

By using (7), (52) and (53), we obtain

$$\lim_{t\to+\infty}|y_\varpi(t) - y(t)| \le \frac{m_1 + 1}{m_1 + m_2 + 1}\,\nu\Delta_s + \frac{m_1}{m_1 + m_2 + 1}\,3\nu\Delta_s \le \frac{4m_1 + 1}{1 + m_1}\,\nu\Delta_s. \tag{54}$$
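Proof of the boundedness of $l$: In view of (25), $l$ is monotonically nondecreasing. Suppose, for contradiction, that $l$ is unbounded on $[0, t_f)$, that is,

$$\lim_{t\to t_f}l(t) = +\infty. \tag{55}$$

Hence, there exists a $t_1 \in (0, t_f)$ such that $\frac{1}{2}\mu_1 l \ge 2\vartheta^{*2}\gamma^2n^4\|P\|^2 + 1$, which yields

$$\dot{V} \le -\frac{1}{2}\mu_1 l\bigl(\|\bar{\epsilon}\|^2 + \|\varepsilon\|^2\bigr) + \frac{\delta_\sigma}{l^{2\sigma}}, \quad \forall t \in [t_1, t_f). \tag{56}$$

That is, $\bar{\epsilon}$ and $\varepsilon$ are bounded on $[t_1, t_f)$. From (4), (24) and (34), there exists $t_2 \in [t_1, t_f)$ such that

$$\begin{aligned}
-\chi_1 l^2 + \chi_2 l(1+|y_\varpi|^p)^2
&\le -\chi_1 l^2 + \chi_2 l\left(1 + \Bigl|\sum_{i=1}^{m_0}\varpi_i\rho_i(\epsilon_1 + \hat{x}_1 + y_{r0}) + \sum_{i=1}^{m_0}\varpi_i\eta_i\Bigr|^p\right)^2\\
&\le -\chi_1 l^2 + \chi_2 l^{1+2\sigma p}\left(1 + \Bigl|\sum_{i=1}^{m_0}\varpi_i\rho_i(\bar{\epsilon}_1 + \varepsilon_1) + l^{-\sigma}y_r\Bigr|^p\right)^2\\
&\le 0, \quad \forall t \in [t_2, t_f),
\end{aligned} \tag{57}$$

which leads to

$$\dot{l} = \max\bigl\{(e_\varpi - \hat{x}_1)^2 + \hat{x}_1^2 - \delta_\varpi,\ 0\bigr\}. \tag{58}$$

Then, substituting (3), (4) and (26) into (58), we have

$$(e_\varpi - \hat{x}_1)^2 + \hat{x}_1^2 - \delta_\varpi \le c_\sigma l^{2\sigma}(\bar{\epsilon}_1^2 + \varepsilon_1^2), \quad \forall t \in [t_2, t_f), \tag{59}$$

where $c_\sigma$ is a positive constant. Define $V_1 = l^{2\sigma}V$. By using (36), (56) and (59), for all $t \in [t_2, t_f)$ the derivative of $V_1$ satisfies

$$\begin{aligned}
\dot{V}_1 &\le -\frac{1}{2}\mu_1 l^{2\sigma+1}\bigl(\|\bar{\epsilon}\|^2 + \|\varepsilon\|^2\bigr) + 2\sigma c_\sigma(\bar{\epsilon}_1^2 + \varepsilon_1^2)V_1 + \delta_\sigma\\
&\le -2C_1 lV_1 + 2\sigma c_\sigma(\bar{\epsilon}_1^2 + \varepsilon_1^2)V_1 + \delta_\sigma,
\end{aligned} \tag{60}$$

where $C_1$ is a positive constant. Similar to the analysis in (56), $\bar{\epsilon}$ and $\varepsilon$ are bounded on $[t_2, t_f)$, and there must exist $t_3 \in [t_2, t_f)$ such that

$$\dot{V}_1 \le -C_1 lV_1 + \delta_\sigma \le -C_1 l(t_3)V_1 + \delta_\sigma, \quad \forall t \in [t_3, t_f). \tag{61}$$

Therefore, for all $t \in [t_3, t_f)$, we obtain

$$V_1(t) \le e^{-C_1 l(t_3)(t - t_3)}V_1(t_3) + \frac{\delta_\sigma}{C_1 l(t_3)}. \tag{62}$$

Further, considering the definitions of $V$, $V_1$, and

$$c_\sigma l^{2\sigma}(\bar{\epsilon}_1^2 + \varepsilon_1^2) \le \left(\frac{1}{\lambda_{\min}(Q)} + \frac{1}{\gamma\lambda_{\min}(P)}\right)c_\sigma V_1(t), \tag{63}$$

together with the continuity and monotonically nondecreasing property of $l$, we obtain

$$c_\sigma l^{2\sigma}(\bar{\epsilon}_1^2 + \varepsilon_1^2) \le \delta_\varpi, \tag{64}$$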
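which implies $\dot{l} = 0$ for all $t \in [t_3, t_f)$. This contradicts the assumption (55); that is, $l$ is bounded on $[0, t_f)$.

Proof of the boundedness of $\varepsilon$: Note that the derivative of $\varepsilon$ can be rewritten as

$$\dot{\varepsilon} = l(A - bk^T)\varepsilon + l^{1-\sigma}a(y_\varpi - y_r - \hat{x}_1) - \frac{\dot{l}}{l}(\sigma I + D)\varepsilon. \tag{65}$$

Define

$$V_2 = \varepsilon^TQ\varepsilon, \tag{66}$$

whose derivative satisfies

$$\dot{V}_2 \le -C_2V_2 + C_3(\dot{l} + \delta_\varpi), \tag{67}$$

where $C_2 = \frac{\mu_1}{2\|Q\|}$ and $C_3 = \frac{l^{1-2\sigma}(t_f)\|Qa\|^2}{\mu_1}$ are positive constants. Solving (67) yields

$$\begin{aligned}
V_2 &\le e^{-C_2t}V_2(0) + C_3e^{-C_2t}\int_0^t e^{C_2\tau}\bigl(\dot{l} + \delta_\varpi\bigr)d\tau\\
&\le V_2(0) + C_3l(t) + \frac{C_3}{C_2}\delta_\varpi,
\end{aligned} \tag{68}$$

which shows the boundedness of $\varepsilon$ on $[0, T_f)$ given the boundedness of $l(t)$.

Proof of the boundedness of $\bar{\epsilon}$: With the definition

$$L_2 = \mathrm{diag}\bigl\{1,\ l^{*\sigma},\ \dots,\ l^{*\,n-1+\sigma}\bigr\}, \tag{69}$$

where $l^*$ satisfies $l^* \ge \frac{8}{\gamma} + \frac{12 + 4n^4\vartheta^{*2}(1+|y_\varpi|^p)^2\|P\|^2}{\mu_1} + 1$, we define

$$\bar{\varepsilon}_i = \frac{\bar{\epsilon}_i}{l^{*\,n-1+\sigma}}, \quad i = 1,\dots,n. \tag{70}$$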
Then, from (32), (69), (70), the derivative of ε¯ can be derived as 1−σ ε˙¯ =l ∗l(A − ac T )¯ε + L 1 −1 L 2 −1 + ll ∗ a ε¯ 1 l˙ − l 1−σ a(y − xˆ1 − yr ) + (σ I + D)¯ε, l
(71)
V3 = ε¯ T P ε¯ ,
(72)
Let
whose time derivative can be computed as V˙3 ≤ −μ1ll ∗ ¯ε 2 + 2¯ε T P L 1 −1 L 2 −1 − 2l 1−σ l ∗ ε¯ T Pa(y − xˆ1 − yr ) l˙ 1−σ −2ll ∗ ε¯ T Pa ε¯ 1 + ε¯ T (P D + D P + 2σ P)¯ε . l
(73)
With the following inequalities, μ1 ll ∗ 4ll ∗
¯ε 2 + 4 μ1
1−2σ
2ll ∗
1−σ
ε¯ T Pa ε¯ 1 ≤
Pa 2 ¯12 ,
(74)
μ1ll ∗ 4l 1−2σ
¯ε 2 +
Pa 2 (y − xˆ1 − yr )2 , 4 μ1l ∗ (75) 2¯ε T P L 1 −1 L 2 −1 ≤3 ¯ε 2 + n 4 ϑ ∗ 2 (1 + |y | p )2 P 2 ( ¯ε 2 + ¯ 2 )
−2l 1−σ l ∗ ε¯ T Pa(y − xˆ1 − yr ) ≤
+ and note that
(1 + |y | p )2 2
¯ε 2 + n 2 (ξ¯ + Yr 0 ) P 2 + γ n 2 ϑ ∗ 2 P 2 Yr20 , γ (76)
2 ¯12 ≤ 2 (¯1 − ε1 )2 + ε12 ≤ 2σ (l˙ + δ ), l
(77)
it can be checked that ∗1−2σ
1 4ll V˙3 ≤ − μ1ll ∗ +
Pa 2 ¯12 + n 4 ϑ ∗ 2 (1 + |y | p )2 P 2 ε 2 4 μ1 4l 1−2σ 2 +
Pa 2 ε12 + n 2 P 2 (ϑ ∗ 2 Yr20 + (ξ¯ + Yr 0 ) ). μ1l ∗
(78)
Similar to the analysis in (67)–(68), the boundedness of $\bar{\epsilon}$ on $[0, T_f)$ can be established.
References 1. Wu, D., Sun, Y., Shao, S.: Adaptive neural control for nonlinear switched systems with improved MDADT and its applications. J. Franklin Inst. 359(17), 9544–9568 (2022) 2. Ndoye, A., Delpoux, R., Trégouët, J.F., Lin-Shi, X.: Switching control design for LTI system with uncertain equilibrium: application to parallel interconnection of dc/dc converters. Automatica 145, 110522 (2022) 3. Sun, Z., Ge, S.S.: Analysis and synthesis of switched linear control systems. Automatica 41(2), 181–195 (2005) 4. Xu, Z., Li, X., Stojanovic, V.: Exponential stability of nonlinear state-dependent delayed impulsive systems with applications. Nonlinear Anal. Hybrid Syst. 42, 101088 (2021) 5. Kosov, A.A., Kozlov, M.V.: On the existence and construction of common Lyapunov functions for switched discrete systems. J. Appl. Ind. Math. 12, 668–677 (2018) 6. Lin, X., Chen, C.C., Qian, C.: Smooth output feedback stabilization of a class of planar switched nonlinear systems under arbitrary switchings. Automatica 82, 314–318 (2017) 7. Fan, L., Zhu, Q.: pth moment exponential stability of switched discrete-time stochastic systems: a multiple Lyapunov functions method. J. Franklin Inst. 358(13), 6835–6853 (2021) 8. Zhai, G., Hu, B., Yasuda, K., Michel, A.N.: Stability analysis of switched systems with stable and unstable subsystems: an average dwell time approach. Int. J. Syst. Sci. 32(8), 1055–1061 (2001) 9. Zhai, D., Lu, A.Y., Dong, J., Zhang, Q.: Adaptive fuzzy tracking control for a class of switched uncertain nonlinear systems: an adaptive state-dependent switching law method. IEEE Trans. Syst. Man Cybern. Syst. 48(12), 2282–2291 (2017) 10. Zhao, Q., Jiang, J.: Reliable state feedback control system design against actuator failures. Automatica 34(10), 1267–1272 (1998) 11. Mao, Z., Yan, X.G., Jiang, B., Chen, M.: Adaptive fault-tolerant sliding-mode control for highspeed trains with actuator faults and uncertainties. IEEE Trans. Intell. Transp. Syst. 21(6), 2449–2460 (2019) 12. 
Yang, H., Jiang, B., Staroswiecki, M.: Supervisory fault tolerant control for a class of uncertain nonlinear systems. Automatica 45(10), 2319–2324 (2009) 13. Tang, X., Tao, G., Joshi, S.M.: Adaptive actuator failure compensation for parametric strict feedback systems and an aircraft application. Automatica 39(11), 1975–1982 (2003) 14. Tao, G., Chen, S., Joshi, S.M.: An adaptive actuator failure compensation controller using output feedback. IEEE Trans. Autom. Control 47(3), 506–511 (2002) 15. Guo, T.H., Nurre, J.: Sensor failure detection and recovery by neural networks. In: International Joint Conference on Neural Networks, No. NASA-TM-104484 (1991) 16. Merrill, W.C.: Sensor failure detection for jet engines using analytical redundancy. J. Guidance Control Dyn. 8(6), 673–682 (1985) 17. Staroswiecki, M., Comtet-Varga, G.: Analytical redundancy relations for fault detection and isolation in algebraic dynamic systems. Automatica 37(5), 687–699 (2001) 18. Blough, D.M., Sullivan, G.F.: A comparison of voting strategies for fault-tolerant distributed systems. In: Proceedings Ninth Symposium on Reliable Distributed Systems, pp. 136–145. IEEE (1990) 19. Knight, J., Leveson, N.: A large scale experiment in n-version programming. In: Proceedings of Ninth Annual Software Engineering Workshop (1984) 20. Kopetz, H., Ochsenreiter, W.: Clock synchronization in distributed real-time systems. IEEE Trans. Comput. 100(8), 933–940 (1987) 21. Latif-Shabgahi, G.R.: A novel algorithm for weighted average voting used in fault tolerant computing systems. Microprocess. Microsyst. 28(7), 357–361 (2004) 22. Lorczak, P.R., Caglayan, A.K., Eckhardt, D.E.: A theoretical investigation of generalized voters for redundant systems. In: The Nineteenth International Symposium on Fault-Tolerant Computing. Digest of Papers, pp. 444–451. IEEE (1989)
23. Latif-Shabgahi, G., Bass, J.M., Bennett, S.: History-based weighted average voter: a novel software voting algorithm for fault-tolerant computer systems. In: Proceedings Ninth Euromicro Workshop on Parallel and Distributed Processing, pp. 402–409. IEEE (2001) 24. Qian, C., Lin, W.: Output feedback control of a class of nonlinear systems: a nonseparation principle paradigm. IEEE Trans. Autom. Control 47(10), 1710–1715 (2002) 25. Li, S., Tao, G.: Feedback based adaptive compensation of control system sensor uncertainties. Automatica 45(2), 393–404 (2009) 26. Zhai, D., An, L., Li, X., Zhang, Q.: Adaptive fault-tolerant control for nonlinear systems with multiple sensor faults and unknown control directions. IEEE Trans. Neural Networks Learn. Syst. 29(9), 4436–4446 (2017) 27. Zhang, X., Lin, Y.: Robust adaptive tracking of uncertain nonlinear systems by output feedback. Int. J. Robust Nonlinear Control 26(10), 2187–2200 (2016) 28. Sun, C., Lin, Y.: Adaptive output feedback compensation for a class of nonlinear systems with actuator and sensor failures. IEEE Trans. Syst. Man Cybern. Syst. 52(8), 4762–4771 (2021) 29. Zhai, D., An, L., Dong, J., Zhang, Q.: Output feedback adaptive sensor failure compensation for a class of parametric strict feedback systems. Automatica 97, 48–57 (2018)
Load Distribution of Planetary Roller Screw Mechanism with Roller Threads Modification Wei Liu, Zhong Chen, Sheng Xie, Yongqiang Dou, Jigui Zheng, and Chao Geng
Abstract The planetary roller screw mechanism (PRSM) is a novel screw mechanism with high transmission performance. The uneven load distribution over the PRSM roller threads seriously weakens its bearing capacity and affects its transmission performance. This work proposes a roller modification method based on the load distribution and the deformation compatibility relationship of the PRSM threads to improve the bearing capacity of the PRSM. Subsequently, the influence of various parameters on the load distribution is analyzed. The results show that, with the proposed roller thread modification, the unevenness coefficient is reduced by 8.71% on the nut-roller side and by 11.95% on the roller-screw side. This work provides a theoretical basis for the subsequent uniform-load design and manufacturing of PRSMs.

Keywords Planetary roller screw mechanism · Load distribution · Deformation compatibility relationship · Roller threads modification
1 Introduction Planetary roller screw mechanism (PRSM) is a linear transmission device that can convert linear and rotational motion into each other. Due to its advantages such as high load-bearing capacity, high stiffness, high transmission accuracy, and long service life, PRSM is widely used in the fields of aviation, aerospace, navigation, as well as petroleum, medical equipment, large precision machine tools, and automated production lines. With the development of mechatronics integration, PRSM, as the actuator of electro-mechanical actuators (EMA), will further expand its application fields.
W. Liu · Z. Chen · S. Xie · Y. Dou · J. Zheng (B) · C. Geng Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_23
Fig. 1 Schematic diagram of standard PRSM structure
The load-bearing characteristics of PRSM are the foundation of the advantages of its transmission performance. Research results showed that, similar to bolt-nut transmission, the load distribution of PRSM roller threads is not uniform in the actual transmission process. Regardless of the loading method, the first few contact threads bear the most of the load. The inherent unevenness of load distribution causes wear, also reduces the load-bearing capacity and service life of PRSM [1]. At present, some research on the load distribution of PRSM threads has been conducted, but there is little research on improving this phenomenon. In 2016, Zhang [2] from Northwestern Polytechnical University studied the load distribution of PRSM by the direct stiffness method [3], and proposed a new design method for PRSM threads. Hui et al. [4] from Northwestern Polytechnical University proposed a roller thread modification method that changes the half thread thickness of the screw and nut threads. Zhang [5] and Zhang [6] from the Beijing Institute of Automation Control Equipment used numerical approximation to solve the PRSM constraint equation system and proposed a segmented modification method for the pitch diameter. In 2022, Hu [7] from Chongqing University proposed a roller taper correction method based on deformation coordination and force balance to optimize load distribution, and conducted roller grinding experiments. The above methods do not consider the changes in meshing state caused by structural changes after roller modification, as well as the calculation model of thread load distribution based on this. Therefore, this work aims to analyze the meshing characteristics and derive the calculation model of thread load distribution of PRSM with modified roller, and solve for the optimal modification amount. The standard PRSM are used as the research object, as shown in Fig. 1. 
The meshing characteristics of the modified PRSM are analyzed from the cross-sectional profiles of the screw, roller and nut thread surfaces. A method based on the deformation coordination relationship is proposed to solve the load distribution of PRSM roller threads, together with a method for finding the optimal modification amount. The factors that influence the PRSM load distribution after modification are then examined in detail.
Load Distribution of Planetary Roller Screw Mechanism …
2 Modification Design and Meshing Characteristics

2.1 Principle of Roller Modification

Existing research shows that the thread load distribution is severely uneven during PRSM transmission. The PRSM has a compact structure in which the rollers transmit load to the nut and the screw simultaneously through their threads, and it usually works under high-speed, heavy-load conditions. Traditional methods of improving the load distribution of threaded fasteners are therefore not suitable for PRSM; methods for evening out the load must be designed around its structure, transmission mode and working conditions. The shaft deformation that accumulates as the threads deform is one of the main factors causing uneven load distribution on the PRSM roller threads. To compensate for this accumulated shaft deformation, the threads on the nut-roller side and the roller-screw side are modified so as to adjust the initial axial clearance between the threads. A thread that deformed less axially before modification is given a smaller axial clearance, so it engages and bears load first, which reduces load concentration and makes the load distribution more uniform. The change in the initial clearance of the threads is shown in Fig. 2. The roller threads are numbered, and the corresponding gaps on the roller-screw side and nut-roller side are adjusted to $\varepsilon_{Si}$ and $\varepsilon_{Ni}$ ($i = 1, 2, 3, \ldots, \tau$), where $\tau$ is the number of roller threads. On a roller whose threads have already been machined, the grinding wheel feed rate is adjusted to finish-machine the roller thread profile. The feed rate of the grinding wheel is:

$$v = \frac{(P_R + \varepsilon_{Xi})\,\omega}{2\pi} = \frac{P_R + \varepsilon_{Xi}}{T} \tag{1}$$

where $X = S$ or $N$ stands for the roller-screw side or the nut-roller side, $P_R$ is the pitch of the roller thread, and $T = 2\pi/\omega$ is the rotation period of the roller.
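The feed-rate relation in Eq. (1) can be illustrated with a few lines of Python. This is only a sketch: the function name `feed_rate`, the rotation speed and the 7 µm gap adjustment used below are illustrative assumptions, not values prescribed by the machining process described here.

```python
import math

def feed_rate(pitch_mm, eps_mm, omega_rad_s):
    """Axial feed rate v = (P_R + eps) * omega / (2*pi), in mm/s."""
    return (pitch_mm + eps_mm) * omega_rad_s / (2.0 * math.pi)

P_R = 1.0                # roller pitch in mm (Table 1)
omega = 2.0 * math.pi    # hypothetical: one roller revolution per second
T = 2.0 * math.pi / omega

v_nominal = feed_rate(P_R, 0.0, omega)      # unmodified thread
v_modified = feed_rate(P_R, 0.007, omega)   # hypothetical gap adjustment of 7 um
```

With one revolution per second the feed rate equals the effective pitch per second, so the modification simply lengthens the axial advance per revolution by the gap adjustment.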
Fig. 2 Principle of roller thread profile modification
2.2 Meshing Characteristics

Based on the structural characteristics of the PRSM, an overall coordinate system $O\text{-}XYZ$ and part coordinate systems $o_X\text{-}x_X y_X z_X$ ($X = S$, $R$ or $N$, representing the screw, roller and nut respectively) are established. The screw coordinate system coincides with the overall coordinate system. In each part coordinate system, the $x_X$-axis passes through the start point of the thread helix of the corresponding part, and the $z_X$-axis coincides with the part axis. In the corresponding part coordinate system, the thread surface can be expressed in cylindrical coordinates as:

$$f_X = \left(r_{pX}\cos\theta_{pX},\; r_{pX}\sin\theta_{pX},\; z_X(r_{pX},\theta_{pX})\right) \tag{2}$$

The helical surface equations of the screw, roller and nut can be expressed as [8]:

$$z_X(r_{pX},\theta_{pX}) = \frac{\theta_{pX}\, l_X}{2\pi} + \xi_X \frac{h(r_{pX})}{\cos\lambda_X} \tag{3}$$

where $l_X$ is the lead of the corresponding part, $\xi_X = -1$ stands for the upper surface of the thread, $\xi_X = 1$ stands for the lower surface of the thread, $\lambda_X$ is the helix angle of the corresponding part, and $h(r_{pX})$ is the contour line of the lower surface of the corresponding part in the normal section. Although the screw and nut threads are external and internal threads respectively, both have a triangular profile with the same parameters, so the normal cross-sectional profiles of their helical surfaces are similar. The roller thread profile is circular. In the corresponding normal section coordinate system, the profile of the lower surface of each part can be expressed as:

$$\begin{cases}
h(r_{pS}) = \dfrac{P}{4}\cos\lambda_S + (r_{pS} - r_S)\tan\beta_S \\[2mm]
h(r_{pN}) = \dfrac{P}{4}\cos\lambda_N + (r_N - r_{pN})\tan\beta_N \\[2mm]
h(r_{pR}) = \dfrac{P}{4}\cos\lambda_R + r_T\cos\beta_R - \sqrt{r_T^2 - (r_{pRi} - r_{Ri} + r_T\sin\beta_R)^2}
\end{cases} \tag{4}$$

where $P$ is the thread pitch, $r_S$ is the theoretical radius of the screw, $r_N$ is the nominal radius of the nut, $r_{Ri}$ is the nominal radius of the roller at the modified thread, and $r_T$ is the radius of the roller arc contour in the normal section. $\lambda_X$ and $\beta_X$ are the lead angle and the half angle of the normal-section thread form of the corresponding part.
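As a quick numerical illustration of Eqs. (3) and (4), the sketch below evaluates the screw and roller profiles at their pitch radii using the Table 1 values. The function names are hypothetical, and the formulas follow the reading of Eqs. (3) and (4) adopted here; at the pitch radius both profiles should reduce to $\frac{P}{4}\cos\lambda_X$.

```python
import math

def h_screw(r, r_s, pitch, lam, beta):
    """Lower-flank profile of the triangular screw thread (first row of Eq. 4)."""
    return pitch / 4.0 * math.cos(lam) + (r - r_s) * math.tan(beta)

def h_roller(r, r_r, r_t, pitch, lam, beta):
    """Lower-flank profile of the circular-arc roller thread (third row of Eq. 4)."""
    return (pitch / 4.0 * math.cos(lam) + r_t * math.cos(beta)
            - math.sqrt(r_t ** 2 - (r - r_r + r_t * math.sin(beta)) ** 2))

def z_surface(theta, lead, xi, h_val, lam):
    """Axial coordinate of the helical surface (Eq. 3)."""
    return theta * lead / (2.0 * math.pi) + xi * h_val / math.cos(lam)

beta = math.radians(45.0)
lam_s, lam_r = math.radians(3.7991), math.radians(2.2785)
r_s, r_r, r_t, pitch = 23.9679 / 2.0, 4.0, 5.6569, 1.0

h_s = h_screw(r_s, r_s, pitch, lam_s, beta)          # profile height at the screw pitch radius
h_r = h_roller(r_r, r_r, r_t, pitch, lam_r, beta)    # profile height at the roller pitch radius
z_one_turn = z_surface(2.0 * math.pi, 5.0, 1, h_s, lam_s)  # lower flank after one screw turn (lead 5 mm)
```

After one full turn the lower flank of the screw has advanced by one lead plus the quarter-pitch offset, which is a convenient sanity check on the sign convention of $\xi_X$.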
Table 1 Structural parameters of screw thread

| Parameter | Screw | Roller | Nut |
|---|---|---|---|
| Nominal diameter $d_{R0}$, $d_R$, $d_N$ /mm | 24 | 8 | 40 |
| Starts $N_X$ | 5 | 1 | 5 |
| Pitch $P$ /mm | 1 | 1 | 1 |
| Flank angle $\beta_X$ /° | 45 | 45 | 45 |
| Lead angle $\lambda_X$ /° | 3.7991 | 2.2785 | 2.2785 |
| Theoretical diameter $d_R$ /mm | 23.9679 | / | / |
| Roller arc radius $r_T$ /mm | / | 5.6569 | / |
| Nut external diameter $D_N$ /mm | / | / | 53 |
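The lead angles in Table 1 can be cross-checked from the starts, pitch and diameter using the usual helix relation $\tan\lambda = N P / (\pi d)$. The sketch below assumes that relation and that the screw lead angle is computed from the theoretical diameter rather than the nominal one; the function name is illustrative.

```python
import math

def lead_angle_deg(starts, pitch_mm, diameter_mm):
    """Lead angle from tan(lambda) = lead / (pi * d), with lead = starts * pitch."""
    return math.degrees(math.atan(starts * pitch_mm / (math.pi * diameter_mm)))

screw = lead_angle_deg(5, 1.0, 23.9679)   # theoretical screw diameter from Table 1
roller = lead_angle_deg(1, 1.0, 8.0)
nut = lead_angle_deg(5, 1.0, 40.0)
```

Under these assumptions the three values agree with the tabulated 3.7991°, 2.2785° and 2.2785° to the printed precision, which is consistent with the roller and nut sharing a lead angle while the screw does not.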
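Given $h(r_{pX})$, the unit normal vector on a thread surface can be expressed as:

$$\vec{n}_X(r_{pX},\theta_{pX}) = -\xi_X\frac{\operatorname{grad} f_X}{\lvert\operatorname{grad} f_X\rvert} = \frac{1}{\sqrt{\left(\dfrac{h'(r_{pX})}{\cos\lambda_X}\right)^2 + \left(\dfrac{l_X}{2\pi r_{pX}}\right)^2 + 1}} \cdot \begin{bmatrix} h'(r_{pX})\cos\theta_{pX} - \dfrac{\xi_X l_X \sin\theta_{pX}}{2\pi r_{pX}} \\[2mm] h'(r_{pX})\sin\theta_{pX} + \dfrac{\xi_X l_X \cos\theta_{pX}}{2\pi r_{pX}} \\[2mm] -\xi_X \end{bmatrix} \tag{5}$$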
2.3 Contact Parameter Solution

Owing to the structure and motion of the PRSM, transmission is achieved through continuous meshing of internal and external threads. The screw and nut have the same number of starts, so the lead angle of the screw differs from that of the roller. If both were machined to their nominal pitch diameters, the screw and roller threads would interfere. This interference is generally eliminated by holding the center distance between the screw and the roller constant during machining and adjusting the pitch diameter of the screw. Using the nominal pitch diameter of the screw directly in the analysis would therefore introduce error, so the theoretical pitch diameter of the screw is needed for the subsequent analysis. The parameters in Table 1 are used, with the calculation method of Ref. [8], to solve the contact parameters and the subsequent load distribution.
3 Solution Model of the Roller Threads Load Distribution

3.1 Force Analysis

Let the number of rollers in a PRSM be $Z$ and the number of threads on a roller be $\tau$. Assume that the axial external load $F$ on the PRSM is shared equally among the rollers. For a single roller and the corresponding screw and nut parts, the force is transmitted from the nut to the roller and then to the screw through these $\tau$ threads, as shown in Fig. 3. The force balance relationship can be expressed as:

$$\sum_{i=1}^{\tau} F_{NRi} = \sum_{i=1}^{\tau} F_{SRi} = \frac{F}{Z} \tag{6}$$

where $F_{NRi}$ and $F_{SRi}$ are the axial loads on the $i$-th thread on the nut-roller side and the roller-screw side. For a single thread, the normal contact load at the contact point can be decomposed into axial, tangential and radial components. As shown in Fig. 4:

$$\begin{cases}
F_{Xa} = F_{Xn}\cos\theta\cos\lambda \\
F_{Xt} = F_{Xn}\cos\theta\sin\lambda \\
F_{Xr} = F_{Xn}\sin\theta
\end{cases} \tag{7}$$

Fig. 3 Force transfer process

Fig. 4 Thread force decomposition
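The decomposition in Eq. (7) and the per-roller load in Eq. (6) can be sketched as follows. The roller count, the thread count and the choice of 45° for the angle $\theta$ are illustrative assumptions; only the 11,000 N axial load and the roller lead angle are taken from values quoted elsewhere in this paper.

```python
import math

def decompose(f_n, theta, lam):
    """Split a normal contact load into axial, tangential and radial parts (Eq. 7)."""
    f_a = f_n * math.cos(theta) * math.cos(lam)
    f_t = f_n * math.cos(theta) * math.sin(lam)
    f_r = f_n * math.sin(theta)
    return f_a, f_t, f_r

F_total, Z, tau = 11000.0, 10, 20        # Z and tau are hypothetical
load_per_roller = F_total / Z            # right-hand side of Eq. 6
mean_axial_per_thread = load_per_roller / tau

theta, lam = math.radians(45.0), math.radians(2.2785)
# Normal load that produces the mean axial thread load
f_n = mean_axial_per_thread / (math.cos(theta) * math.cos(lam))
f_a, f_t, f_r = decompose(f_n, theta, lam)
```

The three components are mutually orthogonal, so their squares sum to the square of the normal load, which gives a simple consistency check.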
Fig. 5 Deformation of PRSM thread
3.2 Deformation Analysis

The deformation of a PRSM thread segment can be divided into three parts: shaft deformation, thread deformation and thread contact deformation, as shown in Fig. 5. Shaft deformation refers to the axial tension and compression of the screw, roller and nut. According to the tension-compression stiffness formula of mechanics of materials, the axial deformation of a continuous, uniform body in the elastic stage can be expressed as:

$$\delta_{BXi} = \frac{F_{XRi}\,P}{E_X A_X} \tag{8}$$

where $X = S$, $R$ or $N$ stands for the screw, roller and nut respectively, $E_X$ is the elastic modulus of the corresponding part, and $A_X$ is its equivalent cross-sectional area. When the PRSM bears load, the roller threads engage with the screw and nut threads and deform along the screw axis. Besides thread bending, several other elastic deformations contribute. As shown in Fig. 5, the axial deformation of a thread includes the deformation caused by bending, shear force, inclination, shear force at the thread root and the radial component of the force. The thread deformation can be expressed as:
⎧ a δ1 = (1 − μ2 ) 3F {[1 − (2 − ab )2 + 2 ln( ab )] cot 3 β X − 4( ac )2 tan β X } ⎪ ⎪ 4E ⎪ ⎪ ⎪ ⎪ ⎪ a ⎪ δ2 = (1 + μ) 6F cot 3 β X ln( ab ) ⎪ 5E ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ a ⎪ δ3 = (1 − μ2 ) 12F · ac2 (c − b2 tan β X ) ⎪ πE ⎪ ⎨ 2
a P δ4 = (1 − μ2 ) 2F [ ln( P+a/2 ) + 21 ln( 4P − 1)] ⎪ πE a P−a/2 a2 ⎪ ⎪ ⎪ ⎪ ⎪ 3 d ⎪ ⎪ δ5e = (1 − μ) FEa · Ppe · tan2 β X ⎪ ⎪ ⎪ ⎪ ⎪ ⎪ 3 D 2 +d 2 ⎪ d ⎪ ⎪ δ5i = ( D02 −d 2pi + μ) FEa · Ppi · tan2 β X ⎪ ⎩ 0 pi
(9)
where $a$, $b$ and $c$ are thread form parameters. The total deformation generated by the axial load on the $i$-th thread of the corresponding part is:

$$\delta_{TXi} = \delta_{1i} + \delta_{2i} + \delta_{3i} + \delta_{4i} + \delta_{5Xi} \tag{10}$$

Based on Hertz contact theory, the elastic deformation of point contact between two curved surfaces is:

$$\delta_{CXi} = \frac{K_X(e)}{\pi m_{Xa}} \sqrt[3]{\rho_{XR}\cos\beta_X\cos\lambda_X\left(\frac{3E_{XR}F_{XRi}}{2}\right)^2} \tag{11}$$

where $X = S$ or $N$ denotes the screw or nut, i.e. the corresponding contact side with the roller; $K_X(e)$ is the complete elliptic integral of the first kind for the Hertz contact ellipse on that side; $m_{Xa}$ is the major axis coefficient of the contact ellipse; $\rho_{XR}$ is the sum of the principal curvatures of the two thread surfaces on the contact side; and $E_{XR}$ is the equivalent elastic modulus of the contact surface.
3.3 Deformation Coordination Equation

Assuming that the position of the contact point does not change during deformation, the deformation coordination equation can be expressed as:

$$\Delta l_{BXi} + \Delta l_{TXi} + \Delta l_{CXi} = \Delta l_{BRi} + \Delta l_{TRi} + \Delta l_{CRi} + \varepsilon_{Xi} \tag{12}$$

Take as an example the nut-roller contact side under same-side installation, where the screw, roller and nut threads are all right-handed, the nut is in compression and the screw is in tension. The terms are calculated as:
$$\begin{cases}
\Delta l_{BNi} = \sum\limits_{j=1}^{i}\delta_{BNj} = \sum\limits_{j=1}^{i}\dfrac{(F_{NRj}-F_{SRj})P_N}{E_N A_N} \\[3mm]
\Delta l_{TNi} = \delta_{TNi} - \delta_{TNi+1} \\[2mm]
\Delta l_{CNi} = \dfrac{K_N(e)}{\pi m_{Na}}\sqrt[3]{\rho_{NR}\cos\beta_N\cos\lambda_N\left(\dfrac{3E_{NR}F_{NRi}}{2}\right)^2} \\[3mm]
\Delta l_{BRi} = \sum\limits_{j=1}^{i}\delta_{BRj} = \sum\limits_{j=1}^{i}\dfrac{(F_{SRj}-F_{NRj})P_R}{E_R A_R} \\[3mm]
\Delta l_{TRi} = \delta_{TRi+1} - \delta_{TRi} \\[2mm]
\Delta l_{CRi} = -\dfrac{K_N(e)}{\pi m_{Na}}\sqrt[3]{\rho_{NR}\cos\beta_N\cos\lambda_N\left(\dfrac{3E_{NR}F_{NRi+1}}{2}\right)^2}
\end{cases} \tag{13}$$
The formula also shows that, as the number of threads increases, the nut and roller accumulate shaft-segment deformation, which is a significant cause of uneven thread load distribution. The load distribution calculation model for PRSM roller threads can be established from Eq. 12. Because contact deformation is a nonlinear function of axial force (Eq. 11), the load distribution cannot be solved directly. An iterative method is used instead: a uniform load is taken as the initial condition to obtain the first deformation, and the contact stiffness and load distribution under this deformation are then back-calculated. Each cycle compares the load distributions from the current and previous iterations. If their difference is within the convergence accuracy Acc, the load distribution has reached a stable state; otherwise the iteration continues until Acc is met, as shown in Fig. 6.
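The iteration described above can be sketched on a deliberately simplified model. This is not the full model of Eqs. 12 and 13: it assumes a single chain of threads, a lumped shaft compliance `c_shaft` per pitch, a linear thread compliance `c_thread` and a Hertz-type contact term `c_contact * F**(2/3)`, and all compliance values below are hypothetical. It only illustrates the loop of "uniform initial load, update contact stiffness, re-solve, compare with the previous distribution".

```python
def solve_load_distribution(total_load, n_threads, c_shaft, c_thread, c_contact,
                            acc=1e-10, max_iter=200):
    """Secant-stiffness iteration for a simplified thread chain.

    Thread deflection is modelled as c_thread*F + c_contact*F**(2/3). The
    deflection difference between threads i and i+1 equals the stretch of the
    shaft segment between them, c_shaft * (sum of loads on threads beyond i).
    """
    mean_load = total_load / n_threads
    loads = [mean_load] * n_threads                 # uniform initial guess
    for iteration in range(1, max_iter + 1):
        # Secant compliance of each thread at the current load estimate.
        k = [c_thread + c_contact * f ** (-1.0 / 3.0) for f in loads]
        # Backward recursion with an arbitrary scale for the last thread.
        new = [0.0] * n_threads
        new[-1] = 1.0
        beyond = 0.0                                # sum of loads past thread i
        for i in range(n_threads - 2, -1, -1):
            beyond += new[i + 1]
            new[i] = (k[i + 1] * new[i + 1] + c_shaft * beyond) / k[i]
        scale = total_load / sum(new)               # enforce the force balance
        new = [f * scale for f in new]
        change = max(abs(a - b) for a, b in zip(new, loads)) / mean_load
        loads = new
        if change < acc:                            # convergence accuracy reached
            return loads, iteration
    raise RuntimeError("load distribution did not converge")

# Hypothetical compliances (m/N and m/N^(2/3)); not taken from the paper.
loads, n_iter = solve_load_distribution(1000.0, 10, 2e-10, 1e-8, 2e-7)
uneven = max(loads) / (sum(loads) / len(loads))     # peak load over mean load
```

In this toy model the thread nearest the loaded end carries the largest load and the loads decrease monotonically along the chain, which mirrors the qualitative behavior described in the text. The ratio `uneven` is one plausible way to express an unevenness coefficient; the definition used in the paper is not stated here, so this choice is an assumption.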
3.4 Solution to the Optimal Modification Amount

After the installation method, structural parameters and bearing load are determined, the load distribution without modification is calculated and the cumulative deformation of the screw, roller and nut is obtained. The maximum modification amount is set equal to this cumulative deformation. The modification amount increases linearly with the thread number, and the thread with the smallest modification has a modification amount of 0. Using the load distribution model for the roller threads and the structural parameters given in the previous section, the load distribution (Fig. 7a) and the cumulative deformation of the roller on the nut-roller side and the roller-screw side are calculated; the cumulative deformations are 7 µm on the nut-roller side and 8 µm on the roller-screw side. These values are used as the modification amounts to recalculate the load distribution, as shown in Fig. 7b.
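The linear modification rule can be written down in a few lines. The sketch assumes that the ramp starts at zero on the first thread and reaches the cumulative deformation on the last one, and it uses a hypothetical thread count, since the number of roller threads for this case is not stated here.

```python
def modification_profile(eps_max_um, n_threads):
    """Linear ramp from 0 at the first thread to eps_max_um at the last thread."""
    step = eps_max_um / (n_threads - 1)
    return [step * i for i in range(n_threads)]

TAU = 10                                             # hypothetical number of roller threads
eps_nut_side = modification_profile(7.0, TAU)        # 7 um on the nut-roller side
eps_screw_side = modification_profile(8.0, TAU)      # 8 um on the roller-screw side
```

Each list entry is the clearance adjustment for one thread in micrometres, i.e. the values that would enter the deformation coordination equation as the per-thread gap.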
Fig. 6 Iterative solution algorithm for load distribution of PRSM roller threads
Fig. 7 shows that the modification makes the load distribution on the roller threads markedly more uniform. The maximum load distribution unevenness coefficient on the nut-roller side falls from 1.1652 to 1.0637, a decrease of 8.71%, and on the roller-screw side from 1.2849 to 1.1314, a decrease of 11.95%.
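The quoted percentage decreases follow directly from the coefficients before and after modification, as the short check below shows. The helper name is illustrative.

```python
def relative_decrease_percent(before, after):
    """Relative decrease of a coefficient, in percent of its original value."""
    return (before - after) / before * 100.0

nut_side = relative_decrease_percent(1.1652, 1.0637)     # about 8.71
screw_side = relative_decrease_percent(1.2849, 1.1314)   # about 11.95
```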
Fig. 7 Load distribution of threads before and after modification: (a) before thread modification; (b) after thread modification
4 Analysis of Factors Influencing Load Distribution

The solution model established above shows that the PRSM thread load distribution depends on the installation method and stress state, on design parameters such as the number of threads and the thread profile parameters, and on the machining errors of the threaded parts. To better understand the load distribution of PRSM threads, reveal how these factors influence it, and guide the choice of modification amount so that the unevenness can be controlled, this section studies the effects of axial force, number of rollers and number of roller threads on the load distribution of PRSM roller threads.
4.1 Axial Force

The PRSM has a large load-bearing capacity, and the axial load varies over a wide range in actual working conditions. The thread load distribution is therefore studied under different axial loads. The load distribution of the roller threads is calculated for axial loads of 5500 N, 11,000 N, 22,000 N, 33,000 N and 44,000 N. As shown in Fig. 8, the uneven distribution of thread load on both the nut-roller side and the roller-screw side becomes more severe as the axial load increases, but the growth of the unevenness coefficient gradually slows. This is because the contact stiffness of the threads is nonlinear: as the contact load increases, the contact stiffness also increases. In summary, changes in axial load do not change the basic pattern of PRSM thread load distribution, but the unevenness becomes more severe as the axial load increases.
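The stiffening effect can be made explicit with the standard Hertz point-contact scaling, in which the contact deformation grows with the two-thirds power of the load. Writing the contact deformation with a lumped constant $c$ (a hypothetical shorthand for all geometry and material factors), the tangent contact stiffness is

$$\delta_C = c\,F^{2/3} \quad\Longrightarrow\quad k_C = \frac{\mathrm{d}F}{\mathrm{d}\delta_C} = \frac{3}{2c}\,F^{1/3}.$$

Doubling the contact load therefore raises the contact stiffness by a factor of $2^{1/3} \approx 1.26$, while the shaft stiffness in Eq. (8) stays constant. Under this scaling the contact becomes relatively stiffer at higher loads, which is consistent with the unevenness growing with load at a diminishing rate.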
Fig. 8 The influence of axial force on the load distribution: (a) nut-roller side; (b) roller-screw side

Fig. 9 The influence of the number of rollers on the load distribution: (a) nut-roller side; (b) roller-screw side
4.2 Number of Rollers

Once the number of starts and the pitch diameter of the screw have been determined, the number of rollers can be selected. Using the PRSM parameters above and disregarding structural limitations, the load distribution is calculated for 5, 10, 15 and 20 rollers. Figure 9a, b shows the load distribution on the nut-roller side and the roller-screw side respectively. As the number of rollers Z increases, the stiffness ratio between the roller shafts and the screw changes in a way that does not favor uniform load distribution; however, the load on a single roller and on a single thread is reduced, which helps improve the fatigue life of the PRSM.
Fig. 10 The influence of the number of roller threads on the load distribution: (a) nut-roller side; (b) roller-screw side
4.3 Number of Roller Threads

The number of roller threads determines how many threads take part in meshing at the same time. The load distribution for different numbers of roller threads is shown in Fig. 10. The unevenness of the load distribution on both the nut-roller side and the roller-screw side increases with the number of roller threads. When the number of roller threads is increased from 10 to 60, the load distribution unevenness coefficient of the first thread pair on the roller-screw side increases from 1.0325 to 1.9381, and that of the last thread pair decreases from 0.9830 to 0.6240. On the nut-roller side, the coefficient of the first thread pair increases from 1.0180 to 1.5968, while that of the last thread pair decreases from 0.9894 to 0.7276.
5 Conclusion

(1) A load distribution model that accounts for the modification of roller threads was established based on the deformation coordination relationship. The deviation of the screw-roller contact point inherent to the PRSM and the influence of the modification on the meshing characteristics were both considered.

(2) The load distribution of PRSM roller threads was calculated with the solution model, and the modification amount was determined from the results. With this modification method, the maximum load distribution unevenness coefficient on the nut-roller side changed from 1.1652 to 1.0637, a decrease of 8.71%, and on the roller-screw side from 1.2849 to 1.1314, a decrease of 11.95%.

(3) The influence of axial load, number of rollers and number of roller threads on the PRSM load distribution was analyzed for same-side installation with the nut in compression and the screw in tension. The unevenness coefficient increases with axial load, at a gradually slowing rate. Increasing the number of rollers or the number of roller threads aggravates the uneven distribution of thread load but also reduces the maximum load on a thread. Owing to the high axial stiffness of the nut and the low thread stiffness, the load distribution on the nut-roller side is more uniform than on the roller-screw side.
References

1. Du, X., Chen, B., Zheng, Z., et al.: Investigation on mechanical behavior of planetary roller screw mechanism with the effects of external loads and machining errors. Tribol. Int. 154, 106689 (2021)
2. Zhang, W., Liu, G., Tong, R., et al.: Thread load balance design method of planetary roller screw mechanism. J. Northwestern Polytech. Univ. 34(3), 499–507 (2016)
3. Ryś, J., Lisowski, F.: The computational model of the load distribution between elements in a planetary roller screw. J. Theor. Appl. Mech. 52(3), 699–705 (2014)
4. Guo, H., Tong, R., Liu, G., et al.: Thread modification method for load balance on planetary roller screw mechanism. J. Theor. Appl. Mech. 36(4), 685–692 (2018)
5. Zhang, X., Huo, X., Huang, J., et al.: Analysis of load capacity for planetary roller screw mechanism with pitch diameter modification. In: 2019 Chinese Automation Congress (CAC) (2019)
6. Zhang, J.: Parameter design and static mechanical property analysis of planetary roller screw. Harbin Institute of Technology (2018)
7. Hu, R., Wei, P., Zhou, P., et al.: A roller taper modification method for load distribution optimization of planetary roller screw mechanism. J. Adv. Mech. Des. Syst. Manuf. 16(3), JAMDSM0032 (2022)
8. Chen, Z., Zheng, J., Shi, W., et al.: Contact characteristics of planetary roller screw mechanism considering angle between principal plane. Missiles Space Veh. (6), 38–42, 47 (2021)
Finite-Time H∞ Synchronization Control of Piecewise Homogeneous Markov Jumping T-S Fuzzy Discrete Complex Networks Subject to Hybrid Attacks and Uncertainty

Xiru Wu, Binlei Zhang, Yuchong Zhang, and Yuqiu Zhang

Abstract In this paper, a non-fragile fuzzy control method is proposed to address the parameter uncertainty common in complex networks and the security threat caused by network attacks. A T-S fuzzy discrete complex network model following a piecewise homogeneous Markov process is constructed on the basis of the study of Markov jump models with aligned times. The combined effects of spoofing attacks and denial of service attacks are considered when processing controller signals. By analyzing a Lyapunov-Krasovskii functional with dual modal correlation, we establish sufficient conditions for finite-time boundedness of the synchronization error system, and then verify the controller's effectiveness on an attacked Lorenz chaotic system.

Keywords Finite-time H∞ synchronization · Piecewise homogeneous Markov · Non-fragile · Hybrid attacks
X. Wu (B) · B. Zhang · Y. Zhang · Y. Zhang School of Electronic Engineering and Automation, Guilin University of Electronic Technology, Guilin 541004, Guangxi, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_24

1 Introduction

In recent years, the practical application of complex networks has brought great convenience to people's daily life and to social development. However, in an open network communication environment, systems are vulnerable to malicious network attacks, such as the Stuxnet worm attack on Iran's nuclear facilities in 2010 and the attack on Ukraine's power grid in 2015, which led to a large-scale blackout. The security of complex networks therefore cannot be ignored [1]. Compared with physical attacks, network attacks mainly reduce network reliability by damaging the availability and integrity of transmitted data or sensors [2, 3]. Generally speaking, network attacks can be divided into spoofing attacks and denial of service (DoS) attacks. The former seriously harms the integrity of data by modifying the information transmitted in the network, while the latter greatly reduces the availability of network
resources and further leads to packet loss [4, 5]. In Ref. [6], tracking control and filtering are studied for discrete T-S fuzzy systems subject to random DoS attacks. In Ref. [7], the particle filtering problem for a class of cyber-physical systems subject to random spoofing attacks is studied under a polling protocol. Zhang et al. [8] studied robust output consensus of multi-agent systems under DoS attacks. Because the network environment is uncertain and complex, a single attack mode cannot fully describe how a networked system is affected, so some researchers have begun to study security control of networks under multiple attacks [9, 10]. In Ref. [9], an adaptive event-triggering mechanism is used to study security control of T-S fuzzy systems subject to both DoS attacks and spoofing attacks. However, it should be noted that the synchronization control of T-S fuzzy discrete complex networks under hybrid attacks has not been fully studied. Parameter switching in complex networks has been described by homogeneous Markov processes, and finite-time synchronization control has been achieved under mode jumps. In some more complex dynamic environments, however, it may be costly to obtain the Markov transition probabilities accurately, and the mode jump rule may change with the environment, so the transition probability varies with time and cannot be described by a single transition probability matrix [11]. To solve these problems, Ref. [12] proposes a Markov process with piecewise homogeneous transition probabilities to study filtering problems for linear systems, in which the changes of the system mode are governed by a higher-level transition probability matrix. It is more practical to use this model to study complex networks with parameter switching.
Most existing results on synchronization control of complex networks assume that the controller parameters can be obtained exactly [13, 14]. In practice, however, complex and changing environmental factors such as actuator degradation, numerical calculation errors and aging of system components may affect the control accuracy, so the parameters actually obtained may differ from their ideal values. In other words, the controller parameters often change or fluctuate during control of the network, which makes the network system fragile and degrades the control performance. Non-fragile control problems have therefore attracted considerable research attention [15, 16]. For example, Ref. [16] introduced a non-fragile event-triggered state estimation method and applied it to discrete complex networks to obtain better estimation results. It is therefore important to consider non-fragility when designing synchronization controllers for complex networks. Based on the above analysis, this paper focuses on finite-time secure synchronization control of T-S fuzzy discrete complex networks with a piecewise homogeneous Markov jump mechanism, while ensuring the non-fragility and H∞ performance of the control system. A piecewise homogeneous Markov model is introduced to model T-S fuzzy discrete complex networks with uncertainty and parameter jumps. The time-varying transition probability is described by a polytopic structure and is governed by a higher-level homogeneous Markov process. The models in Refs. [17, 18] can be regarded as special cases of the model in this paper. Due to the
complexity of the network environment, this paper extends the common single-attack setting to hybrid attacks, using two Bernoulli random variables to describe the random occurrence of spoofing attacks and DoS attacks respectively. To account for gain fluctuation, a non-fragile controller is constructed by adding uncertain terms, and a dual mode-dependent Lyapunov-Krasovskii functional is established. By combining reciprocally convex matrix inequalities and decoupling techniques, a less conservative finite-time H∞ synchronization condition is obtained.
2 Problem Formulation

A class of piecewise homogeneous Markov jump T-S fuzzy discrete complex network models is considered. Let $(\psi, \mathcal{F}, \Pr)$ be a complete probability space, where $\psi$ is the sample space, $\mathcal{F}$ is a $\sigma$-algebra of subsets of the sample space, and $\Pr$ is a probability measure on $\mathcal{F}$. Consider a T-S fuzzy discrete complex network model with coupling delay and uncertainty composed of $N$ nodes. The $l$-th fuzzy rule of this model can be expressed as:

Fuzzy rule $l$: if $\partial_1(k)$ is $\pi_{l1}$, $\partial_2(k)$ is $\pi_{l2}$, ..., and $\partial_q(k)$ is $\pi_{lq}$, then

$$\begin{cases}
x_i(k+1) = \left(A^l_{\sigma(k)} + \Delta A^l_{\sigma(k)}\right)x_i(k) + \theta\sum\limits_{j=1}^{N} h_{ij}^{\sigma(k)}\Gamma_{\sigma(k)}\,x_j(k-\tau(k)) \\
\qquad\qquad\quad + E^l_{\sigma(k)}w_i(k) + B^l_{\sigma(k)}u_i(k) + I_{\sigma(k)}(k) \\
z_i(k) = C^l_{\sigma(k)}x_i(k)
\end{cases} \tag{1}$$

where $l \in M$, the index set of the IF-THEN rules; $\pi_{lh}$ ($h = 1, 2, \ldots, q$) represents the fuzzy set, and $\partial_1(k), \partial_2(k), \ldots, \partial_q(k)$ are the premise variables of the T-S fuzzy model; $x_i(k) \in \mathbb{R}^n$, $u_i(k) \in \mathbb{R}^n$ and $z_i(k) \in \mathbb{R}^n$ are the state vector, control input vector and control output vector of the $i$-th network node at time $k$; $\tau(k)$ is a time-varying delay satisfying $\tau_1 \le \tau(k) \le \tau_2$; $\Gamma_{\sigma(k)} \in \mathbb{R}^{n\times n}$ is the inner coupling matrix of the complex network, and $\theta$ is the coupling strength; $H_{\sigma(k)} = \left(h_{ij}^{\sigma(k)}\right)_{N\times N}$ is the outer coupling configuration matrix of the complex network: if there is a connection from node $i$ to node $j$, then $h_{ij}^{\sigma(k)} > 0$; otherwise $h_{ij}^{\sigma(k)} = 0$. The diagonal elements of $H_{\sigma(k)}$ satisfy $h_{ii}^{\sigma(k)} = -\sum_{j=1, j\neq i}^{N} h_{ij}^{\sigma(k)}$; $I_{\sigma(k)}(k) \in \mathbb{R}^{n_I}$ is an external input vector; $w_i(k) \in \mathbb{R}^{n_I}$ is an external perturbation vector. The initial value of the state vector $x_i(k)$ is $\varphi_i(t) = x_i(t)$, $t \in [-\tau_2, 0]_{\mathbb{Z}}$. $A^l_{\sigma(k)}$, $B^l_{\sigma(k)}$, $C^l_{\sigma(k)}$ and $E^l_{\sigma(k)}$ are known matrices with appropriate dimensions, and the uncertain matrix $\Delta A^l_{\sigma(k)}$ satisfies the norm-boundedness condition of Assumption 1. Before further study, a few necessary definitions, assumptions and lemmas are given:
Assumption 1 Assume that the uncertain parameter matrix $\Delta A^l_{\sigma(k)}$ is norm-bounded and has the following admissible form:

$$\Delta A^l_{\sigma(k)} = E^l_{1\sigma(k)}F^l_{1\sigma(k)}(k)M^l_{1\sigma(k)} \tag{2}$$

where $E^l_{1\sigma(k)}$ and $M^l_{1\sigma(k)}$ are known constant matrices with appropriate dimensions, and $F^l_{1\sigma(k)}(k)$ is an unknown time-varying parameter matrix satisfying $\left(F^l_{1\sigma(k)}(k)\right)^T F^l_{1\sigma(k)}(k) \le I$ for any $k > 0$.

$\sigma(k) \in R = \{1, 2, \ldots, r\}$ is a finite-state Markov process whose transitions follow the transition probability matrix $\Theta^{\phi(k+1)}(k) = \left[\theta_{ab}^{\phi(k+1)}(k)\right]$, whose elements are defined as:

$$\Pr\{\sigma(k+1) = b \mid \sigma(k) = a\} = \theta_{ab}^{\phi(k+1)}(k) \tag{3}$$

where $\theta_{ab}^{\phi(k+1)}(k) > 0$ and $\sum_{b=1}^{r}\theta_{ab}^{\phi(k+1)}(k) = 1$. In addition, the form of the transition probability $\theta_{ab}^{\phi(k+1)}(k)$ shows that the mode $\sigma(k)$ is governed by a higher-level homogeneous Markov process $\phi(k)$ with mode set $G = \{1, 2, \ldots, g\}$, whose transition probability is defined as:

$$\Pr\{\phi(k+1) = v \mid \phi(k) = s\} = \psi_{sv}(k) \tag{4}$$

The corresponding transition probability matrix is $\Psi(k) = \left[\psi_{sv}(k)\right]$, whose elements satisfy $\psi_{sv}(k) \ge 0$ and $\sum_{v=1}^{g}\psi_{sv}(k) = 1$. Note that the state jump follows a homogeneous Markov chain, and the transition probability matrices $\Theta^{\phi(k+1)}(k)$ and $\Psi(k)$ depend on the time instant $k$ and can be modeled using the following convex polytopic structure:

$$\Theta^{\phi(k+1)}(k) = \sum_{d=1}^{D}\rho_d(k)\,\Theta_d^{\phi(k+1)}, \qquad \Theta_d^{\phi(k+1)} = \left[\theta_{s\nu}^{\phi(k+1)d}\right] \tag{5}$$

$$\Psi(k) = \sum_{q=1}^{Q}h_q(k)\,\Psi_q, \qquad \Psi_q = \left[\psi_{sv}^{q}\right] \tag{6}$$

where $\rho_d(k) \ge 0$ and $h_q(k) \ge 0$ satisfy $\sum_{d=1}^{D}\rho_d(k) = 1$, $d \in \{1, 2, \ldots, D\}$. In the convex polytopic structure, the matrices $\Theta_d^{\phi(k+1)}$ and $\Psi_q$ can be regarded as vertices, and $D$ and $Q$ are the respective numbers of vertices. Figure 1 shows a schematic diagram of the mode transfer process in the piecewise homogeneous Markov process. The mode transfer of the system is governed by the two levels of transition probability.
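The two-level switching mechanism can be illustrated with a small simulation. The sketch below is not the model used in this paper: the matrices `PSI` and `THETA`, the mode counts and the blending weights are hypothetical, and states are indexed from 0. It shows a higher-level chain selecting which transition matrix drives the system mode, and a convex combination of vertex matrices in the form of Eq. (5).

```python
import random

def blend(vertices, weights):
    """Convex combination of vertex transition matrices (the form of Eq. 5)."""
    rows, cols = len(vertices[0]), len(vertices[0][0])
    return [[sum(w * v[a][b] for w, v in zip(weights, vertices))
             for b in range(cols)] for a in range(rows)]

def step(state, matrix, rng):
    """Draw the next state from row `state` of a row-stochastic matrix."""
    return rng.choices(range(len(matrix[state])), weights=matrix[state])[0]

# Hypothetical data: two system modes and two higher-level modes.
PSI = [[0.9, 0.1], [0.2, 0.8]]              # higher-level chain phi(k)
THETA = {0: [[0.7, 0.3], [0.4, 0.6]],       # transition matrix of sigma when phi = 0
         1: [[0.1, 0.9], [0.5, 0.5]]}       # transition matrix of sigma when phi = 1

rng = random.Random(0)
phi, sigma, path = 0, 0, []
for k in range(50):
    phi = step(phi, PSI, rng)               # phi(k+1)
    sigma = step(sigma, THETA[phi], rng)    # sigma(k+1) drawn with the matrix indexed by phi(k+1)
    path.append((phi, sigma))

mixed = blend([THETA[0], THETA[1]], [0.25, 0.75])
```

Because the weights are non-negative and sum to one, every row of `mixed` is again a probability distribution, which is the property that makes the polytopic description in Eqs. (5) and (6) well defined.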
Fig. 1 Mode transfer process in piecewise homogeneous Markov process
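Using the weighted average fuzzy theory, the global fuzzy model of system (1) can be derived as

$$\begin{cases}
x_i(k+1) = \left(A^{\eta}_{\sigma(k)} + \Delta A^{\eta}_{\sigma(k)}\right)x_i(k) + \theta\sum\limits_{j=1}^{N} h_{ij}^{\sigma(k)}\Gamma_{\sigma(k)}\,x_j(k-\tau(k)) \\
\qquad\qquad\quad + E^{\eta}_{\sigma(k)}w_i(k) + B^{\eta}_{\sigma(k)}u_i(k) + I_{\sigma(k)}(k) \\
z_i(k) = C^{\eta}_{\sigma(k)}x_i(k)
\end{cases} \tag{7}$$

where

$$A^{\eta}_{\sigma(k)} = \sum_{l\in M}\eta_l(\partial(k))A^{l}_{\sigma(k)}, \qquad \Delta A^{\eta}_{\sigma(k)} = \sum_{l\in M}\eta_l(\partial(k))\Delta A^{l}_{\sigma(k)},$$

$$B^{\eta}_{\sigma(k)} = \sum_{l\in M}\eta_l(\partial(k))B^{l}_{\sigma(k)}, \qquad C^{\eta}_{\sigma(k)} = \sum_{l\in M}\eta_l(\partial(k))C^{l}_{\sigma(k)},$$

$$E^{\eta}_{\sigma(k)} = \sum_{l\in M}\eta_l(\partial(k))E^{l}_{\sigma(k)}, \qquad \eta_l(\partial(k)) = \frac{\prod_{i=1}^{q}\pi_{li}(\partial_i(k))}{\sum_{l\in M}\prod_{i=1}^{q}\pi_{li}(\partial_i(k))}$$

$\pi_{li}(\partial_i(k))$ represents the membership degree of the premise variable $\partial_i(k)$ in the T-S fuzzy set $\pi_{li}$. Assuming $\prod_{i=1}^{q}\pi_{li}(\partial_i(k)) \ge 0$, the weights satisfy $\eta_l(\partial(k)) > 0$ and $\sum_{l\in M}\eta_l(\partial(k)) = 1$. For convenience of the subsequent analysis, $\eta_l(\partial(k))$ is abbreviated as $\eta_l$.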
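The fuzzy blending step can be sketched numerically. The example below assumes two rules, two premise variables and made-up membership degrees and rule matrices; the helper names are illustrative. It computes the normalized weights and the blended matrix in the same way as the definitions of $\eta_l$ and $A^{\eta}_{\sigma(k)}$.

```python
import math

def firing_weights(memberships):
    """memberships[l][i] is the membership degree of premise variable i in rule l."""
    products = [math.prod(row) for row in memberships]
    total = sum(products)
    return [p / total for p in products]

def blend_matrices(weights, matrices):
    """Weighted sum of rule matrices, one matrix per fuzzy rule."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(w * m[r][c] for w, m in zip(weights, matrices))
             for c in range(cols)] for r in range(rows)]

# Hypothetical membership degrees for two rules and two premise variables.
eta = firing_weights([[0.8, 0.5], [0.2, 0.5]])
A_rules = [[[1.0, 0.0], [0.0, 1.0]],      # hypothetical rule-1 matrix
           [[0.0, 1.0], [1.0, 0.0]]]      # hypothetical rule-2 matrix
A_eta = blend_matrices(eta, A_rules)
```

The weights are non-negative and sum to one, so the blended matrix always lies in the convex hull of the rule matrices.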
Using a similar modeling approach, the target node dynamics can be expressed in the following form:

$$\begin{cases} s(k+1) = \left(A^{\eta}_{\sigma(k)} + \Delta A^{\eta}_{\sigma(k)}\right) s(k) + I_{\sigma(k)}(k) \\ z_s(k) = C^{\eta}_{\sigma(k)} s(k) \end{cases} \tag{8}$$
where $s(k) \in \mathbb{R}^{n}$ and $z_s(k) \in \mathbb{R}^{n_z}$ represent the state vector and the controlled output vector of the target node at time $k$, respectively. The difference between the network node state in (7) and the target node state in (8) is defined as the closed-loop synchronization error of the complex network, expressed as $e_i(k) = x_i(k) - s(k)$ and $\bar z_i(k) = z_i(k) - z_s(k)$, and the synchronization error system can be obtained as:

$$\begin{cases} e_i(k+1) = \left(A^{\eta}_{\sigma(k)} + \Delta A^{\eta}_{\sigma(k)}\right) e_i(k) + \sum_{j=1}^{N} h_{ij}^{\sigma(k)}\,\Gamma_{\sigma(k)}\,e_j(k-\tau(k)) + E^{\eta}_{\sigma(k)} w_i(k) + B^{\eta}_{\sigma(k)} u_i(k) \\ \bar z_i(k) = C^{\eta}_{\sigma(k)} e_i(k) \end{cases} \tag{9}$$
The initial value of the state of the error system can be expressed as $\tilde e_i(t)$, $t \in [-\tau_2, 0]_{\mathbb{Z}}$. To make the designed controller non-fragile, possible gain fluctuations are taken into account in the controller design by introducing an uncertain term. According to the parallel distributed compensation technique, the controller and the system share the same premise variables, so the T-S fuzzy rule of the non-fragile controller can be expressed as:

Fuzzy rule $l$: if $\partial_1(k)$ is $\pi_{l1}$, $\partial_2(k)$ is $\pi_{l2}$, ..., and $\partial_q(k)$ is $\pi_{lq}$, then

$$\tilde u_i(k) = \left(K^{l}_{i\sigma(k)} + \Delta K^{l}_{i\sigma(k)}\right) e_i(k) \tag{10}$$
where $K^{l}_{i\sigma(k)}$ represents the controller gain in mode $\sigma(k)$, and the uncertainty $\Delta K^{l}_{i\sigma(k)}$ is a norm-bounded perturbation matrix of the non-fragile controller satisfying $\Delta K^{l}_{\sigma(k)} = E^{l}_{2\sigma(k)} F^{l}_{2\sigma(k)}(k) M^{l}_{2\sigma(k)}$. Here $E^{l}_{2\sigma(k)}$ and $M^{l}_{2\sigma(k)}$ are known constant matrices, and $F^{l}_{2\sigma(k)}(k)$ is constrained by $\left(F^{l}_{2\sigma(k)}(k)\right)^{T} F^{l}_{2\sigma(k)}(k) \le I$. Because the network communication environment is open, nodes are vulnerable to malicious network attacks while transmitting information, which affects the stability of the system. When modeling the controller in a real network environment, the non-fragile control input signal is considered to be affected by a mixed attack signal composed of a spoofing attack and a DoS attack. Under a spoofing attack, the attack signal replaces the normal control data; it is represented by the function $f(e_i(k))$. A variable $\beta(k)$ following a Bernoulli distribution is used to represent the spoofing attack on the control input signal: the transmission is normal when $\beta(k) = 1$, and a spoofing attack occurs when $\beta(k) = 0$. At the same time,
the influence of the DoS attack on the control input signal is also considered, so the controller signal $u_i(k)$ actually received by the system can be expressed as:

$$u_i(k) = \alpha(k)\left(\beta(k)\,\tilde u_i(k) + (1-\beta(k))\,f(e_i(k))\right) \tag{11}$$
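As an illustration of how (11) combines the two attack flags, the following minimal sketch evaluates the received control for the three possible cases. The function names, the error vector and the nominal control values are hypothetical; the spoofing signal $-\tanh(0.5e)$ is the one used later in the numerical example, and reading $\alpha(k) = 0$ as an active DoS attack is inferred from the structure of (11).

```python
import math


def spoof(e):
    """Spoofing signal of the numerical example, f(e) = -tanh(0.5 e), elementwise."""
    return [-math.tanh(0.5 * x) for x in e]


def received_input(u_tilde, e, alpha, beta):
    """Eq. (11): u = alpha * (beta * u_tilde + (1 - beta) * f(e))."""
    f = spoof(e)
    return [alpha * (beta * u + (1 - beta) * fe) for u, fe in zip(u_tilde, f)]


def sample_flags(alpha_bar, beta_bar, rng):
    """Draw Bernoulli flags with Pr{alpha = 1} = alpha_bar and Pr{beta = 1} = beta_bar."""
    alpha = 1 if rng.random() < alpha_bar else 0
    beta = 1 if rng.random() < beta_bar else 0
    return alpha, beta


e = [0.4, -1.0, 2.0]          # hypothetical synchronization error
u_tilde = [-0.8, 2.0, -4.0]   # hypothetical nominal control signal

u_dos = received_input(u_tilde, e, alpha=0, beta=1)     # alpha = 0: nothing is received
u_spoof = received_input(u_tilde, e, alpha=1, beta=0)   # beta = 0: f(e) replaces the control
u_normal = received_input(u_tilde, e, alpha=1, beta=1)  # normal transmission
```

When $\alpha(k) = 0$ the value of $\beta(k)$ is irrelevant, which is the DoS priority discussed in the remark that follows.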
In combination with formula (10), the following complete form of the control input can be obtained:

$$u_i(k) = \alpha(k)\left[\beta(k)\left(K^{\eta}_{i\sigma(k)} + \Delta K^{\eta}_{i\sigma(k)}\right) e_i(k) + (1-\beta(k))\,f(e_i(k))\right] \tag{12}$$

Assumption 2 [19] For a given constant matrix $L$, the spoofing attack signal satisfies the following condition:

$$\|f(e_i(k))\|_2 \le \|L e_i(k)\|_2 \tag{13}$$

Assumption 3 $\alpha(k)$ and $\beta(k)$ satisfy the following statistical characteristics:

$$E\{\alpha(k)\} = \Pr\{\alpha(k) = 1\} = \bar\alpha, \quad E\{\beta(k)\} = \Pr\{\beta(k) = 1\} = \bar\beta. \tag{14}$$
where $\bar\alpha \in [0, 1]$ and $\bar\beta \in [0, 1]$. It further follows that

$$E\{\alpha(k) - \bar\alpha\} = 0, \quad E\{(\alpha(k) - \bar\alpha)^2\} = \bar\alpha(1 - \bar\alpha), \quad E\{\beta(k) - \bar\beta\} = 0, \quad E\{(\beta(k) - \bar\beta)^2\} = \bar\beta(1 - \bar\beta).$$

Remark Note that the control input signal given in (12) is affected by two types of network attacks, which may occur simultaneously or alternately; two variables $\alpha(k)$ and $\beta(k)$ obeying Bernoulli distributions are used here to represent the attack sequences. A DoS attack prevents any data transmission through the communication network, including false signals generated by spoofing attacks. Therefore, the DoS attack is given priority by default when designing the controller: when both types of attack occur at the same time, the spoofing attack is blocked as well, and the actual impact is that of the DoS attack.

By substituting the control input (12) into (9) and using the Kronecker product, the following compact closed-loop error system can be obtained:

$$\begin{cases} e(k+1) = \left(\mathcal{A}^{\eta}_{\sigma(k)} + \Delta\mathcal{A}^{\eta}_{\sigma(k)}\right) e(k) + \mathcal{H} e(k-\tau(k)) + \mathcal{E}^{\eta}_{\sigma(k)} w(k) \\ \qquad\qquad + \alpha(k)\beta(k)\,\mathcal{B}^{\eta}_{\sigma(k)}\left(\mathcal{K}^{\eta}_{\sigma(k)} + \Delta\mathcal{K}^{\eta}_{\sigma(k)}\right) e(k) + \alpha(k)(1-\beta(k))\,\mathcal{B}^{\eta}_{\sigma(k)} f(e(k)) \\ \tilde y(k) = \mathcal{C}^{\eta}_{\sigma(k)} e(k) \end{cases} \tag{15}$$
Fig. 2 Block diagram of closed-loop control system
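where

$$e(k) = \left[e_1^T(k), e_2^T(k), \ldots, e_N^T(k)\right]^T, \quad w(k) = \left[w_1^T(k), w_2^T(k), \ldots, w_N^T(k)\right]^T,$$

$$e(k-\tau(k)) = \left[e_1^T(k-\tau(k)), e_2^T(k-\tau(k)), \ldots, e_N^T(k-\tau(k))\right]^T,$$

$$\mathcal{A}^{\eta}_{\sigma(k)} = I \otimes A^{\eta}_{\sigma(k)}, \quad \Delta\mathcal{A}^{\eta}_{\sigma(k)} = I \otimes \Delta A^{\eta}_{\sigma(k)}, \quad \mathcal{B}^{\eta}_{\sigma(k)} = I \otimes B^{\eta}_{\sigma(k)}, \quad \mathcal{C}^{\eta}_{\sigma(k)} = I \otimes C^{\eta}_{\sigma(k)},$$

$$\mathcal{E}^{\eta}_{\sigma(k)} = I \otimes E^{\eta}_{\sigma(k)}, \quad \mathcal{H} = H_{\sigma(k)} \otimes \Gamma_{\sigma(k)},$$

$$\mathcal{K}^{\eta}_{\sigma(k)} = \mathrm{diag}\left\{K^{\eta}_{1\sigma(k)}, K^{\eta}_{2\sigma(k)}, \ldots, K^{\eta}_{N\sigma(k)}\right\}, \quad \Delta\mathcal{K}^{\eta}_{\sigma(k)} = \mathrm{diag}\left\{\Delta K^{\eta}_{1\sigma(k)}, \Delta K^{\eta}_{2\sigma(k)}, \ldots, \Delta K^{\eta}_{N\sigma(k)}\right\}.$$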
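As shown in Fig. 2, the block diagram of the closed-loop control system, the controller signal is affected by random attack signals in the network environment. To facilitate the later theoretical analysis, the statistical characteristics of the attack variables in (14) are used to rewrite the synchronization error system as:

$$e(k+1) = \Xi^{\eta}_{1\sigma(k)} + (\alpha(k) - \bar\alpha)\,\Xi^{\eta}_{2\sigma(k)} + (\beta(k) - \bar\beta)\,\Xi^{\eta}_{3\sigma(k)} + (\alpha(k) - \bar\alpha)(\beta(k) - \bar\beta)\,\Xi^{\eta}_{4\sigma(k)}, \tag{16}$$

with

$$\Xi^{\eta}_{1\sigma(k)} = \left[\mathcal{A}^{\eta}_{\sigma(k)} + \Delta\mathcal{A}^{\eta}_{\sigma(k)} + \bar\alpha\bar\beta\,\mathcal{B}^{\eta}_{\sigma(k)}\mathcal{K}^{\eta}_{\sigma(k)} + \bar\alpha\bar\beta\,\mathcal{B}^{\eta}_{\sigma(k)}\Delta\mathcal{K}^{\eta}_{\sigma(k)}\right] e(k) + \mathcal{H} e(k-\tau(k)) + \mathcal{E}^{\eta}_{\sigma(k)} w(k) + \bar\alpha(1-\bar\beta)\,\mathcal{B}^{\eta}_{\sigma(k)} f(e(k)),$$

$$\Xi^{\eta}_{2\sigma(k)} = \bar\beta\,\mathcal{B}^{\eta}_{\sigma(k)}\left(\mathcal{K}^{\eta}_{\sigma(k)} + \Delta\mathcal{K}^{\eta}_{\sigma(k)}\right) e(k) + (1-\bar\beta)\,\mathcal{B}^{\eta}_{\sigma(k)} f(e(k)),$$

$$\Xi^{\eta}_{3\sigma(k)} = \bar\alpha\,\mathcal{B}^{\eta}_{\sigma(k)}\left(\mathcal{K}^{\eta}_{\sigma(k)} + \Delta\mathcal{K}^{\eta}_{\sigma(k)}\right) e(k) - \bar\alpha\,\mathcal{B}^{\eta}_{\sigma(k)} f(e(k)),$$

$$\Xi^{\eta}_{4\sigma(k)} = \mathcal{B}^{\eta}_{\sigma(k)}\left(\mathcal{K}^{\eta}_{\sigma(k)} + \Delta\mathcal{K}^{\eta}_{\sigma(k)}\right) e(k) - \mathcal{B}^{\eta}_{\sigma(k)} f(e(k)).$$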
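The grouping in (16) can be checked by centering the two Bernoulli variables. As a short worked step, write $\tilde\alpha = \alpha(k) - \bar\alpha$ and $\tilde\beta = \beta(k) - \bar\beta$ (shorthand introduced here only for illustration). Then

$$\alpha(k)\beta(k) = \bar\alpha\bar\beta + \bar\beta\,\tilde\alpha + \bar\alpha\,\tilde\beta + \tilde\alpha\tilde\beta,$$

$$\alpha(k)\bigl(1-\beta(k)\bigr) = \bar\alpha(1-\bar\beta) + (1-\bar\beta)\,\tilde\alpha - \bar\alpha\,\tilde\beta - \tilde\alpha\tilde\beta.$$

Substituting these two identities into (15) and collecting the terms that multiply $1$, $\tilde\alpha$, $\tilde\beta$ and $\tilde\alpha\tilde\beta$ yields the four groups of (16). Since $E\{\tilde\alpha\} = E\{\tilde\beta\} = 0$, and under the additional assumption (not stated explicitly in the text) that $\alpha(k)$ and $\beta(k)$ are mutually independent, the last three groups vanish in expectation, which is what makes this form convenient for the mean-square analysis.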
Definition 1 [20] For a given matrix $\Phi$ and positive scalars $c_1$, $c_2$ ($c_1 \le c_2$) and $\tilde w$, if for any $t \in \{-T_M, -T_M + 1, \ldots, 0\}$ the following condition is met:

$$\begin{cases} E\{\tilde e^T(t)\,\Phi\,\tilde e(t)\} \le c_1 \\ \sum_{k=0}^{T_m} w^T(k) w(k) \le \tilde w \end{cases} \Rightarrow E\{e^T(k)\,\Phi\,e(k)\} \le c_2 \tag{17}$$

then the closed-loop error system (9) is said to be mean-square finite-time bounded, i.e. the system achieves finite-time synchronization with respect to $(c_1, c_2, T_m, \Phi, \tilde w)$.

Definition 2 [21] If the closed-loop error system (9) is mean-square finite-time bounded with respect to $(c_1, c_2, T_m, \Phi, \tilde w)$ in the sense of Definition 1 and, under zero initial conditions, satisfies the inequality

$$E\left\{\sum_{k=0}^{T_m} \bar z^T(k)\,\bar z(k)\right\} \le \gamma^2\,E\left\{\sum_{k=0}^{T_m} w^T(k)\,w(k)\right\} \tag{18}$$

then the closed-loop error system (9) is mean-square finite-time bounded with respect to $(c_1, c_2, T_m, \Phi, \tilde w)$ and satisfies the $H_\infty$ performance index $\gamma$; that is, system (1) achieves finite-time $H_\infty$ synchronization with respect to $(c_1, c_2, T_m, \Phi, \tilde w)$.

Assumption 4 For any $t \in \{-T_M, -T_M + 1, \ldots, 0\}$, the closed-loop error system satisfies the following inequality:

$$E\left\{(e(k+1) - e(k))^T (e(k+1) - e(k))\right\} \le c_3 \tag{19}$$

where $c_3$ is a positive number.

Lemma 1 [22] (Reciprocally convex matrix inequality) For any vectors $y_1$, $y_2$ of appropriate dimensions and any scalar $\epsilon \in (0, 1)$, if there exist matrices $\Upsilon > 0$ and $X$ satisfying

$$\begin{bmatrix} \Upsilon & X \\ * & \Upsilon \end{bmatrix} \ge 0, \tag{20}$$

then the following inequality holds:

$$\frac{1}{\epsilon}\,y_1^T \Upsilon y_1 + \frac{1}{1-\epsilon}\,y_2^T \Upsilon y_2 \ge \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}^T \begin{bmatrix} \Upsilon & X \\ * & \Upsilon \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}. \tag{21}$$
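Inequality (21) can be sanity-checked numerically. The sketch below is only an illustration under assumed data: the matrix `UPSILON`, the choice `X = 0.5 * UPSILON` (which makes the block matrix in (20) positive semidefinite, since it equals a positive definite 2 × 2 matrix Kronecker-multiplied by `UPSILON`) and the name `eps` for the scalar in (0, 1) are all hypothetical.

```python
import random


def quad(y, m, z):
    """Return y^T M z for list-based vectors and matrices."""
    return sum(y[i] * m[i][j] * z[j] for i in range(len(y)) for j in range(len(z)))


# Hypothetical data: UPSILON > 0 and X = 0.5 * UPSILON, so condition (20) holds.
UPSILON = [[2.0, 0.5], [0.5, 1.0]]
X = [[0.5 * v for v in row] for row in UPSILON]


def gap(y1, y2, eps):
    """Left-hand side minus right-hand side of inequality (21)."""
    lhs = quad(y1, UPSILON, y1) / eps + quad(y2, UPSILON, y2) / (1.0 - eps)
    # Block quadratic form: y1'Uy1 + y1'Xy2 + y2'X'y1 + y2'Uy2
    rhs = quad(y1, UPSILON, y1) + 2.0 * quad(y1, X, y2) + quad(y2, UPSILON, y2)
    return lhs - rhs


rng = random.Random(0)  # fixed seed so the check is repeatable
min_gap = min(
    gap([rng.uniform(-5, 5) for _ in range(2)],
        [rng.uniform(-5, 5) for _ in range(2)],
        rng.uniform(0.01, 0.99))
    for _ in range(10000)
)
```

A non-negative `min_gap` (up to rounding) over the random samples is consistent with the lemma; it is a spot check, not a proof.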
3 Numerical Simulations
In this section, the theory proposed above is applied to a Lorenz chaotic system, and the effectiveness and applicability of the theoretical analysis are demonstrated by simulation results.
Fig. 3 Network node communication topology
Example 1 A Lorenz chaotic system consisting of 4 nodes is considered as a T-S fuzzy homogeneous discrete complex network model. Figure 3 shows the communication topology between nodes. According to the chaotic system proposed in Ref. [23], the model of the target node in continuous time is expressed as

$$\begin{cases} \dot s_1(t) = -10 s_1(t) + 10 s_2(t) \\ \dot s_2(t) = b_{\sigma(t)} s_1(t) - s_2(t) - s_1(t) s_3(t) \\ \dot s_3(t) = s_1(t) s_2(t) - \frac{8}{3} s_3(t) \end{cases} \tag{22}$$
$b_{\sigma(t)}$ is set to $b_1 = 42$, $b_2 = 36$ under the piecewise homogeneous Markov jump mode. The continuous-time chaotic system given by Eq. (22) is discretized by the first-order Euler method with sampling period $\Delta t = 0.01$. The parameters of the mode-dependent T-S fuzzy complex network are given below:

$$A_{11} = \begin{bmatrix} 0.9000 & 0.1000 & 0 \\ 0.4200 & 0.9900 & -0.3200 \\ 0 & 0.3200 & 0.9733 \end{bmatrix}, \quad A_{21} = \begin{bmatrix} 0.9000 & 0.1000 & 0 \\ 0.4200 & 0.9900 & 0.3200 \\ 0 & -0.3200 & 0.9733 \end{bmatrix},$$

$$A_{12} = \begin{bmatrix} 0.9000 & 0.1000 & 0 \\ 0.3600 & 0.9900 & -0.3200 \\ 0 & 0.3200 & 0.9733 \end{bmatrix}, \quad A_{22} = \begin{bmatrix} 0.9000 & 0.1000 & 0 \\ 0.3600 & 0.9900 & 0.3200 \\ 0 & -0.3200 & 0.9733 \end{bmatrix},$$

$$B_{11} = B_{12} = B_{21} = B_{22} = I, \quad \Gamma_1 = \Gamma_2 = I, \quad [\ldots] = 0.6, \quad I_1(k) = I_2(k) = 0,$$

$$C_{11} = C_{12} = \begin{bmatrix} 0.72 & -0.18 & 0.27 \end{bmatrix}, \quad C_{21} = C_{22} = \begin{bmatrix} 0.88 & -0.22 & 0.31 \end{bmatrix},$$

$$E_{11} = \mathrm{diag}\{0.36, 0.18, 0.72\}, \quad E_{12} = \mathrm{diag}\{0.32, 0.12, 0.6\},$$

$$E_{21} = \mathrm{diag}\{0.42, 0.21, 0.85\}, \quad E_{22} = \mathrm{diag}\{0.24, 0.12, 0.55\}.$$
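The four system matrices above can be reproduced from (22) by a forward-Euler step, $A = I + \Delta t\,A_c$, applied to each local T-S model. The sketch below is an illustration only: the function name is hypothetical, and the vertex value 32 for $s_1$ is an assumption inferred from the $\pm 0.32$ entries of the listed matrices (it is not stated in the text).

```python
def ts_lorenz_matrix(b, s1_vertex, dt=0.01):
    """Forward-Euler discretization I + dt * A_c of one local model of (22).

    The nonlinear terms -s1*s3 and s1*s2 are linearized by fixing s1 at a
    sector vertex value s1_vertex (standard T-S sector nonlinearity idea).
    """
    a_c = [
        [-10.0, 10.0, 0.0],
        [b, -1.0, -s1_vertex],
        [0.0, s1_vertex, -8.0 / 3.0],
    ]
    return [[(1.0 if r == c else 0.0) + dt * a_c[r][c] for c in range(3)]
            for r in range(3)]


S1_BOUND = 32.0  # assumption: inferred from the +/-0.32 entries, not given in the text

A11 = ts_lorenz_matrix(42.0, +S1_BOUND)  # rule 1, mode 1 (b1 = 42)
A12 = ts_lorenz_matrix(36.0, +S1_BOUND)  # rule 1, mode 2 (b2 = 36)
A21 = ts_lorenz_matrix(42.0, -S1_BOUND)  # rule 2, mode 1
A22 = ts_lorenz_matrix(36.0, -S1_BOUND)  # rule 2, mode 2
```

Rounded to four decimals, these agree with the matrices listed in the example, which suggests the first index is the fuzzy rule and the second the Markov mode.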
The premise variable of the T-S fuzzy model is $x_{i1}(k)$, and the membership function is
Fig. 4 Synchronization error of open loop system
$$\eta_1(x_{i1}(k)) = \begin{cases} 0.5\,(1 + x_{i1}(k)/35), & |x_{i1}(k)| \le 35 \\ 0, & |x_{i1}(k)| > 35 \end{cases}, \quad \eta_2(x_{i1}(k)) = 1 - \eta_1(x_{i1}(k)).$$
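The homogeneous Markov processes $\sigma(k)$ and $\phi(k)$ are governed by the following transition probability matrices:

$$\Psi = \begin{bmatrix} 0.2 & 0.8 \\ 0.6 & 0.4 \end{bmatrix}, \quad \Theta_1 = \begin{bmatrix} 0.5 & 0.5 \\ 0.7 & 0.3 \end{bmatrix}, \quad \Theta_2 = \begin{bmatrix} 0.5 & 0.5 \\ 0.3 & 0.7 \end{bmatrix}.$$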
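The two-layer switching can be sketched as a pair of coupled Markov chains: a high-level chain $\phi(k)$ driven by $\Psi$ selects which matrix $\Theta$ drives the mode $\sigma(k)$. The code below is a minimal simulation under assumptions: modes are indexed from 0, the matrices are entered as read from the example, and the transition of $\sigma$ at step $k$ is taken to use the matrix indexed by $\phi(k+1)$, following the superscript in (5). The function names are hypothetical.

```python
import random

PSI = [[0.2, 0.8], [0.6, 0.4]]           # high-level chain phi(k)
THETA = {
    0: [[0.5, 0.5], [0.7, 0.3]],         # Theta_1, active when phi = 1
    1: [[0.5, 0.5], [0.3, 0.7]],         # Theta_2, active when phi = 2
}


def step(state, matrix, rng):
    """Sample the next state of a finite Markov chain from row `state`."""
    u, acc = rng.random(), 0.0
    for nxt, p in enumerate(matrix[state]):
        acc += p
        if u < acc:
            return nxt
    return len(matrix[state]) - 1        # guard against floating-point rounding


def simulate(steps, seed=0):
    """Return the list of (phi, sigma) pairs for k = 0..steps."""
    rng = random.Random(seed)
    phi, sigma = 0, 0
    path = [(phi, sigma)]
    for _ in range(steps):
        phi_next = step(phi, PSI, rng)
        # Assumption: sigma's transition at time k uses Theta indexed by phi(k+1).
        sigma = step(sigma, THETA[phi_next], rng)
        phi = phi_next
        path.append((phi, sigma))
    return path


path = simulate(50)
```

Plotting the two components of `path` against `k` gives mode-evolution curves of the kind shown in Fig. 5a, b.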
According to the network topology given in Fig. 3, the outer coupling configuration matrices of the complex network under the two modes are

$$H_1 = \begin{bmatrix} -0.5 & 0.2 & 0.3 & 0 \\ 0.5 & -0.6 & 0 & 0.1 \\ 0.3 & 0 & -0.8 & 0.5 \\ 0 & 0.2 & 0.4 & -0.6 \end{bmatrix}, \quad H_2 = \begin{bmatrix} -0.7 & 0.2 & 0.3 & 0.2 \\ 0.3 & -0.6 & 0.2 & 0.1 \\ 0.2 & 0.2 & -0.8 & 0.4 \\ 0.3 & 0.1 & 0.4 & -0.8 \end{bmatrix}.$$
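In addition, the matrices associated with the uncertain terms $\Delta A^{l}_{\sigma(k)}$ and $\Delta K^{l}_{\sigma(k)}$ are set to

$$E^{1}_{11} = E^{2}_{11} = \begin{bmatrix} 0.05 & 0.01 & 0.02 \end{bmatrix}^T, \quad E^{1}_{12} = E^{2}_{12} = \begin{bmatrix} 0.03 & 0.02 & -0.01 \end{bmatrix}^T,$$

$$M^{1}_{11} = M^{2}_{11} = \begin{bmatrix} 0.04 & -0.02 & 0.03 \end{bmatrix}, \quad M^{1}_{12} = M^{2}_{12} = \begin{bmatrix} 0.01 & 0.05 & 0.01 \end{bmatrix},$$

$$E^{1}_{21} = E^{2}_{21} = \begin{bmatrix} 0.02 & -0.01 & 0.01 \end{bmatrix}^T, \quad E^{1}_{22} = E^{2}_{22} = \begin{bmatrix} 0.03 & 0.04 & 0.02 \end{bmatrix}^T,$$

$$M^{1}_{21} = M^{2}_{21} = \begin{bmatrix} 0.03 & -0.03 & 0.026 \end{bmatrix}, \quad M^{1}_{22} = M^{2}_{22} = \begin{bmatrix} 0.04 & 0.03 & 0.018 \end{bmatrix},$$

$$F^{1}_{11}(k) = F^{1}_{12}(k) = F^{2}_{11}(k) = F^{2}_{12}(k) = 0.6\sin(k), \quad F^{1}_{21}(k) = F^{1}_{22}(k) = F^{2}_{21}(k) = F^{2}_{22}(k) = 0.5\cos(k).$$

The coupling delay of the complex network is set to $\tau(k) = [\sin(\pi k/2) + 1]$, where $\tau_1 = 0$ and $\tau_2 = 2$. The external disturbance is $w_i(k) = (\bar w(k), \bar w(k), \bar w(k))^T$ with $\bar w(k) = 2\cos(5k)/\left(1 + e^{0.2k}\right)$. The spoofing attack signal is set to $f(e_i(k)) =$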
Fig. 5 a Modal evolution of the homogeneous Markov process σ(k); b Modal evolution of the homogeneous Markov process φ(k); c Sequence of DoS attacks and spoofing attacks on the control input signal
Fig. 6 Trajectories of deception attacks
Fig. 7 Synchronization errors of closed-loop system
$-\tanh(0.5 e_i(k))$, $L = -0.5$, the expectation associated with the spoofing attack is $\bar\beta = 0.5$, and the expectation associated with the DoS attack is $\bar\alpha = 0.3$. The parameters are set to $\zeta = 1.02$, $c_1 = 2$, $c_2 = 10$, $c_3 = 1.5$, $\tilde w = 0.5$, $T_m = 50$ and $\Phi = I$, and the $H_\infty$ performance index is $\gamma = 0.5910$. The initial value of the target node is set to $s(k) = (15.8, -12.48, 15.64)^T$, $k \in [-\tau_2, 0]_{\mathbb{Z}}$, and the initial values of the network nodes are generated randomly by perturbing the initial value of the target node within a range of 1. The fuzzy controller gain of each node in each mode can be solved via LMIs, and the control simulation results are shown in Figs. 5, 6, 7, 8 and 9. When the complex network (1) is not controlled, the synchronization error trajectory is as shown in Fig. 4; the curves clearly diverge and the system cannot achieve synchronization. Figure 5a, b show the evolution of the modes $\sigma(k)$ and $\phi(k)$ of the piecewise homogeneous Markov process, respectively; the system parameter matrices depend directly on the mode $\sigma(k)$. The sequence of DoS attacks and spoofing attacks on the control input signal is shown in Fig. 5c. When a DoS attack occurs, the spoofing attack signal is also blocked, so no spoofing signal is transmitted while a DoS attack is active.
Fig. 8 Trajectories of controllers
Fig. 9 Chaotic phase trajectories of each node in the closed-loop system
The trajectory of the spoofing attack is shown in Fig. 6, the synchronization error trajectory of the closed-loop system is plotted in Fig. 7, and the controller trajectory is shown in Fig. 8. From the enlarged view for $k \le 50$, it can be seen that the synchronization error converges within the set finite time $T_m$. Figure 9 shows the chaotic phase trajectory of each node after the controller is applied to the complex network. The three network nodes can still track the state of the target node quickly when attacked, which indicates that the controller performs well in simulation.
4 Conclusions
In this chapter, we studied the finite-time secure synchronization control problem of T-S fuzzy discrete complex networks, where the system parameters vary with time and the mode transition probabilities follow a piecewise homogeneous Markov process. To ensure the security and reliability of complex network control, the influence of mixed attack signals composed of spoofing attacks and DoS attacks on the controller signals is considered, and an uncertainty term is introduced in the controller gain design to avoid fragility. The stability of the closed-loop synchronization error system is analyzed with a Lyapunov approach based on the two-layer homogeneous transition probabilities, and the reciprocally convex matrix inequality is used to estimate the summation terms containing delay information, which reduces conservatism. This yields LMI-based sufficient conditions under which the error system is mean-square finite-time bounded with guaranteed $H_\infty$ performance, together with the desired controller gains. Finally, the Lorenz chaotic system example shows that the designed non-fragile synchronization controller is effective and ensures convergence of the synchronization errors under hybrid network attacks.
Acknowledgements This work was supported by National Natural Science Foundation of China under Grant 62263005, Guangxi Natural Science Foundation under Grant 2020GXNSFDA238029, Laboratory of AI and Information Processing (Hechi University), Education Department of Guangxi Zhuang Autonomous Region under Grant 2022GXZDSY004, Innovation Project of Guangxi Graduate Education YCSW2023298, Innovation Project of GUET Graduate Education under Grant 2023YCXS124.
References
1. Jin, M., Chao, C.: Distributed adaptive security consensus control for a class of multi-agent systems under network decay and intermittent attacks. Inf. Sci. Int. J. 547(1) (2021). https://doi.org/10.1016/j.ins.2020.08.013
2. Sakthivel, R., Kwon, O.-M., Park, M.J., Choi, S.-G., Sakthivel, R.: Robust asynchronous filtering for discrete-time T-S fuzzy complex dynamical networks against deception attacks. IEEE Trans. Fuzzy Syst. 30(8), 3257–3269 (2022). https://doi.org/10.1109/TFUZZ.2021.3111453
3. Liu, J., Yin, T., Cao, J., Yue, D., Karimi, H.R.: Security control for T-S fuzzy systems with adaptive event-triggered mechanism and multiple cyber-attacks. IEEE Trans. Syst. Man Cybern. Syst. PP(99), 1–11 (2020). https://doi.org/10.1109/TSMC.2019.2963143
4. Yuan, H., Xia, Y., Yang, H.: Resilient state estimation of cyber-physical system with multichannel transmission under DoS attack. IEEE Trans. Syst. Man Cybern. Syst. PP(99), 1–12 (2020). https://doi.org/10.1109/TSMC.2020.2964586
5. Ding, D., Wang, Z., Han, Q.L., Wei, G.: Security control for discrete-time stochastic nonlinear systems subject to deception attacks. IEEE Trans. Syst. Man Cybern. Syst. (2016). https://doi.org/10.1109/TSMC.2016.2616544
6. Zhao, N., Shi, P., Xing, W., Lim, C.P.: Resilient adaptive event-triggered fuzzy tracking control and filtering for nonlinear networked systems under denial-of-service attacks. IEEE Trans. Fuzzy Syst. 30(8), 3191–3201 (2022). https://doi.org/10.1109/TFUZZ.2021.3106674
7. Song, J., Shan, J.: Particle filtering for a class of cyber-physical systems under round-robin protocol subject to randomly occurring deception attacks. Inf. Sci. Int. J. 544(1) (2021). https://doi.org/10.1016/j.ins.2020.07.047
8. Zhang, D., Liu, L., Feng, G.: Consensus of heterogeneous linear multiagent systems subject to aperiodic sampled-data and DoS attack. IEEE Trans. Cybern. PP(99), 1–11 (2018). https://doi.org/10.1109/TCYB.2018.2806387
9. Peng, H., Zhang, Y., Lei, J., Lin, M.: H∞ asynchronous synchronisation control for Markovian coupled delayed neural networks with missing information. Int. J. Syst. Sci. 53 (2022). https://doi.org/10.1080/00207721.2021.1998719
10. Dong, S., Liu, M.: Adaptive fuzzy asynchronous control for nonhomogeneous Markov jump power systems under hybrid attacks. IEEE Trans. Fuzzy Syst. 31(3), 1009–1019 (2023). https://doi.org/10.1109/TFUZZ.2022.3193805
11. Deng, Y., Mo, Z., Lu, H.: Robust H∞ state estimation for a class of complex networks with dynamic event-triggered scheme against hybrid attacks (2021). https://doi.org/10.1088/1674-1056/ac0ee9
12. Xue, M., Yan, H., Zhang, H., Li, Z., Chen, S., Chen, C.: Event-triggered guaranteed cost controller design for T-S fuzzy Markovian jump systems with partly unknown transition probabilities. IEEE Trans. Fuzzy Syst. 29(5), 1052–1064 (2021). https://doi.org/10.1109/TFUZZ.2020.2968866
13. Zhang, L.: H∞ estimation for discrete-time piecewise homogeneous Markov jump linear systems. Automatica 45(11), 2570–2576 (2009). https://doi.org/10.1016/j.automatica.2009.07.004
14. Hou, N., Dong, H., Wang, Z., Ren, W., Alsaadi, F.E.: Non-fragile state estimation for discrete Markovian jumping neural networks. Neurocomputing 179(C), 238–245 (2016). https://doi.org/10.1016/j.neucom.2015.11.089
15. Shen, H., Hu, X., Wang, J., Cao, J., Qian, W.: Non-fragile H∞ synchronization for Markov jump singularly perturbed coupled neural networks subject to double-layer switching regulation. IEEE Trans. Neural Netw. Learn. Syst. 34(5), 2682–2692 (2023). https://doi.org/10.1109/TNNLS.2021.3107607
16. Qiu, Y., Hua, C., Wang, Y.: Nonfragile sampled-data control of T-S fuzzy systems with time delay. IEEE Trans. Fuzzy Syst. 30(8), 3202–3210 (2022). https://doi.org/10.1109/TFUZZ.2021.3107748
17. Adhira, B., Nagamani, G., Dafik, D.: Non-fragile extended dissipative synchronization control of delayed uncertain discrete-time neural networks. Commun. Nonlinear Sci. Numer. Simul. 116, 106820 (2022). https://doi.org/10.1016/j.cnsns.2022.106820
18. Fan, S., Yan, H., Zhang, H., Shen, H., Shi, K.: Dynamic event-based nonfragile dissipative state estimation for quantized complex networks with fading measurements and its application. IEEE Trans. Circ. Syst. I. Regul. Pap. Publ. IEEE Circ. Syst. Soc. (2), 68 (2021). https://doi.org/10.1109/TCSI.2020.3036626
19. Liu, J., Yin, T., Cao, J., Yue, D., Karimi, H.R.: Security control for T-S fuzzy systems with adaptive event-triggered mechanism and multiple cyber-attacks. IEEE Trans. Syst. Man Cybern. Syst. PP(99), 1–11 (2020). https://doi.org/10.1109/TSMC.2019.2963143
20. Nesheli, M.M., Ceder, A.A., Gonzalez, V.A.: Real-time public transport operational tactics using synchronized transfers to eliminate vehicle bunching 3220–3229 (2016). https://doi.org/10.1109/TITS.2016.2542268
21. Sang, H., Zhao, J.: Finite-time H∞ estimator design for switched discrete time delayed neural networks with event-triggered strategy. IEEE Trans. Cybern. PP(99), 1–13 (2020). https://doi.org/10.1109/TCYB.2020.2992518
22. Park, P., Ko, J.W., Jeong, C.: Reciprocally convex approach to stability of systems with time-varying delays. Automatica (2011). https://doi.org/10.1016/j.automatica.2010.10.014
23. Wang, J., Xia, J., Shen, H., Xing, M., Park, J.H.: H∞ synchronization for fuzzy Markov jump chaotic systems with piecewise-constant transition probabilities subject to PDT switching rule. IEEE Trans. Fuzzy Syst. 1–1 (2020). https://doi.org/10.1109/TFUZZ.2020.3012761
Modeling and Control Algorithm of the Multi-duct-rotor Mode Transformable Aircraft Mucheng Tang, Yue Ma, and Zhiheng Bu
Abstract In this article, we present a mode-transformable VTOL aircraft composed of ducted propellers in a 6 × 6 square array, designed to fly through narrow door and window openings. We define the basic modes of the aircraft, and model and analyze the dynamic characteristics of each mode. We design a cascade PID algorithm for the different modes and a power switching method for rolling on the ground, propose a partition rotational speed allocation method, and verify them in software simulation. The simulation results for each mode indicate that the aircraft is able to deform to pass through narrow openings and to roll forward.
Keywords VTOL aircraft · Ducted propeller · Mode transformable · Flight control
1 Introduction
Multi-rotor aircraft benefit from high maneuverability and can take off vertically and hover in the air, playing an important role in transportation, media, and other fields. At present, the common quadrotor and eight-rotor aircraft on the market still perform poorly in complex environments. In the basic configuration, mode-transformable aircraft generally have an overall structure, aerodynamic characteristics, and control system similar to those of traditional fixed-structure aircraft [1]. Changing the structure of the aircraft inevitably changes its moment of inertia and generally affects its aerodynamic characteristics during flight. This changes the parameters in the algorithm and complicates the design of the control system.
Unmanned aerial vehicles can be classified by weight, platform type, power mode, and application range [2]. References [3–5], based on quadrotor UAVs with different mode-transforming mechanisms, designed inner- and outer-loop control frameworks for trajectory control and attitude control, analyzed the aerodynamic characteristics, identified the model parameters in simulation, and finally built flight platforms for physical verification. References [6–9], based on single ducted-rotor UAVs, studied control algorithms using cascade PID or MPC controllers. References [10–16] studied the mechanical design of special configurations and their controllers, using cascade PID and robust H∞ algorithms.
The main contributions of this article are as follows: (1) Dynamic models of the different modes of the aircraft are provided, and the aerodynamic feasibility of the modes is verified through CFD simulation. (2) A flight control strategy is presented that uses a cascade PID control method with a partition rotational speed allocation method, and the control strategy is validated in MATLAB/Simulink and CoppeliaSim.
Fig. 1 Aircraft in preset modes
M. Tang · Y. Ma (B) · Z. Bu, Beijing Institute of Technology, Beijing 100089, China, e-mail: [email protected]
Y. Ma, Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401135, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_25
2 Modeling and Analysis of Aerodynamic Characteristics
In this section, the model of the aerodynamic components of the aircraft is given first, and then the flight and rolling of the entire aircraft are analyzed kinematically. A flow simulation is carried out in ANSYS to verify the rationality of the design and to provide a reference for the design of the flight control algorithm. The aircraft modeled in this article adopts a 6 × 6 duct array in a QUAD-X configuration. The layout allows a single-degree-of-freedom, column-wise rotational transformation of up to 60°, achieving the flattened I-mode and the folded U-mode, as well as ground rolling in the fully retracted O-mode. The aircraft is symmetric in each mode and has enough lift to hover (Fig. 1).
2.1 Power System Modeling
According to [17, 18], the propeller lift and torque models are given by:

$$T^{*} = C_T\,\rho \left(\frac{N}{60}\right)^{2} D_p^{4} \tag{1}$$

$$M = C_M\,\rho \left(\frac{N}{60}\right)^{2} D_p^{5} \tag{2}$$

$$T = (1 + q)\,T^{*} \tag{3}$$

where $C_T$ and $C_M$ are the dimensionless lift coefficient and torque coefficient, and $q$ is the lift gain coefficient of the duct structure. The coefficients are solved with the BEM theory model from [19]. The model of the electric drive system follows the simplified model from [20].
Fig. 2 Simplified aircraft model, with the 4 quadrants reduced to 4 propellers
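Equations (1) to (3) translate directly into a small helper. The sketch below is illustrative only: the function name, the default air density and all numerical coefficient values are hypothetical placeholders, not parameters of the aircraft in this article, and $N$ is assumed to be the rotational speed in rpm and $D_p$ the propeller diameter in metres.

```python
def ducted_propeller(rpm, c_t, c_m, d_p, q=0.0, rho=1.225):
    """Return (thrust, torque) of one ducted propeller from Eqs. (1)-(3).

    rpm : rotational speed N (assumed to be in revolutions per minute)
    c_t, c_m : dimensionless lift and torque coefficients
    d_p : propeller diameter D_p (assumed to be in metres)
    q : lift gain coefficient of the duct
    rho : air density (assumed sea-level value, kg/m^3)
    """
    n = rpm / 60.0                           # revolutions per second
    t_star = c_t * rho * n ** 2 * d_p ** 4   # Eq. (1): bare-propeller lift
    torque = c_m * rho * n ** 2 * d_p ** 5   # Eq. (2): propeller torque
    thrust = (1.0 + q) * t_star              # Eq. (3): lift including duct gain
    return thrust, torque


# Hypothetical coefficients, for illustration only.
t_open, m_open = ducted_propeller(10000, c_t=0.1, c_m=0.01, d_p=0.05)
t_duct, m_duct = ducted_propeller(10000, c_t=0.1, c_m=0.01, d_p=0.05, q=0.2)
t_fast, m_fast = ducted_propeller(20000, c_t=0.1, c_m=0.01, d_p=0.05, q=0.2)
```

Both lift and torque scale with the square of the rotational speed, which is why the allocation equations later in the article are written in terms of squared speeds.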
2.2 Dynamic Characteristic Analysis
Considering the geometric symmetry, the aircraft in mode I is simplified as a quadrotor model. The relationships between flight speed, flight distance and pitch angle in mode I are described by the following equations (Fig. 2):

$$V(\theta) = \sqrt{\frac{2G\tan\theta}{\rho S\left[C_{D1}\left(1 - \sin^{3}\theta\right) + C_{D2}\left(1 - \cos^{3}\theta\right)\right]}} \tag{4}$$

$$C_D = C_{D1}\left(1 - \sin^{3}\theta\right) + C_{D2}\left(1 - \cos^{3}\theta\right) \tag{5}$$

$$Z(\theta) = 60\,T_{fly}(\theta)\,V(\theta) \tag{6}$$
In mode U, we assume that the lift of each duct remains the same, so the expression for the level flight speed is the same as in mode I. The fold angle is denoted by $\gamma$. The horizontal components of the propeller torques cancel each other out. The maximum lift force and vertical torque within the same quadrant are:

$$W = 9T_{max}\left(\cos 0.5\gamma + \cos 1.5\gamma + \cos 2.5\gamma\right), \quad M = 9M_{max}\left(\cos 0.5\gamma + \cos 1.5\gamma + \cos 2.5\gamma\right) \tag{7}$$
Fig. 3 Result of relationship between pitch and velocity and folding angle under different loads in mode U
Fig. 4 Simplification of aircraft model
Based on the above equations, numerical calculations are carried out using the parameters designed in Sect. 2. The U-mode results show that as the folding angle increases, the total vertical pulling force of the aircraft decreases (Fig. 3).
2.3 Flow Field Simulation
The transient flow field of the aircraft was simulated using the Fluent module in ANSYS Workbench to determine how arranging multiple ducted propellers in a compact array affects lift. To simplify the model, excess ribs between ducts were removed and the ducts were rigidly connected. The model was meshed with tetrahedral grids, with the inlet flow velocity set to 0 m/s and the propeller speed set to 50,000 rpm. A single duct, four ducts in mode I, and six ducts in mode U were simulated (Figs. 4 and 5). The simulation results show that the lift generated by an isolated single duct and by a single duct within the mode I 2 × 2 array is 1.82 N and 1.46 N, respectively, indicating that the flow coupling between parallel ducts weakens the lift. In the mode U 2 × 3 array, the lift generated by a single duct along its axis is 1.82 N, which suggests that the modal transformation cancels the coupling effect of the duct assembly and supports the assumption of equal lift for each duct in mode U made in Sect. 2.2.
Fig. 5 Result of 4 ducts in mode I and 6 ducts in mode U
3 Design and Simulation of Control Algorithm
This section starts from the definition of the coordinate system and provides a model of the position and attitude based on the dynamic model in Sect. 2. Based on the classical control strategy of the quadrotor UAV, the cascade PID control strategy of the aircraft is given, and the speed distribution strategy under the QUAD-X layout is determined.
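3.1 Rigid Body Dynamics Model of Aircraft
First, we assume that the aircraft is rigid and that its geometric center coincides with the center of gravity (COG). The aircraft is subjected only to gravity and lift. The angular velocity of the aircraft in the body coordinate system is $\omega_B = [p, q, r]^T$, and the Euler angle rate in the ground coordinate system is $\dot\Theta = [\dot\phi, \dot\theta, \dot\psi]^T$. The angular velocity equation is:

$$\begin{bmatrix} \dot\phi \\ \dot\theta \\ \dot\psi \end{bmatrix} = \begin{bmatrix} 1 & \tan\theta\sin\phi & \tan\theta\cos\phi \\ 0 & \cos\phi & -\sin\phi \\ 0 & \dfrac{\sin\phi}{\cos\theta} & \dfrac{\cos\phi}{\cos\theta} \end{bmatrix} \begin{bmatrix} p \\ q \\ r \end{bmatrix} \tag{8}$$

Analyzing the forces on the aircraft based on Newton's second law gives:

$$m\dot v_e = -m g e_3 + f R_b^{e} e_3 \tag{9}$$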
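The kinematic relation (8) can be written as a small function that maps body rates to Euler angle rates. This is a minimal sketch with hypothetical function names; angles are in radians, and the transformation is singular at $\theta = \pm 90°$, where $\cos\theta = 0$.

```python
import math


def euler_rate_matrix(phi, theta):
    """Matrix of Eq. (8) mapping body rates [p, q, r] to [phi_dot, theta_dot, psi_dot]."""
    return [
        [1.0, math.tan(theta) * math.sin(phi), math.tan(theta) * math.cos(phi)],
        [0.0, math.cos(phi), -math.sin(phi)],
        [0.0, math.sin(phi) / math.cos(theta), math.cos(phi) / math.cos(theta)],
    ]


def euler_rates(phi, theta, body_rates):
    """Apply Eq. (8) to a body-rate vector [p, q, r]."""
    w = euler_rate_matrix(phi, theta)
    return [sum(w[r][c] * body_rates[c] for c in range(3)) for r in range(3)]


# In level attitude the Euler rates equal the body rates.
level = euler_rates(0.0, 0.0, [0.1, 0.2, 0.3])
# With 90 degrees of roll, a pure body pitch rate q appears as a yaw rate.
rolled = euler_rates(math.pi / 2, 0.0, [0.0, 1.0, 0.0])
```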
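The gyroscopic moment and the total propeller moment are given by the following equations:

$$G = \begin{bmatrix} -J_0\,q \sum_{k=1}^{n_r} (-1)^{i}\,\bar\omega_k \\ J_0\,p \sum_{k=1}^{n_r} (-1)^{i}\,\bar\omega_k \\ 0 \end{bmatrix} \tag{10}$$

$$\tau = \begin{bmatrix} \tau_x \\ \tau_y \\ \tau_z \end{bmatrix} = \begin{bmatrix} \tau_{x1} + \tau_{x2} + \tau_{x3} + \tau_{x4} \\ \tau_{y1} + \tau_{y2} + \tau_{y3} + \tau_{y4} \\ \tau_{z1} + \tau_{z2} + \tau_{z3} + \tau_{z4} \end{bmatrix} \tag{11}$$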
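The subscript of the propeller torque denotes the partition in body coordinates, with 1–4 corresponding to quadrants I–IV, respectively. The torque of each partition is calculated from the lift and torque equations (1)–(3). The attitude equation of the aircraft can be obtained from the above equations:

$$\dot\omega_b = \begin{bmatrix} \dot p \\ \dot q \\ \dot r \end{bmatrix} = \begin{bmatrix} \dfrac{1}{J_{xx}}\left[\tau_x - J_0\,q \sum_{k=1}^{n_r} (-1)^{i}\,\bar\omega_k + \left(J_{yy} - J_{zz}\right) q r\right] \\ \dfrac{1}{J_{yy}}\left[\tau_y + J_0\,p \sum_{k=1}^{n_r} (-1)^{i}\,\bar\omega_k + \left(J_{zz} - J_{xx}\right) p r\right] \\ \dfrac{1}{J_{zz}}\left[\tau_z + \left(J_{xx} - J_{yy}\right) p q\right] \end{bmatrix} \tag{12}$$

In mode U, the inertia matrix changes:

$$J_U = \mathrm{diag}\left(J_{xxU}, J_{yyU}, J_{zzU}\right) = \mathrm{diag}\left(J_{xx}, J_{yy}, J_{zz}\right) \cdot \mathrm{diag}\begin{bmatrix} 2.5\cos\left(\frac{\gamma}{2}\right) + 1.5\cos\left(\frac{3\gamma}{2}\right) + 0.5\cos\left(\frac{5\gamma}{2}\right) \\ 1 \\ 2.5\sin\left(\frac{\gamma}{2}\right) + 1.5\sin\left(\frac{3\gamma}{2}\right) + 0.5\sin\left(\frac{5\gamma}{2}\right) \end{bmatrix}^{T} \tag{13}$$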
3.2 Design of Control Strategy
The controller adopts a cascade PID strategy, with an outer loop that generates the acceleration commands of the aircraft along the three axes of the body coordinate system and an inner loop that generates the torque commands about the three axes. The control inputs for the aircraft position are:

$$a_x = K_p\,\mathrm{err}(x) + K_i \int \mathrm{err}(x)\,dt + K_d\,\frac{d\,\mathrm{err}(x)}{dt} \tag{14}$$

$$a_y = K_p\,\mathrm{err}(y) + K_i \int \mathrm{err}(y)\,dt + K_d\,\frac{d\,\mathrm{err}(y)}{dt} \tag{15}$$

$$a_z = K_p\,\mathrm{err}(z) + K_i \int \mathrm{err}(z)\,dt + K_d\,\frac{d\,\mathrm{err}(z)}{dt} \tag{16}$$
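A discrete-time version of the position law (14) to (16) can be sketched as follows. This is an illustration under assumptions, not the controller used in the article: the class name, the gains, the sample time and the one-axis double-integrator plant (net acceleration command, gravity already compensated) are all hypothetical.

```python
class PID:
    """Discrete PID: u = Kp*err + Ki*sum(err*dt) + Kd*(err - prev_err)/dt."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = None

    def update(self, err):
        self.integral += err * self.dt
        # No derivative action on the very first sample (no previous error yet).
        deriv = 0.0 if self.prev_err is None else (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv


def track_altitude(target=1.0, steps=1500, dt=0.01):
    """Outer-loop demo on one axis: z'' = a_z, with a_z from the PID law (16)."""
    pid = PID(kp=4.0, ki=2.0, kd=3.0, dt=dt)   # hypothetical gains
    z, v = 0.0, 0.0
    for _ in range(steps):
        a_z = pid.update(target - z)
        v += a_z * dt
        z += v * dt
    return z


z_final = track_altitude()
```

In a cascade arrangement, the outer-loop accelerations would be converted into lift and attitude references, and a second set of PID loops of the same form would produce the torque commands.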
According to the modeling in Sect. 3.1, reference values for the lift and for the roll and pitch angles can be obtained. The control input for the aircraft attitude has a form similar to that for the position. The rotational speed signals of the equivalent quadrotor in the four quadrants are solved from the following equation:

$$\begin{bmatrix} f \\ \tau \end{bmatrix} = A\,\bar\omega^{2} \tag{17}$$

where $A$ is the matrix for torque distribution in partitions.
3.3 Partition Lift Allocation Method
The equivalent lift force and torque generated within a single quadrant should be the same as before simplification. In mode I, the distribution equations are:

$$\sum_{i=1}^{9} \left(\omega_j^{i}\right)^{2} = \omega_j^{2} \tag{18}$$

$$\sum_{i=1}^{9} d_i \left(\omega_j^{i}\right)^{2} = d\,\omega_j^{2} \tag{19}$$

$$i = 1, 2, \ldots, 9; \quad j = 1, 2, 3, 4. \tag{20}$$

The equations have 7 degrees of freedom and are underdetermined. By unifying the speed within the entire quadrant to the same value, or by assigning the same speed within each of two different regions, the corresponding speed for each duct can be assigned. Considering the folding angle $\gamma$ in mode U and unifying the rotational speed within a single quadrant, a distribution equation similar to that of mode I is obtained, which can be solved easily:

$$\sum_{i} \sin\gamma_i \left(\omega^{*}\right)^{2} = \omega_j^{2} \tag{21}$$
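The simplest way to resolve the free degrees of freedom in mode I is to give all nine ducts of a quadrant the same speed. The sketch below assumes that (18) and (19) are read as sums over the nine ducts of a quadrant; the function names and the lever-arm values are hypothetical.

```python
import math


def uniform_partition_speed(omega_eq, n_ducts=9):
    """Per-duct speed satisfying (18) when all ducts in a quadrant spin equally.

    n_ducts * omega_duct**2 = omega_eq**2  =>  omega_duct = omega_eq / sqrt(n_ducts)
    """
    return omega_eq / math.sqrt(n_ducts)


def equivalent_arm(arms):
    """Lever arm d of the equivalent rotor that makes (19) hold for uniform speeds."""
    return sum(arms) / len(arms)


omega_eq = 30000.0                      # hypothetical equivalent-rotor speed (rpm)
omega_duct = uniform_partition_speed(omega_eq)

# Hypothetical lever arms of the nine ducts in one quadrant (metres).
arms = [0.05, 0.10, 0.15, 0.05, 0.10, 0.15, 0.05, 0.10, 0.15]
d_eq = equivalent_arm(arms)

lift_side = sum(omega_duct ** 2 for _ in arms)           # left-hand side of (18)
moment_side = sum(d * omega_duct ** 2 for d in arms)     # left-hand side of (19)
```

With uniform speeds the equivalent arm is just the mean of the individual arms; assigning different speeds to two regions of the quadrant would use up one more of the free degrees of freedom.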
3.4 Simulation and Analysis
The dynamic model and control algorithm presented above were built in MATLAB/Simulink to verify the reliability of the algorithm. Three operating conditions were simulated: vertical takeoff, hover, and forward flight. The results show that the algorithm achieves a stable lift distribution over the duct array and tracks commands quickly with little overshoot (Fig. 6).
Fig. 6 Result of flight simulation
Fig. 7 Results of mode I tracking simulation
Fig. 8 Results of mode U travel through a narrow window
The aircraft model was then imported into CoppeliaSim to further simulate the control algorithm. Mode I was set to track a reciprocating target position specified with the mouse in the horizontal plane, and mode U was set to fly through a narrow window. The results show that the pitch and tilt of the aircraft stay within a reasonable range while tracking the target (Fig. 7). Finally, the rolling of mode O on the ground was simulated. Only two columns of ducted propellers provide power at any time, and the driven columns are switched during the motion. There is still a clear gap in rolling performance between the 6-column duct body and a wheeled rigid body.
Modeling and Control Algorithm of the Multi-duct-rotor …
Fig. 9 Results of mode O rolling on the ground
4 Conclusion
This article presented an aircraft composed of an array of multiple ducted propellers. Two flight modes and one rolling mode were defined. The dynamic characteristics of the aircraft were analyzed through models and CFD simulation. A cascade PID flight controller with a partition rotational speed allocation method was designed, and the control strategy was verified by simulation in MATLAB/Simulink and CoppeliaSim. Future research is expected to continue optimizing the comprehensive control performance of this type of aircraft (Figs. 8 and 9).
References
1. Mintchev, S., Floreano, D.: Adaptive morphology: a design principle for multimodal and multifunctional robots. IEEE Robot. Autom. Mag. 23(3), 42–54 (2016)
2. Yan, C., Tu, L., Wang, Y., et al.: Application of unmanned aerial vehicle in civil field in China. Flight Dyn. 40(3), 1–6+12 (2022)
3. Su, C., Yu, H., Wu, L., Yu, L., Gao, J., Lu, J.: Design and flight verification of a deformable tilting quadrotor UAV 10 (2022)
4. Gao, J., Wang, P., Hou, Z.: UAV velocity control based on improved PID algorithm 43, 1–5, 10 (2015)
5. Zhang, Z., Yang, Z., Duan, Y., Liao, L., Lu, K., Zhang, Q.: Active disturbance rejection control method for actively deformable quadrotor. Control Theory Appl. 38(4), 444–456 (2021)
6. Lee, H., Han, S., Lee, H., Jeon, J., Lee, C., Kim, Y.B., Song, S.H., Choi, H.R.: Design optimization, modeling, and control of unmanned aerial vehicle lifted by Coandă effect. IEEE/ASME Trans. Mechatron. 22(3), 1327–1336 (2017)
7. Cheng, Z., Pei, H., Li, S.: Neural-networks control for hover to high-speed-level-flight transition of ducted fan UAV with provable stability. IEEE Access 8, 100135–100151 (2020)
8. Manzoor, T., Yuanqing, X., Di-Hua, Z., Dailiang, M.: Trajectory tracking control of a VTOL unmanned aerial vehicle using offset-free tracking MPC. Chin. J. Aeronaut. 33(7), 2024–2042 (2020)
9. Manzoor, T., Sun, Z., Xia, Y., Ma, D.: MPC based compound flight control strategy for a ducted fan aircraft. Aerosp. Sci. Technol. 107, 106264 (2020)
10. Xiaoliang, W., Xiang, C., Najjaran, H., Bin, X.: Robust adaptive fault-tolerant control of a tandem coaxial ducted fan aircraft with actuator saturation. Chin. J. Aeronaut. 31(6), 1298–1310 (2018)
11. Fan, W., Xiang, C., Xu, B.: Modelling, attitude controller design and flight experiments of a novel micro-ducted-fan aircraft. Adv. Mech. Eng. 10(3), 1687814018765569 (2018)
12. Zhang, Y., Xu, B., Xiang, C., Fan, W., Ai, T.: Flight and interaction control of an innovative ducted fan aerial manipulator. Sensors 20(11), 3019 (2020)
13. Zhao, M., Anzai, T., Shi, F., Chen, X., Okada, K., Inaba, M.: Design, modeling, and control of an aerial robot dragon: a dual-rotor-embedded multilink robot with the ability of multi-degree-of-freedom aerial transformation. IEEE Robot. Autom. Lett. 3(2), 1176–1183 (2018)
14. Zhao, M., Okada, K., Inaba, M.: Enhanced modeling and control for multilinked aerial robot with two DoF force vectoring apparatus. IEEE Robot. Autom. Lett. 6(1), 135–142 (2020)
15. Shi, F., Zhao, M., Anzai, T., Chen, X., Okada, K., Inaba, M.: External wrench estimation for multilink aerial robot by center of mass estimator based on distributed IMU system. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 1891–1897. IEEE (2019)
16. Zhao, M., Nagato, K., Okada, K., Inaba, M., Nakao, M.: Forceful valve manipulation with arbitrary direction by articulated aerial robot equipped with thrust vectoring apparatus. IEEE Robot. Autom. Lett. 7(2), 4893–4900 (2022)
17. Moffitt, B., Bradley, T., Parekh, D., Mavris, D.: Validation of vortex propeller theory for UAV design with uncertainty analysis. In: 46th AIAA Aerospace Sciences Meeting and Exhibit, p. 406 (2008)
18. Merchant, M., Miller, L.S.: Propeller performance measurement for low Reynolds number UAV applications. In: 44th AIAA Aerospace Sciences Meeting and Exhibit, p. 1127 (2006)
19. Quan, Q.: Design and Control of Multirotor Aircraft. Publishing House of Electronics Industry (2018)
20. Ling, Y., Wang, T., Zhang, D., Zhou, L.: Fluid mechanics analysis of UAV rotor based on fluent. Pract. Electron. 31(22–25) (2023)
VR-Based Virtual Surgical Training via Parametric Human Body Modeling
Yilin Zhao, Hongjian Huang, and Qiang Fu
Abstract In this paper, we developed a VR-based system for surgical training using parametric human body modeling. This system has two main functions: (1) Using a hierarchical leg model and inhomogeneous deformation methods based on the Mega-Fiers plugin, the system automatically estimates the proportional relationships between muscles, fat, and bones in a three-dimensional leg model based on parameters such as gender, height, weight, and body fat percentage to construct a personalized 3D leg model for surgical operations; (2) The system allows for interactive control of virtual surgical tools’ postures and movements using VR, and the leg model can be cut open using mesh algorithms in Unity to perform virtual surgical incisions. Additionally, the system provides a high-fidelity VR surgical room environment and a parameterized user interface. We conducted a series of experiments to demonstrate the effectiveness of the parametric 3D human body modeling and the usability of the interactive incision by using VR equipment. In addition, our system can be further developed and applied to other medical fields beyond surgery, such as anatomical education and medical research.
Keywords Virtual reality · Surgical training · Parametric modeling
Y. Zhao International School, Beijing University of Posts and Telecommunications, Beijing 100876, China e-mail: [email protected] H. Huang School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China e-mail: [email protected] Q. Fu (B) School of Digital Media & Design Arts, Beijing University of Posts and Telecommunications, Beijing 100876, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_26
1 Introduction
Virtual surgical training has been widely used in medical education and practice due to its safety, repeatability, and cost-effectiveness. With the development of virtual reality (VR) technology, various VR platforms have been developed to provide a realistic and immersive environment for surgical training. On one hand, VR-based surgical training enables surgeons to practice surgical procedures in a safe and controlled environment. On the other hand, VR platforms provide a highly immersive experience, allowing users to interact with simulated surgical tools and anatomical structures in a way that closely mimics the experience of real surgery. In recent years, VR-based surgical training has been widely studied and various systems have been developed [1]. Some studies have placed particular emphasis on hand position tracking and force-feedback devices to obtain precise three-dimensional interaction information and give users a more realistic sense of interaction [2]. As the number of virtual surgical training systems continues to increase, there is also a growing need for 3D models that support virtual surgery. Such demands can be satisfied by either patient-specific data or synthetic data. In this paper, we propose a novel synthetic data generation method that leverages human body parameters, including gender, height, weight, and body fat percentage, to automatically generate a 3D model of a human leg for virtual surgery. Based on this, we develop a VR platform for virtual surgical training via parametric human body modeling. Users can adjust these parameters to generate a customized patient model for virtual surgical simulation, thus allowing for more personalized and targeted training. Our system is developed with Unity3D, a popular game engine that offers powerful real-time rendering and physics simulation capabilities.
It provides an interactive and intuitive interface for users to interact with virtual surgical tools and visualize the surgical procedures in real-time, such as the inhomogeneous deformation of 3D legs by using the Mega-Fiers plugin and interactive incisions by using VR equipment. We demonstrate the usability and effectiveness of our system by presenting the results of parametric 3D leg generation and virtual surgical operation and discussing its potential applications in medical education and practice.
2 Related Work Virtual surgical simulation has emerged as a promising approach for surgical training, skill acquisition, and assessment [3–5]. On one hand, haptic feedback has been identified as a key component of virtual surgical simulation [6–8], enabling surgeons to experience the sense of touch and force feedback during surgery. On the other hand, recent studies have explored the use of real-time simulation and mesh-free methods in virtual surgical simulation [9–11]. These studies demonstrate the potential of virtual surgical simulation as an effective training and assessment tool for surgical procedures. Besides, at the microscopic level, using simulation calculations
to analyze and predict biological mechanisms is also a popular research direction (e.g., [12, 13]). In addition, several studies have also evaluated the validity and effectiveness of virtual reality simulation for specific surgical procedures, such as hip arthroscopy, spine surgery, and percutaneous coronary intervention. For example, Khanduja et al. [14] tested the construct validity of a virtual reality hip arthroscopy simulator and demonstrated that it accurately assessed surgical skills and had high levels of user satisfaction. Banaszek et al. [15] compared virtual reality simulation with benchtop simulation in the acquisition of arthroscopic skill and found that virtual reality simulation had better face validity and user satisfaction. Gasco et al. [16] evaluated the usefulness of a virtual reality spine surgery simulation and demonstrated that it improved surgical skills and confidence. Li et al. [17] designed and evaluated a personalized percutaneous coronary intervention surgery simulation system and demonstrated that it improved surgical skills and reduced procedure time. Teng and Deng [18] proposed a spiral flow guider for endovascular stent to induce the blood flow in the stent to rotate.
3 Methodology and the System
The VR-based interactive virtual surgical simulation was implemented using Unity and SteamVR plugins. The basic architecture of our system consists of two parts: intelligent generation of human body models based on body parameters, and virtual simulation of surgical procedures based on VR interaction. The former aims to generate three-dimensional human body models that fit the body shape given parameters such as height, weight, and body fat percentage. To achieve this, we use Blender to construct accurate multi-level models and the Mega-Fiers plugin in Unity to deform the model non-uniformly when the weight parameter changes. The latter realizes the interaction of surgical tools and the cutting of surgical objects in the VR environment. To achieve this, we use the interaction functions and collision detection in Unity to simulate the operation of surgical tools with devices such as hand-held controllers, and we use mesh algorithms in Unity to cut the model. The system architecture is shown in Fig. 1.
In orthopedic surgery, there are significant individual differences in patient body shape, which can greatly affect the surgical outcome. To address this, the goal of our system is to simulate and model the patient’s body according to gender, body fat percentage, height, and weight (Fig. 2-left). In this manner, we can establish diverse and differentiated human body models based on each patient’s body type and disease pattern, which enables us to design more targeted surgical plans. This approach addresses the earlier problems of relying on a single model and of obtaining diverse models. To use the human body parameters to model individual 3D shapes, we design a multi-layer 3D leg model consisting of skin, muscle, fat, and bone (as illustrated
Fig. 1 The architecture of our system
Fig. 2 Left System UI for body parameter inputs. Right Multi-layer leg 3D models with skin, fat, muscle, and bone
in Fig. 2-right). Based on this, we use computer-aided design and precise measurements to create personalized surgical plans and simulate the surgical process for each patient, which allows orthopedic surgery to develop in an intelligent, precise, and personalized direction. In addition, we considered the distribution of body fat and used the bulge script in the Mega-Fiers plugin. The amount value in bulge is adjusted as the slider is dragged, so that the distribution of fat in the body increases non-linearly as weight increases. For example, when an adult male gains 1 kg of weight, more fat is distributed in the inner thighs and near the buttocks, and less near the joints. These changes are updated in real time on the model, which is more consistent with the biological features of the human body.
Our system uses the mesh algorithm to optimize the cutting of human body models by surgical instruments, achieving layered cutting of skin, fat, muscles, and bones and performing the cutting in real time in Unity3D. Firstly, the cutting plane
Fig. 3 Examples of layered cutting on different layers. From left to right, the first one only cuts the skin, the second one cuts the skin and fat layers, and the last two cut the layers of skin, fat, and muscle
is determined based on the doctor’s touch operation, and the vertices of the original model are divided into two categories, namely, above and below the plane. Colliders (collision boxes) are added to the skin, muscles, fat, and the front end of the surgical instrument, and set as triggers. Next, the colliders are used to detect whether the skin, muscles, fat, etc. intersect with the surgical instrument. If the surgical instrument has no intersection with the plane of the human body model, the algorithm ends. If the cutting plane intersects with the surgical instrument, the mesh data of the original model is obtained, and all mesh vertices are traversed to determine which vertices lie above and which lie below the cutting plane; they are placed in PositiveMesh and NegativeMesh, respectively. In addition, vertices inside the surgical instrument need to be deleted. After the deletion is completed, the section needs to be filled, and the section vertices need to be re-sorted, connected, and UV-mapped. The effect of layered cutting is shown in Fig. 3.
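The vertex classification step above can be sketched as follows. This is a minimal illustration in Python rather than Unity code: it assumes the cutting plane is given by a point and a normal vector, and the function name and the tuple-based vertex representation are hypothetical. Deleting vertices inside the instrument, filling the section, and UV mapping are not covered.

```python
def split_vertices_by_plane(vertices, plane_point, plane_normal):
    """Partition vertices by the sign of their distance to a cutting plane.

    Vertices on or above the plane go to positive_mesh, the rest to
    negative_mesh (mirroring the PositiveMesh / NegativeMesh split).
    """
    positive_mesh, negative_mesh = [], []
    for vertex in vertices:
        # Signed distance (up to the normal's length): dot(vertex - point, normal)
        signed = sum((v - p) * n
                     for v, p, n in zip(vertex, plane_point, plane_normal))
        if signed >= 0:
            positive_mesh.append(vertex)
        else:
            negative_mesh.append(vertex)
    return positive_mesh, negative_mesh
```

For a horizontal plane through the origin with normal (0, 1, 0), a vertex at height 1 lands in the positive list and a vertex at height -1 in the negative list.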
4 Experiments
In this section, we first evaluate the methods proposed in our system, including (1) the intelligent generation method for parametric human body modeling, and (2) the VR-based interactive virtual surgical simulation. Then, we evaluate our system in terms of both the user interface and the system performance.
Parametric 3D modeling of legs. To demonstrate the effectiveness of the parametric 3D human body modeling, we first conduct a qualitative comparison in which we use four pairs of body parameters to generate 3D legs (Fig. 4). We employ a controlled-variable method in each pair. More specifically, the legs in each pair are generated with different values of only one parameter (gender, height, weight, or body fat percentage), while the other three parameters are the same. We can observe in the cross-section of each leg that various factors affect the leg model. Firstly, gender affects the leg model. When body fat percentage, weight, and height are held constant, females generally have a higher body fat content, while males tend to
Fig. 4 Examples of parametric body modeling results. We show a pair with different fat percentages (a), weights (b), heights (c), and genders (d)
have a higher muscle content. Secondly, height can also impact the leg model. When body fat percentage, weight, and gender are kept constant, taller height results in a higher bone proportion, relatively smaller muscle content, and relatively higher fat content. Thirdly, weight can also have an effect on the leg model. When body fat percentage, height, and gender remain unchanged, an increase in weight causes an increase in the amount of fat tissue in the legs, while the proportion of muscle and bone may slightly increase. Finally, body fat percentage can also influence the leg model. When height, weight, and gender are constant, a rise in body fat percentage leads to an increase in the amount of fat in the legs, while the proportion of muscle and bone decreases.
Interactive incision for surgery simulation. To demonstrate the usability of the interactive incision by using the VR equipment, we conducted a user study in which we invited three non-medical undergraduates to perform the virtual surgery with our system. Each participant was given a few minutes to get familiar with the system interaction. Then we asked the participants to make an incision on the skin, fat, and muscle of a 3D leg, respectively. We counted the number of attempts each participant needed to make the correct incision and summarize the results in Table 1. From this user study, we can see that even non-professional users can benefit from our system and learn the surgical operation within a tolerable number of attempts.

Table 1 User study results of the manual cutting by using our system
Layer of 3D leg     Skin    Fat    Muscle
Average attempts    20      32     16

As shown in Fig. 5, our system interface provides a VR operating room built on the Unity3D platform, giving users a sense of reality. To evaluate the performance of the proposed system, several experiments were conducted. The accuracy of the generated human body models was verified by comparing the models with actual body measurements. Moreover, the realism and effectiveness of the VR-based surgical simulation technology were assessed by conducting a survey among experienced surgeons. The survey focused on the realism of the surgical environment and instruments, the level of immersion and interactivity, and the overall effectiveness of the simulation. Results of the experiments showed that the proposed system is accurate and effective in providing realistic surgical training.

Fig. 5 The virtual operating room in our system
5 Conclusion In conclusion, the use of virtual reality (VR) in surgical training is becoming increasingly popular as it provides a safe and controlled environment for trainees to practice their skills. This article has presented a new VR platform for virtual surgical training via parametric human body modeling, which aims to improve the realism and accuracy of the simulation. The platform uses a 3D body scanner to generate a personalized avatar for each trainee, which can be manipulated to simulate various surgical procedures. The platform has been tested with a group of surgical residents and has shown promising results in terms of its usability and effectiveness. Trainees reported high levels of immersion and engagement with the simulation, and their performance in the simulated procedures improved over time. Despite the promising results of the VR platform for virtual surgical training via parametric human body modeling, there are several limitations that must be acknowledged. Firstly, the accuracy of the platform is dependent on the quality of the 3D body scanner and the parametric human body model. While the technology has
advanced significantly in recent years, there is still room for improvement in terms of the fidelity and resolution of the avatars generated by the scanner. Additionally, the current platform does not account for differences in anatomy between individuals, which may limit its applicability to certain patient populations. Secondly, the platform currently lacks advanced haptic feedback systems, which may limit the realism of the simulation. Haptic feedback, such as force feedback or tactile feedback, is essential for simulating the tactile sensations experienced during surgery, and its absence may reduce the effectiveness of the training. Finally, the platform’s cost may be prohibitive for some institutions or individuals. The 3D body scanner and the development of the parametric human body model require a significant investment of time and resources, which may make it inaccessible to some training programs.
In the future, we are interested in expanding the platform to include additional surgical procedures and developing more advanced haptic feedback systems to further enhance the realism of the simulation. With continued development, this VR platform has the potential to become a valuable tool for surgical training and could ultimately improve patient outcomes by ensuring that surgeons are well-trained and prepared to handle a range of procedures.
However, our current method still has several limitations. Firstly, our method is more suited for interactive design systems that allow users to configure the initial status of the mesh primitives. For the automatic design workflow, we have to use the random initial mesh primitives to generate candidate floor plans. But since the parameters in the generated floor plans are known, exploring the proper one is not difficult. Secondly, we mainly focus on the interior space division for the designed floor plans, hence, slight manual assistance is required to assign semantic labels to each room.
References
1. Camp, C.: Editorial commentary: “Virtual Reality” simulation in orthopaedic surgery: realistically helpful, or virtually useless? Arthroscopy 34(5), 1678–1679 (2018)
2. Derossis, A.M., Bothwell, J., Sigman, H.H., et al.: The effect of practice on performance in a laparoscopic simulator. Surg. Endosc. 12(9), 1117–1120 (1998)
3. Cotin, S., Dawson, S.L., Meglan, D., et al.: ICTS, an interventional cardiology training system. Stud. Health Technol. Inform. 70, 59–65 (2000)
4. McCloy, R., Stone, R.: Science, medicine, and the future: virtual reality in surgery. BMJ 323(7318), 912–5 (2001)
5. Basdogan, C., De, S., Kim, J., et al.: Haptics in minimally invasive surgical simulation and training. IEEE Comput. Graph. Appl. 24(2), 56–64 (2004)
6. Botden, S.M.B.I., Jakimowicz, J.J.: What is going on in augmented reality simulation in laparoscopic surgery? Surg. Endosc. 23(8), 1693–1700 (2009)
7. Sugand, K., Akhtar, K., Khatri, C., Cobb, J., Gupte, C.: Training effect of a virtual reality haptics-enabled dynamic hip screw simulator. Acta Orthop. 86(6), 695–701 (2015)
8. Aïm, F., Lonjon, G., Hannouche, D., Nizard, R.: Effectiveness of virtual reality training in orthopaedic surgery. Arthroscopy J. Arthroscopic Relat. Surg. 32(1), 224–232 (2016)
9. Pan, J.J., Yang, Y.H., Gao, Y., et al.: Real-time simulation of electrocautery procedure using meshfree methods in laparoscopic cholecystectomy. Vis. Comput. 35(6), 861–872 (2019)
10. Bauer, D., Wieser, K., Aichmair, A.O., Zingg, P., Dora, C., Rahm, S.: Validation of a virtual reality-based hip arthroscopy simulator. Arthroscopy 35(3), 789–795 (2019)
11. Blumstein, G., Zukotynski, B., Cevallos, N., Ishmael, C., Zoller, S., Burke, Z., et al.: Randomized trial of a virtual reality tool to teach surgical technique for tibial shaft fracture intramedullary nailing. J. Surg. Educ. 77(4), 969–77 (2020)
12. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018)
13. Teng, X., Hwang, W.: Chain registry and load-dependent conformational dynamics of collagen. Biomacromolecules 15(8), 3019–3029 (2014)
14. Khanduja, V., Lawrence, J., Audenaert, E.: Testing the construct validity of a virtual reality hip arthroscopy simulator. Arthroscopy 33(3), 566–71 (2017)
15. Banaszek, D., You, D., Chang, J., Pickell, M., Hesse, D., Hopman, W., et al.: Virtual reality compared with bench-top simulation in the acquisition of arthroscopic skill. J. Bone Joint Surg. 99(7), 34 (2017)
16. Gasco, J., Patel, A., Ortega-Barnett, J., Branch, D., Desai, S., Kuo, Y., et al.: Virtual reality spine surgery simulation: an empirical study of its usefulness. Neurol. Res. 36(11), 968–73 (2014)
17. Li, S., Cui, J.H., Hao, A.M., et al.: Design and evaluation of personalized percutaneous coronary intervention surgery simulation system. IEEE Trans. Vis. Comput. Graph. 27(11), 4150–4160 (2021)
18. Teng, X., Deng, X.: Optimization of a helical flow inducer of endovascular stent based on the principle of swirling flow in arterial system. J. Biomed. Eng. 27(2), 429–434 (2010)
Key-Agent Based Dynamic Prioritized Planning for Multi-agent Systems
Kaixiang Zhang, Niya Wang, Shufan Zhang, Ning Wang, and Jianlin Mao
Abstract Multi-agent path finding (MAPF) is the basis of multi-agent systems. Current priority-based MAPF algorithms solve problems efficiently but are sensitive to the priority order. To reduce this priority sensitivity, this paper proposes a priority adjustment method based on key agents. Firstly, the key agents affecting the problem solution are extracted level by level to construct a multi-level structure. Then, the priority order is adjusted according to the key agents and the levels they are in. Finally, a key-agent-based dynamic priority safe interval path planning (KDSIPP) algorithm is constructed. Simulation results show that the proposed algorithm significantly improves the success rate compared with existing algorithms and is able to solve problems containing hundreds of agents. Moreover, path costs increase by only 6.7% compared to the optimal solution.
Keywords Multi-agent · Path planning · Priority
K. Zhang · N. Wang · S. Zhang · N. Wang (B) · J. Mao Kunming University of Science and Technology, Kunming 650500, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_27
1 Introduction
Multi-agent path finding is currently a widely studied problem in the field of AI [1, 2]. As the basis for safe operation and efficient cooperation of multi-agent systems, MAPF aims to find a path from start to goal for each agent in a multi-agent system such that there is no conflict between the paths of any two agents [3]. MAPF is an NP-hard problem [4], whose state space grows exponentially with both the number of agents and the path length. Depending on whether the agents’ path states are combined at each path step during planning, current MAPF search algorithms can be divided into coupled and decoupled algorithms [5]. Coupled algorithms are usually derived from the traditional A* algorithm [6] extended to the space-time dimension, and they guarantee completeness and optimality. However, as the number of agents increases, this type of algorithm is
highly susceptible to the explosion of the search space [7], and thus the number of agents that can be handled is limited. Decoupled search algorithms are suitable for solving MAPF problems with more agents. The core idea is to make the path states of the agents independent of each other. Decoupled search algorithms are usually classified into two typical frameworks: conflict-based search (CBS) and prioritized planning (PP). The CBS algorithm was proposed by Sharon et al. [8], who designed a two-layer solution framework: the lower layer plans single-agent paths, and the upper layer detects multi-agent conflicts and guides the lower layer to re-plan. In this framework, conflicts between paths are gradually eliminated through continuous conflict detection and re-planning [9]. On the basis of this framework, Li et al. [10] proposed the disjoint splitting method to prune the conflict extension tree at the upper layer of CBS, thus avoiding the processing of a large number of duplicate child nodes at the upper layer. Zhang et al. [11] used mutex propagation to provide stronger constraints for CBS, thus avoiding the symmetry problem of upper-layer nodes and reducing their number. The CBS framework re-plans both agents involved in a conflict and selects the better one for expansion, thus providing a guarantee on path quality. However, as path coupling increases, the number of conflicts rises sharply, and this conflict handling approach tends to fall into an iterative cycle of resolving one conflict and generating a new one. PP is another model of decoupled algorithms, which usually has higher solution efficiency and is therefore of wide interest in practical application scenarios with high real-time requirements [12]. The PP algorithm was proposed by Erdmann et al.
[13], which also has a two-layer framework: the lower layer performs single-agent path planning, while the upper layer sequentially assigns the agents to be planned in order of priority and collates the planned paths as obstacle constraints for subsequent agents [14]. Based on the PP algorithm, Silver [15] proposed the hierarchical cooperative A* (HCA*) algorithm, which adds a heuristic computation layer under the single-agent planning layer of the PP framework to obtain more accurate heuristic values by means of a reverse A* search. HCA* improves the solution efficiency of the PP algorithm and achieves a solution capacity of tens to hundreds of agents. Phillips et al. [16] improved the spatio-temporal A* algorithm for the single-agent planning layer and proposed the safe interval path planning (SIPP) algorithm. SIPP replaces the time steps in the original spatio-temporal A* with a limited number of safe intervals corresponding to each map vertex, and improves the solution speed by an order of magnitude compared to HCA*. Based on SIPP, Yakovlev et al. [17] constructed the weighted SIPP (WdSIPP) algorithm by adding suboptimal weights to the single-agent planning process, which further accelerated problem solving. PP-framework-based algorithms have advantages in terms of solution efficiency. However, this type of algorithm is usually neither complete nor optimal. In particular, it suffers from priority sensitivity, which seriously affects its success rate [15]. Given the above, this paper proposes a priority adjustment method in order to reduce the priority sensitivity of PP-framework-based algorithms.
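To make the notion of safe intervals more concrete, the sketch below groups the free time steps of one map vertex into maximal contiguous intervals, given the discrete time steps at which higher-priority agents occupy it. It is a simplified illustration of the idea only, not the SIPP implementation: the function name is hypothetical, time is assumed to be discrete, and the last interval is assumed to be unbounded (which would not hold if a higher-priority agent parks on the vertex forever).

```python
import math


def safe_intervals(blocked_times):
    """Return maximal free intervals (start, end) of a vertex, end inclusive.

    blocked_times: iterable of integer time steps at which the vertex is
    occupied by already-planned agents (hypothetical input format).
    """
    intervals = []
    start = 0  # first time step not yet assigned to an interval
    for t in sorted(set(blocked_times)):
        if t > start:
            intervals.append((start, t - 1))
        start = max(start, t + 1)
    intervals.append((start, math.inf))  # assumed free from here onwards
    return intervals
```

A vertex blocked at time steps 3, 4, and 7 has three safe intervals, so a search over (vertex, interval) pairs handles three states for it instead of one state per time step.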
To summarize, the main contributions of this paper are as follows:
(1) A hierarchical extraction strategy for key agents is proposed to analyze the principal contradictions that affect problem solving.
(2) A priority adjustment strategy based on key agents is proposed, which can efficiently adjust the priority order and enable problem solving.
The structure of this paper is as follows: Sect. 2 briefly describes MAPF and the blocking problem. In Sect. 3, an extraction strategy for key agents and a priority adjustment strategy based on key agents are presented. The pseudocode of the algorithm is given in Sect. 4. Simulations are given in Sect. 5 and conclusions are outlined in Sect. 6.
2 Problem Description
In this section, we describe the MAPF problem and its conflicts. Subsequently, the effect of priority sensitivity on PP framework based algorithms is described, in particular the blocking problem.
2.1 MAPF Problem
Given an undirected connected graph $G = (V, E)$ and a set of agents $A = \{a_i \mid i \in I\}$ ($I = \{1, 2, \ldots, N_a\}$), MAPF is to find a path $\pi_i$ from its start $s_i$ to its goal $g_i$ for each agent $a_i$, such that there is no conflict between any two paths $\pi_i$ and $\pi_j$. On this basis, the optimisation objective of the MAPF problem is to minimise the sum of costs (SOC): $\sum_{i=1}^{N_a} |\pi_i|$ [18].
In the MAPF problem, conflicts can be mainly divided into vertex conflicts and edge conflicts [19]. A vertex conflict means that two agents $a_i, a_j$ occupy the same map vertex at the same time: $\pi_i(t) = \pi_j(t)$, as shown in Fig. 1a. An edge conflict means that $a_i, a_j$ pass through the same edge in the same unit time period: $\pi_i(t) = \pi_j(t+1) \wedge \pi_i(t+1) = \pi_j(t)$, as shown in Fig. 1b.
Fig. 1 Conflict classification: (a) vertex conflict; (b) edge conflict
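The two conflict definitions translate directly into a pairwise check. The following is a minimal Python sketch, not the authors' implementation: the names `position` and `first_conflict` are hypothetical, paths are lists of grid vertices indexed by time step, and an agent is assumed to wait at its goal after its path ends.

```python
from typing import List, Optional, Tuple

Vertex = Tuple[int, int]
Path = List[Vertex]


def position(path: Path, t: int) -> Vertex:
    """Vertex occupied at time t (assumption: the agent waits at its goal)."""
    return path[min(t, len(path) - 1)]


def first_conflict(path_i: Path, path_j: Path) -> Optional[Tuple[str, int]]:
    """Return ('vertex' | 'edge', t) for the earliest conflict, or None."""
    horizon = max(len(path_i), len(path_j))
    for t in range(horizon):
        # Vertex conflict: pi_i(t) == pi_j(t)
        if position(path_i, t) == position(path_j, t):
            return ("vertex", t)
        # Edge conflict: pi_i(t) == pi_j(t+1) and pi_i(t+1) == pi_j(t)
        if (position(path_i, t) == position(path_j, t + 1)
                and position(path_i, t + 1) == position(path_j, t)):
            return ("edge", t)
    return None


# Two agents meeting at (2, 1) at t = 1, and two agents swapping cells.
crossing = first_conflict([(1, 1), (2, 1), (3, 1)], [(2, 2), (2, 1), (2, 0)])
swapping = first_conflict([(1, 1), (2, 1)], [(2, 1), (1, 1)])
disjoint = first_conflict([(1, 1), (1, 2)], [(3, 3), (3, 4)])
```

A planner in the PP framework would avoid such conflicts by construction, because each planned path is added as a constraint for later agents; a check like this is mainly useful for validating a finished solution.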
K. Zhang et al.
Fig. 2 Blocking cases: (a) case of two agents; (b) case of five agents
2.2 Blocking Problem
Although PP framework based algorithms solve the MAPF problem efficiently, they suffer from priority sensitivity: the solution results are easily influenced by the priority order. With an inappropriate priority order, goal blocking occurs between agents during the solution process. As shown in Fig. 2a, if the priority order is set to $a_1 \prec a_2$, then $a_1$ will plan first and occupy the pass gate (5,2). As a result, $a_2$ will be blocked in the left-hand region and fail to travel to the target position. Conversely, if the priority order is set to $a_2 \prec a_1$, then $a_2$ will plan first and obtain the path {(2, 2), (3, 2), (4, 2), (5, 2), (6, 2)}. In that case, $a_1$ will avoid $a_2$, allowing the problem to be solved. It can be seen that the blocking problem causes the single-agent planning to fail and forces the solution process to terminate. To a certain extent, the blocking problem can be avoided by adjusting the priority order. However, since the number of priority orders grows factorially with the number of agents ($N_a!$), the difficulty of adjusting the priority order increases dramatically as the number of agents increases. In Fig. 2b, the total number of priority orders for the five agents is 120, but the only one that enables the problem to be solved is $a_5 \prec a_4 \prec a_3 \prec a_2 \prec a_1$. Without effective adjustments, a large amount of the underlying computing resources will be wasted.
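To make the factorial growth concrete, the sketch below enumerates all priority orders for a toy abstraction of goal blocking. The abstraction is an assumption made for illustration and is not the exact geometry of Fig. 2b: every agent parks at a goal inside a dead-end corridor, and an agent is blocked as soon as an earlier-planned agent has parked closer to the corridor entrance. The function name `order_is_feasible` and the `goal_depth` values are hypothetical.

```python
from itertools import permutations
from math import factorial


def order_is_feasible(order, goal_depth):
    """True if every agent can reach its goal when planned in this order.

    Toy rule (assumption): an agent parked at a shallower depth blocks every
    later agent whose goal lies deeper in the dead-end corridor.
    """
    parked = []
    for agent in order:
        if any(depth < goal_depth[agent] for depth in parked):
            return False
        parked.append(goal_depth[agent])
    return True


# Hypothetical goal depths: a5 has the deepest goal, a1 the shallowest.
goal_depth = {"a1": 1, "a2": 2, "a3": 3, "a4": 4, "a5": 5}

all_orders = list(permutations(goal_depth))
feasible = [o for o in all_orders if order_is_feasible(o, goal_depth)]

print(len(all_orders), "orders in total,", len(feasible), "feasible")
```

Blind enumeration of this kind is exactly what becomes impractical as the number of agents grows, which motivates a directed adjustment of the priority order.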
3 Priority Adjustment Method Based on Key Agents
3.1 Extraction Strategies for Key Agents
In order to improve the efficiency of priority adjustment and reduce the goal blocking problem, this paper proposes a priority adjustment strategy based on key agents. The strategy corresponds to a specific solution process. By analysing the actual solution situation, the principal contradiction affecting the solution of the problem is identified and then addressed first. Here, the key agents are the unsolved individuals in the solution process and represent the principal contradiction in the set of agents $A$. On this basis, there may still be principal contradictions among the key agents. Therefore, this paper designs a multi-level structure of key agents and proposes a level-by-level extraction strategy for key agents. As shown in Fig. 3, the solutions are performed sequentially for the set $A = \{a_i, a_j, a_k, a_l, a_m, \cdots\}$. If none of the subsequent agents $\{a_k, a_l, a_m, \cdots\}$ can be solved under the constraints of $\{a_i, a_j\}$, then $\{a_k, a_l, a_m, \cdots\}$ are set as the first-level key agents. On this basis, $\{a_k, a_l, a_m, \cdots\}$ are solved again. If $\{a_m, \cdots\}$ is unsolvable under the constraints of $\{a_k, a_l\}$, $\{a_m, \cdots\}$ is set as the second level of key agents. The above process is repeated, extracting level by level, until there are no more unsolvable individuals. The resulting multi-level structure is the hierarchical structure of the key agents. For example, for Fig. 2b, the final multi-level structure obtained through level extraction is shown in Fig. 4.
Fig. 3 Multi-level extraction of key agents
Fig. 4 Multi-level structure of key agents for the case in Fig. 2b
3.2 Priority Adjustment Strategy Based on Key Agents Extraction
In the multi-level structure of key agents, the agents at the upper level are the principal contradictions that the agents at the next level should focus on. Therefore, this paper further proposes a priority adjustment strategy based on multi-level key agents. As shown in Fig. 5, corresponding to the level extraction process in Fig. 3, after the key agents of a level have been extracted, the key agents at that level are raised to high priority. In this way, the levels are adjusted one by one until the priority adjustments of key agents are completed for all levels. For the case of Fig. 4, the adjusted priority order is $a_5 \prec a_4 \prec a_3 \prec a_2 \prec a_1$. With this priority order, the problem is solved and the paths shown in Table 1 are obtained.
Fig. 5 Dynamic priority adjustment based on key agents
Table 1 Paths for the case in Fig. 2b
Agent  Path  Path sequence
a1  π1  {(2,3),(1,3),(1,3),(1,3),(1,3),(2,3),(3,3)}
a2  π2  {(2,4),(2,3),(3,3),(4,3),(4,4),(4,4),(4,4),(4,3)}
a3  π3  {(1,4),(2,4),(2,4),(2,4),(2,3),(3,3),(4,3),(4,2),(5,2)}
a4  π4  {(1,2),(2,2),(2,3),(3,3),(4,3),(4,2),(5,2),(6,2),(6,3),(6,2)}
a5  π5  {(1,1),(2,1),(2,2),(2,3),(3,3),(4,3),(4,2),(5,2),(6,2),(6,1)}
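The paths in Table 1 can be checked against the conflict definitions of Sect. 2.1. The short Python sketch below does this; the helper names `at` and `conflicts` are hypothetical, agents are assumed to wait at their goals after arriving, and the step count at the end assumes that $|\pi_i|$ counts time steps (moves and waits), which is one possible reading of the SOC definition.

```python
from itertools import combinations

# Path sequences copied from Table 1.
paths = {
    "a1": [(2, 3), (1, 3), (1, 3), (1, 3), (1, 3), (2, 3), (3, 3)],
    "a2": [(2, 4), (2, 3), (3, 3), (4, 3), (4, 4), (4, 4), (4, 4), (4, 3)],
    "a3": [(1, 4), (2, 4), (2, 4), (2, 4), (2, 3), (3, 3), (4, 3), (4, 2), (5, 2)],
    "a4": [(1, 2), (2, 2), (2, 3), (3, 3), (4, 3), (4, 2), (5, 2), (6, 2), (6, 3), (6, 2)],
    "a5": [(1, 1), (2, 1), (2, 2), (2, 3), (3, 3), (4, 3), (4, 2), (5, 2), (6, 2), (6, 1)],
}


def at(path, t):
    """Position at time t (assumption: the agent waits at its goal)."""
    return path[min(t, len(path) - 1)]


def conflicts(all_paths):
    """List every (agent_i, agent_j, t, kind) vertex or edge conflict."""
    found = []
    horizon = max(len(p) for p in all_paths.values())
    for (i, pi), (j, pj) in combinations(all_paths.items(), 2):
        for t in range(horizon):
            if at(pi, t) == at(pj, t):
                found.append((i, j, t, "vertex"))
            elif at(pi, t) == at(pj, t + 1) and at(pi, t + 1) == at(pj, t):
                found.append((i, j, t, "edge"))
    return found


table1_conflicts = conflicts(paths)
# Total number of time steps, assuming |pi_i| = len(path) - 1.
total_steps = sum(len(p) - 1 for p in paths.values())
```

Note how a4 steps aside to (6,3) for one time step so that a5 can pass to (6,1); such waiting and yielding is what the path constraints of higher-priority agents induce.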
4 Key Agents Based Dynamic Priority SIPP Algorithm
Based on the above priority adjustment strategy, this paper improves the SIPP algorithm, which belongs to the PP framework, and proposes a key agents based dynamic priority SIPP (KD-SIPP) algorithm, as shown in Algorithms 1 and 2, where $G$ contains information about the environment and the safe intervals of the SIPP algorithm, i.e. the unoccupied time of each map vertex, and $A_u$ is used to store unsolved individuals.
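The two-loop structure of Algorithms 1 and 2 (an outer planning pass and an inner adjustment pass over the unsolved agents) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the single-agent planner is a hypothetical callback `plan(agent, reserved)` standing in for SIPP, the completeness check is assumed to run after each full pass over the agents, a no-progress guard and a round limit are added for safety, and the test problem is the toy dead-end corridor in which a parked agent blocks all deeper goals.

```python
from typing import Callable, Dict, Hashable, List, Optional, Tuple

Agent = Hashable
Path = List[int]
Planner = Callable[[Agent, List[Path]], Optional[Path]]


def adjust_priority(order: List[Agent], unsolved: List[Agent],
                    plan: Planner) -> List[Agent]:
    """Raise key agents level by level (structure of Algorithm 2)."""
    order, unsolved = list(order), list(unsolved)
    while unsolved:
        # Move the current key agents to the front, keeping their order.
        order = unsolved + [a for a in order if a not in unsolved]
        previous, unsolved, reserved = unsolved, [], []
        for agent in previous:  # re-plan the key agents on the original map
            path = plan(agent, reserved)
            if path is not None:
                reserved.append(path)
            else:
                unsolved.append(agent)  # next-level key agent
        if len(unsolved) == len(previous):
            break  # added guard (assumption): no progress, stop adjusting
    return order


def kd_plan(agents: List[Agent], plan: Planner, max_rounds: int = 100
            ) -> Tuple[List[Agent], Optional[Dict[Agent, Path]]]:
    """Prioritized planning with key-agent adjustment (structure of Algorithm 1)."""
    order = list(agents)
    for _ in range(max_rounds):  # added round limit (assumption)
        paths, reserved, unsolved = {}, [], []
        for agent in order:
            path = plan(agent, reserved)
            if path is not None:
                paths[agent] = path
                reserved.append(path)
            else:
                unsolved.append(agent)
        if not unsolved:
            return order, paths
        order = adjust_priority(order, unsolved, plan)
    return order, None


def corridor_planner(goal_depth: Dict[Agent, int]) -> Planner:
    """Toy planner (assumption): an agent parked closer to the corridor
    entrance blocks every later agent with a deeper goal."""
    def plan(agent: Agent, reserved: List[Path]) -> Optional[Path]:
        if any(path[-1] < goal_depth[agent] for path in reserved):
            return None
        return list(range(goal_depth[agent] + 1))
    return plan


goal_depth = {"a1": 1, "a2": 2, "a3": 3, "a4": 4, "a5": 5}
final_order, final_paths = kd_plan(list(goal_depth), corridor_planner(goal_depth))
print(final_order)
```

In this toy problem the initial order a1, ..., a5 leaves four agents unsolved. Each inner pass then peels off one level of key agents, so the adjustment needs only a handful of single-agent planning calls instead of searching the 120 possible orders.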
Algorithm 1 KD-SIPP
Require: agents A, graph G
while no solution do
  G ← original G; π ← ∅; Au ← ∅
  for each ai in A do
    πi ← single-agent path planning by SIPP(ai, G)
    if πi ≠ ∅ then
      add πi to π
      update G by adding path constraints(G, πi)
    else
      add ai to Au
      continue
    end if
    if all ai have obtained πi then
      return π
    else
      A ← AdjustPriority(A, Au)  /* Algorithm 2 */
    end if
  end for
end while

Algorithm 2 AdjustPriority
Require: agents A, Au
while Au ≠ ∅ do
  adjust A by raising the priority of the agents in Au
  Au^p ← Au; Au ← ∅; G ← original G
  for each ai in Au^p do
    πi ← single-agent path planning by SIPP(ai, G)
    if πi ≠ ∅ then
      update G by adding path constraints(G, πi)
    else
      add ai to Au
      continue
    end if
  end for
end while
return A

5 Simulations
In order to verify the effectiveness of the proposed algorithm, tests were carried out using 6 different types of maps [10, 20], as shown in Fig. 6. For each type of map, 8 agent numbers are set from small to large. Each agent number is tested with 20 random cases, in which the start and goal of each agent are randomly generated. The resulting 960 random cases are solved with the KD-SIPP algorithm proposed in this paper, and CBS [8], SIPP [16], and WdSIPP [17] are used for comparison. In order to balance solution speed and path quality, the weight of WdSIPP is set to w = 1.75. All algorithms are written in Python and run on a PC with an Intel Core i7-11700 2.8 GHz CPU and 32 GB RAM. The solution time limit of all algorithms for each case is 120 s; a case that exceeds the time limit is counted as a failure.
Fig. 6 Maps for testing: (a) Empty16 map [20]; (b) Random20 map [10]; (c) Random32 map [20]; (d) Room32 map [20]; (e) Maze32 map [20]; (f) Den312d map [20]
The success rate results of each algorithm are shown in Fig. 7, and the path cost results are shown in Fig. 8. It can be seen from Fig. 7 that the proposed KD-SIPP algorithm significantly improves the success rate compared to the CBS, SIPP, and WdSIPP algorithms. The success rate of CBS was the first to decline as the number of agents increased. Subsequently, SIPP and WdSIPP also showed a decreasing trend. In particular, when the number of agents exceeds 100, the success rate of SIPP and WdSIPP drops to a low level due to the blocking problem. In that case, KD-SIPP was still able to solve the test cases with a high success rate. This result demonstrates the effectiveness of KD-SIPP in priority adjustment. Comparing the results across map types, the algorithms perform best on the empty16 map, where they are able to solve problems with a higher density of agents. On the random20 and random32 maps, the success rate of the SIPP and WdSIPP algorithms decreased significantly when the number of agents exceeded 130. In contrast, KD-SIPP was still able to achieve a success rate of 100%. KD-SIPP also performs well on the irregular den312d map, with a solving capacity of over 200 agents. In complex terrain such as room32 and maze32, the CBS, SIPP and WdSIPP algorithms are challenged by the increased path coupling and the higher probability of blocking problems. By adjusting the priority order, KD-SIPP effectively reduces blocking problems and increases the success rate.
Fig. 7 Result of success rate: (a) Empty16; (b) Random20; (c) Random32; (d) Room32; (e) Maze32; (f) Den312d (horizontal axis: number of agents; vertical axis: success rate (%); curves: CBS, SIPP, WdSIPP, KD-SIPP)
Fig. 8 Result of path cost: (a) Empty16; (b) Random20; (c) Random32; (d) Room32; (e) Maze32; (f) Den312d (horizontal axis: number of agents; vertical axis: SOC; curves: CBS, SIPP, WdSIPP, KD-SIPP)
The test results illustrate that the proposed priority adjustment method adapts well to different map scenarios. As can be seen from Fig. 8, the path cost of KD-SIPP is consistent with the results of the SIPP algorithm and better than that of the WdSIPP algorithm. Compared to the optimal cost obtained by the CBS algorithm, the cost obtained by KD-SIPP is higher. However, the increase is limited. The largest difference occurs for $N_a = 20$ on the den312d map, where KD-SIPP increases the cost by 6.7% compared to CBS. The remaining increases are essentially around 3.0%. This shows that the KD-SIPP algorithm increases the success rate significantly while sacrificing little path quality. The results validate the practicality of the proposed algorithm.
6 Conclusions
This paper proposes a dynamic priority adjustment method based on multi-level key agents to reduce the priority sensitivity of existing priority based MAPF algorithms. The proposed algorithm was tested on a large set of random cases in six different types of maps. The test results show that the proposed algorithm significantly improves the success rate compared to the CBS, SIPP and WdSIPP algorithms. For tasks involving hundreds of agents, which are difficult to solve with the comparison algorithms, it still achieves a high success rate. In terms of path cost, the increase relative to the optimal result obtained by the CBS algorithm is at most 6.7%. The test results verify the effectiveness of the proposed algorithm.
Acknowledgements This work was funded by the National Natural Science Foundation of China under grant 62263017. (Corresponding author: Niya Wang, [email protected].)
References
1. Khan, A., Zhang, C., Li, S., Wu, J., Schlotfeldt, B., Tang, S.Y., Ribeiro, A., Bastani, O., Kumar, V.: Learning safe unlabeled multi-robot planning with motion constraints. In: The IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 7558–7565. IEEE Press, New York (2019)
2. Das, P.K., Behera, H.S., Das, S., Tripathy, H.K., Panigrahi, B.K., Pradhan, S.K.: A hybrid improved PSO-DV algorithm for multi-robot path planning in a clutter environment. Neurocomputing 207, 735–753 (2016)
3. Wang, H., Rubenstein, M.: Walk, stop, count, and swap: decentralized multi-agent path finding with theoretical guarantees. IEEE Robot. Autom. Lett. 5, 1119–1126 (2020)
4. Yu, J., LaValle, S.M.: Structure and intractability of optimal multi-robot path planning on graphs. In: The Twenty-Seventh AAAI Conference on Artificial Intelligence (AAAI), pp. 1443–1449. AAAI Press, Palo Alto (2013)
5. Ma, H.: Target assignment and path planning for navigation tasks with teams of agents [Ph.D. dissertation], University of Southern California (2020)
6. Hart, P.E., Nilsson, N.J., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Trans. Syst. Sci. Cybern. 4(2), 100–107 (1968)
7. Standley, T., Korf, R.: Complete algorithms for cooperative pathfinding problems. In: The Twenty-Second International Joint Conference on Artificial Intelligence (IJCAI), pp. 668–673. AAAI Press, Palo Alto (2011)
8. Sharon, G., Stern, R., Felner, A., Sturtevant, N.R.: Conflict-based search for optimal multi-agent pathfinding. Artif. Intell. 219, 40–66 (2015)
9. Zhang, H., Wu, Y., Hu, J., Zhang, J.: A multi-robot path finding algorithm based on improved conflict search. Control Decis. 38(5), 1227–1235 (2023)
10. Li, J., Harabor, D., Stuckey, P.J., Felner, A., Ma, H., Koenig, S.: Disjoint splitting for multi-agent path finding with conflict-based search. In: The International Conference on Automated Planning and Scheduling (ICAPS), pp. 279–283. AAAI Press, Palo Alto (2019)
11. Zhang, H., Li, J., Surynek, P., Satish Kumar, T.K., Koenig, S.: Multi-agent path finding with mutex propagation. Artif. Intell. 311, 103766 (2022)
12. Čáp, M., Novák, P., Kleiner, A., Selecký, M.: Prioritized planning algorithms for trajectory coordination of multiple mobile robots. IEEE Trans. Autom. Sci. Eng. 12(3), 835–849 (2015)
13. Erdmann, M., Lozano-Perez, T.: On multiple moving objects. Algorithmica 2, 477–521 (1987)
14. Zhang, K., Mao, J., Xiang, F., Xuan, Z.: B-IHCA*, a bargaining game based multi-agent path finding algorithm. Acta Automatica Sin. Early Access (2022). https://doi.org/10.16383/j.aas.c220065
15. Silver, D.: Cooperative pathfinding. In: The AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE), pp. 117–122. AAAI Press, Palo Alto (2005)
16. Phillips, M., Likhachev, M.: SIPP: safe interval path planning for dynamic environments. In: The 2011 IEEE International Conference on Robotics and Automation (ICRA), pp. 5628–5635. IEEE Press, New York (2011)
17. Yakovlev, K., Andreychuk, A., Stern, R.: Revisiting bounded-suboptimal safe interval path planning. In: The Thirtieth International Conference on Automated Planning and Scheduling (ICAPS), pp. 300–304. AAAI Press, Palo Alto (2020)
18. Li, J., Ruml, W., Koenig, S.: EECBS: a bounded-suboptimal search for multi-agent path finding. In: The AAAI Conference on Artificial Intelligence (AAAI), pp. 12353–12362. AAAI Press, Palo Alto (2021)
19. Li, J., Felner, A., Boyarski, E., Ma, H., Koenig, S.: Improved heuristics for multi-agent path finding with conflict-based search. In: The International Joint Conference on Artificial Intelligence (IJCAI), pp. 442–449. AAAI Press, Palo Alto (2019)
20. Stern, R., Sturtevant, N.R., Felner, A., Koenig, S., Ma, H., Walker, T.T., Li, J., Atzmon, D., Cohen, L., Satish Kumar, T.K., Boyarski, E., Bartak, R.: Multi-agent pathfinding: definitions, variants, and benchmarks. In: The International Symposium on Combinatorial Search (SoCS), pp. 151–158. AAAI Press, Palo Alto (2019)
Fixed-Time Tracking Control for Robotic Manipulators Based on Adding a Power Integrator Method
Shiming Wang and Yingmin Jia
Abstract This paper investigates the fixed-time control problem of an n-degree-of-freedom (n-DOF) robotic manipulator subject to uncertain dynamics and external disturbances. We propose a uniform robust exact differentiator (URED) observer to compensate for the lumped disturbances. Using the observer states, we then design a fixed-time stabilizer for the concerned system by the approach of adding a power integrator (AAPI). Both the observer errors and the tracking errors can be proven to converge to zero in a fixed time. Numerical simulations are carried out to illustrate the control algorithm.
Keywords Adding a power integrator method · Fixed-time control · Robotic manipulator
S. Wang · Y. Jia (B) The Seventh Research Division and the Center for Information and Control, School of Automation Science and Electrical Engineering, Beihang University (BUAA), Beijing 100191, China e-mail: [email protected] S. Wang e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_28
1 Introduction
Nowadays, research on robotic manipulators has become popular in the control community due to their wide industrial applications [1]. In particular, fixed-time tracking control of robotic manipulators is an important topic, as it emphasizes steering the joints of the robotic manipulator to the desired values within a certain fixed time. Fixed-time control methods applicable to nonlinear systems can be roughly classified into the terminal sliding mode method [2, 3], the homogeneity theory method [4, 5], and the AAPI method [6–9]. For uncertain nonlinear second-order systems, Moulay et al. [2] designed a sliding mode variable leading to a global robust fixed-time controller. Huang et al. designed a consistent non-singular fixed-time tracking control law based on fast terminal sliding mode technology while avoiding singularity problems [3]. Gao et al. designed a global fixed-time controller for a class of perturbed planar nonlinear systems based on the bi-limit homogeneity technique [4], which ensures that the closed-loop system is globally fixed-time stable. A fixed-time distributed control law based on a continuous observer for second-order multi-agent systems with external disturbances is presented in [5]. For the AAPI control method, an AAPI fixed-time controller is proposed in [6] for a class of high-order nonlinear systems with monotone degrees and output constraints, ensuring that all state variables converge to zero within a fixed time. Zhang et al. proposed a fixed-time controller using the AAPI method, which solved the fixed-time regulation problem of nonlinear systems with uncertainty and guaranteed the fixed-time stability of the system [7]. The method was applied to three mechanical systems to verify its effectiveness. Huang et al. designed an AAPI controller for nonlinear systems, which achieved fixed-time convergence of the closed-loop system [8]. Using a fixed-time disturbance observer and the AAPI method, Kang et al. designed a fixed-time controller for high-order nonlinear large-scale systems [9]. Motivated by the discussions above, we investigate the fixed-time tracking control of an n-DOF robotic manipulator in the current paper. An observer for estimating the lumped disturbances is proposed by applying the uniform robust exact differentiator approach. Then, using the observer states and borrowing the main idea of AAPI, we propose a fixed-time tracking control scheme. It is proven by Lyapunov stability theory that all tracking errors of the joints converge to zero within a fixed time. Meanwhile, in modeling the dynamics of the robotic manipulator, we refer to the water model in [10] to make our dynamic model more realistic. Finally, we use dynamics simulation [11] to verify the effectiveness of our AAPI method.
This paper’s main contribution is the design and analysis of a fixed-time control algorithm for an n-DOF robotic manipulator subject to external disturbances. The dynamic model of the robotic manipulator we have established can be extended to the dynamics of fibrillar collagen [12], making it more convenient to conduct research on fibrillar collagen. In comparison with the results presented in [13, 14], which consider only pure robotic manipulators, the external disturbances and the model uncertainties are compensated by our observer design. In terms of convergence, the proposed controller drives all joints to their desired values within a fixed time, which gives faster regulation than the control laws in [15], under which the errors converge to zero only as time goes to infinity. The rest of the paper proceeds as follows. Section 2 introduces some useful lemmas. The modeling and problem formulation are shown in Sect. 3. Section 4 presents the main results. Numerical simulations are included in Sect. 5. Section 6 concludes the work briefly.
2 Mathematical Preliminaries
Consider the nonlinear system
$$\dot{x} = f(t, x), \quad x(t_0) = x_0, \quad \forall t \ge t_0, \tag{1}$$
where $f: \mathbb{R}_+ \times D \to \mathbb{R}^n$ is continuous in $t$ and Lipschitz in the state $x$ on $\mathbb{R}_+ \times D$, and $D \subset \mathbb{R}^n$ is an open set containing the equilibrium $x = 0$, i.e., $f(t, 0) = 0$, $\forall t \ge t_0$.
Definition 1 [16] If the solution $x(t, x_0)$ of system (1) reaches the equilibrium $x = 0$ at a bounded settling time $T(x_0)$, i.e., $x(t, x_0) = 0$, $\forall t \ge T(x_0)$, where $T(x_0) \le T_{\max}$ with $T_{\max}$ being a positive constant, then system (1) is fixed-time stable.
Lemma 1 [16] Let $V(x): \mathbb{R}^n \to \mathbb{R}_{\ge 0}$ be a continuously differentiable and radially unbounded function such that
$$\frac{\partial V}{\partial x} f(t, x) \le -\left(\xi V^{p}(x) + \iota V^{q}(x)\right)^{v}, \quad \forall t \ge t_0, \; x(t_0) = x_0, \tag{2}$$
where $\xi > 0$, $\iota > 0$, $p > 0$, $q > 0$ and $v > 0$ are constants satisfying $pv < 1$, $qv > 1$. Then $x = 0$ is a fixed-time stable equilibrium with the settling time $T(x_0)$ estimated by
$$T(x_0) \le \frac{1}{\xi^{v}(1 - pv)} + \frac{1}{\iota^{v}(qv - 1)}. \tag{3}$$
Lemma 2 [17] Given real scalars $a_1, \ldots, a_n$, $b_1$ and $b_2$, the following claims hold:
$$(|a_1| + \cdots + |a_n|)^{x} \le |a_1|^{x} + \cdots + |a_n|^{x}, \quad \text{if } 0 < x \le 1; \tag{4a}$$
$$\left|b_1^{x} - b_2^{x}\right| \le 2^{1-x}|b_1 - b_2|^{x}, \quad \text{if } x = \frac{2h-1}{2h+1}, \; h \in \mathbb{N}_+. \tag{4b}$$
Lemma 3 [17] If there are two positive constants $c$, $d$ and a real function $\bar{w}(x, y): \mathbb{R}^2 \to \mathbb{R}_+$, then
$$|c|^{x}|d|^{y} \le \frac{x}{x+y}\,\bar{w}(c, d)\,|c|^{x+y} + \frac{y}{x+y}\,\bar{w}^{-\frac{x}{y}}(c, d)\,|d|^{x+y}. \tag{5}$$
Lemma 4 [18] Given positive scalars $z_1, z_2, \ldots, z_n \in \mathbb{R}_+$ and a constant $p > 1$, then
$$\sum_{i=1}^{n} z_i^{p} \ge n^{1-p}\left(\sum_{i=1}^{n} z_i\right)^{p}. \tag{6}$$
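The bound (3) of Lemma 1 can be illustrated on a scalar example. The Python sketch below is a worked check under stated assumptions, not part of the original analysis: it evaluates (3) for the hypothetical choice $\xi = \iota = 1$, $p = 1/2$, $q = 3/2$, $v = 1$ and compares it with the exact settling time of $\dot V = -(V^{1/2} + V^{3/2})$, which follows from the substitution $s = \sqrt{V}$ as $T(V_0) = 2\arctan\sqrt{V_0}$. The function names are hypothetical.

```python
import math


def settling_time_bound(xi, iota, p, q, v):
    """Right-hand side of (3); raises if the conditions of Lemma 1 fail."""
    if min(xi, iota, p, q, v) <= 0:
        raise ValueError("all constants must be positive")
    if not (p * v < 1 < q * v):
        raise ValueError("Lemma 1 requires p*v < 1 and q*v > 1")
    return 1.0 / (xi ** v * (1 - p * v)) + 1.0 / (iota ** v * (q * v - 1))


def exact_settling_time(v0):
    """Exact reaching time of dV/dt = -(V**0.5 + V**1.5) from V(0) = v0.

    With s = sqrt(V): T = integral of 2 ds / (1 + s**2) = 2 * atan(sqrt(v0)).
    """
    return 2.0 * math.atan(math.sqrt(v0))


bound = settling_time_bound(xi=1.0, iota=1.0, p=0.5, q=1.5, v=1.0)
for v0 in (1e-3, 1.0, 1e3, 1e9):
    print(f"V0 = {v0:g}: exact T = {exact_settling_time(v0):.4f}, bound = {bound:.1f}")
```

The exact settling time saturates below $\pi$ no matter how large the initial value is, which is the defining feature of fixed-time stability, and the estimate (3) is a (conservative) uniform upper bound for it.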
Fig. 1 The n-DOF robotic manipulator
3 Problem Formulation
Consider the n-DOF robotic manipulator depicted in Fig. 1. From [1], its dynamics can be described as
$$M(q)\ddot{q} + C(q, \dot{q})\dot{q} + G(q) = \tau + d, \tag{7}$$
where $d$ denotes the unknown external disturbance, $\tau$ is the control torque, $q \in \mathbb{R}^n$ denotes the joint angle, $\dot{q} \in \mathbb{R}^n$ is the joint angular velocity, $M(q) \in \mathbb{R}^{n \times n}$ represents the inertia matrix, $G(q) \in \mathbb{R}^n$ stands for the gravity vector, and $C(q, \dot{q}) \in \mathbb{R}^{n \times n}$ is the centrifugal and Coriolis force matrix. To focus on the fixed-time tracking control, we suppose that all modeling coefficients in (7) are known and make the assumption below.
Assumption 1 The infinity norm of the derivative of the external disturbance $d$ is bounded by a known constant, i.e., $\sup_{t \ge 0} \|\dot{d}(t)\|_\infty$ does not exceed this constant.
In a real environment, the dynamics of the robotic manipulator may be uncertain due to flexible joints, friction, wear, etc. The parameter matrices of the system can be divided into nominal parts and uncertain parts, i.e., $M(q) = M_0(q) + \Delta M(q)$; $G(q) = G_0(q) + \Delta G(q)$; $C(q, \dot{q}) = C_0(q, \dot{q}) + \Delta C(q, \dot{q})$. Define the desired position $q_d$ and the position error $e$ as
$$q_d = [q_{d1}, q_{d2}, \ldots, q_{dn}]^T, \quad e = q - q_d = [e_1, e_2, \ldots, e_n]^T. \tag{8}$$
Letting $x_1 = e$ and $x_2 = \dot{e}$, we obtain the rewritten dynamics model
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = Z\tau + A + F, \tag{9}$$
where $Z = M_0^{-1}$, $A = -M_0^{-1}(C_0(q, \dot{q})\dot{q} + G_0)$, $F = M_0^{-1} f$, and $f = d + \ddot{q}_d - \Delta M(q)\ddot{q} - \Delta C(q, \dot{q})\dot{q} - \Delta G(q)$.
Assumption 2 The infinity norm of the derivative of the lumped disturbance $F$ is bounded by a known constant $j$, i.e., $\|\dot{F}\|_\infty \le j$.
The control goal of this paper is as follows: considering the robotic manipulator system (7) with model uncertainties and unknown external disturbances under Assumptions 1 and 2, design a control law such that the tracking error $e$ converges to zero in a fixed time.
4 Main Results
The controller design includes a fixed-time tracking control law together with an observer of the lumped disturbances. Design the control input
$$\tau = C_0\dot{q} + G_0 + M_0 u, \tag{10}$$
where $u = [u_1, u_2, \ldots, u_n]^T$ denotes the new control input. Substituting (10) into (7) and performing some direct computation (in terms of (9), $Z\tau + A = u$), we obtain
$$\dot{x}_1 = x_2, \quad \dot{x}_2 = u + F. \tag{11}$$
From (11), the control design reduces to designing $u$ such that (11) is stabilized to zero in a fixed time. The original control input $\tau$ can then be recovered from (10) once $u$ is available.
4.1 Disturbance Observer Design
Let $\hat{F} = M_0^{-1}\hat{f}$. Following the URED technique [19], we propose the observer
$$\begin{cases} \dot{\hat{\zeta}} = Z\tau + A + \hat{F} - p_1\varphi_1(\sigma_0), \\ \dot{\hat{F}} = -p_2\varphi_2(\sigma_0), \end{cases} \tag{12}$$
where $p_1 > 0$, $p_2 > 0$, $\sigma_0 \triangleq \hat{\zeta} - \dot{q}$ and
$$\begin{aligned} \varphi_1(\sigma_0) &= |\sigma_0|^{\frac{1}{2}}\,\mathrm{sign}(\sigma_0) + \mu|\sigma_0|^{\frac{3}{2}}\,\mathrm{sign}(\sigma_0), \\ \varphi_2(\sigma_0) &= \frac{1}{2}\,\mathrm{sign}(\sigma_0) + 2\mu\sigma_0 + \frac{3}{2}\mu^2|\sigma_0|^{2}\,\mathrm{sign}(\sigma_0), \end{aligned} \tag{13}$$
with a constant $\mu \ge 0$. The following proposition can be derived by modifying the robust differentiator in [19].
Proposition 1 Consider the robotic manipulator system (7) with Assumptions 1 and 2. In a fixed time $T_1$, the estimate $\hat{F}$ generated by (12) reaches $M_0^{-1} f$. $T_1$ is independent of the initial condition if the design parameters $p_1$ and $p_2$ are chosen in the set
$$P = \left\{(p_1, p_2) \in \mathbb{R}^2 \,\middle|\, 0 < p_1 \le 2\sqrt{L},\; p_2 > \frac{p_1^2}{4} + \frac{4L^2}{p_1^2}\right\} \cup \left\{(p_1, p_2) \in \mathbb{R}^2 \,\middle|\, p_1 > 2\sqrt{L},\; p_2 > 2L\right\}, \tag{14}$$
where $L = \lambda_{\max}\{M_0^{-1}\}\, j$.
Proof The fixed-time convergence of the observer error $F - \hat{F}$ can be obtained directly by following the routine of [19]. The fixed time $T_1$ is given by Equation (12) in [19]. Hence, we omit the detailed proof.
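The correction terms (13) and the gain condition (14) are simple to evaluate numerically. The Python sketch below is illustrative only: the function names are hypothetical, and the value $L = 1$ used in the example is an assumed placeholder, since the bound $L$ for the simulated manipulator is not stated in the text. The check at the end uses the identity $\varphi_2 = \varphi_1'\,\varphi_1$, which the terms in (13) satisfy for $\sigma_0 \ne 0$.

```python
import math


def sign(x):
    """Signum function: -1, 0 or 1."""
    return (x > 0) - (x < 0)


def phi1(sigma, mu):
    """First URED correction term from (13)."""
    return (math.sqrt(abs(sigma)) + mu * abs(sigma) ** 1.5) * sign(sigma)


def phi2(sigma, mu):
    """Second URED correction term from (13)."""
    return 0.5 * sign(sigma) + 2.0 * mu * sigma + 1.5 * mu ** 2 * sigma ** 2 * sign(sigma)


def in_gain_set(p1, p2, L):
    """Membership test for the gain set P in (14)."""
    if p1 <= 0 or L <= 0:
        return False
    if p1 <= 2.0 * math.sqrt(L):
        return p2 > p1 ** 2 / 4.0 + 4.0 * L ** 2 / p1 ** 2
    return p2 > 2.0 * L


# Gains p1 = p2 = 4 and mu = 1 as in Sect. 5; L = 1.0 is an assumed value.
gains_ok = in_gain_set(4.0, 4.0, L=1.0)

# Numerical check of phi2 = phi1' * phi1 at one sample point.
sigma, mu, h = 0.7, 1.0, 1e-6
dphi1 = (phi1(sigma + h, mu) - phi1(sigma - h, mu)) / (2.0 * h)
identity_gap = abs(phi2(sigma, mu) - dphi1 * phi1(sigma, mu))
```

For a larger disturbance bound $L$ the same gains may fall outside $P$, so in practice `in_gain_set` would be evaluated with the actual bound before running the observer.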
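4.2 Fixed-Time Tracking Controller Design
In what follows, we design a fixed-time controller for the model (11) via the disturbance observer (12):
$$u_i = -2^{1-\omega_2}(2 - \omega_2)\left[\left(a - 1 + \beta_i + l\chi_{2i}^{\rho}\right)\chi_{2i}^{\omega_2 - \frac{2}{2a+1}} - \frac{1}{c_1}\chi_{2i}^{\omega_2 - 2}(L - \hat{F}_i)\dot{\hat{F}}_i + L\right], \tag{15}$$
where $a \in \mathbb{Z}_{>1}$, $l \in \mathbb{R}_+$, $c_1 \in \mathbb{R}_+$, $\beta_i = \alpha_1 + \alpha_{3i}$, $\rho = \frac{2k}{2k+1}$, $\omega_k = \frac{2a+3-2k}{2a+1}$ with $k \in \mathbb{N}_{\ge 1}$, and $\alpha_1$, $\alpha_{3i}$, $\chi_{2i}$ ($i = 1, 2, 3$) will be defined later.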
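Theorem 1 Under Assumptions 1 and 2, the closed-loop error dynamics (11) with the control law (15) and the disturbance observer (12) is globally fixed-time stable.
Proof Choose the Lyapunov function candidate for system (11) as
$$V = \frac{1}{2}x_1^T x_1 + \sum_{i=1}^{n} H_i + \frac{1}{2c_1}\sum_{i=1}^{n}(L - \hat{F}_i)^2, \tag{16}$$
with
$$H_i = \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\int_{\tilde{x}_{2i}}^{x_{2i}}\left(s^{\frac{1}{\omega_2}} - \tilde{x}_{2i}^{\frac{1}{\omega_2}}\right)^{2-\omega_2} ds, \tag{17}$$
where $\tilde{x}_{2i}$ is viewed as a virtual control to be determined. Using the positive function $a + l x_{1i}^{\rho}$, we construct $\tilde{x}_{2i}$ as
$$\tilde{x}_{2i} \triangleq -\chi_{1i}^{\omega_2}\left(a + l x_{1i}^{\rho}\right), \tag{18a}$$
$$\chi_{1i} = x_{1i}, \quad \chi_{2i} = x_{2i}^{\frac{1}{\omega_2}} - \tilde{x}_{2i}^{\frac{1}{\omega_2}}. \tag{18b}$$
From Lemma 2, we obtain
$$\left|s^{\frac{1}{\omega_2}} - \tilde{x}_{2i}^{\frac{1}{\omega_2}}\right| \ge \left(2^{-1}|s - \tilde{x}_{2i}|\right)^{\frac{1}{\omega_2}}. \tag{19}$$
In the case of $x_{2i} \ge \tilde{x}_{2i}$, substituting (19) into (16) yields
$$\begin{aligned} V &\ge \frac{1}{2}x_1^T x_1 + \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\sum_{i=1}^{n}\int_{\tilde{x}_{2i}}^{x_{2i}}\left(2^{-1}(s - \tilde{x}_{2i})\right)^{\frac{2}{\omega_2} - 1} ds + \frac{1}{2c_1}\sum_{i=1}^{n}(L - \hat{F}_i)^2 \\ &= \frac{1}{2}x_1^T x_1 + \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\sum_{i=1}^{n}\frac{\omega_2}{2}\,2^{1 - \frac{2}{\omega_2}}(x_{2i} - \tilde{x}_{2i})^{\frac{2}{\omega_2}} + \frac{1}{2c_1}\sum_{i=1}^{n}(L - \hat{F}_i)^2 \ge 0. \end{aligned} \tag{20}$$
In the case of $x_{2i} < \tilde{x}_{2i}$, the same method shows that $V \ge 0$. Hence $V$ is positive definite. The time derivative of $H_i$ is
$$\dot{H}_i = \frac{1}{2^{1-\omega_2}}\,x_{2i}\,\frac{d\left(-\tilde{x}_{2i}^{\frac{1}{\omega_2}}\right)}{dx_{1i}}\int_{\tilde{x}_{2i}}^{x_{2i}}\left(s^{\frac{1}{\omega_2}} - \tilde{x}_{2i}^{\frac{1}{\omega_2}}\right)^{1-\omega_2} ds + \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\chi_{2i}^{2-\omega_2}(u_i + F_i). \tag{21}$$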
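According to the definition of $x_{2i}$ and $\chi_{2i}$, one has
$$x_{2i} = \left(\tilde{x}_{2i}^{\frac{1}{\omega_2}} + \chi_{2i}\right)^{\omega_2} \le |\tilde{x}_{2i}| + |\chi_{2i}|^{\omega_2} = |\chi_{1i}|^{\omega_2} + |\chi_{2i}|^{\omega_2}, \tag{22a}$$
$$\frac{d\left(-\tilde{x}_{2i}^{\frac{1}{\omega_2}}\right)}{dx_{1i}} = \frac{d\left(x_{1i}\left(a + l x_{1i}^{\rho}\right)^{\frac{1}{\omega_2}}\right)}{dx_{1i}} = \left(a + l x_{1i}^{\rho}\right)^{\frac{1}{\omega_2}} + \frac{l\rho}{\omega_2}x_{1i}^{\rho}\left(a + l x_{1i}^{\rho}\right)^{\frac{1}{\omega_2} - 1} \triangleq \alpha_{2i}(x_{1i}), \tag{22b}$$
with $\alpha_{2i}(x_{1i}) \ge 0$ being a $C^1$ function. Based on Lemma 2, Lemma 3, (22a) and (22b), it then follows that
$$\begin{aligned} \frac{1}{2^{1-\omega_2}}\,x_{2i}\,\frac{d\left(-\tilde{x}_{2i}^{\frac{1}{\omega_2}}\right)}{dx_{1i}}\int_{\tilde{x}_{2i}}^{x_{2i}}\left(s^{\frac{1}{\omega_2}} - \tilde{x}_{2i}^{\frac{1}{\omega_2}}\right)^{1-\omega_2} ds &\le \frac{\left(|\chi_{1i}|^{\omega_2} + |\chi_{2i}|^{\omega_2}\right)\alpha_{2i}(x_{1i})\,|x_{2i} - \tilde{x}_{2i}|\,|\chi_{2i}|^{1-\omega_2}}{2^{1-\omega_2}} \\ &\le \left(|\chi_{1i}|^{\omega_2} + |\chi_{2i}|^{\omega_2}\right)|\chi_{2i}|\,\alpha_{2i}(x_{1i}). \end{aligned} \tag{23}$$
The time derivative of (16) along the solutions of (11) can be calculated as
$$\dot{V} = x_1^T x_2 + \sum_{i=1}^{n}\dot{H}_i - \frac{1}{c_1}\sum_{i=1}^{n}(L - \hat{F}_i)\dot{\hat{F}}_i = \sum_{i=1}^{n}\left[x_{1i}(x_{2i} - \tilde{x}_{2i}) + x_{1i}\tilde{x}_{2i}\right] + \sum_{i=1}^{n}\dot{H}_i - \frac{1}{c_1}\sum_{i=1}^{n}(L - \hat{F}_i)\dot{\hat{F}}_i. \tag{24}$$
Substituting (18), (21) and (23) into (24) yields
$$\begin{aligned} \dot{V} \le{} & -\sum_{i=1}^{n}\left(a\chi_{1i}^{\gamma} + l\chi_{1i}^{\gamma+\rho}\right) + \sum_{i=1}^{n}\chi_{1i}(x_{2i} - \tilde{x}_{2i}) + \sum_{i=1}^{n}\left(|\chi_{1i}|^{\omega_2} + |\chi_{2i}|^{\omega_2}\right)|\chi_{2i}|\,\alpha_{2i}(x_{1i}) \\ & + \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\sum_{i=1}^{n}\chi_{2i}^{2-\omega_2}(u_i + F_i) - \frac{1}{c_1}\sum_{i=1}^{n}(L - \hat{F}_i)\dot{\hat{F}}_i. \end{aligned} \tag{25}$$
Using Lemma 2 and Lemma 3, we obtain
$$|\chi_{1i}(x_{2i} - \tilde{x}_{2i})| \le 2^{1-\omega_2}|\chi_{1i}||\chi_{2i}|^{\omega_2}. \tag{26}$$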
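Substituting (26) into (25) leads to
$$\begin{aligned} \dot{V} \le{} & -\sum_{i=1}^{n}\left(a\chi_{1i}^{\gamma} + l\chi_{1i}^{\gamma+\rho}\right) + \sum_{i=1}^{n}2^{1-\omega_2}|\chi_{1i}||\chi_{2i}|^{\omega_2} + \sum_{i=1}^{n}\left(|\chi_{1i}|^{\omega_2} + |\chi_{2i}|^{\omega_2}\right)|\chi_{2i}|\,\alpha_{2i}(x_{1i}) \\ & + \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\sum_{i=1}^{n}\chi_{2i}^{2-\omega_2}(u_i + F_i) - \frac{1}{c_1}\sum_{i=1}^{n}(L - \hat{F}_i)\dot{\hat{F}}_i \\ \le{} & -\sum_{i=1}^{n}\left(a\chi_{1i}^{\gamma} + l\chi_{1i}^{\gamma+\rho}\right) + \sum_{i=1}^{n}(k_1 + k_2)|\chi_{1i}|^{\gamma} + \sum_{i=1}^{n}\beta_i|\chi_{2i}|^{\gamma} \\ & + \frac{1}{2^{1-\omega_2}(2 - \omega_2)}\sum_{i=1}^{n}\chi_{2i}^{2-\omega_2}(u_i + F_i) - \frac{1}{c_1}\sum_{i=1}^{n}(L - \hat{F}_i)\dot{\hat{F}}_i, \end{aligned} \tag{27}$$
where $\gamma = \frac{4a}{2a+1}$, $k_1, k_2 \in \mathbb{R}_+$, $k_1 + k_2 = 1$, $\beta_i = k_3 + k_4 + \alpha_{2i}$, $k_3 = \frac{\omega_2}{k_1^{1/\omega_2}}\left(\frac{2^{1-\omega_2}}{1+\omega_2}\right)^{1 + \frac{1}{\omega_2}}$, and $k_4 = \left(\frac{\omega_2}{k_2}\right)^{\omega_2}\left(\frac{\alpha_{2i}}{1+\omega_2}\right)^{1+\omega_2}$. Substituting the proposed control law (15) into (27), we obtain
$$\dot{V} \le -\sum_{i=1}^{n}(a - 1)\left(\chi_{1i}^{\gamma} + \chi_{2i}^{\gamma}\right) - l\sum_{i=1}^{n}\left(\chi_{1i}^{\gamma+\rho} + \chi_{2i}^{\gamma+\rho}\right) + \sum_{i=1}^{n}\chi_{2i}^{2-\omega_2}(F_i - L). \tag{28}$$
According to Proposition 1, $F_i < L$, so that $(F_i - L) \le 0$, and one has
$$\dot{V} \le -(a - 1)\sum_{j=1}^{2}\sum_{i=1}^{n}\chi_{ji}^{\gamma} - l\sum_{j=1}^{2}\sum_{i=1}^{n}\chi_{ji}^{\gamma+\rho}, \quad \forall t \ge T_1. \tag{29}$$
Define $c_2 = \max\{|\chi_{11}|, |\chi_{12}|, \ldots, |\chi_{1n}|\}$ and choose $c_1 \ge \frac{4L^2}{3c_2^2}$; we obtain
$$V \le 2\sum_{i=1}^{n}\left(\chi_{1i}^{2} + \chi_{2i}^{2}\right). \tag{30}$$
By Lemma 2, it follows that
$$V^{\frac{\gamma}{2}} \le 2\sum_{i=1}^{n}\left(\chi_{1i}^{\frac{4a}{2a+1}} + \chi_{2i}^{\frac{4a}{2a+1}}\right) = 2\sum_{j=1}^{2}\sum_{i=1}^{n}\chi_{ji}^{\gamma}. \tag{31}$$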
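By Lemma 4, one has
$$\sum_{j=1}^{2}\sum_{i=1}^{n}\chi_{ji}^{\gamma+\rho} = \sum_{j=1}^{2}\sum_{i=1}^{n}\left(\chi_{ji}^{2}\right)^{\lambda} \ge 2^{1-\lambda}\left(\sum_{j=1}^{2}\sum_{i=1}^{n}\chi_{ji}^{2}\right)^{\lambda} \ge 2^{1-2\lambda}V^{\lambda}, \tag{32}$$
where $\lambda = \frac{\gamma+\rho}{2}$. Combining (31) and (32), the derivative $\dot{V}$ further satisfies
$$\dot{V} \le -\frac{1}{2}V^{\frac{\gamma}{2}} - l\,2^{1-2\lambda}V^{\lambda}. \tag{33}$$
Based on Lemma 1, since $\frac{\gamma}{2} < 1$ and $\lambda > 1$, the settling time $T_2$ for (33) can be estimated as
$$T_2 = \frac{2}{1 - \frac{\gamma}{2}} + \frac{2^{2\lambda-1}}{l(\lambda - 1)}. \tag{34}$$
Considering the convergence time of the observer, there is a fixed time $T_3 = T_1 + T_2$ such that $V(t) \equiv 0$, $\forall t \ge T_3$. Thus, the error dynamics (11) is fixed-time stable. This completes the proof.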
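The estimate (34) depends only on $a$, $l$ and $\rho$ through $\gamma = 4a/(2a+1)$ and $\lambda = (\gamma + \rho)/2$. The Python sketch below evaluates it; the function name is hypothetical, and the value $\rho = 2$ in the example is an assumption chosen for illustration, with $a = 2$ and $l = 1$ taken from Sect. 5.

```python
def settling_time_T2(a, l, rho):
    """Evaluate the settling-time estimate (34).

    gamma = 4a / (2a + 1) and lambda = (gamma + rho) / 2; the estimate is
    only meaningful when gamma / 2 < 1 < lambda, as required by Lemma 1.
    """
    gamma = 4.0 * a / (2.0 * a + 1.0)
    lam = (gamma + rho) / 2.0
    if l <= 0 or not (gamma / 2.0 < 1.0 < lam):
        raise ValueError("need l > 0 and gamma/2 < 1 < lambda")
    return 2.0 / (1.0 - gamma / 2.0) + 2.0 ** (2.0 * lam - 1.0) / (l * (lam - 1.0))


# a = 2 and l = 1 as in Sect. 5; rho = 2 is an assumed value.
T2 = settling_time_T2(a=2, l=1.0, rho=2.0)
print(f"T2 estimate: {T2:.2f} s")
```

The first term of (34) grows as $a$ increases, because $\gamma/2 = 2a/(2a+1)$ approaches 1, so larger $a$ makes the estimate more conservative. The estimate is an upper bound, so the convergence observed in simulation can be considerably faster.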
5 Simulation
In this section, we carry out numerical simulations with a 6-DOF robotic manipulator to validate the proposed theoretical results. The dynamics parameters of the manipulator are listed in Table 1.

Table 1 Dynamics parameters of 6-DOF robotic manipulator
Joint mass       Nominal value (kg)              Actual value (kg)
m1               0                               0
m2               17.4                            17
m3               4.8                             5
m4               0.82                            0.8
m5               0.34                            0.35
m6               0.09                            0.1
Inertia matrix   Nominal value (kg · m2)         Actual value (kg · m2)
I1               Diag(0, 0.35, 0)                Diag(0, 0.35, 0)
I2               Diag(0.13, 0.524, 0.539)        Diag(0.13, 0.5, 0.5)
I3               Diag(0.066, 0.086, 0.0125)      Diag(0.066, 0.09, 0.013)
I4               Diag(0.018, 0.013, 0.018)       Diag(0.016, 0.012, 0.016)
I5               Diag(3e-04, 4e-04, 3e-04)       Diag(3e-04, 4e-04, 3e-04)
I6               Diag(1.5e-04, 1.5e-04, 4e-05)   Diag(1.6e-04, 1.6e-04, 5e-05)
Fig. 2 Tracking error and angles state of each joint without uncertain dynamics: (a) tracking error (e1 to e6); (b) angles state (q1 to q6)
Fig. 3 Control torque of each joint without uncertain dynamics
For the simulation, we set
\[
\begin{aligned}
q(0) &= [0.1,\ 0.1,\ -0.2,\ 0.2,\ -0.3,\ 0.2]^T \ \text{rad}, \\
q_d(0) &= \left[\tfrac{\pi}{4},\ \tfrac{\pi}{2},\ -\tfrac{\pi}{2},\ \tfrac{\pi}{2},\ -\tfrac{\pi}{2},\ \tfrac{\pi}{2}\right]^T \ \text{rad}, \\
d(t) &= [\sin t,\ \cos t,\ 0.5\sin t,\ 0,\ 0,\ 0]^T \ \text{N·m}.
\end{aligned}
\]
The desired trajectories $q_d$ are designed using the fifth-order polynomial interpolation method. Based on the proposed controller (15), the control coefficients are set as a = 2, l = 1, ρ = 85 5 2 2, α1 = 4.45, and α3i = 2.73 (2 + x1i2 ) 3 + 3.33x1i2 (2 + x1i2 ) 3 . In addition, we set the gains of the observer (13) as $p_1 = 4$, $p_2 = 4$ and $\mu = 1$. The simulation results are shown in Figs. 2 and 3. It can be observed from Fig. 2a that the tracking error of each joint approaches zero at about t = 3.5. This is further supported by the curves in Fig. 2b, which show that each joint reaches its desired value in fixed time. From Fig. 3, we can see that the control torques of all joints are bounded and remain at reasonable levels. In addition, from Fig. 3b, the URED observer errors converge to zero in fixed time.
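Fifth-order polynomial interpolation can be illustrated with a short sketch. It assumes a rest-to-rest motion (zero velocity and acceleration at both ends) between a start angle and a goal angle; the boundary conditions and the motion duration actually used in the simulation are not stated in the text, so the function name `quintic_point`, the duration of 5 s and the endpoint values are hypothetical.

```python
import math


def quintic_point(q_start: float, q_goal: float, t: float, duration: float) -> float:
    """Rest-to-rest fifth-order polynomial interpolation (assumed boundary conditions).

    The blend 10 s^3 - 15 s^4 + 6 s^5 has zero first and second derivatives
    at s = 0 and s = 1, so velocity and acceleration vanish at both ends.
    """
    s = min(max(t / duration, 0.0), 1.0)  # normalised time, clamped to [0, 1]
    blend = 10.0 * s**3 - 15.0 * s**4 + 6.0 * s**5
    return q_start + (q_goal - q_start) * blend


# Illustrative use for one joint: from 0.1 rad to pi/4 rad over a hypothetical 5 s.
samples = [quintic_point(0.1, math.pi / 4, t, 5.0) for t in (0.0, 2.5, 5.0)]
print(samples)
```

A quintic is the lowest-order polynomial that can match position, velocity and acceleration at both endpoints, which keeps the reference smooth enough for a controller that uses its first and second derivatives.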
Fig. 4 Tracking error and angles state of each joint with uncertain dynamics: (a) tracking error (e1 to e6); (b) angles state (q1 to q6)
Fig. 5 Control torque and observer errors of (13) of each joint with uncertain dynamics: (a) control torque; (b) observer errors of (13) (de1 to de6)
In the presence of model uncertainties and external disturbances, the tracking errors and the angles of each joint are shown in Fig. 4. The control torques are shown in Fig. 5a, and the URED observer errors of each joint are shown in Fig. 5b. To sum up, the above simulation results are consistent with the theoretical results, which verifies the effectiveness of the proposed control strategy.
6 Conclusions
The fixed-time tracking control scheme has been studied for an n-DOF robotic manipulator with external disturbances and model uncertainties. An observer for the lumped disturbances and a fixed-time controller are proposed by utilizing the robust exact differentiator and the AAPI approach, respectively. In the Lyapunov sense, it is proved that the errors of the observer and the controller converge to zero in fixed time. The robustness of the closed-loop system is also verified based on the AAPI method. Future work will focus on the fixed-time control of the robotic manipulator with input constraints.

Acknowledgements This work was supported in part by the NSFC (62133001, 62227810) and the National Basic Research Program of China (973 Program: 2012CB821200, 2012CB821201).
Model-Free Optimal Control for Linear Systems with State and Control Inequality Constraints Bin Zhang, Chenyang Xu, Lutao Yan, and Haiyuan Li
Abstract A model-free iteration algorithm is presented in this paper to obtain the optimal control policy for a linear system. A novel identification model is introduced, based on which policy updating laws under Pontryagin's framework are established. Three types of convergence accuracy judgments are used, which determine the convergence accuracy of the optimal control, the convergence accuracy of the system model, and the convergence accuracy of the mixed state-control variable inequality constraint. This algorithm extends the existing Kleinman Algorithm to a finite-horizon cost function. A simulation result illustrates the efficiency of the proposed approach.
Keywords Optimal control · Model-free control · Linear systems · Inequality constraints
B. Zhang (✉) · C. Xu
School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
L. Yan · H. Li
School of Automation, Beijing University of Posts and Telecommunications, Beijing 100876, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_29

1 Introduction
Model-free control estimates the optimal value function, and thereby solves the optimal control problem, when the environmental dynamics model is completely or partially unknown [1, 2]. Model-free control problems are becoming more and more common in practical industrial scenarios, such as controlling the navigation of a ship [3], the flight of a helicopter [4], and the motion of a soft robot [5, 6]. For most of these problems, either an accurate dynamical model of the system cannot be obtained due to physical constraints [7–9], or the system model is known but the problem suffers from computational complexity. The model-free optimal control method can overcome these difficulties.
Nowadays, the most commonly used method for solving model-free control problems is adaptive dynamic programming (ADP) [10]. The basic structure of ADP is an actor-critic structure, where the actor module interacts with the environment to generate decisions or controls, and the critic module evaluates the performance of the system to adjust the control strategy [11–13]. ADP theory integrates reinforcement learning, dynamic programming, and function approximation: it estimates the cost function with a function approximation structure and approximates the optimal solution of the system using offline or online update methods, thereby solving the optimal control problem of nonlinear systems [14–16].

Because the stability of systems is often considered in control science, Lyapunov methods have become the most common approach for solving model-free optimal control online. The Lyapunov approach [17, 18] is useful for studying the stability of nonlinear dynamic systems and for developing feedback controllers. It guarantees the stability of closed-loop systems, which usually depends on the joint design of Lyapunov equations and state feedback controllers. Huang et al. [19] focused on Lyapunov's method and Bellman's dynamic programming for the generation and evaluation of optimal controllers, and proposed a sequential approximation method that combines ease of evaluation with computational simplicity. Kleinman [20] showed that the solution of the continuous Riccati equation can be obtained with the help of a series of Lyapunov equations. Saridis et al. [21] extended the idea to nonlinear continuous systems, solving the generalized HJB equation by Lyapunov's method to achieve optimal control. Vamvoudakis and Lewis [22] proposed an online adaptive algorithm with an actor-critic architecture that adapts in continuous time and introduces nonstandard terms to achieve a stable closed-loop system. Yang et al. [23] addressed the infinite-horizon optimal control problem for continuous-time nonlinear systems with constrained inputs and an unknown model. They constructed a recurrent neural network to identify the unknown controlled system and used a Lyapunov direct method to show that the estimated weights of the actor-critic network are uniformly ultimately bounded and that the closed-loop system is stable. In contrast to the way Zhang [24] and Bhasin et al. [25] sought actor update laws, Yang used stability as a starting point and established an actor adjustment law similar to the one proposed by Vamvoudakis and Lewis [22]. Two networks act as actor and critic to approximate the optimal control and the optimal cost. Vamvoudakis [26] formulated the infinite-horizon optimal control of continuous-time linear systems as a model-free Q-learning problem. To parameterize the state and control, a model-independent action-dependent value function with real-time tuning parameters is constructed using reinforcement learning ideas. It guides the construction of the Q function and of the actor-critic structure that approximates the optimal control and the optimal cost. The optimal solution is reached asymptotically, and the closed-loop stability of the system is ensured by tuning laws designed with the Lyapunov equation.

Among model-free control problems, there is a class of control problems with constraints. Most ADP methods consider input constraints [27, 28], and these problems are also called "input saturation problems" [29, 30]. Therefore, an innovation of this paper is to change the constraint condition to a joint constraint on both the input and the state, under which a new model-free iterative algorithm is proposed for the optimal control of linear systems based on the Lyapunov method. This algorithm extends the existing Kleinman Algorithm to a finite-horizon cost function. A simulation result illustrates the efficiency of the proposed approach.

The remainder of this paper is structured as follows. The problem formulation is given in Sect. 2. The system identification is designed in Sect. 3. The model-free iterative algorithm is presented in Sect. 4. The extension of the algorithm to time-varying systems is presented in Sect. 5. The effectiveness of the new method is verified by a simulation example in Sect. 6. Finally, we summarize the work of this paper in Sect. 7.
2 Problem Formulation
Consider the following linear system
\[
\dot{x} = Ax + Bu, \quad x(t_0) = x_0 \tag{1}
\]
where $x \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}^m$ is the control input, $A \in \mathbb{R}^{n\times n}$ is the transition matrix, and $B \in \mathbb{R}^{n\times m}$ is the distribution matrix. The cost function associated with system (1) is defined as
\[
J(u) = \frac{1}{2}\int_{t_0}^{t_f}\left(x^T Q x + u^T R u\right)dt \tag{2}
\]
where $Q \in \mathbb{R}^{n\times n}$ and $R \in \mathbb{R}^{m\times m}$ are positive definite symmetric matrices. The mixed state-control variable inequality constraint is defined as
\[
S(x, u, t) \le 0 \tag{3}
\]
where $S = (s_1, s_2, \cdots, s_z)^T : \mathbb{R}^n \times \mathbb{R}^m \times [t_0, t_f] \to \mathbb{R}^z$ is a z-dimensional nonlinear vector function. The control objective is to find $u(t)$, $t_0 \le t \le t_f$, that minimizes (2) subject to system (1) and inequality constraint (3). Given the problem formulation (1)–(3), the Kelley-Bryson penalty function [31] is formed as follows
\[
J(u, r^k) = \int_{t_0}^{t_f}\left(\frac{1}{2}x^T Q x + \frac{1}{2}u^T R u + r^k\sum_{j=1}^{z} h[s_j]\,s_j^2(x, u, t)\right)dt \tag{4}
\]
where
\[
h[\sigma] = \begin{cases} 1, & \sigma > 0 \\ 0, & \sigma \le 0 \end{cases} \tag{5}
\]
and $r^k > 0$. Let $\{r^k\}$ be an infinite sequence of positive numbers such that $r^{k+1} > r^k > 0$ and $\lim_{k\to\infty} r^k = \infty$. It is proved that the sequence of optimal control inputs (corresponding to each $r^k$) converges to the solution of the original constrained problem as $k \to \infty$ [31]. The necessary conditions for the optimization of $J(u, r^k)$ are
\[
\dot{\lambda} = -H_x - 2r^k\sum_{j=1}^{z} h[s_j]\,s_j\,s_{jx} = -Qx - A^T\lambda - 2r^k\sum_{j=1}^{z} h[s_j]\,s_j\,s_{jx}, \quad \lambda(t_f) = 0 \tag{6}
\]
\[
u = \arg\min_{u} H(x, u, \lambda) = -R^{-1}B^T\lambda \tag{7}
\]
where $H = \frac{1}{2}\left(x^T Q x + u^T R u\right) + \lambda^T(Ax + Bu)$.
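The integrand of the penalty function (4) with the switching function (5) can be sketched as follows. This is a minimal illustration assuming diagonal $Q$ and $R$ for brevity; the names `heaviside` and `penalized_integrand` and the numerical values in the example are assumptions made here, not part of the formulation.

```python
from typing import Callable, Sequence


def heaviside(sigma: float) -> float:
    """Switching function h[sigma] of (5): 1 if sigma > 0, else 0."""
    return 1.0 if sigma > 0.0 else 0.0


def penalized_integrand(
    x: Sequence[float],
    u: Sequence[float],
    q_diag: Sequence[float],
    r_diag: Sequence[float],
    r_k: float,
    constraints: Sequence[Callable[[Sequence[float], Sequence[float]], float]],
) -> float:
    """Integrand of (4), assuming diagonal Q and R (an illustrative simplification)."""
    quad = 0.5 * sum(q * xi * xi for q, xi in zip(q_diag, x))
    quad += 0.5 * sum(r * ui * ui for r, ui in zip(r_diag, u))
    # Only violated constraints (s_j > 0) contribute to the penalty.
    penalty = 0.0
    for s in constraints:
        s_val = s(x, u)
        penalty += heaviside(s_val) * s_val * s_val
    return quad + r_k * penalty


# Example constraint written as s(x, u) <= 0: x1^2 + x2^2 - 2 <= 0.
circle = lambda x, u: x[0] ** 2 + x[1] ** 2 - 2.0

inside = penalized_integrand([1.0, 0.5, -1.0], [0.0], [1, 1, 1], [10], 10.0, [circle])
outside = penalized_integrand([2.0, 1.0, 0.0], [0.0], [1, 1, 1], [10], 10.0, [circle])
print(inside, outside)
```

Inside the feasible region the penalty term vanishes and the integrand reduces to the original quadratic cost; outside, the quadratic penalty grows with $r^k$, which is why increasing $r^k$ drives the solution toward feasibility.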
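3 System Identification
We assume that the i-th estimate of the system state is designed as
\[
\dot{\hat{x}}^i = \hat{A}^i x + \hat{B}^i u + \nu\left(x - \hat{x}^i\right), \quad \hat{x}^i(t_0) = x_0 \tag{8}
\]
where $\hat{x}^i \in \mathbb{R}^n$ is the state estimate, $\hat{A}^i \in \mathbb{R}^{n\times n}$ is the estimate of $A$, $\hat{B}^i \in \mathbb{R}^{n\times m}$ is the estimate of $B$, and $\nu \in \mathbb{R}$ is a positive parameter. The updating laws for $\hat{A}^i$ and $\hat{B}^i$ are designed as
\[
\dot{\hat{A}}^i = \left(x - \hat{x}^i\right)x^T + h\sum_{j=1}^{W}\left(\dot{X}_j - \hat{A}^i X_j - \hat{B}^i U_j\right)X_j^T, \quad \hat{A}^i(t_0) = \hat{A}^{i-1}(t_f) \tag{9}
\]
\[
\dot{\hat{B}}^i = \left(x - \hat{x}^i\right)u^T + h\sum_{j=1}^{W}\left(\dot{X}_j - \hat{A}^i X_j - \hat{B}^i U_j\right)U_j^T, \quad \hat{B}^i(t_0) = \hat{B}^{i-1}(t_f) \tag{10}
\]
where $h \in \mathbb{R}$ is a positive parameter and $\{X_j, U_j, \dot{X}_j\}_{j=1}^{W}$ is the history stack, i.e., $\dot{X}_j = AX_j + BU_j$, $j = 1, 2, \cdots, W$. We assume that

Assumption 1 Let $\Xi = \sum_{j=1}^{W}\begin{bmatrix} X_j \\ U_j \end{bmatrix}\begin{bmatrix} X_j \\ U_j \end{bmatrix}^T \in \mathbb{R}^{(n+m)\times(n+m)}$. It is assumed that $\Xi > 0$.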
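The estimator (8) and the updating laws (9) and (10) can be sketched with a forward-Euler simulation. Everything numerical below is an assumption chosen for illustration: the second-order plant, the gains $\nu = h = 5$, the step size, the input $u = \sin t$, and a three-point history stack whose Gram matrix is the identity, so that Assumption 1 holds. The true $A$ and $B$ are used only to generate the "measured" data $x$ and $\dot{X}_j$; the estimator itself never reads them.

```python
import numpy as np

# Hypothetical plant, used only to emulate measurements.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
nu, h, dt, steps = 5.0, 5.0, 1e-3, 5000  # assumed gains and Euler step

# History stack {X_j, U_j, Xdot_j}; sum of [X;U][X;U]^T is the 3x3 identity.
X = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.0, 0.0])]
U = [np.array([0.0]), np.array([0.0]), np.array([1.0])]
Xdot = [A @ Xj + B @ Uj for Xj, Uj in zip(X, U)]  # recorded derivatives

x = np.array([1.0, -1.0])
x_hat = x.copy()                 # x_hat(t0) = x0 as in (8)
A_hat = np.zeros((2, 2))
B_hat = np.zeros((2, 1))

for k in range(steps):
    u = np.array([np.sin(k * dt)])
    err = x - x_hat
    dA = np.outer(err, x)        # first term of (9)
    dB = np.outer(err, u)        # first term of (10)
    for Xj, Uj, Xdj in zip(X, U, Xdot):
        resid = Xdj - A_hat @ Xj - B_hat @ Uj
        dA += h * np.outer(resid, Xj)
        dB += h * np.outer(resid, Uj)
    dx_hat = A_hat @ x + B_hat @ u + nu * err   # estimator (8)
    dx = A @ x + B @ u                          # plant (measured state)
    x, x_hat = x + dt * dx, x_hat + dt * dx_hat
    A_hat, B_hat = A_hat + dt * dA, B_hat + dt * dB

print(np.round(A_hat, 4))
print(np.round(B_hat, 4))
```

With a history stack that spans the full $(x, u)$ space, the stack terms alone already pull $\hat{A}^i$ and $\hat{B}^i$ toward the true matrices, so the estimates do not rely on the online trajectory being persistently exciting.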
With Assumption 1, the following result is obtained.
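Theorem 1 Consider the linear system (1). The estimates (8)–(10) will converge uniformly to the true values, i.e., $\lim_{i\to\infty}\|x - \hat{x}^i\| = 0$, $\lim_{i\to\infty}\|A - \hat{A}^i\| = 0$, and $\lim_{i\to\infty}\|B - \hat{B}^i\| = 0$, for all $t \in [t_0, t_f]$.

Proof We define the Lyapunov function as
\[
V^i = \frac{1}{2}\left(\tilde{x}^i\right)^T\tilde{x}^i + \frac{1}{2}\operatorname{tr}\left(\left(\tilde{A}^i\right)^T\tilde{A}^i\right) + \frac{1}{2}\operatorname{tr}\left(\left(\tilde{B}^i\right)^T\tilde{B}^i\right) \tag{11}
\]
where $\tilde{x}^i = x - \hat{x}^i$, $\tilde{A}^i = A - \hat{A}^i$, and $\tilde{B}^i = B - \hat{B}^i$. Taking the derivative of the Lyapunov function, we can see that
\[
\begin{aligned}
\dot{V}^i ={} & \left(\tilde{x}^i\right)^T\left(\tilde{A}^i x + \tilde{B}^i u - \nu\tilde{x}^i\right) \\
& - \operatorname{tr}\left(\tilde{A}^i\left(x\left(\tilde{x}^i\right)^T + h\sum_{j=1}^{W} X_j\left(\dot{X}_j - \hat{A}^i X_j - \hat{B}^i U_j\right)^T\right)\right) \\
& - \operatorname{tr}\left(\tilde{B}^i\left(u\left(\tilde{x}^i\right)^T + h\sum_{j=1}^{W} U_j\left(\dot{X}_j - \hat{A}^i X_j - \hat{B}^i U_j\right)^T\right)\right) \\
={} & -\nu\left(\tilde{x}^i\right)^T\tilde{x}^i + \left(\tilde{x}^i\right)^T\left(\tilde{A}^i x + \tilde{B}^i u\right) - \operatorname{tr}\left(\left(\tilde{A}^i x + \tilde{B}^i u\right)\left(\tilde{x}^i\right)^T\right) \\
& - h\cdot\operatorname{tr}\left(\sum_{j=1}^{W}\tilde{A}^i X_j\left(\tilde{A}^i X_j + \tilde{B}^i U_j\right)^T + \sum_{j=1}^{W}\tilde{B}^i U_j\left(\tilde{A}^i X_j + \tilde{B}^i U_j\right)^T\right).
\end{aligned}
\]
Since $\left(\tilde{x}^i\right)^T\left(\tilde{A}^i x + \tilde{B}^i u\right) = \operatorname{tr}\left(\left(\tilde{A}^i x + \tilde{B}^i u\right)\left(\tilde{x}^i\right)^T\right)$, the cross terms cancel, which leaves $\dot{V}^i = -\nu\|\tilde{x}^i\|^2 - h\sum_{j=1}^{W}\|\tilde{A}^i X_j + \tilde{B}^i U_j\|^2 \le 0$.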
… $a^0 > 0$, $b^0 > 0$, $\hat{A}^0$, and $\hat{B}^0$. Choose the convergence accuracies $\varepsilon_1 > 0$, $\varepsilon_2 > 0$, and $\varepsilon_3 > 0$. Let $u^0(t)$, $t \in [t_0, t_f]$, be the initial control input. Get the initial state estimate during the time interval $[t_0, t_f]$ as follows
\[
\dot{\hat{x}}^0 = \hat{A}^0 x^0 + \hat{B}^0 u^0 + \nu\left(x^0 - \hat{x}^0\right), \quad \hat{x}^0(t_0) = x_0 \tag{19}
\]
where $x^0$ is the measurable state of system (1) with initial control input $u^0$. Calculate the initial cost function
\[
J(u^0, r^0) = \int_{t_0}^{t_f} P\left(x^0, u^0, r^0, t\right)dt. \tag{20}
\]
Set $i = 0$ and $k = 0$.
Step2: Calculate $u^{i+1}$ and $\lambda^i$ by solving
\[
\dot{\lambda}^i = -\hat{Q}_x\left(x^i, u^{i+1}, r^k, t, \lambda^i, \hat{A}^i, \hat{B}^i\right), \quad \lambda^i(t_f) = 0 \tag{21}
\]
\[
u^{i+1} = \arg\min_{u}\left\{\hat{Q}\left(x^i, u, r^k, t, \lambda^i, \hat{A}^i, \hat{B}^i\right) + a^i\|\delta u^i\|_1 + \frac{1}{2}b^i\|\delta u^i\|_2^2\right\} \tag{22}
\]
where $\delta u^i = u - u^i$.
Step3: Calculate $\hat{x}^{i+1}$, $\hat{A}^{i+1}$, and $\hat{B}^{i+1}$ by solving
\[
\dot{\hat{x}}^{i+1} = \hat{A}^{i+1}x^{i+1} + \hat{B}^{i+1}u^{i+1} + \nu\left(x^{i+1} - \hat{x}^{i+1}\right) \tag{23}
\]
\[
\dot{\hat{A}}^{i+1} = \left(x^{i+1} - \hat{x}^{i+1}\right)\left(x^{i+1}\right)^T + h\sum_{j=1}^{W}\left(\dot{X}_j - \hat{A}^{i+1}X_j - \hat{B}^{i+1}U_j\right)X_j^T \tag{24}
\]
\[
\dot{\hat{B}}^{i+1} = \left(x^{i+1} - \hat{x}^{i+1}\right)\left(u^{i+1}\right)^T + h\sum_{j=1}^{W}\left(\dot{X}_j - \hat{A}^{i+1}X_j - \hat{B}^{i+1}U_j\right)U_j^T \tag{25}
\]
where $\hat{x}^{i+1}(t_0) = x_0$, $\hat{A}^{i+1}(t_0) = \hat{A}^i(t_f)$, and $\hat{B}^{i+1}(t_0) = \hat{B}^i(t_f)$.
Step4: Calculate the cost function
\[
J(u^{i+1}, r^k) = \int_{t_0}^{t_f} P\left(x^{i+1}, u^{i+1}, r^k, t\right)dt. \tag{26}
\]
If $J(u^{i+1}, r^k) - J(u^i, r^k) > 0$, set $a^i \leftarrow a^i + h_a$ and $b^i \leftarrow b^i + h_b$, where $h_a > 0$ and $h_b > 0$ are the iterative steps, and go to Step2. Otherwise, go to Step5.
Step5: If $\|u^{i+1} - u^i\| \le \varepsilon_1$, go to Step6. Otherwise, set $i \leftarrow i + 1$, $a^i \leftarrow a^0$, $b^i \leftarrow b^0$, and go to Step2.
Step6: If $\|\hat{A}^{i+1} - \hat{A}^i\| + \|\hat{B}^{i+1} - \hat{B}^i\| \le \varepsilon_2$, go to Step7. Otherwise, set $i \leftarrow i + 1$, $a^i \leftarrow a^0$, $b^i \leftarrow b^0$, and go to Step2.
Step7: If $S(x^{i+1}, u^{i+1}, t) \le \varepsilon_3$, stop. Otherwise, set $i \leftarrow i + 1$, $a^i \leftarrow a^0$, $b^i \leftarrow b^0$, $r^k \leftarrow r^k + h_r$, where $h_r > 0$ is the iterative step, and go to Step2.

Remark 1 A flow chart of the algorithm is presented in Fig. 1. Three types of convergence accuracy judgments are used, where $\varepsilon_1$ determines the convergence accuracy of the optimal control, $\varepsilon_2$ determines the convergence accuracy of the system model, and $\varepsilon_3$ determines the convergence accuracy of the mixed state-control variable inequality constraint.

Remark 2 In [32], Kleinman has provided a model-based iteration method to solve the optimal control for system (1) with the infinite-horizon cost function
\[
J(u) = \int_{t_0}^{\infty}\left(x^T Q x + u^T R u\right)dt. \tag{27}
\]
Fig. 1 Model-free learning algorithm
Specifically, the iteration algorithm is given as follows.
Kleinman Algorithm:
Step1: Initialization: Choose $K^0 \in \mathbb{R}^{m\times n}$ such that $A - BK^0$ is Hurwitz. Choose the convergence accuracy $\varepsilon > 0$.
Step2: Policy Evaluation: Solve the following linear matrix equation for $P^i$
\[
\left(A - BK^i\right)^T P^i + P^i\left(A - BK^i\right) + Q + \left(K^i\right)^T R K^i = 0. \tag{28}
\]
Step3: Policy Improvement: Update the control policy using
\[
K^{i+1} = R^{-1}B^T P^i. \tag{29}
\]
Step4: Stop if $\|K^{i+1} - K^i\| \le \varepsilon$. Otherwise, set $i \leftarrow i + 1$ and go to Step2.
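The four steps of the Kleinman Algorithm can be sketched in a few lines. This is a minimal model-based illustration, not the model-free algorithm of this paper: the Lyapunov equation (28) is solved by Kronecker vectorization with NumPy, the helper names `lyap_solve` and `kleinman` are assumptions made here, and the example reuses the aircraft matrices of Sect. 6 with $Q = I$ and $R = 10$. Since that $A$ is already Hurwitz, $K^0 = 0$ is an admissible initial gain.

```python
import numpy as np


def lyap_solve(Ac: np.ndarray, M: np.ndarray) -> np.ndarray:
    """Solve Ac^T P + P Ac + M = 0 for P by Kronecker vectorization."""
    n = Ac.shape[0]
    eye = np.eye(n)
    # Row-major vec: vec(Ac^T P) = kron(Ac^T, I) vec(P), vec(P Ac) = kron(I, Ac^T) vec(P).
    L = np.kron(Ac.T, eye) + np.kron(eye, Ac.T)
    return np.linalg.solve(L, -M.reshape(-1)).reshape(n, n)


def kleinman(A, B, Q, R, K0, eps=1e-10, max_iter=50):
    """Policy iteration (28)-(29); K0 must make A - B K0 Hurwitz."""
    K = K0
    P = None
    for _ in range(max_iter):
        P = lyap_solve(A - B @ K, Q + K.T @ R @ K)   # Step2: policy evaluation
        K_next = np.linalg.solve(R, B.T @ P)         # Step3: policy improvement
        if np.linalg.norm(K_next - K) <= eps:        # Step4: stopping test
            return K_next, P
        K = K_next
    return K, P


A = np.array([[-1.01887, 0.90506, -0.00215],
              [0.82225, -1.07741, -0.17555],
              [0.0, 0.0, -1.0]])
B = np.array([[0.0], [0.0], [1.0]])
Q, R = np.eye(3), np.array([[10.0]])
K, P = kleinman(A, B, Q, R, K0=np.zeros((1, 3)))
print("K =", K)
```

At convergence $P$ satisfies the algebraic Riccati equation $A^T P + PA - PBR^{-1}B^T P + Q = 0$, so the Riccati residual is a convenient check on the iteration.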
In the Kleinman Algorithm, an accurate system model is needed and the initial control policy $u = -K^0 x$ is assumed to be stabilizing. Moreover, an infinite-horizon cost function is used, which simplifies the policy evaluation equation. In our algorithm, an arbitrary initial control policy can be chosen and an identification model is used instead of the accurate one. In addition, different from the Kleinman Algorithm, a finite-horizon cost function is used in our algorithm.

Remark 3 Model-free iterative algorithms with control inequality constraints have been widely considered. However, there are no effective model-free methods to handle state constraints due to technical difficulties. In this paper, our iterative algorithm is a supplement to existing input constraint algorithms.
5 Extension to Time-Varying Systems
Consider the following time-varying linear system
\[
\dot{x} = A(t)x + B(t)u, \quad x(t_0) = x_0 \tag{30}
\]
where $A : [t_0, t_f] \to \mathbb{R}^{n\times n}$ is the time-varying transition matrix and $B : [t_0, t_f] \to \mathbb{R}^{n\times m}$ is the time-varying distribution matrix. The cost function is defined as (2) and the mixed state-control variable inequality constraint is defined as (3). In this situation, the estimate of the system state is
\[
\dot{\hat{x}}^i = \hat{A}^i(t)x + \hat{B}^i(t)u + \nu\left(x - \hat{x}^i\right), \quad \hat{x}^i(t_0) = x_0. \tag{31}
\]
The updating law for $\Lambda^i(t) = [\hat{A}^i(t)\ \hat{B}^i(t)]^T$ is
\[
\Lambda^i(t) = \Lambda^{i-1}(t) + \left(I - \rho\,\Xi_t\right)^2\phi(x, u)\left(x - \hat{x}^i\right)^T + \rho\sum_{j=1}^{W}\phi\left(X_j^t, U_j^t\right)\left(\dot{X}_j^t - \Lambda^{i-1}(t)^T\phi\left(X_j^t, U_j^t\right)\right)^T \tag{32}
\]
where $\rho > 0$, $\phi(x, u) = [x^T\ u^T]^T$, $\{X_j^t, U_j^t, \dot{X}_j^t\}_{j=1}^{W}$ is the history stack during $t \in [t_0, t_f]$, and $\Xi_t = \sum_{j=1}^{W}\phi(X_j^t, U_j^t)\phi^T(X_j^t, U_j^t) \in \mathbb{R}^{(n+m)\times(n+m)}$, $t \in [t_0, t_f]$. We assume that

Assumption 2 There exist positive constants $\bar{\rho} > \underline{\rho} > 0$ such that $\underline{\rho}I \le \Xi_t \le \bar{\rho}I$, $t \in [t_0, t_f]$.

With Assumption 2, the following result is obtained.

Theorem 2 Consider the time-varying linear system (30). The estimates (31) and (32) will converge uniformly to the true values, i.e., $\lim_{i\to\infty}\|x - \hat{x}^i\| = 0$, $\lim_{i\to\infty}\|A(t) - \hat{A}^i(t)\| = 0$, and $\lim_{i\to\infty}\|B(t) - \hat{B}^i(t)\| = 0$, for all $t \in [t_0, t_f]$.
Proof The proof is similar to that of Theorem 1 and is omitted here.

With the updating laws (31) and (32) for the system state and system matrices, the model-free learning algorithm for the time-varying linear system (30) can be achieved. In this situation, Step3 (23)–(25) in the Model-Free Learning Algorithm is replaced by
Step3': Calculate $\hat{x}^{i+1}(t)$, $\hat{A}^{i+1}(t)$, and $\hat{B}^{i+1}(t)$, $t \in [t_0, t_f]$, by solving
\[
\dot{\hat{x}}^{i+1} = \hat{A}^{i+1}(t)x^{i+1} + \hat{B}^{i+1}(t)u^{i+1} + \nu\left(x^{i+1} - \hat{x}^{i+1}\right), \quad \hat{x}^{i+1}(t_0) = x_0 \tag{33}
\]
\[
\Lambda^i(t) = \Lambda^{i-1}(t) + \left(I - \rho\,\Xi_t\right)^2\phi(x, u)\left(x - \hat{x}^i\right)^T + \rho\sum_{j=1}^{W}\phi\left(X_j^t, U_j^t\right)\left(\dot{X}_j^t - \Lambda^{i-1}(t)^T\phi\left(X_j^t, U_j^t\right)\right)^T. \tag{34}
\]
Remark 4 For time-invariant systems, a point set of history stack is needed for system identification. In contrast, a trajectory set of history stack is used in the time-varying situation, where each trajectory is established during $t \in [t_0, t_f]$. Besides, in Assumption 2, the matrix $\Xi_t$ is assumed to be positive definite and uniformly bounded for all $t \in [t_0, t_f]$, which is much stronger than Assumption 1.

Remark 5 For time-invariant systems, the model-free optimal control problems have been well settled by using adaptive/approximate dynamic programming. However, for time-varying systems, adaptive/approximate dynamic programming suffers from inherent technical obstacles. Different from the existing research results, our algorithm is established under Pontryagin's framework and successfully solves the time-varying optimal control problems.
6 Simulation
In this section, we present an example to illustrate the performance of our algorithm. Consider the following aircraft system described by [33]
\[
\dot{x} = \begin{pmatrix} -1.01887 & 0.90506 & -0.00215 \\ 0.82225 & -1.07741 & -0.17555 \\ 0 & 0 & -1 \end{pmatrix}x + \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}u. \tag{35}
\]
The cost function is
\[
J = \int_{0}^{10}\left(x^T x + 10u^2\right)dt. \tag{36}
\]
Fig. 2 The initial state trajectory (states x1, x2, x3 versus time)
The inequality constraint is assumed to be
\[
x_1^2 + x_2^2 < 2. \tag{37}
\]
We set the initial condition of the states as $x_0 = [1\ \ 0.5\ \ -1]^T$. We start the iteration from
\[
\hat{A}^0 = \begin{pmatrix} 1 & 0 & -1 \\ 0.5 & -0.5 & 0 \\ 0 & 0 & -1 \end{pmatrix} \quad \text{and} \quad \hat{B}^0 = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{38}
\]
By using the Model-Free Learning Algorithm, we obtain that the identification model is
\[
\hat{A}_{id} = \begin{pmatrix} -1.0189 & 0.9051 & -0.0022 \\ 0.8223 & -1.0774 & -0.1755 \\ 0 & 0 & -1 \end{pmatrix} \quad \text{and} \quad \hat{B}_{id} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \tag{39}
\]
The cost function converges from J = 4.7774e+03 to J = 4.2208, so the cost value is reduced significantly. Figures 2, 3, 4 and 5 show the state trajectories and control inputs.
Fig. 3 The initial control input (u versus time)
Fig. 4 The optimal state trajectory (states x1, x2, x3 versus time)
Fig. 5 The optimal control input (u versus time)

7 Conclusion
In this paper, we have proposed a model-free iteration algorithm to obtain the optimal control policy for a linear system. A novel identification model has been introduced, based on which policy updating laws under Pontryagin's framework have been established. The model-free iterative algorithm can be implemented in the presence of time-varying system parameters and mixed state-control inequality constraints. A simulation example has been given to illustrate the effectiveness of the developed methods.

Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant No. 61973044).
References
1. Fliess, M., Join, C.: Model-free control. Int. J. Control 86(12), 2228–2252 (2013)
2. Hou, Z., Jin, S.: Model Free Adaptive Control: Theory and Applications. CRC Press (2013)
3. Wang, L., Li, S., Liu, J., et al.: Data-driven path-following control of underactuated ships based on antenna mutation beetle swarm predictive reinforcement learning. Appl. Ocean Res. 124, 103207 (2022)
4. Quan, Q.: Sensor calibration and measurement model. In: Introduction to Multicopter Design and Control, pp. 147–172. Springer, Berlin (2017)
5. Engel, Y., Szabo, P., Volkinshtein, D.: Learning to control an octopus arm with Gaussian process temporal difference methods. Adv. Neural Inf. Process. Syst. 18 (2005)
6. Silver, D., Lever, G., Heess, N., et al.: Deterministic policy gradient algorithms. In: International Conference on Machine Learning, pp. 387–395. PMLR (2014)
7. Teng, X., Ichiye, T.: Dynamical model for the counteracting effects of trimethylamine N-Oxide on urea in aqueous solutions under pressure. J. Phys. Chem. B 124(10), 1978–1986 (2020)
8. Teng, X., Ichiye, T.: Dynamical effects of trimethylamine N-Oxide on aqueous solutions of urea. J. Phys. Chem. 123(5), 1108–1115 (2019)
9. Teng, X., Huang, Q., Dharmawardhana, C., Ichiye, T.: Diffusion of aqueous solutions of ionic, zwitterionic, and polar solutes. J. Chem. Phys. 148(22), 222827 (2018)
10. Zhang, H., Zhang, X., Luo, Y., Yang, J.: An overview of research on adaptive dynamic programming. Acta Automatica Sin. 39(4), 303–311 (2013)
11. Lewis, F.L., Vrabie, D.: Reinforcement learning and adaptive dynamic programming for feedback control. IEEE Circ. Syst. Mag. 9(3), 32–50 (2009)
12. Wang, Y., O'Donoghue, B., Boyd, S.: Approximate dynamic programming via iterated Bellman inequalities. Int. J. Robust Nonlinear Control 25(10), 1472–1496 (2015)
13. Liu, D., Xue, S., Zhao, B., et al.: Adaptive dynamic programming for control: a survey and recent advances. IEEE Trans. Syst. Man Cybern. Syst. 51(1), 142–160 (2020)
14. Liu, D., Li, H., Wang, D.: Data-based self-learning optimal control: research progress and prospects. Acta Automatica Sin. 39(11), 1858–1870 (2013)
15. Werbos, P.: Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Harvard University, USA (1974)
16. Bertsekas, D.: Reinforcement learning and optimal control. Athena Sci. (2019)
17. Haddad, W., Chellaboina, V.: Nonlinear Dynamical Systems and Control: A Lyapunov-Based Approach. Princeton University Press, New Jersey (2008)
18. Karafyllis, I., Jiang, Z.: Stability and Stabilization of Nonlinear Systems. Springer, London (2011)
19. Huang, L., Zheng, Y.P., Zhang, D.: The second method of Lyapunov and the analytical design of the optimum controller. Acta Automatica Sin. 2(4), 202–218 (1964)
20. Kleinman, D.: On an iterative technique for Riccati equation computations. IEEE Trans. Autom. Control 13(1), 114–115 (1968)
21. Saridis, G., Lee, C.: An approximation theory of optimal control for trainable manipulators. IEEE Trans. Syst. Man Cybern. 9(3), 152–159 (1979)
22. Vamvoudakis, K., Lewis, F.: Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 46(5), 878–888 (2010)
23. Yang, X., Liu, D., Wang, D.: Reinforcement learning for adaptive optimal control of unknown continuous-time nonlinear systems with input constraints. Int. J. Control 87(3), 553–566 (2014)
24. Bhasin, S., Kamalapurkar, R., Johnson, M., Vamvoudakis, K.G., Lewis, F.L., Dixon, W.E.: A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems. Automatica 49, 82–92 (2013)
25. Zhang, H., Cui, L., Zhang, X., Luo, Y.: Data-driven robust approximate optimal tracking control for unknown general nonlinear systems using adaptive dynamic programming method. IEEE Trans. Neural Netw. 22, 2226–2236 (2011)
26. Vamvoudakis, K.G.: Q-learning for continuous-time linear systems: a model-free infinite horizon optimal control approach. Syst. Control Lett. 100, 14–20 (2017)
27. Modares, H., Lewis, F.L., Naghibi-Sistani, M.: Integral reinforcement learning and experience replay for adaptive optimal control of partially-unknown constrained-input continuous-time systems. Automatica 50(1), 193–202 (2014)
28. Baldi, S., Valmorbida, G., Papachristodoulou, A., et al.: Online policy iterations for optimal control of input-saturated systems. In: 2016 American Control Conference (ACC), pp. 5734–5739. IEEE (2016)
29. Rizvi, S.A.A., Lin, Z.: Model-free global stabilization of continuous-time linear systems with saturating actuators using adaptive dynamic programming. In: 2019 IEEE 58th Conference on Decision and Control (CDC), pp. 145–150. IEEE (2019)
30. Abu-Khalaf, M., Lewis, F.L.: Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5), 779–791 (2005)
31. Jacobson, D.H., Lele, M.M.: A transformation technique for optimal control problems with a state variable inequality constraint. IEEE Trans. Autom. Control 14(5), 457–464 (1969)
32. Kleinman, D.: On an iterative technique for Riccati equation computations. IEEE Trans. Autom. Control 13(1), 114–115 (1968)
33. Modares, H., Lewis, F.L.: Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning. IEEE Trans. Autom. Control 59(11), 3051–3056 (2014)
Binary Consensus of Multi-agent Systems With Privacy Preserving and Random Communication Noises Shuochen Wang, Jian Wang, Hongyong Yang, Chuangchuang Zhang, and Li Liu
Abstract Collaborative control in multi-agent systems has appeared as a subject of considerable interest in the control science, and the binary consensus in multi-agent systems has received extensive attention. This paper focuses on the binary consensus problem in discrete multi-agent systems under uncertain random communication noises environments, where every intelligent agent can only use its own information and the information from its neighbors for information exchange or observation. In order to improve consensus accuracy, an algorithm is proposed that can achieve asymptotic convergence of the agents, minimizing the covariance of the consensus error when the privacy level reaches a certain condition. The paper employs matrix theory and automatic control theory to analyze the system’s consistency and privacy. Finally, the results prove that the control protocol delivered in this paper achieves a certain level of privacy protection while the consensus accuracy of the multi-agent systems is also improved. Keywords Binary Consensus · Multi-agent Systems · Privacy Preserving
S. Wang · H. Yang (B) · C. Zhang · L. Liu
School of Information and Electrical Engineering, Ludong University, Yantai 264000, China
e-mail: [email protected]
J. Wang
Yantai Municipal People's Procuratorate, Yantai 264000, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_30

1 Introduction
Multi-agent systems (MAS) have received attention from experts in fields such as computer science. The consensus problem is a cornerstone of MAS research and one of its most extensively examined subjects. In most cases the agents are not only in a cooperative relationship; nodes may also compete with one another, so research on MAS with a "cooperation-competition" relationship is more applicable to real-world situations. For this reason, the study of bipartite consensus in MAS has attracted widespread attention in recent years [1]. In [2], the binary consensus problem in heterogeneous MAS was investigated in the presence of communication noise. In [3], it was proven that, under a fixed topology, a system satisfying the structural balance conditions achieves asymptotically unbiased mean-square consensus; for time-varying topologies, the conditions required for mean-square average consensus and almost-sure consensus were delineated. In [4], leader-following bipartite consensus of a single-integrator system was investigated while accounting for measurement noise. In [5], fixed-time bipartite consensus of nonlinear MAS exposed to external disturbances was addressed, and a number of sufficient conditions were proposed to ensure fixed-time bipartite consensus on directed signed networks. In [6], the secure state estimation and control problem in MAS was studied, and a filter gain was derived by solving an equality-constrained optimization problem. In [7], a time-varying consensus gain was introduced to mitigate the adverse impact of measurement noise, giving a novel time-varying stochastic protocol that marked the first attempt to resolve the bipartite consensus problem under such noise. In [8], the fixed-time bipartite consensus problem for MAS was resolved both with and without disturbances.
This paper investigates a bipartite consensus algorithm with privacy protection and analyzes the bipartite consensus of a first-order system under uncertain communication environments. Nodes can obtain the states of others through measurement and also through communication with each other, and a privacy protection function is added during communication between nodes. This enhances the accuracy of consensus and ensures the security of data.
2 Preliminaries
2.1 Graph Theory
Consider an undirected and connected graph $G = (V, E, A)$, where $V = \{v_1, v_2, \dots, v_N\}$ is the set of nodes, $E \subseteq V \times V$ is the set of edges, and the matrix $A = [a_{ij}] \in R^{N \times N}$ describes the interconnections among nodes. An undirected edge between node $i$ and node $j$ is written $\{i, j\}$, and $\{i, j\} \in E$ means that node $i$ can receive messages from node $j$. If $\{i, j\} \in E$, then $a_{ij} \neq 0$; otherwise, $a_{ij} = 0$. A positive $a_{ij}$ implies a cooperative relationship between node $i$ and node $j$, whereas a negative $a_{ij}$ indicates a competitive relationship between the two nodes. Let $N_i = \{ j \in V \mid (i, j) \in E \}$ denote the set of neighbors of node $i$. The degree matrix of the undirected graph is $D = \mathrm{diag}\{d_1, d_2, \dots, d_N\} \in R^{N \times N}$ with elements $d_i = \sum_{j \neq i} |a_{ij}|$, $i = 1, 2, \dots, N$. The Laplacian matrix of the graph is $L = D - A \in R^{N \times N}$, where the entries of $A$ give the weights between nodes.
2.2 Related Lemmas and Definitions
Definition 1 (Structural balance.) The signed graph $G$ is structurally balanced if $V$ can be partitioned into two non-empty subsets $V_1$ and $V_2$ with $V_1 \cup V_2 = V$ and $V_1 \cap V_2 = \emptyset$, such that $a_{ij} \ge 0$ when $i$ and $j$ belong to the same subset and $a_{ij} \le 0$ otherwise. A structurally balanced signed graph $G$ has the following properties:
(1) There exist $d_i \in \{1, -1\}$, $i \in V$, such that the diagonal matrix $D = \mathrm{diag}(d_1, d_2, \dots, d_N)$ makes all elements of $DAD$ non-negative.
(2) The Laplacian matrix of the associated unsigned graph, $L_S = DLD$, is positive semidefinite. The eigenvalues $\lambda_z(L)$, $z = 1, 2, \dots, N$, of the Laplacian matrix $L$ satisfy $\lambda_N(L) \ge \dots \ge \lambda_2(L) \ge \lambda_1(L) = 0$.
Definition 2 (Sensitivity.) For any $\delta \in R$, the initial-state data sets $D = \{x_i(0), i \in V\}$ and $D' = \{x_i'(0), i \in V\}$ are said to be adjacent if there exists $i_0 \in V$ such that
$$ |x_i(0) - x_i'(0)| \le \begin{cases} \delta, & i = i_0 \\ 0, & i \neq i_0. \end{cases} $$
Definition 3 (Bipartite consensus.) If, for any given initial state, there exists a random variable $x^*$ with $E[(x^*)^2] < \infty$ such that $\lim_{t \to \infty} E[(x_i(t) - d_i x^*)^2] = 0$, where $d_i \in \{-1, 1\}$ and $i \in V$, then the system is said to achieve bipartite consensus.
Definition 4 (Accuracy.) Let $p \in [0, 1]$ and $r \in R_{\ge 0}$. If, for any given initial state, there exists a random variable $x^*$ such that
$$ P\left\{ \left| x^* - \frac{1_N^T D x(0)}{N} \right| \le r \right\} \ge 1 - p, $$
and $x^*$ satisfies $E[x^*] = \frac{1_N^T D x(0)}{N}$ and $\mathrm{Var}(x^*) < \infty$, then the system is said to achieve $(p, r)$ accuracy.
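Definition 1 can be checked numerically. The following is a minimal Python sketch, assuming a symmetric adjacency matrix of a connected signed graph; the function name `gauge_signs` and the two example matrices are illustrative and do not come from the paper. It assigns the signs $d_i$ by breadth-first search and reports failure when no consistent assignment exists.

```python
from collections import deque

def gauge_signs(A):
    """Return d with d[i] in {1, -1} such that d[i]*A[i][j]*d[j] >= 0 for all i, j,
    or None if the signed graph is not structurally balanced.
    Assumes A is symmetric and the graph is connected."""
    n = len(A)
    d = [0] * n          # 0 means "not yet assigned"
    d[0] = 1
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in range(n):
            if A[i][j] == 0:
                continue
            # Cooperative edge: same sign; competitive edge: opposite sign.
            want = d[i] if A[i][j] > 0 else -d[i]
            if d[j] == 0:
                d[j] = want
                queue.append(j)
            elif d[j] != want:
                return None  # a cycle with an odd number of negative edges
    return d

# Hypothetical examples: nodes 0 and 1 cooperate, both compete with node 2.
A_balanced = [[0, 1, -1],
              [1, 0, -1],
              [-1, -1, 0]]
# Flipping edge (1, 2) to cooperative creates an unbalanced triangle.
A_unbalanced = [[0, 1, -1],
                [1, 0, 1],
                [-1, 1, 0]]
signs = gauge_signs(A_balanced)
```

The returned list is the diagonal of the matrix $D$ in property (1); the nodes with $d_i = 1$ and $d_i = -1$ form the subsets $V_1$ and $V_2$.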
3 A Binary Consensus Algorithm with Communication Frequency Constraints
3.1 Bipartite Consensus
This section studies the bipartite consensus of discrete-time MAS under uncertain communication environments. The dynamics of node $i$ are
$$ x_i(t+1) = x_i(t) + u_i(t), \quad t = 0, 1, \dots \tag{1} $$
where $x_i(t)$ and $u_i(t)$ denote the state and the control protocol of node $i$, respectively. Node $i$ can receive messages from its neighboring nodes, and the information about node $i$ available to its neighbors is
$$ y_i(t) = x_i(t) + \Gamma(t)\omega_i(t). \tag{2} $$
The noise $\omega_i(t)$ is a normally distributed random variable, $\omega_i(t) \sim N(\mu, \sigma^2)$, with variance $\sigma^2$. $\Gamma(t)$ is a two-point (Bernoulli) random variable taking values in $\{0, 1\}$. If $\Gamma(t) = 0$, the nodes send their information to each other with a certain probability $p$. If $\Gamma(t) = 1$, a node can obtain the states of the others only through noisy measurement:
$$ y_i(t) = \begin{cases} x_i(t), & \Gamma(t) = 0 \\ x_i(t) + \omega_i(t), & \Gamma(t) = 1 \end{cases} \tag{3} $$
The paper proposes a binary consensus control protocol for MAS with communication frequency constraints. Here $\eta(t)$ is a positive step size, and each node $i$ updates its state according to
$$ u_i(t) = -\eta(t) \sum_{j \in N_i} |a_{ij}| \left( x_i(t) - \mathrm{sgn}(a_{ij})\, y_j(t) \right), \tag{4} $$
$$ x_i(t+1) = x_i(t) - \eta(t) \sum_{j \in N_i} |a_{ij}| \left( x_i(t) - \mathrm{sgn}(a_{ij})\, y_j(t) \right). \tag{5} $$
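The update in Eq. (5) can be tried on a small example. The Python sketch below rests on several assumptions that are not stated in the paper: the noise is zero-mean, a single draw of $\Gamma(t)$ is shared by all links at each step, $\Gamma(t) = 1$ occurs with probability `p_noisy`, and the graph is a hypothetical signed complete graph on four nodes with groups {0, 1} and {2, 3}. For that graph the associated unsigned Laplacian has $\lambda_2 = 4$ and norm 4, so the step size $\eta(t) = 0.4/(t+1)$ stays below $2\lambda_2/\|L_D\|^2 = 0.5$.

```python
import random

def simulate(A, x0, steps, eta, sigma, p_noisy, seed=0):
    """Iterate Eq. (5): x_i <- x_i - eta(t) * sum_j |a_ij| * (x_i - sgn(a_ij) * y_j)."""
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)
    for t in range(steps):
        # Gamma(t) = 1 means neighbours only see a noisy measurement (Eq. (3)).
        gamma = 1 if rng.random() < p_noisy else 0
        y = [x[j] + gamma * rng.gauss(0.0, sigma) for j in range(n)]  # Eq. (2)
        new_x = []
        for i in range(n):
            u = 0.0
            for j in range(n):
                a = A[i][j]
                if a != 0:
                    sgn = 1.0 if a > 0 else -1.0
                    u -= eta(t) * abs(a) * (x[i] - sgn * y[j])        # Eq. (4)
            new_x.append(x[i] + u)
        x = new_x
    return x

def eta(t):
    # Hypothetical decaying step size; eta(t)^2 -> 0 and eta(t) < 0.5 for all t.
    return 0.4 / (t + 1)

d = [1, 1, -1, -1]                      # group signs of the example graph
A = [[0 if i == j else d[i] * d[j] for j in range(4)] for i in range(4)]
x0 = [25.0, -10.0, 0.0, 10.0]

x_noise_free = simulate(A, x0, steps=2000, eta=eta, sigma=0.0, p_noisy=0.5)
x_noisy = simulate(A, x0, steps=2000, eta=eta, sigma=1.0, p_noisy=0.5)
```

Without noise the quantity $\sum_i d_i x_i(t)$ is conserved, so the states approach $d_i \cdot 1.25$ for this initial condition; with noise the common magnitude becomes a random variable, which is the situation analyzed in Theorem 1.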
Theorem 1 Suppose the protocol in Eqs. (4) and (5) is applied under the condition of structural balance and the step size $\eta(t)$ satisfies (1) $\lim_{t \to \infty} \eta^2(t)\sigma^2 p = 0$ and (2) $0 < \eta(t) < \frac{2\lambda_2(L_D)}{\|L_D\|^2}$. Then the algorithm achieves bipartite consensus in the MAS.
Proof Stacking the node states and noises into the vectors $x(t)$ and $\omega(t)$, Eq. (5) can be restated as
$$ x(t+1) = (I - \eta(t)L)\,x(t) + \Gamma(t)\eta(t)A\,\omega(t). \tag{6} $$
Let $z(t) = Dx(t)$ and $L_D = DLD$, where $D = \mathrm{diag}(d_1, d_2, \dots, d_N)$ with $d_i \in \{1, -1\}$. Because $D$ is diagonal with entries $\pm 1$, it is known that $D^{-1} = D$. From this, the following conclusion is obtained:
$$ z(t+1) = \left(I - \eta(t) D L D^{-1}\right) z(t) + \Gamma(t)\eta(t) D A\,\omega(t) = \left(I - \eta(t) L_D\right) z(t) + \Gamma(t)\eta(t) D A\,\omega(t). \tag{7} $$
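Let $H = (1/N)\,1 1^T$, $\delta(t) = (I_N - H) z(t)$, and $\Phi(t) = \delta^T(t)\delta(t)$. Since $L_D 1 = 0$ and $L_D$ is symmetric, $H L_D = 0$ and $L_D z(t) = L_D \delta(t)$, so we can derive
$$ \begin{aligned} \delta(t+1) &= (I_N - H)\, z(t+1) \\ &= z(t) - \eta(t) L_D z(t) + \Gamma(t)\eta(t) D A\,\omega(t) - H z(t) - \Gamma(t)\eta(t) H D A\,\omega(t) \\ &= \delta(t) - \eta(t) L_D z(t) + \Gamma(t)\eta(t)\,(I_N - H) D A\,\omega(t) \\ &= [I_N - \eta(t) L_D]\,\delta(t) + \Gamma(t)\eta(t)\,(I_N - H) D A\,\omega(t). \end{aligned} \tag{8} $$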
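We can conclude that
$$ \begin{aligned} \Phi(t+1) = {} & \Phi(t) - \eta(t)\,\delta^T(t)\left(L_D + L_D^T\right)\delta(t) + \eta^2(t)\,\delta^T(t) L_D^T L_D\,\delta(t) \\ & + 2\eta(t)\,\delta^T(t)\Gamma(t)\left(I_N - \eta(t) L_D^T\right)(I_N - H) D A\,\omega(t) \\ & + \eta^2(t)\,\omega^T(t)\Gamma^2(t) A^T D^T (I_N - H)^T (I_N - H) D A\,\omega(t). \end{aligned} \tag{9} $$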
Since $1^T\delta(t) = 0$ and $\lambda_2(L_D) = \lambda_2(L)$, it follows from [3] that $\delta^T(t) L_D \delta(t) \ge \lambda_2(L)\Phi(t)$ holds. Using the symmetry of $L_D$, we can deduce
$$ \begin{aligned} \Phi(t+1) \le {} & \left(1 - 2\eta(t)\lambda_2(L) + \eta^2(t)\|L_D\|^2\right)\Phi(t) \\ & + \eta^2(t)\,\omega^T(t)\Gamma^2(t) A^T D^T (I_N - H)^T (I_N - H) D A\,\omega(t) \\ & + 2\eta(t)\,\delta^T(t)\Gamma(t)\left(I_N - \eta(t) L_D^T\right)(I_N - H) D A\,\omega(t). \end{aligned} \tag{10} $$
Applying the expectation operation to Eq. (10), we derive
$$ E[\Phi(t+1)] \le \left(1 - 2\eta(t)\lambda_2(L) + \eta^2(t)\|L_D\|^2\right) E[\Phi(t)] + \eta^2(t)\,\|I_N - H\|^2 \|D\|^2 \|A\|^2\, E[\Gamma^2(t)]\, E[\omega(t)\omega^T(t)]. \tag{11} $$
For the two-point variable $\Gamma(t)$ we have $E[\Gamma^2(t)] = \mathrm{Var}(\Gamma(t)) + (E[\Gamma(t)])^2 = p$, and $E[\omega(t)\omega^T(t)] = \sigma^2$. Therefore, Eq. (11) can be rewritten as
$$ E[\Phi(t+1)] \le \left(1 - 2\eta(t)\lambda_2(L) + \eta^2(t)\|L_D\|^2\right) E[\Phi(t)] + \eta^2(t)\,\|I_N - H\|^2 \|D\|^2 \|A\|^2 \sigma^2 p. \tag{12} $$
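According to condition (1) in Theorem 1, $\lim_{t\to\infty}\eta^2(t)\sigma^2 p = 0$, so $\lim_{t\to\infty}\eta^2(t)\|I_N - H\|^2\|D\|^2\|A\|^2\sigma^2 p = 0$. According to condition (2) in Theorem 1, $0 < \eta(t) < 2\lambda_2(L_D)/\|L_D\|^2$ implies $0 < 1 - 2\eta(t)\lambda_2(L) + \eta^2(t)\|L_D\|^2 < 1$. We can then deduce $0 < E[\Phi(t+1)]/E[\Phi(t)] < 1$, so $E[\Phi(t)]$ decreases continuously and $E[\Phi(t)] \to 0$ as $t \to \infty$, which indicates that the final states of the agents converge. Furthermore, as $t$ approaches infinity we have $E[\Phi(t)] = E[\delta^T(t)\delta(t)] \to 0$, and we can conclude
$$ \begin{aligned} \lim_{t\to\infty} E[\Phi(t)] &= E\left[\delta^T(t)\delta(t)\right] \\ &= E\left[\left(\tfrac{N-1}{N} z_1 - \tfrac{1}{N} z_2 - \dots - \tfrac{1}{N} z_N\right)^2\right] + E\left[\left(\tfrac{N-1}{N} z_2 - \tfrac{1}{N} z_1 - \dots - \tfrac{1}{N} z_N\right)^2\right] \\ &\quad + \dots + E\left[\left(\tfrac{N-1}{N} z_N - \tfrac{1}{N} z_1 - \dots - \tfrac{1}{N} z_{N-1}\right)^2\right] \\ &= E\left[\left(z_1 - \tfrac{1}{N} 1_N^T z(t)\right)^2 + \dots + \left(z_N - \tfrac{1}{N} 1_N^T z(t)\right)^2\right] \to 0. \end{aligned} \tag{13} $$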
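We can derive the following equation, using $1_N^T L_D = 0$:
$$ \begin{aligned} 1_N^T z(t) &= 1_N^T\left(I_N - \eta(t-1) L_D\right) z(t-1) + \eta(t-1)\Gamma(t-1)\left(1_N^T D A\right)\omega(t-1) \\ &= 1_N^T z(t-1) + \eta(t-1)\Gamma(t-1)\, 1_N^T D A\,\omega(t-1). \end{aligned} \tag{14} $$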
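Iterating Eq. (14) with $z(t) = Dx(t)$, and noting that structural balance gives $d_i a_{ij} d_j = |a_{ij}|$, so that $1_N^T D A\,\omega = \sum_{i\in V} d_i c_i \omega_i$ with $c_i = \sum_{j} |a_{ij}|$, we obtain
$$ \sum_{i\in V} d_i x_i(t) = 1_N^T z(t) = \sum_{i\in V} z_i(0) + \sum_{j=1}^{t}\sum_{i\in V} \Gamma(j-1)\eta(j-1)\, d_i c_i\, \omega_i(j-1). \tag{15} $$
When $t \to \infty$ we can deduce
$$ 1_N^T z(t) = \sum_{i\in V} z_i(0) + \sum_{j=1}^{\infty}\sum_{i\in V} \Gamma(j-1)\eta(j-1)\, d_i c_i\, \omega_i(j-1). \tag{16} $$
Let $1_N^T z(t) = N x^*$, so that $x^* = \frac{1}{N} 1_N^T z(t)$. Using the conclusion derived from Eq. (13), together with $z_i = d_i x_i$ and $d_i^2 = 1$, we have
$$ \lim_{t\to\infty} E[\Phi(t)] = E\left[(z_1 - x^*)^2 + (z_2 - x^*)^2 + \dots + (z_N - x^*)^2\right] = \sum_{i=1}^{N} E\left[(x_i - d_i x^*)^2\right] = 0. \tag{17} $$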
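We obtain $\sum_{i=1}^{N} E[(x_i - d_i x^*)^2] = 0$, which satisfies the requirements of Definition 3. When $t \to \infty$, $x_i$ converges to $d_i x^*$, where $x^*$ is a random variable satisfying $E[x^*] = \frac{1_N^T D x(0)}{N}$. For $i \in V$ and $t \in \mathbb{N}$, the expectation and variance of $x^*$ are computed in Eqs. (18) and (19):
$$ \begin{aligned} E[x^*] &= E\left[\frac{1}{N}\sum_{i\in V} z_i(0) + \frac{1}{N}\sum_{j=0}^{\infty}\sum_{i\in V} \Gamma(j)\eta(j)\, d_i c_i\, \omega_i(j)\right] \\ &= \frac{1}{N}\sum_{i\in V} z_i(0) = \frac{1}{N}\sum_{i\in V} d_i x_i(0), \end{aligned} \tag{18} $$
$$ \begin{aligned} \mathrm{Var}(x^*) &= \frac{1}{N^2}\sum_{j=0}^{\infty}\sum_{i\in V} \eta^2(j)\, d_i^2 c_i^2\, E\left[\omega_i^2(j)\right] E\left[\Gamma^2(j)\right] \\ &= \frac{\sum_{i\in V} c_i^2\, p}{N^2}\sum_{j=0}^{\infty} \eta^2(j)\sigma^2(j). \end{aligned} \tag{19} $$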
From the analysis of Eq. (12), the states of the agents ultimately reach agreement. From Eqs. (18) and (19), we conclude that $x_i$ converges to $d_i x^*$. The above formulas satisfy the conditions of Definition 3, so the system achieves mean-square consensus. The proof is complete.
Moreover, Chebyshev's inequality gives
$$ P\left\{ |x^* - E[x^*]| < r \right\} \ge 1 - \frac{\mathrm{Var}(x^*)}{r^2}. \tag{20} $$
In addition, we choose the precision $r = \frac{\sum_{i\in V} c_i^2\, p}{N^2}\sum_{j=0}^{\infty} \eta^2(j)\sigma^2(j)$ to satisfy the requirements of Definition 4. The proposed control protocol is then said to achieve $(p, r)$ precision.
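Eqs. (19) and (20) can be evaluated numerically for a concrete step-size sequence. The Python sketch below is only illustrative: it assumes a constant noise level $\sigma$, truncates the infinite series after a fixed number of terms, and uses two separate hypothetical names, `p_noisy` for $E[\Gamma^2(t)]$ and `p_fail` for the failure probability in Definition 4, since the text writes both as $p$. The radius is obtained by solving $\mathrm{Var}(x^*)/r^2 = p_{fail}$ in Eq. (20), which is one possible way to pick $r$.

```python
import math

def consensus_variance(c, p_noisy, sigma, eta, terms=100000):
    """Truncated evaluation of Eq. (19) with constant sigma."""
    n = len(c)
    series = sum(eta(j) ** 2 * sigma ** 2 for j in range(terms))
    return sum(ci ** 2 for ci in c) * p_noisy / n ** 2 * series

def chebyshev_radius(var, p_fail):
    """Radius r with 1 - var / r**2 = 1 - p_fail, from Eq. (20)."""
    return math.sqrt(var / p_fail)

def eta(j):
    # Hypothetical step size, square-summable so the variance is finite.
    return 0.4 / (j + 1)

c = [3, 3, 3, 3]   # degrees c_i of a 4-node complete graph with unit weights
var = consensus_variance(c, p_noisy=0.5, sigma=1.0, eta=eta)
r = chebyshev_radius(var, p_fail=0.05)
```

For this choice the series equals $0.16\sum_{k\ge 1} 1/k^2$, so the variance is finite, which is what Definition 4 requires.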
3.2 Privacy Analysis
Theorem 2 Assuming structural balance holds, the sensitivity of this algorithm satisfies
$$ l(t) \le \begin{cases} \delta, & t = 1 \\ \prod_{l=0}^{t-2} \left(1 - \eta(l)\, c_{\min}\right)\delta, & t \ge 2, \end{cases} $$
and the privacy level is $\epsilon = \sum_{t=1}^{T} \frac{l(t)}{b(t)}$.
Proof Let $P^{(a)} = \{x_i^{(a)}(0), i \in V\}$ and $P^{(b)} = \{x_i^{(b)}(0), i \in V\}$ be sets of observations, and let $R^{(a)} = \{q(P^{(a)}, W) : W \in \mathcal{W}\}$ and $R^{(b)} = \{q(P^{(b)}, W) : W \in \mathcal{W}\}$ represent the sets of possible trajectories over the observation set $\mathcal{W}$. Let $f(P^{(a)}, q(P^{(a)}, W))$ and $f(P^{(b)}, q(P^{(b)}, W))$ denote the probability density functions of the trajectories. For $n \in \{a, b\}$, the state update reads
$$ x_i^{(n)}(t) = \left(1 - \eta(t-1)c_i\right) x_i^{(n)}(t-1) + \eta(t-1)\sum_{j\in N_i} |a_{ij}|\,\mathrm{sgn}(a_{ij})\, y_j(t-1). \tag{21} $$
By iteration, we can obtain
$$ x_i^{(b)}(t) - x_i^{(a)}(t) = \prod_{l=0}^{t-1}\left(1 - \eta(l)c_i\right)\left(x_i^{(b)}(0) - x_i^{(a)}(0)\right). \tag{22} $$
When $t \ge 2$,
$$ \left\| \rho\left(R^{(b)}, W\right)(t-1) - \rho\left(R^{(a)}, W\right)(t-1) \right\|_1 \le \prod_{l=0}^{t-2}\left(1 - \eta(l)\, c_{\min}\right)\delta. \tag{23} $$
We can conclude that
$$ \frac{P\{R^{(a)}\}}{P\{R^{(b)}\}} \le \exp\left(\sum_{t=1}^{T}\frac{\sum_{i\in V}\left|x_i^{(a)}(t-1) - x_i^{(b)}(t-1)\right|}{b(t)}\right) = \exp\left(\sum_{t=1}^{T}\frac{l(t)}{b(t)}\right). \tag{24} $$
Let $R^{(a)}$ be the output state with the communication frequency constraint, in which an agent obtains states partly through probabilistic measurement and partly through ideal information reception, and let $R^{(b)}$ be the output state obtained by the agent through measurement alone. Compared with $R^{(b)}$, $R^{(a)}$ reduces the covariance of the noise, improves the consensus precision, and achieves a certain privacy protection level $\epsilon$. A smaller $l(t)$ implies improved privacy.
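The bound in Theorem 2 is easy to tabulate. The Python sketch below is a hedged illustration: the scale sequence $b(t)$ is not specified in the text, so a constant `b` is assumed here, and the step size, $c_{\min}$ and $\delta$ are hypothetical values chosen so that every factor $1 - \eta(l)c_{\min}$ lies between 0 and 1.

```python
def sensitivity_bounds(delta, eta, c_min, horizon):
    """Upper bounds on l(t) for t = 1..horizon as stated in Theorem 2."""
    bounds = []
    prod = 1.0
    for t in range(1, horizon + 1):
        if t >= 2:
            # Extend the product over l = 0..t-2 by the factor with l = t-2.
            prod *= 1.0 - eta(t - 2) * c_min
        bounds.append(prod * delta)
    return bounds

def privacy_level(bounds, b):
    """epsilon = sum_{t=1}^{T} l(t) / b(t), using the bounds in place of l(t)."""
    return sum(l_t / b(t) for t, l_t in enumerate(bounds, start=1))

# Hypothetical parameters: constant step 0.1, minimum degree 3, delta = 1, b(t) = 1.
bounds = sensitivity_bounds(delta=1.0, eta=lambda l: 0.1, c_min=3, horizon=3)
eps = privacy_level(bounds, b=lambda t: 1.0)
```

Because the bounds shrink geometrically in this example, later time steps contribute less to $\epsilon$, in line with the remark that a smaller $l(t)$ implies improved privacy.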
4 Simulation
In the simulation, six agents are selected. The structure of the agents under the balanced network is shown in Fig. 1. The initial state is x(0) = [25, −10, 0, 10, −15, 3]. The edge weights −1 and 1 represent competitive and cooperative relationships between the nodes, respectively.
Fig. 1 Structure diagram of agent states under balanced structure conditions
Fig. 2 Ideal state of the agents
Fig. 3 The state of the intelligent agent with joining the communication frequency
According to the states in Figs. 2 and 3, the control protocol achieves consensus, and the consensus precision is higher than that of the agents without the communication frequency constraint. Fig. 3 shows that the agents eventually tend to the same value even under random communication noise. Fig. 4 shows that the states of the agents eventually tend to the same value in an ideal communication environment.
5 Conclusion
This paper studies consensus in uncertain communication environments. The results show that the consensus accuracy of the algorithm is improved and the privacy of the agents is protected. In addition, the impact of sensitivity on the robustness of privacy protection is analyzed.
Fig. 4 The state of the intelligent agent without joining the communication frequency
Acknowledgements The research is supported by the National Natural Science Foundation of China (61673200) and the Natural Science Foundation of Shandong Province of China (ZR2022MF231).
References
1. Hongbin, L.: Research on Binary Consensus Problem in Multi-agent Systems Security. Hangzhou Dianzi University (2020)
2. Dakang, T.: Group Control and Consensus Analysis of Multi-agent Systems with Time Delays. Chongqing University of Posts and Telecommunications (2020)
3. Li, T., Zhang, J.-F.: Consensus conditions of multi-agent systems with time-varying topologies and stochastic communication noises. IEEE Trans. Autom. Control 2043–2057
4. Ma, C.-Q., Xie, L.: Necessary and sufficient conditions for leader-following bipartite consensus with measurement noise. IEEE Trans. Syst. Man Cybern. Syst. 50(5), 1976–1981 (May 2020). https://doi.org/10.1109/TSMC.2018.2819703
5. Zhou, X., Liu, X., Cao, J., Song, M.: Fixed-time bipartite consensus of nonlinear multi-agent systems under directed signed graphs with disturbances. J. Franklin Inst. 359(6), 2693–2709 (2022)
6. Chen, G., Zhang, Y., Gu, S., Hu, W.: Resilient state estimation and control of cyber-physical systems against false data injection attacks on both actuator and sensors. IEEE Trans. Control Netw. Syst., pp. 500–510 (Mar 2022)
7. Ma, C.-Q., Qin, Z.-Y.: Bipartite consensus on networks of agents with antagonistic interactions and measurement noises. IET Control Theory Appl. 10, 2306–2313
8. Deng, Q., Wu, J., Han, T., Yang, Q.-S., Cai, X.-S.: Fixed-time bipartite consensus of multi-agent systems with disturbances. Physica A: Stat. Mech. Appl. 516, 37–49 (2019)
Highway Abandoned Object Detection Based on Foreground Extraction
Yubin Wang and Junyong Zhai
Abstract With the development of highways, abandoned objects pose significant threats to driving safety. This paper proposes an abandoned object detection method for highways based on YOLOv7 and a Background-Separated Gaussian Mixture Model (BS-GMM). First, BS-GMM is employed to extract the foreground from the video and obtain a binary image of the abandoned objects. Finally, the Intersection over Union (IoU) matching algorithm is applied to differentiate abandoned objects from vehicles. In validation on highway videos, the pixel-level error rate of the BS-GMM algorithm was 2.01%, lower than the 3.67% of the Gaussian Mixture Model (GMM) algorithm. Moreover, a stationary object remained in the foreground for 500 frames with BS-GMM, compared with 200 frames for the traditional GMM algorithm.
Keywords GMM · Target detection · Foreground extraction
Y. Wang · J. Zhai (B)
School of Automation, Southeast University, Nanjing 210096, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_31

1 Introduction
The growth of transportation demand in China has made the management of abandoned objects on highways challenging, and small abandoned objects pose significant threats to drivers. It is difficult for drivers to avoid small abandoned objects when driving at high speeds, so it is necessary to detect and remove them accurately. Abandoned object detection methods can be divided into manual inspection and automated detection [1–3]. Manual inspection relies on real-time video monitoring to respond promptly to abandoned objects. However, automated inspection has become imperative with the expansion of highway networks and the increase in camera deployments. Automated detection can be broadly classified into two categories: indirect detection and direct detection. Indirect detection relies on traffic flow data or other factors to infer the presence of abandoned objects [4]. For example, vehicles slowing down, reduced traffic flow, or vehicles intentionally avoiding certain areas of the road may indicate the presence of abandoned objects. However, indirect detection cannot accurately determine whether an abandoned object is present and has poor real-time performance [5]. Direct detection detects the abandoned item itself. Currently, the most commonly used method is analyzing real-time footage from highway cameras [6]. However, small abandoned object detection on highways still needs to be improved because of the complexity of the environment and potential interfering factors. This article specifically addresses the detection of small abandoned objects.
2 Differential Detection
Foreground extraction based on the difference method works pixel by pixel. A fixed pixel value at a certain time is taken as the background; the value can be in any color space, such as RGB or HSV, or a single-channel grayscale value. The pixel at the same position in the current video frame is taken as the test pixel, and the background pixel value is subtracted from it. The difference can be the Euclidean distance between multi-channel pixel values or the absolute difference of single-channel grayscale values. The Euclidean distance is calculated as
$$ \mathit{diff}(x, y, t) = \sqrt{\sum_{i=1}^{n} \left(I_i(x, y, t) - B_i(x, y)\right)^2} \tag{1} $$
where $\mathit{diff}(x, y, t)$ is the difference between the test pixel and the background pixel at position $(x, y)$ and time $t$, $I_i(x, y, t)$ is the value of the $i$th channel of the test pixel at position $(x, y)$ and time $t$, $B_i(x, y)$ is the value of the $i$th channel of the background pixel at position $(x, y)$, and $n$ is the number of color channels of the image. The difference is then compared with a manually set threshold: a pixel whose difference exceeds the threshold is considered a foreground pixel; otherwise, it is considered a background pixel [7]. The foreground extraction algorithm based on the mixture of Gaussian model is similar to foreground extraction based on the difference method. However, instead of defining a global color threshold for background detection, a Gaussian distribution model is established from the historical pixel values and used as the background for detection.
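The thresholding rule built on Eq. (1) can be sketched in a few lines of Python. This is an illustrative sketch only: images are represented as nested lists of channel tuples, and the function names, the 2×2 example frame and the threshold of 30 are assumptions rather than values from the paper.

```python
import math

def pixel_difference(test_pixel, background_pixel):
    """Euclidean distance between two multi-channel pixel values, Eq. (1)."""
    return math.sqrt(sum((i - b) ** 2 for i, b in zip(test_pixel, background_pixel)))

def foreground_mask(frame, background, threshold):
    """Binary mask: 1 where the difference exceeds the threshold, else 0."""
    return [[1 if pixel_difference(p, q) > threshold else 0
             for p, q in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]

# Hypothetical 2x2 RGB example with a uniform grey background.
background = [[(100, 100, 100), (100, 100, 100)],
              [(100, 100, 100), (100, 100, 100)]]
frame = [[(103, 104, 100), (200, 100, 100)],
         [(100, 100, 100), (100, 101, 100)]]
mask = foreground_mask(frame, background, threshold=30)
```

Only the pixel whose red channel jumps from 100 to 200 exceeds the threshold here; small fluctuations such as a distance of 5 stay in the background, which is why the threshold has to be tuned by hand for each scene.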
3 Gaussian Mixture Model
Initially, $K$ Gaussian distributions are assigned to each pixel in the video image as background models,
$$ P(x_{j,t}) = \sum_{i=1}^{K} \omega_{j,t}^{i} \cdot \eta\left(x_{j,t};\, \mu_{j,t}^{i},\, \Sigma_{j,t}^{i}\right) \tag{2} $$
where $x_{j,t}$ represents the value of the $j$th pixel at time $t$ and $P$ is the background distribution. $\omega_{j,t}^{i}$ is the weight of the $i$th Gaussian distribution in the mixture model at time $t$ for the $j$th pixel, $\mu_{j,t}^{i}$ and $\Sigma_{j,t}^{i}$ are the mean and covariance matrix of the $i$th Gaussian model for the $j$th pixel, and $\eta$ is the probability density function. If the current pixel value is represented in RGB color space, the mean vector and covariance matrix are as follows:
$$ \mu_{i,t} = \left(\mu_{i,t}^{R},\, \mu_{i,t}^{G},\, \mu_{i,t}^{B}\right) \tag{3} $$
$$ \Sigma_{i,t} = \begin{bmatrix} \sigma_R^2 & 0 & 0 \\ 0 & \sigma_G^2 & 0 \\ 0 & 0 & \sigma_B^2 \end{bmatrix} \tag{4} $$
where $\mu_{i,t}^{R}$, $\mu_{i,t}^{G}$, $\mu_{i,t}^{B}$ are the means of the red, green, and blue channels, respectively, for the $i$th Gaussian model, and $\sigma_R^2$, $\sigma_G^2$, $\sigma_B^2$ are the variances of the red, green, and blue channels, respectively. The probability density function is as follows:
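$$ \eta\left(x_{j,t};\, \mu_{j,t}^{i},\, \Sigma_{j,t}^{i}\right) = \frac{1}{(2\pi)^{d/2}\left|\Sigma_{j,t}^{i}\right|^{1/2}} \exp\left(-\frac{1}{2}\left(x_j - \mu_{j,t}^{i}\right)^T \left(\Sigma_{j,t}^{i}\right)^{-1} \left(x_j - \mu_{j,t}^{i}\right)\right) \tag{5} $$
where $d$ is the dimension of the feature space. Each pixel in the input image is matched against the $K$ Gaussian distributions corresponding to that pixel. The matching process is as follows: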
$$ \left|x_t - \mu_{i,t-1}\right| \le D\,\sigma_i \tag{6} $$
where $\sigma_i$ is the standard deviation of the $i$th model and $D$ is a constant usually taken between 2.5 and 3.5. The mixture of Gaussian model is used for foreground extraction, where the $K$ Gaussian models are sorted by weight and matched in descending order. The 3σ rule is applied for model matching, and the pixel type is determined from the background partition. Background partition refers to selecting the first $B$ Gaussian distributions as background models, where $B$ is the number of background distributions and the parameter $T$ is the background threshold ($T > 0.7$, usually set to 0.9):
$$ B = \arg\min_{b}\left(\sum_{i=1}^{b} \omega_{i,t} > T\right) \tag{7} $$
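The matching test of Eq. (6) and the background partition of Eq. (7) can be combined into a small classifier. The following Python sketch assumes single-channel (grayscale) pixels and a model list already sorted in descending priority; the tuple layout `(weight, mean, sigma)` and the three example models are hypothetical.

```python
def select_background_count(weights, T=0.9):
    """Eq. (7): the smallest b whose cumulative weight exceeds T."""
    total = 0.0
    for b, w in enumerate(weights, start=1):
        total += w
        if total > T:
            return b
    return len(weights)

def matches(x, mean, sigma, D=2.5):
    """Eq. (6): the pixel lies within D standard deviations of the model mean."""
    return abs(x - mean) <= D * sigma

def classify_pixel(x, models, T=0.9, D=2.5):
    """Background if x matches any of the first B models, foreground otherwise."""
    B = select_background_count([w for w, _, _ in models], T)
    if any(matches(x, mean, sigma, D) for _, mean, sigma in models[:B]):
        return "background"
    return "foreground"

# Hypothetical models for one pixel: (weight, mean, sigma), sorted by priority.
models = [(0.60, 100.0, 5.0), (0.35, 140.0, 5.0), (0.05, 200.0, 5.0)]
```

With these weights the first two models already exceed $T = 0.9$, so $B = 2$; a value near 200 matches only the third, low-weight model and is therefore reported as foreground.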
When judging the pixel type, a pixel that matches any of the $B$ background distributions is considered background; otherwise, it is considered foreground. When a pixel matches the $i$th model, the parameters are updated as follows:
$$ \omega_{i,t} = (1 - \alpha)\,\omega_{i,t-1} + \alpha M_{i,j} \tag{8} $$
$$ \mu_{i,t} = (1 - \rho)\,\mu_{i,t-1} + \rho X_{i,t} \tag{9} $$
$$ \sigma_{i,t}^2 = (1 - \rho)\,\sigma_{i,t-1}^2 + \rho\left(X_{i,t} - \mu_{i,t}\right)^T\left(X_{i,t} - \mu_{i,t}\right) \tag{10} $$
where $M_{i,j}$ represents the matching result between the pixel and the model (1 for a match, 0 otherwise). The update rate of the model parameters is $\rho = \alpha \cdot \eta(X_t, \mu_{i,t}, \sigma_{i,t})$, $\alpha$ is the learning rate, and the weights of the updated Gaussian distributions are normalized so that they always sum to one. When the weight of a model falls below a certain threshold, the model is deleted and re-assigned. For a model that the pixel does not match, only the weight is modified. The priority of each Gaussian distribution is calculated and the models are ranked in descending order, so that higher-priority models are matched first, and matching stops once a model is matched:
$$ \lambda_{i,t} = \omega_{i,t} / \sigma_{i,t}, \quad i = 1, 2, \dots, K \tag{11} $$
where $\lambda_{i,t}$ is the priority of the $i$th Gaussian model at time $t$. After identifying foreground pixels, the algorithm uses post-processing steps to segment objects. A new foreground model is created for an abandoned object with the pixel value as its mean. Stationary targets gradually become part of the background, and when the abandoned object disappears, the weight of its foreground model decreases and the model merges into the background. The mixture of Gaussian model is suitable for dynamic target detection [8]. However, its disadvantage is that stationary objects eventually disappear from the foreground and are absorbed into the background.
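The update rules in Eqs. (8) to (11) can be sketched for a single grayscale pixel as follows. This is a simplified Python illustration under stated assumptions, not the authors' implementation: models are plain dictionaries, an unmatched pixel replaces the lowest-priority model (the usual Stauffer and Grimson convention), and `init_var` and `init_weight` are hypothetical initial values.

```python
import math

def gaussian_pdf(x, mean, var):
    """One-dimensional Gaussian density, the scalar case of Eq. (5)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def update_pixel_models(x, models, alpha=0.001, D=2.5, init_var=225.0, init_weight=0.05):
    """Update the models of one pixel in place; return True if x matched a model."""
    # Eq. (11): rank models by priority weight / sigma, highest first.
    models.sort(key=lambda m: m["weight"] / math.sqrt(m["var"]), reverse=True)
    matched = None
    for m in models:
        if abs(x - m["mean"]) <= D * math.sqrt(m["var"]):        # Eq. (6)
            matched = m
            break                                                # stop at first match
    if matched is None:
        # Assumed convention: replace the lowest-priority model by one centred on x.
        models[-1] = {"weight": init_weight, "mean": float(x), "var": init_var}
    else:
        for m in models:
            hit = 1.0 if m is matched else 0.0
            m["weight"] = (1.0 - alpha) * m["weight"] + alpha * hit          # Eq. (8)
        rho = alpha * gaussian_pdf(x, matched["mean"], matched["var"])
        matched["mean"] = (1.0 - rho) * matched["mean"] + rho * x            # Eq. (9)
        matched["var"] = ((1.0 - rho) * matched["var"]
                          + rho * (x - matched["mean"]) ** 2)                # Eq. (10)
    total = sum(m["weight"] for m in models)
    for m in models:
        m["weight"] /= total          # keep the weights summing to one
    return matched is not None

# Hypothetical two-model pixel.
pixel_models = [{"weight": 0.7, "mean": 100.0, "var": 25.0},
                {"weight": 0.3, "mean": 180.0, "var": 25.0}]
first_hit = update_pixel_models(102, pixel_models)
weight_after_first = pixel_models[0]["weight"]
second_hit = update_pixel_models(250, pixel_models)
```

With a learning rate of 0.001 each matched frame moves the weight only slightly, which illustrates why a stationary object's model needs many frames to gain enough weight to be counted as background.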
4 Improved Gaussian Mixture Modeling An improved method for the mixture of Gaussian model is proposed in this paper, using background separation to enhance the background partition and pixel type judgment. The method separates the foreground and background models, utilizing only the background model’s matching results to determine pixel type. The foreground and background are re-segmented based on the background model’s weight attenuation, allowing for fixed background detection of stationary targets while maintaining real-time adaptation to environmental changes.
Fig. 1 Background segmentation method for a mixture of Gaussian model (diagram: models labelled BG or FG at times T0 to T3; background threshold 0.9)
Fig. 2 Changing the background partitioning strategy (diagram: models labelled BG or FG at times T0 to T3; decay threshold 0.1; background threshold 0.9)
In traditional GMM, the background is partitioned in each frame, which causes static abandoned objects to disappear quickly from the foreground. The background threshold is set to 0.9, and the learning rate is set to 0.001. The background partitioning method of the original GMM is shown in Fig. 1. This paper proposes an improved Gaussian Mixture Model that detects stationary targets for longer by separating foreground and background models based on a threshold. As shown in Fig. 2, the background model is re-segmented using an attenuation threshold only when its sum of weights falls below the threshold; otherwise the category of each model is maintained. The proposed method successfully detected a stationary abandoned object for 500 frames. The method changes the background partitioning strategy and the model matching method, achieving dynamic modeling of a low-noise background while effectively detecting static targets. The detection of a static abandoned object was compared between the original GMM and the proposed Background-Separated Gaussian Mixture Model, with the learning rate (0.001) and the background threshold (0.9) kept the same. The threshold for background separation is set to 0.5.
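One possible reading of the background-separation rule can be sketched in Python. The text gives no pseudocode, so everything here is an assumption: each model carries a boolean `is_background` flag, the flags are left untouched while the flagged models together hold at least the separation threshold (0.5), and the partition of Eq. (7) is recomputed only when that weight has decayed below the threshold. The decay threshold of 0.1 shown in Fig. 2 is not modelled.

```python
def repartition_if_decayed(models, separation_threshold=0.5, background_threshold=0.9):
    """Recompute background flags only when the flagged weight has decayed.

    models: list of dicts with 'weight' and 'is_background', sorted by priority.
    Returns True if the partition was recomputed.
    """
    bg_weight = sum(m["weight"] for m in models if m["is_background"])
    if bg_weight >= separation_threshold:
        return False                      # keep every model's category
    cumulative = 0.0
    for m in models:
        # Model k is among the first B of Eq. (7) exactly when the weight
        # accumulated before it has not yet exceeded the background threshold.
        m["is_background"] = cumulative <= background_threshold
        cumulative += m["weight"]
    return True

# A stationary object's model has gained weight 0.45 but stays foreground.
stable = [{"weight": 0.55, "is_background": True},
          {"weight": 0.45, "is_background": False}]
changed_stable = repartition_if_decayed(stable)

# The old background has decayed to 0.45, so the partition is recomputed.
decayed = [{"weight": 0.55, "is_background": False},
           {"weight": 0.45, "is_background": True}]
changed_decayed = repartition_if_decayed(decayed)
```

In the first case a per-frame partition with threshold 0.9 would already have absorbed the object's model into the background, whereas the flag-based rule keeps it in the foreground; this is the behaviour the section attributes to BS-GMM, although the exact rule used by the authors may differ.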
5 Object Detection
On highways, situations such as vehicle stops and traffic accidents can change the motion status of vehicles. We therefore use object detection to classify stationary objects and distinguish vehicles from abandoned objects. Object detection is a major task in computer vision, and its methods can be broadly classified into single-stage and two-stage methods. Two-stage methods generally exhibit higher accuracy but run more slowly, making it challenging to meet real-time requirements, so single-stage methods are preferred for many applications. One representative family of single-stage methods is the YOLO series. The YOLOv7 [9] model stands out for its strong customizability, fast convergence, and high detection accuracy. To meet real-time requirements, we chose the YOLOv7 model with the fewest parameters as the basis of our vehicle detection system. The YOLOv7-based detection system takes a single video frame as input and outputs, for each recognized object, a bounding box, a class label, and a confidence score (ranging from 0 to 1). The network architecture is shown in Fig. 3. The YOLOv7 model follows the classic one-stage structure in object detection, which can be divided into four parts: input, backbone, neck, and head. YOLOv7 also incorporates RepVGG-style modifications to its Head network architecture, additional Head training, and a corresponding positive-negative sample matching strategy [10].
The additional Head training provides deep supervision: the losses of the detection Head and the auxiliary Head are fused, which enhances overall performance through a localized ensemble at the higher network layers. The positive-negative sample matching strategy is designed around the detection and auxiliary Heads and combines those of YOLOv5 [11] and YOLOx [12].
Fig. 3 Architecture of the YOLOv7 network
Compared to YOLOv5, YOLOv7’s sample matching strategy incorporates “loss aware” to dynamically refine sample allocation based on the model’s current performance. This allows for real-time filtering to improve accuracy. Additionally, YOLOv7’s strategy combines YOLOv5’s positive sample allocation and YOLOx’s negative sample allocation and leverages the SimOTA algorithm to provide more accurate prior knowledge, further improving accuracy.
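The abstract mentions IoU matching between foreground regions and detected vehicles, but the matching rule itself is not spelled out. The Python sketch below shows one plausible form under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, a stationary foreground blob is kept as an abandoned-object candidate only if it overlaps no vehicle box by more than a threshold, and the threshold value 0.3 is hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def is_abandoned_candidate(blob_box, vehicle_boxes, iou_threshold=0.3):
    """True if the foreground blob does not coincide with any detected vehicle."""
    return all(iou(blob_box, v) < iou_threshold for v in vehicle_boxes)

# Hypothetical boxes: a small blob far from the only detected vehicle.
vehicles = [(100, 100, 200, 200)]
small_blob_is_candidate = is_abandoned_candidate((0, 0, 4, 4), vehicles)
stopped_car_is_candidate = is_abandoned_candidate((100, 100, 200, 200), [(105, 105, 200, 200)])
```

A stopped vehicle produces a stationary foreground blob that almost coincides with a detector box, so it is filtered out, while a small object with no matching detection remains a candidate.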
6 Experiment Results and Comparison A prerequisite for detecting small abandoned objects is ensuring that stationary targets remain in the foreground for a certain period. In Chap. 3, two main methods for detecting stationary targets were discussed: reducing the background model’s learning rate and the foreground extraction method based on BS-GMM. The method of fixed background detection mentioned earlier also belongs to the method of reducing the learning rate, namely, setting the learning rate to 0.
6.1 Performance Comparison
The BS-GMM method improves the original GMM by re-dividing the background based on the attenuation of the background model's weight sum, without increasing the program's running time. Compared with the previous experiment, the learning-rate parameter in this experiment leads to a longer running time, because the parameters of the overall background model must be updated and normalized during model matching. Regarding memory usage, the BS-GMM method only adds a boolean flag to the original Gaussian model, so it does not significantly affect the program's memory usage. The performance metrics of the two algorithms in the experimental environment are shown in Table 1. There is little difference between the two algorithms in execution time and memory usage. Therefore, it can be inferred that the GMM-based small object detection algorithm can achieve real-time detection.

Table 1 Performance metrics between BS-GMM and Low LR GMM
Algorithm      Execution time (ms)   Memory usage (MB)
BS-GMM         30.5                  187.2
Low LR GMM     26.8                  185.6
Table 2 Pixel-level error rate between BS-GMM and Low LR GMM
Algorithm      Pixel-level error rate (%)
BS-GMM         3.67
Low LR GMM     2.01
Fig. 4 Detection result image (columns: original image, extraction with GMM, extraction with BS-GMM; rows: 143rd, 243rd, 343rd, and 623rd frames)
6.2 Analysis of Results
This subsection compares the pixel-level errors of the GMM and BS-GMM methods. The pixel-level error is calculated as the percentage of misclassified pixels in the foreground mask. Both algorithms performed well on this measure, but BS-GMM was more stable in some details and showed better robustness than the GMM-based algorithm. Table 2 shows that the pixel-level error rate between the binary image of the foreground objects detected by the BS-GMM-based algorithm and the manually annotated image was 2.01%, while that of the GMM-based algorithm was 3.67%. This is because the BS-GMM method adapts better to changing background and foreground conditions by dynamically updating the background model. As shown in Fig. 4, BS-GMM prevents an abandoned object from being classified as background too quickly after it becomes still, which would otherwise happen because of the fast updating of the Gaussian mixture model. This extends the effective time during which stationary objects are identified as foreground and improves detection accuracy. With GMM foreground extraction the effective time for stationary objects is 200 frames; BS-GMM increases it to 500 frames.
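The pixel-level error defined above is a simple count. A minimal Python sketch, assuming the detected mask and the manually annotated mask are equally sized nested lists of 0 and 1 (the example masks are made up for illustration):

```python
def pixel_error_rate(predicted, ground_truth):
    """Percentage of pixels whose predicted label differs from the annotation."""
    total = 0
    wrong = 0
    for pred_row, truth_row in zip(predicted, ground_truth):
        for p, g in zip(pred_row, truth_row):
            total += 1
            wrong += int(p != g)
    return 100.0 * wrong / total

# Hypothetical 2x4 masks that differ in exactly one pixel.
pred = [[0, 1, 1, 0],
        [0, 0, 1, 0]]
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
error = pixel_error_rate(pred, truth)
```

Both missed foreground pixels and false foreground pixels count as errors in this measure, so it penalizes noise and premature absorption of the object into the background alike.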
Highway Abandoned Object Detection Based on Foreground …
Table 3 Average noise count between BS-GMM and Low LR GMM

Method         Average noise count
BS-GMM         13.2
Low LR GMM     37.4
Overall, the BS-GMM method is more effective in detecting small abandoned objects than the GMM method, as it can achieve real-time detection and higher accuracy in pixel-level error. This experiment compared the average number of noise pixels generated by BS-GMM and the low learning rate GMM during foreground extraction. As can be seen from Table 3, the average noise count generated by BS-GMM during foreground extraction is significantly lower than that of low learning rate GMM. This shows that BS-GMM is more effective in reducing noise during foreground extraction, which is beneficial for subsequent analysis and processing of the extracted foreground.
7 Conclusion
The method proposed in this paper accurately detects small abandoned objects by combining the YOLOv7 object detection algorithm with the IoU matching algorithm. The combination of the BS-GMM-based foreground extraction algorithm and the IoU matching algorithm achieves the best performance. Regarding pixel-level error and average noise, the BS-GMM-based algorithm has better robustness and stability than the traditional GMM algorithm. Therefore, this method has high accuracy and real-time performance in detecting small abandoned objects.
Acknowledgements This work was supported in part by the Natural Science Foundation of Jiangsu Province (BK20211162).
YOLO-Based Semantic Segmentation for Dynamic Removal in Visual-Inertial SLAM
Xingke Xia, Pu Zhang, and Jian Sun
Abstract We propose a YOLO-based semantic segmentation approach for optimizing dynamic noise in visual-inertial fused SLAM, addressing the challenge of high-precision real-time localization and mapping for low-speed scenarios in rescue mobile robotics. By leveraging the YOLO model, potential dynamic objects in the scene are detected, and a dynamic point removal algorithm is designed by integrating optical flow and multi-view geometry to effectively eliminate dynamic noise, thus enhancing the accuracy of localization and mapping. Moreover, our method incorporates data fusion from multiple sensors, including vision and inertial sensors, and employs filtering techniques to optimize and fuse the data, resulting in high-precision SLAM localization and mapping. Experimental results demonstrate the effectiveness of our approach in reducing the impact of dynamic noise and improving the robustness and accuracy of the SLAM system.
Keywords Simultaneous localization and mapping (SLAM) · Visual-inertial odometry · Dynamic noise · Object detection
X. Xia · P. Zhang · J. Sun (B)
Laboratory of Aerospace Servo Actuation Transmission, School of Mechanical Engineering and Automation, Beihang University (BUAA), Beijing 100191, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_32

1 Introduction
Each year, approximately 60,000 individuals perish as a result of natural disasters [1], accounting for 0.1% of the global mortality rate. Presently, the prevailing employment of search and rescue robots relies predominantly on remote operation, still lacking comprehensive environmental perception capabilities, rendering them insufficient for meeting actual search and rescue requirements in complex and unfamiliar settings. Consequently, Simultaneous Localization and Mapping (SLAM) technology, reliant on sensor-driven environmental perception, has emerged as an indispensable precondition and key technology for facilitating partial or complete autonomy in intricate and unexplored environments.
Within intricate and volatile disaster scenarios, individual sensors’ practical utility is hindered by performance limitations, impeding their capacity to satisfy the demands of intricate, unstable disaster environments. By leveraging the amalgamation of sensor data, a multitude of sensor-based information fusion approaches have proven efficacious in enhancing the precision and robustness of SLAM systems. Furthermore, the problem of dynamic noise, arising from moving vehicles, debris, and other mobile elements during real-time localization, presents a formidable challenge for a plethora of SLAM methodologies. These dynamic disturbances can significantly impede the accuracy and precision of the localization process.
2 Related Work
Currently, SLAM algorithms that incorporate sensor fusion can be categorized by the sensors they combine: vision and IMU, LiDAR and IMU, and vision and LiDAR. Regarding the fusion of vision and IMU, the MSCKF algorithm proposed in [2] integrates visual and inertial sensor information and utilizes a Kalman filter to fuse the data for SLAM. By utilizing the inter-camera visibility, multiple state constraints are established to effectively mitigate odometry drift. In [3], a robust and high-precision monocular visual-inertial SLAM system named VINS-Mono is designed, enabling rapid system initialization and featuring relocalization and loop closure capabilities. However, in large-scale mapping, the reliance on feature-based methods in VINS-Mono leads to high map redundancy and inefficiency. Additionally, challenging scenarios such as low-texture areas or weak lighting conditions can result in inaccurate feature extraction and matching, which in turn degrade the accuracy of pose estimation and map construction. To achieve precise localization in dynamic environments, the RP-VIO algorithm [4] adopts a tightly-coupled visual-inertial approach and introduces planar homography constraints for odometry estimation, effectively enhancing the algorithm's robustness in dynamic environments. Approaches such as SuMa++ [5] and VDO-SLAM [6] combine semantic segmentation with SLAM systems to perceive and remove potential dynamic objects. However, when applied to 3D LiDAR, these methods face limitations due to sparse point clouds and the potential removal of static objects. To address this issue, algorithms like DymSLAM [7] and AirDOS [8] construct dynamic stereo visual SLAM systems, leveraging motion segmentation based on inter-frame motion similarity to accurately obtain rigid object templates and effectively remove dynamic objects.
3 Dynamic VI System Framework
This paper presents a YOLO-based dynamic removal approach for visual-inertial SLAM (Dynamic VI). The overall framework is illustrated in Fig. 1, comprising five main components: data preprocessing, dynamic segmentation, frontend visual-inertial odometry, backend optimization and map construction, and loop closure detection. In the feature extraction stage, image preprocessing techniques, such as distortion correction, denoising, and feature extraction, are applied to improve the matching efficiency and localization accuracy of feature points. Point cloud data is processed in two directions. For localization and map construction, the frontend visual-inertial odometry estimates the robot's pose by fusing point cloud data with IMU measurements. For dynamic noise detection, the YOLO network is utilized to label the point cloud data containing dynamic objects, which are subsequently removed during the backend optimization and map construction phase, resulting in a complete, low-noise, and highly accurate navigable map.
Fig. 1 System framework of Dynamic VI

3.1 IMU Preintegration
Due to its inherent characteristics, the IMU is susceptible to drift and noise, which affect its measurements. To enhance precision, preintegration is used to mitigate the resulting errors. The data acquired from an IMU sensor comprise acceleration and angular velocity in the IMU coordinate system. The measured values are influenced by drift ($b_a$, $b_g$) and noise ($\eta_a$, $\eta_g$), so each IMU measurement is the sum of the true value, drift, and noise. Let $\omega$ and $a$ denote the true values, and $\tilde{\omega}$ and $\tilde{a}$ the measured values of the IMU. The corresponding equations are as follows:

$$
\begin{aligned}
\tilde{\omega}_I &= \omega_I + b_g + \eta_g \\
\tilde{a}_I &= R_b \,(a_W - G_w) + b_a + \eta_a
\end{aligned}
\tag{1}
$$
where $\tilde{\omega}_I$ represents the measured angular velocity of the IMU coordinate system relative to the world coordinate system, $\omega_I$ represents the instantaneous angular velocity of the IMU coordinate system relative to the world coordinate system, $b_g$ represents the gyroscope bias, and $\eta_g$ represents the angular velocity measurement noise at the current time. $\tilde{a}_I$ represents the measured linear acceleration of the IMU, $R_b$ represents the rotation matrix from the world coordinate system to the IMU coordinate system, $a_W$ represents the acceleration in the world coordinate system, $G_w$ represents the gravitational acceleration in the world coordinate system, $b_a$ represents the accelerometer bias, and $\eta_a$ represents the acceleration measurement noise. Based on the camera pose at the previous timestamp and the IMU measurements within this time interval, the velocity, position, and orientation at the next timestamp can be computed:

$$
\begin{aligned}
v_{t+\Delta t} &= v_t + g\,\Delta t + R_t\,(a_t - b_t^{a} - n_t^{a})\,\Delta t \\
p_{t+\Delta t} &= p_t + v_t\,\Delta t + \tfrac{1}{2}\, g\,\Delta t^2 + \tfrac{1}{2}\, R_t\,(a_t - b_t^{a} - n_t^{a})\,\Delta t^2 \\
R_{t+\Delta t} &= R_t \exp\!\big((\omega_t - b_t^{\omega} - \eta_t^{\omega})\,\Delta t\big)
\end{aligned}
\tag{2}
$$
Because the IMU measurements are integrated on the manifold starting from the current state estimate, the starting point of the integration changes with each optimization. The position, velocity, and orientation obtained from the IMU measurements change accordingly, so the measurements would have to be reintegrated during each optimization. The significance of preintegration lies in making the integrated IMU measurements independent of the initial position, velocity, and orientation. In practice, in multi-sensor fusion scenarios, the camera sampling frequency is typically significantly lower than that of the IMU. Therefore, integration can be performed between two keyframes of visual data, reducing computational requirements. By preprocessing the data using IMU preintegration, the pose relationship between camera frames can be directly obtained. The pose change obtained using the preintegration method is as follows:

$$
\begin{aligned}
\Delta v_{ij} &= R_i^{T}\,(v_j - v_i - g\,\Delta t_{ij}) \\
\Delta p_{ij} &= R_i^{T}\,\big(p_j - p_i - v_i\,\Delta t_{ij} - \tfrac{1}{2}\, g\,\Delta t_{ij}^2\big) \\
\Delta R_{ij} &= R_i^{T} R_j
\end{aligned}
\tag{3}
$$
Fig. 2 YOLO network architecture
By using this method, we can obtain the IMU preintegration values from time instant i to j. The main purpose is to directly derive the pose relationship between two frames by utilizing the IMU preintegration quantities when i and j correspond to the timestamps between adjacent camera frames. These results can be used in the backend graph optimization process.
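The independence property that motivates preintegration can be checked numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: the helper names (`so3_exp`, `propagate`, `relative_motion`), the 200 Hz sample rate, and the bias values are all hypothetical, and the noise terms of Eq. (2) are set to zero. It integrates the same IMU samples with Eq. (2) from two different initial states and evaluates Eq. (3) for each.

```python
import numpy as np

def so3_exp(phi):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(phi)
    K = np.array([[0.0, -phi[2], phi[1]],
                  [phi[2], 0.0, -phi[0]],
                  [-phi[1], phi[0], 0.0]])
    if theta < 1e-10:
        return np.eye(3) + K  # first-order approximation near zero
    return (np.eye(3) + np.sin(theta) / theta * K
            + (1.0 - np.cos(theta)) / theta**2 * (K @ K))

def propagate(R, v, p, acc, gyro, ba, bg, g, dt):
    """One step of Eq. (2) with the noise terms set to zero."""
    a_world = R @ (acc - ba)
    v_new = v + g * dt + a_world * dt
    p_new = p + v * dt + 0.5 * g * dt**2 + 0.5 * a_world * dt**2
    R_new = R @ so3_exp((gyro - bg) * dt)
    return R_new, v_new, p_new

def relative_motion(Ri, vi, pi, Rj, vj, pj, g, dt_ij):
    """Eq. (3): motion between states i and j expressed in frame i."""
    dv = Ri.T @ (vj - vi - g * dt_ij)
    dp = Ri.T @ (pj - pi - vi * dt_ij - 0.5 * g * dt_ij**2)
    dR = Ri.T @ Rj
    return dR, dv, dp

# Synthetic IMU samples between two camera frames (assumed values).
rng = np.random.default_rng(0)
dt, n = 0.005, 40
g = np.array([0.0, 0.0, -9.81])
acc = rng.normal(0.0, 1.0, (n, 3)) + np.array([0.0, 0.0, 9.81])
gyro = rng.normal(0.0, 0.3, (n, 3))
ba, bg = np.full(3, 0.02), np.full(3, 0.001)

def integrate(R0, v0, p0):
    """Integrate all samples from the given initial state, then apply Eq. (3)."""
    R, v, p = R0, v0, p0
    for a, w in zip(acc, gyro):
        R, v, p = propagate(R, v, p, a, w, ba, bg, g, dt)
    return relative_motion(R0, v0, p0, R, v, p, g, n * dt)

delta_a = integrate(np.eye(3), np.zeros(3), np.zeros(3))
delta_b = integrate(so3_exp(np.array([0.3, -0.2, 0.5])),
                    np.array([1.0, 2.0, -0.5]),
                    np.array([10.0, -4.0, 3.0]))
same = all(np.allclose(x, y) for x, y in zip(delta_a, delta_b))
print(same)
```

Both runs yield the same relative quantities up to floating-point error, which is why they can be computed once between two keyframes and reused when the optimizer changes the state at frame i. A full preintegration scheme would also track how the deltas change with the bias estimates, which this sketch omits.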
3.2 Dynamic Noise Detection
To eliminate the mapping artifacts caused by dynamic noise, the corresponding dynamic objects must be labeled for later removal. Although ORB feature points can be obtained on object boundaries, in most cases the points enclosed by those boundaries cannot be guaranteed to belong to the same object. By introducing semantic information, different objects can be effectively separated, and the feature points on each object can be tracked to determine whether the object moves between frames.

2D Semantic Segmentation and Point Cloud Annotation
In this study, CSPDarknet53 is adopted as the backbone network of the YOLO model for feature extraction. The network architecture is shown in Fig. 2. With this approach, semantic information can be used to classify and process 3D point clouds, which avoids annotating every 3D point cloud in long-term, high-frame-rate SLAM. The detection process is as follows. First, the input image is resized to the scale required by the network architecture and divided into a grid of cells. Each cell is responsible for detecting objects whose center point falls within that cell. To prevent multiple cells from detecting the same object, non-maximum suppression is applied to eliminate redundant results. Non-maximum suppression first selects the bounding box with the highest confidence score and then calculates the Intersection over Union (IoU) between the remaining bounding boxes and this highest-scoring box. If the IoU exceeds a certain threshold, the bounding box with the lower confidence score is discarded. Finally, the bounding box with the highest confidence score, which does not overlap with other bounding boxes, is obtained along with the object's outline. Based on
this outline, the corresponding point cloud within the bounding box is labeled, and motion consistency checking is performed in the subsequent steps.

Motion Consistency Check
The results obtained from semantic segmentation may not necessarily represent the dynamic noise that needs to be eliminated. It is necessary to combine semantic information with motion consistency checks to establish a secondary semantic knowledge base, which determines whether objects are in motion. In this regard, we employ epipolar geometry between image frames for evaluation. The epipolar constraint states that for a pair of matched points between the previous and current frames, the point in the current frame should lie on the epipolar line calculated from the fundamental matrix and the point in the previous frame. Let a pair of matched points be denoted as $p_1 = [u_1, v_1]^T$ and $p_2 = [u_2, v_2]^T$, with their homogeneous coordinates represented as $P_1 = [u_1, v_1, 1]^T$ and $P_2 = [u_2, v_2, 1]^T$, where $u$ and $v$ represent pixel coordinates in the image frames. The epipolar line $EP_1$ is calculated using the following equation:

$$
EP_1 = \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = F P_1 = F \begin{bmatrix} u_1 \\ v_1 \\ 1 \end{bmatrix}
\tag{4}
$$

where $[X, Y, Z]^T$ represents the line vector and $F$ represents the fundamental matrix. The distance between the matched point $p_2$ and the epipolar line is given by:

$$
D = \frac{\left| P_2^{T} F P_1 \right|}{\sqrt{X^2 + Y^2}}
\tag{5}
$$

The motion consistency check process is as follows. First, the optical flow pyramid is computed to obtain the matched ORB feature points in the current frame. If a matched pair is too close to the image edges, or if the pixel differences in the 3 × 3 image patch centered on the matched pair are too large, the pair is discarded. Then, the fundamental matrix is estimated using RANSAC with the maximum number of inliers. Subsequently, the epipolar lines in the current frame are computed using the fundamental matrix.
Finally, it is determined whether the distance between a matched point and its corresponding epipolar line is smaller than a certain threshold. If the distance exceeds the threshold, the matched point is considered to be in motion. If a certain number of dynamic points determined by the motion consistency check fall within the contour of the segmented object, the object is considered to be in motion. Once an object is identified to be in motion, all the feature points located within the object’s contour are removed. This allows for the accurate elimination of outliers and reduces the influence of segmentation errors to some extent. Furthermore, it is advantageous to utilize the waiting time of the tracking thread for the semantic segmentation results from another thread. During the waiting time, the tracking thread can perform the motion consistency check. By effectively utilizing the waiting time, the motion consistency check can be conducted in real-time, thereby further improving the efficiency of the algorithm.
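Two steps of this subsection, non-maximum suppression on the detector output and the epipolar distance test of Eqs. (4) and (5), can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names, the box format `[x1, y1, x2, y2]`, the IoU threshold of 0.5, and the example fundamental matrix (a pure sideways translation with identity intrinsics) are all assumptions.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    order = np.argsort(scores)[::-1]          # highest confidence first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        # Discard lower-scoring boxes that overlap the selected box too much.
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thresh]
    return keep

def epipolar_distance(F, p1, p2):
    """Eqs. (4)-(5): distance from p2 to the epipolar line of p1.
    Assumes a non-degenerate line, i.e. X and Y are not both zero."""
    P1 = np.array([p1[0], p1[1], 1.0])
    P2 = np.array([p2[0], p2[1], 1.0])
    X, Y, Z = F @ P1                          # line vector of Eq. (4)
    return abs(P2 @ np.array([X, Y, Z])) / np.hypot(X, Y)

boxes = np.array([[10.0, 10.0, 50.0, 50.0],
                  [12.0, 12.0, 52.0, 52.0],      # near-duplicate of the first box
                  [100.0, 100.0, 140.0, 140.0]])
scores = np.array([0.9, 0.8, 0.7])
keep = nms(boxes, scores)
print(keep)  # [0, 2]

# Hypothetical F for a camera translating along x: epipolar lines are horizontal.
F = np.array([[0.0, 0.0, 0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])
d_static = epipolar_distance(F, (100.0, 80.0), (105.0, 80.0))
d_moving = epipolar_distance(F, (100.0, 80.0), (105.0, 83.0))
print(d_static, d_moving)  # 0.0 3.0
```

With a distance threshold of, say, one pixel (an assumed value), the second match would be flagged as dynamic. In a real pipeline F would come from a RANSAC estimate over the optical-flow matches rather than being fixed by hand.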
3.3 Visual-Inertial SLAM
Feature Point Tracking and Pose Estimation
Upon receiving a new image, we detect ORB feature points and compute their corresponding descriptors, which are then labeled. The entire image detection process is tracked to facilitate matching between camera frames and further pose estimation and optimization. Specifically, FAST feature points are extracted from the image pyramid. The number of feature points extracted varies with the image resolution, with higher-resolution images yielding more corner points. The detected feature points are then described using ORB, and the closest ORB feature points are matched using the Hamming distance. Subsequently, for each new image frame, an initial pose estimation is performed, followed by the generation of a local map. Bundle Adjustment is then conducted to refine the map points and current pose within the local map. During the matching process, the PnPSolver performs up to three rounds of feature matching. First, matching with the constant velocity model is attempted. If the constant velocity model match fails, reference frame model matching is performed: the keyframe sharing the highest number of common points with the current frame is selected as the reference keyframe, and its unmatched map points are projected into the current frame to generate new matches. If matching still fails, relocalization matching is performed as the third round: the similarity between bag-of-words vectors is used to search for keyframes that are similar to the current frame, and matching is performed by adjusting the descriptor threshold. Given the 3D spatial coordinates $P$ and their corresponding projections $p$ of $n$ matched points between frames, we aim to calculate the camera pose $T$. Let $P_i = [X_i, Y_i, Z_i]^T$ represent the spatial coordinates of a point, and $u_i = [u_i, v_i]^T$ represent its pixel coordinates upon projection.
The reprojection error is expressed as:

$$
\varepsilon_{camera} = u_i - \frac{1}{s_i} K T P_i
\tag{6}
$$
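Eq. (6) can be evaluated directly for a single point, as in the sketch below. The text does not define $s_i$ and $K$; the sketch assumes the usual pinhole reading, where $K$ is the 3 × 3 camera intrinsic matrix, $T$ is a 4 × 4 world-to-camera transform, and $s_i$ is the depth of the point in the camera frame. The function name and all numeric values are hypothetical.

```python
import numpy as np

def reprojection_error(u, P, K, T):
    """Eq. (6): u - (1/s) K T P, with s the depth of P in the camera frame.
    Assumes T is a 4x4 world-to-camera transform and K the intrinsic matrix."""
    P_cam = T[:3, :3] @ P + T[:3, 3]   # point expressed in the camera frame
    s = P_cam[2]                       # depth along the optical axis
    proj = (K @ P_cam) / s             # homogeneous pixel coordinates [u, v, 1]
    return u - proj[:2]

K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
T = np.eye(4)                          # camera at the world origin
P = np.array([0.2, -0.1, 2.0])         # landmark 2 m in front of the camera
u_obs = np.array([371.0, 214.0])       # observed pixel, one pixel off in each axis
err = reprojection_error(u_obs, P, K, T)
print(err)  # [ 1. -1.]
```

Pose estimation then amounts to choosing $T$ so that the sum of squared errors over all matched points is as small as possible.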
When the images are blurred or lack distinctive features, or when the robot undergoes large-scale motion or rapid rotation, feature extraction and matching become difficult and adjacent frames differ significantly in appearance. Consequently, the reliability of visual-based pose estimation is significantly reduced. In Sect. 3.1, we performed preintegration on the IMU. Throughout the entire process, the IMU provides estimations of the robot's pose, yielding the next moment's $v_{t+\Delta t}$, $p_{t+\Delta t}$, and $R_{t+\Delta t}$. However, due to the accumulation of measurement errors, the pose information provided by the IMU remains accurate only for a short period of time. To achieve accurate pose estimation over a longer duration, it is necessary to continuously correct the IMU's errors using visual information or other means (such as loop closure detection). Here, we define the preintegration residual of the IMU:
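$$
\varepsilon_{IMU} =
\begin{bmatrix} \varepsilon_{\gamma} \\ \varepsilon_{\beta} \\ \varepsilon_{\alpha} \\ \varepsilon_{b_a} \\ \varepsilon_{b_g} \end{bmatrix}
=
\begin{bmatrix}
2\left[ (q_k^{w})^{-1} \otimes q_{k-1}^{w} \otimes \gamma_k^{k-1} \right]_{xyz} \\
R_{k-1}^{T}\,(V_k - V_{k-1} - G_w \Delta t) - \beta_k^{k-1} \\
R_{k-1}^{T}\,\big(P_k - P_{k-1} - V_{k-1}\Delta t - \tfrac{1}{2} G_w \Delta t^2\big) - \alpha_k^{k-1} \\
b_{a,k} - b_{a,k-1} \\
b_{g,k} - b_{g,k-1}
\end{bmatrix}
\tag{7}
$$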
Taking into account the visual reprojection error, we define the joint error (cost function) in the pose estimation and optimization process. Consequently, we obtain the objective for the least squares optimization to minimize:

$$
T^{*} = \arg\min_{T} \left\{ \sum_{k \in I} \rho \left\| \varepsilon_{IMU}(z_k, \chi) \right\|^2 + \sum_{(l,j) \in C} \rho \left\| \varepsilon_{camera}\big(z_{x_l}^{y_j}, \chi\big) \right\|^2 \right\}
\tag{8}
$$
where $\chi = [x_0, x_1, \ldots, x_n, y_0, y_1, \ldots, y_n]$, $y$ represents the observed landmark points, and $x_k = [p^W, v^W, R^W, b_a, b_g]$. In the subsequent steps, Bundle Adjustment optimization will be performed separately in the local and global scopes.

Local Map Tracking and Map Building
After successfully estimating the current initial pose, the local map is updated and the current pose is optimized. First, the newly created keyframe is added to the map. Then, feature matching is performed between the current keyframe and the top 10 keyframes with the highest co-visibility scores. This generates map points, and the local map points are projected onto the feature points of the current frame. The current pose is further optimized through local Bundle Adjustment. The entire optimization factor graph is illustrated in Fig. 3. The observations of the map points are updated, and the number of inliers is counted to determine the success of tracking the local map. The local mapping thread periodically cleans the map by removing poorly observed map points (i.e., those observed by only a few keyframes) or map points with large projection errors after optimization.

Loop Closure
This paper proposes a keyframe-based loop closure detection algorithm using the DBoW2 model. It utilizes the relationship between feature points extracted from the front-end to determine if two frames belong to the same scene. By mapping feature descriptors to a visual word dictionary, similarity calculations are performed between the current and historical scenes. If loop closure is confirmed, data association is performed to establish the relationship and correct the current pose.
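The similarity calculation between the current and historical scenes can be illustrated with a small bag-of-words comparison. The sketch below uses an L1-based score on normalized word histograms, a scoring of the kind commonly used in bag-of-words place recognition; whether the authors use this exact score is not stated, and the function name, the toy four-word vocabulary, and the 0.7 acceptance threshold are assumptions.

```python
import numpy as np

def bow_similarity(v1, v2):
    """L1-based similarity in [0, 1] between two bag-of-words vectors
    (assumed scoring; 1 means identical normalized word distributions)."""
    a = np.asarray(v1, dtype=float)
    b = np.asarray(v2, dtype=float)
    a = a / np.abs(a).sum()
    b = b / np.abs(b).sum()
    return 1.0 - 0.5 * np.abs(a - b).sum()

# Word counts over a toy 4-word vocabulary (hypothetical data).
current = [2, 0, 1, 1]
revisit = [2, 0, 1, 1]      # same place seen again
elsewhere = [0, 3, 0, 1]    # different place

s_revisit = bow_similarity(current, revisit)
s_elsewhere = bow_similarity(current, elsewhere)
print(s_revisit, s_elsewhere)  # 1.0 0.25

# Keyframes scoring above an assumed threshold become loop candidates.
threshold = 0.7
candidates = [name for name, s in [("revisit", s_revisit), ("elsewhere", s_elsewhere)]
              if s > threshold]
print(candidates)  # ['revisit']
```

A candidate found this way would still need geometric verification before the loop is accepted and the pose corrected.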
4 Experiments
This paper validates the performance of the proposed Dynamic VI localization and mapping method through a combination of testing on public datasets and real-world data verification. The public dataset used is the TUM-VI dataset, from which four groups of sequences are used: corridor, magistrale, room, and slides. For the real-world data, a tracked robot equipped with multiple sensors, as shown in Fig. 4, was used for data collection. The data collection platform consists of a tracked robot with sensors such as an Azure Kinect and an IMU, controlled by an NVIDIA Jetson AGX Xavier. The experimental computing platform is configured with an 8-core ARMv8.2 CPU and 16 GB of memory, and runs Ubuntu 18.04 LTS.

Fig. 3 Optimization factor plot
Fig. 4 Experimental equipment
4.1 Positioning Experiment Analysis Simulated experiments were conducted using the TUM-VI public dataset, and a comparison was made with two popular open-source solutions, namely ORB-SLAM3 and VINS-MONO, which have shown good mapping performance. The evaluation metric used was the Absolute Pose Error. The error comparison among the three SLAM methods is shown in Fig. 5.
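For reference, the translational part of the Absolute Pose Error is commonly summarized as a root-mean-square error over time-associated positions. The sketch below assumes the estimated and ground-truth trajectories are already time-associated and aligned to a common frame; the alignment step, the function name, and the toy trajectories are assumptions, not part of the paper.

```python
import numpy as np

def ape_translation_rmse(est, gt):
    """RMSE of the per-pose position error between two aligned,
    time-associated trajectories given as N x 3 arrays."""
    est = np.asarray(est, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if est.shape != gt.shape:
        raise ValueError("trajectories must have the same shape")
    err = np.linalg.norm(est - gt, axis=1)   # position error at each timestamp
    return float(np.sqrt(np.mean(err**2)))

gt_traj = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
est_traj = [[0.0, 0.0, 0.0], [1.0, 0.3, 0.4], [2.0, 0.0, 0.0]]  # 0.5 m off at one pose
rmse = ape_translation_rmse(est_traj, gt_traj)
print(round(rmse, 4))  # 0.2887
```

Computing this value per sequence for each system gives numbers that can be compared in the way Fig. 5 compares the three SLAM methods.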
Fig. 5 Absolute positioning accuracy comparison: (a) corridor sequences, (b) magistrale sequences, (c) room sequences, (d) slides sequences
In Fig. 5a, Dynamic VI demonstrates excellent performance on sequences 1 and 2, significantly reducing the trajectory errors compared to ORB-SLAM3. In Fig. 5b, the absolute errors of the Dynamic VI system are consistently lower than those of the VINS-MONO system. Figure 5c shows the experimental results with ground truth available throughout the entire sequence. In the scenario of sequence room 2, which involves frequent rotational loops, Dynamic VI produces more reliable and accurate motion trajectories. Overall, the trajectory absolute errors of Dynamic VI show a slight decrease, further improving the accuracy of pose estimation.
4.2 Mapping Experiment Analysis For the mapping experiment, real-world data from a laboratory environment was used in this study. The robot was controlled using a joystick to navigate through the laboratory corridor scene. The resulting point cloud map and trajectory are shown in Fig. 6.
Fig. 6 Dense point cloud mapping and trajectory
Fig. 7 Dynamic noise culling
From the trajectory plot in Fig. 6, it can be observed that in a scene with weak texture and poor lighting such as the corridor, the trajectory estimated by the Dynamic VI system closely matches the robot's actual trajectory, demonstrating the robustness of the algorithm. Additionally, the Dynamic VI algorithm effectively removes dynamic objects from the camera data, as shown in Fig. 7, which also confirms the effectiveness of the keyframe strategy adopted in our mapping approach. By identifying and removing dynamic objects, we eliminate ghosting and construct a dense point cloud map. This result is significant for robot navigation in dynamic and texture-weak environments, where dynamic objects can interfere with the robot's navigation and localization. With the Dynamic VI algorithm, information about the static environment is captured more accurately, improving navigation accuracy and robustness. In summary, the experimental results demonstrate the advantages of the Dynamic VI algorithm in dynamic noise removal, mapping, and localization. This provides a reference for robot navigation research in dynamic and texture-weak environments and contributes to enhancing the perception and navigation capabilities of robots in complex environments.
5 Conclusion This paper proposes a visual/inertial fusion SLAM method based on YOLO semantic segmentation to optimize dynamic noise, aiming to address the high-precision real-time localization and mapping challenges of rescue mobile robots in low-speed scenarios. The method combines YOLO model, optical flow, and multi-view geometry to design a dynamic point removal algorithm module, which effectively eliminates dynamic noise and improves the accuracy of localization and mapping. Experimental results demonstrate that the proposed method exhibits good accuracy and robustness in both static and dynamic environments, providing an effective solution for the localization and mapping of visual SLAM systems in dynamic environments.
References
1. Guizzo, E.: Rescue-robot show-down. IEEE Spectr. 51(1), 52–55 (Jan 2014). https://doi.org/10.1109/MSPEC.2014.6701433
2. Mourikis, A.I., Roumeliotis, S.I.: A multi-state constraint Kalman filter for vision-aided inertial navigation. In: Proceedings 2007 IEEE International Conference on Robotics and Automation, pp. 3565–3572, Rome, Italy (2007). https://doi.org/10.1109/ROBOT.2007.364024
3. Qin, T., Li, P., Shen, S.: VINS-mono: a robust and versatile monocular visual-inertial state estimator. IEEE Trans. Rob. 34(4), 1004–1020 (Aug 2018). https://doi.org/10.1109/TRO.2018.2853729
4. Ram, K., Kharyal, C., Harithas, S.S., Madhava Krishna, K.: RP-VIO: robust plane-based visual-inertial odometry for dynamic environments. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 9198–9205, Prague, Czech Republic (2021). https://doi.org/10.1109/IROS51168.2021.9636522
5. Chen, X., Milioto, A., Palazzolo, E., Giguère, P., Behley, J., Stachniss, C.: SuMa++: Efficient LiDAR-based semantic SLAM. In: 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 4530–4537. Macau, China (2019). https://doi.org/10.1109/IROS40897.2019.8967704
6. Zhang, J., Henein, M., Mahony, R., Ila, V.: VDO-SLAM: A Visual Dynamic Object-aware SLAM System (2020). arXiv preprint arXiv:2005.11052 [cs.RO]
7. Wang, C., et al.: DymSLAM: 4D dynamic scene reconstruction based on geometrical motion segmentation. IEEE Rob. Autom. Lett. 6(2), 550–557 (Apr 2021). https://doi.org/10.1109/LRA.2020.3045647
8. Qiu, Y., Wang, C., Wang, W., Henein, M., Scherer, S.: AirDOS: dynamic SLAM benefits from articulated objects. In: 2022 International Conference on Robotics and Automation (ICRA), pp. 8047–8053. Philadelphia, PA, USA (2022). https://doi.org/10.1109/ICRA46639.2022.9811667
Design and Implementation of Integrated Operations Control Center Based on Cloud Architecture
Guangyuan Ma, Huiwen Yang, Chen Cheng, Nian Shao, and You Ma
Abstract As an important part of aerospace engineering, the integrated operations control center is usually responsible for the reception, processing, storage and application analysis of satellite in-orbit test data. The traditional integrated operations control center faces problems such as tight coupling of software and hardware, low efficiency of resource sharing and poor system expansion ability, which limit the development of the aerospace TT&C system. In response to these issues, an integrated operations control center system based on cloud architecture is proposed and designed to realize resource sharing and interaction, service customization and system expansion. The system has been verified in space missions.
Keywords Integrated operations control center · Cloud architecture · Virtualization · Resource sharing
G. Ma (B) · C. Cheng · N. Shao · Y. Ma
Beijing Mechanical and electrical Engineering overall Design Department, Beijing 100039, China
e-mail: [email protected]
H. Yang
Unit 63789 of the People's Liberation Army, Xi'an 710043, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_33

1 Introduction
The integrated operations control center is one of the important components of space engineering. During the launch of a spacecraft and its long-term operation in orbit, the integrated operations control center is responsible for receiving and processing data and producing control instructions, ensuring the normal execution of the spacecraft's mission in orbit. With the continuous development of China's space industry, the number of satellites in orbit is increasing, and their management is becoming more complex and diverse, which places higher demands on the integrated operations control center. The integrated operations control center receives spacecraft data and sends commands to control spacecraft operations through the network [1]. It usually includes a data processing system, data storage system, monitoring and display system, service scheduling system, etc. A typical structure is shown in Fig. 1.

Fig. 1 Typical structure diagram of integrated operations control center

With the vigorous development of space missions, new requirements have been put forward for the integrated operations control center, which needs to process large-scale data in a short time. Problems in the capability of the integrated operations control center have gradually emerged, mainly manifested in the following aspects [2].
(1) The integrated operations control center is not flexible enough, the hardware system is tightly coupled with the business application, and each business has its own system.
(2) The allocation and restructuring of software and hardware resources are not flexible, and resource sharing is insufficient.
(3) The connectivity of the system is poor, and the scalability to adapt to new services is limited.
Design and Implementation of Integrated Operations …
In view of the above problems, the integrated operations control center urgently needs an architecture that is reliable, flexible, and easy to operate and maintain, while still meeting the requirements of system reliability, stability and real-time performance. A resource-sharing, on-demand integrated operations control center system based on cloud architecture has therefore emerged. This paper describes the functional analysis, architecture design and specific implementation of the system.
2 System Function Analysis The integrated operations control center is a task management center that integrates task planning, data sending and receiving, system monitoring, data processing, information management, application analysis and results display. It is responsible for the planning of satellite missions in accordance with the schedule of tests and the state of terrestrial resources. On this basis, the integrated operations control center makes operation plans, generates payload control instructions, monitors the in-orbit working status of satellites and payloads according to telemetry data, ensures the normal in-orbit operation of satellites and payloads, completes the reception and processing of data transmission downlink data, and generates primary product data. The functional module of integrated operations control center is shown in Fig. 2.
2.1 In-Orbit Mission Planning In the stage of long-term satellite operation management, the satellite in-orbit test mission planning should be formulated according to the overall arrangement of the test mission. First, orbit forecast is carried out according to the latest orbit data
Fig. 2 Functional requirements of integrated operations control center
[Figure: functional modules of the integrated operations control center: in-orbit mission planning; measurement and control data processing; satellite-ground station resource scheduling; reception for data transmission; data application analysis; status monitoring and three-dimensional display]
Fig. 3 The process of task planning

Table 1 Classification of measurement and control data
S/N | Name | Format | Stage | Notes
1 | Comprehensive optimization of flight data | Binary | Launch | Non-period
2 | Take-off zero point | Binary | Launch | Non-period
3 | Separation point parameters | Binary | Launch | Non-period
4 | Initial orbit prediction of spacecraft | Binary | Launch | Non-period
5 | Original data of spacecraft | Binary | Launch/long-term | Period
6 | Tracking raw data of relay satellite | Binary | Launch/long-term | Period
pushed by the measurement and control center. Then, the user prepares payload control instructions according to the orbit data, forms the payload data injection plan file, and sends it to the measurement and control center. The measurement and control center applies to the dispatching center for resources and windows and sends the payload data injection instruction parameters in the plan file to the ground station. Finally, the ground station executes the instructions as planned. The process of in-orbit task planning is shown in Fig. 3.
2.2 Measurement and Control Data Processing The measurement and control data mainly includes the measurement and control data at the launch stage and the telemetry data at the long-term operation stage. The measurement and control data are in the format of binary code stream, and the data classification is shown in Table 1.
Fig. 4 Process of generating measurement and control work plan
[Figure: flow from the latest orbit data and the idle windows of the satellite-ground station, through orbital prediction and calculation of the transit window and window matching and conflict resolution, to the measurement and control work plan]
The integrated operations control center needs to analyze, process, display and store the measurement and control data. At the launch stage, it parses the flight data, take-off zero point, satellite-rocket separation point parameters, initial orbit forecast and other raw data, and then sends them to the display software for real-time display. In the long-term operation management stage, it parses the original telemetry data in real time in order to monitor the status of satellites and payloads, judge key parameters, and issue early warnings and fault alarms. At the same time, the parsed telemetry data are stored, which supports curve and list display of telemetry parameters as well as data download and export.
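The parsing and alarm steps described above can be illustrated with a minimal sketch. The paper does not give the telemetry frame format, so the frame layout, the parameter id and the alarm limits below are all hypothetical assumptions chosen only to show the idea of splitting a binary code stream and judging key parameters.

```python
import struct

# Hypothetical frame layout (an assumption, not taken from the paper):
# big-endian uint16 parameter id, uint32 timestamp in seconds, float32 value.
FRAME = struct.Struct(">HIf")

# Hypothetical alarm limits per parameter id: (low, high).
LIMITS = {1: (20.0, 35.0)}


def parse_frames(stream: bytes):
    """Split a binary code stream into (param_id, timestamp, value) tuples."""
    # Ignore a trailing partial frame, if any.
    usable = len(stream) - len(stream) % FRAME.size
    return [FRAME.unpack_from(stream, off) for off in range(0, usable, FRAME.size)]


def check_limits(frames):
    """Return the frames whose value lies outside the configured limits."""
    alarms = []
    for param_id, timestamp, value in frames:
        low, high = LIMITS.get(param_id, (float("-inf"), float("inf")))
        if not low <= value <= high:
            alarms.append((param_id, timestamp, value))
    return alarms
```

In a real system the parsed frames would also be written to storage so that curve display, list display and export can work on them; that part is omitted here.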
2.3 Ground Station Resource Scheduling In the stage of long-term satellite operation management, the integrated operations control center regularly detects and resolves the conflict of resource use of the ground station according to the situation of mission planning and resource use of the ground station, automatically generates the measurement and control work plan of the ground station and sends it to the dispatching center. The dispatching center arranges the ground station to perform tasks according to the dispatching plan. The generating process of measurement and control work plan is shown in Fig. 4. The dispatching center periodically releases the resource usage plan and idle window of each ground station. The integrated operations control center makes orbit forecast, calculates the time window of the satellite passing through the fixed ground
station, and carries out window matching and conflict resolution with the resource use plan issued by the dispatch center. The dispatching center completes the resource allocation according to the actual use of resources and returns the application results.
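The window matching step can be sketched as an interval intersection between the predicted transit windows and the idle windows released by the dispatching center. This is only an illustration under assumptions: windows are modeled as sorted, non-overlapping (start, end) pairs in a common time unit, the function name and the `min_duration` parameter are hypothetical, and the conflict resolution policy, which the paper does not specify, is not modeled.

```python
def match_windows(transit, idle, min_duration=0):
    """Intersect satellite transit windows with ground station idle windows.

    Both inputs are lists of (start, end) pairs, sorted and non-overlapping.
    Returns the overlaps that last at least min_duration.
    """
    matched = []
    i = j = 0
    while i < len(transit) and j < len(idle):
        start = max(transit[i][0], idle[j][0])
        end = min(transit[i][1], idle[j][1])
        if end > start and end - start >= min_duration:
            matched.append((start, end))
        # Advance whichever window ends first.
        if transit[i][1] < idle[j][1]:
            i += 1
        else:
            j += 1
    return matched
```

For example, with transit windows (0, 10) and (20, 30) and a single idle window (5, 25), the usable windows are (5, 10) and (20, 25).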
2.4 Reception for Data Transmission Data transmission data are the scientific data of the satellite in-orbit test, including platform and payload data. According to the downlink rate, data transmission is divided into low-speed and high-speed data transmission. During the satellite transit window, the ground station receives the real-time data transmitted by the satellite and transfers them to the integrated operations control center, which receives the data for analysis, processing, storage and other operations. If data reception within the cycle fails, an application for post retransmission is submitted to the dispatching center, and the ground station is arranged to retransmit the data. The data transmission process is shown in Fig. 5.
(1) According to the measurement and control work plan sent by the integrated operations control center, the dispatching center sends the equipment work plan and data transmission plan to the ground station.
(2) Before the data transmission task starts, the ground station sends the application for data transmission to the integrated operations control center in advance. After receiving the application, the integrated operations control center sends a receipt to the ground station. After receiving the receipt, the ground station starts the transmission process.
(3) When the data transmission task begins, the ground station starts the antenna equipment to track the satellite, receives the satellite downlink data and transfers the data to the integrated operations control center.
(4) After the transit of the satellite, the data transmission task ends, and the antenna equipment of the ground station is withdrawn. At this time, the ground station is still transferring the received data to the integrated operations control center because of the limited bandwidth between the ground station and the integrated operations control center.
(5) Upon completion of data transmission, the ground station sends the data transmission completion report to the integrated operations control center, which immediately returns a receipt for the report.
(6) When the integrated operations control center requires post retransmission, it selects the task type and cycle number and sends the post retransmission application to the dispatching center. After receiving the application, the dispatching center returns the application results.
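The handshake in steps (2) to (5) can be viewed as a small state machine on the side of the integrated operations control center. The sketch below is an illustrative assumption rather than the authors' implementation: the state names, message names and receipt strings are hypothetical, and the retransmission request of step (6) is left out.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    APPLIED = auto()
    TRANSFERRING = auto()
    COMPLETED = auto()


# Allowed (state, message) -> next state. Message names are hypothetical.
TRANSITIONS = {
    (State.IDLE, "transmission_application"): State.APPLIED,
    (State.APPLIED, "data"): State.TRANSFERRING,
    (State.TRANSFERRING, "data"): State.TRANSFERRING,
    (State.TRANSFERRING, "completion_report"): State.COMPLETED,
}


class ControlCenterSession:
    """Tracks one data transmission task as seen by the control center."""

    def __init__(self):
        self.state = State.IDLE
        self.receipts = []

    def on_message(self, kind):
        key = (self.state, kind)
        if key not in TRANSITIONS:
            raise ValueError(f"unexpected {kind!r} in state {self.state.name}")
        self.state = TRANSITIONS[key]
        # The application and the completion report are acknowledged with a receipt.
        if kind != "data":
            self.receipts.append(f"receipt:{kind}")
        return self.state
```

Keeping the session in the TRANSFERRING state for repeated "data" messages reflects step (4), where the ground station continues to forward buffered data after the satellite pass has ended.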
Fig. 5 Data transmission process
[Figure: message flows among the dispatch center, the satellite-ground station and the integrated operations control center. Labels: equipment work plan; data transmission plan; request instructions for the plan; device monitoring information; ex post resend application; ex post resend application receipt; data transmission link establishment; data transmission; data quality analysis report; data transmission completion report; receipts]
Fig. 6 Data transmission process
[Figure: processing flow with the stages data transmission, preprocessing and analytical splitting, followed by data analysis, image analysis and third-party analysis; the outputs are digital data and image files]
2.5 Data Application Analysis The integrated operations control center receives the data in real time, splits them into payload data and satellite platform data, and distributes the split data to the payload analysis module for further analysis. Finally, primary data products are generated and stored. The processing flow is shown in Fig. 6. After the primary data products are generated, secondary processing can be carried out according to user requirements to provide users with secondary product data, as shown in Table 2.
Table 2 List of data products
S/N | Grade | Product | Description | Processing request | Display request
1 | 0 level | Raw data | Parse and split the received original binary data | Receive analysis access | null
2 | 1 level | Once processing | Convert binary data into digital data and images | Process access | Form curve
3 | 2 level | Once analysis | Based on the results of once processing, once analysis is carried out | Payload dependence | Form curve
4 | ... | ... | ... | ... | ...
2.6 Status Monitoring and Three-Dimensional Situation Display
(1) Status monitoring
(a) Display and monitor the operating state of the system in real time, including the communication link state and operating state among the ground station, dispatching center, measurement and control center and integrated operations control center.
(b) According to the newly calculated satellite state, on-board payload state and satellite orbital data, monitor the current in-orbit operation state of the satellite, report fault phenomena, analyze fault causes, and provide fault solutions.
(2) Three-dimensional display
(a) Display a two-dimensional map with the earth as the background, and display the position and status of satellites, signal-gate stations, user stations and the integrated operations control center on it.
(b) Display a 3D earth model that can be enlarged, shrunk, rolled and dragged, and display the position and status of the satellite, ground station and integrated operations control center on it. The satellite can load a 3D model, display its attitude, and show the working status of the internal payload by analyzing the input status data.
(c) In the three-dimensional scene, display perspectives can be set, such as the satellite perspective, the earth station perspective and the user observation perspective.
(d) Display the current in-orbit operation status of the spacecraft on the three-dimensional digital earth and replay the historical measurement tasks of the spacecraft.

Fig. 7 System architecture
[Figure: layered architecture with an application layer, a service layer (core business service, basic service) and a resource layer (management and control platform, integrated cloud platform, basic element). Component labels: mission planning, data application analysis, state monitoring; information processing, data processing, orbit calculation, ex post processing, storage management, simulation deduction, control calculation, integrated display; computing service, data service, bus service, time service, container service, micro-service; maintenance management, safety management, computing virtualization, storage virtualization, network virtualization, machine serverization; computing resources, storage resources, network resources, display resources, auxiliary resources, safety protection]
3 System Design 3.1 System Architecture The system architecture is mainly composed of resource layer, service layer and application layer. The resource layer integrates computing, storage, network, display and other basic resources based on cloud computing and virtualization technologies. Service layer is based on service-oriented technology to realize the servitization and componentization of system functions such as data processing, data storage, orbit calculation and comprehensive display. The application layer realizes the rapid construction and flexible expansion of service models and service processes of various business systems based on intelligent customization technology [3]. The system architecture is shown in Fig. 7.
3.2 Resource Layer The resource layer consists of computing, storage, network, display, protection and other hardware infrastructure. It provides the resources of the underlying physical devices to users in the form of services through virtualization technology, and monitors physical and virtual resources through the integrated cloud platform to achieve intelligent management of resource performance, capacity and configuration [4]. Computing resources include various computer hardware resources, virtualization management software, and service software hosted on virtual machine servers (VMS) to implement cloud resource allocation, service management, status monitoring, and log management. Storage resources include storage hardware resources, data storage management software, and various service data. Network resources are mainly composed of high-availability switching networks that carry information among systems and resources. Display resources include display systems and the business display software hosted on them. Auxiliary resources include auxiliary facilities such as cabinets, multi-screen switching kits, and operation terminals. The integrated cloud platform virtualizes physical resources through computing virtualization, storage virtualization, and network virtualization. By running multiple VMS on a single physical server, it abstracts away the dependence of application programs on the underlying systems and hardware. VMS are isolated from each other and do not affect each other. They can run different operating systems and provide different application services.
3.3 Service Layer The service layer is responsible for the interaction between applications and cloud platforms. Different services can obtain different interface services according to their requirements. In order to flexibly allocate resources and improve the capability and efficiency of supporting various services, cloud services of the central system should be integrated reasonably. The service layer includes basic generic services and core business services. Basic generic service is the basic common service unit provided for system operation management and is the foundation of core business realization. Core business service is the business module unit that uses the basic service to complete the central task. Basic generic service includes computing service, data service, bus service, password service and so on. The computing service provides general computing, high performance computing, and parallel computing for orbit computing, data processing, and fault diagnosis. Data service realizes the function of storage, access, and other functions of various structured and unstructured data. Application and users
can call, submit, download, and update configuration information through the data service. The bus service provides a unified standard interaction mode for the internal and external systems of the center. The password service handles password management and distribution [5]. Core business services include information processing, data processing, storage management, orbit computing, control computing, ex post processing, simulation and integrated display. Information processing handles external data receiving and sending as well as information exchange within the center. Data processing handles telemetry data and transmission data. Storage management stores and manages all kinds of original data and calculation results. Orbit calculation performs orbit prediction and precision orbit calculation. Control calculation computes and processes the spacecraft control parameters. Simulation includes space environment simulation, spacecraft flight orbit simulation and control network simulation. Integrated display is used to monitor the working state of equipment.
3.4 Application Layer Based on the resource layer and service layer, the application layer directly faces all business systems and users. Each business application can customize the business models and service processes as required. Application layer allocates computing and storage resources and network services according to the requirements of computing, data storage and network. Using the basic platform and service support provided by the resource layer and service layer, the application layer is an integrated cloud service mode built on the basis of a combination of system business and functions. The application layer responds to user service requests through PC and mobile terminals and realizes user interface functions.
4 System Implementation 4.1 System Deployment The integrated operations control center mainly uses virtualization and cloud technologies to provide resources of underlying physical devices to the service layer and application layer in the form of services. The system deployment is shown in Fig. 8.
Fig. 8 System deployment
4.2 Computing Resources The computing resources mainly consist of a group of high-performance server clusters. They handle real-time and post information sending and receiving for the central system, as well as the calculation and processing of the received data (including telemetry data processing, data transmission processing, orbit calculation and resource scheduling), and they guarantee the computing capacity needed to complete spacecraft instructions and data injection. The cloud platform manages and dynamically allocates computing resources to achieve load balancing across the server clusters. In the cloud computing resource design, server virtualization technology is used to deploy each user’s service system in the virtualization environment, which improves the utilization of physical server resources and ensures the continuity of the service system. According to the characteristics of the services, orbit computing and data processing can be classified as computation-intensive services, which require high CPU resources. Information sending and receiving and data storage can be classified as standard services, which have high requirements on CPU resources, network I/O and data storage resources. Operation management and display services can be classified as non-computation-intensive services. In actual applications, resources such as CPU, memory, network I/O and disks can be added dynamically based on actual usage.
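The service classification above can be expressed as a small lookup that tells an operator which resources to watch for each service. This is a sketch under assumptions: the service classes and their critical resources follow the paragraph above, while the dictionary layout, the function name and the 0.8 usage threshold are hypothetical and not taken from the paper.

```python
# Service classes as described in the text.
SERVICE_CLASS = {
    "orbit computing": "computation-intensive",
    "data processing": "computation-intensive",
    "information sending and receiving": "standard",
    "data storage": "standard",
    "operation management": "non-computation-intensive",
    "display": "non-computation-intensive",
}

# Resources on which each class places high requirements.
CRITICAL_RESOURCES = {
    "computation-intensive": ["cpu"],
    "standard": ["cpu", "network_io", "storage"],
    "non-computation-intensive": [],
}


def resources_to_scale(service, usage, threshold=0.8):
    """Return the critical resources of a service whose usage exceeds threshold.

    usage maps resource names to a utilization ratio between 0 and 1.
    The threshold value is a hypothetical default.
    """
    critical = CRITICAL_RESOURCES[SERVICE_CLASS[service]]
    return [name for name in critical if usage.get(name, 0.0) > threshold]
```

A cloud platform would typically feed such a decision into its own allocation interface; that interface is platform-specific and is not shown here.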
4.3 Storage Resources Storage resources store real-time data and post-processing data and provide data retrieval and query functions. They consist of a database storage system and a distributed storage system, which together realize application data sharing, unified management of structured and unstructured data, and hierarchical service for real-time data, quasi-real-time data and post data. Storage resources are deployed in a mixed mode that combines a distributed cloud storage system with a Fibre Channel Storage Area Network (FC-SAN) storage system [6]. The distributed cloud storage system (CSS) centrally plans and deploys the storage capacities of all nodes on the platform to form a reliable, high-performance and highly scalable unified storage service system. Its core is the distributed engine at the bottom layer, which provides block storage and file storage functions for the upper layer. This fully distributed storage architecture gives the whole system strong scalability. The FC-SAN storage system consists of a database server, a disk array and a database management system. It stores structured data and enables important service data to be managed and queried through the database.
4.4 Network Resources Network resources are composed of core switches, aggregation switches and access switches, forming a relatively separate network system of service network, data network and management network, which reduces the mutual influence of different networks and improves the availability and maintainability of the system. The cloud platform network can be divided into three networks based on function: the management network, the service network and the storage network [7]. The management network is responsible for the management and control of the entire cloud platform. The service network carries service data on the cloud platform. The storage network carries storage-related services on the cloud platform. To ensure network data reliability, the service and management networks are isolated by VLAN. The storage network is an optical fiber storage network. The failure of one network does not affect the normal operation of the other two.
4.5 Display and Auxiliary Resources Display resources provide access to the integrated operation control center for users to perform operations such as task control, status monitoring, data management, and instruction making. In addition, display resources provide administrators with access to the cloud platform to configure and modify the system. The system page is shown in Fig. 9.
Fig. 9 System page
Auxiliary resources include cabinets, KVM switches, and operation terminals. The cabinets and KVM switches are used to install and deploy servers, disk arrays, and switches. The operation terminals are used to remotely manage computing, storage, and network resources [8].
4.6 System Management The cloud management platform consists of a virtualization engine and cloud computing management software. The virtualization engine improves server utilization, which greatly reduces the number of servers and thus reduces costs. The cloud management platform software simplifies server management and application deployment in physical and virtual environments. The virtualization technology is used to construct and consolidate resources (including computing, storage, and network physical devices) in a unified manner to provide unified management and security defense for virtual resource pools [9–11].
5 Conclusion This paper introduces the construction of integrated operations control center based on cloud architecture from three aspects: function analysis, scheme design and concrete implementation. The proposed scheme is advanced and feasible and has been verified by long-term satellite missions in orbit.
References
1. Yu, Z.: Aerospace TT&C System Engineering. National Defense Industry Press, Beijing (2008)
2. Liu, H.: Design of satellite ground control system based on SOA. Comput. Netw. 40(13), 49–51 (2014)
3. Zheng, Y., Chen, S.: Construction and Management of Distributed Cloud Data Center. Tsinghua University Press, Beijing (2013)
4. Tian, Y.: Architecture design based on cloud service. Electron. Technol. Softw. Eng. 27(21), 162–163 (2019)
5. Liu, G., Lu, J., Zhou, H., et al.: Design of space-based information application service system based on cloud architecture. J. China Acad. Electron. Sci. 13(5), 526–544 (2018)
6. Chen, G., Ming, Z.: Cloud Computing Engineering. Posts and Telecommunications Press, Beijing (2016)
7. Gu, J.: Cloud Computing Architecture Technology and Practice. Tsinghua University Press, Beijing (2015)
8. Yang, H.: Development of U.S. military space TT&C enterprise-level ground system. Telecommun. Technol. 57(7), 841–848 (2017)
9. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018)
10. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016)
11. Teng, X., Deng, X.: Optimization of a helical flow inducer of endovascular stent based on the principle of swirling flow in arterial system. J. Biomed. Eng. 27(2), 429–434 (2010)
Observability of Edge Dynamics in Complex Networks Zhiliang An, Shaopeng Pang, Mingjun Du, and Peng Ji
Abstract The dynamic processes that occur at the edges of complex networks are relevant to various real-world situations. In recent years, more and more scholars have focused their attention on the study of edge dynamics in complex networks, and achieved important results. Structural controllability of edge dynamics in complex networks has been extensively studied, but there is currently no related work on structural observability. In this paper, we propose a framework for studying the structural observability of edge dynamics in complex networks. A method is proposed to find the minimum number of sensor nodes and observed edges required for observability. We apply the method to model networks and give theoretical formulations. Keywords Complex network · Edge dynamics · Structural observability
Z. An · S. Pang (B) · M. Du · P. Ji
School of Information and Automation Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250353, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_34

1 Introduction In recent years, there has been an explosion of interest in nodal dynamics in complex networks [1–5]. Lin [6] proposed a definition of structural controllability that bypasses measuring system parameters. Liu et al. [7] developed structural controllability and provided a minimum input theory based on the concept of maximum matching to characterize the controllability of directed networks. Thanks to the principle of duality [8], observability can borrow some concepts from controllability. Observability of nodal dynamics is studied in [9]. Edge dynamics also matter in many real-world networks. For example, in a social network, a person is a node and there is an edge between two people passing information. The seminal work addressing edge controllability in complex networks was presented by Nepusz et al. [10]. Subsequent research on edge dynamics in complex networks aroused more enthusiasm among researchers, and a lot of progress has been made [11–16]. However, current research on the structural observability of edge dynamics in complex networks is still missing.
In this paper, we investigate the structural observability of edge dynamics in complex networks. The number and location of sensor nodes required for the structural observability of edge dynamics in complex networks are determined by the local topology of nodes. We propose a method to find the minimum number of sensor nodes and observed edges required for observability. Our model network-based simulations investigate the effect of the degree distribution on the number of sensor nodes and observed edges. At the same time, we give a theoretical analysis based on the degree distribution.
2 Structural Observability of Edge Dynamics We introduce the edge dynamics [10] in complex networks. For a directed complex network $G(V, E)$, let $x$ denote the vector specifying the dynamics of each edge in the network, and let $N$ and $M$ be the total numbers of nodes and edges in $G(V, E)$. Let $x_v^-$ and $x_v^+$ be the state vectors of the inbound and outbound edges of node $v$, respectively. The inbound edge vector $x_v^-$ can affect the outbound edge vector $x_v^+$ and the output vector $y_v$. The edge dynamics process can then be described by

$$\dot{x}_v^+ = S_v \cdot x_v^-, \qquad y_v = \varsigma_v \cdot x_v^-, \tag{1}$$

where $S_v$ is a switch matrix with the number of rows equal to the out-degree $k_v^+$ and the number of columns equal to the in-degree $k_v^-$ of node $v$. $y_v$ is the output vector, which has the same dimension as $x_v^-$. $\varsigma_v$ is a diagonal matrix whose diagonal elements characterize the coupling relationship between $x_v^-$ and $y_v$. The correspondence between the above dynamics and linear time-invariant dynamical systems can be established by reformulating the above dynamics in terms of edge variables, yielding

$$\dot{x} = W \cdot x, \qquad y = Q \cdot x, \tag{2}$$

where $W \in \mathbb{R}^{M \times M}$ is the transpose of the adjacency matrix of the line graph $L(G)$ (see Fig. 1 for a line graph example), where $w_{ij}$ is nonzero if and only if the head of edge $j$ is the tail of edge $i$. The nodes of $L(G)$ correspond to the edges of the original network $G$, and each edge of $L(G)$ represents a directed path of length 2 in $G$. $Q$ is a diagonal matrix whose diagonal elements correspond one-to-one to the diagonal elements of $\varsigma_v$ at each node. If we wish to observe a system, we first need to identify a set of observable edges. If these edges can provide a complete observation of the network, we will call them
Fig. 1 a Directed network G with 5 nodes: a, b, c, d and e, and 6 edges: xi (i = 1, . . . , 6). b The line graph L(G) of the original directed network G. The colors of the edges in L(G) correspond to the colors of the nodes in (a). c Sensor nodes, observed edges and output signals in the original directed network G. d Sensor nodes and output signals in the line graph L(G). The state matrix W, output matrix Q and observability matrix O. The observability matrix O is full rank, indicating that the system is structurally observable
observed edges. Their heads are sensor nodes. We denote the minimum numbers of sensor nodes and observed edges by $N_O$ and $M_O$, respectively. A system described by Eq. (2) is said to be observable if we can reconstruct its complete internal state from its outputs, which is possible if and only if the observability matrix

$$O = [Q^T, (QW)^T, \ldots, (QW^{M-1})^T]^T \tag{3}$$

has full rank, that is,

$$\operatorname{rank}(O) = M. \tag{4}$$
This is the mathematical condition for observability, known as the Kalman rank condition. However, the rank criterion has significant practical limitations for complex systems, and testing the Kalman rank condition is computationally expensive. The concept of structural controllability [7] gives us a new perspective. We can use the principle of duality to introduce structural observability. Assume that the matrices $W$ and $Q$ are structured matrices that contain fixed zeros and independent free parameters. The system (2) is said to be structurally observable if the free parameters can be set in such a way that the system becomes observable in the usual sense.
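To make the rank condition concrete, the following sketch builds $W$ for a small directed network from its edge list, forms the observability matrix of Eq. (3) and computes its rank exactly. It is an illustration under stated assumptions, not the method proposed in this paper: the function names are hypothetical, the free parameters of $W$ are drawn as random integers, and the nonzero diagonal entries of $Q$ are set to 1. If the rank equals $M$ for one such draw, the system is structurally observable; a rank below $M$ only suggests, with high probability, that it is not.

```python
from fractions import Fraction
import random


def build_W(edges, seed=0):
    """W[i][j] is a nonzero free parameter iff the head of edge j is the tail of edge i.

    edges is a list of (tail, head) pairs, i.e. each edge points from tail to head.
    """
    rng = random.Random(seed)
    m = len(edges)
    W = [[Fraction(0)] * m for _ in range(m)]
    for i, (tail_i, _head_i) in enumerate(edges):
        for j, (_tail_j, head_j) in enumerate(edges):
            if head_j == tail_i:
                W[i][j] = Fraction(rng.randint(1, 1000))
    return W


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def rank(rows):
    """Exact rank by Gaussian elimination over the rationals."""
    rows = [list(r) for r in rows]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((k for k in range(r, len(rows)) if rows[k][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, len(rows)):
            factor = rows[k][c] / rows[r][c]
            rows[k] = [x - factor * y for x, y in zip(rows[k], rows[r])]
        r += 1
    return r


def observability_rank(edges, observed, seed=0):
    """Rank of O = [Q; QW; ...; QW^(M-1)] when the edges in `observed` are measured."""
    m = len(edges)
    W = build_W(edges, seed)
    Q = [[Fraction(int(i == j and i in observed)) for j in range(m)] for i in range(m)]
    blocks, block = [], Q
    for _ in range(m):
        blocks.extend(block)
        block = matmul(block, W)
    return rank(blocks)
```

For the directed path a → b → c with edges indexed 0 and 1, observing edge 1 gives full rank 2, whereas observing edge 0 gives rank 1, because the state of the downstream edge never influences the upstream one.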
3 Structural Observability Theory
To study issues related to the structural observability of edge dynamics in complex networks, we introduce definitions of three node classes and the related concept of connected components [10]. If $k_v^+ > k_v^-$, the node $v$ is called divergent; if $k_v^+ < k_v^-$, it is called convergent; and if $k_v^+ = k_v^-$, it is called balanced. We define a balanced component as a connected component that consists only of balanced nodes and contains at least one edge. In order to prove our theorems, we first introduce the following lemma.
Lemma 1 [9] By invoking the duality between controllability and observability in linear systems, the controller in system $G(A)$ (matrix $A$ is usually called the state matrix) is just the observer in its dual (or transposed) system $G(A^T)$, which is obtained by flipping the direction of all the edges. By monitoring these observers, the system $G(A^T)$ is guaranteed to be observable.
From the principle of duality, we propose the following theorem.
Theorem 1 The minimum number of sensor nodes required to maintain structural observability of edge dynamics on the network $G(V, E)$ is given by
$$ N_O = N(k_v^- > k_v^+) + \sum_{i=1}^{c} \beta_i, \tag{5} $$
where $N(k_v^- > k_v^+)$ is the number of convergent nodes and $c$ is the number of connected components. If the $i$-th connected component is balanced, then $\beta_i = 1$; otherwise, $\beta_i = 0$.
Proof The edge dynamics of a directed network $G$ are equivalent to the nodal dynamics of its corresponding line graph $L(G)$. Each acyclic edge in the line graph $L(G)$ corresponds to a path of length 2 in $G$, and a single cyclic edge in $L(G)$ generates a single path of length 1 in $G$. When we flip the direction of each edge in $L(G)$, the direction of the corresponding path in the original graph $G$ also flips. From Lemma 1 we know that, after flipping the direction of each directed edge, the controller in $L(G)$ is just an observer of its dual (transposed) system $L(G^T)$. Because the edges in the original graph $G^T$ corresponding to the line graph $L(G^T)$ are also flipped, the divergent (convergent) nodes in $G$ are the convergent (divergent) nodes in $G^T$. It has been proved in [10] that the minimal set of driver nodes required to control the edge dynamics on a network $G$ can be determined by choosing the divergent nodes of $G$ and one arbitrary node from each balanced component. So the minimum set of sensor nodes required to observe the edge dynamics on a network $G$ can be determined by selecting the convergent nodes of $G$ and one arbitrary node from each balanced component.
Theorem 2 The minimum number of observed edges required to maintain structural observability of the edge dynamics on a network $G(V, E)$ is given by
$$ M_O = \sum_{v=1}^{N} \max(k_v^- - k_v^+, 0) + \sum_{i=1}^{c} \beta_i, \tag{6} $$
where $\beta_i$ is the same as that in Eq. (5).
Proof The overall proof is similar to that of Theorem 1. When the direction of each edge in $L(G)$ is flipped, the direction of each edge in $G$ is also flipped. So the divergent nodes in $G$ are transformed into the convergent nodes of the dual system $G^T$, but the number of balanced nodes in a balanced component does not change. It has been proved in [10] that if the edge dynamics of a complex network are controllable, each divergent node must control $k_v^+ - k_v^-$ of its outbound edges, and each selected node in a balanced component must control only one of its outbound edges. By the principle of duality, when the edge dynamics of a complex network are observable, each convergent node must observe $k_v^- - k_v^+$ of its inbound edges, and the selected node in each balanced component must observe only one of its inbound edges.
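Theorems 1 and 2 reduce the sensor-placement problem to counting degrees and balanced components, which can be sketched in a few lines. The following is an illustrative sketch only: the function name and the two toy graphs are hypothetical, and connected components are assumed to mean weakly connected components of the directed graph.

```python
from collections import defaultdict

def sensor_counts(nodes, edges):
    """Return (N_O, M_O) following Eqs. (5) and (6).

    nodes: iterable of hashable node labels
    edges: list of (tail, head) pairs of a directed graph
    """
    nodes = list(nodes)
    k_in, k_out = defaultdict(int), defaultdict(int)
    neighbours = defaultdict(set)  # undirected view, used for weak connectivity
    for tail, head in edges:
        k_out[tail] += 1
        k_in[head] += 1
        neighbours[tail].add(head)
        neighbours[head].add(tail)

    n_convergent = sum(1 for v in nodes if k_in[v] > k_out[v])
    surplus = sum(max(k_in[v] - k_out[v], 0) for v in nodes)

    # Count balanced components: only balanced nodes, at least one edge.
    seen, n_balanced = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], []
        while stack:
            v = stack.pop()
            component.append(v)
            for u in neighbours[v]:
                if u not in seen:
                    seen.add(u)
                    stack.append(u)
        has_edge = any(k_in[v] + k_out[v] > 0 for v in component)
        if has_edge and all(k_in[v] == k_out[v] for v in component):
            n_balanced += 1

    return n_convergent + n_balanced, surplus + n_balanced

# Hypothetical examples: a directed 3-cycle and two edges pointing into one node.
cycle = sensor_counts("abc", [("a", "b"), ("b", "c"), ("c", "a")])
in_star = sensor_counts("abc", [("a", "c"), ("b", "c")])
print(cycle, in_star)
```

In the cycle every node is balanced, so one arbitrary sensor node observing one inbound edge suffices. In the second graph node c is convergent with two surplus inbound edges, so it is the single sensor node and both edges must be observed.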
4 Theoretical Analysis
The structural observability of edge dynamics can be quantified by the fraction of sensor nodes $n_O$ and the fraction of observed edges $m_O$, where $n_O = N_O/N$ and $m_O = M_O/M$. According to Theorem 1, the fraction of sensor nodes depends almost entirely on the joint degree distribution of the network. By ignoring the possible balanced components, the fraction of sensor nodes in a network with a joint degree distribution $P(k_v^- = i, k_v^+ = j) = p_{ij}$ is simply given by
$$ n_O = \sum_{j=0}^{\infty} \sum_{i=j+1}^{\infty} p_{ij}. \tag{7} $$
That is, only the joint probabilities for which the in-degree is larger than the out-degree need to be summed. When the in-degree and out-degree are uncorrelated and follow the same distribution (as in all the model networks), $p_{ij} = p_{ji}$ also holds, so $n_O$ can also be written as
$$ n_O = \frac{1}{2}\left(1 - \sum_{k=0}^{\infty} p_{kk}\right). \tag{8} $$
An analytical expression for the fraction of observed edges can also be derived from the joint degree distribution as follows:
$$ m_O = \frac{1}{\langle k \rangle} \sum_{i=1}^{\infty} \sum_{j=0}^{\infty} i \, p_{(i+j),j}, \tag{9} $$
where $\langle k \rangle = M/N$. The formula stems from the fact that there are $N p_{ij}$ sensor nodes with in-degree $i$ and out-degree $j$, each such node has $i - j$ observed edges, and the sum must be divided by $M$ to obtain the fraction of observed edges (Fig. 2).
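Equations (7) to (9) can be checked numerically for any degree distribution with independent in- and out-degrees, for which $p_{ij} = p_i p_j$. The sketch below uses a truncated Poisson distribution purely as an example; the truncation point, the mean degree of 3 and the function names are assumptions made for illustration.

```python
import math

def poisson_pmf(mean, kmax):
    """Poisson probabilities p_0 .. p_kmax (the tail beyond kmax is neglected)."""
    return [math.exp(-mean) * mean**k / math.factorial(k) for k in range(kmax + 1)]

def n_o_double_sum(p):
    """Eq. (7) with p_ij = p_i * p_j: sum over in-degree i > out-degree j."""
    K = len(p)
    return sum(p[i] * p[j] for j in range(K) for i in range(j + 1, K))

def n_o_diagonal(p):
    """Eq. (8): half of the probability mass lying off the diagonal."""
    return 0.5 * (1.0 - sum(pk * pk for pk in p))

def m_o(p, mean):
    """Eq. (9): expected surplus of in-degree over out-degree, per edge."""
    K = len(p)
    total = sum(i * p[i + j] * p[j] for i in range(1, K) for j in range(K - i))
    return total / mean

mean_degree = 3.0
p = poisson_pmf(mean_degree, kmax=60)
print(n_o_double_sum(p), n_o_diagonal(p), m_o(p, mean_degree))
```

The first two printed values agree up to truncation error, which illustrates why the symmetric case only needs the diagonal terms $p_{kk}$.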
Fig. 2 a $n_O$ and c $m_O$ in ER and EX networks as a function of the average degree $\langle k \rangle$. b $n_O$ and d $m_O$ in SF networks as a function of the exponent $\gamma$ of the degree distribution. ER, EX and SF networks [17–19] are all generated based on static models with $N = 5000$. All data points and error bars are obtained by averaging over 100 independent realizations
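Erdős-Rényi networks: For the Erdős-Rényi (ER) network, both the in-degree and the out-degree follow a Poisson distribution. So $n_O$ is given as follows:
$$ n_O^{ER} = \frac{1}{2}\left(1 - e^{-2\langle k \rangle} \sum_{k=0}^{\infty} \frac{\langle k \rangle^{2k}}{k!\,k!}\right) = \frac{1}{2}\left(1 - e^{-2\langle k \rangle} I_0(2\langle k \rangle)\right), \tag{10} $$
where $I_\alpha(x)$ is the modified Bessel function of the first kind. We can also compute $m_O$ as follows:
$$ m_O^{ER} = \frac{e^{-2\langle k \rangle}}{\langle k \rangle} \sum_{i=1}^{\infty} \sum_{j=0}^{\infty} \frac{i \, \langle k \rangle^{i+2j}}{(i+j)!\,j!} = \frac{e^{-2\langle k \rangle}}{\langle k \rangle} \sum_{i=1}^{\infty} i \, I_i(2\langle k \rangle). \tag{11} $$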
Exponential networks: In the exponential (EX) network, the in-degree and out-degree follow an exponential distribution $P(k_v^+ = k) = P(k_v^- = k) = C e^{-k/\kappa}$, where $C = 1 - e^{-1/\kappa}$ and $\kappa = 1/\log\frac{1 + \langle k \rangle}{\langle k \rangle}$. So the expected value of $n_O$ is given as follows:
$$ n_O^{EX} = \frac{1}{2}\left(1 - C^2 \sum_{i=0}^{\infty} e^{-2i/\kappa}\right) = \frac{\langle k \rangle}{2\langle k \rangle + 1}. \tag{12} $$
Similarly, the expected value of $m_O$ is as follows:
$$ m_O^{EX} = \frac{C^2}{\langle k \rangle} \sum_{i=1}^{\infty} \sum_{j=0}^{\infty} i \, e^{-(i+2j)/\kappa} = \frac{\langle k \rangle + 1}{2\langle k \rangle + 1}. \tag{13} $$
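The closed forms in Eqs. (12) and (13) can be cross-checked against truncated versions of the sums. The following sketch is illustrative only: the function name, the truncation at 2000 terms and the mean degree of 4 are assumptions, not values taken from the paper.

```python
import math

def ex_fractions(mean_degree, kmax=2000):
    """Evaluate the sums in Eqs. (12) and (13) by truncation at kmax."""
    kappa = 1.0 / math.log((1.0 + mean_degree) / mean_degree)
    C = 1.0 - math.exp(-1.0 / kappa)
    p = [C * math.exp(-k / kappa) for k in range(kmax + 1)]

    # Eq. (12): one half of the off-diagonal probability mass.
    n_o = 0.5 * (1.0 - sum(pk * pk for pk in p))

    # Eq. (13): the double sum factorises into a sum over i and a sum over j.
    sum_i = sum(i * math.exp(-i / kappa) for i in range(1, kmax + 1))
    sum_j = sum(math.exp(-2.0 * j / kappa) for j in range(kmax + 1))
    m_o = C * C / mean_degree * sum_i * sum_j
    return n_o, m_o

k_mean = 4.0
n_o_num, m_o_num = ex_fractions(k_mean)
n_o_closed = k_mean / (2.0 * k_mean + 1.0)
m_o_closed = (k_mean + 1.0) / (2.0 * k_mean + 1.0)
print(n_o_num, n_o_closed, m_o_num, m_o_closed)
```

Both fractions tend to 1/2 as the average degree grows, with $n_O$ approaching from below and $m_O$ from above.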
Scale-free networks: The in-degree and out-degree of the scale-free (SF) network follow a power-law distribution, that is,
$$ p(k) = \frac{[\langle k \rangle (1 - a)]^{1/a}}{a} \, \frac{\Gamma(k - 1/a, \langle k \rangle (1 - a))}{\Gamma(k + 1)}, \tag{14} $$
where $\Gamma(s, x)$ is the incomplete Gamma function, $\Gamma(n) = (n - 1)!$ is the Gamma function and the parameter $a = 1/(\gamma - 1)$. Thus $n_O$ is given as follows:
$$ n_O^{SF} = \frac{1}{2}\left(1 - \delta^2 \sum_{k=0}^{\infty} \Gamma_k^2\right). \tag{15} $$
Similarly, the expected value of $m_O$ is as follows:
$$ m_O^{SF} = \frac{1}{\langle k \rangle} \sum_{i=1}^{\infty} \sum_{j=0}^{\infty} i \, \delta^2 \, \Gamma_{i+j} \Gamma_j, \tag{16} $$
where $\delta = \frac{[\langle k \rangle (1 - a)]^{1/a}}{a}$ and $\Gamma_k = \frac{\Gamma(k - 1/a, \langle k \rangle (1 - a))}{\Gamma(k + 1)}$.
For each network, good agreement is obtained between the analytical predictions and the simulation results. This shows that the fraction of sensor nodes and observed edges depends almost entirely on the joint degree distribution of the network.
5 Conclusion
We study the structural observability of edge dynamics in complex networks. We derive general theoretical formulations for the minimum number of sensor nodes and observed edges of edge dynamics and apply them to model networks. In future work, we will apply the structural observability framework of edge dynamics to real networks and give the corresponding theoretical formulations.
Acknowledgements This work was supported by the National Natural Science Foundation of China under Grant Nos. 61903208 and 62103210, the Training Fund of Qilu University of Technology (Shandong Academy of Sciences) under Grant Nos. 2022PY054 and 2022PY003, and the Youth Innovation Science and Technology Support Plan of Colleges in Shandong Province (2021KJ025).
References
1. Bollobás, B.: Modern Graph Theory. Springer Science & Business Media (1998)
2. Albert, R., Barabási, A.L.: Statistical mechanics of complex networks. Rev. Mod. Phys. 74(1), 47 (2002). https://doi.org/10.1103/revmodphys.74.47
3. Dorogovtsev, S.N., Mendes, J.F.F.: Evolution of networks: from biological nets to the Internet and WWW. Oxford Univ. Press (2003). https://doi.org/10.1063/1.1825279
4. Dorogovtsev, S.N., Goltsev, A.V., Mendes, J.F.F.: Critical phenomena in complex networks. Rev. Mod. Phys. 80(4), 1275 (2008). https://doi.org/10.1103/revmodphys.80.1275
5. Pastor-Satorras, R., Castellano, C., Van Mieghem, P., et al.: Epidemic processes in complex networks. Rev. Mod. Phys. 87(3), 925 (2015). https://doi.org/10.1103/RevModPhys.87.925
6. Lin, C.T.: Structural controllability. IEEE Trans. Autom. Control 19(3), 201–208 (1974)
7. Liu, Y.Y., Slotine, J.J., Barabási, A.L.: Controllability of complex networks. Nature 473(7346), 167–173 (2011). https://doi.org/10.1038/nature10011
8. Bohner, M., Wintz, N.: Controllability and observability of time-invariant linear dynamic systems. Mathematica Bohemica 137(2), 149–163 (2012). https://doi.org/10.21136/MB.2012.142861
9. Liu, Y.Y., Slotine, J.J., Barabási, A.L.: Observability of complex systems. Proc. Nat. Acad. Sci. 110(7), 2460–2465 (2013). https://doi.org/10.1073/pnas.1215508110
10. Nepusz, T., Vicsek, T.: Controlling edge dynamics in complex networks. Nat. Phys. 8(7), 568–573 (2012). https://doi.org/10.1038/nphys2327
11. Pang, S.P., Wang, W.X., Hao, F., et al.: Universal framework for edge controllability of complex networks. Sci. Rep. 7(1), 4224 (2017). https://doi.org/10.1038/s41598-017-04463-5
12. Pang, S., Hao, F.: Optimizing controllability of edge dynamics in complex networks by perturbing network structure. Physica A Stat. Mech. Appl. 470, 217–227 (2017). https://doi.org/10.1016/j.physa.2016.12.001
13. Pang, S.P., Hao, F.: Target control of edge dynamics in complex networks. Physica A Stat. Mech. Appl. 512, 14–26 (2018). https://doi.org/10.1016/j.physa.2018.08.011
14. Lu, F., Yang, K., Qian, Y.: Target control based on edge dynamics in complex networks. Sci. Rep. 10(1), 1–12 (2020). https://doi.org/10.1038/s41598-020-66524-6
15. Wang, X., Wang, X.: Consensus of edge dynamics on complex networks. In: IEEE International Symposium on Circuits and Systems (ISCAS), 1271–1274. IEEE, 2014
16. Shen, C., Ji, Z., Yu, H.: The structural controllability of edge dynamics in complex networks. In: Chinese Control and Decision Conference (CCDC), 5356–5360. IEEE, 2018
17. Chung, F., Lu, L.: Connected components in random graphs with given expected degree sequences. Ann. Combinatorics 6(2), 125–145 (2002). https://doi.org/10.1007/pl00012580
18. Goh, K.I., Kahng, B., Kim, D.: Universal behavior of load distribution in scale-free networks. Phys. Rev. Lett. 87(27), 278701 (2001). https://doi.org/10.1103/physrevlett.87.278701
19. Catanzaro, M., Pastor-Satorras, R.: Analytic solution of a static scale-free network model. Euro. Phys. J. B-Condensed Matter Complex Syst. 44, 241–248 (2005). https://doi.org/10.1140/epjb/e2005-00120-9
A Modified Orthogonal Experimental Method for Configuration Data Acquisition Planning of Industrial Robots Xinyang Guo, Guanbin Gao, Fei Liu, and Yashan Xing
Abstract The configuration data is an important factor affecting the stability and generalization ability of calibration results. A data acquisition planning method based on the orthogonal experimental method optimized by particle swarm optimization (PSO) is proposed to improve the stability and generalization ability of kinematic calibration. Firstly, the robot position error model is derived. Secondly, the corresponding angle values of each joint level are determined by PSO. Finally, the proposed method is verified by simulation. The results show that the average accuracy of the proposed method is 21.58% and 11.29% higher than the random method and the conventional orthogonal experimental method, respectively. Moreover, the analysis results indicate that the proposed method has stronger generalization ability and stability across the entire workspace. Keywords Industrial robot · Orthogonal experimental method · Parameter identification
X. Guo · G. Gao (B) · F. Liu · Y. Xing
Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, China
e-mail: [email protected]
Yunnan International Joint Laboratory of Intelligent Control and Application of Advanced Equipment, Kunming University of Science and Technology, Kunming 650500, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_35
1 Introduction
With the development of industrial automation and flexibility in recent years, the demand for industrial robots has been increasing rapidly, and the requirements for robot accuracy have been increasing as well. The positioning accuracy can be divided into absolute positioning accuracy and repeated positioning accuracy [1]. The absolute positioning accuracy of industrial robots is relatively low, typically only 2∼3 mm, but industrial applications that rely on absolute positioning accuracy are becoming more and more widespread [2]. It has been found that the kinematic parameter error accounts for about 90% of the total error of the robot [3], so kinematic calibration can effectively improve the
absolute positioning accuracy. Current research in the field of robot calibration mainly focuses on kinematic modeling and parameter identification [4]. Only a few studies have examined the configuration data used in parameter identification, although the configuration data has an important impact on the calibration results [5]. Sun et al. [6] proposed a robot optimization strategy based on the circle points analysis method, which reduced the absolute positioning error by 43.99%. Jia et al. [7] sought the optimal configuration by using the set construction method and the average error increased by 66.67% compared with the random configuration. Lyu et al. [8] proposed an improved optimal configuration selection algorithm, which reduced the average error by 9% compared with random configurations. The above works show that configuration data screened by observability can improve the effectiveness of robot calibration results, but these methods first acquire a large amount of configuration data and then filter it. In reality, the acquisition of robot configuration data requires a lot of time, and most of the acquired data is discarded due to poor observability. Moreover, the acquisition of configurations is usually based on personal experience or a random method, and it is difficult to control the observability of random configurations [9], so optimizing after the configurations have been acquired has some defects. If each set of data is planned before configuration acquisition, then the configuration acquisition of robot calibration can be standardized without acquiring too much data. The orthogonal experimental method is a method for multi-factor and multi-level experiment planning, which can solve multi-factor optimization problems [10]. More comprehensive position information can be obtained by using orthogonal experiments to acquire configurations. Qi et al. [11] proposed using the orthogonal experimental method as the sample data acquisition method, and established a robot positioning error measurement system. Han et al. [12] selected the configuration data by the orthogonal experimental method with an even division of the joint angle space, and showed that the method was feasible by projecting onto each coordinate plane. There have been no reports yet on the use of the orthogonal experimental method for robot calibration. The orthogonal experimental method typically assumes an even distribution of values within a given range. However, robots consist of a number of joints, and each joint has a unique range of motion. Introducing the conventional orthogonal experimental method directly into robot calibration can potentially weaken the effectiveness of the calibration process [13]. In contrast, we propose a configuration data acquisition planning method based on an orthogonal experimental method optimized by particle swarm optimization, which aims to enhance the overall effectiveness of the calibration process. Firstly, we derive the robot position error model and determine the initial parameters for the orthogonal experimental table. Secondly, we utilize the particle swarm optimization algorithm to divide the joint angle space. Finally, we obtain the orthogonal experimental table based on the particle swarm optimization approach. This optimized table is then compared with the conventional orthogonal experimental method using an even partition and with the random method.
2 Robot Modeling
2.1 Robot Kinematic Modeling
In this paper, a 6R general-purpose collaborative robot is used as the object of study. Firstly, the coordinate systems of the robot are established based on the MD-H rule, and the kinematic parameters of the robot are then obtained. The robot kinematic parameters, theoretical joint angle range and actual joint angle range are shown in Table 1, where $a_{i-1}$ denotes the linkage length of joint $i - 1$; $\alpha_{i-1}$ denotes the linkage twist angle; $\theta_i$ denotes the joint rotation angle of joint $i$; and $d_i$ denotes the offset of linkage $i$. According to the principle of homogeneous coordinate transformation, the relationship between the coordinate systems of link $i - 1$ and link $i$ can be represented by the homogeneous transformation matrix ${}^{i-1}_{i}T$:
$$ {}^{i-1}_{i}T = \mathrm{Rot}_X(\alpha_{i-1}) \cdot \mathrm{Trans}_X(a_{i-1}) \cdot \mathrm{Rot}_Z(\theta_i) \cdot \mathrm{Trans}_Z(d_i) = \begin{bmatrix} c\theta_i & -s\theta_i & 0 & a_{i-1} \\ s\theta_i c\alpha_{i-1} & c\theta_i c\alpha_{i-1} & -s\alpha_{i-1} & -s\alpha_{i-1} d_i \\ s\theta_i s\alpha_{i-1} & c\theta_i s\alpha_{i-1} & c\alpha_{i-1} & c\alpha_{i-1} d_i \\ 0 & 0 & 0 & 1 \end{bmatrix} \tag{1} $$
where $s$ represents the sine function and $c$ represents the cosine function. The pose of the robot end-effector can be calculated from Eq. (2):
$$ {}^{0}_{6}T = {}^{0}_{1}T \cdot {}^{1}_{2}T \cdot {}^{2}_{3}T \cdot {}^{3}_{4}T \cdot {}^{4}_{5}T \cdot {}^{5}_{6}T \tag{2} $$
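Equations (1) and (2) translate directly into code. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the nominal parameters are copied from Table 1, the function names are hypothetical, and the joint angles in the usage line are arbitrary values picked for the example.

```python
import numpy as np

# Nominal MD-H parameters copied from Table 1 (angles in degrees, lengths in mm).
ALPHA_DEG = [0.0, -90.0, 0.0, 0.0, -90.0, -90.0]
A_MM = [0.0, 0.0, 418.0, 398.0, 0.0, 0.0]
D_MM = [96.0, 138.0, -114.0, 98.0, 98.0, 89.0]

def mdh_transform(alpha, a, theta, d):
    """Homogeneous transform of Eq. (1); angles in radians, lengths in mm."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    ct, st = np.cos(theta), np.sin(theta)
    return np.array([
        [ct,      -st,      0.0,  a],
        [st * ca,  ct * ca, -sa, -sa * d],
        [st * sa,  ct * sa,  ca,  ca * d],
        [0.0,      0.0,      0.0, 1.0],
    ])

def forward_kinematics(theta_deg):
    """Chain the six link transforms as in Eq. (2) and return the 4x4 pose."""
    T = np.eye(4)
    for alpha, a, d, theta in zip(ALPHA_DEG, A_MM, D_MM, theta_deg):
        T = T @ mdh_transform(np.radians(alpha), a, np.radians(theta), d)
    return T

# Arbitrary joint angles (deg), chosen only to exercise the function.
T06 = forward_kinematics([90.0, -60.0, 45.0, 30.0, 60.0, -120.0])
print(np.round(T06[:3, 3], 3))  # end-effector position in mm
```

The last column of the returned matrix holds the end-effector position, which is the quantity used by the position error model in the next subsection.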
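Table 1 The parameters of the 6-DoF robot
Number of linkages | α_{i−1} (deg) | a_{i−1} (mm) | θ_i (deg) | d_i (mm) | Theoretical joint angle range (deg) | Actual joint angle range (deg)
1 | 0 | 0 | θ1 | 96 | 0∼360 | 80∼200
2 | −90 | 0 | θ2 | 138 | −180∼180 | −145∼35
3 | 0 | 418 | θ3 | −114 | −180∼180 | 25∼140
4 | 0 | 398 | θ4 | 98 | −180∼180 | −180∼165
5 | −90 | 0 | θ5 | 98 | 0∼360 | 30∼165
6 | −90 | 0 | θ6 | 89 | −360∼0 | 360∼0
2.2 Position Error Modeling
According to the kinematic modeling in Sect. 2.1, there are 24 parameters from the base to the end of the robot, $\alpha_{i-1}$, $a_{i-1}$, $\theta_i$, $d_i$ ($i = 1, 2, \ldots, 6$). Their deviations $\Delta\alpha_{i-1}$, $\Delta a_{i-1}$, $\Delta\theta_i$, $\Delta d_i$ represent the kinematic parameter errors, and the error model of the robot is established through the differential principle as follows:
$$ \Delta p = \sum_{i=1}^{6} \frac{\partial p}{\partial \alpha_{i-1}} \Delta\alpha_{i-1} + \sum_{i=1}^{6} \frac{\partial p}{\partial a_{i-1}} \Delta a_{i-1} + \sum_{i=1}^{6} \frac{\partial p}{\partial \theta_i} \Delta\theta_i + \sum_{i=1}^{6} \frac{\partial p}{\partial d_i} \Delta d_i \tag{3} $$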
where $\Delta p$ represents the error between the theoretical position and the actual position of the robot end-effector. The matrix form is
$$ \Delta p = p_R - p_N = J \Delta k, \tag{4} $$
where $p_R$ represents the actual position of the end-effector of the robot, $p_N$ represents the nominal position of the end-effector of the robot, $J$ is the Jacobian matrix of the robot, and $\Delta k$ is the error vector of the kinematic parameters, which contains 24 parameters. In summary, the position error model with respect to the robot's kinematic parameters is established. In this paper, the least squares method is used to identify the kinematic parameters and thereby realize the robot calibration.
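A least-squares identification step based on Eq. (4) can be sketched as follows. Everything beyond the equation itself is an assumption made for illustration: the Jacobian is approximated by central finite differences instead of the analytic differential model, the 49 configurations are drawn at random, and the injected errors of 0.02° and 1 mm are hypothetical values. Because some MD-H parameters are redundant for position-only measurements, the estimate is the minimum-norm solution and need not equal the injected error vector; what the sketch checks is that the linear model explains the observed position errors.

```python
import numpy as np

ALPHA = np.radians([0.0, -90.0, 0.0, 0.0, -90.0, -90.0])  # Table 1, rad
A = np.array([0.0, 0.0, 418.0, 398.0, 0.0, 0.0])           # Table 1, mm
D = np.array([96.0, 138.0, -114.0, 98.0, 98.0, 89.0])      # Table 1, mm

def end_position(params, theta):
    """End-effector position for params = [alpha, a, theta offsets, d] (6 each)."""
    alpha, a, dtheta, d = params.reshape(4, 6)
    T = np.eye(4)
    for i in range(6):
        ca, sa = np.cos(alpha[i]), np.sin(alpha[i])
        ct, st = np.cos(theta[i] + dtheta[i]), np.sin(theta[i] + dtheta[i])
        T = T @ np.array([[ct, -st, 0.0, a[i]],
                          [st * ca, ct * ca, -sa, -sa * d[i]],
                          [st * sa, ct * sa, ca, ca * d[i]],
                          [0.0, 0.0, 0.0, 1.0]])
    return T[:3, 3]

def jacobian(params, theta, h=1e-6):
    """3x24 Jacobian of the position w.r.t. the parameters (central differences)."""
    J = np.zeros((3, 24))
    for j in range(24):
        step = np.zeros(24)
        step[j] = h
        J[:, j] = (end_position(params + step, theta)
                   - end_position(params - step, theta)) / (2.0 * h)
    return J

rng = np.random.default_rng(0)
nominal = np.concatenate([ALPHA, A, np.zeros(6), D])
# Hypothetical parameter errors: 0.02 deg on angles, 1 mm on lengths.
true_error = np.concatenate([np.full(6, np.radians(0.02)), np.full(6, 1.0),
                             np.full(6, np.radians(0.02)), np.full(6, 1.0)])
thetas = np.radians(rng.uniform(-180.0, 180.0, size=(49, 6)))

J = np.vstack([jacobian(nominal, th) for th in thetas])             # (147, 24)
dp = np.concatenate([end_position(nominal + true_error, th)
                     - end_position(nominal, th) for th in thetas])  # (147,)
dk, *_ = np.linalg.lstsq(J, dp, rcond=None)                          # Eq. (4)
relative_residual = np.linalg.norm(dp - J @ dk) / np.linalg.norm(dp)
print(relative_residual)
```

In a real calibration, `dp` would come from measured end-effector positions and the step would typically be iterated, updating the nominal parameters with `dk` each time.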
3 Orthogonal Experimental Design Based on Joint Space Division
3.1 Design of the Orthogonal Experimental Table
According to the simulation platform, the six joints of the industrial robot are selected as the six factors of the orthogonal experimental table, and about 50 points are usually acquired for calibration. It is therefore concluded that the $L_{49}(7^6)$ orthogonal table can be used for configuration planning, where 7 represents the number of levels of each joint. The $L_{49}(7^6)$ orthogonal experimental table with 6 factors and 7 levels was generated by a program run in R-studio. According to the meaning of the orthogonal experimental table, 49 represents a total of 49 groups of experiments, that is, only 49 groups of data need to be acquired for the planned configurations. The first five groups of experiments are shown in Table 2.
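An orthogonal table with 49 runs, 6 factors and 7 levels can also be built by hand from linear forms modulo 7. The sketch below is an alternative construction assumed for illustration, not the R program used by the authors, so its row order and level assignment will generally differ from Table 2; the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations, product

def orthogonal_table_l49(num_factors=6, levels=7):
    """Rows are pairs (x, y) modulo 7; column 0 is x and column m >= 1 is
    (m - 1) * x + y modulo 7. Any two such columns are independent linear
    forms over GF(7), so every pair of levels occurs exactly once."""
    assert levels == 7 and 2 <= num_factors <= 8
    table = []
    for x, y in product(range(levels), repeat=2):
        row = [x] + [((m - 1) * x + y) % levels for m in range(1, num_factors)]
        table.append([level + 1 for level in row])  # levels numbered 1..7
    return table

table = orthogonal_table_l49()

def is_orthogonal(table, levels=7):
    """Check that each pair of columns contains every level pair exactly once."""
    n_cols = len(table[0])
    for c1, c2 in combinations(range(n_cols), 2):
        counts = Counter((row[c1], row[c2]) for row in table)
        if len(counts) != levels * levels or set(counts.values()) != {1}:
            return False
    return True

print(len(table), is_orthogonal(table))
for row in table[:5]:
    print(row)
```

The balance property checked by `is_orthogonal` is what allows 49 runs to stand in for the $7^6$ runs of a full factorial design.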
Table 2 The orthogonal experimental table of L49(7^6) (first five points; entries are factor levels)
Point | Joint1 | Joint2 | Joint3 | Joint4 | Joint5 | Joint6
1 | 1 | 3 | 5 | 7 | 2 | 4
2 | 6 | 7 | 5 | 7 | 5 | 3
3 | 2 | 1 | 7 | 6 | 5 | 4
4 | 7 | 1 | 5 | 3 | 4 | 5
5 | 3 | 4 | 5 | 6 | 7 | 1
3.2 Division of Joint Space Equally
Each level of the orthogonal experimental table is usually evenly assigned. Taking joint 1 as an example, its actual joint angle range from 80° to 200° is divided into 7 angles: [80°, 100°, 120°, 140°, 160°, 180°, 200°]. These 7 angles correspond to the 7 levels, that is, level 1 is 80°, level 2 is 100°, ..., and level 7 is 200°. Similarly, the angle ranges of the remaining 5 joints are divided into intervals to complete the $L_{49}(7^6)$ orthogonal experimental table.
3.3 Division of Joint Space Based on PSO
PSO is a parallel global optimization algorithm based on swarm intelligence, which was originally proposed by Kennedy and Eberhart [14]. The optimal solution of the problem is obtained by searching. The algorithm has a simple structure and fast convergence speed, and has certain advantages in dealing with high-dimensional problems [15]. Therefore, this paper uses this algorithm to search the joint angles. During the search process, each particle $i$ is represented by two vectors, the velocity vector $V_{id}^t$ and the position vector $X_{id}^t$. Each particle updates its velocity and position using its personal best position in history $P_{bi}$ and the global best position $P_g$ found so far. The update formulas of $V_i$ and $X_i$ are shown in Eqs. (5) and (6), respectively.
$$ V_{id}^{t+1} = \omega V_{id}^t + c_1 r_1 (P_{bid}^t - X_{id}^t) + c_2 r_2 (P_{gd}^t - X_{id}^t) \tag{5} $$
$$ X_{id}^{t+1} = X_{id}^t + V_{id}^{t+1} \tag{6} $$
where $V_{id}^t$ and $X_{id}^t$ denote the velocity and position of the $i$-th particle in the $d$-th dimension at the $t$-th iteration. $P_{bid}^t$ is the best position found by the $i$-th particle in the $d$-th dimension up to the $t$-th iteration, and $P_{gd}^t$ is the best position found by the whole population up to the $t$-th iteration. $t$ and $t + 1$ represent the current and next iteration, respectively. $\omega$ is the inertia weight, $c_1$ and $c_2$ represent the acceleration weights of the particles, and $r_1$ and $r_2$ are the random coefficients of $P_{bid}^t$ and $P_{gd}^t$, which are numbers in the range [0, 1]. The personal best position of each particle and the global best position are updated using Formulas (7) and (8) in each iteration.
$$ P_{bid}^{t+1} = \begin{cases} X_{id}^{t+1}, & f(X_{id}^{t+1}) \le f(P_{bid}^t) \\ P_{bid}^t, & f(X_{id}^{t+1}) > f(P_{bid}^t) \end{cases} \tag{7} $$
$$ P_{gd}^{t+1} = \begin{cases} P_{gd}^t, & f(P_{gd}^t) \le \min_i f(P_{bid}^t) \\ P_{bid}^{t+1}, & f(P_{gd}^t) > \min_i f(P_{bid}^t) \end{cases} \tag{8} $$
where $f$ is the objective function of the optimization problem. The objective is to minimize the 2-norm of the verification set error of 50 sample points. From this, the PSO-optimized orthogonal experimental table is obtained, as shown in Table 3. Following the 'uniform dispersion, neat and comparable' principle of the orthogonal experimental method and using the PSO search algorithm to divide the joint angle space, these 49 sampling points can represent the entire workspace of the robot.
Table 3 The joint space divided by the PSO (first five points; joint angles in deg)
Point | Joint1 | Joint2 | Joint3 | Joint4 | Joint5 | Joint6
1 | 142.9 | −85.8 | 34.9 | 114.6 | 49.7 | −180
2 | 101.3 | 10.5 | 108.5 | 114.6 | 148.1 | −240
3 | 80 | −145 | 78.1 | −50.1 | 148.1 | −180
4 | 142.7 | −145 | 34.9 | −50.1 | 51.1 | −120
5 | 194.3 | 10.5 | 34.9 | −50.1 | 30 | −360
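The update rules of Eqs. (5) to (8) can be sketched as a generic PSO loop. This is a minimal illustration, not the authors' implementation: the hyperparameters, the clipping of positions to the joint range and the quadratic stand-in objective are all assumptions. In the paper the objective would be the 2-norm of the verification set error after calibration, which is not reproduced here.

```python
import numpy as np

def pso_minimize(f, lower, upper, n_particles=30, n_iter=300,
                 omega=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise f over the box [lower, upper] with a basic global-best PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    X = rng.uniform(lower, upper, size=(n_particles, dim))  # positions
    V = np.zeros_like(X)                                     # velocities
    P_best = X.copy()                                        # personal bests
    f_best = np.array([f(x) for x in X])
    g = int(np.argmin(f_best))
    P_g, f_g = P_best[g].copy(), float(f_best[g])            # global best

    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        V = omega * V + c1 * r1 * (P_best - X) + c2 * r2 * (P_g - X)  # Eq. (5)
        X = np.clip(X + V, lower, upper)         # Eq. (6), kept inside the range
        fx = np.array([f(x) for x in X])
        improved = fx <= f_best                  # Eq. (7)
        P_best[improved] = X[improved]
        f_best[improved] = fx[improved]
        g = int(np.argmin(f_best))               # Eq. (8)
        if f_best[g] < f_g:
            P_g, f_g = P_best[g].copy(), float(f_best[g])
    return P_g, f_g

# Stand-in objective: squared distance to a hypothetical set of three level angles
# inside the actual range of joint 1 (80 to 200 deg).
target = np.array([95.0, 140.0, 185.0])
best_x, best_f = pso_minimize(lambda x: float(np.sum((x - target) ** 2)),
                              lower=[80.0] * 3, upper=[200.0] * 3)
print(np.round(best_x, 3), best_f)
```

For the joint space division, each particle would encode the candidate level angles of the joints, and evaluating `f` would involve running the calibration with the resulting orthogonal table.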
4 Simulations
4.1 Dataset Generation Method
The effectiveness of the proposed method is verified by a comparative simulation of the random method, the conventional orthogonal experimental method and the proposed method. To ensure the randomness of the random method, random joint angles are generated by setting different time seeds. From this, we obtain the configurations of the three methods. In addition, a validation set I of 1000 configurations was randomly generated within the theoretical joint angle ranges in order to assess the generalization ability of the three methods.
4.2 Kinematic Parameter Error Setting
To simulate a real calibration scene, it is necessary to add parameter errors and noise at the robot end. Kinematic errors of 0.02° and 1 mm are added to the nominal values of the angular parameters $\alpha_{i-1}$, $\theta_i$ and the length parameters $a_{i-1}$, $d_i$, respectively. Then, Gaussian noise with a magnitude of 0.2 mm is added in the x, y and z directions to the calculated end position of the robot.
4.3 Simulation Results
In this paper, the effectiveness of the proposed method is demonstrated by the results of kinematic parameter identification and verification. Firstly, the kinematic parameters are identified with the configurations of each of the three methods. The position error before calibration is shown in Fig. 3, and the maximum 2-norm of the error in the x, y and z directions reaches 25 mm. The results of the identification set are shown in Fig. 4. All six groups of configurations reduce the robot error after kinematic parameter identification, and there is no significant difference in the error of the identified robot within a small range of joint angles. The results of the identification set only reflect how well the parameters fit these particular configurations. The verification set consists of 1000 samples over the whole joint angle space, so it better reflects the stability and generalization of the configurations over the whole joint angle space. The average position errors of the four random groups, the orthogonal experimental configuration and the proposed configuration are shown in Table 4. The maximum and average histograms of the six groups of configurations are shown in Fig. 1. The first column is the proposed method, the second column is the orthogonal experimental method, and the remaining four are the random method. It can be seen from Table 4 that the calibration effect of the method proposed in this paper and of the conventional orthogonal experimental method is significantly improved compared with the average value of the random method. However, as shown
Table 4 The average positioning error (mm) of the proposed method, orthogonal experimental method and random method in verification set
Error | The proposed method | Orthogonal experimental method | The mean of random method
Max error in the x | 0.71091 | 0.79478 | 0.90864
Max error in the y | 0.57898 | 0.78381 | 1.01238
Max error in the z | 0.59888 | 0.68198 | 0.79639
Mean error in the x | 0.17218 | 0.19055 | 0.21622
Mean error in the y | 0.15813 | 0.18135 | 0.21223
Mean error in the z | 0.16911 | 0.19131 | 0.20873
Fig. 1 The before calibration positioning error and identification set results of 6 groups of configurations. a Positioning error before calibration. b Positioning error after calibration. (Plots of position error in mm against points for the proposed method and the orthogonal experimental method)
Fig. 2 The maximum positioning error and average positioning error of the three methods in verification set. a The maximum positioning error. b The average positioning error. (Bar charts of max error and mean error in mm along x, y and z)
in Fig. 2, compared with the first group of the random method, the positioning error of the conventional orthogonal experimental method is larger. This comparison shows that calibration with the evenly divided orthogonal experimental configuration has poor stability. The maximum accuracy of the proposed method is 10.55%, 26.13% and 12.19% higher than that of the conventional orthogonal experimental method in the x, y and z directions, respectively, and the average accuracy of the proposed method is 9.64%, 12.8% and 11.6% higher than that of the conventional orthogonal experimental method in the x, y and z directions, respectively. Besides, the maximum accuracy of the proposed method is 21.76%, 42.81% and 24.8% higher than that of the random method in the x, y and z directions, respectively, and the average accuracy of the proposed method is 20.37%, 25.49% and 18.95% higher than that of the random method in the x, y and z directions, respectively. Therefore, the calibration effect of the configurations planned by the proposed method is more stable, and the verification set results over the full joint angle range show that the method generalizes better.
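The percentage improvements quoted above follow from Table 4 as relative error reductions. The short sketch below recomputes them; the values are copied from Table 4, while the variable and function names are hypothetical and the formula (baseline minus proposed, divided by baseline) is the assumed definition of "accuracy higher by".

```python
# Positioning errors in mm copied from Table 4, ordered as (x, y, z).
proposed = {"max": [0.71091, 0.57898, 0.59888], "mean": [0.17218, 0.15813, 0.16911]}
orthogonal = {"max": [0.79478, 0.78381, 0.68198], "mean": [0.19055, 0.18135, 0.19131]}
random_mean = {"max": [0.90864, 1.01238, 0.79639], "mean": [0.21622, 0.21223, 0.20873]}

def improvement(baseline, method):
    """Relative error reduction in percent for each of the x, y, z directions."""
    return [100.0 * (b - m) / b for b, m in zip(baseline, method)]

vs_orth_max = improvement(orthogonal["max"], proposed["max"])
vs_orth_mean = improvement(orthogonal["mean"], proposed["mean"])
vs_rand_max = improvement(random_mean["max"], proposed["max"])
vs_rand_mean = improvement(random_mean["mean"], proposed["mean"])

for name, values in [("vs orthogonal, max", vs_orth_max),
                     ("vs orthogonal, mean", vs_orth_mean),
                     ("vs random, max", vs_rand_max),
                     ("vs random, mean", vs_rand_mean)]:
    print(name, [round(v, 2) for v in values])
```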
5 Conclusion
A configuration data acquisition planning method for the kinematic calibration of industrial robots, based on the orthogonal experimental method optimized by PSO, is proposed. The kinematic model of the robot is established by the rigid body differential kinematic theory. The $L_{49}(7^6)$ orthogonal experimental table is determined according to the number of robot joints and the number of configurations, and the joint space is divided by PSO. The simulation results show that, under the noisy verification set, the average accuracy of the proposed method is 9.64%, 12.8% and 11.6% higher than that of the conventional orthogonal experimental method in the x, y and z directions, respectively, and is 20.37%, 25.49% and 18.95% higher than that of the random method in the x, y and z directions, respectively. This shows that the proposed method has strong generalization ability. It is also found that the configurations planned by the proposed method are less affected by noise in parameter identification. Therefore, the identified parameters are more stable and generalize better in the calibration process.
Acknowledgements This work was supported by the National Natural Science Foundation of China under grant (52265001), and partially supported by the Yunnan Fundamental Research Projects under grant (202201ASO70033).
References 1. Gao, G.B., Niu, J.P., Liu, F.: Positioning error compensation of 6-dof robots based on anisotropic error similarity. Optics Precision Eng. 30(16), 1955–1967 (2022) 2. Feng, L.M., Yu, J.H., Wang, Y.Y.: Research on calibration of absolute positioning accuracy of 6-dof cooperative robot. Manufact. Autom. 44(10), 25–28 (2022) 3. Ni, H.K., Yang, Z.Y., Yang, Y.F.: Robot kinematics calibration method considering base frame error. Chin. Mech. Eng. 33(06), 647–655 (2022) 4. Luo, G.Y., Zou, L., Wang, Z.L.: A novel kinematic parameters calibration method for industrial robot based on levenberg-marquardt and differential evolution hybrid algorithm. Rob. Comput.Integr. Manuf. 71, 1–11 (2021) 5. Jiang, Z.X., Huang, M., Tang, X.Q.: A new calibration method for joint-dependent geometric errors of industrial robot based on multiple identification spaces. Robot. Comput.-Integr. Manuf. 71, 1–16 (2021) 6. Sun, D.L., Qiao, G.F., Song, G.M.: Experimental study on accuracy of kinematic calibration for serial industrial robots based on CPA method. Instrum. Tech. Sens . 77–83 (2021) 7. Jia, Q.X., Wang, S.W., Chen, G.: A novel optimal design of measurement configurations in robot calibration. Math. Prob. Eng. 2018, 1–17 (2018) 8. Lyu, Z.Y., Wen, X.L., Cui, W.X.: Optimization of pose set for kinematic parameter calibration of industrial robot. Instrum. Tech. Sens. 97–102 (2021) 9. Wen, X.L., Song, A.G., Feng, Y.G.: Robot calibration and uncertainty evaluation based on optimal pose set. Chin. J. Sci. Instr. 43(9), 276–283 (2023) 10. Taguchi, G.: The system of experimental design engineering methods to optimize quality and minimize cost (1987) 11. Qi, L.Z., Chen, L., Wang, W.: Industrial robot’s positioning error measurement based on orthogonal experimental table. Chin. Mech. Eng. 48(6), 720–723 (2018) 12. Han, S., Liu, M.L., Wang, J.S.: Studies on the influencing factors of the manipulator positioning error based on orthogonal experiment. Control Eng. 
China 48(6), 2219–2225 (2018)
13. Siciliano, B., Sciavicco, L., Villani, L., Oriolo, G.: Robotics: modelling, planning and control. Robot. Model. Planning Control (2011) 14. Kennedy, J., Eberhart, R.: Particle swarm optimization. In: Proceedings of ICNN’95International Conference on Neural Networks. vol. 4, pp. 1942–1948. IEEE (1995) 15. Bergh, F.V.D., Engelbrecht, A.P.: A cooperative approach to particle swarm optimization. IEEE Trans. Evol. Comput. 8(3), 225–239 (2004)
Event-Triggered Adaptive Finite-Time Tracking Control for Robot Manipulators with Full-State Constraints Cong Li, Zhiguo Xu, and Lin Zhao
Abstract This article presents a novel event-triggered adaptive finite-time tracking control scheme for robot manipulators with full-state constraints. A command filtered backstepping strategy is utilized, and the filtering error is eliminated by an error compensation mechanism. The adaptive fuzzy control technique is used to deal with the unknown nonlinearities. Then, finite-time control is employed to improve the steady-state performance of the system. Moreover, the communication burden of the system is efficiently reduced by relative-threshold event-triggered control, and all system states are constrained by means of barrier Lyapunov functions. Finally, a single-link manipulator simulation is shown to validate the effectiveness of the method. Keywords Event-trigger · Command filtered backstepping · Adaptive fuzzy control · Robot manipulator
1 Introduction Robots have attracted considerable attention in recent years and have become a hot topic in many different fields [3, 4]. Various advanced control strategies have been employed for them, such as sliding mode control [1] and adaptive control [2]. As a mature tool, the backstepping strategy [5–7] is broadly adopted for such control problems. Combined with adaptive fuzzy control [8], a system with parameter uncertainties can achieve satisfactory tracking performance. The explosion of complexity (EOC) problem can be tackled by designing a command filter, and the resulting filtering error can be handled by an error compensation mechanism [9]. A finite-time command filtered backstepping approach with fast convergence speed is established in [10]. In addition, event-triggered control can reduce the communication burden and improve control performance [11]. However, state constraints are not considered in the above methods. To constrain the system states, barrier Lyapunov functions (BLFs) have been proposed [12]. Therefore, we apply an event-triggered adaptive finite-time tracking control strategy to the robot manipulator, and the full-state constraints of the system are handled by using BLFs.
C. Li · Z. Xu · L. Zhao (B) Qingdao University, Qingdao 266000, China e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_36
2 System Description and Preliminaries
The dynamics of the manipulator are described by:
$$C(x)\ddot{x} + M(x,\dot{x})\dot{x} + G(x) = \tau \tag{1}$$
where $x \in \mathbb{R}^n$ is the joint vector, $C(x) \in \mathbb{R}^{n\times n}$ is the symmetric inertia matrix, $M(x,\dot{x}) \in \mathbb{R}^{n\times n}$ is the centripetal and Coriolis torque matrix, $G(x) \in \mathbb{R}^n$ is the gravitational torque, and $\tau \in \mathbb{R}^n$ is the control torque. The states are constrained as $|x_i| \le f_i$, where $f_i > 0$ is a constant, and the desired signal is $x_d$.
Assumption 1 Signals $x_d$ and $\dot{x}_d$ are both known, smooth, and bounded.
Lemma 1 [10] Consider the positive constants $a_1 > 0$, $a_2 > 0$, and $\alpha \in (0,1)$. If $P(t)$ satisfies $\dot{P} + a_1 P + a_2 P^{\alpha} \le 0$, then it converges to its equilibrium point in finite time
$$T \le t_0 + \frac{1}{a_1(1-\alpha)}\log\frac{a_1 P^{1-\alpha}(t_0) + a_2}{a_2}.$$
Lemma 2 [8] If $f(\omega)$ is defined on a compact set $\Omega$, then the fuzzy logic system $M^T S(\omega)$ satisfies the inequality $\sup_{\omega\in\Omega}|f(\omega) - M^T S(\omega)| \le \lambda$, where $j > 0$, $M = [m_1, m_2, \cdots, m_j]^T \in \mathbb{R}^j$ is the weight vector, $S(\omega) = [\varsigma_1(\omega), \cdots, \varsigma_j(\omega)]^T / \sum_{i=1}^{j}\varsigma_i(\omega)$ is the basis function vector, $\varsigma_i(\omega) = \exp[-(\omega-\psi_i)^T(\omega-\psi_i)/\beta_i^2]$ is the Gaussian function, $\psi_i = [\psi_{i,1}, \cdots, \psi_{i,n}]^T$ is the center vector, and $\beta_i$ is the width.
Lemma 3 [13] For the positive constants $p > 0$, $j > 0$, there holds:
$$|D|^p |E|^j \le \frac{p\,\iota(D,E)\,|D|^{p+j}}{p+j} + \frac{j\,\iota(D,E)^{-p/j}\,|E|^{p+j}}{p+j} \tag{2}$$
where $\iota(D,E) > 0$ is a real-valued function.
Lemma 4 [13] For $\Lambda_i \in \mathbb{R}$, $i = 1, \cdots, n$, and $0 < m \le 1$:
$$\left(\sum_{i=1}^{n}|\Lambda_i|\right)^{m} \le \sum_{i=1}^{n}|\Lambda_i|^{m} \le n^{1-m}\left(\sum_{i=1}^{n}|\Lambda_i|\right)^{m} \tag{3}$$
Lemma 5 [12] For $v_i > 0$ and $o \in \mathbb{R}$ satisfying $|o| < v_i$, one gets:
$$\log\frac{v_i^2}{v_i^2 - o^2} < \frac{o^2}{v_i^2 - o^2}.$$
Lemma 6 [11] For $\varsigma > 0$ and $\vartheta \in \mathbb{R}$, the function $\tanh(\cdot)$ satisfies $0 \le |\vartheta| - \vartheta\tanh(\vartheta/\varsigma) \le 0.2785\,\varsigma$.
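To make the fuzzy approximator of Lemma 2 concrete, the following is a minimal Python sketch of the normalized Gaussian basis vector and the resulting fuzzy logic system output. The function names, the centers, the widths, and the weights are hypothetical values chosen only for illustration; they are not taken from the paper.

```python
import math

def fuzzy_basis(omega, centers, widths):
    """Normalized Gaussian basis vector S(omega) in the sense of Lemma 2.

    omega   : input vector (list of floats)
    centers : list of center vectors psi_i (illustrative values)
    widths  : list of widths beta_i (illustrative values)
    """
    raw = []
    for psi, beta in zip(centers, widths):
        sq_dist = sum((w - p) ** 2 for w, p in zip(omega, psi))
        raw.append(math.exp(-sq_dist / beta ** 2))
    total = sum(raw)
    return [r / total for r in raw]

def fuzzy_output(weights, basis):
    """Fuzzy logic system output, i.e. the inner product of weights and basis."""
    return sum(m * s for m, s in zip(weights, basis))

# Three rules on a scalar input; all numbers are assumptions for the demo.
centers = [[-1.0], [0.0], [1.0]]
widths = [1.0, 1.0, 1.0]
S = fuzzy_basis([0.0], centers, widths)
y = fuzzy_output([0.2, 0.5, 0.8], S)
```

Because the basis vector is normalized, its entries sum to one, so the output is a convex combination of the weights.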
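3 Main Results
The command filter is defined as:
$$\dot{\chi}_{i,1} = \ell_i, \quad \ell_i = -\varphi_{i,1}|\chi_{i,1}-\alpha_i|^{1/2}\operatorname{sign}(\chi_{i,1}-\alpha_i) + \chi_{i,2}, \quad \dot{\chi}_{i,2} = -\varphi_{i,2}\operatorname{sign}(\chi_{i,2}-\ell_i) \tag{4}$$
where $\alpha_i$ and $\chi_{i,1}$ are the filter input and output, and $\varphi_{i,1}$ and $\varphi_{i,2}$ are the filter gains. Define the tracking error vectors $z_1 = x_1 - x_d$ and $z_2 = x_2 - \pi_2 - \mu$, where $\pi_2 = \chi_{2,1}$ and $\mu$ is the virtual auxiliary signal, constructed as $\dot{\mu} = -\mu + M^{-1}(q)\tau - \tau$. Further define the compensated tracking error vectors $\beta_1 = z_1 - \zeta_1$ and $\beta_2 = z_2 - \zeta_2$. Define the full-state compact set $\Omega_{\beta_{k,i}} = \{|\beta_{k,i}| \le f_{k,i}\}$; the stability of the system is then proved as follows.
Step 1: Choose the first BLF as
$$V_1 = \sum_{i=1}^{n}\frac{1}{2}\log\frac{f_1^T I_i f_1}{f_1^T I_i f_1 - \beta_1^T I_i \beta_1},$$
where $I_1 = (1, 0, 0, \cdots, 0)$, $\cdots$, $I_n = (0, 0, 0, \cdots, 1)$. Then one gets:
$$\dot{V}_1 = \sum_{i=1}^{n}\frac{\beta_1^T I_i \dot{\beta}_1}{f_1^T I_i f_1 - \beta_1^T I_i \beta_1} = \sum_{i=1}^{n}F_{1,i}\left[z_2 + (\pi_2 - \alpha_1) + \alpha_1 - \dot{x}_d - \dot{\zeta}_1 + \mu\right] \tag{5}$$
where the virtual signal $\alpha_1$ is constructed as
$$\alpha_1 = -c_1 z_1 + \dot{x}_d - d_1\left[F_{1,1,1}^{\frac{\gamma-1}{2}}\beta_{1,1}^{\frac{\gamma+1}{2}}, F_{1,2,2}^{\frac{\gamma-1}{2}}\beta_{1,2}^{\frac{\gamma+1}{2}}, \cdots, F_{1,n,n}^{\frac{\gamma-1}{2}}\beta_{1,n}^{\frac{\gamma+1}{2}}\right]^T - \frac{h_1}{\gamma+1}\left[F_{1,1,1}^{\gamma}, F_{1,2,2}^{\gamma}, \cdots, F_{1,n,n}^{\gamma}\right]^T - \mu \tag{6}$$
and $\dot{\zeta}_1$ is given by
$$\dot{\zeta}_1 = (\pi_2 - \alpha_1) - c_1\zeta_1 + \zeta_2 - h_1\left[\zeta_{1,1}^{\gamma}, \zeta_{1,2}^{\gamma}, \cdots, \zeta_{1,n}^{\gamma}\right]^T \tag{7}$$
Then one gets
$$\dot{V}_1 = \sum_{i=1}^{n}F_{1,i}\left(-c_1\beta_1 + \beta_2 - d_1 F_{1,i}^{\frac{\gamma-1}{2}}\beta_{1,i}^{\frac{\gamma+1}{2}} - \frac{h_1}{\gamma+1}F_{1,i}^{\gamma} + h_1\zeta_1^{\gamma}\right) \tag{8}$$
According to Lemma 3, we have $h_1 F_{1,i,i}\,\zeta_{1,i}^{\gamma} \le \frac{h_1}{\gamma+1}F_{1,i,i}^{\gamma+1} + \frac{h_1\gamma}{\gamma+1}\zeta_{1,i}^{\gamma+1}$, where $F_{1,i,i}$ is the $i$th row element of $F_{1,i}$. Substituting this into (8) yields:
$$\dot{V}_1 \le \sum_{i=1}^{n}F_{1,i}\left(-c_1\beta_1 + \beta_2 - d_1 F_{1,i}^{\frac{\gamma-1}{2}}\beta_{1,i}^{\frac{\gamma+1}{2}}\right) + \frac{h_1\gamma}{\gamma+1}\sum_{i=1}^{n}\zeta_{1,i}^{\gamma+1} \tag{9}$$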
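Step 2: The second BLF $V_2$ is constructed as
$$V_2 = V_1 + \sum_{i=1}^{n}\frac{1}{2}\log\frac{f_2^T I_i f_2}{f_2^T I_i f_2 - \beta_2^T I_i \beta_2}.$$
Then we get:
$$\dot{V}_2 = \dot{V}_1 + \sum_{i=1}^{n}F_{2,i}\left[M^{-1}(-Cx_2 - G)\right] + \sum_{i=1}^{n}F_{2,i}\left(\mu - \dot{\pi}_2 - \dot{\zeta}_2\right) + \sum_{i=1}^{n}F_{2,i}\tau \tag{10}$$
Set $g_2 = M^{-1}(-Cx_2 - G)$. Because $g_2$ contains uncertainty, it can be approximated by a fuzzy logic system, and $g_2 = [g_{2,1}, \cdots, g_{2,n}]^T$ is written as $g_{2,i} = W_{2,i}^T\xi_{2,i} + \eta_{2,i}$, in which $\eta_{2,i}$ is the approximation error satisfying $\|\eta_{2,i}\| \le \varepsilon_2$, where $\varepsilon_2 > 0$ is a constant. Thus, according to Young's inequality, one gets:
$$\sum_{i=1}^{n}F_{2,i,i}\,g_{2,i} \le \sum_{i=1}^{n}\left(\frac{F_{2,i,i}^2\|W_{2,i}\|^2\xi_{2,i}^T\xi_{2,i}}{2a_2^2} + \frac{a_2^2}{2} + \frac{F_{2,i,i}^2}{2} + \frac{\varepsilon_2^2}{2}\right).$$
Design the event-triggered mechanism as:
$$S_i(t) = -(1+\delta_i)\left(\alpha_{2,i}\tanh\frac{K_{2,i}\,\alpha_{2,i}}{\varsigma} + \bar{\nu}_i\tanh\frac{K_{2,i}\,\bar{\nu}_i}{\varsigma}\right), \quad \tau_i(t) = S_i(t_k), \; t \in [t_k, t_{k+1}) \tag{11}$$
where $0 < \delta_i < 1$, $\nu_i > 0$, $\bar{\nu}_i > \nu_i/(1-\delta_i)$, $t_{k+1} = \inf\{t \in \mathbb{R} \mid |s_i(t)| \ge \delta_i|\tau_i(t)| + \nu_i\}$, and $s_i(t) = S_i(t) - \tau_i(t)$. The constants $\kappa_{i,1}$, $\kappa_{i,2}$ satisfy
$$\tau_i = \frac{S_i(t)}{1+\kappa_{i,1}\delta_i} - \frac{\kappa_{i,2}\nu_i}{1+\kappa_{i,1}\delta_i}.$$
According to Lemma 6, we can get:
$$\sum_{i=1}^{n}F_{2,i}\tau_i \le \sum_{i=1}^{n}F_{2,i}\alpha_2 + 0.557n \tag{12}$$
Then construct $\alpha_2$ and $\dot{\zeta}_2$ as:
$$\begin{aligned}\alpha_2 = {} & -c_2 z_2 + \dot{\pi}_2 - \left[\frac{F_{1,1,1}\beta_{2,1}}{F_{2,1,1}}, \cdots, \frac{F_{1,n,n}\beta_{2,n}}{F_{2,n,n}}\right]^T - \frac{1}{2}\left[F_{2,1,1}, \cdots, F_{2,n,n}\right]^T \\ & - \frac{1}{2a_2^2}\left[F_{2,1,1}\hat{\theta}\,\xi_{2,1}^T\xi_{2,1}, \cdots, F_{2,n,n}\hat{\theta}\,\xi_{2,n}^T\xi_{2,n}\right]^T - \frac{h_2}{\gamma+1}\left[F_{2,1,1}^{\gamma}, \cdots, F_{2,n,n}^{\gamma}\right]^T \\ & - d_2\left[F_{2,1,1}^{\frac{\gamma-1}{2}}\beta_{2,1}^{\frac{\gamma+1}{2}}, \cdots, F_{2,n,n}^{\frac{\gamma-1}{2}}\beta_{2,n}^{\frac{\gamma+1}{2}}\right]^T - \mu\end{aligned} \tag{13}$$
$$\dot{\zeta}_2 = -c_2\zeta_2 - h_2\left[\zeta_{2,1}^{\gamma}, \zeta_{2,2}^{\gamma}, \cdots, \zeta_{2,n}^{\gamma}\right]^T \tag{14}$$
Substituting all of these into (10), one gets:
$$\begin{aligned}\dot{V}_2 \le {} & \sum_{i=1}^{n}\left(-F_{1,i}c_1\beta_1 - F_{2,i}c_2\beta_2\right) + \left(\frac{h_1\gamma}{\gamma+1}\sum_{i=1}^{n}\zeta_{1,i}^{\gamma+1} + \frac{h_2\gamma}{\gamma+1}\sum_{i=1}^{n}\zeta_{2,i}^{\gamma+1}\right) \\ & - \left[d_1\sum_{i=1}^{n}(F_{1,i,i}\beta_{1,i})^{\frac{\gamma+1}{2}} + d_2\sum_{i=1}^{n}(F_{2,i,i}\beta_{2,i})^{\frac{\gamma+1}{2}}\right] + 0.557n \\ & + \sum_{i=1}^{n}\left[\frac{F_{2,i,i}^2\left(\|W_{2,i}\|^2 - \hat{\theta}\right)\xi_{2,i}^T\xi_{2,i}}{2a_2^2} + \frac{a_2^2}{2} + \frac{\varepsilon_2^2}{2}\right]\end{aligned} \tag{15}$$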
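Step 3: The Lyapunov function $V_3$ is chosen as $V_3 = V_2 + \frac{1}{2}\sum_{k=1}^{2}\zeta_k^T\zeta_k$. Then, one can get:
$$\dot{V}_3 = \dot{V}_2 + \zeta_1^T(\pi_2 - \alpha_1) + \zeta_1^T\zeta_2 - c_1\zeta_1^T\zeta_1 - c_2\zeta_2^T\zeta_2 - h_1\sum_{i=1}^{n}\zeta_{1,i}^{\gamma+1} - h_2\sum_{i=1}^{n}\zeta_{2,i}^{\gamma+1} \tag{16}$$
Considering the command filter, if $\varphi_{i,1}$ and $\varphi_{i,2}$ are selected properly, then one gets $\chi_{i,1} = \alpha_{i-1}$ and $\ell_i = \dot{\alpha}_{i-1}$ [1]. Thus, $(\pi_2 - \alpha_1) = 0$. Further, by Young's inequality, we obtain $\zeta_k^T\zeta_{k+1} \le \frac{1}{2}\zeta_k^T\zeta_k + \frac{1}{2}\zeta_{k+1}^T\zeta_{k+1}$. So, (16) becomes:
$$\dot{V}_3 \le \dot{V}_2 - \sum_{k=1}^{2}c_0\,\zeta_k^T\zeta_k - \sum_{k=1}^{2}\sum_{i=1}^{n}h_k\,\zeta_{k,i}^{\gamma+1} \tag{17}$$
where $c_0 = \min\{c_1 - \frac{1}{2}, c_2 - \frac{1}{2}\}$.
Step 4: Construct the Lyapunov function as $V_4 = V_3 + \frac{\tilde{\theta}^2}{2\rho}$. Further, set $\dot{\hat{\theta}}$ as:
$$\dot{\hat{\theta}} = -2\rho\psi\hat{\theta} + \sum_{i=1}^{n}\frac{1}{2a_2^2}F_{2,i,i}^2\,\rho\,\xi_{2,i}^T\xi_{2,i} \tag{18}$$
Then, substituting $\dot{\hat{\theta}}$ into $\dot{V}_4$, one has:
$$\dot{V}_4 = \dot{V}_3 + 2\psi\tilde{\theta}\hat{\theta} - \frac{\tilde{\theta}}{2a_2^2}\sum_{i=1}^{n}F_{2,i,i}^2\,\xi_{2,i}^T\xi_{2,i} \tag{19}$$
According to Lemma 3,
$$2\psi\tilde{\theta}\hat{\theta} = 2\psi\tilde{\theta}(\theta - \tilde{\theta}) \le -\frac{\psi(2\omega-1)}{\omega}\tilde{\theta}^2 + \omega\psi\theta^2 \le -\left(\frac{\Upsilon}{\rho}\tilde{\theta}^2\right)^{\frac{\gamma+1}{2}} + \left(\frac{\Upsilon}{\rho}\tilde{\theta}^2\right)^{\frac{\gamma+1}{2}} - \frac{2\Upsilon}{\rho}\tilde{\theta}^2 + \omega\psi\theta^2,$$
where $\Upsilon = \rho[\psi(2\omega-1)/2\omega]$. When $\frac{\Upsilon}{\rho}\tilde{\theta}^2 \ge 1$,
$$\left(\frac{\Upsilon}{\rho}\tilde{\theta}^2\right)^{\frac{\gamma+1}{2}} - \frac{\Upsilon}{\rho}\tilde{\theta}^2 + \omega\psi\theta^2 \le \frac{\Upsilon}{\rho}\tilde{\theta}^2 - \frac{\Upsilon}{\rho}\tilde{\theta}^2 + \omega\psi\theta^2 = \omega\psi\theta^2;$$
when $\frac{\Upsilon}{\rho}\tilde{\theta}^2 < 1$,
$$\left(\frac{\Upsilon}{\rho}\tilde{\theta}^2\right)^{\frac{\gamma+1}{2}} - \frac{\Upsilon}{\rho}\tilde{\theta}^2 + \omega\psi\theta^2 < 1 - \frac{\Upsilon}{\rho}\tilde{\theta}^2 + \omega\psi\theta^2 < 1 + \omega\psi\theta^2.$$
Thus, (19) can be written as:
$$\dot{V}_4 \le -\sigma_1 V_4 - \sigma_2 V_4^{\frac{\gamma+1}{2}} + \sigma_3 \tag{20}$$
in which $\sigma_1 = \min\{2c_k, 2c_0, 2\Upsilon\}$, $\sigma_2 = \min\{2^{\frac{\gamma+1}{2}}d_s, 2^{\frac{\gamma+1}{2}}\frac{h_s}{\gamma+1}, (2\Upsilon)^{\frac{\gamma+1}{2}}\}$, and $\sigma_3 = \frac{n}{2}a_2^2 + \frac{n}{2}\varepsilon_2^2 + 0.557n + 1 + \omega\psi\theta^2$. From the above proof, it can be verified, for a constant $\varrho \in (0,1)$, that in finite time
$$T_1 = T_0 + \frac{1}{\varrho\sigma_1\left(1 - \frac{1+\gamma}{2}\right)}\log\frac{\varrho\sigma_1 V^{1-\frac{\gamma+1}{2}}(T_0) + \sigma_2}{\sigma_2},$$
the signals converge to the region $(\beta_k, z_k, \zeta_k, \tilde{\theta}) \in \{V \le \frac{\sigma_3}{\sigma_1(1-\varrho)} = A\}$; and in finite time
$$T_2 = T_0 + \frac{1}{\sigma_1\left(1 - \frac{1+\gamma}{2}\right)}\log\frac{\sigma_1 V^{1-\frac{\gamma+1}{2}}(T_0) + \varrho\sigma_2}{\varrho\sigma_2},$$
the signals converge to the region $(\beta_k, z_k, \zeta_k, \tilde{\theta}) \in \{V \le \left(\frac{\sigma_3}{\sigma_2(1-\varrho)}\right)^{\frac{2}{\gamma+1}} = B\}$. Hence the full-state constraints are guaranteed.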
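The behaviour of the command filter (4), whose output tracks its input while its internal signal approximates the input's derivative, can be illustrated numerically. The following Python sketch uses a forward-Euler discretization with a hypothetical input $\sin(t)$; the gains 1.5 and 1.1, the step size, and all function names are assumptions chosen for this demonstration and are not values reported in the paper.

```python
import math

def sign(v):
    """Signum function returning -1, 0, or 1."""
    return (v > 0) - (v < 0)

def run_command_filter(alpha, phi1, phi2, dt, steps):
    """Forward-Euler simulation of the two-state command filter.

    alpha : callable giving the filter input at time t
    Returns the filter output chi1 at the final time and the last
    value of the internal derivative estimate ell.
    """
    chi1, chi2, ell = 0.0, 0.0, 0.0
    for k in range(steps):
        e = chi1 - alpha(k * dt)
        ell = -phi1 * math.sqrt(abs(e)) * sign(e) + chi2
        chi2_dot = -phi2 * sign(chi2 - ell)
        chi1 += dt * ell
        chi2 += dt * chi2_dot
    return chi1, ell

# Hypothetical setup: input sin(t), 10 s horizon, 1 ms step.
t_end, dt = 10.0, 1e-3
chi1_end, ell_end = run_command_filter(math.sin, 1.5, 1.1, dt, int(t_end / dt))
```

Under these assumed gains, the output should stay close to $\sin(t)$ and the internal signal close to $\cos(t)$ after a short transient, up to the chattering introduced by the discretized sign terms.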
4 Simulation Results
An example of a single-link robot manipulator model is given as follows:
$$H\ddot{x} + N\dot{x} + MgL\sin(x) = U \tag{21}$$
in which $x$ and $\dot{x}$ are the angle and the angular velocity, and $U$ is the control input. The parameters are set as follows: moment of inertia $H = 1\,\mathrm{kg\cdot m^2}$, coefficient of viscous friction $N = 1\,\mathrm{N\,m\cdot s/rad}$, mass of the load $M = 0.1\,\mathrm{kg}$, gravitational acceleration $g = 10\,\mathrm{N/kg}$, and arm length $L = 1\,\mathrm{m}$. The initial state is set as $x_0 = [0.1\,\mathrm{rad}, -2\,\mathrm{rad/s}]$, and the desired output is $x_d = 0.2\sin(t)$. The other control parameters are chosen as: $\gamma = 3/5$, $f_1 = 0.5$, $f_2 = -2$, $\bar{f}_1 = 0.75$, $\bar{f}_2 = 3.25$, $d_1 = c_1 = h_1 = 2$, $d_2 = 12$, $c_2 = 22$, $h_2 = 5$, $\delta = 0.05$, $\varsigma = 0.5$, $\nu = 0.1$, $\bar{\nu} = 1$. Fig. 1 shows the simulation results of $x$, $x_d$ and $\dot{x}$. It can be seen that $x$ tracks the desired signal $x_d$ precisely under the given algorithm.
Fig. 1 The time-varying trajectories of $x$, $x_d$ and $\dot{x}$ (two panels, time axis from 0 to 30)
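For readers who wish to experiment with the plant (21), the following Python sketch integrates the single-link model with a fourth-order Runge-Kutta step, using the physical parameters and initial state listed above. It is an open-loop illustration only: the `control` callback, the zero input, and the step size are assumptions, and the proposed event-triggered controller is not implemented here.

```python
import math

# Physical parameters from the simulation section.
H, N, M, g, L = 1.0, 1.0, 0.1, 10.0, 1.0

def plant_rhs(x, v, u):
    """State derivatives of H*x'' + N*x' + M*g*L*sin(x) = u."""
    return v, (u - N * v - M * g * L * math.sin(x)) / H

def rk4_step(x, v, u, dt):
    """One classical Runge-Kutta step with the input u held constant."""
    k1x, k1v = plant_rhs(x, v, u)
    k2x, k2v = plant_rhs(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, u)
    k3x, k3v = plant_rhs(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, u)
    k4x, k4v = plant_rhs(x + dt * k3x, v + dt * k3v, u)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x, v

def simulate(x, v, control, dt, t_end):
    """Integrate the plant; control(t, x, v) is a user-supplied input law."""
    for k in range(int(round(t_end / dt))):
        x, v = rk4_step(x, v, control(k * dt, x, v), dt)
    return x, v

# Open-loop run (U = 0) from the initial state x0 = [0.1 rad, -2 rad/s].
x_T, v_T = simulate(0.1, -2.0, lambda t, x, v: 0.0, 0.01, 30.0)
```

With zero input the viscous friction dissipates the initial energy and the link settles at the hanging equilibrium; a tracking controller would replace the zero-input lambda.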
5 Conclusion A novel event-triggered adaptive finite-time control strategy based on BLFs is proposed to tackle the tracking control problem of robot manipulators. The EOC problem is settled by constructing a command filter, and the filtering error is eliminated by the compensation mechanism. Moreover, the unknown nonlinear dynamics are estimated by adaptive fuzzy control, and the communication burden is efficiently reduced by the event-triggered strategy. In addition, with the help of BLFs, none of the states violates its constraints. Acknowledgements This work was supported by the Science and Technology Support Plan for Youth Innovation of Universities in Shandong Province (2019KJN033), and the Natural Science Foundation of Shandong Province (ZR2021MF046).
References
1. Levant, A.: Higher-order sliding modes, differentiation and output-feedback control. Int. J. Control. 76, 924–941 (2003)
2. Yang, Z.-J., Fukushima, Y., Qin, P.: Decentralized adaptive robust control of robot manipulators using disturbance observers. IEEE Trans. Control. Syst. Technol. 20(5), 1357–1365 (2012)
3. Teng, X.J., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano. 10(1), 170–180 (2016)
4. Teng, X.J., Deng, X.Y.: Optimization of a helical flow inducer of endovascular stent based on the principle of swirling flow in arterial system. J. Biomed. Eng. 27(2), 429–434 (2010)
5. Bechlioulis, C.P., Rovithakis, G.A.: Adaptive control with guaranteed transient and steady state tracking error bounds for strict feedback systems. Automatica. 45(2), 532–538 (2009)
6. Sun, Y., Wang, W., Ma, G., Li, Z., Li, C.: Backstepping-based distributed coordinated tracking for multiple uncertain Euler-Lagrange systems. J. Syst. Eng. Electron. 27(5), 1083–1095 (2016)
7. Van, M., Mavrovouniotis, M., Ge, S.S.: An adaptive backstepping nonsingular fast terminal sliding mode control for robust fault tolerant control of robot manipulators. IEEE Trans. Syst. Man Cybern. Syst. 49(7), 1448–1458 (2019)
8. Wang, F., Chen, B., Liu, X.P., Lin, C.: Finite-time adaptive fuzzy tracking control design for nonlinear systems. IEEE Trans. Fuzzy. Syst. 26(3), 1207–1216 (2018)
9. Farrell, J.A., Polycarpou, M., Sharma, M., Dong, W.: Command filtered backstepping. IEEE Trans. Autom. Control. 54(6), 1391–1395 (2009)
10. Yu, J.P., Shi, P., Zhao, L.: Finite-time command filtered backstepping control for a class of nonlinear systems. Automatica. 92, 173–180 (2018)
11. Xing, L., Wen, C., Liu, Z., Su, H., Cai, J.: Event-triggered adaptive control for a class of uncertain nonlinear systems. IEEE Trans. Autom. Control. 62(4), 2071–2076 (2017)
12. Tee, K.P., Ge, S.S.: Control of nonlinear systems with partial state constraints using a barrier Lyapunov function. Int. J. Control. 84(12), 2008–2023 (2011)
13. Qian, C., Lin, W.: A continuous feedback approach to global strong stabilization of nonlinear systems. IEEE Trans. Autom. Control. 46(7), 1061–1079 (2001)
Research on Satellite Routing Method Based on Q-Learning in Failure Scenarios Zhenrui Chen, Guangrong Lin, Jiaen Zhou, and Yafei Zhao
Abstract With the continuous advancement of communication technology, satellite networks have gained increasing significance in the field of communication. They offer seamless global coverage and enable communication services in remote areas. Low-orbit satellites have the advantages of small propagation delay and global network coverage, and are gradually becoming the main components of satellite networks. However, the harsh environment may sometimes lead to the failure of links or nodes within the satellite network, emphasizing the need for robust algorithms to establish new routes in such failure scenarios. In this paper, the Q-learning algorithm is used to select the optimal path from the source satellite to the other satellites in the network in the failure scenario. The simulation results show that, compared with the shortest path first (SPF) algorithm, this algorithm reduces the packet loss rate by 100% and the average end-to-end delay during transmission by 1%. Keywords Satellite routing · Q-learning · SPF · Average end-to-end delay · Packet loss rate
Z. Chen (B) · J. Zhou · Y. Zhao Beijing University of Posts and Telecommunications, Beijing 100876, China e-mail: [email protected]
G. Lin Yinhe Hangtian (Beijing) Internet Technology Co., Ltd, Beijing 100192, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_37
1 Introduction The demand for information globalization has driven the rapid advancement of communication technology, and the utilization of satellite communication systems with extensive coverage has emerged as a prominent field of research worldwide. The coverage of terrestrial communication networks primarily relies on infrastructure development, resulting in high communication costs in sparsely populated regions or harsh environments. In comparison, satellite networks offer notable advantages such as wide coverage, immunity to geographical constraints and the ability to achieve
seamless global connectivity. Furthermore, satellite systems have been contributing to deliver telecommunication services in a wide range of sectors such as aeronautical, maritime, military, rescue and disaster relief [1]. Low-orbit satellites are organized into interconnected constellations, forming a system where the routing algorithm between networks plays a crucial role. Unlike terrestrial networks, satellite nodes are constantly in high-speed motion, resulting in frequent changes in network topology [2]. Additionally, satellite networks have limited data processing and storage capabilities. Consequently, traditional routing algorithms designed for terrestrial networks cannot be directly applied to satellite networks. Routing algorithms for satellite networks must adapt to the rapid and dynamic changes of the topology and traffic, while also being resilient to network failures to ensure the robustness of the algorithm. Early satellite routing algorithms can be broadly categorized into three types: discretization-based algorithms which operate on virtual topologies, dynamic routing algorithms, and area division-based routing algorithms. Liu et al. [3] proposed a method that converts the dynamic topology into a series of static topologies, enabling the calculation of broadcast routing tables and feasible path sets, thereby achieving dynamic routing. Zha et al. [4] made improvements to the traditional GPSR algorithm and applied it to satellite routing. They extended the range of next hop selection for satellite nodes when employing the greedy strategy and projected satellites to sub-satellite points using the peripheral strategy to obtain a two-dimensional plane result, meeting the conditions required by the right-hand rule. However, none of these algorithms consider the possibility of failures in satellite networks. To address this, real-time detection of faulty nodes in the satellite networks becomes necessary. Chen et al. 
[5] introduced a position prediction algorithm to calculate the position of each satellite throughout a given cycle and generate a network topology. They then incorporated the dynamic detection concept from OSPF into a static routing framework to facilitate the discovery of faulty nodes and flooding. However, they employed the SPF algorithm when looking for a new optimal path in the failure scenario, without considering the specific load conditions of each satellite. With the advent of AI algorithms, numerous researchers have begun employing them in the design of satellite routing algorithms. Zhou et al. [6] introduced a real-time routing method for satellite networks based on CNN to minimize service data flow delay. Wang et al. [7] proposed a satellite routing algorithm utilizing the Q-learning algorithm, which incorporated limitations on maximum hop counts and improved the greedy strategy. However, they did not take satellite load and failure nodes into account. Based on the Q-learning algorithm, Zhang et al. [8] combined the concept of Time Aggregation Graph (TAG) and also considered the congestion level of satellite nodes. This approach enabled the prediction of on-off changes in links within a specific time period. To enhance algorithm optimization capabilities, Ding et al. [9] introduced the concept of Q-learning to the ant colony routing algorithm for micro-nano satellite networks. Yin et al. [10] proposed a speed-up Q-learning-based routing algorithm and presented a split-based speed-up convergence strategy to prevent routing loops or the ping-pong effect. Gong et al. introduced routing algorithms based on improved Q-learning and improved Double Q-learning in [11]
and [12], respectively. In these two approaches, each satellite node maintains two Q-value tables. One table selects the forwarding node, while the other evaluates the forwarding value and determines the next hop node for the data packet based on the mixed Q-value. Additionally, congestion levels of the satellite network are taken into account. Zuo et al. [13] employed the DQN algorithm to intelligently select paths based on spatial positions, mutual distances, and available bandwidth information of neighboring nodes. In [14], Wang et al. utilized the DDQN algorithm and introduced a link state awareness strategy. This strategy employed the congestion level of links as a reference for state prediction, expanding the next hop selection range of satellite nodes to two hops and mitigating potential loop scenarios. However, we found that the aforementioned algorithm only considers the optimal path between a single source satellite node and a single target satellite node, without specifying an optimal path selection scheme between the source satellite and all other satellites within the constellation. When faulty nodes are present in a satellite network, it is crucial to update the optimal path from each satellite to other satellites in the network promptly. This ensures that packet loss is avoided when transmitting data to the faulty nodes based on non-real-time optimal paths. In this paper, the LEO48 low-orbit satellite constellation is chosen as the modeling object. The position prediction algorithm from [5] is referenced and further improved to calculate the accurate position information of all satellites in the constellation, enabling the construction of a static topology. Subsequently, a scenario with constellation topology and faulty nodes is modeled completed. Based on this scenario, a satellite routing method utilizing Q-learning is proposed to determine the optimal path from the source satellite to the remaining satellites within the satellite network. 
When designing the reward function, the algorithm takes into account the link length, the minimum number of hops to the target satellite, and the load of each satellite. This allows the algorithm to construct a Q-value table in which the source satellite is set as the initial destination. By iterating over the Q-value table, the optimal path from any satellite node to the source satellite can be determined. By reversing this path, the optimal path from the source satellite to each of the other satellites in the network is obtained. The performance analysis demonstrates that our Q-learning-based satellite routing algorithm reduces the packet loss rate by 100% and the average end-to-end delay by 1%. The paper is organized as follows: Sect. 2 introduces the preliminaries. Sect. 3 defines the system model, and a satellite routing algorithm based on Q-learning is proposed in Sect. 4. The experimental results are discussed in Sect. 5, and Sect. 6 draws the conclusions.
2 Preliminaries Reinforcement learning is a subfield of machine learning that enables goal achievement through continuous iterative optimization, even in the absence of prior knowledge about the environment. It involves a process of trial and error, evaluation, and improvement. In reinforcement learning, the process is often modeled using a Markov Decision Process framework: MDP $= \langle S, A, P, R \rangle$. Here, $S$ represents the state set, $A$ represents the action set, $P$ represents the probability distribution of choosing different actions in each state, and $R$ represents the reward. The entity in reinforcement learning is typically referred to as an "agent." Throughout the learning process, the agent observes the environment, selects the next action, receives a corresponding reward, and transitions to the next state. Each action can influence future states, and the objective of the agent is to maximize its cumulative rewards through a sequence of actions. The process of the agent exploring the environment is depicted in Fig. 1.
Fig. 1 The process of the agent exploring the environment
The traditional algorithms of reinforcement learning, such as Q-learning and SARSA, are widely recognized. When combined with deep learning techniques, these algorithms are referred to as deep reinforcement learning algorithms, which include DQN, DDPG, and others. Some of these algorithms are improvements on the Q-learning algorithm. In this study, a LEO constellation consisting of 48 satellites is modeled. This constellation exhibits a small number of satellites, a simple topology, and a relatively fixed number of communication nodes for each satellite. Moreover, in the presence of faulty nodes, utilizing the original optimal path can result in significant data packet loss. Therefore, it becomes crucial to promptly obtain a new optimal path in such situations. After comprehensive consideration, the Q-learning algorithm is deemed suitable for the LEO satellite constellation due to its limited number of satellite nodes. Additionally, the Q-learning algorithm exhibits lower computational complexity compared to deep reinforcement learning algorithms. Therefore, in this study, the Q-learning algorithm is selected to find the optimal path in failure scenarios.
The Q-value table is the core of the Q-learning algorithm, represented as $Q(s, a)$, which denotes the Q-value obtained when the agent selects action $a$ in state $s$. The Q-value can be iteratively updated using an update formula. In the Q-learning algorithm, the Bellman equation associated with the Q-value update is as follows:
$$Q(s, a) = Q(s, a) + \alpha\left[r + \gamma\max_{a'}Q(s', a') - Q(s, a)\right] \tag{1}$$
where $a$ represents the action chosen by the agent, $s$ and $s'$ represent the current state and the next state, respectively, and $r$ is the reward received. $\alpha$ represents the learning rate, which determines the speed of the learning process. $\gamma$ represents the discount factor, indicating the degree of attention paid to future benefits.
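The update rule (1) can be written as a short function. The following minimal Python sketch stores the Q-table as a nested dictionary keyed by state and action; the function name, the table layout, and the three-node example are hypothetical and serve only to illustrate the arithmetic of one update.

```python
def q_update(Q, s, a, r, s_next, alpha, gamma):
    """Apply one Q-learning update to Q[s][a] and return the new value.

    Q is a dict of dicts: Q[state][action] -> value. If the next state
    has no available actions, its best future value is taken as 0.
    """
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]

# Hypothetical three-node example: from node "A" the only action is to
# forward to "B"; "B" already has learned values for its two actions.
Q = {"A": {"B": 0.0}, "B": {"A": 1.0, "C": 2.0}, "C": {}}
new_value = q_update(Q, "A", "B", r=1.0, s_next="B", alpha=0.5, gamma=0.9)
```

Here the target is $1.0 + 0.9 \times 2.0 = 2.8$, and with a learning rate of 0.5 the entry moves halfway from 0 to that target.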
3 System Model The system model consists of four parts: the satellite constellation model, the propagation model, the transmission model and the queuing model.
3.1 Satellite Constellation Model Low Earth Orbit (LEO) satellite constellations can generally be categorized into polar-orbit and inclined-orbit constellations. Polar-orbit constellations are commonly represented as “M*N /N /F:h:i”, where M and N represent the number of satellites on each orbit and the total number of orbits, respectively. F denotes the phase factor, indicating the phase difference between neighboring satellites on adjacent orbits. The variables h and i represent the orbital height and inclination, respectively. In the case of the LEO48 low-orbit satellite constellation considered in this study, it can be represented as 48/6/3:1450:86. Accordingly, the constellation consists of a total of 6 orbits, with 8 satellites evenly distributed on each orbit. The phase factor of the constellation is 3, the orbit altitude is 1450 km, and the orbital inclination is 86◦ . Based on the orbit number and the position sequence on the orbit, the satellites within the constellation are assigned unique numbers and defined as Si, j , where i represents the orbit number and j represents the satellite number on the orbit. Each satellite has up to 4 communication nodes, which include neighboring nodes on the same orbit and adjacent orbits. However, there are two exceptional cases: the polar region and the reverse slot. Once a satellite enters the polar region, its communication capabilities are restricted due to antenna limitations, preventing communication with satellite nodes in adjacent orbits. Consequently, it can only establish communication links with neighboring nodes on the same orbit. Furthermore, the satellites on orbit 1 and orbit 6 move in opposite directions, resulting in a high relative speed between them. As a consequence, the satellites on either side of the reverse slot are unable to establish communication links. Therefore, the satellites on these two orbits have a maximum of 3 communication nodes. 
In the communication process, the end-to-end delay of data is directly influenced by the links it traverses. Therefore, the following formulas are provided to calculate the link lengths for the same orbit and adjacent orbits in polar-orbit constellations. The length of the same-orbit inter-satellite link is:
$$d_1 = \sqrt{2}\,R\,\sqrt{1 - \cos\left(\frac{360^\circ}{M}\right)} \tag{2}$$
The length of the inter-satellite link between neighbors on adjacent orbits is:
$$d_2 = \sqrt{2}\,R\cos\theta\,\sqrt{1 - \cos\left(\frac{360^\circ}{2N}\right)} \tag{3}$$
After considering the phase factor, the length of the adjacent-orbit inter-satellite link is:
$$d_3 = \sqrt{(d_2)^2 + \left(\sqrt{2}\,R\,\sqrt{1 - \cos\left(\frac{2\pi F}{MN}\right)}\right)^2} \tag{4}$$
In the above three formulas, the definitions of $M$, $N$, and $F$ are consistent with the previous descriptions in this section, $R$ represents the orbital radius (the Earth's radius plus the orbital altitude), and $\theta$ denotes the latitude of the satellite. Given that the five parameters of the satellite model are constant, the length of the link between satellites within the same orbit (IOL) remains constant, whereas the length of the link between satellites on adjacent orbits (ISL) decreases as the latitude of the satellite increases. Importantly, these formulas still hold even if the satellite orbit has a certain inclination. Link length is a crucial parameter in both the propagation model and the transmission model.
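As a worked illustration of formulas (2) to (4), the following Python sketch evaluates the three link lengths for the 48/6/3 constellation, with 8 satellites per orbit and 6 orbits. It assumes that $R$ is the orbital radius, taken here as a mean Earth radius of 6371 km plus the 1450 km altitude; the Earth radius value and all function names are assumptions introduced for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0   # assumed mean Earth radius
ALTITUDE_KM = 1450.0       # orbital altitude of the LEO48 constellation

def intra_orbit_link(R, M):
    """Formula (2): link length between neighbors on the same orbit."""
    return math.sqrt(2) * R * math.sqrt(1 - math.cos(2 * math.pi / M))

def inter_orbit_link(R, N, lat_deg):
    """Formula (3): link length between adjacent orbits at a given latitude."""
    return (math.sqrt(2) * R * math.cos(math.radians(lat_deg))
            * math.sqrt(1 - math.cos(math.pi / N)))

def inter_orbit_link_with_phase(R, M, N, F, lat_deg):
    """Formula (4): adjacent-orbit link length including the phase factor."""
    d2 = inter_orbit_link(R, N, lat_deg)
    phase_term = math.sqrt(2) * R * math.sqrt(1 - math.cos(2 * math.pi * F / (M * N)))
    return math.hypot(d2, phase_term)

R = EARTH_RADIUS_KM + ALTITUDE_KM
M, N, F = 8, 6, 3          # satellites per orbit, number of orbits, phase factor
d1 = intra_orbit_link(R, M)
d2_equator = inter_orbit_link(R, N, 0.0)
d2_lat60 = inter_orbit_link(R, N, 60.0)
d3_equator = inter_orbit_link_with_phase(R, M, N, F, 0.0)
```

Under these assumptions the same-orbit link is roughly 5986 km, and the adjacent-orbit link at 60° latitude is half of its equatorial value, which reflects the $\cos\theta$ factor in (3).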
3.2 Propagation Model In low-orbit satellite networks, the distance between satellites plays a significant role in the end-to-end delay of data packets during the communication process. The propagation delay of the inter-satellite link is primarily determined by the length of the link, which represents the distance between satellites. The propagation delay from satellite $u$ to satellite $v$ can be calculated using the following formula:
$$t\_prop_{u,v} = \frac{d_{u,v}}{c} \tag{5}$$
where $d_{u,v}$ represents the link length between satellites $u$ and $v$ (satellite $u$ and satellite $v$ must be communicable), and $c$ represents the speed of light.
3.3 Transmission Model The transmission delay is another component of the end-to-end delay of the data packet. When a data packet arrives at a relay satellite, it follows the First-In-First-Out (FIFO) principle, meaning that it waits until all preceding data packets have been transmitted before it can be sent. Therefore, the transmission delay can be considered as a waiting delay. The transmission delay is influenced by the transmission rate and the size of the data packet, and can be expressed by the following relationship:
$$t\_trans_{u,v} = \frac{f}{R_{u,v}} \tag{6}$$
where $f$ represents the size of the data packet and $R_{u,v}$ represents the transmission rate between satellite $u$ and satellite $v$. According to Shannon's theorem and the free space loss model, the transmission rate $R_{u,v}$ can be calculated using the following formula:
$$R_{u,v} = B_{u,v}\log_2\left(1 + \frac{P_{tx}\,G_{tx}\,G_{rx}\,(\lambda/4\pi d_{u,v})^2}{P_{N_0}}\right) \tag{7}$$
where $B_{u,v}$ represents the bandwidth between satellite $u$ and satellite $v$, $P_{tx}$ represents the transmit power, and $\lambda$ represents the carrier wavelength. $P_{N_0}$ represents the noise power, and can be calculated by the formula $P_{N_0} = R_{u,v} \cdot k_c \cdot T_n$, where $k_c$ and $T_n$ represent the Boltzmann constant and the noise temperature, respectively. $G_{tx}$ represents the transmit antenna gain and $G_{rx}$ represents the receive antenna gain, which can be calculated as follows:
$$G_{tx} = \frac{4\pi D^2}{\lambda^2} \tag{8}$$
$$G_{rx} = \left(\frac{\pi D}{\lambda}\right)^2\eta \tag{9}$$
where $D$ indicates the antenna diameter, and $\eta$ indicates the antenna efficiency of the receiver.
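The chain from link budget to transmission delay in (6) and (7) can be sketched as follows. This minimal Python example treats the noise power as a given input rather than deriving it, and every numeric value in it is a placeholder chosen for illustration, not a parameter from the paper; the function names are likewise hypothetical.

```python
import math

def received_snr(p_tx, g_tx, g_rx, wavelength, distance, noise_power):
    """Signal-to-noise ratio under the free-space loss model used in (7)."""
    path_gain = (wavelength / (4 * math.pi * distance)) ** 2
    return p_tx * g_tx * g_rx * path_gain / noise_power

def transmission_rate(bandwidth, snr):
    """Shannon rate of formula (7), in bits per second for bandwidth in Hz."""
    return bandwidth * math.log2(1 + snr)

def transmission_delay(packet_bits, rate):
    """Formula (6): time needed to push a packet onto the link."""
    return packet_bits / rate

# Sanity case: parameters chosen so that the SNR equals 1, giving rate == bandwidth.
snr_unit = received_snr(1.0, 1.0, 1.0, 4 * math.pi, 1.0, 1.0)
rate_unit = transmission_rate(20e6, snr_unit)

# Placeholder link: 10 W, antenna gains of 1e4, 1.5 cm wavelength, 4000 km, 1e-13 W noise.
snr_link = received_snr(10.0, 1e4, 1e4, 0.015, 4.0e6, 1e-13)
delay_link = transmission_delay(8 * 1024, transmission_rate(20e6, snr_link))
```

Because the path gain falls with the square of the distance, longer links yield a lower rate and therefore a longer transmission delay for the same packet size.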
3.4 Queuing Model When the network load is high, the congestion level of satellite nodes increases, resulting in significant queuing delay. This paper adopts a First-In-First-Out (FIFO) queuing model, where the queuing delay of a data packet is taken as the cumulative transmission delay of the packets ahead of it in the queue. However, each satellite has a limited cache capacity. When the load on a satellite reaches its limit, the cache queue can no longer accept new packets, and subsequent packets transmitted to that satellite are discarded.
4 Routing Algorithm Based on Reinforcement Learning Firstly, we model the satellite constellation as a Markov Decision Process (MDP), denoted as $\langle S, A, P, R \rangle$, as mentioned in the previous section, to facilitate the design of the Q-learning-based routing algorithm. Each satellite node in the constellation represents a specific state, and all satellite nodes collectively form the state set $S$. Each satellite has communicable nodes (neighbor nodes), which are the potential next-hop options for the current satellite. The probability of selecting a neighbor node varies, corresponding to $p(s, s')$ in the probability set $P$. Different selections correspond to different actions for each state, and all these actions constitute the action set $A$. When a satellite transmits data to the selected next-hop node according to the chosen action, a reward $R$ is obtained. For the construction of the initial static scene, we refer to the position prediction algorithm described in [5], and further details are not provided in this paper. In the static satellite topology, failure nodes are intentionally set, and the congestion degree (load situation) is specified for each satellite. In this scenario, the Q-learning-based routing algorithm proposed in this paper consists of two main parts: the training part and the optimal path finding part. It is important to note that the default source satellite used is $S_{1,1}$, and this choice remains unchanged when selecting other satellites.
4.1 Q-Learning Algorithm Training in the Satellite Network. In the Q-value update formula, the reward value $R$ represents the feedback on the performance of data packet forwarding, reflecting how well the forwarding is executed. In a LEO satellite network, the link length plays a significant role in determining the end-to-end delay of data transmission. Additionally, it is crucial to consider the minimum number of hops between the relay satellite and the destination satellite to prevent routing loops. This paper also takes into account the impact of satellite load on the data transmission path. If congested nodes are present along the path, the transmission delay will inevitably increase, and packets may even be lost. Based on this analysis, these three factors are taken into account when designing the reward function. Since the factors have different magnitudes, it is necessary to normalize them before assigning weights. The link lengths have already been calculated during the construction of the satellite network topology, and the normalization formula for the link length is as follows:
$$d_{norm} = \frac{d - d_{min}}{d_{max} - d_{min}} \tag{10}$$
where $d_{max}$ and $d_{min}$ are the maximum and minimum values of all link lengths in the satellite network, $d$ is the original link length, and $d_{norm}$ is the normalized result of $d$. When data is transmitted from the current satellite to the next-hop satellite, the minimum hop count to the target satellite can change by at most 1; it can increase, decrease, or remain unchanged, which corresponds to a negative or positive reward. The congestion degree set for each satellite lies in the range (0, 1), so no normalization is required.
The reward function that combines the three factors is set as follows:
$$R = -\omega_1 \cdot d_{norm} + \omega_2 \cdot hop + \omega_3 \cdot blocking \tag{11}$$
where $d_{norm}$, $hop$ and $blocking$ respectively represent the normalized link length, the change in the minimum satellite hop count, and the congestion degree of the next-hop satellite. $\omega_1$, $\omega_2$, $\omega_3$ are the weights of the three factors, which meet the following condition:
$$\omega_1 + \omega_2 + \omega_3 = 1 \tag{12}$$
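As a minimal sketch, Eqs. (10) to (12) translate into the two functions below. The function names, the handling of the degenerate case $d_{max} = d_{min}$, and the encoding of `hop_change` (+1 when the minimum hop count to the target decreases, -1 when it increases, 0 when it is unchanged) are assumptions; the paper does not state the encoding. The `blocking` term is applied with the sign written in Eq. (11), so the caller must supply it in whatever convention the paper intends. The default weights are the values listed in the experimental parameters.

```python
def normalize_link_length(d, d_min, d_max):
    """Min-max normalisation of Eq. (10)."""
    if d_max == d_min:
        return 0.0  # assumption: all links equally long, nothing to normalise
    return (d - d_min) / (d_max - d_min)


def reward(d, d_min, d_max, hop_change, blocking, weights=(0.2, 0.4, 0.4)):
    """Reward of Eq. (11); the weights must sum to 1 as required by Eq. (12)."""
    w1, w2, w3 = weights
    if abs(w1 + w2 + w3 - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1 (Eq. 12)")
    d_norm = normalize_link_length(d, d_min, d_max)
    # hop_change encoding is assumed; blocking uses the sign written in Eq. (11).
    return -w1 * d_norm + w2 * hop_change + w3 * blocking
```

Keeping the normalisation separate from the reward makes it easy to recompute $d_{min}$ and $d_{max}$ whenever the topology is updated.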
The training steps of the Q-learning algorithm in the satellite network are as follows:
(1) Taking the current source satellite $S_{1,1}$ as the target node, select a non-faulty satellite node far away from $S_{1,1}$ in the network as the starting node, so as to traverse the entire constellation. Initialize the Q-value table.
(2) If the number of training episodes has not reached the maximum, start the next training episode.
(3) Select an action according to the current training episode: either select an action at random or select the action with the greatest benefit.
(4) If the selected action is unreasonable (the next hop is an adjacent node in the polar region or a faulty node), the state remains unchanged and a large negative benefit is obtained; then go to (3).
(5) If the target node has not been reached, calculate the reward according to the reward function and go to (3). Otherwise, obtain a large positive reward and go to (6).
(6) If the current training episode has reached the maximum value, the training is finished. Otherwise, go to (2).
In the action selection step of the above process, a dynamic greedy strategy is adopted. When the episode number is small, the environment should be explored more, so the probability of randomly selecting an action is increased; let this probability be ε. When the episode number is large, the agent is basically familiar with the entire environment, so the probability of selecting the action with the largest reward can be gradually increased. When this probability finally reaches 1 (that is, ε = 0), the action selection for each state is fixed.
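The training loop described in steps (1) to (6) can be sketched as follows. This is an illustration under several stated assumptions rather than the authors' implementation: ε decays linearly from its initial value to 0, the learning rate `alpha`, discount `gamma`, the reward magnitudes of ±100 and the per-episode step cap are arbitrary choices, only faulty next hops are treated as unreasonable actions (the polar-region restriction is not modelled), and a constant step reward of -1 stands in for Eq. (11). The 3×3 mesh is a hypothetical toy topology.

```python
import random


def grid_neighbors(rows, cols):
    """Hypothetical toy topology: a rows x cols mesh of satellites."""
    nbrs = {}
    for r in range(rows):
        for c in range(cols):
            cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs[(r, c)] = [(a, b) for a, b in cand
                            if 0 <= a < rows and 0 <= b < cols]
    return nbrs


def train_q_table(neighbors, target, start, faulty, step_reward,
                  episodes=2000, eps0=0.7, alpha=0.5, gamma=0.9,
                  goal_reward=100.0, fault_penalty=-100.0,
                  max_steps=200, seed=0):
    rng = random.Random(seed)
    # Step (1): initialise the Q-value table with zeros.
    q = {s: {n: 0.0 for n in nbrs} for s, nbrs in neighbors.items()}
    for episode in range(episodes):              # steps (2) and (6)
        eps = eps0 * (1.0 - episode / episodes)  # assumed linear decay of epsilon
        state = start
        for _ in range(max_steps):
            # Step (3): explore with probability eps, otherwise act greedily.
            if rng.random() < eps:
                nxt = rng.choice(list(q[state]))
            else:
                nxt = max(q[state], key=q[state].get)
            if nxt in faulty:
                # Step (4): unreasonable action, state unchanged, large penalty.
                r, new_state, done = fault_penalty, state, False
            elif nxt == target:
                # Step (5): target reached, large positive reward.
                r, new_state, done = goal_reward, nxt, True
            else:
                r, new_state, done = step_reward(state, nxt), nxt, False
            # Standard Q-learning update.
            future = 0.0 if done else max(q[new_state].values())
            q[state][nxt] += alpha * (r + gamma * future - q[state][nxt])
            state = new_state
            if done:
                break
    return q


topology = grid_neighbors(3, 3)
q_table = train_q_table(topology, target=(0, 0), start=(2, 2),
                        faulty={(1, 1)}, step_reward=lambda s, n: -1.0)
```

In a fuller model, `step_reward` would evaluate Eq. (11) for the link between the current satellite and the chosen next hop.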
4.2 Finding the Optimal Path After training, a Q-value table is generated, which records the rewards for choosing different actions in each state. The target node of the Q-value table is S1,1 . Therefore, regardless of the starting state, the algorithm selects the action with the highest reward in the current state during each iteration. This process ultimately yields the optimal path from the start satellite node to S1,1 . By reversing the path found previously, the
optimal paths from S1,1 to other satellites can also be obtained. These paths are stored in the Q-value table. When the satellite topology is updated, the Q-value table is also updated, enabling each satellite to obtain real-time optimal paths to other satellites.
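The path search described in this subsection amounts to repeatedly taking the highest-valued action until the target is reached. A minimal sketch follows; the function name, the loop guard, and the hand-written Q-values are hypothetical and serve only to illustrate the procedure.

```python
def greedy_path(q_table, start, target):
    """Follow the highest-valued action from start until target is reached."""
    path = [start]
    state = start
    while state != target:
        actions = q_table.get(state)
        if not actions:
            raise ValueError(f"no actions recorded for state {state!r}")
        state = max(actions, key=actions.get)
        if state in path:
            # A repeated state means the greedy policy would cycle forever.
            raise ValueError("routing loop detected in the Q-value table")
        path.append(state)
    return path


# Hand-written Q-values for a four-satellite toy network (illustrative only).
toy_q = {
    "S2,2": {"S1,2": 5.0, "S2,1": 8.0},
    "S1,2": {"S2,2": 1.0, "S1,1": 10.0},
    "S2,1": {"S2,2": 2.0, "S1,1": 9.0},
}
to_source = greedy_path(toy_q, "S2,2", "S1,1")
from_source = list(reversed(to_source))  # path from S1,1 back to S2,2
```

Reversing the path, as the text describes, presumes that each inter-satellite link can be used in both directions.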
5 Experiment and Simulation In the failure scenario of the satellite network, the traditional Shortest Path First (SPF) algorithm can update the shortest path tree from each satellite to the others in the constellation. However, it does not consider the satellite load. The Q-learning-based satellite routing algorithm proposed in this paper takes into account link length, the number of hops from the relay satellite to the target satellite, and satellite load. The algorithm aims to find a better path and improve communication performance. To compare the performance, we consider $S_{1,1}$ as an example in this scenario. We gradually increase the data packet generation rate and observe the packet loss rate and average end-to-end delay. For the parameters related to transmission delay, please refer to [15]; the remaining parameters are listed in Table 1. With these parameter settings, as the generation rate of data packets increases, we obtain the performance comparison between the Q-learning-based satellite routing algorithm (Q-learning-SR) and the shortest path first algorithm (SPF). The results are shown in Figs. 2 and 3. Figure 2 shows that, compared with the traditional SPF algorithm, the algorithm proposed in this paper reduces the packet loss rate by 100%. Since there are fully loaded satellite nodes in the experimental scene, serious packet loss occurs when data packets are transmitted along the shortest path found by the SPF algorithm. The satellite routing algorithm based on Q-learning takes the satellite load into account, bypasses these heavily congested nodes, and selects a shorter and less congested path, which reduces the packet loss rate. As can be seen, our proposed algorithm ensures that no packet loss occurs.
Table 1 Parameter settings
Parameter                          Value
Maximum queue length               200
Maximum training times             10000
ε                                  0.7
ω1, ω2, ω3                         0.2, 0.4, 0.4
Generation rate of data packets    100–1500/s
Number of faulty nodes             3
Number of fully loaded nodes       3
Fig. 2 Packet loss rate comparison between two algorithms
Fig. 3 End-to-end delay comparison between two algorithms
Figure 3 shows that, compared with the traditional SPF algorithm, the algorithm proposed in this paper also improves the average end-to-end delay. The improvement can be greater when link resources such as bandwidth are limited, which reduces the transmission rate.
6 Conclusion In this paper, we propose a Q-learning-based satellite routing algorithm for the failure scenario in the LEO satellite constellation. We establish an MDP model that considers factors such as link length, minimum number of hops from relay satellite to the target satellite and satellite load. This allows each satellite to quickly generate a new optimal path to other satellites in the constellation in the event of a failure. Simulation results demonstrate that our proposed algorithm outperforms the traditional SPF algorithm. Specifically, the algorithm reduces the packet loss rate by 100% and improves the average end-to-end delay during transmission by 1%. Acknowledgements This paper was supported by Shanghai Industrial Collaborative Innovation Project (No. 2021-cyxt2-kj14) and the Beijing Municipal Science and Technology Project (No. Z221100007722012).
References
1. Al-Hraishawi, H., Chougrani, H., Kisseleff, S., Lagunas, E., Chatzinotas, S.: A survey on nongeostationary satellite systems: The communication perspective. IEEE Commun. Surv. Tutorials 25(1), 101–132 (2023)
2. Xu, G., Zhao, Y., Ran, Y., Zhao, R., Luo, J.: Spatial location aided fully-distributed dynamic routing for large-scale LEO satellite networks. IEEE Commun. Lett. 26(12), 3034–3038 (2022)
3. Liu, Y., Liu, C.: Distributed dynamic routing algorithm for satellite constellation. In: 2018 10th International Conference on Communication Software and Networks (ICCSN), pp. 300–304 (2018)
4. Zha, P., Long, C., Wu, J., Li, S.: Satellite lifetime predicted greedy perimeter stateless routing protocol for LEO satellite network. Chin. Autom. Congr. (CAC) 2020, 5102–5107 (2020)
5. Pan, T., Huang, T., Li, X., Chen, Y., Xue, W., Liu, Y.: Opspf: orbit prediction shortest path first routing for resilient LEO satellite networks. In: ICC 2019 - 2019 IEEE International Conference on Communications (ICC), pp. 1–6 (2019)
6. Zhou, J., Sun, Z., Zhang, R., Lin, G., Zhang, S., Zhao, Y.: A cloud-edge collaboration CNN-based routing method for ISAC in LEO satellite networks. In: Proceedings of the 2nd Workshop on Integrated Sensing and Communications for Metaverse. ACM, Helsinki Finland, pp. 25–29 (Jun 2023). [Online]. Available: https://dl.acm.org/doi/10.1145/3597065.3597451
7. Wang, X., Dai, Z., Xu, Z.: LEO satellite network routing algorithm based on reinforcement learning. In: 2021 IEEE 4th International Conference on Electronics Technology (ICET), pp. 1105–1109 (2021)
8. Zhang, L.: Research and Implementation of Routing Protocol in Satellite Network. Master's Thesis, Xidian University (2017)
9. Ding, Y., Zhao, Y., Gao, Y., Zhang, R.: Q-learning quantum ant colony routing algorithm for micro-nano satellite network. In: 2021 IEEE 6th International Conference on Computer and Communication Systems (ICCCS), pp. 949–954 (2021)
10. Yin, Y., Huang, C., Wu, D.-F., Huang, S., Ashraf, M.W.A., Guo, Q.: Reinforcement learning-based routing algorithm in satellite-terrestrial integrated networks. Wireless Commun. Mob. Comput. 2021, 1–15 (2021)
11. Gong, X., Sun, L., Zhou, J., Wang, J., Xiao, F.: Adaptive routing strategy based on improved q-learning for satellite internet of things. In: Security, Privacy, and Anonymity in Computation, Communication, and Storage: SpaCCS 2020 International Workshops, Nanjing, China, December 18-20, 2020, Proceedings 13, pp. 161–172. Springer, Berlin (2021)
12. Zhou, J., Gong, X., Sun, L., Xie, Y., Yan, X.: Adaptive routing strategy based on improved double q-learning for satellite internet of things. Secur. Commun. Netw. 2021, 1–11 (2021)
13. Zuo, P., Wang, C., Yao, Z., Hou, S., Jiang, H.: An intelligent routing algorithm for LEO satellites based on deep reinforcement learning. In: 2021 IEEE 94th Vehicular Technology Conference (VTC2021-Fall), pp. 1–5 (2021)
14. Wang, C., Wang, H., Wang, W.: A two-hops state-aware routing strategy based on deep reinforcement learning for LEO satellite networks. Electronics 8(9), 920 (2019)
15. Wang, H.: Research on Dynamic Routing Algorithm of Low Earth Orbit Satellite Network based on Graph Neural Network. Master's thesis, Chongqing University of Posts and Telecommunications (2022)
Ensemble Regularized Polynomial Regression for Diagnosing Breast Cancer Subtypes Shan Xiang, Fugen Gao, and Juntao Li
Abstract Accurate diagnosis of breast cancer subtypes is crucial for prevention and treatment. This paper proposed a method to address the non-linear relationship of features in breast cancer subtype diagnosis. Firstly, missing values in miRNA expression data were replaced with the mean value of the corresponding feature column. Next, a polynomial regression model was adopted and the interaction of miRNA was considered to expand the dimension of the data. Then, the ensemble learning strategy was introduced through bootstrap sampling on the training set, and 100 regularized polynomial logistic regression models were constructed. Finally, relative majority voting strategy was applied to predict breast cancer subtypes for the testing samples. The results of 50 data division experiments showed that the proposed method can improve the diagnostic accuracy of breast cancer subtypes. Keywords Polynomial regression · Ensemble learning · Breast cancer
S. Xiang · F. Gao · J. Li (B) College of Mathematics and Information Science, Henan Normal University, Xinxiang 453007, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_38
1 Introduction Cancer is an extremely serious disease that poses a significant threat to human health and has become a global health challenge [1]. Among various types of cancer, breast cancer is one of the most common cancers among women, characterized by abnormal proliferation of breast cells [2]. Currently, the incidence rate of breast cancer continues to rise, and young women are gradually becoming a high-risk population [3]. According to the latest statistical data, more than 2 million women are diagnosed with breast cancer each year, with the number of deaths reaching 500,000 [4]. Therefore, there is an urgent need to strengthen early diagnosis and treatment of breast cancer. The diagnosis of breast cancer typically relies on traditional methods such as imaging and pathology. It therefore suffers from subjectivity and from the limitations imposed by tools like visual inspection and optical microscopy [5]. In recent years, the emergence of omics data has garnered widespread attention [6]. RNA-seq, a high-throughput sequencing technology, has been widely applied for quantifying and analyzing the expression of transcripts [7]. In recent breast cancer research, Fu et al. proposed a new statistical method based on RNA-seq data for detecting differential expression [8]. Additionally, Talal et al. proposed a novel homogenization algorithm that uses RNA-seq data to predict breast cancer subtypes [9]. MicroRNAs (miRNAs) are a class of non-coding RNA molecules that exhibit a high degree of conservation in biological evolution [10]. In the field of breast cancer research, a two-stage technique was proposed by Sarkar et al. to identify the most significant miRNA biomarkers [11]. Lin et al. conducted an analysis of miRNA expression profiles by employing weighted correlation network, Bayesian network, and miRNA similarity (MISIM) methods [12]. However, these studies have not taken into account the potential impact of missing values in the data on subsequent analysis and decision-making. Therefore, it becomes crucial to address the issue of missing data by introducing imputation methods. Yang et al. proposed a new method that combines miRNA functional similarity and expression similarity [13]. Zhang et al. proposed an interpolation model based on backpropagation neural network association rules [14]. Park et al. proposed a bidimensional integrative factorization method for imputing missing data [15]. In addition, breast cancer is a highly complex disease whose development is regulated by multiple factors, and these factors may exhibit non-linear relationships. Therefore, it is necessary to conduct further research on the interactions among these factors to enhance the diagnostic accuracy of breast cancer subtypes. To solve these two problems, this paper replaced missing values in the data with the average values of the features and constructed quadratic relationships among features on the completed data. Motivated by the bootstrap sampling technique [16] and modeling techniques [17–19], this paper proposed an ensemble polynomial logistic regression with ridge penalty (EPLR-R) for breast cancer subtype diagnosis. Figure 1 gives an overview of the method.
Fig. 1 Overview of the EPLR-R framework
2 Problem Statement The miRNA data for breast cancer can be obtained from the literature [20]. This data consists of 231 samples and 124 miRNAs, including 86 Luminal A (LA), 39 Luminal B (LB), 24 HER2-Enriched (H2), 41 Basal-Like (BL), and 41 control subtypes. Let $X = (x_1, \ldots, x_i, \ldots, x_{231})^T$ be the given miRNA expression matrix, where $x_i = (x_i^{(1)}, x_i^{(2)}, \ldots, x_i^{(124)})^T$ denotes the expression levels of the 124 miRNAs for the $i$th sample. Let $y_i$ denote the label of the breast cancer subtype. If the $i$th sample comes from LA, LB, H2, BL, or control, $y_i$ takes the value 1, 2, 3, 4, or 5 accordingly. Diagnosing breast cancer subtypes using miRNA data can be regarded as a five-class classification problem, and the diagnostic accuracy relies on the handling of missing values and the selection of feature sets. Based on data imputation and nonlinear feature construction techniques, this paper proposed the EPLR-R model to improve the diagnostic accuracy of breast cancer subtypes.
3 Main Results 3.1 Model Building For dealing with the missing values in the data, we adopted a feasible strategy to interpolate the data. Specifically, we substituted the missing value with the average of the features. This approach was primarily chosen based on the following two reasons.
Firstly, the data has a relatively small number of missing values, so this replacement method has minimal impact on the overall data. Secondly, through descriptive statistical analysis of the data, we found that the data presented an obvious central tendency, in which most of the values were clustered around the mean value. Considering the non-linear relationship of features in breast cancer diagnosis based on miRNA expression data, we proposed the following second-order polynomial regression model for diagnosing breast cancer subtypes:
$$\arg\max_{k=1,2,3,4,5} f_k(x), \tag{1}$$
where the $k$th decision function $f_k(x)$ in (1) is defined as
$$f_k(x) = \beta_k^{(0)} + \sum_{i=1}^{124} \beta_k^{(i)} x^{(i)} + \sum_{i=1}^{124} \sum_{j=i}^{124} \beta_k^{(i,j)} x^{(i)} x^{(j)}, \tag{2}$$
$\beta_k^{(0)}$ is the offset, $\beta_k^{(i)}$ and $\beta_k^{(i,j)}$ are the regression coefficients, and $x^{(i)}$ denotes the expression level of the $i$th miRNA for the sample $x$. Let $x^{(125)} = x^{(1)} x^{(1)}$, $x^{(126)} = x^{(1)} x^{(2)}$, …, $x^{(7874)} = x^{(124)} x^{(124)}$, $\beta_k^{(125)} = \beta_k^{(1,1)}$, $\beta_k^{(126)} = \beta_k^{(1,2)}$, …, $\beta_k^{(7874)} = \beta_k^{(124,124)}$. Then the second-order polynomial function $f_k(x)$ in (2) can be represented as the following linear regression:
$$f_k(x) = \tilde{f}_k(\tilde{x}) = \beta_k^{(0)} + \tilde{\beta}_k^{T} \tilde{x}, \tag{3}$$
where $\tilde{x} = (x^{(1)}, \ldots, x^{(7874)})^T$ and $\tilde{\beta}_k = (\beta_k^{(1)}, \ldots, \beta_k^{(7874)})^T$. We divided the data set into a training set and a test set to ensure that no information from the test samples was used during model training, and adopted a stratified sampling method so that the training set keeps the proportion of subtypes in the original data. The training set contains four-fifths of the samples (185), while the testing set includes the remaining 46 samples. We performed 50 repeated experiments. To solve for the regression coefficients $(\beta_k^{(0)}, \tilde{\beta}_k)$ in (3), we proposed a polynomial logistic regression with ridge penalty (PLR-R) model:
$$\min_{\tilde{\beta}_k, \beta_k^{(0)}} \; l(\{\tilde{\beta}_k, \beta_k^{(0)}\}_1^5) + \lambda P(\tilde{\beta}_k), \tag{4}$$
where the negative log-likelihood loss function $l(\{\tilde{\beta}_k, \beta_k^{(0)}\}_1^5)$ in (4) is defined as
$$l(\{\tilde{\beta}_k, \beta_k^{(0)}\}_1^5) = -\frac{1}{185} \sum_{i=1}^{185} \left[ \sum_{k=1}^{5} \mathbb{I}(y_i = k)\left(\beta_k^{(0)} + \tilde{\beta}_k^{T} \tilde{x}_i\right) - \ln\left( \sum_{k=1}^{5} e^{\beta_k^{(0)} + \tilde{\beta}_k^{T} \tilde{x}_i} \right) \right], \tag{5}$$
the penalty function $P(\tilde{\beta}_k)$ in (4) is defined as
$$P(\tilde{\beta}_k) = \sum_{k=1}^{5} \|\tilde{\beta}_k\|_2^2, \tag{6}$$
$\lambda$ is the regularization parameter, $\mathbb{I}(\cdot)$ is the indicator function, $y_i \in \{1, 2, 3, 4, 5\}$ denotes the label of the breast cancer subtype, and $\tilde{x}_i = (x_i^{(1)}, \ldots, x_i^{(7874)})^T$ denotes the 7874 expanded features (the miRNA expression levels and their pairwise products) for the $i$th sample. $\tilde{\beta}_k = (\beta_k^{(1)}, \ldots, \beta_k^{(7874)})^T$ is the regression coefficient vector and $\beta_k^{(0)}$ is the offset. In practical applications, individual classifiers often suffer from overfitting or underfitting, resulting in low accuracy or an inability to adapt to complex situations. We adopted an ensemble learning strategy to address this. Specifically, we first performed 100 bootstrap samplings on the training set of each subtype. Then, we trained a PLR-R model on each sampling set. Finally, the labels of the test samples were predicted according to the relative majority voting strategy. The diagnostic accuracy obtained was denoted as Acc.
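The feature expansion behind Eqs. (2) and (3) and the penalized objective of Eqs. (4) to (6) can be written out as a short sketch. The function names are illustrative, class labels are taken as 0-based indices (the paper uses 1 to 5), and the log-sum-exp shift is a standard numerical safeguard that is not mentioned in the paper.

```python
import math
from itertools import combinations_with_replacement


def expand_quadratic(x):
    """Return x^(i) followed by x^(i) * x^(j) for i <= j, in the order of Eq. (2)."""
    pairs = combinations_with_replacement(range(len(x)), 2)
    return list(x) + [x[i] * x[j] for i, j in pairs]


def plr_r_objective(samples, labels, betas, offsets, lam):
    """Mean negative log-likelihood (Eq. 5) plus lam * sum_k ||beta_k||^2 (Eq. 6).

    samples: expanded feature vectors; labels: 0-based class indices;
    betas[k] and offsets[k] are the coefficients and offset of class k.
    """
    nll = 0.0
    for x, y in zip(samples, labels):
        scores = [b0 + sum(b * v for b, v in zip(beta, x))
                  for beta, b0 in zip(betas, offsets)]
        m = max(scores)  # shift for numerical stability (log-sum-exp trick)
        log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
        nll -= scores[y] - log_norm
    penalty = sum(b * b for beta in betas for b in beta)
    return nll / len(samples) + lam * penalty
```

For 124 miRNAs the expansion yields 124 linear terms plus 124 · 125 / 2 = 7750 pairwise products, which gives the 7874 features used in Eq. (3).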
3.2 Solving Algorithm We developed the EPLR-R algorithm to compute the average diagnostic accuracy over 50 data partitions. The steps are as follows:
Algorithm 1 EPLR-R framework for diagnosis of breast cancer
Require: MiRNA expression data: $S = [X, Y]$, feature matrix: $X = (x_1, \ldots, x_{231})^T$, sample label: $Y = (y_1, \ldots, y_{231})^T$;
Ensure: The average diagnostic accuracy of 50 seeds on the testing set: Ada
1: Obtain $\tilde{S} = [\tilde{X}, Y]$ according to (2);
2: for seed = 1 to 50 do
3:   $\tilde{S}_{train} = 0.8 * \tilde{S}$; $\tilde{S}_{test} = 0.2 * \tilde{S}$;
4:   for $m = 1$ to 100 do
5:     Call bootstrap sampling to obtain sampling set $\tilde{S}^{m}_{train}$;
6:     Fit the $m$th PLR-R on $\tilde{S}^{m}_{train}$;
7:     Predict subtype label on $\tilde{S}_{test}$ by combining (1) and (3): $y^m = (y^m_1, \ldots, y^m_p, \ldots, y^m_{46})$;
8:   end for
9:   for $p = 1$ to 46 do
10:    $\hat{y}_p = \arg\max_{label} \{\sum_{m=1}^{100} \mathbb{I}(y^m_p = 1), \sum_{m=1}^{100} \mathbb{I}(y^m_p = 2), \sum_{m=1}^{100} \mathbb{I}(y^m_p = 3), \sum_{m=1}^{100} \mathbb{I}(y^m_p = 4), \sum_{m=1}^{100} \mathbb{I}(y^m_p = 5)\}$;
11:  end for
12:  $Acc_{seed} = \sum_{p=1}^{46} \mathbb{I}(\hat{y}_p = y_p) / 46$;
13: end for
14: Ada $= \mathrm{mean}\{Acc_{seed}\}_{seed=1}^{50}$
In the process of fitting the PLR-R model, we adopted the gradient descent algorithm to solve for the coefficients $(\tilde{\beta}_k, \beta_k^{(0)})$ in (3). The core idea is to update the parameters along the direction in which the objective function decreases fastest, so as to gradually approach the global optimal solution. To determine the optimal value of the parameter $\lambda$ in PLR-R, we adopted 10-fold cross-validation, which divides the data into ten equally sized parts and repeats training and validation for parameter selection. We adopted the relative majority voting strategy to predict the test sample label: the predictions of all models were counted, and the subtype that received the most votes was used as the final prediction label. To avoid the influence of a particular data division on the diagnostic accuracy, we conducted 50 random divisions of the data.
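The ensemble step (bootstrap sampling within each subtype, one model per bootstrap sample, relative majority voting) can be sketched independently of the base learner. In the sketch below the base learner is a nearest-centroid classifier, used purely as a lightweight stand-in: it is not PLR-R, and a real implementation would fit the ridge-penalized model by gradient descent instead. All function names and the tie-breaking rule (the first label to reach the highest count wins) are assumptions.

```python
import random
from collections import Counter


def bootstrap_by_class(samples, labels, rng):
    """Resample with replacement inside each subtype, keeping class sizes."""
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    xs, ys = [], []
    for y, members in by_class.items():
        for _ in members:
            xs.append(rng.choice(members))
            ys.append(y)
    return xs, ys


def relative_majority_vote(votes):
    """Return the label that received the most votes (plurality)."""
    return Counter(votes).most_common(1)[0][0]


def fit_nearest_centroid(xs, ys):
    """Stand-in base learner (NOT PLR-R): classify by the closest class mean."""
    sums, counts = {}, Counter(ys)
    for x, y in zip(xs, ys):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
    centroids = {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

    def predict(x):
        return min(centroids, key=lambda y: sum(
            (a - b) ** 2 for a, b in zip(x, centroids[y])))
    return predict


def ensemble_predict(train_x, train_y, test_x, fit, n_models=100, seed=0):
    """Fit n_models base learners on bootstrap samples and vote per test sample."""
    rng = random.Random(seed)
    models = [fit(*bootstrap_by_class(train_x, train_y, rng))
              for _ in range(n_models)]
    return [relative_majority_vote([m(x) for m in models]) for x in test_x]
```

Passing the base learner as the `fit` argument keeps the voting logic unchanged when the stand-in is replaced by an actual PLR-R fit.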
4 Experiment To validate the effectiveness of the EPLR-R model, we compared it with six other models: multinomial logistic regression with ridge penalty (MLR-R), multinomial logistic regression with lasso penalty (MLR-L), multinomial logistic regression with elastic net penalty (MLR), Random Forest (RF), Support Vector Machine (SVM), and Naive Bayes (NB). We employed the EPLR-R algorithm described in Sect. 3.2 to solve the problem. MLR-R, MLR-L and MLR were implemented via the R package glmnet. RF was implemented by using the R package randomForest. SVM and NB were implemented by using the R package e1071. The average diagnostic accuracy (Ada) and variance (Var) over 50 data partitions were adopted to evaluate the performance of these models. Ada and Var of the seven models under 50 random partitions are given in Table 1. The results showed that the Ada of the EPLR-R model was 80.17%, which is higher than that of the other six models, and its variance was 0.0029. Compared with MLR-R, MLR-L, MLR, RF, SVM, and NB, the diagnostic accuracy of the EPLR-R model is higher by 0.39, 6.52, 14.87, 5.21, 3.52 and 4.95 percentage points, respectively. The higher diagnostic accuracy of the proposed method can be attributed to the use of an ensemble learning strategy and a polynomial regression model. The ensemble learning strategy improves the generalization performance of the model, and the polynomial regression model can better capture the nonlinear relationships in the data and the interactions between features.
Table 1 Comparison of diagnostic accuracy for different models
Method   EPLR-R   MLR-R    MLR-L    MLR      RF       SVM      NB
Ada      0.8017   0.7978   0.7365   0.6530   0.7496   0.7665   0.7522
Var      0.0029   0.0032   0.0023   0.0032   0.0021   0.0022   0.0029
5 Conclusion In this paper, the EPLR-R model was proposed for diagnosing breast cancer subtypes. The mean value of the corresponding feature column was used to replace missing values in order to reduce the information loss in the data. To reduce model overfitting and improve generalization ability, 100 PLR-R models were fitted by using bootstrap sampling. Experimental results on breast cancer data demonstrated that this model is superior to the other six models in diagnostic performance. Acknowledgements This work was supported by the Scientific and Technological Project of Henan Province (232 102 210 066).
References
1. Ghose, S., Radhakrishnan, V., Bhattacharya, S.: Ethics of cancer: beyond biology and medicine. E Canc. Med. Sci. 28(13), 911 (2019). https://doi.org/10.3332/ecancer.2019.911
2. Wilkinson, L., Gathani, T.: Understanding breast cancer as a global health concern. Br. J. Radiol. 95(1130), 20211033 (2022). https://doi.org/10.1259/bjr.20211033
3. Bakkach, J., Mansouri, M., Derkaoui, T., Loudiyi, A., Fihri, M., Hassani, S., Barakat, A., Nourouti, N.G., Mechita, M.B.: Clinicopathologic and prognostic features of breast cancer in young women: a series from North of Morocco. BMC Womens Health 17(1), 106 (2017). https://doi.org/10.1186/s12905-017-0456-1
4. Sung, H., Ferlay, J., Siegel, R.L., Laversanne, M., Soerjomataram, I., Jemal, A., Bray, F.: Global cancer statistics 2020: globocan estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Canc. J. Clin. 71(3), 209–249 (2021). https://doi.org/10.3322/caac.21660
5. Sui, D., Liu, W.F., Chen, J., Zhao, C.X., Ma, X.X., Guo, M.Z., Tian, Z.F.: A pyramid architecture-based deep learning framework for breast cancer detection. Biomed Res. Int. 2021, 2567202 (2021). https://doi.org/10.1155/2021/2567202
6. Strehl, J.D., Wachter, D.L., Fasching, P.A., Beckmann, M.W., Hartmann, A.: Invasive breast cancer: recognition of molecular subtypes. Breast Care. 6(4), 258–264 (2011). https://doi.org/10.1159/000331339
7. Shi, X.X., Liu, X.J., Chen, C.L., Zhang, L.: Improving RNA-Seq expression estimation by modeling isoform- and exon-specific read sequencing rate. BMC Bioinform. 16(1), 332 (2015). https://doi.org/10.1186/s12859-015-0750-6
8. Fu, R., Wang, P., Ma, W.P., Taguchi, A., Wong, C.H., Zhang, Q., Gazdar, A., Hanash, S.M., Zhou, Q.H., Zhong, H., Feng, Z.D.: A statistical method for detecting differentially expressed snvs based on next-generation RNA-seq data. Biometrics. 73(1), 42–51 (2017). https://doi.org/10.1111/biom.12548
9. Ahmed, T., Carty, M., Wenric, S., Pelossof, R.: Towards cancer mega-cohorts: A novel homogenization algorithm applied to diverse breast cancer RNA-Seq datasets. J. Clin. Oncol. 38(15), e13507–e13507 (2020). https://doi.org/10.1200/JCO.2020.38.15_suppl.e13507
10. Han, B.W., Li, Z.H., Liu, S.F., Han, H.B., Dong, S.J., Zou, H.J., Sun, R.F., Jia, J.: A comprehensive review of microRNA-related polymorphisms in gastric cancer. Genet. Mol. Res. 15(2), 15028289 (2016). https://doi.org/10.4238/gmr.15028289
11. Sarkar, J.P., Saha, I., Sarkar, A., Maulik, U.: Machine learning integrated ensemble of feature selection methods, followed by survival analysis for predicting breast cancer subtype specific miRNA biomarkers. Comput. Biol. Med. 131(1), 104244 (2021). https://doi.org/10.1016/j.compbiomed.2021.104244
12. Hua, L., Zhou, P., Li, L., Liu, H., Yang, Z.: Prioritizing breast cancer subtype related miRNAs using miRNA-mRNA dysregulated relationships extracted from their dual expression profiling. J. Theor. Biol. 331, 1–11 (2013). https://doi.org/10.1016/j.jtbi.2013.04.008
13. Yang, Y., Xu, Z.D., Song, D.D.: Missing value imputation for microRNA expression data by using a GO-based similarity measure. BMC Bioinf. 17(1), 109–116 (2016). https://doi.org/10.1186/s12859-015-0853-0
14. Zhang, L., Cui, H., Liu, B., Zhang, C., Horn, B.K.P.: Backpropagation neural network for processing of missing data in breast cancer detection. IRBM. 42(6), 435–441 (2021). https://doi.org/10.1016/j.irbm.2021.06.010
15. Park, J.Y., Lock, E.F.: Integrative factorization of bidimensionally linked matrices. Biometrics. 76(1), 61–74 (2020). https://doi.org/10.1111/biom.13141
16. Pham, B.T., Tien Bui, D., Prakash, I., Dholakia, M.B.: Hybrid integration of multilayer perceptron neural networks and machine learning ensembles for landslide susceptibility assessment at Himalayan area (India) using GIS. Catena. 149(1), 52–63 (2017). https://doi.org/10.1016/j.catena.2016.09.007
17. Teng, X.J., Liu, B.L., Ichiye, T.: Understanding how water models affect the anomalous pressure dependence of their diffusion coefficients. J. Chem. Phys. 153(10), 104510 (2020). https://doi.org/10.1063/5.0021472
18. Teng, X.J., Ichiye, T.: Dynamical model for the counteracting effects of trimethylamine N-Oxide on urea in aqueous solutions under pressure. J. Phys. Chem. B. 124(10), 1978–1986 (2020). https://doi.org/10.1021/acs.jpcb.9b10844
19. Teng, X.J., Ichiye, T.: Dynamical effects of trimethylamine N-Oxide on aqueous solutions of urea. J. Phys. Chem. B. 123(5), 1108–1115 (2019). https://doi.org/10.1021/acs.jpcb.8b09874
20. Li, J.T., Zhang, H.M., Gao, F.G.: Identification of miRNA biomarkers for breast cancer by combining ensemble regularized multinomial logistic regression and cox regression. BMC Bioinf. 23(1), 434 (2022). https://doi.org/10.1186/s12859-022-04982-7
Smart Laboratory: A New Smart-Manufacturing-Technologies-enabled Chemical Experiment Paradigm Yaxin Wang, Chun Zhao, Wenzheng Liu, and Xiaotong Liu
Abstract With the continuous development of smart manufacturing technology, the smart laboratory is receiving increasing attention as an important branch of smart manufacturing. However, most laboratories still rely on traditional manual operations and manual recording of experimental data, which leads to low efficiency, high error rates, and difficulty in unifying the operation process. This paper proposes a smart laboratory solution based on smart manufacturing technologies, which utilizes Internet of Things, communication, big data, and artificial intelligence technologies to achieve precise data collection, processing, and analysis during the experimental process. A small case study is conducted using a 3D model of a robotic arm and a virtual robotic arm to verify the intercommunication and effectiveness of the proposed smart laboratory solution. The solution offers advantages such as improving laboratory efficiency, reducing experimental errors, and promoting data sharing, providing a new approach and method for the intelligent construction of laboratories. Keywords Smart manufacturing · Smart laboratory · Intercommunication
Y. Wang · C. Zhao (B) · W. Liu · X. Liu
Beijing Information Science and Technology University, Beijing 100101, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_39
1 Introduction
In recent years, the rapid development and widespread application of smart manufacturing technology have attracted wide attention from the industrial and academic communities. Smart manufacturing is based on the deep integration of new-generation information and communication technology with advanced manufacturing technology, and it runs through all stages of manufacturing activities such as design, production, management, and services [1]. The core concepts of smart manufacturing are intelligence, efficiency, greenness, and safety, with the aim of transforming and upgrading the manufacturing industry towards low cost and high efficiency [2]. The development of smart manufacturing technology brings huge changes to the traditional manufacturing industry and also puts forward new requirements for manufacturing industry research. As an important part of manufacturing industry research, the smart laboratory can effectively improve manufacturing efficiency and quality through digital simulation technology, which enables simulation, optimization and control of manufacturing processes.
Smart manufacturing is a manufacturing paradigm supported by advanced technologies such as artificial intelligence, the Internet of Things, big data, and cloud computing. The smart laboratory is a new laboratory paradigm built on smart manufacturing concepts and technologies [3]. For example, traditional printing laboratories rely heavily on manual labour, which often makes experiments time-consuming and error-prone [4]. In contrast, the smart laboratory can automatically collect and process experimental data through sensors and robots, thus reducing operator intervention and improving the accuracy of experimental results. Additionally, traditional laboratories work only at certain times and places, whereas the smart laboratory supports virtual experiments in which students can participate remotely through the Internet [5]. Furthermore, traditional chemical laboratories usually rely on manually operated instruments, so the data collection range is limited by human perception, which reduces the accuracy and reliability of experimental results [6]. Some traditional chemical experiments pose certain risks, such as flammability and explosiveness. Through digital modelling and simulation technology, experimenters can monitor and adjust real experiments in real time through virtual laboratories, avoiding dangerous situations [7].
This paper aims to achieve bidirectional information exchange and data transmission between a 3D model of the robotic arm in Unity and a virtual robotic arm, in order to verify the intercommunication and effectiveness of the proposed smart laboratory solution. The main contribution of this paper is a smart laboratory application method based on smart manufacturing technologies, which provides new ideas and directions for the development of smart manufacturing. According to the functional requirements of the smart laboratory, this paper applies smart manufacturing technologies to each module of the laboratory and realizes the automatic collection and processing of experimental data as well as the automatic control and optimization of experimental processes. The effectiveness of the proposed method in data transmission is demonstrated by a case study.
The remainder of this paper is organized as follows. Sect. 2 introduces the concepts covered in this paper. Sect. 3 presents the application of smart manufacturing technologies in the smart laboratory. Sect. 4 introduces a small case of bidirectional communication between the 3D model of the robotic arm and the virtual robotic arm. Finally, Sect. 5 concludes this paper.
2 From Smart Manufacturing to Smart Laboratory
Smart manufacturing has been a hot topic in recent years. It integrates advanced technologies such as the Internet of Things and big data to improve manufacturing efficiency and reduce costs. This paper aims to develop a smart laboratory based on smart manufacturing technologies, integrating real-world physical systems with virtual systems to achieve bidirectional data exchange and communication. The following subsections introduce the relevant concepts.
2.1 Smart Manufacturing
Smart manufacturing combines intelligence, digitization, and networking in production, management, and service, aiming to improve manufacturing efficiency, quality, and flexibility and to achieve personalized customization and sustainable development [8]. Smart manufacturing technologies mainly include the following aspects.
Firstly, IoT technology achieves interconnection and information sharing among the equipment and materials in the manufacturing process [9]. Through wireless sensors, RFID and other technologies, IoT technology connects all kinds of equipment and materials in the manufacturing process to the Internet, forming a smart manufacturing system.
Secondly, big data technology is a set of technologies and tools used to process and analyze large amounts of data [10]. With the development of Internet technology, the amount and variety of data continue to increase, and big data technology provides a solution for data collection, storage, processing, and analysis.
Thirdly, cloud computing technology provides powerful computing and storage capabilities for smart manufacturing. Through cloud computing technology, manufacturing enterprises can store massive data and applications in the cloud, achieving data sharing and collaboration [11] while also significantly reducing their information technology investment and operation costs.
Fourthly, artificial intelligence technology can achieve automation and intelligent control of the manufacturing process through data learning and analysis [12]. AI technology can be applied to various links in the manufacturing process, such as product design, production planning, and equipment maintenance, improving manufacturing efficiency and quality and reducing costs and risks.
Finally, virtual reality technology can achieve three-dimensional visualization and simulation of the manufacturing process [13]. Through virtual reality technology, manufacturing enterprises can simulate and optimize the manufacturing process in a virtual environment, reducing trial-and-error costs and improving manufacturing efficiency and quality.
2.2 Smart Laboratory
The first generation of smart laboratories mainly relies on introducing new hardware devices and software platforms to control robotic arms and achieve visual perception. However, data processing is typically performed using databases or file storage, and programming languages or specialized software are used for data analysis. Additionally, the sensors and algorithms used in this generation are relatively simple and cannot comprehensively perceive and understand the surrounding environment. The interactivity of the first-generation smart laboratory is limited: communication with users is one-way and lacks real-time responsiveness and flexibility.
With the adoption of big data technologies, the second generation of smart laboratories can store, manage, and process massive amounts of data more efficiently. Using sensor technologies, it can perceive the experimental environment in real time and quickly identify and adapt to environmental changes. In terms of interactivity, the second generation makes significant improvements by introducing more intelligent interaction methods, such as speech recognition and natural language processing, which allow users to interact with the intelligent system more naturally.
The third generation of smart laboratories typically uses more advanced data processing techniques and algorithms, such as deep learning and neural networks, which can efficiently process and analyze large amounts of data. Through intelligent sensor networks and human-machine interaction systems, the third generation can autonomously perceive and identify the laboratory's internal and external environment and devices, achieving real-time monitoring and control of the laboratory's operating status. It not only supports conventional interaction methods such as voice and gesture but also introduces more diverse ones, such as head tracking and body posture recognition. The evolution of the smart laboratory is shown in Fig. 1.
The smart laboratory based on smart manufacturing technologies proposed in this paper not only relies on robotic arms to replace repetitive human labour but, more importantly, applies advanced algorithms and computing power to expand human cognition and decision-making ability. This new type of laboratory promotes the progress of science and technology and brings wider application and greater social value [14]. The smart laboratory integrates intelligent production and automated experiments, and it realizes the automation and intelligence of the laboratory through digital technology and smart manufacturing technology [15]. Smart manufacturing technologies include the Internet of Things, cloud computing, artificial intelligence, big data analysis and other technologies. The smart laboratory uses these technologies to achieve automated production, which can better meet the needs of the laboratory [16].
Fig. 1 The evolution of smart laboratory (first, second, and third generation compared by data handling, autonomy, and human-computer interaction; labels in the figure include storage and processing separation, sensor sensing, unidirectional, autonomous perception, and humanized)
3 Smart-Manufacturing-Technologies-Enabled Smart Laboratory
Smart manufacturing technologies and the smart laboratory complement each other. The development of smart manufacturing technologies requires continuous optimization and improvement of the production process, and the smart laboratory provides a scientific basis for this optimization by supporting the collection, analysis, and mining of experimental data. The smart laboratory is divided into seven modules according to the requirements: a data acquisition module, a data processing module, an intelligent control module, a virtual simulation module, a remote experiment module, a knowledge management module, and a security guarantee module. The implementation of these modules applies the relevant smart manufacturing technologies. The relationship between the requirements that drive the smart laboratory and the supporting smart manufacturing technologies is shown in Fig. 2.
The data acquisition module in the smart laboratory is responsible for collecting various types of data. Different types of sensors can be used to achieve data acquisition [17]. For example, in a physical laboratory, sensors can collect data on temperature, humidity, and pressure; in a chemical laboratory, sensors can collect data on pH value and conductivity. The data acquisition module also requires data transmission technology, either wireless or wired, to transfer the collected data to the data processing module for analysis and processing. To ensure data accuracy and stability, the data acquisition module additionally needs to pre-process the data, for example by filtering, noise reduction, and calibration.
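The filtering and calibration step mentioned for the data acquisition module can be illustrated with a minimal sketch. The paper does not specify which filter or calibration model is used, so the moving-average filter, the linear gain/offset calibration, and all names and numbers below are assumptions chosen only for illustration.

```python
from collections import deque


class SensorChannel:
    """Moving-average filter followed by a linear calibration (illustrative only)."""

    def __init__(self, window=5, gain=1.0, offset=0.0):
        # Keeps only the latest `window` raw readings.
        self._samples = deque(maxlen=window)
        self.gain = gain      # hypothetical calibration slope
        self.offset = offset  # hypothetical calibration intercept

    def update(self, raw_value):
        # Noise reduction: average the most recent raw readings.
        self._samples.append(raw_value)
        smoothed = sum(self._samples) / len(self._samples)
        # Calibration: map the smoothed reading to engineering units.
        return self.gain * smoothed + self.offset


# Hypothetical raw readings; the spike at 6.0 is damped by the 3-sample window.
channel = SensorChannel(window=3, gain=2.0, offset=1.0)
readings = [channel.update(v) for v in (3.0, 3.0, 6.0, 3.0)]
print(readings)
```

A longer window smooths more strongly but reacts more slowly to genuine changes, so the window length would have to be tuned per sensor.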
Fig. 2 Requirements technology mapping (requirement and module pairs: collecting experimental data, Data Acquisition Module; processing and analyzing data, Data Processing Module; storing experimental data, Knowledge Management Module; controlling experimental equipment, Intelligent Control Module; constructing virtual models, Virtual Simulation Module; remote operation monitoring, Remote Experiment Module; ensuring experiment safety, Safety Guarantee Module. Supporting technologies shown: data transmission, big data, Internet of Things, communication, artificial intelligence, virtual reality, remote control, sensor, and cloud computing technology)
The data processing module is responsible for processing and analyzing the collected data to extract useful information. Because this module must handle a large amount of data, big data technology is widely used in the data processing module of smart laboratories. For instance, big data platforms such as Hadoop and Spark can help process massive data and provide fast and efficient data processing capabilities [18]. Cloud computing technology can also be used in the data processing module to achieve large-scale data storage. Moreover, IoT technology can be used to collect sensor data and transmit it to the cloud for analysis and processing in the data processing module [19].
The intelligent control module is mainly used for laboratory control and monitoring. It applies artificial intelligence technology to achieve intelligent control of experiments. By analyzing experimental data and building models, the intelligent control module can make autonomous decisions and diagnose and optimize the experimental process [20]. Various sensors monitor the parameters of the experiment in real time, and the collected data is transmitted to the data processing module for analysis.
The virtual simulation module is used to build virtual models of laboratory scenes and equipment for virtual simulation experiments. Virtualization technology can virtualize a physical device into multiple logical devices. In the smart laboratory, it can be used to create multiple virtual experimental environments, each of which can host a different experiment, thereby fully utilizing the resources of physical devices. 3D modelling technology can transform physical entities into digitized 3D models to create virtual experimental equipment, objects, and scenes, making the virtual environment more realistic. Virtual reality technology can simulate real scenes, allowing users to experience the scenes and objects in the virtual environment as if they were actually there [17].
The remote experiment module is used to realize remote experiment operation and monitoring. Remote control technology enables the connection between the user's local computer and the remote experiment device. This connection is established through remote access control software, facilitating remote experiment operations [21]. By leveraging cloud computing technology, virtualized experiment environments can be built on the cloud, and corresponding virtualized experiment devices can be provided. Users can remotely access these virtualized experiment devices for experiment operations through remote access control technology. Real-time monitoring of the experiment environment and data collection are necessary for remote experiments.
Sensor technology can be used to monitor physical parameters such as temperature, humidity, and light intensity, and the collected data can be uploaded to the cloud for processing and analysis.
The knowledge management module is primarily responsible for managing and sharing various knowledge and information involved in the smart laboratory. In the knowledge management module, all data involved in the laboratory can be integrated into a knowledge base for management, classification and organisation. The knowledge and information in the laboratory are constantly updated and changed. The knowledge management module can update and maintain the content in the knowledge base, ensuring that laboratory personnel have access to the latest and correct information and knowledge.
The security module provides effective security, reliability, and confidentiality for the laboratory. The security module can perform user authentication, access control, and permission management functions for different user identities. The security module also provides data encryption and decryption functions to ensure that laboratory data is not accessed or leaked by unauthorized users.
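The access control and permission management described for the security module can be sketched as a small role-based check. The paper does not describe its actual design, so the role names, permission names, and class below are hypothetical, and authentication and encryption are deliberately left out.

```python
# Hypothetical roles and permissions; the paper does not list any.
ROLE_PERMISSIONS = {
    "administrator": {"read_data", "write_data", "control_equipment", "manage_users"},
    "experimenter": {"read_data", "write_data", "control_equipment"},
    "observer": {"read_data"},
}


class AccessController:
    """Role-based permission check for laboratory users (illustrative only)."""

    def __init__(self, role_permissions):
        self._role_permissions = role_permissions
        self._user_roles = {}

    def register(self, user, role):
        if role not in self._role_permissions:
            raise ValueError(f"unknown role: {role}")
        self._user_roles[user] = role

    def is_allowed(self, user, permission):
        # Unregistered users are denied by default.
        role = self._user_roles.get(user)
        if role is None:
            return False
        return permission in self._role_permissions[role]


controller = AccessController(ROLE_PERMISSIONS)
controller.register("student_a", "observer")
controller.register("operator_b", "experimenter")
print(controller.is_allowed("student_a", "control_equipment"))
print(controller.is_allowed("operator_b", "control_equipment"))
```

Denying unknown users by default keeps the sketch on the safe side when a user record is missing.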
4 Case Study
The case study realizes communication between the 3D model of the robotic arm and the virtual robotic arm. It comprises building the 3D model of the robotic arm in Unity, configuring the virtual robotic arm, and selecting an appropriate communication method to achieve data acquisition and real-time communication of motion control.
4.1 Case Design
In this case, the 3D model of the robotic arm and the virtual robotic arm communicate with each other to achieve data acquisition and real-time communication of motion control. The preparation proceeds as follows. First, the data of the physical model, the equipment source data and other information are collected and stored in a specific location. Second, a 3D model of the robotic arm is built from these data, and a 3D printer is used to print the physical twin of the robotic arm. Third, the collected data are managed and classified according to the needs of the different models and made available to the interactive control layer.
The interactive control layer includes three interfaces. The virtual model interaction interface monitors the status of the 3D model of the robotic arm. The simulation interaction interface monitors the status of the virtual robotic arm. The communication interface supports the communication between the simulation interaction interface and the 3D model of the robotic arm, and between the virtual model interaction interface and the virtual robotic arm. In the virtual layer, when the user adjusts the 3D model of the robotic arm or the virtual robotic arm, the corresponding control site transmits position information to the message centre. Upon receiving the information, the other side moves its robotic arm in real time. Together, the three interfaces and the layered structure allow the entire system to achieve the desired functions and goals. The case flow chart is shown in Fig. 3.
4.2 System Implementation
4.2.1 Build a 3D Model
First, based on the collected data of the physical model, the device source data and other information, AutoCAD or other CAD software is used to draw the 3D model of the robotic arm, and the files of the 3D model components are exported and converted into appropriate file formats. A new scene is then created in Unity, and the 3D model component files of the robotic arm are imported into the scene. Finally, a control script is used so that the robotic arm components can be controlled from the keyboard.
Fig. 3 The case flow chart (Data Layer: computational model, simulation data, action model, device source data, manufacturing equipment, component management, physical twin; Interactive Control Layer: communication interface, simulation interactive interface, virtual model interactive interface; Virtual Layer: 3D model of robotic arm and virtual robotic arm, each with its own control site)
4.2.2 Install the Virtual Robot Arm
The operating system and related software of the robotic arm, such as the virtual teach pendant, are installed in a virtual machine, and the WorkVisual software is installed on the computer. The control script of the virtual robotic arm is written in WorkVisual and passed to the virtual robotic arm control site. The control site uses the script to control the virtual robotic arm and, at the same time, to receive status information from the 3D model of the robotic arm and communicate with it.
4.2.3 Intercommunication
In this case, the control site of the 3D model is mainly responsible for two tasks. The first task is to control the 3D model of the robotic arm in real time, update its status information in real time, and publish that state information to the message centre. The second task is to subscribe to the status information of the virtual robotic arm and change the state of the 3D model accordingly, realizing real-time control of the 3D model of the robotic arm. The virtual robotic arm control site has two main functions. The first is to publish virtual robotic arm status information to the message centre. The second is to receive status information from the 3D model of the robotic arm and then control the rotation of the virtual robotic arm. The communication uses the publish-subscribe mode, and a local MQTT server is used to realize two-way communication between the 3D model of the robotic arm and the virtual robotic arm. The intercommunication schematic diagram is shown in Fig. 4.
Fig. 4 Intercommunication schematic diagram (the control site of the 3D model and the virtual robotic arm control site each publish one event and subscribe to the other's event through the message centre; the message centre is described by communication protocol, topic/channel, data format, subscriber/publisher, and message queue; messages carry attitude information, positional information, status information, and control instructions)
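The publish-subscribe exchange between the two control sites can be sketched as follows. This is not the authors' implementation: a small in-memory class stands in for the local MQTT server so that the example runs without a broker, and the topic names, the JSON payload layout, and the six-joint state are all assumptions made for illustration.

```python
import json
from collections import defaultdict


class MessageCentre:
    """In-memory stand-in for the local MQTT server (hypothetical, illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload):
        # Deliver the serialized message to every subscriber of the topic.
        for callback in list(self._subscribers[topic]):
            callback(topic, payload)


class ControlSite:
    """One control site: publishes its own arm state and mirrors the peer's state."""

    def __init__(self, centre, own_topic, peer_topic, n_joints=6):
        self.centre = centre
        self.own_topic = own_topic
        self.joint_angles = [0.0] * n_joints  # joint count is an assumption
        centre.subscribe(peer_topic, self._on_peer_state)

    def adjust(self, joint_angles):
        # A local user action: update the arm, then publish the new state.
        self.joint_angles = list(joint_angles)
        self.centre.publish(self.own_topic, json.dumps({"joints": self.joint_angles}))

    def _on_peer_state(self, topic, payload):
        # Mirror the peer's state without re-publishing, which avoids an echo loop.
        self.joint_angles = json.loads(payload)["joints"]


centre = MessageCentre()
model_site = ControlSite(centre, "arm/model3d/state", "arm/virtual/state")
virtual_site = ControlSite(centre, "arm/virtual/state", "arm/model3d/state")

model_site.adjust([10.0, 20.0, 0.0, 0.0, 45.0, 0.0])
print(virtual_site.joint_angles)
virtual_site.adjust([0.0, 0.0, 30.0, 0.0, 0.0, 90.0])
print(model_site.joint_angles)
```

Each site publishes only on a local adjustment and never on receipt of a peer message. With a real broker the same structure would apply, with the stand-in class replaced by an MQTT client connection.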
4.3 Analysis
In this case, the 3D model of the robot arm and the virtual robot arm are interrelated, and the two control sites realize real-time interactive simulation through data transmission. In this way, the interaction between the virtual environment and the real environment can be realized, the development and testing costs of the robotic arm control system can be effectively reduced, and the stability and reliability of the system can be improved.
5 Conclusion
Its features enable the smart laboratory to play an increasingly important role in education, scientific research and other fields. In education, the smart laboratory can provide students with a more diverse experimental teaching environment and more practical opportunities. In scientific research, the smart laboratory can help researchers design experiments and analyze data more efficiently. In summary, this paper studies the construction and development of smart laboratories, proposes a design scheme for the smart laboratory based on smart manufacturing technologies, and looks ahead to the future development direction of the smart laboratory. However, the construction and development of smart laboratories still face problems and challenges. Further theoretical research and in-depth exploration of the application of smart manufacturing technologies in laboratories are required to achieve the sustainable development of smart laboratories.
References
1. Zhou, J., et al.: Toward new-generation smart manufacturing. Engineering 4(1), 11–20 (2018)
2. Zhong, R.Y., et al.: Smart manufacturing in the context of industry 4.0: a review. Engineering 3(5), 616–630 (2017)
3. He, B., Bai, K.-J.: Digital twin-based sustainable smart manufacturing: a review. Adv. Manuf. 9, 1–21 (2021)
4. Hawkins, I., Phelps, A.J.: Virtual laboratory versus traditional laboratory: which is more effective for teaching electrochemistry? Chem. Educ. Res. Pract. 14(4), 516–523 (2013)
5. Wang, L., Chen, X., Liu, Q.: A lightweight smart manufacturing system based on cloud computing for plate production. Mobile Netw. Appl. 22, 1170–1181 (2017)
6. Venkatasubramanian, V.: The promise of artificial intelligence in chemical engineering: Is it here, finally? AIChE J. 65(2), 466–478 (2019)
7. Lu, Y., et al.: Digital Twin-driven smart manufacturing: connotation, reference model, applications and research issues. Robot. Comput.-Integr. Manuf. 61, 101837 (2020)
8. Esmaeilian, B., Behdad, S., Wang, B.: The evolution and future of manufacturing: a review. J. Manuf. Syst. 39, 79–100 (2016)
9. Younan, M., Khattab, S., Bahgat, R.: From the wireless sensor networks (WSNs) to the Web of Things (WoT): an overview. J. Intell. Syst. Internet Things 4(2), 56–68 (2021)
10. Lokers, R., et al.: Analysis of Big Data technologies for use in agro-environmental science. Environ. Model. Softw. 84, 494–504 (2016)
11. Marinescu, D.C.: Cloud computing: theory and practice. Morgan Kaufmann (2022)
12. Shaw, J., et al.: Artificial intelligence and the implementation challenge. J. Med. Internet Res. 21(7), e13659 (2019)
13. Zhao, H.Y., et al.: Application of virtual reality technology in high vocational education. Appl. Mech. Mater. 556, 6716–6719 (2014)
14. Peng, X., et al.: Next-generation smart laboratories for materials design and manufacturing. MRS Bull. 1–7 (2023)
15. Chen, B., et al.: Smart factory of industry 4.0: Key technologies, application case, and challenges. IEEE Access 6, 6505–6519 (2017)
16. Cai, Q., et al.: Research on key technologies for immune monitoring of smart manufacturing system. Int. J. Adv. Manuf. Technol. 94, 1607–1621 (2018)
17. Zhang, L., et al.: Modeling and simulation in smart manufacturing. Comput. Ind. 112, 103123 (2019)
18. Wang, J., et al.: Big data analytics for smart manufacturing systems: a review. J. Manuf. Syst. 62, 738–752 (2022)
19. Kumar, S., Tiwari, P., Zymbler, M.: Internet of things is a revolutionary approach for future technology enhancement: a review. J. Big Data 6(1), 1–21 (2019)
20. Peng, X., et al.: Next-generation smart laboratories for materials design and manufacturing. MRS Bull. 1–7 (2023)
21. Esposito, G., et al.: Non-traditional labs and lab network initiatives: a review. Int. J. Online Biomed. Eng. 17(5) (2021)
Design and Implementation of Humanoid Robot Arm Based on Human Arm Mechanism
Shuxuan Liu, Fan Yang, Chang Li, Junning Zhang, Yajing Guo, and Pengfei Li
Abstract A seven-degree-of-freedom humanoid robotic arm is designed to address the large size, small loading capacity and low accuracy of existing humanoid robotic arms. The arm is designed according to the movement mechanism of the human arm, and technical indexes are specified for each joint to meet the end load requirements. Two types of joint structure design methods are proposed, and a multi-turn absolute encoder is used to achieve high-precision control of the joints, which improves the accuracy of the control system of the robot arm and achieves the purpose of "usefulness". The electrical system of the humanoid robot arm is built, and the electrical connection is reliable and stable. Experimental verification shows that the seven-degree-of-freedom humanoid robot arm moves well and that its repetitive positioning accuracy reaches ±0.1 mm under load, which demonstrates the high accuracy of transmission and control.
Keywords Arm · Humanoid robot arm · Joint · Seven degrees of freedom
S. Liu (B) · F. Yang · J. Zhang · Y. Guo · P. Li
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
C. Li
China Academy of Launch Vehicle Technology, Beijing 100076, China
J. Zhang
Harbin Institute of Technology, Harbin 150006, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_40
1 Introduction
With the rapid development of the service robot field, humanoid robots are required to provide more and more functions. The humanoid robot arm, as the most important part of a humanoid robot, is indispensable for enhancing these functions. The human arm is an advanced product of natural evolution: it has seven degrees of freedom and can achieve flexible, fast, and accurate positioning in space, so building robots with arms as flexible as human arms has become the goal of many researchers [1–3]. Japan's SoftBank developed the robot Pepper [4], designed as a family companion and to help customers in retail stores, but the robot's humanoid robotic arm has low loading capacity and can only perform simple actions such as waving and shaking hands. The 7-DOF humanoid robot arm developed by the National Institute of Applied Sciences (INSA) in France has almost all the functions of a human arm, but its total size and weight are too large [5]. Therefore, it is significant to study a humanoid robot arm with human-like scale and high loading capacity. The humanoid robotic arm studied in this project meets the targets of human scale and high load, and at the same time places higher requirements on its own repetitive positioning accuracy so that it can complete high-precision operation tasks autonomously.
2 Configuration Design
The human arm is a mechanism with seven degrees of freedom, including three degrees of freedom for the shoulder joint, one degree of freedom for the elbow joint, and three degrees of freedom for the wrist joint. The shoulder joint is a typical ball-and-socket joint, the elbow joint is a hinge joint with one degree of freedom, and the wrist joint is composed of a rotational joint and two cross joints with two degrees of freedom. The human arm is a redundant structure, which is what makes it so flexible (Fig. 1).
Fig. 1 Configuration of the robot arm
In order to imitate human actions, this design adopts a seven-degree-of-freedom S-R-S configuration. Three tandem rotating joints are used as the shoulder joint for forward and backward extension and flexion, extension and retraction, and rotation; one rotating joint is used as the elbow joint for pitch and tilt; and three tandem rotating joints are used as the wrist joint for flip, pitch and deflection. To facilitate modeling and calculation, the axes of the seven joints are arranged orthogonally to one another, and all joints lie on a straight line when the arm is fully extended.
3 Forward Kinematics Construction
In order to design the robotic arm anthropomorphically, the total length of the designed robotic arm is 700 mm and the outer envelope diameter is about 120 mm. To describe the motion of the end of the robotic arm, the DH coordinate system of the robotic arm is established as shown in Fig. 2. The initial attitude of this right arm is the state in which both arms hang down naturally.
Fig. 2 DH coordinate system of humanoid robot arm
S. Liu et al.
Table 1 DH parameters of humanoid robot arm
Number  Alpha/°  a  d/mm      theta
1       0        0  d1 (280)  theta1
2       90       0  0         theta2
3       −90      0  d2 (240)  theta3
4       90       0  0         theta4
5       −90      0  d3 (200)  theta5
6       90       0  0         theta6
7       −90      0  d4 (90)   theta7
According to the established link coordinate systems and link parameters, the DH parameters of the humanoid robot arm are obtained as shown in Table 1. The coordinate origin is set at the center of the human body, so d1 is set to 280 mm.
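The DH parameters in Table 1 can be turned into a forward kinematics routine. The following is a minimal sketch, not the authors' implementation. It assumes the modified (Craig) DH convention, in which each link transform is Rx(alpha) · Tx(a) · Rz(theta) · Tz(d); the paper does not state the convention, and this one is chosen here because it places all joint origins on a straight line at zero joint angles, as the configuration design describes. The names `DH_TABLE`, `link_transform` and `forward_kinematics` are illustrative.

```python
import math

# Rows of Table 1 as (alpha in degrees, a in mm, d in mm); theta is the joint variable.
DH_TABLE = [
    (0.0, 0.0, 280.0),
    (90.0, 0.0, 0.0),
    (-90.0, 0.0, 240.0),
    (90.0, 0.0, 0.0),
    (-90.0, 0.0, 200.0),
    (90.0, 0.0, 0.0),
    (-90.0, 0.0, 90.0),
]


def link_transform(alpha_deg, a, d, theta):
    """Homogeneous transform Rx(alpha) * Tx(a) * Rz(theta) * Tz(d).

    This is the modified DH convention, which is an assumption of this sketch.
    """
    ca = math.cos(math.radians(alpha_deg))
    sa = math.sin(math.radians(alpha_deg))
    ct, st = math.cos(theta), math.sin(theta)
    return [
        [ct, -st, 0.0, a],
        [st * ca, ct * ca, -sa, -sa * d],
        [st * sa, ct * sa, ca, ca * d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def mat_mul(p, q):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(thetas):
    """Return the 4x4 pose of the last frame for seven joint angles in radians."""
    pose = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for (alpha_deg, a, d), theta in zip(DH_TABLE, thetas):
        pose = mat_mul(pose, link_transform(alpha_deg, a, d, theta))
    return pose


if __name__ == "__main__":
    end_pose = forward_kinematics([0.0] * 7)
    # Position of the arm end (mm) in the base frame at zero joint angles.
    print([end_pose[row][3] for row in range(3)])
```

Under this assumed convention the zero pose stacks the four link offsets d1 to d4 along one axis, which gives a quick sanity check of the table entries.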
4 Design of Humanoid Robot Arm Body
4.1 General Layout
The length of the humanoid robotic arm is closely related to the load size. The longer the arm, the larger the moment generated by the same load, which in turn increases the output torque required of the shoulder joint and the outer envelope diameter. It is therefore necessary to control the total length of the robotic arm, which is designed as 700 mm. Joints 1, 3, 5 and 7 are placed in the vertical direction, and their length directly affects the total length of the arm. To reduce the arm length, the joint actuators are centrally distributed: the actuators of joints 2, 3 and 4 are placed on the second and third joints, which are longer, and the actuators of joints 5 and 6 are placed on the arms of the fourth and fifth joints. By making full use of the otherwise empty space, the outer envelope size of the robot arm can be controlled, as shown in Fig. 3.
Fig. 3 Drive layout
Table 2 Individual joint moment indicators
Number  Name    Torque (Nm)
1       Joint1  55
2       Joint2  40
3       Joint3  20
4       Joint4  20
5       Joint5  10
6       Joint6  10
7       Joint7  5
4.2 Joint Moment Analysis
The rated load of this robotic arm is designed to be 2.5 kg, which is enough for most everyday tasks such as holding water cups, carrying bags, and doing housework. According to the length of the robot arm, the load, the estimated shell weight, and the component selection, the output torque of each joint was initially decomposed, and the indicators are shown in Table 2.
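As a rough plausibility check of the torque indicators, the static moment produced by the rated payload alone can be estimated. The sketch below is not the authors' torque decomposition. It assumes the arm is stretched out horizontally, that joints 2, 4 and 6 are the pitch joints at the shoulder, elbow and wrist, that the lever arms follow from d2, d3 and d4 in Table 1, and that g = 9.81 m/s². Shell and actuator weights, which the paper also accounts for, are left out because their values are not given.

```python
G = 9.81           # gravitational acceleration in m/s^2 (assumed value)
PAYLOAD_KG = 2.5   # rated load from Sect. 4.2

# Link offsets d2, d3, d4 from Table 1, converted to metres.
D2, D3, D4 = 0.240, 0.200, 0.090

# Horizontal distance from each pitch joint to the payload (arm fully extended).
LEVER_ARMS_M = {
    "Joint2": D2 + D3 + D4,  # shoulder pitch
    "Joint4": D3 + D4,       # elbow pitch
    "Joint6": D4,            # wrist pitch
}

# Torque indicators of the same joints, copied from Table 2.
TORQUE_INDICATORS_NM = {"Joint2": 40.0, "Joint4": 20.0, "Joint6": 10.0}


def payload_torque(lever_arm_m, mass_kg=PAYLOAD_KG):
    """Static torque in N*m produced by a point mass at the given lever arm."""
    return mass_kg * G * lever_arm_m


if __name__ == "__main__":
    for joint, arm in LEVER_ARMS_M.items():
        torque = payload_torque(arm)
        margin = TORQUE_INDICATORS_NM[joint] - torque
        print(f"{joint}: payload torque {torque:.2f} N*m, "
              f"remaining for structure and dynamics {margin:.2f} N*m")
```

The payload term comes out well below each indicator, which leaves room for the shell weight, the actuator masses and dynamic loads that this sketch ignores.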
4.3 Design of Joint Structure
The joint is the key part of the robotic arm. Most robotic arm joints on the market are oriented to industrial collaboration needs and, to allow rapid development of the robotic arm, adopt a fully functional modular design in which the transmission, drive and control are all integrated in the joint. This leads to large volume and high weight, so the structure cannot achieve an anthropomorphic form. To achieve a lightweight robotic arm, the joint here adopts a compact design: the motor, transmission and sensor are integrated into the smallest modular unit, the joint shell and the arm link are designed as one part, and the driver is deployed on the arm link (Fig. 4). The integrated joint consists of the motor, the transmission mechanism and the feedback sensor, and, to achieve a humanoid form, its outer envelope diameter should not be greater than 120 mm. Table 2 shows that the output torques of the first and second joints are relatively large. If these two joints had the same configuration as the others, the motor and transmission would inevitably be larger, which could not meet the design requirements. In this design, different reduction ratios are used to achieve similar radial dimensions for the different joints. The first two joints use a two-stage reducer, and the remaining joints use single-stage reduction. The two-stage reducer is the combination of a gear reduction and a harmonic reducer, and the single-stage speed
Fig. 4 Integrated design of shell and arm
Table 3 Motor and reducer parameters
Number  Name    Power of motor (W)  Gearbox reduction ratio  Harmonic reduction ratio  Reduction ratio
1       Joint1  70                  5.6                      100                       1:560
2       Joint2  70                  5.6                      100                       1:560
3       Joint3  100                 –                        100                       1:101
4       Joint4  100                 –                        100                       1:101
5       Joint5  70                  –                        100                       1:101
6       Joint6  70                  –                        100                       1:101
7       Joint7  30                  –                        100                       1:101
reduction is the harmonic reducer. The parameters of the selected motors and reducers are shown in Table 3. High-precision joints require high-precision closed-loop control, so the choice of a high-precision position sensor is particularly important. There are two classic sensing configurations for high-precision joints. One is Hall sensor + incremental sensor + single-turn absolute sensor, in which the Hall sensor is used for motor commutation control, the incremental sensor for speed loop control, and the single-turn absolute sensor for position loop control. The other is Hall sensor + multi-turn absolute encoder, in which the Hall sensor is used for motor commutation control and the multi-turn absolute encoder is installed on the motor shaft, at the rear end of the motor, for dual-loop control of speed and position. To reduce the number of sensors, this design uses a compact multi-turn absolute encoder, the HEIDENHAIN EBI1135, as the sensing feedback element; it is an 18-bit sensor of small size and light weight. To retain the turn count when the robot arm is powered down, the multi-turn function requires an external coin cell power supply; according to the design parameters the maximum power consumption is 520 mW, which
Fig. 5 EBI1135 multiturn absolute encoder
Fig. 6 Composition diagram of two-stage reduction joint design
can last up to two months with a general coin cell battery and is easy to maintain (Fig. 5). The two joint structures designed are shown in Figs. 6 and 7, which are the minimum joint composition diagrams for two-stage reduction and one-stage reduction, respectively.
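The effect of mounting the encoder on the motor side can be illustrated with a short calculation of the smallest joint angle that one encoder count represents. This is a sketch under stated assumptions: the 18 bits are taken as the single-turn resolution, the encoder is taken to rotate with the motor shaft, and the total ratios come from Table 3. Backlash and the compliance of the harmonic drive are ignored, so the figures are theoretical resolutions, not achievable accuracy.

```python
ENCODER_BITS = 18  # resolution stated for the encoder in Sect. 4.3

# Total reduction ratios from Table 3 (motor revolutions per joint revolution).
TOTAL_RATIO = {
    "Joint1": 560, "Joint2": 560,
    "Joint3": 101, "Joint4": 101, "Joint5": 101, "Joint6": 101, "Joint7": 101,
}


def two_stage_ratio(gearbox_ratio, harmonic_ratio):
    """Overall ratio of a gearbox stage followed by a harmonic stage."""
    return gearbox_ratio * harmonic_ratio


def joint_resolution_deg(total_ratio, bits=ENCODER_BITS):
    """Joint-side angle in degrees that corresponds to one encoder count."""
    counts_per_motor_rev = 2 ** bits
    return 360.0 / counts_per_motor_rev / total_ratio


if __name__ == "__main__":
    # Consistency check of the two-stage entry in Table 3 (5.6 x 100).
    print("two-stage ratio:", two_stage_ratio(5.6, 100))
    for joint, ratio in TOTAL_RATIO.items():
        print(f"{joint}: {joint_resolution_deg(ratio):.3e} deg per count")
```

Because the reducer divides the motor-side count by the total ratio, the two-stage joints resolve the finest joint angles in this idealized picture.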
Fig. 7 One-stage reduction joint design composition diagram
5 Electrical Design of Humanoid Robot Arm
The humanoid robot arm is powered from a 24 V supply, a voltage that is safe for humans. To prevent interference between the control circuit and the power circuit, a dual-circuit design is used, and the supply for the multi-turn absolute encoder is stepped down from 24 V to +3.6 V by a voltage regulator. The communication bus of the robot arm is EtherCAT, through which the joint drivers are controlled in a distributed manner (Fig. 8).
Fig. 8 Electrical connection of humanoid robot arm
Fig. 9 Humanoid robot arm action sequence
Table 4 Repeat positioning test data
Number  Comparator  Repetitive positioning error value (mm)
1       1 VS 2      0.04
2       2 VS 3      0.05
3       3 VS 4      0.04
4       4 VS 5      0.06
5       5 VS 6      0.05
6 Validation Results
A humanoid dual-arm robot is formed by building a left arm and a right arm in a symmetrical arrangement. When a sequence of control commands is sent to the humanoid robot arm, it moves naturally and can imitate human actions. Video screenshots of the robot arm motion sequence are shown in Fig. 9. A target is installed at the end of the robotic arm, and the arm moves repeatedly along a predetermined trajectory. With the arm carrying a load, the repeat positioning accuracy of the arm end is measured with a laser tracker and is within ± 0.1 mm. The test data are shown in Table 4.
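The comparison between the measured data and the ± 0.1 mm requirement can be written out explicitly. The sketch below only summarizes the five values listed in Table 4; it is not the laser tracker processing used in the paper, and the variable names are illustrative.

```python
# Repetitive positioning error values from Table 4, in mm.
ERRORS_MM = [0.04, 0.05, 0.04, 0.06, 0.05]
REQUIREMENT_MM = 0.1  # required repeat positioning accuracy (+/-)


def summarize(errors):
    """Return the largest and the mean error of a list of measurements."""
    return max(errors), sum(errors) / len(errors)


if __name__ == "__main__":
    worst, mean = summarize(ERRORS_MM)
    print(f"worst case {worst:.2f} mm, mean {mean:.3f} mm, "
          f"requirement met: {worst <= REQUIREMENT_MM}")
```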
7 Conclusion
A seven-degree-of-freedom humanoid robotic arm was developed based on the mechanism of the human arm. The DH parameters were analyzed, and customized joints were designed for the specific requirements. Verification on the prototype shows that the humanoid robotic arm moves smoothly and reaches a load capacity of 2.5 kg, and that its repeat positioning error is at most ± 0.06 mm, meeting the requirement of ± 0.1 mm.
References
1. Fang, C., Ding, X.: Anthropomorphic arm kinematics oriented to movement primitive of human arm triangle. Jiqiren/Robot 34(3), 257 (2012)
2. Xia, J., Jiang, Z., Hao, Z., et al.: Dual fast marching tree algorithm for human-like motion planning of anthropomorphic arms with task constraints. IEEE/ASME Trans. Mechatron. 99, 1 (2020)
3. Zhao, J., Song, C., Du, B.: Configuration of humanoid robotic arm based on human engineering. J. Mech. Eng. 49(11), 16 (2013)
4. Amit, P., Rodolphe, G.: A mass-produced sociable humanoid robot: pepper: the first machine of its kind. IEEE Robot. Autom. Mag. 25, 40–48 (2018)
5. Tondu, B.: A seven-degrees-of-freedom robot-arm driven by pneumatic artificial muscles for humanoid robots. Int. J. Robot. Res. 24(4), 257–274 (2005)
A Hybrid Variable Impedance Force Control Method for Industrial Robots
Jian-jun Zhang, Hou-sheng Li, and Han Li
Abstract This paper proposes a hybrid variable impedance force control method for industrial robots to address two problems: soft contact force control and force overshoot. Firstly, to realise soft force tracking, a variable target stiffness coefficient is designed on the basis of hybrid impedance control, inspired by the way the human arm freely adjusts its stiffness to regulate the contact force, so that the stiffness of the manipulator is adjusted with the force error. A proof based on Lyapunov's direct method is also given. Secondly, considering the overshoot problem in the whole force tracking process, a fuzzy controller is designed to limit the force response speed and suppress force overshoot by adjusting the damping term appropriately, so as to enhance the safety of the whole system. The results show that the force tracking error converges to a small neighbourhood of zero for both step and sinusoidal force signals, and that force overshoot during tracking is suppressed.
Keywords Industrial robotics · Hybrid variable impedance control · Fuzzy control · Force overshoot suppression
J. Zhang · H. Li (B) · H. Li
School of Electrical Engineering and Automation, Henan Polytechnic University, Jiaozuo 454000, China
e-mail: [email protected]
URL: https://www.hpu.edu.cn/
J. Zhang
Henan International Joint Laboratory of Direct Drive and Control of Intelligent Equipment, Henan Polytechnic University, Jiaozuo 454000, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_41
1 Introduction
With the rapid development of robot application technology, robot position control is becoming more and more mature. When the robot does not need to make contact with an object, simple position control is enough. However, if the robot needs to make contact with objects, as in peg-in-hole assembly, surface treatment and grinding,
J. Zhang et al.
dexterous hand manipulation, and interaction with humans [1, 2], position control is not adequate. In recent years, robot compliance control has gradually become a research hotspot in this field, especially in industrial robot applications. Jiao proposed an adaptive hybrid impedance control method based on an analysis of human kinematics, which can improve the internal forces when grasping objects [7]. Jung designed a neural network-based impedance control and proposed that the damping term in the impedance control should be adjusted in response to changes in the external environment; for example, if the contact is stiffer, the damping term should be increased to suppress force overshoot during contact [8]. Lee devised a variable stiffness impedance control algorithm that designs the stiffness term based on the end force error of the manipulator, in essence letting the manipulator control its own stiffness freely, like a human arm, as a means to achieve force control [9]. Cao proposed a smooth adaptive hybrid impedance control to reduce force overshoot in industrial robot applications [10]. Although a large number of scholars have done extensive research on the human control of industrial machines, three issues still need to be addressed urgently [11–13]:
1. Maintaining a small force tracking error.
2. Avoiding excessive overshoot during force tracking.
3. Achieving fast force response.
In this paper, a hybrid variable impedance force control method is proposed. Firstly, to address the problem of soft force tracking in the contact process, a variable target stiffness coefficient is designed, inspired by the way the human arm freely adjusts its stiffness to regulate the contact force, so that the stiffness of the manipulator is adjusted with the force error to achieve soft force tracking.
Secondly, considering the overshoot problem in the whole force tracking process, a fuzzy controller is designed to limit the force response speed and suppress force overshoot by adjusting the damping term appropriately, thus enhancing the safety of the whole system.
2 Position-Based Impedance Control
The traditional position-based impedance control works as follows. An initial trajectory $x_c$ is set in free space to ensure that the end of the manipulator contacts the target object. When contact occurs, a contact force $F_e$ is generated at the end of the manipulator, and the impedance control outer loop uses $F_e$ to correct the initial trajectory $x_c$. The resulting corrected trajectory $x_d$ is tracked by the position control inner loop to achieve the set target impedance. The accuracy of the position control inner loop therefore directly affects how well the target impedance is realized. The position-based impedance control block diagram is shown in Fig. 1.
Fig. 1 Block diagram of position-based impedance control
The algorithm is derived from Eq. (1).
$$M_d \ddot{x} + B_d \dot{x} + K_d x = F_e \tag{1}$$
where $x = x_c - x_d$ is the change in displacement, $x_c$ is the initial trajectory, and $x_d$ is the corrected trajectory. $F_e$ is the force on the end of the manipulator in Cartesian space. $M_d$, $B_d$ and $K_d$ are the target inertia matrix, the target damping matrix and the target stiffness matrix, respectively. The target matrices are generally set as positive definite diagonal constant matrices to ensure the stability of the whole system. Since they are all positive definite diagonal matrices, the dimensions are decoupled, and each dimension can be treated as a one-dimensional case.
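The outer loop described by Eq. (1) can be sketched in one dimension. The code below is a minimal illustration, not the authors' simulation. It integrates the target impedance with a semi-implicit Euler step under a constant contact force and then forms the corrected trajectory from $x = x_c - x_d$. The step size, the simulation horizon and the numerical values ($M_d = 1$, $B_d = 8$, $K_d = 100$, $F_e = 0.5$ N, $x_c = 0.9$) are assumptions chosen for illustration.

```python
def simulate_impedance(m_d, b_d, k_d, f_e, dt=1e-3, t_end=5.0):
    """Integrate m_d*x'' + b_d*x' + k_d*x = f_e from rest and return the final x.

    Semi-implicit Euler: the velocity is updated first and then used for x.
    """
    x, v = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        acc = (f_e - b_d * v - k_d * x) / m_d
        v += acc * dt
        x += v * dt
    return x


def corrected_trajectory(x_c, x):
    """Corrected trajectory x_d obtained from the definition x = x_c - x_d."""
    return x_c - x


if __name__ == "__main__":
    # Hypothetical constant-parameter example; all numbers are illustrative.
    x_final = simulate_impedance(m_d=1.0, b_d=8.0, k_d=100.0, f_e=0.5)
    print("displacement correction:", x_final)
    print("corrected trajectory:", corrected_trajectory(0.9, x_final))
```

With constant parameters the displacement correction settles at $F_e / K_d$, so a stiffer target impedance yields a smaller trajectory correction for the same contact force.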
3 Design of Hybrid Variable Impedance Force Control Methods
In this paper, a hybrid variable impedance force control method is proposed. A selection matrix $S$ distinguishes the position control space from the force control space, combining the advantages of hybrid force/position control with those of impedance control. $S$ is designed as a diagonal matrix: when a dimension requires position control, the corresponding diagonal element of $S$ is 1; otherwise it is 0. This paper concentrates on the force control part in its discussion and design. On this basis, the position control inner loop of the system is assumed to be ideal, so the robot can track the corrected trajectory generated by the impedance controller well. Firstly, the variable impedance controller is designed with reference to Lee [9], where the target damping term and the stiffness term are the main focus of the design. Kronander [14] pointed out that
Fig. 2 Overall block diagram of hybrid impedance control
ideally the target inertia matrix should be designed to match the inertia matrix in the robot dynamics equations, but this approach adds a huge computational effort, and the design of the target inertia matrix is closely related to the stability of the system [14]. Therefore, in this paper, the target inertia matrix is still designed as a positive definite constant diagonal matrix, and the target damping and target stiffness matrices are designed as diagonal matrices to ensure that each dimension is decoupled. The overall block diagram of the hybrid impedance control is shown in Fig. 2.
3.1 Target Stiffness Matrix Design
The human arm can freely adjust its stiffness according to the force error in order to control the interaction force. For example, when the arm is in contact with an object and the person wants to exert more force on it, the stiffness of the arm is increased; conversely, the stiffness is reduced. Given this excellent characteristic of the human arm, this paper modifies the target impedance relationship as follows.
$$M_d \ddot{x} + B_d(t)\dot{x} + K_d(t)x = F_e \tag{2}$$
The adaptive stiffness term is designed based on the force error:
$$K_d(t) = \alpha k_0 x^{-1} \tag{3}$$
$$\alpha = k_p e_f + k_d \dot{e}_f \tag{4}$$
$$e_f = F_d - F_e \tag{5}$$
where $F_d$ is the desired force, and $k_p$ and $k_d$ are arbitrary positive numbers. The dynamic equation for the force error of the modified system can then be obtained as
$$e_f = F_d - F_e = F_d - M_d\ddot{x} - B_d(t)\dot{x} - K_d(t)x \tag{6}$$
Substituting Eqs. (3) and (4) into Eq. (6) gives
$$e_f = F_d - F_e = F_d - M_d\ddot{x} - B_d(t)\dot{x} - k_0 k_p e_f - k_0 k_d \dot{e}_f \tag{7}$$
Rearranging leads to
$$k_0 k_d \dot{e}_f + (k_0 k_p + 1)e_f = F_d - M_d\ddot{x} - B_d(t)\dot{x} \tag{8}$$
At steady state the derivative terms vanish, so the steady-state force error of the system is
$$e_{f,ss} = \frac{F_d}{k_0 k_p + 1} \tag{9}$$
From Eq. (9), it can be seen that the steady-state force error depends on the desired force $F_d$ and the design parameters $k_0$ and $k_p$. Choosing $k_0$ and $k_p$ large therefore makes the steady-state force error of the system converge to a small neighbourhood of zero, which is perfectly feasible in engineering terms.
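As a worked instance of the steady-state expression, the one-dimensional parameter values used later in the simulation section ($k_0 = 100$, $k_p = 50$, $F_d = 0.5$ N) can be inserted. This only evaluates the formula under those values; it is not a simulation result.

```latex
e_{f,ss} = \frac{F_d}{k_0 k_p + 1}
         = \frac{0.5}{100 \times 50 + 1}
         = \frac{0.5}{5001}
         \approx 1.0 \times 10^{-4}\ \mathrm{N}
```

Doubling either $k_0$ or $k_p$ roughly halves this value, which is the sense in which large gains push the error toward zero.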
3.2 Target Damping Matrix Design
The target damping term in this paper is designed to suppress the force overshoot and limit the speed of the force response by appropriately increasing the damping, so that the force tracking of the system stays as close as possible to an over-damped dynamic process; this further ensures the safety of the system. As mentioned above, the modified target impedance relationship is given in Eq. (2), and the target damping term is designed next.
$$B_d(t) = b_0 + b(t) \tag{10}$$
where $b_0$ is a positive number whose effect will be described in the proof in the next section. The damping correction $b(t)$ is designed with a fuzzy controller: the force error $e_f$ and its rate of change $\dot{e}_f$ are the inputs of the fuzzy controller, and $b(t)$ is its output. The fuzzy controller principle is shown in Fig. 3.
Fig. 3 Fuzzy control schematic
Some of the design rules for the fuzzy controller are as follows.
(1) When the force error $e_f$ is zero and the rate of change of the force error $\dot{e}_f$ is zero, the fuzzy controller output $b(t)$ should be zero, and no correction is made to the original damping term.
(2) When the force error $e_f$ is relatively small and its rate of change $\dot{e}_f$ is relatively large, the fuzzy controller output $b(t)$ should take a larger value to give an over-damped-like dynamic characteristic and to limit the speed of the system force response, ensuring system safety.
(3) When the force error $e_f$ is relatively large and its rate of change $\dot{e}_f$ is relatively small, the fuzzy controller output $b(t)$ should take a smaller value to accelerate the force response of the system, so that the contact force $F_e$ converges rapidly to the desired force $F_d$.
According to the above fuzzy rules, all input and output variables are divided into 7 fuzzy subsets, denoted NB, NM, NS, ZO, PS, PM and PB. The membership function of each fuzzy set is a Gaussian function. The traditional Mamdani algorithm is used for fuzzy reasoning, with the max-min composition operation applied to the fuzzy rules. The center of gravity method is used to defuzzify the result of the fuzzy inference. The fuzzy rules are shown in Table 1.
Table 1 Fuzzy control rules for b(t) (rows: $e_f$; columns: $\dot{e}_f$)
e_f   NB  NM  NS  ZO  PS  PM  PB
NB    PB  PB  PB  PB  PM  PS  ZO
NM    PB  PB  PB  PM  PS  ZO  NS
NS    PB  PB  PM  PS  ZO  NS  NM
ZO    PB  PM  PS  ZO  NS  NM  NB
PS    PM  PS  ZO  NS  NM  NB  NB
PM    PS  ZO  NS  NM  NB  NB  NB
PB    ZO  NS  NM  NB  NB  NB  NB
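The inference chain described above (Gaussian membership functions, Mamdani max-min reasoning, center of gravity defuzzification) can be sketched with the rule table as data. This is an illustrative sketch, not the authors' controller. It assumes that the seven set centres are evenly spaced on [−10, 10] for both inputs and the output, that all sets share one Gaussian width `SIGMA`, and that the output universe is sampled on a uniform grid; none of these details are given in the paper.

```python
import math

LABELS = ["NB", "NM", "NS", "ZO", "PS", "PM", "PB"]

# Rule table: rows are the force error e_f, columns are its rate of change.
RULES = [
    ["PB", "PB", "PB", "PB", "PM", "PS", "ZO"],
    ["PB", "PB", "PB", "PM", "PS", "ZO", "NS"],
    ["PB", "PB", "PM", "PS", "ZO", "NS", "NM"],
    ["PB", "PM", "PS", "ZO", "NS", "NM", "NB"],
    ["PM", "PS", "ZO", "NS", "NM", "NB", "NB"],
    ["PS", "ZO", "NS", "NM", "NB", "NB", "NB"],
    ["ZO", "NS", "NM", "NB", "NB", "NB", "NB"],
]

LOW, HIGH = -10.0, 10.0  # universe of discourse (assumed for inputs and output)
CENTERS = [LOW + (HIGH - LOW) * i / 6 for i in range(7)]  # assumed even spacing
SIGMA = 1.5  # hypothetical common width of the Gaussian sets


def membership(value, center, sigma=SIGMA):
    """Gaussian membership grade of value in the set centred at center."""
    return math.exp(-((value - center) ** 2) / (2.0 * sigma ** 2))


def fuzzy_damping(e_f, de_f, samples=401):
    """Mamdani max-min inference with centroid defuzzification for b(t)."""
    e_f = max(LOW, min(HIGH, e_f))      # clamp inputs to the universe
    de_f = max(LOW, min(HIGH, de_f))
    mu_e = [membership(e_f, c) for c in CENTERS]
    mu_de = [membership(de_f, c) for c in CENTERS]
    # Firing strength per output set: min for AND, max over rules sharing a set.
    strength = [0.0] * 7
    for i in range(7):
        for j in range(7):
            k = LABELS.index(RULES[i][j])
            strength[k] = max(strength[k], min(mu_e[i], mu_de[j]))
    # Centroid of the aggregated (clipped) output sets on a uniform grid.
    num = den = 0.0
    for n in range(samples):
        y = LOW + (HIGH - LOW) * n / (samples - 1)
        mu = max(min(strength[k], membership(y, CENTERS[k])) for k in range(7))
        num += y * mu
        den += mu
    return num / den


if __name__ == "__main__":
    for e, de in [(0.0, 0.0), (-10.0, -10.0), (10.0, 10.0)]:
        print(e, de, fuzzy_damping(e, de))
```

Written with levels NB = −3 to PB = 3, every entry of the table equals the negated sum of the two input levels saturated to [−3, 3], so the rule base could also be generated from that expression instead of being typed in.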
4 Proof of Stability
It is well known that in impedance control the individual target matrices are often designed as positive definite constant diagonal matrices, which ensures the stability of the impedance controller, since Eq. (2) is then obviously a stable ordinary differential equation. If the target impedance parameters are time-varying, however, the stability analysis becomes crucial. A very common way to analyse the force between the end of the manipulator and the contacted object is to treat it as a linear spring model, in which the contact force $F_e$ can be expressed as
$$F_e = \begin{cases} k_e (x - x_e), & x \ge x_e \\ 0, & x < x_e \end{cases} \tag{11}$$
where $x_e$ is the position of the contact surface and $k_e$ is its stiffness coefficient. When $x < x_e$, the end of the manipulator is not touching the contact surface and the contact force is 0. When $x \ge x_e$, the manipulator is in contact, and the force is the product of the deformation and the stiffness. A simple model of the contact between the end of the manipulator and the environment is shown in Fig. 4.
Fig. 4 Simple model of the manipulator with the environment
The stability of the designed variable impedance controller is considered next. For simplicity, the following analysis is carried out in the one-dimensional case. Consider the candidate Lyapunov function
$$V = \frac{1}{2} m \dot{x}^2 + \frac{1}{2} k_e (x - x_e)^2 + \frac{1}{2} k_n e_f^2 \tag{12}$$
where Eq. (12) can be regarded as the sum of the kinetic and potential energies of the original system, and $k_n$ is a nominal stiffness value chosen by the designer. Substituting Eqs. (3), (4) and (10) into Eq. (2) yields
$$m\ddot{x} = -F_d - b_d(t)\dot{x} + (k_0 k_p + 1) e_f + k_0 k_d \dot{e}_f \tag{13}$$
Assuming that the desired force is constant or slowly varying, then
$$e_f = F_d - F_e = F_d - k_e (x - x_e) \tag{14}$$
$$\dot{e}_f = \dot{F}_d - \dot{F}_e = -k_e (\dot{x} - \dot{x}_e) = -k_e \dot{x} \tag{15}$$
Differentiating Eq. (12) with respect to time yields
$$\dot{V} = \frac{1}{2}\dot{m}\dot{x}^2 + k_e (x - x_e)\dot{x} + k_n e_f \dot{e}_f + m\ddot{x}\dot{x} \tag{16}$$
Substituting Eqs. (13), (14) and (15) into Eq. (16) gives
$$\dot{V} = \frac{1}{2}\dot{m}\dot{x}^2 + (k_0 k_p - k_n k_e) e_f \dot{x} - b_d(t)\dot{x}^2 - k_0 k_d k_e \dot{x}^2 \tag{17}$$
If the nominal stiffness is designed as $k_n = k_0 k_p / k_e$, then Eq. (17) becomes
$$\dot{V} = \frac{1}{2}\dot{m}\dot{x}^2 - b_d(t)\dot{x}^2 - k_0 k_d k_e \dot{x}^2 = -\left(b_d(t) + k_0 k_d k_e - \frac{1}{2}\dot{m}\right)\dot{x}^2 \tag{18}$$
Substituting the designed $b_d(t) = b_0 + b(t)$ into Eq. (18) yields
$$\dot{V} = -\left(b_0 + b(t) + k_0 k_d k_e - \frac{1}{2}\dot{m}\right)\dot{x}^2 \tag{19}$$
Since the target inertia matrix is still designed as a positive definite diagonal constant matrix in this paper, $\dot{m} = 0$, and Eq. (19) becomes
$$\dot{V} = -\left(b_0 + b(t) + k_0 k_d k_e\right)\dot{x}^2 \tag{20}$$
Here $k_0$ and $k_d$ are arbitrarily set positive numbers, and $k_e$ is the stiffness coefficient of the contacted object, which obviously cannot be negative, so choosing a suitable positive number $b_0$ is enough to guarantee $\dot{V} \le 0$. Furthermore,
$$\ddot{V} = -2\left(b_0 + b(t) + k_0 k_d k_e - \frac{1}{2}\dot{m}\right)\dot{x}\ddot{x} - \frac{1}{2}\left(2\dot{b}(t) - \ddot{m}\right)\dot{x}^2 \tag{21}$$
where $2\dot{b}(t) - \ddot{m} = 2\dot{b}(t)$ is the derivative of the fuzzy controller output, which is obviously bounded. Therefore
$$\ddot{V} = -2\left(b_0 + b(t) + k_0 k_d k_e - \frac{1}{2}\dot{m}\right)\dot{x}\ddot{x} - \frac{1}{2}\left(2\dot{b}(t) - \ddot{m}\right)\dot{x}^2 \le -2\left(k_0 k_d k_e + b_0 + b(t)\right)\dot{x}\ddot{x} \tag{22}$$
so $\ddot{V}$ is bounded and $\dot{V}$ is uniformly continuous, and the original system is asymptotically stable by Barbalat's lemma.
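The step from Eq. (16) to Eq. (17) is stated without intermediate terms. One possible expansion is sketched below. It assumes that Eq. (13) is read as $m\ddot{x} = -F_d - b_d(t)\dot{x} + (k_0 k_p + 1)e_f + k_0 k_d \dot{e}_f$, and it uses $k_e(x - x_e) = F_d - e_f$ from Eq. (14) and $\dot{e}_f = -k_e\dot{x}$ from Eq. (15).

```latex
\begin{aligned}
k_e (x - x_e)\,\dot{x} &= (F_d - e_f)\,\dot{x}, \\
k_n e_f \dot{e}_f &= -k_n k_e\, e_f \dot{x}, \\
m\ddot{x}\,\dot{x} &= -F_d \dot{x} - b_d(t)\dot{x}^2 + (k_0 k_p + 1)\, e_f \dot{x} - k_0 k_d k_e \dot{x}^2 .
\end{aligned}
```

Adding the three lines, the $F_d\dot{x}$ terms cancel and the $e_f\dot{x}$ terms combine to $(k_0 k_p - k_n k_e)e_f\dot{x}$, which together with $\frac{1}{2}\dot{m}\dot{x}^2$ gives the right-hand side of Eq. (17).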
Fig. 5 Step signal force tracking: (a) force tracking curve; (b) force error curve; (c) variable stiffness term; (d) variable damping term
5 Simulation Validation
The proposed system has been extensively simulated, and specific simulation results are presented next. The simulation uses a planar two-degree-of-freedom robot with physical parameters $m_1 = 1$ kg, $m_2 = 1$ kg and $l_1 = l_2 = 1$ m. The contacted object is assumed to be a virtual wall with position $x_e = 0.8$ and stiffness $k_e = \mathrm{diag}(5, 0)$, so the robot end is subjected to force only along the x axis. The variable impedance controller uses $M_d = \mathrm{diag}(1, 1)$, $b_0 = \mathrm{diag}(8, 8)$, $k_0 = \mathrm{diag}(100, 100)$, $k_p = \mathrm{diag}(50, 50)$ and $k_d = \mathrm{diag}(10, 10)$, and the universe of discourse of the fuzzy controller is [−10, 10]. The step-signal and sine-signal force tracking of the system are shown in Figs. 5 and 6.
(1) Step signal force tracking: $F_d = 0.5$ N
(2) Sine signal force tracking: $F_d = 0.5 + 0.3\sin(2t)$ N
From the simulation results, it can be seen that the hybrid variable impedance controller proposed in this paper has good tracking effect for both step signal and
Fig. 6 Sine signal force tracking: (a) force tracking curve; (b) force error curve; (c) variable stiffness term; (d) variable damping term
sinusoidal signal. After processing the step signal data, the steady-state force error is found to reach $5.4709 \times 10^{-4}$ N; for the sinusoidal signal the tracking error converges to a small neighbourhood of zero. As for the variable impedance parameters $B_d(t)$ and $K_d(t)$, the variable stiffness term $K_d(t)$ generates greater stiffness when the force error is large, so that the actual force at the end tracks the desired force. The variable damping term $B_d(t)$ generates small damping to accelerate the convergence of the force signal when the initial force error is large, and large damping when the actual force signal is about to reach the desired force signal, so that the system does not generate large overshoot. This is consistent with the design intentions described above. Comparison results with other impedance control algorithms (the conventional over-damped impedance control algorithm and the model-reference adaptive PID impedance control algorithm) are also given in this paper, with the desired force $F_d = 0.5$ N. The specific simulation results are shown in Fig. 7.
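The virtual wall used in the simulation follows the linear spring contact model of Eq. (11). The sketch below evaluates that model along the x axis with the wall parameters quoted above ($x_e = 0.8$, first diagonal element of $k_e$ equal to 5) and computes how far the robot end must press into the wall to produce a given force. The function names are illustrative and are not taken from the paper.

```python
WALL_POSITION = 0.8   # x_e from the simulation setup
WALL_STIFFNESS = 5.0  # first diagonal element of k_e (x direction)


def contact_force(x, x_e=WALL_POSITION, k_e=WALL_STIFFNESS):
    """Linear spring contact force: k_e*(x - x_e) in contact, 0 otherwise."""
    return k_e * (x - x_e) if x >= x_e else 0.0


def penetration_for_force(f_d, k_e=WALL_STIFFNESS):
    """Wall deformation needed for the spring model to return the force f_d."""
    return f_d / k_e


if __name__ == "__main__":
    depth = penetration_for_force(0.5)  # step reference F_d = 0.5 N
    print("required deformation:", depth)
    print("force at that position:", contact_force(WALL_POSITION + depth))
    print("force before contact:", contact_force(0.7))
```

For the step reference this model needs a wall deformation of $F_d / k_e$, so a softer wall requires a larger end displacement for the same desired force.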
Fig. 7 Comparison between the algorithm in this paper and other algorithms
As can be seen from the comparison chart, the traditional impedance control algorithm (overdamping) responds slowly to force and requires some a priori knowledge. The model-reference adaptive PID impedance control algorithm has a good force response speed, but it inevitably produces a large overshoot, which is not favored in practical industrial applications. The variable impedance control proposed in this paper, thanks to its variable damping and stiffness terms, outperforms the traditional impedance control algorithm in the initial force rise phase and does not produce overshoot when the force rise completes, which is safer and more reliable than the model-reference adaptive PID impedance control algorithm. Moreover, its entire force tracking process takes less time than the other two algorithms. On the whole, the variable impedance control algorithm proposed in this paper appears to be safer and more effective than the other two algorithms.
6 Conclusion
In this paper, a hybrid variable impedance controller is designed to address the problems of soft interaction force control and force overshoot that are common in traditional industrial robots. Firstly, the target stiffness coefficient is designed with the human arm as the ideal model, so that the robot stiffness is adaptively adjusted according to the desired force and the actual contact force, and the robot end contact force converges to the desired force. Secondly, the target variable damping coefficient is designed for the force overshoot and force response speed problems in the force tracking process, and a fuzzy controller is used to adjust the system damping term appropriately to suppress force overshoot and limit the force response speed. Finally, a stability analysis is presented and the effectiveness of the designed controller is verified by simulation.
References
1. Zhai, A., et al.: Adaptive neural synchronized impedance control for cooperative manipulators processing under uncertain environments. Robot. Comput.-Integr. Manuf. 75, 102291 (2022)
2. Mustalahti, P., Mattila, J.: Position-based impedance control design for a hydraulically actuated series elastic actuator. Energies 15(7), 2503 (2022)
3. Bai, K., et al.: Spherical wrist with hybrid motion-impedance control for enhanced robotic manipulations. IEEE Trans. Robot. Publ. IEEE Robot. Autom. Soc. 2, 38 (2022)
4. Raibert, M.H., Craig, J.J.: Hybrid position/force control of manipulators. ASME J. Dyn. Syst. Measure. Control 102(2), 126–133 (1981)
5. Neville, H.: Impedance control: an approach to manipulation: Part I-theory. J. Dyn. Syst. Measure. Control 107 (1985)
6. Song, P., Yu, Y., Zhang, X.: A tutorial survey and comparison of impedance control on robotic manipulation. Robotica 37(5), 1–36 (2019)
7. Jiao, C., et al.: Adaptive hybrid impedance control for dual-arm cooperative manipulation with object uncertainties. Automatica 140 (2022)
8. Jung, S., Hsia, T.C.: Neural network impedance force control of robot manipulator. Industr. Electron. IEEE Trans. 45(3), 451–461 (1998)
9. Lee, K., Buss, M.: Force tracking impedance control with variable target stiffness. IFAC Proc. Vol. 41(2), 6751–6756 (2008)
10. Cao, H., et al.: Smooth adaptive hybrid impedance control for robotic contact force tracking in dynamic environments. Industr. Robot Int. J. Robot. Res. Appl. 47(2), 231–242 (2020)
11. Xu, K., et al.: Adaptive impedance control with variable target stiffness for wheel-legged robot on complex unknown terrain. Mechatronics 69, 102388 (2020)
12. Chen, G., et al.: Adaptive variable parameter impedance control for apple harvesting robot compliant picking. Complexity 2020 (2020)
13. Xu, Z., et al.: Dynamic neural networks based adaptive optimal impedance control for redundant manipulators under physical constraints. Neurocomputing, 471 (2022)
14. Kronander, K., Billard, A.: Stability considerations for variable impedance control. IEEE Trans. Robot. 32(5), 1298–1305 (2016)
Flexo-Coupled Drive Dexterous Finger Differential Motion Control Junning Zhang, Shuxuan Liu, Yajing Guo, Zhiwen Luo, and Pengfei Li
Abstract To address the flexure-coupled drive control of the fingers of a multi-degree-of-freedom, fully driven humanoid dexterous hand with large gripping force, this paper proposes a flexure-coupled drive differential motion control method for the dexterous finger. A new design of the knotting point and the flexible cable winding path reduces the coupling between joints, increases the working space of the finger, and ensures the preload of the flexible cable. The motion model of the finger with flexible cable coupling transmission is established by a simplified geometric method, and the relationship between the finger joint angles and the displacement of the flexible cable is analyzed, which greatly reduces the computational effort of motion control. A differential motion control method for the finger with flexible cable coupling drive is proposed to realize the coupled differential drive control of N finger joints by N + 1 drive motors. The static and card operability analysis of the dexterous finger is carried out to ensure the good operability of the dexterous finger. Experiments show that the proposed differential motion control method controls the dexterous finger accurately, finely and in real time, with a control accuracy within 1°.
Keywords Flex-rope coupling · Rope-driven kinematics · Differential motion · Maneuverability analysis
J. Zhang (B) Harbin Institute of Technology, Harbin 150096, China e-mail: [email protected] J. Zhang · S. Liu · Y. Guo · Z. Luo · P. Li Beijing Research Institute Precision Mechatronics and Controls, Beijing 100076, China J. Zhang · S. Liu · Y. Guo · Z. Luo Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_42
1 Introduction
As a universal end tool for robots, the humanoid dexterous hand [1] is the most flexible and delicate end effector and is essential in special-purpose robots such as on-orbit space service robots [2, 3], aerospace intelligent manufacturing robots, and explosive ordnance disposal robots. The flexible cable-driven dexterous hand [4] greatly reduces the size and weight of the hand by driving it remotely [5], but it also suffers from mutual coupling of joint motions, cable tension constraints, difficult kinematic analysis, discontinuous motion control, and poor manipulability. To address these problems, [6, 7] improve the structure of the flexible cable-driven dexterous hand [8, 9], [10, 11] construct a kinematic model of the dexterous hand [12], [13, 14] design intelligent control methods to plan finger motions, and [15, 16] propose a control strategy for a single finger [17]. This paper focuses on the differential motion control of a flexible-cable coupled dexterous finger and on the manipulability of the finger.
Because of the size of the actuators, it is difficult to use them directly at the joints to control joint torque, so dexterous hands commonly use tendon drives to transmit power from the actuator to the appropriate joint. Tendons are flexible and lightweight, but they also complicate the kinematic relationships of the fingers. The tendon-actuated finger considered here uses four inelastic tendons to control three active finger joints and one passive joint. Each tendon is an inextensible wire rope coupled to a motor, and the tendons are not allowed to be coupled to each other.
The lateral swing joint between the base of the hand and the finger is defined as joint 1; the bending joint between the base of the hand and the root knuckle is joint 2; the bending joint between the root knuckle and the middle knuckle is joint 3; and the bending joint between the middle knuckle and the tip knuckle is joint 4. Joints 1, 2, and 3 are the active joints and the main control objects, while joint 4 is a passive joint coupled to joint 3 through a four-link mechanism. In simple terms, joint 2 is controlled by tendons 1 and 2, joint 3 is controlled by tendons 3 and 4, and joint 1 is controlled by all four tendons, so there is a coupling between the group of tendons and the group of finger joints. In addition, although each tendon is attached to a single link, pulling it may exert forces on several joints, because the tension acts along the entire length of the tendon and is transmitted to every part that supports it. This is a second coupling between the tendons and the finger joints. Without a proper tendon transmission path, this coupling is difficult to eliminate and may also reduce the working space of the dexterous hand.
2 Selection of Tendon Transmission Path
The tendon transmission path of the dexterous finger is selected according to three principles. First, pulling a tendon should exert as little force as possible on the other joints, so that the coupling between joints is reduced. Second, the finger should have as large a working space as possible. Third, the tendon tension should stay within a certain range, so that the wire rope neither goes slack, which would degrade accuracy, nor is pulled off.
2.1 Selection of the Knotting Point
The first step in selecting the tendon transmission path is to determine the location of the knotting point. The problem is simplified as shown in Fig. 1. Let the knotting point be A, with the wire rope leaving A tangent to the pulley at point B. The distance from A to the pulley center o is $m$, and the pulley radius is $r$. When the knotting point rotates with the pulley through an angle $\theta$, it reaches point C, and the wire rope is then tangent to the pulley at point D. The rope length is
$$h = l + \sqrt{m^2 - r^2} - r\theta \tag{1}$$
In Eq. (1), $h$ is the current wire rope length, $l$ is the basic displacement length of the wire rope at $\theta = 0$, $m$ is the distance from the knotting point to the pulley center, $r$ is the pulley radius, and $\theta$ is the rotation angle. Equation (1) shows that $h$ decreases monotonically with $\theta$: $h$ is largest at $\theta = 0$ and smallest when $\theta$ reaches its maximum. For $\theta$ to reach as large a value as possible when $h$ is at its minimum, $\sqrt{m^2 - r^2}$ should be as small as possible. When $\sqrt{m^2 - r^2} = 0$, that is, when $m = r$, $\theta$ can take its maximum value and the working space of the finger is largest. The working space is therefore largest when point A lies on the circle o, so the knotting point should be on the pulley. Because it is difficult to knot two wire ropes on the pulley itself, the knotting point is instead placed on the connecting rod. After being knotted there, the wire ropes are crossed around a column at a crossover point on the pulley, which achieves the same effect as fixing the knotting point on the pulley. Since there is no relative displacement between the knotting point and the crossover point when the joint rotates, the wire rope between these two points is not displaced by the joint rotation.
Fig. 1 Knotting point selection diagram
2.2 Selection of Transmission Path
The wire ropes of tendons 1 and 2 are knotted at the knotting point on the front link of joint 2 and then led to the crossover point on the joint 2 pulley, where they cross around the column. The two ropes are then wound around the joint 2 pulley, led to the guide sleeve on the base of the dexterous hand, passed through the sleeve, and finally connected to the motors. The transmission path of tendons 1 and 2 is shown in Fig. 2a. Since the dexterous hand imitates the structure of a human hand, $\theta_2 \ge 0$. Pulling tendon 1 or tendon 2 produces a torque on both joint 2 and joint 1.
The wire ropes of tendons 3 and 4 are knotted at the knotting point on the front link of joint 3 and then led to the crossover point on the joint 3 pulley, where they cross around the column. The two ropes are then wound around the joint 3 pulley and passed through the sleeve on the front link of joint 3. From there, the wire rope of tendon 3 is led to the guide sleeve on the base of the dexterous hand, passed through the sleeve, and connected to its motor. The wire rope of tendon 4 is led to the joint 2 pulley, wound around it, led to the guide sleeve on the base, passed through the sleeve, and connected to its motor. The transmission path of tendons 3 and 4 is shown in Fig. 2b. Since the dexterous hand imitates the structure of a human hand, $\theta_3 \ge 0$. Pulling tendon 3 or tendon 4 produces a torque on joint 3, joint 2, and joint 1.
Fig. 2 Tendon transmission path: (a) schematic diagram of the transmission path of tendons 1 and 2; (b) schematic diagram of the transmission path of tendons 3 and 4
3 Tendon-Driven Coupled Finger Differential Motion Control
3.1 Tendon-Driven Coupled Finger Kinematics
The displacement function $h_i$ represents the working state of each tendon and relates the displacement of the tendon end to the rotation angles of the finger joints.
For joint 1, when tendons 3 and 4 are pulled a certain distance, tendons 1 and 2 move in the opposite direction and the finger deflects around the axis of joint 1, forming an angle $\theta_1$ as shown in Fig. 3a. The positive direction of $\theta_1$ is defined as the direction of rotation of joint 1 when tendons 3 and 4 are pulled. The displacements $h_{11}$ and $h_{21}$ of tendons 1 and 2 produced by a rotation $\theta_1$ of joint 1 are
$$h_{11} = R_1 \tan\theta_1 \tag{2}$$
$$h_{21} = R_1 \tan\theta_1 \tag{3}$$
and the displacements $h_{31}$ and $h_{41}$ of tendons 3 and 4 are
$$h_{31} = -R_1 \tan\theta_1 \tag{4}$$
$$h_{41} = -R_1 \tan\theta_1 \tag{5}$$
In Eqs. (2)–(5), $R_1$ is the distance from the axis of joint 1 to the base sleeve of the dexterous finger.
For joint 2, when tendon 1 is pulled a certain distance, tendon 2 moves in the opposite direction and the finger bends around joint 2, forming an angle $\theta_2$ as shown in Fig. 3b. The positive direction of $\theta_2$ is defined as the direction of rotation of joint 2 when tendon 1 is pulled. The displacements $h_{12}$ and $h_{22}$ of tendons 1 and 2 produced by a rotation $\theta_2$ of joint 2 are
$$h_{12} = -R_2\,\theta_2 \tag{6}$$
$$h_{22} = R_2\,\theta_2 \tag{7}$$
In Eqs. (6) and (7), $R_2$ is the radius of the joint 2 pulley.
Fig. 3 Relationship between joint and tendon displacement: (a) joint 1; (b) joint 2; (c) joint 3
For joint 3, when tendon 3 is pulled a certain distance, tendon 4 moves in the opposite direction and the finger bends around joint 3, forming an angle $\theta_3$ as shown in Fig. 3c. The positive direction of $\theta_3$ is defined as the direction of rotation of joint 3 when tendon 3 is pulled. The displacement function of tendon 3 is no longer a simple linear function but a more complex nonlinear one. The rope length $h_{32}$ of tendon 3 from the base sleeve of the dexterous finger to the sleeve on the front link of joint 2 can be found geometrically: connecting the front end of the base sleeve and the end of the sleeve on the front link of joint 2 to the axis of joint 2 forms a triangle, and the law of cosines gives
$$h_{32} = \sqrt{a^2 + c^2 + b^2 + d^2 + 2\sqrt{a^2 + c^2}\,\sqrt{b^2 + d^2}\,\cos\!\left(\arctan\frac{a}{c} + \arctan\frac{b}{d} + \theta_2\right)} \tag{8}$$
In Eq. (8), $a$ is the vertical distance from the centerline of the dexterous finger base to the sleeve, $b$ is the vertical distance from the centerline of the front link of joint 2 to the sleeve, $c$ is the horizontal distance from the front of the base sleeve to the axis of joint 2, and $d$ is the horizontal distance from the end of the sleeve on the front link of joint 2 to the axis of joint 2. The displacements $h_{33}$ and $h_{43}$ of tendons 3 and 4 produced by a rotation $\theta_3$ of joint 3 are
$$h_{33} = -R_3\,\theta_3 + h_{32} - c - d \tag{9}$$
$$h_{43} = R_2\,\theta_2 + R_3\,\theta_3 \tag{10}$$
In Eqs. (9) and (10), $R_3$ is the radius of the joint 3 pulley. In summary, the displacement functions of tendons 1 to 4 are
$$h_1 = l_1 - R_2\,\theta_2 + R_1\tan\theta_1 \tag{11}$$
$$h_2 = l_2 + R_2\,\theta_2 + R_1\tan\theta_1 \tag{12}$$
$$h_3 = l_3 - R_3\,\theta_3 + h_{32} - c - d - R_1\tan\theta_1 \tag{13}$$
$$h_4 = l_4 + R_2\,\theta_2 + R_3\,\theta_3 - R_1\tan\theta_1 \tag{14}$$
In Eqs. (11)–(14), $l_i$ is the basic displacement of tendon $i$ when $\theta_i = 0$.
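The displacement functions of Eqs. (8) and (11)–(14) can be evaluated directly. The following Python sketch is only an illustration: the function names, the geometry values in `GEOM`, and the basic lengths in `REST` are hypothetical and are not taken from the paper.

```python
import math


def sleeve_span(theta2, a, b, c, d):
    """Rope length h32 of Eq. (8) between the two sleeves around joint 2."""
    p = math.hypot(a, c)  # base sleeve front to joint 2 axis
    q = math.hypot(b, d)  # link sleeve end to joint 2 axis
    psi = theta2 + math.atan2(a, c) + math.atan2(b, d)
    return math.sqrt(p * p + q * q + 2.0 * p * q * math.cos(psi))


def tendon_displacements(theta, l, R1, R2, R3, a, b, c, d):
    """Evaluate h1..h4 of Eqs. (11)-(14); angles in radians."""
    t1, t2, t3 = theta
    swing = R1 * math.tan(t1)  # joint 1 contribution, Eqs. (2)-(5)
    h32 = sleeve_span(t2, a, b, c, d)
    h1 = l[0] - R2 * t2 + swing
    h2 = l[1] + R2 * t2 + swing
    h3 = l[2] - R3 * t3 + h32 - c - d - swing
    h4 = l[3] + R2 * t2 + R3 * t3 - swing
    return (h1, h2, h3, h4)


# Hypothetical geometry in millimetres (not taken from the paper).
GEOM = dict(R1=20.0, R2=8.0, R3=6.0, a=4.0, b=4.0, c=10.0, d=12.0)
REST = (50.0, 50.0, 80.0, 80.0)  # assumed basic lengths l1..l4

if __name__ == "__main__":
    print(tendon_displacements((0.0, 0.0, 0.0), REST, **GEOM))
    print(tendon_displacements((0.1, 0.5, 0.3), REST, **GEOM))
```

With all joint angles at zero, Eq. (8) reduces to $h_{32} = \sqrt{(c + d)^2 + (a - b)^2}$, so $h_3$ returns exactly to $l_3$ only when $a = b$; the example geometry is chosen that way.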
3.2 Tendon-Driven Coupled Finger Differential Kinematics
The differential motion of a robot is an infinitesimal motion of the mechanism, which can be used to derive the velocity relationships between its components: if the motion is measured or computed over a small time interval, the velocity relationship is obtained. In practical tasks, a robotic dexterous hand must perform not only stable grasping but also dexterous manipulation. If a grasped object is not in the best operating position relative to the hand, the simplest remedy is to release it and grasp it again. This is easy to do, but it increases the number of grasps, lengthens the adjustment time, and reduces efficiency. The human hand, by contrast, can change the position of an object within the hand by coordinating the fingers while maintaining a stable grip. A robotic dexterous hand must therefore be able to achieve dexterous manipulation through finger planning. During coordinated finger motion, however, the contact point between the fingertip and the object changes continuously. To achieve the desired speed and trajectory of the object, the fingertip velocity must be controlled precisely in addition to the fingertip position. It is therefore necessary to compute the velocity of each finger joint at any moment and, by solving the Jacobian matrix, to establish the relationship between the rope pull and the differential motion of the joints. Differentiating $h_1$, $h_2$, $h_3$, and $h_4$ gives
$$dh_1 = \frac{\partial h_1}{\partial\theta_1}d\theta_1 + \frac{\partial h_1}{\partial\theta_2}d\theta_2 + \frac{\partial h_1}{\partial\theta_3}d\theta_3 = R_1\left(\tan^2\theta_1 + 1\right)d\theta_1 - R_2\,d\theta_2 \tag{15}$$
$$dh_2 = \frac{\partial h_2}{\partial\theta_1}d\theta_1 + \frac{\partial h_2}{\partial\theta_2}d\theta_2 + \frac{\partial h_2}{\partial\theta_3}d\theta_3 = R_2\,d\theta_2 + R_1\left(\tan^2\theta_1 + 1\right)d\theta_1 \tag{16}$$
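$$dh_3 = \frac{\partial h_3}{\partial\theta_1}d\theta_1 + \frac{\partial h_3}{\partial\theta_2}d\theta_2 + \frac{\partial h_3}{\partial\theta_3}d\theta_3 = -\frac{\sqrt{a^2 + c^2}\,\sqrt{b^2 + d^2}}{h_{32}}\sin\!\left(\theta_2 + \arctan\frac{a}{c} + \arctan\frac{b}{d}\right)d\theta_2 - R_1\left(\tan^2\theta_1 + 1\right)d\theta_1 - R_3\,d\theta_3 \tag{17}$$
$$dh_4 = \frac{\partial h_4}{\partial\theta_1}d\theta_1 + \frac{\partial h_4}{\partial\theta_2}d\theta_2 + \frac{\partial h_4}{\partial\theta_3}d\theta_3 = R_2\,d\theta_2 + R_3\,d\theta_3 - R_1\left(\tan^2\theta_1 + 1\right)d\theta_1 \tag{18}$$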
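Thus, using the recursive algorithm, the Jacobian matrix of the finger and the differential equation of motion of the robot are obtained as
$$\begin{bmatrix} dh_1 \\ dh_2 \\ dh_3 \\ dh_4 \end{bmatrix} = [J(\theta)] \begin{bmatrix} d\theta_1 \\ d\theta_2 \\ d\theta_3 \end{bmatrix} \tag{19}$$
The finger Jacobian matrix $[J(\theta)]$ is
$$J(\theta) = \begin{bmatrix} R_1\left(\tan^2\theta_1 + 1\right) & -R_2 & 0 \\ R_1\left(\tan^2\theta_1 + 1\right) & R_2 & 0 \\ -R_1\left(\tan^2\theta_1 + 1\right) & A & -R_3 \\ -R_1\left(\tan^2\theta_1 + 1\right) & R_2 & R_3 \end{bmatrix}, \qquad A = \frac{\partial h_{32}}{\partial\theta_2} = -\frac{\sqrt{a^2 + c^2}\,\sqrt{b^2 + d^2}}{h_{32}}\sin\!\left(\theta_2 + \arctan\frac{a}{c} + \arctan\frac{b}{d}\right) \tag{20}$$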
The pseudo-inverse of the finger Jacobian matrix, denoted $[J(h)]$, is a 3 × 4 matrix and is not expanded here. With it, the differential equation of motion of the finger becomes
$$\begin{bmatrix} d\theta_1 \\ d\theta_2 \\ d\theta_3 \end{bmatrix} = [J(h)] \begin{bmatrix} dh_1 \\ dh_2 \\ dh_3 \\ dh_4 \end{bmatrix} \tag{21}$$
Therefore, the relationship between the joint velocities $\dot\theta_1, \dot\theta_2, \dot\theta_3$ in joint space and the rope velocities $\dot h_1, \dot h_2, \dot h_3, \dot h_4$ is
$$\begin{bmatrix} \dot\theta_1 \\ \dot\theta_2 \\ \dot\theta_3 \end{bmatrix} = [J_V(h)] \begin{bmatrix} \dot h_1 \\ \dot h_2 \\ \dot h_3 \\ \dot h_4 \end{bmatrix} \tag{22}$$
$$J_V(h) = J(h) \tag{23}$$
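The mapping between rope increments and joint increments can be checked numerically. The sketch below builds a 4 × 3 matrix with the structure of Eq. (20) and forms the left pseudo-inverse $(J^T J)^{-1} J^T$, which is one common choice for a matrix with full column rank; the paper does not state which pseudo-inverse it uses. The entry `A` is taken as the derivative of Eq. (8) with respect to $\theta_2$, and all function names and geometry values are hypothetical.

```python
import math


def finger_jacobian(theta, R1, R2, R3, a, b, c, d):
    """4x3 matrix dh/dtheta with the structure of Eq. (20)."""
    t1, t2, _ = theta
    p, q = math.hypot(a, c), math.hypot(b, d)
    psi = t2 + math.atan2(a, c) + math.atan2(b, d)
    h32 = math.sqrt(p * p + q * q + 2.0 * p * q * math.cos(psi))
    A = -p * q * math.sin(psi) / h32    # d(h32)/d(theta2), from Eq. (8)
    s = R1 * (math.tan(t1) ** 2 + 1.0)  # R1 * (tan^2(theta1) + 1)
    return [[s, -R2, 0.0],
            [s, R2, 0.0],
            [-s, A, -R3],
            [-s, R2, R3]]


def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def inverse_3x3(M):
    """Inverse of a 3x3 matrix via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = M
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0.0:
        raise ValueError("J^T J is singular")
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def pseudo_inverse(J):
    """Left pseudo-inverse (J^T J)^-1 J^T of a full-column-rank 4x3 matrix."""
    Jt = [list(col) for col in zip(*J)]
    return matmul(inverse_3x3(matmul(Jt, J)), Jt)


def joint_increment(J_pinv, dh):
    """Joint increments from rope increments, as in Eq. (21)."""
    return [sum(w * x for w, x in zip(row, dh)) for row in J_pinv]


# Hypothetical geometry in millimetres (not taken from the paper).
GEOM = dict(R1=20.0, R2=8.0, R3=6.0, a=4.0, b=4.0, c=10.0, d=12.0)

if __name__ == "__main__":
    J = finger_jacobian((0.1, 0.5, 0.3), **GEOM)
    dtheta = [0.01, -0.02, 0.015]
    dh = [sum(j * t for j, t in zip(row, dtheta)) for row in J]  # Eq. (19)
    print(joint_increment(pseudo_inverse(J), dh))  # recovers dtheta up to rounding
```

Because four ropes drive three joints, rope increments that do not lie in the column space of $J(\theta)$ are projected onto it by this pseudo-inverse; the round trip above only shows that consistent rope increments reproduce the joint increments.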
4 Finger Statics and Maneuverability Analysis
4.1 Finger Static Analysis
With the finger in static equilibrium, the relationship between the external force on each finger joint and the driving moment of the ropes is established. According to robotics theory, differential kinematics and statics are dual, so the statics relationship can be established directly from the differential kinematics. To relate the driving moment of the ropes to the output force of the finger joints, define
$${}^{H}F = \begin{bmatrix} f_{\theta_1} \\ f_{\theta_2} \\ f_{\theta_3} \end{bmatrix} \tag{24}$$
as the forces acting on the finger joints,
$${}^{H}D = \begin{bmatrix} d\theta_1 \\ d\theta_2 \\ d\theta_3 \end{bmatrix} \tag{25}$$
as the rotation angles of the finger joints,
$$[T] = \begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{bmatrix} \tag{26}$$
as the forces acting on the ropes, and
$$[D_h] = \begin{bmatrix} dh_1 \\ dh_2 \\ dh_3 \\ dh_4 \end{bmatrix} \tag{27}$$
as the displacements of the ropes. According to the principle of virtual work, the total virtual work of the joints must equal the total virtual work of the ropes, which yields
$$[{}^{H}F]^{T}\,[{}^{H}D] = [T]^{T}\,[D_h] \tag{28}$$
The mapping between joint space and rope motion follows from the finger Jacobian matrix:
$$[{}^{H}D] = [J_F(h)]\,[D_h] \tag{29}$$
which gives
$$[T] = [J_F(h)]^{T}\,[{}^{H}F] \tag{30}$$
where
$$J_F(h) = J(h) \tag{31}$$
The above gives the relationship between the driving moment of the finger ropes and the desired output force of the joints. Since the Jacobian matrix is known from the preceding motion analysis, the controller can compute the required rope driving moment from the desired joint output force of the dexterous hand and control the hand accordingly.
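The step from the virtual work balance to the rope forces can be written out explicitly. The following derivation is a sketch that uses only the quantities defined in Eqs. (24)–(29) and assumes that the balance in Eq. (28) holds for every virtual rope displacement $[D_h]$.

```latex
\begin{aligned}
[{}^{H}F]^{T}\,[{}^{H}D] &= [T]^{T}\,[D_h] && \text{virtual work balance, Eq. (28)} \\
[{}^{H}F]^{T}\,[J_F(h)]\,[D_h] &= [T]^{T}\,[D_h] && \text{substituting Eq. (29)} \\
[T]^{T} &= [{}^{H}F]^{T}\,[J_F(h)] && \text{since } [D_h] \text{ is arbitrary} \\
[T] &= [J_F(h)]^{T}\,[{}^{H}F] && \text{transposing, which is Eq. (30)}
\end{aligned}
```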
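4.2 Finger Manipulability Analysis
From the preceding calculation, the relationship between the velocity of the dexterous fingertip and the joint velocities in joint space is
$$\begin{bmatrix} \dot x \\ \dot y \\ \dot z \end{bmatrix} = [J_v(\theta)] \begin{bmatrix} \dot\theta_1 \\ \dot\theta_2 \\ \dot\theta_3 \end{bmatrix} \tag{32}$$
$$J_v(\theta) = \begin{bmatrix} A & c_1\left(-a_4 s_{233} - a_3 s_{23} - a_2 s_2\right) & c_1\left(-2a_4 s_{233} - a_3 s_{23}\right) \\ B & s_1\left(-a_4 s_{233} - a_3 s_{23} - a_2 s_2\right) & s_1\left(-2a_4 s_{233} - a_3 s_{23}\right) \\ 0 & -\left(-a_4 c_{233} + a_3 c_{23} + a_2 c_2\right) & -\left(2a_4 c_{233} + a_3 c_{23}\right) \end{bmatrix} \tag{33}$$
$$A = -s_1\left(a_4 c_{233} + a_3 c_{23} + a_2 c_2 + a_1\right), \qquad B = c_1\left(a_4 c_{233} + a_3 c_{23} + a_2 c_2 + a_1\right)$$
Denoting the three columns of $J_v(\theta)$ by $J_1(\theta)$, $J_2(\theta)$, and $J_3(\theta)$ and the velocity of the fingertip by $v_{tip}$,
$$v_{tip} = J_1(\theta)\,\dot\theta_1 + J_2(\theta)\,\dot\theta_2 + J_3(\theta)\,\dot\theta_3 \tag{34}$$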
A fingertip velocity in any direction in Cartesian space can be generated by choosing appropriate joint velocities, as long as $J_1(\theta)$, $J_2(\theta)$, and $J_3(\theta)$ are linearly independent. When the finger is straightened, the Jacobian matrix becomes singular, and the corresponding configuration is called a singular configuration. Its main characteristic is that fingertip velocities in certain directions cannot be realized.
The Jacobian matrix can be used to map bounds on the joint velocities into bounds on the fingertip velocities. A polygonal bound on the joint velocities is mapped by the Jacobian matrix into a polygon in the Cartesian space of the fingertip, while a unit circle in joint velocity space is mapped into an ellipsoid. The unit circle is a contour of equal effort in joint velocity space, where the combined action of the actuators is measured by the sum of squares of the joint velocities. Its image, the ellipsoid of fingertip velocities, is the manipulability ellipsoid. Similarly, a unit circle of equal effort can be mapped into an ellipsoid in the fingertip force space through the inverse transpose of the Jacobian matrix. This is the force ellipsoid, which reflects how easily the robot can generate forces in different directions. The two ellipsoids show that the easier it is to generate a fingertip velocity in a given direction, the harder it is to generate a force in that direction, and vice versa. For a given robot configuration, the principal axes of the manipulability ellipsoid coincide with those of the force ellipsoid, but their lengths are reciprocal: where one ellipsoid is long, the other is short. Whether the currently selected set of finger joint angles has good manipulability therefore plays an important role in choosing among multiple inverse kinematics solutions.
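The reciprocal relation between the two ellipsoids and the loss of manipulability in the straightened configuration can be illustrated numerically. The sketch below does not use the finger Jacobian of Eq. (33); as a simplifying assumption it uses the Jacobian of a planar two-link chain with unit link lengths, and all function names are hypothetical. The semi-axes of the velocity ellipsoid are the singular values of the Jacobian, and those of the force ellipsoid are their reciprocals.

```python
import math


def planar_jacobian(t1, t2, l1=1.0, l2=1.0):
    """Tip Jacobian of a planar two-link chain (illustrative stand-in for Jv)."""
    s1, c1 = math.sin(t1), math.cos(t1)
    s12, c12 = math.sin(t1 + t2), math.cos(t1 + t2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def ellipsoid_axes(J):
    """Semi-axis lengths of the velocity and force ellipsoids of a 2x2 J."""
    (a, b), (c, d) = J
    # M = J J^T is symmetric; its eigenvalues are the squared singular values of J.
    m11, m12, m22 = a * a + b * b, a * c + b * d, c * c + d * d
    mean = 0.5 * (m11 + m22)
    spread = math.sqrt(max(mean * mean - (m11 * m22 - m12 * m12), 0.0))
    sigmas = [math.sqrt(max(mean + spread, 0.0)),
              math.sqrt(max(mean - spread, 0.0))]
    velocity = sigmas  # image of the unit circle of joint velocities
    force = [1.0 / s if s > 0.0 else math.inf for s in sigmas]  # unit joint-torque circle
    return velocity, force


def manipulability(J):
    """sqrt(det(J J^T)), which equals |det J| for a square J."""
    (a, b), (c, d) = J
    return abs(a * d - b * c)


if __name__ == "__main__":
    for bend in (math.pi / 2, 0.5, 0.0):  # 0.0 is the straightened configuration
        J = planar_jacobian(0.3, bend)
        print(bend, manipulability(J), ellipsoid_axes(J))
```

For this two-link chain the measure reduces to $l_1 l_2 |\sin\theta_2|$, so it vanishes when the chain is straight, which mirrors the singular configuration described above.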
5 Experimental Results
Using the tendon-driven coupled finger differential motion control method, the correspondence between the displacement of the tendon ropes and the actual angle of the dexterous finger was obtained, as shown in Fig. 4. The kinematics of the single-joint flexible-rope differential transmission mechanism of the finger was simulated and verified on the ADAMS simulation platform. In 500 groups of experiments at different target angles of the dexterous finger (including common angles and special angles), the correspondence between the displacement of the tendon ropes and the actual finger angle was recorded, together with the error between the actual angle and the target angle. The simulation results are shown in Fig. 5. The experimental results show that the average error of the kinematic control of each finger joint angle of the flexible-rope differential transmission mechanism is 0.96°. The linearity of the finger motion was also very good, and no oscillation or overshoot occurred.
Fig. 4 Operable degree ellipsoid and force ellipsoid
Fig. 5 Flexo displacement corresponding to angle change
6 Conclusion
Through a new design of the knotting point and the winding path of the flexible cable, this paper reduces the coupling between joints, enlarges the working space of the finger, and maintains the preload of the flexible cable. A simplified geometric method is used to establish the motion model of the dexterous finger with flexible-cable coupled drive and to analyze the relationship between the finger joint angles and the cable displacements, which greatly reduces the computational effort of motion control. A differential motion control method for the finger with flexible-cable coupled drive is proposed, which realizes coupled differential drive control of N finger joints by N + 1 drive motors. Statics and manipulability analyses of the dexterous finger are carried out to ensure good manipulability. The experiments show that the proposed method controls the dexterous finger accurately, finely, and in real time, with a control accuracy within 1°.
References 1. Hong, L., Hirzinger, G.: Research on intelligent robotic dexterous hands. J. Xi’an Jiaotong Univ. 4 (2003) 2. Xiao-Tao, W., Tong-Tong, X.U.: Space five fingers dexterous hand single finger force soft control system design. Sci. Technol. Eng. 001, 019 (2019) 3. XiaoTao, W., Tao, Y.: Spatial five-finger dexterity hand single-finger control system design. Sci. Technol. Eng. 17, 4 (2015) 4. Ge, S.: Research on Adaptive Control Algorithm for Tendon Driven Dexterous Hand. Ph.D. thesis, Nanjing University of Posts and Telecommunications (2020) 5. Ruxue, H.: Research on Key Technology of Perception and Control of Multi-finger Dexterous Hand in Tendon Driven Space. Ph.D. thesis, Nanjing University of Aeronautics and Astronautics 6. Rongdi, Z., Hai, W., Xiaopin, X., Xuan, Zhou: Kinematic modeling of the multi-fingered dexterous hand. J. Anhui Univ. Eng. 27(004), 25–28 (2012) 7. Meng, Y.: 19-degree-of-freedom decoupled dexterous hand based on improved flexible cable drive (2019) 8. Sun, Z.: Research on Dexterous Hand Based on Rope Drive. Ph.D. thesis, Tianjin University of Science and Technology (2012) 9. Xu, W.: Structural Design and System Simulation Study of a New Multi-finger Dexterous Hand. Ph.D. thesis, Shandong University of Science and Technology 10. Jun, P.: Research on Multi-finger Hand Control Method Based on Cross-Coupling Strategy. Ph.D. thesis, Harbin Institute of Technology 11. Dunchao, F., Xiaotao, W., Liangliang, H.: Hybrid position/tendon tension control of spatial multi-finger dexterous hands based on tendon actuation. Aerosp. Control 32(6), 6 (2014) 12. Fang, Y, Huang, Y.: Design and performance study of a robot dexterous hand based on humanoid parallel finger mechanism. J. Beijing Jiaotong Univ. (2021) 13. Yin, M., Xu, Z., Zhao, Z., Han, W.: Design and master-slave control of a five-finger dexterous hand based on lasso drive. China Mech. Eng. (2021) 14. 
Wei, Y.: Research on the Operation Planning of a Robotic Multi-finger Dexterous Hand Considering Sliding-Rolling Motion. Ph.D. thesis, Shanxi University of Science and Technology (2003) 15. Gao, L., Guo, B., Wang, K.: Structural design of humanoid dexterous hand and control strategy of single finger. Hydraul. Pneumatics 2, 4 (2012) 16. Wei-Cheng, M., Jin-Run, L., Wei-Hao, C., Min, X., Zhi-Jie, H.: Optimized design of ropedriven single-joint flexible finger. Xiamen Inst. Technol. J. (2021) 17. Hong, Y.: Research on Humanoid Prosthetic Hand Mechanism and Single Finger Control Method. Ph.D. thesis, Harbin Institute of Technology
The Hardware in Loop Simulation System Design Based on PLC for the Process Control
KaiMing Yang and ZhiBin Xue
Abstract This paper designs a hardware-in-the-loop simulation system based on a programmable logic controller (PLC) and MATLAB. The hardware of the system consists of an analog input and output module, a PLC, and a computer. The analog input and output module converts between digital and analog signals, the PLC sets the system control parameters, and the computer establishes the controlled object model and monitors the running state of the system in real time. The modular simulation platform is flexible and easy to operate, because its network and operating parameters can be adjusted to experimental needs.
Keywords Simulink · PLC · Real-time · Data interaction · Hardware-in-loop
K. Yang · Z. Xue School of Water Resources and Electric Power, Qinghai University, Xining 810016, China
Z. Xue (B) College of Chemical Engineering, Qinghai University, Xining 810016, China e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_43
1 Introduction
Traditional simulation systems verify the feasibility of a control system through software simulation alone and do not consider the response characteristics of the controller to different controlled objects in the actual control system. This paper therefore introduces a hardware-in-the-loop simulation system that parameterizes and virtualizes the actual controlled objects while retaining the response characteristics of the controller to them. With a PLC as the controller, the control effect of the PLC on different controlled objects can be tested. A hardware-in-the-loop simulation platform contains both virtual and physical objects, which brings it closer to the actual system, and it offers a high confidence level [1], high security, and a short test cycle [2]; it is the main development direction of future simulation systems. Jiang et al. [3] designed a real-time simulation system that simulates process objects and provides substantially equivalent controlled objects for process control systems based on programmable logic controllers (PLCs). Verifying control algorithms on a hardware-in-the-loop simulation system not only shortens the development cycle of optimization algorithms and reduces test costs [4], but also allows experimental parameters to be configured flexibly to simulate different equipment and working conditions. This overcomes the shortcomings of traditional pure digital simulation, which cannot fully reflect system characteristics, and of pure physical simulation, which has poor flexibility and security risks. Moreover, the simulation system can run in real time [5, 6]. At present, researchers at home and abroad mainly use OPC technology [7–10] to connect the controller and the controlled object, and there are few studies on connecting them through an analog input and output module [11–15]. This paper therefore uses an analog input and output module to connect the controller and the controlled object and to realize real-time data interaction between them, so that the hardware-in-the-loop simulation system is closer to the actual control system.
2 Design of Hardware-in-the-Loop Simulation System
2.1 Composition of the Hardware-in-the-Loop Simulation System
Figure 1 shows the block diagram of the hardware-in-the-loop simulation system, which consists of a PLC, an analog input and output module, and a PC. The system must connect the controller part to the controlled part, so real-time data communication is the focus of the system design. The necessary analog conversion modules are added as the system requires. MATLAB runs on the PC, where the transfer function of a third-order system is constructed as the assumed controlled object. The analog output module is connected to the PC and converts the digital signal of the controlled object into an analog output signal. The PLC collects this analog signal, performs PID control, and outputs the control signal as an analog signal (a standard current signal). The analog input/output module collects this signal, converts it into the input signal of the controlled object, and transmits it to MATLAB, forming a closed-loop feedback control system.
Fig. 1 Block diagram of hardware-in-the-loop simulation system
2.2 Hardware-in-the-Loop Simulation System Control Strategy
The system supports a PID control strategy, with the control parameters set through Step 7 in the TIA Portal software. PID control is simple in principle, easy to use, and widely applicable. The PID controller has a simple structure, each parameter has a clear physical meaning, the computational load is small, the parameters are easy to tune, and multi-loop control is easy to implement. Many PLC products now provide PID control functions, such as PID_Compact and other PID control instructions. These functions are simple to use: only a few parameters need to be adjusted, and users can easily master the tuning method.
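The closed loop described here can be sketched in a few lines. The following Python example is not the PLC's PID instruction; it is a minimal discrete PID with output limits, applied to an assumed third-order plant $1/(s+1)^3$ integrated with the Euler method. The gains, limits, plant, and step size are all hypothetical values chosen for illustration.

```python
class PID:
    """Minimal discrete PID with output limits (illustrative, not the PLC instruction)."""

    def __init__(self, kp, ki, kd, out_min, out_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_min, self.out_max = out_min, out_max
        self.integral = 0.0
        self.prev_meas = None

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        # Derivative acts on the measurement to avoid a kick on setpoint steps.
        d_meas = 0.0 if self.prev_meas is None else (measurement - self.prev_meas) / dt
        self.prev_meas = measurement
        candidate = self.integral + error * dt
        u = self.kp * error + self.ki * candidate - self.kd * d_meas
        if self.out_min <= u <= self.out_max:
            self.integral = candidate  # integrate only while the output is unsaturated
            return u
        return min(max(u, self.out_min), self.out_max)


def simulate(pid, setpoint=1.0, dt=0.01, t_end=60.0):
    """Closed loop with an assumed plant 1/(s+1)^3 built from three first-order lags."""
    x1 = x2 = x3 = 0.0
    for _ in range(int(round(t_end / dt))):
        u = pid.update(setpoint, x3, dt)
        x1 += dt * (u - x1)
        x2 += dt * (x1 - x2)
        x3 += dt * (x2 - x3)
    return x3


if __name__ == "__main__":
    print(simulate(PID(kp=1.0, ki=0.5, kd=0.2, out_min=-10.0, out_max=10.0)))
```

In the hardware-in-the-loop setup the plant would run in Simulink and the controller in the PLC; the sketch only shows the role of the three gains and of output limiting in one place.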
2.3 Hardware-in-the-Loop Simulation System Hardware and Software
The hardware of the simulation platform consists of two host computers, a PLC, an analog input and output module, an EM235 expansion module, and related components. The software comprises MATLAB Simulink and TIA Portal V15.1.
The PLC is a Siemens S7-1200, a modular programmable logic controller for engineering applications that handles logic control, network communication, and similar tasks. Because the PLC must acquire and output analog signals, an expansion module is used for this purpose, and the EM235 expansion module was selected. The EM235 is configured through its DIP switches: it has six DIP switches (SW1–SW6), which determine all of its input settings. In this design, SW1 and SW6 are switched on, which configures the analog input for a 0–20 mA current. A 20% offset is then added when writing the PID program with Step 7 in TIA Portal, so that the analog input corresponds to 4–20 mA.
The analog input/output module performs the conversion between digital and analog signals. It has four analog input and four analog output interfaces, communicates through a network port, and uses 4–20 mA current signals for input and output. In this simulation system, the module converts the collected signals between digital and analog form to realize data interaction between the PLC and MATLAB. The module supports the Modbus protocol. Modbus defines four data models: discrete inputs, coils (discrete outputs), input registers, and holding registers. The input registers are used to read the data collected at the analog input ports of the module, and the holding registers are used to write data to its analog output ports. The module IP address is 192.168.1.232, the port number is 10000, and the device address is 1.
Host 1 runs the MATLAB Simulink simulation software. In this simulation system, it is used to establish the controlled object model and to monitor the running state of the system in real time. Host 2 runs the TIA Portal software, which is the development software for the PLC control system. The PID control law is programmed through Step 7 in this software.
For communication, signals such as the controlled quantity are transmitted between host 1 and the analog input and output module through network cables, and communication between the analog input and output module and the PLC is handled by the EM235 expansion module. Host 2 and the PLC are connected through an RS485 bus, and the user writes instructions to the PLC through Step 7 in TIA Portal.
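Reading the module's analog inputs over Modbus can be sketched as follows. The example assumes that the module speaks standard Modbus TCP framing (an MBAP header followed by the protocol data unit) and that its four analog inputs start at input register address 0; neither detail is stated in the text, and the function names are hypothetical. Only the IP address, port, and device address come from the description above.

```python
import socket
import struct

MODULE_ADDR = ("192.168.1.232", 10000)  # IP address and port given in the text
UNIT_ID = 1                             # device address given in the text
READ_INPUT_REGISTERS = 0x04             # standard Modbus function code


def build_read_request(transaction_id, unit_id, start, count):
    """Modbus TCP frame: MBAP header followed by the Read Input Registers PDU."""
    pdu = struct.pack(">BHH", READ_INPUT_REGISTERS, start, count)
    # MBAP: transaction id, protocol id (always 0), length of unit id + PDU, unit id
    mbap = struct.pack(">HHHB", transaction_id, 0, len(pdu) + 1, unit_id)
    return mbap + pdu


def parse_read_response(frame, count):
    """Return the register values carried by a Read Input Registers response."""
    _tid, _proto, _length, _unit, function, byte_count = struct.unpack(
        ">HHHBBB", frame[:9])
    if function & 0x80:
        # In an exception response this byte holds the exception code.
        raise IOError(f"Modbus exception code {byte_count}")
    if function != READ_INPUT_REGISTERS or byte_count != 2 * count:
        raise ValueError("unexpected Modbus response")
    return list(struct.unpack(f">{count}H", frame[9:9 + byte_count]))


def recv_exact(sock, n):
    """Receive exactly n bytes or raise if the peer closes the connection."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise ConnectionError("socket closed by module")
        data += chunk
    return data


def read_analog_inputs(start=0, count=4, timeout=1.0):
    """Read `count` input registers from the module (start address is assumed)."""
    with socket.create_connection(MODULE_ADDR, timeout=timeout) as sock:
        sock.sendall(build_read_request(1, UNIT_ID, start, count))
        header = recv_exact(sock, 7)
        length = struct.unpack(">H", header[4:6])[0]
        body = recv_exact(sock, length - 1)
        return parse_read_response(header + body, count)
```

Writing the analog outputs would use the holding registers in the same way with a write function code. How raw register counts map to the 4–20 mA range depends on the module and is not specified here.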
3 The Establishment of Controlled Object Model in MATLAB Simulink

In this system, the mathematical model of the controlled object is built in Simulink, the PLC serves as the controller, and the analog input/output module converts between signal types, so the system can control different controlled objects. The MATLAB simulation control flow chart is shown in Fig. 2. The real-time simulation time is set in Simulink; MATLAB sends the controlled quantity to the PLC through the Modbus protocol, receives the control quantity output by the PLC, and displays the output curve on a Simulink scope. To read and write these signals in Simulink, four modules must be designed: a real-time simulation module, a data read/write module, an ADC module and a DAC module.
3.1 Simulink Real-Time Simulation Design

(1) Use of the S-function: An S-function extends the capabilities of the Simulink environment through dynamically linked subroutines that Simulink loads and executes automatically. Its operation has three phases: initialization, simulation loop and simulation end. The workflow is shown in Fig. 3. In the initialization phase, memory allocation, sampling time setting, state variable assignment and module parameter assignment are completed. The extended function is implemented in the simulation loop phase.

Fig. 2 Flow chart of simulation control program

(2) Realization of real-time simulation: Since the simulation normally runs faster than real time, this paper uses an S-function to pace the simulation to the wall clock. At each mdlUpdate call, the S-function computes the difference between the simulation time and the elapsed real time and repeats the comparison until the real time has caught up with the simulation time. Only then is the next step taken, which drives the difference to zero and makes the simulation run in real time.
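The pacing idea described in item (2) can be sketched outside Simulink as follows. This is a minimal Python illustration, not the authors' S-function: the function names are hypothetical, and the clock and sleep functions are passed in as parameters only so that the loop can be exercised without real waiting.

```python
import time


def pace_to_real_time(sim_time, start_wall, clock=time.monotonic, sleep=time.sleep):
    """Block until the wall clock has caught up with sim_time (seconds).

    Returns the lag (elapsed real time minus sim_time) on exit, which is >= 0.
    """
    while True:
        elapsed = clock() - start_wall
        if elapsed >= sim_time:
            return elapsed - sim_time
        # Sleep for the remaining gap instead of busy-waiting.
        sleep(sim_time - elapsed)


def run_paced(n_steps, step_size, clock=time.monotonic, sleep=time.sleep):
    """Run n_steps fixed-size steps, pacing each one to the wall clock."""
    start = clock()
    lags = []
    for k in range(1, n_steps + 1):
        # The model update for step k would go here (hypothetical placeholder).
        lags.append(pace_to_real_time(k * step_size, start, clock, sleep))
    return lags


if __name__ == "__main__":
    # Ten steps of 0.05 s should take about half a second of real time.
    t0 = time.monotonic()
    run_paced(10, 0.05)
    print(f"elapsed: {time.monotonic() - t0:.2f} s")
```

Sleeping for the remaining gap keeps the processor idle between steps; a busy-wait loop would pace equally well at the cost of a fully loaded core.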
Fig. 3 S-function flow chart
3.2 Design of Data Read and Write Module in Simulink

(1) Data reading module (read): In Simulink, the core of the data reading S-function is as follows:

    function sys = mdlOutputs(t, x, u, m)
    % m is the Modbus connection object; read one input register at address 1
    global out
    out = read(m, 'inputregs', 1, 1);
    sys = out;

(2) Data writing module (write): In Simulink, the core of the data writing S-function is as follows:

    function sys = mdlOutputs(t, x, u, m)
    % u is the block input; write it to the holding register at address 1
    global in
    in = u;
    write(m, 'holdingregs', 1, in);
    sys = in;
Fig. 4 ADC module
Fig. 5 DAC module
3.3 ADC and DAC Module Design in Simulink

The feed quantity, control quantity and feedback quantity of the controlled object in Simulink are analog quantities in the range 0–2, whereas the PC and the analog input/output module exchange integer digital values in the range 4000–20,000 over the network cable. ADC and DAC modules are therefore designed in Simulink to convert between analog and digital signals.

(1) ADC module design: Multiplication and addition in an Fcn block are used to convert the analog signal to a digital signal; Fig. 4 shows the ADC module. The conversion is linear, as in formula (1):

$$y = ax + b \tag{1}$$

Substituting the points (0, 4000) and (2, 20000) into formula (1) gives a = 8000 and b = 4000, which yields formula (2), the mathematical model of the ADC module:

$$y = 8000x + 4000 \tag{2}$$

Because the digital signal transmitted over the network cable is integer data, the rounding-down function (floor) is applied in the ADC module.

(2) DAC module design: Division and addition in an Fcn block are used to convert the digital signal to an analog signal; Fig. 5 shows the DAC module. Substituting the points (4000, 0) and (20000, 2) into formula (1) gives a = 1/8000 and b = −0.5, which yields formula (3), the mathematical model of the DAC module:

$$y = \frac{1}{8000}x - 0.5 \tag{3}$$
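The two conversions can be checked numerically with a short sketch. The constants are taken from formulas (2) and (3); the function names are illustrative, and Python stands in for the Simulink Fcn blocks.

```python
import math

ADC_GAIN = 8000      # a in formula (2)
ADC_OFFSET = 4000    # b in formula (2)


def adc(x: float) -> int:
    """Formula (2) followed by floor: analog value in 0..2 -> integer count."""
    return math.floor(ADC_GAIN * x + ADC_OFFSET)


def dac(counts: int) -> float:
    """Formula (3): integer count in 4000..20000 -> analog value in 0..2."""
    return counts / ADC_GAIN - 0.5


if __name__ == "__main__":
    for x in (0.0, 1.0, 2.0):
        print(x, adc(x), dac(adc(x)))
```

Because of the floor operation, a round trip through adc and dac can lose up to one count, which is 1/8000 of the analog range unit.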
3.4 Hardware-in-the-Loop Simulation System Controlled Object Model Display

A three-capacity (three-tank) water tank is selected as the controlled object; its transfer function is given in formula (4).
Fig. 6 Controlled object model in MATLAB Simulink
$$G(s) = \frac{0.55}{2s^3 + 5s^2 + 4s + 1} \tag{4}$$
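As a quick check on formula (4), the open-loop step response of the plant can be simulated in a few lines. The sketch below assumes the reading of formula (4) as 0.55 over a cubic denominator and integrates the equivalent differential equation with forward Euler; the step size and end time are arbitrary choices, not values from the paper.

```python
def plant_step_response(t_end=40.0, dt=0.01, u=1.0):
    """Forward-Euler step response of G(s) = 0.55 / (2s^3 + 5s^2 + 4s + 1).

    The transfer function corresponds to the ODE
        2*y''' + 5*y'' + 4*y' + y = 0.55*u
    with states y, dy = y' and ddy = y''.
    Returns a list of (time, y) pairs.
    """
    y = dy = ddy = 0.0
    history = []
    for k in range(int(round(t_end / dt))):
        dddy = (0.55 * u - y - 4.0 * dy - 5.0 * ddy) / 2.0
        y, dy, ddy = y + dt * dy, dy + dt * ddy, ddy + dt * dddy
        history.append(((k + 1) * dt, y))
    return history


if __name__ == "__main__":
    t, y = plant_step_response()[-1]
    print(f"y({t:.0f} s) = {y:.4f}")  # settles at the static gain of the plant
```

The denominator factors as (2s + 1)(s + 1)^2, so the open-loop response is a monotone rise towards the static gain 0.55, which is why integral action is needed to reach a setpoint of 1.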
With the above design, the overall structure of the controlled object model and the related auxiliary modules built in Simulink is shown in Fig. 6. The controlled object model consists of the three-tank transfer function, the data reading module, the data writing module, the ADC module, the DAC module and the real-time simulation module.
4 Design of Hardware-in-the-Loop Simulation Platform

(1) System hardware configuration: The hardware configuration diagram of the system is shown in Fig. 7. The analog input/output interface of the EM235 module is connected to the analog input/output channels of the analog input/output module, and the analog input/output module is connected to the PC through a network cable to form the control system. Taking the first analog input interface as an example, a voltage signal is connected directly to A+ and A− according to polarity. For a current signal, RA and A+ must first be shorted; the positive terminal of the current input is then connected to A+ and the negative terminal to A−. The analog output interface is wired according to the type of output signal. For an unconnected port, X+ and X− must be shorted so that it does not interfere with the ports in use.

(2) System hardware design results: Figure 8 is a photograph of the hardware-in-the-loop simulation system, which has five parts: the PC, the network port, the analog input/output module, the EM235 expansion module and the PLC controller.
Fig. 7 Schematic diagram of system hardware configuration
Fig. 8 Physical diagram of hardware-in-the-loop simulation system
4.1 PID Control Rule Design

PID controller parameter design: A PI controller is used, with the transfer function given in formula (5). From this transfer function, the controller parameters are a proportional coefficient of 0.06813 and an integral coefficient of 0.13627.

$$D(s) = \frac{0.06813s + 0.13627}{s} \tag{5}$$
4.2 The PID Control Rule is Realized in PLC

The PID_Compact instruction is used in the design to implement the PID control law. The analog input address is IW0, the analog output address is QW0, the set value is 1.0, and the manual control mode is adopted. Because the units of the control parameters in the PLC differ from those in the Simulink environment, the parameters must be converted before they are entered in the PLC. The integral conversion is given in formula (6) and the differential conversion in formula (7):

$$T_I = \frac{K_P}{60K_I} \tag{6}$$

$$T_D = \frac{K_D}{60K_P} \tag{7}$$

Converting the control parameters obtained in the Simulink simulation environment with formula (6) gives a proportional gain of 0.06813 and an integration time of 0.00833 min.
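The conversion in formulas (6) and (7) is a one-line calculation, sketched below with the gains from formula (5). The function name is illustrative, and the units (times in minutes) follow the text rather than any particular PLC manual.

```python
def simulink_to_plc(kp: float, ki: float, kd: float = 0.0):
    """Convert parallel-form gains to (gain, T_I, T_D) using formulas (6) and (7).

    T_I and T_D are returned in minutes, as stated in the text.
    """
    ti_min = kp / (60.0 * ki)
    td_min = kd / (60.0 * kp)
    return kp, ti_min, td_min


if __name__ == "__main__":
    gain, ti, td = simulink_to_plc(0.06813, 0.13627)
    print(f"gain = {gain}, T_I = {ti:.5f} min, T_D = {td} min")
```

With a PI controller the derivative coefficient is zero, so formula (7) contributes a derivative time of zero.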
5 Simulation Verification and System Monitoring Design

5.1 Show the Simulation Results in Simulink Simulation Environment

To verify the reliability of the hardware-in-the-loop simulation system, the same model is built entirely in Simulink for comparison. The simulation model is shown in Fig. 9. Since the hardware-in-the-loop simulation system operates on discretized data, the Simulink simulation model is discretized as well.
Fig. 9 Closed loop control system built in Simulink simulation environment
Fig. 10 Step response curve under PI control law
In Fig. 9, the liquid level is the controlled quantity of the three-capacity water tank model, and the given value is 1 m. The control effect of the PI controller in the Simulink simulation environment is shown in Fig. 10. As Fig. 10 shows, the overshoot of the system is zero, the rise time is 25 s, the settling time is 55 s, and the steady-state error is zero.
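The closed-loop behaviour reported above can be approximated with a short pure-Python simulation. This is a sketch under stated assumptions: the plant is read as 0.55 / (2s^3 + 5s^2 + 4s + 1), the controller uses the gains of formula (5), and the loop is integrated in continuous time with forward Euler, so the discretization used in the paper is not reproduced.

```python
def closed_loop_step(kp=0.06813, ki=0.13627, setpoint=1.0, t_end=200.0, dt=0.01):
    """Unit-feedback PI loop around G(s) = 0.55 / (2s^3 + 5s^2 + 4s + 1).

    Returns the list of level values y sampled every dt seconds.
    """
    y = dy = ddy = 0.0      # plant states: y, y', y''
    integral = 0.0          # integral of the control error
    ys = []
    for _ in range(int(round(t_end / dt))):
        error = setpoint - y
        u = kp * error + ki * integral
        dddy = (0.55 * u - y - 4.0 * dy - 5.0 * ddy) / 2.0
        y, dy, ddy = y + dt * dy, dy + dt * ddy, ddy + dt * dddy
        integral += dt * error
        ys.append(y)
    return ys


if __name__ == "__main__":
    ys = closed_loop_step()
    print(f"final level = {ys[-1]:.4f} m, peak = {max(ys):.4f} m")
```

The integral term removes the steady-state error, so the level converges to the 1 m setpoint; the exact rise and settling times depend on the discretization and are not asserted here.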
5.2 Hardware in the Loop Simulation System Simulation Verification

Here, the hardware-in-the-loop simulation system built in Sects. 2 and 3 is used to control the mathematical model of the three-capacity water tank, and the results are compared with those of Sect. 5.1. The controlled quantity is again the liquid level, and the given value is 1 m. The controller parameters are a proportional gain of 0.068134 and an integration time of 0.00833 min; the sampling time of the PLC controller is 1 s. The comparison between the control effect of the hardware-in-the-loop simulation and the Simulink simulation is shown in Figs. 11 and 12. The solid blue lines are the step response curve and controller output curve obtained in the hardware-in-the-loop simulation environment, and the dotted red lines are those obtained in the Simulink simulation environment. The simulation results show that the overshoot of the system is zero, the rise time is 25 s, the settling time is 55 s, and the steady-state error is zero. The control effect of the hardware-in-the-loop simulation system is basically the same as that of the Simulink simulation, which indicates that the control strategy adopted in this paper is feasible and that the test results are effective and reliable.

Fig. 11 Comparison of hardware-in-the-loop simulation and Simulink simulation control effect
5.3 Monitoring Interface and PLC Controller Data Interaction Design

The App Designer tool in MATLAB is used to design the monitoring interface of the hardware-in-the-loop simulation system [16–18]. After the PID operation, the PLC controller sends the controller output to the analog input/output module, which converts the received analog signal into a digital signal and transmits it to the MATLAB workspace over the network port. MATLAB reads the workspace data through a callback and draws the controller output curve and the step response curve. The two-way data communication between MATLAB and the PLC controller is shown in Fig. 13.
Fig. 12 Hardware in loop simulation and Simulink controller output comparison
Fig. 13 Two-way data communication between MATLAB and PLC controller
Fig. 14 Monitoring results of hardware-in-the-loop simulation system
5.4 Monitoring Results Display

Open Step 7 in TIA Portal, set the PID control parameters and run the program; then open the GUI monitoring interface of the PLC-based hardware-in-the-loop simulation system and click the Run button to obtain the monitoring result shown in Fig. 14. Comparing Figs. 11 and 14 shows that the output waveform of the hardware-in-the-loop simulation system is consistent with the one displayed in the monitoring system, so the monitoring system designed in this paper is feasible.
6 Conclusion

This paper designs a hardware-in-the-loop simulation system based on MATLAB and a PLC. The controlled object model was built in Simulink, the analog input/output module was used to connect the PLC controller with the controlled object, and TIA Portal V15.1 was used to program the PLC, completing the hardware-in-the-loop simulation platform. The platform was then compared with a pure Simulink simulation for verification. The results show that the hardware-in-the-loop simulation system designed in this paper is feasible and that the control strategy adopted in the system is reliable. The controlled object model can be changed according to experimental needs, which improves the flexibility and operability of the simulation platform.

Acknowledgements This work was supported in part by the Transformation Project of Qinghai Province under Grant 2017-GX-103 and the Education and Teaching Research Project of Qinghai University under Grant JY202136.

Conflict of interest The authors declare that there is no conflict of interest regarding the publication of this paper.
References

1. Dai, W., Zhou, P., Zhao, D., et al.: Hardware-in-the-loop simulation platform for supervisory control of mineral grinding process. Powder Technol. 288, 422–434 (2016)
2. Yong, J., Feng, N., Chen, N.: Development of hardware-in-the-loop simulation experimental platform for automatic driving vehicle. Exp. Technol. Manage. 38(2), 127–131 (2021)
3. Jiang, B., Bian, J., Peng, X., et al.: Development and verification of process control hardware-in-the-loop simulation system. Exp. Technol. Manage. 38(11), 105–109 (2021)
4. Zou, Y.: Overview of hardware-in-loop simulation system. Value Eng. 35(35), 97–98 (2016)
5. Wang, Z., Qi, D., Mei, J., et al.: Real-time controller hardware-in-the-loop co-simulation testbed for cooperative control strategy for cyber-physical power system. Global Energy Interconnect. 4(02), 214–223 (2021)
6. Dominic, S., Lohr, Y., Schwung, A., et al.: PLC-based real-time realization of flatness-based feedforward control for industrial compression systems. IEEE Trans. Industr. Electron. 64(02), 1323–1331 (2017)
7. Yan, Q., Li, Q., Li, X., et al.: Design of hardware-in-the-loop training platform for advanced control based on OPC technology. Exp. Technol. Manage. 37(07), 100–104 (2020)
8. Yu, Ch., Heng, W., Gao, C.: Design and implementation of virtual simulation experiment system based on OPC technology. Electr. World 24(63), 110–111+116 (2019)
9. Wang, M.: Water tank level control system of MATLAB and PLC based on OPC. Instrumentation 24(12), 9–11 (2017)
10. Jin, L.: Hydraulic and pneumatic virtual simulation platform based on OPC technology. Hydraul. Pneum. Seals 42(09), 72–75 (2022)
11. Wang, B., Zhang, G.: Development of innovative experimental project of PLC multi-channel analog data acquisition. Exp. Technol. Manage. 38(09), 166–169+180 (2021)
12. Tang, G., Zhao, W., Zhou, X.: Design of high precision analog quantity input and output module for train control on-board equipment. Electr. Drive Locomotives 261(02), 45–48 (2018)
13. Zhao, H.: Application of Mitsubishi FX0N-3A. Indus. Control Comput. 23(11), 33–34+36 (2010)
14. Chen, C., Zhang, D., Yuan, X.: Design of a multi channel wireless analog input and output module with voltage switching function. Telecom. Power Technol. 34(01), 26–28+31 (2017)
15. Cui, W., Chen, H., Peng, J.: Configuration and debugging of analog signal input and output module of SLC500TM PLC. J. Shanghai Univ. 2003(02), 145–147 (2003)
16. Zhu, F., Xu, Z., Huang, G.: A design for the real-time liquid level monitoring system based on Matlab GUI. Res. Explor. Lab. 36(09), 83–86 (2017)
17. Lin, Y., Shuchao, W., Shuifeng, H.: Design of anti-theft monitoring system based on GUI. J. Mianyang Teachers’ Coll. 37(02), 51–55 (2018)
18. Liu, Y., Zhang, Y., Wang, J.: Design of temperature acquisition and monitoring system based on Matlab/GUI. J. Yuxi Normal Univ. 33(08), 67–70 (2017)
Research on Gas Source Location of Quadruped Robot Based on DDQN

Fengyun Li, Lei Cheng, Wenle Wang, and Bingbing Hou
Abstract This study proposes a gas source localization method that uses the Double DQN (DDQN) algorithm on a quadruped robot platform. A training environment and simulator based on gas dispersion data were built with Fluent and ROS to train the olfactory quadruped robot. Quadruped robots, with their discrete footholds, multiple degrees of freedom and multiple limbs, cope well with complex terrain. Using the two value networks of the DDQN algorithm, the robot locates the gas source accurately. Experimental results show that the robot locates gas sources efficiently even in scenes with obstacles. The method is applicable to rugged settings such as pipe corridors and mine caves, where gas leaks endanger humans and other organisms.

Keywords Air source localization · DDQN · Quadruped robot · Reinforcement learning
F. Li (✉) · L. Cheng · W. Wang · B. Hou
College of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430080, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_44

1 Introduction

With the rapid development of modern industry, China has been the world's largest chemical producer since 2010, with the output value of its chemical industry accounting for more than 50% of the global total. Because the chemicals used in chemical production are flammable, explosive, toxic and harmful, the safety risks are significant, and hazardous gas leaks are an increasingly visible problem. If not adequately controlled, such leaks can cause severe casualties and property losses. For example, in 2013 a material leakage poisoning accident occurred in a production workshop in Shandong Province, causing three fatalities and a direct economic loss of about 2.706 million yuan. In October of the same year, a hydrogen sulfide poisoning accident in the purification section of the yellow phosphorus workshop of a chemical plant in Hubei caused 3 deaths and 5 injuries and a direct economic loss of 2.38 million yuan. In addition, an explosion occurred in 2015 near Houma, Louisiana, at the Gibson wellhead gas processing plant owned by Williams Partners TransContinental Gas Pipeline Transportation. In 2021 alone, there were 122 chemical accidents across the country, resulting in 150 deaths, and in the same year there were 177 chemical spills in the United States. Leakage safety in chemical production therefore still requires close attention and effective control measures.

To prevent such safety problems, they should be addressed at the source. If the leak source can be discovered and handled in the initial stage of a gas leak, the major losses caused by the leakage of dangerous chemicals can be prevented or reduced. Efficient, accurate and safe search for and localization of toxic and hazardous gas leakage sources has therefore become an urgent practical demand.
1.1 Gas Source Location Algorithm

Gas source localization is mainly divided into three stages: plume detection, plume tracking and odor source confirmation [1]. That is, the robot first searches the surrounding environment for the presence of odor, then tracks the plume along its propagation path, and finally estimates the position of the source and moves to it, which completes the localization process. Existing odor source localization algorithms can be roughly classified into six types: concentration-gradient-based algorithms, biomimetic algorithms, multi-robot algorithms, probability-distribution-based algorithms, control-based algorithms and learning-based algorithms [2].

Concentration-gradient-based algorithms are commonly used in unbounded, open environments, where the concentration changes gradually and molecular diffusion dominates, so gas sensors can easily detect concentration differences. On this basis, the direction of the odor source [3] and the target vector of each robot step have been studied. Biomimetic algorithms are mostly inspired by biology, such as the fruit fly algorithm [4] and the silkworm moth algorithm [5], and are driven by specific movement strategies. Multi-robot algorithms focus on the division of labor and cooperation among robots: by coordinating multiple agents distributed over the search area, they can greatly improve the probability of finding the source while keeping the computational complexity low. Examples include biased random movement [6] and the information trend algorithm [7]. Probability-distribution-based algorithms are considered more flexible, because any plume distribution model applicable to the search scenario can be selected, even without specific assumptions about the plume model.
The first application of this approach is Infotaxis [8], which applies Bayesian inference to gas source search, so that researchers began to consider localization that does not rely on the concentration gradient. Later, an improved Infotaxis algorithm was proposed in which the robot first explores the environment and collects some information before moving [9]. Control-based algorithms drive the motion of the robot so as to progressively reduce the error between the estimated iso-concentration contours of the odor and the robot position, which brings the robot towards the odor source. For example, Jiang et al. integrated interactive estimation and control in a unified loop and designed a plume surface tracking algorithm [10], and Lu et al. proposed a multi-robot finite-time motion control algorithm for gas source location [11]. Finally, learning-based algorithms build a behavior model through which robots learn to process noisy inputs and solve complex tasks efficiently. For example, fuzzy logic has been used to let robots autonomously locate gas sources indoors [12], and model-free reinforcement learning has been designed to track underwater plumes. Experimental results show that such algorithms can greatly shorten the localization time and achieve better localization on robot platforms with strong computing power.

Reinforcement learning is based on the Markov decision process (MDP). The environment responds to each action of the agent with a reward and a new state; through continuous interaction the agent trains autonomously and finally learns the optimal policy. When the quantity being solved for is the action-value function, the algorithm is called Q-learning. To compensate for the instability that arises when a nonlinear network is used to fit the value function, Mnih et al. combined a convolutional neural network with the traditional Q-learning algorithm and proposed the DQN algorithm, which has strong adaptability and generality. To address the overestimation of Q values in DQN, van Hasselt proposed the double Q-learning algorithm and the improved DDQN algorithm [13].
The experimental results show that DDQN estimates Q values more accurately and improves algorithm performance. This paper therefore uses the DDQN algorithm to search for the odor source.
1.2 Gas Source Location Based on Robot

Locating the source of a gas leak is dangerous for humans and other creatures, so robots are mostly used for this task. Tracked robots adapt well to terrain, but their speed and efficiency are low. Wheeled robots are fast and efficient, but their obstacle-crossing ability and terrain adaptability are poor. For UAVs, the rotors disturb plume diffusion; although this has been studied, the problem has not been fully solved. In contrast, quadruped robots have discrete footholds as well as multiple degrees of freedom and multiple limbs [14]. These two advantages help quadruped robots adapt to complex, varied terrain and road conditions and cross obstacles and gullies, while exerting less pressure on the ground. This paper therefore proposes a DDQN-based source-finding algorithm for gas source localization with an olfactory quadruped robot.
Fig. 1 Flow diagram of gas source location based on DDQN
2 Methodology

In this section, a DDQN model is proposed for gas source location. As shown in Fig. 1, the initial state is first input into the Q network, which outputs Q values; an action is selected and executed, producing the next state. The next state is then fed to the Q network, and the action with the maximum Q value is identified. The Q value that the Target Q network assigns to this action is used to compute the target value, and the error between this target and the previously predicted Q value is back-propagated through the Q network. The parameters of the Q network are periodically copied to the Target Q network to update it, and the trained model is then used for gas source location.
2.1 Data Preparation

First, gas data were prepared with the gas diffusion simulation function of Fluent. A closed environment was built in Fluent for gas simulation, and position and concentration data were collected. One planar gas dataset was exported for training the robot, and other planar gas datasets were used for algorithm validation in subsequent experiments. Various environments were then constructed in Fluent, for example by adding obstacles, to simulate the gas diffusion process, and the olfactory quadruped robot was used for source localization.
2.2 Double DQN

DQN is a deep neural network algorithm that predicts the Q value, i.e. the value of a state-action pair: the expected return obtained when the agent performs a given action in a given state. Specifically, the agent chooses action a in the current state s, and the value Q(s, a) under the policy π equals the reward r_t obtained at the current moment plus the maximum discounted reward that can be obtained afterwards:

$$Q^*(s, a) = \max_\pi \mathbb{E}\left[r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \cdots \mid s_t = s,\ a_t = a,\ \pi\right] \tag{1}$$

The corresponding target is:

$$Y_t^{DQN} \equiv R_{t+1} + \gamma \max_a Q(S_{t+1}, a; \theta_t^-) \tag{2}$$

The DDQN algorithm is an improvement of DQN. Its model structure is basically the same as that of DQN; the only difference is the target:

$$Y_t^{DDQN} \equiv R_{t+1} + \gamma Q\left(S_{t+1}, \arg\max_a Q(S_{t+1}, a; \theta_t), \theta_t^-\right) \tag{3}$$

The difference between the two targets is that Double DQN selects the optimal action with the parameters θ_t of the Q network that is currently being updated, while DQN selects it with the parameters θ_t^- of the Target Q network. Because the action is selected by the current Q network but evaluated by the Target Q network, the DDQN target can never exceed the DQN target computed from the same Target Q network. This reduces overestimation to a certain extent and brings the Q value closer to the true value.

In addition, a compound function sig(c) is constructed in this work so that the reward r_t lies between −1 and 1:

$$\mathrm{sig}(c) = \begin{cases} 1, & c > c' \\ 0, & c = c' \\ -1, & c < c' \end{cases} \tag{4}$$

$$r_t = \mathrm{sig}(c) \times \frac{c - c_{\min}}{c_{\max} - c_{\min}} \tag{5}$$

where c is the concentration at the present moment, c′ is the concentration at the previous moment, S is the state, a is the action, r is the reward, and γ is the discount factor, which takes a value between 0 and 1 and balances the importance of immediate and long-term rewards.
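The reward of formulas (4) and (5) and the two targets of formulas (2) and (3) can be written out in a few lines. In this sketch the Q values are plain Python lists with one entry per action, standing in for the outputs of the two networks; the function names and the example numbers are illustrative, not taken from the paper.

```python
def sig(c: float, c_prev: float) -> int:
    """Formula (4): sign of the concentration change since the previous step."""
    if c > c_prev:
        return 1
    if c < c_prev:
        return -1
    return 0


def reward(c: float, c_prev: float, c_min: float, c_max: float) -> float:
    """Formula (5): signed, normalised concentration."""
    return sig(c, c_prev) * (c - c_min) / (c_max - c_min)


def dqn_target(r: float, q_next_target, gamma: float = 0.99) -> float:
    """Formula (2): the target network both selects and evaluates the action."""
    return r + gamma * max(q_next_target)


def ddqn_target(r: float, q_next_online, q_next_target, gamma: float = 0.99) -> float:
    """Formula (3): the online network selects, the target network evaluates."""
    best = max(range(len(q_next_online)), key=q_next_online.__getitem__)
    return r + gamma * q_next_target[best]


if __name__ == "__main__":
    q_online = [1.0, 2.0, 0.5, 0.1]   # made-up Q(S', a; theta) for four actions
    q_target = [0.9, 1.1, 3.0, 0.2]   # made-up Q(S', a; theta^-)
    r = reward(0.6, 0.4, 0.0, 1.0)
    print(dqn_target(r, q_target), ddqn_target(r, q_online, q_target))
```

Whenever the two networks disagree about the best action, the DDQN target is the smaller of the two, which is the mechanism that limits overestimation.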
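The loss function is defined as:

$$L_i(\theta_i) = \mathbb{E}_{(s,a,r,s') \sim U(D)}\left[\left(r + \gamma \max_{a'} Q(s', a'; \theta_i^-) - Q(s, a; \theta_i)\right)^2\right] \tag{6}$$

At each iteration, θ is updated by gradient descent:

$$\theta_{i+1} = \theta_i - \alpha \nabla_\theta L_i(\theta_i) \tag{7}$$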
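The update of formula (7) can be illustrated on a deliberately small model. The sketch below replaces the neural network with a linear Q function (one weight row per action), which is an assumption made only to keep the gradient explicit; the target is treated as a constant, as it is in formula (6) where it depends on the frozen parameters.

```python
def q_value(theta, s, a):
    """Linear action value: dot product of the weight row of action a with s."""
    return sum(w * x for w, x in zip(theta[a], s))


def td_loss(theta, s, a, target):
    """Squared error of formula (6) for a single transition."""
    return (target - q_value(theta, s, a)) ** 2


def gradient_step(theta, s, a, target, alpha):
    """One step of formula (7) for a single transition; target is held fixed."""
    error = target - q_value(theta, s, a)
    new_theta = [row[:] for row in theta]
    # Gradient of (target - theta_a . s)^2 with respect to theta_a is -2*error*s.
    new_theta[a] = [w + alpha * 2.0 * error * x for w, x in zip(theta[a], s)]
    return new_theta


if __name__ == "__main__":
    theta = [[0.0, 0.0, 0.0] for _ in range(4)]   # four actions, three inputs
    state = [0.5, 0.25, 1.0]                      # e.g. x, y, concentration
    before = td_loss(theta, state, 2, 1.0)
    theta = gradient_step(theta, state, 2, 1.0, alpha=0.1)
    print(before, td_loss(theta, state, 2, 1.0))
```

Only the weights of the action that was actually taken change, and the loss for that transition decreases after the step as long as the learning rate is small enough.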
The input of the network is defined as a sequence containing the position of the robot (in the x and y directions) and the concentration at the current position. A recurrent neural network processes the sequence, and its output is fed into a fully connected neural network to obtain the state value and reward at the next moment.
2.3 Gas Source Location Based on DDQN

The pseudo-code of gas source location using DDQN is as follows:

Algorithm 1: Double DQN for GSL
Init: D: empty replay buffer; θ: initial network parameters; θ⁻: copy of θ
Init: N_r: replay buffer maximum size; N_b: training batch size; N⁻: target network replacement frequency
for episode = 1, ..., M do
    Initialize frame sequence x = {}
    for t ∈ {0, 1, 2, ...} do
        Set state s = x, sample action a ∼ π_B
        Sample next frame x^t from environment ε given (s, a), receive reward r, and append x^t to x
        if |x| > N_r then delete oldest frame x_{t_min} from x end
        Set s′ = x, and add transition tuple (s, a, r, s′) to D, replacing the oldest tuple if |D| ≥ N_r
        Sample a minibatch of N_b tuples (s, a, r, s′) ∼ Unif(D)
        Construct target values, one for each of the N_b tuples:
        Define a^max(s′; θ) = arg max_{a′} Q(s′, a′; θ)
        y_i = r, if s′ is terminal; y_i = r + γ Q(s′, a^max(s′; θ); θ⁻), otherwise    (8)
        Do a gradient descent step with loss ‖y_i − Q(s, a; θ)‖²
        Replace target parameters θ⁻ = θ every N⁻ steps
    end
end
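Two bookkeeping pieces of Algorithm 1, the bounded replay buffer D and the periodic replacement of the target parameters, can be sketched as follows. The class and function names are illustrative; a done flag is stored with each tuple as an assumption, so that terminal transitions can be recognised when the targets of formula (8) are built.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size buffer D: once it is full, the oldest transition is dropped."""

    def __init__(self, max_size: int):
        self.data = deque(maxlen=max_size)

    def add(self, s, a, r, s_next, done):
        self.data.append((s, a, r, s_next, done))

    def sample(self, batch_size: int, rng=random):
        """Uniform minibatch without replacement, as in Unif(D)."""
        return rng.sample(list(self.data), batch_size)

    def __len__(self):
        return len(self.data)


def should_sync_target(step: int, replace_every: int) -> bool:
    """True on the steps where the target parameters are overwritten."""
    return step % replace_every == 0


if __name__ == "__main__":
    buf = ReplayBuffer(max_size=3)
    for i in range(5):
        buf.add(s=i, a=0, r=0.0, s_next=i + 1, done=False)
    print(len(buf), [t[0] for t in buf.data])
    print(buf.sample(2, random.Random(0)))
```

Using a deque with a maximum length gives the replace-the-oldest behaviour of Algorithm 1 without any explicit index arithmetic.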
The robot's actions correspond to the four movement directions. The next action is selected randomly with probability epsilon, and the action with the maximum Q value is selected with probability 1 − epsilon. An episode ends when the robot hits an obstacle or a wall, when the robot reaches the gas source or comes within 0.5 m of it, or when the number of movement steps exceeds the maximum.
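The action selection and termination rules just described can be sketched as follows. The four unit moves, the function names and the way a collision is reported are illustrative assumptions; only the epsilon-greedy rule and the 0.5 m radius come from the text.

```python
import math
import random

# Hypothetical encoding of the four directions as unit moves in +x, -x, +y, -y.
ACTIONS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def select_action(q_values, epsilon: float, rng=random) -> int:
    """Epsilon-greedy: random action with probability epsilon, greedy otherwise."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


def episode_done(pos, source, collided: bool, steps: int, max_steps: int,
                 radius: float = 0.5) -> bool:
    """Episode ends on collision, on reaching the source, or on step overrun."""
    reached = math.dist(pos, source) <= radius
    return collided or reached or steps > max_steps


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3, 0.2]
    print(ACTIONS[select_action(q, epsilon=0.0)])
    print(episode_done((1.0, 1.0), (1.3, 1.3), False, 10, 100))
```

A typical training loop would start with a large epsilon and reduce it over time, so that the robot explores widely at first and exploits the learned Q values later.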
3 Experiment

3.1 Simulation Experiment on Fluent

A three-dimensional enclosed space of 25 m × 15 m × 5 m is built in Fluent, as shown in Fig. 2. Seven air inlets and two air outlets are set: six inlets of 3.2 m × 3.2 m, one inlet of 3.6 m × 3.6 m, and two outlets of 3.6 m × 3.6 m. In addition, four obstacles of 4.5 m × 1.2 m × 1 m are placed in the space. A gas source with a radius of 1.5 m and a height of 0.5 m is placed at the bottom of the confined space and continuously releases carbon dioxide during the simulated diffusion process. The wind speed and direction of the gas source and the air inlets are then set, and the inlets are arranged in two symmetrically distributed groups for a control experiment. Figure 3 shows the gas diffusion under different conditions, i.e. the difference before and after diffusion with different air outlets. Wind speed and the choice of inlets and outlets have a great influence on gas diffusion.
3.2 Source Searching Experiment Based on DDQN

To train the neural network, the gas data are first gridded, and the parameters of the Q network are then trained with Adam as the optimizer. To help the network learn more thoroughly, we set a very small learning rate α and a large discount rate γ, and the action selection probability epsilon lets the robot explore more areas at the beginning of training. The hyperparameters used during training are listed in Table 1.

Fig. 2 Three-dimensional space diagram simulating gas diffusion

Fig. 3 Gas diffusion diagram at the same time under different conditions (panels a–d)

Table 1 Hyperparameters in the training process
α         γ       Memory_size   Memory_warmup_size   Batch_size
0.0005    0.99    200,000       200                  64

At the beginning of training, the robot quickly hits a wall, which terminates the search after only a few steps. After training, the compensation gradually decreases and converges, which means that the robot can now find the gas source efficiently.

In this paper, GADEN is used to simulate the quadruped source searching process. GADEN is a gas dispersion simulator widely used in the field of robot olfaction. We created a simulation platform for an olfactory quadruped robot using the GADEN project and implemented basic motion and perception functions in the simulator. The simulated robot measures gas concentration with a virtual metal oxide or photoionization detector and the flow vector with a virtual ultrasonic anemometer. To simulate gas source location, we modeled the scene in Fluent and imported the air flow data into GADEN for gas diffusion simulation (Fig. 4).
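The gridding step mentioned at the start of this subsection is not specified in detail. One plausible reading, sketched below as an assumption, is to bin the exported planar samples (x, y, concentration) into square cells and store the mean concentration of each cell; the function name and the cell size are hypothetical.

```python
from collections import defaultdict


def grid_concentration(samples, cell_size: float):
    """Average concentration per square grid cell.

    samples: iterable of (x, y, c) tuples in metres and concentration units.
    Returns a dict mapping the cell index (ix, iy) to the mean concentration.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, c in samples:
        key = (int(x // cell_size), int(y // cell_size))
        sums[key] += c
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


if __name__ == "__main__":
    points = [(0.2, 0.3, 1.0), (0.4, 0.1, 3.0), (1.2, 0.3, 5.0)]
    print(grid_concentration(points, cell_size=0.5))
```

A grid of this kind lets the agent look up the concentration at its current cell directly, which matches the position-plus-concentration input described for the network.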
Fig. 4 Quadruped source finding map based on GADEN
Fig. 5 Experiments on odor source finding in different environments
Once the simulation is complete, the GSL task can be tested on our simulation platform for olfactory quadruped robots. The experimental results show that, after the neural network is deployed on the quadruped robot, the robot can find the air source smoothly by following the plume. To verify the reliability of the algorithm, we set up several different environments in GADEN, changed the location of the air source, the starting point of the robot, and the air inlets and outlets, and repeated the source-location experiment. The results are shown in Fig. 5. It can be seen that the olfactory quadruped robot can still locate the source in an unfamiliar environment, which indicates that our algorithm adapts to different environments.
4 Conclusion An olfactory quadruped robot capable of detecting gas concentration and airflow to locate air sources in complex terrain is introduced in this paper. To enable air source detection in various environments, a DDQN-based air source location algorithm is proposed. The location data of obstacles and gas concentration are taken as inputs by the algorithm, which outputs the next movement direction for the robot. Only one strategy is trained and applied across different scenarios in the paper. It is confirmed by the experiment results that air sources can be located accurately and efficiently by the olfactory quadruped robot from random positions.
Although different obstacles are designed in our work, the task still belongs to two-dimensional source finding. Future work will focus on extending the DDQN-based air source location algorithm to more scenarios, such as higher-dimensional or dynamic-obstacle scenarios, so that the robot has better motion ability, the range of air source location scenarios is expanded, and the method has higher practical value.
References
1. Chen, X., Huang, J.: Odor source localization algorithms on mobile robots: a review and future outlook. Robot. Autonom. Syst. 112, 123–136 (2019)
2. Chen, X.: Research on Indoor Smoke Source Localization Algorithm Based on Wheeled Mobile Robot. Huazhong University of Science and Technology (2020)
3. Russell, R.A.: Comparing search algorithms for robotic underground chemical source location. Autonom. Robots 38, 49–63 (2015)
4. López, L.L., Vouloutsi, V., Chimeno, A.E., et al.: Moth-like chemo-source localization and classification on an indoor autonomous robot. In: On Biomimetics. IntechOpen (2011)
5. Liu, Y., Jiang, Y., Zhang, X., et al.: An improved grey wolf optimizer algorithm for identification and location of gas emission. J. Loss Prevent. Process Industr. 105003 (2023)
6. Júnior, D.A.D., da Cruz, L.B., Diniz, J.O.B., et al.: Detection of potential gas accumulations in 2D seismic images using spatio-temporal, PSO, and convolutional LSTM approaches. Expert Syst. Appl. 215, 119337 (2023)
7. Zhang, Y.: Design and Implementation of Forest Fire Prevention and Control System Based on LoRa-enabled Drone Clusters. Hangzhou Dianzi University (2020)
8. Martinez, D.: On the right scent. Nature 445(7126), 371–372 (2007)
9. Lu, Q., Han, Q.L., Xie, X., et al.: A finite-time motion control strategy for odor source localization. IEEE Trans. Industr. Electron. 61(10), 5419–5430 (2014)
10. Wang, X., Shan, K.: Design of small gas source locating mobile robot based on fuzzy logic. Machine Tool Hydraul. 47(11), 7–11 (2019)
11. Hu, H., Song, S., Chen, C.L.P.: Plume tracing via model-free reinforcement learning method. IEEE Trans. Neural Netw. Learn. Syst. 30(8), 2515–2527 (2019)
12. Hasselt, H.: Double Q-learning. Adv. Neural Inf. Process. Syst. 23 (2010)
13. Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double q-learning. Proc. AAAI Conf. Artif. Intell. 30(1) (2016)
14. Miao, Y., Wang, Y., Zhang, J.: New strategies based on improved fruit fly optimization algorithm for unknown indoor odor source location. In: 2020 IEEE International Conference on Real-time Computing and Robotics (RCAR), pp. 297–303 (2020)
A Tracking Loop Method for Parallel Receiver with Low Interaction Overhead Yimin Fan, Yi Zhang, Liu Liu, Jing Sun, Ting Li, and Tian Liu
Abstract This paper proposes a low-interaction-cost parallel tracking method for receiver loops, which addresses the poor flexibility of serial tracking loops and the high interaction overhead of parallel tracking loops. The proposed method parallelizes the carrier and code tracking loops and introduces a local prediction method to reduce the interaction frequency between the loops. While maintaining the tracking performance of the receiver loops, the method enhances the flexibility of receiver loop deployment and reduces resource overhead. Keywords Receiver · Telemetry tracking and command (TT&C) · Carrier tracking loop · Code tracking loop
1 Introduction In TT&C systems, receiver loops are typically used to continuously track dynamic control signals. Without continuous dynamic adjustment of the local carrier or code, the captured signal quickly loses lock. The task of the tracking loop is to adjust these parameters in real time so that the carrier and code components of the control signal are tracked continuously [1–4]. Traditional TT&C receiver loops consist of carrier tracking loops and code tracking loops [5, 6], which are usually processed in series and interact with each other at every sampling point. However, TT&C ground system architectures are moving towards cloud-based architectures built on network and virtualization technologies [7]. In a cloud-based architecture, control baseband processing software is deployed on a cloud platform built from general-purpose servers. When the signal processing module needs to be deployed on multiple servers, CPUs, or cores, the flexibility of serial loop processing is limited, and it cannot be combined well with the dynamic resource scheduling and anti-destruction features of the cloud platform. Existing research on parallel tracking is rare, and the schemes that exist still let the loops interact at sampling points. This type of design leads to large interaction overhead between loops and cannot fully utilize the cloud platform's capability to process multiple signals in real time. The major contribution of this paper is a new receiver tracking loop method that addresses the poor deployment flexibility of existing receiver loops and the high interaction overhead between loops, while maintaining the tracking performance of the receiver in a cloud-based TT&C architecture.
Y. Fan (B) · Y. Zhang · L. Liu · J. Sun · T. Li · T. Liu Southwest China Institute of Electronic Technology, Chengdu 610036, China e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_45
2 Carrier Tracking Loop Phase-locked loop (PLL) is a commonly used circuit in communication systems for tracking the phase of the input signal’s carrier [8, 9]. The basic structure of a PLL includes a phase discriminator, loop filter, and a voltage-controlled oscillator (VCO) or a numerically-controlled oscillator (NCO). Although the goal of the PLL is to align the phase, the input signal of the PLL is not directly the phase information but rather a modulated signal containing both information and noise. One of the challenges of the PLL is to extract the required phase information from the complex input signal. To address this issue, the phase discriminator calculates the phase difference between the input and output signals, the loop filter filters out the noise from the input signal, and the NCO is responsible for reproducing the carrier signal.
2.1 Phase Discriminator The purpose of a carrier phase discriminator is to determine the carrier phase difference between the input and output signals. There are various algorithms for this, and one classic example is the arctan discriminator, whose working principle is described here. Assuming that the sequence obtained after modulation of the binary code is $m(t)$, the signal at the receiving end can be expressed as
$$ s(t) = m(t)\sin(w_1 t + \theta_1) + n(t), \tag{1} $$
where $w_1$ and $\theta_1$ are the carrier frequency and initial phase of the input signal $s(t)$, respectively, and $n(t)$ denotes noise. Multiplying the received signal by the local replica cosine and sine signals yields
$$
\begin{aligned}
q(t) &= s(t)\cos(w_2 t + \theta_2) \\
&= m(t)\sin(w_1 t + \theta_1)\cos(w_2 t + \theta_2) + n_I(t) \\
&= \tfrac{1}{2} m(t)\left[\sin((w_1 + w_2)t + \theta_1 + \theta_2) + \sin((w_1 - w_2)t + \theta_1 - \theta_2)\right] + n_I(t), \\
i(t) &= s(t)\sin(w_2 t + \theta_2) \\
&= m(t)\sin(w_1 t + \theta_1)\sin(w_2 t + \theta_2) + n_Q(t) \\
&= \tfrac{1}{2} m(t)\left[-\cos((w_1 + w_2)t + \theta_1 + \theta_2) + \cos((w_1 - w_2)t + \theta_1 - \theta_2)\right] + n_Q(t),
\end{aligned} \tag{2}
$$
where $w_2$ and $\theta_2$ are the carrier frequency and initial phase of the replica signals, respectively. Low-pass filtering $i(t)$ and $q(t)$ removes the sum-frequency terms, which gives
$$ Q(t) = \tfrac{1}{2} m(t)\sin((w_1 - w_2)t + \theta_1 - \theta_2), \tag{3} $$
$$ I(t) = \tfrac{1}{2} m(t)\cos((w_1 - w_2)t + \theta_1 - \theta_2). \tag{4} $$
The phase error can then be obtained as
$$ \phi(t) = (w_1 - w_2)t + \theta_1 - \theta_2 = \arctan\frac{Q(t)}{I(t)}. \tag{5} $$
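As a small numerical illustration of Eqs. (3) to (5), the sketch below builds noise-free I and Q samples for a given phase error and recovers that error with the arctangent discriminator. The function names are illustrative assumptions; noise and the low-pass filter itself are not modeled.

```python
import math


def iq_after_lowpass(m, phase_error):
    """Noise-free I and Q from Eqs. (3)-(4) for a data symbol m in {+1, -1}."""
    i_val = 0.5 * m * math.cos(phase_error)
    q_val = 0.5 * m * math.sin(phase_error)
    return i_val, q_val


def arctan_discriminator(i_val, q_val):
    """Two-quadrant arctangent discriminator of Eq. (5): phi = arctan(Q / I).

    Valid for |phi| < pi/2; I = 0 (a phase error of exactly +-pi/2) is not handled.
    """
    return math.atan(q_val / i_val)


if __name__ == "__main__":
    for symbol in (+1, -1):
        i_val, q_val = iq_after_lowpass(symbol, 0.3)
        print(symbol, arctan_discriminator(i_val, q_val))
```

Because $m(t)$ multiplies both $I(t)$ and $Q(t)$, it cancels in the ratio, so the estimate does not depend on the sign of the data symbol.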
2.2 Loop Characteristics Filters can be classified according to the order of their transfer functions. Once the structure of the filter is determined, the s-domain model of the entire carrier tracking loop is also established. In this section, we briefly introduce the transfer function of a commonly used first-order filter, along with the corresponding loop transfer function. A linear model of the PLL is illustrated in Fig. 1; it is better suited for analysis and can still serve as the basis for performance prediction.
Fig. 1 Block diagram of a PLL using Laplace transform
The second-order PLL system consists of a first-order loop filter and an NCO, whose transfer functions are
$$ F(s) = \frac{\tau_2 s + 1}{\tau_1 s}, \tag{6} $$
$$ N(s) = \frac{K_o}{s}, \tag{7} $$
with the NCO gain $K_o$. Therefore, we can derive that
$$ G(s) = \frac{K_d F(s) N(s)}{1 + K_d F(s) N(s)}, \tag{8} $$
which is the transfer function of a second-order PLL system with the gain $K_d$ of the phase discriminator. Substituting (6) and (7) into (8) gives
$$ G(s) = \frac{2\varepsilon\eta s + \eta^2}{s^2 + 2\varepsilon\eta s + \eta^2}, \tag{9} $$
where the natural frequency is $\eta = \sqrt{K_o K_d / \tau_1}$ and the damping ratio is $\varepsilon = \tau_2 \eta / 2$.
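The relation between the block-level description (6) to (8) and the standard second-order form with denominator $s^2 + 2\varepsilon\eta s + \eta^2$ can be checked numerically. The sketch below evaluates both expressions at an arbitrary complex frequency; the parameter values and function names are arbitrary assumptions chosen only for the check.

```python
import math


def pll_parameters(tau1, tau2, k_o, k_d):
    """Natural frequency eta and damping ratio eps of the second-order PLL."""
    eta = math.sqrt(k_o * k_d / tau1)
    eps = tau2 * eta / 2.0
    return eta, eps


def closed_loop_from_blocks(s, tau1, tau2, k_o, k_d):
    """G(s) built from the loop filter F(s), the NCO N(s) and the gain K_d."""
    f = (tau2 * s + 1.0) / (tau1 * s)   # Eq. (6)
    n = k_o / s                          # Eq. (7)
    open_loop = k_d * f * n
    return open_loop / (1.0 + open_loop)  # Eq. (8)


def closed_loop_standard(s, eta, eps):
    """Standard second-order form with numerator 2*eps*eta*s + eta^2."""
    return (2.0 * eps * eta * s + eta ** 2) / (s ** 2 + 2.0 * eps * eta * s + eta ** 2)


if __name__ == "__main__":
    tau1, tau2, k_o, k_d = 0.02, 0.1, 1.0, 5.0   # illustrative values only
    eta, eps = pll_parameters(tau1, tau2, k_o, k_d)
    s = 0.3 + 2.0j
    print(closed_loop_from_blocks(s, tau1, tau2, k_o, k_d))
    print(closed_loop_standard(s, eta, eps))
```

Both evaluations agree to rounding error, which confirms that $\eta^2 = K_o K_d/\tau_1$ and $2\varepsilon\eta = \eta^2\tau_2$.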
3 Code Tracking Loop The PLL introduced in the previous section can track the carrier phase and remove the carrier from the received signal, which is sufficient for information codes that have not undergone spread-spectrum processing. However, in practical engineering scenarios, the received signal is often an information sequence that has already been subjected to spread-spectrum processing. In such cases, not only is the carrier phase at the receiver offset, but the spreading sequence must also be aligned. Therefore, to accurately remove the spread-spectrum code, a code tracking loop structure must be added, as illustrated in Fig. 2. The received signal is multiplied by the cosine and sine signals of the local carrier to obtain the IQ signals. The IQ signals are then correlated with the three versions of the spreading code, and the resulting six signals $Q_E$, $Q_P$, $Q_L$, $I_E$, $I_P$, and $I_L$ are integrated and filtered to calculate the phase error between the code chips in the received signal and the locally reproduced spreading code. Finally, the error is filtered by a loop filter to remove noise, and the code NCO generator is adjusted until the phase error is zero, indicating successful reproduction of the spreading code. In this paper, the quasi-coherent dot product power form of the discriminator is adopted in the code tracking loop, which can be written as
$$ \delta_{cp} = \frac{1}{2}\,\frac{\sqrt{I_E^2 + Q_E^2} - \sqrt{I_L^2 + Q_L^2}}{\sqrt{I_E^2 + Q_E^2} + \sqrt{I_L^2 + Q_L^2}}, \tag{10} $$
Fig. 2 A block diagram of the code tracking loop
where $Q_E$, $Q_L$, $I_E$, $I_L$ are the four correlator outputs shown in Fig. 2. The main difference between the code and carrier tracking loops lies in the implementation of the discriminators. The analysis methods of loop filters and loop transfer functions for both types of loops can be found in Sect. 2.
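A minimal sketch of the normalized early-minus-late discriminator of Eq. (10) is given below, under the assumption that the early and late terms are envelopes, that is, square roots of $I^2 + Q^2$. The function name and the zero-energy guard are illustrative assumptions.

```python
import math


def eml_envelope_discriminator(i_e, q_e, i_l, q_l):
    """Normalized early-minus-late envelope discriminator, Eq. (10).

    Returns a value in [-0.5, 0.5]; zero means the early and late
    correlator envelopes are balanced, i.e. the local code is aligned.
    """
    early = math.hypot(i_e, q_e)   # sqrt(I_E^2 + Q_E^2)
    late = math.hypot(i_l, q_l)    # sqrt(I_L^2 + Q_L^2)
    total = early + late
    if total == 0.0:
        # No correlation energy at all: report no error rather than divide by zero.
        return 0.0
    return 0.5 * (early - late) / total


if __name__ == "__main__":
    print(eml_envelope_discriminator(3.0, 4.0, 3.0, 4.0))  # balanced correlators
    print(eml_envelope_discriminator(3.0, 4.0, 0.0, 0.0))  # all energy in the early branch
```

The normalization by the sum of the envelopes makes the output independent of the received signal amplitude, and using envelopes makes it insensitive to residual carrier phase.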
4 Illustrative Simulations The received digital baseband signal, with frequency and code offsets of 50 Hz each, is input simultaneously into a carrier tracking loop and a code tracking loop, where the C/N0 is 40 dB·Hz and the sampling frequency $f_s$ is 30 MHz, as shown in Fig. 3. The digital baseband signal in the carrier tracking loop is split into I and Q branches, which are integrated and then input into the carrier phase discriminator, which uses the two-quadrant arctangent function of Eq. (5). The carrier phase discriminator detects the phase error between the locally generated carrier signal and the input signal during integration. The phase error is then filtered by the carrier loop filter and fed back to the NCO carrier generator, which adjusts the phase and frequency of the locally generated carrier to maintain coherence with the received signal. In the code tracking loop, the digital baseband signal is also split into I and Q branches, and each branch is further split into three branches, E (early), P (prompt), and L (late), for integration. After integration, the results are input into the code phase discriminator, which uses the normalized non-coherent early-minus-late (EML) discriminator of Eq. (10). The code phase discriminator detects the phase error between the locally generated code and the input signal during integration. The phase error is then filtered by the code loop filter and fed back to the code NCO generator, which adjusts the phase and frequency of the locally generated code to maintain alignment with the received signal.
Fig. 3 Parallel tracking loop implementation principle diagram
The NCO carrier generator of the carrier tracking loop produces data that is sent to the NCO carrier prediction unit of the code tracking loop, enabling initial synchronization of the carrier component of the input signal in the code tracking loop. At the same time, the data generated by the code NCO generator in the code tracking loop is sent to the code NCO prediction unit of the carrier tracking loop, which synchronizes the pseudorandom code of the input signal in the carrier tracking loop and completes the first inter-loop data exchange. After the first inter-loop data exchange is completed, the two loops run independently in parallel. On one hand, because the output of the code NCO generator is approximately linear, as shown in Fig. 4, during the subsequent non-interaction period the code NCO prediction unit of the carrier tracking loop uses the first interaction value and linear interpolation to produce the code P (prompt) component. On the other hand, because the output of the NCO carrier generator is approximately linear, as shown in Fig. 5, the NCO carrier prediction unit of the code tracking loop similarly uses the first interaction value and linear interpolation to produce the locally reproduced carrier.
Fig. 4 Output curve of the code NCO generator
Fig. 5 Output curve of the NCO carrier generator
According to the characteristics of the loop, after running for an appropriate amount of time, the second, third, and subsequent inter-loop data exchanges are completed to correct the error accumulated by the two independent loops during parallel operation. The initial inter-loop interaction time for the parallel loop is set to 1 ms, with subsequent inter-loop intervals of 10 ms. The reference group is the conventional serial tracking loop. The convergence curves of the carrier and code tracking loops obtained through simulation are compared in Figs. 6 and 7. It can be observed that, under low-interaction parallel tracking, the proposed method converges to the 50 Hz frequency and code offsets as well as the serial reference loop does.
Fig. 6 Comparison of convergence curves for carrier tracking loops
Fig. 7 Comparison of convergence curves for code tracking loops
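The local prediction step can be sketched as a linear extrapolation that is refreshed at each inter-loop exchange. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the idea that each exchange carries both an NCO output value and its rate of change is an assumption, since the text only states that the exchanged value and a linear model are used.

```python
class LinearNcoPredictor:
    """Predicts the other loop's NCO output between inter-loop exchanges."""

    def __init__(self):
        self.t_ref = None      # time of the most recent exchange, in seconds
        self.value_ref = None  # NCO output received at that exchange
        self.rate = 0.0        # assumed rate of change of the NCO output

    def exchange(self, t, value, rate):
        """Store the data received from the other loop at time t."""
        self.t_ref, self.value_ref, self.rate = t, value, rate

    def predict(self, t):
        """Linear prediction of the NCO output at time t >= t_ref."""
        if self.t_ref is None:
            raise RuntimeError("no inter-loop exchange has happened yet")
        return self.value_ref + self.rate * (t - self.t_ref)


if __name__ == "__main__":
    predictor = LinearNcoPredictor()
    # First exchange at 1 ms, next one 10 ms later, as in the simulation settings.
    predictor.exchange(t=0.001, value=0.05, rate=50.0)
    for k in range(1, 11):
        t = 0.001 + k * 0.001
        print(round(t, 3), predictor.predict(t))
```

Between exchanges, each loop only evaluates `predict`, so no data crosses the loop boundary; the prediction error accumulated over the 10 ms interval is reset at the next call to `exchange`.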
5 Conclusions In conclusion, this paper presents a parallel tracking method for receiver loops by combining the carrier and code tracking loops. The proposed method utilizes linear interpolation to reduce the frequency of inter-loop interaction, which provides a low-interaction cost solution for receiver loop parallel tracking. On one hand, by separately inputting the received digital baseband signal to the carrier and code tracking loops for tracking, the proposed method enhances the flexibility of loop deployment and improves processing parallelism. On the other hand, during parallel operation, the proposed method adopts local prediction to reduce the frequency of interaction between the carrier and code tracking loops, which reduces the resource overhead caused by high inter-loop interaction frequency.
References
1. Yang, R., Huang, J., Zhan, X., et al.: Decentralized FLL-assisted PLL design for robust GNSS carrier tracking. IEEE Commun. Lett. 25(10), 3379–3383 (2021)
2. Yang, C., Zheng, Z., Fang, Z., et al.: A super-sensitivity photoacoustic receiver system-on-chip based on coherent detection and tracking. IEEE Trans. Biomed. Circuits Syst. 15(3), 454–463 (2021)
3. Hang, R., Zhang, L., Luo, Y., et al.: GNSS carrier phase tracking with discrete wavelet transform filtering under ionospheric scintillation. IEEE Commun. Lett. 21(2), 394–397 (2017)
4. Xu, B., Jia, Q., Hsu, L.T.: Vector tracking loop-based GNSS NLOS detection and correction: algorithm design and performance analysis. IEEE Trans. Instrumentation Measure. 69(7), 4604–4619 (2019)
5. Yang, R., Huang, J., Zhan, X., et al.: Decentralized FLL-assisted PLL design for robust GNSS carrier tracking. IEEE Commun. Lett. 25(10), 3379–3383 (2021)
6. Luo, Z., Ding, J., Zhao, L.: Adaptive gain control method of a phase-locked loop for GNSS carrier signal tracking. Int. J. Antennas Propag. 1–14 (2018)
7. Chen, G., Tong, X., Du, W.B., et al.: Application of cloud computing key technology in aerospace TT&C 31(1), 217–228 (2022)
8. Kumar, G.: FM receiver design using programmable PLL. Wirel. Pers. Commun. Int. J. 97(1), 773–787 (2017)
9. Cong, S.: An adaptive INS-aided PLL tracking method for GNSS receivers in harsh environments. SIAM J. Appl. Dyn. Syst. 16(2) (2016)
Three-Channel Decoupling Control for Fighter Aircraft Based on Prescribed Performance Control and Linear Active Disturbance Rejection Control Pengfei Li, Zhaotao Ke, Yuehui Ji, and Junjie Liu
Abstract In this paper, we propose a decoupling control method based on prescribed performance control (PPC) and linear active disturbance rejection control (LADRC) to solve the problem of high uncertainty and strong coupling during high angle of attack flight. The angle of attack, sideslip angle and bank angle around the velocity are taken as the controlled variables, and three independent controllers are designed. The unmodeled part of the system, the uncertainty and the coupling terms are treated as a total disturbance, which is estimated by an extended state observer and then compensated in the control law. PPC is used to improve the transient and steady-state performance, so that the controllers achieve rapid convergence of the tracking error. The simulation results show that the control method can track the control commands quickly and stably with good performance. Keywords Linear active disturbance rejection control · Three-channel · High angle of attack · Decoupling control · Prescribed performance
1 Introduction High angle of attack performance is one of the main characteristics that determine whether a fighter can dominate in air combat. However, during high angle of attack flight, the attitude angles of the fighter change rapidly, and its dynamics exhibit strong coupling, uncertainty and nonlinearity, which makes the fighter more difficult to control. In this case, nonlinear control is usually used, such as robust control [1, 2], nonlinear dynamic inversion control [3, 4] and sliding mode control [5, 6]. Robust control provides high stability and robustness under modeling uncertainty, but its steady-state accuracy is weak. Nonlinear dynamic inversion relies strongly on the accuracy of the model. Han [7] proposed the model-free active disturbance rejection control (ADRC) method, which consists of three main components: the Tracking Differentiator (TD), the Nonlinear State Error Feedback control law (NLSEF), and the Extended State Observer (ESO), where the ESO is the core of the method. The ESO treats the total disturbance (including coupling terms between individual channels, modeling uncertainty, external disturbance, etc.) as an extended state for real-time estimation and compensation, so that the relationship between the output and the input is approximated by a chain of integrators, thus achieving decoupling. Gao [8] proposed linear active disturbance rejection control by linearizing ADRC, and proposed bandwidth parameterization [9] to simplify parameter tuning, which promoted the application and development of this control method [10–13]. In [14], a robust adaptive controller for multi-input multi-output nonlinear systems was proposed to guarantee a prescribed performance. Tan [15] proposed a backstepping control method with fixed-time prescribed performance. PPC can improve the transient and steady-state performance of the system by limiting the convergence rate and convergence interval of the tracking error. In this paper, we adopt a linear extended state observer (LESO) to estimate the strong coupling between channels during high angle of attack flight. While the LADRC tracks the attitude angles, the aerodynamic control surfaces still have a large control margin. To obtain better transient and steady-state performance, we use PPC to make full use of the residual efficiency of the control surfaces. The remaining parts are organized as follows. Section 2 presents the nonlinear model of the fighter aircraft. Section 3 presents the design principles for LADRC and PPC in the three channels. The simulation results and the conclusion are given in Sects. 4 and 5.
P. Li · Z. Ke · Y. Ji · J. Liu The School of Electrical Engineering and Automation, Tianjin University of Technology, Tianjin 300384, China Tianjin Key Laboratory for Control Theory and Applications in Complicated Industry Systems, Tianjin 300384, China e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_46
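2 Nonlinear Model of Fighter Aircraft The model of the fighter aircraft in the airflow coordinate system is obtained from the transformation relationship between the airframe coordinate system and the airflow coordinate system [16]. The nonlinear model can be expressed as:
$$ \dot p = \frac{I_{zz}(l_a + l_T) + I_{xz}(n_a + n_T)}{I_{xx}I_{zz} - I_{xz}^2} + \frac{I_{xz}(I_{xx} - I_{yy} + I_{zz})}{I_{xx}I_{zz} - I_{xz}^2}\,pq + \frac{I_{zz}(I_{yy} - I_{zz}) - I_{xz}^2}{I_{xx}I_{zz} - I_{xz}^2}\,qr \tag{1} $$
$$ \dot q = \frac{(m_a + m_T) + (I_{zz} - I_{xx})\,pr + I_{xz}(r^2 - p^2)}{I_{yy}} \tag{2} $$
$$ \dot r = \frac{I_{xz}(l_a + l_T) + I_{xx}(n_a + n_T)}{I_{xx}I_{zz} - I_{xz}^2} + \frac{I_{xx}(I_{xx} - I_{yy}) + I_{xz}^2}{I_{xx}I_{zz} - I_{xz}^2}\,pq - \frac{I_{xz}(I_{xx} - I_{yy} + I_{zz})}{I_{xx}I_{zz} - I_{xz}^2}\,qr \tag{3} $$
$$ \dot V = \frac{-D + Y\sin\beta - mg\sin\gamma}{m} + \frac{T_x\cos\beta\cos\alpha + T_y\sin\beta + T_z\cos\beta\sin\alpha}{m} \tag{4} $$
$$ \dot\alpha = q - \tan\beta\,(p\cos\alpha + r\sin\alpha) + \frac{T_z\cos\alpha - T_x\sin\alpha - L}{mV\cos\beta} + \frac{mg\cos\gamma\cos\mu}{mV\cos\beta} \tag{5} $$
$$ \dot\beta = p\sin\alpha - r\cos\alpha + \frac{1}{mV}\left(Y\cos\beta + mg\cos\gamma\sin\mu\right) + \frac{1}{mV}\left(-T_x\sin\beta\cos\alpha + T_y\cos\beta - T_z\sin\alpha\sin\beta\right) \tag{6} $$
$$ \dot\gamma = \frac{1}{mV}\left(L\cos\mu - Y\sin\mu - mg\cos\gamma\right) + \frac{T_x}{mV}\left(\sin\mu\sin\beta\cos\alpha + \cos\mu\sin\alpha\right) - \frac{T_y}{mV}\sin\mu\cos\beta + \frac{T_z}{mV}\left(\sin\mu\sin\beta\sin\alpha - \cos\mu\cos\alpha\right) \tag{7} $$
$$ \dot\chi = \frac{1}{mV\cos\gamma}\left(L\sin\mu + Y\cos\beta\cos\mu\right) - \frac{T_z}{mV\cos\gamma}\left(\cos\mu\sin\beta\sin\alpha + \sin\mu\cos\alpha\right) + \frac{T_y}{mV\cos\gamma}\cos\mu\cos\beta + \frac{T_x}{mV\cos\gamma}\left(\sin\mu\sin\alpha - \cos\mu\sin\beta\cos\alpha\right) \tag{8} $$
$$
\begin{aligned}
\dot\mu ={}& \frac{p\cos\alpha + r\sin\alpha}{\cos\beta} + \frac{Y + T_y}{mV}\tan\gamma\cos\mu\cos\beta + \frac{L}{mV}\left(\tan\gamma\sin\mu + \tan\beta\right) \\
&+ \frac{T_x\sin\alpha - T_z\cos\alpha}{mV}\left(\tan\gamma\sin\mu + \tan\beta\right) - \frac{T_x\cos\alpha + T_z\sin\alpha}{mV}\tan\gamma\cos\mu\sin\beta - \frac{g}{V}\cos\gamma\cos\mu\tan\beta
\end{aligned} \tag{9}
$$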
Fig. 1 Diagram of the aircraft model
where $m$ is the mass of the aircraft; $V$ is the flight velocity; $p$, $q$, $r$ are the roll rate, pitch rate and yaw rate, respectively; $\alpha$ and $\beta$ are the angle of attack and the sideslip angle, respectively; $l_a$, $m_a$, $n_a$ are the roll, pitch and yaw aerodynamic moments, respectively; $l_T$, $m_T$, $n_T$ are the roll, pitch and yaw moments provided by the thrust vector, respectively; $L$, $D$, $Y$ are the lift, drag and lateral force, respectively; $T_x$, $T_y$, $T_z$ are the components of the thrust vector along the three axes of the body-fixed reference frame; $I_{xx}$, $I_{yy}$, $I_{zz}$ are the roll, pitch and yaw moments of inertia, respectively; and $I_{xz}$ is the product of inertia between the x and z axes of the body-fixed reference frame. Due to the symmetry of the mass distribution of the airframe, $I_{xy}$ and $I_{yz}$ are equal to 0. $\gamma$ is the flight path angle, $\chi$ is the velocity heading angle, and $\mu$ is the bank angle around the velocity. In this paper, we assume that $T_x$ is equal to the thrust force, so $T_y$ and $T_z$ are equal to 0. The coordinate systems of the aircraft model are shown in Fig. 1, where $OX_BY_BZ_B$ is the body-fixed reference frame, $OX_EY_EZ_E$ is the earth-fixed reference frame, and $OX_WY_WZ_W$ is the wind-axes reference frame. The objective is to drive the attitude angles $\alpha$, $\beta$ and $\mu$ to their desired values $\alpha_c$, $\beta_c$ and $\mu_c$. The constraints on the states and control inputs should be satisfied to obtain better system performance under high uncertainty and strong coupling.
3 Design of LADRC Controller Due to the strong coupling between the three channels of the aircraft, LADRC controllers are designed in the three channels separately to eliminate the strong coupling between the three channels and to eliminate the uncertainty and unmodeled parts to achieve decoupled control of the three channels. The three-channel LADRC controller structure design is shown in Fig. 2. α, β and μ are selected as controlled variables and the deflections of elevator, rudder and aileron (δe , δr , δa ) are used as the control quantities. αc , βc and μc are the control commands for angle of attack, the sideslip angle and the bank angle around the velocity, respectively.
Fig. 2 Three-channel LADRC design structure
3.1 Design of PPC The prescribed performance method constrains the system performance by designing a performance function, which can be expressed by the following inequality:
$$ -Q\,p(t) < e(t) < p(t) \tag{10} $$
where $0 < Q \le 1$, $e(t)$ denotes the tracking error of the controlled variable and $p(t)$ is the performance function. The performance function is positive and strictly decreasing. It is designed as follows:
$$ p(t) = (p_0 - p_\infty)\,e^{-lt} + p_\infty \tag{11} $$
where $p_0$ restricts the overshoot, $p_\infty = \lim_{t\to\infty} p(t)$ restricts the steady-state error, and $l$ determines the convergence rate of the tracking error. $p_0$ and $p_\infty$ are positive constants. The constrained tracking error is converted to the unconstrained error $\varepsilon(t)$ by the following equation:
$$ e(t) = p(t)\,Z(\varepsilon(t)), \quad \forall t \ge 0 \tag{12} $$
$Z(\varepsilon(t))$ is chosen as a strictly increasing function as follows:
$$ Z(\varepsilon(t)) = \frac{e^{\varepsilon(t)} - e^{-\varepsilon(t)}}{e^{\varepsilon(t)} + e^{-\varepsilon(t)}} \tag{13} $$
where $\varepsilon(t)$ is the conversion error, which can be obtained by:
$$ \varepsilon(t) = \frac{1}{2}\ln\frac{\tau + Q}{Q(1 - \tau)} \tag{14} $$
where $\tau = e(t)/p(t)$. For $Q = 1$, (14) reduces to $\varepsilon = \operatorname{artanh}\tau$, the inverse of (13). If $\varepsilon(t)$ is bounded, $e(t)$ is bounded and meets the prescribed performance.
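The performance function and the error conversion of this subsection can be sketched in a few lines. The pitch-channel values $p_0 = 0.6806$, $p_\infty = 0.02$, $l = 0.51$ and $Q = 1$ in the usage example are taken from Table 1; the function names and the range check are illustrative assumptions.

```python
import math


def performance_bound(t, p0, p_inf, l):
    """Performance function p(t) = (p0 - p_inf) * exp(-l t) + p_inf, Eq. (11)."""
    return (p0 - p_inf) * math.exp(-l * t) + p_inf


def transformed_error(e, p, q):
    """Conversion error of Eq. (14) for tracking error e and bound p = p(t).

    Defined only while -q * p < e < p, i.e. while the error is inside
    the prescribed envelope of Eq. (10).
    """
    tau = e / p
    if not (-q < tau < 1.0):
        raise ValueError("tracking error is outside the prescribed bounds")
    return 0.5 * math.log((tau + q) / (q * (1.0 - tau)))


if __name__ == "__main__":
    # Pitch-channel parameters from Table 1 (angles in radians).
    p0, p_inf, l, q = 0.6806, 0.02, 0.51, 1.0
    for t in (0.0, 2.0, 10.0):
        p = performance_bound(t, p0, p_inf, l)
        print(t, p, transformed_error(0.5 * p, p, q))
```

The conversion error grows without bound as the tracking error approaches either edge of the envelope, which is why keeping $\varepsilon(t)$ bounded keeps $e(t)$ strictly inside the prescribed bounds.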
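3.2 Design of LADRC for the Pitch Channel From (5) and (2), we can obtain:
$$
\begin{aligned}
\ddot\alpha &= \dot q + \dot f_\alpha = \frac{m_T + (I_{zz} - I_{xx})\,pr + I_{xz}(r^2 - p^2) + m_a}{I_{yy}} + \dot f_\alpha \\
&= \dot f_\alpha + \frac{m_T + (I_{zz} - I_{xx})\,pr + I_{xz}(r^2 - p^2)}{I_{yy}} + \frac{m_a}{I_{yy}} - b_\alpha\delta_e + (b_\alpha - b_{\alpha0})\,\delta_e + b_{\alpha0}\delta_e \\
&= F_\alpha + b_{\alpha0}\delta_e
\end{aligned} \tag{15}
$$
$$ f_\alpha = \frac{mg\cos\gamma\cos\mu}{mV\cos\beta} + \frac{T_z\cos\alpha - T_x\sin\alpha - L}{mV\cos\beta} - \tan\beta\,(p\cos\alpha + r\sin\alpha) \tag{16} $$
$$ F_\alpha = \dot f_\alpha + \frac{m_T + (I_{zz} - I_{xx})\,pr + I_{xz}(r^2 - p^2)}{I_{yy}} + \frac{m_a}{I_{yy}} - b_\alpha\delta_e + (b_\alpha - b_{\alpha0})\,\delta_e \tag{17} $$
where $F_\alpha$ is regarded as the total disturbance, which is estimated and compensated by the LESO. $b_{\alpha0}$ and $b_\alpha$ are the set and actual elevator gains, respectively. Because the expression for $\alpha$ is a second-order differential equation, a third-order LESO is required. The third-order LESO is as follows:
$$ \begin{bmatrix}\dot z_{\alpha1}\\ \dot z_{\alpha2}\\ \dot z_{\alpha3}\end{bmatrix} = \begin{bmatrix}-\beta_{\alpha1} & 1 & 0\\ -\beta_{\alpha2} & 0 & 1\\ -\beta_{\alpha3} & 0 & 0\end{bmatrix}\begin{bmatrix}z_{\alpha1}\\ z_{\alpha2}\\ z_{\alpha3}\end{bmatrix} + \begin{bmatrix}0 & \beta_{\alpha1}\\ b_{\alpha0} & \beta_{\alpha2}\\ 0 & \beta_{\alpha3}\end{bmatrix}\begin{bmatrix}u_e\\ \alpha\end{bmatrix} \tag{18} $$
$$ [\beta_{\alpha1}, \beta_{\alpha2}, \beta_{\alpha3}]^T = [3\omega_{\alpha0},\ 3\omega_{\alpha0}^2,\ \omega_{\alpha0}^3]^T \tag{19} $$
where $z_{\alpha1}$, $z_{\alpha2}$ and $z_{\alpha3}$ are the estimates of the output $\alpha$, the derivative of $\alpha$ and the total disturbance $F_\alpha$, respectively; $\omega_{\alpha0}$ is the adjustable parameter (bandwidth) of the LESO; and $u_e$ is the output of the controller and the input of the elevator actuator. With the state observer properly designed, the control law is given by:
$$ u_e = \frac{-z_{\alpha3} + u_{e0}}{b_{\alpha0}} \tag{20} $$
Ignoring the estimation error in $z_{\alpha3}$, the system is reduced to a unit-gain double integrator:
$$ \ddot\alpha = F_\alpha + b_{\alpha0} u_e \approx u_{e0} \tag{21} $$
which is easily controlled with a proportional plus derivative controller:
$$ u_{e0} = k_{\alpha p}\,\varepsilon_\alpha(t) - k_{\alpha d}\,z_{\alpha2} \tag{22} $$
where $\varepsilon_\alpha(t)$ is the conversion error of $(\alpha_c - z_{\alpha1})$, and $k_{\alpha p}$ and $k_{\alpha d}$ are the adjustable parameters of the control law in the pitch channel.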
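To make the observer and control law concrete, the sketch below simulates a third-order LESO with bandwidth-parameterized gains and the disturbance-compensating control law on a toy second-order plant. Everything numerical here is an assumption for illustration: the plant, its disturbance, the gains and the step size are not the aircraft model or the values of Table 1, and the plain error $r - z_1$ is used in place of the PPC conversion error.

```python
import math


def simulate_ladrc(r=1.0, b0=1.0, omega0=50.0, kp=25.0, kd=10.0,
                   dt=1e-3, t_end=5.0):
    """Forward-Euler simulation of LADRC on the toy plant y'' = F + b0 * u."""
    # Bandwidth parameterization of the observer gains.
    beta1, beta2, beta3 = 3.0 * omega0, 3.0 * omega0 ** 2, omega0 ** 3
    y, yd = 0.0, 0.0            # plant output and its derivative
    z1, z2, z3 = 0.0, 0.0, 0.0  # LESO estimates of y, y' and total disturbance
    t, total_dist = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        # PD law on the estimated states, then disturbance compensation.
        u0 = kp * (r - z1) - kd * z2
        u = (u0 - z3) / b0
        # Toy total disturbance, unknown to the controller.
        total_dist = -2.0 * yd - 3.0 * y + 0.5 * math.sin(t)
        ydd = total_dist + b0 * u
        # Third-order LESO driven by the control input u and the measurement y.
        err = y - z1
        dz1 = z2 + beta1 * err
        dz2 = z3 + b0 * u + beta2 * err
        dz3 = beta3 * err
        # Euler updates of plant and observer.
        y, yd = y + dt * yd, yd + dt * ydd
        z1, z2, z3 = z1 + dt * dz1, z2 + dt * dz2, z3 + dt * dz3
        t += dt
    return y, z3, total_dist


if __name__ == "__main__":
    y_final, z3_final, dist_final = simulate_ladrc()
    print("output:", y_final)
    print("estimated disturbance:", z3_final, "actual:", dist_final)
```

Once $z_3$ tracks the total disturbance, the compensated plant behaves approximately like a double integrator, so the closed-loop response is set by the two PD gains alone; a higher observer bandwidth tightens the disturbance estimate at the cost of greater noise sensitivity and a smaller admissible integration step.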
3.3 Design of LADRC for the Yaw Channel and Roll Channel Similar to the pitch channel controller design, combining (6), (3), (9) and (1), we can obtain:
$$ \ddot\beta = F_\beta + b_{\beta0}\delta_r, \qquad \ddot\mu = F_\mu + b_{\mu0}\delta_a \tag{23} $$
The third-order LESO is as follows:
$$ \begin{bmatrix}\dot z_{j1}\\ \dot z_{j2}\\ \dot z_{j3}\end{bmatrix} = \begin{bmatrix}-\beta_{j1} & 1 & 0\\ -\beta_{j2} & 0 & 1\\ -\beta_{j3} & 0 & 0\end{bmatrix}\begin{bmatrix}z_{j1}\\ z_{j2}\\ z_{j3}\end{bmatrix} + \begin{bmatrix}0 & \beta_{j1}\\ b_{j0} & \beta_{j2}\\ 0 & \beta_{j3}\end{bmatrix}\begin{bmatrix}u_i\\ j\end{bmatrix} \tag{24} $$
and the control law of the yaw channel and roll channel is given by:
$$ u_i = \frac{k_{jp}\,\varepsilon_j(t) - k_{jd}\,z_{j2} - z_{j3}}{b_{j0}} \tag{25} $$
where $(i, j) = (r, \beta), (a, \mu)$. It is reasonable to assume that the desired signals and the disturbances are always bounded. According to the stability analysis of the LADRC [9], the eigenvalues of the closed-loop system can always be placed in the left half-plane by appropriate parameter selection, so the closed-loop system is bounded-input bounded-output (BIBO) stable.
4 Results of Simulation The initial state is set to a flight altitude of 15,000 ft, a flight speed of 500 ft/s, an angle of attack of 0◦ , a sideslip angle of 0◦ , a bank angle around the velocity of 0◦ , and thrust of 19,000 lbs. The parameters of controllers for the three channels are shown in Table 1. The relationship between the actual attitude angle and the control command of attitude angle are shown in Fig. 3. Deflection of aerodynamic control surfaces are shown in Fig. 4. Figure 5 depicts the relationship between the tracking error of the attitude angle and the prescribed performance function.The tendency of roll, pitch and yaw angle rates are depicted in Fig. 6 As shown in Fig. 3, the angle of attack, sideslip angle and bank angle around the velocity can track the set value well than LADRC. During the tracking of αc and μc , the sideslip can be quickly stabilized at 0◦ , and the overshoot of the sideslip
Table 1 Parameters of controller
Channel   b0        kp    kd   ω0    p0       p∞       l      Q
Pitch     −22.92    2.1   5    200   0.6806   0.02     0.51   1
Yaw       2.292     3.5   5    200   0.0087   0.0026   0.5    1
Roll      −23.38    1     5    200   1.0471   0.04     0.5    0.8
Fig. 3 Attitude angle results of simulation
Fig. 4 Deflection of aerodynamic control surfaces
Fig. 5 Tracking errors and performance function
Fig. 6 Angular rate
angle does not exceed 0.1°. Figure 4 shows that the proposed method obtains better system performance by increasing the deflection of the aerodynamic control surfaces while the desired signals are rising. Figure 5 shows that the tracking errors all remain within the bounds of the prescribed performance function. Overall, the proposed method achieves decoupled control of the roll, pitch and yaw channels at high angle of attack, and its errors converge faster than with LADRC.
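The error envelope referred to above can be evaluated from the p0, p∞ and l values in Table 1. The exact performance function is not restated in this section, so the sketch below assumes the commonly used exponential form p(t) = (p0 − p∞)·exp(−l·t) + p∞; the function names are illustrative only.

```python
import math

# (p0, p_inf, l) per channel, taken from Table 1.
PPF_PARAMS = {
    "pitch": (0.6806, 0.02, 0.51),
    "yaw": (0.0087, 0.0026, 0.5),
    "roll": (1.0471, 0.04, 0.5),
}


def performance_bound(t, p0, p_inf, l):
    """Assumed exponential prescribed performance function p(t)."""
    return (p0 - p_inf) * math.exp(-l * t) + p_inf


def within_bound(error, t, channel):
    """True if |error| lies strictly inside the envelope of the given channel at time t."""
    p0, p_inf, l = PPF_PARAMS[channel]
    return abs(error) < performance_bound(t, p0, p_inf, l)


if __name__ == "__main__":
    for name, (p0, p_inf, l) in PPF_PARAMS.items():
        samples = [round(performance_bound(t, p0, p_inf, l), 4) for t in (0.0, 5.0, 20.0)]
        print(name, samples)
```

Under this assumed form the envelope starts at p0, decays at rate l, and settles at p∞, so p∞ sets the admissible steady-state error and l the minimum convergence speed.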
5 Conclusion
This paper proposes a control method for super-maneuverable fighter aircraft that combines LADRC and PPC on the basis of the nonlinear aircraft model. Decoupled control of the three channels is achieved by using the LESO to estimate and compensate the coupling terms, uncertainties and unmodeled dynamics between the channels, and by using PPC to improve system performance. The method is verified by simulation and shows good tracking performance. However, because of the high uncertainty, disturbance and input saturation during high angle of attack flight, the traditional prescribed performance method cannot fully meet the transient and steady-state performance requirements. How to improve the prescribed performance control method to overcome these deficiencies will be the focus of future work.
Acknowledgements This work was supported in part by the Natural Science Foundation of China under grant 62203331.
References
1. Biannic, J.-M., Roos, C., Knauf, A.: Design and robustness analysis of fighter aircraft flight control laws. Eur. J. Control 12(1), 71–85 (2006)
2. Chen, M., Jiang, B.: Robust bounded control for uncertain flight dynamics using disturbance observer. J. Syst. Eng. Electron. 25(4), 640–647 (2014)
3. Alam, M., Celikovsky, S.: On the internal stability of non-linear dynamic inversion: application to flight control. IET Control Theor. Appl. 11(12), 1849–1861 (2017)
4. Atesoglu, O., Kemal Ozgoren, M.: Control and robustness analysis for a high-alpha maneuverable thrust-vectoring fighter aircraft. J. Guid. Control Dyn. 32(5), 1483–1496 (Sep–Oct, 2009). AIAA Guidance, Navigation and Control Conference and Exhibit, Honolulu, HI, 18–21 Aug. 2008
5. Ijaz, S., Fuyang, C., Hamayun, M.T., Anwaar, H.: Adaptive integral-sliding-mode control strategy for maneuvering control of F16 aircraft subject to aerodynamic uncertainty. Appl. Math. Comput. 402 (Aug 1, 2021)
6. Raj, K., Muthukumar, V., Singh, S.N., Lee, K.W.: Finite-time sliding mode and super-twisting control of fighter aircraft. Aerospace Sci. Technol. 82–83, 487–498 (2018)
7. Han, J.: From PID technique to active disturbances rejection control technique. Control Eng. China 9(3), 13–18 (2002)
8. Gao, Z.: On the centrality of disturbance rejection in automatic control. ISA Trans. 53(4), 850–857 (2014)
9. Gao, Z., et al.: Scaling and bandwidth-parameterization based controller tuning. In: ACC, pp. 4989–4996 (2003)
10. Fu, T., Gao, Y., Guan, L., Qin, C.: An LADRC controller to improve the robustness of the visual tracking and inertial stabilized system in luminance variation conditions. Actuators 11(5) (May 2022)
11. Li, D., Wang, Z., Yu, W., Li, Q., Jin, Q.: Application of LADRC with stability region for a hydrotreating back-flushing process. Control Eng. Prac. 79, 185–194 (2018)
12. Liu, C., Luo, G., Duan, X., Chen, Z., Zhang, Z., Qiu, C.: Adaptive LADRC-based disturbance rejection method for electromechanical servo system. IEEE Trans. Ind. Appl. 56(1), 876–889 (2019)
13. Wang, Z., Wang, Y., Cai, Z., Zhao, J., Liu, N., Zhao, Y.: Unified accurate attitude control for dual-tiltrotor UAV with cyclic pitch using actuator dynamics compensated LADRC. Sensors 22(4), 1559 (2022)
14. Bechlioulis, C.P., Rovithakis, G.A.: Robust adaptive control of feedback linearizable MIMO nonlinear systems with prescribed performance. IEEE Trans. Autom. Control 53(9), 2090–2099 (2008)
15. Tan, J., Guo, S.: Backstepping control with fixed-time prescribed performance for fixed wing UAV under model uncertainties and external disturbances. Int. J. Control 95(4), 934–951 (2022)
16. Atesoglu, Ö., Özgören, M.K.: High-alpha flight maneuverability enhancement of a fighter aircraft using thrust-vectoring control. J. Guid. Control Dyn. 30(5), 1480–1493 (2007)
Characterization of Lissajous Scanning Trajectory Based on 2D MEMS Mirror
Xiulei Zhang and Yongxuan Han
Abstract The MEMS mirror is an ideal scanning device for endoscope design due to its fast speed, small size, and light weight. The Lissajous scanning model offers advantages such as weak optical damage and fast imaging rate, attracting attention from researchers. This study introduces the mathematical model and working mechanism of the MEMS mirror, analyzes the characteristics of Lissajous graphical trajectory, and experimentally verifies its patterns. These findings provide a theoretical basis and practical experience for developing two-photon endoscopes using the MEMS mirror.
Keywords MEMS mirror · Lissajous scanning · Endoscope · Endoscopic probe
X. Zhang (B) · Y. Han, School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_47
1 Introduction
The MEMS vibrating mirror is a micro-reflector based on Micro-Electro-Mechanical System (MEMS) technology. It offers fast response, small overall size, light weight, and a large scanning angle [1]. The miniaturization of MEMS vibrating mirrors has made them increasingly popular in brain science research and endoscopic medical applications. For instance, Cheng Heping's team at Peking University developed a high-speed, high-resolution miniaturized two-photon microscope based on MEMS vibrating mirrors. Haijun Li's team at the University of Michigan, USA, created an ultra-compact MEMS vibrating mirror that can be integrated into a miniature endoscopic probe. U. Schelinski's group at the Fraunhofer Institute for Photonic Microsystems, Germany, designed a set of laser scanning endoscopes based on miniature MEMS mirrors. MEMS mirror scanning methods are classified into three main types: raster, spiral, and Lissajous. Raster scanning offers easy control, a uniform trajectory, and a simple image reconstruction algorithm; however, its imaging
speed is rather slow, typically less than 1 Hz. Spiral scanning has a relatively simple control method and image reconstruction algorithm, but its poor light uniformity and high light damage to the sample make it unsuitable for long-term live imaging, and its imaging speed generally does not exceed 8 Hz. The Lissajous scanning method requires a more complex control method and image reconstruction algorithm, but it provides good light uniformity and minimal light damage. Most importantly, it can easily achieve an imaging speed of more than 10 Hz. Other advantages of the Lissajous scanning method include rich trajectory characteristics, a wide choice of parameters, and high adaptability [2, 3]. The Lissajous scanning method is therefore of considerable research and application value. This paper investigates the stability, repetition consistency, filling density, and spatio-temporal transformation of the Lissajous scanning trajectory, focusing on how variations in the control signal parameters affect the trajectory characteristics. To verify and analyze the actual scanning trajectory, we designed an experimental platform, which provides a theoretical basis and practical experience for the further development of Lissajous scanning imaging systems based on MEMS mirrors.
2 Mathematical Model and Working Mechanism of MEMS Mirrors
2.1 Mathematical Model
An electrostatic dual-resonant MEMS mirror is selected for the experimental platform and is driven by a cosine signal so that it works in the resonant state. Its motion can be expressed as $A\cos(\omega t + \varphi)$, where $A$ is the amplitude, $\omega$ is the angular frequency, and $\varphi$ is the initial phase. For the spring oscillator model, the trajectory is the projection onto the horizontal direction of a point in uniform circular motion, which is simple harmonic motion. The motion equation of the spring oscillator is analyzed by the rotation vector method, as shown in Fig. 1. At moment $t$, the rotation angle is $\theta = \omega t + \varphi$ and $x = R\cos\theta = R\cos(\omega t + \varphi) = A\cos(\omega t + \varphi)$, where $R$ is the radius of the circle. The MEMS mirror is therefore analogous to a spring oscillator and, if the attenuation of the drive signal is neglected, can be treated as an undamped spring oscillator. Likewise, a two-dimensional MEMS mirror can be modeled as the combination of two spring oscillators in simple harmonic motion along two perpendicular directions. This allows the characteristics of the combined motion trajectory to be analyzed using Lissajous graph theory.
Fig. 1 Schematic diagram of the rotation vector method
Fig. 2 Schematic diagram of the working mechanism of 2D MEMS vibrating mirror
2.2 Working Mechanism
A laser beam is incident on the center of the MEMS mirror and reflected onto the projection surface $XOY$, forming a scanning point $A(x, y)$ [4]. Since the mirror performs simple harmonic vibration, let the angular frequency and amplitude of the vibration along the horizontal X-axis be $\omega_x$ and $\theta_1$, let those along the vertical Y-axis be $\omega_y$ and $\theta_2$, and let the vertical distance between the reflecting surface and the projection surface be $L$ (Fig. 2). Assuming that at the initial moment the mirror is parallel to the projection surface and the reflected light point coincides with the point $O$, the horizontal and vertical angles at moment $t$ between the reflected ray and the line from the mirror center to $O$ are
$$\theta_x = \theta_1\cos\omega_x t, \qquad \theta_y = \theta_2\cos\omega_y t \tag{1}$$
The coordinates of the light point $A$ at moment $t$ are
$$x = L\tan\theta_x = L\tan(\theta_1\cos\omega_x t), \qquad y = L\tan\theta_y = L\tan(\theta_2\cos\omega_y t) \tag{2}$$
Since $L$ is much larger than the projection size, $\theta_1$ and $\theta_2$ are very small, and (2) can be approximated as
$$x = L\tan\theta_x \approx L\tan\theta_1\cos\omega_x t, \qquad y = L\tan\theta_y \approx L\tan\theta_2\cos\omega_y t \tag{3}$$
Let
$$x_0 = L\tan\theta_1, \qquad y_0 = L\tan\theta_2 \tag{4}$$
which then results in
$$x = x_0\cos\omega_x t, \qquad y = y_0\cos\omega_y t \tag{5}$$
Equation (5) shows that the simple harmonic vibration of the two-dimensional MEMS vibrating mirror is equivalent to simple harmonic motion of the light point on the projection plane.
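How good the small-angle step from (2) to (5) is can be checked numerically. The sketch below compares the exact spot coordinate with its harmonic approximation over one period; the distance, mirror amplitude and drive frequency used in the usage lines are hypothetical values chosen only for illustration.

```python
import math


def spot_exact(t, distance, theta1, omega):
    """Eq. (2): exact horizontal coordinate of the light spot."""
    return distance * math.tan(theta1 * math.cos(omega * t))


def spot_approx(t, distance, theta1, omega):
    """Eq. (5): harmonic approximation x0*cos(omega*t) with x0 = L*tan(theta1)."""
    return distance * math.tan(theta1) * math.cos(omega * t)


def max_relative_error(distance, theta1, omega, samples=2000):
    """Largest |exact - approx| over one period, relative to the amplitude x0."""
    period = 2.0 * math.pi / omega
    x0 = distance * math.tan(theta1)
    worst = 0.0
    for i in range(samples):
        t = i * period / samples
        diff = abs(spot_exact(t, distance, theta1, omega)
                   - spot_approx(t, distance, theta1, omega))
        worst = max(worst, diff)
    return worst / x0


if __name__ == "__main__":
    omega = 2.0 * math.pi * 2400.0  # hypothetical 2.4 kHz drive
    for theta1 in (0.05, 0.1, 0.3):  # hypothetical amplitudes in radians
        print(theta1, max_relative_error(1.0, theta1, omega))
```

The deviation grows roughly with the square of the amplitude, so the harmonic model (5) is most trustworthy for small mechanical scan angles.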
3 Lissajous Graph Theory
Definition 1 [5] A Lissajous graph is a stable closed curve that results from the combination of two mutually perpendicular simple harmonic motions whose frequency ratio is a rational number.
3.1 Determinants of the Trajectory of the Lissajous Graph
The simple harmonic motions along the X and Y axes that form the Lissajous trajectory are given by
$$x = A_x\cos(2\pi f_x t + \varphi_x), \qquad y = A_y\cos(2\pi f_y t + \varphi_y) \tag{6}$$
where $A_x$, $A_y$ are the amplitudes, $f_x$, $f_y$ are the frequencies, and $\varphi_x$, $\varphi_y$ are the initial phases.
Applying the inverse cosine to both sides of (6) yields
$$\begin{cases} 2\pi f_x t + \varphi_x + 2i\pi = \pm\arccos\left(\dfrac{x}{A_x}\right)\\[2mm] 2\pi f_y t + \varphi_y + 2j\pi = \pm\arccos\left(\dfrac{y}{A_y}\right)\end{cases} \tag{7}$$
where $i$ and $j$ are arbitrary integers. Eliminating $t$ from (7) gives
$$\frac{f_x}{f_y} = \frac{-\varphi_x - 2i\pi \pm \arccos\left(\frac{x}{A_x}\right)}{-\varphi_y - 2j\pi \pm \arccos\left(\frac{y}{A_y}\right)} \tag{8}$$
Define $m = \dfrac{f_x}{\gcd(f_x, f_y)}$ and $n = \dfrac{f_y}{\gcd(f_x, f_y)}$, with $\gcd(f_x, f_y)$ being the greatest common divisor of $f_x$ and $f_y$. Then $\dfrac{f_x}{f_y} = \dfrac{m}{n}$ ($m$, $n$ are mutually prime integers), which together with (8) leads to
$$m\varphi_y - n\varphi_x + 2(jm - in)\pi = \pm m\arccos\frac{y}{A_y} \mp n\arccos\frac{x}{A_x} \tag{9}$$
Therefore,
$$\cos(m\varphi_y - n\varphi_x) = \cos\left(m\arccos\frac{y}{A_y} \mp n\arccos\frac{x}{A_x}\right) \tag{10}$$
Equation (10) shows that, for given amplitudes $A_x$, $A_y$, the trajectory of the Lissajous graph is uniquely determined by the frequency ratio parameters $m$, $n$ and the phase term $m\varphi_y - n\varphi_x$, and that the trajectory refresh frequency is the greatest common divisor of $f_x$ and $f_y$.
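The reduction of a frequency pair to the coprime parameters m, n and the refresh frequency gcd(f_x, f_y) is easy to sketch. The helper below is illustrative only (its name is not from the paper) and assumes the drive frequencies are given as integers in Hz; the frequency pairs in the usage lines are the ones used in the experiments of Sect. 4.

```python
from math import gcd


def lissajous_parameters(fx_hz, fy_hz):
    """Return (m, n, refresh_hz) for integer drive frequencies given in Hz."""
    g = gcd(fx_hz, fy_hz)
    return fx_hz // g, fy_hz // g, g


if __name__ == "__main__":
    for fx, fy in [(2300, 2400), (2200, 1800), (2100, 2400), (2475, 1265)]:
        m, n, refresh = lissajous_parameters(fx, fy)
        print(f"fx={fx} Hz, fy={fy} Hz -> m/n={m}/{n}, refresh={refresh} Hz")
```

A large m + n for fixed resonant frequencies forces a small greatest common divisor, which is the trade-off between filling density and frame rate discussed in Sect. 3.3.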
3.2 Homogeneity of Lissajous Graphs
From Eq. (10), the initial phases $\varphi_x$ and $\varphi_y$ determine the trajectory of the Lissajous graph once the amplitudes and the frequency ratio are fixed, and the same trajectory reappears when the initial phases change in a certain pattern. This is the homogeneity, or periodicity, of the Lissajous graph [6]. The specific analysis is as follows. By applying the periodicity of the cosine function and translating the time origin in Eq. (6), one obtains the graphical identity
$$m(\varphi_y' - \varphi_y) - n(\varphi_x' - \varphi_x) = 2l\pi \tag{11}$$
where $(\varphi_x, \varphi_y)$ and $(\varphi_x', \varphi_y')$ are two groups of initial phase parameters, and $l$ is an arbitrary integer. Let $\Delta\varphi_x = \varphi_x' - \varphi_x$, $\Delta\varphi_y = \varphi_y' - \varphi_y$, $\delta = \varphi_y - \varphi_x$ and $\Delta\delta = \Delta\varphi_y - \Delta\varphi_x$. Then (11) becomes
$$(m - n)\Delta\varphi_x + m\Delta\delta = 2l\pi \tag{12}$$
or
$$(m - n)\Delta\varphi_y + n\Delta\delta = 2l\pi \tag{13}$$
In view of (12) and (13), the following cases arise:
1. If $\varphi_x$ is fixed, i.e. $\varphi_x' = \varphi_x$, then $\Delta\varphi_x = 0$ and $\Delta\delta = \Delta\varphi_y$. Substituting $l = \pm 1$ into (12) gives $\Delta\varphi_y = \pm\frac{2\pi}{m}$. The period of the graph trajectory with respect to the initial phase $\varphi_y$ is $\frac{2\pi}{m}$;
2. If $\varphi_y$ is fixed, i.e. $\varphi_y' = \varphi_y$, then $\Delta\varphi_y = 0$ and $\Delta\delta = -\Delta\varphi_x$. Substituting $l = \pm 1$ into (13) gives $\Delta\varphi_x = \pm\frac{2\pi}{n}$. The period of the graph trajectory with respect to the initial phase $\varphi_x$ is $\frac{2\pi}{n}$;
3. If $\varphi_x$ and $\varphi_y$ both vary while $\delta$ stays constant, i.e. $\Delta\delta = 0$, substituting $l = \pm 1$ into (12) and (13) gives $\Delta\varphi_x = \pm\frac{2\pi}{m-n}$ and $\Delta\varphi_y = \pm\frac{2\pi}{m-n}$, respectively. The period of the graph trajectory with respect to the initial phase $\varphi_x$ or $\varphi_y$ is $\frac{2\pi}{m-n}$.
As indicated above, when the frequency ratio is fixed to $\frac{m}{n}$, the graphs with initial phases $\left(\varphi_x, \varphi_y \pm \frac{2\pi}{m}\right)$, $\left(\varphi_x \pm \frac{2\pi}{n}, \varphi_y\right)$ and $\left(\varphi_x \pm \frac{2\pi}{n}, \varphi_y \pm \frac{2\pi}{m}\right)$ are the same as the graph with initial phases $(\varphi_x, \varphi_y)$.
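The first case can be checked numerically: stepping φ_y by 2π/m should reproduce the same curve, only traversed from a different starting time. The sketch below is an illustration under stated assumptions: unit amplitudes, time normalized to one refresh period, and a time shift of k/m of a period with n·k ≡ 1 (mod m), which is a construction chosen here for the check rather than something taken from the text.

```python
import math


def sample_curve(m, n, phi_x, phi_y, per_cycle=64):
    """Sample one refresh period of x = cos(2*pi*m*s + phi_x), y = cos(2*pi*n*s + phi_y)."""
    total = m * per_cycle
    return [
        (math.cos(2.0 * math.pi * m * i / total + phi_x),
         math.cos(2.0 * math.pi * n * i / total + phi_y))
        for i in range(total)
    ]


def phase_step_mismatch(m, n, phi_x, phi_y, per_cycle=64):
    """Max coordinate gap between the curve and its copy with phi_y + 2*pi/m, after re-timing."""
    base = sample_curve(m, n, phi_x, phi_y, per_cycle)
    stepped = sample_curve(m, n, phi_x, phi_y + 2.0 * math.pi / m, per_cycle)
    k = pow(n, -1, m)         # modular inverse, n*k = 1 (mod m); needs Python 3.8+
    offset = k * per_cycle    # shift by k/m of the period, expressed in samples
    total = len(base)
    worst = 0.0
    for i in range(total):
        bx, by = base[(i + offset) % total]
        sx, sy = stepped[i]
        worst = max(worst, abs(bx - sx), abs(by - sy))
    return worst


if __name__ == "__main__":
    print(phase_step_mismatch(45, 23, 0.0, 0.0))
```

A mismatch at the level of floating-point rounding indicates that the two phase settings trace the same set of points, which is the homogeneity property expressed by (11).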
3.3 Symmetry and Complexity of Lissajous Graphs
According to (6), let $(x_1, y_1)$ and $(x_2, y_2)$ be the points of the combined trajectory at two different moments $t_1$ and $t_2$. If $y_2 = -y_1$ and $x_2 = x_1$, then $(2k+1)\frac{m}{n}$ must be even for some integer $k$, i.e., the graph is symmetric about the x-axis when $m$ is even. Similarly, if $x_2 = -x_1$ and $y_2 = y_1$, then $(2k+1)\frac{n}{m}$ must be even, i.e., the graph is symmetric about the y-axis when $n$ is even. When $m$ and $n$ are both odd, $x_2 = -x_1$ and $y_2 = -y_1$, so the graph is symmetric about the origin [7]. In a closed graph with frequency ratio $\frac{m}{n}$, the larger the value of $m + n$, the more complex the trajectory, that is, the higher the filling density [8] and the clearer the imaging, although filling saturation may occur. For limited resonant frequencies $f_x$, $f_y$, $\gcd(f_x, f_y)$ is the scanning imaging frame rate, so the larger $m$ and $n$ are chosen, the smaller $\gcd(f_x, f_y)$ becomes. The resonant frequencies $f_x$ and $f_y$ therefore need to be set so that the filling density and the imaging frame rate are balanced.
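The three symmetry cases can be illustrated with a short numerical check. The sketch assumes unit amplitudes and uses the fact that advancing the normalized time by half a refresh period multiplies x by (−1)^m and y by (−1)^n; it tests only this half-period correspondence, which is sufficient for the symmetries stated above. The function names are illustrative.

```python
import math


def curve_points(m, n, phi_x=0.0, phi_y=0.0, samples=720):
    """Sample x = cos(m*s + phi_x), y = cos(n*s + phi_y) for s in [0, 2*pi)."""
    return [
        (math.cos(m * 2.0 * math.pi * i / samples + phi_x),
         math.cos(n * 2.0 * math.pi * i / samples + phi_y))
        for i in range(samples)
    ]


def half_period_image_matches(points, sign_x, sign_y, tol=1e-9):
    """True if every point (x, y) has (sign_x*x, sign_y*y) half a period later."""
    total = len(points)
    half = total // 2  # samples must be even
    for i, (x, y) in enumerate(points):
        px, py = points[(i + half) % total]
        if abs(sign_x * x - px) > tol or abs(sign_y * y - py) > tol:
            return False
    return True


if __name__ == "__main__":
    print(half_period_image_matches(curve_points(2, 3, 0.4, 0.9), 1, -1))     # x-axis symmetry
    print(half_period_image_matches(curve_points(3, 2, 0.4, 0.9), -1, 1))     # y-axis symmetry
    print(half_period_image_matches(curve_points(11, 9, 0.4, 0.9), -1, -1))   # origin symmetry
```

Because the check holds for arbitrary initial phases, it suggests that the parity of m and n alone fixes which symmetry the graph has.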
4 Experimental Validation
4.1 Experimental Setup
In the experiments, a dual-resonant electrostatic MEMS mirror is selected and bonded to the tube housing, as shown in Fig. 3. To verify the characteristics of the Lissajous scanning trajectory of the two-dimensional MEMS mirror under different amplitude, frequency, and initial phase parameters, an experimental platform was built. Its block diagram and a photograph are shown in Figs. 4 and 5, respectively. The platform mainly consists of a laser transmitter, beam conditioner, fixture, MEMS mirror, target, high-voltage amplifier module, FPGA control board, regulated power supply module, and cables. To facilitate the study of the Lissajous trajectory characteristics, the MEMS mirror drive signal is generated by the FPGA control board, and its amplitude, frequency, and initial phase are adjustable.
Fig. 3 A resonant electrostatic MEMS mirror bonded to the tube housing
Fig. 4 Block diagram of the composition of the MEMS mirror experimental platform
Fig. 5 Physical diagram of the MEMS mirror experimental platform
(a) Scan traces of Ax = 2V, Ay = 2V.
(b) Scan traces of Ax = 3V, Ay = 3V.
Fig. 6 Trajectory of Lissajous at different amplitudes
4.2 Effect of Amplitude Parameters on Trajectory
The X- and Y-axis drive signal frequencies were set to $f_x = 2.4$ kHz and $f_y = 2.4$ kHz and the initial phases to $\varphi_x = 0°$ and $\varphi_y = 90°$. The signal amplitude was then varied within the allowable voltage range to observe the changes in the Lissajous scanning trajectory. The actual traces were recorded for two sets of parameters: $A_x = 2$ V, $A_y = 2$ V (at the midpoint of the MEMS operating voltage), shown in Fig. 6a, and $A_x = 3$ V, $A_y = 3$ V (at the maximum MEMS operating voltage), shown in Fig. 6b. The drive signals were amplified by a high-voltage amplifier module with a gain of 65 before being applied to the two-dimensional MEMS mirror.
(a) fx = 2.3 kHz, fy = 2.4 kHz, fx/fy = m/n = 23/24.
(b) fx = 2.2 kHz, fy = 1.8 kHz, fx/fy = m/n = 11/9.
(c) fx = 2.1 kHz, fy = 2.4 kHz, fx/fy = m/n = 7/8.
Fig. 7 Trajectory of Lissajous at different frequency ratios
When the driving signal amplitude is increased or decreased within the appropriate voltage range, the trajectory of the MEMS mirror shows no distortion, as illustrated in Fig. 6a. The recorded trajectory is an ellipse that agrees with the theoretical trajectory. However, when the amplitude approaches the maximum voltage limit for MEMS operation, the trajectory is still an ellipse but the distortion becomes apparent, as shown in Fig. 6b. This is caused by the nonlinear saturation of the MEMS mirror near the limit voltage. Tests of the Lissajous scanning trajectory show that amplitudes from 1.4 to 2.6 V cause little to no distortion, which provides a reference for selecting the amplitude in practical applications of the MEMS mirror.
(a) ϕy = 0◦ .
(b) ϕy = 1◦ .
(c) ϕy = 2◦ .
(d) ϕy = 3◦ .
(e) ϕy = 4◦ .
(f) ϕy = 5◦ .
(g) ϕy = 6◦ .
(h) ϕy = 7◦ .
(i) ϕy = 8◦ .
Fig. 8 Variation of Lissajous trajectory of initial phase ϕ y with ϕx = 0◦
4.3 Effect of Frequency Parameters on Trajectory
The drive signal amplitudes on the X and Y axes are set to $A_x = 2$ V and $A_y = 2$ V, and the initial phases to $\varphi_x = 0°$ and $\varphi_y = 0°$. The drive signal frequencies $f_x$ and $f_y$ are then changed within the resonance bandwidth and the changes in the Lissajous trajectory are recorded. Three trajectories with different frequency ratios are shown in Fig. 7. For a constant field-of-view size, the larger the value of $m + n$, the more complex the trajectory and the higher the filling density. This is consistent with the theory and provides a basis for selecting frequencies to optimize the imaging quality of an actual MEMS scanner.
4.4 Effect of Initial Phase Parameter on Trajectory
The drive signal amplitudes on the X and Y axes are set to $A_x = 2$ V and $A_y = 2$ V. With $f_x = 2475$ Hz and $f_y = 1265$ Hz, we have $\gcd(f_x, f_y) = 55$ Hz and $\frac{f_x}{f_y} = \frac{m}{n} = \frac{45}{23}$. The initial phase $\varphi_x = 0°$ is fixed and $\varphi_y$ varies from 0° to 360° in steps of 1°. The resulting Lissajous scanning trajectories are plotted in Fig. 8. Figure 8a–i demonstrates the periodicity of the trajectory: the graphs for $\varphi_y = 0°$ and $\varphi_y = 8°$ are the same, and the eight distinct graphs for 0°–7° show that the period is 8°. When the initial phase of the X-axis signal is changed to $\varphi_x = 10°$ and $\varphi_y$ is again stepped from 0° to 360°, the Lissajous trajectories differ from those for $\varphi_x = 0°$, but the pattern is still periodic with a period of 8°. From the graphical homogeneity theory in Sect. 3.2, the repeatability period under this experimental condition is $\frac{2\pi}{m} = \frac{360°}{45} = 8°$. The experimental result is consistent with the theory. Within each repeatability period there is always a graph with high symmetry and density, which can be used in MEMS scanning imaging.
5 Conclusion
This paper introduces the mathematical model and working mechanism of MEMS mirrors. It also derives and analyzes the parameter factors, graphical homogeneity, and symmetry/complexity characteristics of the uniquely determined Lissajous graphical trajectory. For the two-dimensional resonant electrostatic MEMS mirror, an experimental platform is built to quantitatively analyze the characteristics and laws of the Lissajous pattern trajectory under different amplitudes, frequencies, and initial phase parameters. The presented results provide a theoretical basis and practical experience for developing the Lissajous scanning imaging system based on the two-dimensional MEMS mirror.
References
1. Urey, H., Baran, U., Holmstrom, S.: MEMS laser scanners. J. Microelectromech. Syst. 23(2), 259–275 (2014)
2. Liang, W., Murari, K., Zhang, Y., et al.: Increased illumination uniformity and reduced photodamage offered by the Lissajous scanning in fiber-optic two-photon endomicroscopy. J. Biomed. Opt. 17(2), 021108 (2012)
3. Myaing, M.T., MacDonald, D.J., Li, X.D.: Fiber-optic scanning two-photon fluorescence endoscope. Opt. Lett. 31(8), 1076–1078 (2006)
4. Miao, X., Li, H.F., Zhang, Y.H.: Analysis of the mechanism of image distortion and correction of MEMS vibrating mirror scanning confocal images. Infrared Laser Eng. 50(2), 1–10 (2021)
5. Zhao, L.: A summary of the discussion on the graphs of Lissajous. College Phys. 16(11), 19–21 (1997)
6. Xu, D., Zhang, F.: Parameters of Lissajous graphs. J. Qufu Normal Univ. 27(2), 54–56 (2001)
7. Zhang, X.W.: Analysis of the effect of initial phase on the graph of Lissajous. J. Hubei Normal Univ. (Nat. Sci.) 20(1), 56–60 (2000)
8. Liu, Y.: Research on Lissajous Scanning Micro Laser Projection Display Technology. Northwestern Polytechnic University (2014)
SINR Communication Based Fast Predictive Control for CPSs Under DoS Attacks
Enci Wang, Jianlin Hou, Yang Yi, and Qingcheng Shen
Abstract In this paper, an effective security control algorithm is proposed for a class of dual-rate cyber physical systems (CPSs) with denial-of-service (DoS) attacks. To solve the issues of control performance degradation and system instability, a novel alternating observer based on partial predictive information is designed to reconstruct the complete system state information. Meanwhile, to account for wireless transmission losses, a communication network model based on signal to interference plus noise ratio (SINR) is then introduced. Additionally, a secure controller that operates at a fast rate is proposed to achieve the desired control performance and attack resilience. At each fast rate update, predictive techniques are employed to generate virtual outputs, which are utilized to update and reconstruct the system's secure state.
Keywords Cyber physical systems (CPSs) · Denial-of-service (DoS) · Dual rate · Predictive observer · Signal to interference plus noise ratio (SINR)
E. Wang · J. Hou · Y. Yi (B), College of Information Engineering, Yangzhou University, Yangzhou 225127, China, e-mail: [email protected]
Q. Shen, Jiangsu Qingya Electronic Technology, Xuzhou 221116, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_48
1 Introduction
With the rapid advancement of computing, communication, and control technology, significant transformations have unfolded in human social life [1–3]. Nevertheless, conventional isolated technologies can no longer meet the informatization and networking requirements of the new generation of production equipment. Cyber-Physical Systems (CPSs) have therefore emerged as a timely solution and now form the forefront of research in automation, with preliminary progress already achieved [4–6]. Dual-Rate Cyber-Physical Systems (DRCPSs), a distinct subset of CPSs, are characterized by different sensor sampling rates and controller update rates [7–10].
However, owing to their openness and susceptibility, DRCPSs are prone to network attacks. Previous research on network security control in CPSs has predominantly concentrated on single-rate systems, which makes direct application to dual-rate systems challenging [10–14]. A new control methodology is therefore needed to ensure the secure and stable operation of DRCPSs under attack while maintaining good control performance. Furthermore, the normal operation of DRCPSs is persistently affected by the stochastic loss of data packets caused by inherent channel factors, including channel noise, power attenuation, and multipath effects [15, 16]. Current research has not investigated the combined impact of inherent random packet loss and Denial of Service (DoS) attacks in DRCPSs. Building on this discussion, this paper addresses the security optimization control problem of dual-rate cyber-physical systems under DoS attacks. The remainder of this article is organized as follows. The system model and problem formulation are presented in Sect. 2. The design of the proposed remote state estimator based on partial prediction information and of the controller is given in Sect. 3. The stability analysis is presented in Sect. 4 with detailed proofs. Simulation results are presented and analyzed in Sect. 5. Finally, the conclusions are given in Sect. 6.
2 Problem Formulation and Preliminaries
2.1 Dual-Rate Dynamic System
We consider a class of discrete linear time-invariant dual-rate dynamic systems given by:
$$\begin{cases} x(k\Delta + p + 1) = Ax(k\Delta + p) + Bu(k\Delta + p)\\ y(k\Delta + p) = Cx(k\Delta + p)\\ y^a(k\Delta + p) = Cx(k\Delta)\end{cases} \tag{1}$$
where $x(\cdot)$, $u(\cdot)$, $y(\cdot)$ and $y^a(\cdot)$ stand for the system state, the control input, the system output and the sampled output of the sensor, respectively. $A$, $B$, and $C$ are system matrices with compatible dimensions. We define $T$ as the fast update period of the system controller and $MT$ as the slow sampling period of the sensor, with $\Delta > 1$, $\Delta \in \mathbb{N}^+$ and $p = 1, 2, \cdots, \Delta - 1$. For convenience of expression, let $k\Delta = T_k$, $(k+1)\Delta = T_{k+1}$, $k\Delta + p = T_{k,p}$, $k\Delta + p + 1 = T_{k,p+1}$ and $z(T_{k,p}) = \begin{bmatrix} x^T(T_{k,p}) & u^T(T_{k,p})\end{bmatrix}^T$. The dynamic equation in Eq. (1) can then be rewritten as:
$$\begin{cases} z(T_{k,p+1}) = \bar{A}z(T_{k,p})\\ y(T_{k,p}) = \bar{C}z(T_{k,p})\\ y^a(T_{k,p}) = \bar{C}z(T_k)\end{cases} \tag{2}$$
where $\bar{A} = \begin{bmatrix} A & B\\ 0 & 0\end{bmatrix}$ and $\bar{C} = \begin{bmatrix} C & 0\end{bmatrix}$.
Furthermore, $\phi(\cdot)$ is a signal detector located at the sensor end, used to detect whether the sensor outputs an actual measured value $y^a(\cdot)$. We therefore use $\phi(\cdot) \in \{0, 1\}$ to indicate whether the sensor has a measurement signal output:
$$\phi(\cdot) = \begin{cases} 1, & \text{if } y^a(\cdot) \text{ is output}\\ 0, & \text{otherwise}\end{cases} \tag{3}$$
and let $\Pr[\phi(T_{k,p}) = 1] = \bar{\phi}$, $\Pr[\phi(T_{k,p}) = 0] = 1 - \bar{\phi}$. It is evident that $y^a(T_{k,p}) = \phi(T_{k,p})y(T_{k,p})$.
2.2 Channel Communication Model Based on SINR
In this paper, the measurement channel between the sensor and the controller transmits information over an open, shared additive white Gaussian noise (AWGN) wireless channel. The signal-to-noise ratio (SNR) of the measurement channel can be expressed as
$$\mathrm{SNR} = \frac{p_s}{\omega^2} \tag{4}$$
where $p_s$ is the energy of the data packet sent by the sensor, and $\omega^2$ is the energy of the channel AWGN. The symbol error rate (SER) is an index of the quality of wireless channel communication, and its relationship with the channel SNR is
$$g(x) = \frac{1}{\sqrt{2\pi}}\int_x^{\infty}\exp\left(-\frac{r^2}{2}\right)dr, \qquad \mathrm{SER} = 2g\left(\sqrt{\xi\,\mathrm{SNR}}\right) \tag{5}$$
where $\xi > 0$ represents the network parameters. Consider an energy-constrained DoS attacker that adopts a random attack strategy against the measurement channel of the CPS. When the channel is subjected to DoS interference attacks, Eq. (4) can be rewritten to account for the impact of the attack as:
$$\mathrm{SINR} = \frac{p_s}{\omega^2 + p_a} \tag{6}$$
where $p_a$ represents the interference energy generated by the attacker launching the attack. In this case, we have:
$$\mathrm{SER} = 2g\left(\sqrt{\xi\,\mathrm{SINR}}\right) \tag{7}$$
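Equations (4)–(7) translate directly into a few lines of code. The sketch below writes g(x) with the complementary error function, g(x) = 0.5·erfc(x/√2), and also returns 1 − SER, the packet arrival probability used in Sect. 2.3. The numerical values in the usage lines are hypothetical and only show the effect of the attack power p_a.

```python
import math


def q_function(x):
    """g(x) in Eq. (5): Gaussian tail probability, expressed through erfc."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def symbol_error_rate(p_s, noise_power, xi, p_a=0.0):
    """SER from Eqs. (4)-(7); p_a = 0 gives the attack-free SNR case."""
    sinr = p_s / (noise_power + p_a)
    return 2.0 * q_function(math.sqrt(xi * sinr))


def arrival_probability(p_s, noise_power, xi, p_a=0.0):
    """1 - SER: probability that a packet arrives without error."""
    return 1.0 - symbol_error_rate(p_s, noise_power, xi, p_a)


if __name__ == "__main__":
    # Hypothetical energies: p_s = 10, noise = 1, xi = 1, attack power 0 or 5.
    print(arrival_probability(10.0, 1.0, 1.0))
    print(arrival_probability(10.0, 1.0, 1.0, p_a=5.0))
```

Raising the interference energy p_a lowers the SINR, which raises the SER and therefore lowers the arrival probability; this is the mechanism through which the DoS attacker degrades the measurement channel.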
2.3 Data Packet Loss Under DoS Attacks
The purpose of a DoS (Denial-of-Service) attack is to deplete the resources of a system or exploit vulnerabilities in its communication channels, causing disruption (i.e., packet loss) and rendering the system unable to provide its intended services to legitimate users. We use $\theta(\cdot) \in \{0, 1\}$ to indicate whether a packet is lost:
$$\theta(\cdot) = \begin{cases} 1, & \text{if } y^a(\cdot) \text{ arrives without error}\\ 0, & \text{otherwise (regarded as dropout)}\end{cases} \tag{8}$$
Assuming $\theta(\cdot)$ follows a Bernoulli distribution, we have $\Pr[\theta(T_{k,p}) = 1] = \bar{\theta} = 1 - \mathrm{SER}$ and $\Pr[\theta(T_{k,p}) = 0] = 1 - \bar{\theta} = \mathrm{SER}$. By combining Eqs. (3) and (8), the complete transmission state of the signal in the S-C channel of DRCPSs under DoS attacks is obtained:
$$\tilde{y}(T_{k,p}) = \theta(T_{k,p})y^a(T_{k,p}) = \theta(T_{k,p})\phi(T_{k,p})y(T_{k,p}),$$
where $\tilde{y}(T_{k,p})$ represents the output affected by slow sampling and network attacks. The purpose of this paper is to maintain both the control performance and the anti-attack capability of DRCPSs under DoS network attacks. To this end, the following lemma is used.
Lemma 1 Let $V(x)$ be a Lyapunov function. If there exist $\lambda \ge 0$, $\mu > 0$, $v > 0$ and $0 < \varphi < 1$ such that
$$\mu\|x(k)\|^2 \le V(x(k)) \le v\|x(k)\|^2, \qquad E\{V(x(k+1)) \mid x(k)\} - V(x(k)) \le \lambda - \varphi V(x(k)),$$
then $x(k)$ satisfies
$$E\{\|x(k)\|^2\} \le \frac{v}{\mu}(1 - \varphi)^k\|x(0)\|^2 + \frac{\lambda}{\mu\varphi}.$$
3 Predictive Controller Design In this section, we develop a control framework based on partial predictive information for DRCPSs, and gradually introduce the detailed design of the proposed control method. Due to the unknown internal state information of the system, it is necessary to first construct a state observer to obtain the system’s state. However, in multi-rate information physical systems, due to the partial knowledge of the system’s output information, it is necessary to combine predictive knowledge to build a predictor
SINR Communication Based Fast Predictive Control …
567
that provides system predictions. This, in turn, is combined with the actual sensor measurements to construct an alternating predictive observer. The specific steps are as follows: First, we construct a state observer as follows based on a typical Luenberger observer: ¯ z (Tk, p ) + L(y a (Tk, p ) − yˆ (Tk, p )) zˆ (Tk, p+1 ) = Aˆ , (9) ¯ yˆ (Tk, p ) = C zˆ (Tk, p ) where zˆ (·) and yˆ (·) are the estimates of z(·) and z(·). It is worth noting that there exists a significant disparity in the update rates of the two input variables, u(Tk, p ) and y a (Tk, p ), in the aforementioned observer. Specifically, the update rate of u(Tk, p ) is K T , represented by the time series {0, T, 2T, . . .}, while the update rate of y a (Tk, p ) is K M T , represented by the time series {0, M T, 2M T, . . .}. To effectively address this issue, we propose the construction of multiple virtual output points during the sensor’s slow sampling period, ensuring that the output rate aligns with the update rate of the controller. This approach aims to reconcile the disparate update rates between u(Tk, p ) and y a (Tk, p ), facilitating their synchronized operation within the control system. Based on this, we propose the following alternate predictive observer:
\[ \hat{z}(T_{k,p+1}) = \bar{A}\hat{z}(T_{k,p}) + L\big(\bar{y}(T_{k,p}) - \hat{y}(T_{k,p})\big), \qquad \hat{y}(T_{k,p}) = \theta(T_{k,p})\varphi(T_{k,p})\bar{C}\hat{z}(T_{k,p}) + \big(1 - \theta(T_{k,p})\varphi(T_{k,p})\big)\, y_{p}(T_{k,p}), \tag{10} \]
where $\bar{y}(T_{k,p})$ represents the hybrid output, and its specific expression is given by:
\[ \bar{y}(T_{k,p}) = \big(1 - \theta(T_{k,p})\varphi(T_{k,p})\big)\, y_{p}(T_{k,p}) + \theta(T_{k,p})\varphi(T_{k,p})\, y^{a}(T_{k,p}), \tag{11} \]
where $y_{p}(\cdot)$ represents the predicted output value. Furthermore, the predicted output is obtained by the following predictor:
\[ \hat{z}_{p}(T_{k,p+1}) = \bar{A}\hat{z}(T_{k,p}), \qquad \hat{y}_{p}(T_{k,p}) = \bar{C}\hat{z}_{p}(T_{k,p}), \tag{12} \]
where $\hat{z}_{p}(\cdot)$ represents the predicted state and $\hat{y}_{p}(\cdot)$ denotes the predicted output. In conclusion, by combining Eqs. (9)–(12), we propose the following controller:
\[ u(T_{k,p}) = K\hat{z}(T_{k,p}), \tag{13} \]
where $K = [K_{1}\ K_{2}]$.
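The behaviour of the hybrid output in Eq. (11) and of the observer update in Eq. (10) can be illustrated with a minimal scalar sketch. The scalar values `a`, `c` and `gain` stand in for the matrices $\bar{A}$, $\bar{C}$ and $L$ and are purely hypothetical; the function names are illustrative and not taken from the paper. The sketch only shows that the correction term vanishes whenever no measurement arrives ($\theta\varphi = 0$).

```python
def hybrid_output(theta, phi, y_actual, y_pred):
    """Eq. (11): take the measurement when it arrives (theta*phi == 1), else the prediction."""
    g = theta * phi
    return (1 - g) * y_pred + g * y_actual


def observer_step(z_hat, theta, phi, y_actual, y_pred, a=0.9, c=1.0, gain=0.5):
    """One scalar step of Eq. (10); a, c and gain are hypothetical stand-ins for A-bar, C-bar, L."""
    g = theta * phi
    y_bar = hybrid_output(theta, phi, y_actual, y_pred)   # hybrid output, Eq. (11)
    y_hat = g * c * z_hat + (1 - g) * y_pred               # estimated output, Eq. (10)
    return a * z_hat + gain * (y_bar - y_hat)


# Measurement lost or not sampled: the observer reduces to a pure prediction a * z_hat.
no_measurement = observer_step(2.0, theta=0, phi=1, y_actual=5.0, y_pred=1.7)
# Measurement received: the innovation y_actual - c * z_hat corrects the estimate.
with_measurement = observer_step(2.0, theta=1, phi=1, y_actual=5.0, y_pred=1.7)
```

In this sketch the difference $\bar{y} - \hat{y}$ equals $\theta\varphi\,(y^{a} - \bar{C}\hat{z})$, which is why the predicted output drops out of the update.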
4 Stability Analysis Section 3 provides a detailed design of the proposed predictive control method. In this section, we first establish an augmented model for system states and errors under DoS attacks, followed by a rigorous stability analysis of the system.
The estimation error is defined as $e(T_{k,p}) = z(T_{k,p}) - \hat{z}(T_{k,p})$. Therefore, combining Eqs. (2), (10), and (12), the closed-loop system can be represented as:
\[ z(T_{k,p+1}) = (\hat{A} + \bar{B}K)\, z(T_{k,p}) - \bar{B}K\, e(T_{k,p}), \qquad e(T_{k,p+1}) = (\tilde{A}_{1} - \tilde{A}_{2} - \tilde{A}_{3} - \tilde{A}_{4})\, e(T_{k,p}), \tag{14} \]
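where $\tilde{A}_{1} = \bar{A} - \bar{\theta}\bar{\varphi}LC_{1}$, $\tilde{A}_{2} = (\theta(\rho T) - \bar{\theta})(\varphi(\rho T) - \bar{\varphi})LC_{1}$, $\tilde{A}_{3} = (\theta(\rho T) - \bar{\theta})\bar{\varphi}LC_{1}$, $\tilde{A}_{4} = (\varphi(\rho T) - \bar{\varphi})\bar{\theta}LC_{1}$, $\hat{A} = \begin{bmatrix} A & 0 \\ 0 & 0 \end{bmatrix}$, $\bar{B} = \begin{bmatrix} B \\ 0 \end{bmatrix}$.
Then, the stability analysis of the estimation error system (10) is presented as follows.
Theorem 1 Consider the proposed Dual-Rate Cyber-Physical System (1) under a DoS attack with an unknown attack strategy affecting the measurement channel. By employing an observer-based controller consisting of the alternating prediction observer (10) and the feedback controller (13), if there exist positive definite matrices $P$ and $S$ and matrices $U$, $X$ and $\bar{P}$ satisfying Eqs. (15) and (16), then the closed-loop system (14) is exponentially mean-square stable, and the controller gain matrix $K$ and observer gain matrix $L$ satisfy Eq. (17).
\[ P\bar{B} = \bar{B}\bar{P}, \tag{15} \]
\[ \Lambda_{1} = \begin{bmatrix}
\Upsilon_{1} & * & * & * & * & * & * \\
0 & \Upsilon_{2} & * & * & * & * & * \\
\Upsilon_{3} & \Upsilon_{4} & \Upsilon_{5} & * & * & * & * \\
0 & \Upsilon_{6} & 0 & \Upsilon_{7} & * & * & * \\
0 & \Upsilon_{8} & 0 & 0 & \Upsilon_{9} & * & * \\
0 & \Upsilon_{10} & 0 & 0 & 0 & \Upsilon_{11} & * \\
0 & \Upsilon_{12} & 0 & 0 & 0 & 0 & \Upsilon_{13}
\end{bmatrix} < 0, \tag{16} \]
\[ K = \bar{P}^{-1}U, \qquad L = S^{-1}X, \tag{17} \]
where $\Upsilon_{1} = -P$, $\Upsilon_{2} = -S$, $\Upsilon_{3} = P\hat{A} + \bar{B}U$, $\Upsilon_{4} = -\bar{B}U$, $\Upsilon_{5} = -P$, $\Upsilon_{6} = S\bar{A} - \sigma_{1}X\bar{C}$, $\Upsilon_{7} = -S$, $\Upsilon_{8} = X\bar{C}$, $\Upsilon_{9} = -\sigma_{2}^{-1}S$, $\Upsilon_{10} = X\bar{C}$, $\Upsilon_{11} = -\sigma_{3}^{-1}S$, $\Upsilon_{12} = X\bar{C}$, $\Upsilon_{13} = -\sigma_{4}^{-1}S$, $\sigma_{1} = \bar{\theta}\bar{\varphi}$, $\sigma_{2} = \bar{\theta}\bar{\varphi}(1 - \bar{\theta})(1 - \bar{\varphi})$, $\sigma_{3} = \bar{\theta}\bar{\varphi}^{2}(1 - \bar{\theta})$, $\sigma_{4} = \bar{\theta}^{2}\bar{\varphi}(1 - \bar{\varphi})$.
Proof Let $\eta(T_{k,p}) = [z^{T}(T_{k,p})\ \ e^{T}(T_{k,p})]^{T}$ and construct the Lyapunov function as
\[ V(\eta(T_{k,p})) = z^{T}(T_{k,p})Pz(T_{k,p}) + e^{T}(T_{k,p})Se(T_{k,p}). \tag{18} \]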
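Based on the closed-loop system (14), we can obtain:
\[ \begin{aligned}
E\{V(\eta(\rho T + T))\} - V(\eta(\rho T)) &= E\{z^{T}(\rho T + T)Pz(\rho T + T) + e^{T}(\rho T + T)Se(\rho T + T)\} - V(\eta(\rho T)) \\
&= [(\hat{A} + \bar{B}K)z(\rho T) - \bar{B}Ke(\rho T)]^{T} P\, [(\hat{A} + \bar{B}K)z(\rho T) - \bar{B}Ke(\rho T)] \\
&\quad + [(\bar{A} - \sigma_{1}L\bar{C})e(\rho T)]^{T} S\, [(\bar{A} - \sigma_{1}L\bar{C})e(\rho T)] \\
&\quad + (\sigma_{2} + \sigma_{3} + \sigma_{4})[L\bar{C}e(\rho T)]^{T} S\, [L\bar{C}e(\rho T)] - z^{T}(\rho T)Pz(\rho T) - e^{T}(\rho T)Se(\rho T) \\
&= \eta^{T}(\rho T)\,(\Psi_{1}^{T}\Psi_{2}\Psi_{1} + \Psi_{3})\,\eta(\rho T) = \eta^{T}(\rho T)\,\Theta\,\eta(\rho T),
\end{aligned} \tag{19} \]
where $\Psi_{1} = [\Psi_{11}\ \Psi_{12}]$, $\Psi_{2} = \mathrm{diag}\{P, S, \sigma_{2}S, \sigma_{3}S, \sigma_{4}S\}$, $\Psi_{3} = \mathrm{diag}\{-P, -S\}$, $\Psi_{11} = [\hat{A}^{T} + K^{T}\bar{B}^{T}\ \ 0\ \ 0\ \ 0\ \ 0]^{T}$, $\Psi_{12} = [-K^{T}\bar{B}^{T}\ \ \bar{A}^{T} - \sigma_{1}\bar{C}^{T}L^{T}\ \ \bar{C}^{T}L^{T}\ \ \bar{C}^{T}L^{T}\ \ \bar{C}^{T}L^{T}]^{T}$.
Multiplying both sides of the inequality (16) by $\mathrm{diag}\{I, I, P^{-1}, S^{-1}, S^{-1}, S^{-1}, S^{-1}\}$, we have:
\[ \Lambda_{2} = \begin{bmatrix}
\Omega_{1} & * & * & * & * & * & * \\
0 & \Omega_{2} & * & * & * & * & * \\
\Omega_{3} & \Omega_{4} & \Omega_{5} & * & * & * & * \\
0 & \Omega_{6} & 0 & \Omega_{7} & * & * & * \\
0 & \Omega_{8} & 0 & 0 & \Omega_{9} & * & * \\
0 & \Omega_{10} & 0 & 0 & 0 & \Omega_{11} & * \\
0 & \Omega_{12} & 0 & 0 & 0 & 0 & \Omega_{13}
\end{bmatrix} < 0, \tag{20} \]
where $\Omega_{1} = -P$, $\Omega_{2} = -S$, $\Omega_{3} = \hat{A} + \bar{B}K$, $\Omega_{4} = -\bar{B}K$, $\Omega_{5} = -P^{-1}$, $\Omega_{6} = \bar{A} - \sigma_{1}L\bar{C}$, $\Omega_{7} = -S^{-1}$, $\Omega_{8} = L\bar{C}$, $\Omega_{9} = -\sigma_{2}^{-1}S^{-1}$, $\Omega_{10} = L\bar{C}$, $\Omega_{11} = -\sigma_{3}^{-1}S^{-1}$, $\Omega_{12} = L\bar{C}$, $\Omega_{13} = -\sigma_{4}^{-1}S^{-1}$.
Based on the Schur complement, it can be concluded that $\Lambda_{2} < 0$ is equivalent to $\Theta < 0$. Define scalars $\alpha$ and $\beta$ satisfying
\[ \alpha = \max\{\lambda_{\max}(P), \lambda_{\max}(S)\}, \tag{21} \]
\[ 0 < \beta < \min\{\lambda_{\min}(-\Theta), \alpha\}. \tag{22} \]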
By Θ < 0 (22), (22) and Lemma 1 we have 2 E η(Tk, p ) − V (η(Tk, p )) ≤ βα (1 − βα )k η(0)2 , 0
F represents the probability corresponding to the F value. MS is the mean square deviation.
\[ F = K v_{c}^{a} f_{z}^{b} a_{p}^{c} a_{e}^{d} \tag{1} \]
where $K$ is the milling force coefficient under the same processing conditions (same material performance, tool parameters, and machine tool parameters), $v_{c}$ is the milling speed, $f_{z}$ is the feed speed, $a_{p}$ is the axial cutting depth, and $a_{e}$ is the radial cutting depth. Among them, $a$, $b$, $c$, and $d$ are the milling force coefficients corresponding to each milling parameter. In order to obtain each coefficient conveniently, take the logarithm on both sides of Eq. (1) to convert it into a linear function,
\[ \ln F = \ln K + a \ln v_{c} + b \ln f_{z} + c \ln a_{p} + d \ln a_{e} \tag{2} \]
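The log-linear fit of Eq. (2) can be sketched as an ordinary least-squares problem. The following is a minimal pure-Python illustration, not the MATLAB procedure used in the paper: the function names are hypothetical, and the data are synthetic values generated from an assumed power law (K = 500, a = -0.25, b = 0.6, c = 0.8, d = 0.6), not the experimental measurements.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_power_law(samples):
    """samples: list of (vc, fz, ap, ae, F). Returns (K, a, b, c, d) from Eq. (2)."""
    rows = [[1.0] + [math.log(v) for v in s[:4]] for s in samples]
    y = [math.log(s[4]) for s in samples]
    k = 5
    # Normal equations (A^T A) beta = A^T y of the linearised model.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve_linear(ata, aty)
    return (math.exp(coef[0]), *coef[1:])


# Synthetic full-factorial data from an assumed model (not measured values).
samples = [
    (vc, fz, ap, ae, 500.0 * vc ** -0.25 * fz ** 0.6 * ap ** 0.8 * ae ** 0.6)
    for vc in (60.0, 90.0, 120.0)
    for fz in (0.05, 0.10)
    for ap in (2.0, 4.0)
    for ae in (0.5, 1.0)
]
fitted = fit_power_law(samples)
```

With noise-free synthetic data the fit recovers the assumed parameters; with real measurements the residuals and the coefficient of determination would additionally have to be checked, as the text describes.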
The regress linear regression function of MATLAB can be used to conveniently perform least squares linear regression on the experimental results, obtaining the milling force coefficients corresponding to each milling parameter, as well as the coefficient of determination $R^{2}$ and the residual information of the model. After calculation, Eq. (3) is obtained as the average milling force model for side milling:
\[ \begin{aligned}
F_{x} &= e^{6.2520}\, v_{c}^{-0.2516} f_{z}^{0.6316} a_{p}^{0.8273} a_{e}^{0.6080}, \quad R^{2} = 0.9921 \\
F_{y} &= e^{5.7842}\, v_{c}^{-0.1269} f_{z}^{0.4988} a_{p}^{0.7108} a_{e}^{0.8062}, \quad R^{2} = 0.9662 \\
F_{z} &= e^{6.7497}\, v_{c}^{-0.2124} f_{z}^{1.1949} a_{p}^{0.7958} a_{e}^{0.9109}, \quad R^{2} = 0.9869
\end{aligned} \tag{3} \]
where $R^{2}$ is the regression linear correlation coefficient, and the F values obtained by the F-test are 203.62, 84.04, 18.78 and 346.07, 78.65 and 207.21 respectively. All F values are greater than $F_{0.99}(3, 23) = 4.76$, indicating that the linear regression performed after the logarithmic transformation of Eq. (1) is highly significant. The residuals and confidence intervals of the formula are shown as follows. The horizontal and vertical coordinates in Fig. 2 represent the experimental group number and the residual, respectively. It can be seen that, except for the third point in the figure, the residuals of all other data are close to zero and their confidence intervals include zero. This indicates that the above regression formula matches the original data well, and the aforementioned points can be regarded as outliers.
5 Conclusion
Using the same processing parameters, a comparative experiment was conducted on side milling and end milling of PCrNi3MoVA. The surface roughness Ra of side milling was between 0.489 and 0.938 µm, while the Ra of end milling was 0.087–0.409 µm, so the surface roughness of side milling is not as good as that of end milling. This is mainly due to the large overhang of the tool during side milling: machining with the side edges can easily cause radial runout of the tool, thereby affecting the roughness of the machined surface. In end milling, the feed rate $f_{z}$ and cutting depth $a_{p}$ have a greater impact on roughness, while the milling speed $v_{c}$ has a smaller impact. In side milling, the processing parameters have no significant influence on roughness, which also indicates that tool runout plays the major role in determining surface quality. Linear regression was performed on the orthogonal experimental results to obtain the power exponential prediction formula for milling force. The formula has high significance and small residuals, describes the milling force under the experimental conditions well, and can be used for preliminary prediction of milling force.
Fig. 2 Residual plot of milling force regression formula
(a) residual plot of milling force Fx
(b) residual plot of milling force Fy
(c) residual plot of milling force Fz
References
1. Guo, F., Huang, J.F.: Microstructure and mechanical properties of a gun steel with high strength and toughness. Heat Treat. Metals 30(11), 31–34 (2005)
2. Xiao, G.W., Duan, C.Z.: Milling force model of FV520B stainless steel by end milling cutter side milling. Tool Eng. 57(2), 107–110 (2023)
3. Chen, R.Y.: Principles of Metal Cutting, p. 1. China Machine Press, Beijing (2002)
4. Liu, X.-W., Cheng, K.: Prediction of cutting force distribution and its influence on dimensional accuracy in peripheral milling. Mach. Tools Manuf. 42, 791–800 (2002)
5. Wang, X.K., Li, D.: Mechanical Processing Process Manual. China Machine Press, Beijing (2006)
6. Yuan, J.Z.: Experimental Technology of Metal Cutting. China Machine Press, Beijing (1988)
Feature Matching Method Based on Improved SIFT and KLT
Peng Zhao, Yajie Wang, Xin Su, Niu Shan, and Jun Xiang
Abstract Feature extraction and matching are important aspects of monocular vision and key steps for autonomous flight of unmanned aerial vehicles. To address the issues of ineffective matching and low efficiency in feature extraction and matching, improved SIFT and KLT methods are proposed. Firstly, the basic principles of the SIFT and KLT methods are analyzed, and the calculation time is quantified with a profiler to determine the direction of improvement. Secondly, the Blob region is used to improve the SIFT method, and the layered iterative method is used to improve the KLT method, in order to limit the number of feature points and improve matching efficiency. Finally, the feasibility of the improved method is verified by comparing, through simulation, the feature extraction and matching results before and after the algorithm improvement.
Keywords SIFT · KLT · Blob · Hierarchical iteration · Feature matching
1 Introduction
Monocular vision motion estimation is the key to autonomous flight of unmanned aerial vehicles, and feature matching is one of the important links in motion estimation. At present, feature extraction and matching based on continuous information between adjacent frames is more suitable for maneuvering flight of unmanned systems. However, the speed of feature extraction and matching seriously restricts the update frequency of monocular vision, which has a serious impact on subsequent motion estimation. Scholars at home and abroad have studied feature matching extensively. Jia [1] from the University of Information Engineering uses INS to assist feature matching, which can effectively reduce the search radius and improve the accuracy of feature matching. Ma Xuelei effectively improves the registration accuracy of different perspectives by registering on 3D point clouds. Xu Ao et al. from Shanghai University of Technology proposed an improved method combining hybrid filtering, feature description dimensionality reduction, and SIFT feature matching. This article takes the SIFT/KLT algorithm as the research object, improves the algorithm based on Blob and hierarchical iteration methods, establishes the improved algorithm models, and compares the matching efficiency before and after improvement through simulation to verify the efficiency of the improved feature matching. Based on the research results, this method can provide a reference for monocular vision used in unmanned systems.
P. Zhao (B) · Y. Wang · X. Su · N. Shan · J. Xiang
SiChuan Aerospace Fenghuo Servo Control Technology Corporation, Chengdu 611130, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_50
2 Algorithm Description
2.1 SIFT Operator
The invariance properties of the SIFT operator benefit feature detection and extraction, and the operator can handle the feature matching problem between two images. Its steps mainly include Gaussian blur, extreme value detection, key point positioning, direction determination, and feature description.
Gaussian Blur
Gaussian blur is the convolution of the image with a Gaussian (normal) distribution. The Gaussian kernel, the unique linear kernel for scale-space construction, is applied to reduce image noise and level of detail while generating the required scale space. To keep the blur smooth, weights are allocated according to the distance between each point and the center, where the Gaussian kernel function is [2]:
\[ G(x, y) = \frac{1}{2\pi\sigma^{2}} \times e^{-\frac{x^{2} + y^{2}}{2\sigma^{2}}} \tag{1} \]
Here $\sigma$ is the standard deviation of the normal distribution; the larger it is, the more blurred the corresponding image. The degree of Gaussian blur can be set by different standard deviations and template orders. The normalized Gaussian template can be quickly generated with the fspecial function, as follows:
Gauss = fspecial('gaussian', [5, 5], 0.6)    (2)
The Gaussian template of size 5 × 5 with σ = 0.6 generated by the fspecial function is:
\[ G = \begin{bmatrix}
6.58573\mathrm{e}{-06} & 4.24781\mathrm{e}{-04} & 1.70354\mathrm{e}{-03} & 4.24781\mathrm{e}{-04} & 6.58573\mathrm{e}{-06} \\
4.24781\mathrm{e}{-04} & 2.73984\mathrm{e}{-02} & 1.09878\mathrm{e}{-01} & 2.73984\mathrm{e}{-02} & 4.24781\mathrm{e}{-04} \\
1.70354\mathrm{e}{-03} & 1.09878\mathrm{e}{-01} & 4.40655\mathrm{e}{-01} & 1.09878\mathrm{e}{-01} & 1.70354\mathrm{e}{-03} \\
4.24781\mathrm{e}{-04} & 2.73984\mathrm{e}{-02} & 1.09878\mathrm{e}{-01} & 2.73984\mathrm{e}{-02} & 4.24781\mathrm{e}{-04} \\
6.58573\mathrm{e}{-06} & 4.24781\mathrm{e}{-04} & 1.70354\mathrm{e}{-03} & 4.24781\mathrm{e}{-04} & 6.58573\mathrm{e}{-06}
\end{bmatrix} \tag{3} \]
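As a cross-check of the 5 × 5 template, the normalized Gaussian kernel can be recomputed directly from the kernel function in Eq. (1). The sketch below is a plain-Python illustration with a hypothetical function name; it samples the exponential on an integer grid and normalizes the weights to sum to one, which is assumed to be what the MATLAB call does for this size and σ.

```python
import math


def gaussian_kernel(size=5, sigma=0.6):
    """Sample exp(-(x^2 + y^2) / (2 sigma^2)) on a size x size grid and normalise to sum 1."""
    half = size // 2
    raw = [
        [math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) for x in range(-half, half + 1)]
        for y in range(-half, half + 1)
    ]
    total = sum(sum(row) for row in raw)
    return [[value / total for value in row] for row in raw]


kernel = gaussian_kernel(5, 0.6)
center_weight = kernel[2][2]   # weight of the central pixel
corner_weight = kernel[0][0]   # weight of a corner pixel
```

The constant factor $1/(2\pi\sigma^{2})$ cancels in the normalization, so only the exponential needs to be evaluated.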
Scale Space Extreme Value Detection
The Gaussian pyramid model uses downsampling and is composed of images of different resolutions, organized as O groups of S layers. It suppresses the high-frequency information of the image and removes information outside the retained frequency band. Let the image be represented as $I(x, y)$, and use $L(x, y, \sigma)$ to construct the multi-scale space and describe the image pyramid. If the size of the original image is $M \times N$, then:
\[ S = \log_{2}\min(M, N) - t, \qquad t \in [0, \log_{2}\min(M, N)] \tag{4} \]
Using the difference of Gaussians (DOG) function, the expression is as follows:
\[ D(x, y, \sigma) = \big(G(x, y, k\sigma) - G(x, y, \sigma)\big) * I(x, y) \tag{5} \]
Accurate Positioning of Key Points
To obtain the extremum points in continuous space, sub-pixel interpolation is used to eliminate the deviation of the discrete points, and a threshold is set for judgment until the interpolation calculation converges. The Taylor expansion of the contrast function is as follows:
\[ D(x) = D + \frac{\partial D^{T}}{\partial x}\,\delta x + \frac{1}{2}\,\delta x^{T}\frac{\partial^{2} D}{\partial x^{2}}\,\delta x \tag{6} \]
\[ \delta x = -\left(\frac{\partial^{2} D}{\partial x^{2}}\right)^{-1}\frac{\partial D(x)}{\partial x} \tag{7} \]
On this basis, the principal curvature method [50] is used to eliminate the edge extreme points, so as to ensure the effectiveness and stability of the operator. Based on the above fitting function, the central differences are constructed using the finite difference method and expanded as follows:
\[ f'(x_{i}) = \left(\frac{\partial f}{\partial x}\right)_{x_{i}} \approx \frac{f(x_{i} + h) - f(x_{i} - h)}{2h} \tag{8} \]
\[ f''(x_{i}) = \left(\frac{\partial^{2} f}{\partial x^{2}}\right)_{x_{i}} \approx \frac{f(x_{i} + h) + f(x_{i} - h) - 2f(x_{i})}{h^{2}} \tag{9} \]
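The central differences of Eqs. (8) and (9) can be checked numerically on a function with known derivatives. The sketch below uses hypothetical function names and the test function $f(x) = x^{3}$, whose first and second derivatives at $x = 2$ are both 12; it is an illustration of the formulas, not code from the paper.

```python
def first_central_diff(f, x, h=1e-3):
    """Eq. (8): first derivative from the two neighbouring samples."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def second_central_diff(f, x, h=1e-3):
    """Eq. (9): second derivative; the two neighbours are added, the centre is subtracted twice."""
    return (f(x + h) + f(x - h) - 2.0 * f(x)) / (h * h)


def cube(x):
    return x ** 3


slope = first_central_diff(cube, 2.0)       # close to 3 * 2^2 = 12
curvature = second_central_diff(cube, 2.0)  # close to 6 * 2 = 12
```

In the SIFT refinement step the same stencils are applied to the DoG samples, with h equal to one pixel or one scale step.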
Determination of Key Point Direction
Rotation invariance is largely determined by the reference direction of the key points. The histogram of oriented gradients (HOG) is an important step in computing the gradient modulus and direction of points in the Gaussian images.
2.2 KLT Matching Algorithm
Given images I and J, set the window W and the weight function w(x), with I(x − d) = I(x) − g × d. The minimum gray-level squared difference formula and the displacement of the feature window between adjacent images can then be obtained, and each extracted feature point is iterated according to Newton's method.
(1) The vision.PointTracker operator performs corner detection on two adjacent frames and assigns information to the key points. The point sets are defined as imagePoint1 and imagePoint2.
(2) Create a point tracker using vision.PointTracker as follows:
Tr = vision.PointTracker('MBE', 1, 'NPL', 5)    (10)
MBE is the abbreviation of MaxBidirectionalError. It is defined as the forward-backward error threshold, i.e. the maximum pixel distance allowed between a point in the first frame and the position obtained by tracking it forward to the next frame and back again. If the error is greater than the set value of this property, the corresponding feature point is considered invalid and removed. NPL is the abbreviation of NumPyramidLevels. It represents the number of pyramid levels; the higher the value, the greater the displacement between adjacent frames that can be handled.
(3) Initialize the tracking process by using the initialize function to specify the initial positions of the points and the initial frame. The key point coordinates must be given as an M × 2 array, and the initial frame must be a 2D grayscale or RGB image.
(4) Use step to track the feature points: [point, valid] = step(tracker, I). Here, tracker is the tracker object holding the feature points of the previous frame, I is the input image of the next frame, point is the output position of the tracked points, and valid is an M × 1 array representing the tracking quality.
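The forward-backward check behind the MaxBidirectionalError threshold can be sketched independently of MATLAB. In the illustration below the two tracking functions are hypothetical toy stand-ins (a constant shift, with an artificial drift for points on the right side of the image); only the validity rule itself reflects the description above.

```python
import math


def forward_backward_valid(points, forward, backward, max_error=1.0):
    """Track each point forward and back; keep it only if it returns within max_error pixels."""
    results = []
    for p in points:
        q = forward(p)             # position in the next frame
        p_back = backward(q)       # position tracked back into the first frame
        error = math.hypot(p_back[0] - p[0], p_back[1] - p[1])
        results.append((q, error <= max_error))
    return results


def toy_forward(p):
    # Hypothetical motion: every point moves by (+3, +1) pixels.
    return (p[0] + 3.0, p[1] + 1.0)


def toy_backward(q):
    # Hypothetical backward tracker that drifts by 3 pixels for points with x > 50.
    drift = 4.0 if q[0] > 50.0 else 1.0
    return (q[0] - 3.0, q[1] - drift)


tracked = forward_backward_valid([(10.0, 10.0), (60.0, 20.0)], toy_forward, toy_backward)
```

Raising `max_error` accepts more points at the cost of admitting less reliable tracks, which matches the trade-off discussed for this parameter.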
3 Algorithm Improvement
3.1 Improved SIFT Algorithm Based on Blob
To improve the efficiency of the SIFT computation, blob regions are used to improve feature point extraction [3]. Given a Gaussian pyramid with a layers and b groups and an image size of m × n, precise positioning of key points requires 26 × m × n × a × b comparisons. A profiler was used to quantify the calculation time. The feature extraction function (GetFeatures) ran for a total of 511.619 s, of which the interpLocation function took 468.426 s, accounting for 91.6% of the total time. To reduce the number of points examined during precise point localization, the blob region threshold is set to D. Only points that meet the conditions of this region are subjected to precise point localization, thereby limiting the number of feature points and shortening the extraction time.
3.2 Improved KLT Algorithm Based on Hierarchical Iterative Design
Based on the Gaussian pyramid of the image [4], downsampling pyramid decomposition is performed on two adjacent frames. A small window is selected for the search, and the result obtained on one layer is used as the starting value for the iteration on the next layer, layer by layer. If the pyramid layer is L and the two images are named A and B, the main steps are:
(1) Establish a pyramid with the same number of layers for the two frames, and initialize the distance value $d_{0}^{L} = 0$ on layer L;
(2) Let n be the number of Newton iterations on layer L, with an initial value of 0. According to Eq. (3.28), combined with the two frames on layer L, the distance value $d_{n}^{L}$ can be solved. If $d_{n}^{L} - d_{n-1}^{L}$ meets the set accuracy, or if n is greater than or equal to the set number of iterations, this step is completed and the next step is taken; otherwise, the second step is repeated until all pixels are processed;
(3) If L = 0, all layers of the image pyramid have been processed and the cycle ends. Otherwise, set $d_{0}^{L-1}$ and n = 0, and return to the second step.
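The layered iteration can be illustrated with a one-dimensional coarse-to-fine Gauss-Newton shift estimate. This is a minimal sketch under stated assumptions, not the authors' implementation: the signals are analytic Gaussian bumps (so no interpolation is needed), the names are hypothetical, and the estimate from a coarse layer is doubled to initialize the next finer layer, which is the usual pyramidal Lucas-Kanade propagation rule rather than something stated explicitly in the steps above.

```python
import math

SIGMA, CENTER, SHIFT = 8.0, 64.0, 6.0   # hypothetical test signal parameters


def image_i(x):
    return math.exp(-((x - CENTER) ** 2) / (2.0 * SIGMA ** 2))


def image_j(x):
    return image_i(x - SHIFT)            # J is I displaced by SHIFT pixels


def d_image_j(x):
    return -(x - CENTER - SHIFT) / SIGMA ** 2 * image_j(x)


def newton_shift(f_i, f_j, df_j, xs, d0, eps=1e-6, max_iter=20):
    """Iterate on d to minimise sum over x of (I(x) - J(x + d))^2 on one layer."""
    d = d0
    for _ in range(max_iter):            # stop when n reaches the set number of iterations
        num = sum(df_j(x + d) * (f_i(x) - f_j(x + d)) for x in xs)
        den = sum(df_j(x + d) ** 2 for x in xs)
        step = num / den
        d += step
        if abs(step) < eps:              # d_n - d_(n-1) meets the set accuracy
            break
    return d


def pyramidal_shift(f_i, f_j, df_j, width, levels):
    d = 0.0                              # initial distance value on the coarsest layer
    for level in range(levels, -1, -1):
        s = 2 ** level                   # one pixel on this layer spans s original pixels
        xs = range(width // s)
        d = newton_shift(lambda x: f_i(s * x), lambda x: f_j(s * x),
                         lambda x: s * df_j(s * x), xs, d)
        if level > 0:
            d *= 2.0                     # assumed propagation to the next finer layer
    return d


estimate = pyramidal_shift(image_i, image_j, d_image_j, width=128, levels=2)
```

On the coarsest layer the displacement is only a quarter of its full-resolution value, so a small search window suffices there; the finer layers then only refine the propagated estimate.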
4 Simulation Analysis
The standard images Airplane, Book, and Scene from the field of image processing are used to test feature extraction and matching and verify the correctness of the algorithm.
4.1 Feature Extraction Comparison Test
On the Airplane image, SIFT and the improved method were compared, and the commonly used Harris method was added to highlight the comparison. Figure 1 shows that SIFT extracts a large number of feature points, with a corresponding time cost, and the extraction may contain errors that cannot yet be confirmed. Harris only needs to perform grayscale processing on the image, so its extraction should be the fastest; however, because of its dependence on image gray levels, many features cannot be recognized, resulting in the smallest number of extracted features. The improved SIFT algorithm achieves good results by balancing the quantity and efficiency of extraction (Table 1).
Fig. 1 SIFT/improved SIFT/Harris feature extraction

Table 1 Extracting results using different methods
Type        Harris            SIFT              Improved SIFT
            Number   Time     Number   Time     Number   Time
Book        380      1822     835      2028     470      1351
Scene       510      2952     759      3764     442      1915
Airplane    238      3424     1895     9392     965      4460
Table 2 Test values for different parameters
Type    MaxBidirectionalError         NumPyramidLevels
Test    0      0.2    0.3    0.4      0    1
Test    0.5    0.6    0.8    1.1      2    3
Test    1.5    2      2.5    3        4    5
Test    3.5    4      4.5    5        1    2
4.2 Feature Matching Test
Feature matching is based on the improved SIFT feature extraction and uses the layered iterative design of the KLT algorithm. It was tested with MATLAB's vision.PointTracker to study the impact of the maximum bidirectional error and the number of pyramid layers on the number of matching points. In this section, Book and Scene matching, as well as Airplane images from different angles, were selected. Firstly, to determine the impact of the maximum bidirectional error and the number of pyramid layers on the number of matching points in KLT matching, a combination test was conducted on the two parameter values (Fig. 2; Table 2). The number of matching points is negatively correlated with NumPyramidLevels: as the number of pyramid layers increases, the number of matching points decreases. This also indicates that one of the functions of KLT matching based on hierarchical iteration is to limit feature point mismatches. The graph shows a positive correlation between the number of matching points and MaxBidirectionalError, but the growth in matching points caused by a larger maximum bidirectional error seriously degrades the quality of matching. The figures also show that NumPyramidLevels has a greater impact on the number of matching points than MaxBidirectionalError. Considering both effects on the matching results, the parameter combination MaxBidirectionalError = 1, NumPyramidLevels = 2 is used for the matching test of the improved KLT algorithm (Fig. 2). In the matching of the above two frames, the number of matching points between Book and Scene is 59 pairs, and the number of Airplane matching points across different perspectives is 682.
Fig. 2 SIFT/improved SIFT/Harris feature extraction
5 Conclusion
The Book/Scene/Airplane simulation analysis shows that the improved SIFT feature extraction method improves extraction efficiency while retaining a sufficient number of feature points, and that the extracted feature points can be matched to the greatest extent. The test results show that the improved SIFT/KLT feature point matching method effectively addresses the problems of an excessive number of feature points, heavy computation, and long run time. It also handles large displacements between adjacent frames and adapts better to the dynamic flight of the aircraft.
References
1. Jia, X.: Visual SLAM algorithm based on adaptive adaptive auxiliary features. Opt. Precis. Eng. (2023)
2. Jiang, L.: Study on Image Blur Measurement Methods Based on Multi-scale Space Analysis. Xihua University (2014)
3. Sun, X.: UAV image matching algorithm based on SIFT improvement. Electr. Light Control (2023)
4. Yu, Z.: Use the logo point matching method of improving the KLT algorithm. Laser Optoelectron. Prog. (2018)
5. Ma, X.: Method-based sheep-dot-point cloud allocation method. J. Chin. Agric. Univ. (2023)
6. Xu, A.: A SIFT-based improved feature point matching algorithm. Software
Violation Detection Method Based on Improved YOLOv5s
Shuo Liu, Yu-chen Liang, Xiao-cheng Ma, and Yun-qi Guo
Abstract To reduce the number of traffic accidents caused by violations, violations are detected, and drivers or passengers who violate regulations are warned and punished to encourage them to comply with traffic rules. Traditional methods for violation detection in complex environments have low detection accuracy and poor adaptability to the environment. In response to these problems, the YOLOv5s network model is improved. The SPP module is replaced by the ghostSPPF module to reduce the number of model parameters, the CBS module in the backbone network is replaced by the MP2 module to increase the channel dimensionality, the C3 structure is replaced by the C2f structure, and the BCEWithLogitsLoss loss function is replaced by the QFocalLoss loss function. The improved network model is used to detect drivers using mobile phones while driving. The mAP of the improved YOLOv5s network model reaches 93.02%, which is higher than the 91.37% of the original YOLOv5s network model. The detection frame rate of the improved YOLOv5s network model reaches 87.33 FPS, which meets the needs of real-time traffic violation detection.
Keywords Deep learning · Traffic violation detection · YOLOv5s · Data enhancement
1 Introduction
With the increasingly widespread application of deep learning technology, it is being used by more and more object detection algorithms, which are constantly updated and iterated. According to their feature extraction methods, the existing object detection algorithms can be divided into three types. The first type is traditional object detection algorithms, which manually design the features of the detected object and use machine learning for classification. The second type is classification-based object detection algorithms that combine candidate boxes with deep learning, such as the Fast R-CNN algorithm. These are Two-Stage methods, which have higher accuracy than traditional methods, but their detection speed is slow and it is difficult for them to meet the needs of real-time detection. The third type is regression-based object detection algorithms built on deep learning, such as the YOLO algorithm. These are One-Stage methods, which improve both accuracy and speed and can achieve real-time detection. YOLO is the pioneering work among One-Stage object detection algorithms, proposed by Joseph Redmon and Ali Farhadi et al. in 2015 (see [1]). YOLO performs object detection by regression, using the entire image as input to the network. With only one neural network, the position of the bounding box of the detected object can be obtained and its category determined. The full name of YOLO is 'You Only Look Once': the image needs to be examined only once to obtain the location of the detection target and determine its category. YOLO can quickly detect objects in images or videos. Compared to traditional object detection algorithms and candidate-box-based object detection algorithms, YOLO has a faster detection speed for the same range of recognition categories and the same recognition accuracy. YOLO's detection speed can meet the real-time detection requirements of videos, and the average accuracy of YOLO's real-time detection is better than that of other real-time detection systems. However, because one cell in YOLO can only predict two boxes and one category, the detection effect for objects that are close to each other and for groups of small targets is not ideal. YOLOv2 optimized YOLO's network structure, adopted the new Darknet-19 network structure, added Batch Normalization, and introduced the Anchor Box mechanism, which improved detection speed and accuracy. However, the detection effect for small targets is still not good (see [2]).
S. Liu (B) · Y. Liang · X. Ma · Y. Guo
Beijing Institute of Precise and Electromechanical Control Equipments, Beijing 100076, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_51
YOLOv3 uses the Darknet-53 network to replace YOLOv2's Darknet-19 network and adds multi-scale detection and multi-label classification, which improves detection speed and accuracy. Compared to YOLOv2, detection accuracy improves significantly in scenes where the targets are small objects (see [3]). YOLOv4 selected various optimization methods that had emerged since the release of YOLOv3 to improve detection accuracy. While combining them, further innovations were made to balance YOLOv4's detection speed and accuracy (see [4]). On the basis of YOLOv4, YOLOv5 uses data enhancement operations in the input layer to preprocess the input images, uses the Focus structure and CSP structure in the backbone network, and improves the loss function, further improving the detection speed and accuracy of the network model (see [5]). With the increase in the number of vehicles, the number of traffic accidents has also increased. To regulate driving behavior, reduce the occurrence of traffic accidents, and reduce casualties, it is necessary to detect the illegal behavior of drivers using mobile phones while driving. This paper proposes an improved model based on the YOLOv5s algorithm. The SPP module is replaced by the ghostSPPF module to reduce the model parameters, the CBS module in the backbone network is replaced by the MP2 module to increase the dimension of the channel, the CSP structure is replaced by the C2f structure, and the BCEWithLogitsLoss function is replaced by the QFocalLoss function. The improved network model was validated through ablation experiments and comparative experiments. The results show that the improved network model significantly improves the detection accuracy of violations, and the detection frame rate meets the needs of real-time detection.
Fig. 1 YOLOv5s network model structure
2 YOLOv5s Network Model There are four versions of YOLOv5 network, namely YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x. YOLOv5s is the network model with the smallest depth and feature map width in the YOLOv5 series detection network (see [6]), so the weight file of YOLOv5s is relatively small. YOLOv5s network model can take advantage of its small size to be deployed and used in embedded devices (see [7]). The YOLOv5s network model is divided into four parts: input, backbone, multi-scale feature fusion network (Neck), and output (see [8]). The structure of the YOLOv5s network model is shown in Fig. 1.
2.1 Input
The input image is first preprocessed at the input of the YOLOv5s network model. Mosaic data enhancement is used to splice images, which enriches the training data and increases the background complexity of the detected objects in the training images, and adaptive anchor box calculation, adaptive image scaling, and other operations are performed (see [9]). The image is then fed into the convolutional neural network for training, improving the model's detection ability for target objects.
2.2 Backbone
The main structures of the backbone network of the YOLOv5s network model are the Focus module, the CSP structure, and the SPP module (see [9]). The key of the Focus module is to slice the feature map: the high-resolution feature map is split into multiple low-resolution feature maps, converting the width and height information of the image into the channel dimension. Different features are then extracted through convolution, reducing computational complexity and accelerating processing (see [10]). To address the problem of duplicated gradient information in network optimization, the CSP structure is designed to achieve richer gradient combinations while reducing the computational burden of the network model during training (see [11]). The SPP module uses three combined multi-scale maximum pooling layers to extract spatial feature information of different sizes, outputs vectors of fixed size, fuses multiple receptive fields, and improves the detection accuracy of the network model when the targets to be detected in the image differ greatly in size (see [12]).
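The slicing performed by the Focus module can be sketched on nested Python lists. The function name and the tiny 4 × 4 example are illustrative; the slice order (even rows and even columns first, then odd rows, then odd columns, then both odd) follows the commonly published YOLOv5 Focus layer and should be treated as an assumption here. The convolution that follows the slicing is omitted.

```python
def focus_slice(image):
    """image: list of C channels, each an H x W nested list with even H and W.

    Returns 4 * C channels of size H/2 x W/2 by taking every second pixel
    with the four possible (row, column) offsets.
    """
    sliced = []
    for dy, dx in ((0, 0), (1, 0), (0, 1), (1, 1)):
        for channel in image:
            sliced.append([row[dx::2] for row in channel[dy::2]])
    return sliced


# One channel of size 4 x 4 holding the values 0..15 row by row.
single_channel = [[[4 * r + c for c in range(4)] for r in range(4)]]
focused = focus_slice(single_channel)
```

No pixel is discarded: the spatial resolution is halved in each direction while the number of channels is multiplied by four.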
2.3 Multi-scale Feature Fusion Network (Neck)

The Neck combines a feature pyramid structure with a path aggregation network structure (see [13]). On top of the feature pyramid structure, a bottom-up feature transmission path is added, so that feature information at the bottom can be better transmitted to the top layer, which enhances the feature fusion ability of the network model. This improves the detection performance of the network model when the targets to be detected are densely distributed in the image.
2.4 Output

At the output of the YOLOv5s network model, the IOU_Loss loss function used in YOLOv3 is replaced by the GIOU_Loss loss function. GIOU_Loss adds a measure of the intersection scale, addresses the problem of non-intersecting bounding boxes in earlier YOLO versions, and improves the recognition accuracy of the network (see [14]).
3 Improvement of YOLOv5s Network Model

3.1 Optimization of SPPF Module

The SPP (Spatial Pyramid Pooling) module has been used in YOLO network models since YOLOv3. It applies 1 × 1, 5 × 5, 9 × 9 and 13 × 13 max pooling in parallel to fuse multi-scale features and enlarge the receptive field (see [15]). The YOLOv5s network model uses SPPF (Spatial Pyramid Pooling Fast), an upgraded SPP structure, shown in Fig. 2. Compared with SPP, the SPPF module reduces the size of the pooling kernels and connects them in cascade, which integrates feature maps of different receptive fields and enriches the expressive power of the feature map while also improving the running speed (see [16]).

The Ghost module is a model compression method whose structure is shown in Fig. 3. In the Ghost module, m sets of convolutional kernels are convolved with the input to generate an intrinsic feature map of dimension m × h × w. A linear transformation φ is then applied to the intrinsic feature map to generate ghost feature maps. Using both the intrinsic and the ghost feature maps as output generates more feature maps with fewer parameters, which reduces computational complexity and improves the computational speed of the network model without affecting network accuracy. In the SPPF module, the CBS module is replaced by the Ghost Conv module, resulting in a new ghost SPPF module.

Embedded devices have limited resources, so when deploying a neural network on them it is necessary to consider whether the device can meet the network's resource requirements. The lightweight Ghost module further reduces the parameter size of the YOLOv5s network model, making it better suited to embedded devices and further improving its inference speed.
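The effect of connecting the pooling kernels in cascade, as SPPF does, can be checked numerically. The sketch below assumes stride-1 max pooling with "same" padding (padding value minus infinity) and a 5 × 5 kernel in the cascade; the helper `max_pool_same` is hypothetical. Under these assumptions, one, two and three cascaded 5 × 5 poolings reproduce the 5 × 5, 9 × 9 and 13 × 13 poolings of SPP.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def max_pool_same(x, k):
    """Stride-1 max pooling with 'same' padding on a 2-D map (k odd)."""
    pad = k // 2
    padded = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    return sliding_window_view(padded, (k, k)).max(axis=(2, 3))

rng = np.random.default_rng(0)
feature = rng.normal(size=(20, 20))

# Cascade (SPPF style): each 5x5 pooling is applied to the previous result.
p1 = max_pool_same(feature, 5)
p2 = max_pool_same(p1, 5)
p3 = max_pool_same(p2, 5)

# Parallel (SPP style): larger kernels applied to the same input.
same_5 = np.array_equal(p1, max_pool_same(feature, 5))
same_9 = np.array_equal(p2, max_pool_same(feature, 9))
same_13 = np.array_equal(p3, max_pool_same(feature, 13))
print(same_5, same_9, same_13)
```

The cascade reuses earlier pooling results instead of scanning 9 × 9 and 13 × 13 windows from scratch, which is consistent with the speed advantage attributed to SPPF.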
Fig. 2 SPPF module structure
Fig. 3 Ghost model structure
Fig. 4 CBS module structure
3.2 Optimization of CBS Module

Feature fusion in the YOLOv5s network model uses the common CBS module, which is composed of a convolution layer, batch normalization, and a SiLU activation layer. Batch normalization improves the convergence speed by standardizing the activations. The SiLU activation function adds a linear part to the Sigmoid activation function so that it fits linear functions better, making the learning process faster and more stable. The structure of the CBS module is shown in Fig. 4.

The MP2 module has two branches, as shown in Fig. 5. The first branch downsamples with max pooling and then applies a 1 × 1 convolution to change the number of channels. The second branch first applies a 1 × 1 convolution to change the number of channels and then passes through a convolution block with a 3 × 3 kernel and a stride of 2, which also downsamples. Finally, the results of the two branches are added together to obtain the final downsampling result (see [17]).

The MP2 module replaces the CBS module in the YOLOv5s network model. While convolution extracts features as usual, the max pooling operation expands the receptive field. Fusing the features extracted by the two branches preserves both the texture features and the local information of the image, which reduces the information loss caused by the direct use of convolution in the original YOLOv5s network model and improves the generalization of the network (see [17]).
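For the two MP2 branches to be combined, both must produce feature maps of the same spatial size. The arithmetic below is a sketch under settings that the text does not state and that are assumed here: a 2 × 2 max pooling with stride 2 and no padding in the first branch, and a padding of 1 for the 3 × 3 stride-2 convolution in the second branch. The function names are illustrative.

```python
def output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def mp2_branch_sizes(size: int) -> tuple:
    # Branch 1: max pooling (assumed 2x2, stride 2), then 1x1 convolution.
    branch1 = output_size(size, kernel=2, stride=2, padding=0)
    branch1 = output_size(branch1, kernel=1, stride=1, padding=0)
    # Branch 2: 1x1 convolution, then 3x3 convolution with stride 2.
    branch2 = output_size(size, kernel=1, stride=1, padding=0)
    branch2 = output_size(branch2, kernel=3, stride=2, padding=1)
    return branch1, branch2

print(mp2_branch_sizes(80))
```

With these assumed settings an even input size is halved by both branches, so their outputs can be fused element by element.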
Fig. 5 MP2 module structure
Fig. 6 C3 module structure
3.3 Optimization of C3 Structure

The C3 module in the YOLOv5s network model is a CSP architecture with two branches, as shown in Fig. 6. One branch uses multiple stacked bottleneck modules and three standard convolutional layers, while the other branch passes through only one basic convolutional module. Finally, the two branches are concatenated (see [18]).

The function of the C3 module is to learn residual features. The Bottleneck module in the main branch of the CSP structure has a residual structure, which increases the gradient values propagated back between layers and prevents vanishing gradients as the network deepens. This allows deeper networks to be trained while retaining good feature extraction ability and improving the accuracy of the network model (see [19]).

Compared with C3, the C2f module has one fewer CBS layer and uses a split operation instead of Conv to divide the features. Its structure is shown in Fig. 7. Replacing the C3 structure with the C2f structure yields richer gradient flow information while keeping the model lightweight.
Fig. 7 C2f module structure
3.4 Optimization of Loss Function

The classification loss of the YOLOv5s network model uses the BCEWithLogitsLoss loss function, but BCEWithLogitsLoss only calculates the classification loss of positive samples. To balance positive and negative samples as well as difficult and easy samples, the QFocalloss loss function is used as the classification loss function instead. The QFocalloss loss function is given in Formula (1).
$$
QFL(\theta) = -a_t \, |y - \sigma|^{\beta} \, \big[(1 - y)\log(1 - \sigma) + y\log(\sigma)\big], \qquad a_t = y \, a + (1 - y)(1 - a) \tag{1}
$$
For the detection of violations, faces, hands and mobile phones occupy only a small proportion of the whole picture. In the QFocalloss loss function, the factor $a_t$ balances positive and negative samples, and the factor $|y - \sigma|^{\beta}$ balances difficult and easy samples. Together they increase the attention of the loss function to positive samples and high-quality samples, reduce the interference of negative samples, emphasize samples that are difficult to detect, and further improve the accuracy of the model.
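Formula (1) can be written out element by element as in the following sketch. It assumes that σ is already a probability in (0, 1) and that y is the target in [0, 1]; the function name `quality_focal_loss`, the clipping constant, and the default values a = 0.25 and β = 2.0 are illustrative assumptions, not values reported in this paper.

```python
import numpy as np

def quality_focal_loss(sigma, y, a=0.25, beta=2.0, eps=1e-12):
    """Element-wise loss of Formula (1); sigma is the predicted probability."""
    sigma = np.clip(sigma, eps, 1.0 - eps)      # keep log() finite
    a_t = y * a + (1.0 - y) * (1.0 - a)         # balances positives and negatives
    modulating = np.abs(y - sigma) ** beta      # small for easy samples
    bce = -((1.0 - y) * np.log(1.0 - sigma) + y * np.log(sigma))
    return a_t * modulating * bce

easy_positive = quality_focal_loss(0.9, 1.0)    # confident and correct
hard_positive = quality_focal_loss(0.1, 1.0)    # confident and wrong
print(easy_positive < hard_positive)
```

The modulating factor shrinks the loss of samples that are already predicted well, so training is driven mostly by the difficult ones.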
4 Experimental Setup and Result Analysis

4.1 Experimental Parameters and Environment

The dataset used for training consists of 10,000 images, of which 4560 contain violations of drivers using their phones while driving and 5440 contain no violation. The network hyperparameters are configured as follows: the SGD optimizer is used during training, the initial learning rate is 0.01, the learning rate follows a cosine decay schedule, the batch size is 16, and a total of 300 epochs are trained. Experimental environment: the CPU is an i7-11700 @ 2.50 GHz, the GPU is an Nvidia 3080 Ti with 16 GB of graphics memory, the CUDA version is 10.2, the cuDNN version is 7.6.5, the PyTorch version is 1.7.0, and the programming language is Python 3.7.
4.2 Data Preprocessing

In the data preprocessing stage, data augmentation is applied to the images in the training dataset to increase the background complexity of the detection targets and improve the generalization ability of the model. However, if data augmentation is used in every epoch, the model learns only the features of the augmented data rather than those of the real data. Therefore, in the first 210 epochs of training, 50% of the images are subjected to Mixup + Mosaic data augmentation, while in the last 90 epochs no data augmentation is performed, so that the model generalizes well and still fits the real data distribution.
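The augmentation schedule described above can be expressed as a small decision function. This is a sketch only: the names `should_augment`, `AUG_EPOCHS` and `AUG_PROBABILITY` are hypothetical, epochs are counted from zero, and the 50% share is modelled as an independent random draw per image, which is one possible reading of the text.

```python
import random

AUG_EPOCHS = 210        # epochs with augmentation enabled (from the text)
AUG_PROBABILITY = 0.5   # share of images augmented in those epochs

def should_augment(epoch, rng):
    """Return True if an image drawn in this 0-based epoch is augmented."""
    if epoch >= AUG_EPOCHS:
        return False
    return rng.random() < AUG_PROBABILITY

rng = random.Random(0)
early = sum(should_augment(0, rng) for _ in range(10_000)) / 10_000
late = sum(should_augment(250, rng) for _ in range(10_000)) / 10_000
print(f"epoch 0: {early:.2f} augmented, epoch 250: {late:.2f} augmented")
```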
4.3 Violation Criteria

Object detection alone cannot determine whether an image or video contains a violation, so a violation criterion is designed on top of the detection results. Since the violation occurs within the range of the vehicle's front window, the front window of the vehicle must be detected first. When the face, hand, and mobile phone detected inside the front window overlap one another, and the overlapping area is located on the right side of the front window, the driver is judged to be using a mobile phone in the picture or video.
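The criterion can be sketched as a rule on bounding boxes. Two readings are assumed here and are not confirmed by the text: the "cross area" is taken as the region common to the face, hand and phone boxes inside the window, and "right side" means that the centre of this region lies to the right of the window's vertical centre line in image coordinates. The names `intersect` and `is_violation` are illustrative.

```python
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)

def intersect(a: Box, b: Box) -> Optional[Box]:
    """Return the overlap of two boxes, or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None

def is_violation(window: Box, face: Box, hand: Box, phone: Box) -> bool:
    """Apply the overlap-and-position rule to one set of detections."""
    common = intersect(face, hand)
    common = intersect(common, phone) if common else None
    common = intersect(common, window) if common else None
    if common is None:
        return False
    # Assumption: compare the overlap centre with the window centre line.
    overlap_centre_x = (common[0] + common[2]) / 2
    window_centre_x = (window[0] + window[2]) / 2
    return overlap_centre_x > window_centre_x

window = (0.0, 0.0, 100.0, 50.0)
print(is_violation(window, (60, 10, 80, 30), (70, 20, 90, 40), (72, 22, 85, 35)))
```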
4.4 Evaluation Criteria

The mainstream evaluation indicator in the field of object detection, mean average precision (mAP), is used to evaluate the object detection performance of the improved network model, as shown in Formula (2) (see [20]).

$$
\begin{cases}
mAP = \dfrac{1}{classes} \displaystyle\sum_{1}^{classes} \int_{0}^{1} P(R)\, dR \\[2ex]
P = \dfrac{TP}{TP + FP} \\[2ex]
R = \dfrac{TP}{TP + FN}
\end{cases} \tag{2}
$$
Table 1 Results of ablation experiment

| Improvement points | Parameter capacity (M) | Frame rate (FPS) | mAP (%) |
|---|---|---|---|
| Original YOLOv5s | 9.65 | 87.42 | 91.37 |
| ghostSPPF | 9.43 | 88.23 | 91.43 |
| MP2 | 9.709 | 88.21 | 92.25 |
| C2f | 9.83 | 87.33 | 92.69 |
| QFocalloss | 9.83 | 87.33 | 93.02 |
Here, classes is the number of categories. Four categories are detected for violation detection: window (the front window of the car), face, hand, and phone. P and R denote precision and recall, respectively. TP is the number of positive samples predicted correctly, FP is the number of negative samples incorrectly predicted as positive, and FN is the number of positive samples incorrectly predicted as negative (see [21]). To meet the real-time requirements of violation detection in practical application scenarios, the frames per second (FPS) of the network model is also used as a reference indicator in this experiment. A frame rate greater than 30 FPS meets the real-time detection requirement.
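The quantities in Formula (2) can be computed as in the sketch below. The integral of P over R is approximated with the trapezoidal rule on sampled points of the precision-recall curve, which is a simplification; detection toolkits usually apply an interpolated precision. The function names and the two example curves are hypothetical and are not data from this paper.

```python
import numpy as np

def precision_recall(tp, fp, fn):
    """P and R from Formula (2)."""
    return tp / (tp + fp), tp / (tp + fn)

def average_precision(recall, precision):
    """Area under a P(R) curve by the trapezoidal rule (recall ascending)."""
    recall = np.asarray(recall, dtype=float)
    precision = np.asarray(precision, dtype=float)
    widths = np.diff(recall)
    heights = (precision[1:] + precision[:-1]) / 2.0
    return float(np.sum(widths * heights))

def mean_average_precision(curves):
    """Mean of per-class AP; curves maps class name -> (recall, precision)."""
    return float(np.mean([average_precision(r, p) for r, p in curves.values()]))

# Hypothetical curves for two classes, for illustration only.
curves = {
    "phone": ([0.0, 0.5, 1.0], [1.0, 1.0, 0.0]),
    "face": ([0.0, 1.0], [1.0, 1.0]),
}
print(precision_recall(tp=8, fp=2, fn=2), mean_average_precision(curves))
```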
4.5 Ablation Experiment

To verify the effectiveness of the improved YOLOv5s network model, an ablation experiment was set up for each improvement point to check whether it improves the detection performance of the network model. Starting from the original YOLOv5s network model, one improvement point is added in each round in the following order: ghost SPPF module, MP2 module, C2f structure, and QFocalloss loss function. The training results under the same experimental conditions are shown in Table 1.

Table 1 shows that after the SPP module was replaced by the ghost SPPF module, the parameter capacity of the network model decreased by 0.22M compared with the original network and the mAP increased by 0.06 percentage points. After the CBS module was replaced by the MP2 module, the mAP increased by 0.82 percentage points. After the C3 structure was replaced by the C2f structure, the mAP increased by 0.44 percentage points. After the QFocalloss loss function was adopted, the mAP increased by 0.33 percentage points. The parameter count of the improved YOLOv5s network model is slightly higher than that of the original network model, and the detection frame rate is slightly lower. However, the 9.83M parameters still meet the deployment requirements of embedded devices, and the frame rate of 87.33 FPS still meets the needs of real-time detection. The mAP of the improved network model is 1.65 percentage points higher than that of the original network model, a significant improvement in detection accuracy.

Table 2 Results of comparative experiments

| Network model | Parameter capacity (M) | Frame rate (FPS) | mAP (%) |
|---|---|---|---|
| Faster RCNN | 155.77 | 14.73 | 87.88 |
| SSD | 105.32 | 20.65 | 88.75 |
| YOLOv3 | 68.33 | 28.39 | 84.604 |
| YOLOv4 | 73.54 | 24.66 | 87.78 |
| Improved YOLOv5s | 9.83 | 87.33 | 93.02 |
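The gains quoted for the ablation experiment can be re-derived from the mAP column of Table 1. The short script below only repeats that arithmetic; the dictionary name `map_by_step` is illustrative.

```python
# mAP (%) after each cumulative improvement, copied from Table 1.
map_by_step = {
    "Original YOLOv5s": 91.37,
    "ghostSPPF": 91.43,
    "MP2": 92.25,
    "C2f": 92.69,
    "QFocalloss": 93.02,
}
names = list(map_by_step)
# Gain of each step over the previous row, in percentage points.
gains = {
    names[i]: round(map_by_step[names[i]] - map_by_step[names[i - 1]], 2)
    for i in range(1, len(names))
}
total_gain = round(map_by_step[names[-1]] - map_by_step[names[0]], 2)
print(gains, total_gain)
```

The four step gains add up to the overall gain, so the 1.65 figure is the difference between the last and first rows of Table 1.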
4.6 Comparative Experiment

To further verify whether the improved YOLOv5s network model outperforms other network models in object detection, its detection performance was compared with that of other mainstream object detection network models (Faster RCNN, SSD, YOLOv3, and YOLOv4). In the same experimental environment, each model was trained for 300 epochs on the same dataset with the same training parameters. The experimental results are shown in Table 2.

Table 2 shows that the parameter capacity of the improved YOLOv5s network model (9.83M) is far lower than that of the other network models (68.33M to 155.77M), which makes it more suitable for deployment on embedded devices. Its frame rate (87.33 FPS) is also far higher than that of the other models (14.73 to 28.39 FPS), giving better real-time detection performance, and its mAP (93.02%) is higher than that of the other models (84.604% to 88.75%). Overall, the improved YOLOv5s network model outperforms the other mainstream object detection network models in detecting violations on this dataset.
4.7 Experimental Results and Analysis

The training loss curve of the improved network model over the 300 epochs of training is shown in Fig. 8. In the initial stage of training, the loss curve shows a downward trend. At the 210th epoch the loss curve fluctuates, because the data in the first 210 epochs were augmented while the data after epoch 210 were not; as a result, the training loss decreases slightly. The mAP curve during training is shown in Fig. 9.
Fig. 8 Loss curve of training
Fig. 9 mAP curve of training
To further demonstrate the detection effect of the improved YOLOv5s network model in violation detection tasks, different types of images were randomly selected from the test set. The original YOLOv5s network model was first used on the test set images, and the results are shown in Fig. 10. Most violations of drivers using their mobile phones while driving are detected. However, in complex environments with reflections and dim lighting, as shown in Fig. 10e and f, the front window, face, and hands were detected but the mobile phone was not accurately detected.

The improved YOLOv5s network model was then used on the same test set images, and the results are shown in Fig. 11. The improved model detects violations of drivers using their mobile phones while driving more reliably. Even in complex environments with reflections and dim lighting, violations are accurately detected. This indicates that the improved network adapts well to complex environments, captures key information, and has better generalization ability.

Fig. 10 Detection effect of the original YOLOv5s network model

Fig. 11 Detection effect of the improved YOLOv5s network model
5 Conclusion

To address the low detection accuracy and poor adaptability of traditional violation detection methods in complex environments, this paper proposes an improved network model based on YOLOv5s. The model optimizes the backbone network, improves the loss function, and enhances the feature extraction and feature fusion abilities of the network, improving the accuracy and detection speed of the network model for violation detection. In the ablation experiment, the mAP of the improved YOLOv5s network model is 1.65 percentage points higher than that of the original YOLOv5s network model. In the comparative experiment, its detection performance is also better than that of other mainstream object detection network models. The improved YOLOv5s network model has a small number of parameters, good real-time detection performance, and high detection accuracy, and it can be deployed on embedded devices, effectively meeting the needs of violation detection systems at various traffic intersections.
References

1. Xun, C.: Automatic Container Code Recognition Based on Deep Learning. Changchun University of Science and Technology (2020). https://doi.org/10.26977/d.cnki.gccgc.2020.000084
2. Liu, S., Gu, Y., Rao, W., et al.: Illegal vehicle detection method based on optimized YOLOv3 algorithm. J. Chongqing Univ. Technol. (Nat. Sci.) 35(04), 135–141 (2021)
3. Pan, W., Luo, Y.: Vehicle target detection based on improved YOLOV3. Comput. Appl. Softw. 40(01), 167–172, 204 (2023)
4. Shi, J.-w., Yang, L.-q., Fang, Y.-h., et al.: Helmet wearing detection method based on Improved YOLOv4. Comput. Eng. Des. 44(02), 518–525 (2023). https://doi.org/10.16208/j.issn10007024.2023.02.027
5. Li, J., Ge, Y., Liu, Y.-p.: Traffic sign recognition for dim small targets based on improved YOLOv5. Computer Systems & Applications, pp. 1–8 (2023). https://doi.org/10.15888/j.cnki.csa.009056
6. Zhao, M., Yu, H., Li, H.-q., et al.: Detection of fish stocks by fused with SKNet and YOLOv5 deep learning. J. Dalian Ocean Univ. 37(02), 312–319 (2022). https://doi.org/10.16535/j.cnki.dlhyxb.2021-324
7. Deng, T., Tan, S., Pu, L.: Traffic light recognition method based on improved YOLOv5s. Comput. Eng. 48(09), 55–62 (2022). https://doi.org/10.19678/j.issn.1000-3428.0062843
8. Long, S., Song, X., Zhang, S., et al.: Research on vehicle detection in aerial images with improved YOLOv5s. Laser J. 43(10), 22–29 (2022). https://doi.org/10.14016/j.cnki.jgzz.2022.10.022
9. Jiang, C., Zhang, H., Zhang, E., et al.: Pedestrian and vehicle target detection algorithm based on improved YOLOv5s. J. Yangzhou Univ. (Nat. Sci. Ed.) 25(06), 45–49 (2022). https://doi.org/10.19411/j.1007-824x.2022.06.008
10. Chen, Z., Qi, H., Wang, X.: Research on mask wearing detection based on improved YOLOv5 algorithm. Electron. Des. Eng. 30(22), 67–72 (2022). https://doi.org/10.14022/j.issn16746236.2022.22.014
11. Yang, X.-l., Cai, Y.-w.: Pedestrian detection system based on YOLOV5S and its implementation. Comput. Inf. Technol. 30(01), 28–30 (2022). https://doi.org/10.19414/j.cnki.1005-1228.2022.01.006
12. Wang, L., Duan, J., Xin, L.: YOLOv5 helmet wear detection method with introduction of attention mechanism. Comput. Eng. Appl. 58(09), 303–312 (2022)
13. Huang, Y., Liu, H., Chen, Q., et al.: Transmission line insulator fault detection method based on USRNet and improved YOLOv5x. High-Voltage Technol. 48(09), 3437–3446 (2022). https://doi.org/10.13336/j.1003-6520.hve.20220314
14. Wang, L., He, M.-t., Xu, S., et al.: Garbage classification and detection based on YOLOv5s network. Packag. Eng. 42(08), 50–56 (2021). https://doi.org/10.19554/j.cnki.1001-3563.2021.08.007
15. Gu, Y., Cao, M., Xiu, J., et al.: Algorithm for detecting violations based on YOLOv4 network. J. Chongqing Univ. Technol. (Nat. Sci.) 35(08), 114–121 (2021)
16. Gao, W., Shan, M., Song, N., et al.: Detection of microaneurysms in fundus images based on improved YOLOv4 with SENet embedded. J. Biomed. Eng. 39(04), 713–720 (2022)
17. Qi, L., Gao, J.: Small object detection based on improved YOLOv7. Comput. Eng. 49(01), 41–48 (2023). https://doi.org/10.19678/j.issn.1000-3428.0065942
18. Wang, X.: An improved traffic sign recognition and detection on rainy environment based on YOLOv5. Mod. Inf. Technol. 6(20), 71–75, 80 (2022). https://doi.org/10.19850/j.cnki.20964706.2022.20.018
19. Ma, N., Cao, Y., Wang, Z., et al.: Landing runway detection algorithm based on YOLOv5 network architecture. Laser Optoelectron. Prog. 59(14), 199–205 (2022)
20. Feng, H., Huang, C., Wen, Y.: Remote sensing image small target detection based on improved YOLOv3. J. Comput. Appl. 42(12), 3723–3732 (2022)
21. Zhang, R., Dong, F., Cheng, X.: Application of improved YOLOv5s algorithm in non-motorized helmet wearing detection. J. Henan Univ. Sci. Technol. (Nat. Sci.) 44(01), 44–53, 57 (2023). https://doi.org/10.15926/j.cnki.issn1672-6871.2023.01.007
A Visual-LiDAR Object Tracking Method Using Correlation Filter and Potential Matching

Junzhi Zhu, Xiaolong Wang, Fengli Yang, and Long Zhao
Abstract Visual-based single-object tracking is a classical task in computer vision that uses texture and color to distinguish objects from backgrounds. However, visual-based methods are sensitive to illumination change and motion blur. We therefore propose a coarse-to-fine visual-lidar fusion algorithm that integrates a visual subsystem and a lidar subsystem. The visual subsystem calculates a coarse position of the object, and the lidar subsystem then optimizes it to obtain an accurate position. Compared with visual tracking algorithms, our algorithm is more accurate and robust.

Keywords Single-object tracking · Visual-lidar fusion algorithm · Coarse-to-fine method
1 Introduction

Single-object tracking (SOT) is a task with widespread application in navigation and positioning, and it has been widely used in unmanned driving [1] and intelligent transportation [2]. The goal of an SOT algorithm is to distinguish the object from the background and to separate the object accurately. At present, mainstream SOT algorithms are vision-based and typically rely on state filters [3, 4], correlation filters [5, 6], or neural networks [7–9]. However, vision-based algorithms are sensitive to illumination change and motion blur, which reduces their accuracy and robustness.

In recent years, many studies have attempted to use lidar point clouds for SOT. Qi [10] proposed a “Point to Box” network and used Hough voting [11] to obtain the position of the object. Zhou [12] designed a Relation-Aware Sampling method and a Relation Attention Module to achieve object tracking. A lidar point cloud provides geometric information and the spatial location of targets, and lidar points are not sensitive to illumination change. However, a number of challenging problems remain, such as point cloud sparsity, random shape incompleteness, and the absence of texture features [12].

We therefore propose a coarse-to-fine visual-lidar fusion algorithm. It utilizes the color, texture, and geometric structure of the object, combining the advantages of visual algorithms and point cloud algorithms. Figure 1 shows the framework of the algorithm, which is composed of two key subsystems: a visual subsystem (VS) and a lidar subsystem (LS). The VS uses a correlation filter to obtain the coarse position of the object. The LS uses potential field matching to optimize the coarse position provided by the VS and then outputs the precise position of the object.

Fig. 1 Block diagram of the visual-lidar fusion algorithm. The visual subsystem uses correlation filter to obtain the coarse position. The lidar subsystem uses potential field matching to obtain the precise position. [Block labels: Camera, LiDAR; Visual Subsystem: Image Template Train, Image Template DFT, Cross-Correlation, Update, Coarse Position; LiDAR Subsystem: Rasterize, Field Generate, Template Cloud, Template Field Generate, Potential Field Matching, Update; Initial Value, Precise Position, Object Position]

J. Zhu · X. Wang · F. Yang · L. Zhao (B), Digital Navigation Center, School of Automation Science and Electrical Engineering, Beihang University, Beijing 100191, China. e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_52
2 Algorithm Principle

2.1 Visual Subsystem Algorithm

The VS is based on a correlation filter [6]. We first use initial samples to train the correlation filter. During detection, the algorithm computes the cross-correlation between the filter and the search area, and the position with the highest response is selected as the coarse position. We transform the filter and the search area into the Fourier domain, where the cross-correlation becomes an element-wise product, which reduces the computation time.
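The replacement of cross-correlation by an element-wise product can be verified on a one-dimensional signal. The sketch below assumes circular (periodic) correlation and NumPy's FFT conventions; the function names and the random test signal are illustrative. The position of the response peak recovers the shift between the template and the search signal.

```python
import numpy as np

def circular_cross_correlation(x, z):
    """Direct O(n^2) evaluation: r[a] = sum_m x[m] * z[(m + a) mod n]."""
    n = len(x)
    return np.array([sum(x[m] * z[(m + a) % n] for m in range(n)) for a in range(n)])

def circular_cross_correlation_fft(x, z):
    """Same result via an element-wise product in the Fourier domain."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(z)))

rng = np.random.default_rng(0)
template = rng.normal(size=64)
shift = 5
search = np.roll(template, shift)   # the "object" moved by 5 samples

response = circular_cross_correlation_fft(template, search)
print(int(np.argmax(response)))
```

The direct evaluation costs O(n^2) operations, while the Fourier route costs O(n log n), which is the saving the VS relies on.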
Consider a one-dimensional signal; the conclusions extend directly to two-dimensional images. Assume that the one-dimensional signal is expressed as the vector $\mathbf{x} = [x_0, x_1, \cdots, x_{n-1}]^T$ (the positive sample). Applying the cyclic shift operator to $\mathbf{x}$ generates several virtual negative samples, which can be collected in a circulant matrix $X \in \mathbb{R}^{n \times n}$:

$$
X = C(\mathbf{x}) =
\begin{pmatrix}
x_0 & x_1 & \cdots & x_{n-1} \\
x_{n-1} & x_0 & \cdots & x_{n-2} \\
\vdots & \vdots & \ddots & \vdots \\
x_1 & x_2 & \cdots & x_0
\end{pmatrix} \tag{1}
$$

The circulant matrix can be diagonalized by the discrete Fourier transform (DFT) [13]:

$$
C(\mathbf{x}) = X = F \, \mathrm{diag}(\hat{\mathbf{x}}) \, F^H \tag{2}
$$

where $F$ is the discrete Fourier matrix and $\hat{\mathbf{x}}$ denotes the discrete Fourier transform of $\mathbf{x}$ (the hat $\wedge$ denotes a variable in the Fourier domain). Assume that the filter is $f(\mathbf{z}) = \mathbf{w}^T \mathbf{z}$. We use ridge regression to train the filter, minimizing the squared error between the filter response to the samples $\mathbf{x}_i \in \mathbb{R}^{n \times 1}$ and the regression targets $y_i$:

$$
\min_{\mathbf{w}} \sum_i \left( f(\mathbf{x}_i) - y_i \right)^2 + \lambda \|\mathbf{w}\|^2 \tag{3}
$$

$\lambda$ is a regularization parameter that avoids over-fitting. The weight $\mathbf{w}$ can be solved by the least squares method:

$$
\mathbf{w} = \left( X^H X + \lambda I \right)^{-1} X^H \mathbf{y} \tag{4}
$$

where $X$ denotes the circulant matrix of the sample $\mathbf{x}$ and $\mathbf{y}$ is the vector of regression targets. Substituting Eq. (2) into Eq. (4) gives the analytical solution of the weight $\mathbf{w}$, which is then transformed to the Fourier domain:

$$
\hat{\mathbf{w}} = \mathrm{dft}(\mathbf{w}) = \frac{\hat{\mathbf{x}}^* \odot \hat{\mathbf{y}}}{\hat{\mathbf{x}}^* \odot \hat{\mathbf{x}} + \lambda} \tag{5}
$$

$\hat{\mathbf{x}}^*$ denotes the complex conjugate of $\hat{\mathbf{x}}$, $\mathrm{dft}(\ast)$ denotes the discrete Fourier transform, and $\odot$ denotes the element-wise product. To ensure the accuracy and robustness of the algorithm, the kernel trick is used to extend the filter to a non-linear space [5]. $\mathbf{w}$ can be represented by a linear combination of the mapped samples $\varphi(\mathbf{x}_i)$ with coefficients $\alpha_i$:

$$
\mathbf{w} = \sum_i \alpha_i \varphi(\mathbf{x}_i) \tag{6}
$$
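The filter can be expressed as:

$$
f(\mathbf{z}) = \mathbf{w}^T \varphi(\mathbf{z}) = \sum_{i=0}^{n} \alpha_i \varphi(\mathbf{x}_i)^T \varphi(\mathbf{z}) = \sum_{i=0}^{n} \alpha_i \kappa(\mathbf{x}_i, \mathbf{z}) \tag{7}
$$

where the kernel function is defined as $\kappa(\mathbf{x}_i, \mathbf{z}) = \varphi(\mathbf{x}_i)^T \varphi(\mathbf{z})$ (we use the Gaussian kernel function). The coefficient vector $\boldsymbol{\alpha} \in \mathbb{R}^{n \times 1}$ can be solved in the dual space:

$$
\boldsymbol{\alpha} = (K + \lambda I)^{-1} \mathbf{y} \tag{8}
$$

$K$ is the kernel matrix, with $K_{ij} = \kappa(\mathbf{x}_i, \mathbf{x}_j)$. The kernel matrix $K$ is itself circulant and can be written in terms of its first row $\mathbf{k}^{xx}$ as $K = C(\mathbf{k}^{xx})$. Therefore, Eq. (8) can be evaluated in the Fourier domain:

$$
\hat{\boldsymbol{\alpha}} = \frac{\hat{\mathbf{y}}}{\hat{\mathbf{k}}^{xx} + \lambda} \tag{9}
$$

In the detection stage, the filtering operation can be replaced by:

$$
f(\mathbf{z}) = \hat{\mathbf{k}}^{xz} \odot \hat{\boldsymbol{\alpha}} \tag{10}
$$

where $\hat{\mathbf{k}}^{xz}$ is the kernel correlation of the image template $\mathbf{x}$ and the search area $\mathbf{z}$. The VS selects the position with the highest response as the potential search area of the object and provides it to the LS. After receiving the accurate solution corrected by the LS, the VS updates the image template $\mathbf{x}$ and the coefficient $\hat{\boldsymbol{\alpha}}$:

$$
\begin{aligned}
\mathbf{x} &= (1 - \tau)\,\mathbf{x} + \tau\,\mathbf{x}_{accurate} \\
\hat{\boldsymbol{\alpha}} &= (1 - \tau)\,\hat{\boldsymbol{\alpha}} + \tau\,\hat{\boldsymbol{\alpha}}_{accurate}
\end{aligned} \tag{11}
$$

where $\tau$ is the update step factor, and $\mathbf{x}_{accurate}$ and $\hat{\boldsymbol{\alpha}}_{accurate}$ are the accurate solutions corrected by the LS.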
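The dual-space training of Eq. (8) and the kernel response of Eq. (7) can be illustrated without any Fourier shortcut by solving the linear system directly. This is a sketch under stated assumptions: the Gaussian kernel is taken as exp(-||a - b||^2 / (2σ^2)), the bandwidth σ = 0.5, λ = 1e-3 and the Gaussian-shaped regression targets are hypothetical choices, and the training set consists of all cyclic shifts of a random one-dimensional signal.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=0.5):
    """kappa(a, b) = exp(-||a - b||^2 / (2 sigma^2)) for rows of a and b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

def train_dual(samples, targets, lam=1e-3):
    """Eq. (8): alpha = (K + lambda I)^-1 y."""
    K = gaussian_kernel(samples, samples)
    return np.linalg.solve(K + lam * np.eye(len(samples)), targets)

def respond(samples, alpha, z):
    """Eq. (7): f(z) = sum_i alpha_i kappa(x_i, z)."""
    return gaussian_kernel(np.atleast_2d(z), samples) @ alpha

# Training set: all cyclic shifts of a base signal (rows of a circulant matrix).
rng = np.random.default_rng(0)
base = rng.normal(size=16)
samples = np.stack([np.roll(base, s) for s in range(16)])
# Hypothetical regression targets that peak at zero shift.
shifts = np.minimum(np.arange(16), 16 - np.arange(16))
targets = np.exp(-0.5 * (shifts / 2.0) ** 2)

alpha = train_dual(samples, targets)
scores = np.array([respond(samples, alpha, s)[0] for s in samples])
print(int(np.argmax(scores)))
```

The trained filter responds most strongly to the unshifted sample, which is the behaviour the detection stage exploits; the Fourier-domain forms of Eqs. (9) and (10) compute the same quantities far more cheaply.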
2.2 LiDAR Subsystem Algorithm

The LS first extracts the point cloud in the potential search area provided by the VS and uses the geometric features of the point cloud to construct boundary potential fields. The LS then uses potential field matching to optimize the coarse position calculated by the VS. Because the VS has already obtained an optimal solution based on texture information, the LS only needs to focus on matching geometric information.

Fig. 2 A mask matrix created by boundary potential field (or depth field). [Panel labels: Boundary Potential Field, Mask Matrix; the mask matrix is a grid of 0 and 1 entries]

The LS calculates the normal vector of each point and uses it to represent the boundary potential energy of the object. The normal vector captures the geometric features in the neighborhood of the point. We obtain the nearest-neighbor point group of each point by Kd-tree search and use singular value decomposition to solve for the normal vector $\mathbf{n}$ of each point. Subsequently, we unify the scale of the normal vectors and eliminate their directional ambiguity:

$$
\hat{\mathbf{n}} = -\cos(\mathbf{n} \cdot \mathbf{z}_p)\, \frac{\mathbf{n}}{\|\mathbf{n}\|} \tag{12}
$$
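Normal estimation from a local neighborhood can be sketched as follows. Two simplifications are assumed and differ from the text: neighbors are found by a brute-force distance sort instead of a Kd-tree, and the directional ambiguity is removed by a plain sign flip against the camera z-axis rather than by the factor of Eq. (12). The function names and the synthetic planar point cloud are illustrative.

```python
import numpy as np

def estimate_normal(points, index, k=8):
    """Unit normal at points[index] from the SVD of its k nearest neighbours."""
    dist = np.linalg.norm(points - points[index], axis=1)
    neighbours = points[np.argsort(dist)[:k]]        # brute force, not a Kd-tree
    centred = neighbours - neighbours.mean(axis=0)
    # The right singular vector with the smallest singular value is
    # perpendicular to the best-fit plane of the neighbourhood.
    _, _, vt = np.linalg.svd(centred)
    return vt[-1] / np.linalg.norm(vt[-1])

def orient_towards_camera(normal, z_axis):
    """Flip the normal so that it has a negative component along z_axis."""
    return -normal if np.dot(normal, z_axis) > 0 else normal

# Points on the tilted plane z = 5 + 0.5 x, seen by a camera looking along +z.
rng = np.random.default_rng(0)
xy = rng.uniform(-1.0, 1.0, size=(200, 2))
cloud = np.column_stack([xy, 5.0 + 0.5 * xy[:, 0]])
z_p = np.array([0.0, 0.0, 1.0])
n_hat = orient_towards_camera(estimate_normal(cloud, 0), z_p)
print(np.round(n_hat, 3))
```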
$\mathbf{z}_p$ is the z-axis of the camera coordinate system. However, because the point cloud is sparse, the boundary potential field contains more pixels than the point cloud contains points. We therefore use a mask matrix to mask the pixels that have no points (as shown in Fig. 2). After projecting the point cloud to the pixel coordinate system, we create the boundary potential field, the depth field, and the mask matrices from the normal vectors and depth values of each point. During detection, we perform the same operation on the point cloud of the template (template cloud) and the point cloud in the potential search area (search cloud).

The matching of boundary potential fields and depth fields mainly relies on local statistical information [14]. First, we rasterize the boundary potential field $E$ and the depth field $D$ of the search cloud to obtain the grid fields $E_s$ and $D_s$. Then we calculate the mean value and the dispersion for each block $(g_u, g_v)$ in the grid field. For the boundary potential field:

$$
E_s^{mean}(g_u, g_v) = \frac{\sum_{i,j \in (g_u, g_v)}^{n} E(i, j)}{n} \tag{13}
$$

$$
v_E(g_u, g_v) = \frac{1}{n} \sum_{i,j \in (g_u, g_v)}^{n} \cos\!\left( E(i, j),\, E_s^{mean}(g_u, g_v) \right) \tag{14}
$$

where $E_s^{mean}(g_u, g_v)$ denotes the mean value of the block $(g_u, g_v)$ in the grid field $E_s$, $v_E(g_u, g_v)$ denotes the dispersion of the block $(g_u, g_v)$, and $E(i, j)$ is a valid pixel not masked by the mask matrix. Similarly, for the depth field:
$$
D_s^{mean}(g_u, g_v) = \frac{\sum_{i,j \in (g_u, g_v)}^{n} D(i, j)}{n} \tag{15}
$$

$$
v_D(g_u, g_v) = \tanh\!\left( \omega \, \frac{n}{\sum_{i,j \in (g_u, g_v)}^{n} \left| D(i, j) - D_s^{mean}(g_u, g_v) \right|} \right) \tag{16}
$$

where $\tanh(\cdot)$ is the hyperbolic tangent function and $\omega$ is the scale factor used to smooth the dispersion of the depth field. Assume that the pixel $(u, v)$ of the template cloud field falls into the central grid $C(g_u, g_v)$ of the search cloud grid field. We extract the neighbor grid set $N(g_u, g_v)$ of the central grid and compute the difference value between the pixel and each grid. For each pixel $(u, v)$ in the template cloud field and its corresponding grid $(g_u, g_v)$, the difference value $\sigma$ is calculated as:

$$
\begin{aligned}
\sigma(u, v) &= \gamma_D \, \sigma_D(u, v) + \gamma_E \, \sigma_E(u, v) \\
\sigma_D(u, v) &= v_D(g_u, g_v) \cdot \left| D_t(u, v) - D_s^{mean}(g_u, g_v) \right| \\
\sigma_E(u, v) &= v_E(g_u, g_v) \cdot \sin\!\left( E_t(u, v),\, E_s^{mean}(g_u, g_v) \right)
\end{aligned} \tag{17}
$$

$\sigma_E$ is the difference value provided by the boundary potential field, $\sigma_D$ is the difference value provided by the depth field, and $\gamma_E$ and $\gamma_D$ are compensation factors. We compare the different values of $\sigma$ to achieve potential field matching. When $\sigma$ with the central grid meets the threshold, the pixel $(u, v)$ does not provide an update step. When $\sigma$ with the central grid exceeds the threshold and $\sigma$ with a neighbor grid meets the threshold, the pixel $(u, v)$ provides an update step $\xi_i^{t+1}$ that depends on the position of the neighbor grid (as shown in Fig. 3). The step $\xi_i^{t+1}$ moves the pixel $(u, v)$ towards the grid with the smaller difference value $\sigma$. The total update step is therefore:

$$
\xi^{t+1} = \alpha \, \frac{\sum_i \xi_i^{t+1}}{n_p} + (1 - \alpha)\, \xi^{t} \tag{18}
$$
Matching Grid Difference Value Compare
Fig. 3 The two matching neighbor grids will offer an average step towards themselves
We use the momentum method to optimize the update step, which keeps the optimization stable. Here $\alpha$ is the momentum factor, $\xi^{t}$ is the update step from the previous iteration, and $n_p$ is the number of valid pixels. After potential field matching, the LS uses a fast segmentation method, based on the normal vectors and depth values, to extract the new template cloud. It then calculates the bounding box of the new template cloud to correct the coarse position of the object. The LS outputs the corrected position as the precise position to the VS, which uses it to train a new image template.
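The block statistics of the boundary potential field and the momentum update of Eq. (18) can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `block_stats` and `momentum_step` are hypothetical, each block is assumed to be passed in as a list of its valid (unmasked) vectors, and the dispersion is assumed to be the mean cosine between each vector and the block mean.

```python
import math

def block_stats(vectors):
    """Mean vector and mean-cosine dispersion of the valid vectors in one grid block."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[k] for v in vectors) / n for k in range(dim)]
    mean_norm = math.sqrt(sum(c * c for c in mean))
    cos_sum = 0.0
    for v in vectors:
        v_norm = math.sqrt(sum(c * c for c in v))
        # Zero-length vectors contribute nothing to the cosine sum.
        if v_norm > 0.0 and mean_norm > 0.0:
            cos_sum += sum(a * b for a, b in zip(v, mean)) / (v_norm * mean_norm)
    return mean, cos_sum / n

def momentum_step(pixel_steps, prev_step, alpha):
    """Blend the average per-pixel step with the previous total step, as in Eq. (18)."""
    n_p = len(pixel_steps)  # number of valid pixels
    avg = [sum(s[k] for s in pixel_steps) / n_p for k in range(len(prev_step))]
    return [alpha * a + (1.0 - alpha) * p for a, p in zip(avg, prev_step)]
```

A block whose vectors all point the same way gets a dispersion of 1, and the value drops as the vectors spread out, so under this reading the dispersion acts as a confidence weight in Eq. (17).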
3 Experiment

We evaluate our method on the KITTI [15] tracking dataset. In this section, we only use the data from the Velodyne HDL-64 (point cloud) and the Point Grey Flea 2 (RGB image). Since correlation filters and neural networks are both classic branches of visual tracking algorithms, we compare the accuracy and robustness of our method with KCF (correlation-filter based) and SiamRPN (neural-network based). Figures 4, 5, 6 and 7 show the tracking results of the three algorithms. We calculate the IoU scores between the predicted boxes and the ground-truth boxes to evaluate tracking accuracy. When the IoU score is below 0.4, the algorithm is considered to have failed.

Figure 4 shows the results on the “Cyclist” sequence. In this sequence, we evaluate the effectiveness of the algorithms under fast motion and object deformation. As shown in Fig. 4, after the scale change of the object (Frame 50), KCF fails, while SiamRPN and our method maintain tracking accuracy. As tracking continues, the accuracy of SiamRPN gradually decreases. Our method is robust and maintains an IoU score above 0.6.

Fig. 4 The tracking results and IoU chart of “Cyclist” sequence

In the “Car” sequence, we mainly evaluate the accuracy and robustness of the algorithms against extreme illumination change and in long-term tracking. We simulate the illumination change in this sequence: we gradually increase the brightness of the image, then gradually decrease it, and repeat this process during tracking. As shown in Fig. 5, KCF fails quickly, and SiamRPN also fails due to the combined effects of illumination change and scale change (Frame 220). Our method resists severe illumination change and maintains stability and accuracy.

Fig. 5 The tracking results and IoU chart of “Car” sequence

Figure 6 shows the results on the “Pedestrian” sequence, in which we evaluate the algorithms in complex situations. Here KCF tracks the wrong target, while both SiamRPN and our method achieve good results. Since LiDAR is employed, our method can simultaneously output the 3D bounding box of the target (Fig. 7).

Fig. 6 The tracking results and IoU chart of “Pedestrian” sequence
Fig. 7 The 2D bounding box and 3D bounding box in “Pedestrian” sequence
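The IoU-based failure criterion used in this section can be illustrated with a short sketch. It assumes axis-aligned 2D boxes given as `(x_min, y_min, x_max, y_max)`; the box format and the function names `iou` and `tracking_failed` are illustrative assumptions, and only the 0.4 threshold comes from the text.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0.0 else 0.0

def tracking_failed(pred_box, truth_box, threshold=0.4):
    """A frame counts as a failure when the IoU falls below the threshold."""
    return iou(pred_box, truth_box) < threshold
```

For example, two 2×2 boxes offset by one unit in each direction overlap in a 1×1 square, giving an IoU of 1/7, which would count as a failure under the 0.4 threshold.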
4 Conclusion

In this work, we propose a coarse-to-fine visual-LiDAR fusion algorithm, which consists of a visual subsystem and a LiDAR subsystem. We evaluate the performance of our method on the KITTI dataset and find that it maintains good tracking accuracy and robustness in complex environments and under severe illumination change and scale change. At the same time, our method can output the 2D and 3D bounding boxes of the object in real time while running only on the CPU.

Acknowledgements This work is supported by the National Science Foundation of China (Grant No. 42274037), the Aeronautical Science Foundation of China (Grant No. 2022Z022051001), and the National Key Research and Development Program of China (Grant No. 2020YFB0505804).
References
1. Poczter, S.L., Jankovic, L.M.: The google car: Driving toward a better future? J. Bus. Case Stud. (JBCS)
2. Lu, H.C., Fang, G.L., Wang, C., Chen, Y.W.: A novel method for gaze tracking by local pattern model and support vector regressor. Signal Process. 90(4), 1290–1299 (2010)
3. Li, G., Li, C.: Learning skeleton information for human action analysis using kinect. Signal Process. Image Commun. 84, 115814 (2020)
4. Wang, Q., Yang, C., Zhu, H.R., Yu, L.: Interactive multi-model kalman filtering algorithm based on target tracking. In: Proceedings of 2021 Chinese Intelligent Systems Conference: Volume I, pp. 82–94. Springer, Berlin (2022)
5. Henriques, J.F., Rui, C., Martins, P., Batista, J.: High-speed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 583–596 (2014)
6. Danelljan, M., Hager, G., Khan, F.S., Felsberg, M.: Learning spatially regularized correlation filters for visual tracking. In: 2015 IEEE International Conference on Computer Vision (ICCV) (2015)
7. Wang, Z.M., Wang, C.L., Shen, S.: Urban road object detection and tracking applications based on acoustic localization. In: Proceedings of 2020 Chinese Intelligent Systems Conference, vol. I, pp. 10–17. Springer, Berlin (2021)
8. Bo, L., Yan, J., Wei, W., Zheng, Z., Hu, X.: High performance visual tracking with Siamese region proposal network. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 8971–8980 (2018)
9. Cui, Y.T., Jiang, C., Wang, L.M., Wu, G.S.: Mixformer: end-to-end tracking with iterative mixed attention. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13608–13618 (2022)
10. Qi, H., Feng, C., Cao, Z., Zhao, F., Xiao, Y.: P2b: Point-to-box network for 3d object tracking in point clouds. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6329–6338 (2020)
11. Qi, C.R., Litany, O., He, K., Guibas, L.J.: Deep hough voting for 3d object detection in point clouds. In: IEEE/CVF International Conference on Computer Vision, pp. 9277–9286 (2019)
12. Zhou, C., Luo, Z., Luo, Y., Liu, T., Pan, L., Cai, Z., Zhao, H., Lu, S.: Pttr: Relational 3d point cloud object tracking with transformer. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8531–8540 (2022)
13. Gray, R.M., et al.: Toeplitz and circulant matrices: a review. Found. Trends Commun. Inf. Theor. 2(3), 155–239 (2006)
14. Biber, P., Straßer, W.: The normal distributions transform: a new approach to laser scan matching. In: IEEE/RSJ International Conference on Intelligent Robots and Systems, vol. 3, pp. 2743–2748. IEEE (2003)
15. Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? the kitti vision benchmark suite. In: IEEE Conference on Computer Vision and Pattern Recognition (2012)
A Model Based on Trend-Seasonal Decomposition and GCN for Traffic Flow Prediction

Jiajun Wang, Yong Li, and Jiahao Zhang
Abstract Traffic flow prediction plays an important role in Intelligent Transportation Systems (ITS). To improve the accuracy of traffic flow prediction, this paper proposes a traffic flow forecasting model based on Trend-Seasonal Decomposition and GCN for the task of multi-location traffic flow prediction. The proposed model mainly consists of two functions. First, the Trend-Seasonal component decomposes the temporal traffic flow data into a more predictable trend part and a seasonal (periodic) part. Second, GCN is used to obtain spatial information between different observation points and improve the accuracy of multi-location prediction. Finally, experiments on the PeMS04 and PeMS08 datasets are carried out to verify the effectiveness of the proposed model.

Keywords GCN · Trend-seasonal decomposition · Traffic prediction
J. Wang
College of Computer and Data Science/College of Software, Fuzhou University, Fuzhou 350100, China
J. Wang · Y. Li (B)
Public Security Department, Fujian Police College, Fuzhou 350000, China
e-mail: [email protected]
J. Zhang
Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_53

1 Introduction

Traffic flow modeling, a branch of time series forecasting, has always been a research hotspot. Currently, most deep learning methods for time series forecasting can be roughly divided into three categories: RNN-based methods [1, 2], Transformer-based methods [3], and time-series-decomposition-based methods. RNN-based methods predict the future by analyzing past data and learning the temporal patterns hidden within it. The Transformer was first proposed in the field of NLP, and researchers later applied it to CV and time series prediction. Thanks to the self-attention mechanism, this model can be used for modeling time series and shows powerful capabilities. However, the time complexity of its self-attention calculation makes training slow. Time-series-decomposition-based methods decompose a time series into individual components (e.g., trends, cycles), which lets the model focus on the pertinent temporal information while reducing modeling complexity and improving generalization.

However, in practical scenarios (e.g., multi-location traffic flow forecasting), traffic flow prediction relies not only on temporal information but also on spatial information, which has a significant impact on the prediction results. Traffic data can therefore be modeled spatially using a graph neural network (GNN). To improve the accuracy of traffic flow forecasting, this paper proposes a traffic flow forecasting model based on both Trend-Seasonal Decomposition and GCN (TSD-GCN). The details of the model are given in the Methodology section, and its predictive ability is verified in the Experiments section.
2 Related Work

To better apply the Transformer to time series forecasting, researchers have made several improvements to it. Informer uses ProbSparse self-attention to speed up computation and proposes a DMS prediction strategy [4]. Pyraformer proposes a low-complexity pyramidal self-attention whose space and time complexity both scale linearly with the length of the time series [5].

The seasonal-trend decomposition method is a typical method based on time series decomposition. Time series data has characteristics such as seasonality, trend, and irregularity. Seasonal-trend decomposition is a standard method in time series analysis that makes the trend easier to predict by decomposing the original time series data [6]. Autoformer proposes a Transformer-based time series decomposition that extracts trend-cyclical features and seasonal features from the original time series data, mines complex temporal patterns from different perspectives, and thereby achieves better prediction results [7]. Building on Autoformer, FEDformer further proposes a mixture-of-experts strategy that extracts various time trends through convolution kernels of different sizes [8]. To strengthen the learning ability of the seasonal-trend decomposition model, Woo et al. use contrastive learning to pre-train the feature extractor of the model [9], so that the model pays more attention to extracting trend and seasonal features. When it is transferred to a downstream task (e.g., time series prediction), the prediction performance improves further. However, the above methods model only the temporal information in time series data.

In recent years, research on GNNs has made considerable progress, and many researchers have proposed GNN variants for different application scenarios and achieved remarkable results [10–12]. The graph convolutional network (GCN) is widely used in the field of traffic prediction and has also achieved remarkable results. Zhao et al. propose the T-GCN model, which models the spatio-temporal information in traffic data by combining GRU and GCN [13]. Guo et al. propose an attention-based spatio-temporal graph convolution model (ASTGCN) [14]. This method mainly uses the attention mechanism to model the spatio-temporal relationships in traffic data and GCN to capture spatial information in order to predict traffic flow. However, such GCN-based traffic flow forecasting models are usually less capable of capturing temporal information than methods based on time-series decomposition.
3 Methodology

Figure 1 shows the structure of the proposed model, which contains N blocks (indicated by red dashed lines). Each block has four main parts: a TCN (Temporal Convolution Network) Layer, an FFT (Fast Fourier Transform) Layer, a GCN (Graph Convolution Network) Layer, and Self-Attention. The Self-Attention here is Multi-Head Self-Attention. For the traffic flow prediction task, the original input sequence is defined as $X \in \mathbb{R}^{p \times l \times d_0}$, where $p$ is the number of observation points, $l$ is the time length of the input sequence, and $d_0$ is the feature dimension. The output $X_{Out} \in \mathbb{R}^{p \times l}$ of the model is the predicted result, where $l$ is the predicted time length.
Fig. 1 Proposed model overview. The Trend-Seasonal decomposition in model consists of two components: TCN and FFT
3.1 Multi-head Self-Attention

The self-attention mechanism was first used in the Transformer [3]. Here, multi-head self-attention is used. An input $X \in \mathbb{R}^{p \times l \times d_0}$ is first transformed as shown in formula (1):

$$X = PE(X W_0) \tag{1}$$

where $PE$ indicates Position Embedding [3] and $W_0 \in \mathbb{R}^{d_0 \times d}$ is a learnable parameter matrix, so that now $X \in \mathbb{R}^{p \times l \times d}$. The multi-head self-attention consists of $H$ heads. Each head $i$ ($i = 1, 2, \ldots, H$) operates as follows. First, the Queries, Keys, and Values are obtained from the input $X$, as shown in formula (2):

$$Q_i = X W_{Q_i}, \quad K_i = X W_{K_i}, \quad V_i = X W_{V_i} \tag{2}$$

where $W_{Q_i} \in \mathbb{R}^{d \times d_k}$, $W_{K_i} \in \mathbb{R}^{d \times d_k}$, and $W_{V_i} \in \mathbb{R}^{d \times d_v}$ are learnable weight matrices. Second, after calculating the attention scores of the Queries and Keys, the output of the head is given by formula (3):

$$A_i = \mathrm{softmax}\left(\frac{Q_i K_i^{T}}{\sqrt{d_k}}\right) V_i \tag{3}$$

where the scaling factor $\sqrt{d_k}$ mitigates vanishing gradients to a certain extent. Finally, the outputs of the heads are concatenated into a matrix $S = Concat(A_1, A_2, \ldots, A_H)$, $S \in \mathbb{R}^{p \times l \times H \cdot d_v}$. The final output $A_{out}$ of Self-Attention is given by formula (4), where $W^{O} \in \mathbb{R}^{H \cdot d_v \times d}$ is also a learnable weight matrix:

$$A_{out} = S W^{O} \tag{4}$$
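Formulas (2) to (4) can be traced with a small pure-Python sketch for a single observation point, i.e. one $l \times d$ sequence. This is only an illustration under stated assumptions: the helper names (`matmul`, `softmax`, `attention_head`, `multi_head`) are hypothetical, matrices are plain lists of rows, and the position embedding of formula (1) is omitted.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def softmax(row):
    """Numerically stable softmax of one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_head(x, w_q, w_k, w_v):
    """One head: softmax(Q K^T / sqrt(d_k)) V, as in formulas (2) and (3)."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(w_q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v)

def multi_head(x, heads, w_o):
    """Concatenate head outputs along the feature axis and project, as in formula (4)."""
    outputs = [attention_head(x, *h) for h in heads]
    concat = [sum((out[t] for out in outputs), []) for t in range(len(x))]
    return matmul(concat, w_o)
```

Here `heads` is a list of `(w_q, w_k, w_v)` triples, one per head, and `w_o` has $H \cdot d_v$ rows and $d$ columns.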
3.2 GCN Layer

GCN was first proposed by Kipf and Welling [10]. It is intended to carry out information aggregation or propagation on a graph, similar to CNN operations. For a multi-layer GCN, the layer-wise operation is given by formula (5):

$$H^{l+1} = \sigma\left(\tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} H^{l} W^{l}\right) \tag{5}$$

where $\tilde{A} = A + I_n$ is the adjacency matrix of an undirected graph with self-connections, $I_n$ is the identity matrix, and $\tilde{D}_{ii} = \sum_j \tilde{A}_{ij}$ is the degree matrix of the graph. $W^{l}$ is the learnable weight matrix of the $l$-th convolution operation, and $\sigma(\cdot)$ is an activation function such as Sigmoid or ReLU. $H^{l}$ represents the output of the $l$-th GCN layer, with $H^{0} = H$, where $H$ is the original input. The output of the GCN Layer in the model can be expressed as $g_{out} = GCN(A_{out})$.
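The propagation rule of formula (5) can be written out for a tiny graph as follows. This is a minimal sketch with plain Python lists rather than the model's actual implementation; the function names are hypothetical and ReLU is chosen as the activation for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalized_adjacency(adj):
    """Compute D^-1/2 (A + I) D^-1/2 for an undirected graph."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_tilde]  # row sums of A + I
    return [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(adj, h, w):
    """One GCN layer: ReLU(A_hat H W)."""
    z = matmul(matmul(normalized_adjacency(adj), h), w)
    return [[max(0.0, v) for v in row] for row in z]
```

For two connected nodes, the normalized adjacency is a 2×2 matrix of 0.5 entries, so each node's new feature is the average of its own feature and its neighbor's.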
3.3 FFT Layer and TCN Layer

Inspired by time series decomposition and the method proposed by Woo et al. [9], the proposed model must not only capture spatial information but also learn the underlying temporal patterns of traffic data from both the trend and the periodic perspectives. The role of the FFT Layer is to capture the periodic characteristics of the data; its operation is shown in formula (6):

$$f_{out} = RFFT\left(FFT(A_{out}) W_f + B_f\right) \tag{6}$$

where FFT and RFFT indicate the Fourier Transform and the Inverse Fourier Transform, respectively, $W_f \in \mathbb{R}^{d \times d}$, and $B_f$ is the bias. The role of the TCN Layer is to capture the trend characteristics of traffic data. The TCN uses a mixture of $L + 1$ autoregressive experts [9], with $L = \log_2(l/2)$. Each expert is a 1D causal convolution, and the size of the $i$-th convolution kernel is $2^i$. The output of each expert is $C^{(T,i)} = CausalConv(A_{out}, 2^i)$. The output of the TCN Layer is shown in formula (7):

$$T_{out} = AvePool\left(C^{(T,0)}, C^{(T,1)}, \cdots, C^{(T,L)}\right) = \frac{1}{L+1} \sum_{i=0}^{L} C^{(T,i)} \tag{7}$$

Finally, the output of each block is $B_{out} = BatchNorm(g_{out} + T_{out} + f_{out})$. The final output of the model is obtained by passing the block output through an MLP layer.
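The mixture of causal-convolution experts in formula (7) can be sketched for a single 1D series as follows. Several simplifications are assumptions made for illustration only: each expert uses a fixed uniform kernel (in the model the kernels are learned), $L$ is floored to an integer for lengths that are not powers of two, and the names `causal_conv` and `trend_layer` are hypothetical.

```python
import math

def causal_conv(x, kernel):
    """1D causal convolution: output[t] depends only on x[t], x[t-1], ... (zero padded)."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            if t - j >= 0:
                acc += w * x[t - j]
        out.append(acc)
    return out

def trend_layer(x):
    """Average the outputs of L + 1 causal-convolution experts with kernel sizes 2^i."""
    l = len(x)
    if l < 2:
        raise ValueError("sequence length must be at least 2")
    num_levels = int(math.log2(l / 2))  # L, floored (assumption)
    experts = []
    for i in range(num_levels + 1):
        size = 2 ** i
        kernel = [1.0 / size] * size  # placeholder for learned weights
        experts.append(causal_conv(x, kernel))
    return [sum(e[t] for e in experts) / len(experts) for t in range(l)]
```

With a sequence of length 8 this gives three experts with kernel sizes 1, 2 and 4. Because every expert is causal, changing the last input sample leaves all earlier outputs unchanged.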
4 Experiments

In the experiments, the tested model uses only 1 block and is tested separately on the PeMS04 and PeMS08 datasets. During training, we use the data of 10 observation points in the two datasets. Experimental environment: RTX 3090 (24 GB), Xeon(R) Gold 6330 CPU @ 2.00 GHz, PyTorch 1.9.0. Root mean square error (RMSE), mean absolute error (MAE), and mean squared error (MSE) are used as the evaluation metrics.
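For reference, the three evaluation metrics can be computed as in the following sketch, which operates on two equal-length lists of values. The function names are illustrative and not taken from the authors' code.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error, the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Since RMSE is the square root of MSE, the two columns of a results table can be cross-checked against each other; the values reported in Table 1 agree with this relation up to rounding.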
Table 1 Traffic flow forecasting results
Datasets/Metrics    MSE       RMSE      MAE
PeMS04              0.0758    0.2755    0.1913
PeMS08              0.0427    0.2063    0.1385
Fig. 2 PeMS04 (sensor95 and sensor72, left to right)
Fig. 3 PeMS08 (sensor9 and sensor153, left to right)
Table 1 presents the results of the models trained on the PeMS04 and PeMS08 datasets, respectively. The results on PeMS08 are better than those on PeMS04. Figures 2 and 3 show the prediction results of the model on the two datasets. Figure 2 shows that the model has large local deviations on the dataset with large disturbances. However, in terms of the overall trend and periodicity, the model learns both characteristics well, which can also be seen in the PeMS08 predictions in Fig. 3. In addition, the proposed model can learn the data distributions of different sensors simultaneously, based on the position information between sensors combined with the data distribution of each sensor itself, which indirectly verifies the capability of the proposed model for multi-location traffic flow forecasting.
5 Conclusion and Future Work

In this paper, a traffic flow prediction model based on trend-seasonal decomposition and GCN is proposed. The model includes a Self-Attention mechanism, 1D causal convolution, GCN, etc. It can capture the temporal information of traffic flow and combine the location information of different observation points to improve the accuracy of traffic flow prediction to a certain extent. On the PeMS04 and PeMS08 datasets, the model achieves an MAE of 0.1913 and 0.1385 and an RMSE of 0.2755 and 0.2063, respectively. These results support the validity of the proposed prediction model to a certain extent. The current model still has some shortcomings. For example, it cannot provide accurate forecasts for more complex scenarios, and factors such as weather and holidays that may affect the results are not considered. In the future, we hope to further strengthen the predictive ability of the model and its generality, so that it can predict not only traffic flow but also other traffic indicators.

Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant No. 62301159) and the Fujian Provincial Nature Foundation (Grant No. 2023J01229).
References
1. Rangapuram, S.S., Seeger, M.W., Gasthaus, J., Stella, L., Wang, Y., Januschowski, T.: Deep state space models for time series forecasting. Adv. Neural Inf. Process. Syst. 31 (2018). https://proceedings.neurips.cc/paper/8004-deep-state-space-models-for-time-series-forecasting
2. Salinas, D., Flunkert, V., Gasthaus, J., Januschowski, T.: Deepar: Probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 36(3), 1181–1191 (2020). https://doi.org/10.1016/j.ijforecast.2019.07.001
3. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, Ł., Polosukhin, I.: Attention is all you need. Adv. Neural Inf. Process. Syst. 30 (2017). https://proceedings.neurips.cc/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf
4. Zhou, H., Zhang, S., Peng, J., Zhang, S., Li, J., Xiong, H., Zhang, W.: Informer: beyond efficient transformer for long sequence time-series forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 35, pp. 11106–11115 (2021). https://doi.org/10.1609/aaai.v35i12.17325
5. Liu, S., Yu, H., Liao, C., Li, J., Lin, W., Liu, A.X., Dustdar, S.: Pyraformer: low-complexity pyramidal attention for long-range time series modeling and forecasting. In: International Conference on Learning Representations (2021). https://openreview.net/forum?id=0EXmFzUn5I
6. Cleveland, R.B., Cleveland, W.S., McRae, J.E., Terpenning, I.: Stl: a seasonal-trend decomposition. J. Off. Stat 6(1), 3–73 (1990)
7. Wu, H., Xu, J., Wang, J., Long, M.: Autoformer: Decomposition transformers with auto-correlation for long-term series forecasting. Adv. Neural Inf. Process. Syst. 34, 22419–22430 (2021). https://proceedings.neurips.cc/paper/2021/hash/bcc0d400288793e8bdcd7c19a8ac0c2b-Abstract.html
8. Zhou, T., Ma, Z., Wen, Q., Wang, X., Sun, L., Jin, R.: Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In: International Conference on Machine Learning, pp. 27268–27286. PMLR (2022). https://proceedings.mlr.press/v162/zhou22g.html
9. Woo, G., Liu, C., Sahoo, D., Kumar, A., Hoi, S.: Cost: Contrastive learning of disentangled seasonal-trend representations for time series forecasting. arXiv preprint arXiv:2202.01575 (2022). https://openreview.net/forum?id=PilZY3omXV2
10. Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016)
11. Ying, Z., You, J., Morris, C., Ren, X., Hamilton, W., Leskovec, J.: Hierarchical graph representation learning with differentiable pooling. Adv. Neural Inf. Process. Syst. 31 (2018). https://proceedings.neurips.cc/paper/2018/hash/e77dbaf6759253c7c6d0efc5690369c7-Abstract.html
12. Zhang, M., Chen, Y.: Link prediction based on graph neural networks. Adv. Neural Inf. Process. Syst. 31 (2018). https://doi.org/10.48550/arXiv.1802.09691
13. Zhao, L., Song, Y., Zhang, C., Liu, Y., Wang, P., Lin, T., Deng, M., Li, H.: T-gcn: a temporal graph convolutional network for traffic prediction. IEEE Trans. Intell. Transp. Syst. 21(9), 3848–3858 (2019). https://doi.org/10.1109/TITS.2019.2935152
14. Guo, S., Lin, Y., Feng, N., Song, C., Wan, H.: Attention based spatial-temporal graph convolutional networks for traffic flow forecasting. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, pp. 922–929 (2019). https://doi.org/10.1609/aaai.v33i01.3301922
Formation Control of Multiple Nonholonomic Wheeled Robots with Disturbances

Xingyu Gao, Xiaonan Liu, Chen Chen, Xingxing Qiu, and Zhengrong Xiang
Abstract This paper investigates the formation control problem of multiple nonholonomic wheeled robot systems with unknown external disturbances. To reduce communication overhead, a minimal communication topology is employed. The nonholonomic constraint is transformed into a second-order system with disturbances using a model transformation approach. A distributed leader-following formation control strategy is proposed based on position and velocity error information and the disturbance value observed by a designed disturbance observer. The sufficient condition for the system to reach formation is obtained using a linear matrix inequality approach and the Lyapunov stability theory. A simulation example is provided to validate the proposed method. Keywords Nonholonomic mobile robot · Formation control · Minimum communication topology · Dynamics model · Lyapunov analysis
X. Gao · X. Liu · C. Chen · X. Qiu · Z. Xiang (B)
School of Automation, Nanjing University of Science and Technology, Nanjing 210094, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_54

1 Introduction

The consensus or synchronization problem is of significant interest from both theoretical and practical perspectives, serving as the foundation for cooperative coordination control in multi-agent distributed systems. It holds great potential for applications in various fields, such as distributed sensor networks [1], flocking [2, 3], distributed attitude alignment [4], multi-missile cooperative attacks [5], and formation control [6–11]. The objective of consensus control is to design interaction protocols or algorithms that enable a group of agents to reach a common value of a specific state or output over time. The distributed formation control of multiple mobile robots is an active research area in the fields of control and robotics. Information exchange among these robots can help overcome limitations in individual capabilities, knowledge, or resources. This approach offers advantages such as high efficiency, stability, and low cost.

Formation control involves controlling the agents of a multi-agent system so that they move or arrange themselves in a specific geometric shape. In recent years, various approaches have been used in the design of multi-robot formation systems, such as leader-following, virtual structure, algebraic graph theory, and behavior-based methods. The leader-following method, similar to the leader-following consensus problem in multi-agent systems in [12], is widely studied due to its ease of implementation. Previous studies have focused on networked nonholonomic vehicles subject to velocity constraints [6], as well as the case of a single leader and a single follower, which can be simplified into a trajectory tracking control problem [7, 8]. Distributed formation control protocols for leader-based interconnected topologies have also been developed based on the estimated state of the leader [9, 10].

This paper proposes a distributed formation control strategy for the second-order formation control problem of nonholonomic mobile robots with disturbances. The main contributions of this paper are as follows: (1) To enhance real-time performance and reduce communication complexity, a tree topology is utilized; (2) The dynamic equations and the effect of external disturbances are considered to describe the motion properties of the robot more accurately; (3) A disturbance observer is designed to mitigate the influence of disturbances; (4) A distributed formation control protocol is developed to efficiently achieve the desired formation specifications in a coordinated manner.
2 Interconnected Topology

In this paper, we employ the tree topology structure. Consider a group of leader-follower robots labeled 0, 1, . . . , N, as shown in Fig. 1. We divide the system into k layers based on the communication between the robots: the leader is in layer 0 and is marked as (0, 0); there are $n_1$ followers in layer 1, marked as (1, i), where $i = 1, 2, \ldots, n_1$. Similarly, there are $n_k$ followers in layer k, marked as (k, i), where $i = N - n_k + 1, N - n_k + 2, \ldots, N$. Each robot receives information only from the leader (or intermediate leader) of the previous layer and passes its own state information to its followers in the next layer.
Fig. 1 The interconnected tree topology
Fig. 2 TurtleBot 2
Remark 1 The tree topology, used in prior studies [13, 14], organizes the system hierarchically. Each node communicates solely with its parent node, reducing communication complexity. This topology is ideal for distributed consistency or formation control, enhancing system scalability and stability.
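One way to picture the tree topology is as a parent map in which every follower stores only the robot it listens to. The sketch below is illustrative: the example tree and the names `assign_layers` and `count_links` are assumptions, not taken from the paper.

```python
def assign_layers(parent):
    """Return the layer index of every robot, given each robot's parent (None for the leader)."""
    layer = {}

    def depth(node):
        if node not in layer:
            p = parent[node]
            layer[node] = 0 if p is None else depth(p) + 1
        return layer[node]

    for node in parent:
        depth(node)
    return layer

def count_links(parent):
    """Each follower has exactly one incoming communication link, from its parent."""
    return sum(1 for p in parent.values() if p is not None)
```

For a leader 0 with followers 1 and 2 in layer 1, and followers 3 and 4 in layer 2, the map `{0: None, 1: 0, 2: 0, 3: 1, 4: 2}` needs only four links for four followers, which is the communication saving that Remark 1 refers to.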
3 Main Results

3.1 Construction of the Kinetic Model

Consider a multi-robot system consisting of N identical nonholonomic two-wheeled mobile robots, such as the TurtleBot 2 shown in Fig. 2. Let $[x_i, y_i]^T$ and $\theta_i$ be the position and orientation of robot $i$, respectively. The kinematic model of each robot is given by

$$\begin{bmatrix} \dot{x}_i \\ \dot{y}_i \\ \dot{\theta}_i \end{bmatrix} = \begin{bmatrix} \cos\theta_i & 0 \\ \sin\theta_i & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} v_i \\ w_i \end{bmatrix}, \quad i = 1, 2, \cdots, N, \tag{1}$$

where $v_i$ and $w_i$ are the linear and angular speeds of robot $i$, respectively. Taking into account the kinetic energy of the body and wheels, as well as the virtual work done by linear and rotational motion, we use the Lagrange equations of the second kind described in [15] to derive the dynamic equation of the robot as follows:

$$\begin{bmatrix} \dot{v}_i \\ \dot{w}_i \end{bmatrix} = A \begin{bmatrix} \ddot{\phi}_{ri} \\ \ddot{\phi}_{li} \end{bmatrix} + d_i, \quad i = 1, 2, \cdots, N \tag{2}$$

where $\phi_{ri}$ and $\phi_{li}$ are the angular displacements of the right and left wheels of robot $i$. The matrix $A$ is defined as

$$A = \frac{J_{wy} r^3}{m_b J_{bz} r^4 + 2 J_{wy} r^2 (m_b L^2 + J_{bz}) + 4 J_{wy}^2 L^2} \begin{bmatrix} J_{bz} + \dfrac{2 J_{wy} L^2}{r^2} & J_{bz} + \dfrac{2 J_{wy} L^2}{r^2} \\ m_b L + \dfrac{2 J_{wy} L}{r^2} & -m_b L - \dfrac{2 J_{wy} L}{r^2} \end{bmatrix},$$

where $m_b$ is the mass of the robot body, $J_{bz}$ and $J_{wy}$ are the inertia of the body and the wheel, respectively, $L$ is half of the wheel spacing, and $r$ is the wheel radius. $d_i \in \mathbb{R}^2$ is the disturbance, which satisfies Assumption 1 below.

Assumption 1 Assume that the derivative of the external disturbance is bounded, i.e., $\forall t \ge 0$, $\|\dot{d}_i\| \le \eta$, where $\eta$ is a known positive constant.

To deal with the nonholonomic constraints, we consider an off-axis point (the processing point) located on the heading axis of the vehicle at a distance $l_i$ from the center. The position of the processing point is then given as follows:
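$$\begin{bmatrix} x_{hi} \\ y_{hi} \end{bmatrix} = \begin{bmatrix} x_i \\ y_i \end{bmatrix} + l_i \begin{bmatrix} \cos\theta_i \\ \sin\theta_i \end{bmatrix}. \tag{3}$$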
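The kinematic model (1) and the off-axis point (3) can be explored numerically with a short sketch. The explicit Euler integration step and the function names are illustrative assumptions. The velocity of the off-axis point follows from differentiating (3) and substituting (1).

```python
import math

def step_kinematics(state, v, w, dt):
    """One explicit Euler step of the unicycle model (1); state is (x, y, theta)."""
    x, y, theta = state
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + w * dt)

def off_axis_point(state, l):
    """Position of the processing point at distance l ahead of the center, Eq. (3)."""
    x, y, theta = state
    return (x + l * math.cos(theta), y + l * math.sin(theta))

def off_axis_velocity(state, v, w, l):
    """Time derivative of Eq. (3) under the kinematics of Eq. (1)."""
    _, _, theta = state
    return (v * math.cos(theta) - l * w * math.sin(theta),
            v * math.sin(theta) + l * w * math.cos(theta))
```

Unlike the wheel-axis center, the off-axis point can move instantaneously in any direction for $l \neq 0$, because the matrix that maps $(v, w)$ to its velocity has determinant $l$. This is why the point is convenient for handling the nonholonomic constraint.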
Meanwhile, considering the connection with the simple second-order system, the following coordinate change is adopted
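$$\begin{bmatrix} \ddot{\phi}_{ri} \\ \ddot{\phi}_{li} \end{bmatrix} = \begin{bmatrix} \cos\theta_i & -l_i \sin\theta_i \\ \sin\theta_i & l_i \cos\theta_i \end{bmatrix}^{-1} \begin{bmatrix} u_{xi} + v_i w_i \sin\theta_i + l_i w_i^2 \cos\theta_i \\ u_{yi} - v_i w_i \cos\theta_i + l_i w_i^2 \sin\theta_i \end{bmatrix}. \tag{4}$$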
Then the follower system (1), (2) can be converted into

$$\dot{z}_i = g_i, \quad \dot{g}_i = A f_i + d_i, \quad i = 1, 2, \cdots, N \tag{5}$$

where $z_i = [x_{hi}, y_{hi}]^T$, $g_i = [v_{xi}, v_{yi}]^T$, and $f_i = [u_{xi}, u_{yi}]^T$. Similarly, the dynamics of the leader of the multi-robot system are described by $\dot{z}_0 = g_0$, $\dot{g}_0 = 0_{2 \times 1}$, where $z_0 = [x_{h0}, y_{h0}]^T$, $g_0 = [v_{x0}, v_{y0}]^T$, and $0_{2 \times 1} = [0, 0]^T$. In this paper, the expected formation of the robotic system is specified by $\Delta_i = [\Delta_{xi}, \Delta_{yi}]^T \in \mathbb{R}^2$, which represents the expected position of each robot $i$ relative to the leader 0. At the same time, all followers are expected to align with the heading of the leader.

Definition 1 The desired formation in the multi-robot system (5) is achieved if, for any initial conditions, as $t \to \infty$, $\|z_i(t) - z_0(t)\| \le \Delta_i + \delta$ and $\|g_i(t) - g_0(t)\| \le \delta$ hold for all $i = 1, 2, \cdots, N$, where $\delta = [\delta_x, \delta_y]^T \in \mathbb{R}^2$ is a vector of small positive constants.
3.2 Distributed Estimation of External Disturbances

To make the problem more tractable, let $s_i = [z_i, g_i]^T$; the system (5) can then be rewritten as $\dot{s}_i = \alpha_1 s_i + \alpha_2 d_i + \alpha_2 A f_i$.

Inspired by [16], a disturbance observer for estimating the external disturbances in system (5) is proposed as follows:

$$\begin{cases} \dot{q}_i = -H \alpha_2 (q_i + H s_i) - H (\alpha_1 s_i + \alpha_2 A f_i) \\ \hat{d}_i = q_i + H s_i \end{cases}, \quad i = 1, 2, \ldots, N \tag{6}$$

where $q_i \in \mathbb{R}^2$ is the state of the disturbance observer, $\alpha_1 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \otimes I_2$ and $\alpha_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \otimes I_2$ are known parameters, $\otimes$ is the Kronecker product, $H \in \mathbb{R}^{2 \times 4}$ is the gain matrix to be designed, and $\hat{d}_i$ is the estimate of $d_i$.

Theorem 1 If the disturbance in system (5) satisfies Assumption 1 and $H \alpha_2 \in \mathbb{R}^{2 \times 2}$ in the disturbance observer (6) satisfies $H \alpha_2 > 0$, then the disturbance observer (6) can achieve bounded tracking of the disturbance $d_i$.

Proof Define the estimation error $e_i = d_i - \hat{d}_i$. We obtain

$$\dot{e}_i = -H \alpha_2 e_i + \dot{d}_i. \tag{7}$$
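Integrating (7) with respect to time, we have
$$
e_i(t) = e^{-H\alpha_2 t} e_i(0) + \int_0^t e^{-H\alpha_2 (t-\tau)} \dot{d}_i(\tau)\, d\tau.
\tag{8}
$$
According to the properties of norms, the following inequality can be obtained:
$$
\|e_i(t)\| \le \left\| e^{-H\alpha_2 t} \right\| \|e_i(0)\| + \left\| \int_0^t e^{-H\alpha_2 (t-\tau)} \dot{d}_i(\tau)\, d\tau \right\|.
\tag{9}
$$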
Since $-H\alpha_2$ is Hurwitz, there exist $M \ge 1$ and $\beta > 0$ such that
$$
\left\| e^{-H\alpha_2 t} \right\| \le M e^{-\beta t}, \quad t \ge 0.
\tag{10}
$$
Then, using the bound $\eta$ on $\|\dot{d}_i\|$, (9) can be rewritten as
$$
\|e_i(t)\| \le M e^{-\beta t} \|e_i(0)\| + \eta \int_0^t M e^{-\beta (t-\tau)}\, d\tau
= M \left( \|e_i(0)\| - \frac{\eta}{\beta} \right) e^{-\beta t} + \frac{\eta M}{\beta}.
\tag{11}
$$
Since $\beta > 0$ and $\eta, M > 0$, $\|e_i(t)\|$ is bounded, and as $t \to \infty$ the bound on $\|e_i(t)\|$ tends to $\eta M / \beta$. The disturbance observer (6) therefore tracks the disturbance gradually, but it is subject to a steady-state error caused by the time variation of the disturbance. The proof is completed.
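The bound in (11) can be checked numerically. The following is a minimal sketch, not the authors' simulation: it assumes $A = I_2$, a zero control input, the gain $H = [0_{2\times 2},\ h I_2]$ (so that $H\alpha_2 = h I_2 > 0$, $M = 1$, $\beta = h$), and the disturbance $d = [0.1\sin t,\ 0.1\sin 3t]^T$, integrated with a forward-Euler scheme. The function name `simulate_observer` is hypothetical.

```python
import numpy as np

def simulate_observer(h=10.0, dt=1e-3, t_end=10.0):
    """Forward-Euler simulation of observer (6); returns the history of ||d - d_hat||."""
    I2 = np.eye(2)
    alpha1 = np.kron(np.array([[0.0, 1.0], [0.0, 0.0]]), I2)  # 4x4
    alpha2 = np.kron(np.array([[0.0], [1.0]]), I2)            # 4x2
    A = I2                                      # assumption: A = I_2
    H = np.hstack([np.zeros((2, 2)), h * I2])   # assumption: H @ alpha2 = h * I_2 > 0
    s = np.zeros(4)                             # s = [z; g]
    q = -H @ s                                  # chosen so that d_hat(0) = 0
    f = np.zeros(2)                             # no control input in this sketch
    errors = []
    for k in range(int(round(t_end / dt))):
        t = k * dt
        d = np.array([0.1 * np.sin(t), 0.1 * np.sin(3.0 * t)])
        d_hat = q + H @ s
        errors.append(np.linalg.norm(d - d_hat))
        s_dot = alpha1 @ s + alpha2 @ d + alpha2 @ (A @ f)
        q_dot = -H @ alpha2 @ d_hat - H @ (alpha1 @ s + alpha2 @ (A @ f))
        s = s + dt * s_dot
        q = q + dt * q_dot
    return np.array(errors)

errors = simulate_observer()
eta = np.hypot(0.1, 0.3)    # upper bound on ||d_dot|| for this disturbance
bound = eta / 10.0          # eta * M / beta with M = 1 and beta = h = 10
```

With these choices the estimation error starts at zero, so (11) predicts that it never exceeds `bound`; a larger `h` tightens the bound at the cost of a higher observer gain.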
3.3 Formation Control for Multi-robot Systems with Disturbances
In this section, a distributed formation control protocol is designed using the disturbance estimated by the disturbance observer (6) to achieve the formation control target (as defined in Definition 1).
Theorem 2 For the second-order multi-robot system (5), if the disturbance observer (6) is used and $f_{k,i}$ of the robot $i$ in layer $k$ is designed as
$$
f_{k,i} = -K_1 \left[ z_{k,i} - z_{k-1,m} - (\Delta_{k,i} - \Delta_{k-1,m}) \right] - K_2 (g_{k,i} - g_{k-1,m}) - A^{-1} \hat{d}_{k,i},
\tag{12}
$$
where $K_1, K_2 \in R^{2\times 2}$ are control gains, $\hat{d}_{k,i}$ is from the observer (6) of the robot $i$ in layer $k$, $m = 1, 2, \ldots, n_{k-1}$, $i = N - n_k + 1, N - n_k + 2, \ldots, N$. Moreover, $K_1 = A^{-1} P_1$, $K_2 = A^{-1} P_1 P_2$, where $P_1, P_2$ are positive definite matrices to be determined that satisfy
$$
P_2 P_1 - \frac{1}{2} P_2 P_2^T - P_2^{-1} > 0.
\tag{13}
$$
At the same time, there exist a positive definite and symmetric matrix $Q$ and a control gain $H$ in the observer system (6), to be designed, such that
$$
(H\alpha_2)^T Q + Q H\alpha_2 - Q Q^T - \frac{1}{2} I > 0,
\tag{14}
$$
then the formation control objectives (Definition 1) can be achieved.
Proof The tree topology is used to achieve the formation goal within each two-layer structure, thus ensuring that the entire multi-robot system finally meets the established requirements.
Step 1: First, consider the tracking problem between the followers in layer 1 and the leader in layer 0. Let $\tilde{z}_{1,i} = z_{1,i} - z_{0,0} - \Delta_{1,i}$ and $\tilde{g}_{1,i} = g_{1,i} - g_{0,0}$ be the tracking errors between the $i$-th follower (located in layer 1) and the leader, where $i = 1, 2, \ldots, n_1$, such that (5) can be rewritten as
$$
\dot{\tilde{z}}_{1,i} = \tilde{g}_{1,i}, \quad \dot{\tilde{g}}_{1,i} = A f_{1,i} + d_{1,i}.
\tag{15}
$$
A candidate Lyapunov function for system (15) is constructed:
$$
V_1 = \frac{1}{2} \sum_{i=1}^{n_1} \tilde{z}_{1,i}^T \tilde{z}_{1,i} + \frac{1}{2} \sum_{i=1}^{n_1} \varepsilon_{1,i}^T \varepsilon_{1,i} + \sum_{i=1}^{n_1} e_{1,i}^T Q e_{1,i},
\tag{16}
$$
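where $\varepsilon_{1,i} = \tilde{z}_{1,i} + P_2 \tilde{g}_{1,i}$. Since $f_{1,i} = -K_1 [z_{1,i} - z_{0,0} - \Delta_{1,i}] - K_2 (g_{1,i} - g_{0,0}) - A^{-1} \hat{d}_{1,i}$, the derivative of $V_1$ along system (15) is
$$
\begin{aligned}
\dot{V}_1 &= \sum_{i=1}^{n_1} \tilde{z}_{1,i}^T \tilde{g}_{1,i} + \sum_{i=1}^{n_1} \varepsilon_{1,i}^T \left( \tilde{g}_{1,i} + P_2 A f_{1,i} + P_2 d_{1,i} \right) + \sum_{i=1}^{n_1} \left[ e_{1,i}^T \left( (-H\alpha_2)^T Q - Q H\alpha_2 \right) e_{1,i} + 2 e_{1,i}^T Q \dot{d}_{1,i} \right] \\
&= -\sum_{i=1}^{n_1} \tilde{z}_{1,i}^T P_2^{-1} \tilde{z}_{1,i} - \sum_{i=1}^{n_1} \varepsilon_{1,i}^T \left( P_2 P_1 - P_2^{-1} \right) \varepsilon_{1,i} + \sum_{i=1}^{n_1} \varepsilon_{1,i}^T P_2 e_{1,i} - \sum_{i=1}^{n_1} e_{1,i}^T \bar{Q} e_{1,i} + 2 \sum_{i=1}^{n_1} e_{1,i}^T Q \dot{d}_{1,i},
\end{aligned}
\tag{17}
$$
where $-\bar{Q} = (-H\alpha_2)^T Q - Q H\alpha_2$, $\bar{Q} > 0$. Motivated by Young's inequality [17], we have $\sum_{i=1}^{n_1} \varepsilon_{1,i}^T P_2 e_{1,i} \le \frac{1}{2} \sum_{i=1}^{n_1} \varepsilon_{1,i}^T P_2 P_2^T \varepsilon_{1,i} + \frac{1}{2} \sum_{i=1}^{n_1} e_{1,i}^T e_{1,i}$ and $2 \sum_{i=1}^{n_1} e_{1,i}^T Q \dot{d}_{1,i} \le \sum_{i=1}^{n_1} e_{1,i}^T Q Q^T e_{1,i} + \sum_{i=1}^{n_1} \dot{d}_{1,i}^T \dot{d}_{1,i}$, and by Assumption 1, (17) can be rewritten as
$$
\dot{V}_1 \le -\sum_{i=1}^{n_1} \tilde{z}_{1,i}^T P_2^{-1} \tilde{z}_{1,i} - \sum_{i=1}^{n_1} \varepsilon_{1,i}^T \left( P_2 P_1 - \frac{1}{2} P_2 P_2^T - P_2^{-1} \right) \varepsilon_{1,i} - \sum_{i=1}^{n_1} e_{1,i}^T \left( \bar{Q} - Q Q^T - \frac{1}{2} I \right) e_{1,i} + \eta^2.
\tag{18}
$$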
We conclude that $\dot{V}_1 \le -c_0 V_1 + C$. When $c_0 > C/p$, $\dot{V}_1 \le 0$ holds on $V_1 = p$; that is, $V_1 \le p$ is an invariant set. We have $0 \le V_1(t) \le \frac{C}{c_0} + \left[ V_1(0) - \frac{C}{c_0} \right] e^{-c_0 t}$. Thus $V_1(t)$ is bounded, and $\tilde{z}_{1,i}$, $\tilde{g}_{1,i}$, $e_{1,i}$ are all bounded. Since $C = \eta^2$, $C$ is a fixed constant. Therefore, appropriate matrices $P_1$, $P_2$, $Q$ and $H\alpha_2$ can be selected to make $C/c_0$ sufficiently small, so that the tracking errors $\tilde{z}_{1,i}$, $\tilde{g}_{1,i}$ converge to a small neighborhood of the origin; thus this system can achieve practical consensus.
$\vdots$
Step k: Finally, consider the tracking problem between the followers in layer $k-1$ and layer $k$. The proof procedure is the same as Step 1. Let $\tilde{z}_{k,i} = z_{k,i} - z_{k-1,m} - (\Delta_{k,i} - \Delta_{k-1,m})$ and $\tilde{g}_{k,i} = g_{k,i} - g_{k-1,m}$ be the tracking errors of the $i$-th follower (in layer $k$) with respect to the $m$-th follower (in layer $k-1$), where $m = 1, 2, \ldots, n_{k-1}$ and the corresponding $i = N - n_k + 1, N - n_k + 2, \ldots, N$. Similarly, take
$$
V_k = \frac{1}{2} \sum_{i=N-n_k+1}^{N} \tilde{z}_{k,i}^T \tilde{z}_{k,i} + \frac{1}{2} \sum_{i=N-n_k+1}^{N} \varepsilon_{k,i}^T \varepsilon_{k,i} + \sum_{i=N-n_k+1}^{N} e_{k,i}^T Q e_{k,i}
$$
as a candidate Lyapunov function, so that this subsystem can also achieve practical consensus.
In summary, we prove the consensus of each two-layer subsystem, where followers in layer $k$ track layer $k-1$, and so on, until followers in layer 1 achieve consistency with the leader for the desired formation. This ensures that the entire system achieves the control goal of Definition 1.
Remark 2 If $H\alpha_2$ is symmetric, we obtain $P_1 > \frac{1}{2} P_2^T + P_2^{-2}$ and $H\alpha_2 > \frac{1}{2} Q + \frac{1}{4} Q^{-1}$ from conditions (13) and (14). It is only necessary for $P_2$ and $Q$ to meet simple positive definite conditions, so there always exist $P_1$, $P_2$, and $H$.
Remark 3 Using the coordinate transformation (4), the control input (12) can be converted to
$$
\ddot{\Phi}_{k,i} = R_{k,i}^{-1} \left( f_{k,i} + \Omega_{k,i} \right),
\tag{19}
$$
where $\ddot{\Phi}_{k,i} = [\ddot{\varphi}_{k,ri}, \ddot{\varphi}_{k,li}]^T$ is the angular acceleration input of the right and left wheels of the wheeled robot $i$ (located in layer $k$), $f_{k,i}$ should satisfy conditions (13) and (14), and
$$
R_{k,i} = \begin{bmatrix} \cos\theta_{k,i} & -l_{k,i} \sin\theta_{k,i} \\ \sin\theta_{k,i} & l_{k,i} \cos\theta_{k,i} \end{bmatrix}, \quad
\Omega_{k,i} = \begin{bmatrix} v_{k,i} w_{k,i} \sin\theta_{k,i} + l_{k,i} w_{k,i}^2 \cos\theta_{k,i} \\ -v_{k,i} w_{k,i} \cos\theta_{k,i} + l_{k,i} w_{k,i}^2 \sin\theta_{k,i} \end{bmatrix}.
$$
When the control input of the multi-robot system is (19), the two-wheeled robots finally reach the specified formation shape.
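As an illustration of the conversion in Remark 3, the sketch below maps a planar control input $f$ to wheel angular accelerations by solving the linear system built from the coordinate change (4). The function name `wheel_accelerations` and its argument names are hypothetical; solving the system with `numpy.linalg.solve` instead of forming the inverse explicitly is a design choice of this sketch, not something prescribed by the paper.

```python
import numpy as np

def wheel_accelerations(f, theta, v, w, l):
    """Return [phi_r_ddot, phi_l_ddot] from f = [u_x, u_y] via the coordinate change (4).

    theta: heading angle, v: linear velocity, w: angular velocity,
    l: offset parameter l_i of the coordinate change (must be nonzero).
    """
    R = np.array([[np.cos(theta), -l * np.sin(theta)],
                  [np.sin(theta),  l * np.cos(theta)]])
    omega = np.array([ v * w * np.sin(theta) + l * w**2 * np.cos(theta),
                      -v * w * np.cos(theta) + l * w**2 * np.sin(theta)])
    # det(R) = l, so the system is solvable whenever l != 0.
    return np.linalg.solve(R, np.asarray(f, dtype=float) + omega)

# With theta = 0 and zero velocities, R is diagonal: diag(1, l).
acc = wheel_accelerations([0.0, 1.0], theta=0.0, v=0.0, w=0.0, l=0.5)
```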
4 Numerical Examples and Simulations
In this section, a simulation example is provided to validate the effectiveness of the theoretical results. Suppose that the multi-robot system consists of five followers and one leader. The communication topology is given in Fig. 3. The desired formation configuration is $\Delta_{1,1} = [-0.5, 1]^T$, $\Delta_{1,2} = [0.5, 0]^T$, $\Delta_{2,3} = [-1, -1]^T$, $\Delta_{2,4} = [0, -1]^T$, and $\Delta_{2,5} = [1, -1]^T$. The external disturbances are assumed to be $d = [0.1\sin(t), 0.1\sin(3t)]^T$. The initial states of the followers are given as $z_{1,1}(0) = [-0.25, 1.2]^T$, $z_{1,2}(0) = [1, 0]^T$, $z_{2,3}(0) = [-0.8, 1]^T$, $z_{2,4}(0) = [0, 2]^T$, and $z_{2,5}(0) = [2, -3]^T$. For $A$, choose the related parameters $m_b = 0.5$ kg, $J_{bz} = 0.0025$ kg·m², $J_{wy} = 0.0000264$ kg·m², $r = 0.0325$ m, and $L = 0.1$ m. Meanwhile, the control parameters are
$$
K = \begin{bmatrix} 0 & 0 & 10.76 & 0 \\ 0 & 0 & 0 & 10.76 \end{bmatrix}, \quad
P_1 = \begin{bmatrix} 1.22 & -0.03 \\ -0.03 & 1.25 \end{bmatrix}, \quad
P_2 = \begin{bmatrix} 1 & 0.182 \\ 0.182 & 1 \end{bmatrix}.
$$
The state trajectories of the five followers under the protocol (12) designed as above are depicted in Fig. 4, from which it can be observed that the formation is reached.
Fig. 3 The communication topology (nodes: 0,0; 1,1; 1,2; 2,3; 2,4; 2,5)
Fig. 4 Simulation results with $z_{0,0} = [0, 0]^T$ and $g_{0,0} = [1, 1]^T$
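A reduced version of this example can be reproduced with a few lines of code. The sketch below is not the authors' simulation: it works directly on the second-order model (5) with the assumption $A = I_2$ (so $K_1 = P_1$, $K_2 = P_1 P_2$), takes the observer gain as $H = [0_{2\times 2},\ 10.76 I_2]$, starts all followers at rest, assumes that follower (2,3) tracks (1,1) while (2,4) and (2,5) track (1,2), and uses forward-Euler integration. The name `simulate_formation` is hypothetical.

```python
import numpy as np

P1 = np.array([[1.22, -0.03], [-0.03, 1.25]])
P2 = np.array([[1.0, 0.182], [0.182, 1.0]])
K1, K2 = P1, P1 @ P2              # K1 = A^{-1} P1, K2 = A^{-1} P1 P2 with A = I_2 (assumed)
H_GAIN = 10.76                    # assumption: H = [0, 10.76 * I_2], i.e. H @ s = 10.76 * g
CHILD = np.array([1, 2, 3, 4, 5])     # follower indices
PARENT = np.array([0, 0, 1, 2, 2])    # assumed tree: index of the robot each follower tracks
DELTA = np.array([[0, 0], [-0.5, 1], [0.5, 0], [-1, -1], [0, -1], [1, -1]], dtype=float)
Z_INIT = np.array([[0, 0], [-0.25, 1.2], [1, 0], [-0.8, 1], [0, 2], [2, -3]], dtype=float)

def simulate_formation(t_end=20.0, dt=1e-3):
    """Return ||z_i - z_0 - Delta_i|| for the five followers at time t_end."""
    z = Z_INIT.copy()
    g = np.zeros((6, 2))
    g[0] = [1.0, 1.0]                    # leader velocity g_{0,0}
    q = -H_GAIN * g[CHILD]               # observer states, so that d_hat(0) = 0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        d = np.array([0.1 * np.sin(t), 0.1 * np.sin(3.0 * t)])
        d_hat = q + H_GAIN * g[CHILD]                        # estimate from observer (6)
        z_err = z[CHILD] - z[PARENT] - (DELTA[CHILD] - DELTA[PARENT])
        g_err = g[CHILD] - g[PARENT]
        f = -z_err @ K1.T - g_err @ K2.T - d_hat             # protocol (12) with A = I_2
        g_dot = np.zeros_like(g)                             # leader: g_dot = 0
        g_dot[CHILD] = f + d
        q = q + dt * (-H_GAIN * d_hat - H_GAIN * f)          # observer (6) for this H
        z = z + dt * g
        g = g + dt * g_dot
    return np.linalg.norm(z[CHILD] - z[0] - DELTA[CHILD], axis=1)

final_errors = simulate_formation()
```

Under these assumptions the residual formation errors stay small but nonzero, which matches the practical (bounded) consensus claimed in Theorem 2 rather than exact convergence.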
5 Conclusion A distributed formation control strategy is proposed to address the formation control problem of non-holonomic mobile robots with exogenous disturbances. The robot dynamics model is constructed, and hierarchical control of the robots is achieved using a tree communication topology to enhance control precision and response speed. Simulation results demonstrate the effectiveness of the proposed algorithm. Future work could involve incorporating event triggering for communication efficiency and further research on practical issues in robot formation, such as collision avoidance.
Acknowledgements This work was supported by Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX23_0479).
References
1. Zheng, M., Liu, C.-L., Liu, F.: Average-consensus tracking of sensor network via distributed coordination control of heterogeneous multi-agent systems. IEEE Control Syst. Lett. 3(1), 132–137 (2019)
2. Olfati-Saber, R.: Flocking for multi-agent dynamic systems: Algorithms and theory. IEEE Trans. Automat. Contr. 51(3), 401–420 (2006)
3. Zhu, J., Lu, J., Yu, X.: Flocking of multi-agent non-holonomic systems with proximity graphs. IEEE Trans. Circuits Syst. I, Reg. Papers 60(1), 199–210 (2012)
4. Du, H., Li, S.: Attitude synchronization control for a group of flexible spacecraft. Automatica 50(2), 646–651 (2014)
5. Zhou, J., Lü, Y., Li, Z., Yang, J.: Cooperative guidance law design for simultaneous attack with multiple missiles against a maneuvering target. J. Syst. Sci. Complex. 31(1), 287–301 (2018)
6. Yu, X., Liu, L.: Distributed formation control of nonholonomic vehicles subject to velocity constraints. IEEE Trans. Ind. Electron. 63(2), 1289–1298 (2015)
7. Liang, X., Liu, Y.H., Wang, H., Chen, W., Xing, K., Liu, T.: Leader-following formation tracking control of mobile robots without direct position measurements. IEEE Trans. Automat. Contr. 61(12), 4131–4137 (2016)
8. Xiao, H., Li, Z., Chen, C.P.: Formation control of leader-follower mobile robots' systems using model predictive control based on neural-dynamic optimization. IEEE Trans. Ind. Electron. 63(9), 5752–5762 (2016)
9. Yu, X., Liu, L.: Distributed formation control of nonholonomic vehicles subject to velocity constraints. IEEE Trans. Ind. Electron. 63(2), 1289–1298 (2015)
10. Miao, Z., Liu, Y.H., Wang, Y., Yi, G., Fierro, R.: Distributed estimation and control for leader-following formations of nonholonomic mobile robots. IEEE Trans. Autom. Sci. Eng. 15, 1946–1954 (2018)
11. Chen, C., Zou, W., Xiang, Z.: Event-triggered consensus of multiple uncertain Euler-Lagrange systems with limited communication range. IEEE Trans. Syst. Man Cybern. Syst. (2023). https://doi.org/10.1109/TSMC.2023.3277703
12. Chen, C., Zou, W., Xiang, Z.: Leader-following connectivity-preserving consensus of multiple Euler-Lagrange systems with disturbances. IEEE Syst. J. (2023). https://doi.org/10.1109/JSYST.2023.3263262
13. Ji, Z., Lin, H., Yu, H.: Leaders in multi-agent controllability under consensus algorithm and tree topology. Syst. Control Lett. 61(9), 918–925 (2012)
14. Çeltek, S.A., Durdu, A., Kurnaz, E.: Design and simulation of the hierarchical tree topology based wireless drone networks. In: 2018 International Conference on Artificial Intelligence and Data Processing (IDAP), pp. 1–5 (2018)
15. Hendzel, Z.: Modelling of dynamics of a wheeled mobile robot with mecanum wheels with the use of Lagrange equations of the second kind. Int. J. Appl. Mech. Eng. 22(1), 81–99 (2017)
16. Zhang, X., Liu, X.: Further results on consensus of second-order multi-agent systems with exogenous disturbance. IEEE Trans. Circuits Syst. I, Reg. Papers 60(12), 3215–3226 (2013)
17. Tong, S., Wang, T., Li, Y.: Fuzzy adaptive actuator failure compensation control of uncertain stochastic nonlinear systems with unmodeled dynamics. IEEE Trans. Fuzzy Syst. 22(3), 563–574 (2013)
Compute a Class of Refinable Function by a Matlab Code and Its Figures
Xiaohui Zhou and Yujia Liu
Abstract Based on a fast numerical algorithm for computing multiscaling functions, a Matlab pseudo code is given for computing their approximate values and showing their figures. Some examples are given for discussion. The approximate values of refinable functions that satisfy the two-scale equation, such as orthogonal multiscaling functions, biorthogonal multiscaling functions, and two-direction refinable functions, can be obtained by the Matlab code. The simulation figures are given.
Keywords Multi-resolution analysis · Multiscaling function · Two-direction wavelet
X. Zhou (B)
Shanghai University of Finance and Economics-Zhejiang College, 321000 Jinhua, Zhejiang, China
Y. Liu
Department of Mathematics, Zhejiang University of Finance and Economics Dongfang College, 314408 Jiaxing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_55
1 Introduction
Wavelet theory has been applied in many fields [1], such as signal processing, generating normal random numbers [2], economics and finance [3–6], and image denoising. It is well known that, apart from the Haar wavelet, no real wavelet can be orthogonal, symmetric, and compactly supported simultaneously. In order to overcome this drawback, multi-wavelets were developed, such as the GHM multi-wavelet and the C-L multi-wavelet. The construction of multi-wavelets is based on the multiplicity multi-resolution analysis [5–7]. Here is a brief introduction to the multiplicity multi-resolution analysis, which is discussed widely in the literature. Assume that $\{V_j\}$ is an increasing sequence of closed subspaces in $L^2(R)$ which satisfies the following conditions:
(1) $V_j \subset V_{j+1}$, $\forall j \in Z$;
(2) $\bigcap_{j \in Z} V_j = \{0\}$, $\overline{\bigcup_{j \in Z} V_j} = L^2(R)$;
(3) $f(x) \in V_j \Leftrightarrow f(2x) \in V_{j+1}$;
(4) There exists a vector function $\Phi(x) = [\varphi_1(x), \varphi_2(x), \ldots, \varphi_r(x)]^T$ such that the set $\{\varphi_i(x - k), i = 1, 2, \ldots, r, k \in Z\}$ is an orthogonal basis of $V_0$.
An orthogonal multiplicity multi-resolution analysis of $L^2(R)$ is generated by the subspace sequence $\{V_j\}$ of $L^2(R)$ with the above properties, where $V_j = \mathrm{clos}_{L^2(R)} \langle \varphi_{i,j,k} : i = 1, 2, \ldots, r, k \in Z \rangle$ and $\varphi_{i,j,k} = 2^{j/2} \varphi_i(2^j x - k)$. For each $j \in Z$, $W_j$ is the orthogonal complement of $V_j$ in the space $V_{j+1}$, that is, $V_{j+1} = V_j \oplus W_j$. Thus, $\bigoplus_{j \in Z} W_j = L^2(R)$. Suppose there exists a vector function $\Psi(x) \in L^2(R)^r$ such that the set $\{\psi_i(x - k), i = 1, 2, \ldots, r, k \in Z\}$ is an orthogonal basis of $W_0$, where $\Psi(x) = [\psi_1(x), \psi_2(x), \ldots, \psi_r(x)]^T$ and $W_j = \mathrm{clos}_{L^2(R)} \langle \psi_{i,j,k} = 2^{j/2} \psi_i(2^j x - k) : k \in Z \rangle$. Then the function $\Phi(x)$ is called the multiscaling function, and $\Psi(x)$ is the multi-wavelet corresponding to $\Phi(x)$.
This paper is organized as follows: in Sect. 2, the concept of a multiscaling function is introduced; in Sect. 3, a fast algorithm is discussed for computing the numerical values of a multi-scaling function; in Sect. 4, a Matlab code is given for computing the approximate values. Finally, some examples are given to illustrate how to use the Matlab code to compute the approximate values and show the figures of, for example, orthogonal multiscaling functions, biorthogonal multiscaling functions, and two-direction refinable functions.
2 Preliminary
Assume that $\Phi(x) = [\varphi_1(x), \varphi_2(x), \ldots, \varphi_r(x)]^T$ is a multiscaling function with multiplicity $r$ and dilation factor $a$ that satisfies the two-scale matrix equation
$$
\Phi(x) = \sum_{k \in Z} H_k \Phi(ax - k),
\tag{1}
$$
where the $r \times r$ matrix sequence $\{H_k\}$ is called the two-scale matrix sequence of $\Phi(x)$. According to the multiplicity multi-resolution analysis, the $a - 1$ function vectors $\Psi^i(x) = [\psi^i_1(x), \psi^i_2(x), \ldots, \psi^i_r(x)]^T$, $i = 1, 2, \ldots, a - 1$, are called orthogonal multi-wavelets corresponding to $\Phi(x)$, which satisfy the following matrix equations:
$$
\Psi^i(x) = \sum_{k \in Z} G^i_k \Phi(ax - k),
\tag{2}
$$
where the $r \times r$ matrix sequences $\{G^i_k\}$ are called the two-scale matrix sequences of $\Psi^i(x)$.
Definition 1 Let $\Phi(x) = [\varphi_1(x), \varphi_2(x), \ldots, \varphi_r(x)]^T$ be a multiscaling function, and let $\Psi^i(x) = [\psi^i_1(x), \psi^i_2(x), \ldots, \psi^i_r(x)]^T$ be the corresponding multiwavelets. If the following equations hold:
$$
\langle \varphi_i(x), \varphi_j(x - k) \rangle = \delta_{i,j} \delta_{0,k}, \quad
\langle \varphi_i(x), \psi^k_j(x - k) \rangle = 0, \quad
\langle \psi^k_i(x), \psi^l_j(x - k) \rangle = \delta_{0,k} \delta_{i,j} \delta_{l,k},
$$
where $i, j = 0, 1, \ldots, r - 1$; $k, l = 0, 1, \ldots, a - 1$, then $\Phi(x)$ is called an orthogonal multiscaling function and $\Psi^i(x)$ are the corresponding orthogonal multiwavelets.
3 A Fast Algorithm for a Multi-scaling Function
In this section, a fast algorithm for computing the numerical values of a multi-scaling function is introduced, following the method of Daubechies and Lagarias [5]. Yang Shouzhi has discussed the case of an integer scale factor $a$ of the multiscaling function [6, 8], where $a \ge 2$. Assume that $\Phi(x) = [\varphi_1(x), \varphi_2(x), \ldots, \varphi_r(x)]^T$ is a multi-scaling function with multiplicity $r$ and dilation factor $a$, which satisfies the two-scale matrix equation
$$
\Phi(x) = \sum_{k=0}^{M} H_k \Phi(ax - k).
\tag{3}
$$
Define $m = \left\lceil \frac{M}{a-1} \right\rceil$, where $\lceil x \rceil$ denotes the smallest integer greater than or equal to $x$, and a vector $A \in R^{(m-1)r}$ as
$$
A = [\varphi_1(1), \varphi_2(1), \ldots, \varphi_r(1), \varphi_1(2), \varphi_2(2), \ldots, \varphi_r(2), \ldots, \varphi_1(m-1), \varphi_2(m-1), \ldots, \varphi_r(m-1)]^T.
$$
According to Eq. (3), we have
$$
A = N A,
\tag{4}
$$
where $N$ is the $(m-1)r \times (m-1)r$ matrix with blocks
$$
(N)_{i,j} = H_{ai-j}, \quad i, j = 1, 2, \ldots, m-1.
\tag{5}
$$
According to the normalization condition $\sum_{\ell=1}^{r} \sum_{k=1}^{m-1} \varphi_\ell(k) = 1$, the solution of Eq. (3) is unique. In order to compute the approximate value of the multi-scaling function $\Phi(x)$ or the multi-wavelet $\Psi(x)$ at a point, the vector $A(x) \in R^{rm}$, $x \in [0, 1)$, is defined as
$$
A(x) = [\varphi_1(x), \ldots, \varphi_r(x), \varphi_1(x+1), \ldots, \varphi_r(x+1), \ldots, \varphi_1(x+m-1), \ldots, \varphi_r(x+m-1)]^T,
\tag{6}
$$
where
$$
A(0) = [0, 0, \ldots, 0, A^T]^T.
\tag{7}
$$
Define the $rm \times rm$ matrices $T_0, T_1, \ldots, T_{a-1}$ as
$$
(T_\ell)_{i,j} = H_{ai-j-(a-1)+\ell}, \quad i, j = 1, 2, \ldots, m, \quad \ell = 0, 1, \ldots, a-1.
\tag{8}
$$
For an arbitrary point $x \in [0, 1)$, according to Eq. (3), there is
$$
A\!\left( \frac{x + \ell}{a} \right) = T_\ell A(x), \quad \ell = 0, 1, \ldots, a-1.
\tag{9}
$$
A given point $x$ can be expanded in base-$a$ digits as $x = \sum_{j=1}^{\infty} d_j a^{-j}$, where $d_j \in \{0, 1, \ldots, a-1\}$. An operator $\tau$ is defined as
$$
\tau x = \sum_{j=2}^{\infty} d_j a^{-j+1}.
$$
So
$$
A(x) = T_{d_1} A(\tau x).
\tag{10}
$$
Iterating Eq. (10) repeatedly and truncating the expansion after $m$ digits,
$$
A(x) \approx T_{d_1} T_{d_2} \cdots T_{d_m} A(0).
\tag{11}
$$
In summary, the fast algorithm for computing the approximate value of $\Phi(x)$ consists of the following steps:
Step one. Construct the $rm \times rm$ matrices $T_0, T_1, \ldots, T_{a-1}$ by Eq. (8);
Step two. For any $x \in [0, m)$, there is an integer $k$ such that $x \in [k, k+1)$. Let $s = x - k$; then $s \in [0, 1)$ is approximated in base $a$ by $\sum_{j=1}^{m} d_j a^{-j}$;
Step three. Calculate the vector $\tilde{A} = T_{d_1} T_{d_2} \cdots T_{d_m} A(0)$, where $A(0)$ is determined by Eq. (7);
Step four. The $(rk+1)$th, $(rk+2)$th, $\ldots$, $(rk+r)$th components of $\tilde{A}$ are the approximate values of the functions $\varphi_1(x), \varphi_2(x), \ldots, \varphi_r(x)$.
Notes: For $x \in [0, 1)$, the first to $r$th components of $\tilde{A}$ are the approximate values of $\varphi_1(x), \varphi_2(x), \ldots, \varphi_r(x)$. The $(r+1)$th to $2r$th components of $\tilde{A}$ are the approximate values of $\varphi_1(1+x), \varphi_2(1+x), \ldots, \varphi_r(1+x)$. Similarly, the $[(m-1)r+1]$th to $mr$th components of $\tilde{A}$ are the approximate values of $\varphi_1(m-1+x), \varphi_2(m-1+x), \ldots, \varphi_r(m-1+x)$.
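The four steps above can be sketched in a few lines for the scalar case. The code below is only an illustration under stated assumptions: it takes $r = 1$ (so each $H_k$ is a number $h_k$), obtains $A(0)$ as the eigenvector of $T_0$ for eigenvalue 1 (Eq. (9) with $\ell = 0$ and $x = 0$), and uses a fixed number of base-$a$ digits. The function names and the `digits` parameter are hypothetical and do not come from the paper's Matlab code.

```python
import numpy as np

def transition_matrices(h, a=2):
    """Step one: build T_0, ..., T_{a-1} from Eq. (8) for scalar coefficients h_0..h_M."""
    M = len(h) - 1
    m = -(-M // (a - 1))                          # ceil(M / (a - 1))
    coef = lambda k: h[k] if 0 <= k <= M else 0.0
    return [np.array([[coef(a * i - j - (a - 1) + l) for j in range(1, m + 1)]
                      for i in range(1, m + 1)]) for l in range(a)]

def initial_vector(T0):
    """A(0): eigenvector of T_0 for eigenvalue 1, scaled so that its entries sum to 1."""
    vals, vecs = np.linalg.eig(T0)
    v = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    return v / v.sum()

def evaluate(h, x, a=2, digits=30):
    """Steps two to four: approximate phi(x) for x in [0, m)."""
    T = transition_matrices(h, a)
    A = initial_vector(T[0])
    k = int(np.floor(x))
    s = x - k
    ds = []
    for _ in range(digits):                       # base-a digits of the fractional part
        s *= a
        d = min(int(s), a - 1)
        ds.append(d)
        s -= d
    for d in reversed(ds):                        # A = T_{d1} T_{d2} ... T_{dn} A(0)
        A = T[d] @ A
    return A[k]

# Hat function: phi(x) = 0.5*phi(2x) + phi(2x - 1) + 0.5*phi(2x - 2), supported on [0, 2].
hat = [0.5, 1.0, 0.5]
value = evaluate(hat, 0.25)
```

For the hat function the exact values are known ($\varphi(x) = x$ on $[0, 1]$ and $2 - x$ on $[1, 2]$), which makes it a convenient sanity check before moving on to matrix-valued coefficients.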
4 Matlab Code for Computing Multi-scaling Function
In this section, a Matlab code is given for computing a multi-scaling function and showing its figures. It is given as follows.
define a function ‘multiscalefunctionfigure(r, a, M, p)’
%r-multiplicity,a-dilation factor,M-Length of filters,p-low filters
compute m=floor(M/(a-1));
for i=1:m-1 if a*i-1+1>m or a*i-1+1m or a*i-j+1m or (a*i-1-a+k+1)m or (a*i-j-a+k+1)
0,
(6)
where $P_i$, $P_{i,\infty}$ and $a_i$ are positive constants, $P_i > P_{i,\infty}$, $i = 1, \ldots, n$. An example of the prescribed performance is shown in Fig. 1.
Fig. 1 An example of prescribed performance (horizontal axis: Time)
Q. Meng and Y. Lin
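To achieve (5), a transformed error $z_{1,i}$ and a smooth, strictly increasing function $T(z_{1,i})$ are defined with the properties [20]:
$$
\begin{aligned}
&\text{(i)}\ -\underline{\delta}_i < T(z_{1,i}) < \bar{\delta}_i, \\
&\text{(ii)}\ \lim_{z_{1,i} \to +\infty} T(z_{1,i}) = \bar{\delta}_i, \quad \lim_{z_{1,i} \to -\infty} T(z_{1,i}) = -\underline{\delta}_i, \\
&\text{(iii)}\ T(0) = 0.
\end{aligned}
\tag{7}
$$
The tracking error can be transformed into
$$
\frac{e_{1,i}}{p_i} = T(z_{1,i}).
\tag{8}
$$
Because $T(z_{1,i})$ is strictly monotonic and $p_i(t)$ is nonzero on $t \in [0, +\infty)$,
$$
z_{1,i} = T^{-1}(e_{1,i} / p_i).
\tag{9}
$$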
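In this study, $T$ is chosen as
$$
T(z_{1,i}) = \frac{\bar{\delta}_i e^{z_{1,i} + r} - \underline{\delta}_i e^{-(z_{1,i} + r)}}{e^{z_{1,i} + r} + e^{-(z_{1,i} + r)}}, \quad r = \frac{1}{2} \ln\left( \underline{\delta}_i / \bar{\delta}_i \right).
\tag{10}
$$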
Prescribed Performance for the Error $z_2$
The prescribed performance for the error $z_{2,i}$ is defined as
$$
|z_{2,i}| < K_i.
\tag{11}
$$
The following Barrier Function (BF) [21] is to be applied in the backstepping design to achieve (11):
$$
B(z_{2,i}) = \frac{1}{2} \ln \frac{K_i^2}{K_i^2 - z_{2,i}^2}.
\tag{12}
$$
It is obvious that $B(z_{2,i})$ is positive definite and $C^1$ continuous for $|z_{2,i}| < K_i$.
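Adaptive Tracking Control for Manipulators with Prescribed …
3 Adaptive Control Design
With the knowledge above, the virtual control signal, the controller and the adaptive update law are designed as follows:
$$
\alpha_1 = -c_1 \left( \frac{\partial z_1}{\partial e_1} \right)^{-1} z_1 + \dot{x}_r + \frac{e_1}{p} \dot{p},
\tag{13}
$$
$$
u_i = (K_i^2 - z_{2,i}^2) \left( -c_2 z_{2,i} - \frac{\partial z_{1,i}}{\partial e_{1,i}} z_{1,i} \right) - \hat{\theta}_i^T \phi + \dot{\alpha}_{1,i},
\tag{14}
$$
$$
\dot{\hat{\theta}}_i = \Gamma_i \frac{z_{2,i}}{K_i^2 - z_{2,i}^2} \phi,
\tag{15}
$$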
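where $c_1$ and $c_2$ are positive constants and $\Gamma_i$ is a positive-definite matrix.
Theorem 1 For the n-link manipulator system (2), apply the virtual control signal (13), the controller (14) and the adaptive update law (15). Then the tracking error converges to zero asymptotically and all the signals are bounded.
Proof The control law is constructed in the following steps.
Step 1: According to Eq. (4), the derivative of $z_1$ can be expressed as
$$
\begin{aligned}
\dot{z}_1 &= \frac{\partial z_1}{\partial e_1} (\dot{x}_1 - \dot{x}_r) + \frac{\partial z_1}{\partial p} \dot{p} \\
&= \frac{\partial z_1}{\partial e_1} (x_2 - \dot{x}_r) + \frac{\partial z_1}{\partial p} \dot{p} \\
&= \frac{\partial z_1}{\partial e_1} (z_2 + \alpha_1 - \dot{x}_r) + \frac{\partial z_1}{\partial p} \dot{p},
\end{aligned}
\tag{16}
$$
where $\frac{\partial z_1}{\partial e_1} = \mathrm{diag}\left( \frac{\partial z_{1,1}}{\partial e_{1,1}}, \ldots, \frac{\partial z_{1,n}}{\partial e_{1,n}} \right)$ and $\frac{\partial z_1}{\partial p} = \mathrm{diag}\left( \frac{\partial z_{1,1}}{\partial p_1}, \ldots, \frac{\partial z_{1,n}}{\partial p_n} \right)$. Define the quadratic function $V_1$ as
$$
V_1 = \frac{1}{2} z_1^T z_1.
\tag{17}
$$
Then
$$
\dot{V}_1 = z_1^T \dot{z}_1.
\tag{18}
$$
Choose the virtual control signal $\alpha_1$ as
$$
\alpha_1 = -c_1 \left( \frac{\partial z_1}{\partial e_1} \right)^{-1} z_1 + \dot{x}_r + \frac{e_1}{p} \dot{p},
\tag{19}
$$
where
$$
\frac{e_1}{p} := \mathrm{diag}\left( \frac{e_{1,1}}{p_1}, \ldots, \frac{e_{1,n}}{p_n} \right).
\tag{20}
$$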
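Then we have
$$
\dot{V}_1 = -c_1 z_1^T z_1 + z_1^T \frac{\partial z_1}{\partial e_1} z_2
\tag{21}
$$
$$
= -c_1 z_1^T z_1 + \sum_{i=1}^{n} z_{1,i} \frac{\partial z_{1,i}}{\partial e_{1,i}} z_{2,i}.
\tag{22}
$$
Step 2: Similarly, the derivative of $z_2$ can be expressed as
$$
\dot{z}_2 = \dot{x}_2 - \dot{\alpha}_1 = \theta^T \phi + u - \dot{\alpha}_1.
\tag{23}
$$
Define the function $V$ as
$$
V = V_1 + \frac{1}{2} \sum_{i=1}^{n} \ln \frac{K_i^2}{K_i^2 - z_{2,i}^2} + \frac{1}{2} \sum_{i=1}^{n} \tilde{\theta}_i^T \Gamma_i^{-1} \tilde{\theta}_i.
\tag{24}
$$
Then
$$
\dot{V} = -c_1 z_1^T z_1 + z_1^T \frac{\partial z_1}{\partial e_1} z_2 + \sum_{i=1}^{n} \frac{z_{2,i} \dot{z}_{2,i}}{K_i^2 - z_{2,i}^2} + \sum_{i=1}^{n} \tilde{\theta}_i^T \Gamma_i^{-1} \dot{\hat{\theta}}_i.
\tag{25}
$$
The adaptive control law $u_i$ and the update law $\dot{\hat{\theta}}_i$ are designed as
$$
u_i = (K_i^2 - z_{2,i}^2) \left( -c_2 z_{2,i} - \frac{\partial z_{1,i}}{\partial e_{1,i}} z_{1,i} \right) - \hat{\theta}_i^T \phi + \dot{\alpha}_{1,i}, \quad
\dot{\hat{\theta}}_i = \Gamma_i \frac{z_{2,i}}{K_i^2 - z_{2,i}^2} \phi,
\tag{26}
$$
respectively. It can be deduced that
$$
\dot{V} = -c_1 z_1^T z_1 - c_2 z_2^T z_2,
\tag{27}
$$
which means that the errors $e_1$ and $z_2$ converge to zero asymptotically. Furthermore, all the closed-loop signals are bounded.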
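The step from (25) to (27) relies on two cancellations that can be written out explicitly. The following worked derivation is a sketch that assumes the parameter error is defined as $\tilde{\theta}_i = \hat{\theta}_i - \theta_i$ (a sign convention not stated in this part of the text) and that the tracking error is denoted $e_{1,i}$ as in (9).

```latex
\begin{aligned}
\dot{z}_{2,i} &= \theta_i^T \phi + u_i - \dot{\alpha}_{1,i}
  = -\tilde{\theta}_i^T \phi + (K_i^2 - z_{2,i}^2)\Big( -c_2 z_{2,i} - \frac{\partial z_{1,i}}{\partial e_{1,i}} z_{1,i} \Big), \\
\frac{z_{2,i} \dot{z}_{2,i}}{K_i^2 - z_{2,i}^2}
  &= -\frac{z_{2,i}}{K_i^2 - z_{2,i}^2}\, \tilde{\theta}_i^T \phi - c_2 z_{2,i}^2 - z_{1,i} \frac{\partial z_{1,i}}{\partial e_{1,i}} z_{2,i}, \\
\tilde{\theta}_i^T \Gamma_i^{-1} \dot{\hat{\theta}}_i
  &= \frac{z_{2,i}}{K_i^2 - z_{2,i}^2}\, \tilde{\theta}_i^T \phi .
\end{aligned}
```

Summing the last two lines over $i$ removes the terms containing $\tilde{\theta}_i$, and the remaining cross term $-\sum_i z_{1,i} \frac{\partial z_{1,i}}{\partial e_{1,i}} z_{2,i}$ cancels the cross term in (22), which leaves exactly (27).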
4 Simulation Results
This section provides a simulation example to demonstrate the feasibility of the proposed strategy. Consider a two-link manipulator robot [22] expressed as system (1):
$$
\begin{aligned}
M &= \begin{bmatrix} (m_1 + m_2) l_1^2 & m_1 l_1 l_2 \cos(q_1 - q_2) \\ m_1 l_1 l_2 \cos(q_1 - q_2) & m_2 l_2^2 \end{bmatrix}, \\
C &= m_2 l_1 l_2 \sin(q_2 - q_1) \begin{bmatrix} 0 & -\dot{q}_2 \\ -\dot{q}_1 & 0 \end{bmatrix}, \\
G &= \begin{bmatrix} -(m_1 + m_2) l_1 g \sin(q_1) \\ -m_2 l_2 g \sin(q_2) \end{bmatrix}, \quad
F = \begin{bmatrix} 3.5 \dot{q}_1 \\ 2.5 \dot{q}_2 \end{bmatrix},
\end{aligned}
\tag{28}
$$
where $m_1 = 2$ kg and $m_2 = 3$ kg are the masses of the two links, $l_1 = l_2 = 1$ m are the lengths of the links, and $g = 9.8$ m/s². The objective of the control is to track the reference signal
$$
q_{1r} = \frac{\pi}{6} \sin(t), \quad q_{2r} = \frac{\pi}{6} \sin(t) + \frac{\pi}{6} \sin(2t).
$$
The prescribed performance for the tracking error $e_{1,i}$ is set as $p_i(t) = (\pi/10 - \pi/90) e^{-2t} + \pi/90$, $i = 1, 2$. The design parameters of the update law and control law are chosen as $c_1 = 10$, $c_2 = 20$, $K_1 = K_2 = 1.2$ and $\Gamma_1 = \Gamma_2 = 5I$. The angles of the manipulator with the controller applied are shown in Fig. 2 and the tracking error is shown in Fig. 3. The figures show that joint 1 tracks the reference signal within 1 s, while joint 2 tracks it within 2 s. Figure 3 shows that the tracking error satisfies the prescribed performance.
Fig. 2 Angles of the joints. The blue curves are the output signals and the red dashed lines are the reference trajectory (panels: q1/rad and q2/rad versus t/s)
Fig. 3 Tracking errors with PPF. The blue curves are the error and the red dashed lines are the prescribed performance p (panels: e1/rad and e2/rad versus t/s)
Fig. 4 Errors $z_2$ with BF. The blue curves are the error $z_2$ and the red lines are the $K_i$ in (11) (panels: z21/rad and z22/rad versus t/s)
The error signal $z_2$ is shown in Fig. 4, which indicates that $z_2$ converges to 0 and remains within $[-1.2, 1.2]$ as desired. The control signals are shown in Fig. 5. The adaptive estimated parameters $\hat{\theta}$ are shown in Fig. 6. The parameters fluctuate within a small range once the output tracks the reference signal.
Fig. 5 Control signals. The blue curve is the control signal $u_1$ and the red is $u_2$ (axes: Control signal/(Nm) versus t/s)
Fig. 6 Signals of adaptive update law $\hat{\theta}$ (horizontal axis: t/s)
5 Conclusions
This study proposes an adaptive control law to track reference trajectories for manipulators. The PPF and BF are introduced for controller design to improve the dynamic performance. The tracking error dynamics are constrained within an exponentially converging envelope. The errors for the virtual control signals are constrained within a certain range. The simulation results demonstrate the feasibility of the proposed control method.
References
1. Hu, B., Guan, Z.H., Lewis, F.L., Chen, C.P.: Adaptive tracking control of cooperative robot manipulators with Markovian switched couplings. IEEE Trans. Ind. Electron. 68(3), 2427–2436 (2020)
2. Meng, Q., Lai, X., Yan, Z., Su, C.Y., Wu, M.: Motion planning and adaptive neural tracking control of an uncertain two-link rigid-flexible manipulator with vibration amplitude constraint. IEEE Trans. Neural Networks Learn. Syst. 33(8), 3814–3828 (2021)
3. Hu, X., Wei, X., Zhang, H., Han, J., Liu, X.: Robust adaptive tracking control for a class of mechanical systems with unknown disturbances under actuator saturation. Int. J. Robust Nonlinear Control 29(6), 1893–1908 (2019)
4. Sun, W., Su, S.F., Xia, J., Nguyen, V.T.: Adaptive fuzzy tracking control of flexible-joint robots with full-state constraints. IEEE Trans. Syst. Man Cybern. Syst. 49(11), 2201–2209 (2018)
5. Kim, J.: Two-time scale control of flexible joint robots with an improved slow model. IEEE Trans. Ind. Electron. 65(4), 3317–3325 (2017)
6. Sun, L., Zhao, W., Yin, W., Sun, N., Liu, J.: Proxy based position control for flexible joint robot with link side energy feedback. Robot. Auton. Syst. 121, 103272 (2019)
7. He, W., David, A.O., Yin, Z., Sun, C.: Neural network control of a robotic manipulator with input dead zone and output constraint. IEEE Trans. Syst. Man Cybern. Syst. 46(6), 759–770 (2015)
8. Yang, Y., Hua, C., Guan, X.: Finite time control design for bilateral teleoperation system with position synchronization error constrained. IEEE Trans. Cybern. 46(3), 609–619 (2015)
9. Yu, J., Shi, P., Zhao, L.: Finite-time command filtered backstepping control for a class of nonlinear systems. Automatica 92, 173–180 (2018)
10. Polyakov, A.: Nonlinear feedback design for fixed-time stabilization of linear control systems. IEEE Trans. Autom. Control 57(8), 2106–2110 (2011)
11. Hong, H., Yu, C., Yu, W.: Adaptive fixed-time control for attitude consensus of disturbed multi-spacecraft systems with directed topologies. IEEE Trans. Network Sci. Eng. 9(3), 1451–1461 (2022)
12. Bechlioulis, C.P., Rovithakis, G.A.: Robust adaptive control of feedback linearizable MIMO nonlinear systems with prescribed performance. IEEE Trans. Autom. Control 53(9), 2090–2099 (2008)
13. Li, Y., Tong, S., Liu, L., Feng, G.: Adaptive output-feedback control design with prescribed performance for switched nonlinear systems. Automatica 80, 225–231 (2017)
14. Wang, M., Wang, C., Shi, P., Liu, X.: Dynamic learning from neural control for strict-feedback systems with guaranteed predefined performance. IEEE Trans. Neural Networks Learn. Syst. 27(12), 2564–2576 (2015)
15. Huang, X., Duan, G.: Fault-tolerant attitude tracking control of combined spacecraft with reaction wheels under prescribed performance. ISA Trans. 98, 161–172 (2020)
16. Zhang, J.X., Yang, G.H.: Event-triggered prescribed performance control for a class of unknown nonlinear systems. IEEE Trans. Syst. Man Cybern. Syst. 51(10), 6576–6586 (2020)
17. Liu, Y.J., Chen, H.: Adaptive sliding mode control for uncertain active suspension systems with prescribed performance. IEEE Trans. Syst. Man Cybern. Syst. 51(10), 6414–6422 (2020)
18. Bu, X., Qi, Q., Jiang, B.: A simplified finite-time fuzzy neural controller with prescribed performance applied to waverider aircraft. IEEE Trans. Fuzzy Syst. 30(7), 2529–2537 (2021)
19. Galicki, M.: Finite-time control of robotic manipulators. Automatica 51, 49–54 (2015)
20. Wang, W., Wen, C.: Adaptive actuator failure compensation control of uncertain nonlinear systems with guaranteed transient performance. Automatica 46(12), 2082–2091 (2010)
21. Tee, K.P., Ge, S.S., Tay, E.H.: Barrier Lyapunov functions for the control of output-constrained nonlinear systems. Automatica 45(4), 918–927 (2009)
22. Ling, S., Wang, H., Liu, P.X.: Adaptive fuzzy tracking control of flexible-joint robots based on command filtering. IEEE Trans. Ind. Electron. 67(5), 4046–4055 (2019)
Nighttime Vehicle Object Detection Based on Improved YOLOv7
Haichao Sun, Hui Ye, and Junyong Zhai
Abstract Detecting vehicles using deep learning is a crucial area of research in computer vision. Detecting vehicles at night is challenging because low illumination, complex lighting, and other environmental conditions reduce detection accuracy. As a result, achieving high detection accuracy for night vehicle images can be difficult. Therefore, this paper combines an image enhancement algorithm and an object detection algorithm to study vehicle object detection in night scenes. Firstly, the Laplacian sharpening algorithm is applied to enhance the images. Then, the improved YOLOv7 algorithm is applied to detect the enhanced images. Compared with the original YOLOv7 on the self-made night vehicle dataset, the improved YOLOv7 has only 1.6% more parameters but 2.5% less computation, and brings 1.9% higher mAP0.5 and 1.7% higher mAP0.5:0.95.
Keywords Object detection · YOLOv7 · Attention mechanism · Image enhancement · Small target
H. Sun
School of Automation, Southeast University, Wuxi 214000, China
H. Ye
School of Science, Jiangsu University of Science and Technology, Zhenjiang 212000, China
J. Zhai (B)
School of Automation, Southeast University, Nanjing 210096, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_58
1 Introduction
Vehicle detection is a crucial area of research that is advancing rapidly with the development of artificial intelligence technology. To achieve real-time vehicle detection, many researchers use one-stage object detection algorithms [1, 2]. The literature [3] proposed a vehicle object detection method based on the improved YOLOv4 [4] algorithm. This was accomplished by utilizing the k-means clustering algorithm to create anchors that were more appropriate for the UA-Detrac dataset [5], and by improving the PANet [6]. Wang et al. [7] proposed a fast and high-precision method for detecting smoky vehicles, in which Mobilenetv3-small [8] replaces the backbone of YOLOv5s [9] to reduce the model computation and parameters. Zhang et al. [10] proposed a novel YOLOv7-RAR algorithm for urban vehicle detection. In their approach, the Res3Unit structure is used to reconstruct the backbone network of the YOLOv7 algorithm [11], which improves the overall architecture and ability of the network model. To enhance the features of vehicles, ACmix [12] is added after the SPPCSPC layer of the backbone network. YOLOv7 is the basic model of the YOLO series. It achieves higher accuracy and faster processing speeds than most existing object detectors in the 5 FPS to 160 FPS range. On a V100 GPU, YOLOv7 demonstrates the highest accuracy among real-time object detectors that operate at over 30 FPS. Therefore, the YOLOv7 network of the YOLO series is selected and improved in this paper.
2 Improved YOLOv7 Model

2.1 Laplacian Sharpening Algorithm

In night scenes, vehicle boundaries in the image are unclear because of low illumination and fast motion. Taking both the processing effect and the speed of the algorithm into account, we use the Laplacian sharpening algorithm to process the vehicle images. The Laplacian operator is a fundamental differential operator that is rotationally invariant (isotropic). The Laplacian of a two-dimensional image function $f(x, y)$ is an isotropic second derivative, defined as:

$$L f(x, y) = \nabla^2 f(x, y) = \frac{\partial^2 f(x, y)}{\partial x^2} + \frac{\partial^2 f(x, y)}{\partial y^2} \tag{1}$$

where $L f$ denotes the Laplacian of $f$ and $\nabla^2$ is the Laplacian operator. On the discrete pixel grid, the second partial derivatives of $f(x, y)$ with respect to $x$ and $y$ are approximated by:

$$\frac{\partial^2 f(x, y)}{\partial x^2} = f(x+1, y) + f(x-1, y) - 2 f(x, y), \tag{2}$$

$$\frac{\partial^2 f(x, y)}{\partial y^2} = f(x, y+1) + f(x, y-1) - 2 f(x, y). \tag{3}$$

To make it suitable for digital image processing, (1) is discretized by summing (2) and (3):

$$L f(x, y) = f(x+1, y) + f(x-1, y) + f(x, y+1) + f(x, y-1) - 4 f(x, y). \tag{4}$$
Fig. 1 a Laplacian operation mask. b Laplacian operation extension mask
The mask in Fig. 1a gives the same result in the up, down, left and right directions, that is, it is invariant to rotations by multiples of 90°. To give the mask the same property for rotations by 45°, the filter mask in Fig. 1a is extended as shown in Fig. 1b.
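The sharpening step can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the authors' implementation. The 4-neighbour mask is the one obtained by summing Eqs. (2) and (3); the 8-neighbour mask, the sharpening rule g = f - Lf, the replicated border and the 8-bit clipping are assumptions, since the text does not spell them out.

```python
# 4-neighbour mask from the sum of Eqs. (2) and (3). The 8-neighbour mask is
# the usual 45-degree extension and is an assumption about Fig. 1b.
MASK_4 = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
MASK_8 = [[1, 1, 1], [1, -8, 1], [1, 1, 1]]


def laplacian(img, mask=MASK_4):
    """Apply a 3x3 Laplacian mask to a grayscale image given as a list of rows.

    Border pixels are handled by replicating the nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp row index
                    xx = min(max(x + dx, 0), w - 1)  # clamp column index
                    acc += mask[dy + 1][dx + 1] * img[yy][xx]
            out[y][x] = acc
    return out


def sharpen(img, mask=MASK_4):
    """Sharpen with g = f - L f (mask centre is negative), clipped to [0, 255]."""
    lap = laplacian(img, mask)
    return [[min(max(p - l, 0), 255) for p, l in zip(row, lap_row)]
            for row, lap_row in zip(img, lap)]


if __name__ == "__main__":
    spot = [[10, 10, 10], [10, 50, 10], [10, 10, 10]]
    for row in sharpen(spot):
        print(row)
```

A flat region is left unchanged because both masks sum to zero, while an isolated bright pixel is amplified relative to its neighbours, which is the edge-enhancing behaviour the section relies on.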
2.2 SPD-Conv Block

Current CNN architectures commonly use strided convolution or pooling layers, which are flawed because they can lose critical details in the image. The authors of [13] introduced a novel CNN block, the SPD-Conv module, which can replace every strided convolution and pooling layer. The SPD-Conv module is composed of two layers, a space-to-depth (SPD) layer and a non-strided convolution (Conv) layer (i.e., stride = 1), and can be integrated into various CNN architectures.

Space-to-depth (SPD): The SPD component generalizes a (raw) image transformation technique to downsampling feature maps inside and throughout a CNN. When the scale is equal to 2, as illustrated in Fig. 2a–c, the input feature map $X$ is downsampled by a factor of 2, resulting in four sub-maps, each with a shape of $(S/2, S/2, C_1)$.
Fig. 2 Illustration of SPD-Conv when scale = 2
The SPD layer takes the feature map $X(S, S, C_1)$ and transforms it into an intermediate feature map $X'(S/scale, S/scale, scale^2 C_1)$. An example of this transformation using a scale of 2 is shown in Fig. 2d.

Non-strided convolution: Non-strided convolution aims to preserve as much discriminative feature information as possible. Following the SPD feature transformation layer, a non-strided convolution layer with $C_2$ filters, where $C_2 < scale^2 C_1$, is added to the network. It transforms $X'(S/scale, S/scale, scale^2 C_1)$ into $X''(S/scale, S/scale, C_2)$.

To enhance the network's performance on small, distant target vehicles, the SPD-Conv module is used to replace the strided-Conv module in the original YOLOv7 network. To find where SPD-Conv is most effective, we replace the strided Conv in the backbone only, in the neck only, and in both the backbone and the neck. We denote the three resulting models as YOLOv7-SPDI, YOLOv7-SPDII and YOLOv7-SPDIII, respectively.
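The shape bookkeeping of SPD-Conv can be illustrated with a small, dependency-free sketch. It is not the implementation from [13]: the channel ordering of the sub-maps and the use of a 1x1 (pointwise) kernel for the non-strided convolution are simplifying assumptions, and the names `space_to_depth` and `conv1x1` are hypothetical.

```python
def space_to_depth(x, scale=2):
    """Rearrange an (S, S, C) map into (S/scale, S/scale, scale^2 * C).

    x is a nested list indexed as x[row][col][channel]. No value is dropped:
    every scale x scale spatial block is moved into the channel dimension.
    """
    s = len(x)
    if s % scale or any(len(row) != s for row in x):
        raise ValueError("expected a square map whose side is divisible by scale")
    out = []
    for r in range(s // scale):
        out_row = []
        for c in range(s // scale):
            channels = []
            # Sub-map (i, j) contributes the pixel x[scale*r + i][scale*c + j].
            for i in range(scale):
                for j in range(scale):
                    channels.extend(x[scale * r + i][scale * c + j])
            out_row.append(channels)
        out.append(out_row)
    return out


def conv1x1(x, weights):
    """Stride-1 pointwise convolution; weights[k] is one filter of length C_in."""
    return [[[sum(w * v for w, v in zip(f, px)) for f in weights] for px in row]
            for row in x]


if __name__ == "__main__":
    # 4 x 4 x 1 input holding the values 0..15.
    x = [[[4 * r + c] for c in range(4)] for r in range(4)]
    y = space_to_depth(x, scale=2)          # 2 x 2 x 4
    z = conv1x1(y, [[1, 1, 1, 1]])          # 2 x 2 x 1, i.e. C2 = 1 < 4
    print(len(y), len(y[0]), len(y[0][0]))
    print(z)
```

Unlike a stride-2 convolution, which skips three of every four positions, the rearrangement keeps all 16 input values; the reduction to $C_2$ channels is then learned by the stride-1 convolution.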
2.3 SimAM

SimAM [14] is an attention module that is simple in concept and very effective. It computes 3D attention weights for feature maps without adding parameters, which distinguishes it from the channel and spatial attention modules proposed in [15, 16]. In addition, most of SimAM's operations are derived from a defined energy function, which avoids excessive structural adjustment. To assess the significance of each neuron, the following energy function is defined:

$$e_t(\omega_t, b_t, \mathbf{y}, x_i) = \frac{1}{M-1}\sum_{i=1}^{M-1}\left(y_o - \hat{x}_i\right)^2 + \left(y_t - \hat{t}\right)^2 \tag{5}$$

where $\hat{x}_i = \omega_t x_i + b_t$ and $\hat{t} = \omega_t t + b_t$ are linear transforms of the other neurons $x_i$ and the target neuron $t$. The index $i$ runs over the spatial dimensions of the channel, which contains $M = H \times W$ neurons in total. To simplify the analysis, we use the binary labels $y_o = -1$ and $y_t = 1$ and add a regularizer term to (5). The final energy function is given by:

$$e_t(\omega_t, b_t, \mathbf{y}, x_i) = \frac{1}{M-1}\sum_{i=1}^{M-1}\bigl(-1 - (\omega_t x_i + b_t)\bigr)^2 + \bigl(1 - (\omega_t t + b_t)\bigr)^2 + \lambda\omega_t^2 \tag{6}$$

which can be solved for $\omega_t$ and $b_t$ in closed form. The solution is:
$$\omega_t = -\frac{2(t - \mu_t)}{(t - \mu_t)^2 + 2\sigma_t^2 + 2\lambda} \tag{7}$$

$$b_t = -\frac{1}{2}(t + \mu_t)\,\omega_t \tag{8}$$

where $\mu_t = \frac{1}{M-1}\sum_{i=1}^{M-1} x_i$ and $\sigma_t^2 = \frac{1}{M-1}\sum_{i=1}^{M-1}(x_i - \mu_t)^2$ are the mean and variance calculated over all neurons except $t$ in that channel. As the solutions in (7) and (8) are computed on a single channel, it is reasonable to assume that all pixels within a given channel follow the same distribution. Using the closed-form solution for $\omega_t$ and $b_t$, we can compute the minimal energy as follows:

$$e_t^* = \frac{4(\hat{\sigma}^2 + \lambda)}{(t - \hat{\mu})^2 + 2\hat{\sigma}^2 + 2\lambda} \tag{9}$$

where $\hat{\mu} = \frac{1}{M}\sum_{i=1}^{M} x_i$ and $\hat{\sigma}^2 = \frac{1}{M}\sum_{i=1}^{M}(x_i - \hat{\mu})^2$. It can be inferred from (9) that a lower energy value $e_t^*$ signifies that the neuron $t$ is more distinct from its neighboring neurons and is therefore more relevant to the visual processing task. As a result, we can estimate the importance of each neuron by the reciprocal of its energy value, that is, $1/e_t^*$. The refinement phase of this module can be summarized as follows:

$$\tilde{X} = \mathrm{sigmoid}\left(\frac{1}{E}\right) \odot X \tag{10}$$

where $E$ groups all $e_t^*$ across the channel and spatial dimensions and $\odot$ denotes the Hadamard product. The sigmoid function is applied to restrict overly large values in $E$.

For detection tasks, Squeeze-and-Excitation (SE) [15] and SimAM achieve very similar performance. However, SimAM introduces no additional parameters, whereas SE does. For example, SE-ResNet50 [17] introduces 2.5 M parameters and SE-ResNet101 introduces 4.7 M parameters.

To enhance the overall performance of the YOLOv7 model, the SimAM module is incorporated as an additional component. To study how the location of the attention module affects detection performance, we add it to the backbone, to the neck, and to both the backbone and the neck of the network. These three cases are denoted as YOLOv7-SimAMI, YOLOv7-SimAMII and YOLOv7-SimAMIII, respectively.
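Equations (9) and (10) translate almost directly into code. The sketch below applies them to a single channel stored as a plain 2-D list. It is an illustration under stated assumptions, not the reference implementation of [14]: the function name `simam_channel` is hypothetical, the value `lam=1e-4` for the regularizer is an assumed default, and the variance uses the 1/M normalization given in the text.

```python
import math


def simam_channel(channel, lam=1e-4):
    """Apply the SimAM refinement of Eqs. (9)-(10) to one H x W channel.

    channel: list of rows of floats.
    lam: regularizer lambda (assumed value, not given in the text).
    """
    values = [v for row in channel for v in row]
    m = len(values)
    mu = sum(values) / m                            # channel mean
    var = sum((v - mu) ** 2 for v in values) / m    # channel variance (1/M)

    def weight(t):
        # 1 / e_t* from Eq. (9), then squashed by the sigmoid of Eq. (10).
        inv_energy = ((t - mu) ** 2 + 2 * var + 2 * lam) / (4 * (var + lam))
        return 1.0 / (1.0 + math.exp(-inv_energy))

    # Hadamard product of the attention weights with the input.
    return [[t * weight(t) for t in row] for row in channel]


if __name__ == "__main__":
    channel = [[1.0, 1.0, 1.0], [1.0, 9.0, 1.0], [1.0, 1.0, 1.0]]
    refined = simam_channel(channel)
    print(refined[1][1] / 9.0, refined[0][0] / 1.0)  # attention weights
```

The module has no trainable parameters: every quantity is computed from the channel statistics. A neuron that deviates strongly from the channel mean, such as the centre value in the usage example, receives a larger weight than its neighbours.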
3 Experiments

3.1 Setup

The experiments run on Ubuntu 18.04 with an NVIDIA GeForce RTX 3080 Ti GPU (12 GB). The software environment is Python 3.8, PyTorch 1.12.1 and CUDA 11.2.
Table 1 Comparison of experimental results at different replacement positions

Model             mAP0.5 (%)   mAP0.5:0.95 (%)
Original YOLOv7   89.6         50.2
YOLOv7-SPDI       89.3         51.2
YOLOv7-SPDII      90.5         50.8
YOLOv7-SPDIII     88.3         50.2
We set the number of epochs, the batch size and the learning rate to 100, 8 and 0.01, respectively, and use the SGD optimizer. To assess the algorithm, we report the number of model parameters (Param.), the computational complexity measured in floating-point operations (FLOPs), and the mean average precision (mAP).
3.2 Dataset

At present, public nighttime vehicle datasets are lacking. Therefore, we collect images on different city roads and under different weather conditions. Sampling one frame every 5 s yields 5000 images. We remove near-duplicate images to avoid the overfitting that too many similar images would cause, which leaves 4112 images. We split them into training, validation and test sets in the ratio 8:1:1.
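An 8:1:1 split of the 4112 images can be reproduced with a seeded shuffle, as sketched below. The function name `split_indices`, the seed, and the rounding rule (integer division, remainder assigned to the test set) are assumptions; the paper does not state how the split was drawn or the exact subset sizes.

```python
import random


def split_indices(n, ratios=(8, 1, 1), seed=0):
    """Shuffle range(n) reproducibly and cut it into train/val/test parts.

    Sizes use integer division; any remainder goes to the last (test) part.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    total = sum(ratios)
    n_train = n * ratios[0] // total
    n_val = n * ratios[1] // total
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


if __name__ == "__main__":
    train, val, test = split_indices(4112)
    print(len(train), len(val), len(test))
```

Because consecutive frames of the same road scene look alike, splitting by recording session instead of by individual frame would be a stricter alternative; the sketch above splits by frame only.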
3.3 Result

SPD-Conv: We test replacing the strided Conv with SPD-Conv at different positions. Table 1 shows that YOLOv7-SPDII is slightly behind YOLOv7-SPDI in mAP0.5:0.95 (50.8% versus 51.2%), but its mAP0.5 is clearly higher (90.5% versus 89.3%). Because YOLOv7-SPDII gives the better overall detection performance, we choose it as the base model for subsequent experiments.

SimAM: We test adding SimAM attention at different positions in the network. Table 2 shows that adding the attention mechanism at different locations of the network can improve the model. YOLOv7-SimAMI performs best (90.5% mAP0.5 and 51.1% mAP0.5:0.95), so adding the attention mechanism to the backbone produces the best results.

Table 2 Experimental results of different addition positions of SimAM

Model              mAP0.5 (%)   mAP0.5:0.95 (%)
Original YOLOv7    89.6         50.2
YOLOv7-SimAMI      90.5         51.1
YOLOv7-SimAMII     89.6         50.3
YOLOv7-SimAMIII    89.8         50.5

Comparative analysis of each module improvement: To verify the effect of each module, we add the modules to the network one at a time and observe whether they improve the performance of the model (Table 3). Adding the Laplacian sharpening algorithm enhances the vehicle features in the image; we denote this model as YOLOv7-I. YOLOv7-I has the same parameters and computation as YOLOv7 and achieves a 0.3-point higher mAP0.5. Furthermore, YOLOv7-I converges faster than YOLOv7. On the basis of YOLOv7-I, we improve the neck network by replacing the strided Conv with the SPD-Conv block to reduce the loss of detail features, and denote this model as YOLOv7-II. Compared with YOLOv7-I, YOLOv7-II improves mAP0.5 by 0.7 points. Finally, we add the SimAM attention to the backbone and denote the model as YOLOv7-III. Compared with YOLOv7-II, YOLOv7-III has the same number of parameters and nearly the same computation (100.6G versus 100.5G FLOPs), and achieves a 0.9-point higher mAP0.5. Compared with the original YOLOv7, the improved YOLOv7 has only 1.6% more parameters, requires 2.5% less computation, and gains 1.9 points in mAP0.5 (Fig. 3) and 1.7 points in mAP0.5:0.95.

Table 3 The impact of different modules on model performance

Models        #Param.   FLOPs    mAP0.5   mAP0.5:0.95
YOLOv7        36.9M     103.2G   89.6%    50.2%
YOLOv7-I      36.9M     103.2G   89.9%    50.8%
YOLOv7-II     37.5M     100.5G   90.6%    52.4%
YOLOv7-III    37.5M     100.6G   91.5%    51.9%
Improvement   +1.6%     −2.5%    +1.9%    +1.7%

Fig. 3 Comparison of the mAP0.5 between the YOLOv7 with different modules and the original YOLOv7
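The "Improvement" row of Table 3 mixes two kinds of figures, which the short check below makes explicit: the parameter and FLOPs entries are relative changes with respect to the original YOLOv7, while the two mAP entries are absolute differences in percentage points. The helper name `relative_change` is hypothetical; the numbers are taken from the YOLOv7 and YOLOv7-III rows.

```python
def relative_change(new, old):
    """Relative change of new with respect to old, in percent."""
    return 100.0 * (new - old) / old


# YOLOv7 (baseline) versus YOLOv7-III (improved), values from Table 3.
param_change = relative_change(37.5, 36.9)      # parameters, in millions
flops_change = relative_change(100.6, 103.2)    # FLOPs, in G
map50_gain = 91.5 - 89.6                        # percentage points
map5095_gain = 51.9 - 50.2                      # percentage points

if __name__ == "__main__":
    print(f"params {param_change:+.1f}%  FLOPs {flops_change:+.1f}%")
    print(f"mAP0.5 {map50_gain:+.1f} points  mAP0.5:0.95 {map5095_gain:+.1f} points")
```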
Fig. 4 Insufficient light. a YOLOv7. b Improved YOLOv7
Fig. 5 Complex light. a YOLOv7. b Improved YOLOv7
3.4 Comparison of Detection Results in Different Scenarios

Figures 4 and 5 show detection results on the night vehicle dataset in different environments. Compared with the improved YOLOv7 algorithm, the original network misses detections to varying degrees in scenes with insufficient light and with complex light.
4 Conclusion

In this paper, we propose a nighttime vehicle detection algorithm based on an improved YOLOv7. First, the Laplacian sharpening algorithm is applied to enhance the images, which addresses the blur caused by vehicle motion. Then, the SPD-Conv module replaces part of the strided Conv in the network to preserve details in the images, which improves the detection of small, distant vehicles. In addition, SimAM is introduced for feature enhancement to further improve network performance. Compared with the original YOLOv7 on the self-made night vehicle dataset, the improved YOLOv7 has only 1.6% more parameters, requires 2.5% less computation, and gains 1.9 points in mAP0.5 and 1.7 points in mAP0.5:0.95. In the future, we will build on this work and adopt more efficient image enhancement algorithms and network structures to improve vehicle detection performance in night scenes.
References

1. Redmon, J., Divvala, S., Girshick, R., et al.: You only look once: unified, real-time object detection. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788 (2016)
2. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., et al.: SSD: single shot multibox detector (2015). arXiv preprint arXiv:1512.02325
3. Yang, X., Chen, B.: Vehicle detection based on improved YOLOv4. In: International Conference on Algorithms, Microchips and Network Applications, pp. 338–342 (2022)
4. Bochkovskiy, A., Wang, C.Y., Liao, H.Y.M.: YOLOv4: optimal speed and accuracy of object detection (2020). arXiv preprint arXiv:2004.10934
5. Wen, L., Du, D., Cai, Z., et al.: UA-DETRAC: a new benchmark and protocol for multi-object detection and tracking (2015). arXiv preprint arXiv:1511.04136
6. Liu, S., Qi, L., Qin, H., et al.: Path aggregation network for instance segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 8759–8768 (2018)
7. Wang, C., Wang, H., Yu, F., et al.: A high-precision fast smoky vehicle detection method based on improved Yolov5 network. In: 2021 IEEE International Conference on Artificial Intelligence and Industrial Design, pp. 255–259 (2021)
8. Howard, A.G., Zhu, M., Chen, B., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications (2017). arXiv preprint arXiv:1704.04861
9. Jocher, G., Chaurasia, A., Stoken, A., et al.: YOLOv5 [EB/OL] (2022). https://github.com/ultralytics/yolov5
10. Zhang, Y., Sun, Y., Wang, Z., et al.: YOLOv7-RAR for urban vehicle detection. Sensors 23(4), 1801–1802 (2023)
11. Wang, C.Y., Bochkovskiy, A., Liao, H.Y.M.: YOLOv7: trainable bag-of-freebies sets new state-of-the-art for real-time object detectors (2022). arXiv preprint arXiv:2207.02696
12. Pan, X., Ge, C., Lu, R., et al.: On the integration of self-attention and convolution. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 815–825 (2022)
13. Sunkara, R., Luo, T.: No more strided convolutions or pooling: a new CNN building block for low-resolution images and small objects (2022). arXiv preprint arXiv:2208.03641
14. Yang, L., Zhang, R.Y., Li, L., et al.: SimAM: a simple, parameter-free attention module for convolutional neural networks. In: International Conference on Machine Learning, pp. 11863–11874 (2021)
15. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 7132–7141 (2018)
16. Woo, S., Park, J., Lee, J.Y., et al.: CBAM: convolutional block attention module. In: Proceedings of the European Conference on Computer Vision, pp. 3–19 (2018)
17. He, K., Zhang, X., Ren, S., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
A Demagnetization Fault Diagnosis Strategy for Vehicle Permanent Magnet Synchronous Motor Xiangyu Ma, Dong Guo, Yueling Zhao, and Lei Huang
Abstract Electric vehicles in operation may face emergencies caused by overloading or stalling of the permanent magnet synchronous motor (PMSM) and the resulting rise in motor temperature, as well as by the natural wear of motor components over time. To address these issues, a method for diagnosing PMSM demagnetization faults based on a permanent magnet flux observer is proposed in this paper. The approach establishes a permanent magnet flux observer to capture the fault characteristics and provide an effective diagnosis. ANSYS Maxwell and MATLAB co-simulation is used to verify the feasibility of the proposed method. Simulation results demonstrate that the approach is both simple and effective in accurately diagnosing PMSM demagnetization faults in electric vehicles.

Keywords Demagnetizing fault diagnosis · Permanent magnet flux observer · PMSM
1 Introduction

In vehicle applications, limited installation space means that PMSMs have a high power density but poor heat dissipation, and they operate under complex conditions. PMSM operation usually involves complex acceleration and deceleration, variable load requirements, maximum torque per ampere control, and flux weakening under strong armature reaction [1, 2]. Furthermore, PMSMs are susceptible to permanent magnet demagnetization faults, which are often exacerbated by natural aging and other factors.

The diagnosis of demagnetization faults typically relies on artificial intelligence, signal processing, and parameter identification techniques. In [3], a method for diagnosing uniform demagnetization faults in PMSMs was developed by analyzing the radial air gap flux density and the stator current. It establishes an equivalent magnetic circuit model for the PMSM d-axis and generates radar maps of the radial air gap flux density amplitude beneath each pole. Based on these maps and the flux density values, the degree of uniform demagnetization can be estimated. Another study [4] presented a diagnostic method that uses both continuous and discrete wavelet transforms to diagnose PMSM demagnetization faults. While these approaches are effective at diagnosing demagnetization faults, they can be complicated and require a large amount of calculation.

The method proposed in this paper uses a permanent magnet flux observer to diagnose demagnetization faults in vehicle PMSMs. Once the observer is established, the permanent magnet flux can be tracked in real time. This allows changes in the permanent magnet flux, a direct indication of demagnetization faults, to be detected. By observing the trend of these changes in the observer output, fault characteristics can be captured and the fault diagnosed. Additionally, the co-simulation of ANSYS Maxwell and MATLAB addresses the inadequacy of MATLAB alone in simulating demagnetization fault conditions.

X. Ma · D. Guo (B) · Y. Zhao · L. Huang: School of Electrical Engineering, Liaoning University of Technology, Jinzhou 121001, People’s Republic of China, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_59
2 Fault Diagnosis Model

This paper establishes a fault diagnosis model in the two-phase rotating (d-q) coordinate system and constructs a permanent magnet flux observer based on the mathematical model of the permanent magnet synchronous motor (PMSM). The model enables fault feature extraction and fault diagnosis.
2.1 Mathematical Model of PMSM

To streamline computation and improve the accuracy of fault detection, it is advantageous to establish the mathematical model of the PMSM in a rotating coordinate system. Doing so simplifies the calculation of motor parameters and makes it more efficient, as demonstrated in [5, 6]. The mathematical model of the PMSM in any rotating coordinate system is

$$\begin{bmatrix} U_A \\ U_B \\ U_C \end{bmatrix} = \frac{2}{3}\begin{bmatrix} \cos\theta & -\sin\theta & \frac{1}{\sqrt{2}} \\ \cos\left(\theta - \frac{2}{3}\pi\right) & -\sin\left(\theta - \frac{2}{3}\pi\right) & \frac{1}{\sqrt{2}} \\ \cos\left(\theta + \frac{2}{3}\pi\right) & -\sin\left(\theta + \frac{2}{3}\pi\right) & \frac{1}{\sqrt{2}} \end{bmatrix}\begin{bmatrix} U_d \\ U_q \\ U_o \end{bmatrix} \tag{1}$$
Here $U_A$, $U_B$, $U_C$ are the three phase voltages and $\theta$ is the electrical angle of the rotating coordinate system. $U_d$, $U_q$, $U_o$ are the d-axis, q-axis and zero-sequence voltage components, respectively; the zero-sequence voltage $U_o$ is generally not of practical significance.

In any rotating coordinate system, the voltage equations of the PMSM are

$$\begin{cases} u_d = R_s i_d + \dfrac{d\psi_d}{dt} - \omega_r \psi_q \\[4pt] u_q = R_s i_q + \dfrac{d\psi_q}{dt} + \omega_r \psi_d \end{cases} \tag{2}$$

where $R_s$ is the stator resistance, $i_d$, $i_q$ are the d-axis and q-axis currents, $\psi_d$, $\psi_q$ are the d-axis and q-axis flux linkages, and $\omega_r$ is the angular velocity.

In any rotating coordinate system, the flux linkage of the PMSM can be expressed as

$$\begin{cases} \psi_d = L_d i_d + \psi_m \\ \psi_q = L_q i_q \end{cases} \tag{3}$$

where $L_d$, $L_q$ are the d-axis and q-axis inductances and $\psi_m$ is the flux linkage of the permanent magnet.

The electromagnetic torque of the PMSM can be expressed as

$$T_e = 1.5 P_n \psi_m i_q \tag{4}$$

where $P_n$ is the number of pole pairs. The mechanical motion equation of the PMSM is

$$T_e - T_l = J\frac{d\omega_r}{dt} + R_s \omega_r \tag{5}$$

where $T_l$ is the load torque and $J$ is the moment of inertia.
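As a quick numerical reading of the model, the sketch below evaluates the flux linkages of Eq. (3), the d-q voltages of Eq. (2) in steady state (both flux derivatives set to zero), and the torque of Eq. (4). The function name `pmsm_steady_state` and all parameter values are hypothetical illustration values, not the motor of this paper, and the q-axis equation uses the standard sign convention $u_q = R_s i_q + \omega_r \psi_d$.

```python
def pmsm_steady_state(i_d, i_q, omega_r, R_s, L_d, L_q, psi_m, P_n):
    """Evaluate Eqs. (2)-(4) for constant d-q currents (d(psi)/dt = 0).

    Returns (psi_d, psi_q, u_d, u_q, T_e) in SI units.
    """
    psi_d = L_d * i_d + psi_m               # Eq. (3), d-axis flux linkage
    psi_q = L_q * i_q                       # Eq. (3), q-axis flux linkage
    u_d = R_s * i_d - omega_r * psi_q       # Eq. (2), steady state
    u_q = R_s * i_q + omega_r * psi_d       # Eq. (2), steady state
    T_e = 1.5 * P_n * psi_m * i_q           # Eq. (4)
    return psi_d, psi_q, u_d, u_q, T_e


if __name__ == "__main__":
    # Hypothetical values chosen only to make the arithmetic easy to follow.
    healthy = pmsm_steady_state(i_d=0.0, i_q=100.0, omega_r=300.0,
                                R_s=0.02, L_d=1e-4, L_q=1e-4,
                                psi_m=0.05, P_n=2)
    faulty = pmsm_steady_state(i_d=0.0, i_q=100.0, omega_r=300.0,
                               R_s=0.02, L_d=1e-4, L_q=1e-4,
                               psi_m=0.04, P_n=2)
    print("healthy torque:", healthy[4], "demagnetized torque:", faulty[4])
```

With the q-axis current held fixed, a 20% drop in $\psi_m$ lowers the torque by the same 20%, which is why the permanent magnet flux is a direct indicator of demagnetization.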
2.2 Establish a Permanent Magnet Flux Observer

Because it requires little computation, diagnoses quickly, and is easy to implement, the voltage-based flux observer is widely used in motor control systems [7]. The flux linkage of a PMSM can be represented in the two-phase stationary (α-β) coordinate system as follows:
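$$\begin{cases} \psi_\alpha = \int (u_\alpha - R i_\alpha)\, dt \\ \psi_\beta = \int (u_\beta - R i_\beta)\, dt \end{cases} \tag{6}$$

where $u_\alpha$, $u_\beta$ and $i_\alpha$, $i_\beta$ are the voltages and currents in the stationary coordinate system. The estimated flux in the rotating coordinate system is obtained from the estimated flux in the two-phase stationary coordinate system:

$$\begin{bmatrix} \psi_d \\ \psi_q \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \psi_\alpha \\ \psi_\beta \end{bmatrix} \tag{7}$$

According to (3), (6) and (7), the permanent magnet flux observer can be expressed as

$$\psi_m = \cos\theta \int (u_\alpha - R i_\alpha)\, dt + \sin\theta \int (u_\beta - R i_\beta)\, dt - L_d i_d \tag{8}$$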
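The observer of Eqs. (6)–(8) can be sketched in discrete time as follows. This is a minimal illustration under stated assumptions, not the Simulink block used in the paper: the function name `observe_pm_flux` is hypothetical, the integrals are approximated with the trapezoidal rule, the initial flux values are passed in explicitly, and $i_d$ is obtained from $i_\alpha$, $i_\beta$ with the same rotation as in Eq. (7).

```python
import math


def observe_pm_flux(u_alpha, u_beta, i_alpha, i_beta, theta, dt, R, L_d,
                    psi_alpha0=0.0, psi_beta0=0.0):
    """Estimate psi_m at every sample from stationary-frame measurements.

    All signal arguments are equally long sequences sampled every dt seconds.
    psi_alpha0 / psi_beta0 are the stator flux linkages at the first sample.
    """
    psi_a, psi_b = psi_alpha0, psi_beta0
    estimate = []
    for k in range(len(theta)):
        if k > 0:
            # Eq. (6): trapezoidal integration of (u - R*i).
            psi_a += 0.5 * dt * ((u_alpha[k] - R * i_alpha[k])
                                 + (u_alpha[k - 1] - R * i_alpha[k - 1]))
            psi_b += 0.5 * dt * ((u_beta[k] - R * i_beta[k])
                                 + (u_beta[k - 1] - R * i_beta[k - 1]))
        c, s = math.cos(theta[k]), math.sin(theta[k])
        psi_d = c * psi_a + s * psi_b             # Eq. (7), first row
        i_d = c * i_alpha[k] + s * i_beta[k]      # same rotation for current
        estimate.append(psi_d - L_d * i_d)        # Eq. (8)
    return estimate


if __name__ == "__main__":
    # Synthetic check with hypothetical values: constant psi_m, i_d = 0.
    dt, omega = 1e-5, 2 * math.pi * 50
    psi_m, L, R, i_q = 0.1, 1e-3, 0.5, 10.0
    th = [omega * k * dt for k in range(5000)]
    ia = [-i_q * math.sin(a) for a in th]
    ib = [i_q * math.cos(a) for a in th]
    ua = [R * ia[k] - omega * (psi_m * math.sin(a) + L * i_q * math.cos(a))
          for k, a in enumerate(th)]
    ub = [R * ib[k] + omega * (psi_m * math.cos(a) - L * i_q * math.sin(a))
          for k, a in enumerate(th)]
    est = observe_pm_flux(ua, ub, ia, ib, th, dt, R, L,
                          psi_alpha0=psi_m, psi_beta0=L * i_q)
    print(min(est), max(est))
```

A pure integrator like this one accumulates any offset in the measured voltages and currents, so a practical observer would typically add drift compensation; that refinement is outside what Eqs. (6)–(8) describe.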
3 System Simulation Test and Verification

Figure 1 illustrates the overall framework of the system, which uses Space Vector Pulse Width Modulation (SVPWM) to control the three-phase bridge inverter circuit that drives the PMSM. The current loop and the speed loop are regulated by three PI regulators, forming a double closed-loop system for precise speed control. The finite element model of the three-phase bridge inverter circuit and the PMSM is represented by the Simplorer module, which is built and implemented in ANSYS Maxwell and Simplorer. Simplorer outperforms MATLAB in establishing a highly accurate finite element model of the motor, simulating the impact of temperature rise on the permanent magnets, and accounting for complex effects such as eddy currents and copper and iron losses on the motor's magnetic field. Therefore, this paper adopts a combined approach in which MATLAB handles the control, Simplorer handles the execution part, and together they complete the fault diagnosis of the PMSM under temperature-rise demagnetization [8, 9]. The demagnetization fault diagnosis module consists of a permanent magnet flux observer, which extracts the demagnetization fault characteristics, and a speed detection module, which detects the electrical angle signal, converts it into speed, and feeds it back to the PI controller [10, 11].

Fig. 1 System overall framework

The co-simulation model built in Simplorer is shown in Fig. 2. FEA1 in the figure is the finite element model of the vehicle PMSM, whose parameters are given in Table 1. MDL1 is the MATLAB Simulink module. It takes the three-phase currents and the electrical angle as inputs and outputs the IGBT gate signals and the torque source for the inverter circuit. The system is powered by two 153 V DC voltage sources, and the PMSM drive circuit consists of six IGBTs [12]. The phase resistance R is 20, while the inductance L is 0.1 mH.

Fig. 2 Co-simulation model

Table 1 Main parameters of vehicle PMSM

Motor                Parameter
Connection mode      Y
Pole number          4
Duty type            S9
Bus voltage          306 V DC
Rated/peak speed     2860/9000 rpm
Rated/peak torque    60/150 Nm
Rated/peak power     18/36 kW

The MATLAB Simulink simulation model is shown in Fig. 3. The connection between MATLAB and Simplorer is established through the AnsoftSFunction. Because the PMSM module in MATLAB is not used, the initial position of the motor is not referenced to phase A, so 90° must be subtracted. The simulation uses a sampling period of 1/6000 s and a step size of 0.01 ms. To study demagnetization fault diagnosis, the simulated scenario keeps the permanent magnet at a steady 20 °C before 0.2 s, raises its temperature continuously to 260 °C between 0.2 s and 0.3 s, and holds it at 260 °C after 0.3 s. The impact of different permanent magnet materials, which exhibit different B-H characteristics, on the simulation results must also be considered [13, 14]. In this study, magnetic steel is used as the permanent magnet material. As the temperature of this material increases, its magnetic induction intensity decreases, which makes it possible to simulate demagnetization fault conditions [15].

Fig. 3 Simulink simulation model

The co-simulation results are shown in Fig. 4, which plots the electromagnetic torque, the rotational speed, the three-phase current, and the flux observer output. During the 0.2–0.3 s interval, both the electromagnetic torque and the three-phase current first increase and then decline, and the growth trend of the rotational speed changes distinctly. This can be attributed to the increase in current, which strengthens the magnetic field and produces a larger effective torque, raising the speed. As time approaches 0.5 s, the permanent magnet becomes irreversibly demagnetized by prolonged exposure to high temperature. This creates an imbalance in the internal magnetic field and makes all motor indicators unstable, so relying on those indicators for fault diagnosis becomes impractical. The flux observer, in contrast, reveals a significant decrease in flux at 0.251 s. With this approach, the demagnetization fault of the vehicle PMSM can be diagnosed and the drive shut down directly, preventing further damage from operating under fault.

Fig. 4 Co-simulation results
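The simulated temperature scenario and a simple decision rule on the observer output can be sketched as follows. The temperature breakpoints (20 °C, 260 °C, 0.2 s, 0.3 s) come from the text; the linear shape of the ramp, the function names, and the 10% drop threshold are hypothetical choices made for illustration, since the paper does not state a numerical detection criterion.

```python
def magnet_temperature(t):
    """Permanent magnet temperature in deg C at time t (s).

    20 deg C until 0.2 s, rise to 260 deg C between 0.2 s and 0.3 s, then hold.
    The linear ramp is an assumption; the text only says the rise is continuous.
    """
    if t <= 0.2:
        return 20.0
    if t >= 0.3:
        return 260.0
    return 20.0 + (260.0 - 20.0) * (t - 0.2) / 0.1


def first_fault_time(times, flux, healthy_flux, drop_ratio=0.1):
    """Return the first time at which the observed flux falls below
    (1 - drop_ratio) * healthy_flux, or None if it never does.

    drop_ratio is a hypothetical threshold, not a value from the paper.
    """
    limit = (1.0 - drop_ratio) * healthy_flux
    for t, psi in zip(times, flux):
        if psi < limit:
            return t
    return None


if __name__ == "__main__":
    times = [0.0, 0.1, 0.2, 0.3]
    observed = [1.0, 1.0, 0.95, 0.8]   # made-up normalized flux samples
    print(magnet_temperature(0.25), first_fault_time(times, observed, 1.0))
```

In a drive controller, the time returned by `first_fault_time` would be the point at which the shutdown described above is triggered.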
4 Conclusion

A method for diagnosing demagnetization faults in vehicle PMSMs was established using a permanent magnet flux observer. The fault features were obtained with this observer, and co-simulation with ANSYS Maxwell and MATLAB was used to compensate for MATLAB's inability to simulate demagnetization faults adequately. The simulation results verify the practicality of the proposed approach, which is simple and effective in diagnosing demagnetization faults in vehicle PMSMs accurately and promptly.

Acknowledgements This work was supported by the major project 2020JH1/10100021 of Liaoning Provincial Department of Science and Technology, the Key project of Liaoning Provincial Department of Education JZL202015407, and the Natural Science Foundation of Liaoning Provincial Department of Science and Technology 2019-ZD-0697.
References

1. Nejadi-Koti, H., Faiz, J., Demerdash, N.A.O.: Uniform demagnetization fault diagnosis in permanent magnet synchronous motors by means of cogging torque analysis. In: 2017 IEEE International Electric Machines and Drives Conference (2017)
2. Sun, Z.: Demagnetization fault analysis of permanent magnet in permanent magnet synchronous motor. Electr. Mach. Technol. 06 (2022)
3. Ding, S., He, W., Hang, J., Tang, D.: Research on uniform demagnetization fault diagnosis of permanent magnet synchronous motor based on radial air gap flux density and stator current. Proc. CSEE 10, 1–9 (2022)
4. Ruiz, J.-R.R., Rosero, J.A., Espinosa, A.G., et al.: Detection of demagnetization faults in permanent-magnet synchronous motors under nonstationary conditions. IEEE Trans. Magn. 45(7) (2009)
5. Li, H., Tao, C.: Demagnetization fault diagnosis and fault pattern recognition for PMSM of electric vehicle. Trans. China Electrotechn. Soc. 32(05) (2017). http://kns.cnki.net/kcms/detail/44.1259.TH.20220923.1125.002.html
6. Zhang, Y., Liu, G., Chen, Q.: Discrimination of interturn short-circuit and local demagnetization in permanent magnet synchronous motor based on current fluctuation characteristics. Trans. China Electrotechn. Soc. 37(07) (2022)
7. Yang, J., Zhang, J.: On-line diagnosis method of magnetic loss fault of PMSM based on EKF [J/OL]. Mach. Tool Hydrauli. 5 (2023). https://doi.org/10.19595/j.cnki.1000-6753.tces.2017.05.001
8. Zhang, Z., Qin, P., Xu, J., Liu, H.: Magnetic parameter analysis of permanent magnet synchronous motor degaussing fault. Micro Special Motor 46(08), 31–34+44 (2018)
9. Yang, C., Liu, S., Zhang, Z.: Research on fault diagnosis method for loss of magnetic field of vector controlled permanent magnet synchronous motor. J. Light Industr. 32(04) (2017)
10. Urresty, J., Riba, J., Delgado, M., et al.: Detection of demagnetization faults in surface-mounted permanent magnet synchronous motors by means of the zero sequence voltage component. IEEE Trans. Energy Convers. 27(1) (2012)
11. Da, Y., Shi, X., Krishnamurthy, M.: A new approach to fault diagnostics for permanent magnet synchronous machines using electromagnetic signature analysis. IEEE Trans. Power Electron. 28(08) (2013)
12. Reigosa, D., Fernández, D., Park, Y., et al.: Detection of demagnetization in permanent magnet synchronous machines using hall-effect sensors. IEEE Trans. Industr. Appl. 54(04) (2018)
13. Ruoho, S., Kolehmainen, J., Ikaheimo, J., et al.: Interdependence of demagnetization, loading, and temperature rise in a permanent-magnet synchronous motor. IEEE Trans. Magn. 46(3) (2010)
14. Ruoho, S., Haavisto, M., Takala, E., et al.: Temperature dependence of resistivity of sintered rare-earth permanent-magnet materials. IEEE Trans. Magn. 46(1) (2010)
15. Liu, K., Zhu, Z.Q.: Online estimation of the rotor flux linkage and voltage-source inverter nonlinearity in permanent magnet synchronous machine drives. IEEE Trans. Power Electron. 29(01) (2014)
Design and Implementation of UAV Semi-physical Simulation System Based on VxWorks Wenxiao Hu, Wenyuan Cong, Xinmin Chen, Mengqiao Chen, Yue Lin, and Fengrui Xu
Abstract Semi-physical simulation is an important test method in the design of Unmanned Aerial Vehicle (UAV) flight control systems because of its accuracy and efficiency. Based on the requirements of UAV field simulation, this paper analyzes the key technologies and overall architecture of the semi-physical simulation system, proposes a semi-physical simulation solution based on an embedded data acquisition front-end and a distributed real-time simulation system, and completes the design of the simulation platform. The platform is designed to provide convenient, visual means of testing UAV systems and assessing flight quality under laboratory conditions, and to overcome the drawbacks of high test cost and system complexity. The system has been used in the design and testing of a specific type of UAV flight control system. Practical application shows that the simulation system can effectively shorten the development period of the control system, reduce the development risk and improve the quality of the controller.

Keywords UAV · Semi-physical simulation · VxWorks · Real-time simulation
W. Hu School of Automation, Central South University, Changsha, China
W. Cong Faculty of Electrical Engineering and Computer Science, Ningbo University, Ningbo, China
W. Hu · X. Chen · Y. Lin (B) · F. Xu Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo, China e-mail: [email protected]
M. Chen College of Intelligent Science, National University of Defense Technology, Changsha, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_60
1 Introduction With the rapid development of aviation science and technology, UAVs have gradually become a hot spot in aviation research thanks to their low cost and reusability, and have been widely used in military and civilian fields [1]. Unmanned aerial systems are usually composed of airframes, power systems, electrical equipment, flight control systems, control command systems, ground support systems, etc. Among them, the flight control system, as the core subsystem of the UAV, largely determines the performance of the UAV and plays a vital role in ensuring its flight safety [2]. To guarantee the flight performance of the UAV, comprehensive testing and validation are required during the flight controller design process. Whole-aircraft testing of the flight control system requires the cooperation of specialists from many subsystems; the test process is high-risk, has a long task cycle and low efficiency, and is easily constrained by meteorological and environmental factors. In summary, flight test verification of flight control systems is very difficult, which calls for a capable simulation platform for the development and testing of flight control systems [3]. In view of the above problems, and considering efficiency, economy and safety in the development of the flight control system, this paper builds a UAV semi-physical simulation platform. A computer based on a real-time operating core can realize complex logic processing functions and offers high integration, strong computing power, rich interfaces, good scalability and strong versatility, so it has been widely used in semi-physical simulation [4]. As a necessary link in semi-physical simulation, the simulation platform must have good real-time performance. It generally uses either a dedicated dSPACE real-time simulation platform [5] or a PC combined with a real-time operating system such as VxWorks or QNX. These simulation platforms perform well and are widely used in the aerospace field [6–8].
In this paper, a UAV semi-physical simulation system is built on the RTSim environment under a VxWorks embedded hardware architecture, and the UAV Simulink simulation model is compiled into C code and downloaded to the hardware platform for real-time simulation.
2 Components of the Simulation System Semi-physical simulation, also known as hardware-in-the-loop simulation, is a test method that connects physical equipment to the simulation loop. It combines the advantages of physical testing and digital simulation, and offers benefits that neither purely mathematical simulation nor purely physical testing can match [9]. The overall architecture of the UAV semi-physical simulation system designed in this paper is shown in Fig. 1. The system is mainly composed of the main control computer, the real-time simulator and the flight control box, and adopts an upper/lower computer architecture. The host computer (main control computer) runs the Windows operating system, which is convenient for human-computer interaction and supports system model development, simulation test process management, simulation data processing and so on. The lower computer (real-time simulation computer) runs the VxWorks operating system; it solves the digital models in real time and exchanges data with physical devices through hardware I/O channels. In addition, the semi-physical simulation system includes signal conditioning equipment and a communication bus to simulate the signal interaction between the real aircraft control system and other systems, which improves the simulation accuracy and makes the simulation results more realistic and reliable.
Fig. 1 Semi-physical simulation system architecture
3 Simulator Design and Implementation The semi-physical simulation system designed in this paper is based on the mature VxWorks hardware architecture, which enables the simulator to execute the object model program code in real time and thus to simulate key dynamic characteristics of the UAV, such as its six-degree-of-freedom motion and the dynamic characteristics of the sensor and actuator interfaces. The system collects the position information of the actuators through A/D, and outputs the UAV status data through the serial port or D/A. Serial communication between the simulator and the console allows the target application to run independently of the MATLAB operating environment.
3.1 Architecture Design of Real-Time Simulator Based on VxWorks System VxWorks is a priority-based preemptive real-time operating system with ultra-low latency and ultra-low jitter, which lets it respond quickly to the demands of UAV flight attitude control. The system supports development in C++, Rust and Python, together with libraries such as Boost and pandas, and is equipped with edge optimization and an OCI-compatible container engine, which is convenient for the development, upgrade and simulation testing of the UAV control system. The VxWorks RTOS kernel meets the basic requirements of the real-time environment in UAV flight simulation, including:
(1) Multitasking: To match the asynchronous nature of events during flight, the system allows multi-threaded execution of tasks corresponding to external events. Multi-threaded tasks better match the real UAV flight environment, which greatly improves the credibility of the simulation results.
(2) Preemptive scheduling: Real-world events have inherent priorities, which are fully considered in CPU scheduling. A priority-based scheduling strategy is used: when a higher-priority task becomes ready, the system automatically preempts the lower-priority task that is currently running.
(3) Inter-task communication and synchronization: When an aircraft performs a mission, many tasks may run simultaneously as part of one application, and VxWorks provides fast and powerful communication mechanisms to support the simulation of the dynamic characteristics of the aircraft system during the mission. Synchronization and mutual exclusion mechanisms between tasks are also designed to avoid conflicts over shared resources and critical sections.
(4) Performance boundary: To meet the needs of real-time UAV simulation, and in view of the shortcomings of traditional throughput-oriented optimization, performance is optimized for the worst-case operating conditions of the system kernel, so as to ensure the overall execution efficiency of the internal functions of the system.
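The preemption rule in item (2) can be illustrated with a small discrete-time sketch. This is not VxWorks code and not the authors' implementation: the function `run_preemptive`, the task names and the one-tick time base are hypothetical. It follows the VxWorks convention that a lower priority number means a higher priority.

```python
import heapq


def run_preemptive(tasks):
    """Simulate priority-based preemptive scheduling on one CPU.

    tasks: list of (name, priority, release_tick, duration_ticks);
           a lower priority number means a higher priority.
    Returns the name of the task that held the CPU at each tick.
    """
    pending = sorted(tasks, key=lambda t: t[2])  # ordered by release time
    ready = []                                   # heap of (priority, release, name, remaining)
    timeline = []
    tick = 0
    nxt = 0
    while nxt < len(pending) or ready:
        # Move every task that has been released by now into the ready queue.
        while nxt < len(pending) and pending[nxt][2] <= tick:
            name, prio, release, duration = pending[nxt]
            heapq.heappush(ready, (prio, release, name, duration))
            nxt += 1
        if not ready:
            timeline.append("idle")
        else:
            # The highest-priority ready task always gets the next tick,
            # so a newly released high-priority task preempts a running one.
            prio, release, name, remaining = heapq.heappop(ready)
            timeline.append(name)
            if remaining > 1:
                heapq.heappush(ready, (prio, release, name, remaining - 1))
        tick += 1
    return timeline


if __name__ == "__main__":
    demo = [("telemetry", 200, 0, 4), ("attitude", 50, 2, 2)]
    print(run_preemptive(demo))
```

In the demo, the low-priority `telemetry` task starts first, is interrupted for two ticks when `attitude` is released, and then resumes.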
Based on the above real-time simulation requirements of UAV, the framework of VxWorks real-time simulation system designed in this study is shown in Fig. 2.
Fig. 2 VxWorks RTOS architecture
3.2 System Requirements Analysis The semi-physical simulation test is a key link in the design process of the UAV flight control system, and its main purpose is to verify the functionality of the flight control system [10]. Control command analysis and flight attitude feedback during the simulation are realized through the simulation model, and the semi-physical simulation platform should provide test functions such as pre-design simulation analysis and fault location testing of the flight control law.
1. Simulation platform system-level requirements analysis
(1) Closed-loop simulation verification requirements arising in aircraft control law design and cross-linked system development.
(2) The need of R&D personnel to monitor the key states and parameters of the aircraft during controller simulation.
(3) The need to adjust functions for different aircraft models; the designed structural framework should be scalable and reconfigurable.
(4) The need of designers to process and distribute simulation data.
2. Platform functional requirements analysis
(1) Flight attitude monitoring and controller parameter adjustment: During the semi-physical simulation, the external debugging interface can be used to monitor and adjust the controller parameters in real time.
(2) Fault injection: During the semi-physical simulation test of the aircraft, various sudden faults can be simulated according to test requirements, and fault signals can be injected into the semi-physical model to test the robustness of the controller.
(3) Sensor signal conditioning: The signals of each sub-module are conditioned and photoelectric isolation modules are added, providing a signal channel for data interaction between the real-time computer and the host computer.
(4) Closed-loop simulation: The real-time attitude information of the aircraft is displayed intuitively through the three-dimensional vision system, and a data-source driving function is provided for the closed-loop simulation system.
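The fault injection function listed above can be sketched as a small signal transformer. The paper does not specify the fault types or the interface, so the function name `inject_fault` and the three fault kinds (bias, stuck, dropout) are assumptions chosen for illustration.

```python
def inject_fault(samples, kind, start, value=0.0):
    """Return a copy of `samples` with a fault applied from index `start` on.

    kind = "bias":    add the constant offset `value`
    kind = "stuck":   freeze the signal at its last healthy sample
    kind = "dropout": replace the signal with `value` (e.g. 0.0 for a dead channel)
    `start` is assumed to be a non-negative sample index.
    """
    if kind not in ("bias", "stuck", "dropout"):
        raise ValueError(f"unknown fault kind: {kind}")
    out = list(samples)
    if start >= len(out):
        return out  # the fault never becomes active
    held = out[start - 1] if start > 0 else out[0]
    for k in range(start, len(out)):
        if kind == "bias":
            out[k] += value
        elif kind == "stuck":
            out[k] = held
        else:
            out[k] = value
    return out


if __name__ == "__main__":
    pitch_rate = [1.0, 2.0, 3.0, 4.0]
    print(inject_fault(pitch_rate, "stuck", 2))      # sensor freezes at 2.0
    print(inject_fault(pitch_rate, "bias", 1, 0.5))  # constant offset from sample 1
```

Feeding the faulted sequence to the controller under test, instead of the healthy one, is one simple way to exercise its robustness.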
3.3 Construction of UAV Dynamics Model This section models the motion characteristics of the UAV from two aspects: dynamic equation analysis and kinematic equation analysis.
First, the dynamic characteristics of the aircraft are analyzed, including the force and moment equations along each axis of the UAV body. Force analysis gives the following equations:
$$
\begin{cases}
\dot{u} = vr - wq - g\sin\theta + \dfrac{F_x}{m} \\
\dot{v} = -ur + wp + g\cos\theta\sin\phi + \dfrac{F_y}{m} \\
\dot{w} = uq - vp + g\cos\theta\cos\phi + \dfrac{F_z}{m}
\end{cases}
\tag{1}
$$
where $u, v, w$ are the velocity components along the three axes of the body coordinate system, and $p, q, r$ represent the roll rate, pitch rate and yaw rate, respectively. $F_x, F_y, F_z$ represent the projections of the resultant force on each axis, $m$ is the mass of the UAV and $g$ is the gravitational acceleration. The moment equation in the body coordinate system is as follows:
$$
\begin{cases}
\dot{\phi} = p + (r\cos\phi + q\sin\phi)\tan\theta \\
\dot{\theta} = q\cos\phi - r\sin\phi \\
\dot{\psi} = \dfrac{1}{\cos\theta}(r\cos\phi + q\sin\phi)
\end{cases}
\tag{2}
$$
where $l$ is the rolling moment, $m$ is the pitching moment, and $n$ is the yaw moment. Through the conversion between ground coordinates and body coordinates, the equations of rotation of the UAV about its centroid are obtained [11]:
$$
\begin{cases}
\dot{\phi} = p + (r\cos\phi + q\sin\phi)\tan\theta \\
\dot{\theta} = q\cos\phi - r\sin\phi \\
\dot{\psi} = \dfrac{1}{\cos\theta}(r\cos\phi + q\sin\phi)
\end{cases}
\tag{3}
$$
where $\dot{\phi}, \dot{\theta}, \dot{\psi}$ represent the rates of change of the roll, pitch and yaw angles, respectively. The equation of linear motion of the body is:
$$
\begin{cases}
\dot{X}_d = u\cos\theta\cos\psi + v(\sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi) + w(\sin\phi\sin\psi + \cos\phi\sin\theta\cos\psi) \\
\dot{Y}_d = u\cos\theta\sin\psi + v(\sin\phi\sin\theta\sin\psi + \cos\phi\cos\psi) + w(-\sin\phi\cos\psi + \cos\phi\sin\theta\sin\psi) \\
\dot{Z}_d = u\sin\theta - v\sin\phi\cos\theta - w\cos\phi\cos\theta
\end{cases}
\tag{4}
$$
where $\dot{X}_d, \dot{Y}_d, \dot{Z}_d$ represent the velocity components of the UAV along the x, y and z axes of the ground coordinate system, respectively.
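As a minimal sketch of how the kinematic relations in Eqs. (3) and (4) translate into code, the two functions below evaluate the Euler angle rates and the ground-frame velocity from body-frame quantities. The function names are hypothetical, angles are assumed to be in radians, and the v-term of the first line of Eq. (4) is grouped in the same way as in the second line.

```python
import math


def euler_rates(p, q, r, phi, theta):
    """Eq. (3): body angular rates (p, q, r) -> Euler angle rates."""
    phi_dot = p + (r * math.cos(phi) + q * math.sin(phi)) * math.tan(theta)
    theta_dot = q * math.cos(phi) - r * math.sin(phi)
    psi_dot = (r * math.cos(phi) + q * math.sin(phi)) / math.cos(theta)
    return phi_dot, theta_dot, psi_dot


def ground_velocity(u, v, w, phi, theta, psi):
    """Eq. (4): body velocity (u, v, w) -> velocity in the ground frame."""
    sphi, cphi = math.sin(phi), math.cos(phi)
    sth, cth = math.sin(theta), math.cos(theta)
    spsi, cpsi = math.sin(psi), math.cos(psi)
    x_dot = (u * cth * cpsi
             + v * (sphi * sth * cpsi - cphi * spsi)
             + w * (sphi * spsi + cphi * sth * cpsi))
    y_dot = (u * cth * spsi
             + v * (sphi * sth * spsi + cphi * cpsi)
             + w * (-sphi * cpsi + cphi * sth * spsi))
    z_dot = u * sth - v * sphi * cth - w * cphi * cth
    return x_dot, y_dot, z_dot


if __name__ == "__main__":
    # Wings level, nose pitched up by 0.1 rad, 10 m/s along the body x-axis.
    print(ground_velocity(10.0, 0.0, 0.0, 0.0, 0.1, 0.0))
```

Note that both functions are singular at a pitch angle of ±90°, where tan θ and 1/cos θ are undefined; a full simulation would need to guard against that case.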
Fig. 3 Overall simulation model structure of UAV
3.4 UAV Simulation Model Construction The UAV simulation model in this system is built on the host machine in Simulink. It mainly includes the UAV six-degree-of-freedom motion module, the engine module, the aerodynamic module, and I/O device interface driver modules for the serial port, A/D and D/A; its overall structure is shown in Fig. 3. The six-degree-of-freedom motion model receives the output data of the engine module and the actuators, calculates the motion attitude in real time, and outputs information such as UAV position, attitude, speed and angular velocity. The atmospheric environment model simulates the dynamic atmospheric environment, including disturbance factors such as atmospheric turbulence and wind shear. The engine module calculates the thrust according to the received engine control commands, which are issued by the controller through the serial port. The actuator module calculates the movement of each actuator based on the information collected by the board. After the UAV model is built, the simulation model is compiled into C code through the real-time simulation toolbox and downloaded to the simulator running the real-time operating core for execution. The system uses the integrated development environment provided by MATLAB/Simulink for dynamic modeling and simulation testing. Platform code compilation can directly call the compilers of other IDEs, which saves complicated hand coding. Building the model with Simulink's graphical modeling tools avoids writing the calculation process by hand, which greatly improves modeling efficiency and the accuracy of the simulation calculation. The Simulink model establishment and compilation process is shown in Fig. 4.
Fig. 4 Simulink model establishment and compilation process
Fig. 5 Simulink model of UAV
Through the analysis of the dynamic characteristics of the UAV, the differential equations characterizing the motion of the aircraft are obtained. Considering the above dynamic equations and the influence of other factors such as the ship's wake and deck motion, a nonlinear full simulation model of the UAV is established in Simulink; some modules of the model are shown in Fig. 5.
4 Simulation Console Software Implementation The semi-physical simulation system designed in this paper uses RTSimPlus as the main control software, uses MATLAB for modeling, and combines these with the VxWorks real-time simulation system to form a complete closed-loop iterative process. The system follows the MBSE design concept and supports the whole development process of UAV products, from requirements design and system design to prototype verification and system integration testing. The system adopts an open software/hardware architecture and supports the modular and standardized design of UAV systems. According to the flight test requirements of actual aircraft models, it can quickly add distributed node equipment, increase computing power and expand I/O interfaces. It also supports system-level multidisciplinary joint simulation with multi-domain modeling software such as MATLAB, AMESim/Motion, MWorks and Simpack, so that verification tests can be carried out quickly. In addition, the system provides simulation management, automatic testing, data monitoring, graphical instrument display, and data processing and analysis, together with complete signal adaptation solutions such as fault injection, signal conditioning and integrated wiring. The platform is developed with a process-oriented method, and the entire simulation model design process is shown in Fig. 6.
Fig. 6 UAV simulation model design process
5 Semi-physical Simulation Test To verify the performance and effect of the semi-physical simulation system, a certain type of UAV is taken as an example. The model parameters are assigned using an m file, and the target application is compiled through RTW and downloaded to the simulator for execution. The operation cycle of a UAV flight control system is usually 10–20 ms. To ensure real-time behavior without loss of generality, the simulation step size in this simulation is set to 10 ms. Once the simulation starts, the control box outputs flight control commands to control the attitude and motion trajectory of the aircraft, and the semi-physical simulation flight effect is shown in Fig. 7.
Fig. 7 Semi-physical simulation platform graphical user interface (GUI)
Fig. 8 Part of the semi-physical simulation platform hardware equipment
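The fixed 10 ms step can be pictured with a minimal fixed-step loop. The sketch below advances the Euler angle kinematics of Eq. (3) with explicit Euler integration; the paper does not state which solver the generated code uses, so the integration scheme and the names `DT`, `step_attitude` and `run` are assumptions.

```python
import math

DT = 0.010  # simulation step in seconds (10 ms, as in the text)


def step_attitude(att, rates, dt=DT):
    """Advance (phi, theta, psi) by one explicit-Euler step of Eq. (3)."""
    phi, theta, psi = att
    p, q, r = rates
    phi_dot = p + (r * math.cos(phi) + q * math.sin(phi)) * math.tan(theta)
    theta_dot = q * math.cos(phi) - r * math.sin(phi)
    psi_dot = (r * math.cos(phi) + q * math.sin(phi)) / math.cos(theta)
    return (phi + dt * phi_dot, theta + dt * theta_dot, psi + dt * psi_dot)


def run(att0, rates, t_end, dt=DT):
    """Run the model for t_end seconds with constant body rates."""
    n_steps = round(t_end / dt)  # an integer step count avoids float drift in time
    att = att0
    for _ in range(n_steps):
        att = step_attitude(att, rates, dt)
    return att, n_steps


if __name__ == "__main__":
    # A pure roll rate of 0.1 rad/s held for 1 s, i.e. 100 frames of 10 ms.
    attitude, frames = run((0.0, 0.0, 0.0), (0.1, 0.0, 0.0), 1.0)
    print(frames, attitude)
```

On the real-time target each pass through the loop body would have to finish within the 10 ms frame; this offline sketch only reproduces the step count and the numerical update.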
The semi-physical simulation system developed in this paper can verify the effectiveness of the flight control system and supports flight simulation tests of the UAV. In addition, the functions of simulation parameter setting, atmospheric parameter setting, simulation curve display and flight status monitoring were tested during the simulation. The results show that these functions achieve the expected effect and that the system is convenient to operate. Some hardware facilities and the simulation vision of the system are shown in Fig. 8.
6 Conclusion Based on hardware-in-the-loop simulation and testing technology in the VxWorks target environment, this research develops a semi-physical simulation system for UAV flight control. The system supports the conversion of the UAV Simulink simulation model block diagram into C program code, which is simple, efficient and easy to maintain. The simulation system has been used in the design and flight verification of a certain type of UAV, and practice shows that it can effectively verify the correctness and rationality of the flight control law, intuitively reflect the flight control effect of the UAV, and provide effective support for the optimization design of the flight control system and the performance evaluation of the UAV system.
Acknowledgements This research was supported by Ningbo Key R&D Program (2023Z044).
References
1. Na, Z., Liu, Y., Shi, J., Liu, C., Gao, Z.: UAV-supported clustered NOMA for 6G-enabled internet of things: trajectory planning and resource allocation. IEEE Internet Things J. 8, 15041–15048 (2021)
2. Lin, Z., Wang, W., Li, Y., Zhang, X., Zhang, T., Wang, H., Wu, X., Huang, F.: Design and experimental study of a novel semi-physical unmanned-aerial-vehicle simulation platform for optical-flow-based navigation. Aerospace (2023)
3. Jian-guo, Y.: Research and design for a flight simulation system based on vxworks. Flight Dyn. (2008)
4. Yang, Y.J., Han, J.: Real-time object detector based mobilenetv3 for UAV applications. Multimedia Tools Appl. 82, 18709–18725 (2022)
5. Mueller, M., Smith, N.G., Ghanem, B.: A benchmark and simulator for UAV tracking. In: European Conference on Computer Vision (2016)
6. Kun, Z., Li, D., Shichao, C., Bin, H.G.: Design and implementation of 1553b bus controller in vxworks (2019)
7. Dong, D., Fang, Y., Chen, Y.: Application of test method based on state transition diagram in flight control software. In: 2019 6th International Conference on Dependable Systems and Their Applications (DSA), pp. 495–496 (2020)
8. Liu, H.C.Z.: In-vehicle information system embedded software developing approach based on QNX RTOS (2015)
9. Lauss, G., Strunz, K.: Accurate and stable hardware-in-the-loop (HIL) real-time simulation of integrated power electronics and power systems. IEEE Trans. Power Electron. 36, 10920–10932 (2021)
10. Hong-de, D., Yang, L., Liang, W.: The data transmission system design between intelligent multi-serial port and ethernet based on ARM and FPGA. In: 2012 Second International Conference on Instrumentation, Measurement, Computer, Communication and Control, pp. 172–175 (2012)
Nonsingular Fast Terminal Sliding Mode Control of Stewart Parallel Robot Xiaoyue Wang and Chenglin Liu
Abstract Aiming at the control problem of the Stewart parallel robot, this paper presents a nonsingular fast terminal sliding mode control algorithm combined with a double-power reaching law. The proposed control algorithm ensures that the robot system quickly converges to the desired pose, weakens the influence of chattering, and overcomes the singularity problem of the fast terminal sliding mode controller. Firstly, the nonsingular fast terminal sliding surface of the robot is defined. Then, based on the equivalent control principle, the nonsingular fast terminal sliding mode control law of the system is obtained by using the double-power reaching law to derive the nonlinear control law. Finally, the Lyapunov stability theorem and the Gaussian hypergeometric function are used to prove that the pose error converges to 0 in a finite time. Numerical simulation verifies the rationality of the control algorithm.
Keywords Stewart parallel robot · Nonsingular fast terminal sliding mode control · Double-power reaching law
1 Introduction As one of the most widely used 6-DOF parallel robots, the Stewart parallel robot, which offers high rigidity, strong carrying capacity and a stable structure, has been widely used in aerospace, medical [1], motion simulator [2], and other fields. The robot is composed of an upper platform, a lower platform, six drive rods and 12 universal joints. By driving the telescopic movement of the drive rods, the position and attitude of the upper platform can be changed, so that various spatial motions can be simulated. A movement of the upper platform in any single degree of freedom requires different movements of the six drive rods, so the Stewart parallel platform is a multi-variable, strongly coupled servo system.
X. Wang · C. Liu (B) Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Institute of Automation, Jiangnan University, Wuxi 214122, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_61
Because the strong coupling of the Stewart parallel robot may cause instability and even damage, it is important to design a suitable control method that safeguards the coordinated movement of the drive rods. At present, the main methods for controlling Stewart parallel robots are traditional PID control [3], improved PID control [4], adaptive control [5], sliding mode variable structure control [6], and intelligent control [7]. Owing to the particular construction of the Stewart parallel robot, it is difficult to meet the control requirements with traditional control methods. With the development of variable structure theory, sliding mode control has been used to solve the control problem of systems with uncertain parameters. Begon [8] first applied sliding mode control to platform control in 1995, providing a new control idea for the platform. In 2005, Park [9] applied the sliding mode control method to the position servo system of the Stewart platform to reproduce composite command signals. Therefore, to solve the control problem of the Stewart parallel robot, this paper proposes a nonsingular fast terminal sliding mode control method that adopts the double-power reaching law. This control method not only ensures that the pose variable converges to the desired position in a finite time, but also suppresses the chattering generated by the control input. Based on the dynamic model of the system and the nonsingular fast terminal sliding surface, the controller is designed using the equivalent control principle and the double-power reaching law to achieve rapid adjustment of the pose of the Stewart parallel robot. Theoretical analysis and numerical simulation results demonstrate the usefulness of the presented method.
2 Dynamic Model of Stewart Parallel Robot The configuration of the Stewart 6-DOF parallel robot is shown in Fig. 1. The upper platform can move with six degrees of freedom in space, and the lower platform is fixed. Six adjustable-length drive rods connect the upper and lower platforms through universal joints to achieve three-dimensional translation and three-dimensional rotation of the upper platform relative to the lower one. To perform dynamic analysis of the Stewart parallel robot, a system description must first be established. The mass center of the lower platform is taken as the origin $O$ of a static axis system $O\{X, Y, Z\}$, while the mass center of the upper platform is taken as the origin $P$ of a moving axis system $P\{x, y, z\}$. When the upper platform is in the middle position, the axes of the moving and static axis systems point in the same directions. The static axis system is the reference axis system, and the pose of the Stewart parallel robot can be represented by the following generalized coordinate vector
$$
q = [x, y, z, \phi, \theta, \psi]^T, \tag{1}
$$
Fig. 1 Schematic diagram of Stewart 6-DOF parallel robot
where $(x, y, z)$ represents the coordinates of the centroid of the upper platform in the static axis system $O\{X, Y, Z\}$, and $(\phi, \theta, \psi)$ represents the Euler angles through which the upper platform rotates around the x-axis, y-axis and z-axis. The Newton-Euler method [10] has been widely used to establish the robot's dynamic model as
$$
M(q)\ddot{q} + C(q, \dot{q})\dot{q} + G(q) = Ju, \tag{2}
$$
where $M(q)$ is a positive definite symmetric inertia matrix, $C(q, \dot{q})\dot{q}$ is the centripetal and Coriolis force term, $G(q)$ is the gravitational force vector, $J$ is the Jacobian matrix, and the vector $u$ contains the input forces from the drive rods. Since $M(q)$ is an invertible matrix, Eq. (2) can be rewritten as
$$
\ddot{q} = M^{-1}(Ju - C\dot{q} - G). \tag{3}
$$
Because the Stewart dynamic model is highly complex, the detailed mathematical expressions of the motion equation of the Stewart parallel robot are not given in this paper. The full expressions of the terms in the dynamic equation of the Stewart parallel robot can be found in the literature [6].
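To make Eq. (3) concrete, the sketch below evaluates the acceleration for given matrices by solving $M\ddot{q} = Ju - C\dot{q} - G$ with a small Gaussian elimination routine, so that no external library is needed. The helper names (`solve`, `matvec`, `forward_dynamics`) and the 2-DOF numbers in the demo are hypothetical and only illustrate the computation; the robot itself has six degrees of freedom.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-12:
            raise ValueError("matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def matvec(a, x):
    """Matrix-vector product for list-of-lists matrices."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def forward_dynamics(M, C, G, J, u, q_dot):
    """Eq. (3): q_ddot = M^{-1} (J u - C q_dot - G)."""
    rhs = [ju - cq - g for ju, cq, g in zip(matvec(J, u), matvec(C, q_dot), G)]
    return solve(M, rhs)


if __name__ == "__main__":
    # Hypothetical 2-DOF values, chosen only to exercise the formula.
    M = [[2.0, 0.0], [0.0, 4.0]]
    C = [[1.0, 0.0], [0.0, 1.0]]
    G = [0.0, 8.0]
    J = [[1.0, 0.0], [1.0, 1.0]]
    print(forward_dynamics(M, C, G, J, u=[2.0, 2.0], q_dot=[1.0, 2.0]))
```

Solving the linear system rather than forming $M^{-1}$ explicitly is a common numerical choice; for the diagonal inertia matrix used later in the simulation section the two are equivalent.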
3 Nonsingular Fast Terminal Sliding Mode Control of Stewart Parallel Robot The control law $u$ needs to be designed so that the robot pose $q$ is accurately adjusted to a given desired pose $q_d$ for arbitrary initial conditions $q_0$. Firstly, we define the nonsingular fast terminal sliding surface as
$$
s(x) = e + \alpha\,\mathrm{sig}^{\gamma_1} e + \beta\,\mathrm{sig}^{\gamma_2} \dot{e}, \tag{4}
$$
where $s = [s_1, s_2, \ldots, s_6]^T$, the error vector is $e = q - q_d = [e_1, e_2, \ldots, e_6]^T$, $\alpha = \mathrm{diag}(\alpha_1, \alpha_2, \ldots, \alpha_6)$ with $\alpha_i > 0$, $\beta = \mathrm{diag}(\beta_1, \beta_2, \ldots, \beta_6)$ with $\beta_i > 0$,
$$
\mathrm{sig}^{\gamma_1} e = [|e_1|^{\gamma_{11}}\mathrm{sgn}(e_1), |e_2|^{\gamma_{12}}\mathrm{sgn}(e_2), \ldots, |e_6|^{\gamma_{16}}\mathrm{sgn}(e_6)]^T,
$$
$$
\mathrm{sig}^{\gamma_2} \dot{e} = [|\dot{e}_1|^{\gamma_{21}}\mathrm{sgn}(\dot{e}_1), |\dot{e}_2|^{\gamma_{22}}\mathrm{sgn}(\dot{e}_2), \ldots, |\dot{e}_6|^{\gamma_{26}}\mathrm{sgn}(\dot{e}_6)]^T,
$$
$1 < \gamma_{2i} < 2$, $\gamma_{2i} < \gamma_{1i}$, $\gamma_{1i} = p_i/q_i$, $\gamma_{2i} = g_i/h_i$, and $p_i, q_i, g_i, h_i$ are all positive odd numbers. Taking the time derivative of Eq. (4) yields
$$
\dot{s}(x) = \dot{e} + \alpha\gamma_1 |e|^{\gamma_1 - 1} \cdot \dot{e} + \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} \cdot \ddot{e}. \tag{5}
$$
Substituting Eq. (3) into (5) yields
$$
\dot{s}(x) = \dot{e} + \alpha\gamma_1 |e|^{\gamma_1 - 1} \cdot \dot{e} + \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} \cdot (M^{-1}Ju - M^{-1}C\dot{q} - M^{-1}G - \ddot{q}_d). \tag{6}
$$
The selection of control laws for sliding mode control should satisfy two key conditions: reachability and traceability. Reachability means that the switching surface $s = 0$ can be reached if $s \cdot \dot{s} < 0$ holds. Note that the control law for this constraint is usually a switching law. Traceability means that the states must move on the switching surface $s = 0$ after reaching it. If $s = 0$ and $\dot{s} = 0$, the state will not leave $s = 0$. The control law designed according to this constraint is called the equivalent control. According to the sliding mode equivalent control principle, the control law is
$$
u = u_{eq} + u_n, \tag{7}
$$
where $u_{eq}$ is the equivalent control law, and $u_n$ is a nonlinear control law. From Eq. (6), $u_{eq}$ is constructed as
$$
u_{eq} = J^{-1}M(-\beta^{-1}\gamma_2^{-1}|\dot{e}|^{2-\gamma_2} - \beta^{-1}\gamma_2^{-1}|e|^{\gamma_1 - 1}|\dot{e}|^{2-\gamma_2}\alpha\gamma_1 + M^{-1}C\dot{q} + M^{-1}G + \ddot{q}_d). \tag{8}
$$
To design the nonlinear control law $u_n$, we adopt the double-power reaching law expressed in the following form
$$
\dot{s} = -k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s, \tag{9}
$$
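where $k_1 = \mathrm{diag}(k_{11}, k_{12}, \ldots, k_{16})$, $k_2 = \mathrm{diag}(k_{21}, k_{22}, \ldots, k_{26})$, $k_{1i} > 0$, $k_{2i} > 0$ $(i = 1, \ldots, 6)$, $\mathrm{sig}^{\rho_1} s = [|s_1|^{\rho_{11}}\mathrm{sgn}(s_1), |s_2|^{\rho_{12}}\mathrm{sgn}(s_2), \ldots, |s_6|^{\rho_{16}}\mathrm{sgn}(s_6)]^T$, $\mathrm{sig}^{\rho_2} s = [|s_1|^{\rho_{21}}\mathrm{sgn}(s_1), |s_2|^{\rho_{22}}\mathrm{sgn}(s_2), \ldots, |s_6|^{\rho_{26}}\mathrm{sgn}(s_6)]^T$, $\rho_{1i} > 1$, $0 < \rho_{2i} < 1$. Let $M^{-1}Ju_n = -k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s$; the nonlinear control law $u_n$ is then
$$
u_n = J^{-1}M(-k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s). \tag{10}
$$
Combining (8) and (10), the nonsingular fast terminal sliding mode control law is formulated as
$$
u = J^{-1}M(-\beta^{-1}\gamma_2^{-1}|\dot{e}|^{2-\gamma_2} - \beta^{-1}\gamma_2^{-1}|e|^{\gamma_1 - 1}|\dot{e}|^{2-\gamma_2}\alpha\gamma_1 + M^{-1}C\dot{q} + M^{-1}G + \ddot{q}_d - k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s). \tag{11}
$$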
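The two building blocks of this section, the sliding surface of Eq. (4) and the double-power reaching law of Eq. (9), can be sketched element-wise as follows. The function names are hypothetical, and for brevity the same scalar gain and exponent are assumed on every axis (as is the case for the parameter values chosen in the simulation section).

```python
import math


def sig(x, power):
    """sig^power(x) = |x|^power * sgn(x); well defined for negative x too."""
    return math.copysign(abs(x) ** power, x)


def sliding_surface(e, e_dot, alpha, beta, gamma1, gamma2):
    """Eq. (4), evaluated axis by axis: s = e + alpha*sig^g1(e) + beta*sig^g2(e_dot)."""
    return [ei + alpha * sig(ei, gamma1) + beta * sig(edi, gamma2)
            for ei, edi in zip(e, e_dot)]


def reaching_law(s, k1, k2, rho1, rho2):
    """Eq. (9): desired s_dot = -k1*sig^rho1(s) - k2*sig^rho2(s)."""
    return [-k1 * sig(si, rho1) - k2 * sig(si, rho2) for si in s]


if __name__ == "__main__":
    s = sliding_surface([1.0, -1.0], [-1.0, 1.0],
                        alpha=2.0, beta=2.0, gamma1=13 / 5, gamma2=5 / 3)
    print(s)
    print(reaching_law(s, k1=10.0, k2=10.0, rho1=1.5, rho2=0.05))
```

Writing the fractional powers through `abs` and `copysign` avoids taking a non-integer power of a negative number, which would otherwise produce a complex result in Python. The term with exponent $\rho_1 > 1$ dominates far from the surface and the term with $0 < \rho_2 < 1$ dominates close to it.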
4 Finite Time Convergence Analysis In this section, the finite time stability of the system (2) under the control law (11) is proved. Before presenting the proof, we state a useful lemma.
Lemma 1 [11] Assume that $a_1, a_2, \ldots, a_n$ are positive real numbers and $0 < p < 2$. Then the following inequality holds:
$$
(a_1^2 + a_2^2 + \cdots + a_n^2)^p \le (a_1^p + a_2^p + \cdots + a_n^p)^2.
$$
Define the Lyapunov function as
$$
V = \frac{1}{2} s^T \cdot s. \tag{12}
$$
By computation, one has
$$
\dot{s}(x) = \dot{e} + \alpha\gamma_1 |e|^{\gamma_1 - 1} \cdot \dot{e} + \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} \cdot \bar{e} \le \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} \cdot (-k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s),
$$
where $\bar{e} = -\beta^{-1}\gamma_2^{-1}|\dot{e}|^{2-\gamma_2} - \beta^{-1}\gamma_2^{-1}|e|^{\gamma_1 - 1}|\dot{e}|^{2-\gamma_2}\alpha\gamma_1 - k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s$. Then, the time derivative of $V$ is
$$
\dot{V} = s^T \dot{s} \le s^T \cdot \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} \cdot (-k_1\,\mathrm{sig}^{\rho_1} s - k_2\,\mathrm{sig}^{\rho_2} s) = \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} \cdot (-k_1 |s|^{\rho_1 + 1} - k_2 |s|^{\rho_2 + 1}) \le 0. \tag{13}
$$
Hence, the stability of the sliding surface has been proven.
To prove the finite time convergence of the approach to the sliding surface, we write
$$
\dot{V} = -s^T K_1\,\mathrm{sig}^{\rho_1} s - s^T K_2\,\mathrm{sig}^{\rho_2} s, \tag{14}
$$
where $K_1 = \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} k_1$ and $K_2 = \beta\gamma_2 |\dot{e}|^{\gamma_2 - 1} k_2$ are positive definite diagonal matrices. From Lemma 1, it can be obtained that
$$
\dot{V} \le -s^T K_2\,\mathrm{sig}^{\rho_2} s \le -2^{(\rho_2 + 1)/2}\,\underline{K}_2\,V^{(\rho_2 + 1)/2}, \tag{15}
$$
where $\underline{K}_2 = \min_i \{K_{2i}\} > 0$ represents the minimum eigenvalue of $K_2$. Since $0 < \rho_2 < 1$, we have $1/2 < (\rho_2 + 1)/2 < 1$. According to the literature [11], the nonsingular fast terminal sliding surface converges to 0 within the following time:
$$
T_r \le \frac{V^{(1-\rho_2)/2}(0)}{2^{(\rho_2 - 1)/2}\,\underline{K}_2\,(1 - \rho_2)}. \tag{16}
$$
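A quick numerical check of the reaching-time estimate can be made on a single axis. The sketch below integrates the reaching law of Eq. (9) for a scalar $s$ and compares the time needed to bring $|s|$ below a tolerance with the bound obtained by integrating $\dot{V} \le -cV^{\eta}$, which gives $T \le V(0)^{1-\eta}/(c(1-\eta))$ with $c = 2^{(\rho_2+1)/2}\underline{K}_2$ and $\eta = (\rho_2+1)/2$ as in Eq. (15). As a simplifying assumption the state-dependent factor $\beta\gamma_2|\dot{e}|^{\gamma_2-1}$ is dropped, so that $\underline{K}_2 = k_2$; the function names, the Euler step and the tolerance are hypothetical.

```python
import math


def sig(x, power):
    """sig^power(x) = |x|^power * sgn(x)."""
    return math.copysign(abs(x) ** power, x)


def simulate_reaching(s0, k1, k2, rho1, rho2, dt=1e-4, tol=1e-3, t_max=5.0):
    """Explicit-Euler integration of s_dot = -k1*sig^rho1(s) - k2*sig^rho2(s).

    Returns the time at which |s| first drops to `tol` or below.
    """
    s, t = s0, 0.0
    while abs(s) > tol and t < t_max:
        s += dt * (-k1 * sig(s, rho1) - k2 * sig(s, rho2))
        t += dt
    return t


def reaching_time_bound(s0, k2, rho2):
    """Bound from V_dot <= -c V^eta with V = s^2/2 (scalar case, K2 = k2)."""
    v0 = 0.5 * s0 * s0
    eta = (rho2 + 1.0) / 2.0
    c = 2.0 ** eta * k2
    return v0 ** (1.0 - eta) / (c * (1.0 - eta))


if __name__ == "__main__":
    t_sim = simulate_reaching(1.0, k1=10.0, k2=10.0, rho1=1.5, rho2=0.05)
    t_bound = reaching_time_bound(1.0, k2=10.0, rho2=0.05)
    print(f"simulated: {t_sim:.4f} s, bound: {t_bound:.4f} s")
```

The simulated time stays below the bound because the bound ignores the contribution of the $k_1$ term, which speeds up convergence when $|s|$ is large.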
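Thus, the finite time convergence of the sliding surface has been proven. Next, we prove the finite time convergence of the state variables. When the states arrive at the sliding surface, we obtain
$$
s(x) = e + \alpha\,\mathrm{sig}^{\gamma_1} e + \beta\,\mathrm{sig}^{\gamma_2} \dot{e} = e + \alpha \cdot |e|^{\gamma_1}\mathrm{sgn}(e) + \beta \cdot |\dot{e}|^{\gamma_2}\mathrm{sgn}(\dot{e}) = 0. \tag{17}
$$
If the equation holds, then
$$
\mathrm{sgn}(e) = -\mathrm{sgn}(\dot{e}). \tag{18}
$$
It follows from (17) that
$$
|\dot{e}| = \left(\frac{1}{\beta}\right)^{\frac{1}{\gamma_2}} \cdot (|e| + \alpha \cdot |e|^{\gamma_1})^{\frac{1}{\gamma_2}}. \tag{19}
$$
Combining $\dot{e} = |\dot{e}| \cdot \mathrm{sgn}(\dot{e})$ with Eqs. (18) and (19) gives
$$
\dot{e} = -\left(\frac{1}{\beta}\right)^{\frac{1}{\gamma_2}} \cdot (|e| + \alpha \cdot |e|^{\gamma_1})^{\frac{1}{\gamma_2}}\,\mathrm{sgn}(e). \tag{20}
$$
Because $\alpha > 0$ and $\gamma_1 > 0$, we have $|e| + \alpha \cdot |e|^{\gamma_1} = |e + \alpha \cdot e^{\gamma_1}|$, and Eq. (20) can be rewritten as
$$
\frac{de}{dt} = -\left(\frac{1}{\beta}\right)^{\frac{1}{\gamma_2}} \cdot (|e| + \alpha \cdot |e|^{\gamma_1})^{\frac{1}{\gamma_2}}\,\mathrm{sgn}(e + \alpha \cdot e^{\gamma_1}) = -\left(\frac{1}{\beta}\right)^{\frac{1}{\gamma_2}} \cdot (|e + \alpha \cdot e^{\gamma_1}|)^{\frac{1}{\gamma_2}}\,\mathrm{sgn}(e + \alpha \cdot e^{\gamma_1}). \tag{21}
$$
According to [12], the solution of the differential Eq. (21) is
$$
T_s = \int_0^{|e(0)|} \frac{\beta^{\frac{1}{\gamma_2}}}{(e + \alpha \cdot e^{\gamma_1})^{\frac{1}{\gamma_2}}}\,de = \frac{\gamma_2 |e(0)|^{1 - \frac{1}{\gamma_2}}}{\alpha(\gamma_2 - 1)} \cdot F\left(\frac{1}{\gamma_2}, \frac{\gamma_2 - 1}{(\gamma_1 - 1)\gamma_2}; 1 + \frac{\gamma_2 - 1}{(\gamma_1 - 1)\gamma_2}; -\alpha|e(0)|^{\gamma_1 - 1}\right), \tag{22}
$$
where $F(\cdot)$ is the Gaussian hypergeometric function. Therefore, when the nonsingular fast terminal sliding surface (4) and the controller (11) are adopted, the pose error of the Stewart parallel robot system converges to 0 in a finite time.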
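The sliding-phase convergence can also be checked numerically without evaluating the hypergeometric function. The sketch below integrates Eq. (20) for a single axis and compares the result with a cruder closed-form upper bound: dropping the $\alpha$ term from the integrand of Eq. (22) can only lengthen the integral, giving $T_s \le \beta^{1/\gamma_2}\,\frac{\gamma_2}{\gamma_2 - 1}\,|e(0)|^{1 - 1/\gamma_2}$. The function names, the Euler step and the tolerance are assumptions; the parameter values in the demo are the ones chosen in the simulation section.

```python
def sliding_phase_time(e0, alpha, beta, gamma1, gamma2,
                       dt=1e-3, tol=1e-4, t_max=20.0):
    """Explicit-Euler integration of Eq. (20) on the surface s = 0.

    By symmetry the magnitude |e| is integrated; returns the time at
    which |e| first drops to `tol` or below.
    """
    e, t = abs(e0), 0.0
    while e > tol and t < t_max:
        e_dot = -((e + alpha * e ** gamma1) / beta) ** (1.0 / gamma2)
        e += dt * e_dot
        t += dt
    return t


def crude_time_bound(e0, beta, gamma2):
    """Upper bound on the settling time obtained by dropping the alpha term."""
    return (beta ** (1.0 / gamma2) * gamma2 / (gamma2 - 1.0)
            * abs(e0) ** (1.0 - 1.0 / gamma2))


if __name__ == "__main__":
    t_sim = sliding_phase_time(0.5, alpha=2.0, beta=2.0, gamma1=13 / 5, gamma2=5 / 3)
    t_bound = crude_time_bound(0.5, beta=2.0, gamma2=5 / 3)
    print(f"simulated: {t_sim:.3f} s, crude bound: {t_bound:.3f} s")
```

Because $1 < \gamma_2 < 2$, the exponent $1/\gamma_2$ is below one, which is what makes the error reach zero in finite time rather than only asymptotically.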
5 Simulation

For simplicity, the coefficient matrices of the Stewart parallel robot are set as

$$M_{6\times 6} = \begin{bmatrix} 3&0&0&0&0&0\\ 0&3&0&0&0&0\\ 0&0&3&0&0&0\\ 0&0&0&6&0&0\\ 0&0&0&0&6&0\\ 0&0&0&0&0&6 \end{bmatrix},\quad
C_{6\times 6} = \begin{bmatrix} 5&0&0&0&2&0\\ 0&6&0&2&0&0\\ 0&0&3&0&0&0\\ 0&2&0&0&0&0\\ 0&7&0&2&1&0\\ 1&0&0&0&0&1 \end{bmatrix},$$

$$G_{6\times 1} = \begin{bmatrix} 0\\0\\6\\0\\0\\0 \end{bmatrix},\quad
J_{6\times 6} = \begin{bmatrix} 1&0&0&0&0&0\\ 1&1&0&0&0&0\\ 1&1&1&0&0&0\\ 1&1&1&1&0&0\\ 1&1&1&1&1&0\\ 1&1&1&1&1&1 \end{bmatrix}.$$
The initial conditions of the robot are chosen as $q(0) = [0.5, -1, 0.6, -0.5, 0.4, -0.7]^T$, $\dot{q}(0) = [-0.1, 0.2, -0.3, 1, 0.6, 0.8]^T$, and the desired pose of the robot is set as

$$q_d = [x_d, y_d, z_d, \varphi_d, \theta_d, \psi_d]^T = [\sin(t), \sin(t+\pi/6), \sin(t+\pi/3), \sin(t+\pi/2), \sin(t+2\pi/3), \sin(t+\pi)]^T. \tag{23}$$
Fig. 2 Tracking curve of Stewart parallel robot under nonsingular fast terminal sliding mode control algorithm

Then, the parameters of the nonsingular fast terminal sliding mode control are designed as $\alpha = \mathrm{diag}(2, 2, 2, 2, 2, 2)$, $\beta = \mathrm{diag}(2, 2, 2, 2, 2, 2)$, $\gamma_{1i} = 13/5$, $\gamma_{2i} = 5/3$, $\rho_{1i} = 1.5$, $\rho_{2i} = 0.05$, $k_1 = \mathrm{diag}(10, 10, \ldots, 10)$, $k_2 = \mathrm{diag}(10, 10, \ldots, 10)$. Figure 2 shows the trajectory tracking curves of the Stewart parallel robot under the nonsingular fast terminal sliding mode control algorithm. The abscissa is time and the ordinate is the state of the upper platform; the actual trajectory is the black solid line and the desired trajectory is the red dashed line. Under the chosen parameters, the actual trajectory of each pose variable of the upper platform converges to the desired trajectory in finite time, and the robot system shows good dynamic performance during convergence. The nonsingular fast terminal sliding mode controller designed in this paper therefore has a good control effect on the pose of the Stewart parallel robot.
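The operator $\mathrm{sig}^{\rho}(s) = |s|^{\rho}\,\mathrm{sgn}(s)$ and the double-power term $-k_1\,\mathrm{sig}^{\rho_1}s - k_2\,\mathrm{sig}^{\rho_2}s$ can be sketched in a few lines of Python. This is a simplified scalar illustration using the gains listed above; the function names and the forward-Euler loop are assumptions made here, and the state-dependent factor $\beta\gamma_2|\dot{e}|^{\gamma_2-1}$ that multiplies the law in (13) is deliberately left out.

```python
def sig(x, rho):
    """sig^rho(x) = |x|**rho * sgn(x) for a scalar x."""
    if x > 0:
        return x ** rho
    if x < 0:
        return -((-x) ** rho)
    return 0.0


def reaching_law(s, k1=10.0, k2=10.0, rho1=1.5, rho2=0.05):
    """Double-power term -k1*sig^rho1(s) - k2*sig^rho2(s) with the paper's gains."""
    return -k1 * sig(s, rho1) - k2 * sig(s, rho2)


def simulate_reaching(s0, t_end=0.5, dt=1e-4):
    """Forward-Euler integration of s' = reaching_law(s); returns the final s.

    Simplified scalar model for illustration only (not the full closed loop).
    """
    s = s0
    for _ in range(int(round(t_end / dt))):
        s += dt * reaching_law(s)
    return s


if __name__ == "__main__":
    print(simulate_reaching(1.0), simulate_reaching(-0.7))
```

Because $\rho_2 = 0.05$ makes the second term behave almost like a switching term near $s = 0$, the discrete-time trajectory settles into a small band around zero whose width depends on the step size, rather than reaching exactly zero.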
6 Conclusion

For Stewart parallel robots, a nonsingular fast terminal sliding mode control combined with the double-power approximation law is proposed. This method makes the pose error of the robot system converge to 0 in finite time. Based on the robot model and the properties of the nonsingular fast terminal sliding surface, the equivalent control is obtained and combined with the double-power approximation law, which guarantees the global finite-time convergence of the sliding surface and the finite-time convergence of the trajectory error. Theoretical analysis and numerical simulation results verify the effectiveness of the control method and indicate that it provides good control performance.
References
1. Khanbabayi, E., Noorani, M.R.S.: Design computed torque control for Stewart platform with uncertainty to the rehabilitation of patients with leg disabilities. Comput. Methods Biomech. Biomed. Eng., in press (2023). https://doi.org/10.1080/10255842.2023.2222863
2. Arco, V.S., Gar, B.J., Haas, R.: Validation of a ride comfort simulation strategy on an electric Stewart Platform for real road driving applications. J. Low Freq. Noise Vibr. Active Control 42(1), 368–391 (2023). https://doi.org/10.1177/14613484221122109
3. Su, Y.X., Duan, B.Y., Zheng, C.H.: Nonlinear PID control of a six-DOF parallel manipulator. IEE Proc. Control Theory Appl. 151(1), 95–102 (2004). https://doi.org/10.1049/ip-cta:20030967
4. Eftekhari, M., Karimpour, H.: Neuro-fuzzy adaptive control of a revolute Stewart platform carrying payloads of unknown inertia. Robotica 33(9), 2001–2024 (2015). https://doi.org/10.1017/S0263574714001222
5. Nguyen, C., Antrazi, S., Zhou, Z.: Adaptive control of a Stewart platform-based manipulator. J. Rob. Syst. 10(5), 657–687 (1993). https://doi.org/10.1002/rob.4620100507
6. Kumar, P.R., Chalanga, A., Bandyopadhyay, B.: Smooth integral sliding mode controller for the position control of Stewart platform. ISA Trans. 58, 543–551 (2015). https://doi.org/10.1016/j.isatra.2015.06.003
7. Khalil, W., Guegan, S.: Inverse and direct dynamic modeling of Gough-Stewart robots. IEEE Trans. Rob. 20(4), 754–761 (2004). https://doi.org/10.1109/TRO.2004.829473
8. Begon, P., Pierrot, F., Dauchez, P.: Fuzzy sliding mode control of a fast parallel robot. In: 1995 IEEE International Conference on Robotics and Automation, pp. 1178–1183. IEEE Press, Nagoya (1995). https://doi.org/10.1109/ROBOT.1995.525440
9. Min, P., Min, L., Seok, J.: The design of sliding mode controller with perturbation observer for a 6-DOF parallel manipulator. In: 2001 IEEE International Symposium on Industrial Electronics Proceedings, pp. 1502–1507. IEEE Press, Pusan (2001). https://doi.org/10.1109/ISIE.2001.931928
10. Fu, S., Yu, Y., Shen, T.: Non-linear robust control with partial inverse dynamic compensation for a Stewart platform manipulator. Int. J. Model. Ident. Control 1(1), 44–51 (2006). https://doi.org/10.1504/IJMIC.2006.008647
11. Yu, S., Yu, X., Shirinzadeh, B., Man, Z.: Continuous finite-time control for robotic manipulators with terminal sliding mode. Automatica 41(11), 1957–1964 (2005). https://doi.org/10.1016/j.automatica.2005.07.001
12. Yang, L., Yang, J.: Nonsingular fast terminal sliding-mode control for nonlinear dynamical systems. Int. J. Robust Nonlinear Control 21(16), 1865–1879 (2011). https://doi.org/10.1002/rnc.1666
Distributed Optimization Algorithm for Multi-agent System with Time-Varying Communication Delay Based on the Game Theory

Chen Wang, Rui Zhu, Fuyong Wang, and Zhongxin Liu

Abstract In this paper, the distributed optimization problem for multi-agent systems with time-varying communication delay $\tau(t)$ is studied based on game theory. Firstly, the distributed optimization problem $\min_x \phi(x)$ is modeled as a state based ordinal potential game model $G$. Then, the existence and validity of the Nash equilibrium in the game model are verified. In addition, a revenue-based strategy learning algorithm is designed under a topology network with $\tau(t)$ to find the Nash equilibrium. Finally, a numerical simulation illustrates the results.

Keywords Multi-agent systems · Distributed optimization · Game theory · Time-varying communication delay · Strategy learning algorithm
C. Wang · F. Wang (B): Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education of China, Chongqing University, Chongqing 400044, China
C. Wang · R. Zhu · F. Wang · Z. Liu: College of Artificial Intelligence, Nankai University, Tianjin 300350, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_62

1 Introduction

Multi-agent systems (MASs) are an important branch of distributed artificial intelligence. At present, research on MASs mainly covers consensus control [1, 2] and distributed optimization problems. With the rapid development of information technology, the theory of distributed optimization has been gradually improved [3, 4], and distributed optimization in MASs has become a new hot topic. Tsitsiklis et al. [5] first established an analysis framework for distributed computing models. Nedić [6] proposed the distributed sub-gradient descent method. For continuous distributed systems, Gharesifard et al. [7] also proposed corresponding algorithms to enable multiple agents to reach a consistent optimum. In practice, it is difficult to ensure two-way interaction of information in MASs. For MASs with one-way information interaction, Pu et al. [8] proposed to use the degree input information of each single agent to design the push-pull gradient-based algorithm. In addition, Nedić et al. [9] proposed a gradient tracking method to increase the convergence speed for MASs with time-varying communication topology.

As an important branch of modern mathematics and operations research, game theory has received more and more attention in academia in recent years. In particular, with the rise of MASs, interdisciplinary research on game theory and multi-agent learning has become a hot topic. For example, in [10, 11], game theory was applied to optimize control methods in the study of multi-agent consensus problems. Reference [12] studied the correlation between potential game methods and coordinated control, which brings a new approach to the application of game theory in the field of multi-agent control. Reference [13] used a supermodular game to design a leader selection algorithm for second-order multi-agent systems. In addition, with the rapid development of distributed games, more and more researchers have proposed that distributed optimization problems can also be solved by using game theory. In [14, 15], a game model for distributed optimization problems was established for different information interaction modes.

The actual network environment is changeable and complex, so it is meaningful to handle the distributed optimization problem of MASs with communication delays. Although most of the existing work has solved the optimization problem, several issues remain: (1) the complexity of the solution with time delays is too high; (2) in practice, global information is difficult to obtain and only local information is available. To solve these problems, this paper transforms distributed optimization problems with time-varying communication delays into game problems. In summary, the main innovations of this study compared to previous results are as follows. Firstly, the distributed optimization problem with time-varying communication delay is transformed into a reasonable game $G$. Secondly, the consistency of the Nash equilibrium (NE) in $G$ and the optimal solution $x^*$ of the optimization problem $\min_x \phi(x)$ is proved. Thirdly, an NE seeking algorithm under time-varying communication delay is designed.

The remainder of this paper is organized as follows: In Sect. 2, the basic knowledge is introduced and the distributed optimization problem is modeled as a game. In Sect. 3, we prove the relevant conclusions and design an algorithm to find the NE. In Sect. 4, we conduct numerical simulations to verify the conclusions. Finally, we summarize this paper.
2 Problem Statement

This section contains two parts: Part 1 introduces the basic preliminaries and notation; Part 2 builds a game model for distributed optimization.

2.1 Basic Notation

For convenience, we first give some notation. $R$ is the set of all non-negative real numbers. $M_{m\times n}$ denotes the set of $m \times n$ real-valued matrices. $x_i$ denotes the $i$-th component of a vector $x$. For a matrix $A$, $[A]_{ij}$ denotes the entry in the $i$-th row and $j$-th column. A standard game can be represented by $G = \{N, \{S_i\}_{i\in N}, \{U_i\}_{i\in N}\}$, where $N$ is the set of players, $\{S_i\}_{i\in N}$ is the strategy set and $\{U_i\}_{i\in N}$ is the set of utility functions. We define the Cartesian product of the sets $\{S_i\}_{i\in N}$ as $S = \prod_{i=1}^{n} S_i$. Next, we introduce the basic concepts of NE and potential games.

Definition 1 Given a game $G = \{N, \{S_i\}_{i\in N}, \{U_i\}_{i\in N}\}$ with $n$ players. If for any $i$, once $s_{-i}^* = (s_1^*, s_2^*, \ldots, s_{i-1}^*, s_{i+1}^*, \ldots, s_n^*)$ is determined, the behavior $s_i^*$ is the optimal behavior of $i$, i.e. $s_i^* = \arg\max_{s_i\in S_i} U_i(s_i, s_{-i}^*)$, then $s^* = (s_i^*, s_{-i}^*)$ is a Nash equilibrium of the game.

Definition 2 In a finite game $G = \{N, \{S_i\}_{i\in N}, \{U_i\}_{i\in N}\}$, if there exists a function $\psi: S \to R$ such that, for any player $i \in N$ and any $s_i^1, s_i^2 \in S_i$, the following equation holds

$$U_i(s_i^1, s_{-i}) - U_i(s_i^2, s_{-i}) = \psi(s_i^1, s_{-i}) - \psi(s_i^2, s_{-i}),$$

where $s_{-i} = (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$, then this game model is a potential game, and the function $\psi$ is called the potential function.
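Definitions 1 and 2 can be checked by brute force on a small finite game. The Python sketch below is illustrative only: the helper names and the two-player coordination game used as an example are assumptions made here and do not appear in the paper.

```python
from itertools import product


def deviate(profile, i, alt):
    """Return the profile in which only player i switches to strategy alt."""
    return profile[:i] + (alt,) + profile[i + 1:]


def is_nash_equilibrium(profile, strategy_sets, utilities):
    """Definition 1: no player can raise its own utility by a unilateral deviation."""
    for i, s_i in enumerate(strategy_sets):
        current = utilities[i](profile)
        if any(utilities[i](deviate(profile, i, alt)) > current for alt in s_i):
            return False
    return True


def pure_nash_equilibria(strategy_sets, utilities):
    """Enumerate all pure-strategy Nash equilibria of a finite game."""
    return [p for p in product(*strategy_sets)
            if is_nash_equilibrium(p, strategy_sets, utilities)]


def is_exact_potential(strategy_sets, utilities, psi, tol=1e-12):
    """Definition 2: every unilateral utility change equals the change in psi."""
    for profile in product(*strategy_sets):
        for i, s_i in enumerate(strategy_sets):
            for alt in s_i:
                new = deviate(profile, i, alt)
                d_u = utilities[i](new) - utilities[i](profile)
                d_psi = psi(new) - psi(profile)
                if abs(d_u - d_psi) > tol:
                    return False
    return True


if __name__ == "__main__":
    # Hypothetical example: both players earn 1 when their choices match.
    sets = [(0, 1), (0, 1)]
    match = lambda s: 1.0 if s[0] == s[1] else 0.0
    print(pure_nash_equilibria(sets, [match, match]))      # equilibria of the example
    print(is_exact_potential(sets, [match, match], match))  # shared payoff as potential
```

In an identical-interest game such as this one, the common payoff itself serves as the potential function, which is why the second check succeeds.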
2.2 Game Model for Distributed Optimization

We consider the distributed optimization problem with $n$ agents. Let $N = \{1, 2, \ldots, n\}$ be the set of agents. Any $i \in N$ chooses behavior $\lambda_i$ from its set of feasible behaviors $\Lambda_i \subseteq R$, where $\Lambda_i$ is a convex subset of $R$. Let $\Lambda = \prod_{i=1}^{n}\Lambda_i$; then an arbitrary combination of all agents' behaviors is $\lambda = (\lambda_1, \ldots, \lambda_n) \in \Lambda$. This paper assumes that the MAS has a smooth and convex global optimization objective function $\phi: \Lambda \to R$. The distributed optimization problem is formally expressed as follows:

$$\min_{\lambda_i\in\Lambda_i}\ \phi(\lambda_1, \ldots, \lambda_n) \quad \text{s.t. } \lambda_i \in \Lambda_i,\ \forall i \in N. \tag{1}$$

In this section, the information interaction of the MAS with $\tau(t)$ is designed as two-way information interaction. For simplicity, we use a matrix to represent the information interaction between agents. The matrix $A(t) \in M_{n\times n}$ describes the local information available at time $t$. For ease of description, denote $[A]_{ij}(t)$ as $a_{ij}(t)$, $\forall i, j \in N$. If there is an information interaction between $i$ and $j$, then $a_{ij}(t) = a_{ji}(t) = 1$; otherwise $a_{ij}(t) = a_{ji}(t) = 0$. Based on the proposed problem and the design of the model in [14], we construct a more general game, that is, a state based ordinal potential game.
Definition 3 Given a game model $G = \{N, \{S_i\}_{i\in N}, \{U_i\}_{i\in N}, X, f, \psi\}$, which adds a state space $X$ and a state transition function $f: X \times S \to X$ to the original potential game. If there is a convex function $\psi: X \times S \to R$ such that the following equation holds

$$\mathrm{sgn}\left[U_i(x, s_i', s_{-i}) - U_i(x, s_i, s_{-i})\right] = \mathrm{sgn}\left[\psi(x, s_i', s_{-i}) - \psi(x, s_i, s_{-i})\right],$$

where $\mathrm{sgn}(\cdot)$ is the sign function, and $\psi(x, s) = \psi(\bar{x}, 0)$ holds for $(x, s)$ and the next state $\bar{x} = f(x, s)$, then the above game is said to be a state based ordinal potential game.
2.3 Design of Game Models

The work in this section focuses on transforming problem (1) with $\tau(t)$ into a game model $G = \{N, \{S_i\}_{i\in N}, \{U_i\}_{i\in N}, X, f, \psi\}$, which consists of six parts:

1. Design of the players: The set $N = \{1, 2, \ldots, n\}$ corresponds to the decision-making individuals in problem (1).

2. Design of the state space: Let the state vector at time $t$ be $x(t) = (x_1(t), x_2(t), \ldots, x_n(t)) \in X$. For any $i \in N$, we define $x_i(t) = (\lambda_i(t), \gamma_i(t))$, where $\lambda_i(t)$ represents the true decision value of agent $i$ at time $t$, and $\gamma_i(t) = (\gamma_i^1(t), \ldots, \gamma_i^n(t))$ is agent $i$'s estimate of the others; specifically, $\gamma_i^j(t)$ represents the estimate by $i$ of $\lambda_j(t)$. In addition, considering that $\gamma_i^j$ cannot be infinite, we suppose that for any $j \in N$ there are $m$ agents that estimate $\lambda_j$, and the sum of the estimated decision values is $m$ times $\lambda_j$, that is,

$$\sum_{i=1}^{m}\gamma_i^j(t) = m\lambda_j(t). \tag{2}$$

3. Design of strategy spaces: Let $s_i(t) \in S_i$ be the strategy of agent $i$ at time $t$, which affects not only $\lambda_i(t)$ but also $\gamma_i(t)$. We design $s_i(t) = (\Delta\lambda_i(t), \Delta\gamma_i(t))$, where $\Delta\lambda_i(t)$ represents the change of the true value of $i$ at time $t$, and $\Delta\gamma_i(t) = (\Delta\gamma_i^1(t), \ldots, \Delta\gamma_i^n(t))$ is the change of the estimates of the other agents' decisions at time $t$. In addition, $\Delta\gamma_i^k(t) = \{a_{ij}(t)\Delta\gamma_{i\to j}^k(t)\}_{j=1,\ldots,n}$, where $\Delta\gamma_{i\to j}^k(t)$ stands for the estimate of $k$ that $i$ passes to $j$.

4. Design of state transition functions: In general, the state update is determined by the previous state and strategy through the state transition function $f$. In this paper, we consider the case with $\tau(t)$, where there is a certain time difference in the transmission of information. Specifically, the information passed by other agents and received by $i$ at time $t$ is not $\gamma_{j\to i}^k(t)$ but $\gamma_{j\to i}^k(t - \tau(t))$. Further, $x(t) = (\lambda(t), \gamma(t))$ is composed of two parts, where $\gamma(t)$ is affected by the information transfer of other agents, so $f$ can only use the information of $x(t - \tau(t))$ to obtain the state in the new stage. Finally, the model is:
$$x(t+1) = f\big(x(t-\tau(t)), s(t)\big). \tag{3}$$

The specific changes in the two parts of $x(t+1) = (\lambda(t+1), \gamma(t+1))$ can be represented by $x(t-\tau(t)) = (\lambda(t-\tau(t)), \gamma(t-\tau(t)))$. For any $i, j, k \in N$, there are the following transfer rules:

$$\lambda_i(t+1) = \lambda_i(t-\tau(t)) + \sum_{m=0}^{\tau(t)}\Delta\lambda_i(t-m), \tag{4}$$

$$\begin{aligned}
\gamma_i^k(t+1) ={}& \gamma_i^k(t-\tau(t)) + n\sum_{m=0}^{\tau(t)}\xi_i^k(t-m)\Delta\lambda_k(t-m) \\
&+ \sum_{m=0}^{\tau(t)}\left[\sum_{j=1}^{n}a_{ij}(t-m)\Delta\gamma_{j\to i}^k(t-m) - \sum_{j=1}^{n}a_{ij}(t-m)\Delta\gamma_{i\to j}^k(t-m)\right].
\end{aligned} \tag{5}$$

In (5), $\xi_i^k(t)$ is an indicator function: when $k = i$, $\xi_i^k(t) = 1$; otherwise $\xi_i^k(t) = 0$. The following verifies that such a design satisfies (2). Summing (5) over all $i$ gives

$$\begin{aligned}
\sum_{i\in N}\gamma_i^k(t+1) ={}& \sum_{i\in N}\gamma_i^k(t-\tau(t)) + n\sum_{i\in N}\sum_{m=0}^{\tau(t)}\xi_i^k(t-m)\Delta\lambda_k(t-m) \\
&+ \sum_{i\in N}\sum_{m=0}^{\tau(t)}\left[\sum_{j=1}^{n}a_{ij}(t-m)\Delta\gamma_{j\to i}^k(t-m) - \sum_{j=1}^{n}a_{ij}(t-m)\Delta\gamma_{i\to j}^k(t-m)\right].
\end{aligned} \tag{6}$$

Because the model assumes two-way information interaction, every piece of information that is sent is also received by some agent, therefore

$$\sum_{i\in N}\left[\sum_{j=1}^{n}a_{ij}(t)\Delta\gamma_{j\to i}^k(t) - \sum_{j=1}^{n}a_{ij}(t)\Delta\gamma_{i\to j}^k(t)\right] = 0.$$

Thus (6) gives $\sum_{i\in N}\gamma_i^k(t+1) = n\lambda_k(t+1)$, which is (2) with $m = n$. Therefore, under the state transition law given by (4) and (5), (2) holds at any moment.

5. Design of the utility function: We draw on the design ideas of the utility function in Ref. [14], use the nearest neighbor decision idea and add error suppression terms. The utility function is designed as:

$$U_i(x(t+1), s(t+1)) = U_i(\gamma_j(t)|_{a_{ij}(t)=1}) + U_i^e(\gamma_j(t)|_{a_{ij}(t)=1}), \tag{7}$$

where

$$U_i(\gamma_j(t)|_{a_{ij}(t)=1}) = \frac{\sum_{j\in N}a_{ij}(t)\,\phi(\gamma_j^1(t), \ldots, \gamma_j^n(t))}{\sum_{j\in N}a_{ij}(t)},\qquad
U_i^e(\gamma_j(t)|_{a_{ij}(t)=1}) = \alpha\,\frac{\sum_{k\in N}\sum_{j\in N}a_{ij}(t)\,(\gamma_j^k(t))^2}{\sum_{j\in N}a_{ij}(t)},$$

$U_i^e$ stands for error suppression and $\alpha > 0$ is a parametric factor.
6. Design of potential functions: Finally, we need to find a suitable potential function to rationalize the game model. The design of $\psi$ takes into account the existence of $\phi$ in problem (1) and adds an error suppression term, giving:

$$\psi(x(t), a(t)) = \frac{\sum_{j\in N}\phi(\gamma_j^1(t), \ldots, \gamma_j^n(t))}{n} + \alpha\,\frac{\sum_{k\in N}\sum_{j\in N}(\gamma_j^k(t))^2}{n},$$

where $\alpha$ is the same weight parameter as in (7).
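The conservation property behind the transfer rules of item 4, namely that the estimates of $\lambda_k$ keep summing to $n\lambda_k$ under symmetric exchange, can be checked numerically. The Python sketch below is a simplified illustration that assumes no delay ($\tau(t) = 0$), so each sum over $m$ in (4) and (5) has a single term; the function names, the array layout `dgamma[i][j][k]` for $\Delta\gamma_{i\to j}^k$ and the random test data are assumptions made here.

```python
import random


def transfer_step(lam, gamma, a, dlam, dgamma):
    """One update of rules (4)-(5) with tau(t) = 0 (illustrative simplification).

    lam[k]          true decision of agent k
    gamma[i][k]     agent i's estimate of lam[k]
    a[i][j]         symmetric 0/1 interaction matrix
    dlam[k]         change of agent k's true decision
    dgamma[i][j][k] change of the estimate of k that i passes to j
    """
    n = len(lam)
    new_lam = [lam[k] + dlam[k] for k in range(n)]
    new_gamma = [row[:] for row in gamma]
    for i in range(n):
        for k in range(n):
            own = n * dlam[k] if i == k else 0.0      # n * xi_i^k * dlam_k
            received = sum(a[i][j] * dgamma[j][i][k] for j in range(n))
            sent = sum(a[i][j] * dgamma[i][j][k] for j in range(n))
            new_gamma[i][k] += own + received - sent
    return new_lam, new_gamma


def conservation_gap(n=4, steps=5, seed=0):
    """Largest violation of sum_i gamma_i^k = n * lam_k after random exchanges."""
    rng = random.Random(seed)
    lam = [rng.uniform(-1, 1) for _ in range(n)]
    gamma = [lam[:] for _ in range(n)]                # estimates start at the true values
    for _ in range(steps):
        a = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(i + 1, n):
                a[i][j] = a[j][i] = rng.randint(0, 1)  # two-way links only
        dlam = [rng.uniform(-1, 1) for _ in range(n)]
        dgamma = [[[rng.uniform(-1, 1) for _ in range(n)]
                   for _ in range(n)] for _ in range(n)]
        lam, gamma = transfer_step(lam, gamma, a, dlam, dgamma)
    return max(abs(sum(gamma[i][k] for i in range(n)) - n * lam[k])
               for k in range(n))


if __name__ == "__main__":
    print(conservation_gap())   # should be at floating-point rounding level
```

The cancellation relies only on $a_{ij} = a_{ji}$: whatever agent $i$ sends to $j$ is subtracted from $i$'s estimate and added to $j$'s, so the column sums change only through the $n\,\Delta\lambda_k$ term.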
3 Multi-agent Distributed Game with $\tau(t)$

3.1 The Validity of the Game Model

Theorem 1 In the case of two-way information interaction, problem (1) with $\tau(t)$ can be modeled as a state based ordinal potential game.

Proof For any $i \in N$, let $\bar{s}_i(t)$ replace $s_i(t)$ as a new strategy at time $t$. For the other agents $j \ne i$, $s_j(t)$ remains unchanged. That is, only $i$ changes its own strategy, which affects only $\gamma_i^j(t)$, $j \in N$. The strategy of $j \ne i$ does not change, so $\bar{\gamma}_j^k(t) = \gamma_j^k(t)$, $\forall k \in N$. The change in the utility function of $i$ at $t$ is denoted as $\Delta U_{i,t}$, which is expressed as follows:

$$\Delta U_{i,t} = \frac{\phi(\gamma_i^1(t), \ldots, \gamma_i^n(t))}{\sum_{j\in N}a_{ij}(t)} - \frac{\phi(\bar{\gamma}_i^1(t), \ldots, \bar{\gamma}_i^n(t))}{\sum_{j\in N}a_{ij}(t)} + \alpha\left[\frac{\sum_{k\in N}(\gamma_i^k(t))^2}{\sum_{j\in N}a_{ij}(t)} - \frac{\sum_{k\in N}(\bar{\gamma}_i^k(t))^2}{\sum_{j\in N}a_{ij}(t)}\right].$$

Similarly, the change of the potential function can be expressed as follows:

$$\Delta\psi_t = \frac{1}{n}\left[\phi(\gamma_i^1(t), \ldots, \gamma_i^n(t)) - \phi(\bar{\gamma}_i^1(t), \ldots, \bar{\gamma}_i^n(t))\right] + \frac{\alpha}{n}\left[\sum_{k\in N}(\gamma_i^k(t))^2 - \sum_{k\in N}(\bar{\gamma}_i^k(t))^2\right].$$

It is easy to see that $\sum_{j\in N}a_{ij}(t)\,\Delta U_{i,t} = n\Delta\psi_t$, so $\mathrm{sgn}(\Delta\psi_t) = \mathrm{sgn}(\Delta U_{i,t})$. At the same time, for any $i \in N$, when its action changes from $s_i$ to the empty action, the empty action has no effect on the state, so $x(t+1) = x(t)$. Hence the marginal cost of the potential function is $\Delta\psi = \psi(x(t), s(t)) - \psi(x(t+1), 0) = 0$.

Since the state based ordinal potential game model is a generalized potential game model, it retains the good property of potential games that $(x^*, s^*)$ must exist. Because the game model uses $\gamma_i^k(t)$ instead of $\lambda_k(t)$ to construct the utility function, we first need to discuss whether $\gamma_i^k(t)$ truly reflects $\lambda_k(t)$, $\forall i, k \in N$.
For any $i, k \in N$, the relationship between $\gamma_i^k(t)$ and $\lambda_k(t)$ at $(x^*, s^*)$ is described as follows.

Theorem 2 Consider the model of Sect. 2.3. If the game evolves to $(x^*, s^*)$ and the matrix sequence corresponding to the information topology between agents satisfies the condition of complete sequence, then for any $i, k \in N$, $\gamma_i^k(t) = \lambda_k(t)$.

Proof Let the set of neighbors of $i$ at time $t$ be $L_i(t) \subseteq N$, where $L_i(t) = \{j \in N : a_{ij}(t) = 1\}$. Consider that in the process of game evolution, at least one globally reachable node in the topology corresponds to the cumulative effect of the bidirectional interaction topology. So for any $i$, we define $N_i = \bigcup_t L_i(t)$. There are only two cases: $|N_i| = 1$ or $|N_i| \ge 2$.

Firstly, if the number of neighbors of $i$ is $|N_i| = 1$, let $m \in N_i$. For any $k$, the estimated change in $k$ passed by $i$ to $m$ under the new strategy is $\Delta\gamma_{i\to m}'^{\,k} = \Delta\gamma_{i\to m}^k + \delta_k$, where $\delta_k \in R$. For other agents, the estimated change passed to them by $i$ remains unchanged, i.e. $\Delta\gamma_{i\to j}'^{\,k} = \Delta\gamma_{i\to j}^k$, where $j \in N_i \setminus \{m\}$. In this way, $\Delta U_{i,t}$ can be expressed as

$$\Delta U_{i,t} = \left(\sum_{j=1}^{n}a_{ij}(t)\,U_i(x(t), s'(t)) - \sum_{j=1}^{n}a_{ij}(t)\,U_i(x(t), s(t))\right)\Big/\sum_{j=1}^{n}a_{ij}(t). \tag{8}$$

Let $\delta_k \to 0$, $k \in N$; then $\Delta U_{i,t}$ becomes

$$\lim_{\delta_k\to 0,\,k\in N}\Delta U_{i,t} = \left[\sum_{k=1}^{n}\left(\delta_k\left(\frac{\partial\psi}{\partial\gamma_m^k(t)} - \frac{\partial\psi}{\partial\gamma_i^k(t)} + 2\alpha\big(\gamma_m^k(t) - \gamma_i^k(t)\big)\right) + 2\alpha(\delta_k)^2\right)\right]\Big/\sum_{j=1}^{n}a_{ij}(t). \tag{9}$$

For any $\delta_k \in R$, there is $\Delta U_{i,t} \ge 0$. So for all $i, k \in N$, (9) can be translated to

$$\frac{\partial\psi}{\partial\gamma_m^k(t)} - \frac{\partial\psi}{\partial\gamma_i^k(t)} + 2\alpha\big(\gamma_m^k(t) - \gamma_i^k(t)\big) = 0.$$

Note that $\psi$ is a convex function. According to the mean value theorem, we obtain

$$\frac{\partial\psi}{\partial\gamma_m^k(t)} - \frac{\partial\psi}{\partial\gamma_i^k(t)} = H(\psi)\big|_{\xi_k\gamma_m^k(t)+(1-\xi_k)\gamma_i^k(t)}\,\big(\gamma_m^k(t) - \gamma_i^k(t)\big), \tag{10}$$

where $\xi_k \in (0, 1)$ and $H(\psi)$ is the Hessian matrix of the convex function $\psi$. Multiplying Eq. (10) by $(\gamma_m^k(t) - \gamma_i^k(t))$ gives

$$\left(\frac{\partial\psi}{\partial\gamma_m^k(t)} - \frac{\partial\psi}{\partial\gamma_i^k(t)}\right)\big(\gamma_m^k(t) - \gamma_i^k(t)\big) = H(\psi)\big|_{\xi_k\gamma_m^k(t)+(1-\xi_k)\gamma_i^k(t)}\,\big(\gamma_m^k(t) - \gamma_i^k(t)\big)^2. \tag{11}$$
Then, we have $0 \ge -2\alpha(\gamma_m^k(t) - \gamma_i^k(t))^2 = H(\psi)|_{\xi_k\gamma_m^k(t)+(1-\xi_k)\gamma_i^k(t)}\,(\gamma_m^k(t) - \gamma_i^k(t))^2$. Because the Hessian matrix of a convex function is positive semidefinite, $0 \ge -2\alpha(\gamma_m^k(t) - \gamma_i^k(t))^2 \ge 0$. Thus, for any $i, k \in N$ and $i$'s neighbor $m$, we have $\gamma_m^k(t) = \gamma_i^k(t)$.

Then, if the number of neighbors of $i$ is $|N_i| \ge 2$, select any two players $j_1, j_2 \in N_i$. The new strategy $s_i'(t) = (\Delta\lambda_i(t), \Delta\gamma_i'(t))$ is designed as follows: for all $k \in N$, $\Delta\gamma_{i\to j_1}'^{\,k}(t) = \Delta\gamma_{i\to j_1}^k(t) + \delta_k$, where $\delta_k \in R$. The estimated change in $k$ passed by $i$ to $j_2$ is $\Delta\gamma_{i\to j_2}'^{\,k}(t) = \Delta\gamma_{i\to j_2}^k(t) - \delta_k$, and $\Delta\gamma_{i\to j}'^{\,k}(t) = \Delta\gamma_{i\to j}^k(t)$ for $j \ne j_1, j_2$. In this way, the proof is the same as in the first case, except that every $\gamma_m^k(t)$ becomes $\gamma_{j_1}^k(t)$ and every $\gamma_i^k(t)$ becomes $\gamma_{j_2}^k(t)$. The following result is obtained: $\gamma_{j_1}^k(t) = \gamma_{j_2}^k(t)$.

Then by (2), we have $\sum_{i\in N}\gamma_i^k(t) = n\lambda_k(t)$. Combining the conclusions of both cases, it follows that $\gamma_i^k(t) = \lambda_k(t)$ for all $i, k \in N$.

It can be seen from the above that when the game reaches $(x^*, s^*)$, the real value of each participant's decision is consistent with the estimated value. Finally, it is only necessary to show that the equilibrium solution and the optimal solution of the objective function are consistent. In the multi-agent system with $\tau(t)$, the proof of the consistency of the NE and the optimal solution is basically the same as for MASs with ordinary two-way information interaction; please refer to [14].
3.2 Strategy Learning Algorithm Under Topology Network with $\tau(t)$

In this section, we design a suitable algorithm to find $(x^*, s^*)$ under a topological network with $\tau(t)$. In practice, players usually update their strategies based on their historical strategies and the states of other players. In this paper, model (3) is considered, so at time $t$ an agent can only receive information that comes from $t - \tau(t)$. The strategy update can then be written in the following form:

$$s_i(t) = F_i\big((x(0), s_i(0), U_i(0)), \ldots, (x(t-\tau(t)), s_i(t-\tau(t)), U_i(t-\tau(t))), \ldots, s_i(t)\big). \tag{12}$$

Combined with the communication characteristics of the MAS in this paper, we design a new algorithm based on the policy learning algorithm. The algorithm is described in Algorithm 1.
Algorithm 1 Strategy learning algorithm under topology network with $\tau(t)$
1: Initialize $s_i(0)$, $U_i(0)$ and $[G_i(0)]_{kj} = \gamma_k^j(0)$, $\forall i, j, k$;
2: while the iteration stop condition is not reached do
3:     Generate $p$, where $\varepsilon \in (0, 1)$;
4:     if $0 < p < 1 - \varepsilon$ then
5:         keep $s_i$;
6:     else ($1 - \varepsilon < p$)
7:         choose a new strategy $s_i(t)$;
8:         for $k = t - \tau(t) : t$ do
9:             Obtain $G_i(t)$ by using formula (5), where $[G_i(t)]_{kj} = \gamma_k^j(t)$;
10:        Calculate the new utility $U_i(t)$ under $s_i(t)$;
11:        if $U_i(t) > U_i$ then
12:            $s_i = s_i(t)$ becomes the new strategy, $U_i = U_i(t)$ becomes the new utility;
13:        else
14:            keep $s_i$;
15:    $t = t + 1$;
16: Calculate $\phi_{\min}$ under $s$;
Remark 1 The algorithm must be able to ensure that the agent strategy and the value of $\phi$ converge with probability $p = \left(\prod_{i=1}^{n}\frac{\varepsilon^n}{|S_i|}\right)^{|S\times X|}$, where $|S \times X|$ is the number of elements of the Cartesian product of the sets $S$ and $X$.
4 Numerical Example

Example 1 Consider a multi-agent system with $n = 6$ agents. The agent set is $N = \{1, 2, 3, 4, 5, 6\}$, and agent $i \in N$ has the corresponding decision $\lambda_i \in \Lambda_i$, where $\Lambda_i$ is the decision set of the agent. The agents achieve their own optimal decisions in the process of optimizing their own objective functions. Through the common decision of all agents $\lambda = (\lambda_1, \lambda_2, \lambda_3, \lambda_4, \lambda_5, \lambda_6) \in \Lambda$, where $\Lambda = \prod_{i\in N}\Lambda_i$, the following multi-agent system optimization problem can be solved:

$$\min_{\lambda\in S}\ \phi(\lambda) = \lambda Q\lambda^T + b\lambda^T \quad \text{s.t. } \lambda_i \in \Lambda_i,\ \forall i \in N, \tag{13}$$

where $b = [-1, -1, -1, -1, -1, -1]$ and

$$Q = \begin{bmatrix}
4 & 0.2 & 0.3 & 0.2 & 0.4 & 1\\
0.2 & 2 & 0.2 & 0.2 & 0.2 & 0.3\\
0.3 & 0.2 & 5 & 1 & 3 & 0.1\\
0.2 & 0.2 & 1 & 3 & 1 & 0.1\\
0.4 & 0.2 & 3 & 1 & 10 & 0.4\\
1 & 0.3 & 0.1 & 0.1 & 0.4 & 3
\end{bmatrix}.$$
Fig. 1 Changes in agent strategy si
The topology rule of information interaction among decision individuals is designed as follows: in the beginning, each agent has at least one other agent for information interaction, and after a certain period of time, information interaction between every pair of agents can be realized. Algorithm 1 is used to solve the problem, with $\alpha = 0.01$ in $U_i$, $\varepsilon \in \{0.2, 0.4, 0.6\}$ and $t = 100000$. The simulation results of each agent strategy are shown in Fig. 1. From Fig. 1, we see that with this method, under local information only, $\lambda$ converges and the error of the converged value is less than 0.0025 for each $\varepsilon$. In addition, we also use gradient descent to solve this optimization problem. The comparison results are shown in Fig. 2. For any $\varepsilon$, the two algorithms have the same convergence trend, and the difference between their converged values is less than 0.005, which confirms the validity of the model established in this paper. In addition, the new algorithm runs much faster in real operations than the gradient descent algorithm and obtains $\phi$ only through local information, which achieves the effect of protecting information.
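The explore-and-accept idea of Algorithm 1 can be tried on the quadratic objective (13) with a heavily simplified Python sketch. It is not the authors' algorithm: the estimates $\gamma$, the delay $\tau(t)$ and the local utilities are omitted, every agent evaluates the global $\phi$ directly, and the names `phi` and `learn`, the perturbation size `step` and the starting point $\lambda = 0$ are assumptions made here for illustration.

```python
import random

# Q and b as given in Example 1.
Q = [
    [4, 0.2, 0.3, 0.2, 0.4, 1],
    [0.2, 2, 0.2, 0.2, 0.2, 0.3],
    [0.3, 0.2, 5, 1, 3, 0.1],
    [0.2, 0.2, 1, 3, 1, 0.1],
    [0.4, 0.2, 3, 1, 10, 0.4],
    [1, 0.3, 0.1, 0.1, 0.4, 3],
]
B = [-1.0] * 6


def phi(lam):
    """Objective (13): lam Q lam^T + b lam^T."""
    quad = sum(lam[i] * Q[i][j] * lam[j] for i in range(6) for j in range(6))
    return quad + sum(B[i] * lam[i] for i in range(6))


def learn(eps=0.2, iterations=5000, step=0.1, seed=0):
    """Simplified explore-and-accept loop (full information, no delay).

    With probability 1 - eps an agent keeps its decision; otherwise it tries a
    random perturbation and keeps it only if phi decreases (utility increases).
    """
    rng = random.Random(seed)
    lam = [0.0] * 6
    best = phi(lam)
    history = [best]
    for _ in range(iterations):
        for i in range(6):                      # each agent decides independently
            if rng.random() < 1.0 - eps:
                continue                        # keep the current strategy
            trial = lam[:]
            trial[i] += rng.uniform(-step, step)
            value = phi(trial)
            if value < best:                    # accept only improving strategies
                lam, best = trial, value
        history.append(best)
    return lam, history


if __name__ == "__main__":
    for eps in (0.2, 0.4, 0.6):
        lam, history = learn(eps=eps)
        print(eps, history[-1])
```

Since a trial is accepted only when it lowers $\phi$, the recorded objective value is non-increasing by construction; a larger $\varepsilon$ simply means more trials per iteration.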
Fig. 2 Changes in φ values under different algorithms (horizontal axis: number of iterations; vertical axis: the objective function value; curves: ε = 0.2, ε = 0.4, ε = 0.6 and the gradient method)

5 Conclusion

This paper models the distributed optimization problem for multi-agent systems with $\tau(t)$ as a game model, verifies the validity of the game, and finally proves the consistency of the optimal solution and the NE. Moreover, we design a revenue-based policy learning algorithm in the network topology with time-varying communication delay to find the NE. Finally, the results are illustrated by numerical simulation.

Acknowledgements This work was supported by the Open Fund of Key Laboratory of Dependable Service Computing in Cyber Physical Society of Chongqing University (Grant No. CPSDSC202202), and the National Natural Science Foundation of China (Grant No. 62103203).
References
1. Yu, W., Chen, G., Ren, W., et al.: Distributed higher order consensus protocols in multiagent dynamical systems. IEEE Trans. Circuits Syst. I: Regul. Pap. 58(8), 1924–1932 (2011)
2. Xiang, X., Liu, L., Gang, F.: Consensus of single integrator multi-agent systems with directed topology and communication delays. Control Theory Technol. 14, 21–27 (2016)
3. Deng, Z., Liang, S., Hong, Y.: Distributed continuous-time algorithms for resource allocation problems over weight-balanced digraphs. IEEE Trans. Cybern. 1–10 (2017)
4. Qiu, Z., Xie, L., Hong, Y.: Distributed optimal consensus of multiple double integrators under bounded velocity and acceleration. Control Theory Technol. 17(1), 85–98 (2019)
5. Tsitsiklis, J.N.: Problems in decentralized decision making and computation. Thesis, Department of EECS, Massachusetts Institute of Technology (1984)
6. Nedić, A., Ozdaglar, A.: Distributed subgradient methods for multi-agent optimization. IEEE Trans. Autom. Control 54(1), 48–61 (2009)
7. Cortes, J., Gharesifard, B.: Continuous-time distributed convex optimization on directed graphs (2012)
8. Shi, P., Wei, S., Xu, J., et al.: Push-pull gradient methods for distributed optimization in networks (2018)
9. Nedić, A., Olshevsky, A., Shi, W.: Achieving geometric convergence for distributed optimization over time-varying graphs. SIAM J. Optim. 27(4) (2016)
10. Vamvoudakis, K.G., Lewis, F.L., Hudas, G.R.: Multi-agent differential graphical games: online adaptive learning solution for synchronization with optimality. Automatica 48(8), 1598–1611 (2012)
11. Abouheaf, M.I., Lewis, F.L., Vamvoudakis, K.G., et al.: Multi-agent discrete-time graphical games and reinforcement learning solutions. Automatica 50(12), 3038–3053 (2014)
12. Marden, J.R., Arslan, G., Shamma, J.S.: Cooperative control and potential games. IEEE Trans. Syst. Man Cybern. Part B Cybern. 39(6), 1393 (2009)
13. Xue, L., Wang, Q.L., Sun, C.Y.: Game theoretical approach for the leader selection of the second-order multi-agent system. Control Theory Appl. (2016)
14. Zhang, J., Qi, D., Zhao, G.: A new game model for distributed optimization problems with directed communication topologies. Neurocomputing 148, 278–287 (2015)
15. Zhang, J., Qi, D., Zhao, G.: Distributed optimization and state based ordinal potential games. Commun. Comput. Inf. Sci. 355, 113–121 (2013)
Distributed Formation Control Based on Linear Model for Power-Line Inspection Robots

LinYuan Hou and Yicheng Li

Abstract In this paper, a distributed formation control based on a linear model is proposed for power-line inspection (PLI) robots. Firstly, the relationship between PLI robots is established through graph theory, so that the lead robot can control the orientation of the other robots. Secondly, the PLI robot model is transformed into a linear model by the linearization method. Then, the distributed formation control is designed based on the connection between the lead robot and the rest of the robots. Finally, based on the Lyapunov stability theorem, a Lyapunov function is constructed to prove that the PLI robots can form the desired formation.

Keywords Power-line inspection robot · Linear model · Distributed formation control · Lyapunov theorem
1 Introduction

A single robot has limited ability to acquire and process information and to carry out control tasks, which is not enough to meet some practical needs. As tasks become more complex, multi-robot cooperation is urgently needed to accomplish work that a single robot cannot complete [1]. At the same time, distributed cooperative formation control of multiple robots plays an extremely important role in the military field. Formation control algorithms make it possible to coordinate clusters of intelligent agents effectively, for example UAV formations [2], unmanned boat formations [3], and wheeled robot formations [4–6]. The advantage of the PLI robot is that it can carry the corresponding detection equipment and actuators while travelling along the power line. It can also identify instruments, foreign objects, cables, etc., supports real-time data transmission, and integrates multiple safety protection measures, which makes it suitable for a variety of indoor occasions. Therefore, the control of PLI robots has attracted widespread attention [7–9].

In order to ensure the robustness and stability of formation control, a series of stable algorithms, such as the leader-following method, the sliding mode coupling method, and the virtual structure method, have gradually been developed in recent years and have been widely used in industry, agriculture, and even military fields [10]. The leader-follower formation control algorithm realizes overall control by controlling the leader. The positions of the remaining robots depend on the lead robot, and the leader can capture the overall position information. When the position of the leader changes, the rest of the robots change their positions accordingly, so as to realize the overall control [11–13]. The sliding mode coupling gain combined with the adaptive method removes the dependence of the original formation control method on the communication topology graph by updating the UAV position to design the time formation controller [14, 15].

The above methods are mature at present, but there are relatively few studies on formation control of special-purpose robots such as the PLI robot [16]. Since the PLI robot can carry equipment and goods during transport, distributed formation control of PLI robots based on a linearized model is particularly important if large-scale inspection is to be realized. In this paper, the linearized model of the PLI robot is adopted, the distributed formation controller is designed with the leader-following approach based on graph theory, and its stability is verified with a Lyapunov function.

The structure of this paper is as follows. In System Model, Sect. 2.1 briefly introduces graph theory and illustrates the relationship between robots, Sect. 2.2 introduces the dynamic model of the PLI robot, and Sect. 2.3 establishes the linearized model. In Materials and Analysis, Sect. 3.1 introduces the design of the controller, and Sect. 3.2 verifies the stability of the system through the Lyapunov function.

L. Hou (B) · Y. Li
College of Science, Hohai University, Nanjing 210098, China
e-mail: [email protected]
Y. Li
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_63
2 System Model

2.1 Graph Theory and Its Basic Knowledge

A graph can be represented as $G = (V, E)$, where $V = \{1, 2, \ldots, N\}$ denotes the vertex set and $E \subseteq V \times V$ denotes the edge set. In this paper, we use the adjacency matrix $A = [a_{ij}] \in R^{N \times N}$ to represent the graph:

\[ a_{ij} = \begin{cases} 1, & (i, j) \in E \\ 0, & (i, j) \notin E \end{cases} \tag{1} \]

In this formula, if vertex $i$ transmits a signal to vertex $j$, then $a_{ij} = 1$; otherwise $a_{ij} = 0$. If vertex $i$ transmits a signal to $j$ but $j$ does not transmit a signal to $i$, then $a_{ij} = 1$ and $a_{ji} = 0$. The diagonal elements of $A$ are all 0, that is, no robot sends signals to itself.
Fig. 1 Physical model parameters in the $X_1 O_1 Z_1$ plane
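In this paper, the diagonal matrix $D = [d_{ij}] \in R^{N \times N}$ represents the total number of neighbor vertices whose information is received by vertex $i$. Matrix $D$ is as follows:

\[ d_{ij} = \begin{cases} \sum_{j=1}^{n} a_{ij}, & i = j \\ 0, & i \neq j \end{cases} \tag{2} \]

The Laplace matrix $L$ is defined as follows:

\[ L(G) = D(G) - A(G) \tag{3} \]

where the elements are:

\[ l_{ij} = \begin{cases} \sum_{j=1,\, j \neq i}^{n} a_{ij}, & i = j \\ -a_{ij}, & i \neq j \end{cases} \tag{4} \]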
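The definitions in (2)–(4) can be illustrated with a short sketch. The three-robot adjacency matrix below is a hypothetical example, not taken from the paper, and the function names `degree_matrix` and `laplacian` are chosen here only for illustration.

```python
def degree_matrix(adj):
    """Eq. (2): d_ii is the i-th row sum of the adjacency matrix, 0 elsewhere."""
    n = len(adj)
    return [[sum(adj[i]) if i == j else 0 for j in range(n)] for i in range(n)]


def laplacian(adj):
    """Eq. (3): L = D - A, computed entry by entry."""
    n = len(adj)
    deg = degree_matrix(adj)
    return [[deg[i][j] - adj[i][j] for j in range(n)] for i in range(n)]


# Hypothetical adjacency matrix for three robots (an assumption for illustration).
adj = [
    [0, 0, 0],
    [1, 0, 0],
    [1, 1, 0],
]
lap = laplacian(adj)
for row in lap:
    print(row)
```

Every row of the resulting matrix sums to zero, which follows directly from (2) and (4).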
2.2 The PLI Robot Model

The PLI robot model and its parameters are shown in Figs. 1 and 2, where $\alpha_1$ is the included angle between the actuating rod and $Y_1$, and $\alpha_2$ is the included angle between the initial position of the robot and its active area. $h_1$ and $h_2$ represent the heights of the actuating rod and the base, respectively, and $m_1$ and $m_2$ represent the weights of the robot and its assembly box, respectively. $l_1$ and $l_2$ represent the horizontal distance from the robot body centroid to the cable and the length between the cables, respectively.

Fig. 2 Physical model parameters in the $X_1 O_1 Y_1$ plane

The state vector is:

\[ x = [x_1\ x_2\ x_3\ x_4]^T = [\alpha_1\ \dot{\alpha}_1\ \alpha_2\ \dot{\alpha}_2]^T \tag{5} \]

Here $u$ represents the control input and $d$ represents the external disturbance. Hypothesis: the external disturbance $d$ of the PLI robot is bounded and changes slowly with time. According to the dynamic model of the PLI robot in reference [17], the state equation of the PLI robot can be expressed with $f = [f_1\ f_2\ f_3\ f_4]^T$, where $f$ is a continuous function:

\[ \dot{x} = f(x, u): \quad \begin{cases} \dot{x}_1 = x_2 = f_1 \\ \dot{x}_2 = \dfrac{-m_2 g h_2 \sin x_1 - m_2 g h_1 \sin x_3 \cos x_1}{m_1 l_1^2 + m_2 [h_2^2 + (-l_2 + h_1 \sin x_3)^2]} + d = f_2 \\ \dot{x}_3 = x_4 = f_3 \\ \dot{x}_4 = \dfrac{u + m_2 h_1 (-l_2 + h_1 \sin x_3) \cos x_3\, x_2^2 - m_2 g h_1 \sin x_1 \cos x_3}{m_2 h_1^2} = f_4 \end{cases} \tag{6} \]

When the control input satisfies $u(0) = 0$, the state space equation of the PLI robot has an equilibrium point at $x = [x_1\ x_2\ x_3\ x_4]^T = [0\ 0\ 0\ 0]^T$. At the equilibrium point, the state equation of the PLI robot satisfies (7):

\[ \begin{cases} 0 = x_2 \\ 0 = \dfrac{-m_2 g h_2 \sin x_1 - m_2 g h_1 \sin x_3 \cos x_1}{m_1 l_1^2 + m_2 [h_2^2 + (-l_2 + h_1 \sin x_3)^2]} + d \\ 0 = x_4 \\ 0 = \dfrac{u + m_2 h_1 (-l_2 + h_1 \sin x_3) \cos x_3\, x_2^2 - m_2 g h_1 \sin x_1 \cos x_3}{m_2 h_1^2} \end{cases} \tag{7} \]
\[ \begin{cases} X_2 = 0 \\ X_3 = \arcsin\left(-h_2 h_1^{-1} \tan x_1\right) \\ X_4 = 0 \\ u(0) = m_2 g h_1 \left(-l_2 + h_1 \sin x_3 \left[\arcsin\left(-h_2 h_1^{-1} \tan x_1\right)\right]\right) \end{cases} \tag{8} \]

The nonlinear system of the PLI robot can be regarded as a set of equilibrium points of a series of systems. According to (8), each system satisfies $-1 \le -h_2 h_1^{-1} \tan x_1 \le 1$, and there are multiple corresponding values of $x_3$ that make $u(0)$ equal to 0.
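2.3 Establishment of Linearization Model

The PLI robot system is linearized at the equilibrium point $x = [x_1\ x_2\ x_3\ x_4]^T = [0\ 0\ 0\ 0]^T$ as follows, where $o(x, u)$ is a high-order infinitesimal:

\[ \dot{x} = f(x, u) = x \frac{\partial f}{\partial x} + u \frac{\partial f}{\partial u} + o(x, u) = \alpha x + \beta u + o(x, u) \tag{9} \]

\[ \alpha = \left. \frac{\partial f}{\partial x} \right|_{x = [0\ 0\ 0\ 0]^T} = \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \cdots & \frac{\partial f_1}{\partial x_4} \\ \vdots & \ddots & \vdots \\ \frac{\partial f_4}{\partial x_1} & \cdots & \frac{\partial f_4}{\partial x_4} \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ \frac{-m_2 g h_2}{m_1 l_1^2 + m_2 (r^2 + l_2^2)} & 0 & \frac{-m_2 g h_1}{m_1 l_1^2 + m_2 (r^2 + l_2^2)} & 0 \\ 0 & 0 & 0 & 1 \\ -\frac{g}{h_1} & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ k_1 & 0 & k_2 & 0 \\ 0 & 0 & 0 & 1 \\ k_3 & 0 & 0 & 0 \end{bmatrix} \]

\[ \beta = \left. \frac{\partial f}{\partial u} \right|_{u(0) = 0} = \left[ \frac{\partial f_1}{\partial u}\ \frac{\partial f_2}{\partial u}\ \frac{\partial f_3}{\partial u}\ \frac{\partial f_4}{\partial u} \right]^T = \left[ 0\ 0\ 0\ \frac{1}{m_2 h_1^2} \right]^T = [0\ 0\ 0\ k_4]^T \tag{10} \]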
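According to formulas (6)–(10), the linearized PLI robot system can be written as follows:

\[ \begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = k_1 x_1 + k_2 x_3 + d \\ \dot{x}_3 = x_4 \\ \dot{x}_4 = k_3 x_1 + k_4 u \end{cases} \tag{11} \]

Meanwhile, we can write Eq. (11) in matrix form as follows:

\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ k_1 & 0 & k_2 & 0 \\ 0 & 0 & 0 & 1 \\ k_3 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \\ k_4 u \end{bmatrix} + \begin{bmatrix} 0 \\ d \\ 0 \\ 0 \end{bmatrix} \tag{12} \]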
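As a rough illustration of (10)–(12), the sketch below evaluates $k_1, \ldots, k_4$ and assembles the coefficient matrices. The parameter values follow Table 1 as read here, $g = 9.81$ is an assumed gravitational constant, and the quantity $r$ is not defined in the text, so it is left as a free argument; the value passed in the example is an assumption for illustration only.

```python
def linearized_coefficients(m1, m2, l1, l2, h1, h2, r, g=9.81):
    """k1..k4 as they appear in the Jacobians of Eq. (10)."""
    den = m1 * l1 ** 2 + m2 * (r ** 2 + l2 ** 2)
    k1 = -m2 * g * h2 / den
    k2 = -m2 * g * h1 / den
    k3 = -g / h1
    k4 = 1.0 / (m2 * h1 ** 2)
    return k1, k2, k3, k4


def state_matrices(k1, k2, k3, k4):
    """Coefficient matrix and input vector of Eq. (12)."""
    A = [
        [0.0, 1.0, 0.0, 0.0],
        [k1, 0.0, k2, 0.0],
        [0.0, 0.0, 0.0, 1.0],
        [k3, 0.0, 0.0, 0.0],
    ]
    B = [0.0, 0.0, 0.0, k4]
    return A, B


# r = 0.4 is a placeholder assumption; the text does not give its value.
k1, k2, k3, k4 = linearized_coefficients(
    m1=70.0, m2=30.0, l1=0.15, l2=0.40, h1=0.4, h2=0.4, r=0.4
)
A, B = state_matrices(k1, k2, k3, k4)
```

Only $k_3$ and $k_4$ are independent of the assumed value of $r$.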
For the PLI robot numbered $i$, we can rewrite its model as follows:

\[ \dot{x}_i = A x_i + B u_i + f(x) \tag{13} \]

In the above formula, $x_i$ represents the state of the $i$-th robot, $A$ and $B$ are the coefficient matrices in formula (12), and $f(x)$ represents a bounded disturbance, which satisfies $|f(x)| \le D$.
3 Materials and Analysis

3.1 Controller Design

Use $m = [m_1^T\ m_2^T\ \ldots\ m_n^T]$ to describe the formation. For the whole system, if there is a controller such that the following condition holds:

\[ \lim_{t \to \infty} \left( x_i(t) - x_j(t) - (m_i - m_j) \right) = 0 \tag{14} \]

where $m_i - m_j$ represents the desired relative offset between robots $i$ and $j$ in the formation, then we say that the system can form formation $m$. Based on the consensus method for formation systems, the distributed formation controller constructed from the relative state information between neighboring PLI robots is as follows:

\[ u_i(t) = cK \sum_{j=1}^{N} a_{ij} \left[ \left( x_j(t) - m_j \right) - \left( x_i(t) - m_i \right) \right] \tag{15} \]

In the above formula, $a_{ij}$ represents the elements of the adjacency matrix $A$, $K$ is a gain matrix to be determined, and $c$ is the coupling coefficient. Substituting Eq. (15) into Eq. (13) gives the closed-loop system:

\[ \dot{x}(t) = (I_N \otimes A - cL \otimes BK)\, x(t) + (cL \otimes BK)\, m + F(x) \tag{16} \]

In the above formula, $L$ represents the Laplace matrix between PLI robots, and $F(x)$, $m$, and $x$ are the stacked column vectors of the disturbances, formation offsets, and states of the $n$ robots, respectively. For the Laplace matrix $L \in R^{n \times n}$, there exists $H \in R^{n \times (n-1)}$ such that $L = HE$, where $E \in R^{(n-1) \times n}$ is defined as follows:

\[ E = \begin{bmatrix} 1 & -1 & 0 & \cdots & 0 \\ 0 & 1 & -1 & \cdots & \vdots \\ \vdots & \ddots & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & 1 & -1 \end{bmatrix} \]
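The factorization $L = HE$ can be checked numerically. The sketch below builds $E$ and one possible $H$; the construction of $H$ from cumulative column sums of $L$ is an assumption made here for illustration (it works whenever every row of $L$ sums to zero) and is not claimed to be the construction used by the authors. The Laplace matrix used as input is the one given in Sect. 4.1.

```python
def difference_matrix(n):
    """E in R^{(n-1) x n}: row k has 1 in column k and -1 in column k + 1."""
    return [[1 if j == k else -1 if j == k + 1 else 0 for j in range(n)]
            for k in range(n - 1)]


def factor_h(lap):
    """One H with L = H E: column k of H is the sum of columns 0..k of L.

    This relies on every row of L summing to zero.
    """
    n = len(lap)
    return [[sum(lap[i][:k + 1]) for k in range(n - 1)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product for lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Laplace matrix of the four-robot example in Sect. 4.1.
lap = [
    [0, 0, 0, 0],
    [-1, 1, 0, 0],
    [0, -1, 1, 0],
    [-1, 0, -1, 2],
]
E = difference_matrix(4)
H = factor_h(lap)
HE = matmul(H, E)  # reproduces L
EH = matmul(E, H)  # reduced (n-1) x (n-1) matrix
```

For this example $EH$ is lower triangular, so its eigenvalues can be read off the diagonal; they are all positive and coincide with the nonzero eigenvalues of $L$.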
If there is a directed spanning tree, then matrix $H$ has full column rank, and the nonzero eigenvalues of $L = HE$ coincide with the eigenvalues of $EH$, which satisfy $\mathrm{Re}(\lambda(EH)) > 0$. Let $e_i(t) = x_i(t) - m_i(t)$, $1 \le i \le n$, and define the deviation between neighboring error vectors as $\eta_i(t) = e_i(t) - e_{i+1}(t)$, $1 \le i \le n - 1$. In vector form:

\[ \eta(t) = (E \otimes I_N)\, e(t) = (E \otimes I_N)\, (x(t) - m) \tag{17} \]

In the above formula, $\eta(t) = [\eta_1^T, \eta_2^T, \ldots, \eta_{n-1}^T]$. Since there is a full column rank matrix $H \in R^{n \times (n-1)}$ with $L = HE$, we have:

\[ \dot{\eta}(t) = (I_{N-1} \otimes A - cEH \otimes BK)\, \eta(t) + (E \otimes A)\, m + (E \otimes I_n)\, F(x) \tag{18} \]

In the above formula, we set $G(x) = (E \otimes I_n) F(x)$. According to the definition of the variable $\eta(t)$, when $\lim_{t \to \infty} \eta(t) = 0$ the PLI robot formation system can form formation $m$.
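The controller (15) and the deviation variable (17) can be sketched as follows for a single-input robot, where $K$ is a row of gains. The function names, the gains, and the small test configurations are hypothetical and serve only to show the bookkeeping.

```python
def control_input(i, x, m, adj, c, K):
    """Eq. (15): u_i = c K sum_j a_ij [(x_j - m_j) - (x_i - m_i)]."""
    dim = len(x[i])
    acc = [0.0] * dim
    for j, a_ij in enumerate(adj[i]):
        for s in range(dim):
            acc[s] += a_ij * ((x[j][s] - m[j][s]) - (x[i][s] - m[i][s]))
    return c * sum(K[s] * acc[s] for s in range(dim))


def formation_error(x, m):
    """Eq. (17): eta_i = e_i - e_{i+1}, with e_i = x_i - m_i."""
    e = [[xs - ms for xs, ms in zip(xi, mi)] for xi, mi in zip(x, m)]
    return [[a - b for a, b in zip(e[i], e[i + 1])] for i in range(len(e) - 1)]


# Hypothetical case 1: every robot sits at its offset plus a common shift.
adj = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
m = [[0.0, 0.0], [2.0, 0.0], [5.0, 0.0]]
x = [[1.0, 0.0], [3.0, 0.0], [6.0, 0.0]]
u_formed = [control_input(i, x, m, adj, c=2.0, K=[1.0, 1.0]) for i in range(3)]
eta_formed = formation_error(x, m)

# Hypothetical case 2: a follower that is one unit ahead of its leader.
u_follower = control_input(1, [[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]],
                           [[0, 0], [1, 0]], c=2.0, K=[1.0, 1.0])
```

When the formation is already achieved, both the control inputs and $\eta$ vanish, which matches the condition $\lim_{t \to \infty} \eta(t) = 0$ stated above.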
3.2 Stability Analysis

In order for the formation system to form the formation $m$, it should satisfy $(E \otimes A) m = 0$, and for the scalar parameter $c > 0$ there should exist a matrix $Q_2$ in the controller such that the following inequality holds:

\[ A^T Q_2 + Q_2 A - c\gamma\, Q_2 B B^T Q_2 + \frac{\rho^2}{\theta_2} I_n + \theta_1 Q_2^2 < 0 \tag{19} \]

Here $\theta_1$ and $\theta_2$ are the maximum and minimum eigenvalues of the positive definite matrix $Q_1$ that satisfies $Q_1 (EH) + (EH)^T Q_1 > \gamma Q_1$, with $0 < \gamma < 2 \min\{\mathrm{Re}(\lambda(EH))\}$. The feedback matrix can then be designed as $K = B^T Q_2$.

The proof is constructed as follows. Design the Lyapunov function $V = \eta^T (Q_1 \otimes Q_2) \eta$, where $Q_1$ satisfies $Q_1 (EH) + (EH)^T Q_1 > \gamma Q_1$ and $Q_2$ satisfies (19). Let $K = B^T Q_2$ and use the condition $(E \otimes A) m = 0$; then:

\[ \begin{aligned} \dot{V} ={}& \eta^T (I_{N-1} \otimes A - cEH \otimes BK)^T (Q_1 \otimes Q_2)\, \eta + \eta^T (Q_1 \otimes Q_2) (I_{N-1} \otimes A - cEH \otimes BK)\, \eta \\ &+ G(x)^T (Q_1 \otimes Q_2)\, \eta + \eta^T (Q_1 \otimes Q_2)\, G(x) \\ ={}& \eta^T \left[ Q_1 \otimes (A^T Q_2 + Q_2 A) - c \left( (EH)^T Q_1 + Q_1 EH \right) \otimes Q_2 B B^T Q_2 \right] \eta + 2 G(x)^T (Q_1 \otimes Q_2)\, \eta \end{aligned} \tag{20} \]
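First of all, for all $x, y \in R^n$, any $n$-dimensional matrix $Q > 0$, and matrices $D$, $S$ of compatible dimensions, we have:

\[ 2 x^T D S y \le x^T D Q D^T x + y^T S^T Q^{-1} S y \tag{21} \]

Therefore:

\[ 2 G(x)^T (Q_1 \otimes Q_2)\, \eta < \rho^2 \eta^T \eta + \eta^T (Q_1^2 \otimes Q_2^2)\, \eta \tag{22} \]

Let $\theta_1$ and $\theta_2$ be the maximum and minimum eigenvalues of matrix $Q_1$, so that $Q_1 \le \theta_1 I$ and $Q_1 \ge \theta_2 I$. Then we get:

\[ 2 G(x)^T (Q_1 \otimes Q_2)\, \eta \le \rho^2 \eta^T \eta + \eta^T (Q_1^2 \otimes Q_2^2)\, \eta \le \frac{\rho^2}{\theta_2} \eta^T (Q_1 \otimes I)\, \eta + \theta_1 \eta^T (Q_1 \otimes Q_2^2)\, \eta \tag{23} \]

Substituting (23) into Eq. (20) gives:

\[ \dot{V} \le \eta^T \left[ Q_1 \otimes \left( A^T Q_2 + Q_2 A - c\gamma\, Q_2 B B^T Q_2 + \frac{\rho^2}{\theta_2} I_n + \theta_1 Q_2^2 \right) \right] \eta < 0 \tag{24} \]

So $\lim_{t \to \infty} \eta(t) = 0$, and the PLI robot formation system can form formation $m$.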
4 Simulation Experiment

4.1 Parameter Setting

First, we give the physical parameters of the PLI robot (Table 1). Then assume that there are four PLI robots to form the desired formation, and the system matrices are as follows:

\[ A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \tag{25} \]

Then set the bounded disturbance as $f(x, t) = [0, 0.01 \sin xt, 0, 0]$, so that $\rho = 0.01$. The robot formation is designed as a square, and the corresponding link diagram is shown in Fig. 3.

Table 1 The physical parameters of PLI robot
m1 (kg)   m2 (kg)   l1 (m)   l2 (m)   h1 (m)   h2 (m)
70        30        0.15     0.40     0.4      0.4
Fig. 3 The connection between PLI robots
According to the corresponding link relationship, we can write the corresponding Laplace matrix $L$ as:

\[ L = \begin{bmatrix} 0 & 0 & 0 & 0 \\ -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & -1 & 2 \end{bmatrix} \]
Without loss of generality, we set the distance between adjacent robots to 10 m, so the formation matrix $M$ can be expressed as:

\[ M = \begin{bmatrix} 0 & 10 & 10 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 10 & 10 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
4.2 The Simulation Results

Substituting the parameters set above and solving the linear matrix inequality gives the system controller as:

\[ Q_2 = \begin{bmatrix} 0.6642 & 0.2214 & 0 & 0 \\ 0.2214 & 0.8118 & 0 & 0 \\ 0 & 0 & 0.6642 & 0.2214 \\ 0 & 0 & 0.2214 & 0.8118 \end{bmatrix}, \quad K = [0\ 0\ 0.2214\ 0.8118]^T \]
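As a quick consistency check on the reported values, the sketch below recomputes the entries of $K = B^T Q_2$ from $B$ in (25) and tests whether $Q_2$ is symmetric positive definite by attempting a Cholesky factorization. The helper `is_positive_definite` is written here for illustration; it is not the LMI solver used in the paper.

```python
import math

Q2 = [
    [0.6642, 0.2214, 0.0, 0.0],
    [0.2214, 0.8118, 0.0, 0.0],
    [0.0, 0.0, 0.6642, 0.2214],
    [0.0, 0.0, 0.2214, 0.8118],
]
B = [0, 0, 0, 1]

# Entries of B^T Q2, stored as a flat list.
K = [sum(B[i] * Q2[i][j] for i in range(4)) for j in range(4)]


def is_positive_definite(mat):
    """True if mat is symmetric and its Cholesky factorization exists."""
    n = len(mat)
    if any(mat[i][j] != mat[j][i] for i in range(n) for j in range(n)):
        return False
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = mat[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    return False  # a non-positive pivot rules out definiteness
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return True


q2_is_pd = is_positive_definite(Q2)
```

Because $B$ selects the last row of $Q_2$, the entries of $K$ are simply that row.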
Fig. 4 The trend of relative position of PLI robot with time
$Q_2$ above is a positive definite symmetric matrix, and the simulation results are shown in Fig. 4. From Fig. 4, we can see that the relative positions between the PLI robots eventually become constant, that is, the robots maintain a fixed relative distance, and we can regard PLI 1 as the leader and the rest as the followers. The simulation results are consistent with the theoretical analysis (Fig. 5). According to Fig. 5, when the initial input is consistent, the final deflection angles of the PLI robots tend to be consistent and converge stably, which is consistent with our expectation (Fig. 6).

4.3 Conclusion

In this paper, a linearized model of the PLI robot, which is a special kind of robot, is obtained by linearization. The distributed formation controller is then designed based on graph theory, and the relevant controller parameters are obtained through mathematical derivation. The stability of the closed-loop system is verified with a Lyapunov function. Finally, the conclusion is verified by a simulation experiment.
Fig. 5 The trend of input deflection angle of PLI robot with time
Fig. 6 The deviation error of robot varies with time
References
1. Lee, K.H., Jabez, L.K., Roland, B.: Balancing collective exploration and exploitation in multiagent and multi-robot systems: a review. Front. Robot. AI 8 (2022)
2. Lili, W., Mou, C., Tao, L.: Disturbance-observer-based formation-containment control for UAVs via distributed adaptive event-triggered mechanisms. J. Franklin Inst. 358(10) (2021)
3. Yu, F.M., Song, W.D., Long, W.C.: Formation control for water-jet USV based on bio-inspired method. China Ocean Eng. 32(1) (2018)
4. Xu, S.J., Zhu, J.F., Zhang, S.Y., Du, Y.: Driven wheeled robot based on fuzzy PID algorithm control research. Appl. Mech. Mater. 2491, 336–338 (2013)
5. Kassaeiyan, P., Tarvirdizadeh, B., Alipour, K.: Control of tractor-trailer wheeled robots considering self-collision effect and actuator saturation limitations. Mech. Syst. Signal Process. 127 (2019)
6. Control of wheeled robots with bluetooth-based smartphones. Int. J. Recent Technol. Eng. 8(2) (2019)
7. Jiao, C., Xu, Y., Li, X., Zhang, X., Zhao, Z., Pang, C.: Electromagnetic shielding techniques in the wireless power transfer system for charging inspection robot application. Int. J. Antennas Propag. 2021 (2021)
8. Zhang, Z., Fu, B., Li, L., Yang, E.: Design and function realization of nuclear power inspection robot system. Robotica 39(1) (2020)
9. Xiong, W., Yang, S., Zhang, Z., Chen, L., Huang, S.: Research on image recognition of power inspection robot based on improved YOLOv3 model. J. Phys.: Conf. Ser. 1486(4) (2020)
10. Cabral-Pacheco, E.G., Villarreal-Reyes, S., Galaviz-Mosqueda, A., Villarreal-Reyes, S., Rivera-Rodríguez, R., Perez-Ramos, A.E.: Performance analysis of multi-hop broadcast protocols for distributed UAV formation control applications. IEEE Access 7 (2019)
11. Liang, X., Wang, H., Liu, Y., Liu, Z., Chen, W.: Leader-following formation control of nonholonomic mobile robots with velocity observers. IEEE/ASME Trans. Mechatron. PP(99) (2020)
12. Zhang, S., Yan, W., Xie, G.: Consensus-based leader-following formation control for a group of semi-biomimetic robotic fishes. Int. J. Adv. Robot. Syst. 14(4) (2017)
13. Control and Systems Engineering: Data from Georgia Institute of Technology provide new insights into control and systems engineering (fault tolerant finite-time leader follower formation control for autonomous surface vessels with LOS range and angle constraints). J. Eng. (2016)
14. Fahim, K.E., Hossain, M.S., Afgani, M.K., Farabi, S.M., Shajid, S.: Modelling and simulation of DC-DC boost converter using sliding mode control. Int. J. Recent Technol. Eng. (IJRTE) 9(2) (2020)
15. Engineering—Automobile Engineering: Recent studies from South China University of Technology add new data to automobile engineering (sliding mode control of double-wishbone active suspension systems based on equivalent 2-degree-of-freedom model). J. Eng. (2020)
16. Li, K., Ye, Z., Zhang, J., Chen, X.: Control optimization method of substation inspection robot based on adaptive visual servo algorithm. J. Phys.: Conf. Ser. 1676(1) (2020)
17. Dian, S., Chen, L., Hoang, S., Pu, M., Liu, J.: Dynamic balance control based on an adaptive gain-scheduled backstepping scheme for power-line inspection robots. IEEE/CAA J. Autom. Sinica 6(1) (2019)
A Dynamic Trust-Based Access Control for Multi-domain Cloud Systems Mei Fan and Zhongguo Yang
Abstract With the development of distributed technologies, there is an increasing demand for resource sharing and collaboration across domains. Achieving cross-domain access control is a key issue in multi-domain cloud systems. Traditional static access control methods are not well-suited for the dynamic changes in multi-domain cloud systems. This paper proposes a dynamic trust-based access control model (DT-ABAC) for multi-domain cloud systems, which is based on dynamic user trustworthiness and attribute-based access control. The method uses a trust model to evaluate the level of user trust based on user behavior and dynamically adjusts the access control policy according to this trust level. This method can achieve more accurate security policy management in multi-domain cloud systems. The experiments show that the DT-ABAC model can effectively reduce the impact of malicious access by users on the success rate of trusted user access.
Keywords DT-ABAC · Trustworthiness evaluation · Dynamic access control · Multi-domain cloud systems
1 Introduction

Currently, multi-domain cloud systems have become a common cloud computing architecture. This allows users to access a variety of cloud services through a single system, resulting in increased resources and services, as well as improved system availability and flexibility. However, along with the benefits, multi-cloud systems also pose new challenges such as access control, data privacy, and resource scheduling. How to achieve cross-domain access control is an important issue in multi-domain cloud systems. Currently, there are two mainstream access control models: role-based access control (RBAC) [1] and attribute-based access control (ABAC) [2].

M. Fan (B) · Z. Yang
School of Information, North China University of Technology, Beijing 100144, China
e-mail: [email protected]
Beijing Key Laboratory on Integration and Analysis of Large-Scale Stream Data, Beijing 100144, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_64
RBAC provides features such as role-based management, support for the principle of least privilege, and constraints on static and dynamic separation of duties. However, RBAC lacks decentralized management of security policies and does not provide resource usage constraints, making it challenging to ensure the security of cross-domain collaboration in multi-cloud systems. ABAC has been proposed as a solution for access control in cloud environments [3]. ABAC allows access decisions to be determined based on attributes of users and resources, offering a flexible and scalable access control approach that is more suitable for the needs of the current distributed environment [4]. However, in multi-domain environments, evaluating user attributes may involve multiple domains whose trustworthiness changes dynamically over time [5].

To address the access control issues in multi-domain cloud systems, this paper proposes a dynamic trust-based access control method for multi-domain cloud systems. The method is an extension of attribute-based access control and introduces a dynamic user trust evaluation mechanism to achieve dynamic cross-domain access while ensuring interoperability security in multi-domain systems.

The rest of this paper is organized as follows. Section 2 reviews related work. Section 3 describes the establishment of the model. Section 4 compares access control models with other trust mechanisms. Section 5 concludes.
2 Related Work

Access control technology has been extensively researched in the academic community as an important means to address cloud security issues. Different network environments have different access control requirements, resulting in various access control models for different network environments. In cloud computing environments, there are a large number of cloud users with high levels of dynamism, and user security information is relatively lacking in cloud computing systems. This is even more significant in cross-domain environments. The demand for resource sharing and collaboration across single security domains is increasing in government, military, business, and medical fields, and multi-domain systems have been widely used. In various multi-domain cloud systems, the implementation mechanism of access control is mainly to ensure the security of inter-domain interoperability policy synthesis. However, traditional access control models have shown great inadequacy in cloud computing environments in terms of granularity, dynamic authorization, and large-scale distributed access control. Access control models must consider multiple factors such as trust and environment for constraint control.

In order to improve the security of cloud computing environments and ensure that cloud resources are not subject to illegal access, many domestic and foreign researchers have conducted research on access control. Zou proposed a cloud computing-based access control model [6], which ensures the security and controllability of data in cloud environments by introducing task scenario sets for user data monitoring. Muhammad implements the concept of least privilege in the ABAC model by introducing the concept of roles between subjects and subject attributes, as well as objects and object attributes [7]. Das mainly proposes solutions for conflicts in attribute-based access control policies and issues during policy migration processes [8]. The aforementioned access control models mainly use static access control constraints for modeling, targeting different security requirements of cloud environments, and have achieved certain results, but there are still some shortcomings. Uikey proposes a trust model called TrustRBAC based on trust management and roles [9], which uses security policies between two domains to calculate direct trust and recommended trust, improving system efficiency and reliability. Ghafoorian proposes an RBAC model based on trust and reputation, which implements a new method for calculating the direct trust value and considers some security indicators, giving trust-based RBAC models good resistance to security threats as well as scalability [10].

This paper proposes a DT-ABAC model, taking into consideration the characteristics of multi-domain cloud systems, as well as trust attributes. The model extends attribute-based access control with the concept of trust attributes and proposes a method for calculating user trustworthiness. This method reduces the recommendation behavior of malicious security domains and the access of malicious users by considering the similarity evaluation between security domains, the effectiveness of time, and the influence of the penalty mechanism on the attribute values of user trustworthiness, thereby enhancing access security.
3 Methodology

3.1 Definition of Model-Related

To support dynamic access control functions in multi-domain systems, this paper proposes adding a trust attribute to the ABAC model and dynamically evaluating the trust level of users. According to the definition form of the ABAC model [11], the following definition of the DT-ABAC model is given.

The formal definition of the DT-ABAC model is a five-tuple (S, O, E, P, T), where S represents subjects, O represents objects, E represents the environment, P represents permissions, and T represents trustworthiness attributes and their values. Attributes and attribute values are defined as subject attribute (Sattr), object attribute (Oattr), environment attribute (Eattr), permissions attribute (Pattr), and trust attribute (Tattr). An entity has specific attributes and an attribute value. By defining attributes and attribute values, the access control model integrates access control policies and access control mechanisms. The trust attribute is calculated by the trust module and its value lies in (0, 1].

The multi-domain cloud environment is defined as a set of mutually crossable security domains $SD = \{sd_1, sd_2, \ldots\}$, where $sd_1, sd_2$ represent mutually independent security domains. A request is a quintuple Req = (SA, OA, EA, PA, TA), representing the subject's attribute information as the basis for authorization. A policy, which defines the access control strategy for a permission p, is defined as a triple Policy = (R, p, effect), where the rule R is determined by subject rules, resource rules, and environment rules, p represents the permission associated with the rule, and effect ∈ {permit, deny}. If the decision of a policy is to allow access, effect = permit, and if the decision is to deny access, effect = deny.
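The request and policy definitions above can be sketched as simple data structures. The field names, the first-match evaluation order, the default-deny fallback, and the example rule with its 0.6 trust threshold are all assumptions made here for illustration; the paper does not specify them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Request:
    """Req = (SA, OA, EA, PA, TA)."""
    subject: Dict[str, str]
    obj: Dict[str, str]
    env: Dict[str, str]
    permission: str
    trust: float  # trust attribute, expected in (0, 1]


@dataclass
class Policy:
    """Policy = (R, p, effect) with effect in {"permit", "deny"}."""
    rule: Callable[[Request], bool]
    permission: str
    effect: str


def decide(request: Request, policies: List[Policy]) -> str:
    """Return the effect of the first matching policy (assumed order),
    falling back to "deny" when no policy matches (assumed default)."""
    for policy in policies:
        if policy.permission == request.permission and policy.rule(request):
            return policy.effect
    return "deny"


# Hypothetical rule: auditors may read when their trust exceeds 0.6.
policies = [
    Policy(rule=lambda r: r.subject.get("role") == "auditor" and r.trust > 0.6,
           permission="read", effect="permit"),
]
trusted = Request({"role": "auditor"}, {"type": "log"}, {"time": "day"}, "read", 0.7)
untrusted = Request({"role": "auditor"}, {"type": "log"}, {"time": "day"}, "read", 0.3)
```

The same subject is permitted or denied depending only on the trust attribute, which is the behavior the trust extension is meant to provide.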
3.2 Establishment of DT-ABAC Model

In multi-domain cloud systems, there is a frequent need for cross-domain access. To ensure the security between different security domains, this paper proposes a DT-ABAC model, which combines a domain management center (DMC) and trust management (TM) with the attribute-based access control model. In this model, both security domains $sd_1$ and $sd_2$ adopt the attribute-based access control model. When a user from security domain $sd_1$ requests access to an object in security domain $sd_2$, the trust management module invokes the user's historical access information (HI) and domain set information (DI) from the cross-domain management center to calculate the user's comprehensive trust value, in order to extend the access control request. If the user satisfies the access control policy, the user is allowed to access the resource; otherwise, the user is denied access to the resource.

Several functional points in the access control mechanism are also provided in the paper. The Policy Enforcement Point (PEP) is responsible for transforming the original access request issued by the subject into an attribute-based access request. At the same time, the PEP enforces the decision result of the Policy Decision Point (PDP), which determines whether to allow or deny the cross-domain access request. The Policy Administration Point (PAP) is responsible for managing the rules of access control and providing rule support for the policy decision of the PDP. The Policy Information Point (PIP) is responsible for managing some attribute information, and obtains the user's trust attributes from the trust management module. The PIP also provides attribute support for building Attribute-based Access Control Requests (AAR) by the PEP. The Policy Decision Point (PDP) is responsible for making decisions on cross-domain (CD) access, obtaining domain set information from the domain management center, and making authorization decisions based on the access control rules.

The cross-domain access process is shown in Fig. 1.
Fig. 1 Cross domain access control

3.3 Calculation of Trustworthiness

Window Mechanism. Cloud environments are highly dynamic, and the level of trust that security domains place in subjects may change over time. We believe that trust has temporal relevance, and after a certain period of time, the historical records of a subject's access to a security domain are no longer relevant to the current access. Therefore, this paper introduces a time window mechanism, where only data within the window is used for calculation, while data outside the window is ignored.

Trust Decay Function. Trust is a constantly changing process, and initially high trust values can become less reliable over time. In trust-based access control, there may be cases where users maliciously manipulate their trust values. Since these users have sufficiently high trust levels, the system is unable to prevent them from engaging in malicious cross-domain access. To avoid this situation, the model introduces a time decay function $W(t)$, which can effectively reduce the impact of historical access data on current access. The decay function is given in Eq. 1:

\[ W(t) = \begin{cases} 1 - \varepsilon, & t = n \\ W(t-1) + \frac{1}{n}, & t < n \end{cases} \tag{1} \]

where $n$ represents the number of times a user may potentially access within the window time, and $t$ represents the number of intervals between the user's first access and their current historical access behavior. $\varepsilon$ is a small positive number used to adjust the range of trust decay, and the weight of the user's historical trust value decreases as the number of accesses increases.

Penalty Mechanism. Highly trusted users pose a greater risk when engaging in malicious access. Based on this characteristic, this paper designs a mechanism to penalize users based on their trust levels. The trust levels are divided into 5 levels, as listed in Table 1.
Table 1 Trust level
Trust level value   (0, 0.2]   (0.2, 0.4]   (0.4, 0.6]   (0.6, 0.8]   (0.8, 1]
Rank                Lower      Low          Medium       High         Higher
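The intervals in Table 1 can be turned into a small lookup, sketched below. The 1-based numbering of the levels (Lower = 1, ..., Higher = 5) is an assumption made here; the table itself only names the ranks.

```python
from bisect import bisect_left

UPPER_BOUNDS = [0.2, 0.4, 0.6, 0.8, 1.0]
RANKS = ["Lower", "Low", "Medium", "High", "Higher"]


def trust_level(value):
    """Return the (assumed) 1-based level whose half-open interval
    (lower, upper] contains the trust value."""
    if not 0 < value <= 1:
        raise ValueError("trust value must lie in (0, 1]")
    return bisect_left(UPPER_BOUNDS, value) + 1


def trust_rank(value):
    """Rank name from Table 1 for a trust value."""
    return RANKS[trust_level(value) - 1]
```

Using `bisect_left` keeps each upper bound inside its own interval, so a value of exactly 0.2 is ranked Lower and 0.21 is ranked Low.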
The penalty mechanism is given by the function shown in Eq. 2, where $A$ is a parameter used to control the intensity of the penalty, with higher values of $A$ indicating stronger penalties; $L$ is the total number of trust levels that are defined; and $l_i$ represents the current trust level of the user, which changes dynamically as the user accesses objects.

\[ p = A \sin\left( \frac{l_i \pi}{2L} \right) \tag{2} \]

Inter-domain Similarity. Similarity evaluation is used to describe the similarity of trust levels between two security domains $SD_i$ and $SD_j$ for the same user. The higher the similarity between $SD_i$ and $SD_j$, the more similar their evaluations of the same user's trust level. This paper uses the Pearson correlation coefficient to represent the similarity between them, as shown in Eq. 3:

\[ sim(i, j) = \frac{N \sum SD_{ik}\, SD_{jk} - \sum SD_{ik} \sum SD_{jk}}{\sqrt{N \sum SD_{ik}^2 - \left( \sum SD_{ik} \right)^2}\; \sqrt{N \sum SD_{jk}^2 - \left( \sum SD_{jk} \right)^2}} \tag{3} \]

where $N$ represents the total number of other security domains that subject $s_k$ has accessed before accessing the target security domain $SD_i$. $SD_{ik}$ represents the evaluation of security domain $SD_i$ on user $s_k$, and $SD_{jk}$ represents the evaluation of security domain $SD_j$ on user $s_k$. The specific formula for $SD_{ik}$ is as follows:

\[ SD_{ik} = \frac{S_{ik}}{F_{ik} * S_{ik}} \tag{4} \]

where $S_{ik}$ represents the number of successful accesses of user $s_k$ to security domain $SD_i$, and $F_{ik}$ represents the number of failed accesses of user $s_k$ to security domain $SD_i$. In this paper, the trust level of subject $s_i$ is combined with the similarity of evaluations as described above. Before subject $s_i$ accesses security domain $SD_j$, other security domains with high similarity of evaluations to $SD_j$ are considered as part of the evaluation of $SD_j$'s security. If the recommended similarity is high, it is used as a recommendation factor; if the recommended similarity is low, the recommendation of that domain is discarded. The value of the recommended trustworthiness is in the range of (0, 1]. The indirect trustworthiness calculation formula is given in (5):

\[ IDT = \frac{\sum_{k=1}^{N} DT_{ik} * sim(j, k)}{N} \tag{5} \]
where $N$ is the total number of security domains that subject $s_i$ has accessed, and $DT_{ik}$ is the direct trustworthiness between subject $s_i$ and security domain $SD_k$. The calculation of direct trustworthiness is given below.

Calculation of Comprehensive Trustworthiness. Assume that subject $s_i$, currently in security domain $SD_i$, wants to access resources in security domain $SD_j$. When subject $s_i$ first accesses the target resources, the direct trustworthiness is initialized to 0.2, which is the lowest trustworthiness. According to Bayesian theory, the direct trustworthiness can be derived as shown in Eq. (6):

\[ DT_{ij} = \begin{cases} 0.2, & n = 1 \\ \dfrac{S_{ij} + 1}{N_{ij} + 2}, & n > 1 \end{cases} \tag{6} \]
where $S_{ij}$ is the number of successful interactions, and $N_{ij}$ is the total number of interactions between subject $s_i$ and security domain $SD_j$. The trustworthiness of a subject consists of both the direct trustworthiness and the recommended trustworthiness from other security domains. This ensures that the user's access behavior in every security domain affects the trustworthiness seen by the other security domains, thus providing broader supervision of access by malicious subjects. In this paper, the comprehensive trustworthiness is calculated as the weighted average of direct trustworthiness and recommended trustworthiness, as shown in Eq. (7), where $\theta$ ($\theta < 1$) adjusts the proportion between direct trust and recommended trust. In general, the trustworthiness in the current security domain is more accurate, so the weight of direct trust should be higher than the weight of indirect trust. If the direct trust degree is at the highest level of trust ($DT > 0.8$), the direct trust degree is used as the overall trust degree.

\[ T = \begin{cases} DT, & DT > 0.8 \\ \theta \cdot DT + (1 - \theta) \cdot IDT, & \text{otherwise} \end{cases} \tag{7} \]
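The trust calculations in Eqs. (2), (3), (5), (6), and (7) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function and parameter names are hypothetical, the sums in Eq. (3) are assumed to run over paired evaluations of the same users, $n$ in Eq. (6) is assumed to be the ordinal number of the current access, and the cut-off below which a low-similarity recommendation is discarded is not specified in the text, so it is left out.

```python
import math


def penalty(level: int, total_levels: int, intensity: float) -> float:
    """Eq. (2): p = A * sin(l_i * pi / (2 * L))."""
    return intensity * math.sin(level * math.pi / (2 * total_levels))


def pearson_similarity(evals_i: list[float], evals_j: list[float]) -> float:
    """Eq. (3): Pearson correlation of two domains' evaluations.

    Assumption: evals_i[k] and evals_j[k] are the evaluations of the same
    user by domains SD_i and SD_j, and the sums run over these pairs.
    """
    n = len(evals_i)
    if n == 0 or n != len(evals_j):
        raise ValueError("need two non-empty lists of equal length")
    sum_i, sum_j = sum(evals_i), sum(evals_j)
    sum_ij = sum(a * b for a, b in zip(evals_i, evals_j))
    sum_ii = sum(a * a for a in evals_i)
    sum_jj = sum(b * b for b in evals_j)
    # max(0.0, ...) guards against tiny negative values from rounding.
    var_i = max(0.0, n * sum_ii - sum_i**2)
    var_j = max(0.0, n * sum_jj - sum_j**2)
    denominator = math.sqrt(var_i) * math.sqrt(var_j)
    if denominator == 0:
        return 0.0  # assumption: constant evaluations give no similarity signal
    return (n * sum_ij - sum_i * sum_j) / denominator


def direct_trust(successes: int, interactions: int, access_number: int) -> float:
    """Eq. (6): 0.2 on the first access, otherwise (S + 1) / (N + 2)."""
    if access_number <= 1:
        return 0.2
    return (successes + 1) / (interactions + 2)


def indirect_trust(direct_trusts: list[float], similarities: list[float]) -> float:
    """Eq. (5): mean of DT_ik * sim(j, k) over the N accessed domains."""
    n = len(direct_trusts)
    if n == 0 or n != len(similarities):
        raise ValueError("need two non-empty lists of equal length")
    return sum(dt * s for dt, s in zip(direct_trusts, similarities)) / n


def comprehensive_trust(dt: float, idt: float, theta: float = 0.6) -> float:
    """Eq. (7): use DT alone above 0.8, otherwise a weighted average."""
    if dt > 0.8:
        return dt
    return theta * dt + (1 - theta) * idt


if __name__ == "__main__":
    dt = direct_trust(successes=3, interactions=4, access_number=5)
    idt = indirect_trust([0.5, 0.7], [1.0, 0.5])
    print(dt, idt, comprehensive_trust(dt, idt))
```

The default `theta=0.6` mirrors the weight of direct trustworthiness reported in the experimental setting; any other value below 1 can be passed instead.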
4 Evaluation

4.1 Experimental Setting

The simulation environment was set up on a system with an Intel(R) Core(TM) i5-2450M CPU @ 2.5 GHz and 4 GB of memory. Access control requests were generated using the NetworkX tool, which can provide dynamic user requests and the corresponding permissions. The results below cover 100 cross-domain access experiments for test users. The initial trustworthiness of all users was set to 0.2, and the weight of direct trustworthiness was set to 0.6.
Fig. 2 Scale graph of successful user interactions
4.2 Results

To verify the advantage of the proposed model in preventing malicious user access and malicious recommendations from security domains, we compared it with the similarity trust method proposed by Kavitha et al. (abbreviated as STAC) [12]. As shown in Fig. 2, when the proportion of malicious users gradually increases from 10 to 70%, both the DT-ABAC model and the STAC model show a certain decrease in the success rate of user interactions. Both models adopt the same similarity-based recommendation method, but the DT-ABAC model additionally incorporates evaluation similarity based on historical access information and ignores recommendations from security domains with low similarity. Therefore, compared to the STAC model, the proposed model is able to identify more malicious recommendations from security domains when calculating user trust attributes. It also incorporates punishment and trust decay mechanisms to penalize malicious users and prevent them from continuing malicious access based on their trust levels. As a result, the success rate of trusted users' access is improved.
5 Conclusion

In this paper, we propose a DT-ABAC model for multi-domain cloud systems. The method combines attribute-based access control model and dynamic trust evaluation, and uses a dynamic authorization approach. It introduces similarity and penalty
mechanisms to make the trust attributes more accurate. Simulation experiments show that this method can effectively reduce malicious user access and improve the success rate of trusted user access, ensuring the security of cross-domain access by users.
References

1. Xu, J., Yu, Y., Meng, Q., Wu, Q., Zhou, F.: Role-based access control model for cloud storage using identity-based cryptosystem. Mob. Netw. Appl. 26, 1475–1492 (2021)
2. Singh, D., Sinha, S., Thada, V.: Review of attribute based access control (ABAC) models for cloud computing. In: 2021 International Conference on Computational Performance Evaluation (ComPE), pp. 710–715. IEEE (2021)
3. Crampton, J., Williams, C.: Attribute expressions, policy tables and attribute-based access control. In: Proceedings of the 22nd ACM on Symposium on Access Control Models and Technologies, pp. 79–90 (2017)
4. Wei, J., Chen, X., Huang, X., Hu, X., Susilo, W.: Rs-habe: revocable-storage and hierarchical attribute-based access scheme for secure sharing of e-health records in public cloud. IEEE Trans. Dependable Secure Comput. 18(05), 2301–2315 (2021)
5. Servos, D., Osborn, S.L.: Current research and open problems in attribute-based access control. ACM Comput. Surv. (CSUR) 49(4), 1–45 (2017)
6. Jiashun, Z., Yongxie, Z., Yan, G.: Research on ABAC model based on usage control in cloud environment. Comput. Appl. Res. 31(12), 3692–3694 (2014)
7. Aftab, M.U., Qin, Z., Quadri, S.F., Javed, A., Nie, X.: Role-based ABAC model for implementing least privileges. In: Proceedings of the 2019 8th International Conference on Software and Computer Applications, pp. 467–471 (2019)
8. Das, S., Sural, S., Vaidya, J., Atluri, V.: Policy adaptation in hierarchical attribute-based access control systems. ACM Trans. Internet Technol. (TOIT) 19(3), 1–24 (2019)
9. Uikey, C., Bhilare, D.: Trustrbac: trust role based access control model in multi-domain cloud environments. In: 2017 International Conference on Information, Communication, Instrumentation and Control (ICICIC), pp. 1–7. IEEE (2017)
10. Ghafoorian, M., Abbasinezhad-Mood, D., Shakeri, H.: A thorough trust and reputation based RBAC model for secure data storage in the cloud. IEEE Trans. Parallel Distrib. Syst. 30(04), 778–788 (2019)
11. Iyer, P., Masoumzadeh, A.: Mining positive and negative attribute-based access control policy rules. In: Proceedings of the 23rd ACM on Symposium on Access Control Models and Technologies, pp. 161–172 (2018)
12. Kavitha, A., Reddy, V.B., Singh, N., Gunjan, V.K., Lakshmanna, K., Khan, A.A., Wechtaisong, C.: Security in IoT mesh networks based on trust similarity. IEEE Access 10, 121712–121724 (2022)
On the Digital Intelligence for Online Retail Decision Support

Lei Wang, Bin Zhao, and Yong Yang
Abstract Academic machine learning, one of the key tools of data intelligence, has mostly focused on accuracy improvement under specific task objectives, ideal assumptions, and well-labeled training datasets; however, these are not readily available in real-world industry scenarios. Considering the actionable implementation of data intelligence for improving decision-making quality in online retail, this report provides a series of practical procedures for taking machine learning algorithms from theory to practice. In particular, the procedures include (1) contextualizing the business problem, which aims to depict a panoramic view of the business process operations across all the vital organizations; (2) decomposing the business problem into machine learning tasks, which consists of evaluating potential risks and identifying the business problems with the highest input-output ratio as a candidate pool of machine learning tasks; and (3) underestimated manipulations in industrial data wrangling and feature extraction.

Keywords Digital intelligence · Machine learning applications · Online retail
L. Wang (B) · B. Zhao · Y. Yang, JD.com, Inc., Beijing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_65

1 Introduction

The history of using digital means to improve operational efficiency in the retail industry can be traced back to 1987, when Walmart, a retail company in the United States, had sales less than half those of the century-old retailer Sears. Following its low-price business philosophy of “saving every penny for customers”, Walmart implemented a “three no’s” policy for suppliers (no kickbacks, no advertising, and no delivery fees) to achieve the lowest purchase price, as described in Graff [2]. However, there was still room for improvement in operational efficiency. In 1985 Walmart launched a commercial satellite and established its own private satellite communication system, called The Walmart Satellite Network (WSN), described in
Roberts [5], which allowed headquarters and the various stores to communicate with each other through voice, data, and video. As a result, more than 4000 stores worldwide could take inventory of every item’s stock, shelf placement, and sales volume within one hour. The close connection between internal and external information systems allowed Walmart to exchange sales, transportation, and ordering data with suppliers every day, ensuring that store sales, ordering, and distribution remained synchronized. Because Walmart applied these technological means ten years earlier than its competitors, it greatly improved operational efficiency, ultimately achieving lower theoretical costs than surrounding competitors, efficiently implementing the “Everyday Low Price (EDLP)” promotion strategy, winning customer satisfaction, and surpassing Sears in sales in 1991 to become the largest retail company in the United States [16].

Today, with the rapid development of digital information systems, retailers have expanded from offline to online. The high velocity, volume, and variety of data, as well as the exponential development of artificial intelligence technology, have driven retailers to transform from traditional retail companies into technology companies, making digital intelligence in online retail inevitable, for example in supply chain enhancement [12, 14]. However, for large online retail platforms in China, the complexity of the buyer, merchandise, and scene dimensions makes the implementation of digital intelligence a thorny issue. The complexity of the “buyer” dimension is reflected in its broad client base, which includes not only ordinary household consumers but also enterprise users and rural purchasing agents. The complexity of the “merchandise” dimension is due to the large number of SKU categories (in the tens of thousands) and the large amount of merchandise (in the tens of millions for regularly sold SKUs and in the billions for all SKUs).
The complexity of the “scene” dimension comes from the variety of sales channels, including individual recipients and consumer sales, live streaming sales, enterprise VIP customer agreement procurement, small and medium-sized enterprise procurement, and low-price sales in the sinking market. Therefore, when online retail platform operators prepare promotion activities (daily promotions or mid-year/year-end promotions) for each “buyer-merchandise-scene” combination to stimulate buyers’ willingness to consume, there will be a large number of combined promotion activity decisions even for a single SKU, and using digital intelligence to improve decision-making quality is not always smooth sailing [7, 9, 13]. For example, when a fresh operator from a large online retail platform wants a clue from digital intelligence about the optimal GMV price for an SKU in the next few days, it is not enough to treat this business problem as a price-demand regression problem, as shown in Fig. 1, followed by a price optimization problem of the kind solved by Lau [1] and Phillips [15]. It is also necessary to determine who proposed this pricing issue, what role they play in the online retail platform, what they can do (can they participate in discount sales or adjust prices?), what their motivation is (are they responsible for the SKU sales performance appraisal, and what are the assessment indicators?), and what can be done after obtaining the optimal GMV price (where, when, and what can be done). If the consumers’ demand is high, can the current inventory meet the purchasing volume? Does inventory need to be transferred from another location? Who will bear the cost of transferring inventory? What kind of
Fig. 1 Price-demand regression problem
level price can we access in the price waterfall? Pricing involves a remarkably complex set of decisions, and the many kinds of price discounts between the list price and the pocket price lead to different promotion activity decisions, as mentioned in Leung [13] and Phillips [15]. Once these issues are raised, the possibility of applying a machine learning model for “demand curve prediction” in practice becomes questionable. Therefore, to determine the optimal GMV price of an SKU on an online retail platform through intelligent data-driven approaches, it is essential to have a deep understanding of the proposer of the business problem, their role and motivation, as well as the subsequent operational steps and potential issues. Meanwhile, it is also crucial to recognize the limitations and constraints of machine learning models in practical applications, as discussed by Wuest [10] and Angermueller [11, 12]. Despite its outstanding performance in various fields, such as natural language processing and computer vision, machine learning still faces multiple challenges when dealing with business problems in online retail, such as the following three issues:

1. How to contextualize the business problems and divide them into smaller subproblems that can potentially be solved by machine learning techniques;
2. How to translate business problems into machine learning tasks;
3. How to clean, analyze, and extract features from irregular, dirty industrial data.

These critical issues are discussed in three parts in this article. Part one: Contextualize the business problem. Part two: Decompose the business problem into machine learning tasks. Part three: Industrial data wrangling and feature extraction.
2 The Main Results

2.1 Contextualize Business Problem

In this step, the main goal is to break down and analyze business problems and find controllable, intervenable, and valuable business problems as the candidate pool for machine learning tasks.
The first step is to depict the panoramic view of the business process and analyze business problems. Take as an example a large online retail platform with a wide range of customers (household consumers, enterprise users, rural purchasing agents), a large number of SKU categories (tens of thousands), a large volume of SKUs (billions), and detailed sales channels (such as individual recipients and consumer sales, live streaming sales, direct store sales, enterprise VIP customers, small and medium-sized enterprise customers, and low-priced sales in sinking markets). To depict the panoramic view of the business process, the following questions need to be confirmed:

(1) Which role is responsible for purchasing goods and determining the purchase price and inventory level?
(2) Which role is responsible for sales and determines the participation of goods in promotion activities?
(3) Which role has the pricing power and determines the specific selling price in promotion activities?
(4) In promotion activities, how is a price reduction achieved for different customers and sales channels for the same SKU?
(5) Can the effect data be observed after the promotion?

After confirming the above questions, we can draw the demo panoramic view of the business process shown in Fig. 2.

The second step is to evaluate potential obstacles and risks and determine whether the key links are solvable. At this point, it is necessary to locate where the proposer of the business problem sits in the business process and narrow down the scope of the business problem. Based on the upstream and downstream dependencies, evaluate potential obstacles and risks, that is, which roles need to cooperate to truly solve the business problem and how much manpower needs to be invested. Taking the intelligent promotion activity (SKU selection and pricing) business problem as an example, the key links can be abstracted as shown in Fig. 3.
Finally, based on the scope of the business problem, use metric analysis to estimate how much benefit can be obtained once the problem is solved, and calculate the input-output ratio based on the estimated investment. Find the business problem with the highest input-output ratio and put it into the candidate pool for machine learning tasks.
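The ranking step above can be sketched as a small calculation. This is a hypothetical illustration only: the class, the field names, and the sample figures are invented for the example, and the “input-output ratio” is interpreted here as estimated benefit divided by estimated investment, which is an assumption about the intended metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateProblem:
    """A business problem with rough estimates in the same currency unit."""

    name: str
    estimated_benefit: float      # e.g. expected uplift after the problem is solved
    estimated_investment: float   # e.g. manpower and coordination cost

    @property
    def benefit_per_cost(self) -> float:
        if self.estimated_investment <= 0:
            raise ValueError("investment must be positive")
        return self.estimated_benefit / self.estimated_investment


def rank_candidates(candidates: list[CandidateProblem]) -> list[CandidateProblem]:
    """Return candidates ordered from highest to lowest benefit per unit cost."""
    return sorted(candidates, key=lambda c: c.benefit_per_cost, reverse=True)


if __name__ == "__main__":
    # Purely illustrative numbers, not taken from any real platform.
    pool = [
        CandidateProblem("promotion SKU selection", 120.0, 40.0),
        CandidateProblem("promotion SKU pricing", 200.0, 100.0),
        CandidateProblem("inventory transfer planning", 90.0, 60.0),
    ]
    for candidate in rank_candidates(pool):
        print(f"{candidate.name}: {candidate.benefit_per_cost:.2f}")
```

In practice both estimates are uncertain, so the ranking is better read as a rough ordering for discussion than as a precise score.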
2.2 Decomposing Business Problem into Machine Learning Tasks

After contextualizing the business problem, a panoramic view of the business process has been constructed, and controllable and intervenable links have been selected as the candidate pool for machine learning tasks. Four critical considerations are then carried out in order to decompose the business problems into machine learning tasks with respect to the data-driven decision support system shown in Fig. 4, which is inspired by control theory.

The first step is to determine the granularity of the machine learning task objective, including data sample granularity and objective accuracy. As Loni [4, 6] classifies
Fig. 2 A panoramic view demo of the business process operations in large online retail platforms
Fig. 3 The key links of promotion activity intelligent (SKU selection and pricing) business problem
Fig. 4 The data-driven decision support system for promotion activity intelligent
questions into coarse and fine-grained questions, we can also classify the machine learning task. In other words, we must decide whether to explore the fuzzy correctness of coarse-grained problems or to pursue the accuracy of fine-grained problems. For example, when using a machine learning model to solve the investment portfolio problem, if the fuzzy correctness of a coarse-grained problem is selected, the goal of the model is to select stocks reasonably and allocate investment amounts to maximize the probability of attaining a given target profit level; if the accuracy of a fine-grained problem is selected, the goal of the model can be set to predicting the price of stocks on the following day. The second type of problem is easy to define as a supervised learning problem, but in actual business use the accuracy of stock price prediction is required to be extremely high, because even small errors in stock price prediction will be amplified by large purchase volumes and result in large gains or losses. For the first type of problem, on the other hand, even if there is a certain error between the actual winning rate and the winning rate predicted by the model, as long as the actual result is positive, even a low winning rate can bring great benefits, especially when the total investment amount is large and the investment objective is long-term.

The second step is to evaluate the verifiability of the model. That is, besides the general evaluation methods for machine learning summarized in Reich [3], we ask whether the criteria for judging the model results as good (correct) or bad (incorrect) are clear and whether the cost is affordable (not only human resource cost but also time cost). As an example of an easily verifiable model, consider a model that recognizes the price numbers in an SKU image: ordinary people with normal cognition and knowledge of numbers can evaluate the correctness of the model’s recognition results.
On the contrary, a model that outputs the optimal price, intended to help online retail platform operators plan the optimal selling price for a certain promotion activity in order to achieve the maximum GMV, is hard to verify. How should one judge whether this price is good or bad: by manual judgment, or by the GMV of historical or future transactions at this price? If by manual judgment, not to mention the human and time costs, is the standard of manual judgment unique, and will it change? If the standard changes, how can the model learn the pattern?
Secondly, if the judgment is based on the GMV of historical transactions at this price or on the trial effect of small-scale production in the future, how can one ensure that the increase or decrease of GMV is not caused by other factors? Therefore, the more ambiguous and costly it is to evaluate whether the model results are good (correct) or bad (incorrect) for a machine learning task, the more difficult it is to verify the model and the more human and time costs it will consume.

The third step is to determine the production materials, that is, whether the data is available and accessible, whether it offers a competitive advantage (i.e., data that competitors cannot obtain), and whether the business rules for cleaning noise can be guided by professionals. Continuing with the previous example, since manual verification of the model cannot guarantee that the standard remains unchanged, and the use of historical transaction prices cannot rule out other external factors, the model has no verifiability before it is put into production.

The fourth step is to confirm the scale of the problem and the computing resources involved, that is, how large the business problem to be solved is and whether the available computing resources can support it.
2.3 Industrial Data Wrangling and Feature Extraction

Once the objectives of the machine learning problem have been well defined, the next step is to convert raw data into a usable format, remove noise, and extract as many features related to the target as possible. There are three underestimated manipulations in industrial data wrangling:

(1) Data parsing, which is usually needed for compressed data and XML files, such as gzip-compressed order XML data transmitted between microservice components via an MQ (message queue). It needs to be unzipped and parsed into an operable XML structure so that the various theme data can be extracted easily, such as order theme data (order ID, ordered SKU ID), promotion activity theme data (promotion ID and promotion discount amount), and so on;
(2) Data deduplication, which is a very important but often overlooked step. Especially for order data, if duplicate orders occur, false GMV will be introduced when calculating the overall GMV or order volume;
(3) Abnormal data processing, including null value processing (judging from a business logic perspective whether null values are normal or abnormal, and treating them as 0 or filling them with the average of the previous and next values) and data distribution exploration (observing the data distribution over a certain period of time and judging whether extreme values are rare events, true exceptions, or data mixed in from other business scenarios).

In addition, noise-cleaning business rules based on expert experience can be obtained during the data exploration phase when identifying machine learning tasks. For structured data, feature extraction generally includes three fundamental steps: (1) defining the data range based on business rules; (2) data joining; and (3) data aggregation. However, an underestimated point is that when entering the third step after completing the second step, it is
important to check whether the join has produced duplicate records, which would distort the statistical results of data aggregation, such as sums. Taking the construction of an SKU demand prediction model as an example, during data wrangling and feature extraction for the selected machine learning model, the data range is limited along multiple dimensions based on the business scenario. This includes not only the SKU range (category, brand, current validity, popularity) but also the time range of the SKU orders (whether to consider the past 30 days, the past year, or the past three years), the user range of the SKU orders (member users or enterprise users), and the sales channel range of the SKU orders (individual recipients and consumer sales or low-price promotion in the sinking market).

Data joining is needed because data is usually stored separately by business theme when the business process is complex and the data volume is extremely large. For example, in the order theme data model, the order ID and SKU ID are used as the primary key, and the other order information (order date, user ID, sales channel, promotion activity ID, pre-discount price, post-discount price, sales quantity) is stored as values. In the user theme data model, the user ID is used as the primary key, and other user information (whether the user is a member, whether the user is an enterprise user) is stored as values. In the SKU theme data model, only the SKU ID is used as the primary key, and other SKU attributes (category, brand, current validity, recent sales, new SKUs) are stored as values. In the promotion theme data model, the promotion activity ID is used as the primary key, and promotion information (activity start time, activity end time) is stored as values.
Therefore, from the order theme data model alone, only the SKU ID, the user ID, and the price and quantity of the purchase made under a certain promotion activity are known. If we need user information, SKU information, and promotion activity information at the same time, we need to join the SKU theme data model, the user theme data model, and the promotion theme data model.

Data aggregation means aggregating feature values at the data sample granularity of the model’s target. For example, if the data sample granularity of the model’s target is the average selling price and the total sales quantity of an SKU in every hour since the start of a certain activity, as described in Ferreira [8], the aggregation key will be the promotion activity ID, the SKU ID, and the index n of the hour after the start of the activity, and the average selling price and sales quantity will be aggregated over the orders that fall under this key.

In addition, the data range needs to be limited in view of the computing resources required by the selected machine learning model. First, consider the existing hardware and time limitations to evaluate how much data can be processed. Second, consider the data sample granularity of the model’s feature extraction. Third, decide in advance on the feature normalization scheme (for example, normalizing the same feature across all samples or normalizing all features within the same sample).
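The join, duplicate check, and hourly aggregation described in this section can be sketched with plain Python data structures. This is a minimal sketch under stated assumptions, not the platform's actual pipeline: the field names, the sample records, and the helper names are hypothetical, the average selling price is assumed to be quantity-weighted, and a production system would more likely run such steps in SQL or a distributed data framework.

```python
from collections import defaultdict
from datetime import datetime


def deduplicate(rows: list[dict], key_fields: tuple[str, ...]) -> list[dict]:
    """Keep the first record for each primary key (here: order ID + SKU ID)."""
    seen = set()
    unique_rows = []
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique_rows.append(row)
    return unique_rows


def aggregate_hourly(orders: list[dict], promotions: dict[str, dict]) -> dict:
    """Aggregate by (promotion ID, SKU ID, n-th hour since activity start)."""
    totals = defaultdict(lambda: [0.0, 0])  # key -> [revenue, quantity]
    for row in orders:
        # Join step: look up the promotion theme record by promotion ID.
        start = promotions[row["promotion_id"]]["start"]
        hour_index = int((row["order_time"] - start).total_seconds() // 3600)
        key = (row["promotion_id"], row["sku_id"], hour_index)
        totals[key][0] += row["post_discount_price"] * row["quantity"]
        totals[key][1] += row["quantity"]
    return {
        key: {"avg_price": revenue / quantity, "quantity": quantity}
        for key, (revenue, quantity) in totals.items()
        if quantity > 0
    }


if __name__ == "__main__":
    promotions = {"P1": {"start": datetime(2023, 6, 1, 10, 0)}}
    order = {"order_id": "O1", "sku_id": "S1", "promotion_id": "P1",
             "order_time": datetime(2023, 6, 1, 10, 15),
             "post_discount_price": 10.0, "quantity": 2}
    orders = [
        order,
        dict(order),  # duplicate record that would inflate GMV if kept
        {**order, "order_id": "O2", "order_time": datetime(2023, 6, 1, 10, 45),
         "post_discount_price": 12.0, "quantity": 1},
        {**order, "order_id": "O3", "order_time": datetime(2023, 6, 1, 11, 30),
         "quantity": 1},
    ]
    clean = deduplicate(orders, ("order_id", "sku_id"))
    for key, features in sorted(aggregate_hourly(clean, promotions).items()):
        print(key, features)
```

Running the aggregation on `orders` instead of `clean` counts order O1 twice, which illustrates how a single duplicate record shifts both the quantity and the average price for that hour.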
3 Conclusion

This report discusses the issues and challenges that arise when digital intelligence techniques are applied in online retail to help platform operators enhance decision-making quality. The promotion activity decision, one of the most important decisions for online retail platform operators, consists of the promotion SKU (stock keeping unit) selection sub-decision and the SKU pricing sub-decision. It is discussed in detail through use cases that illustrate the obstacles encountered when applying machine learning to achieve digital intelligence. Finally, an actionable implementation framework is proposed, which we hope will further enhance communication and collaboration between industry and academia and make the bridge between them more accessible and smooth.
References

1. Lau, A.H.L., Lau, H.S.: The newsboy problem with price-dependent demand distribution. IIE Trans. 20(2), 168–175 (1988)
2. Graff, T.O., Ashton, D.: Spatial diffusion of Wal-Mart: contagious and reverse hierarchical elements. Prof. Geogr. 46(1), 19–29 (1994)
3. Reich, Y., Barai, S.V.: Evaluating machine learning models for engineering problems. Artif. Intell. Eng. 13(3), 257–272 (1999)
4. Loni, B.: A survey of state-of-the-art methods on question classification (2011)
5. Roberts, B., Berg, N.: Walmart: Key Insights and Practical Lessons from the World’s Largest Retailer. Kogan Page Publishers (2012)
6. Teng, X., Liu, B., Ichiye, T.: Understanding how water models affect the anomalous pressure dependence of their diffusion coefficients. J. Chem. Phys. 153(10), 104510 (2020)
7. Wu, J., Li, L., Da Xu, L.: A randomized pricing decision support system in electronic commerce. Decis. Support Syst. 58, 43–52 (2014)
8. Ferreira, K.J., Lee, B.H.A., Simchi-Levi, D.: Analytics for an online retailer: demand forecasting and price optimization. Manuf. Serv. Oper. Manag. 18(1), 69–88 (2016)
9. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018)
10. Wuest, T., Weimer, D., Irgens, C., Thoben, K.D.: Machine learning in manufacturing: advantages, challenges, and applications. Prod. Manuf. Res. 4(1), 23–45 (2016)
11. Angermueller, C., Pärnamaa, T., Parts, L., Stegle, O.: Deep learning for computational biology. Mol. Syst. Biol. 12(7), 878 (2016)
12. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016)
13. Leung, K.H., Luk, C.C., Choy, K.L., Lam, H.Y., Lee, C.K.: A B2B flexible pricing decision support system for managing the request for quotation process under e-commerce business environment. Int. J. Prod. Res. 57(20), 6528–6551 (2019)
14. Ivanov, D., Dolgui, A., Sokolov, B.: The impact of digital technology and Industry 4.0 on the ripple effect and supply chain risk analytics. Int. J. Prod. Res. 57(3), 829–846 (2019)
15. Phillips, R.L.: Pricing and Revenue Optimization. Stanford University Press (2021)
16. Sam Walton Biography. Biography.com, http://www.biography.com/people/sam-walton9523270. Accessed 2017
Modeling and High-Order Differential Feedback Control of Unmanned Helicopter Under Disturbances

Guoyuan Qi, Shishen Wang, and Xu Zhao
Abstract The presence of the servo on a small unmanned helicopter increases the order and the control difficulty of the entire system. This paper proposes an inverse model control idea to eliminate the effect of this device and to solve the problem that the inverse transfer function cannot be realized. A trajectory tracking control system is designed according to the model characteristics of the small unmanned helicopter to achieve autonomous flight on a precise path with reduced effects of unknown interference. Considering that the outer loop model is simple and known, the Nonlinear Pole Assignment Controller (NPAC) is used to make full use of the known model information and shorten the system response time. For the strong nonlinearity, strong coupling, and disturbance in the inner loop model, the High-Order Differential Feedback Controller (HODFC) is utilized to compensate for the nonlinear model, disturbance, and coupling through its control filtering. Simulation experiments demonstrate that the designed control system has better anti-interference ability and tracking accuracy than PID when tracking a figure-eight climbing trajectory under external interference.

Keywords Unmanned helicopter · High-order differential feedback control · Pole configuration · Track tracking · Inverse model
G. Qi (B) · S. Wang · X. Zhao, School of Control Science and Engineering, Tiangong University, Tianjin 300380, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_66

1 Introduction

Small unmanned helicopters are rotary-wing aircraft capable of vertical takeoff and landing, hovering, and low-altitude, low-speed flight, and they have strong maneuverability. However, their modeling and flight control are more challenging than those of quadrotors because they are highly nonlinear, strongly coupled, and underactuated systems. Servos are the unique actuator components of small unmanned helicopters compared to quadcopters. Typically, servos are modeled using transfer functions [1]. However, their presence increases the order of the entire system, making control a
753
754
G. Qi et al.
more challenging task [2]. Previous literature often neglected to include the servo model in their modeling efforts [3–5], leading to control system models that deviate from the true model of small unmanned helicopters. The control methods for small unmanned helicopters can be classified into linear and nonlinear approaches [6]. Previous research of linear control approaches tend to overlook the nonlinear features of small unmanned helicopters [7, 8], resulting in inadequate solutions to their strong coupling issues. To address these limitations, researchers have begun using nonlinear controllers. Most nonlinear control methods heavily depend on the accurate model of the system [9–11]. Therefore, some scholars have recommended model-free control methods. Currently, the prevalent modelindependent control method is Active Disturbance Rejection Control (ADRC) [12], which incorporates an Extended State Observer (ESO) used to estimate unknown models and disturbances. However, Qi et al. [13] found that the ESO structure lacks an integral cascade structure and only employs position (or angle) error information without factoring in velocity (or angular velocity) error information, leading to issues with accuracy when observing unknown models and an inability to estimate convergent perturbations during fast maneuvering. The High-Order Differential Feedback Controller (HODFC) proposed by Qi et al. [14] addresses the limitations of ADRC by providing a model-independent control method. HODFC compensates for unknown components in the system indirectly through controller filtering, even for stronglycoupled nonlinear systems. By adjusting the filter parameters, the compensation accuracy of unknown components can be ensured. Previous research did not specifically design control systems for the model characteristics of small unmanned helicopters, where the outer loop is simple and known, while the inner loop process is nonlinear and severely coupled. 
The innovations and contributions of this paper are as follows. First, in response to the servo and swashplate models that are specific to helicopters, an inverse model control method is proposed. Second, a trajectory tracking control system is designed based on the characteristics of the small unmanned helicopter model. This system is composed of HODFC and the Nonlinear Pole Assignment Controller (NPAC), and its stability is analyzed.
2 Small Unmanned Helicopter Model
2.1 Kinematic and Rigid-Body Dynamic Models
The kinematic part mainly deals with the translational and rotational motions of small unmanned helicopters between the local North East Down (NED) coordinate system and the body coordinate system. The translational and rotational motion can be expressed in the following form [15]:
\[
\dot{\mathbf{P}}_n = \mathbf{R}_{bn}\mathbf{V}_b, \qquad \dot{\boldsymbol{\Theta}} = \mathbf{S}^{-1}\boldsymbol{\omega}_b, \tag{1}
\]
Modeling and High-Order Differential Feedback Control …
where $\mathbf{P}_n = [P_x\ P_y\ P_z]^T$ and $\mathbf{V}_b = [u\ v\ w]^T$ respectively represent the position vector in the local NED coordinate system and the velocity vector in the body coordinate system, and $\boldsymbol{\Theta} = [\phi\ \theta\ \psi]^T$ and $\boldsymbol{\omega}_b = [p\ q\ r]^T$ respectively denote the Euler angles and the angular velocity vector. $\mathbf{R}_{bn}$ in (1) is the rotation matrix from the body coordinate system to the local NED coordinate system. The transformation matrix $\mathbf{S}^{-1}$ in (1) can be expressed as
\[
\mathbf{S}^{-1} =
\begin{bmatrix}
1 & \sin\phi\tan\theta & \cos\phi\tan\theta \\
0 & \cos\phi & -\sin\phi \\
0 & \sin\phi/\cos\theta & \cos\phi/\cos\theta
\end{bmatrix}. \tag{2}
\]
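As a minimal sketch of the rotational part of (1) and of (2), the snippet below builds the transformation matrix and maps body angular rates to Euler angle rates. The function names `euler_rate_matrix` and `euler_rates` are illustrative and do not come from the paper. The matrix is singular at a pitch angle of ±π/2, which this sketch does not guard against.

```python
import math


def euler_rate_matrix(phi, theta):
    """Return the matrix S^{-1} of (2) as a 3x3 nested list (angles in rad)."""
    sp, cp = math.sin(phi), math.cos(phi)
    tt, ct = math.tan(theta), math.cos(theta)
    return [
        [1.0, sp * tt, cp * tt],
        [0.0, cp, -sp],
        [0.0, sp / ct, cp / ct],
    ]


def euler_rates(phi, theta, omega_b):
    """Map body rates [p, q, r] to Euler angle rates [phi_dot, theta_dot, psi_dot]."""
    s_inv = euler_rate_matrix(phi, theta)
    return [sum(s_inv[i][j] * omega_b[j] for j in range(3)) for i in range(3)]
```

At zero roll and pitch the matrix reduces to the identity, so the Euler angle rates equal the body rates.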
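The six-degree-of-freedom rigid body dynamics model of small unmanned helicopters can be represented by the Newton-Euler equations in the following form [15]:
\[
\dot{\mathbf{V}}_b = -\boldsymbol{\omega}_b \times \mathbf{V}_b + \frac{\mathbf{F}_b}{m} + \frac{\mathbf{F}_{b,g}}{m}, \qquad
\dot{\boldsymbol{\omega}}_b = \mathbf{J}^{-1}\left[\mathbf{M}_b - \boldsymbol{\omega}_b \times (\mathbf{J}\boldsymbol{\omega}_b)\right], \tag{3}
\]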
where $m$ denotes the mass of the helicopter, $\mathbf{J} = \mathrm{diag}\{J_{xx}, J_{yy}, J_{zz}\}$ is the inertia matrix of the helicopter, and $\mathbf{F}_{b,g}$ represents the projection of gravity onto the body coordinate system, which can be expressed as
\[
\mathbf{F}_{b,g} =
\begin{bmatrix}
-mg\sin\theta \\
mg\sin\phi\cos\theta \\
mg\cos\phi\cos\theta
\end{bmatrix}. \tag{4}
\]
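The gravity projection in (4) can be sketched in a few lines. The function name `gravity_body` and the default value of the gravitational acceleration (9.81 m/s²) are assumptions made for illustration; the paper does not state the value it uses.

```python
import math


def gravity_body(m, phi, theta, g=9.81):
    """Gravity force of (4) expressed in the body frame (m in kg, angles in rad)."""
    return [
        -m * g * math.sin(theta),
        m * g * math.sin(phi) * math.cos(theta),
        m * g * math.cos(phi) * math.cos(theta),
    ]
```

For any attitude, the norm of the returned vector equals m·g, which is a quick sanity check on the signs and trigonometric terms.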
$\mathbf{F}_b$ and $\mathbf{M}_b$ in (3) respectively denote the force vector and moment vector acting on the small unmanned helicopter, which can be expressed as
\[
\begin{cases}
\mathbf{F}_b =
\begin{bmatrix} F_{bx} \\ F_{by} \\ F_{bz} \end{bmatrix} =
\begin{bmatrix}
X_{mr} + X_{fus} + X_d \\
Y_{mr} + Y_{fus} + Y_{tr} + Y_{vf} + Y_d \\
Z_{mr} + Z_{fus} + Z_{hf} + Z_d
\end{bmatrix}, \\[2ex]
\mathbf{M}_b =
\begin{bmatrix} M_{bl} \\ M_{bm} \\ M_{bn} \end{bmatrix} =
\begin{bmatrix}
L_{mr} + L_{vf} + L_{tr} \\
M_{mr} + M_{hf} \\
N_{mr} + N_{vf} + N_{tr}
\end{bmatrix},
\end{cases} \tag{5}
\]
where $F_{bx}$, $F_{by}$, $F_{bz}$ respectively denote the total forces acting on the small unmanned helicopter along the x, y, z axes, and $M_{bl}$, $M_{bm}$, $M_{bn}$ respectively represent the total moments acting on it around the x, y, z axes. The subscripts $(\cdot)_{mr}$, $(\cdot)_{tr}$, $(\cdot)_{fus}$, $(\cdot)_{vf}$, $(\cdot)_{hf}$ denote the main rotor, tail rotor, fuselage, vertical stabilizer and horizontal stabilizer of the helicopter, which are the five sources of aerodynamic forces and moments. $X_d$, $Y_d$, $Z_d$ in (5) denote the external disturbances acting on the small unmanned helicopter.
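2.2 Component Dynamics Model
Main Rotor and Tail Rotor Dynamics Model. The thrust of the main rotor or tail rotor can be expressed in the following form [16]:
\[
T_{*r} = \left[\delta'_{col/ped}\left(\frac{1}{3} + \frac{\mu_{*r}^2}{2}\right) - \frac{\mu_{z,*r} + \lambda_{*r}}{2}\right]\frac{a_{*r}s_{*r}}{2}\,\rho\,(\Omega_{*r}R_{*r})^2\,\pi R_{*r}^2, \tag{6}
\]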
where $*$ represents either $m$ or $t$. When $*$ represents $m$, the parameters in (6) are for the main rotor; when $*$ represents $t$, the parameters in (6) are for the tail rotor. $\delta'_{col}$, $\delta'_{ped}$, $a_{*r}$, $\rho$, $\Omega_{*r}$, and $R_{*r}$ in (6) respectively represent the collective pitch control of the main rotor, the collective pitch control of the tail rotor, the slope of the lift curve of the rotor, the air density, the rotor speed, and the rotor radius. $s_{*r}$, $\mu_{*r}$, $\mu_{z,*r}$ and $\lambda_{*r}$ respectively represent the rotor solidity, the advance ratio, the advance ratio along the z axis and the inflow ratio. The flapping angle dynamic model of a small unmanned helicopter main rotor can be represented as [16]:
\[
\begin{cases}
\dot{\beta}_{lon} = -q - \dfrac{1}{\tau_{mr}}\beta_{lon} + A_{\beta_{lat}}\beta_{lat} + A_{lon}\delta'_{lon} \\[2ex]
\dot{\beta}_{lat} = -p - \dfrac{1}{\tau_{mr}}\beta_{lat} + B_{\beta_{lon}}\beta_{lon} + B_{lat}\delta'_{lat}
\end{cases}, \tag{7}
\]
where $\beta_{lon}$ and $\beta_{lat}$ respectively represent the longitudinal and lateral flapping angles of the small unmanned helicopter, $\tau_{mr}$ is the main rotor flapping motion time constant, $A_{\beta_{lat}}$ and $B_{\beta_{lon}}$ represent the coupling effects between longitudinal and lateral flapping motions, $A_{lon}$ and $B_{lat}$ are the gains from the longitudinal and lateral cyclic pitch inputs to the flapping angles, and $\delta'_{lon}$ and $\delta'_{lat}$ respectively represent the longitudinal and lateral cyclic pitch inputs of the small unmanned helicopter, which are control input variables. Because of the flapping motion, the thrust generated by the main rotor has components along the x, y, and z axes:
\[
\begin{cases}
X_{mr} = -T_{mr}\sin\beta_{lon}\cos\beta_{lat} \\
Y_{mr} = T_{mr}\sin\beta_{lat}\cos\beta_{lon} \\
Z_{mr} = -T_{mr}\cos\beta_{lon}\cos\beta_{lat}
\end{cases}. \tag{8}
\]
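The flapping dynamics (7) and the force decomposition (8) can be sketched as two small functions. All names and the parameter dictionary keys are hypothetical, and no numerical parameter values are taken from the paper; they have to be supplied by the caller.

```python
import math


def flapping_rates(beta_lon, beta_lat, p, q, d_lon, d_lat, par):
    """Right-hand side of (7); `par` holds tau_mr, A_blat, B_blon, A_lon, B_lat."""
    beta_lon_dot = (-q - beta_lon / par["tau_mr"]
                    + par["A_blat"] * beta_lat + par["A_lon"] * d_lon)
    beta_lat_dot = (-p - beta_lat / par["tau_mr"]
                    + par["B_blon"] * beta_lon + par["B_lat"] * d_lat)
    return beta_lon_dot, beta_lat_dot


def main_rotor_forces(thrust, beta_lon, beta_lat):
    """Body-axis components (X_mr, Y_mr, Z_mr) of the main rotor thrust, as in (8)."""
    x_mr = -thrust * math.sin(beta_lon) * math.cos(beta_lat)
    y_mr = thrust * math.sin(beta_lat) * math.cos(beta_lon)
    z_mr = -thrust * math.cos(beta_lon) * math.cos(beta_lat)
    return x_mr, y_mr, z_mr
```

With zero flapping angles the thrust acts purely along the negative z axis, and a positive longitudinal flapping angle produces a negative x component.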
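The torque components generated by the main rotor can be expressed as:
\[
\begin{cases}
L_{mr} = Y_{mr}H_{mr} + K_{\beta}\beta_{lat} - \dot{L}_{mr}\tau_f\,p \\
M_{mr} = X_{mr}H_{mr} + K_{\beta}\beta_{lon} - \dot{M}_{mr}\tau_f\,q \\
N_{mr} = \rho\,(\Omega_{mr}R_{mr})^2\,\dfrac{b\,c\,R_{mr}^2}{s_{mr}}\,C_Q
\end{cases}, \tag{9}
\]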
where $K_{\beta}$, $H_{mr}$, $\tau_f$ and $C_Q$ respectively represent the elasticity coefficient of the main rotor, the vertical distance between the main rotor hub and the center of gravity of the helicopter, the main rotor time constant, and the torque coefficient. The force generated by the tail rotor acts along the y axis of the body coordinate system, hence the force component generated by the tail rotor is $Y_{tr} = -T_{tr}$. The tail rotor force generates two moments, $L_{tr} = Y_{tr}H_{tr}$ and $N_{tr} = -Y_{tr}D_{tr}$, where $H_{tr}$ and $D_{tr}$ are the vertical and horizontal distances between the tail rotor and the center of gravity of the helicopter, respectively.
Fuselage Dynamics Model. The formulas for calculating the aerodynamic drag in the three directions are given by [16]:
\[
X_{fus} = -\frac{1}{2}\rho S_x u V_{\infty}, \qquad
Y_{fus} = -\frac{1}{2}\rho S_y v V_{\infty}, \qquad
Z_{fus} = -\frac{1}{2}\rho S_z \left(w + v_{i,mr}\right) V_{\infty}, \tag{10}
\]
where $S_x$, $S_y$, $S_z$ are the effective areas of aerodynamic drag along the x, y, z axes of the aircraft coordinate system, respectively. $V_{\infty}$ in (10) represents the combined velocity at the aerodynamic center of the aircraft.
Stabilizer Dynamics Model. The force generated by the horizontal stabilizer can be expressed as [15]
\[
Z_{hf} =
\begin{cases}
-\dfrac{\rho}{2}C_{l\alpha,hf}S_{hf}\,w_{hf}\,|u|, & \left|\dfrac{w_{hf}}{u}\right| \le \tan(\alpha) \\[2ex]
-\dfrac{\rho}{2}S_{hf}\,w_{hf}\,|w_{hf}|, & \left|\dfrac{w_{hf}}{u}\right| > \tan(\alpha)
\end{cases}, \tag{11}
\]
where $C_{l\alpha,hf}$, $S_{hf}$ and $w_{hf}$ are respectively the lift curve slope, the area of the horizontal stabilizer and the local vertical airspeed at the horizontal stabilizer. The moment generated by the horizontal stabilizer is $M_{hf} = Z_{hf}D_{hf}$, where $D_{hf}$ is the horizontal distance between the center of gravity of the helicopter and the horizontal stabilizer. The force $Y_{vf}$ generated by the vertical stabilizer is calculated in a similar way to the force produced by the horizontal stabilizer [15]. Like the tail rotor, the vertical stabilizer generates two moments, $L_{vf}$ and $N_{vf}$.
Servo and Swashplate Model. The servo model of the small unmanned helicopter can be simplified as a first-order inertial element with the transfer function [1] $G(s) = 1/(0.1s + 1)$. There is also a cross swashplate under the main rotor of the small unmanned helicopter as an important actuator. The expressions for the main rotor servo and swashplate model, as well as the tail rotor servo model, are
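\[
\begin{bmatrix} \delta'_{col} \\ \delta'_{lat} \\ \delta'_{lon} \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 1 \\ -1 & 0.5 & 0.5 \\ 1 & 0.5 & -0.5 \end{bmatrix}^{-1}
\begin{bmatrix}
\dfrac{1}{0.1s + 1}\,\delta''_{col} \\[1.5ex]
\dfrac{1}{0.1s + 1}\,\delta''_{lat} \\[1.5ex]
\dfrac{1}{0.1s + 1}\,\delta''_{lon}
\end{bmatrix}, \qquad
\delta'_{ped} = \frac{1}{0.1s + 1}\,\delta''_{ped}, \tag{12}
\]
where $[\delta'_{col}\ \delta'_{lat}\ \delta'_{lon}]^T$ is the output of the swashplate model, which is also the control input of the aerodynamics of the various components of the helicopter; $\delta'_{ped}$ is the output of the tail rotor servo; $[\delta''_{col}\ \delta''_{lat}\ \delta''_{lon}]^T$ and $\delta''_{ped}$ are the control outputs of the servo and swashplate inverse model, and they are also the inputs of the servo and swashplate model.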
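The structure of (12), three first-order servos followed by the inverse of the mixing matrix, can be sketched as a small discrete-time simulation. This is an illustration under stated assumptions: forward-Euler integration with a 1 ms step, constant servo inputs, and hypothetical helper names (`mat_vec`, `inverse_3x3`, `servo_swashplate_response`). Only the matrix entries and the 0.1 s time constant come from the text.

```python
# Mixing matrix from (12); the swashplate output is M^{-1} times the servo outputs.
M = [[1.0, 0.0, 1.0],
     [-1.0, 0.5, 0.5],
     [1.0, 0.5, -0.5]]


def mat_vec(a, v):
    """Multiply a 3x3 nested list by a length-3 vector."""
    return [sum(a[i][j] * v[j] for j in range(3)) for i in range(3)]


def inverse_3x3(a):
    """Invert a 3x3 matrix with the adjugate formula."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = a
    det = (a11 * (a22 * a33 - a23 * a32)
           - a12 * (a21 * a33 - a23 * a31)
           + a13 * (a21 * a32 - a22 * a31))
    adj = [
        [a22 * a33 - a23 * a32, a13 * a32 - a12 * a33, a12 * a23 - a13 * a22],
        [a23 * a31 - a21 * a33, a11 * a33 - a13 * a31, a13 * a21 - a11 * a23],
        [a21 * a32 - a22 * a31, a12 * a31 - a11 * a32, a11 * a22 - a12 * a21],
    ]
    return [[x / det for x in row] for row in adj]


M_INV = inverse_3x3(M)


def servo_swashplate_response(servo_inputs, t_end=1.0, dt=1e-3, tau=0.1):
    """Drive three servos 1/(tau*s + 1) with constant inputs from rest and
    return the swashplate output [col, lat, lon] at time t_end."""
    states = [0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        states = [x + dt * (c - x) / tau for x, c in zip(states, servo_inputs)]
    return mat_vec(M_INV, states)
```

After about ten time constants the servo outputs have settled, so feeding the servos with the mixing matrix applied to a desired swashplate vector returns that vector to within the residual lag.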
3 Design of Trajectory Tracking Control System for Small Unmanned Helicopter
3.1 Overall Design of Control System
The trajectory tracking control system of the small unmanned helicopter is mainly divided into the position control system and the attitude control system. The control system also includes the servo and swashplate inverse model, which effectively reduces the influence of the servo and swashplate on trajectory tracking control. Cascade control is adopted for both the position control system and the attitude control system. In terms of controller selection, the outer loop process of both the position control system and the attitude control system is described explicitly by (1). Therefore, NPAC is selected for the outer loop, where it can fully utilize the known model information to shorten the settling time. The inner loop process is nonlinear and heavily coupled, so the model-independent HODFC is used to compensate for unknown nonlinear models and disturbances through control filtering, thus achieving high-precision control. The control system structure is shown in Fig. 1.
Fig. 1 Internal structure of control system

3.2 Design and Stability Analysis of NPAC
Consider the system to be controlled, $\dot{y} = f(x, t) + b u_o$, where $f(x, t)$ is a known model function. Let the reference signal be $y_r$ and the measured output be $y$. The tracking error is defined as $e_o = y_r - y$, where the subscript $o$ denotes the outer-loop NPAC controller. The tracking error equation can be written as
\[
\dot{e}_o = \dot{y}_r - \dot{y} = -k e_o + \left(k e_o + \dot{y}_r - \left(f(x, t) + b u_o\right)\right). \tag{13}
\]
Setting $k e_o + \dot{y}_r - (f(x, t) + b u_o) = 0$, the tracking error equation becomes $\dot{e}_o = -k e_o$. The characteristic polynomial $s + k$ is a Hurwitz polynomial if and only if $k > 0$. In this case, $\lim_{t\to\infty} e_o = 0$, which means that $\lim_{t\to\infty} y = y_r$. Therefore, the NPAC that ensures asymptotic stability of the closed-loop system is given as
\[
u_o = \frac{1}{b}\left(k e_o + \dot{y}_r - f(x, t)\right), \tag{14}
\]
where $k$ is a parameter chosen by placing the desired pole of the error equation, that is, such that the characteristic polynomial $s + k$ is Hurwitz.
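To illustrate how the control law (14) produces the error dynamics ė_o = −k e_o, the sketch below simulates a scalar plant under NPAC with forward-Euler integration. The plant function, the gain values, the initial condition and the name `simulate_npac` are assumptions chosen for illustration and are not taken from the paper.

```python
import math


def simulate_npac(k=2.0, b=1.5, dt=1e-3, t_end=5.0):
    """Euler simulation of y' = f(y, t) + b*u under the NPAC law (14).
    Returns a list of (t, e_o) samples."""

    def f(y, t):
        # Known model term (illustrative choice, not from the paper).
        return -y ** 3 + math.sin(t)

    y = 1.0  # initial output, so the initial error is -1 for y_r(0) = 0
    history = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        y_ref, y_ref_dot = math.sin(t), math.cos(t)
        e = y_ref - y
        u = (k * e + y_ref_dot - f(y, t)) / b  # control law (14)
        y += dt * (f(y, t) + b * u)            # plant update
        history.append((t, e))
    return history
```

Because the known term f is cancelled exactly, the error decays approximately as exp(−k t) regardless of the nonlinearity; only the integration step introduces a small residual.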
3.3 Design and Stability Analysis of HODFC
The derivative information of the input and output signals is crucial in the system. The high-order differentiator (HOD) designed by Qi et al. [17] can extract the derivatives, up to order n, of a signal whose derivatives are bounded. Based on the derivative information extracted by HOD, Qi et al. [14] designed high-order differential feedback controllers for general affine systems. The small unmanned helicopter system is a MIMO system, which can be regarded as a SISO system for each channel, with the coupling between channels treated as disturbance or uncertainty terms. Therefore, we illustrate the high-order differential feedback controller on a first-order SISO system. Assume that the first-order differential equation of a SISO nonlinear affine system is $\dot{y} = \bar{f}(x, t) + g(x)u_h$, where $y$, $u_h$, and $x = y$ denote the system output, control input, and state, respectively. $\bar{f}(\cdot)$ represents the dynamic characteristics of the system, which are treated as the total disturbance of the system and assumed to be unknown. $g(x)$ is an unknown nonlinear function. The system can be transformed into $\dot{y} = f(x, t) + b u_h$ by setting $f(x, t) = \bar{f}(x, t) + g(x)u_h - b u_h$ in order to separate out the term $b u_h$, where $b$ is the control coefficient that can be set according to control requirements, and the unknown control term $g(x)u_h$ and the separated term $-b u_h$ are combined into the total unknown function or disturbance $f(x, t)$. This is done to minimize the impact of unknown information on the control system. The control block diagram of HODFC is shown in Fig. 2.

Fig. 2 HODFC control block diagram

Let the tracking errors be $e_1 = y_r - y$ and $e_2 = \dot{y}_r - \dot{y}$, so that the tracking error vector is $\bar{e} = [e_1\ e_2]^T$. The tracking error equation can be expressed as
\[
\dot{e}_1 = -k e_1 + \bar{K}\bar{e} + \dot{y} - \left(f(x, t) + b u_h\right), \tag{15}
\]
where $\bar{K} = [k\ \ 1]$. Setting $\bar{K}\bar{e} + \dot{y} - (f(x, t) + b u_h) = 0$, the tracking error equation can be written as $\dot{e}_1 = -k e_1$. When $k$ is greater than zero, $s + k$ is a Hurwitz polynomial. Therefore, $\lim_{t\to\infty} e_1 = 0$, which means that $\lim_{t\to\infty} y = y_r$. From $\bar{K}\bar{e} + \dot{y} - (f(x, t) + b u_h) = 0$, the control input can be obtained as
\[
u_h = \frac{1}{b}\left(\bar{K}\bar{e} + \dot{y} - f(x, t)\right). \tag{16}
\]
However, $f(x, t)$ contains the system model and an unknown function of external disturbances. Therefore, $\dot{y} - f(x, t)$ is unknown. To address this problem, we consider the original system and obtain $\dot{y} - f(x, t) = b u_h$. Substituting it into (16) yields the following expression:
\[
u_h = \frac{1}{b}\bar{K}\bar{e} + u_h. \tag{17}
\]
Obviously, this is unrealizable because $u_h$, the control variable being computed, appears on both sides. However, a filtered signal $\hat{u}_h$ can be used as a substitute for $u_h$: a filtering module is introduced and $\hat{u}_h$ serves as an estimate of $u_h$. Although the current $u_h$ is unknown, its value at the previous moment is known, and $\hat{u}_h$ can be obtained thanks to the lag characteristic of the filter. The filtering module is given as $\hat{u}_h = \lambda u_h/(s + \lambda)$, where $\lambda$ is the filtering factor. When $\lambda \to +\infty$, it satisfies $\lim_{\lambda\to\infty}\hat{u}_h = u_h$, and the controller filtering module is globally asymptotically stable. By replacing $u_h$ with $\hat{u}_h$, the unknown total disturbance $f(x, t)$ in the system is indirectly compensated. The HODFC control law is obtained as follows:
\[
u_h = \frac{1}{b}\bar{K}\bar{e} + \hat{u}_h. \tag{18}
\]
Substituting $\dot{y} = f(x, t) + b u_h$ and (18) into (15) gives $\dot{e}_1 = -k e_1 + b(u_h - \hat{u}_h)$, where $k > 0$ and $\lim_{\lambda\to\infty}\hat{u}_h = u_h$. Therefore, the entire closed-loop control system is asymptotically stable and satisfies $\lim_{t\to\infty}\lim_{\lambda\to\infty} e_1 = 0$ and $\lim_{t\to\infty}\lim_{\lambda\to\infty} y = y_r$. In (18), it is assumed that $\bar{e}$ is known. In practical applications of HODFC, an estimate $\hat{\bar{e}}$ of the error signal $\bar{e}$ can be obtained using HOD. The final HODFC based on HOD is expressed as follows:
\[
u_h = \frac{1}{b}\bar{K}\hat{\bar{e}} + \hat{u}_h. \tag{19}
\]
In practical applications of HODFC, the filtering factor $\lambda$ does not need to be particularly large, and $\lambda \in [5, 50]$ is generally sufficient to meet the filtering requirements. The control coefficient $b$ can be chosen by the designer according to the needs of the control system. HODFC has the following advantages: it does not rely on knowledge of the system model or the disturbances; it stabilizes the closed-loop system; and its control filter compensates for unknown nonlinear models and disturbances.
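The control law (19) together with the first-order control filter can be sketched in discrete time as follows. This is a simplified illustration, not the authors' implementation: a backward finite difference stands in for the HOD, the plant, the reference, the gains (k = 5, λ = 50, b = 2) and the name `simulate_hodfc` are assumptions, and the true input gain of 1 is deliberately smaller than b, which keeps this simple discrete-time loop well behaved.

```python
import math


def simulate_hodfc(k=5.0, lam=50.0, b=2.0, dt=1e-3, t_end=20.0):
    """Euler simulation of the law (19) on a plant whose model the controller
    never sees. Returns the list of tracking errors e1."""
    g_true = 1.0            # true input gain, not used by the controller
    y, y_prev = 0.5, 0.5    # plant output and its previous sample
    u_hat = 0.0             # filter state, estimate of the control input
    errors = []
    for i in range(int(round(t_end / dt))):
        t = i * dt
        y_ref = math.sin(0.5 * t)
        y_ref_dot = 0.5 * math.cos(0.5 * t)
        y_dot_est = (y - y_prev) / dt        # finite difference standing in for HOD
        e1 = y_ref - y
        e2 = y_ref_dot - y_dot_est
        u = (k * e1 + e2) / b + u_hat        # control law (19) with K = [k, 1]
        u_hat += dt * lam * (u - u_hat)      # filter u_hat = lam * u / (s + lam)
        # Unknown dynamics plus disturbance (illustrative choice).
        f_bar = -0.5 * y + math.sin(y) + 2.0 * math.sin(t)
        y_prev = y
        y += dt * (f_bar + g_true * u)
        errors.append(e1)
    return errors
```

A possible usage is `max(abs(e) for e in simulate_hodfc()[10000:])`, which reports the worst tracking error after the initial transient. In this sketch the filter state effectively accumulates the weighted error, which is how the unknown term is compensated without a model.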
3.4 Design of Control Systems for Each Channel
All position and angle loops of the small unmanned helicopter use NPAC as the controller. Here, we take the x-axis position loop as an example to describe the NPAC design process in detail. From (1), the relationship between the position $P_x$ and the velocity $u$ along the x axis is
\[
\begin{aligned}
\dot{P}_x &= u\cos\theta\cos\psi + v(\sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi) + w(\cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi) \\
&= f_{P_x} + u\cos\theta\cos\psi,
\end{aligned} \tag{20}
\]
where $f_{P_x} = v(\sin\phi\sin\theta\cos\psi - \cos\phi\sin\psi) + w(\cos\phi\sin\theta\cos\psi + \sin\phi\sin\psi)$ can be treated as a known model function. Defining the tracking error of the x-axis position as $e_{P_x} = P_{xr} - P_x$, the NPAC control law for the x-axis position loop can be obtained from (14) as follows:
\[
u_r = \frac{1}{\cos\theta\cos\psi}\left(k_{P_x}e_{P_x} + \dot{P}_{xr} - f_{P_x}\right), \tag{21}
\]
where $k_{P_x}$ is the only adjustable control parameter, which can be designed according to the desired pole configuration. As analyzed in Sect. 3.2, $\lim_{t\to\infty} e_{P_x} = 0$, that is, $\lim_{t\to\infty} P_x = P_{xr}$. Because all velocity and angular velocity loops are nonlinear and severely coupled, the model-independent HODFC is used to compensate for unknown nonlinear models and disturbances through control filtering. Here, we take the x channel velocity loop as an example to describe the HODFC design process in detail. For the x channel velocity loop, the extended state estimation vector $\hat{\bar{R}}_{u_r} = [\hat{u}_r\ \ \hat{\dot{u}}_r]^T$ for the reference speed and the extended state estimation vector $\hat{\bar{A}}_u = [\hat{u}\ \ \hat{\dot{u}}]^T$ for the actual speed can be obtained using HOD. Therefore, the observation error expansion vector for the x channel velocity loop is $\hat{\bar{e}}_u = \hat{\bar{R}}_{u_r} - \hat{\bar{A}}_u$. From (19), the control law for the x channel velocity loop can be obtained as follows:
\[
\theta_r = \frac{1}{b_u}\bar{K}_u\hat{\bar{e}}_u + \hat{\theta}_r, \tag{22}
\]
where $\bar{K}_u = [k_u\ \ 1]$ is an adjustable parameter, and $\hat{\theta}_r = \lambda_u\theta_r/(s + \lambda_u)$ is the control filter. As analyzed in Sect. 3.3, $\lim_{t\to\infty}\hat{u} = \hat{u}_r$, that is, $\lim_{t\to\infty} u = u_r$.
3.5 Inverse Models of Servo and Swashplate
The servo and swashplate are important actuating components of small unmanned helicopters. However, once the servo and swashplate models are incorporated into the system, controlling the small unmanned helicopter becomes considerably more difficult. To solve this problem, this paper proposes the concept of inverse model control and incorporates the inverse models of the servo and swashplate into the controller to compensate for their influence on trajectory tracking control. From the servo and swashplate model described in (12), the inverse models of the main rotor servo and swashplate can be derived as follows:
\[
\begin{bmatrix} \delta''_{col} \\ \delta''_{lat} \\ \delta''_{lon} \end{bmatrix} =
\begin{bmatrix} 1 & 0 & 1 \\ -1 & 0.5 & 0.5 \\ 1 & 0.5 & -0.5 \end{bmatrix}
\begin{bmatrix}
0.1\hat{\dot{\delta}}_{col} + \delta_{col} \\
0.1\hat{\dot{\delta}}_{lat} + \delta_{lat} \\
0.1\hat{\dot{\delta}}_{lon} + \delta_{lon}
\end{bmatrix}. \tag{23}
\]
The inverse model of the tail rotor servo is $\delta''_{ped} = 0.1\hat{\dot{\delta}}_{ped} + \delta_{ped}$. Here $\hat{\dot{\delta}}_{col}$, $\hat{\dot{\delta}}_{lat}$, $\hat{\dot{\delta}}_{lon}$, $\hat{\dot{\delta}}_{ped}$ are the derivatives, obtained through HOD, of the outputs $\delta_{col}$, $\delta_{lat}$, $\delta_{lon}$, $\delta_{ped}$ of the position and attitude control system.
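The effect of the inverse model can be illustrated on a single servo channel, such as the tail rotor. The sketch below passes a sinusoidal command through a first-order servo with time constant 0.1 s, once directly and once after the inverse model (0.1 times the command derivative plus the command). It is an illustration under assumptions: the analytic derivative of the command stands in for the HOD estimate, integration is forward Euler, and the function name `max_servo_error` is hypothetical.

```python
import math


def max_servo_error(use_inverse, tau=0.1, dt=1e-3, t_end=10.0):
    """Worst-case gap between the controller output delta(t) = sin(t) and the
    servo output for t >= 2 s, with or without the inverse model."""
    x = 0.0      # servo output
    worst = 0.0
    for i in range(int(round(t_end / dt))):
        t = i * dt
        delta = math.sin(t)
        delta_dot = math.cos(t)   # analytic derivative standing in for HOD
        command = tau * delta_dot + delta if use_inverse else delta
        if t >= 2.0:
            worst = max(worst, abs(x - delta))
        x += dt * (command - x) / tau   # first-order servo 1/(tau*s + 1)
    return worst
```

Comparing `max_servo_error(False)` with `max_servo_error(True)` shows the lag introduced by the servo alone and how the inverse model largely removes it. For the main rotor channels, the inverse model additionally applies the mixing matrix of (23).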
4 Simulation and Analysis of Trajectory Tracking for Small Unmanned Helicopter
The trajectory tracking control simulation for the small unmanned helicopter uses a figure-eight climb as the flight task. In this section, the control system composed of HODFC and NPAC is compared with a PID control system in terms of control performance. In the simulation, the initial position of the small unmanned helicopter is set to (0, 0, 0) m and the initial attitude angle is (0, 0, 0) rad. The external force disturbances on the small unmanned helicopter in the x, y, and z directions are $X_d, Y_d, Z_d = 10\sin(0.5\pi t)$. To achieve the figure-eight climbing maneuver, the reference trajectory of the small unmanned helicopter is $P_{xr} = \sin(0.05\pi t)$, $P_{xr} = \sin(0.05\pi t)$, $P_{zr} = 0.1t$, $\psi_r = 0$. The simulation results of the figure-eight climbing maneuver are shown in Fig. 3. It can be seen from Fig. 3 that the control system composed of HODFC and NPAC has better control performance and higher tracking accuracy.

Fig. 3 Figure-eight climbing simulation results

The control effects and tracking errors of the x and y channels are shown in Fig. 4. The control system composed of HODFC and NPAC achieves high-precision trajectory tracking, while the PID control system exhibits significant deviation and cannot accurately track the reference trajectory. In terms of average error, the tracking errors of the x and y channels are smaller than those of PID by factors of 3.3 and 2.8, respectively. This is because NPAC fully utilizes the known model information, which shortens the settling time, while HODFC uses control filtering to compensate for unknown nonlinear models and disturbances, so that external force disturbances have no significant impact on trajectory tracking control. PID, on the other hand, does not use known model information and adjusts only slowly based on errors, so its anti-interference capability is poor. Therefore, the control system composed of HODFC and NPAC has higher tracking accuracy and stronger anti-interference ability.

Fig. 4 Control effects and tracking errors of the x and y channels

Based on the above analysis, the control system designed in this paper, composed of HODFC and NPAC, effectively overcomes external force disturbances and achieves higher tracking accuracy and faster tracking response than PID.
5 Conclusion
To make the model of the small unmanned helicopter more realistic in simulations, this paper established a multi-body dynamics model and a kinematics model for the small unmanned helicopter. Based on the characteristics of this mathematical model, a trajectory tracking control system composed of HODFC and NPAC was designed. In the design of the controller, inverse model control was adopted to compensate for the effects of the servo and swashplate on trajectory tracking control. Figure-eight climbing trajectory tracking of the small unmanned helicopter under external disturbances was realized in simulation, and the proposed control system was verified and compared with classical PID control. The simulation results show that the trajectory tracking control system composed of HODFC and NPAC has higher tracking accuracy, stronger anti-interference ability, and faster response than PID control.
References
1. Samal, M.K., Garratt, M., Pota, H., Sangani, H.T.: Model predictive attitude control of vario unmanned helicopter. In: IECON 2011-37th Annual Conference of the IEEE Industrial Electronics Society, pp. 622–627. IEEE (2011). https://doi.org/10.1109/IECON.2011.6119382
2. Utkin, V.: Discussion aspects of high-order sliding mode control. IEEE Trans. Autom. Control 61(3), 829–833 (2015). https://doi.org/10.1109/TAC.2015.2450571
3. Mokhtari, S., Abbaspour, A., Yen, K.K., Sargolzaei, A.: Neural network-based active fault-tolerant control design for unmanned helicopter with additive faults. Remote Sens. 13(12), 2396 (2021). https://doi.org/10.3390/rs13122396
4. Shen, S., Xu, J.: Adaptive neural network-based active disturbance rejection flight control of an unmanned helicopter. Aerosp. Sci. Technol. 119, 107062 (2021). https://doi.org/10.1016/j.ast.2021.107062
5. Lai, Y.C., Le, T.Q.: Adaptive learning-based observer with dynamic inversion for the autonomous flight of an unmanned helicopter. IEEE Trans. Aerosp. Electron. Syst. 57(3), 1803–1814 (2021). https://doi.org/10.1109/TAES.2021.3050653
6. Kendoul, F.: Survey of advances in guidance, navigation, and control of unmanned rotorcraft systems. J. Field Robot. 29(2), 315–378 (2012). https://doi.org/10.1002/rob.20414
7. Hu, Y., Yang, Y., Li, S., Zhou, Y.: Fuzzy controller design of micro-unmanned helicopter relying on improved genetic optimization algorithm. Aerosp. Sci. Technol. 98, 105685 (2020). https://doi.org/10.1016/j.ast.2020.105685
8. Budiyono, A., Wibowo, S.S.: Optimal tracking controller design for a small scale helicopter. J. Bionic Eng. 4(4), 271–280 (2007). https://doi.org/10.1016/S1672-6529(07)60041-9
9. Frazzoli, E., Dahleh, M.A., Feron, E.: Trajectory tracking control design for autonomous helicopters using a backstepping algorithm. In: Proceedings of the 2000 American Control Conference, ACC (IEEE Cat. No. 00CH36334), vol. 6, pp. 4102–4107. IEEE (2000). https://doi.org/10.1109/ACC.2000.876993
10. Wang, X., Yu, X., Li, S., Liu, J.: Composite block backstepping trajectory tracking control for disturbed unmanned helicopters. Aerosp. Sci. Technol. 85, 386–398 (2019). https://doi.org/10.1016/j.ast.2018.12.019
11. Ifassiouen, H., Guisser, M., Medromi, H.: Robust nonlinear control of a miniature autonomous helicopter using sliding mode control structure. Int. J. Mech. Mechatron. Eng. 1(2), 63–68 (2007). https://doi.org/10.5281/zenodo.1063252
12. Shao, S., Gao, Z.: On the conditions of exponential stability in active disturbance rejection control based on singular perturbation analysis. Int. J. Control 90(10), 2085–2097 (2017). https://doi.org/10.1080/00207179.2016.1236217
13. Qi, G., Li, X., Chen, Z.: Problems of extended state observer and proposal of compensation function observer for unknown model and application in uav. IEEE Trans. Syst. Man Cybern.: Syst. 52(5), 2899–2910 (2021). https://doi.org/10.1109/TSMC.2021.3054790
14. Qi, G., Chen, Z., Yuan, Z.: Adaptive high order differential feedback control for affine nonlinear system. Chaos Solitons Fractals 37(1), 308–315 (2008). https://doi.org/10.1016/j.chaos.2006.09.027
15. Cai, G., Chen, B.M., Lee, T.H.: Unmanned rotorcraft systems. Springer (2011). https://doi.org/10.1007/978-0-85729-635-1
16. Padfield, G.: The theory and application of flying qualities and simulation modeling. Helicopter Flight Dynamics; AIAA: Education Series, Washington, DC (1996). https://doi.org/10.1002/9780470691847
17. Qi, G., Chen, Z., Yuan, Z.: Model-free control of affine chaotic systems. Phys. Lett. A 344(2–4), 189–202 (2005). https://doi.org/10.1016/j.physleta.2005.06.073
Three-Phase Single-Switch Active Power Factor Corrector Based on UC1854 Changlu Yue, Xu Zhao, Cong Hu, Chunyu Li, and Tao Jiang
Abstract A three-phase active power factor corrector based on the UC1854 chip is designed. It inherits the high reliability of traditional analog control chips and, when applied to a three-phase system, offers a simple topology, a versatile AC input side (single-phase or three-phase), and a high power factor. The main work of this paper is as follows: 1. A three-phase active power factor correction platform based on the UC1854 is built in the PSIM simulation software, and the results show that the design is reasonable and feasible. 2. A prototype is built from the simulation parameters and load-tested. The tests show that the input-side power factor can be raised to 0.96 at the maximum output power of about 1500 W, which is basically consistent with the simulation results and confirms the correctness and reliability of the design.
Keywords Active power factor corrector · PSIM simulation · UC1854
C. Yue (B) · X. Zhao · C. Hu · C. Li · T. Jiang
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_67

1 Introduction
With the development of power electronics technology, inverters, inverter power supplies, high-frequency switching power supplies and various types of special converters have been widely used in many fields of the national economy. However, a large part of these converters need to convert AC power to DC power before they can work properly. Because the uncontrolled diode rectifier circuit is widely used in the conventional conversion process, a large amount of harmonics and reactive power is injected into the power grid, resulting in serious power grid "pollution". The most fundamental measure to combat this grid "pollution" is to improve the input-side power factor so that it is close to 1 [1, 2]. In single-phase active power factor correction circuits, a number of commercially available dedicated analog control chips have been used. For three-phase active power factor correction circuits, there are very few applications of a single analog control chip. Most designs adopt a digital control scheme or a three-phase PFC circuit composed of two or three single-phase PFC circuits controlled by analog chips [3]. The latter approach not only increases the number of devices used in the circuit but also requires an additional parallel current-sharing control circuit, which increases the complexity of the control strategy. This paper proposes using a single UC1854 control chip to implement a three-phase active power factor correction design. The test results show that the design is reasonable and feasible, and that the power factor of the input side is significantly improved.
2 Principle of Operation
After surveying the power topologies of active power factor correction circuits and considering factors such as power level and electromagnetic interference, together with the design targets of this paper, the boost power circuit is selected as the main topology of the three-phase active power factor correction circuit [4]. As shown in Fig. 1, the three-phase 380 V mains passes through an EMI filter, which removes common-mode and differential-mode interference, and then through an uncontrolled rectifier bridge, which performs the AC to DC conversion. The converted DC power is fed to the boost circuit, which performs the power factor correction and steps up the output voltage. Under the average current control strategy, the UC1854 control chip controls the duty cycle of the switch by sampling the output voltage amplitude of the boost circuit, the variation of the input inductor current, and the amplitude of the rectified voltage waveform, so that the output voltage is stabilized and the input current follows the input voltage sinusoidally.

Fig. 1 General block diagram of three-phase active power factor corrector

The average current control strategy is as follows [5]. The boost output voltage is divided by a resistor network, fed to the voltage error amplifier, and compared with the reference source of the UC1854 control chip. The output voltage of the voltage error amplifier is regulated by feedback through the external voltage compensation network of the chip, and it is input to the multiplier together with the sampled value of the rectified voltage waveform. The output of the multiplier is used as the reference current for the current error amplifier and is compared with the sampled value of the inductor current. The output voltage of the current error amplifier is regulated by feedback through the external current compensation network of the chip, which averages the high-frequency component of the inductor current. The amplified average current error is compared with the sawtooth wave generated by the chip oscillator, and the pulse width modulator generates the driving signal that controls the switch.
3 Circuit Design and Main Parameter Calculation
Table 1 lists the design indicators of the three-phase active power factor corrector.
3.1 Boost Inductance Calculation
As the main energy storage element of the boost circuit, the inductor not only transfers energy but also affects the quality of the power factor correction. Its design should therefore be as accurate as possible. Firstly, it is necessary to calculate the maximum time for the inductor to store energy in a switching cycle, that is, the maximum duty cycle of the switch tube, which is calculated by the formula:
Table 1 Three-phase APFC input and output indicators
Number  Indicators               Parameter  Unit  Notes
1       Input voltage (Vin)      342–418    V     Three-phase
2       Input frequency (fin)    45–55      Hz
3       Output voltage (Vout)    615        V
4       Output power (Pout)      1500       W
5       Switch frequency (fsw)   100        kHz
6       Efficiency (η)           ≥ 0.9      -     Full load
7       Power factor (λ)         ≥ 0.9      -     Half load and above
C. Yue et al.
$$D_{MAX} = 1 - \frac{\sqrt{2}\,V_{in(min)}}{V_o} = 0.21 \tag{1}$$
Secondly, the inductor current ripple is calculated. Since the designed inductor current operates in continuous conduction mode (CCM) and the inductor current ripple coefficient is usually taken as 0.4, the calculation formula is:
$$I_{RIPPLE} = 0.4 \cdot I_{in\_pk} = 3.1\ \mathrm{A} \tag{2}$$
The peak value of the input current and the peak value of the inductor current are:
$$I_{in\_pk} = \frac{\sqrt{2}\,P_{out}}{\eta \cdot \lambda \cdot V_{in(min)}} = 7.66\ \mathrm{A} \tag{3}$$
$$I_{L\_PK} = I_{in\_pk} + \frac{I_{RIPPLE}}{2} = 9.21\ \mathrm{A} \tag{4}$$
Finally, the formula for calculating the minimum inductance is:
$$L \ge \frac{\sqrt{2}\,V_{in(min)} \cdot D_{MAX}}{f_{SW} \cdot I_{RIPPLE}} = 327\ \mathrm{\mu H} \tag{5}$$
The final actual value is 450 µH.
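The calculation chain of Eqs. (1) to (5) can be reproduced with a short script. The sketch below uses the Table 1 values; the function and variable names are illustrative and not taken from the paper. Because it carries unrounded intermediate values, its results differ slightly from the rounded figures in the text (the minimum inductance comes out at roughly 337 µH rather than 327 µH), which is still below the 450 µH finally used.

```python
import math


def boost_inductor_design(v_in_min, v_out, p_out, eta, pf, f_sw, ripple_coeff=0.4):
    """Evaluate Eqs. (1)-(5) for the boost inductor (all quantities in SI units)."""
    v_pk = math.sqrt(2) * v_in_min                           # peak of the minimum input voltage
    d_max = 1 - v_pk / v_out                                 # Eq. (1): maximum duty cycle
    i_in_pk = math.sqrt(2) * p_out / (eta * pf * v_in_min)   # Eq. (3): peak input current
    i_ripple = ripple_coeff * i_in_pk                        # Eq. (2): inductor current ripple
    i_l_pk = i_in_pk + i_ripple / 2                          # Eq. (4): peak inductor current
    l_min = v_pk * d_max / (f_sw * i_ripple)                 # Eq. (5): minimum inductance
    return {"d_max": d_max, "i_in_pk": i_in_pk, "i_ripple": i_ripple,
            "i_l_pk": i_l_pk, "l_min": l_min}


design = boost_inductor_design(v_in_min=342, v_out=615, p_out=1500,
                               eta=0.9, pf=0.9, f_sw=100e3)
for name, value in design.items():
    print(f"{name} = {value:.6g}")
```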
3.2 Output Capacitance Calculation
The capacitor provides low-frequency filtering and energy storage in the boost circuit. If the volume and weight budgets allow, the capacitance should be increased to obtain a smooth and stable DC output voltage. To ensure that the voltage drop stays within the allowable range when the load is applied suddenly, it is assumed that the output voltage of the system is not less than 570 V within a period (T_HOLD) after power down. The minimum output capacitance can be calculated by the following formulas:
$$T_{HOLD} = \frac{1}{f_{in(min)}} = 22\ \mathrm{ms} \tag{6}$$
$$C_{OUT} \ge \frac{2 P_{OUT} \cdot T_{HOLD}}{V_{OUT}^2 - V_{HOLD}^2} = 1.2\ \mathrm{mF} \tag{7}$$
In practice, two 820 µF/450 V electrolytic capacitors are connected in series, and three such series strings are connected in parallel, which gives a total capacitance of 1.23 mF.
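As a quick numerical check of Eq. (7) and of the capacitor bank described above, the following sketch evaluates both. The helper names are illustrative, and the hold-up time is taken as the rounded 22 ms of Eq. (6).

```python
def min_output_capacitance(p_out, t_hold, v_out, v_hold):
    """Eq. (7): capacitance needed to supply p_out for t_hold
    while the voltage falls from v_out to v_hold."""
    return 2 * p_out * t_hold / (v_out ** 2 - v_hold ** 2)


def bank_capacitance(c_unit, n_series, n_parallel):
    """Total capacitance of n_parallel strings, each made of n_series identical capacitors."""
    return c_unit / n_series * n_parallel


c_min = min_output_capacitance(p_out=1500, t_hold=22e-3, v_out=615, v_hold=570)
c_bank = bank_capacitance(c_unit=820e-6, n_series=2, n_parallel=3)
print(f"C_min  = {c_min * 1e3:.3f} mF")
print(f"C_bank = {c_bank * 1e3:.3f} mF")
```

With unrounded arithmetic Eq. (7) evaluates to about 1.24 mF, in line with the rounded 1.2 mF quoted in the text, while the 2-series, 3-parallel bank gives 1.23 mF.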
3.3 Freewheeling Diode Selection
The three-phase APFC main circuit adopts the traditional single-switch boost topology. Because the selected output capacitance is large, the freewheeling diode must not only withstand the maximum peak inductor current and the reverse output voltage, but also have a short reverse recovery time. Taking these requirements together, the GeneSiC GC2X10MPS12 silicon carbide Schottky diode is selected in this paper. Its reverse recovery time is essentially zero, which greatly reduces switching losses [6].
3.4 Voltage Compensation Network
To achieve better stability, a double-pole single-zero compensation method (type II compensation network) can be used to offset some of the zeros and poles of the open-loop transfer function. For the network shown in Fig. 2, the transfer function G_c is:
$$G_c = \frac{1 + s R_2 C_1}{s R_1 C_1 \left(1 + s R_2 C_2\right)} \tag{8}$$
where R2 is the compensation network resistance, C1 and C2 are the compensation network capacitors, and R1 is the output voltage divider resistor. In the Bode diagram, the pole and zero frequencies of the transfer function are the corner frequencies of its amplitude-frequency characteristic. Therefore, f_p0 = 1/(2πR1C1), f_p1 = 1/(2πR2C2), and f_z0 = 1/(2πR2C1). In general, the three points are selected on the following principles: the first pole (f_p0) is at the origin; the second pole (f_p1) is used to offset the zero generated by the output capacitor ESR; the zero (f_z0) is placed at 1/5 of the bandwidth to increase the phase angle. After simulation and experimental debugging, R2 = 174 kΩ, C1 = 10 nF and C2 = 47 nF are obtained.
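The corner-frequency expressions and Eq. (8) are easy to evaluate numerically, which can help when iterating on component values. The sketch below is a small helper written for illustration; the value of R1 is not given in the text, so the 1 MΩ used in the example is purely a placeholder assumption.

```python
import cmath
import math


def type2_corner_frequencies(r1, r2, c1, c2):
    """Corner frequencies (Hz) as defined in the text for the type II network."""
    return {
        "f_p0": 1 / (2 * math.pi * r1 * c1),
        "f_z0": 1 / (2 * math.pi * r2 * c1),
        "f_p1": 1 / (2 * math.pi * r2 * c2),
    }


def type2_gain(f, r1, r2, c1, c2):
    """Complex gain of Eq. (8) at frequency f (Hz)."""
    s = 1j * 2 * math.pi * f
    return (1 + s * r2 * c1) / (s * r1 * c1 * (1 + s * r2 * c2))


# R2, C1 and C2 are the values quoted in the text; R1 is an assumed placeholder.
R1, R2, C1, C2 = 1e6, 174e3, 10e-9, 47e-9
print(type2_corner_frequencies(R1, R2, C1, C2))
g = type2_gain(100.0, R1, R2, C1, C2)
print(f"|Gc| = {abs(g):.4g}, phase = {math.degrees(cmath.phase(g)):.1f} deg at 100 Hz")
```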
3.5 Current Compensation Network
In general, the APFC operates in continuous conduction mode, and the current loop compensation network also adopts the double-pole single-zero compensation method, consistent with the voltage loop compensation network. However, because the current signal contains fast-changing components, the passband of the current amplifier is wider [7], so the principles for selecting the zero and the poles are slightly different. Based on the relevant literature, the first pole frequency (f_p0) is 1/10 of the switching frequency, that is, 100 kHz/10 = 10 kHz.
Fig. 2 Type II compensation network
The zero frequency (f_z0) is 0.2 f_c, namely 2 kHz, and the second pole frequency (f_p1) is 10 f_p0, namely 100 kHz. After simulation and experimental debugging, R2 = 20 kΩ, C1 = 620 pF and C2 = 62 pF are obtained.
4 Simulation and Experimentation
In this paper, the three-phase APFC circuit is modeled with Psim software, and simulations are performed with a 1500 W resistive load. Figure 3 shows the output voltage waveform at full load. As can be seen from the figure, the maximum overshoot voltage is about 648 V, and after several oscillations the output finally stabilizes at the design value of about 615 V. Figure 4 compares the three-phase current and three-phase voltage waveforms at full load. As can be seen from the figure, each phase input current follows the phase and period of the corresponding phase input voltage. The simulation results indicate that the circuit design scheme and the related parameters are reasonable and feasible.
Fig. 3 Output voltage waveform at full load
Fig. 4 The comparison waveform of current and voltage at full load
Figures 5 and 6 show the experimental results. Figure 5a shows the output voltage waveform at full load. It can be seen from the figure that the output voltage is stabilized at 615 V, with a fluctuation range of 608–626 V. Figure 5b shows the output voltage waveform when switching from no load to full load. From the diagram, it can be seen that the adjustment time required for the three-phase APFC output voltage to move from the constant voltage in the no-load state to the constant voltage in the full-load state is about 338 ms, and the minimum voltage is about 574 V, which is basically consistent with the calculation results.
Fig. 5 Output voltage waveform in practice
Figure 6a shows the measured data on the power analyzer at an input power of 737 W. When the input line voltage is 381 V and the input line current is 1.194 A, the input power is 737 W, the apparent power is 788 W, and the power factor is 0.9355. Figure 6b shows the measured data on the power analyzer at an input power of 1554 W. When the input line voltage is 381 V and the input line current is 2.45 A, the input power is 1554 W, the apparent power is 1618 W, and the power factor is 0.9604. According to these data, the input power factor of this design is greater than 0.9 at half load and above, which meets the design index and supports the correctness of this design [8].
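The power analyzer readings quoted above can be cross-checked with the standard balanced three-phase relations S = √3·V_line·I_line and λ = P/S. The sketch below assumes a balanced system and uses only the values reported in the text; the function names are illustrative.

```python
import math


def three_phase_apparent_power(v_line, i_line):
    """Apparent power (VA) of a balanced three-phase system from line quantities."""
    return math.sqrt(3) * v_line * i_line


def power_factor(p_active, s_apparent):
    """Power factor as the ratio of active to apparent power."""
    return p_active / s_apparent


# (line voltage V, line current A, measured active power W, measured apparent power VA)
measurements = [(381, 1.194, 737, 788), (381, 2.45, 1554, 1618)]
for v_line, i_line, p_meas, s_meas in measurements:
    s_calc = three_phase_apparent_power(v_line, i_line)
    print(f"S from V and I = {s_calc:.1f} VA (measured {s_meas} VA), "
          f"PF = {power_factor(p_meas, s_meas):.4f}")
```

Both computed apparent powers land within a few volt-amperes of the measured values, and both power factors exceed the 0.9 design target.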
Fig. 6 Measured data on power analyzer
5 Conclusion
In this paper, the following conclusions are obtained through theoretical research, simulation and experiment:
1. A three-phase single-switch active power factor corrector can be realized with the UC1854 control chip. The power factor is greater than 0.9 when the output power is at half load or above, which significantly improves the power factor on the AC input side.
2. The three-phase active power factor correction model based on the UC1854 control chip is built in Psim simulation software. The analysis of the simulation data shows that the design is reasonable and feasible, which is useful for guiding practice.
3. The test prototype developed in this paper operates stably near the rated power of 1500 W and has been successfully applied to a three-phase power supply system. This technology has broad application prospects in small and medium power switching power supplies, uninterruptible power supplies, PWM rectifiers, etc.
References
1. Sehwag, V., Dua, V., Singh, A., Rai, J.N., Shekhar, V.: Power factor correction using apfc panel on different loads. In: 2018 2nd IEEE International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), pp. 73–77. IEEE (2018)
2. Gao, X.: A study of three-phase APFC based on quadratic buck converter. In: IEEE PES Innovative Smart Grid Technologies, pp. 1–4. IEEE (2012)
3. Li-Hua, W., Hui-Zhi, G., Huai-Jian, M., Hong, L.: The APFC system based on an improved duty cycle control algorithm. In: Proceedings of 2011 6th International Forum on Strategic Technology, vol. 1, pp. 570–574. IEEE (2011)
4. Kim, C.J., Jang, J.Y.: Characteristics of a high power factor boost converter with continuous current mode control. KIEE Int. Trans. Electr. Machinery Energy Convers. Syst. 4(2), 65–72 (2004)
5. Su, W., Ming, H.: Research and simulation of active power factor correction. In: 2011 International Conference on Consumer Electronics, Communications and Networks (CECNet), pp. 5156–5158. IEEE (2011)
6. Ben-Yaakov, A., Zeltser, I.: Benefits of silicon carbide schottky diodes in boost APFC operating in CCM. Power Conversion and Intelligent Motion, PCIM-2001, 101–105, Nuremberg (2001)
7. Matada, M., Panda, A.: Simulation of improved dynamic response in active power factor correction converters (2009)
8. Zaidi, M.N., Ali, A.: Power factor improvement using automatic power factor compensation (APFC) device for medical industries in Malaysia. In: MATEC Web of Conferences, vol. 150, p. 01004. EDP Sciences (2018)
Study of Digital Test Method for Cable-Driven Dexterous Hand
Yajing Guo, Fan Yang, Junning Zhang, Bohan Lv, and Si Zeng
Abstract In this paper, we study a digital test method for the cable-driven dexterous hand, construct a physical simulation model of a flexible rope-driven humanoid dexterous hand, and build a co-simulation platform for control algorithm verification. The digital model has sufficient measurement units to acquire, measure, store, and compare design parameters and data faster and more conveniently than the prototype. Consequently, design flaws can be found effectively during the digital test, and the design plan can be corrected quickly, which reduces the probability of redesigning and re-machining parts. At the same time, it can effectively reduce the rework time of mechanical parts and provide strong support for the forward design of the cable-driven humanoid dexterous hand.
Keywords Cable drive · Humanoid dexterous hand · Co-simulation
Y. Guo (B) · F. Yang · J. Zhang · B. Lv · S. Zeng
Beijing Institute of Precision Mechatronics and Controls, Beijing 100076, China
e-mail: [email protected]
Laboratory of Aerospace Servo Actuation and Transmission, Beijing 100076, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_68

1 Introduction
The full-drive humanoid dexterous hand requires a total of 20 flexible ropes (tendons) for the five fingers of one hand; including the two degrees of freedom of the wrist joint, a total of 22 degrees of freedom transmit motion and power with strong coupling relationships, as shown in Fig. 1. A simulation test model that is closer to the real physical environment is created in order to study the digital test method of the rope-driven dexterous hand and to explore system design and simulation verification based on the digital physical simulation model [1], taking into consideration not only the elasticity and force of the flexible rope material itself, but also the friction on the external surface. The structure is highly humanoid [2], and the forward design method is tedious and complicated. Without a digital prototype, tests have to be carried out on the physical prototype, which can damage the prototype and lead to project schedule delay or even project failure [3]. A long development cycle and high risk are not beneficial to parallel research among different teams [4].
Fig. 1 Cable-driven humanoid dexterous hand
To form a simulation test model that is closer to the real physical environment, the elasticity of the flexible rope material and the friction on the external surface are taken into account when building the dynamics model of the full-drive humanoid dexterous hand with a flexible rope drive unit [5, 6]. The simulation model is encapsulated into a Simulink module that retains the common data interface. Building the control algorithm model on the Simulink platform and encapsulating the control algorithm into a Simulink module facilitates rapid replacement of the control algorithm. The virtual simulation test platform supports simulation verification of multiple intelligent algorithms, while the simulation model, as a virtual physical prototype, can verify the accuracy and reliability of the algorithms. In addition, this simulation platform provides efficient, reliable and fast verification of multiple intelligent control algorithms for the full-drive humanoid dexterous hand. Adequate and fast verification of algorithms facilitates task-oriented feasibility analysis, fully verifies the relevant index parameters, aids forward design, improves the reliability of the physical prototype, effectively reduces modification and rework during prototype development, and can significantly shorten the development cycle. The simulation platform also facilitates parallel research among different teams, optimizes the team collaboration mode, and improves development efficiency.
2 Key Technologies for Simulation Modeling
2.1 Cable-Driven Modeling
It is difficult for the motors to drive the joints directly because of the size constraints of the structure. Consequently, a tendon drive is used in the dexterous hand to transmit power from the motors to the appropriate joints. Tendons have the advantage of being flexible and lightweight [7], but there is a complex coupling relationship between the tendons and the finger joints, which also complicates the kinematic relationship of the fingers of the dexterous hand [8].
2.2 Measurement Unit Design
Interaction between the dexterous hand and the outside world requires not only a suitable mechanical mechanism but also the sensors the hand carries. Therefore, the simulation model should be designed with the same measurement units as the sensor configuration of the dexterous hand scheme; at the same time, measurement units for the key technical parameters are also required in order to provide a basis for verification of the control algorithm. The combination of different sensors makes the dexterous hand a closed-loop system, which can judge the state of the hand during operation from the sensor values and thus makes the whole task operation more reliable. Position sensors and force sensors are essential to the sensing system of the dexterous hand. Without them, the dexterous hand can neither acquire its own state nor sense the external environment, and it cannot perform complex operations. During the movement of the dexterous hand, the rotation position of each joint needs to be detected in real time so that the hand can accomplish grasping in various postures. When the dexterous hand performs complex tasks, the information captured by the haptic sensors should also be analyzed.
2.3 Co-Simulation Model
To better realize real-time control of the robot and improve its control accuracy, dynamics simulation alone, or the angular displacement feedback from the position sensors installed at the joints alone, is not enough. It is necessary to analyze in depth, by mathematical methods and from the perspective of robot mechanics, the relationship between the drive force/moment input to the system and the motion of each joint. Therefore, the mechanical structure of the dexterous hand and the degrees of freedom of each joint need to be analyzed and simplified to establish the dynamics model of a single finger. The relationship between the driving torque and the angular displacement, angular velocity and angular acceleration of each finger joint is analyzed mathematically and serves as the theoretical basis of the motion control algorithm, in order to improve the control accuracy of the dexterous hand.
3 Specific Implementation Program
To verify the soundness of the design and to avoid losses of efficiency and effectiveness such as rework, the model-based simulation technology of the flexible cable-driven humanoid dexterous hand was studied by obtaining statistical data of functional tests through simulation experiments throughout the design and development process. A dynamics model of a single finger of the dexterous hand is built, as shown in Fig. 2. Actuation is added to the three rotary joints of the index finger, and the coordinate changes of the fingertip in Cartesian space are used to verify the single-finger simulation (Fig. 3). A Simulink module with the angles of the three joints of the index finger as input and the Cartesian coordinates (x, y, z) as output is created, as shown in Fig. 4.
Fig. 2 Dynamical model
Fig. 3 Simulation validation of index finger dynamics
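The input-output relation of such a module (three joint angles in, fingertip position out) can be illustrated with a textbook forward-kinematics sketch. The version below is deliberately simplified and is not the model used in the paper: it assumes a planar finger with three parallel flexion joints, so it returns only (x, y), and the link lengths are hypothetical values chosen for the example.

```python
import math


def planar_finger_tip(joint_angles_deg, link_lengths):
    """Fingertip position of a planar serial chain (assumed simplification).

    joint_angles_deg: relative joint angles in degrees, base joint first.
    link_lengths: length of each phalanx, in the same order.
    """
    x = y = 0.0
    angle = 0.0
    for q, length in zip(joint_angles_deg, link_lengths):
        angle += math.radians(q)        # joint angles accumulate along the chain
        x += length * math.cos(angle)
        y += length * math.sin(angle)
    return x, y


# Hypothetical phalanx lengths in millimetres.
LINKS = (45.0, 25.0, 20.0)
print(planar_finger_tip((0, 30, 60), LINKS))
```

A full model of the index finger would add the out-of-plane coordinate and the tendon coupling, both of which this sketch leaves out.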
Fig. 4 Packaged Simulink modules
4 Simulation Validation
4.1 Verification of Kinematic Algorithm for Single-Joint Flexible Cable Differential Transmission Mechanism
The simulation platform was applied to verify the kinematics of a single-joint flexible cable differential drive mechanism for the fingers. After 500 groups of experiments at different target angles of the dexterous finger (including common angles and special angles), the correspondence between the displacement of the tendon rope and the actual angle of the dexterous finger was obtained, together with the error between the actual angle and the target angle. The simulation is shown in Fig. 5, and selected experimental results are listed in Table 1. The analysis of the experimental results shows that the average error of the kinematic control of each finger joint angle is 0.96° (Fig. 6).
4.2 Validation of Forward Kinematic Algorithm on the Co-Simulation Platform
To tackle the slow movement and slow response of humanoid dexterous finger joints driven by flexible cables, which are caused by the dead zone of reciprocating motion, the theory of flexible cable differential master-slave push-pull hybrid force-position control was studied. The kinematics and dynamics model of the multi-tendon parallel coupled transmission is established, and an efficient multivariable feedback control law is designed. Precise and stable control of multiple joints is achieved
Fig. 5 Simulation test
Table 1 Correspondence between the displacement of the tendon rope and the target angle of hand fingers
No.  Target angle         Displacement of the tendon rope
     q1    q2    q3       h1         h2        h3         h4
1    20    0     0        2.0018     2.0018    −1.8151    −2.0018
2    20    30    60       −1.6634    5.667     −11.8084   8.9937
3    20    90    90       −8.9937    12.9974   −24.3483   19.9893
4    10    60    30       −6.3606    8.3002    −11.0932   10.0258
5    10    30    45       −2.6954    4.64      −8.9438    8.1932
6    0     0     75       0          0         −8.9762    9.163
7    0     90    0        −10.9956   10.9956   −11.3509   10.9956
8    0     45    45       −5.4978    5.4978    −9.8203    10.9956
9    0     90    90       −10.9956   10.9956   −22.3465   21.9911
10   0     60    85       −7.3304    7.3304    −17.4538   18.326
11   −15   30    45       −5.1389    2.1915    −6.5003    10.6367
12   −15   60    60       −8.8041    5.8567    −12.3149   16.1345
13   0     0     85       0          0         −10.1979   10.3847
14   0     65    85       −7.9412    7.9412    −17.6099   18.326
Fig. 6 Angle of finger movement
Fig. 7 The control block diagram under simulink platform
Fig. 8 Results in Cartesian space coordinates
and strong flexibility and system nonlinearity have been overcome. The problem of rope loosening caused by repeated movements is solved, so that the hand can move stably and flexibly at a speed close to that of manual operation. The physical simulation model of the flexible cable transmission system of the dexterous hand is established in ADAMS, and the dynamics model of the servo motor and the control algorithm model are established in Simulink. Together they form the digital simulation platform of the flexible cable-driven dexterous hand, which supports subsequent research on finger flex control and multi-finger grasp planning algorithms. The control block diagram is shown in Fig. 7. The results of the algorithm solution are compared with the dynamics simulation results in Fig. 8. The blue solid line in the figure is the dynamics simulation result and the red solid line is the algorithm solution; the maximum error is 0.127 mm, which verifies the high accuracy of the forward kinematics algorithm.
5 Conclusion
This paper studies the digital test method of the rope-driven dexterous hand by establishing a simulation test model that is closer to the real physical environment. The model assists forward design, full verification of design parameters, and efficient and rapid verification of intelligent algorithms, specifically in the following aspects:
1. A physical simulation model of a flexible cable-driven humanoid dexterous hand is established and an adequate library of measurement units is designed to fully validate the relevant parameters of the robot, which is conducive to mechanism analysis and adequate verification of design parameters.
2. A co-simulation platform for control algorithm verification is built, which is conducive to efficient and rapid verification of algorithm accuracy and reliability. It forms a test platform for algorithm verification before the prototype is fabricated and assembled, and improves the efficiency of parallel research and verification among different teams.
3. Design defects are found efficiently and design changes are made quickly. For example, inappropriate winding of the flexible cable, which causes the linear rotation range of the joint to be less than 90 deg, was identified. This provides strong support for the forward design of the cable-driven humanoid dexterous hand.
References
1. Zheng, X.H., Liu, X.H., Zhang, L., Sheng-Peng, L.I.: Study on dynamics simulation analysis of humanoid dexterous robot hand based on Adams. In: Manufacturing Automation (2013)
2. Stellin, G., Cappiello, G., Roccella, S., Carrozza, M.C., Becchi, F.: Preliminary design of an anthropomorphic dexterous hand for a 2-years-old humanoid: towards cognition. In: IEEE/RAS-EMBS International Conference on Biomedical Robotics Biomechatronics (2006)
3. Xu, Y., Xu, S., Xu, X., Zhao, C., Yao, A.: Kinematics and grasping analysis of SHU-II five fingers humanoid dexterous hand. Yi Qi Yi Biao Xue Bao/Chin. J. Sci. Instrument 39(9), 30–39 (2018)
4. Zuo, J.Q., Zhang, L., Jin, Z.P., Guan-Nan, X.I., Sun, X.G.: Design of the humanoid dexterous hand of service robots. J. Nantong Vocational Univ. (2014)
5. Lu, Y., Fan, D.: Transmission backlash of precise cable drive system. Proc. Insti. Mech. Eng. Part C J. Mech. Eng. Sci. 227(10), 2256–2267 (2013)
6. Lu, Y.F., Fan, D.P., Liu, H., Hei, M.: Transmission capability of precise cable drive including bending rigidity. Mechanism Mach. Theor. 94, 132–140 (2015)
7. Xie, H.W., Tao, Z., Hou, J.Z., Zhang, W.G.: Experimental research on transmission accuracy of multi-cable drive. Binggong Xuebao/Acta Armamentarii 38(4), 728–734 (2017)
8. Wei, L., Tao, L., Ji, Z.: Design and control of cable-drive parallel robot with 6-dof active wave compensation. In: 2017 3rd International Conference on Control, Automation and Robotics (ICCAR) (2017)
DETR with Recursive Gated Convolution Encoder
Zijian Lin and Junyong Zhai
Abstract This paper proposes a novel approach for improving token interaction modeling in object detection by replacing the self-attention module in the DETR encoder with the Recursive Gated Convolution (gⁿConv) module. Our method uses a fixed order setting for gⁿConv and shares weights between encoder layers to integrate it seamlessly into the transformer encoder. Because it replaces only the conventional attention encoder, our method can be easily integrated into any single-scale DETR-like detector, resulting in higher accuracy with comparable parameters and computational complexity. On the COCO dataset, the proposed GConv encoder achieves a significant improvement (+0.9 AP) on DAB-DETR, demonstrating its effectiveness in single-scale DETR architectures.
Keywords Object detection · DETR · Recursive gated convolution
1 Introduction
In the field of object detection, convolutional neural networks (CNN) are the foundation for modern object detectors, both one-stage and two-stage, that rely on predicting results using a multitude of multi-scale anchors [1, 2]. However, this approach requires the use of multiple manually designed components, such as anchor-box design and non-maximum suppression (NMS) post-processing, resulting in a process that is not fully end-to-end. To overcome these issues, DETR has made significant strides using the transformer as the foundation of its detector, achieving comparable performance with CNN-based detectors [3]. However, DETR has its limitations, such as slow convergence speed and low accuracy for small object detection. Many previous works have attempted to address these issues from various perspectives [4–9], with a consensus that the transformer encoder is responsible for refining the output features from the backbone.

Z. Lin · J. Zhai (B)
School of Automation, Southeast University, Nanjing 210096, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_69
The architecture of transformer encoders has been widely employed in fundamental computer vision backbones and has held a predominant position for several years. More recently, HorNet has been developed, which employs recursive gated convolutions to perform high-order spatial interactions [10]. HorNet combines the feature extraction capabilities of CNNs with the spatial interaction modeling abilities of Transformers, providing a promising alternative to the conventional self-attention transformer encoder in DETR-like detectors. Encouraged by HorNet's success, we reevaluate the role of encoders in DETR-like detectors and replace the conventional self-attention transformer encoder with a HorNet-style encoder to better refine the features. This paper dissects the operational framework of the self-attention module in the encoder of DETR-like detectors. We also introduce a novel recursive gated convolution (GConv) encoder as an alternative to the encoder in DETR-like detectors. By modifying gⁿConv, the GConv encoder integrates seamlessly into DETR and brings marginal improvements in every single-scale DETR variant.
2 Related Work
2.1 DETR-Like Models
DETR uses query and key embeddings of different sequence lengths in the cross-attention module of the transformer decoder, which makes it possible to work with a finite set of learnable object queries, the so-called set prediction. To exploit multi-scale feature maps, Deformable DETR introduces a specific operator that selects the top-K relevant pixels for each feature level and applies deformable attention to obtain a new feature map, reducing the computational complexity from O(L²) to O(KL), with K set to 4 in the original paper [4]. Conditional DETR uses an MLP to map the positional embedding of the object query into the same form as the feature map and feeds the query position to the prediction head as the reference point of the object bounding box [5]. Anchor DETR views the object position query as an anchor point and proposes multiple patterns to handle the case in which one anchor corresponds to multiple objects [6]. DAB-DETR, following the idea of Conditional DETR, treats the object position query as an anchor box and introduces width- and height-modulated cross-attention to make the decoder sensitive to the width and height of the anchor [7]. DN-DETR attributes the slow convergence to the Hungarian matching algorithm and trains the decoder to reconstruct the original objects from noised ground-truth objects, which strengthens the decoder's localization capability.
2.2 HorNet Backbone
With the extensive research on Transformer-style backbones, many combinations of and alternatives to the Transformer have emerged. HorNet uses the Global Filter as its basic processor, which can model the whole feature map with O(L log L) computational complexity [10, 11]. It then applies recursive gated convolution, called gⁿConv in the paper, to obtain high-order interactions between all pixels in the feature map. It has made great progress and outperforms Swin Transformer on the image classification task.
3 GConv DETR
3.1 Revision of Self Attention Encoder
The Transformer architecture has achieved tremendous success in both Natural Language Processing (NLP) and Computer Vision (CV), owing to its remarkable modeling capabilities. A crucial element of the Transformer is the multi-head attention module, as illustrated in Fig. 1a. The attention module first reshapes the (h, w, c) tensor to (hw, c), treating each pixel with shape (1, c) as a token. In this setting the attention module focuses solely on the content information of the features and disregards the positional relationship between feature pixels, which is why positional embedding is a vital component of the Transformer encoder. A scalar formulation is presented in this study to clarify the attention mechanism and the processing of features within the self-attention block. The formulation indexes the channel and the head with c and n, respectively, and the locations with i and j.
Fig. 1 Comparison of attention encoder and our GConv encoder: (a) Attention encoder, (b) GConv encoder, (c) Processor
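The resulting new feature at the ith location in the cth channel can be expressed as follows:
$$x_{new}^{(i,c)} = \sum_{n=1}^{N} \sum_{c' \in \phi_n} \sum_{j \in \Omega_i} w_O^{(c',c,n)} A^{(i,j,n)} v^{(j,c',n)}$$
$$A^{(i,j,n)} = \mathrm{softmax}\left( \sum_{c' \in \phi_n} \frac{q^{(i,c',n)} \, {k^{(j,c',n)}}^{T}}{\sqrt{C/N}} \right) \tag{1}$$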
where q, k and v are the projection outputs, each of which aggregates the original features of all channels at one location; φ_n is the set of channels belonging to attention head n, which has C/N elements; and Ω_i is the set of all mixing targets corresponding to location i, indexed by j.
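To make the index bookkeeping of the scalar formulation concrete, the sketch below evaluates it with explicit loops over locations, heads and channels. It is an illustrative reading of Eq. (1) rather than the authors' code: q, k and v are assumed to be already projected and stored as L × C nested lists, head n is assumed to own the contiguous channel block φ_n, Ω_i is taken to be all L locations, and the output projection `w_o` is a C × C matrix indexed as `w_o[c_prime][c]`.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(q, k, n_heads):
    """Return A[i][n][j], the attention of location i to location j in head n."""
    length, channels = len(q), len(q[0])
    d = channels // n_heads                      # C / N channels per head
    weights = []
    for i in range(length):
        per_head = []
        for n in range(n_heads):
            phi_n = range(n * d, (n + 1) * d)    # channels owned by head n
            scores = [sum(q[i][c] * k[j][c] for c in phi_n) / math.sqrt(d)
                      for j in range(length)]
            per_head.append(softmax(scores))
        weights.append(per_head)
    return weights


def self_attention(q, k, v, w_o, n_heads):
    """Scalar form of multi-head self-attention followed by the output projection."""
    length, channels = len(q), len(q[0])
    d = channels // n_heads
    a = attention_weights(q, k, n_heads)
    x_new = [[0.0] * channels for _ in range(length)]
    for i in range(length):
        for c in range(channels):
            total = 0.0
            for n in range(n_heads):
                for c_prime in range(n * d, (n + 1) * d):
                    for j in range(length):
                        total += w_o[c_prime][c] * a[i][n][j] * v[j][c_prime]
            x_new[i][c] = total
    return x_new
```

The nested loops make the cost visible: every output location sums over all L locations, which is the quadratic term that the GConv encoder is meant to avoid.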
3.2 Recursive Gated Convolution

Taking inspiration from HorNet, the fundamental block in the encoder has been replaced with the g^nConv block. As our model does not require down-sampling, the strategy used in HorNet of adjusting to doubled channels by setting an incremental order has been excluded from our implementation. The GConv encoder module is illustrated in Fig. 1b. The new feature output can be written as

\[
x_{\mathrm{new}}^{(i,c)} = p_n^{(i,c)} = \sum_{c'=1}^{C} w_O^{(c',c)}\, h^{(i,c')}\, f\!\left(x^{(i,c')}\right) \tag{2}
\]
where x represents the feature slice output from the input projection, h represents the last output feature from the recursive gated convolution p_{n−1}, and f denotes the arithmetic operation of the processor, a hybrid module depicted in Fig. 1c that combines Depth-wise Convolution (DWConv) and the Global Filter (GF) to perform information interaction. The formula for the GF can be expressed as

\[
f_{GF} = \mathcal{F}^{-1}\left(w \odot \mathcal{F}(x)\right) \tag{3}
\]

where w is a complex matrix and the symbol ⊙ represents the Hadamard product, which multiplies the matrices element-wise. The notations F and F^{−1} denote the fast Fourier transform (FFT) and the inverse fast Fourier transform (IFFT), respectively. GF uses the FFT to transform features to the discrete frequency domain, learns long-term spatial dependence by multiplying with w, and finally uses the IFFT to transform the features back to the real domain. The use of the FFT increases computational efficiency, as the module costs only 2LC log2 L + LC/2 computation, where L is the length of the feature. Figure 3 visualizes the two final output features of the GConv encoder, which are clearly distinguishable. Notably, the channel weights have been normalized using only raw data, without the aid of auxiliary tools like CAM.
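The Global Filter of Eq. (3) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single 1-D channel whose length is a power of two, uses a small pure-Python radix-2 FFT, and the names `fft`, `ifft` and `global_filter` are hypothetical. A practical model would apply a 2-D FFT from a tensor library to every channel.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT (unnormalized); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(values):
    """Inverse FFT: unnormalized inverse transform divided by the length."""
    n = len(values)
    return [v / n for v in fft(values, inverse=True)]


def global_filter(x, w):
    """Eq. (3) for one 1-D channel: IFFT(w ⊙ FFT(x)).

    x: real-valued feature sequence, w: complex weights of the same length.
    Only the real part is returned, which assumes w keeps the spectrum
    conjugate-symmetric (an assumption of this sketch).
    """
    spectrum = fft(x)
    filtered = [wk * sk for wk, sk in zip(w, spectrum)]
    return [v.real for v in ifft(filtered)]


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]
    # All-ones weights leave the features unchanged.
    print(global_filter(x, [1 + 0j] * 8))
    # Keeping only the zero-frequency bin spreads the mean over every location.
    print(global_filter(x, [1 + 0j] + [0j] * 7))
```

The second call shows why a single frequency-domain weight can mix information from every location, which is the long-term dependence the text refers to.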
DETR with Recursive Gated Convolution Encoder
Fig. 2 Convergence curves of DAB-DETR and DAB-GConv-DETR
In the resulting image, warmer regions correspond to higher weights, while colder regions represent lower weights. The visualization demonstrates that the encoder can effectively extract both content and contour information in the respective layers, thus enhancing the decoder’s ability to detect such features (Fig. 2).
3.3 Complexity

As mentioned earlier, the computation of GF is $2LC\log_2 L + LC/2$, so the GF part of the hybrid processor costs $HWC\log_2(HW) + \frac{1}{4}HWC$ computation in total. On the other hand, the DWConv part of the hybrid processor needs $\frac{1}{2}K^2HWC$ in total. In our implementation, K is 3 and the channel number of the processor input feature is $\frac{7}{4}C$, so that the processor costs $\frac{7}{4}HWC\log_2(HW) + \frac{19}{4}HWC$ computations in total. Taking into account the $HWC$ and $HWC_{in}C_{out}$ computation costs of element-wise multiplication and linear projection, respectively, the total computation of the GConv encoder is $\frac{7}{4}HWC\log_2(HW) + \frac{161}{16}HWC + \frac{29}{8}HWC^2$. As shown in Table 1, the O(n log n) complexity of the GConv encoder compares favorably with the O(n^2) complexity of self attention.
Table 1 The computation and parameter comparison of self-attention module and our GConv encoder module

Module           Computation                                        Parameters
Self attention   4HWC^2 + 2(HW)^2 C                                 4C^2
GConv encoder    (7/4)HWC log2(HW) + (161/16)HWC + (29/8)HWC^2      (29/8)C^2 + (63/8)C + (7/16)HWC
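To make the comparison in Table 1 concrete, the two computation formulas can be evaluated for a given feature size. The sketch below only transcribes the expressions from the table; the function names and the example size (a 32 × 32 feature map with 256 channels) are assumptions chosen for illustration.

```python
import math


def self_attention_cost(h, w, c):
    """Computation of the self-attention module from Table 1: 4HWC^2 + 2(HW)^2 C."""
    n = h * w
    return 4 * n * c ** 2 + 2 * n ** 2 * c


def gconv_encoder_cost(h, w, c):
    """Computation of the GConv encoder from Table 1:
    (7/4)HWC log2(HW) + (161/16)HWC + (29/8)HWC^2."""
    n = h * w
    return 7 / 4 * n * c * math.log2(n) + 161 / 16 * n * c + 29 / 8 * n * c ** 2


if __name__ == "__main__":
    h, w, c = 32, 32, 256  # hypothetical feature-map size
    sa = self_attention_cost(h, w, c)
    gc = gconv_encoder_cost(h, w, c)
    print(f"self attention: {sa:.3e}, GConv encoder: {gc:.3e}, ratio: {sa / gc:.2f}")
```

Because the self-attention cost contains the quadratic term 2(HW)^2 C, the ratio grows as the feature map gets larger, while both modules share a term proportional to HWC^2 from the linear projections.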
3.4 Model Architecture

In contrast to HorNet, which uses pre-norm to enhance data stability prior to achieving long-term dependence, our approach normalizes the features after the residual connections, as in traditional Transformer encoders. Empirically, each feature channel contains information that describes a specific aspect of the target. The responsibility of the processor is to aggregate the appropriate features from all input feature channels into the appropriate channel. In line with this view, the same processor weights are used in every encoder layer, which means that all encoder layers share a single processor. This setting allows the processor weights to receive more training in each epoch, resulting in better performance than non-weight-sharing processors, while also reducing the number of parameters.
4 Experiments

4.1 Setup

To compare with other DETR variants on an equal footing, experiments were conducted on the COCO 2017 detection dataset [12]. Following customary practice, we report the standard mean average precision (AP) on the COCO validation set for various Intersection over Union (IoU) thresholds and object scales. Several ablation experiments were conducted on variants of DAB-DETR, replacing the original attention encoder with a GConv encoder. The feature map is first extracted by the backbone and then refined by our GConv encoder. Finally, as in all DETRs, the decoder uses object queries to locate objects in the refined features and outputs the final set of object boxes. To save training time, training was primarily truncated to 12 epochs on ResNet-50 (R50) [13] when investigating the efficacy of each component. After obtaining the final architecture setting, we compared various DETR versions trained for 50 epochs on both ResNet-50 and ResNet-101 with our improved version. For the learning rate scheduler, an initial learning rate (lr) of 1 × 10^−4 was used, and the lr was reduced by a factor of 0.1 at the 11th epoch when training for 12 epochs and at the 40th epoch when training for 50 epochs. Moreover, we employ the AdamW optimizer with a weight decay of 1 × 10^−4 and train our model on a single Nvidia V100 GPU. Our batch size is set to 16.
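The learning-rate schedule described above can be written as a small step function. This is a sketch under stated assumptions rather than the authors' training code: epochs are assumed to be counted from 1, the drop is assumed to take effect from the named epoch onward, and the function name `learning_rate` is hypothetical.

```python
def learning_rate(epoch, total_epochs, base_lr=1e-4, gamma=0.1):
    """Step schedule: lr is multiplied by gamma from the drop epoch onward.

    The drop epochs (11 for 12-epoch runs, 40 for 50-epoch runs) come from
    the setup description; 1-based epoch counting is an assumption.
    """
    drop_epoch = {12: 11, 50: 40}[total_epochs]
    return base_lr * gamma if epoch >= drop_epoch else base_lr


if __name__ == "__main__":
    for epoch in (1, 10, 11, 12):
        print(epoch, learning_rate(epoch, total_epochs=12))
```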
Table 2 Results for our GConv-DETR and other DETR-like detection models

Model                 #Params   AP            AP50   APS    APM    APL
DAB-DETR-R50          44M       42.2          63.1   21.5   45.7   60.3
DAB-GConvDETR-R50     43M       42.8 (+0.6)   63.5   23.4   46.4   60.8
DAB-DETR-R101         63M       43.5          63.9   23.6   47.3   61.5
DAB-GConvDETR-R101    62M       44.4 (+0.9)   64.8   24.2   48.6   62.2
DN-DETR-R50           44M       44.1          64.4   22.9   48.0   63.4
DN-GConvDETR-R50      43M       44.2 (+0.1)   64.2   23.6   48.1   62.8
DN-DETR-R101          63M       45.2          65.5   24.1   49.1   65.1
DN-GConvDETR-R101     62M       45.5 (+0.3)   65.8   25.3   49.7   64.4
4.2 Comparison with Self-attention DETR

Experiments are conducted to compare the GConv encoder with conventional self-attention encoder models trained for 50 epochs without multiple patterns. The results are summarized in Table 2. DAB-GConvDETR performs better than DAB-DETR with fewer parameters, confirming its effectiveness. Notably, GConv-DETR shows a larger improvement (approximately +2 AP) in small object detection, which has been identified as a weakness of DETR, than in medium and large object detection. However, the comparison between DN-DETR and DN-GConv-DETR reveals only minimal improvement from GConv. Nevertheless, although accuracy for large object detection decreases, there remains a clear gain in accuracy for small object detection. Furthermore, convergence curves are presented in Fig. 2 for models trained on the ResNet-50 backbone without the use of multiple patterns. DAB-GConv-DETR reaches 35 AP in around 17 epochs, while DAB-DETR requires approximately 25 epochs. This indicates that the GConv encoder converges faster than the self-attention encoder. Hence, our GConv-DETR model is better suited to shorter training schedules, such as 12 or 36 epochs, than the conventional attention-based DETR.
4.3 Ablations

Order selection. Table 3 presents the ablations of various order selections for the GConv encoder module without a shared processor. The encoder with order 3 achieves the best performance on every metric except the small object detection AP.
Fig. 3 The visualization of results and encoder output features
Table 3 Comparison of GConv encoder using different orders

#Order   #Params   AP     AP50   APS    APM    APL
1        43.11M    35.9   56.9   17.2   39.1   52.5
2        43.40M    36.1   56.8   17.7   39.8   52.9
3        43.50M    36.2   57.2   17.8   39.9   53.1
4        43.53M    35.8   56.8   18.0   39.2   52.4
5        43.55M    36.0   56.8   17.8   39.3   52.5
6        43.56M    35.7   56.5   17.9   39.2   51.8

Table 4 Comparison of GConv encoders using different processors

Processor       #Params   AP     AP50   APS    APM    APL
DWConv          43.18M    36.4   56.5   17.3   40.1   52.6
Hybrid          43.24M    36.5   56.8   17.9   39.7   53.3
Global filter   43.30M    36.4   57.0   17.7   40.0   52.9
Processor selection. The results of encoders using three different shared processors are presented in Table 4. It can be observed that the hybrid of GF and DWConv outperforms the others slightly. This is mainly due to the fact that DWConv aggregates local information as all convolution networks do, while Global Filter focuses more on long-term information. The hybrid processor employs DWConv in the early stages of recursive gated convolution and Global Filter in the later stages, which gives it the respective advantages of both.
5 Conclusion

This paper analyzes the functionality of the self-attention encoder and proposes replacing it with the Recursive Gated Convolution module as the encoder. The order of the recursive gate is set to a constant 3, and different processors are compared, leading to the selection of a hybrid of depth-wise convolution and Global Filter. Additionally, the weights of the processor are shared across all encoder layers. Finally, GConv-DETR is compared to the original DETR-like detectors that use a self-attention encoder. The results demonstrate that our GConv encoder has a faster convergence rate and better average precision in the 5x (50 epochs) setting with both ResNet-50 and ResNet-101 as backbones. This study confirms the suitability of the GConv encoder for single-scale DETR-like detectors, owing to its fewer parameters and comparable training time. Although we only use the GConv encoder to extract single-scale features, it has the potential to handle multi-scale features with O(L log L) compute complexity. This work is an initial step in applying the GConv encoder to DETR-like detectors, and future work will explore the use of the GConv encoder to extract multi-scale features without increasing computation.
References

1. Girshick, R.: Fast R-CNN. In: Proceedings of IEEE International Conference on Computer Vision (ICCV), pp. 1–9 (2015)
2. Redmon, J., Farhadi, A.: YOLOv3: an incremental improvement, pp. 1–6 (2018). arXiv preprint arXiv:1804.02767
3. Carion, N., Massa, F., et al.: End-to-end object detection with transformers. In: European Conference on Computer Vision (ECCV), pp. 213–229 (2020)
4. Zhu, X., Su, W., Lu, L., et al.: Deformable DETR: deformable transformers for end-to-end object detection. In: 2021 International Conference on Learning Representations (ICLR), pp. 1–16 (2020)
5. Meng, D., Chen, X., Fan, Z., et al.: Conditional DETR for fast training convergence. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pp. 3651–3660 (2021)
6. Wang, Y., Zhang, X., Yang, T., et al.: Anchor DETR: query design for transformer-based detector. In: Proceedings of the AAAI Conference on Artificial Intelligence, pp. 2567–2575 (2022)
7. Liu, S., Li, F., Zhang, H., et al.: DAB-DETR: dynamic anchor boxes are better queries for DETR. In: International Conference on Learning Representations (ICLR), pp. 1–19 (2022)
8. Li, F., Zhang, H., Liu, S., et al.: DN-DETR: accelerate DETR training by introducing query denoising. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 13619–13627 (2022)
9. Zhang, H., Li, F., Liu, S., et al.: DINO: DETR with improved denoising anchor boxes for end-to-end object detection (2022). arXiv preprint arXiv:2203.03605
10. Rao, Y., Zhao, W., Tang, Y., et al.: HorNet: efficient high-order spatial interactions with recursive gated convolutions. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 1–18 (2022)
11. Rao, Y., Zhao, W., Zhu, Z., et al.: Global filter networks for image classification. In: Advances in Neural Information Processing Systems (NeurIPS), pp. 1–19 (2021)
12. Lin, T.-Y., Maire, M., et al.: Microsoft COCO: common objects in context. In: European Conference on Computer Vision (ECCV), pp. 740–755 (2014)
13. He, K., Zhang, X., et al.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778 (2016)
Discrete Input-to-State Stability with Respect to Boundary and Distributed Disturbances for Balance Laws Systems Fatima Zahra Benyoub and Yan Lin
Abstract This paper discusses the input-to-state stability (ISS) notion for a class of partial differential equation systems. Precisely, by means of Lyapunov functional method, we establish the ISS properties for Balance laws systems in L 2 -norm with respect to both in-domain and boundary disturbances. Then, after considering a numerical discretization of the system, we introduce the notion of discrete ISS in 2-norm. Finally, in a numerical sense, stability conditions will be investigated by employing a discrete Lyapunov function … Keywords Input-to-state stability · Discrete stability · Hyperbolic PDE · Balance laws systems · Lyapunov functional · In-domain disturbances · Boundary disturbances
1 Introduction

In the recent past, a fairly complete theory of input-to-state stability has been established for finite-dimensional systems and has been successfully applied in the area of nonlinear control systems. The literature in this field has therefore become quite rich and has been reviewed in many survey papers (e.g., see [1]). More recently, considerable effort has been devoted to extending the ISS theory to infinite-dimensional systems; the main existing results were reviewed in the recent survey [2]. A large part of such systems is governed by partial differential equations (PDEs), and the textbook [3] is the most recent monograph dealing with the concept of ISS for this class in a pedagogical manner. Within the wide area of systems governed by partial differential equations, the present work is concerned with the class of hyperbolic systems of balance laws. These systems provide the mathematical framework to model physical systems with certain conservative properties. For details about the definition, physical origins and examples, one can refer to [4] and [5]; the latter reference also gathers the main tools and results on the solvability and stability of this class of systems, which has recently become an active research field. The input-to-state stability of linear conservation laws and balance laws systems, with respect to boundary or in-domain disturbances, has also been investigated in several recent works (to cite just a few, [6–9]), where the Lyapunov functional method has been applied to assess stability conditions.

Our goal is to establish ISS properties with respect to in-domain and boundary disturbances for the considered continuous system and for its discretization (the latter property is called discrete ISS); the assessment is based on the Lyapunov functional method. The concept of discrete input-to-state stability and stabilization of balance laws systems has been treated only in [8, 9], with respect to boundary disturbances.

The rest of this paper is organized as follows. In Sect. 2, the problem of interest is formulated; then the adopted notion of input-to-state stability, as exponential stability in the presence of disturbances, is presented; finally, after introducing some important technical inequalities, theoretical results are stated for the considered continuous problem. In Sect. 3, we move to the analysis of stability in the discrete sense: first, we discuss the discretization and solvability of the problem; then, we define the ISS property as well as the Lyapunov function in the discrete sense; finally, we state the numerical stability results under appropriate assumptions. Concluding remarks are provided in Sect. 4.

F. Z. Benyoub · Y. Lin (B), School of Automation and Electrical Engineering, Beihang University, Beijing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_70
2 Problem Formulation and Theoretical Results

In the present work, we deal with a class of hyperbolic balance laws systems with spatially varying coefficients governed, in [0, 1] × [0, +∞), by the equation

\[
\partial_t X(z,t) + A(z)\,\partial_z X(z,t) + \Pi(z)X(z,t) = M\delta(z,t) \tag{1}
\]

where A and Π are non-constant matrices in $\mathbb{R}^{n\times n}$, and A is diagonal with entries $\{A_p\}_{p=1,n}$ of the form

\[
A(z) = \begin{bmatrix} A^{+}(z) & 0 \\ 0 & A^{-}(z) \end{bmatrix}, \quad z\in[0,1]
\]

where $A^{+}(z)$ is a positive diagonal matrix in $\mathbb{R}^{m\times m}$ and $A^{-}(z)$ is a negative diagonal matrix in $\mathbb{R}^{(n-m)\times(n-m)}$.
M is a constant matrix of $\mathbb{R}^{n\times d_1}$ and $\delta : [0,1]\times\mathbb{R}^{+}\to\mathbb{R}^{d_1}$ is the vector of disturbances distributed over the domain. Using the form of matrix A, the state vector X is then written

\[
X = \begin{bmatrix} X^{+} \\ X^{-} \end{bmatrix}, \quad\text{where } X^{+}\in\mathbb{R}^{m},\ X^{-}\in\mathbb{R}^{n-m}
\]

Problem (1) is supplemented by the initial condition

\[
X(z,0) = X_0(z), \quad z\in(0,1) \tag{2}
\]

for a given function $X_0$, and the boundary conditions

\[
\begin{bmatrix} X^{+}(0,t) \\ X^{-}(1,t) \end{bmatrix} = K\begin{bmatrix} X^{+}(1,t) \\ X^{-}(0,t) \end{bmatrix} + N d(t) \tag{3}
\]

where K is a constant matrix of $\mathbb{R}^{n\times n}$ having the form

\[
K = \begin{bmatrix} 0 & K^{-} \\ K^{+} & 0 \end{bmatrix}
\]

N is a constant matrix of $\mathbb{R}^{n\times d_2}$, and $d:\mathbb{R}^{+}\to\mathbb{R}^{d_2}$ is the vector of boundary disturbances. We assume that the initial condition satisfies the compatibility condition

\[
\begin{bmatrix} X^{+}(0,0) \\ X^{-}(1,0) \end{bmatrix} = K\begin{bmatrix} X^{+}(1,0) \\ X^{-}(0,0) \end{bmatrix} \tag{4}
\]

Under suitable regularity assumptions on the flux matrix A, the disturbance vector δ and the initial data $X_0$, as established in [10], the solution of our problem exists and is unique.
2.1 Stability Results

The first result of this work, stated in Theorem 1, deals with the ISS properties of system (1) with boundary conditions (3) in the sense of the definition below.

Definition 1 ($L^2$-Input-to-state stability) System (1) is said to be input-to-state stable in $L^2$-norm w.r.t. boundary disturbances d and in-domain disturbances δ if there exist positive real constants a, b, $c_1$ and $c_2$ such that for any initial condition $X_0\in L^2([0,1],\mathbb{R}^{n})$ and for all t > 0:

\[
\|X(\cdot,t)\|_{L^2([0,1],\mathbb{R}^n)} \le a\,e^{-bt}\,\|X_0\|_{L^2([0,1],\mathbb{R}^n)} + c_1\sup_{s\in(0,t)}\|\delta(\cdot,s)\|_{L^2([0,1],\mathbb{R}^{d_1})} + c_2\sup_{s\in(0,t)}|d(s)|
\]
For the considered problem we make the following assumptions:

Assumption 1 There exists a diagonal, spatially varying, continuously differentiable, positive definite matrix P such that

• the matrix

\[
\Phi_1 := \begin{bmatrix} A^{+}(1)P^{+}(1) & 0 \\ 0 & -A^{-}(0)P^{-}(0) \end{bmatrix} - 2K^{T}\begin{bmatrix} A^{+}(0)P^{+}(0) & 0 \\ 0 & -A^{-}(1)P^{-}(1) \end{bmatrix}K \tag{5}
\]

is positive semi-definite.

• the matrix

\[
\Phi_2 := P'A + A'P - \Pi^{T}P - P\Pi + P \tag{6}
\]

is negative definite, where the prime denotes the derivative with respect to z. One denotes by $\varphi_{\inf}$ and $\varphi_{\sup}$ the smallest infimum and the largest supremum of the eigenvalues of $\Phi_2$, respectively.

• Let $\nu_1$ and $\nu_2$ be the largest eigenvalues of

\[
N^{T}\begin{bmatrix} A^{+}(0)P^{+}(0) & 0 \\ 0 & -A^{-}(1)P^{-}(1) \end{bmatrix}N \quad\text{and}\quad M^{T}PM
\]

respectively.

Aiming to assess ISS properties, we use the Lyapunov functional defined in $L^2([0,1],\mathbb{R}^n)$ by

\[
\mathcal{L}(X(\cdot,t)) = \int_0^1 X(z,t)^{T}P(z)X(z,t)\,dz \tag{7}
\]
where P satisfies the assumptions on matrices (5) and (6). For the same aim, the following preliminary inequalities are needed.

Proposition 1 (Gronwall's inequality [11]) Suppose that $y:\mathbb{R}^{+}\to\mathbb{R}^{+}$ is absolutely continuous on [0, T] for any T > 0 and satisfies, for a.e. t > 0, the following differential inequality

\[
\frac{dy}{dt} \le g(t)y(t) + h(t)
\]

where $g, h\in L^1([0,T];\mathbb{R})$ for any T > 0. Then for all $t\in\mathbb{R}^{+}$,

\[
y(t) \le y(0)\,e^{\int_0^t g(s)\,ds} + \int_0^t h(s)\,e^{\int_s^t g(\tau)\,d\tau}\,ds
\]

In the present study, we adopt the particular version where the functions g and h are constants.

Proposition 2 (Young's inequality) Let y and z be two n-vectors; then for any positive semi-definite matrix B

\[
\pm 2y^{T}Bz \le y^{T}By + z^{T}Bz
\]

Proposition 3 Let $P_{\inf}$ and $P_{\sup}$ be such that

\[
P_{\inf} = \min_{i=0,n}\ \inf_{z\in[0,1]} P_i(z), \qquad P_{\sup} = \max_{i=0,n}\ \sup_{z\in[0,1]} P_i(z)
\]

where $(P_i)_{i=0}^{n}$ are the diagonal entries of P; then for any n-vector Y

\[
P_{\inf}\int_0^1 |Y|^2 \le \mathcal{L}(Y) \le P_{\sup}\int_0^1 |Y|^2 \tag{8}
\]

Proposition 4 For any n-vector Y, there exists a negative constant β such that

\[
Y^{T}\Phi_2 Y \le \beta\,Y^{T}PY \quad\text{with}\quad \beta = \frac{\varphi_{\sup}}{P_{\sup}}
\]

Now, the stability result is stated as follows.

Theorem 1 (ISS) Under Assumption 1, system (1) with boundary conditions (3) and initial condition (2) is input-to-state stable in $L^2$-norm w.r.t. boundary disturbances d and in-domain disturbances δ, with the following estimate

\[
\|X(\cdot,t)\|^2_{L^2([0,1],\mathbb{R}^n)} \le \alpha\,e^{-\beta t}\,\|X_0\|^2_{L^2([0,1],\mathbb{R}^n)} + \gamma\sup_{s\in(0,t)}\|\delta(\cdot,s)\|^2_{L^2([0,1],\mathbb{R}^{d_1})} + \eta\sup_{s\in(0,t)}|d(s)|^2 \tag{9}
\]

where $\alpha = \frac{P_{\sup}}{P_{\inf}}$, $\gamma = \frac{2\nu_1}{\beta}$ and $\eta = \frac{\nu_2}{\beta}$.
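Proof For simplicity, in this section, we write $X_i = X(i,t)$, $P_i = P(i)$ and $A_i = A(i)$ for i = 0, 1. The first step is to analyze the time derivative of the Lyapunov functional (7), which gives

\[
\frac{d\mathcal{L}}{dt}(X(\cdot,t)) = -\left[X^{T}PAX\right]_0^1 + \int_0^1 X^{T}\left(P'A + A'P - \Pi^{T}P - P\Pi\right)X\,dz + 2\int_0^1 X^{T}PM\delta\,dz \tag{10}
\]

After some plain calculations on the first term of the right-hand side (RHS), combined with boundary conditions (3), we obtain

\[
\begin{aligned}
-\left[X^{T}PAX\right]_0^1
={}& \left(K\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix} + Nd(t)\right)^{T}
\begin{bmatrix} P_0^{+}A_0^{+} & 0 \\ 0 & -P_1^{-}A_1^{-} \end{bmatrix}
\left(K\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix} + Nd(t)\right)
- \begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}^{T}
\begin{bmatrix} P_1^{+}A_1^{+} & 0 \\ 0 & -P_0^{-}A_0^{-} \end{bmatrix}
\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix} \\
={}& \begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}^{T} K^{T}
\begin{bmatrix} P_0^{+}A_0^{+} & 0 \\ 0 & -P_1^{-}A_1^{-} \end{bmatrix}
K \begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}
+ d(t)^{T}N^{T}\begin{bmatrix} P_0^{+}A_0^{+} & 0 \\ 0 & -P_1^{-}A_1^{-} \end{bmatrix}N d(t) \\
&+ 2\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}^{T} K^{T}
\begin{bmatrix} P_0^{+}A_0^{+} & 0 \\ 0 & -P_1^{-}A_1^{-} \end{bmatrix} N d(t)
- \begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}^{T}
\begin{bmatrix} P_1^{+}A_1^{+} & 0 \\ 0 & -P_0^{-}A_0^{-} \end{bmatrix}
\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}
\end{aligned}
\]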
Consequently, by Young's inequality (Proposition 2), we get

\[
-\left[X^{T}PAX\right]_0^1 \le -\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix}^{T}\Phi_1\begin{bmatrix} X_1^{+} \\ X_0^{-} \end{bmatrix} + 2\,d(t)^{T}N^{T}\begin{bmatrix} P_0^{+}A_0^{+} & 0 \\ 0 & -P_1^{-}A_1^{-} \end{bmatrix}N d(t)
\]

Using the assumption on matrix $\Phi_1$ and the fact that $\nu_1$ is the largest eigenvalue of the matrix in the second term of the RHS, we obtain the upper bound

\[
-\left[X^{T}PAX\right]_0^1 \le 2\nu_1\sup_{0\le s\le t}|d(s)|^2 \tag{11}
\]

Inserting inequality (11) in (10) and using Young's inequality (Proposition 2), we obtain

\[
\frac{d\mathcal{L}}{dt}(X(\cdot,t)) \le \int_0^1 X^{T}\Phi_2 X + \int_0^1 \delta^{T}M^{T}PM\delta + 2\nu_1\sup_{0\le s\le t}|d(s)|^2 \tag{12}
\]

By definition of β and $\nu_2$, it follows that

\[
\frac{d\mathcal{L}}{dt}(X(\cdot,t)) \le \beta\,\mathcal{L}(X(\cdot,t)) + \nu_2\sup_{0\le s\le t}\|\delta(\cdot,s)\|^2 + 2\nu_1\sup_{0\le s\le t}|d(s)|^2 \tag{13}
\]

Finally, by Gronwall's inequality (Proposition 1) together with Proposition 3, the desired result (9) follows.
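The last step can be spelled out as a short worked derivation. This is a sketch, not part of the original proof: it writes $y(t) = \mathcal{L}(X(\cdot,t))$, takes β < 0 as in Proposition 4, and introduces the symbol H (an assumption of this illustration) for the sum of the two disturbance terms in (13), which is constant in s on [0, t].

```latex
\begin{aligned}
y(t) &\le y(0)\,e^{\beta t} + H\int_0^t e^{\beta (t-s)}\,ds
      = y(0)\,e^{\beta t} + \frac{H}{|\beta|}\left(1 - e^{\beta t}\right)
      \le y(0)\,e^{\beta t} + \frac{H}{|\beta|}, \\
P_{\inf}\,\|X(\cdot,t)\|^2_{L^2} &\le y(t), \qquad
y(0) \le P_{\sup}\,\|X_0\|^2_{L^2} \quad \text{(Proposition 3)}, \\
\|X(\cdot,t)\|^2_{L^2} &\le \frac{P_{\sup}}{P_{\inf}}\,e^{\beta t}\,\|X_0\|^2_{L^2}
      + \frac{H}{|\beta|\,P_{\inf}} .
\end{aligned}
```

The first factor reproduces $\alpha = P_{\sup}/P_{\inf}$ in (9), and the exponential term decays because β is negative.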
3 Discretization and Numerical Stability Results

Before addressing discrete stability issues, let us first discuss a numerical discretization of Eq. (1). Firstly, a fractional-step method (also called operator splitting method) is applied by splitting the main equation into two sub-problems:

\[
\text{Pb. A:}\quad \partial_t X(z,t) + A(z)\,\partial_z X(z,t) = 0 \tag{14a}
\]

\[
\text{Pb. B:}\quad \partial_t X(z,t) = -\Pi(z)X(z,t) + M\delta(z,t) \tag{14b}
\]

More details about the analysis of the method and the discretization below can be found in [4, Chap. 17].
Then, in the domain (0, 1) × $\mathbb{R}^{+}$, we introduce the equidistant grid $(z_{i-\frac12}, t_j)$ for i = 0, N and j ∈ ℕ with space and time step sizes Δz and Δt, respectively, satisfying the CFL condition:

\[
\frac{\Delta t}{\Delta z}\,\max_{1\le p\le n}\ \sup_{z\in[0,1]}|A_p(z)| < 1 \tag{15}
\]

The grid points are denoted by

\[
z_{i-\frac12} = i\,\Delta z, \quad i = 0, N \tag{16}
\]

and

\[
t_j := j\,\Delta t, \quad j\in\mathbb{N} \tag{17}
\]

Over one time step, we use the upwind method for the discretization of Pb. A and the forward Euler method for Pb. B; the solution at the grid point $(z_{i-\frac12}, t_j)$ is then approximated by

\[
X_i^{j} = \frac{1}{\Delta z}\int_{z_{i-\frac12}}^{z_{i+\frac12}} X(z,t_j)\,dz, \quad i = 0, N,\ j\in\mathbb{N} \tag{18}
\]

As a result, the discretization of the split system (14) has the following two-step structure:

Step A:

\[
\begin{bmatrix} X_i^{+,*} \\ X_i^{-,*} \end{bmatrix}
= \begin{bmatrix} X_i^{+,j} \\ X_i^{-,j} \end{bmatrix}
- \frac{\Delta t}{\Delta z}\begin{bmatrix} A^{+}_{i-1} & 0 \\ 0 & A^{-}_{i+1} \end{bmatrix}
\begin{bmatrix} X_i^{+,j} - X_{i-1}^{+,j} \\ X_{i+1}^{-,j} - X_i^{-,j} \end{bmatrix} \tag{19a}
\]

Step B:

\[
\begin{bmatrix} X_i^{+,j+1} \\ X_i^{-,j+1} \end{bmatrix}
= \begin{bmatrix} X_i^{+,*} \\ X_i^{-,*} \end{bmatrix}
- \Delta t\,\Pi_i\begin{bmatrix} X_i^{+,*} \\ X_i^{-,*} \end{bmatrix}
+ \Delta t\,M\delta_i^{j} \tag{19b}
\]

Therefore, the conditions (2), (3) and (4) are discretized, respectively, by the following equations:

\[
X_i^{0} = X_{0,i}, \quad i = 0, N-1 \tag{20}
\]

\[
\begin{bmatrix} X_{-1}^{+,j} \\ X_{N}^{-,j} \end{bmatrix}
= K\begin{bmatrix} X_{N-1}^{+,j} \\ X_{0}^{-,j} \end{bmatrix} + N d^{j}, \quad j\in\mathbb{N} \tag{21}
\]

\[
\begin{bmatrix} X_{-1}^{+,0} \\ X_{N}^{-,0} \end{bmatrix}
= K\begin{bmatrix} X_{N-1}^{+,0} \\ X_{0}^{-,0} \end{bmatrix} \tag{22}
\]
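The two-step scheme (19) with the ghost-cell boundary condition (21) can be sketched in a few lines. The code below is an illustration under simplifying assumptions, not the authors' implementation: it takes n = 2 and m = 1 (one right-going and one left-going component), constant speeds instead of spatially varying ones, scalar boundary gains, and a disturbance that is the same in every cell. All function and parameter names are hypothetical.

```python
def cfl_ok(dt, dz, speeds):
    """CFL condition (15): dt/dz * max |A_p| < 1."""
    return dt / dz * max(abs(a) for a in speeds) < 1.0


def split_step(xp, xm, dt, dz, a_plus, a_minus, pi, k_minus, k_plus,
               m_delta=(0.0, 0.0), n_d=(0.0, 0.0)):
    """One time step of the split scheme for n = 2, m = 1.

    xp, xm          : cell values of X^+ and X^- (lists of length N)
    a_plus, a_minus : constant speeds, a_plus > 0 and a_minus < 0
    pi              : 2x2 source matrix Pi as nested tuples
    k_minus, k_plus : scalar entries of K = [[0, K^-], [K^+, 0]]
    m_delta         : the product M*delta (identical in every cell here)
    n_d             : the product N*d at the current time level
    """
    n = len(xp)
    c = dt / dz
    # Ghost cells from the discretized boundary condition (21).
    xp_ext = [k_minus * xm[0] + n_d[0]] + list(xp)    # prepend X^+_{-1}
    xm_ext = list(xm) + [k_plus * xp[-1] + n_d[1]]    # append  X^-_{N}
    # Step A (19a): upwind differences, backward for X^+ and forward for X^-.
    xp_star = [xp_ext[i + 1] - c * a_plus * (xp_ext[i + 1] - xp_ext[i])
               for i in range(n)]
    xm_star = [xm_ext[i] - c * a_minus * (xm_ext[i + 1] - xm_ext[i])
               for i in range(n)]
    # Step B (19b): forward Euler on the source and disturbance terms.
    xp_new = [p - dt * (pi[0][0] * p + pi[0][1] * q) + dt * m_delta[0]
              for p, q in zip(xp_star, xm_star)]
    xm_new = [q - dt * (pi[1][0] * p + pi[1][1] * q) + dt * m_delta[1]
              for p, q in zip(xp_star, xm_star)]
    return xp_new, xm_new


def norm_sq(xp, xm, dz):
    """Squared discrete 2-norm: dz * sum_i X_i^T X_i."""
    return dz * sum(p * p + q * q for p, q in zip(xp, xm))


if __name__ == "__main__":
    cells, dz, dt = 50, 1.0 / 50, 0.01
    assert cfl_ok(dt, dz, [1.0, -1.0])
    xp, xm = [1.0] * cells, [1.0] * cells
    damping = ((0.5, 0.0), (0.0, 0.5))  # hypothetical dissipative Pi
    print("initial norm^2:", norm_sq(xp, xm, dz))
    for _ in range(100):
        xp, xm = split_step(xp, xm, dt, dz, 1.0, -1.0, damping, 0.0, 0.0)
    print("final norm^2:", norm_sq(xp, xm, dz))
```

With zero boundary gains and no disturbances, each upwind update is a convex combination of neighbouring cells under the CFL condition, so the discrete norm cannot grow in Step A, and the dissipative source term shrinks it further in Step B.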
At this stage, we are able to discuss the main result of this work: the discrete ISS according to the definitions below. Note that the analysis is made in the 2-norm, defined for any spatially discretized n-vector X by

\[
\|X\|^2 = \Delta z\sum_{i=0}^{N-1} X_i^{T}X_i
\]

Definition 2 (Discrete ISS) The discretized system (19) is said to be input-to-state stable in 2-norm with respect to boundary disturbances $\{d^{j}\}_{j=0,1,\dots}$ and in-domain disturbances $\{\delta_i^{j}\}_{i=0,N-1;\ j=0,1,\dots}$ if there exist positive constants α, β, γ and η such that for any initial condition $\{X_i^{0}\}_{i=0,N-1}$ and for all j ∈ ℕ

\[
\|X^{j}\| \le \alpha\,e^{-\beta j\Delta t}\,\|X^{0}\| + \gamma\max_{0\le s\le j}\|\delta^{s}\| + \eta\max_{0\le s\le j}\left(|d^{s}|\right) \tag{23}
\]

Definition 3 (Discrete Lyapunov function) For j ∈ ℕ, let us define

\[
\mathcal{L}^{j} = \Delta z\sum_{i=0}^{N-1} X_i^{jT}P_iX_i^{j}
\]

for weight functions

\[
P_i = \begin{bmatrix} P_i^{+} & 0 \\ 0 & P_i^{-} \end{bmatrix}, \quad i = 0, N-1, \qquad P_i^{+}\in\mathbb{R}_{+}^{m\times m},\ P_i^{-}\in\mathbb{R}_{+}^{(n-m)\times(n-m)}
\]

Then $\{\mathcal{L}^{j}\}_{j\in\mathbb{N}}$ is said to be a discrete Lyapunov function for system (19) with respect to disturbances $\{\delta_i^{j}\}_{i=0,N-1;\ j=0,1,\dots}$ if there exist positive scalars η, θ and σ such that

\[
\frac{\mathcal{L}^{j+1}-\mathcal{L}^{j}}{\Delta t} \le -\eta\,\mathcal{L}^{j} + \theta\max_{0\le s\le j}\|\delta^{s}\| + \sigma\max_{0\le s\le j}\left(|d^{s}|\right) \tag{24}
\]
The left side of (24) is seen as an approximate time derivative of the Lyapunov functional candidate (7). The following inequality is essential to establish the Discrete-ISS properties stated in Theorem 2.
Proposition 5 ([8]) Let a > 0 and z ∈ ℝ, and suppose that the discrete function $y^{n}$, n = 0, N − 1, satisfies

\[
\frac{y^{n+1}-y^{n}}{\Delta t} \le -a\,y^{n} + z
\]

then

\[
y^{n+1} \le \left(y^{0} - \frac{z}{a}\right)(1 - a\Delta t)^{n+1} + \frac{z}{a}
\]

for Δt small enough that 0 < 1 − aΔt < 1.

We introduce the following assumptions:

Assumptions: There exists a set of diagonal positive definite matrices $\{P_i\}_{i=-1,N}$ such that:

• The matrix

\[
\chi := \begin{bmatrix} A^{+}_{N-1}P^{+}_{N} & 0 \\ 0 & -A^{-}_{0}P^{-}_{-1} \end{bmatrix} - 2K^{T}\begin{bmatrix} A^{+}_{-1}P^{+}_{0} & 0 \\ 0 & -A^{-}_{N}P^{-}_{N-1} \end{bmatrix}K \tag{25}
\]

is positive semi-definite.

• For i = 0, N − 1, the matrices

\[
\Theta_i = P_i + 2\Delta t\,\Pi_i^{T}P_i\Pi_i - \Pi_i^{T}P_i - P_i\Pi_i \tag{26}
\]

and

\[
\Psi_i := \begin{bmatrix} A^{+}_{i}P^{+}_{i+1} - A^{+}_{i-1}P^{+}_{i} & 0 \\ 0 & A^{-}_{i+1}P^{-}_{i} - A^{-}_{i}P^{-}_{i-1} \end{bmatrix} \tag{27}
\]

are negative semi-definite.

Theorem 2 (Discrete ISS) Let the CFL condition hold. Then, under the assumptions above, the discretized system (19) with boundary conditions (21) and initial condition (20) is input-to-state stable in 2-norm w.r.t. boundary disturbances $\{d^{j}\}_{j=0,1,\dots}$ and distributed disturbances $\{\delta_i^{j}\}_{i=0,N-1;\ j=0,1,\dots}$.
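Sketch of the proof: In order to analyze the left side of (24) and bound it from above, we introduce the term $\mathcal{L}^{*}$ defined by

\[
\mathcal{L}^{*} = \Delta z\sum_{i=0}^{N-1} X_i^{*T}P_iX_i^{*}
\]

Firstly, we refer to the discrete system (19b) to obtain

\[
\frac{\mathcal{L}^{j+1}-\mathcal{L}^{*}}{\Delta t} = \frac{\Delta z}{\Delta t}\sum_{i=0}^{N-1}\Big[ X_i^{*T}\left(\Delta t\,\Pi_i^{T}P_i\Pi_i - \Pi_i^{T}P_i - P_i\Pi_i\right)X_i^{*} + \Delta t\,\delta_i^{jT}M^{T}P_iM\delta_i^{j} - 2\Delta t\,\delta_i^{jT}M^{T}P_i\Pi_iX_i^{*} + 2\,\delta_i^{jT}M^{T}P_iX_i^{*}\Big] \tag{28}
\]

Then, by using Young's inequality, we get

\[
\frac{\mathcal{L}^{j+1}-\mathcal{L}^{*}}{\Delta t} \le \frac{\Delta z}{\Delta t}\sum_{i=0}^{N-1}\Big[ X_i^{*T}\left(P_i + 2\Delta t\,\Pi_i^{T}P_i\Pi_i - \Pi_i^{T}P_i - P_i\Pi_i\right)X_i^{*} + (1 + 2\Delta t)\,\delta_i^{jT}M^{T}P_iM\delta_i^{j}\Big] \tag{29}
\]

Let θ denote the largest eigenvalue of the matrices $M^{T}P_iM$ for i = 0, N − 1. Using assumption (26) on the matrices $\Theta_i$, we finally obtain the following upper bound

\[
\frac{\mathcal{L}^{j+1}-\mathcal{L}^{*}}{\Delta t} \le (1 + 2\Delta t)\,\theta\max_{0\le s\le j}\|\delta^{s}\|_{d_1} \tag{30}
\]

Now, to analyse the second term, we first introduce the following notations

\[
C_i = \frac{\Delta t}{\Delta z}\begin{bmatrix} A^{+}_{i-1} & 0 \\ 0 & -A^{-}_{i+1} \end{bmatrix} \quad\text{and}\quad Y_i^{j} = \begin{bmatrix} X_{i-1}^{+,j} \\ X_{i+1}^{-,j} \end{bmatrix}
\]

Considering the discrete split system (19), we get

\[
\frac{\mathcal{L}^{*}-\mathcal{L}^{j}}{\Delta t} = \frac{\Delta z}{\Delta t}\sum_{i=0}^{N-1}\Big[ -X_i^{jT}C_iP_i(2I - C_i)X_i^{j} + 2X_i^{jT}C_iP_i(I - C_i)Y_i^{j} + Y_i^{jT}C_iP_iC_iY_i^{j}\Big]
\]

Since the CFL condition holds, and by using Young's inequality, we obtain

\[
\frac{\mathcal{L}^{*}-\mathcal{L}^{j}}{\Delta t} \le 2\,\frac{\Delta z}{\Delta t}\sum_{i=0}^{N-1}\Big[ -X_i^{jT}C_iP_iX_i^{j} + Y_i^{jT}C_iP_iY_i^{j}\Big] \tag{31}
\]

Then, after plain calculations, and by inserting the boundary conditions (21), we obtain

\[
\begin{aligned}
\frac{\mathcal{L}^{*}-\mathcal{L}^{j}}{\Delta t} \le 2\Bigg[ &\sum_{i=0}^{N-1} X_i^{jT}\begin{bmatrix} A^{+}_{i}P^{+}_{i+1} - A^{+}_{i-1}P^{+}_{i} & 0 \\ 0 & A^{-}_{i+1}P^{-}_{i} - A^{-}_{i}P^{-}_{i-1} \end{bmatrix}X_i^{j} \\
&+ \begin{bmatrix} X_{N-1}^{+,j} \\ X_{0}^{-,j} \end{bmatrix}^{T}\left( -\begin{bmatrix} A^{+}_{N-1}P^{+}_{N} & 0 \\ 0 & -A^{-}_{0}P^{-}_{-1} \end{bmatrix} + 2K^{T}\begin{bmatrix} A^{+}_{-1}P^{+}_{0} & 0 \\ 0 & -A^{-}_{N}P^{-}_{N-1} \end{bmatrix}K \right)\begin{bmatrix} X_{N-1}^{+,j} \\ X_{0}^{-,j} \end{bmatrix} \\
&+ 2\,d^{jT}N^{T}\begin{bmatrix} A^{+}_{-1}P^{+}_{0} & 0 \\ 0 & -A^{-}_{N}P^{-}_{N-1} \end{bmatrix}N d^{j}\Bigg]
\end{aligned} \tag{32}
\]

Hence, using the assumptions on the matrices χ and $\Psi_i$ defined by (25) and (27), respectively, and with σ being the largest eigenvalue of the matrix $N^{T}\begin{bmatrix} A^{+}_{-1}P^{+}_{0} & 0 \\ 0 & -A^{-}_{N}P^{-}_{N-1} \end{bmatrix}N$, we obtain

\[
\frac{\mathcal{L}^{*}-\mathcal{L}^{j}}{\Delta t} \le 2\sum_{i=0}^{N-1} X_i^{jT}\Psi_iX_i^{j} + 2\sigma\,d^{jT}d^{j} \tag{33}
\]

Subsequently, similarly to Propositions 4 and 3, there exists a positive constant η such that

\[
\frac{\mathcal{L}^{*}-\mathcal{L}^{j}}{\Delta t} \le -\eta\,\mathcal{L}^{j} + \sigma\max_{0\le s\le j}\left(|d^{s}|\right) \tag{34}
\]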
Finally, Proposition 5 concludes the proof.
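The closing step relies on the discrete Gronwall-type bound of Proposition 5, which can be checked numerically. The sketch below is only an illustration: the function names and the parameter values are assumptions, and the optional `slack` term is a hypothetical way of generating a sequence that satisfies the difference inequality strictly.

```python
def proposition5_bound(y0, a, z, dt, n):
    """Right-hand side of Proposition 5 for y^n: (y0 - z/a)(1 - a*dt)^n + z/a."""
    return (y0 - z / a) * (1.0 - a * dt) ** n + z / a


def simulate(y0, a, z, dt, steps, slack=0.0):
    """Generate y^0..y^steps with (y^{n+1} - y^n)/dt = -a*y^n + z - slack/dt.

    A non-negative slack makes the sequence satisfy the inequality
    (y^{n+1} - y^n)/dt <= -a*y^n + z assumed in Proposition 5.
    """
    ys = [y0]
    for _ in range(steps):
        ys.append(ys[-1] + dt * (-a * ys[-1] + z) - slack)
    return ys


if __name__ == "__main__":
    y0, a, z, dt = 5.0, 2.0, 1.0, 0.1   # 0 < 1 - a*dt = 0.8 < 1
    ys = simulate(y0, a, z, dt, steps=30, slack=0.01)
    ok = all(y <= proposition5_bound(y0, a, z, dt, n) + 1e-12
             for n, y in enumerate(ys))
    print("bound holds for every step:", ok)
```

When the difference inequality holds with equality (`slack=0.0`), the sequence coincides with the bound, which shows that the estimate of Proposition 5 is sharp; as n grows the bound tends to z/a.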
4 Conclusion

In this work, we have examined the establishment of ISS properties for balance laws systems subjected to both in-domain and boundary disturbances. We have considered both the continuous and the discretized problems, and all the statements were based on the Lyapunov functional method. The statements were proven theoretically but, due to time limitations, they could not be applied to a numerical example; this will be carried out computationally for physical problems in future work. As part of the future work plan, the ISS analysis of the Burgers equation, as a prototype of nonlinear equations, will be investigated in a similar manner by means of the Lyapunov functional method. Further, the study will be extended to nonlinear balance laws systems.
References

1. Dashkovskiy, S.N., Efimov, D.V., Sontag, E.D.: Input to state stability and allied system properties. Autom. Remote Control 72, 1579 (2011)
2. Mironchenko, A., Prieur, C.: Input-to-state stability of infinite-dimensional systems: recent results and open questions
3. Karafyllis, I., Krstic, M.: Input-to-State Stability for PDEs. Springer International Publishing (2019)
4. LeVeque, R.J.: Finite Volume Methods for Hyperbolic Problems, vol. 31. Cambridge University Press (2002)
5. Bastin, G., Coron, J.: Stability and Boundary Stabilization of 1-d Hyperbolic Systems (2016)
6. Prieur, C., Mazenc, F.: ISS-Lyapunov functions for time-varying hyperbolic systems of balance laws. Math. Control Signals Syst. 24(1), 111–134 (2012)
7. Prieur, C., Winkin, J.J.: Boundary feedback control of linear hyperbolic systems: application to the Saint-Venant-Exner equations. Automatica 89, 44–51 (2018)
8. Weldegiyorgis, G.Y., Banda, M.K.: An analysis of the input-to-state-stabilisation of linear hyperbolic systems of balance laws with boundary disturbances (2020). arXiv preprint arXiv:2006.02492
9. Weldegiyorgis, G.Y., Banda, M.K.: Input-to-state stability of non-uniform linear hyperbolic systems of balance laws via boundary feedback control. Appl. Math. Optim. (2020). https://doi.org/10.1007/s00245-020-09726-8
10. Kmit, I.: Classical solvability of nonlinear initial-boundary problems for first-order hyperbolic systems. Int. J. Dyn. Syst. Diff. Equat. 1(3), 191–195 (2008)
11. Zheng, J., Zhu, G.: Input-to-state stability with respect to different boundary disturbances for Burgers' equation. In: 23rd International Symposium on Mathematical Theory of Networks and Systems, pp. 562–569 (2018, July)
12. Banda, M.K., Weldegiyorgis, G.Y.: Numerical boundary feedback stabilisation of non-uniform hyperbolic systems of balance laws. Int. J. Control 93(6), 1428–1441 (2020)
13. Evans, L.C.: Partial Differential Equations. American Mathematical Society (2010)
A Remote Mobile Image Acquisition System and Experimental Simulation of Indoor Scenes Based on an RGB-D Camera Xiaohui Shi and Lei Yu
Abstract In order to solve the problems of camera shake affecting the acquisition effect, inability to accurately judge the camera movement and rotation speed, and great physical and mental exertion arising from handheld RGB-D cameras during 3D data image acquisition, this paper designs an indoor remote movable image acquisition system based on RGB-D cameras. First, a motion control module is built based on the ROS system and a microcontroller, then the RGB-D camera is driven through Jetson Nano for image acquisition and the optimized AKAZE algorithm is used to perform feature processing on the images. Simulation results have shown that this platform has high accuracy in image acquisition and feature processing in complex and wide indoor scenes. Keywords RGB-D camera · Indoor image acquisition · Image feature processing · Remote control platform
1 Introduction
In the process of 3D data image acquisition, a commonly used and relatively simple method is to collect indoor scene images using a handheld RGB-D camera [1]. In an indoor environment where the ground is relatively flat and the scene is small, using a handheld RGB-D camera for image acquisition is convenient and fast. However, handheld RGB-D camera-based acquisition also has some shortcomings.

X. Shi · L. Yu, School of Mechanical and Electric Engineering, Soochow University, Suzhou, China
L. Yu, Key Laboratory of Opto-Technology and Intelligent Control, Ministry of Education, Lanzhou Jiaotong University, Lanzhou, China
L. Yu (B), Jiangsu Key Laboratory of Advanced Manufacturing Technology, Huaiyin Institute of Technology, Huai’an, China, e-mail: [email protected]; [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_71
X. Shi and L. Yu
Because a handheld RGB-D camera is moved by a person during acquisition, camera shake in all directions is inevitable and degrades the acquired images. The speed of camera movement and rotation also cannot be determined accurately, which places higher demands on the robustness and accuracy of the whole system. Moreover, in larger scenes, data collection with a handheld RGB-D camera consumes considerable physical and mental energy. Therefore, inspired by driverless cars that carry various sensors for road condition recognition, this paper designs a low-cost indoor remote movable image acquisition hardware platform based on an RGB-D camera for complex and wide indoor scenes, as shown in Fig. 1.
Fig. 1 Hardware platform for remote acquisition experiments based on an RGB-D camera
A Remote Mobile Image Acquisition System and Experimental …
2 Image Acquisition Module of an RGB-D Camera Driven by Jetson Nano
In the image acquisition module, a Jetson Nano [2] development board running Linux with the RGB-D camera driver installed serves as the lower computer; it drives the RGB-D camera directly to acquire and store the image dataset. The host computer controls the Jetson Nano remotely through VNC remote tools, and the data are transferred through the secure file transfer protocol (SFTP) [3] (Fig. 2). Given the requirements of complex scene acquisition and the load capacity of the movable platform, the image acquisition equipment should offer a long effective range, high image resolution, compactness and portability, and accurate depth information. The RGB-D camera selected for this study is therefore the Intel RealSense Depth Camera D455, shown in Fig. 3. At a sampling rate of 30 frames per second, the RGB sensor supports a maximum resolution of 1280 × 800 and the depth sensor a maximum resolution of 1280 × 720. Both sensors use a global shutter, which reduces blurring of depth images during fast motion and is sufficient for smooth image acquisition at higher speeds indoors. The color sensor field of view (FOV) is 90° × 65° and the depth sensor field of view is 87° × 58°. Depth information is obtained through binocular stereo vision, and an infrared projector improves depth accuracy in low-texture scenes. The left and right imagers send data to the depth processor, which calculates the depth value of each pixel and generates depth image frames. The theoretical accuracy error within 4 m is less than 2%, which is enough to support high-precision 3D data acquisition in complex scenes. To be mounted on the mobile platform and drive the RGB-D camera for image data acquisition, storage and transmission, a Jetson Nano running the Linux operating
Fig. 2 Image acquisition interface
Fig. 3 Physical and structural diagram of the Intel D455 camera
system is selected in this paper as the lower computer to drive the RGB-D camera directly and communicate with the upper computer over a LAN. As shown in Fig. 4, the Jetson Nano is an embedded development board that is small, capable, and inexpensive. It supports Ubuntu 18.04 LTS and allows artificial intelligence technology to be deployed quickly on various intelligent devices. The overall hardware design of the Jetson Nano is similar to that of the Raspberry Pi, but its performance is higher. Equipped with a quad-core ARM Cortex-A57 CPU at 1.43 GHz, a 128-core Maxwell GPU and 4 GB of LPDDR4 memory at 25.6 GB/s, it supports high-resolution sensors and parallel processing of multiple sensors, and provides enough AI computing power for industrial application terminals. It supports the current mainstream deep learning frameworks such as TensorFlow [4], PyTorch [5], Caffe [6] and Keras [7], so that image recognition, object detection, semantic segmentation, video enhancement and other functions can be realized easily in industrial terminals. It also provides board-level support packages for deep learning, computer vision and multimedia processing, together with software libraries such as CUDA, cuDNN and TensorRT. Desktop and server graphics cards are expensive and bulky, which makes them unsuitable for edge computing, so the Jetson Nano is a good choice for current research and application of edge computing. Given the performance of the Jetson Nano, gradually migrating processing from the host to the mobile terminal is also one direction for future research. Next, the Jetson Nano and the upper computer PC are placed on the same local area network, which can be the same hotspot or Wi-Fi network; the VNC remote control software is installed on each, and the corresponding port number is selected for the connection. There are two computers in the whole connection system.
After the port number is set on the mobile terminal, the IP address and port number are entered on the remote terminal to establish the remote connection between the master and slave computers, as shown in Fig. 5. Finally, the secure file transfer protocol is used to transmit the image dataset remotely, which removes the need to retrieve the device and copy the data manually.
Fig. 4 Introduction to various components of the Jetson Nano
Fig. 5 VNC remote control desktop
3 Motion Control Module Based on ROS System and a Microcontroller
The Robot Operating System (ROS) is, in essence, an open-source operating system for robot motion control. It runs on top of a Linux system on the computer hardware and provides hardware device abstraction, low-level device control, inter-process messaging and other system functions, together with the functions and tools needed to write and compile code (Fig. 6). The motion control module in this paper sends control commands to the robot from a remote control terminal to control the robot’s travel. When the corresponding move button is pressed on the keyboard, the ROS system on the Raspberry Pi side publishes
Fig. 6 Image acquisition process
a speed topic. The underlying control node subscribes to this topic, receives the speed data, and writes it to the serial port. The microcontroller reads the speed data from the serial port and drives the motors accordingly, so that the keyboard controls the movement of the cart, as shown in Fig. 7. The main components include a Raspberry Pi 3 Model B with Ubuntu and the corresponding version of ROS pre-installed, an STM32 microcontroller, a motor driver, a voltage regulator module, encoders, an inertial measurement unit (IMU), and a battery power supply.
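The command path described above (key press, speed command, serial write, microcontroller read) can be sketched in plain Python. The paper does not give the key bindings, speed values, or serial frame layout, so everything named here (`KEY_BINDINGS`, `FRAME_HEADER`, the header-plus-checksum frame) is a hypothetical illustration rather than the authors' protocol, and the ROS topic and the serial port themselves are left out.

```python
import struct

# Hypothetical key bindings: key -> (linear speed in m/s, angular speed in rad/s).
# These values are assumptions for illustration, not taken from the paper.
KEY_BINDINGS = {
    "w": (0.2, 0.0),   # forward
    "s": (-0.2, 0.0),  # backward
    "a": (0.0, 0.5),   # turn left
    "d": (0.0, -0.5),  # turn right
}

FRAME_HEADER = b"\xAA\x55"  # assumed two-byte start marker


def key_to_velocity(key):
    """Map a pressed key to a (linear, angular) command; unknown keys stop the cart."""
    return KEY_BINDINGS.get(key, (0.0, 0.0))


def encode_frame(linear, angular):
    """Pack a command as header + two little-endian float32 values + 1-byte checksum."""
    payload = struct.pack("<ff", linear, angular)
    checksum = sum(payload) & 0xFF
    return FRAME_HEADER + payload + bytes([checksum])


def decode_frame(frame):
    """Inverse of encode_frame, as the receiving side might do; raises on a bad frame."""
    if len(frame) != 11 or frame[:2] != FRAME_HEADER:
        raise ValueError("malformed frame")
    payload, checksum = frame[2:10], frame[10]
    if sum(payload) & 0xFF != checksum:
        raise ValueError("checksum mismatch")
    return struct.unpack("<ff", payload)
```

In a real deployment the bytes returned by `encode_frame` would be written to the serial port by the control node; a checksum of this kind is one simple way for the microcontroller to discard corrupted commands.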
Fig. 7 Diagram of motion control module
Fig. 8 Diagram of the two-wheel differential chassis model
The motion chassis designed in this study is a two-wheel differential chassis. Its motion model is briefly described using the coordinate system shown in Fig. 8. Let the angular velocities of the left and right wheels be ω1 and ω2, their radii r1 and r2, their linear velocities v1 and v2, and the distance between the centers of the left and right wheels d. When the car body moves forward or backward in a straight line, the wheel linear velocities satisfy |v1| = |v2|; when the car body rotates in place, the wheel angular velocities satisfy |ω1| = |ω2|; when the car body moves along a circular arc of radius R, its linear and angular velocities can be expressed as follows:

$$v_{car} = \frac{v_1 + v_2}{2} \tag{1}$$

$$\omega_{car} = \frac{v_1}{R - \frac{d}{2}} = \frac{2v_1}{2R - d} = \frac{v_2}{R + \frac{d}{2}} = \frac{2v_2}{2R + d} = \frac{v_2 - v_1}{d} \tag{2}$$
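So the mathematical model of the car motion during circular motion can be expressed as follows:

$$\begin{bmatrix} v \\ \omega \end{bmatrix} = \begin{bmatrix} \frac{r_1}{2} & \frac{r_2}{2} \\ -\frac{r_1}{d} & \frac{r_2}{d} \end{bmatrix} \begin{bmatrix} \omega_1 \\ \omega_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{d} & \frac{1}{d} \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \tag{3}$$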
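The kinematic model in Eqs. (1)–(3) can be checked numerically with a short sketch. The function names and the example wheel radius and track width are assumptions made for illustration; the inverse mapping `wheel_speeds` is simply Eq. (3) solved for the wheel speeds and is not stated in the paper.

```python
import math


def body_velocity(omega1, omega2, r1, r2, d):
    """Eq. (3): wheel angular speeds (rad/s) -> body linear and angular velocity."""
    v1 = r1 * omega1  # left wheel linear velocity
    v2 = r2 * omega2  # right wheel linear velocity
    v_car = (v1 + v2) / 2.0      # Eq. (1)
    omega_car = (v2 - v1) / d    # last form of Eq. (2)
    return v_car, omega_car


def wheel_speeds(v_car, omega_car, r1, r2, d):
    """Inverse of Eq. (3): body velocities -> wheel angular speeds."""
    v1 = v_car - omega_car * d / 2.0
    v2 = v_car + omega_car * d / 2.0
    return v1 / r1, v2 / r2


def turning_radius(v_car, omega_car):
    """Arc radius R = v_car / omega_car; infinite for straight-line motion."""
    if omega_car == 0.0:
        return math.inf
    return v_car / omega_car
```

For example, with r1 = r2 = 0.05 m and d = 0.3 m, equal wheel speeds give a pure translation (zero angular velocity), while equal and opposite wheel speeds give zero linear velocity, which corresponds to rotation in place.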
4 Introduction to Image Feature Processing Methods
4.1 Introduction to Common Image Feature Processing Algorithms
Feature point extraction refers to extracting representative points from the input image; different extraction algorithms yield feature points that differ in number and location. Feature extraction is usually divided into detecting feature points and computing descriptors. Feature matching is the process of matching different descriptors and presenting the successfully matched feature pairs. The more representative feature algorithms are SIFT [8] and SURF [9], which build a Gaussian scale space in the form of a pyramid. However, when SIFT and SURF construct the scale space, object boundary information is lost and noise and details are smoothed to the same degree at all scales, which affects localization accuracy. To address these problems, nonlinear filtering strategies such as bilateral filtering and nonlinear diffusion filtering can be used to construct the scale space; they retain the target boundary and more feature information while filtering details. The KAZE [10] algorithm improves repeatability and uniqueness through nonlinear diffusion filtering, but it is computationally expensive, and the additive operator splitting (AOS) [11] strategy it relies on to solve the nonlinear diffusion equation makes real-time performance difficult to achieve. The AKAZE [12] algorithm was proposed to reduce the computational cost of the KAZE algorithm. It uses the fast explicit diffusion (FED) [13] strategy to solve the partial differential equations, which builds the scale space more efficiently than general nonlinear methods. 
At the same time, the modified local difference binary (M-LDB) [14] descriptor improves the robustness of the general LDB descriptor to rotation and scale changes. The AKAZE algorithm is faster than the SIFT and SURF algorithms, and offers better repeatability and robustness than the ORB [15] algorithm. The scale-invariant feature transform (SIFT) algorithm is introduced first. Figure 9a shows SIFT feature extraction on one frame of the experimental scene. SIFT is robust to scene rotation, scale and brightness changes, noise, etc., and can distinguish and match information quickly and effectively in a large amount of feature data. It can also extract many features from images that contain little information. The main process is: (1) Identify points that are invariant to scale and rotation with the difference-of-Gaussians function over all scales of the image, and extract feature points at distinctive locations such as object corners, edge points, and bright or dark spots within a region. (2) Use a fitted model to refine the position and scale of each candidate feature point, and then assign each feature point a direction based on the local image gradient direction, which serves as the reference for subsequent image transformations. (3) Match the feature vectors between feature points to find the corresponding pairs of feature points to
establish the matching relationship. However, the SIFT algorithm also has shortcomings such as poor real-time performance, weak handling of targets with smooth edges, and a tendency to produce false matches. The Speeded Up Robust Features (SURF) algorithm is an improved image feature processing algorithm based on SIFT; it dramatically increases computation speed and is more robust than SIFT under different image transformations. As shown in Fig. 9b, one frame of the experimental scene was subjected to SURF feature extraction. The process of the SURF algorithm is: (1) Search the image over all scale spaces, detect candidate feature points, and filter and localize the feature points by constructing Hessian matrices. The
Fig. 9 Comparison of feature point extraction by different algorithms on an actual scene image
Hessian matrix is constructed to generate stable feature points, which corresponds to the difference-of-Gaussians (DoG) process in the SIFT algorithm. For images, the Hessian matrix describes the local curvature of the function:

$$H(I(x, y)) = \begin{bmatrix} \frac{\partial^2 I}{\partial x^2} & \frac{\partial^2 I}{\partial x \partial y} \\ \frac{\partial^2 I}{\partial x \partial y} & \frac{\partial^2 I}{\partial y^2} \end{bmatrix} \tag{4}$$

The local feature points can be determined by calculating the determinant of this matrix. To improve computational efficiency, the SURF algorithm replaces the Gaussian filter with a box filter. A weighting factor of 0.9 is introduced in the determinant calculation to balance the error caused by the box filter approximation:

$$\mathrm{Det}(H) = \frac{\partial^2 I}{\partial x^2} \cdot \frac{\partial^2 I}{\partial y^2} - \left(0.9 \cdot \frac{\partial^2 I}{\partial x \partial y}\right)^2 \tag{5}$$
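To make Eq. (5) concrete, the following minimal sketch evaluates the weighted Hessian determinant on a small grayscale image stored as a list of rows. It is only an illustration under stated assumptions: SURF itself uses box filters on an integral image at several scales, whereas this sketch substitutes plain central finite differences at a single scale, and the helper names (`gaussian_blob`, `response_map`, `local_maxima`) are hypothetical.

```python
import math


def gaussian_blob(size, cx, cy, sigma=1.0):
    """Synthetic test image: one bright Gaussian blob centred at (cx, cy)."""
    return [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(size)] for y in range(size)]


def hessian_response(img, x, y, weight=0.9):
    """Eq. (5) at interior pixel (x, y), with derivatives from central differences."""
    ixx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    iyy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    ixy = (img[y + 1][x + 1] - img[y + 1][x - 1]
           - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return ixx * iyy - (weight * ixy) ** 2


def response_map(img):
    """Determinant response for every interior pixel; border pixels stay at 0."""
    h, w = len(img), len(img[0])
    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            resp[y][x] = hessian_response(img, x, y)
    return resp


def local_maxima(resp, threshold):
    """Pixels (x, y) whose response exceeds the threshold and all 8 same-layer neighbours."""
    h, w = len(resp), len(resp[0])
    found = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            val = resp[y][x]
            if val <= threshold:
                continue
            neighbours = [resp[y + dy][x + dx]
                          for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                          if (dy, dx) != (0, 0)]
            if all(val > n for n in neighbours):
                found.append((x, y))
    return found
```

On a blob-like structure both second derivatives are negative and the mixed derivative is near zero at the center, so the determinant is large and positive there; along edges one of the second derivatives is small, so the response drops.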
Each pixel processed by the Hessian matrix has its determinant value compared with those of its 8 neighbors on the same layer and the 18 points on the layers above and below to confirm whether it is an extreme point. After the feature points are located, incorrectly located points are filtered out to obtain the final stable feature points. (2) The Haar wavelet responses in the circular neighborhood around the feature point are accumulated within a 60° sector that is rotated in steps of 0.2 radians, and the direction with the maximum value is set as the main direction of the feature point. (3) The Haar wavelet features are computed separately for each rectangular subregion of the neighborhood aligned with the main direction of the feature point. Each subregion produces a 4-dimensional vector. After merging, one SURF feature descriptor has 64 dimensions, whereas a SIFT descriptor has 128, so the SURF algorithm effectively improves real-time performance. (4) The Euclidean distance between the feature vectors of two feature points is calculated; the shorter the distance, the better the match. If the signs of the Hessian matrix traces of the two points differ, the matched pair is rejected directly. Neither SIFT nor SURF meets the requirement of real-time feature detection, and the ORB algorithm is a good solution to this problem. The ORB feature algorithm consists of two parts: improved FAST feature points and the binary robust independent elementary features (BRIEF) descriptor. Its greatest advantages are speed, real-time performance, robustness to noise, and scale and rotation invariance. As shown in Fig. 9c, ORB feature extraction is performed on one frame of the experimental scene. Although ORB feature extraction has the advantage of speed, it extracts fewer feature points in the more complex indoor scenes, and the effect of feature
processing at the edge details is not very satisfactory. The FAST algorithm only needs to compare pixel brightness values and mainly detects obvious changes in the local pixel gray level, which makes it very fast. Its disadvantage is that it provides no scale or direction information. To obtain scale invariance, the ORB algorithm detects feature points on each layer of a constructed image pyramid. Rotation invariance is achieved with the neighborhood gray-scale centroid method. The feature direction is calculated with this method as follows: (1) The moment m of the neighborhood F is defined as:

$$m_{ab} = \sum_{x, y \in F} x^a y^b I(x, y), \quad a, b \in \{0, 1\} \tag{6}$$

(2) The image centroid coordinates can be obtained from the moments:

$$X = \frac{m_{10}}{m_{00}}, \quad Y = \frac{m_{01}}{m_{00}} \tag{7}$$

(3) The angle of the ORB feature point can be expressed as the direction of the vector from the geometric center of the neighborhood to the centroid C:

$$\omega = \arctan\frac{m_{01}}{m_{10}} \tag{8}$$
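The three steps in Eqs. (6)–(8) can be sketched directly on a small patch stored as a list of rows. The function names are illustrative, coordinates are taken relative to the geometric center of the patch, and `math.atan2` is used in place of the plain arctangent of Eq. (8) so that the quadrant is resolved and a zero m10 does not cause a division error; that substitution is an implementation choice made here, not something stated in the text.

```python
import math


def patch_moments(patch):
    """Eq. (6): m00, m10, m01 with (x, y) measured from the patch's geometric centre."""
    h, w = len(patch), len(patch[0])
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    m00 = m10 = m01 = 0.0
    for row_idx, row in enumerate(patch):
        for col_idx, val in enumerate(row):
            x, y = col_idx - cx, row_idx - cy
            m00 += val
            m10 += x * val
            m01 += y * val
    return m00, m10, m01


def centroid(patch):
    """Eq. (7): intensity centroid (X, Y) relative to the patch centre."""
    m00, m10, m01 = patch_moments(patch)
    if m00 == 0.0:
        return 0.0, 0.0  # empty patch: centroid taken at the centre
    return m10 / m00, m01 / m00


def orientation(patch):
    """Eq. (8): angle of the vector from the patch centre to the centroid, in radians."""
    _, m10, m01 = patch_moments(patch)
    return math.atan2(m01, m10)
```

With the image y axis pointing down, a patch whose only bright pixel lies to the right of the center yields an angle of 0, and one whose bright pixel lies below the center yields π/2.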
BRIEF is a binary feature descriptor in which each bit, 0 or 1, records the intensity relationship between two random pixels around a key point. 128 point pairs are picked sequentially following a Gaussian distribution to obtain a 128-dimensional vector. After Gaussian smoothing for denoising, the 128 point pairs are rotated to the feature direction to obtain new point pairs, the BRIEF descriptor is computed from the new point pairs, and ORB feature extraction is thus realized. The next step is to establish the correspondence between feature point pairs with the feature matching algorithm. Algorithms such as SIFT and SURF, which detect feature points in a linear scale space, tend to lose features at image boundaries and in fine details, whereas the KAZE algorithm preserves more details by constructing a nonlinear scale space, which addresses blurred boundaries and lost details. As shown in Fig. 9d, KAZE feature extraction was performed on one frame of the experimental scene. The main process of the KAZE algorithm is: (1) Construct the nonlinear scale space. As in the SIFT algorithm, the scale level of the KAZE algorithm increases logarithmically, but each layer keeps the same resolution as the original image. The nonlinear scale space is constructed using nonlinear diffusion filtering and the additive operator splitting (AOS) algorithm. Nonlinear diffusion filtering regards the image brightness that changes at
different scales as the divergence of a flow function, which handles image details and noise better. The diffusion equation is expressed as follows:

$$\frac{\partial L}{\partial t} = \operatorname{div}\left[g(|\nabla I(x, y, t)|)\, \nabla L\right] \tag{9}$$

where ∇I is the gradient of the image I after Gaussian smoothing, t is the scale parameter, and g(|∇I(x, y, t)|) is the transfer (conduction) function that controls the diffusion. The scale of each layer is expressed as:

$$\sigma(o, s) = \sigma_0 2^{\frac{So + s}{S}}, \quad o \in [0, O - 1], \quad s \in [0, S + 2] \tag{10}$$
where σ0 is the base layer scale, o is the group number, and s is the layer number within the group. (2) Detect and locate feature points. KAZE detects feature points as the extreme points of the determinant of the Hessian matrix at different scales, and decides whether a feature point is retained by taking the derivative of the Taylor expansion L(x) and setting it to zero:

$$L(x) = L + \left(\frac{\partial L}{\partial x}\right)^T x + 0.5\, x^T \frac{\partial^2 L}{\partial x^2} x \tag{11}$$
(3) Determine the main direction of the feature point. This step follows the SURF algorithm: a 60° sector is drawn in the circular neighborhood of the feature point, the sum of the Haar wavelet features inside the sector is computed while the sector is rotated in small increments, and the direction with the largest sum is set as the main direction of the feature point. (4) Generate feature descriptors. The KAZE algorithm divides a region window into 4 × 4 subregions and performs Gaussian weighting and normalization on the description vector dv of each subregion to obtain a 64-dimensional description vector.

$$dv = \left(\sum L_x, \sum L_y, \sum |L_x|, \sum |L_y|\right) \tag{12}$$

Although the KAZE algorithm preserves image details and edge features, constructing the nonlinear scale space is time-consuming, so the Accelerated-KAZE (AKAZE) algorithm improves the efficiency of both the scale space construction, through explicit diffusion, and the descriptors. AKAZE feature extraction on one frame of the experimental scene is shown in Fig. 9e. The AKAZE algorithm uses a fast explicit diffusion (FED) strategy to build the nonlinear scale space, and the modified local difference binary (M-LDB) descriptor improves robustness to rotation and scale changes. The core of the FED algorithm is to perform M cycles of n explicit diffusion steps with varying step size τ, and solve the diffusion
equation as follows:

$$L^{i+1, j+1} = L^{i+1, j} + \tau A(L^i) \cdot L^{i+1, j}, \quad j \in [0, n - 1] \tag{13}$$

where τ is the time step, A(L^i) is the conduction matrix, and n is the cycle length. The scale level of the FED scheme is expressed as σi(o, s), which is converted to time units ti:

$$\sigma_i(o, s) = \sigma_0 \cdot 2^{o + \frac{s}{S}}, \quad o \in [0, O - 1], \quad s \in [0, S - 1], \quad i \in [0, M] \tag{14}$$

$$t_i = 0.5 \sigma_i^2, \quad i \in [0, N] \tag{15}$$
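As a rough illustration of Eqs. (13)–(15), the sketch below computes the evolution times of the scale levels and runs one FED cycle on a 1-D signal. Several parts are assumptions that go beyond the text: the step-size formula in `fed_step_sizes` is the standard one from the FED literature, the conductivity is the Perona-Malik form g = 1 / (1 + |∇L|² / k²) evaluated on the unsmoothed signal, the signal is 1-D rather than a 2-D image, and all function names are hypothetical.

```python
import math


def evolution_times(sigma0, octaves, sublevels):
    """Eqs. (14)-(15): sigma_i = sigma0 * 2^(o + s/S) and t_i = 0.5 * sigma_i^2."""
    times = []
    for o in range(octaves):
        for s in range(sublevels):
            sigma = sigma0 * 2.0 ** (o + s / sublevels)
            times.append(0.5 * sigma * sigma)
    return times


def fed_step_sizes(n, tau_max):
    """Varying step sizes of one FED cycle; they sum to tau_max * (n^2 + n) / 3."""
    return [tau_max / (2.0 * math.cos(math.pi * (2 * j + 1) / (4 * n + 2)) ** 2)
            for j in range(n)]


def conductivity(signal, contrast):
    """Assumed Perona-Malik conductivity from central-difference gradients."""
    n = len(signal)
    g = []
    for k in range(n):
        grad = (signal[min(k + 1, n - 1)] - signal[max(k - 1, 0)]) / 2.0
        g.append(1.0 / (1.0 + (grad / contrast) ** 2))
    return g


def diffusion_step_1d(signal, g, tau):
    """One explicit step of Eq. (13) in 1-D, written in flux form (no flux at the ends)."""
    out = list(signal)
    for k in range(len(signal) - 1):
        # Flux across the edge between samples k and k+1, using the mean conductivity.
        flux = 0.5 * (g[k] + g[k + 1]) * (signal[k + 1] - signal[k])
        out[k] += tau * flux
        out[k + 1] -= tau * flux
    return out


def fed_cycle_1d(signal, n, tau_max, contrast):
    """One FED cycle: n explicit steps with the conductivities A(L^i) held fixed."""
    g = conductivity(signal, contrast)
    for tau in fed_step_sizes(n, tau_max):
        signal = diffusion_step_1d(signal, g, tau)
    return signal
```

Because the steps are written in flux form, the sum of the signal is preserved by every step, which is a convenient sanity check. Individual FED steps may exceed the stability limit of a single explicit step, which is why a cycle covers more diffusion time than n steps of size `tau_max` would.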
The main process of AKAZE is: (1) The nonlinear scale space is first constructed by nonlinear filtering and the fast explicit diffusion algorithm. (2) The Hessian matrix determinant values, normalized at different scales, are used to screen for extreme points. (3) The principal directions of the feature points are calculated, and the descriptors are then computed with the M-LDB algorithm. (4) The Hamming distance is used as the matching metric for the feature points.
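Step (4) can be illustrated with a brute-force matcher over binary descriptors stored as byte strings. This is a minimal sketch rather than the matcher used in the paper: the function names and the optional `max_distance` filter are assumptions, and practical systems usually add a ratio or cross-check test to reject ambiguous matches.

```python
def hamming_distance(a, b):
    """Number of differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def match_descriptors(query, train, max_distance=None):
    """For each query descriptor, find the train descriptor with the smallest Hamming distance.

    Returns a list of (query_index, train_index, distance) tuples.
    """
    matches = []
    for qi, q in enumerate(query):
        best_index, best_dist = None, None
        for ti, t in enumerate(train):
            dist = hamming_distance(q, t)
            if best_dist is None or dist < best_dist:
                best_index, best_dist = ti, dist
        if best_index is not None and (max_distance is None or best_dist <= max_distance):
            matches.append((qi, best_index, best_dist))
    return matches
```

Binary descriptors such as M-LDB or BRIEF are compared this way because the distance reduces to an XOR followed by a bit count, which is much cheaper than the Euclidean distance used for SIFT and SURF vectors.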
4.2 Introduction to the Optimized Image Feature Processing Algorithm
More complex indoor scenes contain more detail and boundary information, and algorithms that detect feature points in a linear scale space tend to lose features at image boundaries and in fine details, so a feature algorithm that constructs a nonlinear scale space is chosen in this study. Although the KAZE algorithm preserves image details and edge features, it is very time-consuming. The AKAZE algorithm is more efficient than KAZE, but the actual feature extraction results show that it still falls short in the number and distribution of extracted feature points and in the preservation of edge detail information. Compared with the KAZE algorithm, AKAZE brings no significant improvement in edge feature extraction, and its feature extraction for ground texture is also poor. For this reason, this study partially optimizes the AKAZE algorithm. As shown in Fig. 9f, the feature processing algorithm in this paper performs better than the standard AKAZE algorithm in edge feature extraction, ground feature extraction and the distribution of feature points. Compared with the image feature processing algorithms introduced in the previous section, Fig. 9 shows that the improved algorithm better retains the target boundary and more feature information while filtering details, improving repeatability and uniqueness relative to the SIFT and SURF algorithms. Compared with the ORB algorithm, the improved algorithm
extracts more feature points in the more complex indoor scenes, and its feature processing at the edge details is more satisfactory.
5 Conclusion
Handheld RGB-D cameras are prone to camera shake during 3D image data acquisition, which degrades the acquired images; the movement and rotation speed of the camera cannot be judged accurately; and acquiring data in larger scenes is physically exhausting. In this paper, an indoor remote movable image acquisition system based on an RGB-D camera is designed for image acquisition in more complex and wide indoor scenes. First, a motion control module based on the ROS system and a microcontroller is built to carry the RGB-D camera, which greatly improves stability. The image acquisition module, in which a Jetson Nano drives the RGB-D camera, reduces blurring of depth images during fast motion and is sufficient for image acquisition at higher speeds indoors. The AKAZE algorithm is partially optimized for better performance in edge feature extraction, ground feature extraction and the distribution of feature points. Together, these make the system suitable for indoor image acquisition in more complex and wide scenes.
Acknowledgements The work is supported by National Natural Science Foundation of China (61873176); Opening Foundation of Key Laboratory of Opto-technology and Intelligent Control, Ministry of Education; The open fund for Jiangsu Key Laboratory of Advanced Manufacturing Technology (HGAMTL-2202); Tang Scholar of Soochow University and Jiangsu Province’s “333 High Level Talent Training Project”. The authors would like to thank the referees for their constructive comments.
References
1. Nguyen, A.H., Holt, J.P., Knauer, M.T., et al.: Towards rapid weight assessment of finishing pigs using a handheld, mobile RGB-D camera. Biosys. Eng. 226, 155–168 (2023)
2. Zhang, Z.D., Tan, M.L., Lan, Z.C., et al.: CDNet: a real-time and robust crosswalk detection network on Jetson nano based on YOLOv5. Neural Comput. Appl. 34(13), 10719–10730 (2022)
3. Fan, C.I., Chen, I.T., Cheng, C.K., et al.: FTP-NDN: file transfer protocol based on re-encryption for named data network supporting nondesignated receivers. IEEE Syst. J. 12(1), 473–484 (2016)
4. Pang, B., Nijkamp, E., Wu, Y.N.: Deep learning with tensorflow: a review. J. Educ. Behav. Stat. 45(2), 227–248 (2020)
5. Imambi, S., Prakash, K.B., Kanagachidambaresan, G.R.: PyTorch. Programming with TensorFlow: solution for edge computing applications, vol. 4, pp. 87–104 (2021)
6. Jia, Y., Shelhamer, E., Donahue, J., et al.: Caffe: convolutional architecture for fast feature embedding. In: Proceedings of the 22nd ACM International Conference on Multimedia, pp. 675–678 (2014)
7. Muhammad, W., Ullah, I., Ashfaq, M.: An introduction to deep convolutional neural networks with Keras. In: Machine Learning and Deep Learning in Real-time Applications, pp. 231–272. IGI Global (2020)
8. Gupta, S., Kumar, M., Garg, A.: Improved object recognition results using SIFT and ORB feature detector. Multim. Tools Appl. 78, 34157–34171 (2019)
9. Gupta, S., Thakur, K., Kumar, M.: 2D-human face recognition using SIFT and SURF descriptors of face’s feature regions. Vis. Comput. 37, 447–456 (2021)
10. Okawa, M.: From BoVW to VLAD with KAZE features: offline signature verification considering cognitive processes of forensic experts. Pattern Recogn. Lett. 113, 75–82 (2018)
11. Shen, Y., Peng, F., Zhang, Z.: Efficient optical proximity correction based on semi-implicit additive operator splitting. Opt. Express 27(2), 1520–1528 (2019)
12. Soleimani, P., Capson, D.W., Li, K.F.: Real-time FPGA-based implementation of the AKAZE algorithm with nonlinear scale space generation using image partitioning. J. Real-Time Image Proc. 18(6), 2123–2134 (2021)
13. Alcantarilla, P.F., Solutions, T.: Fast explicit diffusion for accelerated features in nonlinear scale spaces. IEEE Trans. Patt. Anal. Mach. Intell 34(7), 1281–1298 (2011)
14. Bibars, A., Mahroos, M.: New local difference binary image descriptor and algorithm for rapid and precise vehicle visual localisation. IET Comput. Vision 13(5), 443–451 (2019)
15. Luo, C., Yang, W., Huang, P., et al.: Overview of image matching based on ORB algorithm. J. Phys. Conf. Ser. 1237(3), 032020 (2019) (IOP Publishing)
Flexible Load Market Bargaining Scheduling Strategy for Active Distribution Network Based on Demand Response Bowei Shao, Hui Wang, Shuo Zhang, Pan Yin, and Wenliang Li
Abstract With the rapid increase in distributed energy resources and electric vehicles connected to the grid, stable operation of the grid has become difficult. Firstly, a flexible load market bargaining scheduling strategy based on demand response is proposed for the active distribution network (ADN), and a multi-and framework for time-of-use (TOU) pricing is constructed. Secondly, the market pricing model of the load aggregator under the ADN is established and solved in a distributed manner using the KKT conditions, so as to obtain the optimal price while ensuring the privacy of each subject. Thirdly, with the lowest cost of the ADN as the goal, the energy transaction volume of the load aggregator is solved to provide conditions for the reasonable redistribution of benefits and other issues. Finally, from the perspective of the economic benefits of the grid, the load aggregator and the user, a simulation analysis is carried out on three key influencing parameters, namely the price upper limit, the market supervision coefficient, and the discount coefficient of the price at which the grid sells power to the load aggregator, to verify the effectiveness of the proposed strategy.
Keywords Active distribution network · Load aggregator · Market bargaining · KKT condition · Dynamic electricity price
B. Shao · P. Yin · W. Li Jilin Institute of Chemical Technology, 132013 Jilin City, Jilin Province, China e-mail: [email protected] H. Wang (B) Changchun Institute of Technology, 130012 Changchun, China e-mail: [email protected] National and Local Joint Engineering Research Center for Measurement, Control and Safe Operation of Intelligent Distribution Networks, 130012 Changchun, China S. Zhang State Grid Jilin Electric Power Company Yanbian Power Supply Company, 133000 Yanji City, Jilin Province, China e-mail: [email protected] © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_72
B. Shao et al.
1 Introduction
On March 15, 2015, the CPC Central Committee and the State Council issued Several Opinions on Further Deepening the Reform of the Electric Power System, which kicked off a new round of electric power system reform. With the further deepening of the reform, power markets based on medium- and long-term transactions have gradually been established in various provinces and cities. The main role of medium- and long-term trading in the market is to stabilize the market price and minimize the trading risk for market participants [1, 2]. An important feature of the smart grid is power demand response (PhD): when the wholesale market price rises or the reliability of the system is threatened, users who receive from the power supplier a direct compensation notice for an induced load reduction, or a price increase signal, change their habitual consumption patterns and reduce or shift their load for a certain period of time. It is one of the demand-side management solutions for safeguarding power supply, stabilizing the grid and restraining price rises in the short term. The core of the market mechanism is the price mechanism. At present, the pricing strategies of the electricity market mainly include time-of-use pricing, fixed pricing, real-time pricing and critical peak pricing. Since the supply and demand relationship between suppliers and users is a real-time interactive process, it is particularly important to study real-time pricing for the smart grid. Asadi et al. solved the welfare maximization problem of real-time pricing with a particle swarm optimization algorithm, worked out the optimal energy consumption of each user, and maximized the total welfare of all users. The results show that this algorithm effectively improves the energy utilization rate [3]. 
Based on Stackelberg dynamic game theory, [4–7] proposed a master-slave game model with the distribution network side as the leader and the load side as the follower, and established a master-slave game economic model whose objectives are the minimum operation cost of the distribution network, the maximum wind power consumption and the lowest electricity cost for load-side users. These studies all address new energy consumption strategies for a single micro-grid or park, whereas economic complementarity between adjacent micro-grids or parks can greatly increase absorption capacity and improve system stability. Reference [8] studies a multi-objective optimal scheduling model of photovoltaic charging stations that takes into account the minimum power purchase cost and the minimum cycle power. Reference [9] established a rolling-optimization demand response model of charging stations that considers the uncertainty of photovoltaic output and charging load. Based on evolutionary game theory, [10] proposes a micro-energy network-user game model in which various flexible electricity, heat and cold loads participate in optimal scheduling. Reference [11] considers the shifting, transfer, reduction and substitution characteristics of flexible electricity and gas loads, and verifies that flexible loads participating in scheduling can effectively reduce system operation costs and promote wind power consumption. Reference [12] fully tapped the schedulable potential of electrical and thermal flexible loads, and built a combined electrothermal optimal scheduling model of a virtual power plant with comprehensive
Flexible Load Market Bargaining Scheduling Strategy for Active Distribution . . .
consideration of system economy and peak-shaving and valley-filling indexes. Building on flexible thermal load reduction and flexible electric load transfer, [13] also considers the conversion between electric heating and electrical energy to promote grid-connected consumption in integrated energy systems (IES). Although the above studies take the flexible load response on the demand side into account, they do not consider the dispatchable value of the growing electric vehicle (EV) load in the power grid.

This paper proposes a flexible load market bargaining scheduling strategy for the active distribution network (ADN) based on demand response. First, a multi-party framework under time-of-use (TOU) pricing is constructed. Second, a load aggregator market pricing model under the ADN is established and solved in a distributed manner using the KKT conditions, which yields the optimal price while protecting the privacy of each party. Third, with the lowest ADN cost as the goal, the energy transaction volume of the load aggregator is solved, providing a basis for the reasonable redistribution of benefits and related issues. Finally, from the perspective of the economic benefits of the grid, the load aggregator and the user, a simulation analysis is carried out on three key parameters, namely the price upper limit, the market supervision coefficient and the discount coefficient on the price at which the grid sells power to the load aggregator, to verify the effectiveness of the proposed strategy.
2 Flexible Load Market Bargaining Scheduling Strategy Architecture of Active Distribution Network Based on Demand Response

Figure 1 shows the architecture of the flexible load market bargaining scheduling strategy of the active distribution network (ADN) designed on the basis of demand response. In this paper, power market scheduling participates in the optimal scheduling of the power network so as to ensure stable system operation. The market scheduling participants in Fig. 1 are distributed energy resources, conventional generator sets, load aggregators, EV loads, and conventional customer loads. The solid lines in the figure represent power transmission, and the dashed lines represent market information exchange. In the distribution network structure, electric energy is sent from the power side to the load side through the distribution system. The dispatching center predicts the electricity demand of the next day from the day-ahead information of the grid and formulates a reasonable power supply plan. The load aggregator integrates the load-side information and sends it to the dispatching center, signs long-term electricity sales agreements with EV users, sells electric energy to them at the agreed price, and submits its plan for day-ahead market clearing.

The charging architecture of electric vehicles is shown in Fig. 2. The dispatching center makes a dispatching plan according to distribution network information, the predicted spot price and EV historical data. If the EV aggregator's initial electricity consumption plan would cause congestion in the distribution system, the dispatching center calculates the congestion price caused by controllable flexible loads such as electric vehicles and releases it to the EV aggregator. After receiving the price information from the dispatching center, each aggregator re-forecasts the dynamic price and makes its own optimal charging plan. Finally, the aggregators submit their energy plans to the spot market for market clearing.

Fig. 1 Distribution network architecture diagram

Fig. 2 Electric vehicle charging architecture
3 Optimal Scheduling Modeling

This section presents the mathematical models of the key components of the ADN, the operational constraints, and the objective function of minimum operating cost.
3.1 Objective Functions

First, the probability distributions of the EV daily driving distance and charging time are established. Monte Carlo sampling is then used to draw random samples from these distributions for the EVs connected in different regions of the distribution network. The probability distribution of the charging start time is [14]:
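$$
f(t) =
\begin{cases}
\dfrac{1}{\sigma_s\sqrt{2\pi}} \exp\left(-\dfrac{(t-\mu_s)^2}{2\sigma_s^2}\right), & \mu_s - 12 < t \le 24 \\[2ex]
\dfrac{1}{\sigma_s\sqrt{2\pi}} \exp\left(-\dfrac{(t+24-\mu_s)^2}{2\sigma_s^2}\right), & 0 < t \le \mu_s - 12
\end{cases}
\tag{1}
$$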
where f(t) is the probability density of the charging start time, taken to be the time at which the electric vehicle returns home, with mean μs = 17.6 and standard deviation σs = 3.4. The daily mileage follows the probability distribution:
$$
f(x) = \frac{1}{x\,\sigma_s\sqrt{2\pi}} \exp\left(-\frac{(\ln x-\mu_s)^2}{2\sigma_s^2}\right)
\tag{2}
$$
where f(x) is the probability density of the daily mileage x of an electric vehicle; here μs = 3.2 and σs = 0.88 are the mean and standard deviation of ln x. The charging time of an electric vehicle is:
$$
T_i^{EV,CL} = \frac{S_i^{EV}\, W_i^{EV,100}}{100\, P_i^{EV,C}\, \eta_i^{EV,C}}
\tag{3}
$$
where $S_i^{EV}$ is the daily driving distance of EV i, $W_i^{EV,100}$ is its energy consumption per 100 km, $P_i^{EV,C}$ is its average charging power, and $\eta_i^{EV,C}$ is its charging efficiency.
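The sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the start time is drawn from the normal density of Eq. (1) restricted to one 24-hour window and wrapped into (0, 24], the mileage is drawn from the log-normal density of Eq. (2), and the vehicle parameters (15 kWh per 100 km, 7 kW charging power, 0.9 efficiency) are hypothetical placeholders, not values from this paper.

```python
import random

MU_S, SIGMA_S = 17.6, 3.4   # start-time parameters quoted for Eq. (1), in hours
MU_D, SIGMA_D = 3.2, 0.88   # log-normal mileage parameters quoted for Eq. (2)


def sample_start_time(rng):
    """Draw a charging start time in (0, 24] following the shape of Eq. (1)."""
    while True:
        t = rng.gauss(MU_S, SIGMA_S)
        # Keep only one 24 h window centred on the mean, as Eq. (1) does.
        if MU_S - 12.0 < t <= MU_S + 12.0:
            # Times past midnight wrap to the early-morning branch.
            return t - 24.0 if t > 24.0 else t


def sample_mileage(rng):
    """Draw a daily mileage (km) from the log-normal density of Eq. (2)."""
    return rng.lognormvariate(MU_D, SIGMA_D)


def charging_time_h(mileage_km, kwh_per_100km, charge_power_kw, efficiency):
    """Charging time in hours according to Eq. (3)."""
    return mileage_km * kwh_per_100km / (100.0 * charge_power_kw * efficiency)


def sample_fleet(n, seed=0, kwh_per_100km=15.0, charge_power_kw=7.0, efficiency=0.9):
    """Monte Carlo sample of n EVs: (start time, mileage, charging time).

    The three vehicle parameters are hypothetical defaults for illustration.
    """
    rng = random.Random(seed)
    fleet = []
    for _ in range(n):
        start = sample_start_time(rng)
        mileage = sample_mileage(rng)
        duration = charging_time_h(mileage, kwh_per_100km, charge_power_kw, efficiency)
        fleet.append((start, mileage, duration))
    return fleet
```

Calling `sample_fleet(1000)` gives one random realisation of 1000 vehicles; aggregating start times and durations over the fleet would give an uncontrolled charging load profile.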
3.2 Distributed Energy Model

The output of new energy generation fluctuates greatly under the influence of natural conditions, so the actual output of wind and photovoltaic generation is modeled as the superposition of the predicted output and the prediction error. The sum of the predicted value and the prediction deviation cannot completely replace the real value, but it does reflect the randomness of photovoltaic and wind turbine output.

$$
P_t^{PV} = P_{t,pre}^{PV} + \varepsilon^{PV}
\tag{4}
$$

$$
P_t^{Wind} = P_{t,pre}^{Wind} + \varepsilon^{Wind}
\tag{5}
$$

where $P_t^{PV}$ and $P_t^{Wind}$ are the actual outputs of photovoltaic and wind power generation at time t, respectively; $P_{t,pre}^{PV}$ and $P_{t,pre}^{Wind}$ are the corresponding predicted outputs; and $\varepsilon^{PV}$ and $\varepsilon^{Wind}$ are the output prediction errors of photovoltaic and wind power generation, respectively.
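As a small illustration of Eqs. (4) and (5), the sketch below adds a random error to a forecast series. The error distribution is not specified in the text; a zero-mean Gaussian whose standard deviation is a fixed fraction of the forecast is assumed here purely for illustration, and negative results are clipped to zero.

```python
import random


def actual_output(forecast_kw, rel_sigma, rng):
    """Forecast plus prediction error, as in Eqs. (4)-(5).

    Assumption: error ~ N(0, (rel_sigma * forecast)^2); the paper does not
    state the error model. Output is clipped at zero.
    """
    result = []
    for p in forecast_kw:
        eps = rng.gauss(0.0, rel_sigma * p)
        result.append(max(0.0, p + eps))
    return result


pv_forecast = [0.0, 120.0, 300.0, 150.0]  # hypothetical forecast values in kW
pv_actual = actual_output(pv_forecast, 0.1, random.Random(42))
```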
3.3 Constraints on Operation

(1) Distributed energy constraints

$$
0 \le P_t^{PV} \le P_{t0}^{PV}, \quad 0 \le P_t^{Wind} \le P_{t0}^{Wind}
\tag{6}
$$

where $P_t^{PV}$ is the photovoltaic power generation at time t; $P_t^{Wind}$ is the wind power generation at time t; $P_{t0}^{PV}$ is the maximum photovoltaic generation at time t in maximum power point tracking mode; and $P_{t0}^{Wind}$ is the maximum wind generation at time t in maximum power point tracking mode.

(2) Conventional unit output and ramping (climbing) constraints

$$
0 \le P_t^{H} \le P_{max}^{H}
\tag{7}
$$

$$
P_{min}^{H} \le P_t^{H} - P_{t-1}^{H} \le P_{max}^{H}
\tag{8}
$$

where $P_t^{H}$ is the output power of a conventional unit at time t; $P_{max}^{H}$ in (7) is the maximum output power of a conventional unit; and $P_{min}^{H}$ and $P_{max}^{H}$ in (8) are the minimum and maximum ramping power of a conventional unit.

(3) Power balance constraint

$$
P_t^{PV} + P_t^{Wind} + P_t^{H} = P_t^{EV} + P_t^{J}
\tag{9}
$$

where $P_t^{EV}$ is the total charging power of electric vehicles at time t, and $P_t^{J}$ is the base load.
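Constraints (6) to (9) can be checked mechanically for any candidate schedule. The following is a minimal sketch, not the authors' implementation; the function name, argument names and tolerance are hypothetical, and separate arguments are used for the output limit in (7) and the ramp limits in (8).

```python
def check_dispatch(p_pv, p_wind, p_h, p_ev, p_base,
                   pv_max, wind_max, h_max, ramp_min, ramp_max, tol=1e-6):
    """Return a list of (period, message) pairs for violated constraints (6)-(9)."""
    violations = []
    for t in range(len(p_h)):
        # (6): renewable output between zero and the available maximum
        if not (-tol <= p_pv[t] <= pv_max[t] + tol):
            violations.append((t, "PV bound (6)"))
        if not (-tol <= p_wind[t] <= wind_max[t] + tol):
            violations.append((t, "wind bound (6)"))
        # (7): conventional unit output limit
        if not (-tol <= p_h[t] <= h_max + tol):
            violations.append((t, "unit output bound (7)"))
        # (8): ramping limit between consecutive periods
        if t > 0:
            ramp = p_h[t] - p_h[t - 1]
            if not (ramp_min - tol <= ramp <= ramp_max + tol):
                violations.append((t, "ramp limit (8)"))
        # (9): generation equals EV load plus base load
        supply = p_pv[t] + p_wind[t] + p_h[t]
        demand = p_ev[t] + p_base[t]
        if abs(supply - demand) > tol:
            violations.append((t, "power balance (9)"))
    return violations
```

An empty list means the schedule satisfies all four constraint groups; the returned messages indicate which equation fails in which period.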
(4) Electric vehicle constraints

When formulating the optimal scheduling plan, the travel constraints of electric vehicles should also be considered, so that users' needs can still be met when they have unexpected travel demands. The target SOC that satisfies the user should be reached as far as possible before charging ends. These requirements are described by:

$$
S_{SOC,i}^{EV}(t) = S_{SOC,i}^{EV}(t-1) + P_t^{EV}\,\Delta t
\tag{10}
$$

$$
S_{SOC,min}^{EV} \le S_{SOC,i}^{EV}(t) \le S_{SOC,max}^{EV}
\tag{11}
$$

$$
P_{min}^{EV} \le P_t^{EV} \le P_{max}^{EV}
\tag{12}
$$

$$
0.95\, S_{SOC,max}^{EV} \le S_{SOC,last}^{EV} \le S_{SOC,max}^{EV}
\tag{13}
$$

where $S_{SOC,i}^{EV}$ is the current state of charge of electric vehicle i; $\Delta t$ is the charging and discharging time; $P_{max}^{EV}$ and $P_{min}^{EV}$ are the upper and lower limits of the charging power of the electric vehicle; and $S_{SOC,last}^{EV}$ is the state of charge when the electric vehicle finishes charging. The adjustable range of the electric vehicle load can be expressed as:

$$
P_{sum,min}^{EV} \le P^{EV} \le P_{sum,max}^{EV}
\tag{14}
$$

where $P^{EV}$ is the upward or downward adjustable amount of the electric vehicle load, and $P_{sum,min}^{EV}$ and $P_{sum,max}^{EV}$ are its lower and upper limits, respectively.

(5) KKT condition

To obtain a reasonable electricity price, [15] converts the pricing problem into a Lagrangian dual problem, in which the optimal Lagrange multiplier is exactly the optimal price in the smart grid. Although this transformation yields the optimal Lagrange multiplier and the optimal electricity supply and consumption, its disadvantage is that there may be a duality gap between the original problem and the dual problem, so the solution of the dual problem may differ from that of the original problem. This paper therefore uses another method, which obtains the optimal Lagrange multiplier by solving the KKT conditions directly. The KKT conditions are introduced below for a simple optimization model in which the specific meaning of the objective function and constraints is omitted [16].
$$
\max_{x} f(x) \quad \text{s.t.} \quad j(x) \ge 0
\tag{15}
$$

where f(x) is the objective function, $j(x) = (j_1(x), j_2(x), \dots, j_m(x))$ is the inequality constraint function, each $j_i(x)$ $(i = 1, 2, \dots, m)$ is a function of n variables, and $x \in R^n$. Let $x^*$ be a local maximum of the inequality-constrained optimization problem (15), let $I(x^*) = \{\, i \mid j_i(x^*) = 0 \,\}$ be the active constraint set, and let f(x) and $j_i(x)$ $(i = 1, 2, \dots, m)$ be differentiable at $x^*$. If the gradients $\nabla j_i(x^*)$ $(i \in I(x^*))$ are linearly independent, then there exists a vector $\lambda^* = (\lambda_1^*, \lambda_2^*, \dots, \lambda_m^*)$ such that

$$
\begin{cases}
\nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla j_i(x^*) = 0 \\
\lambda_i^* \ge 0,\; j_i(x^*) \ge 0,\; \lambda_i^*\, j_i(x^*) = 0, \quad \forall i = 1, 2, \dots, m
\end{cases}
\tag{16}
$$

Take $f(v^k) := f(x) = \sum_{i=1}^{N} U(x_i^k, \omega_i^k) - C_k(L_k)$ and $j(v^k) := j(x) = L_k - \sum_{i=1}^{N} x_i^k \ge 0$, where $v^k = (x_1^k, \dots, x_N^k, L_k)^T$, $f(v^k)$ is the benefit function in the k-th period, $j(v^k)$ is the electric quantity constraint in the k-th period, and $j(v^k): R^{N+1} \to R^1$. Let $v^{k*}$ be a local maximum of the model. Using the KKT conditions, the problem can be transformed into the following nonlinear equations in $v^{k*}$ and $\lambda_k^*$:

$$
\begin{cases}
\nabla_{v^k} \Lambda_k(v^{k*}, \lambda_k^*) := \nabla_{v^k}\left(\sum_{i=1}^{N} U(x_i^k, \omega_i^k) - C_k(L_k)\right) + \lambda_k^*\, \nabla_{v^k}\left(L_k - \sum_{i=1}^{N} x_i^k\right) = 0 \\
\lambda_k^*\left(L_k - \sum_{i=1}^{N} x_i^k\right) = 0,\; \lambda_k^* \ge 0,\; L_k - \sum_{i=1}^{N} x_i^k \ge 0
\end{cases}
\tag{17}
$$

where $\nabla_{v^k} \Lambda_k(v^{k*}, \lambda_k^*) = \left(\nabla_{x_1^k} \Lambda_k(v^{k*}, \lambda_k^*), \dots, \nabla_{x_N^k} \Lambda_k(v^{k*}, \lambda_k^*), \nabla_{L_k} \Lambda_k(v^{k*}, \lambda_k^*)\right)^T$. Solving (17) gives the optimal solutions $\lambda_k^*$ and $v^{k*}$. Since the shadow price of power resources is the value of the Lagrange multiplier at the optimum, the optimal electricity price $\lambda_k^*$ within the k-th period is determined by (17).
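To make the role of the multiplier concrete, the sketch below solves the KKT system (17) in closed form for one period under assumed functional forms. The quadratic utility U(x, ω) = ωx − (α/2)x² and the quadratic cost C(L) = aL² + bL are illustrative assumptions, not the forms used in this paper, and the solution assumes an interior optimum (all x_i > 0 and λ > 0, so the supply constraint is active).

```python
def solve_kkt_price(omegas, alpha, a, b):
    """Closed-form KKT solution for one period under the assumed quadratic forms.

    Stationarity in x_i:  omega_i - alpha * x_i - lam = 0
    Stationarity in L:    -(2 * a * L + b) + lam = 0
    Complementarity with lam > 0:  L = sum(x_i)
    """
    n = len(omegas)
    lam = (sum(omegas) / alpha + b / (2.0 * a)) / (n / alpha + 1.0 / (2.0 * a))
    x = [(w - lam) / alpha for w in omegas]   # optimal consumption of each user
    supply = (lam - b) / (2.0 * a)            # optimal supply L
    return lam, x, supply


# Hypothetical preference parameters for three users
price, consumption, supply = solve_kkt_price([4.0, 5.0, 6.0], alpha=0.5, a=0.05, b=0.2)
```

Under these assumptions the multiplier equals both each user's marginal utility and the supplier's marginal cost, which is why it can be read as the price for that period. In a real case the solution should be checked for negative x_i, which would mean the interior assumption does not hold.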
4 Objective Function

With the goal of minimizing the operation cost of the power grid, a reasonable dynamic electricity price can be formulated so that users respond to changes in the price, thereby changing their electricity consumption behavior and optimizing the overall load curve.

$$
f_1 = W^{PV} + W^{Wind} + W^{EV} + W^{H} + W^{J}
\tag{18}
$$

$$
W^{PV} = P^{PV} \sum_{t=1}^{T} E_t^{PV}
\tag{19}
$$

$$
W^{Wind} = P^{Wind} \sum_{t=1}^{T} E_t^{Wind}
\tag{20}
$$

$$
W^{EV} = P^{EV} \sum_{t=1}^{T} E_t^{EV}\, \Delta t
\tag{21}
$$

$$
W^{H} = P^{H} \sum_{t=1}^{T} E_t^{H}
\tag{22}
$$

$$
W^{J} = P^{J} \sum_{t=1}^{T} E_t^{J}
\tag{23}
$$

where $f_1$ is the total operation cost of the day-ahead dispatching plan; $W^{PV}$ is the cost of photovoltaic power generation; $W^{Wind}$ is the cost of wind power generation; $W^{EV}$ is the response cost of dispatching electric vehicles; $W^{H}$ is the power generation cost of conventional units; $W^{J}$ is the base load cost; $E_t^{PV}$ is the cost per kWh over the life cycle of the photovoltaic system; $E_t^{Wind}$ is the cost per kWh over the life cycle of the wind turbine; $E_t^{EV}$ is the final electricity price after the load response of electric vehicles; $E_t^{H}$ is the cost per kWh of the conventional unit; $E_t^{J}$ is the cost per kWh of the base load of system operation; and $\Delta t$ is the length of a scheduling period.
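The cost structure of Eqs. (18) to (23) can be evaluated with a short helper. This is a sketch under stated assumptions: each P is treated as a single scalar, as the equations are written, the EV term is multiplied by a period length dt (assumed to be 1 h by default), and the dictionary keys and function name are hypothetical.

```python
SOURCES = ("PV", "Wind", "EV", "H", "J")


def total_cost(p, unit_cost, dt=1.0):
    """Evaluate f1 from Eq. (18) and its components from Eqs. (19)-(23).

    p:         dict mapping each source name to its scalar P
    unit_cost: dict mapping each source name to a list of per-period E_t values
    dt:        period length applied to the EV term only (an assumption)
    """
    costs = {}
    for name in SOURCES:
        step = dt if name == "EV" else 1.0
        costs[name] = p[name] * sum(e * step for e in unit_cost[name])
    return sum(costs.values()), costs


# Hypothetical two-period example
p_example = {"PV": 2.0, "Wind": 1.0, "EV": 3.0, "H": 4.0, "J": 5.0}
e_example = {
    "PV": [0.1, 0.2],
    "Wind": [0.3, 0.3],
    "EV": [1.0, 0.5],
    "H": [0.4, 0.6],
    "J": [0.2, 0.2],
}
f1_example, parts_example = total_cost(p_example, e_example)
```

Returning the per-source components alongside the total makes it easy to compare a disorderly and an orderly charging schedule term by term.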
5 Example Analysis

The problem is solved using MATLAB R2018b with CPLEX. N = 1000 means that 1000 electric vehicles are subject to scheduling, and T = 24 means that a day is divided into 24 periods; the charging demand of each EV user i is identified by a value generated randomly from 1 to 1000. The main grid adopts a time-of-use electricity price, and the specific price information is shown in Table 1. The load aggregator publishes today's electricity sale price according to the previous day's electricity sales and sale price (Fig. 3). In Fig. 3, the red line is the dynamic electricity price, the black line is the time-of-use electricity price of the grid, the blue line is the disorderly charging power of electric vehicles, and the green line is the orderly charging power of electric vehicles. During [0–6], the REP is priced slightly higher than the TOU price, and less electricity is purchased from the grid because most of the electricity is supplied by energy stored in the smart building energy storage system the day before. During [6–12], the REP price increases significantly, while the TOU price increases less. In addition, photovoltaics provide electricity, reducing
Table 1 TOU electricity price of the grid

| Time period | Peak-valley type | Electricity price |
|---|---|---|
| 00:00–06:00 | Valley | 0.4 |
| 06:00–12:00 | Peak | 1.1 |
| 12:00–19:00 | Flat | 0.8 |
| 19:00–23:00 | Peak | 1.1 |
| 23:00–24:00 | Valley | 0.4 |
Fig. 3 Power and electricity price optimization diagram for electric vehicles
the purchase of electricity from the main grid. Therefore, the TOU price and the REP price rise together, but the rate of increase of the REP price falls. During [12–21], the power purchased from the main grid gradually increases during [12–16] and gradually decreases during [16–21], and the REP pricing changes accordingly. This further shows that REP pricing can respond to demand-side users in a timely manner, thus improving user satisfaction. After optimized scheduling by the load aggregator, the peak of the EV charging load moves forward, so that EVs charge off-peak. When grid operation reaches its peak, the EV load is low: the EV load at the grid peak decreases by 568 kW, which effectively improves the stability of grid operation (Table 2).

As shown in Figs. 4 and 5, the dispatching center makes a day-ahead dispatching output plan based on the predicted EV load, PV power generation, and wind power generation. Wind power is generated throughout the day, photovoltaic generation is concentrated in the daytime, new energy output accounts for about 35% of the total equipment output, and the EV load falls mainly within 12:00–24:00. The
Table 2 Cost comparison statement

| | Conventional unit | Photovoltaic power generation | Wind power generation | Total cost |
|---|---|---|---|---|
| Disorder | 70,562.02 | 11,913.63 | 22,133.41 | 104,609.06 |
| Order | 65,635.04 | 12,025.35 | 22,225.24 | 99,885.63 |

Fig. 4 Distributed energy output

Fig. 5 Power balance diagram
peak output of conventional units in the scheduling plan is 6106 kW, and their total operating cost is 70,562.02. Compared with the TOU price, the electricity price for the orderly EV load is higher in peak periods and lower in valley periods. Compared with the disorderly EV load, the orderly EV load power is reduced in peak periods and increased in flat periods, which is consistent with the inverse relationship between electricity price and power given by the price elasticity coefficient matrix. This indicates that the dynamic electricity price response of the EV load has a positive effect on the optimization of the dispatching strategy.
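The figures in Table 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only the values printed in Table 2; the dictionary keys and the function name are hypothetical.

```python
# Cost components as printed in Table 2
disorder = {"conventional": 70562.02, "pv": 11913.63, "wind": 22133.41}
order = {"conventional": 65635.04, "pv": 12025.35, "wind": 22225.24}


def compare_costs(before, after):
    """Return both totals, the absolute saving, and the saving as a percentage."""
    total_before = sum(before.values())
    total_after = sum(after.values())
    saving = total_before - total_after
    return total_before, total_after, saving, 100.0 * saving / total_before


total_disorder, total_order, saving, saving_pct = compare_costs(disorder, order)
```

The component sums reproduce the totals listed in the table, and the difference between them is a reduction of about 4,723.43, or roughly 4.5% of the disorderly-charging total. The reduction comes from the conventional unit column, while the photovoltaic and wind columns increase slightly.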
6 Conclusion

To address the load variability caused by complex peak-valley periods, which increases power network operation cost, this paper proposes a market bargaining scheduling strategy for flexible loads in the active distribution network (ADN) based on demand response. From the analysis of the example, the following conclusions can be drawn:

(1) This paper introduces the KKT conditions to solve for the optimal electricity price of the power grid, guides users to participate in the optimization of power grid dispatching through dynamic electricity prices, and significantly reduces the peak load of EV users.

(2) The market bargaining scheduling strategy proposed in this paper requires the load aggregator to integrate the information of downstream users and then build a reasonable scheduling plan, which improves user satisfaction and reduces power network fluctuations.

Acknowledgements Supported by Jilin Province science and technology development plan project (20210203101SF).
References

1. Development and Reform Commission, Energy Bureau. Circular on the issuance of the basic rules for medium and long term transactions of electric power. Bulletin of The State Council of the People’s Republic of China, (22), pp. 55–70 (2020)
2. National Development and Reform Commission, National Energy Administration. Notice on signing long-term electric power contracts in 2021[EB/OL] (2020-12-02) [2022-08-07]
3. Asadi, G., Gitizadeh, M., Roosta, A.: Welfare maximization under real-time pricing in smart grid using PSO algorithm. In: Iranian Conference on Electrical Engineering, pp. 1–7 (2013)
4. Qiu, G., He, C., Luo, Z., et al.: Economic dispatch of Stackelberg game in distribution network considering. Electr. Power Autom. Equip. 41(6), 66–74 (2021)
5. Yang, J., Qin, W., Shi, W., et al.: Two-stage optimal dispatching of regional power grid based on electric vehicles participation in peak-shaving pricing strategy. Trans. China Electrotech. Soc. 37(1), 58–71 (2022)
6. Sautermeister, S., Falk, M., Baker, B., et al.: Influence of measurement and prediction uncertainties on range estimation for electric vehicles. IEEE Trans. Intell. Transp. Syst. 19(8), 2615–2626 (2018)
7. Ma, W., Wang, W., Wu, X., et al.: Optimal dispatching strategy of hybrid energy storage system to smooth the fluctuation of photovoltaic grid-connected power. Autom. Electr. Power Syst. 43(3), 58–66 (2019)
8. Zhou, T., Sun, W.: Multi-objective optimal scheduling of electric vehicles for charging route based on utilization rate of charging device. Power Syst. Prot. Control 47(4), 115–123 (2019)
9. Chen, Q., Liu, N., Zhao, T., et al.: Automatic demand response for PV charging station based on receding linear programming. Power Syst. Technol. 40(10), 2967–2974 (2016)
10. Geng, S., Niu, X., Guo, X., et al.: Multi-objective evolutionary game of micro energy grid considering multi-energy flexible load scheduling. Electr. Power Constr. 41(11), 101–115 (2020)
11. Zhao, X., Feng, X., Dai, R., et al.: Coordinated dispatch model for integrated electricity and natural gas systems considering load flexibilities and network topology optimization. Proc. CSEE 41(20), 6856–5859 (2021)
12. Wang, Z., Zhang, Y., Huang, K., et al.: Robust optimal scheduling model of virtual power plant combined heat and power considering multiple flexible loads. Electr. Power Constr. 42(7), 1–10 (2021)
13. Cui, Y., Guo, F., Fu, X., et al.: Source-load coordinated optimal dispatch of integrated energy system based on conversion of energy supply and use to promote wind power accommodation. Electr. Power Constr. 46(4), 1437–1447 (2022)
14. Tian, L., Shi, S., Jia, Z.: A statistical model for charging power demand of electric vehicles. Power Syst. Technol. 34(11), 126–130 (2010)
15. Liu, B., Li, J., Gao, Y.: Research on real-time pricing algorithm for classified subscribers on smart grid. Appl. Res. Comput. 34(9), 51–55 (2017)
16. Li, Y., Li, J., Dang, Y., et al.: Smoothing Newton algorithm for real-time pricing of smart grid based on KKT conditions. J. Syst. Sci. Math. Sci. 40(4), 646–656 (2020)
Research on Intelligent Collaboration and Obstacle Avoidance Control of Multiple Parafoils System

Jinshan Yang, Qinglin Sun, Hao Sun, Yuemin Zheng, and Zengqiang Chen
Abstract In large-scale airdrop missions, multiple parafoils are needed for tasks such as supplying materials, transporting weapons and equipment, and performing earthquake relief. However, coordinating these systems and avoiding obstacles in the environment can be a challenge. To address these issues, we propose an intelligent coordination and obstacle avoidance control strategy for multi-parafoil systems. The strategy uses a nonlinear reduced-order model of the multi-parafoil system with a wind field to simulate its dynamic characteristics. The leader-following method is used to exchange position information between parafoils, while an active disturbance rejection control (ADRC) algorithm and the artificial potential field method provide intelligent coordination control and autonomous obstacle avoidance in obstacle spaces. Simulation experiments demonstrate that this control strategy allows the multi-parafoil system to maintain a stable tracking formation and avoid obstacles autonomously within a limited time in large-scale airdrop missions, and that motion errors converge within a certain range.

Keywords Multiple parafoils · Intelligent coordination · Autonomous obstacle avoidance · Active disturbance rejection control
J. Yang · Q. Sun (B) · H. Sun · Y. Zheng · Z. Chen
College of Artificial Intelligence, Nankai University, Tianjin 300350, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_73

1 Introduction

Traditional round parachutes are still widely used in airdropping and retrieving supplies because of their technological maturity and ease of operation. However, their landing point is uncontrollable and their scattering area is large, which makes it difficult to achieve precision delivery and to ensure the safety of the dropped materials [1]. In contrast, the parafoil system has many advantages, such as a high lift-to-drag ratio, controllability, simple operation, large payload capacity, stable flight, safety, reliability, low cost, and long endurance, so it is widely used in military and civilian fields. Timely and accurate supply of equipment and material rescue drops
play a crucial role in battlefield situations and post-disaster relief efforts. The parafoil system can achieve accurate delivery and safe landing of loads through automatic control technology and flare (bird-like) landing techniques, allowing lossless recovery and reuse [2]. Large-scale airdrops will be necessary for emergency battlefield equipment support and post-disaster material security. However, large-scale airdrops of multiple parafoil systems suffer from a large landing scattering area, independent flight tasks, lack of coordination, collisions between parafoils, and collisions with obstacles. Research on the airdrop of multiple parafoil systems therefore has increasing strategic significance and has attracted the attention of many research institutions and researchers, including NASA and ESA.

Research on multiple parafoils is still at an early stage, and few results are available. Kaminer et al. [3] proposed a method for the cooperative landing of multiple parafoils with a high glide ratio. An optimal control method plans a feasible collision-free path for all parafoils, and each parafoil tracks its designated path, which achieves coordination of the multiple parafoils under time constraints. However, this method depends on the decoupling of time and space. Calise et al. [4] attempted to imitate the behavior of biological systems so that multiple parafoil systems can autonomously track and avoid obstacles in complex environments. This control strategy requires only a few parameters to be determined from flight tests and avoids the system identification and parameter estimation problems that come with complex algorithms. Rosich et al. [5] used a large 6000 kg parafoil model to design a guidance law for precise landing of the parafoil system and proposed a behavior-based cooperative motion control strategy for multiple parafoils.
That research shows that multiple parafoils can stay safely separated and avoid collisions during landing, and that this cooperative control strategy can accurately deliver critical materials to the target location.

Research in China on return trajectory planning and tracking control of multiple parafoils is relatively scarce. It mainly uses optimal control methods and traditional automatic control algorithms to plan the optimal trajectory of the parafoil system and to track that trajectory in real time. Luo et al. [6] proposed a trajectory planning method for multiple parafoils based on the Gauss pseudospectral method. Simulation results show that the parafoils gradually converge into a group when reaching the target area while avoiding collisions with each other. Chen et al. [7] established a simple model of multiple parafoils and used an artificial potential field method to avoid collisions between parafoils and to control their assembly. Sun et al. [8] fully considered the nonlinear dynamic characteristics of the parafoil, used an 8 degrees of freedom (DOF) dynamic model, designed an active disturbance rejection control (ADRC) algorithm, and proved the stability of the controller under disturbances such as communication delay and an external wind field. Chen et al. [9] proposed a virtual-structure-based formation guidance strategy for multiple parafoils, which guides the parafoils to track corresponding positions on the virtual structure and complete coordinated formation tasks. Simulation results show that the guidance
strategy based on the virtual structure can gradually gather multiple parafoils with different initial positions and angles into formation, track the planned trajectory, and land accurately at the target point.

In summary, cooperative control of multiple parafoils has become a new research hotspot abroad in the field of parafoil guidance, navigation and control. How to avoid collisions between parafoils, avoid obstacles autonomously, and coordinate a consistent landing has become an urgent problem. This article studies the collective control and autonomous obstacle avoidance of multiple parafoils in response to these issues. It investigates a collective control algorithm for multiple parafoils in obstacle spaces that avoids collisions, providing a theoretical reference for the further development of multiple parafoil technology.
2 Modeling of Multiple Parafoils

A large-scale parafoil airdrop is shown in Fig. 1. The parafoil is made of flexible textile materials and takes on a shape similar to an airplane wing after inflation. It is controlled by pulling the control lines attached to the trailing edge of the canopy on both sides. In most cases, the main flight modes of the parafoil are gliding, turning, decelerating, and flare (bird-style) landing.
Fig. 1 Large scale parafoils airdrop
The model of the parafoil system involves multiple physical quantities such as lift, drag, and added mass, and it is highly nonlinear. The commonly used models are the 6 DOF, 8 DOF, and 9 DOF models [10–12]. As the number of degrees of freedom increases, more state parameters can be obtained, but the computational complexity also increases. In large-scale airdrop missions involving multiple parafoils, it is not necessary to study the relative motion between the canopy and the payload; only the trajectory of the system's center of mass is needed. To simplify the dynamic and kinematic equations of the parafoil system, a simplified point-mass model is chosen in place of the complex high-DOF model, and an equivalent nonlinear reduced-order model is obtained through transformation [13, 14].

During gliding, whether the parafoil is controlled by a single-sided or double-sided pull, the changes in glide rate and horizontal speed are minimal. We therefore make the following assumptions: the lift-to-drag ratio of the parafoil during the stable flight phase is constant, so its vertical and horizontal speeds remain unchanged; only a horizontal wind field is considered, and it is known; and the control response of the parafoil system has no delay. Under these assumptions, the dynamic equation of the parafoil in the wind coordinate system can be simplified as Eq. 1:

$$
\begin{cases}
\dot{x}_i = v_s \cos\theta_i + v_{f,x} \\
\dot{y}_i = v_s \sin\theta_i + v_{f,y} \\
\dot{\theta}_i = w_i \\
\dot{z}_i = v_z
\end{cases}
\quad i = 1, 2, 3, \dots, N
\tag{1}
$$

where $x_i$, $y_i$, $z_i$ are the position of each parafoil system; $\theta_i$ is the yaw angle of each parafoil system; $w_i$ is the turning angular velocity of the parafoil produced by asymmetric pull-down of the control lines, which is also the control input of each parafoil; $v_s$ and $v_z$ are the horizontal and vertical velocities of the parafoil system; and $v_{f,x}$ and $v_{f,y}$ are the components of the known horizontal wind velocity.
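The reduced-order model in Eq. 1 is simple enough to integrate directly. The sketch below uses a forward Euler step, which is an illustrative choice rather than the integration scheme used by the authors; the function names, the step size and the numerical values are hypothetical, and descent is represented by passing a negative vertical speed when z is taken as altitude.

```python
import math


def step(state, w, vs, vz, wind, dt):
    """One forward Euler step of Eq. 1 for a single parafoil.

    state: (x, y, z, theta); w: turn rate command; wind: (v_fx, v_fy).
    """
    x, y, z, theta = state
    v_fx, v_fy = wind
    x_next = x + (vs * math.cos(theta) + v_fx) * dt
    y_next = y + (vs * math.sin(theta) + v_fy) * dt
    z_next = z + vz * dt
    theta_next = theta + w * dt
    return (x_next, y_next, z_next, theta_next)


def simulate(state, controls, vs, vz, wind=(0.0, 0.0), dt=0.1):
    """Integrate Eq. 1 over a sequence of turn rate commands; returns the trajectory."""
    trajectory = [state]
    for w in controls:
        state = step(state, w, vs, vz, wind, dt)
        trajectory.append(state)
    return trajectory


# Hypothetical example: straight glide for 1 s from 1000 m, no wind
straight = simulate((0.0, 0.0, 1000.0, 0.0), [0.0] * 10, vs=10.0, vz=-4.0)
```

For a multi-parafoil simulation the same `step` function would be applied to each parafoil's state with its own turn rate command, which keeps the per-step cost linear in the number of parafoils.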
By comparing simulation results, several researchers have found that the center-of-mass motion predicted by the reduced-order model follows the same trends as that predicted by the 6 DOF model [15–17]. The reduced-order model is therefore sufficient to describe the motion characteristics of multiple parafoil systems, and it greatly reduces the computational complexity. In the multi-parafoil system, every individual parafoil is simulated with the reduced-order model, and the cooperative motion trajectories are then solved. Furthermore, as Eq. 1 shows, the parafoil system is inherently nonlinear.

Based on the single-parafoil model described above, the case of multiple parafoils dropped together can be considered. Unlike the single-parafoil case, this scenario requires more attention to the consistency between parafoils and to collision avoidance. Each parafoil can therefore be viewed as an intelligent agent. An intelligent agent is generally defined as an entity that can perceive its own state and interact with the environment through its own sensors. Parafoils can already perceive their own state and that of the environment through onboard sensors and carry out the corresponding control operations.
Research on Intelligent Collaboration and Obstacle Avoidance Control . . .
3 Multiple Parafoils Collaboration and Obstacle Avoidance Control

3.1 Cooperative Control

During large-scale airdrops of multi-parafoil systems, it is common for the control center to plan a predetermined trajectory for each individual parafoil, which then tracks the planned trajectory to the target point, thereby achieving coordinated control of the multiple parafoils. This approach has a serious drawback: a parafoil malfunction or other unexpected accident may lead to extreme situations such as collisions between parafoils. To overcome this problem, this paper proposes a leader-follower consensus strategy for coordinated control. One parafoil is selected as the leader and the others as followers. The leader tracks the planned trajectory, and the leader and followers coordinate with each other to hold a fixed formation until landing. This method can effectively prevent extreme situations such as collisions between parafoils and improves the stability and robustness of the system.

The leader is given a fixed route that guides it to the target point. Each follower controls its speed and position in real time by adjusting its position error with respect to the leader and the other followers, so that it follows the leader and forms a formation with the other agents. Figure 2 shows the communication topology of the multiple parafoils, and Eq. 2 gives the corresponding communication matrix. $P_0$ represents the leader, while $P_1$, $P_2$, $P_3$, and $P_4$ represent the four followers in the formation. The arrows indicate the direction of information flow, where $A_{ij}$ represents the information flow from $j$ to $i$: $A_{ij} = 1$ indicates that there is information exchange between $j$ and $i$, while $A_{ij} = 0$ means that there is none. The diagonal elements are all 0 because an individual cannot establish a communication connection with itself.
\[
A_{ij} =
\begin{bmatrix}
0 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 1 & 0
\end{bmatrix} \tag{2}
\]
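The convention that $A_{ij} = 1$ means agent $i$ receives information from agent $j$ can be made concrete with a small sketch. The row ordering below follows Eq. 2 as printed, the assumption that index 0 corresponds to $P_0$ is ours, and the helper name `in_neighbours` is hypothetical.

```python
# Communication matrix of Eq. 2; A[i][j] == 1 means agent i receives from agent j.
A = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
]

def in_neighbours(adjacency, i):
    """Indices j of the agents whose information reaches agent i."""
    return [j for j, a in enumerate(adjacency[i]) if a == 1]

# No agent is connected to itself, as stated for the diagonal of Eq. 2.
no_self_links = all(A[i][i] == 0 for i in range(len(A)))

# Each agent and the agents it listens to.
neighbour_table = {i: in_neighbours(A, i) for i in range(len(A))}
```

A follower's formation error would then be computed only over the entries of its neighbour list, which is what keeps the scheme local rather than centralised.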
3.2 Artificial Potential Field Method

To avoid obstacles, the followers use the artificial potential field (APF) method and quickly restore the original formation after the avoidance maneuver, which enables the multiple parafoils to avoid obstacles autonomously. The APF method abstracts the parafoil system as a point under the action of a potential field. The potential field is composed of an attractive field centered on the target point and a repulsive field centered on obstacles. The parafoil is guided to the target point by the potential field forces it receives in the environment [18].

J. Yang et al.

Fig. 2 Communication topology of multi parafoils

The magnitude of the attractive force changes with the distance between the parafoil and the target point. In traditional APF, it is proportional to that distance, so the force increases as the distance increases. The attractive potential field function can be defined as Eq. 3:

\[
U_{grav}(q) = \frac{1}{2}\,\varepsilon\,\rho_{rt}^{2} \tag{3}
\]

where $\varepsilon$ is the attraction gain coefficient, and $\rho_{rt} = \lVert q - q_{target} \rVert$ is the Euclidean distance between the parafoil position $q$ and the target point position $q_{target}$. The attractive force points from the position of the parafoil to the position of the target point. Its magnitude is the gradient of the attractive field, as shown in Eq. 4:

\[
F_{grav}(q) = \nabla U_{grav}(q) = \varepsilon\,\rho_{rt} \tag{4}
\]

The repulsive potential field function is defined as Eq. 5:

\[
U_{rep}(q) =
\begin{cases}
\dfrac{1}{2}\,\sigma \left( \dfrac{1}{\rho_{ro}} - \dfrac{1}{\rho_{0}} \right)^{2}, & 0 < \rho_{ro} \le \rho_{0} \\
0, & \rho_{ro} > \rho_{0}
\end{cases} \tag{5}
\]

where $\sigma$ is the position repulsion gain coefficient, and $\rho_{ro} = \lVert q - q_{o} \rVert$ is the Euclidean distance between the parafoil position $q$ and the obstacle position $q_{o}$. $\rho_{0}$ is a constant that represents the maximum range of the obstacle repulsion force.
Fig. 3 Force analysis of the parafoil under multiple obstacles
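The repulsive force points from the obstacle position to the parafoil position. Likewise, its magnitude is the gradient of the repulsive field, as shown in Eq. 6:

\[
F_{rep}(q) = \nabla U_{rep}(q) =
\begin{cases}
\sigma \left( \dfrac{1}{\rho_{ro}} - \dfrac{1}{\rho_{0}} \right) \dfrac{1}{\rho_{ro}^{2}}, & 0 < \rho_{ro} \le \rho_{0} \\
0, & \rho_{ro} > \rho_{0}
\end{cases} \tag{6}
\]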
Based on the defined functions of the attractive and repulsive fields, the global potential field function in the entire space can be obtained as Eq. 7:

\[
U(q) = U_{grav}(q) + U_{rep}(q) \tag{7}
\]

The resultant force acting on the parafoil is the gradient of the global potential field, as shown in Eq. 8:

\[
F(q) = \nabla U(q) = F_{grav}(q) + F_{rep}(q) \tag{8}
\]
Under the influence of multiple obstacles, the force acting on the parafoil is shown in Fig. 3.
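The force magnitudes of Eqs. 4 and 6 and the directions stated in the text can be combined into vector forces, as in the following sketch. It is an illustration under assumptions rather than the authors' implementation: the function names, the summation over several obstacles, and the numeric gains are hypothetical.

```python
import math

def attractive_force(q, q_target, eps):
    """Vector from q toward the target with magnitude eps * rho_rt (Eq. 4)."""
    return tuple(eps * (t - p) for p, t in zip(q, q_target))

def repulsive_force(q, q_obs, sigma, rho0):
    """Vector from the obstacle toward q with the magnitude of Eq. 6."""
    diff = [p - o for p, o in zip(q, q_obs)]
    rho = math.sqrt(sum(c * c for c in diff))
    if rho == 0.0:
        raise ValueError("Eq. 6 is defined only for rho_ro > 0")
    if rho > rho0:
        return tuple(0.0 for _ in diff)  # outside the obstacle's range
    magnitude = sigma * (1.0 / rho - 1.0 / rho0) / rho ** 2
    return tuple(magnitude * c / rho for c in diff)

def resultant_force(q, q_target, obstacles, eps, sigma, rho0):
    """Sum of the attractive force and all repulsive forces (Eq. 8)."""
    total = list(attractive_force(q, q_target, eps))
    for q_obs in obstacles:
        rep = repulsive_force(q, q_obs, sigma, rho0)
        total = [a + b for a, b in zip(total, rep)]
    return tuple(total)

# Illustrative 2D values only.
f_att = attractive_force((0.0, 0.0), (3.0, 4.0), eps=2.0)
f_rep = repulsive_force((1.0, 0.0), (0.0, 0.0), sigma=1.0, rho0=2.0)
f_far = repulsive_force((3.0, 0.0), (0.0, 0.0), sigma=1.0, rho0=2.0)
```

Because the repulsive term vanishes beyond $\rho_0$, only nearby obstacles affect the resultant, which is why the formation can be restored once the parafoil has passed the obstacle.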
3.3 Design of Nonlinear ADRC

Although the traditional PID algorithm has the advantages of simple design, strong adaptability, and few parameters to adjust, it is difficult to achieve the optimal solution during parameter tuning. Active disturbance rejection control (ADRC) is a control algorithm proposed by Han [19] for engineering applications. It builds on the essence of PID control and absorbs the achievements of modern control theory. ADRC was developed through computer simulation experiments and the induction and summarization of their results. It is a practical digital control technology that does not rely on an accurate model of the controlled object and can replace PID control. Its anti-interference ability and its ability to handle uncertainty have been tested in many experimental systems and practical engineering applications [20], which has attracted widespread attention from researchers. ADRC includes a tracking differentiator, an extended state observer, and a nonlinear error feedback control law, among which the extended state observer is the core part. The extended state observer can estimate and compensate for internal and external disturbances, as well as unmodeled and nonlinear changes in the system, so that the controlled object is handled as an equivalent "integrator series" linear system.

In this paper, the ADRC algorithm is employed to regulate the formation. As shown in Fig. 4, ADRC mainly consists of the following components: Nonlinear Tracking Differentiator (TD), Extended State Observer (ESO), and Nonlinear Feedback (NLSEF). To apply the ADRC algorithm in a digital controller, the ESO, as the core component, must be discretized. Two discretization methods are commonly used: Euler's method and the Zero-Order Hold (ZOH) method. When the system sampling period is short, the observation performance of the ESO is similar for both methods. When the sampling period is long, the estimation ability differs significantly, and the ZOH method has higher estimation accuracy and tracking speed.

Fig. 4 Force analysis of the parafoil under multiple obstacles
By using the ZOH method for discretization, the discrete state-space expression of the ESO for the second-order system is given by Eq. 9:

\[
\begin{cases}
e(k) = z_1(k) - y(k) \\
Z(k+1) = \Phi Z(k) + \Gamma u(k) - L_p\, e(k)
\end{cases} \tag{9}
\]

Here,

\[
\Phi = \begin{bmatrix} 1 & T & \dfrac{T^{2}}{2} \\ 0 & 1 & T \\ 0 & 0 & 1 \end{bmatrix}, \qquad
\Gamma = \begin{bmatrix} \dfrac{bT^{2}}{2} \\ bT \\ 0 \end{bmatrix}
\]

where $e(k)$ is the observation error, $z_1(k)$ is the observed position error value, and $y(k)$ is the actual output value. $Z(k) = \begin{bmatrix} z_1(k) & z_2(k) & z_3(k) \end{bmatrix}^{T}$, where $z_2(k)$ is the derivative of the observed position error and $z_3(k)$ represents the total disturbance observed in the system. $T$ is the sampling period. Set $L_p = \Phi L_c$ and rewrite Eq. 9 as Eq. 10:

\[
\begin{cases}
e(k) = z_1(k) - y(k) \\
Z(k+1) = (\Phi - \Phi L_c C)\, Z(k) + \Gamma u(k) + L_p\, y(k)
\end{cases} \tag{10}
\]

where $C$ is the output matrix of the discrete system, $C = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$. To simplify the parameters, all characteristic roots of the discrete system $(\Phi - \Phi L_c C)$ are placed at $\beta$, as shown in Eq. 11:

\[
\left| zI - (\Phi - \Phi L_c C) \right| = (z - \beta)^{3} \tag{11}
\]
In Eq. 11, the characteristic root $\beta$ configured for the discrete system is related to the characteristic root $-\omega_0$ configured for the continuous system by $\beta = \exp(-\omega_0 T)$. Solving Eq. 11 gives Eq. 12:

\[
L_c = \begin{bmatrix} 1 - \beta^{3} \\ \dfrac{3}{2T}\,(1 - \beta)^{2}(1 + \beta) \\ \dfrac{1}{T^{2}}\,(1 - \beta)^{3} \end{bmatrix} \tag{12}
\]

The nonlinear feedback module then forms the control and disturbance compensation from the errors between the input signal and its derivative on the one hand, and the system output and its derivative as estimated by the state observer on the other, as shown in Eqs. 13 and 14:

\[
\begin{cases}
E_1 = x_1(k) - z_1(k) \\
E_2 = x_2(k) - z_2(k)
\end{cases} \tag{13}
\]

\[
u(k+1) = u(k) + \frac{A_{ij}\left( k_p \cdot E_1 + k_d \cdot E_2 - z_3(k+1) \right)}{b} \tag{14}
\]
where $x_1(k)$ is the predicted position error input and $x_2(k)$ is the predicted position error derivative input. The state variable errors $E_1$ and $E_2$ are formed by subtracting the state variable estimates $z_1(k)$ and $z_2(k)$ provided by the ESO from these inputs. The control input $u$ in Eq. 14 combines the state variable errors $E_1$ and $E_2$ with the compensation term $z_3$ estimated by the ESO for the unknown external force.
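The discrete ESO of Eqs. 9 to 12 can be exercised on a toy plant to see the disturbance estimate converge. The sketch below is pure Python and makes several assumptions that are not in the paper: the plant is a double integrator with a constant unknown disturbance `f`, the control input is held at zero, and the values of `omega0`, `T`, and `b` are illustrative. The gain expressions are the ones obtained by solving Eq. 11 for a triple root at $\beta$.

```python
import math

def eso_gains(omega0, T):
    """Return beta, L_c and L_p = Phi * L_c for the ZOH-discretized ESO."""
    beta = math.exp(-omega0 * T)
    lc = [
        1.0 - beta ** 3,
        3.0 * (1.0 - beta) ** 2 * (1.0 + beta) / (2.0 * T),
        (1.0 - beta) ** 3 / T ** 2,
    ]
    # L_p = Phi * L_c with Phi = [[1, T, T^2/2], [0, 1, T], [0, 0, 1]]
    lp = [
        lc[0] + T * lc[1] + 0.5 * T ** 2 * lc[2],
        lc[1] + T * lc[2],
        lc[2],
    ]
    return beta, lc, lp

def eso_step(z, y, u, b, T, lp):
    """One update Z(k+1) = Phi Z(k) + Gamma u(k) - L_p e(k) of Eq. 9."""
    e = z[0] - y
    return [
        z[0] + T * z[1] + 0.5 * T ** 2 * z[2] + 0.5 * b * T ** 2 * u - lp[0] * e,
        z[1] + T * z[2] + b * T * u - lp[1] * e,
        z[2] - lp[2] * e,
    ]

def simulate_eso(f=2.0, b=1.0, omega0=20.0, T=0.01, steps=1000):
    """Run the ESO against a double integrator with constant disturbance f."""
    _, _, lp = eso_gains(omega0, T)
    x1, x2 = 0.5, 0.0          # true plant state (assumed initial values)
    z = [0.0, 0.0, 0.0]        # observer starts with no knowledge
    u = 0.0                    # control held at zero in this sketch
    for _ in range(steps):
        z = eso_step(z, x1, u, b, T, lp)
        acc = f + b * u
        x1, x2 = x1 + T * x2 + 0.5 * T ** 2 * acc, x2 + T * acc
    return z, (x1, x2, f)

estimate, truth = simulate_eso()
```

After the transient, `estimate[2]` approaches the constant disturbance `f`, which is the quantity that Eq. 14 subtracts to compensate the unknown external force.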
4 Simulation Test of Multiple Parafoils

To validate the effectiveness of the intelligent cooperative and obstacle avoidance algorithm designed in this study, simulations were conducted in the MATLAB environment. A simulated transport aircraft sequentially deployed multiple parafoils in arbitrary directions, and the parafoils were simulated in specific wind fields and obstacle environments. The leader parafoil tracked a trajectory to the target point, while each follower parafoil obtained information from surrounding parafoils through information exchange to calculate errors. Each follower adjusted its position by pulling the trailing edge of the parafoil body and performed real-time obstacle avoidance. Two sets of simulation results are presented in this paper. The initial and target positions of the multiple parafoils are the same in both sets, but the wind field conditions and obstacle settings differ, as shown in Tables 1 and 2.

As shown in Figs. 5, 6 and 7, the first set of experimental results includes the three-dimensional trajectory of multiple parafoil cooperative obstacle avoidance, the cooperative formation error of multiple parafoils, and the inter-individual distance in the formation. Figures 8, 9 and 10 display the second set of experimental results, where the obstacle positions and wind conditions were changed. As revealed by Figs. 5, 6 and 7 and Table 3, under specific wind and obstacle conditions, although the initial positions of the parafoils were relatively dispersed, the multiple parafoil system gradually converged and maintained the desired formation flight through the cooperative control algorithm. Meanwhile, each parafoil also autonomously avoided obstacles during the cooperative flight, which verifies the rationality of the algorithm.
Table 1 Initial position and target position parameters of the multiple parafoils
S/N | Initial position (m) | Final position (m)
Leader | (380, 380, 600) | (40, 18, 0)
Follower1 | (300, 300, 600) | (16, 5, 0)
Follower2 | (305, 305, 600) | (16, 30, 0)
Follower3 | (330, 330, 600) | (5, 5, 0)
Follower4 | (350, 350, 600) | (5, 30, 0)

Table 2 Obstacle location and wind field conditions
S/N | Group one | Group two
Obstacle1/m | (150, 50, 130) | (120, 130, 150)
Obstacle2/m | (250, 220, 400) | (250, 350, 400)
Obstacle3/m | (175, 190, 200) | (75, 50, 20)
Wind | +X-axis, 3 m/s | +Y-axis, 3 m/s
Fig. 5 3D trajectory of multiple parafoils
Fig. 6 Formation error of multiple parafoils
As shown in Figs. 8 and 9 and Table 4, under different initial positions, the multiple parafoil system achieved convergence to coherent motion and completed the predetermined large-scale airdrop mission. The flight trajectories show that the strategy proposed in this paper, which combines leader-following, the artificial potential field, and active disturbance rejection control, can accurately track the target trajectory and maintain a smooth and stable flight trajectory. Meanwhile, when encountering obstacles, the multiple parafoils can quickly form a stable formation according to the specified requirements, and the error curve approaches zero.

The stability of the active disturbance rejection controller for nonlinear systems in the presence of external disturbances has been proven in [21]. The simulation results also show that the designed controller makes the multiple parafoil system converge to a neighborhood containing the origin within a finite time. The distance error between the reference position input and the actual position is always bounded, and its first derivative is also bounded, with the upper bound of the error decreasing monotonically as the observer and controller gains increase. Therefore, with correctly designed controller parameters, the intelligent coordinated and obstacle avoidance control algorithm proposed in this paper can achieve stability of the multiple parafoil system.

Fig. 7 Individual spacing distance of multiple parafoils formation

Fig. 8 3D trajectory of multiple parafoils

Fig. 9 Formation error of multiple parafoils

Fig. 10 Individual spacing distance of multiple parafoils formation

Table 3 Obstacle location and wind field conditions
Group one | Maximum | Average | Standard deviation
Error/m | 6.39 | 1.18 | 1.36

Table 4 Obstacle location and wind field conditions
Group one | Maximum | Average | Standard deviation
Error/m | 4.35 | 0.86 | 0.7
5 Conclusions

In large-scale airdrop missions that supply material and equipment, coordination between multiple parafoil systems is often required to complete the task, which makes the study of intelligent coordinated control of multiple parafoils strategically significant. This paper adopts a nonlinear reduced-order wind field model for multiple parafoil systems and proposes a leader-follower consensus-based algorithm for their coordinated control and obstacle avoidance. Simulation experiments show that when the individual parafoils start dispersed, each parafoil can obtain the position of neighboring parafoils through local information exchange and the system ultimately achieves formation control, including formation assembly, collision avoidance, and obstacle avoidance, with good disturbance suppression performance. This verifies the effectiveness of the proposed control strategy, and the stability of the controller is also analyzed. Compared with most optimal control strategies, this intelligent control strategy for multiple parafoil systems can make large-scale airdrops a more effective, reliable, and real-time method of delivering critical material, providing theoretical references for future research on the control of multiple parafoils in obstacle space.
Acknowledgements This work was supported by the National Natural Science Foundation of China (Grant NO. 61973172, 62003177 and 61973175) and the 2022 Tianjin Graduate Research and Innovation Project (Grant NO. 2022BKYZ048).
References
1. Xue, X., Wen, C.Y.: Review of unsteady aerodynamics of supersonic parachutes. Prog. Aerosp. Sci. 125, 100728 (2021). https://doi.org/10.1016/j.paerosci.2021.100728
2. Rimani, J., Viola, N., Saluzzi, A.: An approach to the preliminary sizing and performance assessment of spaceplanes' landing parafoils. Aerospace-Basel 9(12), 1–22 (2022). https://doi.org/10.3390/aerospace9120823
3. Kaminer, I., Yakimenko, O., Pascoal, A.: Coordinated payload delivery using high glide parafoil systems. In: 18th AIAA Aerodynamic Decelerator Systems Technology Conference and Seminar (2005). https://doi.org/10.2514/6.2005-1622
4. Calise, J., Preston, D.: Swarming/flocking and collision avoidance for mass airdrop of autonomous guided parafoils. J. Guid. Control Dyn. 31(4), 1123–1132 (2008). https://doi.org/10.2514/1.28586
5. Rosich, A., Gurfil, P.: Coupling in-flight trajectory planning and flocking for multiple autonomous parafoils. Proc. Inst. Mech. Eng. Part G J. Aerospace Eng. 226(6), 691–720 (2012). https://doi.org/10.1177/0954410011413637
6. Luo, S., Sun, Q., Tao, J., Liang, W., Chen, Z.: Trajectory planning and gathering for multiple parafoil systems based on pseudo-spectral method. In: 35th Chinese Control Conference (CCC), pp. 2553–2558 (2016). https://doi.org/10.1109/ChiCC.2016.7553748
7. Qi, C., Min, Z., Zhihao, Z., Ma, M., Huang, R.: Multiple autonomous parafoils system modeling and rendezvous control. Acta Aeronautica et Astronautica Sinica 37(10), 3121–3130 (2016). www.cnki.net/kcms/detail/11.1929.V.20160224.1700.010.html
8. Sun, H., Wang, F., Sun, Q., Chen, Z., Tao, J.: Distributed consensus algorithm for multiple parafoils in mass airdrop mission based on disturbance rejection. Aerosp. Sci. Technol. 109, 106437 (2021). https://doi.org/10.1016/j.ast.2020.106437
9. Chen, Q., Sun, Y., Zhao, M., Liu, M.: A virtual structure formation guidance strategy for multiparafoil systems. IEEE Access 7, 123592–123603 (2019). https://doi.org/10.1109/ACCESS.2019.2938078
10. Zhang, Z., Zhao, Z., Fu, Y.: Dynamics analysis and simulation of six DOF parafoil system. Cluster Comput. 22, 12669–12680 (2019). https://link.springer.com/article/10.1007/s10586-018-1720-3
11. Lv, F., He, W., Zhao, L.: An improved nonlinear multibody dynamic model for a parafoil-UAV system. IEEE Access 7, 139994–140009 (2019). https://ieeexplore.ieee.org/document/8847392
12. Ochi, Y.: Modeling and simulation of flight dynamics of a relative-roll-type parafoil. In: AIAA Scitech 2020 Forum (2020). https://doi.org/10.2514/6.2020-1643
13. Zhang, L., Gao, H., Chen, Z., Sun, Q., Zhang, X.: Multi-objective global optimal parafoil homing trajectory optimization via Gauss pseudospectral method. Nonlinear Dyn. 72, 1–8 (2013). https://link.springer.com/article/10.1007/s11071-012-0586-9
14. Hua, Y., Lei, S., Weifang, C.: Research on parafoil stability using a rapid estimate model. Chin. J. Aeronaut. 30(5), 1670–1680 (2017). https://doi.org/10.1016/j.cja.2017.06.003
15. Xiong, J.: Research on the dynamics and homing project of parafoil system. National University of Defense Technology, Changsha (2005). https://kns.cnki.net/KCMS/detail/detail.aspx?dbname=CDFD9908&filename=2006127563.nh
16. Gao, H., Zhang, L., Sun, Q., Sun, M., Chen, Z.: Fault-tolerance design of homing trajectory for parafoil system based on pseudo-spectral method. Control Theor. A. 30(6), 702–708 (2013). https://kns.cnki.net/kcms/detail/44.1240.TP.20130621.1753.005.html
17. Fowler, L., Rogers, J.: Bézier curve path planning for parafoil terminal guidance. J. Aeros. Inf. Com. 11(5), 300–315 (2014). https://doi.org/10.2514/1.I010124
18. Akishita, S., Kawamura, S., Hisanobu, T.: Velocity potential approach to path planning for avoiding moving obstacles. Adv. Robot. 7(5), 463–478 (1992). https://doi.org/10.1163/156855393X00294
19. Han, J.: From PID to active disturbance rejection control. IEEE T. Ind. Electron. 56(3), 900–906 (2009). https://ieeexplore.ieee.org/document/4796887
20. Zheng, Q., Dong, L., Lee, H., Gao, Z.: Active disturbance rejection control for MEMS gyroscopes. In: 2008 American Control Conference, pp. 4425–4430 (2008). https://ieeexplore.ieee.org/abstract/document/4587191
21. Zheng, Q., Gao, L., Gao, Z.: On stability analysis of active disturbance rejection control for nonlinear time-varying plants with unknown dynamics. In: 2007 46th IEEE Conference on Decision and Control, pp. 3501–3506 (2007). https://ieeexplore.ieee.org/document/4434676
Construction of Knowledge Graphs Related to Industrial Key Production Processes for Query and Visualization

Hongyu Han, Dongmei Fu, and Haocong Jia
Abstract In the era of "Industry 4.0", industries urgently need to make their knowledge digital and intelligent. Calcination is a very important and complex production process in many industrial production lines. Taking the cement calcination section as an example, this paper constructs a relevant knowledge graph and designs corresponding knowledge extraction methods for several different data sources. The resulting knowledge graph consists of 1284 entities, 1547 relations, and 4115 attribute values, and query and Q&A algorithms are designed on this basis. A networked software platform is developed that provides knowledge retrieval, relationship query, full KG display, and other service functions for the production equipment, production process, and production flow related to the calcination section. This method can promote the application of knowledge graphs in the field of industrial key production processes.

Keywords Knowledge graph · Query system · Graph database · Neo4j · Cement calcination process
H. Han · D. Fu (B) · H. Jia
School of Automation and Electrical Engineering, Beijing Engineering Research Center of Industrial Spectrum Imaging, University of Science and Technology Beijing, Beijing 100083, China
e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023
Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_74

1 Introduction

A knowledge graph (KG), as an effective method to present knowledge within a specific range, can fully integrate the information resources of specific objects on the Internet, which distinguishes it from traditional databases. Because knowledge must be highly reliable, the knowledge of specific objects in industrial scenarios needs refined classification training. Compared with deep learning, a KG does not require massive amounts of data for training and can be used for specific tasks in industrial scenarios. A KG uses a structured query language to locate knowledge, which can improve retrieval efficiency. In the matching process, semantic analysis is achieved by utilizing a large amount of correlation information between nodes, rather than by calculating similarity one-on-one. Compared with traditional expert systems, a KG has strong relationship expression and real-time feedback capabilities, as well as a friendly interactive interface, and it can serve as data support in intelligent question answering services.

In recent years, knowledge graph research has sprung up at home and abroad, and its vitality has been fully demonstrated. To help designers in product innovation design, Zhang [15] uses a hybrid knowledge representation method to build an ontology of innovation design subject areas and create a knowledge base prototype system. Chen [7] proposed a sensitive knowledge base construction method based on a topic-model entity linking method and prior knowledge, which focuses on semi-structured data from the Internet. Chang [2] collected structured and unstructured geological literature data, used a string-based fuzzy matching algorithm, and then manually extracted and summarized features to classify the domain data after similarity calculation. Kim et al. [11] selected biomedical papers and abstracts from PubMed, ORCID, and ExPORTER to extract relevant knowledge and construct a knowledge graph that includes biological entities, authors, articles, contact information, and fund relationships. Wei et al. [9] treat relationship extraction as a mapping between two entities, which effectively solves the problem of triplet overlap. At present, domestic industrial enterprises have introduced automation control systems but have realized only the primary parameter control of industrial automation [10].
Taking the cement production process as an example, the variables are small in volume but numerous, the rotary kiln calcination system is complex even in smaller-scale cement production, the chemical changes are complicated and coupled, and there are multiple control parameters. Calcination is therefore the core part of cement production. However, the current industrial production environment is not friendly, technical knowledge is greatly compromised when experience is passed on [12], and the current technical training for cement production workers in the country is inefficient. In addition, there is almost no research on knowledge graphs in this industry, especially for cement [4].

In this paper, based on the models and theories related to knowledge graph construction and focusing on the fields related to the key processes of industrial production, a knowledge graph related to the cement calcination section is constructed. A knowledge query system oriented to the constructed knowledge base is also designed, providing content retrieval, a full relationship view, and relationship query services. The system is validated through experiments on real dataset queries.
2 Construction Method and Scale of Knowledge Graph

2.1 Domain Schema Design and Knowledge Extraction

Starting from the application scenario of the knowledge graph and its object-oriented requirements, the scope and boundary of the knowledge can be obtained by backward derivation and analysis. For a professional domain knowledge graph, the relevant attributes of each concept and the relationships between concepts must be defined. This is done with a semi-automatic construction method based on a large number of existing data tables and texts, which yields the domain schema of the knowledge graph [8]. In agreement with the instructor and field engineers, it was decided to extract knowledge from the following two types of Internet resources: first, unstructured data from relevant teaching materials and books; second, semi-structured data from web pages, such as the rotary kiln detail pages of various company websites.

(1) Knowledge Extraction Based on the Production Process. Referring to the authoritative source in the field, Cement Processes, the professional terms in the field of cement calcination are extracted and classified by manual acquisition, semi-automatic processing, and construction of a professional terminology database. The entity elements and the semantic relationships between entities are then extracted from the corresponding texts.

(2) Knowledge Extraction Based on National Standards. This work takes "Rotary Kiln for Cement Industry GB/T 32994-2016" as an example. Because a national standard has a fixed specification for its structure, arrangement, and format with clear boundaries, knowledge can be extracted with rules based on trigger keywords.

(3) Website Knowledge Extraction Based on XPath Parsing. The Requests library in Python is called as the crawler dispatcher, and the rule structure of the website elements is analyzed to build a knowledge extractor. The rotary kiln knowledge is collected and the acquired data is stored locally, which provides an important database for the subsequent knowledge graph Q&A system.
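The XPath-based extraction in item (3) can be illustrated with a small sketch. It is not the authors' extractor: the page layout, class names, and field values below are hypothetical placeholders, and the standard-library `xml.etree.ElementTree` (which supports only a subset of XPath and requires well-formed markup) stands in for a full HTML parser. Fetching the page is omitted so that the example runs offline.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a rotary kiln detail page.
PAGE = """
<html><body>
  <div class="product">
    <h1>Rotary kiln</h1>
    <table>
      <tr><td class="name">Specification</td><td class="value">placeholder-spec</td></tr>
      <tr><td class="name">Capacity</td><td class="value">placeholder-capacity</td></tr>
    </table>
  </div>
</body></html>
""".strip()

def extract_triples(page_source):
    """Return (entity, attribute, value) triples found by XPath rules."""
    root = ET.fromstring(page_source)
    # Rule 1: the entity name is the heading of the product block.
    entity = root.findtext(".//div[@class='product']/h1", default="").strip()
    triples = []
    # Rule 2: each table row holds one attribute name and one value.
    for row in root.findall(".//div[@class='product']/table/tr"):
        name = row.findtext("td[@class='name']")
        value = row.findtext("td[@class='value']")
        if name and value:
            triples.append((entity, name.strip(), value.strip()))
    return triples

triples = extract_triples(PAGE)
```

Real company pages are rarely well-formed XML, so a tolerant HTML parser with full XPath support would be the more likely choice in practice; the rule structure stays the same.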
2.2 Data Pre-processing and Entity Fusion

Data cleaning, de-duplication, and standardization are carried out on the existing, relatively regular data. For the cement rotary kiln data crawled from websites, information such as spaces, garbled characters, and tabs is filtered out without affecting the semantics, and the result is saved [5]. To ensure data uniformity, only one node may exist in the knowledge graph for a given entity, so different representations of the same entity need to be unified. After data cleaning, the Jaccard similarity coefficient is calculated to compare the similarity between entities, and entity normalization is performed to merge semantically identical attributes and relations, completing the normalization [14].

\[
\mathrm{Jaccard}(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}, \qquad A \neq \emptyset \ \text{or} \ B \neq \emptyset
\]

where the elements of sets A and B are the lexical items in the texts, and each lexical item is kept only once in a set. Taking the cement calcination section as an example, the results are listed in Table 1.

Table 1 Table of de-duplicated and normalized entities
De-duplication type | Before de-duplication | After de-duplication
Equipment name | Rotary kiln, cement rotary kiln, cement kiln, rotary kiln for cement industry | Rotary kiln
Craft name | Calcining belt, calcining section, calcining process, calcining process | Calcined belt
Chemical formula | AL2O3, Al2O3, Al2o3, Al2O3 | Al2O3
Substance name | Limestone, limestone, chert, lime | Limestone
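The Jaccard coefficient above is straightforward to compute once each entity name has been split into lexical items. The sketch below assumes whitespace tokenization of lower-cased English names; the function names are illustrative, and how the resulting score is thresholded for merging is left open because the paper does not state it.

```python
def lexical_items(name):
    """Split an entity name into a set of lower-cased lexical items."""
    return set(name.lower().split())

def jaccard(a, b):
    """Jaccard similarity of two collections of lexical items."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        raise ValueError("at least one of the two sets must be non-empty")
    return len(set_a & set_b) / len(union)

# Two surface forms of the same equipment name from Table 1.
score = jaccard(lexical_items("Cement rotary kiln"), lexical_items("Rotary kiln"))

# Unrelated names share no lexical items.
unrelated = jaccard(lexical_items("Rotary kiln"), lexical_items("Limestone"))
```

Here the two kiln names share two of three distinct items, so they score much higher than the unrelated pair and would be candidates for merging into a single node.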
2.3 Knowledge Graph Construction

Knowledge Graph Design and Implementation. The vast majority of knowledge representations in knowledge graphs use RDF (Resource Description Framework), whose most common representation is a triple, which can be described as (node 1, relation, node 2) or (node, attribute, attribute value). In this paper, starting from the document library whose knowledge has already been extracted and fused, each form that contains more information and nested variables is converted into one-to-one mapping triples. Entity and attribute files are then generated, represented as a graph structure, and stored in the Neo4j database [1]. The construction process is shown in Fig. 1.

Fig. 1 Knowledge graph construction process

In accordance with the characteristics of Neo4j, a directed graph is eventually generated. Through the knowledge organization described above, a relatively small knowledge graph based on a tree-like structure is constructed. It contains a total of 1284 knowledge entities, 1547 relationships, and 4115 attribute values, with 16 types of labels for knowledge entities and 96 types of relationships. Some of the entities and relationships of the constructed knowledge graph are shown in Table 2.

Table 2 Examples of entities and relationships
Entity (E) type | Number of E | Entity relationship (R) type | Number of R
Other equipment | 536 | Contains | 411
Content | 244 | Features | 127
Matter | 110 | CaO content | 32
Standard | 100 | Introduction | 77
Rotary kiln | 81 | Composition | 45
Inspection standards | 78 | Configuration | 24
Standard requirements | 35 | Applications | 30
Overview | 24 | Originated from | 20
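The conversion of nested records into one-to-one triples described in Sect. 2.3 can be sketched as follows. The record layout, the entity names, and the placeholder attribute text are hypothetical; only the relation names "Contains", "Features", and "Applications" are taken from Table 2. Loading the triples into Neo4j is not shown.

```python
def to_triples(node, record):
    """Flatten a nested record into (head, relation, tail) triples.

    A dict value introduces child nodes that are described recursively,
    a list value yields one triple per element, and any other value is
    treated as a single attribute value.
    """
    triples = []
    for relation, value in record.items():
        if isinstance(value, dict):
            for child, child_record in value.items():
                triples.append((node, relation, child))
                triples.extend(to_triples(child, child_record))
        elif isinstance(value, (list, tuple)):
            triples.extend((node, relation, item) for item in value)
        else:
            triples.append((node, relation, value))
    return triples

# Hypothetical nested record for one entity.
record = {
    "Contains": {"Kiln shell": {"Features": "placeholder text"}},
    "Applications": ["Cement", "Lime"],
}
kiln_triples = to_triples("Rotary kiln", record)
```

Each triple maps to one directed edge, so the flattened list can be written out as the entity and relationship files that a graph database import expects.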
Query Q&A Algorithm. Building on the constructed knowledge graph, natural-language input questions are transformed into node queries, and a knowledge retrieval rule based on node queries is designed [13] (Fig. 2).
(1) Entity Extraction. Entity extraction uses two LTP models: word segmentation and part-of-speech tagging. With the default dictionary of the LTP tool, most specialized terms were split into smaller words, so this paper adds the proper nouns that must not be split to a custom dictionary, drawing on the specialized terminology database constructed earlier, and then segments the text again. For example, when the term “cement calcined section” is not in the custom dictionary, it is cut into “cement”, “calcined” and “section”; after it is added, the accuracy of word segmentation improves considerably. The segmented word list is then tagged with parts of speech, and the nouns are extracted as target entities for answer retrieval in the next step.
(2) Answer Search Based on Knowledge Graph. The py2neo library connects the Neo4j graph database to the question-and-answer algorithm. First, the list of entity node labels is defined. Second, the entity list output by the extraction step is used as the set of nodes: the original question is transformed into a query language that the database recognizes, and the entities are looked up one by one in the knowledge graph. Finally, the retrieved answers are output as triples [3].
Fig. 2 Algorithm framework of knowledge graph query system
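The two steps above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: `segment` is a greedy longest-match stand-in for LTP word segmentation, `CUSTOM_TERMS` is a hypothetical custom dictionary, and the node property `name` used in the Cypher text is an assumption about the graph schema.

```python
# Hypothetical custom dictionary of domain terms that must not be split.
CUSTOM_TERMS = {"cement calcined section", "rotary kiln"}


def segment(text, custom_terms=frozenset()):
    """Greedy longest-match segmentation over whitespace tokens.

    Stand-in for LTP word segmentation: multi-word entries found in
    custom_terms are kept as one unit; everything else is split per word.
    """
    words = text.lower().rstrip("?").split()
    longest = max((len(t.split()) for t in custom_terms), default=1)
    out, i = [], 0
    while i < len(words):
        # Try the longest candidate first, fall back to a single word.
        for n in range(min(longest, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if n == 1 or candidate in custom_terms:
                out.append(candidate)
                i += n
                break
    return out


def build_query(entity):
    """Return a parameterized Cypher query and its parameters for one entity.

    The property name `name` is an assumption about the graph schema.
    """
    cypher = (
        "MATCH (n {name: $name})-[r]->(m) "
        "RETURN n.name AS head, type(r) AS relation, m.name AS tail"
    )
    return cypher, {"name": entity}
```

With py2neo, a query built this way would typically be executed through `Graph.run(cypher, **params)`, and each returned row already has the (head, relation, tail) shape of a triple. Passing the entity as a parameter rather than formatting it into the query string avoids quoting problems with user input.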
3 Experiments and Results
3.1 Knowledge Graph Construction Results and Visualization
The knowledge graph of the cement calcined section built in this paper is stored in the Neo4j graph database, where entity relations are represented as triples. This allows the whole knowledge graph to be presented visually and explored interactively (Fig. 3).
Fig. 3 Neo4j graph database visualization effect
The knowledge graph in this paper is an industry-specific (vertical-domain) knowledge graph. Given its user groups and usage needs, it is intended mainly for training cement practitioners, as a troubleshooting reference, and for looking up production standards. It supports quick queries of the process belt, material temperature, gas temperature, and precautions in the process; of the standard information, model, parameters, and accompanying technical documents of the rotary kiln; and of the silica, alumina, and other contents of raw materials such as limestone and mica stone. It was found that these directed, layer-connected graphs, when combined with suitable deep learning models, can enable more advanced functions such as data mining and knowledge inference.
3.2 Knowledge Graph-Based Visualization and Query Q&A System
A query system for the cement calcined section knowledge graph was developed in Python. The front-end interface is built with the Flask framework and HTML, and the front end and back end are connected so that the system can be accessed from mainstream browsers and returns results for user queries in real time (Fig. 4). The system centers on information query and comprises four modules: Start Query, Retrieve Content, Relationship Full Profile, and Relationship Q&A (Fig. 5).
Fig. 4 General architecture of the system
Fig. 5 Query Q&A system results graph
4 Summary
This paper has completed extensive knowledge organization and data pre-processing for the cement calcination section and implemented a knowledge graph construction process for the cement calcination domain. The resulting graph contains 1284 knowledge entities, 1547 relationships, and 4115 attribute values, and a visualization and query Q&A system was designed on top of it. Query verification on a real data set shows that the method is feasible and can substantially improve query and Q&A efficiency. Given the relative lack of research in this field, the study is novel; although it takes cement as an example, the approach can be extended to other key industrial production processes.
References
1. Aldwairi, M., Jarrah, M., Mahasneh, N., Al-khateeb, B.: Graph-based data management system for efficient information storage, retrieval and processing. Inf. Process. Manag. 60(2), 103165 (2023). https://www.sciencedirect.com/science/article/pii/S0306457322002667
2. Chang, L., Zhu. Y., W.X.Z.X.L.Y.S.: Building a mineral knowledge base in the big data environment: taking tungsten mines as an example. China Min. Ind. 27, 93–96, 108 (2018)
3. Guo, Q., Zhuang, F., Qin, C., Zhu, H., Xie, X., Xiong, H., He, Q.: A survey on knowledge graph-based recommender systems. IEEE Trans. Knowl. Data Eng. 34(8), 3549–3568 (2022)
4. Huang, X.-L., L.H.O.P.W.H.: Design and implementation of an expert system for cement rotary kiln diagnosis. China Sci. Technol. Inf. 490, 87–89 (2014)
5. Kun, Z.: Research and implementation of APK crawler technology for automated identification of paging and extraction of search results (2020)
6. Li, Z., T.R.F.L.: Cement intelligent manufacturing production control MES system. Cem. Technol. 210, 42–49, 56 (2019)
7. Peiqiang, Z.: Research and implementation of key technologies for building a standard structured knowledge base based on domain ontology. Res. Qual. Tech. Supervis. 77, 8–11 (2021)
8. Rastogi, A., Zang, X., Sunkara, S., Gupta, R., Khaitan, P.: Towards scalable multi-domain conversational agents: the schema-guided dialogue dataset. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 34, pp. 8689–8696 (2020)
9. Wei, Z., Su, J., Wang, Y., Tian, Y., Chang, Y.: A novel cascade binary tagging framework for relational triple extraction (2020)
10. Xiaodong, Z.: Progress and development direction of China’s cement industry. China Cem. 12, 48–54 (2017)
11. Xu, J., Kim, S.: Building a PubMed knowledge graph. Sci. Data 7 (2020)
12. Yang, M.J.S.Y.: Discussion on technologies related to intelligent manufacturing in the cement industry. Jiangsu Build. Mater. 168, 68–73 (2019)
13. Yin, Z., Shi, L., Yuan, Y., Tan, X., Xu, S.: A study on a knowledge graph construction method of safety reports for process industries. Processes 11(1) (2023). https://www.mdpi.com/22279717/11/1/146
14. Zhang, J., Huang, B., Fujita, H., Zeng, G., Liu, J.: Feqa: fusion and enhancement of multi-source knowledge on question answering. Expert Syst. Appl. 227, 120286 (2023)
15. Zhang, L., Li, Y., A.L.W.X.: Construction and application of thematic knowledge base for innovative design. J. Eng. Des. 19, 241–249 (2012)
Multi-objective Optimization Design of the Structural Parameters of Swing Arm Crawler Rescue Robot Pu Zhang, Bo Cheng, Xingke Xia, and Jian Sun
Abstract Unreasonable structural parameters of a swing arm crawler rescue robot directly lead to poor obstacle-crossing ability and low movement efficiency in complex search and rescue scenes. To address this problem, an optimization design method for the robot's structural parameters based on NSGA-II is proposed and verified. First, based on an analysis of the robot's motion mechanism, a model relating motion performance to structural parameters is established. Then, NSGA-II, a multi-objective genetic algorithm based on non-dominated sorting, is used to solve the model and determine the optimized structural design parameters of the robot prototype. Finally, a 3D virtual prototype is designed according to the optimization results and simulated. The results show that the maximum step-climbing height of the robot increases by 11.54% and the swing arm torque decreases by 8.53%, which verifies the effectiveness of the NSGA-II-based multi-objective optimization design method for structural parameters.
Keywords Rescue robot · Multi-objective optimization · NSGA-II
P. Zhang · B. Cheng · X. Xia · J. Sun (B)
Laboratory of Aerospace Servo Actuation Transmission, School of Mechanical Engineering and Automation, Beihang University (BUAA), Beijing 100191, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_75

1 Introduction
Rescue robots can pass through narrow spaces, climb over scattered debris, and carry out remote control and sensor detection to help rescue workers find trapped people faster and more safely [1]. Common ground rescue robots include wheeled, tracked, legged, and wheel-track combined robots. Among them, tracked robots equipped with active swing arms have greatly improved obstacle-crossing ability; examples include the Mini-CALIBER robot in Canada, PackBot [2] in the United States, and the “Ling Lizard” robot of the Shenyang Institute of Automation. Researchers have focused on the obstacle-crossing strategy and motion control of this kind of robot. As research deepens, it becomes clear that the design of the robot's structural parameters directly affects its motion ability in complex obstacle scenes, so how to design reasonable structural parameters has become an urgent problem.
During search and rescue tasks, the robot is required to have reasonable power matching, to climb over higher obstacles, and to lose less energy, so the design of the robot body's structural parameters becomes a multi-objective optimization problem [3]. Nan et al. [4] used a Pareto model to handle the multi-objective problem when analyzing and designing the structural parameters of an amphibious deformable robot, which improved several aspects of maneuvering performance in the task environment. Jian et al. [5] proposed a power matching design method for a mine rescue robot based on a multi-objective particle swarm optimization algorithm, defined 12 decision variables and 5 constraints, and determined a power parameter scheme with long endurance and low machine weight. Yang [6] designed Rolling Wolf, a wheel-legged composite robot, using a multi-objective optimization method, which effectively improved the load distribution, structural stability, and carrying capacity of the wheel-legged robot. Yan et al. [7] used goal programming to handle the multi-objective optimization problem, taking the maximum obstacle value in the known environment as the target assigned to each performance function and using the deviation between the performance function and the expected target as the objective function. These design methods effectively guide robot prototype design, but they do not account for the robot's adaptation to different environmental obstacles, and the optimized structural parameter values may not meet the obstacle-crossing requirements of search and rescue environments.
Building on existing rescue robot research, a four-track robot with active swing arms was designed. To find the optimal structural parameters, the structural parameters of each part of the robot were taken as decision variables, and the motion mechanism of the robot and the environment were taken as constraints, to build a mathematical model relating the robot's structural parameters to its motion performance. The multi-objective genetic algorithm NSGA-II was used to search for Pareto solutions with the best overall performance, yielding optimal structural parameters that satisfy the different motion performance requirements and guide the structural parameter design of the robot prototype.
2 Robot Structure
The robot designed in this paper adopts a double-swing-arm, four-track structure; its overall layout is shown in Fig. 1. The robot chassis mainly consists of (1) the left swing arm, (2) the right swing arm, (3) the left main drive track, (4) the right drive track, and (5) the robot body. The robot body houses the drive system, power system, main control system, and environment sensing system.
Fig. 1 General scheme layout diagram
The transmission system of the robot is shown in Fig. 2. The torque output of the drive motor (1) first passes through the drive reduction box (2) and the reduction gear (3), and is then transmitted through the transmission pin to the front crawler wheel (4), which drives the rear driven crawler wheel and thereby moves the main crawler (6). In addition, the crawler wheel on the main track drives the large crawler wheel of the swing arm (5) through the hollow transmission pin, which in turn drives the small track wheel of the swing arm (8) and moves the swing arm track (7), so that the main track and the swing arm track move synchronously. The torque output of the swing arm motor (9) is transmitted to the front shaft through the swing arm reducer (10). The gear drives the front shaft through a flat key, and the power is transmitted to the swing arm frame (11) by splines at both ends of the front shaft, producing the swing of the arm.
Fig. 2 Transmission system
3 Multi-objective Optimization Model Establishment
3.1 Design Variables
Figure 3 is a schematic diagram of the structural parameters of the swing arm tracked rescue robot. In the figure, O1 is the centroid of the robot body, O2 is the centroid of the swing arm, and O is the overall centroid of the robot. The other structural parameters are listed in Table 1.
Fig. 3 Schematic diagram of robot structure parameters

Table 1 Robot structure parameter table
No.   Parameter   Unit   Definition
1     $R$         mm     Radius of main track wheel
2     $r$         mm     Radius of swing arm track wheel
3     $b$         mm     Track width
4     $L_1$       mm     Main center distance
5     $L_2$       mm     Swing arm center distance
6     $B$         mm     Body width
7     $m_1$       kg     Main body mass
8     $m_2$       kg     Swing arm mass
9     $l_1$       mm     Distance between center of gravity of main body and center of rear wheel
10    $l_2$       mm     Distance between center of gravity of swing arm and center of front wheel
11    $h$         mm     Height of center of gravity above ground
12    $\alpha$    rad    Angle between the swing arm and the horizontal plane
13    $\beta$     rad    Angle between the body and the horizontal plane

The motion performance of the rescue robot depends on several variable parameters, and the design variables are independent of one another. The design variable vector of the multi-objective optimization model is therefore given by formula (1):
$$x = (R, r, L_1, L_2, l_1, l_2, m_1, m_2, h) \tag{1}$$
3.2 Constraint Conditions
First, based on design experience and with reference to existing robot prototypes, the range likely to contain the optimal solution of the robot's mechanism parameters is estimated; restricting the search to this range effectively reduces the optimization time. The constraints on the robot's structural parameters are given by formula (2):
$$\begin{cases} m_1 + m_2 \le 50 \\ L_1 + L_2 \le 1200 \\ l_1 \in \left[\tfrac{1}{3}L_1, \tfrac{2}{3}L_1\right] \\ l_2 \in \left[\tfrac{1}{4}L_2, \tfrac{2}{3}L_2\right] \\ \alpha + \beta \le \pi \\ R \in [50, 100] \\ r \in [25, 50] \end{cases} \tag{2}$$
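The constraint set of formula (2) maps directly onto a feasibility check that an optimizer can call for every candidate design. The following is a minimal sketch under the assumption that lengths are in mm, masses in kg and angles in rad, as in Table 1; the function name and the default angles are illustrative only, and h is left out because formula (2) places no bound on it.

```python
import math


def satisfies_constraints(R, r, L1, L2, l1, l2, m1, m2, alpha=0.0, beta=0.0):
    """Return True if a candidate design lies inside the region of formula (2).

    Lengths in mm, masses in kg, angles in rad (units as in Table 1).
    """
    return (
        m1 + m2 <= 50
        and L1 + L2 <= 1200
        and L1 / 3 <= l1 <= 2 * L1 / 3
        and L2 / 4 <= l2 <= 2 * L2 / 3
        and alpha + beta <= math.pi
        and 50 <= R <= 100
        and 25 <= r <= 50
    )
```

For example, the pre-optimization values reported later in Table 2 (R = 80, r = 40, L1 = 600, L2 = 400, l1 = 300, l2 = 100, m1 = 35, m2 = 15) pass this check, while raising m1 to 40 violates the total mass bound.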
3.3 Objective Function
According to the design requirements, the swing arm crawler rescue robot should be able to cross sufficiently high steps while requiring only a small swing arm torque. These two performance measures are analyzed below, and the corresponding objective functions in terms of the structural parameters are established.
Crossing the Steps
For a mobile robot, the step is the most typical obstacle, and the difficulty of crossing it increases with its height, so it is important to dimension the robot reasonably. The step crossing process is shown in Fig. 4. The distribution of the robot's centroids is shown in Fig. 3. A coordinate system is established with its origin at the rear wheel axis of the robot. The center of mass of the vehicle body is denoted $O_1$, the center of mass of the swing arm $O_2$, and the overall center of mass of the robot $G(x_G, y_G)$ (point O in Fig. 3). From the composition law for centers of mass,
$$\begin{cases} x_G = \dfrac{m_1 l_1 + m_2 L_1}{m_1 + m_2} + \dfrac{m_2 l_2 \cos\alpha}{m_1 + m_2} \\[2mm] y_G = \dfrac{m_1 h}{m_1 + m_2} + \dfrac{m_2 l_2 \sin\alpha}{m_1 + m_2} \end{cases} \tag{3}$$
Set $k_0 = \dfrac{m_2 l_2}{m_1 + m_2}$, $k_1 = \dfrac{m_1 l_1 + m_2 L_1}{m_1 + m_2}$, $k_2 = \dfrac{m_1 h}{m_1 + m_2}$. Then formula (3) can be simplified as
$$\begin{cases} x_G = k_1 + k_0 \cos\alpha \\ y_G = k_2 + k_0 \sin\alpha \end{cases} \tag{4}$$
It can be obtained from the above equation that
$$(x_G - k_1)^2 + (y_G - k_2)^2 = k_0^2 \tag{5}$$
Fig. 4 Obstacle clearing process diagram
Fig. 5 Analysis of the maximum height that the robot can cross
As can be seen from Eq. (5), the robot's center of mass lies on a circle with center $(k_1, k_2)$ and radius $k_0$, and its position varies with the swing arm angle $\alpha$. When the robot's overall center of mass G lies in the vertical plane of the step face, the robot's obstacle crossing height reaches its maximum, as shown in Fig. 5. From the geometric relations, the maximum obstacle height H is obtained as:
$$H = (x_G - y_G \tan\beta)\sin\beta - \frac{R}{\cos\beta} + R = x_G \sin\beta + y_G \cos\beta - \frac{R + y_G}{\cos\beta} + R \tag{6}$$
Substituting Eq. (3) into Eq. (6) shows that the factors influencing the robot's maximum obstacle height H include the vehicle body mass $m_1$, the swing arm mass $m_2$, the distance $L_1$ between the front and rear driving wheel axes, the distance $l_1$ between the body center of mass and the rear wheel axis, the distance $l_2$ between the swing arm center of mass and the front wheel axis, the height $h$ of the vehicle body center of mass, the drive wheel radius $R$, the vehicle body tilt angle $\beta$, and the swing arm angle $\alpha$. The performance objective function for step crossing can be defined as
$$\min f_1(x) = \frac{1}{H} \tag{7}$$
Fig. 6 Robot brace process
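The chain from structural parameters to the first objective can be written out numerically. The sketch below evaluates Eq. (3), the simplified right-hand form of Eq. (6), and Eq. (7); it is an illustration under stated assumptions rather than the authors' code. The function names are hypothetical, and any value of h used with it is an assumption, since h is not reported in the results table.

```python
import math


def center_of_mass(m1, m2, l1, l2, L1, h, alpha):
    """Overall center of mass (x_G, y_G) from Eq. (3)."""
    m = m1 + m2
    x_g = (m1 * l1 + m2 * L1) / m + m2 * l2 * math.cos(alpha) / m
    y_g = m1 * h / m + m2 * l2 * math.sin(alpha) / m
    return x_g, y_g


def max_step_height(x_g, y_g, R, beta):
    """Maximum step height H from the simplified form of Eq. (6)."""
    return (x_g * math.sin(beta) + y_g * math.cos(beta)
            - (R + y_g) / math.cos(beta) + R)


def f1(H):
    """Objective of Eq. (7): minimizing 1/H maximizes the step height."""
    return 1.0 / H
```

A quick sanity check is that, for any swing arm angle, the point returned by `center_of_mass` satisfies the circle relation of Eq. (5) with the constants k0, k1 and k2 defined above.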
Output Torque of the Swing Arm
When the robot climbs a step, the swing arms prop up the robot, as shown in Fig. 6. In this process, the overall center of gravity of the robot rises and its gravitational potential energy increases with the assistance of the swing arm. Because the process is slow, kinetic energy is neglected, and the work done by the swing arm equals the increase in the robot's overall gravitational potential energy. Let the driving torque of the swing arm be M; then
$$\int_{\theta_1}^{\theta_2} M \, d\theta = m_1 g s_1 + m_2 g s_2 \tag{8}$$
with
$$\begin{cases} \theta_1 = \pi - \arcsin\left(\dfrac{H - R - r\cos(\pi/2 - \theta_1)}{L_2}\right) \\[2mm] \theta_2 = \pi + \arcsin\left(\dfrac{R - r}{L_2}\right) \\[2mm] s_1 = H \, \dfrac{l_1}{L_1 + L_2 \cos\left(\arcsin\left(\dfrac{R - r}{L_2}\right)\right)} \\[2mm] s_2 = \dfrac{L_1 + l_2}{L_1 + L_2} H - l_2 \sin(\pi - \theta_1) \end{cases} \tag{9}$$
The performance objective function for the swing arm torque output can be defined as
$$\min f_2(x) = M \tag{10}$$
4 Optimization of Structural Parameters
Compared with the NSGA algorithm, the NSGA-II algorithm [8] uses fast non-dominated sorting to reduce the computational complexity from $O(mN^3)$ to $O(mN^2)$, where m is the number of objectives and N is the population size, which greatly reduces computation time. Second, an elitist strategy is adopted: parent and offspring individuals are merged before non-dominated sorting, which enlarges the search space [9]. When the next parent population is generated, individuals of higher priority are selected first, and the crowding degree is used to choose among individuals of the same rank, so that excellent individuals are more likely to be retained [10, 11]. The crowding degree also replaces the fitness sharing strategy, which requires a sharing radius to be specified. Used as the criterion for choosing among individuals of the same rank, it preserves the diversity of the population and benefits selection, crossover, and mutation over the whole range [12]. Therefore, NSGA-II is adopted in this paper for multi-objective optimization of the robot's structural parameters. The algorithm flow is shown in Fig. 7. The computations were run on a computer with an i7-8850H CPU and 16 GB of memory. The algorithm parameters are set as follows: crossover probability 0.9, crossover distribution index 20, mutation probability 0.1, mutation distribution index 20, population size 50, and 200 generations. The resulting Pareto set is shown in Fig. 8. The Pareto optimal solutions were obtained by the NSGA-II genetic algorithm. Every solution in the set makes the two optimization objectives as good as possible, but no single solution optimizes both objectives at the same time. Therefore, the most appropriate solution must be chosen from the Pareto optimal set [8]. The calculated results were normalized, and the average value method was then applied to obtain the optimization results shown in Table 2.
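The two ranking steps that distinguish NSGA-II, and the final choice of one solution from the Pareto set, can be sketched as follows. This is a compact textbook-style illustration for minimization problems, not the code used in the paper. In particular, `compromise` is only one possible reading of "normalize, then apply the average value method" (min-max normalization followed by the smallest mean), and that reading is an assumption.

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def fast_non_dominated_sort(objs):
    """Return a list of fronts, each a list of indices into objs."""
    n = len(objs)
    dominated = [[] for _ in range(n)]   # solutions that i dominates
    counts = [0] * n                     # number of solutions dominating i
    for i in range(n):
        for j in range(n):
            if dominates(objs[i], objs[j]):
                dominated[i].append(j)
            elif dominates(objs[j], objs[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        nxt = []
        for i in fronts[-1]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    nxt.append(j)
        fronts.append(nxt)
    return fronts[:-1]                   # drop the trailing empty front


def crowding_distance(objs, front):
    """Crowding distance of each index in one front."""
    dist = {i: 0.0 for i in front}
    for k in range(len(objs[0])):
        order = sorted(front, key=lambda i: objs[i][k])
        lo, hi = objs[order[0]][k], objs[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")
        if hi == lo:
            continue
        for prev, cur, nxt in zip(order, order[1:], order[2:]):
            dist[cur] += (objs[nxt][k] - objs[prev][k]) / (hi - lo)
    return dist


def compromise(objs, front):
    """Pick the index with the smallest mean of min-max normalized objectives."""
    m = len(objs[0])

    def norm(i, k):
        col = [objs[j][k] for j in front]
        span = max(col) - min(col)
        return 0.0 if span == 0 else (objs[i][k] - min(col)) / span

    return min(front, key=lambda i: sum(norm(i, k) for k in range(m)) / m)
```

In a full NSGA-II loop, parents and offspring would be merged, sorted into fronts, and the next population filled front by front, with crowding distance breaking ties inside the last front that fits. The double loop in `fast_non_dominated_sort` is where the $O(mN^2)$ cost comes from.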
Fig. 7 Flowchart of NSGA-II algorithm

5 Model Simulation Verification
Using Webots as the simulation platform, virtual prototypes of the robot before and after optimization were built, and a control script was written in Python to simulate the obstacle crossing action. The obstacle crossing process is shown in Fig. 9. With the structural parameters set to each of the two schemes, the rise height of the car body centroid and the swing arm drive torque during obstacle crossing were extracted; the comparisons are shown in Figs. 10 and 11. The simulation results show that the obstacle crossing height of the robot is 0.26 m before optimization and 0.29 m after optimization, an increase of 11.54%. The driving torque of the front swing arm is 45.01 N·m before optimization and 41.17 N·m after optimization, a reduction of 8.53%. This indicates that the optimized structural design parameters improve the obstacle crossing ability of the robot and verifies the effectiveness of the proposed optimization method.
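The two percentages follow directly from the reported simulation values. A short check, with a hypothetical helper name, is:

```python
def relative_change(before, after):
    """Signed relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Values reported in the simulation results above.
height_gain = relative_change(0.26, 0.29)      # obstacle crossing height, m
torque_change = relative_change(45.01, 41.17)  # swing arm torque, N·m
```

Rounded to two decimals, `height_gain` gives the 11.54% increase and `torque_change` the 8.53% reduction (as a negative change).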
Fig. 8 Pareto set result

Table 2 Optimization result parameter table
No.   Parameter   Unit   Before optimization   After optimization
1     $R$         mm     80                    88
2     $r$         mm     40                    45
3     $L_1$       mm     600                   616
4     $L_2$       mm     400                   412
5     $m_1$       kg     35                    39
6     $m_2$       kg     15                    11
7     $l_1$       mm     300                   430
8     $l_2$       mm     100                   212

Fig. 9 Obstacle crossing diagram
Fig. 10 Comparison of car body center of mass rising height
Fig. 11 Pendulum arm torque comparison diagram
6 Conclusion
Based on the NSGA-II algorithm, a structural optimization design scheme for the swing arm crawler search and rescue robot is proposed. First, the geometric and kinematic relationships between the robot's structural parameters and obstacles were established, and performance functions for the step height the robot can cross and for the swing arm driving torque were obtained. The NSGA-II algorithm was then used to solve the objective functions and obtain the optimal structural parameter values. With these structural parameters, a virtual prototype was simulated in Webots. Comparison with the structural parameter scheme before optimization verifies the rationality and feasibility of the optimized parameters, which provides a reference for subsequent robot development.
References
1. Dong, B.Y., Zhang, Z.Q., Xu, L.J., et al.: Research status and development trend of intelligent emergency rescue equipment. J. Mech. Eng. 56(11), 1–25 (2020) (in Chinese). https://doi.org/10.3901/JME.2020.11.001
2. Yamauchi, B.M.: PackBot: a versatile platform for military robotics. Proc. SPIE - Int. Soc. Opt. Eng. 5422, 228–237 (2004). https://doi.org/10.1117/12.538328
3. Ke, L., Liu, K., Wu, G., et al.: Multi-objective optimization design of corrugated steel sandwich panel for impact resistance. Metals 11(9), 1378 (2021). https://doi.org/10.3390/met11091378
4. Nan, Li., Minghui, Wang, Shugen, Ma., et al.: Mechanism parameters design method of an amphibious transformable robot based on multi-objective genetic algorithm. J. Mech. Eng. 48(17), 10–20 (2012). https://doi.org/10.3901/JME.2012.17.010
5. Liu, J., Ge, S., Zhu, H., et al.: Mine rescue robot power matching based on multi-objective particle swarm optimization. J. Mech. Eng. 51(3), 18–28 (2015) (in Chinese). https://doi.org/10.3901/JME.2015.03.018
6. Yang, Luo, Qimin, Li., Zhangxing, Liu: Design and optimization of wheel-legged robot: rollingwolf. Chin. J. Mech. Eng. 27(6), 1133–1142 (2014). https://doi.org/10.3901/CJME.2014.0905.144
7. Zhu, Y., Wang, M., Li, B., et al.: Design and verification of structural parameters of track deformable robot based on goal programming. Trans. Agric. Eng. Soc. (in Chinese). https://doi.org/10.11975/j.issn.1002-6819.2016.14.006
8. Lin, Y.K., Chang, P.C., Yang, L.C.L., et al.: Bi-objective optimization for a multistate job-shop production network using NSGA-II and TOPSIS. J. Manuf. Syst. 52(Part A), 43–54 (2019). https://doi.org/10.1016/j.jmsy.2019.05.004
9. Sun, L., Zhang, B., Wang, P., et al.: Multi-objective parametric optimization design for mirrors combined with non-dominated sorting genetic algorithm. Appl. Sci. 13(5), 3346 (2023). https://doi.org/10.3390/app13053346
10. Teng, X., Huang, Q., Dharmawardhana, C., Ichiye, T.: Diffusion of aqueous solutions of ionic, zwitterionic, and polar solutes. J. Chem. Phys. 148(22), 222827 (2018). https://doi.org/10.1063/1.5023004
11. Teng, X., Hwang, W.: Effect of methylation on local mechanics and hydration structure of DNA. Biophys. J. 114(8), 1791–1803 (2018). https://doi.org/10.1016/j.bpj.2018.03.022
12. Teng, X., Hwang, W.: Elastic energy partitioning in DNA deformation and binding to proteins. ACS Nano 10(1), 170–180 (2016). https://doi.org/10.1021/acsnano.5b06863
An Anomaly Detection Algorithm for Logs Based on Self-attention Mechanism and BiGRU Model Han Yang, Fuliang Lin, Yi Chai, Kaiming Qie, Wenyi Lin, Yuanyuan Wang, Cheng Zhang, and Maoyun Guo
Abstract This paper proposes a log anomaly detection algorithm based on a self-attention mechanism and a BiGRU model to handle the multiple characteristics that logs exhibit during system anomalies. First, two BiGRU models process the log template sequence and the log template frequency vector, respectively. Then, the outputs of the two BiGRU models are concatenated and fed into a self-attention layer to fuse the different features. Finally, the probability distribution of the next log template is output to perform log anomaly detection. The proposed method is evaluated on the Hadoop Distributed File System (HDFS) dataset and achieves a precision of approximately 96.7%. Compared with other log anomaly detection algorithms such as DeepLog and LogAnomaly, our method demonstrates better performance on this dataset.
Keywords Anomaly detection · Bi-directional Gated Recurrent Unit (BiGRU) · Self-attention mechanism
H. Yang · Y. Chai · W. Lin · C. Zhang · M. Guo (B)
School of Automation, Chongqing University, Chongqing 400044, China
F. Lin · K. Qie · Y. Wang
Science and Technology on Information Systems Engineering Laboratory, Beijing Institute of Control and Electronic Technology, Beijing 100038, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023. Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3_76

1 Introduction
As computer systems continue to grow in scale and complexity, system logs have become an important source of information for computer system operation management. Log files record important information such as the operational status of various system components, user operations, and network traffic, which is of great significance for system management, fault diagnosis, and business monitoring. Log anomaly detection has therefore become a major focus in industry: it analyzes log data to detect potential problems and anomalous events, such as abnormal network traffic, unauthorized access, and system crashes.
Currently, log anomaly detection models can be divided into those based on traditional machine learning and those based on deep learning. Typical examples of traditional machine learning methods include the following. Xu et al. [1] used the PCA method to implement offline anomaly detection for logs. Han et al. [2] merged semantically similar feature matrix templates based on expert knowledge to reduce noise interference and then used an SVM to decide whether the log data was abnormal. Liu et al. [3] proposed a log anomaly detection algorithm that combines K-prototype clustering and KNN classification. Although traditional machine learning models offer high interpretability and good detection performance on specific datasets, they perform poorly on large-scale datasets and depend heavily on feature engineering. With the rapid development of deep learning, many studies have introduced deep learning into log anomaly detection. For example, the DeepLog model proposed by Du et al. [4] uses LSTM networks to perform anomaly detection on both log template sequences and parameter variables. Meng et al. [5] proposed the LogAnomaly model, which uses the dLCE natural language processing model to generate template vectors through word embedding and then feeds them into an LSTM for training to achieve log anomaly detection. Zhang et al. [6] proposed the LogRobust model, which uses a bidirectional LSTM to capture the forward and backward dependencies of log template sequences and applies an attention mechanism to identify the log templates that matter most for discrimination. PLELog [7] vectorizes log templates and clusters them, then uses a probabilistic labeling method to assign labels to the unlabeled data in the training set, which are then used as input to a GRU network with an attention mechanism for log anomaly detection.
Logs contain multiple features, including event counts, event sequences, and time intervals [8]. When the system experiences an anomaly, it manifests in some of these features. Event counts give the number of occurrences of a particular event in the log; a significant increase or decrease in an event count may indicate an anomaly in the system. Event sequences reflect the temporal order of system logs; when the log pattern deviates from that of normal operation, the system likely has an anomaly. Time intervals are the durations between two consecutive logs; an excessively large interval may indicate performance degradation. A single anomaly may therefore show up in several log features at once. However, traditional deep learning methods train separate models for different log features, ignoring the correlations between features and making it difficult to handle complex and diverse log data. To address this issue, this paper proposes a log anomaly detection method based on a self-attention mechanism and a BiGRU model, which fuses multiple features for log anomaly detection. The main contributions of this paper are as follows:
1. The paper designs a log anomaly detection model composed of two Bi-directional Gated Recurrent Unit (BiGRU) networks and a self-attention mechanism.
2. The paper concatenates the outputs of the two BiGRU models and feeds them into the self-attention mechanism to fuse two log features (log sliding window data and log template frequency vector) for log anomaly detection.
3. The paper compares the proposed method with the DeepLog, LogAnomaly, and LogRobust log anomaly detection models in terms of precision, recall, and other metrics under the same experimental conditions, and conducts ablation experiments on the proposed method.
Fig. 1 Log anomaly detection system framework
2 Data Generation and Model Building
2.1 Method Framework
As shown in Fig. 1, the proposed log anomaly detection method, based on the self-attention mechanism [9] and BiGRU [10], consists of two stages: model training and anomaly detection. In the model training stage, the normal log dataset is preprocessed into two datasets, a sliding-window dataset and a log template frequency vector dataset, which are fed into two separate BiGRU models for training. In the anomaly detection stage, the current system logs are parsed, the same two datasets are constructed and fed into the two BiGRU models, and the self-attention layer outputs the predicted probability distribution of the next log template. The g log templates with the highest probabilities are placed in the log template candidate set. If the template of the next log actually generated by the target system is not in this candidate set, the log is considered anomalous at that moment.
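The data flow around the model can be sketched without the neural network itself. The following minimal example, with hypothetical function names, builds the two inputs (sliding windows and frequency vectors) from a sequence of integer template IDs and applies the top-g candidate rule to a probability vector. In the actual method that vector would be the output of the trained BiGRU and self-attention model; here it is simply an argument.

```python
from collections import Counter


def sliding_windows(template_ids, k):
    """Yield (window, next_id): k consecutive template IDs and the ID that follows."""
    for start in range(len(template_ids) - k):
        yield template_ids[start:start + k], template_ids[start + k]


def frequency_vector(window, num_templates):
    """Count how often each template ID (0..num_templates-1) occurs in the window."""
    counts = Counter(window)
    return [counts.get(t, 0) for t in range(num_templates)]


def is_anomalous(probabilities, actual_next, g):
    """Flag the log if its template is not among the g most probable templates."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda t: probabilities[t], reverse=True)
    return actual_next not in ranked[:g]
```

A larger g makes the detector more tolerant (fewer false alarms, more missed anomalies), so in practice it would be tuned on validation data.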
2.2 Log Parsing
Log data is semi-structured, so its general structure must be analyzed to separate constants from variables and to provide the basis for subsequent feature representation. Each log consists of a regular message part and a feature message part. The
H. Yang et al.
Fig. 2 Log parsing
regular message part includes basic information such as the timestamp, the level, and the name of the module that produced the log, which can be extracted and separated using regular expressions. The feature message part contains the core content of the log, consisting of text, numbers, and special symbols. Many algorithms exist for extracting log templates from log text, including SLCT [11], IPLoM [12], LKE [13], Spell [14], Drain [15] and LPV [16]. The model proposed in this paper uses the Drain algorithm for log parsing, as illustrated in Fig. 2.
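As a rough illustration of the split between the regular message part and the feature message part, the following sketch uses regular expressions on an HDFS-style line. The header layout, the field names, the sample line and the masking patterns are all assumptions made for this example; the masking step is only a crude stand-in for template extraction and is not the Drain algorithm, which builds templates with a fixed-depth parse tree.

```python
import re

# Assumed HDFS-style layout: "<date> <time> <pid> <level> <component>: <content>".
# The field names and the sample line are illustrative, not taken from the paper.
HEADER = re.compile(
    r"^(?P<date>\d{6}) (?P<time>\d{6}) (?P<pid>\d+) "
    r"(?P<level>[A-Z]+) (?P<component>[^:]+): (?P<content>.*)$"
)

# Crude variable masking; Drain itself uses a fixed-depth tree instead.
VARIABLE_PATTERNS = [
    re.compile(r"blk_-?\d+"),                      # block identifiers
    re.compile(r"/?\d+\.\d+\.\d+\.\d+(:\d+)?"),    # IP addresses, optional port
    re.compile(r"\b\d+\b"),                        # remaining bare numbers
]


def split_log(line):
    """Return (regular_part, feature_part), or None if the line does not match."""
    match = HEADER.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    content = fields.pop("content")
    return fields, content


def mask_variables(content):
    """Replace variable-looking tokens with <*> to approximate a log template."""
    for pattern in VARIABLE_PATTERNS:
        content = pattern.sub("<*>", content)
    return content


sample = ("081109 203518 143 INFO dfs.DataNode$DataXceiver: "
          "Receiving block blk_-1608999687919862906 "
          "src: /10.250.19.102:54106 dest: /10.250.19.102:50010")
regular, feature = split_log(sample)
template = mask_variables(feature)
print(regular["level"], regular["component"])
print(template)
```

The regular part ends up in a dictionary of named fields, while the feature part is reduced to a constant skeleton with `<*>` placeholders that can be mapped to a template ID.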
2.3 Sliding Window Oriented Data Set Generation
This paper uses the sliding window mechanism, a time-based data processing technique that sequentially processes data within a specific time range, to process log template sequences. Given a window size of $k$, the log templates within the window form one piece of sliding window sequence data, from which the corresponding log template frequency vector is generated. The $(k+1)$th log is used as the subsequent log of the window. Each time the window slides forward by one log template on the log template sequence, it generates one piece of log template sliding window data and one log template frequency vector. As shown in Fig. 3, sorting the log data set $D = \{d_1, d_2, \ldots, d_i\}$ parsed by Drain by module and time yields a log data set for each module $i$, where $d_i = \{l_1, l_2, \ldots, l_j\}$ is the log sequence generated by module $i$ up to moment $j$ and $l_j$ is the log produced by module $i$ at moment $j$. By extracting the log template IDs corresponding to these logs in $D$, the log template
Fig. 3 The process of generating the dataset
sequence data set $S = \{s_1, s_2, \ldots, s_i\}$ for all modules in $D$ can be obtained, where $s_i = \{m_1, m_2, \ldots, m_j\}$ is the log template sequence generated by module $i$ and $m_j$ is the log template corresponding to log $l_j$. With a sliding window of length $k$, each slide of one log template along the sequence $s_i$ generates one piece of log template sliding window data and its corresponding log template frequency vector. The log template sliding window data corresponding to $s_i$ is $w_i = \{h_{j-k}, \ldots, h_{j-2}, h_{j-1}\}$, where $h_{j-k}$ is the log template window whose end position in $s_i$ is $j-k$. The log template frequency vector of $h_{j-k}$ is $f_{i,j-k}$; its dimension $x$ is the number of log template types that appear in the log data set $D$, and $f_{i,j-k}[n]$ is the frequency of template $n$ in $h_{j-k}$ of $w_i$. For example, $f_{i,j-k}[0] = 1$ indicates that template 0 appears once in $h_{j-k}$ of $w_i$. Each piece of log template sliding window data $w_i$ thus corresponds to a set of log template frequency vectors $f_i = \{f_{i,j-k}, \ldots, f_{i,j-2}, f_{i,j-1}\}$. $W = \{w_1, w_2, \ldots, w_n\}$ is the log template sliding window data set for all modules in $D$, and $F = \{f_1, f_2, \ldots, f_n\}$ is the set of log template frequency vectors for all modules in $D$.
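The generation procedure can be sketched for a single module as follows. This is one reading of the description above, not the authors' code: it assumes that template IDs are integers in the range 0 to x − 1, that each window of k templates yields one frequency vector, and that the template immediately after the window is the prediction target. The function name `build_samples` and the example sequence are hypothetical.

```python
from collections import Counter


def build_samples(template_ids, k, num_templates):
    """Slide a window of length k over one module's template ID sequence.

    Returns a list of (window, frequency_vector, next_template) triples,
    where frequency_vector[n] counts how often template n occurs in the window.
    """
    samples = []
    # Each usable window must leave one template after it as the target.
    for start in range(len(template_ids) - k):
        window = template_ids[start:start + k]
        counts = Counter(window)
        frequency = [counts.get(n, 0) for n in range(num_templates)]
        next_template = template_ids[start + k]
        samples.append((window, frequency, next_template))
    return samples


# Hypothetical template sequence for one module, with x = 3 template types.
sequence = [0, 1, 1, 2, 0]
samples = build_samples(sequence, k=3, num_templates=3)
for window, frequency, target in samples:
    print(window, frequency, target)
```

With k = 3 the five-element sequence produces two samples: the window [0, 1, 1] with frequency vector [1, 2, 0] and target 2, and the window [1, 1, 2] with frequency vector [0, 2, 1] and target 0.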
2.4 BiGRU Model Combined with Self-attention Mechanism
BiGRU is a deep neural network architecture that extends the unidirectional Gated Recurrent Unit (GRU) with a second GRU that reads the sequence backward. Compared to GRU, it captures long-term dependencies within sequences better and uses information from both directions to understand the sequence. Because logs have an obvious sequential relationship, BiGRU is well suited to capturing potential anomalies and abnormal patterns in log sequences. Additionally, BiGRU can learn contextual information of log sequences as well as relationships between adjacent
Fig. 4 Self-attentive mechanism
logs. By capturing these contextual cues and relationships, it can detect abnormal events and errors in the log data more accurately. The self-attention mechanism is an important mechanism in deep learning. Its basic idea is to treat each element of the input sequence as a vector and to determine the weight of each element by calculating the similarity between these vectors. The vectors are then weighted and summed to obtain a representation of the entire sequence. Because the self-attention mechanism can learn the key information in the input sequence and represent it as a vector, it can be used to process the concatenated outputs of the two BiGRU models. By weighting the information at each moment in the input sequence, the model can focus on the information most relevant to the current moment, which can improve the accuracy of log anomaly detection. As shown in Fig. 4, $v_1$ and $v_2$ are the probability distribution vectors of the next log template obtained from the log sliding window dataset and the log template frequency vector set, respectively. For example, $p_{11}$ is the probability, predicted by the BiGRU model from the log sliding window dataset, that the next log corresponds to log template $m_1$. Because $v_1$ and $v_2$ both predict the probability distribution of the log template at the next time step, and their dimension is the number of log template types obtained from the parsed log dataset, they can be concatenated to form the input data $out = \{out_1, out_2, \ldots, out_x\}$, which is then fed into the self-attention mechanism. The input data $out$ first enters fully connected layer 1, which maps each $out_x$ to a hidden vector of the same dimension and then applies a non-linear transformation with the tanh activation function to obtain a new hidden vector $u_x$ for the subsequent attention calculation:
$$u_x = \tanh(W \cdot out_x + b_1) \tag{1}$$
Fig. 5 BiGRU models combining self-attentive mechanism
In Eq. (1), $out_x$ is the input vector, $W$ is the weight matrix of this layer and $b_1$ is the bias vector of this layer. Then, the vector $u = \{u_1, u_2, \ldots, u_x\}$ is input to fully connected layer 2 for a linear transformation followed by the softmax function to obtain the attention weight $a_x$:
$$a_x = \frac{\exp(w^T u_x)}{\sum_{i} \exp(w^T u_i)} \tag{2}$$
In Eq. (2), $u_x$ is the $x$th vector output from the tanh layer and $w^T$ is the weight matrix of this layer. Finally, the input vector $out_x$ and the attention weight $a_x$ are multiplied together to obtain the final log template distribution probability $v = \{p_1, p_2, \ldots, p_x\}$, where $p_x$ is the probability of generating log template $x$ at the next moment:
$$p_x = \sum_{i=1}^{n} a_x \, out_x \tag{3}$$
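The three equations can be traced numerically with a small sketch. Because the indices in Eqs. (2) and (3) leave some room for interpretation, the code below commits to one reading and should be taken as an assumption rather than the authors' implementation: the attention weights are computed over the two BiGRU branch outputs $v_1$ and $v_2$, and the fused distribution is their weighted sum. All numeric values, the hidden size and the function name `attention_fuse` are hypothetical.

```python
import math


def attention_fuse(v1, v2, W, b1, w):
    """Fuse two probability vectors with a two-layer attention scorer.

    v1, v2: next-template distributions from the two BiGRU branches.
    W, b1:  weights and bias of fully connected layer 1 (Eq. 1).
    w:      weight vector of fully connected layer 2 (Eq. 2).
    """
    outs = [v1, v2]
    scores = []
    for out in outs:
        # Eq. (1): u = tanh(W * out + b1)
        u = [math.tanh(sum(W[r][c] * out[c] for c in range(len(out))) + b1[r])
             for r in range(len(W))]
        # Argument of the exponential in Eq. (2): w^T u
        scores.append(sum(w[r] * u[r] for r in range(len(u))))
    # Eq. (2): softmax over the scores (shifted by the max for stability)
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Eq. (3), read here as a weighted sum of the branch outputs
    fused = [sum(a * out[c] for a, out in zip(weights, outs))
             for c in range(len(v1))]
    return weights, fused


# Hypothetical values: 3 template types, hidden size 2.
v1 = [0.7, 0.2, 0.1]
v2 = [0.5, 0.4, 0.1]
W = [[0.3, -0.1, 0.2], [0.0, 0.5, -0.4]]
b1 = [0.1, -0.2]
w = [1.0, -1.0]
weights, fused = attention_fuse(v1, v2, W, b1, w)
print(weights, fused)
```

Under this reading the attention weights sum to 1, so the fused vector is a convex combination of $v_1$ and $v_2$ and remains a valid probability distribution over log templates.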
This paper proposes a BiGRU model combined with the self-attention mechanism to achieve log anomaly detection. As shown in Fig. 5, the model consists of three main layers: input layer, hidden layer, and output layer. The input layer accepts the log template sliding window data $w_i = \{h_{j-k}, \ldots, h_{j-2}, h_{j-1}\}$ and its corresponding log template frequency data $f_i = \{f_{i,j-k}, \ldots, f_{i,j-2}, f_{i,j-1}\}$. In the hidden layer, BiGRU1 processes the log template sliding window data and BiGRU2 processes the log template frequency data. The self-attention mechanism layer is used to weight the results of the two BiGRU
outputs connected together to extract key information. The self-attention mechanism layer outputs the log template distribution probability vector $v_i = \{p_1, p_2, \ldots, p_x\}$ for the next moment. The $g$ log templates with the highest probabilities in $v_i$ are added to the template candidate set $C = \{c_1, c_2, \ldots, c_g\}$, where $c_g$ is the log template with the $g$th highest probability in $v_i$. If the log template of the actual log at the next moment does not belong to the template candidate set, the log is considered anomalous. The model operates in a training phase and an anomaly detection phase. In the training phase, the previously generated log template sliding window data set $W$ and log template frequency vector set $F$ are input, and the two BiGRU models sequentially extract $w$ and $f$ from these datasets for training. The detailed steps of the training phase are shown in Table 1. In the anomaly detection phase, the logs to be detected first undergo template parsing and data preprocessing to obtain the required sliding window data and frequency vectors of the log templates. These data are then input into the model, which outputs a probability vector $v_i = \{p_1, p_2, \ldots, p_x\}$ for the log template distribution at the next moment. The $g$ log templates with the highest probabilities in $v_i$ are extracted to form the candidate set $C$. If the log template of a system-generated log is in $C$, the log is considered normal; otherwise, it is considered anomalous. The detailed steps of the anomaly detection phase are shown in Table 2.
Table 1 Model training algorithm
Input: Log template sliding window dataset W, log template frequency vector set F
Output: BiGRU model combined with self-attention mechanism
1. Input the log template sliding window dataset W and log template frequency vector set F
2. Extract the log template sliding window data w and log template frequency data f from the dataset for training
   FOR LINE IN W:
       Extract w and input into BiGRU1
   FOR LINE IN F:
       Extract f and input into BiGRU2
3. Train the BiGRU models combined with self-attention mechanism
4. Output the trained BiGRU model combined with self-attention mechanism and save the model data
Table 2 Algorithm for anomaly detection
Input: The raw log sequence Log = {log_1, log_2, ..., log_i} of a certain system module, the maximum number g of log templates with the highest probabilities, and the sliding window length k
Output: The result of log detection
1. Input the original dataset Log and use a log parsing tool to parse the logs and obtain the log template sequence s = {m_1, m_2, ..., m_j}
2. Define a sliding window with a length of k
   FOR i IN s:
       IF i ≥ j:
           END
       ELSE:
           Generate sliding window data w = {h_{j-k}, ..., h_{j-2}, h_{j-1}} corresponding to s
           Generate frequency vector f = {f_{i,j-k}, ..., f_{i,j-2}, f_{i,j-1}} corresponding to w
3. Input w into BiGRU1 and f into BiGRU2
4. Obtain the probability vector v = {p_1, p_2, ..., p_x} for the next time step's log template distribution in this module
5. Extract the top g log templates with the highest probabilities in v to form a candidate set C
6. Return the result of log detection
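Steps 5 and 6 of the detection algorithm reduce to a membership test against the top-g candidate set. The sketch below shows that decision rule in isolation; the probability vector is a made-up example, the function names are hypothetical, and ties between equal probabilities are broken by template ID, which the paper does not specify.

```python
def top_g_candidates(probabilities, g):
    """Return the IDs of the g templates with the highest predicted probability."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda n: probabilities[n], reverse=True)
    return set(ranked[:g])


def is_anomalous(probabilities, actual_template, g):
    """A log is flagged when its template is outside the candidate set C."""
    return actual_template not in top_g_candidates(probabilities, g)


# Hypothetical fused prediction over 5 template types.
v = [0.05, 0.50, 0.30, 0.10, 0.05]
candidates = top_g_candidates(v, g=2)
print(candidates)
print(is_anomalous(v, actual_template=2, g=2))
print(is_anomalous(v, actual_template=3, g=2))
```

With g = 2 the candidate set contains templates 1 and 2, so an observed template 2 is treated as normal and an observed template 3 is flagged. A larger g makes the detector more tolerant, which tends to trade recall for precision.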
3 Experiment and Results
3.1 Experimental Environment and Data
Experimental environment: Processor: AMD Ryzen 7 5800H; Memory: 16 GB; Graphics card: GeForce RTX 3060; Operating system: Windows 10; CUDA version: CUDA 11.6, cuDNN 8.3.2; TensorFlow 2.8.0. Experimental data: the open-source HDFS dataset [17] was selected as the training and experimental dataset. It contains 11,197,954 log entries, of which approximately 2.9% are abnormal.
3.2 Evaluation Metrics
In actual log data, normal logs usually far outnumber abnormal logs. If only accuracy were used to evaluate anomaly detection algorithms, a model that predicts all test data as normal would still obtain a high accuracy rate. Therefore, this paper adopts Precision, Recall and F1 score to evaluate the performance of the algorithm. These evaluation metrics can
more comprehensively reflect the performance of the algorithm and measure its advantages and disadvantages from different perspectives. Precision is the proportion of correctly detected abnormal log data to the total number of log data detected as abnormal:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{4}$$
Recall is the proportion of correctly detected abnormal log data to the total number of abnormal log data:
$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{5}$$
F1 score (F-measure) is the harmonic mean of precision and recall, which comprehensively evaluates the overall performance of anomaly detection:
$$\mathrm{F1\text{-}measure} = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{6}$$
True Positive (TP) refers to the number of abnormal log data correctly detected by the model, False Positive (FP) refers to the number of normal log data incorrectly detected as abnormal by the model, and False Negative (FN) refers to the number of abnormal log data incorrectly detected as normal by the model.
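Equations (4) to (6) are simple enough to check by hand, and a short sketch makes the arithmetic explicit. The confusion counts below are hypothetical and chosen only for illustration, and the helper names are not from the paper. The second check recomputes F1 from the Precision and Recall that the paper reports for the proposed method on HDFS (0.9670 and 0.9859), which should land close to the reported F1 of 0.9763 up to rounding.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute Eqs. (4)-(6) from confusion counts (denominators assumed nonzero)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """Eq. (6) on its own, for checking reported precision/recall pairs."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 90 anomalies caught, 10 false alarms, 30 anomalies missed.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(round(p, 4), round(r, 4), round(f1, 4))

# Consistency check against the values reported for the proposed method.
reported_f1 = f1_from(0.9670, 0.9859)
print(round(reported_f1, 4))
```

Because F1 is a harmonic mean, it sits closer to the lower of the two inputs: in the hypothetical example, a precision of 0.90 and a recall of 0.75 give an F1 of about 0.82 rather than their arithmetic mean of 0.825.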
3.3 Experimental Analysis
This section compares the performance of the proposed log anomaly detection model, based on the self-attention mechanism and BiGRU, with the DeepLog, LogAnomaly, and LogRobust models on the HDFS dataset, and reports ablation experiments on the proposed method. Experiment 1. Table 3 shows the results. On the Recall and F1 metrics, the proposed method achieved the best performance among the four methods. On the Precision metric, the proposed method outperformed DeepLog and LogRobust and was only 0.2 percentage points lower than LogAnomaly. Compared to DeepLog, the proposed method improved Precision, Recall and F1 by 0.87, 5.29, and 3.09 percentage points, respectively. Experiment 2. Ablation experiments were conducted on the HDFS dataset by selectively removing components of the proposed method to measure the contribution of each. Specifically, the model was tested with three kinds of input: the log sliding window dataset alone, the log template frequency vector dataset alone, and the combination of the two datasets.
Table 3 Exception detection effect on HDFS data sets
Model                    Precision   Recall   F1
DeepLog                  0.9583      0.9330   0.9454
LogAnomaly               0.9690      0.9825   0.9757
LogRobust                0.9216      0.9586   0.9397
The proposed method      0.9670      0.9859   0.9763

Table 4 The results of ablation experiment (HDFS)
Model                    Precision   Recall   F1       Time consumption (s)
Sliding window dataset   0.9580      0.9337   0.9457   4426.353
Frequency vector set     0.9408      0.8674   0.9026   5159.104
The proposed method      0.9670      0.9859   0.9763   7366.619
As shown in Table 4, when using the proposed method, the model achieved higher precision, recall, and F1 score compared to using a single log feature. However, the use of two Bi-GRU models resulted in an increase in model parameters, which slowed down the speed of log anomaly detection and led to an increase in time consumption.
4 Conclusion
Log anomaly detection plays a crucial role in information technology and software operations and maintenance. This paper proposes a log anomaly detection method based on the self-attention mechanism and the BiGRU model to detect anomalies in system logs. The method uses two BiGRU models to process the log template sequence and the log template frequency vector respectively. The outputs of the two models are then concatenated and input into the self-attention mechanism layer to associate the different features, enabling anomaly detection from multiple features of the log. The experimental results show that the proposed model achieves good performance in anomaly detection. In future work, other log features (such as log parameters) can be introduced to observe how the model performs with more features, and the time consumption of the multi-feature model in log anomaly detection needs to be reduced.
References
1. Xu, W., Huang, L., Fox, A., et al.: Detecting large-scale system problems by mining console logs. In: Proceedings of the ACM SIGOPS 22nd Symposium on Operating Systems Principles, pp. 117–132. ACM, Montana (2009). https://doi.org/10.1145/1629575.1629587
2. Han, S.B., Wu, Q.H., Zhang, H., et al.: Log-based anomaly detection with robust feature extraction and online learning. IEEE Trans. Inf. Forens. Secur. 16, 2300–2311 (2021). https://doi.org/10.1109/TIFS.2021.30533716
3. Liu, Z.L., Qin, T., Guan, X.H., et al.: An integrated method for anomaly detection from massive system logs. IEEE Access 6, 30602–30611 (2018). https://doi.org/10.1109/ACCESS.2018.2843336
4. Du, M., Li, F.F., Zheng, G.N., et al.: DeepLog: anomaly detection and diagnosis from system logs through deep learning. https://doi.org/10.24963/ijcai.2019/658
5. Meng, W.B., Liu, Y., Zhu, Y.C., et al.: LogAnomaly: unsupervised detection of sequential and quantitative anomalies in unstructured logs. In: Proceedings of the 28th International Joint Conference on Artificial Intelligence, pp. 4739–4745. AAAI Press, Macao (2019)
6. Zhang, X., Xu, Y., Lin, Q.W., et al.: Robust log-based anomaly detection on unstable log data. In: Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 807–817. ACM, Singapore (2019). https://doi.org/10.24963/ijcai.2019/658
7. Yang, L., Chen, J.J., Wang, Z., et al.: Semi-supervised log-based anomaly detection via probabilistic label estimation. In: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), pp. 1448–1460. IEEE, Madrid (2021). https://doi.org/10.1109/ICSE43902.2021.00130
8. Zhao, N.W., Wang, H.L., Li, Z.Y., et al.: An empirical investigation of practical log anomaly detection for online service systems. In: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pp. 1404–1415. ACM, Singapore (2021). https://doi.org/10.1145/3468264.3473933
9. Lin, Z., Feng, M., Santos, C., et al.: A structured self-attentive sentence embedding (2017)
10. Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. CoRR, abs/1406.1078 (2014)
11. Vaarandi, R.: A data clustering algorithm for mining patterns from event logs. In: Proceedings of the 3rd IEEE Workshop on IP Operations & Management (IPOM 2003) (IEEE Cat. No. 03EX764), pp. 119–126. IEEE, Kansas City (2003). https://doi.org/10.1109/IPOM.2003.1251233
12. Makanju, A., Zincir-Heywood, A.N., Milios, E.E.: A lightweight algorithm for message type extraction in system application logs. IEEE Trans. Knowl. Data Eng. 24(11), 1921–1936 (2011). https://doi.org/10.1109/TKDE.2011.138
13. Fu, Q., Lou, J.G., Wang, Y., et al.: Execution anomaly detection in distributed systems through unstructured log analysis. In: 2009 9th IEEE International Conference on Data Mining, pp. 149–158. IEEE, Miami Beach (2009). https://doi.org/10.1109/ICDM.2009.60
14. Du, M., Li, F.F.: Spell: online streaming parsing of large unstructured system logs. IEEE Trans. Knowl. Data Eng. 31(11), 2213–2227 (2019). https://doi.org/10.1109/TKDE.2018.2875442
15. He, P.J., Zhu, J.M., Zheng, Z.B., et al.: Drain: an online log parsing approach with fixed depth tree. In: 2017 IEEE International Conference on Web Services (ICWS), pp. 33–40. IEEE, Honolulu (2017). https://doi.org/10.1109/ICWS.2017.13
16. Xiao, T., Quan, Z., Wang, Z.J., et al.: LPV: a log parser based on vectorization for offline and online log parsing. In: 2020 IEEE International Conference on Data Mining (ICDM), pp. 1346–1351. IEEE, Sorrento (2020). https://doi.org/10.1109/ICDM50108.2020.00175
17. He, S., Zhu, J., He, P., et al.: Loghub: A Large Collection of System Log Datasets towards Automated Log Analytics (2020)
Author Index
A An, Zhiliang, 407
B Benyoub, Fatima Zahra, 795 Bu, Zhiheng, 297
C Cai, He, 213, 573 Cai, Qiang, 27 Cao, Jian, 27 Chai, Yi, 877 Chen, Chen, 625 Cheng, Bo, 865 Cheng, Chen, 391 Cheng, Lei, 519 Cheng, Zhiyi, 245 Chen, Jing, 63, 203 Chen, Lingxiao, 15 Chen, Mengqiao, 687 Chen, Xinmin, 687 Chen, Zengqiang, 839 Chen, Zhenrui, 433 Chen, Zhong, 267 Chu, Bing, 169 Cong, Wenyuan, 687
D Deng, Songbo, 213 Ding, Feng, 227 Dou, Yongqiang, 267 Duan, Jiaqi, 213 Du, Mingjun, 407
F Fan, Mei, 733 Fan, Yimin, 529 Fu, Dongmei, 855 Fu, Jian, 53 Fu, Qiang, 307 Fu, Yongling, 53
G Gao, Chihan, 15 Gao, Fugen, 447 Gao, Guanbin, 415 Gao, Xingyu, 625 Geng, Chao, 267 Guo, Caixiang, 237 Guo, Dong, 677 Guo, Maoyun, 877 Guo, Xinyang, 415 Guo, Yajing, 203, 467, 489, 777 Guo, Yaxing, 179 Guo, Yun-qi, 591
H Han, Hongyu, 855 Han, Yongxuan, 551 He, Chao, 573 Hou, Bingbing, 519 Hou, Jianlin, 563 Hou, LinYuan, 721 Hua, Guoliang, 111 Huang, Hongjian, 307 Huang, Kui, 179, 573 Huang, Lei, 677 Huang, Man, 213
© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2023 Y. Jia et al. (eds.), Proceedings of 2023 Chinese Intelligent Systems Conference, Lecture Notes in Electrical Engineering 1089, https://doi.org/10.1007/978-981-99-6847-3
Hu, Baoyan, 53 Hu, Cong, 767 Huo, Hanliang, 193 Hu, Wenxiao, 687
J Jia, Haocong, 855 Jia, Longfei, 179 Jiang, Renjian, 101 Jiang, Tao, 767 Jiao, Jie, 137 Jia, Yingmin, 329 Ji, Peng, 407 Ji, Yuehui, 539
K Ke, Zhaotao, 539 Kong, Guizhen, 179
L Liang, Binyan, 203 Liang, Yu-chen, 591 Li, Chang, 467 Li, Chunyu, 767 Li, Cong, 425 Li, Fengyun, 519 Li, Hai-sheng, 27 Li, Haiyuan, 343 Li, Han, 477 Li, Hou-sheng, 477 Li, Jin, 237 Li, Juntao, 447 Li, Ke, 213 Lin, Fuliang, 877 Ling, Chen, 15 Lin, Guangrong, 111, 433 Lin, Wenyi, 877 Lin, Yan, 245, 657, 795 Lin, Yue, 687 Lin, Yurui, 155 Lin, Zijian, 785 Li, Peng, 101 Li, Pengfei, 467, 489, 539 Li, Rongguang, 1 Li, Ting, 529 Liu, Chao, 155 Liu, Chenglin, 699 Liu, Duo, 647 Liu, Fei, 415 Liu, Jingyi, 47 Liu, Junjie, 539
Liu, Li, 357 Liu, Liu, 529 Liu, Meng, 27 Liu, Shuo, 591 Liu, Shuxuan, 203, 467, 489 Liu, Tian, 529 Liu, Wei, 267 Liu, Wenzheng, 455 Liu, Xiaonan, 625 Liu, Xiaotong, 455 Liu, Xu, 647 Liu, Xudong, 1 Liu, Yu, 85 Liu, Yujia, 635 Liu, Zhenyu, 169 Liu, Zhongxin, 709 Li, Weixing, 73 Li, Wenliang, 825 Li, Yicheng, 721 Li, Yong, 617 Li, Ziyan, 37 Lu, Jinshu, 123 Luo, Zhiwen, 489, 573 Lv, Bohan, 777 M Ma, Guangyuan, 391 Ma, Ruoxun, 193 Mao, Jianlin, 317 Ma, Xiangyu, 677 Ma, Xiao-cheng, 591 Ma, You, 391 Ma, Yue, 297 Meng, Haoyang, 647 Meng, Qingrui, 657 Mo, Lipo, 193 P Pan, Binfeng, 137 Pang, Shaopeng, 407 Q Qie, Kaiming, 877 Qi, Guoyuan, 753 Qi, Haitao, 647 Qiu, Xingxing, 625 S Shan, Niu, 583 Shao, Bowei, 825 Shao, Nian, 391
Shen, Qingcheng, 563 Shi, Xiaohui, 809 Su, Hang, 647 Sun, Haichao, 667 Sun, Hao, 839 Sun, Jian, 377, 865 Sun, Jing, 529 Sun, Liang, 101 Sun, Qinglin, 839 Sun, Zeyuan, 73 Su, Xin, 583
T Tang, Mucheng, 297 Tian, Lin, 101
W Wang, Chen, 709 Wang, Enci, 563 Wang, Fuyong, 709 Wang, Haikuan, 85 Wang, Hui, 825 Wang, Jiajun, 617 Wang, Jian, 357 Wang, Lei, 743 Wang, Ning, 317 Wang, Niya, 317 Wang, Qu, 573 Wang, Shiming, 329 Wang, Shishen, 753 Wang, Shuangxin, 47 Wang, Shuochen, 357 Wang, Wenle, 519 Wang, Xiaolong, 607 Wang, Xiaoyue, 699 Wang, Xinyu, 47 Wang, Yajie, 583 Wang, Yanbo, 213 Wang, Yaxin, 455 Wang, Yuanyuan, 877 Wang, Yubin, 367 Wu, Chen, 123 Wu, Junzhou, 15 Wu, Wenbo, 137 Wu, Xiru, 155, 281
X Xiang, Jun, 583 Xiang, Shan, 447 Xiang, Zhengrong, 625 Xia, Xingke, 377, 865
Xie, Sheng, 267 Xing, Yashan, 415 Xin, Le, 1 Xu, Chenyang, 343 Xue, ZhiBin, 503 Xu, Fengrui, 687 Xu, Ling, 227 Xu, Weihong, 227 Xu, Zhiguo, 425
Y Yang, Chenxi, 237 Yang, Fan, 203, 467, 777 Yang, Fengli, 607 Yang, Han, 877 Yang, Hongyong, 357 Yang, Huiwen, 391 Yang, Jinshan, 839 Yang, KaiMing, 503 Yang, Lining, 53 Yang, Shuaibin, 137 Yang, Yong, 743 Yang, Zhongguo, 733 Yan, Lutao, 343 Yan, Zhiming, 53, 169 Ye, Hui, 667 Yin, Pan, 825 Yi, Yang, 563 Yue, Changlu, 767 Yu, Lei, 809 Yu, Zhiyuan, 63
Z Zeng, Si, 777 Zhai, Junyong, 37, 367, 667, 785 Zhang, Bin, 343 Zhang, Binlei, 281 Zhang, Cheng, 877 Zhang, Chuangchuang, 357 Zhang, Haozhe, 193 Zhang, Jiahao, 617 Zhang, Jian-jun, 477 Zhang, Jinjun, 179 Zhang, Junning, 467, 489, 777 Zhang, Kaixiang, 317 Zhang, Mei, 15 Zhang, Pu, 377, 865 Zhang, Shufan, 317 Zhang, Shuo, 85, 825 Zhang, Xiulei, 551 Zhang, Yi, 529
Zhang, Yuchong, 281 Zhang, Yuman, 111 Zhang, Yuqiu, 281 Zhang, Zichong, 63 Zhao, Bin, 743 Zhao, Chun, 455 Zhao, Dongao, 647 Zhao, Lin, 425 Zhao, Long, 607 Zhao, Peng, 583 Zhao, Wenyu, 47 Zhao, Xu, 753, 767 Zhao, Yafei, 111, 433 Zhao, Yilin, 307
Zhao, Yueling, 677 Zhao, Zhe, 63 Zhao, Zhongrui, 169 Zheng, Dongdong, 73 Zheng, Jigui, 179, 267 Zheng, Yuemin, 839 Zhi, Jiankang, 213 Zhou, Jiaen, 433 Zhou, Xiaohui, 635 Zhu, Guibing, 123 Zhu, Junzhi, 607 Zhu, Rui, 709 Zhu, Xiaorong, 63 Zuo, Zheqing, 63