English · 2096 pages · 2021
Lecture Notes in Electrical Engineering 654
Qilian Liang · Wei Wang · Xin Liu · Zhenyu Na · Xiaoxia Li · Baoju Zhang Editors
Communications, Signal Processing, and Systems Proceedings of the 9th International Conference on Communications, Signal Processing, and Systems
Lecture Notes in Electrical Engineering Volume 654
Series Editors

Leopoldo Angrisani, Department of Electrical and Information Technologies Engineering, University of Napoli Federico II, Naples, Italy
Marco Arteaga, Departament de Control y Robótica, Universidad Nacional Autónoma de México, Coyoacán, Mexico
Bijaya Ketan Panigrahi, Electrical Engineering, Indian Institute of Technology Delhi, New Delhi, Delhi, India
Samarjit Chakraborty, Fakultät für Elektrotechnik und Informationstechnik, TU München, Munich, Germany
Jiming Chen, Zhejiang University, Hangzhou, Zhejiang, China
Shanben Chen, Materials Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
Tan Kay Chen, Department of Electrical and Computer Engineering, National University of Singapore, Singapore, Singapore
Rüdiger Dillmann, Humanoids and Intelligent Systems Laboratory, Karlsruhe Institute for Technology, Karlsruhe, Germany
Haibin Duan, Beijing University of Aeronautics and Astronautics, Beijing, China
Gianluigi Ferrari, Università di Parma, Parma, Italy
Manuel Ferre, Centre for Automation and Robotics CAR (UPM-CSIC), Universidad Politécnica de Madrid, Madrid, Spain
Sandra Hirche, Department of Electrical Engineering and Information Science, Technische Universität München, Munich, Germany
Faryar Jabbari, Department of Mechanical and Aerospace Engineering, University of California, Irvine, CA, USA
Limin Jia, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Janusz Kacprzyk, Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
Alaa Khamis, German University in Egypt El Tagamoa El Khames, New Cairo City, Egypt
Torsten Kroeger, Stanford University, Stanford, CA, USA
Qilian Liang, Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Ferran Martín, Departament d’Enginyeria Electrònica, Universitat Autònoma de Barcelona, Bellaterra, Barcelona, Spain
Tan Cher Ming, College of Engineering, Nanyang Technological University, Singapore, Singapore
Wolfgang Minker, Institute of Information Technology, University of Ulm, Ulm, Germany
Pradeep Misra, Department of Electrical Engineering, Wright State University, Dayton, OH, USA
Sebastian Möller, Quality and Usability Laboratory, TU Berlin, Berlin, Germany
Subhas Mukhopadhyay, School of Engineering & Advanced Technology, Massey University, Palmerston North, Manawatu-Wanganui, New Zealand
Cun-Zheng Ning, Electrical Engineering, Arizona State University, Tempe, AZ, USA
Toyoaki Nishida, Graduate School of Informatics, Kyoto University, Kyoto, Japan
Federica Pascucci, Dipartimento di Ingegneria, Università degli Studi “Roma Tre”, Rome, Italy
Yong Qin, State Key Laboratory of Rail Traffic Control and Safety, Beijing Jiaotong University, Beijing, China
Gan Woon Seng, School of Electrical & Electronic Engineering, Nanyang Technological University, Singapore, Singapore
Joachim Speidel, Institute of Telecommunications, Universität Stuttgart, Stuttgart, Germany
Germano Veiga, Campus da FEUP, INESC Porto, Porto, Portugal
Haitao Wu, Academy of Opto-electronics, Chinese Academy of Sciences, Beijing, China
Junjie James Zhang, Charlotte, NC, USA
The book series Lecture Notes in Electrical Engineering (LNEE) publishes the latest developments in Electrical Engineering - quickly, informally and in high quality. While original research reported in proceedings and monographs has traditionally formed the core of LNEE, we also encourage authors to submit books devoted to supporting student education and professional training in the various fields and application areas of electrical engineering. The series covers classical and emerging topics concerning:

• Communication Engineering, Information Theory and Networks
• Electronics Engineering and Microelectronics
• Signal, Image and Speech Processing
• Wireless and Mobile Communication
• Circuits and Systems
• Energy Systems, Power Electronics and Electrical Machines
• Electro-optical Engineering
• Instrumentation Engineering
• Avionics Engineering
• Control Systems
• Internet-of-Things and Cybersecurity
• Biomedical Devices, MEMS and NEMS
For general information about this book series, comments or suggestions, please contact [email protected].

To submit a proposal or request further information, please contact the Publishing Editor in your country:

China: Jasmine Dou, Editor ([email protected])
India, Japan, Rest of Asia: Swati Meherishi, Editorial Director ([email protected])
Southeast Asia, Australia, New Zealand: Ramesh Nath Premnath, Editor ([email protected])
USA, Canada: Michael Luby, Senior Editor ([email protected])
All other countries: Leontina Di Cecco, Senior Editor ([email protected])

** This series is indexed by EI Compendex and Scopus databases. **
More information about this series at http://www.springer.com/series/7818
Qilian Liang · Wei Wang · Xin Liu · Zhenyu Na · Xiaoxia Li · Baoju Zhang

Editors
Communications, Signal Processing, and Systems Proceedings of the 9th International Conference on Communications, Signal Processing, and Systems
Editors Qilian Liang Department of Electrical Engineering University of Texas at Arlington Arlington, TX, USA
Wei Wang Tianjin Normal University Tianjin, China
Xin Liu Dalian University of Technology Dalian, China
Zhenyu Na School of Information Science and Technology Dalian Maritime University Dalian, China
Xiaoxia Li Huazhong Agricultural University Wuhan, China
Baoju Zhang College of Electronic and Communication Engineering Tianjin Normal University Tianjin, China
ISSN 1876-1100 ISSN 1876-1119 (electronic)
Lecture Notes in Electrical Engineering
ISBN 978-981-15-8410-7 ISBN 978-981-15-8411-4 (eBook)
https://doi.org/10.1007/978-981-15-8411-4

© The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore
Contents
Optimal Quality of Information Service in an Electronic Commerce Website . . . 1
Weiwei Wu and Di Lin

Weight Forests-Based Learning Algorithm for Small-Scaled Data Processing . . . 9
Jiafu Ren, Di Lin, and Weiwei Wu

Detecting Anomalous Insiders Using Network Analysis . . . 17
Lei Dai, Liwei Zhang, Limin Li, and You Chen

Containerization Design for Autonomous and Controllable Cloud Distributed System . . . 30
Xiao Zhang, Yu Tang, Hao Li, Shaotao Liu, and Di Lin

Spatially Transformed Text-Based CAPTCHAs . . . 39
Chuanxiang Yan, Yu Tang, and Di Lin

Battery Capacity Multi-step Prediction on GRU Attention Network . . . 47
Jiazhi Huo, Yu Tang, and Di Lin

Heterogeneous Network Selection Algorithm Based on Reinforcement Learning . . . 56
Sheng Yu, Shou-Ming Wei, Wei-Xiao Meng, and Chen-Guang He

Heterogeneous Wireless Private Network Selection Algorithm Based on Gray Comprehensive Evaluation Value . . . 64
Shouming Wei, Shuai Wei, Chenguang He, and Bin Wang

A Novel Build-in-Test Method for the Multi-task Radar Warning Receiver Based on a Parallel Radio Frequency Network . . . 73
Desi Luo, Song Li, Yang Hui, and Xu Zhou
Design of FIR Filter Based on Genetic Algorithm . . . 81
Yipeng Wang, Yan Ding, Anding Wang, Jingyu Hua, and Weidang Lu

Timing Error Detection and Recovery Based on Linear Interpolation . . . 88
Xuminxue Hong, Bo Yang, Jingyu Hua, Anding Wang, and Weidang Lu

Robust Interference-plus-Noise Covariance Matrix Reconstruction Algorithm for GNSS Receivers Against Large Gain and Phase Errors . . . 94
Bo Hou, Haiyang Wang, Zhikan Chen, Zhiliang Fan, and Zhicheng Yao
Research on Image Recognition Technology of Transmission Line Icing Thickness Based on LSD Algorithm . . . 100
Shili Liang, Jun Wang, Peipei Chen, Shifeng Yan, and Jipeng Huang

A High-Frequency Acceleration Sensor for Monitoring Sloshing Response of Ships . . . 111
Chuanqi Liu, Wei Wang, Libo Qiao, and Jingping Yang

Research on cm-Wave and mm-Wave Dual-Frequency Active Composite Detection Guidance Technology . . . 119
Lai-Tian Cao, Chen Yan, Xiao-Min Qiang, and Xue-Hui Shao

Research on Modulation Recognition Algorithm Based on Combination of Multiple Higher-Order Cumulant . . . 128
Yingnan Lv and Jiaqi Zhen

Indoor Positioning Technology Based on WiFi . . . 134
Baihui Jiang and Jiaqi Zhen

Power Optimization in DF Two-Way Relaying SWIPT-Based Cognitive Sensor Networks . . . 139
Chenyiming Wen, Yiyang Qiang, and Weidang Lu

Compressive Sensing-Based Array Antenna Optimization for Adaptive Beamforming . . . 150
Jian Yang, Jian Lu, Bo Hou, and Xinxin Liu

A Fiber Bragg Grating Sensor for Pressure Monitoring of Ship Structure Under Wave Load . . . 159
Jingping Yang, Wei Wang, Libo Qiao, and ChuanQi Liu

Research on Key Technologies of NoverCart Smart Shopping Cart System . . . 167
Chengyao Yang, Gong Chen, Bo Yang, Lu Ba, and Jinlong Liu

Human Identification Under Multiple Gait Patterns Based on FMCW Radar and Deep Neural Networks . . . 176
Shiqi Dong, Weijie Xia, Yi Li, and Kejia Chen
Design and Application of a High-Speed Demodulator Supporting VCM Mode . . . 186
Wang Huai, Li Fan, and Han Zhuo

Comparative Analysis of Rain Attenuation Prediction Models for Terrestrial Links in Different Climates . . . 200
Lijie Wang and Hui Li

Improved Discrete Frequency and Phase Coding Waveform for MIMO Radar . . . 213
Linwei Wang, Bo Li, and Changjun Yu

Analysis of Influence of Antenna Azimuth on the Performance in a MIMO System . . . 217
Ke-Xin Xiao, Hui Li, You Luo, Huan-Yu Li, and Yu-Han Wang

Design of a Third-Order Filter with 10 GHz Center Frequency . . . 225
Hai Wang, Zhihong Wang, Guiling Sun, Rong Guo, Ming He, Yi Zhang, and Shengli Zhang

An Image Acquisition and Processing Technique Based on Machine Vision . . . 234
Huan-Yu Li, Hui Li, Jie Cheng, Yu-Han Wang, and Ke-Xin Xiao

Power Allocation Based on Complex Shape Method in NOMA System . . . 243
Jie Cheng, Hui Li, Lijie Wang, Chi Zhang, and Huanyu Li

Study on the Feature Extraction of Mine Water Inrush Precursor Based on Wavelet Feature Coding . . . 254
Ye Zhang, Yang Zhang, Xuguang Jia, Huashuo Li, and Shoufeng Tang

Research on Coal and Gas Outburst Prediction Using PSO-FSVM . . . 262
Benlong Zhu, Ye Zhang, Yanjuan Yu, Yang Zhang, Huashuo Li, Yuhang Sun, and Shoufeng Tang

Prediction of Coal and Gas Outburst Based on FSVM . . . 270
Xuguang Jia, Ye Zhang, Yang Zhang, Yanjuan Yu, Huashuo Li, Yuhang Sun, and Shoufeng Tang

Influencing Factors of Gas Emission in Coal Mining Face . . . 278
Zhou Zhou, Fan Shi, Yang Zhang, Yanjuan Yu, and Shoufeng Tang

Study on Gas Distribution Characteristics and Migration Law Under the Condition of Air Flow Coupling . . . 285
Yanjuan Yu, Huashuo Li, Yang Zhang, Xuguang Jia, Fan Shi, Yongxing Guan, and Shoufeng Tang

Research on Channel Coding of Convolutional Codes Cascading with Turbo Codes . . . 293
Chong-Yue Shi, Hui Li, Jie Xu, Qian Li, Hou Wang, and Liu-Xun Xue
A Low Pilot-Overhead Preamble for Channel Estimation with IAM Method in FBMC/OQAM Systems . . . 300
Dejin Kong, Qian Wang, Pei Liu, Xinmin Li, Xing Cheng, and Yitong Li

Modeling of Maritime Wireless Communication Channel . . . 309
Yu-Han Wang, Meng Xu, Huan-Yu Li, Ke-Xin Xiao, and Hui Li

Research on Message Forwarding Mechanism Based on Bayesian Probability Model in Wireless Multihop Network . . . 322
Yang Yan, Qin Danyang, Guo Xiaomeng, and Ma Lin

A S-Max-Log-MPA Multiuser Detection Algorithm Based on Serial in SCMA System . . . 330
Guanghua Zhang, Zonglin Gu, Weidang Lu, and Shuai Han

A Method and Realization of Autonomous Mission Management Based on Command Sequence . . . 338
Yiming Liu, Yu Jiang, Junhui Yu, Li Pan, Hongjun Zhang, and Zhenhui Dong

Remote Sensing Satellite Autonomous Health Management Design Based on System Working Mode . . . 346
Li Pan, Fang Ren, Chao Lu, Liuqing Yang, and Yiming Liu

An Autonomous Inter-Device Bus Control Transfer Protocol for Time Synchronization 1553B Bus Network . . . 354
Tian Lan, Zhenhui Dong, Hongjun Zhang, and Jian Guo

A Hybrid Service Scheduling Strategy of Satellite Data Based on TSN . . . 362
Zhaojing Cui, Zhenhui Dong, Hongjun Zhang, Xiongwen He, and Yuling Qiu

Campus Bullying Detection Algorithm Based on Surveillance Camera Image . . . 368
Tong Liu, Liang Ye, Tian Han, Tapio Seppänen, and Esko Alasaarela

Activity Emersion Algorithm Based on Multi-Sensor Fusion . . . 373
Susu Yan, Liang Ye, Tian Han, Tapio Seppänen, and Esko Alasaarela

Neural Network for Bullying Emotion Recognition . . . 379
Xinran Zhou, Liang Ye, Chenguang He, Tapio Seppänen, and Esko Alasaarela

Modulation Recognition Algorithm of Communication Signals Based on Artificial Neural Networks . . . 384
Dongzhu Li, Liang Ye, and Xuanli Wu

Deep Learning for Optimization of Intelligent Reflecting Surface Assisted MISO Systems . . . 390
Chi Zhang, Xiuming Zhu, Hongjuan Yang, and Bo Li
Max-Ratio Secure Link Selection for Buffer-Aided Multiuser Relay Networks . . . 396
Yajun Zhang, Jun Wu, and Bing Wang

Reconfigurable Data Acquisition System with High Reliability for Aircraft . . . 401
Yukun Chen, Lianjun Ou, Gang Rong, and Fei Liu

The Algorithm of Beamforming Zero Notch Deepening Based on Delay Processing . . . 409
Junqi Gao and Jiaqi Zhen

Mask Detection Algorithm for Public Places Entering Management During COVID-19 Epidemic Situation . . . 414
Yihan Yun, Liang Ye, and Chenguang He

Campus Bullying Detection Algorithm Based on Audio . . . 420
Tong Liu, Liang Ye, Tian Han, Tapio Seppänen, and Esko Alasaarela

An End-to-End Multispectral Image Compression Network Based on Weighted Channels . . . 425
Shunmin Zhao, Fanqiang Kong, Yongbo Zhou, and Kedi Hu

Multispectral Image Compression Based on Multiscale Features . . . 430
Shunmin Zhao, Fanqiang Kong, Kedi Hu, and Yuxin Meng

Dense Residual Network for Multispectral Image Compression . . . 435
Kedi Hu, Fanqiang Kong, Shunmin Zhao, and Yuxin Meng

Hyperspectral Unmixing Method Based on the Non-convex Sparse and Spatial Correlation Constraints . . . 441
Mengyue Chen, Fanqiang Kong, Shunmin Zhao, and Keyao Wen

Deep Denoising Autoencoder Networks for Hyperspectral Unmixing . . . 447
Keyao Wen, Fanqiang Kong, Kedi Hu, and Shunmin Zhao

Research on Over-Complete Sparse Dictionary Based on Compressed Sensing Theory . . . 453
Zhihong Wang, Hai Wang, and Guiling Sun

Spectrum Occupancy Prediction via Bidirectional Long Short-Term Memory Network . . . 462
Lijie Feng, Xiaojin Ding, and Gengxin Zhang

Active–Passive Fusion Technology Based on Neural Network-Aided Extended Kalman Filter . . . 470
Xiaomin Qiang, Zhangchi Song, Laitian Cao, Yan Chen, and Kaiwei Chen
Multi-Target Infrared–Visible Image Sequence Registration via Robust Tracking . . . 478
Bingqing Zhao, Tingfa Xu, Bo Huang, Yiwen Chen, and Tianhao Li

Research on an Improvement of Images Haze Removal Algorithm Based on Dark Channel Prior . . . 487
Guonan Jiang, Xin Yin, and Menghan Dong

Research on the Human Vehicle Recognition System Based on Deep Learning Fusion Remove Haze Algorithm . . . 498
Guonan Jiang, Xin Yin, and Jingyan Hu

Improved Skeleton Extraction Based on Delaunay Triangulation . . . 507
Jiayi Wei, Yingguang Hao, and Hongyu Wang

An Algorithm of Computing Task Offloading in Vehicular Network Based on Network Slice . . . 515
Peng Lv, Zhao Liu, Yinjiang Long, Peijun Chen, and Xiang Wang

A Cluster Routing Algorithm Based on Vehicle Social Information for VANET . . . 522
Chenguang He, Guanqiao Qu, Liang Ye, and Shouming Wei

LSTM-Based Channel Tracking of MmWave Massive MIMO Systems for Mobile Internet of Things . . . 529
Haiyan Liu, Zhou Tong, Qian Deng, Yutao Zhu, Tiankui Zhang, Rong Huang, and Zhiming Hu

A Polarization Diversity Merging Technique for Low Elevation Frequency Hopping Signals . . . 538
Gu Jiahui, Wang Bin, Liu Yang, and Liu Xin

Systematic Synthesis of Active RC Filters Using NAM Expansion . . . 545
Lingling Tan, Fei Yang, and Junkai Yi

An OFDM Radar Communication Integrated Waveform Based on Chaos . . . 555
Zhe Li and Weixia Zou

Joint Estimation for Downsampling Structure with Low Complexity . . . 563
Chen Wang, Wei Wang, Wenchao Yang, and Lu Ba

One-Bit DOA Estimation Based on Deep Neural Network . . . 572
Chen Wang, Suhang Li, and Yongkui Ma

A New Altitude Estimation Algorithm for 3D Surveillance Radar . . . 581
Jianguo Yu, Lei Gu, Dan Le, Yao Wei, and Qiang Huang

Cause Analysis and Solution of Burst Multicast Data Packet Loss . . . 589
Zongsheng Jia and Jiaqi Zhen
The Prediction Model of High-Frequency Surface Wave Radar Sea Clutter with Improved PSO-RBF Neural Network . . . 595
Shang Shang, Kangning He, Tong Yang, Ming Liu, Weiyan Li, and Guangpu Zhang
CS-Based Modulation Recognition of Sparse Multiband Signals Exploiting Cyclic Spectral Density and MLP . . . 604
Yanping Chen, Song Wang, Yulong Gao, Xu Bai, and Lu Ba

Bandwidth Estimation Algorithm Based on Power Spectrum Recovery of Undersampling Signal . . . 612
Yuntao Gu, Yulong Gao, Si Wang, and Baowei Li

An Adaptive Base Station Management Scheme Based on Particle Swarm Optimization . . . 619
Wenchao Yang, Xu Bai, Shizeng Guo, Long Wang, Xuerong Luo, and Mingjie Ji

An Earthquake Monitoring System of LoRa Dynamic Networking Based on AODV . . . 628
Long Wang, Wenchao Yang, Xu Bai, Lidong Liu, Xuerong Luo, and Mingjie Ji

Track Segments Stitching for Ballistic Group Target . . . 634
Xiaodong Yang, Jianguo Yu, Lei Gu, and Qiang Huang

A Physical Security Technology Based upon Multi-weighted Fractional Fourier Transform Over Multiuser Communication System . . . 642
Yong Li, Zhiqun Song, and Bin Wang

Fast Convergent Algorithm for Hypersonic Target Tracking with High Dynamic Biases . . . 649
Dan Le and Jianguo Yu

Optimization of MFCC Algorithm for Embedded Voice System . . . 657
Tianlong Shi and Jiaqi Zhen

Energy-Efficient Hybrid Precoding for Adaptive Sub-connected Architecture in MmWave Massive MIMO Systems . . . 661
Li Li, Qian Deng, Weiwei Dong, Yutao Zhu, Tiankui Zhang, Rong Huang, and Wei Huang

Sequential Pattern Mining-Based Alarm Correlation Analysis for Telecommunication Networks . . . 670
Ying Chen, Tiankui Zhang, Rong Huang, Yutao Zhu, and Zemin Liu

Multi-Model Ensemble-Based Fault Prediction of Telecommunication Networks . . . 678
Ying Chen, Tiankui Zhang, Rong Huang, Yutao Zhu, and Junhua Hong
Bone Marrow Cell Counting Method Based on Fourier Ptychographic Microscopy and Convolutional Neural Network . . . 687
Xin Wang, Tingfa Xu, Jizhou Zhang, Shushan Wang, Yizhou Zhang, Yiwen Chen, and Jinhua Zhang

Identification of Sensitive Regions for Power Equipment Based on Fast R-CNN . . . 694
Hanwu Luo, Qirui Wu, Zhonghan Peng, Hailong Zhang, and Houming Shen

Dimensionality Reduction Algorithm . . . 700
Wenzhen Li, Qirui Wu, Zhonghan Peng, Kai Chen, Hui Zhang, and Houming Shen

An Improved APTEEN Protocol Based on Deep Autoencoder . . . 709
Yu Song, Shubin Wang, and Lixin Jing

APTEEN Protocol Data Fusion Optimization Based on BP Neural Network . . . 717
Lixin Jing, Shubin Wang, and Yu Song

Realization of Target Tracking Technology for Generated Infrared Images . . . 726
Ge Changyun and Zhang Haibei

Review on Wearable Antenna Design . . . 731
Licheng Yang, Tianyu Liu, Qiaomei Hao, Xiaonan Zhao, Cheng Wang, Bo Zhang, and Yang Li

A Review of Robust Cost Functions for M-Estimation . . . 743
Yue Wang

Research on the Path Planning Algorithm for Emergency Evacuation in the Building Based on Ant Colony Algorithm . . . 751
Chenguang He, Suning Liu, Liang Ye, and Shouming Wei

Multiple Action Movement Control Scheme for Assistive Robot Based on Binary Motor Imagery EEG . . . 760
Xuefei Zhao, Dong Liu, Shengquan Xie, Quan Liu, Kun Chen, Li Ma, and Qingsong Ai

Modeling and Simulation of Ionospheric Surface Scattering Equation for High-Frequency Surface Radar . . . 769
Yongfeng Yang, Ziqiang Zhang, Xuguang Yang, Jun Du, and Yanghong Zhang

Greedy Matching-Based Pilot Allocation in Massive MIMO Systems . . . 777
Wen Zhao, He Gao, Yao Ge, Jie Zhang, Sanlei Dang, and Tao Lu
Modeling and Simulation of Ionospheric Volume Scattering Equation for High-Frequency Surface Radar . . . 786
Yongfeng Yang, Wei Tang, Xuguang Yang, Xueling Wei, and Le Yang

Research on the Elite Genetic Particle Filter Algorithm and Application on High-Speed Flying Target Tracking . . . 792
Lixia Nie, Xuguang Yang, Jinglin He, Yaya Mu, and Likang Wang

Fault Estimation and Compensation for Fuzzy Systems with Sensor Faults in Low-Frequency Domain . . . 799
Yu Chen, Xiaodan Zhu, and Jianzhong Gu

Anti-jamming Performance Evaluation of GNSS Receivers Based on an Improved Analytic Hierarchy Process . . . 808
Yuting Li, Zhicheng Yao, Yanhong Zhang, and Jian Yang

Communication Optical Cable Patrol Management System Based on RFID + GIS . . . 818
Yang Mei, Jiang Yaolou, Zhou Bo, Qing Chao, and Chen Zhen

Lip Language Recognition System Based on AR Glasses . . . 825
Zhenzhen Huang, Peidong Zhuang, and Xinyu Ren

Device-Free Human Activity Recognition Based on Channel Statement Information . . . 835
Ruoyu Cao, Xiaolong Yang, Mu Zhou, and Liangbo Xie

Wi-Breath: Monitoring Sleep State with Wi-Fi Devices and Estimating Respiratory Rate . . . 839
Xin Yu, Xiaolong Yang, Mu Zhou, and Yong Wang

Hard Examples Mining for Adversarial Attacks . . . 843
Jiaxi Yang

Positioning Parameter Estimation Based on Reconstructed Channel Statement Information . . . 855
Xiaolong Wang, Xiaolong Yang, Mu Zhou, and Wei He

Three-Dimensional Parameter Estimation Algorithm Based on CSI . . . 859
Yuan She, Xiaolong Yang, Mu Zhou, and Wei Nie

Edge Cutting Analysis of Image Mapper for Snapshot Spectral Imager . . . 864
Xiaoming Ding, Yupeng Li, Xiaocheng Wang, and Cheng Wang

Object Detection of Remote Sensing Image Based on Multi-level Domain Adaption . . . 870
Peiran Kang, Xiaorui Ma, and Hongyu Wang
Research on Camera Sign-In System Based on SIFT Image Splicing Algorithm . . . 878
Feng Long Yan, Hai Bei Zhang, and Changyun Ge

Research on Performance Optimization Scheme for Web Front-End and Its Practice . . . 883
Fenglong Yan, Zhao Xu, Yu Shan Zhong, Zhang HaiBei, and Chang Yun Ge

Automatic Counting System of Red Blood Cells Based on Fourier Ptychographic Microscopy . . . 891
Shushan Wang, Tingfa Xu, Jizhou Zhang, Xin Wang, Yiwen Chen, and Jinhua Zhang

Implementation of Speech Recognition on PYNQ Platform . . . 899
Wei Sheng, Songyan Liu, Yi Sun, and Jie Cheng

Multi-target Tracking Based on YOLOv3 and Kalman Filter . . . 905
Xin Yin, Jian Wang, and Shidong Song

Load Balance Analysis-Based Retrieval Strategy in a Heterogeneous File System . . . 912
Di Lin, Weiwei Wu, and Yao Qin

Brain-Inspired Maritime Network Framework with USV . . . 927
Xin Sun, Tingting Yang, Kun Shi, and Huapeng Cao

Line-of-Sight Rate Estimation Based on Strong Tracking Cubature Kalman Filter for Strapdown Seeker . . . 936
Kaiwei Chen, Laitian Cao, Xuehui Shao, and Xiaomin Qiang

Malicious Behavior Catcher: An Intrusion Detection System Based on VAE . . . 944
Linna Fan, Jiahai Yang, and Tianjian Mi

Campus Physical Bullying Detection Based on Sensor Data and Pattern Recognition . . . 954
Bo Long, Jian Liu, Jun Wang, and Chenguang He

Speaker Recognition System Using Dynamic Time Warping Matching and Mel-Scale Frequency Cepstral Coefficients . . . 961
Yang Xue

MAN: A Multidimension Attention-Based Recurrent Neural Network for Stock Price Forecasting . . . 968
Weipeng Zhang

Driver Multi-function Safety Assistance System . . . 977
Wanqi Wang, Peidong Zhuang, and Shiwen Zhang
Edge Computing-Enabled Dynamic Multi-objective Optimization of Machining Parameters . . . 986
Zhibo Sui, Xiaoxia Li, Jianxing Liu, and Zhengqi Zeng

Improved Dark Channel Prior Algorithm Based on Wavelet Decomposition for Haze Removal in Dynamic Recognition . . . 997
Peiyang Song

A Novel Broadband Crossover with High Isolation on Microwave Multilayer PCB . . . 1007
Xu Yang, Jiancheng Liu, Meiying Wei, Xiaoming Li, Anping Li, and Xiaofei Zhang

A Recognition Algorithm Based on Region Growing . . . 1013
Luguang Wang, Yong Zhu, and Chuanbo Wang

Point Cloud Simplification Method Based on RANSAC and Geometric Filtering . . . 1020
Chuanbo Wang, Yong Zhu, and Luguang Wang

Compressive Autoencoders in Wireless Communications . . . 1028
Peijun Chen, Peng Lv, Hongfu Liu, Bin Li, Chenglin Zhao, and Xiang Wang

Activity Segmentation by Using Window Variance Comparison Method . . . 1036
Xinxing Tang, Xiaolong Yang, Mu Zhou, and Lingxia Li

Multi-carrier-Based Positional Modulation Design . . . 1041
Qingnuan Hu, Bo Zhang, Jinjin Zhang, Jin Gao, Yang Li, Xiaonan Zhao, Cuiping Zhang, and Cheng Wang

Directional Modulation Design with a Transmission Power Constraint . . . 1048
Jinjin Zhang, Bo Zhang, Jin Gao, Qingnuan Hu, Wei Liu, Yang Li, Xiaonan Zhao, Cuiping Zhang, and Cheng Wang

Circular Antenna Array-Based Directional Modulation Design . . . 1055
Jin Gao, Bo Zhang, Qingnuan Hu, Jinjin Zhang, Yang Li, Xiaonan Zhao, Cuiping Zhang, and Cheng Wang

An Improved Visual Inertial Odometry in Low-Texture Environment . . . 1062
Zi-Yuan Song, Chen-Yang Yan, and Fei Zhou

A Pole Extraction Algorithm Based on Sliding Windowed Matrix Pencil Method . . . 1073
Tianbao Zhang, Yang Zhou, Hongliang Gao, Zhian Deng, Chunjie Zhang, Jiawei Shi, and Jianguo Yu
xvi
Contents
A Recursive Algorithm for the Energy Minimization of the Data Packets Transmission with a Low Computational Complexity . . . . . . . 1080 Yunhai Du, Chao Meng, Zihan Zhang, and Wenjing Ding Speech Emotion Recognition Algorithm for School Bullying Detection Based on MFCC-PCA-SVM Classification . . . . . . . . . . . . . . 1088 Yuhao Wang, Xinsheng Wang, and Chenguang He 3D Anchor Generating Network: 3D Object Detection via Learned Anchor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1094 Huinan Li and Yongping Xie Handoff Decision Using Fuzzy Logic in Heterogeneous Wireless Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1102 Liwei Yang and Qi Zhang Campus Bullying Detection Based on Speech Emotion Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1108 Chenke Wang, Daning Zhang, and Liang Ye MTFCN: Multi-task Fully Convolutional Network for Cow Face Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1116 Ziyan Wang, Fuchuan Ni, and Na Yao Boosted Personalized Page Rank Propagation for Graph Neural Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1128 Wenwen Xia and Fucai Luo An Improvement Single Beacon Positioning Algorithm Using Sparse Extended Information Filter for AUV Localization . . . . . . . . . . . . . . . 1137 Wanlong Zhao, Huifeng Zhao, Jucheng Zhang, and Yunfeng Han An Algorithm Design for Fish Recognition Based on Computer Vision . . . . . . . . . . . . . . . . . . . . . . . . . . . 1146 Guangyao Chen and Yiran Zhou An Iterative Multi-channel DDPSK Receiver for Time-Varying Underwater Acoustic Communications . . . . . . . . . . . . . . . . . . . . . . . . . 1152 Zhihui Wu, Chao Gao, Feng Gao, Junhong Peng, and Jie Wu An Improved LC Visual Attention Detection Algorithm . . . . . . . . . . .
1161 ZhiYang Zhao, BaoJu Zhang, CuiPing Zhang, JiaZu Xie, Man Wang, and WenRui Yan Joint Source-Channel Coding Scheme Based on UEP-Raptor . . . . . . . 1168 Chang Liu, Wenchao Yang, Dezhi Li, Zhenyong Wang, Zhenbang Wang, and Haibo Lv Energy-Efficient UAV-Based Communication with Trajectory, Power, and Time Slot Allocation Optimization . . . . . . . . . . . . . . . . . . . 1176 Liping Deng, Hong Jiang, Jie Tian, and He Xiao
UAV-Assisted Wireless Communication Network Capacity Analysis and Deployment Decision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1185 Qiuyun Zhang, Hong Jiang, Qiumei Guo, Jie Tian, Fanrong Shi, Mushtaq Muhammad Umer, and Xinfan Yin An Edge Detection Algorithm Based on Fuzzy Adaptive Median Filtering and Bilateral Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1196 Man Wang, Bao Ju Zhang, Cui Ping Zhang, Jia Zu Xie, and Feng Juan Wang Fusion Dynamic Time Warping Algorithm for Hand Gesture Recognition in FMCW Radar System . . . . . . . . . . . . . . . . . . . . . . . . . 1205 Aihu Ren, Yong Wang, Mu Zhou, Xiaolong Yang, and Liangbo Xie Vital Signs Detection Using a FMCW Radar Sensor Based on the Discrete Wavelet Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 1210 Wen Wang, Yong Wang, Xiaobo Yang, Mu Zhou, and Liangbo Xie PHCount: Passive Human Number Counting Using WiFi . . . . . . . . . . 1214 Xi Chen, Zengshan Tian, Mu Zhou, Jianfei Yu, and Bin Luo Gait Cycle Detection Using Commercial WiFi Device . . . . . . . . . . . . . 1224 Gongzhui Zhang, Zengshan Tian, Mu Zhou, and Xi Chen Improved Spectral Angle Mapper Applications for Mangrove Classification Using SPOT5 Imagery . . . . . . . . . . . . . . . . . . . . . . . . . . 1232 Xiu Su, Xiang Wang, Derui Song, Jianhua Zhao, Jianchao Fan, and Zhengxian Yang A Novel Method for Direction of Arrival Estimation in Impulsive Noise Environments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1244 Li Li, Derui Song, Xiaofei Shi, and DeJun Zou Sea Utilization of Different Marine Industries in the Bohai Economic Rim . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1248 Jianli Zhang, Derui Song, Limei Qu, Hao Zhang, Jingping Xu, and Yujuan Ma Interference Analysis Between Satellite and 5G Network . . . . . . . . . . . 
1257 Xinxin Miao and Mingchuan Yang Analysis of Spatial Correlation for Wideband HAP-MIMO 3-D Channel Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1262 Bingyu Xie and Mingchuan Yang Channel Characteristics Analysis and Modeling of Deep Space Communication Link . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1268 Xin Guan and Mingchuan Yang
Design of Global Coverage Satellite Constellation Based on GEO and IGSO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1276 Yingzhe Dou and Xiaofeng Liu Markov Chain-Based Statistical Model of Three-State SIMO Land Mobile Satellite Channel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1283 Xinxin Miao and Xiaofeng Liu Network Coding Over Hybrid Satellite-Terrestrial Two-Way Relay Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1288 Xin Guan and Xiaofeng Liu Performance Analysis and FPGA Implementation of Layered Space–Time Code for Mobile Satellite Communication . . . . . . . . . . . . 1294 Bingyu Xie and Xiaofeng Liu Performance Analysis of CRDSA Based on M2M Flow Model . . . . . . 1300 Xiaoling Fu and Yanyong Su Performance Analysis of Satellite Internet of Things Access Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1306 Xiaoling Fu and Yanyong Su SCMA Multiple Access Technology in Satellite Internet of Things System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1312 Ruiran Liu and Yanyong Su Duplex Mode Switching for Underlay Cognitive Radio Systems Based on Outage Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1316 Ranran Zhou, Liang Han, and Yupeng Li A New Security-Oriented Multi-dimensional Assessment Method for Perception Layer of Electric Internet of Things . . . . . . . . . . . . . . . 1326 Yuxuan Yang, Jingtang Luo, Chenyang Li, Shujuan Sun, Yuanshuo Zheng, and Min Zhang The Security-Oriented Assessment Framework for Perception Layer of Electric Internet of Things . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1335 Jingtang Luo, Quan Tang, Shujuan Sun, Chenyang Li, Yuanshuo Zheng, and Min Zhang Optimizing Frequency Reuse in Multibeam Satellite Communication Systems . . . . . . . . . . . . . . . . . . . . . . 
. . . . . . . . . . . . . . 1344 Weizhong Zhang, Wenchao Yang, Zhenyong Wang, Dezhi Li, and Qing Guo A Critical Links Identification Method for Interdependent Cyber-Physical System of Smart Grid . . . . . . . . . . . . . . . . . . . . . . . . . 1350 Jingtang Luo, Hechun Zhu, Jian Zeng, Ganghua Lin, Jiuling Dong, and Min Zhang
An Estimation Algorithm of Power System Frequency in Impulse Noise Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1358 Shiying Yao, Yiming Chen, Jingtang Luo, Liangtao Duan, Jiuling Dong, and Min Zhang Research on Broadcasting of Beidou Differential Information Based on AIS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1366 Xianfeng Zhao, Yanjun Fang, Yang Zhang, and Chen Wang Analysis of Electromagnetic Shielding Model Applied in Wireless Energy Transmission System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1371 Xiu Zhang, Yang Li, Xin Zhang, and Ruiqing Xing Analysis of the Influence of Subarray Beamforming on Frequency-Hopping Communication System . . . . . . . . . . . . . . . . . . 1381 Zhengyu Zhang, Qinghua Wang, Yongqing Zou, and Xin Wang EEG Feature Selection Based on SFFS for Audiovisual-Induced Emotion Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1390 Yi Wang, Kun Chen, Yue He, Zhilei Li, and Li Ma Radio Resource Allocation for Centralized DU Architecture of 5G Radio Access Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1397 Mengbin Gao, Tiankui Zhang, Nan Zhao, Jing Li, and Liwei Yang Low-Energy and Area-Efficient Scheme with Dummy Capacitor Switching for SAR ADCs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1406 Dengke Yao, Liangbo Xie, Ruoyu Cao, and Mu Zhou SVDec: UAV Detection System Using Software-Defined Radio . . . . . . 1412 Yi Li, Wei Nie, Wei He, and Mu Zhou Recent Progress of ISAR Imaging Algorithms . . . . . . . . . . . . . . . . . . . 1418 Yong Wang, Yuhong Shu, Xiaobo Yang, Mu Zhou, and Zengshan Tian Research Status and Development of Battery Management System . . . 
1422 Panpan Liu, Changbo Lu, Changfu Wang, Xudong Wang, Wanli Xu, Youjie Zhou, and Hua Li Temperature Data Acquisition and Safety Characteristics Analysis of Lithium-Ion Battery’s Thermal Runaway . . . . . . . . . . . . . . . . . . . . 1430 Wanli Xu, Changfu Wang, Changbo Lu, Dongkai Ma, Xudong Wang, Panpan Liu, and Yaohui Wang Image Acquisition and Analysis of Lithium-Ion Battery’s Thermal Runaway . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1439 Wanli Xu, Changbo Lu, Changfu Wang, Yao Nie, Xudong Wang, Youjie Zhou, and Hua Li
Weak Light Characteristic Acquisition and Analysis of Thin-Film Solar Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1448 Wanli Xu, Changfu Wang, Changbo Lu, Hui Sun, Xudong Wang, Yanli Sun, and Litong Lv Research Status and Key Technologies of Long-Distance Laser Energy Transmission System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1457 Wanli Xu, Changfu Wang, Changbo Lu, Panpan Liu, Xudong Wang, Mengyi Wang, and Lei Xu Research Status of Unmanned System and Key Technologies of Energy Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1466 Wanli Xu, Changfu Wang, Changbo Lu, Mengyi Wang, Xudong Wang, Panpan Liu, and Weigui Zhou Analysis of the Effect of Coil Offset on the Efficiency of Magnetic Coupling Wireless Power Transfer System . . . . . . . . . . . . . . . . . . . . . . 1475 Wanli Xu, Hang Zhang, Changbo Lu, Shizhan Li, Xudong Wang, Panpan Liu, and Yaohui Wang Data Collection and Performance Analysis of Lithium-Titanate Battery Charging and Discharging at Low Temperature . . . . . . . . . . . 1483 Wanli Xu, Changbo Lu, Changfu Wang, Lijie Zhou, Xudong Wang, Youjie Zhou, and Hua Li Research on Feature Extraction of Piston Knocking Vibration Signal and Analysis of Correlation Degree of Wear . . . . . . . . . . . . . . . . . . . . 1492 Xudong Wang, Xiaolei Li, Feng Wang, Wanli Xu, Yanli Sun, Youjie Zhou, and Lei Xu Duplex Mode Selection for Underlay D2D Communications Based on Energy Efficiency Maximization . . . . . . . . . . . . . . . . . . . . . . . . . . . 1500 Xueqiang Ren, Liang Han, and Yupeng Li The Performance Investigation of Direct Detection Optical OFDM System with Different Modulation Formats . . . . . . . . . . . . . . . . . . . . . 1509 Guoqing Wang, Yupeng Li, Liang Han, and Xiaoming Ding Novel OFDM-Based Self-interference Channel Estimator for Digital Cancellation in Full-Duplex . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
. . . . 1516 Fei Wu and Liang Han Multiple Hypothesis Tracking with Mixed Integer Linear Programming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1525 Dingbao Xie and Zhaolei Liu An Adaptive Filtering Technique for Hypersonic Targets Based on Acceleration Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1532 Lei Gu, Jianguo Yu, Qiang Huang, and Zhenlai Xu
TDOA Estimation of a Single Wi-Fi Access Point Based on CSI . . . . . 1541 Ming Zhang, Qingzhong Li, Jianguo Yu, and Zhian Deng Numerical Analysis of the Ill-Posedness of Ground-Based 2D Radar Short-Arc Orbit Determination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1550 Zhengtao Zhang, Qiang Huang, and Jianguo Yu Simulation Study About the Radar Cross Section of a Typical Targets Based on FEKO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1558 Feng Wang, Pengyuan Liu, and Zhonglin Wei Detection of Malicious Nodes in the uav Network . . . . . . . . . . . . . . . . 1566 Jia Chen, Yanzhi Zhu, and Shaofan Zhu A UAV-Swarm Control Platform Architecture Based on Cloud . . . . . . 1573 Li Zeng, Zesheng Zhu, Xuzhou Shi, and Yulei Liu A Survey: Development and Application of Behavior Trees . . . . . . . . . 1581 Wang Zijie, Wang Tongyu, and Gao Hang A Survey on Security and Privacy in Spatial Crowdsouring . . . . . . . . 1590 Mengting Shi, Mengqi Li, and Yuping Zhang Application of Machine Learning in Space–Air–Ground Integrated Network Data Link . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1596 Shaofan Zhu, Shuning Wang, and Jia Chen Survey on UAV Coverage Path Planning Problem . . . . . . . . . . . . . . . . 1601 Jiankang Xu, Xuzhou Shi, Zesheng Zhu, and Hang Gao Research on Multi-sensor Fusion Algorithm Based on Statistical Track Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1608 Jinliang Dong, Lei Bian, Yumeng Zhang, and Huifang Dong Research on the Improved EMC Design of Vehicle-Borne Rotating Phased Array Radar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1616 Gan Wang, Jinliang Dong, Yumeng Zhang, and Huifang Dong RBF Neural Network-Based Temperature Error Compensation for Fiber Optic Gyroscopes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
1627 Wei Cai, Jianyang Wang, Wenhui Hao, Yanguo Zhou, and Yiming Liu A Multi-Beam Forward Link Precoding Algorithm for Dirty Paper Zero-Forcing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1636 Yumeng Zhang, Jinliang Dong, Lei Bian, and Gan Wang UAV Detection and Recognition Based on RF Fingerprint . . . . . . . . . . 1647 Zhi Chao Han, Wei Nie, Mu Zhou, and Qing Jiang
Soil Humidity Classification Based on Confident Learning via UWB Radar Echoes with Noisy Labels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1653 Chenghao Yang and Jing Liang Laser Point De-Fuzzy Method Based on Stitched Background Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1662 Wenrui Yan, Baoju Zhang, Chuyi Chen, Jingqi Fei, Cuiping Zhang, and Zhiyang Zhao Z-NetMF: A Biased Embedding Method Based on Matrix Factorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1669 Yuchen Sun, Liangtian Wan, Lu Sun, and Xianpeng Wang Hyperspectral Band Selection Based on Improved K-Means Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1677 Yulei Wang, Qingyu Zhu, Yao Shi, Meiping Song, Haoyang Yu, and Derui Song Joint Classification of Multispectral Image and SAR Image Based on Deep Feature Fusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1682 Beimin Xie, Xinrong Wang, and Peiran Kang Advances of Power Supply Technology for Unmanned Aerial Vehicle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1691 Wanli Xu, Changbo Lu, Youjie Zhou, Xuhui Wang, Weigui Zhou, Mengyi Wang, and Lei Xu Design of Fuel Cell UAV Power System . . . . . . . . . . . . . . . . . . . . . . . . 1699 Wanli Xu, Changbo Lu, Youjie Zhou, Xuhui Wang, Hua Li, Lei Xu, and Mengjie Hao Overview on Micro-grid Technology Research . . . . . . . . . . . . . . . . . . . 1707 Wanli Xu, Changfu Wang, Xuhui Wang, Shushuai Zhou, Youjie Zhou, Hua Li, Weigui Zhou, and Lei Xu Performance Test of Mini Solid Oxide Fuel Cell . . . . . . . . . . . . . . . . . 1714 Youjie Zhou, Changfu Wang, Wanli Xu, Shushuai Zhou, Xiangjing Mu, Mengyi Wang, and Litong Lv Development of Hydrogen Fuel Cell Technology and Prospect for Its Military Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
1722 Youjie Zhou, Changbo Lu, Jian Cheng, Wanli Xu, Yanli Sun, Hua Li, and Lei Xu A Novel Location-Based and Bandwidth-Aware Routing Algorithm for Wireless Ad Hoc Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1731 Chunguang Shi, Wenjin Hao, Quan Yin, Bo Zhou, and Kui Du
Real-Time Measurement of Carrier Frequency Based on FFT and ANF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1736 Wenzhao Li, Weihua Dai, Shen Zhao, and XiWei Guo Algorithm Research on Improving Halo Defects Based on Guided Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1743 Yan Li, Tiantian Guan, Jingwen He, and Zichen Cheng The Smart Grid Vulnerability Analysis and Chain Faults Suppression Under Interdependent Coupling Relationship . . . . . . . . . . 1752 Jingtang Luo, Shiying Yao, Jiamin Zhang, Yiming Chen, and Min Zhang A Coastline Detection Algorithm with ACM Driven by Diffusion Coefficient for SAR Imagery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1760 Xiaofei Shi, Xu Zhang, Derui Song, and DeJun Zou Coastline Detection with a Spatial Second-Order Correlation Statistic for SAR Imagery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1764 Xiaofei Shi, Xu Zhang, Derui Song, and DeJun Zou A Survey on the Entity Linking in Knowledge Graph . . . . . . . . . . . . . 1768 Jingjing Du and Bo Ning A Multi-focus Image Fusion Method Based on Cascade CNN Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1774 Zhang HeXuan and Tong Ying Research on the Effects of Emotional Intervention in Online Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1782 Simbarashe Tembo and Jin Chen Study on Data Fusion Processing Algorithm of Marine Sensor Based on Information Entropy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1790 Hao Gao, Lin Cao, and Lei Yang Study on Typical Nonlinear System Control Strategies Based on Energy Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
1798 Hao Gao, Lin Cao, and Yulong Cai Relay Selection based on Multiple-Attribute Decision Making for Underwater Acoustic Cooperative Communication . . . . . . . . . . . . . 1807 Miao Ke and Zhiyong Liu A Chinese Knowledge Graph for Cardiovascular Disease . . . . . . . . . . 1816 Xiaonan Li, Kai Zhang, Guanyu Li, and Bin Zhu Study and Validation on a Novel Multi-dimensional Marine Electromagnetic Prospecting Nonlinear Inversion Algorithm . . . . . . . . 1827 Hao Gao, Lin Cao, and Guangyuan Chen
Study on the Noise Analysis of Weak Photoelectric Signal Detection in the Marine Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1835 Lin Cao, Hao Gao, and Xuejun Xiong A Brief Survey of Graph Ranking Methods . . . . . . . . . . . . . . . . . . . . . 1844 Mengmeng Guan and Bo Ning Power Control for Two-Way DF Relay-Aided Underlaid D2D Communications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1849 Ning Liang, Liang Han, and Yupeng Li A Survey on Conversational Question-Answering Systems . . . . . . . . . . 1857 Deji Zhao and Bo Ning Stable Time Transfer Over 120 km Optical Fiber with High-Precision Delay Variation Measurement . . . . . . . . . . . . . . . 1862 Xiaocheng Wang, Xiaoming Ding, Yupeng Li, and Cheng Wang Constellation Design and Application of Real-Time Space-Based Information Services Supporting Communication, Navigation and Remote Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1869 Wang Liyun and Meng Jing Path Index-Enhanced Incremental Subgraph Matching Algorithm for Dynamic Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1879 Yunhao Sun, Guanyu Li, Bo Ning, and Bing Han Obstacle Detection and Recognition Using Stereo Vision and Radar Data Alignment for USV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1888 Lin Cao, Hailin Liu, Hao Gao, and Hui Li Research on New Progress and Key Technology of Space TT & C and Data Transmission System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1896 Xiyuan Li and Jing Meng The Radar Echo Extrapolation Based on ConvLSTM . . . . . . . . . . . . . 1903 Zhaoping Sun, Can Lai, Xia Chen, and Haijiang Wang Cross-Age Face Recognition Using Deep Learning Model Based on Dual Attention Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
1911 Jialve Wang, Shenghong Li, and Fucai Luo Research on Classification of Alzheimer’s Disease Based on Multi-scale Features and Sequence Learning . . . . . . . . . . . . . . . . . 1920 Sen Han, Lin Wang, and Derui Song A Behavior Predictive Control Mechanism Based on User Behavior Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1924 Mengkun Li, Yongjian Wang, and Caiqiu Zhou
Unmanned Aerial Vehicle (UAV) Networking for Ocean Monitoring: Architectures and Key Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . 1928 Bin Lin, Yajing Zhang, Xu Hu, and Jianli Duan MEMS Mirror LIDAR System and Echo Signal Processing . . . . . . . . 1932 Tao Liu, Dingkang Wang, Ruowei Mei, and Xinlin Gou Research on Indexing and KNN Query of Moving Objects in Road Network Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1944 Wei Jiang, Guanyu Li, Jingmin An, Yunhao Sun, Heng Chen, and Xiaonan Li Design and Implementation of a Power Module with Overcurrent Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1951 Peng Chen, Qingchun Wu, Yuejun Shi, Qian Zheng, and Sen Yang The Convolutional Neural Network Used to Classify COVID-19 X-ray Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1958 Qiu Yiwen, Wang Yan, Song Derui, Lv Ziye, and Shi Xiaofei Speech Emotion Recognition Based on Transfer Learning of Spectrogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1963 Lv Ziye, Wang Yan, Song Derui, Qiu Yiwen, and Shi Xiaofei A Novel Kalman Filter Algorithm Using Stance Detection for an Inertial Navigation System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1968 Zhijian Shi, Ruochen Feng, Rui Lin, and Gareth Peter Lewis A Multi-Strategy Batch Mode Active Learning Algorithm for Image Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1977 Sen Han, Lin Wang, and Derui Song Few Shot Object Detection via Training Image Generation . . . . . . . . . 1981 Deyuan Zhang, Yixin Zhang, and Junyuan Wang Research of Simulation Method for UAV Fault Generation Facing Emergency Operation Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
1987 Sen Yang, Leiping Xi, Hairui Dong, Xuejiang Dang, and Peng Chen Identification of Precipitation Clouds Based on Faster-RCNN Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1993 Yuanbo Ran, Li Tian, Haijiang Wang, Jiang Wu, and Tao Xiang MIF: Toward Semantic-Aware Representation for Video Retrieval . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2001 Bo Lu and Xiaodong Duan Wireless Channel Models for Maritime Communications . . . . . . . . . . . 2010 Bin Lin, Jiaye Li, Zhen Wang, and Tiancheng Kang
Architecture Design of 5G and Virtual Reality-Based Distributed Simulated Training Platform for Ship Pilots . . . . . . . . . . . . . . . . . . . . 2014 Bin Lin, Linan Feng, Hongyi Xu, and Dewei Wang Design of Verification System on Robot Trajectory Interpolation Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2018 Ti Han Indoor Motion Status Identification Algorithm Based on Decision Tree Model for FM Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2024 Shuai Wang, Ying Jin, and Xuemei Wang Localization Algorithm Based on Fingerprint Model for FM Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2030 Xuemei Wang, Yonglin Wu, and Wei-cheng Xue The Design and Development of FM Localization Algorithm Based on KNN Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2036 Xuemei Wang, Beiqi Song, and Wei-cheng Xue Design and Development of Data Acquisition Module of Intelligent Meter Based on LoRa . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2043 Le Wang and Lei Li Design and Development of Temperature Processing Module in Intelligent Terminal of Internet of Things . . . . . . . . . . . . . . . . . . . . 2050 Le Wang and Yuxiang Du Indoor Motion Status Identification Algorithm Based on SVM Model for FM Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2057 Qun Liu, Yonglin Wu, and Wei-cheng Xue Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2063
Optimal Quality of Information Service in an Electronic Commerce Website Weiwei Wu and Di Lin(B) University of Electronic Science and Technology of China, Chengdu, China [email protected]
Abstract. Multiple sellers can post information about their goods through an online electronic commerce website, e.g., eBay, AliExpress, or Amazon. The information posted by a seller can attract users who are originally interested in the goods of other sellers, and thus it may reduce the influence of the information posted by those sellers, e.g., discourage customers from purchasing goods from them, especially when the sellers are competitors. Considering two priority classes of sellers, we develop an admission control algorithm that temporarily removes low-priority sellers when a few high-priority sellers complain about the confusion caused by the others. We prove that the proposed algorithm removes the minimal number of sellers, subject to the constraint that all high-priority sellers experience an acceptable level of confusion. We also investigate the admission control algorithm under a few more general settings, including the multiple-priority setting as well as the setting in which not all high-priority sellers can attain an acceptable level of confusion even after all low-priority sellers are removed. Finally, for several typical social network structures, we show that the proposed admission control algorithm can efficiently mitigate the confusion experienced by high-priority sellers.
Keywords: Social network · Priority of sellers · Admission control · Iterative algorithm

1 Introduction
In the era of electronic commerce, sellers can employ websites to advertise their goods, such as posting goods information on eBay or AliExpress. Sellers need to register as paying members to use a website, and expect a high quality of information service from it. However, a website carries information posted by multiple users (i.e., the sellers in an electronic commerce website), and the information from other sellers (sometimes the competitors) may draw away a seller's potential customers. Thus, the information posted by the other sellers
c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_1
may lead a seller to feel dissatisfied, since he/she does not benefit from the website's information service [1], and this seller may be discouraged from using the electronic commerce website any longer.

From a methodological perspective, studies on removing confusion in a social network fall primarily into two categories: qualitative analysis and quantitative modeling. The first category centers on proposing qualitative techniques for dealing with confusion, e.g., [2]: the large amount of information confusion present in social networks can be used to extract and represent web pages, and qualitative techniques for treating the confusion, e.g., improving the clarity of goals, are studied; only about 10% of the information is really useful and the rest is confusion [3]. The second category focuses on quantitatively modeling information confusion: the confusion in social networks is analyzed statistically with respect to its impact on individual performance, and the impact of confusion on useful information is formulated as the Information-to-Confusion Noise Ratio (ICNR). However, none of the above papers presents a scheme for mitigating confusion so as to improve the level of ICNR.

The main contributions of our work are as follows: (1) it is the first to investigate optimal admission control strategies that ensure high-priority users experience an acceptable level of confusion; (2) it proposes an iterative admission control algorithm to suppress confusion, and addresses its applications in actual social network contexts; (3) it shows that networks in which users place trust in only a few information providers suffer less information confusion than those in which users trust all information providers equally.

The rest of the paper is organized as follows.
Section 2 summarizes the formulation of ICNR proposed in [4], characterizes the problem of admission control in a social network, and presents the admission control algorithm; Sect. 2.2 analyzes the performance of the admission control algorithm, and Sect. 2.3 presents a few practical considerations in using it. Numerical results are presented in Sect. 3 and conclusions are drawn in Sect. 4.
2 Problem of Admission Control

In this section, we first summarize the mathematical formulation of ICNR, which was proposed in [4] and is presented here for completeness. Then, we characterize the problem of admission control.

2.1 Confusion of Information
Consider a social network (e.g., Facebook or Twitter) with N information users who receive information from others and, in turn, provide information to others. In other words, within the N-node social network, each node (person) is both an information provider and an information user. Information provider $i$ ($i = 1, 2, \ldots, N$) transmits information with an intensity of $I_i$. Intuitively, $I_i$ could indicate (but is not limited to) any of the following: (1) the authenticity of the information (e.g., by referring to a certified document); (2) a personal relation (e.g., friendship) or a professional relation (e.g., supervisor) with the information user; (3) advertisement or propaganda.

For any information user $i$, we denote his/her primary information provider as provider $i$. The useful information received by user $i$ from provider $i$ is written as $I_i t_{ii}/R_i$, where $t_{ii}$ ($0 \le t_{ii} \le 1$) represents the trust placed by user $i$ on provider $i$ and $R_i$ represents the rate at which provider $i$ sends information. When user $i$ receives information from multiple providers, contradictions arise between the information from the primary provider and that from the other providers. Let $v_{ji}$ ($0 \le v_{ji} \le 1$) denote the amount of contradiction between the information from provider $i$ (the primary provider) and provider $j$; then the amount of confusion at user $i$ due to the presence of multiple information providers can be written as $\sum_{j \neq i} I_j t_{ji} v_{ji}$, where $I_j$ denotes the information intensity of provider $j$ and $t_{ji}$ denotes the trust placed by user $i$ on provider $j$. Apart from the confusion due to multiple providers, another source of confusion is the natural level of uncertainty about the received information when the user processes it. Let $N_i$ denote the natural confusion experienced by user $i$; then the total amount of confusion is $\sum_{j \neq i} I_j t_{ji} v_{ji} + N_i$.

For information user $i$, the Information-to-Confusion-Noise Ratio (ICNR) is defined as the quality of the useful information received by an information user in the presence of confusion. Based on the aforementioned analysis, the ICNR of user $i$, $\theta_i$, can be formulated as [4]

$$\theta_i = \frac{I_i t_{ii}/R_i}{\sum_{j \neq i} I_j t_{ji} v_{ji} + N_i} \qquad (1)$$
Preliminaries of Admission Control Algorithms
In the following, we first provide a few concepts for the admission control algorithms.

Definition 1. Given an acceptable ICNR level θ̂i and transmission intensity I, we define the supported-user set as S(I) = {i | θi ≥ θ̂i}.

Remark 1. Denote XH and XL as the set of high-priority users and the set of low-priority users, respectively. By Definition 1, the set of supported high-priority users is SH(I) = {i | θi(I) ≥ θ̂i and i ∈ XH}, and the set of supported low-priority users is SL(I) = {i | θi(I) ≥ θ̂i and i ∈ XL}.
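The ICNR of Eq. (1) and the supported-user sets of Definition 1 and Remark 1 can be sketched in code. The function names and the NumPy representation below are illustrative, not from the paper:

```python
import numpy as np

def icnr(I, t, v, R, N):
    """Information-to-Confusion-Noise Ratio (Eq. 1) for every user.

    I: intensities, shape (n,); t[j, i]: trust placed by user i in provider j;
    v[j, i]: contradiction between provider i's and provider j's information;
    R: sending rates, shape (n,); N: natural confusion, shape (n,).
    """
    n = len(I)
    theta = np.empty(n)
    for i in range(n):
        useful = I[i] * t[i, i] / R[i]
        confusion = sum(I[j] * t[j, i] * v[j, i] for j in range(n) if j != i) + N[i]
        theta[i] = useful / confusion
    return theta

def supported_sets(theta, theta_hat, high_priority):
    """Supported-user set S(I) = {i : theta_i >= theta_hat_i}, split into
    the high-priority part S_H and the low-priority part S_L (Remark 1)."""
    S = {i for i in range(len(theta)) if theta[i] >= theta_hat[i]}
    return S & high_priority, S - high_priority
```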
W. Wu and D. Lin
Definition 2. Denote X as the set of all users. S̄H(I) = X − SH(I) and S̄L(I) = X − SL(I) denote the set of unsupported high-priority users and the set of unsupported low-priority users, respectively. Then, the outage ratios of high-priority and low-priority users are defined as

OH(I) = |S̄H(I)| / |X|   and   OL(I) = |S̄L(I)| / |X|

where | · | denotes the cardinality of a user set. In this paper, we characterize the optimization of information service as the following problem:

min_{0 ≤ I ≤ Ī} OL(I)   subject to   OH(I) = 0    (2)

where Ī denotes an upper bound on the information intensity I.

Remark 2. Problem (2) guarantees that all high-priority users are supported while the number of unsupported low-priority users is minimized.

2.3 Algorithm of Admission Control
To the best of our knowledge, no prior literature has studied admission control algorithms in social networks. We therefore build on the admission control algorithms used in telecommunication networks, although those algorithms cannot be employed directly in social networks. Generally, admission control algorithms in telecommunication networks fall into two categories: (1) performance-tracking-based (TPC) algorithms and (2) temporary-removal-based (DFC) algorithms [5]. In TPC algorithms, the amount of interference from the other users determines each user's transmission power strategy, and each user's transmission power increases as the interference rises. Under TPC, a few users still cannot meet their transmission requirements even when they raise their transmission power to the maximum. To overcome this drawback, researchers have proposed DFC algorithms. Though these algorithms differ in detail, their common feature is that, instead of increasing their transmit power, a few users temporarily disconnect from the network when the interference they experience is high. However, we cannot employ a DFC algorithm in our scenario of priority-based admission control in a social network, since existing DFC algorithms are not designed for users with different priorities: DFC may remove a high-priority user if his/her interference level is high. Given the respective drawbacks of TPC and DFC algorithms, we establish a new admission control algorithm for priority-based users in a social network that borrows ideas from both: TPC is employed by high-priority users, while DFC is employed by low-priority users. The detailed procedure is summarized in Algorithm 1.
Algorithm 1: Admission control of users in M priorities (M > 2; X1, · · ·, XM denote the sets of users in the different priorities, ranked from highest to lowest)
Step 1: Merge the sets of users in priorities X1, · · ·, XM−1 into one set XH, i.e., XH = X1 ∪ · · · ∪ XM−1, and set XL = XM.
Step 2: Build up problem (2) and solve it.
Step 3: If a solution to problem (2) is found, turn to Step 4. Otherwise, remove all users in XL, recombine the sets of users as XH = X1 ∪ · · · ∪ Xi−1, XL = Xi (so that XH is composed of the users in i (i ≤ M) different priorities), and turn to Step 2; if no lower-priority set remains to remove, turn to Step 5.
Step 4: Return the available solution to problem (2).
Step 5: Return 'No feasible solution'.
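The merge-and-retry loop of Algorithm 1 can be sketched as follows. Here `solve` is a hypothetical stand-in for a solver of problem (2); its interface is an assumption for illustration:

```python
def priority_admission_control(priority_sets, solve):
    """Sketch of Algorithm 1 for M > 2 priority classes.

    priority_sets: list [X_1, ..., X_M], ranked from highest to lowest
    priority.  solve(X_H, X_L) stands in for a solver of problem (2);
    it returns an intensity assignment, or None when infeasible.
    """
    i = len(priority_sets)            # start with X_H = X_1 U ... U X_{M-1}
    while i >= 2:
        X_H = set().union(*priority_sets[:i - 1])
        X_L = set(priority_sets[i - 1])
        solution = solve(X_H, X_L)    # Step 2: build and solve problem (2)
        if solution is not None:      # Step 4: feasible, return it
            return solution
        i -= 1                        # Step 3: drop lowest class, merge down
    return None                      # Step 5: no feasible solution
```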
3 Simulation Results
We consider the Amazon product co-purchasing network, which is composed of 403,394 sellers and 3,387,388 transactions [5]. We partition the sellers into 5 priorities by clustering on the number of transactions they completed, and employ the data to measure the QoIS.

3.1 Convergence of Proposed Algorithm
In the following, we address the convergence of Algorithm 1 in various random networks. In the simulation, we set the target ICNR to 0 dB and Ri = 1 for every user, and tij is uniformly distributed in [0, 1]. We then discuss the rate of convergence to the target ICNR in various social networks. Figure 1 presents the convergence rate of our algorithm in these random networks. We can observe from Fig. 1 that Algorithm 1 converges quickly in networks with highly concentrated trust (e.g., the exponential network), while it converges slowly in networks with evenly distributed trust (e.g., the Erdos-Renyi network). Specifically, Algorithm 1 reaches the solution after 7000 iterations in the exponential network, but only after 12,000 iterations in the Erdos-Renyi network. This is because users in the Erdos-Renyi network place equal amounts of trust in their information providers and can easily be influenced by providers other than the primary one; an information user has to combine the information from multiple sources before he/she can determine his/her actions. In the exponential network, by contrast, users trust only a few providers and therefore suffer a small amount of confusion when determining their actions.
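The paper does not specify its network generators, so the following is only a rough sketch of two of the three topologies (Erdos-Renyi and preferential attachment) with uniform trust assigned to edges, as in the simulation setup; the generator names and parameters are assumptions:

```python
import random

random.seed(0)

def erdos_renyi(n, p):
    """Each of the n(n-1)/2 possible edges appears independently with probability p,
    giving the evenly distributed connectivity discussed above."""
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if random.random() < p]

def preferential_attachment(n, m):
    """Barabasi-Albert style growth: each new node links to m existing nodes
    chosen proportionally to their current degree, yielding a concentrated
    (scale-free) structure."""
    edges = []
    weighted = list(range(m))             # node pool, repeated by degree
    for new in range(m, n):
        targets = set()
        while len(targets) < m:
            targets.add(random.choice(weighted))
        for t in targets:
            edges.append((new, t))
            weighted.extend((new, t))
    return edges

# uniform trust t_ij on each edge, as in the simulation setup
trust = {e: random.uniform(0, 1) for e in preferential_attachment(100, 3)}
```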
Fig. 1 Rate of convergence under various networks (blue: exponential network; red: preferential attachment (scale-free) network; dark with 'o': Erdos-Renyi network)
Fig. 2 Outage ratio of high-priority users (blue: exponential network; red: preferential attachment (scale-free) network; dark with 'o': Erdos-Renyi network; solid line: Algorithm 1; dotted line: DFC algorithm)
3.2 Outage Ratio of Users
In the simulation, setting Ri = 1 and hij uniformly distributed in [0, 1], we investigate the outage ratio of users in different social networks. Figure 2 shows the outage ratio of high-priority users for various target ICNRs (dB) in various networks (Algorithm 1 vs. the DFC algorithm [5]), and Figure 3 shows the corresponding outage ratio of low-priority users.

Fig. 3 Outage ratio of low-priority users (blue: exponential network; red: preferential attachment (scale-free) network; dark with 'o': Erdos-Renyi network; solid line: Algorithm 1; dotted line: DFC algorithm)

We can observe from Fig. 2 that Algorithm 1 always ensures that no users in the highest priority are removed in any of the networks, whereas this zero outage of high-priority users cannot be guaranteed by the DFC algorithm: the outage of high-priority users can reach 50% when the target ICNR is set to 4 dB. Figure 3 illustrates that the DFC algorithm achieves a much lower outage ratio for low-priority users than Algorithm 1. This is because the DFC algorithm does not restrict removals to low-priority users, and thus it cannot guarantee zero outage among high-priority users. To guarantee zero outage for high-priority users, the removed users must be selected from among the low-priority users, which is what Algorithm 1 ensures. Figures 2 and 3 also show that the exponential network achieves a lower outage ratio than the Erdos-Renyi network. In an exponential network, information users depend primarily on one information provider, while in an Erdos-Renyi network users depend equally on several information providers; one or a few primary information providers in a network with highly concentrated trust (e.g., an exponential network) lead to a low level of confusion. Note that placing trust in only a few information providers does not
indicate that users will receive correct information. It only indicates that users suffer a smaller amount of confusion when they trust only a few information providers.
4 Conclusion
We proposed an admission control algorithm to mitigate information confusion at high-priority users in social networks with multiple information providers. The key findings are:
• Our proposed admission control algorithm dramatically reduces the outage ratio of users compared with the DFC algorithm, the most widely used admission control algorithm in the non-priority scenario.
• In networks with highly concentrated trust, the priority-based admission control algorithm converges to the solution faster than in networks in which users trust information providers equally.
• Networks with highly concentrated trust achieve a lower outage ratio (i.e., suffer less information confusion) than networks in which users trust information providers equally.
References
1. Jin L, Chen Y, Wang T, Hui P, Vasilakos AV (2013) Understanding user behavior in online social networks: a survey. IEEE Commun Mag 51(9):144–150
2. Adams P (2010) Social networking. The Noisy Channel, July 2010. [Online]. Available: http://thenoisychannel.com/2010/07/08/pauladamsspresentation-on-socialnetworking
3. Olson M (2009) Social media: improve your signal-to-noise ratio. Biznik article on brand development. [Online]. Available: http://biznik.com/articles/social-mediaimprove-your-signalto-noise-ratio
4. Anand S, Subbalakshmi KP, Chandramouli R (2013) A quantitative model and analysis of information confusion in social networks. IEEE Trans Multimedia 15(1):207–223
5. Ngo D, Le LB, Ngoc TL (2012) Distributed Pareto-optimal power control for utility maximization in femtocell networks. IEEE Trans Wireless Commun 11(10):3434–3446
Weight Forests-Based Learning Algorithm for Small-Scaled Data Processing Jiafu Ren, Di Lin(B) , and Weiwei Wu University of Electronic Science and Technology of China, No. 4, North Jianshe Road, Chengdu, People’s Republic of China [email protected]
Abstract. This paper proposes an ensemble learning algorithm that assigns a weight to each tree's prediction in each category; the trees are generated from specific features to establish weight forests. The weights of the forest consist of a weight matrix and fully connected weights, and two types of weight-updating methods are employed. On CICD2017, the weight forests method outperforms the random forests method in generalization, and it achieves better performance by using incremental learning for the weight update. We also propose an incremental tree algorithm based on weight forests to handle the drift problem; the occurrence of concept drift can be detected with high accuracy through the coefficients of the incremental trees.

Keywords: Weight forests · Generalization performance · Incremental learning · Concept drift · Incremental tree

1 Introduction
Weight forests focus on diversity and the combination strategy: each individual learner is a CART [1] tree generated from one feature to increase diversity, and two types of weights implement the combination strategy. Random forests [2] increase learner diversity through sample and attribute perturbations. Rotation forests [3] pay attention to both the diversity and the accuracy of individual learners. Biau [4] maintains that random forests are among the most accurate general-purpose learning techniques. This paper describes the design and implementation of weight forests and compares their generalization performance with random forests on the same data set. In traditional machine learning, it is generally assumed that enough training samples are available in advance; once the training samples are used up, training stops. In practical applications, however, training samples are often not given in advance, and training examples appear over
c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_2
time. Following Guo et al. [5], Zang et al. [6], and Ruping [7], incremental learning has the following characteristics: (i) the concept of the target does not change over time, i.e., a sample's information cannot be judged from its arrival time; (ii) when a new sample arrives, the model can be adjusted based on that sample; samples cannot all be stored and are discarded after training; (iii) the model tries to retain the information of the used samples while learning new information from the new sample. This paper presents a method for incremental learning with weight forests. When the concept of a target changes over time, this is called concept drift. A model that handles concept drift must not only adjust itself according to the new sample but also discard information unrelated to the new knowledge. Incremental learning and concept drift do not require strict boundaries in practical applications. An incremental tree algorithm is proposed to cope with the concept drift phenomenon.
2 Weight Forests
Weight forests assign a weight to each tree's prediction for each category; the model is shown in Fig. 1.
Fig. 1 Weight forests model
2.1 Node
Each node contains multiple decision trees and a weight matrix. Features are randomly selected, and at least one tree is created for each feature. The training samples used to build each tree are drawn by sampling with replacement. The trees are divided among the nodes, keeping the number of trees in each node the same. At the leaf nodes of a tree, the number of categories of the data set used to build the tree is considered, and the output of the tree is converted into a vector, called the tree output vector. Each tree is given a weight for each category's predicted value. The matrix combining the weights of all the trees on a node is called the weight matrix W, and the matrix combining the output vectors of the trees on a node is called the class matrix P.

Algorithm 1 Update weight matrix W, output category vector S
Input: Ti: all trees of the ith node; W: ith weight matrix; x: training sample
Output: class vector S = [s1, s2, ..., sc]
1: Calculate the class matrix P from the sample and Ti;
2: Normalize the class matrix: pij = pij / Σj pij, ∀i, j;
3: Update the weight matrix: wik = wik + η(pik − wik), ∀i, where η is the learning rate and k is the column index corresponding to x's category;
4: Calculate the category vector S: sj = w1j p1j + ... + wdj pdj, ∀j; then normalize sj = sj / Σj sj, ∀j;
5: return S
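A minimal NumPy sketch of the node update in Algorithm 1, assuming P stores one row per tree and one column per category (function and parameter names are illustrative):

```python
import numpy as np

def update_weights(P, W, x_class, eta=0.1):
    """Node update sketch (Algorithm 1): P is the d x c class matrix with
    one row per tree, W the d x c weight matrix, x_class the column index
    of the training sample's category.  W is updated in place."""
    P = P / P.sum(axis=1, keepdims=True)        # step 2: normalize each tree's output
    W[:, x_class] += eta * (P[:, x_class] - W[:, x_class])   # step 3
    S = (W * P).sum(axis=0)                     # step 4: s_j = sum_d w_dj p_dj
    return W, S / S.sum()                       # normalized category vector
```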
2.2 Fully Connected
The activation function for weight forests is given in formula (1); its input and output are both vectors:

vi = { ui e^(|ui|+1),   |ui| ≥ m/c
     { [0, ..., 0],     |ui| < m/c    (1)

Loss function: the output is passed through the softmax function to obtain the probability distribution of the output classification, which is then compared with the sample label via the cross-entropy:

softmax(yi) = e^(yi) / Σi e^(yi)    (2)

H(p, q) = − Σx p(x) log q(x)    (3)

where p(x) is the real result and q(x) is the prediction result. The entire weight forests algorithm is depicted in Algorithm 2.
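Formulas (1)-(3) can be sketched as follows; treating |ui| as the L1 norm of the vector is an assumption, since the paper leaves the norm unspecified, and the function names are illustrative:

```python
import numpy as np

def activation(U, m, c):
    """Sketch of Eq. (1): U is a list of the c node-output vectors u_i.
    Vectors with small norm are zeroed, the others are amplified."""
    V = []
    for u in U:
        norm = np.abs(u).sum()          # assumed reading of |u_i|
        V.append(u * np.exp(norm + 1) if norm >= m / c else np.zeros_like(u))
    return V

def softmax(y):
    """Eq. (2), shifted by max(y) for numerical stability."""
    e = np.exp(y - y.max())
    return e / e.sum()

def cross_entropy(p, q):
    """Eq. (3): p is the true distribution, q the predicted one."""
    return -np.sum(p * np.log(q))
```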
Algorithm 2 Weight Forests
Input: training samples D = [D1, D2, ..., Dz]; target training accuracy a
Output: weight forests model
1: Randomly select m·d features and build a tree for each feature, selecting each feature at least once;
2: Initialize m weight matrices, with each element initialized to 0;
3: Build a fully connected network with m·c input nodes and c output nodes;
4: repeat
5:   for each Di ∈ D do
6:     Update the weight matrices and output the category vectors [S^1, S^2, ..., S^m], S^i = [s^i_1, s^i_2, ..., s^i_c] (Algorithm 1);
7:     [u1, u2, ..., uc] ← [S^1, S^2, ..., S^m], ui = [s^1_i, s^2_i, ..., s^m_i], ∀i;
8:     vi = ui e^(|ui|+1) if |ui| ≥ m/c, else [0, ..., 0], ∀i;
9:     Input [v1, v2, ..., vc] to the fully connected network and update its weights by gradient descent;
10:   end for
11:   Calculate the accuracy of the current model on D;
12: until the accuracy on each category is greater than a
3 Incremental Learning

3.1 Weight Update
The decision boundaries formed by a decision tree are axis-parallel, and the boundaries formed by a tree built from a single feature are inaccurate (as can be seen from the effective tree coefficients in Sect. 4.3). On this basis, weight forests use the weighting method to form a decision boundary with strong generalization performance. Consequently, updating only the weights in the weight forests can change the decision boundary, which allows the weight forests to learn incrementally.

3.2 Incremental Tree
Each tree of the weight forests is built from one feature. Each leaf node of a tree represents a partition together with the category information saved in that partition. The incremental tree reconstructs the partitions, making them smaller and more precise, and thereby copes with the concept drift phenomenon. Different from the trees in weight forests, the incremental tree saves sample information in its leaf nodes, in what is called the leaf node sample table. The effective tree coefficient is proposed for detecting the occurrence of concept drift and for judging whether to discard a tree:

α = (Σ_{i=1}^{L} li) / (cL),    1/c ≤ α ≤ 1    (4)

where L is the number of leaf nodes of the tree, li is the number of categories in leaf node i, and c is the number of target categories.
Algorithm 3 Update Incremental Tree
Input: samples Dt: x ∈ Dt; trees T: Tji ∈ T^i, j ∈ [1, d], T^i ∈ T, i ∈ [1, m]; leaf node sample threshold N; effective tree coefficient threshold α
Output: trees T
1: for each x ∈ Dt do
2:   for each Tji ∈ T^i, j ∈ [1, d], T^i ∈ T, i ∈ [1, m] do
3:     root ← Tji; traverse root down to the leaf node corresponding to x;
4:     if the category of x is not in the leaf node then
5:       Add x's information to the leaf node table; place x's category in the partition with value 1;
6:     else
7:       count ← the number of samples of x's category in the leaf node
8:       number ← the total number of samples in the leaf node
9:       if count ≤ number/2 then
10:        Add x's information to the leaf node table;
11:      end if
12:    end if
13:    if number ≥ N then
14:      Extract the sample information in the leaf node table and combine it into training data;
15:      Gini ← calculate the Gini index
16:      if Gini ≥ 0 then
17:        Split the leaf node;
18:      end if
19:    end if
20:   end for
21: end for
22: for each Tji ∈ T^i, j ∈ [1, d], T^i ∈ T, i ∈ [1, m] do
23:   β ← calculate the effective tree coefficient of Tji
24:   if β ≥ α then
25:     Tji ← use Dt to create a new incremental tree
26:   end if
27: end for
The entire incremental tree update algorithm is depicted in Algorithm 3.
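The per-leaf storage rule of Algorithm 3 (steps 4-12) can be sketched as follows, simplifying the leaf node sample table to per-category counts (the function name and the simplification are assumptions):

```python
def update_leaf(leaf, x_category, N):
    """Sketch of Algorithm 3, steps 4-13, for one leaf: `leaf` maps
    category -> count of stored samples.  Returns True when the leaf has
    accumulated N samples and should be considered for splitting."""
    count = leaf.get(x_category, 0)
    total = sum(leaf.values())
    # store the sample if its category is new, or not yet the clear majority
    if count == 0 or count <= total / 2:
        leaf[x_category] = count + 1
    return sum(leaf.values()) >= N
```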
4 Experiment and Results Analysis

4.1 Generalization Performance
CICD2017 [8] is a data set for intrusion detection. Twelve sample categories were used for the experiment; 50 samples were randomly selected from each category, for a total of 600 samples as the training set, which provides a good test of the generalization performance of the algorithm. The experimental results in Table 1 show that the generalization performance of weight forests is stronger than that of random forests; in the categories where random forests perform weakly, weight forests still achieve high accuracy.

Table 1 Experimental results of random forests and weight forests
Category                 Random forests   Weight forests
FTP-patator              0.997            0.992
DoS slowloris            0.985            0.967
DoS Hulk                 0.965            0.994
Web attack XSS           0.630            0.967
SSH-patator              0.979            0.993
Port scan                0.995            0.995
DoS GoldenEye            0.983            0.981
Bot                      0.974            0.979
BENIGN                   0.773            0.899
DoS Slowhttptest         0.968            0.968
DDoS                     0.993            0.995
Web attack brute force   0.596            0.973
4.2 Incremental Learning
In the initial model-training stage, only one sample is randomly selected from each category, so 12 samples in total form the training set. In the experiment, batches of 120 samples were drawn without replacement and fed to the model to simulate the process of incremental learning. The experimental results are shown in Fig. 2: most categories reach about 90% accuracy within two batches, and the accuracy of each category does not fluctuate much in the subsequent batches.

4.3 Concept Drift
To verify the effect of the effective tree coefficient, we compare settings with and without concept drift, and with and without updating the tree. Seven categories are used; initially each category has only one sample, and concept drift is introduced (or not) at the fifth batch. The experimental results are shown in Fig. 3. The experiments show that: (i) when concept drift exists and the tree is not updated, the effective tree coefficient rises abruptly at the moment of drift, indicating that the
Fig. 2 Incremental learning accuracy on the test set with batch
Fig. 3 Effective tree coefficient varies with batch
effective tree coefficient can reflect the occurrence of concept drift; (ii) when concept drift exists and the tree is updated, the effective tree coefficient rises and then falls immediately, indicating that the algorithm can cope with concept drift; (iii) when concept drift does not exist and the tree is not updated, the effective tree coefficient increases slightly, indicating that the incremental tree accumulates errors when the tree is not updated.
5 Conclusion
The generalization performance of the weight forests method on CICD2017 is better than that of random forests. The experimental results on incremental learning show that the weights contribute to the generalization of the weight forests method, and that weight forests perform incremental learning by updating their weights. Each tree in the weight forests is generated from one feature, which makes the incremental tree easy to understand and implement and makes weight forests applicable to problems with many sample features. At the
same time, the coefficient of an incremental tree can predict the occurrence of drift. Updating the tree according to the effective coefficient can not only handle drift but also mitigate the accumulation of errors in the tree.
References
1. Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees
2. Breiman L (2001) Random forests. Mach Learn 45(1):5–32
3. Rodriguez JJ, Kuncheva LI, Alonso CJ (2006) Rotation forest: a new classifier ensemble method. IEEE Trans Pattern Anal Mach Intell 28(10):1619–1630
4. Biau G (2012) Analysis of a random forests model. J Mach Learn Res 13(1):1063–1095
5. Guo H, Wang S, Fan J, Li S (2019) Learning automata based incremental learning method for deep neural networks. IEEE Access 7:41164–41171
6. Zang W, Zhang P, Zhou C, Guo L (2014) Comparative study between incremental and ensemble learning on data streams: case study. J Big Data 1(1):5
7. Ruping S (2001) Incremental learning with support vector machines. In: Proceedings 2001 IEEE international conference on data mining. IEEE, pp 641–642
8. Sharafaldin I, Lashkari AH, Ghorbani AA (2018) Toward generating a new intrusion detection dataset and intrusion traffic characterization. In: ICISSP, pp 108–116
Detecting Anomalous Insiders Using Network Analysis Lei Dai2 , Liwei Zhang1 , Limin Li2 , and You Chen3,4(B) 1 Beijing Institute of Technology, Beijing, China 2 China Academy of Electronics and Information Technology, Beijing, China 3 Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN,
USA [email protected] 4 Department of Electrical Engineering & Computer Science, School of Engineering, Vanderbilt University, Nashville, TN, USA
Abstract. Collaborative information systems (CISs) can improve the efficiency and quality of services because they enable users to coordinate and collaborate on common work at a large scale. In recent decades, an increasing number of hospitals have started to adopt CISs to improve the efficiency and quality of health care, and the CIS is becoming an important platform for hospital employees to treat patients. CISs are rich in relations among users; however, the user networks existing in CISs, especially in health information systems, have seldom been studied. In this paper, we construct collaborative weighted networks of users from two typical CISs: one from health care, the Vanderbilt University STARPANEL system, and the other more general, the Wikipedia talk system. We study the characteristics of these two networks and find that both follow a degree-strength relation, which can be used for detecting anomalous behaviors of users in a CIS, for instance, anomalous behaviors of hospital employees and wiki editors.

Keywords: Collaborative information systems · Confidence social network · Information leakage
1 Introduction

The use of web-based "collaborationware" has been increasing in recent years. We call systems that use web-based collaborationware collaborative information systems (CIS). CISs such as Wiki [1], Blog [2], Picasa [3], and Podcasting [4] offer opportunities for powerful information sharing and ease of collaboration. Users and subjects are two common objects in a CIS. For instance, in Wikipedia, users and subjects are editors and pages, respectively, and groups of editors collaboratively add reviews to common pages. Owing to the efficiency and convenience of CISs, many health care providers have recently adopted them in collaborative clinical treatment. For example, Vanderbilt University has
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_3
been using a system called STARPANEL to store the electronic health records (EHR) of patients. Users such as doctors, nurses, and billers conduct their collaborative work on common patients in this system. Many existing works [5–13] concentrate on constructing and analyzing social networks such as Facebook and Twitter; few focus on constructing and analyzing the implicit social networks existing in CISs. Barrat et al. [14] studied the characteristics of the world-wide airport network (WAN) and the scientist collaboration network (SCN), which is similar to our work. They treated WAN and SCN as undirected networks in which the relations from node A to node B and from node B to node A are equal. However, equal relations cannot reveal the mutual interactions of users in a health information system. For instance, the relation between a doctor and a biller is asymmetric, because a biller may interact with a large number of doctors while a doctor may interact with only a small number of billers. We construct our confidence social network (CSN) from the access logs of CISs, such as the access logs of the Vanderbilt STARPANEL system and the edit logs of the Wiki talk system. The relation weights between a pair of users in the CSN are calculated based on the common subjects they accessed: if a pair of users accessed common subjects, there is an edge between them, otherwise not. The weight from one user to a connected user depends on the number of common subjects they accessed and on the number of subjects the user itself accessed. To the best of our knowledge, we are the first to construct directed collaborative networks in this way. Information leakage by insiders of a CIS has become an important issue in the information security and privacy fields, and the main reason private information is often leaked by insiders is the collaborative characteristic of the CIS.
Thus, analysis of social relations among users has been leveraged by researchers to prevent private information leakage. Several existing works [15, 16] aim to detect anomalous insiders through analysis of social relations. However, they only use the relations of users to decide whether a user is anomalous; they do not study the whole networks existing in the systems or how the characteristics of those networks can be relied on to detect anomalous insiders. At the end of this paper, we explain how our findings on the CSN can be leveraged to detect anomalous users in the network. Our work makes three contributions: (1) all users in the CSN can be characterized by a law Y = A X^(−α), where Y is the average conditional probability (strength) of a user and X is the number of friends (degree) of a user; (2) this law is stable over time; (3) using this law, we can determine whether a user's behavior is anomalous.
2 Confidence Social Network

The CSN is constructed from the access logs of users on subjects. Each access log entry is represented as a 3-tuple (user, subject, time), which is very general and basic information in a CIS. Based on the access logs of users, we construct a bipartite graph
of users and subjects over a time span such as one day or one week. Figure 1a depicts a bipartite graph of users and subjects. There are 7 users and 11 subjects in this bipartite graph, and every user has accessed a certain number of subjects. A directed graph is constructed based on the bipartite graph, as depicted in Fig. 1b. In this graph, if a pair of users have accessed common subjects, they are connected by a bidirectional edge, and the weight of the edge is the number of common subjects they accessed. For instance, u5 and u6 co-access s1 and s4, so the weight of the edge between them is 2. Every user is also assigned a value, which is the number of subjects accessed by that user. For instance, u5 accesses three subjects s1, s4, and s7, so the weight of u5 itself is 3, which is depicted inside the circle of u5 in Fig. 1b.
Fig. 1 An example illustrating the process of constructing a confidence social network from the access logs of users in a CIS
The CSN is constructed directly on top of the above graph. Different from that graph, there are two directed edges between each pair of connected users, and the weights of the two edges differ. Each edge is assigned a confidence value. Confidence is the conditional probability from one user to another, defined as:

Confidence(ui → uj) = Com_{ui,uj} / N_{ui}    (1)
L. Dai et al.
where Com_{ui,uj} is the number of common subjects accessed by ui and uj, which equals the weight of the bidirectional edge between ui and uj in the directed graph, and N_{ui} is the number of subjects ui accessed. As in Fig. 1b, N_{u7} = 5, which means u7 accesses five subjects s7, s8, s9, s10, and s11 in Fig. 1a. Through this measurement of confidence, we obtain confidence values for all edges in the CSN. For instance, in Fig. 1c, Confidence(u7 → u5) = 1/5 and Confidence(u5 → u7) = 1/3.
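A minimal sketch of Eq. (1), reproducing the two confidence values just quoted; the subject sets for u5 and u7 are taken directly from the text:

```python
# Subject sets for u5 and u7 as stated in the text (Fig. 1a)
subjects = {
    "u5": {"s1", "s4", "s7"},
    "u7": {"s7", "s8", "s9", "s10", "s11"},
}

def confidence(ui, uj, subjects):
    """Eq. (1): Com_{ui,uj} / N_{ui}."""
    com = len(subjects[ui] & subjects[uj])   # co-accessed subjects
    return com / len(subjects[ui])           # divided by N_{ui}

print(confidence("u7", "u5", subjects))   # -> 0.2 (= 1/5)
print(confidence("u5", "u7", subjects))   # 1/3, printed as 0.333...
```

Note the asymmetry of the two directions, which is what the reciprocity measurement in Sect. 4 quantifies.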
3 A Law of the Confidence Social Network

3.1 Definition of the Law

In the confidence social network, every user can be represented by two metrics: the number of connections (degree) and the average confidence to all of its connections (strength). To examine the relation between degree and strength, we define the average confidence of a user over all of its connections as:

AveCon(ui) = ( Σ_{uj ∈ C_ui} Confidence(ui → uj) ) / |C_ui|    (2)

where C_ui is the set of friends or connections of ui, and the cardinality | · | denotes the number of elements in a set. As in Fig. 1, |C_u1| = 3, which indicates that u1 has three connections u2, u3, and u6. For instance, the average confidence of u6 in Fig. 1c is AveCon(u6) = (0.25 + 0.5 + 0.25 + 0.5)/4. We want to characterize the relation between degree and strength, and we hypothesize that it obeys a law described as:

AveCon = A × |C|^(−α)    (3)
where AveCon is the strength of a user and |C| is the degree of a user. α is a positive value, and the law indicates that as the degree of a user increases, the average confidence from that user to all of its connections decreases, following AveCon = A × |C|^(−α). Intuitively, when a user has a large number of connections, the average number of common subjects between this user and its connections is much smaller than the number of subjects accessed by the user itself; when a user has a small number of connections, the two quantities do not differ significantly. The distribution of the seven users of Fig. 1c on the degree-strength plane is depicted in Fig. 2. u2, u3, u4, u5, and u6 lie very close to the law line, whereas u1 and u7 lie farther from it. u1 and u7 are two typical users in the system. u1 is a user who is curious about subjects that are popular among most users, such as VIPs' medical records in a health information system or popular articles in the Wiki talk system. u7 is a user who accessed a large number of isolated subjects, none of which were accessed by other users. The degree and strength of u1 are 3 and 1, respectively, which indicates that the subjects u1 accessed are VIP subjects, such as the medical record of a famous person in a health information system. As depicted in Fig. 1a, u1 accessed a subject s3 that was also accessed by u2, u3, and u6. u7, as depicted in Fig. 1a, accessed five subjects s7, s8, s9, s10, and s11, but four of them are isolated subjects.
Fig. 2 An example of user distribution on the degree and strength
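Equation (2) can be sketched as below. The four confidence values of u6 are the ones quoted in the text; which neighbour carries which value is our assumption, since only the multiset of values is given.

```python
def ave_con(ui, conf):
    """Eq. (2): average confidence of ui over its connection set C_ui."""
    neighbours = [uj for (a, uj) in conf if a == ui]
    return sum(conf[(ui, uj)] for uj in neighbours) / len(neighbours)

# The four confidences of u6 read off Fig. 1c (the assignment of values
# to particular neighbours is assumed; the values are from the text)
conf = {("u6", "u1"): 0.25, ("u6", "u2"): 0.5,
        ("u6", "u3"): 0.25, ("u6", "u5"): 0.5}
print(ave_con("u6", conf))   # -> 0.375, i.e. (0.25 + 0.5 + 0.25 + 0.5)/4
```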
3.2 Evolution of the Law

Since the CSN is constructed from the access logs of a CIS, and the logs change over time, we evaluate the evolution of the law over time in the following two aspects: (1) Access logs represent different relational information under different time-window sizes, so we evaluate the laws over different time windows such as one day, one week, and one month. For instance, from one day/week/month of access logs, we construct the corresponding social network CSN_day/CSN_week/CSN_month, respectively, and then calculate the values of α_day/α_week/α_month for the corresponding laws. Finally, we evaluate whether these laws are stable across the different time-window sizes. If they are not, we further evaluate the stability of the law associated with a fixed CSN over time, which is described as: (2) For a CSN constructed from the access logs of a random day, we take all of its nodes U = {u1, u2, ..., un} and the corresponding law AveCon = A × |C|^(−α). Then we evaluate the evolution of this CSN's law over one day, one week, and one month, respectively. Our evaluation is based on the assumption that the nodes of the CSN do not change but the relations between them evolve over time. The aim of this evaluation is to verify whether the law for a fixed group of users is stable over time.
4 Experiments

We construct bidirectional weighted CSNs on two typical real CIS and then model the laws of users for those two types of networks. Furthermore, we evaluate the stability of the laws for both types of networks. The first collaborative information system is Wikipedia, where editors can edit Wiki pages; Wikipedia is a general collaboration system. We refer to the dataset retrieved from Wikipedia as the Wiki dataset. The other system is more specific than Wikipedia: the Vanderbilt STARPANEL system, where users such as doctors, nurses, and billers co-access the electronic health records (EHRs) of patients. We refer to the dataset related to STARPANEL as the EHR dataset.

4.1 Datasets

The Wiki dataset contains the complete edit history (all revisions, all pages or articles) of Wikipedia from its inception until January 2008. It can be downloaded from [17]. In
the Wiki dataset, the 3-tuple of an access log is ⟨reviewer, article, time⟩. An article can be reviewed by multiple collaborating reviewers who are interested in it. There are 11,176,227 revisions, 257,441 reviewers, and 2,686,677 articles in the Wiki dataset. The EHR dataset contains access logs for 3 months of the year 2010. Access transactions are represented as ⟨user, patient, time⟩. It has 10,386,372 accesses, 28,305 users, and 648,662 patients.

4.2 Results

4.2.1 Mutual Relations of Users in the CSN

The small-world phenomenon also exists in the confidence social network. We define a hop as one unit of distance between a pair of users. For instance, if u1 connects with u2, u2 connects with u3, and there is no edge between u1 and u3, then the number of hops between u1 and u2 is 1 and between u1 and u3 is 2. The average numbers of hops in the EHR and Wiki confidence social networks are 4.8 and 2.7, respectively. Besides hops, we want to emphasize the reciprocity of the CSN, which represents the mutual relations between pairs of nodes. Reciprocity is a quantity that specifically characterizes directed networks: link reciprocity measures the tendency of vertex pairs to form mutual connections with each other [Diego2004]. We modify the definition of link reciprocity into a confidence reciprocity that measures the mutual confidence between each pair of users. The confidence reciprocity of the CSN is defined as:

Reciprocity = [ Σ_{i,j, i≠j} (Confidence(ui → uj) − a) × (Confidence(uj → ui) − a) ] / [ Σ_{i,j, i≠j} (Confidence(ui → uj) − a)² ]    (4)

and a is defined as:

a = Σ_{i,j, i≠j} Confidence(ui → uj) / |E|    (5)

where |E| is the total number of directed edges in the CSN. By this definition, if the mutual relations between pairs of nodes are symmetrical, the value of the confidence reciprocity will be around 1. The reciprocities of the EHR and Wiki confidence social networks are 0.44 and 0.65, respectively, which indicates that the relations between pairs of nodes are asymmetric in both networks. That is also why we treat both networks as directed.

4.2.2 Laws Mining

Figure 3 presents the laws of users for one day, one week, and one month on the EHR and Wiki data, respectively. We also obtain the value of α for each law; α is calculated using a binning technique and linear regression [18, 19]. First, we transform the original distribution into a log-scale distribution. Second, we apply the binning technique to the log-scale distribution to generate the bin-block points depicted in the bottom panels of Fig. 3. Third, we calculate the values of α and A by linear regression on the extracted points.
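The confidence reciprocity of Eqs. (4) and (5) can be sketched as follows. The symmetric toy network is invented purely to show that perfectly mutual confidences yield a reciprocity of exactly 1, whereas the measured values (0.44 for EHR, 0.65 for Wiki) fall well below it:

```python
def reciprocity(conf):
    """Eqs. (4)-(5): confidence reciprocity of a directed CSN, given a
    dict mapping ordered pairs (ui, uj) -> Confidence(ui -> uj)."""
    a = sum(conf.values()) / len(conf)                       # Eq. (5)
    num = sum((conf[(i, j)] - a) * (conf.get((j, i), 0.0) - a)
              for (i, j) in conf)
    den = sum((conf[(i, j)] - a) * (conf[(i, j)] - a) for (i, j) in conf)
    return num / den                                         # Eq. (4)

# A perfectly symmetric toy network: mutual confidences are equal,
# so the reciprocity is exactly 1
sym = {("a", "b"): 0.4, ("b", "a"): 0.4,
       ("a", "c"): 0.8, ("c", "a"): 0.8}
print(reciprocity(sym))   # -> 1.0
```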
(a) EHR
(b) Wiki
Fig. 3 Laws for day, week, and month in the EHR dataset and the Wiki dataset, respectively. The top rows of the two panels show the original distributions of users on strength and degree, and the bottom rows show the log-scale distributions. The laws are drawn on both the original and log-scale distributions. Each distribution can be described by three laws: the law itself and its two one-standard-deviation laws
All users can be described by three laws in a specific time window, such as one day, one week, or one month. The law is located at the median position, and the other two curves are the law plus and minus one standard deviation. Figure 3 depicts the laws of users for the three different time windows. From the figure, we can see that the middle α in the EHR dataset remains nearly the same for one day (0.57), one week (0.60), and one month (0.60). There are two reasons for this phenomenon. First, the number of hospital employees is stable and their responsibilities in the hospital change little, so users in the EHR system have stable relationships. Second, few new users join the hospital's user network within a month, which leads the CSN of the EHR to be stable
over time. In contrast, the Wiki system has varying values of the middle α (0.85, 0.92, 0.78). There are two reasons for this phenomenon. First, many new users join the Wiki social network every day, which changes the scale of the Wiki CSN. Second, the relations among existing users change over time because of their differing interests in newly added articles. To verify which reason influences the Wiki CSN more significantly, we evaluate the stability of the law for a fixed CSN (with no new users joining) over time in the next subsection.

4.2.3 Law Evolution

In the Wiki system, the laws have different values of α for a day, a week, and a month. There are two possible reasons: one is the large number of new users joining the network; the other is that the relations of existing users change over time. We verify which one influences the law more significantly by applying the law-evolution strategy to the Wiki data. We randomly select a "start day" of the Wiki data, on which there are 1793 users, and calculate the degree-strength relations of all users on this day, as depicted in the first column of Fig. 4. We then remodel the degree-strength relations of those 1793 users one day, one week, and one month later, respectively; the resulting distributions are depicted in the second, third, and fourth columns of Fig. 4. For each of the four distributions, we calculate its value of α. From the figure, we can see that the values of α change little: 0.85, 0.86, 0.88, and 0.84, respectively. This indicates that the relations of existing users change little over time, so the main factor influencing the law of the Wiki CSN across different time windows is the newly joining users.
Fig. 4 Distributions of the same set of users on degree and strength for start day, one day later, one week later, and one month later
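The α values reported in this section come from binning plus linear regression on the log-scale distribution; a simplified sketch (plain least squares in log-log space, omitting the binning step) is:

```python
import math

def fit_power_law(degrees, strengths):
    """Least-squares fit of AveCon = A * |C|**(-alpha) in log-log space
    (a simplified stand-in for the binning-plus-regression of [18, 19])."""
    xs = [math.log(d) for d in degrees]
    ys = [math.log(s) for s in strengths]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return math.exp(my - slope * mx), -slope   # A, alpha

# Synthetic users lying exactly on AveCon = 1.0 * |C|**-0.6
degrees = [2, 4, 8, 16, 32, 64]
strengths = [d ** -0.6 for d in degrees]
A, alpha = fit_power_law(degrees, strengths)
print(round(A, 2), round(alpha, 2))   # -> 1.0 0.6
```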
5 Application of Anomaly Detection CIS are increasingly relied upon to manage sensitive information [20]. Intelligence agencies, for example, have adopted CIS to enable timely access and collaboration between
groups of analysts [21–23] using data on personal relationships, financial transactions, and surveillance activities. Additionally, hospitals have adopted electronic health record (EHR) systems to decrease healthcare costs, strengthen care-provider productivity, and increase patient safety [24], using vast quantities of personal medical data. At the same time, however, the detailed and sensitive nature of the information in such CIS makes them attractive to numerous adversaries. This is a concern because the unauthorized dissemination of information from such systems can be catastrophic to both the managing agencies and the individuals (or organizations) to whom the information corresponds. It is believed that the greatest security threat to information systems stems from insiders [25–28]. A suspicious insider in this setting corresponds to an authenticated user whose actions run counter to the organization's policies. Social relation analysis is considered an important tool for detecting suspicious insiders in health information systems [5, 16], and insiders with anomalous relational behaviors are flagged as anomalous. The CSN is constructed from the collaborative environment, and the degree-strength relations existing in the CSN can be used to detect suspicious insiders in the CIS. We assume that if a user in the CSN lies far from the degree-strength law, there is a high probability that the user is anomalous. We therefore designed a law-based suspicious-insider detection model as:

Score_ui = ( max(yi, A × xi^(−α)) / min(yi, A × xi^(−α)) ) × log(|yi − A × xi^(−α)| + 1)    (6)
where yi and xi are the strength and degree of ui in the CSN, respectively. Based on Eq. 6, we calculate anomaly scores for all users in the confidence social networks of the EHR day data and the Wiki day data. The distributions of anomaly scores for EHR users and Wiki users are depicted in Fig. 5. As shown in Fig. 5, over 95% of EHR users have anomaly scores less than 0.5, and over 97% of Wiki users have anomaly scores less than 0.6, which is reasonable in practice, since the majority of users in a CIS are normal and constitute the structure of the confidence social network.

(a) EHR

(b) Wiki

Fig. 5 Distribution of anomaly scores for EHR users and Wiki users

Using law-based anomaly detection, we can identify two types of suspicious users in the CIS. (1) Users located at the top-right corner of the law, like u1 in Fig. 2. These users have large degree and large strength, because they accessed a small number of subjects that are accessed by most users in the network. For instance, in a hospital system, if a curious user looks up a famous patient's medical record (like subject s3 in Fig. 1a), he will appear at the top-right corner of the degree-strength law. In the Wiki environment, such users are the ones interested only in popular articles. (2) Users located at the bottom-left corner of the law, like u7 in Fig. 2. These users have small values of degree and strength, because they accessed many subjects that are not accessed by other users; we call such accesses isolated accesses. For instance, subjects s8, s9, s10, and s11 in Fig. 1a are accessed only by u7, so u7's accesses on them are isolated. The number of users with a high rate of isolated accesses is very small in the Vanderbilt STARPANEL system: Fig. 6 shows that the majority of users in the system have a rate of isolated accesses less than 0.1. If a user like u7 in Fig. 1a performs a large number of isolated accesses, he will appear at the bottom-left corner of the law.
Fig. 6 The distribution of the rate of isolated accesses for all users in one day of EHR data
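Equation (6) can be sketched as below; the law parameters A = 1.0 and α = 0.6 are illustrative assumptions rather than values fitted to either dataset:

```python
import math

def anomaly_score(degree, strength, A, alpha):
    """Eq. (6): the farther a user's (degree, strength) point lies from
    the law AveCon = A * degree**(-alpha), the larger the score."""
    expected = A * degree ** (-alpha)
    ratio = max(strength, expected) / min(strength, expected)
    return ratio * math.log(abs(strength - expected) + 1)

# Assumed illustrative law: A = 1.0, alpha = 0.6.
# A user sitting exactly on the law scores 0
print(anomaly_score(8, 1.0 * 8 ** (-0.6), 1.0, 0.6))   # -> 0.0
# A u1-like user (degree 3, strength 1) scores clearly higher
print(anomaly_score(3, 1.0, 1.0, 0.6))
```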
6 Discussion and Conclusions

The confidence social network is constructed from the access logs of users on subjects in collaborative information systems. The CSN thus represents the interaction relations of users and indicates how they collaboratively work on common subjects. Because the mutual interactions between pairs of users are asymmetric, we treat the CSN as a directed weighted network. We introduced the whole process of constructing a confidence social network through a bipartite graph of users and subjects, then showed the degree-strength relations and explained how to use those relations to detect anomalous users in the CIS. The degree-strength law we found is very useful and important for privacy protection in CIS: we can model the behaviors of users by using the law. In this paper, we listed two types of suspicious users to illustrate applications of the law. If a user's behavior is strange, for instance, accessing a famous person's medical record that many collaborators also access, this user will have high values of strength and degree, which breaks the law (top-right corner of the law). Likewise, if a user accesses many isolated subjects that are not accessed by other users, this user will have low values of degree and strength, which also breaks the law (bottom-left corner of the law). There are still several limitations to our work, which we will address in the future. (1) The confidence social network has wide applications, such as user-relation management in hospitals and resource allocation; in this paper, however, we only presented the application of anomaly detection. The main goal of this paper is to explain the confidence social network and its degree-strength law, so we described the anomaly-detection application in a simple way. In future work, we will extend this application and evaluate the performance of the law-based anomaly-detection model.
(2) We only showed the reciprocity and the degree-strength law of the confidence social network; much work remains, such as detecting communities of users and finding optimal paths between pairs of users. (3) Although we demonstrated the degree-strength law on a general dataset (Wiki) and a specific dataset (EHR), we need to verify the law in more diverse environments, such as additional clinical information systems. Our work is an initial social network analysis of CIS, and we believe it provides potential material and directions for future research in this area.
References

1. Wiki Definition. https://www.answers.com/topic/wiki
2. Blog. https://www.educause.edu/ir/library/pdf/ELI7006.pdf
3. Picasa. https://picasa.google.com/
4. Podcasting. https://en.wikipedia.org/wiki/Podcast
5. Adamic LA, Buyukkokten O, Adar E (2003) A social network caught in the web. First Monday 8(6–2):1–21
6. Tong ST, Heide BVD, Langwell L (2008) Too much of a good thing? The relationship between number of friends and interpersonal impressions on Facebook. J Comput-Mediat Commun 13:532–549
7. Lampe C, Ellison N, Steinfield C (2007) A familiar Face(book): profile elements as signals in an online social network. In: Proceedings of conference on human factors in computing systems, pp 435–444
8. Walther JB, Van Der Heide B, Kim SY, Westerman D (2008) The role of friends' appearance and behavior on evaluations of individuals on Facebook: are we known by the company we keep? Human Commun Res 34:28–49
9. Donath J, Boyd D (2008) Public displays of connection. Technol J 22(4):71–82
10. Tidwell LC, Walther JB (2002) Computer-mediated communication effects on disclosure, impressions, and interpersonal evaluations: getting to know one another a bit at a time. Human Commun Res 28(3):317–348
11. Gross R, Acquisti A (2005) Information revelation and privacy in online social networks. In: Proceedings of the ACM workshop on privacy in the electronic society, pp 71–80
12. Mika P (2005) Flink: semantic web technology for the extraction and analysis of social networks. J Web Semant 3:211–223
13. Matsuo Y, Mori J, Hamasaki M, Nishimura T, Takeda H, Hasida K, Ishizuka M (2007) POLYPHONET: an advanced social network extraction system from the web. J Web Semant 5:262–278
14. Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. PNAS 101(11):3747–3752
15. Chen Y, Malin B (2011) Detection of anomalous insiders in collaborative environments via relational analysis of access logs. In: Proceedings of ACM conference on data and application security and privacy, pp 63–74
16. Chen Y, Nyemba S, Zhang W, Malin B (2011) Leveraging social networks to detect anomalous insider actions in collaborative environments. IEEE Intell Secur Inform (ISI):119–124
17. Wiki Dataset. https://snap.stanford.edu/data/wiki-meta.html
18. Adamic LA, Huberman BA (2002) Zipf's law and the internet. Glottometrics 3:143–150
19. Adamic LA, Huberman BA (2000) Power-law distribution of the world wide web. Science 287:2113
20. Javanmardi S, Lopes C (2007) Modeling trust in collaborative information systems. In: Proceedings of international conference on collaborative computing: networking, applications and worksharing, pp 299–302
21. Chen H, Wang F, Zeng D (2004) Intelligence and security informatics for homeland security: information, communication, and transportation. IEEE Trans Intell Transp Syst 5(4):329–341
22. Chen H, Zeng D, Atabakhsh H, Wyzga W, Schroeder J (2003) COPLINK: managing law enforcement data and knowledge. Commun ACM 46(1):28–34
23. Popp R (2004) Countering terrorism through information technology. Commun ACM 47(3):36–43
24. Menachemi N, Brooks R (2008) Reviewing the benefits and costs of electronic health records and associated patient safety technologies. J Med Syst 30(3):159–168
25. Probst C, Hansen RR, Nielson F (2006) Where can an insider attack? In: Proceedings of workshop on formal aspects in security and trust, pp 127–142
26. Schultz E (2002) A framework for understanding and predicting insider attacks. Comput Secur 21(6):526–531
27. Stolfo S, Bellovin S, Hershkop S, Keromytis A, Sinclair S, Smith SW (2008) Insider attack and cyber security: beyond the hacker. Springer
28. Tuglular T, Spafford E (1997) A framework for characterization of insider computer misuse. Unpublished paper
Containerization Design for Autonomous and Controllable Cloud Distributed System Xiao Zhang, Yu Tang(B) , Hao Li, Shaotao Liu, and Di Lin School of Information and Software Engineering, University of Electronic Science and Technology of China, ChengDu, China [email protected], [email protected]
Abstract. With the growing popularity of cloud computing, a large number of services and applications have been deployed in cloud environments. However, the existing cloud technology architecture still has many problems. This paper summarizes the problems of the communication module in the development of cloud technology so far, namely the three major problems of complicated cloud architecture, homogeneous services, and communication pseudo-saturation. For these problems, we give corresponding solutions for our system under a distributed container architecture. The solution is implemented under autonomous and controllable conditions; it can ensure the robustness and relative security of the system and the smoothness of external communications. Our work contributes to the current cloud computing environment and makes up for some of the shortcomings of existing cloud-environment communication modules.

Keywords: Communication · Network · Cloud cluster · Internet of things

1 Introduction
Information science and industry have entered the cloud era; "everything can be clouded" has become a reality. A large number of computing nodes are distributed in the cloud, and many companies have begun to deploy their business, whether public or private, on various clouds. From mobile-device development to web development, the cloud has become a trend.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_4
In addition to the risk of intrusion, cloud cluster architectures are also plagued by a large amount of communication redundancy. Whether internally or externally, the convergence of architecture designs puts increasing strain on the communication of the time-division multiplexed network. Therefore, while robustness in program development must be guaranteed, the cluster's communication framework must serve both the internal and external communications of the cluster. In the coming IoT era, moreover, there will be more and more open communication requirements, along with a large number of sensor modules carrying different low-level hardware communication protocols; to ensure the coordination and timeliness of sensors and communication processes, the cloud communication architecture must be rethought, not only for security but also for efficiency. Especially in the space–air–ground–sea setting, the integrated network is composed of heterogeneous satellite networks, deep-space networks, space vehicles, maritime information networks, and ground wired and wireless network facilities with various functions, such as communication, reconnaissance, location, navigation, and remote sensing. Users, aircraft, and various communication platforms on the ground, at sea, in the air, and in deep space are closely combined through space–air–ground–sea links. The space–air–ground–sea integrated network gives full play to the advantages of space, air, ground, and sea information technologies and achieves effective acquisition, coordination, transmission, and convergence of multi-dimensional information. It can provide all-weather, all-area, high-speed wireless communications for all kinds of users. To realize such a network architecture, a cloud cluster architecture is a necessary component, since distributed fog computing cannot satisfy the collection and computation of such large amounts of data.
Cloud clusters, as an important part, undertake a large number of computing tasks for the complicated network architecture (Fig. 1).
Fig. 1. In this figure, we outline the specific practices of today's cloud clusters and the virtualization process. From left to right: the physical environment, the virtual environment, and the Docker container environment. Within the Docker container environment, multiple containers can form the microservices involved
2 Related Work
The cloud environment can be roughly divided into public cloud, private cloud, and hybrid cloud [4]. A public cloud is a cloud deployment model in which resources are controlled by cloud service providers and used by cloud service users. The cloud service provider builds the infrastructure, integrates resources to build a cloud virtual resource pool, and allocates it to multiple tenants as needed. The cloud servers and cloud server instances that we often hear of or use belong to the public cloud category and are suitable for enterprises and developers without the conditions or requirements for setting up a private cloud. Public clouds have very broad boundaries, and users face few restrictions when accessing public cloud services. The above clearly and concisely explains the design and communication architectures used in today's cloud computing clusters. While explaining the architectural ideas, we have also found many problems. In today's cloud computing, communication resources seem to have become the module that most constrains the development of cloud clusters. While other physical resources can be improved by stacking, communication resources cannot be improved ideally in the same way, and multiple physical network cards yield only limited improvement in communication efficiency. Moreover, the communication efficiency of a link depends to some extent on its weakest segment (the shortest plank of the barrel).
3 Cloud with Container
On the basis of virtualization, the concept of containers has gradually evolved. Containers were popularized by Docker, Inc. [6], which launched the Docker container-management program. Docker [2] is an open-source application container engine that allows developers to package their applications and dependencies into a portable image and then release it on any popular Linux or Windows OS. On top of images, the rapid orchestration of containers and rapid iteration of services are achieved, and to a certain extent the iteration speed of products has been greatly improved. The emergence and evolution of containers have gradually formed the microservice pattern of current application development. What is a microservice? A microservice is the subdivision of a large piece of software into functional modules; the basic unit is a container, each container is responsible for only one module, and the components required for a single module are put together to form a so-called service (Fig. 2). On the container architecture, communication methods have become richer. In addition to container-to-external communication, communication between containers has become more frequent, especially now that container service deployments tend to be single-machine deployments. Inter-container communication resembles inter-process communication within an operating system. The packaging technology of containers converges inter-container communication and container-to-external communication
Fig. 2. In this figure, we outline the specific practices of today's cloud clusters and the virtualization process. From left to right: the physical environment, the virtual environment, and the Docker container environment. Within the Docker container environment, multiple containers can form the microservices involved
via virtual NICs to physical NICs. Finally, inter-container communication is performed in a manner similar to local loopback. Although the overhead of local loopback is small, when the corresponding user requests are served uniformly, most of the work done by the containers is actually sequential. The problem with inter-container communication, then, is that the communication mode adopted by containers imitates the mode of communication between a physical machine and the outside world. This is not necessarily wrong, and some communication must be performed, but the resulting consumption of communication resources appears to be a bottleneck, which seems paradoxical. The existing communication framework may meet current needs, but in the face of 5G communication and the large volume of sensor information and communication requests, it will likely struggle to cope.
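As a rough, simplified illustration of loopback-style message exchange between two co-located endpoints (threads stand in for containers here; real container traffic additionally crosses virtual NICs on its way to the physical NIC):

```python
import socket
import threading

def serve(listener):
    # Accept one connection and echo the payload back upper-cased
    conn, _ = listener.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())

listener = socket.socket()
listener.bind(("127.0.0.1", 0))   # loopback interface; OS picks the port
listener.listen(1)
port = listener.getsockname()[1]

t = threading.Thread(target=serve, args=(listener,))
t.start()

# The "client container": the message traverses the host's network
# stack even though both endpoints live on the same machine
with socket.create_connection(("127.0.0.1", port)) as client:
    client.sendall(b"request")
    reply = client.recv(1024)
t.join()
listener.close()
print(reply)   # -> b'REQUEST'
```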
4 Problems
Here, we list three problems that exist in the network architecture of cloud clusters.
4.1 Service Convergence
In today's cloud clusters, there are large numbers of homogeneous services. A large number of cloud services and applications are deployed in cloud clusters; their program architectures are almost identical, and their communication protocols are mostly the same. Creating multiple containers on a single cloud host to deploy the required services has become commonplace. Although this is not a problem in itself and suits the independence of modules, it poses certain problems for the cloud cluster as a whole. The cloud platform provides corresponding architecture and platform support for application design and release in accordance with the IaaS idea. Although this facilitates user development to a certain extent, once a problem occurs, a large number of VMs may fall to attackers. In addition, a large amount of communication content has become identical, handling the same or similar services. Ideas such as DCTCP [1] and multi-TCP [5] have emerged, in which similar business information flows are merged and passed through routers together to reduce communication overhead. Later, distributed approaches appeared, and many cloud service vendors adopted such algorithms for physical-machine communication, which greatly improved communication efficiency to a certain extent. Inside the virtual machine, however, the traditional TCP protocol is still used: a large number of program development packages still use the traditional communication protocol stack, queuing on the virtual network card (DHCP mode) or on the physical network card (NAT mode). Another consideration is network security: commonly used third-party libraries and development plugins, such as fastjson, are widely used in the production and deployment of cloud services.

4.2 Complexity of the Architecture
The cloud architecture described above is similar to a multi-dimensional nested space. Layer by layer, the operating system kernel is abstracted to generate virtual machines, system images and finally containers, and some containers even nest further containers inside them. This approach makes the operating system's own load extremely large: nested virtualization heavily occupies the operating system's memory and multi-level cache. Although memory and the multi-level cache exist precisely so that programs and applications can run well, such level-by-level virtualization wastes a great deal of the system's own resources. At the operating-system level, overusing image and kernel operations causes further waste: in order to load a server program of tens of MB, operating-system containers of hundreds of MB are needed. The same thing happens with network card resources.
Containerization Design for Autonomous and Controllable . . .

4.3 Communication Pseudo-saturation
The communication pseudo-saturation mentioned here is the phenomenon that the request pass rate and network-speed utilization of the cloud cluster gateway appear very high, while the number of requests actually answered for users fails to meet the target. As described above, communication between containers and within a container is carried out in a manner close to local loopback. Local loopback makes the message queues longer, so messages destined for the Internet must wait through more queuing stages than in the previous communication process before they can reach the outside network. The entire communication process is roughly as follows: the network card in the container receives the request and queues it; the data packet at the front of the queue is forwarded to the virtual machine's network card; the virtual machine's network card places the container's packet in its own message queue, queues it, and then forwards it to the physical network card; the physical network card finally forwards the message after judging the gateway of the internal virtual network card. Although the communication process appears more orderly this way, it increases the queuing of outbound messages, and from the application's point of view, the probability of delays and errors in the messages required by the user also increases, so user experience is not guaranteed. When operators noticed that communication delays were too long, they began to scale out containers to ensure high concurrency; but the increase in the number of containers resulted in even longer queuing delays, which did not solve the fundamental problem.
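To see why each extra hop lengthens end-to-end delay, the container NIC → VM NIC → physical NIC chain above can be sketched as a chain of FIFO queues. This is a toy model with made-up unit service times, not a measurement of any real network stack:

```python
def simulate_chain(n_messages, hops, service_time=1.0):
    """Each message traverses `hops` FIFO queues; every queue serves one
    message per `service_time`. Returns the exit time of the last message.
    With one server per hop, message k leaves hop h no earlier than
    (k + h + 1) * service_time, so every extra hop adds one service time
    of latency to each message."""
    # departure[h] tracks when hop h last finished serving a message
    departure = [0.0] * hops
    finish = 0.0
    for k in range(n_messages):
        t = 0.0  # all messages arrive at the chain at time 0
        for h in range(hops):
            # a message starts service when both it and the server are free
            start = max(t, departure[h])
            t = start + service_time
            departure[h] = t
        finish = t
    return finish

# one hop (direct physical NIC) vs. three hops (container -> VM -> physical)
direct = simulate_chain(100, hops=1)
nested = simulate_chain(100, hops=3)
```

In this model the three-hop chain finishes later than the single hop, and every individual message carries two extra service times of latency, which is the pseudo-saturation effect in miniature.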
5 Our Design
In order to solve the above problems, we propose an improved distributed autonomous and controllable architecture for private and hybrid clouds. For the cloud architecture described earlier, we have made minor improvements, and we explain our architecture design from the perspective of the three issues above. The details of our architecture are depicted in Fig. 3.

Fig. 3. The left side of the figure is the physical server cluster, and the right side is the distributed cluster architecture we designed. We added routers on the right; there are actually also routers on the left

As shown in the figure, our design provides two application deployment schemes. The first (S1) is a unified deployment method: the same application module is deployed on each physical server to ensure module independence. The second (S2) ensures the integrity of the module: different containers are deployed on the same server to run different application modules, and the integrity of the application is achieved through local communication. Both schemes have their own advantages and disadvantages. To deal with complicated virtualization, we removed real host virtualization based on KVM and similar technologies, which consumed a lot of memory and communication resources, and we use Docker Swarm to manage the cluster [9]. Regardless of whether S1 or S2 is chosen, both benefit from this: removing the virtualization layer greatly reduces memory usage and communication resource consumption and improves the operating and communication efficiency of the entire cluster.

Regarding service convergence, the S1 solution guarantees the independence of each application module and communicates through a cross-domain network solution. Although its communication efficiency is lower than S2, faults are easy to troubleshoot, and an attacker cannot capture the whole cluster at once. If there is a problem with a single physical server, we can perform rapid service migration and troubleshoot while ensuring that the service keeps running temporarily. The S2 solution puts all modules of an application in the same physical server and communicates with each container locally, so the burden on the main network card is small. This method greatly reduces service failures and the possibility of communication congestion.

As for communication pseudo-saturation, we separate the work of the network cards: one network card (N1) completes the common communication tasks of the server, and the other (N2) is dedicated to inter-container communication. For S1, N1 is responsible for external communication; N2, in addition to handling internal communication, must go through a physical router to reach different types of modules, while the containers themselves still need cross-domain communication management through a virtual router. Although this adds some load to the communication, at the management level the integrity and independence of the communication interfaces are well guaranteed, and the corresponding service hosts keep working. For the S2 solution, we use N2 to handle the communication of the internal network segment of the corresponding containers.
When communication is not destined for the corresponding internal gateway, we forward the message, by bridging or by copying the message queue, to network card N1, which focuses on communicating with the outside world. Although this method is slightly cumbersome, it ensures that a single network card device or multiple network card devices can handle the same task, and to a certain extent it avoids the false saturation of the network card caused by homogeneous operation, which would leave the server unable to handle external requests effectively. As for the space–air–ground–sea network, unlike a common network, it is composed of a large number of control sensors. Each sensor manufacturer uses its own communication protocol for data exchange, not only HTTP [2] but also CoAP [8], AMQP [3], etc. For such a large number of communication protocols, what we need first is a unified communication interface and a unified decoder in the cloud: the various communication protocols are integrated into a RESTful interface form and then connected to our design solution, so that services can be developed and data collected quickly.
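The unified decoder idea can be sketched as a small adapter registry that normalizes messages from different sensor protocols into one RESTful-style record. This is a hypothetical illustration: the protocol names come from the text, but the payload formats, field layout, and function names are assumptions:

```python
# Registry mapping a protocol name to a decoder that returns a
# normalized record: {"device", "metric", "value"}.
DECODERS = {}

def decoder(protocol):
    """Register a decode function for one sensor protocol."""
    def register(fn):
        DECODERS[protocol] = fn
        return fn
    return register

@decoder("http")
def decode_http(msg):
    # assumed query-string style payload: "device=s1&metric=temp&value=21.5"
    fields = dict(p.split("=") for p in msg.split("&"))
    return {"device": fields["device"], "metric": fields["metric"],
            "value": float(fields["value"])}

@decoder("coap")
def decode_coap(msg):
    # assumed compact CSV payload: "s1,temp,21.5"
    device, metric, value = msg.split(",")
    return {"device": device, "metric": metric, "value": float(value)}

def to_rest(protocol, msg):
    """Single entry point: decode any supported protocol into the unified
    record that the cloud-side RESTful interface consumes."""
    return DECODERS[protocol](msg)
```

New protocols (AMQP, vendor-specific formats) are then a matter of registering one more decoder, without touching the RESTful layer behind `to_rest`.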
6 Conclusion and Discussion
In this paper, we reviewed the recent development of cloud cluster technology, focused on the problems in its communication modules, and classified these problems into three parts. For these problems, we designed two architecture solutions for the private cloud and the hybrid cloud. We carefully explored the theoretical feasibility of the solutions and argued that both are feasible and meaningful under private-cloud and hybrid-cloud conditions. In addition, we discussed whether our design is feasible in the public cloud environment and the space–air–ground–sea network. Although the scheme had to be adapted to changes in the experimental environment, the results show that it still brings a certain degree of improvement compared to the existing environment.
References
1. Alizadeh M, Greenberg AG, Maltz DA, Padhye J, Patel P, Prabhakar B, Sengupta S, Sridharan M (2010) Data center TCP (DCTCP). In: SIGCOMM 2010
2. Berners-Lee T, Fielding RT, Nielsen HF (1996) Hypertext transfer protocol – HTTP/1.0. RFC 1945:1–60
3. Fernandes JL, Lopes IMC, Rodrigues JJPC, Ullah S (2013) Performance evaluation of RESTful web services and AMQP protocol. In: 2013 fifth international conference on ubiquitous and future networks (ICUFN), pp 810–815
4. Hafeez T, Ahmed N, Ahmed B, Malik AW (2018) Detection and mitigation of congestion in SDN enabled data center networks: a survey. IEEE Access 6:1730–1740
5. Han H, Shakkottai S, Hollot CV, Srikant R, Towsley D (2006) Multi-path TCP: a joint congestion control and routing scheme to exploit path diversity in the internet. IEEE/ACM Trans Netw 14:1260–1271
6. Merkel D (2014) Docker: lightweight Linux containers for consistent development and deployment. Linux J 2014(239)
7. Naik N (2016) Building a virtual system of systems using docker swarm in multiple clouds. In: 2016 IEEE international symposium on systems engineering (ISSE), pp 1–3
8. Shelby Z, Hartke K, Bormann C (2014) The constrained application protocol (CoAP). RFC 7252:1–112
Spatially Transformed Text-Based CAPTCHAs

Chuanxiang Yan, Yu Tang(B), and Di Lin

School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
[email protected]
Abstract. In recent years, deep learning technology has achieved great success in the fields of text, image, and speech recognition. As long as there is enough data, deep learning techniques often achieve good results. In the field of text-based CAPTCHA recognition, even in the face of anti-segmentation techniques, deep learning can recognize CAPTCHAs end to end, without character segmentation, and still achieve good accuracy. However, some recent studies have found that deep neural networks are vulnerable to adversarial examples: adding a very small, barely perceptible disturbance to the input samples causes the model to output incorrect predictions with high confidence. In this paper, we propose adversarial text-based CAPTCHAs based on spatial transformation, and we use four state-of-the-art CNN models to recognize such adversarial CAPTCHAs with and without preprocessing. Experiments show that this type of CAPTCHA can effectively reduce the recognition rate of the attack models.

Keywords: Text-based CAPTCHAs · Deep neural network · Convolutional neural network · Adversarial example · Spatial transformation
1 Introduction
In this paper, we focus on text-based CAPTCHAs, because they are the most widely used type of CAPTCHA. Among the many CAPTCHA recognition techniques, using deep neural networks to identify text-based CAPTCHAs is an efficient approach, and attackers have successfully broken many CAPTCHA systems with it [1]. For example, Goodfellow et al. trained a large-scale distributed deep neural network to break the reCAPTCHA system with an accuracy of 99.8% [2]. However, recent research [3] has shown that many advanced deep neural network models are susceptible to adversarial samples.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_5

As inputs to DNN models, adversarial samples are carefully designed by attackers in order to make
the model output wrong results. The attacker adds an imperceptible perturbation to the original image, which strongly disturbs the model, to generate adversarial samples. In addition, Szegedy et al. [3] found that adversarial samples also generalize across models and across datasets, which is called transferability. Transferability is an important attribute of adversarial samples: it makes it possible for adversarial samples designed against one target victim model to fool other models that deal with the same task. Considering that adversarial samples carry only a small perturbation of the original pictures, can greatly interfere with deep neural network models, and are transferable, applying adversarial samples to text-based CAPTCHAs may be a good idea. The contributions of this paper are as follows.

• We used four state-of-the-art CNN models to train on text-based CAPTCHA datasets with and without preprocessing, achieving high accuracy on the test set. We use these models as attack models in the subsequent experiments.
• We propose an adversarial text-based CAPTCHA generation method based on spatial transformation.
• We use the above four attack models to evaluate the security of the adversarial text-based CAPTCHAs with and without preprocessing and analyze the experimental results.
2 Related Works
In this section, we will briefly summarize the existing adversarial attack algorithms.

2.1 Adversarial Examples
Based on the adversary's goal, there are two types of attack: targeted attack and non-targeted attack. Given an original sample (x, y), x is the input of the model, represented by a feature vector, and y ∈ Y is the ground-truth label of x. For a targeted attack, given a target label t ≠ y, the attacker's purpose is to add an imperceptible disturbance ε to the original input x to generate the adversarial example x_adv (x_adv = x + ε), so that the classifier g misclassifies the adversarial sample x_adv as t (g(x_adv) = t). For a non-targeted attack, the attacker only aims to make the classifier g misclassify the adversarial sample x_adv as any class except the ground-truth label y (g(x_adv) = y′ ≠ y). Based on the adversary's knowledge, attacks can be divided into white-box attacks and black-box attacks. Under a white-box attack, the attacker can obtain all the information of the target neural network, including the parameters, the structure, the gradients, and the datasets, which can be used to craft adversarial samples [4]. In the black-box setting, the attacker cannot obtain the full inner information of the DNN model [5]. In this paper, we study both white-box and black-box attacks.
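As a concrete instance of the white-box, non-targeted attack defined above, the fast gradient sign method (FGSM, a standard technique from the adversarial-examples literature, not this paper's method) perturbs x by ε times the sign of the loss gradient. A minimal NumPy sketch against a toy logistic-regression classifier g (all weights and inputs below are made-up illustration values):

```python
import numpy as np

def g(x, w, b):
    """Toy binary classifier: returns 1 if w.x + b > 0, else 0."""
    return int(x @ w + b > 0)

def fgsm(x, y, w, b, eps):
    """Non-targeted FGSM step: x_adv = x + eps * sign(dL/dx), where L is
    the logistic loss. For logistic regression dL/dx = (p - y) * w, which
    pushes x away from its true class y."""
    p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted probability of class 1
    grad = (p - y) * w                       # gradient of the loss w.r.t. x
    return x + eps * np.sign(grad)

w = np.array([1.0, -2.0])
b = 0.0
x = np.array([1.0, 0.2])   # w.x + b = 0.6 > 0, so g classifies x as 1
y = 1                      # ground-truth label
x_adv = fgsm(x, y, w, b, eps=0.5)   # g(x_adv) flips to 0, so g(x_adv) != y
```

Here the perturbed input satisfies the non-targeted condition g(x_adv) ≠ y from the definition above; a white-box attacker can compute the gradient exactly because it knows w and b.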
3 Proposed Method
In this section, we first introduce the problem and the notation that will be used, and then present our spatially transformed text-based CAPTCHAs method (stCap).

3.1 Problem Definition
For a trained model, x denotes the input CAPTCHA picture, normalized to between −1 and +1, with its corresponding label label = {label_1, label_2, . . . , label_q}, where each label_i, i ∈ [1, q], is a one-hot vector and q indicates the length of the CAPTCHA. y_true = {y_true^1, y_true^2, . . . , y_true^q} represents the ground-truth label of the sample x with q labels, and y_true^i is the index of the number 1 in label_i. C represents the trained classifier, C(x) = y, which has the same shape as y_true. F represents the DNN model, which outputs a score vector F(x) = {f^1, f^2, . . . , f^q}, f^i ∈ [0, 1]^n. Z is the logits, the output of the last layer (without the activation function): Z(x) = {z^1, z^2, . . . , z^q}, z^i ∈ R^n, i ∈ [1, q]. F(x) = σ(Z(x)), where σ is the last layer's activation function, and y^i = arg max(z^i), i ∈ [1, q]. Our goal is to generate an adversarial sample x_adv from the original input x such that C(x_adv) = y′ ≠ y_true (non-targeted attack).

3.2 Spatially Transformed Text-Based CAPTCHAs
For text-based CAPTCHA recognition, the shape of a character is a more useful feature than the color of the character or the background, because color can easily be removed and treated as noise or unrelated information by preprocessing, making shape the most important feature for distinguishing characters [6]. For this reason, adversarial sample generation algorithms that directly modify the pixel values of the original picture perform poorly on text-based CAPTCHAs. Therefore, slightly changing the shape of a character may be a better way to improve the CAPTCHAs' robustness. Under this motivation, we propose the spatially transformed text-based CAPTCHAs (stCap) method, which takes advantage of the spatially transformed adversarial example optimization method (stAdv) [7] to generate text-based CAPTCHAs with higher robustness. The following introduces the two core concepts of stAdv used in our algorithm.

Per-pixel flow field f. f describes the displacement relationship between the pixels of the original image x and the adversarial example x_adv. If x_adv^(i) denotes the value of the i-th pixel of x_adv, (u_adv^(i), v_adv^(i)) is its coordinate position, and f_i = (Δu^(i), Δv^(i)) is the displacement of the i-th pixel, then the position of x_adv's i-th pixel in the original image x is (u^(i), v^(i)) = (u_adv^(i) + Δu^(i), v_adv^(i) + Δv^(i)). Since the obtained coordinates may be fractional, directly using the nearest-neighbor method to obtain pixels would be non-differentiable. Here, a differentiable bilinear interpolation method [8] is used to obtain the pixel values:

x_adv^(i) = Σ_{q ∈ N(u^(i), v^(i))} x^(q) (1 − |u^(i) − u^(q)|)(1 − |v^(i) − v^(q)|)    (1)
N(u^(i), v^(i)) is the set of neighbors of (u^(i), v^(i)): the indices of the 4 surrounding pixels (top-left, top-right, bottom-left, bottom-right). With the per-pixel flow field f, we can change the shape of the characters, thereby improving the robustness of the text-based CAPTCHAs.

Objective function. In order to obtain an appropriate flow field f, we minimize the following objective:

f* = arg min_f  L_adv(x, f) + τ L_flow(f)    (2)
L_adv(x, f) drives the generated adversarial sample to be misclassified by the classifier as a target t. L_flow(f) guarantees the smoothness of the spatial changes. τ is used to balance the two losses. The two losses are defined as follows:

L_adv(x, f) = max(max_{i ≠ t} Z(x_adv)_i − Z(x_adv)_t, κ)    (3)

L_flow(f) = Σ_{all pixels p} Σ_{q ∈ N(p)} √(‖Δu^(p) − Δu^(q)‖₂² + ‖Δv^(p) − Δv^(q)‖₂²)    (4)
L_adv(x, f) uses the same objective function as [9]; κ controls the degree to which the output value of the target class t exceeds the other classes. For L_flow(f), N(p) denotes the indices of the pixels around p. Minimizing this loss makes the pixels around p have a displacement similar in magnitude and direction to that of p, thus smoothing the spatial changes.

Algorithm 1 shows the design and flow of the stCap method. We improve on the stAdv method with two major changes. First, we introduce an integer T to control the interference level of the adversarial text-based CAPTCHAs: the smaller the number, the greater the interference. If T equals 0, the adversarial text-based CAPTCHA will be predicted as the least likely result; conversely, if T is n − 1, where n is the total number of classes for each character, the adversarial text-based CAPTCHA will not cause any interference. Second, in order to speed up the generation of adversarial text-based CAPTCHAs, we use the Adam optimizer [10] to find an approximate optimal solution of Eq. (2), instead of the L-BFGS optimizer used in the original paper [11].
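Equations (1) and (4) can be sketched in NumPy as a plain flow-field warp and its smoothness penalty. This illustrates the two formulas only, not the full optimization loop; the flow layout (`flow[0]` holding Δu, `flow[1]` holding Δv) is an assumption:

```python
import numpy as np

def flow_warp(x, flow):
    """Apply a per-pixel flow field to a grayscale image x (H x W) via the
    differentiable bilinear sampling of Eq. (1): the output pixel at (i, j)
    is sampled from x at (i + du, j + dv)."""
    H, W = x.shape
    out = np.zeros_like(x, dtype=float)
    for i in range(H):
        for j in range(W):
            u = i + flow[0, i, j]  # source row coordinate
            v = j + flow[1, i, j]  # source column coordinate
            u0, v0 = int(np.floor(u)), int(np.floor(v))
            # sum over the 4 neighbours with weights
            # (1 - |u - u_q|)(1 - |v - v_q|), as in Eq. (1)
            for uq in (u0, u0 + 1):
                for vq in (v0, v0 + 1):
                    if 0 <= uq < H and 0 <= vq < W:
                        w = (1 - abs(u - uq)) * (1 - abs(v - vq))
                        out[i, j] += w * x[uq, vq]
    return out

def flow_smoothness(flow):
    """L_flow of Eq. (4): for every pixel, the root of summed squared flow
    differences to its 4-connected neighbours, summed over all pixels."""
    du, dv = flow
    H, W = du.shape
    total = 0.0
    for i in range(H):
        for j in range(W):
            for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                if 0 <= ni < H and 0 <= nj < W:
                    total += np.sqrt((du[i, j] - du[ni, nj]) ** 2
                                     + (dv[i, j] - dv[ni, nj]) ** 2)
    return total
```

With these two pieces, the attack of Eq. (2) amounts to optimizing `flow` so that a classifier mislabels `flow_warp(x, flow)` while `flow_smoothness(flow)` stays small; a zero flow field reproduces the original image exactly and has zero smoothness cost.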
4 Experiment and Analysis

4.1 Experiment Setup
4.1.1 Data Set
We use the Pillow library to generate clean alphanumeric text-based CAPTCHAs with a length of 4, without any distortion or conglutination technology. Each character is independent, making the generated text-based CAPTCHAs easy to read. The difference between characters is reflected in the font style and color: the font style of each character varies among the 40 font styles in Windows 10, and the RGB values of the character color range over [0, 220]. We generated 100,000 text-based CAPTCHAs as the training set, 10,000 as the validation set, and 10,000 as the test set.

Algorithm 1 Spatially Transformed Text-Based CAPTCHAs Method. x is the original text-based CAPTCHA, C is the classifier learned by the network during training, Z is the logits of the DNN model, y_true is the ground truth of the CAPTCHA, iter is the maximum number of iterations, τ balances L_adv and L_flow, κ controls the attack confidence level, lr is the learning rate, and T controls the degree of interference.
Input: x, C, Z, y_true, iter, τ, κ, lr, T
Output: x_adv
1: Initialize x_adv ← x, target ← C(x_adv), f ← 0;
2: for i = 1 to q do
3:    target_i ← the class whose logit in z^i ranks (T + 1)-th from the bottom;
4: end for
5: while iter > 0 do
6:    logits ← Z(x_adv);
7:    L_adv ← L_adv(logits, target, κ);
8:    L_flow ← L_flow(f);
9:    loss ← L_adv + τ L_flow;
10:   use the Adam optimizer with learning rate lr to minimize the loss and update f;
11:   based on the updated flow field f and Eq. (1), generate the updated adversarial text-based CAPTCHA x_adv;
12:   iter ← iter − 1;
13: end while
14: return x_adv

4.1.2 Security Evaluation Criteria
The Attack Success Rate (ASR) describes the proportion of text-based CAPTCHAs correctly identified by the attack model out of the total number. The higher the value, the lower the security and robustness of the CAPTCHAs.

4.1.3 Attack Models
Throughout the experiment, we used four advanced CNN models to identify text-based CAPTCHAs and generated adversarial text-based CAPTCHAs based on the models' information. The four CNN models are LeNet-5 [12], ResNet-50 [13], DenseNet-121 [14] and Wide ResNet-50-2 [15]. At
the same time, in the experiment, we used five preprocessing techniques commonly used in CAPTCHA recognition: Otsu's Binarization [16], Blur, MedianBlur, GaussianBlur and BilateralFilter. We use B to represent Otsu's Binarization. Table 1 shows the performance of the 4 CNN models on clean text-based CAPTCHAs, where Normal indicates that no preprocessing technique is used.

Table 1 ASR (%) of four attack models on the clean text-based CAPTCHAs

Preprocessing         LeNet-5   ResNet-50   DenseNet-121   Wide ResNet-50-2
Normal                97.01     98.13       100            99.06
B                     87.53     88.37       88.37          88.30
Blur + B              86.89     91.44       91.98          91.20
MedianBlur + B        83        86.08       86.25          85.79
GaussianBlur + B      88.69     91.57       91.87          91.39
BilateralFilter + B   82.53     84.94       85.48          84.91

4.2 Adversarial Text-Based CAPTCHAs Based on stCap Method
In the experiment, to keep the generated adversarial text-based CAPTCHAs similar to the original ones while still disturbing the classifier, we set T to −2, i.e., the penultimate class. At the same time, in order to make the classifier predict the adversarial text-based CAPTCHAs as the target classes with high confidence, κ is set to −1000. In order to keep the degree of disturbance small, we set Adam's learning rate to 0.0001. Table 2 shows the final experimental results. When no preprocessing is used and the attack model and the generation model are the same (white-box attack), the stCap method performs well: the ASR of the four attack models is 0.00, 0.23, 5.80 and 4.19%, approximately 90% lower than before. When the attack model and the generation model differ (black-box attack), the ASR of the attack model is about 60%, a decrease of about 35%. When preprocessing is used, if the attack model and the generation model are the same, the stCap algorithm reduces the ASR of the four CNN models by approximately 40%; if not, the ASR drops by approximately 30%. Table 3 shows examples of the adversarial text-based CAPTCHAs. It can be seen that the stCap method enhances the transferability of adversarial text-based CAPTCHAs and also enhances their robustness under preprocessing.
Table 2 ASR values (%) of four attack models on adversarial text-based CAPTCHAs generated by the stCap method (with and without image preprocessing). Columns give the model used to generate the adversarial CAPTCHAs.

Attack model: LeNet-5
Preprocessing         LeNet-5   ResNet-50   DenseNet-121   Wide ResNet-50-2
Normal                0.00      79.26       71.88          76.71
B                     36.74     69.17       62.98          67.47
Blur + B              43.16     64.38       62.16          66.26
MedianBlur + B        37.57     50.60       51.55          54.68
GaussianBlur + B      38.89     69.45       65.76          68.39
BilateralFilter + B   33.89     63.04       57.76          61.86

Attack model: ResNet-50
Preprocessing         LeNet-5   ResNet-50   DenseNet-121   Wide ResNet-50-2
Normal                41.33     0.23        39.06          49.77
B                     43.64     53.41       43.55          52.83
Blur + B              48.60     65.39       62.41          66.33
MedianBlur + B        39.57     47.37       49.35          48.78
GaussianBlur + B      45.88     65.76       65.18          65.48
BilateralFilter + B   40.74     56.13       50.57          56.60

Attack model: DenseNet-121
Preprocessing         LeNet-5   ResNet-50   DenseNet-121   Wide ResNet-50-2
Normal                49.71     69.82       5.80           68.18
B                     42.52     56.36       41.59          56.24
Blur + B              47.81     64.88       64.78          65.69
MedianBlur + B        39.39     47.88       49.27          52.83
GaussianBlur + B      46.89     64.47       64.04          64.48
BilateralFilter + B   39.24     53.09       50.57          55.03

Attack model: Wide ResNet-50-2
Preprocessing         LeNet-5   ResNet-50   DenseNet-121   Wide ResNet-50-2
Normal                40.17     53.27       47.06          4.19
B                     38.40     49.03       43.22          48.99
Blur + B              49.09     62.49       62.16          63.92
MedianBlur + B        37.99     44.19       46.73          48.21
GaussianBlur + B      44.50     62.63       60.45          63.13
BilateralFilter + B   39.77     53.18       49.27          53.61
Table 3 Using ResNet-50 to recognize four adversarial text-based CAPTCHAs generated by the stCap method, with and without preprocessing (GaussianBlur + B). The clean and adversarial CAPTCHA images themselves are omitted here.

True label   Pred. (clean)   Pred. (adversarial, no preproc.)   Pred. (adversarial, GaussianBlur + B)
00qN         00qN            O0qN                               O0qN
00wm         00wm            0Owm                               0Owm
03lB         03lB            03l8                               03l3
08o5         08o5            O8o5                               O8oS

5 Conclusion
In this paper, we propose stCap, an adversarial text-based CAPTCHA generation method based on spatial transformation. In order to evaluate the security of the CAPTCHAs, we used four state-of-the-art CNN models to recognize these adversarial text-based CAPTCHAs both with and without image preprocessing. Experiments show that the proposed stCap method enhances both the transferability of adversarial text-based CAPTCHAs and their robustness under preprocessing, making existing text-based CAPTCHAs more secure.
References
1. Zhang L, Zhang L, Huang S, Shi Z (2011) A highly reliable captcha recognition algorithm based on rejection. Acta Autom Sinica 37(7):891–900
2. Goodfellow IJ, Bulatov Y, Ibarz J, Arnoud S, Shet V (2013) Multi-digit number recognition from street view imagery using deep convolutional neural networks. arXiv preprint arXiv:1312.6082
3. Szegedy C, Zaremba W, Sutskever I, Bruna J, Erhan D, Goodfellow I, Fergus R (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199
4. Kurakin A, Goodfellow I, Bengio S (2016) Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533
5. Papernot N, McDaniel P, Goodfellow I, Jha S, Celik ZB, Swami A (2017) Practical black-box attacks against machine learning. In: Proceedings of the 2017 ACM on Asia conference on computer and communications security. ACM, pp 506–519
6. Shi C, Xu X, Ji S, Bu K, Chen J, Beyah R, Wang T (2019) Adversarial captchas. arXiv preprint arXiv:1901.01107
7. Xiao C, Zhu JY, Li B, He W, Liu M, Song D (2018) Spatially transformed adversarial examples. arXiv preprint arXiv:1801.02612
8. Jaderberg M, Simonyan K, Zisserman A et al (2015) Spatial transformer networks. In: Advances in neural information processing systems, pp 2017–2025
9. Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In: 2017 IEEE symposium on security and privacy (SP). IEEE, pp 39–57
10. Kingma DP, Ba J (2014) Adam: a method for stochastic optimization. arXiv preprint arXiv:1412.6980
11. Liu DC, Nocedal J (1989) On the limited memory BFGS method for large scale optimization. Math Progr 45(1–3):503–528
12. LeCun Y, Bottou L, Bengio Y, Haffner P et al (1998) Gradient-based learning applied to document recognition. Proc IEEE 86(11):2278–2324
13. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
14. Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 4700–4708
15. Zagoruyko S, Komodakis N (2016) Wide residual networks. arXiv preprint arXiv:1605.07146
16. Otsu N (1979) A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 9(1):62–66
Battery Capacity Multi-step Prediction on GRU Attention Network

Jiazhi Huo, Yu Tang(B), and Di Lin

School of Information and Software Engineering, University of Electronic Science and Technology of China, 610054 Chengdu, China
[email protected]
Abstract. The prediction of battery capacity plays an important role in estimating battery life information, and many studies use battery capacity as a standard indicator of battery life. However, a battery's capacity fades irrecoverably due to electrochemical reactions, and existing lithium-ion battery capacity prediction technologies cannot effectively capture the data relationships between different charge/discharge cycles. The proposed method uses the attention mechanism to address this problem, together with an asymmetric loss function suitable for capacity prediction. We use the first 50% of the cycles as training data and then predict the battery capacity after multiple cycles, effectively reducing the dependence of battery capacity prediction on experimental data and improving accuracy over traditional recurrent network methods.
Keywords: Capacity prediction · Attention mechanism · GRU · Lithium battery

1 Introduction
With the development of electric cars and mobile phones, batteries have played an increasingly important role as the energy source. In order to maintain peak device performance, the battery must be in good condition. However, because of physical effects such as solid electrolyte interphase growth [7], batteries cannot stay healthy for long: after a number of charge and discharge cycles, key attributes such as battery capacity and voltage gradually degrade. For example, the maximum capacity of a laptop's battery decreases at a rate of about 10% per year. Battery production and delivery must go through quality inspection to obtain capacity degradation information for users' reference; however, because a battery has a long degradation period, each quality test may take a long time to yield experimental data. If the capacity of the battery can be accurately predicted, it will provide a valuable reference for battery users and producers.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_6

For the user, it can prompt the user to replace the
battery early, which can effectively prevent loss of equipment performance and accidents. For the manufacturer, it not only helps to find battery problems as early as possible during research and development, but can also speed up battery production quality inspection. Therefore, battery capacity prediction is an important part of battery health management. In this paper, we propose a neural network based on an attention mechanism and a GRU network for battery capacity prediction. We also use an asymmetric loss function that is suitable for capacity estimation. The remainder of this paper is organized as follows. In Sect. 2, we review related work, and in Sect. 3, we describe the proposed method. Then, we present and analyze the experimental results in Sect. 4. We conclude in Sect. 5.
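The exact loss used by the proposed method is not reproduced here; one common form of an asymmetric squared loss for capacity prediction, which penalizes over-estimating the remaining capacity more than under-estimating it (over-estimation is riskier for the user), can be sketched as follows. The weighting scheme is an assumption for illustration only:

```python
def asymmetric_mse(y_true, y_pred, over_weight=2.0):
    """Mean squared error that weights over-prediction (y_pred > y_true)
    more heavily than under-prediction."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = p - t
        w = over_weight if err > 0 else 1.0  # heavier penalty for predicting too high
        total += w * err * err
    return total / len(y_true)
```

With `over_weight > 1`, the optimizer is nudged toward conservative capacity estimates, which matches the safety argument above.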
2 Related Work
Battery health management is generally studied through the following indicators: RUL (remaining useful life), SoH (state of health), SoC (state of charge), and impedance [12]. Apart from impedance, these are generally measured through battery capacity. Capacity prediction methods for batteries fall into two main categories: model-based methods and data-driven methods [13].

The model-based methods mainly use existing mathematical tools to define an equation for capacity degradation. Olivares et al. [6] proposed a particle filter algorithm for predicting capacity. Richardson et al. [8] used Gaussian process regression combined with different kernel functions to predict capacity. Although these methods estimate battery life well, two problems remain. First, there are different types of batteries, and a model suited to one battery type does not necessarily perform well on others. Second, the filter methods are limited by the particle degeneracy problem [5].

To make prediction models more accurate and general, data-driven methods are used, which only need the degradation data from the battery. These methods use a learning algorithm to extract relationships and characteristics from the data, so neural networks are increasingly used in capacity prediction. Li et al. [4] use a deep convolutional neural network with five convolution layers to predict capacity. The recurrent neural network is also widely used. Chaoui et al. [1] feed the actual current, voltage, and temperature of the battery into a traditional RNN as input to predict the capacity of the next cycle. Zhang et al. [13] replace the traditional RNN with an LSTM with long- and short-term memory ability. By using the LSTM, the input pressure on the network is reduced, and the vanishing and exploding gradient problems in training are alleviated.
3 Proposed Method
Battery Capacity Multi-step Prediction on GRU Attention Network

The overall structure of the algorithm is shown in Fig. 1. First, we feed five cycles of data into the recurrent GRU network as one input sequence, and then calculate an attention weight for the output of each hidden layer. The output h_t of the recurrent network is then multiplied by the attention weight calculated by the attention mechanism. Finally, the result is passed through a fully connected layer to produce the predicted value we need.
Fig. 1. Overall structure of the proposed network
3.1 Network

3.1.1 GRU Network
GRU stands for Gated Recurrent Unit, a form of unit in a recurrent network and a simplified version of the LSTM. Compared with the LSTM structure with three gates, the GRU has only two gates, namely the update gate z_t and the reset gate r_t. Similarly, it has the hidden state h_t. Its internal calculations are given by the following formulas:

r_t = σ(W_r · [h_{t−1}, x_t])   (1)

z_t = σ(W_z · [h_{t−1}, x_t])   (2)

h̃_t = tanh(W_h̃ · [r_t ∗ h_{t−1}, x_t])   (3)

h_t = (1 − z_t) ∗ h_{t−1} + z_t ∗ h̃_t   (4)
The GRU retains two gates, giving it the advantages of the LSTM. Furthermore, it is better suited to inputs with small amounts of data and low data dimensionality, which is exactly the characteristic of battery cycle data. At the same time, because its network structure is simpler, it is also faster than the LSTM.
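As a minimal illustration of Eqs. (1)–(4), a single GRU step can be sketched in NumPy (the weight shapes, random initialization, and toy capacity values below are illustrative, not the parameters used in our experiments; biases are omitted for brevity):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_r, W_z, W_h):
    """One GRU step following Eqs. (1)-(4).

    Each weight matrix acts on the concatenation [h_{t-1}, x_t].
    """
    hx = np.concatenate([h_prev, x_t])           # [h_{t-1}, x_t]
    r_t = sigmoid(W_r @ hx)                      # reset gate, Eq. (1)
    z_t = sigmoid(W_z @ hx)                      # update gate, Eq. (2)
    h_tilde = np.tanh(W_h @ np.concatenate([r_t * h_prev, x_t]))  # Eq. (3)
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde   # Eq. (4)
    return h_t

# Toy dimensions: 1 input feature (capacity), 4 hidden units.
rng = np.random.default_rng(0)
W_r = rng.normal(size=(4, 5))
W_z = rng.normal(size=(4, 5))
W_h = rng.normal(size=(4, 5))
h = np.zeros(4)
for capacity in [1.08, 1.07, 1.06, 1.05, 1.04]:  # five cycles of capacity
    h = gru_step(np.array([capacity]), h, W_r, W_z, W_h)
```

Because h_t is a convex combination of h_{t−1} and a tanh output, the hidden state stays bounded in (−1, 1), which is one reason the GRU trains stably.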
3.1.2 Attention Mechanism
The attention mechanism was first proposed in the field of visual images. It improves model accuracy by computing different weights over the input sequence. Its outstanding performance in NLP applications has inspired many scholars to apply it to time series prediction. Shih et al. [11] use the attention mechanism for electricity consumption, solar power production, and polyphonic piano pieces, achieving better performance than previous methods. In this paper, we use additive attention to capture the relationships within the data. The structure is shown in Fig. 2.
Fig. 2. Additive attention structure. The output of the GRU layer passes through the attention mechanism to form the output of the attention layer
Firstly, every pair of hidden vectors is combined: h_{t,t'} = tanh(h_t^T W_t + h_{t'}^T W_x + b_h). Each combined vector is then scored, e_{t,t'} = W_a h_{t,t'} + b_a, and the scores are put through the softmax function to obtain a weight distribution a_t = softmax(e_t). Lastly, the current output is the weighted sum of hidden vectors, l_t = Σ_{t'} a_{t,t'} h_{t'}.
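A sketch of this additive attention step over a matrix of GRU hidden states (parameter shapes and random values are illustrative):

```python
import numpy as np

def additive_attention(H, W_t, W_x, b_h, W_a, b_a):
    """Additive attention over hidden states H (shape T x d).

    For each query position t: combine h_t with every h_t', score the
    combinations, softmax the scores, and return the weighted sum l_t.
    """
    T, d = H.shape
    out = np.zeros_like(H)
    for t in range(T):
        # h_{t,t'} = tanh(h_t W_t + h_{t'} W_x + b_h) for every t'
        h_pair = np.tanh(H[t] @ W_t + H @ W_x + b_h)  # (T, k)
        e = h_pair @ W_a + b_a                        # scores e_{t,t'}, (T,)
        a = np.exp(e - e.max())
        a /= a.sum()                                  # softmax weights a_{t,t'}
        out[t] = a @ H                                # l_t = sum_{t'} a_{t,t'} h_{t'}
    return out

rng = np.random.default_rng(1)
T, d, k = 5, 4, 3
H = rng.normal(size=(T, d))
out = additive_attention(H, rng.normal(size=(d, k)), rng.normal(size=(d, k)),
                         rng.normal(size=k), rng.normal(size=k), 0.0)
```

Since the weights a_{t,t'} are a softmax distribution, each output l_t is a convex combination of the hidden states, so it stays inside their coordinate-wise range.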
3.2 Asymmetric Loss Function
Because of the specification of battery capacity, we need the capacity prediction to err on the safe side. The main purpose of capacity prediction is to prevent accidents caused by battery scrap, so we want to train the network so that the predicted capacity is slightly smaller than the actual value. Conventional loss functions such as MSE and MAE impose no extra penalty when the predicted value exceeds the actual value, so the predicted capacity may be larger than the actual capacity, which is not conducive to battery health management. Therefore, we use an asymmetric loss function that penalizes predictions above the actual value more heavily. Ellefsen et al. [2] used the scoring function proposed by Saxena et al. [9] to train a network for turbofan engine RUL prediction. The scoring function is as follows, where C means capacity and a_1 and a_2 are two penalty factors; setting a_1 > a_2 makes the network train in the direction we need:

d = C_pred − C_actual

SF = e^{−d/a_1}, d < 0;  e^{d/a_2}, d ≥ 0,  a_1, a_2 ∈ R+   (5)

Elsheikh et al. [3] proposed the asymmetric squared loss function to fix the disadvantages of the scoring function, which is difficult to train and requires a large amount of exponential calculation. We use this loss function in our method. The asymmetric squared error is:

ASE = α_1 d^2, d < 0;  α_2 d^2, d ≥ 0,  α_1, α_2 ∈ R+   (6)
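A minimal sketch of the asymmetric squared error of Eq. (6); the concrete factor values here are illustrative, and the only requirement is that over-prediction (the unsafe case for battery health management) carries the larger factor:

```python
import numpy as np

def ase_loss(pred, actual, a_neg=1.0, a_pos=5.0):
    """Asymmetric squared error, Eq. (6).

    d = pred - actual.  Over-prediction (d >= 0) is the unsafe case,
    so it gets the larger penalty factor a_pos (values illustrative).
    """
    d = np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)
    return float(np.where(d < 0, a_neg * d**2, a_pos * d**2).mean())

# Over-predicting by 0.05 costs 5x more than under-predicting by 0.05.
under = ase_loss([0.95], [1.00])   # d = -0.05
over = ase_loss([1.05], [1.00])    # d = +0.05
```

Because the loss is piecewise quadratic, it remains smooth away from d = 0 and differentiable enough for gradient-based training, avoiding the large exponentials of the scoring function.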
3.3 Data Preparation
The battery dataset used in our experiments is provided by Severson et al. [10]. The full dataset consists of 124 commercial lithium-ion batteries cycled to failure under fast-charging conditions, covering a variety of charge and discharge policies. Each battery is cycled from new until its capacity falls below 80% of the rated capacity. For example, the first sample of the dataset, which we call cell 0, has a charge policy marked as 1C(4%)-6C: cell 0 was charged from 0 to 4% at a rate of 1C (1C is 1.1 A), then from 4 to 80% at a rate of 6C. All cells were then charged from 80 to 100% at 1C with a current cutoff of C/50, and all cells were subsequently discharged at 4C [10]. We use the discharge capacity as the actual battery capacity. In this paper, we use the 'Batch-2017-06-30' portion of the dataset, containing 48 cells, as our experimental data. As the cycle number increases, the capacity of the cells decreases, as shown in Fig. 3.
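The end-of-life truncation described above (capacity below 80% of rated) can be found with a simple scan; the helper and the toy degradation curve below are a sketch (the dataset description gives 1C = 1.1 A, and capacities here are in mAh for illustration):

```python
def end_of_life_cycle(capacities, rated=1.1, threshold=0.8):
    """Return the first cycle index whose discharge capacity drops
    below threshold * rated capacity; None if never reached."""
    limit = threshold * rated
    for cycle, c in enumerate(capacities):
        if c < limit:
            return cycle
    return None

# Toy degradation curve in mAh: fades 3 mAh per cycle from 1100 mAh.
caps = [1100 - 3 * n for n in range(200)]
eol = end_of_life_cycle(caps, rated=1100)
```

In the real dataset, the record for each cell simply ends at this cycle, which is why the four cells in Table 1 have different cycle counts.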
4 Experiment
In this section, we use a one-layer LSTM and a one-layer GRU as baselines. We select some of the 48 cells from the dataset to train and predict: 50% of the cycle data is used for training, and we predict the capacity of the remaining 50%. For example, cell 0 contains 326 cycles of data, so we use 163 cycles to train and then predict the remaining capacity. As mentioned above, we input five cycles of data into the network as one sequence. We predict the battery capacity at the sixth cycle, and then feed the predicted capacity back as part of the input of the next sequence. We use RMSE and MAPE to measure the accuracy of capacity prediction.

Fig. 3. Capacity degradation curve. The color of each curve represents the capacity change of each battery

The full name of RMSE is root mean square error. It directly represents the gap between the predicted value and the real value. Its calculation formula is as follows:

RMSE = sqrt( (1/N) Σ_{t=1}^{N} (observed_t − predicted_t)^2 )   (7)

The full name of MAPE is mean absolute percentage error. MAPE considers not only the error between the predicted and real values but also the proportion of the error to the real value. Its calculation formula is shown in (8):

MAPE = (100/n) Σ_{t=1}^{n} |observed_t − predicted_t| / observed_t   (8)

We selected four batteries for the experiments: cell 0, cell 6, cell 12, and cell 18. These four batteries use different charging strategies and have different numbers of cycles. The specific battery information is as follows:

Table 1. Battery sample information

                Cell 0      Cell 6      Cell 12     Cell 18
  Charge policy 1C(4%)-6C   3.6C(%)-6C  4C(31%)-5C  4.4C(47%)-5.5C
  Cycles        326         545         491         523
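The two metrics of Eqs. (7) and (8) translate directly into code (a straightforward sketch; the sample values are illustrative):

```python
import numpy as np

def rmse(observed, predicted):
    """Root mean square error, Eq. (7)."""
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    return float(np.sqrt(np.mean((observed - predicted) ** 2)))

def mape(observed, predicted):
    """Mean absolute percentage error in percent, Eq. (8)."""
    observed, predicted = np.asarray(observed), np.asarray(predicted)
    return float(100.0 * np.mean(np.abs((observed - predicted) / observed)))

obs = [1.00, 0.98, 0.96]    # actual capacities (illustrative)
pred = [1.01, 0.97, 0.96]   # predicted capacities
err_rmse = rmse(obs, pred)
err_mape = mape(obs, pred)
```

RMSE reports error in the units of capacity, while MAPE normalizes by the observed value, so the two together separate absolute from relative accuracy.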
The number of neurons in the GRU and LSTM networks is 100; the GRU has 30,701 parameters and the LSTM has 40,901. For the proposed network, the GRU layer has 100 neurons and the attention layer has 50 neurons, for 40,802 parameters in total. We use the ASE as the loss function and RMSprop as the optimizer. Figure 4 shows the capacity prediction curves for the four cells. Table 2 shows the RMSE and MAPE of the two baseline methods and the proposed GRU-Attention method on the four cells.
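The iterative multi-step scheme used in the experiments (predict the sixth cycle, then feed the prediction back into the next five-cycle input window) can be sketched independently of the network; `model_step` here is a hypothetical stand-in for the trained one-step predictor, implemented as a simple linear extrapolation for illustration:

```python
def multi_step_forecast(model_step, history, n_steps, window=5):
    """Recursive multi-step prediction: repeatedly predict the next
    capacity from the last `window` values, then append the prediction
    so it becomes part of the next input sequence."""
    seq = list(history)
    preds = []
    for _ in range(n_steps):
        next_cap = model_step(seq[-window:])   # one-step prediction
        preds.append(next_cap)
        seq.append(next_cap)                   # feed the prediction back
    return preds

# Hypothetical stand-in for the trained network: linear extrapolation.
def model_step(window_vals):
    slope = (window_vals[-1] - window_vals[0]) / (len(window_vals) - 1)
    return window_vals[-1] + slope

history = [1.10, 1.09, 1.08, 1.07, 1.06]   # last five known capacities
future = multi_step_forecast(model_step, history, n_steps=3)
```

Because each predicted value is reused as input, errors compound over the horizon, which explains the larger deviations in the last cycles noted in Sect. 5.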
Fig. 4. Different methods for capacity prediction in the last 50% of the life cycle: (a) cell 0, (b) cell 6, (c) cell 12, (d) cell 18
Table 2. RMSE and MAPE in capacity prediction for four cells by different methods

                   Cell 0           Cell 6           Cell 12          Cell 18
                   RMSE    MAPE(%)  RMSE    MAPE(%)  RMSE    MAPE(%)  RMSE    MAPE(%)
  LSTM             0.0285  6.97     0.2230  20.18    0.1093  12.37    0.1075  12.5
  GRU              0.0603  5.42     0.0546  6.62     0.0541  6.77     0.0562  5.71
  GRU + Attention  0.0228  5.09     0.0451  7.31     0.0168  6.24     0.0209  4.51
As we can see, the proposed network best fits the curve of battery capacity degradation. The GRU fits the curve shape poorly on all four batteries, and the LSTM fits well at the beginning but deviates markedly in the last few cycles.
5 Conclusion
As an important part of battery health management, battery capacity prediction can provide powerful data support for users and producers. This paper presents a data-driven battery capacity prediction method combining a GRU recurrent network with an attention mechanism. At the same time, an asymmetric loss function is used to make the capacity prediction safer. As shown in Table 2, the proposed network is more accurate than the traditional LSTM and GRU in capacity prediction, with lower MAPE and RMSE. A traditional recurrent network can capture the vertical characteristics of the data but not the horizontal ones; combined with the attention mechanism, a weight analysis of the horizontal data can be carried out, so that the predicted value draws on the hidden-layer vectors of the GRU network according to the attention weights. It is meaningful to use less battery cycle data for capacity prediction; for example, the algorithm in this paper uses only the first 50% of the life-cycle data. But using less data causes problems: training on 50% of the life cycle and then iterating on predicted values leads to larger prediction errors in the last few cycles. This is an unavoidable issue when using a data-driven approach for prediction, and we will look for ways to address it in future work.
References

1. Chaoui H, Ibe-Ekeocha CC (2017) State of charge and state of health estimation for lithium batteries using recurrent neural networks. IEEE Trans Veh Technol 66(10):8773–8783
2. Ellefsen AL, Bjørlykhaug E, Æsøy V, Ushakov S, Zhang H (2019) Remaining useful life predictions for turbofan engine degradation using semi-supervised deep architecture. Reliab Eng Syst Saf 183:240–251
3. Elsheikh A, Yacout S, Ouali MS (2019) Bidirectional handshaking LSTM for remaining useful life prediction. Neurocomputing 323:148–156
4. Li X, Ding Q, Sun JQ (2018) Remaining useful life estimation in prognostics using deep convolution neural networks. Reliab Eng Syst Saf 172:1–11
5. Liu Z, Sun G, Bu S, Han J, Tang X, Pecht M (2016) Particle learning framework for estimating the remaining useful life of lithium-ion batteries. IEEE Trans Instrum Meas 66(2):280–293
6. Olivares BE, Munoz MAC, Orchard ME, Silva JF (2012) Particle-filtering-based prognosis framework for energy storage devices with a statistical characterization of state-of-health regeneration phenomena. IEEE Trans Instrum Meas 62(2):364–376
7. Pinson MB, Bazant MZ (2013) Theory of SEI formation in rechargeable batteries: capacity fade, accelerated aging and lifetime prediction. J Electrochem Soc 160(2):A243–A250
8. Richardson RR, Birkl CR, Osborne MA, Howey DA (2018) Gaussian process regression for in situ capacity estimation of lithium-ion batteries. IEEE Trans Ind Inf 15(1):127–138
9. Saxena A, Goebel K, Simon D, Eklund N (2008) Damage propagation modeling for aircraft engine run-to-failure simulation. In: 2008 international conference on prognostics and health management. IEEE, pp 1–9
10. Severson KA, Attia PM, Jin N, Perkins N, Jiang B, Yang Z, Chen MH, Aykol M, Herring PK, Fraggedakis D et al (2019) Data-driven prediction of battery cycle life before capacity degradation. Nat Energy 4(5):383
11. Shih SY, Sun FK, Lee HY (2019) Temporal pattern attention for multivariate time series forecasting. Mach Learn 108(8–9):1421–1441
12. Waag W, Fleischer C, Sauer DU (2014) Critical review of the methods for monitoring of lithium-ion batteries in electric and hybrid vehicles. J Power Sources 258:321–339
13. Zhang Y, Xiong R, He H, Pecht MG (2018) Long short-term memory recurrent neural network for remaining useful life prediction of lithium-ion batteries. IEEE Trans Veh Technol 67(7):5695–5705
Heterogeneous Network Selection Algorithm Based on Reinforcement Learning Sheng Yu1(B) , Shou-Ming Wei1,2 , Wei-Xiao Meng1,2 , and Chen-Guang He1,2 1 Communication Research Center, Harbin Institute of Technology, Harbin, China
[email protected], [email protected] 2 Key Laboratory of Police Wireless Digital Communication, Ministry of Public Security,
Harbin, People’s Republic of China
Abstract. In the current network environment, multiple wireless access technologies coexist. To meet the needs of various services in heterogeneous networks, enable each user to select the most appropriate network, and adapt to dynamic changes in the network environment, this paper defines the Markov decision process of network selection based on reinforcement learning. Taking a heterogeneous network built from PDT and B-trunC as the background, a network access control algorithm based on reinforcement learning is proposed, which fully considers the service type of the session and the mobility of the terminal and achieves adaptive network selection.

Keywords: Heterogeneous network · Reinforcement learning · Mobility
1 Introduction

In the current network environment, many kinds of wireless access technologies coexist. Because of the overlapping coverage of different networks, different business requirements, and complementary technical characteristics, it is particularly important to coordinate heterogeneous wireless network resources. For this reason, many joint wireless resource management methods have been proposed to achieve load balancing and heterogeneous network selection [1–3]. However, many algorithms cannot adapt to a changing network environment; to cope with rapid changes, the strategy should be modified continually to achieve self-management of network resources.

Reinforcement learning [4] is an algorithm in which agents learn by interacting with the environment. It does not need a training set prepared in advance: the training data is generated automatically through interaction with the environment, and performance is evaluated by the accumulation of rewards, so as to reach an optimal decision. At present, reinforcement learning is widely used in robotics and automatic control [5, 6]. Because it can automatically adjust the algorithm according to changes in the environment, it has also been introduced into resource management for wireless communication systems.

Q-learning is an RL method in which the learning agent gradually builds a Q function estimating the future discounted cost, so that the agent can take appropriate actions in the current state. Q-learning has already been applied to the selection of heterogeneous wireless networks. References [7, 8] study joint Q-learning algorithms for network access control, but those algorithms do not distinguish the attributes of the access service. Reference [9] discusses an autonomous joint resource management algorithm based on Q-learning; although service attributes are considered, differences in terminal mobility are not. In view of these problems, this paper proposes a Q-learning-based heterogeneous network access control algorithm that selects the appropriate network for each session according to the service type, terminal mobility, and network load status.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_7
2 Q-learning Algorithm Model

A reinforcement learning system includes the following basic elements. The first is the state of the environment, S: the state S_t of the environment at time t is one state in its state set. The second is the individual action, A. The third is the environmental reward, R. The fourth is the individual policy π, which represents the basis on which an individual takes actions. The relationship between these elements is shown in Fig. 1. The agent selects an action to execute according to the current environment and its own policy; after the action is executed, it obtains an immediate reward and reaches a new state. This is the operating process of reinforcement learning.
Fig. 1 Schematic diagram of reinforcement learning
Define the policy as a probability distribution over actions given the state:

π(a|s) = P[A_t = a | S_t = s]   (1)
In the considered problem, the policy π determines which network should be chosen as the next action conditioned on the current state. Define

Q_π(s, a) = E_π[G_t | S_t = s, A_t = a]   (2)

as the action-value function of the MDP under policy π: the expected cumulative discounted reward starting from s, taking action a, and thereafter following π. The objective of the algorithm is to find a policy that maximizes the action-value function:

Q*(s, a) = max_π Q_π(s, a)   (3)

The optimal policy can be found from the optimal action-value function Q*(s, a) as

π*(a|s) = 1 if a = argmax_{a∈A} Q*(s, a), and 0 otherwise   (4)

One possible way to obtain the optimal action-value function Q*(s, a) is Q-learning, which can be implemented iteratively as

Q(s_t, a_t) ← Q(s_t, a_t) + α[R_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)]   (5)
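Equations (4) and (5) translate directly into code on a tabular Q (a minimal sketch; the state/action sizes and reward are illustrative):

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.8):
    """One Q-learning update on a tabular Q, Eq. (5)."""
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])

def greedy_policy(Q, s):
    """Deterministic optimal action under the current Q, Eq. (4)."""
    return int(np.argmax(Q[s]))

Q = np.zeros((3, 2))   # 3 example states, 2 actions (PDT, B-trunC)
q_update(Q, s=0, a=1, r=10.0, s_next=1)
```

With α = 0.5 and an all-zero table, a reward of 10 moves Q[0, 1] halfway toward the target, after which the greedy policy in state 0 already prefers action 1.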
3 System Model

The environment considered in this paper is composed of B-trunC and PDT. Only two types of business are considered in the model: voice and data. B-trunC and PDT have different characteristics: the B-trunC network is better suited to high-bandwidth data services, while the PDT network, with its real-time and continuous characteristics, is better suited to voice services. From the perspective of terminal mobility, the PDT network is more suitable for high-speed mobile terminals, while the B-trunC network, because of its smaller coverage, is more suitable for low-speed mobile or fixed terminals. Combined with this model, the JRRM controller in the system uses the Q-learning algorithm to select the appropriate access network according to the service type, terminal mobility, network load, and other conditions of the accessing terminal.

Considering the different services requested by users, we code them separately: when a voice service is selected, B = [1, 0]; when a data service is selected, B = [0, 1]. Similarly, M represents the mobility situation of the terminal. L represents the network load condition in the system, that is, the proportion of used resources to total resources in each network, L = {L1, L2}. To satisfy the limitation that Q-learning can only handle discrete data types, L1 and L2 are quantized into several levels to construct the Q-value table, so the state set can be defined as:

S = {B, M, L}   (6)
In the wireless heterogeneous network, the JRRM controller acts as the learning agent and selects the appropriate access network for the terminal according to the state and learning experience. According to the model, for the wireless heterogeneous network composed of the two networks PDT and B-trunC, the action set is defined as:

A = {1, 2}   (7)
Among the actions, 1 indicates selecting the PDT network and 2 indicates selecting the B-trunC network. The return function evaluates the immediate return after a session request is admitted in a certain state S. Considering how well the request type, terminal mobility, and other factors match the different networks, the return function is defined as:

r = β × (η(b, k) + η(m, k))   (8)

where η(b, k) and η(m, k) denote the matching coefficients between the service type b, respectively the mobility state m, and the selected network k, and β is a scaling factor.
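A sketch of the state encoding of Eq. (6) and the return function of Eq. (8), using the matching coefficients of Table 1 in Sect. 5 as the η values (reading η as those table coefficients, and taking β = 1, are our assumptions for illustration):

```python
# Assumed matching coefficients, following Table 1 (Sect. 5).
# Networks: 1 = PDT, 2 = B-trunC.
ETA = {
    ("voice", 1): 5, ("voice", 2): 1,
    ("data", 1): 1, ("data", 2): 5,
    ("mobile", 1): 5, ("mobile", 2): 1,
    ("static", 1): 1, ("static", 2): 5,
}

def make_state(service, mobility, load_pdt, load_btrunc, levels=10):
    """State s = (B, M, L): service code, mobility code, quantized loads."""
    b = (1, 0) if service == "voice" else (0, 1)
    m = (1, 0) if mobility == "mobile" else (0, 1)
    quant = lambda x: min(int(x * levels), levels - 1)  # discretize load
    return (b, m, quant(load_pdt), quant(load_btrunc))

def reward(service, mobility, network, beta=1.0):
    """r = beta * (eta(b, k) + eta(m, k)), Eq. (8)."""
    return beta * (ETA[(service, network)] + ETA[(mobility, network)])
```

Under this encoding, a mobile voice session gets the highest return from PDT and a static data session from B-trunC, which is exactly the preference the learning should converge to.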
4 Algorithm Implementation Process

After considering the network load, service type, and terminal mobility, the Q-learning algorithm is applied to heterogeneous access selection. The online Q-learning algorithm is as follows.

Input: number of iterations T, state set S, action set A, step size α, attenuation factor γ, exploration rate ε.
Output: the Q values of all states and actions.

1. Randomly initialize the value Q for all states and actions; the Q value of the termination state is initialized to 0.
2. For i from 1 to T:
   (a) Initialize S as the first state of the current state sequence
   (b) Select action A in the current state S with the ε-greedy algorithm
   (c) Execute the current action A in state S to obtain the new state S' and reward R
   (d) Update the value function Q(S, A):
       Q(s_t, a_t) ← Q(s_t, a_t) + α[R_{t+1} + γ max_a Q(s_{t+1}, a) − Q(s_t, a_t)]
   (e) S = S'
   (f) If S is the termination state, the current iteration is complete; otherwise, go to step (b).
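The steps above can be sketched as a training loop on a toy environment (the two-state `env_step` below is a hypothetical stand-in for the JRRM simulator, not the paper's environment):

```python
import random

random.seed(0)  # fixed seed for reproducibility

def train_q(env_step, init_state, n_iters, n_states, n_actions,
            alpha=0.5, gamma=0.8, eps=0.5, max_steps=100):
    """Tabular online Q-learning with eps-greedy action selection."""
    Q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iters):
        s = init_state                                    # step (a)
        for _ in range(max_steps):
            if random.random() < eps:                     # step (b): explore
                a = random.randrange(n_actions)
            else:                                         # step (b): exploit
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s_next, r, done = env_step(s, a)              # step (c)
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])  # (d)
            s = s_next                                    # step (e)
            if done:                                      # step (f)
                break
    return Q

# Hypothetical two-state environment: action 1 in state 0 pays off.
def env_step(s, a):
    return 1, (10.0 if (s == 0 and a == 1) else 0.0), True

Q = train_q(env_step, init_state=0, n_iters=200, n_states=2, n_actions=2)
```

After a few hundred iterations, Q[0][1] approaches the reward of 10 while Q[0][0] stays at 0, so the greedy policy settles on the rewarding action.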
5 Simulation Results

In the simulation model, we consider both the initiation of new sessions and the arrival of service requests in the overlapping area. The inter-arrival time of sessions follows an exponential distribution with mean 20 s, and the service duration follows an exponential distribution with mean 80 s. Only voice and data services are considered; a voice service requests 1–2 resource blocks, a data service requests 3–5 resource blocks, and the network load L is uniformly quantized into 10 levels. See Table 1 for the other simulation parameters.
Table 1 Simulation parameter settings

  Parameter                           PDT  B-trunC
  Total number of resource blocks     50   100
  Voice service matching coefficient  5    1
  Data service matching coefficient   1    5
  Mobile matching coefficient         5    1
  Matching coefficient at rest        1    5
Among the other parameters, the discount factor γ = 0.8, the initial learning rate α = 0.5, and the initial exploration probability ε = 0.5. Figure 2 shows the distribution of the two services across the two networks before and after learning. It can be seen clearly that in the initial stage of the simulation, the two services are distributed almost evenly across the two networks, which is obviously not the result we want; in the end, however, most voice services choose to access PDT and most data services choose to access B-trunC.
Fig. 2 Percentage of users choosing different networks in different situations
The distribution of the two kinds of mobile users across the two networks before and after learning shows that the two kinds of users are distributed almost evenly in the initial stage. However, as learning progresses, the best action-selection strategy is adjusted continuously: among B-trunC users there are more static terminals than mobile terminals, and among PDT users there are more mobile terminals than static ones.

Figures 3 and 4 compare the influence of three algorithms on the service distribution in the PDT and B-trunC networks. The load difference in Fig. 3 refers to the load-ratio difference between voice and data; the load difference in Fig. 4 refers to the load-ratio difference between static and mobile services. It can be seen from Figs. 3 and 4 that when the RAA and LBA algorithms are adopted, the differences between the voice and data load proportions of the PDT and B-trunC networks, as well as between the static and mobile service load proportions, show a certain randomness but all stay within 5%, so these algorithms cannot distinguish services well. The QLA algorithm can adapt to the properties of the services and the mobility of the users, so that the different access networks can exploit their technical advantages to meet the needs of different users.
Fig. 3 Load ratio difference between voice and data services
Figure 5 shows how the blocking rate of the whole network environment changes with algorithm iterations. As shown in the figure, as the number of iterations increases, the blocking rate of the system trends downward and gradually stabilizes. The QLA algorithm not only adapts to the service attributes and user mobility but also reduces the blocking rate, which is always lower than that of random access.
Fig. 4 Load ratio difference between static and mobile services

Fig. 5 The change of blocking rate with the number of iterations

6 Conclusions

In the wireless heterogeneous network environment composed of PDT and B-trunC, this paper proposes a heterogeneous wireless network access selection algorithm based on Q-learning, which fully considers the load of the network, the business attributes of the initiating session, and the mobility of the terminal, so that the JRRM controller can reasonably allocate each session to the most appropriate network according to the characteristics of each network.

Acknowledgements. This paper is supported by the National Key R&D Program of China (No. 2018YFC0807101).
References

1. Serrador A, Carniani A, Corvino V (2011) Radio access to heterogeneous wireless networks through JRRM strategies. In: 2010 Future network and mobile summit. IEEE
2. Giupponi L, Agustí R, Pérez-Romero J A framework for JRRM with resource reservation and multiservice provisioning in heterogeneous networks. Mobile Netw Appl 11(6):825–846
3. Coupechoux M, Kelif J-M, Godlewski P (2008) SMDP approach for JRRM analysis in heterogeneous networks. In: Wireless conference. EW 2008. 14th European. IEEE
4. Imanberdiyev N, Fu C, Kayacan E (2016) Autonomous navigation of UAV by using real-time model-based reinforcement learning. In: International conference on control, automation, robotics and vision (ICARCV). IEEE
5. Cuayahuitl H, Lee D, Ryu S (2019) Deep reinforcement learning for chatbots using clustered actions and human-likeness rewards. In: 2019 international joint conference on neural networks (IJCNN)
6. Moghadam M, Elkaim GH (2019) A hierarchical architecture for sequential decision making in autonomous driving using deep reinforcement learning. In: Real-world sequential decision making workshop at ICML 2019
7. Haddad M, Altman Z, Elayoubi SE et al (2011) A Nash-Stackelberg fuzzy Q-learning decision approach in heterogeneous cognitive networks. In: Global telecommunications conference. IEEE
8. Simsek M, Galindo-Serrano A, Czylwik A, Giupponi L (2011) Improved decentralized Q-learning algorithm for interference reduction in LTE-Femtocells. In: Wireless Advanced WiAd 2011. IEEE
9. Tabrizi H, Farhadi G, Cioffi J (2012) Dynamic handoff decision in heterogeneous wireless systems: Q-learning approach. In: IEEE international conference on communications. IEEE
Heterogeneous Wireless Private Network Selection Algorithm Based on Gray Comprehensive Evaluation Value Shouming Wei1,2(B) , Shuai Wei1 , Chenguang He1,2 , and Bin Wang1,2 1 Communication Research Center, Harbin Institute of Technology, Harbin, China
[email protected], [email protected] 2 Key Laboratory of Police Wireless Digital Communication, Ministry of Public Security,
Harbin, People’s Republic of China
Abstract. A variety of wireless access technologies have emerged in public security private network communication systems, and these technologies are heterogeneous in terms of access and business. Under such heterogeneous network conditions, achieving the best connection according to each user's own business needs is an urgent problem. Aiming at network selection in heterogeneous wireless private networks, this paper presents a selection algorithm based on the gray comprehensive evaluation value. The algorithm calculates the subjective weights of the network attributes with the analytic hierarchy process, obtains the gray correlation matrix with gray correlation analysis, and finally combines the two into a comprehensive gray evaluation value, selecting the best network to provide users with high-quality services.

Keywords: Private network · Network selection · Gray evaluation
1 Introduction

In the mobile police integration intelligent network architecture, the intelligent resource adaptation engine mainly studies the intelligent access mechanism of the converged terminal and the business mechanism in the application service [1–3]. Heterogeneous wireless private network selection is an important component of intelligent resource adaptation: it implements network selection for terminals and users through specific algorithms and provides users with the best services. At present, there are many kinds of network selection methods for heterogeneous wireless private networks [4, 5]. Among them, selection algorithms based on multi-attribute decision making are generally divided into two parts: one determines the weights, and the other ranks the network performance [6, 7].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_8
2 Analytic Hierarchy Process

The analytic hierarchy process layers the problem into a target layer, that is, the problem to be solved; a criterion layer, that is, the guidelines that must be followed in selecting measures and programs to achieve the overall goal; and a solution layer, the concrete measures used to solve the problem [8, 9]. In this paper, the hierarchy is divided as follows: the target layer is the final selection of the network; the criterion layer consists of the various attributes of the network; and the solution layer is the set of alternative networks. This article comprehensively considers network performance attributes such as bandwidth, network delay, network jitter, and packet loss rate, as shown in Fig. 1.
Fig. 1 Hierarchical model (target layer: access to the best network; criterion layer: bandwidth, delay, jitter, packet loss rate; solution layer: Network 1, Network 2, Network 3)
This article first constructs the judgment matrix of network attributes. The element values in the judgment matrix are obtained from the user's perception of, for each pair of attributes, "which is more important" and "the degree of importance" [10]. A 1–9 scale represents the user's quantitative judgment of attribute importance; the importance scale is defined in Table 1. The attributes of each level are compared pairwise to form a judgment matrix A, which has the properties shown in (1):

$a_{ij} > 0,\quad a_{ij} = 1/a_{ji},\quad a_{ii} = 1$ (1)

The matrix formed by the comparison results is called the preference judgment matrix, shown in (2). Here $g_{ij}$ indicates the importance of attribute i compared to attribute j, with $g_{ij} > 0$ and $g_{ji} = 1/g_{ij}$ $(i, j = 1, 2, \ldots, n)$.

$G = (g_{ij})_{n\times n} = \begin{bmatrix} g_{11} & g_{12} & \cdots & g_{1n} \\ g_{21} & g_{22} & \cdots & g_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ g_{n1} & g_{n2} & \cdots & g_{nn} \end{bmatrix}$ (2)
S. Wei et al.

Table 1 Importance scale

Scale        Significance
1            Equal importance
3            Slightly important
5            Significantly important
7            Strongly important
9            Extremely important
2, 4, 6, 8   Intermediate values
Then, calculate the weights of the network attribute parameters to obtain the weight vector W. First, multiply the elements of each row of the pairwise judgment matrix, $W_i' = \prod_{j=1}^{n} g_{ij}$ $(i = 1, 2, \ldots, n)$; take the n-th root, $\bar{W}_i = (W_i')^{1/n}$; and finally normalize, $W_i = \bar{W}_i / \sum_{k=1}^{n} \bar{W}_k$. Here $W_i$ represents the weight of the i-th attribute, and the weight vector is $W = \{W_1, W_2, \ldots, W_n\}$. Because people understand things differently, judgment conflicts may occur when constructing the pairwise judgment matrix, so absolute consistency is difficult to maintain, and a consistency test must be performed on the judgment matrix. The test method is as follows: (1) Calculate the consistency index CI, where n is the order of the judgment matrix and $\lambda_{\max}$ is its maximum eigenvalue:

$CI = \dfrac{\lambda_{\max} - n}{n - 1}$ (3)
(2) Find the consistency indicator RI in Table 2.

Table 2 RI reference values

n    1  2  3     4     5     6     7     8     9
RI   0  0  0.58  0.90  1.12  1.24  1.32  1.41  1.45
(3) Calculate the consistency ratio CR = CI/RI. When CR is less than 0.1, the judgment matrix satisfies the consistency requirement and the obtained weights are valid; otherwise, the attribute weights should be recalculated.
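The geometric-mean weighting and the consistency test described above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation; the example matrix is the voice-service judgment matrix given later in Sect. 4, Eq. (11), and the RI values come from Table 2.

```python
import numpy as np

# Random consistency index RI by matrix order n, from Table 2.
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
      6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights(G):
    """Weights and consistency ratio of a pairwise judgment matrix G."""
    G = np.asarray(G, dtype=float)
    n = G.shape[0]
    w = np.prod(G, axis=1) ** (1.0 / n)    # n-th root of each row product
    w = w / w.sum()                        # normalize to the weight vector W
    lam_max = np.max(np.linalg.eigvals(G).real)
    ci = (lam_max - n) / (n - 1)           # consistency index, Eq. (3)
    cr = ci / RI[n] if RI[n] > 0 else 0.0  # consistency ratio CR = CI / RI
    return w, cr

# Judgment matrix for voice services, Eq. (11).
A_voice = [[1, 1/7, 1/7, 1/3],
           [7, 1, 5/4, 3],
           [7, 4/5, 1, 3],
           [3, 1/3, 1/3, 1]]
w, cr = ahp_weights(A_voice)
```

For this matrix the consistency ratio comes out well below 0.1, so the computed weights would be accepted by the test in step (3).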
3 Gray Relational Analysis Gray relational analysis is used to measure the degree of correlation between discrete sequences whose information is only partially known. By defining a reference sequence, it judges whether the reference sequence is closely related to the other sequences according to the shape of the sequence curves and the similarity of their change trends. First, the measured network attribute values at a certain moment are represented by a matrix A. It is assumed that there are m selectable networks ($N_i$, 1 ≤ i ≤ m), and each network has n attributes ($P_j$, 1 ≤ j ≤ n). The decision matrix A is shown in (4), where $a_{ij}$ represents the j-th attribute value of the i-th network.
$A = (a_{ij})_{m\times n} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$ (4)

where row i corresponds to network $N_i$ and column j to attribute $P_j$.
Secondly, the decision matrix A is standardized to obtain a standardized decision matrix B; in this paper, the range-transformation method is used. For benefit-type attributes (such as bandwidth), the larger the value, the better; they are normalized according to formula (5):

$b_{ij} = \dfrac{a_{ij} - (a_{ij})_{\min}}{(a_{ij})_{\max} - (a_{ij})_{\min}},\quad i = 1, 2, \ldots, m$ (5)

For cost-type attributes (such as delay), the smaller the value, the better; they are normalized according to formula (6):

$b_{ij} = \dfrac{(a_{ij})_{\max} - a_{ij}}{(a_{ij})_{\max} - (a_{ij})_{\min}},\quad i = 1, 2, \ldots, m$ (6)
The standardized decision matrix B is shown in (7):

$B = (b_{ij})_{m\times n} = \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1n} \\ b_{21} & b_{22} & \cdots & b_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ b_{m1} & b_{m2} & \cdots & b_{mn} \end{bmatrix}$ (7)
Thirdly, define the network reference sequence. For the standardized decision matrix B, when the network attribute is a benefit type, the reference sequence can be defined as $y = \{\max_i b_{i1}, \max_i b_{i2}, \ldots, \max_i b_{in}\}$; when the network attribute is a cost type, the reference sequence is defined as $y = \{\min_i b_{i1}, \min_i b_{i2}, \ldots, \min_i b_{in}\}$. The obtained reference sequence represents an ideal network. Finally, calculate the gray correlation coefficients according to formula (8):

$r_{ij} = \dfrac{\min_i \min_j |y_j - b_{ij}| + \xi \max_i \max_j |y_j - b_{ij}|}{|y_j - b_{ij}| + \xi \max_i \max_j |y_j - b_{ij}|},\quad (i = 1, 2, \ldots, m;\ j = 1, 2, \ldots, n)$ (8)
$\xi \in [0, 1]$ is the discriminant coefficient, usually 0.5. $\min_i \min_j |y_j - b_{ij}|$ is the two-level minimum difference between the sequences and the reference sequence, and $\max_i \max_j |y_j - b_{ij}|$ is the two-level maximum difference between the sequences and the reference sequence. The gray correlation matrix R is then obtained as shown in (9):

$R = (r_{ij})_{m\times n} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{bmatrix}$ (9)
From the gray correlation matrix R and the weight vector W, the comprehensive evaluation values of each network relative to the ideal network can be obtained by formula (10):

$Z = R \times W^{T}$ (10)
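The grey relational steps of Sect. 3 (range normalization, reference sequence, correlation coefficients, and the final evaluation Z = R × Wᵀ) can be sketched as follows. Two assumptions are made for the sketch: after the range normalization of Eqs. (5)–(6), larger values are better for both attribute types, so the ideal reference is simply taken as the column maxima of B; and the weight vector in the example is an illustrative, voice-like one, not a value computed in the paper. The attribute data are those of Table 3 in Sect. 4.

```python
import numpy as np

def grey_evaluation(A, benefit, W, xi=0.5):
    """Gray comprehensive evaluation values Z (Eqs. 5-10).
    A: m x n attribute matrix; benefit: bool per column; W: weight vector."""
    A = np.asarray(A, dtype=float)
    lo, hi = A.min(axis=0), A.max(axis=0)
    # Range normalization: Eq. (5) for benefit, Eq. (6) for cost attributes.
    B = np.where(benefit, (A - lo) / (hi - lo), (hi - A) / (hi - lo))
    y = B.max(axis=0)                  # ideal reference sequence
    d = np.abs(y - B)                  # deviations from the reference
    r = (d.min() + xi * d.max()) / (d + xi * d.max())  # Eq. (8)
    return r @ np.asarray(W, dtype=float)              # Eq. (10)

# Table 3 attribute values: bandwidth, delay, jitter, packet loss rate.
A = [[12.5, 40, 5, 0.07],      # PDT
     [10000, 80, 9, 0.02],     # WLAN1
     [20000, 140, 15, 0.04]]   # WLAN2
benefit = [True, False, False, False]
W = [0.05, 0.42, 0.38, 0.15]   # illustrative voice-like weights (assumed)
z = grey_evaluation(A, benefit, W)
```

With delay and jitter weighted heavily, the evaluation favors PDT for the voice-like case, consistent with the selection result reported in Sect. 4.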
4 Simulation Results 4.1 Parameters and Environmental Settings The simulation scenario sets up a heterogeneous wireless network environment that integrates two wireless access technologies, PDT and WLAN. The simulation considers four decision factors: bandwidth, delay, jitter, and packet loss rate. The attribute parameters of the different networks are shown in Table 3.

Table 3 Parameter attribute values of each wireless network

Network   Bandwidth (kHz)   Delay (ms)   Jitter (ms)   Packet loss rate (%)
PDT       12.5              40           5             0.07
WLAN1     10,000            80           9             0.02
WLAN2     20,000            140          15            0.04
4.2 Analysis of Simulation Results The judgment matrix for voice services is shown in (11):

$A_{voice} = \begin{bmatrix} 1 & 1/7 & 1/7 & 1/3 \\ 7 & 1 & 5/4 & 3 \\ 7 & 4/5 & 1 & 3 \\ 3 & 1/3 & 1/3 & 1 \end{bmatrix}$ (11)

The judgment matrix for video services is shown in (12):

$A_{video} = \begin{bmatrix} 1 & 7 & 2 & 3/2 \\ 1/7 & 1 & 1/4 & 1/5 \\ 1/2 & 4 & 1 & 2/3 \\ 2/3 & 5 & 3/2 & 1 \end{bmatrix}$ (12)

The judgment matrix for data services is shown in (13):

$A_{data} = \begin{bmatrix} 1 & 5 & 3 & 4/5 \\ 1/5 & 1 & 2/5 & 1/6 \\ 1/3 & 5/2 & 1 & 1/3 \\ 5/4 & 6 & 3 & 1 \end{bmatrix}$ (13)
The user's network selection for the three services is shown in Fig. 2. It can be seen that the voice service accesses the PDT network. The PDT network has small bandwidth but small delay and jitter. Since voice is a real-time service, it places high requirements on network delay and jitter but not on bandwidth, which is consistent with the actual situation. Data services access the WLAN1 network: data services have low requirements on delay and jitter but high requirements on packet loss, and WLAN1 has the lowest packet loss rate, which is again consistent with the actual situation.
Fig. 2 Network selection results (gray comprehensive evaluation values of PDT, WLAN1, and WLAN2 for each business type)
Because the user's preferences and the weights of the candidate network parameters greatly affect the final decision result, a decision sensitivity analysis is performed: the change in the difference between the gray comprehensive evaluation values of the candidate networks is calculated, and the resulting changes in the access network are observed. Figures 3, 4, 5 and 6 show the network selection results under different bandwidth and delay weights. It can be seen intuitively that when users have low bandwidth requirements and high delay requirements, the terminal chooses the PDT network with low bandwidth and low delay; as the user's bandwidth requirement increases and the delay requirement decreases, the user first accesses the WLAN1 network with large bandwidth and small delay, and then accesses the WLAN2 network with even larger bandwidth. The simulation results are in good agreement with the actual situation, which illustrates the correctness of the network selection algorithm in this paper.
Fig. 3 The difference in evaluation between PDT and WLAN1 in different weights
Fig. 4 The difference in evaluation between PDT and WLAN2 in different weights
5 Conclusions This paper proposes a heterogeneous wireless network selection algorithm based on the gray comprehensive evaluation value. The algorithm calculates subjective weights of network attributes by the analytic hierarchy process, obtains the gray correlation matrix by gray correlation analysis, and finally combines the two into a gray comprehensive evaluation value, which judges the weights more accurately and comprehensively. In addition, according to the different needs of different services, and considering performance factors such as bandwidth and network delay, the best network is selected to provide users with high-quality services.
Fig. 5 The difference in evaluation between WLAN1 and WLAN2 in different weights
Fig. 6 Network selection results
Acknowledgements. This paper is supported by National Key R&D Program of China (No. 2018YFC0807101).
References
1. Nasser N, Hasswa A, Hassanein H (2006) Handoffs in fourth generation heterogeneous networks. IEEE Commun Mag 44(10):96–103
2. Song QY, Abbas J (2005) Network selection in an integrated wireless LAN and UMTS environment using mathematical modeling and computing techniques. IEEE Wireless Commun 12(6):42–48
3. Jun L (2009) Theory and technology realization of heterogeneous wireless network fusion. Electronic Industry Press, Beijing, pp 134–158
4. Xin H, Bin L (2008) Heterogeneous wireless network switching technology. Beijing University of Posts and Telecommunications Press, Beijing, pp 1–17
5. Radhika K, Venu Gopal Reddy A (2011) Network selection in heterogeneous wireless networks based on fuzzy multiple criteria decision making. In: 2011 3rd international conference on electronics computer technology (ICETCT), pp 136–139
6. Bari F, Leung VCM (2007) Multi-attribute network selection by iterative TOPSIS for heterogeneous wireless access. In: IEEE consumer communications and networking conference, pp 808–812
7. Shen W, Zeng QA (2008) Cost-function-based network selection strategy in integrated wireless and mobile networks. IEEE Trans Vehicular Technol 57(6):3778–3788
8. Julong D (2003) Grey system theory. Huazhong University of Science and Technology Press, Wuhan, pp 122–202
9. Sasaki (2012) Evaluation of communication system selection applying AHP algorithm in heterogeneous wireless networks. In: Computing, communications and applications conference (ComComAp), pp 334–338
10. Wei L, Yumin S, Yanjiao J (2008) Analysis of multiple objective decision methods based on entropy weight. In: IEEE Pacific-Asia workshop on computational intelligence and industrial application
A Novel Build-in-Test Method for the Multi-task Radar Warning Receiver Based on a Parallel Radio Frequency Network Desi Luo(B), Song Li, Yang Hui, and Xu Zhou The 29th Research Institute of China Electronics Technology Group Corporation, Chengdu, China [email protected]
Abstract. A radar warning receiver (RWR) detects and processes radar signals to give warning of radar threats on the battlefield, and it has become an important sensor in modern warfare. A multi-task RWR improves multi-signal processing capability and enlarges the warning space by adding processing hardware; however, as the amount of hardware increases, the reliability of the whole system decreases. A multi-task RWR therefore needs to conduct a build-in-test before operation, to check the warning capability and locate failed hardware. Traditional build-in-test schemes test each single-task sub-system independently with an external self-checking signal source. In this paper, we propose a novel build-in-test method for the multi-task RWR based on a parallel radio frequency (RF) network. The RF network makes the hardware of each sub-system a backup for the others during the build-in-test. By utilizing the redundancy of the parallel sub-systems, the joint build-in-test of the multi-task RWR improves accuracy without any additional RF signal detector. Keywords: Multi-task radar warning receiver · Novel build-in-test method · Parallel radio frequency (RF) network
1 Introduction By measuring and analyzing received radar signals, a radar warning system can report the location, type, tracking state, and RF parameters of enemy radars, assisting warfighters to learn and control the battlefield situation and take effective protective measures in time. The radar warning system therefore plays an important role in modern warfare [1]. The first-generation radar warning receiver emerged in the mid-1960s, equipped with a video crystal detector for analog demodulation, which enabled it to give warning of radars with certain frequencies and parameters. The AN/APR-25 of the US army is a typical example [2]. During the Vietnam War, the US air force suffered severe losses
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_9
due to Vietnam's attacks with ground anti-aircraft missiles. The US army made continuous efforts to update the RWR based on the features of ground anti-aircraft radar signals, rolling out a series of products including the AN/APR-3 and AN/APR-45, which greatly increased the survival chances of American aircraft during the Vietnam War. As technology advanced, the successors applied digital circuits and software-based reconfiguration on a large scale. With greatly improved functions and performance, the updated receivers can identify the operating condition of radars through scanning, complete reconfiguration based on a software operating model, quickly respond to the changing signal environment, and cope with new threats [3, 4]. The progress of artificial intelligence (AI) has also facilitated the development of the RWR: AI offers more ways to identify radar warning signals and has effectively improved RWR accuracy [5]. Under the increasingly complicated battlefield situation, various electronic information devices and informatized weapons are deployed intensively on fighting platforms ranging from sea and land to air and space, which leaves the RWR exposed to threats with different functions and frequencies in all directions [6]. With the limited processing capability of single hardware, it is necessary to install several subtask systems with different functions in the RWR, to carry out full-time, all-airspace radar warning tasks with complete functions [7]. To ensure the normal operation and performance of the RWR, a built-in-test of the whole system is necessary to assist maintainers with fault location and system recovery [8]. The built-in-test carries out a rapid function integrity test and reports fault locations through the system's own self-checking signal source without any external device. Since the built-in-test does not improve the explicit capability of the system, it is often overlooked in system program design.
Although the RWR's signal processing capability and device reliability have improved greatly, traditional self-test methods are still generally preferred [8]. A multi-task RWR is more complicated in hardware structure, with lower reliability and higher malfunction probability than a single-task system, assuming the reliability of each hardware module stays constant. Therefore, the multi-task RWR needs its built-in-test program to provide more precise self-test results [9]. RWR built-in-test programs can be divided into two categories: status checking and RF signal checking. The status checking method collects important information from key active devices, such as voltage, current, and temperature; if these parameters deviate from their normal values, a fault is confirmed. The RF signal checking method, in contrast, detects faults by having the RWR analyze built-in-test RF signals with known parameters generated by an external self-test signal source. RF signal checking varies with the category of sampled parameters, from time-domain sampling and frequency-domain sampling to the joint time–frequency sampling approach [10]. Since the status checking method, though simpler, can only detect device malfunction rather than RF link failure, it works as a supplementary means. The built-in-test program discussed in this paper refers to the joint time–frequency sampling approach.
2 Traditional Built-in-Test Method A multi-task RWR includes several sub-systems, each consisting of a serial, independent reception channel, frequency converter, and digital signal sampler. A radar signal from the external electromagnetic environment is led by the reception channel into the warning system; as a radio frequency signal, it is converted into an intermediate frequency signal by the frequency converter, then processed by the digital signal sampler, which finally completes the parameter measurement. The standard workflow is shown in Fig. 1.
Fig. 1 The standard workflow of RWR (each of the parallel sub-systems passes the battlefield electromagnetic environment through a reception channel, a frequency converter, and a digital signal sampler)
Assuming that all sub-systems use the same reception channel, frequency converter, and digital signal sampler, with failure rates $\lambda_{RC}$, $\lambda_{FC}$ and $\lambda_{DA}$, respectively, the mean time between failures (MTBF) of each sub-system is

$MTBF_{sub} = \dfrac{1}{\lambda_{RC} + \lambda_{FC} + \lambda_{DA}}$ (1)

The N parallel sub-systems undertake different warning tasks and are mutually independent. If any one of them fails, the whole RWR breaks down. The MTBF of the whole multi-task RWR is therefore

$MTBF_{sys} = \dfrac{1}{N(\lambda_{RC} + \lambda_{FC} + \lambda_{DA})} = \dfrac{MTBF_{sub}}{N}$ (2)
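As a quick numeric check of Eqs. (1)–(2), the sketch below computes the sub-system and system MTBF; the per-module failure rates are illustrative values assumed for the example, not figures from the paper.

```python
def mtbf_sub(l_rc, l_fc, l_da):
    """Eq. (1): MTBF of one serial sub-system (reception channel,
    frequency converter, digital signal sampler)."""
    return 1.0 / (l_rc + l_fc + l_da)

def mtbf_sys(n, l_rc, l_fc, l_da):
    """Eq. (2): MTBF of an RWR with n independent parallel sub-systems."""
    return mtbf_sub(l_rc, l_fc, l_da) / n

# Illustrative failure rates per hour (assumed): total 5e-4 per sub-system.
sub = mtbf_sub(1e-4, 2e-4, 2e-4)      # 2000 h for one sub-system
sys4 = mtbf_sys(4, 1e-4, 2e-4, 2e-4)  # 500 h for a four-task RWR
```

The fourfold drop from 2000 h to 500 h illustrates the paper's point that adding subtask systems shortens the MTBF of the whole receiver.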
The formula above shows that a multi-task system has a shorter MTBF and lower reliability than a single-task one, and that its reliability decreases as sub-systems are added. Therefore, multi-task systems need more precise built-in-test programs to improve fault correction efficiency. The traditional RWR built-in-test solution inputs known self-check signals into the reception channels through an external self-check source, and measures the self-check signals with the digital signal samplers. The RF signal flow of the traditional built-in-test is shown in Fig. 2.

Fig. 2 Flow of RF signals of RWR in built-in-test condition (a self-checking signal source feeds the N parallel chains of reception channel, frequency converter, and digital signal sampler; the figure marks the signal flow of both the traditional and the improved BIT)
If a digital signal sampler measures the self-check signal correctly, the corresponding sub-system is working properly; otherwise, it has failed. Since the reception channel and frequency converter cannot measure parameters themselves, after a fault is detected it must be reported as if the reception channel, frequency converter, and digital signal sampler had all failed. The traditional built-in-test program works as shown in Fig. 3. Without module-level failure detection capability, the fault location accuracy of the traditional built-in-test method is nominally zero. However, a certain probability of self-check accuracy still exists in practical application when all modules are reported as failed. Assuming that the failure rates of the reception channel, frequency converter, and digital signal sampler are $\lambda_{RC}$, $\lambda_{FC}$ and $\lambda_{DA}$, respectively, the actual accuracy of the traditional built-in-test method is

$P_t = \dfrac{(\lambda_{RC} + \lambda_{FC} + \lambda_{DA})/3}{1 - (1 - \lambda_{RC})(1 - \lambda_{FC})(1 - \lambda_{DA})}$ (3)

From this formula, the accuracy of the traditional built-in-test has no correlation with the number of sub-systems of the RWR.
Fig. 3 The procedure of traditional built-in-test method (generate and measure the built-in-test signal; if the result is correct, the modules of the sub-system are normal, otherwise all modules of the sub-system are reported as failed)
3 The Built-in-Test Method Based on Parallel RF Network This paper proposes a built-in-test method for the multi-task RWR based on a parallel RF network, which enables links between any two modules among the reception channels, frequency converters, and digital signal samplers. Under the self-checking condition, all sub-systems act as redundant backups for each other and can jointly conduct the self-test when a failure appears, which improves the precision of failure location. The RF signal flow of this method is shown in Fig. 2. The method also uses an external self-check signal source to input known signals into all the reception channels and measures the signals through the digital signal samplers. When a digital signal sampler fails to acquire the self-check signal correctly, it is confirmed that a fault has occurred. The parallel network can then bring the self-check signals from the other sub-systems into the inputs of the frequency converters and digital signal samplers, cross-locating the error level by level and promoting the failure location capability from sub-system level to module level without additional self-check signal sampling hardware.
According to the flowchart in Fig. 4, when the reception channels, frequency converters, and digital signal samplers all have failures, the system cannot assemble a well-functioning temporary sub-system, the self-checking operation cannot locate the malfunctioning modules precisely, and the self-check result reports all modules as failed. In all other situations, the fault can be located correctly. Assuming that the failure rates of the reception channel, frequency converter, and digital signal sampler are $\lambda_{RC}$, $\lambda_{FC}$, and $\lambda_{DA}$, all conforming to a binomial distribution, the probability $P_A$ of locating all failures precisely is

$P_A = 1 - \dfrac{\left[1 - (1 - \lambda_{RC})^N\right]\left[1 - (1 - \lambda_{FC})^N\right]\left[1 - (1 - \lambda_{DA})^N\right]}{1 - (1 - \lambda_{RC})^N (1 - \lambda_{FC})^N (1 - \lambda_{DA})^N}$ (4)
Fig. 4 The procedure of built-in-test method of multi-task RWR based on parallel RF network (if any result is incorrect, all modules are preset as failed; modules from different sub-systems are combined into new temporary sub-systems, the built-in-test signal of each temporary sub-system is measured, and the failures of temporary sub-systems with correct results are cleared until all combinations are traversed)
Similar to the traditional built-in-test method, when the system cannot locate failures effectively, the self-check result reports all modules as failed, which still has a certain rate of precision. The accuracy of the improved self-check solution is therefore modified as

$P_i = P_A + \dfrac{(1 - P_A)(\lambda_{RC} + \lambda_{FC} + \lambda_{DA})/3}{1 - (1 - \lambda_{RC})^N (1 - \lambda_{FC})^N (1 - \lambda_{DA})^N}$ (5)

Simulating the situation where $\lambda_{RC}$, $\lambda_{FC}$, and $\lambda_{DA}$ are all 0.1 gives the result shown in Fig. 5. From the simulation we can conclude that: (1) when the RWR has only one sub-system and thus no redundant hardware, the method based on the parallel RF network degrades to the traditional one with the same accuracy; (2) when the RWR has two or more subtask systems, the accuracy of the parallel-RF-network-based method is higher than the traditional one; (3) the accuracy of the traditional method has no correlation with the number of sub-systems, while the parallel-RF-network-based method detects failures more precisely as the number of subtask systems grows. Under the premise of the single modules sharing the same failure ratio, Fig. 6 compares the self-check accuracy of RWRs with different numbers of subtask systems. According to the simulation results in Fig. 6: (1) the self-test accuracy gets higher with more sub-systems, since more sub-systems provide more parallel redundancy and further facilitate the detection and location of malfunctions; (2) with decreasing reliability of the single modules, the probability that the modules all fail simultaneously, including
Fig. 5 Comparison of accuracy between the traditional built-in-test method and improved method
reception channel, frequency converter, and digital signal sampler, will rise, and the self-check accuracy will decrease.
Fig. 6 Comparison of accuracy with different number of subtask systems
4 Conclusions The built-in-test method of the multi-task RWR based on a parallel RF network thoroughly utilizes the redundancy between the RWR's sub-systems and makes a joint self-test possible when a fault occurs. Without additional self-test signal sampling hardware, the
parallel-RF-network-based method promotes the failure location capability from sub-system level to module level, making the malfunction detection of the built-in-test more precise. According to the theoretical analysis and simulation results, the accuracy of the parallel-RF-network-based method is positively correlated with the number of sub-systems, which enables the method to play a greater role in the more complex RWRs of the future. Additionally, since the parallel RF network is relatively independent of the other hardware in radar countermeasure equipment, products whose design is already finished can also be improved through fine adjustment of their software and hardware, which makes the parallel-RF-network-based method accessible and applicable to existing systems.
References
1. Zheng LI (2008) The development trend of the airborne radar warning receiver. Electron Inf Warf Technol 3(23):51–54
2. Fan Z, Xinkai C, Zhuangzhi H (2014) The development and technology trend of the radar warning receiver. Aerodyn Missile J:41–46
3. Gee A (2012) Radar warning receiver (RWR) time-coincident pulse data extraction and processing. In: IEEE radar conference, pp 0752–0757
4. Ruijia W, Xing W (2014) Radar emitter recognition in airborne RWR/ESM based on improved K nearest neighbor algorithm. In: IEEE international conference on computer and information technology, pp 148–151
5. Kauppi J-P, Martikainen KS (2007) An efficient set of features for pulse repetition interval modulation recognition. In: International conference on radar systems, pp 1–5
6. Kuang Y, Shi Q, Chen Q, Yun L, Long K (2005) A novel SDIF-based PRI estimation approach to deinterleave repetitive pulse sequences. WSEAS Trans Math 4:260–265
7. Jun W, Peng L, Dong Y, Wei L, Xinyu Y (2009) A novel deinterleaving algorithm of radar pulse signal based on DSP. In: IEEE international symposium on industrial electronics, pp 1899–1903
8. Self AG (2006) Probability of intercept. In: The interception and analysis of radar signals, pp 97–114
9. Chakrabarty K, Murray BT (1998) Design of built-in test generator circuits using width compression. IEEE Trans Comput-Aided Des 17(10):1044–1051
10. Ghelfi P, Scotti F, Onori D, Bogoni A (2019) Photonics for ultrawideband RF spectral analysis in electronic warfare applications. IEEE J Sel Top Quant Electron 25(4):1–9
11. Agrawal VD, Kime CR, Saluja KK (1993) A tutorial on built-in self test. IEEE Des Test Comput:73–82
Design of FIR Filter Based on Genetic Algorithm Yipeng Wang1(B) , Yan Ding2 , Anding Wang1 , Jingyu Hua1 , and Weidang Lu3 1 School of Information and Electronic Engineering, Zhejiang Gongshang University,
Hangzhou 310018, China [email protected] 2 Zhejiang Jinhua Post and Telecommunications Engineering Co., Ltd., Jinhua 321017, China 3 College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Abstract. The finite impulse response (FIR) filter is widely used in wireless communication systems, and its design is crucial for modern space-air-ground-sea integrated communication equipment. The genetic algorithm is a simple and effective optimization tool that uses fitness functions to guide the search process without requiring prior knowledge. This paper therefore introduces the genetic algorithm into FIR filter design: the best frequency-response-error-vector (FREV) is searched, and the filter is produced based on the weighted least squares (WLS) principle. Under pre-determined filter specifications, we design FIR filters and obtain satisfactory results. Keywords: Digital filter design · Genetic algorithm · Satellite communication · Weighted least squares · Weighting function
1 Introduction In scientific research and engineering, it is necessary to process signals in wireless communication systems, including cellular and satellite communication [1, 2]. The filter [3] is an indispensable part of modern signal processing and electronic communication. The calculations required to design filters for various technical specifications can almost all be carried out by computer [4]. In [5], an equiripple digital filter is designed by the weighted least squares algorithm; in [6], an infinite impulse response digital filter with low group delay is designed by an iterative optimization method. Traditional filter design is commonly based on deterministic least-squares methods, with little consideration of heuristic search. The genetic algorithm [7, 8] is an adaptive, global, probabilistic optimization algorithm with strong robustness, formed by simulating the process of biological evolution. It provides a general framework for solving complex system optimization problems.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_10
In this paper, we first introduce the filter design specifications, then study how to determine the initial weights of the weighted least squares algorithm and the principle of applying the genetic algorithm to filter design. Finally, the performance of four design methods is simulated and compared in MATLAB.
2 FIR Filter Design 2.1 Filter Design Specifications The ideal filter is physically unrealizable; to make the filter realizable, we introduce a transition band between adjacent bands, as shown in Fig. 1.
Fig. 1 Filter index
Here $\omega_p$ is the pass-band cutoff frequency, $\omega_s$ is the stop-band cutoff frequency, and $\delta_1$ and $\delta_2$ are the maximum absolute ripples (tolerances) of the pass-band and stop-band, respectively. In practice, the technical specifications [9] are usually given as the maximum attenuation $R_p$ allowed in the pass-band and the minimum attenuation $A_s$ required in the stop-band, defined as:

$R_p = -20 \lg \dfrac{1 - \delta_1}{1 + \delta_1} > 0$ (2.1)

$A_s = -20 \lg \dfrac{\delta_2}{1 + \delta_1} \gg 1$ (2.2)
According to [10], we set fc = 6.400 × 10^5 Hz, fs = 7.700 × 10^5 Hz, and fp = 6.075 × 10^5 Hz, so the bandwidth is 1.28 × 10^6 Hz and the sampling frequency is 5.12 × 10^6 Hz.

2.2 Setting the Initial Weight of Weighted Least Squares Algorithm

For the weighted least squares algorithm, a bigger weight at a frequency point means the squared approximation error E²(ω) there will be smaller. According to the research of [5, 11], the
Design of FIR Filter Based on Genetic Algorithm
initial weight can be set by the following method: first, we obtain the frequency response of the FIR filter by the continuous least squares algorithm, then calculate the error of the amplitude–frequency response curve compared with the ideal index Hd(ω), as shown in Fig. 2, where the blue curve indicates the results of the continuous least squares algorithm. It can be seen from the figure that the error values of the first (the 9th) and the second (the 11th) points in the stop-band fail to accurately reflect the actual error of the ripple on the amplitude–frequency response curve. So we added the 10th point to control this fluctuation more effectively, whose frequency is:

ω10 = (ω9 + ω11)/2. (2.3)
Fig. 2 Continuous least squares (blue) and discrete least squares (red) errors
It can be seen from the red line in Fig. 4b that the actual error values at the 6th–11th frequency points are relatively large, so the initial weights at the 6th–11th frequency points are artificially increased to reflect the importance of this part of the error. According to reference [3], the stop-band weight is multiplied by a certain weighting coefficient, which is defined as:

weight = δ1/δ2, (2.4)

where δ1 and δ2 represent the maximum ripples required.
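The attenuation indexes (2.1)–(2.2), the normalized band edges, and the weighting factor (2.4) can be checked numerically. A short Python sketch — the tolerance values δ1 and δ2 below are illustrative assumptions; only the band edges fc, fp, fs and the sampling frequency come from the text:

```python
import math

# Band edges from the text (Hz) and the sampling frequency
fp, fc, f_stop = 6.075e5, 6.400e5, 7.700e5
Fs = 5.12e6

# Normalized frequencies (omega / pi)
wp = 2 * fp / Fs      # ~0.2373, close to the measured 0.238 in Sect. 4
ws = 2 * f_stop / Fs  # ~0.3008, close to the measured 0.297

# Illustrative pass-/stop-band tolerances (not from the paper)
d1, d2 = 0.05, 0.005
Rp = -20 * math.log10((1 - d1) / (1 + d1))  # Eq. (2.1), pass-band attenuation > 0
As = -20 * math.log10(d2 / (1 + d1))        # Eq. (2.2), stop-band attenuation
weight = d1 / d2                            # Eq. (2.4), stop-band weighting factor
```

With these tolerances, Rp is well under 1 dB while As is above 46 dB, of the same order as the stop-band attenuations reported in Sect. 4.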
3 Application of Genetic Algorithm in Design

3.1 Calculation of the Fitness Value

The calculation of the fitness value [12] is the core of the whole genetic algorithm. We use two calculation methods for the fitness value, introduced as follows:
Fitness value a: we calculate the impulse response h(n), then calculate the amplitude–frequency response H(ω), from which the maximum ripple values of H(ω) in the pass-band and stop-band (dp and ds) can be obtained. The larger of these two maximum ripple values is defined as Maxextrm, and its reciprocal is taken as the fitness value, that is:

fitness(h) = 1/Maxextrm. (3.1)
Fitness value b: from the amplitude–frequency response H(ω), we calculate the maximum value Maxextrm and the minimum value Minextrm of the approximation error in the pass-band and stop-band, respectively, and define the equiripple degree Q as follows:

Q = (Maxextrm − Minextrm)/Maxextrm. (3.2)
3.2 Selection, Crossover, and Mutation

The selection strategy [13] adopts the roulette strategy, in which PPi = Σ_{j=1}^{i} pj, where PPi is the cumulative probability and pi is the selection probability of individual R, that is:

pi = fitness(xi) / Σ_{i=1}^{NP} fitness(xi), (3.3)
where fitness(xi) is the fitness value of individual xi. A random number r in [0, 1) is drawn, and individual i is selected when PPi−1 ≤ r < PPi. The double tangent crossover [14] keeps the amount of genes exchanged between both parents small, which is conducive to the retention of excellent individuals. The specific operation of the big mutation genetic algorithm [15] is as follows: when the maximum fitness Fmax and average fitness Favg of a generation satisfy

α · Fmax < Favg, (3.4)

where 0.5 < α < 1, all individuals are set to the individual with the highest fitness value, and mutation is then performed on this concentrated population with a probability more than 5 times the usual mutation probability.

3.3 The Whole Weight Vector Search Process

For the sake of simplicity, the individuals in this design are coded by the floating-point coding scheme [16], so each individual R has 28 genes. Based on the initial weight vector R, the optimal weight vector Rbest is searched by the big mutation genetic algorithm with multi-point crossover. The population size NP is 500, the maximum evolution generation NG is 500, the crossover probability Pc is 0.9, the mutation probability Pm is 0.1, the density factor α is 0.6, and the large variation probability Pbm is 0.2. The whole search process is shown in Fig. 3.
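The selection and big-mutation mechanics of Sects. 3.2–3.3 can be sketched in Python on a toy objective. The two-point crossover stands in for the paper's double tangent crossover, and all numeric settings and the toy objective are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

def roulette_select(pop, fit):
    """Roulette-wheel selection, Eq. (3.3)."""
    p = fit / fit.sum()
    return pop[rng.choice(len(pop), size=len(pop), p=p)]

def evolve(fitness, dim=28, NP=50, NG=100, pc=0.9, pm=0.1, alpha=0.6, pbm=0.2):
    pop = rng.uniform(0.5, 5.0, (NP, dim))   # floating-point coded individuals
    for _ in range(NG):
        fit = np.array([fitness(x) for x in pop])
        best = pop[fit.argmax()].copy()
        new = roulette_select(pop, fit)
        # two-point crossover (stand-in for the double tangent crossover)
        for i in range(0, NP - 1, 2):
            if rng.random() < pc:
                a, b = sorted(rng.integers(0, dim, 2))
                new[i, a:b], new[i + 1, a:b] = new[i + 1, a:b].copy(), new[i, a:b].copy()
        mask = rng.random(new.shape) < pm                # ordinary mutation
        new[mask] += rng.normal(0.0, 0.1, mask.sum())
        # big mutation, Eq. (3.4): on premature convergence, reset the
        # population to the best individual and mutate with pbm >> pm
        if alpha * fit.max() < fit.mean():
            new[:] = best
            mask = rng.random(new.shape) < pbm
            new[mask] += rng.normal(0.0, 0.5, mask.sum())
        new[0] = best                                    # elitism
        pop = new
    return best

# toy objective standing in for the filter-ripple fitness of Sect. 3.1
target = np.linspace(1.0, 3.0, 28)
f = lambda x: 1.0 / (1.0 + np.sum((x - target) ** 2))
best = evolve(f)
```

In the actual design, `fitness` would evaluate the WLS filter produced by each weight vector R, as in Sect. 3.1.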
Fig. 3 Flowchart of genetic algorithm (using the initial weight vector R as the search criterion, randomly initialize a population of individuals R; in each generation, calculate the fitness value of each weighting coefficient vector R from the h corresponding to each R, select with the roulette strategy, cross and mutate, and generate the new population; when the current generation reaches NG, output the optimal individual Rbest and the corresponding impulse response h(n))
Fig. 4 a Amplitude response of discrete least square algorithm (dB), b amplitude response of discrete-weighted least square algorithm (dB), c weighted least square amplitude response corresponding to fitness value a (dB), d weighted least square amplitude response corresponding to fitness value b (dB)
4 Experimental Results

4.1 Comparison of Output Parameters of Four Solutions

1. Output parameters of the discrete least squares solution: ωp = 0.238020; ωc = 0.257088; ωs = 0.297041; Rp = 0.575860; As = 29.841830 (dB), where ωc is the normalized 3 dB cut-off frequency.
2. Output parameters of the discrete-weighted least squares solution: ωp = 0.239849; ωc = 0.249797; ωs = 0.299319; Rp = 1.921760; As = 43.380435 (dB).
3. Output parameters of the weighted least squares solution of fitness value a: ωp = 0.238624; ωc = 0.249148; ωs = 0.300324; Rp = 1.650086; As = 46.991386 (dB).
4. Output parameters of the weighted least squares solution of fitness value b: ωp = 0.238752; ωc = 0.249088; ωs = 0.300288; Rp = 1.721403; As = 46.797261 (dB) (Fig. 4).

From the above, we can find that, under the constraints, all the frequency indexes of the FIR filter solutions meet the design requirements. The results of the discrete-weighted least squares show that the selection of the initial weight vector is successful, but it is not enough for the attenuation indexes and the equiripple degree. In contrast, the weighted least squares solutions corresponding to the weighting coefficients searched by the genetic algorithm with fitness values a and b reach the required stop-band attenuation and pass-band fluctuation, falling only slightly short in the equiripple degree.
5 Conclusion

In this paper, the weighted least squares algorithm and the genetic algorithm are used to design an FIR filter. The genetic algorithm adopts roulette selection, double tangent crossover, and the big mutation strategy, and the optimal weight vector it finds is used to generate the filter coefficients with the WLS algorithm. The MATLAB results show that the three indexes meet the standards under the constraints, and the solution corresponding to the optimal weight searched by the genetic algorithm greatly improves the stop-band attenuation. Meanwhile, although the two forms of the fitness value differ, their essential function is the same, and the genetic algorithm works well in searching for the optimal weight, which further improves the stop-band attenuation of the filter and basically achieves equiripple behavior. This shows that the principle of setting the initial weight vector based on error size is correct and the algorithm is effective.
References

1. Yi K, Li Y, Sun C, Nan C (2015) Recent development and its prospect of satellite communications. J Commun 36(6):157–172
2. Li Y, Zhao Y, Pei W (2018) A Doppler shift estimation scheme for 5G-LEO satellite mobile communication system. Comput Measure Control 26(10):226–229, 234
3. Guangshu H (2003) Digital signal processing: theory, algorithm and implementation. Tsinghua University Press, Beijing
4. Middlestead RW (2017) Appendix B: digital filter design and applications. In: Digital communications with emphasis on data modems: theory, analysis, design, simulation, testing, and applications. Wiley
5. Lim YC, Lee JH, Chen CK, Yang RH (1992) A weighted least squares algorithm for quasi-equiripple FIR and IIR digital filter design. IEEE Trans Signal Process 40(3):551–558
6. Yang Y, Xu P (2016) Iterative optimization algorithm of infinite impulse response digital filters' design with low group-delay. J Nanjing Univ Sci Technol 40(4)
7. Gong C, Wang Z (2009) Proficient in MATLAB optimization calculation. Electronic Industry Press, Beijing
8. Zhang J, Zhan Z (2009) Computational intelligence. Tsinghua University Press, Beijing
9. Gu Q (2016) Design and simulation of FIR in performance index optimization of digital filter. Wireless Interconnection Technol 15:64–67
10. Xiao Y, Lu L, Lee M (2006) FIR-rake receiver for TD-SCDMA mobile terminals. Proc ISCAS'06 1:2785–2788
11. Oppenheim AV, Schafer RW, Buck JR (2005) Discrete-time signal processing. Tsinghua University Press, Beijing
12. Lee A, Ahmadi M, Jullien GA, Lashkari RS, Miller WC (1999) Design of 1-D FIR filters with genetic algorithms. Proc ISSPA'99 2:955–958
13. Fuller TG, Nowrouzian B, Ashrafzadeh F (1998) Optimization of FIR digital filters over the canonical signed-digit coefficient space using genetic algorithms. Proc MWSCAS'98 1:456–459
14. Holland JH (1992) Genetic algorithms. Scientific American Press
15. Lee A, Ahmadi M, Jullien GA, Miller WC, Lashkari RS (1998) Digital filter design using genetic algorithm. Proc IEEE Symp Adv Digital Filter Signal Process 1:34–38
16. Suckley D (1991) Genetic algorithm in the design of FIR filters. IEE Proc-G 138(2):234–238
Timing Error Detection and Recovery Based on Linear Interpolation

Xuminxue Hong1(B), Bo Yang2, Jingyu Hua1, Anding Wang1, and Weidang Lu3

1 School of Information Electronic Engineering, Zhejiang GongShang University, Hangzhou 310014, China [email protected]
2 Research Institute of Comba Telecom Co. Ltd., Guangzhou 510000, China
3 College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Abstract. Timing synchronization is a critical and essential component of a satellite communication system. In order to synchronize the received signal with the sampling clock, the synchronization system in the receiving device calculates the sampling clock deviation. The timing error detector (TED) in the Gardner timing algorithm is the module that calculates the clock deviation in the synchronous system, and the recovery performance is related to the S-curve. In this paper, we propose an improved Gardner timing recovery algorithm. This method establishes a TED output–timing error lookup table for the S-curve, uses the TED output as an index to determine the S-curve segment, and then uses linear interpolation to recover the timing error estimate. The results show that the estimation performance remains good in the presence of timing error and Rummler multipath.

Keywords: Satellite communication · Timing synchronization · S-curve · Timing error detector · Linear interpolation

1 Introduction
With the development of computer and digital signal processing technology, all-digital receivers have been widely used in satellite communication. Due to the Doppler frequency shift and the long propagation delay between the mobile terminal and the satellite, there will inevitably be deviations in the sampling time, so the sampling instants at the receiving end may not fall on the optimal points, which causes bit errors [1].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_11
Timing synchronization loops are a critical component of digital communication receivers and have a significant impact on the performance of communication systems. The delay of the signal in an actual communication system is not known in advance [2], so the receiving end needs to generate a clock sequence strictly synchronized with the received symbols in order to correctly recover the transmitted signal. Symbol synchronization is the basis of correct sampling decisions and an important factor affecting the system error rate. Therefore, timing synchronization performance has important research value and practical significance. The Gardner algorithm is a timing error estimation algorithm that can recover the timing independently of the carrier phase and does not rely on decision feedback [3]. Its basic principle is to use the TED to extract the amplitude and polarity change information of the optimal sampling points of adjacent symbols; together with the information on whether the transition point between adjacent symbols is zero, the timing error can be extracted from the sampled signal [4]. The characteristics of a timing error detector can be represented by its S-curve, so named for its sinusoidal shape, which is a function of the timing error [5]. In this paper, we improve the time delay estimation module composed of the loop filter and the timing controller. The lookup table (LUT) is obtained by linearly interpolating the S-curve, and the delay is estimated by looking up the table. The results show that we can still obtain an accurate timing error under the interference of AWGN and multipath channels.

1.1 System Model
The interpolation-based Gardner timing recovery algorithm is a typical representative of modern timing synchronization [6]. In this algorithm, only two sampling points are needed for each symbol to participate in the calculation. Its system model is shown in Fig. 1. The interpolation filter, timing error detector, loop filter, and timing controller form a closed loop.
Fig. 1. Timing synchronous loop system model
However, in practical communication systems, it is necessary to consider the time delay estimation in the case of multipath transmission. The traditional
method of estimating the time delay by the Gardner algorithm cannot guarantee high-precision estimation in a multipath channel. The lookup table and interpolation method can achieve high-performance time delay estimation under AWGN and multipath channels. The principle is as follows: at a given timing offset, the TED output of the Gardner algorithm is averaged over a number of symbol periods; with the timing offset on the horizontal axis and the averaged phase error output on the vertical axis, an S-curve that is approximately symmetric about the origin is obtained; the S-curve is then linearly interpolated to establish a LUT.
2 Obtain LUT by Linear Interpolation

2.1 Linear Interpolation
The TED uses a non-data-aided error detection algorithm, which is easy to realize in all-digital form and can be designed separately from the carrier tracking module. The timing error is calculated as [7]:

ut(r) = yI(r − 1/2)[yI(r) − yI(r − 1)] + yQ(r − 1/2)[yQ(r) − yQ(r − 1)], (1)

where yI(r) and yQ(r) represent the sampling values of the r-th symbol on the I and Q branches, respectively, and r is the symbol index. The pair of sampling values between the (r − 1)-th and the r-th symbols is denoted yI(r − 1/2) and yQ(r − 1/2). Taking the expectation over multiple symbols, we get the average output of the timing error detector [8]:

Ut(τ) = E{ut(r)} = Σ_p g(τ + (r − 1/2)Ts − pTs)[g(τ + rTs − pTs) − g(τ + (r − 1)Ts − pTs)], (2)

where Ts is the symbol period, τ is the timing phase shift, and g(t) is the pulse corresponding to the filtered signal. Because simulation generates only a small number of S-curve points, the number of points must be extended by linear interpolation to increase the number of segments; the error between the interpolated points and the actual S-curve satisfies the accuracy requirements. Using two adjacent error means (x0, y0) and (x1, y1), we obtain the two-point linear equation:

(y − y0) · (x1 − x0) = (x − x0) · (y1 − y0). (3)
By substituting a value of x between the two known points into Eq. (3), one or more unknown quantities between the two known quantities can be approximated by the linear equation.
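Equation (1) can be exercised on a toy sequence; a minimal Python sketch (the Q branch is zeroed for brevity) shows that perfectly timed samples give zero TED output:

```python
import numpy as np

def gardner_ted(y_i, y_q):
    """Gardner timing error, Eq. (1): two samples per symbol, with symbol
    strobes at even indexes and midpoint samples at odd indexes."""
    u = []
    for r in range(1, len(y_i) // 2):
        k = 2 * r
        u.append(y_i[k - 1] * (y_i[k] - y_i[k - 2])
                 + y_q[k - 1] * (y_q[k] - y_q[k - 2]))
    return np.array(u)

# perfectly timed alternating symbols: every midpoint sample is zero,
# so the detected timing error is zero for each symbol
y_i = np.array([1, 0, -1, 0, 1, 0, -1, 0, 1], dtype=float)
errs = gardner_ted(y_i, np.zeros_like(y_i))
# errs -> all zeros
```

With a timing offset, the midpoint samples become nonzero and ut(r) takes the sign of the offset, which is what produces the S-curve when averaged as in Eq. (2).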
Timing Error Detection and Recovery . . .
2.2 LUT-Based Search Estimation

After the LUT is established, the mean value of the phase error can be calculated by the timing controller, and the time delay δ can be estimated from the LUT. Assume that y is the error mean actually measured by Eq. (2) and that it falls between two adjacent LUT points (x0, y0) and (x1, y1). In order to improve the accuracy, we use these two points for a second linear interpolation, dividing the interval into ten segments again. According to Eq. (3), the coordinates in the interval after interpolation can be expressed as (xk, yk), k = 1, 2, 3, ..., 10, where xk = (1 − 0.1k)x0 + 0.1kx1 and yk = y0 + 0.1k(y1 − y0). The delay δ is then estimated as xk or xk+1 according to the distance between y and yk or yk+1: δ = xk if |y − yk| < |yk+1 − y|, and δ = xk+1 otherwise.
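The table lookup of Sect. 2.2 inverts the S-curve on its monotonic central branch. A Python sketch with a synthetic sinusoidal S-curve, where the second ten-segment interpolation is folded into a single dense np.interp grid (the curve's amplitude and the grid sizes are illustrative):

```python
import numpy as np

# synthetic S-curve samples (timing offset tau -> mean TED output);
# a real table would come from averaging Eq. (2) over many symbols
tau = np.linspace(-0.5, 0.5, 21)
s = 5e-3 * np.sin(2 * np.pi * tau)       # illustrative amplitude/shape

# densify by linear interpolation (Eq. (3)) on the monotonic central branch
branch = (tau >= -0.25) & (tau <= 0.25)
tau_dense = np.linspace(-0.25, 0.25, 501)
s_dense = np.interp(tau_dense, tau[branch], s[branch])

def estimate_delay(y):
    """Return the tau whose tabulated S-value is closest to the measured mean y."""
    return tau_dense[np.argmin(np.abs(s_dense - y))]

true_tau = 0.12
y_meas = 5e-3 * np.sin(2 * np.pi * true_tau)
# estimate_delay(y_meas) is close to 0.12
```

Restricting the lookup to the monotonic branch keeps the inverse single-valued, which is why the method determines the S-curve segment first before interpolating.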
3 Experimental Results

3.1 S-Curve Test
In order to test the specific difference of the S-curve at different SNRs, the source signal is set to a 16QAM random signal, and the number of sampling points is 200,000. The S-curves drawn at SNRs of 5 dB and 20 dB are shown in Fig. 2.
Fig. 2. S-curve under different SNR (left panel: SNR = 5 dB; right panel: SNR = 20 dB; Err on the vertical axis, ×10⁻³, versus timing offset from −0.5 to 0.5)
The mean curve is obtained by averaging the S-values at SNRs of 5 and 20 dB. The MSE values between the S-curves at 5 dB and 20 dB SNR and the mean curve are 1.3930e−07 and 1.9006e−07, respectively. This shows that the difference between S-curves at different SNRs is small, because the phase error is averaged.
3.2 Time Delay δ Estimation and Error Test Results
Test Results in Multipath Channel. The test results show that the choice of SNR has little effect on the S-curve. Moreover, due to the averaging of the phase error, we can still obtain an accurate delay δ under the influence of AWGN and multipath channels (Fig. 3).
Fig. 3. The effect of estimated SNR on estimation error in multipath channel: a S-curve SNR = 5 dB, test estimated SNR = 5 dB; b S-curve SNR = 5 dB, test estimated SNR = 20 dB (each panel plots the estimate δ, the estimated absolute error, and the estimated relative error versus δ over [−0.5, 0.5])
4 Conclusion

In this paper, we study the basic principle of the S-curve of the timing error detector and of linear interpolation, and their application in timing error estimation. The estimation performance of this method under different SNRs is simulated in MATLAB, which provides a theoretical basis and data reference for subsequent Gardner loop estimation. The final test results show that the SNR used for the test estimation has little effect on the performance, and the S-curve is insensitive to the SNR selection. Compared with the traditional Gardner timing recovery algorithm, this method has better performance against AWGN and multipath channel interference and more accurate time delay estimation.
References

1. Zhao Y (2014) Research on all digital receiving timing synchronization algorithm in satellite communication. Chongqing Univ Posts Telecommun 157(10):66
2. Zhao J, Jiang Q (2009) Research on timing synchronization based on LEO satellite communication. Radio Commun Technol 35(5):19–22
3. Zhang DM, Li X, Chen JT (2019) Research on timing synchronization algorithm of primary synchronization signal in 5G system. Study Opt Commun (03):59–64
4. Zeng M, Luo Y, Jiang H, Jiang LL (2019) Research on improved Gardner symbol rate synchronization algorithm. Process Autom Instrum 40(07):76–78
5. Liu YC, Qiu YT, Li CH (2017) Simulation and implementation of Gardner bit timing algorithm based on interpolation. Electron Technol 46(11):51–53
6. Shi X, Xue TJ (2016) Research on symbol synchronization technology based on Gardner loop. Microcomput Appl 32(10):51–53
7. Fu YM, Zhu J, Qin WM (2012) An improved Gardner timing recovery algorithm. J Southwest Univ Sci Technol 33(06):191–198
8. Zhao Y, Guo JB (2011) A new lock detection algorithm for Gardner's timing recovery. In: 2011 IEEE 13th international conference on communication technology (ICCT)
Robust Interference-plus-Noise Covariance Matrix Reconstruction Algorithm for GNSS Receivers Against Large Gain and Phase Errors

Bo Hou1(B), Haiyang Wang1, Zhikan Chen2, Zhiliang Fan1, and Zhicheng Yao1

1 Rocket Force University of Engineering, Xi'an 710025, China [email protected]
2 Unit 66133, Beijing 100043, China
Abstract. A novel robust interference-plus-noise covariance (INC) matrix reconstruction algorithm is proposed for global navigation satellite system (GNSS) receivers. Instead of using the presumed interference steering vectors (SVs) to reconstruct the INC matrix, the SVs projected onto the interference subspace are utilized so as to mitigate large SV mismatches caused by gain and phase errors. Simulation results indicate that the proposed algorithm performs better than the other INC matrix reconstruction algorithms in output carrier-to-noise (C/N0) ratio. Moreover, the effectiveness of the proposed algorithm is validated by the GNSS software receiver.

Keywords: GNSS receiver · Interference mitigation · Interference-plus-noise covariance matrix reconstruction · Gain and phase errors

1 Introduction
GNSS has been widely applied in both civil and military fields because it can provide all-time, all-weather, high-accuracy position, navigation and timing service to global users. Nevertheless, GNSS receivers are easily disrupted by strong interferences, since GNSS signals are extremely weak, even 20 dB lower than the thermal noise when they arrive at receivers [1, 2]. Therefore, array antennas with adaptive beamforming algorithms, which can form nulls toward the incoming interferences while steering the array response toward the desired signals, are always adopted to cancel interferences [3–5]. Besides, to maintain the performance of adaptive beamformers in the presence of SV mismatches,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_12
INC matrix reconstruction algorithms are proposed in [6, 7]. The INC matrix reconstruction method can be robust to look direction errors, but it no longer works well when there are gain and phase errors. In the current work, we propose a novel INC matrix reconstruction algorithm for GNSS receivers against large SV mismatches caused by gain and phase errors. To reduce the reconstruction distortion in the presence of gain and phase errors, the projection of the presumed interference SVs onto the interference subspace is used to reconstruct the interference covariance matrix, instead of using the presumed interference SVs directly. Moreover, the noise power is taken as the minimum eigenvalue of the sample covariance matrix (SCM), from which the noise covariance matrix can be obtained. With these two components, the reconstructed INC matrix is acquired. In addition, the proposed algorithm is compared with the other typical INC matrix reconstruction algorithms, and its effectiveness is validated by the GNSS software receiver.
2 Signal Model

Consider a uniform linear array (ULA) with M omni-directional antenna elements. Assume that L desired GNSS signals impinge on the array from the directions θl (l = 1, 2, ..., L) and Q interferences are incident from the directions θq (q = 1, 2, ..., Q). Let x(t) ∈ C^{M×1} denote the time domain samples at time t, where C is the set of complex numbers; it can be expressed as

x(t) = xs(t) + xi(t) + n = Σ_{l=1}^{L} a(θl)sl(t) + Σ_{q=1}^{Q} a(θq)sq(t) + n, (1)
where xs(t), xi(t) and n represent the desired GNSS signals, interferences and noise at time t, respectively. sl(t) and sq(t) are, respectively, the complex envelopes of the lth desired GNSS signal and the qth interference, and a(θl) and a(θq) denote the corresponding SVs. Moreover, θl and θq are, respectively, the directions of arrival (DOAs) of the lth desired GNSS signal and the qth interference. For the ULA, the SV can be expressed in the form a(θ) = [1, e^{−j2πd sin θ/λ}, ..., e^{−j2π(M−1)d sin θ/λ}], where d is the distance between adjacent elements, λ denotes the incident wavelength, and j² = −1. In practice, each desired GNSS signal can be processed in an independent channel, and thus the output of the beamformer associated with the lth GNSS satellite can be given by

yl(t) = wl^H x(t), (2)

where the superscript {·}^H stands for the Hermitian transpose. wl is the weight vector for the lth GNSS satellite, and it can be found as the solution to

min_{wl} wl^H R_{i+n} wl subject to wl^H ā(θl) = 1, (3)
with

R_{i+n} = E{(xi + n)(xi + n)^H}, (4)

where R_{i+n} ∈ C^{M×M} represents the INC matrix, E{·} denotes the mathematical expectation operation, and ā(θl) stands for the presumed SV of the lth desired GNSS signal. However, it is infeasible to acquire the theoretical INC matrix R_{i+n}, which is always replaced by the SCM R̃ = (1/K) Σ_{k=1}^{K} x(k)x^H(k), where K denotes the number of snapshots and R̃ → R_{i+n} as K → ∞. Then, the weight vector for the lth desired GNSS signal can be presented as

wl = R̃^{−1} ā(θl) / (ā^H(θl) R̃^{−1} ā(θl)). (5)
With (5), the array output C/N0 corresponding to the lth GNSS signal can be written as

C/N0(l) = B σl² |wl^H ā(θl)|² / (wl^H R_{i+n} wl), (6)

where B stands for the bandwidth of the desired GNSS signal and σl² denotes the power of the lth desired GNSS signal.
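The Capon/MVDR weight (5) can be exercised numerically; a sketch with an 8-element half-wavelength ULA and one strong interferer (the array size, angles and powers are illustrative, not the paper's simulation settings):

```python
import numpy as np

rng = np.random.default_rng(1)
M, K = 8, 2000                      # elements, snapshots
d_over_lambda = 0.5

def steer(theta_deg):
    th = np.deg2rad(theta_deg)
    return np.exp(-2j * np.pi * d_over_lambda * np.arange(M) * np.sin(th))

# snapshots: 40 dB interferer from -20 deg plus unit-power noise
# (the desired GNSS signal is far below the noise and is omitted)
a_int, a_sig = steer(-20.0), steer(5.0)
s_int = 100 * (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2)
X = np.outer(a_int, s_int)
X += (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)

R = X @ X.conj().T / K              # sample covariance matrix (SCM)
Rinv = np.linalg.inv(R)
w = Rinv @ a_sig / (a_sig.conj() @ Rinv @ a_sig)   # Eq. (5)

gain_sig = abs(w.conj() @ a_sig)    # -> 1.0 (unit gain, the constraint in (3))
gain_int = abs(w.conj() @ a_int)    # deep null toward the interferer
```

The unit-gain constraint of (3) is met exactly by construction, while the interferer direction is suppressed by roughly the interference-to-noise ratio.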
3 Proposed Algorithm

The conventional INC matrix reconstruction method proposed in [6] can be presented as

R̂_{i+n} = ∫_{Θ̄} P̄(θ) ā(θ) ā^H(θ) dθ, (7)
where P̄(θ) = 1/(ā^H(θ) R̃^{−1} ā(θ)) is the Capon spatial spectrum estimator, Θ̄ denotes the angular complement sector of Θ in which the signal of interest (SOI) is located, and ā(θ) represents the presumed SV associated with a certain direction θ. However, the reconstructed INC matrix will be seriously distorted by the gain and phase errors, which causes a dramatic performance degradation. In this work, a novel INC matrix reconstruction algorithm against large gain and phase errors is put forward. The theoretical INC matrix in (4) can be further developed as

R_{i+n} = R_i + R_n = Σ_{q=1}^{Q} σq² a(θq) a^H(θq) + σn² I, (8)
where σq2 denotes the power of the qth interference, σn2 represents the noise power and I is the M -order identity matrix. It can be seen that the theoretical INC matrix can be divided into the interference component Ri and the noise component Rn . Therefore, we can respectively reconstruct these two components.
The SCM mentioned above can be decomposed as

R̃ = Σ_{m=1}^{M} λm um um^H = U_I Λ_I U_I^H + U_SN Λ_SN U_SN^H, (9)
where λ1 ≥ λ2 ≥ ··· ≥ λM are the eigenvalues in descending order and um is the eigenvector corresponding to the eigenvalue λm. Since the powers of the interferences are much larger than those of the GNSS signals and noise, the eigenvalues associated with the interferences are easily distinguished. Consequently, the eigenvectors u1, u2, ..., uQ corresponding to the large eigenvalues span the interference subspace, and the remaining eigenvectors uQ+1, uQ+2, ..., uM span the signal-plus-noise subspace. Moreover, U_I = [u1, u2, ..., uQ], U_SN = [uQ+1, uQ+2, ..., uM], Λ_I = diag{λ1, λ2, ..., λQ}, and Λ_SN = diag{λQ+1, λQ+2, ..., λM}. The interference subspace spanned by the actual interference SVs a(θ1), a(θ2), ..., a(θQ) is equivalent to that spanned by u1, u2, ..., uQ. Thus, the projection of the presumed interference SVs onto the interference subspace can be denoted by

ã(θq) = U_I U_I^H ā(θq). (10)
Using the corrected interference SVs in (10) instead of the presumed interference SVs, the reconstructed interference covariance matrix can be given by

R̄_i = ∫_{Θ1∪···∪ΘQ} P̃(θ) ã(θ) ã^H(θ) dθ = ∫_{Θ1∪···∪ΘQ} P̃(θ) U_I U_I^H ā(θ) ā^H(θ) U_I U_I^H dθ, (11)
where Θ1, Θ2, ..., ΘQ denote the Q angular sectors of the interferences and P̃(θ) = 1/(ã^H(θ) R̃^{−1} ã(θ)) is the new Capon spatial spectrum estimator. Define these interference angular sectors as Θq = {θ | |θ − θq| ≤ β} (q = 1, 2, ..., Q), where β is a bound value that specifies the range of the interference angular sectors. Note that the interference DOAs are required as a priori knowledge; they can be obtained easily with conventional DOA estimation algorithms, since the powers of the interferences are much larger than those of the desired GNSS signals and noise. Moreover, the minimum eigenvalue of R̃ can be regarded as the noise power
σ̃n², with which the estimated noise covariance matrix can be written as

R̄_n = σ̃n² I = λ_M I. (12)
With (11) and (12), the reconstructed INC matrix Ri+n can be presented as
R̄_{i+n} = R̄_i + R̄_n = ∫_{Θ1∪···∪ΘQ} P̃(θ) U_I U_I^H ā(θ) ā^H(θ) U_I U_I^H dθ + λ_M I. (13)
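Steps (9)–(13) can be sketched numerically, approximating the sector integrals by 1° sums; the number of interferers Q is assumed known and the estimated DOAs are taken equal to the true ones:

```python
import numpy as np

rng = np.random.default_rng(2)
M, K, Q = 8, 2000, 2
n = np.arange(M)

def steer(theta_deg):
    return np.exp(-2j * np.pi * 0.5 * n * np.sin(np.deg2rad(theta_deg)))

# snapshots with two strong interferers (40 dB from -50 deg, 30 dB from
# -20 deg) plus unit-power noise
thetas, amps = [-50.0, -20.0], [100.0, 31.6]
X = sum(a * np.outer(steer(t),
                     (rng.standard_normal(K) + 1j * rng.standard_normal(K)) / np.sqrt(2))
        for t, a in zip(thetas, amps))
X += (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
R = X @ X.conj().T / K

# Eq. (9): eigendecomposition; the Q largest eigenvalues span the
# interference subspace
lam, U = np.linalg.eigh(R)                # ascending eigenvalue order
UI = U[:, -Q:]
P_I = UI @ UI.conj().T                    # projector used in Eq. (10)
Rinv = np.linalg.inv(R)

# Eqs. (11)-(13): integrate over the interference sectors (beta = 5 deg)
Ri = np.zeros((M, M), dtype=complex)
for t0 in thetas:                         # estimated DOAs ~ true DOAs
    for th in np.arange(t0 - 5, t0 + 5 + 1):
        a_proj = P_I @ steer(th)          # Eq. (10): projected SV
        p = 1.0 / np.real(a_proj.conj() @ Rinv @ a_proj)  # Capon spectrum
        Ri += p * np.outer(a_proj, a_proj.conj())
Ri_n = Ri + lam[0] * np.eye(M)            # Eqs. (12)-(13): noise = min eigenvalue
```

Because each projected SV lies in the interference subspace, gain and phase errors in the presumed SVs cannot push reconstruction energy outside that subspace, which is the mechanism the paper relies on.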
Utilizing a similar idea to [6], the orthogonal component e⊥ of the SV mismatch vector e can be obtained by solving a quadratically constrained quadratic programming problem, with which the SOI SV corresponding to the lth GNSS satellite is corrected as ā(θl) + e⊥. Substituting (13) into (5), the weight vector for the lth GNSS satellite can be finally expressed as

wl = R̄_{i+n}^{−1}[ā(θl) + e⊥] / ([ā(θl) + e⊥]^H R̄_{i+n}^{−1}[ā(θl) + e⊥]). (14)

4 Simulations
Consider a ULA with 10 omnidirectional antenna elements separated by half a wavelength. The desired signal is assumed to be a BeiDou-2 (BD2) signal at the B3 band, whose carrier frequency is 1268.52 MHz and bandwidth is 20.46 MHz. The Doppler shift and chip offset are defined as −1 kHz and 2230 chips, respectively. The desired BD2 signal is incident from the direction 5° and its SNR is −20 dB. Two wideband interferences, whose bandwidths are the same as that of the desired BD2 signal, impinge on the array from the directions −50° and −20°, with INRs of 40 dB and 30 dB, respectively. The sampling frequency and the analog intermediate frequency are set to 62 MHz and 46.52 MHz, respectively. The performance of the proposed algorithm is mainly compared with the two typical INC matrix reconstruction-estimation beamforming (REB) algorithms in [6, 7]. For the REB algorithm in [6], the SOI angular sector is set to Θ = [0°, 10°]. For the REB algorithm in [7], the involved robust Capon beamforming (RCB) bound is set as ε = √0.1 and 7 dominant eigenvectors are used. As for the proposed algorithm, assume that β = 5° and the estimated interference DOAs are θ̄1 and θ̄2, respectively, based on which the two interference angular sectors can be obtained as Θ1 = [θ̄1 − 5°, θ̄1 + 5°] and Θ2 = [θ̄2 − 5°, θ̄2 + 5°]. The gain and phase errors at each element are drawn from the random generators N(1, 0.1²) and N(0°, (12°)²), respectively, which have large variances. All results are the average of 200 Monte Carlo runs. Without loss of generality, the input SNR is set to [−30 dB, −10 dB] as the desired BD2 signal is weak, and the number of snapshots is fixed at K = 620. As depicted in Fig. 1a, the proposed algorithm always has a much better output C/N0 performance than the other two typical REB algorithms. To further validate its effectiveness, the BD2 software receiver is adopted, and the acquisition result shown in Fig. 1b confirms that the desired BD2 signal has been acquired successfully. The simulation results indicate that the proposed algorithm is robust to large gain and phase errors.
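The sector-based INC matrix reconstruction underlying this comparison can be sketched numerically. The following is a minimal illustration, not the paper's exact algorithm (which projects the presumed interference steering vectors onto the interference subspace): it simply averages a(θ)a(θ)^H over each interference angular sector, using the simulation values from the text (10-element half-wavelength ULA, interferers at −50° and −20° with INRs of 40 dB and 30 dB, β = 5°); the angular grid step and noise normalization are our assumptions.

```python
import numpy as np

def steering_vector(theta_deg, n_elements=10, d_over_lambda=0.5):
    """Steering vector of a half-wavelength ULA for angle theta (degrees)."""
    n = np.arange(n_elements)
    phase = 2j * np.pi * d_over_lambda * n * np.sin(np.deg2rad(theta_deg))
    return np.exp(phase)

def reconstruct_inc(interference_doas, inrs_db, beta=5.0, noise_power=1.0,
                    n_elements=10, grid_step=0.1):
    """Reconstruct an INC matrix by averaging a(theta) a(theta)^H over each
    interference angular sector [doa - beta, doa + beta]."""
    R = noise_power * np.eye(n_elements, dtype=complex)
    for doa, inr_db in zip(interference_doas, inrs_db):
        power = 10.0 ** (inr_db / 10.0) * noise_power
        grid = np.arange(doa - beta, doa + beta + grid_step, grid_step)
        sector = np.zeros((n_elements, n_elements), dtype=complex)
        for theta in grid:
            a = steering_vector(theta, n_elements)
            sector += np.outer(a, a.conj())
        R += power * sector / len(grid)  # normalized sector average
    return R

# Simulation values from the text: interferers at -50 deg (INR 40 dB)
# and -20 deg (INR 30 dB), sector half-width beta = 5 deg.
R_inc = reconstruct_inc([-50.0, -20.0], [40.0, 30.0])
```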
5 Conclusion
A novel INC matrix reconstruction algorithm has been proposed for GNSS receivers. Instead of utilizing the presumed interference SVs to reconstruct the
Fig. 1. a Output C/N0 versus input SNR in the case of large gain and phase errors; b signal acquisition result with the BD2 software receiver.
INC matrix directly, the projection of the presumed interference SVs onto the interference subspace is utilized, which can effectively avoid the reconstruction distortion in the presence of gain and phase errors. Simulation results show that the output C/N0 performance of the proposed algorithm is always better than those of the other REB algorithms. Moreover, the desired BD2 signal can be acquired by the BD2 software receiver successfully after interference mitigation with the proposed RAB algorithm.
References
1. Gao GX, Sgammini M, Lu MQ, Kubo N (2016) Protecting GNSS receivers from jamming and interference. Proc IEEE 104(6):1327–1338
2. Park KW, Park C (2019) Determination of LO frequency for reception of maximum number of GNSS signals in presence of interference. Electron Lett 55(9):552–554
3. Capon J (1969) High-resolution frequency-wavenumber spectrum analysis. Proc IEEE 57(8):1408–1418
4. Fante RL, Vaccaro JJ (2000) Wideband cancellation of interference in a GPS receive array. IEEE Trans Aerosp Electron Syst 36(2):549–564
5. Wang YD, Chen FQ, Nie JW, Sun GF (2016) Optimal reference element selection for GNSS power-inversion adaptive arrays. Electron Lett 52(20):1723–1725
6. Gu YJ, Leshem A (2012) Robust adaptive beamforming based on interference covariance matrix reconstruction and steering vector estimation. IEEE Trans Signal Process 60(7):3881–3885
7. Zheng Z, Zheng Y, Wang WQ, Zhang H (2018) Covariance matrix reconstruction with interference steering vector and power estimation for robust adaptive beamforming. IEEE Trans Veh Technol 67(9):8495–8503
Research on Image Recognition Technology of Transmission Line Icing Thickness Based on LSD Algorithm

Shili Liang, Jun Wang, Peipei Chen, Shifeng Yan, and Jipeng Huang(B)

School of Physics, Northeast Normal University, Changchun 130024, Jilin, China
[email protected]
Abstract. Icing on transmission lines has become one of the important factors that endanger the safe and stable operation of transmission lines. Timely identification of the ice thickness on transmission lines can effectively prevent the damage caused by ice disasters. In order to improve the measurement accuracy of the icing thickness of transmission lines, this paper proposes a line detection method based on the LSD (line segment detector) algorithm to detect the icing thickness of transmission lines. Firstly, image processing methods such as image pre-processing, morphological processing, and edge detection are used to pre-process the icing image of the transmission line. Then, the edges of the icing image are detected using the LSD algorithm. Finally, the ratio of the image pixels to the actual diameter of the wire is used to calculate the ice thickness. The experimental results show that the error of this method is 0.0443, so it can be used to measure the thickness of icing.

Keywords: Transmission line · Image processing · Icing thickness · LSD algorithm
1 Introduction

Transmission line icing is a dynamic physical phenomenon related to meteorology, fluid mechanics, and heat transfer. When environmental conditions such as ambient temperature, humidity, wind speed, terrain, and altitude are met, the transmission line will be iced; micro-meteorological factors have the greatest impact. For example, when the ambient temperature is below 0 °C, the relative humidity of the air is higher than 85%, and the wind speed is over 4 m/s, ice may form on the transmission line. At the same time, ice and snow on power transmission lines also bring great harm [1]. Ice covering can easily cause accidents such as flashover of insulators, wire swinging, tower collapses, and power communication interruptions. Therefore, ice and snow disasters and ice covering seriously threaten the normal operation of the power system. During the Spring Festival in 2008, a large area of snow and ice disasters in southern China caused power outages for more than
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_13
100 million people. The economic losses and social impact caused by the power outages are difficult to estimate. Among them, the collapse of power lines and towers and the outage of transformer stations caused direct economic losses of 10.4 billion yuan to the State Grid and 5 billion yuan to the China Southern Power Grid. Icing monitoring of transmission lines is, therefore, of great significance. At present, much research has been done on transmission line icing at home and abroad. For example, a mechanical model can be used to calculate the ice thickness of the transmission line; this method has high calculation accuracy, but it is relatively difficult to implement and many factors need to be considered. A tension sensor can also measure the ice thickness, but it has higher requirements on environmental conditions, and wire swinging will cause inaccurate measurements. Manual observation can only provide a qualitative analysis of the line, and it is not easy for workers to reach lines in remote mountainous areas, so there are inconveniences. Setting up observation stations around power lines yields data with high credibility and accuracy, but the cost is high, the construction period is long, and large-scale deployment is not convenient. In addition, using the image method to monitor line icing in real time has been widely recognized: a monitoring terminal system is installed on the tower of the transmission line, and the instrument transmits the captured icing images to the monitoring center, which provides great convenience for the operation and maintenance staff to observe the icing situation. There are also drones equipped with cameras for transmission line inspections, whose pictures can be transmitted to the monitoring terminal for convenient analysis by the staff.
Therefore, it is of great significance to use ice-covered images to detect the icing thickness of transmission lines accurately and in real time. This paper proposes a method for calculating the ice thickness of transmission lines based on the LSD algorithm [2]. Image pre-processing, morphological processing, edge detection, and other image processing methods are combined to pre-process the ice-covered transmission line pictures, and the ice thickness is finally calculated.
2 Image Preprocessing

2.1 Median Filtering

In reality, most transmission lines are in natural environments, the cameras will capture complicated backgrounds such as tree branches, and some noise will be introduced during the transmission of image data. Noise reduces the accuracy and sharpness of the image data and affects the edge processing of the image, so the image needs to be denoised. Here, according to the characteristics of the image, the median filtering method, which has a better denoising effect, is used to process the image. Median filtering is a kind of non-linear smoothing filter that can suppress noise while avoiding blurring of image edges, which is more conducive to the detection and extraction of image edges [3]. The wire icing picture is from the observation point of the No. 249 tower of a 110 kV line of the Yunnan Power Grid, at 27.6415° north latitude and 1621 m altitude. The wire of this line is steel-cored aluminium stranded wire LGJ-400/50, whose diameter d is 27.63 mm, as shown in Fig. 1.
Fig. 1 Image of transmission line icing
Median filtering was performed on the ice-covered image; the results are shown in Fig. 2. As can be seen from the filtered picture, this method effectively removed some of the noise and facilitates the subsequent image processing.
Fig. 2 Median filter processing
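As a concrete illustration of this step, the following sketch applies a 3×3 median filter to a small synthetic image standing in for the iced-line photograph; the image and noise level are made up for the example, and `scipy.ndimage` is used in place of whatever toolbox the authors used.

```python
import numpy as np
from scipy.ndimage import median_filter

# Hypothetical noisy grayscale image: a bright "wire" stripe with
# salt-and-pepper (impulse) noise, standing in for the iced-line photo.
rng = np.random.default_rng(0)
img = np.zeros((64, 64))
img[30:34, :] = 200.0                      # the wire
noise_idx = rng.random(img.shape) < 0.05   # 5% impulse noise
img[noise_idx] = 255.0

# 3x3 median filtering suppresses impulse noise while keeping edges sharp,
# which is why it is preferred here over linear smoothing.
denoised = median_filter(img, size=3)
```

A linear (mean) filter would instead smear each noise spike over its neighborhood and soften the wire's edges, which is exactly what the later edge detection step cannot tolerate.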
2.2 Image Binarization

In the RGB color system, color images need to be grayscaled and binarized to facilitate subsequent image processing and recognition. Image graying quantifies the brightness values of the color image, and image binarization changes the gray value of each point in the image to 0 or 255, so that the whole image presents an obvious black-and-white effect. Figure 3 shows the result of image binarization.

2.3 Image Morphology Processing

Morphological denoising is mainly accomplished through combinations of the open operation and the closed operation, and most of the noise can be suppressed by using different
Fig. 3 Image binarization
combinations. According to the characteristics of the image noise, smoothing the image contours can filter out isolated points and burrs, and smoothing the edges of the image while joining adjacent objects can fill in cracks. In addition, different structuring elements can be used to produce different filtering effects [4]. Open operation: the original image is eroded and then dilated; the purpose is to separate touching targets while keeping the size of the original target, as shown in formula (1):

A ∘ B = (A ⊖ B) ⊕ B = ∪{(B)_z | (B)_z ⊆ A}   (1)

Closed operation: the original image is dilated and then eroded; the purpose is to join broken targets while basically maintaining the size of the original target, as shown in formula (2):

A · B = (A ⊕ B) ⊖ B   (2)
After median filtering, grayscaling, and binarization, the ice-covered image of the transmission line is processed by image morphology. The processing results are shown in Figs. 4 and 5.
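The open and closed operations of formulas (1) and (2) can be illustrated on a small synthetic binary image; the image is hypothetical, and `scipy.ndimage` stands in for whatever morphology toolbox the authors used.

```python
import numpy as np
from scipy.ndimage import binary_opening, binary_closing

# Hypothetical binary ice-line image: a horizontal band (the iced wire),
# plus an isolated noise speck and a one-pixel crack across the band.
img = np.zeros((20, 20), dtype=bool)
img[8:12, :] = True      # the wire band
img[8:12, 10] = False    # a crack across the band
img[2, 3] = True         # an isolated speck

selem = np.ones((3, 3), dtype=bool)  # structuring element B

# Opening (erode then dilate) removes the speck, per formula (1).
opened = binary_opening(img, structure=selem)
# Closing (dilate then erode) fills the crack, per formula (2).
closed = binary_closing(img, structure=selem)
```

This is exactly the division of labor described above: opening filters out isolated points and burrs, while closing fills cracks by joining adjacent objects.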
3 Edge Detection

The methods of edge detection are mainly divided into traditional methods and modern methods. Good edge detection minimizes the possibility of misclassifying non-edge points as edges. The basic principle of the traditional methods is to convolve an operator template with the image; they are simple and fast, but sensitive to noise. Modern methods mainly include wavelet methods, mathematical morphology methods, fuzzy mathematical methods, and genetic algorithms. These methods have their own advantages, disadvantages, and application scopes. According to the characteristics of the image to be processed, this paper selects Canny edge detection. The Canny operator is a multi-stage edge detection algorithm.
Fig. 4 Open operation
Fig. 5 Close operation
This method has good anti-noise ability, high positioning accuracy, and wide applicability [5]. Canny's goal was to find an optimal edge detection algorithm, where optimal edge detection means the following: (1) Optimal detection: the algorithm identifies the actual edges in the image with high probability, and the probability of falsely detecting non-edges is very small. (2) Optimal positioning: the detected edge points are closest to the actual edge points, so the positioning is accurate. (3) Single edge response: the edge points detected by the operator should correspond one-to-one to the actual edge points.
The Canny edge detection criteria are expressed as follows. The detection criterion is:

SNR = ∫_{−w}^{+w} f(x)G(−x)dx / (ρ [∫_{−w}^{+w} f²(x)dx]^{1/2})   (3)

SNR stands for signal-to-noise ratio; maximizing it reduces the error rate of edge detection. f(x) is the impulse response of the filter on [−w, w], G(−x) is the image edge function, and ρ is the mean square error of the Gaussian noise. The larger the SNR value, the better the detection effect. The optimal positioning criterion is:

D = ∫_{−w}^{+w} f′(x)G′(−x)dx / (ρ [∫_{−w}^{+w} f′²(x)dx]^{1/2})   (4)

This criterion makes the detected edge as close to the real edge as possible; the larger D is, the more accurate the edge positioning. The single-edge-response criterion is:

L(f) = [∫_{−∞}^{+∞} f′²(x)dx / ∫_{−∞}^{+∞} f″²(x)dx]^{1/2}   (5)

This criterion ensures that each edge has a unique response. The steps of the Canny edge detection algorithm are as follows: (1) Pre-process the image, then smooth it with a Gaussian filter. (2) Calculate the magnitude and direction of the image gradient after filtering, using first-order finite differences of the partial derivatives. (3) Perform non-maximum suppression on the gradient magnitude to obtain a refined edge; that is, find the local maximum points of the image gradient and set the non-local-maximum points to zero. (4) Detect and connect edges with double thresholds T1 and T2 (T1 > T2) to get the edge image. The effect of the Canny edge detection algorithm used in this paper is shown in Fig. 6.
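The steps above can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a full Canny implementation: non-maximum suppression is omitted, hysteresis is approximated with connected components, and the thresholds and test image are arbitrary choices of ours.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, label

def simple_canny(img, sigma=1.0, t_low=5.0, t_high=20.0):
    """Simplified sketch of the Canny pipeline: Gaussian smoothing,
    gradient magnitude, then double-threshold hysteresis linking
    (non-maximum suppression is omitted for brevity)."""
    smoothed = gaussian_filter(img.astype(float), sigma)
    gy, gx = np.gradient(smoothed)   # first-order finite differences
    mag = np.hypot(gx, gy)           # gradient magnitude
    strong = mag >= t_high
    weak = mag >= t_low
    # Hysteresis: keep weak regions only if they touch a strong pixel.
    labels, n = label(weak)
    keep = np.zeros(n + 1, dtype=bool)
    keep[np.unique(labels[strong])] = True
    keep[0] = False                  # background label is never an edge
    return keep[labels]

# Hypothetical test image: a bright square on a dark background.
img = np.zeros((40, 40))
img[10:30, 10:30] = 100.0
edges = simple_canny(img)
```

The double threshold plays the role of T1 and T2 in step (4): strong responses seed the edges, and weaker responses survive only when connected to them.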
4 LSD Algorithm

After image pre-processing and edge detection, the straight line segments in the picture need to be extracted. The Hough transform [6, 7] and the Radon transform are common line detection algorithms. The Hough transform maps a curve in the original space to a point in the parameter space according to the curve expression and the duality of points and lines. This simplifies the problem
Fig. 6 Canny edge detection results
from detecting a curve in the original space to finding a peak in the parameter space, but the complexity of the Hough transform algorithm is relatively high. The Radon transform projects an image into (ρ, θ) space: straight lines with high gray values form bright points in (ρ, θ) space, and line segments with low gray values form dark points. The detection of straight lines is thus transformed into the detection of bright and dark points in the transform domain, but the Radon transform is slower. Therefore, this paper uses the LSD algorithm [8, 9], a more optimized line detection method. The LSD algorithm is a line segment detection algorithm that extracts local line contours from the image to be detected. It has the advantages of a fast detection rate and automatic adjustment of detection parameters, and was proposed only in 2010. The algorithm first calculates the level-line angle at each pixel to construct the level-line field, as shown in Fig. 7, and then uses a region growing algorithm to merge pixels with approximately the same direction to obtain a series of line support regions, as shown in Fig. 8. Pixel merging is finally performed in these regions to extract straight line segments. At the same time, the algorithm uses the "a contrario" model and the Helmholtz principle to validate candidate lines and control false detections, obtaining better results. LSD extracts lines based on effective error control and pixel merging. The detailed steps of the line extraction algorithm are as follows: (1) Image scaling: the image itself is a discrete sequence of grayscale values.
In order to reduce or even eliminate the sawtooth effect present in many images, the first step of the LSD algorithm is to shrink the input image, which is done by Gaussian down-sampling. (2) Gradient calculation: this step relies on the independence of the distribution of adjacent points. The gradient is computed as:

gx(x, y) = [i(x + 1, y) + i(x + 1, y + 1) − i(x, y) − i(x, y + 1)] / 2   (6)

with the symmetric expression gy(x, y) = [i(x, y + 1) + i(x + 1, y + 1) − i(x, y) − i(x + 1, y)] / 2.

Fig. 7 Image gradient and level-lines

Fig. 8 Line support regions

The gradient (level-line) direction is:

Ang(x, y) = arctan( gx(x, y) / (−gy(x, y)) )   (7)

and the gradient magnitude is:

G(x, y) = [gx²(x, y) + gy²(x, y)]^{1/2}   (8)
(3) Gradient pseudo-ordering: since the determination of straight-line candidate regions uses gradient-neighborhood region growing, regions are grown according to the processing order of pixels; line segment detection should therefore start from pixels with high gradient magnitude. (4) Gradient threshold: a pixel with a small gradient magnitude means the gray level changes slowly there. Because of the quantization of pixel values, errors may occur in the gradient calculation, so a threshold needs to be set. (5) Region growing: select an unused pixel in the sorted list as a seed point. One can either take the segmented image as the starting point to obtain a segmented region, or start from a single pixel, finding and merging surrounding pixels with the same or similar direction to obtain a segmented region. (6) Rectangle estimation: find the rectangle corresponding to the line, and then judge whether the candidate region can be used to extract a straight line.
(7) Determination of straight line segments: compute and refine each rectangle to improve its NFA (number of false alarms) value. If the NFA value of a rectangle is less than the threshold S, the region is taken as a straight line and output. In this paper, the LSD detection algorithm is used to detect the icing edges of the transmission line, as shown in Fig. 9.
Fig. 9 LSD algorithm detection results
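The region-growing step at the heart of LSD (steps (3)–(5) above) can be sketched as follows. This is an illustrative simplification, not the reference LSD implementation: the angle tolerance, gradient threshold, and synthetic gradient field are all assumptions, and the angle comparison ignores 2π wrap-around.

```python
import numpy as np

def region_grow(angles, mags, seed, tau=np.deg2rad(22.5), g_min=1.0):
    """Sketch of LSD-style region growing: starting from a seed pixel,
    merge 8-connected pixels whose level-line angle stays within the
    tolerance tau of the region's running mean angle."""
    h, w = angles.shape
    region = [seed]
    visited = {seed}
    mean_angle = angles[seed]
    i = 0
    while i < len(region):
        y, x = region[i]
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                p = (y + dy, x + dx)
                if (0 <= p[0] < h and 0 <= p[1] < w and p not in visited
                        and mags[p] >= g_min
                        and abs(angles[p] - mean_angle) <= tau):
                    visited.add(p)
                    region.append(p)
                    mean_angle = np.mean([angles[q] for q in region])
        i += 1
    return region

# Hypothetical gradient field: a horizontal line of consistent angle.
angles = np.full((10, 10), np.pi / 2)   # background level-line angle
angles[5, 2:8] = 0.0                    # line-support pixels
mags = np.zeros((10, 10))
mags[5, 2:8] = 10.0                     # only the line has strong gradient
region = region_grow(angles, mags, seed=(5, 4))
```

The grown region corresponds to one line support region of Fig. 8; the full algorithm would then fit a rectangle to it and validate it via the NFA test of step (7).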
5 Calculation of Ice Thickness on Transmission Lines

5.1 Calculation of Ice Thickness Based on the LSD Algorithm

The imaging principle of the camera is shown in Fig. 10. The camera can be seen as a convex lens, where f is the focal length, d is the object distance, and v is the image distance. The imaging formula is as follows [9]:

Rp = (v − f)R0 / f   (9)

Fig. 10 Imaging principle of the camera
where Rp is the size of the image and R0 is the radius of the power transmission line. According to the icing image of the transmission line, the LSD algorithm is used to accurately detect the iced wire, and the pixel value corresponding to the diameter of the iced wire is x2. From the above, the actual diameter of the bare wire is d, and its corresponding pixel value is x1. Through this proportional relationship, the ice thickness of the wire in the iced picture can be calculated [10]. The calculation formula of the ice thickness is:

D = (x2/x1 − 1) × d/2   (10)

Using the LSD algorithm, and after taking multiple measurements and averaging, x2 = 29.95 and x1 = 11.50 (in pixels); with d = 27.63 mm, the formula gives D = 22.16 mm.

5.2 Comparison with Other Line Detection Methods

According to the literature [11], the average ice thickness of this Yunnan power grid line is 21.22 mm. Table 1 compares the results of three line detection algorithms measuring the ice thickness of the same transmission line, where A is the LSD line detection algorithm used in this paper, B is the traditional Hough transform detection algorithm, and C is the Radon transform algorithm.

Table 1 Comparison of different line detection algorithms

  Line detection algorithm   A/mm     B/mm     C/mm
  Ice thickness              22.16    20.23    22.31
  Error                      0.0443   0.0467   0.0514
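Formula (10) and the reported measurement can be checked directly in a few lines; the values are taken from the text, and the function name is ours.

```python
def ice_thickness(x2_pixels, x1_pixels, d_mm):
    """Formula (10): D = (x2/x1 - 1) * d/2, where x2 is the pixel width of
    the iced wire, x1 the pixel width of the bare wire, and d the actual
    bare-wire diameter in mm."""
    return (x2_pixels / x1_pixels - 1.0) * d_mm / 2.0

# Values measured in the paper for the LGJ-400/50 conductor.
D = ice_thickness(29.95, 11.50, 27.63)   # about 22.16 mm
```

Comparing against the 21.22 mm reference value from [11] reproduces the roughly 4.4% error reported in Table 1.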
It can be seen from Table 1 that the measurement result in this paper is credible, with an error of 0.0443, so the method can be used to measure the thickness of ice coating. This article mainly introduces a method based on the LSD algorithm, combined with image processing, to detect the ice thickness of the wire. There are still some problems and deficiencies:

1. Ice density and related issues. Different degrees of icing and uneven icing have not been considered. A large number of icing pictures could be identified and processed using deep learning methods, judging the icing thickness according to the gray values and shapes of different types of icing.
2. Micro-meteorological and other factors will greatly interfere with camera shooting. In particular, breeze-induced vibration has a great influence on the position of the wire, so images should be collected with no wind or at low wind speed whenever possible.
3. Whether the camera is covered with ice will also affect the results. If the lens is covered with ice or dust, the picture will be blurred. Therefore, it is recommended to use a camera with dust and ice protection.
4. Increase the resolution. As can be seen from the images, the resolution of the ice-covered image also affects the ice-thickness result. Using a high-pixel camera can make the measurement results more accurate.
6 Conclusion

At present, there are still some problems with the reliability and accuracy of online transmission line monitoring systems, and transmission line detection through intuitive observation methods such as video images is more convenient and faster. Building on existing methods for detecting the ice thickness of transmission lines, this paper proposes a new line detection approach based on the LSD algorithm. Using this method, ice-melting work can be well guided, which has practical significance for the safe and stable operation of the power system.

Acknowledgements. This work was funded by the Social Development Project of the Jilin Province Science and Technology Department (20190303016SF) and the Changchun Science and Technology Bureau (18DY010).
References
1. Farzaneh M, Savadjiev K (2005) Statistical analysis of field data for precipitation icing accretion on overhead power lines. IEEE Trans Power Delivery 20(2):1080–1087
2. Grompone von Gioi R, Jakubowicz J, Morel JM, Randall G (2010) LSD: a fast line segment detector with a false detection control. IEEE Trans Pattern Anal Mach Intell 32(4):722–732
3. Heng X, Shurong P, Yazhen M, Bin L (2017) Research on detection method of transmission line ice thickness based on image processing. Shaanxi Electr Power 45(05):32–35
4. Duan X, Wenfeng D, Zikai P, Jinhong D, Yibin X (2012) Edge detection algorithm based on mathematical morphology and wavelet transform. J Comput Appl 32(S2):165–167
5. Canny J (1986) A computational approach to edge detection. IEEE Trans Pattern Anal Mach Intell 8(6):679–698
6. Fernandes LAF, Oliveira MM (2008) Real-time line detection through an improved Hough transform voting scheme. Pattern Recogn 41(1):299–314
7. Galambos C, Kittler J, Matas J (2001) Gradient based progressive probabilistic Hough transform. IEE Proc Vision Image Signal Process 148(3):158–165
8. Grompone von Gioi R, Jakubowicz J, Morel JM, Randall G (2008) On straight line segment detection. J Math Imaging Vis 32(3):313–347
9. Huohua L, Jie C, Yichun X, Chang X, Yunpeng M (2019) Powerline extraction method for aerial images based on LSD and statistical analysis. Appl Sci Technol 46(02):30–34
10. Wang JJ, Wang JH, Shao JW et al (2017) Image recognition of icing thickness on power transmission lines based on a least squares Hough transform. Energies 10(415):1–15
11. Yanpin H, Guote L, Yiwei X, Junlin Z, Zunwei S, Lizheng L (2014) Wavelet analysis image recognition of transmission line ice thickness. High Voltage Technol 40(02):368–373
A High-Frequency Acceleration Sensor for Monitoring Sloshing Response of Ships

Chuanqi Liu, Wei Wang(B), Libo Qiao, and Jingping Yang

Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, China
[email protected]
Abstract. This article introduces a high-frequency acceleration sensor for monitoring the sloshing response of ships. A ship is not only threatened by the external environment during a voyage; the hull will also slosh when its frequency of movement in the waves approaches the natural frequency of the liquid in the tank, which can seriously damage the hull structure. It is, therefore, necessary to monitor the acceleration of the hull in real time and respond promptly according to the situation. When the sensor is installed on the ship and subjected to an external acceleration, its mass blocks move relative to the base, and the optical fiber connected between the bosses is stretched or compressed accordingly; both changes shift the center wavelength of the fiber Bragg grating (FBG). The magnitude of the acceleration experienced by the vessel is thus shown as a wavelength change on the demodulator through the fiber grating. The sensor is designed to face the harsh natural environment of marine ships, can realize automatic real-time monitoring of acceleration, and helps ensure the safety of the ship while sailing.

Keywords: Sloshing response · Fiber Bragg gratings · High-frequency acceleration sensor
1 Introduction

As a large-scale transportation tool, marine ships have the advantages of huge cargo capacity and low transportation cost, which ensures the demand for marine ships in daily use. However, the many risks and challenges faced by maritime vessels in transportation contrast sharply with their advantages. For example, strong winds and waves, or collisions between reefs and ships, may pose a serious threat to ships [1]. In view of these various unexpected circumstances, the subjective judgment of the ship operator alone is not reliable. Various sensors have therefore been developed to monitor the vibration, stress, hydraulic pressure, and other indicators of the ship during sailing to protect its safety.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_14
It is worth mentioning that the threats a ship faces during navigation come not only from the external natural environment, but also from the inside of the ship. Sloshing generally occurs when the hull's motion frequency in waves is close to the natural frequency of the liquid in the ship's tank. The sloshing response causes the liquid to violently impact the internal structure of the hull, which seriously threatens the ship's navigation safety. It is, therefore, particularly important to design a sensor that can monitor the ship's sloshing response. Over the years, with the development of optical communication technology and optical fiber technology, optical fiber sensing technology, which uses the fiber itself as the sensitive element, has made great progress. Compared with traditional piezoelectric sensors, fiber grating sensors have a small volume, a large amount of sensing information, corrosion resistance, high-temperature resistance, and electromagnetic interference resistance, and can easily form sensing networks. Fiber grating sensors are thus more suitable for harsh natural environments such as marine vessels. Currently, fiber grating sensors are widely used in the measurement of physical quantities such as stress, acceleration, pressure, and temperature [2]. In the field of acceleration monitoring, sensors for monitoring the sloshing response of ships need further research and improvement. To this end, a new type of fiber grating high-frequency acceleration sensor is proposed.
2 Theory

An FBG is in fact a narrow-band filter or reflector formed in the fiber core. Figure 1 shows that when a beam of broadband light passes through an FBG, light at most wavelengths shows no obvious attenuation; only light satisfying the Bragg condition undergoes mode coupling [3] and is reflected back to the incident end. Its reflection spectrum is a narrow-band spectrum whose central wavelength satisfies the formula:

λB = 2 neff Λ   (1)
Fig. 1 Reflection and transmission spectrum of FBG
where Λ represents the grating period, neff represents the effective refractive index of the fiber core region, and λB represents the center wavelength of the FBG.
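As a quick numeric check of formula (1), with typical illustrative values (not specific to this paper's gratings):

```python
# Formula (1): lambda_B = 2 * n_eff * Lambda.
# n_eff and the grating period below are typical illustrative numbers.
n_eff = 1.45                # effective refractive index of the core
grating_period_nm = 535.0   # grating period Lambda, in nm
lambda_b_nm = 2 * n_eff * grating_period_nm
# A 535 nm period grating thus reflects near 1551.5 nm, in the C band.
```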
Changes in temperature and strain on the fiber grating cause the effective refractive index neff of the fiber core and the center wavelength λB of the FBG to change [4]. According to formula (1), the impact of these changes can be calculated using the following function:

ΔλB = 2(Λ ∂neff/∂l + neff ∂Λ/∂l)Δl + 2(Λ ∂neff/∂T + neff ∂Λ/∂T)ΔT   (2)

where the first part is the wavelength drift caused by the axial strain, determined by the refractive-index change due to the elasto-optic effect and the change of the fiber grating period Λ. The wavelength drift caused by axial strain can therefore also be presented as:

ΔλB = λB(1 − pe)ε   (3)
In this function, pe is the effective elasto-optic coefficient of the FBG; for an FBG whose core is ordinary germanium-doped silica, pe ≈ 0.22 [5]. In function (2), the second part represents the wavelength drift due to temperature changes, that is, the drift caused by the grating period and refractive-index changes that result from thermal expansion. The wavelength drift due to temperature changes can be presented as:

ΔλB = λB(a + an)ΔT   (4)
where a represents the thermal expansion coefficient of the fiber (for germanium-doped fibers, a = 0.55 × 10⁻⁶) and an represents the thermo-optic coefficient of the fiber (for germanium-doped fibers, about 8.6 × 10⁻⁶). To prevent the measurement process from being disturbed by temperature changes, the sensor network uses the reference grating method to perform temperature compensation [6]. There are two fiber gratings in the sealed box: one is affixed between the two bosses of the sensor and is affected by both temperature and strain, while the other is stuck on the inside of the sealed box wall and is affected only by temperature changes. The pigtails of the two optical fibers are linked to the demodulator, and the readings of their center wavelengths are recorded. Assuming that the offsets of the center wavelengths of the two gratings are Δλ1 and Δλ2, these two offsets can be presented as:

Δλ1 = Δλ1(ΔT)   (5)

Δλ2 = Δλ2(ε, ΔT)   (6)
With this temperature compensation method, the effects of temperature changes can be separated from the measurement process. This method of temperature compensation is not only cheap and easy to realize, but can also separate temperature and strain well, making it suitable for use in a harsh environment such as that of marine vessels [7].
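The reference-grating compensation can be sketched as follows, under the assumption (implicit in formulas (3)–(6)) that both gratings share the same temperature sensitivity, so the reference grating's purely thermal shift can simply be subtracted before inverting formula (3); all numeric values and names are illustrative, not taken from this sensor.

```python
def strain_from_shifts(dlambda_sense_nm, dlambda_ref_nm, lambda_b_nm, p_e=0.22):
    """Subtract the reference grating's purely thermal wavelength shift,
    then invert formula (3), dlambda = lambda_B * (1 - p_e) * eps, to
    recover the strain eps. Assumes both gratings have equal temperature
    sensitivity (the reference-grating compensation described above)."""
    dlambda_strain = dlambda_sense_nm - dlambda_ref_nm
    return dlambda_strain / (lambda_b_nm * (1.0 - p_e))

# Illustrative numbers: 1550 nm gratings; the sensing grating shifts by
# 0.130 nm while the reference (temperature-only) grating shifts 0.010 nm.
eps = strain_from_shifts(0.130, 0.010, 1550.0)   # about 99 microstrain
```

In the actual sensor this inversion is performed by the demodulator's compensation algorithm, which then maps strain to acceleration through the calibrated mechanical response.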
3 High-Frequency Acceleration Sensor Structure

The fiber grating high-frequency acceleration sensor used for ship sloshing response monitoring is composed of a fixed part, a sensitive part, and a sealed part (Figs. 2, 3 and 4).
Fig. 2 Fiber grating high-frequency acceleration sensor plan view
Fig. 3 Fiber grating high frequency acceleration sensor front view
The fixing part includes the optical fiber sticking bosses, the sensor base, and the lateral fixing holes. The two ends of the optical fiber are attached to the bosses on the two sides. When the sensor is subjected to an external acceleration in the vertical direction, the circular hinges drive the mass blocks, including the sticking bosses, to move, and the optical fiber deforms accordingly. The sensor base is fixed to the watertight box by four screws through the screw holes, ensuring that the sensor and the watertight box remain relatively stationary. The lateral fixing holes are located on both sides of the watertight box; after packaging is completed, thimbles are inserted through the fixing holes to ensure that the sensor does not undergo excessive lateral displacement during motion, reducing lateral interference. The sensitive part consists of circular hinges, a fiber grating, and a temperature-compensation grating. There are four circular hinges in total; each mass block containing an optical fiber sticking boss is connected to the sensor base by a hinge. When the sensor is subjected to an acceleration in the vertical direction, the mass blocks on the two sides are displaced in different directions, so that the FBG is correspondingly stretched or compressed; the deformation and wavelength variation of the FBG are
A High-Frequency Acceleration Sensor …
Fig. 4 Fiber grating high-frequency acceleration sensor side view
approximately linear. Through the compensation algorithm of the demodulator, the relationship between the FBG wavelength and the sway response of the ship structure can be obtained: the external acceleration experienced by the hull structure is displayed on the demodulator side in the form of a wavelength change of the FBG. The temperature-compensation grating is affixed to the inside of the watertight box, so it is affected only by temperature changes and not by acceleration. Through a compensation algorithm, the unwanted influence of temperature on the acceleration measured by the sensor can be removed. The sealing part includes the watertight box cover, the watertight box body, the optical fiber waterproof aerial jack, and the fixing hole for the optical fiber waterproof aerial plug. The watertight box cover and body together form the watertight box, and the sensor is fixed inside by screw holes. The connection between the cover and the body is sealed with a rubber ring in the sealing groove, so the entire watertight box is completely sealed and is not affected by seawater erosion or other external conditions. The optical fiber waterproof aerial jack and the plug fixing hole are located on the side of the watertight box; the jack receives the waterproof aerial plug, through which the optical fiber is connected to the demodulation instrument, so that the fiber exit is also protected watertightly.
4 Experimental Process

The sensor is simulated and tested with the ANSYS simulation software [8]. The sensor simulation structure is shown in Fig. 5. A modal analysis is performed on the sensor first. The sensor material is set to beryllium bronze, because beryllium bronze not only has good elasticity and elastic recovery ability, but is also easy to process and resistant to corrosion, making it suitable for the harsh natural conditions of the ocean. Before the modal analysis, the sensor structure needs to be
Fig. 5 ANSYS simulation structure
meshed. Considering the complexity of meshing the sensor hinges, intelligent meshing is used in this experiment. The specific partitioning is shown in Fig. 6.
Fig. 6 Sensor structure meshing
Constraints need to be placed on the sensor after meshing. The bottom surface of the sensor is constrained to ensure that there is no lateral interference during the modal analysis; the result of the modal analysis is shown in Fig. 7.
Fig. 7 Modal analysis results
From the results of the above modal analysis, it can be seen that the natural frequency of the sensor structure is 3079.4 Hz, which fully meets the expected requirements for the sensor design. The static analysis of the sensor determines the strain of the sensor by imposing various constraints on it. The first few steps of the static analysis are the same as those of the modal analysis: the sensor material needs to be set and the structure meshed. An acceleration of 20 g is applied to the sensor, and the bottom surface is fixed so that the sensor base is not displaced under the acceleration. The sensor strain under this acceleration is shown in Fig. 8.
Fig. 8 Sensor static analysis
From the results of the static analysis combined with the microstrain calculation formula, it can be obtained that the sensor generates about 37.6 με under an acceleration of 20 g, and the sensitivity of the sensor can be estimated by the formula to be about 2.26 pm/με, which meets the design requirements.
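As a rough cross-check of the quoted sensitivity, the bare-fiber FBG strain response can be estimated from the standard relation Δλ = λB(1 − pe)ε; the Bragg wavelength and effective photo-elastic coefficient below are typical assumed values, not parameters given in the paper:

```python
def bare_fbg_sensitivity_pm_per_ue(lambda_b_nm=1550.0, p_e=0.22):
    """Bare-fiber FBG strain sensitivity in pm per microstrain.

    d(lambda)/d(strain) = lambda_B * (1 - p_e), with 1 ue = 1e-6 strain.
    lambda_b_nm and p_e are typical assumed values (1550 nm, 0.22).
    """
    return lambda_b_nm * 1e3 * (1.0 - p_e) * 1e-6  # pm per microstrain

s = bare_fbg_sensitivity_pm_per_ue()  # about 1.21 pm/ue
```

The bare-fiber value of roughly 1.2 pm/με is below the reported 2.26 pm/με, which presumably reflects the mechanical amplification provided by the hinge-and-mass structure of the sensor.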
5 Conclusion

The high-frequency acceleration sensor developed in this study is used to monitor the sloshing response of a ship. When the sensor is installed on the ship and subjected to external acceleration, the optical fiber connected between the sensor bosses deforms accordingly, and the center wavelength of the FBG shifts. Thus, the magnitude of the acceleration experienced by the ship is displayed on the demodulator through the FBG wavelength shift. The sensor has a compact structure and can be installed in a small space on a ship without occupying the working space of the ship's structure. According to the results of the simulation experiments, the sensor has excellent performance. The sensor uses the reference grating method for temperature compensation, so that the interference of temperature changes on the measured value can be effectively separated during the measurement process. The sensor is enclosed in a completely sealed watertight box, and the fiber exit is also protected by the fiber optic waterproof plug, which ensures the watertightness of the entire sensor system and its reliability and stability when used on marine vessels.
Acknowledgements. This paper is supported by the Natural Youth Science Foundation of China (61501326, 61401310, 61731006). It is also supported by the Tianjin Research Program of Application Foundation and Advanced Technology (16JCYBJC16500).
References

1. Wei W (2010) Study of key technology on ship hull structural health monitoring with fiber Bragg grating. Tianjin University, Tianjin
2. Guozhen Y (2017) Theoretical and experimental research on optical fiber vibration sensing. North China Electric Power University
3. Zeng N (2005) Research on the key technology of fiber optic accelerometers. Tsinghua University
4. Xing Z (2019) Research on fiber Bragg grating vibration sensing technology. Xi'an Shiyou University
5. Baojin P, Yanbiao L, Min Z, Yaqiang S, Hongzhen J, Shurong L (2005) New method of measuring fibers valid elastic-optic constant. Opt Tech 05:655–658
6. Meller SA, Jones ME, Wavering TA et al (1998) Development of fiber optic sensors for advanced aircraft testing and control. Proc SPIE Int Soc Opt Eng 3541:134–139
7. Yali Q, Chenghua S, Zhefu W, Kai L, Peifa M (2001) Research on all fiber optic accelerometer. J Zhejiang Univ Technol 29(3):220–225
8. Desheng J, Daxiong C, Lei L (2004) Application of ANSYS in design of fiber Bragg grating accelerometer. J Transducer Technol 11:75–77
Research on cm-Wave and mm-Wave Dual-Frequency Active Composite Detection Guidance Technology Lai-Tian Cao(B) , Chen Yan, Xiao-Min Qiang, and Xue-Hui Shao Beijing Aerospace Automatic Control Institute, Beijing 100854, China [email protected]
Abstract. This article proposes a cm-wave and mm-wave dual-frequency active composite detection guidance technology and analyzes its advantages. The dual-frequency active composite detection guidance method has the characteristics of long detection distance, strong clutter suppression capability, high target recognition probability, and strong anti-interference capability. On this basis, the detection guidance system scheme and the key technologies are discussed.

Keywords: Cm-wave and mm-wave · Composite · Detection guidance
1 Introduction

Today, the electromagnetic environment is extremely complicated, which poses a severe challenge for accurate detection guidance. Traditional single-frequency active detection guidance technology has increasingly shown its inherent limitations, and dual-frequency active composite detection guidance technology will be the future development trend. Using cm-wave and mm-wave dual-frequency active composite detection guidance technology can give full play to the characteristics of each frequency band and can effectively improve the comprehensive performance of the detection guidance system.
2 Cm-Wave and mm-Wave Dual-Frequency Active Composite Detection Guidance Capability Analysis

2.1 Detection Distance

Under the equivalent radiated power restriction of the detection guidance system, the smaller the attenuation of electromagnetic waves during space transmission, the farther the transmission distance.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_15
Generally, electromagnetic waves transmitted in the atmosphere are absorbed by water molecules and oxygen molecules, which causes greater attenuation at certain frequencies [1]. If the working frequency of the detection guidance system is lower than the water resonance frequency, the absorption by water vapor in the atmosphere can be ignored. To achieve long-distance detection, a lower working frequency (such as a cm-wave frequency) needs to be selected to ensure the feasibility of the system scheme [2]. For example, the X-frequency or Ku-frequency can be chosen as the main working frequency to ensure the required distance.

2.2 Clutter Suppression Capability

Take sea clutter as an example. Sea clutter affects the detection of sea surface targets. Since the target and the clutter in the radar beam are affected by the same detection guidance system parameters (power, antenna gain, etc.), the ratio of target energy to clutter energy is determined by their respective radar cross-sectional areas [3], calculated as follows:

SCNR = σ/(σ0 · A)   (1)

In Eq. (1), σ is the radar cross-sectional area (RCS), σ0 is the backscattering coefficient of the sea surface, and A is the sea clutter strip area irradiated by the beam within the same distance resolution unit as the target (Fig. 1).
Fig. 1 Schematic diagram of sea clutter strip area illuminated by radar beam
Assuming the shape of the irradiated strip area is rectangular, the sea clutter area A is obtained by removing the area occupied by the target from the same distance resolution unit, calculated as follows:

A = Ac − At = R · ϕ · ΔR/cos θ − At   (2)
In Eq. (2), Ac is the strip area in the resolution unit irradiated by the beam, At is the area occupied by the target in the resolution unit, R is the distance from the target to the detection
guidance system, ϕ is the radar beam width, ΔR is the radar distance resolution, and θ is the grazing angle between the beam irradiation direction and the sea surface. The signal-to-clutter ratio is then:

SCNR = σ/[σ0 · (R · ϕ · ΔR/cos θ − At)]   (3)
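Equations (1)–(3) can be evaluated numerically as below; the range, backscattering coefficient, and RCS in the usage example are illustrative values, not the paper's simulation inputs:

```python
import math

def scnr_db(sigma_m2, sigma0, R_m, phi_deg, dR_m, theta_deg, At_m2=0.0):
    """Signal-to-clutter ratio of Eq. (3), in dB.

    sigma_m2  : target RCS (m^2)
    sigma0    : sea-surface backscattering coefficient (dimensionless)
    R_m       : range to the target (m)
    phi_deg   : radar beam width (degrees)
    dR_m      : radar distance resolution (m)
    theta_deg : grazing angle (degrees)
    At_m2     : area occupied by the target in the resolution unit (m^2)
    """
    phi = math.radians(phi_deg)
    # Sea clutter strip area, Eq. (2)
    A = R_m * phi * dR_m / math.cos(math.radians(theta_deg)) - At_m2
    return 10.0 * math.log10(sigma_m2 / (sigma0 * A))

# Illustrative: 3000 m^2 target at an assumed 3 km range, sigma0 = 0.1,
# Ku beam width 7.5 deg, 1.875 m resolution, 70 deg grazing angle
ku = scnr_db(3000.0, 0.1, 3000.0, 7.5, 1.875, 70.0)
```

Narrowing the beam width (as in the Ka- and W-frequency cases) shrinks the clutter area A and therefore raises the SCNR, which is the effect discussed below.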
From Eq. (3), we can conclude that the signal-to-clutter ratio depends mainly on σ (the radar cross-sectional area), σ0 (the backscattering coefficient of the sea surface), θ (the grazing angle), ϕ (the radar beam width), and ΔR (the radar distance resolution). Assuming a distance resolution of 1.875 m, a Ku-frequency (cm-wave) radar beam width of 7.5°, a Ka-frequency (mm-wave) radar beam width of 3.2°, and a W-frequency (mm-wave) radar beam width of 2°, a simulation analysis is performed on RCS (σ), σ0, A, and SCNR at different angles and frequencies. Modeling a ship target, the simulation results for different frequencies (Ku/Ka/W) and target azimuths at large grazing angles are as follows (the detection guidance system transmitting and receiving polarizations are horizontal). As can be seen from Fig. 2, the ship target RCS values of the Ku-frequency, Ka-frequency, and W-frequency are equivalent, and the target RCS fluctuates by about ±10 dB with azimuth.
Fig. 2 RCS comparison chart of ship target in different frequency and target azimuth at large grazing angles
HH in Table 1 indicates that the detection guidance system transmitting and receiving polarizations are both horizontal. VH indicates that the system transmitting polarization is horizontal and the receiving polarization is vertical. As can be seen from the table above, under large grazing angle conditions, the typical ship target RCS values of the frequencies are comparable. Modeling the sea clutter [4, 5], the simulation results of the sea surface backscattering coefficients in different frequencies and grazing angles under the 3-level sea condition are shown in Fig. 3. From Fig. 3, when the grazing angle is less than 70°, the W-frequency backscattering coefficient is significantly lower than those of the X-frequency and Ku-frequency. When the grazing angle is greater than 70°, the W-frequency backscattering coefficient is about
Table 1 Comparison of RCS in different frequency, polarizations, and ship target azimuth at large grazing angles

Target RCS (dB)       Ku/HH    Ka/HH    W/HH     Ku/VH    Ka/VH    W/VH
Grazing angle (70°)   32.539   31.682   32.017   20.910   21.314   20.639
Grazing angle (80°)   32.826   33.090   32.967   23.523   24.547   23.125
Fig. 3 The sea surface backscattering coefficients in different frequency and grazing angle under 3-level sea condition
1 dB lower than the X-frequency and Ku-frequency. The simulation results of the sea clutter strip area illuminated by radar beam in different frequency and grazing angles are shown in Fig. 4.
Fig. 4 The sea clutter strip area illuminated by radar beam in different frequency and grazing angle
Research on cm-Wave and mm-Wave Dual-Frequency …
123
Taking the target RCS as 3000 m² and without considering the impact of target RCS fluctuation, the target signal-to-clutter ratio in different frequencies and grazing angles under the 3-level sea condition is calculated; the result is shown in Fig. 5.
Fig. 5 The target signal-to-noise ratio in different frequency and grazing angle under 3-level sea condition
According to the radar manual, when the target detection probability is greater than 98%, the signal-to-noise ratio threshold needs to be greater than 14 dB if the CFAR detection method is used. As can be seen from Fig. 5, under the condition of grazing angles close to 80°, the target signal-to-clutter ratio of the Ku-frequency is near the detection threshold. Referring to Fig. 2 and considering the effect of target RCS fluctuations of ±10 dB, it is difficult to ensure stable target detection. The target signal-to-clutter ratio in the Ka-frequency is improved compared to the Ku-frequency, but the effect is not obvious. Compared with the Ku-frequency, the W-frequency target detection signal-to-clutter ratio is increased by 9.5 dB, and the detection performance advantage is obvious under large grazing angles.

2.3 Target Detection, Identification and Anti-Jamming Capabilities

The advantages of cm-wave and mm-wave dual-frequency active composite detection guidance technology in target detection, identification, and anti-jamming capabilities are reflected in the following aspects: 1. The target detection process uses dual-frequency active information, suppresses the effects of multipath and angular flicker, and improves the stability of target detection. 2. Detect, correlate, estimate, and identify multi-source information from the dual-frequency time domain, frequency domain, and energy domain, and extract stable feature parameters that can be used to distinguish targets from jamming. Thus, accurate state and identity estimation can be performed, and the target recognition probability can be improved. 3. The mm-wave frequency detection guidance system has strong Doppler resolving power, so it can enhance the capability of slow target recognition. The mm-wave frequency has a wide available spectrum and a narrow radar beam width, which makes
it difficult for the other party to perform targeted interference, so the system has a low probability of being intercepted. 4. The differences between the interference and the target in the dual-frequency scattering characteristics can help improve the target detection and extraction capabilities against a background of interference. 5. The two frequency bands work at the same time, which can effectively counter active interference released in a single frequency band, and effectively expands the working spectrum width and facilitates frequency avoidance, thus improving the system anti-interference capability and reliability.

2.4 Summary

To sum up, owing to the small atmospheric transmission attenuation of the cm-wave frequency and the low impact of rainfall, the system can achieve long-distance detection. On this basis, the mm-wave working frequency is added, and the cm-wave and mm-wave dual-frequency active composite detection guidance method is used to improve the anti-clutter, target recognition, and anti-interference capabilities.
3 Research on cm-Wave and mm-Wave Dual-Frequency Active Composite Detection Guidance System Scheme

The cm-wave and mm-wave dual-frequency active composite detection guidance device adopts a phased array system. It consists of 5 units: the antenna RF unit, receiving unit, signal processing unit, frequency source unit, and electrical source unit. The antenna RF unit consists of a dual-frequency common-aperture antenna configuration and a dual-frequency TR module unit. The frequency source unit generates 3 local oscillator signals (the cm-wave frequency local oscillator signal, the mm-wave frequency local oscillator signal, and the IF local oscillator signal), which are sent to the antenna RF unit and the receiving unit, respectively. To ensure signal coherence, the reference clock of the signal processing unit is also provided by the frequency source unit. The transmitted signals are amplified by the distributed TR components and radiated through the antenna array to synthesize high-power signals in space. The target echo signals received by the antenna RF unit are filtered and amplified by the antenna RF unit and the receiving unit, and output to the signal processing unit, where they are sampled and quantized. The functional block diagram of the cm-wave and mm-wave dual-frequency active composite detection guidance device is shown in Fig. 6. The dual-frequency antennas adopt a common-aperture configuration design and receive signals through the beam network module. In the transmitting state of the cm-wave frequency antenna RF unit, the TR module unit performs amplitude-phase control and power amplification on the cm-wave frequency transmission signals; the signals then enter the antenna's beam network module through the sub-array channels, form the required antenna beam in the specific direction, and are synthesized into an electromagnetic wave for space radiation. The signal flow of the cm-wave frequency antenna RF unit in the receiving state is the reverse of the transmitting flow.
Fig. 6 The functional block diagram of the dual-frequency active composite detection guidance device
The working principle of the mm-wave antenna RF unit is basically the same as that of the cm-wave antenna RF unit. The difference is that, under the common-aperture configuration condition, the mm-wave antenna array elements use the sparse array method to reduce the impact on the cm-wave antenna array, and achieve high gain, a narrow antenna beam, no grating lobes, and large-airspace scanning characteristics. The receiving unit is mainly used to filter and amplify the input IF signals, so as to output signals that meet the power requirement. The frequency source unit generates the RF local oscillator signals and IF local oscillator signals required for up-conversion and down-conversion, as well as the reference clock signals. It uses an FM phase-locked source to achieve frequency agility. The signal processing unit is composed of a data collection module and a signal processing module. It has the functions of collecting target echo signals, completing signal processing, performing data interaction, and generating timing control signals. The electrical source unit converts the system power into the low-voltage power outputs required by each unit and guarantees the stability of each power supply.
4 The Key Technology of the cm-Wave and mm-Wave Dual-Frequency Active Composite Detection Guidance

4.1 The mm-Wave Antenna Beam Large-Airspace Scanning Technology

In order to achieve the maximum utilization efficiency of the cm-wave and mm-wave dual-frequency antennas, the dual-frequency antennas use a common-aperture configuration design, which needs to take into account the different design requirements of the two frequency bands on the antenna. To ensure the long-distance detection of the system, the cm-wave antenna array elements should be arranged as densely as possible. On this basis, because the mm-wave frequency is high, the mm-wave antenna element spacing is very small under a conventional layout, and the elements would interfere with the cm-wave antenna array elements [6]. To ensure the long-distance detection capability of the cm-wave antenna, the mm-wave antenna elements need to adopt the sparse array method. Under the limitation of the antenna aperture size, the airspace scanning range of the mm-wave antenna beam will be affected. That is, when the beam is scanned at a large angle, grating lobes are likely to occur, and the sidelobe level will also increase significantly, which will affect the detection performance of the system and increase the probability of acquisition through the sidelobes. The antenna array element layout needs to be optimized, and the received echoes need to be weighted in amplitude and phase, so that the mm-wave antenna beam can meet the design requirements when scanning over a large airspace.

4.2 Dual-Frequency Joint Anti-Jamming Technology

Although the spatial resolution (angular resolution and distance resolution) of the mm-wave frequency is very high, more details of the detected target can be observed, and the target recognition capability is strong, the target acquisition probability is low and the detection distance is short; in addition, the research foundation is still weak and lacks corresponding experimental data support.
The cm-wave has a wide antenna beam and poor detection accuracy, but the target acquisition probability is high and the technology is relatively mature. Cm-wave and mm-wave thus have their own advantages and disadvantages in detection guidance performance. How to make full use of the information in each frequency band, and how to adopt corresponding detection guidance strategies in a complex electromagnetic environment so as to maximize the overall advantages of the dual-frequency system, still requires technical research.
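The grating-lobe constraint behind the sparse-array design discussed in Sect. 4.1 can be sketched with the standard phased-array spacing rule; the frequency and scan range in the example are illustrative assumptions, not the system's design values:

```python
import math

def max_element_spacing(wavelength_mm, scan_max_deg):
    """Largest element spacing that keeps grating lobes out of visible space.

    Standard uniform-array rule: d <= lambda / (1 + |sin(theta_max)|),
    where theta_max is the maximum scan angle off broadside.
    """
    return wavelength_mm / (1.0 + abs(math.sin(math.radians(scan_max_deg))))

# Illustrative: a 94 GHz mm-wave array (wavelength about 3.19 mm)
# scanning to +/-60 deg tolerates at most about 1.71 mm spacing
d = max_element_spacing(3.19, 60.0)
```

This is why a sparse mm-wave layout must be optimized together with amplitude-phase weighting: thinning the array relaxes the interference with the cm-wave elements but violates the uniform-spacing rule above, so lobe control has to come from the element positions and weights instead.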
5 Conclusion

This paper studies the cm-wave and mm-wave dual-frequency active composite detection guidance technology, which has the characteristics of long-distance detection and strong clutter suppression, target recognition, and anti-interference capabilities. The realization of the cm-wave and mm-wave dual-frequency active composite detection guidance system scheme was discussed, the difficulties and problems that need to be solved were summarized, and the feasibility of the composite detection guidance technology was explained.
References

1. Jiang ZL, Han XQ, Fu L (2009) An engineering calculation method for atmospheric attenuation in electromagnetic propagation. Radar Confront 2
2. Di YQ, Guan ZC (2007) Study on satellite-to-earth link rain attenuation. Space Electron Technol 3
3. Du K, Shen J (2013) The method for ship target detection at large grating angle. Guidance Fuse 3
4. Ren HX, Ji GY, Wu KK (2014) Research on empirical model of sea clutter backscattering coefficient. Modern Electron Techn 10
5. Gregers-Hansen V (2012) Aerospace and electronic systems. IEEE Trans Aerosp Electron Syst 4
6. Yu TX, Ma L (2013) Study on isolation of microwave and millimeter-wave antennas in composite arrays. Inf Technol 10
Research on Modulation Recognition Algorithm Based on Combination of Multiple Higher-Order Cumulant Yingnan Lv and Jiaqi Zhen(B) College of Electronic Engineering, Heilongjiang University, Harbin 150080, China [email protected]
Abstract. Modulation recognition algorithms that use a single higher-order cumulant as the characteristic parameter are usually limited, and their recognition performance needs to be improved. For this reason, a recognition method is proposed that uses a combination of multiple higher-order cumulants to construct the characteristic parameters, so that they contain more signal characteristics, and realizes the recognition of multiple MASK, MPSK, and MFSK signals. MATLAB simulation shows that it achieves a better recognition rate.

Keywords: Characteristic parameter · Higher-order cumulant · Modulation recognition
1 Introduction

In order to adapt to the development trend of future communication diversification, it is necessary to monitor the electromagnetic spectrum, which has important military and civil value. Modulation recognition is one of the key technologies of electromagnetic spectrum monitoring, so it has very important practical significance. At present, most research divides modulation signal recognition methods into two categories: one is the recognition algorithm based on maximum likelihood estimation, and the other is the recognition algorithm based on feature extraction. Among the feature extraction algorithms, many extract feature parameters based on higher-order cumulants. In Ref. [1], higher-order cumulant, wavelet energy entropy, and power spectrum entropy characteristic parameters, together with the decision tree method, are used to classify BFSK, 4FSK, 8FSK, BPSK, 16QAM, and other signals. Swami et al. [2] adopted the pattern recognition method to realize the modulation recognition of MPSK signals by using the power-normalized fourth-order cumulant as the classification feature. In literature [3], 9 modulation styles including orthogonal frequency division multiplexing are identified in the additive white Gaussian noise channel by using
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_16
power spectrum characteristics and the support vector machine (SVM) method. In literature [4], the cumulant values of orders 2–8 are calculated for a variety of modulated signals, the characteristic values for signal recognition are extracted, a decision tree classifier is designed, and the decision tree thresholds are trained and optimized, which improves the signal recognition performance to some extent. The characteristic parameters constructed from high-order cumulants mentioned above mostly take the form of a single cumulant or the ratio of two cumulants, and do not make full use of the information in multiple cumulants. Based on the above, this paper makes a unified analysis of the common high-order cumulants of MASK, MPSK, and MFSK modulation signals, and constructs new characteristic parameters to realize modulation signal classification.

2 Feature Extraction of Higher-Order Cumulants

For a Gaussian white noise signal with zero mean, it can be seen from literature [2] that if the mean value of a random variable is zero and it obeys the Gaussian distribution, then its third-order and all higher-order cumulants are always zero. According to this property, higher-order cumulants can suppress zero-mean Gaussian white noise and interference to some extent. The values of the higher-order cumulants of different digital modulation signals are different, and they are not affected by zero-mean Gaussian white noise [5]. Therefore, higher-order cumulants can be used as characteristic values to distinguish and identify the modulation modes of different signals. To facilitate processing, the average power of the modulated signals used in this paper is normalized and denoted by E, and the theoretical values of the cumulants of each order for different digital modulation signals are calculated according to the definition of higher-order cumulants, as shown in Table 1.
2 Feature Extraction of Higher-Order Cumulants For the Gaussian white noise signal with zero mean value, it can be seen from literature [2] that if the mean value of random variable is zero and obeys the Gaussian distribution, then its third-order and higher-order cumulants above the third-order are always zero. According to this characteristic, the higher-order cumulant can suppress the zero mean Gaussian white noise and interfere to some extent. The value of the higher-order cumulant for different digital modulation signals is different, and it is not affected by the zero mean Gaussian white noise [5]. Therefore, the higher-order cumulant can be used as the characteristic value to distinguish and identify the modulation mode of different signals. In order to facilitate the processing, the theoretical value of the higher-order cumulants of the modulated signals used in this paper after the average power is normalized is assumed to be E, and the theoretical value of each order cumulants of different digital modulation signals is calculated according to the definition of the higher-order cumulants, as shown in Table 1. Table 1 The cumulative value of a digitally modulated signal Signal types
|C20 |
|C21 |
|C40 |
|C41 |
|C42 |
|C63 |
2ASK
E
E
2 E2
2 E2
2 E2
13 E 2
1.36 E 2
1.36 E 2
9.16 E 2
4ASK
E
E
1.36 E 2
2FSK
0
0
E2
0
E2
4 E2
4FSK
0
0
E2
0
E2
4 E2
2PSK
E
E
2 E2
2 E2
2 E2
13 E 2
E
E2
0
0
4 E2
4PSK
0
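The cumulants of Table 1 can be estimated from a received sequence via its mixed moments. The sketch below uses the standard moment-to-cumulant expansions (the C63 expansion shown is one common form for zero-mean symmetric constellations); it is an illustrative estimator, not the paper's exact implementation, and for some entries the standard-definition values differ slightly from the normalization used in Table 1:

```python
import numpy as np

def moment(x, p, q):
    # Mixed moment M_pq = E[ x^(p-q) * conj(x)^q ]
    return np.mean(x ** (p - q) * np.conj(x) ** q)

def cumulant_magnitudes(x):
    # Normalize to unit average power so the estimates are scale-free (E = 1)
    x = x / np.sqrt(np.mean(np.abs(x) ** 2))
    M20, M21 = moment(x, 2, 0), moment(x, 2, 1)
    M40, M41, M42 = moment(x, 4, 0), moment(x, 4, 1), moment(x, 4, 2)
    M63 = moment(x, 6, 3)
    C40 = M40 - 3 * M20 ** 2
    C41 = M41 - 3 * M20 * M21
    C42 = M42 - abs(M20) ** 2 - 2 * M21 ** 2
    # Sixth-order expansion, in the form commonly used for real-valued or
    # symmetric zero-mean constellations
    C63 = (M63 - 6 * M20 * M41 - 9 * M21 * M42
           + 18 * M20 ** 2 * M21 + 12 * M21 ** 3)
    vals = dict(C20=M20, C21=M21, C40=C40, C41=C41, C42=C42, C63=C63)
    return {k: abs(v) for k, v in vals.items()}
```

For a noiseless unit-power 4PSK sequence, for example, the estimator yields |C20| = 0, |C21| = E, |C40| = E², |C41| = 0, and |C63| = 4E², matching the corresponding entries of Table 1.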
3 Classifier Design

When constructing the identification feature parameters, the influence of phase jitter on the feature parameters can be reduced by constructing the feature parameters with the
absolute values of the cumulants. Meanwhile, in order to keep the amplitude from interfering with the identification feature parameters, the form of a ratio is used. The constructed feature parameters are therefore insensitive to amplitude and phase, which improves the stability of the algorithm to some extent. The identification feature parameter fx1 is constructed as:

fx1 = |C41|/|C42|   (3.1)
According to the analysis in Table 1, the theoretical values of fx1 for each modulation mode are as follows:

       2ASK   2PSK   4PSK   2FSK   4FSK
fx1    1      1      1      0      0
The identification feature parameter fx2 is constructed as:

fx2 = (|C41| + |C42|)/(2|C40|)   (3.2)
According to the analysis in Table 1, the higher-order cumulant parameters of 2ASK and 2PSK are the same; the difference would only appear if the cumulants were calculated up to the eighth order, which this paper does not pursue for the moment. According to the characteristic parameter fx2, the theoretical values of each modulation mode are as follows:

       2ASK   4ASK   4PSK
fx2    1      1      0
The identification feature parameter fx3 is constructed as:

fx3 = |C63|/|C21|³   (3.3)
The theoretical values of each modulation mode are as follows:

       2ASK   4ASK
fx3    13     9.16
In the software simulation environment, the signal-to-noise ratio of the digital modulation signal is taken as the independent variable, and fx1, fx2, and fx3 are taken as the characteristic parameters for distinguishing the signals. The simulation results are shown in Fig. 1.
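Under power normalization (E = 1), the feature parameters can drive a simple decision tree. The sketch below is one possible reading of the tables above: the thresholds (0.5 and 11) are illustrative midpoints between the quoted theoretical values, the fx2 test of Eq. (3.2) is replaced here by an equivalent |C21| check on the zero-|C41| branch, and 2ASK/2PSK remain unseparated, as noted in the text; this is not the paper's trained classifier:

```python
def classify(c):
    """One possible decision tree over the feature parameters.

    c maps cumulant names to magnitudes: |C40|, |C41|, |C42|, |C21|, |C63|.
    Thresholds and branch order are illustrative, not trained values.
    """
    fx1 = c["C41"] / max(c["C42"], 1e-12)         # Eq. (3.1)
    if fx1 < 0.5:                                 # |C41| ~ 0: FSK or 4PSK branch
        # Table 1 separates these by |C21| (0 for MFSK, E for 4PSK)
        return "4PSK" if c["C21"] > 0.5 else "MFSK"
    # |C41| ~ |C42|: the 2ASK/4ASK/2PSK branch; fx3 isolates 4ASK
    fx3 = c["C63"] / max(c["C21"], 1e-12) ** 3    # Eq. (3.3)
    return "2ASK/2PSK" if fx3 > 11 else "4ASK"
```

Feeding in the theoretical Table 1 magnitudes reproduces the intended branch for each modulation mode.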
Fig. 1 Simulation diagram of parameters fx1, fx2, and fx3
4 Recognition Effect Simulation

In the algorithm used in this paper, a combination of multiple high-order cumulants is adopted in the selection of feature parameters, and modulation recognition is carried out under their combined action; this is compared with the modulation recognition method that uses a single high-order cumulant as the feature parameter. The simulation diagram of the recognition rate of the improved algorithm is shown in Fig. 2, while that of the unimproved algorithm is shown in Fig. 3.
Fig. 2 Recognition rate based on higher-order cumulants after algorithm improvement
By observing the simulation results, the improved algorithm can not only recognize the signals well, but also performs better than the unimproved algorithm. When the signal-to-noise ratio is between 0 and 2 dB, the recognition success rate of the improved method can reach more than 95%, while for the unimproved algorithm the recognition success rate reaches more than 95% only when the signal-to-noise ratio is around 6 dB. Simulation
Fig. 3 Recognition rate based on higher-order cumulants before algorithm improvement
results show that the improved classification decision tree structure is reasonable and can recognize the target signal.
5 Conclusion

Based on higher-order cumulant methods, this article looks for characteristic parameters that can distinguish between signal types, builds a classifier on this basis, and completes the recognition of MASK, MPSK, and MFSK signals. Finally, the recognition performance of the algorithm is simulated; the simulation results show that, compared with identification methods that use a single cumulant as the characteristic parameter, the recognition rate of the proposed algorithm is increased. Acknowledgments. This work was supported by the National Natural Science Foundation of China under Grant No. 61501176, the Heilongjiang Province Natural Science Foundation (F2018025), and the University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (UNPYSCT-2016017).
Research on Modulation Recognition Algorithm …
References
1. Huixin C (2017) Research on communication signal modulation recognition technology. Xi'an University of Electronic Science and Technology
2. Fu Y, Zhu J, Wang S et al (2014) Robust non-data-aided SNR estimation for multilevel constellations via Kolmogorov–Smirnov test. IEEE Commun Lett 18(10):1707–1710
3. Swami A, Sadler BM (2000) Hierarchical digital modulation classification using cumulants. IEEE Trans Commun 48(3):416–429
4. Shi W, Feng G (2014) An improved OFDM recognition algorithm based on support vector machines. Microelectron Comput 31(10):98–102
5. Zhang Y (2016) Modulation recognition of digital signals based on high-order cumulants. Inf Commun 2:27–30
Indoor Positioning Technology Based on WiFi
Baihui Jiang and Jiaqi Zhen(B)
College of Electronic Engineering, Heilongjiang University, Harbin 150080, China
[email protected]
Abstract. Aiming at the inaccurate offline-database data acquisition of the fingerprint location method in wireless fidelity (WiFi) indoor positioning technology, this paper improves the traditional mean filtering method and proposes a median mean filtering method. The method first eliminates the maximum and minimum small-probability singular values of the RSSI generated by the AP signal source at the same position at different times, and then takes the average of the remaining RSSI signal strength values.
Keywords: WiFi indoor location · Fingerprint · Median average filtering
1 Introduction

Nowadays, common indoor positioning technologies mainly include Bluetooth, RFID, ultrasonic, WiFi, UWB, etc. [1]. Indoor WiFi wireless positioning technology is divided into two categories according to the positioning principle: positioning methods based on ranging, and positioning methods without ranging [2]. The first category, which requires ranging, includes TOA, AOA, TDOA, and the signal propagation model positioning method [3]. The second category is the location fingerprint positioning method, which does not require ranging. The RSSI signal strength values at different physical locations in a room differ, which means that each indoor location corresponds to a set of unique WiFi signal strength values, just as each person has a unique fingerprint [4]. The method is mainly divided into two stages: offline data acquisition and database building, and online algorithm matching and positioning. The indoor positioning classification diagram is shown in Fig. 1. This article mainly addresses the inaccuracy of the RSSI signal strength values collected during the offline database-building process: an improved mean filtering method, namely the median average filtering method, removes the small-probability maximum and minimum singular values from the collected RSSI signal strength values before averaging, thereby improving the data accuracy.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_17
Fig. 1 Indoor positioning and classification diagram
2 Processing of RSSI Signal Strength Value

The multi-path effect caused by indoor obstacles and human walking makes the RSSI signal strength value inaccurate. To ensure that the data are reasonable and effective, it is necessary to choose an appropriate filtering method to filter the signal. Simple mean filtering of the collected data only provides a limited auxiliary function for signal acquisition. Therefore, this paper proposes an improved filtering method for the collected RSSI signal values, which removes the small-probability maximum and minimum singular values in advance and then carries out mean filtering to obtain a more accurate RSSI value.

2.1 Traditional Mean Filtering

The traditional mean filtering method averages a group of collected data samples. Assume that the RSSI vector of the same AP signal source collected by a mobile terminal at a sampling point is RSSI = {R_1, R_2, ..., R_n}, where R_n indicates the nth collected RSSI signal strength value. The mean-filtered value is given by formula (2.1):

$$\bar{R} = \frac{1}{n}\sum_{i=1}^{n} R_i \tag{2.1}$$
2.2 Median Average Filtering

On the basis of traditional mean filtering, the median average filter first screens the collected RSSI signal strength values, and then averages the remaining data samples after removing the maximum and minimum singular values. The minimum RSSI value is given by formula (2.2) and the maximum by formula (2.3):

$$R_a = \min\{R_1, R_2, \ldots, R_n\} \tag{2.2}$$

$$R_b = \max\{R_1, R_2, \ldots, R_n\} \tag{2.3}$$
B. Jiang and J. Zhen
After removing $R_a$ and $R_b$, the remaining $n-2$ samples of the group are averaged as shown in formula (2.4):

$$\bar{R} = \frac{1}{n-2}\left(\sum_{i=1}^{n} R_i - R_a - R_b\right) \tag{2.4}$$
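As a concrete sketch, the two filters above can be implemented in a few lines; the sample values below are illustrative assumptions, not measurements from the experiment.

```python
def mean_filter(rssi):
    """Traditional mean filter, formula (2.1)."""
    return sum(rssi) / len(rssi)

def median_average_filter(rssi):
    """Median average filter, formulas (2.2)-(2.4):
    drop Ra = min and Rb = max, then average the remaining n - 2 samples."""
    if len(rssi) < 3:
        raise ValueError("need at least 3 samples to discard min and max")
    trimmed = sorted(rssi)[1:-1]
    return sum(trimmed) / len(trimmed)

samples = [-62, -61, -75, -60, -63]    # hypothetical RSSI readings in dBm
print(mean_filter(samples))            # -64.2, pulled down by the -75 outlier
print(median_average_filter(samples))  # -62.0, outlier removed
```

The comparison of the two printed values mirrors the point of Sect. 2.2: a single multi-path outlier shifts the plain mean, while the trimmed average is unaffected.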
3 Experiment and Analysis

3.1 Experimental Steps

The experimental equipment consists of a Lenovo notebook computer (operating system Windows 10 flagship edition, mainly used for data storage and chart drawing), a Huawei mobile phone (to collect the RSSI signal strength values), and a steel tape for measuring the spacing between sampling points (tape length 10 m, measurement error ±1.2 mm). The location area of this experiment is the rectangular corridor area on the second floor of building A8 of Heilongjiang University. One AP signal source is set on the left side of the corridor, and six test points are set in the central area of the corridor, divided into two lines of three test points each. The left side starts at the Mathematics Laboratory, the right side ends at the director's office, and the distance between adjacent test points is 1.5 m. Each test point continuously collects 5 groups of data, with the data collected and recorded every second. The experimental area diagram is shown in Fig. 2. Coordinate (0,0) is marked as reference point 1, (0,1.5) as reference point 2, (1.5,0) as reference point 3, (1.5,1.5) as reference point 4, (3,1.5) as reference point 5, and (3,3) as reference point 6.
Fig. 2 Experimental area
Data were collected twice. The first collection was carried out with no one walking, so that no multi-path effect was produced, yielding the first set of experimental data; the second collection was carried out with a person moving, so those data are affected by the multi-path effect. The
experimental data from the second collection were then processed with the median average filter and the traditional mean filter, giving the second and third sets of experimental data. The three groups of data are compared to determine whether the accuracy is improved.

3.2 Experimental Result and Analysis

The three collected groups of data are input to MATLAB for analysis, and a curve comparison chart is obtained. The abscissa is the position number of the six sampling points set in the experiment, and the ordinate is the collected RSSI signal strength value. Data 1 are the data obtained when someone walks around, processed by traditional mean filtering; data 2 are the data obtained when someone walks around, processed by median average filtering; data 3 are the data collected with no one moving around. First, it can be seen from the comparison chart that the walking of a person has a multi-path effect on the third and fourth sampling points. Second, data 2 are closer to data 3, which shows that the data processed by the median average filtering method are closer to the data collected without people walking, while the data processed by the traditional mean filtering method differ considerably from them. This proves that removing the maximum and minimum singular values of the RSSI signal strength in advance and then taking the average can effectively suppress the small-probability values caused by human walking, so the influence of the multi-path effect is smaller and the data obtained are more accurate. The comparison figure is shown in Fig. 3.
Fig. 3 Curve comparison
4 Conclusion Through this research experiment, it is proved that for offline data acquisition and processing of fingerprint location in WiFi indoor location technology, the median mean
filtering method proposed in this paper can effectively make up for the shortcomings of traditional mean filtering method, and reduce the impact of multi-path effect caused by factors such as human walking and signal interference in the actual environment.
References
1. Jin C, Qiu DW (2017) Research on indoor positioning technology based on WiFi signal. Surv Mapp Bull 05:21–25
2. Wei HR, Wang WT (2014) Research on improved location fingerprint recognition algorithm based on WiFi positioning technology. Manuf Autom 36(12):148–151
3. Hu L (2019) Research on indoor location technology based on WiFi
4. Wang T (2015) Research on the location algorithm of underground coal mine based on location fingerprint. China Univ Min Technol
Power Optimization in DF Two-Way Relaying SWIPT-Based Cognitive Sensor Networks
Chenyiming Wen, Yiyang Qiang, and Weidang Lu(B)
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
[email protected]
Abstract. In this paper, we develop and analyze resource optimization protocols in two-way decode-and-forward relaying SWIPT-based cognitive sensor networks. The current schemes generally ignore the interference caused by secondary users’ transmission to primary users. However, when the channel between them is good, this neglect will have a bad impact on the analysis of the performance of both networks. Therefore, we investigate the maximum transmission rate that can be reached by secondary users under the premise of ensuring the communication quality of the primary users. We derive this scheme by jointly optimizing the transmit power of two secondary users and the power splitting ratio of the relay node, respectively. The tradeoff between the interference threshold and other system parameters is given in the numerical simulations to corroborate the effectiveness of the proposed solutions. Keywords: Power splitting ratio · Interference threshold · Cognitive radio · Decode-and-forward · SWIPT
1 Introduction

As an emerging technology that can actively supplement the energy of wireless devices, radio frequency (RF) energy harvesting has been widely studied in wireless networks recently [1]. The technology of simultaneous wireless information and power transfer (SWIPT) [2] has become an effective way to extend the lifetime of wireless networks. Relay nodes can utilize power splitting (PS) or time switching (TS) methods to forward the received information; PS performs better than TS in terms of information transmission and energy acquisition. A complete receiver structure, including its energy distribution relationship, is explained in [3]. The unlicensed frequency bands where sensor nodes currently work are increasingly crowded, which affects the performance of wireless sensor networks (WSNs) [4]. In this context, cognitive radio (CR) [5] has been derived as a feasible technology to tackle this problem: it enables cognitive users to dynamically access the primary users' spectrum under the condition that the primary users' communication performance is not affected.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_18
C. Wen et al.
The relay protocols in cooperative cognitive radio networks (CCRNs) make full use of space diversity to reduce the signal fading caused by multipath propagation in the radio environment [6]. In such a network, secondary users communicate with each other with the support of cognitive relay nodes. As a result, the achievable data rates and reliability of CCRNs can be significantly improved, as can be seen from the literature on amplify-and-forward (AF) [7, 8] and decode-and-forward (DF) [9, 10] relaying. Current research widely considers how to achieve the maximum rate of CCRNs and analyzes the outage probability. However, the impact of the secondary users on the primary users is rarely considered, and this effect becomes significant when the channels between them are good. Therefore, in this paper, we maximize the rate of the two-way DF network under the constraint that the interference of the secondary users to the primary users stays below a threshold. In our algorithm, we jointly optimize the transmit power of the two secondary users and the power splitting ratio of the energy-harvesting relay to maximize the achievable rate of the secondary users.
2 System Model

A two-way cognitive SWIPT network model based on a DF relay is considered in Fig. 1. The model consists of two primary users PU1, PU2, two secondary users SN1, SN2, and a relay user RSN. The signal is transferred in two equal phases. During the first phase, primary users PU1 and PU2 transmit their signals $x_{PU1}$, $x_{PU2}$ to each other at power $P_P$, while secondary users SN1 and SN2 transmit their signals $x_{SN1}$, $x_{SN2}$ at powers $P_{SN1}$, $P_{SN2}$ to RSN. We assume that $h_i^j \sim \mathcal{CN}(0, d_{ij}^{-m})$ is the channel coefficient, where $i, j \in \{PU1, PU2, SN1, SN2, RSN\}$, $i \neq j$, $h_i^j = h_j^i$, and $m$ is the path-loss exponent. The received signal at RSN can be given as
{PU1, PU2, SN1, SN2, RSN}, i = j, hi = hij , and m is the path-loss exponent. The received signal at RSN can be given as PU 1
PU 1
PU 2
RSN
RSN SN 2
SN 1
PU 2
SN 2
SN 1 Phase
Phase Transmission Channel
Interference Channel
Fig. 1 System model
$$y_{RSN} = \sqrt{1-\lambda}\left(\sqrt{P_{SN1}}\,h_{SN1}^{RSN}x_{SN1} + \sqrt{P_{SN2}}\,h_{SN2}^{RSN}x_{SN2}\right) + \sqrt{1-\lambda}\left(\sqrt{P_{PU1}}\,h_{PU1}^{RSN}x_{PU1} + \sqrt{P_{PU2}}\,h_{PU2}^{RSN}x_{PU2}\right) + n \tag{1}$$
where $n \sim \mathcal{CN}(0, \sigma^2)$ is the additive white Gaussian noise. The signal received by RSN is divided into two portions: a $\lambda$ part for energy reception and a $(1-\lambda)$ part for information reception.
We can see that $x_{SN1}$ and $x_{SN2}$ affect each other when RSN receives these two signals at the same time. Thus, if $P_{SN1}\gamma_{SN1,RSN} < P_{SN2}\gamma_{SN2,RSN}$ (condition $T_1$), RSN decodes $x_{SN2}$ first and then decodes $x_{SN1}$. Similarly, when $P_{SN1}\gamma_{SN1,RSN} \geq P_{SN2}\gamma_{SN2,RSN}$ (condition $T_2$), RSN decodes $x_{SN1}$ first and then decodes $x_{SN2}$. So the achievable rates for SN1-RSN and SN2-RSN can be shown as

$$\begin{cases} R_{SN1,RSN} = \frac{1}{2}\log_2\!\left(1 + \dfrac{(1-\lambda)P_{SN1}|h_{SN1}^{RSN}|^2}{(1-\lambda)P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + n}\right) \\[3mm] R_{SN2,RSN} = \frac{1}{2}\log_2\!\left(1 + \dfrac{(1-\lambda)P_{SN2}|h_{SN2}^{RSN}|^2}{(1-\lambda)P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + (1-\lambda)P_{SN1}|h_{SN1}^{RSN}|^2 + n}\right) \end{cases} \quad k \in T_1 \tag{2}$$

$$\begin{cases} R_{SN1,RSN} = \frac{1}{2}\log_2\!\left(1 + \dfrac{(1-\lambda)P_{SN1}|h_{SN1}^{RSN}|^2}{(1-\lambda)P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + (1-\lambda)P_{SN2}|h_{SN2}^{RSN}|^2 + n}\right) \\[3mm] R_{SN2,RSN} = \frac{1}{2}\log_2\!\left(1 + \dfrac{(1-\lambda)P_{SN2}|h_{SN2}^{RSN}|^2}{(1-\lambda)P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + n}\right) \end{cases} \quad k \in T_2 \tag{3}$$
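The decoding-order logic in (2)-(3) can be sketched numerically. The helper below treats the channel gains as the squared magnitudes $|h|^2$; all numeric values are illustrative assumptions, not parameters from the paper.

```python
import math

def phase1_rates(P1, P2, g1, g2, gp, pP, lam, n):
    """Return (R_SN1_RSN, R_SN2_RSN) per Eqs. (2)-(3).
    g1, g2: |h|^2 of SN1-RSN and SN2-RSN; gp: summed |h|^2 of PU1/PU2 to RSN."""
    pri = (1 - lam) * pP * gp  # residual primary-user interference power
    if P1 * g1 < P2 * g2:      # condition T1: decode x_SN2 first, x_SN1 interferes
        r2 = 0.5 * math.log2(1 + (1 - lam) * P2 * g2 / (pri + (1 - lam) * P1 * g1 + n))
        r1 = 0.5 * math.log2(1 + (1 - lam) * P1 * g1 / (pri + n))
    else:                      # condition T2: decode x_SN1 first, x_SN2 interferes
        r1 = 0.5 * math.log2(1 + (1 - lam) * P1 * g1 / (pri + (1 - lam) * P2 * g2 + n))
        r2 = 0.5 * math.log2(1 + (1 - lam) * P2 * g2 / (pri + n))
    return r1, r2

r1, r2 = phase1_rates(P1=1.0, P2=2.0, g1=0.5, g2=0.5, gp=0.1, pP=3.0, lam=0.4, n=0.001)
```

Note how the signal decoded first always carries the other secondary signal in its denominator, reflecting successive decoding at RSN.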
In phase 2, the relay user RSN uses all the harvested power to forward the signals. Both secondary users know their own signals and can cancel the self-interference. Therefore, the data rates in the second phase can be given as

$$R_{RSN,SN1} = \frac{1}{2}\log_2\!\left(1 + \frac{\eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{SN1}|^2}{n + P_P\left(|h_{PU1}^{SN1}|^2 + |h_{PU2}^{SN1}|^2\right)}\right)$$

$$R_{RSN,SN2} = \frac{1}{2}\log_2\!\left(1 + \frac{\eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{SN2}|^2}{n + P_P\left(|h_{PU1}^{SN2}|^2 + |h_{PU2}^{SN2}|^2\right)}\right) \tag{4}$$

where $\eta$ represents the energy conversion efficiency. When the channel condition is good, the secondary users have a great impact on the primary users, and we need to take the interference of SN1, SN2 on PU1, PU2 into consideration. The influence of SN1 and SN2 in phase 1 can be given as

$$I_{PU1,1} = P_{SN1}|h_{SN1}^{PU1}|^2 + P_{SN2}|h_{SN2}^{PU1}|^2 \tag{5}$$

$$I_{PU2,1} = P_{SN1}|h_{SN1}^{PU2}|^2 + P_{SN2}|h_{SN2}^{PU2}|^2 \tag{6}$$
Similarly, the influence of SN1 and SN2 in phase 2 can be given as

$$I_{PU1,2} = \eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{PU1}|^2 \tag{7}$$

$$I_{PU2,2} = \eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{PU2}|^2 \tag{8}$$
3 Problem Formulation

In this article, our target is to maximize the sum of the two secondary users' information rates while ensuring that the secondary users' interference with the primary users is less than the threshold $I_{th}$. The problem is formulated as

$$\begin{aligned} \text{OP1}: \max_{\{P_S,\lambda\}} \ & R \\ \text{s.t.} \ & C1: I_{PU1,1} \leq I_{th},\ I_{PU2,1} \leq I_{th} \\ & C2: I_{PU1,2} \leq I_{th},\ I_{PU2,2} \leq I_{th} \\ & C3: 0 < P_{SN1} \leq p_{\max},\ 0 < P_{SN2} \leq p_{\max} \\ & C4: \lambda \in [0,1] \end{aligned} \tag{9}$$

where $P_S = \{P_{SN1}, P_{SN2}\}$ and $C3$ indicates the maximum transmit power constraint of the secondary users SN1, SN2. The total transmission rate can be given as

$$R = \min\left(R_{SN1,RSN},\, R_{RSN,SN2}\right) + \min\left(R_{SN2,RSN},\, R_{RSN,SN1}\right) \tag{10}$$

Now the optimum can be obtained by solving the Lagrange dual function (11):

$$g(\alpha_1, \alpha_2, \alpha_3, \alpha_4) = \max_{\{\lambda, P_S\}} L(\lambda, P_S, \alpha_1, \alpha_2, \alpha_3, \alpha_4) \tag{11}$$
where

$$\begin{aligned} L(\lambda, P_S, \alpha_i) = R &+ \alpha_1\left(I_{th} - P_{SN1}|h_{SN1}^{PU1}|^2 - P_{SN2}|h_{SN2}^{PU1}|^2\right) \\ &+ \alpha_2\left(I_{th} - P_{SN1}|h_{SN1}^{PU2}|^2 - P_{SN2}|h_{SN2}^{PU2}|^2\right) \\ &+ \alpha_3\left(I_{th} - \eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{PU1}|^2\right) \\ &+ \alpha_4\left(I_{th} - \eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{PU2}|^2\right) \end{aligned} \tag{12}$$
The target function $R$ can be rewritten as $R = \beta_1 R_{SN1,RSN} + (1-\beta_1)R_{RSN,SN2} + \beta_2 R_{SN2,RSN} + (1-\beta_2)R_{RSN,SN1}$. If $R_{SN1,RSN} < R_{RSN,SN2}$, we set $\beta_1 = 1$; otherwise $\beta_1 = 0$, so that the first term of (10) always selects the smaller rate. Similarly, if $R_{SN2,RSN} < R_{RSN,SN1}$, $\beta_2 = 1$; otherwise $\beta_2 = 0$. The optimization problem can be solved by using the sub-gradient-based method, with sub-gradients obtained from the following formulas:

$$\begin{aligned} \Delta\alpha_1 &= I_{th} - P_{SN1}|h_{SN1}^{PU1}|^2 - P_{SN2}|h_{SN2}^{PU1}|^2 \\ \Delta\alpha_2 &= I_{th} - P_{SN1}|h_{SN1}^{PU2}|^2 - P_{SN2}|h_{SN2}^{PU2}|^2 \\ \Delta\alpha_3 &= I_{th} - \eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{PU1}|^2 \\ \Delta\alpha_4 &= I_{th} - \eta\lambda\left(P_P\left(|h_{PU1}^{RSN}|^2 + |h_{PU2}^{RSN}|^2\right) + P_{SN1}|h_{SN1}^{RSN}|^2 + P_{SN2}|h_{SN2}^{RSN}|^2 + n\right)|h_{RSN}^{PU2}|^2 \end{aligned} \tag{13}$$
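A minimal sketch of the sub-gradient-based dual update implied by (13): each multiplier moves against its constraint slack and is projected back onto the non-negative orthant. The step size and the sample values are assumptions for illustration only.

```python
def update_multipliers(alphas, slacks, step):
    """One projected sub-gradient step: alpha_i <- max(0, alpha_i - step * slack_i),
    where slack_i = Ith - (interference term i) is the sub-gradient from (13)."""
    return [max(0.0, a - step * s) for a, s in zip(alphas, slacks)]

alphas = [1.0, 1.0, 1.0, 1.0]
slacks = [0.2, -0.1, 0.05, 0.0]  # positive slack: constraint loose, multiplier shrinks
alphas = update_multipliers(alphas, slacks, step=0.5)
```

Iterating this update together with the primal solutions for $P_S^*$ and $\lambda^*$ drives the multipliers of loose constraints toward zero and grows those of violated ones.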
The optimal $P_S^*$ and $\lambda^*$ can be obtained by the following two steps.

3.1 Deriving the Optimal $P_S^*$

Assume that $\lambda$ is already known. From the KKT conditions, the partial derivatives of the Lagrange function are zero at the optimum.
3.2 Under Condition $T_1$

The partial derivatives of $L$ with respect to $P_{SN1}$, $P_{SN2}$ can be expressed as

$$\begin{aligned} \frac{\partial L}{\partial P_{SN1}} &= \beta_1\frac{1}{2}\frac{A}{B + AP_{SN1}} + (1-\beta_1)\frac{1}{2}\frac{CE}{F + C(D + IP_{SN2} + EP_{SN1})} \\ &\quad - \beta_2\frac{1}{2}\frac{GP_{SN2}A}{(B + AP_{SN1})(B + AP_{SN1} + GP_{SN2})} + (1-\beta_2)\frac{1}{2}\frac{HE}{F + H(D + IP_{SN2} + EP_{SN1})} + J \\ \frac{\partial L}{\partial P_{SN2}} &= (1-\beta_1)\frac{1}{2}\frac{IC}{F + C(D + EP_{SN1} + IP_{SN2})} + \beta_2\frac{1}{2}\frac{G}{B + AP_{SN1} + GP_{SN2}} \\ &\quad + (1-\beta_2)\frac{1}{2}\frac{IH}{F + H(D + EP_{SN1} + IP_{SN2})} + K \end{aligned} \tag{14}$$

where $A = (1-\lambda)|h_{SN1,RSN}|^2$, $B = (1-\lambda)P_P\left(|h_{PU1,RSN}|^2 + |h_{PU2,RSN}|^2\right) + n$, $C = \eta\lambda|h_{RSN,SN2}|^2$, $D = P_P\left(|h_{PU1,RSN}|^2 + |h_{PU2,RSN}|^2\right) + n$, $E = |h_{SN1,RSN}|^2$, $F = P_P\left(|h_{PU1,SN1}|^2 + |h_{PU2,SN1}|^2\right) + n$, $G = (1-\lambda)|h_{SN2,RSN}|^2$, $H = \eta\lambda|h_{RSN,SN1}|^2$, $I = |h_{SN2,RSN}|^2$, $J = -\alpha_1|h_{SN1,PU1}|^2 - \alpha_2|h_{SN1,PU2}|^2 - \alpha_3\eta\lambda E|h_{RSN,PU1}|^2 - \alpha_4\eta\lambda E|h_{RSN,PU2}|^2$, and $K = -\alpha_1|h_{SN2,PU1}|^2 - \alpha_2|h_{SN2,PU2}|^2 - \alpha_3\eta\lambda I|h_{RSN,PU1}|^2 - \alpha_4\eta\lambda I|h_{RSN,PU2}|^2$ (the signs of the $\alpha_3$, $\alpha_4$ terms follow from differentiating (12)).

When $\beta_1 = \beta_2 = 0$,

$$P_{SN1}^* = \left[-\frac{CFK - \sqrt{C^2F^2K^2 + C^2H^2I^2 - 2CF^2HK^2 + F^2H^2K^2} + CHI + FHK + 2CDHK}{2CEHK}\right]^+$$

$$P_{SN2}^* = \left[-\frac{CEH - \sqrt{C^2E^2H^2 + C^2F^2J^2 - 2CF^2HJ^2 + F^2H^2J^2} + CFJ + FHJ + 2CDHJ}{2CHIJ}\right]^+ \tag{15}$$

When $\beta_1 = \beta_2 = 1$,
$$P_{SN1}^* = \left[\frac{AK - GJ}{2AJK}\right]^+, \qquad P_{SN2}^* = \left[-\frac{AK - GJ}{2GJK}\right]^+ \tag{16}$$
When $\beta_1 = 1$, $\beta_2 = 0$,

$$P_{SN1}^* = \frac{1}{2}\left[\frac{AI - 2BEK + 2BIJ}{AEK - AIJ}\right]^+$$

$$P_{SN2}^* = \left[\frac{2BE^2HK^2 - 2AEFK^2 + AHI^2J - 2ADEHK^2 - 2AEHIK + 2AFIJK + 2ADHIJK - 2BEHIJK}{2AEHIK^2 - 2AHI^2JK}\right]^+ \tag{17}$$
When $\beta_1 = 0$, $\beta_2 = 1$, we cannot get an analytical solution, but we can solve the problem numerically, e.g., by using MATLAB.

3.3 Under Condition $T_2$

The partial derivatives of $L$ with respect to $P_{SN1}$, $P_{SN2}$ can be shown as

$$\begin{aligned} \frac{\partial L}{\partial P_{SN1}} &= \beta_1\frac{1}{2}\frac{A}{B + GP_{SN2} + AP_{SN1}} + (1-\beta_1)\frac{1}{2}\frac{CE}{F + C(D + EP_{SN1} + IP_{SN2})} \\ &\quad + (1-\beta_2)\frac{1}{2}\frac{EH}{F + H(D + EP_{SN1} + IP_{SN2})} + J \\ \frac{\partial L}{\partial P_{SN2}} &= -\beta_1\frac{1}{2}\frac{AGP_{SN1}}{(B + GP_{SN2})(B + AP_{SN1} + GP_{SN2})} + (1-\beta_1)\frac{1}{2}\frac{CI}{F + C(D + EP_{SN1} + IP_{SN2})} \\ &\quad + \beta_2\frac{1}{2}\frac{G}{B + GP_{SN2}} + (1-\beta_2)\frac{1}{2}\frac{HI}{F + H(D + EP_{SN1} + IP_{SN2})} + K \end{aligned} \tag{18}$$

When $\beta_1 = \beta_2 = 0$, the solution has the same form as (15):

$$P_{SN1}^* = \left[-\frac{CFK - \sqrt{C^2F^2K^2 + C^2H^2I^2 - 2CF^2HK^2 + F^2H^2K^2} + CHI + FHK + 2CDHK}{2CEHK}\right]^+$$

$$P_{SN2}^* = \left[-\frac{CEH - \sqrt{C^2E^2H^2 + C^2F^2J^2 - 2CF^2HJ^2 + F^2H^2J^2} + CFJ + FHJ + 2CDHJ}{2CHIJ}\right]^+ \tag{19}$$

When $\beta_1 = \beta_2 = 1$,

$$P_{SN1}^* = \left[-\frac{G + 2BK}{2AK}\right]^+, \qquad P_{SN2}^* = \left[-\frac{A + 2BJ}{2GJ}\right]^+ \tag{20}$$

When $\beta_1 = 0$, $\beta_2 = 1$,

$$P_{SN1}^* = \left[-\frac{2BCI^2J^2 + CE^2GK - 2FGIJ^2 - 2CEGIJ + 2EFGJK - 2BCEIJK + 2CDEGJK}{2CEJ(EGK - GIJ)}\right]^+$$

$$P_{SN2}^* = \left[-\frac{EG + 2BEK - 2BIJ}{2EGK - 2GIJ}\right]^+ \tag{21}$$
When $\beta_1 = 1$, $\beta_2 = 0$, we cannot get an analytical solution, but we can solve the problem numerically, e.g., by using MATLAB.

3.4 Deriving the Optimal $\lambda^*$

Substituting the optimal $P_S^*$ into (12), the Lagrangian function can be written as

$$L(\lambda, P_S, \alpha_1, \alpha_2, \alpha_3, \alpha_4) = F + \alpha_1\left(I_{th} - P_{SN1}|h_{SN1}^{PU1}|^2 - P_{SN2}|h_{SN2}^{PU1}|^2\right) + \alpha_2\left(I_{th} - P_{SN1}|h_{SN1}^{PU2}|^2 - P_{SN2}|h_{SN2}^{PU2}|^2\right) + \alpha_3 I_{th} + \alpha_4 I_{th} \tag{22}$$

where

$$F = \beta_1 R_{SN1,RSN} + (1-\beta_1)R_{RSN,SN2} + \beta_2 R_{SN2,RSN} + (1-\beta_2)R_{RSN,SN1} - \alpha_3\eta\lambda|h_{RSN}^{PU1}|^2\left(D + EP_{SN1} + IP_{SN2}\right) - \alpha_4\eta\lambda|h_{RSN}^{PU2}|^2\left(D + EP_{SN1} + IP_{SN2}\right) \tag{23}$$

From (23), we can see that $\lambda$ appears only in $F$; thus, the optimal $\lambda^*$ can be found as

$$\lambda^* = \arg\max_{\lambda} F \tag{24}$$
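Since (24) is a one-dimensional search over $\lambda \in [0,1]$, $\lambda^*$ can be found by a simple grid scan. The objective used below is a toy placeholder standing in for $F$, purely to illustrate the search.

```python
def best_lambda(F, grid_size=1000):
    """Grid search for lambda in [0, 1] maximizing the objective F, as in (24)."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(grid, key=F)

# toy concave stand-in for F, peaking at lambda = 0.6 (illustrative assumption)
toy_F = lambda lam: -(lam - 0.6) ** 2
lam_star = best_lambda(toy_F)
print(lam_star)  # 0.6
```

In the actual algorithm, $F$ would be evaluated with the optimal $P_S^*$ from Sects. 3.2-3.3 substituted in at each grid point.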
4 Simulation Result

In this section, we assume that the channel between any two nodes is a Rayleigh fading channel, the path-loss exponent is $m = 3$, and the energy conversion efficiency is $\eta = 0.8$. The transmit power of the primary users is $p_P = 3$ W. For simplicity, we set the power of the additive white Gaussian noise to $n = 0.001$ W. The distance of links SN1-RSN and SN2-RSN is 1.5 m, the distance of links SN1-PU1 and SN2-PU2 is 3 m, and the distance of links SN1-PU2 and SN2-PU1 is 4.5 m.
Figure 2 illustrates the achievable rate of the cognitive network $R_{\max}$ versus $I_{th}$ for different $\lambda$. As can be seen from this figure, when $I_{th}$ is small and gradually increases, the primary users PU1, PU2 can accept more interference from the cognitive network, so the secondary users SN1, SN2 can transmit with greater power; that is, $P_{S1}$ and $P_{S2}$ gradually become larger and $R_{\max}$ rises. When $I_{th}$ reaches a certain level, the interference of the secondary users is already within the acceptance range of the primary users, so $P_{S1}$ and $P_{S2}$ remain unchanged, resulting in a constant transmission rate. For the same $I_{th}$, a larger $\lambda$ allows the RSN to collect more energy for forwarding in the second phase, so the rates of the RSN-SN1 and RSN-SN2 links in the second phase also increase; consequently, the $R_{\max}$ that can be reached by the secondary users increases as well.
Fig. 2 Achievable rate of cognitive network $R_{\max}$ versus $I_{th}$ for different $\lambda$
Figure 3 illustrates the achievable rate of the cognitive network $R_{\max}$ versus $I_{th}$ for different $P_S$. From this figure, we can see that as $I_{th}$ gradually increases, the reachable rate $R_{\max}$ of the secondary users also gradually increases. For the same $I_{th}$, larger $P_{S1}$ and $P_{S2}$ mean that more energy is decoded and forwarded by RSN, and the rate $R_{\max}$ that can be reached by the secondary users eventually becomes larger.
5 Conclusion

In this paper, we have proposed a joint resource optimization protocol in a DF cognitive network, where primary users and secondary users can transmit in two directions. Specifically, under the premise of ensuring the transmission quality of the primary users, the secondary users share the spectrum in two transmission phases. By jointly optimizing the transmit power of the two secondary users and the power splitting ratio
Fig. 3 Achievable rate of cognitive network $R_{\max}$ versus $I_{th}$ for different $P_S$
of the relay node, the information rate of the secondary users is maximized. Through a series of numerical simulations, we can see that when the secondary users' transmit power and the power splitting ratio are both optimal, the transmission rate of the cognitive network is optimal.
References
1. Lu X, Wang P, Niyato D et al (2015) Wireless networks with RF energy harvesting: a contemporary survey. IEEE Commun Surv Tutorials 17(2):757–789
2. Varshney LR (2008) Transporting information and energy simultaneously. In: 2008 IEEE international symposium on information theory
3. Zhou X, Zhang R, Ho CK (2012) Wireless information and power transfer: architecture design and rate-energy tradeoff. IEEE Trans Commun 61(11):4754–4767
4. Nasir AA, Zhou X et al (2013) Relaying protocols for wireless energy harvesting and information processing. IEEE Trans Wirel Commun 12(7):3622–3636
5. Kang X, Liang YC, Nallanathan A et al (2008) Optimal power allocation for fading channels in cognitive radio networks: ergodic capacity and outage capacity. IEEE Trans Wirel Commun 8(2):940–950
6. Laneman JN, Tse DNC, Wornell GW (2004) Cooperative diversity in wireless networks: efficient protocols and outage behavior. IEEE Trans Inf Theor 50(12):3062–3080
7. Silva S, Ardakani M, Tellambura C (2017) Relay selection for cognitive massive MIMO two-way relay networks. In: 2017 IEEE wireless communications and networking conference (WCNC), San Francisco, CA, pp 1–6
8. Ubaidulla P (2012) Optimal relay selection and power allocation for cognitive two-way relaying networks. IEEE Wirel Commun Lett 1(3):225–228
9. Li Q, Varshney PK (2017) Resource allocation and outage analysis for an adaptive cognitive two-way relay network. IEEE Trans Wirel Commun 16(7):4727–4737 10. Hong H, Xiao L, Yan Y, Xu X, Wang J (2014) Outage performance for cognitive two-way relaying networks with underlay spectrum sharing. In: 2014 IEEE 79th vehicular technology conference (VTC Spring), Seoul, pp 1–5
Compressive Sensing-Based Array Antenna Optimization for Adaptive Beamforming
Jian Yang¹,², Jian Lu²(B), Bo Hou², and Xinxin Liu²
¹ School of Electronic Engineering, Xidian University, Xi'an 710071, China
² Xi'an Research Institute of Hi-Tech, Hongqing Town, Xi'an 710025, China
[email protected]
Abstract. In digital beamforming, each antenna element usually corresponds to a front-end chain, which dramatically increases the hardware costs. In this paper, a compressive sensing-based array antenna optimization technique is used for adaptive beamforming, which can decrease the hardware complexity while suppressing the interfering signals. Compressive sensing is utilized to reduce the number of sampling channels, and sparse reconstruction based on a convex optimization model is applied to accurately recover the full-array data. Then, the weight vector can be obtained from the recovered data. Simulation results are presented to demonstrate the effectiveness of the proposed method.
Keywords: Compressive sensing · Adaptive beamforming · Sparse reconstruction · Convex optimization

1 Introduction
To effectively enhance the desired signal and suppress the interferences, adaptive beamforming techniques have been widely applied to radar, sonar, wireless communications, and speech processing [1–4]. In an adaptive beamforming system, each antenna sensor needs to be equipped with a front-end chain for digital sampling and processing, which significantly increases the hardware costs of the system design. Compressive sensing (CS) is one of the most interesting signal processing techniques, as it enables accurate signal recovery at sub-Nyquist sampling rates [5]. That is to say, the CS technique can effectively decrease the hardware complexity. In [6], a measurement-domain adaptive beamforming approach based on distributed CS was proposed to reconstruct high-resolution ultrasound images, but this method mainly aims to reduce the number of samples per channel and to make the real-time transmission of sensor array data possible. In [7], an adaptive beamforming algorithm for the coprime array was proposed by processing the virtual uniform linear array (ULA)
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_19
signals with CS. However, with this method, the sidelobe levels in the beampattern are very high. In [8], to achieve high-resolution beamforming in the time domain, CS was adopted to provide a solution for the underdetermined linear problem by promoting sparsity of the solution; however, the CS technique there is mainly used to obtain the DOAs in the time domain. Moreover, CS is useful for DOA estimation in sparse arrays [9, 10]. In [11], an improved beamforming method with CS was proposed by reconstructing the signal at a fairly low sample rate, but the method merely uses CS to decrease the signal sample rate in the time domain. In [12], to address the reconstruction of aircraft engine noise sources, both adaptive beamforming and the CS method are considered to improve the performance of the inverse reconstruction in terms of array imaging resolution and the number of sensors. In this paper, from the perspective of the hardware system, an adaptive beamforming structure based on CS is designed to effectively reduce the number of front-end chains.
2 Signal Model

Consider a ULA consisting of $N$ omni-directional sensors spaced half a wavelength apart that receive $Q$ narrowband far-field signals. The desired signal comes from the direction $\theta_0$, while $Q-1$ interferences impinge upon the array from the directions $\theta_1, \theta_2, \ldots, \theta_{Q-1}$, respectively. The array signal vector $\mathbf{x}(k)$ at the $k$th snapshot can be expressed as

$$\mathbf{x}(k) = \mathbf{A}(\theta)\mathbf{s}(k) + \mathbf{n}(k) = \sum_{q=0}^{Q-1} s_q(k)\mathbf{a}(\theta_q) + \mathbf{n}(k), \tag{1}$$
where $\mathbf{s}(k) = [s_0(k), s_1(k), \ldots, s_{Q-1}(k)]^T$ are the uncorrelated source signals, $\mathbf{A}(\theta) = [\mathbf{a}(\theta_0), \mathbf{a}(\theta_1), \ldots, \mathbf{a}(\theta_{Q-1})]$ is the array manifold, $s_q(k)$ denotes the $q$th incident signal, $\mathbf{a}(\theta_q)$ denotes the steering vector, and $\mathbf{n}(k)$ denotes the additive noise vector, which is spatially and temporally white Gaussian with zero mean and variance $\delta_0^2$. Assume that the desired signal $s_0(k)$, the interferences $s_q(k)$, $1 \leq q \leq Q-1$, and the noise $\mathbf{n}(k)$ are statistically independent of each other. For the ULA, the steering vector $\mathbf{a}(\theta_q)$ has the following form:

$$\mathbf{a}(\theta_q) = [1, \exp(-j\pi d_x \sin\theta_q), \ldots, \exp(-j\pi d_x(N-1)\sin\theta_q)]^T, \tag{2}$$
where $d_x = \frac{2d}{\lambda}$, and $d$, $\lambda$, and $(\cdot)^T$ denote the array element spacing, the signal wavelength, and the transpose operation, respectively. The adaptive beamformer output can be represented as

$$y(k) = \mathbf{w}^H\mathbf{x}(k), \tag{3}$$

where $\mathbf{w}$ is the weight vector of the beamformer, and $(\cdot)^H$ denotes the Hermitian transpose operation. The weight vector $\mathbf{w}$ of the optimal beamformer can be obtained by maximizing the output signal-to-interference-plus-noise ratio (SINR), which is mathematically equivalent to the minimum variance distortionless response (MVDR) beamforming problem [13]:

$$\min_{\mathbf{w}} \mathbf{w}^H\mathbf{R}_{i+n}\mathbf{w} \quad \text{subject to} \quad \mathbf{w}^H\mathbf{a}(\theta_0) = 1, \tag{4}$$
J. Yang et al.
where $\mathbf{R}_{i+n}$ is the interference-plus-noise covariance matrix (INCM). The optimal solution of the above optimization problem can be expressed as

$$\mathbf{w}_{opt} = \frac{\mathbf{R}_{i+n}^{-1}\mathbf{a}(\theta_0)}{\mathbf{a}^H(\theta_0)\mathbf{R}_{i+n}^{-1}\mathbf{a}(\theta_0)}. \tag{5}$$
In a digital array system, the number of front-end chains must be the same as the number of sensors, which adversely affects hardware design and energy consumption. Subarray technology is often used to reduce the number of expensive front-end chains, but the degrees of freedom (DOFs) of the beamforming are also cut down at the same time. In this paper, we propose an adaptive beamforming structure based on CS, which can reduce the hardware costs and improve the system DOFs of the beamforming.
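To make Eqs. (4)-(5) concrete, the following sketch builds MVDR weights for an 8-element half-wavelength ULA; the covariance matrix is synthesized directly from an assumed interferer rather than estimated from data, and all directions and powers are illustrative.

```python
import numpy as np

def steering_vector(N, theta_deg):
    # d_x = 2d/lambda = 1 for half-wavelength spacing, matching Eq. (2)
    theta = np.deg2rad(theta_deg)
    return np.exp(-1j * np.pi * np.arange(N) * np.sin(theta))

def mvdr_weights(R, a0):
    """Eq. (5): w = R^{-1} a0 / (a0^H R^{-1} a0)."""
    Ri = np.linalg.inv(R)
    return (Ri @ a0) / (a0.conj() @ Ri @ a0)

N = 8
a0 = steering_vector(N, 0.0)                  # desired direction theta_0
ai = steering_vector(N, 20.0)                 # assumed interferer direction
R = 10 * np.outer(ai, ai.conj()) + np.eye(N)  # interference-plus-noise covariance
w = mvdr_weights(R, a0)
print(abs(w.conj() @ a0))                     # ~1: distortionless toward theta_0
print(abs(w.conj() @ ai))                     # small: interferer suppressed
```

The two printed values show the MVDR tradeoff: unit gain is enforced toward $\theta_0$ while the response toward the interferer is driven close to zero.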
3 CS-Based Array Antenna Optimization for Adaptive Beamforming

In this section, an array antenna structure based on CS, sparse reconstruction, and adaptive beamforming is designed. The CS technique is utilized to reduce the number of front-end chains, and sparse reconstruction is used to recover the $N$-dimensional array signal vector and improve the DOFs of the adaptive beamforming.

3.1 CS Theory

Following the idea of CS, a random CS kernel $\mathbf{\Phi}$ is applied to compress the array signal vector $\mathbf{x}(k)$ as

$$\mathbf{z}(k) = \mathbf{\Phi}\mathbf{x}(k) = \mathbf{\Phi}(\mathbf{A}(\theta)\mathbf{s}(k) + \mathbf{n}(k)), \tag{6}$$
where M < N determines the dimension of the compressed measurement vector z(k), and Φ is the M × N compression matrix. If no prior knowledge of the desired signal is available, the elements of Φ can be generated from a random distribution, such as a Gaussian random matrix, a Hadamard matrix, or a partial Fourier matrix [14]. With Φ, the N-dimensional array signal vector x(k) is compressed to an M-dimensional measurement vector z(k), which dramatically reduces the number of receiving channels. Because the energy of the interfering signals is far stronger than that of the noise, the noise n(k) can be neglected, and the compressed measurement vector z(k) can be further written as z(k) = Ψs(k) = ΦA(θ)s(k),
(7)
where Ψ = ΦA(θ) is the measurement matrix. If s(k) is sparse with only Q nonzero coefficients, the signal vector s(k) can be recovered by solving the inverse problem of Eq. (7). Therefore, according to the CS theory, the incident signal vector s(k) must be converted to a sparse vector s̄(k), and A(θ) also needs to be accordingly transformed to Ā. At last, the array signal vector x(k) can be accurately reconstructed.
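The compression step of Eq. (6) is just a matrix multiply per snapshot. Below is a minimal NumPy sketch with a Gaussian CS kernel; the 1/√M column scaling and the random stand-in data are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, K = 16, 10, 60          # sensors, compressed channels, snapshots

# Gaussian random CS kernel: one of the choices suggested when no prior
# knowledge of the desired signal is available.
Phi = rng.standard_normal((M, N)) / np.sqrt(M)

# Stand-in complex array data x(k) for K snapshots (N x K)
X = rng.standard_normal((N, K)) + 1j * rng.standard_normal((N, K))

Z = Phi @ X                   # Eq. (6): M-dimensional compressed measurements

print(Z.shape)                # 10 front-end chains instead of 16
```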
Compressive Sensing-Based Array Antenna Optimization . . .
3.2 Sparse Reconstruction
To make the signal vector s(k) sparse, we evenly divide the angular region of the incident signals into L parts, and each angle corresponds to a steering vector. Hence, the steering vectors construct a transform matrix Ā(ϑ),

Ā(ϑ) = [a(ϑ0), a(ϑ1), . . . , a(ϑl), . . . , a(ϑL)],
(8)
where ϑ = [ϑ0, ϑ1, . . . , ϑL] are the possible DOAs of the incident signals. Accordingly, the signal vector s(k) can be extended as

s̄(k) = [0, . . . , s0(k), 0, . . . , s1(k), 0, . . . , sQ−1(k), 0, . . .],
(9)
where s0(k), s1(k), . . . , sQ−1(k) are the incident signals. In (9), the incident signals are arbitrarily distributed over the Q non-zero positions of s̄(k), and the remaining elements are zero. The extended signal vector s̄(k) is thus sparse, with only Q non-zero elements, which are the Q incident signals at the kth snapshot. By compressing the array signal vector x(k), we obtain

z(k) = Φx(k) = ΦĀ(ϑ)s̄(k),
(10)
where z(k) is the M-dimensional measurement vector and s̄(k) is the L-dimensional signal vector with Q non-zero coefficients, satisfying Q < M ≪ L. According to the CS theory, the sparse signal vector s̄(k) can be recovered from the M-dimensional vector z(k), and then the array signal vector x(k) can be reconstructed. Therefore, the sparse optimization model can be established as

min_{s̃k} ‖s̃k‖0  subject to  ‖z(k) − ΦĀ(ϑ)s̃k‖2 ≤ ϕ, (11)
where s̃k is the estimate of s̄(k), and ϕ is a user parameter that determines the reconstruction uncertainty bound. The non-convex optimization problem with the ℓ0-norm can be solved by convex relaxation, replacing the non-convex ℓ0-norm with the convex ℓ1-norm. Hence, the optimization problem (11) can be replaced by

min_{s̃k} ξ‖s̃k‖1 + (1/2)‖z(k) − ΦĀ(ϑ)s̃k‖2², (12)

where ξ is a regularization parameter that balances the sparsity of the reconstructed signal against the reconstruction error. The optimization problem is convex and can be efficiently solved with convex optimization software, such as CVX [15]. Then, the array signal vector x(k) can be reconstructed as

x̃(k) = Ā(ϑ)s̃k. (13)
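The paper solves the ℓ1 problem (12) with CVX. As an illustrative, dependency-free alternative, the same objective can be minimized by iterative soft-thresholding (ISTA); the toy measurement matrix, grid size, and source values below are assumptions for demonstration, not the paper's setup.

```python
import numpy as np

def ista(z, B, xi, n_iter=1000):
    # Minimize xi*||s||_1 + 0.5*||z - B s||_2^2 by iterative soft-thresholding.
    L = np.linalg.norm(B, 2) ** 2            # Lipschitz constant of the smooth part
    s = np.zeros(B.shape[1], dtype=complex)
    for _ in range(n_iter):
        g = s - B.conj().T @ (B @ s - z) / L              # gradient step
        mag = np.abs(g)
        # complex soft-threshold: shrink magnitudes by xi/L, keep phase
        s = np.where(mag > xi / L, (1 - xi / (L * mag + 1e-12)) * g, 0)
    return s

# Toy problem: M = 10 compressed measurements of an L = 61 angular grid, Q = 2 sources
rng = np.random.default_rng(1)
M, Lg = 10, 61
B = (rng.standard_normal((M, Lg)) + 1j * rng.standard_normal((M, Lg))) / np.sqrt(2 * M)
s_true = np.zeros(Lg, dtype=complex)
s_true[12], s_true[40] = 2.0, 1.0 - 1.0j     # two active grid points
z = B @ s_true                               # noiseless measurements, Eq. (10)

s_hat = ista(z, B, xi=0.01)
print(np.linalg.norm(z - B @ s_hat) / np.linalg.norm(z))  # small relative residual
```

With a small ξ the ISTA fixed point approaches the least-squares-consistent sparse solution; larger ξ trades fidelity for sparsity, the same balance controlled by ξ in Eq. (12).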
By recovering the array received signals over K snapshots, S̃ = [s̃1, s̃2, . . . , s̃K], adaptive beamforming algorithms can be used to obtain the weight vector w for enhancing the desired signal and suppressing the interferences.
3.3 System Design
As shown in Fig. 1, the N-dimensional received signals are sent into the CS network and converted to an M-dimensional signal vector. Through M front-end chains, the M-dimensional digital signal vector z(k) is obtained, and the full-array signal vector x̃(k) is then recovered by solving the convex optimization problem. At last, the recovered signals S̃ = [s̃1, s̃2, . . . , s̃K] are used for adaptive beamforming to obtain the weight vector w. During these procedures, the shared front-end chain 1 connects contact b to contact d. When the weight vector w is applied to the array antenna, the shared front-end chain 1 connects contact a to contact c. By recovering the full-array receiving signal vector x̃(k), the DOFs of the beamformer can be effectively improved.
Fig. 1 Structure design of array antenna (N pre-processing channels feed a CS network; M shared front-end chains connect either to the sparse reconstruction path or to the adaptive beamforming weights W1, . . . , WN)
4 Simulation Results
In the simulations, a ULA with N = 16 omni-directional sensors spaced half a wavelength apart is considered. Four signals, composed of the desired signal and three interferences, impinge from the directions θ0 = 10◦, θ1 = −40◦, θ2 = −20◦, and θ3 = 40◦, respectively. The first signal is assumed to be the desired signal with a 10 dB signal-to-noise ratio (SNR), and the others are interferences with the same interference-to-noise ratio (INR). The number of snapshots is fixed at K = 60, the INRs are set to 35 dB, the parameter M
is set to 10, and the CS kernel Φ ∈ R10×16 is drawn from a Gaussian random matrix. For each scenario, 200 Monte-Carlo trials are performed. In the proposed method, we use the adaptive beamforming algorithm based on INCM reconstruction presented in [16] to obtain the weight vector w, and the regularization parameter is set to ξ = 0.4. The proposed method is compared with the INCM reconstruction-based adaptive beamformer (Rec-INCM) presented in [16], the worst-case performance optimization-based beamformer (DBF-WCPO) presented in [17], the covariance matrix reconstruction-based beamformer (Rec-SVECM) presented in [18], the optimal beamformer with 16 sensors (16-sensor-optimal), and the optimal beamformer with ten sensors (10-sensor-optimal). Note that the weight vector of the optimal beamformer is obtained by (5). The angular set of the desired signal is assumed to be within the interval Θ = [θ0 − 5◦, θ0 + 5◦]. The value ε = 0.3 is used for the DBF-WCPO beamformer, and the predetermined threshold is set to 0.95 in the Rec-SVECM beamformer.

Fig. 2 Beampatterns of the 16-sensor-optimal and 10-sensor-optimal beamformers and the proposed method
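A beampattern like the one in Fig. 2 can be reproduced in outline by evaluating 20 log10 |wH a(θ)| over a grid of angles. The sketch below uses ideal MVDR weights from Eq. (5) with the stated scenario (16 sensors, interferers at −40°, −20°, 40°, 35 dB INR); it is illustrative only and repeats the weight computation so that it is self-contained.

```python
import numpy as np

def steering(theta_deg, n, dx=0.5):
    th = np.deg2rad(theta_deg)
    return np.exp(-1j * 2 * np.pi * dx * np.arange(n) * np.sin(th))

N = 16
a0 = steering(10, N)
R = np.eye(N, dtype=complex)
for t in (-40, -20, 40):                      # interferers, INR = 35 dB
    a = steering(t, N)
    R += 10 ** 3.5 * np.outer(a, a.conj())

w = np.linalg.solve(R, a0)                    # MVDR weights, Eq. (5)
w = w / (a0.conj() @ w)

grid = np.arange(-90, 90.5, 0.5)
pattern_db = np.array([20 * np.log10(abs(w.conj() @ steering(t, N))) for t in grid])

print(grid[np.argmax(pattern_db)])            # main beam points near 10 degrees
```

The deep nulls of the pattern fall at the interferer DOAs, while the unit-gain constraint keeps the main beam at θ0.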
The beampatterns of the tested beamformers are shown in Fig. 2. The beam-pointing and null positions accurately correspond to the DOAs of the desired signal and the interferences, respectively, and the null widths of the proposed method are widened to a certain extent, which enhances robustness against DOA errors. In addition, the sidelobe levels and main-lobe width of the proposed beamformer outperform those of the optimal beamformer of the 10-sensor ULA. As shown in Fig. 3, the simulation results demonstrate the anti-jamming performance of the tested methods. Figure 3a compares the output SINR versus
Fig. 3 a Output SINR versus input SNR; b output SINR versus number of snapshots
the input SNR. It can be seen that the performance of the proposed method is very close to the optimal SINR in the range from −30 to 20 dB and degrades slightly at higher SNRs. Figure 3b compares the output SINR versus the number of snapshots, showing that the proposed method outperforms the other tested beamformers in this scenario.
Fig. 4 Second example: a output SINR versus input SNR; b output SINR versus number of snapshots
In this example, the influence of random signal direction errors on the array output SINR is examined. The DOA errors of the desired signal and the interferences are assumed to follow a uniform distribution over [−1◦, +1◦] in each simulation run. The DOAs change from run to run, but the weight vectors of the tested beamformers remain constant. Figure 4a depicts the output SINR of the tested beamformers versus the input SNR, and Fig. 4b illustrates the output SINR versus the number of snapshots. It is shown that
the proposed beamformer outperforms the other tested methods in the case of random signal direction errors, and the CS technique can enhance the anti-jamming performance by improving the DOFs of the beamforming.
5 Conclusion
In this paper, a novel adaptive beamforming method based on an array antenna optimization technique is proposed. With the CS method, the full-array data are recovered by solving a convex optimization problem, and the recovered data are used by the adaptive beamforming to obtain the weight vector. Simulation results demonstrate that the number of front-end chains is greatly reduced and that the overall performance of the proposed beamformer is close to optimal. Moreover, the main-lobe width and sidelobe levels can be effectively reduced by improving the DOFs of the adaptive beamforming.

Acknowledgment. This work was supported by the National Natural Science Foundation of China under grant 61501471.
References
1. Zhang Z, Liu W, Leng W, Wang A, Shi H (2016) Interference-plus-noise covariance matrix reconstruction via spatial power spectrum sampling for robust adaptive beamforming. IEEE Signal Process Lett 23(1):121–125
2. Yang J, Lu J, Liu X, Liao G (2020) Robust null broadening beamforming based on covariance matrix reconstruction via virtual interference sources. Sensors 20(7):1865. https://doi.org/10.3390/s20071865
3. Li W, Yang J, Zhang Y, Lu J (2019) Robust wideband beamforming method for linear frequency modulation signals based on digital dechirp processing. IET Radar Sonar Navig 13(2):283–289
4. Lu J, Yang J, Liu X, Liu G, Zhang Y (2019) Robust direction of arrival estimation approach for unmanned aerial vehicles at low signal-to-noise ratios. IET Signal Process 13(4):456–463
5. Donoho DL (2006) Compressed sensing. IEEE Trans Inf Theor 52(4):1289–1306
6. Zhang Q, Li B, Shen M (2013) A measurement-domain adaptive beamforming approach for ultrasound instrument based on distributed compressed sensing: initial development. Ultrasonics 53:255–265
7. Gu Y, Zhou C, Goodman NA, Song WZ, Shi Z (2016) Coprime array adaptive beamforming based on compressive sensing virtual array signal. In: 41st IEEE international conference on acoustics, speech and signal processing. IEEE Press, Shanghai, pp 2981–2985
8. Choo Y, Park Y, Seong W (2017) Compressive time-domain beamforming. In: IEEE Underwater Technology. IEEE Press, Busan
9. Zhou C, Gu Y, Zhang YD, Shi Z, Jin T, Wu X (2017) Compressive sensing-based coprime array direction-of-arrival estimation. IET Commun 11(11):1719–1724
10. Guo M, Zhang YD, Chen T (2018) DOA estimation using compressed sparse array. IEEE Trans Signal Process 66(18):4133–4146
11. Zhou S, Shen Z, Wang Q (2018) An improved beamforming method with compressive sensing for interference suppression. In: 15th International Bhurban conference on applied sciences and technology. IEEE Press, Islamabad, pp 775–777
12. Yu W, Huang X (2018) Reconstruction of aircraft engine noise source using beamforming and compressive sensing. IEEE Access 6:11716–11726
13. Capon J (1969) High-resolution frequency-wavenumber spectrum analysis. Proc IEEE 57(8):1408–1418
14. Gu Y, Goodman NA, Ashok A (2014) Radar target profiling and recognition based on TSI-optimized compressive sensing kernel. IEEE Trans Signal Process 62(12):3194–3207
15. Grant M, Boyd S, Ye Y. CVX: Matlab software for disciplined convex programming. http://cvxr.com/cvx
16. Gu Y, Leshem A (2012) Robust adaptive beamforming based on interference covariance matrix reconstruction and steering vector estimation. IEEE Trans Signal Process 60(7):3881–3885
17. Vorobyov SA, Gershman AB, Luo ZQ (2003) Robust adaptive beamforming using worst-case performance optimization: a solution to the signal mismatch problem. IEEE Trans Signal Process 51(2):313–324
18. Shen F, Chen F, Song J (2015) Robust adaptive beamforming based on steering vector estimation and covariance matrix reconstruction. IEEE Commun Lett 19(9):1636–1639
A Fiber Bragg Grating Sensor for Pressure Monitoring of Ship Structure Under Wave Load

Jingping Yang, Wei Wang(B), Libo Qiao, and ChuanQi Liu

Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, 300387 Tianjin, China
[email protected]
Abstract. Based on the development of high-sensitivity pressure sensors, a kind of FBG sensor for the pressure monitoring of ship structure under wave load is proposed. The pressure sensor is mainly composed of a metal circular diaphragm and two FBGs pasted on the sensitized structure, and a differential structure is adopted to greatly improve the measurement sensitivity. Firstly, the structure of the FBG pressure sensor is introduced, and its working principle is analyzed theoretically. Then, the pressure sensitivity of the FBG pressure sensor in the range of 0–4.5 MPa is found to be 0.871 pm/kPa through simulation with finite element analysis software.

Keywords: Fiber Bragg grating · Pressure sensor · Differential structure
1 Introduction
Ships working in the ocean are inevitably affected by waves. Under the influence of the waves, a ship moves violently, and the bottom of the bow may come out of the water; when it re-enters the water, it is strongly impacted. This impact is called the slamming phenomenon. Slamming can occur at the bottom of the bow, at the bow, and at the stern. In the instant of a slam, there is a great impact on the bottom, which causes stress changes in the local structures where slamming occurs and may even destroy the ship structure. A slam is a short process, but the bow is subjected to impact pressure from the sea water; the instantaneous pressure value is usually very high, which can damage the ship structure and even lead to deformation and structural failure.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_20
As an important kind of optical fiber sensor, the fiber grating has good anti-electromagnetic-interference ability and electrical insulation. The signal is wavelength-encoded and is therefore insensitive to environmental interference caused by fluctuations in light source intensity and by random fluctuations due to micro-bending loss and coupling loss of the optical fiber. The material of the FBG is silicon dioxide, which has strong corrosion resistance, small size, and good plasticity. Multiple gratings of different wavelengths can be connected on the same fiber to achieve simultaneous measurement of multiple physical quantities. Therefore, optical sensors are suitable for harsh environments such as ocean-going ships. The main manifestation of the slamming load is impact pressure on the ship structure. In this paper, based on the study of pressure sensor technology, a general design method for a ship slamming load monitoring sensor based on fiber grating is established. In 1993, Xu et al. developed a fiber-optic pressure sensor using a fiber grating [1]; the peak reflection wavelength of a bare fiber grating under compression was studied, and the wavelength drift was found to be only 0.22 nm under a hydraulic pressure of 70 MPa. As a result, the pressure measurement sensitivity of a bare fiber grating is very low, and it cannot be used directly for pressure measurement. In 2004, Hu et al. sealed a gas piston and a grating in a cylindrical container; the resulting sensor has a sensitivity 3.515 × 10^5 times that of a bare fiber and good linearity [2]. Its disadvantage is that temperature compensation is not considered. To improve the sensitivity, sensitive parts or sensitizing structures can be used.
2 Theory
2.1 Basic Sensing Theory of Fiber Grating
A fiber Bragg grating (FBG) is a fiber in which the refractive index changes periodically along the axis of the fiber core, forming a spatial phase grating. The structure of the fiber grating is shown in Fig. 1. When broadband light passes through the FBG, this periodic structure reflects the spectral components that satisfy the Bragg condition: the reflection spectrum peaks at the Bragg wavelength, while the remaining wavelengths continue to transmit through the FBG.
Fig. 1 Fiber Bragg grating structure
The effective refractive index neff and the grating period Λ determine the center wavelength λB of the reflected light through the following relationship [3]

λB = 2neff Λ
(1)
It is known from Eq. (1) that when the FBG is subject to temperature or strain changes in the external environment, the grating period and the effective refractive index change accordingly, which results in an offset of the center wavelength of the reflected light. Therefore, the temperature and strain of the environment in which the grating is placed can be deduced from the change of the center wavelength of the light reflected by the FBG. When the FBG is used as a sensitive element, changes of temperature and strain will cause the FBG to change [4], so Eq. (1) can be differentiated as

ΔλB = 2neff ΔΛ + 2Δneff Λ
(2)
2.2 Principle of Fiber Bragg Grating Strain Sensing
In the analysis of the FBG stress–strain principle, the strain is divided into axial and tangential components, and the sensitivity of the FBG to axial strain is much greater than that to tangential strain. Therefore, it is generally assumed that there is no tangential strain and that the ambient temperature remains unchanged. The change of the grating wavelength with strain can then be expressed as

ΔλB = (1 − Pe) εx λB
(3)
where εx is the axial strain on the FBG and Pe is the effective elasto-optic coefficient of the optical fiber [5]. From Eq. (3),

Kε = ΔλB / (εx λB) = 1 − Pe. (4)
Further derivation from Eq. (4) gives [6, 7]

ΔλB = (1 − Pe) λB εx = Kε λB εx. (5)
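Numerically, Eq. (5) gives the familiar strain sensitivity of roughly 1.2 pm per microstrain near 1550 nm. The values Pe ≈ 0.22 and λB = 1550 nm below are typical assumed values for silica fiber, not taken from the paper; note that the result matches the ≈1.2 pm/με ratio between the strain and wavelength columns of Table 1.

```python
# Strain sensitivity per Eq. (5): d_lambda = (1 - P_e) * lambda_B * strain.
# P_e ~ 0.22 and lambda_B = 1550 nm are assumed typical values, not from the paper.
P_e = 0.22
lambda_B_pm = 1550e3                       # 1550 nm expressed in pm
K_eps = (1 - P_e) * lambda_B_pm * 1e-6     # wavelength shift in pm per microstrain

print(round(K_eps, 3))                     # ~1.209 pm per micro-strain
```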
According to Hooke's law, the relationship between the strain at the grating and the external axial tension F is

ε = F / (ES), (6)

where E is the Young's modulus and S is the cross-sectional area of the fiber. Thus, the relationship between the change of the FBG wavelength and the tension F is

ΔλB = (1 − Pe) λB F / (ES). (7)
From the above analysis, the change of the FBG wavelength has a linear relationship with the external axial strain of the fiber, provided that radial strain and temperature effects are neglected. Therefore, the external strain can be obtained indirectly by analyzing the FBG center wavelength data.
2.3 Fiber Bragg Grating Temperature Sensing Characteristics
When the FBG is affected only by a change of the ambient temperature, the period of the FBG is changed by the thermal expansion of the fiber, and the effective refractive index of the FBG is changed by the thermo-optic effect of the fiber. Therefore, the center wavelength of the FBG also shifts when the temperature changes. The relationship between the thermal expansion effect and the grating period is

ΔΛ = α · Λ · ΔT
(8)
where α represents the thermal expansion coefficient of the optical fiber. The relationship between the effective refractive index of the grating and the thermo-optic effect is

Δneff = ξ · neff · ΔT
(9)
where ξ represents the thermo-optic coefficient of the optical fiber material. Substituting Eqs. (8) and (9) into Eq. (2) yields [8]

ΔλB = (α + ξ) · λB · ΔT = KT · ΔT
(10)
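Plugging typical silica-fiber values into Eq. (10) gives a feel for the temperature sensitivity. The coefficients α ≈ 0.55 × 10⁻⁶ /°C and ξ ≈ 6.67 × 10⁻⁶ /°C are assumed textbook values, not the paper's calibration.

```python
# Temperature sensitivity per Eq. (10): d_lambda = (alpha + xi) * lambda_B * dT.
alpha = 0.55e-6      # thermal expansion coefficient of silica (assumed typical value)
xi = 6.67e-6         # thermo-optic coefficient (assumed typical value)
lambda_B_nm = 1550.0

K_T_pm_per_C = (alpha + xi) * lambda_B_nm * 1e3   # sensitivity K_T in pm per degC

print(round(K_T_pm_per_C, 2))                     # ~11.19 pm per degC
```

This order of magnitude (≈10 pm/°C at 1550 nm) is why the calibration mentioned in the text matters: per-fiber variations in α and ξ shift KT noticeably.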
It is easy to see from Eq. (10) that there is generally a very good linear relationship between the change of the FBG wavelength and the change of temperature. However, because of the different fabrication processes of optical fibers and the different compositions and proportions of dopants, the thermal expansion coefficient and thermo-optic coefficient differ significantly between different types of optical fibers, and consequently the temperature sensitivity coefficient also differs. Therefore, calibration is required before using different fiber gratings to ensure the accuracy of the data.
3 Design Scheme of Pressure Sensor
Metal diaphragms have long been a key measurement element in pressure sensors, especially in marine vessels and some aircraft. Diaphragms can be used to measure differential, gauge, or absolute pressures directly. In general, a diaphragm is a circular thin plate or film that separates fluids under two different pressures and has some flexibility. When there is a pressure difference between the fluids on the two sides of the diaphragm, the diaphragm deforms; this deformation can be measured and converted into a displacement output, from which parameters such as the pressure of the measured fluid are obtained. As the simplest diaphragm, the flat diaphragm can exhibit good linear characteristics between pressure and displacement, especially for small-deflection diaphragms: within its elastic range, the linearity of the diaphragm is excellent. However,
when the displacement of the center of the diaphragm exceeds a certain value, generally about half the thickness of the diaphragm, the linearity is rapidly lost. The displacement of a planar diaphragm is so small that many pressure sensors using the potentiometer mode cannot achieve the desired sensitivity. Because the FBG is more sensitive than the potentiometer mode, we consider transferring the small displacement of the center of the flat diaphragm to the FBG region and measuring the grating wavelength to sense the pressure difference across the diaphragm. The sensor adopts a differential symmetrical structure. The specific scheme is as follows: firstly, a certain pre-tension is applied to the two gratings. When the sensitive diaphragm is under pressure, it sags and deforms, driving the hard center downward; this stretches the grating on the left side, increasing its wavelength, and contracts the grating on the right side, decreasing its wavelength. Owing to the symmetry of the structure, the stretching of the left grating equals the contraction of the right grating, so the change of the differential center wavelength is twice that of an ordinary single grating. Finally, the change of the force on the diaphragm, and hence the measured fluid pressure, can be inferred from the wavelength change.

3.1 Force Analysis of Plane Diaphragm
When the transverse load on a rigid flat diaphragm is uniformly and symmetrically distributed, the deflection ω satisfies

(∂²/∂r² + (1/r)(∂/∂r)) (∂²ω/∂r² + (1/r)(∂ω/∂r)) = q/D, (11)

where q represents the transverse load and D represents the bending stiffness of the planar diaphragm material. Solving Eq. (11) gives

ω = qr^4/(64D) + C1 r²/4 + C2 ln(r/R) + C3. (12)
For the fixing of the planar diaphragm, this paper uses peripheral clamping. With the periphery clamped, the rotation angle at the fixed edge and at the center of the diaphragm is zero when it deforms. From this, the deflection equation of the circular diaphragm clamped around its edge can be obtained [9]:

ω = q(R² − r²)² / (64D). (13)
The maximum deflection occurs at the center of the flat diaphragm, where r = 0; the maximum deflection is

ωmax = qR^4/(64D). (14)
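Using the diaphragm parameters of Sect. 4, Eq. (14) can be evaluated directly. The plate bending stiffness D = Eh³/(12(1 − ν²)) is the standard small-deflection plate formula, which the paper does not state explicitly; the sketch is illustrative only.

```python
# Maximum center deflection of a clamped circular diaphragm, Eq. (14).
# D = E*h**3 / (12*(1 - nu**2)) is the standard plate bending stiffness
# (an assumption; it is not given in the paper). Parameters from Sect. 4.
E = 1.93e11        # Young's modulus of stainless steel, Pa
nu = 0.31          # Poisson's ratio
h = 1e-3           # diaphragm thickness, m
R = 20e-3          # diaphragm radius (40 mm effective inner diameter), m
q = 4.5e6          # maximum tested pressure, Pa

D = E * h**3 / (12 * (1 - nu**2))
omega_max = q * R**4 / (64 * D)     # Eq. (14)

print(f"{omega_max * 1e3:.3f} mm")  # ~0.63 mm at 4.5 MPa
```

Note that at the full 4.5 MPa load this small-deflection estimate (≈0.63 mm) exceeds half the 1 mm plate thickness, the linearity limit mentioned above, so near the top of the range the formula should be treated as an order-of-magnitude check rather than an exact prediction.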
4 Simulation Experiment Analysis
The finite element analysis software ANSYS Workbench is used to simulate the surface strain distribution of the plane circular diaphragm of the FBG pressure sensor under the measured pressure. The parameters selected in the simulation are as follows: the material of the diaphragm is stainless steel, the modulus of elasticity is 1.93 × 10^11 Pa, the Poisson's ratio is 0.31, the effective inner diameter of the diaphragm is 40 mm, and the thickness of the diaphragm is 1 mm. In the simulation, the plane circular diaphragm is completely restrained by fixing it all around. The simulation steps of the FBG pressure sensor are as follows: first, the physical model of the designed sensor structure is created in the simulation software, the model is meshed automatically, and the constraint conditions are added. Pressure is then applied to the diaphragm and increased gradually from 0 to 4.5 MPa in steps of 0.5 MPa. The pressure sensor is loaded step by step, and after each load stabilizes, the displacements of the fiber Bragg gratings are recorded; the changes of strain and wavelength are calculated, as shown in Table 1.

Table 1 Measured FBG strain and wavelength change with load pressure

Pressure (MPa) | Grating 1 displacement (mm) | Grating 2 displacement (mm) | Strain (με) | Wavelength variation (pm)
0.5 | 0.0145 | −0.0145 | 725 | 870
1 | 0.0290 | −0.0290 | 1450 | 1740
1.5 | 0.0435 | −0.0435 | 2175 | 2610
2 | 0.0581 | −0.0581 | 2905 | 3486
2.5 | 0.0726 | −0.0726 | 3630 | 4356
3 | 0.0871 | −0.0871 | 4355 | 5226
3.5 | 0.1017 | −0.1017 | 5085 | 6102
4 | 0.1162 | −0.1162 | 5810 | 6972
4.5 | 0.1307 | −0.1307 | 6535 | 7842
The axial strain of the grating caused by the external force can be calculated as

εx = ΔL / L, (15)
where ΔL represents the length change of the FBG, and L represents the length of the FBG. In practical engineering calculations, strain is usually expressed in με, where 1 ε = 10^6 με. According to the measured strain values of the two FBGs in the pressure sensor under different pressure loads, shown in Fig. 2, the wavelength change
of the FBG has a linear relationship with the pressure load. The fitted relationship is given in Eq. (16), and the pressure sensitivity coefficient of the FBG pressure sensor is 0.871 pm/kPa.
Fig. 2 Change curve of FBG wavelength with pressure
y = 871.8x − 3
(16)
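The fit of Eq. (16) can be checked against Table 1 with an ordinary least-squares line. Fitting the tabulated (differential) wavelength change gives a slope of ≈1743.6 pm/MPa with intercept ≈ −3, i.e. exactly twice the 871.8 of Eq. (16) — consistent with the differential structure doubling a single grating's shift, suggesting Eq. (16) refers to one grating.

```python
import numpy as np

# Table 1 data: applied pressure and total (differential) wavelength variation
pressure = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5])          # MPa
d_lambda = np.array([870, 1740, 2610, 3486, 4356, 5226, 6102, 6972, 7842])  # pm

slope, intercept = np.polyfit(pressure, d_lambda, 1)   # least-squares line

print(round(slope, 1), round(intercept, 1))   # 1743.6 -3.0 (pm/MPa, pm)
```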
5 Conclusion
In this paper, a fiber Bragg grating (FBG) sensor for the pressure monitoring of ship structures under wave loads is proposed and developed. The sensor consists of a metal circular diaphragm and two FBGs attached to the sensitized structure. The structure of the FBG pressure sensor is introduced, and its working principle is analyzed theoretically; the pressure sensitivity of the sensor in the range of 0–4.5 MPa is found to be 0.871 pm/kPa through simulation with finite element analysis software.

Acknowledgements. This paper is supported by the Natural Youth Science Foundation of China (61501326, 61401310, 61731006). It is also supported by the Tianjin Research Program of Application Foundation and Advanced Technology (16JCYBJC16500).
References
1. Xu MG, Reekie L (1993) Optical in-fibre grating high pressure sensor. Electron Lett 29(4):398–399
2. Hu S, He S, Zhao Q (2004) A novel high sensitivity FBG pressure sensor. Optoelectron Laser 15(4):410–412
3. Majumder M, Gangopadhyay TK, Chakraborty AK et al (2008) Fiber Bragg gratings in structural health monitoring—present status and applications. Sens Actuators A 147(1):150–164
4. Jinlong. Theory, experiment and application of micro-structure fiber grating. Nankai University
5. Wang X, Guo Y, Xiong L et al (2018) High-frequency optical fiber Bragg grating accelerometer. IEEE Sens J 18(12):4954–4960
6. Weihong B, Yanjun Z, Yuefeng Q et al (2008) Optical fiber communication and sensing technology. Electronic Industry Press
7. Antunes P, Varum H, André P (2011) Uniaxial fiber Bragg grating accelerometer system with temperature and cross axis insensitivity. Measurement 44(1):55–59
8. Du Y, Sun B, Li J, Zhang W (2019) Optical fiber sensing and structural health monitoring technology. Springer Science and Business Media LLC
9. Yingchun L (2006) Design and application of sensor principle. National Defense Science and Technology Press, Beijing, pp 102–103
Research on Key Technologies of NoverCart Smart Shopping Cart System

Chengyao Yang1, Gong Chen1, Bo Yang1, Lu Ba2, and Jinlong Liu1(B)

1 School of Electronics and Information Engineering, Harbin Institute of Technology, Harbin, China
[email protected], [email protected], [email protected], [email protected]
2 China Institute of Marine Technology and Economy, Beijing, China
[email protected]
Abstract. NovelCart is a smart shopping cart system that integrates ultra-wideband (UWB) indoor positioning technology, radio frequency identification (RFID) technology, association rule mining algorithms, and collaborative filtering algorithms, and can be installed directly on traditional shopping carts. UWB indoor positioning is used to provide accurate positioning services with a precision of 30 cm; RFID is used to automatically obtain information on purchased goods; the association rule mining and collaborative filtering algorithms are used for precise advertising delivery. NovelCart uses a Kivy-based graphical interface, whose LGPLv3 license allows commercial use, supporting platforms such as Linux, Windows, Android, and iOS. Through these technologies, NovelCart supports commodity navigation, automatic checkout, accurate promotion information delivery, and automatic scanning of purchased goods.

Keywords: UWB · RFID · Association rule mining algorithm · Collaborative filtering algorithm

1 The Program Design and Demonstration
The NovelCart smart shopping cart system is aimed at large-scale integrated supermarkets. It is committed to providing supermarket consumers with a better shopping experience and providing a more accurate and effective advertising platform for supermarkets [1]. The overall system design of NovelCart is shown in Fig. 1.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_21
Fig. 1. Overall system design of NovelCart
NovelCart uses an ARM main processor plus an STM32 coprocessor design to ensure the smooth operation of the system and to meet its high real-time requirements. For sensing, NovelCart is mainly equipped with UHF RFID modules and UWB indoor positioning modules. It should be noted that NovelCart's indoor positioning function must be used together with the UWB positioning anchors. In addition, NovelCart is equipped with a 7-inch 1024 × 600 HD touch screen for consumer interaction.
2 Hardware Design
The NovelCart hardware design consists of two parts: the main body of the NovelCart smart shopping cart system and the UWB positioning anchor.
2.1 The Main Body of NovelCart Hardware Design
The hardware framework of the NovelCart main body system is shown in Fig. 2. In terms of hardware, NovelCart's ARM main controller is mainly responsible for controlling the KLM400 module to identify the goods in the shopping cart in real time, obtaining indoor positioning information by communicating with the coprocessor, raising an alarm when the user exceeds the checkout area and using the onboard HDMI interface to output the user interaction interface to the touch screen. Part of the schematic is shown in Fig. 3. The STM32F103RCT6 coprocessor uses an 8 MHz passive crystal oscillator as the clock input; the frequency is multiplied by an internal phase-locked loop, and the coprocessor operates at 72 MHz. The coprocessor circuit also retains the BOOT0 startup-mode selection interface for program downloading, post-maintenance and debugging. The position of the UWB module antenna affects positioning accuracy. To ensure positioning performance and minimize interference from the other electronic components on NovelCart, the DWM1000 is placed on a separate circuit board connected by cable to the coprocessor circuit.
Research on Key Technologies of NovelCart Smart Shopping . . .
169
Fig. 2. Hardware block diagram of NovelCart
Fig. 3. Part of circuit diagram of ARM main controller
2.2 The Hardware Design of UWB Positioning Anchor
2.2.1 The Principle of UWB Indoor Positioning
NovelCart uses the single-sided two-way ranging (SS-TWR) method to calculate the distance between the shopping cart and a UWB positioning anchor [2]. The principle of the SS-TWR positioning mode is shown in Fig. 4: the UWB module on NovelCart measures the distance to a base station by accurately measuring the round-trip time Tround between sending the ranging request and receiving the reply message. The formula for calculating the distance is shown below:
Fig. 4. SS-TWR positioning
$$ d = \frac{c}{2}\,\left(T_{\mathrm{round}} - T_{\mathrm{reply}}\right) \tag{1} $$

where c = 2.99792 × 10^8 m/s is the speed of light. UWB three-point positioning can then be realized from the height h of the UWB module, the three distances d_1, d_2, d_3 and the three UWB anchor positions (x_1, y_1, z_1), (x_2, y_2, z_2), (x_3, y_3, z_3). NovelCart uses a least-squares three-point localization algorithm. Consider an overdetermined set of linear equations (fewer unknowns than equations), Ax = b. If the matrix A^T A is nonsingular, there is a unique least-squares solution x = (A^T A)^{-1} A^T b. Assuming the UWB module coordinates are (x, y, h), the following three equations can be listed from the known data:

$$ (x_i - x)^2 + (y_i - y)^2 + (z_i - h)^2 = d_i^2, \qquad i = 1, 2, 3 \tag{2} $$

Subtracting the first of these equations from the other two and letting

$$ A = \begin{bmatrix} 2(x_1 - x_2) & 2(y_1 - y_2) \\ 2(x_1 - x_3) & 2(y_1 - y_3) \end{bmatrix}, \qquad x = \begin{bmatrix} x \\ y \end{bmatrix}, $$

$$ b = \begin{bmatrix} d_2^2 - d_1^2 + x_1^2 + y_1^2 + z_1^2 - x_2^2 - y_2^2 - z_2^2 - 2(z_1 - z_2)h \\ d_3^2 - d_1^2 + x_1^2 + y_1^2 + z_1^2 - x_3^2 - y_3^2 - z_3^2 - 2(z_1 - z_3)h \end{bmatrix}, $$

then substituting into x = (A^T A)^{-1} A^T b yields the x- and y-coordinates of the UWB module, thereby realizing the three-point positioning of the UWB.

2.2.2 The UWB Anchor Design
The main framework of the UWB anchor is shown in Fig. 5, and some key schematics are shown in Figs. 6 and 7.
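The ranging and trilateration steps of Sect. 2.2 can be checked numerically with the sketch below; the anchor layout, tag height and test position are made-up values, not NovelCart parameters:

```python
import numpy as np

# Sketch of the positioning chain: SS-TWR distance from Eq. (1), then
# least-squares trilateration via x = (A^T A)^-1 A^T b.
C_LIGHT = 2.99792e8  # speed of light, m/s

def ss_twr_distance(t_round, t_reply):
    # Eq. (1): d = (c/2) * (T_round - T_reply)
    return 0.5 * C_LIGHT * (t_round - t_reply)

def trilaterate(anchors, d, h):
    """anchors: 3 rows of (x, y, z); d: three measured distances; h: tag height."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = anchors
    A = np.array([[2 * (x1 - x2), 2 * (y1 - y2)],
                  [2 * (x1 - x3), 2 * (y1 - y3)]])
    b = np.array([d[1]**2 - d[0]**2 + x1**2 + y1**2 + z1**2
                  - x2**2 - y2**2 - z2**2 - 2 * (z1 - z2) * h,
                  d[2]**2 - d[0]**2 + x1**2 + y1**2 + z1**2
                  - x3**2 - y3**2 - z3**2 - 2 * (z1 - z3) * h])
    return np.linalg.solve(A.T @ A, A.T @ b)  # (A^T A)^-1 A^T b

# Made-up test geometry: three anchors at 3 m height, tag at (4, 7) with h = 1.2 m.
anchors = np.array([[0.0, 0.0, 3.0], [10.0, 0.0, 3.0], [0.0, 10.0, 3.0]])
h = 1.2
true_xy = np.array([4.0, 7.0])
d = np.linalg.norm(anchors - np.array([true_xy[0], true_xy[1], h]), axis=1)
est = trilaterate(anchors, d, h)   # recovers (4.0, 7.0)
```

With exact distances the solver returns the true position; with noisy SS-TWR measurements the same formula gives the least-squares estimate.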
3 Software Design
NovelCart's graphical user interface (GUI) is designed with Kivy, which natively supports the input protocols of multi-touch devices. Its graphics core is built around OpenGL ES 2, so the GPU of the target platform can be fully used for acceleration. The main relationship structure of the software's business-logic threads is shown in Fig. 8.
Fig. 5. UWB base station hardware system frame
Fig. 6. UWB base station power supply and interface circuit diagram
3.1 EPC Solution of RFID Label
The Electronic Product Code (EPC) is a next-generation identification code that can uniquely identify objects in the supply chain. NovelCart identifies items in a shopping cart by receiving the signals of the RFID labels in the cart through the RFID module, decoding the EPC of each label and querying the database by EPC to obtain information on all items in the cart. After the chip receives a single polling instruction, if a label reply that passes CRC verification is received, the KLM400 returns data containing RSSI, PC, EPC and CRC.
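As an illustration of this decoding step, the sketch below parses a poll reply and checks its CRC. The byte layout (1-byte RSSI, 2-byte PC, 12-byte EPC, 2-byte CRC over PC + EPC) and the CRC-16/CCITT variant are assumptions for the example, not the KLM400's documented protocol:

```python
def crc16_ccitt(data: bytes, poly=0x1021, init=0xFFFF) -> int:
    # Generic bit-by-bit CRC-16/CCITT; the KLM400's exact CRC variant may differ.
    crc = init
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            crc = ((crc << 1) ^ poly) & 0xFFFF if crc & 0x8000 else (crc << 1) & 0xFFFF
    return crc

def parse_poll_reply(frame: bytes):
    """Hypothetical reply layout: RSSI (1 B), PC (2 B), EPC (12 B), CRC (2 B)."""
    rssi, pc, epc, crc = frame[0], frame[1:3], frame[3:15], frame[15:17]
    if crc16_ccitt(pc + epc) != int.from_bytes(crc, "big"):
        raise ValueError("CRC check failed")
    return {"rssi": rssi, "pc": pc.hex(), "epc": epc.hex()}

# Build a demo frame with our own CRC and parse it back.
pc, epc = bytes([0x30, 0x00]), bytes(range(12))
frame = bytes([0xC8]) + pc + epc + crc16_ccitt(pc + epc).to_bytes(2, "big")
tag = parse_poll_reply(frame)
```

The decoded EPC string is then the database key for looking up the item's details.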
Fig. 7. UWB base station DWM1000 module schematic
Fig. 8. Logic of GUI interface
4 Algorithm Design
4.1 Association Rule Mining Algorithm Apriori
NovelCart preprocessed the user history purchase data collected by supermarkets and used Apriori [3], an association rule mining algorithm through which
frequent item sets can be obtained efficiently; corresponding thresholds are set according to the data characteristics to recommend highly correlated products to users. Table 1 gives an example of a transaction dataset. Each row corresponds to one transaction, uniquely identified by its TID, and lists the set of products the customer purchased.

Table 1. Transaction records of shopping basket

TID  Items
1    Bread, milk
2    Bread, diaper, beer, eggs
3    Milk, diaper, beer, coke
4    Bread, milk, diaper, beer
5    Bread, milk, diaper, coke
The collection of all items in the shopping basket data is I = {i1, i2, . . . , id}, while the collection of all transactions is T = {t1, t2, . . . , tN}. A collection containing zero or more items is called an itemset; obviously, each transaction ti contains an itemset that is a subset of I. The strength of an association rule is measured by its support and confidence, and the association rule mining task is decomposed into two sub-tasks: frequent itemset generation and strong rule generation. Figure 9 shows the itemset lattice for I = {a, b, c, d, e}. In general, a set of k items can produce 2^k − 1 candidate itemsets after excluding the empty set, and in practice the value of k may be very large. Using the a priori principle on which Apriori rests, the candidate set is pruned by support: when {a, b} is an infrequent set, every superset of it is also infrequent and can be cut off (Fig. 9).
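A minimal sketch of this mining step, run on the Table 1 transactions (the 0.6 support threshold is an illustrative choice, not a value from the paper):

```python
from itertools import combinations

transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer", "eggs"},
    {"milk", "diaper", "beer", "coke"},
    {"bread", "milk", "diaper", "beer"},
    {"bread", "milk", "diaper", "coke"},
]

def apriori(transactions, min_support):
    n = len(transactions)
    support = lambda s: sum(s <= t for t in transactions) / n
    frequent = {}
    # Start from all distinct 1-itemsets.
    k_sets = list({frozenset([i]) for t in transactions for i in t})
    while k_sets:
        k_sets = [s for s in k_sets if support(s) >= min_support]
        for s in k_sets:
            frequent[s] = support(s)
        # Join step: frequent k-itemsets -> (k+1)-candidates; the a priori
        # principle prunes any candidate with an infrequent (k)-subset.
        candidates = {a | b for a in k_sets for b in k_sets if len(a | b) == len(a) + 1}
        k_sets = [c for c in candidates
                  if all(frozenset(s) in frequent for s in combinations(c, len(c) - 1))]
    return frequent

freq = apriori(transactions, 0.6)  # 4 frequent items and 4 frequent pairs
```

On this data, {bread, diaper, beer} is pruned without counting because its subset {bread, beer} is already infrequent, which is exactly the pruning the lattice figure illustrates.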
4.2 Collaborative Filtering Algorithm
The collaborative filtering recommendation algorithm [4] finds users' preferences by mining their historical behavior data, groups users based on their different preferences and recommends products favored by users with similar tastes. We simulated five users' ratings of two items, represented in Table 2; such behavior indicates each user's attitude toward and preference for a product.

Fig. 9. Lattice structure and pruning of infrequent sets

Table 2. Ratings of products by five users

User  Item 1  Item 2
A     3.3     6.5
B     5.8     2.6
C     3.6     6.3
D     3.4     5.8
E     5.2     3.1

The Euclidean coefficients between the five users are shown in Table 3. The smaller the coefficient, the closer the two users are and the more similar their preferences (the reciprocal row is 1/(1 + d)).

Table 3. Euclidean distance evaluation (A, B, C, D, E represent users)

Pair        A,B   A,C   A,D   A,E   B,C   B,D   B,E   C,D   C,E   D,E
Coefficient 4.63  0.36  0.71  3.89  4.30  4.00  0.78  0.54  3.58  3.24
Reciprocal  0.18  0.73  0.59  0.20  0.19  0.20  0.56  0.65  0.22  0.24

In the case of complex data, the Pearson correlation
algorithm is used to calculate the relevance; the Pearson correlation coefficient is a number between −1 and 1, and the closer the coefficient gets to 0, the weaker the correlation is. The results are shown in Table 4, and the final recommendations in Table 5.
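Using the Table 2 ratings, the Euclidean coefficients and reciprocals of Table 3 can be reproduced with a short script (a sketch; the reciprocal form 1/(1 + d) is inferred from the tabulated values):

```python
import math

# Ratings from Table 2.
ratings = {"A": (3.3, 6.5), "B": (5.8, 2.6), "C": (3.6, 6.3),
           "D": (3.4, 5.8), "E": (5.2, 3.1)}

pairs = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("B", "C"),
         ("B", "D"), ("B", "E"), ("C", "D"), ("C", "E"), ("D", "E")]

def euclid(u, v):
    # Straight-line distance between two users' rating vectors.
    return math.dist(ratings[u], ratings[v])

# (distance, 1/(1 + distance)) for each user pair, rounded as in Table 3.
table3 = {p: (round(euclid(*p), 2), round(1 / (1 + euclid(*p)), 2)) for p in pairs}
```

The computed values match Table 3, e.g. the pair (A, C) gives a distance of 0.36 with reciprocal 0.73.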
5 Conclusion
NovelCart is a smart shopping cart system that integrates ultra-wideband (UWB) indoor positioning, radio-frequency identification (RFID), association rule mining and collaborative filtering algorithms, and can be installed directly on traditional shopping carts. Using a Kivy-based GUI, it runs smoothly under Linux and Windows systems.
Table 4. Pearson's correlation coefficient (A, B, C, D, E represent users)

Pair        AB      AC       AD       AE       BC       BD       BE       CD      CE      DE
Similarity  0.9998  −0.8478  −0.8418  −0.9152  −0.8417  −0.8353  −0.9100  0.9990  0.9763  0.9698
Table 5. Recommended products for user C (starred columns are rating × similarity)

                  Similarity  A    A*       B    B*        C    C*         D    D*
User D            0.99899     3.4  3.39656  4.4  4.395544  5.8  5.7941262  2.1  2.097873
User E            0.97627     3.2  3.12407  0    0         4.1  4.0027152  3.7  3.612206
Total                              6.52063       4.395544       9.7968414       5.71008
Total similarity                   1.97526       1.975259       1.9752593       1.975259
Similarity                         3.30115       2.2253         4.9594479       2.8908
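The weighted-rating step behind Table 5 can be sketched as follows; the CD and CE similarities are taken from Table 4, and the zero rating marks the item user E did not rate:

```python
# Similarity-weighted recommendation for user C: each neighbour's rating
# (items A-D) is multiplied by its similarity to C, summed over neighbours
# and divided by the total similarity, as in the rows of Table 5.
neighbours = {
    "D": (0.99899, {"A": 3.4, "B": 4.4, "C": 5.8, "D": 2.1}),
    "E": (0.97627, {"A": 3.2, "B": 0.0, "C": 4.1, "D": 3.7}),
}

total_sim = sum(sim for sim, _ in neighbours.values())
scores = {
    item: sum(sim * r.get(item, 0.0) for sim, r in neighbours.values()) / total_sim
    for item in "ABCD"
}
```

Item C obtains the highest weighted score (about 4.96), matching the last row of Table 5, so it would be recommended first.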
Acknowledgments. This work was supported by the National Natural Science Foundation of China (Grant Nos. 61601145, 61471142, 61571167, 61871157); by the Ministry of Education project 201802092007, a teaching curriculum reform of the virtual experiment simulation project of the digital electronic technology curriculum; by curriculum construction project 2018cxsy02 of Harbin University of Technology, "innovation experiment of electronic information system based on scientific and technological innovation"; and by curriculum construction project 2018cxcy02 of Harbin University of Technology, "theoretical research and technological innovation in the direction of electronic information".
References 1. Arciuolo T, Abuzneid A (2019) Simultaneously shop, bag, and checkout (2SBC-Cart): a smart cart for expedited supermarket shopping. In: 2019 international conference on computational science and computational intelligence (CSCI) 2. Cheng Y, Zhou T (2019) UWB indoor positioning algorithm based on TDOA technology. In: 2019 10th international conference on information technology in medicine and education (ITME), pp 777–782 3. Bodon F (2003) A fast APRIORI implementation. In: Proceedings of the IEEE ICDM workshop on frequent itemset mining implementations 4. Guan W, Chung CY, Lin LJ (2009) Collaborative-filtering content model for recommending items. US Patent 7584171
Human Identification Under Multiple Gait Patterns Based on FMCW Radar and Deep Neural Networks Shiqi Dong1,2 , Weijie Xia1,2(B) , Yi Li1,2 , and Kejia Chen2 1 Key Laboratory of Radar Imaging and Microwave Photonics, Ministry of Education, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, People’s Republic of China 2 College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, People’s Republic of China [email protected]
Abstract. Human identification has long been a crucial and difficult problem for scholars at home and abroad. As a novel identification technology, gait identification is receiving more and more attention and has proven to be feasible. In this paper, the authors propose a gait identification method based on micro-Doppler signatures obtained by a 77 GHz frequency-modulated continuous wave (FMCW) radar. The obtained signal is represented by its time-frequency (T-F) spectrum, and a deep neural network (DNN) is then adopted to process the spectrums for human identification. It is shown that the method can identify humans under three different gait patterns (walking; jogging; and walking with books) with 95% accuracy for 50 people. In addition, the method can identify humans even if the subject is walking under gait patterns not included in the training set.
Keywords: Human identification · Micro-Doppler · DNN · FMCW radar

1 Introduction
In recent years, biometrics have been applied in many ways [1]. Compared with traditional passwords, biometrics generally cannot be stolen, forgotten or copied, and are more secure and easier to use. Gait is a biological characteristic that mainly refers to a person's body shape and walking posture. Compared with other biological features such as fingerprints and the iris, this method does not require the active cooperation of the subject; even if someone wears a mask, gait recognition may still be effective. Therefore, gait recognition has broad application prospects and economic value in access control systems, security monitoring, human-computer interaction, smart homes and medical diagnosis.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_22
Most gait recognition technologies use cameras to analyze and process moving image sequences containing people [2]. With the introduction of micro-Doppler features [3], some scholars noticed that people generate micro-Doppler signatures while walking or performing other activities. Since everyone is unique, gait and its micro-Doppler features should also be unique, which inspired discussion and research on gait recognition based on micro-Doppler features [4,5]. Compared with the widely studied camera-based gait recognition technology, radar is not sensitive to light and can protect people's privacy. Kim first applied radar Doppler features to activity classification, adopting artificial neural networks (ANN), support vector machines (SVM), deep convolutional neural networks (DCNN) and other techniques to gradually improve the classification accuracy [6–8]. In [9], an identification method based on DCNN and radar micro-Doppler features is proposed, which can identify people in non-contact, remote and unlit conditions, with an accuracy of over 90% for small samples. Sun conducted multi-person micro-Doppler feature analysis based on the generalized S-transform [10] and could identify whether 1, 2 or 3 people were walking. In [11], a low-power frequency-modulated continuous wave (FMCW) radar is used to collect micro-Doppler features for human body recognition; during data collection the targets were allowed to walk in a free and autonomous way, and the recognition accuracy is about 80%. In addition, [12] adopted a variety of different gait patterns for identification, with an average accuracy of about 90%. In this paper, we explore the feasibility of gait recognition based on a 77 GHz FMCW radar by using micro-Doppler features under different gait patterns. A convolutional neural network (CNN) [13], widely used in image recognition, is utilized to realize the classification.
What's more, we introduce a recurrent neural network (RNN) to extract time-related features. The raw signal is characterized by its T-F spectrum, and the neural network is then adopted to process the spectrums for human identification. In contrast to [9], FMCW radar is adopted in this paper for future multi-person recognition, and the single gait pattern is extended to three gait patterns common in daily life (walking; jogging; and walking with books). With 50 people and three gait patterns, the identification accuracy of the network finally reached more than 95%. In addition, the method can identify humans even if the subject is walking in other gait patterns not included in the training set.
2 Experiment Setup and Data Processing
2.1 Experiment Setup
The 77 GHz FMCW radar adopted in this paper is the AWR1642 device, a single-chip millimeter-wave radar sensor from Texas Instruments (TI). Figure 1 shows the experiment scene in the corridor. The radar operates in 77 GHz FMCW mode with a 768 MHz bandwidth. Fifty people participated in the data collection: 28 males and 22 females aged between 21 and 25, with average weight/height of 70 kg/170 cm for the men and 46 kg/158 cm for the women. Each person walked 60 times along the radar line of sight (three gait
patterns: walking; jogging; and walking with a book) for 8 s at a time. Participants were required to start about 10 m from the radar.
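As a side note on the radar configuration above, the standard FMCW relation dR = c/(2B) gives the range resolution implied by the 768 MHz sweep (this derived figure is ours, not one quoted in the paper):

```python
# FMCW range resolution for the stated sweep bandwidth: dR = c / (2B).
c = 3e8            # speed of light, m/s
B = 768e6          # sweep bandwidth of the AWR1642 configuration, Hz
range_resolution = c / (2 * B)   # about 0.195 m per range bin
```

Roughly 0.2 m per range bin is fine enough to separate a walking subject's torso and limb returns over the 10 m path.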
Fig. 1. Experiment scenario a walking, b walking with a book, c jogging
2.2 Data Processing
According to the Doppler effect, movement of a target relative to the signal source changes the wave frequency. In addition, if the moving target has rotating or vibrating components, additional frequency components are observed besides the main Doppler shift; this is called the micro-Doppler effect [3]. The motion of the human body relative to the radar produces micro-Doppler signatures that can be clearly observed in the joint time-frequency domain. Therefore, we perform time-frequency (T-F) analysis on the original signal to obtain a spectrum containing micro-Doppler features. In this article, a two-dimensional fast Fourier transform (2-D FFT) is applied to obtain the T-F spectrums, as shown in Fig. 2. Figure 3a–c shows three spectrograms with different gait patterns from one person, while Fig. 3d–f shows three spectrograms with the same gait pattern from three different persons. As the figures show, there are certain differences between the spectrums of different gaits, but little difference between normal walking and walking with a book, which can be attributed to the small contribution of arm movement to the micro-Doppler signature during walking. Moreover, because everyone walks with a different speed and posture, the spectrums of different people under the same gait pattern differ to some extent, which is why our recognition is effective.
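The 2-D FFT step can be sketched on synthetic data as follows; the array sizes and the target's range and Doppler bins are made-up illustration values, not the experiment's parameters:

```python
import numpy as np

# Synthetic FMCW beat data: one point target appears as a beat tone along
# fast time (range) with a per-chirp phase rotation along slow time (Doppler).
n_chirps, n_fast = 64, 128
range_bin, doppler_bin = 20, 10
fast = np.arange(n_fast)
slow = np.arange(n_chirps)
data = np.exp(2j * np.pi * (range_bin * fast[None, :] / n_fast
                            + doppler_bin * slow[:, None] / n_chirps))

range_fft = np.fft.fft(data, axis=1)      # fast-time FFT -> range bins
rd_map = np.fft.fft(range_fft, axis=0)    # slow-time FFT -> Doppler bins
peak = np.unravel_index(np.argmax(np.abs(rd_map)), rd_map.shape)
# The peak lands at (doppler_bin, range_bin).
```

Stacking the Doppler profiles of successive frames over time then yields the T-F spectrograms used as network input.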
3 Deep Neural Networks
A CNN is composed of an input layer, convolution layers, pooling layers, a fully connected layer and an output layer. In view of CNN's inability to model changes over a time series, the RNN was proposed to process time-series data. Considering the continuity of gait in reality, we combine a classification network and a long short-term memory (LSTM) network to extract and
Fig. 2. The operation of 2D-FFT
Fig. 3. Time-frequency spectrums a–c Different gait patterns, d–f different people (walking)
Fig. 4. Architecture of proposed network. a Some blocks of Xception, b architecture of CNN+RNN
merge gait features through different networks, thereby ensuring the accuracy and long-term stability of recognition. The proposed network has two channels: a CNN channel using the Xception model with weights pre-trained on ImageNet, and an independent LSTM channel comprising two LSTM layers. The two channels run in parallel, as shown in Fig. 4b. As an improvement of Inception-V3, Xception mainly replaces the original convolutions with depthwise separable convolutions, which improves the model's effect without increasing the network complexity. As shown in Fig. 4a, the Xception network contains 14 blocks; S Conv stands for separable convolution, Conv for ordinary convolution, B for a batch-normalization layer and R for the ReLU activation function. The input of the entire CNN+RNN network is preprocessed RGB images of size 299×299×3. On the CNN channel, the input image passes directly through the pre-trained Xception model up to the last convolution block and the global pooling layer, finally yielding a 2048-dimensional feature vector. On the RNN channel, the input image is first converted from RGB to grayscale, resized to 23×3887 and passed through the two LSTM layers to obtain another 2048-dimensional feature vector. Since 2048-dimensional features are obtained from both the CNN and the RNN channel, the two outputs are merged, and the output of this merge layer is used for classification.
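The two-channel fusion described above can be sketched at the shape level; the random vectors stand in for the real Xception and LSTM outputs, and the untrained softmax head is illustrative only:

```python
import numpy as np

# Shape-level sketch: a 2048-d CNN feature vector and a 2048-d LSTM feature
# vector are concatenated into a 4096-d vector and classified over 50 people.
rng = np.random.default_rng(0)
n_classes = 50

cnn_features = rng.standard_normal(2048)   # stand-in for the Xception channel
rnn_features = rng.standard_normal(2048)   # stand-in for the two-layer LSTM channel
merged = np.concatenate([cnn_features, rnn_features])

W = rng.standard_normal((n_classes, merged.size)) * 0.01
logits = W @ merged
probs = np.exp(logits - logits.max())
probs /= probs.sum()                       # softmax over the 50 identities
```

In the real network the dense layer and both channels are trained jointly; only the tensor shapes are fixed by the description above.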
4 Analysis and Results
4.1 Comparison with Other Methods
This subsection discusses the differences between this study and conventional studies on person identification based on radar micro-Doppler signatures, to clarify the merits of this article. Table 1 summarizes the differences between the conventional methods and that used in this study. First, as explained in the introduction, this article is on human identification using micro-Doppler signatures, and the accuracy we achieve is superior to that of most of the gait-based identification studies listed in Table 1. Furthermore, the number of persons is considerably larger than in the conventional studies, which is an important merit of our proposed method. Table 2 summarizes the differences between our network and other networks, and the results show that our network outperforms them.
4.2 Effect of Group Size
This subsection discusses the effect of group size when using our method for human identification. The data of 10, 20, 30, 40 and 50 people are used as datasets, each trained five times. The identification accuracy is averaged, and the results are shown in Fig. 5. When the number of people is 10, the average identification accuracy is 98.555%; when the number
Table 1. Comparison with other methods

Method      Gait type  Number of persons  Accuracy (%)
Yang        Walk/run   15                 94.4/95.2
Cao         Walk       4/10               97.1/85.6
Chen        Walk       2                  99.9
Our method  Walk+run   50                 95.99

Table 2. Comparison with other networks

Network      Gait type                       Number of persons  Accuracy (%)
AlexNet      Walk (with/without book) + jog  50                 85.23
VGG16        Walk (with/without book) + jog  50                 92.31
Our network  Walk (with/without book) + jog  50                 95.99
of people is 50, the average recognition accuracy is 95.985%, a decrease of only about 3%. Thus, when the number of people increases from 10 to 50, the accuracy does not decrease significantly. Based on this result, we infer that even if our dataset were expanded to 100 or even 200 people, the identification accuracy could still be maintained above 90%.
4.3 Extension to Other Gait Patterns
In this subsection, we try to extend the identification method to gait patterns not included in the training set. We collected data five times each for a person walking with an umbrella in one hand and for a person walking with a bag, as shown in Fig. 6. The obtained time-frequency spectrums of the extended gait patterns are shown in Fig. 7b, c. From Fig. 7, we can see that the time-frequency spectrums of the different gait patterns are very similar, with only slight differences. Since different carried objects are basically reflected only in the arm component, which contributes little to the overall signature, extending the identification method is feasible. The obtained time-frequency spectrums are used as a validation set for the previously trained 10-person network, and the recognition accuracies obtained are 95.79% and 99.01%, respectively. This result shows that the identification method in this paper can be successfully applied to other extended gait patterns (different carried objects in daily life) while maintaining high accuracy.
5 Conclusion
This paper presents a method of gait recognition based on radar. In this method, we apply a DNN to the micro-Doppler spectrums obtained by radar, which can
Fig. 5. Accuracy of different group size
Fig. 6. Data collections of extended gait patterns. a Walking with a bag, b Walking with an umbrella
Fig. 7. Time-frequency spectrums. a–c Spectrums of different gait patterns (walking; walking with a bag; walking with an umbrella)
identify humans successfully. When we identify one person (with a random gait pattern) out of 50, we can do it with more than 95% accuracy. In our opinion, this shows the potential of DNN applied to radar micro-Doppler signals for gait recognition. Although our results are satisfactory, the method still has some limitations for wide application. First, because the shape of the micro-Doppler signal is the key to classification, the classification accuracy may decline when two people's spectrums are similar or the movement is irregular. Second, the conditions considered in the experiment, where participants walked along the line-of-sight path of the radar, are simple compared with actual situations. In future, we will further study non-line-of-sight solutions to facilitate the practical application of our methods. Acknowledgments. This work was supported by the Fundamental Research Funds for the Central Universities (Grant No. 3082019NS2019026).
References 1. Menotti D et al (2015) Deep representations for iris, face, and fingerprint spoofing detection. IEEE Trans Inf Forensics Secur 10(4):864–879 2. Wu Z, Huang Y, Wang L, Wang X, Tan T (2016) A comprehensive study on crossview gait based human identification with deep cnns. IEEE TPAMI 39(2):209–226 3. Chen VC, Li F, Ho SS, Wechsler H (2006) Micro-doppler effect in radar: phenomenon, model, and simulation study. IEEE Trans Aerosp Electron Syst 42(1):2– 21 4. Gurbuz SZ, Amin MG (2019) Radar-based human-motion recognition with deep learning: promising applications for indoor monitoring. IEEE Signal Process Mag 36(4):16–28 July 5. Li X, He Y, Jing X (2019) A survey of deep learning-based human activity recognition in radar. Remote Sens 11(9):1068 6. Kim Y, Ling H (2008) Human activity classification based on micro-Doppler signatures using an artificial neural network. In: 2008 IEEE antennas and propagation society international symposium, pp 1–4 7. Kim Y, Ling H (2009) Human activity classification based on micro-doppler signatures using a support vector machine. IEEE Trans Geosci Remote Sens 47(5):1328– 1337 8. Kim Y, Ling H (2016) Human detection and activity classification based on microdoppler signatures using deep convolutional neural networks. IEEE Geosci Remote Sens Lett 13(1):8–12 9. Cao P, Xia W, Ye M, Zhang J, Zhou J (2018) Radar-ID: human identification based on radar micro-doppler signatures using deep convolutional neural networks. IET Radar Sonar Navig 12(7):729–734 10. Zhongsheng S, Jun W, Yaotian Z (2015) Multiple walking human recognition based on radar micro-doppler signatures. Sci China Inf Sci 58(12):1869–1919 11. Vandersmissen B, Knudde N, Jalalvand A, Couckuyt I, Bourdoux A, De Neve W, Dhaene T (2018) Indoor person identification using a low-power FMCW radar. IEEE Trans Geosci Remote Sens 56(7):3941–3952
12. Yang Y, Hou C, Lang Y, Yue G, He Y, Xiang W (2019) Person identification using micro-doppler signatures of human motions and UWB radar. IEEE Microw Wirel Compon Lett 29:366–368 13. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. In: Advances in neural information processing systems (NIPS), vol 25, pp 1097–1105. Curran Associates Inc
Design and Application of a High-Speed Demodulator Supporting VCM Mode
Wang Huai1(B), Li Fan2, and Han Zhuo1
1 Space Star Technology Co., Ltd, Beijing 100086, China [email protected]
2 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100094, China
Abstract. With the rapid development of earth observation technology, payload resolution is getting higher and higher, and the amount of data transmitted by remote sensing satellites is growing ever larger. To resolve the contradiction between the volume of remote sensing data acquired on board and the transmission capability of the satellite-to-ground links, the use of new transmission schemes, such as high-order modulation for high-speed transmission and variable code modulation (VCM), to improve the efficiency of the satellite-to-ground links within limited frequency-band resources has become the development trend of satellite-to-ground data transmission for the new generation of remote sensing satellites. In this paper, a high-rate digital demodulator supporting VCM mode is designed, and the overall hardware design and software design, as well as the key VCM design, are introduced. Finally, in combination with the ground-system engineering of the GF-7 satellite, the function and performance of the high-speed demodulator supporting VCM are verified, and its engineering application is introduced. Keywords: High speed · Demodulation · VCM · Satellite
1 Introduction
With the rapid development of remote sensing satellite technology and the booming demand for civilian and military remote sensing applications, the accuracy of remote sensing satellite payloads is constantly improving [1], which leads to a large increase in the amount of satellite-to-ground data. Traditional low-rate data transmission seriously limits the performance of the satellite [2–4], so satellite data transmission systems are developing toward higher rates. At the same time, traditional satellite data transmission systems are often designed for the worst channel conditions in order to guarantee link availability. When the channel conditions become better, the data can still only be transmitted at a fixed rate,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_24
which greatly wastes valuable resources. How to transmit a large amount of data in a limited time with the least energy consumption has become an urgent problem in the field of satellite data transmission engineering. Variable code modulation (VCM) technology [5–7] can change the coding mode, modulation mode and information rate; it dynamically adapts the transmission parameters to the channel conditions and significantly improves transmission efficiency while ensuring link reliability [8, 9]. According to the above requirements, and in combination with the ground-system engineering of the GF-7 satellite, a high-speed digital demodulator supporting VCM mode is designed to improve the transmission efficiency and realize an integrated satellite-ground design. GF-7 is the first Chinese high-speed remote sensing satellite to adopt VCM technology.
2 Overall Design
The high-speed digital demodulator supporting VCM mode is composed of a VPX-standard 4U chassis, a backplane, an industrial computer mainboard and a high-speed signal processing board. Together with the high-speed digital signal-processing firmware and the host monitoring and control software, it completes high-rate demodulation, channel decoding and other functions. Several bus types, such as SPI, gigabit Ethernet and PCIe, connect the mainboard and the backplane to carry control information, monitoring information and high-speed data. High-speed data can also be exchanged between multiple boards through high-speed serial interfaces (e.g., PCIe, RocketIO, Aurora, Serial RapidIO), which support 5 Gbps per lane. The main processing board is implemented in a standard 3U VPX architecture. Its core is a Xilinx Virtex-7 (V7) series FPGA, which supports large-scale digital signal processing operations. It has a PCIe 3.0 upstream interface for high-speed data transfer with the host computer, three groups of large-capacity DDR data buffers, and an FMC_HPC interface that can connect high-speed FMC boards and supports 80 LVDS pairs and 8 high-speed serial transceivers. The hardware functions of the main processing board are shown in Fig. 1: the FPGA demodulates the digital signal, and the DSP monitors the whole board. This paper focuses on the design of the FPGA and DSP modules of the main processing board.
2.1 FPGA Module Design
The FPGA mainly completes clock recovery, carrier recovery, blind equalization, decoding and other main functions. The logical relationship of the FPGA internal signal processing modules is shown in Fig. 2. This paper focuses on the design of the all-digital parallel demodulation module and the blind equalization module.
W. Huai et al.
Fig. 1 Functional block diagram of hardware implementation of main processing board
(Fig. 2 blocks: ADC, BPF + AGC, timing recovery, carrier recovery, blind equalizer, frame sync, Viterbi/LDPC/RS decoders, VPX interface)
Fig. 2 Logic diagram of the FPGA on-chip signal processing module
(Fig. 3 blocks: AD data, serial-to-parallel conversion, digital zero-IF mixing with NCO, low-pass filtering, down sampling, interpolation, matched filtering, timing-error detection and loop filter for symbol synchronization, phase correction with NCO, phase-error detection and loop filter for carrier recovery, 2-sample-to-1 extraction, I/Q baseband data outputs)
Fig. 3 Demodulating structure diagram
2.1.1 Design of the All-Digital Parallel Demodulation Module
The high-speed parallel demodulation structure is shown in Fig. 3. It consists of a digital zero-IF module, a down-sampling module, a symbol synchronization module and a carrier recovery module [10–12]. The main functions of each module are as follows: (1) Digital zero-IF module: after mixing and low-pass filtering the input signal, it outputs I and Q baseband signals that still carry timing error, frequency offset and phase error; (2) Down-sampling module: this module has two working modes, pass-through and down-sampling. When the input signal rate is high, the pass-through mode is used; when the input signal rate is low (e.g., less than 10 Msps), the down-sampling mode is
Design and Application of a High-Speed Demodulator …
adopted, and a CIC + ISOP cascade is used to reduce the sampling rate of the low-rate signal, which is convenient for back-end processing; (3) Symbol synchronization module: through a timing-error algorithm and loop design, the input baseband data at a fixed sampling frequency are interpolated to baseband data at twice the symbol rate. A non-data-aided timing-error algorithm based on maximum-likelihood estimation (the Gardner timing-error algorithm) is adopted to ensure that each interpolated symbol is presented as a vertex point and a midpoint, which is convenient for the back end. The matched-filtering operation completes the waveform matching of the shaped symbols and further filters out the out-of-band noise. After the timing error is corrected, the module outputs two-way symbol data (I and Q) at twice the symbol rate, still carrying frequency offset and phase error; (4) Carrier recovery module: a 2-out-of-1 extraction reduces the baseband data at twice the symbol rate to symbol data at the symbol rate, still with frequency offset and phase error; a second-order loop with a decision-feedback phase-detection algorithm based on maximum-likelihood estimation (the Costas phase-detection algorithm) converts the frequency offset and phase error into changes of the phase control word of the NCO module, and the CORDIC algorithm then corrects the symbol data, finally yielding correct I- and Q-channel symbol data.
2.1.2 Design of the All-Digital Blind Equalization Module
Equalization can effectively eliminate inter-symbol interference and improve the data transmission rate. The structure of the equalizer in the high-speed demodulator after it is connected to the channel is shown in Fig. 4. The input signal
x(t) passes through the equivalent channel and then through the equalizer, yielding the output y(t).
Fig. 4 Channel structure after adding equalizer
If the system formed by cascading the equivalent channel and the equalizer is regarded as a new generalized channel, its impulse response is

$$h'(t) = h(t) * h_E(t)$$

where $h_E(t)$ is the impulse response of the equalizer. Suppose the Fourier transform of $h'(t)$ is $H'(\omega)$. The purpose of the equalizer is to make $H'(\omega)$ meet the distortion-free transmission condition

$$\sum_i H'\!\left(\omega + \frac{2\pi i}{T_s}\right) = T_s, \quad |\omega| \le \frac{\pi}{T_s}$$

That is, the transfer function of the equalizer used to compensate for the channel distortion shall be the reciprocal of the channel transfer function:

$$H_E(\omega) = \frac{1}{H(\omega)} = \frac{1}{|H(\omega)|}\, e^{-j\varphi(\omega)}$$
Equalizers can be divided into non-blind and blind equalizers. Because the satellite channel characteristics are unknown in advance and the channel response is time-varying, an adaptive blind equalizer is generally used in a high-speed demodulator. Coefficient-updating algorithms for blind equalizers include LMS (least mean square), RLS (recursive least squares) and CMA (constant modulus algorithm); among them, the LMS algorithm is the simplest and least complex. The RLS algorithm converges faster than LMS, but at the cost of high complexity. The CMA algorithm is mainly used with QAM modulation. One disadvantage of CMA is that it adjusts the coefficients using only the (constant) modulus of the signal, so CMA suffers from a phase-rotation problem for 8PSK. The improved RC-CMA algorithm addresses this phase rotation, but its complexity is too high and it loses the good convergence characteristics of CMA. Therefore, considering the performance of the various algorithms, the difficulty of FPGA implementation, and the demodulation-performance improvement after implementation, we choose a dual-mode method based on the CMA algorithm and the complex LMS algorithm to achieve equalization [13, 14]. The structure of the LMS algorithm with an FIR transversal filter is shown in Fig. 5.
Fig. 5 Adaptive FIR filter structure
LMS is based on the minimum mean square error criterion. From the steepest-descent method, the coefficient-updating formulas can be derived as

$$y(n) = \mathbf{w}^T(n)\,\mathbf{x}(n)$$

$$\mathbf{w}(n+1) = \mathbf{w}(n) + 2\mu\, e(n)\,\mathbf{x}(n)$$

where $\mathbf{w}(n) = [w_0(n), w_1(n), \ldots, w_N(n)]^T$ is the tap-weight vector of the adaptive filter at time $n$, $\mathbf{x}(n) = [x_0(n), x_1(n), \ldots, x_N(n)]^T$ is the input signal vector at time $n$, $d(n)$ is the reference signal, $e(n) = d(n) - y(n)$ is the error signal, and $\mu$ is the iteration step size. The convergence condition of the LMS algorithm is $0 \le \mu \le 1/\lambda_{max}$, where $\lambda_{max}$ is the maximum eigenvalue of the autocorrelation matrix of the input signal. The improved NLMS algorithm normalizes the convergence factor to ensure stable convergence of the adaptive equalization algorithm. The normalized convergence factor is

$$\mu' = \frac{\mu}{\sigma_x^2}$$
where $\sigma_x^2$ is the variance of the input signal $x(n)$, calculated as

$$\hat{\sigma}_x^2 = \sum_{i=0}^{M} x^2(n-i) = \mathbf{x}^T(n)\,\mathbf{x}(n)$$

where $\hat{\sigma}_x^2$ represents the estimate of the signal variance at time $n$. For a stationary random input signal $x(n)$, $\hat{\sigma}_x^2$ is an unbiased estimator of $\sigma_x^2$. Substituting the normalized convergence factor into the LMS coefficient-updating formula gives

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{2\mu\, e(n)\,\mathbf{x}(n)}{\mathbf{x}^T(n)\,\mathbf{x}(n)}$$

To prevent the denominator of this fraction from being zero, a small positive constant $c$ is usually added to it. The iterative formula of the NLMS algorithm then becomes

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \frac{2\mu\, e(n)\,\mathbf{x}(n)}{c + \mathbf{x}^T(n)\,\mathbf{x}(n)}$$

In this way, stable convergence is ensured as long as the convergence condition is guaranteed: 0 …

… $\gamma_m$, where $\gamma_m = |h_m|^2/\sigma_m^2$. At the receiving end, the signal-to-interference-plus-noise ratio (SINR) of the signal received by user $i$ is

$$\mathrm{SINR}_i = \frac{p_i |h_i|^2}{\sum_{j=i+1}^{m} p_j |h_i|^2 + \sigma_i^2} = \frac{p_i \gamma_i}{\sum_{j=i+1}^{m} p_j \gamma_i + 1} \tag{3}$$
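The NLMS update above can be sketched as follows (a self-contained illustration on synthetic data; the channel taps, step size and tap count are hypothetical choices, not values from the paper):

```python
import numpy as np

def nlms_equalize(x, d, num_taps=8, mu=0.4, c=1e-6):
    """NLMS adaptive filter: w(n+1) = w(n) + 2*mu*e(n)*x(n) / (c + x^T x)."""
    w = np.zeros(num_taps)
    e = np.zeros(len(x))
    for n in range(num_taps - 1, len(x)):
        xn = x[n - num_taps + 1:n + 1][::-1]   # tap-input vector x(n)
        y_n = w @ xn                           # filter output y(n) = w^T x(n)
        e[n] = d[n] - y_n                      # error against reference d(n)
        w = w + 2 * mu * e[n] * xn / (c + xn @ xn)
    return w, e

# Toy demo: remove the ISI of a known 2-tap channel, using the clean
# symbol stream as the training reference.
rng = np.random.default_rng(0)
s = rng.choice([-1.0, 1.0], size=5000)         # BPSK-like symbols
x = np.convolve(s, [1.0, 0.35])[:len(s)]       # channel introduces ISI
w, e = nlms_equalize(x, s)
print(np.mean(e[-500:] ** 2))                  # residual MSE after convergence
```

With 2μ = 0.8 the effective step lies inside the usual NLMS stability range (0, 2), and the residual error settles to a small value.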
According to Shannon's formula, the capacity of user $i$ on the carrier can be expressed as

$$C_i = B_T \log_2(1 + \mathrm{SINR}_i) = B_T \log_2\!\left(1 + \frac{p_i \gamma_i}{\sum_{j=i+1}^{m} p_j \gamma_i + 1}\right) \tag{4}$$
J. Cheng et al.
Fig. 1 Schematic of a multi-user downlink NOMA system
In order to achieve fair scheduling between users, the weighted sum rate is used as the criterion for the power allocation [23]; the total channel capacity on the carrier can then be expressed as

$$C_{sum}^{*} = \sum_{i=1}^{m} \frac{1}{\log_2(1 + \gamma_i^2)}\, C_i \tag{5}$$
It can be seen from Eq. (5) that the system throughput and the user power allocation are closely related. Therefore, to improve the throughput performance of the system, power must be allocated to the users reasonably. The above power allocation problem can be written as the following optimization problem

$$C_{sum} = \max_{\mathbf{p}} C_{sum}^{*} = \max_{\mathbf{p}} B_T \sum_{i=1}^{m} \frac{1}{\log_2(1 + \gamma_i^2)} \log_2(1 + \mathrm{SINR}_i) \tag{6}$$
Subject to:

$$\mathrm{C1:}\ \sum_{i=1}^{m} p_i \le P_T$$

$$\mathrm{C2:}\ B_T \log_2\!\left(1 + \frac{p_i \gamma_i}{\sum_{j=i+1}^{m} p_j \gamma_i + 1}\right) \ge R_i, \quad \forall i = 2, 3, \ldots, m$$

$$\mathrm{C3:}\ p_i |h_{i-1}|^2 - \sum_{j=1}^{i-1} p_j |h_{i-1}|^2 \ge P_{thr}^{*}, \quad \forall i = 2, 3, \ldots, m$$

where C1 is the total power constraint; C2 guarantees each user's minimum rate requirement, with $R_i$ the minimum rate of user $i$; C3 is the condition required for correct SIC demodulation, with $P_{thr}^{*}$ the power-difference threshold [24].
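As a concrete illustration, the weighted sum-rate objective and the constraints C1–C3 can be evaluated for a candidate power vector as sketched below (function names and the numeric values are illustrative, not from the paper; the SINR follows the form reconstructed above, with all interference passing through user $i$'s channel):

```python
import numpy as np

B_T = 180e3          # sub-carrier bandwidth in Hz (Table 1)

def sinr(p, gamma, i):
    """SINR of user i: interference from users j > i through user i's channel."""
    interference = sum(p[j] * gamma[i] for j in range(i + 1, len(p)))
    return p[i] * gamma[i] / (interference + 1.0)

def weighted_sum_rate(p, gamma):
    """Weighted sum rate: per-user capacities weighted by 1/log2(1+gamma_i^2)."""
    return sum(B_T * np.log2(1.0 + sinr(p, gamma, i)) / np.log2(1.0 + gamma[i] ** 2)
               for i in range(len(p)))

def feasible(p, gamma, P_T, R_min, p_thr, sigma2=1.0):
    """Check C1 (total power), C2 (minimum rates) and C3 (SIC power gap)."""
    if sum(p) > P_T:                                           # C1
        return False
    for i in range(1, len(p)):
        if B_T * np.log2(1.0 + sinr(p, gamma, i)) < R_min:     # C2
            return False
        if (p[i] - sum(p[:i])) * gamma[i - 1] * sigma2 < p_thr:  # C3
            return False
    return True

# Two-user example with illustrative linear-scale values.
p, gamma = [5.0, 35.0], [1000.0, 100.0]
print(feasible(p, gamma, P_T=46.0, R_min=500e3, p_thr=10.0))   # True
print(weighted_sum_rate(p, gamma))
```

Such a feasibility check and objective evaluation are exactly the two ingredients the complex search of the next section needs.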
Power Allocation Based on Complex Shape Method in NOMA System
Since $\gamma_m = |h_m|^2/\sigma_m^2$, C3 can be transformed into

$$p_i \gamma_{i-1} \sigma^2 - \sum_{j=1}^{i-1} p_j \gamma_{i-1} \sigma^2 \ge P_{thr}, \quad i = 2, 3, \ldots, m \tag{7}$$
Therefore, Eq. (6) can be transformed into

$$C_{sum} = \max_{\mathbf{p}} B_T \sum_{i=1}^{m} \frac{1}{\log_2(1 + \gamma_i^2)} \log_2(1 + \mathrm{SINR}_i) \tag{8}$$

Subject to:

$$\mathrm{C1:}\ \sum_{i=1}^{m} p_i \le P_T$$

$$\mathrm{C2:}\ B_T \log_2\!\left(1 + \frac{p_i \gamma_i}{\sum_{j=i+1}^{m} p_j \gamma_i + 1}\right) \ge R_i, \quad \forall i = 2, 3, \ldots, m$$

$$\mathrm{C3:}\ p_i \gamma_{i-1} \sigma^2 - \sum_{j=1}^{i-1} p_j \gamma_{i-1} \sigma^2 \ge P_{thr}, \quad i = 2, 3, \ldots, m$$
In summary, Eq. (8) is a constrained convex optimization model. Since multiplexing many users on the same frequency band leads to serious error propagation and higher user delay [25], the model in practice is a low-dimensional constrained optimization problem.
3 Sub-optimal Power Allocation Algorithm
The CSS method is a direct search algorithm. Its basic idea is to select K design points in the feasible region of the N-dimensional space as the vertices of the initial complex (polyhedron) [26]. It then compares the objective function values of the vertices of the complex, takes the vertex with the worst objective function value as the bad point, and takes the centroid of the remaining points as the centre of the mapping; the mapping (reflection) point of the bad point is then found. In general, the objective function value of the mapping point is better than that of the bad point. A new complex with K vertices is formed by replacing the bad point with the mapping point while keeping the other vertices. Through repeated iterations, bad points are continuously replaced by better points in the feasible region, so the complex keeps moving and contracting towards the optimum until every vertex approaches the centroid and the iteration accuracy requirement is met. In this paper, for a NOMA user group with m users, the initial values are selected so that the constraints C1, C2 and C3 are satisfied. In addition, the initial number of
vertices k satisfies m + 1 ≤ k ≤ 2m. Empirically, if m is large, the lower bound is taken; if m is small, the upper bound is taken. The advantage of this choice is that it reduces computational redundancy. The best point $p_L$, the worst point $p_H$ and the second-worst point $p_{SH}$ in the algorithm are defined as

$$p_L = \arg\min\{-C_{sum}^{*}(p_i)\}, \quad i = 1, 2, \ldots, k \tag{9}$$

$$p_H = \arg\max\{-C_{sum}^{*}(p_i)\}, \quad i = 1, 2, \ldots, k \tag{10}$$

$$p_{SH} = \arg\max\{-C_{sum}^{*}(p_j)\}, \quad j = 1, 2, \ldots, k,\ j \ne H \tag{11}$$

The mapping point is $p_R = p_c + \alpha(p_c - p_H)$, where the mapping coefficient $\alpha > 1$ is generally taken as 1.3. The iteration terminates when the root-mean-square difference between the objective value of each point and that of the best point $p_L$ satisfies

$$\left\{\frac{1}{k}\sum_{j=1}^{k}\left[C_{sum}^{*}(p_j) - C_{sum}^{*}(p_L)\right]^2\right\}^{1/2} \le \varepsilon \tag{12}$$
Among them, ε is the error accuracy. In summary, the optimization steps of the complex search method are as follows: (1) Give an initial group of power vectors $p_0$ in the feasible region to construct the initial complex, $p_0 = \{p_1, p_2, p_3, \ldots, p_m\}$, where the values of each user's power across the k vertices form a vector $p_i = (p_{i1}, p_{i2}, \ldots, p_{ik})^T$. (2) Substitute each vertex's power values into $C_{sum}^{*}$ to find the best point $p_L$, the worst point $p_H$ and the second-worst point $p_{SH}$. (3) Calculate the centroid of all points except the worst point $p_H$; check whether it is within the feasible region, and if so, go to the next step; otherwise, reselect the initial values and construct the initial complex again. (4) Calculate the reflection point $p_R$ and check whether it is within the feasible region. If yes, go to the next step; otherwise, halve the mapping coefficient and recalculate the reflection point $p_R$. (5) Calculate the objective function value of the reflection point. If $p_R$ is better than the worst point, i.e., $-C_{sum}^{*}(p_R) < -C_{sum}^{*}(p_H)$, replace $p_H$ by $p_R$ to reconstruct the complex and check whether the iteration termination condition is satisfied; otherwise, go to the next step. (6) Replace the worst point $p_H$ with the second-worst point $p_{SH}$, perform the above optimization process, and check whether the termination condition is satisfied; if so, terminate the iteration; otherwise, reconstruct the complex and continue searching.
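A minimal sketch of the complex search loop described above, applied to a toy constrained maximization problem (all names and the test function are illustrative; the real objective would be the weighted sum rate under C1–C3):

```python
import numpy as np

def complex_search(f, feasible, vertices, alpha=1.3, eps=1e-6, max_iter=2000):
    """Box complex search maximizing f: reflect the worst vertex through the
    centroid of the others, halving the mapping coefficient until the
    reflection is feasible and better, until the RMS spread is below eps."""
    P = [np.asarray(v, dtype=float) for v in vertices]
    for _ in range(max_iter):
        vals = [f(p) for p in P]
        H, L = int(np.argmin(vals)), int(np.argmax(vals))   # worst / best
        # termination: RMS spread of objective values around the best point
        if np.sqrt(np.mean([(v - vals[L]) ** 2 for v in vals])) <= eps:
            break
        pc = np.mean([p for i, p in enumerate(P) if i != H], axis=0)
        a = alpha
        pR = pc + a * (pc - P[H])                           # mapping point
        while not feasible(pR) or f(pR) <= vals[H]:
            a *= 0.5                                        # halve coefficient
            if a < 1e-9:
                pR = pc                                     # fall back to centroid
                break
            pR = pc + a * (pc - P[H])
        P[H] = pR
    return max(P, key=f)

# Toy problem: maximize -(x-1)^2 - (y-2)^2 with x, y >= 0 and x + y <= 10.
best = complex_search(
    f=lambda p: -(p[0] - 1.0) ** 2 - (p[1] - 2.0) ** 2,
    feasible=lambda p: p.min() >= 0.0 and p.sum() <= 10.0,
    vertices=[[0.5, 0.5], [4.0, 1.0], [1.0, 5.0], [3.0, 3.0]])
print(best)   # converges near the constrained optimum (1, 2)
```

Only objective evaluations and a feasibility test are needed, which is why the method suits the constrained problem of Eq. (8) without derivatives.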
4 Simulation and Performance Analysis 4.1 Simulation Conditions This section mainly simulates the NOMA downlink. Before that, it is assumed that the user grouping in the NOMA system has been completed, and it is simulated for one of
the user groups in the NOMA system. For the considered system model, the parameter values provided in Table 1 are adopted unless specified otherwise. Monte Carlo analysis is then used to derive the system performance of OMA and NOMA, shown in Figs. 2 and 3. Finally, the performance of the CSS scheme is compared with the FSPA, FPA and FTPA algorithms in Figs. 4 and 5.

Table 1 Simulation parameters in NOMA systems

Parameter | Value
Sub-carrier bandwidth B/kHz | 180
Total downlink transmission power P_T/dBm | 46
Channel difference threshold P_thr/dBm | 10
Channel model | Rayleigh fading channel
Delay spread/µs | 5
Maximum Doppler shift/Hz | 30
Channel estimation | Ideal
Minimum user rate R_i/kbps | 500
Noise power spectral density σ²/(dBm/Hz) | −174
(Figure: user rates of UE1 and UE2 and system sum rate, on the order of 10^8 bps, under NOMA (NA) and OMA (OA), plotted against the SNR of UE2 in dB)
Fig. 2 Change of the user rates and sum rates of NOMA systems and OMA systems with γ 2 when γ 1 = 40 dB
4.2 System Performance Comparison Figures 2 and 3 compare the performance of the NOMA and OMA systems with two users. To simulate the possible situations of UE2 in actual communication, the channel quality
(Figure: system sum rate (bps) of NOMA (NA) and OMA (OA) versus the number of users, 3 to 8)
Fig. 3 Change of the sum rates of NOMA systems and OMA systems with the number of users
(Figure: weighted sum rate (bps) of the FSPA, CSS, FTPA and FPA algorithms)
Fig. 4 Weighted sum rate of the four algorithms as a function of γ2
is good when the distance to the BS is small, and γ2 is large; the channel quality is poor when the distance from the BS is large, and γ2 is small. In Fig. 2, the signal-to-interference-plus-noise ratio of UE1 remains unchanged at γ1 = 40 dB, and that of UE2 increases from 5 dB to 40 dB. When γ2 = 5 dB, there are users with large differences in channel quality in the system; that is, UE1 can be regarded as the central user and UE2 as the edge user. The following conclusions can be drawn from Fig. 2.
(Figure: weighted sum rate (bps) of the FSPA, CSS, FTPA and FPA algorithms)
Fig. 5 Weighted sum rate of the four algorithms as a function of the number of users
(1) Compared with the OMA system, the NOMA system improves performance under any channel-quality conditions, in both single-user rate and system sum rate; the total system rate can be increased by about 20%, and for edge users in particular the throughput can even be increased by about 200%. (2) OMA adopts the water-filling power allocation algorithm: users with good channel quality are allocated more power, users with poor channel quality are allocated less, and users are not scheduled by weighted sum rate. Therefore, the OMA system is more sensitive to the SINR. (3) In the NOMA system, the edge-user rate is very close to the central-user rate and remains stable; the NOMA system is therefore fairer. In Fig. 3, as the number of users in the system increases, the NOMA sum rate under the proposed algorithm remains superior to that of the OMA system. Moreover, when fewer users are multiplexed in the same frequency band, the performance of NOMA under this algorithm is improved significantly. As the number of multiplexed users increases, the error accuracy ε is relaxed to bound the computation delay, so the performance decreases slightly. It can be seen that this algorithm achieves good results when the number of multiplexed users is within 5.
When γ2 is small, that is, when there are edge-cell users in the system, the solution obtained by the proposed algorithm is closer to the optimal solution than those of the FPA and FTPA algorithms. Because the FTPA and FPA algorithms do not fully consider the impact of edge users, their sum-rate performance is not as good as that of the proposed algorithm. In Fig. 5, as the number of users in the NOMA system increases, the weighted sum rate of all four algorithms increases. However, because the time-frequency resources are fixed, the increase in the number of users pushes the system rate close to the system capacity limit. When the number of users multiplexed on the same frequency band remains within 5, the proposed algorithm approaches the optimal power allocation. Because the proposed algorithm has a stronger search capability, it is also clearly better than the FTPA and FPA algorithms in the multi-user multiplexing case.
5 Conclusion Aiming at the power allocation optimization problem in NOMA systems, this paper proposes a power allocation scheme based on the complex search algorithm. The algorithm achieves a sub-optimal solution when the number of users multiplexed in the same frequency band is small, and it has low complexity, which is convenient for physical implementation. The simulation results demonstrate the superiority of this algorithm in improving the system data throughput. A shortcoming of the algorithm is that when there are many multiplexed users, the performance improvement is not obvious. In the next phase of the work, we will focus on the performance limits of multi-user multiplexing, including the number of multiplexed users and the spectrum utilization. Acknowledgements. This work is supported by the National Natural Science Foundation of China (61661018) and the Hainan Provincial Natural Science Foundation High-level Talent Project (2019RC036). Hui Li is the corresponding author.
References 1. Dai L, Wang B, Yuan Y et al (2015) Non-orthogonal multiple access for 5G: solutions, challenges, opportunities, and future research trends[J]. IEEE Commun Mag 53(9):74–81 2. Zhu J, Wang J, Huang Y et al (2017) On optimal power allocation for downlink non-orthogonal multiple access systems. IEEE J Sel Areas Commun 35(12):1–2 3. Benjebbour A, Saito Y, Kishiyama Y et al (2013) Concept and practical considerations of non-orthogonal multiple access (NOMA) for future radio access. In: Proceedings of the international symposium on intelligent signal processing and communication systems, pp 770–774 4. Tabassum H, All MS, Hossain E et al (2016) Non-orthogonal multiple access (NOMA) in cellular uplink and downlink: challenges and enabling techniques 1–7 5. Dadi R, Parsaeefard S, Derakhshani M et al (2016) Power-efficient resource allocation in NOMA virtualized wireless networks. In: 2016 IEEE global communications conference, pp 1–6
6. Wei Z, Ng DWK, Yuan J (2016) Power-efficient resource allocation for MC-NOMA with statistical channel state information. In: IEEE global communications conference, pp 1–7 7. Li X, Li C, Jin Y (2016) Dynamic resource allocation for transmit power minimization in OFDM-based NOMA systems. IEEE Commun Lett 20(12):2558–2561 8. Cai W, Chen C, Bai L et al (2016) User selection and power allocation schemes for downlink NOMA systems with imperfect. In: IEEE 84th vehicular technology conference, pp 1–5 9. Liu X, Wang X, Liu Y (2017) Power allocation and performance analysis of the collaborative NOMA assisted relaying systems in 5G. China Commun 14(1):50–60 10. Yang Z, Ding Z, Fan P et al (2016) A general power allocation scheme to guarantee quality of service in downlink and uplink NOMA systems. IEEE Trans Wirel Commun 15(11):7244–7257 11. Yang Z, Xu W, Pan C et al (2017) On the optimality of power allocation for NOMA downlinks with individual QoS constraints. Commun Lett 21(7):1649–1652 12. Sun Y, Ng DWK, Ding Z et al (2017) Optimal joint power and subcarrier allocation for MC-NOMA systems. In: IEEE global communications conference, pp 1–6 13. Chen Z, Ding Z, Dai X et al (2017) An optimization perspective of the superiority of NOMA compared to conventional OMA. Trans Signal Process 65(19):5191–5202 14. Oviedo JA, Sadjadpour HR (2018) On the power allocation limits for downlink multi-user NOMA with QoS. In: IEEE international conference on communications, pp 1–5 15. Benjebbour A, Li A, Saito Y et al (2013) System-level performance of downlink NOMA for future LTE enhancements. In: IEEE Globecom workshops, pp 66–70 16. Hojeij MR, Farah J, Nour CA et al (2015) Resource allocation in downlink non-orthogonal multiple access (NOMA) for future radio access. In: IEEE 81st vehicular technology conference, pp 1–6 17. Li H, Ye M, Tong Q et al (2019) Performance comparison of systematic polar code and non-systematic polar code. J Commun 2019(6):203–209 18.
Oviedo JA, Sadjadpour HR (2017) A fair power allocation approach to NOMA in multiuser SISO systems. IEEE Trans Veh Technol 66(9):7974–7985 19. Choi J (2016) Power allocation for max-sum rate and max-min rate proportional fairness in NOMA. IEEE Commun Lett 20(10):2055–2058 20. Timotheou S, Krikidis I (2015) Fairness for non-orthogonal multiple access in 5G systems. IEEE Signal Process Lett 22(10):1647–1651 21. Al-Abbasi ZQ, So DKC (2015) Power allocation for sum rate maximization in non-orthogonal multiple access system. In: IEEE 26th annual international symposium on personal, indoor, and mobile radio communications, pp 1649–1653 22. Higuchi K, Benjebbour A (2015) Non-orthogonal multiple access (NOMA) with successive interference cancellation for future radio access. IEICE Trans Commun 98(3):403–404 23. Schaepperle J, Rüegg A (2009) Enhancement of throughput and fairness in 4G wireless access systems by non-orthogonal signaling. Bell Labs Tech J 13(4):59–77 24. Ali MS, Tabassum H, Hossain E (2016) Dynamic user clustering and power allocation for uplink and downlink non-orthogonal multiple access (NOMA) systems. IEEE Access 4:6325–6343 25. Usman MR, Khan A, Usman MA et al (2016) On the performance of perfect and imperfect SIC in downlink non orthogonal multiple access (NOMA). In: International conference on smart green technology in electrical and information systems, pp 102–106 26. Zhang Y, Wu SY (2018) Matlab optimization algorithm. Tsinghua University Press, Beijing
Study on the Feature Extraction of Mine Water Inrush Precursor Based on Wavelet Feature Coding Ye Zhang1 , Yang Zhang2 , Xuguang Jia2 , Huashuo Li1 , and Shoufeng Tang1(B) 1 China University of Mining and Technology, Xuzhou, China
[email protected], [email protected]
2 Xuzhou Comprehensive Center for Inspection and Testing of Quality and Technical Supervision, Xuzhou, China
Abstract. The acoustic emission (AE) signal of coal water inrush is characterized by time-varying, non-stationary, unpredictable and transient properties. To extract effective features representing coal water inrush information, the AE signal is analyzed with the wavelet characteristic energy spectrum coefficient based on wavelet theory. The feasibility of the wavelet feature coding is confirmed in terms of the coding scheme's availability and consistency, which proves that the coding method can be used as a signature for waveform identification. The inclusion of energy-distribution characteristics makes the waveform features more ordered and simplified. By analyzing the obtained feature codes in chronological order, the state of the time-series signals can be obtained, making it possible to track the dynamic characteristics of the acoustic emission signal. This lays an important foundation for the time-sequence analysis of the evolution of acoustic emission events in mine water inrush. Keywords: Coal water inrush · Acoustic emission · Wavelet feature coding · Wavelet characteristic energy spectrum · Wavelet theory
1 Introduction Mine water inrush is one of the main safety disasters in the process of mine construction and production, and the threat it poses is extremely serious because of complex hydrogeological conditions. The frequent occurrence of water inrush accidents has caused substantial economic losses and casualties. In the past ten years, there were 517 large water inrush accidents, in which 2753 people were injured or lost their lives. In the past 20 years, more than 300 coal mines have had water inrush accidents, with economic losses of more than 50 billion yuan RMB. Therefore, the development of an accurate method for predicting water inrush is essential [1, 2].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_32
In recent years, with the development of acoustic emission technology, it has become an important tool for water inrush research in coal mines [3–6]. However, due to the characteristics of the acoustic emission system and the limitations of prior methods in rock acoustic emission testing, a large number of interference signals are mixed into the acoustic emission signal, which has greatly limited the application of acoustic emission technology in geotechnical and mining engineering [7–9]. The AE signal is characterized by suddenness, diversity of source signals and diversity of interference noise, while the wavelet transform has good localization properties in both the time and frequency domains; the wavelet transform is therefore the most effective method for analyzing this type of signal [10–13]. Based on the wavelet transform and the wavelet characteristic energy spectrum coefficient, the wavelet characteristic energy spectrum coefficient encoding of mine acoustic emission signals is studied in this paper. The results show that the feature coding scheme has a high recognition rate and can effectively distinguish the various waveforms and types of interference in actual production.
2 Wavelet Packet Characteristic Power Spectrum Coefficient
The wavelet characteristic energy spectrum coefficient characterizes the distribution of AE signal energy over the frequency bands of the wavelet packet decomposition. Different information components in the signal cause the AE signal energy to be distributed in different frequency bands. Therefore, the wavelet characteristic energy spectrum coefficient can be selected to characterize the response signal and the interference signal. Wavelet feature coding requires few parameters to describe the signal characteristics. It can be used to identify different types of interference signals and response signals, and it is also very beneficial for suppressing the instability of response signals and for suppressing interference signals in the same frequency band. Assume that the discrete sampled acoustic emission signal is f(n). Under a J-scale wavelet decomposition, f(n) has J + 1 frequency-range components with constant total energy; that is,

$$f(n) = A_J f(n) + D_J f(n) + D_{J-1} f(n) + \cdots + D_1 f(n) \tag{1}$$
If

$$E_J^A f(n) = \sum_{n=1}^{N} \left(A_J f(n)\right)^2$$

$$E_j^D f(n) = \sum_{n=1}^{N} \left(D_j f(n)\right)^2, \quad j = 1, 2, \ldots, J \tag{2}$$

then $E_J^A f(n)$ and $E_j^D f(n)$ represent the low-frequency signal energy component and each high-frequency signal energy component at decomposition scale $J$ of the signal, respectively.
Y. Zhang et al.
The total energy of the AE signal is

$$Ef(n) = E_J^A f(n) + \sum_{j=1}^{J} E_j^D f(n) \tag{3}$$
The wavelet characteristic energy spectral coefficient is defined as the ratio of each wavelet-decomposition energy component to the total energy; it represents the distribution of energy over the frequency bands at each wavelet decomposition scale. Let $rE_J^A$ and $rE_j^D$ represent the low-frequency and high-frequency wavelet characteristic energy spectral coefficients, respectively [14–16]:

$$rE_J^A = \frac{E_J^A f(n)}{Ef(n)}, \quad rE_j^D = \frac{E_j^D f(n)}{Ef(n)}, \quad j = 1, 2, \ldots, J \tag{4}$$
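The coefficients of Eq. (4) can be computed as sketched below. To keep the sketch dependency-free, it uses an orthonormal Haar DWT in place of the sym6 wavelet used in the paper, and the test signals are synthetic:

```python
import numpy as np

def energy_spectrum_coeffs(f, level=6):
    """Wavelet characteristic energy spectrum coefficients of Eq. (4),
    computed with an orthonormal Haar DWT (Haar stands in for sym6 here).
    Returns the energy shares ordered [rE_J^A, rE_J^D, ..., rE_1^D]."""
    a = np.asarray(f, dtype=float)
    band_energies = []                            # E_1^D, E_2^D, ..., E_J^D
    for _ in range(level):
        d = (a[0::2] - a[1::2]) / np.sqrt(2)      # detail coefficients D_j
        a = (a[0::2] + a[1::2]) / np.sqrt(2)      # approximation A_j
        band_energies.append(np.sum(d ** 2))
    energies = np.array([np.sum(a ** 2)] + band_energies[::-1])
    return energies / energies.sum()              # Eq. (4): rE = E / E_total

# Toy comparison: a narrow-band "AE-like" burst vs. white noise.
rng = np.random.default_rng(1)
n = np.arange(4096)
burst = np.exp(-((n - 2048) / 60.0) ** 2) * np.sin(2 * np.pi * 0.05 * n)
noise = rng.standard_normal(4096)
print(energy_spectrum_coeffs(burst))   # energy concentrated in few bands
print(energy_spectrum_coeffs(noise))   # energy spread over all bands
```

Because the Haar transform is orthonormal, the band energies sum exactly to the signal energy, so the coefficients always sum to 1, matching Eq. (3).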
Figure 1 shows the waveform of an acoustic emission signal collected in a coal mine. Using the wavelet characteristic energy spectrum coefficient analysis method, a sixth-order wavelet decomposition of the AE signal is performed, and the waveform and its characteristic energy spectrum coefficients are reconstructed.
(Fig. 1 panels: the AE waveform in the analysis section and a noise waveform (D-sym6-6 spectral coefficients), with the distributions of the wavelet characteristic energy spectrum coefficients of the acoustic emission signal and of the noise signal versus decomposition scale)
Fig. 1 Wavelet characteristic power spectrum coefficient of AE and noise signals
It can be seen from Fig. 1 that there is a clear difference between the energy spectrum distributions of the acoustic emission signal and the noise signal, which is of great value for identifying different types of signals and suppressing noise.
3 Wavelet Characteristic Power Spectrum Feature Encoding
The feature code determines whether each spectral coefficient is in a valid or an invalid state by analyzing whether it has impact characteristics, together with the distribution characteristics of the spectral coefficients. The method of forming the
wavelet characteristic energy spectrum coefficient code is as follows: (1) obtain the wavelet characteristic energy spectrum coefficients by wavelet-transforming a waveform of specified length; (2) judge whether each decomposition layer contains an impact component; if not, clear all coefficients of that layer and mark the layer as non-impact; (3) according to the result of step (2), adjust the wavelet characteristic energy spectrum coefficients to form new coefficients, and reconstruct the waveform [17–19]. The impact judgment method is the peak-to-average ratio method, with the following parameters. Unitized mean:
Vrms Vtop
(5)
hmax =
Vmax Vtop
(6)
Unitized peak
Peak-to-average ratio h=
hmax havg
(7)
V_top is the maximum peak set on the acquisition device, V_max is the maximum peak of the signal in the analyzed interval, and V_rms is the root mean square of the signal in the analyzed interval. Take h = 4 as the impact discrimination threshold as an example. If h < 4, the signal at this decomposition level is considered mainly noise: the characteristic spectral coefficient of the layer is set to 1 and the corresponding wavelet coefficients are set to 0. Otherwise, an impact signal is considered present: the characteristic spectral coefficient of the layer is set to 0 and the corresponding wavelet coefficients are left unchanged. In this way, the state of the waveform can be determined.
The digital representation is 1000111, which is the characteristic code corresponding to the illustrated waveform. The signal on the d1 layer is generally weak, so to reduce the amount of calculation the d1 layer can be removed; only the a6, d6, d5, d4, d3, d2 status codes are considered, and the encoding method is

CodeValue = a6·2^0 + d6·2^1 + d5·2^2 + d4·2^3 + d3·2^4 + d2·2^5    (8)
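The impact judgment (Eqs. 5–7) and the encoding (Eq. 8) can be sketched in a few lines of Python. Note that V_top cancels in the ratio, so h = h_max/h_avg = V_max/V_rms; the function names are illustrative assumptions, while the convention that a noise layer gets status bit 1 follows the text above:

```python
import numpy as np

def layer_status(coeffs, threshold=4.0):
    """Peak-to-average ratio test, Eqs. (5)-(7): h = V_max / V_rms
    (V_top cancels in the ratio). Returns 1 when the layer is judged
    mainly noise (h < threshold), 0 when an impact is present."""
    c = np.asarray(coeffs, dtype=float)
    h = np.max(np.abs(c)) / np.sqrt(np.mean(c ** 2))
    return 1 if h < threshold else 0

def code_value(status):
    """Eq. (8): status bits ordered [a6, d6, d5, d4, d3, d2]."""
    return sum(bit * 2 ** i for i, bit in enumerate(status))
```

For the status bits [1, 0, 0, 0, 1, 1], CodeValue = 1 + 16 + 32 = 49, the code value discussed in the text.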
The waveform can thus be represented by the code "49". Uniqueness is the basic principle for avoiding duplicate and ambiguous codes; one purpose of encoding is to improve the accuracy of information processing, access, transmission, retrieval, control, utilization, and statistics. One code can represent only one object, and one object can have only one code.
Y. Zhang et al.
Therefore, the proposed coding scheme should analyze all signals effectively: every waveform can be encoded, different waveforms can be distinguished, waveforms with a given characteristic can be identified from their codes, and no bit errors should occur. On this basis, the feasibility of the coding scheme proposed in this paper is analyzed [20–22]. (A) Effectiveness of the coding scheme. Figure 2 shows the wavelet decomposition of different waveforms, their spectral coefficients, and the corresponding codes.
Fig. 2 Wavelet characteristic power spectrum and encoding of different AE signals
It can be seen that the extracted morphological quantization coding can distinguish different types of waveforms.
Study on the Feature Extraction of Mine Water Inrush Precursor …
(B) Consistency of the coding scheme. Figure 3 shows the recognition of waveforms with the same code value "58".
Fig. 3 Characteristic energy spectrum and reconstructed waveform of code 58
Figure 4 shows the recognition of waveforms with the same code value "35". Morphological comparison with the original waveforms shows that the coding scheme proposed in this paper has a high recognition rate. In actual production, common interferences can be encoded and an identification library established, effectively distinguishing the waveforms and types of various interferences [20, 23].
Fig. 4 Characteristic energy spectrum and reconstructed waveform of code 35

4 Conclusion

Based on wavelet analysis, this paper studies existing problems and key technologies in acoustic emission signal processing and feature extraction.

1. According to the characteristics of coal-rock acoustic emission signals, the correlation principle of wavelet energy spectrum coefficients is introduced, and these coefficients are used to characterize coal mine response signals and interference signals. The acoustic emission waveforms collected in the coal mine are processed to obtain each order of the wavelet decomposition, the reconstructed waveforms, and their characteristic energy spectrum coefficients.

2. The wavelet characteristic energy spectrum feature vector is extracted using the wavelet characteristic energy spectrum coefficients, and on this basis a wavelet characteristic coding method for acoustic emission signals is proposed. After feature encoding of the wavelet feature vector, an independent code corresponding to the waveform is obtained. The code distinguishes waveforms from one another and also carries the energy distribution characteristics, making the waveform features more orderly. The coding scheme can serve as a waveform index, avoiding the tedious process of pattern recognition and simplifying identification.

3. The coding method proposed in this paper yields the state time series of coal mine acoustic emission signals, from which the evolution of coal mine water inrush events can be analyzed, providing new ideas and methods for monitoring technologies such as acoustic emission.
Acknowledgements. Supported by National Key R&D Program of China under grant No. 2017YFF0205500. Thanks to the modern analysis and computing center of CUMT for providing computing services.
References

1. Dong S, Hu W (2007) Basic characteristics and main controlling factors of coal mine water hazard in China. Coal Geol Explor 35(5):34–38
2. National coal mine safety administration (2006) 2005 annual national coal mine accident analysis report compilation
3. Zhou Z, Li G, Ning S, Du K (2014) Acoustic emission characteristics and failure mechanism of high-stressed rocks under lateral disturbance. Chin J Rock Mech Eng 33(8):1720–1728
4. Lai X, Cai M (2004) Couple analyzing the acoustic emission characters from hard composite rock fracture. J Univ Sci Technol Beijing 11(2):97–101
5. Ohtsu M, Watanabe H (2001) Quantitative damage estimation of concrete by acoustic emission. Constr Build Mater 15:217–224
6. Tang J, Liu W, Fei X (2011) Study on after peak acoustic emission features of rock type material. Coal Sci Technol 39(5):21–24
7. Zhao E, Wang E (2006) Experimental study on acoustic emission characteristics of rock and soil in the failure process. J Disaster Prev Mitig Eng 26(3):316–320
8. Tang S, Tong M, Hu J, He X (2010) Characteristics of acoustic emission signals in damp cracking coal rocks. Min Sci Technol 20(1):143–147
9. Tang S, Tong M et al (2010) The acoustic emission experiment system of rock outburst in water inrush. J Min Saf Eng 27(3):429–432
10. Meng M (2015) Application research of optimization theory and wavelet analysis in time series analysis
11. Yuan W, Ma Y, Liu S, Xu Y, Sun H (2013) Remote sensing image compression analysis based on wavelet transform and matlab. Geomatics World 03:50–55
12. Liu J, She K (2010) Independent component analysis algorithm using wavelet filtering. J Electron Meas Instrum 1:39–44
13. Chen Q (2013) Wavelet basis construction and its algorithm and implementation based on the characteristics of geotechnical signals. Changsha University of Science & Technology
14. Shou-Feng T, Min-Ming T, Yu-Xiang P (2012) Wavelet basis function of the microseismic signal analysis. In: 2011 international conference in electrics, communication and automatic control proceedings. Springer, New York
15. Tang S, Tong M, Pan Y et al (2011) Energy spectrum coefficient analysis of wavelet features for coal rupture microseismic signal. Chin J Sci Instrum 32(7):1521–1527
16. Zhao Y, Liu L, Pan Y et al (2017) Experiment study on microseismic, charge induction, self-potential and acoustic emission during fracture process of rocks. Chin J Rock Mech Eng
17. Lu C, Dou L (2005) Frequency spectrum analysis on microseismic monitoring and signal differentiation of rock material. Chin J Geotech Eng 27(7):773–775
18. Cao Y, Li Y, Hu X et al (2010) Study of the disaster information coding on the earthquake site. J Seismol Res 33(3):344–348
19. Tong M, Hu J, Tang S et al (2009) Study of acoustic emission signal characteristic of water-containing coal and rock under different stress rates. J Min Saf Eng 26(1):97–100
20. Qian H, Yang S, Iyer R et al (2014) Parallel time series modeling—a case study of in-database big data analytics. Trends and applications in knowledge discovery and data mining. Springer International Publishing, Berlin
21. Ling T, Liao Y, Zhang S (2010) Application of wavelet packet method in frequency band energy distribution of rock acoustic emission signals under impact loading. J Vib Shock 29(10)
22. Tang S, Tong M, Hu J et al (2010) Characteristics of acoustic emission signals in damp cracking coal rocks. Min Sci Technol 01:147–151
23. Mei H-B, Gong J (2007) An IDS alarm analysis method for intrusion warning based on time series theory. Comput Sci 34(12):68–72
Research on Coal and Gas Outburst Prediction Using PSO-FSVM

Benlong Zhu1, Ye Zhang2, Yanjuan Yu1, Yang Zhang1, Huashuo Li2, Yuhang Sun2, and Shoufeng Tang2(B)

1 Xuzhou Comprehensive Center for Inspection and Testing of Quality and Technical Supervision, Xuzhou, China
2 China University of Mining and Technology, Xuzhou, China

[email protected]
Abstract. Coal and gas outburst is a highly destructive natural disaster in the coal mining process. If an outburst can be predicted accurately and in time, protective measures can be taken before the disaster, guaranteeing the safety of underground workers to the maximum extent. Because the traditional support vector machine (SVM) has poor noise resistance and is easily affected by its parameters, this paper presents a new prediction method based on a particle swarm optimization fuzzy support vector machine (PSO-FSVM). To balance the global and local optimization abilities of PSO, an inertia weight and a simulated annealing step are introduced, improving the optimization ability of PSO and its probability of escaping local optima. The improved prediction model combining PSO and FSVM is used to predict coal and gas outburst. Experimental results show that the model trains faster and classifies more accurately than the PSO-SVM and FSVM models.

Keywords: Coal and gas outburst · FSVM · PSO · PSO-FSVM
1 Introduction

In recent years, with China's economic development, the demand for coal has grown steadily. Given the increasingly difficult mining conditions and frequent accidents during mining, how to mine coal safely has become an important topic. The formation of coal mines is mainly affected by geological conditions, and the distribution characteristics are relatively clear; most coal seams lie tens to hundreds of meters underground. Because the underground mining environment is dangerous, various unpredictable mine accidents occur during mining. Among all mine accidents, coal and gas outburst accidents account for the largest proportion, and their frequency is gradually increasing. The coal and gas outburst problem has reached the point where it must be solved.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_33
With the continued maturation of statistical learning theory, Cortes and Vapnik put forward the support vector machine (SVM), a new pattern recognition method based on the principle of structural risk minimization, which, by selecting an appropriate kernel function, shows many advantages on nonlinear, small-sample, and high-dimensional recognition problems [1–3]. However, collected samples contain fuzzy information such as noise points and outliers, which prevents SVM classification from reaching the optimal solution and greatly reduces accuracy. To solve this problem, Lin and Wang [4] put forward the concept of the FSVM, introducing a fuzzy factor into the support vector machine and effectively reducing the impact of such factors on classification accuracy. This paper selects the FSVM as the detection method for coal and gas outburst. PSO has a simple structure, needs few parameter adjustments, runs fast, and integrates easily with other methods. Therefore, this paper uses particle swarm optimization to tune the FSVM model, and improves the algorithm to address its tendency to fall into local optima.
2 PSO Algorithm and Its Improvement

PSO was developed by observing the foraging behavior of bird flocks and combining two flock models into an optimization method [5]. The empirical data a particle uses to search for the individual optimal solution consist of two parts: its current velocity and its current position. To find the optimal solution, each particle moves and updates iteratively, comparing the best solution found in its own motion with the historical best solution of the population [6–8]:

v_id^(k+1) = v_id^k + c1 r1 (q_id − x_id^k) + c2 r2 (q_gd − x_id^k)    (1)

x_id^(k+1) = x_id^k + v_id^(k+1)    (2)
where k is the iteration number; r1 and r2, drawn uniformly at random from (0, 1), keep the population diverse; and c1 and c2 are the acceleration (learning) factors, usually non-negative constants. The position of a particle is evaluated by its fitness value, and the velocity update is determined by the particle's velocity v_id^k before moving, its individual best position q_id, and the global best position q_gd. The convergence of the particle swarm algorithm fluctuates greatly across different optimization problems. To eliminate the negative impact of this on convergence, the inertia weight t is introduced:

v_id^(k+1) = t v_id^k + c1 r1 (q_id − x_id^k) + c2 r2 (q_gd − x_id^k)    (3)
After the inertia weight is introduced, the algorithm is called the standard particle swarm algorithm. As the particles evolve from generation to generation, their positions change.
3 Improved PSO Algorithm

Because the particle swarm optimization algorithm easily falls into local optima, it is improved in two ways: (1) by optimizing its main input parameters, such as the acceleration factors and the inertia weight; (2) by adding simulated annealing to force the algorithm out of local optimal traps. In the traditional particle swarm algorithm, the ability to find the global optimum is mainly determined by the inertia weight. According to the different roles the inertia weight plays at different stages of the run, it is made to decrease nonlinearly with the number of iterations:

t(k) = t_s − (t_s − t_e)(k/T_max)^2    (4)
where t_s is the initial inertia weight, t_e is the inertia weight at the maximum number of iterations, k is the current iteration number, and T_max is the maximum iteration number.

In traditional PSO, the magnitudes of the acceleration factors c1 and c2 affect the global and local search capabilities. c1 controls the local optimization ability, which increases as c1 increases; c2 controls the global optimization ability, which increases as c2 increases. Early in the optimization, the global search ability matters more, while in the later stage the local search ability should be strengthened so the optimal solution is found quickly. Therefore, adaptive acceleration factors can be added to the traditional particle swarm algorithm. A general adaptation formula is

c1 = R1 + R2 (k/T_max),  c2 = R3 − R4 (k/T_max)    (5)

where R1, R2, R3, R4 are non-negative constants. As the number of iterations increases, c1 gradually increases and c2 gradually decreases, so the algorithm maintains both its global and its local optimization ability.

This paper uses the simulated annealing algorithm to improve PSO. When particles fall into a local optimal trap, simulated annealing allows the swarm, with a certain probability, to accept a particle slightly worse than the trapped one and thus jump out of the trap:

P = exp(−ΔE/(kT))    (6)
In the formula, E is the internal energy at temperature T, ΔE is the change in internal energy, and k is the Boltzmann constant. It can be seen from the formula that the higher the temperature T, the greater the acceptance probability, and as T decreases during annealing, the probability of accepting a worse particle shrinks. For a worse candidate, ΔE > 0, so −ΔE/(kT) < 0 and the acceptance probability P lies in (0, 1).
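The improved update rules (Eqs. 2–6) can be combined into a compact sketch. The toy objective, swarm size, annealing schedule, and the constants R1–R4 are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def sphere(x):
    """Toy objective to minimize (assumed for illustration)."""
    return float(np.sum(x ** 2))

def improved_pso(f, dim=2, n=20, t_max=100,
                 ts=0.9, te=0.4,                   # inertia bounds, Eq. (4)
                 R1=1.5, R2=1.0, R3=2.5, R4=1.0,   # acceleration constants, Eq. (5)
                 temp=1.0, cool=0.95):             # simulated annealing schedule
    x = rng.uniform(-5.0, 5.0, (n, dim))
    v = np.zeros((n, dim))
    p_best, p_val = x.copy(), np.array([f(xi) for xi in x])
    g_best, g_val = p_best[p_val.argmin()].copy(), p_val.min()
    for k in range(t_max):
        t = ts - (ts - te) * (k / t_max) ** 2      # nonlinear inertia weight, Eq. (4)
        c1 = R1 + R2 * k / t_max                   # adaptive acceleration, Eq. (5)
        c2 = R3 - R4 * k / t_max
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        v = t * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)  # Eq. (3)
        x = x + v                                  # Eq. (2)
        for i in range(n):
            fi = f(x[i])
            dE = fi - p_val[i]
            # accept improvements, or a slightly worse particle with
            # simulated-annealing probability exp(-dE / T), Eq. (6)
            if dE < 0 or rng.random() < np.exp(-dE / max(temp, 1e-12)):
                p_best[i], p_val[i] = x[i].copy(), fi
        if p_val.min() < g_val:                    # keep the best solution ever seen
            g_best, g_val = p_best[p_val.argmin()].copy(), p_val.min()
        temp *= cool
    return g_best, g_val
```

On this toy objective the sketch steadily improves the best solution found, while the annealing step occasionally accepts worse personal bests early on to escape local traps.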
4 PSO-FSVM

(A) FSVM. Let the training set be S = {(x_1, y_1, S_1), ..., (x_l, y_l, S_l)}, where x_j ∈ R^n, y_j ∈ {−1, 1}, and S_j ∈ [0, 1] is the fuzzy membership (j = 1, ..., l) assigned to the training point (x_j, y_j) after fuzzification. The processed data are passed as input to the prediction model. For linearly separable data, finding the optimal hyperplane can be expressed as the mathematical program

min_{w,b,ξ} (1/2)||w||^2 + C Σ_{j=1}^{l} S_j ξ_j
s.t. y_j((w · x_j) + b) + ξ_j ≥ 1, j = 1, ..., l
     ξ_j ≥ 0, j = 1, ..., l    (7)

where C is the penalty parameter, ξ = (ξ_1, ..., ξ_l)^T, and S_j weights whether the training point (x_j, y_j) belongs to the correct class. To solve this quadratic program, the following Lagrange function is constructed:

L(w, b, ξ, α, β) = (1/2)||w||^2 + C Σ_{j=1}^{l} S_j ξ_j − Σ_{j=1}^{l} α_j [y_j((w · x_j) + b) + ξ_j − 1] − Σ_{j=1}^{l} β_j ξ_j    (8)

where α = (α_1, ..., α_l)^T, β = (β_1, ..., β_l)^T, α_j ≥ 0, β_j ≥ 0, j = 1, ..., l. Following the Wolfe dual definition, the Lagrangian is minimized with respect to w, b, ξ. Finally, the fuzzy optimal classification function is

f(x) = sgn{(w* · x) + b*}, x ∈ R^n    (9)

where

w* = Σ_{j=1}^{l} α_j* y_j x_j,  b* = y_i − Σ_{j=1}^{l} y_j α_j* (x_j · x_i),  i ∈ {i : 0 < α_i* < S_i C}.
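Given a solved dual (values α*, memberships S, penalty C), the classification function (9) and the expressions for w* and b* translate directly into code. The function name and the toy data in the test are assumptions:

```python
import numpy as np

def fsvm_decision(alpha, y, X, s, C, x_new):
    """Eq. (9): f(x) = sgn((w* . x) + b*), with
    w* = sum_j alpha_j y_j x_j and b* computed from any margin
    support vector i satisfying 0 < alpha_i < s_i C."""
    alpha, y, X, s = (np.asarray(a, dtype=float) for a in (alpha, y, X, s))
    w = (alpha * y) @ X
    # pick a margin support vector: 0 < alpha_i < s_i * C
    i = np.flatnonzero((alpha > 1e-9) & (alpha < s * C - 1e-9))[0]
    b = y[i] - np.sum(alpha * y * (X @ X[i]))
    return float(np.sign(w @ np.asarray(x_new, dtype=float) + b))
```

For the 1-D toy problem with points ±2 and labels ±1, the optimal dual is α* = (0.125, 0.125), giving w* = 0.5 and b* = 0, so points to the right of the origin are classified +1.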
According to the characteristics of coal and gas samples, the membership degree in this paper is determined by the distance between a data point and its class center in the high-dimensional space to which the membership function maps. Points near the class center receive high membership, so noise points far from the center differ markedly in membership from correct samples. The membership function thus separates noise points from correct samples: noise points far from the class center receive a small membership degree, so the classification surface is not shifted by them and their influence on prediction accuracy is eliminated. Suppose there are data {x_1, ..., x_l} in the n-dimensional space R^n. Let x_0 be the class center and r the class radius:

x_0 = (1/l) Σ_{j=1}^{l} x_j,  r = max_j ||x_j − x_0||    (10)

With the membership determined by distance, the membership degree of a sample can be expressed as

s_j = 1 − ||x_j − x_0|| / (r + δ)    (11)

(B) PSO-FSVM. The choice of parameters strongly affects the prediction speed, prediction accuracy, and generalization ability of the fuzzy support vector machine, while the particle swarm algorithm is easy to understand and use, optimizes efficiently, and is robust. To improve the classification performance of the FSVM, this paper uses the particle swarm algorithm to find its optimal parameters. Each particle is defined as three-dimensional data comprising the penalty coefficient C, the kernel function parameter δ, and a characteristic parameter a: when the penalty coefficient and kernel parameter fall within the set ranges, a = 1; otherwise a = 0.
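Equations (10)–(11) amount to a few lines of NumPy. Here δ is a small positive constant that keeps the farthest sample's membership above zero; its value, like the function name, is an illustrative assumption:

```python
import numpy as np

def fuzzy_membership(X, delta=1e-3):
    """Distance-based fuzzy membership, Eqs. (10)-(11):
    s_j = 1 - ||x_j - x_0|| / (r + delta), where x_0 is the class
    center and r the class radius."""
    X = np.asarray(X, dtype=float)
    x0 = X.mean(axis=0)                 # class center, Eq. (10)
    d = np.linalg.norm(X - x0, axis=1)  # distances to the center
    r = d.max()                         # class radius, Eq. (10)
    return 1.0 - d / (r + delta)
```

A sample at the class center gets membership 1, and points far from the center (likely noise) get memberships close to 0, so they contribute little to the FSVM objective in Eq. (7).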
5 Experiment and Result Analysis

This paper takes the data monitored at outburst mines in Guizhou, Yunnan, Sichuan, and other areas, as reported in the literature, as the model sample set. Coal and gas outbursts are divided into four categories: no outburst is represented by the number 1, small outbursts by 2, medium outbursts by 3, and large outbursts by 4. To compare the performance of the models, the same raw data are normalized and input to the particle-swarm-optimized support vector machine for training and testing; for the fuzzy support vector machine and the particle-swarm-optimized fuzzy support vector machine, the data are fuzzified before training and testing. Comparing training accuracy, the following are the training results of the three classification algorithms on the same data.
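The text states only that the raw data are normalized before training; a common choice, shown here as an assumption rather than the paper's exact scheme, is per-feature min-max scaling:

```python
import numpy as np

def min_max_normalize(X):
    """Scale each feature (column) of X linearly to [0, 1]."""
    X = np.asarray(X, dtype=float)
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    return (X - xmin) / (xmax - xmin)
```

Scaling all indicators to a common range prevents features with large numeric ranges (e.g. gas pressure vs. coal firmness coefficient) from dominating the SVM kernel.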
After comparison, it can be seen that the particle-swarm-optimized fuzzy support vector machine has the highest training accuracy, reaching 98.3607%, indicating that this algorithm best fits the outburst data and corresponding outburst degrees simulated in this paper. Figures 1 and 2 show that introducing the fuzzy membership function reduces the impact of erroneous samples on the classification results, and comparing Figs. 2 and 3 shows that the training accuracy of PSO-SVM is higher than that of FSVM.
Fig. 1 PSO-FSVM model training simulation results
Fig. 2 PSO-SVM model training simulation results
Fig. 3 FSVM model training simulation results

It can be seen from Table 1 that, under the same computer configuration and compared with the other algorithms, the particle swarm optimization fuzzy support vector machine
algorithm has the shortest training time. Therefore, in terms of convergence speed, the particle swarm optimization fuzzy support vector machine is far superior to the other three algorithms.

Table 1 Training time comparison of different models

Model category     PSO-FSVM   FSVM    PSO-SVM   BP
Training time/s    1.107      2.538   2.061     9.134
6 Conclusion

Aiming at the shortcomings of the SVM, namely weak noise resistance and strong sensitivity to parameters, this paper proposes a new coal and gas outburst prediction method based on PSO-FSVM. The method first computes the fuzzy membership degree of each sample with the fuzzy membership function, reducing the influence of noise points on the classification results. Second, the PSO algorithm is improved: an inertia weight that decreases nonlinearly with the number of iterations is introduced to strengthen the optimization ability, and the simulated annealing algorithm lets particles forcibly jump out of local optimal traps with a certain probability; the improved PSO is then used for parameter optimization of the FSVM. Finally, an FSVM prediction model based on the improved PSO algorithm is constructed. The model reduces the influence of erroneous samples on its prediction ability by assigning a corresponding membership to the measured data, and the PSO algorithm finds the optimal parameters so that the parameters' impact on the prediction model is minimized. The PSO-SVM and FSVM models are compared with the model in this paper; the experiments show that the PSO-FSVM model has the fastest training speed and the highest classification accuracy. The improved fuzzy support vector machine prediction model is simulated in MATLAB with measured mine data, and the output results show that the algorithm is faster than other traditional prediction methods and makes more precise judgments on whether an outburst will occur. The method effectively solves the problems of poor noise resistance, slow training speed, and low prediction accuracy in traditional prediction methods, and has strong practicability.
References

1. Gao L, Zhao S, Gao J (2013) Application of artificial fish-swarm algorithm in SVM parameter optimization selection. Comput Eng Appl 49(23):86–90
2. Pang B (2016) Research on the movement direction decoding of animals based on maximum likelihood estimation. Zhengzhou University
3. Zhang D (2007) Research into prediction model of water content in crude oil based on intelligent information processing technique. China University of Petroleum
4. Lin CF, Wang SD (2002) Fuzzy support vector machines. IEEE Trans Neural Netw 13(2):464–471
5. Zhang C (2018) The research and application of hybrid swarm intelligence optimization. University of Science and Technology Beijing
6. Li Y (2014) The study of particle swarm algorithm based on multi-objective optimization and its application. Southwest Jiaotong University
7. Shao Q (2017) Particle swarm optimization and its application in engineering. Jilin University
8. Wang J (2016) Research and application of multi-objective particle swarm optimization algorithm. Northeast Petroleum University
Prediction of Coal and Gas Outburst Based on FSVM

Xuguang Jia1, Ye Zhang1, Yang Zhang1, Yanjuan Yu1, Huashuo Li2, Yuhang Sun2, and Shoufeng Tang2(B)

1 Xuzhou Comprehensive Center for Inspection and Testing of Quality and Technical Supervision, Xuzhou, China
2 China University of Mining and Technology, Xuzhou, China

[email protected]
Abstract. Coal and gas outburst is one of the natural disasters of coal mines: it is highly destructive and sudden, and it is a complex nonlinear problem governed by a combination of factors. The fuzzy support vector machine (FSVM) combines the advantages of fuzzy theory and the support vector machine (SVM); it has strong recognition ability on small samples and better learning ability than the traditional SVM. In this paper, gray correlation analysis (GRA) is used to extract coal and gas outburst indicators, an appropriate fuzzy membership function is introduced, and on this basis a coal and gas outburst prediction model based on the FSVM is proposed. Verification and comparison with other prediction methods show that the FSVM model meets the requirements of coal and gas outburst prediction. The same data set is also trained with FSVM, PSO-SVM, and a BP neural network, and experiments show that the FSVM has better prediction accuracy.

Keywords: Coal and gas outburst · SVM · FSVM · GRA
1 Introduction

In recent years, China's economy has developed rapidly, and the demand for coal keeps increasing. With the worsening mining conditions in the mines and frequent accidents during the mining process, how to mine coal safely has become an important issue. The formation of coal mines is mainly affected by geological conditions, and the distribution characteristics are relatively clear; most coal seams lie tens to hundreds of meters underground. Because the underground mining environment is dangerous, various unpredictable mining accidents occur during mining. Among all mining accidents, coal and gas outburst accidents account for the largest proportion, and their frequency is gradually increasing. The coal and gas outburst problem has reached the point where it must be solved.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_34
Coal and gas outburst is affected by a variety of factors. Traditional coal and gas outburst prediction needs to determine the main influencing factors, but some methods in modern mathematical theory, such as neural networks and gray system theory, can ignore this step and instead exploit the nonlinear relationship between each influencing factor and the corresponding outburst result. Scholars in China and abroad have tried various classification algorithms to predict coal and gas outbursts; successful algorithms in this field include artificial neural networks and fuzzy mathematical algorithms. Scholars from China University of Mining and Technology used a BP neural network to establish a coal and gas outburst prediction model [1]; Guo Deyong et al. built a model based on gray theory and neural network theory and brought it into the field of coal and gas outburst prediction [2]; Peng Hong et al. combined a BP neural network with a rough set algorithm to analyze gas outburst [3]; and Feng Zhanwen applied an analytic hierarchy process (AHP) fuzzy comprehensive evaluation method to assess the danger of coal and gas outburst [4]. However, the subjectivity of the AHP and fuzzy comprehensive evaluation methods; the complexity of artificial neural networks, with the resulting oscillation of the iterative process and overfitting; the complexity and nonlinearity of the coal mine production system with its many factors; and the limitation of data samples all impose great limitations on these evaluation methods in practical application [5].
With the continued maturation of statistical learning theory, Cortes and Vapnik proposed a new pattern recognition method based on the principle of structural risk minimization in 1995, the support vector machine (SVM), which shows many advantages on nonlinear, small-sample, and high-dimensional recognition problems. In this paper, the gray correlation analysis method is used to extract coal and gas outburst indicators, an appropriate fuzzy membership function is introduced, and on this basis a coal and gas outburst prediction model based on the FSVM is proposed [6, 7]. Comparison with actual data and BP prediction results shows that the FSVM model meets the requirements of coal and gas outburst prediction and has higher accuracy.
2 Gray Correlation Analysis

The gray system theory was proposed by Professor Deng Julong in 1982 and mainly analyzes and processes imperfect systems [8]. First, the data are preprocessed. After processing, the reference sequence becomes Y0 = {y0(k) | k = 1, 2, …, n}, and the comparison sequences become Yi = {yi(k) | k = 1, 2, …, n; i = 1, 2, …, m}. The processing formula can be expressed as:

y0(k) = x0(k) / ((1/n) Σ_{t=1}^{n} x0(t)),  yi(k) = xi(k) / ((1/n) Σ_{t=1}^{n} xi(t))    (1)
The correlation coefficient between y0 and yi at point k can be expressed as:

ξi(k) = (min_i min_k |y0(k) − yi(k)| + ρ max_i max_k |y0(k) − yi(k)|) / (|y0(k) − yi(k)| + ρ max_i max_k |y0(k) − yi(k)|)    (2)
272
X. Jia et al.
In the formula, the specific value of ρ can be determined according to the actual situation; generally ρ = 0.5. The gray entropy correlation degree is then calculated from the entropy weights as:

ri = (1/n) Σ_{k=1}^{n} w(k) ξi(k)    (3)

where ξi(k) is the correlation coefficient and w(k) is the entropy weight of index k.
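The preprocessing and correlation computations of Eqs. (1)–(3) can be sketched in NumPy as follows. This is a minimal illustration: uniform weights are used where the entropy weights w(k) would normally be substituted, and the input arrays are only stand-ins for the measured mine data.

```python
import numpy as np

def gray_relational_degree(x0, X, rho=0.5, w=None):
    """Gray relational analysis: x0 is the reference sequence (outburst
    strength), X is an (m, n) array of comparison sequences (influencing
    factors). Returns the relational degree r_i of each factor."""
    # Eq. (1): divide each sequence by its own mean (dimensionless form)
    y0 = x0 / x0.mean()
    Y = X / X.mean(axis=1, keepdims=True)

    # Eq. (2): gray relational coefficients xi_i(k)
    diff = np.abs(Y - y0)              # |y0(k) - yi(k)| for all i, k
    dmin, dmax = diff.min(), diff.max()
    xi = (dmin + rho * dmax) / (diff + rho * dmax)

    # Eq. (3): weighted average over k (uniform weights if none given)
    if w is None:
        w = np.full(x0.size, 1.0)
    return (w * xi).mean(axis=1)
```

Factors with a larger relational degree ri have a stronger association with the outburst strength, which is how the top five prediction indicators are ranked later in the paper.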
3 FSVM

Let the training set be S = {(x1, y1, S1), …, (xl, yl, Sl)}, where xj ∈ Rn, yj ∈ {−1, 1}, and Sj ∈ [0, 1] is the fuzzy membership (j = 1, …, l) assigned to the training point (xj, yj) after fuzzification. The processed data are transmitted as input to the prediction model. For linearly separable data, the search of the classification model for the optimal hyperplane can be expressed mathematically as:

min_{w,b,ξ} (1/2)‖w‖² + C Σ_{j=1}^{l} Sj ξj
s.t. yj((w · xj) + b) + ξj ≥ 1, j = 1, …, l
     ξj ≥ 0, j = 1, …, l    (4)

where C is the penalty parameter, ξ = (ξ1, …, ξl)^T, and Sj weights the degree to which the training point (xj, yj) belongs to the correct sample. To solve this quadratic program, the following Lagrange function is constructed:

L(w, b, ξ, α, β) = (1/2)‖w‖² + C Σ_{j=1}^{l} Sj ξj − Σ_{j=1}^{l} αj (yj((w · xj) + b) + ξj − 1) − Σ_{j=1}^{l} βj ξj    (5)

where α = (α1, …, αl)^T, β = (β1, …, βl)^T, αj ≥ 0, βj ≥ 0, j = 1, …, l. Following the Wolfe dual, the Lagrange function is minimized with respect to w, b, and ξ. Finally, the fuzzy optimal classification function is

f(x) = sgn((w* · x) + b*), x ∈ Rn    (6)

where w* = Σ_{j=1}^{l} αj* yj xj and b* = yi − Σ_{j=1}^{l} αj* yj (xj · xi), for any i with 0 < αi* < Si C.
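A minimal sketch of how the membership-weighted penalty in Eq. (4) changes training. For brevity this uses subgradient descent on the linear primal rather than solving the dual quadratic program of Eq. (5), and the paper itself uses an RBF kernel; the function below is therefore only an illustration of the weighting idea, not the paper's solver.

```python
import numpy as np

def train_fsvm_linear(X, y, s, C=1.0, lr=0.001, epochs=2000):
    """Subgradient descent on Eq. (4): min 1/2||w||^2 + C * sum_j s_j * xi_j.
    Each slack is weighted by the fuzzy membership s_j, so low-membership
    (noisy or isolated) points pull the separating plane far less."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1            # points violating the margin
        gw = w - C * (s[viol] * y[viol]) @ X[viol]
        gb = -C * np.sum(s[viol] * y[viol])
        w -= lr * gw
        b -= lr * gb
    return w, b
```

With all s_j = 1 this reduces to the ordinary soft-margin SVM; shrinking s_j toward 0 for isolated points reproduces the noise-suppression behavior that motivates the FSVM.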
According to the characteristics of coal and gas samples, the membership degree selected in this paper is determined by the distance between a sample and its class center in the high-dimensional space to which the membership function maps.

Prediction of Coal and Gas Outburst Based on FSVM
273

The closer a sample lies to the class center, the higher its membership degree, so a noise point far from the class center differs clearly in membership from a correct sample. According to the characteristics of the measured coal and gas outburst data, the membership function can thus separate noise points from correct samples: it assigns a smaller membership degree to sample points far from the class center, so that the classification surface is not shifted by noise points and their influence on the prediction accuracy is eliminated. Suppose there are data {x1, …, xl} in the n-dimensional space Rn. Let x0 be the class center and r the class radius:

x0 = (1/l) Σ_{j=1}^{l} xj,  r = max_j ‖xj − x0‖    (7)

With membership determined by distance, the membership degree of sample xj can be expressed as:

sj = 1 − ‖xj − x0‖ / (r + δ)    (8)

where δ > 0 is a small constant ensuring sj > 0.
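Eqs. (7)–(8) translate directly into code; a small NumPy sketch, where delta is the small positive constant δ from Eq. (8):

```python
import numpy as np

def fuzzy_membership(X, delta=1e-3):
    """Distance-based membership of Eqs. (7)-(8): samples near the class
    center x0 get membership close to 1; the most isolated sample, at
    distance r from the center, gets the smallest membership."""
    x0 = X.mean(axis=0)                    # class center, Eq. (7)
    d = np.linalg.norm(X - x0, axis=1)     # distances to the center
    r = d.max()                            # class radius, Eq. (7)
    return 1.0 - d / (r + delta)           # Eq. (8)
```

The returned vector s supplies the memberships S_j used in the FSVM training set of Sect. 3.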
4 Application of FSVM in Prediction of Coal and Gas Outburst

4.1 Selection of Coal and Gas Outburst Influence Indexes and Gray Correlation Analysis

Coal and gas outburst is the result of the combined action of three factors: in situ stress, gas, and the physical and mechanical properties of coal. In this paper, the strength of coal and gas outburst is selected as the reference sequence, and the data of each factor affecting coal and gas outburst are set as the comparison sequences. Let the intensity of gas outburst be the mother factor, and set the data of the other influencing factors as sub-factors with their observations. The data collected in the literature from outburst mines in Guizhou, Yunnan, Sichuan, and other regions are used as the model sample set for gray correlation analysis, and 8 representative samples are selected from all the obtained samples as data for calculating outburst indicators (Table 1). When calculating the gray correlation degree, the differing dimensions of the influencing factors would make the calculation excessively large and the comparison meaningless, so the data need to be preprocessed. After the dimensionless processing of the coal and gas outburst data, these obstacles to comparison are eliminated. First, the raw data measured at the outburst mines are analyzed, and the entropy weights of the various indicators for outburst strength are obtained from the original data. The gray entropy correlation degree is then calculated from these weights and the gray correlation coefficients (Table 2). From the definition of the gray entropy correlation degree, the larger its value, the greater the impact on the outburst strength. The top five factors by gray entropy correlation are used as prediction indicators.
Table 1 Partial raw data
Sample | Strength | Initial gas velocity | Ruggedness factor | Gas pressure | Soft layer thickness | Type of coal destruction | Mining depth | Gas content
1 | 150.20 | 19.13 | 0.33 | 2.69 | 1.23 | 3 | 615 | 10.31
2 | 19.97 | 5.95 | 0.26 | 0.94 | 2.03 | 5 | 451 | 13.26
3 | 15.17 | 18.15 | 0.18 | 1.27 | 1.41 | 3 | 462 | 10.42
4 | 0.00 | 5.08 | 0.57 | 1.21 | 1.65 | 1 | 401 | 9.11
5 | 76.24 | 8.14 | 0.41 | 1.31 | 1.36 | 3 | 765 | 8.89
6 | 10.34 | 7.91 | 0.63 | 2.85 | 1.87 | 3 | 417 | 10.49
7 | 0.00 | 7.26 | 0.45 | 2.07 | 1.14 | 1 | 480 | 9.74
8 | 111.67 | 14.38 | 0.23 | 3.87 | 0.97 | 3 | 550 | 8.19
Table 2 Gray entropy weighting of each influencing factor

Influence factor          | Gray entropy correlation
Mining depth              | 0.6401
Gas content               | 0.5546
Type of coal destruction  | 0.6061
Soft layer thickness      | 0.5466
Ruggedness factor         | 0.5792
Initial velocity          | 0.6416
Gas pressure              | 0.6698
4.2 Simulation

The basic steps of the entire algorithm can be described as follows:

(1) Normalize the original sample data.
(2) Sort the correlations of the obtained indexes and select the main indexes.
(3) Find the class center x0 and class radius r of the sample set, and calculate the membership function sj.
(4) Determine the fuzzy training set S = {(x1, y1, s1), …, (xl, yl, sl)}, and select the training and test samples.
(5) Select the Gaussian function as the radial basis kernel, and use cross-validation to optimize and determine the penalty parameter C and the kernel parameter σ.
(6) Use the optimal C and σ to learn the training sample set and establish the FSVM model.
(7) Use the test samples to make predictions, compare the test results with the original data, and verify the accuracy of the FSVM model.

In this paper, 80 sets of data are selected as the fuzzy sample set. In order to train the model fully, 75% (60 sets) of the data are selected as training samples, and the remaining 25% (20 sets) are used as test samples. The SVM model parameters are set to C = 100 and σ = 10, and coal and gas outbursts are divided into four categories: no outburst is represented by the number 1, small outburst by 2, medium outburst by 3, and large outburst by 4. The same test data are input into the FSVM prediction model and the two comparison prediction models (Figs. 1, 2 and 3).
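Step (5)'s cross-validated choice of C and σ can be sketched generically. The paper does not show its model-fitting code, so `train_eval` below is an assumed pluggable callback that trains and scores the RBF-kernel FSVM for one parameter pair; any scorer with that signature works.

```python
import numpy as np
from itertools import product

def kfold_indices(n, k=5, seed=0):
    """Shuffle n sample indices and split them into k folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

def grid_search(X, y, train_eval, Cs, sigmas, k=5):
    """Return the (C, sigma) pair with the best mean k-fold CV accuracy.
    train_eval(Xtr, ytr, Xte, yte, C, sigma) is a stand-in for training
    and scoring the RBF-kernel FSVM."""
    folds = kfold_indices(len(y), k)
    best, best_acc = None, -1.0
    for C, sigma in product(Cs, sigmas):
        accs = []
        for i in range(k):
            test_idx = folds[i]
            train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
            accs.append(train_eval(X[train_idx], y[train_idx],
                                   X[test_idx], y[test_idx], C, sigma))
        mean_acc = float(np.mean(accs))
        if mean_acc > best_acc:
            best, best_acc = (C, sigma), mean_acc
    return best, best_acc
```

The selected pair is then used once more on the full training set, as in steps (6)–(7).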
Fig. 1 FSVM model output (category tags of the actual and predicted test sets over the training set samples)

Fig. 2 PSO-SVM model simulation output (category tags of the actual and predicted test sets over the training set samples)
From the image comparison, it can be seen intuitively that the classification accuracy of the FSVM model on the test samples is 92.3077%, which is much higher than that of the BP neural network prediction model and also higher than the PSO-SVM prediction accuracy of 66.667%.

Fig. 3 BP neural network model simulation output (category tags of the actual and predicted test sets over the training set samples)

At the same time, it can be seen that when predicting the 7th, 9th, and 10th data points, the results obtained by PSO-SVM are wrong, while the FSVM model still predicts them correctly. Comparing these samples with the others shows that they are isolated from the other sample points and too far from the sample center. Because the traditional SVM places the same trust in all samples, the searched classification surface tends to be pulled toward these isolated points, so the found classification surface deviates from the actual one. In the simulation conducted in this paper, there are noise points near the outburst sample set, so the classification surface found by the traditional SVM lies closer to the outburst sample set than the actual boundary; this offset causes the non-outburst set obtained by classification to actually contain outburst data. The FSVM introduces a fuzzy membership function that assigns a corresponding fuzzy membership to each training sample, so that the support vectors corresponding to isolated samples have minimal impact on the construction of the optimal classification hyperplane, and sample points near the optimal hyperplane can be correctly divided.
5 Conclusion

(1) In this paper, the gray correlation analysis method is used to analyze the influencing factors and select appropriate indicators, and a coal and gas outburst prediction model based on FSVM is established. The training results on actual data samples show that the model can correctly predict the danger of coal and gas outbursts in mines. Comparison with the prediction results of the BP and PSO-SVM models shows that the FSVM-based prediction model is more accurate.

(2) The traditional SVM algorithm treats all samples equally. When the sample set contains many noise or isolated points, the algorithm cannot distinguish them from the support vectors, so the obtained classification surface is not the true optimal one and accuracy declines. When dealing with noise or isolated points, FSVM reduces their impact on the determination of the optimal classification surface by assigning them smaller membership degrees while assigning larger membership degrees to the support vectors. The support vectors are thus guaranteed to determine the optimal hyperplane, so the classification surface obtained by FSVM is superior to that of the general SVM and the classification accuracy is higher.
Acknowledgements. Supported by National Key R&D Program of China under grant No. 2017YFF0205500.
References

1. Zhai H, Li Z (2008) Application of neural network in the coal and gas outburst prediction sensitive indicators. J Henan Polytechnic Univ (Nat Sci) 4:381–385
2. Li X (2018) Study on optimized grey neural network. North China University of Water Resources and Electric Power
3. Kang H, Jiang J, Yang S, Zhang Y, Cao Y (2017) Diagnosis on failure of onboard equipment based on rough set neural network. Chin Railways 5:67–71
4. Yuan J (2018) Analysis of China's energy situation and energy policy. China Market Mag 7:31+35
5. Zhang Q, Pu Y (2018) Research on dynamic prediction technology of coal and gas outburst in high-yield and high-efficiency mine. Coal Sci Technol 46(10):65–72
6. Yang R, Chen C, Li H (2016) The simulation research of prediction model in coal and gas outburst. Comput Simul 33(09):435–439
7. Jia S (2014) The research and application of the edge detection algorithm based on grey system theory. Donghua University
8. Zhao X (2015) Study on the regional prediction about the outburst of coal and gas in Nanshan coal mine. Hunan University of Science and Technology
Influencing Factors of Gas Emission in Coal Mining Face

Zhou Zhou1, Fan Shi2, Yang Zhang3, Yanjuan Yu3, and Shoufeng Tang2(B)

1 Special Equipment Safety Supervision Inspection Institute, Jiangsu Province, China
2 China University of Mining and Technology, Xuzhou, China
[email protected]
3 Xuzhou Comprehensive Center for Inspection and Testing of Quality and Technical Supervision, Xuzhou, China

Abstract. Most coal mines in China lie below the surface, so the construction of mines plays an important role in coal mining and production. Coal mining work is closely connected with gas: the deeper the mine, the greater the gas emission. In the process of coal seam mining, once the gas cannot be discharged in time, gas outburst and gas explosion accidents are prone to occur. Therefore, understanding the laws and influencing factors of gas emission is the premise of safe gas operation and the basis of mine ventilation design. Knowing the distribution rules and influencing factors of gas emission, we can design a more reasonable ventilation system and prevent dangerous gas accidents in mines in advance.

Keywords: Mine · Gas · Emission influence · Factor law
1 Introduction

In the process of coal formation, anaerobic bacteria produce gas while decomposing the original parent material of coal, humic organic matter [1]. Due to the continuous sinking of sediments in the peat layer, the burial depth gradually increased. Over a long time in an environment of high temperature and high pressure, through combined physical and chemical effects, the volatile matter in the coal decreased and the fixed carbon increased, while a large amount of gas formed; thus, the higher the degree of coalification of the sediment, the more gas is produced. Gas is lighter than air, diffuses easily, and has strong permeability; it readily releases from adjacent layers through the rock strata into the goaf. Gas itself is non-toxic, but it cannot be breathed by people: when the gas concentration in mine air exceeds 50%, it can cause suffocation due to lack of oxygen.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_35
Through years of experiments, coal industry experts determined that the main components of gas are alkanes, of which CH4 accounts for the largest proportion [2]. The maximum allowable concentrations of harmful gases in the mine specified in the coal mine safety regulations are listed in Table 1.

Table 1 Maximum allowable concentration of harmful gases in the mine

Gas  | Maximum allowable concentration
CO   | 0.0024%
H2S  | 0.00066%
SO2  | 0.0005%
NO2  | 0.00025%
CH4  | 0.0005%
NH4  | 0.00025%
H2   | 0.5%
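The limits of Table 1 can be encoded as a simple lookup for checking measured air samples. A trivial sketch: the dictionary layout and the sample readings are illustrative, not from any real mine monitoring system.

```python
# Maximum allowable concentrations from Table 1, in percent by volume.
LIMITS = {"CO": 0.0024, "H2S": 0.00066, "SO2": 0.0005, "NO2": 0.00025,
          "CH4": 0.0005, "NH4": 0.00025, "H2": 0.5}

def gases_over_limit(measured):
    """Return the gases whose measured concentration (%) exceeds its
    allowable maximum; unknown gases are ignored."""
    return sorted(g for g, c in measured.items()
                  if c > LIMITS.get(g, float("inf")))
```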
When gas encounters an open flame or a mechanical collision, a gas explosion may occur, directly threatening the lives of underground workers and the normal operation of equipment. Underground, when the CH4 content exceeds 5%, there is a danger of gas explosion. In the coal seam, the occurrence of gas is closely related to the internal structure of the coal. While organic matter turns into coal, geological movement continues and generates many cracks and pores of different sizes inside the coal, making them excellent places for gas to accumulate. In the process of coal seam mining, a great deal of gas is emitted and, carried by the underground airflow, pollutes the underground environment. Gas exists in coal in two states, free and adsorbed: the free state accounts for about 80–90% and the adsorbed state only about 10–20%, indicating that most of the gas is in the free state [1–3].
2 Main Factors Affecting Gas Emission

Gas is one of the main causes of mine disasters and seriously threatens the safety of workers and equipment. The gas emission we refer to corresponds to the entire mine [3]; for a single coal seam, we call it the gas emission of that seam. Many gas emission data indicate that the distribution of gas emission in mines is extremely uneven [4]: the emission differs between coal seams in the same mine and between sections of the same coal seam. It is inseparable from the geological conditions and related to the mining technology of the mine. Therefore, the main factors that determine the amount of gas emission are natural geological conditions and mining technology factors.

Geological structure. The geological structure can not only change the permeability of the surrounding rock of the coal seam, but also affect the morphology and structure of
the coal seam. The migration, distribution, and occurrence of gas are largely controlled by the fault structure. Tensional and tension-torsional faults are usually open faults that mainly release gas flow; near such faults the gas content is significantly reduced. Compression-torsional and compressional faults are closed faults that mainly seal coal seam gas, so in such areas the gas content is relatively high.

Connection between coal seam gas occurrence and burial depth. The escape and preservation of gas depend to a large extent on the overlying strata, the most important geological condition affecting gas content. The deeper the burial, the greater the in situ stress, which reduces the permeability of the coal seam and surrounding rock; in addition, the distance the gas must migrate to reach the surface becomes larger. These factors all favor the preservation of gas. Generally, the gas content and pressure of a coal seam are directly proportional to the burial depth. The deeper the burial, the higher the degree of coal metamorphism and the greater the amount of gas produced. Moreover, the increase in burial depth also increases the ground stress and makes the gas sealing conditions more complete; these conditions likewise favor the preservation of gas [5].

Mining sequence and method. When multiple seams, or a thick seam mined in slices, are extracted in close proximity, the first-mined seam emits a large amount of gas, and gas from the adjacent seams and other slices continuously flows into its mining space. The range of roof loosening and collapse is large, so the full caving method should be adopted when managing the roof; at this time the gas emission is large. When the filling method is used, the gas emission is smaller.
When many coal pillars are left in the mining interval, more coal is lost at the working face, which reduces the recovery rate and increases the gas emission. When the working face undergoes the first or a periodic roof weighting, the gas emission increases accordingly; compared with the mine's normal production state, it can increase by about 60%.

Goaf management. The goaf is the main source of gas and has a high gas concentration. When the sealing of the goaf is poor and there is a large wind pressure difference across the return airway, a large leaking air flow carries out more gas and increases the gas emission. As shown in Fig. 1, the gas emitted from the coal wall of the working face, the broken coal at the roof, and the mined coal is taken directly away by the wind in the working face, while most of the gas from the goaf and the adjacent layers pours first into the mined-out area, which is called gas gushing of the mined-out area [4].
3 Source of Mine Gas

Within the scope of the coal mining yard, the gas source refers to the origin of the gas escaping in the mine [6]. Part of the gas in the coal mining face comes from the coal wall, the mined coal blocks, and the adjacent layers, and part originates from the goaf. The main sources of gas are listed in Table 2.

Gas in the coal wall. There are many pores and fissures of different sizes in the coal seam, through which gas can continuously flow. As coal mining progresses, the original gas pressure equilibrium in the coal seam is destroyed, so that the
Fig. 1 Gas composition in goaf

Table 2 Main sources of gas

q1 | Coal wall
q2 | Coal wall when the top frame is broken
q3 | Mined coal
q4 | Falling coal
q5 | Goaf
q6 | Adjacent layer
methane pressure changes, the overall pressure is out of balance, the gas moves, and the permeability of the coal seam is enhanced, which eventually leads to a sharp gas flow. Many tests show that the coal wall gushing intensity has a definite relationship with time, as shown in Fig. 2: the intensity of gas emission gradually decreases with the passage of time. Under certain conditions, the coal wall gas content can be expressed in terms of exposure time, as shown in Eq. 1:

W = W1 (1 + t)^(−n)    (1)

In the formula: W is the remaining gas content of the coal wall, m3/t; W1 is the initial gas content of the coal wall, m3/t; n is the gas diffusion velocity coefficient of the mined coal, min−1.

Gas in the mined coal. The mined coal can transmit and diffuse gas; the process by which broken coal releases gas is quite complicated and long. The rate of gas diffusion
Fig. 2 Relationship of coal wall gushing intensity with time
is closely related to the particle size of the coal block, and to the temperature, humidity, and atmospheric pressure of the coal mining face [7]. The smaller the coal block, the faster the gas emission rate. The time required for coal blocks of different particle sizes to release 90% of their gas is shown in Table 3.

Table 3 Time required to release 90% of gas from coals of different sizes

Coal sample size | 10−6 m | 10−5 m | 10−3 m | 1 mm   | 1 cm     | 1 m
Discharge time   | 40.6 s | 1 min  | 100 h  | 1 year | 15 years | 150,000 years
Note Quoted from the European Community Coal Mining Industry Manual (coal mine gas drainage)
When the coal is transported to the ground, most of the gas has been released:

W = W1 × (100 − P) / 100    (2)

In the formula: W is the gas content of the coal transported to the surface, m3/t; W1 is the original gas content of the coal seam, m3/t; P is the percentage of gas already emitted from the coal.

Gas from adjacent layers. The gas emitted from the adjacent layers accounts for a large proportion of the total mine gas emission [4]. At an underground working face, if the goaf leaks air, gas flows from the adjacent layers into the goaf, and part of this flow returns through the goaf to the coal mining face, further increasing its gas emission. Therefore, it is necessary not only to prevent air leakage in the mined-out area but also to reduce the gas content of the adjacent layers. Mine experience shows that the thicker the adjacent layer, the greater the gas emission, namely:

Q = Q1 × n × η / m    (3)

In the formula: Q is the adjacent-layer release, m3/min; Q1 is the coal wall release, m3/min; n is the total thickness of the adjacent layers, m; η is the gas emission rate of the i-th adjacent layer; m is the mining layer thickness, m.
Goaf gas. The gas in the mined-out area mainly comes from the coal mining face, the mined coal blocks, and the remaining coal pillars. The proportion of goaf gas emission in the total gas emission is also related to the size of the goaf, the design of the ventilation system, and the management of gas [8]. Assuming that the residual coal in the goaf is uniformly distributed and using V to represent its CH4 gushing intensity, the CH4 gushing amount of the coal remaining in the goaf, taken opposite to the advancing direction, can be calculated as:

Q = ∫_0^(a+b) V dA    (4)

In the formula: Q is the relative gas emission from the goaf, m3/min; V is the gas release intensity of the remaining coal, m3/(t·min); a is the length between the coal wall and the support, m; b is the fluctuation range of the width in the advancing direction, m.
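The emission formulas of this section translate directly into code. A sketch under stated assumptions: the goaf integral is evaluated numerically with V supplied as a function of position A, and all numeric inputs in the usage below are illustrative rather than measured values.

```python
import numpy as np

def coal_wall_residual(W1, t, n):
    """Remaining gas content of the coal wall: W = W1 * (1 + t)^(-n)."""
    return W1 * (1.0 + t) ** (-n)

def transported_gas(W1, P):
    """Gas content of coal at the surface: W = W1 * (100 - P) / 100."""
    return W1 * (100.0 - P) / 100.0

def adjacent_layer_emission(Q1, n, eta, m):
    """Adjacent-layer release: Q = Q1 * n * eta / m."""
    return Q1 * n * eta / m

def goaf_emission(V, a, b, steps=1000):
    """Goaf release Q = integral of V dA over [0, a + b], with V a
    function of position A (midpoint rule)."""
    dA = (a + b) / steps
    A = (np.arange(steps) + 0.5) * dA      # midpoints of each sub-interval
    return float(np.sum(V(A)) * dA)
```

For a uniform residual-coal intensity V the integral reduces to V × (a + b), matching the constant-distribution assumption made above.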
4 Conclusion

China is a major coal producer and the country with the worst coal mine disasters in the world. Among mine disasters, gas disasters occur frequently and have the greatest impact on mining work. Through research, this article explains the causes of mine gas formation, the main components of face gas, and the main factors that influence the gas concentration. It provides an important theoretical basis for the prevention and control of mine gas, which can prevent the accumulation of gas and avert accidents. The gas in the goaf of a high-gas face usually accounts for a large part of the gas emission of the entire stope, up to 70%; a large amount of gas leaking from the goaf continuously flows into the face, resulting in gas accumulation and explosion danger. Therefore, how to control the mined-out area is a major problem in mining that urgently needs to be solved.

Acknowledgements. Supported by National Key R&D Program of China under grant No. 2017YFF0205500.
References

1. Yin B, Zhang W, Lu W et al (2017) Study on the dynamic relationship between the characteristics of coal spontaneous combustion and gas explosion in goaf. Coal Eng 49(5):99–102
2. Yong W (2018) Study on the law of expelling gas by hydraulic fracture of gas-bearing coal seam. China University of Mining and Technology, Beijing
3. Liu Y, Yuan L, Xue J et al (2018) Analysis on the occurrence regularity of coal mine gas disaster accidents in China from 2007 to 2016. Min Safety Environ Prot 45(3):124–128
4. Gao Y, Luo K, Zhang J (2018) Status and development prospect of intelligent mining in fully mechanized coal mining face. Energy Environ Prot 40(11):167–171
5. Ding B (2017) Characteristics of main disasters and prevention measures for coal mines in China. Coal Sci Technol 45(5):109–114
6. Xu L, Li X (2018) Mine disaster early warning model based on big data. Coal Mine Saf 49(3):98–101
7. Tan G (2015) Construction and application of mine ventilation gas disaster early warning platform based on information technology. Coal Technol 34(3):338–340
8. Wang G, Fan J (2018) Progress and prospect of key technology innovation in intelligent coal mining. Ind Mine Autom 44(2):5–12
Study on Gas Distribution Characteristics and Migration Law Under the Condition of Air Flow Coupling

Yanjuan Yu1, Huashuo Li2, Yang Zhang1, Xuguang Jia1, Fan Shi2, Yongxing Guan2, and Shoufeng Tang2(B)

1 Xuzhou Comprehensive Center for Inspection and Testing of Quality and Technical Supervision, Xuzhou, China
2 China University of Mining and Technology, Xuzhou, China
[email protected]
Abstract. When the gas concentration at the coal face exceeds the safe upper limit, it greatly affects safe coal mining and endangers people's lives and property. Therefore, it is of great practical significance to study the gas distribution characteristics and the law of gas migration, which can provide more complete theoretical guidance for coal mine ventilation management and gas prevention and control. This paper first studies the distribution of wind speed in the mine and the forms of gas movement under the condition of air flow coupling, and on this basis discusses the distribution law of gas concentration on each observation surface, the distribution of gas in the upper corner, and the overall variation of gas under different wind speeds.

Keywords: Wind speed distribution · Wind coupling · Gas distribution
1 Introduction

At present, research on gas prevention and control is being carried out at home and abroad. Gas safety is crucial to coal mining, and the study of gas distribution characteristics and migration laws is of great practical significance, as it provides more complete theoretical guidance for coal mine ventilation management and gas prevention and control. We need to discuss the forms of gas movement under different wind speeds and air flow coupling. According to current research at home and abroad, there are four main forms of gas movement under air flow coupling: gas diffusion, turbulent gas diffusion caused by irregular turbulence pulsation, convective gas migration, and gas dispersion [1–4]. All of them affect the gas distribution and migration law to different degrees, and mastering this basic knowledge provides the fundamental theoretical basis for the following research.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_36
2 Analysis of Gas Distribution Under the Condition of Air Flow Coupling

2.1 X-Axis Numerical Analysis

Between the goaf and the coal wall of the coal mining face, three data observation planes I, II, and III are set up at intervals of 2 m, each parallel to the X-axis, as shown in Fig. 1.
Fig. 1 Arrangement of the data observation plane on the X-axis
On each observation plane, a wireless sensor node is laid out every 5 m, with the first wireless gas sensor node of plane I, near the intake airway, set at (0, 0), so that three sets of data can be measured. The gas concentration waveforms of observation surfaces I, II, and III are shown in Fig. 2a–c, respectively [5, 6].
0.5
0.9
0.45
0.8 0.7
0.4
0.6
0.35
0.35
0.30
0.3
0.4
0.25
0.3
0.5
0.25
0.2 0.20
0
20
40
60
80 (a)
100 120 140
0.2
0
20
40
60
80
100 120 140
(b)
0.1
0
20
40
60
80
100 120 140
(c)
Fig. 2 I, II, III observation
The gas concentration curve of observation surface I (Fig. 2a) first increases and then decreases, maintains a slowly increasing trend from about 15 m to 100 m, and finally increases and then decreases again, presenting a wavy shape [7]. Because the wind enters the working face from the intake airway and turns through about 90°, a small weak-flow zone forms under the combined effect of the goaf and the return airway. Within this small zone gas migrates by turbulent diffusion; convective migration and collisions occur between the wind flow and the gas flow, which exert inertial and viscous resistance on each other. Under these forces the gas is partly diluted, but the concentration here remains higher than in the surroundings, so the curve first increases and then decreases. Between 15 m and 100 m the air flow is relatively stable and continuously dilutes the gas, so the concentration rises slowly as the flow continues [8]. From 100 m to the return airway, the change of the air flow angle forms a second small weak-flow zone on the observation surface, so the concentration again increases first and then decreases [9].

The gas concentration curve of observation surface II (Fig. 2b) shows a continuously increasing trend. This plane lies in the middle of the coal mining face, where the wind speed is uniform and free of turbulence, so the gas is evenly diluted. The gas here comes not only from the coal wall, the caved roof, and the mined coal, but also from goaf emission caused by air leaking from the working face into the goaf; therefore, the concentration increases only slowly over this section.

The gas concentration curve of observation surface III (Fig. 2c) starts at 0 m, where the goaf gas emission and the affected wind-flow area are small, and it increases slowly from 0 m to 110 m.
Between 110 m and the return air lane, at the junction of the goaf and the return air lane, the airflow angle changes by 90°, forming a small weak-flow zone; combined with the gas leaking back out of the mined-out area at this junction, an eddy accumulation phenomenon occurs, and the gas concentration here is higher than elsewhere. Therefore, between 110 m and the return air lane, the gas concentration increases significantly [10].

2.2 Y-Axis Numerical Analysis

The five sensors of the same color shown in Fig. 3 were used as one gas observation surface; five observation surfaces were arranged in total. The five gas readings on each observation surface were grouped together, and their gas concentration distribution fitting curves were drawn with the Surfer software, as shown in Fig. 4. From Fig. 4, the gas concentration on the coal-wall side of observation surfaces no. 1 and no. 2 is higher than that on the goaf side and decreases slowly, with a relatively smooth decreasing range. Since this section is close to the intake air lane, much of the gas is carried away by the wind, and the airflow leaks into the goaf from the coal mining face. Observation surfaces no. 3 and no. 4 are symmetrical left and right, low in the middle and high on both sides. This is mainly because, over the mine cross section, the wind speed is greatest in the middle and decreases toward the boundaries; near the coal wall the wall impedes the fluid, the flow velocity is very low, the inertial force is small, and the flow remains laminar. Hence, on observation surfaces no. 3 and no. 4, the gas concentration is lowest in the middle and higher on both sides. On observation surface no. 5, the gas concentration on the goaf side is higher than on the coal-wall side and rises rapidly over a large range [11–13].
Y. Yu et al.
Fig. 3 Mining face

Fig. 4 Gas distribution in the Y-axis direction (gas concentration (%) vs. working face width direction (m); measuring sections 1#–5#)
2.3 Gas Analysis on the Vertical Surface of the Airflow Direction

According to fluid dynamics, the iso-concentration diagram of the lower cut surface of the mine can be obtained, as shown in Fig. 5. It can be concluded from Fig. 5 that the velocity is highest in the middle and decreases toward the boundary; at the wall surface, the wall impedes the fluid. The gas concentration near the wall is therefore relatively high, while the wind speed in the middle is relatively large and the gas concentration there is relatively low; this is why the gas concentration contours form circles whose central values are small. In addition, because the gas density is relatively low, the gas rises and accumulates, and the gas concentration above changes quickly; therefore, the gas contours are dense above and sparse below.
Fig. 5 Contour map of gas concentration on section
2.4 Gas Concentration Analysis in the Upper Corner

The gas concentration at the upper corner is the highest on the whole coal face. Figure 6 shows the distribution of the wireless gas sensor measuring nodes in the upper corner experiment.
Fig. 6 Upper corner sensor distribution
According to the experimental data of gas concentration in the upper corner under the condition of airflow coupling, a MATLAB fitting curve was obtained, as shown in Fig. 7. In this section, since the density of gas is smaller than that of air and its diffusion rate is 1.34 times that of air, it easily rises and accumulates; in addition, the air leakage carries a large amount of gas from the goaf, so the gas concentration is relatively high. Segment ➂ is the eddy zone, where the wind speed and fluctuation velocity reach larger values. Because the upper corner is effectively a dead zone, external airflow does not enter and the vortex flow does not leave, so the high-concentration gas in segment ➂ cannot move into segments ➀ and ➄. Segments ➀, ➂, and ➄ exchange with segments ➁ and ➃ almost vertically, and through pulsating diffusion and convective migration the gas concentration in segment ➂ decreases only slowly.
Fig. 7 Distribution of gas concentration
The main air current appears in segment ➄: because the wind speed is largest in the middle and relatively small on both sides, the gas concentration there drops somewhat faster. The convective migration between the eddy zone and the main flow zone can dilute the high-concentration gas in the upper corner; to deal with this high concentration, ventilation equipment can be installed in the upper corner to remove the vortex area. The reasons for vortex accumulation in the upper corner of the return air are as follows: (a) the density of gas is smaller than that of air, and its diffusion rate is 1.34 times that of air, so it easily rises and accumulates; (b) the wind speed at the upper corner is low and prone to forming a vortex, which makes it difficult for the gas from the goaf to flow out into the return air lane; (c) the gas emitted from the goaf makes the gas concentration in the upper corner relatively high.
3 Influence of Wind Speed on Gas Concentration Distribution in the Coal Face

Both airflow and gas are present at the working face. If the gas velocity and the airflow velocity are not the same, there is relative motion between the gas flow and the wind. At the boundary between the two gases, the molecules undergo random thermal motion and collide with one another. Because the wind moves faster, molecules from the air velocity layer move into the gas velocity layer and transfer their surplus momentum to the gas molecules; the gas molecules are subjected to a forward shear stress and accelerate, while the air molecules are subjected to a backward force and slow down. In the gas fluid, this kinetic-energy exchange produces inertial resistance, and viscous resistance also exists, which makes the gas flow along the wind direction. Therefore, the sparser the contour lines, the smaller the gas gradient. In the coal mining face, the higher the wind speed, the smaller the gas concentration at each point, while the overall distribution
law is the same. The contour maps of gas distribution on the coal mining face after coal cutting at different wind speeds are shown in Fig. 8.
Fig. 8 Contour of gas distribution on coal face before coal cutting under different wind speed
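The inverse dependence of gas concentration on wind speed described above can be illustrated with a simple steady-state dilution balance. This sketch is not from the paper; the dilution formula c = q / (v · A) and the emission-rate and cross-section values below are assumptions for illustration only.

```python
# Illustrative steady-state dilution balance (not from the paper): a constant
# gas emission q into airflow of speed v through cross-section A gives a
# fully mixed concentration c = q / (v * A), so c falls as v rises.
def gas_concentration(q_m3_per_s, v_m_per_s, area_m2):
    return q_m3_per_s / (v_m_per_s * area_m2)   # volume fraction

# assumed numbers: emission 0.02 m^3/s, cross-section 10 m^2
for v in (1.0, 2.0, 4.0):
    print(v, gas_concentration(0.02, v, 10.0))
```

Doubling the wind speed halves the fully mixed concentration in this idealized balance, consistent with the qualitative trend reported in the paper.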
4 Conclusion

Gas is a major factor that seriously threatens the safety of coal mine production. Therefore, the clearer the distribution characteristics of gas in the roadway, the more accurately gas control measures can be implemented. This paper first studied the distribution of wind speed and the form of gas movement under mine airflow conditions, providing a theoretical basis for exploring gas migration at various levels and for better understanding the specific distribution of gas. Then, the gas concentration data on observation surfaces of different dimensions of the coal face were analyzed, revealing the distribution law of gas concentration on those surfaces, the gas distribution at the upper corner, and the general variation of gas concentration under different wind speeds. This provides intuitive and targeted practical value for gas treatment.

Acknowledgements. Supported by the National Key R&D Program of China under grant No. 2017YFF0205500.
References

1. Luo J (2020) Application of prevention and control measures for corner gas on U-shaped mining face. Build Mater Decoration 2:214–215
2. Ang Y (2019) Study on corner gas treatment of fully-mechanized mining face with large mining height. Shandong Coal Sci Technol 11:110–112+115
3. Huang C, Sun L (2008) Study on gas distribution and extraction technology in goaf under different wind speeds. Shanxi Coking Coal Sci Technol 42(7):4–7
4. Yan J, Meng Z, Zhang K, Yao H, Hao H (2020) Pore distribution characteristics of various rank coals matrix and their influences on gas adsorption. J Pet Sci Eng 189
5. Yan L, Cao Q, Zhang Q (2019) Distribution characteristics and main controlling factors of oil and gas in the Bohai Sea Area. J Coast Res 98(SI)
6. Zhu J, Zhang R, Zhang Y, He F (2019) The fractal characteristics of pore size distribution in cement-based materials and its effect on gas permeability. Sci Rep 9(1)
7. Zhao S, Zhao H (2011) Numerical model of influence law of wind speed on gas dilution in working face. Metal Mine 2:25–27
8. Guo X, Wu X (2011) Application of software in structural surface change trend research. Green Sci Technol 7:215–217
9. Xu K (2007) Numerical simulation of airflow field and gas distribution in local ventilation heading face. Henan Polytechnic University, Jiaozuo City, Henan Province
10. Zhai C (2008) Coupling law of mining-induced fracture field and gas flow field in close coal seam group and research on prevention and control technology. China University of Mining and Technology
11. Liu JJ, Zhang BH (2005) Fluid mechanics. Peking University Press, Beijing, pp 72–104 (in Chinese)
12. Jing C (2005) Study on gas migration law of high gas fully mechanized mining face and goaf. Shandong University of Science and Technology, Qingdao
13. Yang Y (2017) Study on the air-gas coupling characteristics of roadway sections of different shapes. Taiyuan University of Technology
Research on Channel Coding of Convolutional Codes Cascading with Turbo Codes

Chong-Yue Shi, Hui Li(B), Jie Xu, Qian Li, Hou Wang, and Liu-Xun Xue

College of Information and Communication Engineering, Hainan University, Haikou 570228, China
[email protected], [email protected]

Abstract. In this paper, we studied the coding and decoding principles of convolutional codes and turbo codes and verified the coding/decoding process and bit error rate performance of the two codes. Under different channel conditions (AWGN and Rayleigh fading channels), with different code rates (1/2 and 1/3), convolutional codes with different constraint lengths (3 and 7) and turbo codes are simulated and compared. The simulation results show that the performance of both codes in the AWGN channel is better than that in the Rayleigh channel; in both channels, the performance of both codes at the smaller code rate is better than that at the larger code rate; and the performance of turbo codes improves as the constraint length increases. The performance of the convolutional code degrades with increasing constraint length when the SNR is small, but improves with increasing constraint length when the SNR is large.

Keywords: Channel coding · Convolutional code · Viterbi algorithm · Turbo code · MAP algorithm
1 Introduction

Before the 1940s, it was thought that errors in communication could be reduced only by increasing the transmission power and by retransmission. Convolutional codes promoted the development of coding technology, and the performance of wireless communication improved dramatically over the following decade. In the 1990s, the French electrical engineers C. Berrou and A. Glavieux invented a coding method, the turbo code [1], whose efficiency is close to the Shannon limit. The turbo code became the coding technology used by 3G/4G mobile communication, and it is still in use in 4.5G/5G today [2, 3]. A characteristic of turbo codes is their low coding complexity, which meets the development requirements of 5G mobile communication technology [5, 6]; therefore, turbo codes play an important role in 5G mobile communications [7]. In 5G communication, the polar code in the channel coding scheme is a research hotspot, and simulation results show that the systematic polar code has better bit error rate performance than the non-systematic polar code [6]. This design mainly discusses the performance comparison of convolutional codes and turbo codes.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_37
2 Convolutional Code

2.1 The Encoding Principle of Convolutional Code

Convolutional code is a kind of channel coding with superior performance. Its encoder and decoder are easy to implement, and it has strong error correction ability [8–10]. Convolutional codes can be represented by (n, k, m), where "n" is the length of the encoder output code element, i.e., the encoding length; "k" is the length of the encoder input code element; "m" is the constraint length, i.e., the number of encoder shift registers; and R (R = k/n) is the code rate [8]. Figure 1 is the (2,1,3) convolutional code encoding block diagram. Assume the sequence (1101) enters Fig. 1 from the left, with the initial states of "b_i, b_{i-1}, b_{i-2}" all 0. The first code word entered is "1"; it is shifted into "b_i," the code word in "b_i" moves to "b_{i-1}," and so on. The adders perform XOR operations, so the first encoded output is "11." It can be concluded that the convolutional code output is related not only to the current input, but also to the previous (N − 1) information bits. From Fig. 2, we can conclude that, with other conditions unchanged, the lower the code rate, the better the performance, and the longer the constraint length, the better the performance.
Fig. 1 Convolutional code (2,1,3) encoding block diagram
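The shift-register encoding just described can be sketched in a few lines of Python. The generator taps (octal 7, 5) are an assumption for illustration, since the text only states that the first output for input 1101 is "11"; the paper's Fig. 1 may wire the adders differently.

```python
# Sketch of a (2,1,3) convolutional encoder: two XOR adders over the current
# bit and two shift registers. Generator taps (octal 7, 5) are assumed.
def conv_2_1_3(bits):
    b1 = b2 = 0                          # b_{i-1}, b_{i-2}, initially 0
    out = []
    for bi in bits:
        out += [bi ^ b1 ^ b2, bi ^ b2]   # outputs of the two XOR adders
        b1, b2 = bi, b1                  # shift the registers
    return out

print(conv_2_1_3([1, 1, 0, 1]))  # -> [1, 1, 0, 1, 0, 1, 0, 0]; first pair "11" as in the text
```

The first output pair depends only on the current bit (registers are zero), which is why any generator set with both taps on b_i reproduces the "11" of the worked example.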
2.2 Decoding for Convolutional Codes

Viterbi decoding is the optimal algorithm, i.e., the maximum likelihood decoding algorithm that minimizes the decoded bit error rate (BER) [1]. The Viterbi decoding algorithm was proposed by Viterbi in the late 1960s. Its basic principle is comparison: the received information sequence is compared with all sequences that may have been transmitted, and the one with the shortest Hamming distance is selected as the transmitted sequence [2]. Normally, decoding can be carried out through a tree diagram, a grid (trellis) diagram, or a state diagram. Take the grid diagram as an example. Suppose the input sequence is M = [1101]; after encoding by the (3,1,3) convolutional encoder, the theoretical output is C = [111110010100], which is decoded by the Viterbi algorithm with the minimum Hamming distance as the criterion. According to Fig. 3, the possible paths are as follows. The first output is "111"; the output corresponding to the path from the starting point "a" to "b" is "111," and the Hamming distance between this path's output and the actual output is 0. The output corresponding to the path from the starting point "a" to "a" is "000," the
Fig. 2 Comparison of convolutional codes with different code rates at the same constraint length and with different constraint lengths at the same code rate
Hamming distance between the output corresponding to this path and the actual output is 3. We choose the route with the smallest Hamming distance. The dotted line in the figure represents "1," and the solid line represents "0," so the first input is "1"; in the same way, the other inputs can be obtained.
Fig. 3 Grid diagram of convolutional code (3,1,3)
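The search over the grid (trellis) described above can be sketched in Python. The generator set ([1,0,0],[1,0,1],[1,1,1]), i.e., (4, 5, 7) in octal, is an assumption chosen because it reproduces the stated example M = [1101] -> C = [111110010100]; the paper's actual (3,1,3) encoder may differ.

```python
# Hedged sketch: (3,1,3) encoder plus hard-decision Viterbi decoding that
# keeps, per state, the survivor path with the smallest Hamming distance.
G = [[1, 0, 0], [1, 0, 1], [1, 1, 1]]  # taps on (current bit, b_{i-1}, b_{i-2}); assumed

def branch_output(b, state):
    # Output triple for input bit b leaving register state (b_{i-1}, b_{i-2}).
    return [sum(g * w for g, w in zip(gen, (b,) + state)) % 2 for gen in G]

def encode(bits):
    state, out = (0, 0), []
    for b in bits:
        out += branch_output(b, state)
        state = (b, state[0])
    return out

def viterbi(received, n=3):
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    metric = {s: (0 if s == (0, 0) else float("inf")) for s in states}
    paths = {s: [] for s in states}
    for t in range(0, len(received), n):
        seg = received[t:t + n]
        new_metric = {s: float("inf") for s in states}
        new_paths = {}
        for s in states:
            if metric[s] == float("inf"):
                continue
            for b in (0, 1):                      # hypothesise the input bit
                d = metric[s] + sum(o != r for o, r in zip(branch_output(b, s), seg))
                nxt = (b, s[0])
                if d < new_metric[nxt]:
                    new_metric[nxt], new_paths[nxt] = d, paths[s] + [b]
        metric, paths = new_metric, new_paths
    return paths[min(states, key=lambda s: metric[s])]

assert encode([1, 1, 0, 1]) == [1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
assert viterbi(encode([1, 1, 0, 1])) == [1, 1, 0, 1]
```

Because the first assumed generator is systematic (octal 4), distinct inputs always produce distinct codewords, so the zero-distance survivor path is unique here.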
3 Turbo Code

3.1 Encoding of Turbo Code

The turbo code encoder consists of two component encoders, an interleaver, a puncturing matrix, and a multiplexer [3]. Usually, there are only two component encoders in an encoder, and the best choice for the component codes is RSC. Figure 4 is a structure diagram of a turbo code encoder with an RSC generating polynomial.
Fig. 4 Structure of encoder for turbo codes
Take the encoder in Fig. 4 as an example, with {u_k} = [1011001]; the systematic output is {c_s} = [1011001]. The parity output of {u_k} after RSC encoding is obtained as follows: the first code word of the input sequence is "1," the initial states of shift registers 1 and 2 are both "0," and the XOR of the three code words "1," "0," and "0" gives a_k = 1; then a_k is XORed with the code word in shift register 2 to give the first output bit of {c_1p}, which is "1." In the same way, the other output bits can be obtained. The role of the interleaver is to disrupt the ordering of the symbols within the input information sequence; the purpose is to make the turbo code close to pseudo-random coding [4]. From the results in Fig. 5, we can see that for the turbo code, when other conditions are the same, the lower the code rate, the lower the bit error rate; and the longer the constraint length, the lower the bit error rate.
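The register updates just described can be sketched as follows. The feedback and parity taps are assumptions consistent with the worked example (registers cleared, a_k = u_k XOR s1 XOR s2, parity = a_k XOR s2), not necessarily the exact generator polynomial of Fig. 4.

```python
# Hedged sketch of the RSC component encoder described in the text: the
# feedback bit a_k is the XOR of the input with both registers, and the
# parity bit is a_k XOR s2. The exact polynomial of Fig. 4 is an assumption.
def rsc_parity(u):
    s1 = s2 = 0
    parity = []
    for uk in u:
        ak = uk ^ s1 ^ s2          # XOR of the three code words (feedback)
        parity.append(ak ^ s2)     # parity output bit of {c_1p}
        s1, s2 = ak, s1            # shift the registers
    return parity

print(rsc_parity([1, 0, 1, 1, 0, 0, 1])[0])  # first parity bit is 1, as in the text
```

The systematic branch simply copies the input, so only the parity branch needs the recursion.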
Fig. 5 Simulation of turbo codes under the same code rate with different constraint lengths and the same constraint length with different code rates
3.2 Decoding for Turbo Codes

The superior performance of turbo codes is mainly due to their iterative decoding structure [9]. Figure 6 shows the decoding structure of the turbo code, which is mainly composed of the component decoders, the interleaver, and the de-interleaver, etc. The purpose of the de-interleaver is to restore the sequence shuffled by the interleaver to its previous order. The complexity of the maximum likelihood decoding algorithm grows with the length of the input data: the longer the length, the more complex and difficult it is to implement.
Fig. 6 The decoder structure of turbo codes
Turbo codes with maximum likelihood decoding require that each packet not be too long; iterative decoding solves this problem by using the principle of repeated feedback and dividing the long input code into several steps. The decoding algorithm of the turbo code adopts the MAP algorithm. The iterative decoding process is as follows: at first, $L_c r_l^{(0)}$, $L_c r_l^{(1)}$, and $L_a(u_l^{(1)})$ (the initial value of $L_a(u_l^{(1)})$ is 0) enter component decoder 1; the output after decoding is the extrinsic information $L_e(u_l^{(1)})$. After being shuffled by the interleaver, $L_e(u_l^{(1)})$ becomes $L_a(u_l^{(2)})$, and the interleaved version of $L_c r_l^{(0)}$ is obtained by passing $L_c r_l^{(0)}$ through the interleaver. $L_a(u_l^{(2)})$, the interleaved $L_c r_l^{(0)}$, and $L_c r_l^{(2)}$ enter the next decoder together. After decoding by component decoder 2, we get the extrinsic information $L_e(u_l^{(2)})$; after passing through the de-interleaver, it is sent to component decoder 1 as $L_a(u_l^{(1)})$ for the next round of decoding. These steps repeat for several iterations until the two pieces of extrinsic information approach a steady state; by making decisions on $L^{(1)}(u_l)$ and $L^{(2)}(u_l)$, we obtain the result of the whole decoding process [5]. From the simulation results in Fig. 7, it can be seen that turbo codes perform better than convolutional codes when the code rate, channel, and constraint length are all the same. The performance of both the convolutional code and the turbo code in the AWGN channel is better than that in the Rayleigh fading channel.
Fig. 7 Performance comparison of the turbo code and the convolutional code under the same constraint conditions but different code rates and different channels
4 Conclusions

From the simulation results, we can conclude the following for the two encoding methods, convolutional codes and turbo codes: when the constraint length is the same, the smaller the code rate, the lower the bit error rate; under the same code rate, the longer the constraint length, the lower the bit error rate. The performance of convolutional codes and turbo codes in the AWGN channel is better than that in the Rayleigh fading channel. When all other conditions are the same, turbo codes perform better than convolutional codes.

Acknowledgements. This work is supported by the National Natural Science Foundation of China (61661018), Research on key technologies of 5G MIMO-OFDM wireless communication system; a general project of the Hainan Natural Science Foundation (619MS029), The key technology of MIMO-OFDM and its offshore communication; the Hainan University Science Research Project (Hnky2019-8); and the Hainan Provincial Natural Science Foundation High-level Talent Project (2019RC036). Hui Li is the corresponding author.
References

1. Jiao Z (2009) The application and research of packet turbo code in WiMAX mobile system. Shanghai Jiao Tong University
2. Zhou D (2011) Signal detection in multi-user collaborative diversity. Henan University of Science and Technology
3. Guo J (2009) Research on joint source channel coding in wireless channel. Dalian University of Technology
4. Liu B (2012) Research on random row-column cyclic shift interleaver and quantum interleaver. Nanjing University of Posts and Telecommunications
5. Liu C (2014) Turbo decoder low power design. Innov Technol 6:2–2
6. Li H, Ye M, Tong Q, Cheng J, Wang L (2019) Performance comparison of systematic polar code and non-systematic polar code. J Commun 40(6):203–209
7. Wang J (2019) Correlation analysis of 5G channel coding technology. Digit Commun World 3:39
8. Yang S, Liu X, Xu X, Li G, Ji Z (2019) Recognition technology of convolutional codes. Aerosp Electron Warfare 5:38–45
9. Ma Y, Peng N (2018) Coding and decoding of Turbo code and its performance simulation. Digit Technol Appl 12:110–112
10. Li K, Yan R (2020) Performance study of convolution code in flight test. Sci Technol Innov 4:55–57
A Low Pilot-Overhead Preamble for Channel Estimation with IAM Method in FBMC/OQAM Systems

Dejin Kong1, Qian Wang1, Pei Liu2, Xinmin Li3, Xing Cheng4, and Yitong Li5(B)

1 School of Electronic and Electrical Engineering, Wuhan Textile University, Wuhan 430074, China, [email protected]
2 School of Information Engineering, Wuhan University of Technology, Wuhan 430070, China, [email protected]
3 School of Information Engineering, SouthWest University of Science and Technology, Mianyang 621010, China, [email protected]
4 School of Information and Communication Engineering, Beijing Information Science and Technology University, Beijing 100101, China, [email protected]
5 School of Information Engineering, Zhengzhou University, Zhengzhou 450001, China, [email protected]
Abstract. For channel estimation, a large pilot overhead is required due to the imaginary interference in filter-bank multicarrier employing offset quadrature amplitude modulation (FBMC/OQAM) systems. In this paper, we address the pilot reduction and present a short preamble for the classical interference approximation method (IAM). Compared with the conventional preamble consisting of 3 columns of symbols, the pilot overhead of the proposed preamble is equivalent to 2 columns of symbols. It is proven that there is no performance loss in the proposed preamble despite the significantly reduced pilot overhead. To verify the proposed preamble, numerical simulations are carried out with respect to bit error ratio and mean square error.
1 Introduction
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_38

As an emerging multicarrier modulation, filter-bank multicarrier employing offset quadrature amplitude modulation (FBMC/OQAM) was first proposed in [1]. Owing to its very low spectral sidelobe property, the FBMC/OQAM system exhibits high frequency-spectrum utilization and has the good ability of
asynchronous transmission, attracting much attention [2–4] as a promising technique for future communications. Nevertheless, since the orthogonality condition is satisfied only in the real-valued field [4], there exists imaginary interference among the transmitted symbols of FBMC/OQAM. As a result, compared with classical orthogonal frequency division multiplexing (OFDM), channel estimation is more complex and requires further investigation in the FBMC/OQAM system. In [5], the interference approximation method (IAM) was presented to achieve channel estimation in FBMC/OQAM systems. Afterward, other versions of this method were proposed, i.e., IAM-C [6] and IAM-I [7], in which imaginary-valued pilots are employed to enhance the pseudo-pilot power. However, the conventional pilot preamble for IAM requires 3 columns of real-valued symbols, i.e., 1 column of pilot symbols for the channel estimation and 2 columns of zero symbols for reducing the interference to the pilot symbols. It should be noted that the symbol interval of FBMC/OQAM is only half of that in OFDM systems; hence, the pilot overhead of the IAM preamble is 50% larger than that of OFDM systems [5]. To reduce the pilot overhead, the pairs of real pilots (POP) method was proposed [5], which only requires 2 columns of real-valued pilots. However, as concluded in [5], the POP method suffers from a performance loss compared with the IAM due to its poor ability to deal with channel noise. In this paper, we devote ourselves to reducing the pilot overhead of channel estimation by proposing a short preamble for IAM, which only consists of 2 columns of pilots. It is demonstrated that there is no performance loss in the proposed preamble, at a significantly reduced pilot overhead. Simulations have been carried out to evaluate the proposed preamble and, for comparison, the performance of the POP method is also given.
As a remark, the proposed scheme of this paper can also be applied in the other versions of the IAM method, for instance, IAM-C [6] and IAM-I [7].
2 Channel Estimation in FBMC/OQAM
In [5], the IAM method has been proposed based on the following model [5]:
$$\hat{a}_{m,n} \approx H_{m,n}\left(a_{m,n} + a^{(*)}_{m,n}\right) + \eta_{m,n}, \qquad (1)$$
where $a_{m,n}$ is the transmitted symbol and $\hat{a}_{m,n}$ is the demodulation in FBMC/OQAM. $H_{m,n}$ is the channel frequency response of the $m$-th subcarrier. $a^{(*)}_{m,n}$ stands for the imaginary interference in FBMC/OQAM, i.e.,
$$a^{(*)}_{m,n} = \sum_{(p,q)\in\Omega} a_{m+p,n+q}\,\zeta^{m,n}_{m+p,n+q}, \qquad (2)$$
with $\Omega = \{(p,q) \mid |p|,|q| \le 1 \ \text{and}\ (p,q) \ne (0,0)\}$. $\zeta^{m,n}_{m+p,n+q}$ could be written as
$$\zeta^{m,n}_{m+p,n+q} = \zeta^{0,0}_{p,q} = \sum_{l=-\infty}^{\infty} g[l]\, g\!\left[l + q\frac{M}{2}\right] e^{\frac{j2\pi p l}{M}}\, e^{\frac{j\pi(p+q)}{2}}, \qquad (3)$$
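As a numerical sketch of Eq. (3), the weights ζ_{p,q} can be computed for any prototype filter. The Hann-shaped unit-energy filter and the sizes M, L below are assumptions for illustration, not the paper's prototype.

```python
# Numerical sketch of Eq. (3): interference weights zeta_{p,q} for a
# unit-energy prototype filter g[l]. Filter shape and sizes are assumed.
import cmath
import math

M = 8                                   # number of subcarriers (assumed)
L = 4 * M                               # filter length (assumed)
g = [math.sin(math.pi * (l + 0.5) / L) ** 2 for l in range(L)]
norm = math.sqrt(sum(x * x for x in g))
g = [x / norm for x in g]               # unit energy: sum of g[l]^2 is 1

def zeta(p, q):
    shift = q * M // 2
    s = sum(g[l] * g[l + shift] * cmath.exp(2j * math.pi * p * l / M)
            for l in range(L) if 0 <= l + shift < L)
    return s * cmath.exp(1j * math.pi * (p + q) / 2)

print(abs(zeta(0, 0)))  # equals 1 for any unit-energy filter
print(abs(zeta(1, 0)))  # neighbouring weights have magnitude below 1
```

For a unit-energy filter, ζ_{0,0} = Σ g[l]² = 1 regardless of the filter shape; obtaining (nearly) purely imaginary neighbouring weights additionally requires a proper OQAM prototype such as IOTA or PHYDYAS.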
where $g[l]$ is the pulse shaping filter of FBMC/OQAM. Note that $\zeta^{m,n}_{m+p,n+q} = 1$ for $(p,q) = (0,0)$; otherwise, $\zeta^{m,n}_{m+p,n+q}$ is imaginary-valued. Furthermore, it is worthwhile to note that for $|p| > 1$ or $|q| > 1$, $\zeta^{m,n}_{m+p,n+q}$ is close to zero [5]. $\eta_{m,n}$ could be written as
$$\eta_{m,n} = \sum_{l=-\infty}^{\infty} \eta[l]\, g\!\left[l - n\frac{M}{2}\right] e^{-\frac{j2\pi m l}{M}}\, e^{-\frac{j\pi(m+n)}{2}}, \qquad (4)$$
where $\eta[l]$ is the additive white Gaussian noise with variance $\sigma^2$ [8]. Then, the IAM channel estimation is written as [5]
$$\hat{H}_{m,n} = \frac{\hat{a}_{m,n}}{a_{m,n} + a^{(*)}_{m,n}} = H_{m,n} + \frac{\eta_{m,n}}{a_{m,n} + a^{(*)}_{m,n}}, \qquad (5)$$
where $\hat{H}_{m,n}$ is the estimate of $H_{m,n}$. Figure 1a shows the conventional preamble structure, consisting of three columns of pilots, i.e., $a_{m,0} = a_{m,2} = 0$ with $m = 0, 1, \cdots, M-1$, and $a_{4l,1} = a_{4l+1,1} = 1$ and $a_{4l+2,1} = a_{4l+3,1} = -1$ with $l = 0, 1, \cdots, \frac{M}{4}-1$. Note that the symbol interval of FBMC/OQAM is half of that in OFDM systems; hence, the pilot overhead of the preamble is larger than that of OFDM.
Fig. 1. Conventional preamble and the proposed preamble in FBMC/OQAM
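The estimator of Eq. (5) can be illustrated with a toy one-tap computation; all numbers below are invented for illustration only.

```python
# Toy illustration of the IAM estimate in Eq. (5): dividing the demodulated
# pilot by the pseudo-pilot a + a^(*) recovers H exactly in the noise-free
# case. All values here are invented for illustration.
H = 0.8 - 0.3j          # channel frequency response at one pilot position
a = 1.0                 # real-valued pilot symbol
a_star = 0.6j           # imaginary interference from neighbours (assumed)

a_hat = H * (a + a_star)        # demodulation model of Eq. (1), eta = 0
H_hat = a_hat / (a + a_star)    # IAM estimate, Eq. (5)
print(abs(H_hat - H))           # ~0 in the noise-free case
```

With noise present, the residual error is the noise divided by the pseudo-pilot, which is why IAM variants boost the pseudo-pilot power.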
3 A Novel Preamble with Low Pilot Overhead
In the following, we propose a novel preamble based on IAM, which only consists of 2 columns of pilots. Compared to the conventional scheme in Fig. 1a, the proposed structure reduces the pilot overhead by 1/3 at a negligible cost in channel estimation performance.
3.1 Preamble Structure of the Proposed Scheme
Figure 1b depicts the proposed preamble structure, where $a_{m,1}$, $m = 0, 1, \cdots, M-1$, are pilot symbols. Specifically, $a_{4l,1} = a_{4l+1,1} = 1$ and $a_{4l+2,1} = a_{4l+3,1} = -1$ with $l = 0, 1, \cdots, \frac{M}{4}-1$, the same as in Fig. 1a. Differently, 1/3 of the pilot overhead is saved to transmit additional data symbols (ADS), i.e., $U = [u_0, u_1, \cdots, u_{M-1}]^T$ with $u_{2m} = a_{2m,2}$ and $u_{2m+1} = a_{2m+1,0}$. Since the ADS cause interference to the pilots, compensating pilot symbols (CPS) are required and designed to eliminate this interference, i.e., $V = [v_0, v_1, \cdots, v_{M-1}]^T$ with $v_{2m} = a_{2m,0}$ and $v_{2m+1} = a_{2m+1,2}$. Therefore, the total pilot overhead is 2 columns of symbols. To ensure no extra pilot power loss, the power of the CPS is taken from the ADS. Since the CPS are designed from the ADS at the transmitter, the CPS can be used for the channel equalization of the ADS; as a result, the ADS has a similar ability to fight against the noise as ordinary data symbols, as we can see below. According to (1) and (2), $\Phi H U$ and $\Omega H V$ are the interferences to the pilots from the ADS and CPS, respectively. To eliminate the interference to the pilots, V is designed according to the following equation:
$$\Phi H U + \Omega H V = 0, \qquad (6)$$
where
$$\Phi = \begin{pmatrix} \zeta^{0,1}_{0,2} & \zeta^{0,1}_{1,0} & \zeta^{0,1}_{2,2} & \cdots & \zeta^{0,1}_{M-1,0} \\ \zeta^{1,1}_{0,2} & \zeta^{1,1}_{1,0} & \zeta^{1,1}_{2,2} & \cdots & \zeta^{1,1}_{M-1,0} \\ \zeta^{2,1}_{0,2} & \zeta^{2,1}_{1,0} & \zeta^{2,1}_{2,2} & \cdots & \zeta^{2,1}_{M-1,0} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \zeta^{M-1,1}_{0,2} & \zeta^{M-1,1}_{1,0} & \zeta^{M-1,1}_{2,2} & \cdots & \zeta^{M-1,1}_{M-1,0} \end{pmatrix}, \qquad (7)$$
$$\Omega = \begin{pmatrix} \zeta^{0,1}_{0,0} & \zeta^{0,1}_{1,2} & \zeta^{0,1}_{2,0} & \cdots & \zeta^{0,1}_{M-1,2} \\ \zeta^{1,1}_{0,0} & \zeta^{1,1}_{1,2} & \zeta^{1,1}_{2,0} & \cdots & \zeta^{1,1}_{M-1,2} \\ \zeta^{2,1}_{0,0} & \zeta^{2,1}_{1,2} & \zeta^{2,1}_{2,0} & \cdots & \zeta^{2,1}_{M-1,2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \zeta^{M-1,1}_{0,0} & \zeta^{M-1,1}_{1,2} & \zeta^{M-1,1}_{2,0} & \cdots & \zeta^{M-1,1}_{M-1,2} \end{pmatrix}, \qquad (8)$$
$$H = \begin{pmatrix} H_0 & & & \\ & H_1 & & \\ & & \ddots & \\ & & & H_{M-1} \end{pmatrix}, \qquad (9)$$
and $H_m = H_{m,0}$. Note that the matrices $\Phi$ and $\Omega$ are known at the receiver, and it is proven that $\Omega^{-1}\Phi$ is a unitary matrix, where $(\cdot)^{-1}$ is the inverse of a matrix. Then, the CPS can be obtained according to (6):
$$V = -H^{-1}\Omega^{-1}\Phi H U. \qquad (10)$$
However, H is not available at the transmitter; thus, (10) cannot be used directly for the design of the CPS.
Define $C = H^{-1}\Omega^{-1}\Phi H$, and it could be written as
$$C = \begin{pmatrix} \frac{H_0}{H_0}\beta_{00} & \frac{H_1}{H_0}\beta_{01} & \cdots & \frac{H_{M-1}}{H_0}\beta_{0(M-1)} \\ \frac{H_0}{H_1}\beta_{10} & \frac{H_1}{H_1}\beta_{11} & \cdots & \frac{H_{M-1}}{H_1}\beta_{1(M-1)} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{H_0}{H_{M-1}}\beta_{(M-1)0} & \frac{H_1}{H_{M-1}}\beta_{(M-1)1} & \cdots & \frac{H_{M-1}}{H_{M-1}}\beta_{(M-1)(M-1)} \end{pmatrix}, \qquad (11)$$
where $\beta_{mn}$ is the $(m,n)$-th entry of $\Omega^{-1}\Phi$.
where βmn is the (m, n)-th entry of Ω −1 Φ. Note that only few entries in each row of Ω −1 Φ can be considered nonzero. Taking the IOTA filter as example, it can be obtained as ⎧M −1 ⎪ ⎪ |β |2 = 1, ⎪ ⎨ n=0 mn
(12)
⎪ ⎪ ⎪ ⎩ |βmn |2 ≈ 0.9991, |m − n| ≤ 3, |m − n| ≥ M − 3. n
Thus, we can approximately consider that βmn ≈ 0 for M − 3 > |m − n| > 3. m β ≈ 0 for M − 3 > |m − n| > 3. In addition, the subcarrier Obviously, Cmn = H Hn mn spacing of multi-carrier systems is very small compared with the coherence bandwidth. Therefore, for several adjacent subcarriers, their channel frequency responses could m ≈ 1 for |m − n| ≤ 3 or be assumed to be constant [5]. Then, we can assume H Hn |m − n| ≥ M − 3. Therefore, we can conclude that Cmn ≈ βmn and (13) could be applied to design the CPS, (13) V = −Ω −1 ΦU.
3.2 Channel Equalization of ADS
In this subsection, it is proven that the CPS can be used for the channel equalization of the ADS; as a result, the ADS has a similar ability to fight against the noise as ordinary data symbols, as we can see below. From (1), the demodulations of U and V are obtained as
$$\hat{U} = H\left(U + U^{(*)}\right) + \eta^u, \qquad \hat{V} = H\left(V + V^{(*)}\right) + \eta^v, \qquad (14)$$
where $U^{(*)}$ and $V^{(*)}$ are the imaginary interference vectors, respectively, i.e., $U^{(*)} = [u^{(*)}_0, u^{(*)}_1, \cdots, u^{(*)}_{M-1}]^T$, in which $u^{(*)}_{2k} = a^{(*)}_{2k,2}$ and $u^{(*)}_{2k+1} = a^{(*)}_{2k+1,0}$; $V^{(*)} = [v^{(*)}_0, v^{(*)}_1, \cdots, v^{(*)}_{M-1}]^T$, in which $v^{(*)}_{2k} = a^{(*)}_{2k,0}$ and $v^{(*)}_{2k+1} = a^{(*)}_{2k+1,2}$. $\eta^u = [\eta^u_0, \eta^u_1, \cdots, \eta^u_{M-1}]^T$ with $\eta^u_{2i} = \eta_{2i,2}$ and $\eta^u_{2i+1} = \eta_{2i+1,0}$; $\eta^v = [\eta^v_0, \eta^v_1, \cdots, \eta^v_{M-1}]^T$ with $\eta^v_{2i} = \eta_{2i,0}$ and $\eta^v_{2i+1} = \eta_{2i+1,2}$. As shown in (13), there is a linear relationship between U and V, i.e.,
$$U = -\left(\Omega^{-1}\Phi\right)^{-1} V. \qquad (15)$$
Let $\tilde{U} = -\left(\Omega^{-1}\Phi\right)^{-1}\hat{V}$, and it can be written as
$$\tilde{U} = -\left(\Omega^{-1}\Phi\right)^{-1} H\left(V + V^{(*)}\right) - \left(\Omega^{-1}\Phi\right)^{-1}\eta^v = -\left(\Omega^{-1}\Phi\right)^{-1} H\left(V + V^{(*)}\right) + \tilde{\eta}. \qquad (16)$$
A Low Pilot-Overhead Preamble for Channel Estimation . . .
305
After channel equalization and taking the real part, we have $\Re\{H^{-1}\tilde{U}\} \approx -\left(\Omega^{-1}\Phi\right)^{-1}V = U = \Re\{H^{-1}\hat{U}\}$ when there is no channel noise. Thus, if the noises $\tilde{\eta}$ and $\eta^u$ are Gaussian and independent identically distributed, we can use $\frac12\hat{U} + \frac12\tilde{U}$ as the input of the channel equalizer instead of $\hat{U}$. In this way, the variance of $\frac12\tilde{\eta} + \frac12\eta^u$ is half of that of $\eta^u$. In the following, it is proven that the noises $\tilde{\eta}$ and $\eta^u$ are independent, identically and Gaussian distributed.

According to the definition of $\eta[k]$, it is easily observed that $\eta_{m,n}$ satisfies the Gaussian distribution as well, and its variance and covariance can be obtained as (17) and (18), respectively:

$$\mathrm{Var}[\eta_{m,n}] = E\left[\left(\sum_{l=-\infty}^{+\infty}\eta[l]\,g\!\left[l-\tfrac{nM}{2}\right]e^{-\frac{j2\pi ml}{M}}e^{-\frac{j(m+n)\pi}{2}}\right)\left(\sum_{l=-\infty}^{+\infty}\eta^*[l]\,g\!\left[l-\tfrac{nM}{2}\right]e^{\frac{j2\pi ml}{M}}e^{\frac{j(m+n)\pi}{2}}\right)\right] = \sigma^2\sum_{l=-\infty}^{\infty}g\!\left[l-\tfrac{nM}{2}\right]g\!\left[l-\tfrac{nM}{2}\right] = \sigma^2. \tag{17}$$

$$\mathrm{Cov}[\eta_{m_1,n_1},\eta_{m_2,n_2}] = E\left[\left(\sum_{l=-\infty}^{+\infty}\eta[l]\,g\!\left[l-\tfrac{n_1M}{2}\right]e^{-\frac{j2\pi m_1l}{M}}e^{-\frac{j(m_1+n_1)\pi}{2}}\right)\left(\sum_{l=-\infty}^{+\infty}\eta^*[l]\,g\!\left[l-\tfrac{n_2M}{2}\right]e^{\frac{j2\pi m_2l}{M}}e^{\frac{j(m_2+n_2)\pi}{2}}\right)\right] = \sigma^2\zeta_{m_1,n_1}^{m_2,n_2} = \sigma^2\zeta_{m_1-m_2,\,n_1-n_2}^{0,0}. \tag{18}$$
Let $e_{mn}$ stand for the element of $-\left(\Omega^{-1}\Phi\right)^{-1}$ at position $(m,n)$. Note that $e_{m(m-1)} = e_{m(m+1)}$. The covariance of the $m$-th entries of $\eta^u$ and $\tilde{\eta}$ can be expressed as

$$\mathrm{Cov}\left[\eta_m^u,\ \sum_{n=0}^{M-1}e_{m,n}\eta_n^v\right] = \mathrm{Cov}\left[\eta_m^u,\ e_{m,m-1}\eta_{m-1}^v + e_{m,m+1}\eta_{m+1}^v\right] = 0. \tag{19}$$

Obviously, the $m$-th entries of $\eta^u$ and $\tilde{\eta}$ are independent identically distributed. Thus, according to the maximum likelihood criterion, $\frac12\hat{U} + \frac12\tilde{U}$ can replace $\hat{U}$ as the input of the equalizer. Compared with $\hat{U}$, the noise power of $\frac12\hat{U} + \frac12\tilde{U}$ is reduced by half, which means that the ADS has a capability to combat noise similar to that of the data symbols.
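The variance-halving claim can be checked with a generic Monte Carlo sketch (not tied to the specific FBMC/OQAM quantities): averaging two independent zero-mean Gaussian noise vectors of equal variance produces a noise term with half the variance.

```python
import numpy as np

# Var[(eta_u + eta_t)/2] = (Var[eta_u] + Var[eta_t]) / 4 = sigma^2 / 2
# when eta_u and eta_t are independent with equal variance sigma^2.
rng = np.random.default_rng(1)
sigma2 = 4.0
n = 1_000_000
eta_u = rng.normal(0.0, np.sqrt(sigma2), n)   # noise of the direct estimate
eta_t = rng.normal(0.0, np.sqrt(sigma2), n)   # noise of the transformed estimate
avg = 0.5 * eta_u + 0.5 * eta_t

print(eta_u.var())  # approximately sigma2 = 4.0
print(avg.var())    # approximately sigma2 / 2 = 2.0
```
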
4 Simulation Results

In the simulations, the following parameters are considered:

• Subcarrier number: 2048
• Sampling rate: 9.14 MHz
• Path number: 6
• Path delays (μs): −3, 0, 2, 4, 7, 11
• Power delay profile (dB): −6.0, 0.0, −7.0, −22.0, −16.0, −20.0
• Modulation: 4QAM
• Channel coding: convolutional coding [5]

For comparison, this section also gives the performance of pairs of pilots (POP) [5], consisting of two columns of pilots.
Fig. 2. MSE of the proposed scheme versus SNR (dB); curves: proposed 2-column preamble with IAM, conventional 3-column preamble with IAM, and POP
Figure 2 depicts the MSE of the proposed preamble with the IAM method. It can easily be seen that IAM greatly outperforms POP. Moreover, with the IAM method, the proposed preamble achieves an MSE similar to that of the conventional preamble. Note that only two columns of pilots are needed in the proposed preamble, while the conventional preamble requires three columns. Therefore, better spectral efficiency is achieved by our proposed preamble and channel estimation approach. Figure 3 depicts the BER of the proposed preamble with the IAM method. The simulation results show that IAM greatly outperforms POP, in accordance with Fig. 2. Moreover, the IAM method with the proposed preamble achieves a BER similar to that of the conventional preamble, demonstrating the effectiveness of our proposed scheme.
5 Conclusions

In this paper, a novel preamble with low pilot overhead was proposed for FBMC/OQAM. Compared to the conventional preamble consisting of three columns of symbols, only two columns of pilots are needed in our scheme. It was proven that, at a significantly reduced pilot overhead, the proposed preamble exhibits the same performance as the conventional preamble. Therefore, with our proposed preamble and channel estimation approach, better spectral efficiency can be achieved in the FBMC/OQAM system.
Fig. 3. BER of the proposed scheme versus SNR (dB); curves: proposed 2-column preamble with IAM, conventional 3-column preamble with IAM, and POP
Acknowledgments. This work was supported by the Fundamental Research Funds under Grant WUT: 2020IVA024, the Nature Science Foundation of Southwest University of Science and Technology under Grant 18zx7142, and the National Natural Science Foundation of China (NSFC) under Grants 61801433, 62001333, and 62001336.
References

1. Chang RW (1966) Synthesis of band-limited orthogonal signals for multichannel data transmission. Bell Syst Tech J 45:1775–1796
2. Kong D, Zheng X, Jiang T (2020) Frame repetition: a solution to imaginary interference cancellation in FBMC/OQAM systems. IEEE Trans Signal Process 68:1259–1273
3. Kong D, Qu D, Jiang T (2014) Time domain channel estimation for OQAM-OFDM systems: algorithms and performance bounds. IEEE Trans Signal Process 68(2):322–330
4. Siohan P, Siclet C, Lacaille N (2002) Analysis and design of OQAM-OFDM systems based on filterbank theory. IEEE Trans Signal Process 50(5):1170–1183
5. Lélé C, Javaudin J-P, Legouable R, Skrzypczak A, Siohan P (2008) Channel estimation methods for preamble-based OFDM/OQAM modulations. Eur Trans Telecommun 19(7):741–750
6. Du J, Signell S (2009) Novel preamble-based channel estimation for OFDM/OQAM systems. In: Proceedings of ICC. IEEE
7. Lélé C, Siohan P, Legouable R (2008) 2 dB better than CP-OFDM with OFDM/OQAM for preamble-based channel estimation. In: Proceedings of ICC. IEEE, pp 1302–1306
8. Liu P, Jin S, Jiang T, Zhang Q, Matthaiou M (2017) Pilot power allocation through user grouping in multi-cell massive MIMO systems. IEEE Trans Commun 65(4):1561–1574
Modeling of Maritime Wireless Communication Channel

Yu-Han Wang, Meng Xu, Huan-Yu Li, Ke-Xin Xiao, and Hui Li(B)

School of Information and Communication Engineering, Hainan University, 570228 Haikou, China
[email protected], [email protected]
Abstract. Influenced by the curvature of the Earth and the waves, as well as shadowing by ships and sea waves, wireless channels above the sea suffer deep fading and multipath effects. Microwaves in the very high frequency and ultrahigh frequency bands are analyzed in oceanic environments. We consider the direct path, specular (mirror) reflection paths, and diffuse scattering, and we theoretically calculate the power of the specular reflection and the diffuse scattering. The concept of an effective diffuse-scattering area and its partitioning is introduced to calculate the multipath power. We build a generalized channel model for different frequency bands and communication ranges. Keywords: Maritime communication · Multipath channel · Diffuse scattering · Wireless channel modeling
1 Introduction

Hainan is the province with the largest ocean area in China, with a sea area of 2 million square kilometers, accounting for 42.3% of China's ocean area [1]. People's production and living areas are constantly expanding from land to sea, for example in ocean-going fisheries, marine environment detection and oil exploration, maritime safety, and other communication services [2], so there is an urgent need for maritime communication solutions. The problem of developing high-speed data communication for modern ports and near-shore ships is becoming more and more prominent [3]. Establishing a universal wireless channel model can provide a scientific basis for the design of communication systems and thereby reduce blindness in the engineering design of marine radio wave propagation and channel modeling [4].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_40
310
Y.-H. Wang et al.
2 Simple Prediction Model of Ocean Radio Wave Propagation Loss

2.1 Radio Wave Propagation Theory

Suppose there is an isotropic emission source in free space, and the total power radiated in all directions is $P_t$ watts (ideal) (Fig. 1).

Fig. 1 Flux density generated by an isotropic source

The flux density across a sphere at a distance of R meters from the source is

$$F = \frac{P_t}{4\pi R^2}\ \mathrm{W/m^2} \tag{1}$$
The antenna gain $G(\theta)$ is defined as the ratio of the radiated power per unit solid angle in the direction $\theta$ to the average radiated power per unit solid angle:

$$G(\theta) = \frac{P(\theta)}{P_o/4\pi} \tag{2}$$

Among them, $P(\theta)$ is the radiated power per unit solid angle of the antenna; $P_o$ is the total radiated power of the antenna; $G(\theta)$ is the gain of the antenna in the direction $\theta$. The antenna gain takes the value $G(\theta)$ at $\theta = 0$, which is used to measure the radiant flux density of the antenna. Assuming that the transmitter output power is $P_t$, a lossless antenna is used, and the antenna gain is $G_t$, then the flux density at a distance R in the line-of-sight direction is

$$F = \frac{P_t G_t}{4\pi R^2}\ \mathrm{W/m^2} \tag{3}$$
where $P_tG_t$ is called the effective isotropic radiated power (EIRP), which represents an equivalent isotropic source with power $P_tG_t$ watts at $\theta = 0$. If an ideal receiving antenna with an aperture area of A m² is used, as shown in Fig. 2, the received power $P_r$ can be calculated by the following formula:

$$P_r = F \times A\ \mathrm{W} \tag{4}$$
The received power of the actual antenna cannot be calculated using the above formula. Because of the energy incident on the antenna aperture, part of it is reflected
Fig. 2 Power received by an ideal antenna of area A m² (the incident flux density is F and the received power is $P_r = F\times A = P_tG_tA/4\pi R^2$ W)
into free space and part of the energy is absorbed by lossy elements. Let $A_e$ be the effective aperture and $A_r$ the actual aperture area; then

$$A_e = \eta_A A_r \tag{5}$$
where $\eta_A$ is the aperture efficiency, reflecting all losses between the incident wavefront and the antenna output. The power received by the actual antenna is

$$P_r = \frac{P_t G_t A_e}{4\pi R^2}\ \mathrm{W} \tag{6}$$
Among them, the effective area of the receiving antenna is

$$A_e = \frac{\lambda^2 G_r}{4\pi} \tag{7}$$
where $G_r$ is the effective gain of the receiving antenna. The relationship between antenna gain and area is $G = 4\pi A_e/\lambda^2$, where $\lambda$ is the wavelength corresponding to the operating frequency. Using Eqs. (6) and (7), the link equation is

$$P_r = \mathrm{EIRP} + G_r - L_p\ \mathrm{dBW} \tag{8}$$

where $\mathrm{EIRP} = 10\lg(P_tG_t)$ dBW, $G_r = 10\lg(4\pi A_e/\lambda^2)$ dB, and

$$L_p = 10\lg\left(\frac{4\pi R}{\lambda}\right)^2 = 20\lg\frac{4\pi R}{\lambda} \tag{9}$$
Equation (8) represents the ideal situation. In practice, the attenuation caused by oxygen, water vapor, and rainfall, the sea-surface reflection propagation loss, the internal losses of the antennas at both ends of the link, and the pointing error loss must also be considered:

$$P_r = \mathrm{EIRP} + G_r - L_p - L_a - L_{ta} - L_{ra}\ \mathrm{dBW} \tag{10}$$

where $L_a$ is the atmospheric loss, $L_{ta}$ is the loss caused by the transmitting antenna, and $L_{ra}$ is the loss caused by the receiving antenna.
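The dB-domain link budget of (8)–(10) is straightforward to evaluate; the sketch below uses assumed example values (10 W transmitter, 10 dBi and 12 dBi gains, a 40 km link at 3000 MHz, 1 dB of atmospheric loss) that are not from the text, with L_p computed in the common decibel form 32.45 + 20 lg d(km) + 20 lg f(MHz):

```python
import math

def path_loss_db(d_km: float, f_mhz: float) -> float:
    """Free-space path loss: Lp = 32.45 + 20 lg d(km) + 20 lg f(MHz) dB."""
    return 32.45 + 20 * math.log10(d_km) + 20 * math.log10(f_mhz)

def received_power_dbw(eirp_dbw, gr_db, lp_db, la_db=0.0, lta_db=0.0, lra_db=0.0):
    """Link equation (10): Pr = EIRP + Gr - Lp - La - Lta - Lra (dBW)."""
    return eirp_dbw + gr_db - lp_db - la_db - lta_db - lra_db

# Assumed example: 10 W transmitter with 10 dBi gain -> EIRP = 20 dBW.
lp = path_loss_db(40.0, 3000.0)
pr = received_power_dbw(eirp_dbw=20.0, gr_db=12.0, lp_db=lp, la_db=1.0)
print(round(lp, 2), round(pr, 2))
```
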
2.2 Propagation Loss in Free Space

When radio waves propagate in free space, the energy per unit area is attenuated by spreading. This reduction is called the free-space propagation loss. $L_p$ can be expressed in decibels as

$$L_p = 32.45 + 20\lg d(\mathrm{km}) + 20\lg f(\mathrm{MHz})\ \mathrm{dB} \tag{11}$$
In this formula, f represents the operating frequency (MHz), and d represents the distance (km) between the transmitting and receiving antennas.

2.3 Sea Surface Reflection Propagation Loss

As shown in Fig. 3, the total received signal at the receiver is a composite of the direct and sea-reflected signals.
Fig. 3 Sea surface reflection propagation model
The reflection propagation model of radio waves on a smooth sphere is shown in Fig. 4, where T represents the signal emission point and line AB represents the tangent line through point C. The reflection of radio waves on a smooth sphere must satisfy the condition that the angle of incidence equals the angle of reflection. Therefore, when the antenna heights $h_1$, $h_2$ at the two ends of the path and the station distance d are determined, the reflection point C is fixed, and its position $d_1$ satisfies

$$d_1^3 - \frac{3d}{2}d_1^2 - \left[Ka(h_1 + h_2) - \frac{d^2}{2}\right]d_1 + Kadh_1 = 0 \tag{12}$$

In the above formula, $d_2 = d - d_1$, K represents the earth's equivalent radius coefficient, and we define

$$P = 1.5925\,Kd(h_2 - h_1) \tag{13}$$
Fig. 4 Reflecting point calculation chart
$$Q = \frac{d^2}{12} + 2.125K(h_2 + h_1) \tag{14}$$

$$\varphi = \arccos\frac{P}{Q^{3/2}} \tag{15}$$

$$d_1 = \frac{d}{2} + 2\sqrt{Q}\cos\left(\frac{\varphi}{3} + 240^\circ\right) \tag{16}$$
The tangent line AB through the reflection point C divides the antenna heights $h_1$ and $h_2$ at the two ends into two parts:

$$h_1 = h_1' + \Delta h_1,\quad \Delta h_1 = \frac{d_1^2}{2Ka} \tag{17}$$

$$h_2 = h_2' + \Delta h_2,\quad \Delta h_2 = \frac{d_2^2}{2Ka} \tag{18}$$
where a is the radius of the earth, and $h_1'$ and $h_2'$ are the effective antenna heights. The reflection fading loss thus obtained is

$$L_f = 10\lg\frac{\left(\frac{4\pi d}{\lambda}\right)^2}{1 + D_0^2 - 2D_0\cos\frac{4\pi h_1' h_2'}{d\lambda}} \tag{19}$$
where $D_0$ is the ground equivalent reflection coefficient. Figure 5 shows the curves of the reflection loss $L_f$ versus the distance d between the transmitting and receiving antennas for frequencies f of 300 MHz, 3 GHz, and 30 GHz, with transmitter antenna height $h_1$ = 100 m, mobile station antenna height $h_2$ = 50 m, communication distance d = 0–80 km, earth radius a = 6400 km, equivalent earth radius coefficient K = 4/3, and ground equivalent reflection coefficient $D_0$ = 1.
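Curves like those of Fig. 5 can be approximately reproduced from (19); the sketch below assumes, for simplicity, that the effective heights h₁′, h₂′ equal the physical antenna heights (i.e., it ignores the earth-curvature correction of (17)–(18)):

```python
import math

def reflection_loss_db(d_m, f_hz, h1_m, h2_m, d0=1.0):
    """Two-ray loss per (19): 10 lg[(4*pi*d/lam)^2 /
    (1 + D0^2 - 2*D0*cos(4*pi*h1*h2/(d*lam)))], with effective
    antenna heights approximated by the physical ones (an assumption)."""
    lam = 3.0e8 / f_hz
    free_space = (4 * math.pi * d_m / lam) ** 2
    interference = 1 + d0**2 - 2 * d0 * math.cos(4 * math.pi * h1_m * h2_m / (d_m * lam))
    return 10 * math.log10(free_space / interference)

# Parameters from the text: h1 = 100 m, h2 = 50 m, f = 300 MHz, d up to 80 km.
for d_km in (20, 40, 80):
    print(d_km, round(reflection_loss_db(d_km * 1e3, 300e6, 100.0, 50.0), 1))
```

Near the multipath nulls the denominator approaches zero and the loss spikes, which is exactly the deep-fading behavior the figure illustrates.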
Fig. 5 Curves of reflection loss and transmission distance at different frequencies
3 Modeling of Marine Wireless Transmission Channels

3.1 Model Parameters of the Marine Channel

RMS of wave height. Wind and waves on the sea surface are an important factor causing changes in the ocean channel. The sea state is a method of describing the state of the sea surface as a numerical series. In Table 1, both WMO and Douglas divide the sea state by wave height [5]. The root mean square (RMS) of wave height refers to the root mean square value of the sea-surface wave height.

Table 1 Division of sea state

Sea state   WMO wave height (m)   Douglas wave height   Roughness of the sea
0           0                     –                     Peaceful
1           0–0.1
0), its Bayesian formula is shown in (4):

$$P(B_i|A) = \frac{P(B_i)P(A|B_i)}{\sum_{i=1}^{n} P(B_i)P(A|B_i)} \tag{4}$$

The calculation model of (4) shows that when a source node sends an RREQ request, there are eight neighbor nodes in the network, and four neighbor nodes with the same probability are selected for message forwarding. The purpose is to forward the message of neighbor node $D_i$ to the destination node D. We can set $P(D|D_i) = P(D_i) = \frac12$. According to the Bayesian model,

$$P(D_i|D) = \frac{\frac12\times\frac12}{\frac12\times\frac12 + \frac12\times\frac12 + \frac12\times\frac12 + \frac12\times\frac12} = 0.25$$
The Bayesian model is used to filter the neighbor nodes for message forwarding. Assuming that the forwarding probability of the neighbor nodes and the receive probability of the destination node are both 50%, only 25% of the neighbor nodes participate in forwarding and successfully deliver the message to the destination node. Therefore, the message only needs to be forwarded by the selected neighboring nodes.

2.2 B_AODV Route Discovery Algorithm

It is assumed that the source node S in the network needs to send data to the destination node D. During the AODV route discovery process, any node $F_i$ $(i = 1, 2, 3, \ldots, N)$ that receives an RREQ packet processes it. The established Bayesian probability model calculates the forwarding probability $P_m$ of the intermediate nodes. Algorithm 1 gives the B_AODV route discovery process based on the Bayesian probability model.
Research on Message Forwarding Mechanism Based on Bayesian …    325

Algorithm 1 B_AODV routing discovery algorithm
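The listing itself is not reproduced in this extraction; the following is only an illustrative sketch of a Bayesian selective-forwarding decision. The scaling rule and all names (forward_probability, density_threshold) are assumptions, not the paper's exact Algorithm 1:

```python
import random

def forward_probability(posterior: float, density: int, density_threshold: int = 4) -> float:
    """Hypothetical rule: scale the Bayesian posterior down when the local
    neighbor density is high, so dense regions rebroadcast the RREQ less
    often while sparse regions always forward to preserve connectivity."""
    if density <= density_threshold:
        return 1.0                       # sparse region: always forward
    return min(1.0, posterior * density_threshold / density)

def should_forward(posterior, density, rng=random):
    """Bernoulli decision: forward the RREQ with probability Pm."""
    return rng.random() < forward_probability(posterior, density)

# With posterior P(Di|D) = 0.25 from the worked example and 8 neighbors,
# Pm = 0.25 * 4 / 8 = 0.125, so only a small fraction of dense-area
# neighbors rebroadcast the request.
print(forward_probability(0.25, 8))  # 0.125
```
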
3 Simulation Results and Performance Analysis

The performance of the proposed Bayesian-probability-based message forwarding mechanism is analyzed using the NS2 network simulator.

3.1 Construction of Simulation Environment

In the simulation scenario, the number of network nodes is set between 10 and 100, and the maximum node moving speed $V_{max}$ is 5 m/s, which means the nodes move intermittently at a speed $v \in [0, V_{max}]$. Under the same parameter settings, each simulation was repeated 10 times to obtain the average performance. In order to analyze the routing performance, AODV, AODV_EXT, B_AODV, OLSR, DSDV, and DSR are compared in the simulation. Other simulation parameters are shown in Table 1. The following performance indicators are mainly used in the simulation analysis:
326
Y. Yan et al.

Table 1 Simulation parameter settings

Simulation parameters    Parameter settings
Routing protocol         AODV
Simulation scene         600 × 600 m²
Communication radius     100 m
Number of nodes          10–100 (in steps of 10)
Channel type             Wireless channel
Propagation model        Two-ray ground
Interface queue type     DropTail/PriQueue
Antenna model            Omnidirectional antenna
Data packet size         512 bytes
Packet sending rate      4 p/s
Node transmit power      23 dBm
Node received power      17 dBm
Idle node power          15.4 dBm
Node initial energy      30 J
Normalized Energy Consumption: the energy consumed by the entire network during message transmission; the larger the network scale, the more energy is consumed.

Throughput: the number of packets arriving per second, which reflects the routing protocol's ability to send and receive data.

Data Packet Successful Delivery Rate: the ratio of the number of data packets received by all destination nodes in the network to the number of data packets sent by the source nodes, which reflects the reliability of the network transmission; the greater the ratio, the better the reliability.

3.2 Experimental Results and Analysis

From the normalized energy consumption of the network shown in Fig. 1, when the number of nodes is between 10 and 50, the energy consumption of routing protocols that use flooded message forwarding increases slowly. After the number of nodes exceeds 50, the flooding-based routing protocols perform large-scale message forwarding in the network, and under severe overload their energy consumption rises almost linearly. The proposed Bayesian-probability-based B_AODV uses selective message forwarding to neighboring nodes instead of flooding, so it consumes less energy than the other five protocols, and its energy consumption also increases more slowly. The comparison of the effective network throughput shown in Fig. 2 shows that the traditional DSR has the best throughput performance; however, when the B_AODV, AODV_EXT, and AODV protocols are used, the network nodes do not need to maintain
Fig. 1 Comparison of normalized energy consumption among routing protocols
the route between the two nodes. The signaling required for route discovery and maintenance is reduced, resulting in a decrease in network throughput.
Fig. 2 Effective throughput of each routing protocol network
From Fig. 3 we can see that, as the nodes move faster and faster, the data packet delivery rate of each routing protocol decreases, while the delivery rate of B_AODV decreases slowly and is about 20% higher than that of traditional AODV. The results of DSDV are nearly identical to those of B_AODV; the mechanism of using only the data packets sent or received by neighboring nodes gives better network performance. From this, B_AODV has better stability and provides a guarantee for the successful transmission of data in the network.
4 Conclusion

To address the heavy load overhead and high energy consumption caused by message forwarding during the route discovery process in wireless multihop networks, this paper builds a Bayesian message forwarding model. Under the condition of ensuring
Fig. 3 Successful delivery rate of each routing protocol versus maximum node movement rate (m/s)
network connectivity, unnecessary message forwarding can be reduced by calculating node density and posterior probability, yielding a dynamic probabilistic forwarding routing algorithm, B_AODV. The paper shows that the B_AODV routing algorithm based on the Bayesian message forwarding model has obvious advantages in reducing network energy consumption and routing overhead; when the network scale is large, this advantage becomes even more evident. One of the main advantages of the algorithm is that it reduces the number of rebroadcasts of traditional AODV and similar routing algorithms without affecting propagation accuracy. Compared with the traditional conditional-probability forwarding mechanism, the proposed Bayesian probability scheme effectively reduces the number of intermediate nodes forwarding RREQ messages. Hence, the B_AODV algorithm not only reduces repeated message replays and saves power consumption, but also reduces channel congestion and improves communication, thereby providing technical support for future large-scale dynamic multihop network deployment. Acknowledgements. This work was supported by the National Natural Science Foundation of China (61771186), the University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (UNPYSCT-2017125), the Distinguished Young Scholars Fund of Heilongjiang University, and the Postdoctoral Research Foundation of Heilongjiang Province (LBH-Q15121).
References

1. Fanian F, Rafsanjani MK (2020) A new fuzzy multihop clustering protocol with automatic rule tuning for wireless sensor networks. Appl Soft Comput J. https://doi.org/10.1016/j.asoc.2020.106115
2. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
3. Watanabe M (2019) Realization of a multihop wireless controlled sensor and actuator network for cable installation. ICT Express 227–234
4. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Industr Inf 16(8):5379–5388
5. Sharifi SA, Babamir SM (2020) The clustering algorithm for efficient energy management in mobile ad-hoc networks. Comput Netw. https://doi.org/10.1016/j.comnet.2019.106983
6. Kim JW, Moon SY et al (2011) Improved message communication scheme in selective forwarding attack detection method. In: The 7th international conference on digital content, multimedia technology and its applications. IEEE
7. Mahmoud M, Lin XD, Shen XM (2015) Secure and reliable routing protocols for heterogeneous multihop wireless networks. IEEE Trans Parallel Distrib Syst 1140–1153
8. Angurala M, Bala M, Bamber SS (2020) Performance analysis of modified AODV routing protocol with lifetime extension of wireless sensor networks. IEEE Access 10606–10613
9. Chen H, Ren J (2012) Structure-variable hybrid dynamic Bayesian networks and its inference algorithm. In: 2012 24th Chinese control and decision conference, pp 2816–2821
10. Hou C, Zhao QH (2015) Bayesian prediction-based energy-saving algorithm for embedded intelligent terminal. IEEE Trans Very Large-Scale Integr Syst 2902–2912
11. Chen QC, Lam KY, Fan PZ (2005) Comments on "distributed Bayesian algorithms for fault-tolerant event region detection in wireless sensor networks". IEEE Trans Comput 1182–1183
12. Zhao LF, Bi GA et al (2013) An improved auto-calibration algorithm based on sparse Bayesian learning framework. IEEE Signal Process Lett 889–892
A S-Max-Log-MPA Multiuser Detection Algorithm Based on Serial in SCMA System

Guanghua Zhang1(B), Zonglin Gu1, Weidang Lu2, and Shuai Han3

1 Northeast Petroleum University, Daqing, China
[email protected]
2 Zhejiang University of Technology, Hangzhou, China 3 Harbin Institute of Technology, Harbin, China
Abstract. Sparse code multiple access (SCMA) has become one of the most promising key technologies for future 5G communication. Among the multi-user detection algorithms for SCMA systems, the message passing algorithm (MPA) has high complexity because of its exponential (EXP) operations. Because maximum-value approximation is used in the maximum-logarithm message passing algorithm (Max-log-MPA), some information is lost, degrading the accuracy of the transmitted information. The serial message passing algorithm (S-MPA) combines the user-node message update with the resource-node message update, which also causes some information loss and degrades the bit error rate of the system. Therefore, a serial maximum-logarithm message passing algorithm (S-Max-log-MPA) is proposed in this paper, which first converts the exponential operations into maximum-value and addition operations and then integrates the user-node message update into the resource-node message update. In the iterative process, the user-node information is updated and then passed to the next node to update the resource-node information, which greatly reduces the information loss and the space occupied by intermediate variables, and effectively improves the accuracy of the transmitted information. Simulation results show that the bit error ratio (BER) performance of the system improves as the number of iterations increases. Keywords: SCMA · BER · Serial · Message passing algorithm · Max-log-MPA
1 Introduction With the development of the Internet of things and mobile communication system, the demand for larger capacity, higher transmission rate, ultra-high-density throughput and large-scale application scenarios has also increased dramatically [1]. The research of the fifth-generation (5G) mobile communication technology has become an inevitable trend [2]. The development trend of 5G technology includes continuous wide area coverage, low delay and high reliability, hot spot and high capacity coverage, low power consumption and large connection [3, 4]. Therefore, 5G can solve many problems of
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_42
network demand of mobile devices at this stage. Sparse code multiple access (SCMA) is a non-orthogonal multiplexing technology proposed by Huawei in 2014 and is one of the key technologies of 5G. SCMA evolved from low-density spread spectrum multiple access (LDS-MA) [5]; it has good flexibility and adaptability and can obtain additional SNR gain. The SCMA system integrates quadrature amplitude modulation (QAM) and sparse spread spectrum [6], directly mapping a binary data stream to a multi-dimensional codeword in the complex domain, so it can handle the overload caused by large-scale connections [7]. The receiver of an SCMA system uses the message passing algorithm (MPA) to detect multiple users. Multiple data layers from different users are non-orthogonally superposed on the same time-frequency resource block for transmission [8], and the receiver separates the data layers of the same time-frequency resource block through the MPA decoder. In reference [9], a message passing algorithm based on the logarithmic domain is proposed. In reference [10], an uplink SCMA scheme based on the serial message passing algorithm (S-MPA) is proposed. In reference [11], an improved maximum-logarithm message passing algorithm (Max-log-MPA) multiuser detection algorithm is proposed. Compared with the original MPA, the above algorithms improve the bit error ratio, but the SCMA system still has a high bit error ratio. In order to solve the problem of user data loss caused by the logarithm operation in the Max-log-MPA algorithm, and the partial information loss caused by combining the user-node message update with the resource-node update in S-MPA, a serial maximum-logarithm message passing algorithm (S-Max-log-MPA) is proposed in this paper. The algorithm transforms the exponential (EXP) operations of the original MPA into addition and maximum operations in the log domain and then integrates the user-node message update into the resource-node message update. During the iterative update process, the information of a user node is updated and immediately transferred to the next node, and the information of the resource node is then updated immediately. In this way, the storage of intermediate variables during transmission and the loss of user information are both reduced, which improves the bit error ratio (BER) performance of the system.
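The essence of the max-log simplification used below is the standard approximation log Σᵢ exp(xᵢ) ≈ maxᵢ xᵢ, which removes the exponential operations entirely; a self-contained numeric sketch:

```python
import math

def log_sum_exp(xs):
    """Exact Jacobian logarithm: log(sum(exp(x))) computed stably."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def max_log(xs):
    """Max-log approximation: keep only the dominant term."""
    return max(xs)

metrics = [-1.2, -3.5, -7.0, -10.4]   # example branch metrics in the log domain
exact = log_sum_exp(metrics)
approx = max_log(metrics)
print(approx, round(exact - approx, 4))  # max-log lower-bounds the exact value
```

The approximation error is small whenever one term dominates, which is why the max-log variants trade only a little detection accuracy for a large complexity reduction.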
2 SCMA System

In the SCMA uplink system, assuming that the information of J users shares K (J > K) time-frequency resource blocks, the system overload rate is $\lambda = J/K$, and the communication model is shown in Fig. 1. After source coding and channel coding, the user information $u_j$ $(u_1, u_2, \ldots, u_J)$ becomes a binary bit stream $b_j$ $(b_1, b_2, \ldots, b_J)$, which is then mapped by the SCMA encoder to a codeword $x = f(b_j)$ from the codebook. Because of the sparsity of SCMA, the mapping process can be defined as $f: \mathbb{B}^{\log_2 M} \to \chi$, where M is the codebook size, $\mathbb{B}$ is the set of binary numbers, and $\chi$ is the user codebook. If $y = [y_1, y_2, \ldots, y_K]^T$ is the signal received on the K resource blocks, then

$$y = \sum_{j=1}^{J}\mathrm{diag}(h_j)\,x_j + n \tag{1}$$
332
G. Zhang et al.
Fig. 1 Uplink SCMA communication system model (user bits → channel coding → SCMA coding → AWGN channel → SCMA decoder → channel decoder → BER/complexity calculation)
Because the channels of each layer of the uplink are different, the channel factor is $h_j$ $(h_1, h_2, \ldots, h_J)$; $x_j = [x_{1,j}, x_{2,j}, \ldots, x_{K,j}]^T$ represents the K-dimensional SCMA codeword of the j-th user, and $n = [n_1, n_2, \ldots, n_K]^T$ is the white Gaussian noise vector of the channel with distribution $N(0, \sigma^2 I)$. Assuming that six users share four time-frequency resource blocks, the system overload rate is 150%. The user information is mapped to codewords from the SCMA codebook; each user has a unique codebook, as shown in Fig. 2.
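Equation (1) is a direct superposition of sparsely spread codewords and can be sketched with synthetic data; the sparsity pattern below (each user occupying 2 of the K = 4 resources) and the QPSK-like codeword entries are assumptions, since the codebooks of Fig. 2 are not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(2)
J, K = 6, 4                    # 6 users on 4 resource blocks: 150% overload
print(J / K)                   # 1.5

# Hypothetical sparse codewords: each user occupies 2 of the 4 resources.
patterns = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
X = np.zeros((J, K), dtype=complex)
for j, (a, b) in enumerate(patterns):
    X[j, [a, b]] = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=2)

H = rng.normal(size=(J, K)) + 1j * rng.normal(size=(J, K))  # per-user channels
noise = 0.01 * (rng.normal(size=K) + 1j * rng.normal(size=K))

# y = sum_j diag(h_j) x_j + n, as in (1)
y = sum(H[j] * X[j] for j in range(J)) + noise
print(y.shape)  # (4,)
```

Each resource block ends up carrying the superposition of exactly three users' symbols under this pattern, which is what the MPA decoder must untangle.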
User2
(1,1)
(1,0)
User3
(1,0)
+ +
+
User4
(0,0)
User5
(0,1)
User6
(1,1)
+ +
Fig. 2 SCMA coding principle
3 S-MPA Algorithm

The S-MPA algorithm is based on the original MPA. It updates the resource nodes serially and merges the user-node message update with the resource-node message update in each iteration, so that updated information is delivered in time, which reduces the storage space of intermediate variables and improves the convergence speed of the messages. The updating process of the S-MPA algorithm consists of three steps.

Step 1: initialize the conditional probabilities of the S-MPA algorithm:

$$I^{0}_{c_k \to u_j}(x_j) = \frac{1}{M}, \quad j = 1, 2, \ldots, J \tag{2}$$

where $c_k$ represents the function node, and $I^{0}_{c_k \to u_j}(x_j)$ represents the information sent by function node $c_k$ to variable node $u_j$ in the initial iteration.
The probability $Q^t(x_j)$ $(t = 1, 2, 3, \ldots, t_{\max})$ of a codeword combination on time-frequency resource block k is

$$Q^t(x_j) = \exp\left(-\frac{1}{N_{0,k}}\left|y_k - \sum_{v\in\xi_k} h_{k,v}\,x_{k,v}\right|^2\right) \tag{3}$$
where v = j, v ∈ ξk , j ∈ ξk , yk (k = 1, 2, 3, . . . , K) represents the received signal, N 0, k represents the noise power on the time-frequency resource block K, ξk represents the set of users allocated on the time-frequency resource block, hk,v represents the channel coefficient of the v-th user on the k-th resource block, and xk,v represents the codeword of the v-th user on the k-th resource block. Step 2: the S-MPA algorithm first updates the resource nodes. ⎤ ⎡ ⎣Qt xj × (4) xj ⎦ Ictm →uk xj × Ict−1 Ictk →uj xj = m →uk ∼xj m∈ξk / j m∈ξk / j Among them, ∼ xj represents the edge probability of symbol x j , and ξk j represents the set of all variable nodes except the j-th user connected with function node ck . Then, update the user variable node. ⎡ ⎤ xj ⎦ (5) Ict−1 Ictk →uj xj = normalize⎣apv xj × m →uk m∈ξk / j Among apv xj represents the prior probability of user J code word, and them, normalize xj represents normalization processing. Step 3: decision processing of S-MPA algorithm. First, the probability of codeword xj,m transmitted by user variable node j is estimated, m = 1, 2, . . . M . (6) Icm →uk xj Q xj,m = apv xj,m × m∈ξk / j where Q xj,m represents the probability of codeword; xj,m transmitted by user variable node j. Then, the log likelihood ratio (LLR) of each coding bit is calculated to determine the user: P(bi = 0) m:bm,i =0 Q xj,m = log (7) LLRj,x = log P(bi = 1) m:bm,i =1 Q xj,m where P(bi = 0) represents the probability of the variable node to be decoded, and P(bi = 1) represents the probability of the variable node to be decoded.
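The function-node probability of Eq. (3) is evaluated over every combination of the codewords of the users sharing a resource block. A minimal sketch, with hypothetical symbols and channel coefficients:

```python
import itertools
import numpy as np

# Toy setting: resource block k carries d_f = 3 users, each with M = 4
# candidate codeword values on this block (hypothetical scalars).
M, df = 4, 3
cand = np.exp(2j * np.pi * np.arange(M) / M)     # per-user candidate symbols
h = np.array([1.0, 0.8, 1.2])                    # channel coefficients h_{k,v}
N0 = 0.5
y_k = h[0] * cand[1] + h[1] * cand[3] + h[2] * cand[0]   # noiseless for clarity

# Eq. (3): Q(x) = exp(-|y_k - sum_v h_{k,v} x_{k,v}|^2 / N0)
# evaluated for every combination of the d_f users' codewords.
Q = {}
for combo in itertools.product(range(M), repeat=df):
    s = sum(h[v] * cand[combo[v]] for v in range(df))
    Q[combo] = np.exp(-abs(y_k - s) ** 2 / N0)

best = max(Q, key=Q.get)
print(best)   # → (1, 3, 0): the true combination maximizes Q
```

The MPA messages of Eqs. (4)–(5) are marginalizations of exactly these $M^{d_f}$ values, which is where the exponential complexity that motivates the max-log simplification comes from.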
4 S-Max-Log-MPA Algorithm

In the original MPA, the exponential (EXP) operations require a large amount of computation and occupy a large amount of space during iterative message updating, and the user-node and resource-node message updates are carried out separately, so the complexity of the multiuser detection algorithm is high. The Max-log-MPA algorithm loses part of the information, which leads to poor bit error ratio performance. The S-Max-log-MPA algorithm in this paper is based on the idea of combining the S-MPA algorithm with the Max-log-MPA algorithm. First, starting from the sum-product MPA algorithm [12], the exponential operations of the original MPA algorithm are reduced to maximum-value and addition operations, which greatly reduces the space occupied by the computation. In addition, the user-node message update and the resource-node message update are combined: in each iteration, the updated message is immediately delivered to the next node, and that node immediately updates the resource-node message, which reduces both the loss of user information and the space occupied by intermediate variables. Therefore, the S-Max-log-MPA algorithm can effectively reduce the bit error ratio of the detection algorithm. Step (4) of the message update process of the S-Max-log-MPA algorithm is modified as follows:

$$I^t_{c_k \to u_j}(x_j) = \max_{\sim x_j}\left\{-\frac{1}{2\delta^2}\left|y_k - \sum_{v \in \xi_k} h_{k,v}\, x_{k,v}\right|^2 + \sum_{m \in \xi_k \setminus j} I^t_{u_m \to c_k}(x_m) + \sum_{m \in \xi_k \setminus j} I^{t-1}_{u_m \to c_k}(x_m)\right\} \tag{8}$$

where the constant factor $1/(\sqrt{2\pi}\,\delta)$ of the Gaussian likelihood contributes only an additive constant in the log domain.
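The max-log conversion behind Eq. (8) rests on the approximation $\log\sum_i e^{a_i} \approx \max_i a_i$, which replaces exponentials and summations with comparisons and additions at the cost of some information loss; a quick numerical check:

```python
import math

# The Max-log simplification replaces log(sum_i exp(a_i)) with max_i a_i,
# turning products of probabilities into sums and sums into maxima.
a = [-1.3, -4.0, -0.2, -7.5]   # example log-domain messages

exact = math.log(sum(math.exp(x) for x in a))
approx = max(a)

print(exact, approx)
# The gap is bounded by log(len(a)) and shrinks as one term dominates,
# which is why Max-log-MPA loses some information at low SNR.
assert abs(exact - approx) <= math.log(len(a))
```

The exact value always exceeds the max-log value, since the discarded terms are positive; this systematic bias is one source of the BER degradation discussed in Sect. 5.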
5 Analysis of BER Performance

The BER performance of a system is an important index of the accuracy of message transmission in a communication system. The Max-log-MPA algorithm reduces the exponential operations of the original sum-product MPA algorithm to maximum-value and addition operations, but this causes some information loss and degrades the system bit error ratio. Because the S-MPA algorithm does not judge the stability of user information, some user information is transmitted wrongly. In the proposed algorithm, the exponential operations are reduced to maximum-value and addition operations, and the user-node information update is integrated into the resource-node information update, which reduces the loss of user information and improves the accuracy of user information transmission.
6 Analysis of Simulation Results

The simulation parameters are as follows: the number of users is 6, the number of time-frequency resource blocks is 4, the nonzero element is 1000, the overload factor is 150%,
the channel is the Gaussian white noise (AWGN) channel, and the codebook is the four-dimensional codebook published by Huawei [13]. Figure 3 shows the average BER performance comparison between the S-Max-log-MPA algorithm and the Max-log-MPA algorithm as the SNR increases, with the maximum number of iterations $t_{\max} = 3$. It can be seen from Fig. 3 that the average BER of the S-Max-log-MPA algorithm is better than that of the Max-log-MPA algorithm: when $E_b/N_0$ = 0 dB, the BER performance of the S-Max-log-MPA algorithm is better than that of the Max-log-MPA algorithm by 6.03%, and when $E_b/N_0$ = 14 dB, by 0.967%.
Fig. 3 Comparison of average BER performance of the two algorithms when $t_{\max}$ = 3
Figure 4 shows the comparison of the average BER performance of the S-Max-log-MPA algorithm and the Max-log-MPA algorithm as the SNR increases, with the maximum number of iterations $t_{\max} = 5$. Comparing Figs. 4 and 3, as the number of iterations increases, the average bit error rates of the two algorithms become closer. In general, however, the average bit error rate of the S-Max-log-MPA algorithm is still better than that of the Max-log-MPA algorithm: when $E_b/N_0$ = 0 dB, the BER performance of the S-Max-log-MPA algorithm is better than that of the Max-log-MPA algorithm by 5.56%, and when $E_b/N_0$ = 14 dB, by 0.683%.
7 Conclusion

The S-Max-log-MPA algorithm in this paper is based on the idea of combining the S-MPA algorithm with the Max-log-MPA algorithm. It combines the advantages of the two algorithms and can effectively mitigate the high bit error ratio caused by the information loss of the Max-log-MPA and S-MPA algorithms. The simulation results show that the algorithm maintains better BER performance as the number of iterations increases.
Fig. 4 Comparison of average BER performance of the two algorithms when $t_{\max}$ = 5
Funding. This work was supported by the Youth Science Foundation of Northeast Petroleum University under Grant No. 2019QNL-34, by Graduate Innovative Research Work, and by the National Natural Science Foundation of China under Grant 61871348.

Conflicts of Interest The authors declare no conflict of interest.
References

1. Al-Falahy N, Alani OY (2017) Technologies for 5G networks: challenges and opportunities. IEEE IT Prof 19(1):12–20
2. Yuan Y, Zhu L (2014) Application scenarios and enabling technologies of 5G. IEEE China Commun 11(11):69–79
3. Zhang S, Xu X, Lu L, Wu Y, He G, Chen Y (2014) Sparse code multiple access: an energy efficient uplink approach for 5G wireless systems. In: Proceedings of 2014 IEEE global communications conference, pp 4782–4787. https://doi.org/10.1109/glocom.2014.7037563
4. Agyapong PK, Iwamura M, Staehle D, Kiess W, Benjebbour A (2014) Design considerations for a 5G network architecture. IEEE Commun Mag 52(11):65–75
5. Guo D, Shamai S, Verdu S (2005) Mutual information and minimum mean-square error in Gaussian channels. IEEE Trans Inform Theory 51(4):1261–1282
6. Khalid SS, Abrar S (2015) Blind adaptive algorithm for sparse channel equalisation using projections onto ℓp-ball. Electron Lett 51(18):1422–1424
7. Nikopour H, Baligh H (2013) Sparse code multiple access. In: 2013 IEEE 24th international symposium on personal indoor and mobile radio communication (PIMRC), pp 332–336. https://doi.org/10.1109/pimrc.2013.6666156
8. Nikopour H, Yi E, Bayesteh A, Au K, Hawryluck M, Baligh H, Ma J (2014) SCMA for downlink multiple access of 5G wireless networks. In: 2014 IEEE global communications conference, pp 3940–3945. https://doi.org/10.1109/glocom.2014.7037423
9. Zhang S, Xu X, Lu L, Wu Y, He G, Chen Y (2014) Sparse code multiple access: an energy efficient uplink approach for 5G wireless systems. In: 2014 IEEE global communications conference, Austin, TX, USA, pp 4782–4787. https://doi.org/10.1109/glocom.2014.7037563
10. Yang D, Dong B, Chen Z, Fang J, Yang L (2016) Shuffled multiuser detection schemes for uplink sparse code multiple access systems. IEEE Commun Lett 20(6):1231–1234
11. Xiong W, Wenping G, Xuewan Z, Wenli D (2018) Improved MAX-Log MPA multiuser detection algorithm in SCMA system. Appl Electron Tech 44(5):111–123
12. Kschischang FR, Frey BJ, Loeliger H-A (2001) Factor graphs and the sum-product algorithm. IEEE Trans Inf Theory 47(2):498–519
13. Pokamestov DA, Demidov AY, Kryukov YV, Rogozhnikov EV (2017) Dynamically changing SCMA codebooks. In: 2017 International Siberian conference on control and communications (SIBCON), Astana, Kazakhstan, pp 1–4. https://doi.org/10.1109/SIBCON.2017.7998451
A Method and Realization of Autonomous Mission Management Based on Command Sequence

Yiming Liu, Yu Jiang, Junhui Yu, Li Pan, Hongjun Zhang, and Zhenhui Dong

Beijing Institute of Spacecraft System Engineering, Beijing, China
[email protected]
Abstract. To meet the requirements of remote sensing satellites with more and more payloads and increasingly complex imaging tasks, a method of autonomous mission management based on command sequence is proposed in this paper. The design and encapsulation of command sequences based on payload operations reduce the planning of task details and the amount of data injection: the ground only needs to focus on the main operation and time of the imaging mission. This improves the efficiency of mission injection and execution, as well as the level of autonomous mission management. The method has been applied to a satellite.

Keywords: Command sequence · Autonomous mission management · Remote sensing satellite
1 Introduction

With the continuous development and progress of satellite technology, the requirements for the efficiency and quality of remote sensing satellite imaging are getting higher and higher. This requires the satellite data management system to complete satellite imaging tasks more timely and accurately. Autonomous mission management of a remote sensing satellite refers to monitoring the mission information injected from the ground, making the necessary judgments and processing of the mission, and generating the commands that can be executed by the on-board equipment, in order to control the switching and state setting of the payload equipment and complete the actions required for mission execution, ultimately meeting the users' needs. Nowadays, as more and more payloads are carried on remote sensing satellites, the completion of missions is also more and more complicated due to the need for cooperation between the payloads, and the instructions of each subsystem and single machine are more complex. If we still rely on injected commands to achieve the mission, users need to inject a huge amount of remote control data. This also requires users to understand the complex payload workflow to select commands, which greatly increases the difficulty of using the satellite. At home and abroad, a lot of research has been carried out on satellite autonomous mission management. NASA has developed two satellite mission planning systems, ASPEN [1] and CASPER [2], based on the EO-1 satellite. ASPEN automatically generates satellite-level commands for mission-level observation targets. CASPER proposes a real-time replanning mechanism, which can collect the real-time state and new targets of the system on a fixed time step or event step and perform the necessary planning. PROBA, supported by ESA, developed a number of autonomous planning technologies to complete autonomous management of the spacecraft in cooperation with ground control [3]. CNES and ONERA demonstrated autonomous management in the AGATA project ground simulation system [4]. Literature [5] encapsulates independent indirect command groups into a series of instructions, which simplifies the remote control operation of the satellite; however, it cannot flexibly choose the instructions to match other missions, and it is not suitable for missions that require a large number of complex instructions. Literature [6] proposes a layered architecture design for an autonomous mission management system, but it is not applicable to remote sensing satellites. Literature [7] uses autonomous mission planning instructions to realize missions with fixed commands, which reduces user input and improves the simplicity and reliability of satellite command operation; however, it is not applicable to missions with complex types and commands. In this paper, a method of autonomous mission management based on command sequence is proposed, which is suitable for remote sensing satellites with multiple mission payloads and complex commands. Users only need to inject the necessary mission information to complete the imaging mission, which improves the efficiency and reliability of satellite operation.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_43
2 Autonomous Mission Management Method Based on Command Sequence

According to the satellite control mode, the user's shooting/transmission/playback plan is produced on the ground. In this design, mission planning is still agreed to be completed on the ground: the ground user only needs to plan the basic imaging recording mission and data playback mission and inject them into the satellite through the mission template, and the satellite can automatically plan and calculate the actions needed for mission execution. The autonomous mission management process is shown in Fig. 1.
Fig. 1 The autonomous mission management process (mission injection → mission analysis → command sequence selection → command time calculation → collision detection → command merge → command adjust → command execution)
1. Mission injection
In the autonomous mission management method based on command sequence, the ground user does not need to pay attention to the instructions, timing and collocation between payloads, but only needs to inject the mission template containing the necessary mission elements; the other work is completed on the satellite autonomously. The mission template is shown in Table 1.

Table 1 Mission template

Mission element | Instructions | Note
Type | Identifies the specific type of mission | Necessary
Priority | Identifies mission priority | Necessary
Sequence number | For mission retrieval, modification and deletion | Necessary
Attribute | Includes the mission side swing, main load, transmission mode and other mission attributes | Optional
Start time | – | Necessary
End time | – | Necessary
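The mission template of Table 1 can be thought of as a small ground-injected record; the field names and values below are illustrative, not the on-board format:

```python
from dataclasses import dataclass, field

@dataclass
class MissionTemplate:
    """Ground-injected mission template (elements per Table 1)."""
    mission_type: str            # identifies the specific type of mission
    priority: int                # mission priority
    sequence_number: int         # for retrieval, modification and deletion
    start_time: float            # necessary
    end_time: float              # necessary
    attributes: dict = field(default_factory=dict)  # side swing, main load... (optional)

# The ground fills only these elements; command-level detail stays on board.
m = MissionTemplate("imaging_record", priority=2, sequence_number=17,
                    start_time=1000.0, end_time=1060.0,
                    attributes={"main_load": "camera_1"})
print(m.sequence_number)
```

The point of the structure is the small injection footprint: everything below this level (equipment selection, timing, merging) is derived on the satellite.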
The mission template only needs to contain the mission type, priority, sequence number, necessary attributes, and the start and end time.

2. Mission analysis and command sequence selection

Mission management analyzes the mission according to the mission attributes in the mission template and selects the relevant equipment to complete the mission according to the required main load. The corresponding command sequence is selected once the payload equipment participating in the mission is finally determined.

3. Command time calculation and collision detection

The specific time of command execution is calculated from the start and end time of the mission and the preparation time of equipment switching. Through the command sequence, the relative time of command execution with respect to mission execution can be defined flexibly, and on-orbit modification is supported, so the timing can be tested and adjusted according to actual use. If there is a conflict in load usage between missions, the conflicting missions are deleted as required.

4. Command merge

When executing commands, the necessary merge operations are performed to ensure the successful completion of the mission.
5. Command adjust

Since the command sequence decouples missions from commands, it is convenient to make adaptive adjustments to the commands, including the selection of on-off commands according to the main and standby states of the equipment, the modification of state setting commands, and so on.

6. Command execution

Execute the commands and complete the mission.
3 Key Technology

3.1 Design of Command Sequence

The design of the command sequence is the key to autonomous mission management. A command sequence represents the minimum collection of load actions that can completely realize the operation and control of a load, including starting up, shutting down, state setting and so on. Missions are performed by different loads, that is, by different command sequences. The command sequence contains (but is not limited to) the elements in Table 2.

Table 2 Command sequence

Element | Instructions | Note
Commands or groups of commands | The commands required to complete the action | Necessary
Command relative time | Command execution time relative to mission time | Necessary
Command absolute time | Command execution absolute time | Necessary
Command modification identification | Whether the command is modified and why | Optional
Command merging method | Rules for merging commands | Optional
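As a sketch, the Table 2 elements map naturally onto a per-command record; the field names, addresses, and rule strings below are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class SequencedCommand:
    """One entry of a command sequence (elements per Table 2); names illustrative."""
    command_addr: int            # address of the command (decouples sequence from command)
    relative_time: float         # execution time relative to the mission start/end
    modify_rule: str = "none"    # e.g. "none", "main_backup", "parameter"
    merge_rule: str = "none"     # e.g. "none", "merge_shutdown_only"
    absolute_time: float = 0.0   # filled in later by on-board calculation

# A hypothetical camera command sequence: power on before the mission,
# configure at start, power off after the end.
camera_seq = [
    SequencedCommand(0x101, relative_time=-30.0, modify_rule="main_backup"),
    SequencedCommand(0x102, relative_time=0.0, modify_rule="parameter"),
    SequencedCommand(0x103, relative_time=+10.0, merge_rule="merge_shutdown_only"),
]
print(len(camera_seq))
```

Storing the command by address rather than by value is what allows the sequence to pick up an on-orbit command modification without being re-injected.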
Among them:

• The commands or groups of commands are the specific commands needed to complete the load control. Addressing is adopted in the command sequence to decouple the command sequence from the commands, so the command sequence can be changed by modifying the commands.
• The relative execution time of a command is the offset between the execution time of the command and the execution time of the mission. Generally, boot-class commands precede the start time of the mission, while shutdown-class commands follow the end time of the mission.
• The absolute execution time of a command is the actual execution time obtained by calculation.
• Command modification identification includes no modification, modification according to the device's main and standby states, parameter modification according to the actual needs of the mission, and so on.
• The command merging method is the rule for judging command merging during mission execution when adjacent missions use the same load or equipment. When a merge is required, the command is no longer executed.

Autonomous mission management determines the selection of the command sequence according to information such as the key load and execution time in the mission template. A large amount of specific mission information resides in the command sequence and does not concern the user, which greatly reduces the input required from the user, improves the efficiency of injection, and reduces the probability of error.

3.2 Load Selection and Command Time Calculation

When the user injects the mission template, only the main loads involved in the mission (laser, camera) are considered; other related equipment (power, data transmission, etc.) is matched by the satellite autonomously. According to the actual situation, the collocation rules between the equipment are defined on the satellite, as shown in Table 3 (the table is only an example, not the actual situation). When the main equipment is determined, the related equipment is also determined, and the command sequence required by the mission is selected. After the command sequence selection is completed, the absolute execution time of each command is calculated.
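The time calculation step can be sketched as follows: startup-class offsets are taken relative to the mission start and shutdown-class offsets relative to the mission end. The anchors and values are made up for illustration:

```python
# Hypothetical sketch: convert each command's relative time into its
# absolute execution time, anchoring boot commands to the mission start
# and shutdown commands to the mission end.
def absolute_times(mission_start, mission_end, seq):
    """seq: list of (offset_seconds, anchor) with anchor in {'start', 'end'}."""
    out = []
    for offset, anchor in seq:
        base = mission_start if anchor == "start" else mission_end
        out.append(base + offset)
    return out

seq = [(-30.0, "start"),   # boot command 30 s before imaging starts
       (0.0, "start"),     # state-setting command at imaging start
       (+10.0, "end")]     # shutdown command 10 s after imaging ends
print(absolute_times(1000.0, 1060.0, seq))   # → [970.0, 1000.0, 1070.0]
```

Because only the offsets live in the command sequence, the same sequence serves every mission; the absolute times are recomputed per injection.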
3.3 Mission Merge Judgment and Command Execution

When multiple missions are executed simultaneously, the working times of the equipment may overlap. The commands should then be merged based on the working time window of the equipment to avoid unnecessary on-off operations; otherwise, the mission cannot be completed properly. The principle of merging is: if the interval between two adjacent missions is less than the preparation time of the equipment, the two equipment tasks are merged. The merging strategy is:

• Delete the earlier shutdown command and the later startup command.
• After merging, the start time of the task is the earlier start time and the end time of the task is the later end time.
Table 3 Equipment collocation rules (example). Rows list the related equipment (management equipment 1–3, power equipment 1–3, data equipment 1–3); columns list the main equipment (laser 1–5, camera 1–5); a check mark indicates that the related equipment is matched with that main equipment.
There are two types of equipment working time windows that need to be merged:

1. Equipment working time overlaps between missions (Fig. 2)
Fig. 2 Mission overlapping
After merging, the equipment task start time is the start time of mission 1, and the end time is the end time of mission 2. The shutdown command of mission 1 and the startup command of mission 2 are no longer executed.

2. Equipment working time of one mission covers another (Fig. 3)
Fig. 3 Mission covered
After merging, the equipment task start time is the start time of mission 1, and the end time is the end time of mission 1. The shutdown command of mission 2 and the startup command of mission 2 are no longer executed. The mission merge among multiple missions is handled the same way as for two missions.

3. Equipment working time not completely covered (Fig. 4)
Fig. 4 Mission not completely covered
After merging, the equipment task start time is the start time of mission 1, and the end time is the end time of mission 3. The shutdown command of mission 1, the startup command of mission 2, the shutdown command of mission 2 and the startup command of mission 3 are no longer executed.

4. Equipment working time completely covered (Fig. 5)
Fig. 5 Mission completely covered
After merging, the equipment task start time is the start time of mission 1, and the end time is the end time of mission 1. The startup command of mission 2, the shutdown command of mission 2, the startup command of mission 3 and the shutdown command of mission 3 are no longer executed.

When the condition of mission merge is met, the on-off commands of the equipment are merged. Because the merge method is defined at the command level in the command sequence, different merge timing and operations can be achieved for each command; for example, some equipment only merges shutdown commands, not startup commands. When a command is confirmed for execution, it is modified as needed, then executed to complete the mission.
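The merging principle reduces to an interval comparison on equipment working-time windows; a sketch in which `prep_time` and the windows are made-up values:

```python
# Sketch of the Sect. 3.3 merging principle: if the gap between two
# adjacent equipment working windows is less than the preparation time,
# drop the earlier shutdown and the later startup and merge the windows.
def merge_windows(windows, prep_time):
    """windows: list of (start, end) tuples sorted by start time."""
    merged = [list(windows[0])]
    for start, end in windows[1:]:
        last = merged[-1]
        if start - last[1] < prep_time:      # interval < preparation time
            last[1] = max(last[1], end)      # keep earlier start, later end
        else:
            merged.append([start, end])
    return [tuple(w) for w in merged]

# Three missions; the windows overlap, so all collapse into one task.
print(merge_windows([(0, 100), (90, 150), (120, 140)], prep_time=20))
# → [(0, 150)]
```

The `max` in the merge covers both the overlapping and the covered cases of Figs. 2–5: a fully covered window simply does not extend the merged end time.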
4 Validation

The method of autonomous mission management using command sequences has been successfully applied to a satellite. Over more than two months, 21 mission templates and 46 merged scenarios were tested; the method performed well and fulfilled the satellite imaging requirements.
5 Conclusion

This paper introduces a method and the key techniques of autonomous mission management using command sequences. The method realizes the decoupling of missions from equipment and of equipment from commands. It completes the autonomous selection of equipment, the calculation of command execution times, and the adaptation of command content. Autonomous mission management reduces the ground decomposition of missions, improves the efficiency and accuracy of remote injection, and supports the diversified design of missions. It can serve as a reference for the autonomous mission management of other remote sensing satellites.
References

1. Chien S, Rabideau G, Knight R et al (2000) ASPEN: automated planning and scheduling for space mission operations. In: Proceedings of the 6th international conference on space operations (SpaceOps 2000). AIAA, Washington, DC, pp 1–10
2. Cichy B, Chien S, Schaffer S et al (2006) Validating the EO-1 autonomous science agent. NASA, Washington, DC
3. Barnsley MJ, Settle JJ, Cutter MA et al (2004) The PROBA/CHRIS mission: a low-cost smallsat for hyperspectral multiangle observations of the earth surface and atmosphere. IEEE Trans Geosci Remote Sens 42(7):1512–1520
4. Jeremie P, Sylvain J, Patxi O (2014) Autonomous mission planning in space: mission benefit and real-time performances. In: Proceedings of the embedded real time software and systems, Toulouse. CNES, Paris
5. Wu B, Li Z, Li J et al (2013) Design of mission-oriented autonomous commands for small satellites. Spacecraft Eng 22(4):68–71 (in Chinese)
6. Zhu J, Wang L, Zhao W et al (2016) Analysis on key techniques of onboard autonomous mission management system of optical agile satellite. Spacecraft Eng 25(4):54–59 (in Chinese)
7. Zhang C, Liu J, Wang Z et al (2017) Design and on-orbit verification of task self-arrangement for GF-3 satellite. Spacecraft Eng 26(6):29–33 (in Chinese)
Remote Sensing Satellite Autonomous Health Management Design Based on System Working Mode

Li Pan, Fang Ren, Chao Lu, Liuqing Yang, and Yiming Liu

Beijing Institute of Spacecraft System Engineering, Beijing, China
[email protected]
Abstract. The working modes of a remote sensing satellite in orbit are flexible and diverse, and the working states and parameters of the satellite differ greatly between modes, which complicates the autonomous health management of the satellite. In order to associate satellite health management with the working mode, a health management method based on a finite state machine (FSM) is proposed. Taking the working modes as the states, the switching operations of each mode (such as switching equipment on and off) as the state transfer conditions, and the autonomous health management disposal measures under different states as the execution actions, a refined autonomous health management model under different working modes is established. The actual flight verification of a remote sensing satellite shows that the autonomous health management design is reasonable and feasible, and that the finite state machine modeling method is universal and reusable.

Keywords: Remote sensing satellite · Work mode · Autonomous health management · Finite state machine (FSM)
1 Preface

A remote sensing satellite has only a short window of ground-visible telemetry, tracking and control, while its remote sensing missions require highly continuous operation, and the number of remote sensing satellites keeps increasing. To reduce the burden of ground management and improve the service capability of the satellite and its handling of major emergency faults, the satellite is required to have a stronger autonomous health management capability. Fault diagnosis and disposal methods that rely mainly on the ground therefore need to be gradually transformed into on-board real-time autonomous diagnosis and disposal, which can not only deal with satellite faults or abnormal events quickly, but also avoid the serious consequences caused by the further spread of faults. Autonomous health management can improve the reliability and continuity of satellite mission operation.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_44
The traditional autonomous health management of remote sensing satellites is mainly based on fault diagnosis algorithms from signal processing; fault diagnosis relies mainly on static thresholds or states [1]. For states or thresholds that change with the satellite working mode, traditional autonomous health management mostly adopts wide thresholds or simply ignores monitoring the changing parameter, which leads to defects such as no health management under some important working modes, or health management that is relatively coarse and unable to accurately identify faults. In order to meet the needs of dynamic health management under different working modes, the system-level, subsystem-level and equipment-level health states under each working mode should be monitored in a fine-grained way. It is necessary to associate satellite health management with the satellite working mode, and then accurately identify and handle satellite anomalies under different working modes. Therefore, this paper proposes a finite state machine health management model based on the satellite working mode, which takes the satellite working mode as the state, the mode switching operation as the state transition condition, and satellite fault diagnosis as the execution action. On this basis, it realizes satellite autonomous health management associated with the satellite working modes, accurately identifies satellite faults in different modes, and generates health reports.
2 Autonomous Health Management Design Related to Working Mode

2.1 Autonomous Health Management Architecture Design

The autonomous health management of a remote sensing satellite usually adopts a hierarchical, distributed architecture. The system level, subsystem level and equipment level all have corresponding autonomous health management capabilities; the lower level provides its health status or report to the higher level, where the information is synthesized to obtain more comprehensive and accurate health information for fault diagnosis and disposal. The architecture has the advantages of accurate fault monitoring and safe, effective fault disposal. On this basis, the satellite working mode information, equipment working status and monitoring conditions are associated with the different levels of autonomous health management, which further makes the monitoring conditions accurate and complete and improves the accuracy and coverage of fault monitoring and diagnosis. The autonomous health management system architecture of a remote sensing satellite is shown in Fig. 1.

2.2 Satellite Operation Mode, Events and Operation Phase

2.2.1 Satellite Operation Mode Definition

A remote sensing satellite is equipped with payloads and with payload data transmission, processing and storage equipment and antennas according to its observation task. Its payload mission working modes mainly include imaging recording, data playback, recording while playing back, etc. In different modes, each payload works flexibly and cooperatively according
Fig. 1 Autonomous health management system architecture (system-level, subsystem-level and device-level health management exchange health reports, with the satellite work mode as input and ground mission operation and TT&C management at the base)
to the mission requirements, thus deriving multiple sub-modes. The common working modes are defined in Table 1.

2.2.2 Event and Work Phase Definition

In different working modes, each subsystem performs payload mission operations according to the time sequence requirements to execute imaging and data transmission missions. The operation or action that triggers a satellite working mode or condition conversion is defined as a point event Sn, and the phase in which a satellite working mode or condition is maintained is defined as a working segment Lm. Taking a satellite as an example, the point events and working phases related to the typical working modes are defined in Tables 2 and 3.

2.3 Fault Monitoring Design Related to Working Mode

The focus of health management in each satellite working mode is to monitor the working condition of the payload and the health of the satellite platform, such as the satellite attitude range and stability, the power supply during the payload working phase, the working temperature of the payload, the usage of the storage capacity and the health of the payload. For each working phase, a fault knowledge base can be established for the fault modes and their criteria that can be monitored, and the health management modules of each level complete health management functions such as fault diagnosis and disposal according to the fault knowledge base. The design of the fault knowledge base in different working phases is shown in Table 4.

2.4 Design of Health Reporting Mechanism

When autonomous health management is associated with satellite payload mission execution, fault monitoring is started at the beginning of each mission, and at the end
Table 1 Commonly working modes (No., work mode, sub-mode, and the applicability of each sub-mode to Payload 1, Payload 2, Payload 1 + 2, and Payload 3 in the record, playback, record-and-playback, and housekeeping playback modes; sub-modes include payload 1 record, payload 1 and payload 3 record, record and playback by sequence, and payload 1 record and playback)
Remote Sensing Satellite Autonomous Health Management …
L. Pan et al.

Table 2 Event Sn definitions and judgment conditions
No. | Event code | Event name | Event judgment condition
1 | CAM_S1 | Payload subsystem start working | Payload controller on
2 | CAM_S2 | Payload 1 start working | Payload 1 fourth power on
3 | CAM_S3 | Payload 1 stop working | Payload 1 fourth power off
4 | CAM_S4 | Payload subsystem stop working | Payload controller off
5 | CAM_S5 | Payload 2 start working | Payload 2 fourth power on
6 | CAM_S6 | Payload 2 stop working | Payload 2 fourth power off
Table 3 Work phase Lm definitions

No. | Phase code | Phase name
1 | CAM_L1 | Payload preparing/stop phase
2 | CAM_L2 | Payload 1 working phase
3 | CAM_L3 | Payload 2 working phase
4 | CAM_L4 | Payload 1, 2 working phase
of the mission, the fault monitoring is stopped and a mission monitoring report is formed. The report consists of two parts: an event code, which represents a typical working mode in Table 1, and fault codes, which represent the faults that occurred during the mission; each fault is assigned an independent code. This report is generated and sent to the ground regardless of whether a fault occurred during the mission.
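As a sketch, the mission monitoring report described above can be modeled as a small C structure. The field widths, the fault cap, and the function names are illustrative assumptions, not the satellite's actual on-board software.

```c
#include <string.h>

#define MAX_FAULTS 16  /* assumed cap on faults recorded per mission */

typedef struct {
    unsigned char  event_code;               /* working mode of this mission (Table 1) */
    unsigned char  fault_count;              /* number of faults recorded */
    unsigned short fault_codes[MAX_FAULTS];  /* one independent code per fault */
} mission_report_t;

/* start of mission: clear the report and latch the working-mode event code */
void report_start(mission_report_t *r, unsigned char event_code) {
    memset(r, 0, sizeof(*r));
    r->event_code = event_code;
}

/* record a fault detected by the monitoring modules during the mission */
void report_add_fault(mission_report_t *r, unsigned short fault_code) {
    if (r->fault_count < MAX_FAULTS)
        r->fault_codes[r->fault_count++] = fault_code;
}
```

At mission end the structure is downlinked as-is, so a report with `fault_count == 0` still reaches the ground, matching the "send regardless of faults" rule above.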
3 Software Design and Implementation

3.1 The Design of the Finite-State Machine Model

The finite-state machine (FSM), also known as a finite-state automaton, is a mathematical model that represents a finite set of states and the transitions and actions among those states. An FSM is a tool for modeling the behavior of objects: it describes the state sequence of an object over its life cycle and how the object responds to events from the outside world. FSMs are widely used in modeling application behavior, software engineering, compilers, and computing and language research [2]. Generally speaking, a state machine can be described by four elements: the present state, the condition, the action, and the secondary state [3]. Present state: the current state. Condition (message): when a condition is met or occurs, an action is triggered or a state migration is performed.
Table 4 Fault knowledge database design

No. | Fault type | Fault criterion | Work phase
1 | Position function error | GPS position error during imaging; GPS time code invalid; integration time invalid | DDT_L1/DDT_L2/DDT_L3/DDT_L4/CAM_L2/CAM_L3/CAM_L4/CGY_L2
2 | Payload 1 power supply error during working phase | Power voltage error; power-on status error; current error | CAM_L2
3 | Payload 2 power supply error during working phase | Power voltage error; power-on status error; current error | CAM_L3
4 | Payload 1, 2 power supply error during working phase | Power voltage error; power-on status error; current error | CAM_L4
5 | Payload 1 secondary power supply error | Power voltage error; power-on status error; current error | CAM_L2/CAM_L4
6 | Payload 2 secondary power supply error | Power voltage error; power-on status error; current error | CAM_L3/CAM_L4
Action: the action executed after the condition is met. After the action is executed, the machine can migrate to a new state or keep the original state. Secondary state: the new state to move to when the condition is met. The "secondary state" is relative to the "present state"; once activated, the "secondary state" becomes the new "present state". According to the working mode design and the definitions of the events and working phases of the satellite, the state machine model of working-mode health management is established. The mapping of the four elements is as follows: the working phases (Lm) correspond to the present/secondary states, the events (Sn) correspond to the conditions, and the fault monitoring of the working phases corresponds to the actions. Taking the camera subsystem as an example, the working-mode health management model based on the FSM is shown in Fig. 2.

3.2 Software Implementation

The model is implemented in the C language by the on-board data handling subsystem of the satellite. The working-mode health management function is a sub-module of the satellite autonomous health management system. Its main interface relationships with the other functional modules are shown in Fig. 3. The telemetry function module generates the mode conversion information and starts/stops the state machine. The mission management module and the command processing module generate the state conversion conditions.
Fig. 2 FSM model of autonomous health management (states: Initial/Stop; CAM_L1 payload prepare/stop; CAM_L2 payload 1 work phase; CAM_L3 payload 2 work phase; CAM_L4 payload 1, 2 work phase. Transitions: CAM_S1 payload start, CAM_S4 payload stop, CAM_S2 payload 1 start, CAM_S3 payload 1 stop, CAM_S5 payload 2 start, CAM_S6 payload 2 stop)
These conditions trigger the state machine to run and to perform fault monitoring in the corresponding state. After the autonomous health management module generates the event report information, it is transmitted to the ground system via the satellite telemetry processing module.
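The camera-subsystem model of Fig. 2 can be sketched as a table-driven transition function in C, matching the implementation language named above. The enum values and the transition table are inferred from the figure's labels, not taken from the flight code.

```c
/* Working phases Lm act as states, point events Sn as conditions. */
enum cam_phase { CAM_L1, CAM_L2, CAM_L3, CAM_L4 };
enum cam_event { CAM_S2, CAM_S3, CAM_S5, CAM_S6 }; /* payload 1/2 start/stop */

/* next[phase][event]; -1 means the event does not change the phase */
static const int cam_next[4][4] = {
    /*            S2      S3      S5      S6  */
    /* L1 */ { CAM_L2,    -1, CAM_L3,    -1 },
    /* L2 */ {    -1, CAM_L1, CAM_L4,    -1 },
    /* L3 */ { CAM_L4,    -1,    -1, CAM_L1 },
    /* L4 */ {    -1, CAM_L3,    -1, CAM_L2 },
};

/* Apply one point event Sn to the current working phase Lm; the fault
   monitoring bound to the new phase would be started here (omitted). */
int cam_transition(int phase, int event) {
    int next = cam_next[phase][event];
    return (next < 0) ? phase : next;
}
```

The table makes the four FSM elements explicit: row index is the present state, column index the condition, the table lookup the secondary state, and the per-phase fault monitoring is the action executed on each migration.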
Fig. 3 Software module relationship (the ground system uplinks payload mission data to the TC receive and processing module; payload mission management, TC processing, and mission execution generate the command sequence and status changes; the autonomous health management module performs work-mode analysis and parameter monitoring; health reports pass through health report management to TM collection and downlink)
4 In-Orbit Verification

A satellite has realized a complete system-level, subsystem-level, and equipment-level autonomous health management architecture. The health management design based on the finite-state machine and associated with the satellite working mode effectively solves the problem of unifying autonomous mission management and autonomous health management, so that autonomous health management can follow the satellite working mode in a timely manner. It can accurately locate faults arising in different modes and evaluate mission performance, which significantly improves the system-level health management ability of the satellite. The satellite has completed flight verification of the autonomous health management after launch and in-orbit operation. The results show that the autonomous health management design associated with the satellite working mode can accurately reflect the working condition of the satellite payload in real time and cooperates well with mission execution.
5 Conclusion

In this paper, for the first time, the working mode of a remote sensing satellite is associated with the design of autonomous health management. Based on finite-state machine theory, and according to the design of the working modes, the payload working phases, and the fault knowledge base during payload operation, the state changes and input-output actions in autonomous health management associated with the working mode are abstracted. The FSM model of autonomous health management based on the different satellite working modes is established for the first time and verified by in-orbit flight. The actual flight shows that the design of the working-mode-oriented autonomous health management system is reasonable and feasible, and that the state machine modeling method is universal and reusable. Its technical achievements and experience can be extended to other remote sensing satellite applications, so that the integrity and level of remote sensing satellite autonomous health management systems can be continuously improved.
References

1. Pan Y, Zhang G, Zhang D (2011) Design and implementation of fault diagnosis algorithm for satellite health management. Spacecraft Eng 20(5):37–42
2. Yang S, Wang X, Li M (2019) Mission planning method of rendezvous and docking based on FSM. J Beijing Univ Aeronaut Astronaut 45(9):1741–1745
3. Wang Y, Liu Y (2014) Automatic test design of remote sensing satellite ground station based on state machine. In: The 19th China remote sensing conference, pp 349–354
An Autonomous Inter-Device Bus Control Transfer Protocol for Time Synchronization 1553B Bus Network

Tian Lan, Zhenhui Dong, Hongjun Zhang, and Jian Guo

Beijing Institute of Spacecraft System Engineering, 100094 Beijing, China
[email protected]
Abstract. The 1553B bus is widely used in spacecraft avionics systems for its high reliability. A single point of failure (SPOF) of the bus controller (BC) leads to paralysis of the bus network. To improve the service continuity of the bus network, an autonomous inter-device bus control transfer protocol is proposed. The remote terminal (RT) address is used to identify each device and to initialize it with constant priority and timing parameters. At start-up or upon BC failure, the highest-priority device autonomously acquires bus control, and bus communication is self-repaired. After the faulty device is fixed, bus control can be transferred back under command control. In this paper, the realization scheme, the software implementation, and the verification test of the proposed protocol are introduced.
1 Introduction

The 1553B bus is one of the most popular buses used in avionics systems. In a 1553B bus network, a bus controller (BC) and various remote terminals (RTs) are connected in a linear topology. To achieve fault tolerance, a dual-channel design is used [1]. As a development of MIL-STD-1553, the ECSS-E-ST-50-13C protocol introduced a message scheduling scheme: the runtime of the bus network is divided into periodic frames of similar time interval. Data transmission, both receiving and sending, is designed generically, and a device identifies events by data content instead of by 1553B sub-address. At the beginning of each frame, a communication frame synchronization message is broadcast, followed by populated content and burst content [2]. However, since 1553B is a master-slave bus, a single point of failure (SPOF) of the BC may cause dysfunction of the whole bus network and put the whole spacecraft at risk. To maintain communication on the bus network, one and only one BC is essential. Three problems must be solved to overcome the SPOF of the BC: (1) how to detect BC dysfunction, (2) how to select a new BC, and (3) how to transfer bus domination. Various designs have been proposed to solve these problems, which can be divided into two categories: inner-device designs and inter-device designs. The difficulty is how to avoid competition among the different devices, including the faulty prime BC, the elected new BC, and the other candidate devices.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_45
In inner-device fault tolerance designs, the prime BC and backup computer modules work in hot backup mode. A specific election module is designed to transfer bus control when the prime BC becomes dysfunctional (Keya Liu [3]). The prime BC and its backups exchange sync data through an inner bus, each module judges its own health status from the received sync data, and the election module decides the bus control owner by monitoring the health statuses. When the prime BC cannot maintain the bus network, it is shut down to release the bus control, so competition is avoided. Inner-device designs achieve high-performance fault tolerance at the cost of expensive hardware and software. With the rapid development of spacecraft avionics systems, the RTs have gained capability equal to the BC, so the inner-device design is no longer the most effective way. To exploit the potential of the whole system, capable devices are developed as backups, which constitutes the inter-device fault tolerance design. Reference [4] proposed a 1553B fault-tolerant communication protocol (FTCP). In this protocol, the RTs monitor BC dysfunction by checking the periodic heartbeat signal broadcast by the BC. When the time since the last BC heartbeat exceeds a threshold, the first-priority RT enters BC mode immediately. Though the protocol realizes a software backup to the BC in the RTs, several problems remain unsolved. First, a missing BC heartbeat may indicate BC dysfunction, but it can also be caused by a bus interface error on the RT side. Second, the protocol only considers how to avoid conflict when all devices are already working; it does not consider how to avoid competition when the spacecraft powers on or resets. Third, after the dysfunction of the prime BC is fixed, it cannot rejoin the bus network, let alone act as the BC again. To solve these problems, an autonomous inter-device bus control transfer protocol is proposed in this paper.
Three scenarios are considered: power-on, BC dysfunction occurring in the running state, and reconnection of a repaired device to a running bus network. The priority of each device is realized through individual parameters, and the rules for setting these parameters for different devices are discussed. Considering the features of popular bus communication protocols, especially the time synchronization 1553B communication protocol proposed by ECSS, the proposed scheme can be realized on a current 1553B bus network without additional cost.
2 Start-up Bus Control Selection Procedure

In the proposed scheme, the communication frame synchronization message is used as the periodic broadcast heartbeat signal of the BC. A different RT address is allocated to each device in the bus network, including the prime BC. All devices on the bus are initialized as RTs and enter a sniffing state in the last step of initialization, as shown in Fig. 1. As a part of initialization, the bus control selection procedure runs once when a device powers on or resets. It is performed in the following steps: 1. The device is first initialized into RT mode, and its assigned RT address is confirmed. 2. Each device uses its RT address to retrieve the timing parameters used in the sniffing state. The devices on the bus network share a global configuration table, whose format is shown in Table 1.
Fig. 1 The bus control selection procedure in the initialization process
Table 1 is a global configuration table that supports three devices. Tstart is the initialization time cost before the sniffing state, which can be measured in ground tests. TInitWait indicates the duration of sniffing for the BC, which is used to compensate for the differences in Tstart. TBCcycle is the cycle of the heartbeat signal broadcast by the BC. TBCerror is the threshold for the BC in the running state, and TRunWait is the threshold for an RT in the running state. N is an integer number of heartbeat cycles TBCcycle; in our scheme, N is set to 4.
Table 1 Timing parameters related to bus control transfer

Device | Priority | RT address | Sniffing state config: Tstart | TInitWait | Running state config: TBCerror | TRunWait
A | First | 2 | Tstart_A | TInitWait_A | 2N × TBCcycle | TBCerror + 2TBCcycle
B | Middle | 6 | Tstart_B | TInitWait_B | 2N × TBCcycle | TBCerror + 4TBCcycle
C | Last | 8 | Tstart_C | TInitWait_C | 2N × TBCcycle | TBCerror + 6TBCcycle
Device A has the first priority to be BC, device B has middle priority, and device C has the last priority. The start-up time Tstart varies among devices, and is usually longer for devices with more complex initial operations, larger-scale software, or lower-performance hardware. To ensure the sequential initialization of devices and to avoid start-up competition, the following inequality should be satisfied:

(Tstart(n) + TInitWait(n)) < (Tstart(n + 1) + TInitWait(n + 1))    (1)
Considering the heartbeat cycle of the BC, the inequality transforms into:

N × TBCcycle < · · · < (Tstart(n) + TInitWait(n)) + N × TBCcycle < (Tstart(n + 1) + TInitWait(n + 1))    (2)
3. In the sniffing state, every device checks whether there is a BC on the bus by monitoring the heartbeat signal. If there is already a BC, the device finishes its initialization process and enters the running state in RT mode. Otherwise, the device continues to sniff for the BC heartbeat signal in RT mode. In this process, the device should ensure that the communication dysfunction is not caused by itself; therefore, the configuration registers of the bus interface are checked to diagnose possible errors, and a re-initialization is performed when an error is detected. If the duration of missing the BC heartbeat signal exceeds the device's TInitWait, the device turns into the BC and enters the running state. Each device has a different TInitWait, which makes the first-priority device, the one with the minimum sum of Tstart and TInitWait, become the BC.
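The sniffing-state decision of step 3 can be sketched in C. The tick-based polling, the role codes, and the function names are illustrative assumptions, not the onboard software's API.

```c
enum role { ROLE_SNIFFING, ROLE_RT, ROLE_BC };

typedef struct {
    long t_init_wait;   /* per-device sniffing threshold TInitWait, in ms */
    long silent_time;   /* elapsed time with no BC heartbeat seen, in ms */
} sniffer_t;

/* Called once per polling tick; heartbeat_seen reflects whether a
   communication frame synchronization message arrived since the last tick. */
int sniff_tick(sniffer_t *s, int heartbeat_seen, long tick_ms) {
    if (heartbeat_seen) {
        s->silent_time = 0;
        return ROLE_RT;          /* a BC already runs the bus: stay RT */
    }
    s->silent_time += tick_ms;
    if (s->silent_time > s->t_init_wait)
        return ROLE_BC;          /* no BC within TInitWait: take over the bus */
    return ROLE_SNIFFING;        /* keep sniffing in RT mode */
}
```

Because each device carries its own TInitWait from Table 1, only the first-priority device reaches the `ROLE_BC` branch on an otherwise silent bus; the bus-interface self-check and re-initialization of step 3 are omitted from this sketch.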
3 Autonomous Fault-Tolerance Bus Control Transfer Procedure

After initialization, all devices in the bus network enter the running state as the BC or as RTs. The BC maintains the communication and broadcasts the heartbeat signal. Two competition scenarios must be considered in the running state: one is the competition between the
faulty BC and a candidate device; the other is the competition among the candidate devices. In our scheme, two parameters are introduced to avoid competition: TBCerror, the error threshold for the BC, and TRunWait, the customized error threshold for the candidate devices, which encodes the priority associated with the RT address. A higher-priority device is customized with a shorter TRunWait, which allows it to take the place of the faulty BC earlier. Each device shares the same fault tolerance logic, shown in Fig. 2.
Fig. 2 The fault-tolerance logic in running state
As shown in Fig. 2, to identify BC dysfunction, the BC broadcasts its heartbeat signal periodically, and periodic fault detection is executed both in the BC and in the candidate devices.
For the BC, the fault detection is performed in the following steps: (1) A specific BC error count is increased when the BC cannot communicate with any RT in a period; otherwise the count is reset to zero. (2) When the BC error count is increased, a reconfiguration of the bus interface module is performed to try to clear the fault. (3) When the BC error count exceeds the BC fault threshold TBCerror, the BC confirms its own dysfunction. It then releases domination of the bus network by resetting its bus interface module into RT mode, and switches to a backup unit if possible.

For the candidate devices, the fault detection is performed in the following steps: (1) The candidate devices identify BC dysfunction by monitoring the periodic heartbeat signal of the BC. When a candidate device receives no new heartbeat signal in a period, its fault monitor count is increased; otherwise the count is reset to zero. (2) In each period, the candidate device compares the fault monitor count with its specific threshold TRunWait; if the count exceeds half of TRunWait, an RT-mode reconfiguration is performed to remove a possible fault of its own bus interface module. (3) If the count exceeds TRunWait, BC dysfunction is indicated, which is first identified by the top-priority candidate device. That device then takes domination by re-initializing its bus interface module into BC mode. The normal function of the 1553B bus network is recovered and the service of the spacecraft continues.

To avoid a candidate device turning into the BC while the faulty BC is still in BC mode, the following inequality should be satisfied:

TBCerror + 2M × TBCcycle < MIN(i=0,1,...,30) (TRunWait(i))    (3)
In this inequality, M is an integer number of heartbeat cycles, set to 1 in our scheme. To avoid competition among the candidate devices, the following inequalities should be satisfied:

TRunWait(n) + 2M × TBCcycle ≤ TRunWait(n + 1), n = 0, 1, . . . , 29
MAX(i=0,1,...,30) (TRunWait(i)) + 2 × TBCcycle ≤ MIN(j=0,1,...,30) (Tstart(j) + TInitWait(j))    (4)

The relation between the thresholds in the running state and the thresholds in the sniffing state is intended to avoid the competition caused by a restart of the prime BC, whether planned or not.
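The candidate-device steps (1)-(3) can be sketched as a per-period counter in C; counting in heartbeat periods rather than milliseconds, and the return codes, are simplifying assumptions of this sketch.

```c
typedef struct {
    int t_run_wait;     /* TRunWait threshold, expressed in heartbeat periods */
    int miss_count;     /* consecutive periods without a BC heartbeat */
    int reconfigured;   /* RT-mode reconfiguration already attempted */
} candidate_t;

enum { STAY_RT, RECONFIG_RT, TAKE_BC };

/* Executed once per heartbeat period in every candidate device. */
int candidate_period(candidate_t *c, int heartbeat_seen) {
    if (heartbeat_seen) {               /* step (1): heartbeat resets the count */
        c->miss_count = 0;
        c->reconfigured = 0;
        return STAY_RT;
    }
    c->miss_count++;
    if (c->miss_count > c->t_run_wait)  /* step (3): BC dysfunction indicated */
        return TAKE_BC;
    if (!c->reconfigured && 2 * c->miss_count > c->t_run_wait) {
        c->reconfigured = 1;            /* step (2): clear a possible local fault */
        return RECONFIG_RT;
    }
    return STAY_RT;
}
```

Since TRunWait grows with lower priority per inequality (4), the top-priority candidate reaches `TAKE_BC` at least 2M heartbeat cycles before any other, which is exactly the staggering that prevents two candidates from seizing the bus together.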
4 Active Credible Transfer Protocol of Bus Control

To transfer bus control among healthy devices, an active credible bus control transfer protocol is used. Three kinds of messages are designed: the bus control request, the bus control transfer permission, and the bus control transfer acknowledgement. The bus control request
and the bus control transfer acknowledgement are sent by the applicant and received by the current BC; the bus control transfer permission is sent by the current BC and received by the applicant. In this protocol, the bus control is transferred in the following steps: (1) The applicant device is ordered to apply for bus control from the current BC; the trigger is designed as a specific command. The applicant sends the bus control request once, packaged into the standard data transfer service. (2) After the BC receives the request, it starts a handshake process lasting three rounds. In each round, the BC first sends a bus control transfer permission to the applicant, and the applicant replies with a bus control transfer acknowledgement. The handshake process ends immediately when any message is missed or times out. (3) After a constant time interval long enough for the handshake process, the BC and the applicant judge whether the bus control transfer can be executed. On the BC side, if more than two acknowledgements were received successfully, it releases the bus control by re-initializing its bus interface into RT mode; otherwise it continues working as the BC. On the applicant side, it takes the bus control only when all three permissions were correctly received. To avoid competition during the transfer, the time interval used by the BC should be shorter than that used by the applicant: the current BC takes the moment of receiving the request message as its start point, and the applicant takes the moment of receiving the first permission message as its start point.
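The three-round handshake and the two end-point decisions can be sketched in C; the loss-bitmask test hook is an illustrative assumption for exercising the "ends on any missed message" rule, not part of the protocol itself.

```c
#define HANDSHAKE_ROUNDS 3

/* Simulate the rounds over a lossy link: perm_lost/ack_lost are bitmasks of
   rounds whose permission/acknowledgement is dropped. The process ends
   immediately at the first missed message, as the protocol requires. */
void run_handshake(int perm_lost, int ack_lost, int *perms, int *acks) {
    *perms = *acks = 0;
    for (int round = 0; round < HANDSHAKE_ROUNDS; round++) {
        if (perm_lost & (1 << round)) break;  /* permission missed: ends */
        (*perms)++;
        if (ack_lost & (1 << round)) break;   /* acknowledgement missed: ends */
        (*acks)++;
    }
}

/* BC side: release only when more than two acknowledgements arrived. */
int bc_releases(int acks_received) { return acks_received > 2; }

/* Applicant side: take over only when all three permissions arrived. */
int applicant_takes(int perms_received) {
    return perms_received == HANDSHAKE_ROUNDS;
}
```

The asymmetric judgment intervals described above then guarantee the BC decides first, so the applicant never re-initializes into BC mode while the current BC is still holding the bus.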
5 Validation Experiments

The proposed protocol was tested in a ground test environment of a satellite avionics system. Three real devices are included in the avionics system: one BC (device A) and two RTs (devices B and C). The computer module of device A was modified to support an additional RT address. The logic of the proposed protocol was encapsulated into a software component and uploaded into the onboard software. Three scenarios were designed in the test: the power-on scenario, the BC dysfunction scenario, and the active transfer scenario. The communication status of the bus network was monitored by specifically designed test software [5]. In the power-on case, the communication establishing time matched the start time and initial wait time of device A, and the message periods and communication statuses between the BC and devices B and C were all correct. In the BC dysfunction scenario, the dysfunction was simulated by sending a reset command to the BC. Device B replaced device A as the BC and recovered the bus communication correctly, and the restarted device A joined the bus network again in RT mode. Then, a bus transfer trigger command was injected into device A through the TT&C channel. The request, the permission, and the acknowledgement were transferred in the expected sequence, and the bus control was returned to device A.
6 Conclusions

Based on previous research on fault tolerance of the 1553B bus network, an autonomous inter-device bus control transfer protocol is proposed. Devices on the bus are treated as peers with different priority levels, and the RT address is bound to the priority. Competition for bus control is avoided by staggering the fault determination times, and the relationships between the timing parameters are analyzed. Different timing parameters are used to deal with initial power-on and in-service failures. In addition, an active bus control transfer protocol is introduced to facilitate reconstruction of the bus network after repair of the faulty device. The proposed protocol can be implemented on the basis of the ECSS-E-ST-50-13C protocol without increasing the bus overhead. The protocol is verified in a ground simulation environment of a satellite avionics system, and the experimental results are as expected.
References

1. DDC (2003) MIL-STD-1553 designer's guide. Data Device Corporation
2. ECSS (2008) Spacecraft engineering: interface and communication protocol for MIL-STD-1553B data bus onboard spacecraft. ECSS Secretariat, ESA-ESTEC Requirements and Standards Division
3. Liu K et al (2018) TMR design of FCS based on 1553B and reliability analysis. Comput Meas Control, pp 119–123
4. Wu K et al (2011) Research and design of 1553B fault-tolerant communication protocol. In: 2011 International conference on information and industrial electronics
5. Yu J et al (2018) Bus monitor and simulation software design based on time synchronization 1553B communication protocol. Comput Meas Control, pp 136–139
A Hybrid Service Scheduling Strategy of Satellite Data Based on TSN

Zhaojing Cui, Zhenhui Dong, Hongjun Zhang, Xiongwen He, and Yuling Qiu

Beijing Institute of Spacecraft System Engineering, 100094 Beijing, China
[email protected]
Abstract. Using a time-aware shaper to ensure accurate transmission is the key to time-sensitive network (TSN) scheduling. To meet the need of downloading massive data of different kinds from a spacecraft to the ground, an efficient scheduling algorithm based on the time-aware shaper is designed in this paper. By guaranteeing that data arrive accurately and completely, and by ensuring the priority of real-time data transfer, the utilization of virtual channel resources can be improved. Keywords: Time-sensitive network · Time-aware shaper · Onboard scheduling strategy
1 Introduction

With the increasing complexity of space tasks and the growing cooperation among multiple space tasks, massive data need to be integrated into one download channel. However, when the channel resource is occupied by burst tasks that have massive data and higher priority, packet queuing leads to congestion and several lower-priority tasks are delayed, so the efficiency of data downloading is reduced. Making the burst, massive data transmit in a controlled way is therefore the key to improving the real-time performance of the download channel. The time-sensitive network (TSN) is a new industrial communication standard currently promoted by international industry. Based on accurate synchronization, TSN can ensure highly consistent network task scheduling, and the determinism and real-time behavior of the network, through data flow scheduling and system configuration protocols [1]. TSN is composed of a series of data link layer standards, mainly including three parts: clock synchronization standards, data flow scheduling standards, and system configuration standards [2]. Aiming at the complex communication requirements of applying Ethernet in spacecraft and space-based networks, the data scheduling mechanism of TSN is researched in this paper. Compared with the strict-priority scheduling mechanism of standard Ethernet, TSN introduces the concept of the "time-aware shaper,"
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_46
and TSN proposes different types of time-aware shapers for different application scenarios [3]. By using a TSN shaper, a network node can control queued messages within prescribed time slots, so that the receiving and sending of each queue are defined in a predefined time scheduling table and the messages of each data queue are transmitted in their predetermined time slots; meanwhile, the other queues in the same switching node are locked [4]. In this way, different kinds of data flows send data in turn according to the specified time slots, which eliminates the uncertainty caused by the interruption of burst data. Also, the delay of a data flow in each switching node is determined, which in theory ensures the determinism and real-time performance of the network. Learning from the scheduling mechanism of the time-aware shaper in TSN, an improved credit-based cyclic queue forwarding strategy is proposed in this paper. On the basis of time synchronization between the sender and receiver, the time scheduling table of each task is preset and the transmissions of the queues are triggered by the time scheduling table [5]. Burst tasks are given specified time slots according to a credit defined by transmission time. In this way, the scheduling mechanism theoretically improves the determinism and real-time performance of the transmission compared with the usual scheduling mechanisms in spacecraft.
2 Several Time-Aware Shaper Strategies Defined in TSN

In the credit-based shaper (CBS), besides the best-effort queue, two queues are given higher priority, and these two queues transmit alternately according to their queue credit. When a queue is not transmitting, its credit increases at the idle-slope rate; when the queue is transmitting, its credit decreases at the transmission rate. When a transmission is completed, the credit of the queue is cleared; its credit is then lower than that of the other queue, and it increases again with the waiting time until the next transmission.

The time-aware gating shaper defines a gating table used for periodic control of each queue's sending or waiting. Before each transmission, all network nodes from the sender to the receiver are synchronized, and for each port in the bridge, the gate operations are carried out according to the agreed gating table. In implementations of the IEEE 802.1Qbv standard, one queue is used for real-time data stream transmission, which is specified in advance during time scheduling configuration, and another queue is reserved for transmitting burst data.

The frame preemption mechanism was designed by the TSN working group to solve the bandwidth waste caused by the bandwidth reserved against "best-effort" data streams in the time-aware gating shaper. For gigabit bandwidth, a reserved bandwidth of 1.5 kb leads to a time loss of 1.25 microseconds.

The cyclic queue forwarding (CQF) mechanism is a periodic queuing and forwarding mechanism: the queues in each switching node are simply divided into two ping-pong
364
Z. CUI et al.
queues, Q1 and Q2, and the time slots are divided into odd and even slots. At the input port, data received in an odd time slot S1 enters queue Q1, and data received in an even time slot S2 enters queue Q2. The scheduling and shaping mechanism of the output port is also very simple: the S1 slot can only output Q2 data, and the S2 slot can only output Q1 data. In this way, a gated periodic queuing mechanism is formed in the TAS, and the delay of the data is also determined: the transmission delay plus the delay of each hop [6]. At present, developers in industry usually build time shapers by integrating various data flow scheduling methods based on the time-aware shaper strategies listed above or on other time scheduling methods [7].
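The CQF ping-pong rule just described can be sketched in C; treating slot parity as odd/even of a slot index is an assumption of this sketch, not part of the standard's wording.

```c
enum { Q1 = 1, Q2 = 2 };

/* queue that stores frames arriving during time slot 'slot' */
int cqf_enqueue_queue(long slot) {
    return (slot % 2 == 1) ? Q1 : Q2;   /* odd slot S1 fills Q1, even S2 fills Q2 */
}

/* queue drained on the output port during time slot 'slot' */
int cqf_output_queue(long slot) {
    return (slot % 2 == 1) ? Q2 : Q1;   /* S1 outputs Q2, S2 outputs Q1 */
}
```

A frame enqueued in slot k is therefore always forwarded in slot k + 1, which is what bounds the per-hop delay to one slot length plus the transmission delay.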
3 Design of a Hybrid Service Scheduling Strategy in Spacecraft

In view of the complex communication requirements of spacecraft, the Consultative Committee for Space Data Systems (CCSDS) proposed the Advanced Orbiting Systems (AOS) protocol. AOS adopts a layered protocol stack similar to the ISO 7-layer Open Systems Interconnection (OSI) model, in which the space link layer corresponds to the data link layer of the OSI model. The space link layer is divided into two sublayers: the data link protocol sublayer, which multiplexes data packets produced by different users onto the same virtual channel, and the synchronization and coding sublayer. The AOS space data link protocol belongs to the data link protocol sublayer and enables multiple users' data to be transmitted in the same physical channel at the same time. According to the CCSDS AOS space data link protocol, telemetry data are downloaded to the earth station through two scheduling mechanisms: virtual channel scheduling and source packet scheduling [8]. In the data link protocol sublayer, static and dynamic scheduling are both used for virtual channel scheduling. In the second stage of multi-channel scheduling, static priority scheduling based on a fast scheduling algorithm is carried out according to the proportion of the source packet period. When the source packets of a burst channel have a long transmission period, low-priority source packets cannot be transmitted in real time, as shown in Fig. 1. What is more, when two or more kinds of burst data must be transmitted at the same time, the transmission is not efficient. According to the time-aware shaper scheduling mechanism of TSN and the transmission characteristics of the telemetry data produced by spacecraft, a hybrid service scheduling algorithm for satellite data based on the credit of cyclic queue forwarding is designed, as shown in Fig. 2, which divides the virtual channel into two parts [9].
The basic idea is as follows: after time synchronization, the virtual channel is divided into time slots. In an odd time slot, the fixed tasks occupy the channel, as shown in Q1 [4]. In this way, the loss of lower-priority packets can be avoided effectively, and different burst data can occupy the channel in contiguous time slots. For example, telemetry data can be basically divided into periodically transmitted telemetry packets (load test data, platform and load real-time telemetry data, platform and load delayed telemetry data) and burst telemetry packets (memory unloading data, event reports). Based on the CQF mechanism in TSN, the steps of the designed strategy are shown in Fig. 3.
A Hybrid Service Scheduling Strategy of Satellite …
Fig. 1 Burst data interrupts the periodic data transmission
Fig. 2 A mixed service scheduling strategy based on CQF of TSN
1. Time calibration: using the GPS synchronization information, make the receiver side and the sender side share the same time.
2. The sender is composed of two virtual channels, Q1 and Q2, each of which transmits part of the periodic data and part of the burst data. Q1 transmits the load test data and the memory unloading data; Q2 transmits the platform and load real-time telemetry data, the platform and load delayed telemetry data, and the event report data.
3. When an odd time slot comes, Q1 begins to send its predetermined data while Q2 waits. When an even time slot comes, Q2 sends its data while Q1 waits.
4. In an odd time slot, part of the slot is allocated to the burst task according to the credit of the last odd slot. The credit is defined by the amount of data to be transmitted: if the amount is large, the credit in the next slot will decrease.
5. When the burst task finishes its transmission, the credit of the burst data is cleared. As the idle time increases, the time slot allocated to the burst in the next transmission becomes longer.
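Steps 4 and 5 above can be sketched as a simple credit update (the proportional rule and the constants below are illustrative assumptions, not values taken from the paper):

```python
def update_credit(pending_bytes, slot_capacity):
    """Credit for the next slot: a large backlog lowers the credit so that
    periodic traffic is protected; a cleared backlog resets the credit."""
    if pending_bytes == 0:
        return 0.0  # step 5: transmission finished, credit cleared
    # step 4: the more data waiting, the smaller the share granted next slot
    return max(0.0, 1.0 - pending_bytes / slot_capacity)

assert update_credit(0, 1000) == 0.0                        # credit cleared
assert update_credit(200, 1000) == 0.8                      # light backlog
assert update_credit(900, 1000) < update_credit(200, 1000)  # heavy backlog
```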
Fig. 3 Flowchart of the scheduling strategy designed in this paper: time calibrating; generate time slot allocation table; branch on odd/even time slot; packets in Q1 (odd slot) or Q2 (even slot) enter the sending queue; transmission of the burst task in Q1/Q2; transmission of the period task in Q1/Q2 by priority; calculate the credit of the burst task for the next slot; enter the next period
4 Conclusion

Aiming at the demand for improved data transmission in spacecraft, a hybrid service scheduling strategy is designed based on the cyclic queue forwarding strategy in TSN. In this way, different kinds of data flows send data in turn according to the specified time slots, which eliminates the uncertainty caused by the interruption of burst data. Using the credit mechanism, not only is the priority of burst data transmission ensured, but the other data can also be downloaded exactly and completely, which improves resource utilization.
References

1. Kane Z, Difference between traditional ethernet and TSN. Available via DIALOG. http://blog.csdn.net/u013743845/article/details/76682290
2. IEEE standard for local and metropolitan area networks—audio video bridging (AVB) systems, IEEE Std. 802.1BA-2011, 2011. Available via DIALOG. http://standards.ieee.org
3. TTE and TSN: different design philosophy makes different future, 2019. Available via DIALOG. http://www.xperis.com.cn/technology_news_11.html
4. Enhancements for scheduled traffic, IEEE Std. P802.1Qbv. Available via DIALOG. http://www.ieee802.org/1/pages/802.1bv.html
5. Zhiping Z, Yang B, He X (2019) Design and implementation of hybrid service scheduler for TTE network. Inf Commun 5:138–140
6. Peizhuang C, Ye T, Xiangyang G, Xirong Q, Wendong W (2019) A survey of key protocol and application scenario of time-sensitive network. Telecommun Sci 20(10):31–42
7. TSN switching based on FAST(3), 2019. Available via DIALOG. http://blog.csdn.net/m0_37537704/article/details/86747856
8. Xianghui W, Tonghuan W, Ningning L, Hexiang T (2011) An efficient scheduling algorithm of multiplexing TM service based on the AOS. Spacecraft Eng 5:138–140
9. Peng Q (2019) Research on TSN scheduling algorithm for real-time application requirements. Electr Eng (1):82
Campus Bullying Detection Algorithm Based on Surveillance Camera Image Tong Liu1,2 , Liang Ye1,3(B) , Tian Han2,3 , Tapio Seppänen3 , and Esko Alasaarela3 1 Harbin Institute of Technology, 150080 Harbin, China
[email protected]
2 Harbin University of Science and Technology, 150080 Harbin, China 3 University of Oulu, 90014 Oulu, Finland
Abstract. In recent years, the phenomenon of campus violence has gradually come into people’s attention. Detection of campus bullying based on surveillance camera image has become a research hotspot. This paper designs a campus violence detection algorithm with a 3D convolutional neural network. The detection process includes three parts: video image preprocessing, feature extraction, and classification algorithm design. 80% of the samples in the video database are used to train the classification model, 20% of the samples are used as the testing set, and a fivefold cross-validation method is used to evaluate the performance of the classification algorithm. The simulation results show that the average classification accuracy of the proposed algorithm is 95.02%, indicating that the violence detection algorithm has good performance. Keywords: 3D convolutional neural network · Image processing · Campus violence · Classification algorithm
1 Introduction

In recent years, the phenomenon of campus violence has gradually come into people's attention. Campus violence is a common social problem all over the world [1, 2]. Campus violence hurts teenagers both physically and mentally, so the detection of campus bullying based on surveillance camera images has become a research hotspot [3, 4]. This paper designs a campus violence detection algorithm with a 3D convolutional neural network. The remainder of this paper describes the detection algorithm in detail.
2 Extract Video Image Features

Campus violence data were gathered by role playing, and the video images were captured with a surveillance camera. Firstly, the algorithm preprocesses the video images, since the quality of the extracted features has a great impact on the accuracy of the recognition algorithm. In this paper, a C3D convolutional neural network is built to extract the features of the preprocessed video images.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_47
2.1 Video Image Preprocessing

This paper uses OpenCV to read the videos and preprocesses the image sequences in turn, retaining the 3 colour channels during this process in order to keep as many features as possible. Firstly, each frame of the video is resized so that its length and width are both 112 pixels; secondly, every 16 frames of the video are grouped into one basic unit [5, 6]. The structure is given in Fig. 1.
Fig. 1 Structure of the basic unit of the video image
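The preprocessing steps above can be sketched as follows (NumPy only; the paper uses OpenCV, whose `cv2.resize` would replace the nearest-neighbour resize used here for illustration):

```python
import numpy as np

def resize_nn(frame, size=112):
    """Nearest-neighbour resize of an H x W x 3 frame to size x size x 3."""
    h, w, _ = frame.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return frame[rows][:, cols]

def make_units(frames, unit_len=16):
    """Group consecutive resized frames into basic units of 16 frames."""
    resized = [resize_nn(f) for f in frames]
    n = len(resized) // unit_len
    return np.stack(resized[:n * unit_len]).reshape(n, unit_len, 112, 112, 3)

video = [np.zeros((240, 320, 3), dtype=np.uint8) for _ in range(40)]
units = make_units(video)
assert units.shape == (2, 16, 112, 112, 3)  # 40 frames -> 2 full basic units
```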
2.2 Feature Extraction Model

This paper builds a C3D neural network based on the TensorFlow framework to extract features of video images. The input unit of the model is a tensor of dimension 16*112*112*3. The C3D neural network consists of 8 3D convolution operations but only 4 3D maximum pooling operations, in order not to reduce the length of the sequence too early [7]. The size of the convolution kernel in the convolution operations is 3*3*3, and the stride is 1*1*1. In the first pooling operation, the size of the pooling kernel is 1*2*2, and the stride is 1*2*2; in the remaining three pooling operations, the size of the pooling kernel is 2*2*2, and the stride is 2*2*2. After the convolution and pooling operations come two fully connected layers: the first has 4096 neurons and the second has 487. This paper selects the 4096-dimensional features output by the first fully connected layer as the input of the classification model. The structure of the C3D neural network is given in Fig. 2.
Fig. 2 C3D neural network structure
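The layer arithmetic above can be checked with a small shape-propagation sketch (pure Python; channel counts are omitted because the paper does not give them):

```python
def pool3d(shape, kernel_stride):
    """Apply one 3D max-pooling step; with stride equal to the kernel size,
    each dimension is divided by the corresponding stride."""
    return tuple(d // s for d, s in zip(shape, kernel_stride))

shape = (16, 112, 112)  # (frames, height, width) of one basic unit
pools = [(1, 2, 2), (2, 2, 2), (2, 2, 2), (2, 2, 2)]  # the 4 pooling layers
for p in pools:
    # the 3*3*3 convolutions use stride 1*1*1 and keep the shape unchanged
    shape = pool3d(shape, p)

assert shape == (2, 7, 7)  # only the pooling layers shrink the tensor
```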
3 Classification Algorithm Design

In order to simulate campus violence realistically, the authors and their team members role-played campus violence scenes and recorded the videos. In total, 118 non-violent video samples and 83 violent video samples were collected; 80% of the sample data are used to train the classification model, and 20% are used as the testing set to evaluate the performance of the model. The authors input the 201 video samples into the C3D feature model and obtain 201 csv feature files. Fivefold cross-validation is performed to get an average classification accuracy.

3.1 Classification Algorithm Model

The C3D feature extraction model extracts features for each basic unit. Each basic unit outputs a 4096-dimensional feature vector through the C3D feature model, so the number of neurons in the first layer of the classification model is 4096. Considering the efficiency of training the model and the classification performance [8, 9], this paper designs a four-layer neural network. The final output of the classification model falls into two categories, violent behaviour or non-violent behaviour, so the number of neurons in the output layer of the classification model is 2. The numbers of neurons in the hidden layers are 512 and 32, respectively. The structure of the classification model is given in Fig. 3.
Fig. 3 Neural network classification model
3.2 Selection of Classification Algorithm Parameters

Considering that the number of training samples is small, in order to prevent overfitting, dropout is introduced into the hidden layer of the classification model, and regularization of the weight parameters is added to the model loss function. This paper selects the Adagrad optimizer to train the model because it adapts the learning rate automatically. The global learning rate is ε = 0.02, but this is not the actual learning rate; the actual rate is inversely proportional to the square root of the sum of the previous squared gradients,

εn = ε / (δ + √(Σ_{i=1}^{n−1} gi ⊗ gi))   (1)

δ is a very small constant, with a value of about 10⁻⁷, so that the denominator cannot be 0. The specific iterative process of the optimizer is as follows. Firstly, randomly select a batch of samples {x1, …, xm} and the corresponding labels {y1, …, ym} from the training set. Secondly, calculate the gradient value and the error, and update the gradient accumulation r. The parameters are then updated according to r and the gradient values:

ĝ ← (1/m) ∇θ Σi L(f(xi; θ), yi)   (2)
r ← r + ĝ ⊗ ĝ   (3)
Δθ = −(ε / (δ + √r)) ⊗ ĝ   (4)
θ ← θ + Δθ   (5)
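The Adagrad update in Eqs. (2)–(5) can be sketched in NumPy as follows (a generic Adagrad step on a toy quadratic loss, not the authors' TensorFlow training code):

```python
import numpy as np

def adagrad_step(theta, grad, r, eps=0.02, delta=1e-7):
    """One Adagrad update: accumulate squared gradients (Eq. 3) and scale
    the step by the inverse square root of the accumulator (Eqs. 4-5)."""
    r = r + grad * grad                                 # Eq. (3)
    theta = theta - eps / (delta + np.sqrt(r)) * grad   # Eqs. (4)-(5)
    return theta, r

# minimise L(theta) = ||theta||^2 / 2, whose gradient is theta itself
theta = np.array([1.0, -2.0])
r = np.zeros_like(theta)
for _ in range(200):
    theta, r = adagrad_step(theta, theta.copy(), r)

# the parameters move towards the minimiser at the origin
assert np.linalg.norm(theta) < np.linalg.norm([1.0, -2.0])
```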
3.3 Classification Algorithm Training Results

After 70 rounds of training, the accuracy on the training set reaches 97%, and the accuracy on the testing set reaches 95%. The loss function of the model converges, and the training results are given in Fig. 4.
Fig. 4 Classification algorithm training results
4 Classification Algorithm Performance Evaluation

Fivefold cross-validation is used to verify the performance of the classification model. The testing set is divided into 5 equal parts, and the average of the 5 testing results is taken as the final result. The testing results are given in Table 1.
Table 1 Cross-validation results (unit: %)

Test set 1   Test set 2   Test set 3   Test set 4   Test set 5   Average correct rate
93.94        93.97        97.76        96.04        93.41        95.02
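The fivefold procedure can be sketched as a generic index split (not tied to the authors' data; only the per-fold accuracies are taken from Table 1):

```python
def kfold_indices(n_samples, k=5):
    """Split sample indices into k disjoint folds of (near-)equal size."""
    indices = list(range(n_samples))
    fold_size = n_samples // k
    folds = []
    for i in range(k):
        start = i * fold_size
        end = start + fold_size if i < k - 1 else n_samples
        folds.append(indices[start:end])
    return folds

folds = kfold_indices(40, k=5)
assert len(folds) == 5
assert sorted(sum(folds, [])) == list(range(40))  # disjoint and complete

accuracies = [93.94, 93.97, 97.76, 96.04, 93.41]  # per-fold results (Table 1)
assert round(sum(accuracies) / len(accuracies), 2) == 95.02
```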
5 Conclusions

This paper proposes a campus violence detection algorithm based on surveillance camera images. The algorithm reads and preprocesses the images with OpenCV, and a C3D convolutional neural network is built to extract the features. This paper builds a four-layer neural network and tests the classification performance with fivefold cross-validation. The violence detection algorithm achieves an average accuracy of 95.02%.
Acknowledgements. This research was funded by National Natural Science Foundation of China under grant number 41861134010.
References

1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Industr Inf 16(8):5379–5388
3. Arandjelovic R, Gronat P, Torii A, Pajdla T, Sivic J (2016) NetVLAD: CNN architecture for weakly supervised place recognition. CVPR, Las Vegas, pp 5298–5299
4. Gao Y, Liu H, Sun X, Wang C, Liu Y (2016) Violence detection using oriented violent flows. Image Vis Comput 48(4):37–41
5. Gordo A, Almazan J, Revaud J, Larlus D (2016) Deep image retrieval: learning global representations for image search. ECCV, Amsterdam, pp 241–257
6. Hasan M, Choi J, Neumann J, Roy-Chowdhury AK, Davis LS (2016) Learning temporal regularity in video sequences. CVPR, Las Vegas, pp 733–739
7. Hou R, Chen C, Shah M (2017) Tube convolutional neural network (T-CNN) for action detection in videos. ICCV, Venice, pp 1–11
8. Kooij J, Liem M, Krijnders J, Andringa T, Gavrila D (2016) Multi-modal human aggression detection. Comput Vis Image Underst 106–120
9. Mohammadi S, Perina A, Kiani H, Vittorio M (2016) Angry crowds: detecting violent events in videos. ECCV, Amsterdam, pp 3–18
Activity Emersion Algorithm Based on Multi-Sensor Fusion Susu Yan1,2 , Liang Ye1,3(B) , Tian Han2,3 , Tapio Seppänen3 , and Esko Alasaarela3 1 Harbin Institute of Technology, 150080 Harbin, China
[email protected]
2 Harbin University of Science and Technology, 150080 Harbin, China 3 University of Oulu, 90014 Oulu, Finland
Abstract. With the popularity of wearable intelligent devices, the research of activity recognition using wearable sensors has also been developed. This paper proposes an activity emersion algorithm based on multiple movement sensors. The authors gather movement data with wearable sensors and pre-process the data. They extract 57 features in both time-domain and frequency-domain, and select useful features with an SVM-RFE algorithm. SVM is used as the classifier. Recognition accuracy for the waist sensor is 93.4%, and that for the leg sensor is 92.1%. Then, the authors fuse the recognition results and achieve an accuracy of 95.6%, which is better than either single sensor. Keywords: Activity emersion · Data fusion · MSVM-RFE · SVM
1 Introduction

With the rapid development of machine learning techniques in recent years [1, 2], research on activity recognition and activity emersion has become a hot field. Activity recognition is usually based on wearable sensors, and the number and positions of the sensors have a great impact on the recognition accuracy. Nishida et al. [3] used an accelerometer, a gyroscope, and a magnetometer to recognize postures. Gao et al. [4] used different sensor combinations and got an average accuracy of 93%. Generally, sensor-based activity recognition can be catalogued into single-sensor-based [5] and multi-sensor-based [6] approaches. This paper uses multiple wearable sensors [7, 8] for activity emersion. With the acceleration and gyro data gathered by the sensors and a series of processing steps, the algorithm reproduces the types of activities that the user has performed. Activity emersion can be used in various scenes, for example, campus violence detection and patient activity recognition.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_48
2 Method

2.1 Data Gathering and Pre-processing

Data were gathered with LPMS sensors with embedded 3D accelerometers and 3D gyroscopes. In this research work, the authors placed two sensors on the waist and the right leg to collect movement data. Nine types of activities were performed, namely beat, jump, play, push, run, stand, walk, push down, and fall down. The y-axis of the accelerometer is vertical, and a combined vector is used instead of the two horizontal axes,

ACCH(i) = √(ACCx(i)² + ACCz(i)²)   (1)

As for the gyroscope, a combined vector is used instead of the three axes,

Gyro(i) = √(Gyrox(i)² + Gyroy(i)² + Gyroz(i)²)   (2)

For the ease of processing, the authors use a sliding window to segment the data flow, as shown in Fig. 1. The window length is 256 points, and the window slides 128 points each time. The sampling rate is 50 Hz. Noise is unavoidable, so a wavelet filter is used to reduce it.
Fig. 1 Sliding window
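Equations (1)–(2) and the windowing step can be sketched as follows (NumPy; the signal below is a placeholder, not the authors' sensor data):

```python
import numpy as np

def combine_horizontal(acc_x, acc_z):
    """Eq. (1): magnitude of the two horizontal acceleration axes."""
    return np.sqrt(acc_x ** 2 + acc_z ** 2)

def combine_gyro(gx, gy, gz):
    """Eq. (2): magnitude of the three gyroscope axes."""
    return np.sqrt(gx ** 2 + gy ** 2 + gz ** 2)

def sliding_windows(signal, length=256, hop=128):
    """50%-overlap segmentation: 256-point windows sliding by 128 points."""
    starts = range(0, len(signal) - length + 1, hop)
    return np.stack([signal[s:s + length] for s in starts])

acc_h = combine_horizontal(np.array([3.0]), np.array([4.0]))
assert acc_h[0] == 5.0                    # 3-4-5 triangle
windows = sliding_windows(np.zeros(1024))
assert windows.shape == (7, 256)          # (1024 - 256) / 128 + 1 = 7 windows
```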
2.2 Feature Extraction and Feature Selection

Features are extracted from the y-axis of the acceleration, the combined horizontal vector of the acceleration, and the combined vector of the gyro. Altogether, 36 (12*3) time-domain features and 21 (7*3) frequency-domain features are extracted, as shown in Tables 1 and 2, respectively.
Activity Emersion Algorithm Based on Multi-Sensor Fusion
375
Table 1 Time-domain features

Number   Feature                             Sensor data
1        Mean                                Acceleration*2, gyro*1
2        Variance                            Acceleration*2, gyro*1
3        Maximum                             Acceleration*2, gyro*1
4        Minimum                             Acceleration*2, gyro*1
5        Median absolute deviation           Acceleration*2, gyro*1
6        Pearson's correlation coefficient   Acceleration*2, gyro*1
7        Zero cross ratio of 3 axes          Acceleration*3
8        Maximum of differential             Acceleration*2, gyro*1
9        Minimum of differential             Acceleration*2, gyro*1
10       Mean of differential                Acceleration*2, gyro*1
11       Kurtosis                            Acceleration*2, gyro*1
12       Skewness                            Acceleration*2, gyro*1

Table 2 Frequency-domain features

Number   Feature                             Sensor data
1        Maximum                             Acceleration*2, gyro*1
2        Minimum                             Acceleration*2, gyro*1
3        Median absolute deviation           Acceleration*2, gyro*1
4        Mean                                Acceleration*2, gyro*1
5        Energy                              Acceleration*2, gyro*1
6        FFT coefficients                    Acceleration*2, gyro*1
7        Entropy                             Acceleration*2, gyro*1
Then, the authors use boxplots to select useful features. Figure 2 gives two examples: Fig. 2a shows a useful feature, which is able to distinguish some types of activities from the others, whereas Fig. 2b shows a useless feature. In this way, the authors first exclude 17 useless features. For the remaining 40 features, the authors use other feature selection methods. Such methods include Filter-type and Wrapper-type methods, among others. Filter-type methods select features according to the sample distances in the feature space, whereas
Fig. 2 Two examples of boxplots: (a) a useful feature; (b) a useless feature
Wrapper-type methods select features according to their contributions to the classification result; therefore, a Wrapper can suit the classifier better. This paper chooses the RFE (Recursive Feature Elimination) algorithm for further feature selection, with SVM (Support Vector Machine) as the classifier. Since this work is a multi-class classification task, the authors build multiple SVM models for classification. Each feature goes through the constructed SVM-RFE model, and the classification result is given in Fig. 3.
Fig. 3 Classification accuracies during feature selection
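The recursive elimination loop behind SVM-RFE can be sketched as follows (a NumPy sketch that uses least-squares weights as the ranking criterion in place of the SVM weight vector; the data and feature counts are synthetic):

```python
import numpy as np

def rfe(X, y, n_keep):
    """Recursive feature elimination: fit a linear model, drop the feature
    with the smallest absolute weight, and repeat until n_keep remain."""
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_keep:
        w, *_ = np.linalg.lstsq(X[:, remaining], y, rcond=None)
        remaining.pop(int(np.argmin(np.abs(w))))  # weakest feature goes
    return remaining

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = 3.0 * X[:, 2] + 0.01 * rng.normal(size=200)  # only feature 2 matters
assert rfe(X, y, n_keep=1) == [2]  # the informative feature survives
```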
It can be seen that when the number of features is 25, the accuracy is the highest. Therefore, the authors choose the first 25 features sorted by the SVM-RFE model.

2.3 Classification Design and Results

As mentioned above, multiple SVM models are used as the classifier, with RBF (Radial Basis Function) as the kernel function. Three-fold cross-validation is applied, i.e. 2/3 of the samples are used as the training set, 1/3 as the testing set, and the simulation is repeated 3 times. Figure 4 shows the confusion matrices of the sensor on the waist and that on the leg.
Fig. 4 Confusion matrices: (a) waist sensor; (b) leg sensor
Then, catalog beat, push, and push down as physical violence, and the remaining six types of activities as daily-life activities. Define physical violence as positive and daily-life activities as negative. For the sensor on the waist, accuracy = 93.4%, precision = 93.7%, recall = 93.2%, and F1-score = 93.0%. For the sensor on the leg, accuracy = 92.3%, precision = 92.1%, recall = 92.3%, and F1-score = 92.2%. Finally, the recognition results of the two sensors are fused; Table 3 shows the fusion result.

Table 3 Confusion matrix after fusion (unit: %)

True label   Predicted label
             Beat   Jump   Play   Push   Run    Stand   Walk   Pushdown   Falldown
Beat         96.7   0      0      0      0      0       0      2.6        0.7
Jump         0      100    0      0      0      0       0      0          0
Play         4.9    0      93.9   0      0      0.6     0.6    0          0
Push         5.3    0      3.2    91.5   0      0       0      0          0
Run          0      0.6    0      0      98.2   0       0.6    0.6        0
Stand        0      0      0.6    0      0      98.2    0      1.2        0
Walk         0      0      1.6    0      0      0       98.4   0          0
Pushdown     0      0      0      1.8    0      0       0      84.2       14.0
Falldown     0      0      0      1.5    0      0       0      11.9       86.6
For the fusion result, accuracy = 95.6%, precision = 94.0%, recall = 94.2%, and F1-score = 94.1%, which shows an improvement over either single sensor. To better illustrate the activity emersion results, the authors display simplified human actions on a graphical user interface. Figure 5 gives two examples.
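The reported metrics follow the standard binary definitions; given confusion counts they can be computed as below (the counts are illustrative, not the paper's):

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from binary counts,
    with physical violence as the positive class."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

acc, prec, rec, f1 = binary_metrics(tp=47, fp=3, fn=3, tn=47)
assert (round(acc, 2), round(prec, 2), round(rec, 2), round(f1, 2)) \
    == (0.94, 0.94, 0.94, 0.94)
```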
Fig. 5 Two examples of human action display: (a) walking; (b) beating
3 Conclusions

This paper proposes an activity emersion algorithm based on multiple wearable sensors. Nine types of activities are simulated, and 57 features in both time-domain and frequency-domain are extracted. With boxplots and the SVM-RFE algorithm, the feature dimension is reduced to 25. With a single sensor on the waist, the recognition accuracy reaches 93.4%, and on the leg, it reaches 92.1%. After fusion, the accuracy reaches 95.6%. To give intuitive results, the authors also display the recognized actions on a user interface.

Acknowledgements. This research was funded by National Natural Science Foundation of China under grant number 41861134010.
References

1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Ind Inf 16(8):5379–5388
3. Nishida M, Kitaoka N, Takeda K (2015) Daily activity recognition based on acoustic signals and acceleration signals estimated with Gaussian process. In: Asia-Pacific signal and information processing association summit and conference, pp 279–282. IEEE
4. Amroun H, Ouarti N, Temkit MH (2017) Impact of the positions transition of a smartphone on human activity recognition. In: IEEE international conference on internet of things, pp 937–942. IEEE
5. Ali H (2014) Physical activity recognition using single sensor: a novel approach
6. Gao L, Bourke AK, Nelson J (2014) Evaluation of accelerometer based multi-sensor versus single-sensor activity recognition systems. Med Eng Phys 36(6):779–785
7. Wang Y, Cang S, Yu H (2019) A survey on wearable sensor modality centred human activity recognition in health care. Expert Syst Appl 137:167–190
8. Martinez-Hernandez U, Dehghani-Sanij A (2018) Adaptive Bayesian inference system for recognition of walking activities and prediction of gait events using wearable sensors. Neural Netw 102:107–119
Neural Network for Bullying Emotion Recognition Xinran Zhou1 , Liang Ye1,2(B) , Chenguang He1,3 , Tapio Seppänen2 , and Esko Alasaarela2 1 Harbin Institute of Technology, Harbin 150080, China
[email protected]
2 University of Oulu, 90014 Oulu, Finland 3 Key Laboratory of Police Wireless Digital Communication, Ministry of Public Security,
People’s Republic of China, Harbin 150080, China
Abstract. Bullying is a kind of aggressive behaviour of unjustified actions and speech which usually occurs among students. Bullying behaviours are often accompanied by verbal disputes such as abuse, crying and other negative voices, so bullying emotion recognition can assist the detection of bullying behaviours. This paper uses neural networks for speech emotion recognition to recognize bullying behaviours. Spectrograms and MFCC features are extracted from speech, and CNN and RBF neural networks are used for classification. According to the experimental results, CNN outperforms RBF and gets an average recognition accuracy of 82.5% for bullying emotions. Keywords: Radial basis function · Convolutional neural networks · Speech emotion recognition
1 Introduction

Bullying is a kind of aggressive behaviour of unjustified actions and speech which usually occurs among students, and it is a pressing problem in schools today [1]. When bullying behaviours happen, there are often verbal disputes such as abuse, crying and other negative voices. Therefore, bullying emotion recognition can assist the detection of bullying behaviours. This paper studies the use of a Convolutional Neural Network (CNN) [2] and a Radial Basis Function (RBF) network [3] for speech emotion recognition (SER) [4] to recognize bullying emotions.
2 Neural Network for SER

SER has been studied for many years and has many applications [5, 6]. This paper uses two different SER methods, namely CNN and RBF, and compares their recognition performance.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_49
2.1 CNN for SER

AlexNet. A CNN is a deep learning algorithm which takes in an input image, assigns learnable weights and biases to aspects of the image, and differentiates one class from another. The architecture of a CNN is shown in Fig. 1.
Fig. 1 Architecture of CNN: convolution + nonlinearity, max pooling, fully connected layers, output
AlexNet [7] was the winning entry of ILSVRC 2012. It is an open-source, pre-trained deep CNN for image recognition. It consists of five convolutional layers with varying kernel sizes, followed by three fully connected layers; the last layer performs the 1000-way classification by applying a softmax function. This paper uses a transferred AlexNet for classification.

Spectrogram. Spectrograms have been widely used with convolutional neural networks. The spectrogram is the spectrum diagram of speech, generally obtained by processing the received time-domain signal. Spectrograms are basically two-dimensional graphs, with a third dimension represented by colour. They simultaneously reflect the three-dimensional information (time, frequency and energy) in the time–frequency domain of speech signals: one can see how the energy varies at different frequencies, represented by the colour. The darker the colour, the larger the energy (Fig. 2). Specifically, the input size of AlexNet is 227 × 227 × 3, and the spectrograms are normalized before training.

2.2 RBF for SER

MFCC Feature Extraction. Mel-scale Frequency Cepstral Coefficients (MFCC) are the cepstrum parameters extracted in the mel-scale frequency domain, which are widely used in speech recognition [8]. The relationship between mel frequency and actual frequency can be approximated by Eq. (1):

Mel(f) = 2595 × lg(1 + f/700)   (1)
Fig. 2 Spectrogram of an audio
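The mel mapping of Eq. (1) can be sketched as:

```python
import math

def hz_to_mel(f_hz):
    """Eq. (1): approximate mel value of a frequency given in hertz."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

assert hz_to_mel(0.0) == 0.0
assert round(hz_to_mel(1000.0)) == 1000  # 1000 Hz is about 1000 mel
```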
The utterance is processed to remove the noise. Then, the denoised speech signal is further sorted out, the useful speech frames are filtered out, and MFCC feature extraction is performed. Besides the MFCCs, the first-order and the second-order differential coefficients are also extracted.
3 RBF Neural Network RBF network is a three-layer feedforward neural network. It contains a pass-through input layer, a hidden layer and an output layer [9] (Fig. 3).
Fig. 3 The architecture of RBF network
The hidden layer consists of a number of RBF non-linear activation units. The output of an activation function in the hidden layer is calculated from the distance between the input pattern Xi and the centre Xp:

G(‖Xi − Xp‖) = exp(−‖Xi − Xp‖² / (2σ²))   (2)
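Equation (2) can be sketched as follows (σ is a width hyperparameter; the value used here is chosen only for illustration):

```python
import numpy as np

def rbf_activation(x, centre, sigma=1.0):
    """Eq. (2): Gaussian RBF response to the distance between the input
    pattern x and the hidden unit's centre."""
    dist_sq = np.sum((x - centre) ** 2)
    return np.exp(-dist_sq / (2.0 * sigma ** 2))

centre = np.array([0.0, 0.0])
assert rbf_activation(centre, centre) == 1.0   # zero distance gives 1
near = rbf_activation(np.array([0.5, 0.0]), centre)
far = rbf_activation(np.array([2.0, 0.0]), centre)
assert near > far                              # response decays with distance
```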
4 Experiments and Results

The experimental environment is MATLAB 2018b, and the audio recordings used in this experiment are from the CASIA database. The emotions used in this paper are happy, sad, angry and neutral. Firstly, the samples are split into a training dataset and a testing dataset, and the networks are trained with the speech utterances in the training dataset [10, 11]. Since AlexNet was designed to perform the 1000-way classification, some adjustments are needed to meet the classification requirements of this paper [12]; thus, all layers except the last three are extracted from the pre-trained network, and the labelled training dataset is fed into the CNN. The happy and the neutral voices are considered non-bullying emotions, and the sad and the angry voices bullying emotions. The results are summarized in Tables 1 and 2. It can be seen from the tables that the accuracies of bullying emotion recognition for CNN and RBF are 82.50% and 62.67%, respectively; CNN outperforms RBF.

Table 1 Confusion matrix of bullying emotion recognition by CNN (unit: %)

Real class     Predicted class
               Bullying   Non-bullying
Bullying       82.50      17.50
Non-bullying   8.75       91.25

Table 2 Confusion matrix of bullying emotion recognition by RBF (unit: %)

Real class     Predicted class
               Bullying   Non-bullying
Bullying       62.67      37.33
Non-bullying   35.00      65.00
5 Conclusion

Nowadays, bullying is very common among students. This paper studied bullying recognition through emotional voices. Two methods were tested: the pre-trained AlexNet (CNN) with spectrograms and the RBF neural network with MFCCs. According to the simulation results, CNN outperforms RBF with a higher recognition accuracy.

Acknowledgements. This paper was supported by National Key R and D Programme of China (No. 2018YFC0807101).
Neural Network for Bullying Emotion Recognition
References

1. Fenny O, Falola MI (2020) Prevalence and correlates of bullying behavior among Nigerian middle school students. Int J Offender Ther Compar Criminol 64(5):564–585. https://doi.org/10.1177/0306624X20902045
2. Zhao J, Mao X, Chen L (2019) Speech emotion recognition using deep 1D and 2D CNN LSTM networks. Biomed Signal Process Control 47:312–323
3. Chernykh V, Sterling G, Prihodko P (2017) Emotion recognition from speech with recurrent neural networks
4. Akçay MB, Oğuz K (2020) Speech emotion recognition: emotional models, databases, features, preprocessing methods, supporting modalities, and classifiers. Speech Commun 116:56–76. ISSN 0167-6393
5. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
6. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Indust Inform 16(8):5379–5388
7. Krizhevsky A, Sutskever I, Hinton GE (2012) ImageNet classification with deep convolutional neural networks. NIPS
8. All Answers (2018) Model of speech recognition using MFCC extraction [online]. Available from https://ukdiss.com/examples/speech-recognition-using-mfcc.php?vref=1. Accessed 25 Apr 2020
9. Faris H, Aljarah I, Mirjalili S (2017) Evolving radial basis function networks using moth-flame optimizer. https://doi.org/10.1016/B978-0-12-811318-9.00028-4
10. Zheng W, Mo Z, Xing X, Zhao G (2015) CNNs-based acoustic scene classification using multi-spectrogram fusion and label expansions
11. Huang Z, Dong M, Mao Q, Zhan Y (2014) Speech emotion recognition using CNN. In: MM 2014—proceedings of the 2014 ACM conference on multimedia, pp 801–804. https://doi.org/10.1145/2647868.2654984
12. Ng H-W, Nguyen VD, Vonikakis V, Winkler S (2015) Deep learning for emotion recognition on small datasets using transfer learning. In: Proceedings of the 2015 ACM on international conference on multimodal interaction (ICMI'15). Association for Computing Machinery, New York, NY, USA, pp 443–449. https://doi.org/10.1145/2818346.2830593
Modulation Recognition Algorithm of Communication Signals Based on Artificial Neural Networks

Dongzhu Li1, Liang Ye1,2(B), and Xuanli Wu1

1 Harbin Institute of Technology, Harbin 150080, China
[email protected]
2 University of Oulu, 90014 Oulu, Finland
Abstract. With the development of wireless communication technologies and computer science, modulation recognition of communication signals has attracted more and more attention. This paper studies a modulation recognition method based on artificial neural networks (ANN). Two kinds of communication channels are tested with the modulation recognition method, namely the additive white Gaussian noise (AWGN) channel and the Rayleigh channel. Simulations are performed in the MATLAB environment. According to the simulation results, the modulation recognition method can achieve an average recognition accuracy of 96% in the AWGN channel and 82% in the Rayleigh channel.

Keywords: Modulation recognition · Artificial neural network · Signal classification
1 Introduction

With the development of communication technology, various communication methods are widely applied [1, 2], and the electromagnetic wave environment has become unprecedentedly complex. In this situation, more and more attention has been paid to communication signal recognition technology. On the other hand, as computer science has been developing rapidly, the artificial neural network (ANN) [3] is now considered a powerful tool for signal recognition. Featuring high fault tolerance and self-adaptive, self-organizing ability, an ANN is a biomimetic structure consisting of massively interconnected computational nodes. The ANN is a distributed parallel network with broad application capabilities. Modulation recognition of communication signals based on ANN can automatically identify the modulation of digital signals after signal detection and help the demodulation process. With this technique, governments can implement spectrum management and interference identification. It can also be applied in military fields, such as communication jamming and monitoring.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_50
2 Modulation Recognition

Over the past twenty years, computer engineering realizations have become more stable, and more and more people have started to study communication signal recognition systems based on advanced computer science. Starting from early manual recognition, multilevel classification frames such as the trapezoidal frame were popular for quite a long time. However, their low efficiency and complex operation led people to turn to decision theory, neural networks, and cascade neural networks, and these methods soon became the core techniques for signal recognition. Since A. K. Nandi [4] and E. E. Azzouz [5] first applied ANN to signal recognition, more and more algorithms have been published by various institutions [6]. A basic modulation recognition system for communication signals can be divided into three parts: signal preprocessing, feature extraction, and training and recognition [7]. ANN is very effective in training and recognition. A simplified structure of the recognition system based on ANN is given in Fig. 1.
Fig. 1 Structure of recognition system based on ANN
2.1 Feature Extraction

During the recognition process, feature extraction is always the foundation of the algorithm. Feature extraction is responsible for extracting a signal's time-domain characteristics and transform-domain characteristics. Time-domain characteristics include the instantaneous amplitude, instantaneous phase, characteristic parameters of the instantaneous frequency, and other statistical parameters. The Hilbert transform and the zero-crossing method are widely used to analyze these parameters. Transform-domain characteristics include the power spectrum, spectral correlation function, time–frequency distribution, and other statistical parameters, which can be obtained via the fast Fourier transform. There are also many other ways to extract features, such as the wavelet theory-based method [8] and the continuous typing method. Meanwhile, some classifiers can extract features automatically; the following sections will discuss such classifiers.

2.2 Neural Network Classification

The neural network classifier here is responsible for the classification and recognition part of the whole procedure. Due to its intelligence level, recognition efficiency and recognition
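As a hedged illustration of the time-domain features mentioned above (not the authors' code), the instantaneous amplitude, phase, and frequency of a signal can be obtained from its analytic signal via the Hilbert transform; the test tone and sample rate are hypothetical:

```python
import numpy as np
from scipy.signal import hilbert

fs = 1000.0                      # sample rate (Hz), hypothetical
t = np.arange(0, 1.0, 1.0 / fs)
x = np.cos(2 * np.pi * 50 * t)   # a 50 Hz test tone

z = hilbert(x)                   # analytic signal x + j*H{x}
inst_amp = np.abs(z)                              # instantaneous amplitude
inst_phase = np.unwrap(np.angle(z))               # instantaneous phase
inst_freq = np.diff(inst_phase) * fs / (2 * np.pi)  # instantaneous frequency (Hz)
```

For a pure tone, the recovered instantaneous frequency stays near 50 Hz away from the edges; for a modulated signal, the statistics of these three sequences become the classifier's input features.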
accuracy, more and more attention has been paid to neural networks, and the ANN classifier has become a popular implementation. Classification and recognition require the designer to provide suitable decision rules and a reasonable structure. Generally, classifiers can be divided into decision-making classifiers and neural network classifiers. Compared with decision-making classifiers, neural network classifiers can deal with more modulations and suit various environments. Besides, they provide better performance under non-linear conditions, together with good stability and potential fault tolerance, and can achieve quite satisfying recognition accuracy. The feedforward neural network is one of the most widely applied networks in signal recognition, and algorithms such as the multilayer perceptron (MLP) and radial basis function networks are considered mature techniques. A multilayer perceptron network consists of interconnected artificial neurons with self-adaptive, self-learning, and self-organization abilities. It can be regarded as a parallel distributed dynamic processor; in other words, it can automatically extract features from massive data, thus forming a classification system. The basic structure of a multilayer perceptron is shown in Fig. 2, which includes an input layer, a hidden layer, and an output layer.
Fig. 2 Structure of a three-layer MLP neural network
In the computing network, each neuron's activation function defines its output when the input is given. This network includes many algorithms, and one of the most important is back propagation (BP). By comparing the predicted output with the real label, it calculates the error, adjusts the weight coefficients, and thus minimizes the error. The interlayer propagation process can be described by the formula below:

yj(n) = φ( Σk=1..I wjk · φ( Σi=1..I wki · αi(n) ) )   (1)
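A minimal sketch of the forward pass in Eq. (1), assuming a logistic activation φ and randomly initialized weights (the layer sizes here are hypothetical, not from the paper):

```python
import numpy as np

def phi(v):
    """Logistic sigmoid activation, a common choice for phi in Eq. (1)."""
    return 1.0 / (1.0 + np.exp(-v))

def mlp_forward(alpha, W_hidden, W_out):
    """Three-layer MLP of Fig. 2: input -> hidden -> output, as in Eq. (1)."""
    hidden = phi(W_hidden @ alpha)   # inner sum over inputs i
    return phi(W_out @ hidden)       # outer sum over hidden units k

rng = np.random.default_rng(0)
alpha = rng.normal(size=4)           # feature vector (e.g. extracted signal features)
W_hidden = rng.normal(size=(8, 4))   # input-to-hidden weights w_ki
W_out = rng.normal(size=(3, 8))      # hidden-to-output weights w_jk
y = mlp_forward(alpha, W_hidden, W_out)
```

Training with BP then consists of comparing y with the true class label and propagating the error backwards to update W_out and W_hidden.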
The ANN classifier shows good performance in an additive white Gaussian noise (AWGN) channel and shows its advantage with large sample sets and under fuzzy conditions. The authors use simulation to verify the above theory, and the training performance of the built MLP classifier is shown in Fig. 3.
Fig. 3 MLP classifier training performance
As the simulation shows, for signals through an AWGN channel, an MLP network with one hidden layer performs quite well. After training, the recognition accuracy on the testing sample set can reach about 95%. The testing result after the training procedure is shown in Fig. 4.
Fig. 4 Recognition accuracy of MLP classifier in AWGN channel
If the channel environment gets more complex, for example, a Rayleigh channel, then the result of this simulation model becomes unsatisfactory.
Taking the Rayleigh channel as an example, the recognition accuracy is much lower than that in an AWGN channel. We still use the above classifier for simulation but replace the testing sample set with signals passed through a Rayleigh channel; the result is given in Fig. 5.
Fig. 5 Recognition accuracy of the MLP classifier in Rayleigh channel
3 Conclusion

Among the existing modulation recognition algorithms for communication signals, those based on neural networks show great advantages in stability and convenience. Different from systems based on decision theory, they have a wider range of application. This paper studied a modulation recognition method based on artificial neural networks. Two kinds of communication channels were tested, namely the AWGN channel and the Rayleigh channel. Simulation results showed that the modulation recognition method can achieve an average recognition accuracy of 96% in the AWGN channel and 82% in the Rayleigh channel.

Acknowledgements. This work was supported by the National Natural Science Foundation of China under grant No. 61971161.
References

1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Industr Inf 16(8):5379–5388
3. Wang L (2016) A new neural network-based signal classification method for MPSK. In: Proceedings of the 2016 2nd international conference on energy equipment science and engineering, p 5
4. Wong MLD, Nandi AK (2003) Automatic digital modulation recognition using artificial neural network and genetic algorithm. Signal Process 84(2)
5. Adzhemov SS, Tereshonok MV, Chirov DS (2015) Type recognition of the digital modulation of radio signals using neural networks. Moscow Univ Phys Bull 70(1)
6. Sun T, Jia J, Yu G (2016) Automatic modulation recognition of both digital and analog communication signals. In: Proceedings of the 2016 international conference on electrical, mechanical and industrial engineering
7. Sun Y, Li J, Lin F, Pan G (2019) Automatic signal modulation recognition based on deep convolutional neural network. In: Proceedings of the 3rd international conference on computer engineering, information science and application technology (ICCIA 2019)
8. Wang L (2016) Recognition of digital modulation signals based on wavelet amplitude difference. In: Proceedings of 2016 IEEE 7th international conference on software engineering and service science (ICSESS 2016)
Deep Learning for Optimization of Intelligent Reflecting Surface Assisted MISO Systems

Chi Zhang, Xiuming Zhu, Hongjuan Yang, and Bo Li(B)

Harbin Institute of Technology (Weihai), Weihai, China
[email protected]
Abstract. The intelligent reflecting surface (IRS) has recently drawn a great amount of attention from researchers. It is worth investigating how to maximize the spectral efficiency (SE) by jointly optimizing the beamforming at the access point (AP) and the phase shifts of the IRS. Although traditional iterative algorithms can achieve high SE, they are not suitable for practical implementation due to their high computational complexity. Unsupervised learning can reduce the computational complexity, but as the number of IRS elements increases, its SE performance becomes unsatisfactory. In this paper, we investigate a new deep learning method to maximize the SE in an IRS-assisted multiple-input single-output (MISO) communication system. Simulation results show that its SE performance is better than that of the unsupervised learning method.

Keywords: Intelligent reflecting surface · Beamforming design · Deep learning
1 Introduction

With the commercialization of 5G wireless communication, researchers have started research on 6G [1–3]. Recently, the intelligent reflecting surface (IRS) has received a lot of attention. An IRS is composed of a large number of passive elements, each of which can independently adjust the phase of the incoming signal [4, 5]. In order to improve the performance of the IRS-assisted multiple-input single-output (MISO) communication system, there are a few studies on maximizing the spectral efficiency (SE) by jointly optimizing the beamforming at the access point (AP) and the IRS phase shifts [6–8]. Since the phase shifts of the IRS are constrained by the unit modulus, the optimization problem is difficult to tackle. In [6], the authors use the semidefinite relaxation (SDR) algorithm to transform the original problem into a standard semidefinite programming problem. A manifold optimization algorithm is used to solve the problem in [7]. However, when the number of IRS elements gets large, both have a large computational delay. Recently, deep learning (DL) methods have shown great potential for non-convex problems that traditional optimization methods find difficult to tackle [8, 9]. Compared with traditional iterative algorithms, unsupervised
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_52
learning methods can greatly reduce the computational complexity. However, as the number of IRS elements increases, the performance becomes unsatisfactory. In this paper, we propose a new DL method to maximize the SE by jointly optimizing the beamforming at the AP and the IRS phase shifts. The results show that the performance of the proposed method is better than that of the unsupervised learning method.
2 System Model

Consider a single-user MISO wireless system, as illustrated in Fig. 1, where an AP equipped with M antennas serves a single-antenna user and the IRS is deployed with N phase shifters. We assume that all channels are quasi-static flat-fading, and perfect channel state information (CSI) is known at both the AP and the IRS. The channels of the BS-IRS, IRS-user, and BS-user links are denoted as G ∈ C^(M×N), hr ∈ C^(N×1), and hd ∈ C^(M×1), respectively. The received signal at the user is

y = (GΦhr + hd)^T ws + z,   (1)
Fig. 1 IRS-aided single-user MISO system
where Φ = diag(e^(jθ1), e^(jθ2), …, e^(jθN)) denotes the phase shift matrix at the IRS and θn ∈ [0, 2π] is the phase shift of the n-th reflecting element of the IRS. w ∈ C^(M×1) denotes the beamforming vector at the AP satisfying ‖w‖² ≤ p, where p is the maximum transmit power at the AP. The transmitted signal is denoted by s, where E[|s|²] = 1. z is the additive white Gaussian noise with zero mean and variance σ². In this paper, we choose to maximize the SE as the optimization goal. The SE is given by

R = log2( 1 + (1/σ²) |(GΦhr + hd)^T w|² ),   (2)

For any given Φ, the optimal beamforming that maximizes the SE is the maximum ratio transmission (MRT) strategy, i.e., w_opt^T = √p (GΦhr + hd)^H / ‖GΦhr + hd‖. By substituting w_opt into (2), the problem can be formulated as

max_θ  log2( 1 + γ ‖GΦhr + hd‖² )
s.t.   0 ≤ θn ≤ 2π, ∀n = 1, …, N   (3)
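A small numerical sketch of the model above (Eqs. (1)–(3)); the dimensions and random channels are hypothetical, and it only checks that MRT beamforming achieves at least the SE of a random beamformer for a fixed Φ:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 16              # AP antennas, IRS elements (hypothetical)
p, sigma2 = 1.0, 0.1      # transmit power budget and noise variance

# Rayleigh-fading channels G, hr, hd
G = (rng.normal(size=(M, N)) + 1j * rng.normal(size=(M, N))) / np.sqrt(2)
hr = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(2)
hd = (rng.normal(size=M) + 1j * rng.normal(size=M)) / np.sqrt(2)

theta = rng.uniform(0, 2 * np.pi, size=N)
Phi = np.diag(np.exp(1j * theta))          # IRS phase shift matrix

h_eff = G @ Phi @ hr + hd                  # effective channel of Eq. (1)

def se(w):
    """Spectral efficiency of Eq. (2) for beamformer w."""
    return np.log2(1 + np.abs(h_eff.T @ w) ** 2 / sigma2)

w_mrt = np.sqrt(p) * h_eff.conj() / np.linalg.norm(h_eff)   # MRT beamformer
w_rand = rng.normal(size=M) + 1j * rng.normal(size=M)
w_rand = np.sqrt(p) * w_rand / np.linalg.norm(w_rand)
```

With MRT, the SE collapses to the objective of Eq. (3), log2(1 + p‖GΦhr + hd‖²/σ²), so only the phase shifts θ remain to be optimized.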
where γ is the signal-to-noise ratio (SNR).
3 Proposed Scheme

In this section, we introduce the proposed DL framework to solve the above joint optimization problem.

(1) Input Layer. It can be seen from (3) that the SE is related to the SNR and the CSI, so it is natural to choose the SNR and the CSI as the inputs of the neural network. Since the neural network does not support complex inputs, we split the CSI into its real and imaginary parts and feed them to the network separately. In order to ensure that the neural network works effectively over various SNRs, each set of data is randomly selected from (−20 dB, 20 dB).

(2) Output Layer. Our goal is to find the optimal IRS phase shift matrix that maximizes the SE. Similar to [9], we choose the activation function of the output layer as a sigmoid, to ensure that the output is in the range (0, 1). The output real values are α1, …, αN, so the corresponding phase shift matrix is given by

Φ = diag( e^(j2πα1), e^(j2πα2), …, e^(j2παN) ).   (4)

(3) Loss Function. In [8], the unsupervised learning method directly takes the negative of the objective function in (3) as the loss function. However, as the number of IRS elements increases, the SE calculated by this method is significantly lower than that of the traditional iterative algorithm. In order to improve the SE of the unsupervised learning method, we modify it by adding a small amount of labeled data to the training set, so as to bring the results closer to the optimal phase shifts. The loss function is defined as

Loss = Reward − log2( 1 + γ ‖GΦhr + hd‖² ),   (5)

where

Reward = { the maximum SE corresponding to the CSI and SNR,  for labeled data
         { 0,                                                for unlabeled data   (6)

The Reward for labeled data is the upper bound of the objective function in (3), which can be calculated by the SDR method. The Reward for unlabeled data is defined as 0, in which case the loss function is the same as that of unsupervised learning.
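A hedged numpy sketch of the semi-supervised loss in Eqs. (5)–(6); the SDR upper bound is replaced here by a plain `reward` argument, since computing it requires a semidefinite-programming solver:

```python
import numpy as np

def se_objective(gamma, G, Phi, hr, hd):
    """Objective of Eq. (3): log2(1 + gamma * ||G Phi hr + hd||^2)."""
    h_eff = G @ Phi @ hr + hd
    return np.log2(1 + gamma * np.linalg.norm(h_eff) ** 2)

def loss(gamma, G, Phi, hr, hd, reward=0.0):
    """Eq. (5): reward is the SDR upper bound for labeled samples, 0 otherwise."""
    return reward - se_objective(gamma, G, Phi, hr, hd)
```

For unlabeled samples (reward = 0), minimizing this loss is exactly the unsupervised objective of [8]; for labeled samples, the loss measures the gap to the SDR upper bound, pulling the network output toward the near-optimal phase shifts.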
4 Simulation and Analysis

In this section, we evaluate the performance of the proposed method using the SDR algorithm, the manifold optimization algorithm, and the unsupervised learning method as benchmarks. Assume that all channels are modeled by independent Rayleigh fading, and the reference distance dref is set to 1 m. Then, the path loss (in dB) can be calculated by 20.4 log10(d/dref) [10]. The distances from AP to IRS, AP to user, and IRS to user are denoted by dAI, dAU, and dIU, respectively. In order to facilitate comparison with [8], we set dAI = 8 m. The calculation of the distances is shown in Fig. 2, where d0 ∼ U[0, 8] and d1 ∼ U[1, 6].
Fig. 2 Simulation setup
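A small illustrative computation of the path-loss model above (the distance value is hypothetical):

```python
import numpy as np

def path_loss_db(d, d_ref=1.0):
    """Path loss in dB at distance d (m) relative to d_ref = 1 m, per [10]."""
    return 20.4 * np.log10(d / d_ref)

pl_ai = path_loss_db(8.0)   # AP-to-IRS link with d_AI = 8 m
```

At the reference distance the path loss is 0 dB, and each link's gain in the simulation is scaled down by its own path loss.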
A. Average Spectral Efficiency versus SNR

We consider M = 4 and N = 16, and the simulation results in this section are averaged over 1000 channel realizations. We also add the case without IRS, in which the beamforming vector at the AP is set according to MRT. For a fair performance comparison between our proposed method and unsupervised learning, the number of training samples for both is set to 500,000. In our method, 10% of the training samples are labeled data. Figure 3 shows that when a small amount of labeled data is added to the training set, the SE performance is improved, although it is still lower than that of the traditional iterative algorithm. It can also be seen that the performance of the communication system is greatly improved when it is assisted by the IRS.

B. Computational Complexity

We consider M = 4 and N = 8, 16, 32, respectively. The computation times of the different algorithms are averaged over 1000 channel realizations. The SDR algorithm is solved with the CVX toolkit [11], and the manifold optimization algorithm with Manopt [12]. For a fair comparison, all algorithms are run on an i5-8265 CPU. In Fig. 4, we plot the average computation time of the different methods. Compared with the traditional iterative algorithms, the DL method can greatly reduce the run time. As the number of IRS elements increases, the computation time of the traditional iterative algorithms increases significantly, while that of the DL method increases by less than 0.1 ms.
Fig. 3 Average spectral efficiency versus SNR
Fig. 4 Average time consumption
5 Conclusions

In this paper, a new DL method is proposed to maximize the SE by jointly optimizing the beamforming at the AP and the phase shifts of the IRS in a single-user MISO system. The simulation results show that the proposed method outperforms the unsupervised learning method, and compared with traditional iterative algorithms, DL methods can significantly reduce the computational complexity. Deep learning provides valuable insight for dealing with non-convex problems that are difficult to solve by traditional optimization algorithms and can greatly reduce computational complexity while ensuring performance.

Acknowledgements. This work is supported in part by the National Natural Science Foundation of China (No. 61401118 and No. 61671184) and the Natural Science Foundation of Shandong Province (No. ZR2018PF001 and No. ZR2014FP016).
References

1. Dang S, Amin O, Shihada B et al (2020) What should 6G be? Nature Electron 3(1):20–29
2. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
3. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Industr Inf 16(8):5379–5388
4. Basar E, Di Renzo M, De Rosny J et al (2019) Wireless communications through reconfigurable intelligent surfaces. IEEE Access 7:116753–116773
5. Di Renzo M, Debbah M, Phan-Huy DT et al (2019) Smart radio environments empowered by reconfigurable AI meta-surfaces: an idea whose time has come. EURASIP J Wirel Commun Netw 2019(1):1–20
6. Wu Q, Zhang R (2018) Intelligent reflecting surface enhanced wireless network: joint active and passive beamforming design. In: 2018 IEEE global communications conference (GLOBECOM). IEEE, pp 1–6
7. Yu X, Xu D, Schober R (2019) MISO wireless communication systems via intelligent reflecting surfaces. In: 2019 IEEE/CIC international conference on communications in China (ICCC). IEEE, pp 735–740
8. Gao J, Zhong C, Chen X et al (2020) Unsupervised learning for passive beamforming. arXiv:2001.02348
9. Lin T, Zhu Y (2019) Beamforming design for large-scale antenna arrays using deep learning. IEEE Wirel Commun Lett
10. Saleh AAM, Valenzuela R (1987) A statistical model for indoor multipath propagation. IEEE J Sel Areas Commun 5(2):128–137
11. Grant M, Boyd S (2014) CVX: Matlab software for disciplined convex programming, version 2.1
12. Boumal N, Mishra B, Absil PA et al (2014) Manopt, a Matlab toolbox for optimization on manifolds. J Mach Learn Res 15(1):1455–1459
Max-Ratio Secure Link Selection for Buffer-Aided Multiuser Relay Networks

Yajun Zhang(B), Jun Wu, and Bing Wang

Army Academy of Artillery and Air Defense, Nanjing, China
[email protected], [email protected], [email protected]
Abstract. Cooperative communication techniques have proven to be an effective way to achieve physical layer security. In recent years, cooperative networks with buffer-aided relaying have attracted more and more attention. This paper studies security problems in buffer-aided multiuser relay networks. Based on the proposed maximum ratio secure link selection criterion (Max-ratio-LS for short), we investigate the secrecy rate and secrecy outage probability performances. The numerical simulation results demonstrate that, for multiuser relay networks, the proposed max-ratio secure link selection criterion outperforms previously reported non-buffer-aided cooperative schemes.
1 Introduction

Physical layer security (PLS) has attracted more and more attention in recent years. The basic idea of physical layer security is to use the physical characteristics of the wireless channel, such as noise or fading, which are traditionally considered as impairments, to prevent eavesdropping attacks and improve transmission reliability with relatively low computational overhead. In Wyner's pioneering work [1], the secrecy capacity is defined as the maximum nonzero information transmission rate at which the sender can reliably send a secret message to the target receiver while the eavesdropper cannot decode it. Since then, many studies have been put forward in the literature from different perspectives [2]. Many studies have shown that cooperative communication not only increases the transmission capacity of wireless networks significantly [3], but also provides an effective way to increase confidentiality [4]. By carefully designing the relay, the information rate at the destination is maximized and the information rate at the eavesdropper is minimized, so as to improve the security capability. In recent years, buffer-aided cooperative networks have attracted much attention due to their significant performance advantages [5]. By introducing a data buffer at the relay, the requirement that the best relay-to-destination and source-to-relay links be determined simultaneously can be relaxed. With the development of buffer-aided relay technology, buffered relaying is proposed in [6–8] to ensure physical layer security. In [7, 8], a half-duplex two-hop relay network consisting of a source (Alice), buffer-aided decode-and-forward (DF) relays, a destination
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_53
(Bob), and an eavesdropper (Eve) is considered, and a link selection scheme is examined by solving two optimization problems with two-hop transmission efficiency and confidentiality constraints. Different from [7, 8], paper [6] studies multi-relay scenarios and proposes a new maximum ratio relay selection scheme for secure transmission in buffer-aided DF relay networks, in which an eavesdropper can intercept the signals from both the relay and the source nodes at the same time. The results show that the performance of the max-ratio relay selection scheme is obviously better than that of the traditional max–min ratio strategy, which proves that it is an attractive secure wireless transmission method. In this study, buffer-aided relaying is introduced into multiuser cooperative networks, and a max-ratio link selection criterion is proposed to improve security. Compared with the traditional random user selection scheme (T-random-US) and the traditional optimal user selection scheme (T-optimal-US), our buffer-aided maximum ratio link selection scheme can significantly improve the secrecy outage probability and secrecy rate performances. As far as we know, this is the first time that buffer-aided relaying has been introduced into multiuser secure cooperative networks. Simulation results show the effectiveness of the proposed maximum ratio secure link selection scheme. The rest of the paper is arranged as follows. In Sect. 2, we describe the system model and introduce the proposed max-ratio secure link selection scheme in detail. In Sect. 3, we study two performance metrics, the secrecy rate and the secrecy outage probability. In Sect. 4, we use numerical results to show the performance of our max-ratio secure link selection strategy. Finally, Sect. 5 concludes the paper.
2 Max-Ratio Link Selection Scheme and System Model

Figure 1 shows the system model of the buffer-aided multiuser cooperative network, where there are multiple source nodes (Ak, 1 ≤ k ≤ K), an eavesdropper (Eve), a destination node (Bob), and a relay node (R). All nodes work in half-duplex mode, which means they do not send and receive at the same time. Time division multiple access (TDMA) is adopted among the multiuser nodes. Without loss of generality, each transmission time slot is assumed to be unity. The relay is equipped with a data buffer Q of finite size L (in terms of number of packets). Packets in the buffer obey the "first-in-first-out" rule. In our model, no direct connection is assumed between Bob and the multiuser source nodes. The channel coefficients for Ak → E, R → E, and Ak → R at time t are denoted as hAk,E(t), hR,E(t), and hAk,R(t), respectively. The channels are assumed quasi-static (slow block) Rayleigh fading, so that the channel coefficients remain constant for a packet duration but vary independently across packet times, i.e., hAk,R(t) ∼ CN(0, ΩA,R), hR,E(t) ∼ CN(0, ΩR,E), and hAk,E(t) ∼ CN(0, ΩA,E). All noise is assumed additive white Gaussian, and without loss of generality, the noise variance is taken to be normalized. As in some other literature [6], we assume that accurate channel knowledge is available for all channels, including the eavesdropping channels. Then, the max-ratio secure link selection criterion can be formulated as

i* = arg max_{i ∈ {A1, …, AK, R}} { max_{Ak: Ψ(Q)≠L} (1 + γAk,R(t)) / (1 + γAk,E(t)),  max_{R: Ψ(Q)≠0} (1 + γR,B(t)) / (1 + γR,E(t)) }   (1)
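A hedged Python sketch of the selection rule in Eq. (1), with hypothetical SNR values (Ψ(Q) is the buffer occupancy): the source-to-relay links compete only while the buffer is not full, and the relay-to-destination link only while it is not empty.

```python
def max_ratio_link_selection(gamma_AkR, gamma_AkE, gamma_RB, gamma_RE,
                             occupancy, buffer_size):
    """Return the selected transmitter: index k of a source Ak, or 'R'."""
    candidates = {}
    if occupancy < buffer_size:          # relay can receive: Psi(Q) != L
        for k in range(len(gamma_AkR)):
            candidates[k] = (1 + gamma_AkR[k]) / (1 + gamma_AkE[k])
    if occupancy > 0:                    # relay can transmit: Psi(Q) != 0
        candidates['R'] = (1 + gamma_RB) / (1 + gamma_RE)
    # Select the link with the largest legitimate-to-eavesdropper SNR ratio
    return max(candidates, key=candidates.get)

# Two source users; here the relay's ratio (10/1.5) beats both users' (2.0)
sel = max_ratio_link_selection(gamma_AkR=[3.0, 5.0], gamma_AkE=[1.0, 2.0],
                               gamma_RB=9.0, gamma_RE=0.5,
                               occupancy=2, buffer_size=4)
```

Note that the ratio, not the raw SNR, drives the choice: a user with a strong link to the relay can still lose to a user whose link to Eve is weaker.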
Fig. 1 Model of buffer-aided multi-user relay system in DF secure transmission, in which eavesdropper intercepts signals from relay nodes and source nodes
where Ψ(Q) gives the number of data packets in buffer Q, and γAk,E(t), γR,E(t), and γAk,R(t) are the instantaneous signal-to-noise ratios of the links Ak → E, R → E, and Ak → R at time t, respectively. If i* ∈ {A1, …, AK}, the selected transmission link is i* → R, and if i* ∈ {R}, the selected link is R → B. Without loss of generality, at time slot t, we assume i* ∈ {A1, …, AK}, which means the network is at the relay receiving stage. The selected transmission user i* can transmit data packet x(t) to relay R with power PA. For convenience in the following, we denote i* as A*k at this stage. Then, the received signals at R and Eve are given by

yA*k,R(t) = √PA hA*k,R(t) x(t) + nR(t)   (2)

yA*k,E(t) = √PA hA*k,E(t) x(t) + nE(t)   (3)

respectively, where nE(t) and nR(t) are the noise at eavesdropper E and relay R, respectively, and E[|x|²] = 1. yA*k,R(t) is then decoded, stored in buffer Q, and waits its turn to be transmitted. In a later time slot t + τ, when i* ∈ {R}, the network is at the relay transmitting stage, and the relay transmits the decoded-and-forwarded data packet x̂(t) to the destination node B with power PR. Then, the received signals at Eve and Bob are given by

yR,E(t + τ) = √PR hR,E(t + τ) x̂(t) + nE(t + τ),   (4)

yR,B(t + τ) = √PR hR,B(t + τ) x̂(t) + nB(t + τ),   (5)

respectively, where nB(t + τ) is the noise at Bob at time t + τ.
Max-Ratio Secure Link Selection for Buffer-Aided …
Here, for a fair comparison among different security scenarios, we impose a total transmit power constraint P on all scenarios, i.e., $P_A + P_R = P$. Then $P_A$ and $P_R$ can be set as

$$P_A = \frac{P\,\Omega_{R,B}\,\sigma_R^2}{\Omega_{R,B}\,\sigma_R^2 + \Omega_{A,R}\,\sigma_B^2}, \qquad P_R = \frac{P\,\Omega_{A,R}\,\sigma_B^2}{\Omega_{R,B}\,\sigma_R^2 + \Omega_{A,R}\,\sigma_B^2}.$$
3 Secrecy Outage Probability

If $A_k^*$ is selected to transmit a data packet at time slot t, and relay R is selected to decode and forward that packet to the destination at time slot $t+\tau$, the instantaneous secrecy capacity of the whole buffer-aided multi-user cooperative system is obtained as [6],
$$C_s = \frac{1}{2}\log_2\!\left(\frac{1+\min\{\gamma_{A_k^*,R}(t),\, \gamma_{R,B}(t+\tau)\}}{1+\gamma_{A_k^*,E}(t)+\gamma_{R,E}(t+\tau)}\right) \quad (6)$$

where the instantaneous SNRs in (6) are given by $\gamma_{A_k^*,R}(t) = P_A |h_{A_k^*,R}(t)|^2/\sigma_R^2$, $\gamma_{A_k^*,E}(t) = P_A |h_{A_k^*,E}(t)|^2/\sigma_E^2$, $\gamma_{R,B}(t+\tau) = P_R |h_{R,B}(t+\tau)|^2/\sigma_B^2$, and $\gamma_{R,E}(t+\tau) = P_R |h_{R,E}(t+\tau)|^2/\sigma_E^2$. The secrecy outage probability is defined as the probability that the achievable secrecy rate falls below the target secrecy rate; when an outage occurs, secure transmission cannot be guaranteed. The secrecy outage probability can therefore be expressed as [2],

$$P_{out}(R_s) = \Pr(C_s < R_s)$$
(7)
where Rs is the target secrecy rate and Cs is expressed in (6).
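A Monte Carlo estimate of (7) under the Rayleigh-fading model of Sect. 2 can be sketched as follows. This is an illustrative sketch with assumed function and parameter names: it draws independent exponential channel gains per trial and does not reproduce the buffer-aided link selection itself.

```python
import math
import random

def secrecy_outage(R_s, P_A, P_R, omega_ar, omega_rb, omega_ae, omega_re,
                   trials=20000, seed=1):
    """Monte Carlo estimate of Pout(Rs) = Pr(Cs < Rs) with Cs from Eq. (6).

    Rayleigh fading: |h|^2 is exponentially distributed with mean omega;
    the noise variances are normalized to 1.
    """
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        g_ar = P_A * rng.expovariate(1 / omega_ar)   # gamma_{A*,R}
        g_rb = P_R * rng.expovariate(1 / omega_rb)   # gamma_{R,B}
        g_ae = P_A * rng.expovariate(1 / omega_ae)   # gamma_{A*,E}
        g_re = P_R * rng.expovariate(1 / omega_re)   # gamma_{R,E}
        c_s = 0.5 * math.log2((1 + min(g_ar, g_rb)) / (1 + g_ae + g_re))
        if max(c_s, 0.0) < R_s:
            outages += 1
    return outages / trials
```

As expected, the estimate decreases when the legitimate links become stronger relative to the eavesdropping links.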
4 Simulation Results

In this section, the secrecy outage performance of the proposed max-ratio-LS secure scheme is simulated in DF buffer-aided multi-user cooperative networks. Compared with the existing schemes, e.g., T-random-US and T-optimal-US, we show the performance gain of our scheme. Under the same assumptions as in [6], we set P = 2. The instantaneous secrecy capacity in (6) gives the secure communication rate for the current channel states. Figure 2 compares the secrecy outage performance with $\Omega_{A,E} = \Omega_{R,E} = 5$ dB and $\Omega_{A,R} = \Omega_{R,B}$. First of all, the outage performance of max-ratio-LS is significantly better than that of the other two buffer-assisted schemes. We can also observe that $P_{out}(R_s)$ of the proposed max-ratio-LS scheme decreases as the buffer size increases, slowly approaching the performance bound.
Fig. 2 Secrecy outage probability $P_{out}(R_s)$ versus $\Omega_{A,R}$ (dB) at $R_s = 1$, K = 2 (curves: T-random-US, T-optimal-US, and the proposed max-ratio-LS with L = 1, 2, 5, 50)
5 Conclusions

In this paper, a buffer-aided max-ratio secure link selection scheme is proposed for security enhancement in multi-user cooperative wireless communication systems. It selects the maximum-ratio link to transmit confidential data, which provides higher security performance. Simulations show that the secrecy outage probability of the proposed scheme becomes lower as the buffer length grows. Analytical security performance evaluation is an important direction for future work, and optimal power allocation between source and relay is another interesting topic.
References

1. Wyner AD (1975) The wire-tap channel. Bell Syst Tech J 54(8):1355–1387
2. Mukherjee A, Fakoorian S, Huang J, Swindlehurst A (2014) Principles of physical layer security in multiuser wireless networks: a survey. IEEE Commun Surv Tutor 16(3):1550–1573
3. Nosratinia A, Hunter TE, Hedayat A (2004) Cooperative communication in wireless networks. IEEE Commun Mag 42(10):74–80
4. Krikidis I, Thompson JS, McLaughlin S (2009) Relay selection for secure cooperative networks with jamming. IEEE Trans Wirel Commun 8(10):5003–5011
5. Zlatanov N, Ikhlef A, Islam T, Schober R (2014) Buffer-aided cooperative communications: opportunities and challenges. IEEE Commun Mag 52(4):146–153
6. Gaojie C, Zhao T, Yu G, Zhi C, Chambers JA (2014) Max-ratio relay selection in secure buffer-aided cooperative wireless networks. IEEE Trans Inf Forensics Secur 9(4):719–729
7. Jing H, Swindlehurst AL (2013) Wireless physical layer security enhancement with buffer-aided relaying. In: Asilomar conference on signals, systems and computers (ACSSC 2013), CA, USA, pp 1560–1564
8. Huang J, Swindlehurst AL (2015) Buffer-aided relaying for two-hop secure communication. IEEE Trans Wirel Commun 14(1):152–164
Reconfigurable Data Acquisition System with High Reliability for Aircraft Yukun Chen(B) , Lianjun Ou, Gang Rong, and Fei Liu China Academy of Launch Vehicle Technology, Beijing 100076, China [email protected]
Abstract. Data acquisition is an important means of acquiring aircraft telemetry state and an indispensable component of aircraft testing. A high-reliability data acquisition system for aircraft was developed based on a master–slave redundant node architecture. A dual hot-standby redundant architecture is adopted in the central node, so that control can be switched rapidly when a fault occurs, achieving both reliability and reconfigurability. The power modules and signal acquisition modules of the acquisition and coding unit were integrated, which reduces the complexity and weight of the electric cables. The experimental results indicate that the acquisition system meets the demands of isolated acquisition and transmission and of fault isolation. The data acquisition system has engineering application value for high-reliability, high-precision data acquisition and transmission.

Keywords: Redundancy · Aircraft · Reconfiguration · Data acquisition
1 Introduction

The telemetry system of a spacecraft has become a key element in the development of missiles and launch vehicles and influences the design and performance enhancement of aircraft. As the complexity of spacecraft systems grows, new demands are placed on telemetry parameters in terms of quantity, variety, and precision [1]. Data acquisition is the principal means of acquiring aircraft telemetry state, and in recent years the reliability and reconfigurability of data acquisition systems have attracted much attention. In traditional aircraft, the data acquisition modules and power modules are separated, so all devices collect and transmit data using their own crystal clocks [2]. A special data acquisition system for aircraft was developed based on a master–slave redundant node architecture. A dual hot-standby redundant architecture was adopted in the central node, so that control can be switched rapidly when a fault occurs, achieving reliability and reconfigurability. The power module and signal acquisition module were integrated, which reduces the complexity and weight of the electric cables. The experimental results indicate that the acquisition system meets the demands of
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_54
isolated acquisition and transmission and of fault isolation. The data acquisition system shows a significant advantage and engineering application value for high-reliability, high-precision data acquisition and transmission.

1.1 System Design

The master–slave data acquisition system of the aircraft includes sensors, acquisition units, and a centre unit, as depicted in Fig. 1. The measured physical parameters are first obtained by the sensors and converted into electrical signals [3], which are transmitted to the corresponding acquisition unit. After filtering and amplification, the signals are transmitted to the centre unit. According to scheduled sequences, the centre unit controls the corresponding acquisition units. At the same time, the centre unit receives data from the acquisition units into an inner data buffer [4]; the data is then converted from parallel to serial form and sent to the transmitter.
Fig. 1 Architecture framework of the data acquisition system (sensors feed isolated acquisition signals to the acquisition units, which connect to main and backup centre units through a dual-unit switch circuit that selects the data output)
1.2 System Hardware Design

1.2.1 Power and Signal Acquisition Module

The power module and the signal acquisition module were integrated into a single power and signal acquisition module. Besides the power supplies for the inner modules, a power supply for
external sensors was also developed in the acquisition unit. Figure 2 shows the connection between the power and signal acquisition module and the sensors. The power supplies for the inner modules and the external sensors are separate and work independently. Power supply and signal acquisition are achieved on the same connector through inner cables [5], which greatly reduces the complexity and weight of the electric cables.
Fig. 2 Link between the power and signal acquisition module and the sensors
The power module supplying +5 V and ±15 V to the external sensors is implemented with a DC converter that accepts a 14–40 V input. A filter is placed before the power converter input for electromagnetic compatibility. Figure 3 illustrates the design of the power module.
Fig. 3 Design of the power module (28 V input, 3 A filter, 30 W DC–DC converter producing +5 V and ±15 V)
1.2.2 Isolated Acquisition Module

The isolated acquisition module consists of a first amplifier, a second amplifier, a photocoupler, a third amplifier, analogue multiplexers, a fourth amplifier, and an analogue-to-digital converter, as depicted in Fig. 4. A signal requiring isolation passes sequentially through these devices. The signal before the analogue-to-digital converter is in a low-impedance state [6], and inter-channel interference is thereby eliminated. Figure 5 illustrates the detailed isolated acquisition circuit. In the design of an aircraft, the isolated acquisition signal ranges from 0 to 5 V. The first and fourth amplifiers work as voltage followers [7], both with unity voltage gain. To stabilize
Fig. 4 Isolated acquisition architecture (first amplifier → second amplifier → photocoupler → third amplifier → analogue multiplexers → fourth amplifier → analogue-to-digital converter)
the photodiode light signal strength, an external feedback circuit consisting of the second amplifier and the photocoupler's input photodiode inspects the photodiode intensity and adapts the photodiode current. When the photocoupler's output photodiode receives the light signal, the current signal received by the third amplifier is transformed into a voltage signal.
Fig. 5 Isolated acquisition circuit (photocoupler HCNR201, analogue multiplexer ADG406, analogue-to-digital converter AD7821 with 8-bit data output D0–D7)
1.2.3 Non-Isolated Acquisition Module

Figure 6 displays the non-isolated acquisition module circuit. The module consists of multiplexers, an amplifier, and an analogue-to-digital converter [8]. The multiplexer's typical delay between the 50% point of the digital inputs and the 90% point of the switch-on condition when switching from one address state to another is 120 ns [9]. The analogue-to-digital converter samples the input analogue signal and outputs parallel 8-bit data to the data bus.
Fig. 6 Non-isolated acquisition circuit (analogue multiplexer ADG406, analogue-to-digital converter AD7821 with 8-bit data output D0–D7)
1.3 System Software Design

Figure 7 illustrates the data flow diagram of the acquisition system in one cycle. The main centre unit and the backup centre unit first check their own state data after starting work.
The state data is then transmitted to the dual-unit switch circuit to determine the current centre unit. The current centre unit delivers a data synchronization signal and a clock signal to all acquisition units at the beginning of each cycle, and all acquisition units send their collected data to the current centre unit. Before the current centre unit processes the received data, it checks its own state data again. If the state data is correct, the current centre unit processes the received data of the current cycle and prepares the clock signal for the next acquisition cycle. Otherwise, the current centre unit abandons the received data of the current cycle and loops back to the self-check procedure of the main and backup centre units.
Fig. 7 Data flow diagram of the acquisition system in one cycle
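The control flow of Fig. 7 — pick the current centre unit from the self-check results, gather one frame from all acquisition units, re-check, and keep or discard the frame — can be sketched as follows. This is a hypothetical simplification; the real system is implemented in hardware and firmware.

```python
def select_current_unit(main_ok, backup_ok):
    """Dual hot-standby rule: the main unit is preferred; the backup takes
    over only when the main fails its self-check while the backup passes.
    If both fail, the main unit is kept to avoid frequent switching."""
    if not main_ok and backup_ok:
        return 'backup'
    return 'main'

def run_cycle(main_ok, backup_ok, unit_outputs, recheck_ok):
    """One acquisition cycle of Fig. 7: pick the centre unit, gather one
    frame from all acquisition units, then re-run the self-check; a failed
    re-check discards the frame for this cycle."""
    current = select_current_unit(main_ok, backup_ok)
    frame = list(unit_outputs)       # data collected from the acquisition units
    if not recheck_ok:               # second self-check failed:
        return current, None         # abandon this cycle's data
    return current, frame            # process and keep the data
```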
The dual-unit switch circuit adopts a watchdog mechanism, implemented by a time monitor and control logic circuit. The circuit has one trigger for the main centre unit and one for the backup centre unit. Each centre unit presents a signal to reset its own trigger, while the timer inside the dual-unit switch circuit periodically sets both triggers. If, by the end of a timing period, the main or backup centre unit has failed to present its reset signal, the corresponding trigger remains set; the dual-unit switch circuit then generates a switch signal, and the current centre unit is changed. If both the main and backup centre units work normally, no switch signal is generated, and the main centre unit is selected as the current centre unit by default. In the fault case where neither the main nor the backup centre unit can work, the main centre unit is also selected as the current centre unit to avoid frequent switching.
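The trigger mechanism described above can be modelled in a few lines. This is an illustrative sketch; the class name and the tick/reset API are assumptions, not part of the actual design.

```python
class DualUnitWatchdog:
    """Sketch of the dual-unit switch logic: each timer period sets both
    triggers; a live unit resets its own trigger in time, so a trigger
    that is still set when the period ends marks a failed unit."""

    def __init__(self):
        self.trigger = {'main': False, 'backup': False}
        self.current = 'main'

    def timer_tick(self):
        self.trigger = {'main': True, 'backup': True}   # timer sets both triggers

    def unit_reset(self, unit):
        self.trigger[unit] = False                      # heartbeat from a live unit

    def evaluate(self):
        main_failed = self.trigger['main']
        backup_failed = self.trigger['backup']
        if not main_failed:
            self.current = 'main'       # main alive: main by default
        elif not backup_failed:
            self.current = 'backup'     # only main failed: switch over
        else:
            self.current = 'main'       # both failed: keep main, avoid thrashing
        return self.current
```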
2 Implementation and Evaluation

According to the Nyquist–Shannon sampling theorem, a signal can be exactly reconstructed from its samples if the sampling frequency is greater than twice the highest frequency of the signal [10]. In practice, exact signal reconstruction and a low baud rate must be balanced, and the sampling frequency is often significantly more than twice the required frequency. For example, besides multiple analogue signals, there are images, digital data, and 1553 B data on the aircraft that need to be transmitted through the acquisition system [11]. These items are summarized in Table 1.

Table 1 Telemetry information on aircraft

#  Category                      Number  Sampling frequency (Hz)  Baud rate  Remark
1  Pressure                      15      40                       320
2  Overload                      3       200                      1600
3  Low frequency vibration       60      320                      2560
4  High frequency vibration      90      8000                     64,000
5  Temperature                   14      40                       320
6  Image                         2       500,000                  500,000
7  Voltage signal                30      40                       320        Need isolated acquisition
8  Digital data of other system  5       /                        115,200
9  1553 B data                   1       /                        500,000
The total transmission baud rate in Table 1 reaches 7 Mbps. The following test results were obtained. A total of 30 voltage signals were acquired, and the results meet the demand of isolated acquisition: the signal insulation resistance is greater than 100 MΩ, and the acquisition error is less than 1%. The dual hot-standby redundant architecture is adopted in the central node; when an injected fault causes the main centre unit to fail, control is switched rapidly from the main centre unit to the backup centre unit, achieving reliability and reconfigurability. Figure 8 shows the waveform of the initial voltage release at a baud rate of 7 Mbps. When the initial voltage is 3 V, the acquired voltage stabilizes rapidly within about 20% of the acquisition cycle after switching to the next acquisition path, owing to impedance conversion. The initial high voltage is released rapidly, so sampling and quantization are not disturbed, and inter-channel interference is eliminated.
Fig. 8 Waveform of initial voltage release
3 Conclusions

A high-reliability data acquisition system for aircraft was developed based on a master–slave redundant node architecture. A dual hot-standby redundant architecture is adopted in the central node, and control can be switched rapidly when a fault occurs, giving the system a significant advantage in reliability and reconfigurability. The power module and signal acquisition module of the acquisition unit were integrated, which reduces the complexity and weight of the electric cables. The experimental results demonstrate that the acquisition system meets the demands of isolated acquisition and fault isolation and eliminates inter-channel interference. The data acquisition system has engineering application value for high-reliability, high-accuracy data acquisition and transmission.
References

1. Bangfu L, Jianmin H, Bingchang L (2005) Telemetry system. China Aerospace Publishing Company, Beijing
2. Peijingjing L (2018) The design and application of multi-channel high precision analog data acquisition circuit. Comput Digital Eng 46(3):606–608
3. Yansong Z, Daming Q, Hongbin W (2018) Measurement error of non-contact rotor vibration displacement measurement system. J Aerosp Power 33(2):300–303
4. Barnes CR, Toney EG, Jaromczyk JW (2016) Comparison of network architectures for a telemetry system in the solar car project. In: Federated conference on computer science and information systems, Gdansk, Poland, pp 751–755
5. Zheng X (2011) The data communication on-board system based on VxWorks. Xidian University, Xian
6. Xiaole C, Junhui C (2018) Design of FPGA-based multi-channel analog acquisition circuit. Aeronaut Comput Techn 48(2):101
7. Zhenming G (2000) Analogue circuit basis. Harbin Engineering University, Harbin
8. Li G, Ji L, Feng G (2014) Application and analysis of analog multiplexers in data acquisition system. Appl Integr Circuits 12(40):40–42
9. Weilong S, Yuanwu Z, Zhenhua Z (2018) Design and FPGA implementation of real time ethernet core scheduling module 43(3):111–115
10. Nyquist H (1928) Certain topics in telegraph transmission theory. Trans AIEE 47:617–644
11. Range Commanders Council (2017) Telemetry standards. IRIG Standard 106-17
The Algorithm of Beamforming Zero Notch Deepening Based on Delay Processing Junqi Gao and Jiaqi Zhen(B) College of Electronic Engineering, Heilongjiang University, Harbin 150080, China [email protected]
Abstract. To address the performance degradation of the diagonal loading beamforming algorithm at low signal-to-noise ratio (SNR), a new beamforming method based on the delay covariance matrix of the received signal is proposed. Firstly, the delay covariance matrix of the received signal is constructed to suppress the influence of the noise; then the delay covariance matrix is transformed to obtain the eigenvalues of the covariance matrix; finally, the eigenvalues are processed by the diagonal loading technique. The simulation results show that the proposed algorithm is effective.

Keywords: Diagonal loading · Delay processing · Beamforming
1 Introduction

Beamforming is an important research topic in array signal processing and is widely used in radar, sonar, and wireless communication [1–3]. In recent decades, many robust adaptive beamforming algorithms have emerged, among which the robust adaptive beamforming algorithm based on diagonal loading (DL) is a common one. The DL beamforming algorithm modifies the covariance matrix by adding a small loading amount to the sample covariance matrix. In Ref. [4], a new matrix is constructed to replace the sample covariance matrix so as to improve the robustness of the beamformer, but the computational complexity of the algorithm is high. In Ref. [5], eigenvalue decomposition is used to reduce the computational complexity, but the performance of the beamformer still needs improvement.
This work was supported by the National Natural Science Foundation of China under Grant No. 61501176, Heilongjiang Province Natural Science Foundation (F2018025), University Nursing Program for Young Scholars with Creative Talents in Heilongjiang Province (UNPYSCT2016017).
2 Signal Model

The covariance matrix of the uniform linear array (ULA) is eigen-decomposed to obtain

$$R = \sum_{i=1}^{M} \lambda_i u_i u_i^H \quad (1)$$
where $u_i$ is the eigenvector corresponding to the eigenvalue $\lambda_i$. The covariance matrix after diagonal loading is

$$\tilde{R} = R + \lambda_{DL} I \quad (2)$$

where $\lambda_{DL}$ is the loading amount and I is the identity matrix. When the SNR is high, $\lambda_1 \gg \lambda_{DL} \gg \lambda_M$ holds, and the loading amount can be selected as

$$\lambda_{DL} = \sum_{i=1}^{M} \lambda_i \quad (3)$$

When the SNR is low, however, the performance of the diagonal loading technique degrades.
3 Diagonal Loading Technology Based on Delay Processing

The signal received by the array is delayed by a time τ:

$$x'(t) = x(t-\tau) = A(\varphi)s(t-\tau) + n(t-\tau) \quad (4)$$

When the delay τ is far less than the reciprocal of the bandwidth of the incident signal, formula (4) can be expressed as

$$x'(t) = A(\varphi)\Phi(\tau)s(t) + n(t-\tau) \quad (5)$$

where $\Phi(\tau) = \mathrm{diag}[\varphi_1, \varphi_2, \ldots, \varphi_K]$ and $\varphi_i = \exp(-j w \tau)$, $i = 1, 2, \ldots, K$. Define the delay covariance matrix

$$R_\tau = E[x(t)x'^H(t)] = A(\varphi)R_s\Phi^H(\tau)A^H(\varphi) + E[n(t)n^H(t-\tau)] + E[n(t)s^H(t)]\Phi^H(\tau)A^H(\varphi) + A(\varphi)E[s(t)n^H(t-\tau)] \quad (6)$$
Since n(t) is temporally uncorrelated noise, $E[n(t)n^H(t-\tau)] = 0$, and the signal is uncorrelated with the noise, so the following holds:

$$R_\tau = \Phi^H(\tau)A(\varphi)R_s A^H(\varphi) \quad (7)$$

According to the properties of Hermitian matrices, $A(\varphi)R_s A^H(\varphi)$ can be written as

$$A(\varphi)R_s A^H(\varphi) = U P U^H \quad (8)$$

where U is the corresponding eigenvector matrix, and $p_1 \ge \cdots \ge p_k \ge p_{k+1} = \cdots = p_M = 0$ are the eigenvalues forming $P = \mathrm{diag}[p_1, \ldots, p_k, 0, \ldots, 0]$. The squared delay covariance matrix is then calculated as

$$\tilde{R}_\tau = R_\tau R_\tau^H = U'\, \mathrm{diag}\!\left[p_1^2, \ldots, p_k^2, 0, \ldots, 0\right] U'^H \quad (9)$$

where $U' = \Phi^H(\tau)U$. The eigenvalues of the original signal covariance matrix can be estimated by eigen-decomposition of formula (9), taking the square roots of the obtained eigenvalues; the eigenvalues thus obtained can then be diagonally loaded according to formula (3).
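The noise-suppression property that the delay covariance matrix relies on — $E[n(t)n^H(t-\tau)] \approx 0$ for white noise, while the signal term keeps its power — can be checked numerically with a short pure-Python sketch. The narrowband signal, noise level, and delay below are illustrative assumptions, not values from the paper.

```python
import cmath
import random

def delay_correlation(x, tau):
    """Average of x[t] * conj(x[t - tau]): one entry of the delay
    covariance matrix in Eq. (6), for a single-sensor signal."""
    n = len(x)
    return sum(x[t] * x[t - tau].conjugate() for t in range(tau, n)) / (n - tau)

rng = random.Random(0)
T, tau, w = 20000, 3, 0.7
# narrowband unit-power signal plus white complex Gaussian noise
sig = [cmath.exp(1j * (w * t + 0.5)) for t in range(T)]
noise = [complex(rng.gauss(0, 0.7), rng.gauss(0, 0.7)) for _ in range(T)]
x = [s + n for s, n in zip(sig, noise)]

r0 = delay_correlation(x, 0)      # zero-lag covariance: signal + noise power
rtau = delay_correlation(x, tau)  # delay covariance: the noise term averages out
```

Here |r0| is close to the total power (signal plus noise), while |rtau| stays near the signal power alone, which is exactly the noise suppression the delayed covariance matrix exploits.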
4 Simulation Analysis

In the experiment, a uniform linear array of ten elements is used, and the number of snapshots is 100. One desired signal and one interference signal are incident on the array at 0° and −40°, respectively. The performance of the proposed algorithm and of the DL beamforming algorithm under few snapshots and low SNR is shown in Figs. 1 and 2. It can be seen that the proposed algorithm forms a deeper null under these conditions, which shows that it suppresses interference and noise effectively.
Fig. 1 Comparison of performance between the proposed and DL algorithm (SNR = −30)
Fig. 2 Comparison of performance between the proposed and DL algorithm (SNR = −15)
References

1. Wang X, Amin M, Cao X (2018) Analysis and design of optimum sparse array configurations for adaptive beamforming. IEEE Trans Signal Process 66:340–351
2. Lin T, Zhu Y (2018) Beamforming design for large-scale antenna arrays using deep learning. IEEE Wirel Commun Lett 9:103–107
3. Huang J, Su H, Yang Y (2020) Robust adaptive beamforming for MIMO radar in the presence of covariance matrix estimation error and desired signal steering vector mismatch. IET Radar Sonar Navig 14:118–126
4. Jing G (2016) Robust beamforming based on variable loading. Electron Lett 41:55–56
5. Jie Z, Qian Y, Qiushi T et al (2016) Low complexity variable loading for robust adaptive beamforming. Electron Lett 52:338–340
Mask Detection Algorithm for Public Places Entering Management During COVID-19 Epidemic Situation Yihan Yun1 , Liang Ye1,2(B) , and Chenguang He1,3 1 Harbin Institute of Technology, Harbin 150080, China
[email protected]
2 University of Oulu, 90014 Oulu, Finland 3 Key Laboratory of Police Wireless Digital Communication, Ministry of Public Security,
People’s Republic of China, Harbin 150080, China
Abstract. COVID-19 is now spreading fast all over the world. Wearing masks has proven effective in preventing COVID-19 to some extent. This paper studies a mask detection algorithm for entrance management of public places during the COVID-19 epidemic: only people wearing masks are allowed to enter. Cameras are fixed at the entrances of public places and take photos of people who are coming in. A series of pre-processing steps is then performed, including face detection, normalization, etc. A residual network is used as the classifier. Simulation results show that the average recognition accuracy reaches 90%.

Keywords: Mask detection · Image processing · Face detection · ResNet
1 Introduction

Right now, COVID-19 (Corona Virus Disease 2019) [1] is a serious threat to people and is spreading very fast all over the world. By May 2020, over 3.3 million people had been infected, and 7% of them had died. Therefore, prevention of COVID-19 is the responsibility of all human beings. Wearing masks has proven to be an effective way to prevent the virus from spreading, and people in China must now wear a mask to enter public places such as supermarkets and banks. Supervising mask wearing costs a lot of human resources. Therefore, this paper proposes an automatic mask detection method for entrance management of public places, which detects whether a person is wearing a mask automatically.
2 Image Pre-Processing Procedure Image pre-processing includes RGB image graying, face detection and location, normalization, and histogram equalization.
2.1 RGB Image Graying Image graying makes R = G = B (R = Red, G = Green, and B = Blue) for each pixel in the image. Figure 1 shows an example of image graying.
Fig. 1 RGB image graying
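The graying step can be sketched in pure Python. The paper only states that R = G = B after graying, so the standard BT.601 luma weights used below are an assumption about the conversion rule.

```python
def to_gray(rgb_image):
    """Convert an RGB image (nested lists of (r, g, b) tuples) to gray
    levels using the BT.601 luma weights; after this step every pixel
    satisfies R = G = B = gray."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb_image]
```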
2.2 Face Detection and Location

Face detection detects whether an image contains human faces, and face location finds the position of the face in the image. Commonly used features for face detection and location include Haar features, color features, structure features, histogram features, etc. The Adaboost algorithm is used for face detection and location. The binary method and the grid method can also be used, but they have high requirements for the original image, and their location results are prone to large deviation. A better face detection and location method is the Cascade Object Detector algorithm [2], which is able to eliminate the influence of irrelevant information such as the background environment. Figures 2 and 3 give two examples.
Fig. 2 No human face is detected: (a) input; (b) output
Figure 2 has no human face in it, so the output of the algorithm is the same as the input. For an image with a human face, the Cascade Object Detector shows the detection box on the original image and outputs the detected face, as shown in Fig. 3.
Fig. 3 Human face detection: (a) gray image; (b) face detection; (c) detected face
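The Haar features mentioned above are rectangle sums computed efficiently from an integral image. The minimal sketch below illustrates only the feature computation, not the trained Adaboost classifier or the Cascade Object Detector itself.

```python
def integral_image(img):
    """Summed-area table with a zero border: ii[y][x] is the sum of img
    over all rows < y and columns < x."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, y, x, h, w):
    """Sum of the h-by-w rectangle with top-left corner (y, x), in O(1)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def haar_two_rect(ii, y, x, h, w):
    """Two-rectangle Haar-like feature: left half minus right half (w even)."""
    half = w // 2
    return rect_sum(ii, y, x, h, half) - rect_sum(ii, y, x + half, h, half)
```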
2.3 Normalization

Normalization is commonly used in pattern recognition and computer vision [3, 4]. Image normalization converts original images into a standard form via a series of transformations; the standard-form image has invariable characteristics under translation, rotation, scaling, and other transformations, and all dimensions of the data lie in the same range.

Scale Normalization

Scale normalization transforms the located face area into a uniform size, which is conducive to subsequent expression feature extraction. Because the distances between the photographed people and the camera differ, the sizes of the detected face areas also differ. They are normalized to the same face-area size, as shown in Fig. 4.
Fig. 4 Two images for face detection: (a) image x1; (b) image x2
As shown in Fig. 5, x1 and x2 are two original images of different sizes for face detection, and bimg1 and bimg2 are the detected face areas, whose sizes also differ. By scale normalization, we get two face images l1 and l2 of the same size (145 × 145 pixels), which is large enough for extracting image features.
Fig. 5 Image normalization results
Gray Normalization

Gray normalization normalizes the gray values of an image to [0, 1]. It can effectively highlight the details of the image and weaken the interference of light intensity. Figure 6 gives an example of gray normalization.
Fig. 6 Gray normalization
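The two normalization steps can be sketched in pure Python. Nearest-neighbour resizing and min–max scaling are assumptions here; the paper does not state which interpolation or scaling rule is used.

```python
def resize_nearest(img, out_h=145, out_w=145):
    """Nearest-neighbour resize of a 2-D gray image to a fixed size
    (145 x 145 in the scale-normalization step)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[y * in_h // out_h][x * in_w // out_w]
             for x in range(out_w)] for y in range(out_h)]

def normalize_gray(img):
    """Min-max normalization of the gray values to the range [0, 1]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1            # guard against a constant image
    return [[(v - lo) / span for v in row] for row in img]
```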
2.4 Histogram Equalization Histogram equalization can make the gray distribution of the image more uniform, thus increasing the contrast and making the details of the image clear. Meanwhile, this operation can weaken the influence of light and darkness to a large extent. Figure 7 gives an example of histogram equalization.
Fig. 7 Histogram equalization
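Histogram equalization can be sketched via the cumulative distribution function. This is a standard textbook formulation, assumed rather than taken from the paper.

```python
def equalize_hist(img, levels=256):
    """Histogram equalization of an 8-bit gray image: map each gray level
    through the normalized cumulative distribution function so the output
    levels spread over the full range, raising the contrast."""
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    n = len(img) * len(img[0])
    cdf_min = next(c for c in cdf if c > 0)
    lut = [round((c - cdf_min) / max(n - cdf_min, 1) * (levels - 1)) for c in cdf]
    return [[lut[v] for v in row] for row in img]
```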
As shown in Fig. 7, after histogram equalization, the image contrast becomes larger and the details are clearer.
3 Classification Model

3.1 Residual Network

A residual network (ResNet) [5–7] introduces a feed-forward shortcut connection between input and output, i.e., an identity mapping. Traditional neural networks learn the input–output mapping H(x) directly, whereas ResNet learns the residual F(x) = H(x) − x. By learning the difference rather than the mapping itself, ResNet avoids fitting x to H(x) directly, and the residual is easier to push toward 0. This paper uses the convolutional neural network ResNet-50 as the basic structure. As shown in Fig. 8, stage 5 and the layers before it are fixed; an AVG POOL (average pooling) layer follows stage 5, and an FC (fully connected) layer is added.
Fig. 8 Classification model
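The identity-shortcut idea can be illustrated with a toy residual block. This is a conceptual sketch of the shortcut arithmetic only, not the actual ResNet-50 layers.

```python
def residual_block(x, residual_fn):
    """Identity-shortcut residual block: output = x + F(x).  ResNet learns
    the residual F(x) = H(x) - x; when F is pushed toward zero, the block
    reduces to the identity mapping, which keeps very deep stacks trainable."""
    return [xi + fi for xi, fi in zip(x, residual_fn(x))]

# with a zero residual, the block is exactly the identity mapping
identity_out = residual_block([1.0, 2.0, 3.0], lambda x: [0.0] * len(x))
```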
3.2 Mask Detection

The authors collected 100 images of people wearing masks and 100 images of people not wearing masks. All the images were divided evenly into five groups, of which four were used as the training set and one as the testing set. Fivefold cross-validation was used, and Table 1 gives the average classification accuracy. We can see that the proposed method detects mask wearing at a high accuracy.

Table 1 Classification accuracy

Actual \ Detected as    Wearing a mask (%)  No mask (%)
Wearing a mask          90                  10
No mask                 10                  90
4 Conclusion

This paper proposes a mask detection method for entrance management of public places during the COVID-19 epidemic. A camera fixed at the entrance of the public place takes photos of people; image pre-processing and face detection are performed, and ResNet is used for classification. The experimental results show that the proposed method detects mask wearing with an average accuracy of 90%.

Acknowledgements. This paper was supported by the National Key R&D Program of China (No. 2018YFC0807101).
References
1. Bai L, Wang M, Tang X et al (2020) Considerations on novel coronavirus pneumonia in the diagnosis and treatment. West China Med J 35(02):125–131
2. Liu X, Cai F, Wang Z (2015) Research on face detection and location algorithm based on MATLAB. J Tonghua Normal Univ 36(12):11–13
3. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
4. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Indust Inf 16(8):5379–5388
5. Fu L, Ren D, Hu Y et al (2020) Defect detection method of medical plastic bottle manufacturing based on ResNet network. Comput Modern 04:104–108
6. Chang C (2019) Research on image classification of human protein profile based on ResNet depth network. J Med Inform 40(07):45–49
7. He W, Deng J, Liu A et al (2020) Research on the access control system of face recognition magazine based on MTCNN and ResNet. Microcontroll Embedded Syst 20(04):51–54
Campus Bullying Detection Algorithm Based on Audio Tong Liu1,2 , Liang Ye1,3(B) , Tian Han2,3 , Tapio Seppänen3 , and Esko Alasaarela3 1 Harbin Institute of Technology, Harbin 150080, China
[email protected]
2 Harbin University of Science and Technology, Harbin 150080, China 3 University of Oulu, 90014 Oulu, Finland
Abstract. With the continuous breakthroughs in various technologies, voice recognition has become a research hotspot. Detecting whether bullying-related emotion is contained in speech offers a way to detect campus bullying in time. This paper builds a convolutional neural network model to recognize speech emotions. Firstly, the audio data are pre-processed; then MFCC feature parameters are extracted from the pre-processed audio; finally, a classification algorithm is designed. This paper selects the CASIA database, which has a total of 300 voice clips covering six emotions: angry, scared, happy, neutral, sad, and surprised. Using fivefold cross-validation to test the performance of the model, the accuracy of the classification algorithm is 68.51%. Finally, the classification algorithm performs emotion recognition on a test sample taken from a campus bullying movie section. The section expresses the "fear" emotion, and the algorithm likewise judges the audio as "fear," consistent with the actual scene, indicating that the classification algorithm has certain stability and practicability. Keywords: Neural networks · MFCC · Speech emotion recognition
1 Introduction With the continuous breakthroughs in various technologies, voice recognition has become a research hotspot and has been used in various areas [1, 2]. Detecting whether bullying-related emotion is contained in speech also offers a way to detect campus bullying in time. This paper extracts Mel Frequency Cepstrum Coefficient (MFCC) features from voices and uses neural networks to detect possible campus bullying emotions, thus indicating possible bullying events.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_57
2 Speech Signal Pre-processing and Feature Extraction Audio features directly affect the accuracy of recognition. This paper first pre-processes each audio clip and then extracts the MFCC [3] features of the pre-processed audio. 2.1 Voice Signal Pre-processing The duration of the voice signal is fixed at 2 s, and the number of voice sampling points is set to 32,000. Pre-emphasis Firstly, the speech signal is pre-emphasized by a high-pass filter to boost the high-frequency part of the speech, reduce the influence of the fundamental frequency, and reduce the dynamic range of the spectrum. The high-pass filter is given as follows:

H(z) = 1 − μz⁻¹  (1)

The value of μ is between 0.9 and 1.0, usually 0.97. Framing Collect N sampling points into one observation unit (frame). Windowing Multiply each frame by a Hamming window to increase the continuity of the left and right ends of the frame. Suppose the signal after framing is S(n), n = 0, 1, …, N − 1, where N is the frame size. Then apply the Hamming window S′(n) = S(n) × W(n), where W(n) is calculated as follows:

W(n, a) = (1 − a) − a × cos(2πn / (N − 1)), 0 ≤ n ≤ N − 1  (2)

Different values of a produce different Hamming windows; in general, a = 0.46.
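The pre-emphasis, framing, and windowing steps can be sketched in numpy as follows; the frame size and hop are illustrative choices, since the paper fixes only the 2 s / 32,000-sample clip length:

```python
import numpy as np

def preprocess(signal, frame_size=512, hop=256, mu=0.97, a=0.46):
    """Pre-emphasis, framing, and Hamming windowing as in Sect. 2.1.

    frame_size and hop are assumptions for illustration.
    """
    # Pre-emphasis: y[n] = x[n] - mu * x[n-1]  (high-pass H(z) = 1 - mu z^-1)
    emphasized = np.append(signal[0], signal[1:] - mu * signal[:-1])
    # Framing: collect N samples per observation unit, with 50% overlap here.
    n_frames = 1 + (len(emphasized) - frame_size) // hop
    idx = np.arange(frame_size)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx]
    # Windowing: W(n) = (1 - a) - a * cos(2*pi*n / (N - 1)), a = 0.46 (Hamming)
    n = np.arange(frame_size)
    window = (1 - a) - a * np.cos(2 * np.pi * n / (frame_size - 1))
    return frames * window

frames = preprocess(np.random.randn(32000))
print(frames.shape)  # (124, 512)
```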
3 Speech Signal Feature Extraction MFCCs do not depend on the nature of the signal, make no assumptions or restrictions about the input signal, and conform to the auditory characteristics of the human ear. Even when the signal-to-noise ratio decreases, these features still yield good recognition performance. In this paper, the MFCC features are extracted from the pre-processed audio [4, 5]; the process is shown in Fig. 1.
Fig. 1 MFCC extraction process
Fast Fourier Transform The fast Fourier transform is performed on the preprocessed signal to obtain the spectrum of each frame, and the spectrum of the speech
signal is modulo-squared to obtain the power spectrum of the speech signal. The DFT of the voice signal is calculated as follows:

X_a(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πnk/N}, 0 ≤ k ≤ N  (3)
where x(n) is the input voice signal, and N represents the number of Fourier transform points. Triangle bandpass filter The energy spectrum passes through a set of Mel-scale triangular filter banks to smooth the spectrum and eliminate harmonics, highlighting the formants of the original speech. Discrete Cosine Transform

s(m) = ln( Σ_{k=0}^{N−1} |X_a(k)|² H_m(k) ), 0 ≤ m ≤ M  (4)

C(n) = Σ_{m=0}^{M−1} s(m) cos(πn(m − 0.5)/M), n = 1, 2, …, L  (5)
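Equations (3)–(5) chain into the MFCC pipeline of Fig. 1; the sketch below uses numpy/scipy under assumed parameters (16 kHz sampling rate, 26 mel filters, 13 coefficients), since the paper does not list its exact settings:

```python
import numpy as np
from scipy.fft import dct

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale (a standard design;
    the paper does not give its exact filterbank parameters)."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(mel(0), mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * inv_mel(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fb[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising edge
        fb[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling edge
    return fb

def mfcc(frames, sr=16000, n_fft=512, n_filters=26, n_ceps=13):
    """Eqs. (3)-(5): FFT power spectrum -> mel filterbank -> log -> DCT."""
    spectrum = np.abs(np.fft.rfft(frames, n_fft)) ** 2               # |X_a(k)|^2
    energies = spectrum @ mel_filterbank(n_filters, n_fft, sr).T
    s = np.log(energies + 1e-10)                                     # s(m), Eq. (4)
    return dct(s, type=2, axis=1, norm='ortho')[:, :n_ceps]          # C(n), Eq. (5)

feats = mfcc(np.random.randn(124, 512))
print(feats.shape)  # (124, 13)
```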
4 Classifier Design This paper selects the CASIA database, which has a total of 300 audio clips covering six emotions: angry, scared, happy, neutral, sad, and surprised. 4.1 Classification Model This paper builds a neural network with six convolution layers and two pooling layers to classify speech emotions. The classification algorithm recognizes six emotions in total, so the output layer of the neural network has six neurons [6, 7]; its model structure is shown in Fig. 2. 4.2 Training Results The classifier is trained for 200 rounds, and the training results are shown in Fig. 3. Figure 3 shows that the model begins to overfit after about 60 rounds; at this point the training accuracy reaches 70% and the validation accuracy approaches 50%, so training can be stopped and the model saved. 4.3 Performance Testing The authors use fivefold cross-validation to evaluate the performance of the classification algorithm. They classify the "fear" and "angry" emotions as bullying emotions, and the rest as non-bullying emotions. The classification results are given in Table 1.
Fig. 2 Classification model structure
Fig. 3 Model training results
Table 1 Classification results (unit: %)

|                       | Bullying (classified) | Non-bullying (classified) |
| Bullying (actual)     | 67.26                 | 32.74                     |
| Non-bullying (actual) | 23.94                 | 76.06                     |
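The bullying/non-bullying grouping of Sect. 4.3 can be sketched as follows; the labels and predictions here are synthetic and only illustrate how a 2 × 2 confusion matrix like Table 1 would be computed:

```python
import numpy as np

# "fear" and "angry" count as bullying; the other four emotions do not.
EMOTIONS = ["angry", "fear", "happy", "neutral", "sad", "surprised"]
BULLYING = {"angry", "fear"}

def to_binary(labels):
    return np.array([1 if e in BULLYING else 0 for e in labels])

def confusion_percent(actual, predicted):
    """2x2 confusion matrix, each row normalized to percentages."""
    cm = np.zeros((2, 2))
    for a, p in zip(actual, predicted):
        cm[a, p] += 1
    return 100.0 * cm / cm.sum(axis=1, keepdims=True)

actual    = to_binary(["fear", "angry", "happy", "sad", "fear", "neutral"])
predicted = to_binary(["fear", "happy", "happy", "sad", "angry", "angry"])
print(confusion_percent(actual, predicted))
```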
5 Classify Campus Bullying Sample The speech emotion recognition model trained in this paper is intended for bullying emotion recognition, so the authors test it with a campus bullying sample. The selected sample comes from a campus violence movie: the corresponding scene is a campus violence plot, and the audio expresses the fear emotion. The audio sample is input into the speech emotion recognition model, and the recognition results are given in Table 2.
Table 2 shows that the speech emotion classification model in this paper judges the emotion of the speech sample as "fear," which is in line with the movie scenario, indicating that the classification algorithm has certain practicality.

Table 2 Recognition result (probability)

| Emotion     | Angry   | Happy   | Fear    | Neutral | Sad     | Surprised |
| Probability | 1.67e−7 | 5.79e−6 | 9.67e−1 | 9.25e−3 | 1.93e−2 | 4.77e−3   |
6 Conclusion This paper detects campus bullying via speech emotion recognition techniques. The authors choose the CASIA database as the audio samples and extract the MFCC features. Then they build a neural network for classification. According to the experimental results, the proposed method can detect bullying emotions and thus indicate bullying scenarios. Acknowledgements. This research was funded by National Natural Science Foundation of China under grant number 41861134010.
References
1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Indust Inf 16(8):5379–5388
3. Zhang J (2009) Speech signal preprocessing and feature extraction technology. Comput Knowl Technol 5(22):6280–6288
4. Feng YL, Huo YM, Jiang F (2019) Research on feature extraction algorithm of MFCC voice signal based on EMD. Electron World 6(8):23–25
5. Tu SS, Yu FQ (2012) Speech emotion recognition based on improved MFCC of EMD. Comput Eng Appl 48(18):119–121
6. Tang Z, Li L (2017) Collaborative joint training with multitask recurrent model for speech and speaker recognition. ACM, China, pp 493–504
7. Larochelle H, Mandel M (2012) Learning algorithms for the classification restricted Boltzmann machine. J Mach Learn Res 6(13):643–669
An End-to-End Multispectral Image Compression Network Based on Weighted Channels Shunmin Zhao(B) , Fanqiang Kong, Yongbo Zhou, and Kedi Hu College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China [email protected]
Abstract. For the spectral correlation of multispectral images, a novel compression framework, named an end-to-end multispectral image compression network based on weighted channels, is proposed. The framework consists of a forward coding network, a quantizer and an inverse decoding network. The multispectral images are fed into the forward coding network, which is composed of the residual block based on weighted channels, to obtain intermediate features based on weighted channels. The intermediate features are quantized and encoded by the quantizer and the entropy encoder to obtain the compressed code stream. The compressed code stream passes through the entropy decoder and the inverse decoding network, which is symmetric to the forward coding network, to reconstruct multispectral images. The results validate that the proposed network shows better performance compared to JPEG2000 and JPEG. Keywords: Multispectral image · Compression framework · Deep learning · Weighted channels
1 Introduction Multispectral images contain both the spatial information of targets and rich spectral information, so they can be applied in many areas such as agriculture, medical treatment and military use. However, multispectral images also bring a huge data volume, which severely restricts their applications. Multispectral image compression can compress the images into a small code stream, significantly reducing the volume of data and thereby promoting the applications of multispectral images. Multispectral image compression schemes can be classified into three categories: algorithms based on prediction, algorithms based on vector quantization and algorithms based on various transform encodings [1]. These algorithms all have obvious shortcomings: low compression rates, complex practical implementation, and block artifacts in the reconstructed images.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_58
Recent evidence suggests that deep learning has great potential in image compression [2–4]. For instance, Toderici et al. [2, 3] use recurrent neural networks to compress and reconstruct the images. The network can adjust the compression ratio of the images by controlling the number of iterations. Jiao et al. [4] use convolutional neural networks to reduce artifacts. The results of the aforementioned methods demonstrate that the performance of the compression framework based on deep learning is better than JPEG and JPEG2000 in both objective and subjective evaluations. In this paper, the residual block based on weighted channels is used as the basic component of the compression framework. The forward coding network is used to extract the features based on weighted channels. The extracted features are quantized and encoded by the quantizer and the entropy encoder. The inverse decoding network is used to restructure multispectral images. The results validate that the proposed network shows better performance compared to JPEG2000 and JPEG.
2 Methods In this section, we describe the basic components of the network, which consists of the forward coding network, the quantizer, the lossless entropy encoder/decoder and the inverse decoding network. The structure of the overall network is shown in Fig. 1.
Fig. 1 Overall structure of the compression network
2.1 Forward Coding Network The detailed structure of the forward coding network is shown in Fig. 2. Weighted net means the weighted residual block, as shown in Fig. 3, where residual denotes the residual block and scale means that the input is multiplied by the corresponding channel weights. This can be formulated as

x̃_c = F_sc(u_c, s_c) = s_c × u_c  (1)

where x̃_c represents the weighted features, F_sc(·) represents the multiplication, s_c represents the output of the sigmoid function in Fig. 3, i.e., the weights of the different channels, and u_c denotes the features extracted by the residual block.
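Equation (1) is an elementwise per-channel scaling; a numpy sketch under assumed shapes (channels-first features, one sigmoid weight per channel) is:

```python
import numpy as np

def weight_channels(u, s):
    """Eq. (1): x~_c = s_c * u_c — scale each feature channel by its weight.

    `u` holds residual-block features with shape (C, H, W); `s` holds the
    per-channel sigmoid outputs with shape (C,). Shapes are illustrative.
    """
    return s[:, None, None] * u   # broadcast one weight over each H x W map

u = np.ones((4, 8, 8))                                      # 4 channels of 8x8 features
s = 1.0 / (1.0 + np.exp(-np.array([-2.0, 0.0, 2.0, 6.0])))  # sigmoid weights
x = weight_channels(u, s)
print(x[:, 0, 0])   # each channel uniformly scaled by its own weight
```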
Fig. 2 Forward coding network architecture
Fig. 3 The architecture of the weighted channels residual block
2.2 Quantizer According to [2], the derivative of the round function is zero almost everywhere. Therefore, a relaxation operation needs to be adopted, which can be formulated as:

X_Q = Round[(2^Q − 1) × Sigmoid(X_E)]  (2)

where X_Q is the quantized feature data, X_E is the unquantized intermediate feature data and Q is the quantization level. The round function rounds the data during forward propagation; during backward propagation the quantization layer is skipped and the gradient is passed directly to the previous layer. 2.3 Entropy Coding The quantized features are subjected to lossless entropy coding (using ZPAQ [5], a high-compression-ratio lossless compression standard) to remove statistical redundancy and obtain the compressed code stream. ZPAQ compresses the quantized features to generate the code stream; entropy decoding then recovers the code stream, and the inverse-quantized feature data X_Q/(2^Q − 1) is input to the inverse decoding network. 2.4 Inverse Decoding Network The inverse decoding network is symmetric to the forward coding network and its structure is shown in Fig. 4. PixelShuffle [6] is used as the upsampling operation to recover the size of the intermediate features. The kernel size of all convolutional layers in the overall network is 3 × 3.
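Equation (2) and the inverse quantization X_Q/(2^Q − 1) form a round trip whose error is bounded by half a quantization step. A numpy sketch (the straight-through gradient trick is omitted, since plain numpy has no autograd):

```python
import numpy as np

def quantize(x_e, q=8):
    """Eq. (2): X_Q = Round[(2^Q - 1) * Sigmoid(X_E)]."""
    return np.round((2 ** q - 1) / (1.0 + np.exp(-x_e)))

def dequantize(x_q, q=8):
    """Inverse mapping fed to the decoder: X_Q / (2^Q - 1)."""
    return x_q / (2 ** q - 1)

# Round trip: the quantization error is at most half a step of 1/(2^Q - 1).
x_e = np.random.randn(1000)
recovered = dequantize(quantize(x_e))
err = np.abs(recovered - 1.0 / (1.0 + np.exp(-x_e)))
print(err.max() <= 0.5 / (2 ** 8 - 1))  # True
```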
Fig. 4 The architecture of the inverse coding network
3 Experiment 3.1 Data and Training Our dataset is derived from multispectral images of the Landsat 8 satellite, which contain seven bands. We select about 80,000 feature-rich images with a size of 128 × 128 for training and 24 images with a size of 512 × 512 for testing. The training set and test set are independent of each other. We use the Adam optimizer [7] to train the model with an initial learning rate of 0.0001, and we follow the step-down learning-rate schedule mentioned in [8] to make the network converge faster. 3.2 Evaluation of Results To evaluate the performance of the compression network, we compare our algorithm with JPEG and JPEG2000 at eight different bit rates. Figure 5 shows the average PSNR of the test set at different bit rates: our algorithm outperforms JPEG and JPEG2000 at all bit rates.
Fig. 5 Average PSNR of test data set at different bit rates
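PSNR, the metric plotted in Fig. 5, can be computed as follows; the 8-bit peak value is an assumption for illustration, since Landsat 8 bands may use a different bit depth:

```python
import numpy as np

def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` assumes 8-bit data."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.full((512, 512), 100.0)
noisy = ref + 1.0            # constant error of 1 gray level -> MSE = 1
print(round(psnr(ref, noisy), 2))  # 48.13
```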
4 Conclusion In this paper, we propose an end-to-end multispectral image compression network based on weighted channels; the network uses the correlation between channels to enhance its performance. The experimental results, in comparison with JPEG and JPEG2000, show the superior performance of the network.
References
1. Gelli G, Poggi G (1999) Compression of multispectral images by spectral classification and transform coding. IEEE Trans Image Process 8(4):476–489. https://doi.org/10.1109/83.753736
2. Toderici G, O'Malley SM, Hwang SJ et al (2016) Variable rate image compression with recurrent neural networks. In: ICLR 2016. Available from https://arxiv.org/abs/1511.06085. Cited 23 Apr 2020
3. Toderici G, Vincent D, Johnston N et al (2017) Full resolution image compression with recurrent neural networks. CVPR 2017:5435–5443. https://doi.org/10.1109/cvpr.2017.557
4. Jiao SM, Jin Z, Chang CL et al (2018) Compression of phase-only holograms with JPEG standard and deep learning. Appl Sci 8(8):1258. https://doi.org/10.3390/app8081258
5. Mahoney M (2014) The ZPAQ open standard format for highly compressed data - level 2. Available from https://mattmahoney.net/dc/zpaq204.pdf. Cited 23 Apr 2020
6. Shi WZ, Caballero J, Huszar F et al (2016) Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. CVPR 2016:1874–1883. https://doi.org/10.1109/cvpr.2016.207
7. Kingma D, Ba J (2017) Adam: a method for stochastic optimization. Available from https://arxiv.org/abs/1412.6980. Cited 23 Apr 2020
8. Li M, Zuo W, Gu S et al (2017) Learning convolutional networks for content weighted image compression. CVPR 2017:3214–3223
Multispectral Image Compression Based on Multiscale Features Shunmin Zhao(B) , Fanqiang Kong, Kedi Hu, and Yuxin Meng College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China [email protected]
Abstract. We propose a compression framework based on multiscale features using deep learning technique for complex features of multispectral image. The multiscale feature extraction module is taken as the basic block of the framework to constitute the encoder and the decoder. The encoder is used to extract the dominant features of the input multispectral images and the decoder, which is symmetrical to the encoder, is used to reconstruct multispectral images. The balance of the bit rates and the distortion is implemented by the rate-distortion optimizer in loss function. The encoder and the decoder are trained jointly based on minimizing the mean squared error (MSE). Our proposed framework is compared with JPEG2000 in terms of PSNR and the experimental results validate that our proposed framework outperforms JPEG2000. Keywords: Multiscale features · Multiscale image compression · Deep learning · Rate-distortion optimizer
1 Introduction Owing to their great data volume, multispectral image compression has always been a challenging task. Recently, deep learning techniques have shown great potential in image compression [1–3]: end-to-end learning algorithms train networks to compress images, or neural networks improve the quality of the reconstructed images. The results show that neural networks trained with deep learning techniques outperform conventional compression standards. Multispectral images have a data structure similar to that of visible light images, apart from the larger data volume, so it is possible to compress multispectral images with deep learning techniques. In this paper, we propose a compression framework based on multiscale features to compress multispectral images. The framework consists of an encoder, a quantizer and a decoder. The encoder, which is composed of multiscale feature extraction modules with skip connections, encodes the input images into intermediate features. After passing through the quantizer, the intermediate features are reconstructed into images by the decoder, whose structure is similar to the encoder's. The gradient of the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_59
quantizer is zero, so the quantizer cannot be directly added to the training network. We adopt relaxation operations so that the quantizer can be included in the network during training. We use ZPAQ [4] as the entropy codec to further remove statistical redundancy. PSNR is used to measure the quality of the reconstructed images, and we compare our compression framework with the traditional compression standard JPEG2000.
2 Methods We introduce the details of the framework in this section, including the multiscale feature extraction module, the encoder, the decoder and the quantizer. 2.1 Multiscale Feature Extraction Module The multiscale feature extraction module, named the multiscale block, extracts the multiscale features of the input images. From GoogLeNet [5], we know that multiple convolution kernels can achieve multiscale feature extraction. Here, we also adopt multiple convolution kernels to extract multiscale features; the detailed structure is shown in Fig. 1. We use the softmax function to assign normalized weights to the feature maps obtained by the multiple convolution kernels. The module can adjust the weights of the feature maps according to the content of the input images and thereby implement dynamic multiscale feature extraction.
Fig. 1 The structure of the multiscale block
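The softmax-weighted fusion of feature maps from several kernel sizes can be sketched in numpy as follows; uniform (box) filters and fixed logits stand in for the learned convolution kernels and scale scores, which are assumptions for illustration:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def multiscale_block(x, sizes=(1, 3, 5), logits=(0.2, 0.5, 0.3)):
    """Softmax-weighted fusion of feature maps at several scales.

    In the real module, convolutions with different kernel sizes produce the
    maps and a learned branch produces the logits; both are fixed here.
    """
    w = np.exp(logits) / np.sum(np.exp(logits))        # softmax-normalized weights
    maps = [uniform_filter(x, size=s) for s in sizes]  # one map per kernel size
    return sum(wi * m for wi, m in zip(w, maps))       # weighted fusion

x = np.random.randn(32, 32)
y = multiscale_block(x)
print(y.shape)  # (32, 32)
```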
2.2 Encoder and Decoder The detailed structure of the encoder and the decoder is shown in Fig. 2, in which 'Multiscale block' denotes the proposed module. Conv denotes 3 × 3 convolution. ReLU is used as the nonlinear activation layer. Pixelshuffle [6] is used as the
upsampling operation, and the downsampling operation is symmetrical to it. The sigmoid function in the encoder limits the output intermediate features to the range of 0–1.
Fig. 2 The structure of encoder and decoder
2.3 Quantizer The quantizer, which is located at the end of the encoder, quantizes the intermediate features produced by the encoder. This can be expressed as follows:

X_Q = [X_E × (2^Q − 1)]  (1)

where X_E denotes the intermediate features, Q is the desired bit depth (8 in this framework), and [·] denotes the round function. The quantizer rounds the intermediate features during forward propagation. So that the framework can still be trained, the gradient of the quantization layer is set to 1 during backward propagation. 2.4 Rate-Distortion Optimizer The loss function is the weighted sum of the distortion and the bit rate. The distortion and the bit rate are estimated by the mean squared error (MSE) and an approximate calculation of the entropy, respectively:

L = L_D + λL_R  (2)

L_R = −E[log₂ P_d(x)]  (3)

P_d(x) = ∫_{x−1/2}^{x+1/2} P_c(t) dt  (4)

where P_c represents the probability density of the corresponding pixel values. We adopt spline interpolation to obtain the approximate entropy L_R.
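The rate term L_R measures the entropy of the quantized features. A histogram-based numpy stand-in for the paper's spline-interpolated estimate (the symbol distribution here is synthetic):

```python
import numpy as np

def rate_estimate(x_q):
    """Empirical entropy in bits per symbol of a quantized feature array."""
    _, counts = np.unique(x_q, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

uniform = np.tile(np.arange(16), 64)        # 16 symbols, equally likely
peaked = np.zeros(1024)
peaked[:8] = 1                              # almost all mass on one symbol
print(rate_estimate(uniform))               # 4.0 bits
print(rate_estimate(peaked))                # far below 4 bits
```

A more compact (lower-entropy) feature distribution needs fewer bits after entropy coding, which is exactly what minimizing L_R encourages.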
3 Experiment 3.1 Data and Training Our datasets contain about 80,000 images with a size of 128 × 128 for training and 24 images with a size of 512 × 512 for testing. The datasets come from the Landsat 8 satellite, which provides seven multispectral bands. Test images are not included in the training images. 3.2 Parameter Settings We use the Adam optimizer to train the framework. The detailed settings of λ and the other parameters are shown in Table 1.

Table 1 Parameter settings

| Parameters           | Values                         |
| λ                    | [1e−4, 1e−5, 3e−5, 5e−5]       |
| Batch size           | 32                             |
| Learning rate        | [1e−4, 1e−5]                   |
| Intermediate feature | h/8 × w/8 × 48, h/8 × w/8 × 36 |
4 Results We compare our compression framework with JPEG2000 in terms of PSNR; the results are shown in Table 2. From Table 2, we find that the PSNR of the proposed algorithm exceeds that of JPEG2000 at all bit rates, which demonstrates that the proposed algorithm outperforms JPEG2000.

Table 2 Average PSNR (dB) of JPEG2000 and the proposed algorithm at different bit rates

| Bit rate | 0.388 | 0.434 | 0.535 |
| JPEG2000 | 43.6  | 44.07 | 44.98 |
| Proposed | 46.39 | 46.89 | 47.18 |
5 Conclusion We propose a compression framework based on multiscale features to compress multispectral images. We evaluate our framework against JPEG2000, and the experimental results show its superior performance. The quality of the reconstructed images demonstrates the great advantage of the multiscale block.
References
1. Toderici G, O'Malley SM, Hwang SJ et al (2016) Variable rate image compression with recurrent neural networks. In: ICLR 2016. Available from https://arxiv.org/abs/1511.06085. Cited 23 Apr 2020
2. Toderici G, Vincent D, Johnston N et al (2017) Full resolution image compression with recurrent neural networks. CVPR 2017:5435–5443. https://doi.org/10.1109/cvpr.2017.557
3. Rippel O, Bourdev L (2017) Real-time adaptive image compression. ICML 2017(70):2922–2930
4. Mahoney M (2014) The ZPAQ open standard format for highly compressed data - level 2. Available from http://mattmahoney.net/dc/zpaq204.pdf. Cited 23 Apr 2020
5. Szegedy C, Liu W, Jia Y et al (2015) Going deeper with convolutions. CVPR 2015:1–9. https://doi.org/10.1109/cvpr.2015.7298594
6. Shi WZ, Caballero J, Huszar F et al (2016) Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. CVPR 2016:1874–1883. https://doi.org/10.1109/cvpr.2016.207
Dense Residual Network for Multispectral Image Compression Kedi Hu(B) , Fanqiang Kong, Shunmin Zhao, and Yuxin Meng College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China [email protected]
Abstract. Considering that multispectral images have a mass of complex features, to extract intact feature data, a compression framework for multispectral images based on dense residual network (DRN) is proposed in this paper. The multispectral images are first fed into the encoder, and residual dense block in it can read all the information learned from the preceding block via a contiguous memory mechanism, then preserves the features adaptively. Then the data is compressed by down-sampling and converted to bit stream by quantization and entropy encoding. With that, global dense features can be extracted completely from the original image. Additionally, rate distortion optimization is used to make the data more compact. Then, we reconstruct the image via entropy decoding, inverse quantization, up-sampling, and deconvolution. The experimental result shows that the proposed method outperforms JPEG2000 at the same bit rate. Keywords: Multispectral image compression · Deep learning · Dense residual network · Rate distortion optimization
1 Introduction Multispectral images contain a wealth of information both spatially and spectrally. Thus, after processing they can be widely used in many fields such as the military, agriculture, and industry. However, the high dimensionality of the features makes the data volume huge and normally leads to transmission and storage problems. Therefore, it is necessary to seek a high-performance multispectral image compression method. Unlike visible images, multispectral images have both spatial correlations, between adjacent pixels within each band, and spectral correlations, between pixels at the same spatial position in adjacent bands. As a result, the multispectral image compression method we seek should be able to effectively remove not only the spatial correlations but the spectral correlations as well. Recently, more and more research shows that deep learning has great potential for image compression. In view of the characteristics of multispectral images, we propose
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_60
an efficient compression method based on a dense residual network, which can effectively compress images in the feature domain. The network is divided into two parts. At the encoding end, the spectral and spatial features of the image are fully extracted from the input data via convolution and residual dense blocks (RDB), and down-sampling then reduces the size of the data. Rate distortion optimization makes the data distribution more compact. The intermediate feature data is then quantized and encoded into a bit stream by entropy coding. At the decoding end, the bit stream is reconstructed into a multispectral image through the inverse process of entropy decoding, inverse quantization, up-sampling, and deconvolution.
2 Related Work The multispectral image data is a three-dimensional cube with no displacement between spectral bands, that is, a 3D still image. For this reason, multispectral image coding methods can be developed on the basis of two-dimensional still image coding methods. For visible image compression, Toderici et al. [5, 6] used recurrent neural networks to generate intermediate reconstructed images of different quality and controlled the compression ratio by changing the number of iterations. For holographic image compression, Jiao et al. [4] used a convolutional neural network to reduce artifacts and solved the quality degradation caused by the loss of high-frequency features during compression. For light field image compression, Bakir et al. [1] used deep learning at the decoding end to reconstruct the light field image from the sparse images obtained at the coding end. The methods mentioned above are all superior to JPEG and JPEG2000 both subjectively and objectively. As is well known, ResNet [3] alleviates the problem of gradient disappearance and accelerates training by introducing shortcut connections and learning residuals. Compared to ResNet, DenseNet [2] proposes a more radical dense connection mechanism, in which each layer accepts the information from all previous layers as additional input. DenseNet enables feature reuse and reduces the number of parameters. Inspired by the above, we propose a compression structure based on a dense residual network, which combines dense connections with the residual network (Fig. 1).
3 Proposed Method The structure of the end-to-end multispectral image compression network proposed in this paper is shown in Fig. 2. The multispectral image is converted to the representation of feature domain by dense residual network through the forward network. Then, the main spectral and spatial features are extracted, and the size of the feature data is reduced by down-sampling. After initial compression, the data is then further processed via quantization and lossless entropy encoding, and the binary stream is finally obtained. The structure of the decoding end is symmetrical with that of the encoding end. During decoding, the binary stream is first restored to the feature data through entropy decoding. Then the restored data,
Fig. 1 Composition of RDB: (a) residual block, (b) dense block, (c) residual dense block
Fig. 2 Overall flowchart of the compression network
respectively, passes through the inverse quantization layer, deconvolution layer, and up-sampling layer in turn to gradually recover its original dimension and size, finally restoring the image from the feature domain to the pixel domain. In particular, rate-distortion optimization reduces the entropy of the feature data and makes its distribution more compact, resulting in a significant improvement of the network.

3.1 Dense Residual Network

The dense residual network consists of three modules [7]: a shallow feature extraction net (SFENet), residual dense blocks (RDBs), and global feature fusion (GFF). As shown in Fig. 3a, the first two convolution layers extract shallow features of the input image. The RDB integrates ResNet and DenseNet to learn information efficiently while taking full account of all hierarchical features. The RDB allows a direct connection between the previous block and each layer of the current block, resulting in a contiguous memory (CM) mechanism, as shown in Fig. 4.
K. Hu et al.
Fig. 3 Illustration of dense residual network: (a) forward network, (b) backward network
Through CM, the features of all layers are concatenated (concat) along the channel dimension. The 1 × 1 convolution after concatenation is mainly used for feature fusion across multiple channels as well as dimension reduction, which is called local feature fusion. Finally, local residual learning fuses the information of the previous block and the current block and transmits it into the next RDB.
Fig. 4 Architecture of RDBs
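The dense connection, 1 × 1 local feature fusion, and local residual learning described above can be sketched in a few lines (a minimal NumPy illustration; the layer count, channel sizes, and random weights are hypothetical, and the real network uses learned 3 × 3 convolutions rather than pure 1 × 1 channel mixing):

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1x1(x, w):
    # 1x1 convolution: mixes channels independently at each spatial position.
    # x: (C_in, H, W), w: (C_out, C_in) -> returns (C_out, H, W)
    return np.einsum('oc,chw->ohw', w, x)

def rdb_forward(x, layer_ws, fuse_w):
    # One residual dense block (RDB) sketch: every layer receives the
    # concatenation of the block input and all previous layer outputs
    # (dense connection / contiguous memory), a 1x1 convolution then
    # performs local feature fusion, and a local residual is added.
    feats = [x]
    for w in layer_ws:
        inp = np.concatenate(feats, axis=0)           # concat on channel dim
        feats.append(np.maximum(conv1x1(inp, w), 0))  # conv + ReLU
    concat = np.concatenate(feats, axis=0)
    fused = conv1x1(concat, fuse_w)                   # local feature fusion
    return x + fused                                  # local residual learning

C, G, H, W = 8, 4, 6, 6   # channels, growth rate, spatial size (hypothetical)
x = rng.standard_normal((C, H, W))
layer_ws = [0.1 * rng.standard_normal((G, C + i * G)) for i in range(3)]
fuse_w = 0.1 * rng.standard_normal((C, C + 3 * G))
y = rdb_forward(x, layer_ws, fuse_w)
print(y.shape)   # the block preserves the input channel count: (8, 6, 6)
```

Note how each layer's input width grows by the growth rate G, which is why the 1 × 1 fusion is also a dimension reduction back to the block's channel count.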
Global feature fusion and local feature fusion in the RDB are essentially the same operation applied in different places, with one notable difference: global feature fusion has fewer short connections than local feature fusion. In local feature fusion inside an RDB, each layer receives the data of all previous layers and adaptively preserves the features; global feature fusion outside the RDBs simply concatenates the features of each RDB and fuses hierarchical features from all channels.

3.2 Rate Distortion Optimization

To further optimize the performance of the multispectral compression network, a balance must be struck between the bit rate and the image distortion. To ensure the quality of the reconstructed image while keeping the bit rate as low as possible, the loss function used in this paper is:

L = D + λR    (1)
where D is the distortion, represented by the mean squared error, λ is the penalty weight, and R is an approximation of the entropy of the quantized feature data, representing the bit rate. R can be calculated via:

R = −E[log_2 P_q]    (2)

P_q = ∫_{x−1/2}^{x+1/2} P_d(x) dx    (3)
where P_d(x) is the probability density function of the data. To make the reconstructed image closer to the original one, we gradually reduce D through training. Meanwhile, decreasing R makes the distribution of the intermediate features more compact, which effectively improves compression performance.

3.3 Implementation Details

In the proposed network, all convolution kernels are 3 × 3 except for the 1 × 1 convolutions, and the feature map size of each layer is kept fixed by padding 1 on each side of the input. We use pixel-shuffle as the up-sampling operation, which has been shown to perform well in super-resolution.
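The loss of Eq. (1) can be prototyped as follows (a simplified sketch: the histogram-based entropy estimate stands in for the smoothed entropy model of Eqs. (2)-(3), and all variable names are illustrative):

```python
import numpy as np

def rd_loss(original, reconstructed, quantized_features, lam):
    # Rate-distortion loss L = D + lambda * R (Eq. 1).
    # D: mean squared error between original and reconstructed images.
    # R: entropy (in bits) of the quantized features, estimated from
    #    their empirical symbol distribution.
    D = np.mean((original - reconstructed) ** 2)
    _, counts = np.unique(quantized_features, return_counts=True)
    p = counts / counts.sum()
    R = -np.sum(p * np.log2(p))          # bits per feature symbol
    return D + lam * R

rng = np.random.default_rng(1)
img = rng.random((16, 16))
rec = img + 0.01 * rng.standard_normal((16, 16))   # imperfect reconstruction
q = np.round(2 * rng.standard_normal(512))         # integer-quantized features
loss = rd_loss(img, rec, q, lam=0.01)
print(round(loss, 4))
```

Increasing lam trades reconstruction quality for a lower estimated bit rate, which is exactly the balance the rate-distortion term controls.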
4 Results

We evaluate image quality and distortion using PSNR, as shown in Table 1. Compared with JPEG2000, our method obtains better PSNR (dB) at the same bit rate.

Table 1 Average PSNR of two methods at different bit rates

Rate     | 0.563 | 0.406 | 0.341 | 0.249
JPEG2000 | 45.21 | 43.80 | 43.10 | 41.96
Proposed | 47.98 | 46.97 | 47.41 | 46.05
As observed, our method is on average 2-4 dB better than JPEG2000, and the advantage becomes more obvious as the bit rate decreases. In particular, unlike JPEG2000, our method performs stably at low bit rates, and its PSNR does not decline as quickly.
5 Conclusion

We proposed a compression framework for multispectral images using a dense residual network, which makes full use of all hierarchical features of the original image. Experiments show that our method is superior to JPEG2000, indicating that research on RDN-based compression has good prospects.
References

1. Bakir N, Hamidouche W, Déforges O et al (2018) Light field image compression based on convolutional neural networks and linear approximation. In: ICIP 2018, pp 1128–1132
2. Huang G, Liu Z, Laurens VD et al (2017) Densely connected convolutional networks. In: IEEE CVPR 2017. https://doi.org/10.1109/cvpr.2017.243
3. He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: IEEE CVPR 2016. https://doi.org/10.1109/cvpr.2016.90
4. Jiao S, Jin Z, Chang C et al (2018) Compression of phase-only holograms with JPEG standard and deep learning. Appl Sci 8(8):1258
5. Toderici G, O'Malley SM, Hwang SJ et al (2016) Variable rate image compression with recurrent neural networks. In: ICLR 2016. CoRR abs/1511.06085
6. Toderici G, Vincent D, Johnston N et al (2017) Full resolution image compression with recurrent neural networks. In: IEEE CVPR 2017, pp 5435–5443. https://doi.org/10.1109/cvpr.2017.577
7. Zhang Y, Tian Y, Kong Y et al (2018) Residual dense network for image super-resolution. In: IEEE CVPR 2018. https://doi.org/10.1109/cvpr.2018.00262
Hyperspectral Unmixing Method Based on the Non-convex Sparse and Spatial Correlation Constraints

Mengyue Chen(B), Fanqiang Kong, Shunmin Zhao, and Keyao Wen

College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
[email protected]
Abstract. Aiming at the row sparsity of the abundance matrix of hyperspectral mixed pixels and the spatial correlation between neighboring pixels, a hyperspectral unmixing method based on non-convex sparse representation and spatial correlation constraints is presented. The non-convex sparse representation and spatial correlation model is first constructed, which uses the non-convex p-norm of the abundance matrix to exploit row sparsity and total variation regularization to exploit the spatial information of hyperspectral images. A non-negativity constraint is then added to the model to reflect the physical properties of hyperspectral images. An iterative optimization algorithm for the unmixing model is developed using the alternating direction method of multipliers. Experimental results illustrate that the proposed approach achieves better unmixing accuracy than other unmixing methods. Keywords: Hyperspectral image · Sparse unmixing · Sparse representation · Total variation · Non-convex
1 Introduction

Hyperspectral images, which contain abundant spectral and spatial information, are widely applied in various fields. In real scenarios, mixed pixels are frequently encountered due to the limited performance of optical devices or because multiple substances exist inside the same pixel [1]. The hyperspectral unmixing technique, which decomposes mixed pixels to identify the pure components (endmembers) and to estimate the proportion of each component (fractional abundances), has been put forward to deal with this problem. Unmixing methods based on geometry and statistics are generally used with the linear unmixing model. In recent years, sparse regression techniques have been added to the hyperspectral unmixing model, forming sparse unmixing methods [2-4]. Sparse unmixing methods mainly include convex optimization algorithms and greedy algorithms. Convex optimization methods need to solve the l1-norm regression problem and
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_61
usually use the ADMM method to make the solution process more efficient. Greedy algorithms struggle to find the best subset of endmembers from the spectral library when mixed pixels involve numerous endmembers. Recently, research on non-convex optimization algorithms [5, 6] has shown that the non-convex lp-norm may achieve better unmixing results. In this paper, to improve hyperspectral unmixing accuracy, total variation regularization is added to the non-convex optimization model.
2 The Sparse Unmixing Model

The linear sparse unmixing model assumes that the spectrum of a mixed pixel can be represented by a linear combination of spectral signatures from a known spectral dictionary. Assume A ∈ R^{L×m} is the spectral dictionary, where m is the number of endmembers in A and L is the number of bands. The linear sparse unmixing model can be written as:

y = Ax + n    (1)
where y ∈ R^L is the spectrum vector of a mixed pixel, x ∈ R^m is the abundance vector, and n ∈ R^L is the noise. Sparse unmixing is a semisupervised unmixing method for the linear unmixing model, which uses A as a priori knowledge. The sparse unmixing problem can be stated as:

min_x ‖x‖_0  s.t. ‖y − Ax‖_2^2 ≤ δ, x ≥ 0, 1^T x = 1    (2)

where ‖x‖_0 is the number of nonzero atoms in x, and δ ≥ 0 is the tolerated deviation. x ≥ 0 is the abundance non-negativity constraint (ANC), and 1^T x = 1, stating that the abundance coefficients in x sum to 1, is the abundance sum-to-one constraint (ASC). The l0-norm problem can be replaced by the l1-norm or l2,1-norm problem to obtain a more efficient solution process; these are convex optimization problems. The l1-norm problem for sparse unmixing is:

min_x ‖x‖_1  s.t. ‖y − Ax‖_2^2 ≤ δ, x ≥ 0, 1^T x = 1    (3)

where ‖x‖_1 = Σ_{i=1}^m |x_i| is the l1 norm of x. The SUnSAL algorithm [2] has been put forward to tackle this problem. The l2,1-norm problem, called the simultaneous sparse unmixing problem, is:

min_X ‖X‖_{2,1}  s.t. ‖Y − AX‖_F ≤ δ, X ≥ 0    (4)

where ‖X‖_{2,1} = Σ_{i=1}^m ‖X_i‖_2 is the l2,1 norm of X, X_i denotes the ith row of X, and ‖X‖_F denotes the Frobenius norm of X. Collaborative SUnSAL (CLSUnSAL) [4] was proposed to solve this problem.
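The sparsity measures appearing in (2)-(4) are easy to check numerically on a toy abundance matrix (values are made up for illustration; note how the l2,1 norm sums the l2 norms of rows, favoring solutions where whole rows are zero):

```python
import numpy as np

X = np.array([[0.0, 0.0, 0.0],
              [3.0, 4.0, 0.0],
              [1.0, 0.0, 0.0]])   # rows = endmembers, columns = pixels

l0 = np.count_nonzero(X)                  # ||X||_0: number of nonzero entries
l1 = np.abs(X).sum()                      # ||X||_1: sum of magnitudes
l21 = np.linalg.norm(X, axis=1).sum()     # ||X||_{2,1}: sum of row l2 norms

print(l0, l1, l21)   # 3 8.0 6.0
```

Because the l2,1 norm only pays once per active row (here rows 1 and 2), minimizing it pushes entire endmember rows to zero, which is exactly the row-sparsity prior exploited below.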
3 Proposed Hyperspectral Unmixing Methodology

In recent years, non-convex sparse representation has been applied to signal reconstruction, showing that the lp-norm can achieve better reconstruction performance than the l1-norm, because the lp-norm is a closer surrogate for the NP-hard l0-norm problem than the l1-norm. Adjacent pixels in a hyperspectral image can be represented as different linear combinations of the same endmembers, so we use total variation regularization to express the spatial correlation of hyperspectral images. The hyperspectral sparse unmixing model with non-convex representation and spatial correlation constraints can be written as:

min_X λ J_{p,2}(X) + β TV(X)  s.t. ‖Y − AX‖_F ≤ δ, X ≥ 0    (5)
where J_{p,2}(X) = Σ_i ‖x_{i,·}‖_2^p with ‖x_{i,·}‖_2 = (Σ_j x_{i,j}^2)^{1/2} and 0 < p ≤ 1; λ > 0 and β > 0 are regularization parameters controlling the weights of the non-convex sparsity and total variation terms; ‖X‖_F is the Frobenius norm of X; TV(X) = Σ_{i,j} ‖x_i − x_j‖_1 is the total variation regularization over neighboring pixels i, j; and X ≥ 0 denotes the ANC constraint. The unmixing model of the proposed algorithm can be rewritten as follows:

min_X (1/2)‖Y − AX‖_F^2 + λ J_{p,2}(X) + β TV(X) + ι_{R+}(X)    (6)

where the indicator function is

ι_{R+}(X) = 0 if X_{ij} ≥ 0 ∀ i, j;  ∞ if X_{ij} < 0    (7)
The optimization problem in (6), after applying the ADMM method, is equivalent to the following form:

arg min_{X,V1,V2} (1/2)‖Y − AX‖_F^2 + λ J_{p,2}(X) + β TV(V1) + ι_{R+}(V2) + (μ/2)‖X − V1 − D1‖_F^2 + (μ/2)‖X − V2 − D2‖_F^2    (8)
where μ > 0 is the Lagrange multiplier, and V and D are intermediate variables. To solve (8), the optimization is split into a few simpler subproblems via the ADMM method, defined as follows:

(X^{(k+1)}, V^{(k+1)}) = arg min_{X,V} L(X, V)    (9)

D^{(k+1)} = D^{(k)} − (X^{(k+1)} − V^{(k+1)})    (10)
Based on (9) and (10), we obtain the solutions X, V, D of the augmented Lagrangian problem:

X^{(k+1)} = arg min_X (1/2)‖Y − AX‖_F^2 + λ J_{p,2}(X) + (μ/2)‖X − V1^{(k)} − D1^{(k)}‖_F^2 + (μ/2)‖X − V2^{(k)} − D2^{(k)}‖_F^2    (11)

V1^{(k+1)} = arg min_{V1} β TV(V1) + (μ/2)‖X^{(k+1)} − V1 − D1^{(k)}‖_F^2    (12)

V2^{(k+1)} = arg min_{V2} ι_{R+}(V2) + (μ/2)‖X^{(k+1)} − V2 − D2^{(k)}‖_F^2    (13)

After simple algebra, the solution of (11) is

X^{(k+1)} = W((AW)^T (AW) + (λ + 2μW^T) I)^{−1} ((AW)^T Y + μW^T Σ_{i=1}^2 φ_i^{(k)})    (14)

where φ_i^{(k)} = V_i^{(k)} + D_i^{(k)}, i = 1, 2, and W = diag(p^{−1/2} ‖x_{m,·}^{(k)}‖_2^{1−p/2}). The solution of (12) is the soft threshold:

V1^{(k+1)} = soft(D1^{(k)} − HV1^{(k)}, λ/μ)    (15)

where H denotes a convolution computed via discrete Fourier transform diagonalization. The solution of (13) can be solved easily:

V2^{(k+1)} = max(X^{(k)} − D2^{(k)}, 0)    (16)
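The two V-subproblem solutions, Eqs. (15) and (16), reduce to a soft-threshold operator and a nonnegative projection; a minimal NumPy sketch with toy matrices (the full X-update of Eq. (14) is omitted):

```python
import numpy as np

def soft(x, tau):
    # Soft-threshold operator used in the V1 update (Eq. 15):
    # shrinks each entry toward zero by tau, clipping at zero.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def update_v2(X, D2):
    # Nonnegative projection solving Eq. (13) / (16): V2 = max(X - D2, 0).
    return np.maximum(X - D2, 0.0)

A = np.array([[-1.5, 0.2], [0.7, -0.05]])
print(soft(A, 0.1))        # each entry shrunk by 0.1 toward zero

X = np.array([[0.5, -0.3], [1.0, 0.2]])
D2 = np.array([[0.1, 0.1], [0.1, 0.1]])
print(update_v2(X, D2))    # negative entries clipped to zero
```

Both operators are elementwise, which is why the ADMM splitting makes the TV and nonnegativity terms cheap to handle per iteration.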
4 Results

We use two simulated data cubes to demonstrate the algorithms' performance by calculating the signal reconstruction error (SRE), defined as

SRE = E‖X‖_2^2 / E‖X − X̂‖_2^2

where X and X̂ denote the true abundances and the reconstructed abundances, respectively. The spectral library A is the USGS spectral library splib06a with 498 materials. In the first simulated data cube, five spectral signatures are selected randomly from the spectral dictionary A, and the data size is 75 × 75 pixels. The second simulated data cube follows the literature [4]; it contains 100 × 100 pixels and uses nine spectral signatures selected randomly from A. Gaussian white noise of 20, 30, and 40 dB is added, respectively. The SRE results listed in Table 1 compare the proposed algorithm with three other methods. The proposed method achieves the best results in most cases. Figure 1 shows the estimated abundance images for three randomly selected endmembers under 30 dB white noise. The estimated abundance images of the presented algorithm contain fewer noise points than those of the other methods, and its abundance images are more similar to the true abundance images.
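Assuming the SRE is reported in dB via 10·log10 of the energy ratio (consistent with the "SRE (dB)" units in Table 1), it can be computed as:

```python
import numpy as np

def sre_db(X_true, X_est):
    # Signal reconstruction error in dB:
    # SRE = 10 * log10( E||X||^2 / E||X - X_hat||^2 )
    num = np.sum(X_true ** 2)
    den = np.sum((X_true - X_est) ** 2)
    return 10.0 * np.log10(num / den)

X = np.array([[1.0, 0.0], [0.0, 1.0]])
X_hat = X + 0.01           # small uniform abundance error
print(round(sre_db(X, X_hat), 2))   # → 36.99
```

Higher SRE means the reconstructed abundances are closer to the truth, so the best-performing method in Table 1 is the one with the largest values.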
Table 1 Two simulated data cubes' SRE (dB) values with different unmixing methods

Data cube | SNR (dB) | SOMP    | SUnSAL  | SUnSAL-TV | Proposed
DC1       | 20       | 10.3434 | 3.7599  | 9.2944    | 18.7995
DC1       | 30       | 19.5881 | 8.1211  | 22.0782   | 33.0384
DC1       | 40       | 29.6068 | 17.4204 | 30.1012   | 45.6606
DC2       | 20       | 17.3239 | 10.9845 | 14.8106   | 17.5148
DC2       | 30       | 26.2183 | 18.4072 | 22.8667   | 28.3035
DC2       | 40       | 35.7868 | 27.9553 | 31.8396   | 37.1407
Fig. 1 Abundance images of endmembers 1, 4, and 9 of the second data cube under 30 dB white noise: (a) true abundances, (b) SOMP, (c) SUnSAL, (d) SUnSAL-TV, (e) the proposed method
5 Conclusion

In this work, we introduced a hyperspectral sparse unmixing approach based on non-convex sparse and spatial correlation constraints. The results on two simulated data sets show that the presented method performs better than convex optimization algorithms and greedy algorithms.
References

1. Bioucas-Dias J, Plaza A (2010) Hyperspectral unmixing: geometrical, statistical, and sparse regression-based approaches. Toulouse, France
2. Afonso MV, Bioucas-Dias JM, Figueiredo MAT (2011) An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems. IEEE Trans Image Process 20(3):681–695
3. Iordache MD, Bioucas-Dias J, Plaza A (2012) Total variation spatial regularization for sparse hyperspectral unmixing. IEEE Trans Geosci Remote Sens 50(11):4484–4502
4. Iordache MD, Bioucas-Dias JM, Plaza A (2014) Collaborative sparse regression for hyperspectral unmixing. IEEE Trans Geosci Remote Sens 52(1):341–354
5. Chartrand R (2007) Exact reconstruction of sparse signals via nonconvex minimization. IEEE Signal Process Lett 14(10):707–710
6. Daubechies I, DeVore R, Fornasier M, Güntürk CS (2010) Iteratively reweighted least squares minimization for sparse recovery. Commun Pure Appl Math 63(1):1–38
Deep Denoising Autoencoder Networks for Hyperspectral Unmixing

Keyao Wen(B), Fanqiang Kong, Kedi Hu, and Shunmin Zhao

College of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
[email protected]
Abstract. The autoencoder (AE) is based on a reconstruction-driven, unsupervised framework, so its extracted features contain enough components to represent the input signal. Owing to this property, the AE can be applied well to hyperspectral unmixing. However, traditional AEs have low precision and are strongly affected by noise; in this paper, a deep denoising autoencoder network (DDAE) for hyperspectral unmixing is proposed to improve the accuracy of abundance estimation while also performing denoising. To guarantee that the abundances are nonnegative and sum to one and that the endmembers are nonnegative, we constrain the weights of the hidden layer and the decoding layer to be nonnegative, add the sum-to-one constraint to the hidden layer, and add an L2,1-norm constraint to the objective function as a regularization term, which takes advantage of the multiple sparsity between adjacent pixels. Experiments with real data and comparisons with other algorithms prove the effectiveness of the DDAE algorithm. Keywords: Hyperspectral unmixing · Autoencoder · Deep denoising autoencoder network
1 Introduction

In recent years, as deep learning has become more and more widely used in computer vision, it has gradually been introduced into image classification, natural language processing, and remote sensing. At present, the autoencoder (AE), the restricted Boltzmann machine, the deep belief network, and the convolutional neural network are the most common deep learning models. The autoencoder is based on a reconstruction-driven, unsupervised framework, such that the extracted features contain enough components to represent the input signal. Owing to this property, the AE can be applied well to hyperspectral unmixing: the main features of the original data are extracted by encoding, the original data is then reconstructed by decoding, and the abundance coefficients and the endmember matrix are obtained, respectively. Because the traditional AE has no constraints, it can easily copy the input to the output directly, or make only minor changes to produce small reconstruction errors, so that the model
performance is usually very poor. The denoising autoencoder (DAE), built on the AE, adds random noise to the input and feeds it to the AE to reconstruct the noise-free input. In this paper, a deep denoising autoencoder network (DDAE) for hyperspectral unmixing is proposed. According to the physical meaning of unmixing, the abundances and endmembers must be nonnegative and the abundance coefficients should sum to one, so when using a DAE for hyperspectral unmixing, the weights of the hidden layer and the decoding layer are set to be nonnegative and the hidden layer satisfies the sum-to-one requirement. If the noise and endmembers are estimated incorrectly, the unmixing performance decreases dramatically [1]. To solve this problem, an L2,1-norm constraint is added to the objective function as a regularization term, which not only reduces the redundant rows of the encoder but also takes advantage of the multiple sparsity between adjacent pixels, improving the performance of abundance estimation. In simulation experiments, compared with other commonly used unmixing algorithms, the DDAE algorithm is superior on both simulated and real data.
2 The Model of Denoising Autoencoder

A traditional AE without any constraint can easily copy the input directly to the output, or make only minor changes to produce small reconstruction errors, such that model performance is usually very poor. In this context, Vincent et al. proposed the DAE algorithm [2] based on the traditional AE: noise is added to the input, and the "corrupted" noisy samples are then used to reconstruct the "clean" noise-free input, which is the main difference from the traditional AE. At the same time, this training strategy enables the DAE to learn more of the essential characteristics of the input data. In this paper, additive Gaussian noise is added to the data, and experiments are conducted separately for different noise levels. The DAE network includes the following two parts:

1. Encode the input data X with the encoder f(x) to obtain the hidden layer output S (i.e., the abundance coefficients):

S = f(x) = σ(WX)    (1)
The activation function is represented by σ(x), and the weight of the encoder by W.

2. The decoder g(x) uses S to reconstruct the data:

X̂ = g(S) = AS    (2)
The decoder weight (i.e., the endmember matrix) is represented by A, and X̂ represents the reconstructed data. The network learns the weights and the hidden-layer representation by minimizing the average reconstruction error, defined as follows:

J(W, A) = (1/n) Σ_{i=1}^n (1/2)‖g(f(x_i)) − x̂_i‖_2^2    (3)

Starting from these formulas, we further combine the nonnegativity and sum-to-one constraints to solve the unmixing problem.
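Equations (1)-(3) amount to a nonlinear encoder followed by a linear decoder; the forward pass and reconstruction error can be sketched as follows (random untrained weights and hypothetical sizes, for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def dae_forward(X_noisy, W, A):
    # Denoising autoencoder pass (Eqs. 1-2): encode the corrupted input,
    # then decode it back with the linear endmember decoder A.
    S = relu(W @ X_noisy)      # hidden layer = abundance estimate, Eq. (1)
    X_rec = A @ S              # linear decoder, Eq. (2)
    return S, X_rec

def avg_recon_error(X_clean, X_rec):
    # Eq. (3): average over pixels of 0.5 * squared reconstruction error.
    return np.mean(0.5 * np.sum((X_rec - X_clean) ** 2, axis=0))

L, m, n = 20, 4, 50                           # bands, endmembers, pixels
A = np.abs(rng.standard_normal((L, m)))       # nonnegative endmembers
S_true = rng.dirichlet(np.ones(m), size=n).T  # abundance columns sum to one
X_clean = A @ S_true
X_noisy = X_clean + 0.05 * rng.standard_normal((L, n))  # corrupted input
W = 0.1 * rng.standard_normal((m, L))
S, X_rec = dae_forward(X_noisy, W, A)
print(S.shape, avg_recon_error(X_clean, X_rec) >= 0)
```

The ReLU keeps the hidden layer (the abundance estimate) nonnegative by construction, which is the same mechanism the proposed network uses for the ANC.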
3 Deep Denoising Autoencoder Networks for Hyperspectral Unmixing

In this section, the proposed deep denoising autoencoder is introduced in detail. Figure 1 shows the network structure of the proposed DDAE algorithm.
Fig. 1 Network structure of the proposed DDAE
For unmixing problems, the sum-to-one property of the abundance coefficients is an important constraint. To satisfy it, the input data X̂ and the weight A are augmented with a constant row vector; the augmented matrices are denoted by X̄ and Ā, respectively:

X̄ = [X̂; 1_l^T],  Ā = [A; 1_m^T]    (4)

so that 1_l^T s_j = 1, i.e., each column of the abundance matrix satisfies the sum-to-one constraint. In practical applications, the data is often accompanied by noise, and the presence of noise together with incorrect endmember estimation causes the unmixing performance to drop sharply. Therefore, a regularization term ‖W^T‖_{2,1} is introduced to reduce the redundant rows of the encoder, but this strategy cannot reflect the multiple sparsity between adjacent pixels; in this paper, we improve the term to ‖σ(WX)‖_{2,1}, which not only reduces redundant endmembers but also introduces multiple sparsity to improve the performance of abundance estimation. Summarizing the above analysis, the objective function of W is defined as:

J(W) = (1/2)‖Ā σ(WX) − X̄‖_F^2 + λ‖σ(WX)‖_{2,1}    (5)
According to the requirements of linear unmixing, both the encoding function and the decoding function should be linear mapping functions. At the same time, the activation function must ensure that the hidden layer (the abundance S) is nonnegative. Therefore, the ReLU function, defined as σ(x) = max(x, 0), is selected as the activation function. However, the ReLU function has the disadvantage of excessively increasing the gradient; according to previous research, this can be mitigated with an L1-norm or L2-norm penalty [3]. In the proposed network, we use the L2,1-norm to address the problems caused by ReLU [1]. The unmixing requirements also demand that the decoder weight be nonnegative, that is, A ≥ 0. We likewise handle this with the ReLU function, which guarantees the nonnegativity of A during the optimization process.
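The augmentation of Eq. (4) can be verified numerically: appending a ones row to both the data and the endmember matrix makes a least-squares fit return abundance columns that sum to one when the model is exact (toy sizes; this sketch uses an unconstrained solver rather than the trained network):

```python
import numpy as np

def augment(X, A):
    # Eq. (4): append a constant ones row to the data X (l x n) and to the
    # endmembers A (l x m), so that fitting the augmented system pushes
    # each abundance column toward sum-to-one.
    ones_x = np.ones((1, X.shape[1]))
    ones_a = np.ones((1, A.shape[1]))
    return np.vstack([X, ones_x]), np.vstack([A, ones_a])

rng = np.random.default_rng(2)
A = np.abs(rng.standard_normal((10, 3)))       # nonnegative endmembers
S_true = rng.dirichlet(np.ones(3), size=5).T   # columns sum to 1
X = A @ S_true
X_bar, A_bar = augment(X, A)
# Solving the augmented system recovers abundances whose columns sum to 1:
S_hat, *_ = np.linalg.lstsq(A_bar, X_bar, rcond=None)
print(np.allclose(S_hat.sum(axis=0), 1.0))   # → True
```

In the DDAE the same effect is achieved during training, since the ones row of X̄ must be reproduced by the ones row of Ā acting on S.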
4 Experimental Results and Analysis

In this part, we conducted experiments on a set of real data to verify the performance of the algorithm, comparing it with the SUnSAL, SUnSAL-TV, and SMP algorithms. The experiment used the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) Cuprite data as the hyperspectral data [4]. The data size is 250 × 191 pixels with 188 spectral bands. The mineral map generated by the Tricorder 3.3 software and the reconstructed abundance maps of each algorithm are shown in Fig. 2. In the experiment, the performance of the unmixing algorithms is evaluated using sparsity and reconstruction error. Sparsity is the number of non-zero values in the abundance matrix of the hyperspectral image; to prevent negligible values from being counted in the total, abundance values greater than 0.001 are defined as non-zero [5]. The RMSE is defined as follows [6]:

RMSE = (1/n) Σ_{i=1}^n ((1/m) Σ_{j=1}^m (X_{ij} − X̂_{ij})^2)^{1/2}    (6)

where X represents the original hyperspectral image, X̂ represents the reconstruction of the hyperspectral image, n represents the number of bands, and m represents the number of pixels of the image. The lower the RMSE, the better the unmixing quality. Table 1 shows the sparsity and RMSE of each algorithm. The sparsity and reconstruction error of the DDAE algorithm are the smallest, far superior to the other algorithms, indicating that the proposed algorithm has high unmixing performance on real hyperspectral images. From Fig. 2, it can be seen that the reconstructed abundance images have fewer noise points and retain the edge and feature information of the image, closer to the abundance distribution map generated by the software.
Fig. 2 Abundance images of three different elements (alunite, buddingtonite, chalcedony): Tricorder 3.3 software product and reconstructed abundance images of the SMP, SUnSAL, SUnSAL-TV, and DDAE algorithms

Table 1 Sparsity and the reconstruction errors

Algorithms | SMP    | SUnSAL  | SUnSAL-TV | DDAE
Sparsity   | 15.102 | 17.5629 | 20.472    | 10.0954
RMSE       | 0.0034 | 0.0051  | 0.0038    | 0.0018
5 Conclusion

In this paper, a deep denoising autoencoder network is proposed to solve the hyperspectral unmixing problem. On the basis of the DAE, we add the nonnegativity and sum-to-one constraints on the abundance coefficients, and add an L2,1-norm constraint as a regularization term to the objective function, which makes good use of the multiple sparsity between
adjacent pixels and improves the accuracy of abundance estimation. Experimental data show that the DDAE algorithm is superior to the other compared algorithms, especially for high-noise hyperspectral data.
References

1. Qu Y, Qi H (2019) uDAS: an untied denoising autoencoder with sparsity for spectral unmixing. IEEE Trans Geosci Remote Sens 57(3):1698–1712
2. Ma H, Ma S, Xu Y et al (2018) Image denoising based on improved stacked sparse denoising autoencoder. Comput Eng Appl 54(4):199–204
3. Gülçehre C, Bengio Y (2016) Knowledge matters: importance of prior information for optimization. J Mach Learn Res 17(1):226–257
4. Kong F, Li Y, Guo W (2016) Regularized MSBL algorithm with spatial correlation for sparse hyperspectral unmixing. J Vis Commun Image R 40:525–537
5. Iordache M-D, Bioucas-Dias JM, Plaza A (2014) Collaborative sparse regression for hyperspectral unmixing. IEEE Trans Geosci Remote Sens 52(1):341–354
6. Plaza J, Plaza A, Martínez P et al (2003) H-COMP: a tool for quantitative and comparative analysis of endmember identification algorithms. Proc IEEE IGARSS 1:291–293
Research on Over-Complete Sparse Dictionary Based on Compressed Sensing Theory

Zhihong Wang, Hai Wang(B), and Guiling Sun

College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
[email protected]

Abstract. This paper introduces the concepts of compressed sensing theory, signal sparsity, sparse bases, and over-complete dictionaries, and describes the DCT (discrete cosine) and chirplet wavelet over-complete dictionaries in detail. On this basis, with the help of MATLAB simulation tools, sparse decomposition over several commonly used over-complete dictionaries is carried out, and the sparse signal is reconstructed with the OMP algorithm. The simulation results show that the chirplet wavelet dictionary and the DB wavelet dictionary perform better than the DCT dictionary. Keywords: Compressed sensing · Sparse signal · Over-complete dictionary
In 2006, D.L. Donoho formally proposed the concept of compressed sensing (CS) [1]. Compressed sensing combines the sampling and compression processes: it uses signal sparsity or compressibility to remove redundancy between data, applies non-adaptive linear projections with incoherent measurements, and reconstructs the signal from far fewer measurements than the signal dimension would conventionally require [2-6].
1 Signal Sparsity

1.1 Absolute Sparsity

A sparse signal is defined as follows: for a one-dimensional signal x ∈ R^N with elements x_i (i = 1, 2, …, N), regarded as a vector, if its support set supp x = {i ∈ Z_N : x_i ≠ 0} satisfies

‖x‖_0 = |supp x| = K    (1)

then x is called a sparse signal in the strict sense. Sparsity in a transform domain is defined as follows: the original signal x ∈ R^N is not itself sparse, and Ψ = {ψ_i}_{i=1}^N is a set of column vectors of length N. The signal x is represented by a linear combination of the column vectors ψ_i with coefficients α_i, as shown in formula (2):

x = Σ_{i=1}^N α_i ψ_i = Ψα    (2)
1.2 Signal Compressibility

For an original signal x of one-dimensional length N (or its transform-domain signal α), sort the elements by absolute value to obtain |x|_1 ≥ |x|_2 ≥ … ≥ |x|_N, where |x|_i is the i-th largest magnitude. If the sorted elements follow a power-law decay, as shown in formula (3):

|x|_i ≤ C_q · i^{−q},  i = 1, 2, …, N    (3)

where C_q is a constant and q ≥ 0, the signal is compressible. Keeping the K values of largest absolute value and setting the others to zero yields a sparse approximation x_K of the original signal, which is called the best K-sparse approximation of the original signal.
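The best K-sparse approximation described above is simply a hard-thresholding of the K largest-magnitude entries, e.g.:

```python
import numpy as np

def best_k_approx(x, K):
    # Keep the K entries of largest absolute value and zero out the rest:
    # the best K-sparse approximation of x.
    xk = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-K:]   # indices of the K largest |x_i|
    xk[idx] = x[idx]
    return xk

x = np.array([0.1, -3.0, 0.02, 2.0, -0.5])
print(best_k_approx(x, 2))   # keeps -3.0 and 2.0, zeros everywhere else
```

For a compressible signal obeying the power-law decay of (3), the energy discarded by this truncation shrinks quickly as K grows, which is what makes sparse approximation useful for compression.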
2 Sparsity Transform

Sparse transforms generally include the Fourier transform and the wavelet transform.

2.1 Fourier Transform

The Fourier transform is a powerful tool for studying stationary signals; after the transform, the signal energy is concentrated in |ω| ≤ ω_m. The transform pair is as follows (4):

F(ω) = ∫_{−∞}^{+∞} f(t) e^{−jωt} dt,  f(t) = (1/2π) ∫_{−∞}^{+∞} F(ω) e^{jωt} dω    (4)

The Fourier transform works very well for stationary signals, but for non-stationary signals, when the local frequency content at a given time is needed, it is insufficient. On this basis, the short-time Fourier transform (STFT) [7] was proposed:

G_f(ω, τ) = ∫_{−∞}^{+∞} f(t) w(t − τ) e^{−jωt} dt    (5)

Once the window function adopted by the STFT is determined, its resolution is fixed, which limits its range of application.
2.2 Wavelet Transform

In 1984, A. Grossmann and J. Morlet defined the "wavelet" and put forward the wavelet transform [8], shown in formula (6):

W_f(a, b) = (1/√|a|) ∫_{−∞}^{+∞} f(t) φ((t − b)/a) dt    (6)
The parameters a and b in the wavelet transform correspond to ω and τ in the Gabor transform. The parameters b and τ are translation parameters, which change the window position; a change of ω does not affect the window shape and size, whereas a change of a not only changes the spectral structure of the waveform but also changes the size and shape of the window. The parameters can thus be selected according to the actual situation: where the signal changes slowly, the time window can be widened and the sampling frequency reduced; for signal segments with large fluctuations and complex frequency components, the frequency resolution can be reduced and the time window shortened to improve time resolution, so as to capture the transient changes of the signal in the time domain [9].
3 Sparse Base and Over-Complete Dictionary

3.1 Sparse Base

Suppose ψ_i (i = 1, …, N) is a set of vectors of length N. A signal x of length N can then be expressed as a linear combination of the vectors ψ_i, as in formula (7):

$$x = a_1 \psi_1 + \cdots + a_i \psi_i + \cdots + a_N \psi_N \tag{7}$$
where a_i is the coefficient of each vector. The vector a is then the sparse representation of the original signal in the transform domain, and the matrix Ψ formed by the vector group ψ_i (i = 1, …, N) is called the sparse matrix or sparse base.

3.2 Over-Complete Dictionary

The main idea of the over-complete dictionary is to exploit the redundancy of the atom library: characteristic atoms are constructed from prior conditions or characteristic parameters according to the features of the original signal, and all the atoms together form the over-complete dictionary. There are several ways to construct one: (1) from an existing function set or matrix — using a pre-built matrix as a dictionary is simple and convenient, but its practical application has some limitations; the controllable wavelet dictionary belongs to this category; (2) from adjustable parameters — atoms are generated by constraining one or several important parameters of the basis function, forming an over-complete dictionary [10]; the wavelet-packet dictionary and the Gabor dictionary belong to this category; (3) by constructing a dictionary for a specific type of signal.
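As a sketch of the sparse-base idea of Sect. 3.1: with an orthonormal basis Ψ (a DCT-II basis is used below as an illustrative choice), the coefficient vector a of formula (7) is simply Ψᵀx, and a signal built from few basis vectors has a sparse a:

```python
import numpy as np

N = 64
# Orthonormal DCT-II basis: column i of Psi is the basis vector psi_i.
n = np.arange(N)
Psi = np.cos(np.pi * (n[:, None] + 0.5) * n[None, :] / N)
Psi[:, 0] *= np.sqrt(1.0 / N)
Psi[:, 1:] *= np.sqrt(2.0 / N)

# Signal built from two basis vectors: x = 3*psi_2 + 0.5*psi_10.
a_true = np.zeros(N)
a_true[2], a_true[10] = 3.0, 0.5
x = Psi @ a_true

# Because Psi is orthonormal, the sparse representation is a = Psi^T x.
a = Psi.T @ x
```

Only two entries of a are non-zero, so x is 2-sparse in this basis even though all 64 time-domain samples are non-zero.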
456
Z. Wang et al.
4 Simulation Analysis

DCT, chirplet wavelet, and DB wavelet are used as bases to generate the corresponding over-complete dictionaries, and sparse reconstruction is simulated for a stationary signal, a non-stationary signal, a time-domain sparse signal, and a frequency-domain sparse signal, respectively. The reconstruction algorithm is OMP.

4.1 DCT Over-Complete Dictionary

The DCT has two characteristic parameters: frequency and phase. By oversampling these characteristic parameters with a factor Q, an over-complete dictionary can be generated; the larger Q is, the more atoms the dictionary contains. Assuming the signal to be decomposed has length N = 64 and the phase parameter is sampled 12 times, the over-complete dictionary contains 832 atoms; the characteristic parameters of all atoms are shown in Table 1.

Table 1 Characteristic parameters in DCT with signal length N = 64

  Atom    Frequency f   Phase ϕ
  gr1     0             0
  gr2     0.5           0.002
  gr3     1             0.0041
  gr4     1.5           0.0061
  …       …             …
  gr831   31            0.0225
  gr832   31.5          0.0245
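The dictionary construction of this section and the OMP reconstruction can be sketched as follows; the oversampling grid below is illustrative and does not reproduce the paper's exact 832-atom dictionary:

```python
import numpy as np

def dct_dictionary(N, freqs, phases):
    """Over-complete dictionary: one unit-norm cosine atom per (f, phi) pair."""
    n = np.arange(N)
    atoms = [np.cos(2 * np.pi * f * n / N + p) for f in freqs for p in phases]
    D = np.array(atoms).T                       # N x (len(freqs)*len(phases))
    return D / np.linalg.norm(D, axis=0)

def omp(D, x, k):
    """Orthogonal matching pursuit: pick k atoms greedily, refit by least squares."""
    residual, support = x.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(D.T @ residual))))
        coef, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coef
    return support, coef

N = 64
D = dct_dictionary(N, freqs=np.arange(0.5, 32, 0.5), phases=np.linspace(0, np.pi, 8))
x = 2.0 * D[:, 10] - 1.5 * D[:, 200]           # 2-sparse test signal
support, coef = omp(D, x, k=2)
x_hat = D[:, support] @ coef
```

The 504-atom dictionary here is redundant (many more atoms than the signal length), yet OMP recovers the 2-sparse signal to numerical precision.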
The objects of sparse reconstruction are the stationary signal, the non-stationary signal, the time-domain sparse signal, and the frequency-domain sparse signal. The simulation results are shown in Fig. 1. In Fig. 1, the absolute reconstruction error lies in [7.2231e−04, 0.2007] for the stationary signal, in [0.0026, 0.4511] for the non-stationary signal, and in [0.0063, 1.3182] for the frequency-domain sparse signal.

4.2 Chirplet Wavelet Over-Complete Dictionary

The chirplet wavelet has five characteristic parameters: the scale parameter s, time center u, frequency center ξ, time-frequency slope η, and phase parameter ϕ. Similarly,
Fig. 1 Sparse reconstruction effect of the DCT over-complete dictionary
the number of atoms in the dictionary is determined by the length of the signal to be decomposed and the oversampling coefficients of the characteristic parameters. A signal of length N = 64 is again used as the signal to be decomposed; the dictionary then contains 44,278 atoms, whose characteristic parameter groups are shown in Table 2. The simulation results are shown in Fig. 2. It can be seen from Fig. 2 that the reconstruction error range is [1.0357e−05, 0.0042] for the stationary signal, [3.3208e−06, 0.0107] for the non-stationary signal, and [1.3468e−05, 0.0144] for the frequency-domain sparse signal.

4.3 DB Wavelet Over-Complete Dictionary

In this section, an over-complete dictionary is generated by decomposing signals with the DB wavelet transform. The number of atoms in the dictionary is determined by the number of prior signals, the decomposition order N, and the number of decomposition layers l. The dictionary is generated as follows:

(1) Select a certain number of prior model signals of length n;
Table 2 Characteristic parameters in chirplet wavelet with signal length N = 64

  Atom      Scale s   Time center u   Freq. center ξ   Slope η   Phase ϕ
  cr1       1         0               0                0         0
  cr2       2         7               4.7124           0         1.5708
  cr3       4         16              6.2832           0         4.1888
  cr4       8         12              1.9635           0         0.5236
  …         …         …               …                …         …
  cr44277   32        48              5.7923           0.0982    3.6652
  cr44278   64        64              6.2832           0.0982    5.7596
Fig. 2 Sparse reconstruction effect of the chirplet wavelet over-complete dictionary
(2) Following the multi-resolution Mallat algorithm, decompose each prior model signal into its low-frequency component A1 and high-frequency component D1, each of length [(n − 1)/2] + N;
(3) According to the decomposition order, decide whether to continue decomposing A_i: if so, split it into the low-frequency component A_{i+1} and high-frequency component D_{i+1}; if not, combine the components obtained so far into one dictionary atom g_γ = [A_i, D_i, D_{i−1}, …, D_1], where γ = (N, l) is the parameter group of the decomposition order and the number of decomposition layers;

(4) The number of atoms in the generated dictionary is determined by the number of selected signals m, the decomposition order N, and the number of decomposition layers l; the atoms then compose the dictionary D = {g_γ}, γ ∈ Γ.

In this paper, 24 sets of measured temperature signals are used as prior signals to construct the over-complete dictionary; the decomposition order and the number of decomposition layers are selected accordingly, and the resulting dictionary contains 2448 atoms. For comparison, the same stationary signal, non-stationary signal, and frequency-domain sparse signal as before are used in the simulation, and the results are shown in Fig. 3.
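The steps above can be sketched with NumPy. For brevity the sketch substitutes the Haar analysis filter pair for a higher-order DB pair and uses periodic boundary handling, so atom lengths follow powers of two rather than the [(n − 1)/2] + N formula; the random prior signals are also stand-ins for the measured temperature data:

```python
import numpy as np

def haar_step(x):
    """One Mallat analysis step with Haar filters (periodic boundary)."""
    even, odd = x[0::2], x[1::2]
    A = (even + odd) / np.sqrt(2)   # low-frequency component
    D = (even - odd) / np.sqrt(2)   # high-frequency component
    return A, D

def make_atom(signal, levels):
    """Steps (2)-(3): split A repeatedly, then concatenate [A_l, D_l, ..., D_1]."""
    A, details = signal, []
    for _ in range(levels):
        A, D = haar_step(A)
        details.append(D)
    return np.concatenate([A] + details[::-1])

# Steps (1) and (4): one atom per prior signal; m prior signals -> m atoms.
rng = np.random.default_rng(0)
priors = [rng.standard_normal(64) for _ in range(24)]   # stand-in prior signals
dictionary = np.array([make_atom(s, levels=3) for s in priors])
```

Each Haar step is orthonormal, so every atom preserves the energy of its prior signal, and a test signal close to one of the priors projects almost entirely onto a single atom.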
Fig. 3 Sparse reconstruction effect of the DB wavelet over-complete dictionary
It can be seen from Fig. 3 that the reconstruction error range is [7.4638e−07, 0.0012] for the stationary signal, [4.4252e−04, 0.0570] for the non-stationary signal, and [2.4294e−05, 0.0023] for the frequency-domain sparse signal. If a group of signals similar to the prior temperature signals is selected as the signal to be decomposed, the simulation waveform is shown in Fig. 4.
Fig. 4 Simulation results for a signal similar to the prior model in the DB wavelet dictionary
From Fig. 4, it can be seen that the signal reconstruction error reaches the order of 10^{-14}, realizing high-precision reconstruction; the sparse representation also shows that the signal is truly sparse.
5 Conclusion

In this paper, sparse transforms, sparse bases, and over-complete dictionaries are introduced, and a cascaded DB wavelet construction method is proposed. On this basis, the DCT, Haar, chirplet, and DB over-complete dictionaries are used to sparsify the signals, and the OMP algorithm is used to reconstruct them. The simulation results show that the chirplet and DB dictionaries give sparser representations than the other two, while the Haar wavelet gives the worst sparsifying effect. If the signal to be decomposed is similar to the prior signals that produced the DB wavelet over-complete dictionary, the reconstruction error can reach the order of 10^{-14}, realizing effective sparsification and accurate reconstruction of the signal.

Acknowledgements This work is supported by the Self-made Experimental Teaching Instrument and Equipment Project Fund of Nankai University (No. 2019NKZZYQ03).
References

1. Donoho DL (2006) Compressed sensing. IEEE Trans Inf Theory 52(4):1289–1306
2. Baraniuk RG (2007) Compressive sensing [lecture notes]. IEEE Signal Process Mag 24:118–121
3. Candès EJ, Romberg J (2006) Quantitative robust uncertainty principles and optimally sparse decompositions. Found Comput Math 6:227–254
4. Candès EJ, Romberg J, Tao T (2006) Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Trans Inf Theory 52:489–509
5. Candès EJ, Romberg JK, Tao T (2006) Stable signal recovery from incomplete and inaccurate measurements. Commun Pure Appl Math 59:1207–1223
6. Candès EJ, Tao T (2006) Near-optimal signal recovery from random projections: universal encoding strategies. IEEE Trans Inf Theory 52:5406–5425
7. Cohen BJ (1998) Time-frequency analysis: theory and application. Xi'an Jiaotong University Press
8. Fan Q (2008) Wavelet analysis. Wuhan University Press
9. Chen SB, Donoho D (1994) Basis pursuit. In: Conference record of the twenty-eighth Asilomar conference on signals, systems and computers, vol 1, pp 41–44
10. Elad M, Aharon M (2006) Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans Image Process 15(12):3736–3745
Spectrum Occupancy Prediction via Bidirectional Long Short-Term Memory Network Lijie Feng, Xiaojin Ding(B) , and Gengxin Zhang Telecommunication and Networks National Engineering Research Center, Nanjing University of Posts and Telecommunications, Nanjing, China [email protected]
Abstract. In a satellite system, the ability to forecast future spectrum occupancy can play an important role in increasing spectrum efficiency, and spectrum prediction is emerging as an efficient approach to this end. In order to predict spectrum occupancy more accurately, we propose a bidirectional long short-term memory network (BiLSTM)-based spectrum prediction (SP) scheme that operates in two stages. In the first stage, the historical spectrum data are pre-processed; in the second stage, the pre-processed data are fed to the BiLSTM model, which first trains and generates optimized hyperparameters and is then activated to perform prediction with them. Performance evaluations show that the BiLSTM-based SP scheme outperforms the LSTM-oriented SP scheme in terms of both accuracy and learning speed.

Keywords: Spectrum occupancy prediction · Deep learning · BiLSTM
1 Introduction

Due to the shortage of spectrum resources [1, 2], increasing spectrum efficiency is very important for satellite networks. Fortunately, spectrum prediction emerges as an effective way to infer future spectrum occupancy by mining the internal relationships of spectrum data in both the time domain and the frequency domain. Spectrum efficiency can thus be increased by obtaining the spectrum occupancy in advance, decreasing collisions among the users of the satellite networks. Many researchers have developed effective spectrum prediction techniques for analyzing spectrum data. Logistic regression analysis [3], support vector machine regression [4], and hidden Markov models [5] were designed to predict spectrum occupancy in advance. However, these algorithms rely on the mathematical distribution characteristics of the spectrum data, resulting in certain limitations. Differing from these methods, the long short-term memory (LSTM) neural network is commonly used to deal with time
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_64
Spectrum Occupancy Prediction via Bidirectional Long Short-Term …
463
series problems, and spectrum data can be regarded as a kind of time series. A spectrum prediction algorithm based on an LSTM model [6–8] has been proposed, in which only the central frequency is analyzed. However, a user may be allocated a band containing many frequency points. Motivated by this consideration, we investigate a bidirectional long short-term memory (BiLSTM) [9]-based spectrum prediction scheme that processes all frequency points of a user and outputs an integrated result. This paper builds a combined neural network model that makes the best of the spatial and temporal characteristics of user spectrum data to predict future user behavior. The organization of this paper is as follows. Section 2 presents the BiLSTM model. Section 3 describes spectrum data pre-processing and model training. Section 4 gives performance evaluations, and Sect. 5 concludes the paper.
2 BiLSTM Model

Spectrum data of the users can be regarded as time series data, interrelated in the time domain. The recurrent neural network (RNN) has a unique advantage in processing time series, and BiLSTM is a variant of RNN composed of a forward LSTM and a backward LSTM, as shown in Fig. 1. The architecture of the LSTM is depicted in Fig. 2. Specifically, the LSTM structure uses memory cells instead of classic neurons; each cell consists of a forget gate, an input gate, and an output gate, which respectively control the degree to which historical status information is forgotten, take the upper-layer input and add new status information, and output information.
Fig. 1 BiLSTM model (a forward LSTM layer and a backward LSTM layer between the input and the output)
As shown in [6], at time slot t the gates are given by:

$$\vec{i}_t = \delta(W_{ix} x_t + W_{ih} \vec{h}_{t-1} + b_i) \tag{1}$$

$$\vec{f}_t = \delta(W_{fx} x_t + W_{fh} \vec{h}_{t-1} + b_f) \tag{2}$$

$$\vec{m}_t = \vec{f}_t \odot \vec{m}_{t-1} + \vec{i}_t \odot \tanh(W_{mx} x_t + W_{mh} \vec{h}_{t-1} + b_m) \tag{3}$$

$$\vec{o}_t = \delta(W_{ox} x_t + W_{oh} \vec{h}_{t-1} + b_o) \tag{4}$$

$$\vec{h}_t = \vec{o}_t \odot \tanh(\vec{m}_t) \tag{5}$$

Fig. 2 Architecture of LSTM unit
where the arrow denotes the forward direction, x_t denotes the input vector, and m_t denotes the memory cell. Moreover, i_t, f_t, and o_t represent the input gate, forget gate, and output gate at the t-th time slot, respectively, and h_t is the hidden state at the same time. The parameters W and b with different subscripts represent the corresponding weight matrices and bias vectors, δ denotes the sigmoid activation function, tanh is the hyperbolic tangent function, and ⊙ denotes the element-wise product of two vectors. The output of the BiLSTM in Fig. 1 is determined jointly by the forward LSTM and the backward LSTM, and can be written as:

$$H_t = \overrightarrow{W}_h \overrightarrow{h}_t + \overleftarrow{W}_h \overleftarrow{h}_t + b_h \tag{6}$$

where $H_t$ denotes the overall output of the BiLSTM network, $\overrightarrow{W}_h$ and $\overleftarrow{W}_h$ denote the weight matrices of the forward and backward LSTM, and $\overrightarrow{h}_t$ and $\overleftarrow{h}_t$ represent the outputs of the forward and backward LSTM, respectively.
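A minimal NumPy sketch of one memory-cell update, Eqs. (1)–(5), followed by the bidirectional combination of Eq. (6); the dimensions and random weights are placeholders, and the same cell is reused for both directions purely for brevity:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, m_prev, W):
    """One LSTM cell update, Eqs. (1)-(5). W maps each gate to (W_gx, W_gh, b_g)."""
    i = sigmoid(W["i"][0] @ x + W["i"][1] @ h_prev + W["i"][2])   # input gate  (1)
    f = sigmoid(W["f"][0] @ x + W["f"][1] @ h_prev + W["f"][2])   # forget gate (2)
    m = f * m_prev + i * np.tanh(W["m"][0] @ x + W["m"][1] @ h_prev + W["m"][2])  # (3)
    o = sigmoid(W["o"][0] @ x + W["o"][1] @ h_prev + W["o"][2])   # output gate (4)
    h = o * np.tanh(m)                                            # (5)
    return h, m

rng = np.random.default_rng(1)
d_in, d_h = 4, 8
W = {g: (rng.standard_normal((d_h, d_in)), rng.standard_normal((d_h, d_h)),
         np.zeros(d_h)) for g in "ifmo"}

x_t = rng.standard_normal(d_in)
h_fwd, _ = lstm_step(x_t, np.zeros(d_h), np.zeros(d_h), W)   # forward pass
h_bwd, _ = lstm_step(x_t, np.zeros(d_h), np.zeros(d_h), W)   # backward pass
Wf, Wb, bh = rng.standard_normal((d_h, d_h)), rng.standard_normal((d_h, d_h)), np.zeros(d_h)
H_t = Wf @ h_fwd + Wb @ h_bwd + bh                           # Eq. (6)
```

Because h = o ⊙ tanh(m) with both factors bounded by 1 in magnitude, every hidden-state entry stays in (−1, 1), which keeps the recurrence numerically stable.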
3 Spectrum Data Pre-processing and Model Training

3.1 Spectrum Data Pre-processing

One user may occupy multiple channels, and the statuses of the channels may be correlated in both the frequency domain and the time domain. Thus, to improve prediction accuracy, the spectrum data of the multiple channels should be processed jointly when predicting the spectrum status of one user. Let N and T denote the number of channels occupied by a single user and the total length of the training time, respectively, and let {y_i(t), i = 1, 2, …, N; t = 1, 2, …, T} denote the status of the i-th channel at the t-th time slot: if y_i(t) is 0 the channel is idle, otherwise it is busy. The lookback window size is set to S time slots. During data pre-processing, the input data are arranged into an N × S matrix Y, which contains the spatiotemporal features of the user behavior. For the time period [t − (S − 1), t], Y can be formulated as

$$Y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} y_1(t-(S-1)) & \cdots & y_1(t-1) & y_1(t) \\ y_2(t-(S-1)) & \cdots & y_2(t-1) & y_2(t) \\ \vdots & & & \vdots \\ y_N(t-(S-1)) & \cdots & y_N(t-1) & y_N(t) \end{bmatrix}_{N \times S} \tag{7}$$
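The pre-processing of Eq. (7) reduces to slicing, for each channel, the S most recent status samples; a sketch with made-up occupancy data:

```python
import numpy as np

def lookback_matrix(status, t, S):
    """Build Y of Eq. (7): rows are channels, columns the S slots ending at t.
    `status` has shape (N, T) with entries in {0, 1}; t is a 0-based slot index."""
    return status[:, t - (S - 1): t + 1]

# Toy occupancy history: 3 channels, 10 time slots.
status = np.array([[0, 1, 1, 0, 0, 1, 1, 1, 0, 0],
                   [1, 1, 0, 0, 1, 1, 0, 0, 1, 1],
                   [0, 0, 0, 1, 1, 1, 1, 0, 0, 1]])
Y = lookback_matrix(status, t=9, S=4)
```

Each training sample is one such N × S window paired with the next-slot status vector as the target.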
3.2 Model Training

Since the status of a channel is represented only as busy or idle, the model can be regarded as a binary classifier, so the sigmoid function is selected as the activation function of the LSTM output layer. It is calculated as follows [6]:

$$\delta(x) = \frac{1}{1 + e^{-x}} \tag{8}$$

where x is the input of the neuron. The output layer of the combined model outputs an N-dimensional vector

$$\hat{\beta} = \left[\hat{\beta}_1(t+1),\ \hat{\beta}_2(t+1),\ \ldots,\ \hat{\beta}_N(t+1)\right]^{T}_{N \times 1} \tag{9}$$

where each value lies between 0 and 1. These values are then compared with a threshold to obtain the spectrum occupancy status

$$\hat{Y} = \left[\hat{y}_1(t+1),\ \hat{y}_2(t+1),\ \ldots,\ \hat{y}_N(t+1)\right]^{T}_{N \times 1} \tag{10}$$

where ŷ_i(t+1) = 1 (busy) if β̂_i(t+1) is greater than the threshold, and ŷ_i(t+1) = 0 (idle) otherwise.
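The output mapping of Eqs. (8)–(10) can be sketched directly; the threshold of 0.5 is an assumed value, since the paper does not state it:

```python
import numpy as np

def predict_occupancy(logits, threshold=0.5):
    """Eqs. (8)-(10): sigmoid scores beta_hat, then hard busy/idle decisions."""
    beta_hat = 1.0 / (1.0 + np.exp(-logits))    # Eq. (8), values in (0, 1)
    y_hat = (beta_hat > threshold).astype(int)  # Eq. (10): 1 = busy, 0 = idle
    return beta_hat, y_hat

logits = np.array([2.0, -1.0, 0.3, -3.5])       # raw outputs for N = 4 channels
beta_hat, y_hat = predict_occupancy(logits)
```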
The loss function and the optimizer are the two key components of the BiLSTM model. In each training round, the loss function outputs a loss value measuring the gap between the predictions and the targets, while the optimizer calculates and updates the network weights so as to lower that value. The binary cross-entropy function is often used as the loss function for binary classification problems, and NAdam is used as the optimizer of the network, which is proved to impose stronger constraints on the learning rate and to give better optimization effects. The binary cross-entropy formula is as follows:

$$\text{Loss} = -\frac{1}{M} \sum_{m=1}^{M} Y^{m} \log \hat{\beta}^{m} \tag{11}$$

where M is the size of the input dataset, Y^m denotes the actual channel occupancy status, and β̂^m is the score predicted by the combined model.
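Formula (11) as written keeps only the Y log β̂ term; the sketch below computes exactly that expression (standard binary cross-entropy implementations also add a (1 − Y) log(1 − β̂) term for the idle class):

```python
import numpy as np

def loss_eq11(Y, beta_hat, eps=1e-12):
    """Eq. (11): Loss = -(1/M) * sum_m Y_m * log(beta_hat_m).
    eps guards against log(0) for scores that underflow to zero."""
    return -np.mean(Y * np.log(beta_hat + eps))

Y = np.array([1.0, 1.0, 0.0, 1.0])          # actual occupancy
beta_hat = np.array([0.9, 0.8, 0.1, 0.99])  # predicted scores
loss = loss_eq11(Y, beta_hat)
```

The loss shrinks toward zero as the scores for busy slots approach 1, which is what the optimizer drives toward during training.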
4 Experiments

In this section, simulation results are presented to verify the performance of the designed scheme, and the conventional LSTM scheme is also evaluated for comparison. The experiment platform is an Intel(R) Core(TM) i7-6700HQ at 2.60 GHz with a GTX 965M GPU and 16.00 GB of memory, and the neural network is built on the TensorFlow framework. The dataset is created by collecting from the Tiantong-1 satellite, in the frequency band 2176.075–2176.105 MHz. The sample period is one hour, of which the data of the first 45 minutes are used for training and the last 15 minutes for testing.

4.1 Hyperparameter Optimization

The K-fold cross-validation method [10] is adopted to optimize the hyperparameters of the chosen neural network. It divides the dataset into K subsets: a single subset is retained for model validation, and the remaining K − 1 subsets are used for training. Cross-validation is repeated K times, and the average value is taken as the final validation result of the model. Based on the initial setting, hyperparameters are gradually modified to balance performance and efficiency; here K is set to 4 (Table 1). Through extensive cross-validation, the optimized number of hidden layers is 1, with 128 neurons.

Table 1 Evaluation results of different model structures

  Layers   Neurons   Fold 1   Fold 2   Fold 3   Fold 4   Mean accuracy
  1        32        0.8830   0.9059   0.9048   0.8953   0.8973
  1        64        0.8915   0.9103   0.9039   0.8971   0.9007
  1        128       0.9043   0.9158   0.9113   0.8957   0.9068
  2        32        0.8960   0.9096   0.8996   0.8808   0.8965
  2        64        0.8935   0.9078   0.9023   0.8909   0.9038
  2        128       0.8894   0.9088   0.8975   0.8913   0.8968

4.2 Prediction Results

In Fig. 3, we present the prediction accuracy of the BiLSTM and the LSTM for different numbers of epochs. It can be seen from Fig. 3 that the BiLSTM model achieves better accuracy than the LSTM model.

Fig. 3 Prediction accuracy of two models for different epochs

Figure 4 illustrates the prediction accuracy of the two models for different numbers of time slots predicted in advance. Observe from Fig. 4 that the BiLSTM scheme always achieves higher accuracy than the LSTM for 1 to 20 time slots in advance. However, the accuracy decreases as the number of time slots in advance increases, because the correlation of the spectrum data becomes lower over longer horizons.
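The K-fold procedure of Sect. 4.1 can be sketched generically; `train_and_score` is a placeholder standing in for training the BiLSTM on one split and returning its validation accuracy:

```python
import numpy as np

def kfold_mean_score(data, K, train_and_score):
    """Split data into K subsets; each subset serves as validation once while
    the remaining K-1 subsets train. Returns the average validation score."""
    folds = np.array_split(np.arange(len(data)), K)
    scores = []
    for k in range(K):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[j] for j in range(K) if j != k])
        scores.append(train_and_score(data[train_idx], data[val_idx]))
    return float(np.mean(scores))

# Stand-in scorer: "accuracy" = fraction of validation samples below the train mean.
data = np.arange(100, dtype=float)
score = kfold_mean_score(data, K=4,
                         train_and_score=lambda tr, va: float(np.mean(va < tr.mean())))
```

In the paper's setting, each candidate structure (hidden layers, neurons) gets one such averaged score, and the structure with the best mean accuracy is kept.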
5 Conclusions

In this paper, we investigated spectrum prediction for satellite systems and conceived a BiLSTM-based spectrum prediction scheme: the historical spectrum data are first pre-processed, and the pre-processed data are then sent to the BiLSTM model for training and testing. Our performance evaluations show that the BiLSTM-based spectrum prediction scheme achieves better accuracy and learning speed than the LSTM-oriented scheme.
Fig. 4 Prediction accuracy of two models for different time slots in advance
Acknowledgments The work presented was partially supported by the National Science Foundation of China (No. 91738201), the China Postdoctoral Science Foundation (No. 2018M632347), the Natural Science Research of Higher Education Institutions of Jiangsu Province (No. 18KJB510030), and the Open Research Fund of Jiangsu Engineering Research Center of Communication and Network Technology, NJUPT.
References

1. Muhammad A, Rehmani M, Mao S (2018) Wireless multimedia cognitive radio networks: a comprehensive survey. IEEE Commun Surv Tutorials 20(2):1056–1103
2. Hu F, Chen B, Zhu K (2018) Full spectrum sharing in cognitive radio networks toward 5G: a survey. IEEE Access 6:15754–15776
3. Krishan K, Prakash A, Tripathi R (2017) A spectrum handoff scheme for optimal network selection in NEMO based cognitive radio vehicular networks. Wirel Commun Mobile Comput 2017:1–16
4. Sapankevych N, Sankar R (2009) Time series prediction using support vector machines: a survey. IEEE Comput Intell Mag 4(2):24–38
5. Zhao Y, Hong Z, Wang G et al (2016) High-order hidden bivariate Markov model: a novel approach on spectrum prediction. In: 2016 25th IEEE international conference on computer communication and networks (ICCCN), pp 1–7
6. Li H, Ding X, Yang Y et al (2019) Online spectrum prediction with adaptive threshold quantization. IEEE Access 7:174325–174334
7. Ergen T, Kozat S (2018) Online training of LSTM networks in distributed systems for variable length data sequences. IEEE Trans Neural Netw Learn Syst 29(10):5159–5165
8. Park K, Choi Y, Choi W et al (2020) LSTM-based battery remaining useful life prediction with multi-channel charging profiles. IEEE Access 8:20786–20798
9. Sun J, Shi W, Yang Z et al (2019) Behavioral modeling and linearization of wideband RF power amplifiers using BiLSTM networks for 5G wireless systems. IEEE Trans Veh Technol 68(11):10348–10356
10. Rodriguez J, Perez A, Lozano J (2010) Sensitivity analysis of k-fold cross validation in prediction error estimation. IEEE Trans Pattern Anal Mach Intell 32(3):569–575
Active–Passive Fusion Technology Based on Neural Network-Aided Extended Kalman Filter Xiaomin Qiang(B), Zhangchi Song, Laitian Cao, Yan Chen, and Kaiwei Chen Beijing Aerospace Automatic Control Institute, Beijing, People's Republic of China [email protected]
Abstract. The information fusion algorithm based on the extended Kalman filter is widely used in the field of information fusion because of its relative simplicity and solid theoretical foundation. This paper uses the strengths of neural networks in handling nonlinear and time-varying systems to improve the traditional extended Kalman filtering method, increasing the speed and accuracy of filtering and improving the quality of active–passive data fusion.

Keywords: Neural network · Extended Kalman filter · Active–passive fusion
1 Introduction

In the new generation of combat systems, targets employ many countermeasures such as electromagnetic interference, deception, camouflage, and concealment, so a seeker with a single guidance mode can hardly track them well enough to meet operational requirements. Multi-mode composite guidance has become an important development direction for missile weapons, and combined active–passive radar guidance is an important form of it. The traditional multi-mode guidance method works in a serial manner, switching modes according to the situation, so the full capability of the multi-mode seeker cannot be exploited. The information fusion algorithm based on extended Kalman filtering is widely used in the field of information fusion [1]. However, this algorithm has limitations in practice: the extended Kalman filter requires a known state model of the signal, but for many practical problems, especially for time-sensitive targets, an accurate state model is difficult to obtain; and when the parameters of the system change, the damage caused by model inaccuracy becomes more and more prominent, causing the estimation results to diverge. In view of these problems, this paper uses the strengths of neural networks in handling nonlinear and time-varying systems to improve the traditional extended Kalman filtering method, improving filtering speed and accuracy and the quality of active–passive data fusion.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_65
Active–Passive Fusion Technology Based on Neural Network-Aided …
471
2 System Design of Extended Kalman Filter Based on Neural Network

2.1 Active and Passive Radar Fusion System Model

The radar nonlinear state equation is [2, 3]:

$$X(n+1) = f(X_n, n) + W_n, \quad W_n \sim N(0, Q) \tag{1}$$

In Eq. (1), W_n is the system Gaussian white noise and Q is the system noise covariance matrix. The measurement equation is:

$$Z(n) = h(X_n, n) + V_n, \quad V_n \sim N(0, R) \tag{2}$$

In Eq. (2), V_n is the Gaussian white noise of the observation and R is the observation noise covariance matrix. In the active–passive fusion system, the information fusion algorithm based on extended Kalman filtering is relatively simple and has a solid theoretical foundation. According to the structure and function of the system, data fusion systems are generally divided into centralized and distributed [4, 5]. As shown in Fig. 1, the biggest advantage of the centralized structure is minimal information loss, but data interconnection is more difficult, the system requires large processing capacity, and the calculation burden is heavy.
Fig. 1 Centralized data fusion (active and passive radar measurements are fused synchronously by a single Kalman filter, yielding the globally optimal state estimate)
As shown in Fig. 2, a distributed system can only achieve locally optimal or globally sub-optimal fusion performance, but it greatly reduces the burden on the fusion center.
Fig. 2 Distributed data fusion (each radar's measurements pass through a local Kalman filter, and the fusion center combines the local estimates into the optimal state estimate)
2.2 BP Neural Network Structure

Active–passive information fusion has always been prominent in the military field. With the development of artificial intelligence technology, intelligent methods such as fuzzy theory, neural networks, and evidential reasoning have become an important force driving active–passive information fusion technology. Methods represented by neural networks combine model-free learning with a linguistic description of the problem, and neural-network-assisted active–passive fusion filtering can achieve higher filtering accuracy. The BP algorithm has the advantages of being simple to implement, computationally light, and highly parallel. The structure of a BP neural network is shown in Fig. 3 [6, 7].

Fig. 3 BP neural network structure diagram (input layer, two hidden layers, and output layer, with the error back-propagated through the network)
As shown in Fig. 4, BP neural network learning is a typical supervised learning process. Its learning algorithm consists of two phases: forward propagation of the data stream and back propagation of the error signal.

2.3 Active and Passive Fusion System Based on BP Neural Network Assisted Extended Kalman Filter

When performing target tracking, extended Kalman filtering needs to meet the following requirements: the state model of the signal is known, and the dynamic system and the
Fig. 4 BP neural network learning algorithm (forward propagation of training samples produces the network output; the network error is back-propagated to modify the weights and thresholds)
measurement noise follow Gaussian white noise sequences with zero mean. If any of the following situations occurs, the mathematical description often does not match the actual physical process: (1) the tracking target is time-sensitive and difficult to model accurately; (2) dynamic changes in the real environment introduce random disturbances into the system model; (3) errors arise from coordinate transformation, rounding in the calculation process, and the initial state and initial error covariance matrix. These situations degrade the accuracy of the extended Kalman filter to different degrees, so further corrections to the extended Kalman filter can be considered [8]. The extended Kalman filter equations are:

One-step prediction of the state:

$$\hat{X}(n+1, n) = f\big(n, \hat{X}(n, n)\big) \tag{3}$$

One-step prediction of the covariance:

$$P(n+1, n) = f_x(n)\, P(n, n)\, f_x^{T}(n) + Q(n) \tag{4}$$

Measurement prediction:

$$\hat{z}(n+1, n) = h\big(n+1, \hat{X}(n+1, n)\big) \tag{5}$$

Innovation covariance:

$$S(n+1) = h_x(n+1)\, P(n+1, n)\, h_x^{T}(n+1) + R(n+1) \tag{6}$$

Gain:

$$K(n+1) = P(n+1, n)\, h_x^{T}(n+1)\, S^{-1}(n+1) \tag{7}$$

State update equation:

$$\hat{X}(n+1, n+1) = \hat{X}(n+1, n) + K(n+1)\big[z(n+1) - \hat{z}(n+1, n)\big] \tag{8}$$
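Equations (3)–(8) form one predict/update cycle, sketched below for an illustrative linear position–velocity model (in the linear case the Jacobians f_x and h_x coincide with the model matrices F and H):

```python
import numpy as np

def ekf_step(x, P, z, f, F, h, H, Q, R):
    """One EKF cycle, Eqs. (3)-(8). F = f_x and H = h_x are the Jacobians."""
    x_pred = f(x)                                  # (3) state prediction
    P_pred = F @ P @ F.T + Q                       # (4) covariance prediction
    z_pred = h(x_pred)                             # (5) measurement prediction
    S = H @ P_pred @ H.T + R                       # (6) innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)            # (7) gain
    x_new = x_pred + K @ (z - z_pred)              # (8) state update
    P_new = (np.eye(len(x)) - K @ H) @ P_pred      # standard covariance update
    return x_new, P_new

dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])              # position-velocity model
H = np.array([[1.0, 0.0]])                         # measure position only
Q, R = 0.01 * np.eye(2), np.array([[0.25]])
x, P = np.array([0.0, 1.0]), np.eye(2)
x, P = ekf_step(x, P, np.array([1.1]), lambda s: F @ s, F, lambda s: H @ s, H, Q, R)
```

The updated position lands between the prediction (1.0) and the measurement (1.1), weighted by the gain of Eq. (7).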
When the target is time-sensitive or the real environment changes dynamically, the quantities in the above formulas can no longer accurately reflect the target state. The right-hand side of the state update consists of three parts: the one-step prediction X̂(n+1, n), the Kalman gain K(n+1), and the innovation [z(n+1) − ẑ(n+1, n)]. When the target state changes, it is the changes in these three parts that cause the estimation error of the state equation. These parameters, which directly affect the filtering error, are therefore used as the input of the neural network, with the estimation error X − X̂(n+1, n+1) as the network output, where X represents the theoretical true value. Using the highly nonlinear mapping capability of the neural network from R^n to R^m, the extended Kalman filter estimate can be corrected to improve tracking accuracy. On this basis, the active–passive fusion system based on a BP neural-network-assisted extended Kalman filter is designed, as shown in Fig. 5.
Fig. 5 Active and passive fusion system based on BP neural-network-assisted extended Kalman filter (the active radar measurements (r, α₁, β₁) and the passive radar measurements (α₂, β₂) feed the active–passive fusion extended Kalman filter; the one-step prediction X̂(n+1, n), the gain K(n+1), and the innovation feed the BP neural network, whose estimation error corrects the filter estimate to yield the best estimate)
Active–passive fusion algorithm steps of the BP neural-network-assisted extended Kalman filter:
(1) Establish the target motion model;
(2) Establish an active extended Kalman filter module and a passive extended Kalman filter module according to the motion model;
(3) Create a BP neural network and determine its structure. The inputs of the BP network are the one-step prediction X̂(n + 1, n), the gain K(n + 1), and the innovation [ẑ(n + 1) − ẑ(n + 1, n)]; the output of the network is X − X̂(n + 1, n + 1);
(4) Initialize the BP neural network, set initial weights and thresholds, and train it with sample data;
(5) Perform active–passive fusion filtering based on the active–passive fusion algorithm;
(6) Call the BP neural network to obtain the estimation error and correct the active–passive fusion filter results.
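The BP correction in steps (3)–(6) can be sketched with a tiny network of the same shape (tansig hidden layer, purelin output). This is an illustrative numpy substitute for the paper's MATLAB training; the plain gradient-descent trainer and all dimensions are assumptions:

```python
import numpy as np

def train_bp_corrector(X, Y, n_hidden=30, lr=0.02, epochs=1000, seed=0):
    """Minimal one-hidden-layer BP network: tanh (tansig) hidden layer,
    linear (purelin) output, plain full-batch gradient descent -- a sketch
    only; the paper trains with MATLAB's traingda.
    X rows: filter quantities [X_hat(n+1,n), K(n+1), innovation];
    Y rows: estimation errors X - X_hat(n+1,n+1)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], Y.shape[1]
    W1 = rng.normal(0.0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_hidden, n_out)); b2 = np.zeros(n_out)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)              # hidden activations
        err = (H @ W2 + b2) - Y               # output error
        gW2 = H.T @ err / len(X); gb2 = err.mean(axis=0)
        dH = (err @ W2.T) * (1.0 - H ** 2)    # backprop through tanh
        gW1 = X.T @ dH / len(X); gb1 = dH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return lambda Xq: np.tanh(Xq @ W1 + b1) @ W2 + b2
```

The best estimate is then the filter estimate plus the network's predicted estimation error.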
Active–Passive Fusion Technology Based on Neural Network-Aided …
475
3 Simulation Results

The active–passive fusion algorithm of the BP neural-network-assisted extended Kalman filter is simulated under the following conditions: onboard active and passive radars track a moving target, the initial position of the tracking target is [X, Y, Z] = [10, −1500 m, 0 m], and the target moves at constant speed. The neural network is a BP neural network. In this simulation experiment, the extended Kalman estimated state is six-dimensional, the gain is 9 × 2 dimensional, and the innovation is 2 × 1 dimensional; therefore, the network input is 26-dimensional. The number of hidden layers is set to 6, with 100 neurons in each layer. The training function is the variable-learning-rate momentum gradient descent algorithm traingda, the hidden layer neuron transfer function is tansig, and the output layer transfer function is purelin. The number of Monte Carlo tests is N = 100. Figure 6 shows the results of one random Monte Carlo test: (a) and (c) show the results of active–passive fusion filtering, and (b) and (d) show the results of BP neural-network-assisted active–passive fusion filtering.
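The per-frame accuracy reported later (Fig. 8) is the root-mean-square error over the N = 100 Monte Carlo runs; a small helper, with illustrative array shapes:

```python
import numpy as np

def rmse_over_runs(estimates, truths):
    """Per-frame root-mean-square error over Monte Carlo runs.
    estimates, truths: arrays of shape (n_runs, n_frames)."""
    err = np.asarray(estimates, float) - np.asarray(truths, float)
    return np.sqrt((err ** 2).mean(axis=0))
```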
Fig. 6 Truth value, measured value, and filtered value versus frame number: a, c active–passive fusion filtering (X and Y directions); b, d BP neural-network-assisted active–passive fusion filtering (X and Y directions)
Taking the difference between the filtered value and the true value as the standard, the results of active and passive fusion filtering and BP neural network-assisted active and passive fusion filtering are compared, as shown in Fig. 7.
Fig. 7 Error comparison (X-direction and Y-direction errors in meters versus frame number, BP active–passive fusion error versus active–passive fusion error)
It can be seen from Fig. 7 that BP neural-network-assisted active–passive fusion filtering has better accuracy than active–passive fusion filtering. Figure 8 compares the accuracy in the X-direction and Y-direction of active filtering, active–passive fusion filtering, and BP neural-network-assisted active–passive fusion filtering, with root-mean-square error (RMSE) as the standard. It can be seen from Fig. 8 that BP neural-network-assisted active–passive fusion filtering has better convergence speed and convergence accuracy.
4 Conclusion

This paper proposes an active–passive fusion algorithm based on BP neural-network-assisted extended Kalman filtering. By exploiting the neural network's advantages in handling nonlinear and time-varying systems, the traditional extended Kalman filtering method is improved. Simulation comparison with the traditional active–passive fusion algorithm shows that the BP neural-network-assisted active–passive fusion filter has better convergence speed and convergence accuracy and improves the quality of active–passive data fusion.
Fig. 8 X-direction and Y-direction filter accuracy (RMSE in meters versus frame number for active filtering, active–passive fusion filtering, and BP active–passive fusion filtering)
References

1. Zhang X, Li G, Feng D (2015) Target tracking algorithm for active and passive radar seeker data fusion. J Bullets Guidance (03):5–8
2. Lu D (2005) Research on information fusion technology in active and passive radar combined guidance. Graduate School of National University of Defense Technology
3. Baier S, Spieckermann S, Tresp V (2017) Attention-based information fusion using multi-encoder-decoder recurrent neural networks
4. Cui B, Zhang J (2012) Active and passive sensor adaptive measurement fusion algorithm. Comput Eng Appl 49:23–26
5. Cui B, Zhang J, Yang Y (2009) Variable-weight data fusion from active and passive radars for target tracking. In: 2nd international workshop on computer science and engineering, Chengdu, pp 342–344
6. Yan Z (2005) Research on application of neural network for information fusion. Comput Simul 9:145–147
7. Orton M, Fitzgerald W (2002) A Bayesian approach to tracking multiple targets using sensor arrays and particle filters. IEEE Trans Signal Process 50(2):216–223
8. Cui L, Gao S, Jia H (2014) Application of neural network assisted Kalman filtering in integrated navigation. Optics Precision Eng 22(5):1304–1311
Multi-Target Infrared–Visible Image Sequence Registration via Robust Tracking

Bingqing Zhao1,2, Tingfa Xu1,3,4(B), Bo Huang1, Yiwen Chen1, and Tianhao Li1

1 School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
[email protected]
2 Shanghai Electro-Mechanical Engineering Institute, Shanghai 201109, China
3 Key Laboratory of Photoelectronic Imaging Technology and System, Ministry of Education of China, Beijing 100081, China
4 Chongqing Innovation Center, Beijing Institute of Technology, Chongqing 401120, China
Abstract. To solve the problem of multi-target depth difference in infrared and visible image sequence registration, a registration framework based on robust tracking is proposed. Firstly, the curvature scale space corners of the targets are extracted, and descriptors based on the curvature distribution are created to complete feature matching. Since different targets lie on different depth planes, a single global transformation matrix is no longer applicable, so a robust target tracking algorithm is introduced to dynamically allocate and update a matching reservoir for each target. Finally, precise registration of multiple targets is realized by computing the transformation matrix independently for each target. Experiments on a public dataset of non-planar infrared and visible image sequences show that our framework achieves lower overlapping errors and improves the accuracy of multi-target registration.

Keywords: Image sequence registration · Feature matching · Target tracking · Matching reservoir
1 Introduction

Image registration is one of the important research topics in the field of computer vision. Image sequence registration is the process of matching and superposing sequential images obtained at different times, from different sensors, and under different conditions. It is widely used in multi-source image fusion. In addition to the differences between multi-source imaging sensors, the non-planar characteristic of the scene also brings great challenges to the registration task. Especially for image sequences, when the imaging sensors are close to the targets, multiple targets are located on different depth planes. If only a single transformation matrix is used to register all the targets, the registration accuracy is reduced by the depth difference.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_66
To register infrared–visible sequential images, the near-planar algorithm proposed by Sonn et al. [1] can overcome the depth difference of moving targets to a certain extent, but it requires that multiple targets lie on the same depth plane in the scene, which does not truly solve the problem. Charles et al. [2] propose a registration framework based on shape matching, but it still cannot overcome the depth difference of multiple targets in the scene. Nguyen et al. [3] are the first to propose analyzing each target in the scene independently and calculating the transformation matrix using the foreground contours. However, one matrix is calculated for all targets, so the influence of depth difference remains. To solve this problem, we propose a multi-target infrared–visible image sequence registration framework based on robust tracking. We extract the curvature scale space (CSS) [4] corners of the targets and create curvature distribution descriptors to improve the accuracy of feature matching. Then each target is assigned a matching reservoir by tracking, and transformation matrix calculations are completed independently.
2 Related Work

Image registration methods can be divided into two types: region-based methods and feature-based methods. Region-based methods directly use the gray values of the image to establish a similarity measure function between images and select optimization strategies, such as the genetic algorithm [5] or Powell's algorithm [6], to obtain the optimal parameters of the transformation matrix. Cross-correlation methods [7] adopt the cross-correlation function between images for registration, but their computational complexity is high. Mutual information methods [6] utilize the statistical correlation between images to complete registration. For infrared–visible images, there are huge differences in gray values, and textures in visible images may be missing in infrared images, so region-based registration methods can hardly be applied in such situations. Feature-based methods do not rely on the gray values of the whole image, but first extract prominent points [4], edges [8], contours [2], etc., as the feature information. The parameters of the transformation matrix are then calculated by matching the feature information. Points are the most commonly used features. However, for infrared–visible images, the distribution of feature points is not completely consistent due to the differences in gray values, and it remains a challenge to extract and describe feature points accurately for infrared–visible image registration tasks. For infrared–visible image sequences, registration of moving targets based on motion information has become a research hotspot in recent years. Bilodeau et al. [9] propose a trajectory-based method, whose main idea is to use the trajectory consistency of moving objects and obtain the transformation matrix parameters through matched trajectory lines. Zhang et al.
[10] propose a method based on optical flow, which calculates the optical flow components of each frame, normalizes for rotation and scale invariance, and then obtains the transformation matrix from matched points. Sonn et al. [1] propose the concept of a matching reservoir in the image sequence registration process. Charles et al. [2] then use shape context descriptors of the contours to describe and match feature points and introduce a voting mechanism based on the random sample consensus algorithm to update the matching reservoir. Sun et al. [11] propose a coarse-to-fine
registration framework, which uses motion vectors to complete coarse registration and establishes a matching reservoir based on the histogram of edge gradient direction.
3 Methodology

Figure 1 shows the overall flow of the proposed registration framework. It mainly includes the following steps: foreground extraction, target matching, feature matching, and tracking-based matching reservoir allocation and update.
Fig. 1 Flow of the proposed framework
The PAWCS method [12] is used to extract the foreground of the moving targets. Its principle is to construct the background model from the color and texture information of the image and to update the model with a negative feedback strategy. Based on movement simultaneity, the tracker and the matching reservoir for each target are initialized in order of occurrence, and the correspondence of each target is then established according to its tracker number. The CSS corner detection algorithm [4] is adopted to extract the feature points of the moving targets. The foreground contours and the curvature distribution remain highly consistent between infrared and visible images. Based on this, the curvature distribution descriptor is established to describe the extracted feature points. The process is as follows:

1. With the extracted feature point as the center, a ring region is established with an interval of σh between adjacent circles, where σh is the scale factor in the CSS corner detection algorithm (σh = 3 in our framework).
Multi-Target Infrared–Visible Image Sequence Registration via …
481
2. Each circle is divided into eight subregions at 45° intervals, so each feature point neighborhood is divided into 24 subregions. Figure 2 shows the partitioned subregions, where Rn represents the nth subregion.
Fig. 2 Subregions centered on a feature point
3. Calculate the curvature value of each pixel and normalize it. The cumulative curvature of each subregion is then calculated, and the 24-dimensional curvature distribution descriptors of the feature points (of total number N) are generated:

CD_k = [R_k(1), R_k(2), ..., R_k(24)],  k = 1, 2, ..., N    (1)
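Steps 1–3 above can be sketched as follows; the contour-point and curvature inputs are illustrative, and the binning mirrors the ring/sector partition just described (not the authors' exact implementation):

```python
import numpy as np

def curvature_descriptor(points, curvatures, center, sigma_h=3):
    """24-bin curvature distribution descriptor of Eq. (1): three rings of
    width sigma_h around the feature point, each split into eight 45-degree
    sectors; normalized curvature is accumulated per subregion."""
    pts = np.asarray(points, float) - np.asarray(center, float)
    r = np.hypot(pts[:, 0], pts[:, 1])
    theta = np.mod(np.arctan2(pts[:, 1], pts[:, 0]), 2.0 * np.pi)
    ring = (r / sigma_h).astype(int)                 # which circle: 0, 1, 2
    sector = (theta / (np.pi / 4)).astype(int) % 8   # which 45-degree sector
    k = np.asarray(curvatures, float)
    if k.sum() > 0:
        k = k / k.sum()                              # normalize curvature values
    desc = np.zeros(24)
    valid = ring < 3                                 # drop points outside ring 3
    np.add.at(desc, ring[valid] * 8 + sector[valid], k[valid])
    return desc
```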
For one feature point pair in the infrared–visible images, the chi-square test statistic of the two curvature distribution descriptors is:

Cs = (1/2) Σ_{m=1}^{M} [CD^IR(m) − CD^VIS(m)]² / [CD^IR(m) + CD^VIS(m)],  M = 24    (2)
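Equation (2) and the bidirectional matching principle described next can be sketched as:

```python
import numpy as np

def chi_square_stat(cd_ir, cd_vis):
    """Chi-square test statistic of Eq. (2) between two 24-D descriptors."""
    cd_ir, cd_vis = np.asarray(cd_ir, float), np.asarray(cd_vis, float)
    num = (cd_ir - cd_vis) ** 2
    den = cd_ir + cd_vis
    terms = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    return 0.5 * float(terms.sum())

def bidirectional_matches(desc_ir, desc_vis):
    """Keep pair (i, j) only if each descriptor is the other's nearest
    neighbour under the chi-square statistic (bidirectional principle)."""
    D = np.array([[chi_square_stat(a, b) for b in desc_vis] for a in desc_ir])
    best_vis = D.argmin(axis=1)   # IR -> VIS nearest
    best_ir = D.argmin(axis=0)    # VIS -> IR nearest
    return [(i, j) for i, j in enumerate(best_vis) if best_ir[j] == i]
```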
The point pairs with the smallest chi-square test statistics are regarded as matched point pairs, and a bidirectional matching principle is adopted at the same time to reduce mismatches. Combined with the curvature distribution descriptor, a matching reservoir updating mechanism based on a Gaussian criterion is established. First, a measure function based on the Gaussian criterion is calculated for each matched point pair:

E_i = exp(−d²(H(p^IR(i)), p^VIS(i))/σ²) − λ · Cs(CD^IR(i), CD^VIS(i))    (3)

where d²(H(p^IR(i)), p^VIS(i)) represents the squared Euclidean distance between the coordinates of the point p^IR(i) after transformation by the current transformation
matrix H and the point p^VIS(i). Cs(CD^IR(i), CD^VIS(i)) represents the chi-square test statistic of the two curvature distribution descriptors calculated by Eq. (2). Normally, σ² = 100. λ represents the weight factor between the distance measure and the feature description measure (λ = 0.5 in our framework). When the matching reservoir is full, all existing pairs are divided into outliers and inliers by the K-means method [13] based on the Gaussian criterion, and one outlier is randomly replaced by the current matched point pair. Multiple targets in the scene are located on different depth planes, so the transformation matrix needs to be calculated independently for each target. The background-aware correlation filters (BACF) [14] tracker is introduced to establish the connection between foregrounds in successive frames and then complete the allocation and update of the matching reservoir. The BACF tracker is robust to deformation, scaling, and background occlusion, which makes it applicable in both infrared and visible scenes. To register multiple targets in the scene, a global transformation matrix needs to be calculated first; it provides a number of matched point pairs for initializing a new matching reservoir when a target appears, tracking fails, or a target leaves the scene. The RANSAC method [15] is applied to all matched point pairs in the global matching reservoir to obtain the global transformation matrix T_g. For each infrared foreground B_IR, if there is exactly one matching reservoir corresponding to it, the current matrix T_c is calculated by RANSAC from the matched point pairs in that reservoir. If there is no corresponding matching reservoir, tracking failure has occurred; the global transformation matrix T_g is used as its current matrix while the matching reservoir awaits re-initialization. If there are multiple corresponding matching reservoirs, occlusion has occurred; in that case, the mean T_a of the matrices calculated from the multiple matching reservoirs is used as its current matrix until the end of the occlusion.
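The measure of Eq. (3) and the reservoir replacement step can be sketched as follows; note the median split below stands in for the paper's K-means partition, and the pair representation is an assumption:

```python
import numpy as np

def gaussian_measure(p_ir_warped, p_vis, cs, sigma2=100.0, lam=0.5):
    """Measure function of Eq. (3); p_ir_warped is H(p_IR(i)) and cs is the
    chi-square statistic of the two descriptors."""
    d2 = float(np.sum((np.asarray(p_ir_warped) - np.asarray(p_vis)) ** 2))
    return float(np.exp(-d2 / sigma2) - lam * cs)

def update_reservoir(reservoir, new_pair, capacity, rng=None):
    """Reservoir update sketch: when full, split pairs into inliers/outliers
    (median split in place of K-means) and replace one random outlier."""
    if rng is None:
        rng = np.random.default_rng()
    if len(reservoir) < capacity:
        reservoir.append(new_pair)
        return reservoir
    scores = np.array([p["E"] for p in reservoir])
    outliers = np.flatnonzero(scores < np.median(scores))  # low-measure pairs
    if outliers.size:
        reservoir[int(rng.choice(outliers))] = new_pair
    return reservoir
```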
4 Experiments and Analysis

4.1 Dataset

The OTCBVS dataset provided by Bilodeau et al. [16] is selected to test the proposed registration framework. It contains four infrared–visible image sequences with non-planar characteristics. The experiments are implemented on an Intel(R) Core(TM) i5-6500 3.20 GHz CPU, 16 GB RAM, MATLAB R2016a platform. The average computing time per frame of each image sequence is about 0.1 s.

4.2 Results and Analysis

Figure 3 shows the mosaic results of the transformed infrared images and the visible images. The number of targets in the scene varies from one to five. The proposed registration framework can calculate the transformation matrix of each target, which effectively solves the problem of depth difference in non-planar scenes. When occlusion happens, multiple foregrounds are treated as one foreground and registered by one registration matrix until the end of the occlusion.
Fig. 3 Mosaic results obtained by the proposed registration framework: a OTCBVS-1, b OTCBVS-2, c OTCBVS-3, d OTCBVS-4
To quantitatively evaluate the proposed framework, the method proposed by Charles et al. [2] and the ground truth matrix constructed by manually selecting matching points are selected for experimental comparison. The overlapping error between the transformed infrared foregrounds and the visible foregrounds is used as the evaluation criterion:

ξ_OE = 1 − [Σ_{n=1}^{N} |F_n^VIS ∩ Γ(F_n^IR, T_n)|] / [Σ_{n=1}^{N} |F_n^VIS ∪ Γ(F_n^IR, T_n)|]    (4)

where F_n^VIS and F_n^IR represent the visible and infrared foregrounds, and Γ(F_n^IR, T_n) represents the coordinate interpolation with the current transformation matrix T_n. Tables 1 and 2, respectively, show the minimum and average overlapping errors of each image sequence (bold represents the best result), and Fig. 4 shows the overlapping error time curves of these methods. Combining the tables and the curves, our proposed registration framework achieves the best results in all image sequences.
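Equation (4) can be computed directly from foreground masks; a sketch under the assumption that the warped infrared masks are available as boolean arrays:

```python
import numpy as np

def overlapping_error(fg_vis, fg_ir_warped):
    """Overlapping error of Eq. (4) from per-target boolean foreground
    masks; the infrared masks are assumed already warped by their T_n."""
    inter = sum(int((v & w).sum()) for v, w in zip(fg_vis, fg_ir_warped))
    union = sum(int((v | w).sum()) for v, w in zip(fg_vis, fg_ir_warped))
    if union == 0:
        return 1.0
    return 1.0 - inter / union
```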
Table 1 Minimum overlapping errors in OTCBVS dataset

Sequence pair   Ground-truth   Ours     Charles et al.
OTCBVS-1        0.2582         0.1672   0.2300
OTCBVS-2        0.3606         0.1583   0.3263
OTCBVS-3        0.4291         0.1574   0.2981
OTCBVS-4        0.2797         0.1486   0.1978

Table 2 Average overlapping errors in OTCBVS dataset

Sequence pair   Ground-truth   Ours     Charles et al.
OTCBVS-1        0.7220         0.4577   0.7176
OTCBVS-2        0.6983         0.3918   0.7423
OTCBVS-3        0.6920         0.4497   0.5537
OTCBVS-4        0.6346         0.4288   0.5839
Fig. 4 Overlapping error time curves in OTCBVS dataset
5 Conclusion

When registering non-planar infrared–visible image sequences, multiple targets lie on different depth planes and a single global transformation matrix is no longer applicable. To solve this problem, we propose a registration framework based on robust tracking. CSS corners are extracted, and curvature distribution descriptors are created to complete the description and matching of feature points. The BACF tracking method is then introduced to dynamically allocate and update the matching reservoir for each target in the scene, and the parameters of the transformation matrix are calculated independently for each target, completing accurate registration of the images. Experiments on the OTCBVS dataset show that the proposed framework achieves the best results in all image sequences and is robust to target occlusion.

Acknowledgements. This research was supported by the Major Science Instrument Program of the National Natural Science Foundation of China under grant 61527802 and the General Program of the National Natural Science Foundation of China under grants 61371132 and 61471043.
References

1. Sonn S, Bilodeau GA, Galinier P (2013) Fast and accurate registration of visible and infrared videos. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. IEEE Press, Portland, pp 308–313
2. St-Charles PL, Bilodeau GA, Bergevin R (2015) Online multimodal video registration based on shape matching. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. IEEE Press, Boston, pp 26–34
3. Nguyen DL, St-Charles PL, Bilodeau GA (2016) Non-planar infrared-visible registration for uncalibrated stereo pairs. In: Proceedings of the IEEE conference on computer vision and pattern recognition workshops. IEEE Press, Las Vegas, pp 329–337
4. Mokhtarian F, Suomela R (1998) Robust image corner detection through curvature scale space. IEEE Trans Pattern Anal Mach Intell 20(12):1376–1381
5. Chow CK, Tsui HT, Lee T (2004) Surface registration using a dynamic genetic algorithm. Pattern Recogn 37(1):105–117
6. Thevenaz P, Unser M (2000) Optimization of mutual information for multiresolution image registration. IEEE Trans Image Process 9(12):2083–2099
7. Kim J, Fessler JA (2004) Intensity-based image registration using robust correlation coefficients. IEEE Trans Med Imaging 23(11):1430–1444
8. Kim YS, Lee JH, Ra JB (2008) Multi-sensor image registration based on intensity and edge orientation information. Pattern Recogn 41(11):3356–3365
9. Bilodeau GA, Torabi A, Morin F (2011) Visible and infrared image registration using trajectories and composite foreground images. Image Vis Comput 29(1):41–50
10. Zhang Y, Zhang X, Maybank SJ (2013) An IR and visible image sequence automatic registration method based on optical flow. Mach Vis Appl 24(5):947–958
11. Sun X, Xu T, Zhang J (2017) A hierarchical framework combining motion and feature information for infrared-visible video registration. Sensors 17(2):384
12. St-Charles PL, Bilodeau GA, Bergevin R (2015) A self-adjusting approach to change detection based on background word consensus. In: Proceedings of the IEEE winter conference on applications of computer vision. IEEE Press, Waikoloa, pp 990–997
13. Hartigan JA, Wong MA (1979) Algorithm AS 136: a K-means clustering algorithm. J R Stat Soc 28(1):100–108
14. Galoogahi HK, Fagg A, Lucey S (2017) Learning background-aware correlation filters for visual tracking. In: Proceedings of the IEEE international conference on computer vision. IEEE Press, Venice, pp 1144–1152
15. Fischler MA (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395
16. Bilodeau GA, Torabi A, St-Charles PL (2014) Thermal-visible registration of human silhouettes: a similarity measure performance evaluation. Infrared Phys Technol 64:79–86
Research on an Improvement of Images Haze Removal Algorithm Based on Dark Channel Prior Guonan Jiang, Xin Yin(B) , and Menghan Dong Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, China [email protected]
Abstract. In view of the degradation of outdoor images in hazy weather due to aerosol scattering, and the resulting reduction in information content and recognition accuracy, this paper improves the dark channel prior algorithm on the basis of an analysis of image restoration algorithms, puts forward a strategy for obtaining the atmospheric light value and the transmission based on the dark channel prior algorithm at different resolutions, and carries out related experimental research, which effectively alleviates the problems of color distortion and poor real-time performance in image haze removal.

Keywords: Image haze removal · Dark channel prior algorithm · Image restoration · Scale transformation
1 Introduction

At present, haze has become a frequent disaster weather. In foggy weather, the strong scattering and refraction of light caused by aerosol particles suspended in the air reduce the quality of outdoor images, resulting in a decline in target identification accuracy, which greatly affects outdoor monitoring, traffic safety, and automatic driving [1]. Research on implementation strategies for image haze removal algorithms has therefore become a hot issue in the field of machine vision [2]. Currently, there are two main categories of haze removal algorithms at home and abroad: one is based on image enhancement and the other on image restoration. Image-enhancement haze removal is divided into contrast enhancement and color enhancement, typically represented by histogram equalization algorithms [3–7] and Retinex algorithms [8–12]. Image-restoration haze removal is mainly divided into two categories. One is based on cloud computing and deep learning, which estimates the transmission of the original image using large amounts of data and model–data pairs of hazy and densely hazy images. With the introduction of artificial intelligence into haze removal, such algorithms have achieved good dehazing effects [13–16], but the disadvantage is that their application and generalization across different types of data remain a big unknown, so they are difficult to implement. The other method estimates the transmission based on scientific prior knowledge of images, such as the dark channel, and then obtains the haze-free image [17–20]. Among them, the dark channel prior (DCP) algorithm proposed by Kaiming He achieves a good image haze removal effect and can restore outdoor haze images with high quality.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_67

In order to visually observe the effect of different haze removal algorithms, this paper conducts extensive experiments on several classic algorithms and evaluates the results objectively. The original images selected for the experiments are from baidu.com, and the experimental results are shown in Fig. 1, which consists of two groups of pictures, each containing four pictures. From left to right are the original images, the results
Fig. 1 Haze removal results. a Original images. b Results of the DCP algorithm. c Results of Retinex algorithm. d Results of histogram equalization algorithm results
of the DCP algorithm, the results of the Retinex algorithm, and the results of the histogram equalization algorithm. The resolution of the images in the first group is 559 * 353, and that of the second group is 185 * 118. Figure 2 shows the gradient magnitude maps of the second-group images processed by the different algorithms. Table 1 gives an objective evaluation of image quality, using the most widely used indexes, for the original image and the DCP, Retinex, and histogram equalization results on the second group: peak signal-to-noise ratio, average gradient, edge strength, variance (gray scale), and information entropy; Table 1 also lists the running time of each algorithm. Among these indexes, the peak signal-to-noise ratio reflects image distortion: the greater it is, the less the image is distorted. The average gradient reflects the rate of change of the image's fine-detail contrast and indicates the relative clarity of the image. Edge strength is essentially the gradient magnitude at edge points, and the grayscale variance reflects the size of the high-frequency part of the image. Information entropy refers to the disorder of the system,
Fig. 2 Gradient magnitude map of the second group of images in Fig. 1. a The DCP algorithm. b Retinex algorithm. c Histogram equalization algorithm
and the entropy of a well-focused image is greater than that of an image without clear focus: the greater the measured information entropy, the clearer the image.

Table 1 Objective evaluation before and after processing of the second group of images (185 * 118) in Fig. 1

Evaluation index             Original image   DCP algorithm   Retinex algorithm   Histogram equalization algorithm
Peak signal-to-noise ratio   24.06            24.07           24.06               24.26
Average gradient             2.71             4.22            3.44                2.91
Edge strength                27.57            42.34           34.54               29.88
Variance (gray scale)        1631.98          2723.07         1897.17             5574.85
Information entropy          6.91             7.5573          7.29                7.96
Time-consuming (s)           –                3.05            1.75                1.25
Through the experimental comparison, from the results of the three algorithms in Fig. 1 and the gradient magnitude maps of the second group in Fig. 2, it can be seen intuitively that the image after DCP processing is clearer and the haze removal effect is better. From Table 1, we can objectively see that the average gradient and edge strength of the DCP result are larger, but the DCP algorithm takes a long time and its real-time performance is poor [21]. In view of these problems, this paper proposes a solution strategy for the high time complexity of the dark channel prior algorithm and the tendency of image color to distort after restoration.
2 Improved Dark Channel Prior Haze Removal Algorithm

2.1 Dark Channel Prior Algorithm

The dark channel prior comes from Kaiming He's statistics on the pixels and intensities of a large number of outdoor images, which led to a reasonable conclusion: most local patches in haze-free outdoor images contain some pixels with very low intensities in at least one color channel. As shown in Fig. 3, the minimum of the three channels R, G, and B of pixel X in an outdoor haze-free image J is taken as the dark channel value of pixel X, which is close to 0; that is, the intensity of the dark channel J^dark of a haze-free image J is always very low and approaches 0 [2].
Fig. 3 Kaiming He’s conclusion emoticons
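The dark channel itself is a double minimum (over channels, then over a local patch); a direct, unoptimized sketch of the definition:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel of an RGB image: per-pixel minimum over the three
    channels followed by a patch x patch minimum filter."""
    h, w, _ = img.shape
    min_c = img.min(axis=2)            # min over R, G, B
    pad = patch // 2
    p = np.pad(min_c, pad, mode="edge")
    out = np.empty((h, w), img.dtype)
    for i in range(h):
        for j in range(w):             # local minimum filter
            out[i, j] = p[i:i + patch, j:j + patch].min()
    return out
```

Production code would replace the inner loop with a grayscale erosion for speed.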
The DCP algorithm obtains the transmission rate from these results and then recovers the image based on the computed transmission rate and the estimated atmospheric light scattered during propagation. The brightness of an object in a hazy image can be seen as a linear superposition of the scattered sunlight after outdoor propagation and the incident light attenuated by atmospheric scattering during propagation; the mathematical model is:

I(X) = J(X)t(X) + a(1 − t(X))  (1)

In Eq. (1), I(X) is the observed intensity, J(X) is the value of the processed image, t(X) is the transmission rate, and a is the global atmospheric light. Here J(X) is the quantity the algorithm seeks and I(X) is the known quantity; the image can be recovered by obtaining a and then calculating t(X). a represents the maximum brightness value of the image, which is available as a known condition. Rearranging model (1) gives:

J(X) = (I(X) − a)/t(X) + a  (2)
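As a minimal illustration of the prior itself, a NumPy sketch (our own; function and parameter names are not from the paper) computes the dark channel of an RGB image as the per-pixel minimum over the color channels followed by a local minimum filter:

```python
import numpy as np

def dark_channel(img, patch=15):
    """Dark channel: min over R, G, B per pixel, then a minimum filter
    over a patch x patch neighborhood (edge-padded)."""
    r = patch // 2
    dc = img.min(axis=2)              # per-pixel min over color channels
    padded = np.pad(dc, r, mode='edge')
    out = np.empty_like(dc)
    for i in range(dc.shape[0]):
        for j in range(dc.shape[1]):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out
```

For haze-free outdoor patches the returned values are expected to be close to 0; large values in a region suggest haze (or sky).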
Then take the minimum value on both sides and apply a minimum filter over a local neighborhood. Since the neighborhood is small enough, the transmission can be treated as a constant over it,
Research on an Improvement of Images Haze Removal Algorithm …
491
and then, based on dark channel prior knowledge, the transmission rate is given by Eq. (3):

t(X) = 1 − min_Ω(min_c(I^c/a^c))  (3)

where c is one of the three channels R, G, and B of the image and Ω is an x-centered neighborhood. Image haze removal based on the dark channel prior is a mature technique, but the DCP algorithm has some shortcomings. First, as shown in Table 1, the DCP algorithm has a higher time complexity than other haze removal algorithms, because it must compute the transmission rate map when dehazing, and this step involves a large number of floating-point calculations, which lowers the processing efficiency of the whole program. In addition, the DCP algorithm cannot effectively remove haze in the sky region: the color is distorted when restoring the image, and the result often shows large areas of texture or even blocking artifacts. This is mainly because the R, G, and B pixel values in the sky region are not close to 0, which violates the dark channel prior condition proposed by He. In this paper, we optimize the traditional DCP against these defects and propose an improved dark channel prior algorithm based on different resolutions.

2.2 Dark Channel Prior Algorithm Based on Different Resolutions

2.2.1 Implementation Strategy to Reduce DCP Time Complexity

On the premise of not losing the haze removal effect, and considering that the low-resolution scale of an image preserves its main features while the high-resolution scale provides its detailed information, this paper proposes a dark channel prior algorithm based on different resolutions to improve the real-time performance of the algorithm. The transmission rate is calculated at a low-resolution scale of the image, and the output image is then restored to high resolution to obtain the dehazed image.
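A hedged sketch of this multi-resolution strategy (nearest-neighbour resampling and the omega weight from He's formulation are our illustrative choices; the paper does not fix these details):

```python
import numpy as np

def dark_channel(img, patch=15):
    """Per-pixel min over R, G, B followed by a local minimum filter."""
    r = patch // 2
    dc = img.min(axis=2)
    padded = np.pad(dc, r, mode='edge')
    out = np.empty_like(dc)
    for i in range(dc.shape[0]):
        for j in range(dc.shape[1]):
            out[i, j] = padded[i:i + patch, j:j + patch].min()
    return out

def transmission_lowres(img, a, factor=0.7, omega=0.95, patch=15):
    """Estimate t(X) = 1 - omega * dark_channel(I / a) on a downscaled
    copy of the image, then restore the map to the original size."""
    H, W = img.shape[:2]
    h, w = max(1, int(H * factor)), max(1, int(W * factor))
    ys = np.minimum((np.arange(h) / factor).astype(int), H - 1)
    xs = np.minimum((np.arange(w) / factor).astype(int), W - 1)
    small = img[ys][:, xs]                     # nearest-neighbour downscale
    t_small = 1.0 - omega * dark_channel(small / a, patch)
    ys2 = np.arange(H) * h // H                # nearest-neighbour upscale
    xs2 = np.arange(W) * w // W
    return t_small[ys2][:, xs2]
```

Since the min filter dominates the cost, computing it on the downscaled image is what yields the speed-up reported below.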
In this paper, we measured the time taken by the algorithm to process images of different resolutions before and after the improvement through a large number of experiments; Table 2 lists part of the average times taken by the different algorithms to process images of various resolutions over multiple runs. Comparing the two algorithms' running times in Table 2, when the processed image is small the two algorithms differ little in running time, and when it is large the improved algorithm runs in noticeably less time. To find an appropriate resolution reduction factor, ten groups of experiments were carried out; images within each group had the same size, while sizes differed between groups. By testing images at different reduction factors, we found that the improved algorithm's result shows no distortion and is most similar to the result of the DCP algorithm when the scale is reduced by a factor of 0.7. Figures 4 and 5 show the histograms of the results before and after the improvement. Comparing the histogram of the DCP result with that of the improved algorithm at a reduction factor of 0.7 shows that the two are similar.
Table 2 Algorithm processing time comparison

Image resolution   Run time of the DCP algorithm (s)   Run time of the improved algorithm (reduction factor 0.25) (s)
107 * 102 * 3      2.636346                            2.631605
115 * 354 * 3      2.697019                            2.219750
221 * 370 * 3      2.719445                            2.206490
364 * 480 * 3      4.326415                            2.884810
878 * 960 * 3      9.812119                            3.341906
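Timings like those in Table 2 can be collected with a simple averaging harness (a generic sketch; the paper does not describe its measurement code):

```python
import time

def avg_runtime(fn, *args, runs=5):
    """Average wall-clock time of fn(*args) over several runs."""
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs
```

Calling this with runs=50 on each test image would mirror the 50-run averages reported later for Table 3.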
Fig. 4 Histogram of the results from the DCP algorithm
Fig. 5 Histogram of the improved DCP algorithm with a reduction factor of 0.7
To show more intuitively that the result of the improved algorithm with a reduction ratio of 0.7 is close to that of the traditional DCP algorithm, the values of the two results are compared over the image region with height 450:452 and width 650:653. Comparing the data obtained by the DCP algorithm in Fig. 6 with the data obtained by the improved algorithm in Fig. 7 shows that the values of the two results are very close. To check whether the time taken to reduce the resolution of the original image would offset the gain from the optimization when the reduction factor is 0.7, the running times of the different-size images with a reduction factor of 0.7 were also measured in the experiment.

Fig. 6 Data obtained by the DCP algorithm

Fig. 7 Data obtained by the improved algorithm based on different resolutions

Table 3 shows the average time over 50 runs for the two algorithms on images of different sizes. Comparing the data in the table, when the reduction factor is 0.7 the time spent on scaling does not offset the gain from the optimization. Therefore, a reduction factor of 0.7 is selected as the best combination of speed and distortion.

Table 3 Comparison of the operation times of different size image algorithms

Image size   DCP algorithm (s)   Improved algorithm (s)
301 * 353    1.404807            1.065548
600 * 401    4.238449            2.751835
950 * 361    5.326186            2.297381
960 * 442    3.928985            2.779072
900 * 600    7.862653            4.932078
2.2.2 Solution Strategy for Color Distortion in Image Haze Removal

In the haze removal processing of target images, the DCP algorithm suffers from color distortion in grayish-white or sky areas. To address this problem, this paper optimizes the atmospheric light value and the transmission rate in the dark channel prior algorithm based on different resolutions. For the optimization of the atmospheric light value, a comparison step [22] is added: the atmospheric light parameter is taken as the maximum of the three channels per pixel of the filtered image, from which the maximum global atmospheric light value is computed. When this value is greater than a preset global atmospheric light value (240), the preset value is selected as the atmospheric light value [23]; otherwise, the average value of all points in the sky region is taken as the global atmospheric light value. Moreover, a guided filter is added to optimize the transmission rate; its smoothing effect avoids overexposure of the sky region, which improves image contrast and color saturation and enhances image detail. In addition, to eliminate the dark tone of the haze-free images produced by the algorithm and the color abnormalities in bright areas, linear color enhancement [24] is carried out as the last step of the dark channel prior algorithm based on different resolutions, so that the resulting image has a good degree of color recovery and a better sense of hierarchy.
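A minimal sketch of the two corrections just described, assuming 8-bit intensities; the stretch below is one plausible form of the paper's "linear color enhancement", and the sky-region fallback for the atmospheric light is omitted:

```python
import numpy as np

def atmospheric_light(img, cap=240):
    """Maximum over the three channels per pixel; when the global maximum
    exceeds the preset value (240), the preset value is used instead."""
    per_pixel_max = img.max(axis=2)
    return min(float(per_pixel_max.max()), float(cap))

def linear_enhance(img):
    """Linear contrast stretch of the dehazed result to [0, 255]."""
    lo, hi = float(img.min()), float(img.max())
    return (img - lo) * 255.0 / max(hi - lo, 1e-6)
```

Capping the atmospheric light at 240 prevents the brightest (often overexposed) pixel from dominating the recovery of Eq. (2).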
3 Research of the Improved Dark Channel Prior Haze Removal Algorithm

In this paper, the improved dark channel prior haze removal algorithm was run on an Intel Pentium N3700 processor with 8 GB of memory under the Windows 7 Ultimate operating system. The flowchart of the dark channel prior algorithm based on different resolutions is shown in Fig. 8. The algorithm first reduces the resolution of the original image and obtains the transmission rate of the image, then restores the transmission map to the original size. An accurate transmission rate is obtained with the guided filter, from which the haze-free image is recovered. Finally, linear color enhancement is applied to the image.
Fig. 8 Flowchart for dark channel prior algorithm based on different resolutions
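The guided filter used for transmission refinement can be sketched with a standard box-filter implementation (the radius and regularization values below are illustrative, not taken from the paper):

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1) x (2r+1) window via 2-D cumulative sums."""
    pad = np.pad(img, r, mode='edge')
    c = np.cumsum(np.cumsum(pad, axis=0), axis=1)
    c = np.pad(c, ((1, 0), (1, 0)))
    k = 2 * r + 1
    s = c[k:, k:] - c[:-k, k:] - c[k:, :-k] + c[:-k, :-k]
    return s / (k * k)

def guided_filter(guide, p, r=8, eps=1e-3):
    """Refine the rough transmission map p using the (grayscale) hazy
    image as the guide, smoothing halos while preserving edges."""
    mean_I = box_mean(guide, r)
    mean_p = box_mean(p, r)
    var_I = box_mean(guide * guide, r) - mean_I ** 2
    cov_Ip = box_mean(guide * p, r) - mean_I * mean_p
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box_mean(a, r) * guide + box_mean(b, r)
```

The box-filter formulation makes the cost independent of the window radius, which matters for the real-time goal of this paper.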
Figure 9 shows three groups of comparison images of the DCP algorithm results before and after the improvement. From left to right are the original images, the results of the DCP algorithm, the results of the dark channel prior algorithm based on different resolutions before linear color enhancement, and the results after linear color enhancement. As can be seen from Fig. 9, the sky region of the image obtained by the dark channel prior algorithm based on different resolutions shows low exposure and little color distortion after haze removal. To better evaluate the effect of the improved algorithm, two evaluation indexes, peak signal-to-noise ratio and average gradient, are adopted in this paper. Table 4 shows the indexes of the image of the DCP algorithm and that of the dark channel prior algorithm based on different resolutions in the first group of Fig. 9. Comparing the peak signal-to-noise ratio shows that the image of the dark channel prior algorithm based on different resolutions contains more effective information, and comparing the average gradient shows that it has better contrast in small details, more distinct texture variation, and a clearer image.

Fig. 9 Comparison of results before and after DCP algorithm improvement

Table 4 Haze removal evaluation indexes of the different algorithms

Evaluation index             The DCP algorithm   The improved DCP algorithm
Peak signal-to-noise ratio   23.1117             24.7420
Average gradient             13.572              13.969
4 Conclusion

In view of the problem that the recognition ability of self-driving systems decreases in haze weather, this paper studies the dark channel prior haze removal algorithm and proposes the dark channel prior algorithm based on different resolutions. Comparison of the corresponding indicators shows that haze removal using the dark channel prior algorithm based on different resolutions significantly improves the real-time performance of the algorithm without distorting the image. Furthermore, the experiments prove that the dark channel prior algorithm based on different resolutions can improve the reliability and accuracy of target recognition in haze weather. The research in this paper has a positive effect on automatic driving, transportation, and safety monitoring.
Acknowledgements. This work was supported by Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, and funding of innovation and entrepreneurship training project for Tianjin University students (20191006508).
References

1. Yu Z, Yang C (2016) Impact of haze weather on transportation and countermeasures. Sci Technol Vis 7:254–257
2. He KM, Sun J, Tang XO (2011) Single image haze removal using dark channel prior. IEEE Trans Pattern Anal Mach Intell 33(12):2341–2353
3. Lim SH, Isa NSM, Ooi CH, Toh KKV (2015) A new histogram equalization method for digital image enhancement and brightness preservation. Signal Image Video Process 9(3)
4. Wu Y (2016) Histogram equalization image haze removal technology based on HSV color model. J Yuncheng Univ 34(06):60–62
5. Li X, Wang Q, Yang H (2020) Image enhancement technology of fog based on histogram equalization. China Build Mater Sci Technol 1–2. http://kns.cnki.net/kcms/detail/11.2931.TU.20180618.1945.040.html
6. Rao BS (2020) Dynamic histogram equalization for contrast enhancement for digital images. Appl Soft Comput J 89:106114
7. Li Z, Guo Y, Ji J (2020) The design and implementation of histogram-based image defogging platform. Comput Knowl Technol 16(06):180–182
8. Wei J, Zhijie Q, Bo X, Dean Z (2018) A nighttime image enhancement method based on Retinex and guided filter for object recognition of apple harvesting robot. Int J Adv Robotic Syst 15. https://doi.org/10.1177/1729881417753871
9. Wang J, Li Y (2019) An improved image defog algorithm based on Retinex theory. Comput Knowl Technol 15(26):200–201, 203
10. Wang Y, Na Z, Han M (2019) A research on Retinex-based haze removal algorithms for color images. J Shangluo Univ 33(4):6–8, 58
11. Hu Y, Tang C, Xu M, Lei Z (2019) Selective retinex enhancement based on the clustering algorithm and block-matching 3D for optical coherence tomography images. Appl Opt 58(36)
12. Jin Z, Wu Y, Min L, Ng MK (2020) A Retinex-based total variation approach for image segmentation and bias correction. Appl Math Model 79
13. Xu C, Xu X, Jia K et al (2016) DehazeNet: an end-to-end system for single image haze removal. IEEE Trans Image Process 25(11):5187–5198
14. Xi Z (2018) The research of image dehazing algorithm. Xidian University
15. Zhang Z, Zhou W (2019) Image dehazing algorithm based on deep learning. J South China Normal Univ (Natural Sci Edn) 51(3):123–128
16. Chen Y (2019) Research and implementation of single image dehazing based on deep learning. Xidian University
17. Fu Z, Yang Y et al (2015) Improved single image dehazing using dark channel prior. J Syst Eng Electron 26(5):1070–1079
18. Wang Y, Huang T-Z et al (2020) A convex single image dehazing model via sparse dark channel prior. Appl Math Comput 375. https://doi.org/10.1016/j.amc.2020.125085
19. Liu XY, Dai SK (2015) Halo-free and color-distortion-free algorithm for image dehazing. J Image Graphics 20(11):1453–1461
20. Ti X (2018) Image dehazing algorithm based on two prior knowledge. Shandong University of Science and Technology
21. Gou X, Sun W (2019) Improved single image haze removal algorithm using dark channel prior. Comput Digital Eng 47(11):2890–2894, 2909
22. He L, Yan Z, Liu Y, Yao B (2019) An improved prior defogging algorithm based on dark channel. Comput Appl Softw 36(02):57–61
23. Wu G, Li D, Yu L, Zhou J (2019) An algorithm for sea image defogging based on dark channel prior. Opt Optoelectronic Technol 17(6):45–50
24. Wang JB, He N, Zhang LL et al (2015) Single image dehazing with a physical model and dark channel prior. Neurocomputing 149(PB):718–728
Research on the Human Vehicle Recognition System Based on Deep Learning Fusion Remove Haze Algorithm

Guonan Jiang, Xin Yin(B), and Jingyan Hu

Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, China
[email protected]
Abstract. The human vehicle identification system is a key module in the field of automatic driving, but the image quality it obtains in haze weather is poor, which makes it difficult for the system to identify targets. In view of this problem, this paper studies a human vehicle recognition system based on deep learning fused with a haze removal algorithm. The system uses the dark channel prior algorithm based on different resolutions to dehaze the image; then, to effectively detect the categories and behaviors of moving objects in the image sensing area, artificial intelligence is introduced into human vehicle recognition and a target detection algorithm based on deep learning with Kalman filtering and the Hungarian algorithm is proposed. This algorithm ensures high-speed feedback on the detected objects while improving detection accuracy, and solves the problem that small and medium targets are easily missed during detection. Finally, the feasibility of the algorithm is verified by experimental research.

Keywords: Human vehicle identification system · Deep learning · Kalman filtering algorithm · Hungarian algorithm
1 Introduction

The human vehicle recognition system is a key module in the field of automatic driving. At present, as global fog and haze weather becomes common, a large number of particles are suspended in the atmosphere under such conditions, producing a strong scattering effect on light. This blurs pedestrians and vehicles in the video collected by the human vehicle recognition system in fog and haze, and thus affects the system's ability to accurately identify moving targets. Therefore, research on target detection strategies for human vehicle recognition systems under fog and haze has become a hot topic in the field of automatic driving. At present, target detection algorithms mainly fall into three research schemes. The first is the target detection algorithm based on reinforcement learning.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_68
Its typical representative is the deep Q-learning network (DQN) [1–3]. DQN performs well with high-dimensional input but has certain limitations with high-dimensional output, and the learning and computational complexity of the algorithm are relatively high. The second scheme is the candidate-region target detection algorithm; typical networks are R-CNN [4–6] and Faster R-CNN [7–9]. Such detection methods detect targets well, but, as in the first scheme, they are time-consuming in training and testing. The third is the target detection algorithm based on regression; typical implementations are YOLO [10–12] and SSD [13, 14]. The representative among these is the YOLO series, which detects targets quickly, with a detection speed of 45 frames/s and high detection performance. It is widely used in image target detection, video target detection, real-time camera target detection, and so on, but small and medium targets are easily missed. Therefore, in view of the above problems, the second part of this paper studies the YOLO series algorithms and proposes an improvement strategy to form a human vehicle recognition system based on deep learning fused with a haze removal algorithm. In the third part, experimental research on the human vehicle recognition system is carried out to verify the effectiveness of the algorithm.
2 Human Vehicle Recognition System Based on Deep Learning Fusion Haze Remove Algorithm

Building on the deep learning fusion haze removal approach, the dark channel prior haze removal algorithm is studied [15]. Combining a regression-based detection algorithm, the Hungarian algorithm, and Kalman filtering, a new human vehicle recognition system is proposed. The system performs target detection and target tracking based on deep learning technology and effectively integrates the haze removal algorithm to improve the reliability and accuracy of target recognition in fog and haze weather and to reduce the incidence of traffic accidents. The system adopts the dark channel prior algorithm based on different resolutions to remove haze and obtain clear video images. It then applies a target detection algorithm based on deep learning with Kalman filtering and the Hungarian algorithm to detect the classes and actions of moving objects effectively and to identify targets in real time, solving the problem that small targets are easily missed during detection. The specific process of the system is shown in Fig. 1.

2.1 YOLO Algorithm

The essence of the YOLO series of target detection algorithms is regression. YOLO is a convolutional neural network that achieves end-to-end target detection and recognition. The YOLO target detection algorithm predicts multiple box positions and classes at one time; it is fast and meets the real-time requirements of the system. Fast R-CNN adopts a proposal-based training mode, and this kind of algorithm often mistakes background regions for specific targets. The YOLO series, however, does not train the network by extracting proposals but trains directly on whole images, which better distinguishes the target area from
Fig. 1 Flowchart of upper computer of human vehicle identification system based on deep learning and haze removal algorithm
the background area. However, YOLO improves detection speed at the cost of some accuracy. The YOLO detection system first resizes the image to 448 * 448, then runs the CNN, and finally applies non-maximum suppression to optimize the detection results. The YOLO detection flow is shown in Fig. 2.
Fig. 2 Process of YOLO detection system
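The non-maximum suppression step can be illustrated with a minimal greedy implementation (our own sketch; YOLO's actual implementation differs in details such as per-class handling):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda t: (t[2] - t[0]) * (t[3] - t[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thr=0.5):
    """Greedy NMS: keep the highest-scoring box, drop boxes that overlap
    it by more than thr, and repeat on the remainder."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    while order:
        i = order.pop(0)
        keep.append(i)
        order = [j for j in order if iou(boxes[i], boxes[j]) < thr]
    return keep
```

This removes duplicate detections of the same object while leaving well-separated detections untouched.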
2.2 YOLOv3 Target Detection Algorithm

The YOLOv3 target detection algorithm is the most widely used algorithm in the YOLO series today. The YOLOv3 network structure mainly comprises the Darknet53 backbone for extracting image features and the YOLO layers for multi-scale prediction. For feature extraction, YOLOv3 uses Leaky ReLU as the activation function and Darknet53 as a fully convolutional network, composed mainly of convolution layers, batch normalization, and skip connections; the introduction of skip connections strengthens convergence. Darknet53 makes full use of the residual structure of the ResNet network, enabling a very deep network while avoiding the vanishing gradient problem. In addition, compared with the ResNet architecture, Darknet53 removes the pooling layers and instead uses stride-2 convolution layers to reduce the dimension of the feature map, which better maintains information transmission. It also fully absorbs the network-in-network idea: the use of 1 * 1 convolutions not only reduces the number of parameters but also enhances feature fusion among different channels. The working principle of convolution is shown in Fig. 3: the dark green 5 × 5 grid represents the input image, the light green 5 × 5 grid represents the output image, and the blue 1 × 1 and 3 × 3 squares represent convolution kernels. Starting from the upper-left region of the input image, the convolution operation maps the upper-left 3 × 3 region of the input to the first point in the upper-left corner of the output image; sliding the kernel across the image completes the convolution operation.
Fig. 3 Schematic diagram of YOLOv3 convolution layer
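The sliding-window mapping in Fig. 3 amounts to a plain 2-D convolution; a small sketch with zero "same" padding (so a 5 × 5 input yields a 5 × 5 output, as in the figure) is:

```python
import numpy as np

def conv2d_same(x, k):
    """2-D convolution with zero 'same' padding: each output pixel is the
    weighted sum of the kernel-sized window centered on it."""
    kh, kw = k.shape
    pad = np.pad(x, ((kh // 2,), (kw // 2,)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(pad[i:i + kh, j:j + kw] * k)
    return out
```

With a 1 × 1 kernel this reduces to per-pixel scaling, which is why 1 × 1 convolutions across channels cost so few parameters while still mixing channel information.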
In addition, because an object appearing in a bounding box may belong to several categories at the same time, YOLOv3 abandons the softmax function, which can only assign the objects in a box to a single class, and adopts multi-label classification: each box uses multi-label classification to predict the classes the bounding box may contain. In training, binary cross-entropy loss is used for class prediction. For overlapping labels, the multi-label method models the data better and enhances the robustness of the target detection algorithm. YOLOv3 divides the YOLO layer into three scales, applies convolution at each scale, uses convolution kernels to realize the interaction between the feature map and local features, and outputs feature maps at three different scales. This improvement effectively enhances the detection of objects of different sizes.

2.3 Deep Learning Fusion Target Detection Algorithm Based on Kalman Filtering and Hungarian Algorithm

In this paper, the human vehicle recognition system performs target detection based on YOLOv3. During testing, it was found that the algorithm often misses or wrongly detects targets in video streams, as shown in Fig. 4. To address this problem, some scholars have in recent years proposed Kalman filtering to improve target tracking accuracy [16–20].
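The per-class sigmoid with binary cross-entropy described above can be sketched in plain Python (an illustration of the loss, not YOLOv3's actual code):

```python
import math

def multilabel_bce(logits, targets):
    """Independent sigmoid per class plus binary cross-entropy, so one
    box may carry several positive labels at once."""
    total = 0.0
    for z, t in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))      # per-class sigmoid
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(logits)
```

Unlike a softmax, the per-class probabilities need not sum to one, so "person" and "cyclist" can both be positive for the same box.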
Fig. 4 Missed detections of YOLOv3
This paper studies the regression-based target detection algorithm and proposes a fusion of YOLOv3 with Kalman filtering and the Hungarian algorithm [21]: a target detection algorithm based on deep learning fused with a Kalman filter and the Hungarian algorithm. The predicted values of the filter are matched with the detection values of the target detector, and the optimal estimate is obtained; this improves the missed-detection and false-detection problems in target detection. The proposed algorithm first initializes its parameters, then loads the trained deep learning model and reads the detection video image set; it identifies pedestrians or vehicles in the first frame, displays and records the initial center-point coordinates of each target through the bounding-box annotation, and then carries out target prediction and target-state updates. Each target can be tracked in real time from the displacement of its center point, and the type and behavior of moving objects can be detected effectively. The specific algorithm flow is shown in Fig. 5.
Fig. 5 Flowchart of target detection algorithm based on deep learning fusion Kalman filter and Hungarian algorithm
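A minimal sketch of the tracking side: a 1-D constant-velocity Kalman filter on a target's center coordinate (matrix values and noise settings are illustrative, and the Hungarian assignment between predictions and detections is omitted for brevity):

```python
import numpy as np

def kalman_track(measurements, q=1e-3, r=1.0):
    """Filter a sequence of noisy center-point measurements with a
    constant-velocity model; returns the filtered positions."""
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # state transition (pos, vel)
    H = np.array([[1.0, 0.0]])               # we observe position only
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    filtered = []
    for z in measurements:
        x = F @ x                            # predict
        P = F @ P @ F.T + q * np.eye(2)
        S = (H @ P @ H.T)[0, 0] + r          # update
        K = P @ H.T / S
        x = x + K * (z - (H @ x)[0, 0])
        P = (np.eye(2) - K @ H) @ P
        filtered.append(x[0, 0])
    return filtered
```

In the full system, a Hungarian solver such as scipy.optimize.linear_sum_assignment could match the filters' predicted centers to YOLOv3's detections in each frame, bridging gaps where a detection is missed.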
3 Experimental Research on the Human Vehicle Recognition System Based on Deep Learning Fusion Haze Removal Algorithm

In this paper, the improved dark channel prior defogging algorithm was studied in the environment of an Intel Pentium N3700 processor with 8 GB of memory. The dark channel prior algorithm based on different resolutions, fused with the Kalman filter and the Hungarian algorithm, was applied to the human vehicle recognition system, and the result was compared with the detection algorithm based on YOLOv3 alone. The average precision [22], the average false detection rate [23], and the average missed detection rate were used as evaluation indexes for
each frame of the results of the different algorithms, as shown in Table 1. From Table 1, we can see that the miss rate is reduced after the improvement. The time spent processing the video was also reduced; details of the processed video are shown in Fig. 6, and the average running time of the algorithm was 1 min 49 s.

Table 1 Average accuracy, average false detection rate, and average missed rate of the results of the YOLOv3 algorithm and the improved algorithm

Index                              YOLOv3 algorithm   Improved algorithm
Average accuracy (%)               82.6               93.2
Average false detection rate (%)   2.2                1
Average missed rate (%)            30.1               24.42
Fig. 6 Details of the video processed by the algorithm
4 Conclusion

Aiming at the reduced recognition ability of human vehicle recognition systems under haze, this paper studies the dark channel prior fog removal algorithm, the YOLOv3 algorithm, the Hungarian algorithm, and Kalman filtering, and proposes a human vehicle recognition system based on deep learning fused with a haze removal algorithm. The system uses the dark channel prior algorithm based on different resolutions to remove haze, and a target detection algorithm based on deep learning fused with Kalman filtering and the Hungarian algorithm to detect the categories and behaviors of moving objects effectively. Experiments prove that the human vehicle recognition system based on deep learning fused with a haze removal algorithm can improve the reliability and accuracy of target recognition in haze weather. The research in this paper is beneficial to automatic driving, transportation, and safety monitoring.

Acknowledgements. This work was supported by Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, and funding of innovation and entrepreneurship training project for Tianjin University students (20191006508).
References

1. Yu W, Mo F (2016) Path planning method of mobile robot based on depth automatic encoder and Q-learning. J Beijing Univ Technol 42(5):668–673
2. Zheng Q, Liu H (2019) Combining deep Q-learning and an attention model for video face recognition 4:111–115, 120
3. Lin C-J, Jhang J-Y et al (2019) Using a reinforcement Q-learning-based deep neural network for playing video games. Electronics 8(10):1128
4. Yu T (2019) Research on human behavior recognition based on smart phone. Nanjing University of Posts and Telecommunications, Jiangsu Province
5. Girshick R, Donahue J, Darrell T et al (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: 2014 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, Columbus, Ohio, USA, pp 580–587
6. Han Z, Wang C, Fu Q (2019) Arbitrary-oriented target detection in large scene SAR images. Defence Technol. https://doi.org/10.1016/j.dt.2019.11.014
7. Ren S, He K, Girshick R et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in neural information processing systems. Curran Associates Inc, Quebec, Canada, pp 91–99
8. Yang D, Huang W, Zhang Q, Li Y, Zhang Y (2020) Research on intrusion detection of platform end personnel based on Faster-RCNN. Railway Comput Appl 29(2):6–11
9. Redmon J, Divvala S, Girshick R et al (2016) You only look once: unified, real-time object detection. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR). IEEE, Las Vegas, USA, pp 779–788
10. Yang B, Zhang Y, Cao J et al (2017) On-road vehicle detection using an improved Faster RCNN framework with small-size region up-scaling strategy. In: Pacific-Rim symposium on image and video technology. Springer, Cham, pp 241–253
11. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv:1804.02767
12. Dos Reis, Welfer, De Souza, Cuadros Leite, Gamarra DFT (2019) Mobile robot navigation using an object recognition software with RGBD images and the YOLO algorithm. Appl Artif Intell 33(14):1290–1305
13. Liu W, Anguelov D, Erhan D et al (2016) SSD: single shot multibox detector. In: European conference on computer vision. Springer, Amsterdam, Netherlands, pp 21–37
14. Zhao Q (2018) Research on target detection of improved SSD. Guangxi University, Guangxi Province
15. He KM, Sun J, Tang XO (2011) Single image haze removal using dark channel prior. IEEE Trans Pattern Anal Mach Intell 33(12):2341–2353
16. Wang J (2019) Research and implementation of radar target tracking algorithm. Xi'an University of Electronic Science and Technology, Shaanxi Province
17. Sun L, Zhang X, Li Q, Liu T, Fang Z (2019) Indoor multi-pedestrian target continuous location method based on monocular vision. Sci Surveying Mapp 44(12):95–101, 133
18. Wang J, Li H (2019) Intelligent analysis of vehicle illegal behavior based on deep learning algorithm. J Shanghai Inst Shipping Sci 4(42):49–54
19. Wei R, Bao S, Xu Z, Xu F (2020) Face detection algorithm based on Kalman filter in MTCNN network. Radio Eng 50(3):193–198
20. Rahdan A, Bolandi H, Abedi M (2020) Design of on-board calibration methods for a digital sun sensor based on Levenberg–Marquardt algorithm and Kalman filterings. Chin J Aeronaut 33(1):339–351
21. Gu W (2013) Research and application of target assignment based on evolutionary Hungarian algorithm. Xi'an University of Electronic Science and Technology, Shaanxi Province
22. Du J, He N (2019) Real-time detection of road vehicles based on improved YOLOv3. Comput Eng Appl 1–9
23. Liu J, Hou S et al (2019) Real-time vehicle detection and tracking based on enhanced tiny YOLO3 algorithm. J Agric Eng 35(6):118–125
Improved Skeleton Extraction Based on Delaunay Triangulation Jiayi Wei, Yingguang Hao(B) , and Hongyu Wang Faculty of Electronic Information and Electrical Engineering, Dalian University of Technology, Dalian, China [email protected]
Abstract. The skeleton, the central axis of a target shape, is a topological representation of the shape. It has been used in image processing and pattern recognition fields such as target recognition, target matching, text recognition, blood vessel detection, and crack detection. Since Blum first used the grass-fire model to extract the skeleton as a shape descriptor, there has been a great deal of research on image skeleton extraction algorithms. In view of the problems of existing methods, such as inaccurate skeleton position, discontinuous structure, and sensitivity to noise and small deformation, this paper proposes an improved image skeleton extraction algorithm based on constrained Delaunay triangulation, which effectively improves performance by means of burr pruning, an image pyramid, and other measures. The improved method meets the requirements of object skeleton extraction in various scenes and suppresses noise well. Keywords: Skeleton extraction · Delaunay triangulation · Skeleton pruning
1 Introduction

As a shape descriptor, the skeleton has many superior characteristics. At present, image skeleton extraction algorithms are widely used in image processing and pattern recognition, mainly in the fields of target recognition, shape matching, and various kinds of target detection. In [1, 6], the skeleton is extracted as a structural feature of targets in the MPEG-7 shape dataset, and various typical targets in the dataset are then identified according to the obtained skeleton structure. In [10], the skeleton is used to identify seals, characters, and other text. In [2], the extracted image skeleton is used in research on shape matching algorithms. There is also detection of road cracks and blood vessels [11] based on skeleton extraction. According to the definition and application requirements of the skeleton, a good skeleton should have the following properties. First, the skeleton should represent the important visual parts of the object, retain the topology and geometric structure of the object, and contain the centers of the maximal disks, which can be used to reconstruct
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_69
the target. Next, the position of the skeleton should be accurate, located at the center of the contour. The skeleton should also be connected and single-pixel wide. Finally, the skeleton should be able to overcome the instability caused by noise and small deformation, and should be invariant to translation, scaling, rotation, and perspective. At present, the main methods for image skeleton extraction include thinning algorithms [4, 5, 12], methods based on distance transformation [9], methods based on mathematical morphology, and methods based on constrained Delaunay triangulation [3, 7, 8]. In recent years, convolutional neural networks have been widely studied, and skeleton extraction based on convolutional neural networks has also achieved good results [13]. In this paper, an efficient, robust, and accurate skeleton extraction algorithm is proposed, and the extracted skeleton is used as a structural feature for multi-source image registration. Based on the constrained Delaunay triangulation method, this paper analyzes the causes of burrs in the skeleton and uses a pruning algorithm to shear the burrs in the initial skeleton to obtain the final skeleton. The overall architecture of the algorithm (Fig. 1) is: original image → image preprocessing → image contour extraction → triangulation produces triangles → initial skeleton extraction → pruning algorithm → image skeleton.

Fig. 1 Overall architecture of the algorithm
2 Delaunay Triangulation Method for Skeleton Extraction

2.1 Algorithm Theory and Representation

It has been proved that constrained Delaunay triangulation (CDT) can generate a triangle subdivision of contours in digital images, and the generated triangles can approximately replace the diameters of the tangent circles of the contour [4]. As shown in Fig. 2c, a complete shape can be divided into several non-overlapping triangles by Delaunay triangulation. A triangle within the shape is called an inner triangle. The sides of an inner triangle that are shared with other inner triangles are called inner sides; otherwise, they are called outer sides. The shape skeleton in Fig. 2d is the result of classifying, analyzing, and processing the triangles in Fig. 2c. We combine the theory of [4, 7] to classify triangles. According to the different local structure information represented by an internal triangle, we divide internal triangles into three different types: end triangle (E-T), normal triangle (N-T), and junction triangle (J-T). The skeletons of the different types of triangles have different definitions. As shown in Fig. 3a, the skeleton of the end triangle (E-T) is defined as the line between the center of gravity and the midpoint of the inner edge. As shown in Fig. 3b, the skeleton
Fig. 2 Skeleton extraction based on Delaunay triangulation method: a input image b contour extraction c triangulation d final skeleton
of the normal triangle (N-T) is defined as the connection of the midpoints of the two inner edges. As shown in Fig. 3c, for the junction triangle (J-T), if it is an acute triangle, its skeleton is defined as the connections between the midpoints of its three sides and its circumcenter. If, as shown in Fig. 3d, it is a right or obtuse triangle, its skeleton is defined as the connections between the midpoints of the two sides adjacent to the obtuse or right angle and the midpoint of the third side, respectively. In the figure, the largest black points are the vertices of the triangle, the medium black points are the midpoints of the sides, the smallest black point is the circumcenter of the triangle, the black solid lines are the edges of the triangle, and the black dotted lines are the corresponding skeleton lines.
Fig. 3 Internal triangle classification and definitions of skeletons of different types of triangles: a end triangle b normal triangle c acute junction triangle d right or obtuse junction triangle
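The triangle classification and the per-type skeleton definitions above can be sketched in code. This is a minimal, self-contained sketch and not the authors' implementation: the function names and the inner-edge encoding are assumptions, and a triangle's type is inferred here from its number of inner sides (one for E-T, two for N-T, three for J-T).

```python
import math

def midpoint(a, b):
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)

def centroid(tri):
    return (sum(p[0] for p in tri) / 3.0, sum(p[1] for p in tri) / 3.0)

def circumcenter(tri):
    # Standard closed-form circumcenter of a non-degenerate triangle.
    (ax, ay), (bx, by), (cx, cy) = tri
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return (ux, uy)

def skeleton_segments(tri, inner_edges):
    """tri: three vertices; inner_edges: indices (0..2) of the sides shared
    with other inner triangles, side i joining tri[i] and tri[(i+1)%3]."""
    mids = [midpoint(tri[i], tri[(i + 1) % 3]) for i in range(3)]
    if len(inner_edges) == 1:                 # end triangle (E-T)
        return [(centroid(tri), mids[inner_edges[0]])]
    if len(inner_edges) == 2:                 # normal triangle (N-T)
        return [(mids[inner_edges[0]], mids[inner_edges[1]])]
    # junction triangle (J-T): all three sides are inner sides
    sq = [math.dist(tri[i], tri[(i + 1) % 3]) ** 2 for i in range(3)]
    longest = sq.index(max(sq))
    if sq[longest] < sum(sq) - sq[longest]:   # acute J-T
        cc = circumcenter(tri)
        return [(mids[i], cc) for i in range(3)]
    # right or obtuse J-T: join the other two midpoints to the midpoint
    # of the side opposite the right/obtuse angle
    return [(mids[i], mids[longest]) for i in range(3) if i != longest]
```

For an obtuse or right J-T the side opposite the largest angle is the longest side, which is why the sketch uses the longest side as the "third side" of the definition above.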
2.2 Skeleton Extraction Error Cause Analysis

According to the principle of extracting the skeleton by the Delaunay triangulation method, triangles should be formed between opposite sides of the contour, and the method is sensitive to noise and small deformation. As shown in Fig. 4a, when the edge of the binary image is not smooth, small triangles are easily formed on the same side during triangulation. As shown in Fig. 4b, when contour points are too close to each other, redundant small triangles are easily generated. These redundant small triangles interfere with the type determination of adjacent triangles, causing the resulting skeleton to deviate from the correct direction or creating redundant burrs. Comparing Fig. 5b, c, we can intuitively see that burrs account for a large proportion of the skeleton before pruning. By statistical analysis of Fig. 5b, the initial skeleton contains 1233 pixels, of which burrs account for 966 pixels, about 80%. Such a skeleton cannot be directly applied to image registration research, so it is necessary to prune the skeleton.
Fig. 4 Causes of errors in skeleton extraction: a the image edge is not smooth after binarization b the contour points are too close to each other
Fig. 5 Burr proportion analysis: a original drawing b pre-shear skeleton c post-shear skeleton
3 Improved Algorithm

According to the above analysis, the main reasons for burrs in the skeleton are that the edges are not smooth and that the contour points are too close to each other. For these two causes, we propose improvements, respectively. To solve skeleton errors caused by uneven edges, we adopt image pyramid and edge smoothing measures. To solve skeleton errors caused by contour points being too close together, we adopt contour sampling approximation and threshold screening. The steps of the algorithm before adding these two improvements are shown in Fig. 2. First, the input image is preprocessed, including graying, filtering, and binarization: the input image is converted to grayscale, and the grayscale image is then filtered with a Gaussian filter to suppress noise. The image is then binarized to obtain a binary image in which background pixels have value 0 and target pixels have value 1. Next, contour points are extracted from the smoothed image with the findContours() function provided by the OpenCV computer vision library. Finally, the Delaunay triangulation is generated from the obtained contour points, and the resulting triangles are connected according to the classification rules to obtain the initial skeleton of the image. The improved method of image pyramid combined with edge smoothing is that, before the image is grayed, the image is downsampled, and the image after binarization
is edge smoothed. Edge smoothing performs Gaussian filtering on the binary image and then applies a new threshold for another binarization. The contour sampling approximation method in this paper first extracts the target contour and stores it in clockwise or counterclockwise order. After the contour is obtained, equal-interval sampling is conducted clockwise or counterclockwise (an interval of 4–6 points is generally selected in this experiment) to obtain the initial points used to generate the triangulation. The steps of threshold filtering are as follows:

1. Store all triangulation results in a triangle set.
2. Calculate the perimeter Pmax of the largest triangle in the set and set the perimeter judgment threshold to (0.01–0.02) · Pmax.
3. Calculate the perimeter of each triangle and delete from the set the triangles whose perimeter is less than the threshold.
4. Classify all the remaining triangles in the set to get the final skeleton.
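The contour sampling and threshold filtering steps above can be sketched as follows. The function names and default parameter values are illustrative (the paper only specifies an interval of 4–6 points and a threshold ratio of 0.01–0.02), and this is not the authors' C++/OpenCV code.

```python
import math

def sample_contour(contour, step=5):
    # Equal-interval sampling: keep every step-th contour point
    # (the paper samples at intervals of 4-6 points).
    return contour[::step]

def perimeter(tri):
    # Sum of the three side lengths of a triangle given as vertex tuples.
    return sum(math.dist(tri[i], tri[(i + 1) % 3]) for i in range(3))

def filter_triangles(triangles, ratio=0.015):
    # Threshold screening: drop triangles whose perimeter is below
    # ratio * Pmax, with ratio chosen in the paper's 0.01-0.02 range.
    if not triangles:
        return []
    p_max = max(perimeter(t) for t in triangles)
    return [t for t in triangles if perimeter(t) >= ratio * p_max]
```

The surviving triangles would then be classified (E-T, N-T, J-T) as in Sect. 2.1 to produce the final skeleton.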
4 Experiment and Discussion

The experiments were done in a Windows environment using the OpenCV 2.4.13 computer vision library and the C++ language. Next, we test the effect of the two improvements, respectively, and show through qualitative and quantitative analysis that the skeleton obtained by the improved method is accurate and has few burrs. Figure 6 shows the experiment with the image pyramid and edge smoothing improvement. From left to right are the skeleton of the original image, the image after one downsampling…and the image after three downsamplings. When the size of the sampled image is too small (about 20 × 20 pixels), it is difficult to extract the contour of the image, so an accurate skeleton cannot be obtained. The experimental results show that the image pyramid with edge smoothing can remove burrs in the skeleton to some extent, but the remaining burrs are still too numerous for the skeleton to be used in other studies.
Fig. 6 Effect of improved method by image pyramid with edge smoothing
Then, we qualitatively and quantitatively test the effect of the contour sampling approximation and threshold filtering method on basic shapes. The smaller the MSE (mean square error), the more similar two images are; and the closer the SSIM (structural similarity index) is to 1, the more similar two images are. As given in Table 1, the skeleton we extracted is very close to the ground truth. We then carried out experiments on complex shapes and actual images; Fig. 7 shows the results. Figure 7 contains words, horses, people, bicycles, leaves,
Table 1 Quantitative analysis of the improved method (the original-image, ground-truth, and our-skeleton columns of the printed table contain pictures and are omitted here)

Run time/s   MSE        SSIM
0.97         0.964981   0.930103
0.72         2.00565    0.96871
0.42         0.783519   0.925674
0.41         1.14985    0.951679
and airplanes. From the obtained skeleton results, we can see that the results obtained by this improved algorithm retain the complete topological structure of the target with few burrs.
Fig. 7 Result on complex shapes and actual images
The above two improvements can be applied individually or in combination. The following two methods are combined to extract the skeleton of the leaf image in Fig. 6,
and the results are as follows. The combination of the two methods not only preserves the complete skeleton structure of the target, but also increases the robustness of the skeleton (Fig. 8).
Fig. 8 Skeleton results of the combination of the two methods
The two methods are likewise combined to extract the skeletons of visible and infrared images of the same scene. The left image is the skeleton of the visible image, and the right is the skeleton of the infrared image. It can be seen that the skeleton extracted by the method in this paper is correct and robust as a structural feature for multi-source image registration (Fig. 9).
Fig. 9 Multi-source image skeleton extraction results in the same scene
5 Conclusion

In this paper, we analyze the causes of burrs in the skeleton and propose two improved methods to prune the skeleton. We correct the skeleton and remove the burrs by contour approximation and perimeter threshold judgment, and increase the robustness of skeleton extraction by the image pyramid and edge smoothing. Experimental results show that the skeleton obtained by the proposed method satisfies the properties that a good skeleton should have, and the method is also highly efficient. Because of the pruning algorithm, the branches of the skeleton may be shorter than the true branches, but this does not affect recognition and matching. There are errors in reconstruction from the skeleton, but the impact is small.
References

1. Aslan C, Tari S (2011) An axis-based representation for recognition. In: Proceedings of the eighth IEEE international conference on computer vision, ICCV, pp 149–154
2. Bai X, Latecki LJ (2008) Path similarity skeleton graph matching. IEEE Trans Pattern Anal Mach Intell 30(7):1282–1292
3. Guo G, Wang X, Zhang W et al (2014) A new pruning method for medial axis of planar free-form shape. In: 2014 international conference on progress in informatics and computing (PIC). IEEE
4. Morrison P, Zou JJ (2005) An effective skeletonization method based on adaptive selection of contour points. In: Third international conference on information technology and applications, ICITA 2005. IEEE Computer Society
5. Saeed K, Tabedzki M et al (2010) K3M: a universal algorithm for image skeletonization and a review of thinning techniques. Appl Math Comput Sci 20:317–335
6. Sebastian TB, Klein PN, Kimia BB (2004) Recognition of shapes by editing their shock graphs. IEEE Trans Pattern Anal Mach Intell 26(5):550–571
7. Sintunata V, Aoki T (2017) Grey-scale skeletonization using Delaunay triangulation. In: IEEE international conference on consumer electronics-Taiwan. IEEE
8. Sintunata V, Aoki T (2017) Skeleton extraction in cluttered image based on Delaunay triangulation. In: IEEE international symposium on multimedia. IEEE
9. Wang P, Fan Z, Shiwei M (2013) Skeleton extraction method based on distance transform. In: 2013 IEEE 11th international conference on electronic measurement & instruments (ICEMI). IEEE
10. Zhang Z, Shen W, Yao C, Bai X (2015) Symmetry-based text line detection in natural scenes. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2558–2567
11. Zhang P, You X, Xu D (2013) A novel method for vessel skeleton extraction. In: 2013 international conference on machine learning and cybernetics, Tianjin, pp 118–123
12. Zhang TY, Suen CY (1984) A fast parallel algorithm for thinning digital patterns. Commun ACM 27(3):236–239
13. Zhao K, Shen W, Gao S et al (2018) Hi-Fi: hierarchical feature integration for skeleton detection
An Algorithm of Computing Task Offloading in Vehicular Network Based on Network Slice

Peng Lv1, Zhao Liu1, Yinjiang Long2(B), Peijun Chen2, and Xiang Wang3

1 School of Computer and Communication Engineering, Beijing University of Science and Technology, No. 30 Xitucheng Road, Beijing, China
2 School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, No. 10 Xitucheng Road, Beijing, China
[email protected]
3 Beijing Jianea Technology Inc, Room 603, Building Xingfa, No. 45 Zhongguancun Street, Haidian District, Beijing, China
Abstract. To meet the diversified quality of service (QoS) requirements of computing tasks in vehicular networks, this study introduces network slice technology into the RSU uplink and proposes an improved genetic algorithm. With the goal of reducing the average delay of computing task offloading in the system, the optimization function is a nonlinear programming problem, which is solved by restricting the objects of crossover and mutation to genes of the same type. Simulation results show that the improved genetic algorithm achieves good system performance and fast convergence. Keywords: Vehicular network · Network slice · Task offloading
1 Introduction

Offloading computing tasks to nearby communication entities not only compensates for the insufficient resources of a vehicle, but also ensures the timeliness of computing tasks [1]. The software-defined vehicle cloud (SDVC) and a three-layer network architecture jointly using the back-end cloud, vehicle cloud, and edge cloud increase flexibility and resource utilization [2, 3]. According to the geographic location and speed of nearby vehicles, a vehicle that needs to offload computing tasks can choose a candidate vehicle with sufficient resources and a stable distance, or roadside units (RSUs) can provide a stable resource allocation strategy for the vehicle cloud (VC) [4, 5]. To improve the computing resource utilization of vehicles, a collaborative scheduling scheme for computing task triage was proposed; this is an NP-hard problem, and a heuristic algorithm reduces the complexity of the solution [6]. To deal with frequent network topology changes and insufficient resources in vehicular ad hoc networks (VANETs), a scheduling algorithm based on the RSU cloud reduces the response time, and a specific request priority effectively reduces overhead costs and energy consumption [7]. In addition, machine-to-machine (M2M) technology, game
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_70
theory, resource sharing, and delay are also considered in computing task offloading [8–10]. Although the above studies have given many effective computing task offloading schemes, they do not take the diverse QoS requirements into account. Therefore, this study introduces network slice technology into the vehicular network, which not only meets the diversity of QoS requirements, but also increases the flexibility of resource allocation.
2 System Model

As shown in Fig. 1, considering partial offloading of computing tasks, the system consists of an RSU and vehicles traveling in two-way lanes. All vehicles have limited computing power, and the RSU is equipped with a mobile edge computing (MEC) server to provide fast and efficient computing services.
Fig. 1 V2I computing task offloading scenario
2.1 Local Offloading

Using the set I = {I_1, I_2} to indicate the lane entrances, J_i(t) indicates the number of vehicles at entrance i, which follows a Poisson process with parameter λ_i:

P(J_i(t) = γ) = ((λ_i t)^γ / γ!) e^(−λ_i t), γ = 0, 1, 2, . . . (1)
Using V_{i,j} to indicate the j-th vehicle at lane entrance i, U_{i,j} indicates the computing task of V_{i,j}, which can be denoted as U_{i,j} = (K_{i,j}, O_{i,j}), where K_{i,j} and O_{i,j} indicate the data size and delay requirement of U_{i,j}, respectively. Using M = {1, 2} to indicate the two offloading modes, m = 1 indicates local offloading and m = 2 indicates RSU offloading, and the data size in the two offloading modes can be denoted as k_{i,j}^m. Using S_{i,j}^1 to indicate V_{i,j}'s computing resource, ζ_{i,j} indicates the complexity of U_{i,j}. Therefore, the computing resources needed in the two offloading modes can be denoted as follows:

φ_{i,j}^m = ζ_{i,j} k_{i,j}^m (2)

Then, the delay of local offloading for U_{i,j} can be denoted as follows:

T_{i,j}^1 = φ_{i,j}^1 / S_{i,j}^1 (3)
2.2 RSU Offloading

RSU offloading includes two parts: computing task reception and computing task processing. Using the set N = {1, 2, . . . , N} to indicate the network slices on the RSU, whose delay requirements are denoted as a set, V_{i,j,n} indicates a vehicle under network slice n. For computing task reception, using the set C = {1, 2, . . . , C} to indicate the resource blocks (RBs) on the RSU uplink, the bandwidth of each RB is B. Using the binary variable β_{i,j,n,c} ∈ {0, 1} to indicate the actual allocation of RBs, β_{i,j,n,c} = 0 indicates RB_c is unallocated and β_{i,j,n,c} = 1 indicates RB_c is allocated to V_{i,j,n}; thereby the data rate of V_{i,j,n} can be denoted as follows:

R_{i,j,n} = Σ_{c∈C} β_{i,j,n,c} B log2(1 + |G_{i,j,n,c}|^2 P_c / (σ^2 + ω_{i,j'≠j,n,c})) (4)

where G_{i,j,n,c} indicates the channel gain, P_c indicates the transmit power, σ^2 indicates the noise power, and ω_{i,j'≠j,n,c} indicates interference from other users on RB_c. The reception time of U_{i,j} can be denoted as follows:

T_{i,j,n}^2 = k_{i,j}^m / R_{i,j,n} (5)

For computing task processing, using S^m to indicate the computing resources of the MEC server, s_{i,j,n}^m indicates the computing resource allocated to V_{i,j,n}. Then, the processing delay of U_{i,j} can be denoted as follows:

T_{i,j,n}^3 = φ_{i,j}^m / s_{i,j,n}^m (6)
However, V_{i,j,n} must receive the processing result before leaving the service range of the RSU. Using L_m to indicate the service range of the RSU, X_{i,j,n} indicates the distance that V_{i,j,n} has traveled, and v_{i,j,n} indicates the speed of V_{i,j,n}. Then, the processing delay of RSU offloading should meet the following condition:

T_{i,j,n}^2 + T_{i,j,n}^3 ≤ (L_m − X_{i,j,n}) / v_{i,j,n} (7)
The processing result of a computing task is small data, which can be neglected. To reduce the average delay of the system, the optimization function can be written as formula (8), where E[·] indicates the expectation of a random variable. η1 indicates the offloading ratio constraint, and η2 indicates the maximum delay of a computing task. η3 indicates the constraint on RBs allocated to vehicles, and η4 indicates the minimum delay of network slices. η5 indicates the constraint on computing resources allocated to vehicles, and η6 indicates that the RSU should complete the computing task before the vehicle leaves its service range.

min D = (1 / Σ_{i∈I} E[X_i(t)]) Σ_{m∈M} Σ_{i∈I} Σ_{j∈J} Σ_{n∈N} D_{i,j}

η1: Σ_{m∈M} k_{i,j}^m = K_{i,j}
η2: D_{i,j} ≤ O_{i,j}
η3: Σ_{n∈N} Σ_{c∈C} β_{i,j,n,c} = C
η4: T_{i,j,n}^4 ≤ Q_{i,j}
η5: Σ_{i∈I} Σ_{j∈J} Σ_{n∈N} s_{i,j,n}^m = S^m
η6: T_{i,j,n}^4 ≤ T_{i,j,n}^5 (8)
3 Problem Calculation

There are coupling relationships between k_{i,j}^m, β_{i,j,n,c}, and s_{i,j,n}^m in formula (8), and the constraints contain both equalities and inequalities, so the complexity of the solution is very high. A genetic algorithm does not require the objective function to be differentiable or continuous, and it has inherent implicit parallelism and good global optimization ability. In this study, an improved genetic algorithm is proposed to solve optimization problem (8).
3.1 Initialization of Population

Each chromosome consists of k_{i,j}^m, β_{i,j,n,c}, and s_{i,j,n}^m, and its length is τ = 3 · E[X_i(t)]. Using ς to indicate the population size, the matrix Φ indicates the initial population, whose columns can be denoted as follows:

Φ(:, 1 : E[X_i(t)]) — offloading ratio
Φ(:, E[X_i(t)] : 2E[X_i(t)]) — RB resource
Φ(:, 2E[X_i(t)] : 3E[X_i(t)]) — computing resource (9)

In addition, all chromosomes meet the constraint conditions η1–η6, which effectively avoids invalid solutions outside the feasible region, thereby reducing the number of iterations.

3.2 Genetic Operators

The genetic operators include three parts: the selection operator, crossover operator, and mutation operator. However, since the optimization function (8) contains three different types of resources, it is necessary to ensure the isolation and immutability of the different resources.

• Crossover operator

In this part, the object of the crossover operator is restricted to genes of the same type on the same chromosome. The steps are as follows: (1) perform the crossover operation based on the crossover probability; (2) randomly select the crossover position and determine the resource range according to formula (9); (3) randomly select a gene within this range as the crossover object and perform the crossover.
• Mutation operator

Mutation needs new rules for the mutation amounts. The steps are as follows: (1) perform the mutation operation based on the mutation probability; (2) randomly select the mutation position, determine the resource range according to formula (9), and randomly select the mutation object within this range; (3) calculate the minimum resource requirement based on η1–η6, and subtract it from the smaller resource of the two mutation genes to obtain the maximum mutation amount; (4) randomly select a number from zero to the maximum mutation amount; one gene adds this number and the other subtracts it.

3.3 Improved Genetic Algorithm

In this study, the fitness function is formula (8), and the improved genetic algorithm can be summarized as follows.

(Algorithm listing: Improved genetic algorithm)
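The segment-restricted crossover and mutation described above can be sketched as follows. This is an illustrative reimplementation, not the authors' code: it shows only that operating within one resource segment of the chromosome in Eq. (9) preserves the per-segment totals behind constraints η1, η3, and η5 (the feasibility checks against the remaining constraints are omitted).

```python
import random

def segment_bounds(pos, n):
    # Eq. (9): genes [0, n) are offload ratios, [n, 2n) RB allocations,
    # and [2n, 3n) computing-resource allocations.
    seg = pos // n
    return seg * n, (seg + 1) * n

def crossover(chrom, n, rng):
    # Swap two genes of the same resource type on the chromosome,
    # so the total of each resource segment is unchanged.
    child = list(chrom)
    pos = rng.randrange(len(child))
    lo, hi = segment_bounds(pos, n)
    other = rng.randrange(lo, hi)
    child[pos], child[other] = child[other], child[pos]
    return child

def mutate(chrom, n, rng):
    # Shift an amount between two genes of the same segment: one gains
    # what the other loses, keeping the segment total constant.
    child = list(chrom)
    pos = rng.randrange(len(child))
    lo, hi = segment_bounds(pos, n)
    other = rng.randrange(lo, hi)
    if other == pos:
        return child
    delta = rng.uniform(0.0, child[other])
    child[other] -= delta
    child[pos] += delta
    return child
```

Because both operators act inside one segment, any chromosome that starts inside the feasible region stays there with respect to the resource-sum constraints, which is why the paper's algorithm avoids generating invalid individuals.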
4 Simulation Analysis

In all simulations, there are two-way lane entrances, the computing resource of each vehicle is 5 × 10^8 cycles/s, the transmit power is 0.4 W, the vehicle flow parameter is 10, and the vehicle speed is 30 km/h. The data size of a computing task is 1 Mb, whose maximum
delay requirement is randomly allocated from {100 ms, 200 ms}, and the complexity of a computing task is 30 cycles/bit. The service range of each RSU is 200 m, the number of RBs per RSU is 30, the bandwidth is 180 kHz, and the computing resource of the MEC server is 10^10 cycles/s. In Fig. 2a, as communication resources increase, the number of RBs allocated to vehicles increases and the processing delay of computing tasks gradually decreases. The smaller the data size, the more tasks are offloaded locally, so the average delay is lower and convergence is faster. In Fig. 2b, the effect of complexity on average delay is relatively stable, and an increase in computing resources makes convergence faster.
130 120 110 100 90
80 70 60 50
100
Average Delay(ms)
Average Delay(ms)
150 140
30cycle/bit 40cycle/bit 50cycle/bit
95 90 85 80
10
12
14 RB
16
(a) RB and Data Size
18
75
1
3 4 2 Computing Resource 10 (cycle/s)
5
(b) Computing Resource and Complexity
Fig. 2 Effects of different variables on average delay
To verify the convergence of the improved genetic algorithm, the relationship between the number of iterations and the average delay is given in Fig. 3a. Because every solution obtained by the improved genetic algorithm lies in the feasible region, the delay of the feasible solutions decreases continuously as the number of iterations increases: the average delay first decreases quickly, then slows down gradually, and finally stabilizes between 80 and 100 iterations. Therefore, the improved genetic algorithm is suitable for practical scenarios. Figure 3b shows a performance comparison of different algorithms. It is easy to see that the improved genetic algorithm outperforms the partial optimization algorithms and converges faster. Optimizing only communication resources performs better than optimizing only computing resources before 1.5 × 10^10 cycles/s, after which the relationship reverses.
5 Conclusion

This study focuses on computing task offloading in vehicular networks; network slice technology and an improved genetic algorithm are used to solve the problem of diversified QoS requirements. Simulation results show that the algorithm has good convergence and system performance. In future work, we will consider the application of network slicing technology in V2V communication.
Fig. 3 Convergence and performance comparison of the algorithm: a convergence b performance comparison (comparing the improved genetic algorithm with only optimizing computing resource and only optimizing communication resource)
Acknowledgements. This project is supported by the National Science and Technology Major Project (2018ZX03001022-008).
References 1. Lyu F, Zhu HZ, Zhao HB et al (2018) SS-MAC: a novel time slot-sharing MAC for safety messages broadcasting in VANETs. IEEE Trans Veh Technol 67(4):3586–3597 2. Jang I, Choo S, Kim M et al (2017) The software-defined vehicular cloud: a new level of sharing the road. IEEE Veh Technol Mag 12(2):78–88 3. Ahmad F, Kazim M, Adnane A et al (2015) Vehicular cloud networks: architecture applications and security issues. In: IEEE/ACM 8th international conference on UCC, pp 571–576 4. Li B, Pei Y, Wu H et al (2014) Computation offloading management for vehicular ad hoc cloud. Algorithms and architectures for parallel processing. Springer International Publishing, Berlin, pp 728–739 5. Mershad K, Artail H (2013) Finding a STAR in a vehicular cloud. IEEE Intell Trans Syst Mag 55–68 6. Sun F, Fen H, Nan C et al (2018) Cooperative task scheduling for computation offloading in vehicular cloud. IEEE Trans J 67(11):11049–11061 7. Patra M, Krishnendu CTI, Konwar HN et al (2019) A cost-and-energy aware resource scheduling technique for roadside clouds. In: VTC2019-Fall, pp. 1–5 8. Yu R, Huang XM, Kang JW et al (2015) Cooperative resource management in cloud-enabled vehicular networks. IEEE Trans 62(12):7938–7951 9. Wu G, Talwar S, Johnsson K et al (2015) M2M: from mobile to embedded Internet. IEEE Commun Mag 49(4):36–43 10. Wang HS, Li X, Ji H et al (2018) Federated offloading scheme to minimize latency in MECenabled vehicular networks. In: IEEE Globecom workshops, pp 1–6
A Cluster Routing Algorithm Based on Vehicle Social Information for VANET Chenguang He1,2(B) , Guanqiao Qu1 , Liang Ye1 , and Shouming Wei1,2 1 Communications Research Center, Harbin Institute of Technology, Harbin, China
{hechenguang,guanqiaoqu,yeliang,weishouming}@hit.edu.cn 2 Key Laboratory of Police Wireless Digital Communication, Ministry of Public Security,
Beijing, People’s Republic of China
Abstract. In recent years, with the gradual rise and improvement of autonomous driving technology, research on vehicular ad hoc networks (VANETs) has gradually attracted researchers' attention. In a VANET, the vehicle nodes are networked through a routing protocol and communicate with each other over the established routes. However, the high-speed mobility of vehicle nodes causes rapid changes in the network topology, which can increase the information transmission delay. And as the number of vehicles increases, the probability of transmission collisions in the network also increases. When the delay and collisions reach a certain level, information is lost. To solve these problems, this paper proposes a cluster routing algorithm based on vehicle social information. The communication source node and the destination node communicate with each other through cluster heads. This algorithm is superior to traditional routing algorithms in terms of VANET communication performance. Keywords: VANET · Cluster routing algorithm · Vehicle social information
1 Introduction With the improvement of living conditions, people's requirements for in-car services have become more diverse, including greater demand for mobile network access. Therefore, data transmission between vehicles and mobile networks is increasing [1, 2]. Because the spectrum resources of the communication system are precious, growing communication between vehicles and base stations, or between vehicles themselves, wastes spectrum, raises network cost, and can even cause link congestion and collapse. Therefore, data aggregation is particularly important: it makes communication more efficient and greatly improves spectrum utilization [3]. Because VANET is characterized by high data capacity, low latency requirements, and rapid topology changes, the traditional routing protocols of the existing mobile ad hoc network (MANET) cannot meet its needs.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_71
The ant colony algorithm [4] is an evolutionary algorithm based mainly on the behavior of an ant colony searching for the shortest path while foraging. Existing routing protocols can be improved by applying it. Reference [5] combines the ant colony algorithm with the dynamic source routing protocol (DSR). Unlike traditional routing algorithms, the key operations in this algorithm are path discovery and maintenance. The algorithm defines two data packets based on the ant colony algorithm, namely the forward ant packet and the backward ant packet, and estimates the current network status by considering the transit time, hop counts, and Euclidean distance. Fuzzy logic [6] can reduce the data retransmission rate and improve transmission efficiency during communication; it can also select a better next node as a relay node. Reference [7] combines fuzzy logic with ad hoc on-demand distance vector routing (AODV) in the VANET. The sending node includes the current node's direction, speed, and other information in the data packet. The receiving node compares its own information with the data it received, and then the probability of a better route is calculated through the fuzzy logic system. Only the routes with higher probability values are kept during communication, until the receiving node is the destination. References [8, 9] study the cluster-based routing protocol (CBRP). It divides nodes into different clusters, and the nodes communicate through the cluster head. It only maintains the route table of one-hop or two-hop neighbors of the cluster head node, so its routing table entries are smaller than those of other routing protocols. Meanwhile, CBRP is an on-demand routing protocol, which can reduce the overhead to a certain extent. Similarly, reference [10] proposes an algorithm named mobility prediction-based clustering (MPBC). This algorithm estimates the relative speed between each node and its neighbor nodes when selecting the cluster head, and the node with the lowest ID is set to be the cluster head. Although these studies have addressed the need for low-latency, efficient, and stable routing in VANET to a certain extent, we note that the ant colony algorithm has high iteration complexity and high overhead. Also, the rules formulated by fuzzy logic are not applicable all the time. Meanwhile, the existing cluster routing algorithms consider only one kind of social information, such as the speed, which does not apply to all vehicles on the road. This paper mainly studies the traditional routing protocols in mobile ad hoc networks and proposes a cluster routing algorithm based on vehicle social information such as the distance between two vehicles, the speed, the type of vehicle, the number of neighbors, and so on.
2 Cluster Algorithm Based on Vehicle Social Information In this section, we propose a cluster algorithm based on vehicle social information, which comprehensively considers a variety of social information and elects the cluster head (CH) by calculating the weighted social information count of each node in the cluster. After selecting the CH, the remaining nodes automatically become the cluster members (CM).
The social information of a vehicle consists of the type of the vehicle, the number of its neighbors, the total distance between the vehicle node and its neighbors, and the speed of the vehicle. The specific definitions of these four kinds of vehicle social information are as follows:

Define the type of vehicle C_i as the category of vehicle i. For example, all buses on the road belong to one category, and taxis belong to another category.

Define the number of vehicle's neighbors N_{i,t} as the number of all neighbors of node i within the cluster's maximum distance threshold th at time t. The cluster maximum distance threshold th is the maximum radius centered on each node.

Define the total distance between vehicle nodes D_{i,t} as:

D_{i,t} = \sum_{j=1,\, j \neq i}^{n} \sqrt{(x_{i,t} - x_{j,t})^2 + (y_{i,t} - y_{j,t})^2}  (1)

D_{i,t} is the sum of the distances between node i and each of its neighbor nodes j at time t.

Define the speed of the vehicle M_{i,t} as:

M_{i,t} = \sqrt{(x_{i,t} - x_{i,t-tim})^2 + (y_{i,t} - y_{i,t-tim})^2} \,/\, tim  (2)

M_{i,t} is the distance moved by node i within the algorithm execution interval tim ending at time t, divided by tim.

In this algorithm, first, the vehicle types C_i of all nodes in the scene are randomly assigned to simulate different types of vehicles on the road. Second, the maximum distance threshold th of the cluster is set. When clustering, a node disconnects from any node whose distance exceeds the threshold th or whose vehicle type is different; all other nodes remain connected. After clustering, node i in the cluster sends a HELLO message to search for all neighbors within distance th; the total number of neighbors of node i is recorded as N_{i,t}. Then, the algorithm calculates the total distance between node i and all its neighbor vehicle nodes in each cluster, and the speed of node i at time t. After this, the weighted social information count w_{i,t} of each node at time t can be defined as:

w_{i,t} = \lambda_1 \times M_{i,t} + \lambda_2 \times D_{i,t} - \lambda_3 \times N_{i,t}  (3)

where \lambda_i is a normalized weight. After calculating w_{i,t}, the algorithm compares the counts of all nodes in the cluster; the node with the smallest count becomes the CH at that time, and the remaining nodes in the cluster automatically become CMs. The communication source node in the cluster regards the CH as a relay node and communicates with the destination node through the CH. After time tim has passed, the cluster algorithm is executed again and elects the CH and CMs at time t + tim.
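The clustering and CH election described above can be sketched as follows. The node records, the weight values, and the threshold are illustrative assumptions, not values from the paper; `elect_cluster_head` simply evaluates Eqs. (1)–(3) over one cluster and returns the node with the smallest weighted count.

```python
import math

# Hypothetical node record: (id, vehicle_type, x, y, prev_x, prev_y).
# The weights l1, l2, l3 and threshold th are illustrative, not from the paper.
def elect_cluster_head(nodes, th, tim, l1=0.4, l2=0.4, l3=0.2):
    """Elect the CH: the node with the smallest weighted social-information
    count w = l1*M + l2*D - l3*N (Eq. 3)."""
    def dist(a, b):
        return math.hypot(a[2] - b[2], a[3] - b[3])
    best_id, best_w = None, float("inf")
    for i in nodes:
        # Neighbors: same vehicle type and within the distance threshold th.
        neigh = [j for j in nodes
                 if j[0] != i[0] and j[1] == i[1] and dist(i, j) <= th]
        N = len(neigh)                                   # N_{i,t}
        D = sum(dist(i, j) for j in neigh)               # D_{i,t}, Eq. (1)
        M = math.hypot(i[2] - i[4], i[3] - i[5]) / tim   # M_{i,t}, Eq. (2)
        w = l1 * M + l2 * D - l3 * N                     # Eq. (3)
        if w < best_w:
            best_id, best_w = i[0], w
    return best_id

nodes = [(0, "bus", 0.0, 0.0, -1.0, 0.0),
         (1, "bus", 1.0, 0.0, -9.0, 0.0),   # fast mover -> large M, poor CH
         (2, "bus", 0.5, 0.5, 0.5, 0.4)]    # slow, central -> good CH
print(elect_cluster_head(nodes, th=2.0, tim=1.0))  # -> 2
```

The slow, centrally located node wins because both its speed term and its distance term are small, matching the intuition that a stable, central vehicle should relay traffic.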
3 Cluster Routing Algorithm Based on Vehicle Social Information In this section, we propose a cluster routing algorithm based on AODV and the cluster algorithm in the section above. After getting the cluster information from the cluster
algorithm, the source node and destination node communicate with each other by using the cluster routing algorithm through the CH. The traditional non-cluster routing algorithm AODV has four types of route messages, namely RREP, RREQ, RERR, and HELLO. When a node receives an RREQ from the communication source node, it first establishes a reverse routing path and then checks whether it is itself the destination node. If it is the destination, it sends an RREP directly. If it is a relay node with a route to the destination, it sends an RREP; otherwise, it broadcasts the RREQ.

In the cluster routing algorithm, the destination node will send an RREP only in the following situations:

(i) The node is CH, and the packet it received is from the source of the RREQ.
(ii) The node is CH, and the packet it received is from a CH.
(iii) The node is not CH, but the packet it received is from its own CH, or the packet it received is not from the source of the RREQ but from a node in the same cluster as itself.

A relay node that does not have a route to the destination will broadcast the RREQ only in the following situations; otherwise, it will send an RREP directly:

(i) The node is CH, the packet it received is from the source of the RREQ, and the source of the RREQ is a CH.
(ii) The node is CH, the packet it received is from the source of the RREQ, and the source of the RREQ is not a CH but is in the same cluster as itself.
(iii) The node is CH, and the packet it received is not from the source of the RREQ but is from a CH.
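Under one possible reading of the destination-node conditions above (the extracted list is garbled in places), the RREP decision can be sketched as a predicate. All flag names are hypothetical and the grouping of conditions is an assumption, not the paper's exact rule set.

```python
# One possible reading of the destination node's RREP rules; every flag name
# here is a hypothetical stand-in for state carried in the routing packet.
def dest_sends_rrep(node_is_ch, pkt_from_source, pkt_from_ch,
                    pkt_from_own_ch, same_cluster_as_sender):
    if node_is_ch and pkt_from_source:
        return True   # CH received the RREQ straight from the source
    if node_is_ch and pkt_from_ch:
        return True   # CH received the RREQ relayed by another CH
    if not node_is_ch and (pkt_from_own_ch or same_cluster_as_sender):
        return True   # CM reached via its own CH or an intra-cluster node
    return False

print(dest_sends_rrep(True, True, False, False, False))    # True
print(dest_sends_rrep(False, False, False, False, False))  # False
```

Encoding the rules as a pure predicate keeps the cluster logic separate from AODV's normal reverse-path bookkeeping.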
4 Simulation Results In this section, we use the cluster routing algorithm based on vehicle social information for communication and compare the results with traditional non-cluster routing algorithms. See Table 1 for the simulation parameters.

Table 1 Simulation parameter settings

Total number of nodes: 10, 20, 30, 40, 50
The type of vehicle: 3
The minimum speed of vehicle (m/s): 3
The maximum speed of vehicle (m/s): 5, 10, 15, 20
Number of communication connections: 5, 10, 15, 20, 25
The speed of a vehicle is a random number between the minimum and maximum speeds in the table. By comparing the communication performance of the cluster routing algorithm and the traditional AODV routing algorithm under the same parameters, we can observe the characteristics and advantages of the cluster routing algorithm proposed in this paper. From Figs. 1, 2, and 3, we can see that as the number of vehicles on the road gradually increases, the performance gradually deteriorates. The packet loss rate and the average transmission delay rise linearly with the number of vehicles, while the normalized routing cost rises approximately exponentially. The increase in cost indicates that more pathfinding broadcast messages are present in the scene, meaning that more vehicles place a heavier burden on pathfinding. At the same time, the increasing packet loss rate means that, due to more collisions, route messages cannot reach the destination node correctly, which increases the cost in turn. The average transmission delay represents the average time for a message to travel from the source node to the destination node. When the number of vehicles increases, the workload of each node also increases, which raises the average transmission time.
Fig. 1 Rate of packet loss of two algorithms under 15 m/s
However, comparing the communication performance of the two algorithms, the cluster routing algorithm reduces the packet loss rate by about 3% and the average transmission delay by about 0.4 s. The normalized routing cost is also effectively reduced, by about 0.6 on average. Therefore, we can conclude that this algorithm is superior to the traditional routing algorithm in terms of VANET communication performance.
Fig. 2 Average transmission delay of two algorithms under 15 m/s
Fig. 3 Rate of normalized routing cost of two algorithms under 15 m/s
5 Conclusions This paper proposes a cluster routing algorithm based on vehicle social information for VANET, to meet vehicles' needs for low-latency, efficient, and stable routing. This algorithm takes full account of various social information of vehicles on the road
and elects cluster heads as relay nodes in different clusters by calculating the weighted social information count. The source node communicates with the destination node through the cluster head. Compared with the traditional non-cluster routing algorithm, this algorithm achieves better performance when nodes communicate with each other in VANET. Acknowledgements. This paper is supported by the National Key R&D Program of China (No. 2018YFC0807101).
References 1. Abbani N, Jomaa M, Tarhini T, Artail H, El-Hajj W (2011) Managing social networks in vehicular networks using trust rules. In: 2011 IEEE symposium on wireless technology and applications (ISWTA), Langkawi, pp 168–173. https://doi.org/10.1109/ISWTA.2011.6089402 2. Liu X, Min J, Zhang X, Lu W (2019) A novel multichannel Internet of Things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970 3. Soua A, Afifi H (2013) Adaptive data collection protocol using reinforcement learning for VANETs. In: 2013 9th international wireless communications and mobile computing conference (IWCMC), Sardinia, pp 1040–1045. https://doi.org/10.1109/iwcmc.2013.6583700 4. Amudhavel J et al (2015) A robust recursive ant colony optimization strategy in VANET for accident avoidance (RACO-VANET). In: 2015 international conference on circuits, power and computing technologies (ICCPCT-2015), Nagercoil, pp 1–6. https://doi.org/10.1109/iccpct.2015.7159383 5. Rajesh Kumar M, Routray SK (2016) Ant colony based dynamic source routing for VANET. In: 2016 2nd international conference on applied and theoretical computing and communication technology (iCATccT), Bangalore, pp 279–282. https://doi.org/10.1109/icatcct.2016.7912008 6. Jadhav RS, Dongre MM, Devurkar G (2017) Fuzzy logic based data dissemination in vehicular ad hoc networks. In: 2017 international conference of electronics, communication and aerospace technology (ICECA), Coimbatore, pp 479–483. https://doi.org/10.1109/ICECA.2017.8203731 7. Feyzi A, Sattari-Naeini V (2015) Application of fuzzy logic for selecting the route in AODV routing protocol for vehicular ad hoc networks. In: 2015 23rd Iranian conference on electrical engineering, Tehran, pp 684–687. https://doi.org/10.1109/iraniancee.2015.7146301 8. Yu JY, Chong PHJ, Zhang M (2008) Performance of efficient CBRP in mobile ad hoc networks (MANETs). In: 2008 IEEE 68th vehicular technology conference, Calgary, BC, pp 1–7. https://doi.org/10.1109/VETECF.2008.18 9. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial Internet of Things. IEEE Trans Ind Inf 16(8):5379–5388 10. Ni M, Zhong Z, Zhao D (2011) MPBC: a mobility prediction-based clustering scheme for ad hoc networks. IEEE Trans Veh Technol 60(9):4549–4559. https://doi.org/10.1109/TVT.2011.2172473
LSTM-Based Channel Tracking of MmWave Massive MIMO Systems for Mobile Internet of Things Haiyan Liu1,2, Zhou Tong3(B), Qian Deng3, Yutao Zhu1, Tiankui Zhang3, Rong Huang4, and Zhiming Hu1
1 Yingtan Internet of Things Research Center, Beijing, China
2 China CITIC Bank, Beijing, China
3 Beijing University of Posts and Telecommunications, Beijing, China
[email protected]
4 Network Technology Research Institute of China Unicom, Beijing, China
Abstract. We propose a D-step forward prediction-based beam space channel tracking algorithm, which focuses on improving the accuracy of channel tracking in a multi-user massive multi-input multi-output (MIMO) system. Based on the time dependence of long short-term memory (LSTM) algorithm, the prediction results of the previous D-1 time are used as training parameters to track the beam space channel at the current time. It avoids the estimation error expanding with the increase of continuous tracking time in the process of channel tracking. Simulation results show that the proposed algorithm can effectively reduce the symbol error rates in a slowly changing environment.
1 Introduction

With the large available spectrum and wide bandwidth, millimeter-wave (mmWave) is a promising candidate for high data rate applications in future networks [1]. Due to the much smaller wavelength of mmWave, a large-scale array of antennas can be packed in a small space, making spatial processing technologies possible, such as massive MIMO and adaptive beamforming. Achieving directional beamforming requires accurate channel state information (CSI), which can be obtained by efficient channel tracking methods for the mmWave time-varying environment [2]. Beam space channel tracking can be applied in many mobile IoT scenarios, such as intelligent driving, logistics tracking, and so on. Existing beam space channel tracking algorithms for mmWave massive MIMO systems can be divided into three categories. Based on the transition probability of the beam supports, the first category of beam space channel tracking This work is supported by the Jiangxi Province key research and development program (No. 2018ABC28008). © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_72
can obtain the optimal beam support through tracking the support set. Reference [3] focused on the angle of departure (AoD) and the angle of arrival (AoA) tracking for temporally correlated sparse mmWave MIMO channels. As for the second category, accurate beam tracking can be achieved by beam training. Using convex optimization theory, Ref. [4] found the best transmit–receive beam pair that maximizes the received signal power. The third category of beam space channel tracking is to track moving directions of users by establishing a time-varying geometric model of beam support. By considering a practical user motion model, Ref. [5] excavated a temporal variation law of the physical direction between the BS and each mobile user, then utilized the obtained beam space channels in the previous time slots to predict the prior information of the beam space channel in the following time slot without channel estimation. In this paper, based on the beam space channel model in uniform rectangular array (URA), we propose a D-step forward prediction-based channel tracking algorithm for multi-user system, which avoids the tracking error expansion due to the long tracking time.
2 System Model

In this section, we describe the system model of uplink channel tracking for multiuser mmWave massive MIMO systems in time division duplexing (TDD). We assume that the base station (BS) is equipped with URAs [6] to serve K single-antenna users. The URA is shown in Fig. 1. N_v and N_h in the figure denote the rows and columns of the array, and ϕ and φ are the azimuth and elevation angles at the BS. The spacing d between two antennas is \lambda_c/2, where \lambda_c is the wavelength.
Fig. 1 Uniform rectangular array
The horizontal array response vector is given by

a(u) = \frac{1}{\sqrt{N_h}}\,[\,1,\ e^{-ju},\ \ldots,\ e^{-j(N_h-1)u}\,]^T  (1)

where u = \frac{2\pi d}{\lambda_c}\cos\phi\sin\varphi is the horizontal wave-path difference. The vertical array response vector is given by

a(v) = \frac{1}{\sqrt{N_v}}\,[\,1,\ e^{-jv},\ \ldots,\ e^{-j(N_v-1)v}\,]^T  (2)

where v = \frac{2\pi d}{\lambda_c}\sin\phi is the vertical wave-path difference. The array steering matrix of a URA is given by a(\varphi, \phi) = a(v) \otimes a(u).

We assume that the channel is a block fading channel, which does not change within one signal block but can vary from block to block. Besides, the limited number of dominant scatterers in the propagation surroundings limits the number of NLOS paths, and the gain of an NLOS element is smaller than that of the LOS element. Therefore, only the LOS ray is considered in the analysis. The time-varying geometric model of user k, at block τ, is given by

h_k(\tau) = \eta_k\, a(\varphi_k(\tau), \phi_k(\tau))  (3)

where \eta_k is the complex path gain of user k. The time-varying geometric channel model of the K users is given by H = [h_1, \ldots, h_k, \ldots, h_K]^T. According to the simplified time-varying channel model (3), the physical channel can be obtained through the angles and the gain. Therefore, the time-varying model of the physical channel can be established by the state evolution model of both the angles and the gain. Let the angle vector θ(τ) denote the set of multi-user azimuth and elevation angles. Based on the temporal correlation of the angles, the state evolution model of the angle vector is

\theta(\tau) = \theta(\tau - 1) + n_1(\tau)  (4)

where n_1(\tau) is Gaussian noise. The complex path gain \eta_k of user k is assumed to remain constant across blocks. In the mmWave system, the channel energy is concentrated on a few beams; that is, only a few beams have a large gain, and the gains of the remaining beams are small. Therefore, the mmWave beam space channel is sparse. We can select a small number of dominant beams according to the sparse beam space channel to significantly reduce the dimension of the massive MIMO system. The conventional channel (3) can be transformed to the beam space channel through a predefined analog precoding matrix, which can be represented by the discrete Fourier transform (DFT) matrix

U = \frac{1}{\sqrt{N_h N_v}}\,[\,a(\bar\varphi_1, \bar\phi_1),\ \ldots,\ a(\bar\varphi_{N_h}, \bar\phi_1),\ a(\bar\varphi_1, \bar\phi_2),\ \ldots,\ a(\bar\varphi_{N_h}, \bar\phi_{N_v})\,]  (5)

where a(\bar\varphi_m, \bar\phi_n) = a(\bar v) \otimes a(\bar u), \bar v = \frac{2\pi d}{\lambda_c}\sin\bar\phi_n, \bar u = \frac{2\pi d}{\lambda_c}\cos\bar\phi_n\sin\bar\varphi_m, \bar\varphi_m = \frac{2\pi m}{N_h} (m = 0, \ldots, N_h - 1), and \bar\phi_n = \frac{2\pi n}{N_v} (n = 0, \ldots, N_v - 1). The channel
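The URA steering construction a(ϕ, φ) = a(v) ⊗ a(u) from Eqs. (1)–(2) can be computed directly with a Kronecker product. A minimal sketch, where the angle values and function name are illustrative assumptions:

```python
import numpy as np

def ura_steering(phi_az, phi_el, Nh, Nv, d_over_lambda=0.5):
    """URA steering vector a(phi, theta) = a(v) kron a(u); u and v are the
    horizontal and vertical wave-path differences of Eqs. (1)-(2)."""
    u = 2 * np.pi * d_over_lambda * np.cos(phi_el) * np.sin(phi_az)
    v = 2 * np.pi * d_over_lambda * np.sin(phi_el)
    a_u = np.exp(-1j * u * np.arange(Nh)) / np.sqrt(Nh)  # horizontal, Eq. (1)
    a_v = np.exp(-1j * v * np.arange(Nv)) / np.sqrt(Nv)  # vertical, Eq. (2)
    return np.kron(a_v, a_u)          # length Nh*Nv, unit norm

a = ura_steering(0.3, 0.2, Nh=8, Nv=8)
print(a.shape, np.round(np.linalg.norm(a), 6))  # (64,) 1.0
```

Because both factor vectors are unit-norm, the Kronecker product is also unit-norm, which is what makes the DFT-style dictionary U of Eq. (5) an orthonormal beam codebook.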
model of the mmWave beam space can be given by

\tilde{H} = HU = [\tilde{h}_1, \ldots, \tilde{h}_K]^T  (6)

where \tilde{h}_k = h_k U is the beam space channel of user k, and the N_h N_v columns of U correspond to N_h N_v orthogonal spatial beams. Due to the sparse structure of the mmWave channel, the beam space channel \tilde{h}_k of user k is focused on a few nonzero points, which constitute the spatial support of \tilde{h}_k. The power of \tilde{h}_k is concentrated on these spatial supports. Therefore, we define the beam gains |\tilde{h}_{k,s}|, s = 1, \ldots, N_h N_v, of user k to express the signal transmission capability of the different beams. The beam with the largest gain is determined as the optimal beam. We denote by R_k the beam selection matrix of user k, which is of size N_h N_v \times 1. After beam selection, the beam space channel of user k is given by \tilde{h}_k^r = h_k U R_k.

We use the full-connection mode in hybrid beamforming and set the number of users equal to the number of RF chains. If the number of users is K, the analog beamforming matrix F is of size N_h N_v \times K, while the digital beamforming matrix W is of size K \times K. The analog beamforming vector of user k can be expressed as f_k = U R_k. Then, F = [f_1, \ldots, f_K] is the multi-user analog beamforming matrix. In digital beamforming, precoding is used to increase the signal to interference plus noise ratio (SINR) and improve the flexibility of signal transmission. The SINR of user k is given by

\mathrm{SINR}_k = \frac{\gamma_k\, |\tilde{h}_k^r w_k|^2}{1 + \sum_{i \neq k} \gamma_k\, |\tilde{h}_k^r w_i|^2}  (7)

where \gamma_k is the SNR and w_i denotes the ith column of the digital beamforming matrix W. W can be obtained through zero-forcing:

W = \frac{1}{\sqrt{P}}\, \tilde{H}_f^H (\tilde{H}_f \tilde{H}_f^H)^{-1}  (8)

where \tilde{H}_f = HF and P denotes a power constraint parameter. Suppose the transmitter sends the symbol x; after hybrid beamforming, the downlink received symbols can be expressed as

y = HFWx + n  (9)

where n \sim \mathcal{CN}(0, \sigma^2 I_K) is additive Gaussian noise.
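A minimal sketch of the zero-forcing step of Eqs. (7)–(8), assuming a square effective channel after beam selection (K users, K RF chains) and a random placeholder channel; the Frobenius normalization stands in for the 1/√P power constraint:

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4                                    # users = RF chains
# Placeholder effective channel H_f after beam selection (assumed K x K).
Hf = (rng.standard_normal((K, K)) + 1j * rng.standard_normal((K, K))) / np.sqrt(2)

# Zero-forcing precoder, Eq. (8); norm scaling stands in for 1/sqrt(P).
W = Hf.conj().T @ np.linalg.inv(Hf @ Hf.conj().T)
W /= np.linalg.norm(W)

eff = Hf @ W                             # effective channel: a scaled identity
gamma = 10.0                             # per-user SNR gamma_k (illustrative)
sinr = [gamma * abs(eff[k, k]) ** 2 /
        (1 + sum(gamma * abs(eff[k, i]) ** 2 for i in range(K) if i != k))
        for k in range(K)]               # Eq. (7); cross terms are ~0 under ZF
print(np.allclose(eff - np.diag(np.diag(eff)), 0))  # True
```

Since Hf W is proportional to the identity, the interference sum in the denominator of Eq. (7) vanishes, which is exactly the design goal of the zero-forcing precoder.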
3 Beam Space Channel Tracking
The beam space channel tracking can be regarded as the tracking of the optimal beam support, and the optimal beam of the user at the current time is tracked
by the optimal beam at the previous time. This section details the proposed long short-term memory (LSTM)-based D-step forward prediction beam space channel tracking algorithm. The beam space channel of the users at the previous D times is used as the training parameter of the LSTM network, and then the beam space channel at the current time is tracked using the time continuity of the channel parameters.
Fig. 2 LSTM memory block
As shown in Fig. 2, the basic unit in the hidden layer of an LSTM network is the memory block, which contains a cell state and three gates, namely the input gate, the output gate, and the forget gate. Each gate reads the output of the previous unit h_{\tau-1} and the unit input at that moment x_\tau to generate the value of the gate. In the proposed beam space channel tracking algorithm, \tilde{h}_{\tau-1} is the predicted value of channel tracking at the previous time, which can be expressed as \tilde{h}_{\tau-1} = H(\tau - 1). x_\tau is the expected value of the beam space channel at the current time, x_\tau = HF. The forget gate f_\tau determines how much of the previous cell state C_{\tau-1} is retained in the cell state C_\tau. It can be given by

f_\tau = \sigma(W_f \cdot [h_{\tau-1}, x_\tau] + b_f)  (10)
where W_f is the weight matrix of the forget gate, b_f denotes the forget gate bias, and \sigma is the sigmoid function. The input gate determines the amount of information added to the cell state. The input gate consists of two parts: i_\tau is a sigmoid function that determines which information needs to be updated, and \tilde{C}_\tau is a vector generated by the tanh function that determines the candidate updates. The two parts can be expressed as

i_\tau = \sigma(W_i \cdot [h_{\tau-1}, x_\tau] + b_i)  (11)
\tilde{C}_\tau = \tanh(W_C \cdot [h_{\tau-1}, x_\tau] + b_C)  (12)
W_i and W_C are the weight matrices of the two parts of the input gate, respectively, and b_i and b_C are the gate biases. The two parts jointly update the cell state: the cell state C_{\tau-1} at the last moment forms the new cell state C_\tau after passing through the forget gate and the input gate,

C_\tau = f_\tau * C_{\tau-1} + i_\tau * \tilde{C}_\tau  (13)
The output gate controls the effect of long-term memory on the current output, which can be given by

o_\tau = \sigma(W_o \cdot [h_{\tau-1}, x_\tau] + b_o)  (14)

where W_o is the weight matrix of the output gate and b_o is the output gate bias. The final output of a memory block in the LSTM is determined by the output gate and the cell state, which can be given by

h_\tau = o_\tau \cdot \tanh(C_\tau)  (15)
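The memory-block equations (10)–(15) can be written out as one NumPy step. A sketch only: the weight matrices are untrained random placeholders, and the input dimension is an assumption (the paper only fixes the hidden size, dc = 30, in Table 1).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, Wf, bf, Wi, bi, Wc, bc, Wo, bo):
    """One LSTM memory-block step following Eqs. (10)-(15)."""
    z = np.concatenate([h_prev, x])     # [h_{tau-1}, x_tau]
    f = sigmoid(Wf @ z + bf)            # forget gate, Eq. (10)
    i = sigmoid(Wi @ z + bi)            # input gate, Eq. (11)
    C_tilde = np.tanh(Wc @ z + bc)      # candidate state, Eq. (12)
    C = f * C_prev + i * C_tilde        # cell state update, Eq. (13)
    o = sigmoid(Wo @ z + bo)            # output gate, Eq. (14)
    h = o * np.tanh(C)                  # block output, Eq. (15)
    return h, C

rng = np.random.default_rng(1)
dh, dx = 30, 8                          # hidden size dc=30 (Table 1); dx assumed
W = lambda: rng.standard_normal((dh, dh + dx)) * 0.1   # placeholder weights
b = lambda: np.zeros(dh)
h, C = lstm_step(rng.standard_normal(dx), np.zeros(dh), np.zeros(dh),
                 W(), b(), W(), b(), W(), b(), W(), b())
print(h.shape, C.shape)  # (30,) (30,)
```

Chaining `lstm_step` over consecutive blocks is what the D-step predictor in Fig. 3 does, carrying (h, C) forward through the window.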
In the proposed beam space channel tracking algorithm, D-step forward prediction is performed on the beam space channel according to the characteristics of the time-varying channel, as shown in Fig. 3. The hidden layer is the LSTM memory block shown in Fig. 2. The input parameter x contains the channel expectation values of D consecutive time blocks \hat{H}(\tau - D + 1), \hat{H}(\tau - D + 2), \ldots, \hat{H}(\tau), each of which corresponds to the input of a hidden layer. The output value after the D-step prediction is the predicted value of the beam domain channel at the current time. The output parameters of the channel in each hidden layer are affected by the cell state C_\tau, the estimated value \tilde{h}_{\tau-1} of the previous channel, and the expected value x_\tau of the channel at the current time. The input parameter is the channel expectation value, which prevents the tracking error from growing over time during channel tracking. Mean-square error (MSE) loss is chosen as the loss function of the proposed algorithm. In the forward transmission process, the error between the estimated channel matrix and the expected channel matrix is obtained by the MSE loss function, E = \mathrm{MSE}(\tilde{H}(\tau)) = \mathbb{E}\,\|\tilde{H}(\tau) - \hat{H}(\tau)\|^2, where \hat{H}(\tau) is the expectation matrix of the beam space channel at block τ. The error value is used in the backpropagation training process. The error of the predicted channel accumulates as the channel tracking process moves on. To avoid serious error accumulation, the D-step forward prediction algorithm limits the length of continuous channel tracking blocks by reasonably selecting the value of D.
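The MSE training loss described above can be sketched in a couple of lines; the matrix shapes are illustrative placeholders consistent with a K-user beam space channel.

```python
import numpy as np

# MSE loss E = E[ |H_tilde - H_hat|^2 ] between the estimated and the
# expected beam space channel; shapes here (4 users, 64 beams) are assumed.
def mse_loss(H_est, H_exp):
    diff = H_est - H_exp
    return np.mean(np.abs(diff) ** 2)   # works for complex matrices

rng = np.random.default_rng(2)
H_exp = rng.standard_normal((4, 64)) + 1j * rng.standard_normal((4, 64))
print(mse_loss(H_exp, H_exp))  # 0.0 when the estimate matches the expectation
```

Taking `np.abs` before squaring keeps the loss real-valued for complex channel matrices, which is what gradient-based training needs.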
4 Simulation Results
In this section, the performance of the proposed algorithm is given by numerical simulations. We compare the proposed algorithm with ASD-based channel estimation algorithm [7]. The main simulation parameters are given in Table 1.
Fig. 3 D-step forward prediction model

Table 1 Simulation parameters

Power: 27 dBm
The number of users K: 4
Dimensions of hidden layer dc: 30
Step of forward prediction D: 3
Correlation coefficient of gain ρ: 0.995
Antenna number at BS Nrh × Nrv: 8 × 8
Fig. 4 Symbol error rate versus variation of angle
Fig. 5 Symbol error rate versus number of antennas
Figure 4 shows the symbol error rate performance of the two comparison algorithms under varying angle variation. The variation of angle determines how fast the user moves. As the variation of angle increases, the symbol error rate increases. The reason is that when the variation of angles increases, the correlation between the channels becomes weak, and it is no longer suitable to estimate the CSI at the current time from the CSI of the previous time. The simulation results show that the symbol error rate of the proposed beam space channel tracking algorithm is smaller than that of the comparison algorithm when the variation of angles for different users is the same. It can be seen that both algorithms are more suitable for channel tracking in a slowly time-varying environment than in a fast time-varying environment. The channel tracking algorithm proposed in this paper can obtain more accurate CSI in a slowly time-varying environment and improve system performance. Figure 5 illustrates the symbol error rate performance of the comparison algorithms against the number of antennas at the BS with angle variation \sigma_u^2 = (0.5\pi/180)^2. We can observe from the figure that the symbol error rate of both algorithms decreases as the number of antennas increases. When the SNR is small, the symbol error rate decreases only slightly as the number of antennas increases. Increasing the number of BS antennas can improve the transmission performance of the system. The symbol error rate of the proposed algorithm decreases more as the number of antennas increases.
5 Conclusion
In this paper, we presented a D-step forward prediction-based beam space channel tracking algorithm in mmWave massive MIMO systems for the mobile Internet of Things. Based on the LSTM, we proposed a D-step forward prediction algorithm to achieve beam space channel tracking in a time-varying environment. Simulation results showed that the proposed algorithm can effectively decrease the symbol error rate.
References 1. Andrews JG, Buzzi S, Choi W, Hanly SV, Lozano A, Soong ACK, Zhang JC (2014) What will 5G be? IEEE J Sel Areas Commun 32(6):1065–1082 2. Va V, Vikalo H, Heath RW (2016) Beam tracking for mobile millimeter wave communication systems. In: 2016 IEEE GlobalSIP 3. Duan Q, Kim T, Huang H, Liu K, Wang G (2015) AoD and AoA tracking with directional sounding beam design for millimeter wave MIMO systems. In: 2015 IEEE 26th annual international symposium on personal, indoor, and mobile radio communications (PIMRC), Hong Kong, pp 2271–2276 4. Yuan W, Armour SMD, Doufexi A (2016) An efficient beam training technique for mmWave communication under NLoS channel conditions. In: 2016 IEEE wireless communications and networking conference, Doha, pp 1–6 5. Gao X, Dai L, Zhang Y, Xie T, Dai X, Wang Z (2017) Fast channel tracking for terahertz beamspace massive MIMO systems. IEEE Trans Veh Technol 66(7):5689–5696 6. Hong W, Baek KH, Ko S (2017) Millimeter-wave 5G antennas for smartphones: overview and experimental demonstration. IEEE Trans Antennas Propag 65(12):6250–6261 7. Gao X, Dai L, Han S, Chih-Lin I, Adachi F (2016) Beamspace channel estimation for 3D lens-based millimeter-wave massive MIMO systems. In: 2016 8th international conference on wireless communications & signal processing (WCSP), Yangzhou, pp 1–5
A Polarization Diversity Merging Technique for Low Elevation Frequency Hopping Signals Gu Jiahui1(B) , Wang Bin2 , Liu Yang2 , and Liu Xin2 1 CAST504, Xi’an 710071, China
[email protected] 2 Xi’an University of Science and Technology, Xi’an 710054, China
Abstract. In the process of wireless signal transmission at low elevation angle, multipath effect is easy to be formed due to the reflection signal of the ground, which can cause signal distortion and interruption. Polarization diversity merging technique is an effective way to combat signal fading, which can improve the quality of wireless communication without increasing transmission power and bandwidth. However, the diversity signals merged with different frequency and phase will cause mutual interference. In the frequency hopping mode, residence time of each frequency hopping point is short and the closed-loop control method is unsuitable to rectify the deviations of each hop. To solve this problem, an open-loop control method is proposed in this paper, which can correct and merge the frequency deviation of polarization diversity signals in real time. It has the advantages of simple structure and fast processing speed, and is very suitable for the reception of low elevation frequency hopping signals. Keywords: Frequency hopping · Polarization diversity merging · Open-loop control method
1 Introduction
The diversity merging technique is an effective way to resist signal fading and improve receive reliability; it requires no training sequence and improves the quality of wireless communication without increasing transmission power or bandwidth [1–3]. Polarization diversity merging at low elevation angles counteracts channel fading by exploiting antenna diversity gain. In frequency hopping mode, the residence time at each frequency hopping (FH) point is very short, and the phase and frequency offsets in different frequency bands are independent [4, 5]. In the traditional closed-loop control mode, the offset branch has a processing delay relative to the reference branch, which may cause mutual interference when signals on different hopping frequencies are merged. In this paper, an open-loop control
Funding: National Key R&D Program of China (2018YFC0808301); National Natural Science Foundation of China (61801371); Natural Science Basic Research Program of Shaanxi (2018JM5052); Doctoral Startup Fund of Xi'an University of Science and Technology (2018QDJ028)
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_73
A Polarization Diversity Merging Technique …
539
method is proposed to solve this problem. By taking the fast Fourier transform (FFT) of the two branch signals, the deviation between the diversity signals is eliminated in real time: the offset is removed simply by conjugate multiplication of the two branch signals.
2 Polarization Diversity Merging
In a Gaussian channel, after correction of the frequency and phase deviations, the left- and right-hand polarized signals are

S_L = a_n e^{j2π f_0 t} + N_l    (1)

S_R = a_n e^{j2π f_0 t} + N_r    (2)
The signal-to-noise ratio of each branch is E_b/N_0 = a_n²/σ², where σ² is the noise variance of the Gaussian channel. After the two branches are combined with equal weights, we obtain

S = S_L + S_R = 2 a_n e^{j2π f_0 t} + N_l + N_r    (3)
After the polarization combination, the signal-to-noise ratio is

(E_b/N_0)_S = 4a_n²/(2σ²) = 2 (E_b/N_0)

With weighted polarization merging, the weights are selected according to the fading of the two channels: when one channel performs extremely poorly due to deep fading, the other channel can be weighted more heavily to improve the gain [6]. As shown in Fig. 1, the receiver of the FH system adopts the polarization combination method to improve the output SNR. Compared with a single-channel FH receiver, a gain of up to 3 dB can be obtained, while the multipath effect is also suppressed.
Fig. 1 Comparison of output SNR between diversity merging channels and single channel
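The derivation in Eqs. (1)–(3) can be checked numerically: the coherent signals add in amplitude (power ×4) while the independent noises add in power (×2), giving the 3 dB gain shown in Fig. 1. The sketch below uses illustrative amplitude and noise values that are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
a, sigma = 1.0, 0.5                          # amplitude a_n and per-branch noise std

s = a * np.exp(2j * np.pi * 0.01 * np.arange(n))     # common signal a_n e^{j2 pi f0 t}
def noise():                                          # complex Gaussian, power sigma^2
    return sigma * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
Nl, Nr = noise(), noise()

snr_single = a**2 / np.mean(np.abs(Nl) ** 2)          # Eb/N0 = a_n^2 / sigma^2
snr_merged = np.mean(np.abs(2 * s) ** 2) / np.mean(np.abs(Nl + Nr) ** 2)  # Eq. (3)
gain_db = 10 * np.log10(snr_merged / snr_single)
print(round(float(gain_db), 2))                       # ~3 dB diversity gain
```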
540
G. Jiahui et al.
3 Open-Loop Controlled Method
3.1 Closed-Loop Controlled Method
The combination mode of existing polarization diversity is shown in Fig. 2. The left-handed loop is composed of a channel phase discriminator, a loop filter amplifier, and a phase-locked VCO. The right-handed branch signal S_2(t) is taken as the reference signal, and the left-handed branch signal S_1(t) is aligned with it by closed-loop control in order to complete the synchronous acquisition of the signal.
Fig. 2 Combination mode of the existing polarization diversity [7]
When the two branch signals are merged, the frequency and phase differences between the two channels must be eliminated in time; otherwise, the two signals not only bring no gain but also degrade the receive performance. In a fixed-frequency communication system, the processing delay can be corrected based on the frequency and phase deviation estimated from the previous signal. In FH mode, however, each hop falls on a different frequency band, so the deviation estimate extracted from the previous hop does not apply to the next hop. Because the residence time of the FH signal is short, the closed-loop feedback control method cannot correct the frequency and phase deviations in time. Therefore, the existing technique is not suitable for polarization diversity merging of frequency hopping signals.
3.2 Open-Loop Controlled Method in FH System
After un-hopping, the signal becomes a narrow-band signal. The processing flow at the receiver includes un-hopping, filtering and decimation, and diversity synthesis, so as to achieve the best transmission performance [8]. The functional block diagram of FH polarization synthesis is shown in Fig. 3. The signals received by left- and right-polarized antennas 1 and 2 enter the AD sampling of their respective branches and
obtain R_r and R_l, respectively. f_c denotes the current hopping frequency, and S_r and S_l denote the right-handed and left-handed frequency hopping signals after un-hopping:

R_l = S_l e^{j2π f_c t}    (4)

R_r = S_r e^{j2π f_c t}    (5)
Fig. 3 Functional block diagram of FH polarization synthesis
After digital down-conversion, the sampling rate is reduced to four times the symbol rate by the decimation and filtering algorithm [9]. The frequency control word of the hopping local oscillator (LO) comes from the time-of-day (TOD) information extracted from the demodulated signal. After matched filtering, S_l and S_r are obtained:

S_l = a_n e^{j[2π(f_0 + f_d)t + φ]}    (6)

S_r = a_n e^{j2π f_0 t}    (7)

Here f_0 is the starting frequency deviation, f_d is the frequency deviation of the left-handed signal relative to the right-handed signal, and φ is the phase deviation of the left-handed signal relative to the right-handed signal.
The signal S_l and the conjugate of S_r are multiplied:

R_n = S_l · S_r* = a_n e^{j[2π(f_0 + f_d)t + φ]} · a_n e^{−j2π f_0 t}    (8)

which, after normalization, simplifies to

R_n = e^{j[2π f_d t + φ]}    (9)
As shown in Fig. 4, the signal S_l is then multiplied by the conjugate of R_n:

S_L = S_l · R_n* = a_n e^{j[2π(f_0 + f_d)t + φ]} · e^{−j[2π f_d t + φ]} = a_n e^{j2π f_0 t}    (10)
Fig. 4 Open-loop control method diagram
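The open-loop discriminator of Eqs. (8)–(10) can be sketched as follows: the conjugate product R_n strips the common carrier, the FFT peak of R_n yields the frequency deviation f_d, and multiplying S_l by the conjugate of the reconstructed offset aligns the two branches. The sample rate, offsets, and FFT size below are illustrative assumptions, not values from the paper.

```python
import numpy as np

fs, n = 1.024e6, 4096                    # assumed sample rate and FFT size
t = np.arange(n) / fs
f0, fd, phi = 50e3, 3e3, 0.7             # assumed f0, offset f_d (on an FFT bin), phase
Sl = np.exp(1j * (2 * np.pi * (f0 + fd) * t + phi))      # Eq. (6)
Sr = np.exp(1j * 2 * np.pi * f0 * t)                     # Eq. (7)

Rn = Sl * np.conj(Sr)                    # Eq. (8): residual e^{j(2 pi f_d t + phi)}
fd_hat = np.fft.fftfreq(n, 1 / fs)[np.argmax(np.abs(np.fft.fft(Rn)))]
phi_hat = np.angle(Rn[0])                # phase offset read off at t = 0

SL = Sl * np.exp(-1j * (2 * np.pi * fd_hat * t + phi_hat))   # Eq. (10)
err = float(np.max(np.abs(SL - Sr)))
print(fd_hat, err)                       # f_d recovered; branches aligned
```

Because the estimate is obtained in a single open-loop pass over one hop, no feedback settling time is needed, which is the point of the method for short FH dwell times.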
After correction of the frequency and phase we obtain the left-handed branch S_L, while the right-handed branch remains unchanged, S_R = S_r. A maximal-ratio combiner then merges the two branch signals:

S = S_R C_R + S_L C_L    (11)

where S_R and S_L are the corrected right-handed and left-handed signals, respectively, and C_R and C_L are their weighting factors. AGC_R is the right-handed branch gain control coefficient and AGC_L is the left-handed branch gain control coefficient [10]:

C_R = 1/(1 + k),  C_L = 1 − C_R,  k = AGC_R/AGC_L    (12)
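A minimal sketch of the weighting rule in Eq. (12): the branch that needs the larger AGC gain (i.e., the more deeply faded branch) receives the smaller combining weight. The AGC values here are illustrative.

```python
def combine_weights(agc_r, agc_l):
    """Eq. (12): C_R = 1/(1 + k), C_L = 1 - C_R, with k = AGC_R / AGC_L."""
    k = agc_r / agc_l
    c_r = 1.0 / (1.0 + k)
    return c_r, 1.0 - c_r

# Right-handed branch in a deep fade needs 4x the AGC gain of the left branch,
# so it is weighted down in the combined output S = S_R*C_R + S_L*C_L.
c_r, c_l = combine_weights(4.0, 1.0)
print(c_r, c_l)   # 0.2 0.8
```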
The symbol clock is recovered by a squaring clock-error estimation algorithm. The signal S is sampled at four times the symbol rate; squaring removes the modulation and exposes the symbol-rate component, whose phase equals the phase difference between the sampling clock and the symbol clock. The code clock is then restored according to this phase error. An open-loop interpolation filter operating at four times the symbol rate selects the optimal sampling point and recovers the symbol information.
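The squaring timing estimator described above (square timing recovery of Oerder and Meyr [9]) can be sketched as follows; the synthetic test signal is an illustrative assumption, not the FH waveform of the paper.

```python
import numpy as np

def square_timing_estimate(x, sps=4):
    """Square timing recovery (Oerder & Meyr [9]): squaring the signal sampled
    at sps samples/symbol exposes a spectral line at the symbol rate; the phase
    of that line gives the offset between sampling clock and symbol clock."""
    k = np.arange(len(x))
    c = np.sum(np.abs(x) ** 2 * np.exp(-2j * np.pi * k / sps))
    return float(-np.angle(c) / (2 * np.pi))     # timing offset in symbol periods

# Synthetic check: an envelope whose squared magnitude peaks 1 sample
# (= 0.25 symbol at 4 samples/symbol) late.
k = np.arange(64)
x = np.sqrt(1.0 + np.cos(2 * np.pi * (k - 1) / 4))
print(round(square_timing_estimate(x), 3))       # 0.25
```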
PM differential demodulation removes the modulation from the symbols, and the hopping-point information is extracted from the symbols and fed back to the two down-conversion modules to complete the un-hopping loop. As shown in Fig. 5, the FH system dwells for 1 ms at each frequency point. In the closed-loop mode, the phase error of each hop is affected by the estimation error of the previous hop; in the open-loop mode it is not, and the error converges quickly.
Fig. 5 Comparison of phase deviation between the open-loop and closed-loop control in the FH system
4 Conclusion
This paper proposes a polarization diversity merging method for frequency hopping signals. During un-hopping, the two branch signals become narrow-band signals. The open-loop controlled method corrects the frequency and phase deviations of the two signals in real time, and the weighted combination brings about 3 dB of gain. Compared with the closed-loop control method, it has the advantages of a simple structure and fast processing, and is well suited to the reception of frequency hopping telemetry signals, achieving the best transmission SNR. In low elevation frequency hopping anti-jamming communication systems, the proposed method realizes fast FH phase and frequency synchronization and has broad application prospects.
References
1. Hammersley TG (2011) Polarization diversity. US patent 7933363 B2
2. Dong L, Choo H, Heath RW, Ling H (2005) Simulation of MIMO channel capacity with antenna polarization diversity. IEEE Trans Wirel Commun 4(4):1869–1873
3. Simon MK, Alouini M-S (2000) Digital communication over fading channels. Wiley
4. Bird WE, Triggs A (2000) Frequency hopping. US patent 6128327 A
5. Keller C, Pursley M (1987) Diversity combining for channels with fading and partial-band interference. IEEE J Sel Areas Commun 5(2):248–260
6. Sergienko AB, Sylka SS (2019) Bit error rate of FQAM transmission in fast fading channel. In: 2019 IEEE conference of Russian young researchers in electrical and electronic engineering (EIConRus), Saint Petersburg and Moscow, Russia, pp 33–38
7. Li X, Liu X, Wang Y et al (2006) Simulation of an ideal π/4-DQPSK system based on MATLAB. 3(4):40–45 (in Chinese)
8. Gu J, Wang L, Tang S et al (2016) A polarization diversity merging method for frequency hopping signals. Chinese patent ZL201611141008.5, 12 Dec 2016
9. Oerder M, Meyr H (1988) Digital filter and square timing recovery. IEEE Trans Commun 36(5):605–612
10. Han Y, Teh KC (2007) Performance study of linear and nonlinear diversity-combining techniques in synchronous FFH/MA communication systems over fading channels 1(1):1–6
Systematic Synthesis of Active RC Filters Using NAM Expansion
Lingling Tan(B), Fei Yang, and Junkai Yi
Beijing Information Science and Technology University, Beijing 100192, China
{tanlingling,yijk}@bistu.edu.cn, [email protected]
Abstract. Active network synthesis has proved to be an effective method for circuit designers to find new circuits with desired performance. This paper demonstrates the application of nodal admittance matrix (NAM) expansion to active filter synthesis, starting from the port matrix of a voltage-controlled voltage source (VCVS). The Sallen–Key (SK) second-order low-pass filter, the Åkerberg–Mossberg (AM) second-order low-pass filter, and the Deliyannis second-order band-pass filter are synthesized by NAM expansion, and simulation results verify the feasibility of the circuit synthesis method. Keywords: Active filter synthesis · Nodal admittance matrix (NAM) expansion · Nullor · Sallen–Key (SK) second-order filter · Åkerberg–Mossberg (AM) second-order filter · Deliyannis second-order filter
1 Introduction
A method of synthesizing active circuits using NAM expansion was proposed in [1] and has greatly enriched the theory of active network synthesis [2–4]. Alternative circuit topologies can be obtained from the same transfer function when different performance trade-offs are taken into account [5]. The infinity variable ∞_i in the admittance matrix bridges the representation of the nullor with matrix analysis [6]. Moreover, the pathological elements, namely the nullator, norator, voltage mirror, and current mirror, have received increasing attention for their use in circuit analysis and synthesis [7, 8]. The generation process is based on a symbolic method of circuit design, and the derived circuit originates from a symbolic transfer function. As extra nullors are introduced, nullators and norators may be paired as operational amplifiers (OPAs), which are prevalent devices in filter circuit design. More second-order RC active filters need to be synthesized to achieve automatic circuit design, which is what this paper endeavors to do. First, a method of automatic analog circuit design using NAM expansion is presented. Second, the SK second-order filter, AM second-order filter, and Deliyannis second-order
This work was supported in part by the NSFC General Technology Basic Research National Natural Science Foundation of China (U1636208) and the Foundation of BISTU (2025007).
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_74
546
L. Tan et al.
filter are synthesized from given transfer functions. Finally, the realizability of the derived filters is verified by the design and simulation of the corresponding low-pass and band-pass filters.
2 Theory of NAM Expansion
In active network synthesis, circuits can be synthesized from a given voltage or current transfer function [9], which is the reverse process of circuit analysis. Starting from the given 3 × 3 or 4 × 4 port matrix [10] of the VCVS or CCCS, NAM expansion is applied to design new circuits by adding rows and columns to the admittance matrix. To expand the matrix into an N × N network, pivotal expansion is carried out on specific terms of the matrix [4]. The nullor is widely used in active network synthesis for its capacity to model active blocks such as the OPA [11]; in the final N × N network, each nullor pair can be realized by its OPA equivalent. This paper demonstrates a filter synthesis method in which a p × p port admittance matrix derived from a given transfer function is expanded to a port-equivalent n × n nodal admittance matrix, with n > p. First, the expansion introduces blank rows and columns, which represent internal nodes of the NAM, and nullors are added. Second, NAM expansion is applied to particular matrix terms until every term in the admittance matrix is a single admittance [12]. Third, the added nullors are used to move certain admittance terms to their final locations so that they properly describe floating or grounded passive elements [13]. Finally, the NAM is obtained, whose finite terms represent passive circuit components and whose infinite terms represent OPAs.
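The port equivalence that underlies NAM expansion can be verified in the reverse direction: eliminating an internal node of the expanded n × n matrix by a Schur complement must recover the p × p port behavior. A minimal numeric illustration (two admittances in series through one internal node; the values are arbitrary, not from the paper):

```python
import numpy as np

def eliminate_node(Y, k):
    """Schur complement: remove internal node k from NAM Y while preserving
    the behavior seen at the remaining (port) nodes."""
    keep = [i for i in range(Y.shape[0]) if i != k]
    return Y[np.ix_(keep, keep)] - np.outer(Y[keep, k], Y[k, keep]) / Y[k, k]

a, b = 2.0, 3.0                          # two admittances joined at internal node 3
Y3 = np.array([[a, 0.0, -a],
               [0.0, b, -b],
               [-a, -b, a + b]])         # expanded 3x3 NAM
Y2 = eliminate_node(Y3, 2)               # collapse back to the 2x2 port matrix

y_series = a * b / (a + b)               # expected series admittance ab/(a+b)
print(Y2)                                # y_series * [[1, -1], [-1, 1]]
```

Pivotal expansion is exactly the inverse of this elimination: a composite admittance term is split by introducing the internal node and the single admittances that meet there.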
3 Application of Active Filter Synthesis
3.1 SK Second-Order Low-Pass Filter Synthesis
Consider a voltage transfer function given as

A_v = N/D = ab/(ad + ab + bd + cd) = ab/[(b + d)(a + b + c) − b(b + c)]    (1)
According to the port matrix type I of the VCVS [10], the synthesis starts from the 3 × 3 admittance matrix, where an arbitrary first-order function Q is introduced and selected as Q = a + b + c. Substituting N = ab and D = (b + d)(a + b + c) − b(b + c) gives

⎡ 0      0    0   ⎤      ⎡ 0             0    0                        ⎤
⎢ 0     ∞1  −∞1  ⎥  →  ⎢ 0            ∞1  −∞1                       ⎥    (2)
⎣ −N/Q   0   D/Q  ⎦      ⎣ −ab/(a+b+c)  0    b + d − b(b+c)/(a+b+c)   ⎦
Systematic Synthesis of Active RC Filters Using NAM Expansion
547
Pivotal expansion is then carried out on terms y31 and y33 of the last matrix, and corresponding terms are added so that the elements can be connected in floating or grounded form. In particular, the term −c is moved to the second row by means of ±∞1 [6]. (3)
Equation (3) describes a circuit topology with one nullor, as shown in Fig. 1: the SK second-order low-pass filter circuit. Clearly, Fig. 1 can be realized by an OPA combined with passive elements. For realization, a is selected as a resistor g1, b as a resistor g2, c as a capacitor C2 s, and d as a capacitor C1 s, as shown in Fig. 2. Then Eq. (1) yields

A_v = g1 g2 / (g1 C1 s + g1 g2 + g2 C1 s + C1 C2 s²)
    = (g1 g2/(C1 C2)) / (s² + ((g1 + g2)/C2) s + g1 g2/(C1 C2))    (4)
Fig. 1 Synthesized circuit topology
Design a low-pass filter according to the parameters f_p = 1 kHz, Q_p = 1, K = 1, where f_p is the pass-band cutoff frequency, Q_p the quality factor, and K the voltage magnification:

A_V(s) = K ω_p² / (s² + (ω_p/Q_p) s + ω_p²)    (5)
In this design example, the resistors are chosen to be 1 kΩ. The component values calculated according to Eq. (5) are listed in Table 1. The SK second-order low-pass filter circuit is simulated, and the results are shown in Figs. 3 and 4.
Fig. 2 SK second-order low-pass filter
Table 1 Component values of the SK second-order low-pass filter: C1 = 80 nF, C2 = 320 nF, R1 = R2 = 1 kΩ
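The values in Table 1 follow from matching Eq. (4) to the prototype of Eq. (5): with g1 = g2 = g, ω_p/Q_p = 2g/C2 and ω_p² = g²/(C1 C2). A quick check (C2 computes to ≈318 nF; the table rounds to the standard value 320 nF):

```python
import math

g = 1e-3                                # g1 = g2 = 1/R with R1 = R2 = 1 kOhm
fp, Qp = 1e3, 1.0
wp = 2 * math.pi * fp

C2 = 2 * g * Qp / wp                    # from wp/Qp = (g1 + g2)/C2 = 2g/C2
C1 = g * g / (wp ** 2 * C2)             # from wp^2 = g1*g2/(C1*C2)
print(round(C1 * 1e9), round(C2 * 1e9))   # 80 318  (nF)
```

The same matching procedure applies to the AM and Deliyannis designs of Tables 2 and 3.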
Fig. 3 Simulation of SK second-order low-pass filter
Fig. 4 Simulation results of SK second-order low-pass filter
3.2 AM Second-Order Low-Pass Filter Synthesis
Consider a voltage transfer function given as

A_v = −N/D = −bdg/(ehc + bfd)    (6)
According to the port matrix type III of the VCVS [10], the synthesis process starts from the 3 × 3 admittance matrix of Eq. (7), where the arbitrary first-order functions Q, Q1,
and Q2 are introduced in the following equation.
(7)
Then pivotal expansion is carried out on terms y41 and y42.
(8)
Two columns of zero terms are added as the fifth and seventh columns, and two rows of zero terms are added as the fourth and sixth rows; the infinity variables ∞2 and ∞3 are introduced into the newly added rows and columns, and corresponding terms are added so that the elements appear in floating or grounded form:

⎡  g     0      0      0     0     0     −g    ⎤
⎢  0    e+f   ∞1−e    0     0     0     −f    ⎥
⎢  0    −e   d+e+g   −d     0     0      0    ⎥
⎢  0     0    −d     c+d   −c     0     ∞2    ⎥    (9)
⎢  0     0     0     −c    b+c   −b      0    ⎥
⎢  0    −h     0      0     0   ∞3−b   b+h   ⎥
⎣ −g    −f     0      0     0    −h   f+h+g  ⎦
Equation (9) describes a circuit topology with three nullors, as shown in Fig. 5: the AM second-order low-pass circuit. For realization, b is selected as a resistor g2, c as a resistor g3, d as a resistor g4, e as a resistor in parallel with a capacitor, that is C2 s + g5, f as a resistor g6, g as a resistor g1, and h as a capacitor C1 s, as shown in Fig. 7. Then Eq. (6) yields (Fig. 6)

A_v = −g1 g2 g4 / ((g5 + C2 s) g3 C1 s + g2 g4 g6)
    = −(g1 g2 g4/(g3 C1 C2)) / (s² + (g5/C2) s + g2 g4 g6/(g3 C1 C2))    (10)
Design a low-pass filter with the parameters f_p = 1 kHz, Q_p = 1, K = 1. For simplicity, the resistors are chosen to be 1 kΩ. The calculated component values are given in Table 2. The AM second-order low-pass filter circuit is simulated with these values; the results are shown in Fig. 8.
Fig. 5 Synthesized circuit topology
Fig. 6 AM second-order low-pass filter R6 1KR2
C1
R1
X1 1K
159n
X1_2
R5 R4 1K
R3
1
C2
1K
159n
X2
1K
Fig. 7 Simulation of AM second-order low-pass filter
Table 2 Component values of the AM second-order low-pass filter: C1 = C2 = 159 nF, R1 = R2 = R3 = R4 = R5 = 1 kΩ
3.3 Deliyannis Second-Order Band-Pass Filter Synthesis
Consider a voltage transfer function given as

A_V = ac(d + e) / (cea − d[f(a + b + c) + bc])
Fig. 8 Simulation results of AM second-order low-pass filter
= ac(d + e) / (e[f(a + b + c) + c(a + b)] − [f(a + b + c) + bc](d + e))    (11)
According to the port matrix type IV of the VCVS [10], the synthesis process starts from the 4 × 4 admittance matrix, where an arbitrary first-order function Q is introduced. With N1 = 0, D1 = e, P1 = d + e, N2 = ac, D2 = f(a + b + c) + bc, and P2 = f(a + b + c) + c(a + b) = (c + f)(a + b + c) − c², the starting matrix

⎡  0     0    0    0  ⎤
⎢  0     0   ∞1  −∞1 ⎥
⎢ −N1  −D1  P1    0  ⎥
⎣ −N2  −D2   0   P2  ⎦

becomes

⎡  0       0      0      0   ⎤
⎢  0       0     ∞1    −∞1  ⎥
⎢  0      −e    d+e      0   ⎥    (12)
⎣ −ac/Q  −(f(a+b+c)+bc)/Q   0   ((c+f)(a+b+c)−c²)/Q ⎦

and, substituting Q = a + b + c,

⎡  0    0    0    0  ⎤
⎢  0    0   ∞1  −∞1 ⎥
⎢  0   −e  d+e    0  ⎥    (13)
⎣ −ac/(a+b+c)  −(f(a+b+c)+bc)/(a+b+c)   0   ((c+f)(a+b+c)−c²)/(a+b+c) ⎦

Then pivotal expansion is carried out on terms y41, y42, and y44 of the last matrix, and corresponding terms are added so that the elements can be connected in floating or grounded form.
(14)
Equation (14) describes a circuit topology with one nullor, as shown in Fig. 9: the Deliyannis second-order band-pass filter circuit. For realization, a is selected as a resistor g1, b as a capacitor C2 s, c is selected
as a capacitor C1 s, d as a resistor g3, e as a resistor g4, and f as a resistor g2, as shown in Fig. 10. Then Eq. (11) yields

A_v = g1 C1 s (g3 + g4) / (g1 g4 C1 s − g3 [g2 (g1 + C1 s + C2 s) + C1 C2 s²])
    = (g1 (g3 + g4)/(g3 C2)) s / (s² + (((g1 g4 − g2 g3) C1 + g2 g3 C2)/(g3 C1 C2)) s + g1 g2/(C1 C2))    (15)
Fig. 9 Synthesized circuit topology
Fig. 10 Deliyannis second-order band-pass filter
Design a band-pass filter with the parameters f_p = 1 kHz, K = 1, where f_p is the pass-band center frequency and K the voltage magnification. For simplicity, the resistors are chosen to be 1 kΩ; the calculated component values are listed in Table 3.

A_V(s) = K (ω_p/Q_p) s / (s² + (ω_p/Q_p) s + ω_p²)    (16)
The Deliyannis second-order band-pass filter circuit is simulated according to the component values presented above; simulation results are shown in Figs. 11 and 12.
Table 3 Component values of the Deliyannis second-order band-pass filter: C1 = 113 nF, C2 = 226 nF, R1 = R2 = R3 = R4 = R5 = 1 kΩ
Fig. 11 Simulation of Deliyannis second-order band-pass filter
Fig. 12 Simulation results of Deliyannis second-order band-pass filter
4 Conclusion
This paper demonstrates a systematic active filter synthesis method based on NAM expansion using nullors. Unlike traditional circuit design methods, the circuit topology is derived from the original transfer function without assumptions about the synthesized topology. The active circuit topologies of the SK second-order low-pass filter, the AM second-order low-pass filter, and the Deliyannis second-order band-pass filter are synthesized from given transfer functions. For each, a practical filter design is provided, and the simulation results verify the feasibility of the systematic circuit synthesis method based on NAM expansion.
References
1. Haigh DG, Radmore P et al (2004) Systematic synthesis method for analogue circuits—part I: notation and synthesis toolbox. Int Symp Circ Syst ISCAS I:701–704
2. Soliman AM (2010) Two integrator loop filters: generation using NAM expansion and review. J Electr Comput Eng 2010(108687). https://doi.org/10.1155/2010/108687
3. Soliman AM (2011) Generation of current mode filters using NAM expansion. Int J Circuit Theory Appl 19:1087–1103
4. Tan L, Bai Y, Teng J, Liu K, Meng W (2012) Trans-impedance filter synthesis based on nodal admittance matrix expansion. Circ Syst Signal Process. https://doi.org/10.1007/s00034-012-9514-y
5. Tan L, Wang Y, Yu G (2017) Active filter synthesis based on nodal admittance matrix expansion. EURASIP J Wirel Commun Netw 2017(1)
6. Haigh DG, Radmore PM (2006) Admittance matrix models for the nullor using limit variables and their application to circuit design. IEEE Trans Circ Syst I Reg Pap 53(10):2214–2223
7. Sánchez-López C, Fernández FV, Tlelo-Cuautle E (2010) Generalized admittance matrix models of OTRAs and COAs. Microelectron J 41(8):502–505
8. Tlelo-Cuautle E, Sánchez-López C, Moro-Frías D (2010) Symbolic analysis of (MO)(I)CCI(II)(III)-based analog circuits. Int J Circuit Theory Appl 38(6):649–659
9. Haigh DG (2006) A method of transformation from symbolic transfer function to active-RC circuit by admittance matrix expansion. IEEE Trans Circ Syst I Reg Pap 53(12):2715–2728
10. Haigh DG, Tan FQ, Papavassiliou C (2005) Systematic synthesis of active-RC circuit building-blocks. Analog Integr Circ Signal Process 43(3):297–315
11. Saad RA, Soliman AM (2010) A new approach for using the pathological mirror elements in the ideal representation of active devices. Int J Circuit Theory Appl 38(3):148–178
12. Saad RA, Soliman AM (2008) Use of mirror elements in the active device synthesis by admittance matrix expansion. IEEE Trans Circ Syst I Reg Pap 55(9):2726–2735
13. Soliman AM (2009) Adjoint network theorem and floating elements in the NAM. J Circ Syst Comput 18(3):597–616
An OFDM Radar Communication Integrated Waveform Based on Chaos
Zhe Li1,2(B) and Weixia Zou2
1 Beijing University of Posts and Telecommunications, Beijing 100876, People's Republic of China
[email protected]
2 Key Lab of Universal Wireless Communications, MOE, Beijing University of Posts and Telecommunications, Beijing 100876, People's Republic of China
Abstract. This paper studies the design of an integrated radar-communication shared signal based on chaotic mapping. Orthogonal frequency division multiplexing (OFDM) has been widely used in communications, and phase-coded OFDM radar has received considerable research attention in recent years. Combining the advantages of these two technologies, an integrated phase-coded OFDM signal based on a chaotic sequence is proposed, in which logistic chaotic mapping generates the phase coding sequence and the communication data are modulated by establishing a mapping between the data and the coding sequences. To improve the communication bit-carrying capacity of the integrated signal, each subcarrier of the waveform is modulated with a different phase coding sequence. Analysis and simulation of the ambiguity function of the designed signal demonstrate that it not only improves the communication capability of the chaos-based integrated signal but also retains good radar performance. Keywords: Radar communication integration · Phase-coded · Chaos · OFDM
1 Introduction
With the rapid development of electronic equipment and information technology, spectrum resources have become increasingly scarce, and the frequency bands of communication and radar systems have gradually overlapped, so radar communication integration technology has received extensive attention and research. Among the existing integration technologies, the shared-signal scheme has attracted the most attention owing to its highest degree of integration. Shared-signal technology combines the communication signal and the radar signal into a newly designed waveform that realizes the communication function while ensuring that the radar function is not affected; the main approaches are linear frequency modulation (LFM) technology, spread spectrum technology,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_75
556
Z. Li and W. Zou
and OFDM technology [1]. Since the communication symbol rate in LFM corresponds only to the chirp rate, its communication performance is not optimal. In [2], a single-carrier integrated signal based on spread spectrum with an m-sequence was proposed; however, its performance is largely determined by the type and length of the spreading sequence, and the data rate is reduced by the spreading factor. The OFDM signal has a large time-bandwidth product, high spectral efficiency, and resistance to multipath fading; moreover, OFDM radar has received much attention in recent years. In [3], Levanon proposed a multi-frequency complementary phase-modulated radar signal, in which a complementary set is used as the phase sequence modulating the subcarriers. Complementary sets have good autocorrelation and cross-correlation, but their number is limited and the code search process is complicated. In [4], the feasibility of designing an OFDM system that satisfies both radar and communication functions was discussed. The choice of the phase coding sequence is the key to phase-coded radar. Although the ideal Barker and Huffman codes have good autocorrelation, their sequence lengths are limited and the generated sequences are fixed, which makes them easy to intercept, so security is not guaranteed [5]. In [6], an OFDM signal based on a chaotic sequence was proposed, but from the communication perspective this approach has poor data carrying capacity. Chaos is a random, irregular motion with many excellent characteristics: aperiodicity, sensitivity to initial values, pseudo-randomness, long-term unpredictability, noise-like broad-spectrum behavior, and ideal correlation properties. Chaotic sequences are easy to generate and their length is unrestricted. The non-periodic randomness of a chaotic signal gives it a "pushpin-shaped" ambiguity function (an ideal pushpin-shaped ambiguity function has a single peak at the origin), making it a rather ideal radar signal.
Therefore, this paper aims at designing a new kind of waveform based on chaotic sequences for a joint communication-radar system.
2 Radar Communication Integrated Signal Model
2.1 Phase-Coded OFDM Signal Model
The baseband model of the chaos-based phase-coded OFDM signal proposed in this paper can be expressed as

s(t) = Σ_{n=0}^{N−1} Σ_{m=0}^{M−1} w_n a_{n,m} rect(t − m t_b) exp(j2π f_n t)    (1)
where N and M are the numbers of subcarriers and symbols, respectively; w_n = |w_n| e^{jθ_n} is the complex weight of the nth subcarrier, used to weight its amplitude and initial phase; a_{n,m} is the phase code of the mth symbol on the nth subcarrier; f_n = nΔf is the nth subcarrier frequency, where Δf = 1/t_b is the spacing between adjacent subcarriers; t_b is the symbol period; and rect(·) is the rectangular window function. The pulse structure of the waveform is shown in Fig. 1.
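Equation (1) can be generated directly; the sketch below builds the baseband pulse with unit subcarrier weights w_n = 1 and random 4-phase codes (the sampling parameters are illustrative assumptions):

```python
import numpy as np

def pc_ofdm(a, tb=1e-6, fs=64e6):
    """Phase-coded OFDM baseband of Eq. (1): a[n, m] is the phase code of the
    m-th symbol on the n-th subcarrier, f_n = n/tb, unit weights w_n = 1."""
    N, M = a.shape
    t = np.arange(int(round(M * tb * fs))) / fs
    s = np.zeros(t.shape, dtype=complex)
    for n in range(N):
        for m in range(M):
            gate = (t >= m * tb) & (t < (m + 1) * tb)      # rect(t - m*tb)
            s += gate * a[n, m] * np.exp(2j * np.pi * (n / tb) * t)
    return s

rng = np.random.default_rng(1)
codes = np.exp(2j * np.pi * rng.integers(0, 4, size=(4, 4)) / 4)   # 4-phase codes
s = pc_ofdm(codes)
print(len(s), round(float(np.mean(np.abs(s) ** 2)), 3))   # 256 samples; mean power = N
```

Subcarrier orthogonality over each symbol period keeps the mean power equal to the number of subcarriers regardless of the chosen phase codes.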
An OFDM Radar Communication Integrated Waveform Based on Chaos
557
Fig. 1 Pulse structure of the phase-coded OFDM radar communication integrated signal
2.2 Chaotic Sequence Generation
Discrete chaotic sequences are generated from chaotic mapping equations; typical examples include the Bernoulli, logistic, tent, and quadratic maps [7]. The logistic map is used in this paper:

x(n + 1) = μ(1/4 − x²(n)) − 1/2,  x(n) ∈ [−0.5, 0.5],  2.5 < μ ≤ 4    (2)

The generation and quantization algorithm of the logistic sequence is as follows.
1. The map generates a different sequence for each initial value x0, a random value in [−0.5, 0.5]. The logistic map exhibits better chaotic behavior when μ > 3.57, so the mapping equation with μ = 4 is used:

x(n + 1) = 1/2 − 4x²(n),  x(n) ∈ [−0.5, 0.5]    (3)

2. Generate a chaotic sequence of length M. To reduce the influence of the initial value, the map is pre-iterated before generating the sequence; it is then iterated M more times to obtain the required chaotic sequence {x_m}, m = 0, 1, ..., M − 1.
3. Perform phase quantization on the chaotic sequence obtained in step 2. With N_p the number of quantized phases, the phase coding sequence {a_m}, m = 0, 1, ..., M − 1, with phases in [0, 2π], is generated according to
am = exp(j2π ϕm )ϕm = ceil(Np xm + 0.5) · 2π/Np
(4)
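The three steps above can be sketched in Python. This is our own illustration: the pre-iteration count is an assumed value, and reading the quantization rule of Eq. (4) as a_m = exp(jφ_m) is our interpretation of the garbled original:

```python
import numpy as np

def logistic_phase_code(x0, M, Np=8, pre_iter=500):
    """Sketch of steps 1-3: iterate the shifted logistic map of Eq. (3),
    x(n+1) = 1/2 - 4 x(n)^2 on [-0.5, 0.5] (mu = 4), pre-iterating to
    suppress the influence of x0, then Np-phase quantize via Eq. (4)."""
    x = x0
    for _ in range(pre_iter):            # step 2: pre-iteration
        x = 0.5 - 4.0 * x * x
    xs = np.empty(M)
    for m in range(M):                   # step 2: generate {x_m}
        x = 0.5 - 4.0 * x * x
        xs[m] = x
    phi = np.ceil(Np * xs + 0.5) * 2 * np.pi / Np   # step 3: phase quantization
    return np.exp(1j * phi)              # phase-code sequence {a_m}
```

The resulting code chips all lie on the unit circle at multiples of 2π/N_p.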
2.3 Communication Information Modulation

After the chaotic sequence is obtained, the communication data need to be modulated onto it. In [6], every subcarrier is modulated with the same phase coding sequence, that is, a_{n,m} = a_{0,m}, n = 0, 1, ..., N − 1, m = 0, 1, ..., M − 1.
Z. Li and W. Zou
This scheme has a simple OFDM structure and low demodulation complexity at the receiver, but its obvious drawback is that each subcarrier can carry only one bit: a_{0,m} is used as the phase coding sequence when sending 1, and −a_{0,m} when sending 0. In this paper, the following modulation approach is used to improve the communication capacity of the chaos-based integrated signal.

1. Let N = M. First, an initial phase sequence is obtained by the generation algorithm above; the initial sequence a_{0,m} is then cyclically shifted M − 1 times to obtain M different phase coding sequences a_{n,m}, n, m = 0, 1, ..., M − 1.
2. Perform serial-to-parallel conversion on the transmitted communication bits to obtain N channels of communication data, which will be transmitted on the N subcarriers, respectively.
3. The number of cyclic shifts of a sequence relative to the initial sequence a_{0,m} carries the communication information. A chaotic sequence of length M can carry K = log₂ M bits; therefore, for the nth subcarrier, the first K bits waiting to be transmitted are selected and converted into the decimal number d_n.
4. The previous step yields the communication data {d_n}, n = 1, 2, ..., N for the N subcarriers. Let d_n be the cyclic shift number: the sequence with shift number d_n is selected from the M obtained sequences and used to modulate the nth subcarrier. The phase coding sequences of all subcarriers are obtained in this way.

For simplicity, assuming N = M = 4, the mapping between communication data and phase coding sequences is shown in Table 1.

Table 1 Mapping between communication data and phase coding sequence

Data | d_n | Phase sequence
00   | 0   | a0, a1, a2, a3
01   | 1   | a1, a2, a3, a0
10   | 2   | a2, a3, a0, a1
11   | 3   | a3, a0, a1, a2
</gr-replace>
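The cyclic-shift modulation of steps 1–4 and Table 1 can be sketched as follows (a minimal illustration; the function and variable names are ours, and N = M is assumed as in step 1):

```python
import numpy as np

def shift_modulate(bits, a0):
    """Map the bit stream onto per-subcarrier phase-code sequences by
    cyclically shifting the initial sequence a0 (N = M assumed).
    Each subcarrier carries K = log2(M) bits, encoded as its shift number d_n."""
    M = len(a0)
    K = int(np.log2(M))
    N = M
    assert len(bits) == N * K, "need N*K bits per pulse"
    codes = np.empty((N, M), dtype=complex)
    shifts = []
    for n in range(N):
        # K binary bits -> decimal shift number d_n (step 3)
        d = int("".join(str(b) for b in bits[n * K:(n + 1) * K]), 2)
        shifts.append(d)
        codes[n] = np.roll(a0, -d)       # shift by d_n, as in Table 1
    return codes, shifts
```

For a0 = (a0, a1, a2, a3), the bit pair 01 selects the sequence (a1, a2, a3, a0), matching the second row of Table 1.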
3 Ambiguity Function of the Integrated Signal

The ambiguity function is an effective tool for analyzing radar signals and designing waveforms. It can be used to study which waveform a radar should adopt, as well as the resolution, ambiguity, measurement accuracy, and clutter rejection of the radar system after matched filtering.
Suppose the echoes of two targets are s(t) and s(t + τ)e^{−j2π f_d t}, respectively. By definition, the ambiguity function can be written as

χ(τ, f_d) = ∫_{−∞}^{+∞} s(t) s*(t + τ) exp(j2π f_d t) dt   (5)
where τ is the time delay, f_d is the Doppler shift, and * denotes complex conjugation. The smaller |χ(τ, f_d)| is, the higher the radar resolution at (τ, f_d). Generally, |χ(τ, f_d)|²/|χ(0, 0)|² is used to measure the two-dimensional joint speed–distance resolution of the target. The integrated signal designed in this paper can be written as

s(t) = Σ_{n=0}^{N−1} w_n x_n(t) exp(j2π f_n t),  0 ≤ t ≤ M t_b   (6)

where x_n(t) = Σ_{m=0}^{M−1} a_{n,m} rect(t − m t_b) is the phase-coded signal of the nth subcarrier and f_n = nΔf with Δf = 1/t_b. According to (5) and (6), the ambiguity function of the integrated waveform is

χ(τ, f_d) = ∫_{−∞}^{+∞} s(t) s*(t + τ) exp(j2π f_d t) dt
          = ∫_{−∞}^{+∞} [Σ_{n=0}^{N−1} w_n x_n(t) exp(j2π nΔf t)] · [Σ_{k=0}^{N−1} w_k* x_k*(t + τ) exp(−j2π kΔf (t + τ))] · exp(j2π f_d t) dt
          = Σ_{n=0}^{N−1} Σ_{k=0}^{N−1} w_n w_k* exp(−j2π kΔf τ) ∫_{−∞}^{+∞} x_n(t) x_k*(t + τ) exp[j2π(n − k)Δf t + j2π f_d t] dt
          = χ_auto(τ, f_d) + χ_cross(τ, (n − k)Δf + f_d)   (7)
Here χ_auto is the self-ambiguity function, the main component of the ambiguity function, corresponding to the terms with n = k; χ_cross is the cross-ambiguity function, the secondary part, corresponding to the terms with n ≠ k. χ_cross is very small and can be ignored compared with χ_auto. Their expressions are

χ_auto(τ, f_d) = Σ_{n=0}^{N−1} |w_n|² exp(−j2π nΔf τ) χ_n(τ, f_d)   (8)
χ_cross(τ, f_d) = Σ_{n=0}^{N−1} Σ_{k=0, k≠n}^{N−1} w_n w_k* exp(−j2π kΔf τ) χ_{n,k}(τ, f_d)   (9)
where χ_n(τ, f_d) is the self-ambiguity function of the nth subcarrier signal x_n(t), and χ_{n,k}(τ, f_d) is the cross-ambiguity function of x_n(t) and x_k(t).
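A discretized version of the ambiguity function (5) can be evaluated on a delay–Doppler grid. The sketch below is our own illustration (sample-spaced delays and normalized Doppler in cycles per sample), not the authors' code; it applies to any sampled waveform, e.g. the phase-coded OFDM pulse of Eq. (1):

```python
import numpy as np

def ambiguity(s, delays, dopplers_norm):
    """Discrete sketch of Eq. (5): |chi| over integer sample delays and
    normalized Doppler shifts nu (cycles/sample), with zero-padded shifts."""
    k = np.arange(s.size)
    chi = np.zeros((len(delays), len(dopplers_norm)))
    for i, d in enumerate(delays):
        shifted = np.zeros_like(s)
        if d >= 0:
            shifted[:s.size - d] = s[d:]      # s(t + tau), tau = d samples
        else:
            shifted[-d:] = s[:s.size + d]
        for j, nu in enumerate(dopplers_norm):
            chi[i, j] = np.abs(np.sum(s * np.conj(shifted) * np.exp(2j * np.pi * nu * k)))
    return chi
```

For a constant-modulus phase-coded pulse the main peak sits at zero delay and zero Doppler, consistent with the pushpin shape discussed in Sect. 4.2.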
4 Simulation and Analysis

4.1 Correlation Analysis of Chaotic Sequence

Following the generation method above, a logistic chaotic sequence of length 128 is obtained; its autocorrelation and cross-correlation are shown in Fig. 2. When the symbol offset is 0, the normalized autocorrelation amplitude is 1 with a very low side lobe level, while the cross-correlation remains small at every symbol offset. This shows that the chaotic sequence based on the logistic map has both good autocorrelation and good cross-correlation, making it well suited as the coding sequence of a phase-coded OFDM signal.
Fig. 2 Autocorrelation and cross-correlation of logistic chaos
4.2 Radar Performance Analysis

The ambiguity function of the integrated shared signal consists of two parts: the self-ambiguity function, a weighted sum of the ambiguity functions of the individual subcarriers, and the cross-ambiguity function, the mutual interference between different subcarriers. Set N = M = 16 with all subcarrier weighting coefficients equal to 1, and randomly generate an 8-phase quantized chaotic phase coding sequence. Figure 3 shows the ambiguity function diagrams when using the same phase coding sequence for each subcarrier (the left subgraph in Fig. 3) and using
Fig. 3 Ambiguity function diagram
different phase coding sequences generated by cyclic shifting for each subcarrier (the right subgraph in Fig. 3). Both schemes have a pushpin-shaped ambiguity function. However, in the left subgraph the ambiguity function has a high side lobe level over a wide range, so its speed–distance resolution is not ideal. In the right subgraph the center peak is pronounced, the side lobe range is smaller, and the side lobe level is much lower, giving better two-dimensional speed–distance resolution. The result shows that the integrated signal based on the chaotic sequence designed in this paper has better radar performance.

4.3 Communication Performance Analysis

Let N = M = 16; each subcarrier can then carry K = log₂ M = 4 communication bits. If the symbol duration is t_b = 1 μs, the duration of a pulse is M t_b = 16 μs. Assuming a radar pulse duty cycle of λ = 0.3, the communication bit rate R_b is

R_b = (N × K)/(M t_b) × λ = (16 × 4 bit)/(16 × 1 μs) × 0.3 = 1.2 Mbit/s   (10)

If each subcarrier carried only one bit, the bit rate would be (16 × 1 bit)/(16 × 1 μs) × 0.3 = 0.3 Mbit/s, so the designed waveform increases the communication rate by K = 4 times under the above parameters. The longer the chaotic sequence and the more subcarriers, the higher the achievable data rate; however, as the sequence length increases, the complexity of the integrated signal grows, so a trade-off must be made between communication rate and sequence length. Moreover, since the number of bits is tied to the sequence length by K = log₂ M, multiple symbols carry the same communication information, giving the phase-coded OFDM signal better noise resistance.
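The rate computation of Eq. (10) can be checked with a small helper (our own illustration; the function name is ours):

```python
import math

def comm_bit_rate(N, M, tb, duty):
    """Eq. (10): each of the N subcarriers carries K = log2(M) bits per
    pulse of duration M*tb, scaled by the radar duty cycle."""
    K = int(math.log2(M))
    return N * K / (M * tb) * duty
```

With N = M = 16, t_b = 1 μs, and λ = 0.3, this returns 1.2 Mbit/s, four times the 0.3 Mbit/s of the one-bit-per-subcarrier scheme.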
5 Conclusion

Based on the excellent characteristics of the phase-coded OFDM signal and chaotic sequences, we have proposed a new radar communication integrated signal based on
logistic mapping. To improve the communication data carrying capacity of the integrated waveform, the chaotic sequence is cyclically shifted so that each subcarrier uses a different phase coding sequence for phase modulation. The ambiguity function of the designed signal is derived and simulated; the results show that it is pushpin-shaped with a low side lobe level. In addition, the communication rate of the shared signal is directly proportional to the number of subcarriers, and the bit carrying capacity of each subcarrier increases K-fold through the proposed modulation method. The integrated waveform proposed in this paper thus combines the advantages of the phase-coded OFDM signal and the chaotic sequence, achieving good radar performance as well as good communication performance.

Acknowledgments. This work was supported by NSFC (No. 61971063).
References

1. Quan S, Qian W, Guq J et al (2014) Radar-communication integration: an overview. In: The 7th IEEE/international conference on advanced infocomm technology, Fuzhou, pp 98–103
2. Sturm C, Wiesbeck W (2011) Waveform design and signal processing aspects for fusion of wireless communications and radar sensing. Proc IEEE 99(7):1236–1259
3. Levanon N (2000) Multifrequency complementary phase-coded radar signal. IEE Proc-Radar Sonar Navig 147(6):276
4. Sturm C, Zwick T, Wiesbeck W (2009) An OFDM system concept for joint radar and communications operations. In: Proceedings of the 69th IEEE vehicular technology conference, VTC Spring 2009, Barcelona, Spain, 26–29 Apr 2009
5. Kai H, Jingjing Z (2016) A design method of four-phase-coded OFDM radar signal based on Bernoulli chaos. J Radars 5(4):361–372. https://doi.org/10.12000/JR16050
6. Zhao J, Huo K, Li X (2014) A chaos-based phase-coded OFDM signal for joint radar-communication systems. In: 2014 12th international conference on signal processing (ICSP), Hangzhou, pp 1997–2002
7. Sheng H, Fuhui Z, Yantao D et al (2018) Chaotic phase-coded waveforms with space-time complementary coding for MIMO radar applications. IEEE Access:1–1
Joint Estimation for Downsampling Structure with Low Complexity

Chen Wang(B), Wei Wang, Wenchao Yang, and Lu Ba

Department of Communication Engineering, Harbin Institute of Technology, 150001 Harbin, People's Republic of China
[email protected]
Abstract. Spectrum sensing and direction of arrival (DOA) estimation have both been thoroughly investigated. Estimating spectrum and DOA is important for many signal processing applications, such as cognitive radio (CR). A challenging scenario faced by CRs is that of multiband signals, composed of several narrowband transmissions spread over a wide spectrum, each with an unknown carrier frequency and DOA. The Nyquist rate of such signals is high. To overcome the sampling rate issue, several sub-Nyquist sampling methods, such as multicoset sampling or the modulated wideband converter (MWC), have been proposed. In this paper, we use a ULA-based MWC structure to implement joint carrier frequency and DOA estimation at sub-Nyquist sampling rates. Unlike other methods, we reduce the complexity of the hardware with fewer antennas and solve the pairing issue that affects other estimation methods.
Keywords: Spectrum sensing · DOA · Sub-Nyquist · ULA-based MWC

1 Introduction
As wireless communication systems proliferate, more and more spectrum resources are allocated to different communication systems, making spectrum resources increasingly scarce. At the same time, the utilization of spectrum resources is low, resulting in "spectral holes." To solve this problem, cognitive radio (CR) has been proposed, which allows secondary users to opportunistically access licensed frequency bands left vacant by their primary owners, thereby increasing spectral efficiency. Spectrum sensing is an essential task for CRs [1]. Direction of arrival (DOA) recovery can further enhance CR performance by allowing exploitation of vacant bands in space in addition to the frequency domain.

This work is supported by National Nature Science Foundation of China (NSFC) (61671176) and Civil Space Pre-research Program during the 13th Five-Year Plan (B0111).
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_76

Both spectrum sensing and DOA estimation have been thoroughly investigated. For spectrum sensing, several schemes assuming known, identical DOAs have been proposed, such as energy detection [2], matched filtering [3], and cyclostationary detection [4]. Well-known techniques for DOA estimation include the MUSIC algorithm [5], the ESPRIT algorithm [6], and their extensions, where the signal frequency support is typically assumed known. However, many signal processing applications may require, or at least benefit from, the two combined, namely joint spectrum sensing and DOA estimation.

Many modern applications dealing with wideband signals lead to high Nyquist rates. To increase the chance of finding unoccupied spectral bands, CRs have to sense a wide spectrum, leading to prohibitively high Nyquist rates. Moreover, such high sampling rates generate a large number of samples to process, increasing power consumption. To overcome this sampling-rate bottleneck, several sampling methods have recently been proposed that reduce the sampling rate in multiband settings below the Nyquist rate [7]. For example, multicoset sampling and the MWC, which sample at sub-Nyquist rates [8], can solve carrier estimation and spectrum sensing from sub-Nyquist samples. In [9], the authors use the MWC to sample and recover the signals using orthogonal matching pursuit, and then perform spectrum analysis on the reconstructed signals to estimate frequency. One main drawback is that the hardware complexity is extremely high. Another problem is the pairing issue: the MUSIC algorithm used in [9] performs DOA recovery, but each frequency generates one true DOA together with false DOAs.
However, a new algorithm based on correlation calculation has been used to solve this problem, although its complexity is high. In [10], an L-shaped array is used to obtain two frequency- and angle-related equations on the x-axis and the y-axis, respectively, which are then solved to obtain the frequencies and DOAs; its hardware complexity is also very high. In this paper, we use the downsampling structure proposed in [10] to achieve joint DOA and frequency estimation. Unlike [10], however, we propose a new joint frequency and DOA estimation algorithm at sub-Nyquist rates. We also reduce the complexity of the hardware with fewer antennas and solve the pairing issue.
2 ULA-Based MWC

2.1 System Description

We first introduce the sampling structure. Based on the existing MWC structure, a new structure based on a ULA was proposed in [10]. The main difficulty of the MWC is choosing a suitable periodic function p(t) so that the transformation matrix satisfies the RIP criterion. The corresponding improvements to the MWC
and its application to a ULA overcome this problem. The new structure can formulate the sub-Nyquist spectrum as the well-known sum-of-exponents problem, which can be solved by various known methods, and it can also achieve joint DOA and frequency estimation. As shown in Fig. 1, the system consists of a ULA composed of N sensors. All sensors have the same sampling pattern, implementing a single channel of the MWC: the received signal is multiplied by a periodic pseudo-random sequence p(t) with period T_p = 1/f_p, then passed through a low-pass filter with cut-off frequency f_s/2 and sampled at the low rate f_s.
Fig. 1 ULA configuration with N sensors
2.2 Frequency Domain Analysis

We first derive the relationship between the sequence sampled at the nth sensor and the unknown transmissions s_i(t) with their corresponding carrier frequencies f_i. The signal of the ULA at the nth sensor can be expressed as

u_n(t) = Σ_{i=1}^{M} s_i(t + τ_n) e^{j2π f_i (t + τ_n)} ≈ Σ_{i=1}^{M} s_i(t) e^{j2π f_i (t + τ_n)}   (1)

S(f) = [s_1, s_2, ..., s_M]   (2)

τ_n = d_n cos(θ)/c   (3)

where τ_n is the accumulated delay at the nth sensor with respect to the first sensor. The approximation above assumes that each s_i(t) is a narrowband signal. The Fourier transform of the received signal u_n(t) can be expressed as

U_n(f) = Σ_{i=1}^{M} S_i(f − f_i) e^{j2π f_i τ_n}   (4)
At each sensor, before filtering and sampling, the received signal is first mixed with the pseudo-random sequence p(t):

p(t) = Σ_{l=−∞}^{∞} c_l e^{j(2π/T_p) l t}   (5)

where

c_l = (1/T_p) ∫_0^{T_p} p(t) e^{−j(2π/T_p) l t} dt   (6)

After mixing, the frequency domain expression of the nth channel is

Ỹ_n(f) = ∫ u_n(t) p(t) e^{−j2π f t} dt = Σ_{l=−∞}^{∞} c_l U_n(f − l f_p)   (7)

Substituting (4) into (7), we have

Ỹ_n(f) = Σ_{l=−∞}^{∞} c_l Σ_{i=1}^{M} S_i(f − f_i − l f_p) e^{j2π f_i τ_n}   (8)
The impulse and frequency responses of an ideal LPF with cut-off frequency f_s/2 are denoted by h(t) and H(f), respectively. After the signal is filtered with h(t),

Y_n(f) = Σ_{i=1}^{M} S̃_i(f) e^{j2π f_i τ_n}   (9)

where

S̃_i(f) = Σ_{l=−L_0}^{L_0} c_l S_i(f − f_i − l f_p)   (10)

Finally, the signal downsampled at the nth sensor can be expressed as

y_n[k] = y_n(kT_s)   (11)

Considering the M signals in the frequency domain, the discrete Fourier transform of (11) is

Y_n(e^{j2π f T_s}) = Σ_{i=1}^{M} W_i(e^{j2π f T_s}) e^{j2π f_i τ_n}   (12)

For the observations of the N-sensor ULA, the above can be written in matrix form as

Y(f) = A W(f)   (13)

Note that Y(f) is a vector of length N with nth element Y_n(f) = Y_n(e^{j2π f T_s}), W(f) is an unknown vector of length M with ith element W_i(f) = W_i(e^{j2π f T_s}),
and the matrix A, determined by the unknown carrier frequencies f_i and delays τ_n, is

A = [ e^{j2π f_1 τ_1}  ···  e^{j2π f_M τ_1} ;
      ···                                  ;
      e^{j2π f_1 τ_N}  ···  e^{j2π f_M τ_N} ]   (14)

In the time domain, we have

y[k] = A w[k]   (15)
where y[k] has N elements and w[k] is a vector of length M with ith element w_i[k]. To reconstruct the signal, we first recover the carrier frequencies f and the DOAs.
3 Joint Frequency and Angle Estimation Under Sub-Nyquist Sampling

To obtain the signal frequencies, we apply a delay τ to the output of each antenna of the array. The delayed signal is described as

u_n(t − τ) = Σ_{i=1}^{M} s_i(t − τ) e^{j2π f_i (t + τ_n − τ)} ≈ Σ_{i=1}^{M} s_i(t) e^{j2π f_i (t + (n−1)d sinθ / c)} e^{−j2π f_i τ}   (16)
Before the delay, the signal model is

y_0[k] = A w[k]   (17)

After applying the delay τ, the downsampled signal can be expressed as

y_1[k] = A Φ w[k]   (18)

where Φ is the diagonal matrix Φ = diag(e^{−j2π f_1 τ}, e^{−j2π f_2 τ}, ..., e^{−j2π f_M τ}). Stacking the data before and after the delay gives

y = [y_0; y_1] = [A; AΦ] w   (19)

Set Ω = [A; AΦ]; Ω is then divided into two blocks, the first block Ω_1 of dimension M × M and the second block Ω_2 of dimension (2N − M) × M. We define a matrix P satisfying

P^H Ω_1 = Ω_2   (20)
We divide the matrix A into two blocks:

A = [A_1; A_2]   (21)

where A_1 = Ω_1 (the first M rows). Then

Ω_2 = [A_2; A_1 Φ; A_2 Φ]   (22)

Partitioning P^H the same way,

P^H = [P_1; P_2; P_3]   (23)

From (22) and (23), P_1 A_1 = A_2 and P_3 A_1 = A_2 Φ. It follows that P_1 A_1 Φ = P_3 A_1, which can be rewritten as P_1 A_1 Φ A_1^{−1} = P_3. Since a similarity transform does not change the eigenvalues of a matrix, the eigenvalues of the matrix Ψ below are those of Φ, which contain the frequency information; the signal frequencies can therefore be obtained by the eigenvalue decomposition of Ψ, computed by least squares as

Ψ = (P_1^H P_1)^{−1} P_1^H P_3   (24)

After the eigenvalue decomposition, the largest K eigenvalues μ_k estimate the K diagonal elements of Φ, and the column space of the corresponding K eigenvectors coincides with the column space of A_1; combining these eigenvectors yields a matrix A_1e, an estimate of A_1. The kth estimated frequency is

f_k = −angle(μ_k)/(2πτ)   (25)

Because A = [I_{M×M}; P_1] A_1, after obtaining the estimate of A_1 we define B = [I_{M×M}; P_1] A_1e. Then we can use the ESPRIT algorithm described in [7] to estimate the DOAs:

θ_k = arcsin(−angle(α_k)/(2π d f_k))   (26)

To sum up, the algorithm has two main steps. The first step is to compute the matrix P; the second is to estimate the frequencies according to the above algorithm and then estimate the DOAs with the ESPRIT algorithm.
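The frequency-recovery chain of Eqs. (19)–(25) can be illustrated on a noiseless toy model. This sketch is our own: it operates directly on the model matrices A and Φ (in practice Ω would be estimated from the sampled data), and all parameter values (N, τ, d, the frequencies and angles) are assumptions for illustration:

```python
import numpy as np

c, d, tau = 3e8, 0.03, 1e-10                  # speed of light, spacing, added delay
f = np.array([1.5e9, 3.5e9])                  # true carrier frequencies (assumed)
theta = np.deg2rad([30.0, 50.0])
N, M = 6, 2
tau_n = np.arange(N)[:, None] * d * np.sin(theta)[None, :] / c
A = np.exp(2j * np.pi * f[None, :] * tau_n)   # N x M steering matrix, cf. Eq. (14)
Phi = np.diag(np.exp(-2j * np.pi * f * tau))  # delay rotation, Eq. (18)

Omega = np.vstack([A, A @ Phi])               # Eq. (19)
Omega1, Omega2 = Omega[:M], Omega[M:]
PH = Omega2 @ np.linalg.inv(Omega1)           # P^H from Eq. (20)
P1, P3 = PH[:N - M], PH[N:]                   # blocks of Eq. (23)
Psi = np.linalg.pinv(P1) @ P3                 # least squares, Eq. (24)
mu = np.linalg.eigvals(Psi)                   # eigenvalues of Phi
f_est = np.sort(-np.angle(mu) / (2 * np.pi * tau))   # Eq. (25)
```

In the noiseless case the eigenvalues of Ψ equal exp(−j2π f_k τ) exactly, so the recovered frequencies match the true ones (τ must be small enough that 2π f_k τ does not wrap past π).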
4 Simulation Results
We have conducted two experiments to evaluate the performance of the proposed joint frequency and DOA estimation method at sub-Nyquist sampling rates. The performance is measured by the mean absolute error (MAE) between the estimated values of DOA and frequency and their true values:

MAE = (1/(RK)) Σ_{r=1}^{R} Σ_{k=1}^{K} |E_{r,k} − Ê_{r,k}|   (27)
where R is the total number of runs of the Monte Carlo method, K is the number of narrowband transmissions s_i(t), and E is the estimated quantity (θ or f). In our simulations, the number of signals is K = 2, with frequencies and angles f_1 = 1.5 GHz, θ_1 = 30°, f_2 = 3.5 GHz, θ_2 = 50°, and the largest signal bandwidth is B = 50 MHz. The spacing between adjacent sensors is d = 0.03 m, and the signal-to-noise ratio (SNR) is defined as SNR = 10 log₁₀(σ_s²/σ²). 200 Monte Carlo experiments are run at each SNR. As shown in Figs. 2 and 3, the errors in frequency and DOA decrease gradually as the SNR increases. Using data sampled at sub-Nyquist rates together with the proposed algorithm, we successfully achieve joint recovery of signal frequency and DOA and solve the matching problem.
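The MAE of Eq. (27) is straightforward to compute; a minimal helper (name ours) is:

```python
import numpy as np

def mae(estimates, truths):
    """Eq. (27): mean absolute error over R Monte Carlo runs and K sources.
    Both arguments have shape (R, K)."""
    estimates, truths = np.asarray(estimates), np.asarray(truths)
    return np.mean(np.abs(estimates - truths))
```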
Fig. 2 Performance of DOA at different SNR
Fig. 3 Performance of carrier frequency f at different SNR
5 Conclusion

In this paper, we use a new algorithm to solve the problem of joint frequency and DOA recovery under downsampling. For the undersampling structure, we adopt the ULA-based MWC, which reduces the complexity of the hardware and recovers the signals from sub-Nyquist rates. Compared with the traditional MWC, the new structure does not need specially designed pseudo-random codes. We also propose a new algorithm that effectively solves the pairing issue in the joint frequency and DOA estimation problem. Simulation results show that the proposed method matches frequencies and DOAs accurately, and the performance of the algorithm improves with higher SNR.
References

1. Ghasemi A, Sousa ES (2008) Spectrum sensing in cognitive radio networks: requirements, challenges and design trade-offs. IEEE Commun Mag 46(4):32–39
2. Urkowitz H (1967) Energy detection of unknown deterministic signals. Proc IEEE 55(4):523–531
3. Turin GL (1960) An introduction to matched filters. IRE Trans Inf Theor 6:311–329
4. Gardner WA, Napolitano A, Paura L (2006) Cyclostationarity: half a century of research. Signal Process 86:639–697
5. Schmidt RO (1986) Multiple emitter location and signal parameter estimation. IEEE Trans Antennas Propag 34(3):276–280
6. Roy R, Kailath T (1989) ESPRIT-estimation of signal parameters via rotational invariance techniques. IEEE Trans Signal Process 37(7):984–995
7. Mishali M, Eldar YC (2011) Sub-Nyquist sampling: bridging theory and practice. IEEE Signal Process Mag 28(6):98–124
8. Mishali M, Eldar YC (2009) Blind multiband signal reconstruction: compressed sensing for analog signals. IEEE Trans Signal Process 57(3):993–1009
9. Cui C, Wu W, Wang WQ (2017) Carrier frequency and DOA estimation of sub-Nyquist sampling multi-band sensor signals. IEEE Sens J 17(22):7470–7478
10. Xu L et al (2018) Joint two-dimensional DOA and frequency estimation for L-shaped array via compressed sensing PARAFAC method. IEEE Access
One-Bit DOA Estimation Based on Deep Neural Network

Chen Wang(B), Suhang Li, and Yongkui Ma

Department of Communication Engineering, Harbin Institute of Technology, Harbin 150001, China
[email protected]
Abstract. This paper establishes a deep neural network model for DOA estimation of narrowband signals. First, one-bit quantization is adopted, retaining only the sign information of the training data, as it offers low cost and low complexity in practical communication systems. We then compare the performance of the neural network trained with quantized data against the traditional MUSIC algorithm. Finally, simulations are conducted for correctness and validation. The results illustrate that the proposed method can realize meshless DOA estimation and has higher estimation accuracy at low signal-to-noise ratios.
Keywords: DOA estimation · One-bit quantization · Deep neural network

1 Introduction
The research on DOA estimation has long focused on improving the accuracy of parameter estimation under limited computing power [1]. To achieve fast, high-precision DOA estimation, various new methods have emerged. The inspiration for these algorithms mainly comes from assuming a parametric forward mapping from the signal directions to the array input; if this mapping is invertible, the array output can be matched against the pre-constructed mapping to detect the directions of the incoming signals [2]. Different matching criteria yield different methods, such as subspace algorithms based on eigenvalue decomposition [3], sparse reconstruction algorithms based on compressed sensing (CS) [4], and maximum likelihood (ML) algorithms based on statistical learning theory [5].

This work is supported by National Natural Science Foundation of China (NSFC) (61671176) and Civil Space Pre-research Program during the 13th Five-Year Plan (B0111).
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021. Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_77
To further reduce computational complexity, Hu et al. [6] innovatively cast the DOA estimation problem as a classification problem by incorporating the framework of sparse signal theory into support vector machines (SVM); in this way, the result of one-bit quantization is regarded as the label of the array output. In addition, the application of massive MIMO has become a core issue in modern communication technology [7]. As is well known, the computational complexity of traditional DOA estimation algorithms is relatively high. For instance, the computational complexity of the 1-D ESPRIT algorithm is on the order of O(M³ + 2M²L) when calculating the singular value decomposition (SVD) of the covariance matrix of an M-element array with L snapshots. Especially as the number of antennas grows, the computational load increases drastically, which is clearly unsuitable for real-time processing and practical application [8]. Neural networks, which have strong capabilities of denoising, learning, self-adaptation, and complex mapping [9], have been gaining momentum in DOA estimation [10–12]. The calculation of a neural network consists mainly of matrix multiplication, addition, and nonlinear transformations, without the complex operation of eigenvalue decomposition [13]. Hence, neural networks offer new possibilities for fast implementation and practical hardware application. Inspired by the above works, we propose a one-bit DOA estimation algorithm based on deep neural networks (DNN).
2 The Description of the Proposed Method

2.1 Deep Learning Framework for One-Bit DOA Estimation
In this subsection, we review some statistical properties of one-bit quantization and give the relationship between the covariance matrices of quantized and unquantized data, which provides a theoretical basis for training the neural network. Consider a continuous-time, real, scalar, stationary Gaussian process X(t) and the output process Y(t) = f(X(t)), where f(·) is an amplitude-quantization function. The autocorrelation function of Y(t), denoted R_Y(τ), is then given by the arcsine law [14]

R_Y(τ) = E[Y(t + τ)Y(t)] = (2/π) sin⁻¹(R̄_X(τ))   (1)

where R̄_X(τ) = R_X(τ)/R_X(0) is the normalized autocorrelation function of X(t). Hence, we can recover the normalized autocorrelation of the unquantized data by calculating the autocorrelation function of the observed one-bit data Y(t). Similarly, the Bussgang theorem [15] further indicates that the cross-correlation between X(t) and Y(t) is proportional to the autocorrelation of X(t):

R_XY(τ) = E[X(t + τ)Y(t)] = C R_X(τ)   (2)
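The arcsine law (1) can be verified by a quick Monte Carlo experiment (a sketch with assumed parameters; the correlation value and sample count are illustrative):

```python
import numpy as np

# For a zero-mean, unit-variance Gaussian pair with correlation rho, the
# hard-limited outputs Y = sign(X) have correlation (2/pi) * arcsin(rho).
rng = np.random.default_rng(1)
rho, n = 0.6, 200_000
x = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
y = np.sign(x)                       # one-bit quantization
r_y = np.mean(y[:, 0] * y[:, 1])     # empirical correlation of the one-bit data
r_theory = 2 / np.pi * np.arcsin(rho)
```

Inverting the law, sin(π r_y / 2) recovers rho from the one-bit statistics, which is exactly the recovery rule exploited later in Eq. (6).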
where the factor C depends on the characteristics of f(x) and the power of X(t). The Bussgang theorem thus provides an equivalent expression of the nonlinear mapping: the output process Y(t), a nonlinear function of X(t), is equivalent to a linear function of X(t) in terms of second-order statistics. In the above derivation, X(t) is real and scalar; in the following, we focus on the case where x is a complex vector. First, the one-bit quantization of a complex vector x is defined as

y = (1/√2) sign(x)   (3)

where the ith entry of sign(x) is given by [sign(x)]_i = sign(ℜ(x_i)) + j · sign(ℑ(x_i)); sign(·) maps nonnegative values to +1 and negative values to −1, and ℜ(·) and ℑ(·) take the real and imaginary parts, respectively. The factor 1/√2 normalizes the power of y. Each element of y therefore takes one of four values: (+1 + j)/√2, (+1 − j)/√2, (−1 + j)/√2, (−1 − j)/√2. Next, consider the arcsine law for complex Gaussian vectors. The normalized covariance matrix of x is

R̄_x = Q^{−1/2} R_x Q^{−1/2}   (4)

where Q is the diagonal matrix satisfying [Q]_{q,q} = [R_x]_{q,q}. By the arcsine law, the covariance matrix of the quantized complex vector is

R_y = E[y y^H] = (2/π) sin⁻¹(R̄_x)   (5)

where sin⁻¹ denotes the inverse sine applied to each element of R̄_x. Formulation (5) indicates that the normalized covariance R̄_x can be estimated from the covariance of the one-bit data:

R̄_x = sin((π/2) R_y)   (6)

Then, consider K uncorrelated far-field narrowband signals with directions θ = {θ_1, θ_2, ..., θ_K} impinging on an M-element uniform linear array. The received signal vector for one snapshot can be expressed as

x = A s + n = Σ_{k=1}^{K} a(θ_k) s_k + n   (7)

where A = [a(θ_1), a(θ_2), ..., a(θ_K)] denotes the manifold matrix composed of the steering vectors, with kth steering vector a(θ_k) = [1, ..., e^{j2π d(M−1) sin θ_k / λ}]. The observation after one-bit quantization can be expressed as

r = sign(ℜ(x)) + j · sign(ℑ(x))   (8)
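Equations (3)–(8) and the recovery rule (6) can be illustrated end to end on a toy ULA scene. This is our own sketch: all scene parameters are assumptions, and applying the arcsine inversion elementwise to the real and imaginary parts of R_y is our reading of the complex arcsine law above:

```python
import numpy as np

rng = np.random.default_rng(2)
M, T = 4, 200_000
theta = np.deg2rad([20.0, 55.0])
A = np.exp(2j * np.pi * 0.5 * np.outer(np.arange(M), np.sin(theta)))  # d = lambda/2
s = (rng.standard_normal((2, T)) + 1j * rng.standard_normal((2, T))) / np.sqrt(2)
n = 0.3 * (rng.standard_normal((M, T)) + 1j * rng.standard_normal((M, T))) / np.sqrt(2)
x = A @ s + n                                               # snapshots, Eq. (7)
y = (np.sign(x.real) + 1j * np.sign(x.imag)) / np.sqrt(2)   # Eqs. (3)/(8)
Ry = y @ y.conj().T / T                                     # one-bit covariance
Rx_rec = np.sin(np.pi / 2 * Ry.real) + 1j * np.sin(np.pi / 2 * Ry.imag)  # Eq. (6)
# reference: normalized covariance of the unquantized data, Eq. (4)
Rx = x @ x.conj().T / T
q = 1.0 / np.sqrt(np.diag(Rx).real)
Rx_norm = Rx * np.outer(q, q)
```

For circularly symmetric Gaussian snapshots, Rx_rec converges to Rx_norm as the number of snapshots grows, which is what makes the one-bit covariance usable as a DNN input feature.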
Since the input features of the neural network we build are real-valued, we construct several real-number matrices:

q = [ℜ(r); ℑ(r)],  Φ = [ℜ(A) −ℑ(A); ℑ(A) ℜ(A)],  t = [ℜ(S); ℑ(S)],  e = [ℜ(n); ℑ(n)]   (9)

Finally, the data model under one-bit quantization can be described as

q = sign(Φt + e)   (10)

2.2 DOA Estimation Based on Deep Learning
As shown in Fig. 1, we established a deep neural network (DNN) framework for DOA estimation of multiple narrowband signals. The proposed structure mainly consists of data preprocessing, a spatial filtering autoencoder, multi-classifiers, and linear interpolation. In the following discussion, we consider the specific role of each part.
Fig. 1 Deep neural network model for DOA estimation of narrowband signals
The process of data preprocessing is used to generate feature vectors suitable for the subsequent DNN. In fact, the covariance matrix R_x = Σ_{i=1}^{N} x_i x_i^H / N of the observations x under N snapshots is conjugate symmetric, that is, the upper triangular part determines the lower triangular part. Usually, we can extract the features of the DOAs by constructing the vector

r̄ = [R_{1,2}, R_{1,3}, . . . , R_{1,M}, R_{2,3}, . . . , R_{2,M}, . . . , R_{M−1,M}]^T ∈ C^{(M−1)M/2 × 1}   (11)

r = [ℜ(r̄)^T, ℑ(r̄)^T]^T / ‖r̄‖_2   (12)
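The feature construction of Eqs. (11)–(12) is a few lines of NumPy; the 3 × 3 covariance below is a made-up toy value used only to illustrate the shapes:

```python
import numpy as np

def covariance_features(R):
    """Eqs. (11)-(12): stack the strict upper-triangular entries of the
    (Hermitian) covariance matrix, split them into real/imaginary parts,
    and normalize by the l2 norm of the complex feature vector."""
    M = R.shape[0]
    rows, cols = np.triu_indices(M, k=1)   # row-major: R12, R13, ..., R(M-1)M
    r_bar = R[rows, cols]                  # complex vector of length M(M-1)/2
    r = np.concatenate([r_bar.real, r_bar.imag])
    return r / np.linalg.norm(r_bar)

# Toy example: M = 3 gives 3 complex entries -> 6 real features of unit norm.
R = np.array([[1.0, 1j, 0.0],
              [-1j, 1.0, 2.0],
              [0.0, 2.0, 1.0]])
feat = covariance_features(R)
```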
where R_{m1,m2} represents the entry in the m1th row and m2th column of the covariance matrix R_x. It is worth noting that the observed data r has been quantized to +1 or −1, but we can approximately recover the normalized covariance matrix of the original data from the one-bit covariance matrix by formulation (6). Once the covariance matrix of the original data is recovered, we can apply the operations of (11) and (12) to generate suitable inputs for the autoencoder. Then, the autoencoder is trained to decompose the characteristics obtained from the incident signals into a discrete angle domain. Specifically, for the multiple input features r generated from θ^(0) to θ^(P), the angle domain is discretized
C. Wang et al.
into I small parts θ_1, θ_2, . . . , θ_I, where I/P = I_0. Assume that the feature vector r(θ_i) corresponding to θ_i is the input to the autoencoder; then the output of the p_i-th decoder is r(θ_i), and the outputs of the other P − 1 decoders are 0_{|r|×1}. Finally, the output of the P decoders for r(θ_i) can be described as

u(θ_i) = [u_1^T, . . . , u_P^T]^T = [0_{|r|×1}^T, . . . , 0_{|r|×1}^T, r^T(θ_i), 0_{|r|×1}^T, . . . , 0_{|r|×1}^T]^T   (13)

with p_i − 1 zero blocks before r^T(θ_i) and P − p_i zero blocks after it.
Usually, the loss function is defined as the ℓ2 norm of the error between the actual and the expected output:

ε^(1)(θ_i) = (1/2) ‖u(θ_i) − û(θ_i)‖_2^2   (14)
where û(θ_i) denotes the actual output of the autoencoder when the feature vector is r(θ_i). The weights and biases of the autoencoder can be updated through the backward propagation process. Next, keeping the weights and biases of the autoencoder unchanged, we use the feature vector r(θ_i) as input and the spatial spectrum as output to train the multi-classifier. We set P parallel multi-classifiers to receive the results from the autoencoder. Consider the case where two signals from θ and θ + Δ_j arrive, where Δ = {Δ_j}_{j=1}^J. Then, in the multi-classifier, the input feature vector is r(θ, Δ_j), where θ^(0) ≤ θ < θ^(P) − Δ_j, j = 1, . . . , J, and the output spatial spectrum can be defined as

[y(θ, Δ_j)]_l = (θ − θ̄_{l−1})/(θ̄_l − θ̄_{l−1}),  if θ̄_{l−1} ≤ θ < θ̄_l for θ ∈ {θ, θ + Δ_j};
[y(θ, Δ_j)]_l = (θ̄_{l+1} − θ)/(θ̄_{l+1} − θ̄_l),  if θ̄_l ≤ θ < θ̄_{l+1} for θ ∈ {θ, θ + Δ_j};
[y(θ, Δ_j)]_l = 0,  otherwise   (15)

Formulation (15) indicates that the reconstructed spatial spectrum has nonzero positive values only on the grid points adjacent to the true DOAs, so we can obtain the estimated DOAs through linear interpolation:

θ̂ = Σ_{i=1}^{N} θ_i α_i / Σ_{i=1}^{N} α_i   (16)

where θ_1, . . . , θ_N represent the discrete angles, and α_1, . . . , α_N represent the weights output by the multi-classifier on the corresponding grid points θ_1, . . . , θ_N. Note that in the process of training the multi-classifier, the training set is defined as Γ^(2) = [Γ_1^(2), . . . , Γ_J^(2)], where the jth input is Γ_j^(2) = [r(θ_1, Δ_j), . . . , r(θ_I − Δ_j, Δ_j)]. The corresponding label set is defined as Ψ^(2) = [Ψ_1^(2), . . . , Ψ_J^(2)], where the jth output is Ψ_j^(2) = [y(θ_1, Δ_j), . . . , y(θ_I − Δ_j, Δ_j)]. We also use the ℓ2 norm of the error as the loss function and update the weights and
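A sketch of the label construction of Eq. (15) and the interpolation of Eq. (16) on a 1° grid (the grid range and the example DOA are illustrative):

```python
import numpy as np

grid = np.arange(-60.0, 60.0, 1.0)   # theta_1, ..., theta_I on a 1-degree grid

def spectrum_label(doas, grid):
    """Eq. (15): each true DOA places weights on its two neighboring grid
    points, proportional to proximity; everything else stays zero."""
    y = np.zeros(grid.size)
    for theta in doas:
        l = int(np.searchsorted(grid, theta, side="right")) - 1
        if 0 <= l < grid.size - 1:     # DOAs outside the grid are skipped
            w = (grid[l + 1] - theta) / (grid[l + 1] - grid[l])
            y[l] += w
            y[l + 1] += 1.0 - w
    return y

def interpolate_doa(alpha, grid):
    """Eq. (16): weighted average of the grid angles with the classifier
    weights alpha (applied per isolated peak)."""
    return float(np.sum(grid * alpha) / np.sum(alpha))

label = spectrum_label([15.3], grid)     # weights 0.7 at 15 deg, 0.3 at 16 deg
estimate = interpolate_doa(label, grid)  # recovers 15.3 with no gridding effect
```

This is the mechanism that lets an off-grid DOA such as 15.3° be represented, and recovered, on a 1° grid.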
biases through back propagation. Namely, the loss function of the multi-classifier is defined as

ε^(2)(θ, Δ) = (1/2) ‖ŷ(θ, Δ) − y(θ, Δ)‖_2^2   (17)

where ŷ(θ, Δ) denotes the actual output of the multi-classifier.
3 Simulation Results
In this section, we investigate the performance of the proposed method and the traditional MUSIC algorithm. First, the relevant parameters of the deep neural network are set as follows. The training angle range is from −60° to 59° with a grid spacing of 1°; thus, 120 grid points are taken into account, with θ_1 = −60°, θ_2 = −59°, . . . , θ_I = 59°. The number of snapshots in the training set and test set is 400. For the spatial-domain autoencoder, 10 samples are randomly generated near each grid point, and the learning rate is set to 0.001. For the multi-classifier, each sample corresponds to two incident signals, and the angular interval of the two signals is selected from the set {2°, 4°, . . . , 40°}. Specifically, the angle of the first signal θ is sampled from −60° to 59° − Δ with an interval of 1°, and the second signal is at θ + Δ. Each combination produces 10 samples; therefore, the total number of training samples for the multi-classifier is (118 + 116 + · · · + 80) × 10 = 19,800. When testing, assume that 5 DOAs come from the directions −30°, −5°, 15.3°, 32.2°, 55°, as shown in Fig. 2. The proposed algorithm can estimate multiple signals at the same time with no gridding effect.
Fig. 2 Multi-signal DOA estimation testing results. The SNR is set to −5 dB; the estimated DOAs at the peaks are −30.000°, −5.000°, 15.050°, 32.328°, 55.000°
Then, we consider the ability of the proposed algorithm and the MUSIC algorithm to resolve multiple signals. Here, two signals in the range from −60° to 60° are taken into account with an angle interval of 14°; that is, if the first signal is at 0°, the second is set to 14°. The SNR is set to −5 dB. The simulation environment settings for MUSIC are exactly the same as for the DNN. Figure 3 shows the relationship between the predicted and true values of the DOAs, and their absolute errors, when one-bit data are used to train the DNN for DOA estimation and for MUSIC. It can be observed that the variance of the angle error estimated by the one-bit DNN is smaller, that is, the error is more concentrated than for MUSIC. Tables 1 and 2 further record the errors of the two methods when detecting the DOAs.

Table 1 Error record of signal 1

| Signal 1 | Mean absolute error | Standard prediction error | Max error |
|---|---|---|---|
| One-bit DNN | 0.071 | 0.107 | 0.581 |
| MUSIC | 0.108 | 0.228 | 1.4 |
Table 2 Error record of signal 2

| Signal 2 | Mean absolute error | Standard prediction error | Max error |
|---|---|---|---|
| One-bit DNN | 0.061 | 0.087 | 0.592 |
| MUSIC | 0.144 | 0.193 | 1.5 |

4 Conclusion
In this paper, we investigated using one-bit data for neural network training and achieved gridless DOA estimation. The proposed structure includes data preprocessing, a spatial filtering autoencoder, multi-classifiers, and linear interpolation. In particular, the spatial filtering autoencoder is used to extract the DOA features and has the ability to reduce noise. The multi-classifier and linear interpolation further improve the estimation accuracy when multiple signals arrive. Simulation results show that the proposed algorithm outperforms the traditional MUSIC method under low SNR.
Fig. 3 One-bit DNN compared with MUSIC: (a) one-bit DNN estimation result; (b) one-bit DNN estimation absolute error; (c) MUSIC estimation result; (d) MUSIC estimation absolute error
References

1. Yoon YS, Kaplan LM, McClellan JH (2006) TOPS: new DOA estimator for wideband signals. IEEE Trans Signal Process 54(6):1977–1989
2. LeCun Y, Bengio Y, Hinton G (2015) Deep learning. Nature 521(7553):436–444
3. Schmidt RO (1986) Multiple emitter location and signal parameter estimation. IEEE Trans Antennas Propag 34(3):276–280
4. Ganguly S, Ghosh I, Ranjan R, Ghosh J, Kumar PK, Mukhopadhyay M (2019) Compressive sensing based off-grid DOA estimation using OMP algorithm. In: 2019 6th international conference on signal processing and integrated networks (SPIN), Noida, India, pp 772–775
5. Miller MI, Fuhrmann DR (1990) Maximum-likelihood narrow-band direction finding and the EM algorithm. IEEE Trans Acoust Speech Signal Process 38(9):1560–1577
6. Gao Y, Hu D, Chen Y, Ma Y (2017) Gridless 1-b DOA estimation exploiting SVM approach. IEEE Commun Lett 21(10):2210–2213
7. Larsson EG, Edfors O, Tufvesson F (2014) Massive MIMO for next generation wireless systems. IEEE Commun Mag 52(2):186–195
8. Bencheikh ML, Wang Y (2010) Joint DOD-DOA estimation using combined ESPRIT-MUSIC approach in MIMO radar. Electron Lett 46(15):1081–1083
9. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell 39(6):1137–1149
10. Xiang H, Chen B, Yang M, Yang T, Liu D (2019) A novel phase enhancement method for low-angle estimation based on supervised DNN learning. IEEE Access 7:82329–82336. https://doi.org/10.1109/ACCESS.2019.2924156
11. Huang H, Gui G, Sari H, Adachi F (2018) Deep learning for super-resolution DOA estimation in massive MIMO systems. In: 2018 IEEE 88th vehicular technology conference (VTC-Fall), Chicago, IL, USA, pp 1–5. https://doi.org/10.1109/VTCFall.2018.8691023
12. Liu Z, Zhang C, Yu PS (2018) Direction-of-arrival estimation based on deep neural networks with robustness to array imperfections. IEEE Trans Antennas Propag 66(12):7315–7327. https://doi.org/10.1109/TAP.2018.2874430
13. Hecht-Nielsen R (1989) Theory of the backpropagation neural network. In: International joint conference on neural networks. IEEE
14. Van Vleck JH, Middleton D (1966) The spectrum of clipped noise. Proc IEEE 54(1):2–19
15. Bussgang JJ (1952) Cross-correlation functions of amplitude-distorted Gaussian signals. Tech Rep 216, Research Laboratory of Electronics, Massachusetts Institute of Technology
A New Altitude Estimation Algorithm for 3D Surveillance Radar

Jianguo Yu(B), Lei Gu, Dan Le, Yao Wei, and Qiang Huang

Nanjing Research Institute of Electronics Technology, Nanjing 210039, China
[email protected], [email protected], [email protected], [email protected], [email protected]
Abstract. This paper proposes an adaptive altitude estimation (AE) algorithm to improve the accuracy of 3D surveillance radar. First, the altitude measurement error is derived theoretically from the radar's measurement error matrix. Then, we design multiple models (MM) for altitude estimation, which contain maneuvering and constant velocity (CV) models working in parallel. The proposed AE algorithm adaptively chooses the optimal result by comprehensively using the Kalman filter's residual and the altitude velocity over a limited sliding window of data. The performance of the proposed AE algorithm is evaluated via simulations of two tracking scenarios. Experimental results show that the proposed AE algorithm can greatly improve the accuracy of altitude estimation under different scenarios.

Keywords: Altitude estimation · Multiple models · Altitude maneuvering · Kalman filter
1 Introduction

The altitude of an attacking target plays an important role in evaluating the target's threat in modern warfare. The target's real altitude cannot be obtained from the radar's measurements directly. A traditional 2D radar can only acquire the target's range and azimuth, so it lacks the ability to estimate the real altitude. Using more than two 2D radars to build a networked group [1] makes it possible to estimate the target's real altitude. Commonly, a 3D radar can acquire measurements of range, azimuth, and elevation, from which the target's altitude can be estimated. However, many factors may degrade the elevation precision, including multipath signals, atmospheric refraction, and target maneuvering, which make it difficult to estimate the target's real altitude precisely [2]. To improve the altitude estimation precision, a robust beamspace transformation method [3] was proposed, which treats all diffusely reflected multipath signals as interference that can be cancelled by the proposed beamspace transformation. John et al. [4] compared a series of simple algorithm approximations that are used to compute the altitude of a track. The approximations include the refraction model of
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_78
the atmosphere in the tropospheric region and a nonlinear least-squares fit of the database to compute the parameters. A four-beam height measurement method was proposed [5] to solve the height jumping and low height-measurement accuracy that 3D radar suffers under the dual-beam height measurement method. The rest of this paper is organized as follows. In Sect. 2, the altitude measurement error is derived from the radar's measurement error matrix, and the primary simulation results are discussed; multiple models (MM) filters for altitude estimation are also proposed. Experimental results in Sect. 3 show that the proposed AE algorithm improves the accuracy of altitude estimation. The last section gives the conclusions.
2 Multiple Models for Altitude Estimation

To filter the altitude, we should first calculate the altitude error from the radar's direct measurements. MM filters for altitude estimation are designed, which contain maneuvering and constant velocity (CV) models working in parallel. The proposed AE algorithm adaptively chooses the optimal result by comprehensively using the Kalman filter's residual and the altitude velocity over a limited sliding window of data.

2.1 Altitude Estimation Error Derivation

Figure 1 shows the geometry between radar and target: r is the radar's range measurement of the target, r_e is the equivalent radius of the earth, h_r and h_t represent the altitudes of the radar and the target, respectively, and θ is the elevation based on the line of sight. By the law of cosines, we obtain

(r_e + h_t)^2 = r^2 + (r_e + h_r)^2 − 2r(r_e + h_r) cos(π/2 + θ)   (1)
Fig. 1 Geometry of radar and target
From Eq. (1), we get

h_t = √(r^2 + (r_e + h_r)^2 + 2r(r_e + h_r) sin θ) − r_e   (2)
Let R_e = r_e + h_r; then Eq. (2) can be expressed in another way:

h_t = √(R_e^2 + r^2 + 2R_e r sin θ) − (R_e − h_r)
    = (√(R_e^2 + r^2 + 2R_e r sin θ) − R_e) + h_r
    = (r^2 + 2rR_e sin θ) / (√(r^2 + R_e^2 + 2rR_e sin θ) + R_e) + h_r   (3)

where the last step multiplies and divides by the conjugate √(R_e^2 + r^2 + 2R_e r sin θ) + R_e.
Obviously, R_e ≫ r, so Eq. (3) can be simplified as

h_t = (r^2 (sin^2 θ + cos^2 θ) + 2rR_e sin θ) / (√(r^2 + R_e^2 + 2rR_e sin θ) + R_e) + h_r
    ≈ r^2 cos^2 θ / (2R_e) + r sin θ + h_r   (4)
For a given radar, h_r is a known parameter that can be acquired from navigation equipment. The altitude error of h_t can be derived from Eq. (4):

δ_ht^2 = [∂h_t/∂r  ∂h_t/∂θ] · diag(δ_r^2, δ_θ^2) · [∂h_t/∂r  ∂h_t/∂θ]^T ≈ sin^2 θ · δ_r^2 + r^2 cos^2 θ · δ_θ^2   (5)
Here T in Eq. (5) stands for the matrix transpose operation, and δ_ht^2 is the altitude error variance derived from the radar's measured range r and elevation θ. Commonly, δ_r is the radar's range measurement error, which mainly depends on the working band, and δ_θ is the radar's elevation measurement error, which mainly depends on the beam width and SNR. The value of δ_r is far smaller than r, so we set δ_r as a random value with a known Gaussian distribution and simulate the altitude error under different situations. From Fig. 2, the altitude error mainly depends on δ_θ and has little relation to the elevation level.

2.2 Altitude Filter Based on Maneuvering Detection

To reduce the altitude estimation error and avoid tracking-model mismatch when the target is maneuvering in altitude, we propose a multiple models filter, which contains maneuvering and constant velocity (CV) models working in parallel. The proposed AE algorithm adaptively chooses the optimal result by comprehensively using the filter's residual and the altitude velocity over a limited sliding window of data. The flowchart of the algorithm is shown in Fig. 3.
Fig. 2 Altitude error with different elevation error
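The computations behind Eqs. (2), (4), and (5) can be sketched as follows; the equivalent earth radius, radar height, and the example inputs are illustrative assumptions (the 30 m range error and 0.2° elevation error are the Sect. 3.1 values; angles are in radians):

```python
import math

def altitude_exact(r, theta, h_r, r_e):
    """Eq. (2): exact target altitude from range r and elevation theta."""
    R_e = r_e + h_r
    return math.sqrt(r**2 + R_e**2 + 2.0 * r * R_e * math.sin(theta)) - r_e

def altitude_approx(r, theta, h_r, r_e):
    """Eq. (4): approximation valid when R_e >> r."""
    R_e = r_e + h_r
    return r**2 * math.cos(theta)**2 / (2.0 * R_e) + r * math.sin(theta) + h_r

def altitude_error_std(r, theta, sigma_r, sigma_theta):
    """Eq. (5): first-order error propagation through Eq. (4)."""
    return math.sqrt((math.sin(theta) * sigma_r)**2
                     + (r * math.cos(theta) * sigma_theta)**2)

# Example: 4/3-earth equivalent radius, 100 km range, 5 deg elevation.
r_e = 4.0 / 3.0 * 6371e3
h = altitude_exact(100e3, math.radians(5.0), 10.0, r_e)
h_approx = altitude_approx(100e3, math.radians(5.0), 10.0, r_e)
sigma_h = altitude_error_std(100e3, math.radians(5.0), 30.0, math.radians(0.2))
```

At this geometry the elevation term of Eq. (5) dominates by two orders of magnitude, which is exactly the behavior reported for Fig. 2.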
Fig. 3 Flowchart of the altitude filter: calculate the altitude error; run CV-model and maneuvering-model filtering in parallel; compute residual and velocity statistics for each; perform maneuvering detection; choose the model and output the result
Here, we choose the Singer model [6] as the maneuvering model. The maneuvering detection result is based on two facts: first, the ratio of residuals exceeding the threshold is larger than N/M; second, the mean altitude velocity is lower than a threshold. N is the number of times the filter residual exceeds the given threshold, and M is the total number of samples in the sliding window. In this paper, N/M is 2/3, and the altitude velocity threshold is 10 m/s.
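A sketch of this sliding-window decision rule follows. The window length, residual threshold, and the exact way the two conditions combine are assumptions on our part; the paper fixes only N/M = 2/3 and the 10 m/s velocity threshold:

```python
def choose_model(residuals, alt_velocities, res_thresh,
                 window=9, n_over_m=2.0 / 3.0, vel_thresh=10.0):
    """Declare the maneuvering model when, over the last `window` updates,
    the fraction of filter residuals above `res_thresh` exceeds N/M while
    the mean altitude velocity is still below `vel_thresh` (the CV filter's
    velocity estimate lags a sudden dive, so a high-residual/low-velocity
    window flags a maneuver); otherwise keep the CV model."""
    win_r = list(residuals)[-window:]
    win_v = list(alt_velocities)[-window:]
    over = sum(abs(r) > res_thresh for r in win_r)
    mean_vel = sum(abs(v) for v in win_v) / len(win_v)
    if over / len(win_r) > n_over_m and mean_vel < vel_thresh:
        return "maneuvering"
    return "cv"

# High residuals while the estimated climb rate is still small -> maneuver.
model = choose_model([250.0] * 9, [3.0] * 9, res_thresh=100.0)
```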
3 Experiments and Discussions

3.1 Experimental Setup

We tested the proposed AE algorithm in two typical scenarios. In the first scenario, the target dives and climbs with an altitude velocity of more than 40 m/s. In the second scenario, the target keeps almost the same altitude at different ranges. The 3D surveillance radar has a working bandwidth of 5 MHz, a ranging error of 30 m, and an azimuth and elevation error of 0.2°.

3.2 Experimental Results

3.2.1 Maneuvering Scenario

The target's moving parameters are shown in Table 1.

Table 1 Target's moving parameters
| Altitude (m) | Start time (s) | End time (s) | Duration (s) |
|---|---|---|---|
| 11,900 | 0 | 209.9 | 209.9 |
| 11,900–8800 | 210.0 | 422.9 | 213 |
| 8800 | 423 | 472.5 | 49.5 |
| 8800 | 472.6 | 634.2 | 161.6 |
| 8800–6800 | 634.3 | 765.5 | 131.2 |
| 6800–7100 | 765.6 | 1162.7 | 397.1 |
| 7100–8800 | 1162.8 | 1294.3 | 131.5 |
| 8800 | 1294.4 | 1529.4 | 235.0 |
Figure 4 shows the comparison results of the different altitude estimation algorithms, and the tracking errors are compared in Fig. 5. From Figs. 4 and 5, we can see that the altitude estimation precision of the algorithms is almost the same when the target keeps the same altitude level. However, when the target starts to dive from 422 to 482 s, the traditional algorithm's tracking error immediately increases up to 1500 m. In contrast, the proposed AE algorithm detects the altitude maneuvering event and adaptively chooses the optimal result, which keeps the maximum tracking error within 192 m. The results illustrate that the AE algorithm has a strong maneuvering detection ability and can greatly reduce the altitude maneuvering tracking error.
Fig. 4 Altitude estimation results (real data, traditional algorithm, and proposed AE algorithm; altitude (m) versus time (s))
Fig. 5 Altitude estimation error (traditional algorithm vs. proposed AE algorithm; altitude error (m) versus time (s))
From the above experimental results, we can see that the altitude estimation precision decreases as the target's range increases. The average root mean square deviation at 300 km is about 500 m for the traditional algorithm, while the proposed AE algorithm decreases it to about 84 m, which means the altitude precision is improved by 83.2%.
Fig. 6 Altitude estimation results (real data, traditional algorithm, and proposed AE algorithm; altitude (km) versus range (km))
Fig. 7 Altitude estimation error (traditional algorithm vs. proposed AE algorithm; altitude error (m) versus range (km))
4 Conclusion

This paper presents an adaptive AE algorithm to improve the accuracy of 3D surveillance radar. The proposed AE algorithm adaptively chooses the optimal result by comprehensively using the filter's residual and the altitude velocity over a limited sliding window of data. Experimental results show that the proposed AE algorithm can greatly improve
the accuracy of altitude estimation under different scenarios, especially when the target is maneuvering in altitude or moving at long range.
References

1. Dai X, He J (2006) Study on true altitude estimation of 3-D radar. Command Control Simul 28(5):98–103
2. Lin X, Xiang L, et al (2018) Research on background knowledge based software processing method of target height measurement for 3D radar. MATEC Web Conf 175:1–4
3. Shu T, Liu X, Yu W (2009) Target height finding in narrowband ground-based 3D surveillance radar using beamspace approach. In: IEEE radar conference, pp 1–6
4. John JS, Martin M (1994) Estimating true altitude of a track from local 3-D Cartesian coordinates of a radar. IEEE 2:1014–1017
5. Yu-Jie L, Dan-Dan D, et al (2014) An improved height measurement method for 3D radar. Mod Electron Technique 37(5):16–20
6. Singer RA (1970) Estimating optimal tracking filter performance for manned maneuvering targets. IEEE Trans Aerosp Electron Syst 6(4):473–483
Cause Analysis and Solution of Burst Multicast Data Packet Loss

Zongsheng Jia1(B) and Jiaqi Zhen2

1 Unit 92941 of the PLA, Huludao 125000, China
[email protected]
2 College of Electronic Engineering, Heilongjiang University, Harbin 150080, China
Abstract. In some service deployments, in order to control multiple computers from a single computer, it is necessary to use multicast to send control data. However, in practical application scenarios, if the interval between two multicast data transmissions is too long, packet loss occurs. To solve this problem, this paper discusses the basic principle of multicast, analyzes the technical cause of packet loss in the process of multicast establishment, puts forward a solution, and tests it in an actual network environment to verify its effectiveness.

Keywords: Multicast · PIM-SM · IGMP · RPT · SPT
1 Introduction

There are unicast, broadcast, multicast, and other data transmission modes in IP networks, among which multicast is suitable for point-to-multipoint communication and avoids the pressure that broadcast places on the network. In some control fields, a single network device needs to send time-sensitive control information to multiple network devices; in this case, multicast is the most convenient and effective way. Due to the timeliness requirement, the control information needs to be sent within milliseconds. To improve reliability, the same control information is transmitted repeatedly for 40 frames, and the receiving end judges the validity of the control information by calculating the proportion of frames received. Generally, the probability of receiving correct data is required to be greater than 90% (i.e., of the 40 transmitted frames, at least 36 must be received). However, in actual application, if the transmission interval of the control information is too long, serious packet loss occurs, so that the receiver cannot correctly determine the control information, affecting the normal operation of the service. Li et al. [1] analyze the causes of data disorder and packet loss in the process of switching from the RPT to the SPT and propose a zero-packet-loss solution. However, their solution modifies the SwitchToSptDesired(S, G) function in the PIM-SM
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_79
protocol code so that the switch from RPT to SPT is carried out only after all data packets on the RPT have been received. Although this solution achieves zero packet loss, modifying the protocol code is impractical in real engineering applications. Chong [2] proposes a scheme that forbids switching from RPT to SPT. Although this scheme achieves zero packet loss of multicast data, the multicast traffic places serious pressure on the RP, so it is not an ideal solution. Wang [3] analyzes multicast packet loss under large background traffic and proposes using a static multicast group to solve it. However, the static multicast group method effectively changes the multicast ASM model into the SSM model, which does not quite match the actual application scenarios. Therefore, this paper carries out many tests in an existing network environment, analyzes the test results to find the causes of packet loss, proposes an effective solution that increases the survival time of the multicast source, and verifies the effectiveness of the proposed method through repeated tests in the actual network environment.
2 Actual Network Environment

The network environment for multicast operation is shown in Fig. 1. There are four routers, named R1, R2, R3, and R4, in the network. R4 is connected to two Layer 2 switches, S1 and S2, which are connected to PC1 and PC2, respectively. PC1 is the multicast source and PC2 is the multicast receiver. The four routers are configured with Protocol Independent Multicast-Sparse Mode (PIM-SM) in Any-Source Multicast (ASM) mode, and the interface of R4 connecting to S2 runs the IGMP protocol. The four routers are configured with dynamic RP, and the C-RPs configured on the four routers have the same priority; however, because R1 has the larger IP address, it is elected as the RP for the whole network. S1 and S2 run the IGMP snooping protocol. R1, R2, R3, and R4 are located in four different places, while R4, S1, and S2 are located in the same machine room; the network delay between PC1 or PC2 and the RP (R1) is about 8 ms, and that between PC1 or PC2 and R4 is less than 1 ms. The multicast sending program runs on PC1, sending a burst of 40 frames of data each time; after the 40 frames have been sent, it stays silent for a certain time and then sends another burst of 40 frames, simulating actual bursty control information. PC2 runs the multicast receiving program. Analysis of many test results shows that as the transmission interval increases, disorder and packet loss gradually appear and grow more and more serious, so that when the transmission interval reaches 4 h, there is a 97.56% chance that PC2 receives only the first frame of data and the remaining 39 frames are all lost. The test results are shown in Table 1.
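For illustration, the burst sender and the receiver-side validity rule described above might look as follows in Python. The group address, port, TTL, and inter-frame gap are assumptions; the 90% acceptance rule is from the text:

```python
import socket
import struct
import time

GROUP, PORT = "239.1.1.1", 5000   # illustrative multicast group and port

def send_burst(payload: bytes, frames: int = 40, gap_s: float = 0.001):
    """Send the same control frame 40 times in quick succession, then stay
    silent until the next burst (the sending pattern used in the tests).
    Each frame is prefixed with a 16-bit sequence number."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", 8))
    try:
        for i in range(frames):
            s.sendto(struct.pack("!H", i) + payload, (GROUP, PORT))
            time.sleep(gap_s)
    finally:
        s.close()

def burst_valid(frames_received: int, frames_sent: int = 40,
                min_percent: int = 90) -> bool:
    """Receiver-side rule: the control word is accepted only when at least
    90% of the burst (36 of 40 frames) arrives."""
    return 100 * frames_received >= min_percent * frames_sent
```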
3 Multicast Operation Principle and Packet Loss Analysis

3.1 Multicast Establishment Process

In a multicast network, there is a Rendezvous Point (RP), which serves all multicast groups in the network. All routers in the network know the location of the RP [4].
Fig. 1 Network topology diagram
When a multicast receiver appears in the network, its directly connected router reports to the RP hop by hop, and a (*, G) entry is created at each router along the path and at the RP. The root of this distribution tree is the RP, so the multicast distribution tree established from the (*, G) entries is also called the Rendezvous Point Tree (RPT). When a multicast source appears, it must first register with the RP: the multicast data it sends is encapsulated in a register message and sent to the RP in unicast. After receiving the register message, the RP registers the multicast source, forwards the data to the multicast receivers according to the existing (*, G) entries, and creates (S, G) entries from the (*, G) entries; the (S, G) entries are used to create a Shortest Path Tree (SPT). The path from the multicast source to the multicast receiver via the RP may not be optimal, resulting in relatively high overhead; however, there is an optimal path between the multicast source and the multicast receiver in the network, called the SPT, which is the most efficient way to forward packets. After PC2 starts the multicast receiving program, it sends a multicast join message to R4 through switch S2, and R4 sends multicast join messages toward R1 (RP) through R3 and R2, creating (*, G) entries hop by hop to generate an RPT rooted at the RP. When PC1, running the multicast sending program, sends multicast data to the group, the data reaches R4 (DR) via S1; R4 encapsulates the multicast data and sends it to R1 (RP) in unicast via the path R4-R3-R2-R1. The RP decapsulates the message to complete the registration of the multicast source, establishes (S, G) entries according to the existing (*, G) entries, and sends the multicast data to PC2 along the RPT (R1-R2-R3-R4-S2), completing one multicast transmission.
Therefore, no matter how long the transmission interval is, the first frame of data can be reliably
Table 1 Multicast test results for the existing network

| Sending interval | Number of tests | Tests without packet loss | Tests with packet loss | Maximum packet loss | Minimum packet loss | Average packet loss | Packet loss probability |
|---|---|---|---|---|---|---|---|
| 60 s | 5175 | 5175 | 0 | N/A | N/A | N/A | 0 |
| 110 s | 731 | 731 | 0 | N/A | N/A | N/A | 0 |
| 120 s | 3647 | 3647 | 0 | N/A | N/A | N/A | 0 |
| 130 s | 5062 | 5062 | 0 | N/A | N/A | N/A | 0 |
| 140 s | 615 | 615 | 0 | N/A | N/A | N/A | 0 |
| 150 s | 576 | 576 | 0 | N/A | N/A | N/A | 0 |
| 160 s | 217 | 217 | 0 | N/A | N/A | N/A | 0 |
| 3 min | 544 | 544 | 0 | N/A | N/A | N/A | 0 |
| 4 min | 364 | 248 | 116 | 11 | 1 | 2 | 31.87% |
| 5 min | 637 | 566 | 71 | 29 | 1 | 3 | 11.15% |
| 10 min | 448 | 264 | 184 | 39 | 1 | 4 | 41.07% |
| 15 min | 96 | 71 | 25 | 35 | 1 | 5 | 26.04% |
| 30 min | 47 | 28 | 19 | 39 | 1 | 4 | 40.43% |
| 1 h | 33 | 22 | 11 | 13 | 1 | 4 | 33.33% |
| 4 h | 41 | 1 | 40 | 39 | 39 | 39 | 97.56% |
transmitted to the multicast receiver (in the previous tests, no matter how serious the packet loss, the multicast receiver PC2 always received the first frame of multicast data from the multicast source).

3.2 The DR at the Group Member Side Switches RPT to SPT

In a PIM-SM network, a multicast group corresponds to only one RP, and only one RPT is constructed. Without SPT switchover, the RP is a necessary transit point for all multicast messages; as the multicast message rate increases, this imposes a huge burden on the RP. To solve this problem, PIM-SM allows the group member side DR to reduce the burden on the RP by triggering SPT switchover. As shown in Fig. 1, the multicast data sent by PC1 reaches PC2 via the RPT (S1-R4-R3-R2-R1(RP)-R2-R3-R4-S2), and the path cost is relatively large; therefore, switching to the SPT (S1-R4-S2) is a better choice.

3.3 Analysis of the Causes of Packet Loss

By default, after receiving the first multicast data packet, the DR (R4) at the group member side switches from RPT to SPT. After switching to the SPT, the DR at the group member
side sends a prune message to the RP to tell it to stop sending multicast data to the DR, preventing PC2 from receiving duplicate data. However, if the multicast source sends no multicast data for a certain interval, RP (R1), R2, R3, and R4 delete the (S, G) entry of the multicast group, which is necessary for constructing the SPT. Therefore, when the multicast source transmits again, the multicast source registration, the (S, G) entry establishment, and the SPT establishment according to the (S, G) entries must all be performed anew. Re-establishing the SPT takes some time, which leads to packet loss when burst multicast data is sent out before the SPT has been established.
4 Solutions

Methods to solve the packet loss problem have been described in [1–3]: modifying the protocol source code, forbidding SPT switchover, or configuring a static multicast group, but these methods all have the deficiencies noted above. According to the previous analysis, the key to solving the packet loss problem is to make the routers maintain the (S, G) entry used to establish the SPT. As long as the entry maintenance time is longer than the multicast data transmission interval, the SPT from the multicast source to the multicast receiver always exists, and packet loss does not occur. In the PIM-SM protocol, when a PIM device receives a multicast message from a multicast source, it starts the timer of the (S, G) entry, whose period is set to the source lifetime. If a message from the multicast source is received before the timeout, the timer is reset; if no message is received after the timeout, the (S, G) entry is deemed invalid and deleted. By default, if the multicast source sends no further data to the multicast group after one transmission, the routers delete the multicast source after 210 s, and the (S, G) entries of the corresponding multicast group are also deleted (the test results in Table 1 likewise show that no packet loss occurs when the sending interval is at most 3 min). When the multicast source sends data to the group again, it must re-register with the RP and re-establish the (S, G) entries, which leads to packet loss. Therefore, making the multicast source live as long as possible helps to solve the packet loss problem.
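The effect of the source-lifetime timer on a bursty sender can be captured in a toy model. The all-but-first-frame loss on expiry is the worst case from Table 1's 4-h row; real rebuild losses vary, as the 4-min row's average of 2 shows:

```python
def burst_deliveries(gaps_s, lifetime_s=210, burst=40):
    """Frames delivered per burst for a source whose bursts are separated by
    the given gaps (seconds). The first burst is assumed to find the SPT in
    place; whenever a gap exceeds the (S, G) source lifetime, the tree state
    has been torn down and, in the worst case, only the first
    (register-tunnelled) frame of the next burst gets through while the SPT
    is re-established."""
    delivered = [burst]
    for gap in gaps_s:
        delivered.append(1 if gap > lifetime_s else burst)
    return delivered

# Default 210 s lifetime: a 4-h silence loses 39 of 40 frames (Table 1).
default_run = burst_deliveries([60, 4 * 3600])
# Lifetime raised to 65,535 s: the same 4-h silence loses nothing (Table 2).
fixed_run = burst_deliveries([60, 4 * 3600], lifetime_s=65535)
```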
5 Test Verification

Run the multicast sender and receiver programs, written in Python (version 3.7.2), on PC1 and PC2, respectively; set the multicast source lifetime of the PIM protocol in router R4 to its maximum of 65,535 s; keep the other configurations unchanged; and carry out the experimental verification. Since the previous test showed no packet loss for sending intervals of up to 3 min, this test starts from an interval of 4 min. The results are shown in Table 2: the packet loss rate at every interval is 0. Even at a transmission interval of 4 h there is still zero packet loss, whereas in the previous network the loss rate at that interval was 97.56% (only the first of 40 packets was received; see Table 1). Therefore, increasing the survival time of the multicast source achieves zero packet loss for sporadic multicast data, completely solving the packet loss problem and reliably realizing the goal of controlling multiple devices in multicast mode from one device.

594
Z. Jia and J. Zhen

Table 2 Test results for prolonging the survival time of multicast sources

Sending interval | Number of tests | Packets not dropped | Packet losses | Maximum packet loss | Minimum packet loss | Average packet loss | Packet loss probability
4 min  | 100 | 100 | N/A | N/A | N/A | N/A | 0
5 min  | 100 | 100 | N/A | N/A | N/A | N/A | 0
10 min | 100 | 100 | N/A | N/A | N/A | N/A | 0
15 min | 100 | 100 | N/A | N/A | N/A | N/A | 0
30 min | 100 | 100 | N/A | N/A | N/A | N/A | 0
1 h    | 50  | 50  | N/A | N/A | N/A | N/A | 0
4 h    | 50  | 50  | N/A | N/A | N/A | N/A | 0
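The verification above used sender and receiver programs written in Python. A minimal sketch of such a sporadic multicast sender is shown below; the group address, port, and payload are hypothetical, and the sketch routes via loopback so it stays self-contained:

```python
import socket
import time

GROUP, PORT = "239.1.1.1", 5000  # hypothetical multicast group and port

def send_bursts(count, interval_s, payload=b"probe"):
    """Send `count` numbered datagrams to the multicast group, pausing
    `interval_s` seconds between sends to mimic sporadic multicast traffic.
    Returns the number of datagrams sent."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # keep the sketch self-contained: send via the loopback interface
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_IF,
                    socket.inet_aton("127.0.0.1"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 8)
    sent = 0
    for i in range(count):
        sock.sendto(payload + b" %d" % i, (GROUP, PORT))
        sent += 1
        if i < count - 1:
            time.sleep(interval_s)
    sock.close()
    return sent
```

For example, `send_bursts(40, 240)` would reproduce the 4-minute interval test case of Table 2.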
6 Conclusion

Aiming at the problem of packet loss in sporadic multicast traffic, this paper discusses the causes of packet loss, proposes a solution, and verifies its feasibility. The results show that zero packet loss can be achieved by increasing the lifetime of the multicast source, and the scheme only requires setting the multicast source lifetime with a simple command on the router, without prohibiting the RPT-to-SPT switchover or configuring static multicast groups. The scheme is simple, feasible, and easy to operate.
References
1. Li X, Han G, Liu H (2007) Research on packet loss during RPT to SPT switch. Comput Eng. https://doi.org/10.1016/j.cageo.2006.02.011
2. Chong J (2012) Research of packet loss in IP network multicast. Radio Commun Technol. CNKI:SUN:WXDT.0.2012-04-018
3. Wang X, Shi X (2012) Study on packet loss during transmission of sporadic multicast data. J Spacecraft TT&C Technol 31(4)
4. Huawei Technologies Co., Ltd. (2012) S9300, S9300E, and S9300X V200R010C00 product documentation. Available via https://support.huawei.com/hedex/hdx.do?docid=EDOC1000135334&lang=en
The Prediction Model of High-Frequency Surface Wave Radar Sea Clutter with Improved PSO-RBF Neural Network Shang Shang1(B) , Kangning He1 , Tong Yang1 , Ming Liu1 , Weiyan Li1 , and Guangpu Zhang2 1 School of Electronic and Information, Jiangsu University of Science and Technology,
Zhenjiang 212003, China [email protected] 2 College of Underwater Acoustic Engineering, Harbin Engineering University, Harbin 150001, China
Abstract. The sea clutter of high-frequency surface wave radar (HFSWR) has chaotic characteristics. The phase space reconstruction method is used to extend the one-dimensional sea clutter time series into a multi-dimensional phase space so as to fully reveal the internal dynamics of the sea clutter; a radial basis function (RBF) neural network is then trained to learn these dynamics and establish the prediction model. Because the initial parameters of the network affect the convergence speed and the accuracy of the model, the particle swarm optimization (PSO) algorithm is used to optimize the initial parameters of the RBF neural network. To address the slow convergence of the PSO algorithm and its tendency to get trapped in local optima, this paper proposes an improved PSO algorithm based on staged optimization. The simulation results show that the improved PSO algorithm has higher convergence accuracy, and that the optimized RBF neural network prediction model has higher stability and accuracy and a better prediction effect on sea clutter. Keywords: Sea clutter · Chaotic characteristics · RBF neural network · Particle swarm optimization · Prediction
1 Introduction

HFSWR can monitor the ocean all-weather and over the horizon, and it currently plays an important role in both military and civil fields. However, the resonance between HF electromagnetic waves and sea waves produces sea clutter of various scales. The first-order components of sea clutter often submerge the target echo, which is the biggest challenge for marine target detection. To predict sea clutter, scholars have proposed some classic statistical models, including the lognormal distribution, Weibull distribution, and K-distribution [1, 2].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_80
596
S. Shang et al.
However, these models are not universally suitable for different sea conditions, and the prediction error is large. Therefore, people turn their focus to the study of chaos characteristics of sea clutter and find that sea clutter contains some certainty and has many typical characteristics of chaos [3]. Based on the theory of phase space reconstruction, this paper uses RBF neural network to learn the internal dynamics of sea clutter and construct a prediction model. We use the PSO algorithm to optimize the network and make some improvements on it. The improved PSO-RBF neural network model has higher stability and accuracy for sea clutter prediction.
2 Phase Space Reconstruction

Sea clutter of HFSWR is a multivariable chaotic system produced by multiple factors. To fully reveal the laws contained in the time series, the one-dimensional series must be extended to a high-dimensional space, that is, the phase space reconstruction of the time series [4]. Phase space reconstruction involves two important parameters: the time delay τ and the embedding dimension m. In this paper, the C–C algorithm [5] is used to calculate both parameters. The one-dimensional time series x1, x2, x3, …, xn is extended to the high-dimensional phase space:

Y(i) = [xi, xi+τ, …, xi+(m−1)τ]   (1)
where i = 1, 2, 3, …, n − (m − 1)τ, and Y(·) is a phase point in the phase space. The phase point Y(i) at time i can be predicted by a function F:

{xi, xi+τ, …, xi+(m−1)τ} = F{xi−1, xi−1+τ, …, xi−1+(m−1)τ}   (2)
It can be seen from formula (2) that the known sampling value can predict the sampling value of the next time through nonlinear mapping, and then we get the prediction equation of sea clutter as shown in formula (3). xi = f (xi−τ , xi−2τ , . . . , xi−mτ )
(3)
In order to provide more information for the system, S. Haykin improved the prediction equation as follows [6]. xi+mτ = f (xi , xi+1 , . . . , xi+mτ −1 )
(4)
The sea clutter prediction equation above contains multiple variables; it is difficult to obtain its analytical expression, so it is necessary to model the function f .
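The delay embedding of Eq. (1) is straightforward to implement; a minimal numpy sketch (the function name is ours) is:

```python
import numpy as np

def delay_embed(x, m, tau):
    """Map a 1-D series x into phase space: row i is the delay vector
    Y(i) = [x_i, x_{i+tau}, ..., x_{i+(m-1)tau}], for i = 0 .. n-(m-1)tau-1."""
    x = np.asarray(x)
    n = len(x) - (m - 1) * tau
    return np.array([x[i:i + (m - 1) * tau + 1:tau] for i in range(n)])
```

With the values found later by the C–C algorithm (τ = 4, m = 3), each delay vector has three components.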
3 Prediction Model of RBF Neural Network

The RBF neural network has strong nonlinear mapping capability, so it is used to learn the prediction equation of sea clutter. The RBF neural network has a three-layer structure; from formula (4), the number of input nodes should be
The Prediction Model of High-Frequency Surface …
597
determined by the phase space reconstruction parameters. The data of the input layer are mapped to the hidden layer by a Gaussian kernel function. The hidden layer and the output layer of the network are connected by the weight parameters ω, and the output value is given by formula (5):

Y = Σ_{k=1}^{h} ωk G(x, ck)   (5)
where h is the number of hidden-layer nodes, and G(·) is the output of a hidden node. The structure of the RBF neural network used in this paper is shown in Fig. 1.
Fig. 1 Structure of RBF neural network
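The forward pass of formula (5) with Gaussian kernels G can be sketched as follows (variable names are ours):

```python
import numpy as np

def rbf_forward(x, centers, widths, weights):
    """RBF network output y = sum_k w_k * G(x, c_k), with Gaussian kernel
    G(x, c_k) = exp(-||x - c_k||^2 / (2 * s_k^2))."""
    g = np.exp(-np.sum((centers - x) ** 2, axis=1) / (2.0 * widths ** 2))
    return float(weights @ g)
```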
k sets of training data are constructed to train the network, which finally generates a predictor for the sea clutter of adjacent units. Considering the extreme sensitivity of a chaotic system to initial conditions, one-step prediction is adopted in this paper. The prediction effect is evaluated by formula (6):

E = 1 − Σ_{i=1}^{N} (ei − ē)² / Σ_{i=1}^{N} (Yi − Ȳ)²   (6)

where e is the error between the network output and the observed value of sea clutter, ē is the expected value of the error, and Y and Ȳ represent the observed value and its mathematical expectation, respectively.
4 Optimizing RBF Neural Network with PSO Algorithm

The performance of the prediction model is directly affected by the initial parameters. To obtain optimal initial parameters for the neural network, we introduce the PSO algorithm and make some improvements to it.

4.1 PSO Algorithm

The PSO algorithm searches for the global optimum by imitating the foraging behavior of birds. Each particle has a position vector and a velocity, both initialized randomly in the search space. The dimension of the search space is D, and the population contains M particles. The position of the i-th particle is Xi = [xi1, xi2, …, xiD], its velocity is Vi = [vi1, vi2, …, viD], pbesti = [pi1, pi2, …, piD] is the best position found so far by this particle, and gbest = [g1, g2, …, gD] is the best position in the current population. Particles update their velocity and position according to Eq. (7):

v_id^{k+1} = ω v_id^k + c1 rand()(pbest_id^k − x_id^k) + c2 rand()(gbest_d^k − x_id^k)
x_id^{k+1} = x_id^k + v_id^{k+1}   (7)
where k is the current iteration index, ω is the inertia weight, c1 and c2 are learning factors, d is the dimension index, and v_id^k and x_id^k are the velocity and position of the i-th particle in dimension d at the k-th iteration. The standard PSO algorithm converges slowly and is prone to falling into local optima. The main directions of improvement are population diversity [7] and parameter control [8, 9]. Shi proposed the strategy of linear decreasing weight (LDW) [10]; however, such a strategy cannot converge quickly in the later period.

4.2 Improved PSO Algorithm

This paper proposes a staged search (SS) method, which divides the entire optimization process into three phases. The early and late phases are devoted to global search and local exploration, respectively, and the middle phase is a transition; formula (8) is used to control the inertia weight:

ω = ωmax,                                   t ≤ t1
ω = ωmin + (ωmax − ωmin) · a^{(t−t1)/T},    t1 < t < t2      (8)
ω = ωmin,                                   t2 < t < tmax

where a = 0.005 is the sag constant, which controls the evolution speed of the particles from global search to local exploration; t1 and t2 are the cut-off steps, and T = t2 − t1 is the number of mid-phase iteration steps. The evolution of the learning factors is controlled by formula (9). It makes
The Prediction Model of High-Frequency Surface …
599
the population keep its diversity in the early stage and converge rapidly to the global optimum in the later stage:

c = cmin + 1 / (1 + e^{±q·(t − tmax/2)})   (9)
where q = 0.1 is the compression constant, which controls the evolution speed of the learning factors, and cmin is the minimum value of the learning factors. When the sign before the compression constant is "+", the formula gives the evolution track of c1; otherwise it gives the evolution track of c2. The fitness evaluation function is

Ep = Σ_{i=1}^{N} (Yi − Ŷi)² / N   (10)

where N is the number of training samples used in calculating the fitness value, Y is the actual output of the network, and Ŷ is the expected output. The network parameters to be optimized include the data centers, data widths, and neural weights. These three kinds of parameters are encoded into the position vector of a particle, so the optimal initial network parameters are found indirectly by iteratively updating the particle positions in the global space.
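The staged schedules can be sketched as follows. This is one reading of formulas (8) and (9), with the mid-phase exponent taken as a^{(t−t1)/T} so that ω is continuous at t1 and decays to ωmin at t2, and with the learning-factor track chosen so that c tends to its minimum cmin:

```python
import math

def inertia_weight(t, t1, t2, w_max=0.95, w_min=0.4, a=0.005):
    """Staged inertia weight: constant w_max in the early phase, an
    exponential decay during the transition, constant w_min late."""
    if t <= t1:
        return w_max
    if t >= t2:
        return w_min
    T = t2 - t1
    return w_min + (w_max - w_min) * a ** ((t - t1) / T)

def learning_factor(t, t_max, c_min=1.5, q=0.1, sign=+1):
    """Sigmoid evolution of the learning factors; sign=+1 gives the
    decreasing c1 track, sign=-1 the increasing c2 track."""
    return c_min + 1.0 / (1.0 + math.exp(sign * q * (t - t_max / 2)))
```

With the simulation settings below (300 iterations, cmin = 1.5), c1 starts near 2.5 and decays toward 1.5, while c2 does the opposite.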
5 Simulation In this section, the measured data of a batch of HFSWR is taken for simulation experiment. The operating frequency of the radar is 3.7 MHz, and the time domain sampling interval is 0.149 s. The data are divided into I-channel and Q-channel; we take the data of I-channel as an example. In the PSO algorithm, set the population size to 20, the maximum number of iteration steps is 300, each of the three optimization stages takes up 100 iteration steps, ωmax is 0.95, ωmin is 0.4, and cmin is 1.5. The convergence process of PSO is shown in Fig. 2, where SSPSO is the improved method in this paper. In the figure, the PSO algorithm stops convergence in the early stage and falls in the local optimum. The LDWPSO method has improved convergence accuracy compared to the PSO method, but the convergence speed is slow. SSPSO has a fast convergence rate in the later stage, and the convergence accuracy is the highest. We select the sea clutter data of the 30th distance unit to construct training data and normalize the data. Through the C–C algorithm, we get τ = 4, and m = 3, so the number of input nodes is m · τ = 12. Setting the number of hidden layers to 4. Figure 3 shows the convergence curve of training error of network model optimized by different optimization algorithms, with the expected accuracy of 0.01 and training steps of 1000. The RBF neural network optimized by SSPSO has a faster convergence speed and achieves the excepted accuracy after 522 iterations. After 1000 iterations of training, the MSE value of SSPAO-RBF neural network model is 0.0048, while that of LDWPSORBF network model is only 0.0126. It is not difficult to find that the SSPSO-RBF network model has the best convergence speed and accuracy.
Fig. 2 Iterative curves of particle swarm optimization (minimum fitness versus number of iterations for PSO, LDWPSO, and SSPSO)
Fig. 3 Convergence curve of network training error
In order to eliminate the influence of randomness on the effect, this paper evaluates the effect by taking the mean value from several experiments. Table 1 shows the simulation results of four models; the unoptimized RBF neural network model has the worst performance in the final prediction accuracy. In addition, it is found in the experiment that the unoptimized network initial parameters are generated randomly, which brings great instability to the training of network model; the introduction of PSO algorithm
eliminates this instability. Furthermore, the improved PSO algorithm greatly improves the training accuracy and prediction effect of the network. Figure 4 shows the prediction effect of the improved PSO-RBF neural network prediction model on the sea clutter of an adjacent distance unit: the prediction accuracy is high, and the results can basically describe the observed values of the sea clutter.

Table 1 Comparison of different optimization algorithms

Algorithm   | Minimum fitness | Accuracy of training model (%) | Accuracy of prediction (%)
RBF         | –        | 95.89 | 84.17
PSO-RBF     | 0.003745 | 98.82 | 91.60
LDWPSO-RBF  | 0.003450 | 99.25 | 92.46
SSPSO-RBF   | 0.003011 | 99.54 | 95.06
6 Conclusion Based on the theory of phase space reconstruction, RBF neural network is used to learn the internal dynamics of sea clutter, and then a prediction model is established to predict the sea clutter of adjacent units. In order to optimize network performance, this paper introduces the PSO algorithm and improves it; the improved PSO algorithm gives full play to the global search and local exploration capabilities of particles and improves the convergence speed and accuracy of PSO. After optimization, the stability and accuracy of the RBF neural network prediction model have been significantly improved. Judging from the prediction results, the improved PSO-RBF neural network can indeed learn the internal dynamics of sea clutter, which is of great significance for the later suppression of sea clutter.
Fig. 4 Prediction effect of improved PSO-RBF prediction model
Acknowledgements. The authors would like to express their great thanks to the support of the National Natural Science Foundation of China (61801196), National Defense Basic Scientific Research Program of China (JCKYS2020604SSJS010), Jiangsu Province Graduate Research and Practice Innovation Program Funding Project (KYCX20_3142, KYCX20_3139). The authors
also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the presentation.
References 1. Trunk GV (1972) Radar properties of non-Rayleigh sea clutter. IEEE Trans Aerosp Electron Syst 8(2):193–204 2. Ward KD, Tough RJA, Watts S (2007) Sea clutter: scattering, the K distribution and radar performance. Waves Random Complex Media 17(2):233–234 3. Simon RA, Kumar PBV (2013) A nonlinear sea clutter analysis using chaotic system. In: Fourth international conference on computing. IEEE, pp 1–5 4. Lv JH (2002) Analysis and application of chaotic time series. Wuhan University Press 5. Kim HS, Eykholt R, Salas JD (1999) Nonlinear dynamics, delay times, and embedding windows. Phys D 127(1):48–60 6. Haykin S (1999) Radar clutter attractor: implications for physics, signal processing and control. IEE Proc Radar Sonar Navig 146(4):177–188 7. Xie ZG, Zhong SD, Wei YK (2011) Modified particle swarm optimization algorithm and its convergence analysis. Comput Eng Appl 47(1):46–49 8. Hu TQ, Zhang XX, Cao XY (2020) A hybrid particle swarm optimization with dynamic adjustment of inertial weight. Electron Opt Control:1–9 9. Xu HT, Ji WD, Sun XQ et al (2020) A PSO algorithm with inertia weight decay by normal distribution. J Shenzhen Univ (Sci Eng):1–6 10. Shi Y, Eberhart R (1998) A modified particle swarm optimizer. In: Proceedings of the IEEE conference on evolutionary computation. Anchorage, pp 68–73
CS-Based Modulation Recognition of Sparse Multiband Signals Exploiting Cyclic Spectral Density and MLP Yanping Chen1 , Song Wang2 , Yulong Gao3(B) , Xu Bai3 , and Lu Ba4 1
School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, Heilongjiang, China [email protected] 2 School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150080, Heilongjiang, China [email protected] 3 Communication Research Center, Harbin Institute of Technology, Harbin 150080, Heilongjiang, China [email protected], x [email protected] 4 Institute of Science and Technology, Harbin Institute of Technology, Harbin 150080, Heilongjiang, China [email protected]
Abstract. Modulation recognition of sparse multiband signals is a key technology for intelligent signal processing. However, existing methods suffer from high sampling rates and poor anti-noise performance. Therefore, compressed sensing and cyclic spectral density are combined to cope with these shortcomings, followed by a multi-layer perceptron (MLP) to recognize the modulation mode of the signal. Simulations are carried out, and the results show that the proposed algorithm is correct and effective. Keywords: Modulation recognition · Cyclic spectral density · Compressed sensing · The multi-layer perceptron
1 Introduction
Sparse multiband signals are widely used in radar, wideband communication, and spectrum sensing in cognitive radio [1, 2]. In these fields, signal receivers usually receive one or more unknown signals with different modulations at the same time.

This work is supported by National Natural Science Foundation of China (NSFC) (61671176) and Civil Space Pre-research Program during the 13th Five-Year Plan (B0111).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_81
CS-Based Modulation Recognition of Sparse Multiband . . .
605
The receiver is required to quickly recognize the modulation before performing other processing tasks; modulation recognition is therefore a key technology for intelligent signal processing. As an important theory for analyzing the periodic characteristics of signals, cyclic spectral density is an extension of traditional spectrum analysis. Studying the cyclic spectral density of signals not only deepens the understanding of signal spectrum analysis, but also enables many other signal processing tasks. According to the theory of cyclic spectral density, Gaussian white noise has no cyclostationary property, so cyclostationary processing possesses good anti-noise performance [3, 4]. Because the cyclic spectral densities of different signals differ greatly and contain rich signal parameter information, they make signal modulation recognition and signal detection possible in a cognitive radio environment [5, 6]. However, it is very difficult to calculate the cyclic spectral density of multiband signals under the traditional Shannon sampling theorem. Fortunately, compressed sensing shows great potential to resolve this dilemma. Compressed sensing was initially used in image processing and image compression. Because a large number of sparse or nearly sparse signals exist in the field of communication, more and more researchers focus on combining signal processing with compressed sensing. In compressed sensing, signals are first represented in a certain basis and then sampled at a sub-Nyquist rate, replacing traditional equidistant sampling [9]. Because compressed sensing and cyclic spectral density have complementary advantages for signal processing tasks, their combination has received attention from the signal processing community. In [10], compressed sensing is utilized to acquire compressed measurements and then perform cyclostationary detection.
A new cyclic autocorrelation feature detection method based on compressed sensing was proposed in [4]. In this method, the original signal is reconstructed from a small number of random measurements, and the normalized cyclic autocorrelation statistic is used to carry out spectrum sensing; no prior information about the signals is required. Nevertheless, this method has a longer observation time and higher computational complexity than energy-based detection methods [11, 12]. It was pointed out in [13] that the cyclic spectral density of a signal possesses strong sparsity and can be mapped into a sparse space by combining compressed sensing. Correspondingly, the cyclic spectral density of the original signal can be calculated directly from sub-Nyquist measurements without reconstructing the original time-domain signal, which significantly reduces the computational complexity. In recent years, the multi-layer perceptron has been widely investigated in many fields. Compared with decision tree classification, the multi-layer perceptron classifier has great advantages and high efficiency in identifying digitally modulated signals. Accordingly, some research institutions use the time-domain characteristics of modulated signals to train a multi-layer perceptron that outputs the classification results of the modulation modes [14].
606
Y. Chen et al.
The power spectra of different modulated signals (such as BPSK, AM, FSK, MSK, QAM, PAM, etc.) partially overlap, so the modulation modes cannot be discriminated completely from the power spectrum alone. However, different modulated signals exhibit different cyclostationary characteristics in the dual-frequency plane, while noise shows no cyclostationarity there, so the cyclostationarity of the signal can be used to identify modulation modes. In this paper, we combine compressed sensing with the cyclic spectral density of the signal to reduce the sampling frequency and improve the anti-noise performance. At the same time, we use the compressed measurements to directly estimate the cyclic spectral density with our proposed CS-SSCA algorithm [15]. The multi-layer perceptron is then exploited to recognize the modulation automatically.
2 Description of the Proposed Algorithm

2.1 Estimation of Cyclic Spectral Density with the Sub-Nyquist Measurements
Generally, the cyclic periodogram cannot be used directly due to its large variance and bias; it must be smoothed to reduce the estimation error. Owing to its high efficiency and low computational complexity, the time-domain smoothed cyclic periodogram is widely used in practice. The time-domain smoothing SSCA algorithm can be expressed as [15]

S_xT^{fk+qΔα}(n, fk − qΔα)_{Δt} = Σ_r XT(rL, fk) x*(r) gc(n − r) e^{−j2πqr/N}   (1)

XT(r, f) = Σ_{n=−Nt/2}^{Nt/2} a(n) x(r, n) e^{−j2πf(r−n)T}   (2)

where S_xT^{fk+qΔα} is the cyclic periodogram, a(n) is the window function, gc(n) is the smoothing window, and q = −P/2, …, P/2. L is the decimation factor satisfying L < N; the optimal estimation performance is achieved for L = N/4. Furthermore, the cyclic periodogram can be expressed in matrix form
X[f]_{N×L} = F W X0[n] = (F W F^{−1}) F X0[n] = W̄ X0[f]   (3)

where F is the N-point DFT matrix with (i, j) element F_{(i,j)} = e^{−j2πij/N}, F^{−1} is the corresponding N-point IDFT matrix, X0[f] = F X0[n], W̄ = F W F^{−1}, and W is the Hamming window matrix. With the compressed sensing theory, we have

Y[n] = A X0[n]   (4)
where A is the sampling matrix. Because the signals are sparse in the frequency domain, we can rewrite (4) as

Y[n] = A F^{−1} F X0[n] = Φ X0[f]   (5)
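As a toy illustration of recovering a sparse X0[f] from the measurements of (4)–(5), a plain orthogonal matching pursuit (OMP) sketch is given below; the paper itself uses the block-sparse SOMP variant, and the matrix dimensions here are illustrative:

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal matching pursuit: greedily recover a k-sparse x
    (k >= 1) from the underdetermined measurements y = Phi @ x."""
    residual, support = y.astype(float), []
    for _ in range(k):
        j = int(np.argmax(np.abs(Phi.T @ residual)))  # most correlated atom
        if j not in support:
            support.append(j)
        # least-squares fit on the current support, then update the residual
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat
```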
where Φ = A F^{−1} is the measurement matrix satisfying the restricted isometry property (RIP). Because the modulated signal is sparse, only a few specific rows contain large values, while the remaining rows, occupied by additive Gaussian noise, are small. The matrix X0[f] is therefore block sparse, and the SOMP algorithm can be exploited to recover the signals. The support region of the cyclic spectral density can be denoted as

S_i^α(n, fi) = x*(n) XT(n, fk)   (6)

2.2 The DNN-Based Modulation Recognition Algorithm
If the resolution of the cyclic spectral density is low, the characteristic peaks of the cyclic spectrum may not be calculated accurately, resulting in wrong recognition of the modulation mode. Increasing the sampling frequency raises the computational complexity and greatly increases the ADC hardware overhead. Correspondingly, we improve the modulation recognition performance with the multi-layer perceptron under the constraint of a low sampling rate. For the multi-layer perceptron, the first step is forward propagation. In the i-th layer, the relationship between the input data and the output can be expressed in matrix form as

Yi = f(Wi Yi−1 + bi)   (7)

where Wi is the m × n weight matrix, m and n are the numbers of nodes of the i-th and (i−1)-th layers, respectively, bi is the bias vector, and f is the activation function applied element-wise. Because many modulation modes need to be recognized, the output layer is composed of multiple nodes, and the softmax function is employed as the nonlinear activation function:

Ok = e^{yk} / Σ_{k=1}^{K} e^{yk},  k = 1, 2, …, K   (8)

where K is the number of modulation modes. The recognized modulation mode z of the signal is

z = arg max_k Ok   (9)
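The forward pass of (7)–(9) can be sketched with numpy; the layer sizes and weights below are purely illustrative:

```python
import numpy as np

def softmax(y):
    e = np.exp(y - np.max(y))  # shift by the max for numerical stability
    return e / e.sum()

def mlp_forward(x, layers):
    """Forward pass of Eq. (7): layers is a list of (W, b, activation)."""
    for W, b, act in layers:
        x = act(W @ x + b)
    return x

def recognize(x, layers):
    """Eq. (9): index of the most probable modulation mode."""
    return int(np.argmax(mlp_forward(x, layers)))
```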
In the following, backpropagation is utilized to update the weight coefficients, and the loss function is

E = (1/2)(d − o)² = (1/2) Σ_{k=1}^{K} (dk − ok)²
  = (1/2) Σ_{k=1}^{K} [dk − f(netk)]²
  = (1/2) Σ_{k=1}^{K} [dk − f(Σ_{j=1}^{m} wjk yj + bj)]²,  j = 0, 1, 2, …, m,  k = 1, 2, …, K   (10)
The matrix form of the parameter updating formula for each layer is

Wl := Wl − η ∂E/∂Wl   (11)
bl := bl − η ∂E/∂bl   (12)

where η is the learning rate. For the output layer, the partial derivatives are expressed by

∂E/∂Wl = (o − d) ∘ ∂f(Wl yl + bl)/∂Wl · yl^T   (13)
∂E/∂bl = (o − d) ∘ ∂f(Wl yl + bl)/∂bl   (14)

For the hidden layers, the partial derivatives are denoted as

∂E/∂Wl = Wl+1 ∂E/∂Wl+1 ∘ ∂f(Wl yl + bl)/∂Wl   (15)
∂E/∂bl = bl+1 ∂E/∂bl+1 ∘ ∂f(Wl yl + bl)/∂bl   (16)
Because a very high resolution is needed to detect all the features, the multi-layer perceptron would have to process a large amount of data. To reduce this burden, the maximum of the cyclic spectral density over frequency is used:

profile(α) = max_f [S_X^α(f)]   (17)

where S_X^α(f) is the normalized cyclic spectral density. Different modulated signals have different α-profiles, so the α-profile can be regarded as an image to be classified. Considering all the above-mentioned components, the process of the proposed method can be summarized in Fig. 1.
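Computing the α-profile of (17) from a sampled cyclic-spectral-density matrix is a one-liner; the array layout [α, f] assumed here is our choice:

```python
import numpy as np

def alpha_profile(csd):
    """Eq. (17): for each cyclic frequency alpha (rows), keep the maximum
    of the cyclic spectral density magnitude over frequency f (columns)."""
    return np.abs(np.asarray(csd)).max(axis=1)
```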
Fig. 1 The process of the proposed modulation recognition method
3 Simulation Results
In this section, simulations are carried out to evaluate the performance of the proposed algorithm. The simulation parameters are set as follows. The modulated signals are BPSK, QPSK, and AM signals, with 820, 505, and 567 labeled samples, respectively. All signals are generated randomly with a square-root raised-cosine filter. For the BPSK signal, the carrier frequency, signal rate, and observation time are in [100, 300], [10, 50]. For the QPSK signal, fc ∈ [3000 Hz, 5000 Hz] and Rb ∈ [8, 50]. For the AM signal, the carrier frequency fc ∈ [20 Hz, 100 Hz] and the amplitude A ∈ [2, 8]. The MLP has 3 layers, the hidden layer has 20 nodes, and the activation functions of the hidden and output layers are the Tanh and Softmax functions, respectively. The learning rate is η = 0.05, α = 1, and the number of training epochs is 40. We first study the success rate versus the number of training samples at a given SNR = 5 dB; the simulation result is shown in Fig. 2. The success rate increases with the number of samples and reaches 0.9 when the number of samples is 80. Next, the success rate over different SNRs is demonstrated in Fig. 3 for a given number of samples of 320. The success rate increases with SNR; in addition, the more distinct the α-profile feature is, the more accurate the modulation recognition.
4 Conclusion
Because signals with different modulation modes have different characteristics in the dual-frequency plane, the modulation mode of an input signal can be identified from its cyclic spectral density. Moreover, Gaussian noise has no cyclic spectral density in the dual-frequency plane, so using cyclic spectral density as the recognition feature improves the anti-noise performance. In addition, we combine compressed sensing with cyclic spectral density to reduce the sampling rate. More importantly, we exploit the multi-layer perceptron as the recognizer to improve the recognition performance. Simulation results demonstrate that the proposed method achieves a high success rate in different cases.
Fig. 2 Success rate for the different number of samples
Fig. 3 Success rate for the different SNR
Bandwidth Estimation Algorithm Based on Power Spectrum Recovery of Undersampling Signal Yuntao Gu1 , Yulong Gao1(B) , Si Wang2 , and Baowei Li2 1 Harbin Institute of Technology, Harbin 150028, China
[email protected], [email protected]
2 Shanghai Institute of Satellite Engineering R&D Center, Beijing 100081, China
[email protected], [email protected]
Abstract. With the rapid development of wireless communication and the growth of diverse communication services, cognitive radio based on spectrum sensing has become a hot research topic. One key task of spectrum sensing is signal parameter identification: after a signal is detected, a series of its parameters must be identified quickly to provide a basis for subsequent spectrum allocation and sharing. Bandwidth is one of the most important parameters of a communication signal, and estimating it quickly and accurately is a prerequisite for detecting "spectrum holes." Because Nyquist-rate sampling is difficult in wideband spectrum sensing, this paper studies bandwidth estimation from undersampled signals. We introduce a power spectrum estimation algorithm under the wide-sense stationarity assumption, put forward a corresponding bandwidth estimation strategy, and analyze its limitations. The performance of the algorithm is simulated under different SNRs and compression ratios, and different random undersampling matrices are investigated as well. The simulations show that the algorithm discussed in this paper is feasible and reliable.

Keywords: Spectrum estimation · Power spectrum estimation · Bandwidth estimation · Undersampling
1 Introduction

Recently, with the development of wireless communication technology, spectrum resources have become increasingly scarce. The concept of cognitive radio was proposed in 1999 [1]. In short, cognitive radio aims to communicate by effectively using "spectrum holes" [2]. One of its basic tasks is signal parameter identification. However, a wide band places great pressure on the ADC of the receiver. This paper focuses on algorithms
This work is supported by the National Natural Science Foundation of China (NSFC) (Grant No. 61671176).
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_82
based on undersampling. The basic point is to estimate the bandwidth based on the undersampling power spectrum estimation. From the point of view of cyclostationarity, Reference [3] systematically presents how to recover the cyclic spectrum of the signal in the framework of compressed sensing (CS). In [4, 5], the problem of power spectrum estimation under the new sampling framework is studied. In [6, 7], a linear algorithm combining power spectrum estimation and wavelet edge detection is studied. The rest of this paper is organized as follows: Sect. 2 introduces the algorithm of power spectrum estimation based on undersampling. After that, the corresponding bandwidth estimation strategy is given in Sect. 3. In Sect. 4, numerical simulations are conducted to evaluate the performance of the proposed algorithm. Finally, conclusions are drawn in Sect. 5.
2 Power Spectrum Recovery Algorithm of Undersampling Signal

We regard the modulated signal as a wide-sense stationary random signal. According to [8], under the compressed sensing (CS) framework, the sampling rate of a CS receiver is f_s,cs = (M/N) f_s, where f_s is the Nyquist sampling rate and M/N ∈ (0, 1] is the compression ratio. This linear compressive sampling process can be described as:

z = Ax (1)
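As a concrete illustration of (1), the sketch below compresses one block of N Nyquist-rate samples into M measurements with a random 0–1 sensing matrix (all sizes and values are illustrative, chosen by us):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 64, 16                    # Nyquist block length and number of measurements
A = rng.integers(0, 2, size=(M, N)).astype(float)  # random 0-1 sensing matrix
x = rng.standard_normal(N)       # one block of Nyquist-rate samples
z = A @ x                        # compressed measurements, effective rate (M/N) * f_s
```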
For x(n), the autocorrelation function is independent of time:

r_x(n, τ) = r_x(τ), ∀n (2)
According to the Wiener–Khinchin theorem, we have

s_x = F r_x (3)
where s_x = [s_x(0), …, s_x(N − 1)]^T, r_x = [r_x(0), …, r_x(N − 1)]^T, and F is the N-point DFT matrix. The elements of vec{R_x} map to r_x as follows:

[vec{R_x}]_((n−τ)N+n) = [R_x]_(n, n−τ) = r_x(τ)
[vec{R_x}]_(nN+n−τ) = [R_x]_(n−τ, n) = r_x(τ)
τ ∈ [0, N − 1], n ∈ [τ, N − 1] (4)
Obviously, vec{R_x} can be linked directly to r_x by a linear mapping matrix P_N ∈ {0, 1}^(N²×N):

vec{R_x} = P_N r_x (5)
The specific form of P_N is:

P_N((n − τ)N + n, τ) = P_N(nN + n − τ, τ) = 1
∀τ = 0, …, N − 1; n = τ, …, N − 1 (6)
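The mapping (4)–(6) can be checked numerically; the sketch below (with an illustrative N and made-up autocorrelation values) builds P_N and verifies vec{R_x} = P_N r_x for a symmetric Toeplitz R_x:

```python
import numpy as np

N = 4
r = np.array([1.0, 0.6, 0.3, 0.1])      # autocorrelation r_x(0..N-1), made up
P = np.zeros((N * N, N))                # selection matrix P_N from Eq. (6)
for tau in range(N):
    for n in range(tau, N):
        P[(n - tau) * N + n, tau] = 1   # picks element [R_x]_{n, n-tau}
        P[n * N + n - tau, tau] = 1     # picks element [R_x]_{n-tau, n}
# symmetric Toeplitz autocorrelation matrix R_x[i, j] = r(|i - j|)
R = np.array([[r[abs(i - j)] for j in range(N)] for i in range(N)])
assert np.allclose(R.flatten(order="F"), P @ r)   # vec{R_x} = P_N r_x
```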
614
Y. Gu et al.
The autocorrelation matrix R_z = E[z z^H] ∈ R^(M×M) has only M(M + 1)/2 degrees of freedom. Using the same approach as above, these can be stacked as

r_z = Q_M vec{R_z} (7)
where the row and column index mappings are

p(n, τ) = τM − τ(τ − 1)/2 + n, ∀τ ∈ [0, M − 1], 0 ≤ n ≤ M − 1 − τ (8)

q_1(n, τ) = (n + τ)M + n, q_2(n, τ) = nM + n + τ (9)

and the entries of Q_M are

Q_M(p(n,τ), q_1(n,τ)) = Q_M(p(n,τ), q_2(n,τ)) = 1/2 + (1/2)δ_(τ,0)
Q_M(p(n,τ), q) = 0, ∀q ∈ [0, M² − 1] \ {q_1(n, τ), q_2(n, τ)} (10)
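Likewise, Q_M of Eqs. (8)–(10) can be constructed and sanity-checked on a small symmetric R_z (M and the matrix values below are illustrative):

```python
import numpy as np

M = 3
Q = np.zeros((M * (M + 1) // 2, M * M))
for tau in range(M):
    for n in range(M - tau):                       # 0 <= n <= M-1-tau
        p = tau * M - tau * (tau - 1) // 2 + n     # row index p(n, tau), Eq. (8)
        q1 = (n + tau) * M + n                     # column indices, Eq. (9)
        q2 = n * M + n + tau
        Q[p, q1] = Q[p, q2] = 0.5 + 0.5 * (tau == 0)   # Eq. (10)
# sanity check on a small symmetric R_z: Q_M averages the two symmetric entries
R = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.8],
              [0.5, 0.8, 2.0]])
rz = Q @ R.flatten(order="F")
assert np.isclose(rz[0], 4.0)   # p(0,0): diagonal entry R[0,0]
assert np.isclose(rz[3], 1.0)   # p(0,1): average of R[0,1] and R[1,0]
```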
Since R_z = E[z z^H], the relationship between R_z and R_x is:

R_z = A R_x A^H (11)

Using the matrix identity vec{UXV} = (V^T ⊗ U) vec{X}, where ⊗ denotes the Kronecker product, we can get:

r_z = Q_M vec{A R_x A^H} = Q_M (A ⊗ A) vec{R_x} = Q_M (A ⊗ A) P_N r_x ≜ Φ r_x (12)

By introducing (3), the linear relationship between the compressive correlations and the power spectrum is:

r_z = Φ F^(−1) s_x ≜ Ψ s_x (13)

where Ψ = Φ F^(−1). The size of Ψ is M(M + 1)/2 × N.
Since r_z has M(M + 1)/2 entries while s_x has N, exact recovery requires M(M + 1)/2 ≥ N, which gives a minimum compression ratio:

(M/N)_min = (√(8N + 1) − 1)/(2N) ≈ √(2/N) for large N (14)

If the received signal is sparse in the frequency domain, L1-norm regularization can be introduced to obtain a convex problem that guarantees sparsity:

ŝ_x = arg min_{s_x} ‖r_z − Ψ s_x‖²₂ + λ‖s_x‖₁ (15)

where Ψ is the linear mapping from s_x to r_z derived above.
These convex problems can be solved by existing convex optimization software kits (such as CVX toolkit based on MATLAB).
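Since the CVX toolkit is MATLAB-based, a minimal stand-in solver for this kind of L1-regularized least-squares problem can be sketched with iterative soft-thresholding (ISTA); the names `Psi` and `rz` and the step-size choice are ours, not from the paper:

```python
import numpy as np

def ista(Psi, rz, lam=0.1, iters=500):
    """Solve  min_s ||rz - Psi s||_2^2 + lam * ||s||_1  by iterative
    soft-thresholding -- a simple stand-in for a CVX-style solver."""
    step = 1.0 / (2 * np.linalg.norm(Psi, 2) ** 2)  # 1 / Lipschitz constant
    s = np.zeros(Psi.shape[1])
    for _ in range(iters):
        g = s - step * 2 * Psi.T @ (Psi @ s - rz)   # gradient step on the LS term
        s = np.sign(g) * np.maximum(np.abs(g) - lam * step, 0.0)  # soft threshold
    return s
```

For `Psi = I` this reduces to per-coordinate soft-thresholding of `rz`, which gives a quick correctness check.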
Fig. 1 Recovery of undersampling power spectrum
Table 1 Undersampling bandwidth estimation strategy

1. Find the two maximum peaks of the recovered power spectrum.
2. Select the local minimum point closest to the outside of each of the two peaks.
3. Take the spectral distance between these two minima as the bandwidth.
3 Bandwidth Estimation Strategy Based on Recovered Power Spectrum

In this section, we consider how to estimate the bandwidth from the recovered power spectrum. The signal in Fig. 1 is a BPSK signal with a signal-to-noise ratio of 5 dB, a sampling rate of 40 kHz, a carrier frequency of 16 kHz, a bit rate of 1 kHz, and a hyperparameter of 2. The strategy to estimate the bandwidth directly is shown in Table 1.
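A direct implementation of the three steps in Table 1 might look like the following sketch; the function and its peak/minimum detection via first-difference sign changes are ours, and it assumes the spectrum has at least two interior peaks:

```python
import numpy as np

def estimate_bandwidth(psd, freqs):
    """Table-1 strategy: find the two highest interior peaks, take the
    nearest local minimum outside each, and return the spectral distance."""
    d = np.diff(psd)
    peaks = [i for i in range(1, len(psd) - 1) if d[i - 1] > 0 and d[i] <= 0]
    mins = [i for i in range(1, len(psd) - 1) if d[i - 1] < 0 and d[i] >= 0]
    p1, p2 = sorted(sorted(peaks, key=lambda i: psd[i])[-2:])  # two largest peaks
    left = max([m for m in mins if m < p1], default=0)          # outside left peak
    right = min([m for m in mins if m > p2], default=len(psd) - 1)  # outside right peak
    return freqs[right] - freqs[left]
```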
4 Numerical Simulations

4.1 The Relationship Between Bit Rate and Bandwidth Estimation

The modulated signal simulated in this section is a BPSK signal. Most of the simulation conditions are the same as in Fig. 1. According to Fig. 2, the computational complexity of the least squares method is much lower than that of the convex optimization method, and it is also more accurate.

4.2 The Performance of Bandwidth Estimation Against Noise

Most of the simulation conditions are the same as in Fig. 1. The bit rate is 1.6 kHz.
Fig. 2 Relationship between bit rate and bandwidth estimation
According to Fig. 3, convex optimization not only has higher computational complexity (as shown in Table 2) but also performs worse. When the SNR is low, as shown in Fig. 4, the power spectrum estimate deviates greatly with high probability. Therefore, the least squares method is used in the subsequent simulations.
Fig. 3 Performance against noise
Table 2 Computational complexity comparison

Algorithm           | Time for one estimation (s)
--------------------|----------------------------
Least squares       | 0.010689
Convex optimization | 5.654571
Fig. 4 Poor performance of power spectra estimation under low SNR
4.3 The Choice of Undersampling Matrix

We choose the random 0–1 matrix, the Bernoulli matrix, and the Gaussian matrix for analysis. Most of the simulation conditions are the same as in Fig. 2. According to Fig. 5, only the random 0–1 matrix gives reasonable bandwidth estimation results. In addition, the random 0–1 matrix is the easiest to implement in hardware.
5 Conclusion

In this paper, we first derive an undersampled power spectrum estimation algorithm under the wide-sense stationarity assumption and then give a bandwidth decision strategy. We analyze the performance of the algorithm with respect to bit rate, SNR, sampling matrix, and compression ratio. The simulation results show that the least squares method has clear advantages in both computation and estimation accuracy, which means that for bandwidth estimation, sparsity constraints are unnecessary. Meanwhile, the random 0–1 matrix is a suitable undersampling matrix.
Fig. 5 Performance of different sampling matrices
References

1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Indus Inf 16(8):5379–5388
3. Hong S (2011) Multi-resolution Bayesian compressive sensing for cognitive radio primary user detection. In: IEEE global telecommunications conference
4. Ariananda DD, Leus G (2012) Compressive wideband power spectrum estimation. IEEE Trans Signal Process 60(9):4775–4789
5. Lexa MA, Davies M, Thompson J et al (2011) Compressive power spectral density estimation. In: IEEE international conference on acoustics. IEEE
6. Tian Z, Giannakis GB (2007) Compressed sensing for wideband cognitive radios. In: IEEE international conference on acoustics. IEEE
7. Ariananda DD, Leus G (2011) Wideband power spectrum sensing using sub-Nyquist sampling. In: IEEE international workshop on signal processing advances in wireless communications. IEEE
8. Tian Z, Tafesse Y, Sadler BM (2012) Cyclic feature detection with sub-Nyquist sampling for wideband spectrum sensing. IEEE J Sel Topics Signal Process 6(1):58–69
An Adaptive Base Station Management Scheme Based on Particle Swarm Optimization Wenchao Yang, Xu Bai(B) , Shizeng Guo, Long Wang, Xuerong Luo, and Mingjie Ji Harbin Institute of Technology, Harbin, China [email protected]
Abstract. With the rapid development of 5G in recent years, the energy consumption of the information and communication industry is becoming more serious day by day. The base station (BS) sleeping strategy considers the load and user distribution of each BS under a heterogeneous cellular network model and shuts down BSs with low load. Meanwhile, some users of heavily loaded BSs are reassigned to lightly loaded adjacent BSs, so as to balance energy consumption. The simulation results show that the particle swarm optimization algorithm is superior to the traditional distributed algorithm in energy consumption and energy saving efficiency and can realize green communication, although it takes somewhat longer to run. Keywords: 5G · Energy consumption · Particle swarm optimization
1 Introduction

With the rapid development of 5G technology, the communication industry is facing an increasingly serious problem of energy consumption. As the number of users increases, communication energy consumption will increase exponentially. A BS energy saving mechanism can not only save a large amount of energy but also improve the efficiency of the communication system. BS energy saving algorithms can be divided into traditional distributed algorithms and centralized algorithms. A distributed algorithm treats each BS separately: the state of a BS is determined by the number of users in its cell, and when that number falls below a set threshold, the BS enters a dormant state and its users are handed over to adjacent BSs. The advantage of the distributed algorithm is that the BS state can be decided quickly from the current number of connected users; its drawback is that after a BS goes dormant, if an adjacent BS also enters the dormant state, the communication of the original users in the cell may be interrupted.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_83
Agapi Mesodiakaki and Ferran Adelantado mainly made improvements in the direction of BS deployment, discussed the relationship between cell radius and system energy consumption, and selected the optimal BS deployment scheme [1], but did not address the BS dormancy strategy after deployment. Jaya B. Rao and Abraham O. Fapojuwo discussed the influence of cell radius on system power consumption and how to find the optimal cell radius for a specific user distribution scenario [2]. Hossam S. Hassanein and Stefan Valentin proposed a distributed algorithm that decides whether a BS enters sleep mode by estimating the load of each BS [3]; the algorithm only considers the load of each BS and does not obtain a globally optimal solution. Gong Jie and Yang Peng proposed a BS dormancy scheme adjusted according to the traffic volume in the cell [4]; the BS responds quickly to traffic changes to decide whether to sleep, but heterogeneous cellular networks are not considered. Taking the LTE-A network as the background, Zhao Jihong, Hu Jiangyan et al. proposed a BS cooperative shutdown scheme [5]; it can select the best cluster for users within the coverage area, reduce BS power consumption, and guarantee QoS at the same time. Zhang Zhizhong and Hao Minji proposed a dynamic BS shutdown algorithm for low-load single-layer cellular networks [6]; it calculates the distance between the users and the BS in the cell in real time and shuts down low-load and idle BSs.
2 Energy Saving Principle in BS

In wireless communication, traffic fluctuates quickly, and hot spots only carry peak traffic at certain times of day. At other times, the load drops greatly, while BS equipment still running in the peak working state substantially increases the energy consumption and operating expenses of the cellular network. The design of a BS switching mechanism is complicated: the dormancy period and the number of dormant BSs are the key problems. If too many BSs are shut down due to an incorrect threshold estimate, the available resources in the network are greatly reduced, resulting in a shortage of frequency band resources and longer waiting times, thus failing to meet users' communication needs. In addition, if too many BSs are shut down, the distance between the serving BSs increases significantly, so it cannot be guaranteed that all users are within coverage, which degrades the coverage performance of the network. As a result, both user data transmission efficiency and system spectrum efficiency decrease because of the increased link delay. Particle swarm optimization (PSO) is a large-scale combinatorial optimization algorithm with simple parameters and fast solving speed, jointly proposed by Eberhart and Kennedy in 1995 [7]. Compared with other combinatorial optimization algorithms, PSO has a simple concept and does not need a complex solving process: it iteratively approaches the optimal solution through a combination of particles. There are two extremums in the search process: one is the individual optimal value, denoted p_i^t (i = 1, 2, …, P), where P is the population size;
the other extreme value is the global optimal value: the best result after substituting each particle into the objective function and comparing the values. Equations (1) and (2) are the velocity update and position update formulas, respectively. In each iteration, every particle in the population updates its velocity and position by comparing its own information with that of the others, so as to move closer to the optimal solution [8].

v_id^(t+1) = w·v_id^t + c1 r1 (p_id^t − x_id^t) + c2 r2 (p_gd^t − x_id^t) (1)

x_id^(t+1) = x_id^t + v_id^(t+1) (2)

where v_id^t represents the velocity of particle i in iteration t; x_id^t represents the position of particle i in iteration t; w is the inertia weight; r1 and r2 are random numbers in (0, 1); and c1 and c2 are learning factors. The formula for w is as follows, where k_max is the maximum number of iterations set before the algorithm is executed:
w = w_max − k · (w_max − w_min)/k_max (3)
The number of particles in the algorithm is determined by the complexity of the problem and the required solving speed. For an ordinary combinatorial optimization problem, 20–40 particles are enough; for complex problems, the number of particles can be increased appropriately, although this also increases the running time of the program; when the complexity is low, a small number of particles suffices to obtain a good solution. The learning factors correspond to the ability of individuals in a population to imitate: they enable each particle to summarize its own experience and learn from the best individuals, thus approaching the population optimum. The two parameters c1 and c2 generally take the same value, in the range (0, 4), with the middle value 2 usually chosen. The inertia weight is also a very important parameter of PSO: its size determines how much of the particle's current position and velocity information is inherited, and a proper choice balances the particle's exploration and exploitation abilities so that the global optimum can be found quickly.
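The update rules (1)–(3) can be sketched as a minimal PSO; all defaults (search range, particle count, iteration budget) are illustrative choices of ours, and `f` is the cost function to minimize:

```python
import random

def pso(f, dim, n_particles=20, iters=100,
        c1=2.0, c2=2.0, w_max=0.9, w_min=0.4):
    """Minimal PSO: velocity/position updates per Eqs. (1)-(2) with the
    linearly decreasing inertia weight of Eq. (3)."""
    X = [[random.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in X]                     # individual best positions
    gbest = min(pbest, key=f)[:]                  # global best position
    for k in range(iters):
        w = w_max - k * (w_max - w_min) / iters   # Eq. (3)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))  # Eq. (1)
                X[i][d] += V[i][d]                            # Eq. (2)
            if f(X[i]) < f(pbest[i]):
                pbest[i] = X[i][:]
        gbest = min(pbest + [gbest], key=f)[:]
    return gbest
```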
3 BS Energy Saving Scheme Based on PSO

In this paper, a two-layer heterogeneous cellular network model is adopted, in which micro BSs are added around the traditional macro cells. The cellular model is assumed to be a standard hexagon, each macro station has three sectors, and two micro BSs are deployed in each sector. Suppose the number of macro stations is M, the number of micro BSs is F, and the number of users is N; b_m denotes the m-th macro BS, r_f the f-th micro BS, and u_n the n-th user.
If u_n is connected to b_m, the SINR is:

SINR^b_(m,n) = PL_(m,n) p^b_m / ( Σ_(k=1, k≠m)^M PL_(k,n) p^b_k + Σ_(k=1)^F PL_(k,n) p^r_k + σ² ) (4)

If u_n is connected to r_f, the SINR is:

SINR^r_(f,n) = PL_(f,n) p^r_f / ( Σ_(k=1)^M PL_(k,n) p^b_k + Σ_(k=1, k≠f)^F PL_(k,n) p^r_k + σ² ) (5)
Similar to a single-layer cellular network, define the connection matrices X^b = [x^b_(m,n)]_(M×N) and X^r = [x^r_(f,n)]_(F×N) to represent the connection relationships among the users and the macro and micro BSs. x^b_(m,n) = 1 means that macro BS b_m is connected to user u_n; x^r_(f,n) = 1 means that micro BS r_f is connected to user u_n; x^r_(f,n) = 0 means that micro BS r_f and user u_n are not connected. Figure 1 is a schematic diagram of the modeling:
Fig. 1 Schematic diagram of cellular network modeling
Considering the QoS and energy saving performance of the network, the BS–user connection relationship can be modeled as:

min Z = Σ_m P_in,m
s.t. x^b_(m,n) ∈ {0, 1}, x^r_(f,n) ∈ {0, 1}
     Σ_m x^b_(m,n) + Σ_f x^r_(f,n) = 1, ∀n (6)

where P_in,m is the input power of BS m, and the constraint requires each user to connect to exactly one BS.
The numbers of BSs and users are M and N, respectively, and each BS has N_TRX antennas. Each BS has two states, on and off, and its energy consumption model is [9]:

P_in = M · N_TRX · P_0 + Δ_P · Σ_(i=1)^N P_out_i,  0 < P_out_i < P_max (7)

P_in = N_TRX · P_sleep,  P_out_i = 0 (8)
where P_out_i represents the power required by UE_i, P_0 represents the minimum output power of each antenna when the BS is on, and Δ_P is the load factor, i.e., the slope of the load-dependent energy consumption. P_sleep represents the power of each antenna when the BS is asleep. The array H = [h_i] represents the state combination of the BSs in the cellular network, where h_i ∈ {0, 1} denotes the sleep and on states of each BS, and H is regarded as a particle. For algorithm initialization, practical factors such as adjacent BSs and user distribution are taken into account. Nei_i denotes the number of BSs adjacent to BS_i, and Load_i denotes the load of BS_i. The initial threshold K_i of each BS is set as:

K_i = P_1 · (Nei_i / Nei_max) + P_2 · (Load_i / Load_max) (10)
where Nei_max is the maximum number of adjacent BSs and Load_max is the maximum number of users a BS can serve simultaneously; P_1 and P_2 are weight coefficients. The initial position G_i of each BS is set to a random number between 0 and 1. If G_i > K_i, then h_i = 0, and otherwise h_i = 1. Similarly, a random number between 0 and 1 is taken as the initial velocity V_i of each BS. The position and velocity update expressions of the algorithm are then:

G_i(t + 1) = G_i(t) + V_i(t + 1) (11)

V_i(t + 1) = w · V_i(t) + c1 r1 (G_i^pbest − G_i(t)) + c2 r2 (G^gbest − G_i(t)) (12)

where G_i(t + 1) and V_i(t + 1) are the results of the (t + 1)-th iteration, and G_i(t) and V_i(t) are the results of the t-th iteration. G_i^pbest is the best value found so far by each particle; G^gbest is the best value found so far by the whole population. c1, r1, c2, and r2 are the PSO parameters. By randomly selecting the weight factor w from {−1, −0.5, 0, 0.5, 1}, the search ability of the algorithm is greatly increased.
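Under the power model of Eqs. (7)–(8) and the update rules (11)–(12), one iteration of the BS-sleep search can be sketched as follows; the helper names and all numeric power values are illustrative (EARTH-style figures, not from the paper):

```python
import random

def power(h, load, n_trx=2, p0=130.0, delta_p=4.7, p_sleep=75.0):
    """Network input power for state vector h (1 = on, 0 = asleep),
    in the spirit of Eqs. (7)-(8); numeric defaults are made up."""
    return sum(n_trx * p0 + delta_p * l if on else n_trx * p_sleep
               for on, l in zip(h, load))

def step(G, V, K, Gp, Gg, c1=2.0, c2=2.0):
    """One iteration of Eqs. (11)-(12) with the random inertia weight
    from the text, then the mapping G_i > K_i -> h_i = 0 (sleep)."""
    w = random.choice([-1, -0.5, 0, 0.5, 1])
    for i in range(len(G)):
        r1, r2 = random.random(), random.random()
        V[i] = w * V[i] + c1 * r1 * (Gp[i] - G[i]) + c2 * r2 * (Gg[i] - G[i])
        G[i] += V[i]
    return [0 if G[i] > K[i] else 1 for i in range(len(G))]
```

The returned state vector `H` would then be scored by `power(...)` to update the individual and global bests.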
4 Simulation Analysis Figure 2 is the result of simulation based on the two-layer heterogeneous cellular network model. Users are randomly distributed within the coverage range of the BS. In this simulation, the results obtained by particle swarm optimization (PSO) are compared with
those obtained by the traditional distributed algorithm. The distributed algorithm decides whether each BS is shut down according to its own load, without considering the global situation. Its general procedure is as follows: first, all BSs are set to be on; the connection matrix is established from the distances between BSs and users, and the total number of users connected to each BS is counted; when the load of a BS is less than the threshold, the BS is put to sleep; then the BS–user connection matrix is rebuilt, and the energy consumption and other relevant quantities are calculated. The particle swarm optimization (PSO) algorithm used in this paper is a centralized algorithm, which determines through iterative solution whether the BSs sleep based on the overall load.
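The per-BS rule of the distributed baseline described above reduces to a one-liner (names illustrative):

```python
def distributed_sleep(loads, thresholds):
    """Baseline distributed rule: each BS decides alone from its own load
    (1 = stay on, 0 = sleep), with no global view."""
    return [0 if load < th else 1 for load, th in zip(loads, thresholds)]
```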
Fig. 2 User random distribution
Table 1 shows the comparison of energy consumption after executing the two sleep algorithms from midnight to 8 a.m. It can be seen that since network traffic is very low between 0 and 8 a.m. and the number of active BSs is small, energy consumption can be greatly reduced after most BSs go dormant. The result of particle swarm optimization is better than that of the distributed algorithm. The energy consumption of the network can be improved by adjusting the resting states of the BSs in the cellular model. Representing the data in Table 1 as a histogram shows clearly that after the implementation of the PSO algorithm, when the user traffic in the cellular model is low, the overall energy consumption is greatly reduced, and the results are far better than those of the traditional distributed algorithm. This shows the effectiveness of PSO and achieves the goal of green communication (Figs. 3 and 4). To further measure the stability of the sleep algorithms, the interruption rate, the number of active BSs, and the energy efficiency of the two algorithms are calculated. It can be seen from the figures that the particle swarm optimization algorithm
Table 1 Comparison of energy consumption between the two algorithms (W)

Time        | 0 a.m  | 1 a.m  | 2 a.m  | 3 a.m  | 4 a.m  | 5 a.m  | 6 a.m  | 7 a.m  | 8 a.m
------------|--------|--------|--------|--------|--------|--------|--------|--------|-------
PSO         | 50,920 | 41,349 | 33,125 | 27,174 | 24,053 | 24,537 | 27,129 | 33,385 | 40,827
Distributed | 55,388 | 44,271 | 35,797 | 28,008 | 24,261 | 25,082 | 28,361 | 35,042 | 45,493
Fig. 3 Comparison of energy consumption
Fig. 4 Comparison of interruption probability
proposed in this paper is superior to the traditional distributed algorithm in terms of interrupt rate, number of activated BSs, and energy efficiency (Figs. 5 and 6).
Fig. 5 Comparison of working BS
Fig. 6 Comparison of energy efficiency
5 Conclusion

In this paper, a two-layer heterogeneous cellular network model including macro cells and micro cells is established. Particle swarm optimization (PSO) is used as the BS sleep algorithm, and the algorithm flow is designed. The simulation results show that, compared with the distributed algorithm, system energy consumption can be reduced by 10% and the interruption rate by about 10% over a certain period. The data show that PSO can significantly reduce energy consumption when user traffic is low, balance energy consumption, and realize green communication, making it a better BS sleep strategy. Acknowledgements. Supported by the National Key R and D Program of China (No. 2017YFC1500601).
References

1. Mesodiakaki A, Adelantado F, Antonopoulos A et al (2014) Energy impact of outdoor small cell backhaul in green heterogeneous networks. In: 2014 IEEE 19th international workshop on computer aided modeling and design of communication links and networks (CAMAD). IEEE Press, pp 11–15
2. Rao JB, Fapojuwo AO (2014) A survey of energy efficient resource management techniques for multi-cell cellular networks. IEEE Commun Surv Tutorials 16(1):154–180
3. Abou-Zeid H, Hassanein HS, Valentin S (2016) Energy-efficient adaptive video transmission: exploiting rate predictions in wireless networks. IEEE Trans Veh Technol 63(5):2013–2026
4. Gong J, Zhou S, Niu Z et al (2010) Traffic-aware base station sleeping in dense cellular networks. In: 2010 18th international workshop on quality of service (IWQoS). IEEE Press, pp 1–2
5. Zhao J, Hu J, Qu Y, Wang W (2016) An energy efficiency cooperating base station sleep mechanism in LTE-advanced network. Telecommun Sci 2:6
6. Hao M, Zhang Z, Xi B (2016) Dynamic base station shutdown algorithm based on distance sensing in 5G network. Video Eng 40(1):76–81
7. Kennedy J, Eberhart R (1995) Particle swarm optimization. In: Proceedings of IEEE international conference on neural networks
8. Xue F, Liu G, Gao S (2011) Solving 0–1 integer programming problem by hybrid particle swarm optimization algorithm. Comput Technol Autom 30(1):86–89
9. Niu Z, Zhou S, Zhou S et al (2012) Energy efficiency and resource optimized hyper-cellular mobile communication system architecture and its technical challenges. Sci Sin (Inf) 10:1191–1203
An Earthquake Monitoring System of LoRa Dynamic Networking Based on AODV Long Wang, Wenchao Yang, Xu Bai(B) , Lidong Liu, Xuerong Luo, and Mingjie Ji Harbin Institute of Technology, Harbin, China [email protected]
Abstract. In recent years, earthquake disasters have become increasingly frequent, causing great loss of life and property. To better evaluate and predict earthquakes, it is of great significance to design a system that can effectively collect seismic data when an earthquake occurs. LoRa wireless communication has the advantages of low power consumption, long communication distance, and strong anti-interference ability. In this paper, a LoRa network is used to upload the data collected by the seismic sensors to a LoRa base station; the data is then sent to a cloud server through 4G and finally received by the working platform. To cope with node damage during an earthquake, the AODV protocol is used to keep the collected data flowing through dynamic networking. The results show that the LoRa earthquake monitoring system based on the AODV protocol runs stably and can obtain reliable seismic data. Keywords: Earthquake monitoring · Dynamic networking · LoRa · AODV protocol
1 Introduction

In recent years, the frequent occurrence of earthquake disasters in China, such as the Lushan, Yushu, and Wenchuan earthquakes, has caused great loss of life and property [3]. Because of its geological conditions, China is one of the most seismic hazard-prone countries in the world [12]. Moreover, with the acceleration of urbanization, population and wealth are highly concentrated, and cities are developing toward large scale and complexity, making them increasingly vulnerable to earthquake disasters. Therefore, real-time monitoring and accurate dynamic assessment of earthquake disaster risk play an important role in the analysis and prediction of earthquake disasters and have become an important technical means of reducing regional and urban earthquake damage. The USA, Japan, and other countries have established natural disaster risk monitoring and loss assessment systems, such as the HAZUS-MH system [7], the PAGER system [4], and the Phoenix DMS system [5].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_84
At present, the coverage density of earthquake monitoring in China is still low. Traditional wired networks are not flexible enough, and line interruptions easily occur during an earthquake, so the transmission of collected seismic information is unreliable. Other wireless communication technologies, such as ZigBee, Bluetooth, and WiFi, also have shortcomings [9]: their poor penetration and short communication distance cannot satisfy the demand. Long range (LoRa), an important component of low-power wide area networks (LPWAN), integrates digital spread spectrum, digital signal processing, forward error correction coding, and other technologies [6]. Applying a LoRa wireless transmission network to earthquake monitoring can meet the requirements of long-distance communication, discrete distribution, strong anti-interference ability, and low power consumption [1]. In addition, considering that LoRa base stations may be damaged in the event of an earthquake or other accident, the AODV protocol can be used to route the network dynamically so that LoRa terminal nodes can transmit their data to other available base stations. After receiving the seismic data, the LoRa base stations upload it to the cloud server through the 4G network, and the upper computer obtains the data through the Internet for processing and analysis.
2 System Structure Model

The system is composed of four parts: the LoRa terminal nodes, the LoRa base station, the cloud server, and the user platform, as shown in Fig. 1. The LoRa terminal nodes transmit the collected seismic data to the LoRa base station through the LoRa network, and the data is then stored on the cloud server through a 4G connection. Users can access the server over the Internet to obtain the collected seismic data for analysis and processing, and can also, in turn, send acquisition-related parameter settings down to the LoRa terminal nodes.
Fig. 1 System structure
Since the nodes must be placed flexibly wherever the detection conditions are met, low power consumption is required. Both the LoRa terminal node and the LoRa base station use the STM32F407ZGT6 from STMicroelectronics as the main control chip, a 32-bit microcontroller with an Arm Cortex-M4 core running at 168 MHz. Its computing capacity meets the hardware requirements of the terminal nodes and base stations while keeping power consumption low. The module structure of the LoRa terminal node is shown in Fig. 2, comprising the three-axis vibration sensor module for detecting seismic waves, the power module, the main control chip and the LoRa RF module.
630
L. Wang et al.
Fig. 2 Structure of the LoRa terminal node
Figure 3 shows the module structure of the LoRa base station. It includes the power module, main control chip, the LoRa RF module and the 4G module for communication with server.
Fig. 3 Structure of the LoRa base station
The LoRa terminal nodes buffer the AD data in memory in real time and discard data beyond the storage window. When an earthquake is detected, the node is triggered to send the data from 10 s before to 20 s after the event through the LoRa module. The LoRa base station aggregates the received LoRa data and forwards it to the Aliyun server through 4G. Users can log in to the Aliyun server to obtain the data for processing and analysis.
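The buffering rule described above (keep a rolling pre-event window, then append a fixed post-event window once the detector fires) can be sketched as follows; the class name and the scaled-down test parameters are illustrative, not taken from the paper.

```python
from collections import deque

FS = 200                 # sampling rate (Hz), as in Sect. 4
PRE_S, POST_S = 10, 20   # seconds kept before / after the trigger

class EventBuffer:
    """Rolling pre-trigger window; on trigger, capture PRE_S + POST_S seconds."""

    def __init__(self, fs=FS, pre_s=PRE_S, post_s=POST_S):
        self.pre = deque(maxlen=fs * pre_s)  # discard data beyond the window
        self.post_needed = fs * post_s
        self.capture = None                  # None until an event is detected

    def push(self, sample, triggered=False):
        """Feed one AD sample; return the full record once capture completes."""
        if self.capture is None:
            self.pre.append(sample)
            if triggered:                    # band-pass threshold fired
                self.capture = list(self.pre)
                self.remaining = self.post_needed
            return None
        self.capture.append(sample)
        self.remaining -= 1
        if self.remaining == 0:
            record, self.capture = self.capture, None
            return record                    # hand off to the LoRa RF module
        return None
```

The `deque(maxlen=...)` silently drops the oldest samples, which matches the "discard the data beyond the storage time" behavior with no extra bookkeeping.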
3 Network Routing
There are two networking modes for LoRa. When all LoRa base stations are operating normally, a star topology is used: each terminal node communicates directly with a LoRa base station node. When an earthquake occurs, the LoRa base station nodes are likely to be damaged, leaving multiple terminal nodes unable to transmit their data effectively. To solve this problem, the AODV protocol is applied. It builds the network dynamically, and a LoRa terminal node can reach any device after multi-hop routing, as shown in Fig. 4, so the data can be transmitted to other available base stations. Ad hoc on-demand distance vector (AODV) routing is a routing protocol for wireless ad hoc networks that supports both unicast and multicast routing
Fig. 4 A possible situation for AODV dynamic networking
[8]. It is a typical on-demand routing protocol for ad hoc networks. It can determine a route to the destination in a dynamic point-to-point network and features fast route acquisition, low computation, low memory consumption and light network load. It uses destination sequence numbers to guarantee loop freedom at all times, avoiding many problems of traditional distance vector protocols [2]. AODV frames mainly include the route request (RREQ), route reply (RREP) and route error (RERR) frames. AODV is a source-driven routing protocol: if a node needs to transmit information to another node and has no route to the target, it must first multicast an RREQ message, as shown in Fig. 5. The RREQ message records the network layer addresses of the source and destination nodes. When a neighboring node receives the RREQ, it first checks whether it is itself the destination. If so, it sends a route reply (RREP) back to the source node; if not, it continues to forward the RREQ.
Fig. 5 AODV protocol routing search process
When a node detects a link break with a neighbor, it sends an RERR. After receiving the RERR, the source node can initiate the route request again if it still needs to communicate with the destination. When a terminal node cannot find an available LoRa base station to receive its data, it stores the earthquake data in the reserved
memory, acting as a black box. Workers can then bring a LoRa collector to the site and retrieve, within communication range, the data recorded by the surrounding nodes during the earthquake.
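The route discovery just described can be illustrated with a minimal RREQ-flood simulation. It keeps only the reverse-pointer mechanics; the sequence numbers, TTLs and timers of the real protocol (RFC 3561) are omitted, and all node names are hypothetical.

```python
from collections import deque

def aodv_route_discovery(links, source, dest):
    """Minimal RREQ flood: each node rebroadcasts an unseen RREQ and records
    the neighbor it first heard it from; the destination then replies (RREP)
    back along those reverse pointers."""
    reverse = {source: None}          # reverse routes built by the RREQ flood
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if node == dest:
            break
        for nb in links.get(node, ()):
            if nb not in reverse:     # first copy of this RREQ wins
                reverse[nb] = node
                frontier.append(nb)
    if dest not in reverse:
        return None                   # RERR case: no base station reachable
    path, node = [], dest             # RREP travels back toward the source
    while node is not None:
        path.append(node)
        node = reverse[node]
    return path[::-1]
```

For example, a terminal "T" whose own base station is down can still reach base station "B2" through a neighboring terminal "N1": `aodv_route_discovery({"T": ["N1"], "N1": ["T", "B2"], "B2": ["N1"]}, "T", "B2")` yields the multi-hop route `["T", "N1", "B2"]`.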
4 Results
The results collected by the installed equipment during the Changning earthquake are shown in Fig. 6. The three traces measured by the triaxial vibration sensor correspond to the vertical, north–south and east–west directions, respectively. The sampling frequency is 200 Hz. The occurrence of an earthquake is detected when the band-pass-filtered signal crosses a threshold; the data from 10 s before to 20 s after the event are then uploaded to the cloud server through the LoRa network and 4G, and can readily be used for seismic monitoring and analysis.
Fig. 6 Seismic data acquired during the Changning earthquake (amplitude in mV versus sample number for the vertical, east–west and north–south channels)
5 Summary
In view of the frequent earthquake disasters of recent years, it is of great significance to design a system that can effectively collect data when an earthquake occurs. Based on an STM32 hardware platform, we use a LoRa network to transmit the data collected by the seismic sensors, upload the data to a cloud server through 4G, and finally make the data available to the user platform. Because nodes may be damaged during earthquakes, the AODV protocol is used for dynamic networking
to ensure the transmission of the collected data. The results show that the LoRa seismic monitoring network based on the AODV protocol runs stably and obtains reliable seismic data.
Acknowledgements. This work was supported by the National Key R&D Program of China (No. 2017YFC1500601).
References
1. Augustin A, Yi J, Clausen T, Townsley WM (2016) A study of LoRa: long range and low power networks for the internet of things. Sensors 16(9):1466
2. Chakeres ID, Belding-Royer EM (2004) AODV routing protocol implementation design. In: 24th international conference on distributed computing systems workshops, proceedings, pp 698–703
3. Cui P, Chen XQ, Zhu YY, Su FH, Wei FQ, Han YS, Liu HJ, Zhuang JQ (2011) The Wenchuan earthquake (May 12, 2008), Sichuan province, China, and resulting geohazards. Nat Hazards 56(1):19–36
4. FEMA P-58-1 (2012) Seismic performance assessment of buildings (volume 1 – methodology). Federal Emergency Management Agency, Washington
5. Hori M, Ichimura T (2008) Current state of integrated earthquake simulation for earthquake hazard and disaster. J Seismolog 12(2):307–321
6. Lavric A, Popa VA (2017) LoRaWAN: long range wide area networks study. In: 2017 international conference on electromechanical and power systems (SIELMEN). IEEE, pp 417–420
7. Remo JW, Pinter N (2012) Hazus-MH earthquake modeling in the central USA. Nat Hazards 63(2):1055–1081
8. Perkins C, Belding-Royer E, Das S (2003) RFC 3561: Ad hoc on-demand distance vector (AODV) routing
9. Trinh LH, Bui VX, Ferrero F, Nguyen TQK, Le MH (2017) Signal propagation of LoRa technology using for smart building applications. In: 2017 IEEE conference on antenna measurements and applications (CAMA). IEEE, pp 381–384
10. Truong TP, Nguyen LM, Le TH, Pottier B (2019) A study on long range radio communication for environmental monitoring applications. In: Proceedings of the 2019 2nd international conference on electronics, communications and control engineering, pp 92–97
11. Wu DS, Li ZW, Meng D, Zhang X, Li LD (2017) Design of intelligent interactive terminal of household intelligent power management system. Autom Instrum 1:34–38
12. Xu W, Liu J, Xu G, Wang Y, Liu L, Shi P (2016) Earthquake disasters in China. In: Natural disasters in China. Springer, Berlin, Heidelberg, pp 37–72
Track Segments Stitching for Ballistic Group Target Xiaodong Yang, Jianguo Yu(B) , Lei Gu, and Qiang Huang Nanjing Research Institute of Electronics Technology, Nanjing 210039, China [email protected], [email protected], [email protected], [email protected]
Abstract. Ballistic group targets are closely spaced and mutually occlusive, so distinguishing and precisely tracking a complex ballistic target at a given radar working band has become an urgent problem. To solve it, we first predict the group center using two-body motion, then introduce a track segment stitching (TSS) method that uses the track file to construct a hypothesis test, correlates and smooths the track segments, and finally manages the batch numbers. Simulation results show that the average tracking time is doubled compared with the traditional algorithm and the tracking root-mean-square (RMS) errors are significantly decreased. The group targets are tracked continuously and keep a unique batch number, providing a new precise tracking method for ballistic group targets.
Keywords: Group target · Ballistic target · Track segment stitching
1 Introduction
In modern warfare, ballistic missiles have become decisive weapons in local wars because of their fast flight speeds and strong long-range precision strike capabilities [1]. The world's major powers have carried out extensive research on both attack and defense. In terms of missile penetration, means such as stealth, jamming, decoys, reentry maneuvers and multiple warheads have been applied in multiple product types [2], while the corresponding missile defense technology has developed relatively slowly. Tracking a complex ballistic target group is the basis for warhead identification and interception. Because the targets in a ballistic group move similarly, large and small targets shield one another, and the distance between targets is very small, it is difficult for the radar to resolve them, which increases the difficulty of detection and frequently causes missed detections. This inevitably causes mis-association within the ballistic target group, mixes the tracks, and ultimately degrades warhead recognition and interception. Most traditional methods focus on modeling and tracking the ballistic target group as a whole, managing and correlating the warhead group as a single target,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_85
and filtering [3]. Some scholars use the relatively stable distribution of target positions within the group as the basis for subsequent processing to improve the association, filtering and trajectory prediction of the target group [4]. However, the missile defense system pays more attention to high-threat warhead targets and needs to track and confirm the warheads accurately. Based on the motion characteristics and motion model of the ballistic target, we use the existing track file to “backtrack” the batch number management of newly batched tracks, effectively reducing batch changes and nonlinear estimation errors and improving the radar's detection performance on the ballistic target group from the information processing side.
2 Tracking Model of Ballistic Target Group
2.1 Measurement Model
Assume the radar measurement data are expressed in the east–north–up (ENU) rectangular coordinate system centered at the radar station; without loss of generality, take three-dimensional coordinates as an example. The radar obtains the target's range, azimuth and elevation measurements. The radar measurement equation at time k is

Z(k) = H X(k) + W(k)   (1)

where H is the measurement matrix, X(k) is the target state vector and W(k) is the radar measurement noise. The corresponding measurement is expressed as

Z(k) = [r_k  a_k  e_k]^T + W(k)   (2)
where r_k, a_k and e_k, respectively, correspond to the range, azimuth and elevation of the target relative to the radar at time k:

r_k = sqrt(x_k^2 + y_k^2 + z_k^2)
a_k = arctan(y_k / x_k)
e_k = arctan(z_k / sqrt(x_k^2 + y_k^2))   (3)
where x_k, y_k and z_k correspond to the east, north and zenith components of the ENU coordinates at time k.
2.2 State Equation
The trajectory of a ballistic target consists of an active segment and a passive segment. The active segment is short and, because of radar deployment, mostly outside the line of sight of the missile defender. The ballistic model studied in this paper therefore covers the passive segment of the flight trajectory. The model is
developed from outer-space two-body motion and describes the motion characteristics of the missile under the action of gravity and atmospheric resistance. This segment accounts for more than 80% of the entire flight and, as the main time window for missile interception operations, is of great significance. The expression of the two-body motion model in the earth-centered inertial (ECI) coordinate system is

r̈_ECI = −u · r_ECI / ||r_ECI||^3   (4)
where r_ECI is the position vector in the ECI coordinate system, u is the earth's gravitational constant with value 3.986005 × 10^14 m^3/s^2, r̈_ECI is the second-order derivative of r_ECI and ||r_ECI|| is the modulus of the position vector. Because of the earth's self-rotation, missiles are more easily described in the earth-centered earth-fixed (ECF) coordinate system. The conversion from ECF to ECI coordinates is

r_ECI = T_ECF^ECI · r_ECF   (5)

where T_ECF^ECI is the rotation matrix from ECF to ECI coordinates:
T_ECF^ECI(ωt) = [ cos(ωt)  −sin(ωt)  0
                  sin(ωt)   cos(ωt)  0
                  0          0        1 ]   (6)
where ω = 7.292115 × 10^−5 rad/s is the rotation rate of the earth. From (4) to (6), differentiating (5) twice gives the expression in the ECF coordinate system:

r̈_ECI = (T_ECF^ECI r_ECF)″ = T̈_ECF^ECI r_ECF + 2 Ṫ_ECF^ECI ṙ_ECF + T_ECF^ECI r̈_ECF   (7)

where r_ECF, ṙ_ECF and r̈_ECF are the target's position, velocity and acceleration vectors in the ECF coordinate system; the terms containing Ṫ_ECF^ECI and T̈_ECF^ECI account for the Coriolis force and the centripetal force, respectively. Solving (7) for r̈_ECF finally yields the ordinary differential equations of the ballistic target in the unpowered phase [5].
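Expanding Eq. (7) and solving for r̈_ECF gives the acceleration that is actually integrated in the ECF frame: two-body gravity plus the Coriolis (−2ω×v) and centripetal (−ω×(ω×r)) corrections. A sketch follows, with a fixed-step RK4 integrator as our own choice (the paper does not specify one); μ and ω take the values given in the text.

```python
import numpy as np

MU = 3.986005e14          # earth's gravitational constant u (m^3/s^2)
W_E = 7.292115e-5         # earth rotation rate omega (rad/s)
OMEGA = np.array([0.0, 0.0, W_E])

def ecf_acceleration(r, v):
    """Passive-segment acceleration in the ECF frame implied by Eq. (7):
    two-body gravity (4) plus Coriolis and centripetal terms."""
    grav = -MU * r / np.linalg.norm(r) ** 3
    return grav - 2.0 * np.cross(OMEGA, v) - np.cross(OMEGA, np.cross(OMEGA, r))

def propagate(r, v, dt, steps):
    """Fixed-step RK4 integration of the ballistic state (r, v)."""
    def f(s):
        return np.concatenate([s[3:], ecf_acceleration(s[:3], s[3:])])
    s = np.concatenate([r, v])
    for _ in range(steps):
        k1 = f(s)
        k2 = f(s + 0.5 * dt * k1)
        k3 = f(s + 0.5 * dt * k2)
        k4 = f(s + dt * k3)
        s = s + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return s[:3], s[3:]
```

For a point on the equatorial plane at rest in ECF, the Coriolis term vanishes and the centripetal term points outward, slightly reducing the effective gravitational acceleration.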
3 Group Target Tracking for Ballistic Targets
3.1 Flowchart of Track Segment Stitching
Based on the tracking model of ballistic group targets, a track segment correlation algorithm based on hypothesis testing yields the closed-loop detection-and-tracking flow for ballistic group targets shown in Fig. 1.
Fig. 1 Flowchart of tracking ballistic group targets (predict track situation → update track correlation → initiate tracks → track segment stitching)
As Fig. 1 shows, situation prediction is first performed on the existing track group: each track i is predicted from the current time k to the next cycle k + 1, that is,

Z_i(k + 1|k) = H X̂_i(k|k)
(8)
where Z_i(k + 1|k) is the predicted measurement of track i, H is the measurement matrix and X̂_i(k|k) is the estimated state of track i at time k; see formulas (1), (3) and (4) for details. The spatial positions of all tracks are predicted, the pairwise differences are computed, and the number of targets falling into the ambiguous area is counted; that is, targets whose deviation in every dimension is less than the radar resolution. If the spatial difference of tracks i and j at time k satisfies the following formula, i and j fall into the ambiguous area and the ambiguity count is 2. Traversing all tracks gives the number of targets in the ambiguous area:

|Z_i(k + 1|k) − Z_j(k + 1|k)| ≤ δ   (9)

where δ = [δ_r  δ_a  δ_e]^T represents the radar's ability to resolve group targets in range, azimuth and elevation, usually taken as three times the radar measurement error.

3.2 Association Algorithm of Track Segments
In the tracking stage after detection, whether two track segments come from the same target is essentially a hypothesis testing problem. Assume the discrete plots of track i are

X_i = {t_i, r_i, a_i, e_i}, i = 1, 2, . . . , n   (10)
The discrete plots of track j are

X_j = {t_j, r_j, a_j, e_j}, j = 1, 2, . . . , m   (11)
Track i has n plots and track j has m plots; t, r, a and e denote time, range, azimuth and elevation, respectively. Suppose the estimated state of track i is X̂_i with error covariance matrix P_i, and the estimated state of track j is X̂_j with error covariance matrix P_j. Then

ε = (X_i − X̂_i)^T P_i^−1 (X_i − X̂_i) + (X_j − X̂_j)^T P_j^−1 (X_j − X̂_j)   (12)
The quantity ε above is the normalized statistical distance, which obeys a χ² distribution; the track segment pairing problem thus reduces to a hypothesis test on that distribution. The threshold ε_max satisfies

Pr{ε ≤ ε_max} = 1 − α   (13)

where ε_max is the detection threshold and 1 − α is the test confidence.
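The gate of Eqs. (12)–(13) can be exercised numerically. The sketch below uses the common combined-covariance form of the normalized distance (an assumption; Eq. (12) writes the two per-track terms separately) and a brute-force assignment as a stand-in for the auction algorithm mentioned in Sect. 4.2; all numbers are illustrative.

```python
import numpy as np
from itertools import permutations

CHI2_GATE = 7.815   # chi-square 95% point for 3 degrees of freedom (r, a, e)

def pair_cost(x_old, P_old, x_new, P_new):
    """Normalized statistical distance between an extrapolated old-track
    state and a new-track state; chi-square distributed under the
    same-target hypothesis."""
    d = x_old - x_new
    return float(d @ np.linalg.inv(P_old + P_new) @ d)

def stitch(old_tracks, new_tracks, gate=CHI2_GATE):
    """Exhaustive assignment over gated pairings, minimizing the summed cost
    (a stand-in for the auction algorithm)."""
    best, best_cost = None, np.inf
    for perm in permutations(range(len(new_tracks)), len(old_tracks)):
        costs = [pair_cost(*old_tracks[i], *new_tracks[j])
                 for i, j in enumerate(perm)]
        if all(c <= gate for c in costs) and sum(costs) < best_cost:
            best, best_cost = perm, sum(costs)
    return best   # best[i] = index of the new track stitched to old track i
```

With states in (range, azimuth, elevation) and a shared covariance, the close pairs pass the gate while cross pairings are rejected, so the assignment recovers the true correspondence.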
4 Experiments and Discussion
4.1 Experimental Setup
The radar station is located in the launch area. The radar has a working bandwidth of 2 MHz, a ranging error of 50 m and azimuth and elevation errors of 0.2°, all zero-mean Gaussian. The STK simulation software is used to generate the data with a sampling interval of T = 1 s; the launch time of the Batch No. 1 target is 0, the trajectory height is 200 km and the trajectory range is 400 km. The characteristics of the standard trajectory are shown in Fig. 2. There are three ballistic targets in the simulation scene. To simulate the track interruption caused by mutual occlusion and the small separation between ballistic targets, each track is randomly interrupted for 30 s between 200 and 235 s after launch; that is, no plot information is sent to the track processing module, so the track is not updated for a period, after which the batch is cleared. When the interruption ends, batching restarts. The time-versus-range and time-versus-azimuth plots of the tracks are shown in Figs. 2 and 3, where the old track batch numbers are 1–3 and the newly batched track numbers are 4–6. Figure 2 shows that just after track initiation, because the initial velocity estimate is inaccurate, the filter is still converging: the track precision is poor, comparable to the raw plot precision, with obvious jitter; tracking has essentially converged after a dozen or so update cycles. Figure 3 shows that once the track is interrupted, for lack of precise measurements, the extrapolation error grows with time. An effective track stitching method is therefore needed to keep the same batch number before and after the interruption and to reduce the tracking errors.
Fig. 2 Time versus range of tracks (range (km) versus time (s), batches 1–6)
Fig. 3 Time versus azimuth of tracks (vertical axis labeled altitude (km), versus time (s), batches 1–6)
4.2 Track Segment Stitching
The ballistic model of Sect. 2 is used to extrapolate each old track, taking its last 15 points and processing them by least-squares Gauss–Newton iteration. Each old track is extrapolated to the start time of the new tracks, and the hypothesis-test-based TSS algorithm of Sect. 3.2 computes the correlation between the new and old tracks. The simulation results are shown in Figs. 4 and 5. Applying the auction algorithm to the resulting costs quickly finds the correct pairing; that is, old tracks 1–3 are paired with new tracks 4–6, respectively. The precision and track continuity of the TSS algorithm (this paper)
Fig. 4 Range after segment stitching (range (km) versus time (s), batches 1–3)
Fig. 5 Altitude after segment stitching (altitude (km) versus time (s), batches 1–3)
before and after the interruption, from 1000 Monte Carlo runs, are shown in Tables 1 and 2. Table 1 shows that the proposed algorithm greatly improves the tracking precision in range, azimuth and elevation across the interruption: range precision improves by 30.9%, and azimuth and elevation precision by 53.3% and 50.8%, respectively. Table 2 shows that after TSS the average tracking time more than doubles, while the average number of track batches exactly matches the true number, solving the batch failure problem.
Table 1 Comparison of tracking precision

Tracking error      | Traditional algorithm | Algorithm in this paper
Range RMSE (m)      | 26.5                  | 18.3
Azimuth RMSE (°)    | 0.12                  | 0.056
Elevation RMSE (°)  | 0.13                  | 0.064
Table 2 Comparison of tracking continuity

Continuity                     | Traditional algorithm | Algorithm in this paper
Average tracking time (s)      | 165.8                 | 351.7
Average batch number of tracks | 6.0                   | 3.0
Correlation accuracy (%)       | 90.2                  | 99.8
5 Conclusion
To achieve continuous and stable tracking of complex ballistic group target tracks, this paper uses least-squares Gauss–Newton iteration with ballistic-model extrapolation to realize hypothesis-test-based track segment correlation. The proposed algorithm greatly extends the average track duration, effectively reduces the RMS range and angle errors during track interruption, and can serve as a reference for subsequent rapid warhead identification and missile interception.
References
1. Chen CX (2014) Study on approaches and algorithms for tracing ballistic missile. Graduate School of Northwestern Polytechnical University, Xi'an
2. Li DG (2018) Russian air defense and anti-missile defense system. Defense Technol Indus 11:64–67
3. Li CX (2015) Summary of group tracing technology based on ballistic missile. Tactical Missile Technol 3:66–73
4. Du MY (2019) Advances in key technologies of group target tracing. Electron Opt Control 26(4):59–65
5. Yu JG (2011) Ballistic target track segments association and optimization. Acta Aeronaut Astronaut Sin 32(10):1897–1904
A Physical Security Technology Based upon Multi-weighted Fractional Fourier Transform Over Multiuser Communication System Yong Li(B) , Zhiqun Song, and Bin Wang Science and Technology on Communication Networks Laboratory, The 54th Research Institute of China Electronics Technology Group Corporation, Shijiazhuang 050081, Hebei, China young li [email protected]
Abstract. Inspired by the multiple weighted-type fractional Fourier transform (M-WFRFT) employed for physical layer security in point-to-point communication systems, this paper extends M-WFRFT to the single-transmitter, multiple-receiver communication system. Using the transform-order relationships between different M-WFRFTs, we design the communication system and frame structure based on M-WFRFT. Numerical simulations demonstrate that the proposed technology can further improve the physical layer security of wireless communication.
Keywords: Multiple weighted-type fractional fourier transform (M-WFRFT) · Physical layer security · Frame structure
1
Introduction
Recently, multiuser communication systems have become increasingly common, and their physical layer security requirements keep growing [1,2]. Traditionally, spread spectrum and frequency hopping (FH) are employed in wireless communications to improve physical layer security. Nevertheless, with the rapid development of signal detection and recognition algorithms, traditional spread spectrum and FH can no longer satisfy the requirements of physical layer security [3]. Yong Li is the corresponding author. All authors are with the Science and Technology on Communication Networks Laboratory, the 54th Research Institute of China Electronics Technology Group Corporation. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_86
Multi-antenna schemes with artificial noise [4,5], which exploit the spatial directivity of the antenna array, can guarantee physical layer security in multiuser communication systems, but they require the channel state information (CSI) of the legitimate receiver to be better than that of the eavesdroppers. Apart from these conventional technologies, the weighted-type fractional Fourier transform (WFRFT), based on the 4-WFRFT, has been widely used for physical layer security [6,7]. By extending the weighted terms, the 4-WFRFT generalizes to the multi-WFRFT (M-WFRFT) [8]. In [3] we proposed a single-user communication system based on M-WFRFT; however, it does not consider the multiuser case. Inspired by that work, this paper proposes an M-WFRFT-based multiuser communication system, including the multiuser system design and a dedicated frame structure. The physical layer security performance of the multiuser communication system is verified by numerical simulations.
2
The Definition of M-WFRFT and Its Properties
In this section, we first provide the definition of M-WFRFT. We then present some important properties, which form the theoretical basis of physical layer security in the multiuser communication system. 2.1
The Definition of M-WFRFT
For a signal X of length N, the α-order M-WFRFT of X can be expressed as

F^α X = W_M^α X   (1)

Here, W_M^α is the α-order M-WFRFT matrix of size N × N. It can be written as

W_M^α = Σ_{l=0}^{M−1} B_l^α W_4^{4l/M}   (2)

in which B_l^α and W_4^{4l/M} are the coefficients of the M-WFRFT and the 4l/M-order 4-WFRFT matrix, respectively, defined as

B_l^α = (1/M) Σ_{k=0}^{M−1} exp[(2πi/M)(α − l)k],  l = 0, 1, . . . , M − 1   (3)

W_4^{4l/M} = Σ_{l̂=0}^{3} A_{l̂}^{4l/M} F^{l̂},  l̂ = 0, 1, 2, 3   (4)

where F is the discrete Fourier transform matrix and A_{l̂}^β are the 4-WFRFT coefficients [9–11]:

A_{l̂}^β = (1/4) Σ_{k=0}^{3} exp[(2πi/4)(β − l̂)k],  l̂ = 0, 1, 2, 3   (5)
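Equations (2)–(5) can be implemented directly, taking F to be the unitary DFT matrix so that F^4 = I; the DFT sign convention is an assumption here, so the sanity checks below stick to properties that hold for this construction: W_M^0 = I, unitarity, order additivity, and the order relationship between different weighted-term numbers quoted in Sect. 2.2.

```python
import numpy as np

def wfrft4(beta, N):
    """4-WFRFT matrix of Eqs. (4)-(5): a weighted sum of the powers
    F^0..F^3 of the unitary DFT matrix F (which satisfies F^4 = I)."""
    F = np.fft.fft(np.eye(N)) / np.sqrt(N)
    A = [np.mean([np.exp(2j * np.pi / 4 * (beta - l) * k) for k in range(4)])
         for l in range(4)]
    return sum(A[l] * np.linalg.matrix_power(F, l) for l in range(4))

def mwfrft(alpha, M, N):
    """M-WFRFT matrix of Eqs. (2)-(3): a weighted sum of 4-WFRFT
    matrices of orders 4l/M, l = 0..M-1."""
    B = [np.mean([np.exp(2j * np.pi / M * (alpha - l) * k) for k in range(M)])
         for l in range(M)]
    return sum(B[l] * wfrft4(4 * l / M, N) for l in range(M))
```

In this construction W_16^1 coincides with W_8^0.5, the transmitter/receiver order pair used in the simulations of Sect. 4.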
It can be observed from (3) and (5) that the coefficients of the 4-WFRFT and the M-WFRFT have similar forms. The relationship between the 4-WFRFT and the M-WFRFT was established in [8]; we quote the relevant properties below. 2.2
Some Important Properties
Here, we employ the properties of [3,8] as follows:

Theorem 1. Assume α_4 and β_M are the modulation orders of the 4-WFRFT and the multi-WFRFT, respectively, and let W_4^{α_4} and W_M^{β_M} denote the corresponding matrices. If W_4^{α_4} = W_M^{β_M}, then

β_M = (M/4) α_4,  M > 4   (6)

Theorem 1 reveals the order relationship between the 4-WFRFT and the M-WFRFT. As its generalization, Corollary 1 gives the order relationship between different M-WFRFTs (M ∈ Z), which is the theoretical basis of this paper.

Corollary 1. If W_m^{α_m} = W_n^{α_n} (m, n ≥ 4), then

α_m = (m/n) α_n   (7)

According to Corollary 1, the M-WFRFT and N-WFRFT of an original signal, with M ≠ N, can be equal iff W_m^{α_m} = W_n^{α_n} (m, n ≥ 4). This inspires us to design the transmitter and receiver of a wireless link with different transform orders: even if an eavesdropper steals the security keys of the transmitter, it still cannot resolve the signal correctly. Here, we assume the eavesdropper can steal the information of the transmitter but not that of the receiver. We proposed a single-transmitter, single-receiver system in [3]; this theory can be extended to the single-transmitter, multiple-receiver communication system.
3
System Model
Single-transmitter, multiple-receiver communication systems, such as wireless radio systems, are widely used in practice. However, because of the open nature of wireless communication, anyone within the range of the wireless
communication can receive and analyze the information, so physical layer security is a significant problem for wireless communication. In this section, we present the single-transmitter, multiple-receiver communication system based on M-WFRFT. The system model is shown in Fig. 1. Assume N wireless communication devices, of which only one is the transmitter and the remaining N − 1 are receivers; number the devices 0, 1, 2, . . . , N − 1.
Fig. 1 System model for multiuser communication
Without loss of generality, assume the transform order of the kth communication device is α_k. According to Corollary 1, the transform-order relationship between the kth and jth devices is

α_k = (k/j) α_j   (8)

Here, we assume that the kth communication device is the transmitter. The α_k-order-domain signal X = (x_0, x_1, . . . , x_{N−1}), through the −α_k-order M_k-WFRFT, is converted to the time-domain signal

S = W_k^{−α_k} X^T   (9)
The time-domain signal S, after digital-to-analog conversion and up-conversion, is packed into the frame structure shown in Fig. 2. A frame has three parts: the frame synchronization code, the receiver transform order and the transmitted information. Note that only the receiver's transform order is transmitted, not the transmitter's (denoted as
“FF”). Even if the receiver's transform order is intercepted by an eavesdropper, the communication information cannot be recovered, because the receiver's weighted term number (M) is unknown to the eavesdropper.
Fig. 2 Frame structure designation
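The frame of Fig. 2 can be sketched as a simple pack/parse pair; the sync pattern and the 4-byte float encoding of the order are our own illustrative choices, not specified by the paper.

```python
import struct

SYNC = b"\x7e\x81\x7e\x81"   # frame synchronization code (hypothetical value)

def build_frame(rx_order, payload):
    """Frame of Fig. 2: sync code, receiver transform order, information.
    Only the receiver's order is sent; the transmitter's order and the
    receiver's weighted term number M are never transmitted."""
    return SYNC + struct.pack(">f", rx_order) + payload

def parse_frame(frame):
    """Correlate against the sync code, then split order and payload."""
    if not frame.startswith(SYNC):
        raise ValueError("frame sync failed")
    (rx_order,) = struct.unpack(">f", frame[4:8])
    return rx_order, frame[8:]
```

For example, `parse_frame(build_frame(0.5, b"data"))` returns the receiver order 0.5 and the payload `b"data"`.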
At the receiver, the jth (j ≠ k) device, after down-conversion and analog-to-digital conversion, performs frame synchronization via a correlation operation. The received signal Y, with the frame synchronization codes removed, can be written as

Y = W_j^{α_j} S   (10)

Here, we focus on the physical layer security technology; channel equalization algorithms are not considered. According to Corollary 1 and Eqs. (8) and (9), Eq. (10) can be rewritten as

Y = W_j^{α_j} W_k^{−α_k} X^T
  = W_j^{(j/k)α_k} W_k^{−α_k} X^T
  = W_k^{α_k} W_k^{−α_k} X^T
  = X^T   (11)

We find that only the correct weighted term and order recover the signal, which guarantees the physical layer security of the communication system. Moreover, the transmitter's order is never transmitted, so it cannot be intercepted. We verify the security performance by numerical simulations in the next section.
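Equations (9)–(11) can be checked end to end with a compact implementation of Eqs. (2)–(5), taking the unitary DFT as the base transform F (our assumption) and dropping the transposes since the signal is a 1-D vector. The transmitter parameters (M = 16, order 1) and matched receiver parameters (M = 8, order 0.5) follow Sect. 4; the eavesdropper's wrong guess M = 32 is illustrative.

```python
import numpy as np

def mwfrft(alpha, M, N):
    """Compact M-WFRFT matrix per Eqs. (2)-(5); F is the unitary DFT."""
    F = np.fft.fft(np.eye(N)) / np.sqrt(N)
    w4 = lambda b: sum(np.mean([np.exp(2j * np.pi / 4 * (b - l) * k)
                                for k in range(4)])
                       * np.linalg.matrix_power(F, l) for l in range(4))
    return sum(np.mean([np.exp(2j * np.pi / M * (alpha - l) * k)
                        for k in range(M)])
               * w4(4 * l / M) for l in range(M))

N = 8
X = np.arange(1.0, N + 1) + 0j       # test symbols
S = mwfrft(-1.0, 16, N) @ X          # Eq. (9): transmitter, M=16, order 1
Y = mwfrft(0.5, 8, N) @ S            # Eq. (11): receiver, M=8, order 0.5
Y_eve = mwfrft(0.5, 32, N) @ S       # eavesdropper: right order, wrong M
```

Here Y recovers X while Y_eve does not, mirroring the simulation setup of Sect. 4: the intercepted order alone is useless without the receiver's weighted term number.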
4
Numerical Simulations and Discussions
Based on the analysis above, the M-WFRFT-based physical layer security technology suits single-transmitter, multiple-receiver communication systems. In this section, we verify its physical layer security performance by numerical simulation. In the simulation, the weighted term number and transform order of the transmitter are 16 and 1, respectively. According to Corollary 1, the correct transform order at the receiver's weighted term number (M = 8) is 0.5, and this order 0.5 is carried in the frame structure. Assume the eavesdropper can intercept the receiver's transform order 0.5 but not its weighted
Fig. 3 BER performance of M-WFRFT
item. In this case, we simulate the bit error rate (BER) degradation caused by the scanning error. The simulation result is shown in Fig. 3. It can be observed from Fig. 3 that the eavesdropper scans the weighted item with the transform order known. When the eavesdropper's weighted item (M) equals 16, the BER increases by one order of magnitude at Eb/N0 ≥ 9 dB. When M = 32 and 64, the eavesdropper's BER exceeds 10^{-1}, so that it cannot work in practice. Note that the eavesdropper must still intercept the transform order of the receiver.
5 Conclusion
In this paper, we proposed the M-WFRFT-based physical layer security technology for multiuser communication systems. We first introduced the M-WFRFT and two of its important properties. Based on the transform order relationship, we designed the system model and frame structure. The numerical simulations demonstrate that the proposed technology can protect the physical layer security of wireless communication.

Acknowledgments. This work is supported by the National Key Research and Development Program of China (under grant 254) and by the Science and Technology on Communication Networks Laboratory under grant SXX18641X027.
References

1. He B, Chen J, Kuo Y, Lang L (2017) Cooperative jamming for energy harvesting multicast networks with an untrusted relay. IET Commun 11(13):2058–2065
2. Zou Y, Wang J, Wang X et al (2016) A survey on wireless security: technical challenges, recent advances, and future trends. Proc IEEE 104(9):1727–1765
3. Li Y, Song Z (2019) A secure wireless communication mechanism based on multi-WFRFT. In: 2019 IEEE 21st international conference on high performance computing and communications, Zhangjiajie, China, pp 437–442. https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00072
4. Gau RH (2012) Transmission policies for improving physical layer secrecy throughput in wireless networks. IEEE Commun Lett 16(12):1972–1975
5. Mei W, Chen Z, Fang J (2017) Artificial noise aided energy efficiency optimization in MIMOME system with SWIPT. IEEE Commun Lett 21(8):1795–1798
6. Fang X, Sha X, Li Y (2015) Secret communication using parallel combinatory spreading WFRFT. IEEE Commun Lett 19(1):62–65
7. Fang X, Zhang N, Zhang S et al (2017) On physical layer security: weighted fractional Fourier transform based user cooperation. IEEE Trans Wirel Commun 16(8):5498–5510
8. Li Y, Song Z, Sha X (2018) The multi-weighted type fractional Fourier transform scheme and its application over wireless communications. EURASIP J Wirel Commun Networking 41:1–10
9. Li Y, Sha X, Wang K (2013) Hybrid carrier communication with partial demodulation over underwater acoustic channels. IEEE Commun Lett 17(12):2260–2263
10. Li Y, Sha X, Zheng F, Wang K (2014) Complexity equalization of HCM systems with DPFFT demodulation over doubly-selective channels. IEEE Signal Process Lett 21(7):862–865
11. Li Y, Liu A (2016) Inter-carrier interference mitigation algorithm based on hybrid carrier system. Radio Commun Technol 42(3):42–45
Fast Convergent Algorithm for Hypersonic Target Tracking with High Dynamic Biases Dan Le(B) and Jianguo Yu Nanjing Research Institute of Electronics Technology, Nanjing, China [email protected], [email protected]
Abstract. The chirp signal of large time–bandwidth product resolves the contradiction between long detection range and fine range resolution and is widely used in current radar systems. However, due to the distance-Doppler coupling, high dynamic biases are introduced into the radar ranging when tracking hypersonic targets, which makes filters take much longer to converge. In this paper, we propose a fast convergent algorithm to overcome this problem. The intuition is that the high dynamic bias is caused by the high radial velocity, and the long convergence time is introduced by the large estimation error of the radial velocity. The proposed algorithm solves the problem by employing an estimator with high radial velocity precision. Experiments show that the proposed algorithm largely improves the convergence time of range, azimuth angle, elevation angle, and position. Keywords: Hypersonic target tracking · Distance-Doppler coupling · Fast convergence
1 Introduction

The chirp signal is widely used in current radar systems because of its large time–bandwidth product. However, a strong coupling between distance and Doppler is also introduced, which biases the range measurement and has a large impact on target state estimation and data association. To overcome this problem, in the field of signal processing, some approaches to estimate the radial velocity have been proposed [1, 2]; however, they cannot be directly applied in practical systems. In the field of data processing, some state estimation approaches work with the range bias and modify the estimated range according to the filtered range velocity. However, they are only applicable to low-speed targets, and a large error remains for high-speed targets. Hypersonic vehicles such as the X-47B and X-51A have been successfully tested in recent years. Aiming at this problem, Wang et al. [3] proposed an improved extended Kalman filter with the measurement of radial velocity. Zhang et al. [4] proposed an unbiased estimation algorithm running alongside target tracking. However, the methods in Refs. [3, 4] both require that the measurement of radial
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_87
velocity is available, which usually is not the case when tracking high-speed targets. Huang et al. [5] compared three filter algorithms with a bias model. The comparisons show that the decoupled filter is best in terms of range precision, while the UKF is best in terms of angle precision. Observing that the range decoupled filter also has the best radial velocity precision and the fastest convergence, while the angle precision of the UKF deteriorates for high-speed targets at high data rates, this paper proposes an algorithm combining the range decoupled filter with a bias model and the UKF with an unbiased model. The rest of this paper is organized as follows. Section 2 gives the biased model for tracking a target with range-Doppler coupling, and the proposed algorithm is presented in Sect. 3. The experiments are presented in Sect. 4, and Sect. 5 is devoted to the conclusion.
2 Biased Model for Tracking

We assume that the radar emits the chirp signal

$s(t) = \cos\left(\pi \frac{B}{\tau} t^2\right), \quad 0 \le t \le \tau$

where B is the bandwidth and τ is the pulse width. Then, for a target with range $r_0$ and radial velocity v, the amplitude of the received signal after matched filtering is given by

$|s_o(t)| = \left(1 - \frac{|t - t_0|}{\tau}\right) \left|\frac{\sin\left[\pi\left(f_d + \frac{B}{\tau} t\right)(\tau - |t - t_0|)\right]}{\pi\left(f_d + \frac{B}{\tau} t\right)(\tau - |t - t_0|)}\right|$
where $t_0 = 2r_0/c$, c is the speed of light, $f_d = -2 f_0 v / c$ is the Doppler shift, and $f_0$ is the central carrier frequency. According to the maximum signal-to-noise ratio criterion, the measured range is given by

$r = r_0 + \frac{f_0 \tau}{B} v$

Let the Doppler coupling coefficient be

$c_f = \frac{f_0 \tau}{B}$
Then the range bias is $c_f v$. Table 1 illustrates the Doppler coupling coefficient $c_f$ and the range bias for different wave bands, where the pulse width is 1.2 ms and the bandwidth is 1.5 MHz. It can be seen that $c_f$ varies from 0.24 to 9.6 s, and the range bias varies from 0.48 to 19.2 km. The range bias is too large to be neglected. To overcome this problem, a biased tracking model, which eliminates the mismatch between the biased measurement and an unbiased tracking model, is built. Let the state vector of the target at time k be

$X_k = (x_k, \dot{x}_k, \ddot{x}_k, y_k, \dot{y}_k, \ddot{y}_k, z_k, \dot{z}_k, \ddot{z}_k)^T$  (1)
Table 1 Range bias for different wave bands

Band | f0/GHz | τ/ms | v/(km/s) | B/MHz | cf/s     | Range bias/km
P    | 0.3–1  | 1.2  | 2        | 1.5   | 0.24–0.8 | 0.48–1.6
L    | 1–2    | 1.2  | 2        | 1.5   | 0.8–1.6  | 1.6–3.2
S    | 2–4    | 1.2  | 2        | 1.5   | 1.6–3.2  | 3.2–6.4
C    | 4–8    | 1.2  | 2        | 1.5   | 3.2–6.4  | 6.4–12.8
X    | 8–12   | 1.2  | 2        | 1.5   | 6.4–9.6  | 12.8–19.2
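The entries of Table 1 follow directly from $c_f = f_0 \tau / B$ and bias $= c_f v$; a quick check of the band edges (values as in the table):

```python
def doppler_coupling(f0_hz, tau_s, bw_hz):
    """Doppler coupling coefficient c_f = f0 * tau / B, in seconds."""
    return f0_hz * tau_s / bw_hz

tau, B = 1.2e-3, 1.5e6                 # pulse width 1.2 ms, bandwidth 1.5 MHz
for band, f0 in [("P", 0.3e9), ("L", 1e9), ("S", 2e9), ("C", 4e9), ("X", 8e9)]:
    cf = doppler_coupling(f0, tau, B)
    bias_m = cf * 2000.0               # range bias at v = 2 km/s, in meters
    print(band, round(cf, 2), round(bias_m))
```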
Then the modified measurement equation is given by

$Z_k = \begin{pmatrix} \tilde{r}_k \\ \tilde{a}_k \\ \tilde{e}_k \end{pmatrix} = \begin{pmatrix} \sqrt{x_k^2 + y_k^2 + z_k^2} + c_f (x_k \dot{x}_k + y_k \dot{y}_k + z_k \dot{z}_k)\big/\sqrt{x_k^2 + y_k^2 + z_k^2} \\ \tan^{-1}(y_k / x_k) \\ \tan^{-1}\left(z_k \big/ \sqrt{x_k^2 + y_k^2}\right) \end{pmatrix} + \begin{pmatrix} w_k^r \\ w_k^a \\ w_k^e \end{pmatrix}$  (2)

where $w_k^r$, $w_k^a$, and $w_k^e$ are the measurement errors of range, azimuth angle, and elevation angle, respectively.
3 Proposed Algorithm

As can be seen from (2), the precision of the predicted range depends on the precision of the estimated radial velocity

$(x_k \dot{x}_k + y_k \dot{y}_k + z_k \dot{z}_k)\big/\sqrt{x_k^2 + y_k^2 + z_k^2}$

Better precision of the estimated radial velocity theoretically indicates better precision of the predicted range. On the other hand, we observe that the range decoupled filter is best in terms of radial velocity precision and convergence time, while the UKF is best in terms of angle precision. Hence, we propose an algorithm combining the range decoupled filter with a bias model and the UKF with an unbiased model. Its frame is shown in Fig. 1. The bias range decoupled filter receives the biased range measurement as input and outputs the unbiased filtered range and the filtered range velocity. The latter is then fed into the unbiased UKF filter. The unbiased UKF filter receives the measurements $\tilde{r}$, $\tilde{a}$, $\tilde{e}$ and the estimated range velocity $\hat{\dot{r}}$ as inputs and outputs the estimates $\hat{a}$ and $\hat{e}$. Finally, the estimates $\hat{r}$, $\hat{a}$, and $\hat{e}$ are converted into the position components $\hat{x}$, $\hat{y}$, and $\hat{z}$.
(Fig. 1 blocks: $\tilde r, \tilde a, \tilde e$ → Bias Range Decoupled Filter → $\hat r, \hat{\dot r}$ → Unbiased UKF Filter → $\hat a, \hat e$ → Coordinate Transformation → $\hat x, \hat y, \hat z$)
Fig. 1 Frame of the proposed algorithm
Usually, we are interested in the root-mean-square error (RMSE) of the estimated range, azimuth, elevation, and position in radar systems. The motion equation of the bias range decoupled filter is modeled as

$R_k = F R_{k-1} + V_k^R$

where

$R_k = (r_k, \dot{r}_k, \ddot{r}_k)^T, \quad F = \begin{pmatrix} 1 & T & T^2/2 \\ 0 & 1 & T \\ 0 & 0 & 1 \end{pmatrix}$

and the term $V_k^R$ is a zero-mean Gaussian noise process with covariance $Q_R = q_R Q$,

$Q = \begin{pmatrix} T^5/20 & T^4/8 & T^3/6 \\ T^4/8 & T^3/3 & T^2/2 \\ T^3/6 & T^2/2 & T \end{pmatrix}$

The measurement equation of the bias range decoupled filter is given by

$\tilde{r}_k = H R_k + w_k^R$

where $H = (1 \;\; c_{f,k} \;\; 0)$, $c_{f,k}$ is the Doppler coupling coefficient at time k, and $w_k^R$ is a zero-mean Gaussian noise process with covariance $\sigma_R^2$. Since the motion model and the measurement model are both linear, the filter can be implemented as a Kalman filter. The motion equation of the unbiased UKF filter is modeled as

$X_k = \begin{pmatrix} F & 0 & 0 \\ 0 & F & 0 \\ 0 & 0 & F \end{pmatrix} X_{k-1} + \begin{pmatrix} V_k^x \\ V_k^y \\ V_k^z \end{pmatrix}$  (3)
where $X_k$ is defined in (1) and the terms $V_k^x$, $V_k^y$, $V_k^z$ are all zero-mean Gaussian noise processes with covariances $Q_x = q_x Q$, $Q_y = q_y Q$, and $Q_z = q_z Q$, respectively. The measurement equation of the unbiased UKF filter is given by

$\begin{pmatrix} \hat{r}_k \\ \tilde{a}_k \\ \tilde{e}_k \end{pmatrix} = \begin{pmatrix} \sqrt{x_k^2 + y_k^2 + z_k^2} \\ \tan^{-1}(y_k / x_k) \\ \tan^{-1}\left(z_k \big/ \sqrt{x_k^2 + y_k^2}\right) \end{pmatrix} + \begin{pmatrix} \hat{w}_k^r \\ w_k^a \\ w_k^e \end{pmatrix}$  (4)
where $\hat{r}_k = \tilde{r}_k - c_{f,k} \hat{\dot{r}}_k$, the term $\hat{\dot{r}}_k$ is the radial velocity estimated by the bias range decoupled filter with covariance $P(\hat{\dot{r}}_k)$, and the terms $\hat{w}_k^r$, $w_k^a$, $w_k^e$ are all zero-mean Gaussian noise processes with covariances $\sigma_r^2 + c_{f,k}^2 P(\hat{\dot{r}}_k)$, $\sigma_a^2$, and $\sigma_e^2$, respectively. Finally, the unbiased UKF filter is implemented by the traditional UKF.
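Since both models of the decoupled filter are linear, it is an ordinary Kalman filter with state $(r, \dot r, \ddot r)$, the F and Q above, and $H = (1 \;\; c_{f,k} \;\; 0)$, so the coupling bias $c_f \dot r$ sits inside the measurement model; the hand-off to the UKF is then the single debiasing step $\hat r_k = \tilde r_k - c_{f,k}\hat{\dot r}_k$. A minimal sketch, with an illustrative constant-velocity scenario rather than the paper's trajectory:

```python
import numpy as np

def bias_range_kf(z_seq, T, cf, qR, sigma_r):
    """Kalman filter for the biased range measurement r~ = r + cf*rdot + w,
    state (r, rdot, rddot). Returns the filtered states per step."""
    F = np.array([[1.0, T, T * T / 2], [0.0, 1.0, T], [0.0, 0.0, 1.0]])
    Q = qR * np.array([[T**5 / 20, T**4 / 8, T**3 / 6],
                       [T**4 / 8,  T**3 / 3, T**2 / 2],
                       [T**3 / 6,  T**2 / 2, T]])
    H = np.array([[1.0, cf, 0.0]])        # bias model: z = r + cf*rdot + w
    R = np.array([[sigma_r**2]])
    x = np.array([z_seq[0], 0.0, 0.0])    # crude initialization
    P = np.diag([sigma_r**2, 1e6, 1e2])
    out = []
    for z in z_seq:
        x = F @ x                          # predict
        P = F @ P @ F.T + Q
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # update
        x = x + (K @ (np.array([z]) - H @ x)).ravel()
        P = (np.eye(3) - K @ H) @ P
        out.append(x.copy())
    return np.array(out)

# toy constant-velocity target; cf = 5 s as in Sect. 4
T, cf, v, sigma_r = 0.1, 5.0, 1500.0, 100.0
rng = np.random.default_rng(1)
true_r = 700e3 + v * T * np.arange(300)
z = true_r + cf * v + rng.normal(0, sigma_r, 300)   # biased, noisy ranges
est = bias_range_kf(z, T, cf, qR=1.0, sigma_r=sigma_r)

# debiasing step feeding the UKF: r^ = r~ - cf * rdot^
rdot_hat = est[-1, 1]
r_hat = z[-1] - cf * rdot_hat
print(est[-1, 1], r_hat - true_r[-1])   # velocity near 1500; bias removed
```

The initial range estimate absorbs the full 7.5 km coupling bias, yet the filter converges because the bias is modeled inside H rather than ignored.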
4 Experiments and Discussion

4.1 Experimental Setup

The trajectory of the simulated target is shown in Fig. 2. The minimal and maximal ranges are 615 and 809 km, and the radial velocity varies from −1144 to 1835 m/s. The maximal elevation is 52°. The parameters of the radar are as follows: the central carrier frequency is 6.25 GHz, the pulse width is 1.2 ms, and the bandwidth is 1.5 MHz. Therefore, the Doppler coupling coefficient is always 5 s, and the maximal bias is about 9 km. The measurement errors of range, azimuth angle, and elevation angle are 100 m, 0.2°, and 0.2°, respectively. The sampling period is 0.1 s.
Fig. 2 Trajectory of the simulated target
4.2 Experimental Results

This section compares three different filtering results:

(1) The filtering results that do not consider the bias, i.e., the range error is caused only by the measurement error. For convenience, we refer to these results as the “ideal result.”
(2) The filtering results obtained by the UKF filter with the bias model, referred to as “bias UKF.”
(3) The filtering results obtained by the proposed algorithm, referred to as “proposed.”

Figure 3 presents the comparison of the range RMSE. In terms of convergence time, the ideal result converges in 6 s, the proposed algorithm in 13 s, and the bias UKF in 40 s. The proposed algorithm reduces the convergence time by 67.50% compared with the bias UKF. In terms of convergence precision, the proposed algorithm and the bias UKF both reach 10 m, while the ideal result reaches 26 m. That is to say, the distance-Doppler coupling effect improves the convergence precision by 61%, although it introduces a longer convergence time. This phenomenon shows that the coupling is a double-edged sword.
Fig. 3 Comparison of the range RMSE of three filtering results
The comparisons of the azimuth angle RMSE are shown in Fig. 4. The proposed results and the ideal results coincide. However, the bias UKF begins to diverge at 1.1 s and does not converge again until 41 s. The same phenomenon appears for the elevation angle RMSE, as shown in Fig. 5. The angle divergence is due to the large estimation error of the range velocity. Figure 6 shows the comparison of the position RMSE, which measures the precision of the spatial position, also named GDOP. In terms of the time after which an algorithm performs the same as the ideal results, the proposed algorithm takes 0.6 s, while the bias UKF takes 46 s. The proposed algorithm reduces this time by 98.70%. The long time for the bias UKF is mainly due to the divergence of the azimuth angle and the elevation angle.
Fig. 4 Comparison of the azimuth angle RMSE of three filtering results
Fig. 5 Comparison of the elevation angle RMSE of three filtering results
Fig. 6 Comparison of the position RMSE of three filtering results
5 Conclusion

This paper presents a fast convergent algorithm for hypersonic target tracking with high dynamic biases. The intuition is that a better precision of the range velocity reduces the bias estimation error. The proposed algorithm employs the bias
range decoupled filter to obtain a higher precision of the range velocity and then employs the unbiased UKF filter. Experiments show that, when the sampling period is 0.1 s, the proposed algorithm reduces the convergence time by 67.50% and overcomes the angle divergence of the bias UKF. In terms of the time after which an algorithm performs the same as the ideal results, the proposed algorithm reduces this time by 98.70%. Besides, the distance-Doppler coupling effect improves the convergence precision by 61% compared with the ideal results, again showing that the coupling is a double-edged sword.
References

1. Yuan X (2012) Direction-finding wideband linear FM sources with triangular arrays. IEEE Trans Aerosp Electron Syst 48(3):2416–2425
2. Zhao F, Wang X, Xiao S (2005) A new method of radial velocity estimation for high coupling coefficient. Acta Electron Sin 9(33):1571–1575
3. Wang J, Long T, He P (2003) A target tracking algorithm with LFM waveforms. Mod Radar 2(25):26–29
4. Zhang X, Huang J, Wang G, Li L (2019) Hypersonic target tracking with high dynamic biases. IEEE Trans Aerosp Electron Syst 55(1):506–510
5. Huang Q, Wang J (2014) Target tracking on linear frequency modulation signal and high coupling coefficient. Nanjing Res Inst Electron Technol 36(8):22–25
Optimization of MFCC Algorithm for Embedded Voice System Tianlong Shi and Jiaqi Zhen(B) College of Electronic Engineering, Heilongjiang University, Harbin 150080, China [email protected]
Abstract. Feature extraction is the core step to achieve speech recognition and is the key to the correctness of the speech recognition system. Feature extraction is to obtain effective information for speech recognition and remove redundant information. This article briefly describes the MFCC feature extraction process and optimizes the MFCC algorithm by changing the pre-emphasis parameter and the parameter addition method to adapt to the characteristics of short-word embedded speech recognition systems. Keywords: Feature extraction · Speech recognition · MFCC algorithm · Embedded system
1 Introduction

After preprocessing the input voice signal, a specific algorithm can be used to extract its features. Because speech signals carry a wide variety of information that contributes to speech recognition, choosing which features to extract so that the recognition system is more accurate and more stable has become a key issue [1]. Considering the particularity of the embedded platform, the extracted feature parameters must not only reflect the features of speech well, but also meet the requirements of low computational complexity, easy extraction, and high real-time performance. Therefore, it is necessary to optimize the MFCC feature extraction algorithm.
2 MFCC Feature Extraction Algorithm and Optimization

After passing through the Mel filter, the voice signal is represented on a Mel frequency scale that is closer to the hearing characteristics of the human ear; its relationship with the actual frequency f (Hz) of the voice satisfies

$f_{mel}(f) = 2595 \cdot \log_{10}\left(1 + \frac{f}{700}\right)$  (2.1)
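Eq. (2.1) and its inverse (needed later to place the edge frequencies of the triangular filters) can be written directly:

```python
import math

def hz_to_mel(f):
    # Eq. (2.1): mel = 2595 * log10(1 + f/700)
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    # inverse of Eq. (2.1)
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

print(hz_to_mel(1000.0))             # ~1000 mel: the scale is anchored near 1 kHz
print(mel_to_hz(hz_to_mel(440.0)))   # round-trips to 440 Hz
```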
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_88
MFCC parameter extraction is mainly divided into the following steps:

(1) Fast Fourier transform: First, after preprocessing, the speech signal is decomposed into n frames; then each frame is FFT-transformed to obtain the energy distribution of the audio signal over the spectrum [2]; finally, the spectrum of the voice signal is squared in magnitude to obtain the power spectrum. After the FFT of the nth frame of the speech signal [3], the spectrum x(k) of the frame in the linear domain is obtained, as shown in formula (2.2):

$x(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N-1$  (2.2)
N represents the number of FFT points.

(2) Mel filter bank filtering: The Mel filter bank is composed of multiple band-pass filters; the number of band-pass filters is generally 20–30, and 24 in this article. The design formula of the Mel filter bank is shown in formula (2.3), where f(m − 1) is the lower limit frequency of the mth triangular filter and f(m + 1) is the upper limit frequency:

$H_m(k) = \begin{cases} 0, & k < f(m-1) \text{ or } k > f(m+1) \\ \dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\ \dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) < k \le f(m+1) \end{cases}$  (2.3)

(3) Logarithmic operation:

$S(m) = \ln\left(\sum_{k=0}^{N-1} |x(k)|^2 H_m(k)\right), \quad 0 \le m \le M$  (2.4)

(4) Discrete cosine transform:

$C(n) = \sum_{m=1}^{M} S(m) \cos\left(\frac{\pi n (m - 0.5)}{M}\right), \quad n = 1, 2, \ldots, L$  (2.5)
L represents the order of the MFCC coefficients, and its value is usually between 12 and 16. MFCC feature parameters require many complex calculations [4], and it is difficult for embedded platforms with weak processing capabilities to achieve high real-time performance. Therefore, the MFCC algorithm needs to be optimized to suit hardware platforms, in two directions [5].

(1) Select a specific pre-emphasis parameter: The calculation formula of pre-emphasis is as follows:

$H(x) = 1 - \beta x^{-1}$  (2.6)
The first-order delay factor β usually takes a value between 0.9 and 1. However, because floating-point operations on the embedded platform are slower and more time-consuming than integer operations, the formula can be changed to:

$H(x) = 1 - \frac{31}{32} x^{-1} = 1 - \left(1 - \frac{1}{32}\right) x^{-1}$  (2.7)

After this change, the multiplication can be realized by a shift operation, which increases the operation speed and suits hardware implementation.

(2) Parameter addition method: 12-dimensional MFCC parameters are selected as the characteristic parameters of the speech recognition system, taking the first- to 13th-order coefficients. The 12-dimensional cepstrum parameters of the MFCC algorithm only reflect the static characteristics of the voice signal; the dynamic characteristics of the voice are described by the difference spectrum. The dynamic difference calculation formula is as follows:

$d_t = \begin{cases} C_{t+1} - C_t, & t < K \\ \sum_{k=1}^{K} k (C_{t+k} - C_{t-k}) \Big/ \left(2 \sum_{k=1}^{K} k^2\right), & \text{otherwise} \\ C_t - C_{t-1}, & t \ge Q - K \end{cases}$

Alarm_Sequences and update Start_Time; Go on if t(aki)-t(ak(i-1)) < ST
Sequential Pattern Mining-Based Alarm Correlation Analysis . . .
    else put Cooccur_Alarms->Sub_Sequence and Sub_Sequence->Alarm_Sequences
         and update Start_Time;
Put aki->Cooccur_Alarms if t(aki)-t(ak(i-1)) < RF
    else put aki->Sub_Sequence;
Put Alarm_Sequences->Total_Alarm_Sequence_Sets;
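Only the tail of Algorithm 1 survives above, but its logic, grouping alarms whose inter-arrival gap is below the co-occurrence threshold RF into one itemset and cutting the sequence when the gap exceeds the window threshold ST, can be sketched as follows (the names RF and ST follow the pseudocode; the exact Start_Time bookkeeping of the original may differ):

```python
def build_sequences(alarms, RF, ST):
    """alarms: list of (timestamp, alarm_id) sorted by time.
    Returns a list of sequences; each sequence is a list of co-occurring
    itemsets (gap < RF inside an itemset, gap < ST inside a sequence)."""
    sequences, seq, itemset = [], [], []
    prev_t = None
    for t, a in alarms:
        if prev_t is None or t - prev_t < RF:
            itemset.append(a)                  # co-occurring alarms
        elif t - prev_t < ST:
            seq.append(itemset)                # same sequence, new itemset
            itemset = [a]
        else:
            seq.append(itemset)                # gap too large: close the sequence
            sequences.append(seq)
            seq, itemset = [], [a]
        prev_t = t
    if itemset:
        seq.append(itemset)
    if seq:
        sequences.append(seq)
    return sequences

alarms = [(0, "A"), (1, "B"), (10, "C"), (100, "D")]
print(build_sequences(alarms, RF=5, ST=30))
# -> [[['A', 'B'], ['C']], [['D']]]
```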
2.3 Sequence Pattern Mining
In sequential pattern mining, alarm correlation relationships mainly include serial, parallel, and mixed relationships. Consider two alarms $A_i$ and $B_j$, with $A_i \in E$, $B_j \in E$, where E is the set of frequent sequences obtained by sequential pattern mining, and i and j refer to the times of occurrence.

• Serial relationship: i and j satisfy i < j (or i > j), so that $A_i$ triggers $B_j$. $A_i$ is defined as the root-cause alarm, and $B_j$ should be filtered.
• Parallel relationship: i and j satisfy i = j. $A_i$ and $B_j$ often occur at the same time, and both are considered root-cause alarms generated by faults.
• Mixed relationship: for $A_i$, $B_j$, and $C_k$ with $A_i \in E$, $B_j \in E$, $C_k \in E$ and i < j = k (or i > j = k), the relationship contains both the serial and the parallel relationship: $A_i$ triggers the occurrence of $B_j$ and $C_k$, or $B_j$ and $C_k$ trigger the occurrence of $A_i$.

PrefixSpan is one of the mainstream sequential pattern mining algorithms. Compared with other sequence mining algorithms such as GSP and FreeSpan, it has major advantages: (1) it does not need to generate candidate sequences; (2) the projection database shrinks very quickly; and (3) the memory consumption is relatively stable. Therefore, PrefixSpan is used for sequential pattern mining in this paper. In PrefixSpan, for an alarm sequence S = [[A, B, C], [A, C], [D], [C, F]], [A, B] is a prefix of S; then [[_, C], [A, C], [D], [C, F]] is the suffix corresponding to [A, B], where “_” indicates the position of [A, B] within the same time window. In addition, [[A, C], [D]] is also a prefix of S, and [C, F] is the suffix corresponding to [[A, C], [D]]. The alarm sequences built on the dynamic sliding time window are used as the input of PrefixSpan, whose purpose is to mine frequent subsequences in the alarm sequences. Algorithm 2 gives the specific calculation process of PrefixSpan.

Algorithm 2.
PrefixSpan based on Dynamic Time Sliding Window
Input:  Total_Alarm_Sequence_Sets
Input:  Minimum Support MS
Output: Alarm_Frequent_Sequences

Find all prefixes of length 1 and the corresponding projection database PD1;
For freq_i in FS
    Find the projection database PDi corresponding to freq_i
Y. Chen et al.
    For item_j in PDi
        Count the frequency of prefixes of length i as Fi;
        Put the frequent i-item sequences (freq_i)->FS if Fi >= MS
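A compact PrefixSpan over sequences of single alarms (the itemset case of Algorithm 2 adds bookkeeping but no new ideas) can be written recursively; support counting and database projection follow the pseudocode above:

```python
def prefixspan(db, minsup):
    """db: list of sequences (lists of items). Returns {pattern: support}
    for every frequent subsequence. A sketch for single-item elements."""
    results = {}

    def mine(prefix, projected):
        # count the items occurring in the projected (suffix) database
        counts = {}
        for seq in projected:
            for item in set(seq):
                counts[item] = counts.get(item, 0) + 1
        for item, sup in counts.items():
            if sup < minsup:
                continue
            pattern = prefix + (item,)
            results[pattern] = sup
            # project: keep the suffix after the first occurrence of item
            nxt = [seq[seq.index(item) + 1:] for seq in projected if item in seq]
            mine(pattern, nxt)

    mine((), db)
    return results

db = [["A", "B", "C"], ["A", "C"], ["B", "A", "C"]]
print(sorted(prefixspan(db, minsup=2).items()))
```

Each recursion level extends the prefix by one frequent item, and the projected database shrinks monotonically, which is the property the advantage list above refers to.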
3 Experiments and Evaluations
In this section, experiments use real alarm data covering the three months from October to December 2018. The experimental environment is based on Python 3.6.4 and a host configured with a 2.4 GHz 8-core/16-thread CPU, 24 GB of memory, and about 2 TB of HDD storage.
Fig. 3 Dynamic time sliding window based sequence pattern mining (K = 5)
The results of the alarm correlation analysis approach are shown in Fig. 3. The alarm named [‘33-Jiujiang Ruichang Paradise-Chengdong-155-Gaofeng’] occurred 1770 times, the alarm named [‘33-Jiujiang Ruichang Paradise-City East-153-Garden’] occurred 1778 times, and the alarm named [‘33-Jiujiang Ruichangyuan Railway Station-Ruichang Henggang Post Ticketing’] occurred 1794 times. The three alarms co-occurred 1569 times, about 88.6% of the occurrences of a single alarm, which indicates that their common occurrence was not accidental; it is reasonable to assume a strong parallel relationship between them. In the same way, we can see that [‘113-Fuquan Township C Network Lijiang Bureau’] occurred 1472 times, of which more than 700 occurrences triggered other alarms, a proportion of more than 50%. It can be considered that after [‘113-Fuquan Township C network Lijiang Bureau’] occurs, [‘33-Jiujiang Ruichang Paradise-Chengdong-155-Gaofeng’], [‘33-Jiujiang Ruichang Paradise-Chengdong-153-Garden’], and other alarms are triggered: they are in a serial relationship. Combining the parallel and serial relationships above, a mixed relationship is obtained in the last red box, and [‘113-Fuquan Township C network Lijiang Bureau’] can finally be located as the root-cause alarm. After filtering, 131,942 root-cause alarms are obtained from 6,477,158 alarms, and 113,468 alarm-fault association records are obtained, as shown in Table 1. The correlation ratio (CR) shown in Eq. (7) is introduced to measure the data ratio before and after alarm-fault correlation. Experimental results show that the alarm correlation analysis approach proposed in this paper can simplify redundant alarms, accurately locate the
root-cause alarm, and effectively correlate most fault data to the corresponding alarms.

$CR = \dfrac{\text{amount of alarm-fault association data}}{\text{amount of faults before association}}$  (7)
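Applied to the counts reported in Table 1, Eq. (7) gives the 73.8%:

```python
def correlation_ratio(n_association_records, n_faults_before):
    # Eq. (7): CR = alarm-fault association data / faults before association
    return n_association_records / n_faults_before

print(round(100 * correlation_ratio(113_468, 153_679), 1))  # -> 73.8
```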
Table 1 Filtering and association of alarms

Alarms before filtering | Root-cause alarms after filtering | Faults before association | Alarm-fault association data | CR
6,477,158               | 131,942                           | 153,679                   | 113,468                      | 73.8%

4 Conclusion
This paper proposed an alarm correlation analysis approach for cluttered alarm data, addressing the difficulty of locating the root cause of alarms and of associating alarm data with fault data in time and space. The proposed approach can effectively filter redundant alarms, accurately locate the root-cause alarm, and improve the proportion of alarm-fault association compared with traditional methods.
References

1. ITU-T Recommendation X.720 (1992) Information technology - open systems interconnection - systems management: log-control function, p 1
2. Cheng MX, Wu WB (2016) Data analytics for fault localization in complex networks. IEEE Internet Things J 3(5):1
3. Dorgo G, Abonyi J (2018) Sequence mining based alarm suppression. IEEE Access 6(1):15365–15379
4. Karoly R, Abonyi J (2017) Multi-temporal sequential pattern mining based improvement of alarm management systems. In: IEEE international conference on systems, man, and cybernetics, pp 003870–003875
5. Zhang G, Yang Q, Cheng X, Jiang K, Wang S, Tan W (2018) Application of sequential pattern mining in alarm forecasting of communication networks (in Chinese). J Comput Sci 45(S2):535–538+563
Multi-Model Ensemble-Based Fault Prediction of Telecommunication Networks

Ying Chen1,2(B), Tiankui Zhang2, Rong Huang3, Yutao Zhu1, and Junhua Hong1

1 Yingtan Internet of Things Research Center, Yingtan, China [email protected]
2 Beijing University of Posts and Telecommunications, Beijing, China
3 Network Technology Research Institute of China Unicom, Beijing, China
Abstract. Fault prediction is a critical method for ensuring the stability and reliability of communication networks. According to the characteristics of the data (attribute relevance, imbalance, and text-type features), this paper proposes an ensemble model for fault prediction that combines traditional algorithms and deep neural networks, with functions for feature combination, text learning, local feature extraction, and sample-imbalance processing. The experiments validate that the proposed method can effectively improve fault recognition by 4–9%.
1 Introduction
The basic function of network management is to ensure the necessary and sufficient network performance to support the provision of various telecom services to users. With the increasing demand and complexity of application scenarios, traditional network management needs a lot of manual maintenance and decentralized management, which is no longer suitable for the development of current telecommunications. Comprehensive and intelligent network management has become an urgent need [1]. A fault refers to damage that makes the network unable to provide services normally; it is the root cause of an alarm. When a device in the network is faulty, an alarm is triggered first and uploaded to the network management system immediately. After the administrator confirms the fault on the spot, a fault form is generated; since the fault form depends on human declaration, there is a big delay between the fault reporting time and the alarm occurrence time. Therefore, fault prediction has practical application value for network management. Fault prediction is to use the prior knowledge of alarms

Jiangxi Province key research and development program (No. 2018ABC28008).

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_91
and faults in associated training data to build prediction models that predict alarm classes [2]. Tjhai et al. [2] used neural networks combined with unsupervised clustering algorithms to determine the cause of faults. Because the alarm-fault associated data contains a large number of categorical and textual description features, applying natural language processing (NLP) to analyze alarms and predict faults is a promising direction: it can fully learn semantic and logical expressions and obtain richer information. Because of the large number of categorical features, existing single models find it difficult to learn feature interactions. Referring to the NLP field and to recommendation systems, which also focus on categorical feature processing, Xuechao et al. [3] proposed a new deep learning model, L-MFCNN, that improved the CNN model by combining it with an LSTM model, and proved that the model retains high performance even without complicated hand-crafted feature rules. The wide and deep model proposed by Google in 2016 [4] and the DeepFM model proposed by Huawei's Noah's Ark Research Lab in 2017 [5] combined low-order nonlinear models with high-order neural networks, and both achieved significant improvements. To summarize the above work, most existing studies using a single model cannot learn feature expressions sufficiently for alarm-fault associated data with a large number of categorical features, and most models assume feature independence, without considering the correlation between features.
Therefore, this paper proposes a fault prediction model (TADNN), in which the factorization machine (FM) algorithm is introduced to combine features and learn the relationships between them, a bidirectional long short-term memory (BI-LSTM) network extracts contextual relationship information, a convolutional neural network (CNN) learns key semantic expressions from long text descriptions, and extreme gradient boosting (XGBoost) handles the imbalance of positive and negative samples.
2
Our Approach
This section first conducts a statistical analysis to understand the alarm-fault associated data, and then designs and implements the TADNN Ensemble model for fault prediction.

2.1
Alarm-Fault Associated Data Statistical Analysis
A statistical analysis of the alarm-fault associated data is used to select features related to failure prediction for training the fault prediction model, and to characterize the data so that the model structure can be designed more specifically. The alarm-fault associated data is shown in Table 1. ALARM ID uniquely identifies an alarm. ALARM SOURCE is the name of the device where the alarm occurred. PERCEIVED SEVERITY represents the impact of the generated alarm on the network. CIR NAME and CIR TYPE are the name and domain type of the circuit associated with the alarm. HOUR is the hour of the
Y. Chen et al.

Table 1 Alarm data example

  ALARM ID            122***577
  ALARM SOURCE        Comprehensive computer room in Taiyuan Xiaodian district
  ALARM TYPE          Service link down
  HOUR                5
  CIR NAME            Taiyuan01
  CIR TYPE            VPN
  PROVINCE            Liaoning
  CITY                Shenyang
  PROBABLE CAUSE      Service link down
  ALARM DESCRIPT      (1) The port is enabled but the network cable or fiber of the port is not connected well; (2) network cable or fiber failure; (3) the peer end reports some faults
  IS FAULT            Y
alarm occurrence time. ALARM DESCRIPT is a descriptive long text that contains much information useful for fault analysis and may clarify the cause of the fault. This paper analyzes the alarm-fault associated data from four aspects: the consistency of the data distribution, the differences among feature distributions, the correlation between features, and the label distribution. In conclusion, the alarm-fault associated data has the following characteristics: (1) alarm attributes are categorical or descriptive; (2) there are correlations between features; (3) some features are long text, such as the ALARM DESCRIPT attribute in Table 1; and (4) the data distribution is extremely unbalanced.

2.2
TADNN Ensemble Model Design
It can be seen from the above conclusions that the associated data has multiple characteristics, which a single model cannot consider at the same time, so its feature expression learning is not sufficient. Therefore, this paper proposes the TADNN Ensemble model, which takes the different characteristics of the data into account by designing different substructures, so that each can learn the expressions of a different dimension of the data in a targeted manner. Figure 1 shows the overall structure of the TADNN Ensemble model. After the training samples pass through the embedding layer, a low-dimensional dense vector is obtained. The second layer of the network is a parallel structure of FM, BI-LSTM, and CNN, whose outputs are concatenated and fed into the same fully connected network. Finally, the output of the neural network
is input to XGBoost for training, which effectively prevents over-fitting and resists the influence of sample imbalance.
Fig. 1 Structure of TADNN Ensemble model
In the TADNN Ensemble model, the FM part effectively learns the correlation of features, bringing richer information to the model; it also provides low-level feature information for the entire model, enhancing its memory ability. The advantage of the BI-LSTM and CNN parts is that they effectively learn higher-order feature information to obtain deeper and more abstract feature expressions, enhancing the generalization ability of the model. Figure 2 shows the correspondence between the characteristics of the alarm-fault associated data and the substructures in the design of the fault prediction model.

Embedding Layer As can be seen from Table 1, the features in the alarm-fault associated data are all categorical or textual, which the network cannot process directly; word segments must be converted into vectors. In addition, each category takes many values, so the categorical data forms a very high-dimensional sparse matrix after encoding. The embedding layer is therefore introduced to map vectors from the high-dimensional semantic space to a lower-dimensional space, yielding dense, low-dimensional vector expressions.

FM The calculation of a DNN is based on the assumption of feature independence; that is, the inputs of neurons are independent of each other. The previous statistics show that the features of the alarm data are correlated, but a DNN cannot learn the correlation between features. Therefore, FM is introduced into
Fig. 2 Relationship between alarm-fault associated data and proposed model
the model for feature combination. The FM algorithm [6] combines features pairwise (e.g., crossing feature A with feature B). Compared with traditional manual feature combination, FM automatically selects the pairwise combinations of all features. As can be seen from Eq. (1), FM rewrites the weight coefficients into the form of Eq. (2) based on the principle of matrix decomposition and learns the latent coefficients of each feature separately. For example, even if the combination of feature A1 and feature B2 never appears in the training set, their relationship can still be obtained through $v_{A1}$ and $v_{B2}$; that is, FM can predict feature combinations in the prediction set that have not appeared in the training set. In addition, FM only needs to learn N weight coefficients, so for high-dimensional feature data the number of learned parameters and the storage complexity are greatly reduced.

$$y_{\mathrm{FM}} = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j \qquad (1)$$

$$w_{ij} = \langle v_i, v_j \rangle = \sum_{l=1}^{k} v_{i,l}\, v_{j,l} \qquad (2)$$
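As a concrete illustration of Eq. (1), the sketch below scores one sample with a factorization machine in NumPy. All names and the toy dimensions are illustrative, not from the paper; it uses the well-known O(nk) identity for the pairwise term rather than the naive double sum.

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Factorization-machine score for one sample (Eq. 1).

    x  : (n,) feature vector
    w0 : scalar bias
    w  : (n,) linear weights
    V  : (n, k) latent factor matrix, so w_ij = <v_i, v_j> (Eq. 2)

    Uses the O(n*k) identity
        sum_{i<j} <v_i, v_j> x_i x_j
          = 0.5 * sum_l [ (sum_i V[i,l] x_i)^2 - sum_i V[i,l]^2 x_i^2 ].
    """
    linear = w0 + w @ x
    s = V.T @ x                                    # (k,) per-factor sums
    pairwise = 0.5 * np.sum(s ** 2 - (V ** 2).T @ (x ** 2))
    return linear + pairwise
```

The vectorized pairwise term is exactly equal to summing ⟨v_i, v_j⟩ x_i x_j over all i < j, which is the property that lets FM learn weights for feature pairs never seen together in training.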
BI-LSTM After a root-cause alarm occurs, a series of other related alarms follow, indicating that the alarm data has a certain continuity in the time dimension. Introducing LSTM into the model allows the contextual relationship between alarms to be taken into account. In practice, the state at a given moment is related not only to the previous state but also to the later state; that is, information in the current state can be derived from the backward state. Therefore, a bidirectional LSTM (BI-LSTM) can better capture the semantic dependency of the context [7]. The structure of BI-LSTM is shown in Fig. 3. The output of the hidden layer is $[h_t^f, h_t^b]$, composed of the forward LSTM hidden state $h_t^f$ and the backward LSTM hidden state $h_t^b$,
containing both forward and backward information. This paper introduces BI-LSTM into the TADNN Ensemble model to fully learn the contextual relationships among features in the alarm data.
Fig. 3 Structure of BI-LSTM
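The bidirectional pass described above can be sketched as follows. This is a minimal NumPy illustration with a plain tanh cell standing in for the LSTM cell (an assumption made to keep the sketch short); the point is the two directions and the per-timestep concatenation [h_t^f, h_t^b].

```python
import numpy as np

def bidirectional_pass(xs, Wf, Wb, Uf, Ub):
    """Run a simple recurrent cell forward and backward over `xs` and
    concatenate the two hidden states at every time step, mirroring the
    BI-LSTM output [h_t^f, h_t^b].

    xs      : (T, d) input sequence
    Wf, Wb  : (h, d) input weights for the forward/backward cells
    Uf, Ub  : (h, h) recurrent weights for the forward/backward cells
    """
    T, h = len(xs), Wf.shape[0]
    hf, hb = np.zeros((T, h)), np.zeros((T, h))
    state = np.zeros(h)
    for t in range(T):                       # forward direction
        state = np.tanh(Wf @ xs[t] + Uf @ state)
        hf[t] = state
    state = np.zeros(h)
    for t in reversed(range(T)):             # backward direction
        state = np.tanh(Wb @ xs[t] + Ub @ state)
        hb[t] = state
    return np.concatenate([hf, hb], axis=1)  # (T, 2h)
```

Because the backward half processes the sequence from the end, the output at an early time step already contains information about later alarms, which is exactly the contextual dependency the paper wants to capture.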
CNN As can be seen from Table 1, the ALARM DESCRIPT column is a long text description. It contains key information describing the behavior and status of the alarm, such as "signal interruption," "fiber failure," and "port is not connected well." Studying these long texts and extracting key semantic expressions from them plays a very important role in fault prediction, so CNN is introduced into the model to learn the key information of the long texts in the ALARM DESCRIPT feature. The convolutional layer is the most important layer of a CNN: its convolution kernels extract local characteristics of the data, with each type of kernel focusing on one characteristic. Therefore, this paper uses convolution kernels of three different sizes (2 × 2, 3 × 3, and 4 × 4) to extract key semantic expressions from the long text descriptions.

Sample Down-Sampling and XGBoost Algorithm The alarm-fault associated data is imbalanced, which biases training toward the negative samples that dominate the data and yields poor generalization in the model's predictions. Therefore, down-sampling and XGBoost are introduced to correct the sample imbalance. According to [8], when computing resources are sufficient and
the number of minority samples is sufficient, down-sampling can be considered first. Considering the distribution of the alarm-fault associated data, this paper first randomly down-samples the data and then further reduces the impact of sample imbalance through XGBoost.
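The random down-sampling step can be sketched as below. The `ratio` knob and the function name are assumptions for illustration (the paper does not state the sampling ratio): negatives are randomly kept until they roughly match the positives.

```python
import numpy as np

def downsample_negatives(X, y, ratio=1.0, seed=0):
    """Randomly down-sample the majority (negative, y == 0) class so that
    #negatives ~= ratio * #positives, as a pre-step before training.
    `ratio` is an assumed knob, not a value from the paper.
    """
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    keep_n = min(len(neg), int(ratio * len(pos)))
    keep = rng.choice(neg, size=keep_n, replace=False)
    idx = np.sort(np.concatenate([pos, keep]))   # keep original ordering
    return X[idx], y[idx]
```

The residual imbalance after this step is then left to XGBoost, which is comparatively robust to skewed classes.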
3
Experiments and Evaluations
In this section, experiments use three months of real alarm data from October to December 2018. The experimental environment is based on Python 3.6.4 and a host with a 2.4 GHz 8-core/16-thread CPU, 24 GB of memory, and about 2 TB of HDD storage.

Table 2 Parameters of model

  Model             Parameters
  Embedding layer   Output dim = 100
  BI-LSTM           Layers = 3, activation = 'relu', dropout = 0.3, number of neurons per layer = {64, 32}
  CNN               Window sizes = {2, 3, 4, 5}, filter num = 8, activation = 'relu', strides = 1, max-pooling size = maxlen − window size + 1
  XGBoost           Learning rate = 0.01, n estimators = 1000, max depth = 7, gamma = 0.4
Based on the statistically derived characteristics of the alarm-fault associated data, this paper designs the corresponding substructures of the TADNN Ensemble model. The TADNN Ensemble model is built on the open deep learning toolkit Keras. The parameters of the model are shown in Table 2.

Table 3 Evaluation results of different models

  Model                   Recall rate   AUC
  LSTM                    0.848         0.839
  CNN                     0.918         0.902
  TADNN without XGBoost   0.921         0.919
  TADNN with XGBoost      0.958         0.942
For imbalanced data, the recall, AUC, and KS curve indicators are generally used to measure model performance. As can be seen from Table 3, compared with the TADNN Ensemble model without XGBoost, the TADNN Ensemble model with XGBoost increases the recall rate by 4.1% and the AUC value by 2.5%. Compared with the single-model CNN and LSTM, the recall rate increases by 4.4% and 13%,
respectively, and the AUC value increases by 4.4% and 12.2%. This indicates that the TADNN Ensemble model can find more faults than the single models and distinguishes positive from negative samples better than they do. In addition, feeding the output of the deep network into XGBoost effectively resists the influence of sample imbalance and improves the accuracy of the model's predictions.
Fig. 4 Model evaluation indicators—KS curve and KS value
Figure 4 shows that CNN performs better than LSTM, and that the TADNN Ensemble model with XGBoost performs better than the TADNN Ensemble model without XGBoost. Compared with CNN, the predictive ability of the TADNN Ensemble model with XGBoost is improved by 9%.
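The KS value reported in Fig. 4 can be computed from model scores as sketched below; this is the standard definition (the maximum gap between the cumulative positive and negative score distributions, i.e. max |TPR − FPR|), not code from the paper.

```python
import numpy as np

def ks_statistic(y_true, scores):
    """Kolmogorov-Smirnov statistic of a scorer: the maximum gap between
    the cumulative score distributions of positive and negative samples,
    equivalently max over thresholds of |TPR - FPR|."""
    order = np.argsort(scores)[::-1]          # descending by predicted score
    y = np.asarray(y_true)[order]
    tpr = np.cumsum(y == 1) / max(int((y == 1).sum()), 1)
    fpr = np.cumsum(y == 0) / max(int((y == 0).sum()), 1)
    return float(np.max(np.abs(tpr - fpr)))
```

A KS of 1.0 means the scores separate the classes perfectly; a KS near 0 means the score distributions of the two classes coincide.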
4
Conclusion
This paper proposes the TADNN Ensemble model to predict faults based on alarm-fault associated data. The TADNN Ensemble model integrates the traditional FM and XGBoost algorithms with the deep BI-LSTM and CNN networks, fully considering textual information and feature correlations. It enriches the expressive information of features, effectively resists the influence of data imbalance and noise, and improves the learning ability of the model. The proposed TADNN
Ensemble model can be applied to the actual network management environment to predict faults, which improves the efficiency of fault handling.
References

1. Dang J (2015) Analysis of the current situation and development strategy of telecommunication network management (in Chinese). J Inf Commun 000(011):265–266
2. Tjhai GC, Furnell SM, Papadaki M et al (2010) A preliminary two-stage alarm correlation and filtering system using SOM neural network and k-means algorithm. J Comput Secur 29(6):712–723
3. Xuechao S, Yatong Z, Yue C (2019) Face gender recognition based on multi-layer feature fusion adjustable supervisory function convolutional neural network (in Chinese). J Comput Appl 36(03):940–944
4. Cheng H, Koc L, Harmsen J et al (2016) Wide & deep learning for recommender systems. In: Conference on recommender systems, pp 7–10
5. Guo H, Tang R, Ye Y et al (2017) DeepFM: a factorization-machine based neural network for CTR prediction. arXiv preprint
6. Rendle S (2011) Factorization machines. In: IEEE international conference on data mining
7. Zhang S, Zheng D, Hu X et al (2015) Bidirectional long short-term memory networks for relation classification. In: Pacific Asia conference on language, information and computation, pp 73–78
8. Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining
Bone Marrow Cell Counting Method Based on Fourier Ptychographic Microscopy and Convolutional Neural Network Xin Wang1,2 , Tingfa Xu1,2(B) , Jizhou Zhang1,2 , Shushan Wang1,2 , Yizhou Zhang1,2 , Yiwen Chen1,2 , and Jinhua Zhang1,2 1 Image Engineering and Video Technology Lab, School of Optics and Photonics, Beijing
Institute of Technology, Beijing 100081, China [email protected] 2 Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120, China
Abstract. In bone marrow examination, the number of bone marrow cells is an essential parameter for judging the degree of myeloproliferation. In this paper, we propose a new bone marrow cell counting method based on Fourier ptychographic microscopy and a convolutional neural network. We first use Fourier ptychographic microscopy to obtain the intensity and phase images of bone marrow cells. Then, we combine each intensity image with its corresponding phase image to obtain a dual-channel image. We use a convolutional neural network to extract the characteristics of bone marrow cells in the dual-channel image and generate a density map; the number of bone marrow cells is obtained by integrating the density map. The experimental results show that both the mean absolute error (MAE = 0.66) and mean square error (MSE = 0.67) of our method are lower than those of existing methods.

Keywords: Bone marrow cells · Counting · Fourier ptychographic microscopy · Convolutional neural network
1 Introduction

Bone marrow examination is a common medical examination, which is very important for the preliminary diagnosis of hematopoietic diseases such as leukemia. Bone marrow cell counting is an essential item in bone marrow examination: the degree of myeloproliferation is usually judged from the number of nucleated bone marrow cells, so counting bone marrow cells is a significant task. However, bone marrow cells have different shapes and densities at different stages of maturation, so counting them remains a challenging problem. The traditional counting method is manual inspection of stained tissue biopsies, which is laborious, time-consuming, and error-prone. Although flow cytometry techniques [1, 2] produce more precise results, they drastically
increase the cost of diagnosis and treatment monitoring. Therefore, it is necessary to develop a fast and accurate method for automatically counting bone marrow cells. In the past few years, several studies have focused on bone marrow cell counting methods. Zheng et al. [3] demonstrated an automatic counting method based on statistical feature extraction according to the specific statistical characteristics of different kinds of cells. Ai et al. [4] proposed an algorithm for bone marrow cell identification and classification based on gray level and color space. Furthermore, Liu et al. [5] proposed a new counting method that includes three modules: localization, segmentation, and classification. In this paper, we propose a high-accuracy, high-speed method based on a dual-channel convolutional neural network for bone marrow cell counting. Taking advantage of the large field of view and high resolution of Fourier ptychographic microscopy, the bone marrow samples are imaged first, and the intensity and phase images of the cells are reconstructed by phase recovery and synthetic aperture techniques. Then, we combine each intensity image with its corresponding phase image to obtain a dual-channel image and use a convolutional neural network to extract the characteristics of bone marrow cells, generating a density map. The number of bone marrow cells is obtained by integrating the density map.
2 Related Work

2.1 Intensity and Phase Imaging Using FPM

To detect bone marrow cells, the first step is acquiring bone marrow cell images. To obtain a clear image of cell morphology, a conventional microscope with a high-magnification objective is generally used. Because the field of view of a high-magnification objective is small, mechanical scanning of the sample is needed to detect and count bone marrow cells. However, mechanical scanning is unfavorable: it is subjective and error-prone, which harms the accuracy of bone marrow cell detection, and it requires expensive components. In addition, a traditional microscope can only capture intensity images, without phase information. These problems indicate that there are many disadvantages in acquiring bone marrow cell images with a conventional microscope. Here, Fourier ptychographic microscopy (FPM) [6] is introduced as a solution. FPM is a computational imaging technique developed in recent years for high-resolution, wide-field, and multimodal microscopy. It uses a low-magnification objective to obtain a high-resolution sample image without any mechanical scanning components. In FPM, a programmable light-emitting diode (LED) array providing plane waves sequentially illuminates the sample from different angles, and the corresponding low-resolution intensity images are captured. Then, using the phase retrieval technique, the captured intensity images are iteratively synthesized in Fourier space, which expands the passband of the objective and recovers the phase information lost in the acquisition process. Compared with traditional microscopic imaging systems, FPM is able to
obtain images with a wide field and high resolution at the same time, which is important for bone marrow cell counting. Moreover, the FPM reconstruction process recovers both intensity and phase information. The phase image gives additional sample information [7], which helps the subsequent detection and counting of bone marrow cells. The principle of FPM is described in detail in [6] and is not revisited here.

2.2 Cell Counting Using Density Estimation

At present, many studies accurately estimate the number of objects in an image within a supervised learning framework. These algorithms can be roughly divided into two categories: counting by detection and counting by regression. Detection-based methods rely on object detectors to determine the location and number of objects; they are not only time-consuming but also inaccurate under occlusion. Regression-based methods are more concerned with the total number of objects than with their specific positions in the image. They usually estimate the count by extracting global image features and regressing them onto the object count. One disadvantage of this approach is that it completely discards the spatial position information of objects, learning only from a one-dimensional statistic (the total count); another is that it requires a large number of training samples. In order to effectively exploit the spatial distribution of objects in the image, Lempitsky et al. [8] first proposed an algorithm framework based on density estimation, which estimates the object density in the image through a linear model of the image features.
First, the density image is obtained by learning, and then the number of objects is obtained by summing over the whole density image. For cell counting, the density map can express the variation of cell counts across different regions. Moreover, for microscopic cell images with high density and strongly overlapping cells, density estimation avoids having to detect and segment individual cells. Later, Xia et al. [9] proposed an algorithm based on minimizing squared error to infer the density image. Recently, with the development of deep learning, the convolutional neural network (CNN) has received extensive attention in object detection and counting. Compared with traditional methods without CNNs, it brings significant improvements in accuracy and robustness because it can extract abstract features of the object, so it is widely used in object detection and counting. Zhang et al. [10] proposed a multi-column convolutional neural network (MCNN) containing three columns of convolutional networks with different filter sizes; an image input to MCNN yields a crowd density map whose integration gives the total crowd count. Xue et al. [11] proposed a supervised learning framework with a CNN that takes the global cell count as the annotation to supervise training. This method can not only count the total number of cells in an image but also provide the density map, a valuable tool for the diagnosis and treatment of clinical diseases. Here, we propose a CNN-based framework with density estimation for bone marrow cell counting, which is discussed in detail in the next section.
3 Experiments and Results

To detect bone marrow cells, we first obtain high-resolution intensity and phase images using FPM. Then, we combine each intensity image with its corresponding phase image to obtain a dual-channel image and use a CNN to count the number of bone marrow cells. Figure 1 shows the overall framework of our system.
Fig. 1 Overall framework of our system for bone marrow cell detection
3.1 Making the Dataset

Different from a conventional optical microscope, the FPM imaging system uses a programmable LED array instead of a common light source. In this paper, we use an array of 13 × 13 LEDs with a spacing of 8 mm between adjacent LED elements. An Arduino board connected to the computer is used as the controller of the LED array. During image capture, the LED array is controlled directly from the computer with MATLAB. The 169 LEDs are lit one by one, so 169 low-resolution images are obtained. In our experimental system, these 169 low-resolution images are reconstructed by FPM into a high-resolution image. Then, we combine the intensity and phase images correspondingly to obtain dual-channel images. After obtaining the dual-channel images, 'Training Image Labeler', an image annotation tool in MATLAB, is used to label them, yielding annotations that contain the bounding box information of the bone marrow cells. In this way, a dataset containing the images and the corresponding label files is constructed.

3.2 CNN Framework

To count the number of bone marrow cells, we first convert each image with labeled cells into a ground truth density map. For the original cell image, a Gaussian kernel density function is centered at each cell, so the number of kernels equals the number of cells in the image. We use a CNN to extract the characteristics of bone marrow cells and generate predicted density maps. As shown in Fig. 2, our network includes eight convolutional layers and two pooling layers. We obtain the cell count by integrating the predicted density maps.
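The ground-truth density map construction described above can be sketched in NumPy. This is an illustrative sketch, not the paper's code; the kernel width `sigma` is an assumed parameter. Each annotated cell center contributes one Gaussian kernel normalized to unit mass, so summing the map recovers the cell count.

```python
import numpy as np

def density_map(shape, centers, sigma=2.0):
    """Build a ground-truth density map: one normalized Gaussian kernel
    per annotated cell center, so the map sums to the cell count.

    shape   : (H, W) of the image
    centers : iterable of (row, col) cell-center coordinates
    sigma   : Gaussian kernel width (assumed; not given in the paper)
    """
    H, W = shape
    yy, xx = np.mgrid[0:H, 0:W]
    dmap = np.zeros(shape, dtype=float)
    for cy, cx in centers:
        g = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * sigma ** 2))
        dmap += g / g.sum()      # normalize so each cell contributes exactly 1
    return dmap
```

The CNN is then trained to regress this map; integrating (summing) the predicted map yields the predicted count.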
Fig. 2 Framework of the proposed convolutional neural network
3.3 Experimental Results and Comparison

Figure 3 shows the ground truth and predicted density maps of the test images. Our method successfully generates predicted density maps even for cell images with high density and strongly overlapping cells. The experimental results show that our method can handle images with both relatively sparse and extremely dense cells.
Fig. 3 Ground truth density map and predicted density map of the test images
In all the experiments, the mean absolute error (MAE) and the mean square error (MSE) are used as metrics to evaluate the different methods quantitatively. MAE indicates the accuracy of the estimates, and MSE indicates their robustness. MAE and MSE are defined as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N} |t_i - p_i| \qquad (1)$$

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} (t_i - p_i)^2 \qquad (2)$$
where $N$ denotes the number of test images, $t_i$ represents the true number of bone marrow cells in the $i$th test image, and $p_i$ the predicted number of bone marrow cells in the $i$th test image.
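Eqs. (1) and (2) translate directly into NumPy, as in the small sketch below (the function name is illustrative).

```python
import numpy as np

def mae_mse(true_counts, pred_counts):
    """MAE (Eq. 1) and MSE (Eq. 2) over N test images."""
    t = np.asarray(true_counts, dtype=float)
    p = np.asarray(pred_counts, dtype=float)
    return float(np.mean(np.abs(t - p))), float(np.mean((t - p) ** 2))
```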
We compare our method with existing methods in Table 1. Lempitsky et al. [8] employed density map estimation based on a traditional method without a CNN. Zhang et al. [10] used MCNN for counting and achieved comparable accuracy. The method proposed in this paper achieves the best MAE and MSE among the compared methods, which demonstrates its good accuracy and robustness.

Table 1 Comparison of test results of different methods

  Method             MAE    MSE
  Lempitsky et al.   3.34   15.40
  Zhang et al.       0.82   1.40
  Our method         0.66   0.67
4 Conclusion

In this paper, we propose a new method for bone marrow cell counting that uses FPM for image acquisition and a density-estimation-based CNN for detection. The comparison results demonstrate that our method has good accuracy and robustness. The phase images obtained by FPM give additional sample information, which helps subsequent detection and reduces the misdetection rate. The density map preserves more information from the image, which in turn improves the counting accuracy. The method performs well and provides a reliable, rapid, and accurate auxiliary technique for the diagnosis of hematopoietic system diseases.

Acknowledgements. This research was supported by the Key Laboratory Foundation under Grant TCGZ2020C004.
References

1. Weir EG, Borowitz MJ (2001) Flow cytometry in the diagnosis of acute leukemia. Semin Hematol 38(2):124–138
2. Coustan-Smith E (2002) Prognostic importance of measuring early clearance of leukemic cells by flow cytometry in childhood acute lymphoblastic leukemia. Blood 100(1):52–58
3. Zheng X, Zhang Y, Shi J et al (2011) A new method for automatic counting of marrow cells. In: International conference on biomedical engineering and informatics. IEEE
4. Ai D, Yin X, Liu B et al (2009) The algorithm of marrow cell identification and classification. Chin J Biomed Eng 28(4)
5. Liu H, Cao H, Song E (2019) Bone marrow cells detection: a technique for the microscopic image analysis. J Med Syst 43(4)
6. Zheng G, Horstmeyer R, Yang C (2013) Wide-field, high-resolution Fourier ptychographic microscopy. Nat Photonics 7(9):739–745
7. Guo K, Dong S, Zheng G (2016) Fourier ptychography for brightfield, phase, darkfield, reflective, multi-slice, and fluorescence imaging. IEEE J Sel Top Quantum Electron 22(4):77–88
8. Lempitsky VS, Zisserman A (2010) Learning to count objects in images. In: Advances in neural information processing systems 23. Curran Associates Inc.
9. Xia W, Shan H (2013) Utilizing density estimation to count object. J Frontiers Comput Sci Technol 7(11):1002–1008
10. Zhang Y, Zhou D, Chen S et al (2016) Single-image crowd counting via multi-column convolutional neural network. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR). IEEE
11. Xue Y, Ray N, Hugh J et al (2016) Cell counting by regression using convolutional neural network. In: European conference on computer vision. Springer, Cham
Identification of Sensitive Regions for Power Equipment Based on Fast R-CNN Hanwu Luo1 , Qirui Wu2,3(B) , Zhonghan Peng2,3 , Hailong Zhang2,3 , and Houming Shen2,3 1 State Grid East Inner Mongolia Electric Power Supply Co. Ltd., Hohhot, People’s Republic of
China 2 NARI Group Corporation Ltd., Nanjing, People’s Republic of China
[email protected] 3 NARI Limited Liability Company, State Grid Electric Power Research Institute, Wuhan,
People’s Republic of China
Abstract. Electricity is an indispensable resource in people's daily life. However, the inspection of power equipment is still carried out manually, which is inefficient, consumes much manpower and material, and is not accurate enough. To realize an intelligent inspection system, this paper uses a fast R-CNN approach based on GoogLeNet V2, which provides a more efficient and advanced supervisory technology for power equipment. The experimental results show that the average accuracy rate of the scheme reaches 92.0% and the average recall rate reaches 79.7%, with good results.

Keywords: GoogLeNet V2 · Fast R-CNN · Electric power equipment · Target detection
1 Introduction

Electricity is an indispensable resource in our life, so the inspection of power equipment is a task related to people's livelihood. Nowadays, power plant monitoring mostly sends collected images or videos of power equipment to monitoring personnel through video equipment. The disadvantages of this approach are low stability, human negligence, low efficiency, and high cost. Deep learning research has grown rapidly in recent years in image recognition and target detection, and these algorithms have been applied in more and more fields. If such technologies can be used to identify power equipment in monitoring pictures, they will further improve the intelligence of the power system and bring an unprecedented, advanced detection technology. This will greatly free human resources and make the whole power system more stable and secure.
This paper applies the faster R-CNN object detection algorithm [1] to the identification of power equipment. Compared with ordinary objects, the number of types of electrical equipment is relatively small, but the lighting is mostly dark and the color differences are relatively small. At the same time, because power supply is related to people's daily life, the precision requirement on the algorithm is very high. Abroad, image processing and recognition technology has been applied to power systems through infrared image detection and other techniques since 1997. In China, image recognition technology has also been widely used in power system research. In traditional target detection algorithms, feature extraction and classification decisions are carried out separately, so feature selection is more demanding and it is difficult to obtain ideal results in complex scenes [2]. Since Professor Hinton put forward the theory of deep learning, more and more researchers have found that applying deep learning to power systems can effectively improve application effectiveness and system performance, so deep learning methods have been extensively applied in power system research. To enhance the application of neural networks in partial discharge pattern recognition, Yu Yang introduced a deep learning mechanism to realize partial discharge pattern recognition [3]. Gan Weitao used a convolutional neural network (CNN) to recognize transformers and insulators [4]. For accurate detection of thermal faults of electrical equipment in substations, Jia Xin proposed an infrared fault identification method based on a double-supervised-signal convolutional neural network [5].
Li Yunhao combined an infrared thermal imager with a visible light camera and, through research on image processing and deep learning algorithms together with an embedded system, realized automatic location and diagnosis of power transformer faults [6]. Zhu Xuliang et al. proposed an intelligent identification method for partial discharge atlases of power equipment based on image processing technology and deep sparse data denoising, and realized intelligent classification and recognition of partial discharge using an extreme learning machine (ELM) network [7].
2 Related Work

The faster R-CNN model used in this paper evolved gradually from the R-CNN model [8]. CNN is the foundation of most deep neural networks. In 2014, the R-CNN algorithm was first proposed by Girshick. The steps of R-CNN are briefly described as follows. First, candidate regions are identified, and the recall rate is guaranteed by utilizing the edge, color difference, and pixel color information of the image. However, an important shortcoming of R-CNN is that each candidate box is computed separately. This results in a huge amount of computation during training, which makes the method very slow. He et al. put forward spatial pyramid pooling (SPP) to solve the problem of the long computation time of R-CNN. The model obtained by combining SPP theory is fast R-CNN. However, despite solving the computation problem, fast R-CNN still has an important limitation: it costs a lot of time to find all the candidate boxes. This makes the speed of fast R-CNN reach a bottleneck. Therefore, researchers added to fast R-CNN a neural network that can extract candidate boxes quickly. This is the faster
696
H. Luo et al.
R-CNN, which will be used in this paper. It is a model that can be trained quickly and performs identification stably. Therefore, this paper applies faster R-CNN technology to power equipment identification.
3 Feature Extraction

Feature extraction is an important basis of image recognition. Feature extraction based on deep learning networks can overcome the limited generalization ability of handcrafted features. This paper uses the classical deep convolutional network GoogLeNet V2 [9] for feature extraction. The GoogLeNet series is a family of deep convolutional neural networks developed by Google. Traditional convolutional neural networks deepen the network by continuously stacking convolutional layers. However, a deeper network means more parameters, which increases the amount of computation; in addition, it easily leads to over-fitting. The main idea of GoogLeNet is to improve performance without increasing the amount of computation, through elaborate manual design. The greatest feature of GoogLeNet is that it builds the network by stacking Inception modules. The network structure of Inception v1 is shown in Fig. 1.
Fig. 1 Network structure of Inception v1
This paper uses GoogLeNet V2 to extract features. On the basis of v1, Inception v2 mainly makes two improvements. First, it replaces the 5 × 5 convolution with two stacked convolution layers with 3 × 3 kernels, which accelerates computation and removes a large number of parameters. Second, it adds a batch normalization layer, which accelerates training, reduces internal covariate shift, and improves classification accuracy.
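The parameter saving of the 3 × 3 factorization can be checked with a short calculation (a sketch; the channel count is illustrative and not taken from the paper):

```python
def conv_params(kernel, c_in, c_out):
    """Weight count of a square convolution layer (biases ignored)."""
    return kernel * kernel * c_in * c_out

c = 64  # illustrative channel count, not from the paper
one_5x5 = conv_params(5, c, c)        # a single 5 x 5 convolution
two_3x3 = 2 * conv_params(3, c, c)    # two stacked 3 x 3 convolutions
print(one_5x5, two_3x3)               # two 3x3 layers need 18/25 = 0.72x the weights
```

The two stacked 3 × 3 layers also cover the same 5 × 5 receptive field while adding an extra nonlinearity between them.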
4 Faster R-CNN Model Based on a Deep Learning Network Framework

The most advanced object detection networks rely on region generation algorithms to hypothesize object locations. In further exploration, researchers introduced the region proposal network (RPN). The RPN and the detection network share the full-image convolutional features, so that region proposals cost almost no extra time. The region proposal network is a fully convolutional network, trained by a simple alternating optimization. Its concrete structure is shown in Fig. 2.
Fig. 2 Faster R-CNN structural diagram
Faster R-CNN consists of four parts: the convolution layers, the RPN network, RoI pooling, and the classifier. Its main advantage over the previous model is the RPN, which is specially used to recommend candidate regions. It takes a picture of any size as input and outputs a set of rectangular region proposals. The input picture is treated as a large matrix, over which the RPN generates several anchors. So-called anchors are boxes with preset sizes; by default, each sliding position produces nine anchors. The most important property of anchors is translation invariance.
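The nine anchors per sliding position can be sketched as three scales crossed with three aspect ratios (the particular scale and ratio values below are common defaults, assumed for illustration rather than taken from the paper):

```python
def generate_anchors(cx, cy, scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return nine (x1, y1, x2, y2) anchor boxes centered at (cx, cy).

    For scale s and aspect ratio r the box keeps area s*s while
    w/h = r, so w = s * sqrt(r) and h = s / sqrt(r).
    """
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * r ** 0.5
            h = s / r ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

boxes = generate_anchors(0, 0)  # 3 scales x 3 ratios = 9 anchors
```

Shifting the center (cx, cy) shifts all nine boxes identically, which is the translation invariance mentioned above.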
5 Experiment and Analysis

5.1 Data Introduction

In this paper, the dataset contains a total of 3241 pictures, including transformer radiator, transformer bushing, transformer oil pillow, transformer body, circuit breaker capacitor, isolating switch, transformer wall bushing, and other common equipment in power
system. For each device, images of different parts, from different angles, and under different illumination were collected. This paper randomly divided the dataset into two parts: 90% of the data forms the training set, and the rest is the test set.

5.2 The Experiment Platform

The operating system of this experiment is Ubuntu 18.04, the GPU is an NVIDIA 1080Ti, the deep learning framework is TensorFlow 1.13.1, and the programming language is Python.

5.3 Experimental Results and Analysis

This paper adopts the COCO evaluation metrics published by Microsoft, which are commonly used in object detection, to analyze the experimental results (Table 1).

Table 1 Experimental results

Evaluation index    | Measure way                         | The model in this paper
Average accuracy    | Overlapping degree 0.5:0.05:0.95    | 0.728
Average accuracy    | Overlapping degree 0.5              | 0.920
Average accuracy    | Overlapping degree 0.75             | 0.851
Average recall rate | Number of detection results 1       | 0.620
Average recall rate | Number of detection results 10      | 0.797
This paper randomly selected two pictures to show the recognition results. As shown in Figs. 3 and 4, we find that strong light has a certain impact on recognition, although the accuracy remains high; the results are also affected by the number of samples. These are directions for our future improvement.
Fig. 3 Recognition effect of transformer radiator
Fig. 4 Recognition effect of transformer bushing
6 Conclusion

The faster R-CNN power equipment identification scheme based on GoogLeNet V2 proposed in this paper achieves high accuracy and a high recall rate. It provides an effective scheme for realizing intelligent patrol inspection of power equipment and will be applied in an actual patrol inspection system in the future.

Acknowledgements. This work was funded by the State Grid Science and Technology Project (Research on Key Technologies of Intelligent Image Preprocessing and Visual Perception of Transmission and Equipment).
References

1. Ren S, He K, Girshick R et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. Adv Neural Inf Process Syst, pp 91–99
2. Fu R (2017) Research on target detection based on deep learning. Beijing Jiaotong University
3. Yu Y (2016) Application of deep learning in partial discharge pattern recognition. North China Electric Power University (Beijing)
4. Gan W (2017) Research on transformer image processing method based on infrared image. South China University of Technology
5. Jia X (2018) Research on infrared fault recognition of electrical equipment based on double supervisory signal convolution neural network. Tianjin University of Technology
6. Li Y (2018) Research on embedded transformer fault detection technology based on infrared imaging. Chongqing University of Technology
7. Xuliang Z, Chuanghua L, Jin H, Xiaobo S, Rong C, Shangxiang X (2018) Intelligent recognition of partial discharge atlas based on image processing and noise reduction. Power Big Data 21(11):50–56
8. Girshick R, Donahue J, Darrell T et al (2014) Rich feature hierarchies for accurate object detection and semantic segmentation. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 580–587
9. Girshick R (2015) Fast R-CNN. In: Proceedings of the IEEE international conference on computer vision, pp 1440–1448
Dimensionality Reduction Algorithm

Wenzhen Li1(B), Qirui Wu2,3, Zhonghan Peng2,3, Kai Chen2,3, Hui Zhang2,3, and Houming Shen2,3

1 State Grid East Inner Mongolia Electric Power Supply Co. Ltd., Hohhot, People's Republic of
China [email protected] 2 NARI Group Corporation Ltd., Nanjing, People’s Republic of China 3 NARI Limited Liability Company, State Grid Electric Power Research Institute, Wuhan, People’s Republic of China
Abstract. Since the information revolution, human society has developed extremely rapidly. As mankind continues to develop computing technologies for processing information, data has grown explosively, and various kinds of high-dimensional data are constantly being produced. A serious problem brought about by this is how to process these large amounts of high-dimensional data and extract the usable information from them. High-dimensional data compression and extraction algorithms are an effective way to solve this problem. This paper mainly studies the sparse principal component analysis algorithm (SPCA) and the edge-group sparse principal component analysis algorithm (ESPCA), both based on the principal component analysis algorithm (PCA), a high-dimensional data compression algorithm. The paper focuses on the theory of the edge-group sparse principal component analysis algorithm, reimplements the algorithm, and obtains results consistent with the original paper on simulated data.

Keywords: High-dimensional data analysis · PCA · Sparse PCA · Feature extraction
1 Introduction

1.1 Research Background and Purpose

Someone once said that "the twenty-first century is the century of biology." This saying arose with the start of the Human Genome Project at the end of the last century. The project aims to sequence all the base pairs on all human chromosomes, so as to draw the entire human gene map, identify the genes carrying genetic information, and finally decipher human genetic information. It is also said that "the twenty-first century is the information age," and this is because, with the
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_94
invention and popularization of computers, the information revolution broke out, and information has exerted an ever greater influence on human society. By now our lives are inseparable from information and information processing technology. The amount of information on the now fully popularized Internet has reached astronomical figures, and information processing speed is still evolving. These two sayings aptly describe the situation in the early twenty-first century. At present, the large amount of gene sample data generated by the Human Genome Project is still being analyzed, and its analysts constantly use information processing technology for data analysis and processing. However, human gene data is so large that only continuous improvement and development of data analysis and processing technology can enable better analysis of genetic data. A gene is a piece of DNA or RNA carrying genetic information. The simplest forms of life comprise at least about 300 genes, while the human genome contains about 20,000–30,000 genes. When we analyze genetic data, even the genetic data of the simplest life is 300-dimensional. When we process all human genes of the Human Genome Project, we face data of up to 20,000–30,000 dimensions, a very large gene sample matrix. Even if we group this huge gene sample data for analysis, we still face a rather large high-dimensional data matrix. Therefore, algorithms for analyzing high-dimensional data often play a large role in the processing of genetic data. Moreover, high-dimensional data has recently appeared not only in bioinformatics but also frequently in other fields of science and industry, such as computer vision, machine learning, and aerospace.
When we deal with these high-dimensional data, there are often obstacles: high computational complexity, large computing power requirements, and difficulty in obtaining optimal processing results. These are collectively referred to as the curse of dimensionality, and an effective way to overcome it is dimensionality reduction. A dimensionality reduction algorithm reduces high-dimensional data to a low-dimensional structure that reflects the essential characteristics of the data, thereby greatly reducing the computational complexity. The closer the low-dimensional structure is to the essential characteristics of the data, the easier it is to obtain the optimal processing result, and it may also offer good data interpretability.

1.2 Research Status at Home and Abroad

Sparse principal component analysis was published by Hui Zou and others in 2006 [1]. They proved that the PCA problem can be transformed into a regression-type optimization problem. Then, by applying regularization constraints to the regression coefficients, a sparse low-dimensional feature vector (principal component load) is obtained [2]. Since adding an L0 regularization constraint is an NP-hard problem, L1 regularization is added instead to obtain the sparse principal component load [3]. When sparse principal component analysis is used to reduce the dimensionality of the gene data sample matrix, the sparse principal component load obtained has a certain interpretability [4]. However, the original genes in the gene sample matrix are connected; that is to say, each principal component is likely to be composed of related original genes
[5]. However, sparse principal component analysis does not take the relationship graph of the original genes as input [6]. It is not difficult to see that the principal component load obtained by sparse principal component analysis does not necessarily reflect the interrelationships of the original genes. In general, sparse principal component analysis may offer some interpretability, but for a gene sample matrix whose original genes are related to each other, the obtained principal component load cannot capture this part of the relationship well [7–9].
2 Basic Theory

2.1 PCA

PCA multiplies the original data matrix by an orthogonal transformation matrix, projecting the data into a low-dimensional column space and thereby achieving dimensionality reduction; the dimension of the new matrix equals the column-space dimension of the orthogonal transformation matrix. Since a matrix transformation is a linear transformation, PCA is a linear dimensionality reduction method.

The main steps of PCA:
1. Zero-average each row of the original data matrix X_0 \in R^{p \times n} to obtain X \in R^{p \times n}.
2. Solve for the orthogonal transformation matrix A \in R^{p \times d} (d \ll p) required for the transformation.
3. Multiply the original data matrix X by the transformation matrix A to obtain the new low-dimensional data Y = A^T X \in R^{d \times n}.

2.2 SPCA

SPCA is a data dimensionality reduction algorithm based on PCA. It adds a constraint that makes the principal component load sparse while solving for the principal component load of PCA. The sparsity constraint selected in this paper is a greedy k-sparsity constraint, that is, the principal component load is allowed to have only k non-zero components. Suppose the gene data sample is X \in R^{d \times n}, where d is the number of genes and n is the number of samples. Then the equation for solving the principal component load of the SPCA algorithm with L0 regularization is

u = \arg\max_{\|u\|_2 = 1} u^T X X^T u, \quad \text{s.t. } \|u\|_0 \le k \qquad (1)

where k is a positive integer; \|x\|_2 denotes the 2-norm, whose value is the square root of the sum of the squares of the elements of the vector x, i.e. \|x\|_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}; and \|x\|_0 denotes the 0-norm, the number of non-zero elements of x.
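The PCA steps and the L0-constrained SPCA problem (1) can be sketched in numpy. The truncated-power-style iteration used for SPCA below is one standard way to approximate the greedy k-sparsity constraint, not necessarily the paper's exact procedure:

```python
import numpy as np

def pca(X0, d):
    """PCA following the three steps above: X0 is p x n (rows = features)."""
    X = X0 - X0.mean(axis=1, keepdims=True)       # step 1: zero-average each row
    eigval, eigvec = np.linalg.eigh(X @ X.T)      # step 2: eigenvectors of X X^T
    A = eigvec[:, np.argsort(eigval)[::-1][:d]]   # p x d transformation matrix
    return A.T @ X                                 # step 3: Y = A^T X, a d x n matrix

def spca_l0(X, k, iters=100):
    """Approximate first k-sparse principal component load for Eq. (1)."""
    C = X @ X.T
    u = np.ones(X.shape[0]) / np.sqrt(X.shape[0])
    for _ in range(iters):
        z = C @ u
        top = np.argsort(np.abs(z))[::-1][:k]     # keep the k largest entries of z
        u = np.zeros_like(z)
        u[top] = z[top]
        u /= np.linalg.norm(u)                    # renormalize to the unit sphere
    return u
```

The truncation step enforces ‖u‖₀ ≤ k after each power iteration, and the final u is a unit vector as required by the constraint ‖u‖₂ = 1.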
3 ESPCA Theory

3.1 The Sparse Condition of ESPCA

Suppose g = \{e_1, e_2, e_3, \ldots, e_M\} is the set of edges of a gene graph, where e_i = (g_p, g_q) represents the edge connecting gene g_p and gene g_q. Then we can define a sparse edge constraint as follows:

\|u\|_{ES} = \min_{g' \subseteq g,\; \mathrm{support}(u) \subseteq V(g')} |g'| \qquad (2)

where g' is a subset of g, |g'| is the number of edges in g', V(g') is the set of vertices of all the included edges, and support(u) is the index set of the non-zero elements of u. For example, if u = [5, 1, 0, 9]^T and g = \{(1, 2), (1, 4), (2, 3), (3, 4)\}, then support(u) = \{1, 2, 4\}, and g' = \{(1, 2), (1, 4)\} satisfies the condition support(u) \subseteq V(g') = \{1, 2, 4\}. At the same time, it is easy to get \|u\|_{ES} = 2. That is to say, the above sparse edge constraint makes the non-zero elements of the sparse principal component load depend on the gene relationship graph. Then, by replacing the L0 regularization constraint of (1) with the above sparse edge constraint, the following ESPCA formula is obtained:

(u, v) = \arg\max_{\|u\|_2 = 1, \|v\|_2 = 1} u^T X v, \quad \text{s.t. } \|u\|_{ES} \le k \qquad (3)

where u is the load of the first principal component, v is the first principal component, and k is a parameter that controls the number of selected edges.
where u is the load of the first principal component, υ is the first principal component, and k is a parameter that controls the number of selected edges. 3.2 The Specific Algorithm of ESPCA The key problem in solving Eq. (3) is how to solve the following problems when v is given and z = Xv: arg max T (4) u z , s.t.uES ≤ k u = u2 =1 2
This problem is an NP-hard problem. So ESPCA uses a greedy algorithm (P(z, k, g)) to get an approximate solution of (4). An approximate solution of Eq. (4) can be expressed as u ← uˆuˆ , where uˆ = z P(z, k, g), Where P(z, k, g) outputs a d-dimensional column vector, each of which is:
z ,ifg(i)∩supp(norm(z),k) =∅ (5) P(z, k, g) i = 0,i otherwise where g(i) is the index of the edge containing the gene i in the gene graph g. According to Algorithm 1, we can solve the approximate solution of Eq. (3) by the following alternating iteration to convergence method: u←
uˆ , where uˆ = P(z, k, g) and z = Xv, uˆ 2
v \leftarrow \frac{\hat{v}}{\|\hat{v}\|_2}, \quad \text{where } \hat{v} = X^T u.
The above only solves for the load and value of the first principal component. If the load and value of the l-th principal component are required, the following model needs to be solved:

(u_l, v_l) = \arg\max_{\|u_l\|_2 = 1, \|v_l\|_2 = 1} u_l^T X v_l, \quad \text{s.t. } \|u_l\|_{ES} \le k,\; v_l \perp v_1, v_2, \ldots, v_{l-1} \qquad (6)

where l \ge 2, and u_1, \ldots, u_{l-1} and v_1, \ldots, v_{l-1} have already been solved. When fixing v_l to solve for u_l, we just follow the method above. The key to solving (6) is how to solve for v_l when u_l is fixed:

v_l = \arg\max_{\|v_l\|_2 = 1} u_l^T X v_l, \quad \text{s.t. } v_l \perp v_1, v_2, \ldots, v_{l-1} \qquad (7)
To solve Eq. (7), we adopt the Gram-Schmidt orthogonalization method. Suppose V_{l-1} = [v_1, v_2, \ldots, v_{l-1}] \in R^{n \times (l-1)}, and let V_{l-1}^{\perp} \in R^{n \times (n-l+1)} span the orthogonal complement of the column space of V_{l-1}, so that the columns of [V_{l-1}, V_{l-1}^{\perp}] \in R^{n \times n} form an orthogonal basis of the n-dimensional space. Because v_l is perpendicular to all the column vectors of V_{l-1}, v_l must be representable as a linear combination of the columns of V_{l-1}^{\perp}, that is, v_l = V_{l-1}^{\perp} \beta with \beta \in R^{n-l+1}. Substituting v_l = V_{l-1}^{\perp} \beta into Eq. (7) gives:

\beta = \arg\max_{\|\beta\|_2 = 1} u_l^T X V_{l-1}^{\perp} \beta \qquad (8)

Because \|\beta\|_2 = 1, there is

u_l^T X V_{l-1}^{\perp} \beta \le \| V_{l-1}^{\perp T} X^T u_l \|_2 \qquad (9)

with equality if and only if

\beta = \frac{V_{l-1}^{\perp T} X^T u_l}{\| V_{l-1}^{\perp T} X^T u_l \|_2} \qquad (10)

so the solution of Eq. (8) is (10). Since v_l = V_{l-1}^{\perp} \beta, we can substitute (10) and combine with (7) to get:

v_l = \frac{V_{l-1}^{\perp} V_{l-1}^{\perp T} X^T u_l}{\| V_{l-1}^{\perp} V_{l-1}^{\perp T} X^T u_l \|_2} \qquad (11)

However, the value of V_{l-1}^{\perp} V_{l-1}^{\perp T} cannot be obtained directly. Below we introduce a theorem that transforms the problem of computing it.
Theorem 1 If the column space of V_{l-1}^{\perp} is the orthogonal complement of the column space of V_{l-1}, then V_{l-1}^{\perp} V_{l-1}^{\perp T} = I - V_{l-1} V_{l-1}^T.

Proof Let V = [V_{l-1}, V_{l-1}^{\perp}]; then V is an orthogonal matrix, so V V^T = I. By the block-matrix algebra, V V^T = V_{l-1} V_{l-1}^T + V_{l-1}^{\perp} V_{l-1}^{\perp T} = I. So the theorem holds.

Therefore Eq. (7) can be converted to:

v_l \leftarrow \frac{\hat{v}_l}{\|\hat{v}_l\|_2}, \quad \text{where } \hat{v}_l = \left( I - V_{l-1} V_{l-1}^T \right) X^T u_l \qquad (12)

In particular, when l = 1 the above formula reduces to

v \leftarrow \frac{\hat{v}}{\|\hat{v}\|_2}, \quad \text{where } \hat{v} = X^T u \qquad (13)

Therefore Eq. (12) is universal, and for Eq. (6) we can iterate the following two equations until convergence to obtain an approximate solution:

u_l \leftarrow \frac{\hat{u}_l}{\|\hat{u}_l\|_2}, \quad \text{where } \hat{u}_l = P(z, k, g) \text{ and } z = X v_l,

v_l \leftarrow \frac{\hat{v}_l}{\|\hat{v}_l\|_2}, \quad \text{where } \hat{v}_l = \left( I - V_{l-1} V_{l-1}^T \right) X^T u_l.
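The greedy projection and the alternating iteration with Gram-Schmidt deflation can be sketched in numpy. Scoring an edge (p, q) by z_p² + z_q² is an assumed instantiation of the greedy step, since the paper abbreviates the exact selection rule; vertices are 0-based here:

```python
import numpy as np

def project_edges(z, k, g):
    """Greedy P(z, k, g): keep entries of z covered by the k strongest edges.

    Edge strength z[p]**2 + z[q]**2 is an assumption (the paper
    abbreviates the greedy scoring rule). Vertices are 0-based tuples.
    """
    ranked = sorted(g, key=lambda e: -(z[e[0]] ** 2 + z[e[1]] ** 2))
    keep = {i for edge in ranked[:k] for i in edge}
    out = np.zeros_like(z)
    for i in keep:
        out[i] = z[i]
    return out

def espca(X, k, g, n_components=2, iters=50):
    """Alternating iteration of Eqs. (6)-(12) with Gram-Schmidt deflation."""
    d, n = X.shape
    U, V = [], []
    for _ in range(n_components):
        Vm = np.array(V).T if V else np.zeros((n, 0))
        proj = np.eye(n) - Vm @ Vm.T              # I - V_{l-1} V_{l-1}^T
        v = np.random.default_rng(0).standard_normal(n)
        v /= np.linalg.norm(v)
        for _ in range(iters):
            u = project_edges(X @ v, k, g)        # u_l <- P(z, k, g), z = X v_l
            u /= np.linalg.norm(u)
            v = proj @ (X.T @ u)                  # v_l <- (I - V V^T) X^T u_l
            v /= np.linalg.norm(v)
        U.append(u)
        V.append(v)
    return np.array(U).T, np.array(V).T
```

The deflation projector guarantees each new principal component is orthogonal to all previously extracted ones, exactly as Eq. (12) requires.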
4 Experiment and Result Analysis

4.1 Simple Simulation Data Generation

• Assume two principal component loads:

u_1 = [1, -1, 0.7, 0.1, -0.5, 0, 0, 0, 0, 0] \qquad (14)

u_2 = [0, 0, 0, 0, 0, 0.1, -2, -0.5, 0.3, 0.1] \qquad (15)

• Assume two principal components:

v_1 = \mathrm{rnorm}(100), \quad v_2 = \mathrm{rnorm}(100) \qquad (16)

The value of rnorm(n) is an n-dimensional column vector, each entry being a random value drawn from the standard normal distribution.

• Generate a gene expression matrix X \in R^{10 \times 100} with 10 gene features and 100 samples by the following formula:

X = d_1 u_1 v_1^T + d_2 u_2 v_2^T + \gamma \varepsilon \qquad (17)
where d_1 = 10, d_2 = 5, \gamma = 5, and \varepsilon \in R^{10 \times 100} is a random matrix of the same shape as X whose entries follow the standard normal distribution.

• Assume a gene graph g = g_1 \cup g_2:

g_1 = \{(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (3, 4), (4, 5)\} \qquad (18)

g_2 = \{(6, 7), (6, 10), (7, 8), (7, 10), (8, 9), (8, 10), (9, 10)\} \qquad (19)

See Fig. 1.
Fig. 1 Gene relation graph of generated data g
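The data-generation recipe of Eqs. (14)-(19) can be reproduced in numpy (the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

u1 = np.array([1, -1, 0.7, 0.1, -0.5, 0, 0, 0, 0, 0.0])   # Eq. (14)
u2 = np.array([0, 0, 0, 0, 0, 0.1, -2, -0.5, 0.3, 0.1])   # Eq. (15)

def rnorm(n):
    """n-dimensional vector of standard-normal values, as in Eq. (16)."""
    return rng.standard_normal(n)

v1, v2 = rnorm(100), rnorm(100)

d1, d2, gamma = 10, 5, 5
eps = rng.standard_normal((10, 100))        # noise matrix, same shape as X
X = d1 * np.outer(u1, v1) + d2 * np.outer(u2, v2) + gamma * eps  # Eq. (17)

# Gene graph g = g1 ∪ g2, with 1-based vertex labels as in Eqs. (18)-(19)
g1 = {(1, 2), (1, 3), (1, 5), (2, 3), (2, 4), (3, 4), (4, 5)}
g2 = {(6, 7), (6, 10), (7, 8), (7, 10), (8, 9), (8, 10), (9, 10)}
g = g1 | g2
```

Note that g1 connects only genes 1–5 (the support of u1) and g2 connects only genes 6–10 (the support of u2), which is what lets ESPCA exploit the graph structure.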
4.2 Experiment and Analysis on the Simple Simulated Data

PCA, SPCA, and ESPCA were compared on the generated simulation data; the results are shown in Table 1. In the simulation data we generated, the true load of the first principal component involves the first 5 genes, and the load of the second principal component involves the last 5 genes. To ensure that the principal component loads produced by ESPCA have only 5 non-zero values, we set k = 6 in Eq. (3); to ensure that the loads produced by SPCA also have only 5 non-zero values, we set k = 5 in Eq. (1). ESPCA correctly produced first and second principal component loads each with the 5 expected non-zero values. However, the first principal component load of SPCA misses the value of gene 4, and its second principal component load misses the values of gene 6 and gene 10. PCA produced two dense principal component loads, which greatly lack interpretability. In the PCA loads, the five genes with the largest absolute values in the first principal component load are g_1, g_2, g_3, g_5, g_7, and the five genes with the largest absolute values in the second principal component load are g_1, g_2, g_7, g_8, g_9. These coincide exactly with the five non-zero values of SPCA's principal component loads, which is why the first principal component of SPCA loses g_4 and the second loses g_6 and g_10. However, in the gene graph g, one can see that g_4 is connected to g_1, g_2, g_3, and g_5, so the load
of the first principal component should also be related to g_4. Similarly, g_10 is connected to g_6, g_7, g_8, and g_9, so the second principal component load should be related to g_10; that is to say, SPCA lost important related genes. ESPCA, however, correctly captured the correlation between the first principal component load and the first 5 genes, and between the second principal component load and the last 5 genes.

Table 1 PCA, SPCA, ESPCA principal component loads on the simulation data

Gene | PCA PC1 | PCA PC2 | SPCA PC1 | SPCA PC2 | ESPCA PC1 | ESPCA PC2
g1   |  0.60   |  0.07   |  0.06    |  0.06    |  0.60     |  0
g2   | −0.60   | −0.07   | −0.06    | −0.06    | −0.60     |  0
g3   |  0.42   |  0.05   |  0.42    |  0       |  0.42     |  0
g4   |  0.06   |  0.01   |  0       |  0       |  0.06     |  0
g5   | −0.30   | −0.04   | −0.03    |  0       | −0.30     |  0
g6   | −0.01   | −0.05   |  0       |  0       |  0        |  0.05
g7   |  0.11   | −0.95   |  0.11    | −0.96    |  0        |  0.95
g8   |  0.03   | −0.24   |  0       | −0.24    |  0        |  0.24
g9   | −0.02   |  0.14   |  0       |  0.14    |  0        | −0.14
g10  | −0.01   |  0.05   |  0       |  0       |  0        | −0.05
5 Summary and Outlook

Since entering the information age and the century of biology, human society has developed rapidly, and human life is inseparable from science and technology. High-dimensional data in various fields are constantly being produced, which places higher requirements on information analysis and data processing. On the one hand, the continuous growth of computing power is one way to address this problem; on the other hand, the continuous development of efficient dimensionality reduction algorithms suited to their application scenarios is also an important way to handle high-dimensional data. Moreover, dimensionality reduction algorithms extract features of high-dimensional data to obtain low-dimensional features that are closer to the nature of the data, which is especially important. The experimental results show the superiority of introducing the relationship graph of the original data as the sparse condition: the generated principal component loads are correct, well interpretable, and robust to noise.

Acknowledgements. This work was funded by the State Grid Science and Technology Project (Research on Key Technologies of Intelligent Image Preprocessing and Visual Perception of Transmission and Transformation Equipment).
References

1. Zou H, Hastie T, Tibshirani R (2006) Sparse principal component analysis. J Comput Graph Statist 15(2):265–286
2. Rayleigh-Ritz theorem [DB/OL]. https://www.planetmath.org/rayleighritztheorem
3. Journee M, Nesterov Y, Richtarik P et al (2010) Generalized power method for sparse principal component analysis. Core Discussion Papers 11(2008070):517–553
4. Yuan X-T, Zhang T (2011) Truncated power method for sparse eigenvalue problems. J Mach Learn Res 14(1)
5. Lin Z, Yang C, Zhu Y et al (2016) Simultaneous dimension reduction and adjustment for confounding variation. Proc Nat Acad Sci 113(51):14662–14667
6. Sill M et al (2015) Applying stability selection to consistently estimate sparse principal components in high-dimensional molecular data. Bioinformatics 31:2683–2690
7. Shen H, Huang JZ (2008) Sparse principal component analysis via regularized low rank matrix approximation. J Multivar Anal 99(6):1015–1034
8. Witten DM et al (2009) A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis. Biostatistics 10:515–534
9. Hyvarinen A, Oja E (2000) Independent component analysis: algorithms and applications. Neur Netw 13:411–430
An Improved APTEEN Protocol Based on Deep Autoencoder

Yu Song, Shubin Wang(B), and Lixin Jing

College of Electronic Information Engineering, Inner Mongolia University, Hohhot, China [email protected]
Abstract. The APTEEN protocol is a popular clustering protocol in wireless sensor networks, but it suffers from rapid energy consumption and large data redundancy. To overcome these problems, a novel improved APTEEN algorithm named SAEDF-APTEEN is proposed in this paper. Given that data fusion technology can reduce redundant data transmission in wireless sensor networks, this paper introduces a deep autoencoder into the APTEEN protocol. After the deep autoencoder model is trained at the base station, the encoder part is deployed in the cluster heads. The cluster heads fuse the data sent by cluster members and transmit the compressed data to the base station. The simulation results show that the SAEDF-APTEEN protocol not only significantly improves the network lifetime and reduces the energy consumption of the whole network, but also effectively reduces the amount of data transmission and enhances the data transmission efficiency.

Keywords: Data fusion · Deep autoencoder · Lifetime · Data transmission efficiency
1 Introduction

In wireless sensor networks (WSN), most of the energy consumption is caused by communication [1]. Because sensing data from multiple sensor nodes are combined to give a complete description of an environment, there is a large amount of data redundancy between nodes [2]. Data fusion can reduce the amount of original data and significantly reduce energy consumption, thus prolonging the network lifetime [3]. The APTEEN protocol proposes to implement data fusion in cluster heads, but it does not provide specific fusion methods or examples. To effectively reduce data redundancy and prolong the network lifetime, experts and scholars have put forward a large number of data fusion algorithms. In reference [4], a data fusion rate is introduced to improve data transmission efficiency, reduce network energy consumption, and prolong lifetime. In reference [5], a reliable neuro-fuzzy optimization model (RNFOM) is applied to WSN to improve the reliability of monitoring data, but it does not consider the problem of node
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_95
energy shortage. In reference [6], a BP neural network data fusion algorithm optimized by adaptive fuzzy particle swarm optimization is proposed; data fusion reduces the amount of data transmitted and the energy consumption of nodes, thus prolonging network life. In reference [7], a deep autoencoder is combined with wireless sensor networks to reduce data redundancy and energy consumption; in essence this is multi-sensor data fusion, with the deep autoencoder model used to extract feature data from cluster members. In reference [8], an energy-efficient data collection scheme using a denoising autoencoder (DCDA) is proposed, in which the sensing data are compressed and reconstructed by the denoising autoencoder to reduce the energy consumption of the network. In this paper, building on previous studies and combining the APTEEN protocol's advantage of transmitting only data that satisfy the soft threshold (ST) and hard threshold (HT) to reduce energy consumption, a novel improved APTEEN protocol is proposed that optimizes the data fusion of the APTEEN protocol.
2 Relate Work 2.1 APTEEN Protocol Energy Consumption Model In the APTEEN protocol, the energy consumption of the sensor nodes is computed by using the mathematical formulas of the energy model. The energy consumption model for the transmitter node and the receiver node is shown in formulas (1) and (2) respectively. ETx (k, d ) = ETx - elec (k) + ETx - amp (k, d ) Eelec ∗ k + εfs ∗ k ∗ d 2 , d ≤ do = Eelec ∗ k + εmp ∗ k ∗ d 4 , d > do ERx (k) = ERx - elec (k) = Eelec ∗ k
(1) (2)
where k is the size of the data packet; d is the distance between sender and receiver; Eelec is the energy dissipated by the transmitter or receiver circuitry; εfs and εmp are the energy dissipation of the amplifier for free-space propagation and multipath propagation, respectively; and do is the threshold distance, computed as do = √(εfs/εmp).

2.2 Autoencoder Model

AE is an unsupervised three-layer neural network, in which the mapping from the input layer to the hidden layer is called the encoder, and the mapping from the hidden layer to the output layer is called the decoder [9]. The nonlinear activation function can be chosen as the rectified linear unit (ReLU) or another function. The ReLU function is defined in formula (3).

ReLU(x) = max(0, x)   (3)
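The radio energy model of formulas (1) and (2), together with the threshold distance do, can be sketched as follows (the numeric constants are the simulation values given later in Sect. 4.2 and are treated here as assumptions):

```python
import math

# Assumed parameter values, taken from the simulation settings in Sect. 4.2.
E_ELEC = 50e-9       # J/bit, electronics energy E_elec
EPS_FS = 10e-12      # J/bit/m^2, free-space amplifier energy
EPS_MP = 0.013e-12   # J/bit/m^4, multipath amplifier energy

D0 = math.sqrt(EPS_FS / EPS_MP)  # threshold distance d_o


def tx_energy(k_bits, d):
    """Transmitter energy of formula (1): free-space term below d_o, multipath above."""
    if d <= D0:
        return E_ELEC * k_bits + EPS_FS * k_bits * d ** 2
    return E_ELEC * k_bits + EPS_MP * k_bits * d ** 4


def rx_energy(k_bits):
    """Receiver energy of formula (2)."""
    return E_ELEC * k_bits
```

With these constants, do ≈ 27.7 m, so a 4000-bit packet sent over 10 m uses the free-space branch, while one sent over 100 m uses the multipath branch.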
The aim of AE training is to make the network output as close as possible to the original input. Because the number of hidden units is smaller than the number of input units in an AE, the encoder can be viewed as compressing the input data into a hidden representation. Meanwhile, the network parameters of the AE can be obtained by minimizing
the loss function. Generally, the squared error loss function is used, as shown in formula (4), where N is the number of training samples, and x^(i) and x̃^(i) represent the i-th input data and its reconstruction, respectively:

J(θ, θ̃) = (1/(2N)) Σ_{i=1}^{N} ‖x̃^(i) − x^(i)‖²   (4)
A stacked autoencoder (SAE) is a deep network in which multiple autoencoders are stacked on top of each other: each AE takes the hidden representation of the previous AE as its input.

2.3 Adam Optimizer

Adam is a popular algorithm in the field of deep learning. Compared with stochastic gradient descent (SGD), whose learning rate does not change during training, Adam is an adaptive learning rate optimization algorithm that can replace SGD. Adam is designed to combine the advantages of AdaGrad and RMSProp and is commonly used in training deep neural networks [10].
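The Adam update rule can be sketched as follows (hyperparameter defaults follow common practice; the quadratic objective at the bottom is only an illustration, not part of the protocol):

```python
import math


def adam_step(params, grads, state, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m) and its
    square (v), with bias correction, give a per-parameter adaptive step size."""
    state["t"] += 1
    t = state["t"]
    for i, g in enumerate(grads):
        state["m"][i] = b1 * state["m"][i] + (1 - b1) * g
        state["v"][i] = b2 * state["v"][i] + (1 - b2) * g * g
        m_hat = state["m"][i] / (1 - b1 ** t)
        v_hat = state["v"][i] / (1 - b2 ** t)
        params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)


# Toy illustration: minimize (w - 3)^2, whose gradient is 2(w - 3)
w = [0.0]
state = {"t": 0, "m": [0.0], "v": [0.0]}
for _ in range(1000):
    adam_step(w, [2 * (w[0] - 3)], state)
```

After the loop, w has moved close to the minimizer at 3.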
3 SAEDF-APTEEN Optimization Protocol

In this paper, the APTEEN protocol is used for networking. This paper combines the SAE model and the APTEEN protocol to propose the application of data fusion based on a stacked autoencoder to the adaptive threshold-sensitive energy-efficient sensor network protocol (SAEDF-APTEEN). The SAE model includes an encoder and a decoder. The encoder extracts the features of the original sensing data and compresses the sensing data; the decoder reconstructs the sensing data from the features. The structure of the SAE model is as follows: the number of input units is equal to the number of nodes in the cluster that satisfy ST and HT; the number of output units is equal to one, that is, the data of multiple nodes in the cluster is fused and compressed into one value; and the number of hidden layers of the encoder is 1 or 2, depending on the complexity and dimension of the input data. The weight parameters are needed to perform the data fusion algorithm based on the SAE. The sensor nodes have limited energy, computing power, and storage capacity, so the training of the SAE model is carried out in the base station. The input data X is mapped into the range [0, 1] using max-min normalization. Training of the first AE network is shown in the left of Fig. 1. The ReLU function is used as the nonlinear activation function in this paper. The loss function shown in formula (4) is minimized to obtain the first AE model parameters. The error between the original sensing data and the reconstructed sensing data is gradually reduced by repeatedly executing the Adam algorithm, and the parameters of the first AE are updated. After training, the first AE parameters θ^(1), θ̃^(1) and the hidden representation h^(1) are obtained. Training the second AE network with h^(1) to obtain the second AE parameters θ^(2), θ̃^(2) and the outputs of the second hidden layer h^(2) is shown in the right of Fig. 1. The outputs of the second hidden layer h^(2) are used as the input of the next hidden layer to train the next AE. The above training process is repeated and the whole SAE is trained greedily layer by layer until the last AE is obtained. After the layerwise unsupervised training process is finished, the parameters θ, θ̃ of the whole SAE are saved for performing the subsequent data fusion algorithm.
Fig. 1 First and second hidden layer of SAE model
Since training the SAE takes a certain amount of time, the base station first chooses the corresponding input training sample data according to the task needs before running the APTEEN protocol, and the method described above is used to train the SAE model with historical sensing data. In the APTEEN protocol, cluster heads need to rotate periodically to balance energy consumption, and the corresponding clusters also update dynamically. A change of cluster structure results in a change of the SAE model structure in the clusters. Therefore, after all nodes in the cluster send their sensing data to the cluster head during their allocated transmission time, the cluster head sends the cluster node information table to the base station. The base station determines the structure of the SAE model matching the cluster according to the cluster information table. In the SAEDF-APTEEN protocol, the encoder part of the SAE is deployed in each cluster head, and the decoder part of the SAE is deployed in the base station. The base station then sends the trained parameters to the corresponding cluster heads. After receiving the encoder parameters, the cluster head begins to perform the data fusion process. The cluster heads are responsible for processing the data sent by the members of the cluster and transmitting it to the base station after data fusion. The original sensing data X in the cluster is compressed into a compressed value h of size 1 × 1 by the encoder, which is computed by forward propagation. The base station is responsible for data reconstruction: the compressed value h is projected back into the reconstructed data X̃ by the decoder. After the nodes of the WSN accomplish a round of data acquisition, fusion
and transmission, the base station runs the APTEEN protocol to carry out a new round of clustering and cluster head selection.
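The cluster-head side of this process — forward-propagating the readings of the in-cluster nodes through the trained encoder to obtain a single fused value — can be sketched as follows. The layer sizes and weights here are hypothetical stand-ins for parameters the base station would send:

```python
def relu(x):
    return [max(0.0, v) for v in x]


def dense(x, W, b):
    # W is a list of rows, one row of input weights per output unit
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def encode(readings, layers):
    """Cluster-head side of SAEDF-APTEEN: compress the cluster's readings
    through the stacked encoder layers into a single fused value."""
    h = readings
    for W, b in layers:
        h = relu(dense(h, W, b))
    return h


# Hypothetical 4-node cluster with a 4 -> 2 -> 1 encoder
layers = [
    ([[0.25, 0.25, 0.0, 0.0], [0.0, 0.0, 0.25, 0.25]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
fused = encode([0.2, 0.4, 0.6, 0.8], layers)
```

The base station would hold the mirror-image decoder and apply it to `fused` to recover the reconstruction X̃.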
4 Simulation Analysis

4.1 Dataset Preprocessing

This paper used the Intel Berkeley Research Lab (IBRL) dataset in the experiments [11]. The sensing data, which contains temperature, humidity, light, and voltage readings, was collected from 54 sensors deployed in the Intel Berkeley Research Lab from 28 February to 6 April 2004 at a 30-s sampling interval. The temperature data is chosen as the experimental data of this paper, and the apparently abnormal temperature readings are first eliminated by thresholds of 50 °C and −10 °C. Then, most of the remaining abnormal data is eliminated by the Pauta criterion. The IBRL dataset is divided into a training dataset and a testing dataset with a split ratio of 6:4.

4.2 Experimental Parameter Setting

In this paper, the performance of the SAEDF-APTEEN protocol is verified on the MATLAB simulation platform. The simulation parameters are as follows: 200 nodes are randomly distributed in an area of 200 m × 200 m; the initial energy of each node is 0.5 J; the base station is located at (100, 100); the data fusion rate of the APTEEN protocol is 60 percent; Eelec = 50 nJ/bit, εfs = 10 pJ/bit/m², εmp = 0.013 pJ/bit/m⁴; the data aggregation energy EDA = 5 nJ/bit; the size of the data packets is 4000 bit; and the size of the cluster node information packets is 200 bit.

4.3 Result of Simulation

Mean absolute error (MAE) and signal-to-noise ratio (SNR) are used to evaluate the reconstruction performance of the proposed data fusion algorithm. The MAE and SNR for different numbers of cluster nodes are given in Table 1. The accuracy of the proposed data fusion algorithm is acceptable for most WSN applications.

Table 1 MAE and SNR of the proposed data fusion algorithm

Cluster nodes  4       5       6       7       8       9       10      11
MAE (°C)       0.0487  0.0710  0.0477  0.0812  0.0500  0.0904  0.0637  0.0638
SNR (dB)       47.47   45.33   48.28   45.16   46.79   44.74   44.68   45.27

Cluster nodes  12      13      14      15      16      17      18      19      20
MAE (°C)       0.0632  0.0563  0.0734  0.0650  0.0897  0.0825  0.0808  0.0795  0.0852
SNR (dB)       44.30   44.04   43.80   44.87   42.87   43.20   42.41   42.47   42.36
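The two metrics in Table 1 can be computed as follows (a minimal sketch; the SNR here is taken as the ratio of signal power to reconstruction-error power, in dB):

```python
import math


def mae(original, reconstructed):
    """Mean absolute error between original and reconstructed sensing data."""
    return sum(abs(a - b) for a, b in zip(original, reconstructed)) / len(original)


def snr_db(original, reconstructed):
    """Reconstruction SNR in dB: signal power over reconstruction-error power."""
    signal = sum(a * a for a in original)
    noise = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return 10.0 * math.log10(signal / noise)
```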
In order to compare network performance in terms of energy consumption and network lifetime, the SAEDF-APTEEN protocol, the BP neural network data fusion algorithm, and the APTEEN protocol are simulated and compared. The number of surviving nodes in a WSN is the key index for evaluating normal operation. As shown in Fig. 2, the SAEDF-APTEEN protocol greatly prolongs the network lifetime compared with the other algorithms. As shown in Fig. 3, the SAEDF-APTEEN protocol performs better in terms of reducing network energy consumption and prolonging network lifetime than the other algorithms. The size and number of total packets transmitted to the base station are shown in Figs. 4 and 5, respectively. After the data transmission volume of the other protocols stops changing, the SAEDF-APTEEN protocol can still run for a period, and its data transmission volume is lower than that of the other protocols owing to data fusion. When the amount of transmitted data no longer changes, the energy of the network is totally consumed, the transmission of data packets stops, and the entire network dies. The SAEDF-APTEEN protocol can effectively reduce network energy consumption and prolong the network lifetime.
Fig. 2 Number of survival nodes
5 Conclusion

In this paper, the SAEDF-APTEEN protocol is proposed, which improves the data fusion of the APTEEN protocol and deploys the deep autoencoder model in the cluster heads. The SAEDF-APTEEN protocol greatly reduces the amount of data transmission in the network, reduces the energy consumption of the whole network, enhances data transmission efficiency, improves network performance, and achieves the purpose of prolonging the network lifetime.
Fig. 3 Average energy of each node
Fig. 4 Size of total packets transmitted to BS
Fig. 5 Number of total packets transmitted to BS

Acknowledgements. Shubin Wang ([email protected]) is the corresponding author, and this work was supported by the National Natural Science Foundation of China (61761034).
References
1. Mekonnen T, Porambage P, Harjula E, Ylianttila M (2017) Energy consumption analysis of high quality multi-tier wireless multimedia sensor network. IEEE Access 5:15848–15858
2. Kumar S, Chaurasiya VK (2019) A strategy for elimination of data redundancy in internet of things (IoT) based wireless sensor network (WSN). IEEE Syst J 13(2):1650–1657
3. Zhou F, Chen Z, Guo S, Li J (2016) Maximizing lifetime of data-gathering trees with different aggregation modes in WSNs. IEEE Sens J 16(22):8167–8177
4. Wu W, Xiong N, Wu C (2017) Improved clustering algorithm based on energy consumption in wireless sensor networks. IET Netw 6(3):47–53
5. Acharya S, Tripathy CR (2017) A reliable fault-tolerant ANFIS model based data aggregation scheme for wireless sensor networks. J King Saud Univ Comput Inf Sci
6. Yang M, Geng Y, Yu K, Li X, Zhang S (2018) BP neural network data fusion algorithm optimized based on adaptive fuzzy particle swarm optimization. In: 2018 IEEE 4th information technology and mechatronics engineering conference (ITOEC), Chongqing, China, pp 59–597
7. Zhao L (2018) Data aggregation in WSN based on deep self-encoder. Int J Perform Eng 14(11):2723–2730
8. Li G, Peng S, Wang C, Niu J, Yuan Y (2019) An energy-efficient data collection scheme using denoising autoencoder in wireless sensor networks. Tsinghua Sci Tech 24(1):86–96
9. Yuan X, Huang B, Wang Y, Yang C, Gui W (2018) Deep learning-based feature representation and its application for soft sensor modeling with variable-wise weighted SAE. IEEE Trans Indust Inf 14(7):3235–3243
10. Kingma DP, Ba JL (2015) Adam: a method for stochastic optimization. In: Proceedings of 3rd international conference for learning representations
11. Intel Lab Data (2019). http://db.csail.mit.edu/labdata/labdata.html
APTEEN Protocol Data Fusion Optimization Based on BP Neural Network Lixin Jing, Shubin Wang(B) , and Yu Song College of Electronic Information Engineering, Inner Mongolia University, Hohhot, China [email protected]
Abstract. In order to reduce the amount of data transmission and the energy consumption of nodes in the APTEEN protocol, a data fusion algorithm based on the APTEEN protocol and a BP neural network is proposed in this paper. A three-layer BP neural network is used to describe the cluster structure. In the process of cluster data transmission, the BP neural network processes the perception data, extracts its feature values, and forwards them to the sink node. Simulation results show that this algorithm outperforms the APTEEN protocol in both network lifetime and energy consumption. It reduces the amount of communication data, reduces energy consumption, and prolongs the network life. Keywords: APTEEN protocol · BP neural network · Data fusion · BP-APTEEN
1 Introduction

A wireless sensor network (WSN) is a network of nodes that work collaboratively to sense and control their surroundings. WSNs have sensing, monitoring, and control functions, and are widely used in military, industrial, and environmental monitoring [1, 2]. Random deployment of WSN nodes often leads to an uneven distribution of nodes in the monitoring area. In a densely deployed WSN, the monitoring areas of multiple nodes overlap, which makes the sensory data of neighboring nodes redundant. At the same time, if each node transmits data separately to the sink node, a lot of energy and communication bandwidth is wasted. This consumes too much network energy and reduces the efficiency of network communication and of sensory data collection [3]. On the premise of ensuring data accuracy, reducing data redundancy, reducing energy consumption, and extending the network life cycle is a good research direction. More and more researchers have studied WSN data fusion algorithms. For example, literature [4] combines the LEACH-F clustering algorithm with the BP neural network; the cluster structure of the WSN does not change after the clusters are formed, saving the energy consumed by each round of clustering in the sensor network.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_96
Literature [5] proposes an algorithm based on double cluster head node (CH) clustering and a data fusion mechanism based on information entropy. The CHs use information entropy for classification and fusion, making the fusion results more accurate and the data transmission more efficient. Literature [6] proposes a data fusion algorithm combining the LEACH protocol with a BP neural network, which saves energy by reducing data transmission. In literature [7], a BP neural network is used to optimize the weights and thresholds, which improves the data acquisition accuracy of the network to some extent; however, the algorithm suffers from small processing scale and poor stability. Literature [8] uses a self-organizing feature map neural network to manage the huge amount of data and selects clustered data to transfer to the base station; through this data management, it reduces energy consumption. In this paper, a BP neural network is introduced into the APTEEN protocol, and a WSN data fusion algorithm based on the APTEEN protocol and a BP neural network (BP-APTEEN) is proposed. The BP-APTEEN algorithm uses the APTEEN clustering protocol to form the cluster structure in the WSN. During data transmission in the cluster structure, the BP neural network is used to fuse the perception data. The characteristic values obtained by data fusion are sent to the sink node through BP-APTEEN. In this way, the algorithm reduces the amount of data transmission, reduces energy consumption, and prolongs network life.
2 Related Work

2.1 WSN Model

There are many WSN models in practical applications. For convenience of study, the wireless sensor model in this paper has the following characteristics:
• The wireless sensor model consists of N sensor nodes. The N sensor nodes are randomly distributed in the monitoring area and do not move after deployment.
• The energy of each node is limited and cannot be replenished. All sensor nodes are of the same type.
• The sink node is fixed outside the sensing area, and there is only one sink node.
• The sink node has enough power to send information directly to all nodes, and it has a continuous supply of energy.
• All sensor nodes have a unique ID. Each sensor node can obtain its own location information.

2.2 APTEEN Protocol Principle

APTEEN is an improved version of the LEACH protocol. Compared with the traditional LEACH protocol, APTEEN adds soft thresholds, hard thresholds, and count time to data transmission. Therefore, the APTEEN protocol can not only collect data periodically, but also respond quickly to emergencies [9]. APTEEN can be divided into a cluster forming stage and a cluster stabilization stage. In the cluster forming stage, CHs are randomly selected, and the sensor network nodes are clustered according to the received signal strength.
During the cluster forming stage, each node is assigned a random number between 0 and 1. If the random number is less than the threshold, the node is selected as CH. The CH selection formula of the APTEEN protocol is formula (1):

T(n) = p / (1 − p · (r mod 1/p)),  if n ∈ G
T(n) = 0,  else   (1)

In formula (1), p is the probability that each node is selected as a CH, r is the current number of rounds, and G is the set of nodes that have not served as CH in the previous 1/p rounds. When a CH is selected, it announces by radio that it has been selected. A non-cluster-head node decides which cluster to join according to the strength of the received broadcast message and becomes a cluster member node (CM) of that cluster. After the cluster is formed, the CH broadcasts the attributes, hard threshold, soft threshold, count time, scheduling, and other parameters to the CMs, and allocates a TDMA time slot for the data transmission of each CM. In the cluster stabilization stage, CMs collect data in the monitoring area and send it to the CH according to the TDMA slot list. The CH receives the data and fuses it. Then, the CHs send the fused data to the base station node through the acquired wireless channel [10].
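The election threshold of formula (1) can be sketched directly; here `eligible` plays the role of the set G, and p = 0.05 gives the familiar 20-round cycle:

```python
import random


def ch_threshold(p, r):
    """T(n) of formula (1): election threshold for an eligible node in round r."""
    return p / (1 - p * (r % int(1 / p)))


def elect_cluster_heads(node_ids, eligible, p, r):
    """A node in G becomes cluster head when its random draw falls below T(n)."""
    t = ch_threshold(p, r)
    return [n for n in node_ids if n in eligible and random.random() < t]
```

Note how the threshold rises over a cycle: T(n) = 0.05 in round 0, but T(n) = 1 in round 19, so any node still eligible at the end of the cycle is elected with certainty.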
3 Data Fusion Based on BP Neural Network 3.1 Model Structure of BP-APTEEN BP neural network adopts parallel processing and management mechanism, which has strong adaptability, self-organization, and autonomous learning ability, and can simulate any nonlinear mapping [11]. Figure 1 shows the model structure of BP-APTEEN. The structure adopts the threelayer BP neural network model, which corresponds to a cluster in the WSN. In the process of data transmission, BP neural network is used to process the data sent by CM to CH. Then, CH sends the feature values of sensing data to the sink node.
Fig. 1 Model structure of BP-APTEEN (the cluster member nodes provide the normalized input layer data x1, …, xn; the hidden layer y0, …, ym and the output layer o0, …, ol correspond to the cluster head, whose outputs are sent to the sink node)
The input layer vector is X = (x0, x1, x2, …, xn−1, xn)ᵀ. The hidden layer vector is Y = (y0, y1, y2, …, yj, …, ym). The output layer vector is O = (o1, …, ok, …, ol). The connection weight matrix between the input layer and the hidden layer neurons is W = (W1, …, Wj, …, Wm). The connection weight matrix between the hidden layer and the output layer neurons is V = (V1, …, Vk, …, Vl).

Formula (2) is the relationship between input layer data xi and hidden layer data yj:

yj = f1(netj) = f1( Σ_{i=0}^{n} wij · xi + θj )   (2)

Formula (3) is the relationship between hidden layer data yj and output layer data ok:

ok = f2(netk) = f2( Σ_{j=0}^{m} vjk · yj + ak )   (3)
The error function is shown in formula (4):

E = (1/2) Σ_{k=1}^{l} (dk − ok)²   (4)

In formula (4), dk is the expected output of the kth node in the output layer. Substituting formulas (2) and (3) into formula (4), formulas (5) and (6) can be obtained:

E = (1/2) Σ_{k=1}^{l} [ dk − f2( Σ_{j=0}^{m} vjk · yj + ak ) ]²   (5)

E = (1/2) Σ_{k=1}^{l} [ dk − f2( Σ_{j=0}^{m} vjk · f1( Σ_{i=0}^{n} wij · xi + θj ) + ak ) ]²   (6)
According to formulas (5) and (6), the error function is determined by the weights and thresholds of each layer. Formula (7) is the output layer weight adjustment formula:

Δvjk = η Σ_{k=0}^{l} (dk − ok) · f2′(netk) · yj   (7)

Formula (8) is the output layer threshold adjustment formula:

Δak = η Σ_{k=0}^{l} (dk − ok) · f2′(netk)   (8)

Formula (9) is the hidden layer weight adjustment formula:

Δwij = η Σ_{k=0}^{l} (dk − ok) · f2′(netk) · vjk · f1′(netj) · xi   (9)

Formula (10) is the hidden layer threshold adjustment formula:

Δθj = η Σ_{k=0}^{l} (dk − ok) · f2′(netk) · vjk · f1′(netj)   (10)
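A per-sample sketch of these update rules is given below, with a sigmoid hidden layer and an identity (purelin with a = 1, b = 0) output layer, so f2′(netk) = 1 and f1′(netj) = yj(1 − yj). The network sizes in the usage example are illustrative; the sum over k in formulas (9)–(10) appears as the `back` term:

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_step(x, d, W, theta, V, a, eta=0.5):
    """Forward pass (formulas (2)-(3)) then one in-place update of V, a, W, theta
    following formulas (7)-(10). Returns the pre-update outputs o."""
    net_j = [sum(W[j][i] * x[i] for i in range(len(x))) + theta[j] for j in range(len(W))]
    y = [sigmoid(n) for n in net_j]
    o = [sum(V[k][j] * y[j] for j in range(len(y))) + a[k] for k in range(len(V))]
    delta = [d[k] - o[k] for k in range(len(o))]          # (d_k - o_k) * f2'(net_k), f2' = 1
    # hidden-layer backpropagated error, computed with the pre-update V
    back = [sum(delta[k] * V[k][j] for k in range(len(V))) for j in range(len(y))]
    for k in range(len(V)):                                # formulas (7) and (8)
        for j in range(len(y)):
            V[k][j] += eta * delta[k] * y[j]
        a[k] += eta * delta[k]
    for j in range(len(W)):                                # formulas (9) and (10)
        g = back[j] * y[j] * (1.0 - y[j])                  # f1'(net_j) = y_j (1 - y_j)
        for i in range(len(x)):
            W[j][i] += eta * g * x[i]
        theta[j] += eta * g
    return o


# Illustrative 1-input, 2-hidden, 1-output network trained toward a target of 0.5
W = [[0.1], [0.2]]
theta = [0.0, 0.0]
V = [[0.1, 0.1]]
a = [0.0]
for _ in range(300):
    o = train_step([1.0], [0.5], W, theta, V, a)
```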
Given the weights and thresholds of the hidden layer and of the output layer, the updated weights and thresholds can be calculated according to the above formulas. For convenience of discussion, it is assumed that the cluster has m cluster member nodes, so the corresponding BP neural network has m input neurons. The output node of the neural network corresponds to the cluster head node of the cluster. The selection of the number of hidden layer nodes in a neural network is not guided by a clear theory; in this paper, the "Cut-and-Try" method is used to determine the number of hidden layer neurons. Among the existing empirical formulas, n1 = √(n + m) + a can be used to determine the optimal number of hidden layer nodes, where n1 is the number of hidden layer nodes, m is the number of input neurons, n is the number of output neurons, and a is a natural number in [1, 10].

In order to accelerate the fitting speed of the BP neural network, the input layer data is normalized in the CM. The Min-Max normalization method is used in this paper, shown in formula (11):

x* = (x − min x) / (max x − min x)   (11)

The sigmoid function is the activation function between the input layer and the hidden layer, shown in formula (12):

sigmoid(z) = 1 / (1 + e^(−z))   (12)
The purelin function is the activation function between the hidden layer and the output layer: f(x) = a * x + b, where a and b are constants.

3.2 BP-APTEEN Process

Some parameters need to be determined before building the BP-APTEEN model. The parameters can be trained, learned, and adjusted by the neural network. Considering the limitations of energy, storage space, and other resources in a WSN, the parameter training of the BP-APTEEN model is carried out in the sink node. The BP-APTEEN process is as follows.
1. The cluster structure is formed by clustering according to the APTEEN protocol. The CH sends the CM information table to the sink node.
2. The sink node builds the BP neural network model based on the cluster structure and the CM information table.
3. The sink node gathers training samples matched with the cluster node information from its sample database. The corresponding neural network parameters are obtained by training.
4. The sink node sends the neural network parameters of the different layers to the corresponding CMs and CHs.
5. The CHs fuse the information sent by the CMs through the trained neural network model. Then, the CHs send several characteristic values that represent the valuable information to the sink node.

It is important to note that the data sent by a CM to its CH meets the hard threshold, soft threshold, and count time conditions. BP-APTEEN can reduce data transmission, reduce energy consumption, and prolong network life.
4 Simulation Analysis

Mean absolute error (MAE) is used to evaluate the fusion performance of the BP-APTEEN algorithm. The initial weights of the BP neural network are generated randomly. The MAE of eight cluster nodes under the BP-APTEEN algorithm is 0.1632. The accuracy performance of the BP-APTEEN algorithm is acceptable in most WSN applications. In order to evaluate the performance of BP-APTEEN, the simulation results were compared with APTEEN. The paper mainly analyzes the number of dead nodes and the average energy consumption of each node. The simulation system is composed of 200 wireless sensor nodes, randomly deployed in the monitoring area. Parameter settings of the simulation environment are given in Table 1.

Table 1 Simulation parameter table

Parameter              Value
Monitoring area range  100 m × 100 m
Sink node coordinates  (150 m, 150 m)
Eo                     0.1 J
Eelec                  50 nJ
Efs                    0.01 nJ
Emp                    0.0000013 nJ
EDA                    5 nJ
Simulation rounds      1000
Figure 2 is a comparison of number of dead nodes of BP-APTEEN and that of APTEEN protocol.
Fig. 2 Number of dead nodes
As Fig. 2 shows, the first node of BP-APTEEN dies at around 40 rounds, and the first dead node of BP-APTEEN appears later than that of the APTEEN protocol. Nodes of the APTEEN protocol die faster than those of BP-APTEEN: all APTEEN nodes are dead at about 280 rounds, whereas all BP-APTEEN nodes are dead at about 680 rounds. The results show that the network lifetime of BP-APTEEN is obviously longer than that of the APTEEN protocol. Figure 3 is a comparison of the average energy consumption of each node of BP-APTEEN and that of the APTEEN protocol.
Fig. 3 Average energy of each node
Figure 3 shows that the average energy of each node of the APTEEN protocol is consumed faster than that of the BP-APTEEN model. The average node runs out of energy at around 250 rounds in the APTEEN protocol, while the average energy of each node of the BP-APTEEN model runs out at about 630 rounds. It can be seen more intuitively
from Fig. 3 that the BP-APTEEN algorithm can reduce energy consumption, extend the network life, and improve network performance.
5 Conclusion

To address the problem of APTEEN protocol data fusion in WSN, this paper proposes the BP-APTEEN algorithm. In the process of data transmission from CM to CH, BP-APTEEN uses a BP neural network to conduct data fusion processing on the input data. By fusing the input data and discarding useless data, the network energy consumption can be reduced and the network life cycle prolonged. The simulation results show that the BP-APTEEN algorithm can reduce the network data transmission, reduce the network energy consumption, and improve the data collection efficiency.

Acknowledgements. Shubin Wang ([email protected]) is the corresponding author, and this work was supported by the National Natural Science Foundation of China (61761034).
References 1. Kocakulak M, Butun I (2017) An overview of wireless sensor networks towards internet of things. In: 2017 IEEE 7th annual computing and communication workshop and conference (CCWC), Las Vegas, NV, pp 1-6 2. Lara-Cueva RA, Gordillo R, Valencia V, Benítez DS (2017) Determining the main CSMA parameters for adequate performance of WSN for real-time volcano monitoring system applications. IEEE Sens J 17(5):1493–1502, 1 Mar 2017. https://doi.org/10.1109/jsen.2016.264 6218 3. Lin X (2015) Overview of wireless sensor networks and key technologies. Intell Comput Appl J (Chinese) 5(01): 81–83 4. Sun L, Huang X, Cai W, Xia M (2011) Data fusion algorithm for wireless sensor network based on neural network. J Sens Tech (Chinese) 24(01): 122–127 5. Wang H, Chang H, Zhao H, Yue Y (2017) Research on LEACH algorithm based on double cluster head cluster clustering and data fusion. In: 2017 IEEE international conference on mechatronics and automation (ICMA), Takamatsu, pp 342–346 6. Wang S, Zhao B, Li D, Du T (2019) Data fusion algorithm of wireless sensor based on combination between cluster head election improvement and neural network. In: 2019 Chinese Control Conference (CCC), Guangzhou, China, pp 6386–6391. https://doi.org/10.23919/ chicc.2019.8866153 7. Wang M, Debin X, Wang R, Du F, Shi Y (2013) Data mining research in wireless sensor network based on genetic BP algorithm. In: Proceedings of 2013 2nd international conference on measurement, information and control, Harbin, pp 243–247 8. Mittal M, Kumar K (2016) Data clustering in wireless sensor network implemented on selforganization feature map (SOFM) neural network. In: 2016 international conference on computing, communication and automation (ICCCA), Noida, pp 202–207. https://doi.org/10. 1109/ccaa.2016.7813718 9. Wang M, Wang S, Zhang B (2020) APTEEN routing protocol optimization in wireless sensor networks based on combination of genetic algorithms and fruit fly optimization algorithm. Ad Hoc Netw (prepublish)
10. Khan AR, Rakesh N, Bansal A, Chaudhary DK (2015) Comparative study of WSN protocols (LEACH, PEGASIS and TEEN). In: 2015 third international conference on image information processing (ICIIP), Waknaghat, pp 422–427 11. Sung W, Liu Y, Chen J, Chen C (2010) Enhance the efficient of WSN data fusion by neural networks training process. 2010 International symposium on computer, communication, control and automation (3CA), Tainan, pp 373–376
Realization of Target Tracking Technology for Generated Infrared Images Ge Changyun(B) and Zhang Haibei Department of Electronic Engineering, Dalian Neusoft University of Information, 116023 Liaoning, China [email protected]
Abstract. Target tracking is to determine the motion parameters of the target in each infrared image of a continuous sequence. In this paper, target tracking refers to obtaining the coordinate information of the target in the infrared image. In an infrared image tracking system, there are many methods to determine the target position. In this paper, traditional infrared tracking methods were applied, including weighted centroid tracking, centroid tracking, matching tracking, and Kalman tracking. The tracking performance of the infrared detector system was tested under different signal-to-noise ratios. Keywords: Generated image · Infrared image · Target tracking
1 Introduction

In practical applications of infrared target tracking, the infrared image only has grayscale information. Affected by the surrounding environment and climate, the target signal-to-noise ratio is relatively low and the noise is complex [1]. So, the infrared target tracking algorithm needs strong robustness and adaptability to complex noise [2]. The more classic infrared target tracking methods include the wave gate tracking method [3, 7], the template matching method [4], the optical flow method [5], the Kalman filter, the inter-frame correlation method [6], and the contour extraction method.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_97
1.1 Centroid Tracking and Weighted Centroid Tracking

The basic idea of centroid tracking is to obtain the center position of the target by processing and calculating the two-dimensional image. The centroid position estimate is calculated as follows:

x̄ = (1/n) Σ_{i=1}^{N} xi
ȳ = (1/n) Σ_{i=1}^{N} yi   (1)
In weighted centroid tracking, the position and likely size of the target in the current frame are predicted and estimated according to the position, size, and movement speed of the target in the previous frames. The weighted centroid position of the segmented target is calculated as follows:

x̄ = Σx Σy x · f(x, y) / Σx Σy f(x, y)
ȳ = Σx Σy y · f(x, y) / Σx Σy f(x, y)   (2)
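Both estimators can be sketched directly from formulas (1) and (2); the pixel coordinates and gray values in the test below are illustrative:

```python
def centroid(points):
    """Formula (1): unweighted centroid of the n segmented target pixels."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def weighted_centroid(pixels):
    """Formula (2): gray-value-weighted centroid; `pixels` maps (x, y) -> f(x, y)."""
    total = sum(pixels.values())
    cx = sum(x * f for (x, _), f in pixels.items()) / total
    cy = sum(y * f for (_, y), f in pixels.items()) / total
    return (cx, cy)
```

The weighted version pulls the estimate toward brighter pixels, which is what makes it more robust for dim targets on structured backgrounds.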
1.2 Match Tracking Algorithm Template matching was applied to infrared image tracking; that is, the template image slides inside the wave gate, and a correlation surface over the search region is obtained by computing the similarity between the image at each position and the template image. Correlation matching tracking uses:

$$R(x,y) = \frac{\sum_{x'} \sum_{y'} T(x',y') \cdot I(x+x',\, y+y')}{\sqrt{\sum_{x'} \sum_{y'} T(x',y')^2 \cdot \sum_{x'} \sum_{y'} I(x+x',\, y+y')^2}} \tag{3}$$

In the formula, R(x, y) is the correlation between the image at offset (x, y) and the template; T(x', y') and I(x + x', y + y') are, respectively, the gray values of the template and of the image at the corresponding coordinates. 1.3 Kalman Tracking The Kalman filter is an algorithm that uses the state sequence of a dynamic system to obtain a linear minimum-variance estimate. For a discrete signal, suppose the state at time t_k is X_k and the state transition is driven by noise W_{k-1}. The relationship can be described by the state equation:

$$X_k = F \cdot X_{k-1} + W_{k-1} \tag{4}$$
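A minimal constant-velocity Kalman tracker for one image axis can be sketched around Eq. (4). The measurement model H, the noise covariances Q and R, and the sample centroid sequence below are illustrative assumptions, not values from the paper:

```python
import numpy as np

# Constant-velocity model for one image axis: state [position, velocity].
F = np.array([[1.0, 1.0],
              [0.0, 1.0]])      # state transition of Eq. (4)
H = np.array([[1.0, 0.0]])      # only the centroid position is measured
Q = np.eye(2) * 1e-4            # process-noise covariance (assumed)
R = np.array([[0.25]])          # measurement-noise covariance (assumed)

def kalman_step(x, P, z):
    # Predict the next state and its covariance.
    x = F @ x
    P = F @ P @ F.T + Q
    # Correct with the measured centroid z.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([[z]]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

x, P = np.array([[0.0], [0.0]]), np.eye(2)
for z in [1.0, 2.1, 2.9, 4.0, 5.1]:   # noisy centroids drifting ~1 px/frame
    x, P = kalman_step(x, P, z)
```

Each frame alternates a prediction through F with a correction by the measured centroid, so the filter maintains a position and velocity estimate even through noisy measurements.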
G. Changyun and Z. Haibei
2 Infrared Target Tracking Experiment The infrared target tracking experiments cover commonly used infrared tracking methods, including weighted centroid tracking, centroid tracking, squared-difference matching tracking, correlation matching tracking and Kalman tracking. In the experiments, the infrared image generated by the infrared generation module is used as the input. The target moves linearly and grows from small to large, and the tracked target is framed by a box. Weighted centroid tracking: The input is a picture sequence of a land-background tank target with a signal-to-noise ratio of 6.7–6.9 dB. The tracking is very stable, as shown in Fig. 1.
Fig. 1 Weighted centroid tracking
Centroid tracking: The input is the infrared image sequence of a ship-background target with a signal-to-noise ratio of 4.8–5.0 dB. The tracking is very stable, as shown in Fig. 2.
Fig. 2 Centroid tracking
Correlation matching tracking: The input is the infrared image sequence of the land-background tank target with a signal-to-noise ratio of 9.9–10 dB. Correlation matching tracking is very stable, as shown in Fig. 3. Kalman tracking: The input is the infrared image sequence of a sky-background aircraft target with a signal-to-noise ratio of 4.2–4.3 dB. The tracking effect is very stable, as shown in Fig. 4.
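The correlation matching of Eq. (3) used here can be sketched as an exhaustive normalized cross-correlation inside the wave gate. This is an illustrative brute-force implementation, not the authors' code; a real system would restrict the search region and vectorize:

```python
import numpy as np

def ncc(template, window):
    """Normalized cross-correlation of Eq. (3) at a single offset."""
    t = np.asarray(template, dtype=float)
    w = np.asarray(window, dtype=float)
    return (t * w).sum() / np.sqrt((t * t).sum() * (w * w).sum())

def match(template, image):
    """Slide the template over the image (the wave gate) and return
    the offset (x, y) with the highest correlation R, plus R itself."""
    th, tw = template.shape
    ih, iw = image.shape
    best, best_xy = -1.0, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            r = ncc(template, image[y:y + th, x:x + tw])
            if r > best:
                best, best_xy = r, (x, y)
    return best_xy, best
```

By the Cauchy-Schwarz inequality R reaches 1 only where the window is proportional to the template, so the peak of the correlation surface marks the target position.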
Fig. 3 Correlation matching tracking

Fig. 4 Kalman tracking

3 Conclusions Through experiments, the various infrared target tracking technologies presented good robustness on the generated infrared images. The tracking methods are able to move with the target without losing it. When the size of the target changes, the size of the wave gate can be adjusted accordingly, and the center position of the target is marked. The tracking methods applied in this article are nevertheless limited: they adapt well to the generated infrared images but may fail in tracking infrared images with complex backgrounds, where excellent infrared image processing technology is needed. Only by preprocessing the collected infrared image sequence can good results be ensured. Weighted centroid tracking and centroid tracking are suitable for infrared images with a pure background and obvious targets; matching tracking has good adaptability but requires a large number of matching templates to ensure that each target can be well identified and tracked; Kalman tracking requires good image preprocessing to locate the target and make predictions, but it effectively reduces the amount of calculation in tracking. Acknowledgments. This work is supported by Research and Development of Concentration Device based on Brain Science and Machine Vision, Dalian Youth Science and Technology Star 2019.
References 1. Ge Changyun (2015) Design of infrared target tracking simulation platform. Harbin Engineering University, Harbin 2. Comaniciu D, Ramesh V, Meer P (2003) Kernel-based object tracking. IEEE Trans Patt Anal Mach Intell 25(2):564–577 3. Kalman RE (1960) A new approach to linear filtering and prediction problems. Trans ASME-J Basic Eng 82(D):35–45
4. Yang H-Y, Zhang G-L (2009) Design and realization of a new correlation tracker algorithm. J Infr Millim Waves 19(5):377–380 5. Li X-Y, Ni G-Q (2002) Optical flow computation of infrared image. Infr Laser Eng 31(3):189–193 6. Wang W-Y, Ding X-M, Huang X-D et al (2007) A new method for small target detection and tracking. J Opto-Electr Laser 18(1):121–124 7. Xu T-F, Ni G-Q (2006) Target recognition based on log Gabor wavelet phase congruency feature invariant. J Opto-Electr Laser 17(2):222–225
Review on Wearable Antenna Design Licheng Yang1,3 , Tianyu Liu1,2 , Qiaomei Hao1,3 , Xiaonan Zhao1,3 , Cheng Wang1,4 , Bo Zhang1,3 , and Yang Li1,3(B) 1 Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin
Normal University, 300387 Tianjin, China [email protected], [email protected], [email protected], [email protected], {cwang,b.zhangintj}@tjnu.edu.cn, [email protected] 2 Department of Electronic Science and Technology, College of Electronic and Communication Engineering, Tianjin Normal University, 300387 Tianjin, China 3 Department of Communication Engineering, College of Computer and Information Engineering, Tianjin Normal University, 300387 Tianjin, China 4 Department of Artificial Intelligence, College of Electronic and Communication Engineering, Tianjin Normal University, 300387 Tianjin, China
Abstract. With the rapid popularization of 5th-generation mobile networks (5G), wearable devices have been widely applied in military communications, medical care and entertainment. After the initial development of the wearable form, more implantable and attached wearable devices have been developed. Progress in wearable antennas has an important impact on the development of wearable devices. This paper discusses the important performance indexes and new developments of the wearable antenna, introduces research results on the wearable antenna in recent years, and analyzes its development trend. Keywords: Wearable antenna · Flexible fabric antenna · Antenna array · 5G communication
1 Introduction The 5th-generation mobile communication (5G) aims to realize communications with low delay, high speed and large capacity, and to open the Internet of things era. Commercial 5G products will also move beyond mobile phones and computers, and more artificial intelligence products will appear in our daily life. New smart products are spreading rapidly, becoming an important entrance to the Internet of things [1]. Among them, wearable applications are the critical entrance to the mobile Internet and Internet of things, and the key to connecting people and their communication devices. In recent years, driven by intelligent medical wearables, wearable applications have gradually entered people's
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_98
daily life, and new wearable products and devices will appear with the application of 5G communication networks [2]. As essential components of communication devices, antennas are passive devices used for sending and receiving radio frequency (RF) signals, and they determine the communication quality, signal power, signal bandwidth, connection speed and other communication indicators. In wearable devices, wearable antennas play a decisive role in information transmission, directly affecting the performance of the whole system. Therefore, the development of applications based on wearable antennas is of great importance. The application of wearable antennas is not limited to antennas worn by humans; it can cover military, medical, health care, games, entertainment, music, fashion, transportation, education, and even intelligent wearable devices for animals and pets. The earliest wearable antenna can be traced back to military soldier combat communication equipment. The wearable antenna frees soldiers from the disadvantages of the traditional whip antenna, which cannot be conformal with the human body and has poor concealment. As shown in Fig. 1, in order to improve the portability and concealment of the wearable antenna and achieve conformality with the human body, the antenna is usually integrated into carriers such as clothes, hats, belts and shoes. The high-performance wearable antenna enables soldiers to wear a portable wireless communication system to send real-time feedback and receive instructions, thus improving their combat capability and battlefield information perception.
Fig. 1 Military wearable antenna
Medical and health care is also a field of wearable antenna application. With the development and promotion of IEEE 802.15.6, wearable intelligent terminals can monitor human physiological parameters, realize data sharing and communicate wirelessly with other intelligent devices. A health monitoring network is displayed in Fig. 2. Users' physiological parameters such as blood pressure, body temperature and heart rate can be transmitted to the medical monitoring center or a personal mobile terminal with the help of wearable devices, so as to facilitate real-time monitoring and analysis of users' health status and provide a scientific basis for medical treatment.
Antenna design is one of the core components of this health care system. Aiming at health application demands, it is of great scientific significance and application value to investigate the design and implementation methods of new flexible wearable antennas for wearable wireless communication systems.
Fig. 2 The schematic diagram of health monitoring network was built through the wearable antenna
In addition, the wearable antenna also has extensive value in the breeding industry, animal husbandry and the pet industry. Wearable antennas used in animal husbandry and breeding can monitor the growth and health of livestock in real time during animal flu outbreaks. The wearable antenna device can transmit animal health conditions to a terminal so that an influenza epidemic can be treated and controlled at the earliest time, minimizing the harm. In the pet industry, wearable devices carried on collars or implanted under the skin can not only locate pets in real time and prevent their loss, but also set daily activity targets for pets and attend to their healthy growth; by docking with the cloud and nearby veterinarians, regular pet health checks and pet health care can be provided.
2 Critical Parameters of Wearable Antenna As devices that radiate and receive electromagnetic waves, wearable antennas shall meet requirements on return loss, bandwidth, standing wave ratio, gain, radiation pattern, etc. Compared
to traditional antennas, wearable antennas must also offer the following features: small volume, low maintenance, high robustness, low cost and easy integration. It is also necessary to study the material and structure of the wearable antenna, to reduce the influence of the human body on antenna transmission, and to reduce the radiation of the antenna to the human body. 2.1 Material Properties The conformal characteristics of the wearable antenna put forward new requirements for the material characteristics of the antenna; for example, conductive fiber and embroidery cloth replace the traditional rigid substrate. The electrical parameters of the material should also be considered when selecting it. The permittivity of a material is an important parameter: a fabric with low permittivity can reduce the surface wave loss and increase the impedance bandwidth of the antenna. For example, important parameters such as the permittivity, thickness and loss tangent of the substrate were determined by simulating the textile antenna as a real fabric structure [3]. In addition, water will also affect the dielectric properties of antennas: when water is absorbed by the fiber, the permittivity and the loss tangent (tan δ) increase [4]. Another factor affecting electrical conductivity is the flow of current through the fabric structure; to reduce conductivity loss, it is better to keep the conductive path consistent with the current direction. High-performance wearable antennas can be achieved by selecting a suitable material and substrate structure. 2.2 Antenna Efficiency Improving the efficiency of the wearable antenna has positive significance for the efficiency of portable communication devices. Power technology is still a challenge for wearable devices.
Although wearable electronic devices have made considerable progress, lithium batteries, as the power source of wearable devices, are limited in weight and small in capacity, and they still cannot supply power to wearable devices for a long time. The ways to improve the efficiency of the antenna are to increase its radiation resistance and reduce its loss resistance. During the design of the wearable antenna, a substrate with an appropriate dielectric constant can be selected to improve the efficiency of the wearable antenna and reduce the loss [5]. 2.3 Antenna Robustness The robustness of the antenna refers to maintaining its performance when parameters such as structure and size change. Wearable devices need to be worn on humans or animals for a long time, and the movements of users deform the spatial geometry of the antenna and affect its performance. For example, when a fabric antenna [6] is conformal with the human body, it will bend and deform, changing its electromagnetic characteristics and affecting the antenna performance. Generally, by increasing the bandwidth of the wearable antenna in the design, the robustness of the antenna is improved so that it is less subject to body interference [7].
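The influence of substrate permittivity discussed in Sect. 2.1 can be made concrete with the textbook transmission-line design equations for a rectangular microstrip patch. This is a sketch under assumed values: the denim-like permittivity of 1.7 and the 1 mm thickness are illustrative, not taken from the cited papers:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def patch_dimensions(f0, eps_r, h):
    """Rectangular microstrip patch width and length (m) from the
    standard transmission-line design equations, for resonant
    frequency f0 (Hz), substrate permittivity eps_r and thickness h (m)."""
    w = C / (2 * f0) * math.sqrt(2 / (eps_r + 1))
    eps_eff = (eps_r + 1) / 2 + (eps_r - 1) / 2 / math.sqrt(1 + 12 * h / w)
    # Fringing-field length extension on each radiating edge.
    dl = 0.412 * h * ((eps_eff + 0.3) * (w / h + 0.264)
                      / ((eps_eff - 0.258) * (w / h + 0.8)))
    l = C / (2 * f0 * math.sqrt(eps_eff)) - 2 * dl
    return w, l

# Denim-like substrate (assumed eps_r = 1.7, 1 mm thick), 2.45 GHz ISM band.
w, l = patch_dimensions(2.45e9, 1.7, 1e-3)
```

Lowering the permittivity enlarges the patch but weakens surface waves, which is the trade-off behind choosing low-permittivity fabrics as substrates.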
2.4 Specific Absorption Rate (SAR) The specific absorption rate represents the rate at which radiated power from the antenna is absorbed by the body. The wearable antenna is worn by users for a long time, so the electromagnetic radiation of the antenna needs to comply with the prescribed safety standards. Currently, there are two international standards: the IEEE standard of 1.6 W/kg averaged over 1 g of human tissue, and the 2 W/kg per 10 g of tissue designated by the International Commission on Non-Ionizing Radiation Protection (ICNIRP). In [8, 9], the specific absorption rate (SAR) of the wearable textile antenna with and without human-antenna isolation was compared in a homogeneous model, and the peak SAR and the 1 g and 10 g SAR distributions of the wearable antenna on different observation planes were presented, verifying the importance of low SAR for human safety.
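The local SAR behind these limits is SAR = σE²/ρ. A small numeric sketch follows; the tissue values are illustrative round numbers, and note that the regulatory limits apply to SAR averaged over 1 g or 10 g of tissue, not to point values:

```python
def sar(sigma, e_rms, rho):
    """Local specific absorption rate in W/kg.
    sigma: tissue conductivity (S/m), e_rms: RMS E-field in the
    tissue (V/m), rho: tissue mass density (kg/m^3)."""
    return sigma * e_rms ** 2 / rho

IEEE_1G_LIMIT = 1.6      # W/kg averaged over 1 g of tissue (IEEE)
ICNIRP_10G_LIMIT = 2.0   # W/kg averaged over 10 g of tissue (ICNIRP)

# Illustrative (assumed) muscle-like tissue values around 2.45 GHz.
s = sar(sigma=1.74, e_rms=30.0, rho=1040.0)
```

In a design loop such a check would be applied to simulated field values from a human body model, flagging any averaging volume that exceeds the applicable limit.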
3 New Progresses in Wearable Antennas In recent years, many scholars have carried out research on wearable antennas and improved their performance by designing new structures and applying new materials for different application scenarios. The conductivity, flexibility, biocompatibility, mechanical resistance and washability of the antennas all depend on their structure and materials. Planar structures, flexible conductive materials and dielectric materials meet the requirements of the wearable antenna. From the published results, research on flexible wearable antennas proceeds, on the one hand, from conductive fabric, weaving flexible fabric antennas of different structures on a fabric matrix; on the other hand, from antenna arrays, designed as belt, vest or helmet arrays on carriers such as clothing. 3.1 Wearable Antenna Based on Conductive Fabric Flexible conductive fabric antennas can usually be divided into two categories. One is based on the textile or embroidery process, in which conductive fiber is sewn onto the fabric matrix according to the structure of the antenna. The other is based on the ink-jet printing/coating process, fabricating the flexible wearable fabric antenna by printing nano conductive particles or coating conductive polymers on the flexible fabric matrix according to the antenna structure. 3.1.1 Textile Wearable Antenna The common method is to implant small wearable antennas into textiles. Denim is a common clothing fabric in daily life and is often used as a substrate due to its low permittivity and low loss tangent, resulting in higher bandwidth and lower power loss. In [10], K. Wang and J. Li proposed a compact dual-band textile wearable antenna and selected denim material and copper strip as the antenna substrate and radiating element.
The wearable antenna was integrated while retaining the original textile, and its return loss was measured on the human body. Reference [11] presents a new wearable textile antenna that can work in an ISM band; it has little influence on the human body and is easy to fabricate on 42 × 13 mm² of textile. As shown in Fig. 3, a smart patch
wearable antenna integrated on jeans realizes wireless communication for the human body through its omnidirectional radiation pattern, but practical clothing issues such as durability remain a big problem.
Fig. 3 Denim wearable antenna model
In [12], after studying the practicability of flexible conductive fabric for wearable antennas, A. Di Natale proposed a reconfigurable wearable ultra-wideband (UWB) antenna made of denim cloth. Its wearing comfort and easy integration in clothing are advantages for wearable realization, and the reconfigurable topology is the highlight: the radiation pattern is omnidirectional in the monopole topology and directional in the microstrip topology. The disadvantage is that the radiation pattern of the microstrip topology loses symmetry at higher frequencies and needs to be improved. In [13], Agbor et al. carried out further research on fabric patch antennas, comparing microstrip antennas made of three different electro-textile materials: ShieldIt Super, Cobaltex and copper polyester taffeta. Return-loss measurements showed that copper polyester taffeta has the highest conductivity of the three and performs much better than the ShieldIt and Cobaltex electronic textiles. This research helps narrow the range of options for electronic textiles, which have promising applications in wearable antennas for biomedicine. 3.1.2 Ink-Jet or Coating Antenna The outer surface of a flexible wearable antenna made by the ink-jet or coating process has a uniform coating. Due to the good electrical conductivity of the coating, the key
properties of the fabric, such as density, flexibility and hand feel, do not change significantly. Printing conductive patterns directly onto fabric to make antennas is a universal technique, but it has limitations: most conductive inks and pastes are silver-based and brittle. Because higher ink viscosity gives a better short-term impedance, this process has become a distinctive feature of this kind of wearable antenna. H. Kao et al. used silver ink to ink-jet print a filtering antenna on a textile substrate for wearable electronic products, which demonstrated the completeness of ink-jet printing technology combining wearable and printed electronic applications [14]. As shown in Fig. 4, C. Baytöre utilized the flexible and thin structure of paper substrates: commercial printers, together with conductive silver (Ag) nano inks, use inkjet printing technology to fabricate dual-band, coplanar, flexible antenna circuits [15], solving the problem of realizing the antenna geometry within acceptable accuracy errors. The economy, flexibility and manufacturability of this wearable antenna are obvious advantages, but under bending perturbations the resonant frequency of the antenna deviates slightly.
Fig. 4 Flexible antenna model of paper substrate
Millimeter-wave (mm-wave) communication networks are expected to make significant progress in addressing congestion, bandwidth limitation and channel capacity limitation in current wireless systems [16, 17]. 5G communication is expected to be a highly intensive, diverse, multi-purpose, unified technology with additional common bandwidth available, which can be upgraded almost indefinitely [18]. Millimeter-wave flexible antenna design on polyethylene terephthalate (PET) substrates opens the front end of the market for fifth-generation (5G) wireless applications. In [19], S. F. Jilani proposed a flexible ink-jet printed antenna based on a low-cost PET substrate with bandwidth covering the Ka band. As shown in Fig. 5, a geometrically simple antenna design is proposed, using a T-shaped patch antenna and a right-angle slot component to generate multiple resonances based on the defected ground structure (DGS) concept. The impedance bandwidth of the antenna is 26–40 GHz. The measurement results confirm an omnidirectional and quite consistent radiation pattern over the operating bandwidth. Over the whole working range, the gain of the antenna is above 4 dBi; covering the high bandwidth, omnidirectional radiation and reasonable gain performance of the 5G frequency band, the antenna can be properly integrated in wearable applications to meet the requirements of 5G networks.
Fig. 5 T-type patch antenna model
In [20], S. F. Jilani proposed an ink-jet flexible Ka-band antenna, aiming to provide the 5G network with wearable antennas of high bandwidth, high consistency, high cost performance and high gain. The antenna geometry combines the advantages of the Koch curve and the DGS in adding resonant bands, providing high bandwidth and gain due to the increase in the effective radiation area within the same package area. An ink-jet printing process was used to fabricate the antenna on the PET substrate, and the desired flexibility was obtained. The antenna provides the complete Ka-band bandwidth (26.5–40 GHz) with an efficiency of more than 80%. The wearable antenna's inkjet technology is gradually maturing and is contributing to 5G networks. 3.2 Wearable Antenna Array In practical applications, the antenna is often required to have characteristics such as high gain, high power, low side lobes, and beam scanning or beam control. Compared with a single antenna, an antenna array is more likely to meet these requirements; therefore, array technology has been the focus of many researchers. Considering wearable requirements, array antennas need to be conformal with a carrier such as clothes or ornaments. The application of conformal antenna arrays can not only meet the requirements of the wearable antenna but also improve its performance. In recent years, the miniaturization of antenna arrays has also developed. B. R. Sanjeeva Reddy applied traditional antenna array technology to jeans fabric and designed a wearable patch antenna array for military wireless applications [21]. This array-based model solves the problem of degradation of the radiation pattern in the working band of 5.3 GHz wireless LAN applications and can be extended to complex designs embedded on textile substrates to achieve high gain. In the medical field, the application of wearable antenna arrays has also gained more development. In [22], A. S. M. Alqadami designed a 24-element antenna array based on magnetic substrate materials, with wideband, unidirectional, low-profile and stable characteristics, and configured it as a wearable antenna on a PDMS-Fe3O4 substrate. The flexible magnetic antenna array
is applied to a wearable electromagnetic imaging system for intracerebral hemorrhage, which can detect the possibility of intracerebral hemorrhage. Wearable antenna arrays have gradually penetrated daily life. In millimeter-wave communication, due to atmospheric absorption, the free-space path loss between receiver and transmitter is very large; to compensate for the high path loss, a high-gain array antenna is usually used. Millimeter-wave communication utilizes a wide unlicensed band of approximately 60 GHz (57–64 GHz), which is of interest because it can accommodate data rates higher than 1 Gbps. When a 60 GHz array antenna is used in a wearable environment, the highly directional beam can have harmful health effects on the human body. In [23], Y. Hong and J. Choi proposed a 60 GHz patch antenna array with parasitic elements for smart glasses, which has a fan-beam radiation pattern and wide radiation coverage, making it a good candidate for 60 GHz wearable applications.
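To see why 60 GHz links call for high-gain arrays, the free-space path loss FSPL = 20 log10(4πdf/c) can be evaluated for a short on-body link. This is a hedged numeric sketch; the 1 m distance and the comparison frequency are illustrative:

```python
import math

C = 299_792_458.0  # speed of light, m/s

def fspl_db(d_m, f_hz):
    """Free-space path loss in dB for distance d_m (m), frequency f_hz (Hz)."""
    return 20 * math.log10(4 * math.pi * d_m * f_hz / C)

loss_60ghz = fspl_db(1.0, 60e9)   # 1 m link in the 60 GHz band
loss_ism = fspl_db(1.0, 2.45e9)   # same link at 2.45 GHz for comparison
```

Even before atmospheric absorption, the 1 m 60 GHz link loses roughly 28 dB more than the same link at 2.45 GHz (20 log10 of the frequency ratio), which is the gap that array gain has to close.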
4 The Development Trend of Wearable Antennas With the increase of wearable products and users' demand for functions such as networking, communication and data sharing of wearable terminal devices, the new wearable antenna has become one of the research hotspots. Wearable electronic products are developing toward light, portable, small and attractive designs. Designing high-performance conformal antennas integrated in the tight hardware space, and researching the characteristics of flexible antennas during wear and the influence of radiation on human tissue, are of great scientific significance. Due to the complexity of wearable antenna applications, such as indoor and other obstructed environments, the characteristics of the transmission channel of the wearable antenna in complex environments are still a focus of research. In the future, the following performance aspects of the wearable antenna still need to be improved. 4.1 Combine with 5G Communications With the large-scale application of 5G mobile communication technology, wearable devices will flourish, and the combination of the wearable antenna and 5G communication will become closer. Researchers need to explore relevant technologies in accordance with the new features of 5G mobile communication, such as high data rate, low delay, low energy consumption, low cost, high system capacity and large-scale device connection, and carry out targeted designs of wearable antennas to meet the needs of military, medical, health care, entertainment and other application scenarios. 4.2 Miniaturization, Lightweight and Simple Process The ideal state of a wearable antenna is not to be perceived. When worn, the wearable antenna should be as small, light and portable as possible. When the antenna is integrated into clothes, the comfort of the original items should be considered, and the comfort experience of the wearer should not be compromised.
Recently, more research has been devoted to the implantable antenna, which raises new requirements for its safety,
convenience and miniaturization. In the era of 5G communication, the wearable antenna will be applied in more scenarios. A simple process is not only the basis of mass production but also of low cost, which will make wearable devices more popular in the civilian field. 4.3 Antenna Array Compared with a single wearable antenna, an antenna array possesses better directional characteristics. Research on the application of antenna arrays in wearable systems and on improving the wireless communication performance of wearable devices is also a hotspot of wearable antenna research. The increasing communication bandwidth is conducive to the miniaturization of wearable antennas and the integration of more antenna array elements. For video, image and other large-capacity data transmission, large-scale Multi-Input Multi-Output (MIMO) technology can effectively improve the channel capacity and spectrum utilization to meet the system's demand for high-rate data transmission. 4.4 High Robustness and Resistance to Multipath Attenuation Due to the influence of human activities or the external environment, the working frequency of the wearable antenna will be affected: bending or distortion causes a deviation of the working frequency, so that the antenna cannot work normally. In changeable application scenarios, facing interference from other radio waves, changeable weather and other unexpected situations, more and more radio systems require the wearable antenna to have multi-band, high-rate, low-power, low-complexity and anti-multipath-attenuation characteristics, which is also a focus of many researchers. 4.5 Low SAR Because the wearable antenna is worn on the human body, the communication performance indexes should be achieved while the potential impact of electromagnetic radiation on human health is considered.
Given the relative permittivity and loss tangent of different tissues in the human body and the complexity of its structure, a non-uniform human model can be used to simulate the influence of electromagnetic waves on the human body when designing the antenna, so as to improve the accuracy and reliability of the results.
5 Conclusion Wearable devices will continue to develop at high speed in the next few years. The traditional communication antenna will be further planarized, miniaturized, made lightweight, easy to fabricate, and less susceptible to the human body and other environmental influences. Multi-frequency and multi-mode designs will be carried out to make it more suitable for wearable devices and 5G communication networks. With the development of the
Internet, wearable antenna will be more widely used in all walks of life. Implantable antenna, flexible fabric antenna and small antenna array are still the research hotspots of wearable antenna in the future. Acknowledgements. This work was supported by the Tianjin Higher Education Creative Team Funds Program; Tianjin Municipal Natural Science Foundation (No. 19JCQNJC01300; No. 18JCYBJC86400); Doctor Fund of Tianjin Normal University (No. 52XB1905). The authors would like to thank Professor Qiang Chen at Tohoku University for allowing us to use the computer with the electromagnetic software installed in his lab.
References 1. Al-Dulaimi A, Wang XCI (2018) Machine-type communication in the 5G era: massive and ultrareliable connectivity forces of evolution, revolution, and complementarity. IEEE, pp 519–542 2. Sharma D, Dubey SK, Ojha VN (2018) Wearable antenna for millimeter wave 5G communications. In: 2018 IEEE Indian conference on antennas and propagation (InCAP), Hyderabad, India, pp 1–4 3. Zaidi NI, Ali MT, Rahman NHA, Yahya MF, Nordin MSA, Shah AASA (2018) Accurate simulation of fabric and analysis of antenna performance on different substrate materials. In: 2018 IEEE international RF and microwave conference (RFM), Penang, Malaysia, pp 77–80 4. Salvado R (2012) Textile materials for the design of wearable antennas: a survey. Sensors 12(11) 5. Potey PM, Tuckley K (2018) Design of wearable textile antenna with various substrate and investigation on fabric selection. In: 2018 3rd international conference on microwave and photonics (ICMAP), Dhanbad, pp 1–2 6. Usha P, Nagamani K (2018) Design, simulation and implementation of UWB wearable antenna. In: 2018 3rd IEEE international conference on recent trends in electronics, information and communication technology (RTEICT), Bangalore, India, pp 1041–1044 7. Casula GA (2019) A numerical study on the robustness of ultrawide band wearable antennas with respect to the human body proximity. In: 2019 IEEE international conference on RFID technology and applications (RFID-TA), Pisa, Italy, pp 227–230 8. Sabban A (2019) Small new wearable antennas for IOT, medical and sport applications. In: 2019 13th European conference on antennas and propagation (EuCAP), Krakow, Poland, pp 1–5 9. Paolini G, Masotti D, Costanzo A (2018) Simulated effects of specific absorption rate and thermal variations on keratinocytes and epidermis exposed to radio-frequency. In: 2018 EMF-Med 1st world conference on biomedical applications of electromagnetic fields (EMF-Med), Split, pp 1–2 10. Wang K, Li J (2018) Jeans textile antenna for smart wearable antenna.
In: 2018 12th international symposium on antennas, propagation and EM theory (ISAPE), Hangzhou, China, pp 1–3 11. Li S, Li J (2018) Smart patch wearable antenna on Jeans textile for body wireless communication. In: 2018 12th international symposium on antennas, propagation and EM theory (ISAPE), Hangzhou, China, pp 1–4 12. Di Natale A, Di Giampaolo E (2019) UWB reversible structure all-textile antenna for wireless body area networks applications. In: 2019 photonics and electromagnetics research symposium, spring (PIERS-Spring), Rome, Italy, pp 566–569
742
L. Yang et al.
13. Agbor I, Biswas DK, Mahbub I (2018) A comprehensive analysis of various electro-textile materials for wearable antenna applications. In: 2018 texas symposium on wireless and microwave circuits and systems (WMCS), Waco, TX, pp 1–4 14. Kao H, Chuang C, Cho C (2019) Inkjet-printed filtering antenna on a textile for wearable applications. In: 2019 IEEE 69th electronic components and technology conference (ECTC), Las Vegas, NV, USA, pp 258–263 15. Baytöre C, Zoral EY, Göçen C, Palandöken M, Kaya A (2018) Coplanar flexible antenna design using conductive silver nano ink on paper substrate for wearable antenna applications. In: 2018 28th international conference Radioelektronika (RADIOELEKTRONIKA), Prague, pp 1–6 16. Liu D et al (2017) What will 5G antennas and propagation be? IEEE Trans Antenn Propag 65(12):6205–6212 17. Rappaport TS et al (2017) Overview of millimeter wave communications for fifth-generation (5G) wireless networks—with a focus on propagation models. IEEE Trans Antenn Propag 65(12):6213–6230 18. Rappaport TS (2013) Millimeter-wave mobile communications for 5G cellular: it will work! IEEE Access, pp 335–349 19. Jilani SF, Abbasi QH, Alomainy A (2018) Inkjet-printed millimetre-wave PET-based flexible antenna for 5G wireless applications. In: 2018 IEEE MTT-S international microwave workshop series on 5G hardware and system technologies (IMWS-5G), Dublin, pp 1–3 20. Jilani SF, Aziz AK, Abbasi QH, Alomainy A (2018) Ka-band flexible koch fractal antenna with defected ground structure for 5G wearable and conformal applications. In: 2018 IEEE 29th annual international symposium on personal, indoor and mobile radio communications (PIMRC), Bologna, pp 361–364 21. Sanjeeva Reddy BR, Vakula D, Amulya Kumar A (2018) Performance analysis of wearable antenna array for WLAN applications. In: 2018 IEEE Indian conference on antennas and propagation (InCAP), Hyderabad, India, pp 1–4 22. 
Alqadami ASM, Stancombe AE, Nguyen-Trong N, Bialkowski K and Abbosh A (2019) Wearable electromagnetic head imaging using magnetic-based antenna arrays. In: 2019 IEEE international symposium on antennas and propagation and USNC-URSI radio science meeting, Atlanta, GA, USA, pp 519–520 23. Hong Y, Choi J (2018) 60 GHz patch antenna array with parasitic elements for smart glasses. IEEE Antenn Wirel Propag Lett 17(7):1252–1256
A Review of Robust Cost Functions for M-Estimation

Yue Wang1,2(B)

1 Key Laboratory of Electronics and Information Technology for Space Systems, Chinese Academy of Sciences, Beijing 100190, China
2 Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, China
[email protected]
Abstract. Robust state estimation plays a key role in mobile robotic navigation, and the M-estimation technique can effectively handle outliers. In this paper, the commonly used robust cost functions for M-estimation are given, and their cost, influence, and weight functions are summarized and compared.
1 Introduction
State estimation plays a key role in mobile robotics [1, 2]. In particular, high-performance position estimation (i.e., localization) is essential to robotic navigation [3], autonomous driving [4], unmanned aerial vehicles (UAVs) [5], and augmented reality (AR) applications [6]. Simultaneous localization and mapping (SLAM) [7–9] is a promising technique in the field of robotic localization, especially using cameras as sensors due to their inexpensive, effective, and ubiquitous nature [10–13]. In reality, there will be outliers in the measurements. For example, multipath reflections of GPS signals give longer range measurements when tall buildings block the line-of-sight path [1]. For the visual-based SLAM (V-SLAM) technique, mis-associations of features between a pair of images due to changing lighting or textureless environments also produce outliers [14]. Therefore, we need robust estimation techniques to address the outlier problem [15, 16]. The random sample consensus (RANSAC) [17] technique is usually adopted to handle mis-association errors in V-SLAM applications, but it is sensitive to the inlier threshold and requires enough measurements to uniquely solve a simple model. Therefore, RANSAC performs poorly on datasets with sparse measurements and frequent dropouts [14]. M-estimation [18] is an alternative technique to handle outliers; 'M' stands for 'maximum likelihood-type,' a generalization of maximum likelihood. It uses robust cost functions to decrease the influence of outliers, and there are about a dozen commonly used robust cost functions to choose from.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_99

Zhang [19] gives
a tutorial introduction to them, but some relatively new robust cost functions are not included. In [14], several robust cost functions are compared on real data for visual localization. In this paper, a review of robust cost functions for M-estimation is given; their cost, influence, and weight functions are listed and compared visually. The rest of the paper is organized as follows. Section 2 introduces robust cost functions. Section 3 gives a list and comparison of their cost, influence, and weight functions. A conclusion in Sect. 4 wraps up this paper.
2 Robust Cost Functions
For linear and nonlinear Gaussian batch estimation problems, the maximum a posteriori (MAP) or maximum-likelihood (ML) objective/cost function is given in the form [1, 14]

J(x) = Σ_{k=1}^{K} J_k(x) = Σ_{k=1}^{K} (1/2)·r_k(x)² = Σ_{k=1}^{K} (1/2)·‖y_k − f_k(x)‖²_{W_k} = Σ_{k=1}^{K} (1/2)·e_k(x)ᵀ W_k⁻¹ e_k(x),   (1)
where K is the total number of measurements and x is the state to be estimated. For the kth measurement, r_k(x) is the error expressed as a Mahalanobis distance, y_k is the measurement vector, f_k(·) is the measurement model, W_k is the measurement covariance, and e_k is the residual error vector. The estimated result is given by

x* = argmin_x J(x).   (2)
The sum-of-squared-error cost function in (1) is highly sensitive to outliers: they dominate the quadratic cost and therefore have a huge effect on the minimization, so the estimated result is distorted. M-estimation tries to reduce the effect of outliers by generalizing the cost function in (1) as

J_k(x) = ρ(r_k(x)),   (3)
where ρ(·) is a symmetric, positive-definite function with a unique minimum at zero. Therefore, all the generalized cost functions have the same minimization result, i.e., the optimal estimated state x*. The quadratic cost function in (1) corresponds to the L2 norm, i.e., ρ(x) = x²/2. Instead of directly solving the minimization problem in (2), we can iteratively solve the problem with the robust cost function in (3) using the Gauss–Newton (G–N), Levenberg–Marquardt (L–M), or Dog-Leg methods [1, 13, 20]. The gradient of the cost function is

∂J_k(x)/∂x = [∂ρ(r_k(x))/∂r_k(x)] · [∂r_k(x)/∂x] = ψ(r_k(x)) · ∂r_k(x)/∂x,   (4)
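As an illustration (not part of the paper), the Huber estimator's cost ρ, influence ψ, and weight w from Table 1 can be coded directly; the bounded ψ is what limits the pull of each outlier:

```python
import numpy as np

def huber_rho(x, k=1.345):
    """Huber cost: quadratic near zero, linear in the tails."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= k, 0.5 * x**2, k * (np.abs(x) - 0.5 * k))

def huber_psi(x, k=1.345):
    """Influence function psi = d rho / d x, bounded by k."""
    x = np.asarray(x, dtype=float)
    return np.clip(x, -k, k)

def huber_weight(x, k=1.345):
    """Weight function w = psi(x) / x, with w(0) = 1."""
    x = np.asarray(x, dtype=float)
    return np.minimum(1.0, k / np.maximum(np.abs(x), 1e-12))
```

The threshold k = 1.345 is a conventional tuning constant for unit-variance noise; any positive k works the same way.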
Table 1 Commonly used robust cost functions

Type | ρ(x) | ψ(x) | w(x)
L2 | x²/2 | x | 1
L1 | |x| | sgn(x) | 1/|x|
L1–L2 [19] | 2(√(1 + x²/2) − 1) | x/√(1 + x²/2) | 1/√(1 + x²/2)
Lp [15] | |x|^ν/ν | sgn(x)·|x|^(ν−1) | |x|^(ν−2)
"Fair" [15] | c²(|x|/c − log(1 + |x|/c)) | x/(1 + |x|/c) | 1/(1 + |x|/c)
Huber [22, 23] | x²/2 if |x| ≤ k; k(|x| − k/2) if |x| > k | x if |x| ≤ k; k·sgn(x) if |x| > k | 1 if |x| ≤ k; k/|x| if |x| > k
Cauchy [24, 25] | (c²/2)·log(1 + (x/c)²) | x/(1 + (x/c)²) | 1/(1 + (x/c)²)
Geman-McClure [24, 26] | (x²/2)/(1 + x²) | x/(1 + x²)² | 1/(1 + x²)²
Welsh [19] | (c²/2)(1 − exp(−(x/c)²)) | x·exp(−(x/c)²) | exp(−(x/c)²)
Tukey [19] | (c²/6)(1 − (1 − (x/c)²)³) if |x| ≤ c; c²/6 if |x| > c | x(1 − (x/c)²)² if |x| ≤ c; 0 if |x| > c | (1 − (x/c)²)² if |x| ≤ c; 0 if |x| > c
SC/DCS [14, 27, 28] | x²/2 if x² ≤ φ; 2φx²/(φ + x²) − φ/2 if x² > φ | x if x² ≤ φ; 4φ²x/(φ + x²)² if x² > φ | 1 if x² ≤ φ; 4φ²/(φ + x²)² if x² > φ
Threshold | x²/2 if |x| ≤ t; t²/2 if |x| > t | x if |x| ≤ t; 0 if |x| > t | 1 if |x| ≤ t; 0 if |x| > t
where ψ(x) = ∂ρ(x)/∂x is called the influence function, which measures the influence of a datum on the value of the estimate. For example, the influence function of the cost function ρ(x) = x²/2 is ψ(x) = x, which increases linearly with the error, so the influence of outliers is huge; this shows the non-robustness of the L2-type estimation in (1). For the robust cost functions, the influence functions should be bounded, so that the influence of outliers is insufficient to produce any significant offset of the estimated state. To find the minimum point of the cost function, we iteratively set the derivative to zero:

ψ(r_k(x*)) · ∂r_k(x)/∂x |_{x = x̌*} = 0,   (5)
where x̌* is the optimal state estimated at the previous iteration. It may be quite difficult to solve the equation in (5) due to the form of ψ(·), so we further define a weight function w(x) = ψ(x)/x. Substituting the weight function into (5), we have

w(r_k(x̌*)) · r_k(x*) · ∂r_k(x)/∂x |_{x = x̌*} = 0,

which matches the iteratively reweighted least-squares (IRLS) problem [21],

x* = argmin_x Σ_{k=1}^{K} w(r_k(x̌*)) · r_k(x)².

We summarize the different types of robust cost functions with their cost functions ρ(·), influence functions ψ(·), and weight functions w(·) in Table 1.
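A minimal IRLS sketch (illustrative, not from the paper) for a robust linear fit: at each iteration the residuals are reweighted by w(r) = ψ(r)/r and a weighted least-squares problem is solved. Here the Huber weight from Table 1 is used:

```python
import numpy as np

def irls(A, y, weight_fn, iters=20):
    """Iteratively reweighted least squares: at each step solve the
    weighted normal equations A^T W A x = A^T W y with W = diag(w(r))."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]   # plain L2 initial guess
    for _ in range(iters):
        r = y - A @ x                          # residuals at current estimate
        w = weight_fn(r)                       # w(r) = psi(r) / r from Table 1
        Aw = A * w[:, None]
        x = np.linalg.solve(A.T @ Aw, Aw.T @ y)
    return x

# Huber weight from Table 1: w = 1 if |r| <= k, else k / |r|
huber_w = lambda r, k=1.345: np.minimum(1.0, k / np.maximum(np.abs(r), 1e-12))

# Line fit y = 2 t + 1 with one gross outlier
t = np.arange(10, dtype=float)
y = 2.0 * t + 1.0
y[7] += 50.0
A = np.column_stack([t, np.ones_like(t)])
x_robust = irls(A, y, huber_w)   # close to the true [2, 1]
```

Because the Huber cost is convex, the reweighting iterations converge and the outlier is down-weighted instead of dominating the fit, as it would under the plain L2 cost.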
3 Comparison
The robust cost, influence, and weight functions listed in Table 1 are depicted in Figs. 1, 2, and 3, respectively. All the parameters are set to 1, except for ν = 1.5. From Fig. 2, we can clearly see the non-robust nature of the L2-type cost function: its influence function is not bounded, so outliers have a huge impact on the estimation. The L1-type cost function can restrict the influence of outliers to a certain degree, but it is not stable at x = 0, since the cost function is not differentiable there and its weight function is unbounded at x = 0. The L1–L2-type cost function restricts the impact of outliers and does not have this instability; it is a compromise between the L2-type and the L1-type. The Lp-type (i.e., least-powers) cost function represents a family of functions: it turns into the L1-type when ν = 1 and into the L2-type when ν = 2. We choose ν = 1.5 for illustration, and it allows outliers a larger influence than the L1–L2-type estimator. The optimal value of ν should be investigated; ν ≈ 1.2 has shown good performance in [15]. The 'Fair' cost function is convex and differentiable, but its influence function increases with x. Huber's cost function behaves so well that it has been recommended for almost all situations. However, it may encounter some problems, possibly because its second derivative is not continuous; a modified version is proposed in [15, 19]. The functions from Cauchy's to Tukey's biweight in Table 1 all suffer from the problem that they are not convex and cannot guarantee a unique solution. They all restrict the impact of outliers: the Geman-McClure and Welsh functions further suppress the impact compared with Cauchy's function, and Tukey's function even tries to eliminate the impact of outliers entirely. We can use a convex cost function first until the iteration converges, and then use these non-convex functions to eliminate the impact of large outliers [23].
Fig. 1 Robust cost functions
Fig. 2 Robust influence functions
Fig. 3 Robust weight functions
Switchable constraints (SC) [27] and dynamic covariance scaling (DCS) [28] techniques have demonstrated good performance in robotic mapping. The SC/DCS function can effectively suppress the influence of large outliers. The threshold function completely eliminates the impact of large outliers, but its influence function is discontinuous, just like the L1-type, which may cause chatter during the optimization process.
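As a sketch based on the weight column of Table 1 (the piecewise form w = 4φ²/(φ + x²)² for x² > φ is an assumption of that reconstruction, not a quote from [27, 28]), the SC/DCS down-weighting can be written as:

```python
import numpy as np

def dcs_weight(r, phi=1.0):
    """SC/DCS-style weight: 1 for small residuals, smoothly scaled
    down as the squared residual grows beyond phi (cf. Table 1)."""
    r2 = np.asarray(r, dtype=float) ** 2
    return np.where(r2 <= phi, 1.0, 4.0 * phi**2 / (phi + r2) ** 2)
```

Unlike the hard threshold function, this weight decays continuously, which is why it avoids the chatter noted above.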
4 Conclusions
This paper summarized and compared commonly used robust cost functions for M-estimation. There are many other robust cost functions that are not discussed here, and it is difficult to select one best robust cost function for all uses. Experiments using datasets or simulations should be conducted to evaluate the performance of different robust cost functions in different application scenarios. Acknowledgements This work was supported by the Pre-Research Project of Space Science (No. XDA15014700), the National Natural Science Foundation of China (No. 61601328), the Scientific Research Plan Project of the Committee of Education in Tianjin (No. JW1708), and the Doctor Foundation of Tianjin Normal University (No. 52XB1417).
References 1. Barfoot TD (2017) State estimation for robotics. Cambridge University Press
2. Thrun S, Burgard W, Fox D (2005) Probabilistic robotics. MIT Press, Cambridge 3. Mur-Artal R, Montiel JMM, Tardos JD (2015) ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans Robot 31(5):1147–1163 4. Yurtsever E, Lambert J, Carballo A, Takeda K (2020) A survey of autonomous driving: common practices and emerging technologies. IEEE Access 8:58443–58469 5. Qin T, Li P, Shen S (2018) VINS-mono: a robust and versatile monocular visualinertial state estimator. IEEE Trans Robot 34(4):1004–1020 6. Huang G (2019) Visual-inertial navigation: a concise review. In: Proceedings of international conference on robotics and automation (ICRA), Montreal, Canada, pp 1–16 7. Cadena C, Carlone L, Carrillo H et al (2016) Past, present, and future of simultaneous localization and mapping: toward the robust-perception age. IEEE Trans Robot 32(6):1309–1332 8. Durrant-Whyte H, Bailey T (2006) Simultaneous localisation and mapping (SLAM): part I. IEEE Robot Autom Mag 13(2):99–110 9. Bailey T, Durrant-Whyte H (2006) Simultaneous localisation and mapping (SLAM): part II. IEEE Robot Autom Mag 13(3):108–117 10. Hartley R, Zisserman A (2004) Multiple view geometry in computer vision, 2nd edn. Cambridge University Press, Cambridge 11. Haomin L, Guofeng Z, Hujun B (2016) A survey of monocular simultaneous localization and mapping. J Comput Aided Design Comput Graph 28(6):855–868 in Chinese 12. Fuentes-Pacheco J, Ruiz-Ascencio J, Rendon-Mancha JM (2015) Visual simultaneous localization and mapping: a survey. Artif Intell Rev 43:55–81 13. Gao X, Zhang T, Liu Y, Yan Q (2019) 14 Lectures on visual SLAM: from theory to practice, 2nd edn., Publishing House of Electronics Industry (in Chinese) 14. MacTavish K, Barfoot TD (2015) At all costs: a comparison of robust cost functions for camera correspondence outliers. In: Proceedings of 12th conference on computer and robot vision (CRV), Halifax, Canada, pp 62–69 15. 
Rey WJJ (1983) Introduction to robust and quasi-robust statistical methods. Springer, Berlin 16. Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley, New York 17. Fischler MA, Bolles RC (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Commun ACM 24(6):381–395 18. Zhang X, Zhang J, Rad AB, Mai X, Jin Y (2012) A novel mapping strategy based on neocortex model: Pre-liminary results by hierarchical temporal memory. In: Proceedings of IEEE international conference on robotics and biomimetics (ROBIO), Guangzhou, China, pp 476–481 19. Zhang Z (1997) Parameter estimation techniques: a tutorial with application to conic fitting. Image Vision Comput 15(1):59–76 20. Nocedal J, Wright SJ (2006) Numerical optimization, 2nd edn. Springer Science & Business Media, Berlin 21. Holland PW, Welsch RE (1977) Robust regression using iteratively reweighted least-squares. Commun Statistics Theory Methods 6(9):813–827 22. Huber PJ et al (1964) Robust estimation of a location parameter. Ann Math Statistics 35(1):73–101 23. Huber PJ, Ronchetti EM (2009) Robust statistics, 2nd edn. Wiley, New York
24. Hu G, Khosoussi K, Huang S (2013) Towards a reliable SLAM back-end. In: IEEE/RSJ international conference on intelligent robots and systems (IROS), Tokyo, Japan, pp 37–43 25. Lee GH, Fraundorfer F, Pollefeys M (2013) Robust pose-graph loop-closures with expectation-maximization. In: IEEE/RSJ international conference on intelligent robots and systems (IROS), Tokyo, Japan, pp 556–563 26. Geman S, McClure DE, Geman D (1992) A nonlinear filter for film restoration and other problems in image processing. CVGIP: Graph Models Image Process 54(4):281–289 27. Sunderhauf N, Protzel P (2012) Switchable constraints for robust pose graph SLAM. In: IEEE/RSJ international conference on intelligent robots and systems (IROS), Vilamoura, Portugal, pp 1879–1884 28. Agarwal P, Tipaldi GD, Spinello L, Stachniss C, Burgard W (2013) Robust map optimization using dynamic covariance scaling. In: IEEE international conference on robotics and automation (ICRA), Karlsruhe, Germany, pp 62–69
Research on the Path Planning Algorithm for Emergency Evacuation in the Building Based on Ant Colony Algorithm Chenguang He1,2(B) , Suning Liu1 , Liang Ye1 , and Shouming Wei1,2 1 Communications Research Center, Harbin Institute of Technology, Harbin, China
{hechenguang,yeliang,weishouming}@hit.edu.cn, [email protected] 2 Key Laboratory of Police Wireless Digital Communication, Ministry of Public Security, Beijing, People’s Republic of China
Abstract. In recent years, China has witnessed rapid economic development and rapid population growth. Meanwhile, the population density of cities has continued to increase: public places are often overcrowded, and major accidents occur more frequently. When an emergency occurs, it is of great significance to provide effective evacuation guidance to the people on site, so as to reduce the number of casualties and property losses. This paper selects the ant colony algorithm (ACO) for research. The ant colony algorithm and the principle of using the grid method to generate a simulated map are introduced in detail. According to these principles and standards, the specific environment is modeled, the simulated map is generated, and the ant colony algorithm is used to realize path planning on the simulated map. Keywords: Emergency evacuation · Ant colony algorithm · Grid method · Path planning
1 Introduction

Nowadays, with the rapid growth of the economy, the population has grown larger and larger, so it is not rare to see public places such as markets, subway stations, and cinemas overcrowded, which makes them prone to accidents causing casualties and property damage. Large-scale urban buildings integrate entertainment, business, and tourism functions and have complicated structures, which makes it very difficult to escape in time from a dangerous area and seriously threatens the safety of people's lives and property. Therefore, the evacuation of people in enclosed spaces is a very complex issue, and the effects of various influencing factors should be weighed. This paper uses the ACO algorithm to perform path planning: it calculates the locations of people and the best evacuation routes to find the best exit and guide people to escape rapidly.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_100
Up to now, there are many methods that can accomplish route planning. However, different algorithms have different advantages and disadvantages, resulting in different scopes of application. Based on research and analysis of path planning algorithms in various fields, and combined with the order and basic principles of the various algorithms, they can be roughly divided into categories such as traditional algorithms, graphical methods, and intelligent bionic algorithms. The traditional algorithms mainly include the simulated annealing algorithm, the artificial potential field method, the fuzzy logic algorithm, the Tabu search algorithm, etc. The simulated annealing algorithm [1] is an effective approximation algorithm, which is very effective in dealing with large-scale combinatorial optimization problems. The artificial potential field method [2] is a virtual force method; the algorithm mainly optimizes the path by establishing gravitational and repulsive field functions. The fuzzy logic algorithm [3] simulates pedestrians' perception of different roads in real time, comprehensively considers physiological perception and specific behavior, generates unique sensor information at different times, and looks up the corresponding processing method in a table to achieve the path planning effect. The Tabu search algorithm [4] is a comprehensive step-by-step optimization algorithm that mainly simulates people's intellectual thinking process. When solving practical problems, the traditional algorithms mentioned above do not have an intuitive representation, so they are often not easy to implement in the mathematical modeling process. In contrast, graphical algorithms are more intuitive and easier to implement.
With the development of bionics, scholars discovered that models based on phenomena in nature can have unexpectedly good effects when dealing with complex problems; examples include the ant colony algorithm [5] and genetic algorithms.
2 Ant Colony Algorithm and Grid Method

2.1 Theory of the ACO Algorithm

The foraging behavior of ants turns out to be very helpful for solving the path planning problem. Multiple ants start the pathfinding operation from the starting point at the same time. The pathfinding of each ant is independent, and each ant selects its path independently according to probability. Since ants leave pheromones as they walk, each ant's walk is accompanied by an update of the pheromone on the map. The ant colony algorithm executes step by step in parallel. This makes the algorithm more powerful in searching and significantly expands the area travelled on the map; the probability of finding the best path is greatly increased, and the algorithm can avoid falling into a local optimum. Ants in later iterations can choose a better path, guided mainly by the pheromone concentrations on the map. Because of this, the ant colony algorithm can perform the pathfinding operation directly without additional human guidance. At the same time, it continuously optimizes and improves the planned path, finally making the output path more in line with expectations. The process of the ACO algorithm is shown in Fig. 1.
Fig. 1 Process of the ACO algorithm (parameter initialization → evaluation of ant colony → direction choice by probability → pheromone update → n = n + 1, repeated until the termination condition is met → output minimum path)
It is the core of the algorithm that each ant in the population independently chooses its walking route according to probability. Because the pheromone concentration differs from location to location on the map, the algorithm computes the probability of an ant choosing each route according to the following formula and heuristic function:

P_ij^k(t) = [τ_ij(t)]^α · [η_ij(t)]^β / Σ_{s ∈ allowed} [τ_is(t)]^α · [η_is(t)]^β if j is in the allowed table, and P_ij^k(t) = 0 otherwise,   (1)

where η_ij(t) is the value of the heuristic function and τ_ij(t) is the pheromone concentration after t iterations. All the waypoints that an ant can reach at this moment are stored in the allowed table.

2.2 Application of the Grid Method in Map Construction

In general, it is difficult to use algorithms to study path planning on physical maps, so it is necessary to convert a realistic geographic environment into a usable simulated map at a certain scale and complete path selection and other operations on the simulated map. To make a grid map, the required map is divided into multiple grids of a fixed size, and each grid corresponds to a different number [6]. If most of the area inside a grid is an obstacle, it is impossible to pass directly through it, and the grid is defined as being in the full state; conversely, if a grid contains few obstacles and can be passed through directly, it is defined as being in the empty state. When the grid method is used to create a simulated map for path planning, there is no gap between adjacent grids by default, and the resulting optimal route is the one that passes through the smallest total number of grids from the starting grid to the grid containing the end point; thus, the side length of a grid is the minimum moving distance during walking. Figure 2 is a simulated map generated by the grid method.
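The probabilistic transition rule in Eq. (1) can be sketched in Python (an illustrative sketch, not the authors' MATLAB implementation):

```python
import numpy as np

def choose_next(tau, eta, i, allowed, alpha=1.0, beta=2.0,
                rng=np.random.default_rng()):
    """Pick the next grid cell j for one ant according to Eq. (1):
    P_ij is proportional to tau_ij^alpha * eta_ij^beta over allowed cells."""
    allowed = list(allowed)
    scores = np.array([tau[i, j] ** alpha * eta[i, j] ** beta for j in allowed])
    p = scores / scores.sum()          # normalize over the allowed table
    return allowed[rng.choice(len(allowed), p=p)]

# example: uniform pheromone and heuristic over a 4-node map
tau = np.ones((4, 4))
eta = np.ones((4, 4))
next_cell = choose_next(tau, eta, 0, allowed=[1, 2, 3])
```

Cells outside the allowed table are simply never sampled, which corresponds to the zero branch of Eq. (1).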
Fig. 2 Grid map (a 5 × 5 grid with cells numbered 0–24 row by row)
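The grid construction above can be sketched as a 2-D occupancy array (the obstacle layout here is illustrative):

```python
import numpy as np

ROWS, COLS = 5, 5

# 0 = empty (passable), 1 = full (obstacle); layout is illustrative
grid = np.zeros((ROWS, COLS), dtype=int)
grid[1, 2] = grid[3, 1] = 1

def cell_id(row, col):
    """Serial number of a cell, counting row by row from 0 (as in Fig. 2)."""
    return row * COLS + col

def neighbors(row, col):
    """Empty 4-connected neighbor cells an ant may step into."""
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < ROWS and 0 <= c < COLS and grid[r, c] == 0:
            yield r, c
```

With this numbering, cell_id(0, 0) is 0 and cell_id(4, 4) is 24, matching the 5 × 5 map of Fig. 2; the allowed table of Eq. (1) would be built from `neighbors`.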
3 Parameter Selection and Simulation Verification

This part uses MATLAB simulation software to carry out experiments studying the influence of parameters such as the number of ants, the pheromone volatilization coefficient, the heuristic factor, the expectation heuristic factor, and the number of iterations on the performance of the algorithm, and selects the best parameters to model an apartment. The simulation, based on the MATLAB platform, completes the path planning algorithm research for the specific environment. We use the two simulated maps in Fig. 3 for the simulation study.
Fig. 3 Two situations of path planning (a situation 1; b situation 2)
3.1 The Number of Ants

The number of ants m is the total number of ants that start to search for paths simultaneously in one iteration. When the number of ants is large, the exploration range on the map is wider, and it is easier to find a shorter path. At the same time, however, the amount of computation increases, the running time of the algorithm grows proportionally, and when the number of ants increases beyond a certain extent, the difference in pheromone concentration between routes decreases, which is not conducive to iteratively choosing the best path. As shown in Fig. 4, to minimize the path length and the number of iterations, the number of ants m should be 80 in scenario 1 and 50 in scenario 2.
Fig. 4 Influence curve of the ant number m on algorithm performance (a on the path length; b on the number of iterations)
3.2 Pheromone Volatilization Coefficient

The greater the pheromone coefficient, the higher the concentration of pheromone left behind after an ant passes along a route. When the pheromone coefficient is too large, the pheromone concentration on already-travelled paths is very high and other ants will hardly choose alternative paths, which leads to fewer algorithm iterations, early convergence, and failure to find the shortest path. When the pheromone volatilization coefficient is too small, convergence slows down and it is difficult to find the shortest path. As shown in Fig. 5, to minimize the path length and the number of iterations, the pheromone coefficient should be 0.7 in scenario 1 and 0.5 in scenario 2.

Fig. 5 Influence curve of the pheromone volatilization coefficient on algorithm performance (a on the path length; b on the number of iterations)
3.3 Heuristic Factor and Expectation Heuristic Factor

The heuristic factor and the expectation heuristic factor restrict each other. The heuristic factor indicates the possibility of selecting an existing path, similar to the genetic operation in the differential evolution algorithm; the expectation heuristic factor indicates the possibility of selecting another path, similar to the mutation operation. As shown in Figs. 6 and 7, as the expectation heuristic factor increases, both the path length and the number of iterations decrease. This shows that increasing the expectation heuristic factor raises the probability that ants search randomly, making it easier to find the shortest path. In both scenario 1 and scenario 2, to minimize the path length and the number of iterations, the expectation heuristic factor should be 7 and the heuristic factor should be 1.

3.4 Simulation Analysis of Path Planning Based on Ant Colony Algorithm

Using the best parameters, path planning simulations are conducted under scenario 1 and scenario 2 to test whether the ant colony algorithm meets the expected requirements. As shown in Fig. 8, when the parameters are chosen reasonably, the ant colony algorithm achieves the path planning function well.
Fig. 6 Influence curve of the expectation heuristic factor on algorithm performance (a on the path length; b on the number of iterations)

Fig. 7 Influence curve of the heuristic factor on algorithm performance (a on the path length; b on the number of iterations)
It can be seen from Fig. 9 that the path length shows a downward trend as the number of iterations increases. When the number of iterations is small, the curve
Fig. 8 Best path in different scenarios (a the best path based on situation 1; b the best path based on situation 2)
will fluctuate. However, after the number of iterations exceeds 85 in situation 1 and 78 in situation 2, the minimum path length is stable and the curve can be considered to have converged. This minimum path length is the shortest path length in the scenario.
Fig. 9 Variation curve of the minimum path length with the number of iterations (a situation 1; b situation 2)
4 Conclusion

Based on an in-depth analysis of the specific environment, this paper finds the best exit for starting points at different locations in the environment and realizes path planning from the starting point to the best exit. The path planning algorithm is based on the ant colony algorithm combined with the grid method to model and analyze the specific environment and create a simulated map. The best parameters are then selected to optimize the performance of the algorithm and obtain the optimized path quickly.
Acknowledgements. This paper is supported by Science and Technology Project of Ministry of Public Security (2017GABJC24) and the National Key R&D Program of China (No. 2018YFC0807101).
References

1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Cai J, Wan X, Huo M et al (2010) An algorithm of micromouse maze solving. In: 2010 10th IEEE international conference on computer and information technology (CIT 2010). IEEE Computer Society
3. Maaref H, Barret C (2000) Sensor-based fuzzy navigation of an autonomous mobile robot in an indoor environment. Contr Eng Pract 8(7):757–768
4. Yao B, Hu P (2014) A support vector machine with the tabu search algorithm for freeway incident detection. Int J Appl Math Comput Sci 3:397–404
5. Guo Q, Chang H, Yi Y (2010) Improved ant colony optimization algorithm for the traveling salesman problems. J Syst Eng Electr 4:329–333
6. Lin M, Cui G, Zhang Z (2016) Large eddy simulation of aircraft wake vortex with self-adaptive grid method. Appl Math Mech 10:1289–1304
Multiple Action Movement Control Scheme for Assistive Robot Based on Binary Motor Imagery EEG Xuefei Zhao, Dong Liu, Shengquan Xie, Quan Liu, Kun Chen, Li Ma(B) , and Qingsong Ai School of Information Engineering, Wuhan University of Technology, Wuhan, China [email protected]
Abstract. In this paper, a weighted voting system combined with basic signal processing methods is used to classify multi-category motor imagery (MI) scenarios (foot, left-hand, right-hand, tongue) to improve the classification accuracy of the MI electroencephalogram (EEG) signal. Meanwhile, a feasible binary coding framework is proposed to control a KUKA robotic arm for grasping, improving the online performance of applications of brain–computer interfaces (BCIs). Firstly, the two-movement MI with the highest classification accuracy is selected from the four action types, e.g., foot as 0 and left-hand as 1, and two-bit binary codes of their combinations represent the four motion directions of the robotic arm (00-front, 01-back, 10-left, 11-right). Next, the motion of the robotic arm in each direction is achieved by two successive MI movements. Finally, the accuracy of our integrated classifier reaches 74.6% on four-movement MI data and 92.6% on two-movement MI data. Compared to four-movement MI control of the robotic arm, the binary coding method reduces the time by 6.8% and more than doubles the accuracy.

Keywords: Motor imagery (MI) · Electroencephalogram (EEG) signal · Brain–computer interface (BCI) · Binary coding · KUKA robotic arm
1 Introduction

The high recurrence, mortality, and disability rates of stroke have placed a heavy burden on society. As a positive treatment, the brain–computer interface (BCI) builds a bridge between machines and human physical needs through brain signals to help stroke patients [1]. By spontaneously activating EEG signals in a specific brain region, the intention of patients can be understood and even realized by machines. This helps reconstruct the patients' neural circuits, improve their motor function, and accelerate and expand the development of stroke rehabilitation treatment [2]. The sources of BCI include the steady-state visual evoked potential (SSVEP), the P300-related event potential
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_101
(P300), motor imagery (MI), and some combinations of them. Among them, the MI-EEG signal is widely used for its spontaneity. When people imagine unilateral limb movement, it induces a change of energy intensity in some characteristic frequency bands of the contralateral sensory area of the brain. The phenomenon of frequency-band power decline is called event-related desynchronization (ERD), while the opposite phenomenon is called event-related synchronization (ERS) [3]. As a result, different categories of MI can be clearly distinguished by ERD and ERS. A complete BCI system consists of four important parts: signal acquisition, data processing, control equipment, and feedback, and an assistive robot usually acts as the external device for rehabilitation assistance [4]. Savic et al. used SSVEP and functional electrical stimulation (FES) to design a BCI system for the rehabilitation of patients' grasping ability [5]. Li et al. developed a brain–machine upper-limb movement rehabilitation system controlled by MI in 2017 [6]. In 2019, Li et al. developed a BCI robotic system [7]. Currently, the accuracy of MI with four or fewer movements reaches a relatively high level. As the number of MI categories increases, the cerebral volume conduction effect and the confusion between different channels become more obvious, and the classification accuracy decreases accordingly [8, 9]. However, the category of MI directly or indirectly determines the BCI instruction set size, and inaccurate MI leads to erroneous BCI control. To solve these problems, some new strategies have been proposed. On the one hand, new algorithms can be created to improve multi-classification accuracy. Based on four movements, the idle state was included to realize five states to control a robot [10]. With the idea of grading, Ofner et al. decoded six kinds of MI and improved the classification accuracy [11]. However, it is hard to imagine and classify correctly in multi-category MI.
On the other hand, the BCI instruction set can be expanded on the application side. A study using two-movement MI combined the previous state with the current input MI signal to realize multi-directional control of a mobile robot on a virtual platform [12]. A richer instruction set was achieved without increasing the difficulty of imagination and classification. In our work, we expect to use fewer MI categories to ensure classification accuracy while achieving a set of actual control instructions larger than the number of MI categories. In this paper, the MI-EEG signal is used to complete a grasping task. Through MI-EEG signals, subjects control the motion of a KUKA robotic arm in four directions to grasp a bottle with the help of the end soft-touch manipulator. We use two-movement MI in a binary coding method to complete this process, so the ratio of MI categories to robotic-arm motion directions is 2:4. The paper's structure is as follows. Sections 1 and 2 present the introduction and the experimental principle and design, respectively. The experimental results and discussion are reported in Sect. 3. Section 4 concludes the paper.
2 Experimental Principle and Design

After signal processing and binary coding, the extracted MI-EEG signal is used to control the robotic arm. The specific principles are introduced as follows.
2.1 Signal Processing Principle

The signal processing includes preprocessing, feature extraction, classification, and integrated classification. Firstly, a Butterworth band-pass filter of 8–30 Hz is used as preprocessing to improve the signal-to-noise ratio [13].

Feature Extraction and Classification Method. There are five methods involved in this part. The feature extraction methods include the common spatial pattern (CSP), the local characteristic-scale decomposition (LCD), and the discrete wavelet transform (DWT). CSP is a spatial feature extraction algorithm for multi-class classification of multi-channel EEG signals. By diagonalizing the matrix, a spatial filter W is constructed and applied to the original EEG signal to extract features [14]. LCD is a time–frequency analysis method which can adaptively decompose non-stationary EEG signals into internal scale components (ISCs) [15]; the selected ISCs are then reconstructed into new signals. DWT is essentially obtained by discretizing the scale and displacement of the continuous wavelet transform (WT) according to integer powers of 2. Its multiresolution property is suitable for feature extraction from non-stationary signals [16]. The feature classification methods include multi-cluster feature selection (MCFS) [17] to sort the features and spectral regression discriminant analysis (SRDA) to classify the feature data [18].

Integrated Classification Method. The integrated classification is realized by a weighted voting system. Combining the above feature extraction methods in series [18], there are four cases: CSP, CSP combined with LCD, CSP combined with DWT, and CSP combined with LCD and DWT. Four training models are obtained by the SRDA method, and their accuracies are denoted $A_1, A_2, A_3, A_4$.
The corresponding weight is obtained by the following formula:

$$w_i = A_i \Big/ \sum_{j=1}^{4} A_j, \quad i = 1, 2, 3, 4 \tag{1}$$

The weights are then updated based on each model's accuracy:

$$w_i = \begin{cases} 0, & A_i \le 0.3 \\ w_i/2, & 0.3 < A_i < 0.6 \\ w_i, & A_i \ge 0.6 \end{cases} \tag{2}$$
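A minimal NumPy sketch of the weight computation in Eqs. (1)–(2) and the weighted vote described in the next paragraph (the vote is an argmax over summed weights per predicted class). The example accuracies are the Table 1 base-classifier values as fractions; the prediction arrays are made-up illustrations:

```python
import numpy as np

def integrate(accs, preds):
    """Weighted-vote integration of four base classifiers.

    accs:  accuracies A_i of the 4 base models, shape (4,)
    preds: per-model predictions C_i(n), shape (4, N)
    Returns the integrated label for each of the N samples.
    """
    accs = np.asarray(accs, dtype=float)
    w = accs / accs.sum()                  # Eq. (1): w_i = A_i / sum_j A_j
    w = np.where(accs <= 0.3, 0.0,         # Eq. (2): drop weak models,
        np.where(accs < 0.6, w / 2, w))    # halve mediocre ones
    preds = np.asarray(preds)
    n_classes = int(preds.max()) + 1
    out = np.empty(preds.shape[1], dtype=int)
    for n in range(preds.shape[1]):
        score = np.zeros(n_classes)
        for i in range(4):                 # add the weight of each model to the class it votes for
            score[preds[i, n]] += w[i]
        out[n] = score.argmax()            # class with the largest summed weight wins
    return out

# Base-model accuracies from Table 1 (64.4%, 69.7%, 67%, 72.4%); toy predictions.
labels = integrate([0.644, 0.697, 0.67, 0.724],
                   [[0, 1], [0, 2], [1, 1], [0, 1]])
```

Since all four accuracies here exceed 0.6, Eq. (2) leaves the normalized weights untouched and the vote reduces to a weighted majority.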
According to the training models, the four classification results are $C_1(n), C_2(n), C_3(n), C_4(n)$, $n = 1, 2, \ldots, N$, where n denotes the nth sample and N is the total number of samples in the testing set. The weights corresponding to the same judgment result $C_i(n)$ are added, the maximum summed weight is found, and the corresponding classification result is taken as the integrated classification result. The final classifier is thus constructed.

2.2 Binary Coding Principle

The processed signals are further encoded with the binary coding method to control the robotic arm. Assume that $X_n$ represents the nth class of MI-EEG signal and the corresponding output after processing is $S_n$ ($n = 1, 2, \ldots, N$), where N denotes the total number of categories of MI-EEG signals. The control instructions of the robotic arm are $C_m$ ($m = 1, 2, \ldots, M$), where M denotes the total number of the robotic arm's instructions.
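As a concrete sketch of this binary coding principle, the snippet below groups a stream of binary MI outputs into 2-bit codes and maps each code to an arm instruction, using the example mapping from the abstract (foot = 0, left-hand = 1; 00-front, 01-back, 10-left, 11-right). The function name and stream format are our own illustration:

```python
# 2-bit code -> robotic-arm instruction, as in the abstract's example.
CODE2INSTR = {(0, 0): "front", (0, 1): "back",
              (1, 0): "left", (1, 1): "right"}

def decode(mi_outputs):
    """Group a stream of binary MI classifications S_n into B = 2-bit
    codes and map each code to one arm instruction C_m."""
    pairs = zip(mi_outputs[0::2], mi_outputs[1::2])  # consecutive pairs
    return [CODE2INSTR[p] for p in pairs]

# Two successive MI outputs produce one of M = 4 instructions from N = 2 classes.
moves = decode([0, 0, 1, 1])
```

This makes the N &lt; M trade-off explicit: each arm command costs two classifications, but the classifier only ever has to separate two MI classes.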
In the traditional way, achieving M control instructions of the robotic arm requires N = M: each $C_m$ corresponds to one $S_n$. In the binary coding way, the same number of instructions can be achieved with N < M. Adopting B-bit binary coding, each $C_m$ corresponds to B successive outputs $S_n$; that is, one robotic-arm instruction is controlled by a combination of MI-EEG signals. As the number of MI-EEG signal categories N increases, it becomes more difficult to obtain the correct classification result $S_n$; however, the current BCI requires a rich instruction set. In this paper, we take M = 4. In the conventional way, N = 4 and a four-movement classifier is used. In the binary coding mode, N = 2 and a two-movement classifier is used. We reduce the number of categories from 4 to 2 while keeping the number of control instructions of the robotic arm at 4.

2.3 Subject and Equipment

Seven right-handed male subjects aged 23 ± 2 took part in the experiment. They have no mental illness or physical movement disorders and had no similar experimental experience before. All the tests were completed at night over two days for each subject. An EEG signal acquisition system with a sampling frequency of 250 Hz conforming to the international standard 10–20 lead system is adopted. The online control module consists of an IIWA7 R800 KUKA robotic arm and the end manipulator.

2.4 Experimental Procedure

Offline Session. This session mainly collects data and builds classifiers. The subjects sit in a comfortable chair and carry out the trials according to the computer prompts. One trial lasts 8 s, as shown in Fig. 1.
When t = 0 s, the monitor displays the word "prepare," and the subjects prepare for MI; when t = 2 s, one of the MI pictures (foot, left-hand, right-hand, tongue) appears as a prompt, the subjects begin to carry out the corresponding MI, and one MI run lasts 4 s from the appearance of the cue to its end; when t = 6 s, the word "rest" appears for the final 2 s to mark the end of the trial.
Fig. 1 Experimental paradigm for data collection: prepare (2 s), cue and MI (4 s), rest (2 s)
Online Session. According to the offline procedure and the classifier construction method above, the robotic arm is further controlled online through the BCI. There are two classifiers for each subject: a four-movement classifier and a two-movement classifier. The latter is constructed from the two movements with higher accuracy in the former, called the subject's optimal combination. To control the robotic arm to finish the grasping action, the target point and reach path are determined first. Then, the actual trajectory is mapped to the virtual interface, with both controlled by the MI-EEG signal
synchronously. The two or four movements of MI correspond to the four directions of the robotic arm's motion. In the former case, the four directions of the robotic arm's motion correspond to combinations of two MI movements, and a one-step motion of the robotic arm is controlled by two consecutive correct MIs; this one-to-many control strategy is implemented by the binary coding strategy. In the latter case, the four directions of the robotic arm are controlled by the four MI movements one by one. Through the MI-EEG signal, subjects control the movements of both the blue ball in the virtual interface and the robotic arm on the real platform simultaneously, so as to achieve the grasping job. The virtual interface (used as an auxiliary aid to MI) and the real platform are shown in Fig. 2.

Fig. 2 Experiment platform: a the real platform, b the virtual interface
3 Experimental Results and Discussions

3.1 Offline Performance

Selecting the data for 2 s after the cue appears, each subject's data consists of a training session and a testing session. Each session includes 100 trials, with each of the four MI categories tested 25 times. After training the models and constructing the classifiers with the different feature extraction methods for testing, the results are shown in Table 1 (acc: accuracy).

Table 1 Average accuracy and P value of different classifiers of four-movement MI

| Method | CSP | CSP + LCD | CSP + DWT | CSP + LCD + DWT | Integrated classifier |
|---|---|---|---|---|---|
| Average acc/% | 64.4 | 69.7 | 67 | 72.4 | 74.6 |
| P value | 0.0001 | 0.0013 | 0.0035 | 0.0148 | \ |
Table 1 shows that the average accuracy of the integrated method reaches 74.6%, higher than the other four methods. After the t-test, the P values are all less than 0.05, which shows that our integrated method is significantly superior to the other methods. Meanwhile, each subject's accuracy for each type of MI can be studied in this process, and the two of the four-movement MI
with higher accuracy are selected from the confusion matrix as his optimal combination to further construct a two-action classifier. As shown in Fig. 3a, the confusion matrix clearly shows Subject 1's accuracy and errors for each type of four-movement MI. In the four-movement MI, the classification accuracy of foot and right-hand is relatively higher; therefore, the optimal combination of Subject 1 is foot and right-hand. Similarly, the optimal combinations of the other subjects can be obtained. Table 2 shows the four-classification accuracy, the optimal combination, and the two-classification accuracy (foot: 0, left-hand: 1, right-hand: 2, tongue: 3) of all subjects (sub: subject, ave: average).
Fig. 3 Confusion matrix: a four-movement MI, b two-movement MI. The vertical axis represents the true label, and the horizontal axis represents the predicted label
Table 2 Offline experimental results of all subjects

| Subject | Sub 1 | Sub 2 | Sub 3 | Sub 4 | Sub 5 | Sub 6 | Sub 7 | Ave |
|---|---|---|---|---|---|---|---|---|
| Optimal combination | 02 | 03 | 03 | 13 | 23 | 02 | 13 | \ |
| Four-classification acc/% | 79 | 72 | 78 | 82 | 68 | 70 | 73 | 74.6 |
| Two-classification acc/% | 94 | 82 | 98 | 94 | 86 | 98 | 96 | 92.6 |
As can be seen from Table 2, under the same conditions, the subjects' average accuracy of two-classification MI reaches 92.6%, 18 percentage points higher than that of the four-classification MI. Apparently, almost all the subjects perform better with a small number of MI categories. Taking Subject 1 as an example, the confusion matrix obtained by the two-action classifier is shown in Fig. 3b. The optimal combination is specific to each subject, but the classification accuracy of the foot and tongue movements is higher in most subjects, indicating that these two movements are more conducive to the subjects' imagination. It is found that the subjects are able to build better models using their habitual actions, and we assume this can be applied to advantage in online applications.
3.2 Online Performance

Based on the classifiers constructed in the offline experiment, we control the grasping module through the BCI by the traditional method and the binary coding method, respectively. The experimental results are shown in Table 3 (T: time; condition 1: four movements of MI; condition 2: the optimal combination of MI).

Table 3 Online experimental results of all subjects

| Subject | Sub 1 | Sub 2 | Sub 3 | Sub 4 | Sub 5 | Sub 6 | Sub 7 | Ave |
|---|---|---|---|---|---|---|---|---|
| Condition 1 Acc/% | 33.3 | 35.4 | 46 | 39.5 | 40.5 | 33.3 | 43.6 | 38.8 |
| Condition 1 T/s | 51 | 48 | 37 | 43 | 42 | 51 | 39 | 44 |
| Condition 2 Acc/% | 69.4 | 94.4 | 100 | 100 | 82.9 | 77.3 | 65.4 | 84.2 |
| Condition 2 T/s | 49 | 36 | 34 | 34 | 41 | 44 | 52 | 41 |
The results prove that imagining two actions can complete the grasping task as well as four. For each step of the robotic arm's motion, the number of MI repetitions increases from 1 to 2, doubling the total number of repetitions, yet the task completion time decreases (except for Subject 7). At the same time, the accuracy of the optimal combination is significantly higher than that of the four-movement MI, more than twice as high (except for Subject 7). Notably, although the accuracy of Subject 3 and Subject 4 with four-movement MI is only about 40%, the accuracy using their optimal combination of MI reaches 100%. Therefore, we can conclude that the two classifiers established in the offline experiment can help complete the grasping task, improve the accuracy with fewer actions, and reduce the total time. Using the binary coding method, two-movement MI achieves the same effect as four-movement MI and even performs better. Figure 4a, b visualizes the accuracy and the experimental time under the two conditions, respectively.
Fig. 4 Experimental results comparison: a task completion accuracy, b task completion time. The horizontal axis is in steps; each step represents 1.5 s
4 Conclusion

This paper introduces a novel feature extraction and classification method through a combination and integration strategy. It shows much higher accuracy when compared with CSP, CSP + LCD, CSP + DWT, and CSP + LCD + DWT. Furthermore, a binary coding scheme via the two optimal kinds of MI-EEG signals is proposed to control the robotic arm's motion in four directions, which reduces the computation cost generated by the traditional four kinds of MI-EEG signals. The results prove that the task can be finished with fewer kinds of MI-EEG signals while the accuracy is improved simultaneously and efficiently. Thus, it provides a new strategy for utilizing the BCI system to reconstruct neural circuits and accelerate the recovery of stroke patients.

Acknowledgements. This work was supported by the National Natural Science Foundation of China (Grant No. 51675389) and the Excellent Dissertation Cultivation Funds of Wuhan University of Technology (2018-YS-053). It is also supported by the Fundamental Research Funds for the Central Universities (WUT: 203109001).
References
1. Cervera MA, Soekadar SR, Ushiba J et al (2018) Brain-computer interfaces for post-stroke motor rehabilitation: a meta-analysis. Ann Clin Translat Neurol. https://doi.org/10.1002/acn3.544
2. Huang Q-W, Ning X (2019) Advances in the application of motor imagery in the rehabilitation of stroke patients. J Mod Med Health 35:3185–3188
3. Pfurtscheller G, Lopes da Silva FH (1999) Event-related EEG/MEG synchronization and desynchronization: basic principles. Clin Neurophysiol. https://doi.org/10.1016/s1388-2457(99)00141-8
4. Liu F-Y, Li F-B (2017) Overview of brain-computer interface system. Electron World 21:72–73
5. Savic A, Kisic U, Popovic M (2012) Toward a hybrid BCI for grasp rehabilitation. In: IFMBE proceedings. https://doi.org/10.1007/978-3-642-23508-5_210
6. Mei Y-C (2014) Research on upper limb rehabilitation based on brain computer interface. Beijing University of Technology
7. Li H-W, Chen X-G (2019) Brain-computer interface controlled robotic arm system based on high-level control strategy. Beijing Biomed Eng 38:36–41
8. Yi W-B (2017) Research on response mechanism and decoding technology of EEG induced by compound motor imagery. Tianjin University
9. Tao X-W, Yi W-B, Chen L, He F, Qi H-Z (2019) Riemann kernel support vector machine recursive feature elimination in the field of compound limb motor imagery BCI. J Mech Eng 55:131–137
10. Cho JH, Jeong JH, Shim KH, Kim DJ, Lee SW (2018) Classification of hand motions within EEG signals for non-invasive BCI-based robot hand control. In: IEEE international conference on systems, man, and cybernetics. IEEE, Piscataway, NJ, USA, pp 515–518
11. Ofner P, Schwarz A, Pereira J, Müller-Putz GR (2017) Upper limb movements can be decoded from the time-domain of low-frequency EEG. PLoS ONE. https://doi.org/10.1371/journal.pone.0182578
12. Aljalal M, Djemal R, Ibrahim S (2018) Robot navigation using a brain computer interface based on motor imagery. J Med Biol Eng. https://doi.org/10.1007/s40846-018-0431-9
13. Ang KK, Guan C (2013) Brain-computer interface in stroke rehabilitation. J Comput Sci Eng. https://doi.org/10.5626/jcse.2013.7.2.139
14. Liu G-Q, Huang G, Zhu X-Y (2009) Application of CSP method in multi-class classification. Chin J Biomed Eng 28:935–938
15. Yang Y, Zeng M, Cheng J-S (2012) A new time-frequency analysis method: the local characteristic-scale decomposition. J Hunan Univ (Nat Sci) 39:35–39
16. Zhuo J, Yang G-Y, Xu T (2019) Classification of multi-class motor imagery EEG data based on spatial frequency and time-series information. Chin J Med Phys 36(01):87–93
17. Cai D, Zhang C, He X (2010) Unsupervised feature selection for multi-cluster data. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining, Washington, DC, USA. https://doi.org/10.1145/1835804.1835848
18. Liu A-M, Chen K, Liu Q, Ai Q-S, Xie Y, Chen A-Q (2017) Feature selection for motor imagery EEG classification based on firefly algorithm and learning automata. Sensors 17:2576
Modeling and Simulation of Ionospheric Surface Scattering Equation for High-Frequency Surface Radar Yongfeng Yang, Ziqiang Zhang, Xuguang Yang, Jun Du(B) , and Yanghong Zhang School of Information Engineering, Long Dong University, 745000 Qingyang, Gansu, China [email protected]
Abstract. Ionospheric clutter is the main factor reducing the detection performance of HFSWR. This paper establishes the radar range equations of surface-scattering targets. The scattering coefficient of the ionospheric RCS is estimated based on the principle of high-frequency coherent scattering from irregularities. Thus, the generalized radar formula based on the physical mechanism of the ionosphere is derived. The simulation shows that, for the wide beam of an HF surface wave radar, the ionospheric echoes from surface scattering close to the vertical direction accord with the beam-limited model, while the echoes close to the oblique direction accord with the pulse-limited model.

Keywords: HF surface wave radar · Ionosphere · Radar range equation
1 Introduction

High-frequency surface wave over-the-horizon radar (HFSWR) makes use of the ability of vertically polarized electromagnetic waves in the 3–30 MHz band to diffract along the sea surface, realizing compatible over-the-horizon detection of sea and air targets and effectively monitoring sea-surface ships and low-altitude flying targets beyond the horizon [1, 2]. HFSWR also has an ocean remote sensing function and can detect many ocean dynamic parameters such as current, wind direction, and wave height within the radar coverage [3, 4]. However, in a real radar system, due to radio interference, ionospheric disturbance, sea-surface transmission path loss, and other factors, some radar beams radiate toward the sky and enter the radar receiver in various ways after passing through the ionosphere, forming ionospheric clutter [5–8]. The ionospheric echo often occupies a large number of range-Doppler cells and raises the target detection floor, which is the key factor affecting the target detection performance and the sea-state remote sensing accuracy of the radar system. Many radar signal processing algorithms have treated the ionospheric echo as clutter to be suppressed [9–11]. The difficulty of ionospheric clutter suppression is that the ionospheric echo has special distribution characteristics such as high nonstationarity
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_102
and irregularity in the time, frequency, spatial, and Doppler domains. Therefore, starting from the physical mechanism of the ionosphere itself, a deeper scientific understanding of the mechanism of ionospheric clutter will help achieve breakthroughs in ionospheric clutter suppression technology. Considering the physical characteristics of the ionosphere, this paper proposes that the ionosphere should be regarded as a two-dimensional surface target similar to the sea or the ground, not as a point target for conventional detection. We therefore derive the generalized radar range equation for ionospheric scattering, establishing the mathematical relationship among the radio-wave propagation path, the ionospheric RCS, and the radar system parameters, and quantifying the influence of the ionosphere on the radar system. Besides, based on the coherent scattering mechanism of HF electromagnetic waves by ionospheric irregularities, the RCS scattering coefficient of the ionosphere is estimated, and the generalized range equation of HFSWR based on the physical mechanism of the ionosphere is obtained.
2 Ionospheric Range Equation of High-Frequency Surface Radar

The range equation of the radar can be expressed as:

$$P_r = \frac{P_t G_t G_r \lambda^2 \gamma \sigma}{(4\pi)^3 R^4 P_n L_s L_P} \tag{1}$$
where λ is the radar transmitting wavelength; $P_r$ is the ionospheric echo power received by the radar; $G_t$ is the transmitting antenna gain; $P_t$ is the transmitting power; $G_r$ is the receiving array antenna gain; γ is the signal duty cycle; σ is the effective ionospheric scattering cross section; R is the linear range between the ionospheric reflecting surface and the radar location; $P_n$ is the noise power; $L_P$ is the propagation loss of the radar beam in the ionosphere; and $L_s$ is the internal loss of the radar system. The ionospheric RCS and the propagation loss $L_P$ in the ionospheric radar equation are both related to ionospheric characteristics and need special estimation. Among them, the propagation loss of high-frequency radio waves in the ionosphere has already been calculated by practical engineering methods [12]; therefore, we focus on the estimation of the ionospheric RCS.

2.1 Ionospheric Range Equation Under Surface Scattering

Regarding the scattering surface of the ionosphere as a plane, the ionospheric range equations are discussed for the cases where the extent of the ionospheric echoes is limited by the pulse and by the beam, respectively. As shown in Fig. 1a, the vertical extent of the ionospheric scattering surface within the radar beam is approximately $R_0 \varphi_3$, where $R_0$ is the range from the ionospheric scattering center to the radar and $\varphi_3$ is the 3 dB beamwidth. The ionospheric width illuminated by the beam is then $R_0 \varphi_3 / \cos\theta$, where θ is the angle between the line-of-sight vector and the vertical direction. The pulse-limited case is shown in Fig. 1b: for the ionosphere irradiated by a pulse with a range resolution of ΔR, without considering the beam width, the scatterer extent
Fig. 1 Geometry of ionosphere scattering: a projection of the elevation beam on the ionosphere, b range resolution projection on the ionosphere
corresponding to the ionospheric echoes is the scatterer distribution width within the range resolution cell, and its value is $\Delta R / \sin\theta$. The effective width of ionospheric scattering is the smaller of the range resolution and the elevation beamwidth projected onto the ionosphere; it therefore depends on the ionospheric range, the radar range resolution, and the elevation angle, according to the following criteria:

$$\text{beam-limited:}\quad \varphi_3 \tan\theta < \frac{\Delta R}{R_0} \tag{2}$$

$$\text{pulse-limited:}\quad \varphi_3 \tan\theta > \frac{\Delta R}{R_0} \tag{3}$$

From the ionospheric surface scattering model, the total RCS of the ionosphere originates from the surface scattering at the range $R_0$, and the non-surface part of the resolution cell does not scatter, so the RCS differential dσ is proportional to the differential dA of the ionospheric scattering surface area:

$$\mathrm{d}\sigma = \delta(R - R_0)\,\sigma_2\,\mathrm{d}A \tag{4}$$
where $\sigma_2$ is the RCS scattering rate per unit area of the ionosphere. Therefore, the ionospheric range equation of the HF surface wave radar under surface scattering can be expressed as:

$$P_r = \frac{P_t \lambda^2 \gamma \sigma_2}{(4\pi)^3 R_0^4 P_n L_s L_P(R_0)} \int_{A(R_0,\theta,\varphi)} p_t(\theta,\varphi)\, p_r(\theta,\varphi)\, \mathrm{d}A \tag{5}$$

where $p_r(\theta,\varphi)$ is the direction coefficient of the receiving antenna at $(\theta,\varphi)$, with $p_r(0,0) = G_r$, and $p_t(\theta,\varphi)$ is the direction coefficient of the transmitting antenna at $(\theta,\varphi)$, with $p_t(0,0) = G_t$.
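The beam-limited/pulse-limited criteria reduce to comparing $\varphi_3 \tan\theta$ against $\Delta R / R_0$ (writing ΔR for the range resolution). A small helper sketches this test; the numeric values (45° elevation beamwidth, 5 km resolution, 200 km range) echo the simulation section and are otherwise illustrative:

```python
import math

def scattering_regime(phi3_deg, theta_deg, dR_km, R0_km):
    """Classify the surface-scattering geometry: beam-limited if
    phi3 * tan(theta) < dR / R0, otherwise pulse-limited.
    theta is the angle from the vertical direction."""
    phi3 = math.radians(phi3_deg)
    theta = math.radians(theta_deg)
    return "beam-limited" if phi3 * math.tan(theta) < dR_km / R0_km else "pulse-limited"

# Near-vertical echoes obey the beam limit; oblique echoes obey the pulse limit.
near_vertical = scattering_regime(45, 1, 5, 200)
oblique = scattering_regime(45, 30, 5, 200)
```

With a wide 45° beam the crossover elevation is tiny, which matches the paper's observation that only near-vertical clutter is beam-limited.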
In the beam-limited case, $R_0^2 \theta_3 \varphi_3 / \cos\theta$ represents the effective backscattering area at the range $R_0$, so the differential area can be written as:

$$\mathrm{d}A = \frac{R_0^2\, \mathrm{d}\theta\, \mathrm{d}\varphi}{\cos\theta} \tag{6}$$

Approximating the antenna gain as constant within the 3 dB beamwidth, the range equation for ionospheric backscattering under the beam limitation is obtained:

$$P_r = \frac{P_t G_t G_r \lambda^2 \gamma\, \theta_3 \varphi_3}{(4\pi)^3 R_0^2 P_n L_s L_P(R_0) \cos\theta}\, \sigma_2 \tag{7}$$

If the irradiated area of the ionosphere is limited by the pulse, the effective area contributing to backscattering at any time is $R_0 \theta_3\, \Delta R / \sin\theta$. The scattering range equation of the ionospheric surface under the pulse limitation is then:

$$P_r = \frac{P_t G_t G_r \lambda^2 \gamma\, \theta_3\, \Delta R}{(4\pi)^3 R_0^3 P_n L_s L_P(R_0) \sin\theta}\, \sigma_2 \tag{8}$$
2.2 Estimation of the Ionospheric Reflection Coefficient and Parameters

For the estimation of $\sigma_2$, the simplified model of Walker and Booker can be used [13]:

$$\sigma_2 \propto \overline{\Delta N_e^2}\, \exp\!\left\{-2k^2 \left[l_\parallel^2 \alpha^2 + l_\perp^2\right]\right\} \tag{9}$$

where $l_\perp$ is the irregularity scale length perpendicular to the geomagnetic field, $l_\perp = \lambda_{\text{radar}}/2$; $l_\parallel$ is the irregularity scale length parallel to the geomagnetic field, with $800\, l_\perp < l_\parallel < 2250\, l_\perp$; k is the radar wavenumber; α is the aspect angle; $\overline{\Delta N_e^2}$ is the variance of the electron concentration fluctuation; and $N_e$ is the electron concentration. Suppose that the variance of the irregularity fluctuation is linear in the square of the mean electron concentration:

$$\overline{\Delta N_e^2} = C_0 N_e^2 \tag{10}$$

where $C_0$ is the coefficient of the fluctuation degree of the irregularities. When only α = 0° is considered, Formula (9) can be approximated as:

$$\sigma_2 = C_0\, e^{-2\pi^2} N_e^2 \tag{11}$$
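A sketch of this estimate: with $l_\perp = \lambda/2$ and $k = 2\pi/\lambda$, the exponent $-2k^2 l_\perp^2$ collapses to $-2\pi^2$ independently of wavelength, which is why Eq. (11) has no λ in it. The value of $C_0$ below is an arbitrary placeholder, not a value from the paper:

```python
import math

def sigma2(Ne, C0=1e-4):
    """RCS scattering rate per unit area, Eq. (11):
    sigma2 = C0 * exp(-2*pi^2) * Ne^2, valid at aspect angle alpha = 0.
    C0 is a placeholder fluctuation coefficient (assumed, not from the paper)."""
    return C0 * math.exp(-2 * math.pi ** 2) * Ne ** 2

# The exponent: -2 * k^2 * l_perp^2 with k = 2*pi/lam and l_perp = lam/2
# gives -2 * (2*pi/lam)**2 * (lam/2)**2 = -2*pi**2 for any lam.
```

The wavelength cancellation is the key simplification: the coherent-scatter attenuation factor $e^{-2\pi^2} \approx 2.7 \times 10^{-9}$ is a fixed constant for any radar frequency.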
Substituting Formula (11) into Formula (7), the ionospheric echo power of the irregularities under the beam-limited case is obtained:

$$P_r = \frac{P_t G_t G_r \lambda^2 \gamma\, \theta_3 \varphi_3}{(4\pi)^3 R_0^2 P_n L_s L_P(R_0) \cos\theta}\, C_0\, e^{-2\pi^2} N_e^2 \tag{12}$$

Substituting Formula (11) into Formula (8), the ionospheric echo power of the irregularities under the pulse-limited case is obtained:

$$P_r = \frac{P_t G_t G_r \lambda^2 \gamma\, \theta_3\, \Delta R}{(4\pi)^3 R_0^3 P_n L_s L_P(R_0) \sin\theta}\, C_0\, e^{-2\pi^2} N_e^2 \tag{13}$$
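The pulse-limited power can be evaluated directly. This sketch uses our reconstructed form of Eq. (13); the function signature and parameter values are illustrative (loosely following the Table 1 simulation parameters), not the paper's exact setup. It also reproduces the $\lambda^2$ dependence noted in the simulation section: halving the frequency quadruples the echo power.

```python
import math

C_LIGHT = 3e8  # speed of light, m/s

def pulse_limited_power(Pt, Gt, Gr, f, gamma, theta3, dR, R0, Pn, Ls, Lp, theta, C0, Ne):
    """Eq. (13), pulse-limited ionospheric echo power.

    f in Hz; dR and R0 in m; theta (from vertical) and theta3 in radians;
    gains and losses are linear ratios. Note that P_n sits in the
    denominator as in Eq. (1), so the result is effectively a
    signal-to-noise ratio rather than an absolute power.
    """
    lam = C_LIGHT / f
    sig2 = C0 * math.exp(-2 * math.pi ** 2) * Ne ** 2   # Eq. (11)
    return (Pt * Gt * Gr * lam ** 2 * gamma * theta3 * dR * sig2) / (
        (4 * math.pi) ** 3 * R0 ** 3 * Pn * Ls * Lp * math.sin(theta))

# Illustrative values: 4.1 MHz vs 8.2 MHz, 5 km resolution, 300 km range, 30 deg.
p_low = pulse_limited_power(1e5, 10, 10, 4.1e6, 0.1, math.radians(20),
                            5e3, 3e5, 1.0, 1.0, 1.0, math.radians(30), 1e-4, 1e9)
p_high = pulse_limited_power(1e5, 10, 10, 8.2e6, 0.1, math.radians(20),
                             5e3, 3e5, 1.0, 1.0, 1.0, math.radians(30), 1e-4, 1e9)
```

Doubling the frequency only changes $\lambda^2$, so `p_low / p_high` is exactly 4, consistent with the later observation that lower operating frequencies give stronger ionospheric echoes.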
3 Simulation

Table 1 shows the main simulation parameters of the model. Generally, the ionospheric range is 100–300 km, and the range resolution of the HFSWR system is 5 km. Figure 2a shows the simulation of elevation angle θ versus beamwidth $\varphi_3$ when the ionospheric range is 100, 200, and 300 km. When θ and $\varphi_3$ lie below the curve, the surface scattering is beam-limited; otherwise, it is pulse-limited. It can be seen that as the ionospheric range and beamwidth increase, the elevation angle decreases rapidly away from the vertical direction and tends to 0. This means that when the antenna beam is wider, the ionospheric clutter from the vertical direction is subject to the beam limit, while the other elevation angles are subject to the pulse limit. Figure 2b shows the relationship between the ionospheric range and the beamwidth when the elevation angle is 10°, 20°, and 30°. It can be seen that as the ionospheric range and elevation angle increase, the ionospheric echoes in the vertical direction tend to obey the beam limit, while the oblique echoes obey the pulse limit.

Table 1 Simulation parameters of the ionospheric detection equation

| Radar frequency (MHz) | Signal duty cycle | Antenna gain | Transmitting power (kW) | Ionosphere range (km) | Elevation beam width (°) | Azimuth beam width (°) | Electron density fluctuation (m⁻⁶) |
|---|---|---|---|---|---|---|---|
| 4.1 | 0.1 | 105 | 2 | 300 | 45 | 20 | 10¹⁸ |
[Fig. 2a: elevation angle Theta (°) versus beamwidth Phi (°) for ionosphere ranges of 100, 200, and 300 km. Fig. 2b: elevation angle Theta (°) versus ionospheric range (km) for elevation angles of 10°, 20°, and 30°.]
Fig. 2 Analysis of pulse limitation and beam limitation
774
Y. Yang et al.

Because the actual antenna system of HF surface wave radar generally has a wide elevation beam, the ionospheric echoes from the vertical direction may be subject to the beam-limited case, while the other, oblique ionospheric echoes are mostly subject to the pulse-limited case. Figure 3a shows the simulated power spectrum of the ionospheric echoes in the beam-limited case as a function of elevation angle when the frequency is 4.1 MHz, 6.1 MHz, and 8.1 MHz, respectively. The range resolution is set to 5 km. Figure 3b shows the simulated echo power spectrum versus elevation angle when the electron concentration fluctuation is 10^18 m−6, 10^20 m−6, and 10^22 m−6, respectively. It can be seen that the ionospheric echoes have higher power for stronger electron concentration fluctuation and lower operating frequency.
[Fig. 3a: Power (dB) versus Elevation (°) for frequencies of 4.1, 6.1, and 8.1 MHz. Fig. 3b: Power (dB) versus Elevation (°) for electron concentration fluctuations of 10^18, 10^20, and 10^22 m−6.]
Fig. 3 Analysis of the vertical PSD and the elevation angle in the beam-limited case
Besides, from Eq. (7), it can be seen that the ionospheric received power varies as R0−2, and the received power is also proportional to the elevation and azimuth beamwidths, so a long range and a wide beam lead to an increase in the ionospheric echo power. Because HFSWR satisfies these conditions, it can often receive strong, large-scale ionospheric echoes. Among these parameters, the fluctuation of the electron concentration has the greatest influence on the power spectral density (PSD). Figure 4a shows the simulated PSD of the oblique ionospheric echoes as a function of elevation angle when the range is 100 km, 200 km, and 300 km, respectively. Figure 4b shows the simulated PSD when the electron concentration fluctuation is 10^18 m−6, 10^20 m−6, and 10^22 m−6. It can be seen that the power of the ionospheric echoes decreases as the elevation angle increases, which indicates that the ionospheric echoes near the vertical direction are the strongest over elevation angle. Figure 4a also shows that the long-range echoes are weaker than the short-range echoes, which is caused by the R0−3 dependence in (8). Figure 4b shows that the ionospheric echo power increases by about 10 dB when the electron concentration fluctuation increases. Therefore, the effect of the electron concentration fluctuation on the ionospheric echo power is the largest factor in the pulse-limited case.
[Fig. 4a: Power (dB) versus Elevation (°) for ranges of 100, 200, and 300 km. Fig. 4b: Power (dB) versus Elevation (°) for electron concentration fluctuations of 10^18, 10^20, and 10^22 m−6.]
Fig. 4 Analysis of the oblique ionospheric spectrum and elevation angle in the pulse-limited case
From Figs. 2, 3, and 4, it can be seen that for surface scattering, the radar operating frequency, the beamwidth, the ionospheric range, the electron concentration fluctuation, and other factors all affect the ionospheric echo power, which also explains why the ionospheric echoes in the RD spectrum are complex and unpredictable. Among these factors, the most significant one is the fluctuation of the electron concentration. Therefore, real-time spatial diagnosis of the ionosphere will be beneficial to the understanding of the ionospheric echo mechanism of HFSWR and to the improvement of ionospheric clutter suppression algorithms.
4 Conclusion

Because the distribution characteristics of the ionosphere are more similar to those of meteorological targets, it is more appropriate to model the ionospheric echo as a distributed scatterer with surface scattering. In this paper, the radar range equation of two-dimensional surface scattering is established for the ionospheric target of HFSWR. The key factor in this equation is the RCS scattering coefficient of the ionosphere. Based on the principle of high-frequency coherent scattering, the RCS scattering coefficient of the ionosphere is estimated from the physical mechanism of the ionosphere, and the generalized range equation of the HFSWR ionosphere is obtained.
References

1. Zhou W, Jiao P (2008) Over-the-horizon radar technology. Publishing House of Electronics Industry, Beijing
2. Barrick D (2003) History, present status, and future directions of HF surface-wave radars in the U.S. In: International radar conference. IEEE, pp 652–655
3. Barrick D (1972) First-order theory and analysis of MF/HF/VHF scatter from the sea. IEEE Trans Antenn Propag 20:2–10
4. Barrick D, Snider J (1977) The statistics of HF sea-echo Doppler spectra. IEEE Trans Antenn Propag 25(1):19–28
5. Li Q, Zhang W, Li M et al (2017) Automatic detection of ship targets based on wavelet transform for HF surface wave radar. IEEE Geosci Remote Sens Lett 14(5):714–718
6. Guo X, Sun H, Yeo TS (2008) Interference cancellation for high-frequency surface wave radar. IEEE Trans Geosci Remote Sens 46(7):1879–1891
7. Jangal F, Saillant S, Helier M (2009) Ionospheric clutter mitigation using one-dimensional or two-dimensional wavelet processing. IET Radar Sonar Navig 3(2):112–121
8. Zhang X, Yang Q, Yao D et al (2015) Main-lobe cancellation of the space spread clutter for target detection in HFSWR. IEEE J Sel Top Sign Process 9(8):1632–1638
9. Saleh O, Ravan M, Riddolls R et al (2016) Fast fully adaptive processing: a multistage STAP approach. IEEE Trans Aerosp Electron Syst 52(5):2168–2183
10. Thayaparan T, Ibrahim Y, Polak J et al (2018) High-frequency over-the-horizon radar in Canada. IEEE Geosci Rem Sens Lett 7(99):1–5
11. Zhang J, Deng W, Zhang X et al (2018) A method of track matching based on multipath echoes in high-frequency surface wave radar. IEEE Antenn Wirel Propag Lett 17(10):1852–1855
12. Yang X, Yu C, Liu A et al (2016) The vertical ionosphere electron density probing with high-frequency surface wave radar. Chin J Radar Sci 31(2):291–297
13. Ponomarenko PV, St-Maurice JP, Waters CL et al (2009) Refractive index effects on the scatter volume location and Doppler velocity estimates of ionospheric HF backscatter echoes. Ann Geophys 27(11):4207–4219
Greedy Matching-Based Pilot Allocation in Massive MIMO Systems

Wen Zhao1, He Gao2, Yao Ge2(B), Jie Zhang1, Sanlei Dang1, and Tao Lu1

1 Metrology Center, Guangdong Power Grid Co., Ltd., Guangzhou, China
2 Beijing University of Posts and Telecommunications, Beijing, China
[email protected]
Abstract. With the ability to achieve high data rates and high spectral efficiency, massive multiple-input multiple-output (MIMO) has become a key technology in the fifth-generation wireless network. However, the pilot contamination caused by reusing the same pilots among neighboring-cell users restricts the performance of massive MIMO systems. To mitigate the impact of pilot contamination, this paper proposes a greedy matching-based pilot allocation (GMPA) algorithm for heterogeneous cellular networks. GMPA transforms the pilot allocation problem into a matching optimization problem through matching theory. A greedy method is then used to find a sub-optimal pilot allocation. Simulation results show that the proposed algorithm can effectively improve the spectrum efficiency and pilot efficiency of the system.
1 Introduction
Equipped with a large number of antennas at the base station (BS), massive multiple-input multiple-output (MIMO) systems can achieve higher data rates than conventional MIMO systems [1]. When combined with a heterogeneous cellular network, massive MIMO can effectively enhance system performance for the simultaneously operating small-cell and macro cellular networks [2]. A small-cell base station (SBS) is capable of improving communication link reliability [3]. However, the problem of pilot contamination arises in the heterogeneous cellular network along with massive MIMO. There already exist some works on mitigating pilot contamination in massive MIMO systems. Zhu et al. [4] exploited large-scale channel fading information and constructed a graph coloring-based pilot allocation (GC-PA) algorithm to reduce the potential interference experienced by user equipment (UE). A weighted graph coloring-based pilot decontamination (WGC-PD) algorithm was proposed in [5]. In [6], UEs closer to the BS reused the same pilot sequence, and optimal pilot multiplexing parameters were investigated. Alkhaled et al.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_103
778
W. Zhao et al.
[7] proposed an adaptive pilot allocation (APA) algorithm based on a grouping idea. In [8], a location-based channel estimation algorithm was proposed, and a pilot allocation algorithm was designed accordingly. The proposed algorithm guaranteed that UEs using the same pilots in different cells have different angles of arrival. Zhao et al. [9] proposed a pilot allocation algorithm designed according to the UEs' location information and the calculated potential interference strength. Aiming at the complex interference measurement in heterogeneous cellular massive MIMO systems, this paper proposes a greedy matching-based pilot allocation (GMPA) algorithm. The algorithm transforms the pilot allocation problem into a matching optimization problem by means of matching theory. The problem is then solved by a greedy method. Simulation results show that the proposed algorithm is capable of improving the spectrum efficiency and pilot efficiency of the system.
2 System Model
This paper considers a heterogeneous cellular network with massive MIMO, as shown in Fig. 1.
Fig. 1 Heterogeneous cellular networks with massive MIMO
The system consists of L macro cells; each macro cell contains one macro base station (MBS), K uniformly distributed single-antenna macro user equipments (MUE), and P microcells. Each microcell consists of one small-cell base station (SBS) and one single-antenna small-cell user equipment (SUE) [10]. The MBS is equipped with M antennas, while the SBS is equipped with one antenna. The system works in time division duplexing (TDD) mode, with MUE and SUE using the same time–frequency resources. Unless otherwise stated, the cells mentioned in this chapter refer to macro cells. The received signal at the lth cell is expressed as

$$\mathbf{y}_l = \sum_{i=1}^{L} \mathbf{H}_{l,i}\mathbf{x}_i + \sum_{i=1}^{L} \mathbf{H}'_{l,i}\mathbf{x}'_i + \mathbf{n}_l, \tag{1}$$

where y_l ∈ C^{M×1}; H_{l,i} ∈ C^{M×K} represents the channel matrix from the MUEs to the MBS of the lth cell; H'_{l,i} ∈ C^{M×P} represents the channel matrix from the P SUEs to the MBS of the lth cell; x_i ∈ C^{K×1} and x'_i ∈ C^{P×1} represent the transmitted signal vectors of the K MUEs and the P SUEs in the ith cell, respectively; and n_l ∈ C^{M×1} denotes the additive white Gaussian noise vector, whose elements are independent and identically distributed with a complex Gaussian distribution CN(0, 1). The received signal at the pth SBS of the lth cell is expressed as

$$y_{l,p} = \sum_{i=1}^{L} \mathbf{g}_{l,i}\mathbf{x}_i + \sum_{i=1}^{L} \mathbf{g}'_{l,i}\mathbf{x}'_i + n_{l,p}, \tag{2}$$

where g_{l,i} ∈ C^{1×K} represents the channel vector between the MUEs and the pth SBS of the lth macro cell; g'_{l,i} ∈ C^{1×P} is the channel vector between the SUEs and the pth SBS of the lth macro cell; x_i ∈ C^{K×1} denotes the transmitted signal of the MUEs; x'_i ∈ C^{P×1} represents the transmitted signal of the SUEs; and n_{l,p} represents additive white Gaussian noise with distribution CN(0, 1). Given the small-scale fading matrix F_{l,i} and the large-scale fading matrix D_{l,i}, H_{l,i} can be written as

$$\mathbf{H}_{l,i} = \mathbf{F}_{l,i}\mathbf{D}_{l,i}^{1/2}. \tag{3}$$

The element h_{l,(i,k)} of H_{l,i} represents the channel coefficient between the kth UE of the ith cell and the lth BS antenna array, given as

$$h_{l,(i,k)} = f_{l,(i,k)}\sqrt{\beta_{l,(i,k)}}, \tag{4}$$

where f_{l,(i,k)} and β_{l,(i,k)} represent the small-scale and large-scale fading coefficients, respectively. β_{l,(i,k)} varies slowly over several successive coherent channel instants and is assumed known in advance. During channel estimation, pilot contamination mainly comes from inter-cell interference and intra-cell interference. Inter-cell interference is caused by MUEs in neighboring cells using the same pilot. Intra-cell interference is caused by SUEs using the same pilot within the same cell.
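As a concrete illustration of the channel model in (3)-(4), the sketch below builds H from i.i.d. CN(0, 1) small-scale fading and per-UE large-scale coefficients, following the common convention H = F D^{1/2} (so each coefficient is f·sqrt(β)); the antenna count and the β values are arbitrary assumptions.

```python
import numpy as np

def channel_matrix(M, betas, rng):
    """H = F D^{1/2}: column k is CN(0,1) small-scale fading scaled by sqrt(beta_k)."""
    K = len(betas)
    # i.i.d. CN(0, 1): real and imaginary parts each ~ N(0, 1/2)
    F = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)
    return F @ np.diag(np.sqrt(betas))

rng = np.random.default_rng(0)
H = channel_matrix(64, np.array([1.0, 0.25, 0.04]), rng)
# Per-antenna average power of column k estimates beta_k
print(np.mean(np.abs(H) ** 2, axis=0))
```

The per-column average power recovers the large-scale coefficients as M grows, which is what makes β usable as slowly varying prior knowledge in the allocation algorithm.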
3 Greedy Matching-Based Pilot Allocation Algorithm
In this section, the GMPA algorithm is proposed to minimize the interference caused by pilot contamination.

3.1 Optimization Objective
The proposed GMPA algorithm utilizes matching theory to assign an appropriate pilot sequence to each UE. In order to approach the global optimum, a total utility function is defined to reflect the potential interference experienced by the UEs:

$$\min_{\mu} W(\mu) = \min_{\mu}\Big(\sum_{u\in U}\eta_u(\mu) + \sum_{\varphi\in\Phi}\eta_\varphi(\mu)\Big) \quad \text{s.t. } |\mu(u)| = 1,\ |\mu(\varphi)| \ge 1,\ \forall u\in U,\ \forall\varphi\in\Phi, \tag{5}$$

where η_u(μ) and η_φ(μ) denote the utility function of UE u and the utility function of pilot φ in matching μ, respectively.

3.2 Binary Network Mapping
We model the heterogeneous cellular system as a binary (bipartite) network comprising two node groups and the edges connecting them. One node group, denoted by X, comprises all the UEs, and the other node group, written as Y, comprises all the BSs. The weight of each edge is determined by the geometric distance between the UE and the BS. It should be noted that in the binary network scenario we do not distinguish MUE from SUE, or MBS from SBS. With M UEs and N BSs, the binary network is constructed as follows:

Binary network construction process
1. Set node group X as the upper nodes of the binary network;
2. Set node group Y as the lower nodes of the binary network;
3. Loop over m, n:
4. Connect the mth UE and the nth BS with an edge of weight a_{m,n};
5. End loop

Referring to the mapping approach in [11], we map the whole network relationship onto one node group. The mapping process is given by the following formulas, where P denotes the mapping matrix.
Referring to the mapping approach in [11], we map the the whole network relationship to one node group. The mapping process can be given as the following formulas. P denotes the mapping matrix.
[x1 ...xM ]T1×M = [P1 ...PM ]TM ×M · [x1 ...xM ]T1×M
(6)
PM represents the mth row vector of P.
xm =
N N M am,n yn am,n ai,n xi = k(Yn ) k(Yn ) i=1 k(Xi ) n=1 n=1
(7)
Greedy Matching-Based Pilot Allocation in Massive MIMO Systems
yn =
pm,i =
M am,n xm K (Xm ) m=1
N 1 am,n ai,n K(Xi ) n=1 k(Yn )
781
(8)
(9)
am,n represents the weight between Xm and Yn , am,n = dm,n . K (Xm ) denotes the degree of node Xm . 3.3
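The one-mode projection of Eqs. (6)-(9) can be transcribed directly. In this sketch the edge weights a[m, n] stand for the UE-BS distances d_{m,n}, and the node degrees k(X), k(Y) are taken as weighted degrees (row and column sums), which is an assumption about how the degrees in (7)-(9) are evaluated.

```python
import numpy as np

def project_to_ues(a):
    """Eq. (9): p[m, i] = (1 / k(X_i)) * sum_n a[m, n] * a[i, n] / k(Y_n)."""
    kx = a.sum(axis=1)   # k(X_m): weighted degree of UE node m
    ky = a.sum(axis=0)   # k(Y_n): weighted degree of BS node n
    M = a.shape[0]
    P = np.zeros((M, M))
    for m in range(M):
        for i in range(M):
            P[m, i] = np.sum(a[m, :] * a[i, :] / ky) / kx[i]
    return P

a = np.array([[1.0, 2.0],
              [3.0, 1.0]])   # 2 UEs x 2 BSs edge weights
P = project_to_ues(a)
# Each column of P sums to 1, so the mapping x' = P x redistributes
# each UE's quantity without creating or losing any of it
print(P.sum(axis=0))
```

The column-stochastic property follows from (9): summing p_{m,i} over m cancels k(Y_n) and then k(X_i), which is the conservation property of the bipartite projection in [11].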
3.3 Matching Modeling
The matching uses U = {u_1, …, u_M} and Φ = {φ_1, …, φ_K} to represent the UEs and the pilots, respectively. For each pilot there is a positive integer q_φ indicating the number of times the pilot can be reused by UEs. A matching μ describes the relationship between the UEs and the pilot sequences. Only one pilot is allocated to each UE, and the same pilot can be allocated to multiple UEs. Some notation used in the matching model: for a matching μ ⊆ Φ × U, |μ(u)| = 1; μ(u) = {φ ∈ Φ : (u, φ) ∈ μ}; μ(φ) = {u ∈ U : (u, φ) ∈ μ}; and μ²(u) denotes the set of UEs using the same pilot as the uth UE. The relationship between UEs in the matching model is described as a weighted graph G = (V, E, w), where V = U. The relationship between the uth UE and the eth UE is described by the weight of the edge between the corresponding nodes of the weighted graph. The utility function of the uth UE in matching μ is defined as

$$\eta_u(\mu) = \sum_{e\in\mu^2(u)} w(u, e), \tag{10}$$

where w(u, e) represents the relationship between UE u and UE e. Accordingly, the utility function of pilot φ in μ is defined as

$$\eta_\varphi(\mu) = D^{\varphi}_{\mu(\varphi)}, \tag{11}$$

where D^φ_{μ(φ)} is the sum of the utility values of all UEs allocated to pilot φ. The utility value between the uth UE and the eth UE is given as

$$D^{\varphi}_{(u,e)} = \begin{cases} \dfrac{\beta_{l,e}^2}{\beta_{l,u}^2} + \dfrac{\beta_{l,u}^2}{\beta_{l,e}^2}, & u, e \in MUE \\[6pt] \dfrac{\beta_{l,e}^2}{\beta_{l,u}^2} + \dfrac{1}{d_{l,u}^{\gamma}}, & u \in MUE,\ e \in SUE \\[6pt] \dfrac{1}{d_{l,e}^{\gamma}} + \dfrac{1}{d_{l,u}^{\gamma}}, & u, e \in SUE, \end{cases} \tag{12}$$

where β_{l,e}, d_{l,u}, and γ represent the large-scale fading between MUE e and MUE u, the distance between MUE u and SUE e, and the path loss, respectively. We define the matching exchange μ_u^e as the matching in which UE u exchanges pilots with UE e while the pilots allocated to the other UEs remain unchanged:

$$\mu_u^e = \{\mu \backslash \{(u,\varphi),(e,\phi)\}\} \cup \{(u,\phi),(e,\varphi)\} \tag{13}$$
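The piecewise utility value of Eq. (12) translates directly into code. The large-scale fading values, distances, and the path-loss exponent used below are placeholder assumptions.

```python
def pair_utility(kind_u, kind_e, beta_u=None, beta_e=None,
                 d_u=None, d_e=None, gamma=3.5):
    """Eq. (12): potential-interference utility between UE u and UE e.

    kind_*: 'MUE' or 'SUE'; beta_*: large-scale fading coefficients;
    d_*: MUE-SUE distances; gamma: path-loss exponent.
    """
    if kind_u == 'MUE' and kind_e == 'MUE':
        r = beta_e ** 2 / beta_u ** 2
        return r + 1.0 / r
    if kind_u == 'MUE' and kind_e == 'SUE':
        return beta_e ** 2 / beta_u ** 2 + 1.0 / d_u ** gamma
    return 1.0 / d_e ** gamma + 1.0 / d_u ** gamma   # both SUE
```

A large utility value flags a UE pair that would interfere strongly if given the same pilot, which is exactly what the matching exchanges below try to drive down.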
A matching is pairwise stable (PS) if and only if there is no pair of UEs (u, e) satisfying the following conditions:

$$\text{(i) } \forall i \in \{u, e, \mu(u), \mu(e)\},\ \eta_i(\mu_u^e) \le \eta_i(\mu); \quad \text{(ii) } \exists i \in \{u, e, \mu(u), \mu(e)\},\ \eta_i(\mu_u^e) < \eta_i(\mu). \tag{14}$$

PS indicates that a matching is not in the expected pairwise steady state if such a matching exchange exists. When a matching exchange is performed, the utility value of at least one participant (of the two UEs or two pilots) is reduced, and the other candidates' utility values do not increase. Based on the previously defined utility functions of UE and pilot, the total utility function is

$$W(\mu) = \sum_{u\in U}\eta_u(\mu) + \sum_{\varphi\in\Phi}\eta_\varphi(\mu). \tag{15}$$

A local minimum utility W(μ) of matching μ means that there is no matching μ′, obtained by exchanging the pilots of two UEs in μ, such that W(μ′) < W(μ). Not every stable matching attains a local minimum of W(μ). It can easily be proved that for any matching exchange μ_u^e satisfying (i) ∀i ∈ {u, e, μ(u), μ(e)}, η_i(μ_u^e) ≤ η_i(μ) and (ii) ∃i ∈ {u, e, μ(u), μ(e)}, η_i(μ_u^e) < η_i(μ),

$$W(\mu_u^e) < W(\mu). \tag{16}$$
Assuming that matching μ attains a local minimum of W(μ), (16) indicates that any matching exchange satisfying conditions (i) and (ii) would reduce the total utility function, contradicting the local minimality of μ; therefore, μ must be a stable matching. The proposed GMPA algorithm finds the allowed matching exchanges through greedy search, and the pilot allocation process is completed by matching exchanges until the PS state is reached. The GMPA algorithm is summarized as follows:

Algorithm 1 Greedy Matching-based Pilot Allocation (GMPA) algorithm
Input: d, L, P, K, β
Output: μ(u), μ(φ)
while i ≤ iterative maximum do
    Randomly select two UEs u, e for matching exchange in the same macro cell
    if η_u(μ_u^e) ≤ η_u(μ) and η_e(μ_u^e) ≤ η_e(μ) and η_{μ(u)}(μ_u^e) ≤ η_{μ(u)}(μ) and η_{μ(e)}(μ_u^e) ≤ η_{μ(e)}(μ) then
        μ ← μ_u^e
    end if
end while
The algorithm outputs a matching relationship between UEs and pilots which represents pilot sequence allocation information.
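A minimal sketch of the greedy swap loop of Algorithm 1 follows. The utility here is simplified to the UE-side function of Eq. (10) over a symmetric weight matrix W (the pilot-side checks of Algorithm 1 are omitted), and the cell layout and iteration budget are assumptions; because a swap is accepted only when neither UE's utility increases, the UE part of the total utility in Eq. (15) never increases.

```python
import random

def ue_utility(u, assign, W):
    """eta_u of Eq. (10): summed weights to UEs sharing u's pilot."""
    return sum(W[u][e] for e in range(len(assign))
               if e != u and assign[e] == assign[u])

def total_utility(assign, W):
    return sum(ue_utility(u, assign, W) for u in range(len(assign)))

def gmpa(W, n_pilots, iters=2000, seed=0):
    rng = random.Random(seed)
    M = len(W)
    assign = [u % n_pilots for u in range(M)]       # initial pilot allocation
    for _ in range(iters):
        u, e = rng.sample(range(M), 2)
        if assign[u] == assign[e]:
            continue                                # swapping equal pilots is a no-op
        trial = list(assign)
        trial[u], trial[e] = assign[e], assign[u]   # matching exchange mu_u^e
        if (ue_utility(u, trial, W) <= ue_utility(u, assign, W) and
                ue_utility(e, trial, W) <= ue_utility(e, assign, W)):
            assign = trial                          # accept: no involved utility grew
    return assign
```

Since a swap only reassigns two UEs between two pilot groups, each accepted exchange also never increases the total utility, mirroring the argument behind (16).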
4 Numerical Results
Simulation results are presented in this section. We compare the spectrum efficiency and pilot efficiency of the following algorithms: (1) the proposed GMPA algorithm, where 12 pilot sequences are used to support each macro cell with 12 MUEs and 4 SUEs; (2) the comparison pilot allocation (ComPA) algorithm, where geometric distance, instead of the binary network mapping process, is used to depict the relationship between UEs; (3) the conventional pilot allocation (ConPA) algorithm, defined as the pilot allocation algorithm for the heterogeneous cellular network in [15]. Figure 2 shows the spectral efficiency of the abovementioned algorithms versus the number of BS antennas. Higher spectral efficiency is obtained as the number of BS antennas increases, and the performance of the proposed GMPA is always better than that of the other algorithms for the same number of BS antennas.
Fig. 2 Spectrum efficiency comparison with varying numbers of BS antennas
Figure 3 shows the pilot efficiency versus the number of BS antennas. The pilot efficiency is defined as γ = R/P, where the uplink system rate R is given as

$$R = \sum_{l=1}^{L}\sum_{k=1}^{K} R_{l,k}. \tag{17}$$
Figure 4 shows the cumulative distribution of the average arrival rate at the UE side with 256 BS antennas. It is obvious that GMPA is an attractive pilot allocation approach, with performance superior to that of the other algorithms.
Fig. 3 Pilot efficiency comparison with varying numbers of BS antennas
5 Conclusion

In this paper, a pilot allocation algorithm for heterogeneous cellular massive MIMO systems is proposed. The algorithm is based on greedy matching and shows a clear performance advantage under the complex interference relationships of heterogeneous cellular massive MIMO systems. Simulation results verify the effectiveness and superiority of the algorithm.

Acknowledgment. Part of this work was supported by the Science and Technology Project of China Southern Power Grid Co., Ltd (Grant No. GDKJXM20185366 (036100KK52180028)).
References

1. Elijah O, Leow CY, Rahman TA et al (2016) A comprehensive survey of pilot contamination in massive MIMO 5G system. IEEE Commun Surv Tutorials 18(2):905–923
2. Andrews JG, Claussen H, Dohler M et al (2012) Femtocells: past, present, and future. IEEE J Sel Areas Commun 30(3):497–508
3. Kountouris M, Pappas N (2013) HetNets and massive MIMO: modeling, potential gains, and performance analysis. arXiv preprint arXiv:1309.4942
4. Zhu X, Dai L, Wang Z (2015) Graph coloring based pilot allocation to mitigate pilot contamination for multi-cell massive MIMO systems. IEEE Commun Lett 19(10):1842–1845
5. Zhu X, Dai L, Wang Z et al (2017) Weighted-graph-coloring-based pilot decontamination for multicell massive MIMO systems. IEEE Trans Veh Technol 66(3):2829–2834
6. Atzeni I, Arnau J, Debbah M (2015) Fractional pilot reuse in massive MIMO systems. In: 2015 IEEE international conference on communication workshop (ICCW). IEEE, New York, pp 1030–1035 7. Alkhaled M, Alsusa E, Hamdi KA (2017) Adaptive pilot allocation algorithm for pilot contamination mitigation in TDD massive MIMO systems. In: 2017 IEEE wireless communications and networking conference (WCNC). IEEE, New York 8. Wang Z, Qian C, Dai L et al (2015) Location-based channel estimation and pilot assignment for massive MIMO systems. In: 2015 IEEE international conference on communication workshop (ICCW). IEEE, New York, pp 1264–1268 9. Zhao P, Wang Z, Qian C et al (2016) Location-aware pilot assignment for massive MIMO systems in heterogeneous networks. IEEE Trans Veh Technol 65(8):6815– 6821 10. Hoydis J, Hosseini K, Ten Brink S et al (2013) Making smart use of excess antennas: Massive MIMO, small cells, and TDD. Bell Labs Tech J 18(2):5–21 11. Zhou T, Ren J, Medo M et al (2007) Bipartite network projection and personal recommendation. Phys Rev E 76(4):046115
Modeling and Simulation of Ionospheric Volume Scattering Equation for High-Frequency Surface Radar

Yongfeng Yang, Wei Tang(B), Xuguang Yang, Xueling Wei, and Le Yang

School of Information Engineering, Long Dong University, Qingyang, Gansu 745000, China
[email protected]
Abstract. Ionospheric clutter is always a key factor restricting the detection performance of high-frequency surface wave radar (HFSWR). This paper establishes the radar range equation for volume scattering according to the unique distribution characteristics of the ionosphere and quantifies the impact of the ionospheric characteristic parameters on the radar system. The simulation shows that the ionospheric echo intensity under volume scattering increases as the radar beamwidth increases, and that the amplitude of the irregularity fluctuation is the most important factor affecting the ionospheric echo intensity.

Keywords: HF surface wave radar · Ionospheric echoes · Radar equation
1 Introduction

High-frequency surface wave radar (HFSWR) uses the physical characteristics of high-frequency vertically polarized wave diffraction propagation along the ocean surface to detect low-altitude aircraft and ships beyond the horizon, expanding the monitoring range of conventional microwave radar [1, 2]. In another important application, ocean remote sensing, HFSWR can detect the ocean state parameters within the radar coverage and obtain many ocean dynamic parameters, such as ocean currents, wind direction, and wave height, in real time [3, 4]. However, in a real radar system, the integrated detection performance of HFSWR is severely reduced by radio interference, the loss of the sea transmission path, and ionospheric disturbance. Ideally, the radar beam should be transmitted entirely along the ocean surface. However, because of array errors and antenna disturbance in the actual radar antenna system, part of the radar energy irradiates the ionosphere and enters the radar receiver along multiple paths after ionospheric refraction, thus forming ionospheric clutter. Due to the non-stationary, time-varying characteristics of the ionosphere, the ionospheric clutter in the range-Doppler (RD) spectrum presents a certain Doppler frequency shift and broadening, which submerges the target echoes in the affected ranges and reduces the detection ability of the radar system [8–11]. Currently, various suppression algorithms
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_104
for ionospheric clutter based on radar signal processing have been proposed [12–14]. Besides, scattering mechanisms such as a third-order spatial power spectrum function of ionospheric irregularities have also been introduced into the ionosphere model of HFSWR [16–21]. To model the ionospheric echoes of HFSWR precisely, the random fluctuations of both the background ionosphere and the irregularities should be considered. This paper argues that the ionosphere should be regarded as a three-dimensional object with a meteorological-like distribution, not as a point target. The generalized radar range equation for the ionosphere is derived, establishing the mathematical relationship between the ionosphere and the radar system parameters, so that the influence of the ionosphere on the HFSWR system can be quantified. Besides, based on the coherent scattering mechanism of HF electromagnetic waves by ionospheric irregularities, the scattering coefficient of the ionosphere is estimated, and the generalized range equation of HFSWR is obtained from the physical mechanism of the ionosphere.
2 Ionospheric Range Equation of HFSWR

The range detection equation of a monostatic radar [12] can be expressed as follows:

$$P_r = \frac{P_t G_t G_r \lambda^2 \gamma \sigma}{(4\pi)^3 R^4 P_n L_s L_P} \tag{1.1}$$
where λ represents the radar transmitted wavelength; Pt represents the radar transmitted power; Pr represents the received ionospheric echo power; Gt represents the transmitting antenna gain; Gr represents the receiving array antenna gain; γ represents the signal duty cycle; σ represents the effective scattering cross section of the ionosphere; R represents the distance between the reflecting surface of the ionosphere and the radar; Pn represents the noise power; LP represents the propagation loss of the radio wave in the ionosphere; and Ls represents the system loss factor. The difference between the radar equations of a volume target and a point target lies in the ionospheric RCS σ and the radio wave propagation loss LP, which are both related to the ionospheric characteristics and thus need to be estimated specifically. LP has been calculated by engineering methods [23], so we focus only on the estimation of the ionospheric RCS [24].

2.1 Ionospheric Volume Scattering Range Equation

For the incremental scattering volume dV in spherical coordinates, the incremental RCS of dV is assumed to be dσ. Setting the effective aperture area of the radar antenna as A_e, the corresponding incremental received power is

$$dP_r(\theta, \varphi) = \frac{P_t\, p_t(\theta, \varphi)\, d\sigma(R, \theta, \varphi)}{(4\pi R^2)^2}\, A_e \tag{1.2}$$

The total received power is obtained by integrating over all space. Because the receiver signal mainly contains the scattering within a single resolution-cell volume V, we have

$$P_r = \frac{P_t \lambda^2 \gamma}{(4\pi)^3 L_s} \int_{V(R,\theta,\varphi)} \frac{p_t(\theta, \varphi)\, p_r(\theta, \varphi)}{R^4 L_P(R)}\, d\sigma(R, \theta, \varphi) \tag{1.3}$$
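The point-target form (1.1) is easy to sanity-check numerically before moving to the volume form; every numeric value used below is an arbitrary placeholder, not a calibrated HFSWR parameter.

```python
import math

def radar_received_power(Pt, Gt, Gr, lam, gamma, sigma, R, Pn, Ls, Lp):
    """Monostatic range equation (1.1)."""
    return (Pt * Gt * Gr * lam ** 2 * gamma * sigma) / (
        (4 * math.pi) ** 3 * R ** 4 * Pn * Ls * Lp)
```

The R^-4 dependence here is the baseline against which the R0^-2 volume-scattering result of (1.7) is later contrasted: doubling the range costs a point target a factor of 16 in echo power.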
where V(R, θ, φ) is the volume of the resolution cell at (R, θ, φ). Assuming that the ionospheric RCS in the resolution cell is generated by scatterers uniformly distributed in space, the scattering rate of the incremental volume is defined as σ3, with unit m−1; then the incremental RCS of dV is

$$d\sigma = \sigma_3\, dV = \sigma_3 R^2\, dR\, d\Omega \tag{1.4}$$

where dΩ is the incremental solid angle at the coordinate (R, θ, φ). It is assumed that the ionospheric attenuation is constant within a single range resolution cell, so L_p(R) = L_p(R_0). In general, the range resolution ΔR of HFSWR is much smaller than the absolute ionospheric distance; thus we have

$$\int_{R_0 - \Delta R/2}^{R_0 + \Delta R/2} \frac{dR}{R^2} \approx \frac{\Delta R}{R_0^2} \tag{1.5}$$

Using Eqs. (1.4) and (1.5) in (1.3) gives

$$P_r = \frac{P_t \lambda^2 \gamma\, \Delta R\, \sigma_3}{(4\pi)^3 R_0^2 L_s L_p(R_0)} \int p_t(\theta, \varphi)\, p_r(\theta, \varphi)\, d\Omega \tag{1.6}$$
It is assumed that the main lobe of the antenna is a Gaussian function and that the patterns of the receiving and transmitting antennas are identical; the antenna power pattern is taken as constant within the 3 dB beamwidths and 0 elsewhere. In this case, the range equation of the ionospheric echoes under volume scattering is

$$P_r = \frac{P_t \lambda^2 G_t G_r \gamma\, \Delta R\, \theta_3 \varphi_3\, \sigma_3}{(4\pi)^3 R_0^2 L_s L_p(R_0)} \tag{1.7}$$
From Eq. (1.7), it can be seen that in the case of ionospheric volume scattering, the echo power decreases with distance only as R0−2, whereas a point target's echo power decreases as R−4. So even without considering the ionospheric RCS, the target echo intensities are far less than those of the ionospheric echoes at the same distance.

2.2 Estimation of the Ionospheric Reflection Coefficient

Here, the simplified model of Walker and Booker [26] is used:

$$\sigma_3 \propto \langle \Delta N_e^2 \rangle \exp\!\left(-2k^2\left[l_{\parallel}^2 \alpha^2 + l_{\perp}^2\right]\right) \tag{1.8}$$

where l⊥ is the irregularity scale component perpendicular to the geomagnetic field, with l⊥ = λ_radar/2; l∥ is the irregularity scale component parallel to the geomagnetic field, with range 800 l⊥ < l∥ < 2250 l⊥; k is the radar wavenumber; α represents the aspect angle; ⟨ΔN_e²⟩ represents the variance of the electron concentration fluctuation; and N̄_e represents the mean electron concentration.
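The aspect sensitivity in (1.8) can be made concrete. In the sketch below the proportionality constant and the fluctuation variance are set to unity, and l∥ is taken at the lower end of the stated 800-2250 l⊥ range; all of these are assumptions for illustration only.

```python
import math

def sigma3_aspect(dNe2, k, l_par, l_perp, alpha):
    """Eq. (1.8) with unit proportionality constant."""
    return dNe2 * math.exp(-2 * k ** 2 * (l_par ** 2 * alpha ** 2 + l_perp ** 2))

lam = 73.0                 # wavelength (m) at roughly 4.1 MHz
k = 2 * math.pi / lam      # radar wavenumber
l_perp = lam / 2           # transverse irregularity scale, l_perp = lambda/2
l_par = 800 * l_perp       # parallel scale (lower end of 800-2250 l_perp)

# At alpha = 0 the exponent reduces to -2*pi^2 (the factor used in (1.10));
# even a tenth of a degree of aspect angle suppresses the echo sharply.
print(sigma3_aspect(1.0, k, l_par, l_perp, 0.0),
      sigma3_aspect(1.0, k, l_par, l_perp, math.radians(0.1)))
```

With l⊥ = λ/2 and k = 2π/λ, the zero-aspect exponent 2k²l⊥² collapses to 2π² exactly, which is why (1.10) below carries the constant e^{-2π²}.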
Suppose that the variance of the irregularity fluctuation is linear in the square of the mean electron concentration:

$$\langle \Delta N_e^2 \rangle = C_0 \bar{N}_e^2 \tag{1.9}$$

where C0 is the coefficient of the fluctuation degree of the irregularities. When only α = 0° is considered, Eq. (1.8) can be approximated as

$$\sigma_3 = C_0\, e^{-2\pi^2}\, \bar{N}_e^2 \tag{1.10}$$

Substituting Eq. (1.10) into Eq. (1.7), the received power of the ionospheric echoes under volume scattering in the irregularities is

$$P_r = \frac{P_t \lambda^2 G_t G_r \gamma\, \Delta R\, \theta_3 \varphi_3}{(4\pi)^3 R_0^2 L_s L_p(R_0)}\, C_0\, e^{-2\pi^2}\, \bar{N}_e^2 \tag{1.11}$$
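Equation (1.11) can be exercised numerically; the parameter values below loosely follow Table 1 but are otherwise assumptions (notably the gains, unit losses, and the value of C0), so only the trends matter.

```python
import math

def ionospheric_volume_power(Pt, Gt, Gr, lam, gamma, dR, theta3, phi3,
                             R0, Ls, Lp, C0, Ne_mean):
    """Eq. (1.11): ionospheric echo power under volume scattering."""
    sigma3 = C0 * math.exp(-2 * math.pi ** 2) * Ne_mean ** 2   # Eq. (1.10)
    return (Pt * lam ** 2 * Gt * Gr * gamma * dR * theta3 * phi3 * sigma3) / (
        (4 * math.pi) ** 3 * R0 ** 2 * Ls * Lp)

base = dict(Pt=2e3, Gt=10.0, Gr=10.0, lam=73.0, gamma=0.1, dR=5e3,
            theta3=math.radians(45), phi3=math.radians(20),
            R0=200e3, Ls=1.0, Lp=1.0, C0=1e-4, Ne_mean=1e18)
```

Doubling both beamwidths quadruples the echo power, while doubling the range only quarters it; together with the quadratic dependence on the mean electron concentration, this is consistent with the simulation trends reported in Sect. 3.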
3 Simulation

Table 1 shows the main simulation parameters of the model. Generally, the ionosphere range is 100–300 km, and the range resolution of the HFSWR system is 5 km.

Table 1 Simulation parameters of ionospheric detection equation
Radar frequency (MHz): 4.1
Signal duty cycle: 0.1
Antenna gain: 105
Transmitting power (kW): 2
Ionosphere range (km): 300
Elevation beamwidth (°): 45
Azimuth beamwidth (°): 20
Electron density fluctuation (m−6): 10^18
Figure 1 shows the power spectrum of the ionospheric echoes in the case of volume scattering. The ionospheric echo power decreases as the distance increases. Figure 1a shows that the ionospheric echo power increases with the broadening of the (elevation, azimuth) beam; Fig. 1b shows that the ionospheric echo power increases rapidly when the electron density fluctuation increases. From Fig. 1, it can be seen that for volume scattering, factors such as the beamwidth, the ionospheric distance, and the electron concentration fluctuation all affect the ionospheric echo power, which also explains why the ionospheric echoes in the RD spectrum are complex and unpredictable. Among these factors, the most significant one is the fluctuation of the electron concentration. Therefore, real-time spatial diagnosis of the ionosphere will be beneficial to the understanding of the ionospheric echo mechanism and to the improvement of ionospheric clutter suppression algorithms.
Y. Yang et al.

Fig. 1 Analysis of the ionospheric volume-scattering echo spectrum: a power (dB) versus range (km) for beamwidths of 30°, 45° and 60°; b power (dB) versus range (km) for electron density fluctuations of 10^18, 10^20 and 10^22 m⁻⁶
4 Conclusion

The distribution characteristics of the ionosphere are similar to those of a meteorological target, so ionospheric echoes should be modeled as a distributed scatterer rather than a single-point scatterer. For the ionosphere target, the radar range equation for three-dimensional scattering is established, in which the RCS of the ionosphere is the key factor. According to the physical mechanism of ionospheric echoes and the principle of high-frequency coherent scattering, the RCS of the ionosphere is estimated and the generalized range equation of the HFSWR ionosphere is obtained. Finally, the model of HFSWR ionospheric echoes under volume scattering is simulated. The intensity of the ionospheric echoes increases with the radar beamwidth. Among the many factors that affect the ionospheric echo intensity, the most important is the electron density fluctuation of the irregularities.

Acknowledgements. This research was funded by the National Natural Science Foundations of China (NSFC) under Grant 61801141 and the Doctoral Fund Project of Longdong University under Grant XYBY202001.
5 References

1. Zhou W, Jiao P (2008) Over-the-horizon radar technology. Publishing House of Electronics Industry, Beijing
2. Barrick D (2003) History, present status, and future directions of HF surface-wave radars in the U.S. In: International radar conference. IEEE, New York, pp 652–655
3. Barrick D (1972) First-order theory and analysis of MF/HF/VHF scatter from the sea. IEEE Trans Antennas Propag 20:2–10
4. Barrick D, Snider J (1977) The statistics of HF sea-echo Doppler spectra. IEEE Trans Antennas Propag 25(1):19–28
5. Sevgi L, Ponsford AM, Chan HC (2001) An integrated maritime surveillance system based on high-frequency surface-wave radar. Part 1. Theoretical background and numerical simulations. IEEE Antennas Propag Mag 43(4):28–43
6. Ponsford AM, Sevgi L, Chan HC (2001) An integrated maritime surveillance system based on high-frequency surface-wave radar. Part 2. Operational status and system performance. IEEE Antennas Propag Mag 43(5):52–63
7. Thomson AD, Quach TD (2005) Application of parabolic equation methods to HF propagation in an arctic environment. IEEE Trans Antennas Propag 53(1):412–419
8. Li Q, Zhang W, Li M et al (2017) Automatic detection of ship targets based on wavelet transform for HF surface wave radar. IEEE Geosci Remote Sens Lett 14(5):714–718
9. Guo X, Sun H, Yeo TS (2008) Interference cancellation for high-frequency surface wave radar. IEEE Trans Geosci Remote Sens 46(7):1879–1891
10. Jangal F, Saillant S, Helier M (2009) Ionospheric clutter mitigation using one-dimensional or two-dimensional wavelet processing. IET Radar Sonar Navig 3(2):112–121
11. Zhang X, Yang Q, Yao D et al (2015) Main-lobe cancellation of the space spread clutter for target detection in HFSWR. IEEE J Sel Topics Signal Process 9(8):1632–1638
12. Saleh O, Ravan M, Riddolls R et al (2016) Fast fully adaptive processing: a multistage STAP approach. IEEE Trans Aerosp Electron Syst 52(5):2168–2183
13. Thayaparan T, Ibrahim Y, Polak J et al (2018) High-frequency over-the-horizon radar in Canada. IEEE Geosci Remote Sens Lett 7(99):1–5
14. Zhang J, Deng W, Zhang X et al (2018) A method of track matching based on multipath echoes in high-frequency surface wave radar. IEEE Antennas Wireless Propag Lett 17(10):1852–1855
15. Yonghua X, Yong C, Ningbo L (2013) Sky-wave OTHR ionospheric channel modeling. Chin J Radar Sci 28(5):862–868
16. Riddolls RJ (2011) Modification of a high frequency radar echo spatial correlation function by propagation in a linear plasma density profile. Defence Research and Development Canada, Ottawa, ON, Canada
17. Ravan M, Riddolls RJ, Adve RS (2012) Ionospheric and auroral clutter models for HF surface wave and over the horizon radar systems. Radio Sci 47(3):1–12
18. Chen S (2017) Ionospheric clutter models for high frequency surface wave radar. Memorial University of Newfoundland, Newfoundland
19. Chen S, Huang W, Gill E (2015) A vertical reflection ionospheric clutter model for HF radar used in coastal remote sensing. IEEE Antennas Wireless Propag Lett 14:1–5
20. Chen S, Gill EW, Huang W (2016) A first-order HF radar cross-section model for mixed-path ionosphere–ocean propagation with an FMCW source. IEEE J Ocean Eng 41(4):982–992
21. Yang X, Liu A, Yu C, Wang L (2019) Ionospheric clutter model for HF sky-wave path propagation with an FMCW source. Int J Antennas Propag 5(8):1–10
22. Richards MA (2005) Fundamentals of radar signal processing. Tata McGraw-Hill Education
23. Yang X, Yu C, Liu A et al (2016) The vertical ionosphere electron density probing with high frequency surface wave radar. Chin J Radar Sci 31(2):291–297
24. Booker HG (1956) A theory of scattering by nonisotropic irregularities with application to radar reflections from the aurora. J Atmos Terr Phys 8(45):204–221
25. Schlegel K (1996) Coherent backscatter from ionospheric E-region plasma irregularities. J Atmos Terr Phys 58(89):933–941
26. Ponnmarenko PV, Stmaurice JP, Waters CL et al (2009) Refractive index effects on the scatter volume location and Doppler velocity estimates of ionospheric HF backscatter echoes. Ann Geophys 27(11):4207–4219
Research on the Elite Genetic Particle Filter Algorithm and Application on High-Speed Flying Target Tracking Lixia Nie, Xuguang Yang(B) , Jinglin He, Yaya Mu, and Likang Wang School of Information Engineering, Longdong University, Qingyang, Gansu 745000, China [email protected]
Abstract. Resampling is an inevitable process in the standard particle filter, but it can also cause the particles to lose diversity and degrade filtering performance. To solve this problem, an elite genetic resampling particle filter is proposed in this paper. The global optimization of the genetic algorithm keeps the particles moving toward the true state probability density function. The state estimate corresponds to the maximum-fitness state after several evolution generations. Since the maximum fitness of each generation of the algorithm constitutes a non-negative bounded submartingale, the algorithm theoretically converges to the global optimal solution with probability 1. An estimate of the absolute error is also derived. The simulation demonstrates that this algorithm outperforms the particle filter that uses genetic operations in resampling, improving the estimation accuracy of high-speed flying target tracking in a non-Gaussian background. Keywords: Particle filter · Elite genetic algorithm · Target tracking · Resampling
1 Introduction

The particle filter [1] is a recursive Bayes estimation algorithm implemented by the Monte Carlo method, which can deal well with nonlinear non-Gaussian problems. Its theoretical basis is Bayes optimal estimation and sequential importance sampling (SIS) [2]. In the SIS algorithm, since {x_k^i}_{i=1:N} cannot be sampled directly from the distribution p(X_k|Z_k), they can only be sampled from a known and easily sampled function q(X_k|Z_k) (the importance function), which should be as close to p(X_k|Z_k) as possible. However, after several recursions, particle degradation occurs: the weight w_k^i concentrates on a few particles while the weights of the remaining particles approach 0, so that most of the computation time is wasted. The existing genetic particle filter (GPF) [3] mainly uses genetic operations during resampling, which not only retains the original particle characteristics but also increases the diversity of the particles, and its performance is better than that of the standard particle filter.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_105
The basic idea of this paper is to avoid particle filter resampling: genetic operations are performed directly on the particles to make them approximate the importance function, and the maximum-fitness particle after several iterations is used as the state estimate. Owing to the inherent global optimization ability and increasing average fitness of the genetic algorithm, a better estimation effect is produced without particle degradation, so resampling can be avoided. Theoretically, the maximum fitness value of elite genetic particle filtering is a non-negative bounded martingale, which converges to the global optimal solution with probability 1. We also give an estimate of the absolute error, which is inversely proportional to the crossover, mutation and selection probabilities. Section 2 introduces the particle filter. Section 3 presents the basic principle and analysis of the elite genetic particle filter (EGPF). Section 4 applies the EGPF to tracking simulation of ballistic missile manoeuvring movement under strongly non-Gaussian noise. Section 5 concludes the article.
2 Particle Filtering

Suppose that a nonlinear dynamic system [4, 5] is described as

x_k = f(x_{k−1}, v_{k−1}),
z_k = h(x_k, n_k),   (1.1)

where f(·) and h(·) are the state-space equation and observation equation of the system, x_k ∈ R^n and z_k ∈ R^m are the target state and observation at time k, respectively, and v_k and n_k are the independent and identically distributed system noise and observation noise. Suppose that x_k and z_k are first-order Markov processes, X_k = {x_0, x_1, ..., x_k} and Z_k = {z_1, z_2, ..., z_k} are independent of each other, and the initial state distribution is p(x_0|z_0) = p(x_0).

Extract a set of particles {x_k^i}_{i=1:N} from q(X_k|Z_k), with corresponding weights {w_k^i}_{i=1:N}, Σ_{i=1}^N w_k^i = 1. Then p(x_k|Z_k) ≈ p̂(x_k|Z_k) = Σ_{i=1}^N w_k^i δ(x_k − x_k^i), where δ(·) is the Dirac delta function. When N → ∞, p̂(X_k|Z_k) → p(X_k|Z_k).

As the tracking time increases, the variance of the weights gradually grows, i.e., particle degradation occurs [6]. The main reason is that particle x_k^i is updated directly from x_{k−1}^i. The traditional resampling step is only a remedy that alleviates particle degradation: during resampling, particles with large weights are copied many times, which leads to the loss of particle diversity. If the resampling step can be eliminated, particle degradation can be fundamentally avoided and the problem of particle depletion solved. This paper uses an elite genetic algorithm to evolve the particles: particle x_k^i is derived from several crossover and mutation operations on x_{k−1}^i, thereby avoiding the resampling step, which both retains particle diversity and improves estimation accuracy.
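The weight degradation described above can be reproduced in a few lines. This is a minimal sketch with an assumed one-dimensional linear-Gaussian model (the 0.9/0.5/0.2 parameters are illustrative, not from the paper); the symptom is the effective sample size 1/Σ w_i² collapsing far below N.

```python
import numpy as np

rng = np.random.default_rng(0)

# SIS with the prior as importance function: weights accumulate likelihoods
# and, after a few recursions, concentrate on very few particles.
N, T = 500, 20
x_true = 0.0
particles = rng.normal(0.0, 1.0, N)
logw = np.zeros(N)

for k in range(T):
    x_true = 0.9 * x_true + rng.normal(0.0, 0.5)          # state transition
    z = x_true + rng.normal(0.0, 0.2)                     # observation
    particles = 0.9 * particles + rng.normal(0.0, 0.5, N)  # propagate particles
    logw += -0.5 * ((z - particles) / 0.2) ** 2            # log-likelihood update

w = np.exp(logw - logw.max())
w /= w.sum()
n_eff = 1.0 / np.sum(w ** 2)   # effective sample size; << N means degradation
```

After only 20 steps the effective sample size is a tiny fraction of N, which is exactly the degeneracy that resampling (and, in this paper, elite genetic evolution) is meant to counter.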
3 The Principle of Elite Genetic Particle Filtering

The main idea is to use the elite genetic algorithm (EGA) [7, 8] to evolve the particles and to take the fittest x_k as the current state estimate.
In this algorithm, the particles {x_k^i}_{i=1:N} are used as a population on which genetic operations are performed. The number of evolution generations is N_GA, N_GA ≥ 1, which can be set according to the actual situation: let N_GA = 1 for real-time estimation and increase N_GA if higher precision is required. The entire algorithm is as follows:

Step 1. k = 1 : N  % N is the tracking time
For i = 1 : N  % N is the number of particles
x_k^i ~ p(x_k | x_{k−1})

Step 2. Evolve the particles with the genetic algorithm N_GA times.
Let the fitness function of a particle be fitness(x_k^i) = F(x_k^i) = p(z_k | x_k^i). Select, cross and mutate the particles at this moment.
Take any two particles (x_k^m, x_k^n) from {x_k^i}_{i=1:N} and cross them:

x̃_k^m = α x_k^m + (1 − α) x_k^n + η,
x̃_k^n = α x_k^n + (1 − α) x_k^m + η,   (1.2)

where η ~ N(0, Σ) is Gaussian and α ~ U(0, 1) is uniformly distributed.
A particle x_k^j is randomly selected and mutated:

x̃_k^j = x_k^j + ξ,  ξ ~ N(0, Σ).   (1.3)

Elite particles are retained unconditionally, and the remaining particles are selected in proportion to their fitness.

Step 3. Output the estimate x_k^MAP = arg max p(x|Z_k) = arg max F_{N_GA}(x_k^i).

Step 4. k = k + 1; jump to Step 1.

Consider the Markov chain state space S^N = S × ... × S, whose elements are called populations and are denoted by x, y, u, v, w, l, ..., that is, S^N = {(x_1, x_2, ..., x_N) : x_k ∈ S, 1 ≤ k ≤ N}, where N is the population size, and the non-negative real-valued function f(x) defined on S is the fitness function. The EGA operations are described as follows:

(1) Crossover operator C_n: C_n is a random mapping S × S → S, which determines the probability of generating an individual w from a given parent pair (u, v): P_c^n(u × v, w) = P{C_n(u, v) = w}.

(2) Mutation operator M_n: M_n is a random mapping S → S, which determines the transition probability from individual w to individual l: P_M^n(w, l) = P{M_n(w) = l}.

(3) Selection operator S_n: S_n is a random mapping S^{N−1} → S. The probability of selecting individual i from X_n is P_s^n(x, i) = P{S_n(x) = i}; then i, j are selected as parents with probability P_s^n(x, i × j) = P_s^n(x, i) P_s^n(x, j). The probability of the first individual in the EGA is P_s^{n*}(x, i) = |x| / |B(X_n)|, x ∈ B(X_n), where |x| is the number of individuals x in population X_n and |B(X_n)| is the cardinality of the optimal solution set B(X_n) of population X_n; the other N − 1 individuals are selected according to the probability P_s^n(x, i) = f(i) / Σ_{k=1}^N f(k).
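The crossover (1.2), mutation (1.3) and elitist selection steps can be sketched as below. This is a simplified one-dimensional sketch: the covariance in (1.2)–(1.3) is reduced to an assumed scalar `sigma`, and mutation noise is applied to all children rather than to one randomly chosen particle.

```python
import numpy as np

rng = np.random.default_rng(1)

def crossover(xm, xn, sigma=0.05):
    # arithmetic crossover (1.2) with shared eta and alpha ~ U(0, 1)
    alpha = rng.uniform(0.0, 1.0)
    eta = rng.normal(0.0, sigma)
    return (alpha * xm + (1 - alpha) * xn + eta,
            alpha * xn + (1 - alpha) * xm + eta)

def mutate(xj, sigma=0.05):
    # Gaussian mutation (1.3), applied elementwise here for simplicity
    return xj + rng.normal(0.0, sigma, size=np.shape(xj))

def evolve(particles, fitness, sigma=0.05):
    """One EGA generation with unconditional elite retention."""
    f = fitness(particles)
    elite = particles[np.argmax(f)]          # elite kept unconditionally
    children = particles.copy()
    for i in range(0, len(particles) - 1, 2):
        children[i], children[i + 1] = crossover(particles[i], particles[i + 1], sigma)
    children = mutate(children, sigma)
    fc = fitness(children)                   # fitness-proportional selection
    p = fc / fc.sum()
    selected = rng.choice(children, size=len(particles) - 1, p=p)
    return np.concatenate(([elite], selected))
```

Because the elite individual is retained unconditionally, the maximum fitness of the population is non-decreasing across generations, which is the submartingale property used in Theorem 1.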
If the particles have evolved N_GA times at time k, the following theorems hold.

Theorem 1. Let F_j(x_k) = max_i F_j(x_k^i), j = 1 : N_GA, be the highest fitness value of the j-th generation of particles during genetic evolution. Then {F_j(x_k)} is a non-negative bounded martingale.

Proof. {F_j(x_k)} is a Markov process [9], so

E{F_j(x_k) | F_1(x_k), ..., F_{j−1}(x_k)} = E{F_j(x_k) | F_{j−1}(x_k)} ≥ F_{j−1}(x_k) > 0.

Because E{F_j(x_k) | F_{j−1}(x_k)} ≤ 1 < ∞ for all j ≥ 1, the claim follows.

Theorem 2. Let x_k* be the actual state value at time k; then F_j(x_k) → F(x_k*) almost surely as N_GA → ∞.

Proof. See reference [10].

Theorem 3. Let p_c be the crossover probability, p_m the mutation probability, p_s the selection probability and F_1(x_k) the maximum fitness value of the initial particles at time k. Then the absolute error

ε(j) = F(x_k*) − E{F_j(x_k) | F_{j−1}(x_k)}

at the j-th generation satisfies

F(x_k*) − F_j(x_k) ≤ ε(j) ≤ F(x_k*) − p_s p_c p_m F_1(x_k).

Proof. Because F_1(x_k) ≤ F_j(x_k) ≤ F(x_k*),

ε(j) = F(x_k*) − E{F_j(x_k) | F_{j−1}(x_k)}
     = F(x_k*) − Σ_{u,v∈X} P_c^n(u × v, w) Σ_l P_M^n(w, l) Σ_r P_s^n(l, r) F(r)
     ≤ F(x_k*) − Σ_{u,v∈X} P_c^n(u × v, w) Σ_l P_M^n(w, l) Σ_r P_s^n(l, r) F_1(x_k)
     = F(x_k*) − p_s p_c p_m F_1(x_k)
and

ε(j) = F(x_k*) − Σ_{u,v∈X} P_c^n(u × v, w) Σ_l P_M^n(w, l) Σ_r P_s^n(l, r) F(r) ≥ F(x_k*) − F_j(x_k).
According to Theorems 1 and 2, as the number of iterations j increases, the elite particles gradually approach the actual state value; when j → ∞, the elite particles converge to the real state value with probability 1. Theorem 3 gives an estimate of the absolute error between the estimate and the actual value for each generation [11]. From the left end of the inequality, the minimum error decreases as the number of iterations increases. The right side of the inequality is the maximum error, which is related to the choice of genetic operators and the initial population. To reduce the maximum error, we should increase p_c, p_m and p_s and make F_1(x_k) as close to F(x_k*) as possible.
4 Simulation and Analysis

Figure 1 shows the ballistic missile trajectory in the x–y plane. Figure 2 shows how the manoeuvering target speed and acceleration vary with time under a non-Gaussian noise background.
Fig. 1 Ballistic missile trajectory
It can be seen from Fig. 3 that in the X-direction the EGPF error is large in the first 10 s of the track and then converges quickly. The reason is that the initial particle group is far from the true state value, which shows that the initial value has a great influence on the estimation. With the increase of evolution generations, however, the accuracy of the EGPF is significantly higher than that of the GPF in both distance and speed estimation [12].
Fig. 2 Variation of target speed and acceleration: a target speed varying with time; b acceleration varying with time
Fig. 3 Comparison of the root mean square error of GPF and EGPF filtering [13] (the X-axis is the tracking time in s; the Y-axis is the root mean square error of the distance in the Y-direction in m)
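Curves such as those in Fig. 3 are typically produced by averaging squared errors over Monte Carlo runs; a hypothetical helper (the array shapes are assumptions, not specified by the paper) might look like:

```python
import numpy as np

def rmse_curve(est_runs, truth):
    """Per-time-step root mean square error over Monte Carlo runs.

    est_runs: array of shape (runs, T), one estimate trajectory per run
    truth:    array of shape (T,), the true trajectory
    """
    est_runs = np.asarray(est_runs, dtype=float)
    truth = np.asarray(truth, dtype=float)
    return np.sqrt(((est_runs - truth) ** 2).mean(axis=0))
```

Feeding GPF and EGPF estimate trajectories through the same helper gives directly comparable RMSE curves over the tracking time.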
5 Conclusion

In this paper, the elite genetic algorithm is used to evolve the particles directly over several generations, solving the resampling problem of the particle filter by shifting the population toward the posterior probability density space. Each generation retains the optimal individual and directly approximates the true posterior probability density, eliminating the resampling step and thus the particle degradation caused by standard particle filter resampling. This article also gives the convergence analysis of the elite genetic algorithm and the absolute error of each generation's estimate, which provides a quantitative basis for the selection of genetic parameters. Simulation results show that the algorithm is superior to the original genetic particle filter in tracking accuracy for high-speed flying targets under non-Gaussian noise. The disadvantage of the EGPF is its increased computational cost. How the particle filter can eliminate resampling while also reducing the amount of computation and improving estimation accuracy, so as to be practical for engineering, is a topic worthy of further study.

Acknowledgements. This research was funded by the National Natural Science Foundations of China (NSFC) under Grant 61801141 and the Doctoral Fund Project of Longdong University under Grant XYBY202001.
References

1. Gordon NJ, Salmond DJ, Smith AFM (1993) Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc F 140(2):107–113
2. Uosaki K, Hatanaka T (2005) Evolution strategies based Gaussian sum particle filter for nonlinear state estimation. In: Proceedings of IEEE congress on evolutionary computation, Edinburgh, pp 2365–2371
3. Higuchi T (1997) Monte Carlo filtering using genetic algorithm operators. J Stat Comput Simul 59(1):1–23
4. Arrospide J, Salgado L (2012) On-road visual vehicle tracking using Markov chain Monte Carlo particle filtering with metropolis sampling. Int J Autom Technol 13(6):955–961
5. Li T, Sattar TP, Sun S (2012) Deterministic resampling: unbiased sampling to avoid sample impoverishment in particle filters. Signal Process 92(7):1637–1645
6. Hwang K, Sung W (2013) Load balanced resampling for real-time particle filtering on graphics processing units. IEEE Trans Signal Process 61(2):411–419
7. Han H, Ding Y-S, Hao K-R, Liang X (2011) An evolutionary particle filter with the immune genetic algorithm for intelligent video target tracking. Comput Math Appl 62(7):2685–2695
8. Uosaki K, Hatanaka T (2007) State estimation by evolution strategies based particle filter. J Japan Soc Simul Technol 26(1):8–13
9. Doucet A, Godsill S (1998) On sequential Monte Carlo sampling methods for Bayesian filtering. University of Cambridge
10. Xu Z, Nie Z, Zhang W (2002) Almost sure convergence of genetic algorithms: a martingale approach. Chin J Comput 25(8):785–793
11. Nasir AA, Durrani S, Kennedy RA (2012) Particle filters for joint timing and carrier estimation: improved resampling guidelines and weighted Bayesian Cramer-Rao bounds. IEEE Trans Commun 60(5):1407–1418
12. Yu S, Kuang S (2010) Convergence and convergence rate analysis of elitist genetic algorithm based on martingale approach. Control Theory Appl 27(7):843–848
13. Farina A, Ristic B, Benvenuti D (2002) Tracking a ballistic target: comparison of several nonlinear filters. IEEE Trans Aerosp Electron Syst 38(3):854–867
Fault Estimation and Compensation for Fuzzy Systems with Sensor Faults in Low-Frequency Domain

Yu Chen1(B), Xiaodan Zhu2(B), and Jianzhong Gu2

1 School of Chemistry and Materials Science, Ludong University, Yantai 264025, China
[email protected]
2 School of Mathematics and Statistics Science, Ludong University, Yantai 264025, China
[email protected]
Abstract. A new fault estimation and compensation scheme for fuzzy systems with sensor faults is addressed in the low-frequency domain. A descriptor observer is proposed to ensure the stability and H∞ performance of the dynamic error in the low-frequency range. The fault estimate is obtained via this observer. Using the fault estimate, an H∞ output feedback controller is designed so that the controlled model with sensor faults has a certain fault-tolerant capability. A simulation demonstrates the effectiveness of the results.

Keywords: Compensation · Observer design · Fault estimation · Finite low-frequency domain · T-S fuzzy model

1 Introduction
Fuzzy logic has been used effectively to describe complicated nonlinear systems. Among existing fuzzy approaches, the fuzzy IF-THEN rule was first used to describe this kind of model [1], namely the T-S fuzzy model, which simplifies the analysis of nonlinear models. In the T-S fuzzy methodology, different linear models describe the local dynamics in different state-space regions, and membership functions smoothly blend these local models so that the overall fuzzy model is obtained. Issues [2–6] on fuzzy systems have attracted much attention. Recently, [7] introduced the generalized Kalman–Yakubovich–Popov (GKYP) lemma, a very significant development: a frequency-domain property is converted into an LMI for a finite-frequency range. The GKYP lemma
c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_106
has many practical applications. The H∞ control of T-S fuzzy models [8] was solved. The fault detection problem [9] was studied for T-S fuzzy networked models. The fuzzy filtering problem for nonlinear systems [10] was solved, and a T-S fuzzy fault detection method was designed in [11]. In past decades, as the demands of industrial manufacture have grown, more and more attention has been paid to fault-tolerant control (FTC). For the same control objective, two approaches use different design methodologies, passive FTC (PFTC) and active FTC (AFTC), according to how redundancy is used. The system performance depends on the availability of redundancy and on the FTC design method, and each method has unique properties based on the distinctive design approach used. Passive and active FTC were considered simultaneously in [12, 13]. PFTCs were provided for affine nonlinear models [14] and stochastic systems [15, 16], respectively. At the same time, there have also been many results on AFTC: fault estimation and AFTC of discrete systems [17] were considered by a finite-frequency method; a nonlinear stochastic AFTC system [18] was analyzed with a fuzzy interpolation approach; for stochastic systems [19], a descriptor observer was given to solve the fault estimation and FTC problem by the sliding mode method; and FTC for fuzzy delta operator models [20] was proposed. However, there are few papers on fault estimation and compensation for T-S fuzzy models with sensor faults in the low-frequency domain. This paper therefore addresses that problem. The contributions are as follows: for the models considered, a fuzzy observer is designed via the descriptor system approach so that the stability of the dynamic error in the low-frequency domain is ensured; state and fault estimates are obtained from this observer; and an H∞ output feedback controller is then proposed so that the controlled system with sensor faults has a certain fault-tolerant capability. A numerical simulation shows the effectiveness of the designed scheme.
2 Problem Formulation

The T-S fuzzy model is as follows. Plant Rule i: If θ_1k is φ_i1, θ_2k is φ_i2, ..., θ_mk is φ_im, then

x_{k+1} = A_1i x_k + B_1i u_k + D_1i d_k,
y_k = A_2 x_k + F_2 f_k,   (1)

where i = 1, ..., M, M is the number of IF-THEN rules; θ_1k, θ_2k, ..., θ_mk are premise variables; φ_i1, φ_i2, ..., φ_im are fuzzy sets; x_k ∈ R^g is the state and y_k ∈ R^{gy} the output; d_k ∈ R^{gd}, u_k ∈ R^{gu} and f_k ∈ R^{gf} are the disturbance, input and sensor fault, respectively, all belonging to L_2[0, ∞). A_1i, B_1i, D_1i, A_2 and F_2 are known matrices. It is assumed that (A_1i, A_2) is observable and F_2 has full rank.
Given a singleton fuzzifier, fuzzy inference and the weighted center-average defuzzifier, the form of (1) is

x_{k+1} = Σ_{i=1}^M η_i(θ_k) [A_1i x_k + B_1i u_k + D_1i d_k],
y_k = A_2 x_k + F_2 f_k,   (2)

where η_i(θ_k) = Π_{l=1}^m φ_il(θ_lk) / Σ_{i=1}^M Π_{l=1}^m φ_il(θ_lk) and Σ_{i=1}^M η_i = 1.

Define x̄_k = [x_k^T f_k^T]^T; then the global dynamic model is

Ē x̄_{k+1} = Σ_{i=1}^M η_i(θ_k) [Ā_1i x̄_k + B̄_1i u_k + D̄_1i d_k],
ȳ_k = Ā_2 x̄_k,   (3)

where Ē = [I 0; 0 0], Ā_1i = [A_1i 0; 0 0], B̄_1i = [B_1i; 0], D̄_1i = [D_1i; 0], Ā_2 = [A_2 F_2].

For the same IF-THEN rule, design the fuzzy observer

ζ_{k+1} = (Ā_1i − L_1i Ā_2) x̂̄_k + B̄_1i u_k + L_1i ȳ_k,
x̂̄_k = (Ē + L_2 Ā_2)^{-1} (ζ_k + L_2 ȳ_k),
ŷ̄_k = Ā_2 x̂̄_k,   (4)
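The descriptor augmentation behind (3) can be sketched as follows; the function name and matrix shapes are illustrative, not from the paper.

```python
import numpy as np

def augment(A1, B1, D1, A2, F2):
    """Stack state x_k and sensor fault f_k into xbar_k = [x_k; f_k]
    and build the descriptor matrices Ebar, A1bar, B1bar, D1bar, A2bar."""
    g, gf = A1.shape[0], F2.shape[1]
    E_bar = np.block([[np.eye(g), np.zeros((g, gf))],
                      [np.zeros((gf, g + gf))]])
    A1_bar = np.block([[A1, np.zeros((g, gf))],
                       [np.zeros((gf, g + gf))]])
    B1_bar = np.vstack([B1, np.zeros((gf, B1.shape[1]))])
    D1_bar = np.vstack([D1, np.zeros((gf, D1.shape[1]))])
    A2_bar = np.hstack([A2, F2])   # ybar_k = A2 x_k + F2 f_k = A2_bar xbar_k
    return E_bar, A1_bar, B1_bar, D1_bar, A2_bar
```

The point of the construction is that the fault enters only through the output map: A2_bar applied to the augmented state reproduces A2 x_k + F2 f_k exactly, while the singular Ē zeroes out the fault rows of the dynamics.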
and the output feedback controller

u_k = K_v y_k = K_v ȳ_k,   (5)

where the estimate of x̄_k is x̂̄_k, the estimate of x_k is x̂_k = [I 0] x̂̄_k, the estimate of f_k is f̂_k = [0 I] x̂̄_k, and the estimate of y_k is ŷ̄_k. L_1i, L_2 and K_v are gains to be determined. Hence, the global dynamic model is obtained as

ζ_{k+1} = Σ_{i=1}^M η_i(θ_k) [(Ā_1i − L_1i Ā_2) x̂̄_k + B̄_1i u_k + L_1i ȳ_k],
x̂̄_k = (Ē + L_2 Ā_2)^{-1} (ζ_k + L_2 ȳ_k),
ŷ̄_k = Ā_2 x̂̄_k,   (6)

and

u_k = Σ_{v=1}^M η_v(θ_k) K_v y_k = Σ_{v=1}^M η_v(θ_k) K_v ȳ_k.   (7)

Equation (6) becomes

(Ē + L_2 Ā_2) x̂̄_{k+1} = Σ_{i=1}^M η_i(θ_k) [(Ā_1i − L_1i Ā_2) x̂̄_k + B̄_1i u_k + L_1i ȳ_k + L_2 ȳ_{k+1}].   (8)

Meanwhile, adding L_2 ȳ_{k+1} to both sides of (3) gives

(Ē + L_2 Ā_2) x̄_{k+1} = Σ_{i=1}^M η_i(θ_k) [(Ā_1i − L_1i Ā_2) x̄_k + B̄_1i u_k + L_1i ȳ_k + D̄_1i d_k + L_2 ȳ_{k+1}].   (9)

Define x̃̄_k = x̄_k − x̂̄_k and r_k = ȳ_k − ŷ̄_k; then

x̃̄_{k+1} = Σ_{i=1}^M η_i(θ_k) (Ē + L_2 Ā_2)^{-1} [(Ā_1i − L_1i Ā_2) x̃̄_k + D̄_1i d_k],
r_k = Ā_2 x̃̄_k.   (10)

3 Main Results
Theorem 1. Given a constant γ > 0, the gains of (4) render (10) asymptotically stable with H∞ level γ in |ϑ| ≤ ϑ_1 if there exist symmetric matrices P_i > 0, P_l > 0, Q > 0 and matrices X_i, Y_i for all i, l ∈ {1, ..., M} such that (11) and (12) hold:

[ P_l − g_2 X_i − g_2 X_i^T   g_1 X_i + g_2 ξ_1
  *                           −P_i − g_1 ξ_1 − g_1 ξ_1^T ] < 0,   (11)

[ −P_l   Q + X_i   0              0
  *      φ_1       −X_i^T D̄_1i   Ā_2^T
  *      *         −γ² I          0
  *      *         *              −I ] < 0,   (12)

where ξ_1 = X_i^T (Ē + L_2 Ā_2)^{-1} Ā_1i − Y_i Ā_2, φ_1 = P_i − 2 cos ϑ_1 Q − ξ_1^T − ξ_1, L_2 = [0 I]^T, and g_1, g_2 are arbitrary fixed scalars satisfying g_1² σ_max(P_i) < g_2² σ_min(P_i). The gains of (4) are then L_1i = (Ē + L_2 Ā_2) X_i^{-T} Y_i.

Proof. Define V_k = x̃̄_k^T [Σ_{i=1}^M η_i(θ_k) P_i] x̃̄_k. The difference of V_k along (10) is ΔV_k = (Σ_{i=1}^M η_i(θ_k))² Σ_{l=1}^M η_l(θ_{k+1}) x̃̄_k^T [Λ^T P_l Λ − P_i] x̃̄_k, where Λ = (Ē + L_2 Ā_2)^{-1} (Ā_1i − L_1i Ā_2). Notice that Λ^T P_l Λ − P_i < 0 implies ΔV_k < 0, which can be written as

[Λ; I]^T [P_l 0; 0 −P_i] [Λ; I] < 0.   (13)

There exist g_1 and g_2 such that g_1² σ_max(P_i) < g_2² σ_min(P_i); then

[g_1 I; g_2 I]^T [P_l 0; 0 −P_i] [g_1 I; g_2 I] = g_1² P_l − g_2² P_i < 0.   (14)

Since [g_2 I  −g_1 I]^⊥ = [g_1 I; g_2 I] and [Λ; I] belongs to the null space of [Λ^T  −I]^T, it follows that

[P_l 0; 0 −P_i] + [Λ^T; −I] X_i [g_2 I  −g_1 I] + ([Λ^T; −I] X_i [g_2 I  −g_1 I])^T < 0.   (15)

Set Y_i = X_i^T (Ē + L_2 Ā_2)^{-1} L_1i; then (15) holds after some matrix manipulations. According to the GKYP lemma [7],

[Ā_1η D̄_η; I 0]^T [−P_η+ Q; Q P_η − 2 cos ϑ_1 Q] [Ā_1η D̄_η; I 0] + [Ā_2 0; 0 I]^T [I 0; 0 −γ² I] [Ā_2 0; 0 I] < 0.

Define P_η+ = Σ_{l=1}^M η_l(θ_{k+1}) P_l and P_η = Σ_{i=1}^M η_i(θ_k) P_i; then

[Ā_1i D̄_i; I 0]^T Θ_1 [Ā_1i D̄_i; I 0] + [Ā_2 0; 0 I]^T Θ_2 [Ā_2 0; 0 I] < 0,   (16)

where Θ_1 = [−P_l Q; Q P_i − 2 cos ϑ_1 Q] and Θ_2 = [I 0; 0 −γ² I]. Then (16) takes the form

Υ_⊥^T (Γ_1^T Θ_1 Γ_1 + Γ_2^T Θ_2 Γ_2) Υ_⊥ < 0,   (17)

where Υ_⊥ = [Ā_1i^T I 0; D̄_i^T 0 I]^T, Γ_1 = [I 0 0; 0 I 0] and Γ_2 = [0 Ā_2 0; 0 0 I]. For this Υ_⊥, Υ = [−I Ā_1i D̄_i]. According to the Projection Lemma,

Γ_1^T Θ_1 Γ_1 + Γ_2^T Θ_2 Γ_2 < Υ^T X_i R + (Υ^T X_i R)^T.

Let R = [0 I 0]; then (18) holds:

[ −P_l   Q                   0
  *      P_i − 2 cos ϑ_1 Q   0
  *      *                   −γ² I ]
+ [0; Ā_2^T; 0] [0; Ā_2^T; 0]^T
− [ 0   −X_i   0
    *    ν     X_i^T D̄_i
    *    *     0 ] < 0,   (18)
with ν = Ā_1i^T X_i + X_i^T Ā_1i. By matrix manipulations, (18) and (12) are equivalent. Therefore, (12) guarantees the H∞ index γ in |ϑ| ≤ ϑ_1 if (16) holds. The proof is finished.

Secondly, we give the controller in the low-frequency domain. The system may not work normally under sensor faults, which motivates the fault-tolerant method given in the following. Based on the observer, f_k is estimated as f̂_k = [0 I] x̂̄_k. Subtracting F_2 f̂_k from y_k gives y_ck = y_k − F_2 f̂_k = A_2 x_k + F_2 f_k − F_2 f̂_k = A_2 x_k + F̄_2 x̃̄_k with F̄_2 = F_2 [0 I]. Consider controller (5) and use y_ck in place of y_k, u_ck = Σ_{v=1}^M η_v(θ_k) K_v y_ck; the closed-loop model is then

x_{k+1} = Σ_{i=1}^M Σ_{v=1}^M η_i(θ_k) η_v(θ_k) [(A_1i + B_1i K_v A_2) x_k + B_1i K_v F̄_2 x̃̄_k + D_1i d_k],
y_ck = A_2 x_k + F_2 [0 I] x̃̄_k.

Combining this with system (10), a new system is obtained:

X_{k+1} = Σ_{i=1}^M Σ_{v=1}^M η_i(θ_k) η_v(θ_k) [Ā_iv X_k + D̄_1i d_k],
y_ck = Ā_2 X_k,   (19)

where X_k = [x_k; x̃̄_k], D̄_1i = [D_1i; D̄_1i], Ā_iv = [Υ_11 Υ_12; 0 Υ_22], Ā_2 = [A_2 F̄_2], Υ_11 = A_1i + B_1i K_v A_2, Υ_12 = B_1i K_v F̄_2, Υ_22 = (Ē + L_2 Ā_2)^{-1} (Ā_1i − L_1i Ā_2).

It is not difficult to show that, for the controlled system (19) adopting the static output feedback control strategy, the proposed technique reduces the influence of the sensor fault on the controlled model, so the whole closed-loop system has a certain fault-tolerant capability.
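The compensation y_ck = y_k − F_2 f̂_k that feeds the controller can be sketched as below; `compensated_feedback` is a hypothetical helper, with F2 and Kv supplied by the model and the LMI design.

```python
import numpy as np

def compensated_feedback(y, f_hat, F2, Kv):
    """Correct the measured output by the fault estimate, then apply feedback."""
    yc = y - F2 @ f_hat   # y_ck = y_k - F2 * fhat_k
    return Kv @ yc        # u_ck = Kv * y_ck
```

When the fault estimate is exact, the compensated output reduces to the fault-free output A_2 x_k, so the controller sees the system as if no sensor fault were present.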
4 Numerical Simulations

An example demonstrates the effectiveness of the proposed method. The system has Rules 1 and 2, whose membership functions are φ_1(x_1k) = 1/(1 + exp(−3 x_1k)) and φ_2(x_1k) = 1 − φ_1(x_1k), with x_1k being φ_1 and φ_2, respectively, and

A_11 = [0.3 0; 0 0.6], B_11 = [0.1 0; 0 −0.1], D_11 = [−0.2; 0.5], A_12 = [0.6 0; 0 0.5],
B_12 = [0.4 0; 0 −0.4], D_12 = [0.8; 0.4], A_2 = [0.1 0; 0 0.1], F_2 = [1.0 0; 0 1.0].

Let g_1 = 1 and g_2 = 3, in |ϑ_1| ≤ 1.7, for γ = 0.15. By (11) and (12), the gains of (4) are

L_11 = [−2.2563 0.8171; 1.1779 −1.0420; −0.0744 −0.0817; −0.1178 −0.4958],
L_12 = [−1.6636 −1.0563; −1.4917 −0.3082; −0.4336 0.1056; 0.1492 −0.4692],
L_2 = [0 0; 0 0; 1 0; 0 1].

Assume that d_k = 0.1 sin k and f_k = [f_1k; f_2k] with

f_1k = { 0, 0 < k ≤ 30;  0.3 sin k, 30 < k ≤ 70;  0, k > 70 },
f_2k = { 0, 0 < k ≤ 30;  0.4 sin k, 30 < k ≤ 70;  0, k > 70 }.

With x_0 = [−1 2]^T, Fig. 1 depicts the estimation of the fault f_k, where fo_i is the estimate of f_ik, i = 1, 2. The estimates ŷ_k are shown in Fig. 2, where yo_1 denotes ŷ_1k and yo_2 denotes ŷ_2k.
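The two-rule model of this example can be simulated directly from (2). The local matrices below follow the numerical values listed in this section, grouped under an assumed arrangement, so treat the exact grouping as a best-effort assumption.

```python
import numpy as np

# Local models of the two-rule example (grouping of values is an assumption)
A = [np.array([[0.3, 0.0], [0.0, 0.6]]), np.array([[0.6, 0.0], [0.0, 0.5]])]
B = [np.array([[0.1, 0.0], [0.0, -0.1]]), np.array([[0.4, 0.0], [0.0, -0.4]])]
D = [np.array([-0.2, 0.5]), np.array([0.8, 0.4])]

def memberships(x1):
    h1 = 1.0 / (1.0 + np.exp(-3.0 * x1))
    return np.array([h1, 1.0 - h1])          # eta_1 + eta_2 = 1

def step(x, u, d):
    # fuzzy blending of the local linear models, as in (2)
    eta = memberships(x[0])
    return sum(eta[i] * (A[i] @ x + B[i] @ u + D[i] * d) for i in range(2))

x = np.array([-1.0, 2.0])                    # x0 from the example
traj = [x]
for k in range(1, 101):
    x = step(x, np.zeros(2), 0.1 * np.sin(k))  # d_k = 0.1 sin k, open loop
    traj.append(x)
```

Since both local A matrices are Schur stable, the blended open-loop trajectory stays bounded; the observer and controller of the paper are then layered on top of this plant model.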
Fig. 1 f_k's estimation (magnitude versus time step k; fo_1 and fo_2 are the estimates of f_1k and f_2k)

Fig. 2 Estimated output ŷ_k (magnitude versus time step k; yo_1(k) and yo_2(k) denote ŷ_1k and ŷ_2k)
5 Conclusion

A fault estimation and compensation scheme for discrete T-S fuzzy models has been addressed for the low-frequency range. A fuzzy observer is given to ensure the stability of the error model with H∞ performance in the low-frequency range. The fault estimates are obtained via this observer, and a fuzzy H∞ output feedback controller is then designed to ensure a certain fault-tolerant capability of the controlled model with sensor faults. A numerical simulation proves the effectiveness of the method. The conclusions of this paper can also be extended to the finite middle- and high-frequency domains.
806
Y. Chen et al.
Acknowledgments. The authors thank the reviewers for their valuable comments and help to improve the quality of this paper. The work was supported by the Ludong University Introduction of Scientific Research Projects under Grant LB2016034, the Natural Science Foundation of Shandong Province under Grant ZR2019PF009 and the Foundation of Shandong Educational Committee under Grant J17KA051.
Anti-jamming Performance Evaluation of GNSS Receivers Based on an Improved Analytic Hierarchy Process

Yuting Li1(B), Zhicheng Yao1, Yanhong Zhang2, and Jian Yang3

1 School of Missile and Engineering, Rocket Force University of Engineering, Xi'an 710025, China
[email protected]
2 Equipment Department of Rocket Force, Beijing 100000, China
3 School of Electronic Engineering, Xidian University, Xi'an 710071, China
Abstract. The anti-jamming performance evaluation of global navigation satellite system (GNSS) receivers in a complex electromagnetic environment is important. In order to analyze the anti-jamming capability of GNSS receivers in different interference scenarios, an anti-jamming performance evaluation method based on an improved analytic hierarchy process (AHP) is proposed. Firstly, the anti-jamming performance index of GNSS receivers is given by using the constraint of the minimum performance index. Secondly, the improved AHP evaluation model is obtained by using three scales instead of nine. Furthermore, the evaluation model is verified with measured data, which realizes a quantitative anti-jamming performance evaluation of GNSS receivers. The experimental results show that the proposed method can improve the credibility of the anti-jamming performance evaluation results for GNSS receivers.

Keywords: GNSS receiver · Performance evaluation · Analytic hierarchy process · Three-scale method

1 Introduction
GNSS receivers are widely used in both military and civilian fields due to their all-weather coverage, strong real-time performance, and accurate navigation and timing [1]. The Global Positioning System (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS) and the Galileo satellite navigation system constitute the four major GNSS positioning systems in the world [2]. Anti-jamming GNSS receivers are an important part of the GNSS. Because the weak GNSS signals are easily corrupted by unintentional or intentional interference, a GNSS receiver may even
c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_107
lose lock and fail to position [3]. Therefore, it is of great significance to test and evaluate anti-jamming GNSS receivers in the complex electromagnetic environment. In the current literature, there are few reports on the anti-jamming performance evaluation of GNSS receivers in complex electromagnetic environments, and a unified evaluation standard for anti-jamming performance has not been established [4]. To evaluate GNSS receivers comprehensively, most researchers start from the inherent properties of GNSS receivers and construct a multi-index system. On the one hand, the existing evaluation methods seldom consider whether GNSS receivers can work properly, or whether the selected indexes reflect the anti-jamming ability of GNSS receivers in the actual environment. On the other hand, the traditional evaluation method, such as the analytic hierarchy process, is too subjective, and a consistency test needs to be performed on the judgment matrix. To sum up, under the premise of normal positioning constraints of the receiver and under different interference scenarios, GNSS receivers are tested and evaluated by taking the maximum input interference-to-signal ratio of the receiver as the index. The improved analytic hierarchy process is applied to establish a comprehensive evaluation method of receiver performance. Finally, actual test data are used to illustrate the feasibility of this method, which has greater engineering significance.
2 Basic Theory of the System Model

2.1 Typical Interference Faced by Anti-jamming Receiver Systems

Continuous Wave Interference. The continuous wave interference signal [5] is a modulation signal generated at a specific frequency; its power and amplitude can be set. To implement a continuous wave using quadrature modulation:

I(t) = A(1 + m_a v(t)),  Q(t) = 0                                    (1)

In formula (1), I(t) is the baseband signal of the in-phase branch, Q(t) is the baseband signal of the quadrature branch, A is the set amplitude, m_a is the modulation coefficient, and v(t) is the modulation signal, which may be a DC signal. In order to achieve different carrier frequencies, the NCO parameters of the DAC can be set.

Pulse Interference. A pulse signal [6] is a signal in which a number of single-pulse signals are synthesized at a certain repetition frequency. To implement a single-pulse signal using quadrature modulation, one needs:

I(t) = A × U(t − t0),  Q(t) = 0                                      (2)

In formula (2), U(t − t0) is a single-pulse modulation signal.
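A minimal numerical sketch of the two quadrature baseband definitions in (1) and (2); the amplitude, modulation coefficient, pulse start time and pulse width below are example values, not taken from the paper, and the single-pulse waveform U is modeled as a rectangular pulse, which is an assumption:

```python
A = 1.0      # example amplitude
ma = 0.5     # example modulation coefficient
v_dc = 1.0   # DC modulation signal v(t), Eq. (1)

def cw_baseband(t):
    """Continuous-wave jamming baseband, Eq. (1): I = A(1 + ma*v(t)), Q = 0."""
    return A * (1.0 + ma * v_dc), 0.0

def pulse_baseband(t, t0=2.0, width=0.5):
    """Single-pulse jamming baseband, Eq. (2): I = A*U(t - t0), Q = 0.
    U is modeled here as a rectangular pulse of the given width (assumption)."""
    u = 1.0 if t0 <= t < t0 + width else 0.0
    return A * u, 0.0

print(cw_baseband(0.0))       # I/Q pair of the CW baseband
print(pulse_baseband(2.1))    # inside the pulse
print(pulse_baseband(3.0))    # outside the pulse
```

Repeating the single pulse at the desired repetition frequency and mixing the I/Q pair up to the carrier yields the transmitted jamming signal.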
Multiple single-pulse signals are first generated and then modulated to the desired frequency.

FM Interference. An FM signal [7] is a modulation method in which the instantaneous frequency of the carrier changes linearly with the modulation signal. The mathematical expression of a single-tone FM signal can be written as:

S(t) = A cos( w_c t + k ∫_0^t v(τ) dτ )                              (3)

Expanding and simplifying the above formula:

S(t) = A cos(w_c t) cos( k ∫_0^t v(τ) dτ ) − A sin(w_c t) sin( k ∫_0^t v(τ) dτ )
     = A cos(w_c t) cos φ − A sin(w_c t) sin φ                       (4)

In the above formula, w_c is the carrier angular frequency, S(t) is the transmitted signal, v(t) is the modulation signal, and k is the modulation coefficient.
φ = k ∫_0^t v(t) dt                                                  (5)
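As a numerical sketch (the modulation frequency, coefficient and sample rate are example values, not from the paper), the phase in (5) can be accumulated sample by sample and used directly as the quadrature pair cos φ, sin φ:

```python
import math

k_mod = 2.0 * math.pi * 50.0   # example modulation coefficient k
fm = 5.0                       # example modulation frequency (Hz)
fs = 10_000.0                  # example sample rate (Hz)

def fm_baseband(n_samples):
    """I/Q samples of a single-tone FM baseband: phi = k * integral of v(t)dt,
    I = cos(phi), Q = sin(phi), with the integral accumulated per sample."""
    phi, out = 0.0, []
    for n in range(n_samples):
        v = math.sin(2.0 * math.pi * fm * n / fs)  # modulation signal v(t)
        phi += k_mod * v / fs                      # running integral of k*v(t)dt
        out.append((math.cos(phi), math.sin(phi)))
    return out

iq = fm_baseband(1000)
# every I/Q sample lies on the unit circle, since I^2 + Q^2 = cos^2 + sin^2 = 1
assert all(abs(i * i + q * q - 1.0) < 1e-9 for i, q in iq)
```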
Therefore, when using quadrature modulation to generate FM signals, one only needs:

I(t) = cos φ,  Q(t) = sin φ                                          (6)

To realize the above two formulas, the sine wave can be generated by using a direct digital frequency synthesis (DDS) lookup table: the sine value of each phase is calculated in advance, and the sine data are stored with the phase angle as the address.

2.2 Evaluation Indexes of Anti-jamming Performance
The traditional anti-jamming performance evaluation index of GNSS receivers is usually obtained by analyzing the parameters of each part on the basis of the inherent, active and passive anti-jamming abilities of GNSS receivers. On the one hand, such an index system is imperfect and cannot comprehensively evaluate the anti-jamming performance of GNSS receivers. On the other hand, it is not combined with the actual interference scenarios faced by GNSS receivers. Therefore, this paper uses the interference suppression index, i.e., the interference signal strength at the GNSS receiver input under the premise that the receiver still meets its minimum performance constraints in a given signal environment. The interference suppression degree reflects the maximum
ability of GNSS receivers to suppress the interference signal when the signal environment is constant and the receiver performance meets the requirements. Obviously, the larger the interference suppression degree is, the better the anti-jamming performance of the GNSS receiver is [8]. The maximum input interference-to-signal ratio used as the anti-jamming performance evaluation index is described as follows:

ISR = p_j / p_s = E[x_j x_j^H] / E[x_s x_s^H]                        (7)

where p_j is the interference signal power and p_s is the power of the wanted signal.

2.3 Improved Analytic Hierarchy Process
The analytic hierarchy process (AHP) is a qualitative and quantitative multi-objective decision-making method proposed in the mid-1970s by Saaty [9], a famous American operations researcher. At present, AHP is widely used in many fields, such as industry, transportation and the military [10]. However, the basic AHP uses the 1-9 scale method to determine the judgment matrix; the subjectivity is strong, and the consistency of the judgment matrix needs to be checked. The improved three-scale method reduces the difficulty of judgment, weakens the influence of the decision-makers' subjectivity on the evaluation results, requires no consistency check of the judgment matrix, reduces the computational effort and improves efficiency. The steps for determining the weights by the three-scale method are as follows [11].

Determining the Relative Weight of Elements in Each Layer. Establish the comparison matrix A: with the experimental data obtained by the test system, the elements of each layer are compared pairwise on the scale 0, 1, 2, giving

A = [ a11 a12 · · · a1n
      a21 a22 · · · a2n
      · · ·
      an1 an2 · · · ann ]                                            (8)

In the above formula, aij is the importance value obtained by comparing the i-th index element with the j-th index element:

aij = { 0  the j-th element is more important than the i-th element
        1  the i-th element is as important as the j-th element
        2  the i-th element is more important than the j-th element }  (9)
812
Y. Li et al.
Construct the judgment matrix B:

b_ij = (r_i − r_j)/(r_max − r_min) · (k_m − 1) + 1,                  r_i ≥ r_j
b_ij = [ (r_j − r_i)/(r_max − r_min) · (k_m − 1) + 1 ]^(−1),         r_i < r_j      (10)

where r_i and r_j are the row sums of the comparison matrix:

r_i = Σ_{k=1}^{n} a_ik (i = 1, 2, · · ·, n),  r_j = Σ_{k=1}^{n} a_jk (j = 1, 2, · · ·, n),  k_m = r_max / r_min      (11)

Construct the transfer matrix C. The corresponding transfer matrix (an antisymmetric matrix) C is obtained from the judgment matrix B: c_ij = lg b_ij.

Calculate the optimal transfer matrix D. The optimal transfer matrix D is obtained from the judgment matrix B and the transfer matrix C:

d_ij = (1/n) Σ_{k=1}^{n} (c_ik − c_jk),  i = 1, 2, · · ·, n,  j = 1, 2, · · ·, n      (12)

Solve the quasi-optimal consistent matrix V. From the optimal transfer matrix D, the corresponding quasi-optimal consistent matrix V is given by v_ij = 10^{d_ij}.

Calculate the evaluation index weights. The eigenvector corresponding to the unique maximum eigenvalue of the matrix is obtained by the eigenvalue method, that is,

V M = λ_max M                                                        (13)

where λ_max is the largest eigenvalue of the quasi-optimal consistent matrix V, and M is the corresponding eigenvector. The weight values of the single-layer indexes are obtained by normalizing the eigenvector M.

2.4 Calculation of Combination Weight
Use the results of the single-level ranking to rank the total hierarchy. For example, if the weight of an index is a and the weights of the factors below it are (W1, W2, W3, · · ·, Wn)^T, then the weights of these factors in the total hierarchy are aWi (i = 1, 2, · · ·, n), respectively. Follow this method to determine the position of each factor in the overall ranking.
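The pipeline of steps (8)-(13) can be sketched in pure Python. This is only an illustrative sketch: it takes k_m = r_max/r_min as the scale ratio implied by (11), and it uses power iteration for the eigenvector, which converges immediately because V is consistent. Run on the 0/1/2 comparison matrix of Table 2, it reproduces the weights (0.2583, 0.6370, 0.1047):

```python
import math

def three_scale_ahp_weights(A):
    """Weights from a 0/1/2 comparison matrix via the improved (three-scale) AHP."""
    n = len(A)
    r = [sum(row) for row in A]            # row sums of the comparison matrix, Eq. (11)
    r_max, r_min = max(r), min(r)
    km = r_max / r_min                     # scale extreme ratio k_m
    def b(i, j):                           # judgment matrix entries, Eq. (10)
        if r[i] >= r[j]:
            return (r[i] - r[j]) / (r_max - r_min) * (km - 1) + 1
        return 1.0 / ((r[j] - r[i]) / (r_max - r_min) * (km - 1) + 1)
    B = [[b(i, j) for j in range(n)] for i in range(n)]
    C = [[math.log10(B[i][j]) for j in range(n)] for i in range(n)]   # transfer matrix
    D = [[sum(C[i][k] - C[j][k] for k in range(n)) / n                # Eq. (12)
          for j in range(n)] for i in range(n)]
    V = [[10 ** D[i][j] for j in range(n)] for i in range(n)]         # quasi-optimal consistent matrix
    w = [1.0] * n                          # principal eigenvector by power iteration, Eq. (13)
    for _ in range(50):
        w = [sum(V[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w

A = [[1, 0, 2], [2, 1, 2], [0, 0, 1]]      # comparison matrix A-B of Table 2
print([round(x, 4) for x in three_scale_ahp_weights(A)])
```

Normalizing the eigenvector gives the single-layer weights directly, with no consistency check needed.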
3 Anti-jamming Performance Evaluation Based on the Three-Scale Analytic Hierarchy Process

3.1 Establishment of Evaluation Model
According to the typical interference scenarios faced by GNSS receivers, the anti-jamming performance evaluation model shown in Fig. 1 is established.

Fig. 1 Anti-jamming performance evaluation model of the receiver based on the improved analytic hierarchy process
3.2 Evaluation of Test Results

In the anti-jamming test of GNSS receivers [12], the receiving equipment under test is subjected to continuous wave interference, pulse interference and frequency modulation interference, respectively, in cold start mode. The interference signal is increased from −110 dBm (in steps of 5 dB or 1 dB) until the GNSS receiver just fails to position properly, and the maximum input interference-to-signal ratio at that moment is recorded; the test data are shown in Table 1. It can be concluded that GNSS receivers have different tolerances to the three kinds of typical interference:

• The anti-pulse interference performance is the best.
• The anti-continuous wave interference performance is the second.
• The anti-FM interference performance is the worst.

From the above measured results, and according to the improved AHP method, the weights of the criterion layer are ranked from large to small as: anti-pulse interference, anti-continuous wave interference and anti-frequency modulation interference. Next, the comparison matrix A−B is established by the three-scale method through pairwise comparison of the indexes, as shown in Table 2. Thus, the order of importance is obtained,
Table 1 Interference suppression degree under different interferences

Interference                    ISR (dB)
Continuous wave interference    55
Pulse interference              62
FM interference                 35
and the judgment matrix is calculated correspondingly. The transfer matrix, the optimal transfer matrix and the quasi-optimal consistent matrix are then calculated in turn, together with the corresponding weights of the indexes of this layer.

Table 2 Comparison matrix A − B

A    B1   B2   B3   WA
B1   1    0    2    0.2583
B2   2    1    2    0.6370
B3   0    0    1    0.1047
Suppose that, under the lowest-constraint condition, the ordering of receivers by the maximum interference-to-signal ratio at which they can no longer position under continuous wave interference is, from strong to weak: receiver A, receiver C, receiver B, receiver D. The anti-pulse jamming ability from strong to weak is: receivers B and D (tied), receiver C, receiver A. The anti-FM jamming ability from strong to weak is: receiver A, receivers B and D (tied), receiver C. Therefore, the comparison matrices B1 − C, B2 − C and B3 − C and their corresponding weights are obtained, respectively, as shown in Tables 3, 4 and 5.

Table 3 Comparison matrix B1 − C

B1   C1   C2   C3   C4   WB1
C1   1    2    2    2    0.5638
C2   0    1    0    2    0.1178
C3   0    2    1    2    0.2634
C4   0    0    0    1    0.0550
According to the above calculation results, the ranking matrix of the anti-jamming performance of each receiver is obtained as follows:
Table 4 Comparison matrix B2 − C

B2   C1   C2   C3   C4   WB2
C1   1    0    0    0    0.0575
C2   2    1    2    1    0.4103
C3   2    0    1    0    0.1220
C4   2    1    2    1    0.4103
Table 5 Comparison matrix B3 − C

B3   C1   C2   C3   C4   WB3
C1   1    1    2    1    0.3125
C2   1    1    2    1    0.3125
C3   0    0    1    0    0.0625
C4   1    1    2    1    0.3125
W = WB · WA = [WB1  WB2  WB3] · WA
  = [ 0.5638  0.0575  0.3125        [ 0.2583
      0.1178  0.4103  0.3125    ·     0.6370
      0.2634  0.1220  0.0625          0.1047 ]
      0.0550  0.4103  0.3125 ]
  = [ 0.2150  0.3245  0.1523  0.3083 ]^T                             (14)
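As a check, the matrix product in (14) can be reproduced directly from the weight vectors of Tables 2, 3, 4 and 5:

```python
WA  = [0.2583, 0.6370, 0.1047]            # criterion-layer weights (Table 2)
WB1 = [0.5638, 0.1178, 0.2634, 0.0550]    # receiver weights under B1 (Table 3)
WB2 = [0.0575, 0.4103, 0.1220, 0.4103]    # receiver weights under B2 (Table 4)
WB3 = [0.3125, 0.3125, 0.0625, 0.3125]    # receiver weights under B3 (Table 5)

# W = [WB1 WB2 WB3] * WA, Eq. (14): combination weight of each receiver A..D
W = [round(sum(wb[i] * wa for wb, wa in zip((WB1, WB2, WB3), WA)), 4)
     for i in range(4)]
print(W)  # -> [0.215, 0.3245, 0.1523, 0.3083]
```

The largest combination weight (receiver B) corresponds to the best overall anti-jamming performance.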
Therefore, from formula (14), the anti-jamming performance of the GNSS receivers from good to bad is: receiver B, receiver D, receiver A and receiver C. For comparison, the traditional 1-9 scale analytic hierarchy process proposed in [13] is applied to the evaluation index system established in this paper. The calculation results of the two methods are consistent, which supports the correctness of the algorithm in this paper. The comparison of the evaluation results is shown in Table 6.

Table 6 Comparison of evaluation results

Method                       A        B        C        D
Method of this article       0.2150   0.3245   0.1523   0.3083
Analytic hierarchy process   0.1855   0.4031   0.0352   0.3762
As can be seen from Table 6, the evaluation results based on the three-scale analytic hierarchy process and the classic nine-scale analytic hierarchy process are consistent: both rank the receivers from good to bad as receiver B, receiver D, receiver A and receiver C, indicating the feasibility of the method in this paper.
4 Conclusions
Anti-jamming ability is one of the most important performance requirements of GNSS receivers, and it is very important to evaluate this ability under different scenarios. In this paper, the improved analytic hierarchy process is used to study the anti-jamming ability of GNSS receivers. The comparison matrix is obtained by the three-scale method, and further calculation yields the optimal transfer matrix, which reduces one-sided subjective judgments and eliminates the need for complex consistency checks. Under different interference scenarios, the experimental data can be used to evaluate the anti-jamming performance of GNSS receivers by taking the maximum input interference-to-signal ratio as the index, allowing a quick and intuitive judgment. The experimental results show that the improved analytic hierarchy process can accurately evaluate different receivers against different types of interference, and it can provide an important reference for anti-jamming performance test and evaluation methods for various types of GNSS receivers.

Acknowledgements. This paper is supported by the National Natural Science Foundation of China (No. 61501471).
References

1. Ma H (2012) Research on the testing scenes design technologies of GNSS anti-jamming antenna arrays. Graduate School of National University of Defence Technology, Changsha, Hunan
2. Gan G, Qiu Z (2000) Navigation and positioning: the big dipper of modern warfare. National Defence Industry Press, Beijing
3. Shang JP (2004) Anti-jam technique research for GPS. Wuhan University, Wuhan
4. Xiong T (2014) Anti-jamming performance evaluation method of GNSS receiver. University of Electronic Science and Technology of China, Chengdu
5. Feng X (2015) The research and implementation on interference suppression technology of satellite navigation receiver. Beijing Institute of Technology, Beijing
6. Zhang F (2013) Research on effect of blanket jamming on BER of satellite navigation signals. The University of Chinese Academy of Sciences, Beijing
7. Zhang Y, Yang S, Zheng W (2008) Pulse weighted frequency modulation jamming technique countering ISAR. Syst Eng Electron 30(2):249-252
8. Li S, Sang H (2006) Study of the anti-jamming performance evaluation methods for satellite navigation receiver. Radio Eng 9:37-41
9. Saaty TL (1980) The analytic hierarchy process: planning, priority setting, resource allocation. McGraw-Hill International Book Co, New York
10. Fang H, Huang J, Zhang Q (2007) Study on torpedo anti-countermeasure capability evaluation based on analytic hierarchy process. J Detection Control 29(2):63-66
11. Yang X, Wang Y (2018) Research on the comber target optimization based on improved analytic hierarchy process. Comput Appl Software 35(4):28-32
12. Huan L (2017) Research on anti-jamming performance test technology of GNSS. J Telemetry Tracking Command 38(4):43-52
Communication Optical Cable Patrol Management System Based on RFID + GIS

Yang Mei(B), Jiang Yaolou, Zhou Bo, Qing Chao, and Chen Zhen

Xichang Satellite Launch Center, Xichang, Sichuan Province, China
[email protected]
Abstract. With the increasing proportion of optical cable in communication lines, it is important and urgent to standardize the construction of optical cables and to maintain and manage them after completion. Aiming at the current situation of optical cable line maintenance and management, this paper puts forward a communication optical cable patrol management system based on RFID + GIS, which provides a stable and reliable communication guarantee for information construction.

Keywords: RFID · GIS · Optical cable management · Optical cable patrol line
1 Introduction

The optical fiber communication network supports the operation of every other network, and optical cable lines are an important part of it; the transmission quality of the optical cable and its maintenance and management directly affect the overall operating quality of the communication network. With the acceleration of urbanization, buried optical cables are often broken by construction work and occasionally bitten by rats, and when a cable breaks, the fault point is not easy to find, leading to communication interruptions that bring great hidden trouble to people's work and life. At the same time, communication optical cable construction generally suffers from "heavy construction, light management": more and more cables are laid, the network is complex, personnel turnover is large, line records are not registered and updated in time, fewer and fewer people know the routing of the laid cables, and finding fault points is time-consuming and labor-consuming. In order to improve the current means of optical cable management, improve the working efficiency of post personnel and meet the needs of information development, it is urgent to build a complete patrol management system to realize the inspection and management of optical cables [1].
2 Key Technology Solutions

2.1 RFID Optical Cable Information Collection

Optical cable information collection mainly gathers the information of the key points of the optical cable, from which the geographical direction of the cable, the way it is laid, its geographical location and so on can be determined. RFID (radio frequency identification) is a non-contact automatic identification technology: it automatically identifies target objects and obtains relevant data through radio frequency signals, and the identification works without manual intervention. As a wireless version of the bar code, RFID has advantages the bar code does not have: it is waterproof, anti-magnetic and resistant to high temperature, has a long service life and a large reading distance, the data on the label can be encrypted, the storage capacity is larger, and the stored information can be changed more freely. As a key technology of the Internet of Things, its application market will grow with the development of the Internet of Things. A complete RFID system consists of a reader, an electronic tag (the so-called transponder) and a management system, as shown in Fig. 1.
Fig. 1 Basic composition of RFID system
The RFID technical workflow is as follows:

1. The reader transmits a radio frequency signal at a certain frequency through the transmitting antenna;
2. When a tag enters the working area of the reader antenna, the tag antenna generates enough induced current and the tag is activated;
3. The tag transmits its stored information through its built-in antenna;
4. The reader antenna receives the carrier signal from the electronic tag;
5. The reader antenna transmits the carrier information to the reader;
6. The reader demodulates and decodes the received signal and sends it to the management system for processing;
7. The management system judges the legitimacy of the tag;
8. The management system carries out the corresponding processing for different settings, issues command signals and controls the execution of the related processing.

From this workflow it can be seen that an RFID system uses radio frequency for non-contact two-way data transmission between reader and electronic tag, achieving target identification, data transmission and control. With the rapid globalization of the Internet of Things and the increasingly fierce international competition in radio frequency identification, the RFID standard system has become an important means for enterprises and countries to participate in international competition. There are three RFID standard systems, ISO/IEC, EPC and UID; the EPC system is adopted in this paper. EPC is a networked system in which the physical object is a commodity and the users are the members of the commodity's supply chain. The EPC standard framework defines the standard for EPC physical object exchange, ensuring that when one user hands a physical object to another, the latter can easily obtain the corresponding item information according to the EPC code of the object. Each item is given a unique number, the EPC code, which uses a set of numbers to represent the characteristics of the product. The EPC code is the only information stored in the RFID tag microchip, which keeps RFID tags inexpensive and flexible. An EPC label chip with an area of less than one square millimeter can store 96/128 bits of binary information.

Compared with the bar code, the advantage of EPC lies not only in its strong marking ability; because it exchanges information by wireless induction, identification is contactless, needs no line of sight, and can pass through water, paint, wood and even the human body. The key point information of the optical cable contains the cable the key point belongs to, whether the key point is overhead or buried, the distance of the cable from the surface, the line number, the distance from the key point to the starting point of the cable, and the geographical location of the key point. The RFID reader uses an industrial Internet of Things handset to write, read and store the electronic labels. The data format of the reader is shown in Table 1. There are six addresses in each EPC; the data length of one address is one word, where 1 word equals 2 bytes and 1 byte equals 8 bits. The coding correspondence table of the EPC text is shown in Table 2.

The workflow of basic information collection for optical cables is as follows:

1. Encode the basic information of the optical cable;
2. Write the basic information of the optical cable to the electronic label through the handset;
3. Install the electronic tag at the corresponding key point;
4. Optical terminal personnel, line patrol personnel or the line repair unit read the key point basic information through the RFID handset;
Table 1 Label EPC field meaning

Name                       Meaning
ID                         Serial number of the data
EPC                        EPC number of the label
PC                         The protocol control word of the label
Number of identifications  The number of labels identified
RSSI                       Signal strength when the label was last identified
Carrier frequency          Carrier frequency when the label was last identified

Table 2 Coding correspondence EPC table

Address   Meaning                                                          Example
00        Spare                                                            00 00 (ex-factory value)
01        Key point number                                                 00 01 (No. 1)
02        The cable the key point belongs to                               05 83 (fiber's name)
03        Laying mode of the key point (overhead = 1, buried = 2,          02 01 (pipe laying mode, 1 m from surface)
          pipe = 3) and the distance from the surface
04        Distance of the key point from the starting point of the cable   16 30 (this point is 5680 m from the cable starting point)
05        Longitude of the key point (the first word is degrees,           66 36 (102°54)
          the second is seconds)
06        Latitude of the key point (the first word is degrees,            1B 20 (27°32)
          the second is seconds)
5. Associate the collected key point basic information with the geographic location information;
6. Store the records in the database.

2.2 Display of Optical Cable Information Based on GIS

The optical cable geographic information part mainly realizes the following functions:

1. Collecting the location information of the key points of the optical cable;
2. Displaying the key points of the optical cable on the map;
3. Displaying the optical cable route on the map;
4. Cable ranging.
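The EPC field layout of Table 2 can be sketched as a simple packing helper. This is an illustrative assumption about the byte order: each address is treated as one big-endian 2-byte word, and the helper function names are hypothetical, not from the system:

```python
def pack_word(value):
    """Encode an integer into one big-endian 2-byte EPC word."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("EPC word must fit in 16 bits")
    return bytes((value >> 8 & 0xFF, value & 0xFF))

def pack_key_point(number, cable_id, lay_word, distance, lon_word, lat_word):
    """Concatenate the words of Table 2 (address 00 is the spare word)."""
    words = (0x0000, number, cable_id, lay_word, distance, lon_word, lat_word)
    return b"".join(pack_word(w) for w in words)

# address 04 example from Table 2: the two bytes 16 30 (hex) encode 0x1630 == 5680
assert pack_word(5680) == bytes((0x16, 0x30))
epc = pack_key_point(1, 0x0583, 0x0201, 5680, 0x6636, 0x1B20)
print(epc.hex())
```

Decoding on the handset side is the mirror operation: read two bytes per address and recombine them as (high << 8) | low.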
GIS (Geographic Information System) is a kind of computer application system based on a geospatial database. Through the computer it collects, manages, operates, simulates and displays spatially related data, and uses geographical model analysis to provide, in time, a variety of spatial and dynamic geographic information, including important information such as location and physical attributes. A bare latitude and longitude coordinate is only recognized and understood by people when it is placed in specific geographic information; the related technology is needed not only to obtain location information but also to understand the geographical environment and to query and analyze environmental information. In GIS, coordinate systems fall into two main categories: geodetic coordinate systems and projection coordinate systems. A geodetic coordinate system is a reference coordinate system whose points lie on an earth ellipsoid, such as WGS84, Xi'an 80, Beijing 54 and CGCS2000. A projection coordinate system is a plane coordinate system whose points lie on a plane, such as Web Mercator. The display of optical cable information mainly covers the location of the key points of the cable, the direction of the cable, manual ranging of the cable, and so on. The geographic location of the cable is obtained by a GPS/BeiDou location acquisition instrument, and the collected location information is displayed in the GIS, which requires secondary development of the GIS. The development process is as follows:

1. Deploy the GIS server;
2. Import the JavaScript package of the 2D/3D integrated map engine;
3. Instantiate the map object to load the two-dimensional map;
4. Add tile layers to the map;
5. Select interfaces from the interface list to complete the corresponding functions.
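As an illustration of the two coordinate families, a WGS84 longitude/latitude pair can be projected to Web Mercator plane coordinates with the standard spherical formulas (a common helper sketch, not code from the system; the test coordinates are approximate):

```python
import math

R = 6378137.0  # WGS84 semi-major axis, used as the sphere radius (meters)

def wgs84_to_webmercator(lon_deg, lat_deg):
    """Project geodetic (lon, lat) in degrees to Web Mercator (x, y) in meters."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4.0 + math.radians(lat_deg) / 2.0))
    return x, y

# Xichang is roughly at 102.27 E, 27.88 N
x, y = wgs84_to_webmercator(102.27, 27.88)
print(round(x), round(y))
```

This conversion is what lets key points collected as latitude/longitude be drawn on a Web Mercator tile map.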
3 RFID + GIS Communication Optical Cable Patrol Management System

The communication optical cable patrol management system realizes the following functions [2]:
1. Establishment of optical cable databases;
2. A platform for adding, deleting, and querying optical cable information;
3. RFID-based collection of optical cable line information;
4. Cable updates displayed in the GIS.
The schematic diagram of the system framework is shown in Fig. 2. The communication optical cable patrol management system is realized in B/S mode and completes functions such as optical cable management, optical fiber management, and optical cable map management. At the same time, it provides a web interface,
Fig. 2 Schematic diagram of RFID + GIS communication cable patrol management system
realizes communication with users, and adopts an independent database server in the persistence layer. The communication optical cable patrol management system mainly includes MySQL database design, model-view-controller design based on the Django framework, Web front-end design, user management, and so on [3-5] (Fig. 3).
Fig. 3 Implementation framework of communication optical cable patrol management system
Before practical application, the geographic location information of the key points of the optical cable is collected with a GPS/Beidou locator and stored in the database; the information of each key point is then written into a separate RFID electronic tag, and these tags are installed at the corresponding positions (such as cable poles and manholes). When a line patrol begins, the patrol personnel use an RFID reader to read the key-point information of the cable and upload it to the system server. Based on the information received, the system platform displays the patrol information on the GIS map in a timely manner. If the cable
fails, the maintenance personnel use an OTDR to determine the distance to the cable fault point and then locate the fault between the two nearest key points through the cable ranging function. In daily optical cable service maintenance and management, maintenance personnel and leading organs can use this platform to obtain, in a timely manner, the location information of the optical cables and the business information they carry.
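The cable ranging between two key points can be sketched as a great-circle distance between their GPS/Beidou coordinates. The paper does not specify the ranging formula, so the haversine formula below is an illustrative assumption; real cable length would additionally account for slack and routing.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two cable key points."""
    r = 6371000.0  # mean Earth radius, meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Distance between two adjacent key points of a cable route (sample coords)
d = haversine_m(39.9042, 116.4074, 39.9100, 116.4100)
```

A fault reported by the OTDR at, say, 300 m from one key point can then be bracketed between the key points whose cumulative distance spans that value.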
4 Concluding Remarks

This paper combines RFID and GIS to design a communication optical cable patrol management system using Django, jQuery, and RFID + GIS. The system realizes accurate management of information such as communication optical cable routes and cable data, improves the quality and efficiency of cable patrol and cable management work, and provides accurate auxiliary decision support for the leading organs.
References
1. Yulan H (2012) Core technologies of radio frequency identification (RFID) in the internet of things. People's Post and Telecommunications Press
2. Garrard C (2017) Python geographic data processing. People's Post and Telecommunications Press
3. Donglin S, Xiaofi Z (2012) Sharp jQuery, 2nd edn. People's Post and Telecommunications Press
4. Yongxiang H (2019) Django web application development. Tsinghua University Press
5. Bo W (2015) jQuery EasyUI development guidelines. People's Post and Telecommunications Press
Lip Language Recognition System Based on AR Glasses Zhenzhen Huang, Peidong Zhuang(B) , and Xinyu Ren College of Electronics Engineering, Heilongjiang University, Harbin 150080, Heilongjiang, P.R. China [email protected], [email protected], [email protected]
Abstract. With the rapid development of science and technology, research on AR, speech recognition, and lip recognition has matured. Speech recognition is used in many fields; however, it is not accurate in noisy environments. Lip recognition can be used as an aid, but obtaining real-time image information is difficult. AR glasses are easy to carry and can obtain image information in real time, yet few studies currently combine lip recognition with AR. The primary goal of this research is the combination of lip recognition and AR glasses. Image information is obtained through the camera module of the AR glasses; face detection is then performed based on OpenCV; the lip-language information is compared with the corpus by image processing to form text, which is finally shown on the screen of the AR glasses. We find that, given a reasonable recognition distance and using AR glasses as a carrier, the lip-read text is displayed in a three-dimensional scene with real-time mapping, and people can see the results of lip recognition more intuitively. After a large number of experiments, it is concluded that lip recognition based on AR glasses works well, is robust, and has strong real-time performance. Keywords: AR glasses · Lip recognition · OpenCV
1 Introduction

In recent years, research on lip recognition at home and abroad has become relatively mature, and recognition accuracy is high [1]. Among the many methods, the hidden Markov model (HMM) classifier is the most extensively used. Because the amount of data in this paper is small, support vector machines are used for data processing: when the number of samples is small, a support vector machine tends to perform better. AR technology has also developed rapidly and is relatively mature; AR glasses in particular are powerful and have good development prospects. At present, AR glasses are mainly used in the military, policing, games and entertainment, security command, and so on. There
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_109
826
Z. Huang et al.
is almost no organic combination of lip recognition and AR glasses. In this paper, the functions of AR glasses are expanded and, combined with lip recognition, a lip recognition system based on AR glasses is studied and its robustness tested.
2 Overall Architecture

This article combines AR glasses and lip recognition: the camera of the AR glasses captures the speaker's picture information at an appropriate distance and sends it for internal processing. The picture information is segmented to extract the lip-language information, which is then recognized and converted into text, and the text is displayed by the AR glasses for the user. The specific process is shown in Fig. 1.
Fig. 1 General structure diagram
3 Extraction of Lip Images

In lip recognition, accuracy is closely related to the accuracy of lip image extraction and segmentation. In many previous studies, researchers worked directly on the entire detected face image or took the lower half of the face as the study region [2]. Such regions often include the nose, which contributes little to lip recognition; this makes the information redundant and complicates the subsequent algorithm. Therefore, in this paper, the face image is first roughly segmented according to the approximate distribution ratio of the facial features to obtain the required lip area image, which simplifies the subsequent operations. Some researchers convert the approximate lip image to grayscale for noise reduction and process it directly in the RGB color space, which discards much of the color information; such processing is also strongly affected by illumination, so color discrimination and robustness suffer. The HSV color space makes better use of brightness and color information and discriminates many colors better [3]. Therefore, this paper operates in the HSV color space to separate the pixels of the lips from the surrounding skin, realizing the extraction and segmentation of lip images.
3.1 Lip Image Preprocessing

After a face image is obtained through OpenCV face detection, it is roughly segmented to obtain a rough lip image. The facial features are distributed in roughly fixed proportions; by processing a large amount of data over multiple experiments, the ratio of the lip area to the entire face can be found and used for segmentation. Multiple experiments on the facial features of different races yield the following general region for the lips (reconstructed from the fractions given in the original):

$$\frac{1}{5}x_{face} \le x_{mouth} \le \frac{4}{5}x_{face}, \qquad \frac{3}{5}y_{face} \le y_{mouth} \le \frac{9}{10}y_{face}$$

where $x_{face}$ represents the left-right width of the face and $y_{face}$ the top-bottom height of the face. Repeated verification shows that this method is robust and valid for people of different races, ages, and genders, as shown in Fig. 2.
Fig. 2 Lip division rough
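The rough segmentation step reduces to simple arithmetic on the face bounding box returned by a detector. The sketch below assumes the region ratios described above (1/5 to 4/5 of the face width, 3/5 to 9/10 of the face height, as reconstructed from the garbled original); the function name is illustrative.

```python
def lip_roi(face_x, face_y, face_w, face_h):
    """Rough lip region (x1, y1, x2, y2) inside a detected face box,
    using fixed facial-proportion ratios (assumed: width 1/5..4/5,
    height 3/5..9/10 of the face box)."""
    x1 = face_x + face_w // 5
    x2 = face_x + 4 * face_w // 5
    y1 = face_y + 3 * face_h // 5
    y2 = face_y + 9 * face_h // 10
    return x1, y1, x2, y2

# e.g. a 200x250 face box detected by OpenCV at (40, 30)
roi = lip_roi(40, 30, 200, 250)  # → (80, 180, 200, 255)
```

Cropping the image to this box before color processing keeps only the mouth neighborhood and discards the nose region the text warns about.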
3.2 Fine Extraction and Segmentation of Lip Images

Now, we operate on these images to further extract the lips. The processing is mainly based on the HSV color space, separating the lip pixels from the surrounding skin pixels so as to achieve the extraction and segmentation of the lips. The HSV color space segmentation model is shown in Fig. 3. Color pictures are generally RGB images, so the RGB images are first converted into HSV images. The specific conversion formulas are as follows:

$$H = \begin{cases} \dfrac{G-B}{\mathrm{Max}(R,G,B)-\mathrm{Min}(R,G,B)} \times 60^{\circ}, & \mathrm{Max}(R,G,B)=R \\[2ex] \left(2+\dfrac{B-R}{\mathrm{Max}(R,G,B)-\mathrm{Min}(R,G,B)}\right) \times 60^{\circ}, & \mathrm{Max}(R,G,B)=G \\[2ex] \left(4+\dfrac{R-G}{\mathrm{Max}(R,G,B)-\mathrm{Min}(R,G,B)}\right) \times 60^{\circ}, & \mathrm{Max}(R,G,B)=B \end{cases} \quad (1)$$

$$S = \frac{\mathrm{Max}(R,G,B)-\mathrm{Min}(R,G,B)}{\mathrm{Max}(R,G,B)} \quad (2)$$

$$V = \mathrm{Max}(R,G,B) \quad (3)$$

Fig. 3 HSV color space segmentation model
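The RGB-to-HSV conversion above can be written directly from the piecewise formula and cross-checked against Python's standard library. This is an independent sketch, not the paper's code; hue is returned in degrees.

```python
import colorsys

def rgb_to_hsv_deg(r, g, b):
    """Convert normalized RGB (0..1) to (H in degrees, S, V) using the
    piecewise max/min formula given in the text."""
    mx, mn = max(r, g, b), min(r, g, b)
    v = mx
    s = 0.0 if mx == 0 else (mx - mn) / mx
    if mx == mn:
        h = 0.0  # achromatic: hue is undefined, conventionally 0
    elif mx == r:
        h = (60 * (g - b) / (mx - mn)) % 360
    elif mx == g:
        h = 60 * (2 + (b - r) / (mx - mn))
    else:
        h = 60 * (4 + (r - g) / (mx - mn))
    return h, s, v

# Cross-check against the standard library implementation
h, s, v = rgb_to_hsv_deg(0.8, 0.3, 0.4)
hs, ss, vs = colorsys.rgb_to_hsv(0.8, 0.3, 0.4)
```

Agreement with `colorsys.rgb_to_hsv` (which returns hue scaled to 0..1) confirms the piecewise formula is the standard conversion.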
We divide the image into three channels, H, S, and V, which represent hue, saturation, and brightness, respectively. H and S carry the color information of the image, and V carries the brightness information. The color information is calculated as:

$$C = \sum_{i=1}^{X}\sum_{j=1}^{Y} H(i,j) + \sum_{i=1}^{X}\sum_{j=1}^{Y} S(i,j) \quad (4)$$

where C represents the color information of the image, and H(i, j) and S(i, j) are the values of the corresponding pixel. Then, the brightness information L is calculated:

$$L = \sum_{i=1}^{X}\sum_{j=1}^{Y} V(i,j) \quad (5)$$

$$X_H = \frac{1}{X \times Y}\sum_{i=1}^{X}\sum_{j=1}^{Y} H(i,j) \quad (6)$$

$$Y_S = \frac{1}{X \times Y}\sum_{i=1}^{X}\sum_{j=1}^{Y} S(i,j) \quad (7)$$

$$C_S(i,j) = (H(i,j) - X_H)^2 + (S(i,j) - Y_S)^2 \quad (8)$$

$X_H$ and $Y_S$, respectively, represent the average values of channels H and S, and $C_S$ is the saliency map of the color information of the image. Next, by calculating the brightness information of the image, a brightness information saliency map is obtained:

$$X_V = \frac{\sum_{i=1}^{X}\sum_{j=1}^{Y} V(i,j)}{X \times Y} \quad (9)$$

$$L_S(i,j) = (V(i,j) - X_V)^2 \quad (10)$$

$X_V$ and $L_S$, respectively, represent the mean value of the brightness channel and the brightness information saliency map of the image. Next, we give different weights to the color information saliency map and the brightness information saliency map, with $\omega_1 + \omega_2 = 1$, to obtain the comprehensive information saliency map Z:

$$Z(i,j) = \omega_1 \times C_S(i,j) + \omega_2 \times L_S(i,j) \quad (11)$$
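The color/brightness saliency fusion can be sketched on plain nested lists. The specific weight values are not given in the text, so ω1 = 0.6, ω2 = 0.4 below are illustrative assumptions (they satisfy ω1 + ω2 = 1).

```python
def mean(channel):
    """Mean value of a 2D channel stored as a list of rows."""
    return sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))

def saliency_fusion(H, S, V, w1=0.6, w2=0.4):
    """Fuse color saliency Cs and brightness saliency Ls per pixel:
    Z = w1*Cs + w2*Ls, with w1 + w2 = 1 (weights here are assumed)."""
    xh, ys, xv = mean(H), mean(S), mean(V)
    rows, cols = len(H), len(H[0])
    Z = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            cs = (H[i][j] - xh) ** 2 + (S[i][j] - ys) ** 2  # color saliency
            ls = (V[i][j] - xv) ** 2                         # brightness saliency
            Z[i][j] = w1 * cs + w2 * ls
    return Z

# Tiny toy channels: one pixel (top-right) differs in H and S
H = [[0.1, 0.9], [0.1, 0.1]]
S = [[0.2, 0.8], [0.2, 0.2]]
V = [[0.5, 0.5], [0.5, 0.5]]
Z = saliency_fusion(H, S, V)
```

The outlier pixel gets the largest Z value, which is exactly what the subsequent Otsu thresholding exploits to binarize lip versus skin.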
Then, threshold segmentation is applied: the threshold is obtained with the Otsu algorithm, and a binary image is finally produced, completing the fine extraction and segmentation of the lips. A large number of experiments show that this segmentation model is efficient and accurate.

3.3 Extraction of Visual Features of the Lip Region

Some researchers locate the corners of the mouth with lip edge detection [4]. Literature [5] proposes lip localization based on grayscale images; after comparing various algorithms, the grayscale-based lip positioning method proves fast and practical. Previous studies extracted 12 key points, which is not very accurate; in this paper, 20 key points are extracted, which is more accurate and reflects more feature information. The effect is shown in Fig. 4. Then, a lip curve is fitted to these key points to obtain the geometric characteristics of the lips. A study of many curve families shows that the higher the degree of the curve, the more frequently it changes, and the poorer its fit to the lips; many experiments show, however, that second- and third-order curves fit the lips well. This article uses three independent quadratic curves to model the geometric characteristic curve of the lips. The lip model is shown in Fig. 5. On the upper left of the lip, the second-order curve from $Q_1$ to $M_0$ is $y = \alpha_1 x^2 + \beta_1 x + c_1$. On the upper right of the lips, the second-order curve from $Q_2$ to $M_0$ is $y = \alpha_2 x^2 + \beta_2 x + c_2$.
Fig. 4 Key point extraction
Fig. 5 Lip model
Below the lips, the second-order curve from $Q_1$ to $Q_2$ is $y = \alpha_3 x^2 + \beta_3 x + c_3$. The lip parameter model formed by these three second-order curves imposes an effective geometric constraint on the lips, expressed by the parameter array:

$$\mu = \{x, y, \alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c_1, c_2, c_3\}$$

Multiple tests show that this parameter model matches most lip shapes well. In the same way, the inner contour curve of the lips is fitted, and four ratios are used to represent the basic shape of the lips: the ratio of the outer lip contour height to width, $K_1 = H_0/W_0$; the ratio of the upper to lower height of the outer lip contour, $K_2 = H_{ou}/H_{od}$; the ratio of the inner contour height to width, $K_3 = H_i/W_i$; and the ratio of the upper to lower height of the inner contour, $K_4 = H_{iu}/H_{id}$. The front part of the geometric visual feature is $T_1 = (K_1, K_2, K_3, K_4)$. The parameters of the three lip fitting curves form the other part, namely $T_2 = (\alpha_1, \alpha_2, \alpha_3, \beta_1, \beta_2, \beta_3, c_1, c_2, c_3)$. Combining the two, the geometric feature vector is obtained as follows:

$$T_g = (T_1, T_2) \quad (12)$$
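Fitting one of the three quadratic lip curves to its key points is a small least-squares problem. The sketch below solves the 3×3 normal equations with Gaussian elimination in pure Python; the paper does not specify its fitting method, so this is an illustrative implementation.

```python
def fit_quadratic(points):
    """Least-squares fit of y = a*x^2 + b*x + c to (x, y) lip key points
    via the 3x3 normal equations (small Gaussian elimination)."""
    sx = [sum(x ** k for x, _ in points) for k in range(5)]  # sums of x^0..x^4
    sy = [sum(y * x ** k for x, y in points) for k in range(3)]
    A = [[sx[4], sx[3], sx[2]],
         [sx[3], sx[2], sx[1]],
         [sx[2], sx[1], sx[0]]]
    rhs = [sy[2], sy[1], sy[0]]
    for i in range(3):  # forward elimination with partial pivoting
        p = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        rhs[i], rhs[p] = rhs[p], rhs[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for col in range(i, 3):
                A[r][col] -= f * A[i][col]
            rhs[r] -= f * rhs[i]
    coef = [0.0, 0.0, 0.0]
    for i in range(2, -1, -1):  # back substitution
        coef[i] = (rhs[i] - sum(A[i][col] * coef[col]
                                for col in range(i + 1, 3))) / A[i][i]
    return coef  # [a, b, c]

# Key points sampled from y = 2x^2 - 3x + 1 should recover (2, -3, 1)
pts = [(x, 2 * x * x - 3 * x + 1) for x in (-2, -1, 0, 1, 2, 3)]
a, b, c = fit_quadratic(pts)
```

Running this once per curve segment (upper-left, upper-right, lower) yields the nine coefficients collected in $T_2$.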
3.4 Pixel Feature Extraction

For pixel feature extraction, the lip image is regarded as a matrix, and operations are performed on this matrix; in this way, all pixel features in the image can be included. There are roughly four methods for extracting pixel features: the direct pixel method, the transformation method, the optical flow method, and the characteristic-lip method. The transformation method reduces the dimensionality of the data well and is robust, so it is chosen in this paper; among the transform methods, we mainly use the DCT. The two-dimensional DCT of an image is:

$$F(u,v) = \frac{2C(u)C(v)}{n}\sum_{x=0}^{n-1}\sum_{y=0}^{n-1} f(x,y)\cos\frac{(2x+1)u\pi}{2n}\cos\frac{(2y+1)v\pi}{2n} \quad (13)$$

The corresponding inverse transformation is:

$$f(x,y) = \frac{2}{n}\sum_{u=0}^{n-1}\sum_{v=0}^{n-1} C(u)C(v)F(u,v)\cos\frac{(2x+1)u\pi}{2n}\cos\frac{(2y+1)v\pi}{2n}, \quad u,v,x,y = 0,1,\ldots,n-1 \quad (14)$$

$$C(u) = \begin{cases} \dfrac{1}{\sqrt{2}}, & u = 0 \\ 1, & u = 1,2,\ldots,n-1 \end{cases} \quad (15)$$
To process the lip image, the image is first divided into several sub-regions for block DCT transformation. After the DCT of each sub-region, a coefficient matrix is obtained. Through experiments, the pixel feature vector of the lip region is obtained as $T_p = (t_{s1}, t_{s2}, t_{s3}, t_{s4}, t_{s5}, t_{s6})$, where $t_{si} = (c_{i1}, c_{i2}, \ldots, c_{i7})$, $i = 1, \ldots, 6$, and $c_{ik}$ ($k = 1, \ldots, 7$) denotes the selected DCT coefficients of the i-th sub-region.
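The block DCT of Eqs. (13) and (15) can be sketched directly. This naive O(n⁴) version is for illustration only; a real system would use a fast DCT on each sub-region.

```python
import math

def dct2(block):
    """Two-dimensional DCT of an n x n block, following Eqs. (13)/(15)."""
    n = len(block)

    def C(u):
        return 1 / math.sqrt(2) if u == 0 else 1.0

    F = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                    * math.cos((2 * y + 1) * v * math.pi / (2 * n))
                    for x in range(n) for y in range(n))
            F[u][v] = 2 * C(u) * C(v) * s / n
    return F

# A constant block concentrates all its energy in the DC coefficient F[0][0]
F = dct2([[5.0] * 4 for _ in range(4)])
```

Keeping only a few low-frequency coefficients per sub-region (seven per block in the text) is what makes the transform method a compact, robust pixel feature.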
3.5 Visual Feature Fusion of Lip Images

We fuse the geometric features of the lips with the pixel features and, at the same time, include the dynamic information of the lips while the speaker is talking by introducing first-order difference features:

$$T_{lip} = (T_g, T_p) \quad (16)$$

Differences in the numerical ranges of the vector elements greatly affect recognition accuracy, so this paper normalizes each dimension of the integrated feature vector. Because $T_1$, $T_2$, and $T_p$ are of different character, they are treated separately:

$$t^{*} = \frac{t - t_{min}}{t_{max} - t_{min}} \quad (17)$$

where t is the element of the feature vector to be normalized, $t_{min}$ and $t_{max}$ are the minimum and maximum values of that element, respectively, and $t^{*}$ is the normalized result. The normalized comprehensive lip feature vector is then:

$$T^{*}_{lip} = (T_1^{*}, T_2^{*}, T_p^{*}) \quad (18)$$

The dynamic information of the lips is expressed by a difference equation:

$$d(n) = \frac{\sum_{i=-m}^{m} i \times c(n+i)}{\sum_{i=-m}^{m} i^2} \quad (19)$$

where c(n) is the feature vector of a given lip image, m is the interval considered for dynamic features, and d(n) is the dynamic feature vector associated with the lip image:

$$T_d = \nabla T^{*}_{lip} \quad (20)$$

The resulting comprehensive feature vector is $T = (T_d, T^{*}_{lip})$.

3.6 Visual Language Recognition Based on SVM Algorithm

Many researchers use an HMM classifier for the language recognition of the extracted lip information [6], but the literature shows that when data samples are few, the recognition rate of a support vector machine (SVM) classifier is better than that of a hidden Markov model classifier. Since the amount of data in this paper is small, we choose an SVM classifier for lip recognition. Before the SVM performs language recognition, the dimensionality of the data is first reduced by principal component analysis. Long-term recognition tests show that, with the comprehensive features of multiple lip images, the recognition accuracy is relatively high, reaching 70%, which demonstrates that the preceding lip segmentation and feature extraction methods are reasonable and feasible.
4 AR Glasses Combined with Lip Recognition

This paper treats the AR glasses as both input and output: the lip recognition results are presented through the AR glasses. The AR glasses extract real-world information and map it one-to-one onto Chinese-language text; finally, the Chinese characters are displayed through the AR glasses, which serve as the carrier for the user. The working flowchart is shown in Fig. 6. The AR glasses capture the speaker's image information through a miniature camera and pass it to the background for processing; the speaker's image is segmented to obtain the lip image, and the lip information is processed. Through the series of operations described in this article, the information in the lip image is converted into text, which is finally shown on the AR glasses and fed back to the user. Many experiments show that AR glasses are well suited as a carrier for information collection and for displaying the text produced by lip recognition.
Fig. 6 Working flowchart
5 Conclusion

This paper studies lip recognition building on previous work and organically combines lip recognition technology with AR glasses, using the rich functions of the glasses so that users can easily obtain real-time image information and have the results of language recognition displayed clearly and conveniently on the AR glasses display. Many experiments show that this combination is highly feasible and robust and can be used in many areas such as investigation, learning aids for the deaf-mute, and security command.
References
1. Puviarasan N, Palanivel S (2010) Lip reading of hearing impaired persons using HMM. Expert Syst Appl 38(4)
2. Lu K (2015) Research on language recognition technology based on lip visual features. North China University of Technology
3. Zhang C, Yang W, Liu Z (2013) Color image segmentation method based on HSV comprehensive saliency. Comput Eng Des 34(11):3944–3947
4. Lewis TW, Powers DM (2000) Lip feature extraction using red exclusion. In: Pan-Sydney workshop on visual information processing. Australian Computer Society, Sydney, pp 61–67
5. Wu W (2013) Research on dynamic lip segmentation and tracking algorithm. Shanghai Jiaotong University
6. Cai Y (2018) Research on lip recognition method based on hidden Markov model. North University of Technology
Device-Free Human Activity Recognition Based on Channel Statement Information Ruoyu Cao, Xiaolong Yang(B) , Mu Zhou, and Liangbo Xie School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China [email protected]
Abstract. The human activity recognition system has been researched in various technical fields for decades. Nowadays, as WiFi is widely deployed in homes, using WiFi devices for human activity recognition has become an attractive choice. In recent years, more and more researchers have used channel state information (CSI) to realize human activity recognition: the fine-grained CSI features can reflect the impact of human activity on the channel. However, existing CSI-based systems share a problem: because the phase information in CSI carries a high proportion of error and noise, the information in CSI is not fully utilized during processing. In this paper, we propose a method for extracting the phase information in CSI so that the effective information in CSI can be completely extracted as the input feature of the classifier. We then use k-means to extract the principal features and, finally, a support vector machine (SVM) to learn the features and perform activity recognition. The experimental results show that our system performs well. Keywords: Channel state information · Human activity recognition · Effective information
1 Introduction

Nowadays, human-computer interaction has become one of the most promising development directions, and human activity recognition (HAR) is one of its key technologies, playing an increasingly important role in human production and life [1-3]. It has great prospects and application value in fields such as smart homes, security monitoring, and medical assistance. Channel state information (CSI) is fine-grained physical-layer information that can be obtained through WiFi devices; CSI-based HAR is therefore popular with researchers and has been extensively studied. However, traditional human activity recognition systems ignore the phase feature, which is also a key parameter for recognition. In this paper, we propose a CSI-based device-free human activity recognition system to overcome the shortcomings of existing systems.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_110
836
R. Cao et al.
2 Methods

When an action occurs, the signal reflected by the human body affects the signal propagating through the channel. By monitoring the state of the wireless channel, the receiver can measure the small signal changes caused by human movement and use these changes to recognize human activities. In order to make full use of the CSI information, after multipath selection removes the CSI error, we use a linear transformation to reduce the random noise introduced into the phase of the WiFi signal as it propagates through the channel [4, 5], and at the same time remove, by filtering, the high-frequency channel noise contained in the amplitude. We then use k-means to extract the principal features. Finally, we train a support vector machine (SVM) to obtain the classifier model for activity recognition. Our system structure is shown in Fig. 1.
Fig. 1 Framework of our proposed human activity recognition system (offline phase: CSI acquisition → complete feature information extraction → SVM model training with labels; online phase: CSI acquisition → complete feature information extraction → recognition)
Next, we calculate the amplitude and phase information of the CSI; the error and noise content of the phase information is much higher than that of the amplitude information. According to PADS [6], the phase information consists of several parts. For example, the phase of the k-th subcarrier can be expressed as:

$$\hat{\phi}_k = \phi_k - 2\pi\frac{k}{K}\delta + \beta + D \quad (1)$$

where $\phi_k$ represents the true phase, $\delta$ is the timing offset of the receiver, which causes the phase error in the middle term, $\beta$ is an unknown phase offset, and D is the measurement noise (Fig. 2).

Fig. 2 Results of phase processing (polar plot comparing the raw phase with the pre-processed phase)
Although the subcarrier index is defined as −28 to 28 in IEEE 802.11n, since the modified WiFi driver exposes the CSI of 30 subcarriers, we adjust the index range to −15 to 15. Compared with PADS, we have already removed most of the error through multipath selection, so we can obtain more accurate CSI phase information. After extracting the amplitude and phase information of the CSI, we use a standardization algorithm to normalize the amplitude and phase features uniformly. Then, we perform k-means clustering on the amplitude and phase information of the different subcarriers, respectively, to obtain the main information characteristics of amplitude and phase. After obtaining the main information features in the CSI, we use an SVM to generate the classifier for human activity recognition.
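The linear-transformation phase sanitization can be sketched as follows: unwrap the raw phase across subcarriers, then subtract the best-fit line so that the timing-offset slope and the constant offset of Eq. (1) vanish. This is an illustrative re-implementation of the general idea, not the paper's exact code.

```python
import math

def sanitize_phase(raw_phase):
    """Remove the linear (timing-offset) term and constant offset from a
    CSI phase vector across subcarriers via a least-squares line fit
    (a sketch; subcarrier indexing is illustrative)."""
    # Unwrap so that 2*pi jumps do not corrupt the fit
    unwrapped = [raw_phase[0]]
    for p in raw_phase[1:]:
        d = p - unwrapped[-1]
        d -= 2 * math.pi * round(d / (2 * math.pi))
        unwrapped.append(unwrapped[-1] + d)
    k = list(range(len(unwrapped)))
    n = len(k)
    km = sum(k) / n
    pm = sum(unwrapped) / n
    slope = sum((ki - km) * (pi - pm) for ki, pi in zip(k, unwrapped)) / \
            sum((ki - km) ** 2 for ki in k)
    # Subtract the fitted line: what remains is the motion-induced phase
    return [pi - (pm + slope * (ki - km)) for ki, pi in zip(k, unwrapped)]

# True phase 0 everywhere, corrupted by slope 0.3 rad/subcarrier + offset 1.0
corrupted = [0.3 * i + 1.0 for i in range(30)]
clean = sanitize_phase(corrupted)
```

On purely linear corruption the residual is zero, which is why the pre-processed phase in Fig. 2 collapses into a tight cluster.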
3 Results and Conclusion

In order to verify the effectiveness of the proposed system, we constructed a data set and tested the system on it. The data set was collected with two mini computers equipped with Intel 5300 network cards as transceivers, each with one antenna. We extracted the six most common activities in the dataset for testing. Figure 3 shows our test results: the average recognition accuracy of our system is as high as 96%.

Fig. 3 Test results using the test dataset (recognition accuracy for different activities: bend, hand clap, walk, run, squat)

In this paper, we first propose a method of phase information extraction and error elimination, which is suitable for line-of-sight environments. Then, we take the effective
information contained in the completely extracted CSI as features and input them to the SVM for training to obtain the activity recognition model. The experimental results show that our system performs well in an actual experimental environment.
References
1. Twomey N, Diethe T, Craddock I et al (2017) Unsupervised learning of sensor topologies for improving activity recognition in smart environments. Neurocomputing 234:93–106
2. Liu X, Jia M, Zhang X et al (2018) A novel multichannel Internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
3. Yang J, Zou H, Jiang H et al (2018) Device-free occupant activity sensing using WiFi-enabled IoT devices for smart homes. IEEE Internet Things J 5(5):3991–4002
4. Wang J, Huang Z, Zhang W et al (2016) Wearable sensor based human posture recognition. In: 2016 IEEE international conference on big data (Big Data). IEEE, New York, pp 3432–3438
5. Liu X, Zhang X (2019) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Ind Inform
6. Qian K, Wu C, Yang Z et al (2014) PADS: passive detection of moving targets with dynamic speed using PHY layer information. In: 2014 20th IEEE international conference on parallel and distributed systems (ICPADS). IEEE, New York, pp 1–8
Wi-Breath: Monitoring Sleep State with Wi-Fi Devices and Estimating Respiratory Rate Xin Yu, Xiaolong Yang(B) , Mu Zhou, and Yong Wang School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China [email protected]
Abstract. As one of the human health indicators, breath is becoming a research emphasis. Traditional approaches based on wearable devices or pressure sensors are expensive to manufacture and not suitable for daily use. In this paper, we present Wi-Breath, a monitoring system that extracts the disturbed parts from the received signals and estimates the respiratory rate from the rest. We use wavelet transform to remove the noise of the signal amplitude. Then, by analyzing the time-frequency information of the signal, we use a sliding-window method to segment it. Finally, we use the Fourier transform (FT) to calculate the respiratory rate of the remaining segments. Experimental results show that Wi-Breath can reliably remove the disturbed part of the signal and estimate the approximate respiratory frequency. Keywords: Wi-Fi · Channel state information (CSI) · Breath
1 Introduction

Sleep is an important part of human daily life [1, 2], and breathing plays an important role as a vital sign for detecting sleep quality. Some commercial devices on the market collect data from headphones or wristbands worn by users to analyze sleep quality [3]; however, these additional devices are not comfortable to wear and are not suitable for daily use. WiSpiro [4] uses frequency-modulated continuous wave (FMCW) radar to analyze the phase change of the continuous-wave signal sent by a 2.4 GHz directional antenna, reconstructs the chest and abdomen movement, and maps it to the breathing process; however, it requires additional customized hardware, which is costly. TagBreathe [5] estimates the respiratory rate by analyzing the low-level data obtained by a radio frequency identification (RFID) reader; however, the security of RFID technology is not strong enough, as tag information can easily be read illegally or tampered with maliciously. With the popularity of Wi-Fi devices, Wi-Fi-based breath detection has become a research hotspot. UbiBreathe [6] uses the received signal strength indication of Wi-Fi devices for breath estimation. Liu [7]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_111
840
X. Yu et al.
et al. combine the CSI amplitude and phase difference to capture the tiny movements caused by breathing and heartbeat. Phasebeat [8] uses wavelet transform to decompose and reconstruct the respiratory and heart rate signal.
2 System Description

In IEEE 802.11n/ac, orthogonal frequency division multiplexing is used, so CSI can be measured and analyzed at the physical layer. Thus, CSI can be described as:

$$h_{k,m} = \sum_{l=1}^{L} \gamma^{l}_{k,m} e^{-j 2\pi f_k \tau^{l}_{k,m}} \quad (1)$$

where $\gamma^{l}_{k,m}$ and $\tau^{l}_{k,m}$ represent the amplitude attenuation coefficient and time delay of the k-th subcarrier of the l-th propagation path on the m-th packet, respectively, and $f_k$ denotes the frequency of the transmitted signal. Assuming that there are M packets, the received CSI can be expressed as a receiving matrix:

$$\mathrm{CSI} = \begin{bmatrix} h_{1,1} & h_{1,2} & \cdots & h_{1,M} \\ \vdots & \vdots & \ddots & \vdots \\ h_{K,1} & h_{K,2} & \cdots & h_{K,M} \end{bmatrix} \quad (2)$$
2.1 Noise Cancellation

In practice, the received signals contain noise in which the required signal is submerged. Therefore, we first use a filter based on the median absolute deviation to filter out the outliers, which are replaced by the data median. Then, wavelet denoising is applied to the CSI amplitude; in this paper, "db3" is chosen as the wavelet basis and the signal is decomposed into 5 layers.

2.2 Signal Segmentation and Breath Segment Extraction

In order to calculate the respiratory frequency accurately, this paper uses a sliding-window method [9] to segment the signal. Assuming a window length N and a signal length M, the variance of the CSI signal difference between two adjacent windows is:

$$V_n = \frac{1}{N-1}\sum_{m=1}^{N}\left((\mathrm{csi}_n(m) - \mathrm{csi}_{n-1}(m)) - \frac{1}{N}\sum_{m=1}^{N}(\mathrm{csi}_n(m) - \mathrm{csi}_{n-1}(m))\right)^2 \quad (3)$$

where n indicates the window index. Then, we normalize $V_n$ to obtain $V'_n$:

$$V'_i = \frac{V_i - \min\{V_i\}}{\max\{V_i\} - \min\{V_i\}}, \quad 1 \le i \le n \quad (4)$$
Wi-Breath: Monitoring Sleep State …
Firstly, we set the start flag to "False". While it is "False", we compare each normalized V_i with the threshold δ. If V_i ≥ δ, we set the start time T_begin = (i − 1) · N and change the start flag to "True". While the start flag is "True", if V_i < δ, we set the middle node V_test = α · V_i + (1 − α) · V_{i+1}. Judging V_test, if V_test < β · V_{i+1}, we set the end time node T_end = i · N. After traversing all V_n, we obtain the start and end time nodes of all the fragments. In this paper, we set the threshold δ to 0.65, and the weighted parameters α and β to 0.85 and 3, respectively.
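The segmentation procedure of Eqs. (3)-(4) together with the start/end-flag rule can be sketched as follows; the threshold δ = 0.65 and weights α = 0.85, β = 3 come from the paper, while the windowing scaffolding and the synthetic test signal are our own assumptions.

```python
import numpy as np

def segment(csi, N, delta=0.65, alpha=0.85, beta=3.0):
    """Sliding-window segmentation sketch following Eqs. (3)-(4) and the
    start/end-flag rule (0-based window indices here)."""
    n_win = len(csi) // N
    windows = csi[:n_win * N].reshape(n_win, N)
    V = []
    for n in range(1, n_win):
        d = windows[n] - windows[n - 1]                    # adjacent-window difference
        V.append(np.sum((d - d.mean()) ** 2) / (N - 1))    # Eq. (3)
    V = np.asarray(V)
    Vbar = (V - V.min()) / (V.max() - V.min())             # Eq. (4)
    segments, start, begin = [], False, 0
    for i, v in enumerate(Vbar):
        if not start and v >= delta:
            begin, start = i * N, True                     # open a segment
        elif start and v < delta:
            v_next = Vbar[i + 1] if i + 1 < len(Vbar) else v
            v_test = alpha * v + (1 - alpha) * v_next      # middle node
            if v_test < beta * v_next:
                segments.append((begin, (i + 1) * N))      # close the segment
                start = False
    return segments

# Synthetic signal: quiet windows, a high-amplitude burst, then quiet again.
base = np.sin(np.linspace(0, np.pi, 50))
amps = [0, 0, 0, 0, 0, 10, 0, 10, 0, 10, 1, 2, 1, 2, 1]
sig = np.concatenate([a * base for a in amps])
segs = segment(sig, N=50)
```

On this signal the detector opens one segment at the start of the burst and closes it when the windows calm down again.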
3 Experiments and Evaluation The test environment is a typical conference office. The tester lies on the conference table to sleep, with the receiver and transmitter placed on the tester's left and right sides. To examine the effect of not segmenting the signal and not distinguishing the sleep state on the respiratory rate, we take a received CSI signal of about 180 s containing apnea as an example, as shown in Fig. 1.
Fig. 1 Signal segmentation result
The tester stopped breathing in the interval of about 90–110 s, during which the chest and abdomen had almost no movement. The segmentation algorithm divides the signal into 4 breathing segments and 1 apnea segment. Then, we utilize the Fourier transform (FT) to calculate the respiratory rate of each part separately and compare it with the actually measured respiratory rate. From Fig. 2, it is obvious that the measured and estimated respiratory rates are very close, with a maximum estimation error of only 1.2 bpm. Therefore, segmenting the measured signal, excluding the non-breathing segments, and then calculating the breathing frequency of the remaining parts yields a more accurate respiratory rate.
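The per-segment rate estimation via the Fourier transform can be sketched as below; the sampling rate and the 0.1–0.7 Hz breathing band are illustrative assumptions, not values stated in the paper.

```python
import numpy as np

def respiratory_rate_bpm(segment, fs):
    """Estimate breaths per minute from a breathing segment via the FT,
    as done for each segment above. The peak search is limited to a
    plausible breathing band (0.1-0.7 Hz, i.e. 6-42 bpm; our assumption)."""
    x = segment - np.mean(segment)              # remove the DC component
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    band = (freqs >= 0.1) & (freqs <= 0.7)
    f_breath = freqs[band][np.argmax(spec[band])]
    return 60.0 * f_breath

fs = 20.0                                       # packets per second (illustrative)
t = np.arange(0, 40, 1 / fs)
sig = np.sin(2 * np.pi * 0.25 * t)              # 0.25 Hz breathing -> 15 bpm
```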
Fig. 2 Comparison of measured and estimated respiratory rate after segmentation
4 Conclusions In this paper, we present a respiratory rate estimation system that can monitor human sleep state without modifying hardware facilities. Compared with ordinary CSI-based breathing detection systems, Wi-Breath fully considers the various situations that may occur during real sleep, such as getting up, turning over, and apnea, so it obtains a more accurate respiratory rate and has more practical application value. In the future, monitoring the sleep status of multiple people will be one of our next research topics.
References
1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J 6(4):5962–5970
2. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Industr Inf 16(8):5379–5388
3. https://www.fitbit.com/
4. Nguyen P, Zhang X, Halbower A et al (2016) Continuous and fine-grained breathing volume monitoring from afar using wireless signals. In: IEEE INFOCOM 2016-IEEE conference on computer communications. IEEE, New York
5. Hou Y, Wang Y, Zheng Y (2017) TagBreathe: monitor breathing with commodity RFID systems. In: 2017 IEEE 37th international conference on distributed computing systems (ICDCS). IEEE, New York
6. Abdelnasser H, Harras KA, Youssef M (2015) UbiBreathe: a ubiquitous noninvasive WiFi-based breathing estimator. In: Proceedings of IEEE MobiHoc, vol 15
7. Liu J, Wang Y, Chen Y et al (2015) Tracking vital signs during sleep leveraging off-the-shelf WiFi. In: The 16th ACM international symposium. ACM
8. Wang X, Yang C, Mao S (2017) PhaseBeat: exploiting CSI phase data for vital sign monitoring with commodity WiFi devices. In: IEEE international conference on distributed computing systems. IEEE, New York
9. Wu X, Chu Z, Yang P (2019) TW-See: human activity recognition through the wall with commodity Wi-Fi devices. IEEE Trans Veh Technol 68(1):306–319
Hard Examples Mining for Adversarial Attacks Jiaxi Yang(B) School of the Gifted Young, University of Science and Technology of China, Hefei, China [email protected]
Abstract. This paper focuses on adversarial attacks and the security of machine learning and deep learning models. We apply different methods of perturbation to pictures in ImageNet, record the rate of examples that are successfully attacked and wrongly recognized, and then plot a graph describing the relationship between the intensity of attack and the accuracy of recognition. Then, we investigate whether the reason certain examples are hard to attack is determined by the models or by the examples themselves. Besides, we analyze the pictures that are extremely resistant to the attacks and find some visual characteristics that keep them robust. Keywords: Machine learning · Adversarial attack · Adversarial machine learning · Computer vision

1 Introduction
At present, AI, and deep learning in particular, is becoming one of the most popular fields in computer science, not only because some products with AI technology have come into people's sight, but also because it saves considerable labor. However, together with the rapid development of machine learning techniques, concerns about the security of models, whether machine learning or deep learning, have been raised in recent years. For example, applications such as face recognition and self-driving cars demand stability and safety, because the consequences of their systems being easily attacked are hard to imagine. Therefore, a number of attack methods such as FGSM [1], PGD [2], adversarial glasses [4], and so on have been made public in order to enhance the robustness of machine learning models (Fig. 1). When the structures and parameters are known in detail, many mechanisms can generate adversarial examples, including the fast gradient sign method (FGSM) [1], iterative FGSM (I-FGSM), PGD [2], the Jacobian-based saliency map attack (JSMA,
Fig. 1 We show a pair of original example and adversarial example generated by the projected gradient descent (PGD) for the ResNet101 [4] model
L0) [3], DeepFool [5], the Carlini–Wagner attack [6], the momentum-based iterative attack (MI-FGSM) [7], and so on. The above methods can be classified as white-box because they cannot compute adversarial examples until every parameter and the number of layers are known. On the other hand, if these parameters are unknown, black-box manners are used to generate adversarial examples. Because adversarial examples generally have good transferability, we can obtain them by attacking known deep learning models. Besides, other characteristics can also distinguish attack methods, such as targeted or non-targeted, which indicates whether attackers want to mislead models toward specific wrong categories. Having reviewed recent works on adversarial example generation and the basic kinds of attack methods, we note in particular that PGD is an accepted iterative attack method that can apply tiny perturbations to images to mislead classification models. Compared to FGSM, it calculates iteratively to achieve adversarial examples with small perturbations. Moreover, it is easy to implement, and many papers use it as a basic attack method. In this paper, we compare different methods by attacking different classification models with the same inputs. Moreover, we try to categorize the hard samples that fail to generate adversaries. Section 2 describes some related works used in our experiments, Sect. 3 shows the relationships and graphs about attacking models, Sect. 4 gives the findings during the attacking process, and Sect. 5 summarizes the whole work and proposes some potential future works.
2 Related Works
In this paper, we mainly use white-box methods. The fast gradient sign method (FGSM) [1] was the first work aiming to find the shortcomings of deep learning models and to point out the potential reasons why they are easy to attack.
The specific mechanism of FGSM is to ascend the gradient of the loss function in the input space and generate adversarial examples by a one-step update. However, it is hard to control the difference between original images and attacked images, so it is effective to improve the method by changing from one step to many. Not only does this well control the distance from processed images to original images, but it also achieves surprisingly high performance in misleading the models. This passage lists some basic ideas of attack methods and the underlying structure of these methods.

One-step gradient-based approaches: the most representative one is FGSM [1]. In order to maximize the loss function J(x^*, y), where J is often the cross-entropy loss, FGSM generates adversarial examples meeting the L_\infty norm bound \|x^* - x\|_\infty \le \epsilon as

x^* = x + \epsilon \cdot \mathrm{sign}(\nabla_x J(x, y)),    (1)

where \nabla_x J(x, y) is the gradient of the loss function. Similarly, the perturbation can be controlled by the L_2 norm bound \|x^* - x\|_2 \le \epsilon as

x^* = x + \epsilon \cdot \frac{\nabla_x J(x, y)}{\|\nabla_x J(x, y)\|_2}.    (2)

Both of the above methods are one-step, i.e., calculated in one pass.
both of the above methods are one step, which is calculated in one time. Iterative methods, compare with one-step methods, iterative method applies fast gradient multiple times with a small step size alpha. The FGSM can be modified to iterative version, and it can be expressed as follows: x∗0 = x, x∗t+1 = x∗t + αsign(∇x J(x, y)),
(3)
one can apply the FGSM for several times, and each step smally so that the adversarial examples can be generated. It has been showed that iterative methods have a stronger performance in white-box manner than one-step method under lots of experiments. Apart from the high-performance version of FGSM [1], the projected gradient descent method (PGD) [2] can act also in iterative way attack with a improvement that a universal first-order adversary with random starts, which can be expressed as follows: xt+1 = Πx+S (xt + αsign(∇x L(θ, x, y))).
(4)
In this paper, we apply different versions of PGD, using several kinds of distance such as L∞, L2, and L_sparse, so as to conduct comprehensive experiments on attacking models and generating adversarial examples.
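A minimal sketch of the L∞ PGD update of Eq. (4) follows, run against a toy logistic classifier with an analytic gradient rather than the CNNs used in the paper; the weights, the clean example, and all constants are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_linf(x, y, w, b, eps, alpha, steps):
    """L-infinity PGD (Eq. 4) against a toy logistic classifier.
    Only the attack logic follows PGD; the model is an illustrative
    stand-in with a closed-form cross-entropy gradient."""
    rng = np.random.default_rng(0)
    x_adv = x + rng.uniform(-eps, eps, size=x.shape)    # random start
    for _ in range(steps):
        grad = (sigmoid(w @ x_adv + b) - y) * w         # d(cross-entropy)/dx
        x_adv = x_adv + alpha * np.sign(grad)           # signed gradient ascent
        x_adv = x + np.clip(x_adv - x, -eps, eps)       # project onto L_inf ball
    return x_adv

w = np.array([1.0, -2.0, 0.5])
b = 0.0
x = np.array([0.3, -0.2, 0.1])        # clean example: score w@x = 0.75 > 0, class 1
x_adv = pgd_linf(x, 1, w, b, eps=0.4, alpha=0.1, steps=20)
```

The adversarial point stays inside the ±eps box around x yet flips the classifier's decision.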
3 Experiments

3.1 Set Up
We use several kinds of convolutional neural network (CNN) [8] as the target. As to CNNs, which are composed of several layers, including convolutional layer,
pooling layer, and fully connected layer, a number of attack methods succeed due to their extreme nonlinearity [1]. Until now, with more and more scientists focusing on improving CNN performance, the classification accuracy of popular models can reach ninety percent. Moreover, more and more kinds of high-performance CNNs are coming into public through the large-scale visual recognition challenge, which encourages people from all over the world to apply models on ImageNet. In this paper, we choose some representative convolutional neural network models as the ones to be attacked. ResNet [9], a kind of deep residual network, was invented by adding residual learning onto a basic convolutional neural network. Unlike former networks struggling with decreasing performance caused by an increasing number of layers, this work settles this problem and achieves surprising success; it won the ILSVRC in 2015. In this paper, we use different versions of ResNet as attacked models. Inception [10], another model concentrating on image classification, has several versions (four until now) and uses the basic mechanism of convolutional neural networks, also including convolutional, pooling, and fully connected layers as normal. The differences between versions concern changing the filter sizes and adding BN layers from v2 to v3. Similar to ResNet, the Inception series of networks also achieved high ranks in the ImageNet large-scale visual recognition challenge in 2015. The four versions of Inception until now are v1 [11], also named GoogLeNet, v2 [12], v3 [4], and v4 [13]. We also use different versions of this model as attacked models in this paper. In order to compare attack intensity with the number of images still unattacked, it is important for the data set to contain a variety of images, because simple images like handwritten digits can hardly sustain a high-intensity attack.
In order to analyze the characteristics existing in images, an ideal data set should include various objects and colors. ImageNet [15], an image database project, is designed for object recognition and classification. It includes more than 14 million images, all hand-annotated, covering more than 20,000 categories, such as "fish," "balloon," and so on. Since 2010, a large contest called the ImageNet large-scale visual recognition challenge (ILSVRC) has been held to encourage researchers to improve the accuracy of image classification and object recognition. In this paper, we choose ImageNet as the image set to be attacked because it has images close to normal life, including both images with fewer than two colors, which may be hard to attack, and colorful images, from which it seems easy to generate adversarial examples. In order to figure out the efficiency of adversarial attacks, we collect the number of images remaining after attacks of a certain level. To be more specific, before attacking images, it is necessary to remove the examples that are not correctly classified by the deep learning models, because applying attacks on wrongly classified images wastes much memory and time. Then, we apply the least acceptable level of attack, in case it is so slight that a successful attack would require a huge increase in strength. After applying the most acceptably slight attacks on the original examples, we get rid of the images
which are wrongly classified when the attacked examples are fed a second time. Next, we repeat the above process, increasing the attack strength by a small EPS step until all the examples are successfully attacked and none of the attacked or unattacked examples is correctly classified. The number of remaining examples in each loop is recorded. Besides, to maximize model accuracy, all input examples are cropped according to each model's input specification, and all attacks are based on that fixed size.

3.2 L∞ PGD Attacks
In this part, we use PGD attacks with the L∞ constraint and set the attack level, represented by EPS, to 1/255, with the corresponding number of iterations. Second, we choose up to five models to be attacked. According to the paper [1], that work adds FGSM perturbation to a panda image and successfully causes the models to classify it wrongly. So we also choose the panda image in ImageNet and gradually increase the strength of the PGD L∞ attack; setting eps to 1/255 is acceptable because this level can mislead the models under PGD attack while the original images are not over-perturbed.
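The strength-sweep procedure described above can be sketched as a loop; `predict` and `attack` are hypothetical stand-ins for a real classifier and a real PGD implementation, and the toy example below is purely illustrative.

```python
def remaining_curve(x_set, y_set, predict, attack, eps_levels):
    """Sketch of the evaluation loop: first drop examples the model
    already misclassifies, then raise the attack strength step by step
    (e.g. multiples of 1/255) and record how many examples are still
    correctly classified after each level."""
    keep = [(x, y) for x, y in zip(x_set, y_set) if predict(x) == y]
    curve = []
    for eps in eps_levels:
        keep = [(x, y) for x, y in keep if predict(attack(x, y, eps)) == y]
        curve.append(len(keep))
    return curve

# Toy stand-ins: a 1-D threshold "classifier" and an attack that pushes
# the input toward the decision boundary.
def toy_predict(x):
    return int(x > 0)

def toy_attack(x, y, eps):
    return x - eps if y == 1 else x + eps

curve = remaining_curve([0.5, 1.5, 2.5, -1.0], [1, 1, 1, 1],
                        toy_predict, toy_attack, [1.0, 2.0])
```

The misclassified example (-1.0) is dropped up front; the curve then records 2 survivors at strength 1.0 and 1 at strength 2.0.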
Fig. 2 Number of remaining adversarial examples with increasing attack strength; the first state represents the number of adversarial examples experiencing the least acceptable attack. The five lines each stand for one deep learning model: ResNet34, ResNet50, ResNet101, GoogLeNet, and Inception-V3
Figure 2 shows the number of remaining pictures as the attack strength increases. The y-axis expresses the ratio of remaining images to the whole data set. As to the x-axis, it is significant that the numbers on it do not exactly represent the EPS of the attack method. Because every pair of attack method and deep learning model has its own best suitable attack strength, it is helpful to set a new evaluation system, which
means constraining the suitable attack range to 1 to 255, where 1 marks the point at which examples have been attacked but are no longer correctly classified. Because 1 represents almost no attack, it is conspicuous that the polyline may experience a sharp decline and become smooth later. According to the figure, it is obvious that the polyline declines with increasing attack strength. Furthermore, an example from the ImageNet data set is more likely to sustain a high-strength perturbation if it has characteristics determined by the original image rather than by the models or attack methods. In other words, according to Fig. 2, which selects images that are attacked (excluding images without attack), the reason why certain images are hard to attack successfully is that they are born with some important features; this is reflected in Fig. 2 by the number of pictures decreasing quite slowly after the first attack.
Fig. 3 This figure is cropped from the former graphs and shows only the initial attack strengths. The five lines each stand for one deep learning model: ResNet34, ResNet50, ResNet101, GoogLeNet, and Inception-V3
In order to compare the difference among several models under the same attack method, we select the first 16 of the 255 strength levels, which amounts to magnifying Fig. 2. After the attack strength exceeds 100, the number of remaining images is quite small, so cutting off the later attack strengths helps to observe the polylines. We see from Fig. 3 that all the lines behave the same way, at first dropping sharply and then flattening. In other words, the magnitude of the derivative gradually decreases, so some of the images can resist for a while and stay unattacked longer.
3.3 L2 PGD Attacks
This part differs from the former one: instead of the L∞ constraint bound, this method uses the L2 norm bound. The least acceptable attack strength therefore becomes eps = 1, which was tested by attacking the panda image following [1].
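A sketch of a single L2-constrained PGD step (a normalized gradient step followed by projection onto the L2 ball) is shown below; this is a generic formulation under our own assumptions, not the authors' exact implementation.

```python
import numpy as np

def pgd_l2_step(x, x_adv, grad, alpha, eps):
    """One L2-constrained PGD step: move along the normalized gradient,
    then project the perturbation back onto the L2 ball of radius eps
    centered at the clean input x."""
    g_norm = np.linalg.norm(grad) + 1e-12       # avoid division by zero
    x_new = x_adv + alpha * grad / g_norm       # normalized gradient step
    delta = x_new - x
    d_norm = np.linalg.norm(delta)
    if d_norm > eps:                            # project onto the L2 ball
        delta = delta * (eps / d_norm)
    return x + delta

x0 = np.zeros(3)
step = pgd_l2_step(x0, x0, np.array([3.0, 4.0, 0.0]), alpha=10.0, eps=1.0)
```

An oversized step is clipped back to the eps-radius sphere, which is the key difference from the sign-based L∞ update.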
Fig. 4 We cut off the original examples that are wrongly classified by the models. The five lines each stand for one deep learning model: ResNet34, ResNet50, ResNet101, ResNet152, and GoogLeNet
It is obvious that all the lines in Fig. 4 are decreasing, and we can also find that they do not drop sharply except at the first attack. Compared with the ResNet deep learning models, the number of examples processed by GoogLeNet tends to be smaller than with ResNet. The main possible reason is that GoogLeNet came out earlier, so even slight attacks may cause the ratio of correctly classified examples to drop sharply. After removing the original examples that are wrongly classified, we get a completely different line compared with the former graph, according to Fig. 4. The lines go down slowly at first but gradually become steep. Therefore, the L2 attack method is commonly hard to apply successfully; only large perturbations can generate adversarial examples. In other words, the magnitude of the derivative of the curve gradually increases. Second, we find that ResNet50 is harder to attack than ResNet101, with more remaining examples under the same level of attack. What we can get from Fig. 5 is that images tend to stay unattacked under the L2 PGD attack method at first, because the curves are almost flat except for GoogLeNet. Besides, the ResNet series tends to show similar shapes compared with GoogLeNet, and they are all hard to attack under low-level attacks.
4 Defensive Examples
By applying different attack methods on images from ImageNet with various deep learning models, we not only figure out the relationship between the
Fig. 5 This graph keeps only the first 16 attack strengths. The five lines each stand for one deep learning model: ResNet34, ResNet50, ResNet101, ResNet152, and GoogLeNet
number of adversarial examples and attack strength, but also the reason some images are hard to generate adversarial examples from. In particular, we define "defensive examples" as images that are hard to attack and are still classified correctly even under a big perturbation. Furthermore, this is quite different from former work on adversarial training [14], which feeds adversarial examples to models in order to produce more robust models; what we try to figure out is which characteristics of the original images make them hard to generate adversarial examples from. In other words, it is about the examples themselves rather than external methods. We analyze the examples that are hard to attack and count the remaining unattacked examples so as to summarize some of the determining factors behind their defensiveness.
Fig. 6 These three images are selected from LinfPGD attacking ResNet101 with least strength attacks
Background is one of the reasons for defensive examples. After viewing the images that are wrongly classified after only the least acceptable attack, we
find that they are all photographed outdoors with colorful backgrounds such as parks, markets, the seabed, and so on. According to Fig. 6, it is obvious that these photographs all have colorful backgrounds, meaning there are various objects with different colors behind the main subjects. On the other hand, the defensive examples, for instance those that need 16 or 18 attack levels, mainly have monotonous backgrounds. Apart from the main object, which is the key to classification, what is behind it is only a one-color wall or plain black, according to Fig. 7.
Fig. 7 These three images are selected from LinfPGD attacking ResNet101 with 16 or 17 strength levels
Color content is also a main factor in whether images are defensive examples. We found that the fewer colors an image contains, the harder it is to attack. According to Fig. 8, the images in the first line are defensive examples, and they are almost composed of two or three colors, while the images in the second line are composed of more than five kinds of colors. The reason the latter become so easy to attack may be that the main mechanism of deep learning models is extracting color and spatial features, so little perturbations may accumulate to cause huge changes, which can misdirect deep learning models. However, the perturbation on images with one or two colors may be too small to change the decision made by the models. As for high contrast, we found that images both having a monotonous background and composed of few colors are harder to attack than examples with only one of these features. As shown in Fig. 9, they carry the key information and can be correctly classified by the models, so attackers need to add a sufficiently high perturbation to change the key information before the images are wrongly classified. Together with the two main reasons causing examples to be defensive, we can conclude that examples with prominent visual attention are not easy to attack.
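A purely hypothetical way to quantify the "number of colors" observation above is to quantize each RGB channel into a few bins and count the distinct quantized colors; this metric is our own construction for illustration, not something the paper defines.

```python
import numpy as np

def distinct_color_count(img, bins=4):
    """Hypothetical color-richness metric: quantize each RGB channel into
    `bins` levels and count how many distinct quantized colors appear.
    Not from the paper; only an illustration of the observation above."""
    q = np.clip((img.astype(np.int64) * bins) // 256, 0, bins - 1)
    codes = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    return len(np.unique(codes))

flat = np.zeros((8, 8, 3), dtype=np.uint8)       # monochrome patch
rich = np.random.default_rng(1).integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
```

Under this metric a monochrome patch scores 1, while a randomly colored patch scores far higher, matching the qualitative split between the two lines of Fig. 8.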
Fig. 8 First line lists the examples which are defensive examples, while the second line lists the examples which are easy to be attacked. All these images are selected from LinfPGD attacking ResNet101
Fig. 9 These defensive examples can withstand over 60 times the strength of the first attack. All these images are selected from LinfPGD attacking ResNet101
5 Conclusions
In this paper, we examine whether examples themselves are the determining reason for being hard to attack. We analyze the defensive examples and conclude that not only the models but also the examples themselves are responsible for attack difficulty. First, we draw a line chart showing the relationship between the attack strength and the remaining examples that have not been successfully attacked. About ten pairs of attack methods and deep learning models are chosen, which show quite different curves: the L∞ PGD declines fast at
first and slowly later, while the L2 PGD is the opposite. Later, we identify the determining factors that lead to defensive examples, and we conclude that examples with prominent visual attention are not easy to attack; examples with clean or pure backgrounds also tend to be hard to attack.
References
1. Goodfellow IJ, Shlens J, Szegedy C (2014) Explaining and harnessing adversarial examples. arXiv preprint arXiv:1412.6572
2. Madry A, Makelov A, Schmidt L et al (2017) Towards deep learning models resistant to adversarial attacks. arXiv preprint arXiv:1706.06083
3. Papernot N, McDaniel P, Jha S et al (2016) The limitations of deep learning in adversarial settings. In: 2016 IEEE European symposium on security and privacy (EuroS&P). IEEE, New York, pp 372–387
4. Szegedy C, Vanhoucke V, Ioffe S et al (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826
5. Moosavi-Dezfooli SM, Fawzi A, Frossard P (2016) DeepFool: a simple and accurate method to fool deep neural networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2574–2582
6. Carlini N, Wagner D (2017) Towards evaluating the robustness of neural networks. In: 2017 IEEE symposium on security and privacy (SP). IEEE, New York, pp 39–57
7. Dong Y, Liao F, Pang T et al (2018) Boosting adversarial attacks with momentum. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 9185–9193
8. LeCun Y, Boser B, Denker JS et al (1989) Backpropagation applied to handwritten zip code recognition. Neural Comput 1(4):541–551
9. He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
10. Deng J, Dong W, Socher R et al (2009) ImageNet: a large-scale hierarchical image database. In: 2009 IEEE conference on computer vision and pattern recognition. IEEE, New York, pp 248–255
11. Szegedy C, Liu W, Jia Y et al (2015) Going deeper with convolutions. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1–9
12. Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167
13. Szegedy C, Ioffe S, Vanhoucke V et al (2017) Inception-v4, Inception-ResNet and the impact of residual connections on learning. In: Thirty-first AAAI conference on artificial intelligence
14. Kurakin A, Boneh D, Tramèr F et al (2018) Ensemble adversarial training: attacks and defenses
15. Russakovsky O, Deng J, Su H et al (2015) ImageNet large scale visual recognition challenge. Int J Comput Vis 115(3):211–252
16. Liu Y, Chen X, Liu C et al (2016) Delving into transferable adversarial examples and black-box attacks. arXiv preprint arXiv:1611.02770
17. Athalye A, Carlini N, Wagner D (2018) Obfuscated gradients give a false sense of security: circumventing defenses to adversarial examples. arXiv preprint arXiv:1802.00420
18. Dong Y, Fu QA, Yang X et al (2019) Benchmarking adversarial robustness. arXiv preprint arXiv:1912.11852
19. Liao F, Liang M, Dong Y et al (2018) Defense against adversarial attacks using high-level representation guided denoiser. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 1778–1787
20. Szegedy C, Zaremba W, Sutskever I et al (2013) Intriguing properties of neural networks. arXiv preprint arXiv:1312.6199
Positioning Parameter Estimation Based on Reconstructed Channel Statement Information Xiaolong Wang, Xiaolong Yang(B) , Mu Zhou, and Wei He School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, Chongqing 400065, China [email protected]
Abstract. Channel state information (CSI) is widely used in wireless communication systems, but its application has been limited by the huge feedback overhead between the transmitter and the receiver. In this paper, compressed CSI is used to estimate positioning parameters such as angle of arrival (AoA) and time of flight (ToF), and the experimental results show that the estimation accuracy of the original CSI can be achieved while reducing the CSI feedback overhead in Wi-Fi systems. Keywords: Wi-Fi · CSI feedback · Compression · AoA · ToF
1 Introduction In Wi-Fi, the channel state information (CSI) of an antenna pair is a complex value representing the channel coefficient of an OFDM subcarrier. CSI is widely used to realize wireless sensing and positioning [1]. However, in many cases, the CSI feedback overhead is so large that it even exceeds the effective data package size of the actual transmission, resulting in a waste of resources [2, 3]. In this paper, the angle of arrival (AoA) and time of flight (ToF) are estimated [4] based on the compressed CSI, and the estimation accuracy is given. The rest of the paper is organized as follows. We first introduce the compression algorithm of CSI [5]. Then, the experimental method and data are provided. Finally, the conclusion is drawn.
2 Compression Principle The target sinusoid can be approximated as the linear combination of multiple sinusoids within a certain error:

\sin(gx) = \sum_{k=0}^{P-1} \gamma_k \sin(f_k x) + \xi    (1)

where g and f_k are the frequencies of the sinusoids, \gamma_k is the amplitude of sinusoid k, P is the number of base sinusoids, and \xi is the approximation error. The coefficients of the base sinusoids are obtained from the Taylor series expansion. More specifically, to obtain \gamma_k, the two sides of Eq. (1) are expanded:

\sum_{l=0}^{P-1} \eta_l (gx)^l = \sum_{k=0}^{P-1} \gamma_k \sum_{l=0}^{P-1} \eta_l (f_k x)^l    (2)

where \eta_l are constants determined by l. According to Eq. (2), we have:

\sum_{k=0}^{P-1} \gamma_k f_k^{l} = g^{l}, \quad 0 \le l \le P-1    (3)

As long as f_0, f_1, \ldots, f_{P-1} are distinct, the above P equations form a Vandermonde system, and the solution always exists:

\gamma_k = \prod_{n=0,\, n \ne k}^{P-1} \frac{f_n - g}{f_n - f_k}    (4)

Obviously, we can extend it to the complex field:

\sum_{k=0}^{P-1} \gamma_k e^{i f_k x} = \sum_{k=0}^{P-1} \gamma_k \cos(f_k x) + i \sum_{k=0}^{P-1} \gamma_k \sin(f_k x)    (5)
Because the time-domain characteristic of the CSI vector is a linear superposition of multiple sinusoids [6], the CSI vector can also be approximated as a linear combination of sinusoids with fixed frequencies, which undoubtedly helps us compress the CSI vector and reduce the complexity of the algorithm.
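The coefficients of Eq. (4) can be computed and checked directly: they are the Lagrange basis polynomials of the nodes f_k evaluated at g, which is exactly why Eq. (3) holds for all powers 0 to P-1. The frequency values below are illustrative assumptions.

```python
import numpy as np

def base_coefficients(g, f):
    """Eq. (4): gamma_k = prod_{n != k} (f_n - g) / (f_n - f_k).
    These are Lagrange basis polynomials of the nodes f evaluated at g."""
    P = len(f)
    gamma = np.empty(P)
    for k in range(P):
        n = np.delete(np.arange(P), k)                  # all indices except k
        gamma[k] = np.prod((f[n] - g) / (f[n] - f[k]))
    return gamma

f = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6])            # distinct base frequencies
g = 0.33                                                # target frequency
gamma = base_coefficients(g, f)

# Approximate sin(gx) by the linear combination of base sinusoids, Eq. (1).
x = np.linspace(0.0, 2.0, 200)
approx = sum(c * np.sin(fk * x) for c, fk in zip(gamma, f))
```

Near the origin the approximation error is tiny, since the first P Taylor terms are matched exactly.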
3 Data Compression and Reconstruction The coefficient solution is described by \Gamma = [\gamma_0\ \gamma_1\ \cdots\ \gamma_{P-1}]^T. We can find \Gamma by minimizing

J = \sum_{j=1}^{N} \left| \sum_{k=0}^{P-1} \gamma_k e^{i j f_k} - c_j \right|^2    (6)

where P is the number of base sinusoids, \gamma_k and f_k are the coefficient and frequency of base sinusoid k, respectively, c_j is element j of the CSI vector, and N is the length of the CSI vector. By taking the partial derivative with respect to the coefficients, we obtain a linear system Q_{P \times P} \Gamma_{P \times 1} = S_{P \times 1}, where q_{m,n} = \sum_{j=1}^{N} e^{i (f_n - f_m) j} and s_m = \sum_{j=1}^{N} c_j e^{-i f_m j}, and \Gamma can be calculated by Q^{-1} S.
4 Experiment In this part, we conduct experiments to explore the influence of the number of base sinusoids on the compression performance of CSI based on the above algorithm. Firstly, we use an Intel 5300 NIC to collect CSI data and estimate the positioning parameters; secondly, we compress the CSI with different numbers of base sinusoids and use the compressed CSI to estimate the positioning parameters again; finally, we compare the accuracy of the two parameter estimates. Figure 1a–c show the results of amplitude curve fitting with frequencies {0, 0.18, 0.24}, {0, 0.02, 0.05, 0.09, 0.12, 0.15, 0.18, 0.21, 0.24}, and {0, 0.02, 0.05, 0.07, 0.09, 0.12, 0.13, 0.15, 0.18, 0.21, 0.23, 0.24}, respectively. It can be seen that the fitting residual of the curve decreases as the number of base sinusoids increases.
Fig. 1 Amplitude curve fitting: a P = 6, b P = 9, c P = 12
Figure 2 plots the cumulative distribution function (CDF) of the fitting residuals over 597 groups of CSI data. As shown, the fitting residual of the algorithm is very small, with a median of 0.0268 when P = 12. Table 1 reports the average error of parameter estimation for different numbers of base sinusoids, where N/P is the compression ratio.
5 Conclusion

With a large number of base sinusoids, the algorithm can effectively restore the CSI, and the compressed CSI can still accurately estimate the positioning parameters; with a small number of base sinusoids, the compression algorithm has a greater impact on the accuracy of ToF estimation and a smaller impact on the accuracy of AoA estimation.
858
X. Wang et al.
Fig. 2 Fit residual with experimental data

Table 1 Experimental result of positioning parameter estimation

                    P = 3    P = 6    P = 9    P = 12
AoA (degree)        3.875    1.84     1.785    1.065
ToF (nanosecond)    11.75    3.105    2.99     2.165
N/P                 10       5        3.33     2.5
Acknowledgements. This research was supported in part by the National Natural Science Foundation of China (61771083, 61704015), Science and Technology Research Project of Chongqing Education Commission (KJQN201800625) and Chongqing Natural Science Foundation Project (cstc2019jcyj-msxmX0635).
References

1. Yang Z, Zhou Z, Liu Y (2013) From RSSI to CSI: indoor localization via channel response. ACM Comput Surv 46(2):1–32
2. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel internet of things based on dynamic spectrum sharing in 5G communication. IEEE IoT J 6(4):5962–5970
3. Liu X, Zhang X (2020) NOMA-based resource allocation for cluster-based cognitive industrial internet of things. IEEE Trans Indus Inf 16(8):5379–5388
4. Kotaru M, Joshi K, Bharadia D, Katti S (2015) SpotFi: decimeter level localization using WiFi. ACM SIGCOMM Comp Commun Rev 45(4):269–282
5. Mukherjee A, Zhang Z (2017) Fast compression of OFDM channel state information with constant frequency sinusoidal approximation. EURASIP J Wirel Commun Netw
6. Wang X, Wicker SB (2013) Channel estimation and feedback with continuous time domain parameters. In: Proceedings of IEEE GLOBECOM, pp 4306–4312
Three-Dimensional Parameter Estimation Algorithm Based on CSI

Yuan She, Xiaolong Yang(B), Mu Zhou, and Wei Nie

School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
[email protected]
Abstract. Considering that the estimation accuracy of angle of arrival (AoA) and time of flight (ToF) is limited by the number of antennas and the channel bandwidth of commercial Wi-Fi, we propose a unified model that provides three-dimensional parameter estimation based on AoA, ToF, and Doppler frequency shift (DFS). Our proposed algorithm analyzes the channel state information (CSI) data, constructs a three-dimensional matrix containing AoA, ToF, and DFS information, and then reduces the dimensionality of the constructed three-dimensional matrix. Finally, the spatial spectrum of the signal is estimated after dimensionality reduction. Simulation results show that the parameter estimation accuracy and the signal resolution of the proposed algorithm are better than those of existing two-dimensional joint estimation methods.

Keywords: Channel state information · Angle of arrival · Time of flight · Doppler frequency shift
1 Introduction

In recent years, wireless local area networks (WLANs) have developed rapidly and gradually become an indispensable part of people's daily life. Wi-Fi-based WLAN technology has been widely used in places where people gather, such as homes, shopping malls, and airports. Wi-Fi can be used not only for data communication but also for environmental perception, inferring changes in the surrounding environment [1–3]. By analyzing the channel state information (CSI) of Wi-Fi, we can estimate the parameters of signal transmission through the channel and detect the surrounding environment. Existing methods for signal parameter estimation include maximum likelihood estimation, the multiple signal classification (MUSIC) algorithm [4, 5], and so on. Yin Qinye proposed the direction-of-arrival matrix method [6–8] to improve the MUSIC algorithm. However,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_114
the method only estimates AoA, so it cannot distinguish two signals with the same direction of arrival. To improve the signal resolution of parameter estimation and distinguish multiple signals at the same time, J. Pan proposed a two-dimensional parameter estimation method combining DFS and AoA [9–12], and in reference [13], AoA and ToF are combined for two-dimensional parameter estimation. These methods improve the effective aperture of the array and the signal resolution. However, two-dimensional parameter estimation methods cannot distinguish signals with similar parameters. Aiming at these problems, this paper proposes a three-dimensional joint parameter estimation method based on AoA, ToF, and DFS. Simulations verify that the algorithm improves the accuracy of parameter estimation and the signal resolution, providing a theoretical basis for the wide application of Wi-Fi.
2 Three-Dimensional Parameter Estimation Algorithm

The reference sensors in the uniform circular array (UCA) are illustrated in Fig. 1.
Fig. 1 Linear antenna array
Suppose that the array antenna at the transmitter is a linear array in which M antenna elements are arranged with the same spacing d, the number of subcarriers is N, and the number of packets is P. There are K (M > K) signals with the same center frequency; the AoAs of the signals are θ_1, θ_2, …, θ_K, the ToFs of the signals are τ_1, τ_2, …, τ_K, and the DFSs of the signals are v_1, v_2, …, v_K. The received signal of the antenna array shown in Fig. 1 is three-dimensional data containing the AoA, ToF, and DFS parameters. For the kth signal, with the subcarrier and packet indices fixed, the wave path difference generated at the mth antenna relative to the first antenna is (m − 1) × d × sin(θ_k), so the phase shift introduced at the mth antenna is −2π(m − 1)d sin(θ_k)f_n/c, where c is the speed of light and f_n is the frequency of the nth subcarrier. Here f_n = (n − 1)Δf + f_0, where Δf is the frequency spacing between two consecutive subcarriers and f_0 is the frequency of the transmitted signal. For simplicity of representation, we denote the phase shift as a function of the AoA,

$$\Omega(\theta_k) = e^{-j2\pi(m-1)d\sin(\theta_k)f_n/c} \qquad (1)$$
Similarly, with the antenna and packet indices fixed, the phase difference generated at the nth subcarrier relative to the first subcarrier is −j2π(n − 1)Δf τ_k, where τ_k is the ToF of the kth signal. The phase shift as a function of the ToF is

$$\Omega(\tau_k) = e^{-j2\pi(n-1)\Delta f \tau_k} \qquad (2)$$
For the Doppler information contained in the received packets, with the antenna and subcarrier indices fixed, the phase difference between the CSI data in the first packet and the CSI data in the pth packet is −j2π f_n v_k (p − 1) t_s / c, where v_k is the moving speed of the kth signal, t_s is the packet interval, and p is the packet number. The phase shift as a function of the DFS is

$$\Omega(v_k) = e^{-j2\pi f_n v_k (p-1) t_s / c} \qquad (3)$$
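The three phase factors of Eqs. (1)–(3) multiply into a single steering term. The sketch below is our illustration (the function name and argument order are ours, not from the paper):

```python
import numpy as np

C_LIGHT = 3e8  # speed of light (m/s)

def steering_term(m, n, p, theta, tau, v, d, delta_f, f0, ts):
    """Product of the AoA, ToF, and DFS phase shifts of Eqs. (1)-(3)
    for antenna m, subcarrier n, and packet p (all 1-indexed)."""
    fn = (n - 1) * delta_f + f0  # frequency of the nth subcarrier
    om_theta = np.exp(-2j * np.pi * (m - 1) * d * np.sin(theta) * fn / C_LIGHT)  # Eq. (1)
    om_tau = np.exp(-2j * np.pi * (n - 1) * delta_f * tau)                        # Eq. (2)
    om_dfs = np.exp(-2j * np.pi * fn * v * (p - 1) * ts / C_LIGHT)                # Eq. (3)
    return om_theta * om_tau * om_dfs
```

At the reference element (m = n = p = 1) every phase offset vanishes and the term reduces to 1; elsewhere it is a pure phase factor of unit magnitude.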
The resulting vector of received signals can then be written as

$$a(p, m, n) = \sum_{k=1}^{K} a_k(1, 1, 1)\,\Omega(\theta_k)\,\Omega(\tau_k)\,\Omega(v_k) \qquad (4)$$
Based on the above analysis, the CSI data include the three parameters AoA, ToF, and DFS. The signal subspace and noise subspace are then obtained by eigendecomposition of the received signal. By constructing a noise eigenvector matrix from the eigenvectors of the noise subspace, the spectral function can be evaluated to obtain the parameter estimates:

$$f(\theta, \tau, v) = \frac{1}{\rho^H(\theta, \tau, v)\, E_N E_N^H\, \rho(\theta, \tau, v)} \qquad (5)$$
where ρ(θ, τ, v) is the direction vector with the same array manifold as the column vectors of the direction matrix of the received signal, (θ, τ, v) are the parameters to be estimated, and H denotes the conjugate transpose. By varying the parameter values in the spectral function f(θ, τ, v) and searching for the peaks of the spectral function, the corresponding parameter estimates are obtained.
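Evaluating Eq. (5) on a parameter grid amounts to projecting each candidate direction vector onto the noise subspace. A minimal sketch (names and the small regularizer are ours):

```python
import numpy as np

def spectrum_value(rho, EN, eps=1e-12):
    """Eq. (5): f(θ, τ, v) = 1 / (ρ^H E_N E_N^H ρ).

    rho: candidate direction vector for one (θ, τ, v) grid point.
    EN:  matrix whose columns are the noise-subspace eigenvectors.
    A direction vector orthogonal to the noise subspace gives a peak;
    eps avoids division by zero at exact orthogonality.
    """
    proj = EN.conj().T @ rho
    return 1.0 / (np.real(np.vdot(proj, proj)) + eps)
```

Sweeping ρ over a (θ, τ, v) grid and locating the peaks of `spectrum_value` yields the three-dimensional estimates.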
3 Performance Analysis

This section reports the results of simulation experiments evaluating the three-dimensional joint parameter estimation algorithm. The simulations are based on an OFDM system with 30 subcarriers, B = 40 MHz, f_0 = 5.7 GHz, and 4 packets. The sending rate is 500 packets per second, and the SNR is 10 dB. The UCA has M = 3 sensors, and the spacing between the sensors is half a wavelength. Firstly, assume that the number of coherent multipath components is three; their delays are 20 ns, 10 ns, and 10 ns, their angles of arrival are 45°, 30°, and 60°, and their Doppler velocities are 1 m/s, 2 m/s, and 0 m/s, respectively. The simulation results are shown in Fig. 2. Figure 2a, b show the results of three-dimensional and two-dimensional parameter estimation, respectively. The figures demonstrate that the accuracy of three-dimensional parameter estimation is higher than that of two-dimensional parameter estimation.
(a) Three-dimensional parameter estimation
(b) Two-dimensional parameter estimation
Fig. 2 Comparison of two-dimensional and three-dimensional parameter estimations
Secondly, the signal resolution of three-dimensional parameter estimation is higher than that of two-dimensional parameter estimation. When the AoA and ToF of multiple signals are similar, two-dimensional parameter estimation cannot distinguish the signals. However, three-dimensional parameter estimation can separate multiple signals through Doppler information, so the resolution is improved. Assume that there are three signals; their delays are 20 ns, 18 ns, and 10 ns, their angles of arrival are 30°, 28°, and 60°, and their Doppler velocities are 2 m/s, 1 m/s, and 2 m/s, respectively. The results are shown in Fig. 3.
Fig. 3 Comparison of two algorithms with similar parameters
As shown in Fig. 3a, when the AoA and ToF of two signals are similar, the two-dimensional parameter estimation algorithm cannot separate the signals. But for the three-dimensional parameter estimation algorithm, as shown in Fig. 3b, the two signals can be separated by their different DFS. Therefore, three-dimensional parameter estimation can separate signals which have similar AoA and ToF, and its resolution is higher than that of two-dimensional parameter estimation.
4 Conclusion

In this paper, we have proposed a three-dimensional parameter estimation algorithm, which adds Doppler information to solve the problem that two-dimensional parameter estimation cannot distinguish multiple signals with similar AoA and ToF, and improves the accuracy of parameter estimation and the signal resolution.
References

1. Liu X, Jia M, Zhang X, Lu W (2019) A novel multichannel Internet of things based on dynamic spectrum sharing in 5G communication. IEEE IoT J 6(4):5962–5970
2. Liu X, Zhang X (2019) NOMA-based resource allocation for cluster-based cognitive industrial Internet of Things. IEEE Trans Indus Inf 16(8):5379–5388
3. Hai Z, Fu X, Lijuan S et al (2016) CSI-based WiFi environment sensing. J Nanjing Univ Posts Telecommun 36(1):94–103
4. Schmidt R, Schmidt RO (1986) Multiple emitter location and signal parameters estimation. IEEE Trans Antenn Propag 34(3):276–280
5. Yonglaing W et al (2004) Theory and algorithm of spatial spectrum estimation. Tsinghua University Press, Tsinghua
6. Qinye Y, Lihe Z, Robert WN (1991) A high resolution approach to 2-D signal parameter estimation-DOA matrix method. J Commun 12(4):1–7
7. Shu W, Xilang Z (1999) Direction of arrival and frequency estimation in array signal processing. J Shanghai Jiaotong Univ 33(1):40–42
8. Yougen X, Zhiwen L (2001) A new method for simulation estimation of frequency and DOA of emitters. Electron J 29(9):1179–1182
9. Pan J, Zhou C, Liu B et al (2016) Joint DOA and Doppler frequency estimation for coprime arrays and samplers based on continuous compressed sensing. In: 2016 CIE international conference on radar (RADAR), Guangzhou, pp 1–5
10. Xiangdong H, Hongyu X, Ziyang Y et al (2016) Joint estimation of frequency and DOA with spatio-temporal under sampling. J Commun 37(05):21–28
11. Deng B, Sun ZB, Peng HF et al (2016) Source localization using TDOA/FDOA/DFS measurements with erroneous sensor positions. In: 2016 CIE international conference on radar (RADAR), Guangzhou, pp 1–4
12. Yang D, Wang T, Sun Y et al (2018) Doppler shift measurement using complex-valued CSI of Wi-Fi in corridors. In: 2018 3rd international conference on computer and communication systems (ICCCS), Nagoya, pp 367–371
13. Chen H, Hu B, Zheng L et al (2018) An accurate AoA estimation approach for indoor localization using commodity Wi-Fi devices. In: 2018 IEEE international conference on signal processing, communications and computing (ICSPCC), Qingdao, pp 1–5
Edge Cutting Analysis of Image Mapper for Snapshot Spectral Imager

Xiaoming Ding1,2, Yupeng Li1,3, Xiaocheng Wang1,3, and Cheng Wang1,2(B)

1 Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin Normal University, Tianjin 300387, China
[email protected]
2 Department of Artificial Intelligence, College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin 300387, China
3 Department of Communication Engineering, College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin 300387, China
Abstract. The image mapper used in the snapshot image mapping spectrometer (IMS) is a key optical element, which slices the input image into different parts and reflects each part of the image in a different direction. Edge cutting appears between adjacent mirrors when fabricating the image mapper using the diamond raster fly cutting technique. The edge cutting on the mirrors introduces a loss of light throughput of the system and causes stripe noise in the reconstructed datacube; therefore, this paper analyzes the effect of the edge cutting and optimizes the manufacture of the image mapper.

Keywords: Edge cutting · Image mapper · Snapshot spectral imager
1 Introduction

The snapshot image mapping spectrometer can acquire the datacube (two-dimensional spatial information and one-dimensional spectral information) of the target simultaneously in a single exposure [1]. The image mapper used in the IMS usually contains hundreds of strip mirrors with different tilt angles and is the key element for the transform from the 3D datacube to 2D spatial–spectral aliased data. The mirrors in the image mapper slice and reflect the input image in different directions [2]. The edge cutting on the mirror facets is inevitable when manufacturing the image mapper because of the different tilt angles between adjacent mirrors. Edge cutting reduces the light throughput of the system and enhances the stripe noise in the reconstructed spectral images [3]. Therefore, this paper analyzes the effects of edge cutting and optimizes the design of the image mapper.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_115
2 Edge Cutting of the Image Mapper

The image mapper contains hundreds of strip mirrors with different tilt angles; the manufacture schematic diagram is shown in Fig. 1.
Fig. 1 Manufacture schematic diagram of image mapper
The image mapper is made from aluminum alloys. Each mirror facet is fabricated using the diamond raster fly cutting technique. The manufacturing approach diagram is shown in Fig. 2. Different tilt angles of the mirror facets are obtained as the diamond cutter rotates around the x-axis and y-axis of the workpiece [4]. The mirror facets with different tilt angles are arranged periodically on the image mapper. Adjacent mirrors have a height difference because of their different tilt angles, as shown in Fig. 3. Since the diamond cutter must have a wedge angle φ, part of the edge of each mirror will inevitably be cut off; see Fig. 3b. The section marked in red is the edge cutting.
3 Mathematical Model of Edge Cutting

The edge cutting of the mirrors seriously affects the reflective area of the mirror facet, which reduces the reflected luminous flux, makes the light intensity distribution uneven
Fig. 2 Manufacturing approach diagram of image mapper
Fig. 3 Edge cutting of image mapper, a the height difference of the adjacent mirrors, b the diagram of edge cutting on image mapper
and reduces the signal-to-noise ratio of the reconstructed datacube. At the same time, edge cutting enhances the stripe phenomenon in the spectral images. Since the energy of the reflected light is directly related to the area of the mirror surface, the amount of luminous flux loss can be measured by calculating the edge cutting area. Since the height difference of adjacent mirrors differs at different positions along the mirrors, the edge cutting area of the same mirror varies, as shown in Fig. 4. The length of each mirror surface of the image mapper is equal, so the maximum edge cutting in the y-direction of the mirror facet can be used as the expression of the edge cutting of this mirror. For mirror m in cycle n of the image mapper, the edge cutting is expressed as

$$w_m^n = h_m^n \tan\!\left(\frac{\varphi}{2} - \beta_{m+1,n}\right) = \frac{l}{2}\tan\!\left(\alpha_{m,n} - \alpha_{m+1,n}\right)\tan\!\left(\frac{\varphi}{2} - \beta_{m+1,n}\right) \qquad (1)$$

Fig. 4 Edge cutting area of mirror facet
where l is the length of the mirror and φ is the wedge angle of the diamond cutter. The edge-cutting ratio is the ratio of $w_m^n$ to the mirror width b,

$$\rho_m^n = \frac{w_m^n}{b} \qquad (2)$$
To analyze the loss of light throughput due to edge cutting, a computer simulation is conducted under different tilt angles (α_{m,n}, β_{m,n}). According to the area formula, the cut area and $w_m^n$ are linearly related. Since the light throughput is linearly related to the mirror area, the loss of light throughput and $\rho_m^n$ are positively correlated. Assume that the value range of α_{m,n} − α_{m+1,n} is [0, 10°] and that of β_{m+1,n} is about [−5°, 5°]. The wedge angle φ equals 30°, the length of the mirror is l = 8 mm, and the width of each mirror is b = 0.165 mm. The simulation result is shown in Fig. 5. As shown in Fig. 5, the difference in the tilt angles of adjacent mirrors, α_{m,n} − α_{m+1,n}, has the most significant effect on the edge cutting, that is, on the loss of light throughput. As α_{m,n} − α_{m+1,n} increases, the loss of light throughput increases; when α_{m,n} − α_{m+1,n} is zero, there is no light throughput loss; when α_{m,n} − α_{m+1,n} is fixed, the loss of light throughput is related to β_{m+1,n} of the adjacent mirror. In summary, reducing α_{m,n} − α_{m+1,n} and φ/2 − β_{m+1,n} can reduce the edge cutting of the mirror facet and decrease the loss of light throughput.
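Eqs. (1)–(2) are easy to tabulate. The sketch below uses the paper's parameter values (l = 8 mm, b = 0.165 mm, φ = 30°); the function name and interface are ours.

```python
import numpy as np

def edge_cutting_ratio(alpha_diff_deg, beta_deg, l=8.0, b=0.165, phi_deg=30.0):
    """Edge-cutting ratio ρ_m^n from Eqs. (1)-(2):
    w = (l/2) · tan(Δα) · tan(φ/2 − β),  ρ = w / b.
    Lengths in mm, angles in degrees."""
    d_alpha = np.deg2rad(alpha_diff_deg)  # tilt-angle difference of adjacent mirrors
    beta = np.deg2rad(beta_deg)
    phi = np.deg2rad(phi_deg)
    w = (l / 2.0) * np.tan(d_alpha) * np.tan(phi / 2.0 - beta)
    return w / b
```

The sketch reproduces the qualitative behavior described in the text: the ratio is zero when Δα = 0 and grows with both Δα and φ/2 − β.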
4 Design Optimization of Image Mapper

To minimize the edge cutting of the mirror facets, we should optimize the arrangement order of the tilt angles. According to Eq. (1), when the values of α_{m,n} − α_{m+1,n} and φ/2 − β_{m+1,n}
Fig. 5 Relation between edge cutting and tilt angles of mirror facet
are reduced, the edge cutting decreases. First, combine the mirrors with the same angle α and arrange them in order. In this way, the height difference between adjacent mirror facets with the same α angle is the smallest. After the order of the mirror facets is adjusted according to the angle α, the mirror facets in each cycle can be arranged in a different angle β order, so that the cross section of the mirror facets becomes a "concave arc," which can further reduce the edge cutting between cycles. The tool is inclined inward, which effectively reduces the protruding size of the tool edge, thereby reducing the edge cutting. To verify the effectiveness of the optimized image mapper, the edge cutting of each mirror is calculated according to Eq. (1). The length of the mirror is l = 8 mm and the mirror width is b = 0.165 mm. There are 23 two-dimensional tilt angles arranged in 3 periods, 69 mirror surfaces in total. The results are shown in Fig. 6. From the results in Fig. 6, it can be seen that the edge cutting of the mirror facets without optimization is significantly larger than that after optimization, and the edge cutting of some mirrors exceeds the width of the mirror surface, indicating that the edge cutting is so large that the mirror facet is completely removed. After optimization, the edge cutting is largest at the periodic junctions, reaching about 0.7, while that of the other mirror facets is mostly around 0–0.1, which indicates that the optimized mirror arrangement can significantly reduce the edge cutting, thereby improving data quality.
5 Conclusion

In this paper, we build a mathematical model of the edge cutting of the mirror facets on the image mapper. Based on the model, we optimize the arrangement of the mirror facets. The simulation results show that the optimization is effective in reducing the edge cutting.
Fig. 6 Calculation results of edge cutting of each mirror
Acknowledgements. This work is supported by the Doctoral Foundation of Tianjin Normal University (52XB2004, 52XB2005), the Natural Science Foundation of Tianjin (18JCQNJC70900) and the National Science Foundation of China (61901300, 61901301).
References

1. Pawlowski ME, Dwight JG, Nguyen TU, Tkaczyk TS (2019) High performance image mapping spectrometer (IMS) for snapshot hyperspectral imaging applications. Opt Express 27(2):1597
2. Pawlowski ME, Dwight JG, Nguyen TU, Tkaczyk T (2017) High speed image mapping spectrometer for biomedical applications. Opt Life Sci 2017(2):BoW4A.2
3. Gao L, Tkaczyk TS (2012) Correction of vignetting and distortion errors induced by two-axis light beam steering. Opt Eng 51(4):043203
4. Kester RT, Gao L, Tkaczyk TS (2010) Development of image mappers for hyperspectral biomedical imaging applications. Appl Opt 49(10):1886
Object Detection of Remote Sensing Image Based on Multi-level Domain Adaption

Peiran Kang, Xiaorui Ma(B), and Hongyu Wang

Dalian University of Technology, Dalian, China
[email protected], {maxr,whyu}@dlut.edu.cn
Abstract. Due to the limited amount of labeled data, remote sensing object detection faces great difficulties. The problem of insufficient labeled data can usually be alleviated by domain adaption. However, current methods mostly focus on feature alignment without paying attention to context information or discussing the level of features, which makes it impossible to apply them effectively to remote sensing images. In this paper, we construct our method on the Faster-RCNN model and design three domain adaptive components for remote sensing object detection at the image, instance, and pixel levels. Image-level alignment enhances global recognition ability by image weight redistribution. Instance-level alignment makes global awareness possible by incorporating context information. Pixel-level alignment reduces local differences between domains by focusing on small features and enhancing semantic information. Moreover, we collect a domain adaption dataset to verify the proposed method, and the experimental results show that our method is superior to other current methods.

Keywords: Remote sensing image · Object detection · Domain adaption
1 Introduction

Object detection in remote sensing images is a basic and challenging problem in the field of aerial and satellite image analysis and has attracted more and more attention in recent years. However, there are great limitations in this field. A good detector usually needs a large amount of labeled training data, while the labeled datasets of remote sensing images are very limited. Therefore, we apply unsupervised domain adaption (UDA) [1] to remote sensing object detection. The original unsupervised domain adaptive methods were implemented by matching the features of domains with statistical distribution differences. However, with the rise of GAN [2] networks, the extraction of domain-invariant features by means of adversarial training gradually became the mainstream [3–5]. As for remote sensing image object detection, current methods still have some limitations. Firstly, the level of features is not clear enough, and the alignment of features at each level is also worth
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_116
further exploration. Remote sensing images are different from ordinary images: their features need clear classification and targeted use so that the features of each level can be aligned to best exploit their advantages. Second, current methods do not pay enough attention to the use of context information, resulting in the loss of the overall information of the image, which is fatal to remote sensing object detection. Different from ordinary images, context information is more important for remote sensing images, as it can help the detector better understand the information conveyed by the whole image. Therefore, according to the characteristics of remote sensing images, we propose a feature alignment domain adaptation method with three levels: image-level, instance-level, and pixel-level. At the global image level, we redistribute the image weights so that samples with higher similarity have higher weight. For instance-level features, we combine them with context information to make perception of the whole image possible. Pixel-level alignment is mainly local feature alignment, focusing on small features to reduce local differences between domains.
2 Related Work

Object Detection. Early object detection problems were often described as sliding-window classification problems. The development of deep convolutional neural networks has revolutionized object detection, and its performance has been greatly improved. Faster-RCNN [6] is a widely used detector, which creatively proposed a two-stage detection model: it takes the region proposal network (RPN) as the first stage to generate rough object proposals and feeds the proposals and cropped features to the classification module in the second stage. In this article, we use Faster-RCNN as our base detector.

Domain Adaption. Nowadays, domain adaptation has been widely used in many areas of computer vision, such as image classification [7–11] and semantic segmentation [12, 13], and has achieved good results. A common approach is to match the feature distributions of the source and target domains. With the development of GAN networks, the idea of adversarial training has been applied to domain adaption, and domain confusion is realized by using a gradient reversal layer (GRL) for feature alignment [14, 15]. A ground-breaking domain adaptive Faster-RCNN was proposed by Chen et al. [16]. Based on this approach, many new ideas have been proposed, such as strong local and weak global feature alignment [17], and a hierarchical domain feature alignment module with a weighted GRL to reweight training samples [18].
3 Method

Our method improves on the baseline in three aspects: image-level, instance-level, and pixel-level. Image-level weight redistribution gives samples with higher similarity more weight. Instance-level features and context vectors are combined to support information interaction between the two. Pixel-level alignment targets low-level local features to reduce the domain gap (Fig. 1).
Fig. 1 An overview of our network architecture. T1 and T2 are feature extractors at different levels. D1 and D2 are domain classifiers
3.1 Image-Level Adaption with Re-Weighting Strategy

In past works, the weight of each image is always the same, which is unreasonable; it should be adjusted according to the cross-domain similarity. Specifically, a sample with high similarity is close to the source domain in the feature space and is difficult to classify, so its importance should be higher, and vice versa. Therefore, we propose to train a domain classifier that focuses on samples with high similarity and ignores samples with low similarity. The source domain dataset is represented as I_s, which includes the labeled source images x_s and the bounding boxes y_s for each image, while the target domain dataset I_t contains the unlabeled target images x_t. The probability of identifying a sample as belonging to the source domain is denoted p. We modify the cross-entropy loss so that the loss decreases as p increases:

$$g_i = -(1-p)\log p \qquad (1)$$
Therefore, we denote the loss as follows,

$$L_{im} = -\frac{1}{N_s}\sum_{i=1}^{N_s}\left(1 - D_2\big(T(x_i^s)\big)\right)\log D_2\big(T(x_i^s)\big) - \frac{1}{N_t}\sum_{i=1}^{N_t} D_2\big(T(x_i^t)\big)\log\left(1 - D_2\big(T(x_i^t)\big)\right) \qquad (2)$$
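Eq. (2) can be sketched directly in NumPy. This is illustrative only; `p_src` and `p_tgt` stand for the classifier outputs D2(T(x)) on source and target batches, and the function name is ours.

```python
import numpy as np

def image_level_loss(p_src, p_tgt):
    """Re-weighted image-level domain loss of Eq. (2).

    p_src: D2(T(x^s)) for source images, values in (0, 1).
    p_tgt: D2(T(x^t)) for target images, values in (0, 1).
    The (1 − p) and p factors down-weight easily classified samples."""
    loss_s = -np.mean((1.0 - p_src) * np.log(p_src))
    loss_t = -np.mean(p_tgt * np.log(1.0 - p_tgt))
    return loss_s + loss_t
```

Compared with plain cross-entropy, confidently classified samples contribute almost nothing, concentrating the gradient on the hard, similar-looking samples.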
3.2 Instance-Level Adaption with Context Vector Instance-level alignment is designed to reduce differences between local instances, but it is not perfect. Each instance-level feature is independent and context interaction information is ignored. The context vector aggregated from the lower level has a domain-invariant property that compensates for the domain-dependent properties of the instance-level features obtained from the higher level. So, we decided to fuse the two vectors.
Object Detection of Remote Sensing Image …
873
Specifically, we represent the context vectors of different layers as v_{c1}, v_{c2}, and use v_i to represent instance-level features. We use a non-linear fusion strategy instead of simple concatenation because concatenation ignores the intrinsic complementary relation between the two. The formulation is defined as follows,

$$v_f = [v_{c1}, v_{c2}] \otimes v_i \qquad (3)$$

where v_f represents the vector after fusion and ⊗ represents the tensor product. To solve the problem of dimensional explosion caused by this, we use randomized methods as an unbiased estimator of the tensor product. We use v_c to uniformly represent the context vector with dimension k_c and k_i to represent the dimension of the instance-level features, so the dimension k_f of v_f should be k_c × k_i. The formulation is as follows,

$$v_f = \frac{1}{\sqrt{k_f}}(M_1 v_c) \odot (M_2 v_i) \qquad (4)$$

where M_1 and M_2 are random matrices sampled from a uniform distribution and ⊙ is the Hadamard product. The final loss of this part is

$$L_{in} = -\frac{1}{N_s}\sum_{i,j}\log D_i\big(v_{f,s}^{i,j}\big) - \frac{1}{N_t}\sum_{i,j}\log\left(1 - D_i\big(v_{f,t}^{i,j}\big)\right) \qquad (5)$$
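The randomized fusion of Eq. (4) can be sketched as follows. This is our illustration (names are ours), and the 1/√k_f scale is our reading of the garbled original formula.

```python
import numpy as np

def fuse_randomized(v_c, v_i, k_f, rng=None):
    """Randomized tensor-product fusion of Eq. (4).

    Projects the context vector v_c and instance vector v_i with random
    matrices M1, M2 and combines them with a Hadamard product, avoiding
    the k_c x k_i dimensional explosion of the explicit tensor product.
    (The 1/sqrt(k_f) scale is an assumption, not confirmed by the source.)"""
    rng = np.random.default_rng(0) if rng is None else rng
    M1 = rng.uniform(-1.0, 1.0, (k_f, v_c.size))
    M2 = rng.uniform(-1.0, 1.0, (k_f, v_i.size))
    return (M1 @ v_c) * (M2 @ v_i) / np.sqrt(k_f)
```

The output dimension is fixed at k_f regardless of the input dimensions, which is the point of the randomized estimator.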
3.3 Pixel-Level Alignment

Local features such as color or texture are smaller than instance-level features, and their matching is also critical in domain adaption; we call this pixel-level alignment in this paper. Pixel-level alignment can narrow local differences between domains without affecting the final category discrimination. Represent the height and width of the feature map output from T_1 as H, W. The loss function is summarized as follows,

$$L_p = \frac{1}{N_s HW}\sum_{i=1}^{N_s}\sum_{h=1}^{H}\sum_{w=1}^{W}\log D_1\big(T_1(x_i^s)\big)_{hw}^2 + \frac{1}{N_t HW}\sum_{i=1}^{N_t}\sum_{h=1}^{H}\sum_{w=1}^{W}\log\big(1 - D_1(T_1(x_i^t))_{hw}\big)^2 \qquad (6)$$

where $D_1(T_1(x_i^s))_{hw}$ represents the output of D_1 at location (h, w).
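A NumPy sketch of the per-location average in Eq. (6) as reconstructed here (shapes and names are ours; `d_src`/`d_tgt` stand for the D1 output maps):

```python
import numpy as np

def pixel_level_loss(d_src, d_tgt):
    """Pixel-level alignment loss of Eq. (6), as reconstructed above.

    d_src, d_tgt: discriminator output maps of shape (N, H, W), values in (0, 1).
    Averages the squared-output log terms over all samples and locations."""
    term_s = np.mean(np.log(d_src ** 2))
    term_t = np.mean(np.log((1.0 - d_tgt) ** 2))
    return term_s + term_t
```

Under this reading, a discriminator that is confident and correct at every location (d_src near 1, d_tgt near 0) drives the value toward zero from below.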
3.4 Overall Loss

We use L_d to represent the loss of the detection module, which includes the classification loss and the localization loss. The overall objective function of our model is:

$$\max_{D}\min_{T}\; L_d - \lambda\left(L_{im} + L_{in} + L_p\right) \qquad (7)$$
where λ is the adjustment parameter.
4 Experiment

4.1 Dataset

The target domain used in this paper is from the NWPU VHR-10 dataset [19], which contains 650 positive sample images and 150 negative sample images. There are 10 classes of objects: plane, ship, storage tank (ST), baseball field (BF), tennis court (TC), basketball court (BC), ground track field (GTF), harbor, bridge, and vehicle. The image spatial resolution ranges from 0.08 to 2 m, and the image size ranges from 533 × 579 pixels to 1728 × 1028 pixels. We used the 650 positive sample images without cropping, but we tweaked some of the annotations. The source domain dataset DOTA+ comes from one of the largest object detection datasets in aerial images, the DOTA dataset [20]. It contains 2,806 large-sized images and has 15 categories of objects, including the ten of NWPU VHR-10. The spatial resolution of the images ranges from 0.1 to 0.2 m, and the ground truth is labeled as four-point coordinates, that is, it contains angle information that we do not need. We carried out standardized processing: the images in the dataset were uniformly cut to 1024 × 1024 pixels, and 620 images were selected for horizontal labeling. The classes we re-labeled are the ten classes coincident with NWPU VHR-10. A comparison of five classes of objects in the source and target domains is shown in Fig. 2. The number of objects in each class in these two datasets is shown in Table 1.

4.2 Experiment Setting

In experiments, we trained our models with the initial learning rate set to 0.0005. After 50K iterations, we reduced the learning rate to 0.00005 and continued training for 20K iterations to obtain our final detection model. We fine-tuned our model using stochastic gradient descent (SGD) with a momentum of 0.9, and we set λ to 1.0 and γ to 5.0. The architecture of our model is Faster-RCNN based on VGG-16 [21], using weights pre-trained on ImageNet [22].
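The training schedule just described can be sketched in plain Python. The function names and bookkeeping are illustrative; only the constants (0.0005/0.00005, the 50K-iteration switch, and momentum 0.9) come from the text.

```python
def learning_rate(iteration: int) -> float:
    """Two-stage schedule: 0.0005 for the first 50K iterations, then 0.00005."""
    return 0.0005 if iteration < 50_000 else 0.00005

def sgd_momentum_step(w: float, grad: float, velocity: float,
                      lr: float, momentum: float = 0.9):
    """One SGD-with-momentum update on a scalar parameter (sketch only)."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity
```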
We use mean average precision (mAP) as the evaluation metric and set the IoU threshold to 0.5.

4.3 Performance and Analysis

We compare our method with Faster-RCNN, domain adaptive Faster-RCNN (DA-Faster) [16], and strong–weak distribution alignment (SWDA) [17]. The Faster-RCNN model is only
Fig. 2 Comparison of five classes of objects ((a) Plane, (b) BD, (c) GTF, (d) TC, (e) Bridge) in source and target domains. Some of the images were cropped for more obvious contrast
Table 1 Number of objects in each class in source and target domain

        Bridge  BD   BC   Harbor  GTF  Vehicle  ST    Plane  TC    Ship
NWPU    124     390  159  224     166  598      657   775    502   300
DOTA+   80      139  134  209     57   859      1176  1070   1036  645
trained on the source domain samples and has no adaptability. In addition, we also use only target domain samples of different proportions as the training set, so that we can clearly compare the training effect of our method against training on the target domain itself. The DA-Faster method is a domain adaptive method based on the Faster-RCNN model, which uses two domain classifiers, an image-level classifier for high-level features and an instance-level classifier for features cropped by the region proposal network (RPN), both trained with cross-entropy loss. The SWDA method is also implemented on the Faster-RCNN model; its weak alignment model places the emphasis of the adversarial alignment loss on hard-to-classify target examples and pays less attention to easy-to-classify ones, while its strong domain alignment model focuses only on local features. The results of adaptation from DOTA+ to NWPU VHR-10 are shown in Table 2. As can be seen, our method is significantly superior to all comparison methods, with an average increase of 1.8% (from 59.9 to 61.7%), and it exceeds the training results using 25% of the target domain images.

4.4 Further Discussion

Table 3 shows the results of the ablation experiment of our method. We can clearly see that each module plays an important role in the whole detection system. Each level of
Table 2 Results on adaptation from DOTA+ to NWPU VHR-10. Average precision (%) is computed on target samples

Methods      Bridge  BD    BC    Harbor  GTF   Vehicle  ST    Plane  TC    Ship  mAP
Faster-RCNN  35.9    50.6  46.4  39.5    69.4  1.0      29.0  69.7   57.8  50.8  45.0
DA-Faster    36.0    65.4  41.6  61.1    41.5  9.9      24.4  81.4   59.6  59.2  47.7
SWDA         49.4    62.6  65.1  69.2    50.5  12.1     55.3  90.7   75.7  68.7  59.9
25% target   30.4    89.9  50.6  27.6    87.2  37.9     49.5  89.7   68.0  72.0  60.3
Proposed     40.4    69.6  67.8  78.1    54.8  17.7     49.8  90.6   69.6  72.9  61.7
the feature is reasonably used; when any one of the modules is deleted, the detection effect is greatly reduced.

Table 3 The impact of each module on the result, including image-level alignment, pixel-level alignment, instance-level alignment, and context vector (represented by Im, L, In, C, respectively)

Im  L  In  C  Bridge  BD    BC    Harbor  GTF   Vehicle  ST    Plane  TC    Ship  mAP
√   –  –   –  45.2    40.6  50.3  71.3    35.2  11.3     49.3  81.7   31.8  65.4  48.2
√   √  –   –  50.7    62.9  56.2  70.0    48.6  12.5     48.6  85.4   43.6  72.8  55.1
√   √  √   –  48.9    65.6  64.5  67.8    53.5  13.1     48.3  88.7   75.3  74.8  60.1
√   √  √   √  40.4    69.6  67.8  78.1    54.8  17.7     49.8  90.6   69.6  72.9  61.7
5 Conclusion

In this paper, we propose an adaptive detection method for remote sensing images based on image-level, instance-level, and pixel-level alignment. Multi-level feature alignment is realized by reweighting image-level features, combining instance-level features with context information, and pixel-level alignment. Experiments show that our method outperforms other adaptive detection methods, and the validity of the three-level feature alignment method is verified.
References

1. Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359
2. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. In: NIPS, pp 2672–2680
3. Bousmalis K, Silberman N, Dohan D, Erhan D, Krishnan D (2017) Unsupervised pixel-level domain adaptation with generative adversarial networks. In: CVPR, 2017
4. Liu MY, Breuel T, Kautz J (2017) Unsupervised image-to-image translation networks. In: NIPS, pp 700–708
5. Hu L, Kan M, Shan S, Chen X (2018) Duplex generative adversarial network for unsupervised domain adaptation. In: CVPR, pp 1498–1507
6. Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: NIPS, 2015
7. Saito K, Watanabe K, Ushiku Y, Harada T (2018) Maximum classifier discrepancy for unsupervised domain adaptation. In: CVPR, pp 3723–3732
8. Gong B, Shi Y, Sha F, Grauman K (2012) Geodesic flow kernel for unsupervised domain adaptation. In: CVPR, pp 2066–2073
9. Ganin Y, Lempitsky V (2015) Unsupervised domain adaptation by backpropagation. In: ICML, pp 1180–1189
10. Kang G, Jiang L, Yang Y, Hauptmann AG (2019) Contrastive adaptation network for unsupervised domain adaptation. In: CVPR, 2019
11. Tzeng E, Hoffman J, Saenko K, Darrell T (2017) Adversarial discriminative domain adaptation. In: CVPR, 2017
12. Hoffman J, Tzeng E, Park T, Zhu JY, Isola P, Saenko K, Efros A, Darrell T (2018) CyCADA: cycle-consistent adversarial domain adaptation. In: ICML, pp 1994–2003
13. Li Y, Yuan L, Vasconcelos N (2019) Bidirectional learning for domain adaptation of semantic segmentation. In: CVPR, 2019
14. Ganin Y, Ustinova E, Ajakan H, Germain P, Larochelle H, Laviolette F, Marchand M, Lempitsky V (2016) Domain-adversarial training of neural networks. JMLR 17(1):2096–2030
15. Pei Z, Cao Z, Long M, Wang J (2018) Multi-adversarial domain adaptation. In: AAAI, 2018
16. Chen Y, Li W, Sakaridis C, Dai D, Van Gool L (2018) Domain adaptive Faster R-CNN for object detection in the wild. In: CVPR, pp 3339–3348
17. Saito K, Ushiku Y, Harada T, Saenko K (2019) Strong-weak distribution alignment for adaptive object detection. In: CVPR, pp 6956–6965
18. He Z, Zhang L (2019) Multi-adversarial Faster-RCNN for unrestricted object detection. In: ICCV, 2019
19. Cheng G, Han J (2014) Multi-class geospatial object detection and geographic image classification based on collection of part detectors. ISPRS J Photogram Remote Sens 98:119–132
20. Xia GS, Bai X, Ding J, Zhu Z, Belongie S, Luo J, Datcu M, Pelillo M, Zhang L (2018) DOTA: a large-scale dataset for object detection in aerial images. In: CVPR, 2018
21. Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
22. Deng J, Dong W, Socher R, Li LJ, Li K, Fei-Fei L (2009) ImageNet: a large-scale hierarchical image database. In: CVPR
Research on Camera Sign-In System Based on SIFT Image Splicing Algorithm Feng Long Yan(B), Hai Bei Zhang, and Changyun Ge School of Computer and Software, Dalian Neusoft University of Information, Dalian, P.R. China [email protected]
Abstract. This paper focuses on an image splicing algorithm based on SIFT to help teachers check in more quickly. Multiple images with different angles and overlapping areas, monitored by the camera in a classroom, are selected for image splicing so as to solve the problem of the limited view field of the camera lens. At the same time, a spatial domain method is used to process the images. The results show that the splicing is complete and can meet the application requirements of a sign-in system with high precision and a wide perspective. Keywords: Sign-in system · SIFT · Image splicing
1 Introduction

Nowadays, cameras are deployed in most classrooms, but apart from video monitoring their utilization is not high. Meanwhile, sign-in is a time-consuming process for teachers and students, so using the cameras more efficiently to assist the sign-in process is an attractive scheme. However, due to the limited view scope of the camera lens, there may be missed shots in some places. The system first gets images with different angles monitored by the camera, and then splices them into a panoramic image so as to avoid missed shots to a large extent. To realize image splicing, there are many methods for image registration, such as registration based on features [1], registration in the transform domain [2], or solutions based on gray information [3]. After performance evaluation of different representative image registration algorithms, SIFT was found to be the most effective algorithm in the field of image registration [4]. SIFT is a local feature extraction method that detects and describes local features in images. The improved SIFT algorithm can find unique key points that do not change their characteristics, so it is widely used in the field of image recognition and matching [5].
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_117
2 Image Splicing Based on SIFT

Firstly, different scale spaces are constructed to get the difference-of-Gaussian (DoG) pyramid. The Gaussian difference is sensitive to noise and edges, and the Laplacian operator can highlight areas where the intensity changes rapidly, so it is a good choice for the edge detection task. The image is filtered with different σ in the Laplacian of Gaussian (LoG). Equation (1) shows that there is little difference between LoG and DoG in space, so DoG is widely used because of its simple calculation:

$$\sigma \nabla^2 G = \frac{\partial G}{\partial \sigma} \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma} \qquad (1)$$

DoG is the basis of scale space construction, and the Gaussian kernel is the only linear kernel that can generate a multi-scale space. The scale space $L(x, y, \sigma)$ of a 2D image $I(x, y)$ is shown in (2):

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y) \qquad (2)$$

Inside it, the two-dimensional Gaussian kernel is defined by (3); the extrema of the scale space can be obtained by convolution of the DoG and the image $I(x, y)$:

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2}\, e^{-(x^2 + y^2)/2\sigma^2} \qquad (3)$$
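The relation in Eq. (1) can be checked numerically with numpy: the scale derivative of the Gaussian equals $\sigma\nabla^2 G$ exactly, while the DoG difference quotient matches it only approximately (since $k - 1$ is not small), which is why the text calls DoG a close but simple substitute for LoG. The grid size, σ, and k below are arbitrary choices for the check.

```python
import numpy as np

def gaussian(x, y, sigma):
    # Two-dimensional Gaussian kernel, Eq. (3)
    return np.exp(-(x**2 + y**2) / (2.0 * sigma**2)) / (2.0 * np.pi * sigma**2)

ax = np.linspace(-8.0, 8.0, 321)
x, y = np.meshgrid(ax, ax)
sigma, k = 1.6, np.sqrt(2.0)  # arbitrary base scale and scale step

# sigma * Laplacian(G), using the closed-form Laplacian of a 2-D Gaussian
g = gaussian(x, y, sigma)
lhs = sigma * ((x**2 + y**2) / sigma**4 - 2.0 / sigma**2) * g

# dG/dsigma via a tiny finite difference: should match lhs almost exactly
h = 1e-5
d_sigma = (gaussian(x, y, sigma + h) - g) / h

# DoG difference quotient of Eq. (1): a coarse but shape-preserving approximation
dog = (gaussian(x, y, k * sigma) - g) / (k * sigma - sigma)

exact_err = np.max(np.abs(d_sigma - lhs)) / np.max(np.abs(lhs))
dog_err = np.max(np.abs(dog - lhs)) / np.max(np.abs(lhs))
```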
Next, the extrema of adjacent layers in each group of the Gaussian difference pyramid are found. These extreme points are the key points, each with a circle radius and center. Later, a circle is drawn around each feature point and the gradient of each point inside the circle is counted; a histogram is built to estimate the dominant orientation of each feature point. Finally, the feature descriptor is obtained by angle rotation and histogram statistics.
Fig. 1 Panoramic image splicing flow chart
F. L. Yan et al.
3 Experiment

In the camera sign-in system, image splicing and uploading to the server are completed first. In the server, face features in the uploaded image are extracted and compared with the face features stored in the database. Finally, the result of the comparison is returned to the teacher's control application. In this process, image splicing is a very critical step, the result of which directly affects the sign-in effect. The panoramic image splicing flow chart is shown in Fig. 1. First of all, it is necessary to preprocess the images in order to reduce the influence of brightness and noise. Noise elimination mainly depends on filtering methods to remove external noise from the picture, thus reducing the generation of invalid feature points. At the same time, spatial domain methods are used to process the images, such as contrast and sharpness enhancement. Another method is to exploit image gradient changes to improve the distinctiveness of the point features used by the SIFT algorithm. Figure 2 shows the key feature points in SIFT.
Fig. 2 Initial image and key feature points in SIFT
The feature descriptor can be calculated based on the point features, and the result is stored as an array of description features. In SIFT, a KD tree or FLANN is used to search and match the feature points [6, 7]. In a KD tree, the training data is modeled to construct the tree, and adjacent sample data is then retrieved according to the established model. In this project, a matcher is created first to detect SIFT feature matches between different images; after the matcher is created, the images are spliced based on the matching results. Figure 3 shows the effect of the first 20 matching feature points. There are two or more images for image splicing: the first image is the result of a perspective change, and at the same time the second image is transferred to the rightmost end of the first one. Multiple images are spliced according to the fusion principle of two images. The final result is shown in Fig. 4; it can be seen from the figure that the splicing of the image is complete, which can meet the application requirements of a sign-in system with high precision and a wide perspective.
Fig. 3 Matching of the first 20 feature points
Fig. 4 Panoramic spliced image in classroom
4 Conclusion

In this paper, we build a sign-in system based on a SIFT image splicing algorithm to help teachers check in more quickly. In order to solve the problem of the limited view field of the camera lens, multiple images with different angles and overlapping areas monitored by the camera are used, and the system realizes the image splicing technology. At the same time, a spatial domain method is used to process the images for identification and mobile display on the server. The splicing result is complete and can meet the application requirements of a sign-in system with high precision and a wide perspective. The simulation results show that the approach improves the image splicing effect and enhances the robustness of the algorithm, which has certain practicability and reference value.
References

1. Dou JF, Qin Q, Tu ZM (2018) Robust image matching based on the information of SIFT. Optik 171(10):850–861
2. Reddy BS, Chatterji BN (1996) An FFT-based technique for translation, rotation, and scale-invariant image registration, vol 5, no 8. IEEE Press, New York, pp 1266–1271
3. Lombaert H, Grady L, Pennec X et al (2014) Spectral log-demons: diffeomorphic image registration with very large deformations. Int J Comput Vision 107(3):254–271
4. Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vision 60(2):91–110
5. Lowe DG (1999) Object recognition from local scale-invariant features. In: Proceedings of the 7th IEEE international conference on computer vision 2:20–27
6. Yang Z-F, Yuan J-K, Huang Y-Y (2020) Indoor panorama splice based on SIFT algorithm. Autom Instrum 35(03):58–62+87
7. He H-Y, Han J (2020) Research on image mosaic technology based on SIFT algorithm. Autom Instrum 35(02):57–60+79
Research on Performance Optimization Scheme for Web Front-End and Its Practice Fenglong Yan(B), Zhao Xu, Yu Shan Zhong, Zhang HaiBei, and Chang Yun Ge School of Computer and Software, Neusoft University of Information, Dalian, P.R. China [email protected]
Abstract. This paper focuses on the performance optimization of the web front end, proposing an optimization scheme for the system from the front end. Taking an information management website as an example, it proposes a systematic, practical, and operable website performance optimization solution based on web services. The solution includes loading external CDN resources, optimising the component packaging process, lazy loading in routes, Qiniu cloud acceleration, file compression, etc. It can improve the usability and performance of a web application system, so as to improve user satisfaction through the optimization. At the end of the project, comparative results of website access performance before and after optimization are listed. The results show that the size of the top 5 resources was reduced from 20 MB to 79.7 kB, the image size was reduced by 94.3%, and the loading time decreased greatly, from the initial 25.27 s to the final 0.3 s. Keywords: Web service · Performance optimization · Page acceleration
1 Introduction

In China, there were 904 million internet users as of March 2020 according to CNNIC statistics, and the number will keep growing. Internet penetration has reached 64.5% of the population. With the rapid increase in the number of internet users, the number of website visits also increases, and it is a huge challenge to ensure the processing efficiency of a website and the user's access experience. With the continuous expansion of the front end and the trend toward mobile access, web front-end performance optimization has become a hot topic, owing to the low patience and low switching cost of customers: once the page access time exceeds 6 s, the customer will leave [1]. So, performance optimization for enterprise websites with low traffic is an urgent problem to be solved. Many scholars at home and abroad analyse the communication process between browser and server by building stochastic models, and propose a series of optimization schemes. At the beginning of the twenty-first century, many well-known internet companies such as Google and Yahoo started to pay attention to front-end performance
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_118
F. Yan et al.
optimization, and proposed many solutions and related tools, such as PageSpeed and Speed Tracer. Steve Souders focused on high-performance web sites [2] and on building even faster websites [3]. Song focused on the limitations of the B/S structure and designed a request scheduling algorithm, SACC, to improve the efficiency of HTTP request queue scheduling [4]. Paulo Pereira et al. implemented a stochastic performance model for web server capacity planning in fog computing [5]. Front-end optimization is complex: it involves all aspects of communication between client and server, and covers various types of resources such as HTML, CSS, JavaScript, and images. It can provide not only a convenient and friendly experience but also savings in transmission bandwidth. Liu introduced page element optimization methods and CDN technology, using Gzip compression to minimise pages, and put forward a forced-compression solution for cases where a web agent or PC security software destroys the request header [6]. In an optimization scheme for web front-end performance and its practice, Wang Cheng et al. showed that the loading time can be reduced by 82% and page rendering by 32% [7]. From the applications above, we can conclude that when a website serves smaller linked objects, users get faster loading and a better experience. Taking all the present research in this field into account, there is still a lack of performance optimization work, and further research remains to be conducted on system content and architecture. In this paper, the method of performance optimization is put forward in detail. The remainder of this paper is organised as follows. Section 2 presents the system architecture of the information management website. In Sect. 3, the optimization design and implementation are proposed. The optimization scheme discussion is carried out in Sect.
4 to demonstrate the performance. Finally, conclusions are drawn in Sect. 5.
2 Systems Architecture

This paper takes an information management website as an example to optimise its performance. As shown in Fig. 1, there are three layers in this architecture. The first layer is designed to provide data and is connected to the second layer; a MySQL database is used. The second layer is a service layer, and the top one is a data display layer for the management system, which includes data management, user management, goods management, and home management. The website requests specific data structures from the web application server by calling the web library interface; the web application server extracts business data from the database and then responds to the request. The whole system is written in Vue.js, and many third-party resources are used: ECharts for chart display, Element UI for interface views, a data encryption module for the information acquisition function, and other modules such as a rich text editor and a network transmission module. The initial resource size is 22.4 MB; the module size is 7.2 MB, and the dependencies size is 6.7 MB.
Research on Performance Optimization …
Fig. 1 Information management website architecture
3 Optimization Implementation

3.1 Loading External CDN Resources and Loading on Demand

A content delivery network (CDN) is a virtual network built on the real internet. It is composed of a master server and distributed network nodes all over the world; its goal is to reduce access latency and improve response time and website availability. CDN is considered one of the most advanced technologies affecting web experience [8]. A CDN can redirect the user's request to the nearest service node by comprehensively considering factors such as network traffic conditions, node connection information, actual load, distance from the user, and response time, which implements the distribution of the source to all CDN nodes [9]. In this project, all third-party dependency packages imported through import statements are packaged and merged into the same chunk-vendor.js file; Fig. 2 shows the resulting problem of a single file that is too large after packaging. To solve this problem, external CDN resources are adopted by configuring external nodes through webpack: all third-party dependency packages declared in externals will not be included in the final bundle.
Fig. 2 The initial resources size
Apart from this, in the header of the index.html file, CDN resource references and CSS are added. When a third-party dependency package is needed, the system finds the existing object on the window global object instead of in the chunk-vendor.js file.

3.2 Optimising Component Packaging Process and Image Compression

Although on-demand loading is enabled in the development phase, on-demand components still take up a lot of file size. At this point, loading on demand in the form of a CDN is a good choice to further reduce the volume of packed files. At the same time, static images are a serious problem for page performance. In practice, images can be compressed to the same scale in advance, which saves traffic and download cost. At current typical network speeds, the loading time for a single image of no more than 200 kB is almost the same [10]. In this project, we use optimum JPEG and Qiniu cloud to minimise the images. The image size was reduced from 1945.6 to 109.7 kB. After the size adjustment, the web page loading speed improved by close to 2.1 s, while the image quality to the naked eye remains unchanged. It can be seen that optimising the size of pictures at the proper time is conducive to improving the performance of the web page, improving the loading speed, and avoiding the waste caused by loading too many resources. The sizes of the top 5 files are shown in Tables 1, 2 and 3.

3.3 Lazy Loading

From Table 3, we can conclude that the JS package is still large after the project build. If the components can be divided into different blocks, the corresponding components
Table 1 Primary sources

Sources      Stats (kB)  Global (s)  3G slow (s)  3G fast (s)
Vendor.js    19,251      21.48       375.76       93.99
School.jpg   1945.6      2.25        39.21        0.85
App.js       1638.4      1.84        32.13        8.08
Element.ttf  54.6        0.09        1.47         0.42
Others       27.5        0.06        0.94         0.28
Table 2 External CDN resources

Sources      Stats (kB)  Global (s)  3G slow (s)  3G fast (s)
Vendor.js    789         0.91        15.18        4
Vendor.css   227.8       0.28        4.85         1.26
School.jpg   109.7       0.15        2.54         0.69
App.js       1638.4      1.84        32.13        8.08
Element.ttf  54.6        0.09        1.47         0.42
Table 3 External CDN resources and image compression

Sources     Stats (kB)  Global (s)  3G slow (s)  3G fast (s)
School.jpg  109.7       0.15        2.54         0.69
Vendor.js   95.5        0.14        2.27         0.12
App.js      40.8        0.08        1.2          0.35
Other       6.7         0.04        0.53         0.17
will be loaded when the routes are accessed, which is more efficient. In this project, plugin-syntax-dynamic-import is used to implement lazy loading. Through dynamic import of the router information, the project can parse and load the required files on demand, which improves response speed and overall performance. The key steps are as follows; after these three steps, the import style is changed to lazy-load mode:

1. Load the plug-in plugin-syntax-dynamic-import.
2. Configure the plug-in in babel.config.js.
3. Change the routing components to lazy loading.

The sizes of the top 5 files after lazy loading are shown in Table 4.

Table 4 Lazy loading

Sources     Stats (kB)  Global (s)  3G slow (s)  3G fast (s)
School.jpg  109.7       0.15        2.54         0.69
Vendor.js   79.7        0.12        1.96         0.54
User.js     18.8        0.05        0.77         0.24
Goods.js    16.2        0.05        0.72         0.23
Other       11.8        0.04        0.63         0.21
3.4 GULP Compression and Qiniu Cloud Acceleration

There are many space characters in the project. If these can be removed, the file size can be greatly reduced. We can use a dynamic language to remove redundant spaces, tabs, line breaks, and comments, so as to reduce the size of compressed files and improve the loading speed. There are also third-party tools providing similar functions, such as font-spider and GULP. We used Gzip and GULP compression in this project to implement code compression by removing extraneous characters such as whitespace and comments. This not only reduces the file size but also improves the page rendering speed and the performance of the web page. In addition, font-spider can be used in a Vue project for font compression, and Qiniu cloud is used for image compression. Table 5 shows the top 5 file sizes after file and font compression.

Table 5 GULP compression and Qiniu cloud acceleration

Sources    Stats (kB)  Global (s)  3G slow (s)  3G fast (s)
Vendor.js  79.7        0.12        1.96         0.54
User.js    18.8        0.05        0.77         0.24
Goods.js   16.2        0.05        0.72         0.23
Home       11.8        0.04        0.63         0.21
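The effect of stripping redundant characters and then Gzip-compressing, as described in Sect. 3.4, can be illustrated in Python. The JavaScript snippet below is invented, and the regex-based stripping is only a crude stand-in for a real minifier:

```python
import gzip
import re

# A toy JavaScript snippet standing in for project source code
src = b"""
// compute cart total
function total(items) {
    return items
        .map(function (i) { return i.price * i.qty; })
        .reduce(function (a, b) { return a + b; }, 0);
}
"""

# Crude minification: drop line comments, then collapse runs of whitespace
# (a real minifier parses the language instead of using regexes)
minified = re.sub(rb"//[^\n]*", b"", src)
minified = re.sub(rb"\s+", b" ", minified).strip()

original_size = len(src)
minified_size = len(minified)
gzipped_size = len(gzip.compress(minified))
```

Both steps shrink the payload the browser must download, which is the same mechanism behind the reductions reported in Table 5.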
4 The Optimization Scheme Performance Discussion

This project tests the performance optimization on a real information management website; analysis of the experimental results proves the feasibility and effectiveness of the optimization scheme. The optimised performance test is divided into 5 steps: S0 is the initial state; external CDN resources are adopted in S1; S2, unlike S1, deploys not only the basic component optimization but also other CDN acceleration such as CSS, and adds third-party image compression; S3 is lazy loading; and GULP compression and Qiniu cloud acceleration are deployed in S4. The response time of the different steps is shown in Fig. 3. The global time dropped from the initial 25.27 s to 0.3 s, of which CDN acceleration and image quality compression have the greatest impact.
Fig. 3 Response time in different optimization steps
In addition, it can be seen from Fig. 3 that the page response time is greatly reduced under the slow-3G network. This further suggests that CDN acceleration and image quality compression should be used as far as possible, provided user experience is preserved, so as to reduce their impact on page response time. Figure 4 shows the document size information in the different steps. It can be seen from the comparative data that lazy loading and file compression also reduce the size of the JavaScript files.
5 Conclusion

Visiting speed and user experience are the basic elements of a friendly website. This paper analyses the bottlenecks in the performance of information management websites, studies the theory related to the web front end, and the factors that affect the performance
Fig. 4 Document size in different optimization steps
are summarised. In this paper, specific steps are implemented according to the proposed optimization solution. At the end of the project, the comparative results of website access performance before and after optimization are listed. The results show that the size of the top 5 resources was reduced from 20 MB to 79.7 kB, the image size was reduced by 94.3%, and the loading time decreased greatly, from the initial 25.27 s to the final 0.3 s.
References

1. Chen H, Wan J (2018) Study of enterprise website performance optimization and its application based on web. Intell Process Appl 008(002):67–69, 73
2. Souders S (2007) High performance web sites, pp 1–170
3. Souders S et al (2009) Even faster web sites: performance best practices for web developers. O'Reilly Media, pp 1–250
4. Song X, Fang Q (2016) Research and application of web front end performance optimization. Inf Technol 10:198–202
5. Pereira P, Araujo J, Torquato M, Dantas J, Melo C, Maciel P (2020) Stochastic performance model for web server capacity planning in fog computing. J Supercomput:1–25 (prepublish)
6. Liu L (2015) Research on performance optimization based on web front end. Huazhong University of Science and Technology
7. Cheng W, Shaoyuan L, Lixiao Z, Jin G, Meiqin Z, Huimin L (2014) Optimisation scheme for web front-end performance and its practice. Comput Appl Softw 31(12):89–95+147
8. Yang L (2020) CDN-based website access acceleration technology research. Jiangxi Sci 38(02):245–251+286
9. Sajithabanu S, Balasundaram SR (2019) Direct push–pull or assisted push–pull? Toward optimal video content delivery using shared storage-based cloud CDN (SS-CCDN). J Supercomput 75(4):2193–2220
10. Huo F (2018) Optimization scheme and practice of web front-end performance. J Anhui Vocat Coll Electron Inf Technol 17(02):5–8
Automatic Counting System of Red Blood Cells Based on Fourier Ptychographic Microscopy Shushan Wang1 , Tingfa Xu1,2(B) , Jizhou Zhang1 , Xin Wang1 , Yiwen Chen1 , and Jinhua Zhang1 1 Image Engineering and Video Technology Lab, School of Optics and Photonics, Beijing
Institute of Technology, Beijing 100081, China [email protected] 2 Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120, China
Abstract. Red blood cell (RBC) counting is of great medical significance in clinical examination. Commonly, the cell counting task is completed by microscopic examination, which requires high resolution. This paper proposes an automatic red blood cell counting system based on Fourier ptychographic microscopy (FPM) and estimates the RBC number via a convolutional neural network (CNN). The counting network is based on a regression model, using a VGG-16 network combined with a feature pyramid network (FPN). The experimental results show that the mean absolute percentage error (MAPE) of our counting network reaches 0.86%, indicating high accuracy. Keywords: Fourier ptychographic microscopy · Red blood cell counting · Convolutional neural networks · Feature pyramid network
1 Introduction Red blood cell (RBC) counting is considered a significant predictor for early disease diagnosis. An increased RBC count may indicate heart, lung, or kidney disease, while a decline in RBC number is often associated with anemia. Furthermore, malaria parasite density quantification also relies on RBC counting in many malaria studies. Traditional methods of RBC counting rely on manual techniques. Due to the diffraction limit, the resolution of a microscope mainly depends on the numerical aperture (NA) of the objective lens, and a higher NA is often accompanied by a smaller field-of-view (FoV), which hinders full-field cell counting. For this issue, Fourier ptychographic microscopy (FPM) may be an effective solution. FPM is a synthetic aperture technique that iteratively stitches spectral information from a series of low-resolution images illuminated by a programmable LED array, then recovers a high-resolution, wide-FoV image in the Fourier domain.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_119
In this paper, we propose an automatic counting system of red blood cells based on Fourier ptychographic microscopy and design a specialized convolutional neural network (CNN) to fulfill the RBC counting task. Section 2 introduces related work on FPM and CNNs. Section 3 describes the schema of our method and details the counting network. Section 4 presents and discusses the experimental results. Finally, Sect. 5 summarizes the work in this paper.
2 Related Work Traditional optical imaging systems are restricted by the diffraction limit and usually have difficulty with small object detection. Computational imaging overcomes the inherent problems of traditional optical imaging systems by combining the acquisition ability of the imaging system with the computing ability of the computer. In 2013, Zheng et al. [18] first put forward the theoretical model of FPM imaging and set up the experimental system. As a computational imaging method, its theory integrates the concepts of synthetic aperture, phase retrieval, and the ptychographical iterative engine (PIE). In recent years, there have been several improvements of FPM in imaging mode [3, 5, 9], data acquisition [2, 12, 19], and wave reconstruction algorithms [1, 8, 16]. Due to its high resolution and wide FoV, FPM can be applied in biomedical fields such as digital pathology, haematology, and histology. Much research has been done on cell counting. Some of it relies on traditional image processing methods, for instance, the Hough transform [7, 10] and the watershed algorithm [11, 13]. Others are based on deep learning. Deep learning is a kind of supervised learning, and the CNN is one of its representative algorithms. Most studies treat cell counting as a classification task, so that they obtain the cells' accurate locations and shapes. However, for a wide-FoV image of RBCs, these algorithms consume huge computing resources, which slows down the counting process. Xie et al. [14] proposed a novel CNN based on a structured regression model that needs only a few training images with weak annotations. Xue et al. [15] also cast the counting task as a regression problem and took the global cell count as the annotation to supervise training.
3 Methods The proposed automatic counting system uses an FPM system to acquire high-resolution images and obtains the total number of RBCs in the specimen through an improved CNN. Figure 1 shows the overall framework of this system. The FPM setup is revamped from a common medical microscope. It employs a low-NA objective lens to collect a series of low-resolution images and recovers a high-resolution, wide-FoV image in the Fourier domain by an iterative method. The counting network is based on the feature pyramid network (FPN) and takes VGG-16 as the feature extractor to extract multi-layer image features. The feature layers of different scales are fused into a single-layer density map whose pixel-wise sum represents the estimated RBC number.
Fig. 1 The overall framework of the proposed automatic counting system
3.1 Theory of FPM The FPM system captures a series of low-resolution images of a thin specimen to recover a high-resolution image. As shown in Fig. 2, a programmable LED array is used as the light source for illumination. When capturing images, every LED unit is lit up in turn, which can be approximately seen as oblique plane waves from different angles. As the specimen is imaged by a low-NA objective under oblique plane wave illumination, the part of the specimen's Fourier spectrum beyond the objective's NA range is shifted to the center. As a result, the system extends its space-bandwidth product and is capable of recovering the specimen's high-frequency information. There are five steps in the iterative recovery procedure of FPM. First, initialize the high-resolution estimate from the collected low-resolution images, usually as the up-sampled image corresponding to the vertically incident plane wave. Second, capture a sub-aperture from the estimated high-resolution spectrum. Third, replace the target complex amplitude with the intensity measurement and update the spectrum. Fourth, repeat steps 2–3 for all incident angles. Finally, repeat the above steps until a self-consistent solution is found. This self-consistent solution is transformed from the Fourier domain to the spatial domain to recover a high-resolution image. 3.2 Regression Model Based on FPN To simplify the counting procedure, we use a density map to describe the amount of RBCs in the input image, in which the integral over all pixels represents the counting result. Thus, the counting task boils down to a regression problem, so our goal turns to estimating the density map. Figure 3 shows the annotation process of the density map. The density map is proportional to the input image, with dot annotations indicating the cells' central points. 2D Gaussian kernels are adopted to generate the density maps, as in [17].
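This dot-to-density construction can be sketched in a few lines of numpy; the kernel radius and σ below are illustrative choices, not values reported in the paper:

```python
import numpy as np

def density_map(points, shape, sigma=2.0, radius=6):
    """Stamp a unit-mass 2D Gaussian at each annotated cell center,
    so the sum over all pixels of the map equals the cell count."""
    h, w = shape
    dmap = np.zeros((h, w), dtype=np.float64)
    # Precompute one truncated Gaussian kernel, normalized to mass 1.
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    for r, c in points:
        # Clip the stamp against the image border.
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        kr, kc = r0 - (r - radius), c0 - (c - radius)
        dmap[r0:r1, c0:c1] += kernel[kr:kr + (r1 - r0), kc:kc + (c1 - c0)]
    return dmap
```

Because each truncated kernel is renormalized to unit mass, the pixel sum of the map equals the number of annotated cells, which is what makes regression-by-integration work.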
Fig. 2 A schematic of FPM system
Fig. 3 The annotation process of density map. a A region of original image. b Labeled image with dotted annotations. c Density map generated by Gaussian kernels
The overall structure of our CNN model for cell counting is shown in Fig. 4. We first use the VGG-16 network as the backbone to extract several feature maps of different scales from the input image, then merge these multiple layers by means of the feature pyramid network (FPN) [6]. FPN is a top-down architecture with lateral connections, which up-samples the spatial resolution of the coarser feature maps by a factor of 2 and merges them with the original maps through the lateral connections, finally generating a new group of merged feature maps. When FPN is applied to a detection task, all of these feature maps are used; in this paper, only the bottommost layer, which has the highest spatial resolution, is of use. After passing through the FPN, we obtain a merged map with 128 channels. Since the density map is a single-channel map, these channels ought to be fused into one. Therefore, we add several further convolutional layers to fit the density map.
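The top-down merge can be illustrated with a small numpy sketch; the per-pixel matrix multiply stands in for the learned 1×1 lateral convolutions, and all shapes and weights below are made up for illustration, not the network's actual parameters:

```python
import numpy as np

def upsample2(x):
    """Nearest-neighbor 2x upsampling: (C, H, W) -> (C, 2H, 2W)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fpn_merge(features, lateral_weights):
    """Top-down merge: project each backbone map to a common channel
    count with a 1x1 'conv' (here a per-pixel matrix multiply), then
    repeatedly 2x-upsample the coarser level and add the finer lateral.
    Returns only the bottommost (highest-resolution) merged map, the
    single level the counting network uses."""
    laterals = [np.einsum('oc,chw->ohw', w, f)
                for w, f in zip(lateral_weights, features)]
    merged = laterals[-1]                    # coarsest level
    for lateral in reversed(laterals[:-1]):  # walk toward finer levels
        merged = upsample2(merged) + lateral
    return merged
```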
Fig. 4 The structure of our convolutional neural network for RBC counting
Our loss function of this regression problem is defined as the Euclidean loss:

$$L\left(\hat{y}, y\right) = \frac{1}{2N}\sum_{i=1}^{N}\left\|\hat{y}_i - y_i\right\|_2^{2} \tag{1}$$

where $y_i$ is the ground truth, $\hat{y}_i$ is the estimated density map, and $N$ is the number of training images.
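A direct implementation of Eq. (1), assuming the estimated and ground-truth maps are stacked along a leading batch axis:

```python
import numpy as np

def euclidean_loss(est_maps, gt_maps):
    """Eq. (1): half the mean (over the N training images) of the
    squared L2 distance between estimated and ground-truth maps."""
    est = np.asarray(est_maps, dtype=np.float64)
    gt = np.asarray(gt_maps, dtype=np.float64)
    n = est.shape[0]
    return float(np.sum((est - gt) ** 2) / (2.0 * n))
```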
4 Results 4.1 Implementation Details All experimental images are captured from human peripheral blood smears. The specimens are illuminated by a 13 × 13 LED array placed about 9 cm away. The LED units are juxtaposed at a distance of 8 mm, and the magnification of the FPM objective is 4× (0.13 NA). According to the number of LED units, we sequentially acquire 169 low-resolution images for wave reconstruction. Finally, the size of the recovered high-resolution images is 900 × 900. The image capture and reconstruction process is implemented in MATLAB. 4.2 Evaluation Metric In this work, we take three metrics to evaluate the performance of different methods: mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean squared error (RMSE). MAE is the most basic evaluation metric, indicating the accuracy of the estimated results, while mean squared error (MSE) is used to evaluate the robustness of the models. Here, we use RMSE instead of MSE so that the results are more intuitive when compared with ground truths. Considering the ratio between
the estimated errors and the truths, we add MAPE into the evaluation, which describes the accuracy rate. MAE, MAPE, and RMSE are defined as follows:

$$\mathrm{MAE} = \frac{1}{N}\sum_{i=1}^{N}\left|\hat{y}_i - y_i\right| \tag{2}$$

$$\mathrm{MAPE} = \frac{100\%}{N}\sum_{i=1}^{N}\frac{\left|\hat{y}_i - y_i\right|}{y_i} \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^{2}} \tag{4}$$
where N is the number of test images, yi is the actual number of RBCs, and ŷi is the estimated number of RBCs. 4.3 Model Evaluation and Comparison Here, we present the counting results of three methods, Lempitsky et al. [4], Zhang et al. [17], and our method, in Table 1. These three methods are all based on density estimation with a regression model. From Table 1 we can see that our method performs better than the other methods. Figure 5 shows a visualization of partial test results, which demonstrates that the estimated density maps generated by our method are closer to the ground truth.

Table 1 Comparison of test results of different methods

Method            MAE    RMSE   MAPE (%)
Lempitsky et al.  48.28  62.13  6.55
Zhang et al.       7.73  13.23  1.05
Our method         6.35  11.52  0.86
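Eqs. (2)–(4) translate directly into code; a plain-Python sketch:

```python
import math

def count_metrics(estimated, truth):
    """Eqs. (2)-(4): MAE, MAPE in percent, and RMSE over N test images."""
    n = len(truth)
    errors = [e - t for e, t in zip(estimated, truth)]
    mae = sum(abs(err) for err in errors) / n
    mape = 100.0 * sum(abs(err) / t for err, t in zip(errors, truth)) / n
    rmse = math.sqrt(sum(err ** 2 for err in errors) / n)
    return mae, mape, rmse
```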
5 Conclusion This paper designs an automatic counting system for red blood cells based on Fourier ptychographic microscopy. It not only achieves high-resolution imaging with a low-NA objective lens but also expands the FoV of the microscope. In addition, a specialized CNN is introduced to accomplish the automatic RBC counting task, which raises the counting efficiency while maintaining high accuracy.
Fig. 5 900 × 900 original images for counting and the estimated results with three methods. a Input images. b Ground truths. c–e Estimated results of Lempitsky et al., Zhang et al., and our method

Acknowledgements. This work was supported by the Key Laboratory Foundation under Grant TCGZ2020C004.
References
1. Bian L, Suo J, Zheng G et al (2015) Fourier ptychographic reconstruction using Wirtinger flow optimization. Opt Express 23(4):4856–4866
2. Dong S, Shiradkar R, Nanda P et al (2014) Spectral multiplexing and coherent-state decomposition in Fourier ptychographic imaging. Biomed Opt Express 5(6):1757–1767
3. Konda PC, Taylor JM, Harvey AR et al (2018) Parallelized aperture synthesis using multi-aperture Fourier ptychographic microscopy. arXiv: Optics
4. Lempitsky V, Zisserman A (2010) Learning to count objects in images. In: Neural information processing systems (NIPS). Curran Associates Inc., Vancouver, pp 1324–1332
5. Li Z, Zhang J, Wang X et al (2014) High resolution integral holography using Fourier ptychographic approach. Opt Express 22(26):31935–31947
6. Lin T, Dollar P, Girshick R et al (2017) Feature pyramid networks for object detection. In: IEEE conference on computer vision and pattern recognition (CVPR). IEEE, Honolulu, pp 936–944
7. Mazalan SM, Mahmood NH, Razak MA et al (2013) Automated red blood cells counting in peripheral blood smear image using circular Hough transform. In: International conference on artificial intelligence, modelling and simulation. IEEE, Kota Kinabalu, pp 320–324
8. Ou X, Zheng G, Yang C (2014) Embedded pupil function recovery for Fourier ptychographic microscopy. Opt Express 22(5):4960–4972
9. Pacheco S, Zheng G, Liang R (2016) Reflective Fourier ptychography. J Biomed Opt 21(2):026010.1–026010.6
10. Reddy VH (2014) Automatic red blood cell and white blood cell counting for telemedicine system. Int J Res Advent Technol 2(1)
11. Sharif JM, Miswan MF, Ngadi MA et al (2012) Red blood cell segmentation using masking and watershed algorithm: a preliminary study. In: International conference on biomedical engineering (ICoBE). IEEE, Penang, pp 258–262
12. Tian L, Waller L (2014) Illumination coding for fast Fourier ptychography with large field-of-view and high resolution. In: Frontiers in optics. Optical Society of America, Tucson, pp FW1E-7
13. Tulsani H, Saxena S, Yadav N (2013) Segmentation using morphological watershed transformation for counting blood cells. IJCAIT 2(3):28–36
14. Xie Y, Xing F, Kong X et al (2015) Beyond classification: structured regression for robust cell detection using convolutional neural network. In: International conference on medical image computing and computer-assisted intervention (MICCAI). Springer, Cham, pp 358–365
15. Xue Y, Ray N, Hugh J et al (2016) Cell counting by regression using convolutional neural network. In: European conference on computer vision (ECCV). Springer, Cham, pp 274–290
16. Yeh LH, Dong J, Zhong J et al (2015) Experimental robustness of Fourier ptychography phase retrieval algorithms. Opt Express 23(26):33214–33240
17. Zhang Y, Zhou D, Chen S et al (2016) Single-image crowd counting via multi-column convolutional neural network. In: IEEE conference on computer vision and pattern recognition (CVPR). IEEE, Las Vegas, pp 589–597
18. Zheng G, Horstmeyer R, Yang C (2013) Wide-field, high-resolution Fourier ptychographic microscopy. Nat Photonics 7(9):739–745
19. Zhou Y, Wu J, Bian Z et al (2016) Wavelength multiplexed Fourier ptychographic microscopy. In: Computational optical sensing and imaging. Optical Society of America, Heidelberg, pp CT2D-4
Implementation of Speech Recognition on PYNQ Platform Wei Sheng, Songyan Liu(B), Yi Sun, and Jie Cheng Electronic Engineering College, Heilongjiang University, Harbin 150080, China [email protected], [email protected], [email protected], [email protected]
Abstract. With the development of voice technology, human–computer interaction is widely used in society. Audio processing at the front end of the Internet of Things needs faster speed and lower power consumption. In this paper, on the PYNQ development board with a more powerful embedded system, the combination of a CNN network and an FPGA is used to realize the rapid conversion of speech into text. Keywords: Speech recognition · CNN · FPGA · PYNQ
1 Introduction Speech recognition is one of the important technologies in the field of information technology. It realizes the dream of direct communication between humans and computers [1]. Up to now, speech recognition technology has made great breakthroughs, which also puts forward new requirements on recognition speed, recognition accuracy, and other aspects. This paper makes use of the open-source PYNQ framework to build a more powerful embedded system quickly and to develop and test directly on it.
2 Technical Background Speech recognition technology is one of the important technologies in human–machine interaction, mobile terminal applications, military command, public security, and other fields. The speed and accuracy of recognition play a decisive role in user experience and property safety. 2.1 PYNQ PYNQ is an open-source framework designed to enable embedded programmers to take advantage of Xilinx Zynq's fully programmable SoC without having to design
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_120
programmable logic circuits. Using the Python language and libraries, designers can take advantage of programmable logic and microprocessors to build more powerful electronic systems. The basic features are as follows:
(1) Processor: dual-core ARM Cortex-A9 processor with a main frequency of 650 MHz.
(2) Memory: 512 MB DDR3, with a memory controller with 8 DMA channels and four high-performance AXI3 slave ports.
(3) FPGA: reconfigurable logic circuits.
(4) Video interface: HDMI input/output interface.
(5) Audio interface: input/output interface.
(6) Peripheral interfaces: GPIO, USB, and other I/O interfaces.
2.2 Principles of Speech Recognition In recent years, neural networks have been widely used in speech recognition. Compared with traditional methods based on hidden Markov model speech modeling, a neural network simulates the human nervous system, integrates autonomous learning and information analysis abilities into the speech recognition system, enhances the extraction of speech signal features, and reduces the error rate of speech recognition; it is the mainstream technology in the recognition field [2]. In this paper, a convolutional neural network speech model is established to extract the features of the input speech information, which are then compared with the speech database to find the optimal solution. Figure 1 shows the speech recognition process.
Fig. 1 Speech recognition process
The speech signal is a random analog signal. In the pre-processing stage, the speech signal needs to be pre-emphasized and windowed, and then features are extracted according to the acoustic characteristics of the sound signal. The construction of the acoustic model affects the performance of recognition [3]. The speech model provides candidate words for the recognition results and outputs the results with higher comprehensive scores based on word-likelihood evaluation. Finally, the decoder is used to convert the speech recognition results into text.
2.3 Convolutional Neural Network At present, speech recognition is modeled using convolutional neural networks. A CNN can better handle the non-specific differences of speech signals and the influence of noise. The multi-layer nodes of the convolutional neural network are all fully connected, and each plane is composed of independent neurons, which is more suitable for the speech recognition system [4]. A CNN can reduce the input and output parameters and the interference caused by slight changes in signal amplitude; it can also alleviate the problem of overfitting. Figure 2 shows the structure of the convolutional neural network.
Fig. 2 Convolutional neural network
First, the speech signal is preprocessed, and the extracted feature vector is taken as the input of the network. After the convolution layer convolves it with the convolution kernels, the result is the input of the pooling layer. In a deep neural network, a convolutional layer contains multiple convolutional kernels, and the convolution of each convolutional kernel with the input vector forms one plane of the layer's convolutional output. Generally, the convolutional layer is essentially a linear weighted operation. The expression is shown in Formula (1):

$$S(i, j) = (I * K)(i, j) = \sum_{m}\sum_{n} I(i+m,\, j+n)\, K(m, n) \tag{1}$$
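Formula (1) as written is a "valid"-region cross-correlation; a naive numpy sketch (real frameworks use far faster implementations):

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Formula (1): S(i, j) = sum_m sum_n I(i+m, j+n) K(m, n),
    evaluated wherever the kernel fits entirely inside the image."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```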
The output of the convolutional layer is used as the input of the pooling layer, and pooling is carried out by max pooling, mean pooling, or other methods. At the same time, after a nonlinear operation, it serves as the input of the next layer; an appropriate activation function is chosen for this operation. There are many types of activation functions, such as Sigmoid, Tanh, and ReLU. In this paper, the ReLU function is used, and the expression is shown in Formula (2):

$$\varphi(x) = \max(0, x) \tag{2}$$
Since the ReLU function is piecewise linear, the amount of calculation is not too large, reducing the dependence between parameters. The forward pass, backward pass, and derivative are all piecewise functions, which is more conducive to optimization
and can also alleviate the problem of overfitting. Finally, the result passes through the Softmax layer; if the input of this layer is $w_i$, the output is shown in Formula (3):

$$\mathrm{softmax}(y_i) = \frac{e^{w_i}}{\sum_{j=1}^{n} e^{w_j}} \tag{3}$$
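Formulas (2) and (3) in code; the max-subtraction in the softmax is a standard numerical-stability step not shown in the formula:

```python
import math

def relu(values):
    """Formula (2): phi(x) = max(0, x), applied elementwise."""
    return [max(0.0, v) for v in values]

def softmax(w):
    """Formula (3): normalized exponentials over the layer inputs."""
    m = max(w)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in w]
    total = sum(exps)
    return [e / total for e in exps]
```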
3 Method 3.1 Build a Convolutional Neural Network According to the characteristics of CNNs and the requirements of speech recognition, this paper constructs a CNN including 1 input layer, 6 convolutional layers, 3 pooling layers, a fully connected layer, and a Softmax layer. Among them, the output activation function of the network is softmax, and the activation function of all other hidden layers is the ReLU function. The network structure is shown in Fig. 3.
Fig. 3 Convolutional neural network structure
The network input layer constructed in this paper is a 40*16 feature matrix. Firstly, it goes through convolutional layers C1 and C2, with a total of 64 convolution kernels of size 3*3 and stride 1. The pooling region of the following pooling layer is 2*1, the horizontal and vertical strides are 1 and 2, and the output is 64 feature maps of 2*1. Then, it goes through convolutional layers C3 and C4; each layer has 128 convolution kernels of size 3*3 with stride 1. The pooling region of the following pooling layer is 2*2, the horizontal and vertical strides are 1 and 2, and the output is 128 feature maps of 2*2. Finally, it goes through convolutional layers C5 and C6; each layer has 256 convolution kernels of size 3*3 with stride 1. The pooling region of the following pooling layer is 2*2, and the output is 256 feature maps of 2*2.
3.2 System Test Process Speech recognition is one of the important research directions in the field of artificial intelligence. Researchers select different operating systems and platforms and use convolutional neural networks for speech recognition. This work runs on an embedded system, and the block diagram of the system test realized on the PYNQ development board is shown in Fig. 4.
Fig. 4 Block diagram of system testing and setting
Firstly, the PC, the PYNQ development board, and the display screen are connected by HDMI cables. The PYNQ image file is written to a 32 GB SD card, the development board is powered by an external 12 V supply, and the embedded Ubuntu system is entered after startup. Then, the HDMI interface integrated on the PYNQ development board is used for audio input, and the recognized text is displayed on the output interface. The recognition result is shown in Fig. 5.
Fig. 5 Identification results
It can be seen that the speech recognition model can accurately identify the number of words and the basic text content contained in the audio file. Below is the recognition of a relatively long speech with a text of about 200 words. The recognition results are shown in Fig. 6.
Fig. 6 Identification results
As can be seen from the figure, when the audio is longer and the content is more complex, the general text content can be recognized and the punctuation marks can be accurately recognized.
3.3 System Performance This project is implemented on a PC embedded system and on PYNQ, respectively. The hardware of the PC is an Intel Core i3-4005U processor with a 1.7 GHz main frequency and 1 GB of virtual memory for the embedded system. The PYNQ development board is the PYNQ-Z1 series; the processor is an ARM Cortex-A9 with a main frequency of 650 MHz, plus a 7-series FPGA (FPGA gate circuits). Compared with the Intel Core i3-4005U, computing on the PYNQ development board is faster, takes less time, and consumes less power, as shown in Table 1.

Table 1 Power consumption of the two hardware platforms

Platform  Power supply voltage (V)  Steady-state power (W)
PC        20                        15
PYNQ      12                        3.07
4 Conclusion The PYNQ platform can achieve strong acceleration: it has high-bandwidth and low-bandwidth peripheral interfaces, 13,300 logic slices, high-speed transmission capacity, and low power consumption, achieving high performance at low cost. After testing, the model designed in this paper is implemented on the PYNQ development board, and the computing speed is greatly improved compared with the Intel Core i3-4005U, while the power consumption is about 1/5 that of the Intel Core i3-4005U. However, the PYNQ platform has limited resources and encapsulated APIs. The gate circuits can be reconfigured by the FPGA, and faster processing speed can be obtained by partitioning software and hardware reasonably, which is a direction for further study and improvement.
References
1. Kerkeni L, Serrestou Y, Raoof K, Cléder C, Mahjoub M, Mbarki M (2019) Automatic speech emotion recognition using machine learning. In: Social media and machine learning. IntechOpen
2. Joshi A, Chabbi D, Suman M, Kulkarni S (2015) Text to speech system for Kannada language. In: 2015 international conference on communications and signal processing (ICCSP). IEEE, pp 1901–1904
3. Parthasarathy S, Tashev I (2018) Convolutional neural network techniques for speech emotion recognition. In: 2018 16th international workshop on acoustic signal enhancement (IWAENC), pp 121–125
4. Li R, Liang H, Shi Y, Feng F, Wang X (2020) Dual-CNN: a convolutional language decoder for paragraph image captioning. Neurocomputing 396
Multi-target Tracking Based on YOLOv3 and Kalman Filter Xin Yin1(B) , Jian Wang1 , and Shidong Song2 1 Tianjin Key Laboratory of Wireless Mobile Communications and Power Transmission, Tianjin
Normal University, Tianjin 300387, China [email protected] 2 Tiangong University, Tianjin 300387, China [email protected]
Abstract. This paper studies multi-target tracking based on YOLOv3 combined with a Kalman filter. The method takes the targets detected by YOLOv3 in the current frame as the detection input for the next frame and iterates the Kalman filter's predicted values against the detection results to modify the tracking trajectory. This method achieves high real-time performance and improves robustness at the same time. Keywords: YOLOv3 · Kalman filter · Target track · Machine learning
1 Introduction Computer vision is a hot research field, and target tracking is widely used in monitoring systems, medical image processing, and artificial intelligence. Under the premise of handling illumination, complex backgrounds, and occlusion, tracking targets in real time while ensuring the algorithm's robustness is a popular research direction. Classical target tracking algorithms include the particle filter, the Kalman filter [1], and combinations or improvements of them, for instance, the pairwise filter [2] and the unscented Kalman filter [3, 4], which work with Taylor series expansion. The motion prediction of the Kalman filter for nonlinear, non-Gaussian targets is very poor, and the particle filter algorithm needs to track targets based on the target's color or spatial features on each frame. Considering real-time performance and tracking accuracy, a new multi-target tracking method is proposed based on the YOLOv3 algorithm and the Kalman filter. We compared the more advanced detection algorithms of recent years, such as the deep residual network (ResNet) [5], the SORT algorithm [6], multi-hypothesis tracking (MHT) [7], and the SSD algorithm [8]. The YOLOv3 algorithm [9] borrows from ResNet and SSD on the basis of YOLO [10], and we rely on the recently released YOLOv3 architecture for real-time pedestrian tracking [11], the Kalman filter for estimation, and accurate position information for trajectory modification.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_121
2 YOLOv3 and Kalman Filter 2.1 YOLOv3 Neural Network The Keras-based YOLOv3 mainly uses the Darknet convolutional neural network [12] to realize target detection. Darknet-53 also achieves the highest measured floating-point operations per second, which makes it more efficient to evaluate and thus faster [9]. Its convolutional layers are composed of 1*1 and 3*3 convolution kernels with batch normalization and the LeakyReLU activation function. There are 53 convolutional layers in the whole network, and the dimension changes of the feature maps are realized through convolutional layers with a stride of 2. There are 5 down-sampling stages in the YOLOv3 network, so the input size should be a multiple of 2^5 = 32, and we select 416*416 images as the input of the network. The YOLOv3 network predicts the image at three scales [9]: if the input image size is 416*416, the network predicts targets on feature maps of 52*52 (the network output after 2^3 = 8 times down-sampling), 26*26 (after 2^4 = 16 times down-sampling), and 13*13 (after 2^5 = 32 times down-sampling), i.e., cells divided at three different scales. The network predicts four coordinates for each box, t_x, t_y, t_w, t_h; if the cell is offset from the top-left corner of the image by (c_x, c_y) and the bounding box prior has width and height p_w, p_h, the predictions correspond to [9]:

b_x = σ(t_x) + c_x,  b_y = σ(t_y) + c_y,  b_w = p_w e^{t_w},  b_h = p_h e^{t_h}

Figure 1 shows the bounding box with dimension priors and location prediction. If a bounding box prior overlaps a ground truth best, the objectness score predicted by YOLOv3 is set to 1, and other bounding boxes that overlap a ground-truth object by more than a threshold of 0.5 are ignored [13].
Fig. 1 Bounding box with dimension priors and location prediction
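The decoding of the raw outputs t_x, t_y, t_w, t_h with the cell offset c_x, c_y and prior size p_w, p_h described in [9] can be sketched as:

```python
import math

def decode_box(tx, ty, tw, th, cx, cy, pw, ph):
    """YOLOv3 box decoding [9]:
        bx = sigmoid(tx) + cx,  by = sigmoid(ty) + cy,
        bw = pw * exp(tw),      bh = ph * exp(th)
    where (cx, cy) is the grid-cell offset and (pw, ph) the prior size."""
    sigmoid = lambda t: 1.0 / (1.0 + math.exp(-t))
    return (sigmoid(tx) + cx, sigmoid(ty) + cy,
            pw * math.exp(tw), ph * math.exp(th))
```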
Each bounding box has five basic parameters: center point coordinates, width, height, and confidence, plus the probabilities of the 80 categories of the COCO data set. We use a YOLOv3 model trained on the MS COCO dataset [14] and resize the raw images that
are extracted from videos to 416 (H) × 416 (W) to fit the model. The training process of the YOLOv3 network:

Step 1: load the pre-trained model and initialize the network parameters;

Step 2: get the training data set: n images containing m targets;

Step 3: resize each image into the tensor

[batch_size, 3, img_size ± n × 32, img_size ± n × 32]  (1)

with the target information

[m, 6 (img_ID, class_ID, x, y, w, h)]  (2)

and put (1) into Darknet-53;

Step 4: output the tensor

[batch_size, m, 5 + class]  (3)

and get the loss value over the three bounding-box grids:

M = grid1² + grid2² + grid3²  (4)

back propagation then optimizes the parameters of the network;

Step 5: calculate the parameters and save the model.

The input of the Darknet-53 network is the tensor of the resized image:

[batch_size, 3, 416, 416]  (5)

Then, the output of the network is also a tensor, which contains the targets' information:

[batch_size, 10647, 5 + class]  (6)
The 10,647 in (6) shows that there are 10,647 proposal boxes across the three prediction scales (10,647 = 3 × (13^2 + 26^2 + 52^2), with three anchors per cell), and 5 + class comprises the center point coordinates of the prediction box, its width, height, confidence, and class.

2.2 Kalman Filter

The Kalman filter updates the motion state using a theoretical prediction calculated from a velocity model together with the current observed variable [15], which is the actual position information of the target detected by YOLOv3.
908
X. Yin et al.
Step 1: Calculate the prediction value X̂t and the covariance matrix Pt of the errors between the prediction values and the true values:

X̂t = A X̂t−1 + B ut−1   (7)

Pt = A Pt−1 A^T + Q   (8)
Here, A in (7) and (8) is the transformation matrix of the system state equation, B is the input matrix, u is the input, and Q is the covariance of the process noise.

Step 2: Calculate the Kalman gain Kt and get the estimation value X̂t:

Kt = Pt H^T (H Pt H^T + R)^−1   (9)

X̂t = X̂t + Kt (Zt − H X̂t)   (10)
Parameter R is the covariance of the measurement noise, and H is the mapping between the measurement values and the system state.

Step 3: Calculate the covariance matrix Pt of the errors between the estimation value and the true value, which is prepared for the next recurrence:

Pt = (I − Kt H) Pt   (11)
In our method, we use the Hungarian algorithm [16] to match the bounding boxes with the Kalman filter results. The process of the whole method is shown in Fig. 2, and the IOU match is defined in (12):

IOU = (A ∩ B) / (A ∪ B)   (12)
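Equation (12) can be computed directly from two axis-aligned boxes; a minimal sketch (the corner-coordinate box format is an assumption of this example):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection rectangle (zero area when the boxes do not overlap)
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7 ≈ 0.1429
```

A detection and a Kalman prediction are then considered a match when their IOU exceeds the matching threshold.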
The procedure of the IOU match in the method is shown in Fig. 3. In our method, when the next frame arrives, YOLOv3 detects whether there is a new target and decides whether to add the new target to the prediction. It then uses the image patch around the detected target's center location in the previous frame as the input of the trained YOLOv3 model, obtains the precise target location in the next frame, and uses this information from the current frame as the actual measurement values of the Kalman filter to update the target motion state.
3 Experimental Results and Analysis

In this section, the target tracking experiment on non-rigid objects based on YOLOv3 and the Kalman filter is carried out, and the target tracking paths of pedestrians in different scenes are drawn by this method. We made three videos in the street or on a roof where we could clearly keep the walking pedestrians in shot; the scenes of the three videos become increasingly complex. We then turned each video into a frame-by-frame image sequence. The experiments were first conducted without step 3 of our method, and Fig. 4a shows the result: the target was off track by the second frame. In our method, we instead use the accurate position information detected by YOLOv3 again to modify
Fig. 2 Method for target tracking
the trajectory. It can be seen from Fig. 4a–c that our method greatly improves the tracking accuracy. In the experiments based on our method, a single target was first tracked in an open scene with no obvious occlusion, and then multiple targets at different depths of field were tracked in a complex scene with some small occlusions, and precise tracking paths were obtained. It can be seen from Fig. 5a that the performance of multi-target tracking under a complex background with some small occlusions is also accurate.
Fig. 3 Procedure of IOU match
Fig. 4 Comparison of traditional method and our method: a Frame 2 in the method without step 3, b frame 2 in our method, c frame 36 in our method
Fig. 5 Results of our method in multi-target: a frame 182, b frame 206, c frame
Figure 5b shows the target tracking with a large occlusion (the whole field of vision is obscured). Figure 5c shows the target tracking after the obstruction frame has passed.
4 Conclusions

From the experimental results above, it can be seen that the method in this paper achieves good target tracking under small occlusions, with real-time operation and stable algorithm performance. In this paper, the application of multi-target tracking based on YOLOv3 and the Kalman filter is studied. This paper does not consider multi-target tracking in more complex environments, such as intersections and scenes full of vehicles, or adaptive tracking methods and applications in unmanned vehicles, which are directions for further research.
References

1. Kalman R (1960) A new approach to linear filtering and prediction problems. J Basic Eng 82(Series D):35–45
2. Pieczynski W, Desbouvries F (2003) Kalman filtering using pairwise Gaussian models. In: IEEE international conference on acoustics, speech, and signal processing, VI, Hong Kong, pp 57–60
3. Julier S, Uhlmann J, Durrant-Whyte HF (2000) A new method for the nonlinear transformation of means and covariances in filters and estimators. IEEE Trans Autom Control 45(3):477–482
4. Julier SJ, Uhlmann JK, Durrant-Whyte HF (1995) A new approach for filtering nonlinear systems. In: Proceedings of the 1995 American control conference
5. He K, Zhang X, Ren S et al (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
6. Bewley A, Ge Z, Ott L, Ramos F, Upcroft B (2016) Simple online and realtime tracking. In: IEEE international conference on image processing, pp 3464–3468
7. Kim C, Li F, Ciptadi A, Rehg JM (2015) Multiple hypothesis tracking revisited. In: ICCV, pp 4696–4704
8. Liu W, Anguelov D, Erhan D et al (2016) SSD: single shot multibox detector. In: European conference on computer vision, pp 21–37
9. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767
10. Zhang Y, Shen Y, Zhao Q (2019) Multi-person tracking algorithm based on data association. Optik 194:163124
11. Saleh K, Hossny M, Nahavandi S (2020) Spatio-temporal DenseNet for real-time intent prediction of pedestrians in urban traffic environments. Neurocomputing 386:317–324
12. Redmon J (2013–2016) Darknet: open source neural networks in C. https://pjreddie.com/darknet/
13. Ren S, He K, Girshick R, Sun J (2015) Faster R-CNN: towards real-time object detection with region proposal networks. arXiv preprint arXiv:1506.01497
14. Lin T-Y, Maire M, Belongie S et al (2014) Microsoft COCO: common objects in context. In: European conference on computer vision. Springer, pp 740–755
15. Milan A, Leal-Taixé L, Reid I, Roth S, Schindler K (2016) MOT16: a benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831
16. Kuhn HW (1955) The Hungarian method for the assignment problem. Nav Res Logist Q 2:83–97
Load Balance Analysis-Based Retrieval Strategy in a Heterogeneous File System Di Lin(B) , Weiwei Wu, and Yao Qin University of Electronic Science and Technology of China, Chengdu 610054, China [email protected]
Abstract. In order to adapt to heterogeneous storage environments and improve the overall I/O efficiency and system throughput, we address the optimization of replica management and retrieval strategies in a distributed file system. The proposed strategies can enhance the system's adaptability to complex environments, improve the rationality of its load balance, and reduce the overall storage cost of the system. Specifically, this paper presents a performance metric for measuring the load of nodes. Firstly, we introduce the concept of comprehensive load, and then propose a method of computing the comprehensive load based on a multi-dimension analysis of the system. In addition, we propose replica management and retrieval strategies which consider the comprehensive load of the nodes in a distributed file system and systematically optimize the allocation of loads among the nodes. Based on the abovementioned strategies, we address the replica placement strategy, the replica management strategy, and the retrieval algorithm in a distributed file system, in consideration of the heterogeneity of the nodes in the cluster, the differences between files, and the real-time performance of the nodes. All these strategies and algorithms can optimize the replica and retrieval process in a distributed file system.

Keywords: Distributed file system · Load balance · Multi-dimension analysis · Replica and retrieval process
1 Introduction

In recent years, a large number of distributed file systems have emerged for big data storage, including the Google File System (GFS), the Hadoop Distributed File System (HDFS), Taobao File System, Tencent File System, and Facebook Haystack [1]. Among these, HDFS is the most widely used distributed file system, and it has a strong capacity for storing data at a large scale. At the same time, by using multi-replica technology and erasure-coding technology, HDFS guarantees high fault tolerance and reliability of data storage. HDFS uses the multi-replica technology to ensure data reliability and fault tolerance in a cluster, and its strategies of replica placement, replica management, and replica retrieval all build on the multi-replica mechanism. HDFS allows the cluster to
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_122
Load Balance Analysis-Based Retrieval Strategy …
913
be built on multiple inexpensive regular computers, especially in a cluster of homogeneous computers with similar storage capacity. However, HDFS is not well suited to heterogeneous file systems [2]. In terms of replica placement, the three-replica mechanism adopted by HDFS cannot meet the load balance requirements among nodes in a heterogeneous environment, causing the nodes with good performance to sit idle and the nodes with bad performance to be occupied frequently. In terms of replica management, HDFS employs a static management strategy, which makes the system inflexible: once a replica of a file is placed, the distribution of this replica in the cluster will not be changed, and thus HDFS cannot capture the changing characteristics of real-time data access. The mismatch of storage media may result in the ineffective utilization of high-performance storage resources. In terms of replica retrieval, HDFS selects the DataNode closest in the network topology to the client who initiated the read request. This strategy based on network distance is too simple and, in extreme cases, easily causes undesirable results: it does not take into account the inherent performance differences between the storage media in a heterogeneous environment, nor the differences in the real-time load of each DataNode, causing a few nodes to undertake a large number of read requests, which in turn leads to unbalanced loads in the HDFS cluster. Also, we consider the heterogeneity of data for the processing and storage units in the nodes of an HDFS, as shown in Fig. 1.
Fig. 1 Hot levels of data and processing rate of storage
This paper focuses on the problem that an HDFS file system cannot dynamically adjust heterogeneous storage resources when processing files at different hot levels, and proposes a replica optimization strategy for distributed file systems to achieve storage load balance among heterogeneous computer nodes.
2 Replica Management and Replica Retrieval in HDFS

In the HDFS file system, almost every file block has multiple replicas. The number of replicas depends on the setting of the replication factor for the file to which a file block
914
D. Lin et al.
belongs, and it is equal to the value of the replication factor [3]. When the HDFS client writes a file to HDFS, we can set the number of replicas of the file by actively specifying the value of replication factor. If there is no active setting, the default replication factor for file blocks in HDFS is 3. The structure of a HDFS is shown in Fig. 2.
Fig. 2 Primary structure of HDFS
2.1 Replica Management Strategy

Replica management refers to the various operations performed by the system on a replica over its entire life cycle, such as location migration and backup recovery. Replica management is an important strategy to enhance data reliability and overall system performance. HDFS's default replica storage strategy not only improves the system's fault tolerance and data reliability but also reduces the amount of data stored per unit of storage. Therefore, in order to fully utilize the advantages of the multi-replica technology, HDFS uses a replica management strategy for storing files. Replica management strategies can be divided into two major categories, namely static replica management strategies and dynamic replica management strategies, briefly introduced as follows [4]:

(1) A static replica management strategy will not actively adjust a stored replica. Once the replica of the data is placed, its location will no longer be changed unless a client actively initiates instructions or the node which keeps the replica is damaged.

(2) With a dynamic replica management strategy, HDFS can adjust the location and number of replicas in real time according to the I/O load in the cluster, the real-time
status of each node, and the file access characteristics, thereby achieving lower storage cost and higher overall I/O efficiency. HDFS adopts the static replica management strategy by default. Once the data replicas are placed in the cluster, the system will not actively adjust their location and number throughout the entire life cycle of file storage. Of course, if the user initiates a migration instruction, for example executing a balance command to balance the storage load of the DataNodes, the location of the data replicas in the HDFS system will still be adjusted by the instruction. In addition, if a DataNode in HDFS crashes, there will be a replica migration, but it is not actually a migration of the original replica; rather, a new replica is created. Obviously, the static replica management strategy adopted in HDFS cannot meet the needs of heterogeneous HDFS clusters and has significant defects [5]:

(1) A few files are hardly accessed anymore after they are written. If these files are placed in a storage node with high read and write performance, the performance advantages of the hardware will not be fully utilized; at the same time, the number of replicas will remain at a high level, and thus the writing process wastes storage resources.

(2) A few files will be accessed frequently after they are written. If these files are placed in a storage node with low read and write performance, the access delay of these files will be increased, thereby reducing the access efficiency of the entire system.

(3) The static replica management strategy cannot perceive the access characteristics of data, cannot dynamically place data at the most suitable location based on factors such as the frequency and number of accesses, nor can it dynamically adjust the number of replicas of the data.
2.2 Replica Retrieval Algorithm

Replica retrieval refers to the process of selecting the most suitable replica from the multiple replicas of the same file block. It usually occurs when the HDFS client initiates a request to read a file. There are multiple replicas of the same file block, and these replicas are usually placed on different DataNodes. These DataNodes have different network distances from the HDFS client that initiated the read request, and their storage media types may also differ. The real-time status of these nodes depends on their own individual differences, such as CPU usage, memory usage, and I/O load. Therefore, the replica retrieval algorithm should consider these factors and select the DataNode where the most suitable replica is located to serve the data read request. HDFS's default replica retrieval algorithm for reading data is relatively simple. In order to reduce network transmission costs and data transmission delay, HDFS tries to select the replica closest to the HDFS client for reading, by referring to the node distance defined in HDFS. If the requesting HDFS client is itself a DataNode and has a replica, HDFS will directly read this replica; if there is a replica on the same rack where the HDFS client is located, HDFS will read that replica. In summary, the basic idea of the algorithm can be summarized as follows [6]:
(1) Select a local replica. If the HDFS client who issued the read request is itself located on a DataNode in the cluster, and there is a replica of the file block on that DataNode, then the HDFS client reads the replica directly;

(2) Select a replica on the same rack. If there is a replica of the file block on another DataNode in the rack where the HDFS client is located, the HDFS client directly reads the replica on the same rack;

(3) Randomly select a replica. If neither of the above methods can find a suitable replica location, a replica is randomly selected for reading.

In HDFS, the default replica retrieval algorithm for data reading can significantly reduce the overall network bandwidth consumption of the system, save data reading time, and improve the throughput of the entire system. However, when HDFS clusters gradually become heterogeneous, the above algorithm has the following obvious deficiencies in selecting replicas [7]:

(1) When selecting a replica, the algorithm only considers the network topology distance to the client who initiated the read request and does not consider the performance differences of the storage media between the DataNodes where the replicas are located. According to HDFS's default replica retrieval algorithm, when a file is stored on both a solid-state drive node and a mechanical hard disk node, HDFS may well serve the read request from the mechanical hard disk node at a very low read rate, resulting in low hardware utilization.

(2) When selecting a replica, the algorithm does not consider the real-time performance of the DataNode where the replica is located for the read operation, which depends on the CPU usage, I/O load, memory usage, and real-time available bandwidth for data transmission.
According to HDFS's default replica retrieval algorithm, it is likely that a large number of read requests will be distributed to the same DataNode, so that the DataNode is overloaded and its performance dramatically decreases.
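For reference, the default selection order described in rules (1) to (3) above can be sketched as follows; the node/rack representation and function names are illustrative, not actual HDFS APIs:

```python
import random

def drra_select(replica_nodes, client_node, client_rack):
    """Sketch of HDFS's default replica retrieval order.
    replica_nodes: list of (node_id, rack_id) pairs holding a replica."""
    # (1) local replica: the client itself is a DataNode holding one
    for node, rack in replica_nodes:
        if node == client_node:
            return node
    # (2) same-rack replica
    for node, rack in replica_nodes:
        if rack == client_rack:
            return node
    # (3) otherwise pick a replica at random
    return random.choice(replica_nodes)[0]

nodes = [("dn1", "r1"), ("dn2", "r2"), ("dn3", "r3")]
print(drra_select(nodes, client_node="dn9", client_rack="r2"))  # dn2
```

The sketch makes the deficiency visible: nothing in the decision depends on the medium type or on the real-time load of the chosen DataNode.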
3 Methods/Experimental

Load balance is a key issue that must be considered in all distributed file systems. The main consideration of load balance in HDFS is to balance the load at the DataNode level when an HDFS client reads and writes files in the system [8]. In this section, we introduce a load balance method based on the comprehensive load of nodes; this method only considers the DataNodes in HDFS. Specifically, HDFS first collects the system indicators in real time, then computes the comprehensive load value of all DataNodes, and finally allocates read and write requests among the DataNodes by a certain algorithm [9]. In the following, we first introduce the relevant methods used in this section, then introduce the method of computing the comprehensive load, and finally propose a load balance method.
3.1 Computation of Comprehensive Load

In HDFS, multiple replicas of the same file block are scattered over different DataNodes. In fact, a DataNode is composed of multiple resources, including CPU, memory, and network. In order to simplify the design, we assume that only one type of storage is used on each DataNode, and the comprehensive load is computed at the DataNode level for load balance. In fact, a storage unit can be viewed as a DataNode; that is, multiple storage units on the same DataNode can be regarded as multiple DataNodes. In this paper, we define DN as the set of all DataNodes and DN(r) as the DataNodes contained within the scope r. For example, if the collection of DataNodes in the entire HDFS is represented by DN(HDFS) and there are n DataNodes in the cluster, then DN(HDFS) = {DN1, DN2, ..., DNn}, in which DNi represents the ith DataNode. When HDFS selects a DataNode for writing or reading a file, it only considers the network distance of a node instead of its real-time status. In order to schedule read and write requests in a heterogeneous HDFS cluster environment, load balance mechanisms need to be introduced to distribute the read and write requests. A good load balance algorithm usually takes into account the real-time performance of the nodes in a cluster, and thus it is necessary to propose a method to quantify the real-time state of the nodes. To this end, we propose the concept of comprehensive load to evaluate the degree of workload of a DataNode over a certain period, with a value range of [0, 1]; over a period of time, a larger load value indicates a heavier load on the node. In summary, we discuss the concept of comprehensive load at the DataNode level and consider the real-time status factors of a node, including CPU, memory, and storage medium, to compute the comprehensive load.
In the following, we address the details of these factors and the process of computing the comprehensive load.

3.1.1 CPU Load

The CPU is the computation and control center of a distributed file system, and its real-time performance is primarily determined by the state of the system. We denote the CPU load as loadcpu, with a value range of [0, 1]. Specifically, we use the system load to measure the performance of the CPU, indicating the length of the waiting queue of processes, i.e., the proportion of processes which are currently occupying the CPU or waiting to use the CPU. To simplify the statistics, most operating systems adopt the average load to describe the system load, which refers to the average load on the system over a period of time. For example, in a Linux system, the average load within 1 min, 5 min, and 15 min is computed statistically, and we can monitor the change of status by entering the command cat /proc/loadavg at the terminal. In a single-core computer, an average load with a value of 1 means that the number of processes waiting for the CPU during this period has reached the maximal workload of the CPU. In a multi-core computer, when the average load is equal to the number of cores, the number of processes waiting for the CPU within this period has reached the maximal workload of the CPU. In order to test the real-time performance, we select the average load within 1 min as the indicator; at the same time, in order to unify the range of data across heterogeneous nodes, we represent the CPU load as the ratio of the average load to the number
of cores: load(1 min) represents the average load of the CPU within 1 min, and core_num represents the number of cores. In order to restrict the final result within a certain range, we set the result to 1 when the computed value is greater than 1, since the load has then exceeded the working capacity of the CPU and the CPU cannot work normally in the short term.

loadcpu = load(1 min) / core_num   (1)
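Equation (1), with the clamp to 1 described above, can be computed as follows; on a real Linux node the inputs could come from `os.getloadavg()[0]` and `os.cpu_count()`, but fixed example values are used here:

```python
def cpu_load(load_1min, core_num):
    """Eq. (1): 1-minute average load divided by the core count,
    clamped to 1 when the load exceeds the CPU's working capacity."""
    return min(1.0, load_1min / core_num)

# On a 4-core machine, an average load of 2.0 gives loadcpu = 0.5;
# an average load of 6.0 saturates at 1.0.
print(cpu_load(2.0, 4), cpu_load(6.0, 4))
```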
3.1.2 Memory Load

Memory is a critical component of a computer system and the bridge between the CPU and the hard disk. All data reading and writing operations in a computer system must be completed in memory. Therefore, the performance of a computer is greatly dependent on the utilization of memory. We denote the memory load as loadmem, with a value range of [0, 1], representing the memory utilization, i.e., the percentage of memory occupied by the computer system relative to the total available memory. In a Linux system, the available amount of memory is monitored by the system, and its value can be obtained by entering the command cat /proc/meminfo. This command reads a file in the system which stores all the information about system memory. Table 1 shows the primary parameters of memory.

Table 1 Primary parameters of memory in Linux

MemTotal: The total amount of memory available to the system
MemFree: The amount of unused memory in the system
MemAvailable: The amount of memory available to applications
Buffers: The amount of memory used to buffer files in the system
Cached: The amount of memory occupied by the cache
According to the parameters shown in Table 1, we compute the memory load as the ratio of the amount of used memory to the total amount of memory:

loadmem = (MemTotal − MemFree) / MemTotal   (2)
3.1.3 I/O Load of the Storage Medium

The storage medium here refers to external storage, i.e., the storage devices other than memory and the CPU cache in a computer system, which store data permanently and stably. External storage can be divided into solid-state drives and mechanical hard disks
according to the media type, and different storage media have different I/O performance. We denote the I/O load as loadio, with a value range of [0, 1], representing the I/O load of the storage medium. Since different storage media have different I/O performance, we can use the frequency of I/O operations per second or the throughput to describe the performance of a storage medium. Of course, a computer may have multiple external storage devices; here, the I/O load only considers the external storage devices which are configured as the storage unit. In HDFS, when a client initiates a request to a DataNode to access the replica of a file block, the DataNode will first read this replica from the physical storage medium and then transmit it to the remote client through the network. Therefore, the I/O load matters greatly in the HDFS system: if the I/O load of the physical storage medium is heavy, the time to read a replica is prolonged significantly, resulting in slow read requests. In order to fully reflect the I/O load of the storage medium, we select the I/O utilization rate to describe the I/O load. In Linux, the system monitors the status of external storage in real time, and the I/O status of each external storage unit can be shown by entering commands at the terminal. Specifically, two pieces of external storage information are listed in the device tab, corresponding to the two types of disks in the computer. At the end of these two lines, the parameter "%util" represents the percentage of the disk's I/O utilization. In a computer, a higher I/O usage corresponds to a heavier I/O load. At the same time, due to the heterogeneity of storage media, different storage media with the same I/O usage can have quite different I/O loads.
With the same I/O usage, the I/O load of a storage medium with a large throughput will be higher than that of a storage medium with a small throughput; thus, the I/O load of a solid-state drive is higher than that of a mechanical hard disk. In order to unify the computation method and fully consider the heterogeneity of storage media, we compute the I/O load as in Eq. (3), where utilio represents the I/O utilization of the storage medium and speed indicates the I/O speed of the storage medium, with maximum speedmax. Their ratio represents the current level of the I/O rate of the storage medium, and the I/O load equals the I/O rate multiplied by the actual I/O usage. In order to restrict the final result within a certain range, we set the I/O load to 1 when its value is larger than 1, since the I/O load has then exceeded the working capacity of the storage medium, indicating that it cannot process read and write requests in a short time.

loadio = utilio × speed / speedmax   (3)
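Equation (3) can be sketched with the same clamp-to-1 convention; the speed figures below are arbitrary illustrative throughputs (e.g., MB/s), not measurements from the paper:

```python
def io_load(util_io, speed, speed_max):
    """Eq. (3): I/O utilization scaled by the medium's relative speed,
    clamped to 1 when the result exceeds the medium's capacity."""
    return min(1.0, util_io * speed / speed_max)

# At the same 60% utilization, a medium running at the cluster maximum
# speed carries a heavier load than one at a quarter of that speed:
print(io_load(0.6, 500, 500))  # 0.6
print(io_load(0.6, 125, 500))  # 0.15
```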
3.1.4 Space Usage of the Storage Medium

The space usage of a storage medium refers to the proportion of space used in the storage medium, and we use utildisk to denote the space utilization of the storage medium. Similarly, we only consider the size of the directory in the external storage which is configured as HDFS storage, which depends on the configuration in hdfs-site.xml. HDFS is a storage system for storing a large amount of data, and an overhead of storage
is inevitable, since a large number of read and write requests may occur at any time. If the real-time storage space is not considered for load balance, we may face situations of uneven data distribution: when one DataNode's storage space is almost full, the other DataNodes may be completely empty. Therefore, in order to quantify the status of the storage space, we use the space utilization of the storage medium as a measurement indicator. In Linux, the system monitors the storage usage of each external memory in real time, and the usage of each disk in the system can be shown by entering commands at the terminal, as shown in Fig. 3.
Fig. 3 Storage usage of each external memory
In Fig. 3, each row represents the storage usage of a file system, and the columns list the total space size (1K-blocks), used space size (Used), available space size (Avail), and percentage of space used (Use%) of the file system in 1024-byte blocks. For example, the first row shows that the total space of the /dev/xvda1 file system contains 41,152,832 1K-blocks; of these, 14,954,928 blocks have been used, the number of available blocks is 24,084,420, and 39% of the space is used. Mathematically, we compute the space utilization as the ratio of used space to the total storage space, shown in Eq. (4), where sizeused indicates the size of the used storage space of a DataNode and sizetotal represents the total available storage space of a DataNode.

utildisk = sizeused / sizetotal   (4)
In the computation of the comprehensive load, we consider the factors of CPU load, memory load, I/O load, and storage space utilization, each with a value range of [0, 1]. Based on these four factors, we compute the comprehensive load Load(DNi, t) as

Load(DNi, t) = ω1 loadcpu(i)(t) + ω2 loadmem(i)(t) + ω3 loadio(i)(t) + ω4 utildisk(i)(t)   (5)

In Eq. (5), the comprehensive load Load(DNi, t) is the sum of the CPU load, memory load, I/O load, and storage space utilization with different weights, where the weights ω1, ω2, ω3, ω4 each range within [0, 1] and satisfy

Σk ωk = 1   (6)
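Equations (5) and (6) combine the four per-node factors into one score; a minimal sketch, where the equal weights are an arbitrary choice satisfying Eq. (6), not values prescribed by the paper:

```python
def comprehensive_load(load_cpu, load_mem, load_io, util_disk,
                       weights=(0.25, 0.25, 0.25, 0.25)):
    """Eq. (5): weighted sum of the four per-node load factors,
    each already normalized to [0, 1]."""
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1 (Eq. 6)"
    w1, w2, w3, w4 = weights
    return w1 * load_cpu + w2 * load_mem + w3 * load_io + w4 * util_disk

# Combining the example values from Eqs. (1)-(4) above:
print(comprehensive_load(0.5, 0.75, 0.15, 0.39))
```

Because every factor and every weight lies in [0, 1] and the weights sum to 1, the result also lies in [0, 1], as required for the comprehensive load.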
3.2 File Retrieval Method for Load Balance

The default HDFS replica retrieval algorithm (DRRA) only considers the network delay between the client and the DataNode when selecting a replica. When the client reads a file, the NameNode first obtains the LocatedBlock object of the file for each data block. The LocatedBlock object lists all the DataNodes that store a replica of the data block, and the list is sorted by the network distance from the client, so the closest DataNode is at the top of the list. However, the default retrieval strategy neither takes into account the individual differences in the real-time status of the nodes nor optimizes the load balance. To solve this problem, we introduce a load balance method based on the comprehensive load of nodes in addition to the network distance. The optimized replica retrieval algorithm (ORRA), shown in Fig. 4, is introduced in the following.
Fig. 4 Optimized replica retrieval algorithm (ORRA)
(1) After the client initiates a request to read a file, the NameNode queries the metadata to find the DataNodes on which the replicas are located and obtains the corresponding nodeList.
(2) Determine whether the requesting node is a DataNode.
(3) If the requesting node Node_c is a DataNode in the nodeList and its comprehensive load does not exceed the threshold Load_threshold, the node is directly selected as the target node to provide the read service and opens the local replica file to read the data.
(4) If the requesting node Node_c is not a DataNode, does not exist in the nodeList, or its comprehensive load exceeds the threshold Load_threshold, the chooseAppropriate() method is called with the range limit (nodeList) passed as a parameter. A target node in the nodeList is then used to provide the read service, and data transmission is performed through the network.
D. Lin et al.
(5) If the replica exists in the memory medium of the node selected above, it is read from memory with high priority.

The procedures above constitute the optimized replica retrieval algorithm. They retain the rule of choosing a local node with the highest priority, since a local node can read the data directly instead of transmitting it over the network. If there is no replica to read at the local node, we use the comprehensive load to construct a service queue and select an appropriate node based on the service rate, preferring nodes with a lower comprehensive load and storage media with better performance. The proposed optimization mechanism can reduce the latency of reading data, improve the overall I/O efficiency of the system, and maintain load balance between nodes.
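The selection logic of steps (1)-(5) can be sketched as below. The node structure, the threshold value, and the ordering used inside chooseAppropriate are illustrative assumptions for this sketch, not the authors' implementation.

```python
# Illustrative sketch of the ORRA replica-selection steps (1)-(5).
# Node fields and the threshold value are assumptions for illustration.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    load: float          # comprehensive load, Eq. (5)
    in_memory: bool      # replica resides in the memory medium
    distance: int        # network distance from the client

LOAD_THRESHOLD = 0.8     # assumed value of Load_threshold

def choose_appropriate(node_list):
    # Prefer lightly loaded nodes; prefer memory-resident replicas and
    # shorter network distance as tie-breakers.
    return min(node_list, key=lambda n: (n.load, not n.in_memory, n.distance))

def select_replica_node(client_name, node_list):
    # Steps (2)-(3): serve locally if the client is a DataNode holding a
    # replica and its comprehensive load is below the threshold.
    for node in node_list:
        if node.name == client_name and node.load <= LOAD_THRESHOLD:
            return node
    # Step (4): otherwise choose a suitable node from the nodeList.
    return choose_appropriate(node_list)

nodes = [Node("dn1", 0.9, False, 0), Node("dn2", 0.3, True, 2),
         Node("dn3", 0.5, False, 1)]
chosen = select_replica_node("dn1", nodes)   # dn1 is overloaded -> dn2 chosen
```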
4 Results and Discussion

The purpose of this experiment is to verify that the optimized HDFS replica retrieval algorithm (ORRA) fully considers the heterogeneity between DataNodes, in both network distance and real-time performance, to allocate file-read requests reasonably, thereby achieving load balance, improving the overall I/O efficiency of the system, and reducing the average delay of file reading in HDFS.

4.1 Experiment Design

In our experiment, a cluster is divided into three racks containing 18 DataNodes in total, as shown in Fig. 5. The experiment compares the two replica retrieval algorithms by reading a certain number of files from HDFS. We first write 400 files into HDFS; each file is 1 GB in size and comprises 16 blocks. The initial placement follows the default storage strategy of HDFS: two replicas are stored on mechanical hard disks (HDD), and the third replica is stored on a solid-state drive (SSD). The experiment runs the default replica retrieval algorithm (DRRA) of HDFS and the optimized replica retrieval algorithm (ORRA) of this paper, respectively. To make the results more reliable, we run multiple experiments with the number of files to read set to 200, 400, 800, 1000, and 1500, respectively. The arrival of file-read requests follows a Poisson distribution.

4.2 Experiment Analysis

Tables 2 and 3 show the results of file reading based on the DRRA algorithm and the ORRA algorithm, respectively. From Tables 2 and 3, we can draw the following conclusion: the proposed ORRA algorithm consistently outperforms the default DRRA algorithm of HDFS. Specifically, the cumulative file-reading time of the ORRA algorithm is shorter than that of the DRRA algorithm, as shown in Fig. 6a. Also,
Fig. 5 Simulation scenario of our experiment
Table 2 Results of file reading based on the DRRA algorithm

Number of files read | Accumulated reading time (s) | Frequency of hitting SSD | Average reading time per file (s) | Average rate of file reading (MB/s)
200     | 2101.28   | 1001 | 10.5064 | 97.4645
400     | 4583.40   | 2074 | 11.4585 | 89.3661
800     | 9545.18   | 4352 | 11.9315 | 85.8234
1000    | 11,825.69 | 4976 | 11.8257 | 86.5912
1500    | 18,115.03 | 7752 | 12.0767 | 84.7915
Average |           |      | 11.8386 | 86.4966
the frequency of hitting SSD with the ORRA algorithm is higher than that with the DRRA algorithm, as shown in Fig. 6b. Overall, the number of SSD hits when reading files with the ORRA algorithm is significantly higher than with the DRRA algorithm, increasing by 19.44% on average; the average time to read a single file with the ORRA algorithm is shorter than with the DRRA algorithm, reducing the single-file reading delay by 10.59% on average; and the average reading rate with the ORRA algorithm is significantly higher than with the DRRA algorithm, increasing the data reading rate by 11.84% on average.
Table 3 Results of file reading based on the ORRA algorithm

Number of files read | Accumulated reading time (s) | Frequency of hitting SSD | Average reading time per file (s) | Average rate of file reading (MB/s)
200     | 1744.43   | 1339 | 8.7221  | 117.4024
400     | 3793.91   | 2835 | 9.4848  | 107.9625
800     | 7939.45   | 5424 | 9.9243  | 103.1809
1000    | 10,376.98 | 6087 | 10.3770 | 98.6800
1500    | 17,427.19 | 8388 | 11.6181 | 88.1381
Average |           |      | 10.5851 | 96.7396
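The aggregate gains quoted in the text (19.44% more SSD hits, 10.59% lower delay, 11.84% higher rate) can be recomputed directly from the entries of Tables 2 and 3:

```python
# Recompute the aggregate gains reported in the text from Tables 2 and 3.
drra_hits = [1001, 2074, 4352, 4976, 7752]        # Table 2, SSD hits
orra_hits = [1339, 2835, 5424, 6087, 8388]        # Table 3, SSD hits
drra_avg_time, orra_avg_time = 11.8386, 10.5851   # average seconds per file
drra_avg_rate, orra_avg_rate = 86.4966, 96.7396   # average MB/s

hit_gain = sum(orra_hits) / sum(drra_hits) - 1    # ~0.1944: 19.44% more hits
delay_cut = 1 - orra_avg_time / drra_avg_time     # ~0.1059: 10.59% lower delay
rate_gain = orra_avg_rate / drra_avg_rate - 1     # ~0.1184: 11.84% higher rate
```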
5 Conclusions

This paper proposes an optimized replica retrieval algorithm (ORRA) for file retrieval in HDFS. Comparing the proposed algorithm with the default replica retrieval algorithm of HDFS, the experiments show the following. The proposed ORRA algorithm selects high-performance nodes far more frequently than the DRRA algorithm; it reduces the latency of file retrieval by 10.59% on average and increases the reading rate by 11.84% on average compared with the latter. In other words, the proposed ORRA dramatically reduces the delay of reading data in the system and improves the overall reading efficiency. In summary, the proposed ORRA algorithm outperforms the DRRA algorithm, the default algorithm of HDFS. The optimized replica retrieval algorithm proposed in this paper effectively allocates file-read requests among DataNodes in view of the heterogeneity of storage media and the real-time performance of the nodes, achieving load balance. At the system level, the average delay of data reading decreases, the overall reading efficiency improves, and the allocation of file-read requests is well balanced.
Fig. 6 Performance of different algorithms: a cumulative time of file reading; b frequency of hitting SSD
References

1. Ciritoglu HE, Murphy J, Thorpe C (2019) HaRD: a heterogeneity-aware replica deletion for HDFS. J Big Data 6(1):94
2. Kim HG (2018) Effects of design factors of HDFS on I/O performance. J Comput Sci 14(3):304–309
3. Cui L, Hao Z et al (2018) SnapFiner: a page-aware snapshot system for virtual machines. IEEE Trans Parallel Distrib Syst 29(11):2613–2626
4. Yongcai T, Yang B, Lei S (2018) Management mechanism of dynamic cloud data replica based on availability. J Chin Comput Syst 39(3):490–495
5. Lian Y, Lei MA, Chuanai L et al (2018) On dynamic replication strategies of HDFS cloud storage based on RS erasure codes. Technol Innov Appl
6. Xu X, Yang C, Shao J (2017) Data replica placement mechanism for open heterogeneous storage systems. Proc Comput Sci 109:18–25
7. Li H, Li H, Wen Z et al (2017) Distributed heterogeneous storage based on data value. In: 2017 IEEE 2nd information technology, networking, electronic and automation control conference (ITNEC). IEEE, pp 264–271
8. Gupta H, Vahid A, Ghosh SK et al (2017) iFogSim: a toolkit for modeling and simulation of resource management techniques in the Internet of Things, edge and fog computing environments. Softw Pract Exp 47(9):1275–1296
9. Sonmez C, Ozgovde A, Ersoy C (2018) EdgeCloudSim: an environment for performance evaluation of edge computing systems. Trans Emerg Telecommun Technol 29(11):3493
Brain-Inspired Maritime Network Framework with USV

Xin Sun1(B), Tingting Yang2, Kun Shi3, and Huapeng Cao4

1 Navigation College, Dalian Maritime University, Dalian, China
Sunny [email protected]
2 School of Electrical Engineering & Intelligentization, Dongguan University of Technology, Dongguan, China
[email protected]
3 Department of Control Science and Technology, Zhejiang University, Hangzhou 310027, China
[email protected]
4 Navigation College, Dalian Maritime University, Dalian, China
[email protected]
Abstract. Communication has been completely integrated into our life and has laid a foundation for investigation and exploration of the maritime domain. Intelligent unmanned surface vessels (USVs) are envisioned to perform front-line services in the space-air-ground-sea integrated network. USVs clearly improve the safety level of navigation, but they face a series of challenges in the space-air-ground-sea integrated network, such as data fusion and signal processing. Inspired by the brain's powerful ability to integrate vestibular and visual information, we propose a novel brain-inspired maritime information intelligent integration framework. First, we study the signal processing and the lidar and vision sensor detection of the USV. Then, we propose to solve the problems of chaotic integration and low efficiency by using Bayesian theory. Finally, the simulation results analyze the effect of the discrepancy of likelihood functions on information integration and fully verify the superiority of the brain-inspired information integration framework carried by the USV.

Keywords: Space-air-ground-sea integrated network · Multi-sensor integration · Signal processing · Brain-inspired · Bayesian theory
1 Introduction

Reliable maritime communication not only meets the needs of normal navigation and operation of vessels but is also an essential basic guarantee of safety at sea. The USV is an important application of unmanned technology in the surface environment. Owing to these characteristics, USVs can operate in the space-air-ground-sea integrated network, in contrast to traditional vessels. Liu et al. [1] first studied the space-air-ground integrated network. Yang et al. [2] proposed an integrated space-air-ground-sea communication control system based on mobile computing technology to rescue a water container. Bibuli et al. [3] explained that the navigation system can continuously provide attitude, speed, course, and position information for vessels. Wang et al. [4] utilized the high accuracy of lidar and vision sensors to detect targets. However, the existing parallel extraction methods for multi-objective image features are susceptible to the marine climate, and the processing of multi-sensor information is not yet mature. This paper addresses this problem by referring to the way the brain processes information.

Our brains have evolved over time to recognize external information quickly and effectively. Experimental data from Fetsch et al. [5] showed that the brain integrates visual and vestibular cues to infer heading direction in the way that Bayesian inference predicts. At the same time, there are many optimal integration methods for multi-sensory information. For example, Alais et al. [6] disclosed the integration of visual and auditory cues for inferring object location. Jacobs et al. [7] leveraged motion and texture cues for depth perception. This paper refers to the interactive neural network model proposed by Zhang et al. [8]. Information processing in the dorsal medial superior temporal area (MSTd) and the ventral intraparietal area (VIP) of the brain is used to build a brain-inspired system, which we apply to USVs in the space-air-ground-sea integrated network.

The remainder of this paper is organized as follows. In Sect. 2, the model of brain-inspired perception technology applied to the USV is proposed; the cues are integrated by calculating their likelihood functions, and the processed information is used for USV signal processing and data fusion in the space-air-ground-sea integrated network. The simulation results and case study are analyzed in Sect. 3. Section 4 presents the conclusions and future work.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_123
2 Application of Brain-Inspired Sensing Technology in USV

In this section, we focus on how to apply brain-inspired sensing technology to USVs in the space-air-ground-sea integrated network. Multi-sensor information is processed through the integration and separation of two interconnected modules. We innovatively apply brain-inspired sensing technology to the processing of USV multi-sensor information, which improves the accuracy of information integration when multiple targets coincide (Fig. 1).

In this paper, we choose the collaborative approach of a lidar and a vision sensor to simulate the information processing of the MSTd and VIP areas of the brain. The lidar detects the location of the target, while the visual sensor detects the shape and texture of the target. When the amount of information in the marine environment explodes, the USV needs to operate in two states: active search and passive
Fig. 1 Schematic diagram of a brain-inspired system
exploration. When we need to collect, track, or identify information, the USV actively uses its sensors to search for target cues; the USV is in the passive state when exposed to dangers such as stormy weather or biological attacks.

To begin with, we introduce the probabilistic model of multi-sensory integration. The position input is treated as a vector and then transformed into a circular variable for posterior comparison. We let $s_m$ denote the vector angle and $k_m$ the concentration (the length of the vector), where m = 1, 2 represents the lidar and visual sensors, respectively. The likelihood function is written $p(x_m \mid s_m)$, where $x_m$ is the direct cue value of $s_m$ (such as the visual cue to MSTd) and $x_l$ is the indirect cue value of $s_m$ (such as the vestibular cue to MSTd). Since the direction angle is a cyclic variable on $(-\pi, \pi)$, we use the von Mises distribution for the theoretical analysis and express the likelihood function as

$$p(x_m \mid s_m) = \frac{1}{2\pi I_0(k_m)} \exp\big(k_m \cos(x_m - s_m)\big) \equiv \mathcal{M}(x_m \mid s_m, k_m) \quad (1)$$
where $I_0(k_m)$ is the modified Bessel function of the first kind and order zero and acts as the normalization factor, and $x_m$ is the mean of the von Mises distribution, whose concentration $k_m$ controls the reliability of $s_m$.

In previous studies, integration and segregation have often been expressed as questions of causal inference. In general, the prior of causal reasoning consists of more than one component, each corresponding to a causal structure that describes the relationship between multiple pieces of information. This paper adopts the single-component prior used in many multi-sensory integration studies. The integration prior $p(s_1, s_2)$ describes the probability of $(s_1, s_2)$ occurring simultaneously and can be expressed as

$$p(s_1, s_2) = (2\pi)^{-1}\, \mathcal{M}(s_1 - s_2;\, 0,\, k_s) \quad (2)$$

$$= \big[(2\pi)^2 I_0(k_s)\big]^{-1} \exp\big(k_s \cos(s_1 - s_2)\big) \quad (3)$$
This prior reflects that two information cues of the same goal have similar weights. The parameter $k_s$ determines the extent to which the two cues should be integrated; integration is complete as $k_s$ goes to infinity. The marginal prior $p(s_m)$ is defined as a uniform distribution. Previous studies
have shown that cues are integrated or segregated according to Bayesian theory. The posterior over the two stimuli given both cues follows from Bayes' theorem and yields the optimal multi-sensory integration:

$$p(s_1, s_2 \mid x_1, x_2) \propto p(x_1 \mid s_1)\, p(x_2 \mid s_2)\, p(s_1, s_2) \quad (4)$$

By exchangeability, we only calculate the posterior of $s_1$, obtained by marginalizing the joint posterior above. We take $p(s_m)$ and $p(x_m)$ to be uniform, i.e., equal to $1/(2\pi)$, which is the marginal-prior uniformity condition:

$$p(s_1 \mid x_1, x_2) \propto p(x_1 \mid s_1) \int_{-\pi}^{+\pi} p(x_2 \mid s_2)\, p(s_1, s_2)\, \mathrm{d}s_2 \quad (5)$$

$$\propto p(s_1 \mid x_1)\, p(s_1 \mid x_2) \quad (6)$$

$$\approx \mathcal{M}(s_1;\, x_1,\, k_1)\, \mathcal{M}(s_1;\, x_2,\, k_{2s}) \quad (7)$$
According to the above equations, the posterior given the combined cues is equal to the product of the posteriors given each single cue. Note that $p(s_1 \mid x_2) \propto \int p(x_2 \mid s_2)\, p(s_1, s_2)\, \mathrm{d}s_2$ is approximated by $\mathcal{M}(s_1; x_2, k_{2s})$ through equating the mean resultant length of the distribution. Finally, since the product of two von Mises distributions is again a von Mises distribution, the posterior is $p(s_1 \mid x_1, x_2) = \mathcal{M}(s_1; \hat{s}_1, \hat{k}_1)$, whose mean $\hat{s}_1$ and concentration $\hat{k}_1$ are obtained from the moment relation

$$\hat{k}_1 e^{j\hat{s}_1} = k_1 e^{j x_1} + k_{2s} e^{j x_2} \quad (8)$$
The above equation is the Bayesian optimal integration result in von Mises form and is the criterion for judging whether the integration of cues is optimal. Likewise, according to the previous calculation, we can define the difference information between the two cues as

$$p_d(s_1 \mid x_1, x_2) \propto p(s_1 \mid x_1)\, /\, p(s_1 \mid x_2) \quad (9)$$
This is the ratio of the posteriors given each cue, which measures the discrepancy between the estimates from the two cues. Taking the expectation of $\log p_d$ over the distribution $p(s_1 \mid x_1)$ gives the divergence between the two single-cue posteriors. By using the periodicity of the von Mises distribution, i.e., $-\cos(s_1 - x_2) = \cos(s_1 - x_2 - \pi)$, Eq. (9) can be written as

$$p_d(s_1 \mid x_1, x_2) \propto p(s_1 \mid x_1)\, p(s_1 \mid x_2 + \pi) \quad (10)$$

$$\propto \mathcal{M}(s_1;\, x_1,\, k_1)\, \mathcal{M}(s_1;\, x_2 + \pi,\, k_{2s}) \quad (11)$$
Therefore, the differential information between two cues can also be expressed as the product of the posterior of the direct cue and indirect cue, with the cue
direction deviating by $\pi$. In fact, a derivation similar to Eqs. (5) and (10) leads to the same integration framework with the stimulus prior $p(s_1 \mid s_2)$ shifted by $\pi$ in angle. As in Eq. (8), the mean $\Delta\hat{s}_1$ and concentration $\Delta\hat{k}_1$ of $p_d(s_1 \mid x_1, x_2) = \mathcal{M}(s_1; \Delta\hat{s}_1, \Delta\hat{k}_1)$ can be expressed by

$$\Delta\hat{k}_1 e^{j\Delta\hat{s}_1} = k_1 e^{j x_1} - k_{2s} e^{j x_2} \quad (12)$$
Equation (12) is the criterion for determining whether the difference information between two cues satisfies Bayesian theory. This section has focused on the integration of information by brain-inspired sensing technology: multiple cues are handled through the integration and separation of two interconnected modules. The brain-inspired sensing technology is applied to the USV in the space-air-ground-sea integrated network, which improves the accuracy of information fusion when multiple targets coincide.
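As a compact illustration of this section, the sketch below evaluates the von Mises likelihood of Eq. (1) and combines two cues by the moment relations of Eqs. (8) and (12); the cue values are arbitrary, and this is a minimal sketch rather than the authors' implementation.

```python
import numpy as np

# Eq. (1): von Mises likelihood M(x_m | s_m, k_m).
def von_mises(x, s, k):
    return np.exp(k * np.cos(x - s)) / (2 * np.pi * np.i0(k))

# Eqs. (8)/(12): integrate (sign = +1) or segregate (sign = -1) two cues
# by adding or subtracting their mean resultant vectors k * e^{jx}.
def combine(k1, x1, k2s, x2, sign=+1.0):
    v = k1 * np.exp(1j * x1) + sign * k2s * np.exp(1j * x2)
    return abs(v), float(np.angle(v))   # (concentration, mean direction)

# The likelihood is a proper density on the cyclic range (-pi, pi).
xs = np.linspace(-np.pi, np.pi, 10001)
dx = xs[1] - xs[0]
total = von_mises(xs, 0.5, 2.0).sum() * dx          # ~1.0

# Two congruent cues (x1 = x2 = 0.3 rad): integration adds the
# concentrations, segregation keeps only their difference.
k_int, s_int = combine(2.0, 0.3, 1.5, 0.3, +1.0)    # (3.5, 0.3)
k_seg, s_seg = combine(2.0, 0.3, 1.5, 0.3, -1.0)    # (0.5, 0.3)
```

This reproduces the qualitative behavior of Figs. 2 and 3: congruent cues yield a sharper (more concentrated) distribution under integration than under segregation.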
3 Performance Analysis

In this section, we provide simulation results and a case study, showing how the neural integration of information is applied to the processing of USV multi-sensor information in the space-air-ground-sea integrated network.

3.1 Simulation
The simulation results of cue integration using brain-inspired perception technology are shown in Figs. 2 and 3. Both the integration and segregation operations are tuned toward the same direction (0°). We set α1 = 0.4 and α2 = 0.5, where αm represents the amplitude of cue m. The functional relationship between the decoded circular probability distribution and the firing rate (which expresses the degree of integration) is shown in Figs. 2 and 3. In Fig. 2, the firing rates of cue 1 and cue 2 peak at 0°, reaching 33 and 8, respectively, and the firing rate obtained by combining the two cues for integration peaks at 40. In Fig. 3, the firing rates of cue 1 and cue 2 at 0° are 30 and 0, respectively, and the firing rate obtained by combining the two cues for segregation reaches only 25. It can be clearly observed that the tuning curve after the integration operation (Fig. 2) is more advantageous than that after the segregation operation (Fig. 3). Because the circular probabilities of both cues' range and azimuth in (−π, π) point in the same direction, the tuning curve of integration is higher than that of segregation. Thus, the tuning characteristics of the integration and segregation operations emerge naturally.

3.2 Case Study
This part provides a case study of the applied brain-inspired technology. The target surroundings are detected by lidar and visual sensors, as shown in
Fig. 2 Tuning curves of cue integration
Fig. 3 Tuning curves of cue segregation
Fig. 4, when USVs are sailing at sea. Mn denotes lidar detection information, and the information transmitted by the visual sensor is denoted Vn, temporarily calibrated by a rectangular box (the uncertain target). The ellipse is a region of interest created by the combination of visual sensor and lidar cues. We consider the following situations in this paper.
Fig. 4 The experiment case
In general, there is no error in the detection of a single or distant target; USVs can make correct judgments on their own without interference. Therefore, the region of interest in Fig. 4a is the correct combination of lidar and visual cues, forming a real-time scene through brain-inspired integration processing. However, when the USV carries out scene prediction, a lidar detection may be integrated with a closer visual sensing position. In Fig. 4b, M3 integrates with V2 while M2 and V3 become invalid information, owing to the large deviation at long distance. The lidar detections M4, M5, and M6 are so close to each other that the whole detection result is integrated incorrectly, as shown
in Fig. 4c. Because M6 and V5 are close to each other, M5 is integrated with the visual returns V4 and V6, leaving no visual information to be integrated with M4. The correct integration, predicted by the likelihood function, is shown on the right of Fig. 4b, c. The brain-inspired technology can make accurate judgments even when image positions or information overlap, which is of great significance for the restoration of marine real-time scenes.
4 Conclusion

Over millions of years of evolution, the brain has developed effective strategies for processing information quickly. To verify the feasibility of this strategy at sea, this paper applies brain-inspired perception technology to USVs in the space-air-ground-sea integrated network. Based on Bayesian theory, the brain-inspired system integrates information from multiple sensors either actively or passively. The feasibility of applying brain-inspired sensing technology to USVs in the space-air-ground-sea integrated network is then illustrated by simulation and a case study. The integrated information is used to solve the problems of data fusion and signal processing. The study can still be improved: this paper only makes decisions about the integration of two types of information, and the next step is to enable the brain-inspired system to process information from more sensors, which would yield a better system.

Acknowledgements. This work was supported in part by the Natural Science Foundation of China under Grant 61771086, the Dalian Outstanding Young Science and Technology Talents Foundation under Grant 2017RJ06, the Liaoning Province Xingliao Talents Program under Grant XLYC1807149, the Guangdong Province Basic and Applied Basic Research Foundation under Grant 2019B1515120084, and the project "The Verification Platform of Multi-tier Coverage Communication Network for Oceans (PCL2018KP002)".
References

1. Liu J, Shi Y, Fadlullah ZM, Kato N (2018) Space-air-ground integrated network: a survey. IEEE Commun Surv Tutor 20(4):2714–2741
2. Yang T, Guo Y, Zhou Y, Wei S (2019) Joint communication and control for small underactuated USV based on mobile computing technology. IEEE Access 99:1
3. Bibuli M, Caccia M, Lapierre L, Bruzzone G. Guidance of unmanned surface vehicles: experiments in vehicle following. IEEE Robot Autom Mag 19(3):92–102
4. Wang Q, Xufei C, Yibing L, Fang Y. Performance enhancement of a USV INS/CNS/DVL integration navigation system based on an adaptive information sharing factor federated filter. Sensors 17(2):239
5. Fetsch CR, DeAngelis GC, Angelaki DE (2013) Bridging the gap between theories of sensory cue integration and the physiology of multisensory neurons. Nat Rev Neurosci 14(6):429–442
6. Alais D, Burr D (2004) The ventriloquist effect results from near-optimal bimodal integration. Curr Biol 14(3):257–262
7. Jacobs RA. Optimal integration of texture and motion cues to depth. Vis Res 39(21):3629
8. Zhang WH, Wong KYM, Wang H, Wu S (2019) Congruent and opposite neurons as partners in multisensory integration and segregation. In: APS meeting
Line-of-Sight Rate Estimation Based on Strong Tracking Cubature Kalman Filter for Strapdown Seeker

Kaiwei Chen(B), Laitian Cao, Xuehui Shao, and Xiaomin Qiang

Beijing Aerospace Automatic Control Institute, Beijing 100854, China
[email protected]
Abstract. This paper addresses the inability of the strapdown seeker to measure the line-of-sight (LOS) rate directly and the strongly nonlinear relative motion equations of missile and target. Strong tracking theory is combined with the cubature Kalman filter (CKF), and a novel nonlinear filter for LOS rate estimation, the strong tracking CKF (STCKF), is presented. On the basis of the relative motion relationship of missile and target, the LOS rate reconstruction model is derived. Then, the STCKF algorithm is obtained by introducing a suboptimal fading factor into the predicted error covariance of the CKF. Finally, the validity and feasibility of the proposed algorithm are verified by simulation. Simulation results show that the proposed STCKF maintains strong tracking ability under abrupt state changes. A comparison with the CKF shows that the STCKF improves the LOS rate estimation precision for a strapdown seeker with better robustness and adaptability.

Keywords: LOS rate estimation · Strong tracking CKF · Strapdown seeker
1 Introduction

In the strapdown guidance system, the line-of-sight (LOS) angles couple with the body attitude motion, and only body-LOS angles can be measured, whereas a gimbaled seeker directly produces the LOS rate used for guidance command calculations. The difficulty in extracting the LOS rate stems from the strong nonlinearity and serious measurement noise of the body-LOS angle information [1]. Hence, estimating the LOS rate to the accuracy required by the guidance system is a key issue for strapdown seeker applications. Many estimation theories applicable to LOS rate estimation have been investigated [2–4]. All the mentioned algorithms have some problems, such as high computational complexity, or poor filtering accuracy and poor numerical stability when the system is strongly nonlinear and high dimensional. To improve the estimation performance, the cubature Kalman filter (CKF) [5] was utilized for nonlinear estimation. Moreover, to avoid the poor robustness and filtering divergence caused by model uncertainties, Zhou [6] proposed a new concept, the strong tracking filter (STF), and solved the state estimation problem of a class of nonlinear systems. To this end, this paper proposes an effective nonlinear algorithm to estimate the LOS rate for the strapdown seeker, combining strong tracking theory with the CKF and deriving the estimation model. Simulation results show that the proposed strong tracking CKF (STCKF) exhibits the advantages of both the STF and the CKF, along with faster calculation speed and better numerical stability.

This paper is structured as follows: Sect. 2 defines the reference coordinate systems and angles; the LOS angle reconstruction method is then presented, and the LOS rate calculation model is developed according to the relative motion relationship of the missile and the target. In Sect. 3, the state equation and measurement equation are derived, and the STCKF estimation algorithm is proposed. In Sect. 4, simulations are conducted to verify the performance of the proposed LOS rate estimation algorithm, and STCKF is compared with CKF. Section 5 concludes this paper.

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_124
2 Model of LOS Rate Estimation

2.1 Coordinate Systems

To study the LOS rate estimation algorithm of the strapdown seeker, the inertial, body, LOS, and body-LOS coordinate systems are introduced. The commonly used inertial coordinate system $O_i x_i y_i z_i$, body coordinate system $O_b x_b y_b z_b$, and the transformations between them can be referred to [7]. The definitions of the LOS coordinate system $O_s x_s y_s z_s$ and the body-LOS coordinate system $O_l x_l y_l z_l$ are shown in Fig. 1.

Fig. 1 Definitions of coordinate systems: a inertial coordinate system and LOS coordinate system; b body coordinate system and body-LOS coordinate system
2.2 LOS Angle Reconstruction Method

In the strapdown guidance system, the seeker is fixed rigidly on the missile body, so the seeker optical axis coincides with the body axis $O_b x_b$, and only the body-LOS azimuth angle $\varepsilon_y$ and elevation angle $\varepsilon_z$ can be obtained. Combining these with the missile attitudes measured by the rate gyros of the inertial navigation system, we can reconstruct the LOS azimuth angle $q_y$ and elevation angle $q_z$ by decoupling the missile attitude information from the detected information. As shown in Fig. 1, the vector of the target is $(R_T, 0, 0)$ in both the LOS coordinate system and the body-LOS coordinate system. According to the transformation relationships among the coordinate systems, we have

$$L(\psi, \vartheta, \gamma)\, L(\varepsilon_y, \varepsilon_z) \begin{bmatrix} R_T \\ 0 \\ 0 \end{bmatrix} = L(q_y, q_z) \begin{bmatrix} R_T \\ 0 \\ 0 \end{bmatrix} \quad (1)$$

The angles $q_y$ and $q_z$ can be obtained from

$$\begin{bmatrix} q_y & q_z \end{bmatrix}^T = \begin{bmatrix} -\arctan\left(\dfrac{z}{x}\right) & \arcsin(y) \end{bmatrix}^T \quad (2)$$

where

$$\begin{cases} x = \cos\vartheta \cos\psi \cos\varepsilon_z \cos\varepsilon_y + \sin\varepsilon_z(-\sin\vartheta \cos\psi \cos\gamma + \sin\psi \sin\gamma) - \cos\varepsilon_z \sin\varepsilon_y (\sin\vartheta \cos\psi \sin\gamma + \sin\psi \cos\gamma) \\ y = \sin\vartheta \cos\varepsilon_z \cos\varepsilon_y + \cos\vartheta \cos\gamma \sin\varepsilon_z + \cos\varepsilon_z \sin\varepsilon_y \cos\vartheta \sin\gamma \\ z = -\cos\vartheta \sin\psi \cos\varepsilon_z \cos\varepsilon_y + \sin\varepsilon_z(\sin\vartheta \sin\psi \cos\gamma + \cos\psi \sin\gamma) - \cos\varepsilon_z \sin\varepsilon_y(-\sin\vartheta \sin\psi \sin\gamma + \cos\psi \cos\gamma) \end{cases} \quad (3)$$
2.3 LOS Rate Calculation Model

Let $r_m = (x_m, y_m, z_m)^T$ denote the position vector of the missile centroid and $a_m = (a_{xm}, a_{ym}, a_{zm})^T$ the acceleration vector of the missile in the inertial coordinate system; $r_t = (x_t, y_t, z_t)^T$ and $a_t = (a_{xt}, a_{yt}, a_{zt})^T$ are the corresponding target vectors. The relative position between missile and target can be described as

$$\mathbf{r} = (x_r, y_r, z_r)^T = \begin{bmatrix} x_t - x_m \\ y_t - y_m \\ z_t - z_m \end{bmatrix} \quad (4)$$
The relative acceleration between the missile and the target is obtained as follows: ⎡
⎤ ⎡ ⎤ ⎡ ⎤ x¨ r x¨ t − x¨ m axt − axm a = ⎣ y¨ r ⎦ = ⎣ y¨ t − y¨ m ⎦ = ⎣ ayt − aym ⎦ z¨r z¨t − z¨m azt − azm
(5)
The LOS angles can be expressed as

$$q_z = \arctan\left(\frac{y_r}{\sqrt{x_r^2 + z_r^2}}\right) \quad (6)$$
$$q_y=\arctan\left(-\frac{z_r}{x_r}\right)\tag{7}$$

Differentiating (6) and (7) twice with respect to time obtains

$$\ddot{q}_z=\frac{\ddot{y}_r\sqrt{x_r^{2}+z_r^{2}}}{r^{2}}-\dot{q}_y^{2}\,\frac{y_r\sqrt{x_r^{2}+z_r^{2}}}{r^{2}}-\frac{2\dot{r}}{r}\dot{q}_z-\frac{y_r\left(x_r\ddot{x}_r+z_r\ddot{z}_r\right)}{r^{2}\sqrt{x_r^{2}+z_r^{2}}}\tag{8}$$

$$\ddot{q}_y=-2\dot{q}_y\frac{\dot{r}}{r}+2\dot{q}_y\dot{q}_z\frac{y_r}{\sqrt{x_r^{2}+z_r^{2}}}-\frac{x_r\ddot{z}_r-z_r\ddot{x}_r}{x_r^{2}+z_r^{2}}\tag{9}$$

Geometrical relationships exist, such as

$$\begin{cases}x_r=r\cos q_z\cos q_y\\ y_r=r\sin q_z\\ z_r=-r\cos q_z\sin q_y\\ \sqrt{x_r^{2}+z_r^{2}}=r\cos q_z\end{cases}\tag{10}$$

With (10) substituted into (8) and (9), the simplified forms of (8) and (9) are given as

$$\ddot{q}_z=-\frac{2\dot{r}}{r}\dot{q}_z+\frac{\ddot{y}_r}{r}\cos q_z-\dot{q}_y^{2}\sin q_z\cos q_z-\frac{\left(\ddot{x}_r\cos q_y-\ddot{z}_r\sin q_y\right)\sin q_z}{r}\tag{11}$$

$$\ddot{q}_y=-2\dot{q}_y\frac{\dot{r}}{r}+2\dot{q}_y\dot{q}_z\tan q_z-\frac{\ddot{x}_r\sin q_y+\ddot{z}_r\cos q_y}{r\cos q_z}\tag{12}$$

For a stationary or uniformly moving target, the target acceleration $a_t$ is equal to zero. Thus, the motion equations of the LOS can be derived as follows:

$$\ddot{q}_z=-\frac{2\dot{r}}{r}\dot{q}_z-\frac{a_{ym}}{r}\cos q_z-\dot{q}_y^{2}\sin q_z\cos q_z+\frac{\left(a_{xm}\cos q_y-a_{zm}\sin q_y\right)\sin q_z}{r}\tag{13}$$

$$\ddot{q}_y=-2\dot{q}_y\frac{\dot{r}}{r}+2\dot{q}_y\dot{q}_z\tan q_z+\frac{a_{xm}\sin q_y+a_{zm}\cos q_y}{r\cos q_z}\tag{14}$$
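Equations (13) and (14) are the LOS dynamics the filter in Sect. 3 propagates. A minimal Python sketch of their right-hand sides (illustrative only; the function name and argument order are our own):

```python
import math

def los_accelerations(q_z, qd_z, q_y, qd_y, r, rd, a_xm, a_ym, a_zm):
    """Right-hand sides of Eqs. (13) and (14): LOS angular accelerations for a
    non-maneuvering target. Angles in rad, rates in rad/s, r in m, rd = r_dot."""
    # Eq. (13): elevation LOS acceleration
    qdd_z = (-2.0 * rd / r * qd_z
             - a_ym / r * math.cos(q_z)
             - qd_y ** 2 * math.sin(q_z) * math.cos(q_z)
             + math.sin(q_z) * (a_xm * math.cos(q_y) - a_zm * math.sin(q_y)) / r)
    # Eq. (14): azimuth LOS acceleration
    qdd_y = (-2.0 * qd_y * rd / r
             + 2.0 * qd_y * qd_z * math.tan(q_z)
             + (a_xm * math.sin(q_y) + a_zm * math.cos(q_y)) / (r * math.cos(q_z)))
    return qdd_z, qdd_y
```

For example, with a closing range (r = 1000 m, r_dot = -300 m/s), zero missile acceleration and q_dot_z = 0.01 rad/s, only the -2(r_dot/r)q_dot_z term survives in (13).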
3 LOS Rate Estimation Based on STCKF

3.1 State Equation and Measurement Equation

The state vector is selected as

$$(x_1,x_2,x_3,x_4)^{T}=\left(q_z,\dot{q}_z,q_y,\dot{q}_y\right)^{T}\tag{15}$$

According to (13) and (14), the state equation can be written as

$$\begin{cases}\dot{x}_1=x_2\\[2pt] \dot{x}_2=-\dfrac{2\dot{r}}{r}x_2-\dfrac{a_{ym}}{r}\cos x_1-x_4^{2}\sin x_1\cos x_1+\dfrac{\sin x_1\left(a_{xm}\cos x_3-a_{zm}\sin x_3\right)}{r}\\[2pt] \dot{x}_3=x_4\\[2pt] \dot{x}_4=-2x_4\dfrac{\dot{r}}{r}+2x_4x_2\tan x_1+\dfrac{a_{xm}\sin x_3+a_{zm}\cos x_3}{r\cos x_1}\end{cases}\tag{16}$$
The measurement equation can be described as

$$\begin{bmatrix}x_1\\x_3\end{bmatrix}=\begin{bmatrix}1&0&0&0\\0&0&1&0\end{bmatrix}\begin{bmatrix}x_1\\x_2\\x_3\\x_4\end{bmatrix}\tag{17}$$
The state and measurement equations possess strong nonlinearity; as a result, the LOS rate cannot be directly estimated by a linear Kalman filter. Therefore, a nonlinear filtering method is considered.

3.2 STCKF Estimation Algorithm

According to the strong tracking state estimation algorithm, the Kalman gain and the corresponding error covariance need to be adjusted online by introducing a suboptimal fading factor into the predicted error covariance. The suboptimal fading factor is introduced into the prediction error covariance equation of the CKF as follows:

$$P^{*}_{k|k-1}=\lambda_k\left[\frac{1}{2n}\sum_{i=1}^{2n}X^{*}_{i,k/k-1}\left(X^{*}_{i,k/k-1}\right)^{T}-\hat{x}_{k/k-1}\hat{x}^{T}_{k/k-1}\right]+Q_{k-1}\tag{18}$$
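For reference, the propagated cubature points $X^{*}_{i,k/k-1}$ in (18) come from the standard CKF construction: $2n$ points at $\hat{x}\pm\sqrt{n}\,s_i$, where $s_i$ are columns of a square root of the covariance. A simplified Python sketch for a diagonal covariance (an assumption made here for illustration; a full implementation would use a Cholesky factor):

```python
import math

def cubature_points(x_hat, p_diag):
    """Generate the 2n cubature points for mean x_hat and a diagonal
    covariance given by the variances p_diag (simplified square root)."""
    n = len(x_hat)
    points = []
    for sign in (1.0, -1.0):
        for i in range(n):
            p = list(x_hat)
            p[i] += sign * math.sqrt(n * p_diag[i])  # x_hat +/- sqrt(n) * s_i
            points.append(p)
    return points
```

Averaging the points recovers the mean, which is why (18) subtracts $\hat{x}_{k/k-1}\hat{x}^{T}_{k/k-1}$ from the sample second moment.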
where $\lambda_k\ge 1$ is the suboptimal fading factor and can be determined as follows:

$$\lambda_k=\begin{cases}\lambda_0, & \lambda_0\ge 1\\ 1, & \lambda_0<1\end{cases},\qquad \lambda_0=\frac{\mathrm{tr}(N_k)}{\mathrm{tr}(M_k)}\tag{19}$$

$$N_k=V_k-H_kQ_{k-1}H_k^{T}-\beta R_k\tag{20}$$

$$M_k=H_kF_{k/k-1}P_{k-1}F_{k/k-1}^{T}H_k^{T}=H_k\left(P^{(s)}_{k/k-1}-Q_{k-1}\right)H_k^{T}\tag{21}$$

$$V_k=\begin{cases}\varepsilon_1\varepsilon_1^{T}, & k=1\\[4pt] \dfrac{\rho V_{k-1}+\varepsilon_k\varepsilon_k^{T}}{1+\rho}, & k>1\end{cases},\qquad \varepsilon_k=z_k-\hat{z}_{k|k-1}\tag{22}$$

$$H_k=\left.\frac{\partial h_k(x_k,u_k)}{\partial x_k}\right|_{x_k=\hat{x}_{k/k-1}}\tag{23}$$

$$F_{k/k-1}=\left.\frac{\partial f_{k-1}(x_{k-1},u_{k-1})}{\partial x_{k-1}}\right|_{x_{k-1}=\hat{x}_{k-1}}\tag{24}$$
where $P^{(s)}_{k/k-1}=F_{k/k-1}P_{k-1}F_{k/k-1}^{T}+Q_{k-1}$ is the prediction error covariance matrix without the suboptimal fading factor, $\beta\ge 1$ is a preselected softening factor, and $0<\rho\le 1$ is a forgetting factor. We need to calculate the Jacobian matrices of the nonlinear functions according to (19) to (24), which goes against algorithm implementation. With the use of the statistical
linear error propagation methodology [8], the innovation covariance matrix and the cross-covariance matrix without the suboptimal fading factor are approximated by

$$P^{(s)}_{zz,k/k-1}=E\left[\left(z_k-\hat{z}_{k/k-1}\right)\left(z_k-\hat{z}_{k/k-1}\right)^{T}\right]\approx H_kP^{(s)}_{k/k-1}H_k^{T}+R_k\tag{25}$$

$$P^{(s)}_{xz,k/k-1}=E\left[\left(x_k-\hat{x}_{k/k-1}\right)\left(z_k-\hat{z}_{k/k-1}\right)^{T}\right]\approx P^{(s)}_{k/k-1}H_k^{T}\tag{26}$$

Hence

$$H_k=\left[\left(P^{(s)}_{k/k-1}\right)^{-1}P^{(s)}_{xz,k/k-1}\right]^{T}=\left(P^{(s)}_{xz,k/k-1}\right)^{T}\left(P^{(s)}_{k/k-1}\right)^{-1}\tag{27}$$

Substituting (25) to (27) into (20) and (21), $N_k$ and $M_k$ can be obtained by

$$N_k=V_k-\left(P^{(s)}_{xz,k/k-1}\right)^{T}\left(P^{(s)}_{k/k-1}\right)^{-1}Q_{k-1}\left(P^{(s)}_{k/k-1}\right)^{-1}P^{(s)}_{xz,k/k-1}-\beta R_k\tag{28}$$

$$M_k=P^{(s)}_{zz,k/k-1}+N_k-V_k+(\beta-1)R_k\tag{29}$$
Thus, we can use (28) and (29) to calculate the suboptimal fading factor $\lambda_k$, substitute it into (18), and amend the predicted error covariance. Then, we reevaluate the cubature points and the propagated cubature points and reestimate the predicted measurement, the innovation covariance matrix, and the cross-covariance matrix on the basis of the CKF algorithm. In this way, we obtain the adjusted Kalman gain and achieve online adjustment of the corresponding error covariance.
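The online adjustment described above can be illustrated with a scalar (single-state, single-measurement) specialization of (19) and (27)-(29). This is a toy sketch with hypothetical variable names, not the full matrix algorithm:

```python
def fading_factor(V, Pzz, Pxz, P, Q, R, beta=1.0):
    """Scalar specialization of the suboptimal fading factor computation.
    V: innovation covariance estimate (Eq. 22), Pzz/Pxz: Eqs. (25)/(26),
    P: predicted state covariance, Q/R: process/measurement noise, beta >= 1."""
    H = Pxz / P                          # scalar form of Eq. (27)
    N = V - H * Q * H - beta * R         # scalar form of Eq. (28)
    M = Pzz + N - V + (beta - 1.0) * R   # scalar form of Eq. (29)
    lam0 = N / M                         # trace ratio of Eq. (19) (traces are trivial here)
    return lam0 if lam0 >= 1.0 else 1.0  # Eq. (19): clamp below at 1
```

When the measured innovations V are large relative to what the filter predicts, the returned factor exceeds 1 and inflates the predicted covariance in (18); otherwise the filter reduces to the plain CKF.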
4 Simulations

A trajectory simulation example is adopted to verify the proposed algorithm. The initial simulation conditions are as follows: the measurement noise covariance matrix is $R=3\times10^{-4}I_2$, the process noise covariance matrix is $Q=1\times10^{-7}I_4$, and the a priori state covariance matrix is $P_0=1\times10^{-6}I_4$. We estimate the LOS rate after the seeker captures the target by using the STCKF and compare its performance against the CKF. For a fair comparison, we perform 250 independent Monte Carlo runs. To compare the nonlinear filters' performance, we use the root-mean-square error (RMSE) of $q_z$, $q_y$, $\dot{q}_z$ and $\dot{q}_y$. Figures 2a–d show the RMSEs in the LOS angle and the LOS rate for the CKF and the STCKF. Table 1 shows the average RMSEs of the two filters. We can see that the STCKF converges faster than the CKF and that its estimation accuracy is better. The simulation results indicate that the STCKF is superior to the CKF in terms of both estimation accuracy and convergence rate.
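For clarity, the RMSE plotted at each time instant is computed over the 250 Monte Carlo runs in the usual way; a short Python sketch (illustrative, with a hypothetical function name):

```python
import math

def rmse(errors):
    """Root-mean-square error over the per-run estimation errors
    (estimate minus truth) at one time instant."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

Repeating this at every sample time for each of $q_z$, $\dot{q}_z$, $q_y$, $\dot{q}_y$ yields the four curves of Fig. 2.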
[Figure 2: four panels of RMSE versus time T (s) over 30–50 s: (a) RMSEs in LOS elevation angle, (b) RMSEs in the elevation LOS rate, (c) RMSEs in LOS azimuth angle, (d) RMSEs in the azimuth LOS rate.]

Fig. 2. RMSEs in LOS angle and LOS rate (solid: STCKF, dashed: CKF)

Table 1. Average RMSEs of two filters

Filter    qz (°)    q̇z (°/s)    qy (°)    q̇y (°/s)
CKF       0.51      0.25        0.11      0.07
STCKF     0.07      0.08        0.06      0.05
5 Conclusion

This paper investigates LOS rate estimation for a strapdown seeker. A novel nonlinear filter for LOS rate estimation, called the strong tracking CKF (STCKF), is presented. The performance of the proposed STCKF was examined through a numerical simulation. A comparison with the CKF shows that the STCKF provides better performance, with higher estimation accuracy and a faster convergence rate in LOS rate estimation. Thus, the proposed LOS rate estimation algorithm provides a theoretical basis for engineering applications of the strapdown seeker.
References 1. Vergez PL, McClendon JR (1982) Optimal control and estimation for strapdown seeker guidance of tactical missiles. J Guidance Control Dyn 225–226
2. Waldmann J (2002) Line-of-sight rate estimation and linearizing control of an imaging seeker in a tactical missile guided by proportional navigation. IEEE Trans Control Syst Technol 10(4):556–567
3. Ra W-S, Whang I-H, Ahn J-Y (2005) Robust horizontal line-of-sight rate estimator for sea skimming anti-ship missile with two-axis gimballed seeker. IEE Proc-Radar Sonar Navig 152(1):9–15
4. Ra W-S, Whang I-H (2011) Time-varying line-of-sight rate estimator with a single modified tracking index for RF homing guidance. Int J Control Autom Syst 9(5):857
5. Arasaratnam I, Haykin S (2009) Cubature Kalman filters. IEEE Trans Autom Control 54(6):1254–1269
6. Zhou DH, Xi YG, Zhang ZJ (1992) A suboptimal multiple fading extended Kalman filter. Chin J Autom 4(2):145–152
7. Zarchan P (2007) Tactical and strategic missile guidance, 5th edn. AIAA
8. Lee D-J (2008) Nonlinear estimation and multiple sensor fusion using unscented information filtering. IEEE Signal Process Lett 15:861–864
Malicious Behavior Catcher: An Intrusion Detection System Based on VAE

Linna Fan1,2(B), Jiahai Yang1, and Tianjian Mi3

1 Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing, China
[email protected], [email protected]
2 School of Information and Communications, National University of Defense Technology, Xi'an, China
3 School of Materials Science and Engineering, Xi'an University of Technology, Xi'an, China
[email protected]
Abstract. A network intrusion detection system is an important part of maintaining network security. In this paper, we present a novel VAE-based network intrusion detection system. The VAE is used to obtain a stable latent representation of the input vector, realizing dimension reduction. The lower-dimensional latent representation is then sent to a random forest classifier to obtain the final result. We compare our model with the baseline method on KDD Cup'99 and NSL-KDD, and the evaluation shows that our model outperforms the baseline in accuracy, recall, F-score and false alarm rate. Besides, our model requires much less training time and consumes fewer computational and storage resources, which makes it more suitable for low-resource scenarios.

Keywords: Intrusion detection system · VAE · Machine learning

1 Introduction
With the development of the Internet, the number of attacks has increased rapidly over the years [1]. An intrusion detection system (IDS) plays an important role in maintaining network security: it monitors the traffic passing through it to check whether attacks exist. An IDS usually employs one of two methods for protection. One is the signature-based/rule-based method; the other is the machine learning-based method. Industry usually applies signature-based approaches, mainly because of their lower false alarm rate compared with machine learning-based approaches. Another reason is that signature-based approaches do not need to gather data and train on it. However, signature-based approaches also have their inherent limitations. With the elapse of time, increasing
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021
Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_125
attacks will emerge, and signature-based approaches must add new rules corresponding to each new attack. The rule library thus becomes increasingly complex and hard to maintain. Because of these limitations, machine learning-based approaches are becoming the focus of researchers. As the future direction of IDS, machine learning-based approaches face more challenges because of the increasing volume of traffic, the limited resources of some devices such as IoT devices, high accuracy requirements, and the diversity and dynamics of traffic as IoT devices proliferate. Considering the challenges above, we propose a machine learning-based approach named malicious behavior catcher (MBC). It is lightweight and has higher accuracy and a lower false alarm rate. The rest of the paper is organized as follows: Section 2 discusses related work on IDS, mainly focusing on machine learning approaches. Section 3 proposes the MBC framework. Section 4 presents the experimental results and performance of the MBC. At last, Sect. 5 concludes the paper.
2 Related Work

IDS has been extensively researched in the past. The methods can be divided into two categories: signature-based approaches and machine learning-based approaches. Signature-based approaches detect malicious behavior through a pre-built library; once a new attack is found, the library adds new rules according to the attack. References [2–9] are signature-based approaches, which are usually designed for certain attacks or protocols. The library becomes increasingly complex and hard to maintain.
Machine learning-based approaches for IDS were also extensively researched in the past [10–14]. These solutions do not consider the limited resources available when the system is deployed on a simple gateway; thus, the training cost is too high. In recent years, some solutions for the IoT environment have been proposed. Machine learning-based approaches can be divided into supervised learning and unsupervised learning. Amouri et al. [15] suggested an IDS using a decision tree to detect attacks in the IoT environment. The experiment showed that the method can distinguish malicious from normal behavior, but it was executed in simulation rather than in a real environment. Doshi et al. [16] applied random forest, K-nearest neighbors and neural networks to detect DDoS attacks; features include packet size, inter-arrival time between packets and the protocols used. The method can only detect DDoS attacks, and specific DDoS attack types cannot be distinguished. Shukla [17] detected wormhole attacks in the IoT 6LoWPAN environment through K-means and decision trees; this experiment was also executed in simulation. McDermott et al. [18] proposed using a bidirectional long short-term memory-based recurrent neural network (BLSTM) combined with word embedding to detect botnets. The method uses the text in the packet info field, packet size and protocols as the input of the BLSTM to detect normal traffic, Mirai traffic, UDP flood traffic and ACK
flood traffic. It cannot identify other types of attacks. Moustafa et al. [19] used Zeek to extract basic features of network traffic and then extracted further features through a script. These features relate to flows, MQTT, HTTP and DNS. After that, they chose the least correlated features according to the correlation coefficient and used an AdaBoost ensemble consisting of a decision tree, Naive Bayes and an artificial neural network for classification. Shone et al. [20] suggested using two asymmetric stacked autoencoders combined with a random forest to classify attacks on the KDD Cup'99 and NSL-KDD datasets. There are also some works using unsupervised learning. Meidan et al. [21] suggested applying an autoencoder to detect botnets in the IoT environment. Features used for training include packet size, packet count, and the mean and variance of the interval time between packets. The method trains the autoencoder on normal traffic and sets an appropriate threshold to detect attacks. The experiment was verified in a real environment, and the method can identify botnets. Mirsky et al. proposed Kitsune [22], an unsupervised NIDS. It collects statistical features of the traffic and clusters them through hierarchical clustering. The clustered features are then sent to several autoencoders, whose outputs are fed into a dense layer to obtain a mean square error; with an appropriate threshold on the mean square error, it can detect malicious traffic. Unsupervised learning requires training on normal traffic and can only classify traffic as normal or malicious behavior; it cannot distinguish specific attacks. Besides, when new devices join the network, the traffic characteristics also change, which forces the model to be retrained.
3 Proposed Methodology

Our model leverages a VAE to obtain the latent representation of the input features. The lower-dimensional representation is then sent into a random forest classifier to obtain the final result.

3.1 VAE
An autoencoder is a neural network that is trained to attempt to copy its input to its output [23]. Between the input layer and the output layer there are several hidden layers. The dimension of a hidden layer is usually less than that of the input and output layers, so the hidden layer can be seen as a latent representation of the input vector, which realizes dimension reduction. After the autoencoder was proposed, an increasing number of variants emerged, such as the regularized autoencoder, the stacked autoencoder, the variational autoencoder (VAE) and so on. Among these models, the VAE is more stable and is based on variational inference.
A VAE consists of an encoder and a decoder. Each training sample $x$ can be seen as generated from a latent variable $z$, which is a low-dimensional hidden representation of $x$. Generating $x$ given $z$ is the function of the decoder, which can be expressed as $p_\theta(x|z)$. The decoder sampling result $\tilde{x}$ should approximate $x$ as much as possible. However, $\tilde{x}$ is sampled from a probability distribution, so it contains some noise and will not be exactly equal to $x$. The latent variable $z$ also comes from sampling: it can be seen as sampled from the training data $x$, and the encoder can be expressed as $q_\phi(z|x)$. To give the VAE better generation ability, the variational approximate posterior $q_\phi(z|x)$ should be a Gaussian distribution with a diagonal covariance structure:

$$q_\phi(z|x)=\mathcal{N}\left(z;\mu,\sigma^{2}I\right)\tag{1}$$

The total loss $L$ is the sum of the losses of the individual data samples, and the loss for sample $x_i$ can be expressed as:

$$l_{x_i}=-E_{q_\phi(z|x_i)}\left[\log p_\theta(x_i|z)\right]+\mathrm{KL}\left[q_\phi(z|x_i)\,\|\,p_\theta(z)\right]\tag{2}$$
The loss consists of two parts. The first part is the reconstruction loss; it makes the output $\tilde{x}$ as similar to $x$ as possible. The second part is the KL divergence; it makes $q_\phi(z|x)$ as similar as possible to the prior distribution $p_\theta(z)$ of $z$. From the loss, we can conclude that the VAE has two functions: one is reconstructing the input data as faithfully as possible; the other is making the probability distribution $q_\phi(z|x)$ obey a normal distribution. Through the VAE, we can transform the input features into a lower-dimensional latent representation $z$.

3.2 Malicious Behavior Catcher (MBC)
The structure of our model MBC can be seen in Fig. 1. It consists of a VAE encoder and a random forest. We use a random forest to obtain the final result because it is computationally efficient. Our VAE encoder consists of two hidden layers. It first compresses the 41-dimensional input feature vector of a dataset sample (as described in Sect. 4.1) into 20-dimensional features and then into 10-dimensional features. After that, the 10-dimensional features are sent into the random forest classifier, and we obtain the final classification result. During the training process, the VAE encoder is trained first to minimize the loss in equation (2) and obtain the VAE parameters. Then, the random forest classifier is trained to fit the data. Random forest is an efficient supervised learning classifier and is widely used in classification. A random forest consists of many randomly created decision trees. Although a decision tree is a weak learner, the combination of many weak learners generates a strong learner, the random forest. Random forest also has the advantages of robustness and better generalization, which makes it suitable for intrusion detection scenarios.
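For the diagonal-Gaussian encoder of equation (1), the KL term in equation (2) against a standard normal prior has the well-known closed form $\frac{1}{2}\sum_j(\sigma_j^2+\mu_j^2-1-\log\sigma_j^2)$, which keeps VAE training cheap. A pure-Python sketch of that term (illustrative only; the actual model is trained with TensorFlow, and the function name is ours):

```python
import math

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL[N(mu, diag(exp(log_var))) || N(0, I)] for the
    diagonal-Gaussian encoder of Eq. (1); mu and log_var are per-dimension lists."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))
```

A latent code that already matches the prior (mu = 0, sigma = 1) contributes zero KL, so this term only penalizes codes that drift away from the standard normal.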
4 Evaluation

Our experiments were executed using the TensorFlow backend. Evaluations were performed on a 64-bit Windows 10 Lenovo laptop with an Intel Core i5-6200U CPU, 8 GB RAM and no GPU.
Fig. 1. MBC model.
We use two datasets, KDD Cup'99 and NSL-KDD, to validate our model and compare it with S-NDAE [20]. These two datasets are commonly used in the field of IDS. We use the measures of accuracy, precision, recall, F-score and false alarm rate to compare our model with S-NDAE [20].

4.1 Datasets
Our experiments are executed on KDD Cup'99 and NSL-KDD, which are widely used benchmark datasets for IDS.

KDD Cup'99. The KDD Cup'99 dataset was used by DARPA for IDS evaluation [24]. It was also used for the international knowledge discovery and data mining competition. The data covers about 9 weeks of network traffic, containing emulated normal user data, audit data and various kinds of attacks. The first 7 weeks' data consists of about 5 million connections, and the remaining 2 weeks' data contains about 2 million connections. The features have 41 dimensions, including duration, protocol type, service, flag and so on. The traffic can be divided into five categories: normal, DoS, U2R, R2L and probe. Normal is normal traffic. DoS stands for a denial-of-service attack, such as a SYN flood. R2L is unauthorized access from a remote machine, e.g., password guessing. U2R is unauthorized access to local superuser privileges, such as various buffer overflow attacks. Probe covers surveillance and other probing attacks, e.g., port scanning. In order to compare with S-NDAE [20], we use the 10% subset of the full-size dataset.

NSL-KDD. To overcome some problems of KDD Cup'99 mentioned in [25], NSL-KDD was proposed, although it still has some weaknesses. NSL-KDD does not contain redundant data, so the detection results are more reliable. NSL-KDD has the same structure as KDD Cup'99: it also has 41 features and the same kinds of attack types. In the experiments, we compare our work with S-NDAE [20] on these two datasets. The number of instances in each category of KDD Cup'99 and NSL-KDD can be seen in Table 1. The datasets are unbalanced: normal and DoS instances occupy the majority of the datasets, but U2R has fewer than 60 instances.

Table 1. Composition of KDD Cup'99 and NSL-KDD

Category   10% KDD Cup'99         NSL-KDD
           Train      Test        Train    Test
Normal     136188     58368       47140    20203
DoS        274020     117438      32148    13779
Probe      2874       1233        8159     3497
R2L        788        338         696      299
U2R        36         16          36       16

4.2 KDD Cup'99 Experiment Results
We compare our model MBC with S-NDAE [20] on KDD Cup'99 in this section. After training the MBC model on the training data, the classification confusion matrix on the test dataset can be seen in Fig. 2. For normal data and DoS data, the classification is almost perfectly correct. Probe and R2L also have high accuracy. However, for U2R, which has the fewest instances, the accuracy is relatively low. We also compare our model with S-NDAE [20] in accuracy, precision, recall, F-score and false alarm rate; the comparison is listed in Table 2. From the evaluation results, we find that our model's accuracy outperforms S-NDAE [20] for all attack types. The precision is slightly lower than their model's. Considering recall, our model is only slightly lower than their model on probe. The F-score, which is the harmonic mean of precision and recall, also outperforms S-NDAE [20] except on probe. Finally, consider the false alarm rate, a significant metric in IDS: our model has a near-zero false alarm rate, which also exceeds S-NDAE [20]. Besides the above metrics, training time is also an important comparison metric. In Table 3, we compare MBC with S-NDAE [20] in training time. In the table, MBC (20-10 neurons) stands for two hidden layers in the VAE, where hidden layer 1 has 20 neurons and hidden layer 2 has 10 neurons. It can be seen that MBC significantly outperforms S-NDAE [20] in training time.
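The time-saving percentages in Table 3 follow directly from the paired training times as saving = (1 − t_MBC / t_S-NDAE) × 100; a one-line sketch (the function name is ours):

```python
def time_saving(t_baseline, t_ours):
    """Percentage of training time saved relative to the baseline."""
    return (1.0 - t_ours / t_baseline) * 100.0
```

For example, 233 s of MBC training against 2024 s for S-NDAE (8 neurons) gives the 88.49% figure reported in Table 3.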
Fig. 2. Confusion matrix of MBC about KDD Cup'99.

Table 2. Comparison between S-NDAE [20] and MBC about KDD Cup'99

Attack class   Accuracy (%)       Precision (%)      Recall (%)         F-score (%)        False alarm (%)
               S-NDAE    MBC      S-NDAE    MBC      S-NDAE    MBC      S-NDAE    MBC      S-NDAE    MBC
Normal         99.49     99.95    100.00    99.89    99.49     99.96    99.75     99.93    8.92      0.05
DoS            99.79     99.98    100.00    99.98    99.79     99.99    99.89     99.99    0.04      0.03
Probe          98.74     99.99    100.00    99.75    98.74     98.54    99.36     99.14    10.83     1.70
R2L            9.31      99.98    100.00    98.73    9.31      91.72    17.04     95.09    0.71      0.00
U2R            0.00      99.99    0.00      71.43    0.00      31.25    0.00      43.48    100.00    0.00
Total          97.85     99.95    99.99     99.89    97.85     99.97    98.15     99.93    2.15      0.54
Table 3. Time consuming comparison between S-NDAE [20] and MBC about KDD Cup'99

S-NDAE configuration   Training time (s)   MBC configuration            Training time (s)   Time saving (%)
S-NDAE (8 neurons)     2024                MBC (20-10 neurons)          233                 88.49
S-NDAE (14 neurons)    2381                MBC (30-20-10 neurons)       243                 89.79
S-NDAE (22 neurons)    2446                MBC (34-27-20-10 neurons)    249                 89.82

4.3 NSL-KDD Experiment Results
Besides KDD Cup'99, we also test our model on the NSL-KDD dataset. The confusion matrix of MBC for the five-class classification of NSL-KDD can be seen in Fig. 3 and is similar to that of KDD Cup'99. For normal data, DoS data, probe and R2L, the classification has high accuracy; however, for U2R, which has the fewest instances, the accuracy is relatively low. The comparison between S-NDAE [20] and MBC about NSL-KDD can be seen in Table 4. For NSL-KDD, our model outperforms S-NDAE in accuracy for every category. In precision, our model is slightly lower than S-NDAE except for U2R. Our model is also superior to S-NDAE in recall and F-score. Finally, our model has a very low false alarm rate, which is an important metric in NIDS.
Fig. 3. Confusion matrix of MBC about NSL-KDD.
Table 4. Comparison between S-NDAE [20] and MBC about NSL-KDD

Attack class   Accuracy (%)       Precision (%)      Recall (%)         F-score (%)        False alarm (%)
               S-NDAE    MBC      S-NDAE    MBC      S-NDAE    MBC      S-NDAE    MBC      S-NDAE    MBC
Normal         97.73     99.65    100.00    99.48    97.73     99.86    98.85     99.67    20.62     0.60
DoS            94.58     99.88    100.00    99.85    94.58     99.82    97.22     99.83    1.07      0.09
Probe          94.67     99.82    100.00    99.65    94.67     98.37    97.26     99.01    16.84     0.03
R2L            3.82      99.91    100.00    96.81    3.82      91.30    7.36      93.98    3.45      0.02
U2R            2.70      99.97    100.00    100.00   2.70      37.50    5.26      54.55    50.00     0.00
Total          85.42     99.65    99.99     99.48    85.42     99.86    87.37     99.67    14.58     0.60
We also compare our model with S-NDAE in training time; the result can be seen in Table 5. MBC significantly outperforms S-NDAE [20] in training time on NSL-KDD as well. Overall, our model outperforms S-NDAE [20] in accuracy, recall, F-score and false alarm rate on both KDD Cup'99 and NSL-KDD, and it requires much less training time than S-NDAE [20].

5 Discussion

In this paper, we discussed the challenges of machine learning-based IDS and designed a novel IDS based on a VAE. We use the VAE to obtain a stable latent representation of the input vector, realizing dimension reduction.
Table 5. Time consuming comparison between S-NDAE and MBC about NSL-KDD

S-NDAE configuration   Training time (s)   MBC configuration            Training time (s)   Time saving (%)
S-NDAE (8 neurons)     644                 MBC (20-10 neurons)          124                 80.75
S-NDAE (14 neurons)    722                 MBC (30-20-10 neurons)       135                 81.30
S-NDAE (22 neurons)    1091                MBC (34-27-20-10 neurons)    148                 86.43
After that, the lower-dimensional latent representation is sent to the random forest classifier to obtain the final result. We also compared our model with the baseline. The evaluation shows that our model outperforms the baseline in accuracy, recall, F-score and false alarm rate. Besides, our model requires much less training time and consumes fewer computational and storage resources, which makes it more suitable for low-resource scenarios. In future work, we will research unbalanced dataset classification and study how to improve the accuracy of small-sample categories.
Acknowledgments. The research was supported by the National Natural Science Foundation of China (No. 61872448), the Natural Science Basic Research Plan in Shaanxi Province of China (Program No. 2018JM6017) and the National Natural Science Foundation of China (No. 51701153).
References 1. Kuypers MA, Maillart T, Paté-Cornell E (2016) An empirical analysis of cyber security incidents at a large organization. Department of Management Science and Engineering, Stanford University, School of Information 2. Roesch M (1999) Snort: lightweight intrusion detection for networks. In: LISA, pp 229–238 3. Yu T, Sekar V, Seshan S et al (2015) Handling a trillion (unfixable) flaws on a billion devices: rethinking network security for the internet-of-things. In: Proceedings of the 14th ACM workshop on hot topics in networks. New York, pp 1–7 4. Stephen R, Arockiam L (2017) Intrusion detection system to detect sinkhole attack on RPL protocol in Internet of Things. Int J Electr Electron Comput Sci 4:16–20 5. Raza S, Wallgren L, Voigt T (2013) SVELTE: real-time intrusion detection in the internet of things. Ad Hoc Networks 11:2661–2674 6. Shreenivas D, Raza S, Voigt T (2017) Intrusion detection in the RPL-connected 6LoWPAN networks. In: Proceedings of the 3rd ACM international workshop on IoT privacy, trust, and security. Taiwan, pp 31–38 7. Santos L, Rabadao C, Gonçalves R (2018) Intrusion detection systems in internet of things: a literature review. In: 2018 13th Iberian conference on information systems and technologies (CISTI). IEEE, Caceres, pp 1–7 8. Jun C, Chi C (2014) Design of complex event-processing IDS in internet of things. In: 2014 sixth international conference on measuring technology and mechatronics automation. IEEE, Zhangjiajie, pp 226–229 9. Midi D, Rullo A, Mudgerikar A et al (2017) Kalis–a system for knowledge-driven adaptable intrusion detection for the Internet of Things. In: 2017 IEEE 37th international conference on distributed computing systems (ICDCS). IEEE, Atlanta, pp 656–666
10. Garcia-Teodoro P, Diaz-Verdejo J, Macia-Fernandez G, Vazquez E (2009) Anomaly-based network intrusion detection: techniques, systems and challenges. Comput Security 28:18–28 11. Kaur H, Singh G, Minhas J (2013) A review of machine learning based anomaly detection techniques. Int J Comput Appl Technol Res 2:185–187 12. Shon T, Moon J (2007) A hybrid machine learning approach to network anomaly detection. Inf Sci 177:3799–3821 13. Shon T, Kim Y, Lee C et al (2005) A machine learning framework for network anomaly detection using SVM and GA. In: Proceedings from the sixth annual IEEE SMC information assurance workshop. IEEE, New York, pp 176–183 14. Buczak AL, Guven E (2015) A survey of data mining and machine learning methods for cyber security intrusion detection. IEEE Commun Surveys Tutorials 18:1153–1176 15. Amouri A, Alaparthy VT, Morgera SD (2018) Cross layer-based intrusion detection based on network behavior for IoT. In: 2018 IEEE 19th wireless and microwave technology conference (WAMICON). IEEE, Florida, pp 1–4 16. Doshi R, Apthorpe N, Feamster N (2018) Machine learning DDoS detection for consumer internet of things devices. In: 2018 IEEE Security and Privacy Workshops (SPW). IEEE, San Francisco, pp 29–35 17. Shukla P (2017) Ml-ids: A machine learning approach to detect wormhole attacks in internet of things. In: 2017 intelligent systems conference (IntelliSys). IEEE, London, pp 234–240 18. McDermott CD, Majdani F, Petrovski AV (2018) Botnet detection in the internet of things using deep learning approaches. In: 2018 international joint conference on neural networks (IJCNN). IEEE, Brazil, pp 1–8 19. Moustafa N, Turnbull B, Choo KKR (2018) An ensemble intrusion detection technique based on proposed statistical flow features for protecting network traffic of internet of things. IEEE Internet Things J 6:4815–4830 20. Shone N, Ngoc TN, Phai VD et al (2018) A deep learning approach to network intrusion detection. 
IEEE Trans Emerg Topics Comput Intell 2:41–50 21. Meidan Y, Bohadana M, Mathov Y et al (2018) N-BaIoT: network-based detection of IoT botnet attacks using deep autoencoders. IEEE Pervasive Comput 17:12–22 22. Mirsky Y, Doitshman T, Elovici Y et al (2018) Kitsune: an ensemble of autoencoders for online network intrusion detection. arXiv preprint arXiv:1802.09089 23. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. MIT Press 24. Stolfo SJ, Fan W, Lee W, Prodromidis A, Chan PK (2000) Cost-based modeling for fraud and intrusion detection: results from the JAM project. In: Proceedings DARPA information survivability conference and exposition. IEEE, Washington, pp 130–144 25. Tavallaee M, Bagheri E, Lu W, Ghorbani AA (2009) A detailed analysis of the KDD CUP 99 data set. In: 2009 IEEE symposium on computational intelligence for security and defense applications. IEEE, Ottawa, pp 1–6
Campus Physical Bullying Detection Based on Sensor Data and Pattern Recognition Bo Long1 , Jian Liu1 , Jun Wang1(B) , and Chenguang He2 1 Harbin Institute of Technology, Weihai, China [email protected], [email protected], [email protected] 2 Harbin Institute of Technology, Harbin, China [email protected]
Abstract. Campus bullying is one of the primary problems in education around the world, causing teenagers to drop out of school or even commit suicide. Campus bullying can take various forms, such as physical bullying, verbal bullying, Internet bullying and so on; physical bullying is considered the most harmful to teenagers. Therefore, it is necessary and significant to develop anti-bullying methods. In view of the current popularity of smartphones among students, this article proposes a scheme that uses the smartphone's built-in acceleration sensor and gyroscope to collect student activity data and uses pattern recognition technology to identify students' status. This paper uses the Relief-F algorithm for feature selection, then uses PCA for feature dimensionality reduction, and finally extracts three kinds of features from the acceleration and gyroscope data. The authors used the k-NN algorithm as the classifier. In the final test, the bullying and non-bullying recognition accuracies of k-NN were 84.13% and 76.92%, respectively. The results indicate that motion recognition based on k-NN can obtain a good classification effect in physical bullying detection. Keywords: Campus bullying · K-NN · Pattern recognition · Physical bullying
1 Introduction Campus bullying is a common problem around the world which seriously affects victims' physical and mental health [1]. However, after being abused by bullies, victims often dare not report the situation to teachers and parents out of fear [2]. Without timely supervision and prevention, the phenomenon of campus bullying will become more and more serious [3]. Therefore, it is necessary and important to develop measures to automatically detect school violence. With the rapid development of machine learning technology in recent years, motion recognition has become a hot research field [4] and a frequent topic at mainstream international conferences on pattern recognition [5]. Campus bullying can take various forms, such as physical bullying, verbal bullying,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_126
and network bullying, among which physical bullying is the most harmful. To address this problem, this paper proposes a method for the automatic detection of physical bullying based on motion recognition.
2 Research Status With the development of smartphones, people can obtain a variety of anti-bullying apps, such as Stop Bullies, Bully Box, and Black Box. These applications all work in the same way [6]: when a bullying incident occurs, the victim or a witness needs to take out the mobile phone, run the app, and manually send an alarm message. This is not convenient for the victim. Therefore, a method is needed that can automatically detect the state of the victim and send an alarm message when the victim is bullied. In the past ten years, motion recognition technology has developed greatly, and its accuracy has improved substantially [7]. Motion recognition technology has been applied to smart home systems [8]. Therefore, motion recognition technology is mature enough to perform campus bullying detection.
3 Physical Bullying Detection Method Most bullying actions cause drastic changes in the physical state of the bullied person, resulting in large displacements, accelerations, etc., on the body. Mobile phones are often equipped with accelerometers, gyroscopes, and other sensors. Through these sensors, the holder's acceleration, angular velocity, and other information can be collected for automatic bullying detection. The action experiment data in this paper were collected by fixing the smartphone to the waist of the tester, which reflects the overall movement state of the human body well [9]. In order to distinguish bullying movements such as hitting and pushing down from non-bullying movements such as walking and jumping, this paper collects accelerometer and gyroscope data for five typical bullying movements and six typical daily movements (Table 1). The data collected are not necessarily accurate, so a correction method is used to recover the true data. When standing, the acceleration should be $acc_y = g$, $acc_x = acc_z = 0$; in fact, the recorded data are $acc_x = a$, $acc_y = b$, $acc_z = c$. From these data, the rotation angles $\alpha$, $\beta$, and $\gamma$ of the original three-dimensional coordinate system along the x-, y-, and z-axes can be calculated, which give the deflection matrix:

$$I = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{bmatrix}$$

(1)
According to the relationship between the true value vector, the measured value vector, and the deflection matrix, the true value equals the inverse of the deflection matrix multiplied by the measured value:

$$R_l = I^{-1} \times R_b$$

(2)
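Equations (1)–(2) can be sketched numerically as follows. This is a minimal NumPy sketch, not the authors' code: the function names are illustrative, and the rotation angles α, β, γ are assumed to have been estimated from the standing data beforehand.

```python
import numpy as np

def deflection_matrix(alpha, beta, gamma):
    """Deflection matrix I of Eq. (1): rotations by alpha, beta, gamma
    about the x-, y- and z-axes (angles in radians)."""
    rz = np.array([[ np.cos(gamma), np.sin(gamma), 0.0],
                   [-np.sin(gamma), np.cos(gamma), 0.0],
                   [ 0.0,           0.0,           1.0]])
    ry = np.array([[np.cos(beta), 0.0, -np.sin(beta)],
                   [0.0,          1.0,  0.0],
                   [np.sin(beta), 0.0,  np.cos(beta)]])
    rx = np.array([[1.0,  0.0,           0.0],
                   [0.0,  np.cos(alpha), np.sin(alpha)],
                   [0.0, -np.sin(alpha), np.cos(alpha)]])
    return rz @ ry @ rx

def correct(measured, alpha, beta, gamma):
    """Eq. (2): true reading R_l = I^{-1} x R_b."""
    I = deflection_matrix(alpha, beta, gamma)
    return np.linalg.inv(I) @ np.asarray(measured, dtype=float)
```

Because the deflection matrix is a product of rotations it is orthogonal, so `np.linalg.inv(I)` could equivalently be replaced by the cheaper `I.T`.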
Table 1. Five typical bullying actions and six typical non-bullying actions used in data collection

Action class       | Action type
Bullying class     | Beat, Push, Shake, Push down, Shoulder hit
Non-bullying class | Fall down, Jump, Play, Run, Stand, Walk
After correcting the original data, initial observation still found some measurement noise. For this reason, some commonly used low-pass filters were tested. To reduce noise while retaining more data information, this paper chooses wavelet filtering to filter the original data (Fig. 1). After smoothing preprocessing, the data are further divided into bullying and non-bullying actions. Features are then extracted from the horizontal acceleration components accx and accz, the vertical acceleration component accy, and the three-axis gyroscope components gyrox, gyroy, and gyroz. To this end, this paper constructs 57-dimensional initial features, comprising five statistical features of the time-domain signal and of its time-domain first-order difference (mean, maximum, minimum, zero-crossing ratio, and median absolute difference) and four statistical features in the frequency domain (mean, maximum, minimum, and energy). The time-domain first-order forward difference, taking accx as an example, is

$$dacc_{x,k} = acc_{x,k+1} - acc_{x,k}, \quad k = 1, 2, 3, \ldots, n-1$$

(3)
The calculation methods of the horizontal acceleration magnitude and the three-axis gyroscope magnitude are:

$$acc_{xz} = \left(acc_x^2 + acc_z^2\right)^{0.5}$$

(4)

$$gyro_{xyz} = \left(gyro_x^2 + gyro_y^2 + gyro_z^2\right)^{0.5}$$

(5)
In addition, a Fourier transform is performed on the original data to obtain frequency-domain data. The calculation, taking the accx frequency-domain data faccx as an example, is faccx = fft(accx)
(6)
Fig. 1. Horizontal acceleration signal and the signal after passing through different filters
The zero-crossing ratio and the frequency-domain energy are normalized by the window length. The formula for the median absolute difference is:

$$mad(acc_x) = median\left(\left|acc_x - median(acc_x)\right|\right)$$

(7)
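The time-domain statistics above, the first-order difference of Eq. (3), and the horizontal magnitude of Eq. (4) can be sketched as below. This is an illustrative NumPy implementation under assumed conventions (e.g., the zero-crossing ratio normalized by the window length), not the authors' code.

```python
import numpy as np

def window_features(x):
    """Time-domain statistics for one 10 s window: mean, max, min,
    zero-crossing ratio and median absolute difference (Eq. 7)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # count sign changes between adjacent samples, normalized by n
    zcr = np.sum(np.signbit(x[:-1]) != np.signbit(x[1:])) / n
    mad = np.median(np.abs(x - np.median(x)))
    return {"mean": x.mean(), "max": x.max(), "min": x.min(),
            "zcr": zcr, "mad": mad}

def first_diff(x):
    """Eq. (3): first-order forward difference of a signal."""
    x = np.asarray(x, dtype=float)
    return x[1:] - x[:-1]

def horizontal_magnitude(accx, accz):
    """Eq. (4): acc_xz = (acc_x^2 + acc_z^2)^0.5."""
    return np.sqrt(np.asarray(accx) ** 2 + np.asarray(accz) ** 2)
```

The same statistics applied to `first_diff(x)` and to the frequency-domain data yield the remaining entries of the 57-dimensional feature vector.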
The above 57-dimensional features are computed for the collected data samples according to action type and then grouped according to whether the action type is bullying or non-bullying (Table 2). For the extracted features, feature selection must be further carried out, because the features were not designed according to the specific distribution of the data: when they are used to classify the samples, some features may contribute little or nothing. The features that do not contribute much to the classification should therefore be dropped. One possible feature selection method is to observe the distribution of features through quartile box plots (Fig. 2). Because many features were initially extracted, it is not practical to filter them by inspecting box plots. This paper therefore uses an improved Relief algorithm. Relief is a feature weighting algorithm that assigns different weights to
Table 2. 57-dimensional features extracted from acceleration and gyroscope data

Feature             | Extracted from                                                                               | Amount
Mean                | accy, accxz, gyrox, gyroy, gyroz, gyroxyz, faccy, faccxz, fgyroxyz, daccy, daccxz, dgyroxyz | 12
Max                 | accy, accxz, gyrox, gyroy, gyroz, gyroxyz, faccy, faccxz, fgyroxyz, daccy, daccxz, dgyroxyz | 12
Min                 | accy, accxz, gyrox, gyroy, gyroz, gyroxyz, faccy, faccxz, fgyroxyz, daccy, daccxz, dgyroxyz | 12
Zero crossing ratio | accy, accxz, gyrox, gyroy, gyroz, gyroxyz, daccy, daccxz, dgyroxyz                          | 9
MAD                 | accy, accxz, gyrox, gyroy, gyroz, gyroxyz, daccy, daccxz, dgyroxyz                          | 9
Energy              | faccy, faccxz, fgyroxyz                                                                      | 3
Fig. 2. Quartile box plot of frequency domain energy feature of y-axis acceleration
features according to the correlation between each feature and the category. Features with weights less than a certain threshold are removed. This paper uses the Relief-F algorithm to select features and remove those that do not contribute to the classification. After feature selection, 24-dimensional features remain, but this dimension is still too large, because a mobile phone used as the hardware platform has limited computing resources. Therefore, the principal component analysis (PCA) method is used to further reduce the feature dimension. PCA analyzes the principal components in the feature space; after dimensionality reduction, the feature dimension is reduced while still retaining most
of the data information. This paper takes the first four principal components after dimensionality reduction, which still retain 98.87% of the information in the original features. In this paper, the classification goal is to distinguish bullying actions from non-bullying actions, and a two-fold cross-validation scheme is used: the bullying feature matrix and the non-bullying feature matrix are each divided in two; the first group of bullying and non-bullying samples is combined into a training set and the second group into a testing set; the training set trains a classifier that is verified on the testing set; finally, the training and testing sets are exchanged. The final result is the average of the two runs.
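The PCA step can be sketched with an SVD-based implementation that keeps the fewest components reaching a variance threshold (98.87% is the figure quoted above). The helper name and interface are illustrative assumptions, not from the paper.

```python
import numpy as np

def pca_reduce(X, var_keep=0.9887):
    """PCA sketch: project the feature matrix X (samples x features)
    onto the fewest principal components that retain at least
    `var_keep` of the total variance."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = S ** 2 / np.sum(S ** 2)          # variance share per component
    k = int(np.searchsorted(np.cumsum(var), var_keep)) + 1
    return Xc @ Vt[:k].T, var[:k].sum()    # reduced data, retained share
```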
4 Experiments and Conclusions Using the above sample collection method, a total of 299 samples of five types of bullying actions and six types of non-bullying actions were collected, and 57-dimensional features were extracted for each sample. After Relief-F feature selection and PCA dimensionality reduction, the first four principal components are taken for classifier design and testing. For the k-NN classifier, the nearest-neighbor number k is adjusted, and the testing set and the training set are evaluated by the classifier. The classification result on the testing set reflects the performance of the classifier (Fig. 3).
Fig. 3. Accuracy of different k values of the k-NN algorithm, the optimal k is 5
Through comparison, the k-NN model with the number of nearest neighbors k = 5 is finally selected; the classification results on the training set show no over-fitting, and good results are obtained. The recognition accuracies of bullying and non-bullying
Table 3. Confusion matrix of k-NN classifier when k = 5

Actual       | Recognized as bullying (%) | Recognized as non-bullying (%)
Bullying     | 84.13                      | 15.87
Non-bullying | 23.08                      | 76.92
actions are 84.13% and 76.92%, respectively. The confusion matrix is given in Table 3. Some common indicators that measure the recognition effect of the classifier are as follows: accuracy = (TP + TN)/(P + N) = 0.8046, precision = TP/(TP + FP) = 0.7794, and recall = TP/(TP + FN) = 0.8413; the F1 indicator, the harmonic mean of recall and precision, is 0.8092. The above results show that the 57-dimensional initial features and the four-dimensional features after reduction can reflect the difference between bullying and non-bullying actions. The designed k-NN classifier with k = 5 has an average accuracy of over 80%, indicating that using a smartphone as a detection platform to identify physical bullying is very promising, thus providing a convenient and easy way for the automatic detection of campus bullying. Acknowledgements. This work was supported by the National Natural Science Foundation of China under Grant No. 61971158.
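The indicator formulas can be checked with a small helper; the counts in the usage example below are hypothetical round numbers, not the paper's 299-sample split.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 from raw confusion-matrix
    counts, with bullying taken as the positive class."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return accuracy, precision, recall, f1

# hypothetical counts: 60 bullying samples, 40 non-bullying samples
acc, prec, rec, f1 = metrics(tp=50, fn=10, fp=10, tn=30)
```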
References
1. Kim YS, Leventhal B (2008) Bullying and suicide: a review. Int J Adolesc Med Health 20(2):133–154
2. Menesini E, Salmivalli C (2017) Bullying in schools: the state of knowledge and effective interventions. Psychol Health Med 22(sup1):240–253
3. Olweus D (2013) School bullying: development and some important challenges. Annu Rev Clin Psychol
4. Brahnam S, Roberts JJ, Nanni L et al (2015) Design of a bullying detection/alert system for school-wide intervention
5. Peng L, Chen L, Wu M et al (2019) Complex activity recognition using acceleration, vital sign, and location data. IEEE Trans Mob Comput 18(7):1488–1498
6. Ye L, Ferdinando H, Seppänen T et al (2014) Physical violence detection for preventing school bullying. Adv Artif Intell 2014(2):1–9
7. Zalluhoglu C, Ikizler-Cinbis N (2019) Region based multi-stream convolutional neural networks for collective activity recognition. J Visual Commun Image Represent 60
8. Hache G, Lemaire ED, Baddour N (2010) Mobility change-of-state detection using a smartphone-based approach. In: Proceedings of the IEEE International Workshop on Medical Measurements and Applications (MeMeA '10), Ottawa, Canada, May 2010, pp 43–46
9. Hache G, Lemaire ED, Baddour N (2011) Wearable mobility monitoring using a multimedia smartphone platform. IEEE Trans Instrum Meas 60(9):3153–3161
Speaker Recognition System Using Dynamic Time Warping Matching and Mel-Scale Frequency Cepstral Coefficients Yang Xue(B) University of Electronic Science and Technology of China, No. 2006 Xiyuan Avenue High-Tech Zone (West District), Chengdu, China [email protected]
Abstract. Automatic speaker recognition is the task of automatically recognizing who is speaking from a speech signal. The difference between speaker recognition and speech recognition is that it does not focus on the textual or semantic content carried by the speech signal but on the personal features it contains; these personal features of the speaker are extracted to achieve the purpose of identifying speakers. Research on speaker recognition began in the 1930s [1]. Early research mainly concerned human-listening experiments and explored the possibility of recognition by ear. Later, with the improvement of electronic and computer technology, research work moved beyond the unaided human ear, and it became possible to recognize human voices automatically by machine.
1 Introduction Speech is one of the natural attributes of human beings. Due to physiological differences in the speaker's vocal organs and acquired behavioral differences, each person's speech has a strong personal color, which makes it possible to identify the speaker by analyzing the speech signal. Using voice to identify the speaker has many unique advantages. For example, voice is an inherent feature of human beings and cannot be lost or forgotten; voice signals are convenient to collect and the cost of system equipment is low; in addition, the telephone network can be used to provide remote services [2]. In recent years, automatic speaker recognition has played an important role in a fairly wide range of fields and has become a focus of attention.
2 Text Speech signal is a non-stationary random process, whose characteristics change with time, so the parameters in the model also change with time. However, the properties of
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_127
the speech signal vary gently with time. Therefore, the speech signal can be divided into successive short sections for processing, within which the speech signal is regarded as a stationary random process whose characteristics do not change with time. In this way, a linear time-invariant model can be used to represent the speech signal over these short periods.

2.1 Digitization of Speech Signals

The digitization of speech signals generally includes amplification and gain control, anti-aliasing filtering, sampling, A/D conversion, and encoding (generally Pulse Code Modulation (PCM) coding). Pre-filtering (the pre-filter must be a band-pass filter) has two purposes:

1. Suppress all frequency-domain components of the signal whose frequency exceeds fs/2, where fs is the sampling frequency, to prevent aliasing interference.
2. Suppress 50 Hz power-line interference.

In A/D conversion, the signal must be quantized, and quantization inevitably generates errors [3]. The difference between the quantized signal value and the original signal value is called the quantization error, also known as quantization noise. If the change of the signal waveform is large enough, or the quantization interval Δ is small enough, it can be shown that the quantization noise conforms to a statistical model with the following characteristics:

1. It is a stationary white noise process.
2. The quantization noise is uncorrelated with the input signal.
3. The quantization noise is evenly distributed within the quantization interval; that is, it has an equal probability density distribution.

2.2 Speaker Recognition

Speaker recognition extracts the speaker's personal characteristics from a section of the speaker's speech; through the analysis and recognition of these characteristics, the purpose of identifying or confirming the speaker is achieved. Its structural block diagram is shown in Fig. 1.
Fig. 1. Structural block diagram of speech recognition
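The quantization-noise properties listed in Sect. 2.1 can be checked numerically with a uniform quantizer. This is a sketch under assumed parameters (an 8-bit range over [−1, 1] and an 8 kHz sampling rate, neither stated in the text); it confirms that the error magnitude never exceeds Δ/2.

```python
import numpy as np

def quantize(x, delta):
    """Uniform (mid-tread) quantizer with step size delta, as in PCM
    A/D conversion; the quantization error never exceeds delta/2."""
    return delta * np.round(np.asarray(x, dtype=float) / delta)

# quantization noise of a sampled sine wave
t = np.arange(0, 1, 1 / 8000)      # assumed fs = 8 kHz
x = np.sin(2 * np.pi * 50 * t)
delta = 2 / 2 ** 8                 # assumed 8-bit range [-1, 1]
err = x - quantize(x, delta)
```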
Ideally, the selected features should meet the following criteria [4]:

1. They can effectively distinguish different speakers but remain relatively stable when the voice of the same speaker changes.
2. They are easy to extract from the speech signal.
3. They are not easy to imitate.
4. They do not change, or change as little as possible, with time and circumstances.

Speaker information includes not only stable factors, such as the structure of the vocal organs and vocal habits, but also variable factors, such as stress, speaking rate, rhythm, and intonation. The template to be identified is compared with the reference template over time, and the similarity between the two templates is obtained using a specific distance measure [5, 6]. A commonly used technique is dynamic time warping (DTW), based on the nearest-neighbor principle. A speaker recognition system using DTW is shown in Fig. 2. The recognition feature it uses is a band-pass filter bank (with auditory feature processing), and DTW is used for matching.

2.3 Isolated Word Recognition

The isolated word recognition system generally uses isolated words as the recognition unit; that is, isolated words are directly used as the recognition primitives, and the dynamic time warping (DTW) method can be used for recognition. The starting point of the utterance corresponds to the starting point of the path. The distance from the beginning to the end of the optimal path is the distance from the speech to be identified to the template speech, and the utterance corresponding to the template with the smallest distance is the recognition result [7]. This method is computationally intensive, but the recognition accuracy is high. In the matching, for short-time spectrum or cepstrum parameter identification systems, the distortion measure can use the Euclidean distance; for identification systems using LPC parameters, the distortion measure can use the log-likelihood ratio distance.
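The DTW matching described above can be sketched by the classic dynamic-programming recurrence. This is a textbook implementation, not the system's actual matcher; the absolute-difference local distance is an illustrative choice (the text mentions Euclidean or log-likelihood-ratio distances for real feature vectors).

```python
def dtw_distance(a, b, dist=lambda p, q: abs(p - q)):
    """Dynamic time warping by dynamic programming: D[i][j] is the
    cost of the best warping path aligning a[:i] with b[:j]."""
    n, m = len(a), len(b)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = dist(a[i - 1], b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]
```

Recognition then amounts to computing `dtw_distance` between the test utterance and every reference template and choosing the template with the smallest distance.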
The decision method generally uses the nearest-neighbor criterion. The system is represented in Fig. 3. First, the speech signal is transformed into speech feature parameters after preprocessing and speech analysis. The pattern recognition part compares and matches the input voice feature parameters with the reference models stored during training. Due to changes in pronunciation rate, there is nonlinear distortion between the test speech and the reference model; that is, some phonemes of the input speech become longer and others shorter than in the reference model, showing random changes. In addition to changes in pronunciation rate, the test speech may also differ from the reference model through other variations, such as coarticulation and other acoustic changes, the speaker's psychological and physiological changes, and changes of speaker and environment. In speech recognition, isolated word recognition is the basis. The expansion of vocabulary, the improvement of recognition accuracy, and the reduction of computational complexity are the three main goals of isolated word recognition [8]. The key issues
Fig. 2. Speaker recognition system based on DTW
Fig. 3. Isolated word recognition system
are the selection and extraction of features, the selection of distortion measures, and the effectiveness of matching algorithms. Mel-scale Frequency Cepstral Coefficients (MFCC) are the most frequently applied speech features in speech recognition. Low-frequency sounds easily mask high-frequency sounds, while the reverse is more difficult. Therefore, a set of band-pass filters, spaced from dense to sparse with increasing frequency according to their particular bandwidths, can be applied to filter the input signal. The energy output of each band-pass filter is taken as a fundamental feature of the signal, and this feature can be used as the input speech feature for further processing. Because it does not depend on the nature of the signal, no assumptions or restrictions are placed on the input signal [9]. This feature also puts the results of auditory-model research into practice; it is therefore more robust, matches the auditory properties of the human ear more closely, and retains accurate and clear recognition performance as the signal-to-noise ratio decreases. The basic process of extracting MFCC voice feature parameters is as follows:

Pre-emphasis: Pass the speech signal through a high-pass filter:

$$H(z) = 1 - \mu z^{-1}$$

(1)
The objective of this step is to boost the high-frequency part, producing a flatter signal spectrum across the whole frequency band. At the same time, it compensates for the high-frequency part of the speech signal that is suppressed by the vocal system and emphasizes the high-frequency formants.

Framing: First collect N sampling points into one observation unit, called a frame. To avoid excessive changes between two adjacent frames, an overlapping area is reserved between them.

Windowing [10]: Multiply each frame by the Hamming window to increase the continuity at both ends of the frame:

$$W(n, a) = (1 - a) - a \times \cos\left(\frac{2\pi n}{N - 1}\right), \quad 0 \le n \le N - 1$$

(2)

Fast Fourier Transform: Different energy distributions can represent the features of different speech:

$$X_a(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi k n / N}, \quad 0 \le k \le N - 1$$

(3)
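Steps (1)–(3) above can be sketched together as a front-end. This is an illustrative NumPy sketch: the frame length, hop size, μ = 0.97, and a = 0.46 are common choices assumed here, not values given in the text.

```python
import numpy as np

def mfcc_frontend(x, mu=0.97, frame_len=256, hop=128):
    """Front-end of the MFCC pipeline: pre-emphasis (Eq. 1), framing
    with 50% overlap, Hamming windowing (Eq. 2 with a = 0.46), and
    the power spectrum |X_a(k)|^2 via the FFT (Eq. 3)."""
    x = np.asarray(x, dtype=float)
    # H(z) = 1 - mu z^-1  ->  y[n] = x[n] - mu * x[n-1]
    y = np.append(x[0], x[1:] - mu * x[:-1])
    # slice overlapping frames: one row per frame
    n_frames = 1 + (len(y) - frame_len) // hop
    idx = np.arange(frame_len) + hop * np.arange(n_frames)[:, None]
    frames = y[idx]
    a = 0.46
    w = (1 - a) - a * np.cos(2 * np.pi * np.arange(frame_len) / (frame_len - 1))
    spectrum = np.fft.rfft(frames * w, axis=1)
    return np.abs(spectrum) ** 2       # one power spectrum per frame
```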
Triangular band-pass filter: Pass the energy spectrum through a set of Mel-scale triangular filter banks; define a filter bank with M filters. The m-th filter is a triangular filter with center frequency f(m); the spacing between filters narrows as m decreases and widens as m increases. The purpose of this step is to smooth the frequency spectrum and remove harmonics so as to highlight the formants of the original speech. The frequency response of the triangular filter is defined as [11]:

$$H_m(k) = \begin{cases} 0, & k < f(m-1) \\ \dfrac{2\,(k - f(m-1))}{(f(m+1) - f(m-1))(f(m) - f(m-1))}, & f(m-1) \le k \le f(m) \\ \dfrac{2\,(f(m+1) - k)}{(f(m+1) - f(m-1))(f(m+1) - f(m))}, & f(m) \le k \le f(m+1) \\ 0, & k \ge f(m+1) \end{cases}$$

(4)

Calculate the log energy output of each filter bank as:

$$s(m) = \ln\left(\sum_{k=0}^{N-1} |X_a(k)|^2 H_m(k)\right), \quad 0 \le m \le M$$

(5)
After the discrete cosine transform (DCT), the MFCC coefficients are obtained, where L is the order of the MFCC coefficients and M is the number of triangular filters:

$$C(n) = \sum_{m=0}^{M-1} s(m)\cos\left(\frac{\pi n (m - 0.5)}{M}\right), \quad n = 1, 2, \ldots, L$$

(6)
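The triangular filter bank of Eq. (4) is conventionally built on the mel scale mel = 2595·log10(1 + f/700), with center frequencies equally spaced in mel so that the filters are dense at low frequency and sparse at high frequency. The sketch below constructs such a bank on FFT bins; the filter count, FFT size, and sampling rate are illustrative assumptions, not values from the text.

```python
import numpy as np

def mel_filterbank(m_filters=26, n_fft=512, fs=8000):
    """Mel-spaced triangular filters H_m(k) in the style of Eq. (4),
    one row per filter, over the one-sided FFT bins."""
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    inv_mel = lambda m: 700 * (10 ** (m / 2595) - 1)
    # m_filters + 2 edge frequencies, equally spaced on the mel scale
    pts = inv_mel(np.linspace(0, mel(fs / 2), m_filters + 2))
    bins = np.floor((n_fft + 1) * pts / fs).astype(int)
    H = np.zeros((m_filters, n_fft // 2 + 1))
    for m in range(1, m_filters + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, c):               # rising slope
            H[m - 1, k] = (k - lo) / (c - lo)
        for k in range(c, hi):               # falling slope, peak 1 at c
            H[m - 1, k] = (hi - k) / (hi - c)
    return H
```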
Logarithmic energy: Take the sum of the squares of the signal within a frame, take the base-10 logarithm, and multiply by 10; in this way the basic voice features of each frame gain one more dimension, comprising the logarithmic energy together with the remaining cepstral parameters.

Extraction of dynamic difference parameters:
In the formula, cij = 1 when cij > 0; otherwise, the value is 0.
(6)
Driver Multi-function Safety Assistance System
981
Studies have shown that the five characteristic parameters discussed above differ significantly between the normal and fatigue states, so fatigue can be distinguished more accurately. To verify the accuracy of this conclusion, we recruited six subjects (three who had rested normally and three who had worked overnight), each of whom simulated driving for 25 min; taking every 10 s as a data sample, each person contributed 150 samples. An independent-sample t-test is performed on the population average to obtain the P value. If the P value is less than 0.05, we can conclude that the characteristic parameter differs significantly between mental states, so it can effectively distinguish the grip strength of the normal state from that of the fatigue state (Table 2).

Table 2. Significance analysis of grip strength parameters

Characteristic parameter | Number of samples | sig   | Distinctiveness
x̄                        | 150               | 0.00  | obvious
var(x)                   | 150               | 0.525 | not obvious
rms(x)                   | 150               | 0.062 | not obvious
max(x)                   | 150               | 0.005 | obvious
min(x)                   | 150               | 0.001 | obvious
pi                       | 150               | 0.022 | obvious
pri                      | 150               | 0.024 | obvious
Among them, var(x) is the standard deviation of the grip strength signal, which represents the degree of dispersion of the grip strength signal within the time period; rms(x) is the root mean square of the grip strength signal, which represents its effective value and reflects the energy of the grip strength signal within the corresponding time period. After testing, there is no significant difference for var(x) and rms(x), while the five parameters x̄, max(x), min(x), pi, and pri show significant differences, which is consistent with the research results; they are therefore used to characterize grip strength signals [4].

Smooth Grip Characteristics

Since entering the fatigue state is a very slow physiological process, the grip characteristics used to reflect the fatigue state also change gradually. Moreover,
W. Wang et al.
since the driving posture cannot always maintain the correct grip, and the signal acquisition circuit generates noise, the grip force signal contains relatively violent and irregular irrelevant features. In the above time-domain and time–frequency-domain feature analysis, every 10 s is taken as a sample, so we can use the correlation between samples, which reflects the gradual change of fatigue, to smooth the features and eliminate or reduce the impact of irrelevant components. We use a smoothing scheme based on a linear dynamic system model. Let x = {x1, ..., xn} denote the original grip strength feature sequence and Z = {z1, ..., zn} the hidden feature sequence. The linear dynamic system model assumes that the time-series correlation of {zn−1, zn} and the relationship between the observed and hidden state pairs {zn, xn} satisfy linear Gaussian conditional distributions:

$$p(z_n \mid z_{n-1}) = N(z_n \mid A z_{n-1}, \Gamma)$$

(7)

$$p(x_n \mid z_n) = N(x_n \mid C z_n, \Sigma)$$

(8)
At the same time, the initial hidden state is also Gaussian:

$$p(z_1) = N(z_1 \mid \mu_0, V_0)$$

(9)
In general, the above distributions are equivalent to the form of linearly superimposed noise:

$$z_n = A z_{n-1} + \omega_n, \quad x_n = C z_n + v_n, \quad z_1 = \mu_0 + u$$

(10)
Among them, A is the state transition matrix, C is the observation matrix, and ω, v, u are noise signals. The noise terms also satisfy Gaussian distributions:

$$\omega \sim N(\omega \mid 0, \Gamma), \quad v \sim N(v \mid 0, \Sigma), \quad u \sim N(u \mid 0, V_0)$$

(11)
Therefore, all parameters of the linear dynamic system model can be expressed as:

$$\theta = \{A, C, \Gamma, \Sigma, \mu_0, V_0\}$$

(12)
The parameter A affects the numerical scale of the smoothed grip characteristics, the parameter C affects their trend, and μ0, V0 respond to changes in the fatigue-related characteristics. The covariance parameters Γ and Σ can be estimated using the EM algorithm. It can be shown that this linear-dynamic-system smoothing algorithm is equivalent to the Kalman filter: earlier values predict the later observations, and the new observations are used to correct the predictions [5].
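As a concrete illustration of the equivalence with the Kalman filter, the one-dimensional special case (A = C = 1) can be sketched as below. Here the noise variances q and r are assumed known rather than estimated by EM, and the function name and values are illustrative, not from the paper.

```python
import numpy as np

def kalman_smooth(x, q=1e-4, r=1e-2):
    """1-D special case of the linear dynamic system above:
    z_n = z_{n-1} + w_n,  x_n = z_n + v_n,
    with process-noise variance q and measurement-noise variance r."""
    x = np.asarray(x, dtype=float)
    z, p = x[0], 1.0          # initial state mu_0 and variance V_0
    out = []
    for xn in x:
        # predict from the previous estimate ...
        p_pred = p + q
        # ... then correct the prediction with the new observation
        k = p_pred / (p_pred + r)
        z = z + k * (xn - z)
        p = (1 - k) * p_pred
        out.append(z)
    return np.array(out)
```

With q much smaller than r, the gain k stays small, so the rapid, irregular components of the grip feature are suppressed while the slow fatigue-related trend is kept.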
3 System Function Design See Fig. 2
[Fig. 2 blocks: a grip signal sensing module and an ECG acquisition module feed an STM32 microcontroller, which connects through a Bluetooth module and a GPRS + GSM module to a mobile terminal for data analysis and display and voice prompts; when the hazard level is high, family and hospital are contacted.]
Fig. 2. System design block diagram
4 Design of the Driver's Multifunctional Safety Assistance System

This design uses the STM32F103 microcontroller as the control core. The ECG signal acquisition module measures the driver's ECG in real time, and the grip sensor module collects the pressure of the driver's hands on the steering wheel. The collected signals are processed by the STM32 microcontroller, and the data are sent to the mobile phone terminal through the Bluetooth module. The terminal analyzes and displays the received ECG data and the pressure sensor data. When a sudden change occurs in the driver's body, the danger level is automatically determined and the driver is reminded by voice. When the danger level is high, the driver's location and physical condition are sent to the family and the hospital through GPS positioning. At the same time, fatigue and drowsiness occur when driving a car; in order to judge the driver's fatigue accurately, the system collects and analyzes both the ECG signal and the grip signal to improve the accuracy of the system's fatigue judgment.

4.1 The Overall Hardware Circuit Schematic

All hardware circuit design diagrams of this system are designed using Altium Designer software (Fig. 3).

4.2 Positioning Module and Bluetooth Module

This design requires a GPRS module and a GSM module. During driving, when the driver's emergency level is high, the driver's location information is collected for voice calls and SMS monitoring. Therefore, the SIM900 module is selected. This module has a TTL-level interface, can be directly connected to the microcontroller or an ARM processor without conversion devices, and has the advantages of low power consumption and fast data transmission. Because this design transmits data in real time while the driver is driving, a Bluetooth module with small size and low power consumption is required, so the HM13 Bluetooth 4.0 BLE + EDR module is used.
The power consumption of this module is only about
Fig. 3. Hardware circuit design
9.5 mA in BLE mode. It resets automatically after power-on. The default configuration is a 115,200 baud rate, 8 data bits, 1 stop bit, and no parity. The module uses five pins to realize communication between the hardware circuit and the signal-receiving terminal, and the PIO1 output pin shows the working status of the Bluetooth module; this design uses a blue LED in a patch (SMD) package to indicate the Bluetooth connection status.

4.3 Grip Strength Detection Module

We use the ultra-thin pressure sensor FSR408, which has a long shape (length 62 cm, width 1.5 cm, thickness 3 mm) that increases the contact area between the driver's hands and the sensor without an invasive feeling. Because of factors such as driving habits, the sensor is built into the left half of the steering wheel cover. The resistance of the pressure sensor changes with the amount of applied pressure. The pressure sensor is connected to the STM32 main controller through a voltage divider circuit, as shown in the figure, which converts the changing grip resistance into a changing voltage (potential).
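The voltage-divider conversion can be sketched as a one-line model. The fixed resistor value and supply voltage below are illustrative assumptions, not values from the design; the point is that the FSR's resistance falls as pressure rises, so the sampled voltage rises with grip force.

```python
def divider_voltage(r_fsr, r_fixed=10_000.0, vcc=3.3):
    """Voltage at the divider midpoint sampled by the ADC:
    V = Vcc * R_fixed / (R_fsr + R_fixed). Lower sensor resistance
    (stronger grip) gives a higher voltage."""
    return vcc * r_fixed / (r_fsr + r_fixed)
```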
5 Conclusion
This design theoretically analyzes the principle of fatigue detection, deduces the relationship between the ECG signal and the degree of fatigue, and extracts and smooths the grip signal features. Through the design of the hardware circuit, real-time monitoring of the driver's physical health and fatigue degree is realized, and the data is transmitted to the mobile phone terminal via Bluetooth. When the driver's physical condition changes or fatigued driving is detected, the mobile phone terminal gives the driver a voice prompt. When the danger level is high, the location is sent to family members and hospitals through GPS positioning to minimize the loss. Because this design is based on the steering wheel cover, it is non-intrusive to the driver and does not affect normal driving. It has the advantages of low cost and high safety. If it can be popularized in the future, in addition to the
driver's family worrying less about the driver, it will also reduce the occurrence of traffic accidents.
Edge Computing-Enabled Dynamic Multi-objective Optimization of Machining Parameters Zhibo Sui, Xiaoxia Li(B) , Jianxing Liu, and Zhengqi Zeng College of Informatics, Huazhong Agricultural University, Wuhan 430070, China [email protected]
Abstract. Dynamic events such as the arrival of urgent parts, due date changes, tool wear and so on are inevitable in machining processes. Optimizing the machining parameters in real time to respond to these dynamic events can significantly improve multiple machining performances. In this paper, an edge computing-enabled dynamic multi-objective optimization approach has been developed to achieve the real-time optimization of machining parameters. In the approach, edge servers are scheduled to provide the optimal computing resources. Based on these resources, an improved dynamic two-archive evolutionary algorithm is developed to optimize the machining parameters and respond to the dynamic events. The proposed method is compared with a random edge computing resource selection mechanism, the normal dynamic two-archive evolutionary algorithm and NSGA-II. The experimental results illustrate the high performance of the proposed method in the dynamic machining process.
1 Introduction
With the ever-changing competitive market requirements, multiple machining performances such as productivity, machining cost, sustainability and so on are pursued by machining companies. All these machining performances can be improved significantly by the careful planning of machining processes [1]. Thus, it is imperative for machining companies to take measures to optimize their machining processes to remain competitive. One of the most effective strategies for optimizing machining processes is the optimization of the machining parameters selected for a computer numerical control system to machine workpieces. In modern manufacturing systems, the optimal machining parameters are commonly obtained using evolutionary algorithms. The evolutionary algorithm-based methods can be classified into two types: the weight-based method and the Pareto-based method. For the former, all the optimization objectives are combined into an integrated one by granting a weighted coefficient to each objective, and then an evolutionary algorithm is employed to identify the optimal machining parameters [2–4]. For the latter, a Pareto-dominance mechanism is used to obtain multiple groups of optimal machining parameters, and then the most suitable Pareto-optimal parameter solution is selected for the machining process [5–11]. Compared with the former, the latter can avoid the dependency on the selection of the weights. However,
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_130
the machining processes are in a dynamic environment where many dynamic events may occur. Thus, the machining parameters need to be re-optimized to respond to the dynamic events. In responding to dynamic events, two aspects should be considered. One is the response time, which should be as short as possible. The other is the response range, which should cover more kinds of dynamic events. For the former, cloud centers are widely employed to run evolutionary algorithms to take advantage of their computing capabilities. However, the response efficiency is limited by the transmission of new optimum solutions between the shop floor and the cloud. For the latter, some online adaptive methods for feed rate optimization have been explored to respond to machine-related events; however, other dynamic events such as the arrival of urgent parts, due date changes and so on have been neglected. To address the above issues, an edge-cloud hybrid computing-enabled dynamic multi-objective optimization method for machining parameters is presented. In the edge-cloud hybrid computing system, an edge computing resource scheduling strategy is developed to provide real-time computing services. Based on the hybrid computing system, a feature-based general scheme is presented to achieve the dynamic optimization. During the optimization, a parallel two-archive evolutionary algorithm is designed to search for the optimal machining parameters. The remainder of this paper is organized as follows. In Sect. 2, the edge-cloud hybrid computing system is presented. The general scheme of dynamic multi-objective machining parameter optimization (DMOMPO), the strategy for edge computing resource scheduling and the parallel two-archive evolutionary algorithm are presented in Sect. 3. In Sect. 4, case studies and comparisons with the cloud computing system and other multi-objective evolutionary algorithms are given. Finally, conclusions are drawn in Sect. 5.
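As a toy illustration of the weight-based strategy reviewed above, the objectives can be collapsed into one score before a single-objective optimizer is run; the objective values and weights below are purely illustrative, not measured machining data.

```python
# Weight-based scalarization: combine all objectives into one score
# (lower is better) using chosen weighting coefficients.

def weighted_score(objectives, weights):
    """objectives, weights: equal-length sequences of floats."""
    return sum(w * f for w, f in zip(weights, objectives))

# e.g. energy (kJ), time (s), roughness (um), tool wear (mm),
# all weighted equally - illustrative numbers only:
score = weighted_score([120.0, 35.0, 1.6, 0.08], [0.25, 0.25, 0.25, 0.25])
```

The Pareto-based alternative keeps the objective vector intact and compares candidate parameter sets by dominance instead, which is what removes the dependency on the weights.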
2 Edge-Cloud Hybrid Computing System for DMOMPO
The goal of DMOMPO is to respond to the dynamic events occurring in the machining process by optimizing the machining parameters. In order to improve the response efficiency, an edge-cloud hybrid computing system is presented in this chapter. It is composed of three layers (see Fig. 1): the aided computing layer, the edge computing layer and the cloud computing layer. The layers are outlined as follows:
• The aided computing layer is responsible for supporting the other two layers in data and communication. It consists of the following two sublayers:
– Equipment sublayer: It is composed of hardware equipment such as the CNC machines, sensors, cameras and so on. It is responsible for collecting the data in the machining process and executing the optimization solution.
– Communication sublayer: The industrial networks are deployed in this sublayer to connect all the elements in the system and achieve interconnection.
• In the cloud computing layer, the historical data from the equipment sublayer is used to establish the prediction models for the performance criteria. Based on the prediction
Fig. 1. Architecture for edge-cloud hybrid computing system
models, the initial optimization solutions for scheduling and machining parameters are identified using the evolutionary algorithms.
• The edge computing layer is responsible for responding to the dynamic events occurring in the machining process. Thus, there are two main works to be finished in this layer. One is re-optimizing the machining parameters in real time to respond to the dynamic events. The other is scheduling the edge computing servers to be used as the computing resources.
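The edge layer's scheduling duty (detailed step by step in Sect. 3.2) can be sketched as follows; the tuple-based server records and their timing fields are hypothetical stand-ins for measurements on a real edge cluster.

```python
# Feature-grouping-based edge server scheduling: pick the n2 servers
# (n2 = number of feature subsets) with the shortest total time.

def schedule_edge_servers(unoccupied, occupied, n_subsets):
    """unoccupied: list of (name, comm_time, proc_time) tuples;
    occupied: list of (name, comm_time, queue_time, proc_time) tuples;
    n_subsets: number of feature subsets to optimize concurrently."""
    n1 = len(unoccupied)
    if n1 > n_subsets:
        # Keep the idle servers with the shortest comm + process time.
        best = sorted(unoccupied, key=lambda s: s[1] + s[2])
        return [s[0] for s in best[:n_subsets]]
    # Otherwise use every idle server plus the n2 - n1 busy servers
    # with the shortest comm + queuing + process time.
    busy = sorted(occupied, key=lambda s: s[1] + s[2] + s[3])
    return [s[0] for s in unoccupied] + [s[0] for s in busy[:n_subsets - n1]]
```

When idle servers outnumber the feature subsets, only communication and process time matter; otherwise queuing time on the busy servers enters the comparison.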
3 Dynamic Multi-Objective Machining Parameter Optimization
3.1 The General Scheme
In order to obtain the general scheme, the model for DMOMPO is established. Spindle speed n (r/min), feed rate f (mm/min), cutting depth ap (mm), cutting width ae (mm) and tool wear TW (mm) are selected as the decision variables. Four important machining responses, namely cutting energy consumption (ECcutting), cutting time (Tcutting), surface roughness (Ra) and tool wear (TW), are considered as the optimization objectives to assess energy consumption, productivity, machining quality and machining cost, respectively. Based on the historical data, the prediction models for the specific indicators can be established on the servers in the cloud layer to characterize the relationship between
the machining parameters and the indicators. Additionally, several practical machining constraints are considered to keep the machining parameters within the feasible domain of the machine tool. The dynamic multi-objective model is then formulated as follows:

Minimize F(X, t) = [ECcutting(X, t), Tcutting(X, t), TW(X, t), Ra(X, t)]
subject to C(t): nmin ≤ n ≤ nmax, fmin ≤ f ≤ fmax, apmin ≤ ap ≤ apmax, aemin ≤ ae ≤ aemax

where X is the vector of decision variables (n, f, ap, ae) together with the cutting volume and the tool wear, t is the discrete time, ECcutting, Tcutting, TW and Ra are the time-varying objectives which are added or deleted at time t, C is the time-varying constraint set, nmin and nmax are the minimum and maximum spindle speed, fmin and fmax are the minimum and maximum feed rate, apmin and apmax are the minimum and maximum cutting depth, and aemin and aemax are the minimum and maximum cutting width, respectively. Based on the above dynamic multi-objective model, the general scheme is obtained. Its main steps are listed in the following.
Step 1: Feature grouping. Firstly, the features to be machined on the same machine are divided into several sets, where each set can be machined in a single clamping. Then, a feature set is further divided into several subsets whose features share the same tool.
Step 2: Initialization. The initial optimal machining parameters for each feature subset are identified using the improved dynamic two-archive evolutionary algorithm (IDTAEA) in the cloud layer.
Step 3: Monitoring. The whole machining process is monitored to judge whether there is an urgent job insertion disturbance.
Step 4: Responding to the dynamic events. If the above disturbance occurs, three works will be done. Firstly, the edge servers are scheduled to organize the computing resources. Secondly, the features of the workpiece in the new job are grouped into sets and subsets.
Thirdly, TW and Ra are deleted from the optimization objectives, and IDTAEA is run to obtain the optimal machining parameters for the urgent features. If no disturbance occurs, IDTAEA is performed to identify the new optimal machining parameters to respond to the tool wear.
Step 5: Steps 3–4 are repeated until all the features have been machined.
3.2 Edge Computing Resource Scheduling
In order to take advantage of the edge computing resources and improve the response efficiency to dynamic events, a feature grouping-based method is developed to achieve edge computing resource scheduling, which is listed as follows:
Step 1: An edge server with high computing capability is selected from the unoccupied edge servers as the master, which is responsible for dividing the features of the workpiece in the urgent job and computing the client servers' time.
Step 2: Let n1 and n2 represent the number of unoccupied servers and the number of feature subsets, respectively. If n1 > n2, Step 3 will be executed; else, Step 4 will be executed.
Step 3: The communication time and process time are computed and added to obtain the total time on the master. The n2 servers with the shortest total time are kept to concurrently execute the evolutionary algorithm.
Step 4: The communication, queuing and process times of the remaining servers are computed and added to obtain the total time. The n2 − n1 servers with the shortest total time and the n1 unoccupied servers are used to concurrently execute IDTAEA.
3.3 The Improved Dynamic Two-Archive Evolutionary Algorithm
Based on the dynamic two-archive evolutionary algorithm (DTAEA) [12], an improved dynamic two-archive evolutionary algorithm (IDTAEA) with lower time complexity is developed to speed up the response to the dynamic events. The IDTAEA keeps two co-evolving populations called the convergence archive (CA) and the diversity archive (DA). CA and DA are used to maintain the population's convergence and diversity, respectively. When the number of objectives changes, three operators will be executed. Firstly, the compositions of CA and DA will be reconstructed using the reconstruction operator. Based on the new CA and DA, the reproduction operator is executed to obtain offspring. Then, CA and DA are updated using the corresponding update operators. For CA, the update operator presented in [12] is used directly. However, for DA, a new update mechanism is designed, because the original DA update's time complexity of O(n³) is the highest among all the operators. As shown in Tables 1 and 2, the solution set {S1, S2, S3, …} obtained by non-dominated quick sorting is used as the input. The solution set S that has not been used to update CA is selected from {S1, S2, S3, …}.
Then, the crowding distances of these selected solutions are calculated and sorted in descending order. It should be pointed out that the quick sorting algorithm [13] is used to achieve the sorting. A sorted solution set S′ is obtained, and its first N solutions are taken to update DA. Since the update operators of CA and DA share the same results of non-dominated quick sorting [14], the most time-consuming operations are the sorting operation (line 3 of Algorithm 1) and the crowding distance computation (line 2 of Algorithm 2). The former costs O(n log n) comparisons while the latter takes O(n), where n is the number of solutions in S. Thus, the time complexity of DA's update is lowered to O(n log n).
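The simplified DA update can be sketched as below. This is a hedged reading of Algorithms 1–2: the flat objective-matrix layout, the absolute-difference form of the neighbour gap, and infinite scores for boundary solutions are assumptions made to keep the sketch runnable, not the authors' exact operator.

```python
# Crowding-distance-based DA update: score each solution by the spread
# of its neighbours along the (already front-sorted) objective matrix,
# then keep the N most spread-out solutions after one O(n log n) sort.

def crowding_scores(front):
    """front: list of objective vectors sorted along the front.
    Boundary solutions get an infinite score so they always survive."""
    n, m = len(front), len(front[0])
    scores = [float("inf")] * n
    for i in range(1, n - 1):
        scores[i] = sum(abs(front[i + 1][j] - front[i - 1][j]) for j in range(m))
    return scores

def update_da(candidates, N):
    """Keep the N candidates with the largest crowding distance."""
    scores = crowding_scores(candidates)
    order = sorted(range(len(candidates)), key=scores.__getitem__, reverse=True)
    return [candidates[i] for i in order[:N]]
```

The single sort over precomputed O(n) scores is what replaces the O(n³) step of the original DA update.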
4 Case Studies and Analysis
In order to validate the effectiveness and feasibility of the proposed approach, a prototyping platform with edge computing servers and the Internet of Things is constructed. The device layer contains CNC machines, power sensors, manipulators and so on. The edge computing layer is composed of multiple computers with different configurations, which are connected via Wi-Fi, wired local area network and Bluetooth. For the cloud layer, the Elastic Compute Service provided by Alibaba Cloud Computing Co. Ltd. is rented. Based on the prototype platform, an experiment is constructed. In the experiment,
Table 1. Update operator of DA
Algorithm 1: Update DA
Input: {S1, S2, S3, …}
Output: updated DA
1 DA := ∅; i := 0;
2 S := the solutions {Si, Sj, Sk, …} that have not been used to update CA;
3 S′ := sort CrowdingDistance(S) in descending order;
4 DA := DA ∪ {S1′, S2′, S3′, …};
5 return DA

Table 2. Computing crowding distance operator
Algorithm 2: CrowdingDistance
Input: candidate solutions S
Output: the crowding distances of S
1 CrowdDist_S := [0, 0, …]; i := 0;
2 for i := 1 to |S| do
3   CrowdDist_S[i] := Σ_{j=1}^{m(t)} (f_{i+1,j} − f_{i−1,j})
4 return CrowdDist_S
multiple 50 mm × 50 mm × 40 mm blocks are used for machining experiments. The CNC milling machine (Xendell C000017A) is used to machine the workpieces. To verify the advantages of the proposed algorithm, IDTAEA, DTAEA and NSGA-II are each run 15 times. The number of iterations is 200. During the iterations, the number of optimization objectives is changed at the 50th, 100th and 150th iterations. The proportion of non-dominated solutions in the solution set found by each algorithm is calculated and shown in Table 3. Obviously, IDTAEA and DTAEA are better than NSGA-II in convergence. The average running time shows that IDTAEA is better than DTAEA. Also, IDTAEA's advantage in convergence and diversity is verified by comparing the box diagrams shown in Figs. 2 and 3, respectively. Additionally, the proposed edge computing resource scheduling is compared with the random selection mechanism. IDTAEA is run 10 times on the edge computers selected by the proposed scheduling method and by the random selection mechanism, respectively. It can be observed that the proposed scheduling method precedes the random selection mechanism in running time (see Table 4).
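As a quick arithmetic check on that last comparison, the ten per-run times listed in Table 4 can be summarized directly (values transcribed from the table):

```python
# Per-run IDTAEA times (s) on servers picked by each mechanism (Table 4).
random_t = [19.76155, 19.63867, 18.88248, 19.17254, 19.16912,
            20.34328, 19.83118, 19.64896, 19.93277, 19.70748]
sched_t = [14.55428, 15.0658, 14.18027, 13.41411, 14.21678,
           14.43962, 15.7697, 15.5502, 15.78552, 13.8865]

mean = lambda xs: sum(xs) / len(xs)
# The proposed scheduling wins in every one of the ten runs:
wins = all(r > s for r, s in zip(random_t, sched_t))
saving = 1 - mean(sched_t) / mean(random_t)  # about 25% less running time
```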
5 Conclusions
The optimization of machining parameters in a dynamic manufacturing environment is critical for improving multiple machining performances. In this paper, a scheduling mechanism is presented to take advantage of edge computing resources. Based on the edge computing resources, IDTAEA is developed to achieve dynamic multi-objective
Table 3. Comparison among the evolutionary algorithms (Prop. = proportion of non-dominated solutions (%); Time = average running time (s))

Dynamic response               NSGA-II             DTAEA               IDTAEA
                            Prop.    Time       Prop.    Time       Prop.    Time
Decreasing objective 50th   18.59    70.016     41.24    16.118     40.17    14.49
Decreasing objective 100th  17.02    79.202     42.60    17.26      40.38    14.14
Decreasing objective 150th  18.72    83.052     39.82    18.46      41.46    14.86
Increasing objective 50th   17.11    103.16     41.31    18.35      41.58    14.84
Increasing objective 100th  19.17    108.596    40.07    17.26      40.76    14.17
Increasing objective 150th  14.97    147.09     42.27    16.131     42.77    13.506
Avg                         17.60    98.52      41.21    17.26      41.19    14.33
(a) Decreasing objective at the 50th iteration
(b) Increasing objective at the 50th iteration
Fig. 2. Box diagrams for convergence comparison
optimization of machining parameters to respond to the dynamic events in the machining processes. In summary, the contributions of the presented approach are as follows:
• A feature-based method for edge computing resource scheduling is proposed to improve the utilization rate of the computing resources in the edge layer.
• An IDTAEA with lower time complexity is proposed to improve the efficiency of dealing with the dynamic events occurring in the machining process.
Currently, the proposed approach still has several limitations, and future enhancements are needed. More optimization strategies should be considered and integrated into the hybrid computing system. Meanwhile, more workpiece machining experiments should be conducted to further demonstrate the approach.
(a) Decreasing objective at the 50th iteration
(b) Increasing objective at the 50th iteration
Fig. 3. Box diagrams for diversity comparison
Table 4. Comparison between the edge computing resource scheduling methods

No.   Random selection method (s)   Proposed scheduling method (s)
1     19.76155                      14.55428
2     19.63867                      15.0658
3     18.88248                      14.18027
4     19.17254                      13.41411
5     19.16912                      14.21678
6     20.34328                      14.43962
7     19.83118                      15.7697
8     19.64896                      15.5502
9     19.93277                      15.78552
10    19.70748                      13.8865
Acknowledgements. This research was supported by Natural Science Foundation of China (grant no. 61803169) and the Fundamental Research Funds for the Central Universities (grant no. 2662018JC029). The paper reflects only the authors’ views and the Union is not liable for any use that may be made of the information contained therein.
References 1. Tao F, Bi LN, Zuo Y et al (2017) A cooperative co-evolutionary algorithm for large-scale process planning with energy consideration. J Manuf Sci Eng 139:061016 2. Wang S, Lu X, Li XX et al (2015) A systematic approach of process planning and scheduling optimization for sustainable machining. J Clean Prod 87:914–929 3. Anand Y, Gupta A, Abrol A et al (2016) Optimization of machining parameters for green manufacturing. Cogent Eng 3(1):115–292 4. Jiang ZG, Zhou F, Zhang H et al (2015) Optimization of machining parameters considering minimum cutting fluid consumption 108:183–191 5. Li CB, Li LL, Tang Y et al (2019) A comprehensive approach to parameters optimization of energy-aware CNC milling. J Intell Manuf 30:123–138 6. Li CB, Chen XZ, Tang Y et al (2017) Selection of optimum parameters in multi-pass face milling for maximum energy efficiency and minimum production cost. J Clean Prod 140:1805–1818 7. Yi Q, Li CB, Tang Y et al (2015) Multi-objective parameter optimization of CNC machining for low carbon manufacturing. J Clean Prod 95:256–264 8. Zhang L, Zhang BK, Bao H et al (2018) Optimization of cutting parameters for minimizing environmental impact: considering energy efficiency, noise emission and economic dimension. Int J Precis Eng Manuf 19(4):613–624 9. Xu GD, Chen JH, Zhou HC et al (2019) Multi-objective feedrate optimization method of end milling using the internal data of the CNC system. Int J Adv Manuf Technol 101:715–731 10. Xie N, Zhou JF, Zheng BR (2018) Selection of optimum turning parameters based on cooperative optimization of minimum energy consumption and high surface quality. In: 51st CIRP conference on manufacturing systems 72:1469–1474
11. Tapoglou N et al (2016) Online on-board optimization of cutting parameter for energy efficient CNC milling. In: 13th Global conference on sustainable manufacturing 40:384–389 12. Chen RZ, Li K, Yao X (2017) Dynamic multi-objectives optimization with a changing number of objectives. IEEE Trans Evol Comput. https://doi.org/10.1109/TEVC.2017.2669638 13. Dwivedi R, Jain DC (2014) A comparative study on different types of sorting algorithms (On the Basis of C and Java). Int J Comput Sci Eng Technol 5(8):805–808 14. Deb K, Agarwal S et al (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197
Improved Dark Channel Prior Algorithm Based on Wavelet Decomposition for Haze Removal in Dynamic Recognition Peiyang Song(B) Tianjin No.1 High School, Tianjin, China [email protected]
Abstract. Environmental perception and precise positioning of area targets are key technologies in dynamic recognition. However, the perceptual information that is dynamically acquired in hazy weather cannot provide an accurate basis for decision-making and dynamic planning. In addition, under some outdoor conditions, such as bright light, the quality of the perception information obtained by the system is usually low, and the robustness of area target recognition is relatively poor. In this paper, the implementation strategy of haze removal for images is comprehensively studied, and an improved dark channel prior algorithm is accordingly proposed by introducing wavelet decomposition. The related experimental research is carried out in the MATLAB development environment. As a result, haze can be effectively removed, and the real-time performance of the improved dark channel prior algorithm can be greatly enhanced. Keywords: Area target positioning · Dark channel prior algorithm · Wavelet decomposition · Image enhancement
1 Introduction
With the rapid development of information technology and electrification in modern society, the dynamic identification technology of regional targets has become the scientific basis and a key technology in the emerging fields of visual navigation, robotics, intelligent transportation, and public security. As shown in Fig. 1, the dynamic identification system is mainly composed of an environment perception module, a decision and planning module, and a control and execution module. Environment perception and accurate positioning of regional targets are the first steps of dynamic recognition [1]. Environmental perception studies how to process the information collected by conventional sensors through an information processing system comprising complex programs, finally obtaining results that cannot be obtained by the basic sensors. Under haze weather conditions, as light travels in fog and haze media, the scattering effect of particles makes the image collected by the imaging sensor decay gradually along the optical path, resulting in image blurring, so
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_131
that the perception information acquired for dynamic target recognition cannot be used as a reliable basis for precisely positioning the regional targets. It is essential to effectively remove haze from images to eliminate the influence of the haze environment on image quality and increase the visibility of images. Recently, this has become a frontier subject of common concern in the fields of image processing and computer vision, and also a research hotspot for dynamic recognition [2–4].
Fig. 1. Dynamic identification system
The purpose of haze removal is to reduce the fuzziness of a multi-band or panchromatic image and restore the clear visibility that would be obtained in ideal weather. At present, haze removal methods can be divided into two categories: one is based on image enhancement algorithms and the other arises from image restoration algorithms. The former mainly includes histogram equalization [5], the Retinex algorithm [6], wavelet image enhancement [7], and enhancement based on the atmospheric scattering model [8]. The key idea of the histogram equalization algorithm is to change the gray-level histogram of the original image from a certain gray-level interval to a uniform distribution over the whole gray-level range. As it carries out nonlinear stretching on the image and redistributes the image pixel values, the number of pixels in each gray range becomes roughly the same. It is suitable for images where the haze is relatively uniform. Unlike linear and nonlinear algorithms, which can only enhance a certain type of image feature, the Retinex algorithm achieves a balance among dynamic range compression, edge enhancement, and color constancy, so it can perform adaptive enhancement for various types of images. However, the Retinex algorithm is based on color constancy, so it is not applicable to multi-illumination scenes with constantly changing environmental targets during the dynamic recognition of targets by artificial intelligence. The wavelet image enhancement algorithm exploits the fact that haze is mainly concentrated in the low-resolution information of the image, which can be obtained by wavelet decomposition. This method increases the contrast of the image but also amplifies the image noise, and the signal-to-noise ratio of the enhanced image is
relatively low. The enhancement algorithm based on the atmospheric scattering model estimates the aerosol transfer function of the degradation process from the influence of the atmosphere and realizes haze removal through frequency domain restoration. Its advantage is a good haze removal effect, but its disadvantage is that the fog concentration and scene depth must be estimated; thus, its application is not universal. Haze removal algorithms based on image restoration rely on deep learning or on prior-theory algorithms. The basic idea of the deep learning approach is to introduce a neural network to estimate the atmospheric degradation model [9, 10]. The advantage is that a large amount of data can be used to train the model on hazy and haze-free images, while the disadvantage is the great challenge in applying and generalizing across different types of data. In view of the diversity of environmental targets, this approach may not be appropriate for real-time monitoring of dynamic regional targets. Haze removal algorithms based on prior theory mainly include the maximum contrast method proposed by Tan et al. [11], which improves the image contrast but cannot avoid the halo phenomenon. The color attenuation prior method proposed by Zhu et al. [12] requires the image to be rich in color; when the smog is severe, the haze removal effect is not ideal. The dark channel prior method (DCP algorithm) proposed by He et al. relies on prior knowledge of the dark channel image to assist the estimation of the transmission map [13–15], so as to obtain the haze-free image. Its advantage is that it can restore outdoor haze images with high quality. However, due to the need to obtain the transmittance map and to process a large quantity of floating-point numbers during haze removal, the operation efficiency of the whole program is low.
Therefore, how to make the dark channel prior algorithm run in real time has become an important issue in the research of haze removal algorithms. In view of the above problems, this paper first studies the threshold selection strategy of wavelet compression and then obtains the prior scale decomposition parameters through comparative experiments. After that, transmission map parameters are studied under different resolution mechanisms. Finally, an improved dark channel prior algorithm is proposed. The second part introduces the improved dark channel prior algorithm based on wavelet decomposition, and the third part presents the experimental research.
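Of the enhancement methods surveyed above, histogram equalization is the simplest to sketch: remap gray levels through the normalized cumulative histogram so the output levels spread over the full range. The grayscale list-of-lists version below is an illustration, not the paper's code.

```python
# Global histogram equalization for a grayscale image.

def equalize_hist(gray, levels=256):
    """gray: 2-D list of ints in [0, levels). Returns the equalized image."""
    flat = [p for row in gray for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # Cumulative distribution, then a lookup table scaled to [0, levels-1]
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    n = len(flat)
    lut = [round((c / n) * (levels - 1)) for c in cdf]
    return [[lut[p] for p in row] for row in gray]
```

Because the mapping is global, it works best when the haze (and hence the gray-level compression) is roughly uniform across the image, matching the limitation noted above.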
2 Improved Dark Channel Prior Algorithm Based on Wavelet Decomposition
The theoretical basis of the dark channel prior algorithm is that, in most local areas that do not cover the sky, some pixels have at least one color channel with a very low value [15]. The mathematical definition of the dark channel is shown in Formula 1:

J^dark(x) = min_{y∈C(x)} ( min_{c∈{R,G,B}} J^c(y) )   (1)
In Formula 1, J^c is one color channel of J, and C(x) is a window centered on pixel x. The fog image formation model is shown in Formula 2:

I(x) = J(x)t(x) + A(1 − t(x))   (2)
Among them, I(x) is the image from which we remove the fog, and J(x) is the fogless image we want to restore. A is the global atmospheric light, and t(x) is the transmittance. Given I(x), we want the target value J(x); obviously, this is an equation with an infinite number of solutions, so some prior information is needed. Formula 3 gives the estimation of the transmittance, where superscript c denotes the R/G/B channels:

˜t(x) = 1 − min_{y∈C(x)} ( min_{c} I^c(y) / A^c )   (3)
The atmospheric light value A is obtained from the dark channel of the fog image. The specific steps are as follows: (1) Take the brightest 0.1% of pixels from the dark channel map. (2) Among these positions, find the value of the corresponding point with the highest brightness in the original foggy image I as the value of A. As can be seen from Eq. 3, the calculation of the transmittance requires a large amount of floating-point arithmetic, so it cannot meet the real-time requirement. In view of the time–frequency localization characteristics of the wavelet transform, whose low-resolution information maintains the main features of the image while the high-resolution information represents the image details, an improved algorithm combining the wavelet transform and the dark channel prior algorithm is put forward. The improved algorithm is as follows: firstly, the threshold selection strategy of wavelet compression is analyzed and the threshold parameters are obtained; secondly, parameters of haze removal images with different resolutions are obtained through wavelet decomposition, and the parameters of the prior decomposition layers are determined; thirdly, according to the parameters of the prior decomposition layers, the original image is decomposed by wavelet transform, and after decomposition only the low-frequency components of that scale are retained to obtain the transmittance. Finally, the dark channel prior algorithm is used for haze removal. The algorithm flow is shown in Fig. 2.
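The key steps above can be sketched in numpy: keep only the low-frequency band of a one-level wavelet decomposition, then estimate A and the transmittance on that small image. A Haar block average stands in for the paper's Bior 3.7 basis (PyWavelets' `pywt.dwt2` would be used in practice), and the patch size and the ω weight in the transmittance are assumed values, not the paper's.

```python
import numpy as np

def haar_ll(channel):
    """Low-frequency (LL) band: 2x2 block averages, quartering the image."""
    h, w = (channel.shape[0] // 2) * 2, (channel.shape[1] // 2) * 2
    x = channel[:h, :w]
    return 0.25 * (x[0::2, 0::2] + x[0::2, 1::2] + x[1::2, 0::2] + x[1::2, 1::2])

def low_freq_image(img):
    """Per-channel LL bands of an HxWx3 image in [0, 1]."""
    return np.stack([haar_ll(img[:, :, c]) for c in range(3)], axis=2)

def dark_channel(img, patch=7):
    """Formula 1: channel-wise minimum, then a patch-wise minimum filter."""
    mins = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(mins, pad, mode="edge")
    return np.array([[padded[i:i + patch, j:j + patch].min()
                      for j in range(mins.shape[1])]
                     for i in range(mins.shape[0])])

def estimate_A(img, dark):
    """Mean color of the brightest 0.1% dark-channel pixels (steps 1-2)."""
    n = max(1, int(dark.size * 0.001))
    idx = np.argsort(dark.ravel())[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)

def transmittance(img, A, omega=0.95):
    """Formula 3, with the usual omega factor kept as an assumption."""
    return 1.0 - omega * dark_channel(img / A)
```

Because the LL band holds one quarter of the pixels (the 0.25 compression ratio used later), the patch-minimum in `dark_channel`, the dominant floating-point cost, touches far less data, which is where the real-time gain comes from.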
3 Experiment Research We carried out the experimental study of the improved algorithm on a machine with a CPU G4560 and 8 GB memory, using MATLAB. The original fog image, obtained from the BAIDU image library, is shown in Fig. 3. 3.1 Threshold Selection Strategy of Wavelet Compression Figure 4 shows the gray-level-gradient histogram of the original fog image. Figure 5 is the decomposition diagram after two-level wavelet decomposition using the Bior3.7 wavelet basis. Figure 6 shows threshold selection via the balance sparsity norm in wavelet decomposition. As can be seen from Fig. 5, the lowest frequency band retains the main features of the image after compression, and its size is reduced to 1/16 of the original.
Improved Dark Channel Prior Algorithm Based on Wavelet …
Fig. 2. Flowchart of improved dark channel prior algorithm based on wavelet decomposition
Fig. 3. Original fog image
3.2 Comparative Study of Image Characteristic Parameters on Haze-Removal Images of Different Resolutions and the Original Scale to Determine the Prior Decomposition Scale Parameter At first, the Bior3.7 wavelet basis was used to decompose the original image into one and two layers, respectively, and the low-frequency information was saved after decomposition. Then, Formula (3) was used to calculate the transmittance of the decomposed low-frequency image, and the dark channel prior algorithm was used for haze removal. From the parameter comparison and the visual defogging effect, the better defogging result is obtained after one level of wavelet decomposition. As a result, the prior decomposition scale parameter is 1 and the compression ratio is 0.25.
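The idea of keeping only the lowest-frequency band can be illustrated with a dependency-free Haar-style approximation (the paper uses the Bior3.7 basis in MATLAB; Haar 2×2 averaging is a stand-in chosen here so the sketch runs without PyWavelets). One decomposition level reduces the approximation image to 1/4 of the original area, matching the compression ratio of 0.25 quoted above.

```python
def haar_lowpass(img):
    """One-level 2D low-frequency approximation: average each 2x2 block.
    (A Haar-style stand-in for the Bior3.7 basis used in the paper.)"""
    h, w = len(img) // 2 * 2, len(img[0]) // 2 * 2
    return [[(img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 4.0
             for j in range(0, w, 2)]
            for i in range(0, h, 2)]
```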
P. Song
Fig. 4. Original fog image’s gray-level-gradient histogram
3.3 Collected Images Were Decomposed According to the Prior Decomposition Scale Parameter and the Low-Resolution Information Was Retained Figure 7 shows the original image compressed using the Bior3.7 wavelet basis at level 1. Figures 8 and 9 give the parameter statistics of the original and lowest-frequency-band images. It can be seen from Figs. 8 and 9 that the main parameter information of the image remains basically the same. 3.4 Dark Channel Prior Haze Removal Treatment Was Carried Out Formula (2) is used for image haze removal. Figure 10 compares the haze-removal effects of different methods applied to haze images: Fig. 10b is the image after haze removal with the dark channel prior (DCP) algorithm, and Fig. 10c is the image after haze removal with the improved dark channel prior algorithm based on wavelet decomposition. As can be seen from the figure, the improved algorithm maintains the high-quality haze-removal effect of the DCP algorithm. Table 1 compares common evaluation indicators for Fig. 10b and Fig. 10c. As the table shows, the improved algorithm preserves the original features of the image while requiring less program running time.
4 Conclusion In summary, the dark channel prior algorithm based on wavelet compression greatly improves real-time performance while preserving effective haze removal, so it can meet the real-time requirements of dynamic target recognition. This study sheds light on the dark channel prior algorithm for haze removal in dynamic recognition for real-time traffic monitoring, target tracking and safety monitoring.
Fig. 5. Decomposition at level 2
Fig. 6. Threshold selection via the balance sparsity norm in wavelet compression
a) Decomposition image at level 1
b) Reconstructed lowest-frequency image at level 1
Fig. 7. Wavelet decomposition using Bior3.7 at level 1
Fig. 8. Original image’s parameters statistics
Fig. 9. Lowest frequency band image’s parameters statistics
a) The original image
b) The image after haze removal with the DCP algorithm
c) The image after haze removal with the improved dark channel prior algorithm based on wavelet decomposition
Fig. 10. A comparison of haze removal effects of different methods applied to haze images
Table 1. Common evaluation indicators compared between Fig. 10b and Fig. 10c

Indicator           | Original image | With DCP algorithm | With the improved algorithm
--------------------|----------------|--------------------|----------------------------
Information entropy | 6.4698         | 6.1165             | 6.4735
PSNR                | 24.0714        | 24.1491            | 24.1004
Average gradient    | 0.0235         | 0.0236             | 0.0280
Running time (ms)   | –              | 116                | 66
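Two of the indicators above can be computed directly from a gray image. The definitions below (Shannon entropy of the gray-level histogram and one common average-gradient formula) are standard choices and may differ in detail from the paper's MATLAB implementation.

```python
import math

def information_entropy(gray):
    """Shannon entropy of the gray-level histogram (integer values 0-255)."""
    hist = [0] * 256
    n = 0
    for row in gray:
        for v in row:
            hist[v] += 1
            n += 1
    return -sum(c / n * math.log2(c / n) for c in hist if c)

def average_gradient(gray):
    """Mean of sqrt((dx^2 + dy^2) / 2) over interior pixels (one common definition)."""
    h, w = len(gray), len(gray[0])
    total, cnt = 0.0, 0
    for i in range(h - 1):
        for j in range(w - 1):
            dx = gray[i][j + 1] - gray[i][j]
            dy = gray[i + 1][j] - gray[i][j]
            total += math.sqrt((dx * dx + dy * dy) / 2.0)
            cnt += 1
    return total / cnt
```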
References 1. Zhao J, Liu B, Wang GF, Wei YG, Sun J (2018) Design of traffic video analysis and tracking system. J Image Signal Process 7(4):236–248 2. Zhang XG, Tang ML, Chen H (2014) A dehazing method in single image based on double-area filter and image fusion. Acta Autom Sinica 40(8):1733–1739 3. Yao Y, Li XY, Meng JH (2020) Image dehazing algorithm based on conditional generation against network. J Image Signal Process 9(1):1–7 4. Miao QG, Li YN (2017) Research status and prospect of image dehazing. Comput Sci 44(11):1–8 5. Li CL, Song YQ, Liu XF (2015) Traffic image haze removal method based on MSR theory. J Comput Appl A02:234–237 6. Zhang SN, Wu YD, Zhang HY et al (2013) Improved single scale Retinex foggy image enhancement algorithm. Laser Infrared 6:698–702 7. Wang YF, Yin CL, Huang YM et al (2014) Image haze removal using a bilateral filter. J Image Graph 03:58–64 8. Wu PF, Fang S, Xu QS et al (2011) Restoration of blurred image based on atmospheric MTF. J Atmosp Environ Opt 6(3):196–202 9. Duan LC, Liu C, Zhong W, Chen LQ, Jiang MR (2017) A method of image dehazing using atmospheric scattering model. J Image Signal Process 6(2):78–88 10. Li BQ, Hu XH (2019) Effective distributed convolutional neural network architecture for remote sensing images target classification with a pre-training approach. J Syst Eng Electron 30(2):238–244 11. Tan RT (2008) Visibility in bad weather from a single image. In: Proceedings of IEEE conference on computer vision and pattern recognition. IEEE Computer Society, Washington, DC, pp 2347–2354 12. Zhu Q, Mai J, Shao L (2015) A fast single image haze removal algorithm using color attenuation prior. IEEE Trans Image Process 24(11):3522–3533 13. Hu W, Yuan GD, Dong Z et al (2015) Improved single image dehazing using dark channel prior. J Syst Eng Electron 26(5):1070–1079 14. Wu YP (2018) Research of nighttime image dehazing by fusion. Comput Sci Appl 8(5):798–808 15. He KM, Sun J, Tang XO (2011) Single image haze removal using dark channel prior. IEEE Trans Pattern Anal Mach Intell 33(12):2341–2353
A Novel Broadband Crossover with High Isolation on Microwave Multilayer PCB Xu Yang1 , Jiancheng Liu2 , Meiying Wei1 , Xiaoming Li2 , Anping Li1 , and Xiaofei Zhang1(B) 1 The State Radio Monitoring Center, Beijing 100037, P. R. China {yangxu,weimeiying,lianping,zhangxf}@srrc.org.cn 2 The 54th Research Institute of CETC, Shijiazhuang, P. R. China [email protected], [email protected]
Abstract. This paper proposes a novel broadband crossover based on the microwave multilayer PCB technique with a two-layer structure. A low-pass PI filter is introduced to lower the coupling capacitance between the crossed transmission lines at the cost of only slight RF performance degradation. The proposed crossover is optimized and simulated with the commercial EM software HFSS; over the 1–18 GHz working band, return loss is better than 13 dB, insertion loss is better than 0.5 dB, and isolation between the two transmission lines is better than 18 dB. The dimension of the proposed crossover is less than 2 × 2 mm, much smaller than a traditional single-layer crossover. Keywords: Microwave multilayer PCB · Crossover · Low pass
1 Introduction A crossover is a frequently used element in planar microwave circuits, usually employed in a single-planar circuit to provide isolation between crossed lines [1]. A traditional crossover is composed of several quarter-wavelength branch lines [2], works in a relatively narrow band and usually occupies a large circuit area [3]. In the microwave multilayer PCB technique [4–6], a standard buried RF layer is always treated as a single-planar circuit [7], as illustrated in Fig. 1a. The buried RF layer in Fig. 1a is labeled Type I in this paper; it is composed of two microwave laminates and one prepreg, and one of the copper foils is etched to form a three-layer system similar to a stripline. The dual form of Type I is illustrated in Fig. 1b and labeled Type II. Considering the strong coupling between L1 and L2, the Type III layer stack in Fig. 1c is rarely used, but when the circuits on L1 and L2 have no overlapping adjacent area, it is also a practicable stack type.
2 Crossover Structure By using layer stack in Fig. 1c, crossover can be realized on two separate layers. The proposed crossover structure is illustrated in Fig. 2, where Type I crossover in Fig. 2a © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_132
Fig. 1. Illustration of buried RF layer: (a) Type I, (b) Type II, (c) Type III (G1/G2: ground layers; L1/L2: signal layers; each stack consists of Laminate1, Prepreg, Laminate2)
is directly composed of two transmission lines with an overlapped square, while the Type II crossover introduces a low-pass PI filter on each transmission line. Figure 2c shows the details and dimensions of the Type II crossover, and its equivalent circuit is shown in Fig. 2d. The cross-capacitance C' introduces coupling between the two transmission lines, so the coupling coefficient can be reduced by using a small overlapped area. In the conventional PCB process, a line width of 0.2 mm is practicable [8–10]. In this paper, the microwave laminate is chosen to be 0.254 mm thick RELONG RS300 with dielectric constant 2.94, while the prepreg is 0.1 mm thick RELONG RLP30 with a dielectric constant of 3. The cross capacitance C' can be evaluated to be about 10 fF, which leads to a susceptance of about 0.001 S, negligible compared with the characteristic admittance of 0.02 S. By removing C', the equivalent circuit of Fig. 2d simplifies to two typical PI low-pass filters. The high-impedance line acts as an inductor, and its value L can be evaluated by (1):

βl ≈ LZ0/Zh  (1)
where β is the phase constant in the propagation medium, l is the physical length of the high-impedance line, Zh is the characteristic impedance of the high-impedance line, and Z0 is the characteristic impedance of the system, chosen to be 50 Ω. A low-impedance line with characteristic impedance Zl acts as a capacitor, and its capacitance C can be
Fig. 2. Illustration of the proposed crossover: (a) Type I, (b) Type II, (c) dimensions of Type II (w0, wh, lh, wl, ll), (d) equivalent circuit of Type II (ports P1–P4)

Fig. 3. EM simulation results over 2–18 GHz, comparing Type I and Type II: (a) insertion loss, (b) return loss, (c) isolation
evaluated by (2) in a similar way:

βl ≈ CZl/Z0  (2)
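The claim above that the cross capacitance C' is negligible can be checked with a back-of-the-envelope calculation using the quoted values (C' ≈ 10 fF, system admittance Y0 = 0.02 S), evaluated here at the 18 GHz band edge.

```python
import math

C_cross = 10e-15   # estimated cross capacitance C', farads
Y0 = 0.02          # characteristic admittance of the 50-ohm system, siemens
f = 18e9           # top of the working band, Hz

B = 2 * math.pi * f * C_cross   # susceptance of C' at 18 GHz
print(f"B = {B:.2e} S vs Y0 = {Y0} S")   # roughly 1.1e-03 S, a few percent of Y0
```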
3 Design and Simulation The cutoff frequency of the low-pass PI filter in Fig. 2d between ports 1, 4 and 2, 3 is not assigned, so it can be optimized under the given design requirements. The three-order low-pass prototype values are g0 = 1, g1 = 2, g2 = 1 [11]. Considering realizability, the width of the high-impedance line wh is set to 0.2 mm, which gives a characteristic impedance of 66.66 Ω, and the length of the high-impedance line lh is set to 2 mm. Since the normalized inductance L equals g1, the cutoff frequency can be evaluated by (1) and (3), which leads to about 20.7 GHz.

β = ω√(με)  (3)
The width of the low-impedance line wl is set to 0.6 mm, and the corresponding impedance Zl is about 36.3 Ω. Then, using Eqs. (2) and (3), the length of the low-impedance line ll is about 0.95 mm; considering fringing capacitance, this value should be modified. The total area of the Type II crossover is about 2 × 2 mm. EM simulations of both the Type I and Type II crossovers were carried out with the commercial EM software HFSS. As shown in Fig. 3a, insertion loss is better than 0.8 dB for Type I and better than 0.5 dB for Type II. Return losses are shown in Fig. 3b. Below 8 GHz, both Type I and Type II have a return loss better than 20 dB, with Type II slightly better; above 8 GHz, the return loss of Type I is about 3 dB better than Type II and reaches about 13 dB at 18 GHz. Figure 3c shows the isolation between the two separated transmission lines. The isolation of Type II is 5 dB better than Type I over 1–18 GHz and better than 18 dB in the whole band. By conservation of energy, it can be inferred that radiation in Type II becomes significant beyond 8 GHz.
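The cutoff-frequency estimate obtained from (1) and (3) can be reproduced numerically. The sketch below assumes the line is fully embedded in the εr ≈ 2.94 laminate (an assumption, since the stack actually mixes two dielectrics), which lands close to the paper's 20.7 GHz.

```python
import math

# Design values quoted in the text
g1 = 2.0            # normalized inductance of the three-order low-pass prototype
Z0, Zh = 50.0, 66.66
lh = 2e-3           # high-impedance line length, m
er = 2.94           # assumed effective relative permittivity
c0 = 3e8            # speed of light, m/s

beta_l = g1 * Z0 / Zh              # required electrical length at cutoff, from Eq. (1)
v = c0 / math.sqrt(er)             # phase velocity, from Eq. (3)
fc = beta_l * v / (2 * math.pi * lh)
print(f"fc = {fc / 1e9:.1f} GHz")  # about 21 GHz, near the paper's 20.7 GHz
```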
4 Conclusion This paper proposed a novel crossover based on the microwave multilayer PCB technique, consisting of two separate transmission lines and two low-pass PI filters. The proposed crossover has a compact size of about 2 × 2 mm and provides isolation better than 18 dB from DC to 18 GHz. The structure can easily be designed and realized in a commercial PCB process, which leads to a low cost compared with other multilayer techniques such as LTCC and HTCC. The proposed crossover also has good insertion loss and return loss up to C band and can be used up to 18 GHz with slight return loss degradation. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_132
References 1. Nedil M, Denidni TA, Talbi L (2005) Novel butler matrix using CPW multi-layer technology. In: IEEE antennas and propagation society international symposium, August 2005, pp 1–3 2. Zhu H, Sun H, Jones B (2019) Wideband dual-polarized multiple beam-forming antenna arrays. IEEE Trans Antennas Propag 67(3):1590–1604 3. Slomian I, Rydosz A, Gruszczynski S (2017) Three-beam microstrip antenna arrays fed by 3 × 3 Butler matrix. In: 7th IEEE international symposium on microwave, antenna, propagation and EMC technologies (MAPE), 2017, pp 1–4 4. Lai Q, Li P, Lu X (2015) A prototype of feed subsystem for a multiple-beam array-fed reflector antenna. In: 2015 IEEE international symposium on antennas and propagation, pp 1–3 5. Ebert A, Kaleem S, Müller J (2015) An industry-level implementation of a compact microwave diode switch matrix for flexible input multiplexing of a geo-stationary satellite payload. In: 2015 IEEE international conference on microwaves, communications, antennas and electronic systems (COMCAS), pp 1–4 6. Metzen PL (2000) Globalstar satellite phased array antennas. In: IEEE international conference on phased array systems and technology, May 2000. IEEE Press, pp 207–210 7. Yoon SW, Kim CW, Dang TS (2014) Ultra-wideband power divider using three parallel-coupled lines and one shunt stub. Electron Lett 50(2):95–96 8. Lu D et al (2019) A simple and general method for filtering power divider with frequency-fixed and frequency-tunable fully canonical filtering-response demonstrations. IEEE Trans Microwave Theory Techniq 67(5):1812–1825 9. Moznebi AR, Afrooz K (2017) Substrate integrated waveguide (SIW) filtering power divider/combiner with high selectivity. Wireless Personal Commun 97(12):1117–1127 10. Wang W, Zheng Y, Cao Q (2019) A four-way broadband filtering power divider with improved matching network for X-band application. Microwave Opt Technol Lett 61:12 11. Pozar DM (2015) Microwave engineering, 3rd edn. Publishing House of Electronics Industry, Beijing, pp 368–380
A Recognition Algorithm Based on Region Growing Luguang Wang1,2 , Yong Zhu1(B) , and Chuanbo Wang1,2 1 College of Electronic Engineering, Heilongjiang University, No. 74 Xuefu Road, Harbin,
China [email protected] 2 Lingnan Big Data Institute, Zhuhai, China
Abstract. Aiming at the uncertainty of object detection in field scenic spot cloud data, we combine the region growing algorithm with PointNet++ and propose a recognition method based on region growing. The method first divides the point cloud data into small regions and uses the region growing algorithm to cluster the local regions; then the PointNet++ method is used to identify the clustered objects. The experimental results show that this method achieves segmentation and recognition of objects in the scenic spot cloud at the same time: the segmentation accuracy reaches 84.3%, and the recognition rate is 90.7%. Keywords: Point cloud · Object recognition · PointNet++ · Region growing
1 Introduction In computer vision, object recognition based on two-dimensional images is dominant and has a very mature theoretical and practical foundation; for example, the image recognition techniques of [1–4] achieve very high recognition rates. But image data are a mapping of the real three-dimensional space and lack three-dimensional spatial information. In recent years, 3D depth sensing technology has matured: lidar or depth sensors are used to obtain the three-dimensional spatial information of object surfaces and store it in a digital file format. Using point cloud data for object recognition effectively avoids the influence of lighting, and accurate spatial information can be obtained. At present, recognition methods for 3D objects can be divided into deep learning methods and clustering methods. Although deep-learning-based methods achieve higher recognition accuracy, their training cycles are long and their training parameters are complex, whereas clustering methods offer strong stability and real-time performance. A multi-view convolutional neural network is proposed in [5]: twelve virtual perspectives are set to project the 3D data, the mapped 2D images are used as input data, and a mature two-dimensional convolutional network is then used to train
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_133
and fuse the features of the twelve channels. In [6], dividing the point cloud into voxels is proposed to deal with the disorder of point cloud data: the algorithm divides the point cloud into voxel data with a resolution of 32 × 32, and the regular three-dimensional voxel data are then processed with a two-dimensional convolutional network. References [5, 6] process 3D data indirectly by converting point clouds into voxels or multiple views. In [7], Charles et al. proposed the PointNet approach to process point cloud data directly: a T-Net structure handles the disorder of the point cloud, a symmetric function performs high-dimensional feature mapping, and a network architecture for classification and segmentation is built on top. PointNet cannot capture the local structure induced by the metric space, which limits its recognition of fine-grained scenarios, so PointNet++ was proposed in [8]; it uses PointNet to iteratively extract features from local areas of the point cloud, enabling it to learn features at increasingly large local scales. PointFusion [9], F-PointNet [10] and others fuse multiple kinds of data to extract features. Xu et al. [9] use ResNet to process RGB images and extract two-dimensional feature information, use PointNet [7] to process point cloud data and extract three-dimensional features, and fuse the feature-layer data to obtain the 3D bounding box of the target object. Qi et al. [10] use a Mask R-CNN to process the image data and find two-dimensional object bounding boxes, then use cone projection to shrink the point cloud target into a 3D frustum and use PointNet [7] to predict 3D object bounding boxes. This method uses 2D to drive 3D, which greatly reduces the detection range and improves recognition efficiency, but it relies heavily on the recognition accuracy of the two-dimensional convolution.
PointRCNN [5] uses a bottom-up approach to generate redundant 3D bounding boxes from point clouds and learns local features combined with global features to obtain the final bounding box through box optimization. VoteNet [9] constructs a 3D detection pipeline for point cloud data, regresses the centroid coordinates of the object using Hough voting and generates an object bounding box. We propose a recognition algorithm based on region growing that combines traditional clustering segmentation and deep learning methods. Firstly, the k-nearest neighbor method is used to divide the point cloud data into local regions. Then the region growing algorithm is used to cluster the local areas of the point cloud. Finally, PointNet is used to recognize the clustered objects and return object space information.
2 Region Growing Recognition Algorithm 2.1 Region Growing Algorithm The basic idea of the region growing algorithm is to merge similar points together; it was originally used in the field of image segmentation. It is also applied to point cloud segmentation because of its ability to segment connected regions with the same characteristics. We first use the farthest point sampling (FPS) algorithm to divide the point cloud data into local regions and designate the center point of each local region as a seed point for region growing. The seed points are then compared with neighboring points, points with similar characteristics are merged, and growth continues outwards. In this paper, the normal angle difference of the center point in the local region
is taken as the growth condition: if the included angle is less than the set threshold, the point belongs to the same category; otherwise, a new region is started. For an input point cloud of N points, FPS is used to obtain S key points, and the Euclidean distance formula (1) is used to search the k-nearest neighbor points that form each local space.

dist = √((xn − xm)² + (yn − ym)² + (zn − zm)²)  (1)

Algorithm steps:
(1) A k-d tree is used to establish the spatial topological relation of the local regions.
(2) Take the center of each local space as a candidate seed point and calculate its curvature. Any point in the point cloud has a neighboring point set whose surface z = r(x, y) approaches the point; the curvature of the central point is characterized by fitting the local surface to the points of the local region:

k1, k2 = H ± √(H² − K)  (2)

H = (EN − 2FM + GL) / (2(EG − F²))  (3)

K = (LN − M²) / (EG − F²)  (4)
where k1 and k2 are the principal curvatures, H is the mean curvature and K is the Gaussian curvature.
(3) Compare the mean curvatures of the center points and select the point with the minimum curvature as the initial seed point.
(4) Local-region normal vector estimation: calculate the normal vector of the center point of the local region by minimizing

d = Σ_{i=1}^{k} (Axi + Byi + C − zi)²  (5)

When d is minimal, the corresponding vector ñ = (A, B, C) approximates the normal vector of the center point Pi(xi, yi, zi).
(5) Calculate the normal vector angle between the initial seed point P and each adjacent local region. If the angle is less than the threshold, cluster the local region and add its center point to the index matrix.
(6) Update the seed points and iterate the above process to complete the clustering of the three-dimensional point cloud.
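Steps (1)–(6) can be condensed into a minimal region-growing sketch. Here the normals and the adjacency lists of the local regions are assumed precomputed, and growth compares each point to its current neighbor rather than re-fitting curvature, so this is a simplification of the paper's procedure rather than its implementation.

```python
import math

def angle(n1, n2):
    """Angle between two unit normal vectors, in radians (abs: normals may flip)."""
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(n1, n2))))
    return math.acos(abs(dot))

def region_grow(normals, neighbors, thresh):
    """Cluster region centers whose normal angle to an adjacent, already-grown
    center stays below `thresh`. `neighbors[i]` lists regions adjacent to i."""
    labels = [-1] * len(normals)
    cluster = 0
    for seed in range(len(normals)):
        if labels[seed] != -1:
            continue
        stack = [seed]
        labels[seed] = cluster
        while stack:
            cur = stack.pop()
            for nb in neighbors[cur]:
                if labels[nb] == -1 and angle(normals[cur], normals[nb]) < thresh:
                    labels[nb] = cluster
                    stack.append(nb)
        cluster += 1
    return labels
```

With a chain of five regions whose normals flip from z to x halfway along, the chain splits into two clusters at the normal discontinuity, which is the behavior the angle threshold is meant to enforce.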
Fig. 1 PointNet++ network structure
2.2 PointNet++ The PointNet++ network structure consists of three parts: a sampling layer, a grouping layer and a PointNet layer (Fig. 1). Sampling layer: the input point cloud is sampled to obtain the center points that divide the local space. Grouping layer: using the center points obtained in the previous step, the k-nearest neighbor method divides the point cloud data into several local regions. PointNet layer: each local region of the point cloud is encoded to generate feature vectors. The most important part of the network is the set abstraction layer. The input of the network is an N × (C + d) point cloud matrix, where N is the number of points, d is the three-dimensional coordinate dimension and C is the feature dimension. In the first abstraction layer, we first use the farthest point sampling algorithm to obtain N1 points from the point cloud data. The k-nearest neighbor algorithm then searches the k points nearest to each sampled point to construct a local region, yielding N1 local point cloud regions of size (k + 1) × (C + d). Then we use the region growing clustering algorithm of Sect. 2.1 to segment the point cloud. On the basis of this clustering, the abstraction layer operation is carried out again to identify the categories of objects in the scene. Compared with methods using two-dimensional convolutional networks, segmenting the scene point cloud with the traditional clustering method is more time-efficient and does not require extensive prior knowledge. This method can detect both the location and the category information of objects.
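The farthest point sampling step used by the sampling layer can be sketched as a greedy loop (plain Python with an arbitrary starting index; the paper does not specify these implementation details):

```python
def farthest_point_sampling(pts, k):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    chosen = [0]                                # start from an arbitrary point
    dist = [d2(p, pts[0]) for p in pts]         # distance to nearest chosen point
    for _ in range(k - 1):
        nxt = max(range(len(pts)), key=lambda i: dist[i])
        chosen.append(nxt)
        dist = [min(dist[i], d2(pts[i], pts[nxt])) for i in range(len(pts))]
    return chosen
```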
3 Experiment and Analysis 3.1 Experimental Results of the Region Growing Algorithm The algorithm in this paper uses the difference of normal vectors in the local region as the growth condition, and the improved region growing algorithm is applied to the
clustering of local point clouds. As Fig. 2 shows, this method performs well on the segmentation of the field scenic spot point cloud.
(a) Input data
(b) Region growth algorithm segmentation results
Fig. 2 Input data and segmentation results
Obviously, this method has a good segmentation effect for larger objects, but for small objects the segmentation precision is not high. The reason is that we use the local region as the minimum processing unit, which blurs the boundaries of small objects. For precise segmentation of small objects, we can divide the point cloud containing them into smaller regions and then apply the region growing algorithm again (Table 1).

Table 1 Region growing segmentation efficiency

Segmentation pass   | Larger object (%) | Small object (%) | Runtime (ms)
--------------------|-------------------|------------------|-------------
First segmentation  | 84.3              | 12.1             | 55.3
Second segmentation | 79.6              | 54.3             | 216
3.2 Experimental Results of Point Cloud Recognition For the 3D target recognition task, we trained the model on the ModelNet40 dataset; ModelNet40 has 40 categories, most of which are common objects in daily life. In Sect. 2.1, we obtained the clustered objects and stored the information of the local-region center points belonging to the same category in a matrix. We then used the PointNet method to extract the features of these local areas and integrated these local features until the category of the object was recognized (Fig. 3). Compared with PointNet++, our proposed algorithm adds region growing and can recognize object categories on the basis of clustering. Compared with [9, 10], the two-dimensional detection step is removed; our method not only saves time but also greatly reduces the algorithm complexity (Table 2).
Fig. 3 Input data and recognition result

Table 2 Efficiency comparison of different algorithms

Method      | Recognition | Segmentation
------------|-------------|-------------
F-PointNet  | –           | 74.3%
PointFusion | –           | 83.8%
PointNet++  | 91.9%       | –
Ours        | 90.7%       | 84.3%
In the segmentation task, our region growing algorithm achieves an accuracy of 84.3%, better than [9, 10]. In the recognition task, our method is close to PointNet++, with a highest recognition accuracy of 90.7%.
4 Conclusion In this paper, a recognition method based on region growing is proposed. First, the point cloud data are divided into local regions, which are represented as two-dimensional surfaces. Then, the normal vector angle of the local-region center points is used as the growth condition to cluster the point cloud data. Finally, PointNet++ is used to identify the type of each object. The experiments show that the segmentation accuracy of the point cloud data is 84.3% and the object recognition rate reaches 90.7%. The algorithm in this paper ignores the problem of object boundaries within a local region, which limits the segmentation and recognition accuracy of the point cloud.
References 1. Redmon J, Divvala S, Girshick R et al (2015) You only look once: unified, real-time object detection 2. Redmon J, Farhadi A (2016) YOLO9000: better, faster, stronger 3. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement 4. Ren S, He K, Girshick R et al (2015) Faster R-CNN: towards real-time object detection with region proposal networks. In: International conference on neural information processing systems
5. Su H, Maji S, Kalogerakis E et al (2015) Multi-view convolutional neural networks for 3d shape recognition. In: Proceedings of the IEEE international conference on computer vision, pp 945–953 6. Maturana D, Scherer S (2015) VoxNet: a 3D convolutional neural network for real-time object recognition. In: 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS). IEEE, pp 922–928 7. Charles RQ, Hao S, Mo K et al (2017) PointNet: deep learning on point sets for 3D classification and segmentation. In: IEEE conference on computer vision & pattern recognition 8. Qi CR, Yi L, Su H et al (2017) PointNet++: deep hierarchical feature learning on point sets in a metric space 9. Xu D, Anguelov D, Jain A (2017) PointFusion: deep sensor fusion for 3D bounding box estimation 10. Qi CR, Liu W, Wu C, Su H, Guibas LJ (2017) Frustum PointNets for 3D object detection from RGB-D data. arXiv:1711.08488
Point Cloud Simplification Method Based on RANSAC and Geometric Filtering Chuanbo Wang1,2 , Yong Zhu1(B) , and Luguang Wang1,2 1 College of Electronic Engineering, Heilongjiang University, No. 74 Xuefu Road, Harbin,
China [email protected] 2 Lingnan Big Data Institute, Zhuhai, China
Abstract. At this stage, point cloud data are applied in all walks of life, and different needs lead to different processing methods. The point cloud collected by the Kinect camera contains some noise, so it must be processed to a certain extent. First, the data are captured by the Kinect camera to obtain a depth image; then the depth image is converted into a point cloud according to the camera parameters and the matrix transformation relationship, and the experimental object is obtained by segmentation using the RANSAC algorithm; finally, the result is processed using geometric filtering. The experimental results show that the method is effective. Keywords: Noise · Depth image · RANSAC · Geometric filtering
1 Introduction An automatic algorithm for detecting basic shapes in unorganized point clouds was proposed in [1]. For point cloud segmentation, a weighted RANSAC method was constructed [2], and a method of culling false planes based on the angle between the point cloud normal vector and the RANSAC segmentation plane was proposed to improve segmentation quality [3]. Martinez et al. proposed improving the RANSAC point cloud segmentation algorithm by orienting scan data [4]; Dong et al. applied a bilateral filtering algorithm based on image denoising to the point cloud filtering process [5]. Zhang et al. proposed a cloth simulation filtering algorithm [7].
2 Point Cloud Downsampling 2.1 Space Gridding Spatial gridding implements point cloud downsampling by constructing voxel grids: all points in each voxel are replaced by their center of gravity, which keeps the overall shape characteristics of the point cloud unchanged. When setting up a multi-level grid, the principle is the same at each level. The algorithm flow is as follows:
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_134
Point Cloud Simplification Method …
1021
1. Input the original point cloud data set and determine whether it is empty;
2. If it is not empty, traverse the point cloud to obtain its minimum bounding box (MBR), with coordinates recorded as (Xmin, Ymin, Zmin, Xmax, Ymax, Zmax);
3. Specify the step size L and calculate the required number of divisions along each axis. The number of voxels and the voxel index of a point are

\[ N = \frac{X_{\max}-X_{\min}}{L} \cdot \frac{Y_{\max}-Y_{\min}}{L} \cdot \frac{Z_{\max}-Z_{\min}}{L} \tag{1} \]

\[ \begin{cases} x = \operatorname{int}\!\left(\frac{x-x_{\min}}{L}\right)+1 \\ y = \operatorname{int}\!\left(\frac{y-y_{\min}}{L}\right)+1 \\ z = \operatorname{int}\!\left(\frac{z-z_{\min}}{L}\right)+1 \end{cases} \tag{2} \]
4. Calculate the center of gravity of each voxel and replace all points in the voxel with it.

2.2 Uniform Sampling
Uniform sampling follows the same principle as spatial gridding. The difference is that uniform sampling keeps the point closest to each voxel's centroid instead of the center of gravity, so the retained points are points that already exist in the point cloud set; spatial gridding, by contrast, replaces all points in a voxel with the (generally new) center-of-gravity point.
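A minimal numpy sketch of the two downsampling variants above (the function name and the use of `numpy.unique` for voxel grouping are implementation choices, not from the paper):

```python
import numpy as np

def voxel_downsample(points, step, mode="centroid"):
    """Voxel-grid downsampling.

    points : (N, 3) array of XYZ coordinates.
    step   : voxel edge length L.
    mode   : "centroid" keeps each voxel's center of gravity (Sect. 2.1);
             "nearest" keeps the existing point closest to it (Sect. 2.2).
    """
    mins = points.min(axis=0)
    # Integer voxel index of each point, cf. Eq. (2).
    idx = np.floor((points - mins) / step).astype(np.int64)
    # Group points that share a voxel index.
    _, inverse = np.unique(idx, axis=0, return_inverse=True)
    out = []
    for v in np.unique(inverse):
        pts = points[inverse == v]
        centroid = pts.mean(axis=0)
        if mode == "centroid":
            out.append(centroid)
        else:  # keep the real point nearest the centroid
            out.append(pts[np.argmin(np.linalg.norm(pts - centroid, axis=1))])
    return np.asarray(out)
```

With `mode="nearest"` every returned point is guaranteed to be a member of the original cloud, matching the distinction drawn in Sect. 2.2.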
3 Point Cloud Filtering
3.1 Pass-Through Filtering
Filtering is generally needed when the data contain many noise points and the point density is irregular. Pass-through filtering is the most direct filtering method and is suitable for cutting out a region of the point cloud for processing. Algorithm flow:
1. Input the original point cloud data set and determine whether it is empty;
2. If it is not empty, find the maximum point maxPoint and the minimum point minPoint along each coordinate axis and record their coordinates. For example, the maximum and minimum values along the x-axis are maxPoint.x and minPoint.x, respectively;
3. Specify the axis and the filter range, e.g., maxPoint.z − 0.5, which cuts out the region extending 0.5 m down from the highest point along the z-axis.

3.2 Statistical Filtering
Statistical filtering builds a k-d tree [6] on the point cloud and performs a statistical analysis on the neighborhood of each input point. As an example, we construct a two-dimensional k-d tree and search for the nearest neighbor of the point (2, 4.5).
First, start from the root node (7, 2) and perform a depth-first traversal of the k-d tree. Take the search point (2, 4.5) as the center and its Euclidean distance to the root node as the radius to draw a circle (a hypersphere in higher-dimensional space). At this point, all points in the right subtree of the k-d tree can be ignored, as shown in Fig. 1a. Continue down to (5, 4) and then reach node (4, 7), which is the best node so far; from Formula (3), dist = 3.202.
[Figure residue: the k-d tree with root (7, 2), children (5, 4) and (9, 6), and leaves (2, 3), (4, 7), (8, 1), plus the corresponding search circles on the 0–10 × 0–10 plane; panels (a) nearest neighbor search and (b) neighbor search]
Fig. 1 Example of neighbor search
Euclidean distance in two-dimensional, three-dimensional and n-dimensional space:
\[ \text{dist} = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2} \tag{3} \]

\[ \text{dist} = \sqrt{(x_2-x_1)^2 + (y_2-y_1)^2 + (z_2-z_1)^2} \tag{4} \]

\[ \text{dist} = \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + \cdots + (x_n-y_n)^2} = \sqrt{\sum_{i=1}^{n}(x_i-y_i)^2} \tag{5} \]
Then go back to node (5, 4); as shown in the outer circle on the left of Fig. 1b, draw a circle with radius dist = 3.202 and jump to the left subtree of node (5, 4) to continue searching. The Euclidean distance between node (5, 4) and (2, 4.5) is dist = 3.04 < 3.202, so (5, 4) becomes the current best node. Go back to node (2, 3); its Euclidean distance from (2, 4.5) is dist = 1.5, so the best node becomes (2, 3). Then go back to (7, 2). Similarly, as shown by the right dotted line in Fig. 1b, draw a circle with (2, 4.5) as the center and dist = 1.5 as the radius; it does not intersect the hyperplane x = 7, so the whole search is over. The nearest neighbor of the search point (2, 4.5) is (2, 3), at distance 1.5.

Suppose a point cloud data set X = {(x_i, y_i, z_i)}, i = 1, 2, ..., m, and let the data set after filtering be X' = {(x_i, y_i, z_i)}, i = 1, 2, ..., n, where m ≥ n. Let D_i denote the average distance from any point (x_j, y_j, z_j) in the data set to all other points in its neighborhood. Under the assumption of statistical filtering, the distances D_i over X follow a Gaussian distribution. Since X' removes only a small number of noise points or outliers compared with X, the expectation and standard deviation of D are taken to be the same for X and X', written D ~ N(μ, σ²), so

\[ \mu = \frac{\sum_{i=1}^{m} D_i}{m}, \qquad \sigma = \sqrt{\frac{\sum_{i=1}^{m}\left(D_i-\mu\right)^2}{m}} \tag{6} \]
μ is the expected value and σ² is the variance. If the standard-deviation multiplier is 1, points outside the range (μ − σ × 1, μ + σ × 1) are considered noise points or outliers. The decisive factors for the effectiveness of the algorithm are k and std.

3.3 Radius Filter
If every point in a point cloud is required to have a specified number of neighbors, the radius filter algorithm can be used. As shown in Fig. 2, if at least k = 1 neighboring point is required within the specified radius, the yellow point on the left is removed; if at least k = 2 neighboring points are required, both the yellow point on the left and the green point on the right are removed. The idea is similar to statistical filtering; the two differ only in the neighborhood search and in how noise points are identified.
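The statistical filter of Sect. 3.2 can be sketched with scipy's k-d tree; `k` and the standard-deviation multiplier `std` are the two decisive parameters named above (the values used below are illustrative, not from the paper):

```python
import numpy as np
from scipy.spatial import cKDTree

def statistical_filter(points, k=10, std=1.0):
    """Remove points whose mean k-NN distance falls outside mu +/- std*sigma (Eq. 6)."""
    tree = cKDTree(points)
    # distances to the k nearest neighbours of every point (column 0 is the
    # point itself, so it is dropped)
    dists, _ = tree.query(points, k=k + 1)
    D = dists[:, 1:].mean(axis=1)
    mu, sigma = D.mean(), D.std()
    keep = (D > mu - std * sigma) & (D < mu + std * sigma)
    return points[keep]

# The same k-d tree also answers the worked example of Fig. 1:
pts2d = np.array([(7, 2), (5, 4), (9, 6), (2, 3), (4, 7), (8, 1)], float)
d, i = cKDTree(pts2d).query([2, 4.5])
print(tuple(pts2d[i]), d)        # (2.0, 3.0) 1.5
```

scipy's internal splitting rules may differ from the hand-built tree in the text, but the nearest neighbour returned is of course the same.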
Fig. 2 Radius filtering principle
The algorithm flow is as follows: 1. Input the original point cloud data set and judge whether it is empty; 2. If it is not empty, specify the search radius and the value of the number k of neighborhood points within the search radius; 3. Loop through all points and determine noise points or outliers according to threshold conditions.
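The three steps above can be sketched with scipy's k-d tree (parameter values in the demo are illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

def radius_filter(points, radius, k):
    """Keep points that have at least k other points within `radius`."""
    tree = cKDTree(points)
    # query_ball_point returns each point's neighbours, itself included,
    # hence the "- 1".
    counts = np.array([len(n) - 1 for n in tree.query_ball_point(points, radius)])
    return points[counts >= k]
```

As the text notes, this mirrors statistical filtering: the k-d tree neighborhood search is the same, only the noise criterion (neighbor count vs. μ ± std·σ) changes.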
4 Point Cloud Reduction Based on RANSAC and Geometric Filtering
4.1 Random Sampling Consistency
The RANSAC algorithm assumes that the data consist of "inliers" and "outliers": inliers are the data that fit the model, while outliers are the data that cannot fit the model. Models are estimated by repeatedly sampling the data and iterating until a satisfactory result is obtained, and the required number of iterations can be determined in advance. First, fix the model type, let α be the size of the minimum sampling set and S the sample set, and randomly extract an initial
subset of α samples from S and fit an initial model M to it. The probability that after β iterations no sample ever consists entirely of inliers is

\[ 1 - p = \left(1-\omega^{\alpha}\right)^{\beta} \tag{7} \]

Taking logarithms on both sides gives

\[ \beta = \frac{\log(1-p)}{\log\!\left(1-\omega^{\alpha}\right)} \tag{8} \]
where p is the probability that at some iteration all randomly selected points are inliers, ω is the ratio of the number of inliers to the size of the point cloud data set, α is the number of sample points used to estimate the model, and β is the number of iterations. Since the data fluctuate, a distance threshold is set, which determines whether a point is classified as an inlier or an outlier; finally, α points are re-selected and the first two steps are repeated until the iterations end (Fig. 3).
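Equation (8) gives the iteration count directly; a small sketch (the example numbers are illustrative):

```python
import math

def ransac_iterations(p, omega, alpha):
    """Number of iterations beta that guarantees, with confidence p, at least
    one all-inlier sample of alpha points (Eq. 8), rounded up."""
    return math.ceil(math.log(1 - p) / math.log(1 - omega ** alpha))

# e.g. 99% confidence, 50% inliers, 2-point (straight-line) model:
print(ransac_iterations(0.99, 0.5, 2))  # 17
```

Note how β grows quickly as the inlier ratio ω falls or the sample size α rises, which is why plane fitting (α = 3) needs more iterations than line fitting at the same ω.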
Fig. 3 RANSAC example
The core of RANSAC lies in randomness and hypothesis testing. Compared with the least-squares method, it tolerates a large amount of noise. Using the MATLAB 2016b platform, Figs. 4 and 5 show the difference between the least-squares method and the RANSAC algorithm when fitting straight lines and curves. The figures show that the advantages of the RANSAC algorithm are obvious: its fitting results are better than those of the least-squares method. Figure 6 shows the effect of fitting a plane in a three-dimensional scene; the obtained plane equation is z = 1.7765x + (−0.08049)y + (−0.021842). The views from left to right are azimuth −37.5 with elevation 30, the x–y plane, and the x–z plane. Table 1 lists the parameters of the above experiments.

4.2 Experiment Process and Results
First, depth, color, and grayscale images are captured with an Azure Kinect DK camera. The depth image is then converted to a point
Point Cloud Simplification Method …
1025
Fig. 4 Straight-line fitting comparison
Fig. 5 Curve-fitting comparison
Fig. 6 RANSAC plane fitting
Table 1 Parameters related to RANSAC

Geometric type | Number of iterations | Local point | Noise point
Straight line  | 10,000               | 30          | 30
Curve          | 10,000               | 30          | 30
Plane          | 10,000               | 300         | 200
cloud according to the camera parameters and the matrix transformation relationship, and the RANSAC algorithm is used to segment the experimental object; finally, pass-through
filtering and radius filtering are combined to process the result. The algorithm flow is shown in Fig. 7. Comparison shows that the method gives good experimental results.
[Flow diagram: Azure Kinect DK depth, RGB and infrared images → matrix transformations → point cloud → RANSAC algorithm → pass-through filtering → radius filtering → visualization → output]
Fig. 7 Algorithm flow
The threshold value of the RANSAC algorithm is set to 0.10 m, and the number of iterations is 10,000 to obtain 116,094 planar points and 74,891 non-planar points. The results are shown in Fig. 8a–c.
Fig. 8 Algorithm example
Similarly, with the threshold changed to 0.15 m, 121,232 planar points and 69,753 non-planar points are obtained, as shown in Fig. 8d–f. Increasing the RANSAC threshold further classifies too many points as planar and distorts the object. The RANSAC segmentation result with 69,753 points is trimmed by 0.1 m along the y-axis and z-axis, yielding 63,122 points, as shown in Fig. 9a. Then the search radius of the radius filter is set to 0.01 m and the nearest-neighbor parameter is varied; the results are shown in Fig. 9b–f, with the specific parameter settings listed in Table 2. Finally, enlarging parts of Fig. 9a, f shows that the sparse noise points have been removed, as seen in Fig. 10.
Fig. 9 Filter results
Table 2 Parameter settings

          | Remaining points | Noise points | Search radius | K-neighbors | Denoising rate (%)
Figure 9b | 62,879           | 243          | 0.01          | 10          | 0.385
Figure 9c | 62,960           | 162          | 0.01          | 8           | 0.257
Figure 9d | 63,012           | 110          | 0.01          | 6           | 0.174
Figure 9e | 63,067           | 55           | 0.01          | 4           | 0.087
Figure 9f | 63,086           | 36           | 0.01          | 3           | 0.057
Fig. 10 Comparison of K = 3 and K = 10
5 Conclusion
It is difficult to obtain ideal results with a single method. In this paper, the experimental object is acquired and the data are processed by combining the RANSAC algorithm with geometric filtering. The experimental results show that the noise points are filtered out; here the planar part is simply discarded, although it could be kept for other purposes. The main difficulty of the experiment lies in determining the parameters. Future work will consider other segmentation methods and new filtering algorithms, whose combination may yield an even better result.
References
1. Schnabel R, Wahl R, Klein R (2007) Efficient RANSAC for point-cloud shape detection. Comput Graph Forum 26(2):214–226
2. Xu B, Jiang W, Shan J et al (2015) Investigation on the weighted RANSAC approaches for building roof plane segmentation from LiDAR point clouds 8(1)
3. Awwad TM, Zhu Q, Du Z et al (2010) An improved segmentation approach for planar surfaces from unstructured 3D point clouds. Photogramm Rec 25(129):5–23
4. Martinez J, Soria-Medina A, Arias P et al (2012) Automatic processing of terrestrial laser scanning data of building facades. Autom Constr 22:298–305
5. Dong YQ, Zhang L, Cui XM et al (2017) An automatic filter algorithm for dense image matching point clouds. In: ISPRS—international archives of the photogrammetry, remote sensing and spatial information sciences, XLII-2/W7, pp 703–709
6. Altman NS (1992) An introduction to kernel and nearest-neighbor nonparametric regression. Am Stat 46(3):175–185
7. Zhang W, Qi J, Wan P et al (2016) An easy-to-use airborne LiDAR data filtering method based on cloth simulation. Remote Sens 8(6):501
Compressive Autoencoders in Wireless Communications Peijun Chen1 , Peng Lv2(B) , Hongfu Liu1 , Bin Li1 , Chenglin Zhao1 , and Xiang Wang3 1
School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China 2 School of Computer and Communication Engineering, University of Science and Technology Beijing, No. 30 Xueyuan Road, Haidian District, Beijing 100083, China [email protected] 3 Beijing Jianea Technology Inc, Room 603, Building Xingfa, No. 45 Zhongguancun Street, Haidian District, Beijing 100086, China
Abstract. The autoencoder, one of the most promising and successful architectures in deep learning (DL), has been widely used in wireless communications. However, the fast-increasing size of the neural networks (NNs) in autoencoders leads to high storage requirements and heavy computational overhead, which poses a challenge to the practical deployment of autoencoders in real communications systems. In this paper, we investigate two representative NN compression methods and propose two compressive autoencoder schemes for wireless communications by combining these compression techniques with the autoencoder architecture. Our proposed schemes are capable of reducing memory consumption and execution time. Numerical experiments demonstrate that our methods can effectively compress the autoencoder's size without degrading the model performance or distorting the constellations of the transmitted signal.

Keywords: Autoencoder · Deep learning · Wireless communications · Neural networks · Model compression

1 Introduction
In recent years, deep learning (DL) has achieved great success in several domains, such as computer vision and natural language processing. Owing to the powerful nonlinear mapping capability of DL, many DL-based methods have also emerged in the communications field, among which the autoencoder is a popular architecture. Autoencoders have been introduced into several communications systems to cope with imperfections or sophisticated scenarios, such as end-to-end communications systems [1] and channel state information feedback [2].
c The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_135
Compressive Autoencoders in Wireless Communications
1029
As autoencoders are used to handle complex and multiple challenges, the neural networks (NNs) in autoencoders become large and complicated, requiring high storage and massive computational resources, which makes practical deployment difficult, especially in demanding wireless communications scenarios [3]. Most of the existing literature explores the applications of autoencoders in communications or aims at improving their performance. At the same time, compression algorithms for NNs have been proposed to reduce their size and accelerate execution, some of which can be applied to autoencoders in wireless communications. In this paper, we first investigate the architecture of the autoencoder and its applications in wireless communications; then two representative compression techniques are integrated into the autoencoder, yielding two new compressive autoencoder schemes that effectively alleviate the memory overhead and speed up the computation. The rest of this paper is structured as follows. In Sect. 2, we introduce the autoencoder architecture in wireless communications systems and investigate the existing problems. Two representative compression techniques for NNs are presented in Sect. 3, and the corresponding compressive autoencoder schemes are proposed. Section 4 provides numerical experiment results. We conclude the paper in Sect. 5.
2 Autoencoder in Wireless Communications
In this work, we focus on single-input single-output (SISO) systems; our compressive schemes should also benefit autoencoders in multiple-input multiple-output (MIMO) systems, which is left for future study. A SISO system can be roughly separated into three parts: transmitter, channel and receiver. In traditional methods, well-designed algorithms are applied at the transmitter and receiver. However, there are gaps between the ideal mathematical models and real-world systems due to unknown factors, e.g., hardware imperfections. Owing to their powerful nonlinear mapping ability, data-driven DL methods, especially autoencoder architectures, show excellent performance in coping with unknown environments.
2.1 Autoencoder Architecture
In SISO communications systems, the autoencoder consists of an encoder module and a decoder module connected through an additive white Gaussian noise (AWGN) channel. Although the encoder and decoder can in principle be implemented by different techniques, the NN-based autoencoder is the fastest-growing one, as shown in Fig. 1. In communications systems, the encoder plays the role of the transmitter, transforming the source symbol s, which is one of m possible symbols from an original
Fig. 1. Autoencoder architecture in SISO communications systems
messages set S (card(S) = m), into the transmitted signal x:

\[ x = f(s; \Omega_e) \tag{1} \]
where f denotes the mapping function of the encoder and Ωe denotes the set of all parameters of the encoder NNs. At the receiver end, the received signal is y = x + n. The decoder then functions as a receiver, taking the received signal y as input and generating an estimate ŝ of the source symbol s:

\[ \hat{s} = g(y; \Omega_d) \tag{2} \]
where g denotes the mapping function of the decoder and Ωd denotes the set of all parameters of the decoder NNs. Commonly, the encoder and decoder parts of an autoencoder are trained together with the stochastic gradient descent (SGD) algorithm [4]. Trained with enough samples, an effective autoencoder that is robust against unknown imperfections or noise can be obtained.
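A toy numpy sketch of the dataflow of Eqs. (1)–(2). The single linear layer per module, the random (untrained) weights standing in for Ωe and Ωd, and the unit-power normalization are all simplifying assumptions for illustration, not the trained model of the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n_ch = 16, 7                       # 16 messages, 7-dimensional channel use

# Hypothetical single-layer encoder/decoder weights (untrained; a real
# autoencoder learns these parameters jointly with SGD as described above).
W_e = rng.normal(size=(m, n_ch))
W_d = rng.normal(size=(n_ch, m))

def encode(s):                        # x = f(s; Omega_e), Eq. (1)
    one_hot = np.eye(m)[s]
    x = one_hot @ W_e
    return x / np.linalg.norm(x)      # unit-power constraint on the signal

def decode(y):                        # s_hat = g(y; Omega_d), Eq. (2)
    logits = y @ W_d
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()        # softmax over the m messages

s = 3
x = encode(s)                         # transmitted signal
y = x + rng.normal(scale=0.1, size=n_ch)   # AWGN channel: y = x + n
s_hat = int(np.argmax(decode(y)))     # decoded symbol estimate
```

Only the shapes and the encoder → channel → decoder dataflow are meaningful here; with random weights the estimate ŝ is of course unreliable.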
2.2 Existing Problems
Owing to their capability of coping with unknown nonlinearity, the applications of autoencoders in wireless communications systems have been widely studied. The main concern is that the high memory consumption and long execution time induced by the large size and sophisticated structure of autoencoders hinder their practical deployment. To date, several works have focused on compressing NN models and accelerating computation. These works can mainly be classified into six types: network pruning, low-rank approximation, network quantization, teacher–student networks, compact network design and hardware accelerators [5]. To alleviate the storage and computational overhead of autoencoders, we utilize two representative compression techniques, pruning [6] and greedy bilateral decomposition (GreBdec) [7], to reduce the size of the NNs in the autoencoder and propose two compressive autoencoder schemes.
3 Compressive Autoencoder Schemes
In this work, two representative compression techniques, i.e., pruning and GreBdec, are used to compress autoencoder, and thus, two corresponding compressive autoencoder schemes are proposed.
[Flow diagrams: (a) autoencoder training → setting threshold → pruning weights → autoencoder retraining (repeated); (b) autoencoder training → weights decomposition → autoencoder retraining]
Fig. 2. Compressive autoencoder training procedures. a Compressive autoencoder with pruning. b Compressive autoencoder with GreBdec
For illustration, assume that W ∈ R^{m×k}, x ∈ R^{1×m} and y ∈ R^{1×k} represent the weight matrix, input vector and output vector of any one neural layer of the autoencoder's NNs, where m and k denote the input and output dimensions, respectively. The output vector is obtained by

\[ y = xW \tag{3} \]

3.1 Autoencoder with Pruning
Han et al. proposed a straightforward and effective pruning technique [6]. As shown in Fig. 2a, the pruning method repeatedly prunes the autoencoder according to a threshold and retrains the model to recover its performance. Specifically, in the pruning stage, a mask matrix M ∈ R^{m×k} is first obtained to mark the positions where the corresponding element of the weight matrix W is not less than the threshold t, i.e.,

\[ M_{i,j} = \begin{cases} 1 & W_{i,j} \ge t \\ 0 & W_{i,j} < t \end{cases} \tag{4} \]
Then, the pruned weight matrix \(\tilde{W}\) can be obtained by calculating the Hadamard product of the weight matrix W and the mask matrix M:

\[ \tilde{W} = W \odot M \tag{5} \]

where ⊙ denotes the element-wise multiplication of matrices. Next, in the retraining stage, the weight matrix is updated. To avoid the removed connections being reactivated, the gradients also need to be masked by M. The updated weight matrix \(\tilde{W}_{\text{new}}\) is thus given by

\[ \tilde{W}_{\text{new}} = \tilde{W} + \eta \nabla L(\tilde{W}) \odot M \tag{6} \]
where η > 0 and L(·) denote the learning rate and loss function, respectively. During the whole autoencoder pruning process, the pruning stage and retraining stage are repeated for several rounds. A feasible threshold schedule is to set a small initial value and gradually increase it every round, which effectively compresses the model while scarcely impairing its performance.
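One pruning-plus-masked-update round, Eqs. (4)–(6), can be sketched as follows (the text as printed thresholds the raw weights; magnitude-based pruning as usually practiced would threshold `np.abs(W)` instead):

```python
import numpy as np

def prune(W, t):
    """One pruning pass: mask entries below threshold t (Eqs. 4-5)."""
    M = (W >= t).astype(W.dtype)      # Eq. (4), as written in the text
    return W * M, M                   # element-wise product, Eq. (5)

def masked_update(W_pruned, grad, M, lr):
    """Retraining step with masked gradients so pruned weights stay zero
    (Eq. 6, sign convention as printed)."""
    return W_pruned + lr * grad * M
```

Because the mask multiplies the gradient as well as the weights, a connection removed in Eq. (5) can never be reactivated during retraining.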
3.2 Autoencoder with GreBdec
GreBdec, proposed by Yu et al., is an effective compression technique that takes advantage of both low-rank and sparse decomposition of NN weight matrices [7]. Moreover, with a reconstruction term on the neural layer's outputs integrated into the decomposition, GreBdec obtains a good initialization for autoencoder retraining, resulting in fewer rounds of retraining. The basic process of GreBdec compression is shown in Fig. 2b. GreBdec approximates the weight matrix W with two low-rank matrices and a sparse matrix:

\[ W \approx UV + S \tag{7} \]

where U ∈ R^{m×r} and V ∈ R^{r×k} (r < min(k, m)) denote the two low-rank matrices and S ∈ R^{m×k} denotes the sparse matrix. The number of nonzero elements in S must not exceed a predefined value c, i.e., ‖S‖₀ ≤ c. To alleviate the autoencoder's performance degradation caused by network compression, GreBdec takes output reconstruction into consideration and constructs the following optimization problem:
\[ \min_{U,V,S} \; \left\| y - x(UV+S) \right\|_F^2 \quad \text{s.t.} \quad \left\| W - (UV+S) \right\|_F^2 \le \gamma, \; \left\| S \right\|_0 \le c \tag{8} \]
And GreBdec solves the problem with an iterative greedy algorithm, which enables a fast decomposition and a good initialization for retraining.
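The decomposition form of Eq. (7) can be illustrated with a plain SVD-plus-thresholding sketch. Note that this is not the iterative greedy solver of Yu et al. and ignores the output-reconstruction term of Eq. (8); it only shows the W ≈ UV + S structure being optimized:

```python
import numpy as np

def lowrank_sparse(W, r, c):
    """Approximate W ~ U V + S with rank-r factors and a c-sparse S.
    Truncated SVD for U V, then the c largest residual entries go into S."""
    u, s, vt = np.linalg.svd(W, full_matrices=False)
    U = u[:, :r] * s[:r]              # (m, r)
    V = vt[:r]                        # (r, k)
    R = W - U @ V                     # residual after the low-rank part
    S = np.zeros_like(W)
    top = np.argsort(np.abs(R), axis=None)[-c:]   # c largest residuals
    S.flat[top] = R.flat[top]
    return U, V, S
```

Adding the sparse term can only reduce the Frobenius approximation error relative to the pure rank-r factorization, which is the intuition behind combining the two in [7].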
4 Experimental Simulations and Performance Evaluations
In this section, we compare the performance of the compressed autoencoders and the original autoencoder. Note that in our experiments, the source symbol s is encoded as an m-dimensional one-hot vector before being fed to the autoencoder [1]. The neural layers in the autoencoder use ReLU activation functions, except the last layer of the decoder, which uses softmax [4]. During the training and retraining stages, cross-entropy is adopted as the loss function and the Adam optimizer [8] is used, with learning rates of 0.001 for the former stage and 0.0003 for the latter. In the pruning process, the initial compressive ratio is set to 0.95 and decreases by 0.05 every round while it remains above the target value. In the GreBdec scheme, the compressive ratio is set directly to the target value at the beginning of the decomposition.
Fig. 3. System performance of original autoencoder, two compressed autoencoders with compressive ratio r = 0.4 and Hamming coding scheme
We first compare the block error rate of the original autoencoder and the two compressed ones over a message set containing 16 symbols, with the transmitted signal's dimension set to 7. Hamming coding is also included for comparison. 150,000 samples are used for training and retraining, while the testing set contains 50,000 samples. The curves over signal-to-noise ratios (SNRs) ranging from −5 to 8 are shown in Fig. 3. The results demonstrate that the autoencoder matches the Hamming coding scheme's performance in the high-SNR range while outperforming it in the low-SNR range, and our proposed compressive autoencoder schemes perform almost the same as the original one. Comparisons of the number of weights and the computation between the complete model and the two compressed ones are shown in Fig. 4, which indicates that our proposed methods save 60% of storage and effectively reduce the multiplications, which occupy most of the execution time in model inference.
Fig. 4. Storage and computation of original and two compressed autoencoders. a Number of weights in each neural layer. b Multiplications of model inference
[Constellation plots, panels: (a) original, (b) pruning, (c) GreBdec]
Fig. 5. Comparisons of transmitted signal’s constellations in original autoencoder and two compressed ones with compressive ratio r = 0.3
To further examine the distortion of the transmitted signal caused by network compression, we present another simulation over a message set containing 16 symbols, but with a two-dimensional transmitted signal for ease of visualization. The plot of the transmitted signal can be viewed as the constellation of the encoder outputs. The training setup is the same as in the first experiment. As shown in Fig. 5, the two compressed autoencoders with compressive ratio r = 0.3 have little impact on the transmitted signal's constellations.
5 Conclusion
Two representative compression techniques, pruning and GreBdec, are integrated into the autoencoder, yielding two compressive autoencoder schemes for wireless communications systems. While significantly reducing memory consumption and computational overhead, our proposed methods have little impact on model performance. Numerical experiments validate that our compressive schemes neither degrade the performance of the original autoencoder nor distort the constellations of the transmitted signal. Our schemes enable easy
deployment of autoencoders in practical scenarios and thus hold great promise for emerging wireless communications. Acknowledgments. This work was supported by the National Natural Science Foundation of China under Grant 61971050, the Young Talents Invitation Program of the China Institute of Communications under Grant QT2017001 and the Natural Science Foundation of China under Grant U1805262.
References
1. O'Shea TJ, Hoydis J (2017) An introduction to deep learning for the physical layer. IEEE Trans Cogn Commun Netw 3(4):563–575
2. Wen CK, Shih WT, Jin S (2018) Deep learning for massive MIMO CSI feedback. IEEE Wirel Commun Lett 7(5):748–751
3. Wang TQ, Wen CK, Wang HQ, Gao FF, Jiang T, Jin S (2018) Deep learning for wireless physical layer: opportunities and challenges. China Commun 14(11):92–111
4. Goodfellow I, Bengio Y, Courville A (2016) Deep learning. The MIT Press
5. Cheng J, Wang PS, Li G, Hu QH, Lu HQ (2018) Recent advances in efficient computation of deep convolutional neural networks. Front Inf Technol Electron Eng 19(1):67–80
6. Han S, Pool J, Tran J, Dally WJ (2015) Learning both weights and connections for efficient neural networks. In: 29th conference on neural information processing systems, pp 1135–1143
7. Yu X, Liu T, Wang X, Tao D (2017) On compressing deep models by low rank and sparse decomposition. In: IEEE conference on computer vision and pattern recognition. IEEE Press, New York, pp 67–76
8. Kingma DP, Ba J (2015) Adam: a method for stochastic optimization. In: International conference on learning representations
Activity Segmentation by Using Window Variance Comparison Method Xinxing Tang, Xiaolong Yang(B) , Mu Zhou, and Lingxia Li School of Communication and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China [email protected]
Abstract. In activity recognition, it is very important to extract the data segments in which an activity occurs from the whole data stream. To solve this problem, we propose a window variance comparison algorithm: the window variance of the preprocessed data is calculated and compared with the variance of the preceding or following window to determine the occurrence and continuation of an activity. Experimental results show that this algorithm achieves accurate and effective activity segmentation. Keywords: Activity segmentation · Window variance · Algorithm
1 Introduction
In recent years, human activity recognition based on Wi-Fi signals has been extensively studied [1, 2], with systems such as TW-see [3] and Wi-Au [4]. In TW-see, the authors propose a normalized-variance sliding-window algorithm to segment activities; in Wi-Au, the authors use the idea of set mapping to achieve action segmentation. We reproduced these two algorithms on our measured data and found that their segmentation accuracy is very low. Therefore, we propose a window variance comparison method. The preprocessed data is divided into several windows and the variance of each is calculated; the variance of each window is compared with that of the preceding or following windows to determine the occurrence and continuation of an activity, thereby realizing activity segmentation. The rest of the paper is organized as follows. Section 2 introduces CSI acquisition and data preprocessing, Sect. 3 presents the segmentation method, and Sect. 4 gives the segmentation results.
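A hedged sketch of the window-variance comparison idea described above; the window length, the variance-ratio test and its threshold are illustrative stand-ins, since the paper's exact rules and parameters are given in its method section:

```python
import numpy as np

def segment_by_window_variance(signal, win=100, ratio=5.0):
    """Return a boolean flag per window: True = judged 'active'.

    A window starts an activity when its variance exceeds `ratio` times the
    previous window's variance; the activity continues until the variance
    drops back by the same factor.  `win` and `ratio` are illustrative.
    """
    n = len(signal) // win
    var = np.array([signal[i * win:(i + 1) * win].var() for i in range(n)])
    active = np.zeros(n, dtype=bool)
    for i in range(1, n):
        if not active[i - 1]:
            active[i] = var[i] > ratio * var[i - 1]   # activity starts
        else:
            active[i] = var[i] * ratio > var[i - 1]   # not yet dropped back
    return active
```

On a quiet signal with one high-variance burst, only the burst's windows are flagged, which is exactly the start/continuation behavior the abstract describes.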
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_136
Activity Segmentation by Using Window Variance Comparison Method
1037
2 Data Collection and Preprocessing
2.1 Data Collection
The CSI measurements are obtained from the data packets received by the receiver. According to the documentation of the CSI tool, each data packet contains a three-dimensional data matrix of size Ntx × Nrx × 30, where Ntx and Nrx are the numbers of transmitter and receiver antennas, respectively, and 30 is the number of subcarriers. In this study, the receiver has three antennas and the transmitter has one, so each packet contains 1 × 3 × 30 CSI measurements.

2.2 PCA
After calculating the absolute value of the original data and removing the DC component, we obtain a matrix of dimension 90 × N that contains only amplitude information, where 90 is the stacking of the subcarriers received by the three antennas and N is the number of packets. The fluctuation of the CSI amplitude when an action occurs can be observed in each row vector of the matrix, but it is very weak. Therefore, we use principal component analysis (PCA) to reduce the dimension of the matrix, so that a single vector summarizes the changing characteristics of all the row vectors; here we use the second principal component. Figure 1 shows the variation of the amplitude of the second principal component.
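The PCA step can be sketched as follows; computing PCA via an SVD is an implementation choice, as the paper does not specify one:

```python
import numpy as np

def second_principal_component(H):
    """H: 90 x N amplitude matrix (rows = subcarrier streams, cols = packets).
    Returns the projection of the N packets onto the 2nd principal direction."""
    X = H - H.mean(axis=1, keepdims=True)       # remove each row's mean (DC)
    # PCA via SVD of the (packets x streams) data matrix
    U, s, Vt = np.linalg.svd(X.T, full_matrices=False)
    return U[:, 1] * s[1]                       # scores of the 2nd component
```

The returned length-N vector is what Fig. 1 plots against the packet index (up to an arbitrary sign, which is inherent to PCA).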
[Plot: amplitude (−40 to 40) versus packet index (0 to 5000)]
Fig. 1 Amplitude of the second principal component
2.3 Data Transformation
Figure 1 clearly shows that the data contain two activities and that the CSI amplitude fluctuates more obviously while an activity occurs. However, the CSI amplitude can also fluctuate considerably over long inactive periods, for example, 4000–5000
packets in Fig. 1. Such large fluctuations make the variance very large, which is inconvenient for activity segmentation. To solve this problem, we define a data transformation as follows:

\[ H(i) = \begin{cases} H_2(i) - \left[\sum_{n=1}^{i+40} H_2(n)\right]/(i+40) & i \le 10 \\ H_2(i) - \left[\sum_{n=i-10}^{N} H_2(n)\right]/(N-i+9) & i \ge N-40 \\ H_2(i) - \left[\sum_{n=i-10}^{i+40} H_2(n)\right]/51 & 10 < i < N-40 \end{cases} \tag{1} \]

… ree and the second packet cannot squeeze the transmission time of the first packet, there is no need to do any processing for the two packets. For the second internal iteration, since r2 > ree and r3 > ree, we can move s3 to the right until the transmission energy of the second and third packets is minimized. Finally, in the third internal iteration, since r3 > ree and r4 < ree, s4 is moved to the right until r4 = ree. The energy consumption w1 of the system is then calculated after the three internal iterations, completing one external iteration. Repeating the above process, the total energy consumption gradually decreases and tends to a stable value. The proposed algorithm is summarized in Table 1.
Fig. 1 Take four packets as an example to illustrate the proposed algorithm
1084
Y. Du et al.
Table 1 Recursive algorithm to minimize the energy of transmitting packets with a low computational complexity

Step 1: Calculate the optimal transmission rate ree.
Step 2: Initialize the start time and the transmission duration of each packet: set each packet's start time equal to its arrival time (si = ti), set the transmission duration of the ith packet to τi = si+1 − si, and then calculate each packet's transmission rate ri.
Step 3: Perform the internal iteration on the first and second data packets, distinguishing four cases:
i. If r1 ≥ ree and r2 ≥ ree, keep the total transmission time τ1 + τ2 of the two packets unchanged and shift the start time s2 of the second packet to the right until the energy consumed for transmitting the two packets is minimized.
ii. If r1 ≥ ree and r2 < ree, keep τ1 + τ2 unchanged and move s2 to the right until the transmission rate of the second packet becomes ree.
iii. If r1 < ree and r2 < ree, keep the transmission durations and start times of the two packets unchanged.
iv. If r1 < ree and r2 ≥ ree, keep the transmission durations and start times of the two packets unchanged.
Step 4: Process the second and third packets, the third and fourth packets, …, and the (M − 1)th and Mth packets in the same way to complete one external iteration.
Step 5: Repeat Steps 3 and 4 N − 1 times.
Step 6: Check whether the transmission rate ri of each packet is less than ree; if so, replace ri with ree.
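A minimal sketch of Steps 1–5 in Python, under the per-packet energy model used later in Sect. 4 (w(r) = (B/r)(e^r − 1 + p0)); the function names, the bisection for ree, and the ternary-search sub-routine for case i are our own choices, not taken from the paper:

```python
import math

def r_ee(p0):
    """Step 1: rate minimizing (e^r - 1 + p0)/r, found by bisection
    on the stationarity condition e^r (r - 1) = p0 - 1."""
    lo, hi = 1e-6, 20.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if math.exp(mid) * (mid - 1.0) < p0 - 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def packet_energy(tau, B, p0):
    """Energy to send B bits in time tau: duration times (e^r - 1 + p0)."""
    r = B / tau
    return tau * (math.exp(r) - 1.0 + p0)

def best_split(total, B, p0, lo_bound):
    """Case i: ternary search for the first packet's duration that minimizes
    the pair's combined energy, never shrinking it below lo_bound."""
    margin = 1e-3 * total
    lo, hi = max(lo_bound, margin), total - margin
    for _ in range(200):
        a, b = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        ea = packet_energy(a, B, p0) + packet_energy(total - a, B, p0)
        eb = packet_energy(b, B, p0) + packet_energy(total - b, B, p0)
        if ea < eb:
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

def recursive_schedule(arrivals, T, B, p0, outer_iters=3):
    """Steps 2-5: pairwise internal iterations over adjacent packets,
    repeated outer_iters times; returns durations and total energy."""
    ree = r_ee(p0)
    M = len(arrivals)
    tau = [arrivals[i + 1] - arrivals[i] for i in range(M - 1)] + [T - arrivals[-1]]
    for _ in range(outer_iters):
        for i in range(M - 1):
            total = tau[i] + tau[i + 1]
            ri, rj = B / tau[i], B / tau[i + 1]
            if ri >= ree and rj >= ree:          # case i: rebalance the pair
                tau[i] = best_split(total, B, p0, lo_bound=tau[i])
                tau[i + 1] = total - tau[i]
            elif ri >= ree and rj < ree:         # case ii: stretch packet i
                tau[i + 1] = B / ree             # until r_{i+1} = ree
                tau[i] = total - tau[i + 1]
            # cases iii and iv: leave both durations unchanged
    return tau, sum(packet_energy(t, B, p0) for t in tau)
```

With arrival times {0, 0.5, 2, 2.5}, T = 4, B = 5 bits and p0 = 20 W, a few external iterations equalize the adjacent rates and the total energy falls well below the instantly-pass value, while the total transmission window is preserved.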
4 Simulation

Figure 2 shows the relationship between the energy consumption and the transmission rate per unit bandwidth when the circuit power is 5, 10, 15, or 20 watts and the packet size is 5 bits. Since the energy consumption for each packet is w(r) = (B/r)(e^r − 1 + p0), each curve has a minimum point; that is, ree minimizes the energy consumption for transmitting a single data packet. In Fig. 2, the values of ree for these four curves are 1.72, 2.10, 2.34, and 2.52 bit/s, respectively.

Figure 3 illustrates the energy consumption of four data packets after three external iterations. The initial rates of the four packets are {6, 2, 6, 2} (unit: bit/s). After the first external iteration, the transmission rates are {4.68, 2.79, 4.68, 2.79} (unit: bit/s). After the second external iteration, the transmission rates are {4, 4, 4, 4} (unit: bit/s). After each external iteration, the transmission rate of each data packet approaches the optimal value, and the total energy consumption decreases.

Figure 4 compares the energy consumption of the two different methods. In the instantly pass strategy, a data packet is transmitted immediately when it arrives, and its transmission is completed when the next packet arrives. The circuit power is 20 watts. The number of data packets is gradually increased, and the packets arrive at equally spaced times, i.e., ti = (i − 1)T/M. Because the recursive algorithm can minimize the energy of transmitting packets with a low computational complexity by optimizing the
A Recursive Algorithm for the Energy Minimization …
1085
Fig. 2 Relationship between the energy consumption and the transmission rate for the different circuit powers
Fig. 3 Energy consumption of four data packets after three external iterations
data packet transmission time and the start transmission time, it can reduce the energy consumption compared to the instantly pass strategy.
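As a quick check of the minimum points quoted above, the stated energy function can be scanned numerically; the grid range and step below are arbitrary illustrative choices (B drops out of the minimizer):

```python
import math

def w(r, B, p0):
    """Per-packet energy at rate r: transmit time B/r times power e^r - 1 + p0."""
    return (B / r) * (math.exp(r) - 1.0 + p0)

def r_ee(p0, B=5.0):
    """Locate the energy-minimizing rate by a fine grid search."""
    rates = [0.5 + 0.001 * i for i in range(5001)]      # 0.5 .. 5.5 bit/s
    return min(rates, key=lambda r: w(r, B, p0))

print([round(r_ee(p0), 2) for p0 in (5, 10, 15, 20)])   # -> [1.72, 2.1, 2.34, 2.52]
```

The four minima agree with the ree values read off the curves in Fig. 2.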
Fig. 4 Comparison of the proposed recursive energy-minimization algorithm with the instantly pass strategy
5 Conclusion

In this paper, we considered a wireless transmission system with non-negligible circuit power consumption and investigated the problem of minimizing the energy consumption for transmitting data packets. A recursive algorithm that minimizes the transmission energy with a low computational complexity was then proposed. The algorithm consists of two layers of iterations and minimizes the energy consumption by reallocating the start time and the transmission duration of each data packet. Simulation results showed that, as the number of iterations increases, the total energy consumption approaches the optimal value, and the proposed algorithm outperforms the instantly pass strategy.

Acknowledgements. This work was supported by the Open Research Fund of National Mobile Communications Research Laboratory, Southeast University (No. 2020D18), the Research Foundation of Jinling Institute of Technology for Advanced Talents (No. 40620044), the Innovative Training Program for College Students in Jiangsu Province (202013573068Y), and the High-Level Demonstration Project and Construction Pilot Zone of Sino-foreign University Cooperation in Sino-British Communication Engineering of Jiangsu Province.
Speech Emotion Recognition Algorithm for School Bullying Detection Based on MFCC-PCA-SVM Classification

Yuhao Wang1, Xinsheng Wang1(B), and Chenguang He2

1 Harbin Institute of Technology at Weihai, Weihai 264200, China
[email protected], [email protected] 2 Harbin Institute of Technology, Harbin 150080, China [email protected]
Abstract. School bullying is common in school life and has a negative impact on students' physical and mental health. At present, research on school bullying in China and abroad mostly relies on manual effort. In this paper, a school bullying detection algorithm based on pattern recognition techniques is presented. The authors first collect and pre-process the emotional speech data and extract the MFCC features. They then reduce the feature dimension to 6 with the PCA algorithm and design a two-level SVM model in series with a linear kernel for classification. The proposed algorithm achieves a high recognition performance: the accuracy reaches 86.59% and the F1-Measure reaches 87.36%.

Keywords: School bullying · Pattern recognition · MFCC · Wrapper · PCA · SVM
1 Introduction

In recent years, school bullying has seriously harmed victims' normal study and life. However, victims often do not dare to report the situation to teachers and parents because of their self-esteem and fear of reprisals, which allows school bullying to keep growing. Based on a Finnish emotional speech database, this paper proposes an efficient method combining Wrapper feature selection, PCA dimension reduction, and an SVM classification model. The goal of this research is to have students wear a microphone device, collect emotional speech data in real time, and identify whether bullying occurs. Once bullying is detected, the algorithm sends the result to teachers and parents in time so as to prevent school bullying.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_143
Speech Emotion Recognition Algorithm for School Bullying …
1089
School bullying detection requires an active, real-time speech emotion detection algorithm to prevent bullying more effectively. Generally speaking, current speech emotion recognition performance is about 70–80% [1–3]. In this paper, the authors mainly use the Wrapper method to select features, use PCA to reduce the feature dimension, and use a two-level SVM classifier in series with optimized parameters, which reduces the computational complexity and improves the recognition performance in the bullying context. The accuracy rate is more than 85%, an average increase of 16%.

The remainder of this paper is organized as follows: Sect. 2 describes the bullying detection algorithm; Sect. 3 shows the experimental results; and Sect. 4 draws a conclusion.
2 Speech Emotion Recognition Algorithm

The speech database selected in this paper is a Finnish emotional speech database. According to subjective experience, the data are divided in advance into bullying context and non-bullying context based on the emotions. There are 84 bullying samples and 80 non-bullying samples in total. Given the number of speech samples, the algorithm uses two-fold cross-validation to divide the data set into a training set and a testing set, and labels the training set.

In this algorithm, the Mel Frequency Cepstrum Coefficient (MFCC) is used to extract the features of the data [4]. In pre-processing, a high-pass filter is used to pre-emphasize the speech signal. After framing, a Hamming window is added, and the fast Fourier transform (FFT) is applied to the signal of each frame to obtain its spectrum. The spectrum is then smoothed, and harmonics are eliminated, by a triangular band-pass filter bank Hm(k) satisfying formula (1):

Hm(k) = 0,                                      k < f(m − 1)
Hm(k) = (k − f(m − 1)) / (f(m) − f(m − 1)),     f(m − 1) ≤ k ≤ f(m)        (1)
Hm(k) = (f(m + 1) − k) / (f(m + 1) − f(m)),     f(m) < k ≤ f(m + 1)
Hm(k) = 0,                                      k > f(m + 1)

The triangular band-pass filter bank satisfies Σ_{m=0}^{M−1} Hm(k) = 1, with M taken as 24. Next, the logarithmic energy of each filter bank output is calculated, and the MFCC coefficients are obtained by the discrete cosine transform (DCT). Finally, the dynamic differential parameters are extracted. In this paper, the feature matrix extracted by MFCC represents one speech emotion sample per behavior. Each column is an MFCC parameter, with 36 dimensions in total; the MFCC coefficients, first-order differential parameters, and second-order differential parameters each account for one-third. Then, the matrix of each audio sample is averaged to obtain the corresponding eigenvector, and the vectors of all samples are stacked row by row into the eigenmatrix.
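A small sketch of the filter bank in formula (1); the boundary bin indices f below are hypothetical toy values, whereas in practice they would be placed uniformly on the mel scale:

```python
import numpy as np

def tri_filterbank(f, M):
    """Build the M triangular band-pass filters H_m(k) of formula (1).

    f: array of M + 2 increasing FFT-bin boundaries f(0) .. f(M + 1)."""
    n_bins = int(f[-1]) + 1
    H = np.zeros((M, n_bins))
    for m in range(1, M + 1):
        for k in range(n_bins):
            if f[m - 1] <= k <= f[m]:                       # rising edge
                H[m - 1, k] = (k - f[m - 1]) / (f[m] - f[m - 1])
            elif f[m] < k <= f[m + 1]:                      # falling edge
                H[m - 1, k] = (f[m + 1] - k) / (f[m + 1] - f[m])
    return H

# Toy boundaries for M = 4 filters (the paper uses M = 24)
H = tri_filterbank(np.array([0, 2, 5, 9, 14, 20]), M=4)
```

Because adjacent triangles share their edge slopes, the filters sum to 1 at every bin between f(1) and f(M), matching the normalization stated above.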
This yields an 84 × 36 bullying-context matrix and an 80 × 36 non-bullying-context matrix.

Two ideas are used for feature selection to reduce the feature dimension in this paper: the Filter method and the Wrapper method. The Filter method first selects the features of the data and then trains the classifier; the process of feature selection is independent
1090
Y. Wang et al.
of the type of the classifier. The threshold value δ and the number of nearest-neighbor samples k are set, and features whose output weight is greater than 0 are selected. The Wrapper method continuously selects feature subsets from the initial feature set, trains the classifier, and evaluates the subsets according to the performance of the learner until the best feature subset is selected. Although the computational cost of the Wrapper method is usually much larger than that of the Filter method, the Wrapper method yields a higher final recognition accuracy in the bullying context, so it is used in this algorithm. Section 3 gives an intuitive comparison of the recognition accuracy of the two methods.

Principal component analysis (PCA) is an unsupervised multivariate statistical analysis method. PCA-based dimension reduction uses the eigendecomposition of the feature matrix to synthesize high-dimensional, possibly correlated variables into linearly independent low-dimensional variables. The transfer matrix and the new feature matrix after dimension reduction are preserved and fed to the classifier for training. In this paper, PCA reduces the feature dimension to 6.

A support vector machine (SVM) maps low-dimensional linearly inseparable samples into a high-dimensional space where they become linearly separable [5]. By comparing the three available kernel functions, the linear kernel is found suitable for the algorithm and gives a high recognition accuracy; Sect. 3 shows a comparison of the results. The new feature matrix after PCA dimension reduction goes through the following steps: (a) the feature matrix and labels are used to train an SVM model; (b) the predicted output labels are used as input to a new SVM model; (c) the new SVM model exchanges the training set and test set, leading to a significant improvement in recognition accuracy.
In this paper, a two-dimensional confusion matrix is used to represent the classification results, and the four parameters accuracy, precision, recall, and F1-Measure are used to measure the recognition performance of the algorithm for school bullying detection.
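Our reading of steps (a)–(c) as a scikit-learn sketch; the synthetic 36-dim features, class means, and random seed are invented stand-ins for the MFCC matrices, which are not reproduced here:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# synthetic stand-ins: 84 bullying and 80 non-bullying 36-dim feature vectors
X = np.vstack([rng.normal(1.0, 1.0, (84, 36)), rng.normal(-1.0, 1.0, (80, 36))])
y = np.array([1] * 84 + [0] * 80)

# two-fold split, mirroring the paper's two-fold cross-validation
idx = rng.permutation(len(y))
tr, te = idx[: len(y) // 2], idx[len(y) // 2:]

pca = PCA(n_components=6).fit(X[tr])            # reduce features to 6 dims
Xtr, Xte = pca.transform(X[tr]), pca.transform(X[te])

svm1 = SVC(kernel="linear").fit(Xtr, y[tr])     # (a) first-level linear SVM
pred_te = svm1.predict(Xte)                     # (b) its labels feed level two
# (c) the second level swaps the folds: it trains on the test fold with the
# level-one labels and predicts the original training fold
svm2 = SVC(kernel="linear").fit(Xte, pred_te)
pred_tr = svm2.predict(Xtr)
acc = float(np.mean(pred_tr == y[tr]))
```

On well-separated synthetic classes like these, both levels classify nearly perfectly; the interesting behavior reported below arises on the real, noisier speech features.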
3 Classification Results

In Sect. 2, the authors mention that the Wrapper method is better than the Filter method for subsequent classification because it selects features for a specific classifier. Feature selection is performed with both the Filter method and the Wrapper method, while a one-level SVM with a linear kernel is used for classification. Table 1 shows the recognition-performance confusion matrix for feature selection by the Filter method, and Table 2 shows that for the Wrapper method. The resulting accuracy, precision, recall, and F1-Measure are given in Table 3, from which it can be concluded that the Wrapper method is more suitable for the classifier and gives a higher recognition performance.

Next, the Wrapper method is used for feature selection, and different kernel functions are compared in a one-level SVM classification. Tables 4, 5 and 6 give the recognition-performance confusion matrices for the Gaussian, linear, and polynomial kernels, respectively. The resulting accuracy, precision, recall, and F1-Measure are given in Table 7.
Table 1. Recognition performance confusion matrix using Filter method

Filter          Non-bullying (%)   Bullying (%)
Non-bullying    72.50              27.50
Bullying        33.33              66.67
Table 2. Recognition performance confusion matrix using Wrapper method

Wrapper         Non-bullying (%)   Bullying (%)
Non-bullying    72.50              27.50
Bullying        26.19              73.81
Table 3. Recognition performance of feature selection by Filter method and Wrapper method

Method    Accuracy (%)   Precision (%)   Recall (%)   F1-measure (%)
Filter    69.51          71.79           66.67        69.14
Wrapper   73.17          73.81           73.81        73.81
Table 4. Recognition performance confusion matrix of Gaussian kernel function for SVM

Gaussian        Non-bullying (%)   Bullying (%)
Non-bullying    75.00              25.00
Bullying        38.10              61.90
Table 5. Recognition performance confusion matrix of linear kernel function for SVM

Linear          Non-bullying (%)   Bullying (%)
Non-bullying    72.50              27.50
Bullying        26.19              73.81
In conclusion, Table 7 shows that the recognition performance of the linear kernel function is much better than the others. In Sect. 2, the authors put forward the idea of using a two-level SVM classifier: the output labels of the first level are the input labels of the second level, and the training set and test set are exchanged. The following shows the classification and recognition performance with Wrapper feature selection, PCA dimension reduction, and one-level as well as two-level linear-kernel SVMs. Table 8 uses the one-level SVM model,
Table 6. Recognition performance confusion matrix of polynomial kernel function for SVM

Polynomial      Non-bullying (%)   Bullying (%)
Non-bullying    72.50              27.50
Bullying        40.48              59.52
Table 7. Recognition performance of different kernel functions in SVM

Kernel function   Accuracy (%)   Precision (%)   Recall (%)   F1-measure (%)
Gaussian          68.29          72.22           61.90        66.67
Linear            73.17          73.81           73.81        73.81
Polynomial        65.85          69.44           59.52        64.10
and Table 9 uses the two-level SVM model in series; the second-level SVM exchanges the training set and test set.

Table 8. Recognition performance confusion matrix of one-level SVM classifier

One-level       Non-bullying (%)   Bullying (%)
Non-bullying    72.50              27.50
Bullying        26.19              73.81
Table 9. Recognition performance confusion matrix of two-level SVM classifier

Two-level       Non-bullying (%)   Bullying (%)
Non-bullying    82.50              17.50
Bullying        9.52               90.48
It can be seen from Table 10 that the recognition performance of the two-level series SVM classification is 18.34% higher, in relative terms, than that of the one-level SVM classification, achieving an excellent recognition performance.
4 Conclusion

In this paper, a speech emotion recognition algorithm for school bullying detection is proposed based on a Finnish emotional speech database. The algorithm uses pattern recognition techniques to analyze speech actively and judge whether bullying is occurring, so as to effectively prevent school bullying.
Table 10. SVM classification recognition performance of different levels

Number of SVM levels   Accuracy (%)   Precision (%)   Recall (%)   F1-measure (%)
One-level              73.17          73.81           73.81        73.81
Two-level              86.59          84.44           90.48        87.36
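The Table 10 figures for the two-level classifier can be reproduced from the Table 9 confusion matrix together with the fold sizes (40 non-bullying and 42 bullying test samples, i.e., half of the 80 and 84 samples under two-fold cross-validation — our inference); the helper below is purely illustrative:

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall and F1 with bullying as the positive class."""
    acc = (tp + tn) / (tp + fn + fp + tn)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return acc, prec, rec, f1

# Table 9: 90.48% of 42 bullying samples -> 38 true positives (4 misses);
# 82.50% of 40 non-bullying samples -> 33 true negatives (7 false alarms)
acc, prec, rec, f1 = metrics(tp=38, fn=4, fp=7, tn=33)
print([round(100 * v, 2) for v in (acc, prec, rec, f1)])  # -> [86.59, 84.44, 90.48, 87.36]
```

The four values agree exactly with the two-level row of Table 10, which supports the counts assumed in the comments.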
In terms of feature selection, the Wrapper method is used instead of the Filter method, and the SVM classifier is optimized with a two-level classifier in series. As a result, the accuracy of bullying-context recognition improves from 69.51% to 73.17%, and then to 86.59%, which is a very good effect. However, the number of samples per class is currently only about 80, which is relatively small and limits the choice of feature dimension reduction and classifier. In future work, it is necessary to expand the samples appropriately and find a more suitable classifier, so as to further improve the emotion recognition accuracy in the bullying context.

Acknowledgements. This work was supported by the National Natural Science Foundation of China under Grant No. 61971158.
References
1. Hao R, Liang Y, Yue L, Xuejun S (2017) Speech emotion recognition algorithm based on multi-layer SVM classification. Appl Res Comput 06-1682-03
2. Liu M, Li X, Chen H (2019) Research on speech and emotional recognition algorithm based on SVM. Harbin University of Science and Technology, 1007-2683.04-0118-09
3. Schuller B, Zhang Z, Weninger F et al (2012) Synthesized speech for model training in cross-corpus recognition of human emotion. Int J Speech Technol 15(3):313–323
4. Weiliang C, Xiao S (2015) Mandarin speech emotion recognition based on MFCCG-PCA. Acta Sci Natur Univ Pekin 51(2):269–274
5. Wang Z, Zhao Z et al (2015) Solving one-class problem with outlier examples by SVM. Neurocomputing 149(PA):100–105
3D Anchor Generating Network: 3D Object Detection via Learned Anchor

Huinan Li(B) and Yongping Xie

School of Information and Communication Engineering, Dalian University of Technology, Dalian, Liaoning, China
[email protected]
Abstract. In both 2D and 3D object detection algorithms, the reference boxes obtained by the anchor mechanism lay the foundation for the subsequent detection tasks of the network. Most existing anchor mechanisms are designed by hand to generate a dense array of anchors. Anchors obtained in this way have uniform sizes and are densely distributed over the image space, which gives poor robustness and a large number of redundant anchors. In view of these defects, we propose the 3D anchor generating network. It predicts 3D anchors by learning the semantic features of the picture, so that the anchors are sparsely distributed around the objects in the image, with different sizes at different positions. We applied it to TLNet's baseline monocular network for 3D object detection on the KITTI dataset, and the results showed a significant improvement in the performance of 3D object detection.

Keywords: 3D object detection · Anchor mechanism · Convolutional neural network · KITTI dataset
1 Introduction

3D object detection is the task of recognizing an object's category, length, width, height, and rotation angle. Most common 3D object detection algorithms [1–3, 15] rely on an anchor mechanism to obtain reference targets as learning samples for the network. The anchor mechanism is also widely used in traditional detection algorithms, e.g., Fast R-CNN [4], Faster R-CNN [5], SSD [6], YOLOv3 [14], and RetinaNet [7]. The anchor mechanism first takes each pixel on the feature map as the center point of an anchor; multiple anchors are then obtained at each anchor point by manually setting different scales and aspect ratios. This anchor mechanism depends on manual experience and the specific dataset, which brings many defects and does not realize the self-learning and generalization of the model.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_144
3D Anchor Generating Network: 3D Object Detection via Learned …
1095
In 2D object detection, many works [9, 10, 16] have proposed improved versions of the anchor mechanism. MetaAnchor [10] proposed a mathematical model that predicts anchors with a flexible function. Guided-Anchoring [9] designed a guided anchoring and feature adaptation scheme. FoveaBox [16] proposed a completely anchor-free object detector that directly predicts the boundaries of the target objects end-to-end. Motivated by the above literature, we propose a novel 3D anchor scheme that avoids picking anchors by hand in advance.

In this work, we propose the 3D anchor generating network as a new anchor mechanism. In Sect. 2, we introduce the specific architecture of the network and study its effect on the performance of 3D object detection algorithms. In Sect. 3, we test the detection performance of the converged network on the KITTI dataset. The experimental results show that our proposed anchor mechanism is indeed helpful for improving the performance of the 3D object detection network.

We summarize our main contributions as follows:
1. A 3D anchor generating network that predicts 3D anchors by fully mining the feature information of the RGB image.
2. An improved 3D anchor mechanism that can be flexibly integrated into any anchor-based object detector.
3. We confirm that the improved mechanism is more robust to the distributions of datasets, with better flexibility and accuracy in performance.
2 Approach

The overall structure of the 3D object detection network using the proposed 3D anchor mechanism is shown in Fig. 1. The proposed flexible 3D anchor generating network is a branch structure that consists of a baseline feature extraction network and four sub-networks with specific functions.
Fig. 1. Pipeline of network for 3D object detection. It consists of 3D anchor generating network and TLNet’s baseline monocular network
2.1 Baseline Feature Extraction Network

In the proposed algorithm, we adopt the very deep convolutional network (VGG) [12] with 16 layers as the baseline feature extraction network. In addition, the feature pyramid network (FPN) [11] is introduced to construct a feature map with the same size as the original input image.
1096
H. Li and Y. Xie
2.2 3D Anchor Generating Network

In the task of image-based 3D object detection, predicting the bounding box of an object in 3D space is crucial and difficult. To resolve it better, we decompose the 3D anchor estimation problem into five sub-tasks. Hence, the proposed 3D anchor estimation network is composed of several sub-modules: anchor depth estimation, estimation of the coordinates (X, Y) of the anchor center in the camera coordinate system, and prediction of the anchor's 3D dimensions and orientation. The overall structure of the 3D anchor estimation network is shown in Fig. 2.
Fig. 2. Overall structure of 3D anchor generating network
2.2.1 Coordinates Estimation of Objects' 3D Center Points

Here, we locate the objects by predicting the coordinates of their 3D center points. It is difficult to predict the coordinates (X, Y, Z) directly from a monocular image, so we propose two sub-networks to predict the coordinates (X, Y) and the coordinate Z, respectively. Task Net3 in Fig. 3a is used to predict the (X, Y) coordinates of the objects, while the Z coordinate, i.e., the depth of the objects, is predicted by the network structure in Fig. 3b.

2.2.2 Anchor 3D Dimensions Estimation

After the coarse 3D location estimation of objects, we turn our attention to the prediction of the objects' 3D scales. The detailed network structure is shown as Task Net2 in Fig. 3a. The difference from the above sub-networks is that the dimension of the last convolution layer is three.

2.2.3 Anchor Orientation Estimation

The orientation θ is a scalar. Inspired by [17, 18, 22], we transform the scalar value into an eight-dimensional vector. The first four values are responsible for predicting angles in B1 = [−7π/6, π/6], and the remaining four are for angles in B2 = [−π/6, 7π/6]. The first two values of each of the two parts are used as the input of a softmax function to obtain the probability that the angle falls in the corresponding range. After the division
Fig. 3. Detailed structure of sub-networks
of angles, each sub-range has a reference starting angle cj, i.e., c1 = −7π/6 and c2 = −π/6. The last two values of each part are used to predict the sin and cos of the offset between the angle falling in the range and the reference starting angle. The eight-dimensional vector can be written as α = [b11, b12, a11, a12, b21, b22, a21, a22]. The final predicted angle θ is calculated from this vector by

θ = arctan2(aj1, aj2) + cj        (1)

Here, j is the index of the range in which the predicted angle is located.

2.3 Application of the 3D Anchor Generating Network

To obtain the final 3D object detection results, we choose TLNet [8]'s baseline monocular network to test the 3D anchor generating network. To limit the range of the dataset label values, we use decentralization and normalization for data processing. Here, we use the symbol G to represent both the coordinates (X, Y, Z) of the object's center point in the camera coordinate system and the object's scale (L, W, H). The decentralization and normalization are uniformly expressed in terms of G as follows:

G′ = (G − Gmin) / (Gmax − Gmin)        (2)
By formula (2), the network prediction values are limited to the range [0, 1]. When the network predictions are obtained, the formula is inverted to recover the actual values of the 3D anchor parameters.
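A sketch of how the eight-dimensional vector α could be decoded with formula (1); which slots hold the bin confidences versus the sin/cos offsets follows our reading of the text and is an assumption:

```python
import math

def decode_orientation(alpha):
    """Decode alpha = [b11, b12, a11, a12, b21, b22, a21, a22] into theta.

    c holds the reference start angles c1 = -7*pi/6 and c2 = -pi/6."""
    c = [-7.0 * math.pi / 6.0, -math.pi / 6.0]

    def bin_conf(b1, b2):
        # softmax over (b_j1, b_j2): confidence that the angle lies in bin j
        e1, e2 = math.exp(b1), math.exp(b2)
        return e1 / (e1 + e2)

    confs = [bin_conf(alpha[0], alpha[1]), bin_conf(alpha[4], alpha[5])]
    j = 0 if confs[0] >= confs[1] else 1
    a1, a2 = alpha[4 * j + 2], alpha[4 * j + 3]  # predicted sin and cos of the offset
    return math.atan2(a1, a2) + c[j]             # formula (1)

# Example: bin 2 confident, offset pi/3 above c2 -> theta = pi/3 - pi/6 = pi/6
alpha = [-5.0, 5.0, 0.0, 1.0, 5.0, -5.0, math.sin(math.pi / 3), math.cos(math.pi / 3)]
theta = decode_orientation(alpha)
```

Using `atan2` on the (sin, cos) pair recovers the offset with the correct sign over the full bin width, which is the point of predicting two values instead of the angle directly.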
2.4 Training

First, we train the first sub-network, which is responsible for 2D object classification, for 20 K iterations. We then add the remaining four sub-networks to the anchor generating network and train for 100 K iterations. Next, 80 K iterations are used to fuse the anchor generating network with TLNet's baseline monocular network. Finally, we train and optimize the converged model as a whole for another 20 K iterations. The experimental evaluation is carried out on a single NVidia GTX-2080ti GPU.
3 Experiment

3.1 Evaluation Results

3.1.1 3D Object Detection

We compare the 3D object detection results of the proposed method with the state-of-the-art monocular 3D object detectors Mono3D [19], MF3D [20], MonoGRNet [13], and MoGRKNet [21], all of which choose the car category as the predicted target object. To compare the models fairly, we also take the car as the object of interest. The quantitative results are reported in Table 1. The experimental results show that the proposed model is the most advantageous when the difficulty of the detection task is moderate or hard. When the task difficulty is easy, our algorithm ranks second, slightly worse than the best one. These results show that our algorithm is skilled at dealing with occluded and truncated objects. The advantage in performance is mainly due to the fact that the proposed algorithm infers 3D bounding boxes in 3D space rather than only in the image plane, whereas other algorithms, such as MoGRKNet [21], detect the target objects directly on the image plane and suffer from a visual blind area when detecting occluded and truncated objects.

Table 1. 3D detection results. Average precision of 3D bounding boxes on KITTI validation

Method           Data   APBEV (IoU = 0.3)      APBEV (IoU = 0.5)      APBEV (IoU = 0.7)
                        Easy   Moderate Hard   Easy   Moderate Hard   Easy   Moderate Hard
Mono3D           Mono   32.76  25.15    23.65  30.50  22.39    19.16  5.22   5.19     4.13
MF3D             Mono   /      /        /      55.02  36.73    31.27  22.03  13.63    11.60
MonoGRNet        Mono   73.10  60.66    46.86  54.21  39.69    33.06  24.97  19.44    16.30
TLNet(baseline)  Mono   74.18  57.04    50.17  52.72  37.22    32.16  21.91  15.72    14.32
ours             Mono   74.25  62.54    51.36  52.27  41.97    35.18  25.39  20.32    16.97
3.1.2 BEV Object Detection

The APBEV values of the model evaluation metrics are shown in Table 2. The trends in the data are similar to the AP3D results. The experimental results therefore once again verify the advantages of the proposed model in dealing with complex situations.

Table 2. BEV detection results. Average precision of BEV bounding boxes on KITTI validation dataset

Method           Data   AP3D (IoU = 0.3)       AP3D (IoU = 0.5)       AP3D (IoU = 0.7)
                        Easy   Moderate Hard   Easy   Moderate Hard   Easy   Moderate Hard
Mono3D           Mono   28.29  23.21    19.49  25.19  18.20    15.22  2.53   2.31     2.31
MF3D             Mono   /      /        /      47.88  29.48    26.44  10.53  5.69     5.39
MonoGRNet        Mono   72.17  59.57    46.08  50.51  36.97    30.82  13.88  10.19    7.62
MoGRKNet         Mono   /      /        /      50.82  31.28    20.21  13.96  7.37     4.54
TLNet(baseline)  Mono   72.91  55.72    49.19  48.34  33.98    28.67  13.77  9.72     9.29
ours             Mono   72.81  61.45    51.32  50.64  38.49    32.38  18.19  14.48    13.24
3.1.3 Visual Results

We visualize the experimental results in Fig. 4. The second row shows the detection results of the anchor generation network; the predicted 3D anchors are basically consistent with the ground truths. The third row shows the final object detection results, which demonstrate that the proposed anchor mechanism performs well in 3D object detection tasks and saves expensive hardware resources and time.

3.2 Ablation Study

3.2.1 Comparison with the TLNet Baseline

First, we evaluate our 3D anchor generating network by embedding it into TLNet's baseline monocular network. In that network, the 3D anchors serve as reference 3D boxes for coarse positioning of objects; they are uniformly sampled at an interval of 0.25 m on the ground plane within a view frustum of 70 m depth range. To reach a sufficient recall rate, the anchor mechanism must generate a huge number of anchors, which adds an extra burden to the later network. In our work, the predicted 3D anchors derived from the proposed anchor mechanism are sparsely distributed around the target object. Compared with TLNet, this removes a large number of redundant anchors, cuts the network burden, and ensures higher accuracy. According to Table 1, our model outperforms TLNet's baseline monocular model in all cases, which clearly shows that the proposed anchor mechanism has performance advantages.
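To get a feel for the anchor counts involved, the following Python sketch enumerates a TLNet-style uniform grid at the 0.25 m interval and 70 m depth range quoted above; the 40 m ground-plane width and the 32 sparse anchors kept by the learned generator are illustrative assumptions, not numbers from the paper.

```python
import numpy as np

# Hypothetical ground-plane extent: a 70 m deep frustum, roughly 40 m wide
# (the width is an assumption, not a value from the paper).
depth_range = np.arange(0.0, 70.0, 0.25)   # 0.25 m sampling interval
width_range = np.arange(-20.0, 20.0, 0.25)

# Dense TLNet-style grid: one anchor per (x, z) cell on the ground plane.
zz, xx = np.meshgrid(depth_range, width_range, indexing="ij")
dense_anchors = np.stack([xx.ravel(), zz.ravel()], axis=1)
print(len(dense_anchors))  # 280 * 160 = 44800 candidate positions

# A learned anchor generator instead keeps only a few positions around each
# detected object; 32 here stands in for that sparse set.
sparse_anchors = dense_anchors[np.random.choice(len(dense_anchors), 32, replace=False)]
print(len(sparse_anchors))  # 32 — orders of magnitude fewer boxes to score
```

Scoring tens of thousands of reference boxes per image is where the "extra burden" of the uniform scheme comes from, which is what the learned sparse anchors avoid.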
H. Li and Y. Xie
Fig. 4. Visual results. The green bounding boxes in the RGB image are the detection results of the proposed algorithm. In the point-cloud visualization, the detection results are shown as red bounding boxes and the ground truths as green bounding boxes
4 Conclusion and Future Work

In this paper, we propose a 3D anchor generating network that leverages multiple sub-networks to detect the target objects. It breaks the task of 3D object detection down into several sub-tasks. Each sub-network obtains its prediction results from the semantic information extracted from a single image. Finally, the predictions of the sub-networks are aggregated to obtain sparse 3D anchors of different sizes. The experimental results on the KITTI 3D object detection dataset show that the proposed algorithm is superior to the state-of-the-art approaches in speed and precision. In future work, we will follow the latest algorithms in binocular 3D object detection and further study how to apply the proposed 3D anchor generating network to binocular object detection systems.
References

1. Kehl W, Manhardt F, Tombari F, Ilic S, Navab N (2017) SSD-6D: making RGB-based 3D detection and 6D pose estimation great again. ICCV
2. Brachmann E, Michel F, Krull A, Yang MY, Gumhold S, Rother C (2016) Uncertainty-driven 6D pose estimation of objects and scenes from a single RGB image. CVPR
3. Kehl W, Manhardt F, Tombari F, Ilic S, Navab N (2017) SSD-6D: making RGB-based 3D detection and 6D pose estimation great again. ICCV
4. Girshick R (2015) Fast R-CNN. In: IEEE international conference on computer vision. ICCV
5. Ren S, He K, Girshick R, Sun J (2017) Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Trans Pattern Anal Mach Intell
6. Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C-Y, Berg AC (2016) SSD: single shot multibox detector. ECCV
7. Lin T-Y, Goyal P, Girshick R, He K, Dollár P (2017) Focal loss for dense object detection. arXiv preprint arXiv:1708.02002
8. Qin Z, Wang J, Lu Y (2019) Triangulation learning network: from monocular to stereo 3D object detection. CVPR
9. Wang J, Chen K, Yang S, Loy CC, Lin D (2019) Region proposal by guided anchoring. CVPR
10. Yang T, Zhang X, Li Z, Zhang W, Sun J (2018) MetaAnchor: learning to detect objects with customized anchors. NeurIPS
11. Lin T-Y, Dollár P, Girshick R, He K, Hariharan B, Belongie S (2017) Feature pyramid networks for object detection. CVPR
12. Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. ICLR
13. Qin Z, Wang J, Lu Y (2018) MonoGRNet: a geometric reasoning network for monocular 3D object localization. CVPR
14. Redmon J, Farhadi A (2018) YOLOv3: an incremental improvement. arXiv preprint arXiv:1804.02767
15. Li P, Chen X, Shen S (2019) Stereo R-CNN based 3D object detection for autonomous driving. CVPR
16. Kong T, Sun F, Liu H, Jiang Y, Shi J (2019) FoveaBox: beyond anchor-based object detector. arXiv preprint arXiv:1904.03797
17. Hu H-N, Cai Q-Z, Wang D, Lin J, Sun M, Krähenbühl P, Darrell T, Yu F (2018) Joint monocular 3D vehicle detection and tracking. arXiv preprint arXiv:1811.10742
18. Mousavian A, Anguelov D, Flynn J, Kosecka J (2017) 3D bounding box estimation using deep learning and geometry. CVPR
19. Chen X, Kundu K, Zhang Z, Ma H, Fidler S, Urtasun R (2016) Monocular 3D object detection for autonomous driving. In: Conference on computer vision and pattern recognition (CVPR), pp 2147–2156
20. Xu B, Chen Z (2018) Multi-level fusion based 3D object detection from monocular images. In: Computer vision and pattern recognition (CVPR), pp 2345–2353
21. Barabanau I, Artemov A, Burnaev E (2019) Monocular 3D object detection via geometric reasoning on keypoints. CVPR
22. Zhou X, Wang D, Krähenbühl P (2019) Objects as points. arXiv preprint arXiv:1904.07850
Handoff Decision Using Fuzzy Logic in Heterogeneous Wireless Networks

Liwei Yang(B) and Qi Zhang

College of Information and Electrical Engineering, China Agricultural University, Beijing, China
[email protected]
Abstract. Visible light communication (VLC) systems are becoming a promising means of wireless communication, due to their high data rate, low implementation cost, and immunity to radio frequency (RF) interference. However, VLC systems often suffer from service disruptions because of the limited coverage of light. In this paper, we consider a VLC–WiFi heterogeneous system to take advantage of both technologies, including increased capacity and ubiquitous coverage. We integrate VLC and WiFi and propose a handoff decision method using fuzzy logic in heterogeneous wireless networks. The design of the handover decision algorithm is described, which dynamically distributes resources. Experiments show that the system works well and can avoid disconnection effectively. Keywords: Fuzzy logic · Visible light communication (VLC) · Wireless fidelity (WiFi)
1 Introduction

With the development of communication technology, our requirements for communication services have changed from audio services with narrow spectrum resources to video services with wider spectrum resources. Considering the demands of quantity and quality, communication networks are developing towards high performance, high quality, high speed, and high throughput. Visible light communication (VLC) networks have emerged as a promising alternative for the next generation of wireless networks (5G and beyond), due to their high data rate, flexible coverage, and freedom from radio frequency (RF) interference [1–3]. They use light emitting diodes (LEDs) to transmit data by modulating the intensity of visible light and can be used as a complementary wireless access technology to WiFi and cellular systems [4]. However, VLC systems suffer from service disruptions due to the limited coverage of light, and they are also susceptible to blocking caused by misalignment or path obstructions. In addition, the problem of spectrum shortage will become increasingly prominent in the future.
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_145
To solve these problems, we propose a hybrid VLC–WiFi network in which the WiFi network is exploited to overcome the limited VLC coverage [5, 6]. WiFi provides full coverage, while VLC serves heavily loaded areas to achieve high-speed access, since VLC offers large capacity but small coverage. In general, compared with WiFi-only and VLC-only systems, a hybrid VLC–WiFi system performs better, especially in high-user-density regions [7]. The VLC–WiFi heterogeneous network has several advantages. First, users can switch between different resources without dead zones and with fast switching speed, which largely compensates for VLC's small coverage and vulnerability to external interference. Second, the frequency ranges of WiFi and VLC do not overlap, so there is no co-channel interference between them; the throughput of the heterogeneous network can therefore take a qualitative leap. Finally, WiFi and VLC cooperate with each other, which greatly relieves users' traffic pressure: resources are allocated according to certain principles to keep the network speed at a certain level and effectively avoid network congestion. Moreover, we also propose a novel handoff decision method using fuzzy logic in heterogeneous wireless networks. Fuzzy logic can not only deal with imprecise input decision criteria but also combine and evaluate multiple criteria at the same time. Bandwidth, delay, jitter, and BER are employed as fuzzy-logic input parameters, which are adequate for vertical handoff (VHO) decision-making. Thus the hybrid network can realize high-speed, flexible wireless access.
2 System Model and Experimental Setup

Next-generation wireless communication requires seamless connection between different access networks with high quality of service, high data rate, low price, and multimedia applications. Heterogeneous networks integrate different networks; they can provide different services and quality and combine the advantages of the constituent networks. Handoff therefore occurs frequently in heterogeneous networks, and effective handoff decision-making is a significant research topic. The mechanism of transferring an ongoing call from one cell to another, or a mobile user from one network to another, is called handoff. Many wireless technologies need to be interconnected, such as macro-cells, femto-cells, and WLAN. Handover is the process of determining the best expected access network and deciding whether to switch at any particular time. Therefore, we need a good handover mechanism to provide seamless connection between the VLC and WiFi access networks, so that the hybrid network can achieve high-speed, flexible wireless access. We established a VLC–WiFi hybrid network system model, as shown in Fig. 1. The system is composed of multiple VLC access points and one WiFi access point. Mobile devices can choose VLC access or WiFi access; when VLC is blocked and interrupted, users can switch to WiFi access immediately to continue transmission. In the indoor hybrid VLC–WiFi system, a central controller performs signal processing and scheduling to allocate resources. We assume that each mobile terminal (MT) is equipped with VLC and WiFi transceivers, and users can only send or receive on a certain channel in a given time slot. We use a joint resource allocation algorithm in the heterogeneous wireless network. As long as the user is in the range of the WiFi channel,
Fig. 1. Proposed VLC–WiFi hybrid network system model
even if the VLC access point is not available, the user can maintain a full connection. Due to the resource allocation mechanism, the user has no reserved channel and must connect to a VLC AP or the WiFi AP. The channel selection strategy allows multiple users to share resources effectively, and the controller coordinates downlink traffic accordingly. The simulation setup of our hybrid VLC–WiFi system is shown in Fig. 2. Handover is a process adopted by cellular systems to avoid service disconnections due to mobility. Combined horizontal handover and vertical handover (VHO) are used to mitigate the user-experience degradation caused by switching between frequencies during the user's movement. Problems with vertical handover arise when researching VLC–WiFi heterogeneous networks. Using the dynamic handover scheme based on fuzzy logic, the mobile terminal can switch freely between VLC and WiFi hotspots and realize dynamic, flexible downlink allocation between the VLC and WiFi channels, which not only improves the throughput but also meets the user's quality of service (QoS) requirements.
3 Results and Discussion

We developed a typical VLC–WiFi hybrid network architecture model and focus on the handover decision-making problem. An appropriate handover mechanism can optimize system performance and make the cooperation between different networks closer. Users can first choose VLC as the serving network, making full use of VLC's high data rate and reducing the traffic load of the WiFi network. After analysis and comparison, a handover scheme based on fuzzy logic was selected. Fuzzy logic can not only deal with imprecise input decision criteria but also combine and evaluate multiple criteria at the same time. In addition, using bandwidth, delay, jitter, and BER as input parameters of the fuzzy logic provides the basis for the VHO decision. There are four steps in the fuzzy-logic-based handover algorithm: fuzzification, rule evaluation, defuzzification, and decision-making. In the fuzzy-logic-based decision method, the score of every candidate network is calculated from the QoS attributes,
Fig. 2. Simulation model of hybrid VLC–WiFi system
given by

A = \arg\max_{i \in N} \sum_{j=1}^{M} w_j P_{ij}    (1)
where A is the best QoS cell, N is the number of candidate networks, M is the number of QoS attributes, Pij represents the j-th attribute of the i-th network, and wj is the priority of Pij. We use the Fuzzy Logic Toolbox of MATLAB to simulate the handover mechanism. The QoS attributes contain four variables, namely bandwidth, delay, jitter, and BER, which are combined to determine the score of each candidate network. Each QoS attribute is represented by three fuzzy sets: low, medium, and high, with the triangular function as the membership function. From Fig. 3, the network score values signify that WiFi is the best network in 0–2 s and 9–10 s, VLC1 is better in 2–6 s, and VLC2 is selected in 6–9 s. The simulation results show that the user always chooses the network with the best QoS and the handover scheme works well in the hybrid VLC–WiFi network. With our scheme, users can switch successfully at the edge of any network and are served by the best candidate network when moving between networks; the VLC channels supplement WiFi communications well. For comparison, we also simulated the handover scheme of simple additive weighting (SAW), with results shown in Fig. 4. From the analysis and results, we conclude that the fuzzy-logic-based handoff scheme better matches the real network environment.
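The scoring in (1), together with the triangular fuzzification it relies on, can be sketched as follows in Python; all membership breakpoints, attribute scores, and weights below are illustrative assumptions rather than values from the simulation.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with feet a, c and peak b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# Fuzzify a crisp, normalized bandwidth reading into low/medium/high sets.
bw = 0.7
memberships = {
    "low":    tri(bw, -0.5, 0.0, 0.5),
    "medium": tri(bw,  0.0, 0.5, 1.0),
    "high":   tri(bw,  0.5, 1.0, 1.5),
}

# After rule evaluation and defuzzification, each network i holds a
# score P[i][j] per QoS attribute j (bandwidth, delay, jitter, BER).
P = np.array([
    [0.9, 0.4, 0.5, 0.6],   # VLC1
    [0.7, 0.6, 0.6, 0.5],   # VLC2
    [0.5, 0.8, 0.7, 0.8],   # WiFi
])
w = np.array([0.4, 0.3, 0.2, 0.1])  # attribute priorities w_j

scores = P @ w                 # sum_j w_j * P_ij for each network i
best = int(np.argmax(scores))  # A = argmax_i, as in Eq. (1)
print(memberships, scores, best)
```

With these illustrative weights the crisp bandwidth 0.7 is 60% "medium" and 40% "high", and the third network (WiFi) wins the weighted score.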
Fig. 3. Scores of VLC and WiFi
Fig. 4. Scores of VLC and WiFi with the handover scheme of SAW
4 Conclusion

The coexistence of VLC and WiFi is a promising research field. Joint deployment of VLC and WiFi communication can improve system efficiency and user rate. In addition, the handover strategy is an important aspect of ensuring communication quality. This paper proposes a VLC–WiFi hybrid system to verify the ability of the VLC channel to add capacity on top of existing WiFi. Simulation results show that the fuzzy-logic-based method can flexibly allocate resources in the hybrid system; users can freely choose different access networks to ensure smooth communication, and can improve the
consistency of network selection by reducing unnecessary switching times. It not only improves the throughput, but also guarantees the quality of service. Acknowledgements. This work has been supported by the National Natural Science Foundation of China (NOs. 61705260, 61701213), Chinese Universities Scientific Fund (NOs. 2018QC112, 2018QC073), Natural Science Funds of Fujian (No. 2018J01546).
References

1. Zafar F, Bakaul M, Parthiban R (2017) Laser-diode-based visible light communication: toward gigabit class communication. IEEE Commun Mag 55(2):144–151
2. Pergoloni S, Biagi M, Colonnese S, Cusani R, Scarano G (2016) Optimized LEDs footprinting for indoor visible light communication networks. IEEE Photonics Technol Lett 28(4):532–535
3. Liu X, Gong C, Li S, Xu Z (2016) Signal characterization and receiver design for visible light communication under weak illuminance. IEEE Commun Lett 20(7):1349–1352
4. Sensor Research (2020) Reports outline sensor research study findings from Motilal Nehru National Institute of Technology Allahabad (Fuzzy logic based effective clustering of homogeneous wireless sensor networks for mobile sink). J Technol Sci
5. Baradaran AA, Navi K (2020) HQCA-WSN: high-quality clustering algorithm and optimal cluster head selection using fuzzy logic in wireless sensor networks. Fuzzy Sets Syst 389
6. Bushra N, Razali N, Siti ZMH (2019) Reduction in ping-pong effect in heterogeneous networks using fuzzy logic. Soft Comput 23(1)
7. Mohamed S, Hanan K, Mona E-G (2018) Novel type-2 fuzzy logic technique for handover problems in a heterogeneous network. Eng Optim 50(9)
Campus Bullying Detection Based on Speech Emotion Recognition

Chenke Wang1(B), Daning Zhang2, and Liang Ye3

1 Harbin Institute of Technology, Weihai 264200, China
[email protected]
2 Xiamen University, Xiamen 361102, China
[email protected]
3 Harbin Institute of Technology, Harbin 150080, China
[email protected]
Abstract. Campus bullying is now receiving worldwide attention, and a great number of countermeasures have been introduced. This article introduces speech emotion recognition into the detection of bullying incidents on campus. The audio data are pre-processed through frame division and endpoint detection. Based on the victim's emotional changes during bullying, 36 MFCC features are extracted. This article uses the Wrapper algorithm combined with a KNN classifier for feature selection and dimensionality reduction to construct the system model. The model achieves an accuracy of about 80% and has proved to be stable, which shows promise for campus bullying detection.

Keywords: Campus bullying · Speech emotion recognition · MFCC · K-nearest neighbor

1 Introduction

1.1 Background
Campus bullying refers to bullying and oppression arising from unequal power among children on campus, mainly including physical or verbal attacks, resistance and exclusion in interpersonal interactions, and comments and ridicule about body parts [7]. Norwegian scholar Dan Olweus defines campus bullying as a situation in which a student is exposed, repeatedly and over a long time, to negative behaviors led by one or more students. A study by Yale University in the USA shows that the probability of suicide among adolescents who suffer from campus bullying is two to nine times that of ordinary teenagers, and the suicide rate of adolescents who carry out or participate in campus bullying is also much higher than that of ordinary youth [1]. Due to the frequency and severity of its occurrence, campus bullying has attracted the attention of countries all over the world [3, 5, 12]. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2021 Q. Liang et al. (eds.), Communications, Signal Processing, and Systems, Lecture Notes in Electrical Engineering 654, https://doi.org/10.1007/978-981-15-8411-4_146
1.2 Research Review and Our Work
In the machine learning and artificial intelligence industries, pattern recognition is one of the main research directions. This article mainly discusses the application of speech emotion recognition to campus bullying. Finland, Sweden, and other European countries were the first to begin research on campus bullying. To address the problem, Finland implemented the KiVa project, driven by parents and teachers [4, 10]. However, it proves ineffective when some students are afraid to speak out. Stop Bullies, Campus Safety, and other campus bullying governance solutions based on smart devices must be triggered by the victim, which may be difficult when bullying occurs. Consequently, pattern recognition was introduced into the governance of campus bullying, and the above difficulties can be effectively solved. Since human motion recognition technology is relatively mature, most articles choose to recognize campus bullying from actions, using sensors to obtain students' movement information and identify whether they are suffering bullying [2, 9, 11]. This article instead introduces speech emotion recognition to detect campus bullying, whose data are easier to collect and which is smarter [6, 8]. The structure of this article is as follows:
• Section 1 analyzes the harm of campus bullying and the current state of research. On this basis, it lays out the research plan of this article.
• Section 2 proposes the speech emotion recognition system for campus bullying and the algorithms selected therein.
• Section 3 simulates the process on a large campus bullying dataset and draws some conclusions.
2 Audio-Based Feature Extraction and Classification
Through a certain device, audio fragments under bullying and non-bullying actions are collected, and the fragments are labeled correspondingly. The audio is divided into two types: bullying and non-bullying. The following describes a system designed for bullying feature extraction and classification of speech emotion, and then tests its recognition performance.

2.1 Bullying Data Preprocessing
For the short-term stability of audio, we divide the data into frames. The sampling frequency of the audio data is 8 kHz, the length of each frame is set to 256 samples, and the frame shift to 128 samples. This article chooses the Hamming window to weight each frame:

W(n, a) = (1 - a) - a\cos\left(\frac{2\pi n}{N - 1}\right)

where a = 0.46, n is the sample index within a frame, and N is the length of a frame. In the process of collecting data, we found serious noise at the beginning and end of the audio, which is regarded as invalid frames. Consequently, this
article selects two frames before and after as the invalid segments and then calculates the short-term energy and zero-crossing rate, respectively, for endpoint detection:

E_n = \sum_{k=0}^{N-1} x_n^2(k)

Z_n = \frac{1}{2} \sum_{m=1}^{N-1} \left| \mathrm{sgn}(x_n(m)) - \mathrm{sgn}(x_n(m-1)) \right|

Besides, in order to eliminate the effect of lip radiation, this article boosts the high-frequency part of the audio with a first-order FIR high-pass filter:

H(z) = 1 - \mu z^{-1}

where \mu is usually in 0.9–1.0; here we set \mu = 0.97.
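The preprocessing chain of this subsection (pre-emphasis, framing with a 256-sample Hamming window and 128-sample shift, short-term energy, and zero-crossing rate) can be sketched in Python as below; the 1 s test tone is an assumption, and the zero-crossing rate here is normalized per sample rather than summed.

```python
import numpy as np

def preprocess(x, frame_len=256, hop=128, mu=0.97):
    """Pre-emphasize, frame, and window a mono 8 kHz signal as in Sect. 2.1."""
    # First-order FIR high-pass (pre-emphasis): y[n] = x[n] - mu * x[n-1]
    y = np.append(x[0], x[1:] - mu * x[:-1])
    n_frames = 1 + (len(y) - frame_len) // hop
    win = np.hamming(frame_len)  # (1 - a) - a*cos(2*pi*n/(N-1)) with a = 0.46
    frames = np.stack([y[i * hop : i * hop + frame_len] * win
                       for i in range(n_frames)])
    energy = np.sum(frames**2, axis=1)   # short-term energy E_n
    # zero-crossing rate, normalized per sample (mean instead of plain sum)
    zcr = 0.5 * np.mean(np.abs(np.diff(np.sign(frames), axis=1)), axis=1)
    return frames, energy, zcr

x = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # 1 s test tone at 8 kHz
frames, energy, zcr = preprocess(x)
print(frames.shape)  # (61, 256): 32 ms frames with a 16 ms shift
```

Frames whose energy and zero-crossing rate fall below thresholds can then be discarded as the invalid head and tail segments described above.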
2.2 Feature Extraction
The emotions of the tested children differ greatly between bullying and non-bullying environments. The main emotions when non-bullying are 'happy,' 'laugh,' 'calm,' and so on; otherwise they are mostly 'frightened' and 'angry.' Hence, we use mel-scale frequency cepstral coefficient (MFCC) features to recognize the audio emotion, which can distinguish bullying and non-bullying audio. The human ear has sensitive frequency points that are nonlinearly distributed during sound reception. The mel scale maps frequency to this scale to simulate the characteristics of the human ear's perception of sound, giving Formula (1) and the nonlinear relationship in Fig. 1.

\mathrm{Mel}(f) = 2595 \lg\left(1 + \frac{f}{700}\right)    (1)

MFCC is the cepstrum parameter extracted in the frequency domain of the mel scale, so we transform the signal frequency with (2).

X(k) = \sum_{n=0}^{N-1} x(n) e^{-j 2\pi nk/N}, \quad (0 \le n, k \le N - 1)    (2)
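Equation (2) is the standard DFT, so a library FFT can be used in practice; the short sketch below checks a direct evaluation of (2) against NumPy's FFT on random data.

```python
import numpy as np

N = 256
x = np.random.default_rng(0).standard_normal(N)

# Direct evaluation of Eq. (2): X(k) = sum_n x(n) * exp(-j*2*pi*n*k/N)
n = np.arange(N)
X_direct = np.array([np.sum(x * np.exp(-2j * np.pi * n * k / N))
                     for k in range(N)])

# numpy's FFT implements the same transform, only faster (O(N log N))
X_fft = np.fft.fft(x)
print(np.allclose(X_direct, X_fft))  # True
```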
The energy spectrum is passed through a bank of M mel-scale triangular filters with center frequencies f(m), m = 1, 2, ..., M. M is usually 22–26, and M = 24 in this article. When m is smaller, the filter interval is smaller, and vice versa, thereby simulating the hearing characteristics of the human ear. The frequency response of each band-pass triangular filter is:

H_m(k) = \begin{cases} 0, & k < f(m-1) \\ \dfrac{k - f(m-1)}{f(m) - f(m-1)}, & f(m-1) \le k \le f(m) \\ \dfrac{f(m+1) - k}{f(m+1) - f(m)}, & f(m) \le k \le f(m+1) \\ 0, & k > f(m+1) \end{cases}    (3)
Fig. 1. Map frequency to mel scale
\sum_{m=0}^{M-1} H_m(k) = 1
Hence, we get the logarithm of the output energy with Formula (4).

s(m) = \ln\left( \sum_{k=0}^{N-1} |X_a(k)|^2 H_m(k) \right), \quad 0 \le m \le M    (4)
Then, the MFCC features can be calculated with the discrete cosine transform:

C(n) = \sum_{m=0}^{M-1} s(m) \cos\left( \frac{\pi n (m - 0.5)}{M} \right), \quad n = 1, 2, \cdots, L

L is usually 12–16; this article chooses L = 12. In order to reflect the dynamic characteristics of continuous audio, this article also uses first-order and second-order differential MFCC coefficients, whose calculation is given in (5).