3D Imaging—Multidimensional Signal Processing and Deep Learning: Multidimensional Signals, Images, Video Processing and Applications, Volume 2 (Smart Innovation, Systems and Technologies Book 298) 981192452X, 9789811924521



English Pages [237]


Table of contents :
Preface
Contents
About the Editors
1 Study on Remote Sensing Retrieval of Land Surface Evapotranspiration in Poyang Lake Region in Typical Hydrological Year
1.1 Introduction
1.2 Study Area and Data Source
1.2.1 Study Area
1.2.2 Data
1.3 Methodology
1.3.1 ET Estimation
1.3.2 Accuracy Evaluation Method
1.4 Results and Discussions
1.4.1 Validation of Evapotranspiration Results
1.4.2 Variation Characteristics of ET
1.4.3 Analysis of Evapotranspiration-Influencing Factors
1.5 Conclusions
References
2 Research on Image Quality Assessment of Vehicle Panoramic View Equipment
2.1 Introduction
2.2 Scene Layout
2.3 Software Calculation Steps
2.3.1 Generate Training Data Set and Obtain Test Data Set
2.3.2 Build and Train the YOLOv3 Network Model
2.3.3 Score the YOLOv3 Network Model
2.4 Conclusion
References
3 Global Analysis of Hepatitis B Virus Infection Model with Linear Drug Therapy Function
3.1 Introduction
3.2 Mathematical Model
3.3 The Existence of Equilibria and Basic Reproduction Number
3.4 The Global Stability of Equilibria
3.5 Discussion
References
4 Research and Design of “AI+ Agriculture” Disease Detection System Based on Deep Learning
4.1 Introduction
4.2 Background
4.3 The Problem to Which the System is Directed
4.4 Overall System Design
4.5 Study on Preprocessing of Crop Image Samples
4.6 Study on Preprocessing of Crop Image Samples
4.7 Study on Agricultural Disease Detection Algorithm
4.7.1 Study on Classical Machine Learning Image Classification Algorithm and Advanced Deep Learning Classification Algorithms
4.7.2 Building the Systematic Model for Agricultural Disease Detection
4.8 Experimental Results and Analysis
4.8.1 Data Collection and Creation
4.8.2 Experimental Procedure and Results
4.9 System Development and Design
4.9.1 Study on Scientific Design of Database
4.9.2 Agricultural Encyclopedia Module Information Entry
4.9.3 Study on Intelligent Planning
4.9.4 Mobile Segment Development Ideas
4.10 Conclusion
References
5 Prediction of Silicon Content in Molten Iron Based on EMD-GA-LSTM
5.1 Introduction
5.2 Theoretical Knowledge
5.2.1 Empirical Mode Decomposition (EMD)
5.2.2 Genetic Algorithm (GA)
5.2.3 Long Short-Term Memory (LSTM)
5.3 EMD-GA-LSTM Algorithm Introduction
5.4 Verification of Measured Data
5.4.1 Performance Index
5.4.2 Data Set Introduction
5.4.3 EMD Decomposition
5.4.4 LSTM Improved by GA
5.5 Model Comparison
5.6 Conclusion
References
6 Using Meta Path Information to Adjust Embedded Recommendation System
6.1 Introduction
6.2 Related Work
6.2.1 Embedding
6.2.2 Meta Path
6.3 Model Figure
6.4 Experiments and Analysis
6.4.1 Datasets
6.4.2 Setup
6.4.3 Baselines
6.4.4 Conclusion and Analysis
6.5 Conclusion
References
7 Intelligent Campus Blog Preprocessing Model
7.1 Introduction
7.2 Data Acquisition Layer
7.3 Preliminary Filter Layer
7.4 Secondary Filter Layer
7.5 Experimental Test
7.6 Summary
References
8 An Experimental Study on Extraction Method of Tobacco Color Distribution Feature
8.1 Introduction
8.1.1 Innovation and Technical Level of Research Results
8.1.2 Scope of Application of Research Results
8.1.3 Popularization and Application of Research Results and Their Economic and Social Benefits
8.2 Experiment on Color Value Analysis of Tobacco Leaves
8.2.1 Image Acquisition
8.2.2 Image Binarization
8.2.3 Histogram Analysis
8.2.4 Color Value Analysis
8.2.5 Regional Analysis
8.3 Online Consistency Test of Tobacco Leaves
8.4 Summary
References
9 Research on the Distribution Map of Weeds in Rice Field Based on SegNet
9.1 Introduction
9.2 Materials and Methods
9.2.1 Experimental Data Collection
9.2.2 Introduction to Image Segmentation Based on Deep Learning
9.3 Results and Analysis
9.3.1 Training Result Display
9.3.2 Establishment of Unified Evaluation Coefficient of Model
9.4 Conclusion
References
10 Research on the Application of Artificial Intelligence in Unstructured Data Resource Management of Power News Information
10.1 Introduction
10.2 Methods
10.3 Result Analysis
10.4 Conclusion
References
11 Analysis and Application of Medical Images in the Field of Artificial Intelligence
11.1 Medical Image Analysis and Its Technical Development Direction
11.2 Artificial Intelligence and Algorithms
11.3 Case Introduction
11.4 Specific Application
11.5 Conclusion
References
12 Development and Design of Surface Quality Online Inspection System Based on Machine Vision
12.1 Introduction
12.2 Workpiece Surface High-Speed Online Detection Technology
12.3 System Design
12.4 Conclusion
References
13 Terahertz Surface-Plasmon-Polaritons Gradient Index Lens
13.1 Introduction
13.2 Principle and Simulation Realization
13.2.1 Plasmonic Fisheye Lens
13.2.2 Plasmonic Eaton Lens
13.3 Conclusion
References
14 Improvement of YOLOv4 Algorithm and Its Application in Objective Identification of Railway Level Crossing
14.1 Introduction
14.2 YOLOv4 Convolutional Network Model
14.2.1 The YOLOv4 Backbone Feature Extraction Network
14.2.2 Features to Strengthen the Network
14.3 Improved YOLOv4 Model
14.3.1 Activation Function
14.3.2 Image Upsampling
14.3.3 The Advanced Semantic Embedding Branch Module (SEB)
14.4 Simulation Experiment and Result Analysis
14.4.1 Experimentation
14.4.2 Experimental Evaluation Indicators
14.4.3 Experimental Results and Analysis
14.4.4 Target Identification and Tracking Test at the Railway Level Crossing
14.5 Conclusion
References
15 Design of Automatic Verification System for Electronic Balance Based on Machine Vision
15.1 Introduction
15.2 Method
15.2.1 System Design
15.2.2 Control System
15.2.3 Visual System
15.3 Result Analysis
15.4 Conclusion
References
16 Research on Interactive Art Design System Based on Artificial Intelligence Technology
16.1 Artificial Intelligence Technology and Human–Computer Interaction Technology
16.2 The Application of Artificial Intelligence Technology
16.2.1 Database Construction
16.2.2 Artificial Intelligence Data Cleaning
16.3 Results Analysis
16.4 Conclusion
References
17 Infrared Imaging Circuit Multi-Channel Analog-To-Digital Sampling Circuit Inconsistency Correction
17.1 Inconsistency of The Analog–Digital Sampling Circuit
17.2 Inconsistency Correction Method
17.2.1 Two-Point Temperature Calibration Non-Uniformity Correction Algorithm
17.2.2 Analog–Digital Sampling Circuit Inconsistency Correction Model
17.3 Experimental Results and Analysis
17.3.1 Inconsistency Correction Experiment
17.3.2 Analysis and Comparison of Results
17.4 Conclusion
References
18 Research on Artificial Intelligence System Design of Ideological and Political Education
18.1 Introduction
18.2 Methods
18.3 Result Analysis
18.4 Conclusion
References
19 Research on 3D Virtual Emotional Interaction Design Based on Network Digitization
19.1 Overview of Network 3D Virtual Emotional Interaction
19.2 Construction of Hierarchical Tower Model
19.3 Interaction Design Function and Structure
19.4 Design and Development Process of Emotional Interaction
19.4.1 Create a 3D Scene Model in 3Ds Max
19.4.2 Create Character Models and Animations in 3Ds Max
19.4.3 Output Models and Actions in 3Ds Max
19.4.4 3D Virtual Products Model Imported During Interaction Module Design
19.4.5 Release 3D Virtual Products
19.5 Conclusion
References
20 Research and Implementation of Digital Art Media System Based on Big Data Aesthetics
20.1 Introduction
20.2 Method
20.2.1 Requirement Analysis
20.2.2 System Design
20.2.3 Module Design
20.3 Result Analysis
20.3.1 System Implementation
20.3.2 System Test
20.4 Conclusion
References
21 Research on Application of OpenGL-Based Game Design
21.1 Introduction
21.2 Method
21.2.1 Environment Configuration
21.2.2 Architecture Design
21.2.3 Technical Implementation
21.3 Result Analysis
21.4 Conclusion
References
22 Application of Artificial Intelligence Technology in Marketing
22.1 The Connotation of Artificial Intelligence
22.2 The Current Application of Artificial Intelligence Technology in Marketing
22.3 Future Development Trend
22.4 Conclusion
References
23 Accuracy Analysis and Verification of Ground-Based Radar Differential Interferometry
23.1 Introduction
23.2 GBRI Deformation Monitoring
23.3 Analysis of Deformation Sensitivity
23.4 Simulation and Measurements
23.4.1 Simulation
23.4.2 Experimental Measurements
23.5 Conclusions
References
Author Index

Smart Innovation, Systems and Technologies 298

Lakhmi C. Jain · Roumen Kountchev · Yonghang Tai · Roumiana Kountcheva, Editors

3D Imaging—Multidimensional Signal Processing and Deep Learning: Multidimensional Signals, Images, Video Processing and Applications, Volume 2

Smart Innovation, Systems and Technologies Volume 298

Series Editors
Robert J. Howlett, Bournemouth University and KES International, Shoreham-by-Sea, UK
Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK

The Smart Innovation, Systems and Technologies book series encompasses the topics of knowledge, intelligence, innovation and sustainability. The aim of the series is to make available a platform for the publication of books on all aspects of single and multi-disciplinary research on these themes in order to make the latest results available in a readily-accessible form. Volumes on interdisciplinary research combining two or more of these areas are particularly sought. The series covers systems and paradigms that employ knowledge and intelligence in a broad sense. Its scope is systems having embedded knowledge and intelligence, which may be applied to the solution of world problems in industry, the environment and the community. It also focusses on the knowledge-transfer methodologies and innovation strategies employed to make this happen effectively. The combination of intelligent systems tools and a broad range of applications introduces a need for a synergy of disciplines from science, technology, business and the humanities. The series will include conference proceedings, edited collections, monographs, handbooks, reference books, and other relevant types of book in areas of science and technology where smart systems and technologies can offer innovative solutions. High quality content is an essential feature for all book proposals accepted for the series. It is expected that editors of all accepted volumes will ensure that contributions are subjected to an appropriate level of reviewing process and adhere to KES quality principles. Indexed by SCOPUS, EI Compendex, INSPEC, WTI Frankfurt eG, zbMATH, Japanese Science and Technology Agency (JST), SCImago, DBLP. All books published in the series are submitted for consideration in Web of Science.


Editors
Lakhmi C. Jain, KES International, Shoreham-by-Sea, UK
Roumen Kountchev, Technical University of Sofia, Sofia, Bulgaria
Yonghang Tai, Yunnan Normal University, Kunming, China
Roumiana Kountcheva, TK Engineering, Sofia, Bulgaria

ISSN 2190-3018 ISSN 2190-3026 (electronic) Smart Innovation, Systems and Technologies ISBN 978-981-19-2451-4 ISBN 978-981-19-2452-1 (eBook) https://doi.org/10.1007/978-981-19-2452-1 © The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use. The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. This Springer imprint is published by the registered company Springer Nature Singapore Pte Ltd. The registered company address is: 152 Beach Road, #21-01/04 Gateway East, Singapore 189721, Singapore

Preface

This book (Volume 2) contains high-quality peer-reviewed research papers presented at the Third International Conference on 3D Imaging Technologies—Multidimensional Signal Processing and Deep Learning (3DIT-MSP&DL), which was organized by IRNet China and held on December 26–28, 2021, at Yunnan Normal University in Kunming, China. The papers cover a wide range of the most topical areas in 3D image representation and technologies, multidimensional signal, image, and video processing and coding, together with related mathematical approaches and applications. The papers present research achievements in: study on remote sensing retrieval of land surface evapotranspiration in Poyang Lake region; image quality assessment of vehicle panoramic view equipment; analysis of hepatitis B virus infection model with linear drug therapy function; research and design of “AI+ agriculture” disease detection system based on deep learning; prediction of silicon content in molten iron based on EMD-GA-LSTM; using meta path information to adjust embedded recommendation system; intelligent campus blog preprocessing model; experimental study on extraction method of tobacco colour distribution feature; research on the distribution map of weeds in rice field based on SegNet; application of AI in unstructured data resource management of power news information; analysis and application of AI algorithm in medical image; development and design of surface quality online inspection system based on machine vision; terahertz surface-plasmon-polaritons gradient index lens; improvement of YOLOv4 algorithm and its application in objective identification of railway level crossing; design of automatic verification system for electronic balance based on machine vision; research on interactive art design system based on AI technology; infrared imaging circuit multi-channel analogue-to-digital sampling circuit inconsistency correction; research on AI system design of ideological and political education; research on 3D virtual emotional interaction design based on network digitization; research and implementation of digital art media system based on big data aesthetics; research on application of OpenGL-based game design; application of AI technology in marketing; accuracy analysis and verification of ground-based radar differential interferometry.


The aim of the book is to present the latest achievements of the authors to a wide range of readers: IT specialists, engineers, physicians, Ph.D. students and other specialists in the area.

February 2022

Lakhmi C. Jain, Shoreham-by-Sea, UK
Roumen Kountchev, Sofia, Bulgaria
Yonghang Tai, Kunming, China
Roumiana Kountcheva, Sofia, Bulgaria

Acknowledgments

The book editors express their special thanks to the book chapter reviewers for their efforts and goodwill in helping with the successful preparation of the book. Special thanks go to Prof. Lakhmi C. Jain (Honorary Chair), Prof. Dr. Srikanta Patnaik, Prof. Dr. Junsheng Shi and Prof. Dr. Roumen Kountchev (General Chairs), Prof. Dr. Yingkai Liu (Organising Chair) and Prof. Dr. Yonghang Tai (Programme Chair) of 3DIT-MSP&DL. The editors express their warmest thanks to the excellent Springer team which made this book possible.

Contents

1 Study on Remote Sensing Retrieval of Land Surface Evapotranspiration in Poyang Lake Region in Typical Hydrological Year
  Bin Li, Chaoshuai You, Xin Pan, and Chao Ma
2 Research on Image Quality Assessment of Vehicle Panoramic View Equipment
  Changyuan Yang and Xuan Dong
3 Global Analysis of Hepatitis B Virus Infection Model with Linear Drug Therapy Function
  Fang Zheng
4 Research and Design of “AI+ Agriculture” Disease Detection System Based on Deep Learning
  Biao Xu, Chengzhao Luo, and Shiyi Xie
5 Prediction of Silicon Content in Molten Iron Based on EMD-GA-LSTM
  Junqi Yang, Haoran Wang, Xiuhe Wang, and Lintong Zhang
6 Using Meta Path Information to Adjust Embedded Recommendation System
  Rui Ma, Xiuzhuo Wei, Huinan Zhao, Hui Sun, and Suhua Wang
7 Intelligent Campus Blog Preprocessing Model
  Shihao Li, Sheng Zhu, Aihua You, Xiaoyun He, and Ze Yang
8 An Experimental Study on Extraction Method of Tobacco Color Distribution Feature
  Ruilin Luo, Yingchun Li, Yuecheng Qi, Yicheng Zhang, Chunjie Zhang, and Dongdong Yang
9 Research on the Distribution Map of Weeds in Rice Field Based on SegNet
  Sheng Zhu, Shihao Li, and Ze Yang
10 Research on the Application of Artificial Intelligence in Unstructured Data Resource Management of Power News Information
  Jia Li, Shaoming Lai, and Chenlan Gu
11 Analysis and Application of Medical Images in the Field of Artificial Intelligence
  Ya Li and Jintang He
12 Development and Design of Surface Quality Online Inspection System Based on Machine Vision
  Min Huang
13 Terahertz Surface-Plasmon-Polaritons Gradient Index Lens
  Shenghao Gu, Mingming Sun, and Ying Zhang
14 Improvement of YOLOv4 Algorithm and Its Application in Objective Identification of Railway Level Crossing
  Ying Che, Lijun Yun, and Rong Chen
15 Design of Automatic Verification System for Electronic Balance Based on Machine Vision
  Juanjuan Zhang, Ying Wu, and Huan Gao
16 Research on Interactive Art Design System Based on Artificial Intelligence Technology
  Xiaoyan Wei and Rensheng Wei
17 Infrared Imaging Circuit Multi-Channel Analog-To-Digital Sampling Circuit Inconsistency Correction
  Chuanlin Tang, Linhuai Xiang, Zhibin Hu, Xingrong Zeng, and Peng Liu
18 Research on Artificial Intelligence System Design of Ideological and Political Education
  Yi Zhai
19 Research on 3D Virtual Emotional Interaction Design Based on Network Digitization
  Xiaoshuai Jing and Xin Xie
20 Research and Implementation of Digital Art Media System Based on Big Data Aesthetics
  Zixuan Liu
21 Research on Application of OpenGL-Based Game Design
  Xinle Liu
22 Application of Artificial Intelligence Technology in Marketing
  Qin Xiao
23 Accuracy Analysis and Verification of Ground-Based Radar Differential Interferometry
  Chunming Han, Yunqi Meng, Guangfu Li, and Yixing Ding
Author Index

About the Editors

Lakhmi C. Jain Ph.D., Dr. H.C., ME, BE (Hons), Fellow (Engineers Australia), is with the Liverpool Hope University and the University of Arad. He was formerly with the University of Technology Sydney, the University of Canberra and Bournemouth University. Professor Jain founded KES International to provide a professional community with opportunities for publications, knowledge exchange, cooperation and teaming. Involving around 5000 researchers drawn from universities and companies world-wide, KES facilitates international cooperation and generates synergy in teaching and research. KES regularly provides networking opportunities for the professional community through one of the largest conferences of its kind in the area of KES. His interests focus on artificial intelligence paradigms and their applications in complex systems, security, e-education, e-healthcare, unmanned air vehicles and intelligent agents.


Roumen Kountchev Ph.D., D.Sc. is a professor at the Faculty of Telecommunications, Department of Radio Communications and Video Technologies at the Technical University of Sofia, Bulgaria. His scientific areas of interest are: digital signal and image processing, image compression, multimedia watermarking, video communications, pattern recognition and neural networks. Professor Kountchev has 400 papers published in magazines and conference proceedings; 20 books; 48 book chapters; 20 patents. He has been the principal investigator of 52 research projects. At present he is a member of the Euro Mediterranean Academy of Arts and Sciences and the President of the Bulgarian Association for Pattern Recognition (member of IAPR). He is Editor-in-chief of the International Journal of Image Processing and Vision Science and an editorial board member of: International Journal of Reasoning-based Intelligent Systems; International Journal Broad Research in Artificial Intelligence and Neuroscience; KES Focus Group on Intelligent Decision Technologies; Egyptian Computer Science Journal; International Journal of Bio-Medical Informatics and e-Health; and International Journal Intelligent Decision Technologies. He is a member of the Institute of Data Science and Artificial Intelligence and the International Engineering and Technology Institute. He has been a plenary speaker at more than 30 international scientific conferences and symposia.

Dr. Yonghang Tai studied at Yunnan Normal University in 2009–2012, where he received his bachelor’s degree. He received his Ph.D. in Computer Science from Deakin University, Melbourne, Australia. He has led 4 funded projects, including Deakin University Postgraduate Research Full Scholarship, Yunnan Education Commission, Yunnan Natural Science Foundation, Yunnan Education Commission. He has published more than 30 papers, five of which have been indexed by SCI. He is the Co-editor of International Journal of Telemedicine and Clinical Practices and Machine learning and data analytics. His research interests include VR/AR/MR in surgical simulation, physics-based rendering, and medical image processing.


Roumiana Kountcheva received her M.Sc. and Ph.D. at the Technical University of Sofia, Bulgaria, and in 1992 she received the title Senior Researcher. At present, she is the Vice President of TK Engineering, Sofia. She completed postgraduate training at Fujitsu and Fanuc, Japan. Her main scientific interests are in image processing, image compression, digital watermarking, pattern recognition, image tensor representation, neural networks, CNC and programmable controllers. She has more than 180 publications, among which: 34 journal papers, 21 book chapters, and five patents. She was the PI and Co-PI of 48 scientific research projects. R. Kountcheva was a plenary speaker at 16 international scientific conferences and scientific events. She has edited several books published in the Springer SIST series and is a member of international organizations: Bulgarian Association for Pattern Recognition, International Research Institute for Economics and Management (IRIEM), the Institute of Data Science and Artificial Intelligence (IDSAI), and is an Honorary Member of the Honorable Editorial Board of the nonprofit peer reviewed open access IJBST Journal Group.

Chapter 1

Study on Remote Sensing Retrieval of Land Surface Evapotranspiration in Poyang Lake Region in Typical Hydrological Year

Bin Li, Chaoshuai You, Xin Pan, and Chao Ma

B. Li (corresponding author) · C. You · C. Ma: Shandong Electric Power Engineering Consulting Institute Co. Ltd., Jinan 250013, China
X. Pan: School of Earth Science and Engineering, Hohai University, Nanjing 211100, China

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_1

1.1 Introduction

Poyang Lake is the largest freshwater lake in China. The lake is fed by the five sub-basins of its basin, and its water volume changes significantly between seasons. The water level and water body area of the lake have changed greatly, and floods and extreme droughts have occurred frequently in recent years [1–3]. Min et al. studied 60 years of hydrological monitoring data for the Poyang Lake area, from 1952 to 2011, and showed that the low-water condition of Poyang Lake has intensified significantly since 2000 [4]. Combined with the analysis by Guo et al. of the projected climate change in the Poyang Lake Basin over the next 50 years, it is believed that the low-water condition of Poyang Lake may continue to intensify in the next ten years [5].

Evapotranspiration is an important loss term in the water balance, so research on evapotranspiration can help explain the droughts and floods that occur when the water balance is disrupted. The development of remote sensing technology has expanded the study of evapotranspiration from the point scale to the spatial scale, and the development of evapotranspiration inversion models has made regional evapotranspiration research more widely applicable. Common remote sensing evapotranspiration inversion models include single- and double-layer models based on energy balance, a series of models developed from Penman’s formula, and models based on non-parametric methods. Bastiaanssen et al. and Su reduced the error of the single-layer model by adjusting the aerodynamic impedance and proposed the SEBAL algorithm and the SEBS algorithm, respectively [6, 7]. These models require the surface impedance to be parameterized, and the complexity of that parameterization hinders the further development of evapotranspiration inversion models.

To address the complexity of impedance parameterization in most evapotranspiration estimation methods, Liu et al. combined equilibrium evaporation with Hamilton’s principle and proposed a nonparametric approach (NP) with a clear physical basis. This method avoids the complex parameterization of surface impedance and is not restricted by empirical coefficients [8]. Pan et al. developed an RS-NP model, based on remote sensing data, that is applicable over the entire spatial range. The input parameters of the model are simple and easy to obtain, and the model showed high accuracy when tested over different underlying surfaces [9]. Because the non-parametric evapotranspiration model needs fewer variables and its data are easier to obtain, it has significant advantages over other models for studying long-term regional evapotranspiration changes and can make effective use of satellite remote sensing data.

Based on MODIS remote sensing data and the China Meteorological Administration Land Data Assimilation System (CLDAS) product, this paper uses the RS-NP model to retrieve the evapotranspiration of Poyang Lake in typical hydrological years. It analyzes how evapotranspiration evolves in different hydrological years from the temporal and spatial characteristics of evapotranspiration in the lake area in three typical years, providing a scientific basis for understanding the frequent floods and droughts in the lake area.

1.2 Study Area and Data Source

1.2.1 Study Area

The Poyang Lake District is located in the northern part of Jiangxi Province, on the south bank of the middle and lower reaches of the Yangtze River, and is connected to the Yangtze River through the mouth of the lake. The topography of the basin is high in the south and low in the north, with many hills and mountains, and the lake area as a whole tilts from southeast to northwest. The area of the lake fluctuates sharply. In the wet season the water level can reach 22.59 m and the maximum water area is as high as 3000 km². In the dry season the water level is as low as 7.68 m and the water area is about 730 km², only 18.8% of that in the wet season. There are 41 large and small islands in the lake area, with a total area of about 103 km². An overview of the study area is shown in Fig. 1.1.


Fig. 1.1 The area and general situation of Poyang Lake

1.2.2 Data

The MODIS remote sensing products used as model input include the reflectance product (MYD09A1), the surface temperature product (MYD11A2), and the 8-day evapotranspiration product (MOD16). MODIS data are released by NASA’s LAADS in HDF format. The CLDAS land surface data assimilation products used as model input include surface pressure (PRS), specific humidity (RHU), atmospheric temperature (TMP), downward shortwave radiation (SRA), and land surface temperature (LST). CLDAS data are produced and released by the Information Center of the National Meteorological Administration; the acquired data are hourly and in “.NC” format. The specific data parameters are shown in Table 1.1.


Table 1.1 Detailed information of data used in this paper

Data name | Parameters | Temporal and spatial resolution | Data type
MYD09 | Surface reflectance | 8 days, 500 m | Remote sensing (retrieve)
MYD11 | Land surface temperature, surface emissivity | 8 days, 1 km | Remote sensing (retrieve)
CLDAS | Barometric pressure, downward shortwave radiation, relative humidity, temperature | 1 h, 6 km | Land surface data (retrieve)
MOD16 | LE | 8 days, 1 km | Land surface evapotranspiration

1.3 Methodology 1.3.1 ET Estimation 1.3.1.1

Basic Principles of NP Method

The NP method [9] treats the uniform near-surface layer of the underlying surface as a closed physical system, takes the Hamiltonian as the total energy of the system, and takes the surface temperature as the generalized coordinate of the system. The system Hamiltonian equals the sum of the system kinetic energy and potential energy. The system potential energy is the net surface radiation (Rn), and the system kinetic energy includes the sensible heat flux (Hs), latent heat flux (LE), and soil heat flux (Gs). Partial differentiation of the Hamiltonian yields the NP expression for evapotranspiration (latent heat flux):

$$LE = \frac{\Delta}{\Delta + \gamma}\,(R_n - G_s) - \varepsilon\sigma\left(T_s^4 - T_a^4\right) + G_s \ln\frac{T_s}{T_a} \tag{1.1}$$

In the formula, ε is the surface emissivity, σ is the Stefan-Boltzmann constant, Ts is the surface temperature, Ta is the near-surface atmospheric temperature, Δ is the slope of the saturation vapor pressure curve at Ta, and γ is the psychrometric constant.
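Equation (1.1) can be evaluated pixel by pixel once the inputs are available. The following is a minimal sketch only, not the authors' implementation. The function names, the default emissivity of 0.98, the psychrometric constant of 0.066 kPa/K, and the FAO-56 style expression for Δ are all assumptions introduced here for illustration.

```python
import math

SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4


def saturation_slope_kpa_per_k(t_air_k):
    """Slope of the saturation vapor pressure curve at Ta (kPa/K).

    Assumption: FAO-56 style expression; the chapter does not state which
    formula it uses for Delta.
    """
    t_c = t_air_k - 273.15
    es = 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))  # kPa
    return 4098.0 * es / (t_c + 237.3) ** 2


def np_latent_heat(rn, gs, ts, ta, emissivity=0.98, gamma=0.066):
    """Latent heat flux LE (W/m2) following Eq. (1.1).

    rn, gs: net radiation and soil heat flux in W/m2.
    ts, ta: surface and near-surface air temperature in K.
    emissivity and gamma (kPa/K) are assumed illustrative defaults.
    """
    delta = saturation_slope_kpa_per_k(ta)
    radiative_term = emissivity * SIGMA * (ts ** 4 - ta ** 4)
    return (delta / (delta + gamma)) * (rn - gs) - radiative_term + gs * math.log(ts / ta)


# Hypothetical inputs for one pixel
le_example = np_latent_heat(rn=400.0, gs=50.0, ts=302.0, ta=300.0)
```

When Ts equals Ta, the last two terms vanish and the expression reduces to the equilibrium term Δ/(Δ + γ)(Rn − Gs), which is a convenient sanity check on any implementation.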

1.3.2 Accuracy Evaluation Method
This article uses MOD16 products as the verification data and selects root mean square error (RMSE), average error (bias), relative error (RE), and the coefficient of determination (R2) as accuracy indicators. They are calculated as follows:

$$\text{bias} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - y'_i\right) \tag{1.2}$$

$$RE = \frac{|\text{bias}|}{\frac{1}{N}\sum_{i=1}^{N} y'_i} \times 100\% \tag{1.3}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{N}\left(y_i - y'_i\right)^2}{N}} \tag{1.4}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N}\left(y_i - y'_i\right)^2}{\sum_{i=1}^{N}\left(y'_i - \bar{y}'\right)^2} \tag{1.5}$$

In the formula, y represents the inversion value, y′ is the reference value, ȳ′ is the mean of the reference values, and N is the number of remote sensing inversion pixels.

Table 1.2 Accuracy verification of RS-NP inversion results
Parameters | Bias (W/m2) | RMSE (W/m2) | RE (%)
LE | 6.74 | 16.82 | 22.99
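The four indicators can be computed together from paired pixel values. The sketch below is illustrative only: the function name and the sample values are hypothetical, and the R2 denominator is assumed to use the mean of the reference values.

```python
import math


def accuracy_metrics(y, y_ref):
    """Return bias, RE (%), RMSE and R2 for inversion values y against reference y_ref."""
    n = len(y)
    mean_ref = sum(y_ref) / n
    bias = sum(a - b for a, b in zip(y, y_ref)) / n
    re_percent = abs(bias) / mean_ref * 100.0
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_ref))
    rmse = math.sqrt(ss_res / n)
    # Assumption: total sum of squares is taken about the reference mean
    ss_tot = sum((b - mean_ref) ** 2 for b in y_ref)
    r2 = 1.0 - ss_res / ss_tot
    return {"bias": bias, "RE": re_percent, "RMSE": rmse, "R2": r2}


# Hypothetical LE values in W/m2 for three pixels
metrics = accuracy_metrics(y=[12.0, 22.0, 32.0], y_ref=[10.0, 20.0, 30.0])
```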

1.4 Results and Discussions
1.4.1 Validation of Evapotranspiration Results
The accuracy of the evapotranspiration inversion is tested by comparing the RS-NP inversion results with the MOD16 evapotranspiration product. Because MOD16 lacks evapotranspiration values over water bodies, the NP method can better reflect evapotranspiration from waters and provide more comprehensive data for regional evapotranspiration research. The accuracy evaluation indicators are RE, RMSE, bias, and R2. The calculation results are shown in Table 1.2. Figure 1.2 shows the data distribution of the two products.

Fig. 1.2 Comparison diagram of RS-NP model ET inversion data and MOD16 ET data

1.4.2 Variation Characteristics of ET
1.4.2.1 Time Law Analysis
According to the inversion results, on the intra-year scale evapotranspiration differs little among the three years during the low-water period but fluctuates strongly during the high-water period. The largest change occurred in 2010, with a magnitude of 177.354 W/m2. Evapotranspiration changed most frequently in 2011, with relatively sharp fluctuations between adjacent periods, although the overall range of change was smaller than in 2010. The intra-year change of evapotranspiration generally first increased and then decreased. In the normal water year (2008) and the wet year (2010), evapotranspiration began to decline slightly in May, reached a trough in June, and then continued to increase until it reached its peak. The dry year (2011) also showed a downward trend in May, but of smaller magnitude. The peak is most pronounced in the wet year, while the peak in the dry year is relatively gentle. Across the three years, the intra-year differences in evapotranspiration appear mainly in the wet period, whereas differences in the dry period are small; evapotranspiration is larger in the wet period and smaller in the dry period. The specific trend is shown in Fig. 1.3. The average annual evapotranspiration in the normal water year (2008), wet year (2010), and dry year (2011) was 86.29 W/m2, 97.44 W/m2, and 79.04 W/m2, respectively. Surface evapotranspiration is thus larger in the wet year and lower in the dry year, with the normal water year in between.

Fig. 1.3 Monthly average evapotranspiration distribution map

Among the three typical years, the monthly average evapotranspiration in the wet year (2010) reaches its maximum of 161.32 W/m2 in August, nearly 50 W/m2 higher than in the normal and dry years in the same month. The monthly average evapotranspiration in the normal and dry years reaches its maximum in July, with maxima of 142.04 W/m2 and 138.95 W/m2, respectively. Evapotranspiration in June of the normal water year is significantly lower than in May, while the decline is relatively gentle in the wet and dry years. The seasonal evapotranspiration data are obtained by averaging the monthly average evapotranspiration data and are shown in Fig. 1.4. The annual maximum of evapotranspiration occurs in summer, and evapotranspiration in the wet year is relatively large among the three years. Inter-annually, seasonal evapotranspiration follows wet year > normal water year > dry year; within the year, it follows summer > spring > autumn > winter.

Fig. 1.4 Seasonal average evapotranspiration distribution map
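The seasonal averaging step described above can be sketched as follows. The month-to-season grouping (spring as March to May, and so on) is an assumption, since the chapter does not state its season boundaries, and the monthly values are hypothetical placeholders rather than results from the study.

```python
# Assumed meteorological seasons; the chapter does not define them explicitly
SEASONS = {
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
    "winter": (12, 1, 2),
}


def seasonal_means(monthly):
    """Average monthly mean LE (W/m2), keyed by month number 1-12, into seasonal means."""
    return {
        season: sum(monthly[m] for m in months) / len(months)
        for season, months in SEASONS.items()
    }


# Hypothetical monthly mean LE values in W/m2
example_monthly = {
    1: 30.0, 2: 40.0, 3: 60.0, 4: 90.0, 5: 120.0, 6: 110.0,
    7: 150.0, 8: 160.0, 9: 120.0, 10: 80.0, 11: 50.0, 12: 35.0,
}
example_seasonal = seasonal_means(example_monthly)
```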

1.4.2.2 Spatial Analysis
Spatial changes in seasonal evapotranspiration are compared longitudinally across the four seasons (spring, summer, autumn, and winter) within a year and horizontally across the three years (normal water year, wet year, and dry year), combining the intra-year and inter-year dimensions to analyze the spatial variation characteristics of seasonal average evapotranspiration. The spatial distribution of seasonal average evapotranspiration is shown in Fig. 1.5. In the longitudinal comparison within the year, the seasonal average evapotranspiration follows summer > spring > autumn > winter. The inter-annual horizontal comparison shows that the wet year has a wide spatial distribution in all four seasons, with generally higher evapotranspiration values. The changes in the average evapotranspiration in the four seasons of the three years are all expressed as follows: > wet year > dry year.

Fig. 1.5 Spatial distribution of seasonal average evapotranspiration

The average evapotranspiration in spring is generally the largest in the wet year, but the spring maxima in the normal and dry years are more pronounced. The maximum spring evapotranspiration in the normal, wet, and dry years is 170.79 W/m2, 144.89 W/m2, and 166.39 W/m2, respectively. It is worth noting that the dry year again showed the maximum evapotranspiration value on the seasonal scale. The summer evapotranspiration value of the dry year (2011) was 196.62 W/m2, the maximum of the three years in that season. The maximum summer evapotranspiration value was 184.11 W/m2 in 2010 and 183.54 W/m2 in 2008. The average evapotranspiration in autumn changed little in the normal and wet years, while the dry year showed a sharp decrease, with both the spatial distribution range and the evapotranspiration value significantly reduced. The difference in seasonal average evapotranspiration in winter among the three years is mainly reflected in the size of the spatial distribution range, which is again widest in the wet year.

1.4.3 Analysis of Evapotranspiration-Influencing Factors
Evapotranspiration is affected by multiple factors, including solar radiation energy, surface moisture distribution, and water vapor transport. As shown above, the temporal and spatial distribution of evapotranspiration has obvious regularity and differences. Analysis of temperature, humidity, wind speed, precipitation, and other factors shows that SRA, TMP, and LST are the main influencing factors in the study area. To explore the factors influencing evapotranspiration in the Poyang Lake area in typical years, this article takes shortwave downward radiation, air temperature, and surface temperature as the three influencing factors and conducts a correlation analysis between each of them and evapotranspiration.

Solar radiation energy drives water evaporation and vegetation transpiration, and surface shortwave downward radiation, one of the four radiation components, is closely related to surface evapotranspiration. Taking the three-year average of evapotranspiration on an 8-day scale and the three-year average of surface shortwave downward radiation on an 8-day scale, the correlation between evapotranspiration and surface shortwave downward radiation is shown in Fig. 1.6a. The coefficient of determination R2 of the two is 0.83, which shows good correlation and consistency.

Temperature changes have a great influence on evapotranspiration. On the one hand, when the temperature rises, the movement rate of water molecules at the underlying surface increases, accelerating the evapotranspiration process; on the other hand, the NP method uses the surface temperature as the generalized coordinate.

Fig. 1.6 Correlation between temperature and evapotranspiration. a The correlation between SRA and LE; b The correlation between TMP and LE; c Correlation between LST and LE
[Scatter plots with fitted lines. a: LE (W/m2) versus SRA (W/m2), y = 0.3838x − 116.32, R² = 0.83. b: LE (W/m2) versus TMP (K), y = 3.5623x − 958.62, R² = 0.78. c: LE (W/m2) versus LST (K), y = 3.5667x − 975.15, R² = 0.79.]


When the Hamiltonian is partially differentiated, both the surface temperature and the near-surface atmospheric temperature enter the expression, so both are important for the relationship between temperature and evapotranspiration. The relationships between near-surface atmospheric temperature, surface temperature, and evapotranspiration are shown in Fig. 1.6b, c. The coefficient of determination between near-surface air temperature and evapotranspiration is 0.78, and that between surface temperature and evapotranspiration is 0.79. Among shortwave downward radiation, near-surface atmospheric temperature, and surface temperature, surface shortwave downward radiation correlates most strongly with evapotranspiration (R2 = 0.832), surface temperature is second (R2 = 0.786), and near-surface atmospheric temperature has the weakest correlation of the three (R2 = 0.779).
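The fitted lines and R2 values reported for Fig. 1.6 are consistent with an ordinary least-squares fit of LE against each factor. The sketch below shows such a fit under that assumption; the function name and the sample points are hypothetical, and the chapter does not state which fitting tool was used.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept, with its R2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical 8-day mean pairs: SRA (W/m2) and LE (W/m2)
sra = [380.0, 450.0, 520.0, 600.0, 680.0]
le = [30.0, 58.0, 82.0, 115.0, 144.0]
slope, intercept, r2 = fit_line(sra, le)
```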

1.5 Conclusions
(1) Applicability analysis of the RS-NP evapotranspiration inversion model in the experimental area. Cross-checking the inversion results against the MYD16 evapotranspiration products shows that the RS-NP inversion results have good correlation and consistency with them (R2 = 0.963); the average error is 6.74 W/m2, the relative error is 22.99%, and the root mean square error is 16.82 W/m2. This indicates that the RS-NP inversion model has high accuracy and good applicability in the experimental area.
(2) Variation characteristics of land surface evapotranspiration in Poyang Lake area in typical hydrological years. The evapotranspiration of Poyang Lake varies drastically between years, which is closely related to the classification of typical hydrological years (normal water years, wet years, and dry years). Evapotranspiration also changes drastically between seasons of the year, following summer > spring > autumn > winter. The spatial distribution range of evapotranspiration in the three typical years follows wet year > normal water year > dry year, and the average annual evapotranspiration value follows the same order. In terms of influencing factors, surface evapotranspiration is most closely related to surface shortwave downward radiation, with the largest coefficient of determination (R2 = 0.832); the correlations of surface temperature and near-surface atmospheric temperature with evapotranspiration are successively weaker, so surface shortwave downward radiation is a major factor affecting evapotranspiration changes.


References
1. Liu, Y.: Climate, hydrological process and water environmental effects in Poyang Lake Basins. Science Press, Beijing (2012)
2. Wang, S., Dou, H.: Chinese Journal of Lake. Science Press (1998)
3. Liu, Y., Zhao, X., Wu, G.: Preliminary analysis on the causes of frequent phenomenon of extreme drought events in Poyang Lake region in the past decade. Resour. Environ. Yangtze Basin 23(1) (2014)
4. Min, Q., Zhan, L.: Characteristics of low water changes in Poyang Lake from 1952–2011. J. Lake Sci. 24(5), 675–678 (2012)
5. Guo, H., Yin, G., Jiang, T.: Forecast of climate change in Poyang Lake basin in the next 50 years. Resour. Environ. Yangtze Basin 17(1), 73–73 (2008)
6. Bastiaanssen, W.G., Pelgrum, H., Wang, J., Ma, Y., Moreno, J.F., Roerink, G.J., Van der Wal, T.: A remote sensing surface energy balance algorithm for land (SEBAL): Part 2: Validation. J. Hydrol. 212–213(1–4), 213–229 (1998)
7. Su, Z.: The surface energy balance system (SEBS) for estimation of turbulent heat fluxes. Hydrol. Earth Syst. Sci. 6(1), 85–100 (1999)
8. Liu, Y., Hiyama, T., Yasunari, T., Tanaka, H.: A nonparametric approach to estimating terrestrial evaporation: validation in eddy covariance sites. Agric. For. Meteorol. 157, 49–59 (2012)
9. Pan, X., Liu, Y., Fan, X.: Satellite retrieval of surface evapotranspiration with nonparametric approach: accuracy assessment over a semiarid region. Adv. Meteorol. 2016(9), 1–14 (2016)

Chapter 2
Research on Image Quality Assessment of Vehicle Panoramic View Equipment
Changyuan Yang and Xuan Dong

Abstract To assess the stitched image quality of 360 degree panoramic view equipment, an image quality evaluation method is proposed. Image quality evaluation software for 360 degree panoramic view stitching is used to evaluate the stitched images produced by the equipment. Based on deep learning, the software generates training data and test data to construct a YOLOv3 network; using the YOLOv3 network, it calculates the proportion of splicing loss, the proportion of splicing ghosting, the splicing dislocation length, and the splicing gap width of panoramic stitched images, providing a scientific evaluation method for 360 degree panoramic view equipment.

C. Yang
Jiangsu Tsing-Eagle Automobile Technology Development Co., Ltd., Room 510, Block A, Chuangzhi Industrial Park, 201 Jinshan Road, Jiangyin, Jiangsu, P.R. China
X. Dong (B)
Key Laboratory of Operation Safety Technology on Transport Vehicles, Ministry of Transport, PRC No. 8, Xitucheng Road, Beijing 100088, Haidian District, P.R. China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_2

2.1 Introduction
With economic development and rising living standards, the output and sales of cars are increasing year by year. At the same time, driving safety has a growing impact on the safety of people's lives and property [1]. Because of the driver's visual blind areas, mistakes in judgment and operation are easily made, resulting in frequent traffic accidents such as running over pedestrians or vehicles and rear-end collisions. The driver's visual blind area has become the biggest obstacle to obtaining information about the environment around the car. To solve these problems, the on-board panoramic view system has emerged and developed, and it has broad application prospects in unmanned driving and vehicle electronic safe driving. The concept of the on-board look around system was first proposed by K. Kato and four others in 2006 [2], who applied for a U.S. patent. The on-board panoramic view system is a vehicle-assisted driving system based on machine vision. It uses wide-angle cameras installed around the vehicle to reconstruct an aerial view image of the vehicle and the surrounding scene through image transformation. The driver can thus park safely, avoid obstacles, and eliminate visual blind areas, achieving safe driving. Research on vehicle panoramic view splicing technology has gradually become a research focus at home and abroad [3–5]. Compared with research on image stitching methods, there are few studies in the literature on quality evaluation of stitched images. Image quality evaluation is an important means of comparing the performance of image mosaic schemes and optimizing system parameters. Therefore, establishing an effective image quality evaluation mechanism is of great significance for evaluating panoramic mosaic performance. In this paper, software is proposed to detect the image quality of 360 degree panoramic equipment, so as to judge the quality of the imaging system of the equipment under inspection, which provides a scientific method for judging the imaging quality of 360 degree panoramic equipment.

2.2 Scene Layout
Arrange black-and-white checkerboards around a vehicle equipped with 360 degree panoramic view equipment, then start the vehicle and the equipment. The equipment captures and merges the images around the vehicle and sends the merged panoramic stitched image to the image quality evaluation software. The checkerboards must be large enough to cover the blind area of the panoramic view system, and the scene layout must comply with relevant laws and regulations. Layout requirements for the black-and-white checkerboard: at the front, rear, left, and right of the vehicle, 5–20 cm outward from the outer edge of the vehicle's vertical ground projection, arrange a checkerboard with a single square size of 30 cm × 30 cm. The checkerboards at the front, rear, left, and right form a box; the checkerboard in each direction is 3 m long, and the arrangement is symmetrical front to rear and left to right. The scene layout is shown in Fig. 2.1.

Fig. 2.1 Scene layout

2.3 Software Calculation Steps
The vehicle's 360 degree panoramic view equipment starts, takes clear panoramic images, and sends the fused panoramic mosaic images to the computer software. The software generates a training data set and a test data set based on deep learning and constructs a YOLOv3 network. Using the YOLOv3 network, it calculates the proportion of stitching loss, the proportion of stitching ghosting, the length of stitching dislocation, and the width of the stitching gap in the panoramic stitched image. It then generates test reports and stores the data.

2.3.1 Generate Training Data Set and Obtain Test Data Set
Obtain 360 degree panoramic mosaic images and generate the image data set, synthesizing images from mosaics of different environments, vehicle models, and manufacturers. Annotate each image: frame the positions of splicing ghosting, splicing loss, splicing dislocation, and splicing gaps, and label the category of each position. The image data set is divided into a training set and a verification set at a ratio of 9:1 (90% training, 10% verification), and the test data set is obtained.
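The 9:1 division described above might be implemented along the following lines. This is a sketch under assumptions: the chapter does not say whether images are shuffled before splitting, so the seeded shuffle, the function name, and the file names are all hypothetical.

```python
import random


def split_dataset(items, train_fraction=0.9, seed=0):
    """Shuffle a list of annotated images and split it into training and verification sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded shuffle is an assumption, for repeatability
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical file names standing in for annotated panoramic images
image_names = [f"panorama_{i:03d}.png" for i in range(100)]
train_set, verification_set = split_dataset(image_names)
```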

2.3.2 Build and Train the YOLOv3 Network Model
Taking images from the generated training data set and the obtained test data set as input, a YOLOv3 network model is generated. The network model is mainly composed of an input layer, a backbone feature extraction layer, and a feature fusion output layer. The input layer takes images of size 416 × 416 × 3. The backbone feature extraction layer is composed of residual convolution modules and, between them, convolution modules that perform down-sampling, and it extracts image features. First, the input image is convolved with 32 channels, and the feature layer is then obtained through BN (Batch Normalization) and LeakyReLU activation. The convolution is calculated as follows:

$$x_{conv} = \sum_{i=0}^{n} x_i \times w_i \tag{2.1}$$

where $x_i$ is the ith pixel value, n is the total number of pixels of the image block, and $w_i$ is the ith element of the weight matrix. The BN function is as follows:

$$x_{out} = \frac{\gamma\,(x_{conv} - u)}{\sqrt{\sigma^2 + 0.000001}} + \beta \tag{2.2}$$

where $x_{out}$ is the normalization result, γ is the scaling factor, u is the mean, σ² is the variance, and β is the bias. The LeakyReLU activation function is as follows:

$$y_i = \begin{cases} x_i, & x_i \ge 0 \\ \dfrac{x_i}{a_i}, & x_i < 0 \end{cases} \tag{2.3}$$

where xi is the output after BN normalization and ai is a non-zero coefficient. The feature layer is then down-sampled by a convolution with kernel size 3 and stride 2. After normalization and activation, residual blocks are stacked. A residual block consists of two sets of convolution, normalization, and activation: the two sets of operations are applied to the input feature layer, and the result is added to the residual edge (the shortcut branch) to give the residual stacking result. The result is again down-sampled with kernel size 3 and stride 2 and, after normalization and activation, passed through two stacked residual blocks to give the result of the second down-sampling and residual stacking. In total, starting from the first stage, each stage down-samples the result of the previous stage and then stacks residual blocks once, twice, eight times, eight times, and four times, respectively, giving the results of five down-sampling and residual stacking stages. The feature layers of the last three stages are saved and passed to the next layer.

In the feature fusion layer, the three scales of feature layers saved from the previous layer are used to construct a feature pyramid for classification and regression prediction. With a 416 × 416 input image, the three scales are 13 × 13, 26 × 26, and 52 × 52. At each scale, local feature interaction is carried out by convolution kernels to complete pyramid feature fusion. First, the 13 × 13 feature layer passes through a 1 × 1 convolution to adjust the number of channels, a 3 × 3 convolution for further feature extraction, another 1 × 1 convolution to adjust the number of channels, another 3 × 3 convolution for feature extraction, and a final 1 × 1 convolution to adjust the number of channels; the 1 × 1 convolutions reduce the number of network parameters involved in feature extraction. The result of these 5 convolutions is passed through a 3 × 3 and a 1 × 1 convolution for classification and regression prediction. The result of the 5 convolutions is also passed through a 1 × 1 convolution to adjust the number of channels, up-sampled, and stacked with the 26 × 26 feature layer from the backbone. The stacked result goes through the same 5 convolutions as the 13 × 13 feature layer, followed by the 3 × 3 and 1 × 1 convolutions for classification and regression prediction.
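Equations (2.1) to (2.3) and the down-sampling arithmetic can be checked with a few lines of scalar code. The sketch below is for illustration only and is not a YOLOv3 implementation: the function names are hypothetical, a = 10 (a negative slope of 0.1) is an assumed value for the LeakyReLU coefficient, and the stride-2 convolutions are assumed to be padded so that each one exactly halves the feature map.

```python
import math


def conv_response(pixels, weights):
    """Eq. (2.1): sum of element-wise products over one image block."""
    return sum(x * w for x, w in zip(pixels, weights))


def batch_norm(x_conv, mean, variance, gamma=1.0, beta=0.0, eps=0.000001):
    """Eq. (2.2): normalize, then scale by gamma and shift by beta."""
    return gamma * (x_conv - mean) / math.sqrt(variance + eps) + beta


def leaky_relu(x, a=10.0):
    """Eq. (2.3): identity for x >= 0, x / a otherwise (a = 10 is an assumed value)."""
    return x if x >= 0 else x / a


def feature_map_sizes(input_size=416, n_downsamples=5, stride=2):
    """Spatial size after each stride-2 down-sampling stage (padding assumed)."""
    sizes = []
    size = input_size
    for _ in range(n_downsamples):
        size //= stride
        sizes.append(size)
    return sizes


sizes = feature_map_sizes()      # sizes after the five down-sampling stages
saved_scales = sizes[-3:]        # the last three stages feed the feature pyramid
```

With a 416 × 416 input, five halvings give 208, 104, 52, 26, and 13, which matches the three saved scales of 52 × 52, 26 × 26, and 13 × 13 named in the text.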


Finally, the same operation is performed for the 52 × 52 feature layer, giving regression prediction results at three scales. The output layer performs classification and position regression on the three-scale feature maps output by the feature fusion layer and adjusts the prior boxes using the three prediction results to obtain the final prediction boxes. The loss function is as follows:

$$L(O, o, C, c, l, g) = \lambda_1 L_{confidence}(o, c) + \lambda_2 L_{class}(O, C) + \lambda_3 L_{location}(l, g) \tag{2.4}$$

where $L_{location}(l, g)$ is the positioning offset loss of the target frame, $L_{confidence}(o, c)$ is the confidence loss of the target frame, $L_{class}(O, C)$ is the classification loss of the target frame, and $\lambda_1$, $\lambda_2$, $\lambda_3$ are the balance coefficients. $L_{confidence}(o, c)$ adopts the binary cross-entropy loss, and the formula is as follows:

$$L_{confidence}(o, c) = -\sum_{i}\left(o_i \ln(\hat{c}_i) + (1 - o_i)\ln(1 - \hat{c}_i)\right) \tag{2.5}$$

$$\hat{c}_i = \mathrm{sigmoid}(c_i) \tag{2.6}$$

In the formula, $O_{ij} \in \{0, 1\}$ represents whether the ith target prediction box contains a j-type target to be detected (0 represents no target, 1 represents existence), and $C_{ij}$ represents the Sigmoid probability that the ith target prediction box contains a j-type target. $L_{location}(l, g)$ is the sum of squared differences between the real offsets and the predicted offsets, and the formula is as follows:

$$L_{location}(l, g) = \sum_{i \in location}\ \sum_{m \in \{x, y, w, h\}} \left(l_i^m - \hat{g}_i^m\right)^2 \tag{2.7}$$

where

$$l_i^x = b_i^x - c_i^x,\quad l_i^y = b_i^y - c_i^y,\quad l_i^w = \log\left(b_i^w / p_i^w\right),\quad l_i^h = \log\left(b_i^h / p_i^h\right),$$

$$\hat{g}_i^x = g_i^x - c_i^x,\quad \hat{g}_i^y = g_i^y - c_i^y,\quad \hat{g}_i^w = \log\left(g_i^w / p_i^w\right),\quad \hat{g}_i^h = \log\left(g_i^h / p_i^h\right).$$

Here $\hat{g}_i$ is the offset of the real target frame relative to the preset frame, $l_i$ is the offset of the predicted rectangular frame, $(b^x, b^y, b^w, b^h)$ are the parameters of the prediction frame, $(c^x, c^y, p^w, p^h)$ are the parameters of the preset frame, and $(g^x, g^y, g^w, g^h)$ are the parameters of the real target frame mapped onto the prediction feature map. Finally, the data set is used to train the YOLOv3 network.
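The confidence and location terms of the loss can be written out directly from Eqs. (2.5) to (2.7). The following sketch works on plain Python lists and is meant only to make the formulas concrete; the function names, the tuple layouts, and the unit default balance coefficients are assumptions, and the classification term is taken as a precomputed number because its formula is not given in the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def confidence_loss(objectness, logits):
    """Eqs. (2.5)-(2.6): binary cross-entropy over sigmoid-activated confidences."""
    total = 0.0
    for o, c in zip(objectness, logits):
        c_hat = sigmoid(c)
        total += o * math.log(c_hat) + (1 - o) * math.log(1.0 - c_hat)
    return -total


def box_offsets(box, preset):
    """Offsets of a box (x, y, w, h) relative to a preset frame (cx, cy, pw, ph)."""
    x, y, w, h = box
    cx, cy, pw, ph = preset
    return (x - cx, y - cy, math.log(w / pw), math.log(h / ph))


def location_loss(pred_boxes, true_boxes, presets):
    """Eq. (2.7): sum of squared differences between predicted and real offsets."""
    total = 0.0
    for b, g, p in zip(pred_boxes, true_boxes, presets):
        l_off = box_offsets(b, p)
        g_off = box_offsets(g, p)
        total += sum((lm - gm) ** 2 for lm, gm in zip(l_off, g_off))
    return total


def total_loss(l_conf, l_class, l_loc, lambdas=(1.0, 1.0, 1.0)):
    """Eq. (2.4): weighted sum; unit weights are an assumed default."""
    return lambdas[0] * l_conf + lambdas[1] * l_class + lambdas[2] * l_loc
```

For example, a single box with target o = 1 and raw confidence c = 0 gives a confidence loss of ln 2, since sigmoid(0) = 0.5.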


Fig. 2.2 Mosaic image analysis of a 360 degree panoramic viewing device

2.3.3 Score the YOLOv3 Network Model
The software marks the positions of image loss, splicing ghosting, splicing dislocation, and splicing gaps, and calculates the loss proportion, ghosting proportion, dislocation length, and gap width. Figure 2.2 shows the analysis of the mosaic image from a 360 degree panoramic view device. In this image, 46 splicing losses, 11 splicing ghostings, and 11 splicing dislocations were detected, with no splicing seam; they are marked with blue, green, and red boxes, respectively, and the maximum dislocation value was 22.2 cm. Compared with manual inspection, this method displays more useful information efficiently. Table 2.1 compares the software and manual test data; the software provides more information.

Table 2.1 Test data comparison
Test items | Software analysis results | Manual test results
Splicing loss | 15 places; the maximum loss area percentage is 79.76%, and the maximum loss area is 0.072 m2. All loss areas: 0.053 m2, 0.041 m2, 0.033 m2, 0.025 m2, 0.041 m2, 0.042 m2, 0.054 m2, 0.048 m2, 0.038 m2, 0.063 m2, 0.041 m2, 0.053 m2, 0.072 m2, 0.064 m2, 0.052 m2 | Maximum loss length is about 20 cm
Splicing ghosting | 5 places; the maximum ghost area percentage is 68.68%, and the maximum ghost area is 0.062 m2. All ghost areas: 0.055 m2, 0.055 m2, 0.059 m2, 0.061 m2, 0.062 m2 | Ghosting exists
Splicing dislocation | 1 dislocation, length 27.2 cm | Dislocation exists; the maximum dislocation is about 27 cm
Splicing seam | Recognizable (artificial drawing, seamless system) | Recognizable (artificial drawing, seamless system)

2.4 Conclusion
The quality of the mosaic image of 360 degree panoramic view equipment is evaluated by image quality evaluation software. Based on deep learning, the software generates a training data set and a test data set and constructs a YOLOv3 network. Using the YOLOv3 network, the proportion of mosaic loss, the proportion of mosaic ghosting, the length of mosaic dislocation, and the width of the mosaic gap are calculated. The results show that the image evaluation software for 360 degree panoramic view equipment can accurately judge the quality of stitched images, replace cumbersome and labor-intensive manual statistical rating, and overcome the limitations of single-factor evaluation indices. This has important application value for realizing automatic adaptive image mosaicing systems.

Acknowledgements This work was supported by the Central Public Research Institutes Special Basic Research Foundation (2021–9013a).

References
1. Zhu, S.L.: Thoughts on the development of car ownership. SHANGHAI AUTO 2018(2), 24–26 (2001)
2. Kato, K., Suzuki, M., Fujita, Y., et al.: Image synthesis display method and apparatus for vehicle camera. U.S. Patent No. 7,139,412. 21 (2006)
3. Lu, B., Qin, R., Li, Q., Chen, D.P.: Study of vehicle-surrounding image stitch algorithm. Comput. Sci. 40(9), 293–295 (2013)
4. Luo, L.B., Koh, I.S., Min, K.Y., et al.: Low-cost implementation of bird's eye view system for camera-on-vehicle. In: 2010 Digest of Technical Papers International Conference on Consumer Electronics (ICCE), pp. 311–312. IEEE (2010)
5. Cancare, F., Bhandari, S., Bartolini, D.B., Carminati, M., Santambrogio, M.D.: A bird's eye view of FPGA-based Evolvable Hardware. In: 2011 NASA/ESA Conference on Adaptive Hardware and Systems (AHS), pp. 169–175. IEEE (2011)

Chapter 3
Global Analysis of Hepatitis B Virus Infection Model with Linear Drug Therapy Function
Fang Zheng

Abstract We study a hepatitis B virus infection model with a linear treatment function and calculate the basic reproduction number of HBV, which is the threshold between virus disappearance and persistence. By constructing a Lyapunov function, it is proved that the disease-free equilibrium of system 1 is globally stable, and the second complex system of system 1 is asymptotically stable, when the basic reproduction number of system 1 is less than or equal to 1. It is then proved that the unique positive equilibrium of system 1 is globally stable when the basic reproduction number is greater than 1.

F. Zheng (B)
Department of Basic, Air Force Engineering University, Xi'an, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_3

3.1 Introduction
Biomedical signals are unstable natural signals emitted by complex living organisms. They are low-frequency weak signals against a background of strong noise. Such signals are derived from a biological system and usually contain information related to its physiological and structural states; they are highly random, have a strong noise background and a generally low frequency range, and their statistics change over time and are not known a priori. Chronic hepatitis B virus (HBV) infection is prevalent worldwide, affecting about 5% of the world's population, with significant regional variation. The prevalence is low in the United States and Western Europe (0.1–2.0%), intermediate in Mediterranean countries and Japan (2.0–8.0%), and highest in South-East Asia and sub-Saharan Africa (8.0–20.0%). An epidemiological survey of hepatitis B in China in 2006 showed about 93 million people with chronic HBV infection, among whom more than 20 million were chronic hepatitis B patients. After infection with hepatitis B virus, patients present a variety of symptoms, such as fatigue, anorexia, yellow urine, nausea, vomiting, liver pain, and yellowing of the eyes or skin. Blood tests show elevated alanine aminotransferase (ALT) and HBV replication in the blood. All this indicates that the liver has become inflamed, that is, viral hepatitis. If the course of the disease exceeds 6 months, it is chronic hepatitis, and chronic hepatitis B becomes correspondingly harder to treat [1, 2]. If hepatitis B virus infection is not treated properly, it can lead to liver fibrosis and cirrhosis, which occur in 5–15% of patients with chronic hepatitis B over a five-year period. This is the problem of greatest concern: according to surveys, 80–90% of liver cancer cases have an HBV background. It is now established that HBV is the leading cause of primary liver cancer. Undue fear is nevertheless unwarranted, because most liver cancer after HBV infection develops on the basis of cirrhosis; hepatitis B virus carriers generally do not develop liver cancer directly [3, 4]. There are a very small number of such cases in China, so vigilance is still warranted. Hepatitis B virus is one of the major threats to human health today. According to the latest statistics, hepatitis patients and hepatitis B virus carriers account for more than 14% of our population. Hepatitis B is an infectious disease of the liver caused mainly by hepatitis B virus. Some patients become chronic cases or even develop cirrhosis, and a small number develop liver cancer. Each year, more than one million people die of liver cancer resulting from cirrhosis. Hepatitis B virus (HBV) therapeutic drugs are drugs that control the progression of the disease by inhibiting virus replication and eventually eliminating HBV in the course of treatment.

3.2 Mathematical Model

Chronic hepatitis B is a progressive chronic disease that, without proper treatment, can progress to cirrhosis and even liver cancer, which can be life-threatening. In China [5, 6], at least 20 million chronic hepatitis B patients need aggressive treatment to stop the progression of the disease. The key to standard treatment of chronic hepatitis B is antiviral therapy against HBV. In this paper, drug therapy is added to the basic model of HBV infection [7] to obtain the following model:

\[
\begin{cases}
\dfrac{dH}{dt} = N - dH - (1-r)mVH,\\[2mm]
\dfrac{dI}{dt} = (1-r)mVH - \alpha I,\\[2mm]
\dfrac{dV}{dt} = (1-\varepsilon)pI - eV,
\end{cases}
\tag{3.1}
\]

where $H = H(t)$, $I = I(t)$, and $V = V(t)$ denote the numbers of uninfected cells, infected cells, and free virus particles at time $t$. Uninfected cells are produced at rate $N$ and die at rate $dH$. Drugs block new infections with constant efficacy $r$, so uninfected cells are infected by free virions at rate $(1-r)mVH$.

3 Global Analysis of Hepatitis B Virus Infection Model …


Infected cells are removed at rate $\alpha I$, where $\alpha$ is the removal rate of infected cells under drug treatment. Infected cells produce free virions at rate $(1-\varepsilon)pI$, and free virions are cleared at rate $eV$. Studies have found that uninfected cells die more slowly than infected cells, so we suppose that $d \le \alpha$.

Adding the first two equations of system (3.1), we have

\[
(H+I)' = N - dH - \alpha I \le N - d(H+I), \quad \text{since } \alpha \ge d,
\]

which implies that $\limsup_{t\to\infty}(H+I) \le N/d$. From the third equation of system (3.1), when $I \le N/d$,

\[
V' = (1-\varepsilon)pI - eV \le pI - eV \le \frac{pN}{d} - eV,
\]

so $\limsup_{t\to\infty} V \le \dfrac{pN}{de}$. Therefore, the region

\[
\Omega = \left\{ (H, I, V) \in \mathbb{R}^3_+ : V \le \frac{pN}{de},\ H + I \le \frac{N}{d} \right\}
\]

is a positively invariant set for system (3.1).
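The invariance of the region can be checked numerically. The following is a minimal sketch, not the author's code: it integrates system (3.1) with a classical Runge-Kutta scheme and tests whether every visited state stays inside the region. All parameter values, the initial state, and the function names (`rhs`, `rk4_step`, `simulate`, `in_region`) are assumptions made for illustration.

```python
# Hypothetical parameter values, chosen only for illustration (not from the chapter).
PARAMS = {"N": 10.0, "d": 0.1, "m": 0.01, "r": 0.3,
          "alpha": 0.5, "eps": 0.4, "p": 2.0, "e": 1.0}


def rhs(state, q):
    """Right-hand side of system (3.1) for state = (H, I, V)."""
    H, I, V = state
    infection = (1.0 - q["r"]) * q["m"] * V * H
    return (q["N"] - q["d"] * H - infection,
            infection - q["alpha"] * I,
            (1.0 - q["eps"]) * q["p"] * I - q["e"] * V)


def rk4_step(state, dt, q):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = rhs(state, q)
    k2 = rhs(shifted(k1, dt / 2.0), q)
    k3 = rhs(shifted(k2, dt / 2.0), q)
    k4 = rhs(shifted(k3, dt), q)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + g)
                 for s, a, b, c, g in zip(state, k1, k2, k3, k4))


def simulate(state, t_end, dt, q):
    """Integrate from t = 0 to t_end and return every visited state."""
    trajectory = [state]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state, dt, q)
        trajectory.append(state)
    return trajectory


def in_region(state, q, tol=1e-9):
    """Membership test for the region: H, I, V >= 0, H + I <= N/d, V <= pN/(de)."""
    H, I, V = state
    return (min(state) >= -tol
            and H + I <= q["N"] / q["d"] + tol
            and V <= q["p"] * q["N"] / (q["d"] * q["e"]) + tol)


# Start inside the region and check that the numerical trajectory never leaves it.
trajectory = simulate((50.0, 10.0, 20.0), 50.0, 0.01, PARAMS)
print(all(in_region(s, PARAMS) for s in trajectory))
```

A numerical run of this kind only illustrates the bound for one parameter set and one initial state; the inequalities above are what establish it in general.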

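3.3 The Existence of Equilibria and Basic Reproduction Number

Clearly, $P_0(N/d, 0, 0)$ is the infection-free equilibrium of system (3.1). The basic reproduction number of system (3.1) is

\[
R_0 = \frac{mp(1-r)(1-\varepsilon)N}{d\alpha e}.
\]

We look for the infection equilibrium $P^*(H^*, I^*, V^*)$ (where $V^* \ne 0$), which satisfies

\[
\begin{cases}
N - dH - (1-r)mVH = 0,\\
(1-r)mVH - \alpha I = 0,\\
(1-\varepsilon)pI - eV = 0.
\end{cases}
\tag{3.2}
\]

From the third equation of (3.2), we get

\[
V = \frac{(1-\varepsilon)p}{e}\, I.
\]

For $I \ne 0$, substituting this into the second equation of (3.2) gives

\[
H = \frac{e\alpha}{(1-r)(1-\varepsilon)mp} = \frac{N}{dR_0}.
\tag{3.3}
\]

Substituting (3.3) into the sum of the first two equations of (3.2), $N - dH - \alpha I = 0$, yields

\[
I = \frac{N}{\alpha} - \frac{de}{(1-r)(1-\varepsilon)mp} = \frac{N}{\alpha}\left(1 - \frac{1}{R_0}\right),
\]

and

\[
V = \frac{(1-\varepsilon)pN}{e\alpha} - \frac{d}{(1-r)m} = \frac{(1-\varepsilon)pN}{e\alpha}\left(1 - \frac{1}{R_0}\right).
\]

So we get the infection equilibrium

\[
P^*(H^*, I^*, V^*) = P^*\left( \frac{N}{dR_0},\ \frac{N}{\alpha}\left(1 - \frac{1}{R_0}\right),\ \frac{(1-\varepsilon)pN}{e\alpha}\left(1 - \frac{1}{R_0}\right) \right).
\]

Theorem 1 System (3.1) has only the infection-free equilibrium $P_0(N/d, 0, 0)$ for $R_0 \le 1$, and system (3.1) also has a unique positive equilibrium $P^*(H^*, I^*, V^*)$ when $R_0 > 1$.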
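The closed-form expressions of this section can be sanity-checked numerically. The sketch below, under assumed parameter values, computes $R_0$ and the infection equilibrium and then evaluates the right-hand sides of the equilibrium equations, which should vanish. The parameter values and the function names are hypothetical; the factor $(1-\varepsilon)$ in $V^*$ follows from the third equilibrium equation.

```python
# Hypothetical parameter values, chosen only for illustration (not from the chapter).
PARAMS = {"N": 10.0, "d": 0.1, "m": 0.01, "r": 0.3,
          "alpha": 0.5, "eps": 0.4, "p": 2.0, "e": 1.0}


def basic_reproduction_number(q):
    """R0 = m p (1 - r)(1 - eps) N / (d alpha e)."""
    return (q["m"] * q["p"] * (1.0 - q["r"]) * (1.0 - q["eps"]) * q["N"]
            / (q["d"] * q["alpha"] * q["e"]))


def infection_equilibrium(q):
    """Return (H*, I*, V*) when R0 > 1, otherwise None (only P0 exists)."""
    R0 = basic_reproduction_number(q)
    if R0 <= 1.0:
        return None
    H = q["N"] / (q["d"] * R0)
    I = q["N"] / q["alpha"] * (1.0 - 1.0 / R0)
    V = (1.0 - q["eps"]) * q["p"] * I / q["e"]
    return (H, I, V)


def residuals(state, q):
    """Right-hand sides of the three equilibrium equations at a given state."""
    H, I, V = state
    infection = (1.0 - q["r"]) * q["m"] * V * H
    return (q["N"] - q["d"] * H - infection,
            infection - q["alpha"] * I,
            (1.0 - q["eps"]) * q["p"] * I - q["e"] * V)


R0 = basic_reproduction_number(PARAMS)
P_star = infection_equilibrium(PARAMS)
print(R0, P_star, residuals(P_star, PARAMS))
```

Raising the drug efficacy $r$ or $\varepsilon$ in `PARAMS` lowers `R0`; once it drops to 1 or below, `infection_equilibrium` returns `None`, matching Theorem 1.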

3.4 The Global Stability of Equilibria

Theorem 2 The infection-free equilibrium $P_0$ of system (3.1) is globally asymptotically stable for $R_0 \le 1$, and unstable for $R_0 > 1$.

Proof To prove this, we construct the following Lyapunov function [14]:

\[
L = (1-\varepsilon)pI + \alpha V.
\]

Then, when $R_0 \le 1$ and $H \le N/d$ for $(H, I, V) \in \Omega$,

\[
\begin{aligned}
L' &= V\bigl((1-\varepsilon)(1-r)pmH - e\alpha\bigr)\\
&= \frac{(1-r)(1-\varepsilon)pNm}{d}\, V\left(\frac{d}{N}H - \frac{1}{R_0}\right)\\
&\le \frac{(1-r)(1-\varepsilon)pNm}{d}\, V\left(1 - \frac{1}{R_0}\right) \le 0.
\end{aligned}
\]

That is, $L'$ is not positive for $R_0 \le 1$ and $H \le N/d$. Notice that the subset where $L' = 0$ is defined by the following two cases:

(1) If $R_0 < 1$, then $V = 0$.
(2) If $R_0 = 1$, then $V = 0$ or $H = N/d$.

It is easy to see that the largest invariant set contained in $\{V = 0\} \cup \{H = N/d\}$ is the point $P_0$. Therefore, applying the LaSalle-Lyapunov invariance principle in [6], it follows that $P_0$ is globally stable. If $R_0 > 1$, then $L' > 0$ for $H$ close to $N/d$, except when $V = I = 0$. So all bounded solutions starting near $P_0$ move away from $P_0$ in $\operatorname{int}\Omega$.

For the global stability of the infection equilibrium $P^*$ of system (3.1), we need the following lemma.

Lemma 1 (See [7]) Assume that

(1) There exists a compact absorbing set $K \subset \Omega$, and system (3.1) has a unique equilibrium $P^*$ in $\Omega$;
(2) System (3.1) satisfies the Poincaré-Bendixson property;
(3) For each periodic solution $m(t)$ of system (3.1) with $m(0) \in \Omega$, the associated second compound system of system (3.1) is asymptotically stable;
(4) $\det(J(P^*)) < 0$, where $P^*$ is the unique equilibrium of system (3.1).

Then, the unique equilibrium $P^*$ is globally stable in the region $\Omega$.

So for the infection equilibrium $P^*$, we also have the following result.

Theorem 3 The unique infection equilibrium $P^*$ of system (3.1) is globally asymptotically stable in the interior of the set $\Omega$ for $R_0 > 1$.

Proof As we know, if $(H(0), I(0), V(0)) \in \operatorname{int}\Omega$, there exists a constant $\beta > 0$ such that

\[
\liminf_{t\to\infty} H(t) > \beta, \quad \liminf_{t\to\infty} I(t) > \beta.
\]

Therefore, system (3.1) has a compact absorbing set $K \subset \Omega$, and $P^*$ is the unique infection equilibrium of system (3.1). So condition (1) of Lemma 1 holds.

The Jacobian matrix $J(H, I, V)$ of system (3.1) is given by

\[
J(H, I, V) = \begin{pmatrix}
-d - (1-r)mV & 0 & -(1-r)mH\\
(1-r)mV & -\alpha & (1-r)mH\\
0 & (1-\varepsilon)p & -e
\end{pmatrix}.
\]

Let $Q = \operatorname{diag}(-1, 1, -1)$; then the off-diagonal elements of $QJQ$ are not positive. Therefore, system (3.1) is a competitive system and satisfies the Poincaré-Bendixson property [8–10]. So condition (2) of Lemma 1 holds.

The second additive compound matrix of $J(H, I, V)$ is

\[
J^{[2]} = \begin{bmatrix}
-(d + (1-r)mV + \alpha) & (1-r)mH & (1-r)mH\\
(1-\varepsilon)p & -(d + (1-r)mV + e) & 0\\
0 & (1-r)mV & -(\alpha + e)
\end{bmatrix}.
\]

Thus, the second compound system associated with system (3.1) is

\[
\begin{cases}
X' = -(d + (1-r)mV + \alpha)X + (1-r)mHY + (1-r)mHZ,\\
Y' = (1-\varepsilon)pX - (d + (1-r)mV + e)Y,\\
Z' = (1-r)mVY - (\alpha + e)Z.
\end{cases}
\tag{3.4}
\]

We take the Lyapunov function [11]

\[
L(X, Y, Z, H, I, V) = \sup\left\{ |X|,\ \frac{I}{V}\bigl(|Y| + |Z|\bigr) \right\}.
\]

We assume that the solution $m(t)$ is periodic with least positive period $\omega > 0$. Thus, there exists a constant $\theta > 0$ such that

\[
L(X, Y, Z, H, I, V) \ge \theta \sup\{|X|, |Y|, |Z|\}
\tag{3.5}
\]

for all $(X, Y, Z) \in \mathbb{R}^3$ and $(H, I, V) \in m(t)$. Considering the upper right derivative of $L$ [12], we have

\[
\begin{aligned}
D_+|X(t)| &\le -(d + (1-r)mV + \alpha)|X(t)| + (1-r)mH\bigl(|Y(t)| + |Z(t)|\bigr)\\
&= -(d + (1-r)mV + \alpha)|X(t)| + \frac{(1-r)mHV}{I} \cdot \frac{I}{V}\bigl(|Y(t)| + |Z(t)|\bigr),
\end{aligned}
\tag{3.6}
\]

and

\[
\begin{aligned}
D_+|Y(t)| &\le (1-\varepsilon)p|X(t)| - (d + (1-r)mV + e)|Y(t)|,\\
D_+|Z(t)| &\le (1-r)mV|Y(t)| - (\alpha + e)|Z(t)|.
\end{aligned}
\]

Thus, using $\alpha \ge d$,

\[
\begin{aligned}
D_+ \frac{I}{V}\bigl(|Y(t)| + |Z(t)|\bigr)
&= \left(\frac{I'}{I} - \frac{V'}{V}\right)\frac{I}{V}\bigl(|Y(t)| + |Z(t)|\bigr) + \frac{I}{V} D_+\bigl(|Y(t)| + |Z(t)|\bigr)\\
&\le \frac{(1-\varepsilon)pI}{V}|X(t)| + \left(\frac{I'}{I} - \frac{V'}{V} - e - d\right)\frac{I}{V}\bigl(|Y(t)| + |Z(t)|\bigr).
\end{aligned}
\tag{3.7}
\]

Using (3.6) and (3.7), we find that

\[
D_+ L(t) \le \max\{f_1(t), f_2(t)\}\, L(t),
\tag{3.8}
\]

where

\[
f_1(t) = \frac{(1-r)mHV}{I} - (d + (1-r)mV + \alpha), \qquad
f_2(t) = \frac{(1-\varepsilon)pI}{V} + \left(\frac{I'}{I} - \frac{V'}{V} - e - d\right).
\]

From the last two equations of system (3.1), we have

\[
\frac{(1-r)mHV}{I} = \frac{I'}{I} + \alpha, \qquad \frac{(1-\varepsilon)pI}{V} = \frac{V'}{V} + e.
\]

So

\[
\max\{f_1(t), f_2(t)\} \le \frac{I'}{I} - d,
\]

and therefore

\[
\int_0^{\omega} \max\{f_1(t), f_2(t)\}\, dt \le \log I(t)\Big|_0^{\omega} - \omega d = -\omega d < 0.
\]

This and (3.8) immediately imply that $\lim_{t\to\infty} L(t) = 0$. Furthermore, $\lim_{t\to\infty} X(t) = \lim_{t\to\infty} Y(t) = \lim_{t\to\infty} Z(t) = 0$ by (3.5). So condition (3) of Lemma 1 holds.

Finally, we prove that $\det(J(P^*)) < 0$. We have

\[
\det J(P^*) = \begin{vmatrix}
-d - (1-r)mV^* & 0 & -(1-r)mH^*\\
(1-r)mV^* & -\alpha & (1-r)mH^*\\
0 & (1-\varepsilon)p & -e
\end{vmatrix}
= -(d + (1-r)mV^*)e\alpha + (1-r)(1-\varepsilon)mpdH^*.
\]

Since $P^*(H^*, I^*, V^*)$ is an equilibrium of system (3.1), clearly $(1-r)(1-\varepsilon)pmH^* = e\alpha$. Substituting this into $\det(J(P^*))$ yields

\[
\det(J(P^*)) = -(1-r)mV^* e\alpha < 0.
\]

So condition (4) of Lemma 1 holds. Hence, $P^*$ is globally stable in $\operatorname{int}\Omega$. This completes the proof.
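The determinant condition and the second additive compound matrix can be cross-checked numerically. The sketch below is an illustration under assumed parameter values, not part of the proof: it builds the Jacobian at the infection equilibrium, compares its determinant with the closed form $-(1-r)mV^*e\alpha$, and forms the second additive compound with the standard formula for a 3 × 3 matrix. All names and numbers are hypothetical.

```python
# Hypothetical parameter values, chosen only for illustration (not from the chapter).
PARAMS = {"N": 10.0, "d": 0.1, "m": 0.01, "r": 0.3,
          "alpha": 0.5, "eps": 0.4, "p": 2.0, "e": 1.0}


def infection_equilibrium(q):
    """(H*, I*, V*) from the closed-form expressions; assumes R0 > 1."""
    R0 = (q["m"] * q["p"] * (1.0 - q["r"]) * (1.0 - q["eps"]) * q["N"]
          / (q["d"] * q["alpha"] * q["e"]))
    I = q["N"] / q["alpha"] * (1.0 - 1.0 / R0)
    return (q["N"] / (q["d"] * R0), I, (1.0 - q["eps"]) * q["p"] * I / q["e"])


def jacobian(state, q):
    """Jacobian of system (3.1) at state = (H, I, V)."""
    H, I, V = state
    k = (1.0 - q["r"]) * q["m"]          # effective infection coefficient
    return [[-q["d"] - k * V, 0.0, -k * H],
            [k * V, -q["alpha"], k * H],
            [0.0, (1.0 - q["eps"]) * q["p"], -q["e"]]]


def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


def second_additive_compound(a):
    """Second additive compound of a 3x3 matrix (standard formula)."""
    return [[a[0][0] + a[1][1], a[1][2], -a[0][2]],
            [a[2][1], a[0][0] + a[2][2], a[0][1]],
            [-a[2][0], a[1][0], a[1][1] + a[2][2]]]


P_star = infection_equilibrium(PARAMS)
J = jacobian(P_star, PARAMS)
closed_form = -(1.0 - PARAMS["r"]) * PARAMS["m"] * P_star[2] * PARAMS["e"] * PARAMS["alpha"]
J2 = second_additive_compound(J)
print(det3(J), closed_form)
```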


3.5 Discussion

An important difference between the model proposed in this paper and the models proposed in previous studies is the addition of drug therapy [13]. With drug treatment, the basic reproduction number $R_0$ becomes smaller, because it is multiplied by the factors $(1-r)$ and $(1-\varepsilon)$, so the disease-free equilibrium is stable over a wider range of parameters and the treatment has a definite effect on disease control. In this paper, a model of hepatitis B virus infection was developed in which the infected liver cells are treated with drugs. We introduce a new parameter, the efficacy of drugs in blocking new infections, namely the non-cytolytic loss of infected cells. The basic reproduction number of the hepatitis B virus is obtained. The Lyapunov function is an important tool for establishing global stability. By constructing a Lyapunov function, it is proved that the infection-free equilibrium of system (3.1) is globally asymptotically stable when the basic reproduction number is less than or equal to 1. Using the asymptotic stability of the associated second compound system, it is then shown that the unique infection equilibrium of system (3.1) is globally stable when the basic reproduction number is greater than 1.

Current drug therapies cannot eradicate HBV from liver cells, so the duration of hepatitis B therapy is key to inhibiting the virus and to reducing or preventing liver cell damage and disease progression. The basic goal of therapy is to eliminate or permanently suppress HBV, to reduce pathogenicity and infectivity, to delay disease progression, and to reduce cirrhosis, primary hepatocellular carcinoma, and their complications, so as to extend survival time and improve quality of life.

References

1. Zhang, Y.X., Li, J.: Forecast and analysis of epidemic spread of new coronary pneumonia based on SIR model. J. Anhui Univ. Technol. (Natural Science) 37(1) (2020)
2. Qesmi, R., Wu, J., Wu, J., Heffernan, J.M.: Influence of backward bifurcation in a model of hepatitis B and C viruses. Math. Biosci. 224(2), 118–125 (2010)
3. Dahari, H., Feliu, A., Garcia-Retortillo, M., Forns, X., Neumann, A.U.: Second hepatitis C replication compartment indicated by viral dynamics during liver transplantation. J. Hepatol. 42(4), 491–498 (2005)
4. Reluga, T.C., Dahari, H., Perelson, A.S.: Analysis of hepatitis C virus infection models with hepatocyte homeostasis. SIAM J. Appl. Math. 69(4), 999–1023 (2009)
5. Nowak, M.A., May, R.M.: Virus Dynamics. Oxford University (2000)
6. Qesmi, R., et al.: Influence of backward bifurcation in a model of hepatitis B and C viruses. Math. Biosci. 224, 118–125 (2010)
7. Van den Driessche, P., Watmough, J.: Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission. Math. Biosci. 180(1–2), 29–48 (2002)
8. Li, M.Y., Wang, L.: Global stability in some SEIR epidemic models. In: Mathematical Approaches for Emerging and Reemerging Infectious Diseases: Models, Methods, and Theory, pp. 259–311. Springer (2002)
9. Castillo-Chavez, C., Yakubu, A.A.: Discrete-time SIS models with complex dynamics. Nonlinear Anal. Theory Methods Appl. 47(7), 4753–4762 (2001)
10. Levin, S.A. (ed.): Encyclopaedia of Biodiversity, vol. L. Academic Press, New York (2000)
11. Castillo-Chavez, C., Capurro, A.F., Velasco-Hernandez, J.X., Zellner, M.L.: El transporte publico y la dinamica de la tuberculosis a nivel poblacional. Aportaciones Matematicas, Ser. Commun. 61(1/4), 21–35 (2000)
12. Arreola, R., Crossa, A., Velasco, M.C.: Discrete-time S-E-I-S models with dispersal between two patches. Biometric Department, MTBI Cornell University Technical Report (2000)
13. Best, J., Pasour, V., Tisch, N., Castillo-Chavez, C.: Delayed density dependence and the dynamic consequences of dispersal between patches. Preprint (2001)
14. Ma, Z.E., Zhou, Y.C., Li, C.Z.: Differential equation stability and stability methods. Science Press, Beijing (2004)

Chapter 4

Research and Design of “AI+ Agriculture” Disease Detection System Based on Deep Learning

Biao Xu, Chengzhao Luo, and Shiyi Xie

Abstract Most groups engaged in traditional agricultural production lack professional knowledge of agricultural disease control. The resulting improper crop disease control brings serious economic losses to China’s agricultural development. To address this problem, this paper proposes the concept of “AI+ Agriculture” and builds a deep learning-based disease detection platform system. By integrating various artificial intelligence algorithms, the overall structure of the detection system is studied and designed, programs are developed for various agricultural disease detection scenarios, and the disease detection model is constructed, so as to realize intelligent detection and analysis of crop diseases. The system is designed to help groups engaged in agricultural production gain a deeper understanding of disease knowledge and prevention techniques, obtain effective solutions for controlling diseases during planting, reduce the adverse effects of diseases on crops, guide users toward efficient, high-yield crop planting, promote social productivity, and support economic growth in various regions of China.

4.1 Introduction

China is one of the world’s largest countries by agricultural output and population, and food security and agricultural development are top priorities of national development [1]. Nevertheless, crop diseases in China still affect a relatively wide area and occur at a relatively high rate, which has a serious negative impact on agricultural production: improper control can reduce crop yield and quality, or even destroy the crop entirely, bringing serious economic losses. Therefore, along with the rapid development of the Chinese economy and the continuous adjustment of the agricultural structure, agricultural disease prevention and control technology has received more and more attention [2]. It has long been a key control point of China’s organic agricultural production industry and an important link in ensuring stable and high yields of agricultural products.

Meanwhile, as artificial intelligence based on deep learning sweeps the world, industries and applications such as security and mobile devices are gradually undergoing a technological revolution [3]. As a branch of machine learning, deep learning can be regarded as a new type of artificial neural network. It has very broad application prospects in image processing and lays a solid foundation for the further development of intelligent agriculture [4]. Starting from deep learning, this paper therefore focuses on the problem of crop disease image recognition and combines it closely with the development and application of detection technology, in order to build an intelligent crop disease detection platform and formulate a comprehensive management technology system suited to agricultural practitioners. This strengthens the prevention and control of crop diseases and has important practical significance and market prospects.

B. Xu · C. Luo · S. Xie (B), Guangdong Ocean University, Zhanjiang, Guangdong, China, e-mail: [email protected]
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022. L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_4

4.2 Background

To date, the number of countries with organic agriculture certification has increased to about 200 worldwide, and the world organic agriculture market grew tenfold from 2000 to 2010 alone. By the end of 2010, the global agricultural market was worth $59 billion, but most of it was still concentrated in North America and Europe. At the same time, the overall annual output of domestic crops is also growing. According to the National Bureau of Statistics of the PRC, in 2015 alone the output of organic crops in China was as high as 5.73 million tons, and the total agricultural output value in 2017 was even higher, reaching 5.805976 billion yuan, 385.442 billion yuan more than in 2015 [5]. The agricultural market has clearly expanded rapidly.

4.3 The Problem to Which the System is Directed

With the further deterioration of the agricultural ecological environment in recent years, agricultural science and technology innovation has seen unprecedented development and extensive research worldwide. At present, however, domestic agricultural disease prevention and control technology is still relatively weak compared with that abroad. For example, the overall planting ecosystem has not yet developed in a coordinated way, and biopesticide development, technical support, social services, and network construction all need further improvement [2]. At the same time, in the domestic market, most agricultural production groups (such as individual contractors and rural cooperatives) lack professional knowledge of agricultural disease prevention and control, and achieving disease prevention and control goals still depends mainly on field visits by experts and real-time monitoring with professional software systems. Relying only on long-accumulated experience, hiring professionals at high cost, or depending on complex monitoring systems is not only inefficient and error-prone but also costly for producers [3]. Research on agricultural disease detection technology and its application therefore has important practical significance and market prospects, and the market shows an increasingly urgent demand for a portable, professional disease detection platform for civilian use.

Meanwhile, in August 2018, AI Challenger and XinKe Technology jointly released nearly 50,000 labeled crop disease images covering 27 diseases of 10 crop species, providing an unprecedented opportunity for “AI+ Agriculture.” This software system therefore targets the problems in agricultural disease prevention and control described above. Building on the rapid development of artificial intelligence, it combines traditional agriculture with deep learning technology and integrates a variety of artificial intelligence algorithms to develop programs for agricultural disease detection scenarios.

We integrated various operational functions into the platform so that the application can help individual contractors, rural cooperatives, and other groups engaged in agricultural production to understand and analyze agricultural diseases and their prevention and control techniques, obtain effective solutions, control the decisive factors at every stage of cultivation, and reduce the negative effects of disease on crops. This in turn improves agricultural output, promotes economic growth in various regions of China, and brings social and economic benefits.

4.4 Overall System Design

The “AI+ Agriculture” disease detection platform system based on deep learning is mainly applied to the prevention and control of crop diseases. Through big data analysis of agricultural disease detection data, the system captures the biological characteristics of the harmful and beneficial organisms associated with crops, their interactions in the agricultural environment, and the weak links in their life cycles. Agricultural information is extracted at a deeper level according to crop type, pest species, degree of damage, and habitat conditions. This information provides data support for function modules such as disease trend prediction and growth quality evaluation, and is used to derive a series of targeted, scientifically sound prevention and control techniques.


Fig. 4.1 Project basic research idea map

A one-stop “AI+ Agriculture” intelligent disease detection platform has been developed, which integrates agricultural disease detection, intelligent planning, data analysis, and an agricultural knowledge map. The basic research idea of the project is shown in Fig. 4.1. The platform guides users “point to point” toward high-yield, efficient, and high-quality crop cultivation, keeps the harm of agricultural diseases below the level of economic damage, and promotes the sustainable and healthy growth of plants. The aim is a safe and effective planting process, scientifically sound control measures, a stable improvement in crop yield, and good ecological quality and health.

4.5 Study on Preprocessing of Crop Image Samples

Image sample preprocessing mainly includes image size adjustment, shape correction, smoothing, and denoising, which reduce the adverse effects of environmental and equipment limitations, as shown in Fig. 4.2.


Fig. 4.2 Image preprocessing schematic

Fig. 4.3 Basic image preprocessing process

The team studied a range of digital image preprocessing algorithms for the different application scenarios of the system (strong exposure, rain, etc.). The system uses foreground extraction and related operations to eliminate useless information (such as weeds) from the images collected by the camera, which simplifies the data and improves image quality at the source [6], and in turn supports the accuracy and efficiency of the subsequent agricultural disease detection module. The effect is shown in Fig. 4.3b. In addition to this detection step, the system converts the incoming image to grayscale using OpenCV. The result is shown in Fig. 4.3c. On the grayscale image, the system combines contrast enhancement, Gaussian filtering, and other classical preprocessing algorithms to remove Gaussian noise, salt-and-pepper noise, and other interference. Finally, local contrast is enhanced by histogram equalization [6], so that the image information is restored as fully as possible, as shown in Fig. 4.3d.
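The grayscale, smoothing, and equalization steps can be illustrated without any dependencies. The following is a minimal pure-Python sketch, not the system's implementation: images are assumed to be nested lists of 8-bit values, `to_gray` assumes (R, G, B) tuples (OpenCV itself loads images in BGR order), and the three function names are hypothetical. In OpenCV the corresponding calls would roughly be `cv2.cvtColor`, `cv2.GaussianBlur`, and `cv2.equalizeHist`.

```python
GAUSS_3X3 = ((1, 2, 1), (2, 4, 2), (1, 2, 1))  # integer weights, sum = 16


def to_gray(rgb_image):
    """Convert rows of (R, G, B) tuples to 8-bit luminance values."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_image]


def gaussian_blur(gray):
    """3x3 Gaussian smoothing; pixels outside the image repeat the border."""
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += GAUSS_3X3[dy + 1][dx + 1] * gray[yy][xx]
            row.append((acc + 8) // 16)  # divide by 16 with rounding
        out.append(row)
    return out


def equalize_hist(gray):
    """Global histogram equalization of an 8-bit grayscale image."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    cdf, running = [0] * 256, 0
    for v in range(256):
        running += hist[v]
        cdf[v] = running
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # a flat image has nothing to stretch
        return [row[:] for row in gray]
    return [[(cdf[v] - cdf_min) * 255 // (total - cdf_min) for v in row]
            for row in gray]


# Tiny 2x2 example: grayscale conversion, smoothing, then contrast stretching.
image = [[(200, 30, 30), (60, 60, 60)], [(10, 120, 10), (250, 250, 250)]]
print(equalize_hist(gaussian_blur(to_gray(image))))
```

This sketch equalizes the global histogram; a tile-based variant would be closer to the local contrast enhancement mentioned above, and a median filter is the more usual choice against salt-and-pepper noise.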

4.6 Study on Preprocessing of Crop Image Samples

The training image dataset is one of the cores of the whole system. The system uses a deep convolutional neural network for feature extraction from crop lesion images. The convolutional neural network consists of an input layer, a convolutional layer, and a pooling layer, where the role of the convolutional layer is to extract image features. A 7 × 7 convolutional kernel is used to enlarge the receptive field of the network and obtain better feature information. The ReLU function is applied after the 7 × 7 convolutional layer to introduce nonlinearity between the layers of the network. The system then uses a pooling layer to obtain global contextual relationships and high-level semantic information from the deeper layers of the network to assist the classifier. Finally, a fully connected layer transforms the multidimensional feature map into a one-dimensional feature vector for classification. The “center loss combined with softmax loss” approach is used at the end of the network, which not only enlarges the distance between classes and reduces the intra-class distance but also increases the generalization ability of the model [1]. The image dataset is then fed into the neural network for training, after which the trained network model can be applied to assist in identifying crop diseases.

A deep learning algorithm of this kind therefore relies on a large, well-balanced image dataset. A portion of the example dataset is shown in Fig. 4.4. The images are annotated manually with Google’s data annotation tool “Fluid Annotation,” and the training set is merged with the “AI Challenger” agricultural disease dataset (27 diseases of 10 crop species, such as pepper and pumpkin) [3]. To construct the dataset used by the training module, web crawler frameworks (Scrapy, PySpider, etc.) are used to acquire crop disease images from the network in batches, and the preprocessing algorithms described above are applied to standardize and normalize the images and improve feature extraction [6]. The resulting standardized agricultural disease detection training dataset makes the subsequent model training module more rigorous and efficient [4]. The research idea for the training image set is shown in Fig. 4.5.

Fig. 4.4 Example dataset diagram

Fig. 4.5 Training image set research ideas
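The “center loss combined with softmax loss” objective mentioned in this section can be written out for a single sample. The sketch below is an illustrative assumption, not the chapter's training code: the softmax term is the usual cross-entropy on the class logits, the center term is half the squared distance between the sample's feature vector and its class center, and the weight `lam` and all example numbers are hypothetical.

```python
import math


def softmax_cross_entropy(logits, label):
    """Softmax loss for one sample: -log(softmax(logits)[label])."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(logits, feature, label, centers, lam=0.01):
    """Softmax loss plus lam times the center loss (lam is a hypothetical weight)."""
    return softmax_cross_entropy(logits, label) + lam * center_loss(feature, centers[label])


# Two classes with 2-D features; the sample belongs to class 0.
centers = {0: [0.0, 0.0], 1: [3.0, 3.0]}
loss = joint_loss([2.0, 0.5], [1.0, 2.0], 0, centers, lam=0.1)
print(loss)
```

The softmax term pushes classes apart, while the center term pulls features of the same class toward a common point, which is the inter-class and intra-class effect described above. In practice the class centers are themselves updated during training.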

4.7 Study on Agricultural Disease Detection Algorithm

4.7.1 Study on Classical Machine Learning Image Classification Algorithm and Advanced Deep Learning Classification Algorithms

The software developers studied classical machine learning algorithms such as principal component analysis (PCA), Bayesian methods, classification and regression trees (CART), and support vector machines (SVM), focusing on the underlying principles and ideas of each algorithm, and used them together with the project requirements to develop a classification model suited to the system. The platform is built around deep learning technology, combined with classification models based on convolutional neural networks (VGG, SENet, MobileNet, etc.), whose algorithmic ideas were studied and adapted to the requirements of the project [1]. Transfer learning was used to optimize the classification layer of the system model so that it fits the agricultural disease detection scenario, improves model accuracy and forward inference speed, and supports a sound agricultural disease detection platform.


4.7.2 Building the Systematic Model for Agricultural Disease Detection

The study of system model training can be divided into two core points: model transfer learning and loss function optimization. To develop the agricultural disease detection system, the team combined classical machine learning algorithms with the lightweight deep learning classification network ResNet-50 as the backbone, and developed the system’s agricultural disease detection model on this basis. The overall structure of the ResNet-50 model is shown in Fig. 4.6. It has a deep network structure of 50 layers, and its error rate is therefore usually smaller. The structure of ResNet-50 not only increases the training speed of the network model significantly but also improves its accuracy substantially [7]. The ResNet-50 model is built on the residual network structure: instead of simply stacking hidden layers, it adds shortcut connections, which address the degradation problem and alleviate vanishing gradients. Since different settings of each network’s parameters (e.g., training steps, iterations, learning rate) lead to different training results, appropriate parameter adjustment can further improve the efficiency and accuracy of model training. With the original network parameters fixed, the output layer is replaced by 27 disease classification channels using transfer learning. The initial plan was to train the ResNet-50 model for 14,000 iterations, and the recognition accuracy was found to converge very quickly. The recognition rate hovered between 0.2 and 0.3 for the first 1800 iterations; after that, the recognition accuracy increased gradually and the value of the loss function slowly decreased. By the 14,000th iteration, the accuracy was nearly 98%.

Fig. 4.6 Structure of ResNet-50


Fig. 4.7 Dataset distribution chart

Finally, the parameters of the model are learned from the agricultural disease dataset, and the convolutional neural network is built with the TensorFlow deep learning framework on Windows. The training parameters are learning rate = 2 × 10⁻⁴, epochs = 20, and batch size = 64, which yields an efficient, real-time model for the agricultural disease detection system.

4.8 Experimental Results and Analysis

4.8.1 Data Collection and Creation

Since different crops may have the same disease, the image dataset of 27 crops with different types and degrees of disease, obtained after the preprocessing described above, was divided into 60 major categories; that is, the same disease on different crops was treated as a different category, as shown in Fig. 4.7a. In the experiment, 85% of the 3000 images across the categories were used as the training set, as shown in Fig. 4.7b, and the remaining 15% were used as the test set, as shown in Fig. 4.7c.
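An 85/15 split that preserves each category's share can be sketched as follows. This is an illustrative assumption about how such a split might be done, not the authors' procedure: the function name `stratified_split`, the seed, and the file names are hypothetical, and the split is applied per category so that every class appears in both sets.

```python
import random
from collections import defaultdict


def stratified_split(samples, train_percent=85, seed=0):
    """Split (item, label) pairs into train and test sets, category by category."""
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(by_label):
        items = by_label[label][:]
        rng.shuffle(items)
        cut = len(items) * train_percent // 100  # integer arithmetic avoids float rounding
        train += [(item, label) for item in items[:cut]]
        test += [(item, label) for item in items[cut:]]
    return train, test


# Hypothetical file names: three categories with 20 images each.
samples = [(f"img_{label}_{i}.jpg", label)
           for label in ("apple_scab", "corn_rust", "pepper_spot")
           for i in range(20)]
train_set, test_set = stratified_split(samples)
print(len(train_set), len(test_set))
```

Splitting within each category matters when categories are unbalanced, since a purely random split could leave a small category entirely out of the test set.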

4.8.2 Experimental Procedure and Results

To evaluate the effectiveness of the neural models and algorithms, this paper uses the same dataset, obtained by the same preprocessing operations, to compare the studied model with three other neural networks (VGG-16, Xception, and MobileNet-V3) in terms of prediction accuracy. The formula for calculating the accuracy is shown in Eq. (4.1).

\[
A = \frac{1}{X} \times \sum_{j=1}^{X} \frac{a_{jj}}{a_j}
\tag{4.1}
\]

Table 4.1 Experimental results

Structure of network    Accuracy (%)
VGG-16                  94.31
Xception                92.16
MobileNet-V3            95.94
ResNet                  97.5

In Eq. (4.1), j is the category label; X is the total number of categories; aj is the number of samples in category j, and ajj is the number of samples of category j that are predicted as category j [1]. The accuracy obtained for each of the four models is shown in Table 4.1. The ResNet model used in this paper not only converges faster but also, as Table 4.1 shows, achieves higher accuracy than the other models, which supports the feasibility of the network model of this system.
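Eq. (4.1) can be computed directly from true and predicted labels. The sketch below assumes that the sum in Eq. (4.1) runs over categories, so that X is the number of categories and the result is the mean of the per-category accuracies; the function name and the toy labels are hypothetical.

```python
from collections import Counter


def mean_per_class_accuracy(y_true, y_pred):
    """Average over categories j of a_jj / a_j, as in Eq. (4.1)."""
    totals = Counter(y_true)                                       # a_j
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)    # a_jj
    return sum(hits[j] / totals[j] for j in totals) / len(totals)


# Toy example with three categories.
y_true = [0, 0, 1, 1, 1, 2]
y_pred = [0, 1, 1, 1, 0, 2]
print(mean_per_class_accuracy(y_true, y_pred))
```

For this toy example the per-category accuracies are 1/2, 2/3, and 1, so the mean is 13/18, whereas the plain fraction of correct predictions is 4/6. The two measures coincide only when all categories have the same number of samples.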

4.9 System Development and Design

4.9.1 Study on Scientific Design of Database

Since the system issues high-frequency requests for database data, database query efficiency and data security are particularly important, so that the system can continue to work when a server crashes under highly concurrent data access. Through in-depth study of the underlying working principles of the MySQL database, the project team explored efficient data search methods for MySQL, designed and optimized the database architecture, and improved the efficiency of the system’s data queries. At the same time, we adopted database master–slave synchronization and read–write separation, and designed the system to distribute database reads and writes across multiple servers according to the actual load, so as to improve the overall performance of the system and build a more robust system information base. The implementation principle of database master–slave synchronization and read–write separation is shown in Fig. 4.8.
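The routing decision behind read–write separation can be sketched in a few lines. The class below is a hypothetical illustration, not the system's code: it sends statements that start with a read-only verb to the slave servers in round-robin order and everything else to the master. The class name, the verb list, and the server names are assumptions, and no real database connection is opened.

```python
import itertools

READ_VERBS = {"SELECT", "SHOW", "DESCRIBE", "EXPLAIN"}  # treated as read-only here


class ReadWriteRouter:
    """Choose a server for each SQL statement: reads to slaves, writes to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; with no slaves, every statement goes to the master.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in READ_VERBS and self._slaves is not None:
            return next(self._slaves)
        return self.master


router = ReadWriteRouter("master", ["slave-1", "slave-2"])
targets = [router.route(q) for q in (
    "SELECT * FROM disease WHERE crop = 'pepper'",
    "select name FROM crop",
    "INSERT INTO disease (name) VALUES ('rust')",
    "SELECT 1",
)]
print(targets)
```

One design caveat: because slaves can lag behind the master, a read that must see a write issued just before it (or any statement inside a transaction) is usually routed to the master as well; this sketch does not handle that case.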

Fig. 4.8 Implementing database master–slave synchronization and read–write separation

4.9.2 Agricultural Encyclopedia Module Information Entry

The “Agricultural Encyclopedia” consists of the agricultural disease illustration (ADI) and the crop illustration (CI). In the agricultural disease illustration module, the back end uploads crop disease samples to the file server for storage and inserts the image URL, disease characteristics, disease season, disease cause, solution, and other disease-related information into the database. In the crop illustration module, pictures of crops at every growth stage are first uploaded to the file server for storage. Then, the image address, crop characteristics, suitable growth environment, suitable sowing time, suitable fertilization time, picking time, growth cycle, yield per acre per cycle, and other crop information for each stage are stored in the MySQL database.

4.9.3 Study on Intelligent Planning

The users of the system are divided into two groups: “less experienced farmers” and “experienced farmers.” The system provides different functional modules to the two groups and plans different planting schemes based on big data analysis. The information entered by users and the crop data are uploaded to the back-end server for aggregation and transformation and are then analyzed; this forms the “intelligent recommendation planning module” of the system, which is open to less experienced farmers. The module combines local climate, crop disease, and other factors to produce planting and disease-prevention plans, builds planting plans with one click, and guides users toward efficient, high-yield planting.

4.9.4 Mobile Segment Development Ideas

The system app is developed in Android Studio as an Android client, with Google's official framework and the AndroidX project as technical support, and uses LinearLayout and RelativeLayout combined with the relevant controls for XML interface design. Java and C++ are used for logic implementation and algorithm embedding [5]. The program is modularized through its architecture design to achieve high cohesion within modules and low coupling between modules. In the later stage of developing the system interface and functions, system testing is conducted by integrating the modules, while optimization and iterative upgrades of the client begin at the same time.

4.10 Conclusion

This paper presents the design and implementation of a deep learning-based crop disease detection system and proposes the concept of “AI+ Agriculture,” which integrates traditional agricultural production with deep learning technology. The system applies several artificial intelligence algorithms to agricultural disease detection scenarios, closely linking technology and agriculture. At the same time, we make full use of the disease detection information and extract deeper agricultural information through operations such as big data analysis; this information can serve as data support for subsequent functional modules such as “disease trend prediction” and “crop quality assessment.” The system has basically realized the intelligent detection and identification of most crop diseases. Although some problems and shortcomings in the use of the system remain unsolved, we believe that, with the accelerating development of information technology in the Internet era, the “AI+ Agriculture” disease detection system based on deep learning will be continuously improved and refined. In the future, we expect the system to be widely used in agricultural production, bringing convenience to individual contractors, rural cooperatives, and other groups, reducing the adverse effects of diseases on crops, and thus increasing agricultural production, promoting economic growth in various regions of China, and creating social and economic benefits.
Acknowledgements The research is supported by Project of Innovative and Entrepreneurship for College Students in Guangdong Province (202110566021), Team of Innovation of Guangdong Ocean University Students (CXTD2019004), and Major Applied Research Project of Guangdong Ocean University Innovation and Strengthening Project “Big Data Analyze about the Customs of Aquatic and Marine Foreign Trade Products based on Machine Learning”(230419058/Q2019).

References

1. Ji, X.W., Huo, X.W., Xue, D.: Deep learning-based crop pest and disease identification method. Chin. South. Agric. Mach. 51(23), 182–183 (2016). (In Chinese)
2. Yang, J.J.: Current status of agricultural pest control and countermeasures. Farmers Consultant (17), 40 (2020). (In Chinese)
3. Lei, J.: Design and implementation of citrus classification software based on image recognition. Mod. Inf. Technol. 5(11), 86–88 (2021). (In Chinese)
4. Qiang, M.J.: Deep learning in crop image recognition. J. Fujian Comput. 37(2), 1–5 (2021). (In Chinese)
5. Fu, D.R., Chang, L., Ji, X.Y.: Design of agricultural application software based on app inventor. Comput. Knowl. Technol. 15(16), 39–40 (2019). (In Chinese)
6. Jiang, L.H., Ren, Y., Chen, Y.F.: Design and implementation of a software platform for nondestructive crop inspection based on computer vision technology. Comput. Knowl. Technol. 9(15), 3640–3642 (2013). (In Chinese)
7. Gao, Y.Y.: Research and implementation of plant disease detection algorithm. Comput. Program. Skills Maintenance 03, 45–46 (2021). (In Chinese)

Chapter 5

Prediction of Silicon Content in Molten Iron Based on EMD-GA-LSTM

Junqi Yang, Haoran Wang, Xiuhe Wang, and Lintong Zhang

Abstract Stable and accurate prediction of the silicon content of molten iron in the blast furnace is of great concern for production scheduling and for the stable, safe, and efficient operation of blast furnace ironmaking. To make the most of the information gathered during blast furnace ironmaking and to improve the stability and validity of predictions, this paper proposes a new model that combines empirical mode decomposition (EMD), a genetic algorithm (GA), and a long short-term memory neural network (LSTM) to predict the silicon content in molten iron. First, the EMD algorithm is used to decompose the original silicon content sequence into several IMF components. Then, the GA is used to optimize the batch size and the number of neurons of the LSTM network, which is used to predict each IMF component. Finally, the predicted values of the IMF components are reconstructed to obtain the final forecast. The model is verified on measured data. The results show that the proposed method predicts the multivariable time series of molten iron silicon content accurately, with higher prediction accuracy and smaller error than the autoregressive integrated moving average model (ARIMA), the support vector regression model (SVR), and the standard LSTM neural network model.

5.1 Introduction

In the process of blast furnace ironmaking, the control of furnace temperature is very important. A furnace temperature that is too high or too low easily causes abnormal furnace conditions. In scientific research and production, the silicon content of blast furnace hot metal [Si] is usually used to characterize the hot metal temperature and the blast furnace temperature. Furnace temperature prediction therefore amounts to establishing a mathematical model that predicts the Si content in the ironmaking process over a given time series.

J. Yang · H. Wang (B) · X. Wang · L. Zhang
University of Science and Technology Beijing, Beijing 100083, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_5

The existing prediction models of silicon content in molten iron mainly include the chaos model [1], time series analysis models [2, 3], the support vector machine model [4], the partial least squares regression model [5], and the neural network model [6]. A recurrent neural network (RNN) is more appropriate when the time series is dynamic. Therefore, this paper adopts long short-term memory (LSTM), which extends the hidden layer of the RNN [7, 8], and establishes an EMD-GA-LSTM model to predict the silicon content under complex working conditions.

5.2 Theoretical Knowledge

5.2.1 Empirical Mode Decomposition (EMD)

EMD is an adaptive signal decomposition method that can extract the trend term and has been proved effective [9]. The EMD algorithm decomposes the signal according to its intrinsic time scales. It decomposes a complex signal into a finite number of intrinsic mode functions (IMFs), each of which contains part of the original signal. Let the existing time series be $\{a(t)\}_{t>0}^{T}$, where $T$ is the sequence boundary. Taking $\{a(t)\}_{t>0}^{T}$ as an example, the steps of the EMD algorithm are as follows:

Step 1: Traverse $a(t)$ to obtain the local mean $\{m(t)\}$.
Step 2: Subtract the mean function $\{m(t)\}$ from $\{a(t)\}_{t>0}^{T}$ to get the first set of components $\{h_1(t)\}$.
Step 3: Check whether $\{h_1(t)\}$ meets the conditions of an IMF sequence. If not, return to Step 1, replace $\{a(t)\}$ with $\{h_1(t)\}$, and sift again. The standard deviation SD of two consecutive sifting results is used as the termination criterion:

$$\mathrm{SD} = \sum_{t=0}^{T} \frac{|h_{k-1}(t) - h_{k}(t)|^{2}}{h_{k-1}^{2}(t)}, \quad k = 2, 3, \ldots \tag{5.1}$$

If $\mathrm{SD} \in [0.2, 0.3]$, output $\{h_k(t)\}$ as an IMF component $\{\mu_i(t)\}$, $i = 1, 2, \ldots$, of the sequence $\{a(t)\}_{t>0}^{T}$; the first component is denoted $\{\mu_1(t)\}$.
Step 4: Subtract $\{\mu_1(t)\}$ from the original set $\{a(t)\}_{t>0}^{T}$ to get $\{f_1(t)\}$.
Step 5: Using $\{f_1(t)\}$ as a new data set, repeat the above steps to obtain $\{\mu_2(t)\}$. Repeat until $\{f_n(t)\}$ is a monotone function. Finally, the decomposition of the sequence is obtained:

$$a(t) = \sum_{i=1}^{n} \mu_i(t) + f_n(t) \tag{5.2}$$
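The sifting steps above can be sketched in Python with NumPy. This is a simplified illustration rather than the authors' implementation: the envelopes are built by linear interpolation through the local extrema (cubic splines are the usual choice), sifting stops once SD falls below 0.25 (a value picked here from the range quoted in the text), and the function names and the test signal are hypothetical.

```python
import numpy as np

def local_mean(h):
    """Mean of the upper and lower envelopes of h, or None if h has too few extrema."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] >= 0))[0] + 1
    if len(maxima) + len(minima) < 2:
        return None  # (nearly) monotone: nothing left to sift
    t = np.arange(len(h))
    ends = [0, len(h) - 1]
    # Assumption: linear envelopes through the extrema and both end points
    up_idx = np.unique(np.concatenate((maxima, ends)))
    lo_idx = np.unique(np.concatenate((minima, ends)))
    upper = np.interp(t, up_idx, h[up_idx])
    lower = np.interp(t, lo_idx, h[lo_idx])
    return (upper + lower) / 2.0

def sift(a, sd_stop=0.25, max_iter=100):
    """Steps 1-3: subtract the local mean repeatedly until SD of Eq. (5.1) is small."""
    h_prev = a.copy()
    for _ in range(max_iter):
        m = local_mean(h_prev)
        if m is None:
            break
        h = h_prev - m
        sd = np.sum((h_prev - h) ** 2 / (h_prev ** 2 + 1e-12))  # Eq. (5.1)
        h_prev = h
        if sd < sd_stop:
            break
    return h_prev

def emd(a, max_imfs=10):
    """Steps 4-5: peel off IMFs until the residual has too few extrema."""
    residual = np.asarray(a, dtype=float).copy()
    imfs = []
    while len(imfs) < max_imfs and local_mean(residual) is not None:
        mu = sift(residual)
        imfs.append(mu)
        residual = residual - mu
    return imfs, residual

# Hypothetical test signal: two tones plus a linear trend
t = np.linspace(0.0, 1.0, 400)
signal = np.sin(2 * np.pi * 25 * t) + 0.5 * np.sin(2 * np.pi * 4 * t) + t
imfs, residual = emd(signal)
print(len(imfs), np.allclose(sum(imfs) + residual, signal))
```

Whatever envelope method is used, the components and the residual add back to the input by construction, which is the identity stated in Eq. (5.2).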

5.2.2 Genetic Algorithm (GA)

The steps of this algorithm are as follows:

Step 1: Randomly initialize the population within the domain of definition and set the number of generations $N$.
Step 2: Calculate the fitness of the individuals, and select an appropriate number of well-adapted individuals with high fitness to take part in reproduction.
Step 3: In each generation, select a certain number of individuals according to a probability, pair them, and apply the crossover operation to generate a new population.
Step 4: Set the mutation probability, mutate the new population obtained after crossover by changing individual parameter values, and obtain the next-generation population $p(1)$.
Step 5: Repeat Steps 2–4 until $N$ generations are completed to obtain the population $p(N)$, and output the individual with the maximum fitness in $p(N)$.
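The five steps can be sketched as a small real-valued GA in Python. The sketch is illustrative only: the truncation selection, the arithmetic crossover, the clamping to the domain, the elitism (carrying the best individual over unchanged), and all parameter values are assumptions made here, not details given in the text.

```python
import random

def genetic_algorithm(fitness, low, high, pop_size=20, generations=30,
                      crossover_prob=0.8, mutation_prob=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial population inside [low, high]
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # Step 2: rank by fitness and keep the fitter half as parents
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        best_history.append(fitness(ranked[0]))
        # Elitism (assumption): the best individual survives unchanged
        children = [ranked[0]]
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Step 3: arithmetic crossover with probability crossover_prob
            if rng.random() < crossover_prob:
                w = rng.random()
                child = w * p1 + (1 - w) * p2
            else:
                child = p1
            # Step 4: mutation replaces the value with a fresh random one
            if rng.random() < mutation_prob:
                child = rng.uniform(low, high)
            children.append(min(max(child, low), high))  # stay in the domain
        population = children
    # Step 5: output the fittest individual of the final population
    best = max(population, key=fitness)
    best_history.append(fitness(best))
    return best, best_history

# Hypothetical fitness with its maximum at x = 3
best, history = genetic_algorithm(lambda x: -(x - 3.0) ** 2, 0.0, 10.0)
print(best, history[-1])
```

Because of the elitism step, the best fitness recorded in `history` never decreases from one generation to the next.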

5.2.3 Long Short-Term Memory (LSTM)

The LSTM network is an improved recurrent neural network (RNN) that can solve the long-term dependency problem. After it was proposed, it was further improved by adding a forget gate [10]. Compared with the traditional neural network model, it handles more efficiently the problems that appear in the traditional algorithm. One unit has three gates, namely the input gate, the forget gate, and the output gate [11]. The schematic diagram of the LSTM basic unit is shown in Fig. 5.1. The calculation formulas are as follows:

Fig. 5.1 Figure of basic unit of neural network

$$f_t = \mathrm{sigmoid}(W_{hf} h_{t-1} + W_{xf} x_t) \tag{5.3}$$

$$i_t = \mathrm{sigmoid}(W_{hi} h_{t-1} + W_{xi} x_t) \tag{5.4}$$

$$c_t = f_t * c_{t-1} + i_t * \tanh(W_{hc} h_{t-1} + W_{xc} x_t) \tag{5.5}$$

$$o_t = \mathrm{sigmoid}(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t) \tag{5.6}$$

$$h_t = o_t * \tanh(c_t) \tag{5.7}$$

where $f_t$, $i_t$, and $o_t$ represent the forget, input, and output gates, respectively; $c_t$ is the cell state; $x_t$ is the input at the current time step; $W$ denotes the weight matrices; and $h_{t-1}$ and $h_t$ represent the output signals at the previous time step and the current time step, respectively.
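A single LSTM step following Eqs. (5.3)–(5.7) can be written directly in NumPy. The sketch below is an illustration under stated assumptions: the candidate term in the cell update is given its own weights `hc` and `xc` as in the standard LSTM, biases are omitted as in the equations, the peephole term of the output gate uses a full matrix `co`, and the helper names and sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_weights(n_in, n_hidden, scale=0.1, seed=0):
    """Small random weight matrices keyed by the subscripts used in the equations."""
    rng = np.random.default_rng(seed)
    W = {}
    for gate in ("f", "i", "c", "o"):
        W["h" + gate] = scale * rng.standard_normal((n_hidden, n_hidden))
        W["x" + gate] = scale * rng.standard_normal((n_hidden, n_in))
    W["co"] = scale * rng.standard_normal((n_hidden, n_hidden))
    return W

def lstm_step(x_t, h_prev, c_prev, W):
    f_t = sigmoid(W["hf"] @ h_prev + W["xf"] @ x_t)                   # forget gate
    i_t = sigmoid(W["hi"] @ h_prev + W["xi"] @ x_t)                   # input gate
    candidate = np.tanh(W["hc"] @ h_prev + W["xc"] @ x_t)
    c_t = f_t * c_prev + i_t * candidate                              # cell state
    o_t = sigmoid(W["xo"] @ x_t + W["ho"] @ h_prev + W["co"] @ c_t)   # output gate
    h_t = o_t * np.tanh(c_t)                                          # unit output
    return h_t, c_t

# Run a hypothetical 5-step input sequence through the cell
W = init_weights(n_in=3, n_hidden=4)
h, c = np.zeros(4), np.zeros(4)
for x in np.random.default_rng(1).standard_normal((5, 3)):
    h, c = lstm_step(x, h, c, W)
print(h.shape, c.shape)
```

Since the output is a gate value in (0, 1) times a tanh, every entry of `h` stays strictly between -1 and 1.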

5.3 EMD-GA-LSTM Algorithm Introduction

Figure 5.2 shows the prediction flow of the EMD-GA-LSTM model. The EMD algorithm decomposes the silicon content sequence of molten iron into components of different frequencies. Data normalization does not change the result of the EMD decomposition. The specific prediction process is as follows:

Fig. 5.2 Flowchart of EMD-GA-LSTM algorithm

Step 1: Decompose the silicon content sequence into seven IMF components and a residual sequence.
Step 2: Combine the environmental parameters and material parameters with the corresponding IMF component by time, divide the data into a training set and a test set, and normalize them.
Step 3: Taking MAPE as the fitness function, use GA to optimize the batch size and the number of neurons of the LSTM.
Step 4: Use the optimal LSTM hyperparameters obtained above to predict the corresponding data of each IMF component.
Step 5: Inverse-normalize and sum the prediction results of all components to get the final predicted value.
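The normalization in Step 2 and the inverse normalization and summation in Step 5 can be illustrated with a short Python sketch. It assumes min-max scaling (the text does not name the normalization), fits the scaling on the whole component for brevity where a real pipeline would fit it on the training set only, and uses a hypothetical `predictor` callable as a stand-in for the trained LSTM.

```python
import numpy as np

def fit_minmax(x):
    lo, hi = float(np.min(x)), float(np.max(x))
    span = hi - lo if hi > lo else 1.0  # guard against a constant component
    return lo, span

def predict_and_reconstruct(components, predictor):
    """Normalize each component, predict it, undo the scaling, and sum the results."""
    total = np.zeros(len(components[0]), dtype=float)
    for comp in components:
        comp = np.asarray(comp, dtype=float)
        lo, span = fit_minmax(comp)
        z = (comp - lo) / span        # Step 2: normalize
        z_hat = predictor(z)          # Step 4: stand-in for the LSTM prediction
        total += z_hat * span + lo    # Step 5: inverse-normalize and sum
    return total

# Hypothetical components standing in for IMFs and a residual
t = np.linspace(0.0, 1.0, 50)
components = [np.sin(2 * np.pi * 8 * t), 0.5 * np.cos(2 * np.pi * 2 * t), 0.4 + 0.1 * t]
identity = lambda z: z  # a perfect "predictor", used only to check the round trip
reconstructed = predict_and_reconstruct(components, identity)
print(np.allclose(reconstructed, sum(components)))
```

With a perfect predictor the round trip returns the sum of the components, so any error in the final forecast comes from the per-component predictions and not from the scaling.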

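5.4 Verification of Measured Data

5.4.1 Performance Index

To fully show the goodness of fit of the model, this paper uses six performance indicators: the 0.1 hit rate (Hitrate1), the 0.05 hit rate (Hitrate2), the mean absolute percentage error (MAPE), the root mean square error (RMSE), the coefficient of determination $R^2$, and the mean absolute error (MAE). Here $\hat{y}^{(i)}$ and $y^{(i)}$ represent the predicted value and the real value, respectively, $\bar{y}$ is the mean of the real values, and $m$ is the number of samples.

$$\mathrm{MAPE} = \frac{1}{m} \sum_{i=1}^{m} \left| \frac{\hat{y}^{(i)} - y^{(i)}}{y^{(i)}} \right| \tag{5.8}$$

$$\mathrm{MAE} = \frac{1}{m} \sum_{i=1}^{m} \left| \hat{y}^{(i)} - y^{(i)} \right| \tag{5.9}$$

$$\mathrm{Hitrate1} = \frac{\sum_{i=1}^{m} a_{1i}}{m} \times 100\%, \quad a_{1i} = \begin{cases} 1, & |\hat{y}^{(i)} - y^{(i)}| < 0.1 \\ 0, & \text{else} \end{cases} \tag{5.10}$$

$$\mathrm{Hitrate2} = \frac{\sum_{i=1}^{m} a_{2i}}{m} \times 100\%, \quad a_{2i} = \begin{cases} 1, & |\hat{y}^{(i)} - y^{(i)}| < 0.05 \\ 0, & \text{else} \end{cases} \tag{5.11}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{m} (\hat{y}^{(i)} - y^{(i)})^2}{\sum_{i=1}^{m} (y^{(i)} - \bar{y})^2} \tag{5.12}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (\hat{y}^{(i)} - y^{(i)})^2} \tag{5.13}$$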
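The six indicators can be computed together in a few lines of NumPy. The sketch below follows the definitions of this section; the function name and the four sample values are hypothetical and are not data from the paper.

```python
import numpy as np

def evaluate(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    abs_err = np.abs(err)
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MAPE": float(np.mean(abs_err / np.abs(y_true))),
        "MAE": float(np.mean(abs_err)),
        "Hitrate1": float(np.mean(abs_err < 0.1) * 100.0),   # share within 0.1, in %
        "Hitrate2": float(np.mean(abs_err < 0.05) * 100.0),  # share within 0.05, in %
        "R2": float(1.0 - ss_res / ss_tot),
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
    }

# Hypothetical real and predicted silicon contents
scores = evaluate([0.4, 0.5, 0.6, 0.5], [0.42, 0.5, 0.72, 0.44])
for name, value in scores.items():
    print(name, round(value, 4))
```

MAPE is returned as a fraction, not a percentage, which matches the magnitude of the MAPE values reported in Table 5.2.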

Fig. 5.3 Sample data time series

5.4.2 Data Set Introduction

The data consist of 1158 records measured in the actual production process and provided by Beike Yili company, including 1155 records of material data, basic operation data, and hot metal silicon content data. A sample of the Si content data of molten iron is shown in Fig. 5.3. This paper uses a multivariable prediction model.

5.4.3 EMD Decomposition

EMD is used to decompose the training-set sequence of hot metal silicon content into 7 IMF components and a residual sequence, as shown in Fig. 5.4. The correlation coefficient matrix between each IMF component and the original sequence is shown in Table 5.1. The matrices show that every component is correlated to some degree with the original sequence, except that the seventh component, imf7, fluctuates greatly and is only weakly (and negatively, −0.1774) correlated with the original hot metal silicon content sequence. Each component therefore retains some information of the original sequence after EMD decomposition, which is useful for prediction.
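A correlation coefficient matrix of the kind reported for each IMF is a 2×2 symmetric matrix with ones on the diagonal, and it can be obtained with `numpy.corrcoef`. The series below are synthetic stand-ins for one component and the original sequence, not the measured data.

```python
import numpy as np

def correlation_matrix(component, original):
    """2x2 Pearson correlation matrix between one component and the original series."""
    return np.corrcoef(component, original)

# Synthetic stand-ins: a slow tone plus a fast tone, and the fast tone alone
t = np.linspace(0.0, 1.0, 200)
original = np.sin(2 * np.pi * 5 * t) + 0.3 * np.sin(2 * np.pi * 40 * t)
component = 0.3 * np.sin(2 * np.pi * 40 * t)
R = correlation_matrix(component, original)
print(np.round(R, 4))
```

The off-diagonal entry is the single number that matters in each matrix; the diagonal entries are always 1.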

Fig. 5.4 Decomposition results of EMD algorithm

Table 5.1 Correlation coefficient between IMF sequence and molten iron silicon content

| Decomposed IMF | Correlation coefficient matrix |
| --- | --- |
| IMF1 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & 0.3989 \\ 0.3989 & 1 \end{bmatrix}$ |
| IMF2 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & 0.4570 \\ 0.4570 & 1 \end{bmatrix}$ |
| IMF3 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & 0.3040 \\ 0.3040 & 1 \end{bmatrix}$ |
| IMF4 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & 0.2649 \\ 0.2649 & 1 \end{bmatrix}$ |
| IMF5 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & 0.2082 \\ 0.2082 & 1 \end{bmatrix}$ |
| IMF6 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & 0.3640 \\ 0.3640 & 1 \end{bmatrix}$ |
| IMF7 and original molten iron silicon content sequence | $\begin{bmatrix} 1 & -0.1774 \\ -0.1774 & 1 \end{bmatrix}$ |

Fig. 5.5 EMD-GA-LSTM model prediction sequence

5.4.4 LSTM Improved by GA

The LSTM model with GA-optimized parameters is used to predict each IMF component. As shown in Fig. 5.5, the EMD-GA-LSTM model achieves good prediction results with a small prediction error. Hitrate1 and Hitrate2 are 99.4% and 94.5%, respectively, the mean absolute percentage error (MAPE) is 0.0675, the mean absolute error (MAE) is 0.0208, and the coefficient of determination is 0.8969. In Fig. 5.5, the abscissa represents the time nodes of the time series; for example, 250 represents the 250th time node, and the interval between nodes is one hour. The figure shows the prediction results at the corresponding time nodes obtained with the algorithm. The abscissas of Figs. 5.6 and 5.7 are the same as that of Fig. 5.5.

5.5 Model Comparison

The results in Figs. 5.6 and 5.7 and Table 5.2 show that, compared with the ARIMA and SVR models, the evaluation indexes of the LSTM-based models are improved to varying degrees. The coefficients of determination of the LSTM, EMD-LSTM, and EMD-GA-LSTM models reach 0.6948, 0.7865, and 0.8969, respectively. This shows that LSTM can achieve good results in predicting such data with long intervals and delays. Figure 5.5 shows that the standard LSTM fluctuates greatly when predicting data with large amplitude, and the prediction result is not ideal. Although the EMD-LSTM model mitigates this problem, the final effect is still not good. After GA is used to optimize the EMD-LSTM model, the predicted results are clearly better than before, and the prediction error of the EMD-GA-LSTM model is also reduced: Hitrate1 and Hitrate2 are 99.4% and 94.5%, respectively, the mean absolute percentage error (MAPE) is 0.0675, the mean absolute error (MAE) is 0.0208, and the coefficient of determination is 0.8969. Based on the above analysis, the proposed EMD-GA-LSTM model gives more accurate predictions where the data set fluctuates greatly.

Fig. 5.6 Comparison between proposed model and LSTM-based models prediction sequences

Fig. 5.7 Comparison of other advanced models prediction sequence

Table 5.2 Performance indexes of each algorithm

| Algorithm | Hitrate1 (%) | Hitrate2 (%) | MAPE | MAE | R2 |
| --- | --- | --- | --- | --- | --- |
| ARIMA | 82.3 | 70.3 | 0.1405 | 0.0456 | 0.4473 |
| EMD-LSTM | 98.4 | 83.1 | 0.0854 | 0.0277 | 0.7865 |
| EMD-GA-LSTM | 99.4 | 94.5 | 0.0675 | 0.0208 | 0.8969 |
| SVR | 75.3 | 41.6 | 0.2009 | 0.0691 | 0.1975 |
| EMD-SVR | 79.6 | 49.4 | 0.1783 | 0.0628 | 0.2866 |
| LSTM | 96.4 | 75.2 | 0.1101 | 0.0351 | 0.6948 |

5.6 Conclusion

In this paper, an EMD-GA-LSTM model is established to predict the Si content in hot metal under changeable working conditions. First, the EMD algorithm is used to decompose the original sequence; then the LSTM model with GA-optimized parameters is used to predict each component; finally, the prediction results are reconstructed. The results show that the EMD-GA-LSTM model converges well, keeps its overall error within a small range, and gives the best prediction among the compared models. Building on the current work, we will further study how to increase prediction accuracy by improving the LSTM algorithm and incorporating feature engineering.

References

1. Gao, C.H.: Chaotic analysis for blast furnace ironmaking process. Acta Phys. Sin. 54(4), 1490–1494 (2005)
2. He, S.B.: Hybrid time series predictive control model for silicon content in blast furnace hot metal. J. Zhejiang Univ. (Eng. Sci.) 10, 1739–1742 (2007)
3. Cui, G.M.: Prediction of blast furnace temperature using neural network based on time series. Metall. Ind. Autom. 39(05), 15–21 (2015)
4. Yuan, D.F.: Prediction model of silicon content series in blast furnace hot metal with support vector machines. J. Taiyuan Univ. Technol. 45(05), 684–688 (2014)
5. Lin, S.: Model of hot metal silicon content in blast furnace based on principal component analysis application and partial least square. J. Iron. Steel Res. Int. 18(10), 13–16 (2011)
6. Luo, S., Gao, C., Zeng, J., et al.: Blast furnace system modeling by multivariate phase space reconstruction and neural networks. Asian J. Control 15(2), 553–561 (2013)
7. Senior, A.: Context dependent phone models for LSTM RNN acoustic modeling. IEEE Int. Conf. Acoust. Speech Signal Process. 1, 4585–4589 (2015)
8. Li, Z.L.: Research on hot metal Si-content prediction based on LSTM-RNN. CIESC J. 69(03), 992–997 (2018)
9. Sun, J., Sheng, H.: A hybrid detrending method for fractional Gaussian noise. Physica A 390(17), 2995–3001 (2011)
10. Hochreiter, S.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)
11. Greff, K., Srivastava, R.K., Koutník, J., Steunebrink, B.R., Schmidhuber, J.: LSTM: a search space Odyssey. IEEE Trans. Neural Netw. Learn. Syst. 28(10), 2222–2232 (2016)

Chapter 6

Using Meta Path Information to Adjust Embedded Recommendation System

Rui Ma, Xiuzhuo Wei, Huinan Zhao, Hui Sun, and Suhua Wang

6.1 Introduction

Nowadays, there are more and more types of information on the Internet, which means people face all kinds of choices every day. Internet users need to find valuable information in these large amounts of data, which is like looking for a needle in a haystack. To solve this problem, solutions such as search engines have been developed. Search engines meet the need of users to search actively, so that users can find the corresponding information according to their own needs. The search results, however, depend greatly on the users' search words, whereas a recommendation system helps users find interesting content when they have no clear purpose. In many cases, therefore, a recommendation system exists as an application inside all kinds of products.

In the age of the Internet, recommendation systems can be found in a variety of areas. For example, video sites today hold a large number of movie resources and are updated every day, so it takes people a long time to find what they need. Recommendation systems can help filter out unwanted information and recommend movies that users are interested in (for example, Netflix and YouTube do this well). The traditional recommendation algorithms mainly include collaborative filtering, content-based recommendation, and hybrid recommendation. Among them, collaborative filtering is the most widely used. It extracts the information that users leave behind when they interact with the Web, for example, what content they browse, what kind of movies they search for, and whether they rate a certain movie. After this information is analyzed and used for prediction, personalized recommendations are given to improve the efficiency with which users obtain data.

R. Ma · X. Wei · H. Zhao · H. Sun · S. Wang (B)
Changchun Humanities and Sciences College, Jilin, China
e-mail: [email protected]

© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022
L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_6

The core of collaborative filtering is the embedding strategy: an embedding represents an entity as a dense vector and thereby achieves dimension reduction. We can build a distributed representation from the many attributes of an item or a user and finally express it as a vector, so the embedding is an abstract dense vector. The vector contains not only the attribute information of the entity itself but also the association information between entities. Since the concept of embedding was put forward, it has been applied more and more widely in recommendation systems, with many novel methods, and interest in studying embeddings has not declined.

However, most recommendation methods use only the user and item IDs to generate vectors. Taking a user watching a movie as an example, the feature vectors are built only from the ID of the user and the ID of the movie; the inner product of the two vectors is then computed and fed into the neural network. This is the most widely used recommendation method at present. Although it has achieved a certain effect, it makes the representation of the user and the movie too rigid, which leads to errors in estimating the user's preference for the movie. For example, if User1 and User2 both rate the first movie 5 points, the algorithm assumes that User1 and User2 have the same interests. But if User1 rates the second movie 5 points while User2 rates it 0 points, the algorithm then considers the interests of User1 and User2 to be “very inconsistent.” This leads to a contradiction.

Considering this situation, our solution is to connect the nodes in the network to form meta paths. When a user rates a movie, he or she may focus on different attributes of the movie, such as the director, the actors, or the genre. For example, if a user (U) has seen a movie (M) directed by a director (D) who also directed another movie (M), we can connect these nodes to form the meta path UMDM. The proposed recommendation method adjusts the embedded user representation with meta path information.

6.2 Related Work

6.2.1 Embedding

Embedding has been studied for a long time. Hu [1] proposed that embedding is an important preprocessing step for analyzing large-scale information networks and can be regarded as a mapping from semantic space to vector space. Xu [2] proposed that how to express higher-order information through a shallow network is an important topic of current neural network research. Dai et al. [3] proposed a new deep learning model to capture the nonlinear co-evolutionary nature of user and item embeddings in a nonparametric manner. Zhu et al. [4] proposed a heterogeneous hypergraph embedding framework for document recommendation. The framework is general and can contain various relationships between users, tags, and resources.

An embedding is an abstract dense vector, that is, a vector used to describe an entity. It contains not only information about the entity itself but also information about its associations. In a recommendation system, both users and items are described as embeddings, and their degree of correlation is described by the inner product of the embeddings: the larger the inner product, the higher the correlation. Researchers have therefore been trying to improve embeddings in order to improve the accuracy of recommendation systems, that is, the accuracy of predicting users' ratings of items. The LLE learning algorithm published by Roweis [5] notes that compact representation of high-dimensional data is a basic approach to analyzing massive multivariate data. Zhao [6] proposed the point-of-interest (POI) embedding model to learn POI representations.

Scholars at home and abroad have thus discussed embedding in depth, which also shows that embedding plays an important role in recommendation systems. In generating better embeddings, researchers still face some problems: for example, the data are too sparse, or too much data in the computation reduces efficiency. These problems call for greater effort in dimensionality reduction and in collecting auxiliary information.
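The use of the inner product as a relevance score can be shown with a small NumPy sketch. The embeddings here are random placeholders rather than trained vectors, and the sizes and function names are hypothetical.

```python
import numpy as np

# Random placeholder embeddings: 4 users and 5 items in a 3-dimensional space
rng = np.random.default_rng(0)
user_emb = rng.standard_normal((4, 3))
item_emb = rng.standard_normal((5, 3))

def relevance(user_id, item_id):
    """Inner product of a user embedding and an item embedding."""
    return float(user_emb[user_id] @ item_emb[item_id])

def recommend(user_id, top_n=2):
    """Indices of the top_n items with the largest inner product for this user."""
    scores = item_emb @ user_emb[user_id]
    return np.argsort(-scores)[:top_n].tolist()

print(recommend(0), relevance(0, recommend(0)[0]))
```

In a trained system the two embedding tables would be learned from interaction data; the scoring step itself stays this simple.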

6.2.2 Meta Path

Early literature on recommendation systems mainly adopted collaborative filtering. Although this method has achieved some results in modeling, it still faces common problems such as data sparsity and cold start. To solve these problems, Xu et al. [7] proposed adding additional information in heterogeneous networks, mainly social information, location information, and other heterogeneous information. Dai et al. [8] proposed a heterogeneous information network (HIN) recommendation algorithm based on improved matrix decomposition, which makes recommendations by constructing a specific network between users and items. Shi et al. [9] mined the associations between nodes by constructing heterogeneous graphs; items with high relevance are considered to be favored by users and are therefore recommended. He et al. [10] proposed that more comprehensive and subtle information could be extracted from the recommendation system, because it contains not only many users and items but also item attribute information such as film director, actors, and subject matter. Establishing associations among the information objects of the HIN in the recommendation system yields multi-element information and reduces the impact of data sparsity, thereby improving the recommendation effect.

Xie et al. [11] proposed that complex networks can be formed between the nodes of a HIN. Rich semantic information can be analyzed and mined through diversified edge relations to solve the problem of insufficient data. The different relationships between objects can be discovered from their attributes and the links between them. Yan et al. [12] pointed out that most existing HIN-based recommendation methods use path-based semantic similarity. In a HIN, two objects can be semantically linked by a path, which is defined as a meta path; meta paths mainly address the cold start problem [13]. These methods have begun to include path representations in the model. Since information networks contain abundant meta path types, we study adding meta path information to the neural network training process. It can be used to improve the embeddings of users and items and to avoid some of the problems described above.

Simply put, a meta path is a specific path connecting two entities. For example, in a movie recommendation system, we use U to represent a user, M a movie, and D a director. Then UMUM is a meta path: the first user watched a movie, another user watched the same movie and then watched a second movie, so we can recommend the second movie to the first user. Similarly, connecting “user-movie-director-movie” forms the UMDM meta path, which reaches other movies by the director of a movie the user has seen; we can recommend those movies to the user. Different paths can thus be interpreted as different business semantics, which helps us mine more auxiliary information, adjust the composition of the embedding, and enrich the training data. In this respect, meta paths make full use of the information in the heterogeneous graph, and the variety of relationships between users and items makes the recommendations interpretable. However, meta paths need to be carefully designed and are not suitable for some scenarios.

Fig. 6.1 System model diagram
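The UMUM and UMDM meta paths described in this section can be enumerated on a toy heterogeneous graph with plain Python dictionaries. The users, movies, and directors below are made up for the illustration, and the functions are a sketch of the path semantics only, not of the proposed model.

```python
# Toy heterogeneous graph: who watched what, and who directed what (made-up data)
watched = {"u1": ["m1"], "u2": ["m1", "m2"]}
directed_by = {"m1": "d1", "m2": "d2", "m3": "d1"}

def umdm_paths(user):
    """user -> watched movie -> its director -> another movie by that director."""
    paths = []
    for m in watched[user]:
        d = directed_by[m]
        for m2, d2 in directed_by.items():
            if d2 == d and m2 not in watched[user]:
                paths.append((user, m, d, m2))
    return paths

def umum_paths(user):
    """user -> watched movie -> another user who watched it -> that user's other movie."""
    paths = []
    for m in watched[user]:
        for other, movies in watched.items():
            if other != user and m in movies:
                for m2 in movies:
                    if m2 not in watched[user]:
                        paths.append((user, m, other, m2))
    return paths

print(umdm_paths("u1"))
print(umum_paths("u1"))
```

The last node of each path is a candidate movie, and the path itself serves as the explanation of why that movie is suggested.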

6.3 Model Figure

The overall architecture of the model is shown in Fig. 6.1. The first part handles the user: it generates a one-hot vector directly from the userID and feeds that vector into a fully connected network, which outputs a low-dimensional, real-valued, dense vector as the representation of the user. This process is formalized as follows:

$$U^{(1\times k)} = P^{(1\times m)} \cdot W^{(m\times k)} \tag{6.1}$$

where $P^{(1\times m)}$ is the m-dimensional one-hot vector of the user and $W^{(m\times k)}$ is the weight matrix of the fully connected network. $P^{(1\times m)}$ is mapped to the user embedding $U^{(1\times k)}$ through the fully connected network, including, k

Blue. It can be seen that the optimal threshold is selected on the blue channel. After excluding the limit values of shadows caused by light, the final thresholds selected in this paper are 8 and 150. When the value of the blue channel is between 8 and 150, the pixel represents tobacco information; otherwise, it represents the background. The results after binarization are shown in Fig. 8.3.
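The blue-channel thresholding described above can be illustrated with a minimal sketch. It assumes an RGB channel order, inclusive bounds at 8 and 150, and NumPy arrays as the image container; the function name `binarize_by_blue` and the sample pixels are hypothetical and not taken from the paper.

```python
import numpy as np

def binarize_by_blue(image_rgb, low=8, high=150):
    """Return a 0/1 mask: 1 (tobacco) where low <= blue <= high, else 0 (background).

    Assumes the last axis is ordered R, G, B and that both bounds are inclusive.
    """
    blue = image_rgb[..., 2]
    return ((blue >= low) & (blue <= high)).astype(np.uint8)

# Hypothetical 2 x 2 image: leaf-like pixels, a white background pixel, a dark shadow pixel.
pixels = np.array(
    [[[200, 120, 40], [255, 255, 250]],
     [[180, 100, 5], [190, 140, 150]]],
    dtype=np.uint8,
)
mask = binarize_by_blue(pixels)
print(mask)
```

Whether the paper treats the boundary values 8 and 150 as tobacco or background is not stated, so the inclusive comparison here is only one possible reading.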

Fig. 8.1 Background color value statistics


R. Luo et al.

Fig. 8.2 Tobacco color value statistical

Fig. 8.3 Binarization result of tobacco image

8.2.3 Histogram Analysis

This paper selects the green channel, which preserves detail more completely, for histogram analysis, and the subsequent steps use the color values of the green channel. The pixel value of each channel in the image is represented by an 8-bit binary number, so 2^8 = 256 pixel values can be obtained; since the computer counts pixel values from 0, the value range is 0–255. Because the 256-bin histogram is too finely divided, and the color of tobacco basically fluctuates within a range, reducing the 256 dimensions to fewer dimensions may be more conducive to observation and summary. The dimension is reduced to 128, 64, 32, 16, and 8 for comparison. The principle is to divide the color value by 2, 4, 8, 16, and 32, respectively, and then draw the histogram with the new dimension. A single-leaf example is shown in Table 8.1. It can be seen from the table that the trend becomes obvious after reducing the dimension. From the 8-dimensional histogram, it can be seen that B2F is mainly distributed in areas 3–5 and concentrated in area 4; C3F is mainly distributed in areas 4–5 and

8 An Experimental Study on Extraction Method …


Table 8.1 Histogram summary of each dimension (columns: B2F, C3F, X2F; rows: 128, 64, 32, 16, and 8 dimensions)

concentrated in area 5; X2F is mainly distributed in areas 5–6 and concentrated in area 6, which correspond to orange, light orange, and lemon yellow in tobacco, respectively. The color value ranges of the 8 dimensions are listed in Table 8.2 (B2F, C3F, and X2F are different grades of tobacco quality).

Table 8.2 Color value statistics of each region

Area    Color value
1       0–31
2       32–63
3       64–95
4       96–127
5       128–159
6       160–191
7       192–223
8       224–255
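The mapping from a green-channel value to one of the eight areas in Table 8.2 amounts to an integer division by the bin width (32 for 8 dimensions). The following sketch assumes NumPy and 1-based area numbering as in the table; the function names `area_index` and `reduced_histogram` are illustrative, not from the paper.

```python
import numpy as np

def area_index(green_values, n_bins=8):
    """Map 0-255 color values to 1-based area numbers (bin width = 256 / n_bins)."""
    width = 256 // n_bins
    return np.asarray(green_values, dtype=np.int64) // width + 1

def reduced_histogram(green_values, n_bins=8):
    """Count how many values fall into each of the n_bins areas."""
    areas = area_index(green_values, n_bins)
    # Index 0 of bincount is unused because areas start at 1.
    return np.bincount(areas, minlength=n_bins + 1)[1:]

# Hypothetical green-channel samples at the edges of several areas.
samples = [0, 31, 32, 127, 128, 255]
print(area_index(samples))
print(reduced_histogram(samples))
```

Passing `n_bins=16` or `n_bins=32` gives the coarser-to-finer histograms compared in Table 8.1 under the same assumption.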


Table 8.3 Statistical table of expected value of each level

Level    μ    σ
B2F      4    ≈0.7
C3F      5    ≈0.7
X2F      6    ≈0.7

8.2.4 Color Value Analysis

After obtaining the 8-dimensional color value table, all images are processed statistically to observe the regional distribution of the three levels. The expected values for each level are shown in Table 8.3, where μ represents the mathematical expectation and σ the variance. It can be seen that the color of a tobacco leaf can be determined either by observing the peak of the histogram or by calculating the expected value of the normal distribution.
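As a rough illustration of computing the expected value from an 8-dimensional histogram, the sketch below treats the normalized bin counts as a discrete distribution over areas 1 to 8. The counts are invented for the example, and the function name `histogram_mean_var` is hypothetical; the paper does not give its exact estimation procedure.

```python
import numpy as np

def histogram_mean_var(counts):
    """Return (mean, variance) of the area number under the normalized histogram."""
    counts = np.asarray(counts, dtype=float)
    areas = np.arange(1, counts.size + 1)   # areas are numbered from 1
    probs = counts / counts.sum()            # assumes at least one non-zero count
    mean = float((areas * probs).sum())
    var = float((((areas - mean) ** 2) * probs).sum())
    return mean, var

# Hypothetical histogram concentrated around area 4.
mean, var = histogram_mean_var([0, 0, 25, 50, 25, 0, 0, 0])
print(mean, var)
```

The standard deviation, if needed, is the square root of the returned variance.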

8.2.5 Regional Analysis

The distribution of each region is as follows:

(1) The count in area 1 is essentially 0, so tobacco color values can be considered not to reach this area.
(2) Area 2 accounts for 4% in the upper leaf, 1% in the middle leaf, and a small part in the lower leaf. This area corresponds to the color of variegation and petioles. Because upper leaves are more mature, they contain relatively more variegation and petiole.
(3) Area 3 occurs mainly in the upper leaf, at 25%, and belongs to the reddish brown region.
(4) Area 4 is the main region of the upper leaf color value and belongs to the orange region.
(5) Area 5 is the main region of the middle leaf color value and belongs to the light orange region.
(6) Area 6 is the main region of the lower leaf color value and belongs to the lemon yellow region.
(7) Area 7 appears only in the lower leaf and belongs to the light lemon yellow region.
(8) The count in area 8 is also essentially 0, so tobacco color values can likewise be considered not to reach this area.

Therefore, a new color value area table can be obtained, as listed in Table 8.4.

Table 8.4 New color value area table

Area    Characteristic interpretation
2       Variegated and petiole
3       Red brown
4       Orange red
5       Light orange
6       Lemon yellow
7       Light lemon yellow

Table 8.5 Conformity assessment results

Area 5 (%)    Area 4 (%)    Area 3 (%)    Conclusion
≥ 75          ≥ 65          ≥ 55          Excellent
≥ 70          ≥ 60          ≥ 50          Good
≥ 65          ≥ 55          ≥ 45          Commonly
≥ 60          ≥ 50          ≥ 40          Bad
< 60          < 50          < 40          Unqualified

8.3 Online Consistency Test of Tobacco Leaves

To determine the effectiveness of the above standards, image data were collected from a tobacco sorter from 9:51 to 11:06. At a rate of one image every 3 s, a total of 1487 images were collected. Manual online sampling inspection was carried out during collection, and the corresponding purity evaluations were recorded. The collected images were then screened and preprocessed, followed by image segmentation and feature extraction. Finally, the effectiveness of the standard was evaluated using the statistical histogram of the tobacco images. The evaluation results are shown in Table 8.5. As can be seen from Table 8.5, the standard is effective.

8.4 Summary

Based on the tobacco leaf color value analysis experiment, this paper summarizes the detection and analysis method for tobacco leaf color values: in a dark room (dark environment), a D65 standard light source [8, 9] is used with a white background; an industrial camera captures the appearance of the tobacco leaf; the green channel image is selected; the expected value is summarized statistically and the histogram is reduced to 8 dimensions; and the color value area table is used to interpret the color characteristics. The algorithm for separating tobacco leaf from background in the image was derived from the tobacco leaf color value analysis experiment, and the image is binarized


according to the threshold. Tobacco leaf pixels are set to 1 when the blue-channel value lies between 8 and 150; all other pixels are treated as background. In this way, the separation of tobacco leaf and background is completed. The basic distribution of the color values of B2F, C3F, and X2F tobacco leaves was collected. The method for online determination of tobacco consistency was established through the online consistency experiment, and the evaluation standard was obtained. The effectiveness of the standard was confirmed by the corresponding verification experiments.

Acknowledgements This work was supported in part by a grant from the Science and Technology Projects of Yunnan Provincial Company of China National Tobacco Corporation (No. 2021530000242043).

References

1. Gonzalez, R.C., Woods, R.E.: Digital Image Processing (Ruan, Q. (trans.)). Electronic Industry Press (2017)
2. Bertold Klaus Paul Horn: Robot Vision (Liang, W. (trans.)). China Youth Publishing House (1986)
3. Li, H.D., Liu, J.L., Ye, X.Q., Gu, W.K.: Quantum statistical method for estimating spectra directly from RGB. Chin. J. Image Graph. 02, 28–32 (1999)
4. GB/T 3977-2008: Specification of Colors
5. GB/T 3978-2008: Standard Illuminants and Geometric Conditions
6. GB/T 3979-2008: Methods for the Measurement of Object Color
7. Zhang, J.P., Wu, S.Y., Fang, R.M., Gao, L.G.: Measurement and analysis of tobacco leaf color. J. Jiangsu Inst. Technol. 04, 7–14 (1993)
8. GB/T 20146-2006/CIE S 00:1999: CIE Standard Illuminants for Colorimetry
9. GB/T 20147-2006/CIE 10527:1991: CIE Standard Colorimetric Observers

Chapter 9

Research on the Distribution Map of Weeds in Rice Field Based on SegNet

Sheng Zhu, Shihao Li, and Ze Yang

Abstract Rice is one of the major economic crops in China. Weeds in paddy fields compete with rice for natural resources such as sunlight, water, and soil. At present, most farmers use uniform spraying for weeding. Unreasonable use of herbicides has a serious impact on the quality of rice. Visible light images of rice paddies taken by an unmanned aerial vehicle (UAV) are used in this paper. Three semantic segmentation models, FCN, SegNet, and U-Net, were used to recognize the rice and weed images. The pixel accuracy (PA) of the three models is 88.8%, 89.4%, and 84.5%, respectively. The mean intersection over union (MIoU) is 61.4%, 68.8%, and 64.0%, respectively. The experimental results show that the analysis of UAV remote sensing images based on the SegNet deep learning model can effectively reflect the difference between rice and weeds. The distribution map of weeds in paddy fields was obtained and can then be used to guide the plant protection UAV in applying pesticides accurately.

9.1 Introduction

At present, the main method of weed control in paddy fields is chemical weed control. Spraying the whole field evenly with the same dosage [1] causes crop phytotoxicity, soil and water pollution, pesticide residues in rice, and other problems [2]. Precise spraying can effectively reduce the use of herbicides [3] while maintaining efficacy. This requires accurate identification and detection of crops and weeds in farmland so that herbicides can be applied precisely to the targeted weeds, and large-scale agricultural production has become a trend [4]. With the development of science and technology, research on machine learning [5] and pattern recognition has made considerable progress. Among these fields, computer vision [6] and natural language processing [7] have become research hotspots. This provides technical support for the integration and development of artificial intelligence and agricultural science. Barrero et al. [8] used a neural network (NN) based on aerial images to detect weeds in paddy fields. The experimental results showed that weed detection on the test data reached 99% accuracy. Wang et al. [9] obtained images of weeds with complex background objects through a computer vision system and applied morphological dilation and erosion to separate plants and weeds with obvious differences in shape. The recognition accuracy reached 96%. These studies show that computer vision technology can identify weeds and crops well, providing a basis for the application of artificial intelligence in agriculture. This paper uses three pixel-based semantic segmentation models, FCN, SegNet, and U-Net, to identify weeds in paddy fields at the pixel level. Building on previous studies, this paper realizes pixel-level recognition of rice and weeds in farmland and obtains a better recognition effect. In terms of application, distribution maps of paddy field weeds under different thresholds are generated to provide a decision-making basis for precise pesticide application by plant protection unmanned aerial vehicles.

S. Zhu · S. Li · Z. Yang (B) College of Information Technology, Guangdong Technology College, Zhaoqing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_9

9.2 Materials and Methods

9.2.1 Experimental Data Collection

This experiment was carried out on September 16, 2019 on Shazai Island, Xinhui District, Jiangmen City. The image data used in this paper were collected from an organic rice reclamation field on the island. As shown in Fig. 9.1, the weeds in the farmland are relatively luxuriant, and their color differs from that of rice. The experiment uses a DJI Phantom 4 drone for aerial photography of the farmland. Based on aerial photography trials, and taking into account the actual performance of the drone and the significant color difference between rice and weeds, the flight height is set to 30 m. The horizontal and vertical overlap rates are 60% and 70%, respectively, and the resolution is 1.5 cm.

Fig. 9.1 Location of the study area

On the basis of research on weed recognition with remote sensing technology at home and abroad, a pixel-based method for rice and weed image recognition and segmentation is proposed. The professional stitching software Agisoft PhotoScan is used to generate the orthophoto map of the whole farmland, as shown in Fig. 9.2.

Fig. 9.2 Orthographic map of farmland

Because the resolution is high, the image is very large and requires a large amount of memory. Training deep learning models on it puts heavy pressure on limited computer resources, and the image is often difficult to process and analyze directly. Therefore, following the research of Huang et al. [3], this type of high-resolution orthophoto is cut into many non-intersecting small images, and the open-source image annotation software Labelme is used to label them. There are three categories: rice, weeds, and others. The small images cut in this article are 500 × 500 pixels. The training set and test set contain 1689 and 196 images, respectively. The technical roadmap of this article is shown in Fig. 9.3.
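Cutting the orthophoto into non-intersecting 500 × 500 images can be sketched as follows. This is only an illustration with NumPy arrays: the function name `cut_into_tiles` is hypothetical, and discarding incomplete tiles at the right and bottom edges is an assumption, since the paper does not say how edge remainders were handled.

```python
import numpy as np

def cut_into_tiles(image, tile=500):
    """Split an H x W x C array into non-overlapping tile x tile patches.

    Incomplete patches at the right and bottom edges are dropped (an assumption).
    """
    rows = image.shape[0] // tile
    cols = image.shape[1] // tile
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append(image[r * tile:(r + 1) * tile, c * tile:(c + 1) * tile])
    return tiles

# Hypothetical orthophoto of 1200 x 1700 pixels with 3 color channels.
orthophoto = np.zeros((1200, 1700, 3), dtype=np.uint8)
tiles = cut_into_tiles(orthophoto)
print(len(tiles), tiles[0].shape)
```

Each returned patch is a view into the original array, so the large orthophoto is not copied until a patch is modified or saved.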

9.2.2 Introduction to Image Segmentation Based on Deep Learning

Fully convolutional networks (FCNs) are a transformation of convolutional neural networks (CNNs). The most significant change is the replacement of the fully connected layers with convolutional layers. Fully convolutional networks can effectively extract image features and perform end-to-end pixel-level classification [11]; deep learning can automatically extract abstract features that help improve classification accuracy during model training [12]. Compared with the classic convolutional neural network, FCN introduces a deconvolution operation to upsample the feature map of the last convolutional layer for pixel-by-pixel classification [10]. At present, deep learning has surpassed traditional machine learning algorithms in image recognition, speech recognition, information retrieval, and other fields [13]. Choosing a fully convolutional network to segment images at the pixel level is therefore a significant innovation.

Fig. 9.3 Technology roadmap. The flowchart comprises the following steps:
(1) Collect test data.
(2) After stitching, cutting, and labeling the collected images, divide the processed images into a training set and a test set.
(3) Obtain the recognition map after FCN, SegNet, and U-Net network recognition.
(4) Obtain the evaluation coefficient after FCN, SegNet, and U-Net network recognition.
(5) Generate a distribution map of weeds in rice fields under different pesticide spraying thresholds.
(6) Summary and analysis.

Building on fully convolutional networks, many excellent semantic segmentation models have emerged in recent years, such as SegNet with its encoder structure [14], PSPNet with a large-dimensional convolution kernel encoder-decoder structure [15], the U-Net model [16], whose shape resembles the letter "U," and the currently more advanced DeepLab v3 model. This paper also selects SegNet, an end-to-end pixel-level image segmentation model with a symmetric encoder-decoder structure. The biggest advantages of SegNet are its high memory and computational efficiency and the relatively small number of parameters that need to be trained [14]. The encoder in SegNet converts high-dimensional vectors into low-dimensional vectors, realizing low-dimensional extraction of high-dimensional features [16].

In 2015, Olaf Ronneberger, Philipp Fischer, and others proposed the U-Net convolutional network structure for biomedical image segmentation [13]. U-Net is also an improvement on FCN. The structure of the U-Net segmentation algorithm is completely symmetrical, and its decoder applies convolution and deepening operations, whereas FCN only performs upsampling [16]. Compared with FCN and DeepLab, U-Net performs upsampling four times in total and uses skip connections within the same stage instead of directly supervising and back-propagating the loss on high-level semantic features. This ensures that the final recovered feature maps incorporate more low-level features and fuses features of different scales, so that multi-scale prediction and deep supervision can be performed. The four upsampling steps also allow the segmentation map to recover edges and other details more finely.
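The way a skip connection fuses a decoder feature map with the encoder feature map of the same stage can be shown on array shapes alone. The sketch below is not the U-Net implementation used in the paper: it swaps the learned up-convolution for plain nearest-neighbour upsampling, and the function names and feature-map sizes are hypothetical.

```python
import numpy as np

def upsample2x(feature):
    """Nearest-neighbour 2x upsampling of an H x W x C feature map (stand-in for a learned up-convolution)."""
    return feature.repeat(2, axis=0).repeat(2, axis=1)

def skip_connect(decoder_feature, encoder_feature):
    """Upsample the decoder feature and concatenate it with the same-stage encoder feature along channels."""
    up = upsample2x(decoder_feature)
    if up.shape[:2] != encoder_feature.shape[:2]:
        raise ValueError("spatial sizes must match after upsampling")
    return np.concatenate([encoder_feature, up], axis=-1)

# Hypothetical sizes: a 4 x 4 x 16 decoder map fused with an 8 x 8 x 8 encoder map.
decoder = np.ones((4, 4, 16))
encoder = np.zeros((8, 8, 8))
fused = skip_connect(decoder, encoder)
print(fused.shape)
```

The fused map keeps the encoder's low-level channels next to the upsampled high-level channels, which is the property the paragraph above attributes to skip connections.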

9.3 Results and Analysis

9.3.1 Training Result Display

All experiments in this article were run on the Ubuntu 16.04 operating system, using the TensorFlow deep learning framework on a hardware platform based on an NVIDIA GTX 1060 GPU. The same parameters are set for the three models. The batch_size is set to 1, the initial learning rate to 0.00001, the learning rate change index (gamma) to 0.1, and the momentum parameter (momentum) to 0.9. The AdamOptimizer function is used to update the network weights until the network converges. Tables 9.1, 9.2, and 9.3 show the training results of the three models FCN, SegNet, and U-Net, respectively, for the rice weed images in this paper.

Table 9.1 Results based on FCN

Steps      PA (%)    MIoU (%)
2500       82.5      54.3
10,000     87.0      59.6
20,000     87.6      59.2
30,000     88.8      61.4
40,000     87.3      58.9
50,000     88.4      60.8
60,000     87.9      60.3

Table 9.2 Results based on SegNet

Steps      Train_MIoU (%)    Test_MIoU (%)
40,000     42.7              43.5
50,000     45.7              46.3
60,000     50.3              48.2
70,000     50.1              50.1
80,000     51.7              51.6
90,000     53.8              52.8
100,000    54.2              54.8

Table 9.3 Results based on U-Net

Epochs    Precision (%)
5         83.2
10        84.1
15        87.9
20        90.2
25        91.0
30        92.3
35        90.7
40        89.9

It can be seen that the FCN converges best when the number of training steps reaches 30,000. With the fully convolutional network (FCN), the PA of rice weed image recognition is 88.8%, and the mean intersection over union (MIoU) reaches 61.4%. The two metrics are defined under the following assumption: there are N + 1 categories (including an empty category or background), and $n_{ij}$ is the number of pixels that belong to category i but are predicted as category j. That is, $n_{ii}$ is the number of correctly classified pixels of category i, while for a given category i the off-diagonal terms $n_{ij}$ and $n_{ji}$ (j ≠ i) count its false negatives and false positives, respectively.

$$\mathrm{PA} = \frac{\sum_{i=0}^{N} n_{ii}}{\sum_{i=0}^{N}\sum_{j=0}^{N} n_{ij}} \tag{9.1}$$

$$\mathrm{MIoU} = \frac{1}{N+1}\sum_{i=0}^{N} \frac{n_{ii}}{\sum_{j=0}^{N} n_{ij} + \sum_{j=0}^{N} n_{ji} - n_{ii}} \tag{9.2}$$
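Equations (9.1) and (9.2) can be computed directly from a confusion matrix whose entry (i, j) counts pixels of true category i predicted as category j. The following is a minimal NumPy sketch; the function names and the toy labels are hypothetical, and it assumes every category appears at least once in the labels or the predictions so that no union is zero.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Entry (i, j) is the number of pixels of true class i predicted as class j."""
    idx = n_classes * np.asarray(y_true).ravel() + np.asarray(y_pred).ravel()
    return np.bincount(idx, minlength=n_classes ** 2).reshape(n_classes, n_classes)

def pixel_accuracy(cm):
    """Eq. (9.1): correctly classified pixels divided by all pixels."""
    return np.trace(cm) / cm.sum()

def mean_iou(cm):
    """Eq. (9.2): per-class intersection over union, averaged over classes."""
    tp = np.diag(cm)
    union = cm.sum(axis=1) + cm.sum(axis=0) - tp
    return float(np.mean(tp / union))

# Toy example with three classes (e.g. rice, weeds, others) and six pixels.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, 3)
print(pixel_accuracy(cm), mean_iou(cm))
```

For the toy labels, four of six pixels are correct, and the per-class IoU values are 1/3, 2/3, and 1/2.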

Based on the U-Net model, when training reaches 30 epochs, the validation accuracy reaches 92.3%; as training continues beyond that point, the validation accuracy begins to decrease.

9.3.2 Establishment of Unified Evaluation Coefficient of Model

PA and MIoU are the most basic traditional evaluation indicators for semantic segmentation and are used to compare the three models in this article [10]. Table 9.4 shows the two unified evaluation coefficients for the three models.

Table 9.4 Comparison of three models

Model     PA (%)    MIoU (%)
FCN       88.8      61.4
SegNet    89.4      68.8
U-Net     84.5      64.0

From the values in Table 9.4, it can be concluded that the SegNet model gives the highest PA and MIoU between the recognition maps and the annotation maps of the test set among the three models, at 89.4% and 68.8%, respectively. Figure 9.4a–d shows the farmland orthographic map and the farmland recognition maps generated by the trained FCN, SegNet, and U-Net models.

Fig. 9.4 Comparison of farmland identification map and original map

9.4 Conclusion

This experiment not only recognizes rice and weeds at the pixel level in block-shaped paddy fields with good recognition accuracy, but also generates a distribution map of rice field weeds that can be applied to precise spraying by plant protection drones. There is a certain degree of innovation in both the identification method and the practical application. However, there are also shortcomings: the amount of data in this article is small, which is slightly insufficient for verifying the generalization of the model; the generated weed distribution map does not carry accurate geographic coordinates of the operation area; and the structure of the algorithm model was not modified to improve the recognition accuracy. In the next step, in-depth research can be conducted on these aspects.

Acknowledgements This work was supported by the Guangdong Province Young Innovative Talents Project under Grant No. 2019KQNCX234 and the "Innovative and Strong School Project" scientific research project of Guangdong University of Technology under Grant No. 2021GKJZK009.

References

1. López-Granados, F., Torres-Sánchez, J., Serrano-Pérez, A., de Castro, A.I., Mesas-Carrascosa, F.J., Pena, J.M.: Early season weed mapping in sunflower using UAV technology: variability of herbicide treatment maps against weed thresholds. Precision Agric. 17(2), 183–199 (2016)
2. Liu, Y., Liu, B., Wang, X.F., et al.: Problems and countermeasures of chemical weeding in China. Pesticide 07, 289–293 (2005)
3. Huang, H.H., Deng, J.Z., Lan, Y.B.: A fully convolutional network for weed mapping of unmanned aerial vehicle (UAV) imagery. PLoS ONE 13(4), e0196302 (2018)
4. Luo, X., Liao, J., Hu, L., Zang, Y., Zhou, Z.: Improving agricultural mechanization level to promote agricultural sustainable development. Trans. Chin. Soc. Agric. Eng. 32(1) (2016)
5. Liu, G.D., Pan, Z.G., Cheng, X., et al.: Overview of machine learning technology in human motion synthesis. J. Comput. Aided Des. Graph. 22(009), 1619–1627 (2010)
6. Zhang, H., Wang, K.F., Wang, F.Y.: Application progress and prospect of deep learning in target vision detection. J. Autom. 43(008), 1289–1305 (2017)
7. Li, B.L., Chen, Y.Z., Yu, S.W.: Review of information extraction. Comput. Eng. Appl. 039(010), 1–5, 66 (2003)
8. Barrero, O., Rojas, D., Gonzalez, C., Perdomo, S.: Weed detection in rice fields using aerial images and neural networks. In: 2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), pp. 1–4, IEEE (2016)
9. Wang, S.W., Zhang, C.L., Fang, J.L.: Research on field weed recognition based on computer vision. In: 2005 Academic Annual Meeting of China Agricultural Engineering Society (2005)
10. Long, J., Shelhamer, E., Darrell, T.: Fully convolutional networks for semantic segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(4), 640–651 (2014)
11. Wang, M.: Research on image semantic segmentation based on deep convolution neural network. North China Electric Power University (2019)
12. Lecun, Y., Bengio, Y., Hinton, G.: Deep learning. Nature 521(7553), 436–444 (2015)
13. Yang, J.Y., Zhou, Z.X., Du, Z.R., et al.: Rural construction land extraction from high-resolution remote sensing images based on segnet semantic model. J. Agric. Eng. 35(05), 259–266 (2019)
14. Badrinarayanan, V., Kendall, A., Cipolla, R.: Segnet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39(12), 2481–2495 (2017)
15. Zhao, H.S., Shi, J.P., Qi, X.J., Wang, X., Jia, J.: Pyramid scene parsing network. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2881–2890 (2017)
16. Li, Z.N.: Semantic segmentation of medical cervical cell image based on multi-scale multi input convolution neural network. Xiangtan University (2019)

Chapter 10

Research on the Application of Artificial Intelligence in Unstructured Data Resource Management of Power News Information

Jia Li, Shaoming Lai, and Chenlan Gu

Abstract With the continuous development of China's social economy, artificial intelligence technology has been widely used in the field of news communication. For the power industry in particular, using artificial intelligence to collect and manage news information data can not only improve actual work efficiency but also provide more news information for the development of the industry. Therefore, on the basis of understanding the development status of Power News Information based on artificial intelligence and of the unstructured data resource management platform, this paper discusses the system design requirements in depth and carries out a functional experimental analysis of the designed system.

10.1 Introduction

According to optimization practice for unstructured databases, rational use of electric power news information data can speed up the development of industrial enterprises and improve the level of information management. Analysis of the operation of the power industry in recent years shows that the information, news, video, pictures, email, and other related documents of electric power enterprises roughly double every year. About 80% of this is unstructured data, mainly stored in the different business systems and computer platforms of the enterprises, whereas the structured databases in common use can handle only about 20% of the data. The hidden resources contained in the unstructured data need to be managed and mined by enterprises. To find valuable data more effectively, researchers have proposed language methods, cross-filtering, keyword search, and other methods to improve the access and processing of unstructured data. The most common enterprise search engines can quantify targeted data collection, which can not only expand the scope of data collection but also improve actual productivity. The specific functions are shown in Fig. 10.1 [1–3].

Fig. 10.1 Functional analysis based on enterprise search engine

The power system classification method and comprehensive modeling method in pattern recognition shown in the figure are mainly used for data classification and comprehensive processing. This shows that although power load characteristics are time-varying and random, they still exhibit regularity, which also supports the feasibility of the overall detection method. Drawing on the characteristics of artificial intelligence, this system implements the functional analysis of a power equipment search engine, builds a power equipment file storage system, and extracts power equipment information to identify the relevant steps. Compared with existing technology, it addresses the defect that power equipment cannot be intelligently identified from unstructured natural language descriptions of outage information. Artificial intelligence word segmentation and similarity algorithms are used to extract standardized equipment information from unstructured natural language descriptions, which lays a data foundation for the statistics, analysis, and safety inspection of power planning and provides technical support for improving the level of power network planning.

From the perspective of business demand, Power News Information data, as an invisible asset, is highly valuable in practical application. Generally speaking, unstructured data consists of web pages, pictures, text files, and so on; it is data that cannot be represented in a two-dimensional table structure. Unstructured data management for Power News Information mainly covers collection and collation, secure storage, and information release, and thereby strengthens the enterprise's market competitiveness. In studying the design of the unstructured data resource management system for Power News Information, this paper mainly considers the following functions from the perspective of business requirements [4]:

(1) Uniform storage. Electric power enterprises and related departments have multiple business application systems, each designing its own storage database. The application efficiency of data resources can be improved by using artificial intelligence to control the internal unstructured data, optimize the data storage strategy, and ensure the internal configuration of the system [5].
(2) Centralized management. The platform is the core basis of unstructured data storage. Therefore, the design should comply with global data access standards and store data in a unified form, so that users can correctly access unstructured data and the data can be shared among business systems.
(3) Lifecycle management. Integrating lifecycle management with data management ensures good interaction between data creators and users, as well as unified storage management and coordinated management along the whole line, so that data can be transmitted seamlessly between departments [6].
(4) External services. Nowadays, most power enterprises increase their management investment when processing unstructured news data, which requires improving retrieval and access efficiency while controlling the storage space of unstructured data. Therefore, a unified external service interface should be proposed in the platform design, and basic public services such as access, storage, and content management should be provided, so as to improve the application level of unstructured data [7].
(5) Multi-class systems. According to the data construction standards of the management platform, the platform should have a unified data access interface and call external business systems to improve the access level of the system.
(6) Data processing and decision support. In processing Power News Information data resources, basic data should be extracted to provide a reference for enterprise decision-making. Typically, this requires data mining, text analysis, ETL, and similar techniques. At the same time, the relevance between news data items should be maintained, so as to provide technical support for subsequent text search and decision management.

J. Li (B) · S. Lai · C. Gu Yingda Media Investment Group, Beijing, China
© The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2022 L. C. Jain et al. (eds.), 3D Imaging—Multidimensional Signal Processing and Deep Learning, Smart Innovation, Systems and Technologies 298, https://doi.org/10.1007/978-981-19-2452-1_10


In the study of business processes, based on the current state of unstructured information management and of news data utilization in the power industry, the unstructured data management framework based on artificial intelligence is divided into three levels: the data access layer, the data service layer, and the data storage layer [8]. Figure 10.2 shows the process of storing data in the system. After the data access layer obtains the transmitted news information, it shards the uploaded files as needed to form file slices. The news information and files are then encapsulated, processed, and stored in the data storage layer. After successful storage, the relevant information is fed back to the data service layer. At the same time, the metadata is updated, and the success message is finally fed back to the data access layer. When the data access layer receives this notification, the file storage business process ends. Figure 10.3 shows the process of reading data from the system. After the data access layer obtains the transmitted news location information, it encapsulates the request and sends it to the data service layer. After the data service layer receives the request, the file slice data is read and downloaded and then passed to the client user in the form of fragments. After the user obtains the file slices, they are merged to restore the complete stored file. Completion of this operation marks the end of the read business process [9].
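The sharding on upload and the merging on download described above can be sketched in a few lines. This is an illustration only: the function names, the fixed slice size, and the use of an index to order slices are assumptions, and the encapsulation, metadata update, and layer-to-layer messaging of the actual system are left out.

```python
def split_into_slices(data: bytes, slice_size: int):
    """Shard a file's bytes into (index, chunk) slices of at most slice_size bytes."""
    return [
        (offset // slice_size, data[offset:offset + slice_size])
        for offset in range(0, len(data), slice_size)
    ]

def merge_slices(slices):
    """Restore the original bytes from slices, regardless of arrival order."""
    ordered = sorted(slices, key=lambda item: item[0])
    return b"".join(chunk for _, chunk in ordered)

# Hypothetical news file content, sharded into 16-byte slices.
news_file = b"power news item" * 10
slices = split_into_slices(news_file, 16)
restored = merge_slices(reversed(slices))  # slices may arrive out of order
print(len(slices), restored == news_file)
```

Keeping the slice index with each chunk is what lets the client restore the file even when fragments are delivered in a different order from the one in which they were stored.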

Fig. 10.2 File storage flowchart

10 Research on the Application of Artificial Intelligence …


Fig. 10.3 File reading flowchart

10.2 Methods

The system architecture diagram in Fig. 10.4 shows that the architecture of the Power News Information unstructured data resource management system based on artificial intelligence is divided into three modules: the application platform, the intelligent analysis engine for unstructured data and distributed computing. The application platform provides functions such as operation management, cloud analysis and cloud storage. The unstructured data intelligent analysis engine covers the intelligent analysis model, multimedia recognition, automatic collection, entity extraction and so on. The distributed computing architecture covers common protocols, management, availability and so on. The management part also includes system storage monitoring, data adaptation and so on [10–12]. Based on an analysis of the news resources stored in the power industry in recent years, the corresponding technical architecture for the unstructured data management platform is shown in Fig. 10.5. From a practical point of view, this technical architecture has two main advantages. The first is distributed computing. This method is relatively complex: it subdivides a large problem into several independent parts and computes them in a decentralized way, which improves the parallel computing level of the system, safeguards its operating efficiency and yields final results that can be applied to the construction and management of power enterprises. The other is distributed storage. Because unstructured data requires relatively large storage space, power companies need to spend a lot of money on disk space in the early stage of construction. At this point, in order to make reasonable use of the internal storage of news information, we can


Fig. 10.4 System architecture

integrate storage space that is too scattered and use virtual devices to provide the corresponding services. The system module structure should be designed from top to bottom, so that developers can clearly understand the relationship between the system hierarchy and its functions, and so that each function module can share and call the industry news data. The common functions of the Power News Information unstructured data resource management platform studied in this paper include the retrieval system, the storage system and intelligent identification, which ensure that users of the system can quickly search the unstructured news data. The specific structure is shown in Fig. 10.6.
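The distributed computing idea described above for the technical architecture, subdividing a problem into independent parts, computing each part separately and combining the results, can be sketched as follows. This is only an illustration under stated assumptions: word counting over news texts is a stand-in task chosen here, the function names are hypothetical, and a local thread pool takes the place of the separate computing nodes a real deployment would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_parts(items, parts):
    """Divide a list into at most `parts` contiguous, independent chunks."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def count_words(documents):
    """Work done on one independent part: count the words in its documents."""
    counts = Counter()
    for text in documents:
        counts.update(text.lower().split())
    return counts


def distributed_word_count(documents, workers=3):
    """Split the documents, process each chunk in the pool, merge partial results."""
    chunks = split_into_parts(documents, workers)
    total = Counter()
    # The thread pool is an assumption standing in for distributed worker nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_words, chunks):
            total += partial
    return total
```

Because each chunk is processed without reference to the others, the same split, compute and merge structure carries over when the workers are separate machines rather than threads.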


Fig. 10.5 Technical structure

Fig. 10.6 System functional structure


Table 10.1 Search results 0 6

10