Agro-geoinformatics: Theory and Practice (Springer Remote Sensing/Photogrammetry) [1st ed. 2021] 3030663868, 9783030663865

This volume collects and presents the fundamentals, tools, and processes of utilizing geospatial information technologie

English Pages 424 [419] Year 2021

Table of contents :
Contents
Chapter 1: Introduction to Agro-Geoinformatics: Theory and Practices
Chapter 2: Remote Sensing for Agriculture
2.1 Introduction
2.2 Major Agriculture-Related Remote Sensing Data Sources
2.3 Agricultural Applications
2.3.1 Crop Type Identification
2.3.2 Crop Phenology Mapping
2.3.3 Crop Yield Estimation
2.3.4 Crop Evapotranspiration (ET) and Water Use
2.3.5 Soil Moisture Retrieval
2.4 Summary
References
Chapter 3: GIS Fundamentals for Agriculture
3.1 Introduction
3.2 GIS: The Geospatial Approach
3.3 GIS Application in Agriculture
3.3.1 GIS Mapping and Analytical Techniques
3.3.2 Spatial Database for Agricultural Systems
3.3.3 GIS-Based Modeling in Agricultural Application
3.3.3.1 Environment Models Linked to GIS
3.3.3.2 Crop Yield Prediction Based on GIS
3.3.3.3 Agricultural Management Models Using GIS
3.3.4 Decision Support System
3.3.4.1 Traditional Decision Support Systems
3.3.4.2 New Direction and Trends in Decision Support System
3.4 Conclusion
References
Chapter 4: Agro-geoinformatics Data Sources and Sourcing
4.1 Introduction
4.2 Data Sources
4.2.1 Satellite
4.2.2 Airborne Camera
4.2.3 In Situ Sensors
4.2.4 Manual Reports
4.2.5 Summary
4.3 Sourcing
4.3.1 Conventional Sourcing
4.3.2 Cloud Sourcing
4.3.3 Crowdsourcing
4.4 Conclusion
References
Chapter 5: Standards and Interoperability
5.1 Introduction
5.2 Standard Organizations
5.2.1 ISO
5.2.2 OGC
5.2.3 CEN
5.2.4 ANSI
5.3 Typical Standard Development Process
5.4 Types of Standards
5.4.1 Data Content or Encoding Standard
5.4.2 Metadata Content and Encoding Standard
5.4.3 Data Service Standard
5.4.4 Statistical Standards and Methodological Guidelines
5.5 Conclusion
References
Chapter 6: Image Processing Methods in Agricultural Observation Systems
6.1 Introduction
6.2 The Fundamentals of Digital Image Processing
6.2.1 Origins and Definitions
6.2.2 Basic Steps in Image Processing
6.3 Hardware and Software
6.3.1 Image Processing Hardware
6.3.2 Image Processing Software
6.3.3 Mobile Device-Based Image Processing
6.3.4 Cloud-Based Image Processing
6.4 Agricultural Image Data Collection
6.4.1 In Situ Data Collection
6.4.2 Airborne-Based Data Collection
6.4.3 Space-Borne-Based Data Collection
6.4.4 Big Data Challenge in Agricultural Image Data Collection
6.5 Agro-Geoinformation Extraction from Image
6.5.1 Knowledge-Based Expert System
6.5.2 Machine Learning-Based Decision Tree
6.5.3 Artificial Neural Network
6.5.4 A Case Study
6.6 Summary
References
Chapter 7: Data Fusion in Agricultural Information Systems
7.1 Introduction
7.2 Agricultural Information Systems
7.3 Regression Model Example for Real-Time Yield Efficiency Monitoring
7.3.1 Phenological Stage-Based Data Segmentation
7.3.2 Agrometeorological Indices and Regression-Based Data Fusion for Yield Estimation
7.4 Neural Networks for Data Fusion
7.5 Wavelets in Data Fusion
7.6 Convolutional Neural Networks
7.7 Conclusion
Appendix
References
Chapter 8: Big Data and Its Applications in Agro-Geoinformatics
8.1 Introduction
8.1.1 Challenges in Modern Agriculture
8.1.2 The Role of Big Data in Agriculture
8.2 Agricultural Big Data
8.2.1 Special Features of Agro-Big Data
8.2.2 State-of-the-Art Analysis Methods
8.3 Agro-Geoinformatics
8.3.1 Definition
8.3.2 Agro-Geoinformatics: Connecting Agro-Big Data to Agricultural Applications
8.3.3 Related Research
8.4 Examples of Big Data Application in Agro-Geoinformatics
8.4.1 Agro-Sensor Web
8.4.2 GADMFS
8.4.3 CropScape
8.4.4 VegScape
8.4.5 RF-Class
8.4.6 SMAP Explorer
8.4.7 GeoFairy
8.4.8 CyberConnector COVALI
8.4.9 Geoweaver
8.5 Conclusion
References
Chapter 9: Land Parcel Identification
9.1 Introduction
9.2 Land Parcel and Agricultural Land Parcel
9.2.1 What Is Land Parcel?
9.2.2 Land Parcel in Agriculture
9.2.3 Techniques to Identify Land Parcel
9.3 Managing Land Parcel Information in Agro-Geoinformation Systems for Local Governments, Agencies, and Companies
9.4 Managing Land Parcel Information in Agro-Geoinformation Systems at State and National Levels
9.5 Approaches to Manage Land Parcel Information in Globe Agro-Geoinformation Systems - International Standards
9.6 Conclusion and Discussion
References
Chapter 10: Crop Pattern and Status Monitoring
10.1 Introduction
10.2 Crop Pattern Mapping
10.2.1 Statistical Approach
10.2.2 Remote Sensing Approach
10.2.3 Case Study - Operational National Cropland Mapping Programs
10.2.3.1 USA Cropland Data Layer
10.2.3.2 Canada Crop Inventory
10.2.4 Limitations and Perspectives
10.3 Crop Status Monitoring
10.3.1 Statistical Approach
10.3.2 Remote Sensing Approach
10.3.3 Case Study - Operational Remote Sensing Crop Condition Monitoring
10.3.3.1 National Crop Progress Monitoring System
10.3.3.2 Global Agricultural Monitoring
10.3.3.3 Other Operational Crop Status Monitoring Systems
10.3.4 Limitations and Perspectives
10.4 Conclusions
References
Chapter 11: Crop Growth Modeling and Yield Forecasting
11.1 Introduction
11.2 Statistical Modeling
11.3 Physiological/Physical-Based Modeling
11.4 Remote Sensing Monitoring of Crop Growth
11.5 Data Assimilation
11.5.1 Sequential Data Assimilation Algorithms
11.6 Conclusion
References
Chapter 12: Spatial and Temporal Monitoring System for Agriculture
12.1 Introduction
12.2 Related Work
12.3 Spatial and Temporal Monitoring Systems for Agriculture
12.3.1 Web Service-Based Near-Real-Time Global Agricultural Drought Monitoring System
12.3.2 Web Service-Based Near-Real-Time US Vegetation Condition Monitoring System
12.3.3 Web Service-Based Near-Real-Time US Flood and Progress Monitoring System
12.4 Conclusion
References
Chapter 13: Spatial Data Usage in Turkish Agriculture
13.1 Introduction
13.2 Parcel-Based Support Payment System
13.3 Land Parcel Identification System
13.3.1 Orthophoto Production
13.3.1.1 Geodetic Works
13.3.1.2 Post-processing of Aerial Imagery (AI)
13.3.1.3 DEM Production for the AI Areas
13.3.1.4 Orthophoto Production, Mosaicking, and Tile Cutting from the AI and SI
13.3.1.5 Radiometric Enhancement
13.3.1.6 Pan-Sharpening
13.3.2 Constraints for Aerial Imagery
13.3.2.1 Cloud Cover Percentage
13.3.2.2 Sun Angle
13.3.2.3 Crop Phenology
13.3.3 Orthophoto Features
13.3.4 Vector Data in LPIS
13.3.4.1 Generation
13.3.4.2 Controls
13.3.5 Usage of LPIS Data with Collaboration of Cadastre
13.4 Potential Usage of Spatial Database
13.5 Geostatistics through Spatial Database
13.5.1 Interpolation Methodology
13.6 Conclusion
References
Chapter 14: Geospatial Land Use and Land Cover Data for Improving Agricultural Area Sampling Frames
14.1 Introduction
14.2 Background
14.2.1 Related Work
14.2.2 NASS Area Sampling Frames
14.2.3 NASS Cropland Data Layer
14.2.4 NASS Cultivated Layer
14.3 Study areas
14.4 Automated Stratification Methodology
14.4.1 Stratification Method
14.4.2 Automatic Stratification Analysis and Results Evaluation
14.4.3 Comparison of Traditional and Automatic Stratification Results
14.5 Integration of Automatic Stratification into NASS Operations
14.5.1 Ancillary Data for Manual Review and Editing Process
14.5.2 Integration Process
14.6 Integration Results
14.6.1 Stratification Accuracy
14.6.2 Mean Stratum Percent Cultivation Range, Standard Deviations, and PSU Size
14.7 Integration Discussion
14.7.1 Stratification Accuracy
14.7.2 Mean Stratum Percent Cultivation Range, Standard Deviations, and PSU Size
14.7.3 Labor Cost
14.8 Conclusion
References
Chapter 15: Mapping and Monitoring of Soil Moisture, Evapotranspiration, and Agricultural Drought
15.1 Introduction
15.2 Soil Moisture
15.2.1 Methodology
15.2.2 Data
15.2.3 Results
15.3 Evapotranspiration
15.3.1 Methods
15.3.2 Data
15.3.3 Results
15.3.3.1 Validation at US-Skr
15.3.3.2 Validation at ARM-SGP Stations
15.4 Agricultural Drought
15.4.1 Normalized Difference Vegetation Index (NDVI)
15.4.2 Vegetation Condition Index (VCI)
15.4.3 Results
15.5 Conclusions
References
Chapter 16: Flood Monitoring and Crop Damage Assessment
16.1 Introduction
16.2 Remote Sensing on Flood Event Monitoring
16.2.1 Traditional Gauge-Based Flood Monitoring
16.2.2 Remote Sensing-Based Flood Monitoring
16.2.2.1 Remote Sensing in Flood Forecasting
16.2.2.2 Remote Sensing in Flood Mapping
16.2.3 GIS-Based Flood Modeling and Early Warning System
16.2.4 Event and Duration of the Flood
16.3 Flood Crop Damage Assessments
16.3.1 Classification Method
16.3.2 Band Ratioing (Vegetation Indices)
16.4 Case Study: NDVI-Based Corn Loss Assessment through Regression Model
16.4.1 Flood Event
16.4.2 Data
16.4.3 Study Area
16.4.4 Method
16.4.4.1 Pure Pixel Selection
16.4.4.2 Normal NDVI
16.4.4.3 NDVI Smoothing
16.4.4.4 Area under the Curve
16.4.4.5 Regression Model
16.4.5 Result
16.4.5.1 Regression Result
16.4.5.2 Model Estimation
16.5 Conclusion
References
Chapter 17: Remote Sensing-Based Mapping of Plastic-Mulched Land Cover
17.1 Introduction
17.2 A Decision-Tree Classifier for Extracting PML Using Landsat Imagery
17.2.1 Methodology
17.2.1.1 The Detectable Features of PML
17.2.1.2 Construction of the Decision-Tree Classifier
17.2.2 A Specific Example
17.2.2.1 Data Sets and Preprocessing
17.2.2.2 Experiment Results
17.3 A Threshold Model for Mapping PML Using MODIS Time Series Data
17.3.1 Methodology
17.3.2 A Specific Example
17.3.2.1 Data Sets and Preprocessing
17.3.2.2 Determination of Threshold Condition and Value
17.3.2.3 Detecting and Mapping PML
17.4 Subpixel Mapping of PML from MODIS Imagery Using Spatial Attraction Models
17.4.1 Methodology
17.4.1.1 Subpixel Mapping Theory
17.4.1.2 Subpixel/Pixel Spatial Attraction Model (SPSAM)
17.4.1.3 MSPSAM and MSAM
17.4.1.4 Improved Spatial Attraction Model (ISAM)
17.4.2 A Specific Example
17.4.2.1 Data Sets and Preprocessing
17.4.2.2 Experiment Results
17.5 Conclusion
References
Chapter 18: Design and Implementation of Geospatial Data Services for Agriculture
18.1 Introduction
18.2 Geospatial Data for Agriculture
18.2.1 Data Categories
18.2.1.1 CDL Data
18.2.1.2 Vegetation Index Data
18.2.1.3 Hydrological Data
18.2.1.4 Temperature Data
18.2.2 Data Life Cycle
18.3 Geospatial Interoperability and Standardization
18.3.1 Geospatial Web Service Interoperability Standards
18.3.2 Content Interoperability Standards
18.4 Geospatial Web Service Architecture for Agriculture
18.4.1 A Specific Example: CropScape
18.4.1.1 Application Layer
18.4.1.2 Service Layer
18.4.1.3 Data Layer
18.5 Geospatial Data Service Functionalities for Agriculture
18.5.1 Agricultural Data Management
18.5.2 Agricultural Data Analytics
18.6 Conclusion
References
Index

Author / Uploaded
Liping Di (editor)
Berk Üstündağ (editor)

Springer Remote Sensing/Photogrammetry

Liping Di Berk Üstündağ Editors

Agro-geoinformatics Theory and Practice

Springer Remote Sensing/Photogrammetry

The Springer Remote Sensing/Photogrammetry series seeks to publish a broad portfolio of scientific books, aiming at researchers, students, and everyone interested in the broad field of geospatial science and technologies. The series includes peer-reviewed monographs, edited volumes, textbooks, and conference proceedings. It covers the entire area of Remote Sensing, including, but not limited to, land, ocean, atmospheric science and meteorology, geophysics and tectonics, hydrology and water resources management, earth resources, geography and land information, image processing and analysis, satellite imagery, global positioning systems, archaeological investigations, and geomorphological surveying.

Series Advisory Board:
Marco Chini, Luxembourg Institute of Science and Technology (LIST), Belvaux, Luxembourg
Manfred Ehlers, University of Osnabrueck
Venkat Lakshmi, The University of South Carolina, USA
Norman Mueller, Geoscience Australia, Symonston, Australia
Alberto Refice, CNR-ISSIA, Bari, Italy
Fabio Rocca, Politecnico di Milano, Italy
Andrew Skidmore, The University of Twente, Enschede, The Netherlands
Krishna Vadrevu, The University of Maryland, College Park, USA

More information about this series at http://www.springer.com/series/10182

Editors

Liping Di
Center for Spatial Information Science and Systems
George Mason University
Fairfax, VA, USA

Berk Üstündağ
Faculty of Computer and Informatics Engineering
Istanbul Technical University
Istanbul, Turkey

ISSN 2198-0721
ISSN 2198-073X (electronic)
Springer Remote Sensing/Photogrammetry
ISBN 978-3-030-66386-5
ISBN 978-3-030-66387-2 (eBook)
https://doi.org/10.1007/978-3-030-66387-2

© Springer Nature Switzerland AG 2021

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Contents

1 Introduction to Agro-Geoinformatics: Theory and Practices
  Liping Di and Berk Üstündağ
2 Remote Sensing for Agriculture
  Feng Gao
3 GIS Fundamentals for Agriculture
  Junmei Tang
4 Agro-geoinformatics Data Sources and Sourcing
  Ziheng Sun, Liping Di, Hui Fang, Liying Guo, Xicheng Tan, Lili Jiang, and Zhongxin Chen
5 Standards and Interoperability
  Yuqi Bai
6 Image Processing Methods in Agricultural Observation Systems
  Chen Zhang and Li Lin
7 Data Fusion in Agricultural Information Systems
  Berk Üstündağ
8 Big Data and Its Applications in Agro-Geoinformatics
  Liping Di and Ziheng Sun
9 Land Parcel Identification
  Li Lin and Chen Zhang
10 Crop Pattern and Status Monitoring
  Eugene G. Yu and Zhengwei Yang
11 Crop Growth Modeling and Yield Forecasting
  Haizhu Pan and Zhongxin Chen
12 Spatial and Temporal Monitoring System for Agriculture
  Lei Hu and Peng Yue
13 Spatial Data Usage in Turkish Agriculture
  Hakan Erden and Murat Aslan
14 Geospatial Land Use and Land Cover Data for Improving Agricultural Area Sampling Frames
  Claire G. Boryan and Zhengwei Yang
15 Mapping and Monitoring of Soil Moisture, Evapotranspiration, and Agricultural Drought
  Ali Levent Yagci and M. Tugrul Yilmaz
16 Flood Monitoring and Crop Damage Assessment
  Ranjay M. Shrestha and Md. Shahinoor Rahman
17 Remote Sensing–Based Mapping of Plastic-Mulched Land Cover
  Lizhen Lu
18 Design and Implementation of Geospatial Data Services for Agriculture
  Lei Hu, Liping Di, and Peng Yue
Index

Chapter 1

Introduction to Agro-Geoinformatics: Theory and Practices

Liping Di and Berk Üstündağ

Abstract Agro-geoinformation, the agriculture-related geo-information, is key information in agricultural decision making and policy formulation. Agro-geoinformatics is the interdisciplinary field of study on the acquisition, processing, management, and application of agro-geoinformation. This book summarizes recent progress in agro-geoinformatics in both theory and practice. This introductory chapter provides an overview of the book and introduces the contents of each chapter.

Keywords Agro-geoinformation · Agro-geoinformatics · Agriculture · Information technology · Data science

Agro-geoinformation is a prominent factor in agricultural planning, decision-making, management, and policy formulation. Recent advances in informatics and geoinformatics have created new opportunities and challenges for agricultural applications and management systems. Agro-geoinformation covers agricultural data and information in the spatial, spatiotemporal, and administrative domains. Agro-geoinformatics, a new interdisciplinary field dealing with agro-geoinformation, plays an increasing role as efficiency requirements in agricultural activities continue to rise. It incorporates processing, storing, archiving, retrieving, managing, visualizing, analyzing, synthesizing, presenting, disseminating, and using agro-geoinformation through monitoring, communication, and computational systems and devices. Agro-geoinformation science and its applications deal with agricultural production and management activities at all scales, including parcel, basin, region, country, and globe.

L. Di (*)
George Mason University, Fairfax, VA, USA

B. Üstündağ
Istanbul Technical University, Istanbul, Turkey

© Springer Nature Switzerland AG 2021
L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_1

In recent years, especially in the last decade, the number of scientific and technical organizations and publications dedicated to geoinformatics applications in agriculture has increased exponentially. The IEEE International Geoscience and Remote Sensing Symposium (IGARSS) held special and invited sessions on agro-geoinformatics in 2009, 2010, and 2011. Since 2012, the International Conference on Agro-Geoinformatics has been organized annually and sponsored by the IEEE GRSS and major agricultural organizations, such as the US Department of Agriculture (USDA); agriculture ministries in China, Turkey, and Canada; the USDA National Agricultural Statistics Service (NASS); the Chinese Academy of Agricultural Sciences; the Food and Agriculture Organization (FAO) of the United Nations; and the World Meteorological Organization (WMO), as well as the major international geospatial standardization organization, the Open Geospatial Consortium (OGC). All these activities reflect the interest of a worldwide research community in agro-geoinformatics.

The book Agro-geoinformatics: Theory and Practices presents both fundamental topics and some state-of-the-art practical solutions in agro-geoinformatics. It covers geoinformation and cadastral data used in agricultural management, data collection, processing, fusion, visualization, information access, agricultural sustainability, crop monitoring, assessment, prediction, precision farming, and integrated management systems.

Statistics show that efficiency management and sustainability of resources have become key factors in agricultural production. Food demand is estimated to rise by around 50% in the next 30 years, while arable land per capita continuously decreases. Due to the increasing demand for meat, total meat production has increased approximately 100% in the last 30 years. The global average water footprint of beef is more than 15,000 liters of water per kilogram, and that of sheep meat is 10,000 liters per kilogram. The world's cereal yield index has also increased by 38% in the last 30 years to meet the increase in food demand. This has been achieved by using more fertilizers, pesticides, and water, in addition to the dissemination of good agricultural practices. There are physical, health, and sustainability limitations on fertilizer, pesticide, and water use per unit area. On the other hand, there is still a margin for increasing the overall efficiency of food production while preserving sustainability. For this purpose, precision farming, vertical farming, and data-driven, science-based management of agricultural resources, integrated logistics, irrigation, subsidy, and the integrated food supply chain are becoming key issues. They all need data, optimization, and integration. Therefore, monitoring systems, information systems, and their integration are significant in peacefully balancing the food demand for the next 30 years.

Recent developments in communication, information, and sensor technologies will also be able to provide several game-changing applications in agriculture. For example, the Narrowband Internet of Things (NB-IoT) is a communication standard that promises to connect battery-operated monitoring and control devices in the field for up to 10 years without requiring any specific infrastructure other than existing cellular networks. Convolutional neural network algorithms provide high-performance solutions for classification, data registration, image
registration, data fusion, nowcasting, forecasting, estimation, and optimization problems. Cloud-based services provide high-performance computational platforms and reliable data processing capabilities in a cost-efficient way to users connected through mobile applications, web services, or machine-to-machine (M2M) communication interfaces. The number of connected devices is expected to reach 29 billion by 2022, and 18 billion of them will be related to the IoT.

Sensor technology is also developing rapidly. There are various types of single-chip spectrometers with spectral resolution finer than 20 nm at different bands. Lab-on-a-chip solutions are used for DNA and RNA sequencing, protein analysis, and food monitoring. Biosensors are intended to be used in the agricultural and food sectors to ensure food quality and safety. They are promising alternatives to conventional analytical tools since they offer advantages in size, rapid response, specificity, and cost. The increasing performance of embedded processors enables advanced sensor fusion techniques in real time, even on low-power devices.

Satellite remote sensing technology enables very high spatial resolution, down to 0.3 m at optical bands. Recent hyperspectral imaging satellites have more than 200 bands. The availability of multi-temporal multispectral data is increasing through the Sentinel and Landsat missions. Hyperspectral image acquisition is also available with drones, unmanned aerial vehicles (UAVs), and other airborne platforms. This rich data availability requires efficient and reliable methods and platforms for monitoring, processing, and utilization for agriculture. Well-organized, structured data with contextual connections provide new capabilities in a big data environment. Machine learning solutions are beginning to take the place of plant-specific models, or they are used to adapt the models.

This book summarizes the theoretical aspects and practical applications in agro-geoinformatics. The topics covered by the book include the recent and established methods for the development of monitoring and data processing systems in agro-geoinformation systems.

Chapter 2, Remote Sensing for Agriculture, by Gao, introduces basic remote sensing concepts and applications in agriculture. Major freely available remote sensing data sources used in agricultural applications (MODIS, Landsat, Sentinel, etc.) and the properties of the relevant data products are presented here. Remote sensing applications for crop-type classification, crop growth monitoring, crop yield estimation, crop water use, and soil moisture estimation are introduced in this chapter.

The past decades have witnessed the rapid growth of GIS application in agriculture. Chapter 3, GIS Fundamentals for Agriculture, by Tang, introduces GIS-based modeling and spatial databases for agricultural systems. The current and future characteristics of GIS-based agricultural information systems are evaluated in this chapter.

Chapter 4, Agro-Geoinformatics Data Sources and Sourcing, by Sun et al., summarizes the state-of-the-art data sources and the sourcing methods in agro-geoinformatics. Data sources from satellite, airborne, in situ, and human reports are investigated and introduced. Spatial coverage, long observation history, and case-specific, site-specific, and question-specific concise nouns and numbers have
been evaluated for the classification of the data sources. Conventional, cloud-based, and crowdsourcing data sourcing options are also explained in this chapter.

A majority of the spatial data standards and models used in agro-geoinformation systems are set and recommended by international organizations, namely the International Organization for Standardization (ISO), the Open Geospatial Consortium (OGC), and the Food and Agriculture Organization (FAO). Chapter 5, Standards and Interoperability, by Bai, focuses on the standards for data content, metadata, and a variety of data services, including catalogue service, Web Map Service (WMS), Web Feature Service (WFS), and Web Coverage Service (WCS).

Agricultural image data collection includes in situ data collection, airborne data collection, and space-borne data collection. How to manage and analyze large volumes of high-resolution agricultural image data captured by different types of satellites is becoming a new challenge. Chapter 6, Image Processing Methods in Agricultural Observation Systems, by Zhang and Lin, explains agricultural image processing systems on different kinds of hardware and software platforms. Knowledge-based expert systems, machine learning-based decision trees, and artificial neural networks are evaluated for agricultural digital image processing in this chapter.

When some physical parameters are not feasible for direct monitoring at a specified spatial or temporal resolution, well-designed data fusion models enable their indirect acquisition by using other types of existing data or alternative sensory mechanisms. Data fusion is an increasing trend for information harvesting in accordance with computational power and efficiency. Chapter 7, Data Fusion in Agricultural Information Systems, by Üstündağ, presents data fusion principles and methods used in agriculture. Data fusion based on multiparameter regression, time-delay neural networks, wavelet neural networks, and convolutional neural networks is explained with example applications.

Agro-geoinformatics deals with collecting, managing, and analyzing agricultural-related geospatial data, which are domain-specific big data. Chapter 8, Big Data and Its Applications in Agro-Geoinformatics, by Di and Sun, presents the development of agro-big data-specific technology as a necessary supplement to the adoption of general big data technology in agro-geoinformation systems.

The EU has developed the land parcel identification system (LPIS) for agriculture policy studies, while a national land parcel database has been established in the United States for large-scale land parcel data services. Chapter 9, Land Parcel Identification, by Lin and Zhang, explains the applications, benefits, problems, and requirements of land parcel information in agro-geoinformation systems. This chapter also discusses the data size and performance issues of large-scale land parcel data structures, as well as the adoption of new techniques, such as remote sensing and GIS, that offer an alternative for measuring land parcels in agro-geoinformation systems.

Chapter 10, Crop Pattern and Status Monitoring, by Yu and Yang, compares sampling framework-based statistical approaches and remote sensing methods for monitoring crop pattern and status. The advancements of remote sensing and related processing capabilities make it possible to operationally monitor crop pattern and crop status in very high spatial and temporal resolutions.
Operational cases for monitoring crop pattern and status using remote sensing are reviewed in this chapter.

Chapter 11, Crop Growth Modeling and Yield Forecasting, by Pan and Chen, compares statistical models, crop growth models, and remote sensing data-dependent models for crop monitoring and yield forecasting.

Chapter 12, Spatial and Temporal Monitoring System for Agriculture, by Yue and Hu, reviews state-of-the-art operational agriculture monitoring systems at international, national, and regional levels. The requirements of current agricultural data systems are discussed, and the capabilities of spatial and temporal monitoring systems are analyzed.

Quality and proper acquisition planning of spatial data are important for efficient use of agricultural information systems. Chapter 13, Spatial Data Usage in Turkish Agriculture, by Erden and Aslan, presents an overview of spatial and geo-statistical data characteristics, acquisition, and processing methods in agriculture with some example applications in Turkey. This chapter also explains service-level compliance with political and regional data and process standards that enable comparative performance reports, as in the EU cases of LPIS, IACS, and FADN applications.

Chapter 14, Geospatial Land Use and Land Cover Data for Improving Agricultural Area Sampling Frames, by Boryan and Yang, presents and assesses an automatic stratification method and its integration with a traditional stratification method. Here, the results of the automatic stratification of NASS area frame PSUs of Arizona, Georgia, Ohio, Oklahoma, and Virginia indicated that the automated stratification method was more accurate in determining the US percent cultivation in intensively cropped areas and weaker in low agricultural areas.
In addition, this chapter introduces a hybrid operational process that integrates the automated stratification method with traditional area frame construction manual editing/review procedures in order to increase the operational area frame accuracy. It has been demonstrated that the proposed operational hybrid area frame construction process, based on available geospatial data, is easily applicable to the operations of other agencies or countries that conduct area frame-based surveys and have available geospatial cropland cover data.

Agricultural drought is one of the major disasters of the twenty-first century as the world population continues to grow exponentially. Monitoring and predicting droughts accurately is necessary for informing governments, farmers, and decision-makers so that proper actions can be taken in time to mitigate their devastating effects. Chapter 15, Mapping and Monitoring of Soil Moisture, Evapotranspiration, and Agricultural Drought, by Yagci and Yilmaz, explains drought as an interconnected phenomenon linking evapotranspiration (ET) and soil moisture. Three robust and proven satellite-based methods to map soil moisture, ET, and agricultural drought are discussed in the chapter. A model is proposed to estimate ET through the evaporative fraction (EF), tracking agricultural drought by using remote sensing data.

Flooding is one of the devastating natural disasters for agricultural production. Chapter 16, Flood Monitoring and Crop Damage Assessment, by Shrestha and Rahman, discusses the advancement of geoinformation flood monitoring systems, which use remote sensing data, as an alternative to traditional gauge-based flood monitoring systems. GIS plays a vital role in data acquisition, processing, and visualization for flood monitoring. Users can get near real-time flood data through GIS-based web applications. Limiting factors for classification-based flood crop loss assessment models and alternative methods for flood crop damage assessment in cropland are evaluated in the chapter.

Mapping and monitoring of large-scale plastic-mulched land cover (PML) are important both scientifically and socioeconomically. Chapter 17, Remote Sensing-Based Mapping of Plastic-Mulched Land Cover (PML), by Lu, explains mapping and monitoring of large-scale PML from satellite imagery. The decision tree classifier, threshold model, and four types of spatial attraction models (SPSAM, MSPSAM, MSAM, and ISAM) were analyzed and used to extract PML information.

With a huge amount of agro-geoinformation available from diverse sources, it is a challenge for users (e.g., decision-makers, researchers, and farmers) to easily find and effectively use the right information. In recent years, geospatial service technologies have been developed to meet the challenge. The last chapter of this book, Chapter 18, Design and Implementation of Geospatial Data Services for Agriculture, by Yue and Hu, describes the geospatial service technologies and discusses the design and implementation of geospatial data services for agriculture through the aspects of agricultural data categories and life cycle, the architecture of the service system, and the implementation of service functionalities. In this chapter, the CropScape system is used to demonstrate the interactive architecture framework for agriculture.
Given the exponentially increasing number of devices and the growth in processing power, methods, and capabilities, the rising complexity requires compliance with standards, well-engineered protocols, and platforms for reliable and feasible information systems in agriculture. This book presents a comprehensive approach to geospatial information-related issues in agriculture, covering recent methods for data acquisition, processing, analysis, and applications.

Chapter 2

Remote Sensing for Agriculture

Feng Gao

Abstract Remote sensing provides an indirect approach to monitor agricultural landscapes efficiently and consistently. It plays a critical role in current agricultural management. In the past, remote sensing applications were limited to crop type mapping due to the high cost or lack of remote sensing observations. In recent years, high temporal and spatial resolution satellite observations have become available. Many medium-resolution (10–100 m) satellite images are freely accessible to the public. Near-surface observations and unmanned aerial vehicles are common. This suite of remote sensing platforms makes applications in field-scale agricultural management possible. This chapter introduces basic remote sensing concepts and applications in agriculture. Remote sensing characteristics, such as spatial, temporal, and spectral resolutions, are discussed. The major agriculture-related and freely available remote sensing data sources (e.g., Landsat, MODIS, VIIRS, and Sentinel-2) and their data products are presented. Remote sensing applications in crop type classification, crop phenology mapping, crop yield estimation, crop water use monitoring, and soil moisture retrieval are introduced. The current progress and potential future directions for agricultural remote sensing are discussed.

Keywords Remote sensing · Crop type · Crop phenology · Crop water use · Yield estimation · Soil moisture · Landsat · Sentinel-2 · MODIS

This chapter discusses the theoretical foundation of remote sensing in agriculture, characteristics of the major agriculture-related remote sensing satellites, remote sensing data sources and access, and major categories of remote sensing applications in agriculture.

F. Gao (*) USDA, Agricultural Research Service, Hydrology and Remote Sensing Laboratory, Beltsville, MD, USA

© Springer Nature Switzerland AG 2021 L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_2

2.1 Introduction

The remote sensing technique provides an indirect approach to monitor agricultural landscapes. Remote sensing data are widely used in crop type identification, crop growth condition monitoring, crop water use and stress assessment, crop yield estimation, etc. Agriculture is one of the most successful application areas of remote sensing. Today, remote sensing data can be acquired near the surface using unmanned aerial vehicles (UAVs) or aircraft, and from space using satellites and space stations. The choice of remote sensing data source depends on the specific application and study region.

Remote sensing data have different spatial, temporal, and spectral resolutions that are designed for different purposes. The spatial resolution is related to the spatial domain and determines the minimum size of the target that can be distinguished. The temporal resolution is associated with the time domain and determines the temporal changes that can be captured. The spectral resolution reflects the spectral characteristics of targets and determines the types/features that can be recognized. These are the three main characteristics of remote sensing techniques. Other features such as multi-angular observation and active lidar systems are valuable for improving surface parameterization and can provide surface structure information.

Spatial resolution in remote sensing refers to the size of a pixel that can be identified in an image. The spatial resolution for agricultural applications spans from submeter to meters and kilometers. Different terms have been used to describe spatial resolution. Generally, coarse resolution refers to a pixel size of 100 m and coarser. Satellite sensors such as the Advanced Very High Resolution Radiometer (AVHRR), Moderate Resolution Imaging Spectroradiometer (MODIS), and Visible Infrared Imaging Radiometer Suite (VIIRS) are coarse-spatial-resolution sensors.
Although the AVHRR name includes “very high resolution,” it is a coarse-resolution sensor. Medium spatial resolution usually refers to 10–100 m. Some literature also calls this moderate resolution. However, since the MODIS instrument uses “moderate resolution” for spatial resolutions of 250 m to 1 km (coarse resolution), many later publications use medium resolution for 10–100 m. This range of spatial resolution includes some well-known sensors such as those aboard the Landsat and Sentinel-2 satellites. High spatial resolution usually means 10 m or finer, and very high spatial resolution normally refers to submeter resolution.

Temporal resolution refers to the frequency of observations over the same location. For UAV or airborne remote sensing, observation frequency depends on the needs of the application. It could be just one or a few times during a short period. Satellite observations have a routine revisit schedule. Coarse-resolution sensors such as AVHRR, MODIS, and VIIRS have a wide swath width (or a large field of view) and can acquire global images daily. Medium-resolution sensors such as Landsat and Sentinel-2 have a longer revisit cycle. For example, the revisit cycle is 16 days for Landsat-8 and 5 days for the Sentinel-2 constellation (A and B satellites combined). Coarse- and medium-spatial-resolution satellite sensors normally acquire global images routinely. For many high- to very-high-spatial-resolution sensors, images may only be acquired over target areas. Recently, satellite constellations such as PlanetScope have gained the capability to acquire remote sensing images at high resolution in both time and space. However, image processing and calibration become more complicated due to inconsistencies among the different sensors (e.g., bandwidths and spectral response functions).

Spectral resolution refers to a sensor’s bandwidth and sensitivity. Remote sensing sensors can cover a wide range of the electromagnetic spectrum, spanning the ultraviolet, visible, near infrared, middle infrared, thermal infrared, and microwave regions. Each spectral band has specific features and can be used for different purposes. Optical remote sensing sensors mostly include multispectral bands covering the visible to infrared range. Hyperspectral sensors have narrow bandwidths and acquire more spectral bands than multispectral sensors. Thermal infrared (TIR) bands can be used to estimate surface temperature and require a separate TIR instrument. Radar and microwave radiometers can penetrate clouds and provide soil moisture information for agricultural applications. These sensors may be placed on a comprehensive satellite platform or a standalone single-purpose satellite platform.

Multi-angular remote sensing provides observations from different viewing and solar geometries. Most coarse-resolution images have a large swath width. The viewing angles at different locations can vary significantly from nadir to off-nadir. For example, MODIS off-nadir observations can reach over 60 degrees at the edge of the swath. In addition to the changes in viewing angle, images acquired in different seasons have different solar geometries. Therefore, even for “nadir-viewing” sensors such as Landsat, angular effects still exist for images acquired in different seasons.
Multi-angular observations are valuable for some applications (e.g., albedo retrieval and vegetation structure detection), but can act as “noise” for other applications that require a consistent measurement of the surface.
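The spatial resolution categories described in this section can be written as a simple lookup. The following is a minimal sketch rather than part of the original chapter. The function name `spatial_resolution_class` is hypothetical, and the treatment of the shared boundaries is an assumption: the text assigns exactly 10 m and exactly 100 m to two adjacent categories, so here 10 m is counted as medium (matching the chapter's description of Sentinel-2 as medium resolution) and 100 m as coarse.

```python
def spatial_resolution_class(pixel_size_m: float) -> str:
    """Map a pixel size in meters to the resolution category used in this section."""
    if pixel_size_m <= 0:
        raise ValueError("pixel size must be positive")
    if pixel_size_m < 1:
        return "very high"  # submeter
    if pixel_size_m < 10:
        return "high"       # finer than 10 m (boundary handling is an assumption)
    if pixel_size_m < 100:
        return "medium"     # 10 m up to, but not including, 100 m
    return "coarse"         # 100 m and coarser


if __name__ == "__main__":
    # Pixel sizes quoted in this chapter.
    examples = {"MODIS (250-m bands)": 250, "Landsat 8 OLI": 30, "Sentinel-2 MSI (10-m bands)": 10}
    for sensor, size in examples.items():
        print(f"{sensor}: {size} m -> {spatial_resolution_class(size)}")
```

Because the category boundaries differ between publications, the thresholds are best treated as adjustable conventions rather than fixed definitions.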

2.2 Major Agriculture-Related Remote Sensing Data Sources

Many valuable remote sensing data sources are available for agricultural applications. Selecting the appropriate data sources is critical to the success of an application. Remote sensing sensors can be placed on different platforms. In this section, we introduce the major remote sensing sensors that are useful for agricultural applications.

Near-surface observation is an important component of remote sensing. It provides close and detailed observations of the surface and can be used to record surface changes and validate observations from space. An example of operational surface observation is the PhenoCam network, which takes continuous photos of canopy phenology over target sites (Richardson, 2018). Images from these sites are automatically uploaded to the PhenoCam server every half hour. Ground remote sensing provides continuous imagery regardless of weather conditions (cloudy or clear). Because the observations are made at the ground surface, atmospheric correction can be skipped. PhenoCam imagery covers the vegetation canopy and individual plants and can provide detailed information on vegetation growth. Figure 2.1 shows the photos observed by the PhenoCam over a USDA Long-Term Agroecosystem Research (LTAR) network site (the lower Chesapeake Bay site in Maryland: https://phenocam.sr.unh.edu/webcam/sites/arsltarmdcr/).

Fig. 2.1 PhenoCam photos for the USDA Long-Term Agroecosystem Research (LTAR) network in lower Chesapeake Bay, Maryland, from May 20 to June 14 in 2017. Near-surface (ground) remote sensing captures the quick changes of a cornfield regardless of weather conditions

PhenoCam photos have a very high temporal resolution. However, the number of cameras in a region may be restricted due to cost and feasibility. They are usually placed at specific sites and provide a small area of view. They cannot capture spatial variability over a large area. PhenoCams normally use affordable sensors. Their spectral bandwidths could be very different, and cross-sensor calibration may not be possible. Nevertheless, PhenoCams provide ground information that may be used to link surface observations to airborne and satellite sensors.

Airborne and UAV remote sensing can provide very-high-spatial-resolution imagery. Usually, these images cover the target area once or multiple times over a study period. A UAV acquires digital aerial imagery much as a piloted aircraft does. In recent years, UAV technology has advanced and now allows remote sensing images to be acquired repeatedly at low altitudes. It can capture changes in crop growth and conditions at subfield scales. However, UAV operation requires crew training, and flights in the United States must satisfy Federal Aviation Administration (FAA) regulations. The U.S. FAA rules require that a small UAV fly within sight, which limits the application of UAV remote sensing over large areas.

Fig. 2.2 Band wavelengths of Landsat 1–8 sensors (from U.S. Geological Survey – USGS)

Airborne remote sensing has a wider choice of high-quality sensors and is not restricted to lightweight sensors as UAVs are. However, the operation is more complex and the cost is higher than for a UAV.

Satellite remote sensing provides routine and repeatable observations on a regional or global scale. Satellites carry high-quality sensors that are normally well calibrated. Satellite images are the main data sources used in agricultural applications. As mentioned in the previous section, these sensors have different spatial, temporal, and spectral resolutions and can be used for different applications. This section introduces the major global satellite data sources that are freely available to the public.

The Landsat series has a long history starting from the early 1970s. Landsat data are the most popular data used in agriculture, especially after the Landsat free distribution policy was implemented in early 2008 (Woodcock et al. 2008). The Landsat series satellites have carried several instruments, including MSS (Multispectral Scanner), TM (Thematic Mapper), ETM+ (Enhanced Thematic Mapper Plus), and OLI (Operational Land Imager). Figure 2.2 shows the band wavelengths of Landsat 1–8 sensors. Landsat 7 ETM+ and Landsat 8 OLI are still in operation as of today. Despite the failure of the scan-line corrector (SLC), which causes data gaps in the ETM+ imagery, Landsat 7 provides useful information for about 80% of a Landsat scene. Landsat 8 was launched on February 11, 2013. The OLI instrument aboard Landsat 8 provides high-quality 30-m-resolution data over the globe on a 16-day repeat cycle. Landsat 9 is scheduled for launch in late 2021 and will have the same bandwidths as Landsat 8 OLI. Table 2.1 shows the bandwidth and spatial resolution for each spectral band. Note that the Landsat 8 thermal infrared bands have a native spatial resolution of 100 m and are resampled to 30-m resolution. Thermal infrared bands have been available since Landsat 4 with different spatial resolutions (120-m resolution for Landsat 4–5 TM and 60-m for Landsat 7 ETM+).

Table 2.1 Landsat 8 Operational Land Imager (OLI) and Thermal Infrared Sensor (TIRS) bands and spatial resolution (from USGS)

Bands                                  Wavelength (micrometers)  Resolution (meters)
Band 1: Ultra-blue (coastal/aerosol)   0.435–0.451               30
Band 2: Blue                           0.452–0.512               30
Band 3: Green                          0.533–0.590               30
Band 4: Red                            0.636–0.673               30
Band 5: Near infrared (NIR)            0.851–0.879               30
Band 6: Shortwave infrared (SWIR) 1    1.566–1.651               30
Band 7: Shortwave infrared (SWIR) 2    2.107–2.294               30
Band 8: Panchromatic                   0.503–0.676               15
Band 9: Cirrus                         1.363–1.384               30
Band 10: Thermal infrared (TIRS) 1     10.60–11.19               100
Band 11: Thermal infrared (TIRS) 2     11.50–12.51               100
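When working with these bands programmatically, it can help to hold Table 2.1 in a small data structure. The sketch below is illustrative only: the `Band` type, the `LANDSAT8_BANDS` list, and the `bands_covering` helper are hypothetical names, and the values are copied from Table 2.1. It looks up which bands include a given wavelength, which shows, for instance, that the panchromatic band overlaps the visible bands.

```python
from typing import List, NamedTuple


class Band(NamedTuple):
    number: int
    name: str
    min_um: float       # lower wavelength limit, micrometers
    max_um: float       # upper wavelength limit, micrometers
    resolution_m: int   # spatial resolution, meters


# Values copied from Table 2.1 (Landsat 8 OLI and TIRS).
LANDSAT8_BANDS = [
    Band(1, "Ultra-blue (coastal/aerosol)", 0.435, 0.451, 30),
    Band(2, "Blue", 0.452, 0.512, 30),
    Band(3, "Green", 0.533, 0.590, 30),
    Band(4, "Red", 0.636, 0.673, 30),
    Band(5, "Near infrared (NIR)", 0.851, 0.879, 30),
    Band(6, "Shortwave infrared (SWIR) 1", 1.566, 1.651, 30),
    Band(7, "Shortwave infrared (SWIR) 2", 2.107, 2.294, 30),
    Band(8, "Panchromatic", 0.503, 0.676, 15),
    Band(9, "Cirrus", 1.363, 1.384, 30),
    Band(10, "Thermal infrared (TIRS) 1", 10.60, 11.19, 100),
    Band(11, "Thermal infrared (TIRS) 2", 11.50, 12.51, 100),
]


def bands_covering(wavelength_um: float) -> List[Band]:
    """Return every band whose wavelength range includes the given wavelength."""
    return [b for b in LANDSAT8_BANDS if b.min_um <= wavelength_um <= b.max_um]


if __name__ == "__main__":
    for wavelength in (0.65, 0.86, 11.0):
        names = [f"Band {b.number} ({b.resolution_m} m)" for b in bands_covering(wavelength)]
        print(f"{wavelength} um: {names}")
```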

Landsat data have been geometrically registered, radiometrically calibrated, and atmospherically corrected to surface reflectance (Roy et al. 2014). The Landsat surface reflectance data product is a level-2 product in the Collection 1 and Collection 2 processing (Vermote et al. 2016). Users can order data from EarthExplorer (https://earthexplorer.usgs.gov/). Global Landsat data are organized in the Worldwide Reference System (WRS)-2. The area and image size of a WRS-2 Landsat scene differ slightly between acquisition dates. Landsat imagery in WRS-2 uses the UTM map projection. Since the centers of two adjacent scenes may be in different UTM zones, the overlapping area of neighboring paths may not be directly overlaid for time series analysis without first reprojecting the data.

To make data processing easier, Landsat Analysis Ready Data (ARD) over the conterminous United States (CONUS), Alaska, and Hawaii have been produced (available from EarthExplorer). The ARD products use a fixed grid (or tile) and map projection to store Landsat images. All images are processed in the Albers equal-area (AEA) conic map projection. Each ARD tile includes 5000 × 5000 30-m pixels (150 km × 150 km) and contains all the pixels acquired within its extent. The ARD is intended to make it easier for users to produce Landsat-based maps of land cover and land cover change and other derived geophysical and biophysical products (Dwyer et al. 2018). Both ARD and WRS-2 surface reflectance products have a quality control layer that includes cloud mask and data quality.

MODIS is a key instrument aboard the Terra and Aqua satellites. Terra was launched on December 18, 1999, and crosses the equator from north to south in the morning. Aqua was launched on May 4, 2002, and crosses the equator from south to north in the afternoon. Terra and Aqua MODIS observe the globe every 1–2 days, acquiring data in 36 spectral bands. Seven spectral bands and two thermal infrared (TIR) bands were designed for land applications (Table 2.2).
MODIS observations have been well calibrated, and MODIS data products have been validated (Justice et al. 2002). Due to their high temporal and spectral resolutions, MODIS data products have been widely used in crop condition monitoring and yield estimation even though the spatial resolutions of 250 m to 1 km are not ideal for small fields.

Table 2.2 MODIS spectral bandwidths and spatial resolutions (from NASA)

Bands        Wavelength           Resolution (m)
1 – Red      620–670 nm           250
2 – NIR      841–876 nm           250
3 – Blue     459–479 nm           500
4 – Green    545–565 nm           500
5 – SWIR 1   1230–1250 nm         500
6 – SWIR 2   1628–1652 nm         500
7 – SWIR 3   2105–2155 nm         500
31 – TIR 1   10.780–11.280 µm     1000
32 – TIR 2   11.770–12.270 µm     1000

The National Aeronautics and Space Administration (NASA) provides MODIS data products at different processing levels with well-defined quality controls. These products include daily surface reflectance at 250-m, 500-m, and 1-km spatial resolutions (Vermote et al. 2002); multi-date composite vegetation indices (Huete et al. 2002); multi-date composite leaf area index (Myneni et al. 2002); daily land surface temperature (Wan et al. 2002); and daily nadir BRDF-adjusted reflectance (NBAR) (Schaaf et al. 2002). Table 2.3 lists the major MODIS data products that are related to agricultural applications. They can be downloaded from the NASA EarthData website (https://earthdata.nasa.gov/).

VIIRS is an operational MODIS-like sensor aboard the Suomi NPP (National Polar-orbiting Partnership) and Joint Polar Satellite System (JPSS-1, or NOAA-20) satellites. Although the main goal of VIIRS was to improve weather forecasting, much of the capability for land science has been retained. Two sets of VIIRS land products have been generated. A suite of Environmental Data Records (EDRs) was produced by the National Oceanic and Atmospheric Administration (NOAA) to meet the operational data needs, primarily for the National Weather Service. Another suite of Earth System Data Records (ESDRs) was developed by NASA to meet the needs of the global change science community. NASA aims to create enhanced operational products, as well as additional products not included in NOAA’s operational products, to provide continuity with the MODIS data record. The VIIRS data products use a MODIS-like naming convention with the prefix “VNP” and are available from the EarthData website. VIIRS is an operational satellite sensor and will continue in the future.

Sentinel-2 is a medium-resolution satellite mission developed by the European Space Agency (ESA).
The Sentinel-2 constellation includes two satellites (A and B). Sentinel-2A was launched on June 23, 2015, and Sentinel-2B was launched on March 7, 2017. Each Sentinel-2 satellite carries a single multispectral instrument (MSI) with 13 spectral channels in the visible, near infrared, and shortwave infrared bands (Table 2.4). The MSI sensor has a wider swath (290 km) than Landsat (185 km) and thus shortens the revisit time (10 days for one satellite or

Table 2.3 Major MODIS data products related to agricultural applications (MOD, MYD, and MCD represent Terra, Aqua, and the combined products, respectively)

Data product                                        Product filename    Spatial Res.  Temporal Res.
Surface reflectance                                 MOD09GQ/MYD09GQ     250 m         Daily
                                                    MOD09GA/MYD09GA     500 m         Daily
                                                    MOD09A1             500 m         8-day composite
Vegetation Indices (VI)                             MOD13A1/MYD13A1     500 m         16-day composite
                                                    MOD13A2/MYD13A2     1 km          16-day composite
                                                    MOD13A3/MYD13A3     1 km          Monthly composite
Leaf Area Index (LAI)                               MCD15A2H            500 m         8-day composite
                                                    MCD15A3H            500 m         4-day composite
                                                    MOD15A2H/MYD15A2H   500 m         8-day composite
Albedo and Nadir BRDF-adjusted reflectance (NBAR)   MCD43A3             500 m         Daily
                                                    MCD43A4             500 m         Daily
Land Surface Temperature (LST)                      MOD11A1/MYD11A1     1 km          Daily
                                                    MOD11A2/MYD11A2     1 km          8-day composite
Evapotranspiration (ET)                             MOD16A2/MYD16A2     500 m         8-day composite
                                                    MOD16A3/MYD16A3     500 m         Yearly
Land cover type                                     MCD12Q1             500 m         Yearly
Land cover dynamics                                 MCD12Q2             500 m         Yearly

Table 2.4 Sentinel-2A and Sentinel-2B MSI bandwidths and spatial resolutions (from ESA)

Bands  S2A central wavelength (nm)  S2B central wavelength (nm)  Bandwidth (nm)  Spatial resolution (m)
1      442.7                        442.2                        21              60
2      492.4                        492.1                        66              10
3      559.8                        559.0                        36              10
4      664.6                        664.9                        31              10
5      704.1                        703.8                        16              20
6      740.5                        739.1                        15              20
7      782.8                        779.7                        20              20
8      832.8                        832.9                        106             10
8a     864.7                        864.0                        22              20
9      945.1                        943.2                        21              60
10     1373.5                       1376.9                       30              60
11     1613.7                       1610.4                       94              20
12     2202.4                       2185.7                       185             20

5 days combined). The shorter revisit provides the higher temporal resolution data required for crop growth and condition monitoring. In addition, Sentinel-2 MSI provides images with spatial resolutions of 10 m, 20 m, and 60 m. The higher spatial resolution (10 m for blue, green, red, and NIR bands) provides more spatial detail than Landsat (30 m) and can be used to study spatial variability at the field to subfield scales. The Sentinel-2 MSI includes four red edge bands that could be beneficial for crop monitoring. Sentinel-2 data are freely available to the public and are increasingly used in agricultural applications. Unfortunately, Sentinel-2 satellites do not have thermal infrared bands, which hampers the detection of clouds at the pixel level. The lack of thermal infrared bands also limits the study of crop water use, which requires surface temperature in the land surface energy balance model.

The NASA Goddard Space Flight Center has produced the Harmonized Landsat and Sentinel-2 (HLS) surface reflectance product to increase the temporal resolution. HLS data products are co-registered, atmospherically corrected, and gridded in the Sentinel-2 tile system (Claverie et al. 2018). The data can be used for time series analysis directly. Version 1.4 HLS data over North America are available from NASA Goddard Space Flight Center (https://hls.gsfc.nasa.gov/), and version 1.5 over the globe is available from the NASA EarthData website (https://earthdata.nasa.gov/).

Commercial satellite data, such as WorldView and PlanetScope, provide satellite imagery at very high spatial resolutions. The PlanetScope constellation, with hundreds of small satellites deployed, provides a capacity for daily global coverage at a lower cost. A technical challenge in using these data is the inconsistency across satellites and dates. Additional processing is needed to harmonize them for monitoring crop progress and conditions (Houborg and McCabe 2018).
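The gain in temporal resolution from combining Landsat and Sentinel-2 can be illustrated with simple date arithmetic. The sketch below is a simplified illustration under stated assumptions and is not the HLS processing chain, which also involves co-registration, atmospheric correction, and gridding. The revisit cycles (16 days for Landsat 8, 5 days for the two Sentinel-2 satellites combined) come from the text; the first acquisition days, the 120-day window, and the function names are hypothetical, and cloud cover is ignored.

```python
from typing import List


def acquisition_days(first_day: int, revisit_days: int, last_day: int) -> List[int]:
    """Days (counted from the start of the window) on which a sensor observes a site."""
    return list(range(first_day, last_day + 1, revisit_days))


def mean_gap(days: List[int]) -> float:
    """Average number of days between consecutive observations."""
    return (days[-1] - days[0]) / (len(days) - 1)


if __name__ == "__main__":
    # Hypothetical first acquisition days within a 120-day window.
    landsat8 = acquisition_days(first_day=1, revisit_days=16, last_day=120)
    sentinel2 = acquisition_days(first_day=3, revisit_days=5, last_day=120)
    # Days with at least one observation from either source.
    combined = sorted(set(landsat8) | set(sentinel2))

    for name, days in (("Landsat 8", landsat8), ("Sentinel-2 A+B", sentinel2), ("Combined", combined)):
        print(f"{name}: {len(days)} observation days, mean gap {mean_gap(days):.1f} days")
```

In this idealized setting the combined series has more observation days and a shorter mean gap than either source alone, although the longest gap is still set by the denser of the two series. In practice, cloud cover removes many observations, which is one reason a harmonized multi-sensor record is useful.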

2.3 Agricultural Applications

Remote sensing data have been widely used in agricultural applications, including crop type mapping, crop growth condition monitoring, crop phenology detection, crop yield estimation, crop water use estimation, crop stress assessment, and soil moisture retrieval. This section discusses the major applications using satellite remote sensing.

2.3.1 Crop Type Identification

Identifying crop type and planting acreage is critical for estimating crop production. A crop type map is the basis for many agricultural applications. Remote sensing imagery provides spatial information that can be used to produce wall-to-wall crop type maps. Crop type classification uses the distinct features of crops in the spectral and temporal domains to separate different crop types.

Remote sensing classification includes supervised and unsupervised classification. Most crop type classifications have been performed using a supervised classification method, which requires training samples. The widely used and effective pixel-based methods include maximum likelihood, decision tree, neural network, random forest, and support vector machine classifiers. Object-oriented classification has been used for very-high-spatial-resolution imagery. Supervised classification requires training samples of different crop types. The ground truth samples can be collected from the field, from fine-spatial-resolution images, or from the same remote sensing image used for classification. Classification accuracy depends on the quality of the training samples and the separability among the different crops.

In the early era of remote sensing, a single image was normally used in classification due to the cost. When a single image is used, the classification accuracy depends entirely on spectral and spatial features. Sometimes different crop types share a similar spectral feature on the acquisition date, and the classifier may not be able to separate them. Since medium-resolution remote sensing images are freely available today, images acquired at different crop growth stages are now used to improve classification accuracy. Crop types that show similar spectral features on one day may differ on other days. Therefore, temporal information is valuable for improving classification accuracy.

Crop type classification may also be affected by the pixel’s spatial resolution. In the United States, crop fields are large, and thus the Landsat 30-m resolution is good enough for mapping the crop type of each field. However, in Africa, Asia, and Europe, field sizes are much smaller; remote sensing imagery at a finer spatial resolution is needed to avoid mixing different land cover types in a pixel.

Crop type classification using remote sensing has been operational in the United States. The US Department of Agriculture (USDA) National Agricultural Statistics Service (NASS) has produced the Cropland Data Layer (CDL) over CONUS every year since 2008. For the earlier years (1997–2007), CDL maps are available for selected states. CDL was produced using a decision tree classifier (Boryan et al. 2011).
Classification results were assessed for each crop, and the classification accuracy varies. Major crops such as corn, soybean, and wheat have much higher accuracy than minor crops. The CDL data are available through the CropScape portal (https://nassgeodata.gmu.edu/CropScape/). CropScape provides spatial subsetting, analysis, and mapping functions. CDL for the entire CONUS can also be downloaded through USDA NASS (https://www.nass.usda.gov/Research_and_Science/Cropland/Release/index.php). NASS produces CDL within the season and releases it early in the following year. Note that even though remote sensing classification methods are mature, mapping crop types early in the growing season is still very challenging. Using crop rotation patterns from previous years can help early-season crop type mapping (Hao et al. 2020).
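The value of temporal information for supervised classification can be shown with a toy example. The sketch below is not the CDL method and uses none of the classifiers named above; it substitutes a deliberately simple nearest-centroid classifier so that it runs with the Python standard library alone. All vegetation index (VI) values are synthetic and chosen only for illustration, and the names `fit_centroids`, `classify`, and `accuracy` are hypothetical. The two crops have overlapping VI values on the first date but differ on the later dates.

```python
from math import dist
from statistics import mean

# Synthetic VI values on three acquisition dates (early, mid, late season).
TRAINING = {
    "corn": [(0.30, 0.85, 0.40), (0.34, 0.80, 0.35), (0.28, 0.82, 0.38)],
    "soybean": [(0.33, 0.55, 0.75), (0.29, 0.60, 0.70), (0.31, 0.58, 0.72)],
}
VALIDATION = [
    ("corn", (0.32, 0.83, 0.37)),
    ("soybean", (0.29, 0.57, 0.74)),
    ("corn", (0.29, 0.81, 0.39)),
    ("soybean", (0.33, 0.59, 0.71)),
]


def fit_centroids(training, dates):
    """Average the VI values of the selected dates for each crop."""
    return {
        crop: tuple(mean(sample[d] for sample in samples) for d in dates)
        for crop, samples in training.items()
    }


def classify(sample, centroids, dates):
    """Assign the crop whose centroid is closest in the selected dates."""
    features = tuple(sample[d] for d in dates)
    return min(centroids, key=lambda crop: dist(features, centroids[crop]))


def accuracy(labelled, training, dates):
    """Fraction of labelled samples classified correctly using the selected dates."""
    centroids = fit_centroids(training, dates)
    hits = sum(classify(sample, centroids, dates) == crop for crop, sample in labelled)
    return hits / len(labelled)


if __name__ == "__main__":
    print("single date :", accuracy(VALIDATION, TRAINING, dates=(0,)))
    print("three dates :", accuracy(VALIDATION, TRAINING, dates=(0, 1, 2)))
```

With these synthetic values the single early-season date cannot separate the two crops, while the three-date feature vector separates them cleanly. Operational systems use far larger training sets and more capable classifiers.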

2.3.2 Crop Phenology Mapping

Accurate spatiotemporal information about crop progress and condition during the growing season is critical for crop management and yield estimation (Walthall et al. 2012). The amount of yield loss realized during a drought year is dependent on the crop growth stage when water stress occurs. Crop progress provides information

2 Remote Sensing for Agriculture

17

necessary for efficient irrigation and drainage management. For example, the most beneficial timing for irrigation is during the latter part of the reproductive growth stages for soybeans versus the earlier tasseling period for corn. In addition, crop progress information is critical for scheduling fertilization, pest management, and harvesting operations at optimal times for achieving higher yields (Gao et al. 2017). Crop progress varies by year and location and is affected by climate variation, local weather, soil properties, environmental changes, and anthropogenic activities. In the United States, crop progress and condition are estimated using ground survey data supplied by trained reporters. These reporters provide visual observations and subjective estimates of crop progress based on USDA NASS standard definitions. Crop growth stages, crop conditions, and farmers' activities are reported each week. The crop progress (CP) reports are summarized and released weekly during the growing season from early April to late November. The reports provide summaries at the agricultural statistics district (multiple counties) and state levels to the public (https://www.nass.usda.gov/Publications/State_Crop_Progress_and_Condition/index.php). These reports do not describe spatial variability within the agricultural statistical unit.

Remote sensing data are unique in providing spatial and temporal information for crop monitoring. In recent years, remote sensing time series data have been used to extract land surface phenology. These approaches use mathematical functions to fit time series of vegetation indices (VIs). Land surface phenology or phenological parameters are extracted based on either a predefined VI threshold (Jonsson and Eklundh 2004) or the curvature of the fitted function (Zhang et al. 2003). Zhang et al. (2003) developed a phenology program using a hybrid piecewise logistic function, and the approach has been used to produce the MODIS phenology data product since 2001. These approaches extract vegetation phenology using particular features in time series VI data, which can be interpreted as remote sensing phenology. To relate phenology detected from remote sensing signals to field-observed crop progress (or physiological stages), Sakamoto et al. (2010) developed a two-step filtering approach to detect maize and soybean phenology using MODIS data.

Global land surface phenology products have been available at coarse spatial resolution since 2001 (Zhang et al. 2003). However, the 500-m spatial resolution is still too coarse for many crop fields. A data fusion approach that combines the temporal frequency of MODIS with the spatial resolution of Landsat (Gao et al. 2006) has been used to build daily time series VI at Landsat 30-m spatial resolution. The fused Landsat-MODIS data were combined with Landsat observations to extract crop phenology, which was then related to crop growth stages in central Iowa from 2001 to 2016 (Gao et al. 2017). In addition, crop phenology mapping at medium spatial resolution can be improved by combining Landsat and Sentinel-2 observations. Land surface phenology at 30-m or finer resolution has recently been retrieved from Landsat, Sentinel-2, and HLS (Bolton et al. 2020; Gao et al. 2017, 2020a, b; Zhang et al. 2020).
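The threshold-based extraction described above can be illustrated with a minimal sketch. It assumes a single logistic green-up curve and an amplitude-fraction threshold. The function names, the curve parameters, and the 20% fraction are hypothetical choices for illustration only; they are not taken from TIMESAT or from the MODIS phenology algorithm, which relies on curvature rather than a fixed threshold.

```python
import math


def logistic_vi(day, a, b, c, d):
    """Logistic VI curve: d is the background VI and c the seasonal amplitude."""
    return c / (1.0 + math.exp(a + b * day)) + d


def greenup_day(days, vi, fraction=0.2):
    """Return the first day on which VI reaches background + fraction * amplitude.

    Assumes the series covers one green-up phase; returns None if the
    threshold is never reached.
    """
    vi_min, vi_max = min(vi), max(vi)
    threshold = vi_min + fraction * (vi_max - vi_min)
    for day, value in zip(days, vi):
        if value >= threshold:
            return day
    return None


# Synthetic daily NDVI for day of year 100-220 (illustrative parameters only).
days = list(range(100, 221))
vi = [logistic_vi(t, a=16.0, b=-0.1, c=0.6, d=0.2) for t in days]
detected = greenup_day(days, vi)
print("Green-up detected on day of year", detected)
```

With real imagery the VI series would first be gap-filled and smoothed, and the threshold fraction would need to be calibrated against field observations of crop progress.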


F. Gao

Even when daily observations at medium spatial resolution are available, near real-time mapping of crop growth stages is still challenging (Liu et al. 2018). For example, crop emergence is defined as the first appearance of crop leaves. Remote sensing data from the early growing stage can be affected by changes in soil moisture (e.g., snow/ice melt), and sensors may not be sensitive to the subtle changes that occur at crop emergence. Near real-time (or within-season) crop phenology mapping approaches have been developed (Zhang et al. 2012; Liu et al. 2018; Gao et al. 2020a, b). These approaches can run in near real-time mode using any period of an imagery time series. Results show that crop emergence dates and cover crop termination dates may be reliably detected within 1–3 weeks using high-temporal and spatial-resolution remote sensing data (Gao et al. 2020a, b). Figure 2.3 shows green-up dates detected using the Vegetation and Environment monitoring New MicroSatellite (VENμS, 5-m, 2-day revisit) time series ending on three dates in 2019. Later green-up events were detected by including more recent observations.

2.3.3 Crop Yield Estimation

Accurate estimation of crop yield before harvest is critical for sustaining agricultural markets and ensuring food security. Remote sensing data have been shown to be useful for estimating crop yield for several decades. More than three decades ago, Tucker et al. (1980) used field observations and reported that the normalized difference vegetation index (NDVI) for a 5-week period from stem elongation to anthesis explained about 64% of the grain yield variation of wheat. In the recent era of rich satellite data availability, numerous studies have used satellite imagery to estimate crop yields. Many of these used empirical relationships between yields and various vegetation indices (VIs). The empirical approach builds a relationship between ground yield survey samples and remote sensing-derived parameters and then applies the relationship to remote sensing imagery to map yield over the entire area. VI-based metrics (e.g., maximum VI, integral VI over the entire growing season or for a specific growth period) have been used for estimating crop yield. Although an empirical model built for a specific region has limited applicability to different areas or years, it is simple and effective for the local region if the ground survey samples are representative and accurate.

Another line of effort has been to incorporate remote sensing data into physiology-based crop growth modeling. Conventional models simulate crop growth and yield (or biomass) through crop biophysical processes. Remote sensing variables such as leaf area index (LAI) can be integrated into these models via direct replacement or through data assimilation techniques. Physiological crop growth models typically require a large number of inputs and substantial computing resources. To reduce the input data requirements and computing costs, Lobell et al. (2015) developed a scalable satellite-based crop yield mapper (SCYM) to relate weather data and satellite VI to the yields simulated from a crop growth model. Recently, large-area mapping using SCYM and Landsat has been enabled by Google Earth Engine (GEE) (Gorelick et al. 2017).

In contrast with physiology-based crop growth models, the process-based approach uses the light use efficiency model to estimate crop yield. This approach uses four primary inputs: incoming photosynthetically active radiation (PAR), the fraction of PAR absorbed by the canopy (fAPAR), the light use efficiency (LUE), and the harvest index (HI). The absorbed PAR (APAR) is the product of PAR and fAPAR. PAR can be measured at ground meteorological stations or computed from satellite observations. Since fAPAR is related to VI, crop biomass and yield can be determined from VI using a process-based yield estimation model. Gao et al. (2018) examined the added value of VIs from multiple remote sensing sources (Landsat, Sentinel-2, and MODIS) and found that images with higher spatial and temporal resolution better explained the spatial variability of crop yield.

Fig. 2.3 Green-up dates detected using VENμS time series NDVI data until June 15 (a), July 1 (b), and July 15, 2019 (c). Newly detected green-ups are labeled in each panel. Label “C” represents corn fields and “S” represents soybean fields. An alfalfa field (“Alf”) was also observed, to examine multiple green-ups and harvests (from Gao et al. 2020a)
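The light use efficiency calculation described in this section can be sketched in a few lines. This is only an illustration under stated assumptions: the linear NDVI-to-fAPAR scaling, its coefficients, and the LUE and HI values are hypothetical placeholders, not values from Gao et al. (2018) or any operational product.

```python
def fapar_from_ndvi(ndvi, ndvi_soil=0.15, ndvi_full=0.90, fapar_max=0.95):
    """Hypothetical linear NDVI-to-fAPAR scaling, clamped to [0, fapar_max]."""
    scaled = (ndvi - ndvi_soil) / (ndvi_full - ndvi_soil)
    return fapar_max * min(1.0, max(0.0, scaled))


def lue_yield(par, ndvi, lue, harvest_index):
    """Estimate yield (g m-2) with a light use efficiency model.

    par: daily incoming PAR (MJ m-2 d-1), one value per day
    ndvi: daily NDVI, same length as par
    lue: light use efficiency (g biomass per MJ of APAR)
    harvest_index: fraction of biomass that ends up as harvested yield
    """
    if len(par) != len(ndvi):
        raise ValueError("par and ndvi must have the same length")
    # APAR is the product of PAR and fAPAR, accumulated over the season.
    apar = sum(p * fapar_from_ndvi(v) for p, v in zip(par, ndvi))
    biomass = lue * apar
    return harvest_index * biomass


# Illustrative season: 100 days of constant PAR under a closed canopy.
season_par = [10.0] * 100
season_ndvi = [0.90] * 100
estimate = lue_yield(season_par, season_ndvi, lue=3.0, harvest_index=0.5)
print("Estimated yield (g m-2):", round(estimate, 1))
```

In practice LUE and HI vary by crop and growing conditions, so both would need to be set from the literature or calibrated against ground yield samples.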

2.3.4 Crop Evapotranspiration (ET) and Water Use

Crop evapotranspiration (ET) includes soil evaporation (E) and crop transpiration (T). Transpiration accounts for the water lost as vapor through the stomata in leaves. This water is extracted by the root system in the root zone and represents a loss of water from the soil; thus, ET is used interchangeably with crop water use. Crop water stress can be detected using ET and the Evaporative Stress Index (ESI) (Anderson et al. 2016). For irrigated crops, ET measures the water used to grow food. Temporally and spatially continuous ET data are needed for agricultural management and irrigation scheduling. ET can be estimated using surface temperature retrieved from thermal infrared imagery through an energy balance model. The Penman-Monteith (PM) equation is found to be consistent over a wide range of climatic conditions. The MODIS ET algorithm is based on the PM equation. MODIS ET data products have been available since 2001 (Mu et al. 2007). Agricultural applications require ET information over a range of temporal and spatial resolutions. The MODIS 8-day ET product at 500-m spatial resolution may be too coarse to assess water use at the field scale. The USDA Hydrology and Remote Sensing Laboratory (HRSL) has developed a multiscale flux modeling system using TIR and LAI data from multiple satellite platforms (Anderson et al. 2007, 2012). This system is unique in that it merges low-spatial/high-temporal resolution information available from geostationary satellites with higher-spatial/lower-temporal information from polar orbiters such as MODIS, VIIRS, and Landsat, generating self-consistent maps of water and energy fluxes from continental to field scales. For more detailed spatial analyses, such as mapping variability in water use across a watershed or between individual farm fields, an ALEXI flux disaggregation approach (DisALEXI) can be applied using sharpened temperature and LAI information from sensors like Landsat to map fluxes at 30-m resolution (Anderson et al. 2012).
Many ET estimation methods have been developed in the past decades. Recently, several trusted ET methods, including DisALEXI, METRIC, and SEBAL, have been combined as an ensemble in the OpenET platform. The platform uses multiple remote sensing data sources and cloud computing techniques to estimate ET and provide easy access at user-defined scales and dates (https://openetdata.org/).
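To show the structure of a Penman-Monteith calculation, the sketch below uses the widely published FAO-56 daily reference ET form. This is an assumption made for illustration: it is not the MODIS ET algorithm of Mu et al. (2007), which drives the PM equation with remotely sensed inputs, and it is not the energy balance approach used in DisALEXI. The function names and the default psychrometric constant are illustrative choices.

```python
import math


def slope_svp(t_mean_c):
    """Slope of the saturation vapor pressure curve (kPa per deg C)."""
    es = 0.6108 * math.exp(17.27 * t_mean_c / (t_mean_c + 237.3))
    return 4098.0 * es / (t_mean_c + 237.3) ** 2


def reference_et(t_mean_c, net_radiation, soil_heat_flux, wind_2m, es, ea,
                 gamma=0.0666):
    """FAO-56 Penman-Monteith reference ET (mm per day).

    t_mean_c: mean daily air temperature (deg C)
    net_radiation, soil_heat_flux: MJ m-2 d-1
    wind_2m: wind speed at 2 m height (m s-1)
    es, ea: saturation and actual vapor pressure (kPa)
    gamma: psychrometric constant (kPa per deg C); the default is only an
           example value and depends on local atmospheric pressure
    """
    delta = slope_svp(t_mean_c)
    radiation_term = 0.408 * delta * (net_radiation - soil_heat_flux)
    aerodynamic_term = (gamma * 900.0 / (t_mean_c + 273.0)
                        * wind_2m * (es - ea))
    return (radiation_term + aerodynamic_term) / (
        delta + gamma * (1.0 + 0.34 * wind_2m))


# Illustrative daily inputs for a temperate site in summer.
et0 = reference_et(t_mean_c=16.9, net_radiation=13.28, soil_heat_flux=0.14,
                   wind_2m=2.078, es=1.997, ea=1.409)
print("Reference ET (mm/day):", round(et0, 2))
```

Reference ET describes a well-watered grass surface; crop water use is typically obtained by scaling it with crop-specific coefficients or, as in the satellite approaches above, by estimating actual ET directly.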

2.3.5 Soil Moisture Retrieval

Soil moisture and its availability affect crop growth and yield. In the past decades, microwave remote sensing has been used for soil moisture estimation. Passive and active microwave remote sensing provide a unique capability for mapping soil moisture (Calvet et al. 2011). Various low-frequency bands (X, C, and L) have been used to estimate the moisture content of bare or vegetated soil. Early microwave remote sensing was conducted from airborne platforms. Many sensors have since been launched into space to provide microwave signals for operational use. The Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) aboard the Aqua spacecraft included C (4–8 GHz) and X (8–12 GHz) band sensors and provided a global daily soil moisture product at 25-km spatial resolution from June 2002 to October 2011. Several satellites carrying L (1–2 GHz) band sensors have been launched. The ESA Soil Moisture and Ocean Salinity (SMOS) satellite was launched in November 2009. The goal of the SMOS mission is to provide surface soil moisture with an accuracy of 4% at 35–50-km spatial resolution. The NASA Soil Moisture Active Passive (SMAP) satellite was launched in January 2015. It includes passive and active L band sensors. Although the SMAP active radar sensor (1–3-km resolution) failed after 3 months of operation, the passive radiometer sensor still provides coarse-resolution products (L2 and L3 products at 36-km and L4 value-added products at 9-km spatial resolution) at a global scale from 2015 to the present. The SMAP data products are available from the National Snow and Ice Data Center (NSIDC: https://nsidc.org/data/smap/smap-data.html). ESA's Sentinel-1A and Sentinel-1B satellites carry a C band synthetic-aperture radar instrument. Recently, the L band radiometer measurements from SMAP and the C band radar measurements from Sentinel-1 have been combined to produce high-spatial-resolution soil moisture estimates (Das et al. 2018). A beta version of the combined product at 3-km spatial resolution is available from the NASA NSIDC DAAC (Distributed Active Archive Center).

Soil moisture estimates are affected by surface roughness and vegetation coverage. The accuracy of soil moisture estimates depends on the surface conditions.

2.4 Summary

This chapter introduces basic remote sensing concepts and applications in agriculture. Remote sensing characteristics such as spatial, temporal, and spectral resolutions are discussed. The major agriculture-related and freely available remote sensing data sources (e.g., Landsat, MODIS, VIIRS, and Sentinel-2) and their data products are presented. Remote sensing applications in crop type classification, crop phenology monitoring, crop yield estimation, crop water use, and soil moisture estimation are introduced.


References

Anderson, M. C., Kustas, W. P., & Norman, J. M. (2007). Upscaling flux observations from local to continental scales using thermal remote sensing. Agronomy Journal, 99, 240–254.
Anderson, M. C., Kustas, W. P., Alfieri, J. G., Gao, F., Hain, C., Prueger, J. H., Evett, S. R., Colaizzi, P. D., Howell, T. A., & Chaves, J. L. (2012). Mapping daily evapotranspiration at Landsat spatial scales during the BEAREX'08 field campaign. Advances in Water Resources, 50, 162–177. https://doi.org/10.1016/j.advwatres.2012.06.005.
Anderson, M. C., Zolin, C. A., Sentelhas, P. C., Hain, C. R., Semmens, K., Tugrul Yilmaz, M., Gao, F., Otkin, J. A., & Tetrault, R. (2016). The Evaporative Stress Index as an indicator of agricultural drought in Brazil: An assessment based on crop yield impacts. Remote Sensing of Environment, 174, 82–99.
Bolton, D. K., Gray, J. M., Melaas, E. K., Moon, M., Eklundh, L., & Friedl, M. A. (2020). Continental-scale land surface phenology from harmonized Landsat 8 and Sentinel-2 imagery. Remote Sensing of Environment, 240, 111685. https://doi.org/10.1016/j.rse.2020.111685.
Boryan, C., Yang, Z., Mueller, R., & Craig, M. (2011). Monitoring US agriculture: The US Department of Agriculture, National Agricultural Statistics Service, Cropland Data Layer program. Geocarto International, 26, 341–358.
Calvet, J.-C., Wigneron, J.-P., Walker, J. P., Karbou, F., Chanzy, A., & Albergel, C. (2011). Sensitivity of passive microwave observations to soil moisture and vegetation water content: L-band to W-band. IEEE Transactions on Geoscience and Remote Sensing, 49(4), 1190–1199.
Claverie, M., Ju, J., Masek, J. G., Dungan, J. L., Vermote, E. F., Roger, J.-C., Skakun, S., & Justice, C. O. (2018). The harmonized Landsat and Sentinel-2 surface reflectance data set. Remote Sensing of Environment, 219, 145–161. https://doi.org/10.1016/j.rse.2018.09.002.
Das, N., Entekhabi, D., Dunbar, R. S., Kim, S., Yueh, S., Colliander, A., O'Neill, P. E., & Jackson, T. (2018). SMAP/Sentinel-1 L2 radiometer/radar 30-second scene 3 km EASE-grid soil moisture, version 2. NASA National Snow and Ice Data Center Distributed Active Archive Center. https://doi.org/10.5067/KE1CSVXMI95Y.
Dwyer, J., Roy, D., Sauer, B., Jenkerson, C., Zhang, H., & Lymburner, L. (2018). Analysis ready data: Enabling analysis of the Landsat archive. Remote Sensing, 10, 1363. https://doi.org/10.3390/rs10091363.
Gao, F., Masek, J., Schwaller, M., & Hall, F. (2006). On the blending of the Landsat and MODIS surface reflectance: Predict daily Landsat surface reflectance. IEEE Transactions on Geoscience and Remote Sensing, 44, 2207–2218.
Gao, F., Anderson, M. C., Zhang, X., Yang, Z., Alfieri, J. G., Kustas, W. P., Mueller, R., Johnson, D., & Prueger, J. H. (2017). Toward mapping crop progress at field scales through fusion of Landsat and MODIS imagery. Remote Sensing of Environment, 188, 9–25.
Gao, F., Anderson, M., Daughtry, C., & Johnson, D. (2018). Assessing variability of corn and soybean yields in central Iowa using high spatiotemporal resolution multi-satellite imagery. Remote Sensing, 10, 1489. https://doi.org/10.3390/rs10091489.
Gao, F., Anderson, M. C., Daughtry, C. S., Karnieli, A., Hively, W. D., & Kustas, W. P. (2020a). A within-season approach for detecting early growth stages in corn and soybean using high temporal and spatial resolution imagery. Remote Sensing of Environment, 242, 111752. https://doi.org/10.1016/j.rse.2020.111752.
Gao, F., Anderson, M., & Hively, W. D. (2020b). Detecting cover crop end-of-season termination within the season using VENμS and Sentinel-2 satellite imagery. Remote Sensing, 12, 3524. https://doi.org/10.3390/rs12213524.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18–27.


Hao, P., Tang, H., Chen, Z., Meng, Q., & Kang, Y. (2020). Early-season crop type mapping using 30-m reference time series. Journal of Integrative Agriculture, 19(7), 1897–1911.
Houborg, R., & McCabe, M. F. (2018). A Cubesat enabled Spatio-temporal enhancement method (CESTEM) utilizing Planet, Landsat and MODIS data. Remote Sensing of Environment, 209, 211–226. https://doi.org/10.1016/j.rse.2018.02.067.
Huete, A., Didan, K., Miura, T., Rodriguez, E. P., Gao, X., & Ferreira, L. G. (2002). Overview of the radiometric and biophysical performance of the MODIS vegetation indices. Remote Sensing of Environment, 83(1), 195–213. https://doi.org/10.1016/S0034-4257(02)00096-2.
Jonsson, P., & Eklundh, L. (2004). TIMESAT—A program for analysing time-series of satellite sensor data. Computers & Geosciences, 30, 833–845.
Justice, C. O., Townshend, J. R. G., Vermote, E. F., Masuoka, E., Wolfe, R. E., Saleous, N., Roy, D. P., & Morisette, J. (2002). An overview of MODIS land data processing and product status. Remote Sensing of Environment, 83, 3–15.
Liu, L., Zhang, X., Yu, Y., Gao, F., & Yang, Z. (2018). Real-time monitoring of crop phenology in the midwestern United States using VIIRS observations. Remote Sensing, 10, 1540. https://doi.org/10.3390/rs10101540.
Lobell, D. B., Thau, D., Seifert, C., Engle, E., & Little, B. (2015). A scalable satellite-based crop yield mapper. Remote Sensing of Environment, 164, 324–333.
Mu, Q., Heinsch, F. A., Zhao, M., & Running, S. W. (2007). Development of a global evapotranspiration algorithm based on MODIS and global meteorology data. Remote Sensing of Environment, 111, 519–536. https://doi.org/10.1016/j.rse.2007.04.015.
Myneni, R., Hoffman, S., Knyazikhin, Y., Privette, J., Glassy, J., Tian, Y., Wang, Y., Song, X., Zhang, Y., Smith, G., et al. (2002). Global products of vegetation leaf area and fraction absorbed PAR from year one of MODIS data. Remote Sensing of Environment, 83, 214–231.
Richardson, A. D. (2018). Tracking seasonal rhythms of plants in diverse ecosystems with digital camera imagery. New Phytologist, 222, 1742–1750. https://doi.org/10.1111/nph.15591.
Roy, D. P., Wulder, M. A., Loveland, T. R., Woodcock, C. E., Allen, R. G., Anderson, M. C., Helder, D., Irons, J. R., Johnson, D. M., Kennedy, R., et al. (2014). Landsat-8: Science and product vision for terrestrial global change research. Remote Sensing of Environment, 145, 154–172.
Sakamoto, T., Wardlow, B. D., Gitelson, A. A., Verma, S. B., Suyker, A. E., & Arkebauer, T. J. (2010). A two-step filtering approach for detecting maize and soybean phenology with time-series MODIS data. Remote Sensing of Environment, 114, 2146–2159.
Schaaf, C. B., Gao, F., Strahler, A. H., Lucht, W., Li, X., Tsang, T., Strugnell, N. C., Zhang, X., Jin, Y., Muller, J. P., et al. (2002). First operational BRDF, albedo and nadir reflectance products from MODIS. Remote Sensing of Environment, 83, 135–148.
Tucker, C. J., Holben, B. N., Elgin, J. H., & McMurtry, J. E., III. (1980). Relationship of spectral data to grain yield variation. Photogrammetric Engineering and Remote Sensing, 46, 657–666.
Vermote, E. F., El Saleous, N. Z., & Justice, C. O. (2002). Atmospheric correction of MODIS data in the visible to middle infrared: First results. Remote Sensing of Environment, 83, 97–111.
Vermote, E., Justice, C., Claverie, M., & Franch, B. (2016). Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product. Remote Sensing of Environment, 185, 46–56.
Walthall, C. L., Hatfield, J., Backlund, P., et al. (2012). Climate change and agriculture in the United States: Effects and adaptation (USDA Technical Bulletin 1935). Washington, DC: USDA. 186 pages.
Wan, Z., Zhang, Y., Zhang, Q., & Li, Z.-L. (2002). Validation of the land-surface temperature products retrieved from Terra Moderate Resolution Imaging Spectroradiometer data. Remote Sensing of Environment, 83, 163–180.
Woodcock, C. E., Allen, R., Anderson, M., Belward, A., Bindschadler, R., Cohen, W., Gao, F., Goward, S. N., Helder, D., Helmer, E., et al. (2008). Free access to Landsat imagery. Science, 320, 1011.


Zhang, X., Friedl, M. A., Schaaf, C. B., Strahler, A. H., Hodges, J. C. F., Gao, F., & Reed, B. C. (2003). Monitoring vegetation phenology using MODIS. Remote Sensing of Environment, 84(3), 471–475.
Zhang, X. Y., Goldberg, M. D., & Yu, Y. Y. (2012). Prototype for monitoring and forecasting fall foliage coloration in real time from satellite data. Agricultural and Forest Meteorology, 158, 21–29.
Zhang, X. Y., Wang, J., Henebry, G. M., & Gao, F. (2020). Development and evaluation of a new algorithm for detecting 30m land surface phenology from VIIRS and HLS time series. ISPRS Journal of Photogrammetry and Remote Sensing, 161, 37–51.

Chapter 3: GIS Fundamentals for Agriculture

Junmei Tang

Abstract GIS has proved to be an effective technology for various agricultural applications, ranging from recording data and predicting crop growth to supporting pesticide control and food safety regulations. As a system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data, GIS has become one of the most dynamic computer application systems. Owing to its powerful capability for collecting and updating real-time data, GIS has been identified as a significant bridge between the data and agriculture communities. This chapter summarizes the major GIS applications in agriculture, including mapping and analytical techniques, spatial databases for agricultural systems, modeling functions, and decision support systems. These applications have benefited various GIS users as well as agricultural communities. New technologies such as machine learning and artificial intelligence provide more opportunities for promoting GIS in more agricultural applications and, at the same time, generate more challenges in understanding global food production and security issues in the future.

Keywords GIS · Agriculture application · Analytical techniques · Spatial database · Modeling · Decision support systems

3.1 Introduction

Geographic information system (GIS) has become a powerful and cost-effective technique for various applications in agriculture over the past decades. Agriculture is inherently a geographical practice, which makes it a natural application of GIS (Wilson 1999). The utility of GIS in agriculture is increasing in farm practices ranging from crop stage monitoring, disease management, yield estimation, soil/weed mapping, and crop growth modeling to spatial decision support, food and security analysis, and policymaking and implementation (Pierce and Clay 2007). It is vital to advance our knowledge in these fields through the continued development of new GIS and related technologies to build sustainable agricultural production systems (Wilson 1999). GIS provides advanced technologies at multiple scales ranging from the field level to the global level owing to its numerous advantages, such as generating updated information efficiently (Boryan et al. 2011), providing timely input data for crop yield and pollution models (Luzio et al. 2004), preparing tables and maps for specific agricultural practices (Chau et al. 2013), and developing decision support systems for disseminating geospatial cropland data (Han et al. 2012). Finding a way to synthesize our data, knowledge, and technologies in agricultural applications is important for the continued development of new GIS functions and related tools to build a more sustainable agricultural production system.

The availability of big data has provided unprecedented opportunities for advancing new knowledge in predictive decision-making and data-supported innovation in agriculture. In October 2016, the US National Institute of Food and Agriculture (NIFA) embarked on the Food and Agriculture Cyberinformatics and Tools (FACT) initiative to identify the frontiers and future of data in agriculture, building on existing US government-wide efforts and investments in big data. To achieve this, NIFA envisions a future for agriculture that is connected, data-driven, personalized, and sustainable. This provides new opportunities and challenges for the GIS and agricultural communities.

J. Tang (*)
Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA, USA
© Springer Nature Switzerland AG 2021. L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_3

3.2 GIS: The Geospatial Approach

GIS is a system designed to capture, store, manipulate, analyze, manage, and present spatial or geographic data (Longley et al. 2005). It integrates hardware, software, and data for displaying, analyzing, and managing all forms of geospatial information (Parthasarathy 2010). Spatial data are commonly stored in layers that might depict environmental or topographic elements. Nowadays, GIS is an essential tool for combining multiple spatial data sources, such as satellite and map information, in spatial models to simulate the interactions of complex natural and human systems.

There has been some debate about the definition of GIS, either as a narrow technological term (Devine and Field 1986) or from a wider perspective (Carter 1989). Dickinson and Calkins (1988) define GIS in terms of three key components: GIS technology, including hardware and software; a GIS database to store geographical and related data; and GIS infrastructure, such as staff, facilities, and supporting elements. All of these early definitions share a common feature: GIS are systems that deal with geographical information and data (Maguire 1991).

The relationship between GIS and other information systems, including computer-aided design, cartography, database management, and remote sensing systems, is important for a comprehensive understanding of GIS (Fig. 3.1). Newell and Theriault (1990) suggested that GIS can be viewed as a subset or superset of these systems and regarded any system capable of putting a map on the screen as a GIS. Although GIS have many features in common with these systems, their spatial analytical operations give GIS unique characteristics (Goodchild 1988). Goodchild (1987) defined spatial analysis as a “set of analytical methods to access both attribute and location information of study objects.”

Fig. 3.1 The relationship between GIS and other information systems. (Source: Maguire 1991)

One driver of the success of GIS applications is the massive generation of digital spatial data and information about the natural and cultural environment, particularly through the rapid development of satellite techniques. Since the 1990s, GIS has been used in many commercial applications and has become one of the most dynamic computer application systems. It has expanded globally and continues to expand. GIS has developed from a simple processing system into a decision support system. As GIS recently moved from traditional stand-alone desktop computers to the World Wide Web, its function as a new tool for sharing and communicating knowledge has become more apparent (Sui and Goodchild 2011). The convergence of GIS and media and the emerging techniques in data-intensive applications have created new challenges for GIS communities and are shaping the future development of GIS.

3.3 GIS Application in Agriculture

Historically, GIS has been applied less in production agriculture than in other business sectors because of the lack of formal opportunities to share GIS innovations in agriculture. The value of GIS to agriculture has been recognized and continues to increase as advanced technology expands the opportunities for agricultural communities to acquire, manage, and analyze spatial data from the farm level to the global level.

3.3.1 GIS Mapping and Analytical Techniques

One advantage that distinguishes GIS from other computer technologies is that its powerful analytical functionality enables data from diverse sources to be integrated and analyzed (Luzio et al. 2017). The use of GIS techniques in farm-related practices has evolved over the past decades at regional and national scales. Combined with remotely sensed data, GIS techniques have been used to support assessments of land capability (Corbett and Carter 1996), crop condition and yield (Wade et al. 1994; Carbone et al. 1996), flood and drought (Yagci et al. 2011; Yu et al. 2013), soil erosion and condition (Narasimhan and Srinivasan 2005), nonpoint source pollution (Fitzhugh and Mackay 2000; Tang 2015), pest infestation and control (Bellotti et al. 1999; Joshi et al. 2004), and climate change impacts (Di et al. 1994; Morton 2007).

The ability of GIS to map and analyze agricultural environments by combining various maps and satellite information sources has proved to be very valuable to the farming industry (Sood et al. 2015). Long et al. (1991) examined the potential of GIS methods, combined with the global positioning system (GPS), in soil surveys and found these methods more efficient than traditional mapping. Loveland et al. (1995) generated multilevel, digital, geographically referenced land cover maps for the contiguous USA based on a GIS method newly developed by the United States Geological Survey (USGS) and the University of Nebraska-Lincoln (Olson and Olson 1985). This combination generated eco-regional maps that describe land cover characteristics at continental and global scales.

The development of environmental data in digital format, such as climate and land cover data, has promoted new agricultural applications in the past decades. For example, the African climate surface was combined with the cumulative seasonal erosion potential (CSEP) to derive a climate index of erosion potential (Kirkby and Cox 1995). The GIS-based EPIC model was used to simulate national spatial crop yield based on digitized climate, soil, irrigation, and topography data (Priya and Shibasaki 2001). Wratt et al. (2006) integrated climate data with GIS-based soil and crop information to reduce risk in agricultural decision-making. A systematic approach to developing practical solutions for adapting agriculture to climate change was proposed using analogue locations in Eastern and Southern Africa (Trincheria et al. 2015). Generally, GIS technology can be used to synthesize and integrate more spatial data than was possible in research or applications of the pre-computer era. The shift from traditional spatial delineation or analysis in agroecological and agroclimatological studies toward user-driven applications, together with the big data revolution, presents remarkable challenges and tremendous opportunities for the GIS and agricultural communities.

The Cropland Data Layer (CDL), an online service provided by the Center for Spatial Information Science and Systems (CSISS 2009) at GMU and the USDA National Agricultural Statistics Service (NASS) (Fig. 3.2), applies Web GIS to provide detailed annual crop type maps from the 1990s to the current year, covering the United States at 30-m spatial resolution. CDL is generated by classifying Landsat images with annual ground truth collected from field surveys and is updated annually to support acreage estimation for the US Agricultural Statistics Board, producing digital, crop-specific, categorized, georeferenced output products (Han et al. 2012). Afterward, CSISS and USDA NASS created VegScape (Yang et al. 2013, Fig. 3.3), a system that operationally produces and distributes nationwide weekly crop condition and progress (phenological stage) products based on 250-m-resolution MODIS satellite images and NDVI time curves. These products have been synchronized to the NASS crop progress and condition reporting standard and are useful for monitoring crop condition and progress at the county level and above.

Fig. 3.2 The Cropland Data Layer disseminated and analyzed by CropScape. (Source: CSISS 2009)
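The integration of co-registered data layers that underlies many of the analyses in this section can be illustrated with a minimal raster overlay sketch. It assumes that each layer is a small grid of suitability scores on the same grid. The function name, layer names, scores, and weights are hypothetical and do not come from any of the cited studies or from a particular GIS package.

```python
def weighted_overlay(layers, weights):
    """Combine co-registered raster layers cell by cell as a weighted sum."""
    if len(layers) != len(weights):
        raise ValueError("one weight is needed per layer")
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("all layers must share the same grid")
    return [
        [sum(w * layer[r][c] for layer, w in zip(layers, weights))
         for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2 x 2 suitability scores (1 = poor, 4 = good).
rainfall_score = [[1, 2], [3, 4]]
soil_score = [[4, 3], [2, 1]]
suitability = weighted_overlay([rainfall_score, soil_score], [0.75, 0.25])
for row in suitability:
    print(row)
```

Real applications would first resample all layers to a common projection, extent, and cell size, which is where much of the effort in GIS data integration lies.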

3.3.2 Spatial Database for Agricultural Systems

The spatial analysis and map function of GIS, however, cannot be fully implemented if the GIS databases are incomplete, obsolete, or even inaccurate. The use of spatial data for agricultural systems requires the integration of multiple data sources with

30

J. Tang

Fig. 3.3 VegScape, the vegetation condition explorer showing the plant health across the United States

the ensured consistency in units, spatial/temporal scales, ownership and copyright, data interpretation, etc. to enable easy dissemination of data (Janssen et al. 2009). Many efforts have been made in different application domains to integrate various data sources together. Hutchinson et al. (1996) developed a gridded topographic and mean monthly climatic database for the Africa continent. Gobin et al. (2004) connected different data sources to assess the indicators of soil erosion on the European scale. Herrero et al. (2007) developed a household-level database about the crop-livestock systems in developing countries. Many projects have been initiated to develop spatially distributed databases at the national level or even continental or global sale. For example, the ﬁrst and largest soil database, NSCD (National Soil Characterization Database) in the United States, was produced and updated by the US Department of Agriculture (USDA 1994). This database descripts more than 5000 soil proﬁles within the United States. The State Soil Geographic (STATSGO) and Soil Survey Geographic (SSURGO) are two soil database, providing the soil input for farm and ranch, landowner/user, township, county, and natural planning and management at farm level (STATSGO) and county level (SSURGO) (Wang and Mellesse 2006). The World Inventory of Soil Emission of Potential Database (WISE) compiled soil proﬁles from 69 countries from the Americas, Asia, Africa, Australia, etc. (Batjes 1995). The INSPIRES initiative of the European Commission (INSPIRES 2008) tried to develop a European spatial information infrastructure that can improve the availability and interoperability of spatial data across the entire Europe so that these datasets can be reused easily for policy assessments. The development of new digital databases has not progressed as quickly as the generation of spatial data and needs to keep being updated in a timely manner. 
There are conceptual and technical differences that hinder the integration of data sources from different application domains. These barriers include differing interpretations of data, missing and uncertain data, or even the unavailability of data (Janssen et al. 2009). Because the spatial data related to agricultural systems are collected by different scientists, data integration is critical to overcoming these problems. Many techniques have been developed to improve interoperability among multiple GIS databases by addressing semantic, schematic, and syntactic heterogeneity (Bishr 1998). We believe the barriers to using multiple GIS databases in agriculture will become less significant in the near future. SEAMLESS (System for Environmental and Agricultural Modeling: Linking European Science and Society) is a research project that aims to build a computerized framework to address economic, environmental, and social issues through micro- and macro-level analyses (Van Ittersum and Donatelli 2003).

3.3.3

GIS-Based Modeling in Agricultural Application

In the previous section, we described the GIS data and databases used to perform agricultural analysis. These data layers are also used as inputs to various models. GISs are very helpful for mapping current conditions and projecting future fluctuations in the environment, crop output, and management adjustments through cross-disciplinary communication.

3.3.3.1

Environment Models Linked to GIS

As a broad definition, the agricultural environment here comprises energy and climate, nutrients and water, and ecosystem services and their environmental impact. Since the mid-1980s, research in this area has increasingly drawn on spatial data to develop a new generation of environmental simulation models (Steyaert 1996). Richardson and Wright (1984) modified a weather generator and used it to generate daily precipitation and evapotranspiration values for agricultural applications. The climate surface generated by Corbett and Carter (1996) was used as input to a crop growth model for risk assessment of different crop types. Wilson et al. (1993) combined the Chemical Movement in Layered Soils (CMLS) model with soil and climate characteristics obtained by overlaying the STATSGO and Montana Agricultural Potentials System (MAPS) databases. Many agricultural decisions and predictions depend not only on the current status of crops and the environment but also on the future status of weather conditions, soil moisture, and evapotranspiration. Numerous operational weather and water forecast models and products are currently available, such as the 3-km, 15-hour High-Resolution Rapid Refresh (HRRR) forecast from the National Centers for Environmental Prediction (NCEP) (Weygandt et al. 2009), the Climate Forecast System (CFS) 56-km, 9-month ensemble seasonal forecast, the National Water Model (NWM) WRF-Hydro 1-km, 10-day hydrological forecast (Gochis et al. 2013), the 9-km Soil Moisture Active Passive (SMAP) Level 4 root zone soil moisture products (Reichle et al. 2017), and an 8-km daily evapotranspiration product (Anderson et al. 2011). Many projects have combined GIS and environmental models to evaluate the impact of agriculture on water resources, biodiversity, etc. Early models were based on statistical methods and did not consider specific physical processes or differences between locations (Shen et al. 2012). With the advancement of remote sensing and the availability of spatial information, more and more spatially explicit models have been developed, such as nonpoint source pollution models (Engel et al. 1993), soil erosion models (Lufafa et al. 2003), and ecosystem biogeochemistry models (Denitrification-Decomposition (DNDC), the Lund-Potsdam-Jena Dynamic Global Vegetation Model (LPJ-DGVM), and the CENTURY model) (Li et al. 1992; Del Grosso et al. 2001; Sitch et al. 2003).

3.3.3.2

Crop Yield Prediction Based on GIS

Crop yield prediction is an important GIS application in which biophysical processes and human practices are incorporated into the model (Priya and Shibasaki 2001). Crop simulation models usually use environmental factors, including soil chemical/physical parameters, water management, weather, and agronomic practices, as input data (Penning de Vries et al. 1989). With the help of GIS, crop yield estimation can be extended to various scales, from regional to global. Olesen and Bindi (2002) studied the impact of climate change on agricultural productivity at the regional scale using GIS, and Priya and Shibasaki (2001) simulated crop yield with a GIS-based crop production model at the country or sub-continental scale. Most crop yield estimation models use GIS as a powerful tool to assess site-specific information simultaneously. WOFOST (World Food Studies) was developed by the Centre for World Food Studies, the Netherlands, together with the Agricultural University and the Centre for Agrobiological Research (CABO) in Wageningen, the Netherlands, to simulate crop growth under combinations of crop type, soil, and climate conditions (Diepen et al. 1989). Carbone et al. (1996) used GIS and remote sensing with the soybean physiological growth model SOYGRO to predict soybean yield in Orangeburg County, South Carolina. Many similar models have been developed in different countries, including DSSAT (Decision Support System for Agrotechnology Transfer), CERES (Crop Environment Resource Synthesis), and EPIC (Environmental Policy Integrated Climate model) in the United States; APSIM (Agricultural Production Systems sIMulator) in Australia; STICS (Simulateur multidisciplinaire pour les Cultures Standard) in France; and CCSODS (Crop Computer Simulation, Optimization, Decision-Making System) in China (Lin et al. 2003).

3.3.3.3

Agricultural Management Models Using GIS

Improving agricultural efficiency is important for the sustainability of agriculture worldwide and requires greater use of information technology on the farm (Nishiguchi and Yamagata 2009). An important concept of sustainable agriculture is modeling farmers’ practices to improve resource efficiency and farming profit (Rao et al. 2000). GIS technology can help farmers determine the relationship between management and production in order to predict yield while accounting for spatial and temporal differences (Kalita et al. 1992; Rao et al. 2000). The EPIC model was developed to simulate the impact of different agricultural management practices on crop yield and on nutrient and pesticide losses (Williams et al. 1983; Xie et al. 2015). Other commonly used models for assessing agricultural management practices include SWAT (Soil and Water Assessment Tool), AGNPS (Agricultural Nonpoint Source), HSPF (Hydrological Simulation Program-FORTRAN), APEX (Agricultural Policy/Environmental eXtender), GLEAMS (Groundwater Loading Effects of Agricultural Management Systems), PLOAD (Pollution Load), USLE (Universal Soil Loss Equation), etc. (Xie et al. 2015). Although many models incorporate spatial data, few agricultural producers use the analytical power of GIS because of the “difficulty” of delivering real-time data to the farming community. Closing this gap is necessary and achievable through data-driven spatial decision support systems, which push agro-informatics from “lab” to “land.”
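Of the models listed above, the USLE is simple enough to illustrate in a few lines. In its standard form, the average annual soil loss A is the product of five factors: rainfall erosivity (R), soil erodibility (K), slope length and steepness (LS), cover management (C), and support practice (P). The sketch below applies that product cell by cell to small co-registered grids, as a GIS-based implementation might. The function name usle_soil_loss and every factor value are hypothetical and serve only as an illustration.

```python
from typing import List

Grid = List[List[float]]


def usle_soil_loss(r: Grid, k: Grid, ls: Grid, c: Grid, p: Grid) -> Grid:
    """Apply the USLE, A = R * K * LS * C * P, to every cell of five grids.

    All grids are assumed to be co-registered (same rows and columns).
    """
    rows, cols = len(r), len(r[0])
    for name, grid in (("r", r), ("k", k), ("ls", ls), ("c", c), ("p", p)):
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError(f"grid {name!r} has an inconsistent shape")
    return [
        [r[i][j] * k[i][j] * ls[i][j] * c[i][j] * p[i][j] for j in range(cols)]
        for i in range(rows)
    ]


if __name__ == "__main__":
    # Hypothetical 1 x 2 field: the second cell is steeper (larger LS factor).
    loss = usle_soil_loss(
        r=[[100.0, 100.0]],
        k=[[0.3, 0.3]],
        ls=[[1.2, 2.4]],
        c=[[0.2, 0.2]],
        p=[[1.0, 1.0]],
    )
    print(loss)
```

Because the factors are multiplied, doubling any single factor (here LS in the second cell) doubles the estimated soil loss for that cell, which is why the spatial resolution of each input layer matters.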

3.3.4

Decision Support System

3.3.4.1

Traditional Decision Support Systems

Under given climate and weather conditions, different crop production and management practices, such as introducing later-maturing crop varieties or species, switching cropping sequences, sowing earlier, adjusting the timing of field operations, conserving soil moisture through appropriate tillage methods, improving irrigation efficiency, and changing pest management practices, have different effects on crop yield and environmental footprint. Integrating advanced remote sensing and GIS with artificial intelligence techniques leads to an intelligent system that helps local farmers and policy planners make complex practice decisions. Traditional decision support systems (DSS), such as APSIM (Agricultural Production Systems sIMulator) and DSSAT (Decision Support System for Agrotechnology Transfer), provide crop simulation scenarios to inform farmers’ practices. For example, the irrigation schedule, sowing time, and fertilizer dose can be simulated and the best options adopted for better crop yields. The spatial decision support system (SDSS) was designed by adding a spatial component to the traditional DSS so that all the spatial datasets can be overlaid into one final output. The SDSS provides complex analytical functions for spatial analysis. It not only runs various crop models and maps simulated outputs but also improves the visualization of the environment for policy makers and integrates the database management system with expert knowledge.
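The overlay of several spatial datasets into one output can be sketched as a weighted combination of co-registered raster layers. This is an illustrative sketch under stated assumptions, not the method of any particular SDSS: the function name weighted_overlay, the layer names, the suitability scores, and the weights are all hypothetical.

```python
from typing import Dict, List

Grid = List[List[float]]


def weighted_overlay(layers: Dict[str, Grid], weights: Dict[str, float]) -> Grid:
    """Combine co-registered layers into one grid as a weighted average."""
    if not layers:
        raise ValueError("at least one layer is required")
    if set(layers) != set(weights):
        raise ValueError("layers and weights must use the same names")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    first = next(iter(layers.values()))
    rows, cols = len(first), len(first[0])
    for name, grid in layers.items():
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError(f"layer {name!r} has an inconsistent shape")

    out = [[0.0] * cols for _ in range(rows)]
    for name, grid in layers.items():
        w = weights[name] / total  # normalize so the weights sum to 1
        for i in range(rows):
            for j in range(cols):
                out[i][j] += w * grid[i][j]
    return out


if __name__ == "__main__":
    # Hypothetical suitability scores in [0, 1] on a 2 x 2 grid.
    layers = {
        "soil": [[1.0, 0.0], [0.5, 1.0]],
        "water": [[0.0, 1.0], [0.5, 1.0]],
    }
    print(weighted_overlay(layers, {"soil": 1.0, "water": 1.0}))
```

Normalizing the weights keeps the output on the same scale as the inputs, so cells that score well on every layer remain distinguishable from cells that score well on only one.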

3.3.4.2

New Direction and Trends in Decision Support System

Data-driven SDSS leads to much more intelligent systems; it is a new research direction and a challenge for most current SDSS developers. The success of the data-driven approach depends on the effectiveness of the incorporated models and the quality of the gathered data. For example, an agricultural SDSS needs data on not only the current state but also the predicted future condition of crops and their supporting resources and environment. In most cases, remote sensing can provide real-time, cost-effective observations of current conditions as the initial state of the model. A range of well-tested prediction models exists for almost all crop-related parameters mentioned in the previous sections, e.g., environment models to determine the crop growth environment and growth models to predict the crop growth process. Although many researchers are working on data-driven agricultural SDSS, there is still a lack of synergistic systems that integrate the broad individual research areas into an SDSS. One successful system is Climate FieldView™, developed by the Climate Corporation to help farmers make data-driven decisions to sustainably increase and maximize their productivity. Most of these data-driven systems are applied to precision agriculture or site-specific farming, in applications ranging from seed, fertilizer, pesticide, and irrigation management to yield prediction. Another new trend in agricultural DSS is providing online, real-time, and customized agricultural services to local farmers and policy makers. Web-based DSS make real-time monitoring and decision-making feasible through a series of innovative web technologies, standards-based geospatial interoperability, the sensor web, geospatial processing modeling, Open Geospatial Consortium (OGC) technologies, and location-based service technologies.
A successful web-based decision support system should be fast and user-friendly, providing sufficient and highly useful capabilities to the end user (Fernandez and Neal 2007). As an example, the remote-sensing-based flood crop loss assessment service system (RF-CLASS) is a web-based decision support system that automatically delivers flood-related products to USDA NASS to support post-flood decision-making such as crop flood insurance policy (Di et al. 2017). Currently, many agricultural information service companies are developing new technologies, driven by the rapidly growing smart farming industry in the United States and around the world. New technologies, such as big data and cloud computing, are accelerating this development and leading to more robots and artificial intelligence in the farming industry.

3.4

Conclusion

The past decades have witnessed tremendous growth in GIS applications in agriculture. These applications have benefited various GIS users and agricultural communities, as more and more advanced tools promote further connection of GIS with other computer and related technologies. In the near future, the growth of customized services for site-specific farming will promote further development of GIS. These services will be provided to farmers and policy makers to increase farming profitability and reduce environmental impact. Advanced GIS technologies provide not only innovative knowledge in soil/crop and model/database research but also real-time data and information for decision-making and management. Future research is needed to improve and synthesize our understanding of these aspects, from site-specific farming systems to global food production and security issues.

References

Anderson, M. C., Kustas, W. P., Norman, J. M., Hain, C. R., Mecikalski, J. R., Schultz, L., et al. (2011). Mapping daily evapotranspiration at field to continental scales using geostationary and polar orbiting satellite imagery. Hydrology and Earth System Sciences, 15, 223–239.
Batjes, N. H. (1995). A homogenized soil data file for global environmental research: A subset of FAO, ISRIC and NRCS profiles (Working paper and preprint). Wageningen: ISRIC.
Bellotti, A. C., Smith, L., & Lapointe, S. L. (1999). Recent advances in Cassava pest management. Annual Review of Entomology, 44, 343–370.
Bishr, Y. (1998). Overcoming the semantic and other barriers to GIS interoperability. International Journal of Geographical Information Science, 12(4), 299–314.
Boryan, C., Yang, Z., Mueller, R., & Craig, C. (2011). Monitoring US agriculture: The US department of agriculture, national agricultural statistics service, cropland data layer program. Geocarto International, 1, 1–18.
Carbone, G. J., Narumalani, S., & King, M. (1996). Application of remote sensing and GIS technologies with physiological crop models. Photogrammetric Engineering and Remote Sensing, 62, 171–179.
Carter, J. R. (1989). On defining the geographic information system. In W. J. Ripple (Ed.), Fundamentals of Geographic Information Systems: a compendium. Falls Church, Virginia: ASPRS/ACSM.
Chau, V. N., Holland, J., Cassells, S., & Tuohy, M. (2013). Using GIS to map impacts upon agriculture from extreme floods in Vietnam. Applied Geography, 41, 65–74.
Corbett, J. D., & Carter, S. E. (1996). Using GIS to enhance agricultural planning: the example of inter-seasonal rainfall variability in Zimbabwe. Transactions in GIS, 1, 207–218.
CSISS (Center for Spatial Information Science and Systems). (2009). Cropscape. Available at: https://nassgeodata.gmu.edu/CropScape/.
Del Grosso, S. J., Parton, W. J., Mosier, A. R., Hartman, M. D., Brenner, J., Ojima, D. S., & Schimel, D. S. (2001). Simulated interaction of carbon dynamics and nitrogen trace gas fluxes using the DAYCENT model. In M. Schaffer et al. (Eds.), Modeling carbon and nitrogen dynamics for soil management (pp. 303–332). Boca Raton: CRC Press.
Devine, H. A., & Field, R. C. (1986). The gist of GIS. Journal of Forestry, 16, 17–22.

Di, L., Rundquist, D. C., & Han, L. (1994). Modeling relationships between NDVI and precipitation during vegetation growth cycles. International Journal of Remote Sensing, 15(10), 2121–2136.
Di, L., Eugene, G. Y., Kang, L., Shrestha, R., & Bai, Y. (2017). RF-CLASS: A remote-sensing-based flood crop loss assessment cyber-service system for supporting crop statistics and insurance decision-making. Journal of Integrative Agriculture, 16(2), 408–423.
Dickinson, H., & Calkins, H. W. (1988). The economic evaluation of implementing a GIS. International Journal of Geographical Information Systems, 2, 307–327.
Diepen, C. A., Wolf, J., Keulen, H., & Rappoldt, C. (1989). WOFOST: A simulation model of crop production. Soil Use and Management, 5(1), 16–24.
Engel, B. A., Srinivasan, R., Arnold, J., Rewerts, C., & Brown, S. J. (1993). Nonpoint Source (NPS) pollution modeling using models integrated with geographic information systems (GIS). Water Science & Technology, 28(3–5), 685–690.
Fernandez, C. J., & Neal, T. T. (2007). Development of a web-based decision support system for crop managers: structural considerations and implementation case. Agronomy Journal, 99(3), 730–737.
FitzHugh, T. W., & Mackay, D. S. (2000). Impacts of input parameter spatial aggregation on an agricultural nonpoint source pollution model. Journal of Hydrology, 236(1–2), 35–53.
Gobin, A., Jones, R., Kirkby, M., Campling, P., Govers, G., Kosmas, C., & Gentile, A. R. (2004). Indicators for Pan-European assessment and monitoring of soil erosion by water. Environmental Science and Policy, 7, 25–38.
Gochis, D. J., Yu, W., & Yates, D. N. (2013). The WRF-Hydro model technical description and user’s guide, version 1.0. NCAR Technical Document, 120. Boulder, CO: Research Applications Laboratory.
Goodchild, M. F. (1987). A spatial analytical perspective on GIS. International Journal of Geographical Information Systems, 1, 327–334.
Goodchild, M. F. (1988). Towards an enumeration and classification of GIS functions. In R. T. Aangeenbrug & Y. M. Schiffman (Eds.), International Geographic Information systems (IGIS) Symposium: The research Agenda. Falls Church: AAG.
Han, W., Yang, Z., Di, L., & Mueller, R. (2012). CropScape: A web service based application for exploring and disseminating US conterminous geospatial cropland data products for decision support. Computers and Electronics in Agriculture, 84, 111–123.
Herrero, M., Gonzalez-Estrada, E., Thornton, P. K., Quiros, C., Waithaka, M. M., Ruiz, R., & Hoogenboom, G. (2007). IMPACT: Generic household-level databases and diagnostics tools for integrated crop-livestock systems analysis. Agricultural Systems, 92, 240–265.
Hutchinson, M. F., Nix, H. A., McMahon, J. P., & Ord, K. D. (1996). The development of a topographic and climate database for Africa. In Proceedings, third international conference integrating GIS and environment modeling. New Mexico.
INSPIRE. (2008). INSPIRE. Available at http://inspire.jrc.it.
Janssen, S., Andersen, E., Athanasiadis, I. N., & Ittersum, M. K. (2009). A database for integrated assessment of European agriculture systems. Environmental Science & Policy, 12(5), 573–587.
Joshi, C., Leeuw, J. D., & Duren, I. C. V. (2004). Remote sensing and GIS applications for mapping and spatial modeling of invasive species. In Proceeding of ISPRS (pp. 1–9). Hannover, Germany: ISPRS, MDPI.
Kalita, P. K., Kanwar, R. S., & Bischoff, J. H. (1992). Using the ADAPT model to simulate water table management effects in groundwater quality (ASAE Paper No. 92-2124). St. Joseph.
Kirkby, M. J., & Cox, N. J. (1995). A climatic index for soil erosion potential (CSEP) including seasonal and vegetation factors. Catena, 25, 333–352.
Li, C., Frolking, S. E., & Frolking, T. A. (1992). A model of nitrous oxide evolution from soil driven by rainfall events: 1. Model structure and sensitivity. Journal of Geophysical Research: Atmospheres, 97, 9759–9776.
Lin, Z. H., Mo, X. G., & Xiang, Y. Q. (2003). Research advances on crop growth models. Acta Agronomica Sinica, 29(5), 750–758.

Long, D. S., DeGloria, S. D., & Galbraith, J. M. (1991). Use of the global positioning system in soil survey. Journal of Soil and Water Conservation, 46, 293–297.
Longley, P. A., Goodchild, M. F., Maguire, D. J., & Rhind, D. W. (2005). Geographical information systems and sciences (2nd ed.). New York: Wiley.
Loveland, T. R., Merchant, J. W., Brown, J. F., Ohlen, D. O., Reed, B. C., Olson, P., & Hutchinson, J. (1995). Map supplement: seasonal land-cover regions of the United States. Annals of the Association of American Geographers, 85(2), 339–355.
Lufafa, A., Tenywa, M. M., Isabirye, M., Majaliwa, M. J. G., & Woomer, P. L. (2003). Prediction of soil erosion in a Lake Victoria basin catchment using a GIS-based universal soil loss model. Agricultural Systems, 76(3), 883–894.
Luzio, M. D., Srinivasan, R., & Arnold, J. G. (2004). A GIS-coupled hydrological model system for the watershed assessment of agricultural nonpoint and point sources of pollution. Transactions in GIS, 8(1), 113–136.
Luzio, M. D., White, M. J., Arnold, J. G., Williams, J. R., & Kiniry, J. R. (2017). A large scale GIS geodatabase of soil parameters supporting the modeling of conservation practice alternatives in the United States. Journal of Geographic Information System, 9(3), 267–278.
Maguire, D. J. (1991). An overview and definition of GIS. In P. A. Longley, M. F. Goodchild, D. J. Maguire, & D. W. Rhind (Eds.), Geographic Information Systems. London: Wiley.
Morton, J. F. (2007). The impact of climate change on smallholder and subsistence agriculture. Proceedings of the National Academy of Sciences of the United States of America (PNAS), 104(50), 19680–19685.
Narasimhan, B., & Srinivasan, R. (2005). Development and evaluation of soil moisture deficit index (SMDI) and evapotranspiration deficit index (ETDI) for agricultural drought monitoring. Agricultural and Forest Meteorology, 133, 69–88.
Newell, R. G., & Theriault, D. G. (1990). Is GIS just a combination of CAD and DBMS? Mapping Awareness, 4(3), 42–45.
Nishiguchi, O., & Yamagata, N. (2009). Agricultural information management system using GIS technology. Hitachi Review, 1, 265–270.
Olesen, J. E., & Bindi, M. (2002). Consequences of climate change for European agricultural productivity, land use and policy. European Journal of Agronomy, 16(4), 239–262.
Olson, K. R., & Olson, G. W. (1985). Use of agronomic data and enterprise budgets in land assessment evaluations. Journal of Soil and Water Conservation, 40(5), 455–458.
Parthasarathy, U. (2010). Importance of GIS in Agriculture. Financing Agriculture, 42(3), 6–10.
Penning de Vries, F. W. T., Jansen, D. M., ten Berge, H. F. M., & Bakema, A. (1989). Simulation of ecophysiological processes of growth in several annual crops. Wageningen: Pudoc.
Pierce, F. J., & Clay, D. (2007). GIS application in agriculture. Boca Raton: CRC Press.
Priya, S., & Shibasaki, R. (2001). National spatial crop yield simulation using GIS-based crop production model. Ecological Modeling, 136(2–3), 113–129.
Rao, M. N., Waits, D. A., & Neilsen, M. L. (2000). A GIS-based modeling approach for implementation of sustainable farm management practices. Environmental Modeling & Software, 15, 745–753.
Reichle, et al. (2017). Assessment of the SMAP Level-4 Surface and Root Zone Soil Moisture Product Using In Situ Measurements. Journal of Hydrometeorology, 18, 2621–2645.
Richardson, C. W., & Wright, D. A. (1984). WGEN: a model for generating daily weather variables (Agricultural research services report no. 8). Washington, DC: US Department of Agriculture.
Shen, Z., Liao, Q., Hong, Q., & Gong, Y. (2012). An overview of research on agricultural non-point source pollution modeling in China. Separation and Purification Technology, 84, 104–111.
Sitch, S., Smith, B., Prentice, I. C., Arneth, A., Bondeau, A., Cramer, W., Kaplan, J. O., Levis, S., Lucht, W., Sykes, M. T., Thonicke, K., & Venevsky, S. (2003). Evaluation of ecosystem dynamics, plant geography and terrestrial carbon cycling in the LPJ dynamic global vegetation model. Global Change Biology, 9(2), 161–185.

Sood, K., Singh, S., Rana, R. S., Rana, A., Kalia, V., & Kaushal, A. (2015). Application of GIS in precision agriculture. In National seminar on “Precision farming technologies for high Himalayas”, India. https://doi.org/10.13140/RG.2.1.2221.3368.
Steyaert, L. T. (1996). Status of land data for environmental modeling and challenges for geographic information systems in land characterization. In M. F. Goodchild, L. T. Steyaert, B. O. Parks, C. Johnston, D. Maidment, M. Crane, & S. Glendinning (Eds.), GIS and environmental modeling: Progress and research issues. Canada: Wiley.
Sui, D., & Goodchild, M. (2011). The convergence of GIS and social media: Challenges for GIScience. International Journal of Geographical Information Science, 25(11), 1737–1748.
Tang, J. (2015). Dynamic linkages between vegetation phenology and seasonal changes in water quality in the Choptank Watershed, USA. International Journal of Remote Sensing, 36(12), 3041–3057.
Trincheria, J. D., Craufurd, P., Harris, D., Mannke, F., Nyamangara, J., Rao, K. P. C., & Filho, W. L. (2015). Adapting agriculture to climate change by developing promising strategies using analogue locations in Eastern and Southern Africa: A systematic approach to develop practical solution. In W. L. Filho et al. (Eds.), Adapting African agriculture to climate change. Cham: Springer International Publishing.
USDA. (1994). National soil characterization data. Soil Survey Laboratory, National Soil Survey Center, Soil Conservation Service, Lincoln, Nebraska.
USDA-NRCS (U.S. Department of Agriculture-Natural Resources Conservation Service). (1995). Soil survey geographic (SSURGO) data base: Data use information. Fort Worth: National Cartography and GIS Center.
Van Ittersum, M. K., & Donatelli, M. (2003). Special issue of European Journal of Agronomy: Modeling cropping systems. European Journal of Agronomy, 18(3–4), 187–194.
Wade, G., Mueller, R., Cook, P., & Doraiswamy, P. (1994). AVHRR map products for crop condition assessment: a geographic information system approach. Photogrammetric Engineering and Remote Sensing, 60, 1145–1150.
Wang, X., & Melesse, A. M. (2006). Effects of STATSGO and SSURGO as inputs on SWAT models snowmelt simulation. Journal of the American Water Resources Association, 121, 1217–1236.
Weygandt, S. S., Smirnova, T. G., Benjamin, S. G., Brundage, K. J., Sahm, S. R., Alexander, C. R., & Schwartz, B. E. (2009, June). The High Resolution Rapid Refresh (HRRR): An hourly updated convection resolving model utilizing radar reflectivity assimilation from the RUC/RR. In Preprints, 23rd conference on weather analysis and forecasting/19th conference on numerical weather prediction (Vol. 15). Omaha: American Meteorological Society A.
Williams, J. R., Jones, C. A., & Dyke, P. T. (1983). A modeling approach to determine the relation between erosion and soil productivity. Transactions of the American Society of Agricultural Engineers, 27, 129–144.
Wilson, J. P. (1999). Local, national, and global applications of GIS in agriculture. In P. A. Longley, M. F. Goodchild, D. J. Maguire, & D. W. Rhind (Eds.), Geographical information systems: Principles and technical issues. New York: Wiley.
Wilson, J. P., Inskeep, W. P., Rubright, P. R., Cooksey, D., Jacobsen, J. S., & Snyder, R. D. (1993). Coupling geographic information systems and models for weed control and ground-water protection. Weed Technology, 7, 255–264.
Wratt, D. S., Tait, A., Griffiths, G., Espie, P., Jessen, M., Keys, J., et al. (2006). Climate for crops: Integrating climate data with information about soils and crop requirements to reduce risks in agricultural decision-making. Meteorological Applications, 13, 305–315.
Xie, H., Chen, L., & Shen, Z. (2015). Assessment of agricultural best management practice using models: current issues and future perspectives. Water, 7(3), 1088–1108.
Yagci, A. L., Di, L., Deng, M., Yu, G., & Peng, C. (2011). Global agricultural drought mapping: results for the year 2011. IGARSS, July 2012.

Yang, Z., Yu, G., Di, L., Zhang, B., Han, W., & Mueller, R. (2013). Web service-based vegetation condition monitoring system – VegScape. In IEEE International Geoscience and remote sensing symposium – IGARSS 2013. Melbourne, Australia: IEEE.
Yu, G., Di, L., Zhang, B., Shao, Y., Shrestha, R., & Kang, L. (2013). Remote-sensing-based flood damage estimation using crop condition profiles. In Second international conference on agro-geoinformatics. Fairfax: IEEE.

Chapter 4

Agro-geoinformatics Data Sources and Sourcing

Ziheng Sun, Liping Di, Hui Fang, Liying Guo, Xicheng Tan, Lili Jiang, and Zhongxin Chen

Abstract This chapter summarizes state-of-the-art data sources and sourcing methods of agro-geoinformatics. The data mainly come from four sources: satellites, airborne sensors, in-situ sensors, and human reports. Overall, the satellite datasets have the best spatial and temporal coverage. The airborne and in-situ datasets are mostly project-specific or site-specific. Human reports provide brief descriptions using concise terms and numbers to answer basic questions. The data from various sources often overlap spatially, temporally, spectrally, and/or thematically and can be combined to obtain a comprehensive understanding of crop fields. Data sourcing has three major options: conventional sourcing, cloud sourcing, and crowdsourcing. Conventional sourcing depends on human surveyors, is often labor-intensive, and involves tedious administrative processes. The cloud-based approach simplifies the collection and distribution of large amounts of collected data. The cutting-edge crowdsourcing approach greatly lowers the cost of data gathering and retrieval. Future development is moving toward Internet-based, mobile-friendly, low-cost, robust, and high-performance distribution of big data.

Z. Sun · L. Di (*) · H. Fang · L. Guo
Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA, USA

X. Tan
School of Remote Sensing and Information Engineering, Wuhan University, Wuhan, China

L. Jiang
Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing, China

Z. Chen
Information Technology Division, Food and Agriculture Organization of the United Nations, Rome, Italy

© Springer Nature Switzerland AG 2021
L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_4

Keywords Satellite · Remote sensing · In-situ sensor · Field survey · Crowdsourcing · Cloud sourcing

4.1

Introduction

Agro-geoinformatics uses the techniques of geoinformatics to study agricultural problems (Di and Yang 2014). Commonly, geoinformatics refers to three techniques: remote sensing (RS) (Richards and Richards 1999), geographical information systems (GIS) (Jones 2014), and the global positioning system (GPS) (Hofmann-Wellenhof et al. 1994). The data sources used by agro-geoinformatics are extremely varied. Unlike in the earlier data-scarce era, today’s researchers and engineers have plenty of datasets to choose from owing to the rapid development of both remote and in-situ sensors. However, as many scientists have said, data are never enough. In many scenarios, especially those with high-precision requirements, people have to observe manually at the scene. For example, current research on precision agriculture (Mulla 2013; Pierce and Nowak 1999) relies on images captured on demand by unmanned aerial vehicles (UAVs) (Valavanis 2008). When people want to study a crop field, they bring equipment and stay at the field to observe. They manually operate machines, e.g., drones, to obtain data. To avoid this heavy labor, in-situ sensors (many of them solar powered) have been developed and installed in fields to observe in real time without human intervention. The data are transferred back to a receiver station at regular intervals. However, these sensors face risks such as floods, animal damage, rain (which can obscure camera lenses), and communication and power failures. Satellites can also be used but are less competitive than UAVs in terms of spatial resolution. The highest spatial resolution of satellite images is at the decimeter level (for optical bands; it is coarser for hyperspectral bands), while the resolution of UAV or airplane imagery can reach the centimeter level. Most high-resolution datasets are owned by commercial corporations and government agencies and are not free to use (Wikipedia 2014a). Sourcing channels, including discounts for large customers, are offered to facilitate purchasing.
A great quantity of freely available RS datasets have medium to low resolution, such as the well-known Landsat (30 m) (USGS 2014; Wikipedia 2014b), MODIS (250 m) (NASA 2014), and Sentinel (ranging from 5 m to 60 m) (Drusch et al. 2012). That does not mean free data have no place in the market. Medium-to-low-resolution data can be used to monitor crop fields and predict yields at large scale, e.g., across the entire United States, and to generate various indicators that support annual agricultural policy making. Besides remotely sensed data, in situ measurements of vegetation, soil moisture, and meteorological conditions are in great demand. Such data serve as ground truth and are very important. But they are captured discontinuously and only on specific fields, because the monitoring devices are point-oriented and expensive. A device can observe only one object, so the observations are sparse in space. Spatial interpolation, which adds more errors to the results, has to be conducted in order

4 Agro-geoinformatics Data Sources and Sourcing


to cover the whole area. Commonly, data consumers design a plan and assign people to execute it on the studied field. One result of this strategy is that the processing cycle is very long and the data are sporadic. In situ sensors are a better solution, but their initial and maintenance costs are still too high for studying a large area. Such in situ strategies are therefore usually applied by government agricultural agencies, which have many field offices and observation stations all over the country. The field stations can deliver observations to data archives 24/7. In addition, the agencies can collect information directly from farmers, who need to report their current situation to receive government subsidies. Agricultural departments can thus obtain human reports and sensor observations at the same time and compare them to find inconsistencies and determine what the real situation is. That information supports a comprehensive assessment of the outlook for annual food security. GPS, in turn, is one of the essential prerequisites that make precision agriculture, or site-specific farming, possible (Schneider and Wagner 2015; Zhang et al. 2002). The GPS signal is free and helps conduct agricultural activities more accurately and cost-effectively. Typical uses include farm planning, field mapping, soil sampling, targeted irrigation, crop fertilization, tractor guidance, crop scouting, variable rate application, and yield estimation (Stafford 2000). The excellent availability of the GPS signal and the prosperity of the GPS receiver market make GPS quite easy to use in agriculture. There is little room left for development on the hardware side, so the market has started to dig into downstream applications of GPS. A popular direction is tractor route planning based on real-time geospatial information and the GPS signal. Many startups and large companies have become deeply engaged in recent years.
The relevant data, such as farm maps, digital elevation models (DEMs), planting history, real-time soil data, and the latest weather observations and forecasts, are the battlefields where much of the competition takes place. Far fewer standard datasets have been released than are needed. More effort is required on data collection to supplement the production chain. As introduced above, the data sources of agro-geoinformatics include satellites, aircraft, UAVs, in situ sensors, and human reports (Nabrzyski et al. 2014). The data consumers include many communities, such as governments, agro-industries, researchers studying precision agriculture, and individuals with interests in commodity agriculture (farmers, middlemen, supermarkets, etc.) (Fretwell 1987; Lyson 2012). Sourcing generally means that consumers obtain data from the sources via some channel at a certain cost. This business has coexisted with agriculture since ancient times. Authorities needed information about fields and crops to collect taxes from farmers. Farmers needed information about weather and solar terms to decide when to sow, fertilize, irrigate, and harvest. No matter who the initiator is, the basic routine is the same: information about crop fields or animals is observed, organized, and delivered to consumers who employ the observers or pay a fee. Centuries later, although the techniques and the data themselves have changed sharply in form, the same routine remains in use today. Compared with the ancient workflow, the biggest change in current sourcing is the way data are collected and distributed. Electronic devices are used on a massive scale to capture data automatically, and the Internet, especially wireless networks (Wang et al. 2006), is augmenting the data


Z. Sun et al.

distribution to an extraordinarily fast level. The data archived in data centers (most of them cloud data centers (Beloglazov and Buyya 2010; Wang and Ng 2010)) can be ordered and downloaded to personal devices. Consumers can gain a detailed understanding of a field without going there. Besides, a brand new sourcing scheme, crowdsourcing, has emerged in recent years (Brabham 2012). Smartphones have become cheap and common, and anyone can voluntarily contribute their own observations by taking photos or entering information on their devices (Chuang et al. 2016; Silvertown et al. 2015). Crowdsourcing, whose datasets are freely accessible to the public, is an important data collection strategy for citizen agricultural science (Boim et al. 2012; Lukyanenko et al. 2011). In conclusion, modern sourcing creates great opportunities in agricultural business. It improves the speed and efficiency of the entire agricultural community and brings modern advanced agriculture within reach. This chapter describes state-of-the-art data sources and sourcing in agro-geoinformatics for further reference. We investigated both academic materials and industrial businesses to ensure that our statements agree as closely as possible with the reality of the current market. This chapter is organized as follows. Section 4.2 introduces in detail the major data sources available for agro-geoinformatics from various platforms. Section 4.3 discusses the major sourcing methods. Section 4.4 summarizes the chapter.

4.2

Data Sources

This section investigates the currently available data sources for agro-geoinformatics. According to their observation platforms, they can be divided into four clusters: satellite, airborne/UAV, in situ sensors, and human reports. Related datasets, archives, websites, and services were identified and inventoried. Each cluster is detailed below.

4.2.1

Satellite

Dozens of Earth observation (EO) missions have been carried out in the past six decades, and more are on the way (Kramer 2002; Parkinson 2003; Sandau 2010). Satellites, one of the greatest inventions of humankind, have been launched into Earth orbit thousands of times. A bird's-eye view, periodic revisits, long-term and stable operation, a large field of view, and resistance to disturbance are among the advantages of satellite-based EO over many other means of observation. Satellite data are ideal for long-term and large-scale monitoring. We sorted out the details of some major satellite EO datasets that are relevant to agro-geoinformatics and list them in an inventory (Table 4.1).


The Landsat program is the longest-running EO program and has the largest global user community because of its easy access, especially after USGS made all the data freely available online in 2008. The data are now available not only in the USGS archive but also on commercial cloud platforms such as Amazon Web Services (AWS) (Varia and Mathew 2014) and Google Earth Engine (GEE) (Gorelick 2013). The seven satellites of the program that have operated so far (the sixth failed) have collected millions of scenes since 1972 and provide a long-term, complete view of the entire Earth. The 30-m resolution is good enough for many general application scenarios. The Landsat datasets are therefore used widely and benefit many industries, and agriculture is no doubt one of them. Both the academic and industrial agricultural communities use Landsat data, as numerous papers and news reports attest (Jurgens 1997; Ozelkan et al. 2016; Sheoran and Haack 2013; Zhong et al. 2014). Terra and Aqua are two other popular EO satellites launched by NASA (Savtchenko et al. 2004). Since their launches, Terra has operated for 17 years and Aqua for 15 years. Both carry multiple EO instruments. We focus on the MODIS instrument because the others aim at observing the atmosphere and ocean. MODIS has a mature product hierarchy that classifies its products into several levels. The MODIS land products are the ones related to agriculture. For example, the surface reflectance products (09GA, 09GQ, 09A1, 09Q1) capture the true spectral characteristics of crops. The annual land cover products (12Q1, 12C1, 12Q2) provide an important reference for surface change. The land surface temperature and emissivity products (11A1, 11A2, 11B1, 11_L2, 11C1, 11C2, 11C3, 21_L2, 21A1, 21A2) estimate temperature and emissivity on a 1-km grid.
The vegetation index products (13Q1, 13A1, 13A2, 13C1, 13A3, 13C2) provide NDVI (normalized difference vegetation index) and EVI (enhanced vegetation index) at 16-day intervals and at multiple spatial resolutions (250 m, 500 m, 1 km, 0.05 degrees). The gross primary production (GPP) and net primary production (NPP) products (17A2, 17A3) regularly provide an accurate measure of the growth of terrestrial vegetation, including crops. The vegetation continuous fields product (44B) estimates the proportion of vegetation cover in each pixel. The leaf area index (LAI) products (15A2H, 15A3H) provide LAI, defined as the one-sided green leaf area per unit ground area in broadleaf canopies and as half the total needle surface area per unit ground area in coniferous canopies. The evapotranspiration products estimate global terrestrial evapotranspiration from the land surface. All of these products are generated from both Terra (morning, code MOD) and Aqua (afternoon, code MYD) data. They are closely related to vegetation, and the agricultural community has adopted some of them to monitor open-field crops (those not grown in greenhouses). However, the resolution is low (250 m ground resolution at best), so the products suit only large-scale studies. SMAP, short for Soil Moisture Active Passive, has attracted wide attention since its launch. Because soil moisture products are rare, they are in great demand. Soil moisture data can bring various benefits: improving weather forecasts, monitoring droughts, predicting floods, assisting crop productivity, and helping to analyze the water, energy, and carbon cycles. Their significance to agriculture is self-evident. SMAP produces global maps of soil moisture with near-global coverage

Table 4.1 An inventory of satellite-based EO datasets for agro-geoinformatics

| Satellite | Instrument | Launch | Resolution | Facility | Fee |
|---|---|---|---|---|---|
| Landsat | 4–5 TM; 7 ETM+; 1–5 MSS; 8 OLI, TIRS | 1972 (1), 1975 (2), 1978 (3), 1982 (4), 1984 (5), 1999 (7), 2013 (8) | 30 m | USGS archive¹, GLCF (UMD)², Amazon Cloud³, Google Earth Engine⁴ | None |
| Aqua | AIRS, AMSU, CERES, AMSR-E, HSB, MODIS | 2002 | 250 m | NASA | None |
| Terra | ASTER, CERES, MISR, MODIS, MOPITT | 1999 | 250 m | NASA | None |
| SMAP | Radar, Radiometer | 2015 | 1–3 km, 40 km | ASF and NSIDC | None |
| TRMM | PR, TMI, VIRS, CERES, LIS | 1997 | 5 km, 402 km, 2 km, 10 km, 402 km | NASA GSFC | None |
| GOES | Imager, Sounder | 1975 (1), 1977 (2), 1978 (3), 1980 (4), 1981 (5), 1983 (6), 1987 (7), 1994 (8), 1995 (9), 1997 (10), 2000 (11), 2001 (12), 2006 (13), 2009 (14), 2010 (15), 2016 (16) | 1–8 km | NOAA OSPO⁷ | None |
| SPOT | HRV, HRG, HRS | 1986 (1), 1990 (2), 1993 (3), 1998 (4), 2002 (5), 2012 (6), 2014 (7) | 1.5–10 m | Airbus Geostore⁵ | Charge at least $1.75 per km²⁶ |
| Sentinel | SAR, SES, SLSTR, OLCI, SRAL, DORIS, MWR, LRR, GNSS | 2014 (1A), 2015 (2A), 2016 (1B), 2016 (3), 2017 (2B) | 10–60 m, 500 m, 300 m, 300 m, –, 20 km, –, – | Copernicus Open Access Hub⁸ | None |
| EO-1 | ALI, Hyperion, LEISA, LAC | 2000 | 10–30 m | USGS archive⁹ | None |
| QuickBird | BGIS 2000 | 2000 (I), 2001 (II) | 0.65 m | DigitalGlobe | Rates vary; quote needed |
| IKONOS | OSA | 1999 | 1–4 m | DigitalGlobe | Rates vary; quote needed |
| WorldView | Panchromatic sensor, Multispectral sensor, SpaceView 110 | 2007 (1), 2009 (2), 2014 (3), 2016 (4) | 0.5 m, 0.46 m, 0.31 m, 0.31 m | DigitalGlobe | Rates vary; quote needed |
| GeoEye | Panchromatic sensor, Multispectral sensor | 2008 (1) | 0.46 m, 1.84 m | DigitalGlobe | Rates vary; quote needed |
| GaoFen | P/MS, WFV, PMC-2 | 2013 (1), 2014 (2), 2015 (4), 2016 (3) | 8 (2) + 16 m, 0.8 m, 50 m, 1 m | EOSDC-CNSA | N/A |
| Ziyuan | CCD camera, IRMSS, WFI, PAN, IRS, MUX, TDI CCD, IMSC | 1999 (I-01), 2000 (II-01), 2002 (II-02), 2003 (I-02), 2004 (II-03), 2011 (I-02C), 2012 (III-01), 2014 (I-04), 2016 (III-02) | 19.5 m, 78–156 m, 73–258 m, 10 m, 20 m, 40–80 m, 2.1–3.6 m, 5.8 m | China¹⁰, Brazil | Rates vary; quote needed |
| CubeSat | Vary | – | – | Any entity | Vary |

¹ https://landsat.usgs.gov/landsat-data-access
² ftp://ftp.glcf.umd.edu/glcf/Landsat/
³ https://aws.amazon.com/cn/public-datasets/landsat/
⁴ https://earthengine.google.com/datasets/
⁵ http://www.intelligence-airbusds.com/geostore/
⁶ http://www.landinfo.com/satellite-imagery-pricing.html
⁷ http://www.ospo.noaa.gov/Products/imagery/archive.html
⁸ https://scihub.copernicus.eu/
⁹ https://eo1.usgs.gov/
¹⁰ http://sjfw.sasmac.cn/

every 2–3 days. The data are stored in two places: the Alaska Satellite Facility (ASF) and the National Snow and Ice Data Center (NSIDC). TRMM (Tropical Rainfall Measuring Mission) stopped collecting data on April 15, 2015. It delivered an extraordinary 17-year global dataset on tropical rainfall and lightning. The dataset is the space-based standard for measuring precipitation and can support applications such as drought monitoring and weather forecasting. Now that the mission has ended, the historical dataset can still help us analyze the relationship between precipitation and crop growth. The follow-on mission, Global Precipitation Measurement (GPM), provides measurements similar to those of TRMM with improved quality and spatiotemporal resolution. The GOES satellites are well known in the weather and meteorology domain. The vast majority of the weather reports and forecasts seen on TV, on the Internet, and in newspapers or heard on the radio in the United States are derived from the observations of these satellites. The satellites travel in geosynchronous orbit, which allows them to stay in a fixed position relative to the ground and observe the same region of Earth 24/7. Their altitude is approximately 22,300 miles. They can not only watch the atmosphere but also sense the Earth's surface temperature and water vapor dynamics. NOAA and weather companies run their own weather models on GOES datasets to generate reports and forecasts. Agricultural industries in the Americas (both North and South) have used those reports and forecasts for a long time. Sentinel is a series of EO satellites launched by the European Space Agency (Butler 2014). The program started in 2014, and the data can be downloaded from a website named Copernicus Services Data Hub. Researchers in agriculture have begun to use its datasets (Arias and Inglada 2015; Drusch et al. 2012; Torbick et al. 2017).
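The altitude of roughly 22,300 miles quoted for the GOES satellites can be checked with a short calculation. The following is a minimal sketch, assuming a circular equatorial orbit whose period equals one sidereal day and using Kepler's third law; the constants are standard published values rather than figures from this chapter, and the function name is only illustrative.

```python
import math

# Standard physical constants (assumed here, not taken from the chapter).
GM_EARTH = 3.986004418e14      # Earth's gravitational parameter, m^3/s^2
SIDEREAL_DAY_S = 86164.0905    # one rotation of Earth relative to the stars, s
EARTH_EQ_RADIUS_M = 6378137.0  # equatorial radius, m
METERS_PER_MILE = 1609.344


def geostationary_altitude_m(period_s: float = SIDEREAL_DAY_S) -> float:
    """Altitude above the equator of a circular orbit with the given period."""
    # Kepler's third law for a circular orbit: r^3 = GM * T^2 / (4 * pi^2)
    orbit_radius = (GM_EARTH * period_s ** 2 / (4 * math.pi ** 2)) ** (1 / 3)
    return orbit_radius - EARTH_EQ_RADIUS_M


altitude_m = geostationary_altitude_m()
altitude_km = altitude_m / 1000
altitude_miles = altitude_m / METERS_PER_MILE
print(f"{altitude_km:.0f} km, {altitude_miles:.0f} miles")
```

Under these assumptions the result is close to 35,800 km, a little over 22,200 miles, which agrees with the approximate figure given in the text.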
With the excellent and long-lasting satellites above having saturated the medium-to-low spatial/spectral resolution EO market, the latest developments have started to dig into the high-resolution market. Major achievements have been made in both government programs and private projects. One giant, DigitalGlobe (Wikipedia 2014a), dominates this area, surrounded by many small-to-medium corporations, most of which are backed by their own governments. For example, the operator of the SPOT satellites, SPOT Image, is a company created by the French Space Agency. DigitalGlobe has the best optical imagery database, with the highest spatial resolution (0.31 m) and the best coverage. It owns and operates QuickBird, IKONOS, WorldView, and GeoEye. The data from these satellites take up almost the entire market for ultra-high-resolution Earth images. Google Maps, Apple Maps, Bing Maps, and many other free mapping services use tiles generated from their images. The major competitor of DigitalGlobe is the SPOT program in France. The SPOT satellites have steadily acquired meter-level images (1.5–10 m in Table 4.1) since the 1980s and were regarded as the producers of the highest-resolution images in the world in the last century. However, after the resolution of DigitalGlobe imagery surpassed that of SPOT by a wide margin, the market clearly leaned toward the former. Meanwhile, a brand new player, the CubeSat, is making its mark in the market (Heidt et al. 2000). The CubeSat is a specification intended to facilitate frequent and affordable access to space, with launch opportunities available on most launch vehicles (Lee et al. 2009; Toorian et al. 2008). The size


of CubeSats is very small compared with conventional satellites, which makes it possible to launch dozens of CubeSats at once (Straub et al. 2013). More satellites mean more eyes in space and more comprehensive observation. The cost of a CubeSat is far lower than that of a traditional satellite, which makes the CubeSat a true game changer (Woellert et al. 2011). Remarkably, an investment of approximately 50,000 dollars is enough for investors to own a satellite. Planet Labs is a pioneer in commercializing CubeSats; it has manufactured several CubeSat satellites and delivered them into orbit as passengers on other launch missions (Chun et al. 2003; Peterson and Roscoe 2006). The company built a website, www.planet.com, to sell its images online. The system lets users view historical images of every spot on Earth, so consumers can easily locate and order the scenes they need. One common characteristic of these satellites is that their datasets are for sale. Meanwhile, images of sensitive areas may reveal classified information that affects national security, so every image is processed to mask (mosaic) or encrypt those regions before users see it. The owner companies usually set up an online sales system for consumers to search for and order images. The use cases of ultra-high-resolution satellite images are countless. In precision agriculture, ultra-high-resolution images are basic information, together with the GPS signal and in situ sensed data. Both multispectral and hyperspectral bands are needed to estimate crop status comprehensively. Along with the military, the agricultural community is another big customer of these companies. However, satellite images can be only one of the sources of agro-geoinformatics. The results of satellite-only studies are limited in many respects because satellites sense objects remotely instead of touching them directly.
Indirect sensing may introduce errors that are amplified in later processing and lead to untrustworthy results. Data from other sources are required to assist and correct the results.

4.2.2

Airborne Camera

Compared with satellites, airplanes and UAVs are closer to the land surface. They avoid most atmospheric influences and have better flexibility in the air. Airborne images can easily reach a much higher spatial resolution than space-borne images. The reason is simple and similar to how the human eye works: an object far away cannot be seen clearly, but it becomes clearer as the observer moves closer. The equipment used in airborne observation does not differ much in imaging principle from the instruments on satellites. Both panchromatic and multispectral sensors are available on airborne platforms. Airborne platforms are further divided into aircraft and UAVs. The former carry all the equipment and fly over the study area with pilots and operators on board. The latter are flown by operators who stand nearby and control the flight path. Normally, an aircraft flies higher and longer than

Table 4.2 Some drones for precision farming (lowest prices quoted on 5/31/2017)

| Brand and model | Camera | Resolution | Duration | Distance | Price |
|---|---|---|---|---|---|
| DJI Phantom 3 CP.PT.000181 | ZENMUSE X3 | 12.4 MP | 23 min | 3 mile | $799 |
| DJI MATRICE 100 | Customizable | – | 40 min | 3 mile | $3299 |
| DJI T600 Inspire 1 | ZENMUSE X3 | 12.4 MP | 18 min | 1.3 mile | $1899 |
| Bebop Drone 2 | | 12.5 MP | 25 min | 300 m | $399 |

a UAV, but not in every case. Many restrictions apply to flights of a survey aircraft, which are considered spying in some countries. Operators need authorization from the authorities and must wait for approval from the air traffic control department when the path intersects civil aviation routes or military exercise routes. This process may take a very long time. Aircraft-based observation was popular in the late 1990s and the first decade of this century (Eichkorn et al. 2002; Mays et al. 2009; Möhler et al. 1993). But as UAVs and satellites entered the competition, the aircraft approach was gradually abandoned for civil use. The military and government still use it because they are familiar with the process and can easily obtain approval. The military has plenty of aircraft equipped with sensing devices, so they can observe whenever they are in the air. Naturally, the observed data serve military and government purposes and are not available for agriculture. The costs include fuel, maintenance, depreciation, equipment installation, labor of operators and pilots, insurance, etc. Overall, aircraft measurements cost more per unit area than satellite and UAV measurements. Because aircraft-based observation is case-specific and lacks systematic data management, its data are rarely publicly available. The Earth Observing Laboratory (EOL) of UCAR (University Corporation for Atmospheric Research) (Hallgren 1974) manages two research aircraft, HIAPER and C-130, and accepts requests for observing tasks. Some of the acquired data are made available online via an anonymous FTP server.¹ Those datasets can help people study the patterns of specific areas but are not suitable for long-term and large-scale monitoring and forecasting in agriculture. The UAV, also called a drone, is an ideal alternative that avoids the limitations of aircraft. The resolution of images obtained by UAV sensing can range from less than 1 cm to 10 cm.
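The centimeter-level figures above follow from simple camera geometry: the closer the platform flies, the smaller the ground footprint of each pixel. The sketch below uses the usual similar-triangles relation for a nadir-pointing frame camera. The function name and all camera parameters (sensor width, focal length, image width) are hypothetical values chosen for illustration and are not specifications of any drone listed in Table 4.2.

```python
def ground_sample_distance_cm(altitude_m, focal_length_mm, sensor_width_mm, image_width_px):
    """Ground footprint of one pixel (cm) for a nadir-pointing frame camera."""
    # Similar triangles: ground_width / altitude = sensor_width / focal_length
    ground_width_m = altitude_m * sensor_width_mm / focal_length_mm
    return ground_width_m / image_width_px * 100.0


# Hypothetical camera: 6.17 mm wide sensor, 3.6 mm lens, 4000-pixel-wide images.
for altitude in (10, 50, 120):
    gsd = ground_sample_distance_cm(altitude, 3.6, 6.17, 4000)
    print(f"{altitude:>4} m above ground -> {gsd:.2f} cm per pixel")
```

With these assumed parameters, the pixel footprint grows linearly with flight height, staying below 1 cm near the ground and reaching several centimeters at typical survey altitudes.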
The recent fierce competition among drone manufacturers has helped reduce prices to a level that is very affordable for companies and individuals. We quote the prices of some models from the major brands to give an overview of the drone market; Table 4.2 lists the prices and parameters of the quoted drones. We exclude military drones such as the RQ-4 Global Hawk (Haulman 2003), MQ-1 Gray Eagle (Cote 2015), and RQ-11 Raven (Arjomandi et al. 2006; Cook 2007), which are extremely expensive, costly to maintain, and warfare oriented. Civil drones generally come in two classes: consumer and professional (Chen et al. 2016). Based on our survey, there is no obvious difference in flying distance, duration, or data quality between the two. The major gap, if not a minor one, is the

¹ https://www.eol.ucar.edu/all-field-projects-and-deployments

charger. Professional drones can charge faster (in half an hour) than consumer drones. That does not mean the consumer drones are toys that are not serious enough for professional use. The professional configuration is much more expensive, but it may include high-tech components that are not really needed. A company from China, DJI, has taken the lead among drone manufacturers (Atwater 2015; Lee and Choi 2016). Its products are among the best options in the current market. The price varies dramatically depending on model and configuration. Most models are appropriate for data collection over agricultural fields. During a flight task, operators can set the observation mode, plan a path, let the drone fly along it, manipulate the remote control, and stream real-time images and videos down to a tablet. Most UAV cameras are mounted on a gimbal and can rotate to shoot images from a suitable angle. The drone has onboard memory, where all captured data are stored automatically. Once the drone lands, operators can remove the memory and copy the data to storage devices. To facilitate post-flight data processing, drone companies and some image processing companies have developed a number of software packages (Newman 2013). The processes include image fusion, mosaicking, georeferencing, calibration (which needs ground control points), interpolation, etc. NDVI and other vegetation index products can be derived by calculation from the optical and infrared bands. Standard DEM products can also be produced if the drone camera takes stereo-pair images. By now, a whole set of hardware, software, theories, and techniques has matured for using drones in precision agriculture. In the past 5 years, drone images have appeared in almost every agricultural study and application, and their use is expected to keep growing.
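As an illustration of how a vegetation index is derived from the optical and infrared bands, the following minimal sketch computes NDVI pixel by pixel using its standard definition, (NIR − Red) / (NIR + Red). The reflectance values are hypothetical, and the plain nested lists stand in for the image arrays that real processing software would use.

```python
def ndvi(nir, red):
    """NDVI for one pixel from near-infrared and red reflectance (0-1 scale)."""
    total = nir + red
    if total == 0:
        return None  # undefined when both bands are zero (e.g. a no-data pixel)
    return (nir - red) / total


# Hypothetical reflectance values for a 2 x 3 patch of a field.
nir_band = [[0.50, 0.45, 0.30], [0.48, 0.10, 0.00]]
red_band = [[0.08, 0.10, 0.20], [0.07, 0.09, 0.00]]

ndvi_map = [
    [ndvi(n, r) for n, r in zip(nir_row, red_row)]
    for nir_row, red_row in zip(nir_band, red_band)
]
for row in ndvi_map:
    print(["n/a" if v is None else f"{v:.2f}" for v in row])
```

Dense, healthy vegetation reflects strongly in the near infrared and weakly in the red, so it yields values toward the upper end of the −1 to 1 range, while bare soil gives values near zero.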
Some fuel-powered military drones can already fly for up to 36 hours before returning to base. We believe that, with rapid progress in low-cost long-duration batteries and long-distance remote control, the observation range of civil drones will be greatly extended in the next few years. Please refer to Boucher (2015) for more on the progress in adapting military drones for civil use. However, drone data are not as public and open as satellite data. The use of a drone is field specific. Like aircraft-based data, drone data are limited in both temporal and spatial extent and have neither global nor national products. Small coverage seems to be the birthmark of civil drones. It is very difficult for users to discover and order low-cost data for a specific field, especially when no drone has flown over that field before. Commonly, data consumers either buy a drone to obtain the data themselves or outsource the project to an experienced company. The resulting data products are claimed by the initial consumers and archived in their private storage facilities, and outsiders have no way to access them. This is the new normal. In contrast, some government programs and nonprofit foundations have carried out open UAV tasks and published sample datasets online to encourage contributions from citizen scientists and research institutes. But such datasets are rare, small, incomplete, and of low quality, and they are of no use to consumers interested in a different field. Much work remains to be done to form a global drone image database that supports searching, ordering, purchasing, processing, and downloading.

4.2.3

In Situ Sensors

In agriculture, in situ sensors are sensors installed in the fields. They may be fixed on top of a standing pole, laid on the ground, immersed in water, buried in the soil, or carried by people or vehicles. The measurements are taken very close to the targets and under the same conditions as the crops. The data are normally considered ground “truth” because they contain little noise compared with satellite and UAV data. The in situ approach is inherited from the agricultural monitoring techniques used before aircraft and spacecraft appeared. Today, it still plays an important role thanks to significant improvements in close-range measuring instruments. We surveyed the sensors currently operating in agricultural fields and the corresponding datasets available online. According to their agriculture-relevant targets, we divided in situ sensors into four classes: weather, soil, water, and vegetation. Automated networks for collecting weather data for agriculture were already being addressed in the 1980s (Hubbard et al. 1983). The agricultural industry demands accurate weather information in order to apply appropriate treatments to crops or animals. An error of just 2 or 3 degrees in temperature can lead to quite different consequences for crops. The public weather map is interpolated from the measurements of weather stations that may be located as far as 50 miles away. Affordable and readily available on-site weather stations solve this problem. Through decades of development, the methods and technologies have matured and been commercialized in the weather industry. Small weather stations are now on sale; for example, the AcuRite Pro 5-in-1 weather station was listed at $119.99 on the Costco website. The weather station can measure rainfall, wind speed and direction, temperature (actual and “feels like”), moon phase, heat index, dew point, wind chill, and more, and stream the data to the user in real time.
The information is remotely accessible from a smartphone, tablet, or laptop web browser. People can keep the data private or share them with friends and weather communities such as Weather Underground² to enrich the public records. Accurate weather data for a crop field must be among the easiest datasets to obtain at relatively low cost. Like weather stations, sensors for both soil and water have been in use for a long time, but only recently have miniaturized and affordable sensors become available (Rossel and Bouma 2016). In situ, multiple-point measurement is now feasible. A number of commercial sensors are on the market to measure the presence of water in the surface and root-zone soil (soil moisture), soil nitrate, salinity, soil reaction (pH), available water capacity, bulk density, soil crusts, macropores, earthworms, particulate organic matter, soil enzymes, total organic carbon, etc. (Mukhopadhyay 2012). Using soil sensors in agriculture can provide fundamental information about local soil and environmental conditions in space and time, improving the efficiency of crop production and minimizing environmental side effects. The sensed information can accumulate into a site-specific database,

² https://www.wunderground.com

which can be used to study the relations among soil, crops, and water. The database can help users better understand the physical, chemical, and biological attributes of the soil and the dynamics behind them. Thanks to long-term sensing, several soil databases are already available, such as the European Soil Database,³ the Harmonized World Soil Database,⁴ the ISRIC soil data hub,⁵ the National Soil Database (NSDB) of Canada,⁶ the State Soil Geographic (STATSGO) database⁷ and the Soil Survey Geographic Database (SSURGO)⁸ of USDA, and the National Soils Database of New Zealand.⁹ They contain vector and attribute data that label soils with a very detailed hierarchy, and they provide sound instructions to guide users in agricultural applications. Many websites, such as SoilGrids,¹⁰ have been built to facilitate access to soil data. The data in these databases are measured by in situ sensors and archived through various collection channels. The databases are mainly point-oriented, so spatial resolution is not a concern. They are freely available online. Agricultural users can either use these existing databases or deploy their own sensors to obtain more accurate and up-to-date information from the field. Modern technology makes both feasible.
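Because in situ observations are point-oriented, covering a whole field requires spatial interpolation, as noted in the introduction. The sketch below uses inverse distance weighting (IDW), one common and simple interpolation method; it is offered only as an illustration and is not the method of any database mentioned above. The sensor positions and soil moisture readings are hypothetical.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    samples: iterable of (x, y, value) tuples from point sensors.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0:
            return value  # exactly on a sensor: use its reading directly
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical soil moisture readings (volumetric %) at field coordinates in meters.
sensors = [(0, 0, 22.0), (100, 0, 26.0), (0, 100, 18.0), (100, 100, 30.0)]

print(idw(sensors, 0, 0))      # a sensor location
print(idw(sensors, 50, 50))    # field center, equidistant from all four sensors
print(idw(sensors, 80, 90))    # closest to the sensor at (100, 100)
```

The estimate at an unsampled point is always a weighted average of the nearby readings, so it cannot capture local extremes between sensors. This is one way in which interpolation adds error to the results.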

4.2.4

Manual Reports

Manual reporting is the ancient way for authorities to gather information. Remarkably, we still follow the same road today. The basic nature of the reporters and consumers has not changed much, but the patterns, processes, and channels have been dramatically reshaped. Information and communication technologies (ICT) (Aker 2011; De Silva and Ratnadiwakara 2008; Meera et al. 2004), such as wireless networks and the Internet, allow the exchange to be completed rapidly. The reporters are usually the farmers themselves. They have to fill in forms and submit them to the government to request authorization or approval of actions, such as burning authorization, subsidies, wildland fire prevention, landowner assistance, bare-root tree seedling orders, etc. In fact, many people work on collecting agricultural information and broadcasting it to both agricultural and nonagricultural stakeholders (Chhachhar et al. 2014). A recently launched movement called citizen science encourages citizens to engage in collecting information and sharing it with much broader audiences in

3 http://esdac.jrc.ec.europa.eu/content/european-soil-database-v20-vector-and-attribute-data
4 http://webarchive.iiasa.ac.at/Research/LUC/External-World-soil-database/HTML/
5 http://isric.org/explore
6 http://sis.agr.gc.ca/cansis/nsdb/index.html
7 https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/geo/?cid=nrcs142p2_053629
8 https://www.nrcs.usda.gov/wps/portal/nrcs/detail/soils/survey/?cid=nrcs142p2_053627
9 https://soils.landcareresearch.co.nz/soil-data/national-soils-data-repository-and-the-national-soilsdatabase/
10 https://www.soilgrids.org

4 Agro-geoinformatics Data Sources and Sourcing


Fig. 4.1 A sample report from USDA

addition to the government agencies (Dehnen-Schmutz et al. 2016; Irwin 2001; Rossiter et al. 2015). Using ICT, the industry has brought all participants to the table, from scientists to consumers, along with stakeholders from every sector: agricultural research, crop cultivation, irrigation, fertilization, insecticide application, marketing, data analysis, budgeting, and policy making. The data from each sector are transferred through dedicated channels to downstream consumers and have a significant influence on subsequent activities. The data format is generally tables of numbers with descriptions and perhaps a little interpretation. Figure 4.1 displays a sample report from the USDA about agricultural exports in a week of May 2017.11 A foreign firm outside the United States made a purchase of a US-produced commodity. The trading information was filed with the USDA, which generates weekly reports and releases them to the public. Manual reports only

11 https://www.fas.usda.gov/programs/export-sales-reporting-program


Z. Sun et al.

give the key information about the situation, such as name, price, date, area, and time. People can get these reports from various sources such as websites, smartphone apps, newspapers, magazines, TV, radio, the analysis reports of nonprofit institutes, stock market reports, etc. The cost depends on the terms of the data sources. Many of them are freely available.

4.2.5 Summary

Each source has its pros and cons. The capabilities of the platforms may overlap partially but never totally, so these sensors complement rather than oppose each other. Nevertheless, “data are never enough.” The four categories of sensors, and perhaps more in the future, can collaborate and sense agricultural entities from various spatial and temporal perspectives to provide an immersive reflection of the truth. Integrated use of the databases from different platforms in the same application is an inevitable direction; in fact, it is already under way.

4.3 Sourcing

This section surveys the sourcing methods for agro-geoinformatics. Data sourcing means finding the sources of the right data and clearing away the obstacles to getting it. There are two tasks: exploration and retrieval. In real-world cases, a surprising number of problems arise around these two tasks. We first introduce the conventional sourcing method and then address the newer techniques: cloud-based sourcing and crowdsourcing.

4.3.1 Conventional Sourcing

In the traditional way, data users frequently have to perform the following steps:
1. Find available sources.
2. Ask each source to provide sample data.
3. Check whether the structure, resolution, quality, and fields of the samples meet the use requirements.
4. Figure out how the data are going to be used.
5. Understand the observation date and acquisition date of the data.
6. Obtain the data from the databases if they are directly accessible online.
7. Contact the eligible sources to purchase the data if they are not freely available online.
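The sample check in step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Requirement` class, the field names, and the resolution threshold are hypothetical and do not come from any particular data provider.

```python
from dataclasses import dataclass


@dataclass
class Requirement:
    """What the consumer needs from a candidate source (illustrative values only)."""
    required_fields: set
    max_resolution_m: float  # coarsest acceptable ground resolution, in meters


def screen_sample(sample_records, sample_resolution_m, requirement):
    """Return a list of problems found in a sample; an empty list means it passes."""
    problems = []
    # A larger number of meters per pixel means a coarser product.
    if sample_resolution_m > requirement.max_resolution_m:
        problems.append(
            f"resolution {sample_resolution_m} m is coarser than "
            f"{requirement.max_resolution_m} m"
        )
    # Every record in the sample must carry all the fields the consumer needs.
    for index, record in enumerate(sample_records):
        missing = requirement.required_fields - set(record)
        if missing:
            problems.append(f"record {index} lacks fields: {sorted(missing)}")
    return problems
```

A source whose sample returns a non-empty problem list would be set aside before any time is spent on steps 4 to 7.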


The cost of delay needs to be taken into account. Currently, most data providers’ systems are equipped with APIs, which allow customers’ systems to make machine-to-machine connections for direct data access. However, this docking process is very tough, and a failure in any step leads to the source being discarded. It involves two teams from different institutions. Before service standards were established, a person from the source team had to work together with the consumer team, providing the API information and guiding the team in using the provider’s data services. Such a strategy is old-fashioned: to achieve the final goal of machine-to-machine interoperability, meetings and workshops have to be held frequently. Most disappointing of all, even direct human-to-human collaboration cannot guarantee that machine-to-machine sourcing works. After 1 or 2 weeks of communication, the consumers may find that the API of the provider’s system lacks some fields or structures, which requires the data provider to reorganize or update the system completely. If the data provider refuses to do so, the collaboration is over. All data-consuming industries, not only agriculture, came to recognize this problem after many fundamental projects, and it is now understood that the difficulty has to be overcome by data providers and consumers together. The core solution is one word: standardization. Comprehensive standardization of every aspect of the data communication interface, such as data format, service interface, message channel, protocol, Web API, encoding, decoding, parameters, and conditions, is the ultimate answer. Standard-compliant data and services can be used by consumers without person-to-person communication. Once standards are employed, the data sources only need to develop standard-compliant services, and the consumers only need standard-compliant clients.
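As a rough illustration of what a standard-compliant client can look like, the sketch below builds a request for the OGC Web Map Service (WMS), one of the service standards used by satellite data providers. The parameter names follow the WMS 1.1.1 GetMap operation; the endpoint URL and layer name are hypothetical placeholders, not a real service.

```python
from urllib.parse import urlencode


def build_getmap_url(endpoint, layer, bbox, width, height, image_format="image/png"):
    """Build a WMS 1.1.1 GetMap URL for a bounding box given in EPSG:4326.

    bbox is (min_lon, min_lat, max_lon, max_lat); WMS 1.1.1 uses x,y order.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "SRS": "EPSG:4326",
        "BBOX": f"{min_lon},{min_lat},{max_lon},{max_lat}",
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name, for illustration only.
url = build_getmap_url(
    "https://example.org/wms", "cropland", (-100.0, 35.0, -95.0, 40.0), 512, 512
)
```

Because every parameter is defined by the standard, the same function should work against any compliant server once the endpoint and layer name are changed, with no meeting between the two teams.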
The data providers no longer have to send a person to accommodate every customer. The clients no longer need to ask a person to figure out the content of the data and the use of the services, since the specifications contain clear rules and explanations. Much progress has been made on standardization in the data sourcing industries. Typically, the satellite databases provide a variety of standard formats and access methods including HTTP, FTP, OGC (Open Geospatial Consortium) W*S, OPeNDAP, netCDF-CF (a standard format for gridded and point monitoring data and models) (Rew et al. 1997), HDF4, HDF5, GRIB (GRidded Information in Binary) (Schmunck 2002), GeoTIFF, and KML (Keyhole Markup Language). The WCS (Web Coverage Service) is a standard for web services distributing raster data. The WMS (Web Map Service) is a service standard for real-time composition of data into visible maps. Some NASA EOS datasets such as AIRS products are accessible through standard OGC WCS and WMS protocols (Yang 2010; Yang and Di 2002). The formats adopted by NASA are mainly GeoTIFF and HDF (Burnett et al. 2007; Han et al. 2008; NASA 2014, 2016; Savtchenko et al. 2004; Zhao et al. 2015). NOAA mainly uses netCDF and GRIB (Hankin et al. 2010; Williams 2015; Williams et al. 2009). These data formats and services have been around for a long time and are very familiar to the relevant community. It is convenient to find


open source and free tools or libraries to manipulate these formats and services. Drone data do not have such sound standards for formats and services, but drone manufacturers and third-party software enterprises have developed whole sets of systems with their own formats and interfaces. As long as the user community accepts them, they will become de facto standards with equivalent effect. For in situ sensors, the data contain massive sequences, signals, and numbers, which are often stored in the simple CSV (comma-separated values) format. Users have their own habits in organizing them, and the standardization level is low in this domain. In the USGS water database,12 for example, the data are arranged by columns: each column is a variable and each row is an observation. The users can select columns to generate their own CSVs (Goodall et al. 2008; Wolock 2003). The situation is similar for manual reports, which are natively tables. One requirement for this strategy to work is that the data providers and consumers both clearly understand the definitions of columns and fields. Practice shows that this scheme works very well. But with the development of big data and cloud computing, such a scheme has become somewhat old-fashioned and inefficient. Data sources and consumers are collaborating more closely to introduce the latest techniques and hardware, further accelerating sourcing and lowering the barriers to entry for new consumers.
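The column-oriented scheme described above (each column a variable, each row an observation) can be sketched with Python's standard `csv` module. The column names in the usage example are hypothetical and are not the actual field names of the USGS water database.

```python
import csv
import io


def select_columns(csv_text, wanted):
    """Return a new CSV string containing only the wanted columns, in that order."""
    reader = csv.DictReader(io.StringIO(csv_text))
    available = reader.fieldnames or []
    # Provider and consumer must agree on column definitions, so fail loudly
    # when a requested column does not exist in the source table.
    missing = [name for name in wanted if name not in available]
    if missing:
        raise KeyError(f"columns not found: {missing}")
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=wanted, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # columns outside `wanted` are dropped
    return out.getvalue()


# Hypothetical table: one variable per column, one observation per row.
table = "site,date,discharge\nA,2017-05-01,1.5\nB,2017-05-01,2.0\n"
subset = select_columns(table, ["site", "discharge"])
```

The explicit check for missing columns reflects the requirement that both sides share the same understanding of the fields; a silent mismatch would otherwise produce an empty or misleading extract.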

4.3.2 Cloud Sourcing

The popularity of cloud computing and elastic storage has had a profound impact on all walks of life. High-performance computing, high availability, a pay-per-use model, reasonable cost, reliability, stability, freedom from worry about CPU/memory/disk failure and maintenance, easy-to-use APIs, and intuitive user interfaces are the basic reasons why so many organizations have moved their entire businesses onto the cloud. The major commercial cloud platforms include Amazon Web Services (AWS) (Varia and Mathew 2014; Wang and Ng 2010), Google Cloud Platform (GCP) (Cusumano 2010), Microsoft Azure, IBM Cloud, etc. (Qian et al. 2009). These cloud platforms offer a variety of services, from IaaS (Infrastructure as a Service) (Bhardwaj et al. 2010) and PaaS (Platform as a Service) (Pahl 2015) to SaaS (Software as a Service) (Dillon et al. 2010). IaaS allocates on-demand virtual machines to users. PaaS gives users a framework on which to develop or customize applications. SaaS delivers applications that are managed by a third-party vendor and accessed on the client side. Most SaaS applications run directly from a web browser, requiring no installation or downloads. SaaS is the most prosperous business in the cloud market (Cusumano 2010). Storage service is an important piece of the cloud puzzle. Amazon EC2 (Elastic Compute Cloud), EBS (Elastic Block Store), EFS (Elastic File System), Glacier, and S3 (Simple Storage

12 https://waterdata.usgs.gov/nwis


Service) have been widely considered a huge success in lifting data storage and processing businesses into the cloud (Amazon 2010, 2013; Marx 2013; Varia and Mathew 2014). They can hold data ranging from gigabytes to petabytes in size. This capability is aimed at the big data challenges, which can be represented by the four Vs: volume, velocity, variety, and veracity (Hashem et al. 2015). To address these challenges and manage the explosively growing datasets, the cloud seems so far to be the only way out. Cloud sourcing basically means that the data are hosted on clouds, rather than collected by clouds. After being acquired, the data are uploaded to clouds for people to access via the cloud toolkits. As mentioned in 4.3.1, data owners were among the first adopters of the cloud. Many people probably do not realize that we are inevitably heading into the ZB era (1 zettabyte equals 1000 exabytes; 1 exabyte equals 1000 petabytes). Maintaining a database containing a PB of structured and unstructured content is never an easy task and is more complex than people imagine. Very powerful storage and computer hardware are required and must be maintained 24/7. Cloud providers have brought these requirements together and supply a pool of solutions for various databases to choose from. So far, many public datasets have been archived by the major cloud providers. For instance, Amazon launched an Earth data plan13, 14 to archive important public datasets for the benefit of educators, researchers, and students (Palankar et al. 2008).
The plan hosts Landsat 8 imagery, NEXRAD (Next-Generation Weather Radar, a network of 160 high-resolution Doppler radar sites that detect precipitation and atmospheric movement and disseminate data in approximately 5-min intervals from each site), SpaceNet machine learning imagery, the National Agriculture Imagery Program, digital elevation model (DEM) Terrain Tiles, the GDELT dataset, NASA Earth Exchange (NEX) datasets, GSOD (Global Surface Summary of the Day), Sentinel-2 imagery, and DigitalGlobe open data. Google Earth Engine hosts widely used datasets from all over the world and lets people use them free of charge (Gorelick 2013). Microsoft Azure is a heavyweight player and hosts a huge volume of public datasets.15 Most datasets from the US government agencies, including NASA, DOT, the US Census, EPA, etc., are currently hosted on Azure. Their performance has been recognized by the public. All the data are available online. An HTTP URL is the simplest and most direct option to access them. The cloud providers build user-friendly websites for users to browse, discover, and download data. APIs are also offered for client programs to access and download the data via system-to-system exchanges. Users can manipulate their data in the cloud although they do not physically possess the data (Chow et al. 2009). Public clouds may offer low-cost VM instances (Amazon as low as $0.0059 per hour16). They are still not fit for all use cases, especially when handling security

13 https://aws.amazon.com/public-datasets/
14 https://aws.amazon.com/cn/earth/
15 https://docs.microsoft.com/en-us/azure/sql-database/sql-database-public-data-sets
16 https://aws.amazon.com/cn/ec2/pricing/on-demand/


issues in high data transmission environments (Wang et al. 2010). Private clouds are built by enterprises to fill this gap. Open source architectures such as CloudStack (Kumar et al. 2014) and OpenNebula (Milojičić et al. 2011) can be reasonable choices. Enterprises have full control of the network and storage of their private clouds, so they can apply any suitable security mechanism. The disadvantage is that the cost is higher than that of public clouds. Both public and private clouds offer an efficient way to host big data, which is impossible on single servers or PCs. Clouds spare users many annoying problems such as server malfunction, disk failure, memory leaks, slow I/O, and high-cost maintenance. The data are uploaded and downloaded via the Internet. The nodes in the cloud physically possess the data, and users can access them through the cloud management console. Downloading is mostly through the HTTP and FTP protocols, and the speed depends on the network of the cloud nodes and client devices. However, dataset search in clouds is not as mature as storage: the native search function of cloud platforms is imperfect. In operational dataset systems, a separate register service that archives all the metadata for searching is often established. When users find a product in the register, the individual link to the data on the cloud host is returned to them.
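The register pattern described above can be sketched as a small in-memory lookup: metadata are searched locally, and only the link to the cloud-hosted file is returned. This is a minimal sketch under stated assumptions; the entry layout (`title`, `bbox`, `url`) and the function names are hypothetical, and an operational register would typically be a database or catalog service rather than a Python list.

```python
def boxes_intersect(a, b):
    """True if two (min_x, min_y, max_x, max_y) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search_register(register, keyword=None, bbox=None):
    """Return the cloud links of register entries matching the given filters.

    register: list of dicts with hypothetical keys 'title', 'bbox', and 'url'.
    keyword:  optional case-insensitive substring to look for in the title.
    bbox:     optional area of interest as (min_x, min_y, max_x, max_y).
    """
    links = []
    for entry in register:
        if keyword and keyword.lower() not in entry["title"].lower():
            continue
        if bbox and not boxes_intersect(entry["bbox"], bbox):
            continue
        # Only the link is returned; the data themselves stay on the cloud host.
        links.append(entry["url"])
    return links
```

Keeping the metadata separate from the storage lets the search logic evolve without touching the hosted files, which is one plausible reason the register is run as its own service.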

4.3.3 Crowdsourcing

Crowdsourcing is a new sourcing model in which individuals or organizations obtain the needed data and services from Internet contributors (Doan et al. 2011). It forms distributed labor networks that exploit the spare processing power of millions of human brains via the Internet (Howe 2006). Crowdsourcing is a rising strategy for data collection (Hirafuji 2014; Kanhere 2011), producing geospatial data using informal social networks and state-of-the-art Web technologies (Heipke 2010). The potential user groups collaborate voluntarily, with very little monetary support, to make the resulting datasets freely available online. A key difference from other sourcing models is that the contributors to crowdsourced datasets may lack formal training in collecting and organizing data. Citizens can be involved via their smartphones, which are equipped with a low-cost GPS receiver, accelerometer, ambient temperature sensor, gyroscope, light sensor, magnetometer, barometer, proximity sensor, humidity sensor, audio sensor, fingerprint identity sensor, moisture sensor, and camera. In agro-geoinformatics, crowdsourcing has been used to collect data. For example, Geo-Wiki, a crowdsourcing tool, was developed to collect in situ data to improve global land cover products (Fritz et al. 2012). The web-based system integrates access to high-resolution satellite imagery from Google Earth with crowdsourcing to vastly increase the amount of information available on land cover. The information can be used for training, cross-checking, calibration, and validation of land cover products (See et al. 2015). In addition, a smartphone app has been built to intuitively retrieve the exact geometry of smaller objects, giving access to agricultural entities like fields or ponds (Frommberger et al. 2013). Diseased leaf


images captured by smartphones can be sent to plant pathologists in remote laboratories for disease identification, and lab experts can directly suggest cures and preventive measures. Another agricultural use of crowdsourcing is fertilizer calculation. Commercial optical applications for mobile devices estimate the color level of rice leaves, compare the crowdsourced images, and recommend the required amount of nitrogen fertilizer for the field (Pongnumkul et al. 2015). Atmospheric data from smartphone sensors and amateur weather stations are already used by some applications (Muller et al. 2015). A decision tree algorithm has been implemented in a crowdsourcing mobile system to help generate accurate and reliable decisions on agricultural plant diseases (Singh et al. 2014). Mobile4D is an integrated mobile system for crowdsourcing-based disaster alerting and reporting, which could be used to minimize the impact of small disasters on crops and livestock (Frommberger and Schmid 2013). CrowdHydrology is another crowdsourcing project, which encourages citizen scientists to voluntarily send hydrologic measurements via text messages to a server that distributes the information on the Web (Lowry and Fienen 2013). Crowdsourced datasets are released as a common knowledge base, free to everyone. Crowdsourcing has become a reliable way for consumers to retrieve data at very low cost. Crowdsourced data are already used in agriculture as a significant supplement and have greatly changed the data supply market. Their coverage will undoubtedly expand further in the next few years.
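To make the cross-checking use of crowdsourced labels more concrete, the sketch below compares volunteer land cover labels with the labels of a map product. It is an illustrative assumption, not the method used by Geo-Wiki or any project cited above: a simple majority vote stands in for whatever aggregation a real system applies, and the site identifiers and class names are hypothetical.

```python
from collections import Counter


def majority_label(votes):
    """Return the most frequent label, or None for an empty list or a tie."""
    if not votes:
        return None
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # no clear consensus among the volunteers
    return ranked[0][0]


def agreement_rate(crowd_votes, product_labels):
    """Fraction of sites where the crowd consensus matches the map product.

    crowd_votes:    dict mapping a site id to the list of volunteer labels.
    product_labels: dict mapping a site id to the label in the map product.
    Sites without a clear consensus or without a product label are skipped.
    """
    compared = 0
    agreed = 0
    for site, votes in crowd_votes.items():
        consensus = majority_label(votes)
        if consensus is None or site not in product_labels:
            continue
        compared += 1
        if consensus == product_labels[site]:
            agreed += 1
    return agreed / compared if compared else None
```

Skipping tied sites is a deliberate choice in this sketch: since contributors may lack formal training, a site without a clear consensus is treated as uninformative rather than counted as a disagreement.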

4.4 Conclusion

This chapter summarized the state-of-the-art data sources of agro-geoinformatics and the methods for sourcing them. The data sources can be divided into four classes: satellite, airborne, in situ sensors, and manual reports. The details of each source were investigated and introduced. Satellite data have the best spatial coverage and a long observation history. Airborne and in situ datasets are mostly case-specific or site-specific. Manual reports are brief descriptions, mostly concise terms and numbers answering basic questions. The different sources do not overlap completely and can be integrated to obtain an immersive understanding of the truth in the crop fields. Data sourcing has three major options: conventional, cloud-based, and crowdsourcing. Conventional sourcing has a very tough docking process. Cloud services simplify the uploading and distribution of big data. Crowdsourcing greatly lowers the cost of data collection and retrieval. Today the three sourcing methods coexist and share the market, with the latter two gradually taking the major portion. Future development is heading toward data distribution that is Internet-based, mobile-friendly, capable of handling big data, low-cost, robust, and high-performance.


References Aker, J. C. (2011). Dial “A” for agriculture: A review of information and communication technologies for agricultural extension in developing countries. Agricultural Economics, 42, 631–647. Amazon, E. (2010). Amazon elastic compute cloud (Amazon EC2) Amazon Elastic Compute Cloud (Amazon EC2). Amazon, E. (2013). Amazon Elastic Block Store (EBS) Amazon Web Services Inc. Arias, M., & Inglada, J. (2015). Sentinel-2 for agriculture design justiﬁcation ﬁle ESA Sentinel-2 for agriculture project. Arjomandi, M., Agostino, S., Mammone, M., Nelson, M., & Zhou, T. (2006). Classiﬁcation of unmanned aerial vehicles report for mechanical engineering class, University of Adelaide, Adelaide. Atwater, D. M. (2015). The commercial global drone market. Graziadio Business Review, 18, 1. Beloglazov, A., & Buyya, R. (2010). Energy efﬁcient resource management in virtualized cloud data centers. In Proceedings of the 2010 10th IEEE/ACM international conference on cluster, cloud and grid computing, IEEE Computer Society, pp. 826–831. Bhardwaj, S., Jain, L., & Jain, S. (2010). Cloud computing: A study of infrastructure as a service (IAAS). International Journal of engineering and information Technology, 2, 60–63. Boim, R., Greenshpan, O., Milo, T., Novgorodov, S., & Polyzotis, N. (2012). Tan W-C Asking the right questions in crowd data sourcing. In Data Engineering (ICDE), 2012 IEEE 28th international conference on. IEEE, pp. 1261–1264. Boucher, P. (2015). Domesticating the drone: The demilitarisation of unmanned aircraft for civil markets. Science and Engineering Ethics, 21, 1393–1412. Brabham, D. C. (2012). The myth of amateur crowds: A critical discourse analysis of crowdsourcing coverage. Information, Communication & Society, 15, 394–410. Burnett, M., Weinstein, B., & Mitchell, A. (2007). ECHO–enabling interoperability with NASA earth science data and services. In Geoscience and remote sensing symposium, 2007. IGARSS 2007. IEEE International. IEEE, pp. 4012–4015. Butler, D. 
(2014). Earth observation enters next phase: Expectations high as ﬁrst European sentinel satellite launches. Nature, 508, 160–162. Chen, S., F Laefer, D., & Mangina, E. (2016). State of technology review of civilian UAVs. Recent Patents on Engineering, 10, 160–174. Chhachhar, A. R., Qureshi, B., Khushk, G. M., & Ahmed, S. (2014). Impact of information and communication technologies in agriculture development. Journal of Basic and Applied Scientiﬁc Research, 4, 281–288. Chow, R., Golle, P., Jakobsson, M., Shi, E., Staddon, J., Masuoka, R., & Molina, J. (2009). Controlling data in the cloud: outsourcing computation without outsourcing control. In Proceedings of the 2009 ACM workshop on cloud computing security. ACM, pp. 85–90. Chuang, H.-M., Chang, C.-H., Kao, T.-Y., Cheng, C.-T., Huang, Y.-Y., & Cheong, K.-P. (2016). Enabling maps/location searches on mobile devices: Constructing a POI database via focused crawling and information extraction. International Journal of Geographical Information Science, 30, 1405–1425. https://doi.org/10.1080/13658816.2015.1133820. Chun, B., Culler, D., Roscoe, T., Bavier, A., Peterson, L., Wawrzoniak, M., & Bowman, M. (2003). Planetlab: an overlay testbed for broad-coverage services. ACM SIGCOMM Computer Communication Review, 33, 3–12. Cook, K. L. (2007). The silent force multiplier: The history and role of UAVs in warfare. In Aerospace conference, 2007 IEEE. IEEE, pp. 1–7. Cote, C. P. (2015). MQ-1C Gray eagle unmanned aircraft system (MQ-1C Gray Eagle). US Army Redstone Arsenal United States Cusumano, M. (2010). Cloud computing and SaaS as new computing platforms. Communications of the ACM, 53, 27–29.


De Silva, H., & Ratnadiwakara, D. (2008 November). Using ICT to reduce transaction costs in agriculture through better communication: A case-study from Sri Lanka LIRNEasia, Colombo, Sri Lanka Dehnen-Schmutz, K., Foster, G. L., Owen, L., & Persello, S. (2016). Exploring the role of smartphone technology for citizen science in agriculture. Agronomy for Sustainable Development, 36, 1–8. Di, L., & Yang, Z. (2014). Foreword to the special issue on agro-geoinformatics—The applications of geoinformatics in agriculture. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7, 4315–4316. Dillon, T., Wu, C., & Chang, E. (2010). Cloud computing: Issues and challenges. In Advanced Information Networking and Applications (AINA), 2010 24th IEEE international conference on. IEEE, pp. 27–33. Doan, A., Ramakrishnan, R., & Halevy, A. Y. (2011). Crowdsourcing systems on the world-wide web. Communications of the ACM, 54, 86–96. Drusch, M., et al. (2012). Sentinel-2: ESA’s optical high-resolution mission for GMES operational services. Remote Sensing of Environment, 120, 25–36. Eichkorn, S., Wilhelm, S., Aufmhoff, H., Wohlfrom, K., & Arnold, F. (2002). Cosmic ray-induced aerosol-formation: First observational evidence from aircraft-based ion mass spectrometer measurements in the upper troposphere. Geophysical Research Letters, 29. Fretwell, S. D. (1987). Food chain dynamics: The central theory of ecology? Oikos, 50, 291–301. Fritz, S., et al. (2012). Geo-Wiki: An online platform for improving global land cover. Environmental Modelling & Software, 31, 110–123. Frommberger, L., & Schmid, F. (2013). Mobile4D: crowdsourced disaster alerting and reporting. In Proceedings of the sixth international conference on information and communications technologies and development: Notes-volume 2. ACM, pp. 29–32. Frommberger, L., Schmid, F., & Cai, C. (2013). Micro-mapping with smartphones for monitoring agricultural development. 
In Proceedings of the 3rd ACM symposium on computing for development. ACM, p. 46. Goodall, J. L., Horsburgh, J. S., Whiteaker, T. L., Maidment, D. R., & Zaslavsky, I. (2008). A ﬁrst approach to web services for the national water information system. Environmental Modelling & Software, 23, 404–411. Gorelick, N. (2013). Google earth engine. In EGU general assembly conference abstracts, p 11997. Hallgren, E. L. (1974). The University Corporation for Atmospheric Research and the National Center for Atmospheric Research, 1960–1970: An institutional history. University Corporation for Atmospheric Research Han, W., Di, L., Zhao, P, Wei, Y., & Li, X. (2008). Design and implementation of GeoBrain online analysis system (GeOnAS). In: Web and wireless geographical information systems. Springer, pp. 27–36. Hankin, S. C. et al. (2010). NetCDF-CF-OPeNDAP: Standards for ocean data interoperability and object lessons for community data standards processes. In Oceanobs 2009, Venice Convention Centre, 21–25 septembre 2009, Venise. Hashem, I. A. T., Yaqoob, I., Anuar, N. B., Mokhtar, S., Gani, A., & Khan, S. U. (2015). The rise of “big data” on cloud computing: Review and open research issues. Information Systems, 47, 98–115. Haulman, D. L. (2003). US unmanned aerial vehicles in combat, 1991–2003. DTIC Document. Heidt, H., Puig-Suari, J., Moore, A., Nakasuka, S., & Twiggs, R. (2000). CubeSat: A new generation of picosatellite for education and industry low-cost space experimentation. Heipke, C. (2010). Crowdsourcing geospatial data. ISPRS Journal of Photogrammetry and Remote Sensing, 65, 550–557. Hirafuji, M. (2014). A strategy to create agricultural big data. In Global Conference (SRII), 2014 Annual SRII. IEEE, pp. 249–250.


Hofmann-Wellenhof, B., Lichtenegger, H., & Collins, J. (1994). Introduction. In Global positioning system: Theory and practice (pp. 1–11). Vienna: Springer. https://doi.org/10.1007/978-37091-3311-8_1. Howe, J. (2006). The rise of crowdsourcing. Wired Magazine, 14, 1–4. Hubbard, K. G., Rosenberg, N. J., & Nielsen, D. C. (1983). Automated weather data network for agriculture. Journal of Water Resources Planning and Management, 109, 213–222. Irwin, A. (2001). Constructing the scientiﬁc citizen: Science and democracy in the biosciences. Public Understanding of Science, 10, 1–18. Jones, C. B. (2014). Geographical information systems and computer cartography. Routledge. Jurgens, C. (1997). The modiﬁed normalized difference vegetation index (mNDVI) a new index to determine frost damages in agriculture based on Landsat TM data. International Journal of Remote Sensing, 18, 3583–3594. Kanhere, S. S. (2011). Participatory sensing: Crowdsourcing data from mobile smartphones in urban spaces. In Mobile Data Management (MDM), 2011 12th IEEE international conference on. IEEE, pp. 3–6. Kramer, H. J. (2002). Observation of the earth and its environment: Survey of missions and sensors. Springer. Kumar, R., Jain, K., Maharwal, H., Jain, N., & Dadhich, A. (2014). Apache cloudstack: Open source infrastructure as a service cloud computing platform proceedings of the international journal of advancement in engineering technology. Management and Applied Science, 111–116. Lee, S., & Choi, Y. (2016). Reviews of unmanned aerial vehicle (drone) technology trends and its applications in the mining industry. Geosystem Engineering, 19, 197–204. Lee, S., Hutputanasin, A., Toorian, A., Lan, W., & Munakata, R. (2009). CubeSat design speciﬁcation the CubeSat Program 8651:22. Lowry, C. S., & Fienen, M. N. (2013). CrowdHydrology: Crowdsourcing hydrologic data and engaging citizen scientists. Ground Water, 51, 151–156. Lukyanenko, R., Parsons, J., & Wiersma, Y. (2011). 
Citizen science 2.0: Data management principles to harness the power of the crowd. In Service-oriented perspectives in design science research. Springer, pp. 465–473. Lyson, T. A. (2012). Civic agriculture: Reconnecting farm, food, and community. UPNE. Marx, V. (2013). Biology: The big challenges of big data. Nature, 498, 255–260. Mays, K. L., Shepson, P. B., Stirm, B. H., Karion, A., Sweeney, C., & Gurney, K. R. (2009). Aircraft-based measurements of the carbon footprint of Indianapolis. Environmental Science & Technology, 43, 7816–7823. Meera, S. N., Jhamtani, A., & Rao, D. (2004). Information and communication technology in agricultural development: A comparative analysis of three projects from India Network Paper No 135. Milojičić, D., Llorente, I. M., & Montero, R. S. (2011). OpenNebula: A cloud management tool. IEEE Internet Computing, 15, 11–14. Möhler, O., Reiner, T., & Arnold, F. (1993). A novel aircraft-based tandem mass spectrometer for atmospheric ion and trace gas measurements. Review of Scientiﬁc Instruments, 64, 1199–1207. Mukhopadhyay, S. C. (2012). Smart sensing technology for agriculture and environmental monitoring. Springer. Mulla, D. J. (2013). Twenty ﬁve years of remote sensing in precision agriculture: Key advances and remaining knowledge gaps. Biosystems Engineering, 114, 358–371. Muller, C., et al. (2015). Crowdsourcing for climate and atmospheric sciences: Current status and future potential. International Journal of Climatology, 35, 3185–3203. Nabrzyski, J., Liu, C., Vardeman, C., & Gesing, S., & Budhatoki, M. (2014) Agriculture data for all-integrated tools for agriculture data integration, analytics, and sharing. In Big Data (BigData congress), 2014 IEEE International congress on, 2014. IEEE, pp. 774–775. NASA. (2014). MODIS data products table. https://lpdaac.usgs.gov/dataset_discovery/modis/ modis_products_table. Accessed 1 Dec 2015.


NASA. (2016). An overview of EOSDIS. https://earthdata.nasa.gov/about. Accessed 16 May 2016. Newman, D. L. (2013). Drone for collecting images and system for categorizing image data. In Google Patents. Ozelkan, E., Chen, G., & Ustundag, B. B. (2016). Multiscale object-based drought monitoring and comparison in rainfed and irrigated agriculture from Landsat 8 OLI imagery. International Journal of Applied Earth Observation and Geoinformation, 44, 159–170. Pahl, C. (2015). Containerization and the paas cloud. IEEE Cloud Computing, 2, 24–31. Palankar, M. R., Iamnitchi, A., Ripeanu, M., & Garﬁnkel, S. (2008). Amazon S3 for science grids: A viable solution? In Proceedings of the 2008 international workshop on data-aware distributed computing. ACM, pp. 55–64. Parkinson, C. L. (2003). Aqua: An Earth-observing satellite mission to examine water and other climate variables. IEEE Transactions on Geoscience and Remote Sensing, 41, 173–183. Peterson, L., & Roscoe, T. (2006). The design principles of PlanetLab. ACM SIGOPS Operating Systems Review, 40, 11–16. Pierce, F. J., & Nowak, P. (1999). Aspects of precision agriculture. Advances in Agronomy, 67, 1–85. Pongnumkul, S., Chaovalit, P., & Surasvadi, N. (2015). Applications of smartphone-based sensors in agriculture: A systematic review of research. Journal of Sensors. Qian, L., Luo, Z., Du, Y., & Guo, L. (2009). Cloud computing: An overview. In IEEE international conference on cloud computing. Springer, pp. 626–631. Rew, R., Davis, G., Emmerson, S., & Davies, H. (1997). NetCDF user’s guide for C Unidata Program Center, June 1:997 Richards, J. A., & Richards, J. (1999). Remote sensing digital image analysis (Vol. 3). Springer. Rossel, R. A. V., & Bouma, J. (2016). Soil sensing: A new paradigm for agriculture. Agricultural Systems, 148, 71–74. Rossiter, D. G., Liu, J., Carlisle, S., & Zhu, A.-X. (2015). Can citizen science assist digital soil mapping? Geoderma, 259, 71–80. Sandau, R. (2010). 
Status and trends of small satellite missions for Earth observation. Acta Astronautica, 66, 1–12. Savtchenko, A., Ouzounov, D., Ahmad, S., Acker, J., Leptoukh, G., Koziana, J., & Nickless, D. (2004). Terra and Aqua MODIS products available from NASA GES DAAC. Advances in Space Research, 34, 710–714. Schmunck, R. B. (2002). Panoply netCDF, HDF and GRIB Data Viewer. http://www.giss.nasa. gov/tools/panoply/ Schneider, M., & Wagner, P. (2015). Prerequisites for the adoption of new technologies–the example of precision agriculture. See, L., et al. (2015). Harnessing the power of volunteers, the internet and Google Earth to collect and validate global spatial information using Geo-Wiki. Technological Forecasting and Social Change, 98, 324–335. Sheoran, A., & Haack, B. (2013). Classiﬁcation of California agriculture using quad polarization radar data and Landsat Thematic Mapper data. GIScience & Remote Sensing, 50, 50–63. Silvertown, J., et al. (2015). Crowdsourcing the identiﬁcation of organisms: A case-study of iSpot. ZooKeys, 480, 125. Singh, P., Jagyasi, B., Rai, N., & Gharge, S. (2014). Decision tree based mobile crowdsourcing for agriculture advisory system. In India conference (INDICON), 2014 annual IEEE. IEEE, pp. 1–6. Stafford, J. V. (2000). Implementing precision agriculture in the 21st century. Journal of Agricultural Engineering Research, 76, 267–275. Straub, J., Korvald, C., Nervold, A., Mohammad, A., Root, N., Long, N., & Torgerson, D. (2013). OpenOrbiter: A low-cost, educational prototype CubeSat mission architecture. Machines, 1, 1. Toorian, A., Diaz, K., & Lee, S. (2008). The cubesat approach to space access. In Aerospace conference, 2008 IEEE, IEEE, pp. 1–14.


Torbick, N., Chowdhury, D., Salas, W., & Qi, J. (2017). Monitoring rice agriculture across Myanmar using time series Sentinel-1 assisted by Landsat-8 and PALSAR-2. Remote Sensing, 9, 119. USGS. (2014). Landsat project statistics. http://landsat.usgs.gov/Landsat_Project_Statistics.php. Accessed 21 Sept 2014. Valavanis, K. P. (2008). Advances in unmanned aerial vehicles: State of the art and the road to autonomy (Vol. 33). Springer Varia, J., & Mathew, S. (2014). Overview of amazon web services Amazon Web Services. Wang, C., Wang, Q., Ren, K., & Lou, W. (2010). Privacy-preserving public auditing for data storage security in cloud computing. In Infocom, 2010 proceedings IEEE, IEEE, pp. 1–9. Wang, G., & Ng, T. E. (2010). The impact of virtualization on network performance of amazon ec2 data center. In Infocom, 2010 proceedings IEEE, IEEE, pp. 1–9. Wang, N., Zhang, N., & Wang, M. (2006). Wireless sensors in agriculture and food industry— Recent development and future perspective. Computers and Electronics in Agriculture, 50, 1–14. https://doi.org/10.1016/j.compag.2005.09.003. Wikipedia. (2014a). DigitalGlobe. http://en.wikipedia.org/wiki/DigitalGlobe. Accessed 21 Sept 2014. Wikipedia. (2014b). Landsat program. http://en.wikipedia.org/wiki/Landsat_program. Accessed 21 Sept 2014. Williams, D. (2015). The earth system grid federation (ESGF): Climate science infrastructure for large-scale Data management and dissemination. In AGU fall meeting abstracts. Williams, D. N., et al. (2009). The Earth System Grid: Enabling access to multimodel climate simulation data. Bulletin of the American Meteorological Society, 90, 195–205. Woellert, K., Ehrenfreund, P., Ricco, A. J., & Hertzfeld, H. (2011). Cubesats: Cost-effective science and technology platforms for emerging and developing nations. Advances in Space Research, 47, 663–684. Wolock, D. (2003). Flow characteristics at US Geological Survey streamgages in the conterminous United States. Yang, W. (2010). 
Operational delivery of customized earth observation data using web coverage service geospatial web services: Advances in information interoperability: Advances in information interoperability: 385 Yang, W., & Di, L. (2002). Serving NASA HDF-EOS Data through NWGISS coverage server. In Proceedings of the NASA Earth Science Technologies conference (pp. 11–13). Zhang, N., Wang, M., & Wang, N. (2002). Precision agriculture—A worldwide overview. Computers and Electronics in Agriculture, 36, 113–132. Zhao, P., et al. (2015). Exploring NASA GES DISC Data with Interoperable Services. Zhong, L., Gong, P., & Biging, G. S. (2014). Efﬁcient corn and soybean mapping with temporal extendability: A multi-year experiment using Landsat imagery. Remote Sensing of Environment, 140, 1–13.

Chapter 5

Standards and Interoperability Yuqi Bai

Abstract Standards are documented consensus agreements containing safety, technical specifications, or other precise criteria to be used consistently as rules, guidelines, or definitions of characteristics for materials, products, processes, and services. Standards are powerful tools that can help drive innovation and increase productivity. Standards on the modeling, classification, description, discovery, and dissemination of agricultural science data play an important role in ensuring the quality of, interoperability between, and easy access to agricultural data that are independently collected, processed, and shared in each nation, state, county, or even city. This chapter briefly introduces the International Organization for Standardization (ISO), the Open Geospatial Consortium (OGC), the European Committee for Standardization (CEN), and the American National Standards Institute (ANSI) as representative international, regional, and national standard organizations. With a focus on information technology standards that apply closely to agricultural activities, it then presents standards on data content or encoding, metadata, and data services.

Keywords Standard · Data content · Metadata · Data service · ISO · OGC · CEN · ANSI · FAO

5.1 Introduction

Standards are documented consensus agreements containing safety, technical specifications, or other precise criteria to be used consistently as rules, guidelines, or definitions of the characteristics of materials, products, processes, and services. In many cases, they provide uniformity, which allows worldwide acceptance and application of a product or material. The aim is to facilitate trade, exchange, and technology transfer. Standards help to remove technical barriers to trade, leading to new markets and economic growth for the industry (ISO 2015a).

According to ISO/IEC 2382:2015, Information technology – Vocabulary, interoperability is defined as follows: “The capability to communicate, execute programs, or transfer data among various functional units in a manner that requires the user to have little or no knowledge of the unique characteristics of those units” (ISO 2015b). One way to understand the relationship between standards and interoperability is that standards are the means of ensuring interoperability among the members of the agro-geoinformatics community.

Agriculture is the cultivation and breeding of animals, plants, and fungi for food, fiber, biofuel, medicinal plants, and other products used to sustain and enhance life (International Labour Organization 1999). The standards that cover these aspects of agriculture are very comprehensive. Since this book is about agro-geoinformatics, the standards and interoperability discussed in this chapter focus on the information technology that applies to the data and services collected, analyzed, and shared in these agricultural activities.

Y. Bai (*) Tsinghua University, Beijing, China e-mail: [email protected] © Springer Nature Switzerland AG 2021 L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_5

5.2 Standard Organizations

Standards are drawn up at the international, regional, and national levels. Common structures and cooperation agreements are key to organizing and coordinating work across these three levels. The International Organization for Standardization (ISO), the Open Geospatial Consortium (OGC), the European Committee for Standardization (CEN), and the American National Standards Institute (ANSI) are representative international, regional, and national standard organizations.

5.2.1 ISO

The ISO is an independent, nongovernmental international organization with a membership of 161 national standardization bodies. Through its members, it brings together experts to share knowledge and develop voluntary, consensus-based, market-relevant International Standards that support innovation and provide solutions to global challenges. The ISO officially began operations on February 23, 1947. It maintains a hierarchical governance structure in which 778 technical committees and subcommittees are responsible for standards development. Over the last 70 years, the ISO has published more than 22,000 International Standards covering almost all aspects of technology and manufacturing, including agricultural activities.


For agricultural activities, the ISO has developed a series of standards covering soil quality and pedology; farm buildings, structures, and installations; agricultural machines, implements, and equipment; fertilizers; pesticides and other agrochemicals; animal feeding stuffs; beekeeping; hunting; fishing and fish breeding; and tobacco, tobacco products, and related equipment (https://www.iso.org/ics/65/x/).

5.2.2 OGC

The OGC (Open Geospatial Consortium) is an international not-for-profit organization committed to making quality open standards for the global geospatial community. OGC standards are technical documents that detail interfaces or encodings. Software developers use these documents to build open interfaces and encodings into their products and services so that they interoperate. For example, the OGC Web Map Service (WMS) Implementation Specification describes a service that produces maps of spatially referenced data dynamically from geographic information. This international standard defines a “map” to be a portrayal of geographic information as a digital image file suitable for display on a computer screen. Because the service interface is defined in the standard, a WMS client can access any WMS service, regardless of which community developed it.

The OGC standards focus on geospatial data modeling (e.g., Geography Markup Language), data representation (e.g., Simple Feature Access), data rendering (e.g., Web Map Service), data access (e.g., Web Coverage Service), data search (e.g., catalogue service), and others. They have been used in a wide variety of domains, including the environment, defense, health, agriculture, meteorology, sustainable development, and many more. OGC members come from government, commercial organizations, NGOs, and academic and research organizations.

To facilitate the use and development of OGC standards in the agricultural domain, the OGC established the Agriculture Domain Working Group (http://www.opengeospatial.org/projects/groups/agriculturedwg) to pursue goals such as matching precision agricultural machinery with precision agricultural knowledge and promoting crop resiliency at large and small scales.

5.2.3 CEN

The CEN, the European Committee for Standardization, is an association that brings together the national standardization bodies of 34 European countries. It is one of the three European standardization organizations (together with CENELEC and ETSI) responsible for developing and defining voluntary standards at the European level.


The CEN supports standardization activities in a wide range of fields and sectors, including air and space, chemicals, construction, consumer products, defense and security, energy, the environment, food and feed, health and safety, healthcare, ICT, machinery, materials, pressure equipment, services, smart living, transport, and packaging. In particular, the CEN has published around 65 standards related to agricultural activities (https://standards.cen.eu/index.html), covering topics such as agricultural machinery, plastic-thermoplastic films, and equipment for crop protection.

5.2.4 ANSI

The American National Standards Institute (ANSI) has served as administrator and coordinator of the United States private sector voluntary standardization system for 100 years. Founded in 1918 by five engineering societies and three government agencies, the institute remains a private, nonprofit membership organization supported by a diverse constituency of private and public sector organizations. The institute has nearly 1000 company, organization, government agency, institutional, and international members. More than 10,000 American National Standards (ANS) have been developed.

Besides adopting the ISO agriculture-related standards, the ANSI has promoted a comprehensive sustainable agriculture standard that addresses all three aspects of sustainability: environmental stewardship, social responsibility, and economic prosperity (ANSI-LEO-4000). This standard empowers the entire agricultural supply chain, from producers to consumers, to advance sustainability. It provides clear communication of farm production sustainability achievements down the supply chain and clear communication of buyer requests for such achievements up the supply chain. The standard also provides buyers and sellers with reliable benchmarking through four levels of certification (bronze through platinum) and a structure that encourages continuous improvement strategies. The standard currently addresses agricultural crops and will be expanded to address animal production in the future.

5.3 Typical Standard Development Process

The process of developing a standard is essentially a way to reach consensus. Although the exact process differs from one standard organization to another, it usually consists of the following steps: interested members propose a standard development effort; the organization's authority approves it; members work through successive versions of the standard document; communities provide feedback and comments, which have to be addressed; the standard is balloted; and members finally approve the standard document.


Fig. 5.1 General ISO standard development process

Taking the ISO as an example, the flow charts in Figs. 5.1 and 5.2 depict the general process and four typical arrangements for standard development. The acronyms in these figures denote particular versions of the standard document: WD for working draft, CD for committee draft, and DIS for draft international standard.

Fig. 5.2 Detailed ISO standard development process

5.4 Types of Standards

The farm business, the farm supply chain, and public agricultural policies are increasingly tied to quantitative data about crops, soils, water, weather, markets, energy, and biotechnology. How farming can become more, not less, sustainable, both as a business and as a necessity for life, in the face of climate change, growing populations, and scarcity of water and energy is a major challenge. Standards are one of the keys to ensuring this sustainability. With a focus on information technology, the types of standards that apply to agricultural activities mainly concern data content or encoding, metadata, and data services.

5.4.1 Data Content or Encoding Standard

Data content or encoding standards are used to represent or encode the elements of agricultural data about crops, soils, water, weather, markets, energy, and biotechnology. For the geolocation information attached to these types of agricultural data, some public standards for representing geospatial data are applicable, such as the Geography Markup Language (GML). The GML is an XML grammar written in XML Schema for the description of application schemas as well as the transport and storage of geographic information. The key concepts used by the GML to model the world are drawn from the ISO 19100 series of international standards and the OpenGIS Abstract Specification. A feature is an abstraction of real-world phenomena (ISO 19101); it is a geographic feature if it is associated with a location relative to the Earth. In GML, Point, AbstractCurve, LineString, Curve, AbstractSurface, Polygon, Surface, AbstractSolid, Solid, MultiPoint, MultiCurve, MultiSurface, MultiSolid, and MultiGeometry are the major geometry elements. They can be used to model the geolocation information of agricultural data, for example, using a GML Point to model each corner of a field or a GML Polygon to model the boundary of the field. Below is an example describing a farm where the latitude and longitude coordinates of each of the four corner points are represented using a GML Polygon element.

Name: Proud Food Farm

Description: Proud Food Farm is located in Silver Spring, Maryland. We grow and sell a wide variety of vegetables, berries, herbs, raw honey, fresh eggs & beneficial flowers. Everything is organically-grown and sold on-site, ensures that you know what you are eating and can see where it comes from. We also have made it our goal to be affordable and our prices compete with organic vegetables sold at the grocery store...but, much fresher and tastier. Heirloom vegetables, unique varieties, and free cut flowers with every purchase above $10.

Reference: www.localharvest.org/proud-food-farm-M68584

Polygon coordinates (latitude longitude): 45.256 -110.45 46.46 -109.48 43.84 -109.86 45.256 -110.45
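As a minimal sketch of how such a boundary could be encoded, the Python fragment below builds a gml:Polygon from the coordinate list above using only the standard library. The GML 3.2 namespace and the Polygon, exterior, LinearRing, and posList elements come from the GML standard; the enclosing Farm and boundary elements, the gml:id value, and the function names are hypothetical placeholders for whatever an application schema would define, and the output is not validated against the GML schema.

```python
import xml.etree.ElementTree as ET

GML_NS = "http://www.opengis.net/gml/3.2"  # GML 3.2 namespace
ET.register_namespace("gml", GML_NS)


def gml_polygon(corners, gml_id="farm-boundary-1",
                srs_name="urn:ogc:def:crs:EPSG::4326"):
    """Build a gml:Polygon from (latitude, longitude) pairs.

    The ring is closed automatically by repeating the first position.
    The gml_id default is an illustrative value, not from the source.
    """
    ring = list(corners)
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    polygon = ET.Element(
        f"{{{GML_NS}}}Polygon",
        {f"{{{GML_NS}}}id": gml_id, "srsName": srs_name},
    )
    exterior = ET.SubElement(polygon, f"{{{GML_NS}}}exterior")
    linear_ring = ET.SubElement(exterior, f"{{{GML_NS}}}LinearRing")
    pos_list = ET.SubElement(linear_ring, f"{{{GML_NS}}}posList")
    pos_list.text = " ".join(f"{lat} {lon}" for lat, lon in ring)
    return polygon


def farm_feature(name, description, corners):
    """Wrap the polygon in a hypothetical application-schema element."""
    farm = ET.Element("Farm")  # hypothetical element name
    ET.SubElement(farm, f"{{{GML_NS}}}description").text = description
    ET.SubElement(farm, f"{{{GML_NS}}}name").text = name
    boundary = ET.SubElement(farm, "boundary")  # hypothetical property name
    boundary.append(gml_polygon(corners))
    return farm


# Coordinates taken from the example above (latitude, longitude).
corners = [(45.256, -110.45), (46.46, -109.48), (43.84, -109.86)]
farm = farm_feature(
    "Proud Food Farm",
    "Proud Food Farm is located in Silver Spring, Maryland.",
    corners,
)
print(ET.tostring(farm, encoding="unicode"))
```

Closing the ring inside the helper keeps the caller from having to repeat the first position, which is why the coordinate list in the example ends with the same pair it starts with.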

5.4.2 Metadata Content and Encoding Standard

Metadata is “data about data.” The three distinct types of metadata are descriptive metadata, structural metadata, and administrative metadata. Descriptive metadata describes a resource for purposes such as discovery and identification. It can include elements such as title, abstract, author, and keywords. Structural metadata is metadata about containers of data and indicates how compound objects are put together, for example, how pages are ordered to form chapters. It describes the types, versions, relationships, and other characteristics of digital materials. Administrative metadata provides information to help manage a resource, such as when and how it was created, file type and other technical information, and who can access it (https://en.wikipedia.org/wiki/Metadata).

Metadata content and encoding standards specify how metadata are composed and how each part should be expressed. For geolocation information about agricultural data, ISO 19115, 19115-2, and 19139 are applicable standards that cover metadata content and encoding. For thematic information about agricultural data, there are metadata standards that can be referenced, such as the Agricultural Metadata Element Set (AgMES) published by the Food and Agriculture Organization of the United Nations (FAO) (http://aims.fao.org/standards/agmes).

In ISO 19115, the metadata about geospatial information are organized into 12 groups: spatial representation, reference system, metadata extension, constraint, identification, maintenance, application schema, portrayal catalogue, content, distribution, data quality, and lineage (Fig. 5.3).

ISO 19115-2 further defines the metadata for imagery and gridded data. In particular, it provides information about the properties of the measuring equipment used to acquire the data, the geometry of the measuring process employed by the equipment, and the production process used to digitize the raw data. This extension deals with the metadata needed to describe the derivation of geographic information from the raw data, including the properties of the measuring system and the numerical methods and computational procedures used in the derivation. The relationship between the metadata contents defined in 19115-2 and those in 19115 is shown in Fig. 5.4.

Fig. 5.3 Metadata schema classes in the ISO 19115 (excerpted)

Fig. 5.4 Metadata packages in the ISO 19115-2 (excerpted)

The AgMES (Agricultural Metadata Element Set) initiative aims to address semantic standards in the domain of agriculture with respect to description, resource discovery, interoperability, and data exchange for different types of information resources (http://aims.fao.org/standards/agmes/namespacespecification). AgMES metadata terms can be defined for newly declared elements that are deemed necessary, and they are used for different resources (bibliographic references or DLIOs, projects, images, technologies, practices, maps, etc.) in all areas relevant to food production, nutrition, and rural development. The AgMES is based on the Dublin Core Metadata Initiative (DCMI), with several new properties and encoding schemes proposed. This element set is maintained by the FAO. The AgMES was further extended to support application profiles, such as the AGRIS metadata (AGRIS AP). The AGRIS (International System for Agricultural Science and Technology) is a global public domain database with more than eight million structured bibliographical records on agricultural science and technology (https://en.wikipedia.org/wiki/AGRIS).
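Because the AgMES builds on Dublin Core, a descriptive metadata record can be illustrated with plain Dublin Core elements. The sketch below is a simplified illustration rather than an AgMES or AGRIS AP record: the Dublin Core namespace and element names (title, creator, subject, date, type, language) are real, whereas the wrapping record element, the function name, and all field values are invented placeholders, and AgMES-specific properties are deliberately left out.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a flat record from a mapping of DC element name to value(s).

    A value may be a single string or a list of strings; list entries
    become repeated elements, as Dublin Core allows.
    """
    record = ET.Element("record")  # hypothetical wrapper element
    for element, values in fields.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{element}").text = value
    return record


# Illustrative values only; they do not describe a real dataset.
record = dublin_core_record({
    "title": "Example maize yield survey",
    "creator": "Example Agricultural Institute",
    "subject": ["maize", "crop yield"],
    "date": "2015-11-30",
    "type": "Dataset",
    "language": "en",
})
print(ET.tostring(record, encoding="unicode"))
```

All of the fields shown are descriptive metadata in the sense used above; structural and administrative metadata would need additional elements that this sketch does not attempt to model.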

5.4.3 Data Service Standard

Data service is a term for online applications that serve agricultural science data on the Web. Typical use cases of data services are rendering crop yield data as colorful maps, accessing national farm geolocation data as GML features, and searching for historical crop data of interest. The OGC and ISO have published four types of data service standards that can be used to facilitate the discovery of and access to agricultural data: Catalogue service, Web Map Service, Web Feature Service, and Web Coverage Service.
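To make the map-rendering use case concrete, the following sketch assembles a WMS 1.3.0 GetMap request as a key-value-pair URL. The parameter names (SERVICE, VERSION, REQUEST, LAYERS, STYLES, CRS, BBOX, WIDTH, HEIGHT, FORMAT) are those defined by the WMS standard; the server address, the layer name crop_yield, and the bounding box are hypothetical values chosen for illustration.

```python
from urllib.parse import urlencode


def wms_getmap_url(base_url, layer, bbox, width, height,
                   crs="EPSG:4326", image_format="image/png", style=""):
    """Return a WMS 1.3.0 GetMap URL.

    For EPSG:4326 in WMS 1.3.0 the bounding box follows the CRS axis
    order: (min latitude, min longitude, max latitude, max longitude).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": style,  # an empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoint and layer name.
url = wms_getmap_url(
    "https://example.org/wms",
    "crop_yield",
    (36.0, -102.0, 41.0, -94.0),
    width=800,
    height=500,
)
print(url)
```

Any standards-compliant WMS client could issue the same request against a different server by changing only the base URL and layer name, which is the interoperability benefit the standard is meant to deliver.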


The OGC Catalogue services support the ability to publish and search collections of descriptive information (metadata) for data, services, and related information objects (Douglas et al. 2007). The metadata in catalogues represent resource characteristics that can be queried and presented for evaluation and further processing by both humans and software. For example, once the metadata about each farm are maintained in a catalogue service, searching for farms of interest by location, opening dates, products, or crops in different seasons becomes straightforward. To this end, the OGC Catalogue service standard defines how to organize the metadata, how to express a query against the metadata in a request, and how to present the matched results in the response.

The OGC Web Map Service (WMS) is mainly for producing maps of spatially referenced data dynamically from geographic information (de la Jeff 2006). In particular, it defines a “map” to be a portrayal of geographic information as a digital image file suitable for display on a computer screen. WMS-produced maps are generally rendered in a pictorial format such as PNG, GIF, or JPEG or occasionally as vector-based graphical elements in Scalable Vector Graphics (SVG) or Web Computer Graphics Metafile (WebCGM) formats. The standard clearly defines what is to be rendered (such as a specific data layer) and how.

The OGC Web Feature Service (WFS) standard specifies the behavior of a service that provides transactions on and access to geographic features (Panagiotis 2010). It mainly specifies discovery operations, query operations, locking operations, and transaction operations. The WFS represents a change in the way geographic information is created, modified, and exchanged on the Internet.
Rather than sharing geographic information at the file level using File Transfer Protocol (FTP), for example, the WFS offers direct fine-grained access to geographic information at the feature and feature property level. Web Feature Services allow clients to retrieve or modify only the data they are seeking, rather than retrieving a file that contains the data they are seeking and possibly much more. Those data can then be used for a wide variety of purposes, including purposes other than those their producers intended. For example, once the national farm geolocation data are exposed as a Web Feature Service, any small part of them, such as all the farms located in one specific state, county, or city or even in a user-specified area, can be dynamically exported. One significant benefit of this service is that the data owner only needs to maintain one copy of the national-scale data, while the WFS provides flexible access to any subset of it.

The OGC Web Coverage Service (WCS) standard supports the electronic retrieval of geospatial data as “coverages,” that is, digital geospatial information representing space/time-varying phenomena (Peter 2010). Unlike the WMS, which portrays spatial data to return static maps (rendered as pictures by the server), the WCS provides the available data together with their detailed descriptions, so that the data can be interpreted, extrapolated, etc., and not just portrayed. Unlike the WFS, which returns discrete geospatial features, the WCS returns coverages representing space/time-varying phenomena that relate a spatiotemporal domain to a (possibly multidimensional) range of properties. One use case of the WCS for agricultural data is access to nationwide cropland classification data, which usually consist of crop type information for each gridded ground area (e.g., 30 m × 30 m).
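The difference between feature access and coverage access can be sketched with two request builders. The key-value-pair parameters used here (TYPENAMES, BBOX, and COUNT for WFS 2.0 GetFeature; COVERAGEID, SUBSET, and FORMAT for WCS 2.0 GetCoverage) are defined by the respective standards. The endpoints, the feature type farms, the coverage identifier cropland_2015, and the axis labels Lat and Long are assumptions; real axis labels depend on how a server describes each coverage.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, bbox, count=None):
    """Return a WFS 2.0 GetFeature URL restricted to a bounding box."""
    params = [
        ("SERVICE", "WFS"),
        ("VERSION", "2.0.0"),
        ("REQUEST", "GetFeature"),
        ("TYPENAMES", type_name),
        ("BBOX", ",".join(str(v) for v in bbox)),
    ]
    if count is not None:
        params.append(("COUNT", count))  # cap the number of features returned
    return f"{base_url}?{urlencode(params)}"


def wcs_getcoverage_url(base_url, coverage_id, subsets,
                        output_format="image/tiff"):
    """Return a WCS 2.0 GetCoverage URL with one SUBSET per axis.

    `subsets` maps an axis label to a (low, high) trim interval.
    """
    params = [
        ("SERVICE", "WCS"),
        ("VERSION", "2.0.1"),
        ("REQUEST", "GetCoverage"),
        ("COVERAGEID", coverage_id),
    ]
    for axis, (low, high) in subsets.items():
        params.append(("SUBSET", f"{axis}({low},{high})"))
    params.append(("FORMAT", output_format))
    return f"{base_url}?{urlencode(params)}"


# Hypothetical endpoints, feature type, and coverage identifier.
farms_url = wfs_getfeature_url(
    "https://example.org/wfs", "farms", (36.0, -102.0, 41.0, -94.0), count=100
)
cropland_url = wcs_getcoverage_url(
    "https://example.org/wcs",
    "cropland_2015",
    {"Lat": (36.0, 41.0), "Long": (-102.0, -94.0)},
)
print(farms_url)
print(cropland_url)
```

The first request returns discrete farm features inside the box, whereas the second returns the gridded crop type values for the same area, so a client can choose whichever form its analysis needs without downloading the full national dataset.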

5.4.4 Statistical Standards and Methodological Guidelines

To ensure consistency in the process of collecting, analyzing, and disseminating agricultural data, many international statistical standards, methodological guidelines, and tools have been developed. Taking the FAO as an example, the major statistical standards and guidelines are listed below.

• The FAOSTAT Commodity List (FCL) contains a classification of the agricultural commodities currently used by the FAO (http://www.fao.org/waicent/faoinfo/economic/faodef/faodefe.htm). It defines 12 Commodity Groups, including nuts and derived products, oil-bearing crops and derived products, vegetables and derived products, and fruits and derived products.

• The Central Product Classification (CPC) is developed and maintained by the United Nations Statistics Division (UNSD) (https://unstats.un.org/unsd/cr/registry/cpc-2.asp). Its main purpose is to provide a framework for facilitating the international comparison of product statistics and to serve as a guide for developing or revising existing classification schemes, in order to make them compatible with international standards. In this standard, agricultural, forestry, and fishery products are classified into section 0, which is further divided into subgroups. For example, the classification of watermelons is as follows:

    Section: 0 – Agricultural, forestry, and fishery products
    Division: 01 – Products of agriculture, horticulture, and market gardening
    Group: 012 – Vegetables
    Class: 0122 – Melons
    Subclass: 01221 – Watermelons

• The Harmonized Commodity Description and Coding System (HS) is the trade classification most widely used in the world (https://unstats.un.org/unsd/tradekb/Knowledgebase/50018/Harmonized-Commodity-Description-and-Coding-Systems-HS). Commodities are generally classified according to the raw or basic material, the degree of processing, the use or function, and economic activities. At the international level, the HS is a six-digit code system for classifying goods. For example, the HS entry for watermelons is:

    080711 – Fruit, edible; watermelons, fresh

  Note that the HS is a detailed listing of commodities rather than a proper classification for the purpose of organizing official statistics.
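Both coding systems are hierarchical, so the parent levels of a code can be recovered from its leading digits. The short sketch below illustrates this for the watermelon codes quoted above. It relies on the five-level structure of the CPC (section, division, group, class, subclass) and on the HS convention that the first two digits identify the chapter and the first four the heading; the function names are illustrative, and codes are handled as strings so that leading zeros are preserved.

```python
CPC_LEVELS = ("section", "division", "group", "class", "subclass")


def cpc_hierarchy(code):
    """Split a five-digit CPC subclass code into its parent levels."""
    if len(code) != 5 or not code.isdigit():
        raise ValueError("expected a five-digit CPC subclass code")
    # Level i is identified by the first i + 1 digits of the code.
    return {level: code[: i + 1] for i, level in enumerate(CPC_LEVELS)}


def hs_hierarchy(code):
    """Split a six-digit HS code into chapter, heading, and subheading."""
    if len(code) != 6 or not code.isdigit():
        raise ValueError("expected a six-digit HS code")
    return {"chapter": code[:2], "heading": code[:4], "subheading": code}


print(cpc_hierarchy("01221"))
print(hs_hierarchy("080711"))
```

For the CPC code 01221, the function yields section 0, division 01, group 012, class 0122, and subclass 01221, which matches the watermelon example listed above.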

5.5 Conclusion

Standards on the modeling, classification, description, discovery, and dissemination of agricultural science data play an important role in ensuring the quality of, interoperability between, and easy access to agricultural data that are independently collected, processed, and shared in each nation, state, county, or even city. They are usually developed through a consensus approach. The ISO, OGC, and FAO are the major international standard organizations that have published standards on data content, metadata, and a variety of data services, including the Catalogue Service, WMS, WFS, and WCS. Further customization and extension of these public standards will continue to be needed to meet new needs.

References

de la Jeff, B. (2006). OpenGIS web map server implementation specification. Wayland: Open Geospatial Consortium.
Douglas, N., Arliss, W., & Panagiotis (Peter) Vretanos. (2007). OpenGIS catalogue services specification.
International Labour Organization. (1999). Safety and health in agriculture (p. 77). ISBN 978-92-2111517-5.
ISO/TR 19300. (2015a). Graphic technology – Guidelines for the use of standards for print media production. https://www.iso.org/standard/64081.html
ISO/IEC 2382. (2015b). Information technology – Vocabulary. https://www.iso.org/standard/63598.html
Panagiotis, A. V. (2010). OpenGIS web feature service 2.0 interface standard.
Peter, B. (2010). OGC WCS 2.0 interface standard – Core.

Chapter 6

Image Processing Methods in Agricultural Observation Systems Chen Zhang and Li Lin

Abstract Image processing is an essential part of the agricultural observation system. This chapter is the first attempt to provide an overview of image processing methods, technologies, and tools from the perspective of agro-geoinformatics. First, we introduce the origins, definitions, and basic steps of digital image processing. Along with traditional image processing hardware and software, state-of-the-art technologies for agricultural image processing, such as mobile device-based image processing and cloud computing-based image processing, are covered. Image data can be acquired by different sensors in different ways. We discuss three common approaches to collecting agricultural image data (in situ, airborne, and space-borne data collection) as well as the big data challenge in agro-geoinformatics. As the core image processing operation in the agricultural observation system, information extraction aims to derive agro-geoinformation from the raw image data. This chapter also illustrates several image information extraction methods that are widely employed in agro-geoinformatics, such as the knowledge-based expert system, the machine learning-based decision tree, and the artificial neural network. Furthermore, a case study of the production of Cropland Data Layer (CDL) data, a comprehensive, raster-formatted, geo-referenced, annual crop-specific land cover map produced by the U.S. Department of Agriculture (USDA) National Agricultural Statistics Service (NASS), is demonstrated.

Keywords Image processing · Agricultural observation system · Agricultural monitoring · Agricultural data collection · Agricultural information extraction · Big data · Machine learning · Land cover classification

C. Zhang (*) · L. Lin Center for Spatial Information Science and Systems, George Mason University, Fairfax, VA, USA e-mail: [email protected]; [email protected] © Springer Nature Switzerland AG 2021 L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/ Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_6

6.1 Introduction

Imaging sensors have been widely used in agricultural observation and monitoring systems. For example, a huge volume of remotely sensed image data is continuously acquired from imaging sensors such as the Moderate Resolution Imaging Spectroradiometer (MODIS), the Landsat Operational Land Imager (OLI), and the Sentinel-2 Multi-Spectral Instrument (MSI). These image data contain abundant fundamental information and have been extensively used to support decision making in agriculture. The raw image data acquired from imaging sensors need to be processed before being used in agricultural applications. This chapter is the first attempt to provide an overview of image processing methods, technologies, and tools from the perspective of agro-geoinformatics. First, we introduce the fundamentals of digital image processing, including its origins, definitions, and hardware and software, as well as state-of-the-art technologies such as mobile device-based and cloud-based image processing. Three main approaches to agricultural image data collection are covered: in situ, airborne, and space-borne data collection. We also discuss the big data challenge in agro-geoinformatics. As the core operation of image processing in the agricultural observation system, information extraction aims to derive agro-geoinformation from the raw image data. This chapter summarizes several image information extraction methods that are widely employed in agro-geoinformatics, including the knowledge-based expert system, the machine learning-based decision tree, and the artificial neural network approach. Furthermore, a case study of the production of Cropland Data Layer (CDL) data is demonstrated.

6.2

The Fundamentals of Digital Image Processing

Agricultural image processing deals with digital images acquired by the imaging sensors of agricultural monitoring systems. To process agricultural image data, we have to understand what a digital image is and how digital image processing works. This section introduces the fundamentals of digital image processing, including its origins and definitions and the basic steps of the image processing flow: image acquisition, image enhancement, image restoration, image compression, image segmentation, and image representation.

6.2.1

Origins and Definitions

The origin of digital image processing can be traced back about a century, to the first applications of digital images in the newspaper industry. With the invention of the modern computer based on the von Neumann architecture in the 1940s, the computer gradually became the major tool for digital image processing. In the 1960s, computers were applied to space imagery for the first time when the National Aeronautics and Space Administration Jet Propulsion Laboratory (NASA JPL) used computer technology to restore images of the Moon captured by Ranger 7. In the following decades, digital image processing technology developed rapidly and was widely applied not only in space exploration and Earth observation but also in civil uses. Today, digital images are everywhere around us.

To explore digital image processing and its applications, we first have to understand what a digital image is. Basically, a digital image can be described as a two-dimensional matrix of pixels, in which each pixel is represented by a numerical value. The goal of digital image processing is to extract useful information from digital images, using specific methods chosen according to the requirements of the users. Many definitions of digital image processing have been proposed. In Wikipedia (2017), digital image processing is defined as “the use of computer algorithms to perform image processing on digital images.” Gonzalez and Woods (2008) describe digital image processing as “processing digital images by means of digital computer.”

Agricultural image processing mainly focuses on the collection, processing, and understanding of image data acquired by the imaging sensors of an agricultural observation system. These images are usually acquired by in situ, airborne, or space-borne sensors and then processed with specific image processing methods and tools to extract agriculture-related information.
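The matrix view of a digital image described above can be made concrete with a small sketch. The Python fragment below is only an illustration: the pixel values are invented, and the helper names (`pixel`, `brightest_pixel`, `mean_value`) are hypothetical and do not come from any particular library.

```python
# A tiny 3 x 4 grayscale "image": each row is a list of pixel values
# (0 = black, 255 = white). The values are made up for illustration.
image = [
    [12, 40, 43, 200],
    [10, 38, 180, 210],
    [9, 150, 190, 220],
]

height = len(image)      # number of rows in the matrix
width = len(image[0])    # number of columns in the matrix


def pixel(img, row, col):
    """Return the numerical value stored at (row, col)."""
    return img[row][col]


def brightest_pixel(img):
    """Return (value, row, col) of the brightest pixel."""
    return max(
        (value, r, c)
        for r, row in enumerate(img)
        for c, value in enumerate(row)
    )


def mean_value(img):
    """Return the average pixel value of the whole image."""
    total = sum(sum(row) for row in img)
    return total / (len(img) * len(img[0]))
```

A multiband image, such as a multispectral satellite scene, could be held in the same way as one such matrix per band.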

[Figure 6.1 components: Problem Domain; Image Acquisition; Image Enhancement; Image Restoration; Color Image Processing; Wavelets & Multiresolution Processing; Compression; Morphological Processing; Segmentation; Representation & Description; Object Recognition; Knowledge Base. Annotations: "Outputs of these steps are generally images"; "Outputs of these steps are generally image attributes".]

Fig. 6.1 The fundamental steps in digital image processing. (Figure from Gonzalez and Woods 2008)

6.2.2

Basic Steps in Image Processing

Generally, the digital image processing flow can be divided into a series of steps, including image acquisition, image enhancement, image restoration, image compression, image segmentation, image representation, and object recognition. The relationship between these steps is illustrated in Fig. 6.1. This section gives an overview of each step in the digital image processing flow.

Image acquisition: Digital image acquisition, or digital imaging, is the first step in any digital image processing system. It retrieves an image from a hardware source, which can range from a personal camera to a satellite. The original image generated by the hardware source is an unprocessed raw image. Some preprocessing, such as scaling, is usually involved at this stage; the preprocessed image then serves as the input to the rest of the image processing system.

Image enhancement: Image enhancement refers to the process of adjusting and manipulating an image to make it more suitable for display in a specific application. This stage covers many image processing methods, and the enhancement techniques adopted vary from one task to another. Typical methods include contrast stretching, such as density slicing and linear/nonlinear stretching; histogram processing, such as histogram equalization and histogram matching; spatial filtering, such as smoothing filtering and median filtering; edge enhancement and detection; and image transformations, such as principal component analysis (PCA) and the hue-saturation-intensity (HSI) transformation.

Image restoration: Image restoration is the process of improving the appearance of an image by recovering it from damage. The damaged image usually refers to an image corrupted by motion blur, noise, or camera misfocus. Like image enhancement, the objective of image restoration is to obtain a clean and suitable image. The two differ in approach, however: image enhancement is mainly based on subjective human preferences, whereas image restoration tries to undo a known or estimated degradation. Major approaches used in image restoration include reducing noise, recovering resolution loss, and applying deblurring functions to the damaged image.

Image compression: Image compression is the technique of reducing the storage size of an image. Common methods include Huffman coding, arithmetic coding, Golomb coding, predictive coding, adaptive dictionary algorithms, and the wavelet transform. Depending on the compression method, the compressed image can be lossy or lossless. Common lossy image formats include JPEG (joint photographic experts group). GIF (graphics interchange format) is often grouped with the lossy formats because it reduces an image to a palette of at most 256 colors, although its LZW compression itself is lossless. Common lossless image formats include raw image files, BMP (bitmap image file), and PNG (portable network graphics). Some formats can be either lossy or lossless; for example, TIFF (tagged image file format) is designed as a container that can hold uncompressed, lossy compressed, or lossless compressed images.

Image segmentation: Image segmentation partitions an image into multiple segments. The goal of this stage is to identify the objects in a digital image. Because the images to be processed can be very complicated, segmentation can be a difficult step in the digital image processing system. Basic approaches applied in image segmentation include point, line, and edge detection; thresholding methods, such as global thresholding and Otsu's method; color-based segmentation, such as K-means clustering; transform methods, such as morphological watersheds; region-based segmentation; and texture filters.

Image representation: Image representation deals with how the segmented image is represented. Generally, the process consists of two stages: choosing a representation scheme and describing each region based on that scheme. After this step, the image is represented by a description suitable for further analysis.
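Two of the steps above, linear contrast stretching (enhancement) and Otsu's thresholding (segmentation), are simple enough to sketch in a few lines. The following Python fragment is a minimal, unoptimized illustration using only the standard library; the function names and the sample pixel values are assumptions made for this example, and a production system would normally rely on an image processing library instead.

```python
def linear_stretch(img, out_min=0, out_max=255):
    """Linearly rescale pixel values so they span [out_min, out_max]."""
    flat = [v for row in img for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: nothing to stretch
        return [[out_min for _ in row] for row in img]
    scale = (out_max - out_min) / (hi - lo)
    return [[round(out_min + (v - lo) * scale) for v in row] for row in img]


def otsu_threshold(img, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    Pixel values must be integers in the range [0, levels).
    """
    hist = [0] * levels
    for row in img:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the variance is undefined
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mean0 - mean1) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t


# Hypothetical low-contrast image: dark background, brighter lower-right corner.
raw = [
    [50, 52, 51, 120],
    [49, 53, 118, 122],
    [50, 119, 121, 123],
]

stretched = linear_stretch(raw)          # enhancement
threshold = otsu_threshold(stretched)    # segmentation threshold
mask = [[v > threshold for v in row] for row in stretched]
```

Here `mask` marks the brighter region as `True`. Stretching first is a design choice for the illustration only, since Otsu's method would also separate the two groups in the raw image.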

6.3

Hardware and Software

A typical image processing system consists of hardware and software. The hardware can be a personal computer (PC), an image sensor, or specialized image processing hardware. The software can be a package that runs on a specific platform to perform a specific task, or a library that provides an application programming interface (API) to facilitate image processing-related development and implementation. This section covers common image processing hardware and software as well as state-of-the-art technologies such as mobile device-based image processing and cloud-based image processing.

6.3.1

Image Processing Hardware

Digital image processing technology has evolved along with the development and spread of computer hardware. In the mid-1980s, digital image data were analyzed in microcomputer-based image analysis systems made up of linked components, with one monitor displaying the user interface and another displaying the images. With the spread of the personal computer (PC) since the 1990s, more and more digital image processing systems have been developed for desktop platforms with graphical user interfaces (GUIs). In recent years, with the rapid development of Internet technologies and infrastructures, further platforms and concepts such as cloud computing, mobile computing, and machine learning frameworks are being applied in image processing.

Today, the computer is the most general tool for processing digital images. The computer discussed here can be a laptop, a desktop, a workstation, a server, or a supercomputer. In addition to general-purpose computers, specialized image processing hardware, ranging from a single chip to a front-end subsystem, is optimized for specific image processing tasks. Depending on the needs of the particular task, the specifications of the computer or the specialized hardware may vary over a broad range.

6.3.2

Image Processing Software

Image processing software usually performs specific tasks in digital image processing or computer vision. According to its application range, an image processing software package can be developed for a general purpose or for a specific purpose. Adobe Photoshop, a popular raster graphics editor, is a representative general-purpose image processing package. Other packages are developed for a specific purpose such as remotely sensed image processing or medical image processing. For example, the most common image processing systems for remotely sensed imagery include ERDAS Imagine, ENVI, IDRISI, and GRASS GIS; each system has its own features and capabilities. Besides software packages, open source image processing libraries such as OpenCV provide a more flexible way to carry out image processing tasks. With the rapid development of artificial intelligence technology, many open source machine learning projects, especially deep learning frameworks (e.g., Caffe, TensorFlow, Theano, PyTorch), have been developed within the past decade. More details about machine learning-based image processing approaches are discussed in Sect. 6.5.

6.3.3

Mobile Device–Based Image Processing

Nowadays, mobile devices such as smartphones and tablets run powerful mobile operating systems, such as iOS and Android, and are equipped with high-performance hardware including a CPU (central processing unit), mass storage, a high-resolution camera, a G-sensor, a GPS (global positioning system) module, and a fingerprint identification system. These features give mobile devices unique advantages for processing agricultural images. One major feature of a mobile device is its portability, which allows users to easily capture, access, manage, and visualize digital images. Taking advantage of the extensibility of mobile platforms, several mobile apps have been developed to facilitate agricultural image processing. For example, Geofairy, a location-based mobile app available on both iOS and Android, allows users to easily retrieve agricultural data, such as temperature, humidity, and NDVI (normalized difference vegetation index), from various online sources (Sun et al. 2017). GPKG Mobile is an iOS app developed to support field operations in agriculture. By implementing the GeoPackage Library and CMAPI, it can display, manage, and manipulate agricultural image data in GeoPackage format on both Google Maps and OpenLayers (Zhang et al. 2016a, b).

6.3.4

Cloud-Based Image Processing

The volume of Earth observation data has grown at an exponential rate during the past decades because of the rapid development of remote sensing technology and its applications. As a result, big data remote sensing has emerged as a new concept in both the scientific and industrial communities. To meet the challenge of big data remote sensing, Google has developed Earth Engine as a next-generation Earth observation data analysis platform (Gorelick et al. 2017). The major difference between Google Earth Engine and a traditional remote sensing data platform is that Google Earth Engine is powered by Google's cloud infrastructure, which means that all data analysis and processing are carried out in the cloud instead of on the user's own desktop. The Google Earth Engine data catalogue contains the entire Landsat catalogue, MODIS datasets, Sentinel data, precipitation data, elevation data, sea surface temperature, NAIP, and CHIRPS climate data. Users can therefore analyze data directly in a web browser without finding and uploading Earth observation data themselves. In addition, Google Earth Engine provides many derivative products, such as annual mosaics and a variety of environmental indices such as NDVI, EVI, and NDWI (Padarian et al. 2015). Meanwhile, a variety of Google Earth Engine-enabled web applications and toolkits have been developed to support agro-geoinformation and image processing (Yalew et al. 2016; Zhang et al. 2020a). Compared with a conventional image processing system, a cloud-based Earth observation data processing system, which uses powerful cloud infrastructure as the platform for processing imagery, can considerably reduce computation time (Zhang et al. 2017).
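As an illustration of what an index such as NDVI involves, the sketch below computes it pixel by pixel from a near-infrared band and a red band using the standard definition NDVI = (NIR - Red) / (NIR + Red). This is plain Python that runs locally; it does not use the Earth Engine API, and the reflectance values and function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    denom = nir + red
    if denom == 0:
        # Assumption made for this sketch: a pixel with no signal gets 0.0.
        return 0.0
    return (nir - red) / denom


def ndvi_image(nir_band, red_band):
    """Apply ndvi() to every pair of co-located pixels in two bands."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


# Hypothetical surface reflectance values (0.0 to 1.0) for a 2 x 2 scene.
nir_band = [[0.50, 0.40], [0.30, 0.05]]
red_band = [[0.10, 0.10], [0.25, 0.04]]

ndvi_map = ndvi_image(nir_band, red_band)
```

Dense green vegetation reflects strongly in the near-infrared and weakly in the red, so it yields values closer to 1 than bare soil or water does.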

6.4

Agricultural Image Data Collection

In an agricultural image processing system, data can be retrieved by different imaging sensors in different ways. This section introduces three approaches to agricultural image data collection: in situ data collection, airborne-based data collection, and space-borne-based data collection. The big data challenge in agricultural image data collection is also discussed.

6.4.1

In Situ Data Collection

In situ data collection is a traditional approach that can be traced back thousands of years, to the time when census enumerators went door to door collecting information about each family. Today, in situ collection is still a useful way to gather data. In agricultural activities, scientists use transducers, such as thermometers, anemometers, and psychrometers, or other in situ instruments to make measurements (Jensen 2015). Among these instruments, the camera is the most common one for collecting visible-light in situ agricultural images. The spectroradiometer is another commonly used in situ instrument for collecting agricultural image data. However, no matter which instrument is adopted, error is inevitable during in situ data collection. It is difficult to eliminate the error caused by intrusive data collection, method-produced error, and measurement-device calibration error. To deal with this problem, some advanced technologies have been adopted for collecting in situ agricultural images. For example, Haug and Ostermann (2014) propose a benchmark dataset for crop/weed discrimination in which all images were collected with a camera mounted on an autonomous field robot, which minimizes the error introduced by manual measurement.

6.4.2

Airborne-Based Data Collection

Airborne-based data collection missions are performed by suborbital aircraft such as airplanes, helicopters, and unmanned aerial vehicles (UAVs). The airborne digital cameras mounted on the aircraft can be divided into small-format, medium-format, and large-format digital cameras. Images collected by small-format digital cameras have fewer than 16 megapixels (MP) per band, whereas images collected by medium-format digital cameras have more than 16 MP per band. Large-format digital cameras, such as those from Leica Geosystems AG and Microsoft (UltraMap), have very large linear or area charge-coupled device (CCD) arrays, which allow them to generate larger images than small-format and medium-format cameras. As the first generation of remote sensing data collection, airborne-based data collection has made great progress during the past few years. With the spread of consumer drones, it is more accessible and easier to use than ever. In many agricultural data collection tasks, airborne-based data collection is an efficient and economical way to capture field-level images.

6.4.3

Space-Borne-Based Data Collection

Space-borne-based data collection is also a remote sensing-based data collection approach. Unlike airborne-based data collection, in which a camera is mounted on an aircraft, space-borne-based data collection uses remote sensing instruments on board a satellite platform. According to the use of the satellite, the collected data can be divided into meteorological data, oceanographic data, and Earth resources data. According to the sensor on the satellite, these data can be grouped into very-high spatial resolution data (e.g., IKONOS, QuickBird, OrbView-3, Cartosat, WorldView, GeoEye-1), moderate-to-high spatial resolution data (e.g., Landsat, Sentinel-2), Moderate Resolution Imaging Spectroradiometer (MODIS) data, hyperspectral data (e.g., AVIRIS, CASI), and radar data (e.g., JERS, ERS, Radarsat, EnviSat) (Gao 2008). As the major approach to collecting Earth observation data, space-borne remote sensing plays an irreplaceable role in agricultural image data collection.

6.4.4

Big Data Challenge in Agricultural Image Data Collection

With the rapid development of remote sensing technology, more and more satellites characterized by high spatial, temporal, and radiometric resolution are being launched and becoming available. How to manage and analyze the volumes of high-resolution agricultural image data captured by these different types of satellites is therefore becoming a challenge in the big data era. Laney (2001) from META Group described big data in terms of three Vs: volume as the scale of data, variety as the different forms of data, and velocity as the analysis of streaming data. These three dimensions also apply to agricultural image data.

Volume: Terabytes of high-resolution Earth observation data are generated every day by satellites characterized by high spatial, temporal, and radiometric resolution. As one of the major Earth science data management systems, NASA's Earth Observing System Data and Information System (EOSDIS), a key core capability in NASA's Earth Science Data Systems Program, manages NASA's Earth science data from various sources, including satellites, aircraft, and field measurements (https://earthdata.nasa.gov). All Earth observation data can be accessed online via the EOSDIS Distributed Active Archive Centers (DAACs) (https://earthdata.nasa.gov/about/daacs). The most recent EOSDIS key science system metrics (NASA 2017) show that the total archive volume is 17.5 petabytes and that approximately 12.1 terabytes of data are added every day (average daily archive growth between Oct 1, 2015, and Sept 30, 2016). The archive has thus more than doubled from the 7.5 petabytes held in 2012 (Ramapriyan et al. 2013).

Variety: Because the huge volume of remotely sensed data is generated by different sources at different spatial and temporal resolutions, variety is also a key factor in big data remote sensing. Earth observation data collected by remote sensing can be classified by area of application or by data property. Structured agricultural image data are stored in standard formats such as HDF, HDF-EOS, netCDF, GeoTIFF, JPEG2000, and OGC GML-Cov.

Velocity: Handling a huge volume of data that grows at such a velocity is also an imperative task in agricultural image data management and analysis. The basic velocity-related factors in analyzing and processing remote sensing data include the data transfer rate, the data processing rate, and the data access rate (Nativi et al. 2015).
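The archive figures quoted for EOSDIS can be put into perspective with a short calculation. The sketch below is illustrative only: it assumes decimal units (1 petabyte = 1000 terabytes) and, for the last quantity, a constant daily growth rate, neither of which is stated in the cited metrics.

```python
TB_PER_PB = 1000          # assumption: decimal units

archive_2016_pb = 17.5    # total archive volume (NASA 2017)
daily_growth_tb = 12.1    # average daily archive growth, Oct 2015 to Sept 2016
archive_2012_pb = 7.5     # archive volume in 2012 (Ramapriyan et al. 2013)

# Growth over one year at the reported daily rate.
yearly_growth_pb = daily_growth_tb * 365 / TB_PER_PB

# How many times larger the archive is than in 2012.
growth_factor = archive_2016_pb / archive_2012_pb

# Years needed to add another 17.5 PB if the daily rate stayed constant
# (a simplifying assumption; the actual rate has been increasing).
years_to_double = archive_2016_pb / yearly_growth_pb

print(f"Yearly growth: {yearly_growth_pb:.2f} PB")
print(f"Growth factor since 2012: {growth_factor:.2f}x")
print(f"Years to double at a constant rate: {years_to_double:.1f}")
```

Under these assumptions, the archive gains roughly 4.4 petabytes per year and is about 2.3 times its 2012 size.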

6.5

Agro-Geoinformation Extraction from Image

Information extraction is the core operation in digital image processing. There are many methods for extracting and understanding agriculture-related information from remotely sensed image data. This section focuses on three information extraction approaches that are commonly used in agro-geoinformatics: expert system-based, decision tree-based, and neural network-based agricultural information extraction.

6.5.1

Knowledge-Based Expert System

A knowledge-based expert system is a computer system that applies artificial intelligence techniques in problem-solving processes to support human decision-making, learning, and action (Akerkar and Sajja 2010). As shown in Fig. 6.2, a knowledge-based expert system consists of a knowledge base, an inference engine, and a user interface.

Fig. 6.2 Architecture of knowledge-based expert system

Knowledge base: The knowledge base represents facts and rules. In an expert system, the knowledge base contains domain-specific, high-quality knowledge, both factual and heuristic. Factual knowledge is knowledge that is widely shared and accepted by knowledge engineers or scholars in the task domain. Heuristic knowledge, on the other hand, is less rigorous; it concerns practice, sound judgment, the ability to evaluate, and problem-solving by experiment. Knowledge in the expert system is represented in the form of IF-THEN-ELSE rules, which are commonly organized as a decision tree structure. As shown in Fig. 6.3, the decision tree consists of hypotheses, rules, and conditions. In agricultural information extraction, the accuracy of land cover class identification usually depends on the number of conditions considered.

Inference engine: As the core part of the expert system, the inference engine analyzes and processes the rules in the knowledge base to draw conclusions. To recommend a solution, the inference engine uses two strategies: forward chaining and backward chaining. The expert system can explain its reasoning by tracing the chain of steps. For example, when performing an image classification task, users may want to know the details of the decision-making process and why a specific area of the image was classified as a particular type.

User interface: The user interface is the front-end client of a knowledge-based expert system, which bridges the user and the inference engine. A good user interface should be easy to use, interactive, efficient, and user-friendly. Traditionally, the client of a knowledge-based expert system runs on a specific platform such as a desktop or a mobile device. With the rise of mobile computing and cloud computing, more and more front-end clients are cross-platform, which means users can access the client on multiple platforms such as desktop operating systems (e.g., Microsoft Windows, macOS, Linux), mobile operating systems (e.g., iOS, Android), and web browsers (e.g., Chrome, Firefox, Safari).
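The interplay between IF-THEN rules in a knowledge base and a forward-chaining inference engine can be sketched in a few lines of Python. The rules and fact names below are invented for illustration and are not taken from Fig. 6.3; the returned trace shows how such a system could explain why a conclusion was reached.

```python
# Hypothetical knowledge base. Each rule reads:
# IF all conditions are known facts THEN add the conclusion as a new fact.
RULES = [
    ({"high_nir", "low_red"}, "vegetation"),
    ({"vegetation", "planted_in_rows"}, "cropland"),
    ({"low_nir", "low_red"}, "water"),
]


def forward_chain(facts, rules):
    """Fire rules repeatedly until no new fact can be derived.

    Returns the final set of facts and a trace of (conditions, conclusion)
    pairs in the order in which the rules fired.
    """
    known = set(facts)
    trace = []
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # conditions <= known tests that every condition is a known fact.
            if conclusion not in known and conditions <= known:
                known.add(conclusion)
                trace.append((sorted(conditions), conclusion))
                changed = True
    return known, trace


observed = {"high_nir", "low_red", "planted_in_rows"}
known, trace = forward_chain(observed, RULES)

for conditions, conclusion in trace:
    print("IF", " AND ".join(conditions), "THEN", conclusion)
```

Backward chaining would instead start from a hypothesis such as `cropland` and work back to the conditions that need to be verified.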

Fig. 6.3 An example of a knowledge base in the expert system

Knowledge-based expert systems are widely used in remote sensing research and digital image processing. Because of their advantages, they perform very well in many agricultural image processing tasks such as image segmentation, image thresholding, and image classification. Romeo et al. (2013) propose an automatic and robust expert system for greenness identification, which consists of a decision-making module based on histogram analysis and a greenness identification module based on classical methods and a fuzzy clustering approach. Montalvo et al. (2013) propose an automatic expert system for weed/crop identification in images of maize fields, which is able to identify weeds and crops even when they have been contaminated with soil material as a result of artificial irrigation or natural rainfall.

6.5.2

Machine Learning–Based Decision Tree

Machine learning has many definitions. Copeland (2016) defines machine learning as “the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world.” Ng (2013) gives a short definition that machine learning is “the science of getting computers to act without being explicitly programmed.” Various machine learning programs, libraries, and frameworks have been developed and used in digital image processing and computer vision (Pedregosa et al. 2011; Vedaldi and Fulkerson 2010; Collobert et al. 2011). As a result of the rapid development of computer hardware and software in the past few years, many computer vision problems that cannot be handled by traditional image processing methods have been solved by machine learning approaches (Jordan and Mitchell 2015; Sonka et al. 2014; Rosten and Drummond 2006). Meanwhile, machine learning is efficient and effective at automatically discovering intricate patterns and structures in agro-geoinformation data. A variety of machine learning-based approaches have been developed and applied to support agricultural applications and research, such as cropland extent mapping (Teluguntla et al. 2018), crop type classification (Hao et al. 2020), drought monitoring (Park et al. 2016), and agricultural sustainability (Sharma et al. 2020).

Based on the task type, machine learning can be divided into supervised learning and unsupervised learning. The major difference between them is the dataset used during the learning process: the supervised paradigm (e.g., classification, regression, recommendation systems) deals with labeled data, whereas the unsupervised paradigm (e.g., clustering, outlier detection, association rule mining) deals with unlabeled data. Further, as described in Bishop (2006), supervised learning solves the problems where “training data comprises examples of the input vectors along with their corresponding target vectors”; unsupervised learning aims to “discover groups of similar examples within the data, where it is called clustering, or to determine the distribution of data within the input space, known as density estimation, or to project the data from a high-dimensional space down to two or three dimensions for the purpose of visualization.” Semi-supervised learning lies between supervised and unsupervised learning and is “a learning paradigm concerned with the study of how computers and natural systems such as humans learn in the presence of both labelled and unlabeled data” (Zhu and Goldberg 2009).

As one of the most common classification methods, the decision tree has been used in image analysis for a long time. An example of a decision tree used for land cover classification is shown in Fig. 6.4. The dataset has three attributes (band 1, band 2, soil), each of which is represented as a node of the tree. By following the decision at each node, every sample reaches a leaf node and is finally classified as one of four land cover classes (wetland, dead vegetation, bare soil, water). The basic elements of any decision tree algorithm include (1) the rules for splitting data at a node based on the value of one variable; (2) stopping rules for deciding when a branch is terminal and can be split no more; and (3) a prediction for the target variable in each terminal node (Rao 2013).

[Figure 6.4 content: "Example of a Decision Tree". Legend: node (attribute); arc (a single value, a group of values, or a range of values of the attribute); leaf (class). The root node tests Band 1 (e.g., red) against a threshold of 82. Band 1 > 82 leads to a Soil node: Soil = best: wetland; Soil = good: dead vegetation; Soil = fair: dead vegetation; Soil = poor: bare soil. Band 1 ≤ 82 leads to a Band 2 node (e.g., near-infrared): Band 2 > 40: wetland; Band 2 ≤ 40: water. A generic panel shows attribute A with arcs a1, a2, a3 leading to subsets S1:T1, S2:T2, S3:T3.]

Fig. 6.4 An example of decision tree used on land cover classification. (Figure from Jensen 2015)

With the wide dissemination of machine learning technology, a number of machine learning-based decision tree algorithms, such as ID3, C4.5, and CART, have been applied to analyzing and classifying digital images, including agricultural images.

ID3: Iterative Dichotomiser 3 (ID3) is a widely used machine learning decision tree algorithm introduced by Quinlan (1986). ID3 is the precursor of many other decision tree algorithms such as C4.5. ID3 handles categorical values and adopts information gain as its splitting rule, using Shannon entropy to pick the features with the greatest information gain as nodes. However, this first-generation decision tree has some disadvantages, such as a tendency to overfit. Furthermore, ID3 tests only one attribute at a time when making a decision, so applying it to a large-scale machine learning task can be time-consuming.

C4.5: The C4.5 decision tree algorithm was developed by Quinlan (1993) as an improved version of ID3. It adopts gain ratio as its splitting rule. Its improvements over ID3 include handling both continuous (numeric) and discrete (categorical) attributes, handling training data with missing attribute values, handling attributes with differing costs, and pruning trees after creation.

CART: Classification and Regression Trees (CART), first introduced by Breiman et al. (1984), is an umbrella term for the classification tree (the predicted outcome is the class to which the data belong) and the regression tree (the predicted outcome is a real number). When splitting nodes, CART uses splitting rules such as the twoing criterion and Gini impurity rather than Shannon entropy. CART also handles outliers easily and, like C4.5, can deal with both categorical and numeric values. Because of these advantages in classification and regression tasks, CART has been shown to be very effective for land cover classification (Sexton et al. 2013).

Other decision tree programs that have been used in machine learning-based agricultural information extraction include See5/C5.0, the successor to C4.5 (Quinlan 2003); S-Plus, which covers a great number of data mining functions, including a decision tree/regression tree module; and the R language, which provides an integrated environment for statistical analysis.
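The splitting rules named above (information gain for ID3, gain ratio for C4.5, and Gini impurity for CART) can be compared on a small example. The sketch below evaluates one candidate split, band 1 > 82, on a handful of labeled pixels. The sample values and function names are hypothetical, and the code scores a single split only; it is not a full implementation of any of the three algorithms.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def weighted(measure, parent, children):
    """Average impurity of the children, weighted by their size."""
    n = len(parent)
    return sum(len(ch) / n * measure(ch) for ch in children if ch)


def information_gain(parent, children):
    """Entropy reduction achieved by the split (ID3 criterion)."""
    return entropy(parent) - weighted(entropy, parent, children)


def gain_ratio(parent, children):
    """Information gain normalized by the split information (C4.5)."""
    n = len(parent)
    split_info = -sum(
        len(ch) / n * log2(len(ch) / n) for ch in children if ch
    )
    if split_info == 0:
        return 0.0
    return information_gain(parent, children) / split_info


def gini_decrease(parent, children):
    """Gini impurity reduction achieved by the split (used by CART)."""
    return gini(parent) - weighted(gini, parent, children)


# Hypothetical training pixels: (band 1 value, land cover label).
samples = [
    (95, "bare soil"), (90, "bare soil"),
    (88, "dead vegetation"), (100, "dead vegetation"),
    (60, "water"), (55, "water"), (70, "water"), (75, "water"),
]

parent = [label for _, label in samples]
above = [label for value, label in samples if value > 82]
below = [label for value, label in samples if value <= 82]
children = [above, below]

print("information gain:", information_gain(parent, children))
print("gain ratio:", gain_ratio(parent, children))
print("Gini decrease:", gini_decrease(parent, children))
```

A tree-building algorithm would compute such a score for every candidate attribute and threshold, choose the best one, and then repeat the procedure on each child until a stopping rule applies.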

6.5.3

Artificial Neural Network

Artiﬁcial neural network is another machine learning approach which is extensively applied in image analysis and computer vision. The concept of computational model for neural networks was ﬁrstly proposed in McCulloch and Pitts (1943). Artiﬁcial neural network is inspired by Hubel and Wiesel (1959). Fukushima and Miyake (1982) proposed the concept of convolutional neural network. In 1989, the backpropagation algorithm was applied to a deep convolutional neural network for performing handwritten ZIP code recognition (LeCun et al. 1989) which is the early application of deep learning technology. In recent years, with the rapid development of computer hardware like graphics processing units (GPU) which could signiﬁcantly accelerate the process of network training, deep learning technology and deep convolutional nets became the new favorite in both research community and industry. As described in LeCun et al. (2015), “Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio.” In image processing and computer vision area, deep


Fig. 6.5 Basic structure of the artiﬁcial neural network for pre-season crop mapping

learning has shown its superiority in image classification and image recognition (Ciregan et al. 2012; Krizhevsky et al. 2012). Figure 6.5 shows an example of an artificial neural network for predicting crop type from historical crop cover maps. As shown in the figure, the network is organized in layers, each made up of a number of interconnected neurons. A typical artificial neural network consists of an input layer, one or more hidden layers, and an output layer. The network shown in the figure has only one hidden layer, but many complicated networks have multiple hidden layers. Studies have shown that the artificial neural network is an effective and efficient approach for crop mapping prediction (Zhang et al. 2019a) and for the refinement of historical crop cover maps (Zhang et al. 2020b).

When processing a digital image, each input node stands for one pixel. For example, to process a 16 × 16 grayscale image, the network would have 16 × 16 = 256 input neurons. The output is then generated by the weights and biases learned during training. For an agricultural image classification task that assigns the image to 1 of 200 cover types, the output layer of the network would have 200 neurons. Each output neuron stands for one of the cover types, such as corn, soybeans, wheat, and other crop types. If the value of each neuron ranges from 0 to 1, the image is assigned to the crop type whose neuron has an output near 1.
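The layer sizes in this example can be made concrete with a short forward pass. The sketch below is illustrative only: the hidden layer size of 32 is an arbitrary assumption, the weights are random rather than trained (so the predicted class is meaningless), and the sigmoid activation is chosen simply because it keeps each output between 0 and 1, as in the description above.

```python
import math
import random


def sigmoid(x):
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: sigmoid(W x + b)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def random_layer(n_out, n_in, rng):
    """Untrained layer parameters: small random weights, zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)
N_INPUT = 16 * 16   # one input neuron per pixel of a 16 x 16 image
N_HIDDEN = 32       # arbitrary choice for this sketch
N_OUTPUT = 200      # one output neuron per cover type

w1, b1 = random_layer(N_HIDDEN, N_INPUT, rng)
w2, b2 = random_layer(N_OUTPUT, N_HIDDEN, rng)

# Stand-in for a flattened 16 x 16 grayscale image scaled to [0, 1).
image = [rng.random() for _ in range(N_INPUT)]

hidden = dense(image, w1, b1)
output = dense(hidden, w2, b2)

# The predicted cover type is the index of the most active output neuron.
predicted_class = max(range(N_OUTPUT), key=lambda k: output[k])
```

Training would adjust `w1`, `b1`, `w2`, and `b2` by backpropagation so that the output neuron of the correct cover type approaches 1 for each training image.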


Fig. 6.6 The architecture of a typical convolutional neural network

The terms “deep learning” and “deep neural networks” both build on the principle of the artificial neural network. As one of the most frequently used artificial neural network structures, the convolutional neural network is effective for semantic segmentation and object recognition in image processing and computer vision tasks. It has been applied in many kinds of agricultural image processing tasks, such as crop mapping (Sidike et al. 2019), crop yield estimation (Kuwata and Shibasaki 2015), plant identification (Lee et al. 2015), and plant disease detection (Mohanty et al. 2016). An example of using a convolutional neural network for remote sensing image-based agricultural land use classification is shown in Fig. 6.6.

The artificial neural network is suitable for many advanced image processing tasks such as image classification, image segmentation, and object detection. However, it is time-consuming to manually implement a complicated artificial neural network in agricultural applications. Fortunately, many machine learning software packages and libraries, such as Caffe, TensorFlow, PyBrain, Theano, and Nvidia DIGITS, are available and free to use, which significantly facilitates the implementation of artificial neural networks for agricultural image processing and analysis. Here are some examples of machine learning libraries and software.

Caffe: Caffe is an open source deep learning framework developed by Jia et al. (2014). It supports many different types of deep learning networks, such as CNN, RCNN, and LSTM. Caffe is also optimized for GPU acceleration. With the bundled Nvidia cuDNN library, the training process can be accelerated by 1.38x overall and the testing process by 1.50x (Shelhamer 2014). In 2017, Facebook announced Caffe2, a new lightweight, modular, and scalable deep learning framework, which supports not only desktop platforms but also mobile platforms such as iOS and Android.
TensorFlow: TensorFlow is an open source machine learning library developed by Google (Abadi et al. 2016) that provides powerful artificial neural network functions. TensorFlow provides APIs for different programming languages such as Python, C++, Java, Haskell, Go, and Rust and offers both CPU-only and GPU-optimized versions. With its powerful APIs and libraries, many complicated image analysis tasks such as image recognition can be performed very easily using TensorFlow.

Nvidia DIGITS: The Nvidia Deep Learning GPU Training System (DIGITS) is an open source deep learning implementation for image classification, segmentation, and object detection tasks (Heinrich 2016; Barker and Prasanna 2016). By using


Nvidia DIGITS, users can easily design, train, and visualize deep neural network architectures. As a GPU-optimized framework, Nvidia DIGITS can automatically scale training jobs across multiple GPUs, which greatly facilitates the implementation of deep neural networks.

Artificial intelligence and machine learning tools are developing very quickly, and many other open source artificial neural network libraries and frameworks are available. For example, PyBrain (Schaul et al. 2010), short for Python-Based Reinforcement Learning, Artificial Intelligence, and Neural Network Library, is a modular machine learning library for Python. PyTorch is a machine learning library that provides a wide range of algorithms for deep learning. Theano is a Python library for numerical computation that supports deep learning implementations on both CPU and GPU architectures.
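The libraries above hide the basic operations of a convolutional neural network behind high-level APIs. As a rough illustration of what one convolutional stage computes, the following library-free sketch applies a single filter, a ReLU activation, and 2 × 2 max pooling to a tiny image patch. The patch and the hand-picked edge filter are hypothetical; in a real network the filter values are learned during training, and none of the function names below belong to the libraries mentioned in this section.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation, the operation
    that CNN layers conventionally call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def relu(feature_map):
    """Element-wise rectified linear activation."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Non-overlapping 2 x 2 max pooling (a trailing odd row or column is dropped)."""
    return [[max(feature_map[i][j], feature_map[i][j + 1],
                 feature_map[i + 1][j], feature_map[i + 1][j + 1])
             for j in range(0, len(feature_map[0]) - 1, 2)]
            for i in range(0, len(feature_map) - 1, 2)]


# Hypothetical 6 x 6 single-band patch containing a vertical field boundary.
patch = [[0, 0, 0, 1, 1, 1] for _ in range(6)]

# Hand-picked vertical-edge filter; a trained CNN would learn such values.
edge_kernel = [[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]]

response = conv2d(patch, edge_kernel)        # 4 x 4 feature map
features = max_pool2x2(relu(response))       # 2 x 2 pooled feature map
```

The feature map responds only where the filter window straddles the boundary, which is how stacked convolutional layers build up spatial features such as field edges before the final classification layers.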

6.5.4 A Case Study

The National Agricultural Statistics Service (NASS) of the U.S. Department of Agriculture (USDA) started operating the Cropland Data Layer (CDL) program in 1997. The mission of the program is to provide a comprehensive, raster-formatted, geo-referenced, crop-specific land cover classification data product to support US agricultural monitoring. To date, the CDL data products have been widely adopted as reference data by growers, the agricultural industry, government, academic educators and students, and researchers worldwide for crop production, agricultural production planning and management, government policy formulation and decision making, teaching, and various research activities. An overview of the CDL program is given by Boryan et al. (2011), which mainly introduces the background of the program and the CDL products of 2009.

To generate a CDL product, a series of inputs including imagery data, ground truth data, and ancillary data are used. The imagery data used in the CDL program include AWiFS, Landsat TM and ETM+, and MODIS satellite data. Ground truth data used for supervised classification training include Common Land Unit (CLU) data from the Farm Service Agency (FSA) as agricultural ground truth and National Land Cover Data (NLCD) as nonagricultural ground truth. Ancillary data sources include the USGS National Elevation Data (NED), NLCD 2011 tree canopy, and NLCD 2001 imperviousness data layers.

The classification method used to generate CDL products is the See5/C5.0 decision tree algorithm. To derive the state-level decision trees, the See5/C5.0 classifier uses FSA CLU data (agricultural ground truth) and NLCD 2001 data (nonagricultural ground truth) as the training dataset. The classifier then classifies input data such as AWiFS, Landsat, and MODIS imagery. The accuracies are derived by comparing the classification result with independent validation data extracted from the ground truth data. Taking the state of Nebraska as an example, the overall mapping accuracy of the major crop categories for the 2016 CDL is

Fig. 6.7 The thematic map of 2016 Cropland Data Layer with U.S. state boundary (legend: major land cover categories by decreasing acreage; source: USDA/NASS)

96.4% (details of the CDL metadata are available at https://www.nass.usda.gov/Research_and_Science/Cropland/metadata/meta.php). The CDL data products can be accessed, visualized, explored interactively, and downloaded from the CropScape portal (https://nassgeodata.gmu.edu/CropScape), a geospatial web application developed in cooperation with the Center for Spatial Information Science and Systems, George Mason University (Han et al. 2012; Zhang et al. 2019b). Figure 6.7 shows the thematic map of the 2016 CDL with the U.S. state boundary layer provided by CropScape.
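The validation step of the case study, in which the classification result is compared with independent reference data, can be sketched as follows. The text above does not state which accuracy measures are computed, so the overall, producer's, and user's accuracies below are an assumption based on common remote sensing practice, and the eight validation pixels are made up for illustration rather than taken from any CDL product.

```python
from collections import Counter


def accuracy_report(reference, predicted):
    """Overall, producer's, and user's accuracy from paired class labels."""
    pairs = Counter(zip(reference, predicted))   # sparse confusion matrix
    ref_totals = Counter(reference)              # pixels per reference class
    pred_totals = Counter(predicted)             # pixels per mapped class
    correct = sum(n for (r, p), n in pairs.items() if r == p)
    overall = correct / len(reference)
    # Producer's accuracy: share of reference pixels of a class mapped correctly.
    producers = {c: pairs[(c, c)] / ref_totals[c] for c in ref_totals}
    # User's accuracy: share of pixels mapped as a class that truly belong to it.
    users = {c: pairs[(c, c)] / pred_totals[c] for c in pred_totals}
    return overall, producers, users


# Hypothetical validation pixels (not real CDL data).
reference = ["corn", "corn", "corn", "soybeans", "soybeans", "wheat", "wheat", "wheat"]
predicted = ["corn", "corn", "soybeans", "soybeans", "soybeans", "wheat", "wheat", "corn"]

overall, producers, users = accuracy_report(reference, predicted)
```

Keeping the validation samples separate from the training samples, as the case study describes, is what makes such figures an honest estimate of map quality.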

6.6 Summary

Agricultural image processing is an essential part of the agricultural observation system. Besides traditional hardware and software, new image processing tools and platforms such as mobile devices and cloud computing have been widely applied in agro-geoinformatics. Agricultural image data are acquired by in situ, airborne, and spaceborne remote sensing. With the arrival of the big data remote sensing era, management and analysis of the massive volume of Earth


observation data from a variety of imaging sensors is becoming a new challenge in agro-geoinformatics. We covered several traditional and state-of-the-art methods for agro-geoinformation extraction from image data, including the knowledge-based expert system, the machine learning-based decision tree, and the artificial neural network. As a case study of image processing in the agricultural observation system, we introduced the CDL program of USDA NASS as well as the image processing methods used to produce CDL data. In the future, we will explore more image processing methods and applications in agro-geoinformatics. Meanwhile, the next-generation technologies of image processing in the agricultural observation system, such as artificial intelligence-enabled agro-geoinformation extraction, will be systematically investigated.

References

Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., et al. (2016). TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467.
Akerkar, R., & Sajja, P. (2010). Knowledge-based systems. Sudbury, MA: Jones & Bartlett Publishers.
Barker, J., & Prasanna, S. (2016). https://devblogs.nvidia.com/parallelforall/deep-learning-objectdetection-digits/
Bishop, C. M. (2006). Pattern recognition and machine learning. New York: Springer.
Boryan, C., Yang, Z., Mueller, R., & Craig, M. (2011). Monitoring US agriculture: The US Department of Agriculture, National Agricultural Statistics Service, Cropland Data Layer program. Geocarto International, 26(5), 341–358.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Boca Raton: CRC Press.
Ciregan, D., Meier, U., & Schmidhuber, J. (2012, June). Multi-column deep neural networks for image classification. In Computer vision and pattern recognition (CVPR), 2012 IEEE conference on (pp. 3642–3649). IEEE.
Collobert, R., Kavukcuoglu, K., & Farabet, C. (2011). Torch7: A matlab-like environment for machine learning. In BigLearn, NIPS workshop (No. EPFL-CONF-192376).
Copeland, M. (2016). The difference between AI, machine learning, and deep learning? | NVIDIA Blog. Retrieved June 01, 2017, from https://blogs.nvidia.com/blog/2016/07/29/whatsdifference-artificial-intelligence-machine-learning-deep-learning-ai/.
Fukushima, K., & Miyake, S. (1982). Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition. In Competition and cooperation in neural nets (pp. 267–285). Berlin/Heidelberg: Springer.
Gao, J. (2008). Digital analysis of remotely sensed imagery. New York: McGraw-Hill Professional.
Gonzalez, R. C., & Woods, R. E. (2008). Digital image processing (3rd ed.). Upper Saddle River, NJ: Pearson.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment, 202, 18–27.
Han, W., Yang, Z., Di, L., & Mueller, R. (2012). CropScape: A web service based application for exploring and disseminating US conterminous geospatial cropland data products for decision support. Computers and Electronics in Agriculture, 84, 111–123.


Haug, S., & Ostermann, J. (2014, September). A crop/weed field image dataset for the evaluation of computer vision based precision agriculture tasks. In European conference on computer vision (pp. 105–116). Cham: Springer International Publishing.
Hao, P., Di, L., Zhang, C., & Guo, L. (2020). Transfer learning for crop classification with Cropland Data Layer data (CDL) as training samples. Science of The Total Environment, 733, 138869.
Heinrich, G. (2016). https://devblogs.nvidia.com/parallelforall/image-segmentation-using-digits-5/
Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat’s striate cortex. The Journal of Physiology, 148(3), 574–591.
Jensen, J. R. (2015). Introductory digital image processing: A remote sensing perspective. In Pearson series in geographic information science. Glenview: Pearson Education, Inc.
Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., ... & Darrell, T. (2014, November). Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM international conference on multimedia (pp. 675–678). New York: ACM.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260.
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105). Berlin/New York: Springer.
Kuwata, K., & Shibasaki, R. (2015, July). Estimating crop yields with deep learning and remotely sensed data. In 2015 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) (pp. 858–861). IEEE.
Laney, D. (2001). 3D data management: Controlling data volume, velocity and variety. META Group Research Note, 6(70), 70–73.
LeCun, Y., Boser, B., Denker, J. S., Henderson, D., Howard, R. E., Hubbard, W., & Jackel, L. D. (1989). Backpropagation applied to handwritten zip code recognition. Neural Computation, 1(4), 541–551.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436–444.
Lee, S. H., Chan, C. S., Wilkin, P., & Remagnino, P. (2015, September). Deep-plant: Plant identification with convolutional neural networks. In 2015 IEEE international conference on image processing (ICIP) (pp. 452–456). IEEE.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133.
Mohanty, S. P., Hughes, D. P., & Salathé, M. (2016). Using deep learning for image-based plant disease detection. Frontiers in Plant Science, 7, 1419.
Montalvo, M., Guerrero, J. M., Romeo, J., Emmi, L., Guijarro, M., & Pajares, G. (2013). Automatic expert system for weeds/crops identification in images from maize fields. Expert Systems with Applications, 40(1), 75–82.
NASA. (2017). Earthdata system performance. https://earthdata.nasa.gov/about/system-performance. Accessed 01 Jan 2017.
Nativi, S., Mazzetti, P., Santoro, M., Papeschi, F., Craglia, M., & Ochiai, O. (2015). Big data challenges in building the global earth observation system of systems. Environmental Modelling & Software, 68, 1–26.
Ng, A. (2013). Courses – Andrew Ng. Retrieved from http://www.andrewng.org/courses/
Padarian, J., Minasny, B., & McBratney, A. B. (2015). Using Google’s cloud-based platform for digital soil mapping. Computers & Geosciences, 83, 80–88.
Park, S., Im, J., Jang, E., & Rhee, J. (2016). Drought assessment and monitoring through blending of multi-sensor indices using machine learning approaches for different climate regions. Agricultural and Forest Meteorology, 216, 157–169.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., et al. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12(Oct), 2825–2830.
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.


Quinlan, J. R. (1993). C4.5: Programs for machine learning (Vol. 1). San Mateo: Morgan Kaufmann.
Quinlan, J. R. (2003). Data mining tools See5 and C5.0. St. Ives: RuleQuest Research. http://www.rulequest.com/see5-info.html.
Ramapriyan, H., Brennan, J., Walter, J., & Behnke, J. (2013). Managing big data: NASA tackles complex data challenges. Earth Imaging Journal. http://eijournal.com/print/articles/managingbig-data.
Rao, V. (2013). Introduction to Classification & Regression Trees (CART). Retrieved June 02, 2017, from http://www.datasciencecentral.com/profiles/blogs/introduction-to-classification-regression-trees-cart
Romeo, J., Pajares, G., Montalvo, M., Guerrero, J. M., Guijarro, M., & De La Cruz, J. M. (2013). A new expert system for greenness identification in agricultural images. Expert Systems with Applications, 40(6), 2275–2286.
Rosten, E., & Drummond, T. (2006). Machine learning for high-speed corner detection. Computer vision–ECCV, 2006, 430–443.
Schaul, T., Bayer, J., Wierstra, D., Sun, Y., Felder, M., Sehnke, F., et al. (2010). PyBrain. Journal of Machine Learning Research, 11(Feb), 743–746.
Sexton, J. O., Urban, D. L., Donohue, M. J., & Song, C. (2013). Long-term land cover dynamics by multi-temporal classification across the Landsat-5 record. Remote Sensing of Environment, 128, 246–258.
Sharma, R., Kamble, S. S., Gunasekaran, A., Kumar, V., & Kumar, A. (2020). A systematic literature review on machine learning applications for sustainable agriculture supply chain performance. Computers & Operations Research, 104926.
Shelhamer, E. (2014). Deep learning for computer vision with Caffe and cuDNN. https://devblogs.nvidia.com/parallelforall/deep-learning-computer-vision-caffe-cudnn/
Sidike, P., Sagan, V., Maimaitijiang, M., Maimaitiyiming, M., Shakoor, N., Burken, J., ... & Fritschi, F. B. (2019). dPEN: deep Progressively Expanded Network for mapping heterogeneous agricultural landscape using WorldView-3 satellite imagery. Remote Sensing of Environment, 221, 756–772.
Sonka, M., Hlavac, V., & Boyle, R. (2014). Image processing, analysis, and machine vision. New York: Cengage Learning.
Sun, Z., Di, L., Heo, G., Zhang, C., Fang, H., Yue, P., ... & Lin, L. (2017). GeoFairy: Towards a one-stop and location based service for geospatial information retrieval. Computers, Environment and Urban Systems, 62, 156–167.
Teluguntla, P., Thenkabail, P. S., Oliphant, A., Xiong, J., Gumma, M. K., Congalton, R. G., ... & Huete, A. (2018). A 30-m Landsat-derived cropland extent product of Australia and China using random forest machine learning algorithm on Google Earth Engine cloud computing platform. ISPRS Journal of Photogrammetry and Remote Sensing, 144, 325–340.
Vedaldi, A., & Fulkerson, B. (2010, October). VLFeat: An open and portable library of computer vision algorithms. In Proceedings of the 18th ACM international conference on multimedia (pp. 1469–1472). ACM.
Yalew, S. G., Van Griensven, A., & van der Zaag, P. (2016). AgriSuit: A web-based GIS-MCDA framework for agricultural land suitability assessment. Computers and Electronics in Agriculture, 128, 1–8.
Zhang, C., Di, L., Sun, Z., Eugene, G. Y., Hu, L., Lin, L., ... & Rahman, M. S. (2017). Integrating OGC Web Processing Service with cloud computing environment for Earth Observation data. In 2017 6th International Conference on Agro-Geoinformatics. IEEE.
Zhang, C., Di, L., Lin, L., & Guo, L. (2019a). Machine-learned prediction of annual crop planting in the US Corn Belt based on historical crop planting maps. Computers and Electronics in Agriculture, 166, 104989.
Zhang, C., Di, L., Yang, Z., Lin, L., Eugene, G. Y., Yu, Z., ... & Zhao, H. (2019b). Cloud environment for disseminating NASS cropland data layer. In 2019 8th International Conference on Agro-Geoinformatics. IEEE.


Zhang, C., Sun, Z., Heo, G., Di, L., & Lin, L. (2016a). A GeoPackage implementation of common map API on Google maps and OpenLayers to manipulate agricultural data on mobile devices. In 2016 fifth international conference on Agro-Geoinformatics. IEEE.
Zhang, C., Sun, Z., Heo, G., Di, L., & Lin, L. (2016b). Developing a GeoPackage mobile app to support field operations in agriculture. In 2016 fifth international conference on Agro-Geoinformatics. IEEE.
Zhang, C., Di, L., Yang, Z., Lin, L., & Hao, P. (2020a). AgKit4EE: A toolkit for agricultural land use modeling of the conterminous United States based on Google Earth Engine. Environmental Modelling & Software, 104694.
Zhang, C., Yang, Z., Di, L., Lin, L., & Hao, P. (2020b). Refinement of cropland data layer using machine learning. The International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 42, 161–164.
Zhu, X., & Goldberg, A. B. (2009). Introduction to semi-supervised learning. Synthesis Lectures on Artificial Intelligence and Machine Learning, 3(1), 1–130.

Chapter 7
Data Fusion in Agricultural Information Systems

Berk Üstündağ

Abstract Data has an increasing role in agricultural production and management at all scales, driven by the rising importance of yield efficiency and sustainability. Remote sensing systems provide spatial information at some observation instants, and real-time terrestrial monitoring systems provide temporal information at some observation points. Data fusion methods appear as a feasible way of multitemporal mapping of information in agricultural management. Data fusion uses computational models and machine learning methods on available spatial, temporal, and multitemporal datasets. In this chapter, the basics of data indexing and segmentation in agricultural monitoring are given together with application examples of time-delay neural networks, convolution, and the wavelet transformation for data fusion. Frequently used agrometeorological indices and yield efficiency relationships are also explained. Since many of the required monitoring parameters are usually not feasible for real-time data acquisition, data fusion methods enable these parameters to be estimated indirectly from the correlated set of available data. In contrast to the distributed characteristics of data resources and users in agriculture, computational systems show a centralization trend through “Data as a Service” (DaaS), “Platform as a Service” (PaaS), and “Artificial Intelligence as a Service” (AIaaS). Data fusion is expected to have an especially increasing role in large-scale, continuous-time data services in agricultural applications.

Keywords Data fusion · Yield forecast · Agro-informatics · Convolutional neural networks · Wavelet transformation · Deep learning · Evapotranspiration · Agricultural management · Time delay neural networks · Remote sensing

B. Üstündağ (*)
Computer and Informatics Engineering Faculty, Istanbul Technical University, Istanbul, Turkey

© Springer Nature Switzerland AG 2021
L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_7


7.1 Introduction

Arable land per capita has decreased by more than 40% since 1970 (World Bank 2019). Data has an increasing role in agricultural production and management at all scales, driven by the rising importance of yield efficiency and sustainability. Precision agriculture and good agricultural practices are based on the evaluation of various types of data and knowledge. Basin-level agricultural planning, management, and optimization also depend on the acquisition and processing of various types of data. Remote sensing systems provide spatial information at some observation instants, and real-time terrestrial monitoring systems provide temporal information at some observation points. There is not yet a feasible and accurate method for direct spatiotemporal monitoring of crop, soil, and other terrestrial resources over large areas at continuous sampling intervals. Data fusion methods appear as a way of multitemporal mapping of information based on the evaluation of spatial, temporal, and multitemporal datasets via computational models. Besides the spatial and temporal features of the datasets, knowledge-driven data, administrative records, workflow processes, and real-time communication systems provide additional dimensions for data fusion systems. Data fusion systems can evaluate data in the time domain, frequency domain, state space, wavelet domain (Jin et al. 2014), or spatial domain, while the outputs can be mixed-domain data such as spatiotemporal data or spatial frequency data. Data fusion is the process of integrating multiple data and knowledge representing the same real-world object into a consistent, accurate, and useful representation, which can significantly increase the application value of the data (Ghannam et al. 2014). This chapter presents the basics and some examples of data fusion applications in agriculture.
From a systems theory perspective, agricultural systems require observability of states as well as of imposed input patterns for accurate forecasts and optimal management. Statistical data is very important for agriculture, since common contextual information helps to cope with variational and conditional complexities. Since many of the required monitoring parameters are usually not feasible for real-time data acquisition, data fusion methods enable these parameters to be estimated indirectly from the correlated set of available data. In contrast to the distributed characteristics of data and users in agriculture, computational systems show a centralization trend through “Data as a Service” (DaaS), “Software as a Service” (SaaS), “Platform as a Service” (PaaS), and finally “Artificial Intelligence as a Service” (AIaaS). Considering the increasing coverage of mobile wireless networks and the rising rate of smartphone use, even in rural areas, centralized computational systems are a good match for more capable services at an affordable operational cost per user. In addition to the value added by big data and scale, recent standards for energy-efficient IoT services, such as the narrowband Internet of Things (NB-IoT) in GSM cellular networks, complete the value addition cycle for monitoring and automation in the field.


The data fusion process is based on the association, correlation, and combination of data and information from single and multiple sources to achieve refined position and identity estimates and complete and timely assessments of situations, threats, and their significance (White 1991). These techniques have been broadly employed in multisensor environments to extract different types or variants of the input data. Their objective is to obtain a lower detection error probability and higher reliability in multisensor environments (Castanedo 2013).

This chapter includes two different kinds of data fusion examples and some basic information about wavelets and deep learning. The first example, given in Sect. 7.3, is a linear regression model applied through segmentation of the input data to cope with the nonlinearity of the system. The second example, demonstrated in Sect. 7.4, uses fully connected artificial neural networks for time domain estimations. Wavelets and convolutional neural networks are increasingly used in data fusion processes. Sections 7.5 and 7.6 contain basic information about wavelets and convolutional neural networks.
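The claim that combining sources lowers the estimation error can be illustrated with one classical fusion rule. The sketch below is not one of the chapter's methods: it applies inverse-variance weighting to independent estimates of the same quantity, under the assumption that their errors are unbiased and uncorrelated. The soil moisture values and variances are hypothetical.

```python
def fuse(estimates):
    """Inverse-variance weighted fusion of independent (value, variance) pairs.

    Returns the fused value and its variance, which is never larger
    than the smallest input variance.
    """
    weights = [1.0 / variance for _, variance in estimates]
    total = sum(weights)
    value = sum(w * v for w, (v, _) in zip(weights, estimates)) / total
    return value, 1.0 / total


# Hypothetical soil moisture estimates for one field: (volumetric %, variance).
satellite = (24.0, 9.0)   # coarse remote sensing retrieval, less certain
station = (21.0, 1.0)     # nearby terrestrial sensor, more certain

fused_value, fused_variance = fuse([satellite, station])
```

The fused estimate lies closer to the more reliable source, and its variance is smaller than that of either input, which is the basic motivation for fusing remote sensing and terrestrial observations.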

7.2 Agricultural Information Systems

Real-time crop status monitoring, yield prediction, irrigation management, precision farming, resource management, and policy management are some application fields in agriculture where sensor and data fusion methods are utilized. Basin resource management requires information about the existing crop patterns, yield efficiency, and probable alternative crop patterns with respect to ecological appropriateness under sustainability restrictions. Although crop yield prediction using remote sensing data together with agrometeorological observations has always been a strategic issue, limited accuracy appears as the major restriction. An example of a fusion model that improves crop yield prediction performance by using observation data from the monitoring network of the “Agricultural Monitoring and Information Systems Project” (TARBIL) in Turkey is demonstrated in Sect. 7.3.

Multitemporal satellite images are widely used in crop monitoring. Very-high-resolution (VHR), multispectral, and hyperspectral images are important spatial data for precision agriculture. Although those spatial data are acquired on a programmed or periodic schedule, quality parameters differ depending on cloudiness, viewing angle, and atmospheric conditions as well as surface snow coverage and wetness. Phenological timing is an additional important consideration in agricultural data acquisition, since crop growth depends on phenological stages. A plant may appear in different forms at the same calendar time of the season in different years because the phenological stage shifts with varying seasonal conditions. Observation systems usually provide data in three different sampling types:


Fig. 7.1 Data fusion for spatiotemporal monitoring and decision support systems

(a) Spatial data that is equally sampled throughout a specific region at some planned times.
(b) Multitemporal data that is equally sampled throughout a specific region within known periods.
(c) Temporal data that is sampled at any required frequency within the measurement bandwidth but at specific monitoring locations only.

VHR satellite images, aerial orthophotos, and drone-based observations are spatial data. Agrometeorological and phenological observations from terrestrial monitoring stations are temporal data. Remote sensing satellites, such as MODIS, Landsat, or Sentinel, provide multitemporal data by periodic imaging on their orbits. Although they provide temporal resolution, their spatial resolution is still lower than that of the VHR satellites. On the other hand, their cost efficiency and the availability of different sensor types are high, while their spatial resolution is also rising with technological developments. Mission-dedicated low-orbit remote sensing satellites can meet higher-spatial-resolution requirements, with a shorter operational lifetime than higher-orbit remote sensing satellites.

A generalized data fusion scheme is shown in Fig. 7.1, where one or more datasets of spatial, multitemporal, and/or temporal data are used for spatiotemporal mapping of the same or correlated different types of data. Spatiotemporal datasets can be preferred in PaaS and DaaS applications because their uniform structure suits on-demand user queries. Data fusion schemes may have several alternative options depending on the availability or quality of the data (Khaleghi et al. 2013). Alternative fusion schemes can be prioritized according to correlation, confidence, and computational cost. Fusion methods also provide a solution for data reconstruction requirements. When one or more sensors of a monitoring station in an agrometeorological network fail to deliver data, the real-time data requirement of DaaS systems can be met by indirect, but less accurate or costlier, temporary computations in proper fusion schemes. For example, when a


temperature sensor fails, past temperature measurement patterns in the neighboring stations and other temperature-correlated measurements in the same station can be used for temporary data reconstruction (Altan and Üstündağ 2012).

Acquired raw datasets are also used to calculate agricultural indices. For example, growing degree days (GDD) and vapor pressure deficit (VPD) are two important parameters for the plant growth rate, and temperature measurement is used in both of them. Automated diagnostics for quality of service (QoS) management in DaaS business models also rely on data fusion methods.

Probably the most important use of data fusion in agricultural applications, however, is prediction. Prediction can be performed as either nowcasting or forecasting. Data fusion is used for nowcasting when direct monitoring of the target data is not feasible because of physical restrictions, operational reasons, or cost-related issues. Forecasting is based on the fusion of data patterns by machine learning methods, adaptation of analytic models, or hybrid methods to estimate a future value in time, space, or both. Risk management in agriculture may require both nowcasting and forecasting models. For example, if a disease is known to occur at some known locations, nowcasting can help to estimate the probable distribution of damage, and forecasting can help to anticipate the development of risk. In this case, known interpolatable parameters such as temperature and air humidity, together with location-specific remote sensing data, can be used in a fusion scheme for nowcasting and forecasting purposes.

An automated cropland cover identification system requires training patterns and terrestrial reference observations for supervised classification in addition to the terrain data. Geostatistical yield prediction models are based on the probabilistic distribution of yield derived from spatial surveys and terrain models. Their accuracy depends on sampling size and terrain complexity.
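The two indices named earlier in this section, growing degree days (GDD) and vapor pressure deficit (VPD), both start from temperature measurements. The following is a minimal sketch under stated assumptions rather than the formulation used in this chapter: GDD uses the simple daily averaging method with an assumed base temperature of 10 °C, and VPD uses the Tetens-type saturation vapor pressure equation from the FAO-56 guidelines. The three-day station record is hypothetical.

```python
import math


def growing_degree_days(t_min, t_max, t_base=10.0):
    """Daily GDD by the averaging method; t_base = 10 °C is an assumed default."""
    return max(0.0, (t_min + t_max) / 2.0 - t_base)


def saturation_vapor_pressure(temp_c):
    """Saturation vapor pressure in kPa at air temperature temp_c (°C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))


def vapor_pressure_deficit(temp_c, rel_humidity):
    """VPD in kPa from air temperature (°C) and relative humidity (%)."""
    return saturation_vapor_pressure(temp_c) * (1.0 - rel_humidity / 100.0)


# Hypothetical three-day station record: (Tmin, Tmax) in °C.
days = [(12.0, 24.0), (8.0, 10.0), (15.0, 29.0)]
accumulated_gdd = sum(growing_degree_days(t_lo, t_hi) for t_lo, t_hi in days)

# VPD for a single hypothetical observation: 25 °C at 50% relative humidity.
vpd_example = vapor_pressure_deficit(25.0, 50.0)
```

Because both indices depend on temperature, a failed temperature sensor interrupts them as well, which is one reason the reconstruction of missing temperature data by fusion matters in an operational network.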
On the other hand, their spatial resolution is limited as a region or province. Crop yield prediction and mapping is a good example to demonstrate how data fusion improves spatial accuracy with respect to pure statistical models. Data fusion is also used for spatial interpolations of agrometeorological parameters. Some of them are interpolatable based on kriging methods and some models, while many surface parameters such as soil moisture and phenological stage distribution are not interpolatable. Data fusion methods can be used for mapping of uninterpolatable parameters together with correlated interpolatable parameters as an adaptive model solution. Some of the spatial information also intersects with temporal observation locations (Fig. 7.1). Spatial and temporal characteristics at those intersection points are used in the calibration or adaptation of data fusion models. Hence spatial and temporal data can be used in the generation of multitemporal spatial data by using adapted spatiotemporal data fusion models. Data fusion models increase the service quality and capabilities of the vertical integration platforms. Sectoral integration in agriculture consists of ﬁve major components (Fig. 7.2): (a) Asset management (b) Efﬁciency management


B. Üstündağ

Fig. 7.2 Integrated agricultural production management system components

(c) Quality management
(d) Market management
(e) Sustainability management

These components are cascaded in seasonal periods. Each of them requires monitoring, planning, and decisions for controlling the system (Manos et al. 2010). Data fusion is considered especially in the monitoring and decision support layers.

Asset management includes the monitoring and management of arable land, cropland cover, water supply and irrigation networks, livestock counts, the geographical distribution of pasture, and other agricultural production units. Its main purpose is the optimal use of natural resources with respect to ecological properties for better economic production capability.

Efficiency management deals with monitoring and increasing the yield efficiency. Yield prediction is derived from the crop pattern determined within the asset management system together with the yield efficiency maps for each specified crop.

The price of a crop from a specified region is also a function of its product quality class in addition to the production amount. For example, protein and gluten levels affect the market price of wheat. These three parts of the management are still not enough to estimate the basin-scale income per crop.

Market management is based on monitoring and predicting demand and supply rates as well as the price variations of international markets. Subsidy systems and cooperative and governmental logistic strategies are important tools for market regulation, food security, and sustainable economic development. Data fusion is an alternative to analytical models for economic evaluations too. For example, oil price changes have a lead/lag relationship with food prices, so they can be used as one of the inputs of data fusion models for market price prediction and for decision support in optimal subsidy management.

However, the product of production units (assets) × yield efficiency × quality-dependent pricing × market-condition-based unit prices is not sufficient to optimize the integrated system. The main restriction is the balanced use of natural resources. Water and soil are the two major components of the basin. The water level of the aquifers drops if the irrigation regime implied by the crop pattern is not balanced against precipitation and the other water inputs of the basin's hydrological system. Sustainability management therefore sets restrictions on all four other components within the integrated agricultural management.

7.3 Regression Model Example for Real-Time Yield Efficiency Monitoring

The first example is a regression model-based data fusion for yield monitoring. It is intended for the in-season, real-time part of the four-step yield monitoring system. Its accuracy increases as the time remaining to harvest shrinks, since observed values progressively replace statistical agrometeorological estimates. The continuous-time, crop-specific yield monitoring system consists of four processes:

(a) Phenological stage mapping
(b) Data segmentation depending on the phenological stage, together with correlation analysis
(c) Yield prediction model based on the phenological stage segmented data
(d) Adaptation of the model parameters after harvest by using the geo-statistical data

The TARBIL system has 440 agrometeorological-phenological monitoring stations in Turkey (Fig. 7.3). The regression model-based yield estimation example here uses 31 observation records from wheat parcels located in South Eastern Turkey, in the Mardin, Şanlıurfa, Diyarbakır, and Gaziantep Provinces. The agrometeorological dataset shown in the Appendix was acquired from wheat parcels under dry farming conditions. Province yield efficiency data is provided by TURKSTAT (TÜİK, Turkish Statistics Institute). Each monitoring station carries 35 sensors in total, covering atmospheric, soil, and phenological measurements. The meteorological sensors are sampled every 10 minutes, and camera images are captured every 30 minutes. Acquired data is processed on a high-performance computer (HPC) at Istanbul Technical University. Data acquisition and processing operations are supported by the command and control center responsible for the QoS (quality of service). This


Fig. 7.3 TARBIL monitoring station locations in Turkey (2016)

part of the system is semiautomated: the phenological stages of the cereals at the observation parcels are determined by image processing software (Bagis and Üstündağ 2012), and operators verify suspicious data.

A conventional approach to crop yield efficiency estimation uses the statistical correlations between observed seasonal parameters and crop yield efficiency data (Herndl 2008). Choosing the parameters that correlate most strongly with yield while having the lowest covariance among themselves is important for the estimation performance of regression models. Multiple linear regression may not capture the relationship accurately when cumulative seasonal data is used to estimate yield from the normalized difference vegetation index (NDVI) and agrometeorological parameters. Besides nonlinear regression models, piecewise linear multiple regression over phenological stage group intervals also improves the estimation performance. Regression analyses based on NDVI as a remote sensing parameter, rainfall, and other agrometeorological indices are widely used with statistical models (Balaghi et al. 2008). Another approach is to use crop system models such as CropSyst. Common computational platforms are also a way of applying crop system models, as in the case of BioMa (Rouse et al. 1973). A wide variety of computational methods, such as neural networks, self-identification, and statistical learning methods, are used to determine the model parameters in either supervised or adaptive modes. Another common approach is supervised classification, where remote sensing data is used together with terrestrial observations.

As in the case of animals, plants respond differently to the physical environment depending on their phenological stage (Dong et al. 2014; Herndl 2008). The main idea behind the improvement of yield efficiency estimation is not only choosing the highest correlating parameters but also grouping them in terms of phenological stages. Phenological stage dates are also dependent and predictable (Üstündağ 2017).

Accuracy is the main performance measure for all yield efficiency estimation methods. Tolerance depends directly on accuracy within the chosen statistical confidence factor (Zc). The total tolerance of a yield estimate depends on the sum of the tolerances in crop area estimation and crop yield efficiency estimation, since their product determines the yield for each region. Since it is usually not possible to monitor all the harvesting data in a chosen region, an important aspect is having a reliable method that enables interpolation-based mapping from reference data at sampling points. Although the error rate in the spatial distribution of meteorological parameters can be reduced and managed by using data fusion techniques (Bagis et al. 2012), crop status monitoring systems may even require changing the input parameter sets in addition to adapting the model parameters.

The soil-adjusted vegetation index (SAVI), modified SAVI (MSAVI), NDVI, and Global Environmental Monitoring Index (GEMI) are indices correlated with surface vegetation. They are computed from the monitoring data of remote sensing satellites or aerial platforms (Herndl 2008). The vapor pressure deficit (VPD), growing degree days (GDD), photo-thermal unit (PTU), helio-thermal unit (HTU), reference evapotranspiration (ET0), crop evapotranspiration (ETc), minimum temperature (Tmin), and precipitation (P) are some of the parameters known to correlate with plant behavior characteristics. They are computed from the temporal data of terrestrial monitoring systems (Allen et al. 1998; Amrawat et al. 2013; Bazgeera et al. 2007). They are also spatially interpolatable by inverse distance weighting with elevation correction (IDWEC) or kriging methods, since their input variables are basic physical measurements such as humidity, air pressure, and air temperature.
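The chapter does not spell out how IDWEC is implemented, so the following is only a minimal sketch of one plausible reading: each station temperature is first shifted to the target elevation with a constant lapse rate, and the shifted values are then combined by inverse distance weighting. The function name, the lapse rate of 6.5 °C per km, the distance exponent, and the planar kilometre coordinates are all assumptions made for illustration.

```python
import math

# Assumed constant lapse rate (degrees C per metre); not taken from the chapter.
LAPSE_RATE_C_PER_M = 0.0065


def idw_elevation_corrected(stations, target_xy, target_elev_m, power=2.0):
    """Interpolate air temperature at a target point.

    stations: list of (x_km, y_km, elevation_m, temperature_c) tuples
              in a hypothetical planar coordinate system.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, elev, temp in stations:
        # Shift the station reading to the elevation of the target point.
        adjusted = temp - LAPSE_RATE_C_PER_M * (target_elev_m - elev)
        dist = math.hypot(x - target_xy[0], y - target_xy[1])
        if dist == 0.0:
            # The target coincides with a station: use its corrected value.
            return adjusted
        weight = 1.0 / dist ** power
        weighted_sum += weight * adjusted
        weight_total += weight
    return weighted_sum / weight_total


# Two hypothetical stations at equal distance from the target point.
stations = [(0.0, 0.0, 100.0, 20.0), (2.0, 0.0, 300.0, 18.0)]
estimate = idw_elevation_corrected(stations, (1.0, 0.0), 200.0)
print(round(estimate, 2))
```

With equal distances the weights cancel, so the result is simply the mean of the two elevation-corrected readings.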
The model is initiated by using the statistical agrometeorological data in order to provide decision support during the planning phase before sowing. Initially, the statistical yield prediction (Odoh and Chinedum 2014) is expressed together with its tolerance as

$$YE = \mu_{YE} \pm Z_c \cdot \frac{\sigma_{YE}}{\sqrt{n}} \qquad (7.1)$$

where Zc is the statistical confidence factor and μ and σ are the average and standard deviation of the past yield efficiency over n data samples at a specified location or coordinate (x, y). Location can be defined at different scales, and the scale determines how the data is constructed. If the location specifies a town or basin, then the past yield efficiency records can be used to compute the average or the expected trend value (trend-estimated yield). In Turkey, basins consist of towns with similar climatic and terrain characteristics within a geographical neighborhood. The yield efficiency estimation tolerance is at its maximum in the statistical estimation phase.
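As a minimal sketch of Eq. (7.1), the planning-phase estimate and its tolerance can be computed from past yield efficiency records as follows. The function name, the use of the sample standard deviation, the value Zc = 1.96, and the five yield values are assumptions chosen only for illustration.

```python
import math
import statistics


def statistical_yield_estimate(past_yields, z_c=1.96):
    """Return (mean, tolerance) so that YE = mean +/- tolerance (Eq. 7.1)."""
    n = len(past_yields)
    mean = statistics.mean(past_yields)
    # Sample standard deviation; the chapter does not say which form it uses.
    sigma = statistics.stdev(past_yields)
    tolerance = z_c * sigma / math.sqrt(n)
    return mean, tolerance


# Hypothetical past yield efficiency records for one location (kg/da).
mean, tolerance = statistical_yield_estimate([240, 260, 250, 270, 230])
print(f"YE = {mean:.1f} +/- {tolerance:.2f} kg/da")
```

The tolerance shrinks with the square root of the number of records, which is why longer yield histories give tighter planning-phase estimates.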

7.3.1 Phenological Stage-Based Data Segmentation

Phenological stage change is the main nonlinearity of plant system models. Plant growth requirements, as well as risk-related loss estimations, depend on the phenological stage of the plant. Phenological stages can be regarded as the states of a Markov model. The state transition function is related to the imposed physical parameters, and it varies depending on the state. For example, the distribution of precipitation over the phenological stage durations has a higher correlation with yield than the seasonal precipitation total. Simple linear regression does not fit well, especially when the long-term volatility of agrometeorological conditions is high. Segmenting the data into phenological stages, or into groups of consecutive phenological stages, provides convergence to a piecewise linear model.

In the proposed data segmentation model, each phenological stage of the crop is represented by a different state, as shown in Fig. 7.4. Merging the data of some consecutive stages can reduce computational complexity. In this case, the chosen parameters in a group of consecutive phenological stages are represented by their accumulated values within the respective states. Data segmentation with respect to state or phenological stage requires resetting the cumulative counters or reinitializing the parameter values after each state transition. After sowing, interpolated agrometeorological data and remote sensing indices are segmented and progressively replace the statistical data. Hence, while the estimated yield efficiency starts from the statistical past data for a specified crop and location, the tolerance of the estimate decreases continuously until harvesting time as statistically expected values are replaced by actual data, as shown in Fig. 7.5. If the prediction coverage is regional, then the average sowing time for the region is used as a reference for the starting time of data replacement with current

Fig. 7.4 Simpliﬁed state machine model for the phenological stage-based data segmentation


Fig. 7.5 Typical yield efﬁciency and statistical tolerance sketch in three periods as pure statistical estimation (planning phase), forecasting, and nowcasting terms

observations. Otherwise, if a specific parcel is considered, then the exact sowing time must be used as the reference date for the initiation of the real-time computations. Phenological stage-based segmentation also enables the prediction of the phenological stage dates and the harvesting date by using another set of regression models that represent the state transition functions (Üstündağ 2017). Remote sensing data is used both for crop/land cover estimation and for the calculation of some indices that also correlate with yield efficiency. Both seasonal VHR satellite images and multitemporal/multispectral images are used at different frequencies for this purpose. Agrometeorological data is first converted into indices that correlate with phenological stage durations. We have considered seven stages for cereals, as listed below:

1. Emergence
2. Floral initiation (double ridge)
3. Terminal spikelet
4. First node
5. Heading
6. Anthesis
7. Physiological maturity

In the third step, some other agrometeorological and remote sensing index sets are segmented with respect to the phenological stage transition date intervals. This data segmentation converts the chronological data into the biological timing of the crops. The sample dataset in the Appendix includes widely used agrometeorological indices for the explanations and the example regression model-based data fusion here. The dataset consists of separate tables for each phenological stage (stage 1, ..., stage 7) and for the total values. Crop yield computations rely on matching the actual crop area mapping with the crop yield efficiency mappings, separately for dry farming and irrigated conditions.
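The segmentation step described above can be sketched as a small accumulator that starts a new cumulative counter at every stage transition and then merges consecutive stages into groups. This is only an illustrative sketch: the function names, the convention that a stage lasts from its onset day until the onset of the next stage, and the toy rainfall series are assumptions, not details taken from the TARBIL implementation.

```python
from bisect import bisect_right


def accumulate_by_stage(daily_values, stage_onsets, end_day):
    """Sum a daily parameter separately for each phenological stage.

    daily_values: iterable of (day_index, value) pairs
    stage_onsets: sorted day indices at which stages 1, 2, ... begin
    end_day:      first day after the last stage (e.g. harvest)
    """
    totals = [0.0] * len(stage_onsets)
    for day, value in daily_values:
        if day >= end_day:
            continue  # after the last stage
        stage_index = bisect_right(stage_onsets, day) - 1
        if stage_index >= 0:  # days before the first onset are skipped
            totals[stage_index] += value
    return totals


def merge_stage_groups(stage_totals, groups):
    """Merge consecutive stages, e.g. {'ARF12': (1, 2)} sums stages 1 and 2."""
    return {name: sum(stage_totals[s - 1] for s in stages)
            for name, stages in groups.items()}


# Toy example: 1 mm of rain on each of 10 days, three stages starting
# on days 2, 5, and 8.
rain = [(day, 1.0) for day in range(10)]
per_stage = accumulate_by_stage(rain, [2, 5, 8], end_day=10)
grouped = merge_stage_groups(per_stage, {"ARF12": (1, 2), "ARF3": (3,)})
print(per_stage, grouped)
```

Keeping the per-stage totals separate from the grouping step makes it cheap to try different stage groups when searching for the combination that correlates best with yield efficiency.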

7.3.2 Agrometeorological Indices and Regression-Based Data Fusion for Yield Estimation

Growing degree days (GDD) is an index known to be a good indicator of the growth rate of crops (Dubey et al. 1987). It is essentially the accumulation of temperature above a certain activation limit of the crop. The base temperature for physiological processes in wheat is around 5 °C, and GDD is expressed as

$$GDD = \sum_{\text{stage beginning}}^{\text{stage ending}} \left( \frac{T_{max} + T_{min}}{2} - T_b \right) \qquad (7.2)$$

$$T_b \le \frac{T_{max} + T_{min}}{2} \le T_{bmax} \qquad (7.3)$$

where Tmin and Tmax represent the daily minimum and maximum temperatures and Tb and Tbmax are the base minimum and maximum phenological development temperatures, respectively.

Vapor pressure deficit (VPD) is another important index for plant growth. It has a main role in plant evaporation and transpiration (Allen et al. 1998). Changes in atmospheric moisture affect the plant's evapotranspiration through fluctuations in the real vapor pressure (ea) and in the vapor pressure gradient from the leaf toward the air. The difference between the saturated vapor pressure (es) and the real vapor pressure is the vapor pressure deficit (VPD). VPD is calculated by the following equations:

$$VPD = e_s - e_a \qquad (7.4)$$

$$e_s(T) = 0.6108 \, \exp\!\left( \frac{17.27\,T}{T + 237.3} \right) \qquad (7.5)$$

$$e_a = \frac{e_s(T_{min}) \, \frac{RH_{max}}{100} + e_s(T_{max}) \, \frac{RH_{min}}{100}}{2} \qquad (7.6)$$

$$e_s = \frac{e_s(T_{max}) + e_s(T_{min})}{2} \qquad (7.7)$$

where es(T) is the saturated vapor pressure at a given temperature, es is the saturated vapor pressure, ea is the real vapor pressure, RHmin is the daily minimum relative humidity, and RHmax is the daily maximum relative humidity.

Evapotranspiration of a plant is related to water consumption, and it correlates with yield efficiency through the water balance equilibrium. Reference evapotranspiration (ET0) is computed only from agrometeorological measurements and a model (Schröder et al. 2014). It is a plant-free reference indicator. Crop-specific calculations (ETc) require Kc factors of the plant for each phenological stage and region. We have provided potential evapotranspiration (PET) data in the Appendix as a factor independent of plant growth so that we can also investigate the correlations with the phenological stages. Reference evapotranspiration according to the Penman-Monteith method is stated as

$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\frac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \qquad (7.8)$$

where G is the soil (ground) heat flux density, Rn is the net radiation, T is the air temperature at 2 m, u2 is the wind speed at 2 m, es is the saturated vapor pressure, ea is the real vapor pressure, Δ is the slope of the vapor pressure curve, and γ is the psychrometric constant.

Accumulated rainfall (RF or ARF) is the sum of the daily precipitation over the selected time interval. It is given in the Appendix as RF, both for the seven phenological stages and as the total value. ARF is mathematically expressed as

$$ARF = \sum_{\text{stage beginning}}^{\text{stage ending}} \text{precipitation} \qquad (7.9)$$
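The daily index formulas in Eqs. (7.2) to (7.8) translate almost directly into code. The sketch below is illustrative only: the function names are hypothetical, days whose mean temperature falls outside the window of Eq. (7.3) are assumed to contribute zero, the upper limit of 30 °C is a placeholder, and Δ and γ are passed in by the caller rather than derived. Units are assumed to follow Allen et al. (1998): °C, kPa, MJ m⁻² day⁻¹, and m s⁻¹.

```python
import math


def saturation_vapor_pressure(t_c):
    """Eq. (7.5): saturated vapor pressure (kPa) at temperature t_c (deg C)."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def vapor_pressure_deficit(t_min, t_max, rh_min, rh_max):
    """Eqs. (7.4), (7.6), (7.7): daily VPD in kPa."""
    es_min = saturation_vapor_pressure(t_min)
    es_max = saturation_vapor_pressure(t_max)
    es = (es_max + es_min) / 2.0
    ea = (es_min * rh_max / 100.0 + es_max * rh_min / 100.0) / 2.0
    return es - ea


def stage_gdd(daily_min_max, t_base=5.0, t_base_max=30.0):
    """Eqs. (7.2), (7.3): GDD accumulated over the days of one stage.

    Assumption: days whose mean temperature lies outside
    [t_base, t_base_max] contribute nothing.
    """
    total = 0.0
    for t_min, t_max in daily_min_max:
        t_mean = (t_max + t_min) / 2.0
        if t_base <= t_mean <= t_base_max:
            total += t_mean - t_base
    return total


def reference_et0(rn, g, t_mean, u2, es, ea, delta, gamma):
    """Eq. (7.8): Penman-Monteith reference evapotranspiration (mm/day)."""
    numerator = (0.408 * delta * (rn - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (es - ea))
    return numerator / (delta + gamma * (1.0 + 0.34 * u2))


# Hypothetical daily inputs, chosen only to exercise the functions.
vpd = vapor_pressure_deficit(t_min=10.0, t_max=25.0, rh_min=40.0, rh_max=80.0)
gdd = stage_gdd([(10.0, 20.0), (0.0, 6.0), (12.0, 22.0)])
et0 = reference_et0(rn=13.28, g=0.14, t_mean=16.9, u2=2.078,
                    es=1.997, ea=1.409, delta=0.122, gamma=0.0666)
print(round(vpd, 3), gdd, round(et0, 2))
```

Summing the daily VPD or ET0 values over the days of a stage gives stage-segmented totals in the same way as `stage_gdd` does for temperature.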

The helio-thermal unit (HTU) index is the accumulated measure of the variation in the ambient temperature between phenological events and is given as

$$HTU = \sum_{\text{stage beginning}}^{\text{stage ending}} \left( \frac{T_{max} + T_{min}}{2} - T_b \right) n \qquad (7.10)$$

where n is the actual number of sunshine hours. The photothermal unit (PTU) index (Rajput 1980) is defined as

$$PTU = \sum_{\text{stage beginning}}^{\text{stage ending}} \left( \frac{T_{max} + T_{min}}{2} - T_b \right) N \qquad (7.11)$$

where N is the maximum possible number of sunshine hours. The normalized difference vegetation index (NDVI) has been widely used in remote sensing since 1973 (Rao 2003) as an indication of vegetative coverage. The NDVI value is a ratio ranging from −1 to 1, and it is expressed as

$$NDVI = \frac{\rho_{NIR} - \rho_{RED}}{\rho_{NIR} + \rho_{RED}} \qquad (7.12)$$

where ρNIR is the reflectance in the near-infrared band and ρRED is the reflectance in the red band. Extreme negative values of NDVI represent water, and values around zero represent bare soil. We have used multitemporal Landsat 7 and Landsat 8 images in the computation of the NDVI values listed in the Appendix. A linear transformation has been applied for the continuity of the data between Landsat 7 and Landsat 8 images (Flood 2014). Curve fitting on the multitemporal data has been used to recover from some cloudiness-related problems, although the annual sunshine duration is as high as 2993 h in the region. Although NDVI has its highest value in the sixth stage, as shown in Fig. 7.6, it has the highest correlation with yield efficiency in the third stage (Fig. 7.7 and brown line in Fig. 7.8).

The correlations of the above-defined indices to the seasonal yield efficiency, for each phenological stage of winter wheat under dry farming conditions, are shown in Table 7.1 and Fig. 7.8. “St.D.” indicates the phenological stage duration in days. Some parameters have a higher statistical dependency, as seen in the graphs of PTU and GDD in Fig. 7.8. Although VPD has a relatively high correlation with the yield efficiency, its dependency directions differ between stage 2 (−) and stage 6 (+). For this reason, seasonal accumulated values are not as effective as the phenological stage-based segmentation of the data in statistical yield efficiency estimations.

Fig. 7.6 Variation in the average value of NDVIs with respect to phenological stages for wheat sample fields (Appendix) in South Eastern Anatolia (x-axis: phenological stage, 1 to 7; y-axis: NDVI)

Fig. 7.7 The correlation between NDVI and crop yield efficiency with respect to phenological stages for wheat sample fields (Appendix) in South Eastern Anatolia (x-axis: phenological stage, 1 to 7; y-axis: correlation between NDVI and crop yield efficiency)
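The stage-wise correlation analysis discussed above combines two simple computations: the NDVI ratio of Eq. (7.12) and a Pearson correlation between a stage-segmented index and the yield efficiency. The sketch below uses hypothetical stage 3 reflectances and yields for four parcels; the function names and all numbers are illustrative assumptions and are not taken from the Appendix.

```python
import math


def ndvi(nir, red):
    """Eq. (7.12); returns 0.0 when both reflectances are zero (assumption)."""
    total = nir + red
    return (nir - red) / total if total else 0.0


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (NIR, red) reflectances in stage 3 and yields (kg/da).
stage3_reflectance = [(0.300, 0.200), (0.325, 0.175),
                      (0.350, 0.150), (0.375, 0.125)]
yields = [100, 150, 180, 260]

stage3_ndvi = [ndvi(nir, red) for nir, red in stage3_reflectance]
r_stage3 = pearson_r(stage3_ndvi, yields)
print([round(v, 2) for v in stage3_ndvi], round(r_stage3, 3))
```

Repeating `pearson_r` for every index and every stage yields a correlation matrix of the kind summarized in Table 7.1.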

Fig. 7.8 Variation of the correlations between the agrometeorological indices and the phenological stage durations (x-axis: phenological stage, 1 to 7; y-axis: correlation to yield efficiency, −0.8 to 0.8; series: stage duration, GDD, VPD, PTU, Tmin, RF, ET0, NDVI)

Table 7.1 Phenological stage segmented parameters' correlation to wheat yield efficiency

          St.D.   GDD     VPD     PTU     Tmin    RF      ET0     NDVI
Stage 1   0.1970  0.5842  0.6244  0.5919  0.3057  0.4559  0.5575  0.3377
Stage 2   0.3093  0.4844  0.5843  0.4851  0.0125  0.3109  0.4849  0.4322
Stage 3   0.3505  0.2035  0.0952  0.1661  0.3547  0.5418  0.2285  0.4771
Stage 4   0.0517  0.0498  0.1566  0.0081  0.0821  0.3317  0.1425  0.1588
Stage 5   0.0913  0.2678  0.1474  0.3192  0.2985  0.1529  0.1834  0.1506
Stage 6   0.2778  0.5059  0.5566  0.5288  0.2543  0.1423  0.5019  0.2588
Stage 7   0.1670  0.0253  0.1802  0.0230  0.3102  0.1534  0.2733  0.0339

Here, the maximum correlation of NDVI is relatively low because the reference yield efficiency data consists of district statistical values, while NDVI is computed on the observed cereal fields. If the data is grouped with respect to the correlation with yield efficiency and the minimum covariance, then we get the example set of values listed in Table 7.2. VPD6 and VPD12 denote the VPD in phenological stage 6 and the sum of the VPD in phenological stages 1 and 2, respectively. ARF1234 is the accumulated rainfall summed over stages 1, 2, 3, and 4. PTU56 is the sum of the PTU in stages 5 and 6. Tmin7 is the minimum temperature in phenological stage 7. When the values in Table 7.2 are used to calculate the regression coefficients for yield efficiency (YE), we get


Table 7.2 Grouped data segmentation example concerning the correlations to yield efficiency

           VPD6     VPD12    ARF1234   PTU56      Tmin7    Yield eff. (kg/da)
           6.07     51.01    157.74    3621.39    10.03    38
           7.94     20.54    116.64    2690.08    2.51     171
           13.03    9.98     81.48     5798.1     9.59     144
           20.32    16.44    144.06    5929.16    9.13     271
           29.73    12.11    252.22    8066.38    11.39    284
           22.66    15.37    293.65    9328.96    13.42    256
           13.29    27.97    84.4      8049.57    12.61    136
           13.7     27.41    411.56    9282.93    13.77    303
           15.7     18.11    396.7     6723.64    12.62    270
           14.18    10.26    389.95    7325.63    13.17    349
           12.69    17.77    337.21    6396.51    11.7     226
           22.76    8.97     172.05    7732.59    10.17    292
           13.86    11.5     336.72    5759.22    12.1     268
           24.28    6.84     285.96    7509.15    11.66    270
           16.42    9.1      373.8     6729.33    12.69    310
           14.83    9.2      112.15    5331.35    11.13    203
           11.31    27.83    106.98    5140.28    10.82    38
           26.43    9.45     81.34     9651.8     12.26    230
           14.64    13.43    187.08    5723.36    10.49    160
           17.47    39.61    88.26     7665       11.57    89
           21.38    17.5     249.22    5737.32    12.06    223
           11.27    6.54     188.64    5545.36    11.64    160
           20.83    12.64    188.18    5772.36    11.72    254
           10.75    77.58    122.3     4583.42    10.23    38
           6.47     13.35    240.7     7819.03    11.56    163
           19.07    15.12    96.82     7315.67    8.73     233
           20.58    23.03    304.04    8779.33    12.76    248
           19.76    19.52    127.14    8426.76    8.82     232
           21.95    12.31    293.33    8384.5     12.02    317
           13.03    15.7     210.9     7231.78    12.06    247
           19.13    6.52     233.06    6245.11    11.37    189
Average:   16.630   18.797   214.977   6783.712   11.155   213.290
St.Dev.:   5.638    14.405   103.087   1625.187   2.028    81.659
R:         0.557    0.652    0.646     0.521      0.310    1.000

$$YE = 119.52 + 4.418\,VPD_6 - 1.908\,VPD_{12} + 0.501\,ARF_{1234} + 0.0123\,PTU_{56} - 12.116\,Tmin_7 \qquad (7.13)$$

where R² is 0.83. It should be noted that the dataset in the Appendix includes a drought period in 2013. For this reason, the district average of the crop yield efficiency varies between 38 kg/da and 349 kg/da, and its deviation is 89 kg/da.


Table 7.3 A reduced form of grouped data segmentation example concerning stage-based correlations to the yield efficiency

           VPD6     VPD12    ARF1234567   Actual yield eff.   Predicted yield eff.   Error
           6.07     51.01    180.26       38                  40.59                  2.59
           7.94     20.54    374.28       171                 191.99                 20.99
           13.03    9.98     180.3        144                 156.25                 12.25
           20.32    16.44    202.06       271                 201.13                 −69.87
           29.73    12.11    329.08       284                 325.21                 41.21
           22.66    15.37    335.54       256                 277.41                 21.41
           13.29    27.97    190.5        136                 131.18                 −4.82
           13.7     27.41    495.12       303                 270.60                 −32.40
           15.7     18.11    444.68       270                 277.01                 7.01
           14.18    10.26    496.85       349                 304.24                 −44.76
           12.69    17.77    359.69       226                 220.54                 −5.46
           22.76    8.97     270.85       292                 260.33                 −31.67
           13.86    11.5     436.44       268                 273.11                 5.11
           24.28    6.84     351.35       270                 309.61                 39.61
           16.42    9.1      402.06       310                 278.25                 −31.75
           14.83    9.2      196.01       203                 176.07                 −26.93
           11.31    27.83    139.62       38                  96.12                  58.12
           26.43    9.45     160.76       230                 233.77                 3.77
           14.64    13.43    278.72       160                 204.39                 44.39
           17.47    39.61    170.92       89                  128.82                 39.82
           21.38    17.5     284.38       223                 242.75                 19.75
           11.27    6.54     274.16       160                 192.88                 32.88
           20.83    12.64    280.58       254                 246.00                 −8.00
           10.75    77.58    193.68       38                  30.17                  −7.83
           6.47     13.35    358.5        163                 188.09                 25.09
           19.07    15.12    162.48       233                 177.82                 −55.18
           20.58    23.03    343.9        248                 254.58                 6.58
           19.76    19.52    190.5        232                 187.06                 −44.94
           21.95    12.31    386.41       317                 300.90                 −16.10
           13.03    15.7     242.06       247                 173.84                 −73.16
           19.13    6.52     315.28       189                 261.29                 72.29
Average:   16.630   18.797   291.194      213.290             213.290                0.000
St.Dev.:   5.638    14.405   101.956      81.659              73.281                 36.029
R:         0.557    0.652    0.648        1.000               0.897

If the data is refined further in terms of covariance, we get the alternative index parameter set shown in Table 7.3. Precipitation is not beneficial during maturity, the last stage of wheat. However, since even a small amount of precipitation indicates cloudiness, it correlates with Tmin in South Eastern Anatolia. Hence, extending the ARF term to all stages provides a similar R² value while using fewer variables in the regression model.


In this case, yield efficiency (YE) can be expressed as

$$YE = 10.24 + 6.367\,VPD_6 - 1.739\,VPD_{12} + 0.446\,ARF_{1234567} \qquad (7.14)$$
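As a quick consistency check, the reduced model of Eq. (7.14) can be applied to the first three records of Table 7.3. The function name is hypothetical, and the small differences from the tabulated predictions presumably come from the rounding of the published coefficients.

```python
def yield_efficiency_eq_7_14(vpd6, vpd12, arf_all_stages):
    """Reduced regression model of Eq. (7.14); result in kg/da."""
    return 10.24 + 6.367 * vpd6 - 1.739 * vpd12 + 0.446 * arf_all_stages


# (VPD6, VPD12, ARF1234567, actual yield efficiency) from Table 7.3.
rows = [
    (6.07, 51.01, 180.26, 38),
    (7.94, 20.54, 374.28, 171),
    (13.03, 9.98, 180.3, 144),
]

predictions = []
for vpd6, vpd12, arf, actual in rows:
    predicted = yield_efficiency_eq_7_14(vpd6, vpd12, arf)
    predictions.append(predicted)
    print(f"predicted {predicted:7.2f}  actual {actual:4d}  "
          f"error {predicted - actual:+7.2f}")
```

The three results agree with the predicted column of Table 7.3 (40.59, 191.99, and 156.25 kg/da) to within a few hundredths.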

The R² value of this reduced form is 0.81, and it is reasonable for mapping since it contains only three agrometeorological indices. When we add NDVI1234567, the accumulation of the NDVI values over all phenological stages from 1 to 7, the R² value becomes 0.826 for the regression-based YE given in (7.15).

$$YE = 34.163 + 5.187\,VPD_6 - 2.088\,VPD_{12} + 0.392\,ARF_{1234567} + 31.58\,NDVI_{1234567} \qquad (7.15)$$

NDVI has its highest correlation in phenological stage 3. When we use NDVI3, the NDVI in phenological stage 3, instead of the accumulation of all NDVI values, the R² value is computed as 0.836 for the regression-based YE given below.

$$YE = 8.046 + 5.559\,VPD_6 - 1.896\,VPD_{12} + 0.386\,ARF_{1234567} + 147.76\,NDVI_3 \qquad (7.16)$$

Equations (7.14), (7.15), and (7.16) demonstrate that proper segmentation of the data enables a feasible data fusion scheme even with regression models. When the amount of data is sufficient to train machine learning models without overfitting, wavelet neural networks (Sect. 7.5) and convolutional neural networks (Sect. 7.6) may improve the fusion performance over linear regression models. In this case, the phenological stage can be used directly as an additional input instead of going through the data segmentation process.

The wheat yield efficiency map of Şanlıurfa province is computed with the regression-based fusion model in (7.16) for dry farming conditions in the year 2015 (Fig. 7.9). Interpolation models are used to compute the monitored temporal parameters (temperature, humidity, etc.) so that the agrometeorological indices (ET0, VPD, etc.) can be computed at each spatial unit within the resolution of the yield efficiency map. The model can also be run at any time before harvest by partly using statistically expected data instead of monitored data, as explained in Sect. 7.3.1 (Fig. 7.5), which naturally reduces the accuracy and increases the tolerance (Eq. 7.1).

In order to estimate the total yield of a crop in a selected area, crop area maps are needed in addition to the yield efficiency maps. One way of generating a crop area map is to use supervised classification methods, as shown in Fig. 7.10. SPOT 6/SPOT 7 VHR satellite images (two per season) are used together with Landsat and Sentinel images as multitemporal spatial data in this example process of the TARBIL project. SPOT 6 and SPOT 7 are identical satellites with 1.5-m resolution and 60 km × 60 km image frames. High spatial resolution improves the recognition of agricultural field boundaries while providing spatial crop pattern signatures at known phenological stages. Multitemporal satellite images enhance the classification performance with respect to the growth rate-related reflection (e.g., NDVI) variation in chronological


Fig. 7.9 Wheat yield efﬁciency map of the Şanlıurfa region for dry farming condition (2015)

Fig. 7.10 Wheat plantation intensity in the Şanlıurfa province as ratio to agricultural ﬁelds in the year 2015


time. Each TARBIL monitoring station has two or three cameras for crop development and phenological observations. Their images are used as ground reference data, together with official records from the farm registry system, for crop pattern classification. Crop yield maps are extracted by combining these two maps: yield efficiency (Fig. 7.9) and crop area intensity (Fig. 7.10). This process is applied separately for irrigated and dry farming conditions. The overall crop yield of the province is then computed from the plantation intensity and the yield efficiencies for both irrigated and dry farming conditions. Harvest records from 275 harvesting machines equipped with GPS-coordinated monitoring devices in the main cereal basins of Turkey were used for the additional calibration of the model in 2015. Additional statistical calibration improves the estimation (nowcasting) accuracy because of several nonnatural factors, including regional changes in harvesting methods and seed variety preferences.

Overfitting is one of the critical issues in estimations based on machine learning or statistical regression models. It occurs when a model is tuned so closely to the training examples that it cannot generalize well to the validation or test sets. A symptom of overfitting is a model that fits the training data almost perfectly but reaches, for example, only half of that accuracy on the test data. The ratio between the number of parameters and the size of the acquired dataset is important for avoiding overfitting, depending on the nonlinearity of the system. For this reason, regression models that use fewer parameters should be preferred when the correlation rates are equal or close.
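One common way to make this trade-off explicit is the adjusted R², which penalizes each additional predictor. The chapter itself does not use this statistic; it is introduced here only as an illustrative sketch, applied to the R² values reported for Eqs. (7.13) to (7.16) with the 31 records of the example dataset.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R-squared for a linear model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)


N_RECORDS = 31  # observation records in the example dataset

# (reported R-squared, number of predictors) for each model in the text.
models = {
    "Eq. 7.13": (0.83, 5),
    "Eq. 7.14": (0.81, 3),
    "Eq. 7.15": (0.826, 4),
    "Eq. 7.16": (0.836, 4),
}

adjusted = {name: adjusted_r2(r2, N_RECORDS, k)
            for name, (r2, k) in models.items()}
for name, value in adjusted.items():
    print(f"{name}: adjusted R2 = {value:.3f}")
```

Under this measure the four models lie within about 0.02 of each other, which supports preferring the compact forms for mapping.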

7.4 Neural Networks for Data Fusion

The main advantage of neural networks is that they can learn from past experience, which allows them to adapt to changes. Neural network (NN) models represent a wide class of flexible nonlinear models that are used for purposes such as classification, pattern recognition, clustering, anomaly detection, and forecasting. It has been shown that properly defined NN models achieve more reliable predictions than conventional regression methods (Sarmadian and Mehrjardi 2008). In addition, NNs provide better solutions when they are applied to complex systems that may be poorly understood by traditional analytical methods (Tokar and Markus 2000). NNs may be described as a network of interconnected neurons (nodes). Each neuron consists of several input nodes and an output node. The problem specifies the required number of neurons in the input layer, hidden layer, and output layer of the neural network. Commonly used simple neuron models compute the output of a neuron from the weighted sum of all its inputs according to an activation function. The arrangement of the data has a significant impact on the results obtained from the trained network. There is not yet a unique neural network structure that fits every kind of pattern recognition problem. For this reason, evaluation of the prior information about time


Fig. 7.11 A simpliﬁed neuron model used in fully connected layers

varying and statistical characteristics of the probable signals to be applied, and hence of the relevant interacting systems, is important for selecting the proper neural network type before defining its internal organization. The artificial neural network (ANN) is a computational method for accomplishing a variety of tasks in NN applications. ANN approaches offer algorithms for network training based on supervised learning, unsupervised learning, or reinforcement learning, depending on the type of problem. Basically, supervised learning provides the ability to learn the input-output correlation by training the network to produce the previously labeled (known) output for a given input, so that predictions can be made when an unknown (test) dataset is applied to the input. The block diagram of a single-input neuron is shown in Fig. 7.11. The scalar input $a_i$ is multiplied by the scalar weight $w_i$. The other input is the bias b, a constant offset added in the summation for better fitting to the input variation, similar to the intercept in linear regression. The summer output n, often referred to as the net input, goes into a transfer function f, which produces the scalar neuron output c:

$$c = f\left(\sum_i w_i a_i + b\right) \tag{7.17}$$
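As a minimal sketch of Eq. (7.17), the neuron output can be evaluated in a few lines of Python. The function names `neuron_output` and `log_sigmoid`, as well as the sample inputs, weights, and bias, are illustrative assumptions and are not taken from the TARBIL models; the log-sigmoid transfer function used as the default is the one discussed in the following paragraph.

```python
import math


def log_sigmoid(x):
    # Log-sigmoid transfer function: maps any real net input into (0, 1).
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, bias, transfer=log_sigmoid):
    # Net input n: weighted sum of the inputs plus the bias (constant offset).
    n = sum(w * a for w, a in zip(weights, inputs)) + bias
    # Neuron output c = f(n), following Eq. (7.17).
    return transfer(n)


# Hypothetical example values (not from the chapter).
a = [0.5, -1.0, 2.0]
w = [0.4, 0.3, -0.2]
b = 0.1
c = neuron_output(a, w, b)  # net input is approximately -0.4
```

Passing a different `transfer` argument (for example, an identity function) shows that the weighted sum and the transfer function are independent parts of the neuron model.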

The activation function that maps a neuron's net input "n" to its actual output "c" is known as the transfer function. The number of neurons in the input and output layers of an ANN is specified by the problem for which the network is constructed. A neuron computes its output by applying an activation function to the weighted sum of all its inputs. The log-sigmoid function is a commonly used activation function. The log-sigmoid transfer function maps the output into the range of 0 to 1 according to Eq. (7.18):

$$f(x) = \frac{1}{1 + e^{-x}} \tag{7.18}$$

where x represents the weighted sum of inputs to the neuron and f(x) the output of the neuron. Various types of activation functions have been proposed as alternatives to the log-sigmoid in Eq. (7.18). A widely used one is the rectified linear unit (ReLU) activation function $f(x) = x^{+} = \max(0, x)$ [40]. The topology of the network is determined by the number of sensitive parameters and their change in time or space, the nonlinearity rate of the system, the available training data size, and the feature distribution. An ANN structure with one hidden layer and a time-delayed signal input x(t) is shown in Fig. 7.12. The input signal, x(t), is sampled at


B. Üstündağ

Fig. 7.12 Time delay network example for processing of temporal data

the periods of "T". "$z^{-1}$" represents the time shift operation in discrete time systems. Therefore x(t), x(t − T), x(t − 2T), x(t − 3T), and x(t − 4T) are the input values representing the current and the last four values of x(t) for the example ANN architecture. The minimum required number of hidden layers depends on the nonlinearity rate of the input data in feature space for separability. On the other hand, the optimal number of neurons in the hidden layer is proposed by Patterson (1998) as

$$q = \frac{N}{10 \cdot (m + p)} \tag{7.19}$$

where q is the suggested number of neurons in the hidden layer, m is the number of input layer neurons, p is the number of output layer neurons, and N is the number of observations in the training dataset. A time delay neural network (TDNN) based data fusion example is given here in order to explain the structural approach. The most important natural resource in agriculture is water. Water is the main limitation for biomass production under different soil, climate, and ecological conditions. Soil type and physical structure affect the evaporation, storage, and discharge of water. Although soil moisture mapping enables optimal irrigation planning, it is currently not feasible to locate sensors at every grid point of the map. On the other hand, the fusion of spatial and temporal data enables multitemporal soil moisture mapping for optimal irrigation scheduling. The water balance equation indicates that the irrigation water requirement (IR) is a function of the crop evapotranspiration (ETc) (mm), the change in soil moisture (ΔS) (mm) at the root zone, and the precipitation (P) (mm) (Frenken and Gillet 2012).


Fig. 7.13 A data fusion model example for large-scale plant root zone soil moisture estimation

Fig. 7.14 Time series NDVI (sNDVI) generation model

$$IR = ET_c - P - \Delta S \tag{7.20}$$

Although ETc is a crop-specific parameter, it can be calculated from the reference evapotranspiration (ET0), which depends on meteorological parameters (Allen et al. 1998):

$$ET_c = K_c \cdot ET_0 \tag{7.21}$$

where Kc is a crop-specific coefficient depending on the phenological stage. ΔS is affected by the soil water capacity as well as by the difference between crop water consumption and the total water intake. It is also known that soil moisture affects some remote sensing indices, such as the normalized difference moisture index (NDMI) and the NDVI. Even when crop-specific parameters and models are not known, a data fusion scheme can still be constructed in different ways by using the existing data records and the crop cover map as context (Kulaglic and Üstündağ 2014). One such scheme, shown in Fig. 7.13, is intended to nowcast root zone soil moisture from precipitation, ET0, irrigation, and the NDVI.
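The arithmetic of the water balance can be sketched as follows, taking Eq. (7.20) in the form IR = ETc − P − ΔS and Eq. (7.21) as ETc = Kc · ET0, with all quantities in mm. The function names and the numeric values of Kc, ET0, P, and ΔS are hypothetical and serve only to illustrate the calculation.

```python
def crop_evapotranspiration(kc, et0_mm):
    # Eq. (7.21): ETc = Kc * ET0
    return kc * et0_mm


def irrigation_requirement(etc_mm, precipitation_mm, delta_s_mm):
    # Eq. (7.20), assumed sign convention: IR = ETc - P - dS
    return etc_mm - precipitation_mm - delta_s_mm


# Hypothetical daily values in mm (Kc is dimensionless).
etc = crop_evapotranspiration(kc=1.15, et0_mm=6.0)   # about 6.9 mm
ir = irrigation_requirement(etc, precipitation_mm=2.0, delta_s_mm=1.5)  # about 3.4 mm
```

In this sketch a negative result would simply mean that precipitation and the soil moisture change already cover the crop demand; how such cases are handled in practice is not specified in the text.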


Fig. 7.15 Time delay neural network fusion model for root zone soil moisture estimation

ET0, P, and IR are temporal data patterns, whereas NDVI is spatial data. Another type of data, named synthetic NDVI (SNDVI), is derived with a regression model that exploits the high correlation between NDVI and the fraction of vegetation cover (FVC), as shown in Fig. 7.14. FVC is calculated from TARBIL monitoring station camera images that are acquired at 30-min intervals. Hence, SNDVI is generated as a spatiotemporal parameter to be used instead of NDVI. Indirectly, it fills the time gap between two remotely sensed NDVI values. A TDNN structure used for spatiotemporal root zone soil moisture estimation is shown in Fig. 7.15. The SM15 and SM45 outputs represent the estimated soil moisture at 15-cm and 45-cm depths, respectively. The previously estimated soil moisture is used as an input by shifting the data by the sampling period T. The purpose of this nowcasting or forecasting fusion scheme is to obtain accurate mapping of root zone soil moisture data by using interpolatable monitoring data

[Fig. 7.16, panels (a) and (b): vertical axis in centibars (cb), 0 to 250; legend: soil moisture 45cm, predicted soil moisture 45cm]

Fig. 7.16 Monitored and predicted soil moisture at 45-cm depth at two different locations within the same time interval, crop type, and the region (a) training site and (b) test site patterns

(remote sensing and in situ) and the reference soil moisture data available only at some reference points. The neural network is trained with the reference soil moisture data. This kind of mapping is proposed for adaptive irrigation planning. The same method can also be used for missing data reconstruction at reference measurement points. When the sampling period T is chosen as 1 day, the reference evapotranspiration (ET0), SNDVI, irrigation, and precipitation patterns are used as 5-point data sequences, each representing the values of today and the last 4 days. Soil moisture was measured in centibars (cb) in the training datasets, and the soil moisture estimates in the test set example (Fig. 7.16) are also in centibars. Here, 200 cb indicates the minimum moisture (dry soil) and 0 cb indicates the maximum moisture (wet soil). It should be noted that different crop types may require different training sets. On the other hand, if the field-specific crop type is known, either from remote sensing–based cropland cover classification methods or from farm registry systems, then this kind of fusion scheme can be used for crop-specific estimations in the region. If the integration level goes down to irrigation automation systems, or the farmer informs the system of the daily irrigation amount, then it can be used for irrigation schedule optimization.


In the TARBIL project, the reference observation network was dense enough to represent regional soil structure variation. Besides the common monitoring network sensors in commercially available integrated agricultural services, some farmers may have their own sensor sets for receiving the processed data service with higher local precision. In this case, continuous training can also be applied for different contexts, such as parcel-based crop varieties, as long as data labels for training exist through feedback from the field. For this reason, an efficient agricultural information system must consider well-designed and sustainable data exchange and acquisition scenarios. This data fusion scheme is given as an example for understanding TDNN structures. It can be implemented in different ways. Wavelet features, as explained in the following section, can be used instead of the time-domain sequence. In this case, ET0(t), ET0(t − T), ET0(t − 2T), ET0(t − 3T), and ET0(t − 4T) should be replaced by their wavelet coefficients in the same moving time frame, a0, a1, a2, a3, and a4. The transformation of the input data patterns to the wavelet domain usually improves prediction performance (reduces the error rate) in natural systems. The same time series data pattern can also be applied to the input at different time frames, e.g., hours and days, depending on the varying short-term and long-term correlations. It is possible to establish several different data fusion models by exchanging some of the input and output variables, as long as there are some known relationships and an available training dataset.
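The construction of the time-delayed input vector described in this section can be sketched as below, assuming a sampling period T of 1 day and five taps per channel. The helper names (`delayed_window`, `tdnn_input`, `suggested_hidden_neurons`), the channel values, and the training set size N = 4400 are hypothetical; the feedback of previously estimated soil moisture shown in Fig. 7.15 is omitted for brevity. The last function evaluates the hidden layer size suggested by Eq. (7.19).

```python
def delayed_window(series, t, taps=5):
    """Return [x(t), x(t-T), ..., x(t-(taps-1)T)] from a series sampled every T."""
    if t < taps - 1 or t >= len(series):
        raise IndexError("time index needs taps-1 earlier samples")
    return [series[t - k] for k in range(taps)]


def tdnn_input(channels, t, taps=5):
    """Concatenate the delayed windows of all channels (sorted by name)."""
    vector = []
    for name in sorted(channels):
        vector.extend(delayed_window(channels[name], t, taps))
    return vector


def suggested_hidden_neurons(n_obs, n_inputs, n_outputs):
    """Eq. (7.19): q = N / (10 * (m + p))."""
    return n_obs / (10 * (n_inputs + n_outputs))


# Hypothetical daily records (T = 1 day); values are made up for illustration.
channels = {
    "et0":           [4.1, 4.3, 4.0, 4.6, 5.0, 5.2, 4.8],
    "sndvi":         [0.52, 0.53, 0.55, 0.56, 0.58, 0.59, 0.61],
    "irrigation":    [0.0, 0.0, 20.0, 0.0, 0.0, 0.0, 15.0],
    "precipitation": [0.0, 3.2, 0.0, 0.0, 0.0, 1.1, 0.0],
}
x = tdnn_input(channels, t=6)  # 4 channels x 5 taps = 20 inputs
# Two outputs (SM15 and SM45) and an assumed N = 4400 training observations.
q = suggested_hidden_neurons(n_obs=4400, n_inputs=len(x), n_outputs=2)
```

Replacing each five-sample window by its wavelet coefficients, as suggested above, would change only `delayed_window`; the rest of the vector assembly stays the same.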

7.5 Wavelets in Data Fusion

The wavelet transformation is widely used in signal processing and image compression. For example, in the JPEG2000 standard, lossless compression is provided by the use of a reversible integer wavelet transformation. In data fusion processes, an

Fig. 7.17 Estimation and adaptation based on the wavelet features of the input dataset


Fig. 7.18 Representation of a signal pattern (a) as amplitude variation in time, (b) frequency spectrum as Fourier transform, (c) short-time Fourier transform, and (d) wavelet transform (scale value)

alternative to the time-domain or spatiotemporal variation of the input data is to use their wavelet features. This usually improves the estimation performance of the system (Fig. 7.17). For this reason, the wavelet transformation is one of the most widely used methods nowadays. Wavelets are a relatively new way of analyzing signal and data patterns. They are a synthesis of older ideas with new mathematical results and efficient computational algorithms (Percival and Walden 2000). A one-channel digital signal in the discrete time domain is also considered as 1D time series data (Fig. 7.18a). The wavelet transformation can produce a good local representation of the signal in both the time and frequency domains (Fig. 7.18d), while the Fourier transform can only provide a frequency representation (Fig. 7.18b) (Li et al. 2002). It provides considerable information about the structure of the physical process to be modeled (Partal and Cigizoglu 2009). The wavelet transformation is more effective than the Fourier transform in the analysis of nonstationary time series. The Fourier transform utilizes sine and cosine functions as its basis. Unlike the Fourier transform, wavelet transforms do not have a single set of basis functions. Instead, wavelet transforms have an infinite set of possible basis functions. Wavelets are thus a class of functions used to localize a signal pattern in both the time and frequency domains (Percival and Walden 2000). A Fourier coefficient represents a component that lasts for all time, and temporary events must be described by a phase characteristic that allows cancellation or reinforcement over large time periods. A wavelet expansion coefficient represents a component that is itself local and is easier to interpret. Wavelets are close to optimal for a wide class of signals for compression, denoising, and detection (Donoho 1993; Donoho et al. 1995). Except for the first, the basis functions are derived from one function, called the "mother wavelet," by scaling and shifting.
The mother wavelets are representations of components with low scale and high frequency. The simplest and a commonly used mother wavelet is the Haar mother wavelet function (Burrus et al. 1998). The scaling function, called the father wavelet, represents the high-scale, low-frequency wavelet components. The continuous wavelet transform (CWT) is a convolution of the input data sequence with a set of functions generated by the mother wavelet. Discrete wavelet


Fig. 7.19 (a) Haar scaling function Φ(t) and (b) Haar wavelet function Ψ(t)

transformation (DWT) depends on a similar convolution in discrete time. DWT coefficients W(j, k) can be given as

$$W(j, k) = \sum_{n=0}^{N-1} f(n)\, \psi_{j,k}^{*}(n) \tag{7.22}$$

where f(n) is a sequence of length N and $\psi_{j,k}(n)$ is the discretized mother wavelet function. The superscript * denotes a complex conjugate (Smith et al. 1998). The Haar scaling function ϕ(t) and the wavelet function ψ(t) (Fig. 7.19) are defined in Eqs. (7.23), (7.24), and (7.25).

$$\phi(t) = \begin{cases} 1, & \text{if } 0 \le t < 1 \\ 0, & \text{otherwise} \end{cases} \tag{7.23}$$

$$\psi(t) = \phi(2t) - \phi(2t - 1) \tag{7.24}$$

$$\psi(t) = \begin{cases} 1, & \text{for } t \in \left[0, \tfrac{1}{2}\right) \\ -1, & \text{for } t \in \left[\tfrac{1}{2}, 1\right) \\ 0, & \text{otherwise} \end{cases} \tag{7.25}$$

Convolution with wavelets at certain frequencies corresponds to bandpass filtering. The high-pass and low-pass finite impulse responses obtained with the Haar mother wavelet have only two samples ($G = \left[\tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}}\right]$ and $H = \left[\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right]$), which are the shortest possible wavelet filters (Burrus et al. 1998). In practice, the DWT can be implemented using the dyadic filter tree algorithm, which represents a wavelet basis as a bank of high-pass filters (HPF) and low-pass filters (LPF), as shown in Fig. 7.20 (Burrus et al. 1998). There are several different DWT implementation schemes depending on the filter bank structures and the mother wavelets. A very simple example uses the average and difference of the even and odd values in the sampled signal sequence. As an example, suppose we are given a 1D image consisting of four pixels, A = [8 6


Fig. 7.20 A discrete wavelet transform scheme by using ﬁlters

Table 7.4 Wavelet coefficients in terms of the details and the average value for the example sequence

Resolution   Average        Detail coefficients
4            [8 6 2 4]      []
2            L1 = [7 3]     H1 = [1 −1]
1            L2 = [5]       HL2 = [2]

2 4]. If we apply the DWT process shown in Fig. 7.20, then we can go down to the second layer, which yields L2 and HL2, because there are four samples in the sequence. The initial resolution is four, and the respective average column is the data sequence itself, as shown in Table 7.4. If we simply use the average as the low-pass filter and half of the difference as the high-pass filter, then we get the data in the row of resolution 2. Here, 7 is the average of 8 and 6, and 3 is the average of 2 and 4. The first set of details in H1 contains 1 as half of the difference between 8 and 6 (i.e., (8 − 6)/2 = 1), and it contains −1 as half of the difference between 2 and 4 (i.e., (2 − 4)/2 = −1). If we continue this filtering iteration, L2 will be the last average, 5. It is also the average of all data in the sequence and is referred to as the wavelet coefficient a0. The last detail is HL2 = (7 − 3)/2 = 2. Hence a Haar wavelet representation of the pixel sequence A corresponds to [L2 HL2 H1] = [5 2 1 −1]. This process is reversible, thus enabling the recovery of the original signal sequence. As this simple example demonstrates, computing a wavelet transform does not require calculus. There are no derivatives or integrals, only multiplication and addition operations. For this reason, the generation of wavelets and the calculation of the discrete wavelet transform are very efficient on


Fig. 7.21 A data fusion model based on wavelet neural network

digital computational devices. An efficient and widely used discrete wavelet implementation method is the lifting scheme (Sweldens 1997). The lifting scheme enables the exact inversion of the transform, and every reconstructable filter bank can be expressed in terms of lifting steps. In wavelet data fusion schemes, the different parameter sets used as input information are first converted into wavelet feature sets. The input data patterns shown in Fig. 7.21, s1, s2, ..., si, can be in the time, space, or frequency domain. Their wavelet features generate an equivalent amount of data, which is applied to fully connected neural network layers. If there exists a priori information about the correlation and independence rates of the wavelet features, then some of them can be eliminated. Reducing the size of the input data vector to the neural network sometimes reduces overfitting problems in prediction that arise from the total dataset size relative to the nonlinearity rate of the system.
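The averaging and differencing example for the sequence A = [8 6 2 4] (Table 7.4) can be reproduced with the short sketch below. It uses the plain average and half-difference of each pair, as in the worked example, rather than the orthonormal Haar filters with the 1/√2 factor; the function names are illustrative, and the input length is assumed to be a power of two.

```python
def haar_decompose(samples):
    """Repeated pairwise average (low-pass) and half-difference (high-pass)."""
    averages = list(samples)
    details = []
    while len(averages) > 1:
        pairs = [(averages[i], averages[i + 1]) for i in range(0, len(averages), 2)]
        level_details = [(a - b) / 2 for a, b in pairs]
        averages = [(a + b) / 2 for a, b in pairs]
        details = level_details + details  # coarser details are placed in front
    return averages + details


def haar_reconstruct(coeffs):
    """Invert haar_decompose: each (average, detail) pair restores two samples."""
    averages = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        level_details = coeffs[pos:pos + len(averages)]
        pos += len(averages)
        restored = []
        for avg, det in zip(averages, level_details):
            restored.extend([avg + det, avg - det])
        averages = restored
    return averages


coeffs = haar_decompose([8, 6, 2, 4])   # [L2, HL2, H1] = [5.0, 2.0, 1.0, -1.0]
recovered = haar_reconstruct(coeffs)    # [8.0, 6.0, 2.0, 4.0]
```

The round trip through `haar_reconstruct` illustrates the reversibility noted in the text: only additions, subtractions, and divisions by two are involved.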

7.6 Convolutional Neural Networks

Wavelets provide extraction of convolutional features depending on the shift and scale of a mother wavelet function. Instead of classifying wavelet features with a neural network, a newer trend that significantly improves the classification performance of data patterns is the use of convolutional neural networks (CNN). Convolution is a widely used technique in signal processing, image processing, and other engineering fields. It is defined as the integral of the product of two functions


Fig. 7.22 CNN scheme for image classiﬁcation

after one is reversed and shifted over the other, Eq. (7.26). Since the convolution integral is a measure of the overlap between two functions, it is also an indicator of the similarity of the patterns. In signal processing, the convolution of two time-domain functions, f(t) and g(t), is defined as

$$(f * g)(t) = \int_{-\infty}^{\infty} f(\tau)\, g(t - \tau)\, d\tau \tag{7.26}$$

Deep convolutional networks have multiple convolution, pooling, and activation layers. The number of weights to be trained in a deep learning process can reach the order of millions. Deep convolutional neural networks start the process with convolutional decomposition and usually end with fully connected neural network layers (Fig. 7.22) for the classification outputs. Convolution in deep learning is similar to cross-correlation in signal or image processing (Goodfellow et al. 2016). A widely used CNN architecture example consisting of two convolutional and pooling layers, a fully connected layer, and a logistic regression classifier is shown in Fig. 7.22. It can be used to predict whether or not a satellite image patch belongs to a specific crop type. The two-dimensional (discrete) convolution of the 2D data patterns represented by matrix A and matrix B is calculated as A*B = C, where

$$C[m, n] = \sum_{u} \sum_{v} A[m + u, n + v] \cdot B[u, v] \tag{7.27}$$

Each element of C is computed as the sum of the element-wise products of B with the equally sized window of A that starts at position [m, n]. This structure of convolution is efficiently processed on digital signal processors (DSPs), since it can be implemented in terms of MAC (multiply-accumulate) type instructions. On the other hand, graphics processing units (GPUs) support this type of operation in a parallel architecture consisting of GPU cores. GPU cards have become critical hardware for meeting the performance requirements of deep learning applications.
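To make the multiply-accumulate structure of Eq. (7.27) explicit, the double sum can be written as nested loops in plain Python. This is only a sketch: the function name `correlate2d_valid` and the small arrays are hypothetical, the kernel is applied with a stride of one, and no zero padding is used, so the output is smaller than the input.

```python
def correlate2d_valid(a, b):
    """Direct evaluation of C[m][n] = sum_u sum_v A[m+u][n+v] * B[u][v]."""
    rows = len(a) - len(b) + 1
    cols = len(a[0]) - len(b[0]) + 1
    c = [[0] * cols for _ in range(rows)]
    for m in range(rows):
        for n in range(cols):
            acc = 0
            for u in range(len(b)):
                for v in range(len(b[0])):
                    acc += a[m + u][n + v] * b[u][v]  # one multiply-accumulate step
            c[m][n] = acc
    return c


# Hypothetical 3x3 input and 2x2 kernel.
data = [[1, 2, 0],
        [0, 1, 3],
        [4, 0, 1]]
kernel = [[1, 0],
          [0, 1]]
result = correlate2d_valid(data, kernel)  # [[2, 5], [0, 2]]
```

Each output element requires one multiply-accumulate step per kernel element, and the output elements are independent of each other, which is why the computation maps well onto parallel hardware.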


If we assign A and B as an application example of Eq. (7.27) as

$$A = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 1 & 1 & 1 & 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$$

then one type of computation of A*B is given in the Python code listing below, which uses the open source SciPy library (www.scipy.org):

    import numpy as np
    from scipy import signal

    # 5x5 input data pattern
    A = np.array([[1,0,0,0,0],[1,1,0,0,0],[1,0,1,0,0],[1,0,0,1,0],[1,1,1,1,1]])
    # 3x3 kernel (filter)
    B = np.array([[1,0,1],[0,1,0],[1,0,1]])
    # 'valid' keeps only the positions where B lies fully inside A (3x3 result)
    C = signal.convolve2d(A, B, 'valid')
    print(C)

Because B is unchanged by a 180° rotation, the kernel flip performed by convolve2d has no effect here, and the result coincides with Eq. (7.27). In this example, the element C(2,1) (second row, first column) is calculated as 1∙1 + 1∙0 + 0∙1 + 1∙0 + 0∙1 + 1∙0 + 1∙1 + 0∙0 + 0∙1 = 2. Here, B is referred to as the kernel or filter in CNN applications, since it is convolved with the input data A and its size is smaller than that of A. Zero padding is an optional technique that allows the kernel to be centered on the border elements of the matrix during the convolution. It simply adds dummy zeros, equally distributed all around the matrix A, so that the border elements of A (A(1,1), A(1,2), ...) can be centered under B during the convolution. After each convolution layer, it is a convention to apply an activation layer. This layer introduces nonlinearity to a system that has basically just been computing linear operations in the convolutional layers. Log-sigmoid type activation functions, as shown in Eq. (7.18), were widely used in the past. Recently, rectified linear unit (ReLU) layers have been preferred in CNNs, since the network is able to train faster without a significant difference in accuracy. The ReLU layer applies the function f(x) = max(0, x) to all of the values in the input volume. In basic terms, this layer changes all the negative activations to 0. ReLU also mitigates the vanishing gradient problem, which is the issue where the lower layers of the network train very slowly because the gradient decreases exponentially through the layers. The ReLU layers are commonly followed by a pooling layer, which is also referred to as a down-sampling layer. There are several pooling options, with max-pooling being the most popular. This takes a filter (normally of size 2×2) and a stride of the same length.
It then applies the filter to the input volume and outputs the maximum number in every subregion that the filter covers. Other options for pooling layers are average pooling and L2 norm pooling. The intuitive reasoning behind this layer is that once we know that a specific feature is present in the original input dataset, its exact location is not as important as its location relative to the other features. This layer


reduces the spatial dimension, while the depth does not change. Hence, the number of parameters is reduced, which lowers the computational cost. It also helps to control overfitting. Deep learning is not only used for the classification of one data type, such as cropland cover (Kussul et al. 2017); it also enables the fusion of different types of data layers, either by applying different kinds of decomposed datasets to a common fully connected network, similar to Fig. 7.21, or by merging them before the convolutional decomposition layer. The residual neural network (ResNet) (Kaiming et al. 2015) is a deep learning structure that improves classification performance by utilizing skip connections, or shortcuts, to jump over some layers. DenseNets were proposed in 2016, and they use several parallel skips as an improvement on ResNets. Deep learning provides an efficient solution for the registration of relevant input dataset features that include projection and rotational differences. For example, the scale-invariant feature transform (SIFT) algorithm was used to solve this feature matching problem in past years. Object features are efficiently matched in deep learning structures without the need for additional preprocessing methods (Sachdeva et al. 2017). This property makes deep learning an efficient candidate, especially for spatial and spatiotemporal data fusion processes. A widespread way of implementing deep learning services is to use dedicated frameworks. TensorFlow, Keras, PyTorch, Caffe, Theano, Apache MXNet, and Microsoft CNTK were widespread deep learning frameworks as of 2018. All of these frameworks are open source. Google developed TensorFlow (www.tensorflow.org), which is known for an architecture that allows computation on any CPU or GPU, whether on a desktop, a server, or even a mobile device. This framework is available in the Python programming language, and it also has a C++ API.
Depending on the increasing number of evaluation parameters and the corresponding data size, the deep neural network (DNN) node population may reach the order of millions. Computational complexity especially increases the training time. There are also methods for reducing the time and space complexity of DNNs. Weight pruning is one of the trends for this purpose. It has been shown that proper pruning of computational connections (weights) in DNNs may reduce the computational complexity by more than 90%, while the accuracy loss remains negligible (Zhang et al. 2018).
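Returning to the activation and pooling layers described earlier in this section, their effect on a single feature map can be sketched in plain Python. The function names and the 4×4 feature map are hypothetical; pooling uses a 2×2 filter with a stride of the same length, and any trailing rows or columns that do not fill a complete window are dropped in this sketch.

```python
def relu(feature_map):
    """Apply f(x) = max(0, x) to every value, setting negative activations to 0."""
    return [[max(0, x) for x in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping max-pooling: filter size and stride are both `size`."""
    rows = len(feature_map) // size
    cols = len(feature_map[0]) // size
    pooled = []
    for r in range(rows):
        pooled_row = []
        for c in range(cols):
            window = [feature_map[r * size + i][c * size + j]
                      for i in range(size) for j in range(size)]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Hypothetical 4x4 output of a convolution layer.
fmap = [[ 1, -2,  3,  0],
        [-1,  5, -3,  2],
        [ 0, -4,  2, -1],
        [ 6,  1, -2, -5]]
activated = relu(fmap)
pooled = max_pool(activated)  # [[5, 3], [6, 2]]
```

The 4×4 map is reduced to 2×2, which illustrates how pooling lowers the spatial dimension passed to the following layers while keeping the strongest response in each subregion.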

7.7 Conclusion

Food security and agricultural resource sustainability problems raise the importance of optimal management decisions at all levels of agricultural production. Increasing the crop monitoring accuracy requires more detailed observation data, whose collection is restricted by feasibility. Data fusion is an increasing trend for information harvesting, as data availability and the sensor population increase exponentially along with computational power. Every crop variety has different growth model parameters that require various soil and agrometeorological data for their computation. When the


direct monitoring is either not feasible or the required resolution is technically not possible, data fusion methods can help to extract the necessary information. Available data from remote sensing satellites, global positioning satellites, on-the-field near-sensing instruments, administrative registration systems, and in situ wireless sensor networks can be processed in fusion models. In order to avoid overfitting or underfitting problems, choosing the proper input dataset from feasible resources is important for the training or adaptation of data fusion models. Precision agriculture has been recognized as a promising approach for increasing the yield efficiency. Integration of sensor-based data streams for precision agriculture and of vertical agricultural systems, from the field up to the basin resource management level, is an important factor for sustainable efficiency optimization and policy management. System integration costs motivate Platform as a Service (PaaS) and Data as a Service (DaaS) solutions, including cloud computing, web services, and wireless communication between sensors and devices. When it is not feasible to gather the necessary data directly, proper data fusion schemes operated on PaaS or DaaS structures enable the computation of equivalent parameters from other types of correlated existing datasets while maintaining computational cost efficiency. In this chapter, we have given estimated yield efficiency mapping based on agrometeorological indices and remote sensing data as one data fusion example. Another example is given to demonstrate how time delay neural networks can be used to estimate root zone soil moisture. Root zone soil moisture cannot be observed directly from radar satellites because of the limited penetration depth of the radar energy. It should be noted that the actual root zone soil moisture is only used in the training dataset, and neither surface soil moisture nor air humidity is used as an input in this fusion example.
Convolutional features usually improve the fusion performance of data patterns in time or space. Wavelets are essentially based on convolutions. Deep learning with a CNN architecture also mainly depends on convolutional features. Deep convolutional networks are a rapidly developing area in machine learning for a wide range of applications, including data fusion. The management of nonlinearity against overfitting is one of the key issues in both machine learning and regression-based data fusion methods. Increasing the number of parameters in the big data environment requires a strategy that depends on the correlation, the covariance rate, and the population of the relevant training data subset. Phenological stage-based data segmentation provides piecewise linearity in regression-based plant growth-related estimations. The R² value exceeds 0.8 in the given regional yield forecast mapping example, which depends on three observation parameters gathered from remote sensing images and agrometeorological measurements. As the labeled observation data size increases on integrated service platforms, machine learning fusion models tend to take the place of analytic model-based fusion schemes. The evolution and proliferation of user application connected platform services are expected to provide a cost-effective use of computational resources and relevant software as additional services. A multilayered service structure brings together fused spatial, temporal, and spatiotemporal data; computational resources; GIS services; machine learning; and analytic tools. Hence vertical integration not only improves prediction performance, but it can also ease


service-oriented software development for smartphone and IoT applications as well as large-scale management systems.

Acknowledgments This work was supported by the Republic of Turkey Ministry of Development within the Agricultural Monitoring and Information Systems Project (TARBIL, Pr.no:2011A020090).

Appendix

References

Allen, R. G., Pereira, L. S., Raes, D., & Smith, M. (1998). Crop evapotranspiration: Guidelines for computing crop water requirements. FAO Irrigation and Drainage Paper 56. FAO, Rome, 300(9), D05109.
Altan, M. T., & Üstündağ, B. B. (2012). Reconstruction of missing meteorological data using wavelet transform. IEEE First Agro-geoinformatics conference, Shanghai. https://doi.org/10.1109/Agro-Geoinformatics.2012.6311644.
Amrawat, T., Solanki, N. S., Sharma, S. K., Jajoria, D. K., & Dotaniya, M. L. (2013). Phenology growth and yield of wheat in relation to agrometeorological indices under different sowing dates. African Journal of Agricultural Research, 8(49), 6366–6374. https://doi.org/10.5897/AJAR2013.8019.
Bagis, S., & Üstündağ, B. B. (2012). Image based automated phenological stage detection of cereal plants. Agro-geoinformatics conference, Shanghai. https://doi.org/10.1109/Agro-Geoinformatics.2012.6311643.
Bagis, S., Üstündağ, B. B., & Ozelkan, E. (2012). An adaptive spatiotemporal agricultural cropland temperature prediction system based on ground and satellite measurements. In First Agro-geoinformatics conference, Shanghai. https://doi.org/10.1109/Agro-Geoinformatics.2012.6311642.
Balaghi, R., Tychon, B., Eerens, H., & Jlibene, M. (2008). Empirical regression models using NDVI, rainfall and temperature data for the early prediction of wheat grain yields in Morocco. International Journal of Applied Earth Observation and Geoinformation, Elsevier, 10, 438–452. https://doi.org/10.1016/j.jag.2006.12.001.
Bazgeera, S., Kamalib, G., & Mortazavic, A. (2007). Wheat yield prediction through agrometeorological indices for Hamedan, Iran. BIABAN (Desert Journal), 12, 33–38.
Burrus, C. S., Gopinath, R. A., & Guo, H. (1998). Introduction to wavelets and wavelet transforms: A primer. Prentice-Hall.
Castandeo, F. (2013). A review of data fusion techniques. The Scientific World Journal, 2013, Article ID 704504, 19 pages. Hindawi Publishing Corporation. https://doi.org/10.1155/2013/704504.
Dong, C., Hu, D., Fu, Y., Wang, M., & Liu, H. (2014). Analysis and optimization of the effect of light and nutrient solution on wheat growth and development using an inverse system model strategy. Computers and Electronics in Agriculture, 109, 221–234. https://doi.org/10.1016/j.compag.2014.10.013.
Donoho, D. L. (1993). Unconditional bases are optimal bases for data compression and for statistical estimation. Applied and Computational Harmonic Analysis, 1(1), 100–115. Also Stanford Statistics Dept. Report TR-410, Nov. 1992.
Donoho, D. L., Johnstone, I. M., Kerkyacharian, G., & Picard, D. (1995). Wavelet shrinkage: Asymptopia? Journal Royal Statistical Society B, 57(2), 301–337. Also Stanford Statistics Dept. Report TR-419, March 1993.

140

B. Üstündağ

Dubey, R. P., Kalubarme, M. H., Jhorar, O. P., & Cheema, S. S. (1987). Wheat yield models and production estimates for Patiala and Ludhiana districts based on Landsat – MSS and Agro meteorological data (Scientiﬁc Note. IRS-UP/SAC/CPF/SN/08/87) (pp. 1–34). Ahmadabad: Space Applications Center. Flood, N. (2014). Continuity of reﬂectance data between Landsat-7 ETM+ and Landsat-8 OLI, for both top-of-atmosphere and surface reﬂectance: A study in the Australian landscape. Remote Sensing, 6, 7952–7970. https://doi.org/10.3390/rs6097952. Frenken, K., & Gillet, V. (2012, November). Irrigation water requirement and water withdrawal by country. Food and Agriculture Organization of the United Nations, FAO Aquastat Reports. Ghannam, S., Awadallah, M., Abbott, A. L., & Wynne R. H.. (2014). Multisensor multitemporal data fusion using the wavelet transform. In The international archives of the photogrammetry, remote sensing and spatial information sciences, ISPRS technical commission I symposium, 17–20 November 2014, Volume XL-1, Denver, Colorado, USA. Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press, ISBN: 9780262035613. Herndl, M. (2008). Use of modeling to characterize phenology and associated traits among wheat cultivars. Dissertation zur Erlangung des Grades eines Doktors der Agrarwissenschaften (PhD Thesis), Universität Hohenheim. Jin, B., Kim, G., & Cho, N. I. (2014 May 21). Wavelet-domain satellite image fusion based on a generalized fusion equation. Journal of Applied Remote Sensing, 8(1), 080599. https://doi.org/ 10.1117/1.JRS.8.080599. Kaiming, H., Zhang, X., Ren, S., Sun, J. (2015). Deep residual learning for image recognition, arXiv:1512.03385 Khaleghi, B., Khamis, A., Karray, F. O., & Razavi, S. N. (2013). Corrigendum to ‘Multisensor data fusion: A review of the state-of-the-art. Information Fusion, 14(1), 28–44. https://doi.org/10. 1016/j.inffus.2011.08.001. Kulaglic, A., & Üstündağ, B. (2014). 
Estimation of soil moisture proﬁle using wavelet neural networks. In The third international conference on agro-geoinformatics, https://doi.org/10. 1109/Agro-Geoinformatics.2014.6910632. Kussul, N., Lavreniuk, M., Skakun, S., Shelestov, A. (2017). Deep learning classiﬁcation of land cover and crop types using remote sensing data. IEEE Geoscience and Remote Sensing Letters, pp. 1–5. https://doi.org/10.1109/LGRS.2017.2681128. Li, T., Li, Q., Zhu, S., & Ogihara, M. (2002). A survey on wavelet applications in data mining. ACM SIGKDD Explorations Newsletter, 4(2), 49–68. https://doi.org/10.1145/772862.772870. Manos, B., Paparizzos, K., Matsatsinis, K., Papathanasiou, J. (2010). Decision support systems in agriculture, food and the environment. ISI Global, ISBN-13: 978-1615208814. Odoh, M., Chinedum, I. (2014). Estimation theory. IOSR Journal of Computer Engineering (IOSRJCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, 16(6), Ver. II (Nov – Dec. 2014), pp 30–35 Partal, T., & Cigizoglu, H. K. (2009). Prediction of daily precipitation using wavelet—neural networks. Hydrological Sciences Journal, 54(2), 234–246. https://doi.org/10.1623/hysj.54.2. 234. Patterson, D. W. (1998). Artiﬁcial neural networks: Theory and applications”, Prentice Hall, ISBN:978-0-13-295353-5. Percival, D. B., & Walden, A. T. (2000). Wavelet methods for time series analysis. Cambridge Series in Statistical and Probabilistic Mathematics. Rajput, R. P. (1980). Response of soybean crop to climate and soil environments, Doctoral dissertation, Doctoral Thesis, IARI, New Delhi, India Rao, G. S. L. H. V. P. (2003). Agricultural meteorology (pp. 95–112). Thrissur: Director of Extension, Kerala Agricultural University. Rouse J. W., Haas, R. H., Schell, J. A., & Deering, D. W. (1973). Monitoring vegetation systems in the Great Plains with ERTS. In Third ERTS symposium, NASA SP-351 I, 309–317, USA, 1973.

7 Data Fusion in Agricultural Information Systems

141

Sachdeva, V. D., Baber, J., Bakhtyar, M., Ullah, I., Noor, W., & Basit, A. (2017). Performance evaluation of SIFT and convolutional. International Journal of Advanced Computer Science and Applications, 8, 12. Sarmadian, F., & Mehrjardi, R. T. (2008). Modeling of some soil properties using artiﬁcial neural network and multivariate regression in Gorgan Province, North of Iran. Global Journal of Environmental Research, 2, 30–35. Schröder, W., Schmidt, G., & Schönrock, S. (2014). Modelling and mapping of plant phenological stages as bio-meteorological indicators for climate change. Environmental Sciences Europe, 26 (5), 1–13. Smith, L., Turcotte, D., & Isacks, B. (1998). Stream ﬂow characterization and feature detection using a discrete wavelet transform. Hydrological Processes, 12(2), 233–249. https://doi.org/10. 1002/(SICI)1099-1085(199802)12:23.0.CO;2-3. Sweldens, W. (1997). “The lifting scheme: A construction of second generation wavelets” (PDF). Journal on Mathematical Analysis, 29(2), 511–546. https://doi.org/10.1137/ S0036141095289051. Tokar, A. S., & Markus, M. (2000). Precipitation-runoff modeling using artiﬁcial neural networks and conceptual models. Journal of Hydrologic Engineering. Üstündağ, B. B. (2017, February). An adaptive mealy machine model for monitoring crop status. Journal of Integrative Agriculture, Remote Sensing Special Issue, 16(2), 252–265. White, F. E. (1991). JDL, data fusion lexicon, technical panel for C3, F.E. White, San Diego, Calif, USA, Code 420. World Bank. (2019). data.worldbank.com. https://data.worldbank.org/indicator/ag.lnd.arbl.ha.pc Zhang, T., Ye, S., Zhang, K., Tang, J., Wen, W., Fardad, M., & Wang, Y. (2018). A systematic DNN weight pruning framework using alternating direction method of multipliers. European Conference on Computer Vision – ECCV 2018, pp 191–207, Springer, Lecture notes in Computer Science. LNCS, 11, 212.

Chapter 8

Big Data and Its Applications in Agro-Geoinformatics

Liping Di and Ziheng Sun

Abstract Agro-geoinformatics deals with collecting, managing, and analyzing agriculture-related geospatial data, which are domain-specific big data. This chapter discusses the general characteristics of big data, the specific features of agro-geoinformatics and agro-big data, and examples of agro-geoinformatics projects dealing with agro-big data. Once adopted and adapted, general big data technologies are very useful in agro-geoinformatics, but they cannot meet all the technology needs of dealing with agro-big data. The development of agro-big data-specific technology is a necessary supplement to the adoption of general big data technology. Combining the two proves to be a good strategy for applying big data technology in agro-geoinformatics.

Keywords Big data · Agriculture · Agro-geoinformatics · Agro-geodata · Agro-big data

L. Di (*) · Z. Sun
George Mason University, Fairfax, VA, USA

© Springer Nature Switzerland AG 2021
L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_8

8.1 Introduction

8.1.1 Challenges in Modern Agriculture

The world population was 7.7 billion as of November 2019¹ and is likely to keep increasing throughout this century (Gerland et al. 2014). In the near future, many developing countries will face persistent and serious questions: how to feed a growing population; how to reduce poverty, particularly in rural areas; and how to protect the environment while sustaining economic development (McCalla 2001). The US Department of Agriculture (USDA) National Institute of Food and Agriculture (NIFA) has identified several urgent challenge areas that require immediate action in current food and agricultural systems, including food security, climate variability and change, water, bioenergy, childhood obesity, and food safety.² Accordingly, it has funded a number of studies to advance the ability to achieve global food security and fight hunger.

¹ https://www.worldometers.info/world-population/

Food security aims to give people consistent access to quality food. It depends on individuals and organizations working together to develop solutions to related socioeconomic issues. About 70% of the poor in developing countries live in rural areas and make their living in the agricultural sector (Sonntag et al. 2005). Agriculture therefore plays the major role in enhancing food security and reducing poverty in developing countries.

Our planet is facing constant climate change. Although natural change is unavoidable, rapid and dramatic shifts in the climate have severe consequences, such as more frequent and more severe droughts and floods, extreme rain patterns, higher temperatures and more frequent heat waves, and sea-level rise. These consequences can significantly damage agricultural, forest, and rangeland ecosystems and reduce agricultural productivity.

Water is another challenge facing agriculture. As the biggest consumer of freshwater in most countries, agriculture depends on a reliable freshwater supply for irrigation. In the United States, agriculture accounts for about 80% of the consumptive use of freshwater. However, the availability and quality of freshwater resources for agriculture are increasingly becoming an issue, mainly because of climate change and human socioeconomic activities. Yet water waste and inefficient water use are very common in agriculture.
There is an urgent need to develop sound water and watershed management systems that use effective practices, such as modern conservation technologies, appropriate crop choices, and drought preparedness, to help farmers enhance water use efficiency and conserve resources.

² https://nifa.usda.gov/challenge-areas

Agriculture has significant impacts on the environment, since it is the major anthropogenic source of nitrous oxide and methane. In addition, studies suggest that agriculture and related land-use transformation emit greenhouse gases via chemical transformations (Hallberg 1987). The chemicals used in agriculture affect every part of the environment: soil, water, air, animals, people, plants, and climate. One of the grand challenges, therefore, is how to sustain and enhance agricultural productivity while minimizing the environmental footprint of agriculture.

Climate change and agriculture influence each other on a global scale. Changes in temperature, precipitation, and freshwater availability induced by climate change directly affect agricultural sustainability and productivity. Meanwhile, agriculture has a significant influence on climate change, mainly through the production and release of greenhouse gases, such as carbon dioxide, methane, and nitrous oxide, and through the modification of land cover. Other agricultural practices have side effects as well. For example, antibiotics have been used to treat infectious diseases in livestock, but their widespread use as an additive in animal feeds causes the development of antibiotic-resistant microorganisms (Kumar et al. 2005). Different farming strategies, such as conservation agriculture based on minimum tillage, crop residue retention, and crop rotations, have different levels of impact on carbon and nitrogen cycling in agriculture (Govaerts et al. 2009).

Historically, farming was a very innovative field. Since the start of the digital era, however, the adoption of information technology in the agricultural sector has been slower than in other industry sectors. Only in recent years have new technologies such as drones and AI begun to target the agricultural market and to assist farming with more accurate and timely information. One reason for the slow adoption is that innovative technology takes longer to move from concept to implementation in agriculture. Farming follows seasonal cycles, so it takes months or even an entire year to see the results of new techniques. Meanwhile, many challenges are systemic and caused by different factors. They span multiple governmental districts with different policies and regulations. Scaling innovative techniques in agriculture can be very hard because highly connected and interdependent ecosystems interact with complex socioeconomic and political systems. Many rural regions in developing countries still rely on ancient farming techniques.

One hundred years ago, almost half of the American workforce worked in agriculture. Today fewer than 2% of Americans are farmers. Urbanization greatly increases the distance that food travels to reach kitchens. The younger generation knows less than ever about agricultural practices.
Another big challenge facing modern agriculture is raising consumer awareness of a potential food crisis, which hangs like a sword over all humankind (Woolpert 2015).

Contributions from all stakeholders are required to meet the agricultural challenges described above. Scientific research and breakthroughs are among the most prominent contributors. Regarding agricultural research, as Ruttan concludes, there are two challenges: (a) biological and technical constraints on crop and animal productivity and (b) resource and environmental constraints on sustainable growth in agricultural production (Ruttan 1994). Raising the yield ceilings of cereal crops is difficult. The incremental response of crop yields to the increasing use of chemical fertilizers has declined, especially in the years after 2000. Ruttan foresaw in 1999 that advances in the basic knowledge of molecular biology and genetic engineering will create new opportunities for advancing agricultural technology and will reduce the urgency of some of the above concerns (Ruttan 1994). Institutionalization of private sector agricultural research capacity in some developing countries is beginning to complement public sector capacity. More intensive and efficient use of technical inputs, including chemical fertilizers and pest control chemicals, and more effective animal nutrition will help deliver the necessary gains in crop and animal productivity. Higher plant density, new farming practices, improved pest and disease control, more precise application of plant fertilizer, and advances in soil and water management will be realized by applying new knowledge and new technologies.

8.1.2 The Role of Big Data in Agriculture

In recent years, the digital revolution has been transforming modern farming (Bronson and Knezevic 2016). Farmers can now easily look for answers and solutions to problems encountered in farming via online services provided by professional consulting companies or local government departments. Even smallholder farmers are using new precision agriculture equipment such as unmanned aerial vehicles (UAVs) to gather information. Many farmers make their decisions based on data and information provided by authoritative sources. Conventional tractors are equipped with advanced cameras and sensors that stream live data about soil and crops to agricultural information companies or agricultural statistics services, which in turn provide farmers with better real-time, field-level, or even location-specific decision suggestions.

With the rapid development of sensor, sensing, and data collection technologies, the capability of human society to collect data has expanded exponentially. A huge amount of data has been collected by government agencies, organizations, industries, and individuals. The development of data interoperability technology and the wide adoption of open data policies have made the data more accessible at low or no cost. However, the data are very diverse in terms of source, format, quality, etc. Society lacks sufficient experience and knowledge in managing and exploring the rapidly increasing volumes of data. As a result, most of the collected data are either discarded or put into archives without being fully utilized, although they may contain information and knowledge that are valuable to the socioeconomic activities of society.

The agricultural sector has accumulated mountains of records in its long history, many of which, however, are not digital. Big data is different from historical information gathered in the old-fashioned way. Modern computer science and sensor technology can help us better understand the complex interactions among the natural and social components of agricultural systems. Currently, a huge number of in situ, airborne, and space-borne sensors continuously monitor agricultural systems and the related physical environment and produce large quantities of data at an unprecedented pace. How to process the big data to extract valuable information is the first challenge we have to face. Although many tools and services have already been designed for managing and processing big data, they have not yet been widely applied in agriculture (Kamilaris et al. 2017). Existing research indicates that the analysis of agricultural big data promises great opportunities for improving agricultural productivity and sustainability. The availability of related hardware and software, the openness of data sources, and farmer involvement should encourage big data research and practice in agriculture.

8.2 Agricultural Big Data

8.2.1 Special Features of Agro-Big Data

Recognizing the socioeconomic value of the information and knowledge contained in huge volumes of data, researchers have in recent years spent significant effort on maximizing the utilization of the data by mining information and knowledge from them (Manyika 2011). The term "big data" was coined to refer to the huge amount of data those efforts deal with. According to Wikipedia, "Big data is a broad term for datasets so large or complex that traditional data processing applications are inadequate." (https://en.wikipedia.org/w/index.php?title=Big_data&oldid=925811014. Accessed 14 Nov 2019). Big data are commonly characterized by five Vs (Hitzler and Janowicz 2013):

• Volume refers to the huge amounts of data generated every day. For example, millions of cameras have been installed worldwide to monitor the Earth's environment, traffic conditions, public safety, etc. year-round. The volume of data generated by those cameras is unimaginable.
• Velocity refers to the speed at which data is generated and moved around. Every second the world generates petabytes of data, which need to be managed and analyzed, and near-real-time decisions may have to be made based on the analysis results.
• Variety refers to the different types of data the world generates and uses. For example, in the geospatial field, we now need to deal with data from in situ, airborne, and satellite platforms and from citizen scientists' mobile devices. The data types range from hyperspectral images, videos, model outputs, and sensor measurements to social media conversations.
• Veracity refers to the trustworthiness of the data. In the scientific world, the quality and accuracy of the data are among the biggest concerns that every scientific experiment has to consider. In the big data era, because the sources of the data are numerous and the qualifications of the organizations or individuals who collect the data are not equal, the quality and accuracy of the data are less controllable.
• Value refers to the usefulness of the information and knowledge we can derive from the data. Value is therefore the most important V of big data: in any application of big data, we first have to ask what value we can get from the data.

Unlike traditional data management and analysis technologies, big data management and analysis must consider and properly deal with the five V characteristics of big data. Significant progress has been made both in big data management, which deals with data capture, curation, archiving, storage, cataloging, discovery, search, access, sharing, quality control, privacy, etc., and in big data analytics, which includes big data analysis, transformation, mining, visualization, knowledge discovery, etc. However, challenges still exist in all of the above areas.

8.2.2 State-of-the-Art Analysis Methods

The purpose of big data analytics is to derive useful information and knowledge from big data, while the main purpose of big data management is to make big data analytics possible and feasible. In traditional data analysis, causal relationships among the variables are normally sought from data samples. Because of the number of variables, the volume of data, and the uncertainty in data quality involved in big data analytics, correlational or probabilistic relationships are usually sought instead, by analyzing whole datasets rather than samples.

Recently, a number of popular big data processing platforms have come onto the market and drawn a lot of attention. Google Earth Engine (GEE) is the most frequently mentioned web service platform in the geoinformatics and agricultural information communities (Gorelick et al. 2017). Hosted in the Google Cloud, GEE provides planetary-scale geospatial analysis capability and brings Google's massive computational capability to bear on high-impact scientific problems such as deforestation, drought, disaster, disease, food security, water management, climate change, and environmental protection. It gives scientists a free yet very powerful tool to deal with big data. Because GEE integrates geospatial big data, analytics algorithms and models, and powerful computing capability into a single platform, it enables many big data applications, saves scientists a tremendous amount of time, and democratizes research. Currently, most of the remote sensing data used in agro-geoinformatics, but not the ground truth data, are available in GEE.

Individual scientists or organizations that control their own agro-big data have to maintain their own big data storage, computing, and analysis environments. Several open source software packages are frequently used for this purpose, such as Eucalyptus, OpenStack (Sefraoui et al. 2012), CloudStack (Kumar et al. 2014), Apache Hadoop, and Apache Spark. These packages can create a cloud-based environment and accelerate big data processing through the MapReduce mechanism (Borthakur 2007; Zaharia et al. 2016). Because of the five V challenges, these software ecosystems often run into bottlenecks such as data transfer among hosts and memory, MapReduce optimization, real-time data processing, system failure tolerance, and data security.

The applied analysis methods are also changing. Scientists used to run traditional numerical models of crop growth and predict the future development of crops by simulation. In recent years, agricultural scientists have started to use artificial intelligence technology such as machine learning for more effortless, low-cost, general, and automatic modeling of crops and their surrounding environment (Yu et al. 2018; Sun et al. 2019).
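The MapReduce mechanism mentioned above can be illustrated without a cluster. The following is a minimal single-process sketch, assuming hypothetical (field, soil moisture) records; the function names and the sample values are illustrative and are not part of the Hadoop or Spark APIs. The map step emits partial (sum, count) pairs, the shuffle step groups them by key, and the reduce step merges them, which is the property that lets a real framework spread the work over many hosts.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical sensor records: (field_id, soil_moisture_percent).
RECORDS = [
    ("field_a", 20.0), ("field_b", 30.0),
    ("field_a", 24.0), ("field_b", 34.0), ("field_a", 22.0),
]


def map_phase(record):
    """Emit (key, (sum, count)) so partial results can be merged in any order."""
    field_id, value = record
    return field_id, (value, 1)


def shuffle(pairs):
    """Group mapped values by key, as a framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Merge partial (sum, count) pairs into a single (sum, count)."""
    return reduce(lambda a, b: (a[0] + b[0], a[1] + b[1]), values)


def mean_by_key(records):
    """Run map, shuffle, and reduce in sequence and return the mean per key."""
    grouped = shuffle(map(map_phase, records))
    totals = {key: reduce_phase(vals) for key, vals in grouped.items()}
    return {key: total / count for key, (total, count) in totals.items()}
```

Calling `mean_by_key(RECORDS)` returns one mean per field. Carrying (sum, count) pairs instead of running means keeps the reduce step associative, so partial results from different hosts can be combined in any order.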

8.3 Agro-Geoinformatics

8.3.1 Definition

Data will not reveal valuable information on its own. Data processing and analytics are needed to surface the decision-relevant information hidden in the data. Informatics, the science that deals with data, is regarded as an important tool for information generation. Coupling the computational capability of informatics with earth science data is normally referred to as geoinformatics (Di and Yang 2014). By the same idea, applying geoinformatics to agriculture-related data is called agro-geoinformatics (Yu et al. 2019; Di et al. 2017a). Agro-geoinformation, the agriculture-related geo-information, is the key information in the agricultural decision-making and policy formulation process (Di and Yang 2014). Agro-geoinformation is derived from agriculture-related geospatial data, or agro-geodata. If the volume of agro-geodata is huge, we can simply refer to the data as agro-big data. Agro-geoinformatics is a discipline that collects and manages agriculture-related geospatial data, derives agro-geoinformation and knowledge from the data, and applies the derived information and knowledge to solve management and decision-making issues in agriculture (Di and Yang 2014). Agro-geoinformatics therefore deals with the entire life cycle of agro-geoinformation transformation, from data collection, through information and knowledge derivation, to the application of information and knowledge in the agricultural domain. It includes both agro-big data management and agro-big data analytics.

8.3.2 Agro-Geoinformatics: Connecting Agro-Big Data to Agricultural Applications

As an emerging transdisciplinary research area, agro-geoinformatics has extensive applications in agricultural sustainability, food security, environmental research, bioenergy, natural resource conservation, land use management, carbon accounting, global climate change, public health, agricultural industry, commodity trading, economic research, education, agricultural decision-making and policy formulation, and other areas that are of vital importance to the agricultural economy. Advances in remote sensing, sensor networks, and geographic information systems nurture numerous applications that improve agricultural productivity and efficiency (Yu et al. 2019). The data processed in agro-geoinformatics come from a variety of sources such as satellites, manned aircraft, drones, field cameras, tractor sensors, soil sensors, weather stations, and even social media. Remote sensing is one of the key methods for collecting large-scale observations of crops and the related environment. Geospatial information technology is the key technology for handling, analyzing, and applying agro-geoinformation.

Traditional analysis tools such as GIS (geographical information system) analysis toolboxes are widely used in agro-geoinformatics research. Agro-geoinformatics is also actively adopting state-of-the-art data analysis methods such as big data parallel processing, machine learning, deep learning, expert systems, intelligent knowledge-based systems, genetic algorithms, fuzzy systems, and soft computing. GEE is widely used in agro-geoinformatics as well. By combining these advanced analysis tools with observed datasets, agro-geoinformatics can help farmers, consumers, scientists, industries, and government agencies achieve three strategic goals: ensuring food security, reducing poverty, and sustaining the environment. Agro-geoinformatics research targets information that can help farmers make better decisions on planting, fertilizing, irrigating, harvesting, marketing, and selling, and that can improve the capacity of government agencies to make better policies on farm subsidies, price stabilization, large commodity purchases, smallholder farm support plans, crop insurance, disaster response, etc.

8.3.3 Related Research

Agro-geodata is a kind of geospatial data, which is defined as data with associated location information. Studies have shown (or claimed) that more than 80% of the data the world has collected is geospatial data (Hahmann et al. 2011). Geospatial data is therefore big data; in fact, geospatial data have all five V characteristics of big data.

In the past several years, numerous big data management and analytics technologies have been developed by the computer and data science communities. Many of them are general technologies that can be adopted by disciplinary big data applications, including agro-geoinformatics. One of the most notable technologies for dealing with big data is cloud computing and the associated Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) (JoSEP et al. 2010). Numerous software packages for big data management and analytics, on either cloud or cluster platforms, have been developed. Many of those packages are released as open source software, freely available to all interested users. In the agro-geoinformatics domain, those general big data technologies have been adopted to deal with agro-big data.

In addition to the common five V characteristics of big data, agro-big data also have their own special features, particularly multidimensionality and spatial/temporal characteristics (Di 2004). Agro-big data may cover a wide spatial area, up to the entire Earth. Because of these special features, special big data management and analytics methods have to be developed. For example, both spatial and temporal co-registration of multisource and multimodal data are essential for handling agro-big data.
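The co-registration step named above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in any project described in this chapter: observations are hypothetical (longitude, latitude, date, value) tuples, spatial alignment snaps coordinates to an assumed 0.25-degree common grid, and temporal alignment picks the nearest acquisition date within an assumed tolerance of 8 days. All function names are illustrative.

```python
from datetime import date

CELL_DEG = 0.25  # assumed size of the common grid cell, in degrees


def grid_cell(lon, lat, cell=CELL_DEG):
    """Spatial step: snap a coordinate to the index of a common grid cell."""
    return (int((lon + 180.0) // cell), int((lat + 90.0) // cell))


def nearest_in_time(target, candidates, max_days=8):
    """Temporal step: return the candidate date closest to target, or None."""
    best = min(candidates, key=lambda d: abs((d - target).days), default=None)
    if best is None or abs((best - target).days) > max_days:
        return None
    return best


def coregister(station_obs, satellite_obs, max_days=8):
    """Pair each station record with the satellite record that falls in the
    same grid cell and is nearest in time, within max_days."""
    by_cell = {}
    for lon, lat, day, value in satellite_obs:
        by_cell.setdefault(grid_cell(lon, lat), {})[day] = value
    pairs = []
    for lon, lat, day, value in station_obs:
        series = by_cell.get(grid_cell(lon, lat), {})
        match = nearest_in_time(day, series.keys(), max_days)
        if match is not None:
            pairs.append((day, value, match, series[match]))
    return pairs
```

A station reading taken on 10 July would be paired with a satellite value from 12 July in the same cell rather than one from 4 July. Real systems additionally have to reproject between coordinate reference systems and resample rasters, which this sketch leaves out.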

8.4 Examples of Big Data Application in Agro-Geoinformatics

In the past several years, the Center for Spatial Information Science and Systems (CSISS) at George Mason University has worked on a number of research projects that deal with agro-big data to support agricultural decision-making. Examples of such projects are briefly described below.

8.4.1 Agro-Sensor Web

CSISS has contributed significantly to the development of geospatial sensor web technology, which collects agro-geodata in a coordinated and purposeful way from in situ, airborne, and satellite-based imagery and nonimagery sensors (Di 2007). Spatially distributed heterogeneous sensors monitoring temperature, sound, vibration, pressure, motion, or pollutants in agricultural fields are interconnected through standard interfaces into a sensor network (Chen et al. 2009). The term sensor web was introduced in the early years of this century by the National Aeronautics and Space Administration (NASA) Sensor Web Applied Research Planning Group (Delin and Jackson 2001; Di et al. 2010). The OGC and ISO made joint Sensor Web Enablement (SWE) efforts to formulate the standards and protocols that enable interoperability within the sensor web (Di 2007). CSISS developed a general-purpose sensor web framework to integrate and harmonize diverse sensor networks, store disparate sensor datasets, and meet the diverse requirements of concurrent distributed users (Di 2007). It also developed a series of multipurpose sensor web-related web services, including SOS (Sensor Observation Service), SPS (Sensor Planning Service), CSW (Catalogue Service for the Web), WFS-T (transactional Web Feature Service), and WCS-T (transactional Web Coverage Service) (Chen et al. 2009).

To process the data from the sensor web efficiently, an OGC WPS (Web Processing Service)-compliant service was built on CSISS's cloud computing platform, GeoBrain Cloud, and on Apache Hadoop (Zhang et al. 2019; Chen et al. 2012). The processing tasks were linked together to realize real-time geoprocessing of observations from live sensors, connecting the sensor web with decision-makers and providing them with sharable problem-solving knowledge (Sun et al. 2012a, b, 2013; Sun and Yue 2010). To make the sensor observations more accessible and interoperable, OGC WCS was combined with a SOAP interface to facilitate the retrieval of observations about agricultural fields (Sun et al. 2016a).

To make the agro-big data from sensor observations more discoverable and comparable, a complete set of solutions for enabling search of big data was developed (Gaigalas et al. 2019; Sun et al. 2019a). The search strategy takes two steps to reduce the burden on web-based catalogs: the first step searches for the desired sensors, and the second step searches for the observation data granules of a specific sensor. The granule search, which takes the most resources and time, is designed to run on demand to minimize unnecessary searching. To increase the automation of information extraction from sensor observations, a parameterless automatic classification approach, which integrates ontology engineering and remote sensing classification, was proposed (Sun et al. 2016b). Meanwhile, a simple universal interface for services (SUIS) was developed to lower the barrier to entry and simplify the use of existing agricultural web services in real-world scenarios (Sun et al. 2019b).
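The two-step search strategy described in this section can be sketched as follows. The class, method, and field names are hypothetical and an in-memory dictionary stands in for the web-based catalog, so this shows only the control flow, not the CSISS implementation: sensor-level metadata is searched first, and the costly granule search runs only when a specific sensor is requested.

```python
from dataclasses import dataclass, field


@dataclass
class Sensor:
    sensor_id: str
    observed_property: str
    # Each granule is a (ISO date string, granule name) pair.
    granules: list = field(default_factory=list)


class Catalog:
    """Toy in-memory catalog; a real one would sit behind a web service."""

    def __init__(self, sensors):
        self._sensors = {s.sensor_id: s for s in sensors}
        self.granule_searches = 0  # counts how often the costly step runs

    def find_sensors(self, observed_property):
        """Step 1: cheap search over sensor-level metadata only."""
        return [s.sensor_id for s in self._sensors.values()
                if s.observed_property == observed_property]

    def find_granules(self, sensor_id, start, end):
        """Step 2: costly granule search, run on demand for one sensor.
        ISO date strings compare correctly as plain strings."""
        self.granule_searches += 1
        sensor = self._sensors[sensor_id]
        return [name for day, name in sensor.granules if start <= day <= end]
```

A client would call `find_sensors` to narrow the candidates and then call `find_granules` only for the sensor the user actually selects, so the counter `granule_searches` stays at zero until that point.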

8.4.2 GADMFS

In many countries, drought is the most devastating natural disaster in agriculture. Monitoring and forecasting drought has become a global challenge in recent years (Zhong et al. 2019). Cyberinfrastructure now plays an important role in monitoring agricultural drought. We built a web service-based global agricultural drought monitoring system at CSISS and have kept it in operation for the past decade (Deng et al. 2013). GADMFS, short for Global Agricultural Drought Monitoring and Forecasting System, serves terabytes of global drought products at 250-m spatial resolution, updated once every 2 weeks (as shown in Fig. 8.1) (Deng et al. 2012). The products are available for the period from 2001 to the present. The system divides agricultural drought into five levels: abnormally dry, moderate drought, severe drought, extreme drought, and exceptional drought. The division is based on empirical experience of vegetation response to drought. The system aims to serve drought information directly to the users who need it most, such as farmers, government agencies, and crop insurance companies (Sun et al. 2019c). Figure 8.1 shows that in the first 2 weeks of October 2019, there was exceptional drought in Southern Africa, Australia, Eastern Europe, Eastern Brazil, and the Midwest United States. Users can zoom the map to their region of interest, e.g., their own farms, to check the current drought condition (Sun et al. 2017a).

Fig. 8.1 GADMFS user interface (http://gis.csiss.gmu.edu/GADMFS)

8.4.3 CropScape

CropScape, a collaborative effort of CSISS and the USDA (US Department of Agriculture) NASS (National Agricultural Statistics Service), is a web service-based US crop information system (Yang et al. 2013; Han et al. 2012). CropScape hosts the Cropland Data Layer (CDL), a raster, georeferenced, crop-specific land cover data layer created annually for the contiguous United States (CONUS) since 2008 (some states have data back to 1997) (Boryan et al. 2011). CDL is produced from moderate-resolution satellite imagery, primarily from Landsat, and extensive agricultural ground truth collected by the USDA NASS. CropScape provides a bundle of tools for users to flexibly browse, visualize, download, analyze, and export CDL (Han et al. 2012). User feedback has validated it as a useful tool. CropScape greatly enhances the dissemination and use of CDL in the agricultural and socioeconomic communities. CDL maps more than 100 categories of agricultural land use classes and has been widely used in research and real-world decision-making scenarios. Today the system has tens of thousands of active users from all over the world and serves as a fundamental cyberinfrastructure for agro-geoinformatics (Sun et al. 2019) (Fig. 8.2).

Fig. 8.2 CropScape

8.4.4 VegScape

Similar to CropScape, VegScape is another web service-based US crop condition assessment and monitoring system, developed by CSISS for the USDA NASS, that hosts a broader set of vegetation index products (Mueller 2013). VegScape uses data from the NASA Moderate Resolution Imaging Spectroradiometer (MODIS), which has daily global coverage, 250-m spatial resolution, and a 19-year historical archive. It derives daily, weekly, and biweekly composites of NDVI (normalized difference vegetation index), VCI (vegetation condition index), MVCI (mean vegetation condition index), and RVCI (ratio vegetation condition index). In contrast to the older AVHRR NDVI maps, VegScape automatically obtains and processes MODIS surface reflectance data and generates vegetation (crop) condition information so that stakeholders can assess and monitor crop conditions in CONUS in a timely manner (Yang et al. 2013) (Fig. 8.3).
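As a rough illustration of two of the indices named above, the sketch below computes NDVI from near-infrared and red reflectance and VCI as the position of the current NDVI within its historical minimum-maximum range. These are the commonly used definitions; how VegScape implements them is not described here, and the reflectance and history values are made up. MVCI and RVCI are omitted because their formulas are not given in the text.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    if nir + red == 0:
        raise ValueError("NIR + red reflectance must be non-zero")
    return (nir - red) / (nir + red)


def vci(ndvi_now, ndvi_history):
    """Vegetation condition index (percent): where the current NDVI sits
    within the historical min-max range for the same pixel and period."""
    lo, hi = min(ndvi_history), max(ndvi_history)
    if hi == lo:
        raise ValueError("historical NDVI range is zero")
    return 100.0 * (ndvi_now - lo) / (hi - lo)


# Illustrative (made-up) values for a single pixel.
current = ndvi(nir=0.45, red=0.15)      # about 0.5
history = [0.2, 0.5, 0.6, 0.8]          # same period in earlier years
condition = vci(current, history)       # about 50: midway between the extremes
```

A low VCI means the pixel is close to its historical worst for that time of year, which is why the index is used for condition monitoring rather than raw NDVI alone.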

8.4.5 RF-CLASS

To support objective and automatic assessment of crop damage caused by floods, we developed RF-CLASS, an Earth observation-based flood crop loss assessment web service system for supporting flood-related crop statistics and insurance decision-making (Di et al. 2017b). Floods are one of the major disasters causing significant crop loss and are responsible for several recent billion-dollar disasters. Timely information about a flood and its impacts on crops, such as flooded area and degree of damage, is critical for agricultural disaster response.

Fig. 8.3 VegScape


RF-CLASS is implemented with open, interoperable, standard web interfaces to facilitate interoperability with EO datasets in the NASA Earth Observing System Data and Information System (EOSDIS). It calculates the flooded area using multiple sources of datasets, including MODIS, SMAP, Landsat, and the NASA Dartmouth Flood Observatory (DFO) daily surface water product. Meanwhile, RF-CLASS derives the crop loss map using a regression model that relates the ratio of yield change to the change in accumulated NDVI. It also includes a toolset to support simple spatiotemporal analysis of the time series datasets. To demonstrate and validate its capability, the system assessed two flood events, one in Missouri in 2011 and another in Arkansas in 2006. Both use cases showed that RF-CLASS offers helpful information and functionality for post-flood crop loss assessment.
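The regression idea can be sketched as follows, under strong assumptions. The text does not give the form of the RF-CLASS regression model, so this example assumes a simple linear relationship fitted by ordinary least squares. The training samples, the helper `fit_line`, and the function `estimate_yield_change` are hypothetical and only show how a fitted relationship between accumulated-NDVI change and yield change could be applied pixel by pixel.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Made-up training samples: relative change in accumulated NDVI (x) and
# relative change in yield (y) for previously assessed areas.
ndvi_change = [-0.40, -0.30, -0.20, -0.10, 0.00]
yield_change = [-0.50, -0.375, -0.25, -0.125, 0.00]

intercept, slope = fit_line(ndvi_change, yield_change)


def estimate_yield_change(pixel_ndvi_change):
    """Apply the fitted line to one pixel's accumulated-NDVI change."""
    return intercept + slope * pixel_ndvi_change


loss = estimate_yield_change(-0.25)   # negative value = estimated yield loss
```

Applying `estimate_yield_change` to every flooded cropland pixel would yield a loss map; a production system would also need to validate the fit against independent yield statistics.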

8.4.6 SMAP Explorer

It is well known that soil moisture is one of the most direct measures of agricultural drought. However, the soil moisture products available in the past lacked either coverage or accuracy. To produce high-quality, large-scale soil moisture products, NASA has invested considerable effort in launching satellites and developing algorithms. The SMAP (Soil Moisture Active Passive) mission is the flagship of these efforts. The mission provides a reliable data source for cropland soil moisture assessment. To allow easy access to and analysis of SMAP data, CSISS has developed SMAP Explorer, an interactive web service-based system for SMAP data visualization, dissemination, and analytics (Fig. 8.4). Before SMAP, the USDA monitored US crop soil moisture conditions using weekly field observations for counties in 45 states. The state-level estimates are reported based on subjective and qualitative field observations, which are inaccurate and may be misleading (Yang et al. 2017). The SMAP mission provides a low-cost, efficient alternative to this routine USDA procedure. SMAP Explorer follows this idea and publishes the large processed soil moisture dataset via the web-based system. The system was recently deployed online for test running and will become operational after the test phase.

Fig. 8.4 SMAP Explorer

8.4.7 GeoFairy

Most agricultural application scenarios take place in farm fields. Farmers cannot carry large servers or laptops around while they are working. Mobile apps on smartphones are a good solution for farmers who want to directly retrieve geospatial information about their real-time location. To meet that need, we developed the award-winning app GeoFairy (Fig. 8.5) to realize a one-stop location-based service (LBS) for crop information retrieval (Sun et al. 2017b). GeoFairy integrates a number of NASA/NOAA data sources and state-of-the-art techniques such as cloud computing platforms and cross-platform mobile application development. Experiments have shown that GeoFairy can deliver real-time crop information to users in one stop while they are standing in the fields. It can largely reduce the costs of information searching and retrieval and seamlessly connect end users such as farmers with information suppliers. By integrating the sensor web capability, GeoFairy shows potential for operational use and for addressing modern agricultural challenges.

Fig. 8.5 GeoFairy (figure from Sun et al. 2017b)

8.4.8 CyberConnector COVALI

Numerical modeling is a widely applied solution for simulating crops and the associated environment. There are many models for various natural processes and transformations, e.g., weather models, crop growth models, soil models, rainfall models, and groundwater models (http://earthcube.org/sites/default/files/doc-repository/AtmoCloudAerosolComp_EndUserWorkshop_ExecSummary.pdf). These models are the main producers of decision-assisting information, such as when crops will flower or silk, when it will rain, and when to irrigate. However, for many models it is unclear whether the results are accurate enough or appropriate for a given use case. We developed an advanced system called COVALI (Sun and Di 2018), a subsystem of EarthCube CyberConnector (Sun et al. 2017c), to compare model results and validate them with ground and remote sensing observations. The system allows scientists to compare the results from their model with those from other models side by side in different rendering styles and georeferenced projections (Fig. 8.6). The system can change the research routine of agricultural scientists, making the involved numerical models more intercomparable and the validation of results with ground truth data more efficient.
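COVALI is described above as a side-by-side visual comparison tool; the text does not say which statistics, if any, it computes. As a hedged illustration of what validating model results against observations can mean numerically, the sketch below scores two hypothetical models against made-up observations using bias and root-mean-square error, two common validation metrics. The model names and values are assumptions for illustration only.

```python
import math


def bias(model, obs):
    """Mean error (model minus observation)."""
    return sum(m - o for m, o in zip(model, obs)) / len(obs)


def rmse(model, obs):
    """Root-mean-square error between model output and observations."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


# Made-up values at four co-located points (e.g., daily rainfall in mm).
observed = [10.0, 12.0, 9.0, 11.0]
models = {
    "model_a": [11.0, 13.0, 10.0, 12.0],   # consistently 1 mm too high
    "model_b": [10.0, 14.0, 9.0, 9.0],     # unbiased on average, larger scatter
}

scores = {name: {"bias": bias(values, observed), "rmse": rmse(values, observed)}
          for name, values in models.items()}
best = min(scores, key=lambda name: scores[name]["rmse"])
```

The two metrics can disagree, as they do here: `model_b` has zero bias but a larger RMSE, which is one reason side-by-side inspection remains useful alongside summary statistics.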

Fig. 8.6 CyberConnector COVALI (figure from Sun et al. 2019a)

8.4.9 Geoweaver

In agro-geoinformatics, managing workflows is a challenging task because workflows frequently involve many atomic processes and multiple sources of datasets, which might be distributed across various servers. A good workflow management tool can spare scientists much of this hassle and greatly improve the efficiency of information extraction. Geoweaver (Fig. 8.7) is an ESIP Lab incubator project that aims at building and monitoring AI workflows in the geosciences (Sun and Di 2019). Geoweaver was used to build a deep learning-based workflow for mapping crops in historical years for which ground truth data for training and validation were not available (Sun et al. 2019). It supports command lines, Linux scripts, Python, and Jupyter Notebook. The tool is especially useful when the amount of data to be processed is very large and requires multiple servers or an entire data center to host. It can drive big data processing tools, such as Apache Hadoop or Spark, via command lines and oversee all the active processing tasks in one place.

All of the above projects aimed to extend our capabilities in operationally monitoring crops and supporting decision-making in agriculture with agro-big data. These capabilities are all within the scope of agro-geoinformatics. These projects adopted big data storage and analysis technologies and developed domain-specific agro-big data technologies. The combination of adoption and development has proved to be a successful approach to applying big data technology in the agro-geoinformatics discipline.
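To make the workflow management idea from Sect. 8.4.9 concrete, the following sketch orders a set of atomic processes by their dependencies and tracks the status of each one. It does not use or represent Geoweaver's own interface: the step names, the `run_workflow` function, and the stand-in runner are hypothetical, and the ordering relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter

# Hypothetical atomic processes; each maps to the set of steps it depends on.
workflow = {
    "download_landsat": set(),
    "download_cdl": set(),
    "build_training_set": {"download_landsat", "download_cdl"},
    "train_model": {"build_training_set"},
    "map_crops": {"train_model", "download_landsat"},
}


def run_workflow(graph, run_step):
    """Run every step after its dependencies and record the final status."""
    status = {}
    for step in TopologicalSorter(graph).static_order():
        # Skip a step if any upstream step did not succeed.
        if any(status[dep] != "done" for dep in graph[step]):
            status[step] = "skipped"
            continue
        status[step] = "done" if run_step(step) else "failed"
    return status


# Stand-in runner; a real one might launch a shell script or a Python process.
executed = []
result = run_workflow(workflow, lambda step: executed.append(step) or True)
```

Keeping the status of every step in one dictionary mirrors the idea of overseeing all active tasks in one place; distributing steps across servers would require a runner that dispatches each step remotely.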

Fig. 8.7 Geoweaver

8.5 Conclusion

Agro-geoinformatics deals with collecting, managing, and analyzing agriculture-related geospatial data, which are domain-specific big data. Through adoption and adaptation, general big data technologies are very useful in agro-geoinformatics, but they cannot meet all the technology needs of dealing with agro-big data. The development of agro-big data-specific technology is a necessary supplement to the adoption of general big data technology. Combining the adoption of general big data technology with the development of agro-big data-specific technology has proved to be a good strategy for applying big data technology in agro-geoinformatics.

References

Borthakur, D. (2007). The hadoop distributed file system: Architecture and design. Hadoop Project Website, 2007(11), 21.
Boryan, C., Yang, Z., Mueller, R., & Craig, M. (2011). Monitoring US agriculture: The US department of agriculture, national agricultural statistics service, cropland data layer program. Geocarto International, 26(5), 341–358.
Bronson, K., & Knezevic, I. (2016). Big data in food and agriculture. Big Data & Society, 3(1), 2053951716648174.
Chen, N., Di, L., Yu, G., & Min, M. (2009). A flexible geospatial sensor observation service for diverse sensor data based on web service. ISPRS Journal of Photogrammetry and Remote Sensing, 64(2), 234–242.
Chen, Z., Chen, N., Yang, C., & Di, L. (2012). Cloud computing enabled web processing service for earth observation data processing. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 5(6), 1637–1649.
Delin, K. A., & Jackson, S. P. (2001). Sensor web: A new instrument concept. In ‘Sensor web: A new instrument concept’ (International Society for Optics and Photonics) (pp. 1–9).
Deng, M., Di, L., Yu, G., Yagci, A., Peng, C., Zhang, B., & Shen, D. (2012). Building an on-demand web service system for global agricultural drought monitoring and forecasting. In Building an on-demand web service system for global agricultural drought monitoring and forecasting (pp. 958–961). IEEE.
Deng, M., Di, L., Han, W., Yagci, A., Peng, C., & Heo, G. (2013). Web-service-based monitoring and analysis of global agricultural drought. Photogrammetric Engineering & Remote Sensing (PE&RS), 79(10), 929–943.
Di, L. (2004). Distributed geospatial information services-architectures, standards, and research issues. The International Archives of Photogrammetry, Remote Sensing, and Spatial Information Sciences, 35(Part 2), 187–193.
Di, L. (2007). Geospatial sensor web and self-adaptive Earth predictive systems (SEPS). In Geospatial sensor web and self-adaptive Earth predictive systems (SEPS) (pp. 1–4).
Di, L., & Yang, Z. (2014). Foreword to the special issue on agro-Geoinformatics: The applications of Geoinformatics in agriculture. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7(11), 4315–4316.


Di, L., Moe, K., & van Zyl, T. L. (2010). Earth observation sensor web: An overview. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 3(4), 415–417.
Di, L., Üstündağ, B., Chen, Z., & Yang, Z. (2017a). Guest editorial foreword to the special issue on agro-Geoinformatics: Monitoring, prediction, and decision support. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10(12), 5331–5333.
Di, L., Eugene, G. Y., Kang, L., Shrestha, R., & Bai, Y.-Q. (2017b). RF-CLASS: A remote-sensing-based flood crop loss assessment cyber-service system for supporting crop statistics and insurance decision-making. Journal of Integrative Agriculture, 16(2), 408–423.
Gaigalas, J., Di, L., & Sun, Z. (2019). Advanced Cyberinfrastructure to enable search of big climate datasets in THREDDS. ISPRS International Journal of Geo-Information, 8(11), 494.
Gerland, P., Raftery, A. E., Ševčíková, H., Li, N., Gu, D., Spoorenberg, T., Alkema, L., Fosdick, B. K., Chunn, J., & Lalic, N. (2014). World population stabilization unlikely this century. Science, 346(6206), 234–237.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google earth engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment.
Govaerts, B., Verhulst, N., Castellanos-Navarrete, A., Sayre, K. D., Dixon, J., & Dendooven, L. (2009). Conservation agriculture and soil carbon sequestration: Between myth and farmer reality. Critical Reviews in Plant Science, 28(3), 97–122.
Hahmann, S., Burghardt, D., & Weber, B. (2011). “80% of All information is geospatially referenced”??? Towards a research framework: Using the semantic web for (In) Validating this famous geo assertion. In “80% of all information is geospatially referenced”??? Towards a research framework: Using the semantic web for (In) validating this famous geo assertion.
Hallberg, G. R. (1987). Agricultural chemicals in ground water: Extent and implications. American Journal of Alternative Agriculture, 2(1), 3–15.
Han, W., Yang, Z., Di, L., & Mueller, R. (2012). CropScape: A web service based application for exploring and disseminating US conterminous geospatial cropland data products for decision support. Computers and Electronics in Agriculture, 84, 111–123.
Hitzler, P., & Janowicz, K. (2013). Linked data, big data, and the 4th paradigm. Semantic Web, 4(3), 233–235.
Josep, A. D., Katz, R., Konwinski, A., Gunho, L., Patterson, D., & Rabkin, A. (2010). A view of cloud computing. Communications of the ACM, 53(4), 50–58.
Kamilaris, A., Kartakoullis, A., & Prenafeta-Boldú, F. X. (2017). A review on the practice of big data analysis in agriculture. Computers and Electronics in Agriculture, 143, 23–37.
Kumar, K., Gupta, S. C., Chander, Y., & Singh, A. K. (2005). Antibiotic use in agriculture and its impact on the terrestrial environment. Advances in Agronomy, 87, 1–54.
Kumar, R., Jain, K., Maharwal, H., Jain, N., & Dadhich, A. (2014). Apache Cloudstack: Open source infrastructure as a service cloud computing platform. Proceedings of the International Journal of Advancement in Engineering Technology Management and Applied Science, 1, 111–116.
Manyika, J. (2011). Big data: The next frontier for innovation, competition, and productivity. http://www.mckinsey.com/Insights/MGI/Research/Technology_and_Innovation/Big_data_The_next_frontier_for_innovation.
McCalla, A. F. (2001). Challenges to world agriculture in the 21st century. UPDATE: Agriculture and Resource Economics, 4(3), 1–2.
Mueller, R. (2013). VegScape: A NASS web service-based US crop condition monitoring system. In VegScape: A NASS web service-based US crop condition monitoring system. United States Department of Agriculture.
Ruttan, V. (1994). Challenges to agricultural research in the 21st century. In Agriculture, environment, and health: Sustainable development in the 21st century (pp. 243–257). University of Minnesota Press, Minneapolis.
Sefraoui, O., Aissaoui, M., & Eleuldj, M. (2012). OpenStack: Toward an open-source solution for cloud computing. International Journal of Computer Applications, 55(3), 38–42.


Sonntag, B. H., Huang, J., Rozelle, S., & Skerritt, J. H. (2005). China’s agricultural and rural development in the early 21st century. Australian Centre for International Agricultural Research (ACIAR).
Sun, Z., & Di, L. (2018). CyberConnector COVALI: Enabling inter-comparison and validation of Earth science models. In CyberConnector COVALI: Enabling inter-comparison and validation of earth science models.
Sun, Z., & Di, L. (2019). Geoweaver: A web-based prototype system for managing compound geospatial workflows of large-scale distributed deep networks (p. 2019).
Sun, Z., & Yue, P. (2010). The use of Web 2.0 and geoprocessing services to support geoscientific workflows. In The use of Web 2.0 and geoprocessing services to support geoscientific workflows (pp. 1–5).
Sun, Z., Yue, P., Lu, X., Zhai, X., & Hu, L. (2012a). A task ontology driven approach for live geoprocessing in a service oriented environment. Transactions in GIS, 16(6), 867–884.
Sun, Z., Yue, P., & Di, L. (2012b). GeoPWTManager: A task-oriented web geoprocessing system. Computers & Geosciences, 47(0), 34–45.
Sun, Z., Di, L., Chen, A., Yue, P., & Gong, J. (2013). The use of geospatial workflows to support automatic detection of complex geospatial features from high resolution images. In The use of geospatial workflows to support automatic detection of complex geospatial features from high resolution images (pp. 159–162). IEEE.
Sun, Z., Di, L., Zhang, C., Lin, L., Fang, H., Tan, X., & Yue, P. (2016a). Combining OGC WCS with SOAP to facilitate the retrieval of remote sensing imagery about agricultural fields. In Combining OGC WCS with SOAP to facilitate the retrieval of remote sensing imagery about agricultural fields (pp. 1–4). IEEE.
Sun, Z., Fang, H., Di, L., & Yue, P. (2016b). Realizing parameterless automatic classification of remote sensing imagery using ontology engineering and cyberinfrastructure techniques. Computers & Geosciences, 94, 56–67.
Sun, Z., Di, L., Zhang, C., Fang, H., Yu, E., Lin, L., Tan, X., Guo, L., Chen, Z., & Yue, P. (2017a). Establish cyberinfrastructure to facilitate agricultural drought monitoring. In Establish cyberinfrastructure to facilitate agricultural drought monitoring (pp. 1–4). IEEE.
Sun, Z., Di, L., Heo, G., Zhang, C., Fang, H., Yue, P., Jiang, L., Tan, X., Guo, L., & Lin, L. (2017b). GeoFairy: Towards a one-stop and location based Service for Geospatial Information Retrieval. Computers, Environment and Urban Systems, 62, 156–167.
Sun, Z., Di, L., Huang, H., Wu, X., Tong, D. Q., Zhang, C., Virgei, C., Fang, H., Yu, E., & Tan, X. (2017c). CyberConnector: A service-oriented system for automatically tailoring multisource earth observation data to feed Earth science models. Earth Science Informatics, 11(1), 1–17.
Sun, Z., Di, L., & Fang, H. (2019). Using long short-term memory recurrent neural network in land cover classification on Landsat and cropland data layer time series. International Journal of Remote Sensing, 40(2), 593–614.
Sun, Z., Di, L., Cash, B., & Gaigalas, J. (2019a). Advanced cyberinfrastructure for intercomparison and validation of climate models. Environmental Modelling & Software, 123, 104559.
Sun, Z., Di, L., & Gaigalas, J. (2019b). SUIS: Simplify the use of geospatial web services in environmental modelling. Environmental Modelling & Software, 119, 228–241.
Sun, Z., Di, L., Fang, H., Guo, L., Yu, E., Tang, J., Zhao, H., Gaigalas, J., Zhang, C., & Lin, L. (2019c). Advanced Cyberinfrastructure for agricultural drought monitoring. In Advanced Cyberinfrastructure for agricultural drought monitoring (pp. 1–5). IEEE.
Woolpert, M. (2015). The greatest challenge facing agriculture over the next 5 years. The University of Vermont. USDA.
Yang, Z., Yu, G., Di, L., Zhang, B., Han, W., & Mueller, R. (2013). Web service-based vegetation condition monitoring system-VegScape. In Web service-based vegetation condition monitoring system-VegScape (pp. 3638–3641).


Yang, Z., Crow, W., Hu, L., Di, L., & Mueller, R. (2017). SMAP DATA for cropland soil moisture assessment: A case study. In SMAP DATA for cropland soil moisture assessment: A case study (pp. 1996–1999). IEEE.
Yu, Z., Di, L., Tang, J., Zhang, C., Lin, L., Yu, E. G., Rahman, M. S., Gaigalas, J., & Sun, Z. (2018). Land use and land cover classification for Bangladesh 2005 on Google Earth engine. In Land use and land cover classification for Bangladesh 2005 on Google earth engine (pp. 1–5). IEEE.
Yu, E., Di, L., Gao, F., & Yang, Z. (2019). Agrogeoinformatics: Connecting Geospatial technologies with Agriculture I. In Agrogeoinformatics: Connecting geospatial technologies with Agriculture I. AGU.
Zaharia, M., Xin, R. S., Wendell, P., Das, T., Armbrust, M., Dave, A., Meng, X., Rosen, J., Venkataraman, S., & Franklin, M. J. (2016). Apache spark: A unified engine for big data processing. Communications of the ACM, 59(11), 56–65.
Zhang, C., Di, L., Sun, Z., Lin, L., Eugene, G. Y., & Gaigalas, J. (2019). Exploring cloud-based Web Processing Service: A case study on the implementation of CMAQ as a service. Environmental Modelling & Software, 113, 29–41.
Zhong, S., Xu, Z., Sun, Z., Yu, E., Guo, L., & Di, L. (2019). Global vegetative drought trend and variability analysis from long-term remotely sensed data. In Global vegetative drought trend and variability analysis from long-term remotely sensed data (pp. 1–6). IEEE.

Chapter 9: Land Parcel Identification

Li Lin and Chen Zhang

Abstract Land parcel is the finest unit to describe location, boundary, and ownership in land management. Land survey has been the most popular way to identify land parcels in the history of land management. However, land parcel surveys come at a huge financial cost, while their accuracy is not acceptable for many applications such as agricultural management. The development of remote sensing and geographic information systems (GIS) introduced a novel way of collecting, storing, and analyzing agricultural land parcel information. This study discusses agricultural land parcel identification and management approaches from the local to the global level. In most countries, local authorities are responsible for the collection and storage of land parcel information. For this reason, the aggregation of land parcel information from various authorities' datasets becomes critical in large-scale agricultural management activities. However, agencies and nations develop land parcel databases differently, and these databases are often not interoperable. This study also summarizes the state-of-the-art approaches to reducing friction in land parcel database integration across the globe. The study concludes that international standards and cooperation between organizations are essential to the management of land parcel information in agro-geoinformation systems.

Keywords Land parcel · Agricultural management · Agro-geoinformation systems · Remote sensing · GIS · Interoperability · Web-services

9.1 Introduction

Land plays one of the most significant roles in human history. It is the foundation of many other developments, such as agricultural production and urbanization (United Nations 1976). Within the science of location, land identification is one of the basic and key components (Rindfuss et al. 2004). The earliest recorded land identification was conducted by the Egyptians, who measured land for taxation (Cuno 1980). Starting with the Industrial Revolution, accurate land identification was required for various kinds of infrastructure development (Kain and Baigent 1992; Williamson and Ting 2001).

A land parcel can carry information for various purposes. For example, urban planners who study residential land use change have their own models for mapping land (Irwin et al. 2003). Agricultural experts need land parcel data with information about crop types and conditions. It is neither wise nor feasible to include all attributes when identifying land parcels, since including irrelevant information wastes data storage and processing capacity.

In the United States, both the public and private sectors have worked on identifying land parcels for agricultural purposes. Both have done well at local land parcel identification using traditional approaches. However, land parcel identification and management became problematic for large-scale (even global-scale) agriculture. Large-scale agricultural studies emerged to study the impact of global change on cropland (Deng et al. 2012; Han et al. 2012). This challenge required the development of new approaches to land parcel identification. New land parcel identification benefited from the development of new techniques such as remote sensing and geographic information systems. Moreover, national and international standards in geoinformation also helped in better management of land parcels in large-scale agro-geoinformation systems.

This chapter discusses land parcel identification and management in agriculture at different scales, from the local to the global level. Land parcel management becomes harder when dealing with larger areas. Good land parcel management requires cooperation among agencies, states, and countries. This chapter also discusses the current challenges and problems in agricultural land parcel identification.

L. Lin (*) · C. Zhang, George Mason University, Fairfax, VA, USA
© Springer Nature Switzerland AG 2021. L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_9

9.2 Land Parcel and Agricultural Land Parcel

9.2.1 What Is Land Parcel?

A land parcel is the basic land unit, which is described by its location, boundary, and ownership. It is defined as the finest unit for identifying land, which means that one parcel cannot be separated into multiple finer units. For example, it is not possible to sell one land parcel to two individuals separately; however, it is possible for one individual to own multiple land parcels. Although the land parcel is the basic land unit, land parcels are not identical, since they are defined by actual geomorphic conditions. In the real world, no two land parcels are identical. In addition to its uniqueness, a land parcel may have its boundary change over time due to construction, changes in regional structure, and changes in ownership (Worboys 1994). For example, human development may influence the shape of parcels (Worboys 1994; Irwin et al. 2003). In addition, land parcels can be influenced by administrative boundary shifts, such as the expansion or shrinkage of state boundaries after conflicts. Land parcels may be changed by nature as well. For example, common natural disasters such as floods and droughts can introduce notable changes in the landscape, which later cause changes in land parcels (Lin et al. 2019).

9.2.2 Land Parcel in Agriculture

Information requirements differ among the diverse types of land parcel systems. For example, parcels were grouped into undeveloped and developed to study residential land use change (Irwin et al. 2003). Data in the residential research contain more economic variables and do not distinguish land parcels by agricultural type (Irwin et al. 2003). This is appropriate, since the parcel model was built for studies of residential change, for which information about the agricultural type of land parcels is unnecessary. Land parcel systems in agriculture focus mainly on agriculture-related information while discarding other types of parcels. In general, land parcels that are not related to agricultural activities are categorized in agro-geoinformation systems as non-cropland or developed area. In some agricultural land parcel identification systems, non-cropland parcels are discarded entirely to save storage and increase access and processing speed. The attributes of an agricultural land parcel not only clearly describe the location, boundary, and ownership but also contain agricultural statistics such as crop type and soil type. For example, crop rotation is a common way to increase productivity that could harm soil health (Moudon 2000). Monitoring cropland at the parcel level is necessary to evaluate the health condition of the land (Moudon 2000). This kind of information does not exist for other types of parcels and belongs only to agricultural land parcels (Fig. 9.1).
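The distinction between a generic land parcel and an agricultural land parcel can be sketched as a small data structure. This is only an illustration under assumptions: the class names, field names, and sample values are hypothetical and do not come from any particular parcel database. The base record holds the location, boundary, and ownership described in Sect. 9.2.1, and the agricultural record adds crop and soil attributes.

```python
from dataclasses import dataclass, field


@dataclass
class LandParcel:
    parcel_id: str
    boundary: list          # polygon vertices as (x, y) pairs
    owner: str

    def centroid(self):
        """Simple location summary: mean of the boundary vertices."""
        xs, ys = zip(*self.boundary)
        return sum(xs) / len(xs), sum(ys) / len(ys)


@dataclass
class AgriculturalParcel(LandParcel):
    soil_type: str = "unknown"
    crop_by_year: dict = field(default_factory=dict)   # year -> crop type

    def rotation(self):
        """Crop sequence in chronological order."""
        return [self.crop_by_year[year] for year in sorted(self.crop_by_year)]


# Made-up example parcel.
parcel = AgriculturalParcel(
    parcel_id="P-001",
    boundary=[(0, 0), (4, 0), (4, 2), (0, 2)],
    owner="A. Farmer",
    soil_type="silt loam",
    crop_by_year={2019: "soybeans", 2018: "corn"},
)
center = parcel.centroid()
crops = parcel.rotation()
```

Storing crop type per year, rather than a single crop field, is what makes parcel-level monitoring of rotation possible.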

9.2.3 Techniques to Identify Land Parcel

Accurate land information is signiﬁcant to both private owners and public society (National Research Council 2007). As a result, land parcel information need to be accurate and reliable both spatially and temporally (Library of Congress 2011). However, signiﬁcant effort is needed to continuously provide such land parcel data. In addition, accurate land parcel data is not only expensive but also hard to access. The techniques of land parcel identiﬁcation, which have beneﬁted from the development of science and technologies, improved signiﬁcantly in the last decades. Land survey is one of the earliest but most reliable way of identifying land parcel for thousands of years. Prior to the Industrial Revolution, accuracy was important

166

L. Lin and C. Zhang

Fig. 9.1 Land parcel system management in agro-geoinformation systems

but not listed as one of the top priorities, and a rough information of ownership was enough. Starting with the Industrial Revolution, land parcel information is required to be accurately measured due to the development of road, railroad, and constructions (Kain and Baigent 1992; Williamson and Ting 2001). Beneﬁting from the popularity of desktop computers, many land parcel data were transferred to digital. The National Research Council (2007) pointed out that one-third counties in the United States have digital land parcel data available, and many private companies have built their own electronic land parcel databases. Although these data were stored digitally, it was not convenient to use digitalized land parcel data since they were just digitalized images. The development of remote sensing and geographic information system (GIS) delivered an alternative way to collect, store, and analyze land parcel data. Remote sensing as one of latest Earth observation techniques has been deployed into many land parcel identiﬁcation studies (Bocco et al. 2001; Oesterle and Hahn 2004). Remote sensing not only provides an unbiased data source for land parcel identiﬁcation at low cost, but satellites also provide the ability of continuous monitoring. GIS was utilized as a key tool in managing land parcel data in research (Kiehle et al. 2007; Moudon 2000). It is necessary to have land parcel information in agricultural management systems since all types of agricultural activities are conducted on land. For example, the input of land parcel data could describe land supply and capacity for agriculture (Moudon 2000). Land parcel information is required to evaluate and support agricultural sustainability and productivity (Oesterle and Hahn 2004; Zhang et al. 2019b). Better understanding of the condition of the land required reliable land parcel data as reference and input parameters. There are few beneﬁts for adding land parcel information in agro-geoinformation systems. 
Land parcels are a necessary input in agricultural studies. With the inclusion of land parcel information (especially in large-scale research), results from agricultural studies will be more reliable, and reliable results in turn benefit policy- and decision-makers. However, land parcel identification and management in agro-geoinformation systems is not an easy task. The following sections discuss approaches to identifying and managing land parcel information in agro-geoinformation systems at different scales.

9.3 Managing Land Parcel Information in Agro-Geoinformation Systems for Local Governments, Agencies, and Companies

Unlike land parcel data collection, which is a labor- and time-intensive process, land parcel management is challenging because of data volume and complexity. Land parcels generate a tremendous amount of digital data, including footprints and other attached attributes. As a result, much land parcel information is organized by local governments, agencies, and companies (National Research Council 2007). Much of it was transferred from physical storage to digital archives with the development of desktop computers and GIS software. Most land parcel information is stored in the vector data model, which is better suited than the raster model to representing features with discrete boundaries. However, vector data may consume more storage space than other data models because of their complicated features and precise boundary representation, which matters especially on desktop computers. A few techniques are widely used for managing land parcel information in agro-geoinformation systems. First, the size of a parcel dataset can be reduced by simplifying polygon boundaries. For example, ESRI provides tools that let users smooth polygons by reducing or repositioning edge points (ESRI 2014). This method reduces land parcel data size and improves performance by discarding detail in the parcel footprints, but it also lowers the spatial accuracy of the parcels. Second, land parcels can be divided into two groups: agricultural and nonagricultural. Information about nonagricultural land parcels can be discarded in agro-geoinformation systems, keeping only agricultural land parcels to save space and processing power. For example, a mask layer can be produced to define an area with agricultural activities (Boryan and Yang 2012). Adding an agricultural mask layer is a simple but efficient way to reduce land parcel data size in agro-geoinformation data management.
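The trade-off between data size and spatial accuracy in boundary simplification can be illustrated with a minimal Python sketch. It uses Douglas-Peucker point removal, a common simplification technique chosen here as a stand-in; it is not necessarily the algorithm behind the ESRI tool cited above. The parcel coordinates and the tolerance value are invented for illustration.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        # Degenerate segment (closed ring): fall back to point distance.
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification; end points are always kept."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the end points.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every intermediate vertex is within tolerance: drop them all.
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical parcel footprint (closed ring) with small wiggles on one edge.
boundary = [(0, 0), (1, 0.05), (2, -0.04), (3, 0.02), (4, 0),
            (4, 3), (0, 3), (0, 0)]
simplified = simplify(boundary, tolerance=0.1)
print(len(boundary), "vertices reduced to", len(simplified))
```

A larger tolerance removes more vertices and therefore more storage, at the cost of a footprint that departs further from the surveyed boundary, which is the accuracy loss noted in the text.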
In addition to reducing the size of the land parcel dataset, agricultural land parcels can be tagged in land administration systems so that the tagged parcels can be linked with agro-geoinformation systems. The first approach requires the integration of the land parcel dataset with agro-geoinformation systems. However, it is hard to integrate the two datasets because land parcel information changes dynamically. The second approach was introduced to minimize the effort of data modification by using a standardized structure (Inan et al. 2010). This approach is more flexible when multiple agencies collaborate, such as the member states of the European Union. However, it requires a large effort from the local agencies that collect land parcel data and build land parcel systems. Local governments, agencies, and companies may not be motivated to make the enhancement since they may not see the benefit (Library of Congress 2011).
As desktop computers became one of the most common and cost-efficient ways to process and store data, many local governments, agencies, and companies used them to store and analyze land parcel data for agricultural studies. Multiple approaches have been used to manage land parcel information in agro-geoinformation applications, including simplifying data complexity, discarding land parcels unrelated to agriculture, and linking land parcel datasets with agro-geoinformation systems by tagging agricultural land parcels. The methods discussed above are widely used in desktop applications. However, they have limitations that are common on desktop computers, such as the need to trade away some spatial and temporal accuracy. The following sections discuss ways to address these problems for larger regions.
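The tagging-and-linking idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (`parcel_id`, `land_use`, `owner_ref`), and the list of agricultural land uses are hypothetical and do not come from any of the systems cited in this chapter.

```python
# Hypothetical land administration records; all field names are assumptions.
land_admin = [
    {"parcel_id": "P-001", "owner_ref": "A17", "land_use": "cropland"},
    {"parcel_id": "P-002", "owner_ref": "B02", "land_use": "residential"},
    {"parcel_id": "P-003", "owner_ref": "C44", "land_use": "pasture"},
]

# Hypothetical agricultural attributes kept in the agro-geoinformation system,
# keyed by the shared parcel identifier.
agro_records = {
    "P-001": {"crop": "corn", "year": 2019},
    "P-003": {"crop": "hay", "year": 2019},
}

AGRICULTURAL_USES = {"cropland", "pasture", "orchard"}


def tag_agricultural(parcels):
    """Return copies of the parcel records with an is_agricultural flag."""
    return [dict(p, is_agricultural=p["land_use"] in AGRICULTURAL_USES)
            for p in parcels]


def link_to_agro(tagged_parcels, records):
    """Join tagged agricultural parcels to agro records by parcel_id."""
    linked = []
    for parcel in tagged_parcels:
        if not parcel["is_agricultural"]:
            continue  # nonagricultural parcels are left out of the agro system
        record = {"parcel_id": parcel["parcel_id"]}
        record.update(records.get(parcel["parcel_id"], {}))
        linked.append(record)
    return linked


linked = link_to_agro(tag_agricultural(land_admin), agro_records)
print(linked)
```

Because the agro-geoinformation side holds only the parcel identifier and its own attributes, boundary or ownership changes stay in the land administration system and do not have to be copied over.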

9.4 Managing Land Parcel Information in Agro-Geoinformation Systems at State and National Levels

Spatial information plays an important role in research. Scientists have long recognized that location is the key to some aggregated phenomena. Regional geographers conduct research by finding similar patterns within a region of study (Hartshorne 1939). However, a single county is too small for scientists to find meaningful patterns. Scientists, especially agricultural experts, work at regional scales: one state or several states. This requires the aggregation of local land parcel information, which is not easy for a few reasons: (1) inconsistent data collection makes results incomparable; (2) nonstandardized data storage makes it difficult to combine land parcel datasets. Land parcel data are collected by various agencies, including private companies, and the methods of data collection are not standardized among counties. For example, surveys of farmers could be used to evaluate agricultural conditions for land parcels, but the results may carry uncertainties if the surveys were conducted independently across counties. Moreover, counties may use different methods to collect land parcel data. Many land parcel data were collected with nonstandardized methods because they were produced on desktop computers, with no requirement to collaborate with other counties or states. All these inconsistent land parcel identification approaches make it difficult to manage land parcel data in agro-geoinformation systems.
Land parcel identification in the European Union (EU) is one of many successful cases. The Common Agricultural Policy (CAP) of the EU needed aggregated land parcel data for distributing aid to farmers. Scientists developed a standardized land parcel identification system to collect and manage land parcel information; it is widely used in the EU to deal with the inconsistency in land parcel data (Inan et al. 2010; Leteinturier et al. 2006). Agricultural research became easier to conduct with a good land parcel management system (Leteinturier et al. 2006; Lin et al. 2016). The availability of land parcel information has helped scientists better understand crop growth and rotation (Leteinturier et al. 2006). Agro-geoinformation systems also monitor the damage from natural disasters, and land parcels serve as critical information when evaluating the impacts of natural disasters on agricultural fields, such as acreage affected and yield loss (Han et al. 2012; Lin et al. 2016).
Both scientific and industrial communities have developed land parcel information systems for agricultural purposes. Scientists have recognized the importance of having land parcel information for the entire United States since the last century (Okpala 1992). Many studies have focused on the development of a national land parcel database (a part of the National Spatial Data Infrastructure) (Library of Congress 2011; National Research Council 2007). Most land parcel databases were developed at the state level with Federal funding. After that, a national land parcel database was developed by the Federal Geographic Data Committee (FGDC) using the standards (Library of Congress 2011; National Research Council 2007). Most scientific land parcel identification systems focus mainly on public land because access to, and release of, data on privately owned land are limited. The successful development of large-scale land parcel databases relies on techniques such as remote sensing and GIS. Traditional land parcel identification and information validation relied heavily on manual correction and field trips (Tasdemir and Wirnhardt 2012). Remote sensing can relieve the pressure on manual intervention and provide a reliable source for land parcel identification systems (Tasdemir and Wirnhardt 2012).
Remote sensing-based automatic land parcel identification requires high-spatial-resolution aerial or satellite images, which means that the temporal resolution will be relatively low. Moderate or coarse temporal resolution has limited influence on automatically generated land parcel identification systems, however, because land parcels do not change frequently.
Unlike freely distributed government data, land parcel databases of several types are managed by private companies from different sectors. For example, Zillow, one of the leading online search tools for real estate, manages land parcel data on residential properties (PR Newswire 2012). AcreValue is another company, which sells information on farmland parcels. Both companies provide land parcel information, but their audiences are different. As a result, the attributes recorded for the same land parcel differ between the two companies. For example, land parcel data on AcreValue are prepared for agricultural activities, so some agricultural land parcels exist in AcreValue’s database but do not appear on Zillow’s website, since Zillow focuses on real estate (Fig. 9.2).

Fig. 9.2 Land parcel management in real estate and agro-geoinformation systems

Both the national land parcel database and land parcel data from private companies have a few limitations. First, the parcel database needs continuous updating (Rindfuss et al. 2004). Although the attributes or boundaries of a single land parcel do not change frequently, many records need to be updated because the database covers a large geographic area. Second, it is hard for a national database to manage all the detailed information for each state (Rindfuss et al. 2004). For instance, one state may have different regulations on land use than another. Both problems can be resolved by integrating the data sources: a national land parcel database does not rely only on its own data but also on the land parcel databases of the states (National Research Council 2007). Lastly, data stored at the local or state level contain some sensitive personal information. Improper management of data security could be harmful if details of the data are leaked. Personal and sensitive information therefore needs to be removed when managing land parcel data in agro-geoinformation systems.
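Two of the issues raised in this section, nonstandardized storage across counties and the need to remove personal information, can be illustrated together. The following Python sketch maps two hypothetical county exports onto one common record structure and drops the sensitive field before the data reach an agro-geoinformation system. The county names, field names, and the choice of `owner_name` as the sensitive field are all assumptions made for the example.

```python
# Hypothetical per-county field mappings onto a common structure.
FIELD_MAPS = {
    "county_a": {"PIN": "parcel_id", "ACRES": "area_acres", "OWNER": "owner_name"},
    "county_b": {"parcel_no": "parcel_id", "area_ha": "area_ha", "owner": "owner_name"},
}
SENSITIVE_FIELDS = {"owner_name"}  # assumed list of personal attributes
ACRES_PER_HECTARE = 2.47105


def harmonize(record, county):
    """Rename fields, convert units, and strip sensitive attributes."""
    out = {}
    for source_name, common_name in FIELD_MAPS[county].items():
        if source_name in record:
            out[common_name] = record[source_name]
    if "area_ha" in out:
        # Store all areas in one unit so that counties are comparable.
        out["area_acres"] = round(out.pop("area_ha") * ACRES_PER_HECTARE, 2)
    out["source"] = county
    return {k: v for k, v in out.items() if k not in SENSITIVE_FIELDS}


record_a = harmonize({"PIN": "001", "ACRES": 40.0, "OWNER": "J. Doe"}, "county_a")
record_b = harmonize({"parcel_no": "B-7", "area_ha": 10.0, "owner": "R. Roe"}, "county_b")
print(record_a)
print(record_b)
```

Keeping the mapping in a table rather than in code means a new county can be added by describing its fields, which is one way a state or national database could absorb local datasets without asking each county to change its own system.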

9.5 Approaches to Manage Land Parcel Information in Global Agro-Geoinformation Systems – International Standards

Global agricultural study requires data sharing between countries. As one of the key features in agricultural research, land parcel data need to be shared between countries as well. The Federal Geographic Data Committee (FGDC) and the American National Standards Institute (ANSI) are two major standards bodies in the United States. Both focus on standards inside the United States, such as those of the National Spatial Data Infrastructure (NSDI) (National Research Council 2007). However, the standards in the NSDI mainly serve data access and sharing between agencies in the United States and lack the ability to provide support globally. In addition, land parcel identification methods differ greatly between countries, especially developing countries. For example, many developing and underdeveloped nations may have difficulty mapping land parcels because of either economic pressure or a technology lag. Like the NSDI, a standard-based Spatial Data Infrastructure (SDI) can serve as a bridge for sharing data between countries (Nebert 2004).
A few international organizations focus on developing standards to support interoperability between countries. The Open Geospatial Consortium (OGC), one of the leading international standards organizations, specializes in geospatial standards. Many agro-geoinformation systems have adopted OGC standards (Deng et al. 2013; Han et al. 2012; Lin et al. 2017; Sun et al. 2016a, b; Zhang et al. 2016). Land parcel identification using web services was widely accepted in the first generation of SDI (Kiehle et al. 2007). Standards made the visualization of global land parcel data possible. This is valuable because land parcel data can be overlaid directly on global agro-geoinformation systems, and the result is very useful for visual comparison and studies. Furthermore, studies have been conducted on performing more operations with land parcel data. Land parcels can interact with the second generation of web services, such as the Web Processing Service (WPS) and the Catalog Service for the Web (CSW) (Kiehle et al. 2007; Zhang et al. 2019a). The move from displaying to processing land parcel information has provided more powerful functions for global agro-geoinformation systems (Fig. 9.3).
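As an illustration of how a standard interface makes such overlays possible, the Python sketch below assembles a GetMap request for an OGC Web Map Service (WMS) version 1.3.0. The request parameters follow the WMS standard, but the endpoint `https://example.org/wms` and the layer name `parcels` are hypothetical placeholders, and no request is actually sent.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def wms_getmap_url(base_url, layer, bbox, width, height, crs="CRS:84"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 uses
    longitude/latitude axis order, which avoids the axis-order
    pitfall of EPSG:4326 in WMS 1.3.0.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
        "TRANSPARENT": "TRUE",  # lets the parcel layer sit on top of other maps
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, used only to show the request shape.
url = wms_getmap_url("https://example.org/wms", "parcels",
                     (-96.0, 40.0, -95.0, 41.0), 512, 512)
params = parse_qs(urlparse(url).query, keep_blank_values=True)
print(url)
```

Because every conforming server accepts the same parameters, a client written this way can request parcel layers from different countries and overlay them on crop or drought layers served through the same interface.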

9.6 Conclusion and Discussion

Land parcel identification was introduced together with the challenges facing land change science (Rindfuss et al. 2004). The identification and management of land parcel information is essential in agro-geoinformation systems. Land parcel information provides precise geospatial information for agricultural studies and helps generate more reliable research outcomes. Unlike other land parcel identification systems, an agricultural land parcel system should contain the following: (1) land parcel administrative data and (2) agriculture-related information. Land parcel information is mapped by both the private and public sectors. Studying large-scale agricultural change requires the aggregation of land parcel information from local agencies. When it comes to data management for a large area, such as states or even entire countries, data size and performance usually pull in opposite directions. Both the EU and the United States have spent a lot of effort on establishing large-scale land parcel databases (National Research Council 2007). With the adoption of standards, the EU developed the land parcel identification system for agricultural policy studies, and the national land parcel database was built in the United States.


Fig. 9.3 Interaction between standard and non-standard based land parcel web services and agro-geoinformation systems

The inconsistencies in land parcel collection methods and storage formats hinder dataset integration. Different data collection methods may lead to results with different accuracy levels. In principle, data inconsistency could be solved simply by requiring all data collectors to use the same data collection method. However, not all countries have the same techniques for collecting and managing land parcel data, and a significant technology gap exists between developed and developing countries. Adopting techniques such as remote sensing and GIS offers an alternative for measuring land parcels for agro-geoinformation systems. In addition, land parcel identification is based on geomorphology while also considering administrative boundaries. This characteristic of the land parcel, which is the finest unit in an agro-geoinformation system, makes research outcomes comparable.
The sharing of land parcel information is extremely important in global agro-geoinformation systems; however, it also has some problems. One of the largest is the motivation to share such data. As discussed earlier in the chapter, counties lack the motivation to contribute to the national land parcel database, and international land parcel management faces the same issue. Individual nations may not share their data because of either a lack of motivation or concerns about national security. Although there are several approaches to effectively identify and manage land parcel information, there is still a long way to go in dealing with such information for large-scale agro-geoinformation systems.

References

Bocco, G., Mendoza, M., & Velázquez, A. (2001). Remote sensing and GIS-based regional geomorphological mapping: A tool for land use planning in developing countries. Geomorphology, 39(3), 211–219.
Boryan, C. G., & Yang, Z. (2012, August). A new land cover classification based stratification method for area sampling frame construction. In Agro-geoinformatics (Agro-geoinformatics), 2012 first international conference on (pp. 1–6). IEEE.
Cuno, K. M. (1980). The origins of private ownership of land in Egypt: A reappraisal. International Journal of Middle East Studies, 12(03), 245–275.
Deng, M., Di, L., Yu, G., Yagci, A., Peng, C., Zhang, B., & Shen, D. (2012, July). Building an on-demand web service system for global agricultural drought monitoring and forecasting. In Geoscience and remote sensing symposium (IGARSS), 2012 IEEE international (pp. 958–961). IEEE.
Deng, M., Di, L., Han, W., Yagci, A. L., Peng, C., & Heo, G. (2013). Web-service-based monitoring and analysis of global agricultural drought. Photogrammetric Engineering & Remote Sensing, 79(10), 929–943.
Environmental Systems Research Institute (ESRI). (2014). ArcGIS Desktop Help 10.2 smooth. http://resources.arcgis.com/en/help/main/10.2/index.html.
Han, W., Yang, Z., Di, L., & Mueller, R. (2012). CropScape: A web service based application for exploring and disseminating US conterminous geospatial cropland data products for decision support. Computers and Electronics in Agriculture, 84, 111–123.
Hartshorne, R. (1939). The nature of geography. Washington, DC: Association of American Geographers.
Inan, H. I., Sagris, V., Devos, W., Milenov, P., van Oosterom, P., & Zevenbergen, J. (2010). Data model for the collaboration between land administration systems and agricultural land parcel identification systems. Journal of Environmental Management, 91(12), 2440–2454.
Irwin, E. G., Bell, K. P., & Geoghegan, J. (2003). Modeling and managing urban growth at the rural-urban fringe: A parcel-level model of residential land use change. Agricultural and Resource Economics Review, 32(1), 83–102.
Kain, R. J., & Baigent, E. (1992). The cadastral map in the service of the state: A history of property mapping. Chicago: University of Chicago Press.
Kiehle, C., Greve, K., & Heier, C. (2007). Requirements for next generation spatial data infrastructures-standardized web based geoprocessing and web service orchestration. Transactions in GIS, 11(6), 819–834.
Leteinturier, B., Herman, J. L., De Longueville, F., Quintin, L., & Oger, R. (2006). Adaptation of a crop sequence indicator based on a land parcel management system. Agriculture, Ecosystems & Environment, 112(4), 324–334.
Library of Congress. (2011). Issues regarding a national land parcel database. S.l: [s.n.]. Washington: Congressional Research Service. Resources, Science, Industry Division.
Lin, L., Di, L., Zhang, C., Hu, L., Tang, J., & Yu, E. (2017). Developing a Web service based application for demographic information modeling and analyzing. In 2017 6th International Conference on Agro-Geoinformatics (pp. 1–5). IEEE.
Lin, L., Di, L., Tang, J., Yu, E., Zhang, C., Rahman, M., ... & Kang, L. (2019). Improvement and validation of NASA/MODIS NRT global flood mapping. Remote Sensing, 11(2), 205.
Lin, L., Di, L., Yu, E. G., Kang, L., Shrestha, R., Rahman, M. S., et al. (2016, July). A review of remote sensing in flood assessment. In Agro-geoinformatics (Agro-Geoinformatics), 2016 fifth international conference on (pp. 1–4). IEEE.
Moudon, A. V. (2000). Monitoring land supply with geographic information systems: Theory, practice, and parcel-based approaches. New York: Wiley.
National Research Council. (2007). Committee on land parcel databases: A National Vision, & Ebrary, Inc. In National land parcel data: A vision for the future. Washington, DC: National Academies Press.
Nebert, D. D. (Ed.) (2004). Developing spatial data infrastructures: The SDI cookbook, version 2.0 (GSDI-technical working Group). Available at: ftp://181.118.144.33/DAPA/planificacion/GEOMATICA/SIG/Anexos_SIG/cookbookV2.0.pdf
Oesterle, M., & Hahn, M. (2004). A case study for updating land parcel identification systems (IACS) by means of remote sensing. Wide Angle, 153, 6–120.
Okpala, D. C. I. (1992). Land survey and parcel identification: Data for effective land management. Land Use Policy, 9(2), 92–98.
PR Newswire. (2012, October 25). Zillow introduces pre-market inventory to its home search.
Rindfuss, R. R., Walsh, S. J., Turner, B. L., Fox, J., & Mishra, V. (2004). Developing a science of land change: Challenges and methodological issues. Proceedings of the National Academy of Sciences of the United States of America, 101(39), 13976–13981.
Sun, Z., Di, L., Fang, H., Zhang, C., Yu, E., Lin, L., et al. (2016a, July). Embedding Pub/Sub mechanism into OGC web services to augment agricultural crop monitoring. In Agro-geoinformatics (Agro-geoinformatics), 2016 fifth international conference on (pp. 1–4). IEEE.
Sun, Z., Di, L., Zhang, C., Lin, L., Fang, H., Tan, X., & Yue, P. (2016b, July). Combining OGC WCS with SOAP to facilitate the retrieval of remote sensing imagery about agricultural fields. In Agro-geoinformatics (Agro-geoinformatics), 2016 fifth international conference on (pp. 1–4). IEEE.
Tasdemir, K., & Wirnhardt, C. (2012, July). Automatic assessment of land parcel identification systems for agricultural management. In Geoscience and remote sensing symposium (IGARSS), 2012 IEEE international (pp. 5697–5700). IEEE.
United Nations. (1976). Report of habitat: United Nations conference on human settlements (p. 62). Canada: UN Vancouver.
Williamson, I., & Ting, L. (2001). Land administration and cadastral trends: A framework for re-engineering. Computers, Environment and Urban Systems, 25(4), 339–366.
Worboys, M. F. (1994). A unified model for spatial and temporal information. The Computer Journal, 37(1), 26–34.
Zhang, C., Sun, Z., Heo, G., Di, L., & Lin, L. (2016, July). Developing a GeoPackage mobile app to support field operations in agriculture. In Agro-geoinformatics (Agro-geoinformatics), 2016 fifth international conference on (pp. 1–4). IEEE.
Zhang, C., Di, L., Sun, Z., Lin, L., Eugene, G. Y., & Gaigalas, J. (2019a). Exploring cloud-based Web Processing Service: A case study on the implementation of CMAQ as a Service. Environmental Modelling & Software, 113, 29–41.
Zhang, C., Di, L., Lin, L., & Guo, L. (2019b). Machine-learned prediction of annual crop planting in the US Corn Belt based on historical crop planting maps. Computers and Electronics in Agriculture, 166, 104989.

Chapter 10

Crop Pattern and Status Monitoring

Eugene G. Yu and Zhengwei Yang

Abstract Statistical approaches are traditionally used to approximate crop pattern and status, but they have limitations in timeliness and precision. Remote sensing has been increasingly adopted for monitoring crop pattern and status as observations with high spatial and temporal resolution have become available. This chapter surveys the methods and technologies for crop condition monitoring using remote sensing. Advances in remote sensing and related processing capabilities make it possible to operationally monitor crop pattern and crop status at very high spatial and temporal resolution. General workflows for applying remote sensing in both crop pattern monitoring and crop status monitoring are reviewed and described. Several operational cases are discussed in detail. The growing constellation of satellite sensors with different spatial resolutions improves temporal resolution. Further development in machine learning, time series analytics, and massively parallel computing technologies will facilitate the timely monitoring of crops down to the field level. The application of remotely sensed data and derived information on crop pattern and status will be further expanded into precision agriculture.

Keywords Crop pattern mapping · Crop status monitoring · Remote sensing · Feature selection · Classification · Cropland

E. G. Yu (*)
Center for Spatial Information Systems, George Mason University, Fairfax, VA, USA

Z. Yang
Research and Development Division, USDA National Agricultural Statistics Service, Washington, DC, USA

© Springer Nature Switzerland AG 2021
L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_10

10.1 Introduction

Crop pattern and status are important information for decision-makers and practitioners in the agricultural sector. Crop pattern describes the proportions of area under different crops at a given time. Crop status is the condition of the crop in terms of health, growth stage, and projected yield. Farmers need accurate and timely crop pattern and status information for efficient and sustainable agriculture. Precision agriculture relies heavily on timely information on crop pattern and status.
Geospatial technologies and remote sensing have been used extensively in monitoring cropland and crop growth since their early days in the 1970s (Tucker 1980; Atzberger 2013). Initiatives, projects, and programs for monitoring agricultural resources have the application of remote sensing and geospatial technologies at their core. These range from large historical initiatives, like the Large Area Crop Inventory Experiment (LACIE) (MacDonald and Hall 1980) in the 1970s, the Agriculture and Resource Inventory Surveys Through Aerospace Remote Sensing (AgRISTARS) (Engmann et al. 1986) in the 1980s, and the long-running Monitoring Agriculture with Remote Sensing (MARS) (Bouman 1995) initiated in 1988, to today’s operational systems and activities, like the MARS Crop Yield Forecasting System (MCYFS) (Baruth et al. 2008) and the crop monitoring and early warning for food security (FoodSeC) (Rembold et al. 2013; Atzberger 2013) in the Monitoring Agricultural ResourceS (MARS) unit of the Joint Research Center (JRC) of the European Union (EU), the Global Agricultural Monitoring (GLAM) (Becker-Reshef et al. 2010a) project in the US Department of Agriculture (USDA) Foreign Agricultural Service (FAS), the Cropland Data Layer CropScape (Han et al. 2012; NASS 2013) and the National Crop Condition Monitoring System, VegScape (Mueller 2013; Yang et al. 2013), in the USDA National Agricultural Statistics Service (NASS), the Crop Watch (CropWatch) (Wu et al. 2010, 2014) in the Chinese Academy of Sciences (CAS), and the Global Information and Early Warning System (GIEWS) (GIEWS 2013; Basso et al. 2013) of the United Nations (UN) Food and Agriculture Organization (FAO). The applications of remote sensing and related geospatial technologies have evolved into far more operational systems than before.
This chapter formalizes the general process and steps of applying advanced geospatial technologies in monitoring cropland and crop status. It also covers in detail several operational systems and programs and their results in monitoring cropland and crop condition. The chapter covers both crop pattern mapping and crop status monitoring, with four subsections for each monitoring task. In the first subsection, the operational statistical approach is reviewed. In the second, the remote sensing approach is described with detailed steps and methodologies, and the relevant technologies are reviewed at each step. In the third, operational cases of applying remote sensing are discussed and presented with example results. In the fourth, the current constraints of the remote sensing approach are reviewed, and perspectives on future trends in remote sensing applications are given in light of the latest advances in remote sensing and related computational technologies. After both crop pattern mapping and crop status monitoring are covered, a brief summary is given along with major future trends.

10.2 Crop Pattern Mapping

10.2.1 Statistical Approach

The crop pattern at different administrative levels is often estimated using statistical sampling (Basso et al. 2013). The statistical approach is still the official, operational approach of most statistical agencies (Bosecker 1988; Abreu et al. 2010, 2011; Irwin et al. 2014). At the USDA NASS, crop acreage is determined mainly by the June Agricultural Survey (JAS), which is based on stratified sampling frameworks over the national agricultural states and areas (Bosecker 1988; Good and Irwin 2006; Lopiano et al. 2011). The European survey of crop acreage uses a classical statistical scheme based on area frame sampling and ground visits to obtain the main estimation variables in the MARS (Monitoring Agriculture with Remote Sensing) project (Gallego 1999; Pradhan 2001). The acreage pattern of India is based on a national survey using a sampling framework (Parihar and Oza 2006). The crop acreage estimation of China is also mainly based on a stratified sampling framework (Yang et al. 2007; Wu and Li 2012).
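The principle behind stratified acreage estimation can be shown with a short Python sketch. It implements the textbook stratified expansion estimator for a total and its standard error under simple random sampling within each stratum; it is not the actual procedure of the JAS or of any other survey cited above. The strata, the number of segments per stratum (`N`), and the sampled acreages are invented for illustration.

```python
import math
from statistics import mean, variance


def stratified_total(strata):
    """Estimate a population total and its standard error.

    Each stratum needs N (number of area segments in the stratum) and
    sample (crop acres observed in each sampled segment).
    """
    total, var = 0.0, 0.0
    for stratum in strata:
        sample, big_n = stratum["sample"], stratum["N"]
        n = len(sample)
        ybar = mean(sample)
        s2 = variance(sample) if n > 1 else 0.0  # sample variance
        total += big_n * ybar
        # Variance of the stratum total with finite population correction.
        var += big_n * big_n * (1 - n / big_n) * s2 / n
    return total, math.sqrt(var)


# Hypothetical strata defined by cultivation intensity.
strata = [
    {"name": "intensively cultivated", "N": 100,
     "sample": [50.0, 60.0, 70.0, 60.0]},
    {"name": "sparsely cultivated", "N": 200,
     "sample": [10.0, 20.0, 30.0, 20.0]},
]
total, se = stratified_total(strata)
print(f"estimated crop acreage: {total:.0f} acres (standard error {se:.1f})")
```

Grouping segments with similar cultivation intensity keeps the within-stratum variance small, which is why stratification gives a more precise estimate than an unstratified sample of the same size.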

10.2.2 Remote Sensing Approach The general remote sensing approach includes the following major steps as shown in Fig. 10.1(Lu and Weng 2007): 1. Selection of remotely sensed observations: Three major aspects are to be considered during the selection of proper remotely sensed data, i.e., spatial, temporal, and radiometric resolution. Table 10.1 lists some selected sensors and their resolution characteristics. Besides these intrinsic characteristics of remote sensing, the external factors need to be weighed in during the selection of observations. These may include costs and feasibility. The results are valuable only if they can be produced and available in the right time frame. In general, higher resolution would lead to higher accuracy if cost and time are not issues. However, in reality, these may be the very constraints that the researcher and practitioner have to work with. The higher resolution also means higher cost and more time needed to process the data, while these resources may be limited. Higher resolution may open up possibilities for achieving improved classiﬁcation, while current technologies may not be able to take advantage of the added features due to the computing speed or other similar constraints. It is also possible to lead

178

E. G. Yu and Z. Yang

Fig. 10.1 General workﬂow for crop pattern mapping with remote sensing

to degraded accuracy of classiﬁcation if the existing methods are applied. The balanced selection depends on the classiﬁcation tasks. 2. Preprocessing: The preprocessing includes geometric correction, radiometric correction, and cloud masking. Geometric correction is the process of digitally removing the geometric distortions to match a speciﬁc projected surface or shape in digital image processing (JARS 1999; Toutin 2004; Mather and Koch 2011). Geometric distortions are mainly coming from three sources: sensor (observer), target (observed), and projection (reference) (JARS 1999; Toutin 2004; Mather and Koch 2011). Most sensor-related distortions (variations in movement, attitude, mechanics, viewing angles, and clock) can be systematically corrected. These are often done by data provider or satellite sensor-receiving stations. The remaining distortions from the sensor (i.e., degrading, nonsystematic variations),

10

Crop Pattern and Status Monitoring

179

Table 10.1 Selected satellite sensors

| Sensor | Spatial resolution | Temporal resolution | Radiometric resolution | References |
|---|---|---|---|---|
| Landsat TM/ETM+ | 30 m | 16 days | 6 bands from visible to near infrared | Martínez-Casasnovas et al. (2005) |
| ASTER | 15 m or 30 m | 16 days | 3 visible bands and 5 infrared bands | Conrad et al. (2010) |
| SPOT-5 | 10 m or 20 m | 26 days | 3 visible bands and 1 infrared | Yang et al. (2011a) |
| AWiFS | 56 m | 5 days | 3 bands in visible near infrared and 1 short wave infrared | Boryan et al. (2011) |
| Sentinel-2 | 10 m, 20 m, 60 m | 5 days | 13 bands in visible to infrared | Immitzer et al. (2016) |
| DMC | 22 m | 3 days or 1 day | 3 bands (green, red, near infrared) | Fisette et al. (2014) |
| Radarsat-2 | 8 m, 25 m, 50 m, 100 m | 24 days | Multiple polarization RADAR | Jiao et al. (2014) |

target (i.e., atmospheric refraction and turbulence, Earth curvature, rotation, and topographic effect), and projection (i.e., mismatches from geoid to ellipsoid and from ellipsoid to map) require fine geometric correction, which relies on specific modeling processes and mathematical functions (Toutin 2004). For crop mapping, fine geometric correction is often required because of the need for fine resolution, off-nadir viewing, digital processing, image fusion, and integration of multiformat data (Toutin 2004). Fine geometric correction starts with image acquisition and, if a physical model is adopted, analysis of the image metadata. The remaining steps for either physical or empirical model-based correction are (JARS 1999; Toutin 2004) (1) collecting sufficient ground control points, (2) estimating the unknown parameters of the selected model, (3) rectifying the images, and (4) interpolating and resampling the radiometric values. Radiometric correction is the process of eliminating radiometric distortions (JARS 1999). The sources of radiometric distortion are sensor sensitivity, sun angle and topography, and the atmosphere (JARS 1999). The common levels of radiometric correction are, in order from simple to complex, top-of-atmosphere (TOA) reflectance calculation (Kaufman et al. 1997; Hansen and Loveland 2012; Roy et al. 2014), surface reflectance calculation (Vermote et al. 1997, 2002; Hansen and Loveland 2012), bidirectional reflectance distribution function (BRDF) and view angle normalization (Danaher et al. 2001; Schaaf et al. 2002), and terrain normalization (Lu et al. 2008). Whether radiometric correction is applied, and at what level of sophistication, depends on the application (Song et al. 2001). If a classifier is trained on one scene and applied to the same scene, sophisticated radiometric correction is not necessary (Song et al. 2001; Hansen and Loveland 2012). TOA may be sufficient in the general


classifications of crop and land cover if the application is not across sensors and is within similar seasons (Vicente-Serrano et al. 2008; Hansen and Loveland 2012). Topographic normalization is needed if classification is applied across sensors and seasons (Lu et al. 2008). Clouds and haze block the view of the ground in optical remote sensing (Whitcraft et al. 2015). Affected pixels need to be masked out and flagged in a quality layer for further processing. Many algorithms are available to detect and mask cloud pixels in optical remotely sensed data (Lyapustin et al. 2008; Hulley and Hook 2008; Zhu and Woodcock 2012).

3. Feature extraction and feature selection: Feature extraction is the process of deriving values and features from remotely sensed data. Examples of derived features are textural features, statistical features, discrete wavelet transform (DWT)-based features, and discrete cosine transform (DCT)-based features (Badhwar et al. 1982; Lei et al. 2008; Anami et al. 2011; Ul Qayyum et al. 2013). Feature selection is the process of choosing the proper features for crop classification. The proper features depend on crop types, classification algorithms, sensors, and seasons. Too many features may degrade crop classification (Lu and Weng 2007; Löw et al. 2013). There are many feature selection methods, e.g., exhaustive search by recursion (ESR), isolated independent search (IIS), and sequence-dependent search (SDS) (Peddle and Ferguson 2002). Proper feature extraction and feature selection can improve the accuracy of crop classification (Löw et al. 2013).

4. Classification: Many classification algorithms, both unsupervised and supervised, have been applied to cropland classification from remotely sensed data. Unsupervised classification algorithms leave label assignment to post-processing, while supervised classification completes label assignment during classification.
A hybrid approach using both unsupervised and supervised classification is also possible: an unsupervised classifier is applied first to generate clusters, and a supervised classifier is then applied to the clusters. The unsupervised stage can be seen as a variation of feature extraction. Table 10.2 shows some commonly used classification algorithms. Ensemble classification (e.g., random forest) and deep learning (e.g., convolutional neural network (CNN)) have been gaining popularity lately because of their improved accuracy.

5. Post-processing: Filters are normally good at removing the “salt-and-pepper” effect: misclassified, isolated pixels are eliminated under the assumption of a single crop in a field or a continuous segment of a field (Lu and Weng 2007). Mosaicking cropland classification results from a time series of remotely sensed data and combining them with ancillary data can improve the classification accuracy. For example, rule-based reasoning may be applied to mask out built-up areas and forest areas, which are relatively unchanged over time. The planting difference


Table 10.2 Classification algorithms and their applications in crop mapping

| Classifier | Sensor observations | Cropland | References |
|---|---|---|---|
| Decision tree | Time series of MODIS | Cropland in the US central Great Plains | Wardlow and Egbert (2008) |
| Support vector machine (SVM) | SPOT HRV | Spring barley, winter wheat, spring wheat | Foody and Mathur (2004) |
| Neural network classifier | ERS-1 | Rice | Chen and McNairn (2006) |
| Maximum likelihood classifier | Indian remote sensing (IRS) | Wheat crop | Murthy et al. (2003) |
| Random forest | SPOT-5 | Wheat, sugar beet, rice, corn, tomato/pepper | Ok et al. (2012) |
| Bayesian network | MODIS EVI product | Soybean | Pupin Mello et al. (2010) |
| Convolutional neural network (CNN) | Landsat 8 and Sentinel-1A | Wheat, maize, sunflower, soybeans, and sugar beet | Kussul et al. (2017) |
| Object-based crop identification and mapping (OCIM) | ASTER and its derived vegetation indices | Oat, rye, wheat, corn, rice, sunflower, safflower, tomato, alfalfa, almond, vineyard, walnut | Peña-Barragán et al. (2011) |

between crops, e.g., the 2-week difference between corn and soybean in Iowa, may be used to improve the differentiation of crop types if a time series of quality data is available.

6. Validation: Accuracy assessment is essential. Accuracy is evaluated in several respects: training accuracy, model selection accuracy, and verification accuracy. Training accuracy is evaluated during the training stage by comparing the classifier's predictions against the training set. Cross-validation, which rotates the left-out samples to test the learning capability of a model, helps in evaluating models and determining their fitness. Validation evaluates the classifier against unseen samples to verify its general applicability. These evaluations help to balance generalization and specialization in trained classifiers. If a classifier is too specialized (overfit), it loses the capability to correctly classify unseen data, although its accuracy on the training set may be extremely high. If a classifier is too generalized (underfit), its accuracy is too low to correctly classify most croplands.
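The contrast between training accuracy and accuracy on left-out samples described in the validation step can be illustrated with a small sketch. It is only an illustration under stated assumptions: the one-nearest-neighbour classifier, the function names, and the peak-NDVI values and labels are all hypothetical and are not taken from any of the cited studies.

```python
def k_fold_indices(n_samples, k):
    """Split sample indices 0..n_samples-1 into k folds (round-robin)."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


def predict_1nn(train_x, train_y, x):
    """Label x with the class of the nearest training sample (1-D feature)."""
    nearest = min(range(len(train_x)), key=lambda i: abs(train_x[i] - x))
    return train_y[nearest]


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the reference label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


def cross_validated_accuracy(xs, ys, k):
    """Average accuracy on the left-out fold, rotating through k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        preds = [predict_1nn(train_x, train_y, xs[i]) for i in fold]
        scores.append(accuracy([ys[i] for i in fold], preds))
    return sum(scores) / len(scores)


# Hypothetical peak-NDVI samples; one "corn" sample (0.58) overlaps the soybean range.
ndvi = [0.82, 0.79, 0.85, 0.58, 0.61, 0.57, 0.63, 0.80]
crop = ["corn", "corn", "corn", "corn", "soy", "soy", "soy", "corn"]

training_accuracy = accuracy(crop, [predict_1nn(ndvi, crop, x) for x in ndvi])
held_out_accuracy = cross_validated_accuracy(ndvi, crop, k=4)
print(training_accuracy, held_out_accuracy)
```

In this toy case the classifier memorizes its training set, so the training accuracy is 1.0, while the cross-validated accuracy is 0.75. A gap of this kind is the sign of over-specialization that the validation step is meant to expose.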


10.2.3 Case Study – Operational National Cropland Mapping Programs

10.2.3.1 USA Cropland Data Layer

The Cropland Data Layer (CDL) is produced using the remote sensing approach (Johnson and Mueller 2010; Boryan et al. 2011). For preprocessing, the producers purchase or obtain rectified remotely sensed observations, mainly Landsat TM and ETM+ and ResourceSat-2 AWiFS. All available Landsat or AWiFS data are collected during the growing season. The time series allows clouded areas to be filled and specific crops to be distinguished. The classifier is See5, a decision tree classifier. Training samples come largely from the detailed, non-public June Survey of the USDA Farm Service Agency. Classification is mainly scene-based, which makes fine radiometric correction unnecessary. The cloud quality bit from the producer of the rectified data is used to mask out pixels during the training and classification stages. The National Land Cover Database (NLCD) is used to mask out nonagricultural areas. Specific crop knowledge is used to distinguish crops; the 2-week difference in planting between corn and soybean is one example of knowledge applied to tell crops apart from a time series of observations. The CDL has been produced annually since 2008, covering all 48 conterminous states. Internally, the CDL is produced in mid-season and made available for internal use. The public release happens in January of the following year. Figure 10.2 shows one year of CDL data displayed in CropScape, an online, public, interactive, web-based explorer for disseminating and accessing CDL data (Han et al. 2012).
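Two of the operations mentioned above, filling clouded observations from a time series and masking out nonagricultural areas with a land cover layer, can be sketched as follows. This is not the CDL production code. The function names, the nearest-in-time fill rule, and the small arrays are assumptions made for illustration; the class codes are used only as example values.

```python
def fill_clouded(values, cloudy):
    """Replace cloud-flagged observations with the nearest clear one in time."""
    clear = [i for i, flag in enumerate(cloudy) if not flag]
    if not clear:
        return list(values)  # no clear observation to fill from
    filled = []
    for i, value in enumerate(values):
        if cloudy[i]:
            nearest = min(clear, key=lambda j: abs(j - i))
            filled.append(values[nearest])
        else:
            filled.append(value)
    return filled


def mask_non_agricultural(class_map, land_cover, non_ag_codes, background=0):
    """Set classified pixels to background where the land cover is nonagricultural."""
    return [
        [background if cover in non_ag_codes else label
         for label, cover in zip(class_row, cover_row)]
        for class_row, cover_row in zip(class_map, land_cover)
    ]


# One pixel observed on four dates; the second date is flagged as cloudy.
series = fill_clouded([0.3, 0.1, 0.6, 0.7], [False, True, False, False])

# Example codes: crop labels 1 and 5; land cover 82 = cropland, 41 = forest, 21 = developed.
crops = [[1, 5], [5, 1]]
cover = [[82, 41], [82, 21]]
masked = mask_non_agricultural(crops, cover, non_ag_codes={21, 41})
print(series, masked)
```

When two clear dates are equally close, this sketch takes the earlier one; an operational system would more likely choose by crop calendar or image quality.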

Fig. 10.2 2016 Cropland Data Layer. (Source: https://nassgeodata.gmu.edu/CropScape/)

10.2.3.2 Canada Crop Inventory

The Crop Inventory (CI) of Canada is an operational, annual crop mapping program based on satellite Earth observations. Both optical and radar satellite images are used in the classification. The optical sensors include Landsat 5, DMC, SPOT, Landsat 8, Sentinel-2, and Gaofen-1; the radar sensor is RADARSAT-2. Ground truth information is provided by provincial crop insurance companies and by field surveys of Agriculture and Agri-Food Canada (AAFC). A decision tree classifier is used to classify the images. CI has produced crop maps of Canada annually since 2011. The operational release of the annual crop inventory allows cropland changes to be monitored over the years. The mapped crops include barley, millet, oats, rye, triticale, wheat, corn, borage, camelina, canola and rapeseed, flaxseed, mustard, safflower, sunflower, soybeans, peas, beans, lentils, potatoes, sugar beets, canaryseed, and vetch. Figure 10.3 shows the 2016 Crop Inventory of Canada.

10.2.4 Limitations and Perspectives

Limitations still prevent the remote sensing approach from replacing the traditional statistical approach in estimating crop acreage. They are as follows.

1. Operational vs. experimental: Most operational crop acreage estimations still rely on a sampling framework to conduct ground surveys and estimate the crop pattern. The remote sensing approach is used only in certain stages to assist the sampling design. The acceptance of the remote sensing approach is limited by its own

Fig. 10.3 2016 Crop Inventory. (Source: http://www.agr.gc.ca/atlas/rest/services/imageservices/annual_crop_inventory_2016/ImageServer)


capacity to determine acreage accurately at the resolution of the remote sensing data. The identification of crop types using satellite data remains a technical challenge because of the diversity of cropping systems: crop types, crop varieties, management practices, and field sizes (Song et al. 2017). Agricultural landscapes are too complex for each cropland to be classified accurately with the satellite remote sensing currently in use (Wu and Li 2012). Acreage cannot be estimated directly by pixel counting because of misclassification and the existence of mixed pixels (Gallego 2004).

2. Mismatched crop mapping timing: Optical remote sensing is often affected by cloud cover (Hale et al. 1999; Allen et al. 2002). High-resolution satellite remote sensing has a limited revisit frequency. The timing of the acquired data may not fit the classification of certain crops. For example, soybean and corn planting differ by about 2 weeks in Iowa, which is a useful signature for distinguishing the two crops. However, Landsat revisits are more than 2 weeks apart, so a single clouded acquisition may mean that the image pair needed to distinguish the two crops is missed completely.

3. No distinguishable spectral signatures for accurately identifying crop types: The spectral signatures of certain crops are still hard to find because of their similarity to other crops or surroundings.

4. Complexity of crop classification training: Current classification with remote sensing is mostly based on the supervised approach. The classifier is often trained with uncorrected spectral measurements, which limits its applicability: the trained classifier applies only to similar images from a similar time, or even just to the scene where the training samples were collected. Technical barriers still prevent a classifier from being trained once and applied to all images of the same sensor.
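The pixel-counting issue raised under the first limitation comes down to simple arithmetic. The sketch below compares a naive pixel count with an estimate that uses per-pixel crop fractions. The function names, the 30 m pixel size, and the small arrays are illustrative assumptions; in practice the fractions are unknown and have to be estimated.

```python
def pixel_count_area_ha(class_map, crop_code, pixel_size_m):
    """Naive area estimate: number of pixels labelled crop_code times pixel area."""
    pixels = sum(row.count(crop_code) for row in class_map)
    return pixels * pixel_size_m ** 2 / 10_000.0  # 1 ha = 10,000 square metres


def fractional_area_ha(crop_fractions, pixel_size_m):
    """Area estimate when each pixel carries a crop cover fraction in [0, 1]."""
    covered = sum(sum(row) for row in crop_fractions)
    return covered * pixel_size_m ** 2 / 10_000.0


# Hypothetical 2 x 3 scene: hard labels (1 = target crop) and the assumed
# true cover fraction of the target crop in each pixel.
class_map = [[1, 1, 5],
             [1, 5, 5]]
fractions = [[1.0, 0.6, 0.0],
             [0.7, 0.0, 0.0]]

naive = pixel_count_area_ha(class_map, crop_code=1, pixel_size_m=30)
mixed = fractional_area_ha(fractions, pixel_size_m=30)
print(naive, mixed)
```

Here three labelled pixels give about 0.27 ha, while the assumed fractions give about 0.21 ha: every mixed pixel is counted as a full pixel of the crop, which is one reason why pixel counts are usually combined with ground samples rather than used alone.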
Several recent advancements in remote sensing technologies may help elevate the role of remote sensing in crop acreage estimation, so that it eventually evolves into an operational method. These advancements and their impacts are summarized as follows.

1. Improved spatial resolution: Satellite remote sensing is reaching submeter resolution. This will make it possible to identify cropland accurately with detailed spectral and textural signatures. Advances in computing technologies, especially cloud computing, allow large volumes of data to be processed in parallel, which makes it possible to classify images of extremely high resolution.

2. Improved temporal resolution: The temporal resolution of extremely high-spatial-resolution remote sensing has also improved significantly. Daily revisits are common through constellations of many satellites. This helps solve the timing problem of image acquisition. Gaps caused by cloud coverage may be reduced to a great degree by the frequent revisits.

3. Beyond spectral signatures: The resolution of radar remote sensing has also improved significantly. The all-weather capability of radar will significantly reduce the problem of cloud coverage. The intensity signal of RADAR and


LiDAR provides information beyond spectral signatures. The structure of the crop ecosystem may be revealed to the remote sensor as an effective signature.

4. Advancement of machine learning: Deep learning has found great success in image identification and pattern recognition. These improved classification methods will help improve the extraction and classification of cropland from very-high-resolution remote sensing data.

10.3 Crop Status Monitoring

10.3.1 Statistical Approach

The statistical ground survey on a sampling framework is the traditional, operational approach to estimating crop yield and monitoring crop status in many countries (Hanuschak 2013). Yield forecasts in the United States are based on two surveys: a subjective survey, the Agricultural Yield Survey (AYS), conducted monthly among selected farmers, and an objective plot measurement survey, the Objective Yield (OY) survey, conducted under a sampling framework for major crops (i.e., wheat, corn, soybeans, cotton, and potatoes) (Hale et al. 1999; Good and Irwin 2006; NASS 2012; Irwin et al. 2014; Good and Irwin 2016). Yield estimates in India are made using a survey of crop cutting experiments (CCE) under a stratified multistage random sampling framework (Parihar and Oza 2006).

10.3.2 Remote Sensing Approach

Remote sensing can be an efficient technology for obtaining quick and up-to-date crop condition information throughout the growing season. The most widely used approach is to take observations from satellite sensors with high temporal resolution and derive condition-sensitive indices, which are then evaluated against normal values over multiple years (Yu et al. 2012a). Figure 10.4 shows the typical workflow for monitoring crop status using time series of remotely sensed observations. The typical steps, as shown in Fig. 10.4, are as follows:

1. Selection of remotely sensed data: Monitoring crop status requires frequent observations during the crop growing season. Unlike crop mapping, which needs high-spatial-resolution multispectral observations, crop status monitoring places its main requirement on temporal resolution among the three resolution properties of satellite remote sensors. Table 10.3 lists selected satellite sensors that have been used in crop status monitoring.


Fig. 10.4 A general workflow for crop status monitoring with remote sensing

Table 10.3 Selected satellite sensors for crop status monitoring

| Sensor | Spatial resolution | Temporal resolution | Radiometric resolution | References |
|---|---|---|---|---|
| AVHRR | 1 km | Daily | 6 bands | Rembold et al. (2013) |
| MODIS | 250 m, 500 m | Daily | 2 bands + 5 bands + 29 bands | Yang et al. (2011b) |
| RapidEye | 5 m | 1 day/5 day | 5 bands | Kim and Yeom (2015) |
| SPOT-VEGETATION | 1.15 km | 1 day | 4 bands | Becker-Reshef et al. (2010a) |
| MERIS | 1.2 km | 3 day | 15 bands | Wu et al. (2014) |

2. Data preprocessing: As for cropland mapping, preprocessing for geometric correction, radiometric correction, and cloud masking is needed. Monitoring crop status requires a time series of remotely sensed observations. The minimum required period of the time series usually covers the growing season of the crop. In addition, the “normal” values that represent the typical crop status at a location must be derived from data across years, which may involve observations from different generations of a sensor or even different


sensors. All of this heightens the requirement for fine geometric and radiometric correction to make values comparable down to the pixel level across time and sensors. The methods and approaches are described in the preceding preprocessing discussion for cropland mapping.

3. Condition indicators: Crop growth condition changes with environmental conditions and growth stages. The health or stage of a crop is related to its chlorophyll content, which reflects the coverage and density of green leaves (e.g., leaf area index). Crops reflect near-infrared radiation strongly while absorbing red radiation. These biological and physical characteristics form the basis for the many vegetation indices that enhance the signal of crop coverage while suppressing other signals, and such indices have been used extensively in monitoring crop status. Many vegetation indices have been developed since the 1970s (Bannari et al. 1995; Silleos et al. 2006; Yang et al. 2009, 2011b). Comprehensive reviews of vegetation indices can be found in Bannari et al. (1995), Silleos et al. (2006), and Basso et al. (2013). Table 10.4 lists selected vegetation indices that have been applied in crop status monitoring.

4. Denoising and modeling: The calculated crop condition indicators (e.g., NDVI) may be contaminated and fluctuate abnormally over time because of cloud, haze, and atmospheric conditions. Before they can be used to evaluate and compare crop status, smoothing or model fitting is often needed to eliminate bad data and enhance the trend. There are several levels of smoothing/filtering: eliminating extremes, smoothing, and kernel fitting. Table 10.5 lists selected algorithms for eliminating, smoothing, or kernel-fitting time series of vegetation indices.
The first level eliminates abnormal values that are far off and highly suspicious as bad records for crop status monitoring. The Best Index Slope Extraction (BISE) is a typical method for eliminating extreme values during the growth stages. The algorithm assumes that cloud or haze contamination makes NDVI lower than usual and that, given how crops grow, a real drop over a short period cannot be extreme. The second level smooths and interpolates the time series using a smoothing algorithm. The third level fits an underlying kernel to the curve of the vegetation index over a growing season, typically to estimate the crop growth stage. The kernel enforces the logic that once a crop at a location reaches a growth stage, it should not fall back to an earlier stage: a monotonic increase in the pre-peak stage and a monotonic decrease in the post-peak stage must be ensured to avoid such illogical estimates.

5. Condition evaluation: There are several approaches to determining crop status (Meng and Wu 2008). First, the condition indicators are used directly to reflect the crop status. These indicators are generally designed to be positively correlated with crop condition: the higher the indicator, the better the crop condition. For example, VCI is related to the water condition of cropland (Kogan 1995; Yang et al. 2011b; Yu et al. 2012a).


Table 10.4 Vegetation indices for crop status monitoring

| Index | Description | Applications | References |
|---|---|---|---|
| Ratio vegetation index (RVI) | RVI = R/NIR | | Pearson and Miller (1972) |
| Vegetation index number (VIN) | VIN = NIR/R | | Pearson and Miller (1972) |
| Normalized difference vegetation index (NDVI) | NDVI = (NIR - R)/(NIR + R) | Agricultural crop primary production (Tucker and Sellers 1986) | Rouse (1974) and Rouse et al. (1974) |
| Agricultural vegetation index (AVI) | AVI = 2NIR - R | | Ashburn (1979) |
| Multitemporal vegetation index (MTVI) | MTVI = NDVI(date2) - NDVI(date1) | | Yazdani et al. (1981) |
| Normalized difference greenness index (NDGI) | NDGI = (G - R)/(G + R) | | Chamard et al. (1991) |
| Normalized difference index (NDI) | NDI = (NIR - MIR)/(NIR + MIR) | Agricultural residue (McNairn and Protz 1993) | McNairn and Protz (1993) |
| Soil-adjusted vegetation index (SAVI) | SAVI = (NIR - R)(1 + L)/(NIR + R + L) | | Huete (1988) |
| Vegetation condition index (VCI) | VCI = 100*(NDVI - min(NDVI))/(max(NDVI) - min(NDVI)) | | Kogan and Sullivan (1993) and Kogan (1995) |
| Normalized difference water index (NDWI) | NDWI = (NIR - MIR)/(NIR + MIR) | | Gao (1996) |
| Temperature condition index (TCI) | TCI = 100*(max(T) - T)/(max(T) - min(T)) | | Kogan (1995) |
| Vegetation health index (VHI) | VHI = a*VCI + (1 - a)*TCI | Drought monitoring (Karnieli et al. 2006) | Kogan (1995) |

Notes: R, red band; NIR, near-infrared band; L, soil adjustment factor, with a typical value of 0.5; a, a coefficient, normally set to 0.5 if there is no further information.
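Several of the indices in Table 10.4 can be computed directly from band reflectances and a multi-year history. The following is a minimal sketch, not an operational implementation: the function names and sample values are illustrative assumptions, and TCI is passed in as a precomputed value rather than derived from temperature data.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor is L in Table 10.4."""
    return (nir - red) * (1 + soil_factor) / (nir + red + soil_factor)


def vci(ndvi_now, ndvi_history):
    """Vegetation condition index: current NDVI scaled to the historical range (0-100)."""
    lowest, highest = min(ndvi_history), max(ndvi_history)
    return 100.0 * (ndvi_now - lowest) / (highest - lowest)


def vhi(vci_value, tci_value, a=0.5):
    """Vegetation health index as a weighted sum of VCI and TCI."""
    return a * vci_value + (1 - a) * tci_value


# Hypothetical reflectances for one pixel and a hypothetical same-week NDVI history.
current_ndvi = ndvi(nir=0.5, red=0.1)
current_savi = savi(nir=0.5, red=0.1)
current_vci = vci(0.6, [0.4, 0.5, 0.8])
current_vhi = vhi(current_vci, tci_value=30.0)
print(current_ndvi, current_savi, current_vci, current_vhi)
```

The sketch does not guard against a zero denominator (for example, a history with identical minimum and maximum); production code would need to mask such pixels.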

The higher the VCI, the more favorable the water condition of the cropland and the more likely the crop is in good condition. VHI is another indicator; it combines water (i.e., VCI) and temperature (i.e., TCI) to indicate indirectly how favorable the water and heat conditions are for crop growth. It can be used as a direct indicator of crop status. In addition, an empirical model may be established to relate crop conditions to crop yield at the end of the growing season (Dadhwal and Ray 2000; Shrestha et al. 2016). Regression models are commonly used to correlate crop yield with vegetation indices, e.g., NDVI (Li et al. 2007; Becker-Reshef et al. 2010b; Shrestha et al. 2016). The capability to forecast and


Table 10.5 Methods for eliminating, smoothing, or kernel-fitting time series of vegetation indices

| Algorithm | Description | References |
|---|---|---|
| Best index slope extraction (BISE) | Uses a moving window to remove sudden and extreme drops | Viovy et al. (1992) |
| Mean value iteration (MVI) | Replaces missing values with the mean through iterations | Ma and Veroustraete (2006) |
| Harmonic ANalysis of Time Series (HANTS) | Uses Fourier transform analysis to remove cloud-affected observations and temporally interpolates the remaining observations to construct a gapless dataset for a given time period | Roerink et al. (2000) |
| Polynomial fitting | Fits a polynomial function and reconstructs the complete profile | Dijk (1987) and Yu et al. (2012a) |
| Double sigmoid kernel fitting | Fits a double sigmoid function and reconstructs the complete profile | Beck et al. (2006) and Yu et al. (2012a) |
| Asymmetric Gaussian filtering | Fits an asymmetric Gaussian function and reconstructs the complete profile | Jonsson and Eklundh (2002) |
| 4253H, twice | Three runs with moving median smoothing: first, smoothing with window sizes 4, 2, 5, and 3 one after another; second, applying Hanning average convolution; third, repeating the first moving window median sequence | Velleman (1977, 1980) and Yu et al. (2012a) |
| Spline | Applies a sufficiently smooth polynomial function (e.g., cubic B-spline) piecewise | Chen et al. (2006) and Yu et al. (2012a) |
| Savitzky-Golay filter | Weighted average filtering | Chen et al. (2004) and Yu et al. (2012a) |

estimate crop yield from crop conditions is the driving force behind using direct crop condition indicators to tell what the crop status is (Table 10.5). Second, comparing a crop condition indicator against the same-period norm derived from historical data shows whether the crop status is worse or better than normal. The “norm” can differ depending on the period used: all historical data, the last few years (e.g., the last 5 years), or selected representative years (e.g., 3 selected no-disaster years). The rationale for using selected or recent years instead of all years is to focus on representative years that reflect the normal condition. For example, recent years may be more representative than all years because of technology change (e.g., genetic modification, seeding, varieties). The norm should be crop-specific. The norm can also be calculated from multiple years in different ways; commonly used methods are the mean, median, maximum, and percentile. The method of comparison can differ too. Ratio and subtraction are two simple operations for deriving the difference; a model or a more complex formula may also be used. Table 10.6 lists some of the most widely used crop condition comparison methods. Third, crop growth stages are indicators of the growth status of a crop and can be estimated from the fitted or smoothed crop condition profile (Yu et al. 2012b). The methods for estimating the crop growth stages from the crop condition profiles can


Table 10.6 Comparative crop condition evaluation methods

| Name | Description | References |
|---|---|---|
| Mean vegetation condition index (MVCI) | Current crop condition indicator minus the mean value from historical years, normalized against the mean value | Yang et al. (2011b) and Yu et al. (2012a) |
| Ratio to previous year (RNDVI) | Current crop condition indicator minus the value from the previous year, normalized against the previous year value | Yu et al. (2012a) |
| Ratio to previous five years | Current crop condition indicator minus the median or mean value from the previous five years, normalized against the median or mean value | Yu et al. (2012a) |
| Ratio to previous years | Current crop condition indicator minus the median or mean value from all previous years, normalized against the median or mean value | Yu et al. (2012a) |

be crop absolute condition index thresholds, empirical modeling, relative change ratio thresholds, maximum change rate, or function fitting (Yu et al. 2012b; Di et al. 2015). Most of these approaches require data covering the complete growing season. Some detect only the onset of phenology stages in remote sensing terms. Remote sensing-derived crop growth stages may not correspond exactly to physiological phenology stages, and the relationship may differ from crop to crop and from location to location. Nevertheless, there is a strong relationship between remote sensing-derived stages and physiological stages. To a certain degree, and in certain application areas, remotely sensed phenology stages can be transformed and related to the actual phenology stages. For in-season monitoring and estimation of crop growth stages, the method has to be adapted to work with the previous year's model or a typical model, with limited fitting and input data. Yu et al. (2012b) developed the progressive double sigmoid model fitting (PDSMF) algorithm, an approach of three partial model fittings, and applied it to estimate corn growth stages in the United States. In PDSMF, the asymmetric double sigmoid model is adopted as the kernel and fitted to a filtered “good” NDVI data profile. Three different models are used, one for each of three estimating stages. The study assumed that the NDVI profile has a single mode, which matches the growth development of corn in the United States. The three estimating stages are pre-peak, early post-peak, and late post-peak. The double sigmoid model of the same crop at approximately the same location in the previous year is used as the base model during the pre-peak and early post-peak stages. During the pre-peak period, only a shift of the previous year's model is enabled, by modeling one free parameter. During the early post-peak period, shifting and flattening of the previous year's model are enabled by modeling three parameters. During the late post-peak period, all parameters are free, which creates a newly fitted double sigmoid model. This approach makes efficient use of historical knowledge and of the data available at the time of estimation. The results of the study show reasonable accuracy in validation against surveyed datasets from USDA NASS.
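The pre-peak stage, in which only a time shift of the previous year's model is fitted, can be sketched as below. This is a simplified illustration and not the PDSMF implementation: the double sigmoid form is one common parameterization, and the function names, parameter values, grid search, and synthetic observations are all assumptions made for the example.

```python
import math


def double_sigmoid(t, base, amplitude, rise_rate, rise_day, fall_rate, fall_day):
    """Double sigmoid profile: rises around rise_day and falls around fall_day."""
    rise = 1.0 / (1.0 + math.exp(-rise_rate * (t - rise_day)))
    fall = 1.0 / (1.0 + math.exp(fall_rate * (t - fall_day)))
    return base + amplitude * (rise + fall - 1.0)


def fit_shift(days, observed, base_params, candidate_shifts):
    """Pick the time shift of the base model that minimizes squared error."""
    def squared_error(shift):
        return sum((double_sigmoid(t - shift, *base_params) - y) ** 2
                   for t, y in zip(days, observed))
    return min(candidate_shifts, key=squared_error)


# Hypothetical previous-year model: base, amplitude, rise rate, rise day,
# fall rate, fall day.
previous_year = (0.2, 0.6, 0.1, 150.0, 0.08, 260.0)

# Synthetic pre-peak observations: the same curve, but 5 days later.
days = list(range(120, 200, 8))
observed = [double_sigmoid(t - 5, *previous_year) for t in days]

best_shift = fit_shift(days, observed, previous_year, range(-20, 21))
print(best_shift)
```

The grid search recovers the 5-day delay built into the synthetic data. Freeing more parameters for the later stages would call for a proper nonlinear least-squares routine instead of a grid.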


Fourth, the condition indicators or derived values can be used as input parameters or verification datasets to be assimilated into a crop growth model for estimating and evaluating crop growth status (Dadhwal and Ray 2000). Two approaches (Bouman 1995; Dadhwal and Ray 2000) are commonly used to assimilate remote sensing-derived condition indicators into a crop growth model: one uses the remote sensing-derived information as input parameters to drive the model, and the other uses it to verify model output and thereby optimize the model (Maas 1988). The leaf area index (LAI) is one of the remote sensing-derived variables most frequently assimilated into crop growth models. It has been used in both approaches, as input parameter (Doraiswamy 2002; Doraiswamy et al. 2003; Fang et al. 2011) and as output verification (Clevers 1997; Hong et al. 2004), because LAI is an input to crop growth models and also an intermediate output of many crop growth simulation models. Fifth, diagnostic modeling or stress modeling is another way to use remote sensing-derived crop condition indices in estimating and forecasting crop yield (Dadhwal and Ray 2000). Flooding and drought are the two extremes of water stress on crop growth: excess and deficit. The relative water content derived from crop vegetation indices can be used to evaluate the water condition of the crop (Kogan 1995; Gao 1996; Yang et al. 2011b). Extreme events, e.g., flooding, may pose different degrees of damage to crops depending on their growth stages, and their impact may be assessed with remote sensing-derived crop condition indices and their profiles over the growing season (Di et al. 2013; Shrestha et al. 2013, 2017).

6. Validation: A ground survey conducted at the same time as the satellite observations is the most straightforward and accurate way to verify the results (Chu et al. 2016).
However, because of resource constraints, ground surveys are limited in the number of samples and the frequency of revisits. Alternative ways to evaluate crop condition monitoring are evaluation at aggregated (coarser) temporal and/or spatial resolution (Kim and Kaluarachchi 2015; Zhang et al. 2016), evaluation with higher-resolution satellite or aerial imagery (Huang et al. 2012; Kim and Kaluarachchi 2015), and evaluation against results from other methods or models, e.g., statistical surveys (Wall et al. 2008; Becker-Reshef et al. 2010b), alternative operational systems (Baruth et al. 2008; Atkinson et al. 2012; Mladenova et al. 2017), and crowdsourcing (Fritz et al. 2012).
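The first two denoising levels described in step 4, removing suspicious drops and then interpolating, can be illustrated with a deliberately simplified filter. It follows the same assumption as BISE (contamination lowers NDVI, and real drops are not abrupt), but it is not the BISE algorithm. The function names, the 20% threshold, and the sample series are assumptions, and the rule presumes positive NDVI values.

```python
def flag_sudden_drops(series, max_drop=0.2):
    """Flag interior points lying more than max_drop (a fraction) below both neighbours."""
    flags = [False] * len(series)
    for i in range(1, len(series) - 1):
        limit_prev = series[i - 1] * (1.0 - max_drop)
        limit_next = series[i + 1] * (1.0 - max_drop)
        if series[i] < limit_prev and series[i] < limit_next:
            flags[i] = True
    return flags


def interpolate_flagged(series, flags):
    """Linearly interpolate flagged points from the nearest unflagged neighbours."""
    result = list(series)
    good = [i for i, bad in enumerate(flags) if not bad]
    for i, bad in enumerate(flags):
        if not bad:
            continue
        left = max((j for j in good if j < i), default=None)
        right = min((j for j in good if j > i), default=None)
        if left is None or right is None:
            continue  # no unflagged value on one side; keep the original
        weight = (i - left) / (right - left)
        result[i] = series[left] + weight * (series[right] - series[left])
    return result


# Hypothetical weekly NDVI values; the third value looks cloud-contaminated.
weekly_ndvi = [0.2, 0.3, 0.1, 0.5, 0.6]
flags = flag_sudden_drops(weekly_ndvi)
cleaned = interpolate_flagged(weekly_ndvi, flags)
print(flags, cleaned)
```

Because each point is compared only with its immediate neighbours, two contaminated dates in a row are not caught; windowed methods such as those in Table 10.5 address that case.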

10.3.3 Case Study – Operational Remote Sensing Crop Condition Monitoring

10.3.3.1 National Crop Progress Monitoring System

The National Crop Progress Monitoring System (NCPMS) is a collaborative project of George Mason University, USDA NASS, and NASA that developed an operational national crop condition monitoring system for the contiguous United States (Mueller 2013; Yang et al. 2013; Di et al. 2015; Yang et al. 2016). Remote sensing data from NASA are used extensively to assess crop condition in a timely fashion. The process involves all aspects of the geospatial processing described in the previous section. Several crop condition indices are calculated with a preconfigured, scheduled, automatic geospatial processing workflow that minimizes the delay between satellite observation and product release. MODIS and its products from NASA are used as the base for computing the crop condition indices. These indices are NDVI, the vegetation condition index (VCI), the ratio to previous year vegetation condition index (RVCI), the ratio to median vegetation condition index (RMVCI) relative to the previous years since 2000, and the mean vegetation condition index (MVCI) (Yang et al. 2011b; Mueller 2013). Weekly and biweekly maximum-value composites of these products are also made available. Figure 10.5 shows an example of a weekly vegetation condition index map served through VegScape, an open-source-based, flexible, user-friendly online geospatial explorer. VegScape supports geospatial queries, map making, and statistics. Open geospatial web service interfaces (APIs) are also available, including the OGC Web Map Service and the OGC Web Coverage Service (Yang et al. 2013).

Fig. 10.5 Weekly vegetation condition index (May 16–22, 2017). (Source: https://nassgeodata.gmu.edu/VegScape/)

Crop growth stage estimation is built on top of the crop condition indices. Different smoothing algorithms were assessed during the research and development period (Yu et al. 2012b; Di et al. 2015). The double sigmoid model is used as the base kernel for fitting and modeling the crop growing season. Ten major crops in the United States are modeled: corn, cotton, soybean, sorghum, spring wheat, peanuts, rice, barley, oats, and winter wheat. Major growth stages are estimated weekly. Figure 10.6 shows a weekly map of the percentage of corn at the dough growth stage.
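The family of condition indices named above can be illustrated for a single pixel. The formulas in this sketch are assumptions based on commonly cited definitions (VCI as the position of the current NDVI between its historical minimum and maximum, and the other three as percentage departures from the previous year, the historical median, and the historical mean). The exact operational formulas should be checked against Yang et al. (2011b) and Mueller (2013). All function and variable names are hypothetical.

```python
from statistics import mean, median


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def condition_indices(current, history):
    """Condition indices for one pixel and one week.

    current: NDVI of the current week.
    history: NDVI of the same week in previous years, oldest first
             (assumed non-constant and non-zero so the ratios are defined).
    """
    lowest, highest = min(history), max(history)
    previous_year = history[-1]
    hist_median = median(history)
    hist_mean = mean(history)
    return {
        # Assumed definitions, expressed in percent.
        "VCI": 100.0 * (current - lowest) / (highest - lowest),
        "RVCI": 100.0 * (current - previous_year) / previous_year,
        "RMVCI": 100.0 * (current - hist_median) / hist_median,
        "MVCI": 100.0 * (current - hist_mean) / hist_mean,
    }


history = [0.40, 0.50, 0.60, 0.70, 0.80]  # illustrative values only
current = ndvi(nir=0.50, red=0.125)       # 0.375 / 0.625 = 0.6
indices = condition_indices(current, history)
for name, value in indices.items():
    print(f"{name}: {value:.1f}")
```

In this example the current NDVI sits at the historical median and mean, so RMVCI and MVCI are close to zero, while RVCI is negative because the previous year was greener.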

10

Crop Pattern and Status Monitoring

Fig. 10.6 Weekly corn growth stage (Crop: corn, Stage: dough, Period: weekly, July 31–August 6, 2012). (Source: https://dss.csiss.gmu.edu/CropGrowth/)
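The double sigmoid kernel mentioned in this subsection can be sketched as a double logistic curve: one rising sigmoid for green-up and one falling sigmoid for senescence. The specific functional form below follows the style of Beck et al. (2006) and is an assumption, since the text does not give the exact formulation used in the operational system. The parameter names and values are hypothetical.

```python
import math


def double_logistic(t, v_min, v_max, rise_rate, rise_day, fall_rate, fall_day):
    """Seasonal vegetation index profile as the sum of a rising and a falling sigmoid.

    rise_day and fall_day are the inflection points of green-up and senescence.
    """
    rise = 1.0 / (1.0 + math.exp(-rise_rate * (t - rise_day)))
    fall = 1.0 / (1.0 + math.exp(fall_rate * (t - fall_day)))
    return v_min + (v_max - v_min) * (rise + fall - 1.0)


# Illustrative parameters: symmetric season with inflections on days 150 and 250.
params = dict(v_min=0.2, v_max=0.8, rise_rate=0.1, rise_day=150,
              fall_rate=0.1, fall_day=250)

curve = [double_logistic(day, **params) for day in range(1, 366)]
peak_day = 1 + max(range(len(curve)), key=curve.__getitem__)
print(peak_day, round(max(curve), 3))
```

Once such a curve has been fitted to a smoothed index time series, dates such as the inflection points or the peak can serve as candidate anchors for growth stage estimation. Fitting itself would require a nonlinear least-squares routine, which is omitted here.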

10.3.3.2 Global Agricultural Monitoring

The Global Agriculture Monitoring (GLAM) project built an operational global agricultural monitoring system for the USDA Foreign Agricultural Service (FAS) (Becker-Reshef et al. 2010a). MODIS is the primary source of satellite remotely sensed data for evaluating and assessing crop status globally at coarse and moderate resolution. The 8-day and 16-day surface reflectance and vegetation index products are derived from MODIS. A cumulative vegetation index (CVI) is also computed from the start of the growing season. A multi-period CVI anomaly is computed as well, to represent crop status relative to the average total greenness of all previous seasons (Becker-Reshef et al. 2010a).
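The cumulative vegetation index and its anomaly can be sketched as follows. The sketch assumes that the CVI is a running sum of the periodic vegetation index from the start of the season, and that the anomaly is the difference between the current CVI and the mean CVI of previous seasons over the same number of periods. Whether GLAM expresses the anomaly as a difference or a ratio is not stated in the text, so this is an illustrative choice, and the function names are hypothetical.

```python
from itertools import accumulate


def cumulative_vi(vi_series):
    """Running sum of a vegetation index from the start of the growing season."""
    return list(accumulate(vi_series))


def cvi_anomaly(current_season, previous_seasons):
    """Current CVI minus the mean CVI of previous seasons over the same periods."""
    n_periods = len(current_season)
    current_cvi = cumulative_vi(current_season)[-1]
    past_cvi = [cumulative_vi(season[:n_periods])[-1] for season in previous_seasons]
    return current_cvi - sum(past_cvi) / len(past_cvi)


# Illustrative composites; the current season has three periods observed so far.
current = [0.25, 0.5, 1.0]
previous = [
    [0.25, 0.5, 0.5, 0.75],
    [0.5, 0.5, 0.75, 0.75],
]

print(cumulative_vi(current))
anomaly = cvi_anomaly(current, previous)
print(anomaly)
```

A positive anomaly indicates more accumulated greenness than the historical average at the same point in the season.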

10.3.3.3 Other Operational Crop Status Monitoring Systems

There are several other operational crop status monitoring systems that use remote sensing as one of their data sources. The Monitoring Agricultural ResourceS (MARS) program of the Joint Research Centre (JRC) of the European Commission is one of the long-running operational agricultural monitoring programs. Remote sensing is one of its main techniques, in addition to the statistical approach. The remotely sensed data come from the NOAA Advanced Very High Resolution Radiometer (AVHRR), METOP AVHRR, SPOT-VEGETATION, TERRA-MODIS, and MSG-SEVIRI. Crop condition indicators from remote sensing are reported as the fraction of absorbed photosynthetically active radiation (fPAR) in the monthly crop monitoring bulletin (Baruth et al. 2008; Atzberger 2013). Crop development stage is also estimated and reported using the Crop Growth Monitoring System (CGMS) (Supit et al. 2012). The CropWatch program at the Institute of Remote Sensing Applications (IRSA) of the Chinese Academy of Sciences (CAS) makes extensive use of a suite of satellite remotely sensed data to model and monitor crops worldwide (Wu and Li 2004; Wu et al. 2010, 2014). The crop condition indices derived from remote sensing include the vegetation health index (VHI) and the vegetation condition index (VCI). Crop condition is assessed at four geospatial levels: global monitoring and reporting units, regional major production zones, national reports for 31 major countries, and subnational reports for 9 large countries (Wu et al. 2015).
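The vegetation health index mentioned above is commonly described as a weighted combination of the VCI and a temperature condition index (TCI). The sketch below assumes that widely cited formulation, with an equal weight of 0.5. The text does not state which formulation or weighting CropWatch uses, and the function names and input values are hypothetical.

```python
def tci(bt, bt_min, bt_max):
    """Temperature condition index in percent; hotter than usual gives lower values."""
    return 100.0 * (bt_max - bt) / (bt_max - bt_min)


def vhi(vci_value, tci_value, alpha=0.5):
    """Vegetation health index as a weighted mean of VCI and TCI (alpha is assumed)."""
    return alpha * vci_value + (1.0 - alpha) * tci_value


vci_value = 50.0                                   # VCI in percent, taken as given
tci_value = tci(bt=295.0, bt_min=290.0, bt_max=310.0)  # brightness temperature in kelvin
vhi_value = vhi(vci_value, tci_value)
print(tci_value, vhi_value)
```

Low VHI values are usually read as a sign of vegetation stress, since they combine below-normal greenness with above-normal surface temperature.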

10.3.4 Limitations and Perspectives

The limitations of the remote sensing approach for crop status monitoring are as follows.

1. Non-crop-specific spatial resolution of high-temporal-resolution remotely sensed data: High temporal resolution is crucial in monitoring crop status (Basso et al. 2013). This requirement rules out many very-high-spatial-resolution satellite sensors, whose revisit intervals are longer than 5 days or even several weeks. One week during the growing season makes a large difference to crops, and continuous monitoring requires short revisits with high-quality data. Most of the commonly used satellite sensors with frequent revisits, such as AVHRR and MODIS, have moderate or even coarse spatial resolution, so the footprint of each pixel is a mixture of many ground features. The mixture makes it much harder to obtain crop-specific status over time, and the problem becomes even more serious when small household farms are investigated (Fermont and Benson 2011).

2. The significant effect of cloud and haze on crop condition indices: Optical remote sensing is the most commonly used technology for monitoring crop conditions, but cloud cover makes optical imagery unusable for this purpose. Cloud-free images are hard to obtain in certain agricultural areas (King et al. 1995; Eberhardt et al. 2016). Cloud masking is often applied, and high temporal resolution helps to fill in the masked parts. However, the areas surrounding a cloud mask are often affected by thin cloud or haze, which significantly reduces the vegetation indices (Eberhardt et al. 2016). This depresses the derived crop condition and leads to false readings.

3. Saturation of crop condition indices: Vegetation indices are often used as indicators of crop health. However, they saturate when crop cover and leaf density exceed certain thresholds (Haboudane 2004). Saturated vegetation indices correlate poorly with crop yield or condition (Zarco-Tejada et al. 2005). This makes it difficult to relate satellite information to quantitative crop yield estimates at different scales (King et al. 1995).

4. Sensing time misalignment in maximum value composites: Maximum value compositing is often used to create the periodic vegetation indices that are commonly used to compare crop conditions for the same period. The combined effect of cloud and atmosphere causes observations from quite different dates to be selected within a composite. In the peak growing phases of crops, a difference of several days leads to a significant difference in appearance and in the derived indices, which makes same-period comparison difficult.

5. Incompatibility of crop condition indices across time and sensors: Time series are required to monitor crop condition, and a time series may span more than one sensor. Spectral measurements differ across time and sensors because of atmospheric and sensor effects, which causes incompatibility between the derived crop condition indices (Dadhwal and Ray 2000). Fine radiometric correction is needed.

The recent advancements and their impact on monitoring crop status with remote sensing are summarized as follows.

1. Improved temporal resolution with high spatial resolution: Constellations of satellites or small satellites make it possible to increase the revisit frequency while preserving subfield spatial resolution (Butler 2014). Crop-specific and field-level monitoring becomes possible at meter or submeter resolution with revisits of less than 3 days or even daily (Marshall and Boshuizen 2013; Purdy 2016; Ruban et al.).

2. Radiometric correction improvements: New algorithms and technologies are emerging for radiometric correction and fusion (Roy et al. 2016; Kautz 2017). The fusion and analysis of multitemporal remotely sensed data across time and sensors are becoming more accessible for crop monitoring (Gao et al. 2016, 2017).

3. Enhanced time series data processing capabilities: Machine learning technologies and time series analytics have advanced. These algorithms and technologies have been applied in crop monitoring with improved results (de Villiers 2017; Nagol et al. 2017; You et al. 2017; Shelestov et al. 2017).
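The sensing time misalignment described in limitation 4 above can be made concrete with a small sketch of maximum value compositing that also records which day was selected. The data structure (a list of day-of-year and NDVI pairs, with None marking cloud-masked observations) and the sample values are hypothetical.

```python
def max_value_composite(observations):
    """Return the (day_of_year, ndvi) pair with the highest NDVI among valid observations.

    observations: list of (day_of_year, ndvi) tuples; ndvi is None when cloud-masked.
    Returns None if every observation in the period is masked.
    """
    valid = [(day, value) for day, value in observations if value is not None]
    if not valid:
        return None
    return max(valid, key=lambda pair: pair[1])


# Two neighboring pixels in the same weekly compositing period.
pixel_a = [(201, 0.71), (203, None), (205, 0.78), (207, 0.74)]
pixel_b = [(201, 0.69), (203, 0.62), (205, None), (207, None)]

composite_a = max_value_composite(pixel_a)
composite_b = max_value_composite(pixel_b)
day_gap = abs(composite_a[0] - composite_b[0])
print(composite_a, composite_b, day_gap)
```

The two pixels share one composite period, yet their values come from acquisitions four days apart, which is the kind of difference that complicates same-period comparison during rapid growth. Keeping the selected day alongside the composite value is one way to make this misalignment visible.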

10.4 Conclusions

Crop pattern and status monitoring has traditionally been carried out operationally with statistical approaches based on a sampling frame. The statistical approach yields approximate reports at different administrative levels. Remote sensing has been increasingly adopted in monitoring crop pattern and status. This chapter has described the general workflows for applying remote sensing to both crop pattern monitoring and crop status monitoring, and has reviewed operational cases of each. Advances in remote sensing and related processing capabilities make it possible to monitor crop pattern and crop status operationally at very high spatial and temporal resolution.

E. G. Yu and Z. Yang

The growing constellations of satellite sensors with different spatial resolutions shorten the revisit cycles and thus meet the high-frequency requirements of crop monitoring. Further development in machine learning, time series analytics, and massively parallel computing technologies will facilitate the timely monitoring of crops down to the field level. The application of remotely sensed data and derived information on crop pattern and status is expanding further into precision agriculture.

References

Abreu, D. A., McCarthy, J. S., & Colburn, L. A. (2010). Impact of the screening procedures of the June Area Survey on the number of farms estimates.
Abreu, D. A., Lamas, A. C., Sang, H., et al. (2011). On the feasibility of using NASS’s sampling list frame to evaluate misclassification errors of the June area survey. United States Department of Agriculture, National Agricultural Statistics Service.
Allen, R., Hanuschak, G., & Craig, M. (2002). History of remote sensing for crop acreage in USDA’s National Agricultural Statistics Service.
Anami, B. S., Pujari, J. D., & Yakkundimath, R. (2011). Identification and classification of normal and affected agriculture/horticulture produce based on combined color and texture feature extraction. International Journal of Computer Applications in Engineering Sciences, 1, 356–360.
Ashburn, P. (1979). The vegetative index number and crop identification. In Proceedings of the LACIE symposium, pp. 843–856.
Atkinson, P. M., Jeganathan, C., Dash, J., & Atzberger, C. (2012). Inter-comparison of four models for smoothing satellite sensor time-series data to estimate vegetation phenology. Remote Sensing of Environment, 123, 400–417. https://doi.org/10.1016/j.rse.2012.04.001.
Atzberger, C. (2013). Advances in remote sensing of agriculture: Context description, existing operational monitoring systems and major information needs. Remote Sensing, 5, 949–981. https://doi.org/10.3390/rs5020949.
Badhwar, G. D., Carnes, J. G., & Austin, W. W. (1982). Use of Landsat-derived temporal profiles for corn-soybean feature extraction and classification. Remote Sensing of Environment, 12, 57–79.
Bannari, A., Morin, D., Bonn, F., & Huete, A. R. (1995). A review of vegetation indices. Remote Sensing Reviews, 13, 95–120. https://doi.org/10.1080/02757259509532298.
Baruth, B., Royer, A., Klisch, A., & Genovese, G. (2008). The use of remote sensing within the MARS crop yield monitoring system of the European Commission. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 37, 935–940.
Basso, B., Cammarano, D., & Carfagna, E. (2013). Review of crop yield forecasting methods and early warning systems. In The first Scientific Advisory Committee meeting, Global Strategy. Food and Agriculture Organization of the United Nations, Rome, Italy.
Beck, P. S. A., Atzberger, C., Høgda, K. A., et al. (2006). Improved monitoring of vegetation dynamics at very high latitudes: A new method using MODIS NDVI. Remote Sensing of Environment, 100, 321–334. https://doi.org/10.1016/j.rse.2005.10.021.
Becker-Reshef, I., Justice, C., Sullivan, M., et al. (2010a). Monitoring global croplands with coarse resolution earth observations: The Global Agriculture Monitoring (GLAM) project. Remote Sensing, 2, 1589–1609. https://doi.org/10.3390/rs2061589.
Becker-Reshef, I., Vermote, E., Lindeman, M., & Justice, C. (2010b). A generalized regression-based model for forecasting winter wheat yields in Kansas and Ukraine using MODIS data. Remote Sensing of Environment, 114, 1312–1323. https://doi.org/10.1016/j.rse.2010.01.010.

Boryan, C., Yang, Z., Mueller, R., & Craig, M. (2011). Monitoring US agriculture: The US Department of Agriculture, National Agricultural Statistics Service, Cropland Data Layer Program. Geocarto International, 26, 341–358. https://doi.org/10.1080/10106049.2011.562309.
Bosecker, R. R. (1988). Sampling methods in agriculture. National Agricultural Statistics Service, US Department of Agriculture.
Bouman, B. A. M. (1995). Crop modelling and remote sensing for yield prediction. NJAS Wageningen Journal of Life Sciences, 43, 143–161.
Butler, D. (2014). Many eyes on Earth. Nature, 505, 143–144.
Chamard, P., Courel, M. F., Ducousso, M., et al. (1991). Utilisation des bandes spectrales du vert et du rouge pour une meilleure évaluation des formations végétales actives. Télédétection et Cartographie, 203–209.
Chen, C., & McNairn, H. (2006). A neural network integrated approach for rice crop monitoring. International Journal of Remote Sensing, 27, 1367–1393. https://doi.org/10.1080/01431160500421507.
Chen, J., Jönsson, P., Tamura, M., et al. (2004). A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky–Golay filter. Remote Sensing of Environment, 91, 332–344. https://doi.org/10.1016/j.rse.2004.03.014.
Chen, J. M., Deng, F., & Chen, M. (2006). Locally adjusted cubic-spline capping for reconstructing seasonal trajectories of a satellite-derived surface parameter. IEEE Transactions on Geoscience and Remote Sensing, 44, 2230–2238. https://doi.org/10.1109/TGRS.2006.872089.
Chu, L., Liu, Q., Huang, C., & Liu, G. (2016). Monitoring of winter wheat distribution and phenological phases based on MODIS time-series: A case study in the Yellow River Delta, China.
Clevers, J. G. P. (1997). A simplified approach for yield prediction of sugar beet based on optical remote sensing data. Remote Sensing of Environment, 61, 221–228. https://doi.org/10.1016/S0034-4257(97)00004-7.
Conrad, C., Fritsch, S., Zeidler, J., et al. (2010). Per-field irrigated crop classification in arid central Asia using SPOT and ASTER data. Remote Sensing, 2, 1035–1056. https://doi.org/10.3390/rs2041035.
Dadhwal, V. K., & Ray, S. S. (2000). Crop assessment using remote sensing-Part-II: Crop condition and yield assessment. Indian Journal of Agricultural Economics, 55, 55.
Danaher, T., Wu, X., & Campbell, N. (2001). Bi-directional reflectance distribution function approaches to radiometric calibration of Landsat ETM+ imagery. In Geoscience and remote sensing symposium, 2001. IGARSS’01. IEEE 2001 International. IEEE, pp. 2654–2657.
de Villiers, M. (2017). Predicting tomato crop yield from weather data using statistical learning techniques. Faculty of Economic and Management Sciences at Stellenbosch University Department of Statistics and Actuarial Sciences, University of Stellenbosch.
Di, L., Yu, G., Kang, L., et al. (2013). A remote-sensing-based flood crop loss assessment cyberservice system for supporting crop statistics and insurance decision making. In Proceedings of IEEE international conference on systems, man, and cybernetics (IEEE SMC2013) special session on environmental sensing, networking and decision making, October 13–16, 2013, Manchester, UK. IEEE, Manchester, UK.
Di, L., Yu, E. G., Yang, Z., et al. (2015). Remote sensing based crop growth stage estimation model. IEEE, pp. 2739–2742.
Dijk, V. A. N. (1987). Smoothing vegetation index profiles: An alternative method for reducing radiometric disturbance in NOAA/AVHRR data. Photogrammetric Engineering and Remote Sensing, 53, 1059–1067.
Doraiswamy, P. (2002). Application of MODIS-derived parameters for regional yield assessment. In Proceedings of SPIE. Toulouse, France, pp. 1–8.
Doraiswamy, P. C., Moulin, S., Cook, P. W., & Stern, A. (2003). Crop yield assessment from remote sensing. Photogrammetric Engineering and Remote Sensing, 69, 665–674.

Eberhardt, I., Schultz, B., Rizzi, R., et al. (2016). Cloud cover assessment for operational crop monitoring systems in tropical areas. Remote Sensing, 8, 219. https://doi.org/10.3390/rs8030219.
Engmann, E. T., Schmugge, T. J., & O’Neill, P. E. (1986). Agriculture and resources inventory surveys through aerospace remote sensing (AgRISTARS).
Fang, H., Liang, S., & Hoogenboom, G. (2011). Integration of MODIS LAI and vegetation index products with the CSM–CERES–Maize model for corn yield estimation. International Journal of Remote Sensing, 32, 1039–1065. https://doi.org/10.1080/01431160903505310.
Fermont, A., & Benson, T. (2011). Estimating yield of food crops grown by smallholder farmers (pp. 1–68). Washington DC: International Food Policy Research Institute.
Fisette, T., Davidson, A., Daneshfar, B., et al. (2014). Annual space-based crop inventory for Canada: 2009–2014. IEEE, pp. 5095–5098.
Foody, G. M., & Mathur, A. (2004). Toward intelligent training of supervised image classifications: directing training data acquisition for SVM classification. Remote Sensing of Environment, 93, 107–117. https://doi.org/10.1016/j.rse.2004.06.017.
Fritz, S., Purgathofer, P., Kayali, F., et al. (2012). Landspotting: Social gaming to collect vast amounts of data for satellite validation. In EGU general assembly conference abstracts, p. 13173.
Gallego, F. J. (1999). Crop area estimation in the MARS project. In Conference on ten years of the MARS project.
Gallego, F. J. (2004). Remote sensing and land cover area estimation. International Journal of Remote Sensing, 25, 3019–3047. https://doi.org/10.1080/01431160310001619607.
Gao, B. (1996). NDWI – A normalized difference water index for remote sensing of vegetation liquid water from space. Remote Sensing of Environment, 58, 257–266. https://doi.org/10.1016/S0034-4257(96)00067-3.
Gao, F., Anderson, M. C., & Xie, D. (2016). Spatial and temporal information fusion for crop condition monitoring. IEEE, pp. 3579–3582.
Gao, F., Anderson, M. C., Zhang, X., et al. (2017). Toward mapping crop progress at field scales through fusion of Landsat and MODIS imagery. Remote Sensing of Environment, 188, 9–25. https://doi.org/10.1016/j.rse.2016.11.004.
GIEWS F. (2013). Global information and early warning system; food price data and analysis tool.
Good, D. L., & Irwin, S. H. (2006). Understanding USDA corn and soybean production forecasts: Methods, performance and market impacts over 1970–2005.
Good, D., & Irwin, S. (2016). Opening up the black box: More on the USDA corn yield forecasting methodology.
Haboudane, D. (2004). Hyperspectral vegetation indices and novel algorithms for predicting green LAI of crop canopies: Modeling and validation in the context of precision agriculture. Remote Sensing of Environment, 90, 337–352. https://doi.org/10.1016/j.rse.2003.12.013.
Hale, R. C., Hanuschak, G., & Craig, M. E. (1999). The appropriate role of remote sensing in US agricultural statistics. FAO Regional Project, Improvement of Agricultural Statistics in Asia and Pacific Countries.
Han, W., Yang, Z., Di, L., & Mueller, R. (2012). CropScape: A Web service based application for exploring and disseminating US conterminous geospatial cropland data products for decision support. Computers and Electronics in Agriculture, 84, 111–123. https://doi.org/10.1016/j.compag.2012.03.005.
Hansen, M. C., & Loveland, T. R. (2012). A review of large area monitoring of land cover change using Landsat data. Remote Sensing of Environment, 122, 66–74. https://doi.org/10.1016/j.rse.2011.08.024.
Hanuschak, G. A. Sr. (2013). Timely and accurate crop yield forecasting and estimation: History and initial gap analysis. In The first Scientific Advisory Committee meeting, Global Strategy. Food and Agriculture Organization of the United Nations, Rome, Italy.

Hong, S.-Y., Sudduth, K.-A., Kitchen, N.-R., et al. (2004). Comparison of remote sensing and crop growth models for estimating within-field LAI variability. Korean Journal of Remote Sensing, 20, 175–188.
Huang, Q., Zhou, Q., Wu, W., et al. (2012). Extraction of planting areas of major crops and crop growth monitoring in northeast China. Intelligent Automation & Soft Computing, 18, 1023–1033. https://doi.org/10.1080/10798587.2008.10643307.
Huete, A. (1988). A soil-adjusted vegetation index (SAVI). Remote Sensing of Environment, 25, 295–309. https://doi.org/10.1016/0034-4257(88)90106-X.
Hulley, G. C., & Hook, S. J. (2008). A new methodology for cloud detection and classification with ASTER data. Geophysical Research Letters. https://doi.org/10.1029/2008GL034644.
Immitzer, M., Vuolo, F., & Atzberger, C. (2016). First experience with sentinel-2 data for crop and tree species classifications in central Europe. Remote Sensing, 8, 166. https://doi.org/10.3390/rs8030166.
Irwin, S. H., Sanders, D. R., & Good, D. L. (2014). Evaluation of selected USDA WAOB and NASS forecasts and estimates in corn and soybeans.
JARS. (1999). Remote sensing notes. Japan Association of Remote Sensing.
Jiao, X., Kovacs, J. M., Shang, J., et al. (2014). Object-oriented crop mapping and monitoring using multi-temporal polarimetric RADARSAT-2 data. ISPRS Journal of Photogrammetry and Remote Sensing, 96, 38–46. https://doi.org/10.1016/j.isprsjprs.2014.06.014.
Johnson, D. M., & Mueller, R. (2010). The 2009 cropland data layer. PE&RS, Photogrammetric Engineering & Remote Sensing, 76, 1201–1205.
Jonsson, P., & Eklundh, L. (2002). Seasonality extraction by function fitting to time-series of satellite sensor data. IEEE Transactions on Geoscience and Remote Sensing, 40, 1824–1832. https://doi.org/10.1109/TGRS.2002.802519.
Karnieli, A., Bayasgalan, M., Bayarjargal, Y., et al. (2006). Comments on the use of the vegetation health index over Mongolia. International Journal of Remote Sensing, 27, 2017–2024. https://doi.org/10.1080/01431160500121727.
Kaufman, Y. J., Tanré, D., Gordon, H. R., et al. (1997). Passive remote sensing of tropospheric aerosol and atmospheric correction for the aerosol effect. Journal of Geophysical Research: Atmospheres, 102, 16815–16830.
Kautz, J. S. (2017). In-situ cameras for radiometric correction of remotely sensed data. The University of Arizona.
Kim, D., & Kaluarachchi, J. (2015). Validating FAO AquaCrop using Landsat images and regional crop information. Agricultural Water Management, 149, 143–155. https://doi.org/10.1016/j.agwat.2014.10.013.
Kim, H.-O., & Yeom, J.-M. (2015). Sensitivity of vegetation indices to spatial degradation of RapidEye imagery for paddy rice detection: A case study of South Korea. GIScience & Remote Sensing, 52, 1–17. https://doi.org/10.1080/15481603.2014.1001666.
King, D., Jones, R. J. A., & Thomasson, A. J. (Eds.). (1995). European land information systems for agro-environmental monitoring. Joint Research Centre, European Commission, Luxembourg.
Kogan, F. N. (1995). Droughts of the late 1980s in the United States as derived from NOAA polar-orbiting satellite data. Bulletin of the American Meteorological Society, 76, 655–668.
Kogan, F., & Sullivan, J. (1993). Development of global drought-watch system using NOAA/AVHRR data. Advances in Space Research, 13, 219–222. https://doi.org/10.1016/0273-1177(93)90548-P.
Kussul, N., Lavreniuk, M., Skakun, S., & Shelestov, A. (2017). Deep learning classification of land cover and crop types using remote sensing data. IEEE Geoscience and Remote Sensing Letters, 14, 778–782. https://doi.org/10.1109/LGRS.2017.2681128.
Lei, T. C., Wan, S., & Chou, T. Y. (2008). The comparison of PCA and discrete rough set for feature extraction of remote sensing image classification – A case study on rice classification, Taiwan. Computational Geosciences, 12, 1–14. https://doi.org/10.1007/s10596-007-9057-7.

Li, A., Liang, S., Wang, A., & Qin, J. (2007). Estimating crop yield from multi-temporal satellite data using multivariate regression and neural network techniques. Photogrammetric Engineering & Remote Sensing, 73, 1149–1157. https://doi.org/10.14358/PERS.73.10.1149.
Lopiano, K. K., Lamas, A. C., Abreu, D. A., et al. (2011). Adjusting the June area survey estimate of the number of US farms for misclassification and non-response. United States Department of Agriculture, National Agricultural Statistics Service.
Löw, F., Michel, U., Dech, S., & Conrad, C. (2013). Impact of feature selection on the accuracy and spatial uncertainty of per-field crop classification using support vector machines. ISPRS Journal of Photogrammetry and Remote Sensing, 85, 102–119.
Lu, D., & Weng, Q. (2007). A survey of image classification methods and techniques for improving classification performance. International Journal of Remote Sensing, 28, 823–870. https://doi.org/10.1080/01431160600746456.
Lu, D., Ge, H., He, S., et al. (2008). Pixel-based Minnaert correction method for reducing topographic effects on a landsat 7 ETM+ image. Photogrammetric Engineering & Remote Sensing, 74, 1343–1350. https://doi.org/10.14358/PERS.74.11.1343.
Lyapustin, A., Wang, Y., & Frey, R. (2008). An automatic cloud mask algorithm based on time series of MODIS measurements. Journal of Geophysical Research. https://doi.org/10.1029/2007JD009641.
Ma, M., & Veroustraete, F. (2006). Reconstructing pathfinder AVHRR land NDVI time-series data for the Northwest of China. Advances in Space Research, 37, 835–840. https://doi.org/10.1016/j.asr.2005.08.037.
Maas, S. J. (1988). Use of remotely-sensed information in agricultural crop growth models. Ecological Modelling, 41, 247–268. https://doi.org/10.1016/0304-3800(88)90031-2.
MacDonald, R. B., & Hall, F. G. (1980). Global crop forecasting. Science, 208, 670–679.
Marshall, W., & Boshuizen, C. (2013). Planet labs’ remote sensing satellite system.
Martínez-Casasnovas, J. A., Martín-Montero, A., & Auxiliadora Casterad, M. (2005). Mapping multi-year cropping patterns in small irrigation districts from time-series analysis of Landsat TM images. European Journal of Agronomy, 23, 159–169. https://doi.org/10.1016/j.eja.2004.11.004.
Mather, P. M., & Koch, M. (2011). Computer processing of remotely-sensed images: An introduction (4th ed.). Oxford: Wiley-Blackwell.
McNairn, H., & Protz, R. (1993). Mapping corn residue cover on agricultural fields in Oxford County, Ontario, using Thematic Mapper. Canadian Journal of Remote Sensing, 19, 152–159. https://doi.org/10.1080/07038992.1993.10874543.
Meng, J., & Wu, B. (2008). Study on the crop condition monitoring methods with remote sensing. In J. Chen (Ed.), The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (pp. 945–956). Beijing: International Society for Photogrammetry and Remote Sensing.
Mladenova, I. E., Bolten, J. D., Crow, W. T., et al. (2017). Intercomparison of soil moisture, evaporative stress, and vegetation indices for estimating corn and soybean yields over the U.S. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10, 1328–1343. https://doi.org/10.1109/JSTARS.2016.2639338.
Mueller, R. (2013). VegScape: A NASS Web Service-based U.S. Crop Condition Monitoring System. United States Department of Agriculture.
Murthy, C. S., Raju, P. V., & Badrinath, K. V. S. (2003). Classification of wheat crop with multitemporal images: Performance of maximum likelihood and artificial neural networks. International Journal of Remote Sensing, 24, 4871–4890. https://doi.org/10.1080/0143116031000070490.
Nagol, J. R., Sexton, J. O., Anand, A., et al. (2017). Isolating type-specific phenologies through spectral unmixing of satellite time series. International Journal of Digital Earth, 1–13.
NASS. (2012). The yield forecasting program of NASS. The Statistical Methods Branch, Statistics Division, National Agricultural Statistics Service, U.S. Department of Agriculture, Washington, DC, USA.

NASS. (2013). CropScape – NASS CDL Program. http://nassgeodata.gmu.edu/CropScape/. Accessed 7 Nov 2013.
Ok, A. O., Akar, O., & Gungor, O. (2012). Evaluation of random forest method for agricultural crop classification. European Journal of Remote Sensing, 45, 421–432.
Parihar, J. S., & Oza, M. P. (2006). FASAL: An integrated approach for crop assessment and production forecasting. In Proceedings of the Asia-Pacific remote sensing symposium. International Society for Optics and Photonics, pp. 641101–641113.
Pearson, R. L., & Miller, L. D. (1972). Remote mapping of standing crop biomass for estimation of the productivity of the shortgrass prairie. In Remote Sensing of Environment, VIII, p. 1355.
Peddle, D. R., & Ferguson, D. T. (2002). Optimisation of multisource data analysis: An example using evidential reasoning for GIS data classification. Computers & Geosciences, 28, 45–52. https://doi.org/10.1016/S0098-3004(01)00012-7.
Peña-Barragán, J. M., Ngugi, M. K., Plant, R. E., & Six, J. (2011). Object-based crop identification using multiple vegetation indices, textural features and crop phenology. Remote Sensing of Environment, 115, 1301–1316. https://doi.org/10.1016/j.rse.2011.01.009.
Pradhan, S. (2001). Crop area estimation using GIS, remote sensing and area frame sampling. International Journal of Applied Earth Observation and Geoinformation, 3, 86–92.
Pupin Mello, M., Rudorff, B. F. T., Adami, M., et al. (2010). A simplified Bayesian network to map soybean plantations. IEEE, pp. 351–354.
Purdy, L. (2016). Farming from space. Engineering & Technology, 11, 40–44.
Rembold, F., Atzberger, C., Savin, I., & Rojas, O. (2013). Using low resolution satellite imagery for yield prediction and yield anomaly detection. Remote Sensing, 5, 1704–1733.
Roerink, G. J., Menenti, M., & Verhoef, W. (2000). Reconstructing cloudfree NDVI composites using Fourier analysis of time series. International Journal of Remote Sensing, 21, 1911–1917. https://doi.org/10.1080/014311600209814.
Rouse, J. W. (1974). Monitoring the vernal advancement and retrogradation (green wave effect) of natural vegetation.
Rouse, J. W., Haas, R. H., Schell, J. A., & Deering, D. W. (1974). Monitoring vegetation systems in the great plains with ERTS. In NASA. Goddard Space Flight Center 3d ERTS-1 Symp., pp. 309–317.
Roy, D. P., Wulder, M. A., Loveland, T. R., et al. (2014). Landsat-8: Science and product vision for terrestrial global change research. Remote Sensing of Environment, 145, 154–172. https://doi.org/10.1016/j.rse.2014.02.001.
Roy, D. P., Zhang, H. K., Ju, J., et al. (2016). A general method to normalize Landsat reflectance data to nadir BRDF adjusted reflectance. Remote Sensing of Environment, 176, 255–271. https://doi.org/10.1016/j.rse.2016.01.023.
Ruban, T., Bhargava, R., & Sitzmann, V. Planet labels-how do we use our planet?
Schaaf, C. B., Gao, F., Strahler, A. H., et al. (2002). First operational BRDF, albedo nadir reflectance products from MODIS. Remote Sensing of Environment, 83, 135–148.
Shelestov, A., Lavreniuk, M., Kussul, N., et al. (2017). Exploring Google Earth engine platform for big data processing: Classification of multi-temporal satellite imagery for crop mapping. Frontiers in Earth Science. https://doi.org/10.3389/feart.2017.00017.
Shrestha, R., Di, L., Yu, G., et al. (2013). Detection of flood and its impact on crops using NDVI – Corn case. In Proceedings of the second international conference on agro-geoinformatics, August 12–16, 2013, Fairfax, VA, USA. IEEE, Fairfax, VA, USA.
Shrestha, R., Di, L., Yu, E. G., et al. (2016). Regression based corn yield assessment using MODIS based daily NDVI in Iowa state. IEEE, pp. 1–5.
Shrestha, R., Di, L., Yu, E. G., et al. (2017). Regression model to estimate flood impact on corn yield using MODIS NDVI and USDA cropland data layer. Journal of Integrative Agriculture, 16, 398–407. https://doi.org/10.1016/S2095-3119(16)61502-2.
Silleos, N. G., Alexandridis, T. K., Gitas, I. Z., & Perakis, K. (2006). Vegetation indices: Advances made in biomass estimation and vegetation monitoring in the last 30 years. Geocarto International, 21, 21–28. https://doi.org/10.1080/10106040608542399.

202

E. G. Yu and Z. Yang

Song, C., Woodcock, C. E., Seto, K. C., et al. (2001). Classiﬁcation and change detection using landsat TM data: When and how to correct atmospheric effects? Remote Sensing of Environment, 75, 230–244. https://doi.org/10.1016/S0034-4257(00)00169-3. Song, X.-P., Potapov, P. V., Krylov, A., et al. (2017). National-scale soybean mapping and area estimation in the United States using medium resolution satellite imagery and ﬁeld survey. Remote Sensing of Environment, 190, 383–395. https://doi.org/10.1016/j.rse.2017.01.008. Supit, I., van Diepen, C. A., de Wit, A. J. W., et al. (2012). Assessing climate change effects on European crop yields using the crop growth monitoring system and a weather generator. Agricultural and Forest Meteorology, 164, 96–111. https://doi.org/10.1016/j.agrformet.2012. 05.005. Toutin, T. (2004). Review article: Geometric processing of remote sensing images: models, algorithms and methods. International Journal of Remote Sensing, 25, 1893–1924. https:// doi.org/10.1080/0143116031000101611. Tucker, C. J. (1980). A critical review of remote sensing and other methods for non-destructive estimation of standing crop biomass. Grass and Forage Science, 35, 177–182. https://doi.org/ 10.1111/j.1365-2494.1980.tb01509.x. Tucker, C. J., & Sellers, P. J. (1986). Satellite remote sensing of primary production. International Journal of Remote Sensing, 7, 1395–1416. https://doi.org/10.1080/01431168608948944. Ul Qayyum, Z., Akhtar, A., Sarwar, S., Ramzan, M. (2013). Optimal feature extraction technique for crop classiﬁcation using aerial imagery. IEEE, pp 1–5. Velleman, P. F. (1977). Robust nonlinear data smoothers: Deﬁnitions and recommendations. PNAS, 74, 434–436. Velleman, P. F. (1980). Deﬁnition and comparison of Robust Nonlinear data smoothing algorithms. Journal of the American Statistical Association, 75, 609–615. https://doi.org/10.2307/2287657. Vermote, E. F., Tanré, D., Deuze, J. L., et al. (1997). 
Second simulation of the satellite signal in the solar spectrum, 6S: An overview. IEEE Transactions on Geoscience and Remote Sensing, 35, 675–686. Vermote, E. F., El Saleous, N. Z., & Justice, C. O. (2002). Atmospheric correction of MODIS data in the visible to middle infrared: ﬁrst results. Remote Sensing of Environment, 83, 97–111. https://doi.org/10.1016/S0034-4257(02)00089-5. Vicenteserrano, S., Perezcabello, F., & Lasanta, T. (2008). Assessment of radiometric correction techniques in analyzing vegetation variability and change using time series of Landsat images. Remote Sensing of Environment, 112, 3916–3934. https://doi.org/10.1016/j.rse.2008.06.011. Viovy, N., Arino, O., & Belward, A. S. (1992). The Best Index Slope Extraction ( BISE): A method for reducing noise in NDVI time-series. International Journal of Remote Sensing, 13, 1585–1590. https://doi.org/10.1080/01431169208904212. Wall, L., Larocque, D., & Léger, P.-M. (2008). The early explanatory power of NDVI in crop yield modelling. International Journal of Remote Sensing, 29, 2211–2225. Wardlow, B. D., & Egbert, S. L. (2008). Large-area crop mapping using time-series MODIS 250 m NDVI data: An assessment for the U.S. Central Great Plains. Remote Sensing of Environment, 112, 1096–1116. https://doi.org/10.1016/j.rse.2007.07.019. Whitcraft, A. K., Vermote, E. F., Becker-Reshef, I., & Justice, C. O. (2015). Cloud cover throughout the agricultural growing season: Impacts on passive optical earth observations. Remote Sensing of Environment, 156, 438–447. https://doi.org/10.1016/j.rse.2014.10.009. Wu, B., & Li, Q. (2004). China crop watch system with remote sensing. Journal of Remote Sensing, 8, 482–496. Wu, B., & Li, Q. (2012). Crop planting and type proportion method for crop acreage estimation of complex agricultural landscapes. International Journal of Applied Earth Observation and Geoinformation, 16, 101–112. https://doi.org/10.1016/j.jag.2011.12.006. Wu, B., Meng, J., Li, Q., et al. (2010). 
Latest development of “CropWatch”—An global crop monitoring system with remote sensing. Advances in Earth Science. CNKI:SUN:DXJZ.0.201010-004.

10

Crop Pattern and Status Monitoring

203

Wu, B., Meng, J., Li, Q., et al. (2014). Remote sensing-based global crop monitoring: Experiences with China’s CropWatch system. International Journal of Digital Earth, 7, 113–137. Wu, B., Gommes, R., Zhang, M., et al. (2015). Global crop monitoring: A satellite-based hierarchical approach. Remote Sensing, 7, 3907–3933. https://doi.org/10.3390/rs70403907. Yang, X., Zhu, W., Pan, Y., & Jia, B. (2007). Spatial sampling design for crop acreage estimation. Yang, Z., Zhao, H., Di, L., Yu, G. (2009). A comparison of vegetation indices for corn and soybean vegetation condition monitoring. In 2009 IEEE international geoscience and remote sensing symposium (IGARSS 2009). IEEE, Cape Town, South Africa, p IV-801-IV-804. Yang, C., Everitt, J. H., & Murden, D. (2011a). Evaluating high resolution SPOT 5 satellite imagery for crop identiﬁcation. Computers and Electronics in Agriculture, 75, 347–354. https://doi.org/ 10.1016/j.compag.2010.12.012. Yang, Z., Di, L., Yu, G., & Chen, Z. (2011b). Vegetation condition indices for crop vegetation condition monitoring. In Geoscience and Remote Sensing Symposium (IGARSS), 2011 IEEE International. IEEE, pp. 3534–3537. Yang, Z., Yu, G., Di, L., Zhang, B. (2013). Web service-based vegetation condition monitoring system-VegScape. In Proceeding of iEEE IGARSS’2013. Yang, Z., Hu, L., Yu, G., et al. (2016). Web service-based SMAP soil moisture data visualization, dissemination and analytics based on vegscape framework. IEEE, pp 3624–3627. Yazdani, R., Ryerson, A. R., & Derenyi, E. (1981). Vegetation change detection in an area—A simple approach for use with geo-data base. In Proceedings of the 7th Canadian symposium on remote sensing. pp. 88–92. You, J., Li, X., Low, M., et al. (2017). Deep Gaussian process for crop yield prediction based on remote sensing data. Yu, G., Di, L., Yang, Z., et al. (2012a). Crop condition assessment using high temporal resolution satellite images. In The ﬁrst international conference on agro-geoinformatics 2012. 
IEEE, Shanghai, China. Yu, G., Di, L., Yang, Z., et al. (2012b). Corn growth stage estimation using time series vegetation index. In 2012 ﬁrst international conference on agro-geoinformatics (Agro-Geoinformatics). pp. 1–6. Zarco-Tejada, P. J., Ustin, S. L., & Whiting, M. L. (2005). Temporal and spatial relationships between within-ﬁeld yield variability in cotton and high-spatial hyperspectral remote sensing imagery. Agronomy Journal, 97, 641. https://doi.org/10.2134/agronj2003.0257. Zhang, X., Zhang, M., Zheng, Y., & Wu, B. (2016). Crop mapping using PROBA-V time series data at the Yucheng and Hongxing farm in China. Remote Sensing, 8, 915. https://doi.org/10. 3390/rs8110915. Zhu, Z., & Woodcock, C. E. (2012). Object-based cloud and cloud shadow detection in Landsat imagery. Remote Sensing of Environment, 118, 83–94.

Chapter 11

Crop Growth Modeling and Yield Forecasting

Haizhu Pan and Zhongxin Chen

Abstract In the past decades, crop growth modeling and yield forecasting have attracted increasing attention in both scientific research and agricultural practice. Many studies have been carried out to improve crop growth modeling and yield forecasting by using various data sources and methods, such as statistical models, crop growth simulation models, and remote sensing. This chapter reviews four categories of crop growth modeling approaches. First, the traditional crop modeling and forecasting methods are introduced: statistical modeling and crop growth models. Then remote sensing models, based mainly on spectral indices and quantitative products, are introduced; the quality of remote sensing data is critical for crop modeling and yield forecasting. Finally, the widely used data assimilation approaches for crops are described. More research is needed to make full use of remote sensing and crop growth models in crop growth monitoring and yield forecasting at a regional scale.

Keywords Crop growth · Modeling · Yield forecast · Remote sensing · Data assimilation

11.1 Introduction

H. Pan
Breeding Base for State Key Laboratory of Land Degradation and Ecological Restoration in Northwest China, Ningxia University, Yinchuan, China
e-mail: [email protected]

Z. Chen (*)
Food and Agriculture Organization of the United Nations, Rome, Italy
e-mail: [email protected]

© Springer Nature Switzerland AG 2021
L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_11

Crop growth and yield are essential information for crop management and agricultural policy-making at farm and regional scales. As climate change and population growth put increasing pressure on agricultural sustainability, interest in crop growth modeling and yield forecasting has risen quickly in recent decades (Foley et al. 2011; White et al. 2011). Crop growth modeling and yield forecasting have attracted increasing attention in both scientific research and agricultural practice. Many studies have sought to improve modeling and forecasting capabilities by using various data sources and methods, such as statistical models, crop growth simulation models, and remote sensing (Bennett et al. 2017; Gowda et al. 2014; Prasad et al. 2006). Many studies have also demonstrated the advantages of integrating remotely sensed data and crop growth models through data assimilation (de Wit and van Diepen 2007; Fang et al. 2008; Huang et al. 2015).

Crop growth monitoring focuses mainly on dynamic parameters during crop growth, such as leaf area index, leaf nitrogen accumulation, dry matter content, and soil moisture. Timely and accurate crop growth variables help characterize the state of the crop throughout the growing period. Forecasting crop yield a few months before harvest can be important to national food trade and security. The success of crop yield forecasting depends strongly on the capability of crop growth monitoring (Horie et al. 1992).

The agricultural research community has developed many crop growth modeling approaches for forecasting crop yield, which can be broadly divided into two categories. The first category comprises statistical models, which relate yield to climate, remote sensing, or other variables using statistical methods such as regression analysis (Basso et al. 2013; Kogan et al. 2013; Michel and Makowski 2013). Statistical modeling plays a key role in current research on yield forecasting, but most of these models are locally calibrated and are not easily transferred to other cropland regions. The second category contains physiological/physical-based crop growth models.
They simulate the main crop growth and production processes, such as photosynthesis, solar radiation absorption, phenology, and carbon and nitrogen balances (Möller and Müller 2012). Numerous crop growth models have been developed, and some are widely used in crop growth monitoring and yield forecasting (Eitzinger et al. 2004; Palosuo et al. 2011a). Although crop models can reproduce the main physiological/physical processes occurring during crop growth, their use is often limited by uncertainties in input variables such as field management, soil, and initial conditions, and calibrating the large number of model parameters is also challenging. Using these models to predict regional crop yield is complex, because many of them were developed at the farm (point) scale. The spatial and temporal heterogeneity of cropland and the availability of gridded soil and climate driving databases are further challenges when crop growth models are used at a regional scale. However, with the development of remote sensing and GIS techniques, many studies have successfully used crop growth models to simulate regional crop growth dynamics and yield (Bastiaanssen and Ali 2003; Ren et al. 2011; Therond et al. 2011).

Remote sensing has proven to be an effective means of crop area statistics and crop mapping, quantitative inversion of key crop growth variables, crop phenology monitoring, and yield forecasting. After decades of development, the application of remote sensing in agriculture has changed tremendously, from simple images to quantitative agricultural parameters. Currently, crop growth-related parameters can be obtained from quantitative remote sensing data, including leaf area index (LAI), canopy water content (CWC), fractional vegetation coverage (FVC), soil moisture (SM), evapotranspiration (ET), and fraction of photosynthetically active radiation (FPAR). These satellite data have been widely used in crop research and management (Brown et al. 2012; Glenn et al. 2011; Qu et al. 2014; Zelitch 1982). Spectral indices are the main method used for crop growth monitoring and yield estimation. Many researchers have developed indices for crop yield estimation from satellite data, such as the normalized difference vegetation index (NDVI), the enhanced vegetation index (EVI) (Kouadio et al. 2014), and the temperature vegetation dryness index (TVDI) (Holzman et al. 2014). The Food and Agriculture Organization (FAO) used the NDVI to estimate regional crop yield from daily NOAA satellite data with a resolution of 1.1 km (Hielkema and Snijders 1994). To meet the demands of precision agriculture, remote sensing data with higher spatial and temporal resolution, such as GF, Sentinel, and UAV (unmanned aerial vehicle) data, are increasingly used for crop growth modeling and yield forecasting. Many remote sensing data sources are now used in operational crop growth monitoring and yield forecasting systems, such as the Monitoring Agriculture by Remote Sensing (MARS) crop yield forecasting system developed by the Joint Research Center in Italy (de Wit and van Diepen 2008) and China's agricultural remote sensing monitoring system CHARMS (Chen et al. 2012). The spatial and temporal characteristics of remote sensing make crop growth monitoring and yield forecasting feasible at the regional scale. However, many remote sensing methods for crop growth and yield research are empirical models that cannot describe the mechanisms of the crop growth process. Crop growth models and remote sensing have become the main tools for crop growth monitoring and yield estimation.
At the regional scale, the input variables of a crop growth model, such as gridded meteorological data, initial field conditions, sowing date, and soil conditions, are usually poorly known. Regional field status and crop biophysical parameters can be estimated using remote sensing data. The successful application of data assimilation in land surface models has gained increasing attention in the agricultural community, and research has found that assimilating remote sensing data can improve the performance of crop growth models (Claverie et al. 2009; de Wit and van Diepen 2007; Fang et al. 2008). Numerous studies suggest that assimilating remote sensing data into crop growth models has become an effective tool for crop growth monitoring and yield forecasting (de Wit and van Diepen 2007; Huang et al. 2015; Ines et al. 2013; Wu et al. 2011). Several crop data assimilation schemes with different data assimilation algorithms, crop growth models, and remote sensing variables have been developed and evaluated during the last decade, and the results suggest that they have great potential for improving the simulation of crop growth dynamics, water balance, and regional crop yield.

There are two main schemes, distinguished by the data assimilation algorithm. The first is variational assimilation, which minimizes the difference between crop variables estimated from the crop growth model and those derived from remote sensing by adjusting the model parameters (Huang et al. 2015). The second is sequential assimilation, which updates the crop model state variables by minimizing the uncertainty of the model and the observations whenever remote sensing data are available (Wu et al. 2011). As more terrestrial observation programs are carried out, multisource, multivariate, and multiscale data assimilation has gained attention in crop growth modeling and yield forecasting research (Montzka et al. 2012; Zhiwei et al. 2014).

In this chapter, we introduce recent advances in crop growth and yield models, remote sensing, and data assimilation for crop growth modeling and yield forecasting. The chapter is organized into five main sections. Section 11.2 describes traditional statistical modeling approaches. The physiological/physical-based modeling approaches and commonly used crop growth models are introduced in detail in Section 11.3, which focuses on how to use crop growth models to forecast crop biomass and yield. Section 11.4 gives detailed information on remote sensing in crop growth monitoring. Section 11.5 describes various data assimilation approaches in crop growth models. The conclusions for crop modeling and yield forecasting are provided in Section 11.6.

11.2 Statistical Modeling

Crop growth takes place in a complex soil-plant-atmosphere system and depends on many environmental factors, such as the level of incoming radiation, climate conditions, photosynthetic characteristics of the leaves, soil moisture, and field management. Most of the time, however, many of these parameters are unavailable, and only one or a few factors are used to estimate crop growth state variables or yield. Statistical analysis is the most commonly used method in crop research, and numerous statistical models have been developed. These models rely mainly on the relationship between crop growth variables or yield and weather parameters; examples include models that describe the response of crop yield to accumulated temperature or relative humidity, and the relationship between leaf area and days after crop emergence. The approach uses field observations or statistical data to establish an equation or set of equations by fitting them to the data. The usual method is regression analysis, including linear and nonlinear regression models. Other statistical methods, such as artificial neural networks, are also used, but here we focus on regression analysis as an example of statistical modeling.

Regression analysis is frequently used in statistical modeling for crop growth monitoring and yield forecasting, and several linear and nonlinear regression models have been developed. The linear regression model is

$$Y_i = a + bX_i + \varepsilon_i$$

where $Y_i$ is a crop growth state variable or yield, $X_i$ is a meteorological factor, $a$ and $b$ are the model parameters to be fit, and $\varepsilon_i$ is an error term.
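As an illustration, the linear model above can be fitted by ordinary least squares. The following Python sketch is not taken from any of the cited studies: the function name `fit_linear_yield_model` and the NDVI and yield values are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def fit_linear_yield_model(x, y):
    """Least-squares fit of Y_i = a + b * X_i + e_i; returns (a, b, r2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with a column of ones for the intercept a.
    design = np.column_stack([np.ones_like(x), x])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - (a + b * x)
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # assumes y is not constant
    return float(a), float(b), 1.0 - ss_res / ss_tot


# Hypothetical observations: seasonal peak NDVI and reported yield (t/ha).
ndvi = [0.55, 0.60, 0.68, 0.72, 0.80]
yield_t_ha = [6.1, 6.8, 7.9, 8.3, 9.6]
a, b, r2 = fit_linear_yield_model(ndvi, yield_t_ha)
print(f"yield = {a:.2f} + {b:.2f} * NDVI (R^2 = {r2:.3f})")
```

A model fitted this way is only as transferable as its calibration data, which is the limitation of statistical models discussed in this section.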


Some meteorological factors or crop-related indices are used as $X_i$ in these models, such as evapotranspiration, temperature, precipitation, soil moisture, and NDVI. NDVI is often considered a valuable index for analyzing crop growth conditions. For example, Prasad et al. (2006) developed a corn and soybean yield prediction model for Iowa, USA, based on surface temperature (ST), rainfall (RF), soil moisture (SM), and NDVI:

$$\mathrm{Yield} = a_1\,\mathrm{NDVI} + a_2\,\mathrm{SM} + a_3\,\mathrm{ST} + a_4\,\mathrm{RF} + c_1$$

where the model parameters $a_1$, $a_2$, $a_3$, $a_4$, and $c_1$ take different values depending on whether the forecasted yield is below or above a breakpoint, defined as the mean of the 19-year corn or soybean yield in Iowa. The coefficients of determination ($R^2$) for the corn and soybean yield models are 0.78 and 0.86, respectively. Several researchers have used this approach to analyze the response of crop yield to climate change (Lobell and Burke 2010; Thornton et al. 2009). Besides linear regression, many studies use nonlinear regression models to estimate crop growth state and forecast yield. Various types of nonlinear regression models have been used, such as exponential, logistic, and Gaussian models. For example, the leaf area index (LAI) has been modeled as a nonlinear function of growing degree-days (GDD) accumulated from planting (Teruel 1995):

$$\mathrm{LAI}_n = \left(\sum_{i=1}^{n} \mathrm{GDD}_i\right)^{b} \cdot e^{\,a + c\sum_{i=1}^{n} \mathrm{GDD}_i}$$

where $\mathrm{GDD}_i$ is degree-days (°C · day) and $a$, $b$, and $c$ are the fitting constants. The decrease of yield as a function of water stress was formulated as (Jensen 1968):

$$\frac{Y_a}{Y_m} = \prod_{i=1}^{n} \left(\frac{ET_{a_i}}{ET_{m_i}}\right)^{\lambda}$$

in which $Y_a/Y_m$ is the ratio of the actual yield to the maximum possible yield, $ET_{a_i}/ET_{m_i}$ is the ratio of the actual evapotranspiration to the evapotranspiration that would occur without water restrictions, and the exponent $\lambda$ expresses the sensitivity of the crop to water stress.

Many factors influence crop growth and yield, such as the efficiency of irrigation systems, planting area, rainfall, disease occurrence, quality of crop seeds, and soil quality. Therefore, selecting an appropriate set of meteorological or biometrical factors is important in statistical modeling for crop growth monitoring and yield forecasting. The main problems with statistical models are that they cannot be reliably extrapolated, are limited by their reliance on field calibration data, and are unable to assess uncertainties. Although these models may work only in regions with conditions similar to those in which they were calibrated, statistical modeling is still a frequently used method in crop modeling and yield forecasting.
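The two nonlinear relationships above can be evaluated directly once their constants are known. The Python sketch below is only illustrative: the function names, the fitting constants, and the evapotranspiration values are hypothetical, and the water-stress function assumes the multiplicative form of the Jensen model with a single exponent λ.

```python
import math


def lai_from_gdd(gdd_daily, a, b, c):
    """LAI_n = (sum of GDD_i)^b * exp(a + c * sum of GDD_i)."""
    total = sum(gdd_daily)
    return total ** b * math.exp(a + c * total)


def jensen_relative_yield(et_actual, et_max, lam):
    """Ya/Ym as the product over periods of (ETa_i / ETm_i) ** lam."""
    ratio = 1.0
    for actual, maximum in zip(et_actual, et_max):
        ratio *= (actual / maximum) ** lam
    return ratio


# Hypothetical daily growing degree-days (deg C day) and fitting constants.
gdd = [12.0, 14.5, 13.0, 15.5]
print("LAI:", lai_from_gdd(gdd, a=-6.0, b=1.5, c=-0.001))

# Hypothetical actual and unrestricted evapotranspiration (mm) for three periods.
print("Ya/Ym:", jensen_relative_yield([40.0, 55.0, 30.0], [50.0, 60.0, 45.0], lam=0.6))
```

Because every period enters as a factor, a severe water deficit in any single period lowers the relative yield regardless of how well the crop is supplied in the others.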

11.3 Physiological/Physical-Based Modeling

Physiological/physical-based modeling has been developed and used successfully over time. It provides decision makers with additional information on how to achieve sustainable agriculture, and it is used to understand the effects of climate change on crop growth and yield (Palosuo et al. 2011b; White et al. 2011). Numerous crop growth models of different complexity and for different crop types have been developed over the past years. Compared with statistical models, crop growth models can describe the main processes of crop growth and production in the soil-plant-atmosphere system, such as solar radiation absorption, photosynthesis, phenology, biomass partitioning, organ building, nitrogen processes, and water balance (Gowda et al. 2014). These models simulate crop growth state and yield as a function of weather, soil conditions, and crop management, so a mature crop growth model consists of at least three modules (weather, crop, and soil); the main flow of these models is shown in Fig. 11.1. The level of complexity of a crop growth model depends on the objective of the modeling exercise and on the specific parameterization schemes. Crop growth models that have been used successfully by the world agricultural research community to simulate crop growth and yield include STICS, CROPSYST, WOFOST, EPIC, DSSAT, and APSIM. Details of these models can be obtained from the websites and references listed in Table 11.1.
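To make the three-module structure concrete, the following Python sketch steps a deliberately simplified crop model through a season at a daily time step. It is a toy illustration, not the algorithm of any model named above: the class `DailyWeather`, the function `simulate_season`, and all parameter values (base temperature, radiation use efficiency, soil capacity, and so on) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DailyWeather:
    tmax: float  # daily maximum temperature, deg C
    tmin: float  # daily minimum temperature, deg C
    rain: float  # rainfall, mm
    par: float   # photosynthetically active radiation, MJ m-2 day-1


def simulate_season(weather, t_base=8.0, rue=3.0, harvest_index=0.5,
                    soil_capacity=100.0, et_max=5.0, gdd_full_cover=600.0):
    """Toy daily-step model: weather drives GDD, soil water limits growth."""
    gdd_sum = 0.0
    biomass = 0.0               # g m-2
    soil_water = soil_capacity  # mm, soil starts at capacity
    for day in weather:
        # Weather module: accumulate growing degree-days.
        gdd_sum += max((day.tmax + day.tmin) / 2.0 - t_base, 0.0)
        # Soil module: bucket water balance; rain above capacity is lost.
        soil_water = min(soil_water + day.rain, soil_capacity)
        stress = soil_water / soil_capacity  # 1 = no water stress
        soil_water -= et_max * stress        # actual evapotranspiration
        # Crop module: canopy cover grows with GDD, biomass with light use.
        cover = min(gdd_sum / gdd_full_cover, 1.0)
        biomass += rue * day.par * cover * stress
    return {"gdd": gdd_sum, "biomass": biomass, "yield": harvest_index * biomass}


# Hypothetical 120-day season with constant weather.
season = [DailyWeather(tmax=26.0, tmin=14.0, rain=2.0, par=9.0)] * 120
print(simulate_season(season))
```

Real crop growth models replace each of these one-line relationships with detailed process descriptions, which is where the large number of parameters discussed later in this section comes from.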

Fig. 11.1 The flow chart of the crop growth model. The chart comprises a weather module, a crop module, and a soil module; its labels include meteorological data, GDD (growing degree days), rainfall, irrigation, infiltration, runoff, evaporation, transpiration, root zone moisture balance, root distribution, crop coefficient, canopy cover, biomass, stress factors, harvest index, and yield.


Table 11.1 The main crop growth models for crop growth and yield simulation

Model | Country | References | Websites
STICS | France | Brisson et al. (1998) | https://www6.paca.inra.fr/stics_eng/
CROPSYST | USA | Stöckle et al. (2003) | http://modeling.bsyse.wsu.edu/CS_Suite/CropSyst/index.html
WOFOST | Netherlands | van Diepen et al. (1989) | http://www.wofost.wur.nl
DSSAT | USA | Jones et al. (2003) | https://dssat.net/
EPIC | USA | Williams et al. (1983) | https://epicapex.tamu.edu/epic/
APSIM | Australia | Keating et al. (2003) | http://www.apsim.info/

In this section, we focus on the model structure of the commonly used crop growth model DSSAT. The DSSAT model was developed by the International Benchmark Sites Network for Agro-technology Transfer (IBSNAT) and consists of three components: a database management module, which is used to enter, store, and retrieve the data sets for model runs; a land unit module, which simulates the effect of soil processes on crop growth; and a set of crop models for simulating crop growth state and biomass or yield. The model can simulate the crop development process, soil water balance, carbon and nitrogen processes, and crop management practices (Jones et al. 2003; Thorp et al. 2008). In DSSAT-CSM, the different crops share a single set of code; this design feature simplifies the simulation of crop rotations, as Fig. 11.2 shows. By supplying weather data and physiological/physical parameters to DSSAT-CSM, users can obtain the outputs they specify. CSM-CERES is one family of crop models under DSSAT and includes the CERES-rice and CERES-wheat models. For example, the CERES-wheat model simulates, at a daily step, wheat LAI dynamics, dry matter development, the water and nitrogen balances of the soil-plant-atmosphere system, and wheat yield.

Data preparation is the first step in using a crop growth model for crop growth monitoring and yield forecasting, and it plays a key role in the quality of the simulation results. A crop growth model takes climate, soil, and crop management data as input, and the required inputs differ between models of different complexity. For WOFOST, the climate input consists of daily maximum and minimum temperature, solar radiation, wind speed, vapor pressure, and precipitation. The soil input consists of the soil moisture content at saturation, at the wilting point, and at field capacity (Jones et al. 2003). In addition, crop management information, namely sowing date and the amounts of irrigation and fertilization, also needs to be collected.
These data can be obtained from meteorological station observations and field experiments. For regional applications, the input data can be prepared by spatial interpolation. Because crop growth models have a large number of parameters, estimating them from limited experimental data is usually considered a key issue that introduces uncertainty into the simulation results. Therefore, model parameter sensitivity analysis is necessary to select the parameters that most influence the results. Both local and global sensitivity analysis have been used in crop growth modeling work. To improve simulation accuracy, the sensitive parameters are recalibrated with field measurements; this is called crop growth model localization (Vanuytrecht et al. 2014; Zhao et al. 2014). After data preparation and model localization, the driving data and parameters are input into the model, which outputs the crop growth variables at a daily step and the yield.

Fig. 11.2 Overview of the components and modular structure of the DSSAT-CSM

The crop growth model is a very effective tool for crop growth monitoring and yield forecasting and for predicting the possible impacts of climate change (Thornton et al. 2009). However, the performance of these models depends on the accuracy of the climate data and on suitable parameters. In practice, meteorological, soil, and crop management data are not easily available at the regional scale, so crop growth models give better simulation results at the point scale than at the regional scale (Jégo et al. 2012; Moulin et al. 1998). There is therefore increasing interest in using new data sources and technology to provide better estimates of model state variables and parameters and so improve the ability of the models to simulate crop growth and yield.
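The local sensitivity analysis mentioned above can be sketched as a one-at-a-time perturbation of each parameter around its nominal value. The Python example below is a minimal illustration under stated assumptions: `toy_yield` is a hypothetical stand-in for a crop growth model (light interception by Beer's law times a radiation use efficiency and a harvest index), and the parameter values are invented.

```python
import math


def toy_yield(rue, harvest_index, k, lai, par_sum):
    """Hypothetical yield model: intercepted light times conversion factors."""
    intercepted = 1.0 - math.exp(-k * lai)  # fraction of light intercepted
    return harvest_index * rue * par_sum * intercepted


def local_sensitivity(model, params, rel_step=0.05):
    """Normalized sensitivity (dY/Y)/(dP/P) of each parameter, one at a time.

    Assumes the nominal output and all nominal parameter values are nonzero.
    """
    base = model(**params)
    sensitivities = {}
    for name, value in params.items():
        perturbed = dict(params, **{name: value * (1.0 + rel_step)})
        sensitivities[name] = ((model(**perturbed) - base) / base) / rel_step
    return sensitivities


nominal = {"rue": 3.0, "harvest_index": 0.5, "k": 0.5, "lai": 3.0, "par_sum": 900.0}
ranked = sorted(local_sensitivity(toy_yield, nominal).items(),
                key=lambda item: -abs(item[1]))
for name, value in ranked:
    print(f"{name:14s} {value:6.3f}")
```

A local analysis of this kind is valid only near the nominal parameter values; global methods vary all parameters over their full ranges at a higher computational cost.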

11.4 Remote Sensing Monitoring of Crop Growth

Remote sensing has proven to be an effective method for monitoring crop growth and forecasting yield. The advantages of remote sensing-based methods over traditional crop models include their spatial coverage, spectral information, and the availability of free multisource data during the growing season. Most research on remote sensing for crop growth monitoring uses visible and near-infrared data, since these data can describe the crop growth state and the seasonal dynamics of biomass. However, cloud cover and rainfall may limit the usability of these data. Microwave data can fill some of the resulting gaps during the growing season, so optical and microwave remote sensing data are often used together for crop growth monitoring.

Spectral indices and quantitative remote sensing are the two main methods for crop growth monitoring. Spectral indices are mathematical combinations of the reflectance in the relevant spectral bands. Numerous efforts have been made to develop indices from remote sensing data, such as NDVI, EVI, SAVI, and TVDI. Many researchers use these spectral indices for crop growth monitoring and yield forecasting by relating them to crop physiological/physical parameters; several commonly used indices are shown in Table 11.2. Currently, several crop-related feature parameters are available at the regional and even the global scale from satellite data such as MODIS, Landsat, and Sentinel. For precision agriculture, hyperspectral and UAV data are receiving more attention in crop growth monitoring and yield estimation. In the past years, several yield estimation models have been established using spectral indices; for example, Benedetti developed a simple linear regression model based on NDVI for estimating wheat yield during the grain-filling period.
Two meteorological variables used in yield forecasting, temperature and precipitation, can also be easily obtained from satellite data, such as the NOAA-AVHRR series.

Table 11.2 Main spectral indices and their formulas for crop growth monitoring

Spectral index | Formula | Parameters | Reference
NDVI | (NIR - Red) / (NIR + Red) | Biomass, LAI | Casanova et al. (1998)
EVI | 2.5 (NIR - Red) / (NIR + 6 Red - 7.5 Blue + 1) | Vegetation cover | Arvor et al. (2011)
SAVI | (1 + 0.5)(R800 - R670) / (R800 + R670 + 0.5) | LAI | Ray et al. (2006)
NDWI | (NIR - SWIR) / (NIR + SWIR) | Vegetation water content | Jackson et al. (2004)
TVDI | (Ts - Tmin) / (a + b NDVI - Tmin) | Crop drought | Gao et al. (2011)
MTCI | (R753.75 - R708.75) / (R708.75 - R681.25) | Chlorophyll | Dash and Curran (2004)
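The band-ratio indices in Table 11.2 are straightforward to compute from surface reflectance. The following Python sketch, which assumes NumPy and uses hypothetical function names and reflectance values, implements NDVI, EVI, SAVI, and NDWI so that they work on single pixels or on whole image arrays.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (nir - red) / (nir + red)


def evi(nir, red, blue):
    """Enhanced vegetation index with the coefficients 2.5, 6, 7.5, and 1."""
    nir, red, blue = (np.asarray(band, dtype=float) for band in (nir, red, blue))
    return 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)


def savi(nir, red, soil_factor=0.5):
    """Soil-adjusted vegetation index; soil_factor is the adjustment term L."""
    nir, red = np.asarray(nir, dtype=float), np.asarray(red, dtype=float)
    return (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor)


def ndwi(nir, swir):
    """Normalized difference water index from NIR and SWIR reflectance."""
    nir, swir = np.asarray(nir, dtype=float), np.asarray(swir, dtype=float)
    return (nir - swir) / (nir + swir)


# Hypothetical reflectances for three pixels: bare soil, sparse and dense canopy.
nir = np.array([0.25, 0.35, 0.50])
red = np.array([0.20, 0.12, 0.05])
blue = np.array([0.10, 0.06, 0.03])
swir = np.array([0.30, 0.22, 0.15])
print("NDVI:", ndvi(nir, red))
print("EVI: ", evi(nir, red, blue))
print("SAVI:", savi(nir, red))
print("NDWI:", ndwi(nir, swir))
```

In practice, pixels where the denominator is zero or the reflectance is invalid (for example, clouds or fill values) need to be masked before the indices are used.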


On the regional scale, the main land surface parameters acquired by quantitative remote sensing include three types: land use data, vegetation physiology, and canopy physical data—most of these parameters are closely related to crop growth and yield. With decades of development, several quantitative remotely sensed products at regional and global scale from multisource satellite data have been available. Among these products, the crop-related products include LAI, PAR, FPAR, GPP, NPP, FVC, ET, LST, SM, and LC. LAI is an important crop growth parameter, which is deﬁned as the total leaf area per unit of ground area. It is an important factor for describing several crop growth processes such as photosynthesis, evapotranspiration, and yield. PAR, LST, and SM also directly affect the process of crop photosynthetic production and evapotranspiration, which are the driving and material conditions of crop growth. ET participates in the process of crop water and energy balance. GPP/.NPP is the cumulative state in different periods of crop assimilation processes, which is the direct material basis for organ formation. SM can be used to crop drought conditions monitoring. LC provides crop spatial distribution information for crop growth monitoring. In addition to the spectral indices, these yield related products also used for yield estimation, such as PAR is used to predict sugar beet yield in Europe. The EOS satellite series of the United States, the Sentinel satellite series of Europe, and the HJ series and high-resolution GF series in China provide multisource data for crop growth monitoring and yield forecasting and become possible. The advantage of remote sensing techniques on crop growth monitoring is obvious. Remote sensing has been widely used and is often used in crop yield forecasting systems in serval counties. However, remotely sensed data cannot describe the physical process of crop growth, and some improvement of data accuracy is still a challenge. 
Mixed pixels are inevitable in regional crop applications, and this problem is unlikely to be overcome in the near future. Therefore, traditional crop modeling methods are still needed when no remote sensing data are available.

11.5

Data Assimilation

Simulation models and observations are the two means of crop growth monitoring and yield forecasting. Integrating remote sensing with a crop growth model gives a better description of crop growth and yield, water balance, and the carbon cycle. In past years, various data assimilation approaches have been used to integrate remote sensing data into crop growth models. There are two main data assimilation strategies: calibration of model parameters and updating of model state variables. Calibration data assimilation adjusts the initial conditions or model parameters by minimizing the difference between the observed and simulated state variables (Fig. 11.3a). In this strategy, initial conditions or model


Fig. 11.3 Two data assimilation strategies of integrating crop growth model and remote sensing data (a is calibration strategy and b is updating strategy)

parameters are treated as the only source of model output uncertainty; errors in the model process itself are not taken into account. In updating data assimilation, the model state variables are updated whenever remote sensing data become available (Fig. 11.3b). This approach considers uncertainties from the model parameters, the model process, and the observed data. Different data assimilation algorithms have been developed for each of the two strategies, and these algorithms are the main part of a crop data assimilation system. In the calibration strategy, variational algorithms have been extensively used, including the three-dimensional variational algorithm (3DVAR) and the four-dimensional variational algorithm (4DVAR). The 4DVAR algorithm is a commonly used method that minimizes the cost function J between the observational data and the model-simulated results over the assimilation window. The general formula for J is as follows:

$$J(x(t_0)) = [x(t_0) - x_b]^T B_0^{-1} [x(t_0) - x_b] + \sum_{i=1}^{n} [H_i(x(t_i)) - y_i]^T R_i^{-1} [H_i(x(t_i)) - y_i]$$

where x and y are the model state variables and the observations, x_b is the background state, R and B are the observation and background error-covariance matrices, respectively, and H is the observation operator. For updating model state variables, sequential data assimilation algorithms are used, such as the ensemble Kalman filter (EnKF) and the particle filter (PF). These algorithms use an ensemble (or a set of particles) of model states to represent the error statistics of the model simulation. Sequential assimilation algorithms have proven able to handle strongly nonlinear dynamic systems efficiently. Sequential data


assimilation algorithms usually have two steps: forecast and analysis. Remotely sensed data of the current state are combined with the simulated state from the model (the forecast) to produce the analysis; the result then becomes the starting point for the forecast in the next step.
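The forecast and analysis cycle can be illustrated with a minimal sketch of a perturbed-observation EnKF for a single state variable (LAI). Everything specific here is an assumption made for illustration: the logistic `toy_lai_step` function stands in for a real crop growth model, the observation values and error variances are made up, and the observation operator is taken to be the identity.

```python
import random
import statistics

def enkf_analysis(forecast, obs, obs_var, rng):
    """Update a scalar-state ensemble with one observation (H = identity)."""
    p_f = statistics.variance(forecast)      # forecast error variance from the ensemble
    gain = p_f / (p_f + obs_var)             # scalar Kalman gain
    # Perturbed-observation EnKF: each member sees the observation plus noise drawn from R.
    return [x + gain * (obs + rng.gauss(0.0, obs_var ** 0.5) - x) for x in forecast]

def toy_lai_step(lai, growth=0.08, lai_max=6.0):
    """Hypothetical one-day logistic LAI growth; stands in for a real crop model."""
    return lai + growth * lai * (1.0 - lai / lai_max)

rng = random.Random(42)
ensemble = [rng.gauss(1.0, 0.3) for _ in range(50)]   # initial LAI ensemble
observations = {5: 1.9, 10: 2.8}                      # day -> remotely sensed LAI (made-up values)

for day in range(1, 11):
    # Forecast step: propagate every member and add model-process noise.
    ensemble = [toy_lai_step(x) + rng.gauss(0.0, 0.05) for x in ensemble]
    if day in observations:
        # Analysis step: the updated ensemble starts the next forecast.
        ensemble = enkf_analysis(ensemble, observations[day], obs_var=0.04, rng=rng)

print(round(statistics.mean(ensemble), 2))
```

The ensemble spread plays the role of the forecast error statistics, so a wide ensemble pulls the analysis toward the observation, while a tight ensemble keeps it close to the model forecast.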

11.5.1 Sequential Data Assimilation Algorithms

Calibration of crop growth model parameters using remotely sensed data is the most widely used approach in crop data assimilation research. The objective is to minimize the cost function J, and several good results have been obtained. Fang et al. minimized the difference between modeled and observed LAI with the Powell optimization algorithm and obtained better LAI and yield simulations. Dente et al. used the global optimization algorithm SCE-UA to assimilate LAI derived from ASAR and MERIS data and improved the simulation capacity of the CERES-Wheat model. The approach of updating model state variables has also produced valuable research in crop data assimilation, and EnKF-based assimilation of remotely sensed data has been widely used. Pauwels et al. assimilated observed soil moisture and LAI into a WOFOST model using EnKF and found that the yield prediction improved considerably. A soil water index derived from microwave remote sensing data has been assimilated with EnKF to correct water-balance errors in WOFOST. In recent years, data assimilation algorithms and variables have tended to diversify. Assimilation variables have expanded from LAI to canopy reflectance, soil moisture, ET, and combinations of multiple variables. Continued development of crop data assimilation will therefore help to improve the capability of crop modeling and yield forecasting.
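The calibration strategy can be sketched as a one-parameter version of the cost function J: a background term that penalizes departure from a prior parameter value, plus an observation term that penalizes the LAI misfit. This is only an illustration under stated assumptions. The logistic `simulate_lai` model, the parameter name `growth`, and all numbers are hypothetical, and a simple golden-section search is used in place of the Powell or SCE-UA optimizers mentioned above.

```python
def simulate_lai(growth, days, lai0=0.5, lai_max=6.0):
    """Hypothetical logistic LAI model; returns LAI for day indices 0..days."""
    lai = [lai0]
    for _ in range(days):
        lai.append(lai[-1] + growth * lai[-1] * (1.0 - lai[-1] / lai_max))
    return lai

def cost(growth, obs, background, b_var, r_var):
    """Scalar analogue of J: background term plus observation misfit term."""
    sim = simulate_lai(growth, max(obs))
    j_background = (growth - background) ** 2 / b_var
    j_obs = sum((sim[day] - y) ** 2 / r_var for day, y in obs.items())
    return j_background + j_obs

def golden_section_min(f, lo, hi, tol=1e-6):
    """Minimize a unimodal 1-D function on [lo, hi] by golden-section search."""
    ratio = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    c, d = b - ratio * (b - a), a + ratio * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b, d = d, c                  # minimum lies in [a, d]
            c = b - ratio * (b - a)
        else:
            a, c = c, d                  # minimum lies in [c, b]
            d = a + ratio * (b - a)
    return (a + b) / 2

# Synthetic "observed" LAI generated with growth = 0.12 (made-up values, no noise).
truth = simulate_lai(0.12, 40)
observed = {day: truth[day] for day in (10, 20, 30, 40)}

best = golden_section_min(
    lambda g: cost(g, observed, background=0.08, b_var=0.05 ** 2, r_var=0.2 ** 2),
    lo=0.01, hi=0.30,
)
print(round(best, 3))
```

Because the four observations carry far more weight than the background term in this setup, the calibrated value ends up close to the value used to generate the observations rather than to the prior.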

11.6

Conclusion

Various methods, including statistical models, physiological/physical-based models, remote sensing, and data assimilation, are in use for monitoring and forecasting crop growth and yield. The traditional methods, statistical modeling and crop growth models, are still widely used. Although many statistical models and crop growth models exist, the search for new models is still necessary. Remote sensing models are mainly based on spectral indices and quantitative products derived from remotely sensed data, so the quality of remote sensing data is critical for the performance of crop modeling and yield forecasting. In recent years, data assimilation for crop modeling has become a promising hot topic. More research is needed to make full use of remote sensing and crop growth models in crop growth monitoring and yield forecasting at the regional scale.


References

Arvor, D., Jonathan, M., Meirelles, M. S. P., Dubreuil, V., & Durieux, L. (2011). Classification of MODIS EVI time series for crop mapping in the state of Mato Grosso, Brazil. International Journal of Remote Sensing, 32, 7847–7871. https://doi.org/10.1080/01431161.2010.531783
Basso, B., Cammarano, D., & Carfagna, E. (2013). Review of crop yield forecasting methods and early warning systems. In Proceedings of the first meeting of the Scientific Advisory Committee of the Global Strategy to Improve Agricultural and Rural Statistics, FAO Headquarters, Rome, Italy (pp. 18–19).
Bastiaanssen, W. G. M., & Ali, S. (2003). A new crop yield forecasting model based on satellite measurements applied across the Indus Basin, Pakistan. Agriculture, Ecosystems & Environment, 94, 321–340. https://doi.org/10.1016/S0167-8809(02)00034-8
Bennett, E. J., Brignell, C. J., Carion, P. W. C., Cook, S. M., Eastmond, P. J., Teakle, G. R., Hammond, J. P., Love, C., King, G. J., Roberts, J. A., & Wagstaff, C. (2017). Development of a statistical crop model to explain the relationship between seed yield and phenotypic diversity within the Brassica napus genepool. Agronomy, 7, 31. https://doi.org/10.3390/agronomy7020031
Brisson, N., Mary, B., Ripoche, D., Jeuffroy, M. H., Ruget, F., Nicoullaud, B., Gate, P., Devienne-Barret, F., Antonioletti, R., Durr, C., & others. (1998). STICS: A generic model for the simulation of crops and their water and nitrogen balances. I. Theory and parameterization applied to wheat and corn. Agronomie, 18, 311–346.
Brown, M., De Beurs, K., & Marshall, M. (2012). Global phenological response to climate change in crop areas using satellite remote sensing of vegetation, humidity and temperature over 26 years. Remote Sensing of Environment, 126, 174–183.
Casanova, D., Epema, G. F., & Goudriaan, J. (1998). Monitoring rice reflectance at field level for estimating biomass and LAI. Field Crops Research, 55, 83–92. https://doi.org/10.1016/S0378-4290(97)00064-6
Claverie, M., Demarez, V., Duchemin, B., Hagolle, O., Keravec, P., Marciel, B., Ceschia, E., Dejoux, J. F., & Dedieu, G. (2009). Spatialization of crop leaf area index and biomass by combining a simple crop model SAFY and high spatial and temporal resolutions remote sensing data. In 2009 IEEE International Geoscience and Remote Sensing Symposium (pp. III-478–III-481). https://doi.org/10.1109/IGARSS.2009.5418296
Dash, J., & Curran, P. J. (2004). Evaluation of the MERIS terrestrial chlorophyll index. In Geoscience and remote sensing symposium, 2004. IGARSS'04. Proceedings. 2004 IEEE international. IEEE.
de Wit, A. J. W., & van Diepen, C. A. (2007). Crop model data assimilation with the ensemble Kalman filter for improving regional crop yield forecasts. Agricultural and Forest Meteorology, 146, 38–56. https://doi.org/10.1016/j.agrformet.2007.05.004
de Wit, A. J. W., & van Diepen, C. A. (2008). Crop growth modelling and crop yield forecasting using satellite-derived meteorological inputs. International Journal of Applied Earth Observation and Geoinformation, Modern Methods in Crop Yield Forecasting and Crop Area Estimation, 10, 414–425. https://doi.org/10.1016/j.jag.2007.10.004
Eitzinger, J., Trnka, M., Hösch, J., Žalud, Z., & Dubrovský, M. (2004). Comparison of CERES, WOFOST and SWAP models in simulating soil water content during growing season under different soil conditions. Ecological Modelling, 171, 223–246. https://doi.org/10.1016/j.ecolmodel.2003.08.012
Fang, H., Liang, S., Hoogenboom, G., Teasdale, J., & Cavigelli, M. (2008). Corn-yield estimation through assimilation of remotely sensed data into the CSM-CERES-Maize model. International Journal of Remote Sensing, 29, 3011–3032. https://doi.org/10.1080/01431160701408386
Foley, J. A., Ramankutty, N., Brauman, K. A., Cassidy, E. S., Gerber, J. S., Johnston, M., Mueller, N. D., O'Connell, C., Ray, D. K., West, P. C., Balzer, C., Bennett, E. M., Carpenter, S. R., Hill, J., Monfreda, C., Polasky, S., Rockström, J., Sheehan, J., Siebert, S., Tilman, D., & Zaks, D. P. M. (2011). Solutions for a cultivated planet. Nature, 478, 337–342. https://doi.org/10.1038/nature10452
Gao, Z., Gao, W., & Chang, N.-B. (2011). Integrating temperature vegetation dryness index (TVDI) and regional water stress index (RWSI) for drought assessment with the aid of LANDSAT TM/ETM+ images. International Journal of Applied Earth Observation and Geoinformation, 13, 495–503. https://doi.org/10.1016/j.jag.2010.10.005
Glenn, E. P., Neale, C. M., Hunsaker, D. J., & Nagler, P. L. (2011). Vegetation index-based crop coefficients to estimate evapotranspiration by remote sensing in agricultural and natural ecosystems. Hydrological Processes, 25, 4050–4062.
Gowda, P. T., Satyareddi, S. A., & Manjunath, S. B. (2014). Crop growth modeling: A review. Research Review Journal of Agriculture and Allied Sciences, 2, 1–11.
Hielkema, J. U., & Snijders, F. (1994). Operational use of environmental satellite remote sensing and satellite communications technology for global food security and locust control by FAO: The ARTEMIS and DIANA systems. Acta Astronautica, 32, 603–616.
Holzman, M. E., Rivas, R., & Piccolo, M. C. (2014). Estimating soil moisture and the relationship with crop yield using surface temperature and vegetation index. International Journal of Applied Earth Observation and Geoinformation, 28, 181–192. https://doi.org/10.1016/j.jag.2013.12.006
Horie, T., Yajima, M., & Nakagawa, H. (1992). Yield forecasting. Agricultural Systems, 40, 211–236.
Huang, J., Tian, L., Liang, S., Ma, H., Becker-Reshef, I., Huang, Y., Su, W., Zhang, X., Zhu, D., & Wu, W. (2015). Improving winter wheat yield estimation by assimilation of the leaf area index from Landsat TM and MODIS data into the WOFOST model. Agricultural and Forest Meteorology, 204, 106–121. https://doi.org/10.1016/j.agrformet.2015.02.001
Ines, A. V. M., Das, N. N., Hansen, J. W., & Njoku, E. G. (2013). Assimilation of remotely sensed soil moisture and vegetation with a crop simulation model for maize yield prediction. Remote Sensing of Environment, 138, 149–164. https://doi.org/10.1016/j.rse.2013.07.018
Jackson, T. J., Chen, D., Cosh, M., Li, F., Anderson, M., Walthall, C., Doriaswamy, P., & Hunt, E. R. (2004). Vegetation water content mapping using Landsat data derived normalized difference water index for corn and soybeans. Remote Sensing of Environment, 2002 Soil Moisture Experiment (SMEX02), 92, 475–482. https://doi.org/10.1016/j.rse.2003.10.021
Jégo, G., Pattey, E., & Liu, J. (2012). Using leaf area index, retrieved from optical imagery, in the STICS crop model for predicting yield and biomass of field crops. Field Crops Research, 131, 63–74.
Jensen, M. E. (1968). Water consumption by agricultural plants. (Chapter 1).
Jones, J. W., Hoogenboom, G., Porter, C. H., Boote, K. J., Batchelor, W. D., Hunt, L. A., Wilkens, P. W., Singh, U., Gijsman, A. J., & Ritchie, J. T. (2003). The DSSAT cropping system model. European Journal of Agronomy, 18, 235–265. https://doi.org/10.1016/S1161-0301(02)00107-7
Keating, B. A., Carberry, P. S., Hammer, G. L., Probert, M. E., Robertson, M. J., Holzworth, D., Huth, N. I., Hargreaves, J. N., Meinke, H., Hochman, Z., & others. (2003). An overview of APSIM, a model designed for farming systems simulation. European Journal of Agronomy, 18, 267–288.
Kogan, F., Kussul, N. N., Adamenko, T. I., Skakun, S. V., Kravchenko, A. N., Krivobok, A. A., Shelestov, A. Y., Kolotii, A. V., Kussul, O. M., & Lavrenyuk, A. N. (2013). Winter wheat yield forecasting: A comparative analysis of results of regression and biophysical models. Journal of Automation and Information Sciences, 45.
Kouadio, L., Newlands, N. K., Davidson, A., Zhang, Y., & Chipanshi, A. (2014). Assessing the performance of MODIS NDVI and EVI for seasonal crop yield forecasting at the Ecodistrict scale. Remote Sensing, 6, 10193–10214. https://doi.org/10.3390/rs61010193
Lobell, D. B., & Burke, M. B. (2010). On the use of statistical models to predict crop yield responses to climate change. Agricultural and Forest Meteorology, 150, 1443–1452. https://doi.org/10.1016/j.agrformet.2010.07.008
Michel, L., & Makowski, D. (2013). Comparison of statistical models for analyzing wheat yield time series. PLoS One, 8, e78615. https://doi.org/10.1371/journal.pone.0078615
Möller, K., & Müller, T. (2012). Effects of anaerobic digestion on digestate nutrient availability and crop growth: A review. Engineering in Life Sciences, 12, 242–257. https://doi.org/10.1002/elsc.201100085
Montzka, C., Pauwels, V. R. N., Franssen, H.-J. H., Han, X., & Vereecken, H. (2012). Multivariate and multiscale data assimilation in terrestrial systems: A review. Sensors, 12, 16291–16333. https://doi.org/10.3390/s121216291
Moulin, S., Bondeau, A., & Delecolle, R. (1998). Combining agricultural crop models and satellite observations: From field to regional scales. International Journal of Remote Sensing, 19, 1021–1036.
Palosuo, T., Kersebaum, K. C., Angulo, C., Hlavinka, P., Moriondo, M., Olesen, J. E., Patil, R. H., Ruget, F., Rumbaur, C., Takáč, J., Trnka, M., Bindi, M., Çaldağ, B., Ewert, F., Ferrise, R., Mirschel, W., Şaylan, L., Šiška, B., & Rötter, R. (2011a). Simulation of winter wheat yield and its variability in different climates of Europe: A comparison of eight crop growth models. European Journal of Agronomy, 35, 103–114. https://doi.org/10.1016/j.eja.2011.05.001
Palosuo, T., Kersebaum, K. C., Angulo, C., Hlavinka, P., Moriondo, M., Olesen, J. E., Patil, R. H., Ruget, F., Rumbaur, C., Takáč, J., Trnka, M., Bindi, M., Çaldağ, B., Ewert, F., Ferrise, R., Mirschel, W., Şaylan, L., Šiška, B., & Rötter, R. (2011b). Simulation of winter wheat yield and its variability in different climates of Europe: A comparison of eight crop growth models. European Journal of Agronomy, 35, 103–114. https://doi.org/10.1016/j.eja.2011.05.001
Prasad, A. K., Chai, L., Singh, R. P., & Kafatos, M. (2006). Crop yield estimation model for Iowa using remote sensing and surface parameters. International Journal of Applied Earth Observation and Geoinformation, 8, 26–33. https://doi.org/10.1016/j.jag.2005.06.002
Qu, Y., Zhu, Y., Han, W., Wang, J., & Ma, M. (2014). Crop leaf area index observations with a wireless sensor network and its potential for validating remote sensing products. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7, 431–444.
Ray, S. S., Das, G., Singh, J. P., & Panigrahy, S. (2006). Evaluation of hyperspectral indices for LAI estimation and discrimination of potato crop under different irrigation treatments. International Journal of Remote Sensing, 27, 5373–5387. https://doi.org/10.1080/01431160600763006
Ren, J., Chen, Z., Tang, H., Zhou, Q., & Qin, J. (2011). Regional crop yield simulation based on crop growth model and remote sensing data. Transactions of the Chinese Society of Agricultural Engineering, 27, 257–264.
Stöckle, C. O., Donatelli, M., & Nelson, R. (2003). CropSyst, a cropping systems simulation model. European Journal of Agronomy, 18, 289–307.
Teruel, D. (1995). Modelagem do índice de área foliar de cana-de-açúcar em diferentes regimes hídricos. Regimes Hídricos: Model. Índice Área Foliar Cana--Açúcar Em Difer.
Therond, O., Hengsdijk, H., Casellas, E., Wallach, D., Adam, M., Belhouchette, H., Oomen, R., Russell, G., Ewert, F., Bergez, J.-E., Janssen, S., Wery, J., & Van Ittersum, M. K. (2011). Using a cropping system model at regional scale: Low-data approaches for crop management information and model calibration. Agriculture, Ecosystems & Environment, 142, 85–94. https://doi.org/10.1016/j.agee.2010.05.007
Thornton, P. K., Jones, P. G., Alagarswamy, G., & Andresen, J. (2009). Spatial variation of crop yield response to climate change in East Africa. Global Environmental Change, 19, 54–65. https://doi.org/10.1016/j.gloenvcha.2008.08.005
Thorp, K. R., DeJonge, K. C., Kaleita, A. L., Batchelor, W. D., & Paz, J. O. (2008). Methodology for the use of DSSAT models for precision agriculture decision support. Computers and Electronics in Agriculture, 64, 276–285.
van Diepen, C., Wolf, J., van Keulen, H., & Rappoldt, C. (1989). WOFOST: A simulation model of crop production. Soil Use and Management, 5, 16–24.
Vanuytrecht, E., Raes, D., & Willems, P. (2014). Global sensitivity analysis of yield output from the water productivity model. Environmental Modelling & Software, 51, 323–332. https://doi.org/10.1016/j.envsoft.2013.10.017
White, J. W., Hoogenboom, G., Kimball, B. A., & Wall, G. W. (2011). Methodologies for simulating impacts of climate change on crop production. Field Crops Research, 124, 357–368. https://doi.org/10.1016/j.fcr.2011.07.001
Williams, J., Renard, K., & Dyke, P. (1983). EPIC: A new method for assessing erosion's effect on soil productivity. Journal of Soil and Water Conservation, 38, 381–383.
Wu, S., Huang, J., Liu, X., Fan, J., Ma, G., & Zou, J. (2011). Assimilating MODIS-LAI into crop growth model with EnKF to predict regional crop yield. In International Conference on Computer and Computing Technologies in Agriculture (pp. 410–418). Springer.
Zelitch, I. (1982). The close relationship between net photosynthesis and crop yield. Bioscience, 32, 796–802.
Zhao, G., Bryan, B. A., & Song, X. (2014). Sensitivity and uncertainty analysis of the APSIM-wheat model: Interactions between cultivar, environmental, and management parameters. Ecological Modelling, 279, 1–11. https://doi.org/10.1016/j.ecolmodel.2014.02.003
Zhiwei, J., Jia, L., Zhongxin, C., & Liang, S. (2014). A review of data assimilation of crop growth simulation based on remote sensing information. In 2014 The Third International Conference on Agro-Geoinformatics (pp. 1–6). https://doi.org/10.1109/Agro-Geoinformatics.2014.6910599

Chapter 12

Spatial and Temporal Monitoring System for Agriculture

Lei Hu and Peng Yue

Abstract With the development of geospatial data science and its application in the agricultural field, meaningful agriculture-related geospatial data and information are inextricably linked with sustainable farming practices, internationalization of agricultural commodities, and global climate change. To better utilize and reuse agricultural information and knowledge, robust data centers and systems are expected to play a significant role in managing the large volume of agriculture-related geospatial data. Such systems could change the agricultural geoinformation domain by developing geospatial algorithms and workflows and by creating value-added information products, downstream applications, and services for both public and private sector stakeholders. This chapter reviews the state of the art of operational agriculture monitoring systems at the international, national, and regional levels that provide continued monitoring of agriculture. The user needs and the issues of current agricultural data systems are discussed, and the capabilities of spatial and temporal monitoring systems are analyzed. In particular, the data sources and products, system functionalities, interoperability, and standardization are demonstrated.

Keywords Agricultural monitoring · Geospatial data · Web service · Drought · Flooding

12.1

Introduction

With the development of geospatial data science and its application in the agricultural ﬁeld, meaningful agriculture-related geospatial data and information are inextricably linked with sustainable farming practices, internationalization of agricultural commodities, and global climate change (Di 2016).

L. Hu (*) · P. Yue, Wuhan University, Wuhan, Hubei, China. © Springer Nature Switzerland AG 2021. L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_12


To better utilize and reuse agricultural information and knowledge, powerful data centers and systems are expected to play a significant role in managing the large volume of agriculture-related geospatial data, which may be derived or processed from satellite imagery, instrumental ground truth data, and other sensor data. Web service technologies have shown great potential for building infrastructure for the collaborative sharing of distributed resources, such as data, models, and services. A Web service usually defines standard interfaces that enable different software systems to interoperate; thus, it is widely used in geospatial data system development to fulfill the requirements of different organizations. Currently, most agriculture-related data centers and systems support only two functionalities: data archiving and data distribution. Data archiving provides long-term data preservation, while data distribution provides a Web interface for users to access and download the data. Such data centers and systems usually lack other steps such as data analysis, data integration, and data mining, so agricultural scientists must carry out these steps themselves; this becomes a major obstacle in agricultural application studies. In the agriculture-related geospatial field, massive data downloading, processing, mining, analysis, and dissemination are very computation-intensive and time-consuming. The agricultural data volume is expanding to the petabyte level, and the complexity of the data is continually increasing. All of this creates challenges for near-real-time data discovery, access, and analysis. 
The purpose of this chapter is to review the state of the art of operational agriculture monitoring systems at the international, national, and regional levels, and to discuss their capabilities in making data and functions publicly available while providing sustainable support for the continual monitoring of agriculture. The user needs and the issues of current agricultural data systems are discussed, and the capabilities of spatial and temporal monitoring systems are analyzed. In particular, the data sources and products, system functionalities, interoperability, and standardization are demonstrated.

12.2

Related Work

In recent years, valuable agricultural geospatial data and information have been broadly used in many research areas, ranging from agricultural sustainability and food security to natural resource monitoring and disaster assessment. To better access, visualize, and analyze the original geospatial data, and to support appropriate decisions by the relevant agencies, society needs on-demand agricultural geospatial data and information systems at the global, national, and regional levels that can provide near-real-time response and support for natural disasters, food markets, food security, and economic prosperity. Natural hazards usually have tremendous environmental and social consequences for agriculture. In particular, agricultural drought and flood are considered the major natural disasters because of their frequent recurrence and


widespread extent (Wilhite 1997, 2016; Western Governors Association 2004; Cutter and Emrich 2005; Villarini and Smith 2010; Peterson et al. 2013). For instance, the Mississippi River floods in April and May 2011 were among the largest and most damaging recorded along the US waterway in the past century; they affected Missouri, Illinois, Tennessee, Arkansas, Mississippi, and Louisiana (Wikipedia 2017). The crop loss from the Birds Point-New Madrid Floodway levee breach was $60 million (Brown et al. 2011; Olson and Morton 2012). Such events cause crop losses over huge agricultural areas, and belated and incomplete information becomes the main constraint on efficient decision-making. To address this problem and facilitate post-disaster response, a number of agriculture-related geospatial data centers and systems are intended to provide relevant information (Lu and Campbell 2009; EDO 2017; GDM 2017). For example, the US Drought Monitor (USDM) was developed to provide large-scale drought information to the general public (USDM 2017). Despite these efforts, current agricultural data systems, whether agricultural disaster monitoring systems or long-term agricultural condition management systems, are limited in some respects. Their coarse spatial resolution limits use at more localized scales, and their low temporal resolution limits near-real-time response to emergencies. For instance, USDA NASS previously used NDVI data from AVHRR 17 (no longer operating) and AVHRR 18 (aging and not consistent with AVHRR 17) to monitor US crop conditions; these data have low spatial resolution (1 km) and low temporal resolution (biweekly). Few systems can support on-demand agricultural data customization and information generation, continuous historical data dissemination and analytics, near-real-time data and information visualization, and interoperable communication with other systems. 
In most agricultural applications, a single data source rarely provides Earth observation (EO) data with well-balanced spatial and temporal resolutions; therefore, multisource EO data, together with traditional instrumental ground truth data, form a massive and heterogeneous data infrastructure for agricultural data and information systems. Overall, current agricultural information systems struggle to meet the increasing demands of different agencies and communities for sufficient and timely agricultural drought information. A successful Web-based agricultural decision support system should be flexible, scalable, reusable, efficient, and user-friendly, and it should provide sufficient and meaningful capabilities for agricultural decision support. Furthermore, the Web services provided by such a system can be dynamically repurposed and used by any standards-compliant Web client or application.

12.3

Spatial and Temporal Monitoring Systems for Agriculture

12.3.1 Web Service–Based Near-Real-Time Global Agricultural Drought Monitoring System

The Global Agricultural Drought Monitoring and Forecasting System (GADMFS), funded by the National Aeronautics and Space Administration (NASA) and the National Oceanic and Atmospheric Administration (NOAA) BAA Program (NA09NES4280007, NNX09AO14G, PI: Prof. Liping Di), is a Web service-based, near-real-time global agricultural drought monitoring system (Yu et al. 2010; Deng et al. 2011, 2012; Yagci et al. 2012). It aims to provide a Web-based service system for policy-makers and research scientists to monitor and forecast global agricultural drought status (Fig. 12.1). Agricultural drought information is vital to sustainable global agricultural development, national economic status, and food security. To meet the increasing and urgent demand for agricultural drought knowledge, GADMFS provides on-demand, near-real-time monitoring, on-the-fly analysis, and agricultural drought prediction. GADMFS is also a contributed component of the Global Earth Observation System of Systems (GEOSS) (GEOSS 2017). With GADMFS, users worldwide can access, monitor, and analyze agricultural drought conditions over time for any part of the world through the network alone, without requiring any other resources.

Fig. 12.1 Global agricultural drought monitoring and forecasting system (GADMFS)


The agricultural drought data product has a temporal resolution of 16 days and covers the whole world (from the year 2000 to the present). GADMFS uses remote sensing-based data, such as the normalized difference vegetation index (NDVI), vegetation condition index (VCI), temperature condition index (TCI), vegetation health index (VHI), standardized precipitation index (SPI), and Palmer drought severity index (PDSI), for agricultural drought calculation and evaluation. The GADMFS data component leverages geospatial interoperability standards and near-real-time satellite data from the NASA Land Atmosphere Near-real-time Capability for EOS (LANCE) (LANCE 2017). The data component also accesses other data resources, such as near-real-time Moderate Resolution Imaging Spectroradiometer (MODIS) data (250 m spatial resolution; daily, weekly, and 16-day temporal resolution) from the NASA Land Processes Distributed Active Archive Center (LP DAAC) (NASA MODIS 2017), Advanced Very High Resolution Radiometer (AVHRR) data (8 km spatial resolution) from NOAA (NOAA AVHRR 2017), and crop mask data. By adopting a service-oriented architecture (SOA), a Web-based data dissemination portal was developed to allow users to better visualize and download agricultural drought data and information (GADMFS 2017). The portal offers basic map operations (e.g., zoom in/out, drag-box zoom, pan, refresh, and thumbnail preview), data manipulation (e.g., data selection, data query, data add/remove/edit, and data download), and analysis functions (e.g., on-demand NDVI and VCI display, AOI statistics, supervised/unsupervised classification, and image algebra). Moreover, for drought forecasting, the system uses a neural network-based modeling algorithm. 
The modeling algorithm is trained with all historical vegetation-based and climate-based drought indices, the biophysical features of the environment (e.g., topography, soil type, and water resources), and time-series weather data (e.g., temperature, precipitation, and evaporation) (Deng et al. 2013). The on-demand drought prediction results will be at 1 km or finer spatial resolution and cover the whole globe. The implementation of GADMFS uses Web service standards and specifications, mainly OGC specifications: Web Map Service (WMS) (de La Beaujardiere 2006), Web Feature Service (WFS) (Vretanos 2010), and Web Coverage Service (WCS) (Whiteside and Evans 2008) for data capture, Web Processing Service (WPS) (Mueller and Pross 2015) for data processing, and Catalogue Service for the Web (CSW) (Nebert et al. 2007) for data discovery. The system uses JavaScript technologies, including open-source libraries such as ExtJS (Sencha 2017), chosen for its high performance and multi-platform support, and OpenLayers (OpenLayers 2017), chosen for its strong support of WMS/WFS. The service component of GADMFS provides geospatial Web services for agricultural drought computation (Peng et al. 2015); currently, the service component works with NASA data sources, but it can also work with other data sources that comply with the OGC standards. Compared with traditional agricultural drought systems, this system provides more flexibility, scalability, and reusability, and it is global in scope. Any user or application can invoke the Web services to obtain analysis results as long as the client complies with the standard service interface.
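To make the idea of a standard service interface concrete, the following minimal sketch builds an OGC WMS 1.3.0 GetMap request URL of the kind any compliant client could send. The request parameter names follow the WMS 1.3.0 specification; the endpoint URL and the layer name are hypothetical placeholders, not the actual GADMFS service address or layer identifiers.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_getmap_url(endpoint, layer, bbox, width, height, time=None):
    """Build an OGC WMS 1.3.0 GetMap request URL."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",                              # empty value requests the default style
        "CRS": "CRS:84",                           # longitude/latitude axis order
        "BBOX": ",".join(str(v) for v in bbox),    # minx,miny,maxx,maxy
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    if time is not None:
        params["TIME"] = time                      # optional time dimension (ISO 8601)
    return endpoint + "?" + urlencode(params)

# Hypothetical endpoint and layer name, used only for illustration.
url = build_getmap_url(
    "https://example.org/drought/wms",
    layer="VCI_16day",
    bbox=(-125.0, 24.0, -66.0, 50.0),
    width=1024,
    height=512,
    time="2012-07-11",
)
print(url)
```

Because the request is just a set of standardized key-value pairs, the same client code can target any WMS-compliant server by changing only the endpoint and layer name.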


12.3.2 Web Service–Based Near-Real-Time US Vegetation Condition Monitoring System

The vegetation condition monitoring system (VegScape), funded by NASA and the National Agricultural Statistics Service (NASS) of the US Department of Agriculture (USDA) (58-3AEU-0-0067, PI: Prof. Liping Di), is a Web service–based, near-real-time US vegetation and soil moisture monitoring system (Yang et al. 2013, 2016). It aims to provide a Web-based service system for research scientists and decision-makers to monitor and analyze US vegetation conditions. Vegetation condition information is important to agricultural policy, food production, food security, and food prices for the US public and government agencies. To accommodate the growing demand for vegetation and crop condition information, VegScape offers on-demand near-real-time monitoring, automatic online data fetching and analysis, and generation of various vegetation condition indices. With VegScape, users can query, visualize, disseminate, and monitor temporal and spatial vegetation condition data through standard geospatial Web services. The VegScape monitoring area covers 48 US states (from the year 2000 to the present); it provides diverse vegetation indices, such as the normalized difference vegetation index (NDVI), vegetation condition index (VCI), NDVI ratio to the previous year (RNDVI), NDVI ratio to the median (RMNDVI), and mean referenced vegetation condition index (MVCI), for vegetation and crop condition monitoring. All these data products have a spatial resolution of 250 m. NDVI is available as daily, weekly, and biweekly products; the other products, all calculated from NDVI, are available as weekly and biweekly products.
In a Mississippi Delta case study, the proposed MVCI shows the best vegetation condition with respect to the 10-year historical average, while the RMVCI shows an overall relatively poor vegetation condition with respect to the historical median (Yang et al. 2011). The VegScape data component leverages geospatial interoperability standards and near-real-time NASA MODIS data (250 m spatial resolution and daily temporal resolution). Figure 12.2 shows the major data processing flow of VegScape. Compared with the AVHRR sensor data previously used by NASS, the MODIS data provide better spatial and temporal resolution. They also support better visualization, dissemination, analytics, and assessment of the vegetation condition. The data component also publishes Soil Moisture Active Passive (SMAP) data (9 km spatial resolution and daily temporal resolution) and is intended to access more kinds of data, such as leaf area index (LAI), fraction of photosynthetically active radiation (fPAR), land surface temperature (LST), and Tropical Rainfall Measuring Mission (TRMM) data. Based on the SOA pattern, a Web-based portal was developed to integrate vegetation condition data through interoperable services into decision support information (e.g., views, maps, reports, tables, charts) (VegScape 2017). The portal has various online capabilities, including navigation, zooming, panning, data downloading, on-the-fly processing, online statistics, and data profiling. The portal


Fig. 12.2 VegScape data processing ﬂow. (Figure from Yang et al. 2016)

is accessible through most browsers, such as Google Chrome, Internet Explorer, Firefox, and Safari; users can access and analyze on-demand data and information without installing additional software or browser plug-ins. In addition, VegScape provides automated on-demand crop condition index retrieval and processing, automatic data publishing and dissemination, irregular ad hoc data processing for emergency assessment or reporting, objective historical data comparison for vegetation condition assessment, various vegetation condition metrics, and specific crop monitoring. The implementation of VegScape uses OGC Web service standards and specifications. The service components fulfill various tasks such as data retrieval, query, visualization, and dissemination. WMS handles map data rendering and manipulation, WFS serves vector files and attribute data, WCS serves coverage data, GeoLinking Service (GLS) merges geo-linked data based on linking attributes, Geolinked Data Access Service (GDAS) implements online access to the vast number of data collections, and WPS implements analysis


functionalities such as statistics. For each operation, HTTP GET/POST requests are supported. On the server side of VegScape, Apache 2 and Tomcat 6 are used as the server containers because they are popular and free. MapServer, deployed through Apache's Common Gateway Interface (CGI), serves WMS, WFS, and WCS because it is robust on all major platforms (e.g., Linux, Mac OS X, and Windows). On the client side, OpenLayers and ExtJS are used to develop an Ajax-based rich Web application that can easily display a dynamic map on any Web page. The server and client sides communicate through messages in XML or JavaScript Object Notation (JSON) format. Through VegScape, agricultural vegetation condition information and knowledge can be easily discovered, customized, and further integrated by a broad range of users.
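As an illustration of the HTTP GET interface mentioned above, the following sketch assembles a WMS 1.3.0 GetMap request URL from the standard parameters. The endpoint `https://example.org/wms` and the layer name `ndvi_weekly` are hypothetical placeholders and not actual VegScape identifiers; only the parameter names come from the WMS standard.

```python
from urllib.parse import urlencode, parse_qs, urlsplit


def build_getmap_url(base_url, layer, bbox, width, height,
                     crs="EPSG:4326", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL for an HTTP GET request.

    bbox is (min_1, min_2, max_1, max_2) in the axis order of the CRS;
    for EPSG:4326 in WMS 1.3.0 that is latitude first, then longitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer; the bounding box roughly covers Iowa.
url = build_getmap_url("https://example.org/wms", "ndvi_weekly",
                       (40.0, -97.0, 44.0, -90.0), 800, 600)
query = parse_qs(urlsplit(url).query, keep_blank_values=True)
```

A browser client such as OpenLayers issues requests of this shape on the user's behalf; building the URL by hand is mainly useful for scripted access or for debugging a service.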

12.3.3 Web Service–Based Near-Real-Time US Flood and Progress Monitoring System

The Remote-sensing-based Flood Crop Loss Assessment Service System (RF-CLASS), funded by the NASA Applied Science Program (NNX12AQ31G, NNX14AP91G, PI: Prof. Liping Di), is a remote-sensing-based, near-real-time US flood crop loss assessment cyber-service system for supporting crop statistics and insurance decision-making (Yu et al. 2013; Di et al. 2017). It aims to provide a Web-based service system for research scientists, decision-makers, and agricultural markets to monitor US flood status and to analyze and predict flood crop loss conditions. Among natural hazards, flooding regularly causes tremendous crop loss over large agricultural areas in the United States (Smith and Katz 2013). Flood-related information is crucial to flood loss assessment, the US economy, and sustainable agricultural development. The World Meteorological Organization (WMO) and the Global Water Partnership (GWP) have taken some measures toward defining flood assessment models. To meet the pressing demand for relief and monetary compensation, which relies on flood loss knowledge, RF-CLASS provides on-demand and near-real-time post-flood prediction and decision support for the concerned agencies. With RF-CLASS, users can efficiently access, download, and analyze flood-related products for crop flood insurance policy-making. This is especially relevant for government agencies such as USDA NASS and the Risk Management Agency (RMA): USDA NASS collects the flooded acreage and flood duration and records annual crop loss due to flooding; RMA carries out the USDA crop insurance policy, investigates crop policy compliance, and checks prevented planting claims. The RF-CLASS system provides various flood-related data products. Continuous, near-real-time, MODIS sensor-based flood map products (daily raster maps at 250 m spatial resolution, and daily water polygons) from NASA Goddard Space


Fig. 12.3 Customize AOI at state/county/region/ASD level in RF-CLASS

Flight Center and Dartmouth Flood Observatory (DFO) at the University of Colorado are served through the system. RF-CLASS also provides monthly, annual, improved annual, frequency, and improved frequency flood data products; moreover, it provides advanced annual flood event, crop fraction, and crop loss data products. The RF-CLASS data component mainly uses EO data (e.g., NASA MODIS NDVI at 250 m spatial resolution) and the daily flood map data to derive flood damage assessment information. The data component also accesses other data resources, such as the Cropland Data Layer (CDL) from NASS (Boryan et al. 2011; Han et al. 2012), which is used as the primary source of crop types; the Common Land Unit (CLU) and 578 administrative data from the Farm Service Agency (FSA 2012); meteorological precipitation data; and auxiliary vector data files (e.g., US state/county/ASD/region boundaries, US road data, US water data, crop mask data). The RF-CLASS Web portal allows users to visualize and download the flood-related data products (RF-CLASS 2017). The portal offers basic map operations (e.g., zoom in/out, drag box zoom in, pan, refresh, and thumbnail preview), data manipulation (e.g., data query, layer selection, layer add/remove/edit, and data download), and a dynamic time-series data display and monitoring panel. RF-CLASS lets users customize their AOI regionally for further crop loss assessment (Fig. 12.3). Moreover, for advanced flood assessment, the system offers several products and functions based on specific calculation algorithms and models, such as flooded crop acreage and flood duration products (Nigro et al. 2014), the actual crop loss products based on crop growth stage (Shrestha et al. 2017), the compliance investigation based on historical records,


MODIS and Landsat observations and CDL, and spot check for prevented planting claims. The implementation of RF-CLASS uses Web service standards and specifications, mainly OGC specifications. WMS, WFS, WCS, and Sensor Observation Service (SOS) (Na and Priest 2007) are used to access different kinds of data, and CSW is used for data cataloging and discovery. WPS is used to implement the processes and algorithms for the crop loss assessment functions. These standard services can be consumed directly by any standards-compliant Web client. The RF-CLASS system adopts mature algorithms and models; supports the estimation of flooded crop acreage, crop damage, and flood frequency products; and greatly enhances post-flood crop loss assessment and crop insurance policy formulation.
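The idea behind a flooded crop acreage product can be sketched as a simple overlay of a flood mask and a crop type layer. The version below is a minimal illustration and not the RF-CLASS algorithm: it assumes both grids are already co-registered at 250 m (a crop layer at another resolution would first need resampling), and the crop codes in the example are placeholders.

```python
PIXEL_SIZE_M = 250.0
HA_PER_PIXEL = PIXEL_SIZE_M ** 2 / 10_000.0  # 62,500 m^2 = 6.25 ha


def flooded_crop_area_ha(flood_mask, crop_layer, crop_code):
    """Area (ha) of pixels that are flooded AND carry the given crop code.

    flood_mask: 2D list of 0/1 flags (1 = flooded)
    crop_layer: 2D list of integer crop codes with the same shape
    """
    if len(flood_mask) != len(crop_layer):
        raise ValueError("grids must have the same number of rows")
    count = 0
    for flood_row, crop_row in zip(flood_mask, crop_layer):
        if len(flood_row) != len(crop_row):
            raise ValueError("grids must have the same number of columns")
        for flooded, crop in zip(flood_row, crop_row):
            if flooded and crop == crop_code:
                count += 1
    return count * HA_PER_PIXEL


# Tiny made-up example: crop code 1 stands for the crop of interest.
flood = [[1, 0],
         [1, 1]]
crops = [[1, 1],
         [5, 1]]
area = flooded_crop_area_ha(flood, crops, crop_code=1)  # 2 pixels
```

Summing such per-day masks over an event would give a flood duration per pixel, which is the other input the text mentions for loss assessment.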

12.4 Conclusion

This chapter reviewed general research that reflects society's real needs in agricultural knowledge discovery, as well as the strengths and limitations of current agricultural data and application systems. The capabilities of spatial and temporal monitoring systems were analyzed via three example systems. Examined in terms of data sources and products, system functionalities, interoperability standards, and implementation, these operational systems show clear advantages for agricultural decision-making and policy formulation in the big data era.

References

Boryan, C., Yang, Z., Mueller, R., & Craig, M. (2011). Monitoring US agriculture: The US department of agriculture, national agricultural statistics service, cropland data layer program. Geocarto International, 26, 341–358.
Brown, S., Gerlt, S., & Wilcox, L. (2011). The Value of the 2011 Crop Production Loss from the Birds Point-New Madrid Floodway Levee Breach. FAPRI-MU Report 06-11, Food and Agricultural Policy Research Institute, University of Missouri, USA (pp. 6).
Cutter, S. L., & Emrich, C. (2005). Are natural hazards and disaster losses in the US increasing? EOS. Transactions of the American Geophysical Union, 86, 381–389.
de La Beaujardiere, J. (2006). OpenGIS® web map server implementation specification. Open Geospatial Consort Inc OGC 06–042.
Deng, M., Di, L., Han, W., et al. (2011). The development of a web-service-based on-demand global agriculture drought information system. In AGU fall meeting abstracts (p. 08).
Deng, M., Di, L., Yu, G., et al. (2012). Building an on-demand web service system for global agricultural drought monitoring and forecasting. In Geoscience and remote sensing symposium (IGARSS), 2012 IEEE international (pp. 958–961). IEEE.
Deng, M., Di, L., Han, W., et al. (2013). Web-service-based monitoring and analysis of global agricultural drought. Photogrammetric Engineering and Remote Sensing, 79, 929–943.
Di, L. (2016). Big data and its applications in agro-geoinformatics. In 2016 IEEE international geoscience and remote sensing symposium (IGARSS) (pp. 189–191).


Di, L., Yu, E. G., Kang, L., et al. (2017). RF-CLASS: A remote-sensing-based flood crop loss assessment cyber-service system for supporting crop statistics and insurance decision-making. Journal of Integrative Agriculture, 16, 408–423. https://doi.org/10.1016/S2095-3119(16)61499-5.
EDO. (2017). EDO European drought observatory – JRC European Commission. http://edo.jrc.ec.europa.eu/edov2/php/index.php?id=1000. Accessed 12 June 2017.
FSA. (2012). USDA-farm service agency common land unit (CLU) information sheet. https://www.fsa.usda.gov/. Accessed 12 June 2017.
GADMFS. (2017). Global agricultural drought monitoring and forecasting system. http://gis.csiss.gmu.edu/GADMFS/. Accessed 12 June 2017.
GDM. (2017). Global drought information system. https://www.drought.gov/gdm/. Accessed 12 June 2017.
GEOSS. (2017). Geo-strategic target for agriculture. http://www.earthobservations.org/geoss.php. Accessed 11 June 2017.
Han, W., Yang, Z., Di, L., & Mueller, R. (2012). CropScape: A web service based application for exploring and disseminating US conterminous geospatial cropland data products for decision support. Computers and Electronics in Agriculture, 84, 111–123.
LANCE. (2017). LANCE: NASA near real-time data and imagery | Earthdata. https://earthdata.nasa.gov/earth-observation-data/near-real-time. Accessed 11 June 2017.
Lu, H., & Campbell, D. E. (2009). Ecological and economic dynamics of the Shunde agricultural system under China's small city development strategy. Journal of Environmental Management, 90, 2589–2600.
Mueller, M., & Pross, B. (2015). OGC WPS 2.0 Interface Standard. Open Geospatial Consortium Specification (pp. 14–065).
Na, A., & Priest, M. (2007). OGC Implementation Specification 06-009r6: OpenGIS Sensor Observation Service (SOS). Open Geospatial Consortium Technical Report.
NASA MODIS. (2017). Moderate resolution imaging Spectroradiometer. https://modis.gsfc.nasa.gov/data/. Accessed 12 June 2017.
Nebert, D., Whiteside, A., & Vretanos, P. (2007). OpenGIS Catalogue Services Specification. OGC Implementation Specification.
Nigro, J., Slayback, D., Policelli, F., & Brakenridge, G. R. (2014). NASA/DFO MODIS near real-time (NRT) global flood mapping product evaluation of flood and permanent water detection. https://floodmap.modaps.eosdis.nasa.gov//documents/NASAGlobalNRTEvaluationSummary_v4.pdf. Accessed 12 June 2017.
NOAA AVHRR. (2017). NOAA Satellite Information System (NOAASIS). http://noaasis.noaa.gov/NOAASIS/ml/avhrr.html. Accessed 12 June 2017.
Olson, K. R., & Morton, L. W. (2012). The impacts of 2011 induced levee breaches on agricultural lands of Mississippi River Valley. Journal of Soil and Water Conservation, 67, 5A–10A.
OpenLayers. (2017). OpenLayers – A high-performance, feature-packed library for all your mapping needs. http://openlayers.org/. Accessed 2 June 2017.
Peng, C., Deng, M., Di, L., & Han, W. (2015). Delivery of agricultural drought information via web services. Earth Science Informatics, 8, 527–538.
Peterson, T. C., Heim, R. R., Jr., Hirsch, R., et al. (2013). Monitoring and understanding changes in heat waves, cold waves, floods, and droughts in the United States: State of knowledge. Bulletin of the American Meteorological Society, 94, 821–834.
RF-CLASS. (2017). RF-CLASS – Remote-sensing-based flood crop loss assessment service system. http://dss.csiss.gmu.edu/RFCLASS/. Accessed 12 June 2017.
Sencha. (2017). ExtJS 4 JavaScript framework for rich apps in every browser. In Sencha. https://www.sencha.com/products/extjs/. Accessed 2 June 2017.
Shrestha, R., Di, L., Eugene, G. Y., et al. (2017). Regression model to estimate flood impact on corn yield using MODIS NDVI and USDA cropland data layer. Journal of Integrative Agriculture, 16, 398–407.


Smith, A. B., & Katz, R. W. (2013). US billion-dollar weather and climate disasters: Data sources, trends, accuracy and biases. Natural Hazards, 67, 387–410.
USDM. (2017). United States drought monitor. http://droughtmonitor.unl.edu/. Accessed 12 June 2017.
VegScape. (2017). VegScape – Vegetation condition explorer. https://nassgeodata.gmu.edu/VegScape/. Accessed 12 June 2017.
Villarini, G., & Smith, J. A. (2010). Flood peak distributions for the eastern United States.
Vretanos, P. A. (2010). OpenGIS Web Feature Service 2.0 Interface Standard. Open Geospatial Consortium Specification (pp. 04-094).
Western Governors Association. (2004). Creating a drought early warning system for the 21st century: The National integrated drought information system.
Whiteside, A., & Evans, J. D. (2008). Web Coverage Service (WCS) Implementation Standard.
Wikipedia. (2017). 2011 Mississippi River floods. Wikipedia.
Wilhite, D. A. (1997). Responding to drought: Common threads from the past, visions for the future. Wiley Online Library.
Wilhite, D. A. (2016). Droughts: A global assessment. Routledge.
Yagci, A. L., Di, L., Deng, M., et al. (2012). Global agricultural drought mapping: Results for the year 2011. In Geoscience and remote sensing symposium (IGARSS), 2012 IEEE international (pp. 3764–3767). IEEE.
Yang, Z., Di, L., Yu, G., & Chen, Z. (2011). Vegetation condition indices for crop vegetation condition monitoring. In Geoscience and remote sensing symposium (IGARSS), 2011 IEEE international (pp. 3534–3537). IEEE.
Yang, Z., Yu, G., Di, L., et al. (2013). Web service-based vegetation condition monitoring system-vegscape. In Geoscience and remote sensing symposium (IGARSS), 2013 IEEE international (pp. 3638–3641). IEEE.
Yang, Z., Hu, L., Yu, G., et al. (2016). Web service-based SMAP soil moisture data visualization, dissemination and analytics based on vegscape framework. In Geoscience and remote sensing symposium (IGARSS), 2016 IEEE international (pp. 3624–3627). IEEE.
Yu, G., Han, W., & Deng, M. (2010). A remote sensing-based global agricultural drought monitoring and forecasting system for supporting GEOSS. In American Geophysical Union, Fall Meeting 2010 abstracts.
Yu, G., Di, L., Zhang, B., et al. (2013). Remote-sensing-based flood damage estimation using crop condition profiles. In Agro-Geoinformatics (Agro-Geoinformatics), 2013 second international conference on (pp. 205–210). IEEE.

Chapter 13

Spatial Data Usage in Turkish Agriculture

Hakan Erden and Murat Aslan

Abstract Although agriculture was carried out mostly for regional needs, after the Industrial Revolution, the function of agriculture also changed. In recent years, agricultural activities have continued to increase productivity. These efforts have provided good results, and Turkey has reached the same level as the developed countries in terms of the number of products obtained per livestock and per area. In recent years, due to the developments in the communication and transportation sectors, competitive production has taken the place of regional production. In terms of both input supply and marketing, a period of competition has begun. As a result of improved conditions,

• Preparing the sustainable agricultural infrastructure for the country
• Ensuring competitive agricultural production around the world
• Calculating the supply and demand equilibrium
• Collecting data from the source instantly
• Gathering data

have been managed. Consequently, decision support systems, of which agricultural subsidies are the tool of successful implementation, have become of utmost importance.

Keywords Spatial · Yield Forecast · Agro-informatics · Agricultural management · Sustainability · Remote sensing

H. Erden (*) · M. Aslan Ministry of Agriculture and Forestry, Ankara, Turkey e-mail: [email protected]; [email protected] © Springer Nature Switzerland AG 2021 L. Di, B. Üstündağ (eds.), Agro-geoinformatics, Springer Remote Sensing/ Photogrammetry, https://doi.org/10.1007/978-3-030-66387-2_13


13.1 Introduction

In recent years, Turkey has adopted technology in agriculture and made considerable progress in increasing productivity. The use of geographical information systems (GIS), statistics, and information technology has helped Turkey reach the same level as the developed countries in terms of the number of products obtained per livestock and per area. Turkey has 23.7 million ha of agricultural area, 14.6 million ha of pasture, 22.3 million ha of forest, 1 million ha of water, and 17.35 million ha of other areas, including residential zones (TURKSTAT 2016). Therefore, almost half of the country (38.3 of 78.95 million ha) consists of agricultural and pasture areas that are subject to purely agricultural policy. In Turkey, until 2012, agricultural subsidies were paid by the District Subsidy Commission according to data from the National Farmer Registry System. There were no additional controls if a mistake was made in the declared cultivated area or the amount of outlay. In 2012, the Ministry of Food, Agriculture and Livestock (MoFAL) began integrating spatial information with agricultural practice. In this study, 330 SPOT 5 satellite images were orthorectified using 16,500 ground control points, and projection transformations were carried out with projection systems suitable for Turkey. Agricultural parcels were delineated on the SPOT images, and 32.5 million parcels were generated, each with an electronic ID number. With this system, MoFAL obtains the land use distribution and technical information such as drought, yield, plant water consumption, and ecological conformance for these areas at the province, district, and parcel level. Consequently, by January 2013, a new approach that distributes subsidy payments according to a yield calculation per parcel was introduced. This new approach made it possible to estimate realistic yield values for each agricultural parcel based on spatial features.
It provided fair payments according to parcel yield values, prevented payments to nonagricultural areas or residential areas within the agricultural parcels, improved the quality of agricultural statistics based on spatial data, prevented payments based on counterfeit and duplicate deeds in the Farmer Registry System, and prevented false harvest declarations. Realistic agricultural statistics have been estimated by identifying parcels not recorded in the Farmer Registry System. The new parcel-based subsidy system also assisted proper agricultural planning. With the new parcel-based support payment system, since 2012, farmers have been paid according to their land characteristics. Agricultural statistics are improving day by day. False harvest declarations and payments for nonagricultural areas are prohibited. All objections are handled with the help of spatial data. Since the beginning of our study, 1.5 million counterfeit or duplicate deeds have been identified, which has reduced the financial burden on the subsidy budget. It is clear that the effective use of agricultural lands is significant in terms of agricultural potential and the population employed in farming in Turkey. With modern agricultural


policies and progressive plans, maintaining and improving the capacity of agricultural lands with higher potential in order to support a growing population is paramount. Moreover, conserving and rehabilitating the natural resources of lower-potential lands in order to maintain sustainable man/land ratios is also necessary. Farmers should be educated about their regions and land conditions and should be rewarded if they grow the appropriate crop type. To sustain the decision support system, satellite images are acquired every year, yield is determined parcel by parcel, and the infrastructure is maintained to confirm agricultural support for all agricultural areas. Turkey has therefore entered a new era in which decisions are taken through the spatial infrastructure and planning is carried out accordingly.

13.2 Parcel-Based Support Payment System

A parcel-based support payment system was preferred for the transition period toward an integrated administration and control system. The aim was to ensure that support was determined as appropriately as possible. For this purpose, agricultural parcels were digitized to cover farmland at the national level. Then, a yield model was developed to provide the most accurate parcel-based support payment for each parcel. The parcel-based support payment system was built through the following steps:

(a) Digitization of Agricultural Parcels. Within the agricultural parcel database work item, all agricultural parcels within the cadastral parcels and the parcels used for agricultural purposes, as well as raw soils, were digitized from satellite images and orthophotos. All uses were defined, and all nonagricultural uses were separated. After the digitization process, a parcel-based crop production yield model was created. In the model, the following steps were carried out:

(b) Creation of Province-Based Yield Maps. To create the province-based yield maps, all constant parameters were defined, and a weighted overlay method was applied. The following data were selected as constant input parameters:

– Digital elevation model (DEM)
– A slope map generated from the DEM
– Soil group maps
– Land use capability class maps
– Soil depth maps


Table 13.1 Yield categories

Parameter: DEM (m)    Influence: 15%

Class   Value (m)   Scale
1       0–100       10
2       100–200     9
3       200–300     8
4       300–400     7
5       400–500     6
6       500–600     5
7       600–700     4
8       700–800     3
9       800–900     2
10      >900        1

Among these parameters, the DEM and slope map were acquired from the General Command of Mapping; the others are from MoFAL.

(c) Calculation of Yield Coefficient. Since the input criteria layers had different numbering systems with different ranges, each cell of each criterion was reclassified onto a common preference scale from 1 to 10, with 10 being the most favorable, so that the layers could be combined in a single analysis. After the model was run, yield maps were obtained for each of the 81 provinces. In the agricultural parcel yield model, elevation, depth, slope, aspect, and the irrigation status of the land were taken into consideration, and a coefficient was determined to reflect the impact of these elements on crop productivity. An example for the DEM is given in Table 13.1. These coefficients range from 1 to 10 and can have two decimal places. For example, a parcel can have 7.82, a neighboring parcel 8.2, and other surrounding parcels 6.78. These differences show the diversity of the land in Turkey and allow the yield information to be calculated with fine granularity.

(d) Calculation of Parcel-Based Yield Values

\[
\text{Parcel yield}\left(\frac{\text{kg}}{\text{da}}\right) = D_{\max,y} - \frac{D_{\max,y} - D_{\min,y}}{D_{\max,y.\mathrm{coef}} - C_{\min,y.\mathrm{coef}}}\,\left(D_{\max,y.\mathrm{coef}} - x\right) \qquad (13.1)
\]

where $D_{\max,y}$ is the maximum yield value of the district, $D_{\min,y}$ is the minimum yield value of the district, $C_{\min,y.\mathrm{coef}}$ is the minimum yield coefficient of the country, $D_{\max,y.\mathrm{coef}}$ is the maximum yield coefficient of the district, and $x$ is the yield coefficient of the parcel. The yield of a crop is defined by the regional offices of MoFAL for premium payment of agricultural subsidies using this model. The main principle is the usage of minimum versus maximum yield values in the model. These values are calculated


using the parcel yield coefficients assigned to each parcel. Then, the minimum and maximum yield values are calculated by the regional offices of MoFAL, considering the crop sequence in the field, irrigation, soil characteristics, elevation, and, normally, the crop type. Regional studies have found that the model has 95% precision after the yield is distributed to each parcel within these minimum and maximum values. Agricultural subsidies are paid using this model.
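The two computational pieces of this section, the DEM reclassification of Table 13.1 and the parcel yield of Eq. 13.1, can be sketched as follows. This is a minimal illustration under stated assumptions: Eq. 13.1 is read as a linear interpolation between the district minimum and maximum yields, class boundaries in Table 13.1 are treated as inclusive at the lower end, and the function names and sample numbers are hypothetical.

```python
def dem_scale(elevation_m):
    """Map an elevation (m) to the 1-10 preference scale of Table 13.1.

    0-100 m -> 10, 100-200 m -> 9, ..., 800-900 m -> 2, >900 m -> 1.
    A value exactly on a boundary is assigned to the higher class
    (assumption; the table does not say), except 900 m, which stays
    in the 800-900 m class because the last class is '>900'.
    """
    if elevation_m < 0:
        raise ValueError("Table 13.1 starts at 0 m")
    if elevation_m > 900:
        return 1
    class_index = min(int(elevation_m // 100), 8)  # 0..8
    return 10 - class_index


def parcel_yield(x, d_max_y, d_min_y, d_max_coef, c_min_coef):
    """Parcel yield (kg/da) under the assumed reading of Eq. 13.1.

    x           yield coefficient of the parcel
    d_max_y     maximum yield value of the district
    d_min_y     minimum yield value of the district
    d_max_coef  maximum yield coefficient of the district
    c_min_coef  minimum yield coefficient of the country
    """
    if d_max_coef == c_min_coef:
        raise ValueError("coefficient range is zero")
    slope = (d_max_y - d_min_y) / (d_max_coef - c_min_coef)
    return d_max_y - slope * (d_max_coef - x)


# Made-up district: yields between 200 and 500 kg/da,
# coefficients between 3.0 and 9.0.
mid_parcel = parcel_yield(6.0, 500.0, 200.0, 9.0, 3.0)
```

Under this reading, a parcel whose coefficient equals the district maximum receives the district maximum yield, and the yield falls linearly as the coefficient decreases toward the country minimum.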

13.3 Land Parcel Identification System

The Land Parcel Identiﬁcation System (LPIS) is the spatial infrastructure of the Integrated Administration and Control System (IACS). In the LPIS, both agricultural and nonagricultural areas are digitized according to the rules of the European Union (EU) (Kay and Milenov 2010). Turkey has recently made a new step to enhance the capacity of its spatial infrastructure. The establishment of LPIS began in 2014. This system is the spatial tool of IACS, which is used for the management of agricultural subsidies in EU member states. In the agricultural law dated 25/04/2006, numbered 5488 in the Ofﬁcial Gazette, it is indicated that “the purpose of agricultural subsidies is to contribute to the solution to the privileged problems in the agricultural sector, improve the efﬁciency of policies in practice, and facilitate the integration of the sector into these policies” and that agricultural subsidy policies are implemented through programs that ensure economic and social efﬁciency and productivity. On the other hand, the overall objective of these agricultural policies is speciﬁed as the increase of welfare in the agricultural sector via the improvement of agricultural production in line with domestic and foreign demands, the protection and development of natural and biological resources, the increase in productivity, the enhancement of food safety and security, the development of production organizations, the reinforcement of agricultural markets, and the assurance of rural development. Therefore, the conclusion to be drawn here is that IACS aims to manage agricultural subsidies, as indicated in agricultural law. Furthermore, the IACS system is a commitment by Turkey in the scope of TR-EU negotiations for full membership. 
Through the system, stable agricultural land boundaries will be recorded, classified, and monitored; the protection of agricultural land and environments will be ensured; agriculture will be monitored using high technology; and the effectiveness of agricultural policies will be achieved. Consequently, MoFAL initiated the establishment of IACS in order to manage agricultural policies. As part of this effort, the establishment of LPIS was initiated to enhance the capacity of the spatial infrastructure. The project is the largest of its kind in the world and has a budget of 46 million euros. Section 13.3.1 investigates the orthophoto features, and Sect. 13.3.4 examines the vector data features of LPIS.


13.3.1 Orthophoto Production

13.3.1.1 Geodetic Works

The study began with the geodetic work. The total area of Turkey was divided into 67 blocks to facilitate the optimal organization of data acquisition and the Notice to Airmen (NOTAM) authorization process and to organize the subsequent processes of ground control point (GCP) surveys, aerial triangulation, and orthophoto production into unified units of manageable size in terms of the number of images and data size (Fig. 13.1). Each block measures 1.0° in latitude and 1.5° in longitude within the transverse Mercator zones; 3°-wide transverse Mercator zones are specified as the official reference map projection of the LPIS project. In addition, the longitudinal size of 1.5° was chosen to ensure optimal flying time over a single line, which is limited by inertial measurement unit (IMU) drift. With turboprop aircraft, such lines can be flown in 20 minutes, which minimizes the accumulation of errors caused by IMU drift (Demirkol 2015). Geodetic works included documentation, pre-signalization, and global navigation satellite system (GNSS) surveying, as well as continuously operating reference stations (CORS) static GNSS data processing and adjustment, starting from the western region of Turkey and progressing eastward to cover the entire country. Prior to the LPIS GNSS survey campaign, comprehensive test surveys and computations were performed in the field in order to design the survey plan and to assure the accuracy of the resulting GCP coordinates. The test work showed that an observation time of 1 h with a 1.0-second recording interval (RCI) was required at each GCP. Geodetic coordinates (latitude, longitude, and ellipsoidal height) of the GCPs in the ITRF96 datum at the 2005.0 reference epoch were computed using GNSS data from CORS in Turkey (CORS-TR/TUSAGA-AKTIF). This computation began in October 2014, restarted in early March 2015 after the winter season, continued

Fig. 13.1 Photogrammetric blocks and final GCP distribution

13

Spatial Data Usage in Turkish Agriculture

239

Fig. 13.2 TM zone division in Turkey

throughout the project period, and was completed by the end of August 2016. In total, 2147 GCPs were obtained: 1457 GCPs for aerial images and 690 GCPs for satellite images. ITRF96 coordinates on the GRS80 ellipsoid were converted to international 3°-wide TM zones with central meridians of 27°, 30°, 33°, 36°, 39°, 42°, and 45°, depending on the location of the GCPs (Fig. 13.2). For the LPIS satellite imagery areas, GCP distribution was strictly dependent on satellite image coverage and footprint layout, and it was not possible to design the GCP distribution before obtaining the satellite imagery. Once the satellite images were provided, the necessary GCP distribution was designed using points in the overlapping areas of the satellite imagery (SI) that were identifiable both in the field and in the SI; a post-marking method was used to survey the selected GCPs. Orthometric heights (H) for all GCPs were computed using GRS80 ellipsoidal heights (h) and geoid heights (N) utilizing the official Turkish geoid model (TG-03). The accuracy measure here is the root mean square error (RMSE), which is a frequently used measure of the differences between values (sample and population values) predicted by a model or an estimator and the observed values (Sect. 13.3.1).

$$\mathrm{RMSE}(\hat{Q}) = \sqrt{\mathrm{MSE}(\hat{Q})} = \sqrt{E\left[(\hat{Q} - Q)^{2}\right]} \tag{13.2}$$
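As a minimal sketch of how Eq. 13.2 is typically evaluated in practice, the expectation can be replaced by the mean over a set of check points. The function name and the coordinate values below are hypothetical and are not project data.

```python
import math

def rmse(estimates, references):
    """Root mean square error between estimated and reference values.

    The expectation E[.] in Eq. 13.2 is approximated by the sample mean.
    """
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical easting values (m) at three check points; not project data.
measured = [100.10, 200.00, 299.90]
reference = [100.00, 200.00, 300.00]
print(round(rmse(measured, reference), 4))
```

The same function can be applied separately to the x, y, and height components of the GCP residuals.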

The RMSE of GCPs for aerial images were calculated as: (RMSEx

0.11” is true, it is the vegetation cover; otherwise, it is PML, saline land, bare land, or fallow land.

3. (b5-b3)/(b5 + b3) 0.28
If “(b5-b3)/(b5 + b3) 0.28” is false, it is either PML or saline land; otherwise, it is either bare land or fallow land.

4. NDVI8 > 0.3
Bare land and fallow land have similar spectral signatures in May but can be differentiated in the Landsat images of July/August (the peak of the growing season), since fallow land grows a lot of weeds but bare land does not. The NDVI value at the peak of a growing season can easily distinguish the two. In this study, we use NDVI8 to represent the NDVI at the peak of a growing season (i.e., July/August). If “NDVI8 > 0.3” is true, it is fallow land; otherwise, it is bare land, including Bare Land1 and Bare Land2.

5. b5 > 0.69
If “b5 > 0.69” is false, it is PML, including PML1 and PML2; otherwise, it is saline land.

Above, b5 stands for the reflectance of Landsat TM band 5.
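The two leaf-level rules whose comparison operators are fully legible (rules 4 and 5) can be sketched as follows. This is only an illustration: the function names are hypothetical, and the earlier rules are deliberately left out because their operators are not recoverable from the text.

```python
def split_bare_fallow(ndvi8):
    """Rule 4: fallow land if the peak-season NDVI (NDVI8) exceeds 0.3, else bare land."""
    return "fallow land" if ndvi8 > 0.3 else "bare land"

def split_pml_saline(b5):
    """Rule 5: saline land if the band 5 reflectance exceeds 0.69, else PML."""
    return "saline land" if b5 > 0.69 else "PML"

# Hypothetical pixel values, for illustration only.
print(split_bare_fallow(0.35))
print(split_pml_saline(0.50))
```

Each function assumes the pixel has already been routed to the correct branch by the preceding rules of the tree.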

17.2.2 A Specific Example

17.2.2.1 Data Sets and Preprocessing

A region in northern Xinjiang, China, bounded by the coordinates (45°11′12″N, 84°44′24″E), (44°50′12″N, 86°54′24″E), (44°29′8″N, 84°30′5″E), and (44°8′24″N, 86°37′44″E) and containing a large percentage of PML, was selected as the study area. According to the phenological calendar of the major crops in the study area (cotton, winter wheat, and spring corn), cotton is planted in middle and late April and emerges in early May, while corn is planted in late April and early May. Both cotton and corn fields are mulched with transparent PML. During the planting and emerging period, the NDVI values of both cotton and corn fields are very low, while the NDVI values of winter wheat and other vegetated fields are high because of the difference in phenological stages (i.e., winter wheat is at the jointing and heading stage). Such a phenological difference makes PML easy to extract from satellite remote sensing imagery. In addition, we are concerned only with extracting the PML, not with further distinguishing between PML on cotton and

17

Remote Sensing–Based Mapping of Plastic-Mulched Land Cover

361

Table 17.2 Landsat TM Level 1T data source parameters (Path 144, Row 29)

SN  Acquisition date  Acquisition time (GMT)  Entity ID              Average cloud cover (%)
1   05/22/1998        4:39                    LT51440291998142ULM00  18.86
2   08/26/1998        4:40                    LT51440291998238BIK00  0.19
3   05/15/2007        4:56                    LT51440292007135IKR00  10.00
4   08/19/2007        4:55                    LT51440292007231IKR00  10.00
5   05/10/2011        4:51                    LT51440292011130KHC00  7.06
6   07/29/2011        4:50                    LT51440292011210IKR01  0.00

PML on corn. Therefore, we selected high-quality Landsat TM images, mainly from May and some from late July and August, of 1998, 2007, and 2011 as the source data. The Landsat images acquired in July and August are used for distinguishing bare land from fallow land. The Landsat TM data were downloaded from the U.S. Geological Survey (USGS) official website (http://earthexplorer.usgs.gov/). Table 17.2 lists the Landsat TM data used in this study. In addition to the Landsat TM images, statistical data from other references, such as the Xinjiang Yearbook of 1998, 2007, and 2011, are also used in this study. The original Landsat images were atmospherically corrected, mosaicked, masked, and clipped to the region of interest (ROI). Before PML extraction is carried out, the DNs of the Landsat images in the ROI are converted to normalized at-the-satellite spectral radiance:

$$L = \mathrm{DN}\,(L_{\max} - L_{\min})/255 + L_{\min}, \qquad L_n = L/L_{\max} \tag{17.4}$$

where L is the at-the-satellite spectral radiance of a specific pixel whose pixel value is DN, Ln is the normalized at-the-satellite spectral radiance, Lmax is the band-specific spectral radiance scaled to the maximum DN value (i.e., 255), and Lmin is the band-specific spectral radiance scaled to the minimum DN value (i.e., 0). Both Lmin and Lmax are from the USGS Landsat 5 post-launch calibration table (http://landsat.usgs.gov/documents/L5TM_postcal.pdf), which accounts for sensor degradation. In this study, we use the normalized at-the-satellite spectral radiance to approximate the surface reflectance since atmospheric conditions were not available for the study area.
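The conversion in Eq. 17.4 can be sketched for a single pixel as below. The calibration constants are placeholders chosen for easy arithmetic, not values from the USGS calibration table, and the function name is hypothetical.

```python
def normalized_radiance(dn, lmin, lmax, dn_max=255):
    """Eq. 17.4: convert a DN to radiance L, then normalize by Lmax to get Ln."""
    radiance = dn * (lmax - lmin) / dn_max + lmin
    return radiance / lmax

# Placeholder calibration constants; NOT taken from the USGS table.
LMIN, LMAX = -1.0, 200.0
print(normalized_radiance(0, LMIN, LMAX))    # equals LMIN / LMAX
print(normalized_radiance(255, LMIN, LMAX))  # the maximum DN maps to 1.0
```

In a real workflow the band-specific Lmin and Lmax would be read from the calibration table for the acquisition date of each scene.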

17.2.2.2 Experiment Results

The ground truth for this study was obtained by visually interpreting the high-spatial-resolution GeoEye image of April 27, 2011, for the study area, since plasticultural areas can be easily identified visually on GeoEye images. The visually interpreted land-cover information was overlaid on the Landsat scenes of the same year to obtain the ground truth at the Landsat TM resolution. Then, a randomly selected half


L. Lu

Fig. 17.5 Classification results from the decision-tree classifier: (a) 2011, (b) 2007, (c) 1998

of the ground-truth pixels on the Landsat TM images was used for deriving rules for the decision tree and the other half for validating results. Because high-spatial-resolution images were unavailable, the ground truth for the years 1998 and 2007 was obtained through direct visual interpretation of Landsat TM images based on the spectral and spatial patterns learned from the images of the year 2011. The Landsat TM training pixels obtained from the ground truth in 1998, 2007, and 2011 were purified by using the n-Dimensional Visualizer in ENVI software to obtain the representative pixels for each land-cover type. The ENVI Decision Tree tool was used to implement the decision-tree classifier, which was then used to classify the Landsat TM images for the study area. The results are shown in Fig. 17.5. From Fig. 17.5, it is apparent that PML pixels mainly occur in the middle and southern parts of the study area, while bare land, another main type of land cover in the study area, occurs in the northern part. This pattern of land-cover distribution matches well with the ground truth visually interpreted from the high-resolution GeoEye images and the Landsat color composites. Statistics of the classification result show that PML accounts for 79.9%, 84.4%, and 80.4% of the total farmland (which includes PML, fallow land, and vegetation cover) in 2011, 2007, and 1998, respectively. The right-side images of Fig. 17.6 are zoomed-in classification results for two small areas in the study area, while the left-side images are the corresponding Landsat color composites. From Fig. 17.6, it is clear that the decision-tree classifier proposed in this study is very effective for extracting PML information.


Fig. 17.6 Comparison of classification results and their corresponding color-composite images in 2011

Further analysis of the classification accuracy against the ground truth was made by using the confusion matrix tool in ENVI software, which generates the producer accuracy, the user accuracy, the overall accuracy, and the Kappa coefficient (Cohen 1960). The producer accuracy (PA) is a measure indicating the probability that the classifier has labeled an image pixel as Class A given that the ground truth is Class A. The user accuracy (UA) is a measure indicating the probability that a pixel is Class A given that the classifier has labeled the pixel as Class A. Table 17.3 shows a summary of the classification accuracy assessment. From Table 17.3, it can be seen that the overall accuracies (OA) in 2011, 2007, and 1998 are all higher than 85%, while the Kappa coefficients, κ, are all at least 0.80. Across all land-cover types, only the producer accuracies of saline land in 2007 and 1998 and of vegetation cover in 2007 are lower than 80%, which indicates that these land-cover types are


Table 17.3 Confusion matrix for the decision-tree classifier using the Landsat TM images

Year  Class             PA (%)  UA (%)  PA (pixels)  UA (pixels)
2011  PML               100     95.9    6862/6862    6862/7155
2011  Vegetation cover  100     99.95   1871/1871    1871/1872
2011  Bare land         100     99.79   6541/6541    6541/6555
2011  Fallow land       97.12   100     506/521      506/506
2011  Saline land       85.91   100     2824/3287    2824/2824
2011  Water body        100     94.4    2867/2867    2867/3037
2011  Overall accuracy (OA): 97.82%; Kappa coefficient (κ): 0.97
2007  PML               100     74.82   4874/4874    4874/6514
2007  Vegetation cover  60.23   100     795/1320     795/795
2007  Bare land         98.97   99.64   4137/4180    4137/4152
2007  Fallow land       97.22   100     524/539      524/524
2007  Saline land       11.35   84.25   214/1886     214/254
2007  Water body        100     81.73   2505/2505    2505/3065
2007  Overall accuracy (OA): 85.27%; Kappa coefficient (κ): 0.80
1998  PML               95.35   92.22   2500/2622    2500/2711
1998  Vegetation cover  99.59   96.6    1707/1714    1707/1767
1998  Bare land         99.6    95      6777/6804    6777/7134
1998  Fallow land       82.66   88.15   610/738      610/692
1998  Saline land       57.17   93.96   622/1088     622/662
1998  Water body        100     100     2020/2020    2020/2020
1998  Overall accuracy (OA): 95.00%; Kappa coefficient (κ): 0.93

PA Producer accuracy, UA User accuracy

most easily misclassified into other classes, although those two classes are not the key classes in this study. In addition, the user accuracies of all land-cover types are higher than 74%. This means that the decision-tree classifier is an effective method for extracting not only PML but also other types of land cover, except for saline land. The high classification accuracy can be attributed to the following reasons: (1) transparent PML has a spectral signature very distinct from those of the other land-cover types in the study area; (2) fields in the study area are large compared with the spatial resolution of Landsat TM images; (3) a single type of plastic film, transparent plastic film, is used as the mulch; and (4) the large-scale, spatially continuous application of transparent plastic film forms a uniform landscape.
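The accuracy measures discussed here can be computed from a confusion matrix as in the sketch below. It is not the ENVI implementation; the function name and the two-class counts are hypothetical, and the sketch assumes every class has at least one ground-truth pixel and one classified pixel.

```python
def accuracy_metrics(matrix):
    """Return (PA list, UA list, OA, kappa) from a square confusion matrix.

    matrix[i][j] is the number of pixels whose ground truth is class i
    and whose classified label is class j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_sums = [sum(row) for row in matrix]  # ground-truth totals per class
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # classified totals
    correct = sum(matrix[i][i] for i in range(n))
    pa = [matrix[i][i] / row_sums[i] for i in range(n)]
    ua = [matrix[j][j] / col_sums[j] for j in range(n)]
    oa = correct / total
    # Agreement expected by chance, used by Cohen's kappa.
    expected = sum(row_sums[k] * col_sums[k] for k in range(n)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return pa, ua, oa, kappa

# Hypothetical two-class counts; not taken from Table 17.3.
demo = [[45, 5],
        [10, 40]]
pa, ua, oa, kappa = accuracy_metrics(demo)
print(pa, ua, oa, kappa)
```

The pixel ratios in Table 17.3 follow the same definitions: for PML in 2011, PA is 6862/6862 and UA is 6862/7155, or about 95.9%.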

17.3 A Threshold Model for Mapping PML Using MODIS Time Series Data

17.3.1 Methodology

Decision trees have proven to be efficient classifiers for land-cover mapping (Schneider et al. 2010). However, most decision-tree methods require all classes that occur in a training image to be exhaustively labeled (Muñoz-Marí et al. 2007). This not only increases the classification cost, since gathering or otherwise labeling training samples is very expensive in terms of time and manpower (Byeungwoo and Landgrebe 1999), but is also unnecessary in studies in which only a specific class needs to be extracted. Therefore, one-class classification methods (e.g., Manevitz and Yousef 2001), which try to detect a specific class and reject the others, have been employed in land-cover mapping (e.g., Sanchez-Hernandez et al. 2007) and proved to be effective (Foody et al. 2006; Li et al. 2011). One example of one-class classification methods is the threshold model, which can be expressed as:

If (the discriminative features of a pixel meet the threshold conditions) Then
    Assign the pixel to the specific class;
Else
    Assign the pixel to the other class.

The threshold model is a simple, efficient one-class classifier that divides the whole image into the specific class and the other class via threshold conditions. Therefore, the key to successfully using the threshold model for land-cover mapping is to correctly select the discriminative features (or simply the discriminators) and to set the correct threshold values for the conditions. In this study, we applied the threshold model to detect PML from MODIS images since we focus only on the PML class and ignore the other classes. Because of their manageable data volume and high temporal resolution (covering the entire Earth surface every 1–2 days), MODIS images have been widely used in monitoring regional land surface processes (Stefanov and Netzband 2005; Schaaf et al. 2002).
In this study, both the spectral and temporal features of MODIS images are explored for determining the discriminators and threshold values for mapping the PML with the threshold method. Spectrally, plastic film has very distinct features that can be used to set the threshold conditions. Figure 17.7 shows the spectral curves of typical plastics (such as Clear_PE, Black_PE, and White_PVC, where Clear_PE is similar to TPF) in comparison with water and vegetation, based on the USGS Digital Spectral Library splib06a. From Fig. 17.7, we can see that the TPF has a very distinct, high, and almost constant reflectance from visible to near-infrared wavelengths, which correspond to MODIS band 1 and band 2. Therefore, high reflectance values in these two bands during the planting stage indicate possible PML with TPF.

Fig. 17.7 Spectral characteristics of typical plastics in comparison with water and vegetation (USGS Digital Spectral Library splib06a). Vertical axis: reflectance; horizontal axis: spectrum (μm); curves: Clear_PE, Black_PE, White_PVC, Glass, Snow, Water

The spectral features above are combined with temporal features to form two temporal-spectral features for PML detection with MODIS. The first is the transition of the land surface from non-plastic cover to plastic cover during the planting season. Farmers put new clear plastic mulch on the fields quickly, within the first several days of the planting season. This results in a significant increase of the surface reflectance in the visible/near-infrared bands within a short period of time. The MODIS time series should be able to show such a transition for the PML pixels. The second is the high reflectance in the visible/near-infrared bands during the first several weeks after planting. In the first several weeks of a growing season, the satellite sensors view mostly the plastic films because of the low canopy coverage of the ground. As the crop develops, more and more of the land surface is covered by crop canopy, and the spectral features of PML gradually change from TPF spectral features (high reflectance values in both the visible and near-infrared bands) to vegetation spectral features (high reflectance in the near-infrared band and low reflectance in the visible band). With these two temporal-spectral features in hand, we can select the discriminators and threshold values. Based on the analysis above, we can conclude that the best time to detect PML is from the time of planting to some time before the ground is completely covered by the canopy (we call this time period the detectable period) and that PML information can be extracted from the band 1, band 2, and NDVI time series of MODIS data from the detectable period.
The threshold condition is:

If (the number of the accumulated days > d when the discriminator’s value of a pixel in MODIS time series < x during the detectable period) Then
    Assign the pixel to PML class;
Else
    Assign the pixel to the other class

where x is the threshold value of the discriminator and d is the number of the accumulated days that the pixel in MODIS time series meets the threshold condition


during the detectable period. In this study, the discriminator, d, and x are determined by analyzing the MODIS time series for the detectable period discussed in Sect. 17.3.2.1.
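For a single pixel, the threshold condition above can be sketched as follows. The function name and the sample values are hypothetical, and the values of x and d are chosen only to illustrate the counting logic.

```python
def is_pml(series, x, d):
    """Generic threshold rule: PML if the discriminator is below x on more than d days.

    series holds the daily discriminator values of one pixel over the
    detectable period.
    """
    days_below = sum(1 for value in series if value < x)
    return days_below > d

# Hypothetical values for one pixel; x and d are illustrative only.
pixel_series = [0.15, 0.18, 0.22, 0.12, 0.30, 0.17]
print(is_pml(pixel_series, x=0.2, d=3))
```

Counting days rather than testing a single date makes the rule less sensitive to one noisy observation in the time series.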

17.3.2 A Specific Example

17.3.2.1 Data Sets and Preprocessing

To test the threshold model for mapping PML, we chose a region centered at 39°18′8.15″N, 76°3′33.85″E with a total area of 3230 sq. km in southern Xinjiang, China, as our research area. To simplify the problem, we ignored subpixel PML, which commonly occurs in 250 m spatial resolution MODIS images: when aggregating Landsat PML data to MODIS resolution for ground truth, we defined PML as pixels or areas more than 50% covered by plastic mulch and non-PML as those less than 50% covered. The MODIS red band (620–670 nm, band 1) and near-infrared band (841–876 nm, band 2) at 250 m spatial resolution were selected to explore temporal-spectral features for PML detection. Using the GeoBrain system (Di 2004), MODIS surface reflectance daily L2G global 250 m SIN Grid V005 products (MOD09GQ) for the study area were subset and downloaded to cover the detectable period from the 85th day to the 150th day in 2009, 2013, and 2014. A batch tool, which we developed with ArcGIS ModelBuilder, was used to re-project the MODIS data to the Universal Transverse Mercator (UTM) coordinate system using nearest-neighbor resampling so that they could be co-registered with the Landsat images used in this study. Three time series of 250 m spatial-resolution images were produced: red band (band 1), near-infrared band (band 2), and NDVI. To obtain the training and testing data as well as the cropland mask for this study, we manually interpreted the Landsat images for the same years as the MODIS data. Because of the malfunction of the Landsat 7 ETM+, image LE71490332009119SGS01 lost some scan lines. In this study, we repaired the lost scan lines by using the gap-fill algorithm provided by the Geospatial Data Cloud website (http://www.gscloud.cn/). In addition to agricultural land, the study area also contains other land-use/land-cover types.
Since the objective of this research is to test the effectiveness of MODIS time series data with the threshold model for extraction of PML over agricultural land, a cropland mask was needed to mask out the non-agricultural land in the study area.
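The text does not spell out how the NDVI series was derived; assuming the standard definition, NDVI = (NIR - red) / (NIR + red), with MODIS band 1 as red and band 2 as near-infrared, a per-pixel sketch looks like this (the function name and reflectance values are hypothetical):

```python
def ndvi(red, nir):
    """Standard NDVI from red and near-infrared reflectance.

    Returns None when the denominator is zero, so the caller can treat
    the pixel as missing.
    """
    denom = nir + red
    if denom == 0:
        return None
    return (nir - red) / denom

# Hypothetical band 1 (red) and band 2 (NIR) reflectances for one pixel.
print(ndvi(red=0.2, nir=0.3))
```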

17.3.2.2 Determination of Threshold Condition and Value

Multiple temporal-spectral features of the MODIS time series can be used as the discriminator for PML mapping. To determine the best discriminator and its associated d and x, we need to analyze the performance of the different features. Using the PML and non-PML training sets, we calculated the means of band 1 reflectance, band 2 reflectance, and NDVI for PML and non-PML for each day. This yielded three pairs of daily-mean-spectral-reflectance (DMSR) curves (band 1, band 2, and NDVI) from the MODIS time series for each year. Figure 17.8 compares the DMSR time series of PML and non-PML using the original MODIS data in 2009, 2013, and 2014. From Fig. 17.8, we can see that in all 3 years the band 1 reflectance of PML is generally higher than that of non-PML, while band 2 and NDVI show the opposite. However, there were 35 days in 2009, 43 days in 2013, and 35 days in 2014 on which the values of band 1, band 2, and NDVI for PML and non-PML were very close. Checking the quality control (QC) layer of the MODIS data showed that these dates were fully or partly covered by cloud. To remove the impact of cloud in this study, we linearly interpolated the pixel value for each day on which the MODIS QC layer indicates that the pixel was fully or partly covered by cloud, using the nearest earlier and later good days. After the interpolation, the means were recalculated to reconstruct the DMSR for both the band 1 and NDVI time series (Fig. 17.9). The band 2 time series was not reconstructed because there was no significant difference in band 2 between PML and non-PML. The reason is that the non-PML area in this study was mainly winter wheat, which has a high NIR reflectance in May.
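The cloud-gap interpolation described above can be sketched for one pixel as follows. The function name and sample values are hypothetical, and leaving unchanged any cloudy day that lacks a clear day on one side is an assumption of this sketch, not something the text specifies.

```python
def fill_cloudy(values, cloudy):
    """Linearly interpolate cloudy days from the nearest earlier and later clear days.

    values: daily values for one pixel; cloudy: same-length list of booleans,
    True where the QC layer flags full or partial cloud cover.
    """
    filled = list(values)
    clear = [i for i, flag in enumerate(cloudy) if not flag]
    for i, flag in enumerate(cloudy):
        if not flag:
            continue
        before = [j for j in clear if j < i]
        after = [j for j in clear if j > i]
        if before and after:  # otherwise leave the value unchanged (assumption)
            a, b = before[-1], after[0]
            weight = (i - a) / (b - a)
            filled[i] = values[a] + weight * (values[b] - values[a])
    return filled

# Hypothetical 4-day series with two cloudy days in the middle.
print(fill_cloudy([0.10, 0.90, 0.80, 0.40], [False, True, True, False]))
```

The daily means for the DMSR curves would then be recomputed over the filled series of all training pixels.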
Figure 17.9 shows that, from the early sowing season to the end of the early growing season (the detectable period), the DMSR of band 1 for PML is clearly greater than that of non-PML (winter wheat and spring corn fields). The reason is that farmers put new transparent plastic mulch on the fields in the sowing season, and the mulch, which has a higher spectral reflectance in band 1, remains visible in the early growing season. Further analysis of Fig. 17.9a, b, c shows that the PML DMSR curves of band 1 fluctuate around 0.2 while those of non-PML fluctuate around 0.15, and this pattern of difference was the same for all 3 years. Figure 17.9d, e, f illustrates that the NDVI values of non-PML are above those of PML in all 3 years, since cotton phenology differs from that of winter wheat: winter wheat is in the green-up stage while cotton is in the sowing and emerging stage. Figure 17.9d, e, f also shows that the NDVI values for non-PML in the 3 years of study fluctuated between 0.2 and 0.5 but generally went up as the growing season progressed. The NDVI values for PML, however, can be divided into two stages. The first stage is from the 90th day to the 125th day, when NDVI fluctuated between 0.1 and 0.25, and the second stage is from the 126th day to the 150th day, when NDVI went up as the season progressed. Therefore, it is clear that we can use NDVI values from the 90th to 125th day (total of 31 days), the sowing and seedling emergence periods of cotton, to discriminate PML and non-PML.


Fig. 17.8 Comparison of daily-mean-spectral-reflectance (DMSR) time series between PML and non-PML using the original data


Fig. 17.9 Comparison of daily-mean-spectral-reflectance (DMSR) time series between PML and non-PML using the interpolated data


To determine the best d value, 31 training sessions were run with d from 1 to 31. For each training session, the accuracy of PML detection was calculated by using the testing sets of ground truths. The d value with the highest detection accuracy was then selected as the d value of the model. To sum up, the NDVI curves of PML lie below those of non-PML in all 3 years, despite their different ranges, and there is a significant difference (approximately 0.1) between the DMSR of PML and non-PML from the 90th to the 125th day in all 3 years; therefore, the NDVI DMSR from the 90th to the 125th day is the best discriminator for detecting PML among the three time series.
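The search for d can be sketched as a simple loop over candidate values, scoring each by overall accuracy against labeled pixels. The sketch uses the generic rule of Sect. 17.3.1 (value below x on more than d days means PML); the function names, the tie-breaking choice, and the toy data are hypothetical and are not the study's data.

```python
def classify(series, x, d):
    """Threshold rule: True (PML) if the value is below x on more than d days."""
    return sum(1 for v in series if v < x) > d

def best_d(samples, labels, x, d_values):
    """Return (d, accuracy) with the highest overall accuracy; ties keep the smallest d."""
    best = None
    for d in d_values:
        hits = sum(classify(s, x, d) == y for s, y in zip(samples, labels))
        acc = hits / len(labels)
        if best is None or acc > best[1]:
            best = (d, acc)
    return best

# Hypothetical 4-day NDVI series and labels (True = PML); not study data.
samples = [[0.1, 0.1, 0.1, 0.3],
           [0.1, 0.1, 0.3, 0.3],
           [0.1, 0.3, 0.3, 0.3],
           [0.3, 0.3, 0.3, 0.3]]
labels = [True, True, False, False]
print(best_d(samples, labels, x=0.2, d_values=range(1, 5)))
```

With the study's setup, d_values would run from 1 to 31 and the accuracy would be computed on the testing sets of ground truths.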

17.3.2.3 Detecting and Mapping PML

The statement that the NDVI DMSR curves for PML fluctuate around 0.2 in the 3 years does not mean that the NDVI threshold value for detecting PML should be set to “x < 0.2.” The reason is that DMSR values are the average spectral reflectance values of all pixels in the training sets, which means some PML pixels may have NDVI values greater than 0.2. In other words, this threshold value will lead to misclassification of PML pixels. Therefore, we used a threshold model to identify non-PML in the study area and deducted non-PML from the cropland land-cover image to get the PML areas. According to the above analysis and referring to Fig. 17.9d, e, f, x in the threshold model can be set to 0.2. Therefore, the threshold model for PML detection can be written as:

If (the number of the accumulated days > d when the NDVI value of a pixel in MODIS time series < 0.2 during 95th to 125th day of a year) Then
    Assign the pixel to non-PML class;
Else
    Assign the pixel to PML class

We developed a Java program of the threshold model to detect PML from the MODIS NDVI time series from the 90th to the 125th day. Using PML and non-PML ground truth, which was derived from Landsat-8 OLI imagery with a maximum likelihood classifier and manual calibration, the classification results were evaluated by overall accuracy (OA) and Kappa coefficient, κ (Cohen 1960), in ENVI software, and the results of the threshold model experiments are given in Table 17.4. Table 17.4 shows that, in 2009, when the threshold parameter x is 0.2 and the accumulated days parameter d is 7, the classification with the threshold model achieved the highest accuracy, with OA 0.848 and κ 0.660, while in 2013 and 2014, when the threshold parameter x is 0.2 and the accumulated days parameter d is 9 and 8, respectively, classification accuracy reaches its peak, with OA 0.898 and 0.864 and κ 0.796 and 0.679, respectively. It also shows that we can use the same threshold model to detect PML with high accuracy in three diverse years with x = 0.2 and d = 8.
OA values in all 3 years are larger than 0.84 and κ large

2014

2013

Year 2009

x